February 10, J070-S0129055X11004229
2011 13:49 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 1 (2011) 1–51 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004229
N/V-LIMIT FOR LANGEVIN DYNAMICS IN CONTINUUM
FLORIAN CONRAD∗,†,‡ and MARTIN GROTHAUS∗,§
∗Mathematics Department, University of Kaiserslautern, P. O. Box 3049, 67653 Kaiserslautern, Germany
†Mathematics Department, Bielefeld University, P. O. Box 100131, 33501 Bielefeld, Germany
‡[email protected]
§[email protected]
http://www.mathematik.uni-kl.de/~grothaus/

Received 4 September 2009

We construct an infinite particle/infinite volume Langevin dynamics on the space of simple configurations in $\mathbb{R}^d$ having velocities as marks. The construction is done via a limiting procedure using $N$-particle dynamics in cubes $(-\lambda,\lambda]^d$ with periodic boundary condition. A main step to this result is to derive an (improved) Ruelle bound for the canonical correlation functions of $N$-particle systems in $(-\lambda,\lambda]^d$ with periodic boundary condition. After proving tightness of the laws of the finite particle dynamics, the identification of accumulation points as martingale solutions of the Langevin equation is based on a general study of properties of measures on configuration space fulfilling a uniform Ruelle bound (and their weak limits). Additionally, we prove that the initial/invariant distribution of the constructed dynamics is a tempered grand canonical Gibbs measure. All proofs work for a wide class of repulsive interaction potentials $\phi$ (including, e.g., the Lennard–Jones potential) and all temperatures, densities and dimensions $d \ge 1$.

Keywords: Limit theorems; non-sectorial diffusion processes; interacting continuous particle systems; periodic boundary condition; Ruelle bound.

Mathematics Subject Classification 2010: 60B12, 82C22, 60K35, 60H10
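1. Introduction

The infinite particle Langevin equation
\[
\begin{aligned}
dx_t^i &= v_t^i\,dt,\\
dv_t^i &= \sqrt{\frac{2\kappa}{\beta}}\,dw_t^i - \kappa v_t^i\,dt - \sum_{i\neq j}\nabla\phi(x_t^i - x_t^j)\,dt,
\end{aligned}
\tag{1.1}
\]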
where $\kappa > 0$, $\beta > 0$, describes the motion of particles at positions $x_t^i \in \mathbb{R}^d$ having velocities $v_t^i \in \mathbb{R}^d$, $i \in \mathbb{N}$, $t \in [0,\infty)$. This motion is influenced by a surrounding medium causing friction (corresponding to the second summand in the second line
of (1.1)) and stochastic perturbation, modeled by a sequence of independent $\mathbb{R}^d$-valued Brownian motions $(w_t^i)_{t\ge 0}$. Moreover, the particles interact via a symmetric pair potential $\phi$.

For investigating properties such as the equilibrium fluctuations of infinite systems of interacting particles, the first main step is the construction of equilibrium (martingale) solutions for the corresponding model (cf. [26]).

In [11], strong solutions to (1.1) are constructed in the case $d = 2$ for a wide class of symmetric pair potentials $\phi$ and initial configurations. In particular, the construction given there allows a singularity of $\phi$ at the origin, and $\phi$ is assumed to be $C^1(\mathbb{R}^d\setminus\{0\})$ with derivatives fulfilling some local Lipschitz continuity (we do not give all the details on the conditions). Another construction for arbitrary $d$, but with more restrictions on the potential, can be found in [26]. The potentials treated there are assumed to be positive, of finite range and $C^2$, which, in particular, does not allow any singularities. There are also constructions of deterministic dynamics for infinitely many particles ($\kappa = 0$), see, e.g., [24, 30, 4], some of which work in more general situations. However, note that for the above mentioned purpose of considering a scaling limit the stochastic dynamics is preferable, since, as is mentioned in [11], one can expect it to exhibit a better long-time behavior. (See, e.g., [29] for the correspondence between ergodic properties and the Boltzmann–Gibbs principle, which is crucial for the derivation of hydrodynamic limits in [29, 26].)

Up to now there are no results on the construction of equilibrium Langevin dynamics covering physically realistic situations, such as the Lennard–Jones potential in dimension $d = 3$. Moreover, generalizations to the case of noncontinuous forces $\nabla\phi$ have never been considered and are impossible when using the methods from [11, 26].
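To make the structure of (1.1) concrete, the following is a minimal sketch of an Euler–Maruyama discretization for finitely many particles in a periodic interval ($d = 1$). It is only an illustration and not the construction used in this paper: the smooth test potential $\phi(r) = e^{-r^2}$, the nearest-image evaluation of the force, and the names `grad_phi`, `wrap`, `langevin_step` and `simulate` are all assumptions introduced here. In particular, the singular potentials treated in this article would need a more careful numerical scheme.

```python
import math
import random


def grad_phi(r):
    """Derivative of the smooth test potential phi(r) = exp(-r^2) (illustrative assumption)."""
    return -2.0 * r * math.exp(-r * r)


def wrap(x, lam):
    """Shift x by a multiple of 2*lam into the periodic cell (-lam, lam]."""
    y = (x + lam) % (2.0 * lam) - lam
    return lam if y <= -lam else y


def langevin_step(xs, vs, dt, kappa, beta, lam, rng):
    """One Euler-Maruyama step of (1.1) for N particles in (-lam, lam], d = 1."""
    n = len(xs)
    noise_scale = math.sqrt(2.0 * kappa * dt / beta)
    new_xs, new_vs = [], []
    for i in range(n):
        # Interaction drift: -sum_{j != i} phi'(x_i - x_j), evaluated at the
        # nearest periodic image only (a simplification of the periodic sum).
        force = -sum(grad_phi(wrap(xs[i] - xs[j], lam)) for j in range(n) if j != i)
        # Noise term, friction term, interaction term (same order as in (1.1)).
        dv = noise_scale * rng.gauss(0.0, 1.0) - kappa * vs[i] * dt + force * dt
        new_vs.append(vs[i] + dv)
        new_xs.append(wrap(xs[i] + vs[i] * dt, lam))
    return new_xs, new_vs


def simulate(xs, vs, steps, dt, kappa, beta, lam, seed=0):
    """Iterate langevin_step with a seeded random number generator."""
    rng = random.Random(seed)
    for _ in range(steps):
        xs, vs = langevin_step(xs, vs, dt, kappa, beta, lam, rng)
    return xs, vs
```

For $\kappa = 0$ both the friction and the noise vanish, and the scheme conserves the total momentum up to rounding, which gives a simple consistency check of the sign conventions.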
Therefore, in this article we present a completely different approach to construct, for a wide class of potentials, a martingale solution to (1.1) in the sense of [14], having a grand canonical Gibbs measure as initial distribution. The general method is the one used there for the construction of stochastic gradient dynamics. As assumptions on the potential we only need weak differentiability in $\mathbb{R}^d\setminus\{0\}$, boundedness of the weak derivatives away from 0 and a quite weak integrability assumption on the weak derivatives. $\phi$ is not only allowed, but even supposed to be singular at the origin. This prevents particles from hitting each other and is the correct setting for applications in physics.

Before describing the construction, we make the expression "martingale solution" more precise. To do so, we have to introduce some notation. Let us consider the space
\[
\Gamma^v = \{\gamma \subset \mathbb{R}^d\times\mathbb{R}^d \mid \mathrm{pr}_x\gamma \in \Gamma,\ v = v' \text{ whenever } (x,v),(x,v')\in\gamma\}
\tag{1.2}
\]
of locally finite simple velocity marked configurations in $\mathbb{R}^d$, where $\Gamma = \{\hat\gamma\subset\mathbb{R}^d \mid \sharp(\hat\gamma\cap\Lambda)<\infty \text{ for all compact } \Lambda\subset\mathbb{R}^d\}$ and $\mathrm{pr}_x$ denotes the projection to the first $d$ coordinates, i.e. $\mathrm{pr}_x\gamma = \{x\in\mathbb{R}^d \mid (x,v)\in\gamma\}$, $\gamma\subset\mathbb{R}^d\times\mathbb{R}^d$. $\sharp A$ denotes the cardinality of a set $A$. By $\mathcal{F}C_b^\infty(D_s,\Gamma^v)$ we denote the space of smooth
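cylinder functions on $\Gamma^v$ of the form $F(\cdot) = g_F(\langle f_1,\cdot\rangle,\ldots,\langle f_K,\cdot\rangle)$, where $K\in\mathbb{N}$, $g_F\in C_b^\infty(\mathbb{R}^K)$ (which means $g_F$ is bounded and infinitely often differentiable and all derivatives are bounded) and $f_i\in D_s := C_{sbs}^\infty(\mathbb{R}^d\times\mathbb{R}^d)$. Here we define $C_{sbs}^\infty(\mathbb{R}^d\times\mathbb{R}^d)$ to be the space of $C_b^\infty$ functions with spatially bounded support, i.e. the subset of $C_b^\infty(\mathbb{R}^d\times\mathbb{R}^d)$ of functions having support in $\Lambda\times\mathbb{R}^d$ for some compact $\Lambda\subset\mathbb{R}^d$. Moreover, one defines $\langle f,\gamma\rangle := \sum_{(x,v)\in\gamma} f(x,v)$ for $f$ having spatially bounded support (or also for more general $f$, e.g. $f\ge 0$) and $\gamma\in\Gamma^v$.

Now let $(x_t^i, v_t^i)_{t\ge 0,\,1\le i\le N}$ be an $N$-particle solution of (1.1). Via the mapping $\mathrm{sym}_N: \mathbb{R}^{Nd}\times\mathbb{R}^{Nd}\to\Gamma^v$, given by $\mathrm{sym}_N(x,v) = \{(x^1,v^1),\ldots,(x^N,v^N)\}$, we map the dynamics to $\Gamma^v$. (Here the possibility is ignored that positions of particles coincide.) We then find that the law of the resulting $\Gamma^v$-valued process solves the martingale problem for $(L, \mathcal{F}C_b^\infty(D_s,\Gamma^v))$, defined by
\[
\begin{aligned}
LF(\gamma) :={}& \frac{\kappa}{\beta}\sum_{l,l'=1}^{K} \partial_l\partial_{l'} g_F\big(\langle\{f_i\}_{i=1}^K,\gamma\rangle\big)\,\big\langle(\nabla_v f_l)(\nabla_v f_{l'}),\gamma\big\rangle\\
&+ \sum_{l=1}^{K} \partial_l g_F\big(\langle\{f_i\}_{i=1}^K,\gamma\rangle\big)\bigg(\Big\langle \frac{\kappa}{\beta}\Delta_v f_l - \kappa v\nabla_v f_l + v\nabla_x f_l,\ \gamma\Big\rangle\\
&\qquad - \sum_{\{(x,v),(x',v')\}\subset\gamma} \nabla\phi(x-x')\big(\nabla_v f_l(x,v) - \nabla_v f_l(x',v')\big)\bigg),
\end{aligned}
\tag{1.3}
\]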
where $F$ is as above, $\gamma\in\Gamma^v$ and $\langle\{f_i\}_{i=1}^K,\cdot\rangle := (\langle f_1,\cdot\rangle,\ldots,\langle f_K,\cdot\rangle)$. We therefore call any (possibly infinite particle) $\Gamma^v$-valued process solving the martingale problem for $L$ on $\mathcal{F}C_b^\infty(D_s,\Gamma^v)$ a martingale solution of (1.1) (on configuration space).

Due to the degeneracy in the position coordinates of the generator $L$ as given above, there is no hope to apply the theory of symmetric or sectorial Dirichlet forms to obtain an existence result (as is done in the case of the stochastic gradient dynamics in [25, 34, 1]). We mention that, as a possible alternative, one might think of applying the theory of generalized Dirichlet forms (cf. [31]) or results from [2] instead in order to construct the dynamics directly, at least on the space of multiple configurations. Such a construction is an interesting topic for further research, but up to now we did not obtain nontrivial results in this direction (but see [5], where a Langevin dynamics of a finite particle system is constructed by these methods).

As starting point for the construction of the infinite particle dynamics we use the finite volume $N$-particle Langevin dynamics constructed in [5]. We approximate $\mathbb{R}^d$ by cubes $\Lambda_{\lambda_n} = (-\lambda_n,\lambda_n]^d$, $n\in\mathbb{N}$, where $\lambda_n\uparrow\infty$ as $n\to\infty$, and choose a sequence $(N_n)_{n\in\mathbb{N}}$ of natural numbers such that $\lim_{n\to\infty}\frac{N_n}{(2\lambda_n)^d} = \rho < \infty$.

The marked configuration space $\Gamma^v$, when equipped with a natural topology, is a Polish space. Generalizing results from [19], we explicitly give a corresponding complete metric in Sec. 2. Thus, the results stated in [9] can be applied in order to
obtain tightness of the sequence of the approximating dynamics. In order to obtain the corresponding tightness estimates, we establish a (uniform improved) Ruelle bound for the correlation functions of their invariant initial distributions, the finite volume canonical Gibbs measures with periodic boundary condition. In [28] one finds the (original) proof for such a bound, which works for empty boundary condition, but only in the grand canonical setting (and for tempered infinite volume grand canonical Gibbs measures, see [28, Corollary 5.3]). In [14] a Ruelle bound for canonical correlation functions with empty boundary condition is shown by an adaptation of Ruelle's proof using an estimate for the partition functions from [7]. In the situation of the dynamics in [5] the boundary of $(-\lambda_n,\lambda_n]^d$ is assumed to be periodic; therefore, one has to consider the canonical correlation functions with periodic boundary condition. These functions may be written down similarly to the empty boundary case using summations $\hat\phi_{\lambda_n}$ (cf. (3.5) below) of the potential. However, these sums are not lower regular uniformly in $n$, which would be necessary to apply the proof from [14] or [28] (essentially) directly. This problem is solved here by another modification of this proof (basically by adding a third case to the case differentiation of Ruelle's proof, cf. Remark 3.13 below), allowing us to use uniformly lower regular cutoffs of the $\hat\phi_{\lambda_n}$, $n\in\mathbb{N}$.

Having shown tightness of the approximating laws and therefore the existence of weak accumulation points, we next need to prove that these accumulation points solve (1.1) in the sense of the martingale problem. The main problem here is to approximate $LF$ as in (1.3) uniformly on the side of the approximations as well as on the side of the limit by bounded continuous random variables. We prove that this is indeed possible when the approximating measures fulfill uniformly a Ruelle bound.
Section 3.4 contains results on such approximations which we consider to be useful in general when dealing with limits of stochastic dynamics on configuration space. Though they are partly based on arguments from [14], these results can be used to generalize the construction of stochastic gradient dynamics given there to the case of potentials which are only weakly differentiable in $\mathbb{R}^d\setminus\{0\}$ instead of $C^1(\mathbb{R}^d\setminus\{0\})$. For details, see Remark 4.18 below.

The question arises whether one even has convergence of the dynamics when the initial distributions converge to some measure $\mu$. For this it would be sufficient to have convergence of the semigroups corresponding to the approximating dynamics. As in [14], this convergence can be shown when imposing an additional assumption. In [14] a uniqueness condition had to be imposed on the associated Dirichlet form. In the non-sectorial case, we have to make an assumption of a slightly stronger type, namely essential m-dissipativity of $(L, \mathcal{F}C_b^\infty(D_s,\Gamma^v))$ in $L^1(\Gamma^v;\mu)$: By Trotter's semigroup convergence theorem (formulated in a modification of the framework of Kuwae and Shioya to $L^1$, see [21, 20, 32]) convergence of the semigroups follows immediately in this case. We give details in Sec. 4.5.

Finally, by using a method from [13], where equivalence of the microcanonical and the grand canonical ensemble is shown in the periodic boundary situation, we transport this result to the case of the canonical ensemble. This shows that the
invariant measure of the dynamics constructed in this paper is a grand canonical Gibbs measure. The considerations in [13] and the present article are not restricted to the high temperature regime, contrary to the corresponding result in [14, Sec. 6]. This may be considered as an advantage of starting with a periodic setting.

Let us briefly summarize the core results of this paper:

• Derivation of an (improved) Ruelle bound for finite volume canonical correlation functions with periodic boundary condition. This bound is uniform for bounded particle densities. (Theorem 3.14, Corollary 3.16.)
• Tightness of the laws $P^{(n)}$ of $N_n$-particle Langevin dynamics in cubes $(-\lambda_n,\lambda_n]^d$ with periodic boundary condition for a wide class of symmetric pair potentials that are weakly differentiable in $\mathbb{R}^d\setminus\{0\}$. Here we assume that $\lambda_n\uparrow\infty$ and $\frac{N_n}{(2\lambda_n)^d}$ converges to some $\rho\in[0,\infty)$ as $n\to\infty$. (Theorem 4.13.)
• Identification of accumulation points $P$ of $(P^{(n)})_{n\in\mathbb{N}}$ as above as martingale solutions of the Langevin equation on configuration space. (Theorem 4.17.)
• Identification of the limit of finite volume canonical Gibbs measures with periodic boundary condition (i.e. the invariant initial distribution of $P$ as above) as grand canonical Gibbs measure. (Theorem 5.1.) (We should mention that the hard work was done by Georgii and by Georgii and Zessin in [15, 12, 13], where the corresponding result for limits of microcanonical Gibbs measures is shown.)

The above results apply to any dimension $d\ge 1$. The Ruelle bound and the result on equivalence of ensembles are true for any repulsive, tempered, bounded below potential (see conditions (RP), (T), (BB) in Sec. 3.1). The results on the dynamics require the weak differentiability condition (WD) formulated in Sec. 4.1 and additionally, as a restriction coming from the approximation with periodic dynamics, one needs to control the forces at large distances with condition (IDF).
However, this condition may rather be seen as a theoretical restriction (cf. Remark 4.1(i), and also Remark 4.1(iii)).

2. A Polish Metric on $\Gamma^v$

A natural topology for the space $\Gamma^v$ defined in (1.2) is the topology $\tau$ generated by the continuous functions with spatially bounded support, i.e. by mappings $\langle f,\cdot\rangle$ with $f\in C_{sbs}(\mathbb{R}^d\times\mathbb{R}^d)$. In particular, using the (vague topology, i.e. the) topology generated by $C_0(\mathbb{R}^d\times\mathbb{R}^d)$ functions instead, a sequence of configurations would be able to converge to the empty configuration just by convergence of the marks to infinity. In this section we define a Polish metric on $\Gamma^v$ which generates $\tau$. We use a construction similar to the one for unmarked simple configurations given in [19] and [14].

Below we consider $\Gamma^v$ as a subset of the set $\mathcal{M}^v$ of Radon measures on $\mathbb{R}^d\times\mathbb{R}^d$ and $\Gamma$ as a subset of the set $\mathcal{M}$ of Radon measures on $\mathbb{R}^d$ (in the sense that a set of points in $\mathbb{R}^d\times\mathbb{R}^d$ (respectively, $\mathbb{R}^d$) is identified with the sum of Dirac
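measures in these points). The notation $\langle\cdot,\cdot\rangle$ is then extended to the dualization between continuous compactly supported functions and Radon measures. It is well known that the vague topology on $\mathcal{M}^v$ is generated by the metric $d_{\mathcal{M}^v}$, given by
\[
d_{\mathcal{M}^v}(\mu,\nu) := \sum_{k=1}^{\infty} 2^{-k}\,\frac{|\langle f_k,\mu\rangle - \langle f_k,\nu\rangle|}{1 + |\langle f_k,\mu\rangle - \langle f_k,\nu\rangle|}, \qquad \mu,\nu\in\mathcal{M}^v,
\]
where $f_k$, $k\in\mathbb{N}$, are suitable elements of $C_0^2(\mathbb{R}^d\times\mathbb{R}^d)$ (cf. e.g. (the proof of) [18, A7.7]). Let $(g_k)_{k\in\mathbb{N}}$ be a sequence in $C_0^2(\mathbb{R}^d)$ such that
\[
d_{\mathcal{M}}(\mu,\nu) := \sum_{k=1}^{\infty} 2^{-k}\,\frac{|\langle g_k,\mu\rangle - \langle g_k,\nu\rangle|}{1 + |\langle g_k,\mu\rangle - \langle g_k,\nu\rangle|}, \qquad \mu,\nu\in\mathcal{M},
\]
generates the vague topology on $\mathcal{M}$. For any two $\mu,\nu\in\mathcal{M}^v$ assigning finite mass to any $\Lambda\times\mathbb{R}^d$, $\Lambda\subset\mathbb{R}^d$ compact, we may define
\[
d(\mu,\nu) := d_{\mathcal{M}^v}(\mu,\nu) + d_{\mathcal{M}}(\mathrm{pr}_x\mu, \mathrm{pr}_x\nu)
\tag{2.1}
\]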
where $\mathrm{pr}_x\mu\in\mathcal{M}$ denotes the image measure of $\mu\in\mathcal{M}^v$ with respect to the projection to the first $d$ coordinates. Then $d: \Gamma^v\times\Gamma^v\to[0,\infty)$ generates the topology $\tau$ and $\mathrm{pr}_x: (\Gamma^v, d)\to(\Gamma, d_{\mathcal{M}})$ is continuous. (For the proof we refer to [6, Lemma 3.2.1].)

However, $d$ is far from being a complete metric on $\Gamma^v$. Firstly, consider the sequence $(\delta_{(x,v_n)})_{n\in\mathbb{N}}$ of Dirac measures, where $v_n\to\infty$. Such a sequence is a Cauchy sequence with respect to $d$, but it does not converge. Secondly, nothing prevents positions of particles from converging to each other. We use the idea from [19] to solve these problems. Let $(I_k)_{k\in\mathbb{N}}$ be a sequence of $C^1$ functions on $\mathbb{R}^d$ such that $1_{\{|\cdot|\le k\}}\le I_k\le 1_{\{|\cdot|\le k+1\}}$ and choose a function $h: \mathbb{R}^d\to(0,1]$ such that $h\in C^1(\mathbb{R}^d)\cap L^1(\mathbb{R}^d)$. Moreover, let $\Phi: (0,\infty)\to[0,\infty)$ be a continuous decreasing function such that $\lim_{t\to 0}\Phi(t) = \infty$. Then the space $\Gamma$ of simple unmarked configurations is a complete (separable) metric space when equipped with the metric
\[
d_{\Phi,h}(\hat\gamma,\hat\gamma') := d_{\mathcal{M}}(\hat\gamma,\hat\gamma') + \sum_{k=1}^{\infty} 2^{-k} r_k\,\frac{|S^{\Phi,hI_k}(\hat\gamma) - S^{\Phi,hI_k}(\hat\gamma')|}{1 + |S^{\Phi,hI_k}(\hat\gamma) - S^{\Phi,hI_k}(\hat\gamma')|} \qquad \text{for } \hat\gamma,\hat\gamma'\in\Gamma,
\tag{2.2}
\]
where for nonnegative $f\in C^1(\mathbb{R}^d)$ and $\hat\gamma\in\Gamma$ we set
\[
S^{\Phi,f}(\hat\gamma) := \sum_{\{x,y\}\subset\hat\gamma} e^{\Phi(|x-y|_2)} f(x) f(y)
\]
with $|\cdot|_2$ being the Euclidean norm on $\mathbb{R}^d$, and $(r_k)_{k\ge 0}$ is any bounded sequence of positive numbers (cf. [19, Theorem 3.5]). (The topology and the completeness of the metric are invariant with respect to the weights $(r_k)_{k\in\mathbb{N}}$, as long as they are positive and bounded.) Moreover, in [19, Theorem 3.5], it is shown that the metric $d_{\Phi,h}$ generates the vague topology on $\Gamma$. This construction solves the problem of avoiding convergence to multiple configurations.

It remains to keep mass away from $v = \infty$. Let $a: [0,\infty)\to[0,\infty)$ be an increasing surjective $C^2$ function and define $\chi_k(x,v) := a(v)(hI_k)(x)$, $x,v\in\mathbb{R}^d$.
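The role of the functional $S^{\Phi,f}$ in (2.2) can be illustrated numerically. The sketch below works in $d = 1$ and uses the concrete choices $\Phi(t) = 1/t$, $h(x) = e^{-x^2}$, a smoothstep cutoff for $I_k$ and weights $r_k = 1$; these choices, the truncation of the series at `k_max`, and all function names are assumptions made only for this illustration. It computes $S^{\Phi,f}$ for finite configurations and the series part of (2.2), omitting the $d_{\mathcal{M}}$ term.

```python
import math
from itertools import combinations


def Phi(t):
    """Continuous decreasing function on (0, inf) with Phi(t) -> inf as t -> 0 (assumed choice)."""
    return 1.0 / t


def h(x):
    """Integrable C^1 weight with values in (0, 1] (assumed choice)."""
    return math.exp(-x * x)


def cutoff(k, x):
    """C^1 function I_k with 1_{|x|<=k} <= I_k <= 1_{|x|<=k+1}, built from a smoothstep."""
    s = abs(x) - k
    if s <= 0.0:
        return 1.0
    if s >= 1.0:
        return 0.0
    return 1.0 - s * s * (3.0 - 2.0 * s)


def S(config, f):
    """S^{Phi,f}(config): sum over pairs {x, y} of exp(Phi(|x - y|)) f(x) f(y)."""
    return sum(math.exp(Phi(abs(x - y))) * f(x) * f(y) for x, y in combinations(config, 2))


def pair_part(conf_a, conf_b, k_max=30):
    """Series part of (2.2), truncated at k_max, with weights r_k = 1."""
    total = 0.0
    for k in range(1, k_max + 1):
        f = lambda x, k=k: h(x) * cutoff(k, x)
        diff = abs(S(conf_a, f) - S(conf_b, f))
        total += 2.0 ** -k * diff / (1.0 + diff)
    return total
```

Moving two points of a configuration closer together makes $S^{\Phi,f}$ blow up, so a sequence of simple configurations whose points collide cannot be Cauchy with respect to a metric containing this term.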
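We define for $\gamma,\gamma'\in\Gamma^v$
\[
d_{\Phi,a,h}(\gamma,\gamma') := d_{\mathcal{M}^v}(\gamma,\gamma') + d_{\Phi,h}(\mathrm{pr}_x(\gamma), \mathrm{pr}_x(\gamma')) + \sum_{k=1}^{\infty} q_k 2^{-k}\,\frac{|\langle\chi_k,\gamma\rangle - \langle\chi_k,\gamma'\rangle|}{1 + |\langle\chi_k,\gamma\rangle - \langle\chi_k,\gamma'\rangle|}
\]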
with $(q_k)_{k\in\mathbb{N}}$ also being a bounded sequence of positive numbers. Then $d_{\Phi,a,h}$ has the desired properties, i.e. the following lemma holds. (It is shown using the completeness of $\Gamma$ with respect to $d_{\Phi,h}$ and the closedness of the set of $\mathbb{N}_0$-valued measures from $\mathcal{M}^v$ (cf. [18, A7.4]) with respect to vague convergence; for a complete proof see [6, Lemma 3.2.2].)

Lemma 2.1. $d_{\Phi,a,h}$ generates the topology $\tau$ on $\Gamma^v$ and $(\Gamma^v, d_{\Phi,a,h})$ is a complete metric space.

3. Ruelle Bound for Finite Volume Canonical Gibbs Measures with Periodic Boundary Condition

In this section, we derive the Ruelle bound for correlation functions corresponding to finite volume canonical Gibbs measures with periodic boundary condition. In Sec. 3.1 we first state and discuss conditions on the potential which are similar to those in [14, Sec. 3], and in Sec. 3.2 we investigate properties of the periodic sum of the potential. In particular, we prove that the important superstability property holds uniformly for these sums, as well as temperedness and lower regularity in a sense sufficient for our purposes. We then go on with the proof of the Ruelle bound in the periodic case in Sec. 3.3. Finally, in Sec. 3.4, we show that a uniform Ruelle bound extends to weak limits of measures, and prove some approximation results which we need for the proof of Theorem 4.17 below. Though all considerations are stated for the configurational case (not including velocities), they also extend to the case of "full" measures (with independent Gaussian distributed velocities). For details on this fact, see also Sec. 3.4.

3.1. Conditions on the potential

Until Sec. 3.3, we fix a tempered, repulsive, bounded below pair potential $\phi$, i.e. a function $\phi: \mathbb{R}^d\to\mathbb{R}\cup\{\infty\}$ which is measurable, symmetric (i.e. $\phi(x) = \phi(-x)$, $x\in\mathbb{R}^d$) and fulfills the assumptions (BB), (RP), (T) which are given below. By $|\cdot|$ we denote the maximum norm in $\mathbb{R}^k$, $k\in\mathbb{N}$, i.e. $|(y_1,\ldots,y_k)| := \max_{1\le i\le k}|y_i|$, $(y_1,\ldots,y_k)\in\mathbb{R}^k$.
(BB) (bounded below) There exists $M < \infty$ such that $\phi(x)\ge -M$ for all $x\in\mathbb{R}^d$.

(RP) (repulsion) There exist $R_1 > 0$ and a decreasing continuous function $\Phi: (0,\infty)\to[0,\infty)$ with $\lim_{t\to 0}\Phi(t)t^d = \infty$ such that
\[
\phi(x)\ge\Phi(|x|) \qquad \text{for } |x|\le R_1.
\]
Furthermore, $\phi$ is bounded from above on $\{x\in\mathbb{R}^d \mid r\le|x|<\infty\}$ for all $r > 0$.

(T) (temperedness) There exist $G, R_2 < \infty$ and $\varepsilon > 0$ such that
\[
|\phi(x)|\le G|x|^{-d-\varepsilon} \qquad \text{for } |x|\ge R_2.
\]
Note that the second condition in (RP) implies that we may (and therefore we will) set $R_1 = R_2 =: R$. Moreover, $R$ may be chosen arbitrarily small (changing, of course, the other constants). For later use in Sec. 4, we need more regularity of the function $\Phi$, so we state the following lemma.

Lemma 3.1. Let $\Phi: (0,\infty)\to[0,\infty)$ be continuous, decreasing and such that $\Phi(t)t^d\to\infty$ as $t\to 0$. Then there exists $\hat\Phi: (0,\infty)\to[0,\infty)$ such that $\hat\Phi\le\Phi$, $\hat\Phi(t)t^d\to\infty$ as $t\to 0$ and $\hat\Phi$ is continuously differentiable and $e^{-a\hat\Phi}\hat\Phi'$ is bounded for any $a > 0$.

Proof. The proof is straightforward and therefore omitted.

Let $\Lambda\subset\mathbb{R}^d$. By $\Gamma_\Lambda$ we denote the set of locally finite simple configurations in $\Lambda$ (i.e. locally finite subsets). In the sequel we will often denote finite or periodic configurations by $Z$ (or similar notations) instead of $\gamma$, such that the notation looks a bit more similar to the one in [14, Sec. 3] and [28]. For a finite configuration $Z\in\Gamma_{\mathbb{R}^d} = \Gamma$ we define the configurational energy
\[
U_\phi(Z) := \sum_{\{x,y\}\subset Z}\phi(x-y)
\tag{3.1}
\]
and for $Z', Z''\in\Gamma_{\mathbb{R}^d}$ being disjoint finite configurations we define the interaction energy
\[
W_\phi(Z',Z'') := U_\phi(Z'\cup Z'') - U_\phi(Z') - U_\phi(Z'') = \sum_{x\in Z',\,y\in Z''}\phi(x-y).
\]
It is well known (cf. [28, Proposition 1.4]) that the assumptions (RP), (T) and (BB) imply

(SS) (superstability) There exist $A > 0$, $B\ge 0$ such that, if $Z$ is a finite configuration in $\mathbb{R}^d$, then
\[
U_\phi(Z)\ge A\sum_{r\in\mathbb{Z}^d}\sharp(Z\cap Q_1(r))^2 - B\,\sharp Z.
\]

(LR) (lower regularity) There exists a decreasing mapping $\Psi: \mathbb{N}_0\to[0,\infty)$ such that $\sum_{r\in\mathbb{Z}^d}\Psi(|r|) < \infty$ and for disjoint finite configurations $Z, Z'$ it holds
\[
W_\phi(Z,Z')\ge -\sum_{r,r'\in\mathbb{Z}^d}\Psi(|r-r'|)\,\sharp(Z\cap Q_1(r))\,\sharp(Z'\cap Q_1(r')),
\tag{3.2}
\]
where $Q_1(r) := \{(x_1,\ldots,x_d)\in\mathbb{R}^d \mid r_i - \tfrac12 < x_i\le r_i + \tfrac12\}$ for $r = (r_1,\ldots,r_d)\in\mathbb{Z}^d$.
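The quantities entering (SS) and (LR) are simple to compute for a concrete finite configuration. The following sketch (an illustration only; the function names and the sample function passed as `Psi` are assumptions, not notation from the paper) determines the index $r$ of the unit cube $Q_1(r)$ containing a point, the occupation numbers $\sharp(Z\cap Q_1(r))$, and the right-hand sides of the superstability and lower regularity inequalities for given constants.

```python
import math
from collections import Counter


def cube_index(x):
    """Index r in Z^d with x in Q_1(r), i.e. r_i - 1/2 < x_i <= r_i + 1/2 for all i."""
    return tuple(math.ceil(c - 0.5) for c in x)


def occupation_numbers(Z):
    """Map each occupied cube index r to #(Z ∩ Q_1(r))."""
    return Counter(cube_index(x) for x in Z)


def superstability_rhs(Z, A, B):
    """Right-hand side of (SS): A * sum_r #(Z ∩ Q_1(r))^2 - B * #Z."""
    occ = occupation_numbers(Z)
    return A * sum(n * n for n in occ.values()) - B * len(Z)


def lower_regularity_rhs(Z1, Z2, Psi):
    """Right-hand side of (3.2), using the maximum norm |r - r'| on Z^d."""
    occ1, occ2 = occupation_numbers(Z1), occupation_numbers(Z2)
    total = 0.0
    for r, n in occ1.items():
        for s, m in occ2.items():
            dist = max(abs(a - b) for a, b in zip(r, s))
            total += Psi(dist) * n * m
    return -total
```

Only occupied cubes contribute to either sum, so both right-hand sides are finite sums for finite configurations.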
In the case of pair interactions corresponding to a symmetric potential which is bounded from below (i.e. the case we consider here), (LR) as given above is equivalent to (LR) as given in [28] and also to the existence of a decreasing measurable $\psi: [0,\infty)\to[0,\infty)$ such that $\int_0^\infty\psi(t)t^{d-1}\,dt < \infty$ and $-\phi(x)\le\psi(|x|)$ for all $x\in\mathbb{R}^d$ (see also [22, Sec. 2.2.3]).

3.2. Potentials fulfilling (RP), (T), (BB) in periodic domains

For $\lambda > 0$ we define $\Lambda_\lambda := (-\lambda,\lambda]^d$. If $Z\in\Gamma_{\Lambda_\lambda}$, we define $\tilde Z\in\Gamma_{\mathbb{R}^d}$ to be the configuration resulting from $2\lambda$-periodic continuation of $Z$ to $\mathbb{R}^d$. A configuration $Z\in\Gamma_{\Lambda_\lambda}$ is said to have distances $<\lambda$, if it holds $\{((x_1,\ldots,x_d),(y_1,\ldots,y_d))\in Z\times Z \mid x_i - y_i = \lambda \text{ for some } 1\le i\le d\} = \emptyset$. Note that when we consider $\Lambda_\lambda$ to have a periodic boundary, $\lambda$ is the maximal possible distance between two particles in $\Lambda_\lambda$. Usually (e.g. in the sense of canonical Gibbs measures in continuous systems) a configuration has distances $<\lambda$.

In the case of periodic boundary condition we have to deal with the configurational energy of a finite configuration $Z\in\Gamma_{\Lambda_\lambda}$ with periodic boundary condition, which we define by
\[
\tilde U_{\phi,\lambda}(Z) := \sum_{\{x,y\}\subset Z}\ \sum_{r\in\mathbb{Z}^d}\phi(x-y+2\lambda r).
\tag{3.3}
\]
Remark 3.2. Note that in this definition the interaction between one particle and its copies is ignored. This does not have consequences for the results derived below. The corresponding canonical Gibbs measures and their correlation functions are exactly the same as if these interactions were included.

Temperedness of the potential $\phi$ ensures that the above definition makes sense as well as the following. We define for $\lambda > 0$, $y\in\mathbb{R}^d$
\[
\phi_\lambda(y) := 1_{(-\lambda,\lambda)^d}(y)\sum_{r\in\mathbb{Z}^d}\phi(y+2\lambda r).
\]
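A small numerical sketch may help to see how the periodic sum and the cut-off potential $\phi_\lambda$ interact. It works in $d = 1$ with the Lennard–Jones-type potential $\phi(x) = |x|^{-12} - |x|^{-6}$ and truncates the lattice sum at $|r|\le$ `r_max`; the potential, the truncation and the function names are assumptions made for illustration only. Since the full lattice sum is $2\lambda$-periodic, a pair with $|x-y| > \lambda$ contributes $\phi_\lambda$ evaluated at the difference shifted by one period, which is the mechanism exploited when rewriting the periodic energy in terms of $\phi_\lambda$.

```python
import math
from itertools import combinations


def phi(x):
    """Lennard-Jones-type pair potential in d = 1 (illustrative assumption)."""
    r = abs(x)
    return math.inf if r == 0.0 else r ** -12 - r ** -6


def periodic_sum(y, lam, r_max=200):
    """Truncated lattice sum: sum over |r| <= r_max of phi(y + 2*lam*r)."""
    return sum(phi(y + 2.0 * lam * r) for r in range(-r_max, r_max + 1))


def phi_lam(y, lam, r_max=200):
    """phi_lambda(y): the periodic sum, cut off outside the open interval (-lam, lam)."""
    return periodic_sum(y, lam, r_max) if -lam < y < lam else 0.0


def periodic_energy(Z, lam, r_max=200):
    """Periodic configurational energy (3.3); self-interaction with copies is ignored."""
    return sum(periodic_sum(x - y, lam, r_max) for x, y in combinations(Z, 2))


def wrap(t, lam):
    """Shift t by a multiple of 2*lam into [-lam, lam)."""
    return (t + lam) % (2.0 * lam) - lam
```

For a configuration with distances $<\lambda$, summing `phi_lam(wrap(x - y, lam), lam)` over all pairs reproduces `periodic_energy` up to the truncation error of the lattice sum.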
We use $\phi_\lambda$ in order to express $\tilde U_{\phi,\lambda}$ in terms of a finite configuration (cf. Lemma 3.3 below). Possibly one would at first sight prefer to use the indicator function $1_{(-2\lambda,2\lambda)^d}$ instead of $1_{(-\lambda,\lambda)^d}$ to simplify this, but see Remark 3.5 below.

Lemma 3.3. There is a set $S\subset\Lambda_{2\lambda}\setminus\Lambda_\lambda$, such that for $Z\in\Gamma_{\Lambda_\lambda}$ having distances $<\lambda$ it holds
\[
\tilde U_{\phi,\lambda}(Z) = W_{\phi_\lambda}(Z, \tilde Z\cap S) + U_{\phi_\lambda}(Z).
\]

Remark 3.4. In order to avoid lengthy descriptions of the shape of the set $S$, the assertion of Lemma 3.3 looks more mysterious than necessary (see the proof below).
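Proof. First note that by assumption for any $x,y\in\tilde Z$ the statement $x-y\in\Lambda_\lambda$ is equivalent to $|x-y| < \lambda$, which is symmetric in $x, y$. It holds
\[
\tilde U_{\phi,\lambda}(Z) = \sum_{\{x,y\}\subset Z}\ \sum_{r\in\mathbb{Z}^d}\phi(x-y+2\lambda r) = U_{\phi_\lambda}(Z) + \sum_{\substack{\{x,y\}\subset Z\\ x-y\notin\Lambda_\lambda}}\ \sum_{r\in\mathbb{Z}^d}\phi(x-y+2\lambda r).
\tag{3.4}
\]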
We consider the set $M := \{\{-r,r\} \mid r\in\mathbb{Z}^d, |r| = 1\}$ and choose an arbitrary mapping $\chi: M\to\{r\in\mathbb{Z}^d \mid |r| = 1\}$ such that $\chi(\{-r,r\})\in\{-r,r\}$ for any $r\in\mathbb{Z}^d$, $|r| = 1$, i.e. $\chi$ selects only one representative of each antipodal pair $\{-r,r\}$ from $M$. We define $S := \bigcup_{r\in\chi(M)}(\Lambda_\lambda + 2\lambda r)\cap\Lambda_{2\lambda}$. Define $X_1 := \{\{x,y\}\subset Z \mid |x-y| > \lambda\}$ and $X_2 := \{(x,y) \mid x\in Z,\ y\in\tilde Z\cap S,\ |x-y| < \lambda\}$. We define $\theta: X_1\to X_2$ in the following way. For $\{x,y\}\in X_1$ there exists (uniquely) an $r_{\{x,y\}}\in\chi(M)$ such that $y-x\in\Lambda_\lambda - 2\lambda r_{\{x,y\}}$ (without loss of generality, possibly after interchanging $x$ and $y$). Then we set $\theta(\{x,y\}) := (x, y + 2\lambda r_{\{x,y\}})$, which is in $X_2$. $\theta$ is a bijection fulfilling $\sum_{r\in\mathbb{Z}^d}\phi(x-y+2\lambda r) = \phi_\lambda(x'-y')$ for any $\theta(\{x,y\}) = (x',y')$, $\{x,y\}\in X_1$. This and (3.4) imply the assertion.

Remark 3.5. $\tilde U_{\phi,\lambda}(Z)$, $Z\in\Gamma_{\Lambda_\lambda}$, can be expressed more easily in terms of $\hat\phi_\lambda$, defined by
\[
\hat\phi_\lambda(y) := 1_{(-2\lambda,2\lambda)^d}(y)\sum_{r\in\mathbb{Z}^d}\phi(y+2\lambda r),
\tag{3.5}
\]
but below we prove properties of $\phi_\lambda$ which cannot be obtained for $\hat\phi_\lambda$. In particular, the latter potentials are not uniformly lower regular (or tempered).

Let us now focus on properties of $\phi_\lambda$, $\lambda > 0$, and the total energy $\tilde U_{\phi,\lambda}$ with periodic boundary condition. We first observe that the $\phi_\lambda$ fulfill, uniformly in $\lambda\ge\lambda_0 > 0$, the conditions we imposed on $\phi$.

Lemma 3.6. Let $\lambda_0 > 0$. There exist $\tilde R, \tilde M, \tilde G$ in $\mathbb{R}_+$ and a decreasing continuous function $\tilde\Phi: (0,\infty)\to[0,\infty)$ fulfilling $\lim_{s\to 0}\tilde\Phi(s)s^d = \infty$ (which, as is possible by Lemma 3.1, shall be continuously differentiable and such that $e^{-a\tilde\Phi}\tilde\Phi'$ is bounded for any $a > 0$), such that

(i) For all $\lambda\ge\lambda_0$ it holds $\phi_\lambda\ge -\tilde M$.
(ii) For all $\lambda\ge\lambda_0$ it holds $|\phi_\lambda(x)|\le\tilde G|x|^{-d-\varepsilon}$ whenever $|x|\ge\tilde R$.
(iii) For all $\lambda\ge\lambda_0$ it holds $\phi_\lambda(x)\ge\tilde\Phi(|x|)$ whenever $|x|\le\tilde R$.
(iv) For all $c > 0$ it holds $\sup_{\lambda\ge\lambda_0}\sup_{|x|\ge c}|\phi_\lambda(x)| < \infty$.
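The uniformity in $\lambda$ asserted in Lemma 3.6 rests on comparing $\phi_\lambda$ with $\phi$ on $(-\lambda,\lambda)^d$: the contribution of the periodic images $r\ne 0$ is controlled by $G\lambda^{-d-\varepsilon}\sum_{r\ne 0}|r|^{-d-\varepsilon}$. The sketch below probes this numerically in $d = 1$ for the test potential $\phi(x) = |x|^{-12} - |x|^{-6}$, for which $|\phi(x)|\le|x|^{-6}$ whenever $|x|\ge 1$, so one may take $G = 1$, $d+\varepsilon = 6$ and $R = 1$. The potential, these constants, the truncation at `r_max` and the function names are assumptions for illustration.

```python
def phi(x):
    """Test potential with |phi(x)| <= |x|^-6 for |x| >= 1 (illustrative assumption)."""
    r = abs(x)
    return r ** -12 - r ** -6


def image_sum(y, lam, r_max=500):
    """Sum over 0 < |r| <= r_max of phi(y + 2*lam*r), i.e. phi_lambda(y) - phi(y) on (-lam, lam)."""
    return sum(phi(y + 2.0 * lam * r) for r in range(-r_max, r_max + 1) if r != 0)


def tail_bound(lam, G=1.0, power=6.0, r_max=500):
    """G * lam^(-power) * sum over 0 < |r| <= r_max of |r|^(-power)."""
    return G * lam ** -power * 2.0 * sum(r ** -power for r in range(1, r_max + 1))
```

For $\lambda\ge R = 1$ and $y\in(-\lambda,\lambda)$ every image satisfies $|y+2\lambda r|\ge\lambda|r|$, so the truncated image sum is bounded term by term by the truncated tail bound, which decays like $\lambda^{-6}$.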
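Proof. We may without loss of generality assume that in (RP), (T) it holds $R_1 = R_2 =: R < \lambda_0$. Temporarily we choose $\tilde R := R$. For $y\in(-\lambda,\lambda)^d$ and $r\in\mathbb{Z}^d\setminus\{0\}$ it holds $|y+2\lambda r|\ge 2\lambda - |y|\ge\lambda\ge R$, so
\[
\phi_\lambda(y) = \sum_{r\in\mathbb{Z}^d}\phi(y+2\lambda r)\ge -M - G\sum_{r\in\mathbb{Z}^d\setminus\{0\}}|y+2\lambda r|^{-d-\varepsilon}\ge -M - G\lambda_0^{-d-\varepsilon}\sum_{r\in\mathbb{Z}^d\setminus\{0\}}|r|^{-d-\varepsilon}
\]
and the right-hand side is a constant larger than $-\infty$, which proves (i). The same argument shows that for $y\in(-\lambda,\lambda)^d$, $\lambda\ge\lambda_0$, it holds
\[
|\phi_\lambda(y) - \phi(y)|\le G\lambda^{-d-\varepsilon}\sum_{r\in\mathbb{Z}^d\setminus\{0\}}|r|^{-d-\varepsilon}\le G\lambda_0^{-d-\varepsilon}\sum_{r\in\mathbb{Z}^d\setminus\{0\}}|r|^{-d-\varepsilon}.
\tag{3.6}
\]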
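This proves (iv).

To show (ii), we define $\tilde G_1 := G\big(1 + \sum_{r\in\mathbb{Z}^d\setminus\{0\}}|r|^{-d-\varepsilon}\big)$. Let $\lambda\in(\lambda_0,\infty)$ and $x\in\mathbb{R}^d$, $|x|\ge\tilde R$. If $|x|\ge\lambda$, then $\phi_\lambda(x) = 0$ and there is nothing to prove. Therefore, let $|x| < \lambda$. It holds $|x+2\lambda r|\ge|\lambda r|\ge\lambda > |x|\ge R$ for all $r\in\mathbb{Z}^d\setminus\{0\}$. Hence
\[
|\phi_\lambda(x)|\le\sum_{r\in\mathbb{Z}^d}|\phi(x+2\lambda r)|\le G|x|^{-d-\varepsilon} + G\sum_{r\in\mathbb{Z}^d\setminus\{0\}}|\lambda r|^{-d-\varepsilon}\le\tilde G_1|x|^{-d-\varepsilon},
\]
proving (ii).

Finally, (3.6) implies (iii) with $\tilde\Phi := \Phi - G\lambda_0^{-d-\varepsilon}\sum_{r\in\mathbb{Z}^d\setminus\{0\}}|r|^{-d-\varepsilon}$. Since this function $\tilde\Phi$ might become negative away from 0, we may have to choose $\tilde R$ a bit smaller. By (iv) we see that then (ii) still holds with $\tilde G_1$ replaced by some possibly larger constant $\tilde G$.

In Lemma 3.8 below, the above result is used to prove that the $\phi_\lambda$ are superstable and lower regular uniformly in $\lambda\ge\lambda_0$ and that moreover also the energy functions $\tilde U_{\phi,\lambda}$ of configurations in $\Lambda_\lambda$ with periodic boundary are uniformly superstable. As a preparation for obtaining the latter result we need the following lemma.

Lemma 3.7. Let $\lambda_0 > 0$. There exists a constant $c_{\lambda_0} > 0$ such that for all $\lambda\ge\lambda_0$, $Z\in\Gamma_{\Lambda_\lambda}$, $k\in\mathbb{N}$ it holds
\[
(2k+1)^d c_{\lambda_0}^{-1}\sum_{r\in\mathbb{Z}^d}\sharp(Z\cap Q_1(r))^2\le\sum_{r\in\mathbb{Z}^d}\sharp(Z_k\cap Q_1(r))^2\le(2k+1)^d c_{\lambda_0}\sum_{r\in\mathbb{Z}^d}\sharp(Z\cap Q_1(r))^2,
\]
where $Z_k := \bigcup_{r\in\mathbb{Z}^d,\,|r|\le k}(Z + 2\lambda r)$.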
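Proof. We have $(2k+1)^d\sum_{r\in\mathbb{Z}^d}\sharp(Z\cap Q_{2\lambda}(r))^2 = \sum_{r\in\mathbb{Z}^d}\sharp(Z_k\cap Q_{2\lambda}(r))^2$ and the same is true when $2\lambda$ is replaced by some $\delta\in[\min\{2\lambda_0, 1/3\}, 1]$ such that $2\lambda$ is an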
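odd multiple of $\delta$. The assertion follows from repeated application of the fact that for any $\delta_1,\delta_2 > 0$ such that $\delta_2\ge\delta_1\ge\delta_2/2$ and all finite configurations $Y\in\Gamma_{\mathbb{R}^d}$ it holds
\[
6^{-d}\sum_{r\in\mathbb{Z}^d}\sharp(Y\cap Q_{\delta_1}(r))^2\le\sum_{r\in\mathbb{Z}^d}\sharp(Y\cap Q_{\delta_2}(r))^2\le 6^d\sum_{r\in\mathbb{Z}^d}\sharp(Y\cap Q_{\delta_1}(r))^2.
\]
Here for $\delta > 0$ and $r\in\mathbb{Z}^d$ we set $Q_\delta(r) := \{(x_1,\ldots,x_d)\in\mathbb{R}^d \mid \delta(r_l - \tfrac12) < x_l\le\delta(r_l + \tfrac12) \text{ for all } 1\le l\le d\}$.

Lemma 3.8. Let $\lambda_0 > 0$. It holds

(i) There exists a decreasing mapping $\tilde\Psi: \mathbb{N}_0\to[0,\infty)$ fulfilling $\sum_{r\in\mathbb{Z}^d}\tilde\Psi(|r|) < \infty$ such that if $Z, Z'$ are disjoint finite configurations and $\lambda\ge\lambda_0$ it holds
\[
W_{\phi_\lambda}(Z,Z')\ge -\sum_{r,r'\in\mathbb{Z}^d}\tilde\Psi(|r-r'|)\,\sharp(Z\cap Q_1(r))\,\sharp(Z'\cap Q_1(r')).
\]

(ii) There are constants $\tilde A > 0$, $\tilde B\ge 0$ such that for all $\lambda\ge\lambda_0$ and all finite configurations $Z$, it holds
\[
U_{\phi_\lambda}(Z)\ge\tilde A\sum_{r\in\mathbb{Z}^d}\sharp(Z\cap Q_1(r))^2 - \tilde B\,\sharp Z.
\]
Moreover, after possibly enlarging $\tilde A$, for all $\lambda\ge\lambda_0$ and $Z\in\Gamma_{\Lambda_\lambda}$ having distances $<\lambda$ it holds
\[
\tilde U_{\phi,\lambda}(Z)\ge\tilde A\sum_{r\in\mathbb{Z}^d}\sharp(Z\cap Q_1(r))^2 - \tilde B\,\sharp Z.
\]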
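The comparison of squared occupation numbers for cubes of side lengths $\delta_1$ and $\delta_2$ with $\delta_2\ge\delta_1\ge\delta_2/2$, which is used in the proof of Lemma 3.7, can be checked on random data. The following sketch does this in $d = 1$, where the constant is $6$; the sample configuration and the function name are assumptions for illustration.

```python
import math
import random
from collections import Counter


def sum_sq_counts(Y, delta):
    """Sum over r in Z of (#(Y ∩ Q_delta(r)))^2 for a finite set Y of reals (d = 1).

    Q_delta(r) is the interval (delta*(r - 1/2), delta*(r + 1/2)], so the index of
    the interval containing y is ceil(y/delta - 1/2).
    """
    occ = Counter(math.ceil(y / delta - 0.5) for y in Y)
    return sum(n * n for n in occ.values())


rng = random.Random(1)
Y = [rng.uniform(-3.0, 3.0) for _ in range(50)]
s1 = sum_sq_counts(Y, 0.6)  # delta_1
s2 = sum_sq_counts(Y, 1.0)  # delta_2, with delta_2 >= delta_1 >= delta_2 / 2
```

In one dimension the factor $6$ arises because an interval of length $\delta_1$ meets at most two intervals of length $\delta_2$, while an interval of length $\delta_2$ meets at most three of length $\delta_1$.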
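Proof. We define a potential $\underline\phi: \mathbb{R}^d\to\mathbb{R}\cup\{\infty\}$ by
\[
\underline\phi(x) := \begin{cases}\tilde\Phi(|x|) & \text{if } |x|\le\tilde R,\\ -\tilde G|x|^{-d-\varepsilon} & \text{if } \tilde R\le|x|,\end{cases}
\]
with $\tilde R, \tilde G, \tilde\Phi$ as in Lemma 3.6. This potential fulfills (RP), (BB) and (T) and is therefore superstable and lower regular by [28, Proposition 1.4]. Since $\phi_\lambda\ge\underline\phi$ for all $\lambda\ge\lambda_0$ this already implies (i) and the first assertion in (ii).

For $r\in\mathbb{Z}^d$ we set $\Lambda_{\lambda,r} := \Lambda_\lambda + 2\lambda r$. Let $k\in\mathbb{N}$ be a natural number and define $Z_k := \bigcup_{r\in\mathbb{Z}^d,\,|r|\le k}(Z + 2\lambda r)$. It holds
\[
U_{\phi_\lambda}(Z_k) = \sum_{r\in\mathbb{Z}^d,\,|r|\le k}\bigg(U_{\phi_\lambda}(\tilde Z\cap\Lambda_{\lambda,r}) + \sum_{r'\in\chi(M)} W_{\phi_\lambda}(\tilde Z\cap\Lambda_{\lambda,r}, \tilde Z\cap\Lambda_{\lambda,r+r'})\,\eta(r,r',k)\bigg)
\]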
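where $\chi$ is defined as in the proof of Lemma 3.3 and $\eta(r,r',k) = 1$ for $|r+r'|\le k$ and $0$ else. It holds $\eta(r,r',k) = 1$ for $|r|\le k-1$, thus by Lemma 3.3
\[
U_{\phi_\lambda}(Z_k) = \sum_{r\in\mathbb{Z}^d,\,|r|<k}\tilde U_{\phi,\lambda}(Z) + \sum_{r\in\mathbb{Z}^d,\,|r|=k}\bigg(U_{\phi_\lambda}(\tilde Z\cap\Lambda_{\lambda,r}) + \sum_{r'\in\chi(M)} W_{\phi_\lambda}(\tilde Z\cap\Lambda_{\lambda,r}, \tilde Z\cap\Lambda_{\lambda,r+r'})\,\eta(r,r',k)\bigg).
\]
But for $W_{\phi_\lambda}(\tilde Z\cap\Lambda_{\lambda,r}, \tilde Z\cap\Lambda_{\lambda,r+r'})\,\eta(r,r',k)$, $r'\in\chi(M)$, there are only finitely many possible finite values, independently of $k$, hence there exists $C < \infty$ such that
\[
|U_{\phi_\lambda}(Z_k) - (2k-1)^d\,\tilde U_{\phi,\lambda}(Z)|\le Ck^{d-1},
\]
proving that
\[
\tilde U_{\phi,\lambda}(Z) = \lim_{k\to\infty}\frac{1}{(2k-1)^d}\,U_{\phi_\lambda}(Z_k).
\]
By the first assertion in (ii) and by Lemma 3.7, we conclude that
\[
\tilde U_{\phi,\lambda}(Z)\ge c_{\lambda_0}^{-1}\tilde A\sum_{r\in\mathbb{Z}^d}\sharp(Z\cap Q_1(r))^2 - \tilde B\,\sharp Z.
\]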
3.3. Ruelle bound for canonical correlation functions with periodic boundary condition

Before going into the proof of the Ruelle bound we note a property of the canonical partition functions with periodic boundary stated in Lemma 3.9 below. Its proof is a slight adaptation of the proof of [7, Lemma 3] to the periodic boundary case (with external potential equal to 0). Note that the result of the following lemma in particular holds for the type of potentials we consider in this section. Its assumptions are obviously weaker than (RP), (T), (BB).

Lemma 3.9. Let $\phi: \mathbb{R}^d\to\mathbb{R}\cup\{\infty\}$ be measurable, symmetric, bounded from below and such that for any $a > 0$ it holds $C_a := \int_{\{|x|\ge a\}}|\phi(x)|\,dx < \infty$. We define $\tilde U_{\phi,\lambda} := U_{\hat\phi_\lambda}$, where $\hat\phi_\lambda := \sum_{r\in\mathbb{Z}^d}\phi(\cdot + 2\lambda r)$ is defined as limit in $L^1_{\mathrm{loc}}((-2\lambda,2\lambda)^d\setminus\{0\}; dx)$. Consider for $\lambda > 0$, $N\in\mathbb{N}_0$, $\beta > 0$
\[
Z_\lambda^{N,\beta} := \int_{\Lambda_\lambda^N} e^{-\beta\tilde U_{\phi,\lambda}(\{x_1,\ldots,x_N\})}\,dx_1\cdots dx_N,
\]
which are ($N!$ times) the canonical partition functions with periodic boundary condition. Set $Z_\lambda^{0,\beta} := 1$. Let $S\subset[0,\infty)\times(0,\infty)$ be any compact subset. There exists a constant $k_{\phi,S} < \infty$ such that for any $N\in\mathbb{N}_0$, $\lambda > 0$, $\beta > 0$ fulfilling $\big(\frac{N}{(2\lambda)^d},\beta\big)\in S$ it holds
\[
Z_\lambda^{N,\beta}\le\frac{k_{\phi,S}}{(2\lambda)^d}\,Z_\lambda^{N+1,\beta}.
\]
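Proof. Set $\rho_{\max} := \sup\{\rho \mid \exists \beta \text{ such that } (\rho, \beta) \in S\}$ and choose $a > 0$ small enough such that the volume $V_a = (2a)^d$ of a $|\cdot|$-ball with radius $a$ fulfills $V_a \rho_{\max} \le \frac{1}{2}$. Then $N V_a \le \frac{1}{2}(2\lambda)^d$. Fix $Z = (x_1,\dots,x_N) \in \Lambda_\lambda^N$ and consider the set $\Lambda_\lambda^a := \Lambda_\lambda \setminus \{\bar B_a(x_1) \cup \dots \cup \bar B_a(x_N)\}$, where $\bar B_a(x) := \bigcup_{r \in \mathbb{Z}^d,\, |r| \le 1} \{y \in \mathbb{R}^d \mid |y - (x + 2\lambda r)| \le a\}$, $x \in \Lambda_\lambda$. It holds
\[
\int_{\Lambda_\lambda^a} \sum_{i=1}^N \hat\phi_\lambda(\xi - x_i)\,d\xi \le \sum_{i=1}^N \int_{\Lambda_\lambda \setminus U_a(0)} |\hat\phi_\lambda(\xi)|\,d\xi \le N \int_{\mathbb{R}^d \setminus U_a(0)} |\phi(x)|\,dx = N C_a \le \rho_{\max} (2\lambda)^d C_a.
\]
Consequently, $\{\xi \in \Lambda_\lambda \mid \sum_{i=1}^N \hat\phi_\lambda(\xi - x_i) \le 4\rho_{\max} C_a\}$ has volume of at least
\[
\mathrm{vol}(\Lambda_\lambda^a) - \frac{(2\lambda)^d}{4} \ge \frac{(2\lambda)^d}{4}.
\]
(Here and in the sequel $\mathrm{vol}(\cdot)$ shall denote Lebesgue measure.) Hence
\[
Z_\lambda^{N+1,\beta} = \int_{\Lambda_\lambda^N} e^{-\beta \tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\})} \int_{\Lambda_\lambda} e^{-\beta \sum_{i=1}^N \hat\phi_\lambda(x_i - \xi)}\,d\xi\,dx_1 \cdots dx_N \ge \frac{(2\lambda)^d}{4}\, e^{-4\beta\rho_{\max} C_a}\, Z_\lambda^{N,\beta},
\]
so the assertion holds with $k_{\phi,S} = 4 e^{4\beta\rho_{\max} C_a}$.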
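To make the periodised potential $\hat\phi_\lambda$ from Lemma 3.9 more concrete, the following Python sketch evaluates a truncated version of the lattice sum in dimension $d = 1$ for the Lennard–Jones potential. The function names, the reduced units, the restriction to $d = 1$ and the truncation parameter `r_max` are assumptions made for this illustration only; the lemma itself works with the full sum over $\mathbb{Z}^d$ as an $L^1_{\mathrm{loc}}$ limit.

```python
import math

def lennard_jones(r: float) -> float:
    """Lennard-Jones pair potential 4 (r^-12 - r^-6) in reduced units (illustrative choice)."""
    if r == 0.0:
        return math.inf
    s6 = r ** -6
    return 4.0 * (s6 * s6 - s6)

def periodized(phi, x: float, lam: float, r_max: int = 2000) -> float:
    """Truncated lattice sum  sum_{|r| <= r_max} phi(|x + 2*lam*r|)  for d = 1.

    r_max is a hypothetical truncation parameter; the text uses the full sum.
    """
    return sum(phi(abs(x + 2.0 * lam * r)) for r in range(-r_max, r_max + 1))

def periodization_error(x: float, lam: float) -> float:
    """|phi_hat_lambda(x) - phi(x)|, i.e. the contribution of all images r != 0."""
    return abs(periodized(lennard_jones, x, lam) - lennard_jones(x))

if __name__ == "__main__":
    # The contribution of the periodic images shrinks as the box grows.
    for lam in (5.0, 50.0):
        print(lam, periodization_error(1.5, lam))
```

Up to truncation and rounding, the computed function is symmetric and $2\lambda$-periodic, and for fixed $x$ the difference to $\phi(x)$ shrinks as $\lambda$ grows.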
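Now, fix $\lambda_0 > 0$, $\beta > 0$ and $\rho_{\max} > 0$. $\rho_{\max}$ will be used below as a bound for the particle density. We choose sequences $(\psi_j)_{j \in \mathbb{N}}$, $(V_j)_{j \in \mathbb{N}}$ and $(l_j)_{j \in \mathbb{N}}$ and numbers $P \in \mathbb{N}$, $\alpha > 0$ as in [28, Sec. 2] corresponding to $\tilde\Psi$, $\tilde A$, $\tilde B$ as in Lemma 3.8. For $k := k_{[0,\rho_{\max}] \times \{\beta\}}$ as in Lemma 3.9 we define $\gamma(\rho_{\max}, \beta) := \frac{1}{\tilde A}(\tilde B + \beta^{-1} \ln(k))$. We define $Q(j) := [-l_j - 0.5, l_j + 0.5]^d$, $j \in \mathbb{N}$, then $V_j$ is the volume of $Q(j)$ with respect to Lebesgue measure. The following is fairly obvious, but important.

Lemma 3.10. For any $g > 0$ there exists $\lambda_1(g) > 0$ such that for all $\lambda \ge \lambda_1(g)$ it holds
\[
(2\lambda)^d g < \psi_j V_j \tag{3.7}
\]
for some $j \ge P$ such that $Q(j+1) \subset \Lambda_{\lambda/2}$.

Proof. For any $\lambda > 0$ large enough we can fix $j_\lambda \ge P$ such that $Q(j_\lambda + 1) \subset \Lambda_{\lambda/2} \subset Q(j_\lambda + 2)$. Then by the definition of $V_j$, $l_j$ (cf. [28])
\[
V_{j_\lambda} (1 + 3\alpha)^{2d} \ge V_{j_\lambda + 2} \ge \lambda^d.
\]
Thus (3.7) holds as soon as $\psi_{j_\lambda} > 2^d (1 + 3\alpha)^{2d} g$. Hence our assertion follows since $j_\lambda \to \infty$ and consequently $\psi_{j_\lambda} \to \infty$ as $\lambda \to \infty$.

We define $\lambda^* := \max\{\lambda_0, \lambda_1(\gamma(\rho_{\max}, \beta)\, c_{\lambda_0} \rho_{\max} 3^d)\}$, where $c_{\lambda_0}$ is as in Lemma 3.7 and $\lambda_1(\cdot)$ is as in the above lemma. As in [28] we write $[j] := \{r \in \mathbb{Z}^d \mid |r| \le l_j\}$.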
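The index set $[j]$ and the cube $Q(j)$ are tied together by a simple counting fact that is used again in Lemma 3.12 in the form $V_{q+1} = \sharp[q+1]$. The short Python sketch below checks it for small cases. It assumes that $l_j$ is a nonnegative integer and that $|\cdot|$ is the maximum norm (consistent with the volume $(2a)^d$ of a $|\cdot|$-ball used in the proof of Lemma 3.9); the function names are hypothetical.

```python
from itertools import product

def lattice_block(l: int, d: int):
    """All r in Z^d with maximum norm |r| <= l, i.e. the index set [j] for l_j = l."""
    return list(product(range(-l, l + 1), repeat=d))

def cube_volume(l: int, d: int) -> float:
    """Lebesgue volume of Q(j) = [-l - 0.5, l + 0.5]^d."""
    return (2 * l + 1.0) ** d

if __name__ == "__main__":
    for l, d in [(1, 1), (2, 2), (2, 3)]:
        # Number of lattice points in [j] versus volume of Q(j).
        print(l, d, len(lattice_block(l, d)), cube_volume(l, d))
```

Each unit cube centred at a lattice point of $[j]$ lies in $Q(j)$, and together they tile it, which is why both numbers agree.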
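Lemma 3.11. Let $\lambda \ge \lambda^*$, and let $Z \in \Gamma_{\Lambda_\lambda}$ be such that $Z$ has distances $< \lambda$ and fulfills $\sharp Z \le \rho_{\max} (2\lambda)^d$. Let $\bar Z := \tilde Z \cap (S \cup \Lambda_\lambda)$, where $S$ is as in Lemma 3.3. Then one of the following statements is valid:

(I) For all $j \ge P$ it holds
\[
\sum_{r \in [j]} \sharp(\bar Z \cap Q_1(r))^2 \le \psi_j V_j.
\]
(II) It holds
\[
\beta \tilde U_{\phi,\lambda}(Z) \ge \ln(k)\, \sharp Z.
\]
(III) There exists a largest $q \ge P$ fulfilling
\[
\sum_{r \in [q]} \sharp(\bar Z \cap Q_1(r))^2 \ge \psi_q V_q
\]
and it additionally holds $Q(q+1) \subset \Lambda_{\lambda/2}$.

Proof. Let us at first consider the situation where $\sum_{r \in \mathbb{Z}^d} \sharp(Z \cap Q_1(r))^2 \ge \gamma(\rho_{\max}, \beta)\, \sharp Z$. Using Lemma 3.8(ii) we find that
\[
\beta \tilde U_{\phi,\lambda}(Z) \ge \beta \bigl( \tilde A \gamma(\rho_{\max}, \beta)\, \sharp Z - \tilde B\, \sharp Z \bigr) = \ln(k)\, \sharp Z,
\]
i.e. (II) holds. Hence we may assume for the rest of the proof that
\[
\sum_{r \in \mathbb{Z}^d} \sharp(Z \cap Q_1(r))^2 \le \gamma(\rho_{\max}, \beta)\, \sharp Z.
\]
Using Lemma 3.7, the notations given there and the definition of $\lambda^*$ we find that
\[
\sum_{r \in \mathbb{Z}^d} \sharp(\bar Z \cap Q_1(r))^2 \le \sum_{r \in \mathbb{Z}^d} \sharp(Z_1 \cap Q_1(r))^2 \le \gamma(\rho_{\max}, \beta)\, c_{\lambda_0} 3^d\, \sharp Z \le c_{\lambda_0} \gamma(\rho_{\max}, \beta) \rho_{\max} 3^d (2\lambda)^d < \psi_{j_0} V_{j_0}
\]
for some $j_0 \ge P$ fulfilling $Q(j_0 + 1) \subset \Lambda_{\lambda/2}$ by Lemma 3.10. Consequently, for all $j \ge j_0$ it holds
\[
\sum_{r \in [j]} \sharp(\bar Z \cap Q_1(r))^2 < V_{j_0} \psi_{j_0} \le \psi_j V_j. \tag{3.8}
\]
Now, if (I) is not valid, the existence of a largest $q \ge P$ such that $\sum_{r \in [q]} \sharp(\bar Z \cap Q_1(r))^2 \ge \psi_q V_q$ is clear. But from (3.8), we find that this number $q$ fulfills also the second condition in (III).

Let us have another look at the energy in case (III). Set $C := \frac{\tilde A}{4}(1 + 3\alpha)^{-d-1}$ (this is the constant $C$ from [28, Proposition 2.5]). If $A \subset \mathbb{R}^d$ we denote by $A^c$ the complement of $A$ in $\mathbb{R}^d$.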
Lemma 3.12. Let $\lambda \ge \lambda^*$. There exists a constant $\kappa$, not depending on $Z$ and $\lambda$, such that the following holds: If in Lemma 3.11 statement (III) is valid, then
\[
-\tilde U_{\phi,\lambda}(Z) \le -\tilde U_{\phi,\lambda}(Z \cap Q(q+1)^c) - \frac{\tilde A}{4} \sum_{r \in [q+1]} \sharp(Z \cap Q_1(r))^2 - C \psi_{q+1} V_{q+1} + \kappa\, \sharp(Z \cap Q(q+1)).
\]
Moreover, there is another constant $\kappa'$ such that in the same situation
\[
-\tilde U_{\phi,\lambda}(Z) \le -\tilde U_{\phi,\lambda}(Z \cap Q(q+1)^c) - (C \psi_{q+1} - \kappa') V_{q+1} - \ln(k)\, \sharp(Z \cap Q(q+1)).
\]

Proof. Let $\bar Z$ be defined as in Lemma 3.11 and define $\bar Z^{(q+1)} := \widetilde{Z^{(q+1)}} \cap (S \cup \Lambda_\lambda)$, where $Z^{(q+1)} := Z \cap Q(q+1)$ and $S$ is as in Lemma 3.3. It holds
\[
-\tilde U_{\phi,\lambda}(Z) = -\tilde U_{\phi,\lambda}(Z \cap Q(q+1)^c) - U_{\phi_\lambda}(Z^{(q+1)}) - W_{\phi_\lambda}(Z^{(q+1)}, \bar Z \setminus Q(q+1)) - W_{\phi_\lambda}(Z \cap Q(q+1)^c, \bar Z^{(q+1)} \setminus Z^{(q+1)}).
\]
Using [28, Proposition 2.5a] we find that the first assertion is shown as soon as we can prove that
\[
-W_{\phi_\lambda}(Z \cap Q(q+1)^c, \bar Z^{(q+1)} \setminus Z^{(q+1)}) \le \kappa\, \sharp(Z^{(q+1)}).
\]
But this can be seen using Lemma 3.8: Note that $\sharp(Z \cap Q(q+1)^c) \le \sharp Z$ and $\sharp(\bar Z^{(q+1)} \setminus Z^{(q+1)}) = \frac{3^d - 1}{2}\, \sharp Z^{(q+1)}$. We obtain by the uniform lower regularity (Lemma 3.8(i))
\[
-W_{\phi_\lambda}(Z \cap Q(q+1)^c, \bar Z^{(q+1)} \setminus (Z \cap Q(q+1))) \le \frac{3^d - 1}{2}\, \sharp Z\, \sharp(Z^{(q+1)})\, \tilde\Psi\Bigl(\frac{3\lambda}{2} - \lambda\Bigr) \le \frac{3^d - 1}{2}\, \rho_{\max}\, \sharp(Z^{(q+1)}) (2\lambda)^d\, \tilde\Psi\Bigl(\frac{3\lambda}{2} - \lambda\Bigr).
\]
By the summability property of $\tilde\Psi$ we know that $\lambda^d \tilde\Psi(\frac{3\lambda}{2} - \lambda)$ is bounded independently of $\lambda$. (It even tends to 0 as $\lambda \to \infty$.) Hence the first assertion follows. The second assertion is seen from the first one using that there exists $\kappa' > 0$ such that for any $l \in \mathbb{N}_0$ it holds
\[
-\frac{\tilde A}{4} l^2 + (\ln(k) + \kappa) l \le \kappa'
\]
and $V_{q+1} = \sharp[q+1]$.

Remark 3.13. Note that for the proofs of Ruelle bounds in [28] and [14] it is only necessary to consider the cases (I) and (III) as in Lemma 3.11. In case (III), the restriction $Q(q+1) \subset \Lambda_{\lambda/2}$ does not occur there. For the periodic boundary case,
however, a restriction on $q$ like this is essential in order to estimate the interaction term $-W_{\phi_\lambda}(Z \cap Q(q+1)^c, \bar Z^{(q+1)} \setminus Z^{(q+1)})$ in the proof of Lemma 3.12. For this reason (II) is considered as a separate case: When one chooses $\lambda^*$ large enough and for some configuration (III) holds with $q$ being too large, the total periodic energy is large enough to be estimated from below in a suitable way. The meaning of this estimate and of the other estimates in the preceding lemmas becomes clear in the proof of Theorem 3.14 below (which works as in [14] or [28]).

We are now prepared to prove the main result of this section. For $\lambda > 0$, $n, N \in \mathbb{N}$ and $n \le N$, one defines the canonical correlation function with periodic boundary condition by
\[
k_\lambda^{(n,N)}(x_1,\dots,x_n) := \frac{N!}{(N-n)!}\, \frac{1}{Z_\lambda^{N,\beta}} \int_{\Lambda_\lambda^{N-n}} e^{-\beta \tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\})}\,dx_{n+1} \cdots dx_N,
\]
$x_1,\dots,x_n \in \Lambda_\lambda$. This definition is supposed to imply that one sets $k_\lambda^{(0,N)} = 1$ for any $N \in \mathbb{N}$, $\lambda > 0$. Moreover
\[
k_\lambda^{(N,N)}(x_1,\dots,x_N) = N!\, \frac{1}{Z_\lambda^{N,\beta}}\, e^{-\beta \tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\})},
\]
$x_1,\dots,x_N \in \Lambda_\lambda$.
Theorem 3.14. Let $\phi$ be a pair potential fulfilling (RP), (BB), (T) given in Sec. 3.1 and let $\rho_{\max} > 0$, $\beta > 0$. Then there exists a constant $\xi > 0$ and some $\lambda^* > 0$ such that the following holds: For all $\lambda \ge \lambda^*$ and $n \in \mathbb{N}_0$, $N \in \mathbb{N}$ fulfilling $n \le N \le \rho_{\max} (2\lambda)^d$, and all $x_1,\dots,x_n \in \Lambda_\lambda$ having distances $< \lambda$ in the above sense (and thus Lebesgue-a.e.) it holds
\[
k_\lambda^{(n,N)}(x_1,\dots,x_n) \le \xi^n.
\]
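The shape of this bound can be seen in the non-interacting case $\phi = 0$, which is used here purely as an illustration and need not satisfy the hypotheses of the theorem. Then $Z_\lambda^{N,\beta} = (2\lambda)^{dN}$, so the canonical correlation function is the constant $\frac{N!}{(N-n)!}(2\lambda)^{-dn} \le (N/(2\lambda)^d)^n$, and $\xi = \rho_{\max}$ works. The Python sketch below evaluates this closed form; the function names are hypothetical.

```python
from math import factorial

def density(N: int, lam: float, d: int) -> float:
    """Particle density N / (2 lam)^d in the cube Lambda_lambda."""
    return N / (2.0 * lam) ** d

def ideal_gas_correlation(n: int, N: int, lam: float, d: int) -> float:
    """k_lambda^{(n,N)} for phi = 0: N!/(N-n)! * (2 lam)^(-d n), constant in x_1..x_n."""
    volume = (2.0 * lam) ** d
    falling = factorial(N) // factorial(N - n)  # N (N-1) ... (N-n+1)
    return falling / volume ** n

if __name__ == "__main__":
    N, lam, d = 8, 2.0, 2
    rho = density(N, lam, d)
    for n in range(N + 1):
        # Each value stays below rho^n, i.e. a Ruelle bound with xi = rho.
        print(n, ideal_gas_correlation(n, N, lam, d), rho ** n)
```

For interacting potentials no such closed form is available, which is why the proof proceeds via the case distinction of Lemma 3.11.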
Remark 3.15. (i) In fact, for any $\rho_{\max}$, $\beta$ and $\lambda_0 > 0$ there exists $\xi$ such that the estimate from Theorem 3.14 holds for all $\lambda \ge \lambda_0$. This follows since for $\lambda \in (\lambda_0, \lambda^*)$ the particle number is restricted to $N \le \rho_{\max} (2\lambda^*)^d$ and $k_\lambda^{(n,N)} \le \rho_{\max}^n k^N e^{\beta N \tilde B}$, $n \le N$, where $\tilde B$ is chosen according to Lemma 3.8 for the prescribed value of $\lambda_0$.

(ii) The restriction to $x_1,\dots,x_n$ having distances $< \lambda$ is a result of the more convenient symmetric choice of $\phi_\lambda$ we made above. It should be possible to drop it when doing all calculations in a nonsymmetric way, but this would be wasted effort: The correlation functions are to be seen as density of the correlation measures with respect to Lebesgue–Poisson measure (see also Sec. 3.4 or [22], or [27, p. 72]), so their values on a Lebesgue null set do not matter for applications.

Both remarks are also valid for Corollary 3.16 below.

Proof of Theorem 3.14. Choose $\lambda^*$, $C$, $k$ etc. as above, let $D < \infty$ be as in [28, Proof of Proposition 2.6]. Set
\[
\xi := \rho_{\max} \Bigl( 1 + k e^{\beta D} + \sum_{q \ge P} e^{-(\beta C \psi_{q+1} - \beta\kappa' - \rho_{\max}) V_{q+1}} \Bigr),
\]
which is $< \infty$, since $\psi_{q+1} \to \infty$ as $q \to \infty$ and $V_{q+1}$ grows at least as fast as $q$. The proof is done by induction on $n$. For $n = 0$ the assertion is trivially fulfilled. We may assume that $x_1 = 0$. Let $S^{I}$, $S^{II}$ and $S^{III}$ be the sets of tuples $(x_{n+1},\dots,x_N) \in \Lambda_\lambda^{N-n}$ such that $Z := \{x_1,\dots,x_N\}$ has distances $< \lambda$ and satisfies (I)–(III) in Lemma 3.11, respectively. Denote by $S^{III}_{q,l}$ the subset of $S^{III}$ such that $q$ is as in Lemma 3.11(III) and $l = \sharp(\{x_{n+1},\dots,x_N\} \cap Q(q+1))$.

Let $(x_{n+1},\dots,x_N) \in S^{I}$. Then as in [28, Proof of Proposition 2.6] we find that $-W_{\phi_\lambda}(\{x_1\}, \{x_2,\dots,x_N\}) \le D$. Hence, since $x_1 = 0$ we have
\[
-\tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\}) = -\tilde U_{\phi,\lambda}(\{x_2,\dots,x_N\}) - W_{\phi_\lambda}(\{x_1\}, \{x_2,\dots,x_N\}) \le -\tilde U_{\phi,\lambda}(\{x_2,\dots,x_N\}) + D.
\]
Thus
\begin{align*}
\frac{N!}{(N-n)!}\, \frac{1}{Z_\lambda^{N,\beta}} \int_{S^{I}} e^{-\beta \tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\})}\,dx_{n+1} \cdots dx_N
&\le e^{\beta D}\, \frac{N!}{(N-n)!}\, \frac{1}{Z_\lambda^{N,\beta}} \int_{\Lambda_\lambda^{N-n}} e^{-\beta \tilde U_{\phi,\lambda}(\{x_2,\dots,x_N\})}\,dx_{n+1} \cdots dx_N \\
&\le e^{\beta D} N\, \frac{k}{(2\lambda)^d}\, k_\lambda^{(n-1,N-1)}(x_2,\dots,x_n) \le e^{\beta D} k \rho_{\max} \xi^{n-1}, \tag{3.9}
\end{align*}
by Lemma 3.9. Now let us consider the configurations in $S^{II}$. Here Lemmas 3.11 and 3.9 yield
\[
\frac{N!}{(N-n)!}\, \frac{1}{Z_\lambda^{N,\beta}} \int_{S^{II}} e^{-\beta \tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\})}\,dx_{n+1} \cdots dx_N \le N^n\, \frac{1}{Z_\lambda^{N,\beta}}\, \bigl((2\lambda)^d\bigr)^{N-n} k^{-N} \le \rho_{\max}^n \le \rho_{\max} \xi^{n-1}. \tag{3.10}
\]
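We finally turn to $S^{III}_{q,l}$, $q \ge P$, $0 \le l \le N - n$. Set $N(q) := \sharp(\{x_1,\dots,x_n\} \cap Q(q+1)) \ge 1$ and assume without loss of generality that $x_1,\dots,x_{N(q)} \in Q(q+1)$. We set $\chi_q := e^{-\beta (C \psi_{q+1} - \kappa') V_{q+1}}$. Lemma 3.12 shows that
\begin{align*}
&\frac{N!}{(N-n)!}\, \frac{1}{Z_\lambda^{N,\beta}} \int_{S^{III}_{q,l}} e^{-\beta \tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\})}\,dx_{n+1} \cdots dx_N \\
&\quad \le \chi_q\, \frac{N!}{(N-n)!}\, \frac{1}{Z_\lambda^{N,\beta}}\, \frac{V_{q+1}^l}{k^{N(q)+l}} \binom{N-n}{N-n-l} \int_{\Lambda_\lambda^{N-n-l}} e^{-\beta \tilde U_{\phi,\lambda}(\{x_{N(q)+1},\dots,x_{N-l}\})}\,dx_{n+1} \cdots dx_{N-l} \\
&\quad = \chi_q\, \frac{V_{q+1}^l}{k^{N(q)+l}}\, \frac{N!}{(N-N(q)-l)!\, l!}\, \frac{Z_\lambda^{N-N(q)-l,\beta}}{Z_\lambda^{N,\beta}}\, k_\lambda^{(n-N(q),\,N-N(q)-l)}(x_{N(q)+1},\dots,x_n) \\
&\quad \le \chi_q\, \frac{N^{N(q)+l}}{l!}\, \frac{V_{q+1}^l}{k^{N(q)+l}} \Bigl( \frac{k}{(2\lambda)^d} \Bigr)^{N(q)+l} \xi^{n-N(q)} \\
&\quad \le \chi_q\, \rho_{\max}\, \xi^{n-1}\, \frac{(\rho_{\max} V_{q+1})^l}{l!}.
\end{align*}
Summing over $q \ge P$ and $l$ we obtain
\[
\frac{N!}{(N-n)!}\, \frac{1}{Z_\lambda^{N,\beta}} \int_{S^{III}} e^{-\beta \tilde U_{\phi,\lambda}(\{x_1,\dots,x_N\})}\,dx_{n+1} \cdots dx_N \le \xi^{n-1} \rho_{\max} \sum_{q \ge P} e^{-(\beta C \psi_{q+1} - \beta\kappa' - \rho_{\max}) V_{q+1}}. \tag{3.11}
\]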
The assertion is implied by our choice of $\xi$ and (3.9)–(3.11), since the set of tuples $(x_{n+1},\dots,x_N) \in \Lambda_\lambda^{N-n}$ such that $\{x_1,\dots,x_N\}$ has distances $< \lambda$ has full Lebesgue measure in $\Lambda_\lambda^{N-n}$.

As in [14, Theorem 3.2] in the case of empty boundary condition one also obtains an improved Ruelle bound.

Corollary 3.16. Under the assumptions of Theorem 3.14, there exists a constant $\zeta \ge \xi$ such that for all $\lambda \ge \lambda^*$ and $n \in \mathbb{N}_0$, $N \in \mathbb{N}$ fulfilling $n \le N \le \rho_{\max} (2\lambda)^d$ it holds
\[
k_\lambda^{(n,N)}(x_1,\dots,x_n) \le \zeta^n \inf_{1 \le i \le n} e^{-\beta \sum_{j \ne i} \hat\phi_\lambda(x_i - x_j)} \tag{3.12}
\]
for all $x_1,\dots,x_n \in \Lambda_\lambda$ having distances $< \lambda$ (thus Lebesgue-a.e.). It follows also that
\[
k_\lambda^{(n,N)}(x_1,\dots,x_n) \le \zeta^n e^{-\frac{2\beta}{n} \sum_{\{i,j\} \subset \{1,\dots,n\}} \hat\phi_\lambda(x_i - x_j)} = \zeta^n e^{-\frac{2\beta}{n} \tilde U_{\phi,\lambda}(\{x_1,\dots,x_n\})}.
\]

Proof. The proof is a slight modification of the proof of the second assertion of [14, Theorem 3.2]. Since the canonical ensemble with respect to $\phi$ with periodic boundary condition equals the canonical ensemble with respect to $\hat\phi_\lambda$ with empty boundary condition, we have the following Kirkwood–Salsburg type equation (cf. [14] or [16, Eq. (38.16)]):
\begin{align*}
k_\lambda^{(n,N)}(x_1,\dots,x_n) = N\, \frac{Z_\lambda^{N-1,\beta}}{Z_\lambda^{N,\beta}}\, e^{-\beta \sum_{2 \le i \le n} \hat\phi_\lambda(x_1 - x_i)} \Bigl( & k_\lambda^{(n-1,N-1)}(x_2,\dots,x_n) \\
& + \sum_{l=1}^{N-n} \frac{1}{l!} \int_{\Lambda_\lambda^l} k_\lambda^{(n+l-1,N-1)}(x_2,\dots,x_n,y_1,\dots,y_l) \prod_{i=1}^{l} \bigl( e^{-\beta \hat\phi_\lambda(x_1 - y_i)} - 1 \bigr)\,dy_1 \cdots dy_l \Bigr).
\end{align*}
We may assume that any tuples occurring in this formula have distances $< \lambda$ and by translation invariance we are allowed to assume that $x_1 = 0$. Then under the integral sign we may replace $\hat\phi_\lambda$ by $\phi_\lambda$. But $\phi_\lambda$ fulfills (T) and (BB) uniformly in $\lambda \ge \lambda_0$. Thus
\[
I_\lambda := \int_{\mathbb{R}^d} |e^{-\beta \phi_\lambda(y)} - 1|\,dy < \infty
\]
is bounded independently of $\lambda \ge \lambda_0$. Therefore, by Lemma 3.9 and Theorem 3.14, we have
\[
k_\lambda^{(n,N)}(x_1,\dots,x_n) \le e^{-\beta \sum_{2 \le i \le n} \hat\phi_\lambda(x_1 - x_i)}\, k \rho_{\max} \Bigl( \xi^{n-1} + \sum_{l=1}^{N-n} \frac{\xi^{n+l-1} I^l}{l!} \Bigr) \le e^{-\beta \sum_{2 \le i \le n} \hat\phi_\lambda(x_1 - x_i)}\, \zeta^n,
\]
where $\zeta := \max\{\xi, k \rho_{\max} e^{\xi I}\}$ with $I = \sup_{\lambda \ge \lambda_0} I_\lambda$. Symmetry of the correlation function implies the assertions.

3.4. Weak limits of measures and Ruelle bounds

In this section, we prove that a uniform (improved) Ruelle bound is transported to weak limits $\mu_n \to \mu$ of measures on $\Gamma$. Moreover, we prove that for a large class of functions $f: \Gamma_0 \to \mathbb{R}$ defined on the space $\Gamma_0 := \{\hat\gamma \subset \mathbb{R}^d \mid \sharp\hat\gamma < \infty\}$ of finite configurations it holds $\mu_n(Kf) \to \mu(Kf)$ and that one may find bounded continuous local functions approximating $Kf$ uniformly in $L^1(\Gamma; \mu_n)$, $n \in \mathbb{N}$, and in $L^1(\Gamma; \mu)$. Here $Kf: \Gamma \to \mathbb{R}$ denotes the K-transform of $f$, given by $Kf(\hat\gamma) := \sum_{\hat\eta \subset \hat\gamma,\, \hat\eta \in \Gamma_0} f(\hat\eta)$. For further information on this mapping see [22]. These results then also hold for $\Gamma_0$, $\Gamma$ replaced by the velocity marked spaces $\Gamma_0^v$, $\Gamma^v$, when one assumes that the velocities are independently Gaussian distributed and do not depend on the configuration. This is (basically) seen with the help of Lemma 3.17 below.

Let us at first collect some more notations (cf. [22]). By $\Gamma_\Lambda$, we denote the subset of $\Gamma$ consisting of configurations contained in $\Lambda \subset \mathbb{R}^d$. Let now $\Lambda \subset \mathbb{R}^d$ be open. The projection $\mathrm{pr}_\Lambda: \Gamma \to \Gamma_\Lambda$ mapping $\gamma \in \Gamma$ to $\gamma \cap \Lambda \in \Gamma_\Lambda$ is continuous, when $\Gamma_\Lambda$ and $\Gamma$ are equipped with the vague topology, which we will always assume below. This means, we equip $\Gamma_\Lambda$ (respectively, $\Gamma$) with the vague topology on the set of Radon measures on $\Lambda$ (respectively, $\mathbb{R}^d$). Moreover, these spaces shall be equipped with the corresponding Borel $\sigma$-fields. We denote by $\Gamma_n \subset \Gamma_0$ the set of $n$-point configurations and by $\Gamma_{\Lambda,n} \subset \Gamma_\Lambda$ the set of $n$-point configurations contained in $\Lambda$, $\Lambda \subset \mathbb{R}^d$ open, bounded. For measurable and topological structures on these spaces we refer to [22] and to the considerations around Lemma 3.21 below. We denote the Borel $\sigma$-field on $\Gamma_0$ by $\mathcal{B}(\Gamma_0)$. Let $\Lambda \subset \mathbb{R}^d$ be open and bounded. Below we use the fact that when we consider $\Gamma_\Lambda \subset \Gamma_0$ as a topological, hence as a measurable space, the corresponding measurable structure
coincides with the one on $\Gamma_\Lambda$ induced by the vague topology (cf. [22, Remark 2.1.2]). This implies that $\mathrm{pr}_\Lambda: \Gamma \to \Gamma_\Lambda \subset \Gamma_0$ is measurable. The Lebesgue–Poisson measure $\lambda$ on $\Gamma_0$ is defined by
\[
\lambda(A) := \sum_{n=0}^{\infty} \frac{1}{n!} \int_{(\mathbb{R}^d)^n} 1_A(\{x_1,\dots,x_n\})\,dx_1 \cdots dx_n \quad \text{for } A \in \mathcal{B}(\Gamma_0).
\]
A measure $\mu$ on $\Gamma$ is said to be locally absolutely continuous with respect to Lebesgue–Poisson measure if for each open bounded $\Lambda \subset \mathbb{R}^d$ the image measure $\mu \circ \mathrm{pr}_\Lambda^{-1}$ is absolutely continuous with respect to the restriction of $\lambda$ to $\Gamma_\Lambda$. For any such probability measure $\mu$ one defines the correlation functional $\rho_\mu: \Gamma_0 \to \mathbb{R}$ by
\[
\rho_\mu(\hat\gamma) := \int_{\Gamma_\Lambda} \frac{d(\mu \circ \mathrm{pr}_\Lambda^{-1})}{d\lambda}(\hat\gamma \cup \hat\gamma')\,d\lambda(\hat\gamma') \quad \text{when } \hat\gamma \in \Gamma_\Lambda,\ \Lambda \subset \mathbb{R}^d \text{ open, bounded}.
\]
In the same manner we define $\Gamma_\Lambda^v$, $\mathrm{pr}_\Lambda^v$, etc., but we replace the vague topology by the topology generated by bounded continuous functions with spatially bounded support. We define $\lambda^v$ to be the Lebesgue–Poisson measure corresponding to the intensity measure $\frac{1}{\sqrt{2\pi/\beta}^{\,d}}\, e^{-\beta v^2/2}\,d(x,v)$ (cf. [22, Chap. 3.1.3]), i.e.
\[
\lambda^v(A) := \sum_{n=0}^{\infty} \frac{1}{n!}\, \frac{1}{\sqrt{2\pi/\beta}^{\,nd}} \int_{(\mathbb{R}^d \times \mathbb{R}^d)^n} 1_A(\{(x_1,v_1),\dots,(x_n,v_n)\})\, e^{-\beta (v_1^2 + \dots + v_n^2)/2}\,dx_1 \cdots dv_n \quad \text{for } A \in \mathcal{B}(\Gamma_0^v).
\]
We also define for a function $f: \Gamma_0^v \to \mathbb{R}$ similarly to the unmarked case $Kf(\gamma) := \sum_{\eta \subset \gamma,\, \eta \in \Gamma_0^v} f(\eta)$.

The difference between the velocity marked situation and the unmarked situation is negligible for the sort of results we derive below, if the velocities are assumed to be independent and Gaussian distributed. This is (mainly) seen by the following lemma.

Lemma 3.17. Let $\mu$ be a probability measure on $\Gamma$ which is locally absolutely continuous with respect to Lebesgue–Poisson measure $\lambda$. Then there exists a unique measure $\mu^v$ on $\Gamma^v$, defined via
\[
\frac{d(\mu^v \circ (\mathrm{pr}_\Lambda^v)^{-1})}{d\lambda^v}(\{(x_1,v_1),\dots,(x_k,v_k)\}) = \frac{d(\mu \circ \mathrm{pr}_\Lambda^{-1})}{d\lambda}(\{x_1,\dots,x_k\}) \tag{3.13}
\]
for any $\{(x_1,v_1),\dots,(x_k,v_k)\} \in \Gamma_\Lambda^v$, $k \in \mathbb{N}_0$, $\Lambda \subset \mathbb{R}^d$ open, bounded. Moreover, for measures $\mu$, $\mu_n$, $n \in \mathbb{N}$, on $\Gamma$ which are locally absolutely continuous with respect to Lebesgue–Poisson measure one obtains

(i) $\mu_n \to \mu$ weakly iff $\mu_n^v \to \mu^v$ weakly.
(ii) $(\mu_n)_n$ is tight iff $(\mu_n^v)_n$ is tight.
(iii) For any nonnegative measurable $f: \Gamma_0^v \to \mathbb{R}_0^+$ it holds $\mu^v(Kf) = \mu(Kf_*)$
where
\[
f_*(\{x_1,\dots,x_k\}) := \frac{1}{\sqrt{2\pi/\beta}^{\,kd}} \int_{\mathbb{R}^{kd}} f(\{(x_1,v_1),\dots,(x_k,v_k)\})\, e^{-(\beta/2)(v_1^2 + \dots + v_k^2)}\,dv_1 \cdots dv_k
\]
for $\{x_1,\dots,x_k\} \in \Gamma_0$, $k \in \mathbb{N}_0$. Moreover $\mu_n(Kf_*) \to \mu(Kf_*)$ as $n \to \infty$ iff $\mu_n^v(Kf) \to \mu^v(Kf)$ as $n \to \infty$.
(iv) For the correlation functionals it holds $\rho^v_{\mu^v}(\gamma) = \rho_\mu(\mathrm{pr}_x \gamma)$, when we define $\rho^v_{\mu^v}$ analogously to $\rho_\mu$.

Proof. Existence and uniqueness are seen using Kolmogorov's theorem (cf. [22, Sec. 3.1.3]). For proofs of (i), (ii) and (iv) we refer to [6, Lemma 3.3.16]. (iii) follows for f ∈ ΓΛ,m, $\Lambda \subset \mathbb{R}^d$ bounded, by a calculation as in the proof of [6, Lemma 3.3.16(i)] and extends to general $f$ by monotone convergence.

If a probability measure $\mu$ on $\Gamma$ is locally absolutely continuous with respect to Lebesgue–Poisson measure and its correlation functional fulfills
\[
\rho_\mu(\eta) \le \xi^{\sharp\eta} \quad \text{for } \lambda\text{-a.e. } \eta \in \Gamma_0, \tag{3.14}
\]
it is said to fulfill a Ruelle bound. Note that by [22, Proposition 4.2.2] (or by Proposition 3.19 below) any measure fulfilling a Ruelle bound possesses finite local moments, i.e. $\mu((\sharp(\cdot \cap \Lambda))^m) < \infty$ holds for any relatively compact $\Lambda \subset \mathbb{R}^d$ and $m \in \mathbb{N}$.

For the unmarked situation the following lemma is contained in [22, Theorem 4.2.11]. In the velocity marked case it can be shown analogously or using Lemma 3.17(iii) above. For a measure $\mu$ on $\Gamma_0$ (respectively, $\Gamma_0^v$) and a nonnegative measurable function $f: \Gamma_0 \to \mathbb{R}$ (respectively, $f: \Gamma_0^v \to \mathbb{R}$) we denote by $f(\cdot)\mu$ or $f\mu$ the measure having density $f$ with respect to $\mu$.

Lemma 3.18. Let $\mu$ be a probability measure on $\Gamma$ which is locally absolutely continuous with respect to Lebesgue–Poisson measure. Let $f \in L^1(\Gamma_0; \rho_\mu(\cdot)\lambda)$. Then $Kf \in L^1(\Gamma; \mu)$, $\|Kf\|_{L^1(\Gamma;\mu)} \le \|K|f|\|_{L^1(\Gamma;\mu)} \le \|f\|_{L^1(\Gamma_0;\rho_\mu(\cdot)\lambda)}$ and
\[
\int_{\Gamma_0} f(\eta) \rho_\mu(\eta)\,d\lambda(\eta) = \int_{\Gamma} (Kf)(\gamma)\,d\mu(\gamma).
\]
In particular, the sum defining $Kf$ converges $\mu$-a.s. absolutely. The same holds with $\mu$, $\rho_\mu$, $\lambda$, $\Gamma_0$, $\Gamma$ replaced by $\mu^v$, $\rho^v_{\mu^v}$, $\lambda^v$, $\Gamma_0^v$, $\Gamma^v$, respectively.

When one assumes some more integrability for $f$, the above integrability result may be extended also to powers of K-transforms, since these powers can be expressed as sums of K-transforms of products, as the following proposition makes precise.
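For a finite configuration the K-transform is a finite sum, so the statement that powers of K-transforms are sums of K-transforms of products can be checked by hand in the simplest case: for $f$ supported on one-point configurations with $f(\{x\}) = g(x)$, one has $(Kf)^2 = K(f^2) + 2\,K(f_2)$ with $f_2(\{x,y\}) = g(x)g(y)$. The Python sketch below verifies this on a small configuration. The representation of configurations as sets of integers and all function names are assumptions made for this illustration.

```python
from itertools import combinations

def k_transform(f, gamma):
    """(Kf)(gamma): sum of f(eta) over all subsets eta of the finite configuration gamma."""
    points = sorted(gamma)
    return sum(f(frozenset(eta))
               for size in range(len(points) + 1)
               for eta in combinations(points, size))

def one_point(g):
    """f supported on one-point configurations: f({x}) = g(x), f = 0 otherwise."""
    return lambda eta: g(next(iter(eta))) if len(eta) == 1 else 0

def two_point_product(g):
    """f supported on two-point configurations: f({x, y}) = g(x) g(y), f = 0 otherwise."""
    def f(eta):
        if len(eta) != 2:
            return 0
        x, y = eta
        return g(x) * g(y)
    return f

if __name__ == "__main__":
    gamma = {0, 1, 3, 4}                 # a four-point configuration in R^1
    g = lambda x: x * x + 1              # arbitrary integer-valued test function
    lhs = k_transform(one_point(g), gamma) ** 2
    rhs = (k_transform(one_point(lambda x: g(x) ** 2), gamma)
           + 2 * k_transform(two_point_product(g), gamma))
    print(lhs, rhs)                      # both sides of (Kf)^2 = K(f^2) + 2 K(f_2)
```

The same bookkeeping for general $m$ and $K$ is what the index sets in Proposition 3.19 encode.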
Proposition 3.19. Let $\mu$ be a probability measure on $\Gamma$ which is locally absolutely continuous with respect to Lebesgue–Poisson measure. Let $K \in \mathbb{N}$ and $f: \Gamma_m \to \mathbb{R}$ (or equivalently $f: (\mathbb{R}^d)^m \to \mathbb{R}$ symmetric) be measurable. Define for $M \le mK$
\[
Y^m_{M,K} := \{(\alpha_1,\dots,\alpha_{mK}) \in \{1,\dots,M\}^{mK} \mid \alpha_{lm+1},\dots,\alpha_{lm+m} \text{ are pairwise distinct for all } l = 0,\dots,K-1,\ \sharp\{\alpha_1,\dots,\alpha_{mK}\} = M\}.
\]
(The last condition in this definition ensures that each element of $\{1,\dots,M\}$ appears at least once.) Assume that for any $M \le mK$ and $(\alpha_1,\dots,\alpha_{mK}) \in Y^m_{M,K}$ it holds
\[
\int_{(\mathbb{R}^d)^M} \rho_\mu(\{\xi_1,\dots,\xi_M\}) \prod_{l=0}^{K-1} |f(\xi_{\alpha_{lm+1}},\dots,\xi_{\alpha_{lm+m}})|\,d\xi_1 \cdots d\xi_M < \infty. \tag{3.15}
\]
Then
\[
\int_\Gamma |Kf|^K\,d\mu < \infty
\]
and this integral may be estimated by a (finite) linear combination of the integrals in (3.15) with coefficients only depending on $K$ and $M$, not on $\mu$. A similar statement holds in the velocity marked case with independent Gaussian velocities. In this case, in (3.15) one also integrates over the velocities, but with respect to the corresponding Gaussian measure instead of Lebesgue measure.

Proof. See [6, Proposition 3.3.5].

Remark 3.20. (i) In fact, we only use the above proposition for $m = 1$ and for $m = 2$. If $m = 1$ and $\rho_\mu$ fulfills (3.14), the situation becomes considerably easier, since (3.15) is implied by $f \in L^1(\mathbb{R}^d; dx) \cap L^K(\mathbb{R}^d; dx)$ and $\mu(|Kf|^K)$ can be estimated in terms of $\|f\|_{L^1(\mathbb{R}^d;dx)}$, $\|f\|_{L^K(\mathbb{R}^d;dx)}$ and $\xi$. In the velocity marked case the situation for $m = 1$ is analogous.
(ii) Note that if $(\mu_n)_{n \in \mathbb{N}}$ are as in Proposition 3.19 and fulfill a Ruelle bound uniformly in $n$, one finds that the estimate for $\|Kf\|_{L^K(\Gamma;\mu_n)}$ resulting from Proposition 3.19 and the Ruelle bound is uniform with respect to $n$.

Before going on we need some information on the topological and measurable structure of $\Gamma_0$. $\Gamma_0 = \bigcup_{m=0}^{\infty} \Gamma_m$ is equipped with the topology of disjoint union. Therefore, $\mathcal{B}(\Gamma_0)$ is generated by open sets $U \subset \Gamma_m$, $m \in \mathbb{N}$, which are bounded in the sense that $U \subset \Gamma_{\Lambda,m}$ for some open bounded $\Lambda \subset \mathbb{R}^d$. The topology on $\Gamma_m$, $m \in \mathbb{N}$, is defined as the quotient topology with respect to the mapping $\mathrm{sym}_m: (\mathbb{R}^d)^m \setminus D_m \to \Gamma_m$, where $D_m = \{(x_1,\dots,x_m) \in (\mathbb{R}^d)^m \mid x_i = x_j \text{ for some } i \ne j\}$.
We find that the set $\mathcal{U}_m$ of open bounded sets in $\Gamma_m$ is closed with respect to finite intersections and that there exists a sequence $(U_k)_{k \in \mathbb{N}} \subset \mathcal{U}_m$ increasing to $\Gamma_m$. Therefore, the collection $\mathcal{U} := \bigcup_m \mathcal{U}_m$ can be used to uniquely determine measures on $\Gamma_0$ that are finite on elements of $\mathcal{U}$ (see [3, Theorem 10.3]). We use this fact in the proof of Lemma 3.22 below as well as the following lemma.

Lemma 3.21. Let $U \subset \Gamma_0$ be open and bounded, i.e. there exists $M \in \mathbb{N}_0$ and an open bounded subset $\Lambda \subset \mathbb{R}^d$ such that $U \subset \bigcup_{m=0}^{M} \Gamma_{\Lambda,m}$. Then there exists a sequence $(f_k)_{k \in \mathbb{N}}$ of nonnegative bounded continuous functions on $\Gamma_0$ increasing to $1_U$.

Proof. Assume without loss of generality that $U \subset \Gamma_{\Lambda,m}$, $m \in \mathbb{N}$. Since $\mathrm{sym}_m^{-1}(U) \subset (\mathbb{R}^d)^m$ is open we may choose nonnegative bounded continuous functions $\tilde f_k: (\mathbb{R}^d)^m \to \mathbb{R}$, $k \in \mathbb{N}$, with bounded support increasing to $1_{\mathrm{sym}_m^{-1}(U)}$. By mixing over the permutations of the arguments we may assume that the $\tilde f_k$ are symmetric. Now define $f_k(\{x_1,\dots,x_m\}) := \tilde f_k(x_1,\dots,x_m)$ for $\{x_1,\dots,x_m\} \in \Gamma_m$, $k \in \mathbb{N}$. The desired properties of the sequence $(f_k)_{k \in \mathbb{N}}$ follow immediately.

We now prove the result mentioned at the beginning for the case of the original (in contrast to "improved") Ruelle bound. A measurable function $F: \Gamma^v \to \mathbb{R}$ (respectively, $F: \Gamma \to \mathbb{R}$) is called a cylinder function, if for some bounded measurable $\Lambda \subset \mathbb{R}^d$ it holds $F = F \circ \mathrm{pr}_\Lambda^v$ (respectively, $F = F \circ \mathrm{pr}_\Lambda$).

Lemma 3.22. Let $(\mu_n)_n$ be a sequence of probability measures on $\Gamma$ such that each $\mu_n$ is locally absolutely continuous with respect to Lebesgue–Poisson measure and such that moreover the correlation functionals fulfill a Ruelle bound (3.14) uniformly in $n$. Let $\mu_n \to \mu$ weakly as $n \to \infty$. Then the following holds:

(i) For any $f \in L^1(\Gamma_0; \xi^{\sharp\cdot}\lambda)$ it holds $\mu_n(Kf) \to \mu(Kf)$ as $n \to \infty$. Moreover, there exists a sequence $(G_k)_{k \in \mathbb{N}}$ of bounded continuous cylinder functions $G_k: \Gamma \to \mathbb{R}$ such that $G_k \to Kf$ as $k \to \infty$ in $L^1(\Gamma; \mu)$ and $L^1(\Gamma; \mu_n)$ uniformly in $n \in \mathbb{N}$.
(ii) $\mu$ is locally absolutely continuous with respect to Lebesgue–Poisson measure and its correlation functional fulfills the same Ruelle bound as the $\mu_n$.
(iii) The sequence $(\rho_{\mu_n}/\xi^{\sharp\cdot})_{n \in \mathbb{N}}$ converges in weak-∗ sense to $\rho_\mu/\xi^{\sharp\cdot}$ in $L^\infty(\Gamma_0; \lambda)$ (seen as dual of $L^1(\Gamma_0; \lambda)$).

Similar statements hold for $\mu^v$, $\mu_n^v$, $n \in \mathbb{N}$.

Proof. Let $f: \Gamma_m \to \mathbb{R}$ be any nonnegative bounded continuous function having local support, i.e. there exists $\Lambda \subset \mathbb{R}^d$ bounded such that $f(\hat\gamma) = 0$ for all $\hat\gamma \in \Gamma_0 \setminus \Gamma_\Lambda$. Then the mappings $Kf \wedge r: \Gamma \to \mathbb{R}$, $r > 0$, are bounded and continuous (cf. [22, Proposition 4.1.5(v)]). Consequently, $\mu_n(Kf \wedge r) \to \mu(Kf \wedge r)$ as $n \to \infty$. By Proposition 3.19 and Remark 3.20(i) we find that $\mu_n(Kf - Kf \wedge r) \le \frac{\mu_n((Kf)^2)}{r} \to 0$
as $r \to \infty$ uniformly in $n$. Moreover, for each $r > 0$ it holds
\[
\mu(Kf \wedge r) = \lim_{n \to \infty} \mu_n(Kf \wedge r) \le \sup_{n \in \mathbb{N}} \mu_n(Kf) < \infty
\]
by Lemma 3.18. So, the monotone convergence theorem implies $Kf \in L^1(\Gamma; \mu)$ and $Kf \wedge r \to Kf$ in $L^1(\Gamma; \mu)$ as $r \to \infty$. Therefore (i) holds for $f$ as described above. We continue the proof of (i) after showing (ii) and (iii).

Relative compactness of $(\rho_{\mu_n}/\xi^{\sharp\cdot})_{n \in \mathbb{N}}$ with respect to weak-∗ topology follows already from boundedness and the Banach–Alaoglu theorem. Let $\tilde\rho$ be an accumulation point and set $\rho := \tilde\rho\, \xi^{\sharp\cdot}$. (This convenient method for obtaining a limiting correlation functional is taken from [28, Theorem 5.5], [22, Theorem 2.7.12], where it was used to prove the existence of a grand canonical Gibbs measure.) We now prove that $\rho(\cdot)\lambda$ coincides with the correlation measure of $\mu$ (cf. [22, Sec. 4.2]). Once this is shown, we find by [22, Propositions 4.2.2 and 4.2.16] (the conditions given there are fulfilled by $\rho(\cdot)\lambda$) that $\mu$ is locally absolutely continuous with respect to Lebesgue–Poisson measure and by [22, Proposition 4.2.14] we see that $\rho$ is indeed the correlation functional for $\mu$. This implies (ii), and since it implies that there is at most one accumulation point $\tilde\rho$, (iii) also follows.

Let $U \in \mathcal{U}$, i.e. $U \in \mathcal{U}_m$ for some $m \in \mathbb{N}_0$. Choose a sequence $(f_k)_{k \in \mathbb{N}}$ increasing to $1_U$ as in Lemma 3.21. Then by integrability of $1_U$ with respect to $\xi^{\sharp\cdot}\lambda$ and Lemma 3.18 we find that $Kf_k \to K1_U$ in $L^1(\Gamma; \mu_n)$ as $k \to \infty$ uniformly in $n$. As above, using the monotone convergence theorem we obtain that $Kf_k \to K1_U$ also in $L^1(\Gamma; \mu)$. Therefore,
\[
\mu_n(K1_U) \to \mu(K1_U) \tag{3.16}
\]
as $n \to \infty$. We choose a subsequence $(\rho_{\mu_{n_k}})_{k \in \mathbb{N}}$ such that $\lim_{k \to \infty} \rho_{\mu_{n_k}}/\xi^{\sharp\cdot} = \tilde\rho$ in $L^\infty(\Gamma_0; \lambda)$ with respect to weak-∗ topology. Then
\[
\int_{\Gamma_0} \rho_{\mu_{n_k}} 1_U\,d\lambda = \int_{\Gamma_0} \frac{\rho_{\mu_{n_k}}}{\xi^{\sharp\cdot}}\, \xi^{\sharp\cdot} 1_U\,d\lambda \to \int_{\Gamma_0} \rho\, 1_U\,d\lambda \tag{3.17}
\]
as $k \to \infty$. By Lemma 3.18 the left-hand sides of (3.16) and (3.17) coincide. As $U \in \mathcal{U}$ was chosen arbitrarily, this implies by [22, Definition 4.2.1] and the considerations preceding Lemma 3.21 that the correlation measure of $\mu$ is indeed given by $\rho(\cdot)\lambda$.

We complete the proof of (i). Let $f \in L^1(\Gamma_0; \xi^{\sharp\cdot}\lambda)$. We may without loss of generality assume that $f$ is nonnegative. Choose a sequence $(f_k)_{k \in \mathbb{N}}$ of bounded continuous functions having local support converging to $f$ in $L^1(\Gamma_0; \xi^{\sharp\cdot}\lambda)$. Now, since the same Ruelle bound holds uniformly for $\mu_n$, $n \in \mathbb{N}$, and also for $\mu$, (i) follows from Lemma 3.18: It holds $Kf_k \to Kf$ in $L^1(\Gamma; \mu_n)$ uniformly in $n$ and in $L^1(\Gamma; \mu)$.

In the velocity marked case (ii) now follows using the definition of $\mu^v$ and Lemma 3.17(iv). (iii) is also seen using this lemma: For $f \in L^1(\Gamma_0^v, \lambda^v)$ it holds
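with $f_*$ defined as in Lemma 3.17(iii)
\[
\int_{\Gamma_0^v} \frac{\rho^v_{\mu_n^v}}{\xi^{\sharp\cdot}}\, f\,d\lambda^v = \int_{\Gamma_0} \frac{\rho_{\mu_n}}{\xi^{\sharp\cdot}}\, f_*\,d\lambda \to \int_{\Gamma_0} \frac{\rho_\mu}{\xi^{\sharp\cdot}}\, f_*\,d\lambda = \int_{\Gamma_0^v} \frac{\rho^v_{\mu^v}}{\xi^{\sharp\cdot}}\, f\,d\lambda^v
\]
as $n \to \infty$. (i) is shown for the velocity marked case analogously as for the unmarked case.

We now extend the results from Lemma 3.22 to the case of the improved Ruelle bound.

Lemma 3.23. Let $(\mu_n)_n$ be a sequence of probability measures on $\Gamma$, which are locally absolutely continuous with respect to Lebesgue–Poisson measure and converge weakly to $\mu$. Let $\zeta \ge 1$ and $(\tilde h_n)_{n \in \mathbb{N}} \subset L^\infty(\Gamma_0; \lambda)$ be (uniformly bounded and) weak-∗ convergent to some $\tilde h \in L^\infty(\Gamma_0; \lambda)$. Assume that
\[
\rho_{\mu_n}(\hat\eta) \le \zeta^{\sharp\hat\eta}\, \tilde h_n(\hat\eta)
\]
is valid for all $\hat\eta \in \Gamma_0$ and $n \in \mathbb{N}$. Then the following holds:

(i) $\rho_\mu$ fulfills the analogous bound with $\tilde h_n$ replaced by $\tilde h$.
(ii) Assume in addition that there exists a function $\bar h$ such that $\tilde h_n, \tilde h \le \bar h$. For any measurable function $f: \Gamma_0 \to \mathbb{R}$ which is integrable with respect to $\zeta^{\sharp\cdot}\, \bar h(\cdot)\lambda$ it holds $\mu_n(Kf) \to \mu(Kf)$. Moreover, there exists a sequence of bounded continuous cylinder functions $(G_k)_{k \ge 0}$ such that $G_k \to Kf$ as $k \to \infty$ uniformly in $L^1(\Gamma; \mu_n)$, $n \in \mathbb{N}$, and in $L^1(\Gamma; \mu)$.

Similar statements hold for $\mu^v$, $\mu_n^v$, $n \in \mathbb{N}$.

Proof. Since the $\tilde h_n$, $n \in \mathbb{N}$, are uniformly bounded and $\rho_{\mu_n}(\emptyset) = 1$, the $\rho_{\mu_n}$ fulfill a uniform Ruelle bound $\rho_{\mu_n} \le \tilde\zeta^{\sharp\cdot}$ with $\tilde\zeta \ge \zeta$ without loss of generality. So $(\rho_{\mu_n}/\tilde\zeta^{\sharp\cdot})_n$ converges in weak-∗ sense to $\rho_\mu/\tilde\zeta^{\sharp\cdot}$ by Lemma 3.22(iii). Thus
\[
\int_A \frac{\rho_\mu}{\tilde\zeta^{\sharp\cdot}}\,d\lambda = \lim_{n \to \infty} \int_A \frac{\rho_{\mu_n}}{\tilde\zeta^{\sharp\cdot}}\,d\lambda \le \lim_{n \to \infty} \int_A \tilde h_n\, \frac{\zeta^{\sharp\cdot}}{\tilde\zeta^{\sharp\cdot}}\,d\lambda = \int_A \tilde h\, \frac{\zeta^{\sharp\cdot}}{\tilde\zeta^{\sharp\cdot}}\,d\lambda
\]
holds for any measurable set $A \subset \Gamma_{\Lambda,m}$ for some open relatively compact $\Lambda \subset \mathbb{R}^d$ and some $m \in \mathbb{N}$. This implies (i).

We now prove (ii). Let $(f_k)_{k \in \mathbb{N}} \subset L^1(\Gamma_0; \zeta^{\sharp\cdot}\lambda)$ be such that $f_k \to f$ in $L^1(\Gamma_0; \bar h(\cdot)\zeta^{\sharp\cdot}\lambda)$. Due to Lemma 3.18 it holds
\[
\|Kf - Kf_k\|_{L^1(\Gamma;\mu_n)} \le \|f - f_k\|_{L^1(\Gamma_0;\rho_{\mu_n}(\cdot)\lambda)} \le \|f - f_k\|_{L^1(\Gamma_0;\bar h(\cdot)\zeta^{\sharp\cdot}\lambda)},
\]
which converges to 0 as $k \to \infty$ uniformly in $n$. Analogously we see that $Kf_k \to Kf$ in $L^1(\Gamma; \mu)$. Now (ii) follows from Lemma 3.22(i). In the velocity marked case (i) is directly seen by Lemma 3.17(iv) and (ii) is derived analogously to the unmarked case.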
We now focus on a special class of measures, the canonical Gibbs measures. For any open bounded set $\Lambda \subset \mathbb{R}^d$, $N \in \mathbb{N}$, $\beta > 0$ and a symmetric potential $\phi$ (which we assume to be bounded below and finite a.e.) one defines the canonical Gibbs measure $\mu^{\phi,\beta}_{\Lambda,N}$ by
\[
\mu^{\phi,\beta}_{\Lambda,N}(A) := \frac{1}{Z^{\phi,\beta}_{\Lambda,N}} \int_{\Lambda^N} 1_A(x_1,\dots,x_N)\, e^{-\beta U_\phi(x_1,\dots,x_N)}\,dx_1 \cdots dx_N, \tag{3.18}
\]
$A \subset \Lambda^N$ measurable, where $Z^{\phi,\beta}_{\Lambda,N}$ is the normalization constant. Define a mapping $\mathrm{sym}_{\Lambda,N}: \Lambda^N \to \Gamma$ by $\mathrm{sym}_{\Lambda,N}(x_1,\dots,x_N) := \{x_1,\dots,x_N\}$, $x_1,\dots,x_N \in \Lambda$. Then the image measure $\mu^{\phi,\beta}_{\Lambda,N} \circ \mathrm{sym}_{\Lambda,N}^{-1}$ defines the corresponding distribution of $N$-point configurations. (Note that $\mu^{\phi,\beta}_{\Lambda,N}$-a.s. $\mathrm{sym}_{\Lambda,N}$ has values in $\Gamma_{\Lambda,N}$, i.e. one a.s. obtains $N$-point configurations.) We formulate the tightness result from [14, Lemma 5.2] more generally, such that it also covers the periodic boundary case, in which not only the particle number $N$ and the volume $\Lambda$ of the system but also the potential $\phi$ varies.

Lemma 3.24. Let $(\phi_n)_{n \in \mathbb{N}}$ be a sequence of symmetric pair interactions fulfilling (RP) except for the last condition and (BB) uniformly in $n$. Moreover let $(N_n)_{n \in \mathbb{N}} \subset \mathbb{N}$ and $(\Lambda_n)_{n \in \mathbb{N}}$ be a sequence of open relatively compact subsets of $\mathbb{R}^d$. Set $\mu_n := \mu^{\phi_n,\beta}_{\Lambda_n,N_n} \circ \mathrm{sym}_{\Lambda_n,N_n}^{-1}$. If the correlation functionals $\rho_{\mu_n}$ of $\mu_n$, $n \in \mathbb{N}$, fulfill the improved Ruelle bound
\[
\rho_{\mu_n}(\eta) \le \zeta^{\sharp\eta}\, e^{-\frac{2\beta}{\sharp\eta} \sum_{\{x,y\} \subset \eta} \phi_n(x - y)} \quad \text{for all } \eta \in \Gamma_0
\]
uniformly in $n$, then the sequence $(\mu_n)_{n \in \mathbb{N}}$ is tight. As a consequence, the same holds for the sequence $(\mu_n^v)_{n \in \mathbb{N}}$.

Proof. The proof is essentially the same as the proof of [14, Lemma 5.2].

Remark 3.25. (i) Let $\phi$ fulfill (RP), (BB), (T) as in Sec. 3.1 and let $\Lambda_n := \Lambda_{\lambda_n} = (-\lambda_n, \lambda_n]^d$, $n \in \mathbb{N}$. Then by Lemma 3.6 and Corollary 3.16 the conditions of Lemma 3.24 are fulfilled for any sequence $(N_n)_{n \in \mathbb{N}}$, $(\lambda_n)_{n \in \mathbb{N}}$ fulfilling $\lim_{n \to \infty} \lambda_n = \infty$ and $\lim_{n \to \infty} \frac{N_n}{(2\lambda_n)^d} = \rho \in [0, \infty)$ with $\phi_n := \hat\phi_{\lambda_n}$, defined as in (3.5). Hence the corresponding sequence $(\mu_n)_{n \in \mathbb{N}}$ is tight.
(ii) In the periodic case (cf. (i)) one might rather consider $\mu^{\phi_n,\beta}_{\Lambda_n,N_n} \circ \mathrm{per}_{\Lambda_n,N_n}^{-1}$ instead of $\mu^{\phi_n,\beta}_{\Lambda_n,N_n} \circ \mathrm{sym}_{\Lambda_n,N_n}^{-1}$, where $\mathrm{per}_{\Lambda_n,N_n}(x_1,\dots,x_{N_n}) := \bigcup_{r \in \mathbb{Z}^d} \{x_1 + 2\lambda_n r,\dots,x_{N_n} + 2\lambda_n r\}$. But since for any cylinder function $F: \Gamma \to \mathbb{R}$ one finds that $\mu^{\phi_n,\beta}_{\Lambda_n,N_n} \circ \mathrm{per}_{\Lambda_n,N_n}^{-1}(F) = \mu^{\phi_n,\beta}_{\Lambda_n,N_n} \circ \mathrm{sym}_{\Lambda_n,N_n}^{-1}(F)$ for sufficiently large $n$, weak convergence properties of these sequences are equivalent. (A sequence $(\nu_n)_{n \in \mathbb{N}}$ of probability measures on $\Gamma$ converges weakly to a probability measure $\nu$ if $\nu_n(F) \to \nu(F)$ as $n \to \infty$ holds for all bounded continuous cylinder functions, see, e.g., [6, Lemma 3.3.15].)

Finally, in order to apply the result from Lemma 3.23 to the case of periodic boundary condition, we make the following remark.
Remark 3.26. Consider again the situation from Remark 3.25(i). For $\psi: \mathbb{R}^d \to \mathbb{R}$ we define $b_\psi: \Gamma_0 \to \mathbb{R}$ by $b_\psi(\eta) := e^{-\frac{2\beta}{\sharp\eta} \sum_{\{x,y\} \subset \eta} \psi(x - y)}$ (or $b_\psi(\eta) := \inf_{x \in \eta} e^{-\beta \sum_{y \in \eta \setminus \{x\}} \psi(x - y)}$), $\eta \in \Gamma_0$. Setting $\tilde h_n := 1_{\Lambda_{\lambda_n}} b_{\hat\phi_{\lambda_n}}$ and $\tilde h := b_\phi$ we find by uniform stability of the periodic interaction energy of configurations in $\Lambda_{\lambda_n}$ (cf. Lemma 3.8(ii)) that the $\tilde h_n$ are uniformly bounded. From (3.6) we find that $\phi_{\lambda_n} \to \phi$ pointwise and hence also $\hat\phi_{\lambda_n} \to \phi$ pointwise, which implies that $\tilde h_n \to \tilde h$ pointwise. Together with uniform boundedness we obtain weak-∗ convergence in $L^\infty(\Gamma_0; \lambda)$. Hence Lemma 3.23(i) can be applied and $\mu$ fulfills the improved Ruelle bound for $\phi$.

We now choose a function $\bar h$ which fulfills the assumption of Lemma 3.23(ii) and is useful for the considerations in Sec. 4.4 below. By (3.6) there exists $m > 0$ such that
\[
|\hat\phi_{\lambda_n}(y) - \phi(y)| = |\phi_{\lambda_n}(y) - \phi(y)| \le m \quad \text{for } n \in \mathbb{N},\ |y| < \lambda_n
\]
and
\[
\inf_{z \in \mathbb{R}^d} \hat\phi_{\lambda_n}(z) = \inf_{z \in \mathbb{R}^d} \phi_{\lambda_n}(z) \ge -m - M
\]
for all $n \in \mathbb{N}$, where $-M$ denotes a lower bound of $\phi$. Hence, setting
\[
\underline{\phi}(y) := \begin{cases} \phi(y) - m & \text{if } |y| < \lambda_1, \\ -m - M & \text{else,} \end{cases}
\]
we obtain $\hat\phi_{\lambda_n} \ge \underline{\phi}$ for all $n \in \mathbb{N}$ and $\phi \ge \underline{\phi}$, which implies $\tilde h_n, \tilde h \le b_{\underline{\phi}} =: \bar h$, $n \in \mathbb{N}$. Thus the conclusion of Lemma 3.23(ii) is valid in this case. Moreover, $\underline{\phi} + m + M + M' \ge \phi$, where $M' := \sup_{|x| \ge \lambda_1} \phi(x) < \infty$, so $\|\cdot\|_{L^p(\mathbb{R}^d; e^{-\beta\phi}dx)}$ and $\|\cdot\|_{L^p(\mathbb{R}^d; e^{-\beta\underline{\phi}}dx)}$ are equivalent norms for $p \ge 1$.

4. N/V -Limit of Langevin Dynamics

We now derive the main result of this article. Starting with $N$-particle Langevin dynamics on cuboid domains (Sec. 4.2), we go on by proving tightness of the corresponding laws on $\Gamma^v$ (Sec. 4.3) and finally prove (Sec. 4.4) that any weak accumulation point of these laws solves (1.1) in the sense specified in Sec. 1. Throughout this section we fix an inverse temperature $\beta > 0$. We assume that any function $g: A \to \mathbb{R}$ defined on some subset $A \subset \Gamma_0$ (respectively, $\Gamma_0^v$) is extended to the whole of $\Gamma_0$ (respectively, $\Gamma_0^v$) by being set to 0 on the complement of $A$ (cf. Sec. 3.4 for the definition of $\Gamma_0$, $\Gamma_0^v$).

4.1. Additional conditions on the potential

Let $\phi$ be a (symmetric) pair potential fulfilling the conditions (RP), (BB), (T) given in Sec. 3.1. Consider the following additional conditions on $\phi$:

(WD) (weak differentiability) $\phi$ is continuous in $\mathbb{R}^d \setminus \{0\}$, $\phi$ is weakly differentiable on this set and $\nabla\phi$ is bounded on each of the sets $\{x \in \mathbb{R}^d \mid |x| > r\}$, $r > 0$. Moreover, $\nabla\phi \in L^1(\mathbb{R}^d; e^{-\beta\phi}dx) \cap L^3(\mathbb{R}^d; e^{-\beta\phi}dx)$.
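(IDF) (integrably decreasing forces) $\phi$ is weakly differentiable in $\mathbb{R}^d\setminus\{0\}$ and there exist $R_3 > 0$ and a decreasing function $\theta\colon[R_3,\infty)\to[0,\infty)$ such that
\[
|\nabla\phi(x)|\le\theta(|x|)\quad\text{for all } x\in\mathbb{R}^d,\ |x|\ge R_3,
\qquad\text{and}\qquad
\int_{[R_3,\infty)} r^{d-1}\theta(r)\,dr < \infty.
\]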
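As an illustration of condition (IDF), the following sketch checks the integrability requirement for a Lennard–Jones type force bound. It assumes the normalized form $\phi(x) = |x|^{-12} - |x|^{-6}$ (the constants are an assumption made here, not taken from the text), for which $\theta(r) := 12r^{-13} + 6r^{-7}$ is a decreasing bound on $|\nabla\phi|$. The function names are hypothetical, and only (IDF) is examined, not the other conditions on $\phi$.

```python
import math

def lj_force_bound(r: float) -> float:
    """theta(r) = 12 r^-13 + 6 r^-7: a decreasing bound on |grad phi| for the
    assumed normalized Lennard-Jones form phi(x) = |x|^-12 - |x|^-6."""
    return 12.0 * r ** -13 + 6.0 * r ** -7

def idf_tail_closed_form(d: int, R: float) -> float:
    """Closed form of int_R^inf r^(d-1) theta(r) dr (infinite for d >= 7)."""
    if d >= 7:
        return math.inf
    return 12.0 * R ** (d - 13) / (13 - d) + 6.0 * R ** (d - 7) / (7 - d)

def idf_tail_numeric(d: int, R: float, steps: int = 20000) -> float:
    """Midpoint rule for the same integral after substituting r = R/u, u in (0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        r = R / u
        # dr = R / u^2 du
        total += r ** (d - 1) * lj_force_bound(r) * R / (u * u)
    return total * h
```

For this particular bound the integral is finite exactly when $d < 7$; in $d = 3$ with $R_3 = 1$ the closed form and the quadrature agree.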
Remark 4.1. (i) We regard both assumptions as quite natural and sufficiently weak: they allow, e.g., the Lennard–Jones potential or any other potential fulfilling (WD) and being such that $|\nabla\phi(x)|_2$ decreases as $|x|_2\to\infty$, $x\in\mathbb{R}^d$.

(ii) In order to do the construction using a limit of dynamics corresponding to $\phi$ with periodic boundary, we need uniform $L^3$-integrability of the $\nabla\hat\phi_\lambda$ with respect to $e^{-\beta\hat\phi_\lambda}$ (cf. (3.5)), at least for a sequence $\lambda_n$ tending to $\infty$ as $n\to\infty$. When (WD) holds, this is an assumption on the behavior of $\nabla\phi$ away from the origin. Condition (IDF) yields an appropriate behavior, as we prove in the following lemma. Though it might not be optimal, it is sufficient for our purposes.

(iii) We expect that one does not need (IDF) to construct a martingale solution of (1.1). The construction for a potential $\phi$ not fulfilling (IDF) might be done by constructing first the dynamics for smooth cut-offs of $\phi$ by approximation with periodic potentials and then approximating $\phi$ by the cut-offs. However, we do not enter into a detailed discussion of this question here.

Lemma 4.2. (i) Let $\phi$ fulfill (RP), (BB), (T), (WD). Then for any $\lambda > 0$ the function $\hat\phi_\lambda$ is weakly differentiable in $(-2\lambda,2\lambda)^d\setminus\{0\}$ and for any $\lambda_0 > 0$ it holds $\sup_{\lambda\ge\lambda_0}\|\nabla\hat\phi_\lambda\|_{L^1(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)} < \infty$.

(ii) If $\phi$ additionally fulfills (IDF), then $\sup_{\lambda\ge\lambda_0}\|\nabla\hat\phi_\lambda\|_{L^3(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)} < \infty$ holds for any $\lambda_0 > 0$.
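Proof. The functions $\hat\phi_\lambda$, $\lambda > 0$, are cut-offs of $2\lambda$-periodic functions, hence all the assertions on integrals over $\Lambda_{2\lambda}$ can be reduced to assertions on integrals over $\Lambda_\lambda$. For example, we have $\|\nabla\hat\phi_\lambda\|^p_{L^p(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)} = 2^d\|\nabla\hat\phi_\lambda\|^p_{L^p(\Lambda_\lambda;e^{-\beta\hat\phi_\lambda}dx)}$, $p\in[1,\infty)$.

(i) For the first assertion, it suffices to prove that $\hat\phi_\lambda$ is the $L^1(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)$-limit of a sequence of weakly differentiable (in $(-2\lambda,2\lambda)^d\setminus\{0\}$) functions which converges with respect to the norm
\[
\|\cdot\|_{W^{1,1}(\Lambda_{2\lambda},e^{-\beta\hat\phi_\lambda}dx)} := \|\cdot\|_{L^1(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)} + \|\nabla\cdot\|_{L^1(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)}.
\]
Define $\hat\phi_{\lambda,k} := \sum_{r\in\mathbb{Z}^d,\,|r|\le k}\phi(\cdot + 2\lambda r)$, $k\in\mathbb{N}$. Then $\hat\phi_{\lambda,k}\to\hat\phi_\lambda$ as $k\to\infty$ in $L^1(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)$ and moreover for $k,l\in\mathbb{N}$, $k\ge l$, it holds
\[
\|\nabla\hat\phi_{\lambda,k} - \nabla\hat\phi_{\lambda,l}\|_{L^1(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)} \le 2^d\int_{\{2l\lambda\le|x|\}}|\nabla\phi(x)|\,e^{\beta(m+M)}\,dx\to 0
\]
as $l\to\infty$. Here $m$ is as in Remark 3.26. This shows the first assertion.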
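Since $m$ can be chosen independently of $\lambda\ge\lambda_0$, we find by an easy argument similar to the above calculation that
\[
\|\nabla\hat\phi_\lambda\|_{L^1(\Lambda_{2\lambda};e^{-\beta\hat\phi_\lambda}dx)} \le 2^d e^{\beta(m+M+M')}\|\nabla\phi\|_{L^1(\mathbb{R}^d;e^{-\beta\phi}dx)},
\]
where $M' := \sup_{|x|\ge\lambda_0}\phi(x)$.

We now prove (ii). We may assume that $R_3 = R_1 = R_2 =: R\le\lambda_0$. By the considerations preceding the proof of (i) we have to estimate
\[
\int_{\Lambda_\lambda} e^{-\beta\hat\phi_\lambda(x)}\Big|\sum_{r\in\mathbb{Z}^d}\nabla\phi(x + 2\lambda r)\Big|^3 dx
\le \int_{\Lambda_\lambda}\Big(|\nabla\phi(x)| + \sum_{0\ne r\in\mathbb{Z}^d}\theta(|x + 2\lambda r|)\Big)^3 e^{-\beta\hat\phi_\lambda(x)}\,dx
\]
independently of $\lambda\ge\lambda_0$. Due to $L^3$-integrability of $\nabla\phi$ with respect to $e^{-\beta\phi}dx$, hence with respect to $e^{-\beta\hat\phi_\lambda}dx$ uniformly in $\lambda\ge\lambda_0$, it suffices to show that $(2\lambda)^d\sup_{x\in\Lambda_\lambda}\big(\sum_{r\ne 0}\theta(|x + 2\lambda r|)\big)^3$ is bounded independently of $\lambda\ge\lambda_0$. This follows from monotonicity and integrability of $\theta$: For $\lambda\ge\lambda_0$, $x\in\Lambda_\lambda$ it holds
\[
\sum_{\substack{r=(r_1,\dots,r_d)\in\mathbb{Z}^d\\ r_i\ge 1\ \forall i}}\theta(|x + 2\lambda r|)
\le \sum_{\substack{r=(r_1,\dots,r_d)\in\mathbb{Z}^d\\ r_i\ge 0\ \forall i}}\frac{\int_{\Lambda_\lambda}\theta(|y + 2\lambda r|)\,dy}{(2\lambda)^d}
\le \frac{C\int_{[0,\infty)}\theta(t)\,t^{d-1}\,dt}{(2\lambda)^d}
\]
for some $C < \infty$ independent of $\lambda$. Here we extend $\theta$ to $[0,\infty)$ by setting $\theta(t) := \theta(R_3)$ for $t\le R_3$. Other parts of the sum $\sum_{0\ne r\in\mathbb{Z}^d}$ are treated in an analogous way.

Remark 4.3. In order to have some more concrete legitimation for the introduction of the additional condition (IDF), we consider the following example: Set $d = 1$ and consider $\phi$ according to (RP), (BB) such that it holds $\nabla\phi(x) = \mp 1$ whenever $2k\pm\frac12 - \frac{1}{(|k|+1)^2}\le x\le 2k\pm\frac12 + \frac{1}{(|k|+1)^2}$ for some $k\in\mathbb{Z}\setminus\{0\}$ and $\mp\nabla\phi(x)\ge 0$ when $2k\pm\frac12 - \frac14\le x\le 2k\pm\frac12 + \frac14$. Then $\phi$ can be such that $\nabla\phi\in L^1\cap L^3(\mathbb{R}^d;e^{-\beta\phi}dx)$ and (T) is fulfilled, but $\sum_{r\in\mathbb{Z}}\nabla\phi(\cdot + 2r)$ behaves like $1/\sqrt{|\cdot - \frac12|}$ around $\frac12$, so one does not obtain $L^2$-integrability of $\nabla\hat\phi_1$.

Lemma 4.4. Let $\phi$ fulfill (RP), (T), (BB), (WD), and assume that for some $\lambda_0 > 0$ it additionally holds $\sup_{\lambda\ge\lambda_0}\|\nabla\hat\phi_\lambda\|_{L^p(\Lambda_\lambda;e^{-\beta\hat\phi_\lambda}dx)} < \infty$ for $p = 1,2,3$. Then it holds for $i = 1,2,3$
\[
\sup_{\lambda\ge\lambda_0}\big\||\nabla\hat\phi_\lambda|\,e^{-\frac{\beta}{3}\hat\phi_\lambda}\big\|_{L^i(\Lambda_{2\lambda};dx)} = 2^d\sup_{\lambda\ge\lambda_0}\big\||\nabla\hat\phi_\lambda|\,e^{-\frac{\beta}{3}\hat\phi_\lambda}\big\|_{L^i(\Lambda_\lambda;dx)} < \infty.
\]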
Proof. The equality is clear, cf. the proof of Lemma 4.2. Choose any $a > 0$. The functions $\hat\phi_\lambda$ are bounded in the set $\{x\in\Lambda_\lambda \mid |x|\ge a\}$ and the bound is uniform in $\lambda\ge\lambda_0$ (cf. Lemma 3.6). Hence there exists $D > 0$ such that $e^{-\frac{i\beta}{3}\hat\phi_\lambda}\le De^{-\beta\hat\phi_\lambda}$ on this set for $\lambda\ge\lambda_0$. We compute for $\lambda\ge\lambda_0$
\[
\begin{aligned}
\big\||\nabla\hat\phi_\lambda|\,e^{-\frac{\beta}{3}\hat\phi_\lambda}\big\|_{L^i(\Lambda_\lambda;dx)}
&\le D^{1/i}\|\nabla\hat\phi_\lambda\|_{L^i(\Lambda_\lambda\cap\{|\cdot|>a\};e^{-\beta\hat\phi_\lambda}dx)} + \big\||\nabla\hat\phi_\lambda|^i e^{-\frac{\beta i}{3}\hat\phi_\lambda}\big\|^{1/i}_{L^1(\Lambda_\lambda\cap\{|\cdot|\le a\};dx)}\\
&\le D^{1/i}\|\nabla\hat\phi_\lambda\|_{L^i(\Lambda_\lambda;e^{-\beta\hat\phi_\lambda}dx)} + (2a)^{\frac{(3-i)d}{3i}}\|\nabla\hat\phi_\lambda\|_{L^3(\Lambda_\lambda;e^{-\beta\hat\phi_\lambda}dx)}
\end{aligned}
\]
by the Hölder inequality. The assertion follows.

4.2. The finite particle dynamics on $\Gamma^v$

Let $\phi$ fulfill (RP), (T), (BB), (WD) and (IDF) and let $N\in\mathbb{N}$, $\lambda > 0$. The state space for the $N$-particle dynamics is given by $E_\lambda^N$, where $E_\lambda := M_\lambda\times\mathbb{R}^d$, $M_\lambda$ being the manifold resulting from glueing the opposite surfaces of $\Lambda_\lambda = (-\lambda,\lambda]^d$ together. We define the $N$-particle potential $\Psi_{\lambda,N}$ by $\Psi_{\lambda,N}(x_1,\dots,x_N) := \sum_{i<j}\hat\phi_\lambda(x_i - x_j)$, $(x_1,\dots,x_N)\in M_\lambda^N$. The potential $\Psi_{\lambda,N}$ fulfills the assumptions of [5, Theorem 3.1] (cf. [5, Example 3.4(i)]). Thus there is a law $P_{\lambda,N}$ on $C([0,\infty),E_\lambda^N)$ with invariant initial distribution $\mu_{\lambda,N} = \mu^{\hat\phi_\lambda,\beta}_{\Lambda_\lambda,N}$ (i.e. the canonical Gibbs measure for $\phi$ with periodic boundary condition) such that the corresponding process is a Markov process solving the martingale problem for the m-dissipative $L^1(E_\lambda^N;\mu_{\lambda,N})$-closure $(L_{\lambda,N},D(L_{\lambda,N}))$ of the generator $(L_{\lambda,N},C_0^\infty(E_\lambda^N))$, given by
\[
L_{\lambda,N} = \frac{\kappa}{\beta}\Delta_v - \kappa v\nabla_v + v\nabla_x - (\nabla\Psi_{\lambda,N})\nabla_v. \tag{4.1}
\]
We essentially do not distinguish $E_\lambda^N$ and $M_\lambda^N\times\mathbb{R}^{dN}$: An element $(x_1,v_1,\dots,x_N,v_N)$ of $E_\lambda^N$ we sometimes denote by $(x,v)$, $x = (x_1,\dots,x_N)$, $v = (v_1,\dots,v_N)$. $\Delta_v$, $\nabla_v$ denote the Laplacian and the gradient, respectively, in the $v$-direction, $v$ denotes multiplication by the vector $v$, $v\nabla_v := \sum_{i=1}^N v_i\nabla_{v_i}$ etc. We need some information on the domain of $L_{\lambda,N}$.

Lemma 4.5. (i) Let $f\in C(E_\lambda^N)$ be such that it possesses continuous partial derivatives up to order 2 in all $v$-directions and continuous partial derivatives of order 1 in all $x$-directions. Assume moreover that $f$ and all mentioned partial derivatives are bounded in absolute value by a multiple of $(x,v)\mapsto(1+|v|)^k$ for some $k\in\mathbb{N}$. Then $f\in D(L_{\lambda,N})$ and $L_{\lambda,N}f$ is given as in (4.1).

(ii) Let $f\colon M_\lambda^N\setminus D_{\lambda,N}\to\mathbb{R}$, where $D_{\lambda,N} := \{(x_1,\dots,x_N)\in M_\lambda^N \mid \exists\, i,j\in\{1,\dots,N\}\colon x_i = x_j,\ i\ne j\}$. Assume that $f$ is once continuously differentiable and that $f,\nabla_x f\in L^2(M_\lambda^N;e^{-\beta\Psi_{\lambda,N}}dx)$. Then when $f$ is considered as a function on $E_\lambda^N$ which is constant in $v$-directions, it holds $f\in D(L_{\lambda,N})$ and $L_{\lambda,N}f$ is given as in (4.1), i.e. $L_{\lambda,N}f = v\nabla_x f$.
Proof. (i) is contained in [5, Lemma 3.7(ii)]. (ii) follows for functions with compact support in $M_\lambda^N\setminus D_{\lambda,N}$ from (i); for bounded functions $f$ it follows by a similar argument as used in the proof of [5, Lemma 3.7(ii)]; and finally one shows that any unbounded $f$ as in (ii) can be represented as graph norm limit of bounded functions as in (ii).

In order to simplify notation, in the sequel we do not distinguish $M_\lambda$ and $\Lambda_\lambda$ as well as $E_\lambda$ and $\Lambda_\lambda\times\mathbb{R}^d$. Moreover, since confusion would not cause trouble, we do not use different notations for $\Lambda_\lambda^N$ and the set $\Lambda_\lambda^N\setminus D_{\lambda,N}$. (The diagonal $D_{\lambda,N}$, defined in Lemma 4.5(ii), is not hit $P_{\lambda,N}$-a.s. and hence may be omitted.) We consider the mapping $\mathrm{per}_{\lambda,N}\colon E_\lambda^N\to\Gamma^v$ defined by
\[
\mathrm{per}_{\lambda,N}(x_1,\dots,v_N) := \bigcup_{r\in\mathbb{Z}^d}\{(x_1 + 2\lambda r, v_1),\dots,(x_N + 2\lambda r, v_N)\}.
\]
Furthermore, we define the mapping $\mathrm{per}_{\lambda,N}^{\otimes[0,\infty)}\colon C([0,\infty),E_\lambda^N)\to C([0,\infty),\Gamma^v)$ by assigning to a path $((x_1(t),\dots,v_N(t)))_{t\ge 0}$ the path $(\mathrm{per}_{\lambda,N}(x_1(t),\dots,v_N(t)))_{t\ge 0}$ of images with respect to $\mathrm{per}_{\lambda,N}$. Both mappings are well-defined except on the diagonal (which is negligible with respect to both $\mu_{\lambda,N}$ and $P_{\lambda,N}$) and measurable. We set $\mu^{(\lambda,N)} := \mu_{\lambda,N}\circ\mathrm{per}_{\lambda,N}^{-1}$ and define $P^{(\lambda,N)} := P_{\lambda,N}\circ(\mathrm{per}_{\lambda,N}^{\otimes[0,\infty)})^{-1}$. These probability laws are the starting point for the construction of an infinite particle Langevin dynamics as a weak limit.

Sometimes we also use the mappings $\mathrm{sym}_{\lambda,N}\colon E_\lambda^N\to\Gamma^v$, given by $\mathrm{sym}_{\lambda,N}(x_1,\dots,v_N) := \{(x_1,v_1),\dots,(x_N,v_N)\}$. This is done for technical reasons: The measures $\mu^{(\lambda,N)}$ are not locally absolutely continuous with respect to Lebesgue–Poisson measure and in particular do not fulfill a Ruelle bound. Therefore, in order to apply the results from Sec. 3.4 we have to use $\mu_{\lambda,N}\circ\mathrm{sym}_{\lambda,N}^{-1}$ instead.

Remark 4.6. Note that one at best faces some technical difficulties trying a construction by dynamics given through $P_{\lambda,N}\circ(\mathrm{sym}_{\lambda,N}^{\otimes[0,\infty)})^{-1}$, where one defines $\mathrm{sym}_{\lambda,N}^{\otimes[0,\infty)}$ analogously to $\mathrm{per}_{\lambda,N}^{\otimes[0,\infty)}$. The paths corresponding to these laws are not even right continuous. In contrast, the laws $P^{(\lambda,N)}$ describe diffusions.

4.3. Tightness

In this section we prove, under the conditions and using the notations of Sec. 4.2, tightness of any sequence $(P^{(\lambda_n,N_n)})_{n\in\mathbb{N}}$ such that $\lambda_n\uparrow\infty$ and $\frac{N_n}{(2\lambda_n)^d}\to\rho\in[0,\infty)$ as $n\to\infty$. In the sequel we abbreviate subscripts $\lambda_n, N_n$ by $n$, i.e. $P_n := P_{\lambda_n,N_n}$, $\mathrm{sym}_n := \mathrm{sym}_{\lambda_n,N_n}$ etc. Paths in $C([0,\infty),\Gamma^v)$ will below always be denoted by $(\gamma_t)_{t\ge 0}$. We may assume that $\lambda_1$ is large enough for Theorem 3.14 (and Corollary 3.16) to apply.

Tightness of the sequence of distributions $P^{(n)}\circ\gamma_t^{-1}$ ($= \mu^{(n)}$), $t\ge 0$, is seen from Remark 3.25 and Lemma 3.17(ii). So we go on by estimating moments of $d^{\Phi,a,h}(\gamma_t,\gamma_s)$, $t,s\ge 0$, with $d^{\Phi,a,h}$ as defined in Sec. 2. We follow an idea from [17]
and use semimartingale decompositions of the summands contained in $d^{\Phi,a,h}(\gamma_t,\gamma_s)$. Before we do so, we need some preparations. The following lemma might also be stated more generally. However, we restrict to what we are about to use later.

Lemma 4.7. Let $f\colon\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$ have bounded spatial support and be once continuously differentiable in the $x$-directions and twice continuously differentiable in the $v$-directions. Assume moreover that all these derivatives are bounded; $f$ itself may be unbounded. Set $F := \langle f,\cdot\rangle = Kf$. Then $F\circ\mathrm{per}_n$ is an element of the domain $D(L_n)$ and it holds
\[
\sup_n\mu_n(|L_n(F\circ\mathrm{per}_n)|^3) < \infty.
\]

Proof. We may without loss of generality assume that the spatial support of $f$ is contained in an open cube of side length less than $2\lambda_1$. (Otherwise we use an appropriate partition of unity corresponding to a suitable locally finite open cover of $\mathbb{R}^d$ to decompose $f$ (cf. the proof of Lemma 4.10 below).) Moreover, we may without loss of generality assume that the spatial support of $f$ is relatively compact in $(-\lambda_1,\lambda_1)^d$, since $L_n$ commutes with simultaneous spatial translations of all particles in (the manifold) $E_{\lambda_n}$ and $\mu_n$ is invariant with respect to these translations. So we may replace $\mathrm{per}_n$ by $\mathrm{sym}_n$. The first assertion is seen from Lemma 4.5(i). (Note that $f(x,v)$, $(x,v)\in\mathbb{R}^d\times\mathbb{R}^d$, grows at most linearly as $|v|\to\infty$.) It holds for $n\in\mathbb{N}$
\[
L_n(F\circ\mathrm{sym}_n) = Kg_1\circ\mathrm{sym}_n - Kg_2^n\circ\mathrm{sym}_n,
\]
where $g_1\colon\Gamma_1\to\mathbb{R}$ is given by
\[
g_1(\{(x,v)\}) := \frac{\kappa}{\beta}\Delta_v f(x,v) - \kappa v\nabla_v f(x,v) + v\nabla_x f(x,v)
\]
and $g_2^n\colon\Gamma_2\to\mathbb{R}$ is given by
\[
g_2^n(\{(x,v),(x',v')\}) := \nabla\hat\phi_{\lambda_n}(x - x')\big(\nabla_v f(x,v) - \nabla_v f(x',v')\big),\qquad (x,v),(x',v')\in E_{\lambda_n}.
\]
Let us first prove
\[
\sup_n\mu_n(|Kg_1\circ\mathrm{sym}_n|^3) < \infty. \tag{4.2}
\]
Since $g_1$ has (seen as a function defined on $\mathbb{R}^d\times\mathbb{R}^d$) bounded spatial support and there exists $C > 0$ such that $|g_1(x,v)|\le C|v|$ for all $(x,v)\in\mathbb{R}^d\times\mathbb{R}^d$, it follows $g_1\in L^p(\Gamma_0;\lambda^v)$ for each $p\in[1,\infty)$. So the improved Ruelle bound (Corollary 3.16), Proposition 3.19 and Remark 3.20 imply (4.2). Concerning $g_2^n$ we have that for $n\in\mathbb{N}$
\[
(K|g_2^n|(\cdot))^3 \le \Big(\sup_{(x,v)\in\mathbb{R}^d\times\mathbb{R}^d}|\nabla_v f(x,v)|_2^3\Big)\,|K\tilde g_2^n(\mathrm{pr}_x\,\cdot)|^3,
\]
where $\tilde g_2^n(x,x') := |\nabla\hat\phi_{\lambda_n}(x - x')|_2\,(1_{\mathrm{supp}_s(f)}(x) + 1_{\mathrm{supp}_s(f)}(x'))$, $x,x'\in\Lambda_{\lambda_n}$, $|\cdot|_2$ denotes the Euclidean norm and $\mathrm{supp}_s$ denotes the spatial support of $f$. So we are left to estimate $\mu_n(|K\tilde g_2^n\circ\mathrm{sym}_n|^3)$. By Proposition 3.19 and using the improved Ruelle
bound (3.12) of the $\mu_n\circ\mathrm{sym}_n^{-1}$, we have to prove that for any $M\in\{2,\dots,6\}$ and any $A,B,C,D,E,F\in\{1,\dots,M\}$ such that $A\ne B$, $C\ne D$, $E\ne F$ and $\{A,B,C,D,E,F\} = \{1,\dots,M\}$, it holds
\[
\begin{aligned}
\sup_n\int_{\Lambda_{\lambda_n}^M} &|\nabla\hat\phi_{\lambda_n}(x_A - x_B)|_2\,|\nabla\hat\phi_{\lambda_n}(x_C - x_D)|_2\,|\nabla\hat\phi_{\lambda_n}(x_E - x_F)|_2\\
&\times\inf_{1\le i\le M} e^{-\beta\sum_{j\ne i}\hat\phi_{\lambda_n}(x_i - x_j)}\,1_{\mathrm{supp}_s(f)}(x_A)\,1_{\mathrm{supp}_s(f)}(x_C)\,1_{\mathrm{supp}_s(f)}(x_E)\,dx_1\cdots dx_M < \infty,
\end{aligned}
\]
which by uniform boundedness of the $\hat\phi_{\lambda_n}$ from below we may replace by
\[
\sup_n\int_{\Lambda_{\lambda_n}^M} c(x_A,x_B)\,c(x_C,x_D)\,c(x_E,x_F)\,1_{\mathrm{supp}_s(f)}(x_A)\,1_{\mathrm{supp}_s(f)}(x_C)\,1_{\mathrm{supp}_s(f)}(x_E)\,dx_1\cdots dx_M < \infty \tag{4.3}
\]
with $c(x,x') := |\nabla\hat\phi_{\lambda_n}(x - x')|_2\,e^{-\frac{\beta}{3}\hat\phi_{\lambda_n}(x - x')}$, $x,x'\in\Lambda_{\lambda_n}$.

To prove (4.3), we integrate successively over all $x_Y$ with $Y\in\{B,D,F\}\setminus\{A,C,E\}$. The integration yields finite values bounded independently of $n$ even if $Y$ appears more than once in the tuple $(A,\dots,F)$ due to Lemma 4.4 and the Hölder inequality. We continue to integrate over the remaining variables until there is no $\nabla\hat\phi_{\lambda_n}$-term left. For every variable which then remains (there is at least one), there is a $1_{\mathrm{supp}_s(f)}$ left. It follows $\sup_n\mu_n(|Kg_2^n\circ\mathrm{sym}_n|^3) < \infty$ and together with (4.2) the assertion is shown.
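The range $M\in\{2,\dots,6\}$ in the proof above comes from expanding the cube of a sum over pairs: three pairs $\{A,B\}$, $\{C,D\}$, $\{E,F\}$ may share points. The following small enumeration is only a sanity check of this counting; the function name is hypothetical.

```python
from itertools import product

def distinct_point_counts(num_labels: int = 6) -> set:
    """Collect all possible numbers M of distinct points touched by three
    pairs (A, B), (C, D), (E, F) subject to A != B, C != D, E != F."""
    counts = set()
    labels = range(num_labels)
    for a, b, c, d, e, f in product(labels, repeat=6):
        if a != b and c != d and e != f:
            counts.add(len({a, b, c, d, e, f}))
    return counts
```

The three pairs may coincide completely (two distinct points) or be pairwise disjoint (six distinct points), and every value in between occurs.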
A much simpler case than in Lemma 4.7 is considered in the following corollary. (Note that the first estimate is immediate.)

Corollary 4.8. Let $f\colon\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}$ be continuous and continuously differentiable in the $v$-directions such that all the derivatives are bounded, whereas $f$ may be unbounded. Then it holds
\[
\sup_n\mu_n(|\nabla_v(Kf\circ\mathrm{per}_n)|_2^3)\le\sup_n\mu_n(|\nabla_v(Kf\circ\mathrm{per}_n)|_1^3) < \infty,
\]
where $|\cdot|_1$ denotes the norm defined by $|(y_1,\dots,y_l)|_1 := \sum_{j=1}^l|y_j|$ for $(y_1,\dots,y_l)\in\mathbb{R}^l$, $l\in\mathbb{N}$.

Remark 4.9. For $k\in\mathbb{N}$ consider $\chi_k$ defined as in Sec. 2 such that $a$ is twice continuously differentiable and $\nabla_v a$, $\Delta_v a$ are bounded (and, as before, $a(v)\to\infty$ as $|v|\to\infty$). Then Lemma 4.7 and Corollary 4.8 apply to $f = \chi_k$. If a function $f$ depends only on the $x$-coordinates, $L_n(Kf\circ\mathrm{per}_n)$ does not contain $\nabla\hat\phi_{\lambda_n}$, $n\in\mathbb{N}$. This enables us to deal also with $S^{\Phi,h_k}$, defined in Sec. 2. Here $h_k := h_{I_k}$.
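For readers less familiar with the notation $Kf = \langle f,\cdot\rangle$ and with pair sums such as $Kg_2^n$, the following sketch evaluates the $K$-transform of one-point and two-point functions on a finite marked configuration. It assumes the usual convention that $Kg(\gamma)$ sums $g$ over all finite subconfigurations of $\gamma$ of the relevant size; the function names and the representation of points as `(x, v)` tuples are choices made here for illustration.

```python
from itertools import combinations

def k_transform_1(f, config):
    """(Kf)(gamma) = <f, gamma>: sum of f over the points of a finite configuration."""
    return sum(f(p) for p in config)

def k_transform_2(g, config):
    """(Kg)(gamma): sum of g over all unordered pairs of distinct points of gamma."""
    return sum(g(p, q) for p, q in combinations(config, 2))

# Example: three particles in d = 1, each given as (position, velocity).
gamma = [(0.0, 1.0), (1.0, -2.0), (2.0, 0.5)]
total_velocity = k_transform_1(lambda p: p[1], gamma)
number_of_pairs = k_transform_2(lambda p, q: 1.0, gamma)
```

A function that ignores the velocity mark, as in the last sentence of Remark 4.9, has a $K$-transform that is constant in the $v$-directions, which is why only the term $v\nabla_x$ of (4.1) acts on it.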
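Lemma 4.10. Let $\tilde\Phi$ be as in Lemma 3.6 (corresponding to the potential $\phi$) and assume (without loss of generality) that $\tilde\Phi(r) = 0$ for $r\ge\lambda_1/2$. Let $k\in\mathbb{N}$. It holds $S^{\beta\tilde\Phi/6,h_k}\circ\mathrm{per}_n\in D(L_n)$ and
\[
\sup_n\mu_n(|L_n(S^{\beta\tilde\Phi/6,h_k}\circ\mathrm{per}_n)|^3) < \infty.
\]

Proof. We first note that $S^{\beta\tilde\Phi/6,h_k}$ is the $K$-transform of $g\colon\Gamma_2\to\mathbb{R}$, given by $g(\{x,x'\}) := h_k(x)h_k(x')e^{\beta\tilde\Phi(|x - x'|_2)/6}$, $x,x'\in\mathbb{R}^d$. $g(\{x,x'\})$ is equal to $h_k(x)h_k(x')$ when $|x - x'|\ge\lambda_1/2$. We choose a locally finite open cover $\mathcal{U}$ of $\mathbb{R}^d$ such that any $\Delta\in\mathcal{U}$ has diameter $<\lambda_1/2$ and we choose a corresponding partition of unity $(\eta_\Delta)_{\Delta\in\mathcal{U}}$ consisting of $C^1$-functions. Using this partition (and noting that $h_k$ has compact support) we see that we may replace $S^{\beta\tilde\Phi/6,h_k}$ by $Kg_{\Delta_1,\Delta_2}$, where $g_{\Delta_1,\Delta_2}\colon\Gamma_2\to\mathbb{R}$ is defined by
\[
g_{\Delta_1,\Delta_2}(\{x,x'\}) := e^{\frac{\beta}{6}\tilde\Phi(|x - x'|_2)}\big(\varphi_{\Delta_1}(x)\varphi_{\Delta_2}(x') + \varphi_{\Delta_1}(x')\varphi_{\Delta_2}(x)\big),\qquad x,x'\in\mathbb{R}^d, \tag{4.4}
\]
for $\Delta_1,\Delta_2\in\mathcal{U}$, where $\varphi_{\Delta_{1/2}} := \eta_{\Delta_{1/2}}h_k$.

We first consider the case where $\mathrm{dist}(\Delta_1,\Delta_2)\le\lambda_1/2$, where $\mathrm{dist}$ denotes the $|\cdot|$-distance of subsets of $\mathbb{R}^d$. We may assume that $\Delta := \Delta_1\cup\Delta_2$ is relatively compact in $(-\lambda_1,\lambda_1)^d$ and replace $\mathrm{per}_n$ by $\mathrm{sym}_n$ using spatial translations in $E_{\lambda_n}$ as in the proof of Lemma 4.7 above. By Lemma 4.5(ii) we have that $(Kg_{\Delta_1,\Delta_2}\circ\mathrm{sym}_n)\in D(L_n)$, $n\in\mathbb{N}$. For $1\le i\le N_n$ and $(x,v)\in E_{\lambda_n}^{N_n}$ we make the following estimate.
\[
\begin{aligned}
|\nabla_{x_i}(Kg_{\Delta_1,\Delta_2}\circ\mathrm{sym}_n)(x,v)|_2
&= \Big|\sum_{j\ne i}\nabla_{x_i}g_{\Delta_1,\Delta_2}(x_i,x_j)\Big|_2\\
&\le \sum_{j\ne i}\Big|e^{\frac{\beta}{6}\tilde\Phi(|x_i - x_j|_2)}\varphi_{\Delta_2}(x_j)\Big(\frac{\beta}{6}\tilde\Phi'(|x_i - x_j|_2)\frac{x_i - x_j}{|x_i - x_j|_2}\varphi_{\Delta_1}(x_i) + \nabla\varphi_{\Delta_1}(x_i)\Big)\Big|_2\\
&\quad + \sum_{j\ne i}\Big|e^{\frac{\beta}{6}\tilde\Phi(|x_i - x_j|_2)}\varphi_{\Delta_1}(x_j)\Big(\frac{\beta}{6}\tilde\Phi'(|x_i - x_j|_2)\frac{x_i - x_j}{|x_i - x_j|_2}\varphi_{\Delta_2}(x_i) + \nabla\varphi_{\Delta_2}(x_i)\Big)\Big|_2\\
&\le 2\sum_{j\ne i}e^{\frac{\beta}{6}\tilde\Phi(|x_i - x_j|_2)}1_\Delta(x_j)\Big(\frac{\beta}{6}|\tilde\Phi'(|x_i - x_j|_2)|\,1_\Delta(x_i) + C1_\Delta(x_i)\Big)
\end{aligned}
\]
for some $C < \infty$. Since $\tilde\Phi'e^{-\frac{\beta}{6}\tilde\Phi}$ is bounded and $\tilde\Phi\ge 0$ the right-hand side is estimated by
\[
C\sum_{j\ne i}e^{\frac{\beta}{3}\tilde\Phi(|x_i - x_j|_2)}1_\Delta(x_i)1_\Delta(x_j)
\]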
for some $C < \infty$ and thus
\[
|L_n(Kg_{\Delta_1,\Delta_2}\circ\mathrm{sym}_n)(x,v)| \le C\sum_{\{i,j\}\subset\{1,\dots,N_n\}}(|v_i|_1 + |v_j|_1)\,e^{\frac{\beta}{3}\tilde\Phi(|x_i - x_j|_2)}1_\Delta(x_i)1_\Delta(x_j).
\]
By Proposition 3.19 we only have to show that for $M\in\{2,\dots,6\}$, $A,B,C,D,E,F\in\{1,\dots,M\}$, $\{A,\dots,F\} = \{1,\dots,M\}$, $A\ne B$, $C\ne D$, $E\ne F$ the expression
\[
\int_{\Lambda_{\lambda_n}^M}c(x_A,x_B)\,c(x_C,x_D)\,c(x_E,x_F)\,e^{-\frac{2\beta}{M}\sum_{i<j}\hat\phi_{\lambda_n}(x_i - x_j)}\,dx_1\cdots dx_M
\]
is bounded independently of $n$, where $c(x,x') = e^{\frac{\beta}{3}\tilde\Phi(|x - x'|_2)}1_\Delta(x)1_\Delta(x')$, $x,x'\in\mathbb{R}^d$. But since the integrand is bounded by the properties of $\tilde\Phi$ and the uniform boundedness from below of $\hat\phi_{\lambda_n}$, $n\in\mathbb{N}$, the above integral is estimated by $D\max\{\mathrm{vol}(\Delta)^2,\mathrm{vol}(\Delta)^6\}$ for some $D < \infty$.

Now assume that $\mathrm{dist}(\Delta_1,\Delta_2) > \lambda_1/2$ in (4.4). In this case $\tilde\Phi(|x - x'|_2) = 0$ for $x\in\Delta_1$, $x'\in\Delta_2$, so we have $g_{\Delta_1,\Delta_2}(\{x,x'\}) = \varphi_{\Delta_1}(x)\varphi_{\Delta_2}(x')$. Using periodicity of the image configurations with respect to $\mathrm{per}_n$ and spatial shifts in $E_{\lambda_n}$ we may assume that $\Delta_1$ and $\Delta_2$ are relatively compact in $(-\lambda_1,\lambda_1)^d$, so $\mathrm{per}_n$ may be replaced by $\mathrm{sym}_n$ and the case $\mathrm{dist}(\Delta_1,\Delta_2) > \lambda_1/2$ is reduced to a (trivial) special case ($\tilde\Phi\equiv 0$) of the one we treated above.

Now we arrive at the concluding tightness estimate. The expectation with respect to $P^{(n)}$, $n\in\mathbb{N}$, we denote by $E^{(n)}$ and in the sequel we also use similar notations for expectations with respect to other probability laws.

Lemma 4.11. Let $\tilde\Phi$ be chosen as in Lemma 4.10 and $a$ be chosen as in Remark 4.9. For each $T > 0$ there is a constant $C > 0$ such that for $0\le s < t\le T < \infty$ it holds
\[
\sup_n E^{(n)}\big[(d^{\frac{\beta}{6}\tilde\Phi,h,a}(\gamma_t,\gamma_s))^3\big]\le C(t - s)^{3/2},
\]
when $f_k$, $g_k$, $r_k$ and $q_k$ in the definition of $d^{\frac{\beta}{6}\tilde\Phi,h,a}$ are chosen in a suitable way (see the proof below).

Proof. By the Minkowski inequality and since $\frac{r}{1+r}\le r$ for $r\ge 0$ it holds
\[
\begin{aligned}
\big(E^{(n)}\big[(d^{\frac{\beta}{6}\tilde\Phi,h,a}(\gamma_t,\gamma_s))^3\big]\big)^{1/3}
&\le \sum_{k=1}^\infty 2^{-k}\big(E^{(n)}[|Kf_k(\gamma_t) - Kf_k(\gamma_s)|^3]\big)^{1/3}\\
&\quad + \sum_{k=1}^\infty 2^{-k}\big(E^{(n)}[|Kg_k(\mathrm{pr}_x\gamma_t) - Kg_k(\mathrm{pr}_x\gamma_s)|^3]\big)^{1/3}\\
&\quad + \sum_{k=1}^\infty 2^{-k}q_k\big(E^{(n)}[|(K\chi_k)(\gamma_t) - (K\chi_k)(\gamma_s)|^3]\big)^{1/3}\\
&\quad + \sum_{k=1}^\infty 2^{-k}r_k\big(E^{(n)}[|S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{pr}_x(\gamma_t) - S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{pr}_x(\gamma_s)|^3]\big)^{1/3}.
\end{aligned}
\tag{4.5}
\]
Concerning the first three summands on the right-hand side we need to estimate
\[
E^{(n)}[|Kf(\gamma_t) - Kf(\gamma_s)|^3] \tag{4.6}
\]
for $f$ as in Lemma 4.7. It suffices to prove that this expression is bounded by $(t - s)^{3/2}C(f)D(T)$ where $C(f)$ is a constant depending only on $f$ and $D(T)$ depends only on $T$. Then by replacing $f_k$ by $\frac{f_k}{C(f_k)^{1/3}}$ and $g_k$ by $\frac{g_k}{C(g_k)^{1/3}}$ and setting $q_k := \min\{C(\chi_k)^{-1/3},1\}$, $k\in\mathbb{N}$, in the definition of the metric the first three summands in (4.5) are convergent and less than or equal to $(t - s)^{1/2}D(T)^{1/3}$.

So let $f$ be as in Lemma 4.7. It holds
\[
E^{(n)}[|Kf(\gamma_t) - Kf(\gamma_s)|^3] = E_n[|(Kf)\circ\mathrm{per}_n(X_t,V_t) - (Kf)\circ\mathrm{per}_n(X_s,V_s)|^3].
\]
It holds $Kf\circ\mathrm{per}_n\in D(L_n)$. Since $P_n$ solves the martingale problem for $L_n$ we find that
\[
M_t^{[Kf\circ\mathrm{per}_n],n} := Kf\circ\mathrm{per}_n(X_t,V_t) - Kf\circ\mathrm{per}_n(X_0,V_0) - \int_0^t L_n(Kf\circ\mathrm{per}_n)(X_r,V_r)\,dr,\qquad t\ge 0,
\]
defines a $P_n$-martingale. By [5, Remark 3.22] the quadratic variation process of $(M_t^{[Kf\circ\mathrm{per}_n],n})_{t\ge 0}$ is given by $\big(\int_0^t\frac{2\kappa}{\beta}|\nabla_v(Kf\circ\mathrm{per}_n)|_2^2(X_r,V_r)\,dr\big)_{t\ge 0}$. Using the Burkholder–Davis–Gundy inequality and the Hölder inequality, we find
\[
\begin{aligned}
E_n\big[|M_t^{[Kf\circ\mathrm{per}_n],n} - M_s^{[Kf\circ\mathrm{per}_n],n}|^3\big]
&\le E_n\Big[\Big(\int_s^t\frac{2\kappa}{\beta}|\nabla_v(Kf\circ\mathrm{per}_n)|_2^2(X_r,V_r)\,dr\Big)^{3/2}\Big]\\
&\le \Big(\frac{2\kappa}{\beta}\Big)^{3/2}(t - s)^{3/2}\mu_n(|\nabla_v(Kf\circ\mathrm{per}_n)|_2^3),
\end{aligned}
\]
which can be estimated using Corollary 4.8. Moreover, it holds by the Hölder inequality
\[
E_n\Big[\Big|\int_s^t L_n(Kf\circ\mathrm{per}_n)(X_r,V_r)\,dr\Big|^3\Big]\le(t - s)^3\mu_n(|L_n(Kf\circ\mathrm{per}_n)|^3)\le T^{3/2}(t - s)^{3/2}\mu_n(|L_n(Kf\circ\mathrm{per}_n)|^3).
\]
This can be estimated using Lemma 4.7. Altogether we have independently of $n$ an estimate of (4.6) by $(1 + T^{1/2})^3C(f)(t - s)^{3/2}$ for some constant $C(f)$, concluding the consideration of the first three summands in (4.5).
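The Hölder step for the bounded variation term can be spelled out as follows. This is a sketch that writes $g := L_n(Kf\circ\mathrm{per}_n)$ (a shorthand introduced here) and uses only that $\mu_n$ is the invariant initial distribution of $P_n$, as stated in Sec. 4.2. Hölder's inequality with exponents $3$ and $3/2$ gives pathwise
\[
\Big|\int_s^t g(X_r,V_r)\,dr\Big|^3 \le (t - s)^2\int_s^t|g(X_r,V_r)|^3\,dr .
\]
Taking expectations and using that $(X_r,V_r)$ has law $\mu_n$ under $P_n$ for every $r\ge 0$,
\[
E_n\Big[\Big|\int_s^t g(X_r,V_r)\,dr\Big|^3\Big] \le (t - s)^2\int_s^t\mu_n(|g|^3)\,dr = (t - s)^3\mu_n(|g|^3) \le T^{3/2}(t - s)^{3/2}\mu_n(|g|^3),
\]
where the last inequality uses $0\le t - s\le T$.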
Concerning the fourth summand we first note that (denoting $S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{pr}_x$ also by $S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}$)
\[
E^{(n)}\big[|S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}(\gamma_t) - S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}(\gamma_s)|^3\big]
= E_n\big[|(S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n)(X_t,V_t) - (S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n)(X_s,V_s)|^3\big].
\]
$S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n$ is an element of $D(L_n)$ (cf. Lemma 4.10) such that for the corresponding martingale it holds $M_t^{[S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n],n} = 0$ $P_n$-a.s. for all $t\ge 0$. So
\[
\begin{aligned}
E_n\big[|(S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n)(X_t,V_t) - (S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n)(X_s,V_s)|^3\big]
&= E_n\Big[\Big|\int_s^t\big(L_n(S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n)\big)(X_r,V_r)\,dr\Big|^3\Big]\\
&\le (t - s)^3\mu_n(|L_n(S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n)|^3)\\
&\le T^{3/2}(t - s)^{3/2}\mu_n(|L_n(S^{\frac{\beta}{6}\tilde\Phi,h_{I_k}}\circ\mathrm{per}_n)|^3),
\end{aligned}
\]
which can be estimated with the help of Lemma 4.10 by $T^{3/2}R_k(t - s)^{3/2}$ for some $R_k\in\mathbb{R}_+$. Setting $r_k := \min\{R_k^{-1/3},1\}$ in the definition of $d^{\frac{\beta}{6}\tilde\Phi,a,h}$ we have an estimate for the fourth summand in (4.5). This completes the proof.

Remark 4.12. In [14], the Lyons–Zheng decomposition was used in order to obtain the estimate corresponding to the above lemma. At first sight, using such a decomposition seems to be a significant simplification of the proof given above, since one avoids having to estimate the bounded variation terms. Therefore, we should mention that this is not possible here, since we are in a non-reversible situation and the method of proving tightness by a forward/backward martingale decomposition depends heavily on reversibility of the processes (cf. [14, Proof of Lemma 5.3]).

We obtain the desired tightness result.

Theorem 4.13. Let $\phi$ be a symmetric pair interaction fulfilling (RP), (T), (BB), (WD) and (IDF). Let $(N_n)_n\subset\mathbb{N}$ and $(\lambda_n)_n\subset\mathbb{R}_+$ be sequences such that $\lambda_n\uparrow\infty$ and $\frac{N_n}{(2\lambda_n)^d}\to\rho\in[0,\infty)$ as $n\to\infty$. Let $P^{(n)} := P_n\circ(\mathrm{per}_n^{\otimes[0,\infty)})^{-1}$, $n\in\mathbb{N}$, be defined as in Sec. 4.2. Then the sequence $(P^{(n)})_{n\in\mathbb{N}}$ is a tight sequence of probability laws on $C([0,\infty),\Gamma^v)$.
Proof. Using standard tightness results (cf. [9, Theorems 3.8.6 and 3.8.8]) we obtain tightness of $(P^{(n)})_{n\in\mathbb{N}}$ as probability laws on $D([0,\infty),\Gamma^v)$, the space of càdlàg paths in $\Gamma^v$. By [9, Exercise 3.25(c)] we find that any weak accumulation point of $(P^{(n)})_{n\in\mathbb{N}}$ assigns full measure to the space $C([0,\infty),\Gamma^v)$ of continuous paths, hence by [9, Exercise 3.25(d)] the assertion follows.
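For intuition about the approximating $N$-particle dynamics whose laws $P^{(n)}$ were just shown to be tight, the system with generator (4.1) can be simulated numerically. The sketch below is not the construction used in this article: it is a plain Euler–Maruyama discretization in $d = 1$, with a bounded toy potential $\phi(x) = e^{-x^2}$ chosen here only for numerical stability (it is not claimed to satisfy (RP)), and with the periodization $\hat\phi_\lambda$ truncated to finitely many images. All function and parameter names are hypothetical.

```python
import math
import random

def wrap(x: float, lam: float) -> float:
    """Map x to its representative in (-lam, lam] (torus of length 2*lam)."""
    return x - 2.0 * lam * math.ceil((x - lam) / (2.0 * lam))

def periodic_force(x: float, lam: float, images: int = 3) -> float:
    """Minus the derivative of sum_r phi(x + 2*lam*r) for the toy potential
    phi(x) = exp(-x^2), with the sum truncated to |r| <= images."""
    total = 0.0
    for r in range(-images, images + 1):
        y = x + 2.0 * lam * r
        total += 2.0 * y * math.exp(-y * y)  # -phi'(y) = 2 y exp(-y^2)
    return total

def langevin_step(xs, vs, lam, kappa, beta, dt, rng):
    """One Euler-Maruyama step for dx = v dt,
    dv = -kappa v dt - sum_{j != i} grad phi_hat(x_i - x_j) dt + sqrt(2 kappa / beta) dw."""
    n = len(xs)
    forces = [0.0] * n
    for i in range(n):
        for j in range(n):
            if j != i:
                forces[i] += periodic_force(xs[i] - xs[j], lam)
    noise = math.sqrt(2.0 * kappa / beta * dt)
    new_vs = [v - kappa * v * dt + f * dt + noise * rng.gauss(0.0, 1.0)
              for v, f in zip(vs, forces)]
    new_xs = [wrap(x + v * dt, lam) for x, v in zip(xs, vs)]
    return new_xs, new_vs

def simulate(n_particles=5, lam=2.0, kappa=1.0, beta=1.0, dt=1e-3, steps=1000, seed=0):
    """Run the discretized dynamics from a random initial state (not the Gibbs state)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-lam, lam) for _ in range(n_particles)]
    vs = [rng.gauss(0.0, 1.0 / math.sqrt(beta)) for _ in range(n_particles)]
    for _ in range(steps):
        xs, vs = langevin_step(xs, vs, lam, kappa, beta, dt, rng)
    return xs, vs
```

The periodic image configuration $\mathrm{per}_n(x,v)$ is obtained from the returned positions by adding all translates $2\lambda r$, $r\in\mathbb{Z}$. With $\kappa = 0$ the friction and noise terms vanish and the total momentum is conserved up to rounding, which serves as a simple consistency check of the pair forces.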
4.4. The martingale problem

By now we know that in the situation of Theorem 4.13 we have at least one accumulation point $P$ of $(P^{(n)})_n$. Let $P^{(n_k)}\to P$ weakly as $k\to\infty$. Note that then also the sequence $\mu^{(n_k)}$ converges weakly and its weak limit $\mu$ is the invariant initial distribution of $P$. Moreover, also $\mu_{n_k}\circ\mathrm{sym}_{n_k}^{-1}\to\mu$ weakly as $k\to\infty$. In this section we verify that $P$ is the law of an infinite particle Langevin dynamics. We first prove some preliminary technical properties of the generator $L$ (cf. (1.3)). We do not bother about the possibility of generalizing assertions in Lemma 4.14 below to all $n\in\mathbb{N}$, since in this section we are only interested in asymptotic properties of the sequences $(P^{(n)})_{n\in\mathbb{N}}$, $(\mu^{(n)})_{n\in\mathbb{N}}$.

Lemma 4.14. Let $F = g_F(\{\langle f_i,\cdot\rangle\}_{i=1}^K)\in\mathcal{F}C_b^\infty(D_s,\Gamma^v)$. Choose $n_0\in\mathbb{N}$ such that each $f_i$, $i = 1,\dots,K$, has support in $(-\lambda_{n_0},\lambda_{n_0})^d\times\mathbb{R}^d$.

(i) Let $n\in\mathbb{N}$. The expression $LF(\gamma)$ is well-defined for $\mu_n\circ\mathrm{sym}_n^{-1}$-a.e. $\gamma\in\Gamma^v$, i.e. the sums in the definition of $LF$ converge absolutely $\mu_n\circ\mathrm{sym}_n^{-1}$-a.s. and $LF$ is $\mu_n\circ\mathrm{sym}_n^{-1}$-a.s. independent of the version one chooses for $\nabla\phi$. Moreover, it holds $\sup_{n\in\mathbb{N}}\|LF\|_{L^1(\Gamma^v;\mu_n\circ\mathrm{sym}_n^{-1})} < \infty$.

(ii) Let $n\ge n_0$. Then $LF$ is well-defined $\mu^{(n)}$-a.e. and $\sup_{n\ge n_0}\|LF\|_{L^1(\Gamma^v;\mu^{(n)})} < \infty$ with the restriction that the versions for $\nabla\phi$ have to be chosen in a way such that $\sum_{r\in\mathbb{Z}^d}\nabla\phi(2\lambda_n r) = 0$ for all $n\in\mathbb{N}$.

(iii) With the restriction from (ii), for any $n\ge n_0$, $t\ge 0$ the integral $\int_0^t LF(\gamma_r)\,dr$ is well-defined $P^{(n)}$-a.s.

(iv) If $P$ is the weak limit of a sequence $(P^{(n)})_{n\in\mathbb{N}}$ as above, then (i) holds with $\mu^{(n)}$ replaced by $\mu$ and (iii) holds with $P^{(n)}$ replaced by $P$.

Remark 4.15. The restriction in Lemma 4.14(ii) means that the forces acting between a particle and its periodic copies sum up to $0$, which is a quite natural assumption. Note that it is not an additional assumption on $\phi$. It is only introduced for technical reasons (cf. (4.8) below) and it can be dropped when only considering the limiting process, as one sees in (the proof of) Lemma 4.14(iv).

Proof. $LF$ consists (except for multiplication by bounded continuous partial derivatives of $g_F$) of two types of sums. The first (e.g., $\langle\Delta_v f_i,\cdot\rangle$) are $K$-transforms of functions in $D_s$. By Lemma 3.18 and the uniform Ruelle bound, the assertion is easily shown for these sums. We concentrate on the second type of sums, which are $K$-transforms of functions $g\colon\Gamma_2^v\to\mathbb{R}$ of the form $g(\{(x,v),(x',v')\}) = \nabla\phi(x - x')(\nabla_v f(x,v) - \nabla_v f(x',v'))$, $(x,v),(x',v')\in\mathbb{R}^d\times\mathbb{R}^d$, with $f\in D_s$. Let us prove that $g\in L^1(\Gamma_2^v;\bar h\,d\lambda^v)$ with $\bar h$ as in Remark 3.26 (here we identify $\bar h$ and $\bar h\circ\mathrm{pr}_x$). It holds
\[
\frac{1}{(2\pi/\beta)^d}\int_{\mathbb{R}^{4d}}|g(\{(x,v),(x',v')\})|\,e^{-(\beta/2)(v^2 + v'^2)}e^{-\beta\underline{\phi}(x - x')}\,dx\,dx'\,dv\,dv'
\le 2\,\||\nabla_v f|\|_\infty\,\mathrm{vol}(\mathrm{supp}_s(f))\,\|\nabla\phi\|_{L^1(\mathbb{R}^d;e^{-\beta\underline{\phi}}dx)}. \tag{4.7}
\]
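The proof of (i) and of the first assertion of (iv) are now completed by Lemma 3.18, the improved Ruelle bound, (WD) and the properties of $\underline{\phi}$ (cf. Remark 3.26). (Remember that the improved Ruelle bound also holds for the limiting measure $\mu$ by Lemma 3.23.)

Let us now prove (ii). Let $n\ge n_0$. Due to the condition (IDF) and the boundedness of $\sum_{0\ne r\in\mathbb{Z}^d}\theta(|\cdot + 2\lambda_n r|)$ in $\Lambda_{\lambda_n}$ (cf. the proof of Lemma 4.2(ii)), hence in $\Lambda_{2\lambda_n}$, the sum $\sum_{r\in\mathbb{Z}^d}\nabla\phi(\cdot + 2\lambda_n r)$ defining $\nabla\hat\phi_{\lambda_n}$ (as element of $L^1(\Lambda_{2\lambda_n};e^{-\beta\hat\phi_{\lambda_n}}dx)$) converges Lebesgue-a.e. Any version of $\nabla\phi$ uniquely determines by this sum a corresponding version of $\nabla\hat\phi_{\lambda_n}$, when we set $\nabla\hat\phi_{\lambda_n}$ equal to $0$ where it does not converge. Fixing a version of $\nabla\phi$ we define for $\gamma\in\Gamma^v$
\[
\begin{aligned}
L^{(n)}F(\gamma) :={}& \sum_{l,l'=1}^K\frac{\kappa}{\beta}\,\partial_l\partial_{l'}g_F(\{\langle f_i,\gamma\rangle\}_{i=1}^K)\,\langle(\nabla_v f_l)(\nabla_v f_{l'}),\gamma\rangle\\
&+ \sum_{l=1}^K\partial_l g_F(\{\langle f_i,\gamma\rangle\}_{i=1}^K)\Big(\Big\langle\frac{\kappa}{\beta}\Delta_v f_l - \kappa v\nabla_v f_l + v\nabla_x f_l,\gamma\Big\rangle\\
&\qquad - \sum_{\{(x,v),(x',v')\}\subset\gamma}\nabla\hat\phi_{\lambda_n}(x - x')\big(\nabla_v f_l(x,v) - \nabla_v f_l(x',v')\big)\Big).
\end{aligned}
\]
It can now be shown that (i) holds with $L$ replaced by $L^{(n)}$. Then, since in particular this means that the sums defining $L^{(n)}F$ are absolutely convergent, we may change the order of summation. Using a version of $\nabla\phi$ as specified in (ii), we obtain that
\[
L^{(n)}F\circ\mathrm{sym}_n = LF\circ\mathrm{per}_n\quad\mu_n\text{-a.s.} \tag{4.8}
\]
Hence (ii) follows. (iii) follows from Fubini's theorem, (ii) and the fact that
\[
E^{(n)}\Big[\int_0^t|LF(\gamma_r)|\,dr\Big]\le t\,\|LF\|_{L^1(\Gamma^v;\mu^{(n)})}.
\]
The second assertion of (iv) follows in the same manner from the first assertion.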
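One way to see that the restriction in Lemma 4.14(ii) costs nothing, as noted in Remark 4.15, is the following sketch. The particular version chosen here is only an illustration and not necessarily the one intended in the text. The set
\[
Z := \{2\lambda_n r \mid n\in\mathbb{N},\ r\in\mathbb{Z}^d\}\subset\mathbb{R}^d
\]
is countable and hence a Lebesgue null set. Since the weak gradient $\nabla\phi$ is determined only up to null sets, for any version $w$ of $\nabla\phi$ the function
\[
\tilde w := w\,1_{\mathbb{R}^d\setminus Z}
\]
is again a version of $\nabla\phi$, and it trivially satisfies $\sum_{r\in\mathbb{Z}^d}\tilde w(2\lambda_n r) = 0$ for all $n\in\mathbb{N}$.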
From the above proof and from Lemma 3.23 we can conclude the following uniform approximation result.

Lemma 4.16. Let $F$ be as in Lemma 4.14. Then there exists a sequence $(H_l)_{l\in\mathbb{N}}$ of bounded continuous cylinder functions $H_l\colon\Gamma^v\to\mathbb{R}$ (cf. Sec. 3.4) such that
\[
\lim_{l\to\infty}\limsup_{k\to\infty}\|H_l - LF\|_{L^1(\Gamma^v;\mu^{(n_k)})} = 0
\]
and $H_l\to LF$ as $l\to\infty$ in $L^1(\Gamma^v;\mu)$.

Proof. When $\mu^{(n_k)}$ is replaced by $\mu_{n_k}\circ\mathrm{sym}_{n_k}^{-1}$, the assertion is a consequence of Lemma 3.23(ii) (cf. Remark 3.26) and the proof of Lemma 4.14 (in particular (4.7)). Let $(H_l)_{l\in\mathbb{N}}$ be a corresponding sequence of bounded continuous cylinder functions.
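Fix $l\in\mathbb{N}$ and let $n_0\in\mathbb{N}$ be as in the proof of Lemma 4.14. Choose $k_0$ large enough such that $n_{k_0}\ge n_0$ and such that $H_l$ depends only on the configuration in $\Lambda_{\lambda_{n_k}}$ for $k\ge k_0$. By (4.8) it holds for $k\ge k_0$
\[
\begin{aligned}
\|H_l - LF\|_{L^1(\Gamma^v;\mu^{(n_k)})} &= \|H_l - L^{(n_k)}F\|_{L^1(\Gamma^v;\mu_{n_k}\circ\mathrm{sym}_{n_k}^{-1})}\\
&\le \|H_l - LF\|_{L^1(\Gamma^v;\mu_{n_k}\circ\mathrm{sym}_{n_k}^{-1})} + \|LF - L^{(n_k)}F\|_{L^1(\Gamma^v;\mu_{n_k}\circ\mathrm{sym}_{n_k}^{-1})}.
\end{aligned}
\]
So, we are left to prove that $\lim_{k\to\infty}\|LF - L^{(n_k)}F\|_{L^1(\Gamma^v;\mu_{n_k}\circ\mathrm{sym}_{n_k}^{-1})} = 0$. This reduces to proving that $\lim_{n\to\infty}\|Kg_n\|_{L^1(\Gamma^v;\mu_n\circ\mathrm{sym}_n^{-1})} = 0$ for $g_n\colon\Gamma_2^v\to\mathbb{R}$ of the form
\[
g_n(\{(x,v),(x',v')\}) := (\nabla\phi - \nabla\hat\phi_{\lambda_n})(x - x')\big(\nabla_v f(x,v) - \nabla_v f(x',v')\big),\qquad (x,v),(x',v')\in E_{\lambda_n}^2,
\]
with $f\in D_s$, $\mathrm{supp}(f)\subset(-\delta,\delta)^d$ for some $0 < \delta < \lambda_{n_0}$. Applying Lemma 3.18 and the improved Ruelle bound we find that it suffices to verify that with $\underline{\phi}$ as in Remark 3.26 it holds
\[
\lim_{n\to\infty}\|\nabla\phi - \nabla\hat\phi_{\lambda_n}\|_{L^1(\Lambda_{\lambda_n+\delta};e^{-\beta\underline{\phi}}dx)} = 0. \tag{4.9}
\]
For $n\ge n_0$ it holds
\[
\begin{aligned}
\|\nabla\phi - \nabla\hat\phi_{\lambda_n}\|_{L^1(\Lambda_{\lambda_n+\delta};e^{-\beta\underline{\phi}}dx)}
&\le \|\nabla\phi - \nabla\hat\phi_{\lambda_n}\|_{L^1(\Lambda_{\lambda_n};e^{-\beta\underline{\phi}}dx)} + \|\nabla\phi\|_{L^1(\Lambda_{\lambda_n+\delta}\setminus\Lambda_{\lambda_n};e^{-\beta\underline{\phi}}dx)}\\
&\quad + \|\nabla\hat\phi_{\lambda_n}\|_{L^1(\Lambda_{\lambda_n+\delta}\setminus\Lambda_{\lambda_n};e^{-\beta\underline{\phi}}dx)}\\
&\le 2D\,\|\nabla\phi\|_{L^1(\mathbb{R}^d\setminus\Lambda_{\lambda_n};dx)} + 2^dD\,\|\nabla\phi\|_{L^1(\mathbb{R}^d\setminus\Lambda_{\lambda_n-\delta};dx)},
\end{aligned}
\]
where $D := \sup_{x\in\mathbb{R}^d}e^{-\beta\underline{\phi}(x)}$. (4.9) follows, since $\nabla\phi\in L^1(\mathbb{R}^d;e^{-\beta\phi}dx)$ implies $\nabla\phi\in L^1(\{|x| > a\};dx)$ for any $a > 0$ due to boundedness of $\phi$ in $\{|x| > a\}$.

The laws $P^{(n)}$, $n\in\mathbb{N}$, behave nicely with respect to the operator $L$ restricted to functions $F = g_F(\langle f_1,\cdot\rangle,\dots,\langle f_K,\cdot\rangle)\in\mathcal{F}C_b^\infty(D_s,\Gamma^v)$ depending only on the particles in $\Lambda_{\lambda_n}$: We find that $M_t^{[F]} := F(\gamma_t) - F(\gamma_0) - \int_0^t LF(\gamma_r)\,dr$, $t\ge 0$, defines a martingale with respect to $P^{(n)}$. To see this, first note that for such $F$ we have by (4.8) that $LF\circ\mathrm{per}_n = L_n(F\circ\mathrm{sym}_n)$ holds $\mu_n$-a.s. We obtain for any $0\le s\le t$ and any bounded function $G\colon C([0,\infty),\Gamma^v)\to\mathbb{R}$ which is $\sigma(\gamma_r\colon 0\le r\le s)$-measurable
\[
\begin{aligned}
&E^{(n)}\Big[G\Big(F(\gamma_t) - F(\gamma_s) - \int_s^t LF(\gamma_r)\,dr\Big)\Big]\\
&\quad = E_n\Big[G\circ\mathrm{per}_n^{\otimes[0,\infty)}\Big(F\circ\mathrm{sym}_n(X_t,V_t) - F\circ\mathrm{sym}_n(X_s,V_s) - \int_s^t L_n(F\circ\mathrm{sym}_n)(X_r,V_r)\,dr\Big)\Big],
\end{aligned}
\]
which is equal to $0$, since $P_n$ solves the martingale problem for $(L_n,D(L_n))$.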
We arrive at the result completing the construction.

Theorem 4.17. Let the assumptions be as in Theorem 4.13. Let $P$ be an accumulation point of the sequence $(P^{(n)})_{n\in\mathbb{N}}$. Then $P$ solves the martingale problem for $(L,\mathcal{F}C_b^\infty(D_s,\Gamma^v))$.

Proof. Let $P^{(n_k)}\to P$ weakly as $k\to\infty$ and denote the weak limit of $(\mu^{(n_k)})_{k\in\mathbb{N}}$ by $\mu$. Let $F = g_F(\{\langle f_i,\cdot\rangle\}_{i=1}^K)\in\mathcal{F}C_b^\infty(D_s,\Gamma^v)$. Then there exists $k_0\in\mathbb{N}$ such that $(-\lambda_{n_k},\lambda_{n_k})^d$ contains the support of all $f_i$, $i = 1,\dots,K$, for all $k\ge k_0$. We have to prove that for any $0\le s < t < \infty$ and for any $\sigma(\gamma_r\colon 0\le r\le s)$-measurable $G\colon C([0,\infty),\Gamma^v)\to\mathbb{R}$ being bounded and continuous it holds
\[
E[(M_t^{[F]} - M_s^{[F]})G] = 0,
\]
where $(M_t^{[F]})_{t\ge 0}$ is as defined above. We already know that $(M_t^{[F]})_{t\ge 0}$ is a martingale with respect to $P^{(n_k)}$, $k\ge k_0$. Hence we are left to prove that $E^{(n_k)}[GM_r^{[F]}]\to E[GM_r^{[F]}]$ as $k\to\infty$ for $r\in\{t,s\}$. As $G$ and $F$ are continuous and bounded, this reduces to proving
\[
E^{(n_k)}\Big[G\int_0^r LF(\gamma_{r'})\,dr'\Big]\to E\Big[G\int_0^r LF(\gamma_{r'})\,dr'\Big]\quad\text{as } k\to\infty\text{ for each } r\in\{t,s\}. \tag{4.10}
\]
Choosing $(H_l)_{l\in\mathbb{N}}$ according to Lemma 4.16 we find that for any $r\ge 0$, $l\in\mathbb{N}$ and $k\ge k_0$ it holds
\[
\begin{aligned}
|E^{(n_k)}[GLF(\gamma_r)] - E[GLF(\gamma_r)]|
&\le \|G\|_\infty\big(\|LF - H_l\|_{L^1(\Gamma^v;\mu^{(n_k)})} + \|LF - H_l\|_{L^1(\Gamma^v;\mu)}\big)\\
&\quad + |E^{(n_k)}[GH_l(\gamma_r)] - E[GH_l(\gamma_r)]|,
\end{aligned}
\]
where we used the fact that the one-dimensional distributions of $P^{(n_k)}$, $P$ are given by $\mu^{(n_k)}$, $\mu$, respectively. Hence by continuity and boundedness of $G$ and $H_l$ and by weak convergence of $P^{(n_k)}$ towards $P$
\[
\limsup_{k\to\infty}|E^{(n_k)}[GLF(\gamma_r)] - E[GLF(\gamma_r)]|
\le \|G\|_\infty\Big(\limsup_{k\to\infty}\|LF - H_l\|_{L^1(\Gamma^v;\mu^{(n_k)})} + \|LF - H_l\|_{L^1(\Gamma^v;\mu)}\Big),
\]
which by Lemma 4.16 can be made arbitrarily small by choosing $l$ large. $E^{(n)}[GLF(\gamma_r)]$ is bounded uniformly in $r\in[0,\infty)$ by $\|G\|_\infty\|LF\|_{L^1(\Gamma^v;\mu^{(n)})}$ which is bounded uniformly in $n\ge n_{k_0}$ due to Lemma 4.14(ii). Hence (4.10) follows by Lebesgue's dominated convergence theorem.

Remark 4.18. The results from Sec. 3.4 which we used to prove Theorem 4.17 also generalize some of the results in [14]. In the differentiability assumption (D)
given there we replace continuous differentiability of the potential $\phi$ on $\mathbb{R}^d\setminus\{0\}$ by weak differentiability and continuity. (Moreover, the assumption on $\Phi$ in (D) can be dropped, cf. Lemma 3.1 above.) Then the existence of the approximating dynamics (cf. [14, Theorem 4.1] and [10]) is still ensured. Moreover, the tightness result [14, Theorem 5.1] does not depend on continuous differentiability of $\phi$. We do not consider the question whether the results of [14, Sec. 5.2] are still valid, but focus on the martingale problem [14, Theorem 5.10]. Since the invariant initial distributions $\mu^{(N)}$, $N \in \mathbb{N}$, of the approximating dynamics $P^{(N)}$ in [14] fulfill a uniform improved Ruelle bound, this bound extends to the invariant distribution $\mu$ of any weak accumulation point $P$ by Lemma 3.23(i). This can be used to prove well-definedness of the expressions in [14, (5.19) and (5.20)] a.s. with respect to the $\mu^{(N)}$, $\mu$, respectively $P^{(N)}$, $P$, similarly to Lemma 4.14(i), (iii) and (iv). Moreover, we also obtain an approximation of elements $H_\mu F$, $F \in \mathcal{F}C_b^\infty(D,\Gamma)$, of the range of the generator $(H_\mu, \mathcal{F}C_b^\infty(D,\Gamma))$ (cf. [14, (5.19)]) by bounded continuous cylinder functions in $L^1(\Gamma;\mu^{(N)})$ uniformly in $N$ and in $L^1(\Gamma;\mu)$ (using Lemma 3.23(ii)). This replaces the only argument in the proof of [14, Theorem 5.10] making use of the continuity of the derivatives of $\phi$ (cf. [14, p. 150]). Finally, note that also [14, Theorem 6.1] is valid in the new setting.

4.5. A conditional result on convergence

Here we discuss what is gained when making an essential m-dissipativity assumption for the generator $L$. Let $\mu_n$, $P^{(n)}$, $L_n$ etc., $n \in \mathbb{N}$, be as above. Denote by $(T_t^n)_{t\ge0}$ the sub-Markovian strongly continuous contraction semigroup in $L^1(E_{\lambda_n}^{N_n};\mu_n)$ generated by $(L_n, D(L_n))$. Let $\mu$ be a weak accumulation point of the sequence $(\mu_n\circ\mathrm{sym}_n^{-1})_{n\in\mathbb{N}}$ and let $\mu_{n_k}\circ\mathrm{sym}_{n_k}^{-1} \to \mu$ weakly as $k \to \infty$. For any bounded continuous cylinder function $F\colon \Gamma^v \to \mathbb{R}$ we have
\[ \|F\circ\mathrm{sym}_{n_k}\|_{L^1(E_{\lambda_{n_k}}^{N_{n_k}};\mu_{n_k})} \to \|F\|_{L^1(\Gamma^v;\mu)} \quad\text{as } k \to \infty. \]
Hence, denoting the set of bounded continuous cylinder functions by $C$, we have convergence of $X_k := L^1(E_{\lambda_{n_k}}^{N_{n_k}},\mu_{n_k}) \to L^1(\Gamma^v;\mu) =: X$ in the sense of [21, 20, 32] (which are formulated for Hilbert spaces, but at least the basic definitions also extend to Banach spaces). As defined in these articles, a sequence $(F_k)_{k\in\mathbb{N}}$ with $F_k \in X_k$, $k \in \mathbb{N}$, is said to converge to $F \in X$, if for one (hence all) $(U_m)_{m\in\mathbb{N}} \subset C$ such that $U_m \to F$ in $X$ as $m \to \infty$ it holds $\lim_{m\to\infty}\limsup_{k\to\infty}\|U_m\circ\mathrm{sym}_{n_k} - F_k\|_{X_k} = 0$. In particular, for any $F \in \mathcal{F}C_b^\infty(D_s,\Gamma^v)$, we have that $F_k := F\circ\mathrm{sym}_{n_k}$ converges to $F$ in this sense. Moreover, since for sufficiently large $k$ we have $F_k \in D(L_{n_k})$ and $L_{n_k}F_k = L^{(n_k)}F\circ\mathrm{sym}_{n_k}$, it follows (after replacing finitely many $F_k$ by 0) that $L_{n_k}F_k$ converges to $LF$ as $k \to \infty$ by (the proof of) Lemma 4.16 above. Thus, we have generator convergence, and when one assumes that $(1-L)\mathcal{F}C_b^\infty(D_s,\Gamma^v)$ is dense in $L^1(\Gamma^v;\mu)$, Trotter's theorem should provide convergence of the corresponding semigroups to a limiting semigroup generated by the closure of $L$. We make this precise: Quite similar to [32, Theorem 2.10] or [20,
Proposition 7.2] (using the existence of a Schauder basis in $L^1(\Gamma^v;\mu)$ or only the bounded approximation property) one can show that there exists a uniformly bounded sequence of continuous linear operators $P_n\colon X \to X_n$ such that $\|P_nF\|_{X_n} \to \|F\|_X$ as $n \to \infty$ for all $F \in X$ and such that $F_n \in X_n$, $n \in \mathbb{N}$, converge to $F \in X$ as $n \to \infty$ in the above sense iff $\|P_nF - F_n\|_{X_n} \to 0$. (There is a bit of work to do, and we refer to [6] for a complete proof.) This is precisely the setting from [33, 23], and thus the Trotter convergence theorem from these papers can be applied. More precisely, essentially [23, Theorem 2.1] implies the following result.

Theorem 4.19. Let $(\mu^{(n_k)})_{k\in\mathbb{N}}$ be weakly convergent to $\mu$. Then $(L, \mathcal{F}C_b^\infty(D_s,\Gamma^v))$ is dissipative. If it is even essentially m-dissipative in $L^1(\Gamma^v;\mu)$, i.e. $(1-L)\mathcal{F}C_b^\infty(D_s,\Gamma^v)$ is dense in $L^1(\Gamma^v;\mu)$, then there exists a strongly continuous sub-Markovian semigroup of contractions $(T_t)_{t\ge0}$ on $L^1(\Gamma^v;\mu)$ generated by the closure of $(L, \mathcal{F}C_b^\infty(D_s,\Gamma^v))$ such that $T_t^n \to T_t$ strongly as $n \to \infty$, i.e. for all $F \in C$ the sequence $(T_t^n(F\circ\mathrm{sym}_n))_{n\in\mathbb{N}}$ converges to $T_tF$ in the above-mentioned sense. In particular, the sequence $(P^{(n_k)})_{k\in\mathbb{N}}$ is convergent and its limiting law $P$ solves the martingale problem even for the closure of $L$ and has the elementary Markov property.

Proof. For any $F \in \mathcal{F}C_b^\infty(D_s,\Gamma^v)$ (by the proof of Lemma 4.16) it holds $\mu(LF) = \lim_{k\to\infty}\mu_{n_k}(L_{n_k}(F\circ\mathrm{sym}_{n_k})) = 0$, since $\mu_{n_k}$ is invariant for $L_{n_k}$. Hence $\mu$ is invariant for $L$, and it follows that $L$ is dissipative. If $L$ is even essentially m-dissipative (i.e. its closure generates a contraction semigroup) in $L^1(\Gamma^v;\mu)$, it follows that the semigroup $(T_t)_{t\ge0}$ generated by its closure is sub-Markovian (see [8, Appendix B]). The semigroup convergence follows from [23, Theorem 2.1]. From the convergence of semigroups the convergence of finite dimensional distributions follows. This implies that all weak accumulation points of $(P^{(n_k)})_{k\in\mathbb{N}}$ necessarily coincide, thus $(P^{(n_k)})_{k\in\mathbb{N}}$ converges weakly to some law $P$ having finite dimensional distributions given by the semigroup $(T_t)_{t\ge0}$. This implies the Markov property and also the assertion on the martingale problem, see e.g. [2, Proof of Proposition 1.4] or [5, Appendix A].

5. The Initial Distribution

Consider the situation of Theorems 4.13 and 4.17. We now focus on the initial distribution $\mu$ of the process constructed there. We already saw in Remark 3.25 that it is an accumulation point of the sequence $(\mu_n\circ\mathrm{sym}_n^{-1})_{n\in\mathbb{N}}$ or, equivalently, of the sequence $(\mu_n\circ\mathrm{per}_n^{-1})_{n\in\mathbb{N}}$. Our aim is to prove that $\mu$ is a tempered grand canonical Gibbs measure (cf. [13, p. 1348] for the definition). To achieve this, we adapt considerations from [13] on the equivalence of the microcanonical and the grand canonical ensemble and extend some results obtained there to the canonical ensemble. We use results and notations from [15, 12, 13]. As in [13] we restrict to the case where $\lambda_n = n + \frac{1}{2}$, $n \in \mathbb{N}$.
Let $\mathcal{P}_\theta$ be the space of probability measures $P$ on $\Gamma^v$ which are invariant with respect to spatial translations $\vartheta_a(\gamma) := \gamma - (a,0)$, $\gamma \in \Gamma^v$, $a \in \mathbb{R}^d$, and which have finite density and kinetic energy density, i.e. $\int_{\Gamma^v}\sum_{(x,v)\in\gamma\cap C}(1+|v|^2)\,dP(\gamma) < \infty$, where $C := [0,1]^d\times\mathbb{R}^d$. The set of tame cylinder functions $F$, i.e. functions $F\colon \Gamma^v \to \mathbb{R}$ such that there exist $\Lambda \subset \mathbb{R}^d$ and $C < \infty$ with $F(\gamma) = F(\gamma\cap(\Lambda\times\mathbb{R}^d))$ and $|F(\gamma)| \le C + C\sum_{(x,v)\in\gamma\cap(\Lambda\times\mathbb{R}^d)}(1+|v|)$, $\gamma \in \Gamma^v$, is denoted by $\mathcal{L}$. On $\mathcal{P}_\theta$ the topology $\tau_{\mathcal{L}}$ is defined as the weakest topology such that all mappings $P \mapsto P(F) = \int_{\Gamma^v}F\,dP$, $F \in \mathcal{L}$, are continuous. This topology is finer than the weak topology on the space of probability measures on $\Gamma^v$. We now state the result which is shown in the course of this section.

Theorem 5.1. Let a symmetric measurable function $\phi\colon \mathbb{R}^d \to \mathbb{R}\cup\{\infty\}$ fulfill (RP), (T), (BB) as given in Sec. 3.1. Let $\lambda_n := n + \frac{1}{2}$, $n \in \mathbb{N}$, and $(N_n)_{n\in\mathbb{N}} \subset \mathbb{N}$ be such that $\frac{N_n}{(2\lambda_n)^d} \to \rho \in (0,\infty)$ as $n \to \infty$. Define $\mu^{[n]} := \mu_n\circ\mathrm{sym}_n^{-1}$ with $\mu_n := \mu_{\lambda_n,N_n}$ and $\mathrm{sym}_n := \mathrm{sym}_{\lambda_n,N_n}$ as defined in Sec. 4.2. Then $(\mu^{[n]})_{n\in\mathbb{N}}$ is relatively compact with respect to $\tau_{\mathcal{L}}$ and any accumulation point is a grand canonical Gibbs measure.

Remark 5.2. The conditions (RP), (T), (BB) imply conditions (A1) and the non-hard-core version of (A2) from [13]. The proof given below works for the latter conditions. We exclude the case of hard-core potentials for convenience and since it is not treated in the preceding sections of this article.

The proof of the above theorem mainly follows the lines of arguments in [13], in particular the beginning of [13, Sec. 6]. However, there are some modifications to be made which can only be explained in the presence of some details. Note that here we only deal with the case of periodic boundary condition (this simplifies the considerations). We introduce some more notations from [13]. By $\rho\colon \mathcal{P}_\theta \to [0,\infty)$ we denote the $\tau_{\mathcal{L}}$-continuous function assigning to each $P \in \mathcal{P}_\theta$ its average particle density
\[ \rho(P) := \int_{\Gamma^v}\#(\gamma\cap C)\,dP(\gamma). \]
$U_{\mathrm{pot}}\colon \mathcal{P}_\theta \to \mathbb{R}\cup\{\infty\}$ denotes the mean potential energy, which is given by
\[ U_{\mathrm{pot}}(P) := \lim_{n\to\infty}\frac{1}{(2\lambda_n)^d}\int_{\Gamma^v}\sum_{\{x,x'\}\subset(\mathrm{pr}_x\gamma)\cap\Lambda_{\lambda_n}}\phi(x-x')\,dP(\gamma). \]
The mean kinetic energy $U_{\mathrm{kin}}\colon \mathcal{P}_\theta \to \mathbb{R}$ is defined by
\[ U_{\mathrm{kin}}(P) := \int_{\Gamma^v}\sum_{(x,v)\in(\gamma\cap C)}\frac{1}{2}|v|^2\,dP(\gamma). \]
Both functions $U_{\mathrm{pot}}$, $U_{\mathrm{kin}}$ are measure affine and lower semicontinuous with respect to $\tau_{\mathcal{L}}$ (cf. [13, p. 1349] and also [12]). Set $U := U_{\mathrm{kin}} + U_{\mathrm{pot}}$. Moreover, we need to make use of the mean entropy $S\colon \mathcal{P}_\theta \to \mathbb{R}\cup\{-\infty\}$, which is an upper semicontinuous measure affine function and such that for $c \in \mathbb{R}$, $\varepsilon \ge 0$ the sets $\{P \in \mathcal{P}_\theta : S(P) \ge c,\ U_{\mathrm{kin}}(P) \le \varepsilon\}$
are compact and sequentially compact with respect to $\tau_{\mathcal{L}}$ (cf. [13, p. 1349 and Lemma 4.2]). We also need to consider entropy functionals $I_\beta\colon \mathcal{P}_\theta \to [0,\infty]$. They are defined by
\[ I_\beta(P) := \lim_{n\to\infty}\frac{I(P_n, Q_n^\beta)}{(2\lambda_n)^d}, \]
where $P_n := P\circ(\mathrm{pr}^v_{\Lambda_n})^{-1}$, $Q_n^\beta := Q^\beta\circ(\mathrm{pr}^v_{\Lambda_n})^{-1}$ and $Q^\beta$ denotes the Poisson point random field with intensity measure $dx\,e^{-\beta v^2/2}dv$ (cf. [13, Sec. 4], also for the definition of $I(\cdot,\cdot)$, which denotes relative entropy). We will sometimes make use of the identity ([13, (4.3)])
\[ S(P) = -I_\beta(P) + \beta U_{\mathrm{kin}}(P) + c(\beta) \tag{5.1} \]
for $P \in \mathcal{P}_\theta$. Here $c(\beta) = \sqrt{2\pi/\beta}^{\,d}$. So, $S(P) < \infty$ for all $P \in \mathcal{P}_\theta$. In [13, Theorem 3.2] it is shown that the function
\[ s(\rho',\varepsilon) := \sup\{S(P) \mid P \in \mathcal{P}_\theta,\ U(P) \le \varepsilon,\ \rho(P) = \rho'\} = \sup\{S(P) \mid P \in \mathcal{P}_\theta,\ U(P) = \varepsilon,\ \rho(P) = \rho'\}, \quad \rho' \ge 0,\ \varepsilon \in \mathbb{R}, \]
is upper semicontinuous and concave and coincides in the convex set $\Sigma = \{(\rho',\varepsilon) \mid \varepsilon > \varepsilon_{\min}(\rho')\}$ with the thermodynamic entropy density $\lim_{n\to\infty}\frac{\log M_n}{(2\lambda_n)^d}$, where $(M_n)_{n\in\mathbb{N}}$ denotes a sequence of microcanonical partition functions in $\Lambda_{\lambda_n}$ with periodic boundary such that the densities converge towards $\rho'$ and the energy densities converge towards $\varepsilon$. Here $\varepsilon_{\min}(\rho') = \inf\{U(P) \mid P \in \mathcal{P}_\theta,\ \rho(P) = \rho',\ S(P) > -\infty\}$.

We now state a variational principle for the thermodynamic free energy density, which we derive below as a direct consequence of the above mentioned corresponding result from [13] on the thermodynamic entropy density and some considerations from [27]. Set
\[ \tilde U_{\lambda_n}(\eta) := \sum_{x\in\eta}\sum_{r\in\mathbb{Z}^d}\sum_{\substack{y\in\eta+2\lambda_n r \\ y\ne x}}\phi(x-y), \quad \eta \in \Gamma_{\Lambda_{\lambda_n}}. \]
Lemma 5.3. For $\beta > 0$, $\rho > 0$, let the free energy $f(\rho,\beta)$ be defined by $\beta f(\rho,\beta) = \inf_{\varepsilon>\varepsilon_{\min}(\rho)}(\beta\varepsilon - s(\rho,\varepsilon))$. $f(\rho,\beta)$ is finite and it holds
\[ \beta f(\rho,\beta) = -\lim_{n\to\infty}\frac{\log Z_n}{(2\lambda_n)^d} = \inf\{\beta U(P) - S(P) \mid P \in \mathcal{P}_\theta,\ \rho(P) = \rho\}, \tag{5.2} \]
where for $n \in \mathbb{N}$
\[ Z_n = \frac{1}{N_n!}\int_{\Lambda_{\lambda_n}^{N_n}\times\mathbb{R}^{dN_n}} e^{-\beta\tilde U_{\lambda_n}(x_1,\dots,x_{N_n})}\,e^{-\frac{\beta}{2}(v_1^2+\cdots+v_{N_n}^2)}\,dx_1\cdots dx_{N_n}\,dv_1\cdots dv_{N_n} \]
is the canonical partition function with periodic boundary condition.

Proof. By [13, p. 1350] for $\varepsilon > \varepsilon_{\min}(\rho)$ it holds $s(\rho,\varepsilon) > -\infty$ and moreover $\varepsilon_{\min}(\rho)$ is finite, so we conclude that $f(\rho,\beta) < \infty$. Furthermore, due to [13, Lemma 4.1, Eq. (4.4)] and (5.1) the set $\{S(P) : P \in \mathcal{P}_\theta,\ \rho(P) = \rho,\ U(P) = \varepsilon\}$ is bounded from above by $\beta\varepsilon + \beta(B^2/4A) + c(\beta)$ with constants $A$, $B$ as in (SS). This implies that for any $\varepsilon > \varepsilon_{\min}(\rho)$ it holds $\beta\varepsilon - s(\rho,\varepsilon) \ge -\beta(B^2/4A) - c(\beta)$, hence $f(\rho,\beta) > -\infty$. The arguments in [27, p. 55] also work in the case of periodic boundary condition and including velocities, which together with [13, Theorem 3.2] yields the first equality in (5.2).
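The upper bound on $S$ used in this proof can be traced through the identities stated in this section. The following is only a sketch, assuming that $I_\beta \ge 0$ and that $U_{\mathrm{pot}} \ge -B^2/4A$ on $\mathcal{P}_\theta$ (both facts are used elsewhere in this section); it does not replace the reference to [13, Lemma 4.1].

```latex
% Sketch: bound on S(P) for P with rho(P) = rho and U(P) = epsilon.
\begin{align*}
S(P) &= -I_\beta(P) + \beta U_{\mathrm{kin}}(P) + c(\beta)
      && \text{by (5.1)} \\
     &\le \beta U_{\mathrm{kin}}(P) + c(\beta)
      && \text{since } I_\beta(P) \ge 0 \\
     &= \beta\bigl(U(P) - U_{\mathrm{pot}}(P)\bigr) + c(\beta) \\
     &\le \beta\varepsilon + \beta\,\frac{B^2}{4A} + c(\beta)
      && \text{since } U_{\mathrm{pot}}(P) \ge -\frac{B^2}{4A}.
\end{align*}
% Taking the supremum over all such P gives
%   s(\rho,\varepsilon) \le \beta\varepsilon + \beta B^2/(4A) + c(\beta),
% which is the same as
%   \beta\varepsilon - s(\rho,\varepsilon) \ge -\beta B^2/(4A) - c(\beta).
```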
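To prove the second one, first note that for any $P \in \mathcal{P}_\theta$ with $\rho(P) = \rho$ and $U(P) < \infty$ it holds $\beta U(P) - S(P) \ge \beta U(P) - s(\rho,U(P)) \ge \beta f(\rho,\beta)$. This follows from the definition of $f(\rho,\beta)$, when $U(P) > \varepsilon_{\min}(\rho)$. Moreover, since $\lim_{\varepsilon\downarrow\varepsilon_{\min}(\rho)}s(\rho,\varepsilon) = s(\rho,\varepsilon_{\min}(\rho))$ due to upper semicontinuity and concavity of $s(\cdot,\cdot)$, this extends also to $U(P) = \varepsilon_{\min}(\rho)$. For $U(P) < \varepsilon_{\min}(\rho)$ it is implied by the definition of $\varepsilon_{\min}(\rho)$. Thus $\beta f(\rho,\beta) \le \inf\{\beta U(P) - S(P) \mid P \in \mathcal{P}_\theta,\ \rho(P) = \rho,\ U(P) < \infty\}$. To prove the converse inequality, for $\delta > 0$ choose $\tilde\varepsilon \in (\varepsilon_{\min}(\rho),\infty)$ such that $\beta f(\rho,\beta) + \delta \ge \beta\tilde\varepsilon - s(\rho,\tilde\varepsilon)$. By [13, (3.9) and Theorem 3.2(b)] we may choose $P \in \mathcal{P}_\theta$ such that $\rho(P) = \rho$, $U(P) = \tilde\varepsilon$ and such that $S(P) \ge s(\rho,\tilde\varepsilon) - \delta$. Hence $\beta U(P) - S(P) \le \beta\tilde\varepsilon - s(\rho,\tilde\varepsilon) + \delta \le \beta f(\rho,\beta) + 2\delta$. Since $\delta$ may be chosen arbitrarily small, the second equality is shown.

Let $n \in \mathbb{N}$. We define the measure $\hat\mu^{[n]}$ on $\Gamma^v$ such that the configurations in $(\Lambda_{\lambda_n} + 2\lambda_n r)\times\mathbb{R}^d$, $r \in \mathbb{Z}^d$, are independent and distributed as shifts of $\mu^{[n]}$ by $2r\lambda_n$. One defines the translation invariant measure $\tilde\mu^{[n]}$ as spatial average of the $\hat\mu^{[n]}$, i.e.
\[ \tilde\mu^{[n]} := \frac{1}{(2\lambda_n)^d}\int_{\Lambda_{\lambda_n}}\hat\mu^{[n]}\circ\vartheta_x^{-1}\,dx. \]
It holds $\tilde\mu^{[n]} \in \mathcal{P}_\theta$, cf. [13, (6.3)]. Define for $\gamma \in \Gamma^v_{\Lambda_n}$ the measure $R_{n,\gamma} \in \mathcal{P}_\theta$ by
\[ R_{n,\gamma} := \frac{1}{(2\lambda_n)^d}\int_{\Lambda_{\lambda_n}}\delta_{\vartheta_x\gamma^{(n)}}\,dx, \]
where $\delta_\cdot$ denotes Dirac measure and $\gamma^{(n)}$ denotes the $2\lambda_n$-periodic continuation of $\gamma$. (See [15] for details on these translation invariant empirical fields.) We will below also have to consider the mixture $\mu^{[n]}R_n := \int_{\Gamma^v}R_{n,\gamma}\,d\mu^{[n]}(\gamma)$, which, on the other hand, is equal to $\mu^{(n)} := \mu_n\circ\mathrm{per}_n^{-1}$ due to translation invariance of $\mu^{(n)}$ resulting from the (spatial) translation invariance of $\mu_n$ as a measure on the manifold $E_{\lambda_n}^{N_n}$ (defined in Sec. 4.2). Keeping this in mind, we will nevertheless use the notation $\mu^{[n]}R_n$ in the sequel. Note that by [13, (6.3)] it holds for $n \in \mathbb{N}$
\[ U_{\mathrm{kin}}(\tilde\mu^{[n]}) = \frac{1}{(2\lambda_n)^d}\int_{\Gamma^v_{\lambda_n}}\sum_{(x,v)\in\gamma}\frac{1}{2}\beta v^2\,d\mu^{[n]}(\gamma) = U_{\mathrm{kin}}(\mu^{[n]}R_n), \tag{5.3} \]
and by the definition of $\mu^{[n]}$ this expression is equal to $C(\beta)\frac{N_n}{(2\lambda_n)^d}$ for some $C(\beta) > 0$ not depending on $n \in \mathbb{N}$. The proof of Theorem 5.1 is based on the inequality given in the following lemma.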
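Lemma 5.4. It holds for $n \in \mathbb{N}$
\[ S(\tilde\mu^{[n]}) \ge \beta U(\mu^{[n]}R_n) + \frac{\log(Z_n)}{(2\lambda_n)^d}. \tag{5.4} \]
Proof. It holds
\[ \frac{d(\mu^{[n]}\circ(\mathrm{pr}^v_{\Lambda_{\lambda_n}})^{-1})}{dQ_n^\beta} = \frac{1}{Q_n^\beta(e^{-F_n})}\,e^{-F_n}, \]
where $F_n\colon \Gamma^v_{\Lambda_{\lambda_n}} \to \mathbb{R}\cup\{\infty\}$ is defined by $F_n(\gamma) := \beta\tilde U_{\lambda_n}(\gamma)$ for $\gamma \in \Gamma^v_{\Lambda_{\lambda_n},N_n}$ and $= \infty$ else. These functions might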
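not form an asymptotic empirical functional in the sense of [15]. Nevertheless, the proof of the second assertion in [15, Lemma 5.5] is valid for the measures $\mu^{[n]}$, $\tilde\mu^{[n]}$. Therefore, using also (5.1) and (5.3) we find that
\begin{align*}
S(\tilde\mu^{[n]}) &= -I_\beta(\tilde\mu^{[n]}) + \beta U_{\mathrm{kin}}(\mu^{[n]}R_n) + c(\beta) \\
&\ge -\frac{1}{(2\lambda_n)^d}\,I\bigl(\mu^{[n]}\circ(\mathrm{pr}^v_{\Lambda_{\lambda_n}})^{-1};\,Q_n^\beta\bigr) + \beta U_{\mathrm{kin}}(\mu^{[n]}R_n) + c(\beta) \\
&= \frac{1}{(2\lambda_n)^d}\,\mu^{[n]}\circ(\mathrm{pr}^v_{\Lambda_{\lambda_n}})^{-1}(F_n) + \beta U_{\mathrm{kin}}(\mu^{[n]}R_n) + c(\beta) + \frac{1}{(2\lambda_n)^d}\log\bigl(Q_n^\beta(e^{-F_n})\bigr) \\
&= \frac{1}{(2\lambda_n)^d}\,\mu^{[n]}\circ(\mathrm{pr}^v_{\Lambda_{\lambda_n}})^{-1}(F_n) + \beta U_{\mathrm{kin}}(\mu^{[n]}R_n) + \frac{1}{(2\lambda_n)^d}\log Z_n.
\end{align*}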
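The two equalities in the chain above can be checked by hand. The sketch below assumes that $Q_n^\beta$ is the Poisson point random field on $\Lambda_{\lambda_n}\times\mathbb{R}^d$ with intensity $dx\,e^{-\beta v^2/2}dv$ and that $c(\beta)$ equals the mass $\int_{\mathbb{R}^d}e^{-\beta v^2/2}dv$ of the velocity part of this intensity; the shorthand $\nu_n$ is introduced only for this illustration.

```latex
% Shorthand (not from the text):
%   \nu_n := \mu^{[n]} \circ (\mathrm{pr}^v_{\Lambda_{\lambda_n}})^{-1}.
% Step 1: relative entropy of a measure with density e^{-F_n} / Q_n^\beta(e^{-F_n}).
\begin{align*}
I(\nu_n; Q_n^\beta)
  &= \int \log\frac{d\nu_n}{dQ_n^\beta}\,d\nu_n
   = -\nu_n(F_n) - \log Q_n^\beta\bigl(e^{-F_n}\bigr).
\end{align*}
% Step 2: e^{-F_n} vanishes off the N_n-point configurations, so only the
% N_n-particle term of the Poisson expansion contributes.
\begin{align*}
Q_n^\beta\bigl(e^{-F_n}\bigr)
  &= e^{-c(\beta)(2\lambda_n)^d}\,\frac{1}{N_n!}
     \int_{\Lambda_{\lambda_n}^{N_n}\times\mathbb{R}^{dN_n}}
     e^{-\beta\tilde U_{\lambda_n}(x_1,\dots,x_{N_n})}\,
     e^{-\frac{\beta}{2}(v_1^2+\dots+v_{N_n}^2)}\,dx\,dv
   = e^{-c(\beta)(2\lambda_n)^d}\,Z_n, \\
\frac{1}{(2\lambda_n)^d}\log Q_n^\beta\bigl(e^{-F_n}\bigr)
  &= -c(\beta) + \frac{1}{(2\lambda_n)^d}\log Z_n .
\end{align*}
```

Under these assumptions the term $c(\beta)$ cancels, which accounts for the last line of the chain.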
By [12, (2.16)] and since $U_{\mathrm{pot}}$ is measure affine it holds
\[ \frac{1}{(2\lambda_n)^d}\,\mu^{[n]}\circ(\mathrm{pr}^v_{\Lambda_{\lambda_n}})^{-1}(F_n) = \beta\mu^{[n]}(U_{\mathrm{pot}}(R_n,\cdot)) = \beta U_{\mathrm{pot}}(\mu^{[n]}R_n). \]
This completes the proof.

Since $U_{\mathrm{kin}}(\tilde\mu^{[n]}) = C(\beta)\frac{N_n}{(2\lambda_n)^d}$, $n \in \mathbb{N}$, the kinetic energy density of all $\tilde\mu^{[n]}$ is bounded. Therefore, boundedness of $U_{\mathrm{pot}}$ from below (by $-B^2/4A$) together with convergence of $\frac{\log(Z_n)}{(2\lambda_n)^d}$ as $n \to \infty$ (cf. Lemma 5.3) imply that $(S(\tilde\mu^{[n]}))_{n\in\mathbb{N}}$ is bounded from below. This together with the boundedness of the kinetic energy implies relative compactness of the sequence $(\tilde\mu^{[n]})_n$ (see the properties of $S$ mentioned above). The following lemma shows that asymptotically one can treat $\tilde\mu^{[n]}$, $\mu^{[n]}$ and $\mu^{[n]}R_n$, $n \in \mathbb{N}$, as equal.
Lemma 5.5. The sequences $(\mu^{[n]})_n$, $(\tilde\mu^{[n]})_n$ and $(\mu^{[n]}R_n)_n$ are asymptotically equivalent, i.e. for any two of them, say $(\nu_1^n)_n$, $(\nu_2^n)_n$, and any $f \in \mathcal{L}$ it holds $\lim_{n\to\infty}|\nu_1^n(f) - \nu_2^n(f)| = 0$. In particular, convergence of $(\tilde\mu^{[n_k]})_{k\in\mathbb{N}}$ to some $\mu \in \mathcal{P}_\theta$ with respect to $\tau_{\mathcal{L}}$ implies that also $\lim_{k\to\infty}\mu^{[n_k]} = \lim_{k\to\infty}\mu^{[n_k]}R_{n_k} = \mu$.

Proof. This is shown as in the proof of [13, Lemma 6.2]: The asymptotic equivalence of $\mu^{[n]}$ and $\mu^{[n]}R_n$ is clear. For the second asymptotic equivalence note that $\sup_n I_\beta(\tilde\mu^{[n]}) < \infty$ by (5.4), (5.1) and (5.3), so [15, Lemma 5.7] can be applied. For convenience of the reader we remark that to apply the mentioned lemma a function $\psi\colon \mathbb{R}^d \to \mathbb{R}$, defined as function of velocities, has to be chosen appropriately. (This function is used in the definitions of $\mathcal{L}$ and $\mathcal{P}_\theta$ in [15].) Setting $\psi(v) := 1 + |v|$, $v \in \mathbb{R}^d$, is the standard choice here. Then the definition of $\mathcal{P}_\theta$ from [15] does not coincide with the one given above, but denotes a larger space. This, however, does not affect the considerations made in the proof of [15, Lemma 5.7].
From Lemma 5.5 above, the preceding considerations, (5.4) and the properties of $U$, $S$ and $\rho$ we find that there exists an accumulation point $\mu \in \mathcal{P}_\theta$ of $(\mu^{[n]})_{n\in\mathbb{N}}$ with respect to $\tau_{\mathcal{L}}$ fulfilling $\rho(\mu) = \rho$ and any such accumulation point fulfills
\[ \beta U(\mu) - S(\mu) \le -\lim_{n\to\infty}\frac{\log(Z_n)}{(2\lambda_n)^d}. \]
(Note that since $U$ is bounded from below and $S$ cannot take the value $+\infty$ in $\mathcal{P}_\theta$ (one may see this from (5.1) and since $I_\beta$ only takes nonnegative values), both $U(\mu)$ and $S(\mu)$ are finite.) From Lemma 5.3 we see that $\beta U(\mu) - S(\mu) = \beta f(\rho,\beta)$. Hence $\mu$ is a minimizer of the canonical free energy density $U(\cdot) - \beta^{-1}S(\cdot)$ under the constraint $\rho(\cdot) = \rho$.

We finally make considerations similar to those in [13, p. 1351]: For $\beta > 0$ the function $f(\cdot,\beta)$ is convex. (This follows from its definition, the concavity of $s(\cdot,\cdot)$ and the convexity of the effective domain $\Sigma$ of $s(\cdot,\cdot)$ defined in [13, (3.7)].) Hence we may choose some $z > 0$ and $p \in \mathbb{R}$ such that $\rho' \mapsto -p + \rho'\beta^{-1}\log(z)$ is a tangent to $f(\cdot,\beta)$ at $\rho$. This and (5.2) imply that for any $P \in \mathcal{P}_\theta$ it holds
\[ \beta U(P) - \rho(P)\log(z) - S(P) \ge \beta f(\rho(P),\beta) - \rho(P)\log(z) \ge -\beta p = \beta f(\rho,\beta) - \rho\log(z) = \beta U(\mu) - \rho(\mu)\log(z) - S(\mu). \tag{5.5} \]
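The middle steps of (5.5) use only the tangent property of the free energy at $\rho$, together with $\beta U(\mu) - S(\mu) = \beta f(\rho,\beta)$ and $\rho(\mu) = \rho$. A short sketch, writing $\rho'$ for a generic positive density (a notational assumption made for this illustration):

```latex
% Supporting line of the convex function \rho' \mapsto f(\rho',\beta) at \rho:
\begin{align*}
f(\rho',\beta) &\ge -p + \rho'\,\beta^{-1}\log(z)
  \quad\text{for all } \rho' > 0,
  \qquad\text{with equality at } \rho' = \rho. \\
\intertext{Multiplying by $\beta$ and rearranging gives}
\beta f(\rho',\beta) - \rho'\log(z) &\ge -\beta p
  = \beta f(\rho,\beta) - \rho\log(z).
\end{align*}
% Inserting \rho' = \rho(P) and the bound
%   \beta U(P) - S(P) \ge \beta f(\rho(P),\beta)
% from (5.2) yields the first two inequalities of (5.5) for \rho(P) > 0.
```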
In order to prove this inequality for $\rho(P) = 0$ (i.e. $P = \delta_\emptyset$), note that $U(\delta_\emptyset) = 0$, $S(\delta_\emptyset) = 0$ and $\rho(\delta_\emptyset) = 0$. Hence we only need to verify that $p \ge 0$. This, however, follows e.g. from the fact that for any measure $Q \in \mathcal{P}_\theta$ fulfilling $U(Q) < \infty$, $\rho(Q) > 0$ and $S(Q) > -\infty$ (such a measure exists) it holds
\[ \lim_{\rho'\to0}\beta f(\rho',\beta) \le \lim_{\alpha\to0}\bigl(\beta U(\alpha Q + (1-\alpha)\delta_\emptyset) - S(\alpha Q + (1-\alpha)\delta_\emptyset)\bigr) = 0, \]
where we used (5.2) and the fact that $U$ and $S$ are affine functions. (5.5) implies that $\mu$ is a minimizer of the mean free energy $U(\cdot) - \rho(\cdot)\log(z) - S(\cdot)$. By [13, Theorem 3.4] we conclude that $\mu$ is a tempered grand canonical Gibbs measure, and the proof of Theorem 5.1 is completed.

Acknowledgement

Financial support by the DFG through the project GR 1809/5-1 is gratefully acknowledged.

References

[1] S. Albeverio, Yu. G. Kondratiev and M. Röckner, Analysis and geometry on configuration spaces: The Gibbsian case, J. Funct. Anal. 157 (1998) 242–291.
[2] L. Beznea, N. Boboc and M. Röckner, Markov processes associated with Lp-resolvents and applications to stochastic differential equations on Hilbert space, J. Evol. Equ. 6(4) (2006) 745–772.
[3] P. Billingsley, Probability and Measure (Wiley, New York, 1979).
[4] C. Bahn, Y. M. Park and H. J. Yoo, Nonequilibrium dynamics of infinite particle systems with infinite range interactions, J. Math. Phys. 40(9) (1999) 4337–4358.
[5] F. Conrad and M. Grothaus, Construction, ergodicity and rate of convergence of N-particle Langevin dynamics with singular interactions, J. Evol. Equ. 10(3) (2010) 623–662.
[6] F. Conrad, Construction and analysis of Langevin dynamics in continuous particle systems, PhD thesis, University of Kaiserslautern (2010).
[7] R. L. Dobrushin and R. A. Minlos, The existence and continuity of pressure in classical statistical physics, Theor. Probab. Appl. 12(4) (1967) 535–559.
[8] A. Eberle, Uniqueness and Non-Uniqueness of Semigroups Generated by Singular Diffusion Operators, Lecture Notes in Mathematics, Vol. 1718 (Springer, 1999).
[9] S. Ethier and T. Kurtz, Markov Processes (Wiley & Sons, 1986).
[10] T. Fattler and M. Grothaus, Construction of elliptic diffusions with reflecting boundary condition and an application to continuous N-particle systems with singular interactions, Proc. Edinburgh Math. Soc. 51(2) (2008) 337–363.
[11] J. Fritz, Stochastic dynamics of two-dimensional infinite-particle systems, J. Stat. Phys. 20 (1979) 361–369.
[12] H.-O. Georgii, Large deviations and the equivalence of ensembles for Gibbsian particle systems with superstable interaction, Probab. Theory Related Fields 99 (1994) 171–195.
[13] H.-O. Georgii, The equivalence of ensembles for classical systems of particles, J. Stat. Phys. 80 (1995) 1341–1378.
[14] M. Grothaus, Y. Kondratiev and M. Röckner, N/V-limit for stochastic dynamics in continuous particle systems, Probab. Theory Related Fields 137 (2007) 121–160.
[15] H.-O. Georgii and H. Zessin, Large deviations and maximum entropy principle for marked point random fields, Probab. Theory Related Fields 96 (1993) 177–204.
[16] T. L. Hill, Statistical Mechanics, McGraw-Hill Series in Advanced Chemistry (McGraw-Hill, 1956).
[17] R. A. Holley and D. W. Stroock, Generalized Ornstein–Uhlenbeck processes and infinite particle branching Brownian motion, Publ. Res. Inst. Math. Sci. 14 (1978) 741–788.
[18] O. Kallenberg, Random Measures, 2nd edn. (Academic Press, 1976).
[19] Yu. G. Kondratiev and V. Kutovij, On the metrical properties of the configuration space, Math. Nachr. 279 (2006) 774–783.
[20] A. V. Kolesnikov, Convergence of Dirichlet forms with changing speed measures on R^d, Forum Math. 17 (2005) 225–259.
[21] K. Kuwae and T. Shioya, Convergence of spectral structures: A functional analytic theory and its applications to spectral geometry, Comm. Anal. Geom. 11(4) (2003) 599–673.
[22] T. Kuna, Studies in configuration space analysis and applications, PhD thesis, University of Bonn, Bonner Mathematische Schriften 324 (1999).
[23] T. G. Kurtz, Extensions of Trotter's operator semigroup approximation theorems, J. Funct. Anal. 3 (1969) 354–375.
[24] C. Marchioro, A. Pellegrinotti and E. Presutti, Existence of time evolutions for ν-dimensional statistical mechanics, Comm. Math. Phys. 40 (1975) 175–185.
[25] H. Osada, Dirichlet form approach to infinite-dimensional Wiener process with singular interactions, Comm. Math. Phys. 176 (1996) 117–131.
[26] S. Olla and C. Tremoulet, Equilibrium fluctuations for interacting Ornstein–Uhlenbeck particles, Comm. Math. Phys. 233(3) (2003) 463–491.
[27] D. Ruelle, Statistical Mechanics. Rigorous Results (Benjamin, 1969).
[28] D. Ruelle, Superstable interactions in classical statistical mechanics, Comm. Math. Phys. 18 (1970) 127–159.
[29] H. Spohn, Equilibrium fluctuations for interacting Brownian particles, Comm. Math. Phys. 103 (1986) 1–33.
[30] R. Siegmund-Schultze, On non-equilibrium dynamics of multidimensional infinite particle systems in the translation invariant case, Comm. Math. Phys. 100 (1985) 245–265.
[31] W. Stannat, The theory of generalized Dirichlet forms and its applications in analysis and stochastics, Mem. Amer. Math. Soc. 142 (1999), no. 678, viii + 101 pp.
[32] J. M. Tölle, Convergence of non-symmetric forms with changing reference measure, Master's thesis, University of Bielefeld (2006).
[33] H. F. Trotter, Approximation of semi-groups of operators, Pacific J. Math. 8 (1958) 887–919.
[34] M. W. Yoshida, Construction of infinite-dimensional interacting diffusion processes through Dirichlet forms, Probab. Theory Related Fields 106 (1996) 265–297.
February 10, J070-S0129055X11004205
2011 13:49 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 1 (2011) 53–81 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004205
SPECTRAL AND SCATTERING THEORY FOR THE AHARONOV–BOHM OPERATORS
KONSTANTIN PANKRASHKIN∗,‡ and SERGE RICHARD†,§
∗Laboratoire de Mathématiques d'Orsay, CNRS UMR 8628, Université Paris-Sud XI, Bâtiment 425, 91405 Orsay Cedex, France
†Department of Pure Mathematics and Mathematical Statistics, Centre for Mathematical Sciences, University of Cambridge, Cambridge, CB3 0WB, UK
‡[email protected] §[email protected]
Received 1 December 2009 Revised 3 September 2010
We review the spectral and the scattering theory for the Aharonov–Bohm model on $\mathbb{R}^2$. New formulae for the wave operators and for the scattering operator are presented. The asymptotics at high energy and at low energy of the scattering operator are computed.
Keywords: Spectral and scattering theory; Aharonov–Bohm model; wave and scattering operators; boundary triple.
Mathematics Subject Classification 2010: 81U15, 47A40
1. Introduction

The Aharonov–Bohm (A–B) model describing the motion of a charged particle in a magnetic field concentrated at a single point is one of the few systems in mathematical physics for which the spectral and the scattering properties can be completely computed. It has been introduced in [3] and the first rigorous treatment appeared in [26]. A more general class of models involving boundary conditions at the singularity point has then been developed in [2, 13] and further extensions or refinements appeared since these simultaneous works. Being unable to list all these subsequent papers, let us simply mention a few of them: [28] in which it is proved that the A–B models can be obtained as limits in a suitable sense of systems with less singular magnetic fields, [27] in which it is shown that the low energy behavior of the scattering amplitude for two-dimensional magnetic Schrödinger operators is similar to the scattering amplitude of the A–B models, and the series [7, 8, 30] in which, among other results, high energy estimates are obtained for the scattering operator. Concerning the extensions we mention the papers [15] which considers the A–B operators with an additional uniform magnetic field and [21] which studies the A–B operators on the hyperbolic plane.

† On leave from Université de Lyon, Université Lyon I, CNRS UMR5208, Institut Camille Jordan, 43 blvd du 11 novembre 1918, 69622 Villeurbanne Cedex, France.

The aim of the present paper is to provide the spectral and the scattering analysis of the A–B operators on $\mathbb{R}^2$ for all possible values of the parameters (boundary conditions). The work is motivated by the recent result of one of the authors ([25]) showing that the A–B wave operators can be rewritten in terms of explicit functions of the generator of dilations and of the Laplacian. However, the proof of this result used certain complicated expressions for the scattering operator borrowed from [2] and we have in the meanwhile found a simpler approach. For those reasons, we have decided to start again the analysis from scratch using the modern operator-theoretical machinery. For example, in contrast to [2, 13], our computations do not involve an explicit parametrization of $U(2)$. Simultaneously, we recast this analysis in the up-to-date theory of self-adjoint extensions ([12]) and derive rigorously the expressions for the wave operators and the scattering operator from the stationary approach of scattering theory as presented in [31].

So let us now describe the content of this review paper. In Sec. 2, we introduce the operator $H_\alpha$ which corresponds to a Schrödinger operator in $\mathbb{R}^2$ with a δ-type magnetic field at the origin. The index $\alpha$ corresponds to the total flux of the magnetic field, and on a natural domain this operator has deficiency indices (2, 2). The description of this natural domain is recalled and some of its properties are exhibited. Section 3 is devoted to the description of all self-adjoint extensions of the operator $H_\alpha$. More precisely, a boundary triple for the operator $H_\alpha$ is constructed in Proposition 4.
It essentially consists in the definition of two linear maps Γ1 , Γ2 from the domain D(Hα∗ ) of the adjoint of Hα to C2 which have some specific properties with respect to Hα , as recalled at the beginning of this section. Once these maps are exhibited, all self-adjoint extensions of Hα can be labeled by two 2 × 2-matrices C and D satisfying two simple conditions presented in (7). These self-adjoint extensions are denoted by HαCD . The γ-field and the Weyl function corresponding to the boundary triple are then constructed. By taking advantage of some general results related to the boundary triple’s approach, they allow us to explicitly mention the spectral properties of HαCD in very simple terms. At the end of the section, we add some comments about the role of the parameters C and D and discuss some of their properties. The short Sec. 4 contains formulae on the Fourier transform and on the dilation group that are going to be used subsequently. Section 5 is the main section on scattering theory. It contains the time dependent approach as well as the stationary approach of the scattering theory for the A–B models. Some calculations involving Bessel functions or hypergeometric 2 F1 -functions look rather tricky but they are necessary for a rigorous derivation of the stationary expressions. Fortunately, the final expressions are much more easily understandable. For example, it is proved in Proposition 11 that the channel wave operators for the original A–B operator HαAB are equal to very explicit functions of the generator of dilations. These functions are
continuous on $[-\infty,\infty]$ and take values in the set of complex numbers of modulus 1. Theorem 12 contains a similar explicit description of the wave operators for the general operator $H_\alpha^{CD}$. In Sec. 6, we study the scattering operator and in particular its asymptotics at small and large energies. These properties highly depend on the parameters $C$ and $D$ but also on the flux $\alpha$ of the singular magnetic field. All the various possibilities are explicitly analyzed. The statement looks rather messy, but this simply reflects the richness of the model. The parametrization of the self-adjoint extensions of $H_\alpha$ with the pair $(C,D)$ is highly non-unique. For convenience, we introduce in the last section a one-to-one parametrization of all self-adjoint extensions and make some of the previous results explicit in this framework. For further investigations in the structure of the set of all self-adjoint extensions, this unique parametrization has many advantages. Finally, let us mention that this paper is essentially self-contained. Furthermore, despite the rather long and rich history of the Aharonov–Bohm model most of our results are new or exhibited in the present form for the first time.

Remark 1. After the completion of this paper, the authors were informed about the closely related work [10]. In this paper, the differential expression $-\partial_x^2 + (m^2 - 1/4)x^{-2}$ on $\mathbb{R}_+$ is considered and a holomorphic family of extensions for $\Re(m) > -1$ is studied. Formulae for the wave operators similar to our formula (14) were independently obtained by its authors.

Remark 2. In December 2009, a two-day meeting celebrated the 50th anniversary of the Aharonov–Bohm effect, and the 25th anniversary of the discovery of the related geometric, or Berry, phase. It was pointed out to us by the referee that an interesting discussion took place in the physics literature on this occasion. We refer to the letter [9] for more information on the subject and thank the referee for drawing our attention to this reference.

2. General Setting

Let $\mathcal{H}$ denote the Hilbert space $L^2(\mathbb{R}^2)$ with its scalar product $\langle\cdot,\cdot\rangle$ and its norm $\|\cdot\|$. For any $\alpha \in \mathbb{R}$, we set $A_\alpha\colon \mathbb{R}^2\setminus\{0\} \to \mathbb{R}^2$ by
\[ A_\alpha(x,y) = -\alpha\Bigl(\frac{-y}{x^2+y^2},\ \frac{x}{x^2+y^2}\Bigr), \]
corresponding formally to the magnetic field $B = \alpha\delta$ ($\delta$ is the Dirac delta function), and consider the operator
\[ H_\alpha := (-i\nabla - A_\alpha)^2, \qquad D(H_\alpha) = C_c^\infty(\mathbb{R}^2\setminus\{0\}). \]
Here $C_c^\infty(\Xi)$ denotes the set of smooth functions on $\Xi$ with compact support. The closure of this operator in $\mathcal{H}$, which is denoted by the same symbol, is symmetric and has deficiency indices (2, 2) ([2, 13]). For further investigation we need some more information on this closure.
February 10, J070-S0129055X11004205
56
2011 13:49 WSPC/S0129-055X
148-RMP
K. Pankrashkin & S. Richard
So let us first decompose the Hilbert space $\mathcal{H}$ with respect to polar coordinates: For any $m \in \mathbb{Z}$, let $\phi_m$ be the complex function defined by $[0, 2\pi) \ni \theta \mapsto \phi_m(\theta) := \frac{e^{im\theta}}{\sqrt{2\pi}}$. Then, by taking the completeness of the family $\{\phi_m\}_{m\in\mathbb{Z}}$ in $L^2(\mathbb{S}^1)$ into account, one has the canonical isomorphism
\[ \mathcal{H} \cong \bigoplus_{m\in\mathbb{Z}} \mathcal{H}_r \otimes [\phi_m], \tag{1} \]
where $\mathcal{H}_r := L^2(\mathbb{R}_+, r\,dr)$ and $[\phi_m]$ denotes the one-dimensional space spanned by $\phi_m$. For shortness, we write $\mathcal{H}_m$ for $\mathcal{H}_r \otimes [\phi_m]$, and often consider it as a subspace of $\mathcal{H}$. Clearly, the Hilbert space $\mathcal{H}_m$ is isomorphic to $\mathcal{H}_r$, for any m. In this representation the operator $H_\alpha$ is equal to [13, Sec. 2]
\[ \bigoplus_{m\in\mathbb{Z}} H_{\alpha,m} \otimes 1, \tag{2} \]
with
\[ H_{\alpha,m} = -\frac{d^2}{dr^2} - \frac{1}{r}\frac{d}{dr} + \frac{(m+\alpha)^2}{r^2}, \]
and with a domain which depends on $m + \alpha$. It clearly follows from this representation that replacing $\alpha$ by $\alpha + n$, $n \in \mathbb{Z}$, corresponds to a unitary transformation of $H_\alpha$. In particular, the case $\alpha \in \mathbb{Z}$ is equivalent to the magnetic field-free case $\alpha = 0$, i.e. the Laplacian and its zero-range perturbations, see [4, Chap. 1.5]. Hence throughout the paper, we restrict our attention to the values $\alpha \in (0, 1)$.

So, for $\alpha \in (0, 1)$ and $m \notin \{0, -1\}$, the domain $D(H_{\alpha,m})$ is given by
\[ \big\{ f \in \mathcal{H}_r \cap H^{2,2}_{\mathrm{loc}}(\mathbb{R}_+) \ \big|\ -f'' - r^{-1}f' + (m+\alpha)^2 r^{-2} f \in \mathcal{H}_r \big\}. \]
For $m \in \{0, -1\}$, let $H^{(1)}_\nu$ denote the Hankel function of the first kind and of order $\nu$, and for $f, h \in H^{2,2}_{\mathrm{loc}}(\mathbb{R}_+)$ let $W(f, h)$ stand for the Wronskian
\[ W(f, h) := f h' - f' h. \]
One then has
\[ D(H_{\alpha,m}) = \Big\{ f \in \mathcal{H}_r \cap H^{2,2}_{\mathrm{loc}}(\mathbb{R}_+) \ \Big|\ -f'' - r^{-1}f' + (m+\alpha)^2 r^{-2} f \in \mathcal{H}_r \ \text{and}\ \lim_{r\to 0+} r\,[W(f, h_{\pm i,m})](r) = 0 \Big\}, \]
where $h_{+i,m}(r) = H^{(1)}_{|m+\alpha|}(e^{i\pi/4} r)$ and $h_{-i,m}(r) = H^{(1)}_{|m+\alpha|}(e^{i3\pi/4} r)$. It is known that the operators $H_{\alpha,m}$ for $m \notin \{0, -1\}$ are self-adjoint on the mentioned domain, while $H_{\alpha,0}$ and $H_{\alpha,-1}$ have deficiency indices (1, 1). This explains the deficiency indices (2, 2) for the operator $H_\alpha$.

The problem of the description of all self-adjoint extensions of the operator $H_\alpha$ can be approached by two methods. On the one hand, there exists the classical description of von Neumann based on unitary operators between the deficiency subspaces. On the other hand, there exists the theory of boundary triples which
February 10, J070-S0129055X11004205
2011 13:49 WSPC/S0129-055X
148-RMP
Spectral and Scattering Theory for the Aharonov–Bohm Operators
57
has been widely developed for the last twenty years ([12, 14]). Since our construction is based only on the latter approach, we shall recall it briefly in the sequel.

Before stating a simple result on $D(H_{\alpha,m})$ for $m \in \{0, -1\}$ let us set some conventions. For a complex number $z \in \mathbb{C}\setminus\mathbb{R}_+$, the branch of the square root $z \mapsto \sqrt{z}$ is fixed by the condition $\Im\sqrt{z} > 0$. In other words, for $z = re^{i\varphi}$ with $r > 0$ and $\varphi \in (0, 2\pi)$ one has $\sqrt{z} = \sqrt{r}\,e^{i\varphi/2}$. On the other hand, for $\beta \in \mathbb{R}$ we always take the principal branch of the power $z \mapsto z^\beta$ by taking the principal branch of the argument $\arg z \in (-\pi, \pi)$. This means that for $z = re^{i\varphi}$ with $r > 0$ and $\varphi \in (-\pi, \pi)$ we have $z^\beta = r^\beta e^{i\beta\varphi}$. Let us also recall the asymptotic behavior of $H^{(1)}_\nu(w)$ as $w \to 0$ in $\mathbb{C}\setminus\mathbb{R}_-$ and for $\nu \notin \mathbb{Z}$:
\[ H^{(1)}_\nu(w) = -\frac{2^\nu i}{\sin(\pi\nu)\Gamma(1-\nu)}\,w^{-\nu} + \frac{2^{-\nu} i e^{-i\pi\nu}}{\sin(\pi\nu)\Gamma(1+\nu)}\,w^{\nu} + O(w^{2-\nu}). \tag{3} \]
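The expansion (3) can be checked numerically. The following Python sketch is an illustration only and not part of the original argument: it builds $H^{(1)}_\nu$ for real $w > 0$ from the power series of $J_{\pm\nu}$ through the standard relation $H^{(1)}_\nu = (J_{-\nu} - e^{-i\pi\nu}J_\nu)/(i\sin(\pi\nu))$, and compares it with the two explicit terms of (3). The function names (`bessel_j`, `hankel1`, `leading_terms`, `remainder_ratio`) and the truncation at 30 series terms are assumptions made for this sketch.

```python
import cmath
import math


def bessel_j(order, w, terms=30):
    """Truncated power series of the Bessel function J_order(w), real w > 0."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * (w / 2) ** (2 * k + order)
                  / (math.factorial(k) * math.gamma(k + order + 1)))
    return total


def hankel1(nu, w):
    """H^(1)_nu(w) for non-integer nu, obtained from J_{-nu} and J_nu."""
    return ((bessel_j(-nu, w) - cmath.exp(-1j * math.pi * nu) * bessel_j(nu, w))
            / (1j * math.sin(math.pi * nu)))


def leading_terms(nu, w):
    """The two explicit terms on the right-hand side of (3)."""
    s = math.sin(math.pi * nu)
    first = -(2 ** nu) * 1j / (s * math.gamma(1 - nu)) * w ** (-nu)
    second = (2 ** (-nu) * 1j * cmath.exp(-1j * math.pi * nu)
              / (s * math.gamma(1 + nu)) * w ** nu)
    return first + second


def remainder_ratio(nu, w):
    """|H^(1)_nu(w) - leading terms| / w^(2 - nu); bounded if (3) holds."""
    return abs(hankel1(nu, w) - leading_terms(nu, w)) / w ** (2 - nu)


if __name__ == "__main__":
    for w in (1e-1, 1e-2, 1e-3):
        print(w, remainder_ratio(0.3, w))
```

If (3) holds, the printed ratio should remain bounded as $w$ decreases, which is what the $O(w^{2-\nu})$ term expresses.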
Proposition 3. For any $f \in D(H_{\alpha,m})$ with $m \in \{0, -1\}$, the following asymptotic behavior holds:
\[ \lim_{r\to 0+} \frac{f(r)}{r^{|m+\alpha|}} = 0. \]

Proof. Let us set $\nu := |m + \alpha| \in (0, 1)$, and recall that $f \in D(H_{\alpha,m})$ implies $f \in C^1((0, +\infty))$ and that the Hankel function satisfies $(H^{(1)}_\nu(z))' = H^{(1)}_{\nu-1}(z) - \frac{\nu}{z} H^{(1)}_\nu(z)$. By taking this and the asymptotics (3) into account, the condition $\lim_{r\to 0+} r\,[W(h_{\pm i,m}, f)](r) = 0$ implies that
\[ \lim_{r\to 0+} \{ r^{\nu+1} f'(r) - \nu r^{\nu} f(r) \} = 0 \tag{4} \]
and that
\[ \lim_{r\to 0+} \{ r^{1-\nu} f'(r) + \nu r^{-\nu} f(r) \} = 0. \tag{5} \]
Multiplying both terms of (5) by $r^{2\nu}$ and subtracting it from (4) one obtains that
\[ \lim_{r\to 0+} r^{\nu} f(r) = 0. \tag{6} \]
On the other hand, considering (5) as a linear differential equation for f: $r^{1-\nu} f'(r) + \nu r^{-\nu} f(r) = b(r)$, and using the variation of constants one gets for some $C \in \mathbb{C}$:
\[ f(r) = \frac{C}{r^{\nu}} + \frac{1}{r^{\nu}} \int_0^r t^{2\nu-1} b(t)\,dt. \]
Now Eq. (6) implies that $C = 0$, and by using l'Hôpital's rule, one finally obtains:
\[ \lim_{r\to 0+} \frac{f(r)}{r^{\nu}} = \lim_{r\to 0+} \frac{\int_0^r t^{2\nu-1} b(t)\,dt}{r^{2\nu}} = \lim_{r\to 0+} \frac{r^{2\nu-1} b(r)}{2\nu r^{2\nu-1}} = \frac{1}{2\nu} \lim_{r\to 0+} b(r) = 0. \]
3. Boundary Conditions and Spectral Theory

In this section, we explicitly construct a boundary triple for the operator $H_\alpha$ and we briefly exhibit some spectral results in that setting. Clearly, our construction is very close to the one in [13], but that paper does not contain any reference to the boundary triple machinery. Our aim is thus to recast the construction in an up-to-date theory. The following presentation is strictly adapted to our setting, and as a general rule we do not write out the dependence on $\alpha$ of each of the objects. We refer to [12, 14] for more information on boundary triples.

Let $H_\alpha$ be the densely defined closed and symmetric operator in $\mathcal{H}$ previously introduced. The adjoint of $H_\alpha$ is denoted by $H_\alpha^*$ and is defined on the domain
\[ D(H_\alpha^*) = \big\{ \mathfrak{f} \in \mathcal{H} \cap H^{2,2}_{\mathrm{loc}}(\mathbb{R}^2\setminus\{0\}) \ \big|\ H_\alpha \mathfrak{f} \in \mathcal{H} \big\}. \]
Let $\Gamma_1, \Gamma_2$ be two linear maps from $D(H_\alpha^*)$ to $\mathbb{C}^2$. The triple $(\mathbb{C}^2, \Gamma_1, \Gamma_2)$ is called a boundary triple for $H_\alpha$ if the following two conditions are satisfied:
(1) $\langle \mathfrak{f}, H_\alpha^* \mathfrak{g}\rangle - \langle H_\alpha^* \mathfrak{f}, \mathfrak{g}\rangle = \langle \Gamma_1 \mathfrak{f}, \Gamma_2 \mathfrak{g}\rangle - \langle \Gamma_2 \mathfrak{f}, \Gamma_1 \mathfrak{g}\rangle$ for any $\mathfrak{f}, \mathfrak{g} \in D(H_\alpha^*)$,
(2) the map $(\Gamma_1, \Gamma_2) : D(H_\alpha^*) \to \mathbb{C}^2 \oplus \mathbb{C}^2$ is surjective.
It is proved in the references mentioned above that such a boundary triple exists, and that all self-adjoint extensions of $H_\alpha$ can be described in this framework. More precisely, let $C, D \in M_2(\mathbb{C})$ be $2 \times 2$ matrices, and let us denote by $H_\alpha^{CD}$ the restriction of $H_\alpha^*$ to the domain
\[ D(H_\alpha^{CD}) := \{ \mathfrak{f} \in D(H_\alpha^*) \mid C\Gamma_1 \mathfrak{f} = D\Gamma_2 \mathfrak{f} \}. \]
Then, the operator $H_\alpha^{CD}$ is self-adjoint if and only if the matrices C and D satisfy the following conditions:
\[ \text{(i) } CD^* \text{ is self-adjoint}, \qquad \text{(ii) } \det(CC^* + DD^*) \neq 0. \tag{7} \]
Moreover, any self-adjoint extension of $H_\alpha$ in $\mathcal{H}$ is equal to one of the operators $H_\alpha^{CD}$.

We shall now construct explicitly a boundary triple for the operator $H_\alpha$. For that purpose, let us consider $z \in \mathbb{C}\setminus\mathbb{R}_+$ and choose $k = \sqrt{z}$ with $\Im(k) > 0$. It is easily proved that the following two functions $\mathfrak{f}_{z,0}$ and $\mathfrak{f}_{z,-1}$ define an orthonormal basis in $\ker(H_\alpha^* - z)$, namely in polar coordinates:
\[ \mathfrak{f}_{z,0}(r, \theta) = N_{z,0}\, H^{(1)}_{\alpha}(kr)\,\phi_0(\theta), \qquad \mathfrak{f}_{z,-1}(r, \theta) = N_{z,-1}\, H^{(1)}_{1-\alpha}(kr)\,\phi_{-1}(\theta), \]
where $N_{z,m}$ is the normalization such that $\|\mathfrak{f}_{z,0}\| = \|\mathfrak{f}_{z,-1}\| = 1$. In particular, by making use of the equality
\[ \int_0^\infty r\,|H^{(1)}_\nu(kr)|^2\,dr = (\pi\cos(\pi\nu/2))^{-1} \]
valid for $k \in \{e^{i\pi/4}, e^{i3\pi/4}\}$, one has
\[ N_{\pm i,0} = (\pi\cos(\pi\alpha/2))^{1/2} \quad\text{and}\quad N_{\pm i,-1} = (\pi\cos(\pi(1-\alpha)/2))^{1/2} = (\pi\sin(\pi\alpha/2))^{1/2}. \]
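The integral identity used for the normalization can be traced back, as a sketch, to two standard facts that are quoted here from the usual tables of Bessel functions rather than from the paper: the relation $H^{(1)}_\nu(w) = \frac{2}{i\pi}e^{-i\pi\nu/2}K_\nu(-iw)$, and the product formula for Macdonald functions, valid for $|\Re\nu| < 1$ and $\Re(a+b) > 0$. The symbols $a, b$ below are introduced only for this computation.

```latex
% With a := -ik and b := \bar a one has \{a, b\} = \{e^{-i\pi/4}, e^{i\pi/4}\}
% for both k = e^{i\pi/4} and k = e^{i3\pi/4}; in particular ab = 1.
\begin{align*}
\int_0^\infty r\,|H^{(1)}_\nu(kr)|^2\,dr
  &= \frac{4}{\pi^2}\int_0^\infty r\,K_\nu(ar)\,K_\nu(br)\,dr \\
  &= \frac{4}{\pi^2}\cdot
     \frac{\pi\,(ab)^{-\nu}\,\bigl(a^{2\nu}-b^{2\nu}\bigr)}{2\sin(\pi\nu)\,(a^2-b^2)} \\
  &= \frac{4}{\pi^2}\cdot
     \frac{\pi\,\bigl(-2i\sin(\pi\nu/2)\bigr)}{2\sin(\pi\nu)\,(-2i)}
     \qquad (a = e^{-i\pi/4},\ b = e^{i\pi/4}) \\
  &= \frac{4}{\pi^2}\cdot\frac{\pi}{4\cos(\pi\nu/2)}
   = \frac{1}{\pi\cos(\pi\nu/2)} .
\end{align*}
```

The expression is symmetric in $a$ and $b$, so the choice of which of the two roots is called $a$ does not matter.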
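Let us also introduce the averaging operator P with respect to the polar angle acting on any $\mathfrak{f} \in \mathcal{H}$ and for almost every $r > 0$ by
\[ [P(\mathfrak{f})](r) = \int_0^{2\pi} \mathfrak{f}(r, \theta)\,d\theta. \]
Following [13, Sec. 3], we can then define the following four linear functionals on suitable $\mathfrak{f}$:
\begin{align*}
\Phi_0(\mathfrak{f}) &= \lim_{r\to 0+} r^{\alpha}\,[P(\mathfrak{f}\bar\phi_0)](r), &
\Phi_{-1}(\mathfrak{f}) &= \lim_{r\to 0+} r^{1-\alpha}\,[P(\mathfrak{f}\bar\phi_{-1})](r), \\
\Psi_0(\mathfrak{f}) &= \lim_{r\to 0+} r^{-\alpha}\bigl([P(\mathfrak{f}\bar\phi_0)](r) - r^{-\alpha}\Phi_0(\mathfrak{f})\bigr), &
\Psi_{-1}(\mathfrak{f}) &= \lim_{r\to 0+} r^{\alpha-1}\bigl([P(\mathfrak{f}\bar\phi_{-1})](r) - r^{\alpha-1}\Phi_{-1}(\mathfrak{f})\bigr).
\end{align*}
For example, by taking the asymptotic behavior (3) into account one obtains
\begin{align*}
\Phi_0(\mathfrak{f}_{z,0}) &= N_{z,0}\,a_\alpha(z), & \Phi_{-1}(\mathfrak{f}_{z,0}) &= 0, \\
\Psi_0(\mathfrak{f}_{z,0}) &= N_{z,0}\,b_\alpha(z), & \Psi_{-1}(\mathfrak{f}_{z,0}) &= 0, \\
\Phi_{-1}(\mathfrak{f}_{z,-1}) &= N_{z,-1}\,a_{1-\alpha}(z), & \Phi_0(\mathfrak{f}_{z,-1}) &= 0, \\
\Psi_{-1}(\mathfrak{f}_{z,-1}) &= N_{z,-1}\,b_{1-\alpha}(z), & \Psi_0(\mathfrak{f}_{z,-1}) &= 0,
\end{align*}
(8)
with
\[ a_\nu(z) = -\frac{2^\nu i}{\sin(\pi\nu)\Gamma(1-\nu)}\,k^{-\nu}, \qquad b_\nu(z) = \frac{2^{-\nu} i e^{-i\pi\nu}}{\sin(\pi\nu)\Gamma(1+\nu)}\,k^{\nu}. \tag{9} \]
The main result of this section is:

Proposition 4. The triple $(\mathbb{C}^2, \Gamma_1, \Gamma_2)$, with $\Gamma_1, \Gamma_2$ defined on $\mathfrak{f} \in D(H_\alpha^*)$ by
\[ \Gamma_1 \mathfrak{f} := \begin{pmatrix} \Phi_0(\mathfrak{f}) \\ \Phi_{-1}(\mathfrak{f}) \end{pmatrix}, \qquad \Gamma_2 \mathfrak{f} := 2\begin{pmatrix} \alpha\Psi_0(\mathfrak{f}) \\ (1-\alpha)\Psi_{-1}(\mathfrak{f}) \end{pmatrix}, \]
is a boundary triple for $H_\alpha$.

Proof. We use the schema from [11, Lemma 5]. For any $\mathfrak{f}, \mathfrak{g} \in D(H_\alpha^*)$ let us define the sesquilinear forms
\[ B_1(\mathfrak{f}, \mathfrak{g}) := \langle \mathfrak{f}, H_\alpha^* \mathfrak{g}\rangle - \langle H_\alpha^* \mathfrak{f}, \mathfrak{g}\rangle \quad\text{and}\quad B_2(\mathfrak{f}, \mathfrak{g}) := \langle \Gamma_1 \mathfrak{f}, \Gamma_2 \mathfrak{g}\rangle - \langle \Gamma_2 \mathfrak{f}, \Gamma_1 \mathfrak{g}\rangle. \]
We are going to show that these expressions are well defined and that $B_1 = B_2$.

(i) Clearly, $B_1$ is well defined. For $B_2$, let us first recall that $D(H_\alpha^*) = D(H_\alpha) + \ker(H_\alpha^* - i) + \ker(H_\alpha^* + i)$. It has already been proved above that the four maps $\Phi_0$, $\Phi_{-1}$, $\Psi_0$ and $\Psi_{-1}$ are well defined on the elements of $\ker(H_\alpha^* - i)$ and $\ker(H_\alpha^* + i)$. We shall now prove that $\Gamma_1 \mathfrak{f} = \Gamma_2 \mathfrak{f} = 0$ for $\mathfrak{f} \in D(H_\alpha)$, which shows that $B_2$ is also well defined on $D(H_\alpha^*)$. In view of the decomposition (2)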
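it is sufficient to consider functions $\mathfrak{f}$ of the form $\mathfrak{f}(r, \theta) = f_m(r)\phi_m(\theta)$ for any $m \in \mathbb{Z}$ and with $f_m \in D(H_{\alpha,m})$. Obviously, for such a function $\mathfrak{f}$ with $m \notin \{0, -1\}$ one has $[P(\mathfrak{f}\bar\phi_0)](r) = [P(\mathfrak{f}\bar\phi_{-1})](r) = 0$ for almost every r, and thus $\Gamma_1 \mathfrak{f} = \Gamma_2 \mathfrak{f} = 0$. For $m \in \{0, -1\}$ the equalities $\Gamma_1 \mathfrak{f} = \Gamma_2 \mathfrak{f} = 0$ follow directly from Proposition 3.

(ii) Now, since $\Gamma_1 \mathfrak{f} = \Gamma_2 \mathfrak{g} = 0$ for all $\mathfrak{f}, \mathfrak{g} \in D(H_\alpha)$, the only non-trivial contributions to the sesquilinear form $B_2$ come from $\mathfrak{f}, \mathfrak{g} \in \ker(H_\alpha^* - i) + \ker(H_\alpha^* + i)$. On the other hand one also has $B_1(\mathfrak{f}, \mathfrak{g}) = 0$ for $\mathfrak{f}, \mathfrak{g} \in D(H_\alpha)$. Thus, we are reduced to proving the equalities $B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',n}) = B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',n})$ for any $z, z' \in \{-i, i\}$ and $m, n \in \{0, -1\}$.

Observe first that for $z \neq z'$ and arbitrary m, n one has
\[ B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',n}) = \langle \mathfrak{f}_{z,m}, z' \mathfrak{f}_{z',n}\rangle - \langle z \mathfrak{f}_{z,m}, \mathfrak{f}_{z',n}\rangle = 0 \]
since $z' = \bar z$. Now, for $m \neq n$ one has $\Gamma_1 \mathfrak{f}_{z,m} \perp \Gamma_2 \mathfrak{f}_{z',n}$, and hence $B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',n}) = 0 = B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',n})$. For $m = n$ one easily calculates with $\nu := |m + \alpha|$ that
\[ B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',m}) = 2\nu N_{z,m} N_{z',m}\bigl(\overline{a_\nu(z)}\,b_\nu(z') - \overline{b_\nu(z)}\,a_\nu(z')\bigr) = 0, \]
and then $B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',m}) = 0 = B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z',m})$.

We now consider $z = z'$ and $m \neq n$. One has
\[ B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,n}) = \langle \mathfrak{f}_{z,m}, z \mathfrak{f}_{z,n}\rangle - \langle z \mathfrak{f}_{z,m}, \mathfrak{f}_{z,n}\rangle = 2z\langle \mathfrak{f}_{z,m}, \mathfrak{f}_{z,n}\rangle = 0 \]
and again $\Gamma_1 \mathfrak{f}_{z,m} \perp \Gamma_2 \mathfrak{f}_{z,n}$. It then follows that $B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,n}) = 0 = B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,n})$.

So it only remains to show that $B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m}) = B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m})$. For that purpose, observe first that $B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m}) = 2z\langle \mathfrak{f}_{z,m}, \mathfrak{f}_{z,m}\rangle = 2z$. On the other hand, one has
\[ B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m}) = 2i\,\Im\langle \Gamma_1 \mathfrak{f}_{z,m}, \Gamma_2 \mathfrak{f}_{z,m}\rangle = 2i\,\Im\bigl(2\nu |N_{z,m}|^2\,\overline{a_\nu(z)}\,b_\nu(z)\bigr) \]
with $\nu = |m + \alpha|$. By inserting (9) into this expression, one obtains (with $k = \sqrt{z}$ and $\Im(k) > 0$)
\[ B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m}) = 4i\nu |N_{z,m}|^2\,\Im\!\left(\frac{-(k^\nu)^2 e^{-i\pi\nu}}{\sin^2(\pi\nu)\Gamma(1-\nu)\Gamma(1+\nu)}\right) = 4z\nu |N_{z,m}|^2\,\frac{\sin(\pi\nu/2)}{\sin^2(\pi\nu)\Gamma(1-\nu)\Gamma(1+\nu)}. \]
Finally, by taking the equality
\[ \Gamma(1-\nu)\Gamma(1+\nu) = \frac{\pi\nu}{\sin(\pi\nu)} \]
into account, one obtains
\[ B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m}) = 4z|N_{z,m}|^2\,\frac{\sin(\pi\nu/2)}{\sin(\pi\nu)\,\pi} = 4z\pi\cos(\pi\nu/2)\,\frac{\sin(\pi\nu/2)}{\sin(\pi\nu)\,\pi} = 2z, \]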
which implies $B_2(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m}) = 2z = B_1(\mathfrak{f}_{z,m}, \mathfrak{f}_{z,m})$.

(iii) The surjectivity of the map $(\Gamma_1, \Gamma_2) : D(H_\alpha^*) \to \mathbb{C}^2 \oplus \mathbb{C}^2$ follows from the equalities (8).

Let us now construct the Weyl function corresponding to the above boundary triple. The presentation is again adapted to our setting, and we refer to [12] for general definitions.

As already mentioned, all self-adjoint extensions of $H_\alpha$ can be characterized by the $2 \times 2$ matrices C and D satisfying two simple conditions, and these extensions are denoted by $H_\alpha^{CD}$. In the special case $(C, D) = (1, 0)$, the operator $H_\alpha^{10}$ is equal to the original Aharonov–Bohm operator $H_\alpha^{AB}$. Recall that this operator corresponds to the Friedrichs extension of $H_\alpha$ and that its spectrum is equal to $\mathbb{R}_+$. This operator is going to play a special role in the sequel.

Let us consider $\xi = (\xi_0, \xi_{-1}) \in \mathbb{C}^2$ and $z \in \mathbb{C}\setminus\mathbb{R}_+$. It is proved in [12] that there exists a unique $\mathfrak{f} \in \ker(H_\alpha^* - z)$ with $\Gamma_1 \mathfrak{f} = \xi$. This solution is explicitly given by the formula $\mathfrak{f} := \gamma(z)\xi$ with
\[ \gamma(z)\xi = \frac{\xi_0}{N_{z,0}\,a_\alpha(z)}\,\mathfrak{f}_{z,0} + \frac{\xi_{-1}}{N_{z,-1}\,a_{1-\alpha}(z)}\,\mathfrak{f}_{z,-1}. \]
The Weyl function $M(z)$ is then defined by the relation $M(z) := \Gamma_2\gamma(z)$. In view of the previous calculations one has
\[ M(z) = 2\begin{pmatrix} \alpha\,b_\alpha(z)/a_\alpha(z) & 0 \\ 0 & (1-\alpha)\,b_{1-\alpha}(z)/a_{1-\alpha}(z) \end{pmatrix}
 = -\frac{2}{\pi}\sin(\pi\alpha)\begin{pmatrix} \dfrac{\Gamma(1-\alpha)^2 e^{-i\pi\alpha}}{4^{\alpha}}\,(k^{\alpha})^2 & 0 \\ 0 & \dfrac{\Gamma(\alpha)^2 e^{-i\pi(1-\alpha)}}{4^{1-\alpha}}\,(k^{1-\alpha})^2 \end{pmatrix}. \]
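The second equality for $M(z)$ uses $\Gamma(1+\nu) = \nu\Gamma(\nu)$ and $\Gamma(\nu)\Gamma(1-\nu) = \pi/\sin(\pi\nu)$. As a sanity check, the following Python sketch evaluates both expressions from the coefficients (9) and compares them. It is an illustration under the branch conventions stated earlier ($\Im\sqrt{z} > 0$, principal branch for powers); the function names are hypothetical and do not come from the paper.

```python
import cmath
import math


def sqrt_upper(z):
    """Square root of z in C \\ R_+ chosen with positive imaginary part."""
    k = cmath.sqrt(z)
    return k if k.imag > 0 else -k


def a_coef(nu, z):
    """a_nu(z) from (9); Python's complex power uses the principal branch."""
    k = sqrt_upper(z)
    return -(2 ** nu) * 1j / (math.sin(math.pi * nu) * math.gamma(1 - nu)) * k ** (-nu)


def b_coef(nu, z):
    """b_nu(z) from (9)."""
    k = sqrt_upper(z)
    return (2 ** (-nu) * 1j * cmath.exp(-1j * math.pi * nu)
            / (math.sin(math.pi * nu) * math.gamma(1 + nu)) * k ** nu)


def weyl_from_ratio(alpha, z):
    """Diagonal entries 2*nu*b_nu/a_nu for nu = alpha and nu = 1 - alpha."""
    return (2 * alpha * b_coef(alpha, z) / a_coef(alpha, z),
            2 * (1 - alpha) * b_coef(1 - alpha, z) / a_coef(1 - alpha, z))


def weyl_closed_form(alpha, z):
    """Diagonal entries of the closed form of M(z)."""
    k = sqrt_upper(z)
    pref = -2 / math.pi * math.sin(math.pi * alpha)
    m0 = (pref * math.gamma(1 - alpha) ** 2 * cmath.exp(-1j * math.pi * alpha)
          * (k ** alpha) ** 2 / 4 ** alpha)
    m1 = (pref * math.gamma(alpha) ** 2 * cmath.exp(-1j * math.pi * (1 - alpha))
          * (k ** (1 - alpha)) ** 2 / 4 ** (1 - alpha))
    return (m0, m1)


if __name__ == "__main__":
    print(weyl_from_ratio(0.3, 1j))
    print(weyl_closed_form(0.3, 1j))
```

Evaluating `weyl_closed_form` at points $z$ close to 0 also illustrates that both entries tend to 0, since they scale like $|z|^{\alpha}$ and $|z|^{1-\alpha}$.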
In particular, one observes that for $z \in \mathbb{C}\setminus\mathbb{R}_+$ one has $M(0) := \lim_{z\to 0} M(z) = 0$.

In terms of the Weyl function and of the γ-field $\gamma$ the Krein resolvent formula has the simple form:
\begin{align*}
(H_\alpha^{CD} - z)^{-1} - (H_\alpha^{AB} - z)^{-1} &= -\gamma(z)\,(DM(z) - C)^{-1} D\,\gamma(\bar z)^* \\
&= -\gamma(z)\,D^*(M(z)D^* - C^*)^{-1}\gamma(\bar z)^*
\end{align*}
(10)
for $z \in \rho(H_\alpha^{AB}) \cap \rho(H_\alpha^{CD})$. The following result is also derived within this formalism, see [5] for (i), [14, Theorem 5] and the matrix reformulation [16, Theorem 3] for (ii). In the statement, the equality $M(0) = 0$ has already been taken into account.

Lemma 5. (i) The value $z \in \mathbb{R}_-$ is an eigenvalue of $H_\alpha^{CD}$ if and only if $\det(DM(z) - C) = 0$, and in that case one has $\ker(H_\alpha^{CD} - z) = \gamma(z)\ker(DM(z) - C)$.
(ii) The number of negative eigenvalues of $H_\alpha^{CD}$ coincides with the number of negative eigenvalues of the matrix $CD^*$.

We stress that the number of eigenvalues does not depend on $\alpha \in (0, 1)$, but only on the choice of C and D.

Let us now add some comments about the role of the parameters C and D and discuss some of their properties. Two pairs of matrices $(C, D)$ and $(C', D')$ satisfying (7) define the same boundary condition (i.e. the same self-adjoint extension) if and only if there exists some invertible matrix $L \in M_2(\mathbb{C})$ such that $C' = LC$ and $D' = LD$ ([23, Proposition 3]). In particular, if $(C, D)$ satisfies (7) and if $\det(D) \neq 0$, then the pair $(D^{-1}C, 1)$ defines the same boundary condition (and $D^{-1}C$ is self-adjoint). Hence there is an arbitrariness in the choice of these parameters. This can be avoided in several ways.

First, one can establish a bijection between all boundary conditions and the set $U(2)$ of the unitary $2 \times 2$ matrices U by setting
\[ C = C(U) := \frac{i}{2}(1 - U) \quad\text{and}\quad D = D(U) = \frac{1}{2}(1 + U), \tag{11} \]
see a detailed discussion in [17]. We shall comment more on this in the last section.

Another possibility is as follows (cf. [24] for details): There is a bijection between the set of all boundary conditions and the set of triples $(\mathcal{L}, I, L)$, where $\mathcal{L} \in \{\{0\}, \mathbb{C}, \mathbb{C}^2\}$, $I : \mathcal{L} \to \mathbb{C}^2$ is an identification map (identification of $\mathcal{L}$ as a linear subspace of $\mathbb{C}^2$) and L is a self-adjoint operator in $\mathcal{L}$. For example, given such a triple $(\mathcal{L}, I, L)$ the corresponding boundary condition is obtained by setting
\[ C = C(\mathcal{L}, I, L) := L \oplus 1 \quad\text{and}\quad D = D(\mathcal{L}, I, L) := 1 \oplus 0 \]
with respect to the decomposition $\mathbb{C}^2 = [I\mathcal{L}] \oplus [I\mathcal{L}]^\perp$. On the other hand, for a pair $(C, D)$ satisfying (7), one can set $\mathcal{L} := \mathbb{C}^d$ with $d := 2 - \dim[\ker(D)]$, $I : \mathcal{L} \to \mathbb{C}^2$ is the identification map of $\mathcal{L}$ with $\ker(D)^\perp$ and $L := (DI)^{-1}CI$.

In this framework, one can check by a direct calculation that for any $K \in M_2(\mathbb{C})$ such that $DK - C$ is invertible, one has
\[ (DK - C)^{-1}D = I(PKI - L)^{-1}P, \tag{12} \]
where $P : \mathbb{C}^2 \to \mathcal{L}$ is the adjoint of I, i.e. the composition of the orthogonal projection onto $I\mathcal{L}$ together with the identification of $I\mathcal{L}$ with $\mathcal{L}$.
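The two conditions in (7) are easy to test for concrete matrices. The following Python sketch is an illustration, not part of the paper: it checks (7) for a given pair $(C, D)$ and builds the pair associated with a unitary matrix U as in (11). The helper names and the tolerance `tol` are assumptions of this sketch, and matrices are represented as nested lists.

```python
import cmath
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def mat_add(a, b, sign=1):
    """a + sign * b for 2x2 matrices."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def satisfies_conditions_7(c, d, tol=1e-12):
    """True if C D^* is self-adjoint and det(C C^* + D D^*) is non-zero."""
    cd_star = mat_mul(c, dagger(d))
    diff = mat_add(cd_star, dagger(cd_star), sign=-1)
    self_adjoint = max(abs(diff[i][j]) for i in range(2) for j in range(2)) < tol
    gram = mat_add(mat_mul(c, dagger(c)), mat_mul(d, dagger(d)))
    return self_adjoint and abs(det(gram)) > tol


def pair_from_unitary(u):
    """(C(U), D(U)) = ((i/2)(1 - U), (1/2)(1 + U)), following (11)."""
    one = [[1, 0], [0, 1]]
    c = [[0.5j * x for x in row] for row in mat_add(one, u, sign=-1)]
    d = [[0.5 * x for x in row] for row in mat_add(one, u)]
    return c, d


if __name__ == "__main__":
    phase, t = cmath.exp(1.1j), 0.7
    u = [[phase * math.cos(t), -phase * math.sin(t)],
         [phase * math.sin(t), phase * math.cos(t)]]
    print(satisfies_conditions_7(*pair_from_unitary(u)))
```

For a unitary U one has $CC^* + DD^* = 1$ and $CD^* = \frac{i}{4}(U^* - U)$, so both conditions hold up to rounding; a pair such as $C = D = \mathrm{diag}(1, 0)$ fails condition (ii).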
Let us finally note that the conditions (7) imply some specific properties related to commutativity and adjointness. We shall need in particular:

Lemma 6. Let $(C, D)$ satisfy (7) and $K \in M_2(\mathbb{C})$ with $\Im K > 0$. Then
(i) The matrices $DK - C$ and $DK^* - C$ are invertible,
(ii) The equality $[(DK - C)^{-1}D]^* = (DK^* - C)^{-1}D$ holds.

Proof. (i) By contraposition, let us assume that $\det(DK - C) = 0$. Passing to the adjoint, one also has $\det(K^*D^* - C^*) = 0$, i.e. there exists a non-zero $f \in \mathbb{C}^2$ such that $K^*D^*f = C^*f$. By taking the scalar product with $D^*f$, one obtains that $\langle D^*f, KD^*f\rangle = \langle f, CD^*f\rangle$. The right-hand side is real due to (i) in (7). But since $\Im K > 0$, the equality is possible if and only if $D^*f = 0$. It then follows that $C^*f = K^*D^*f = 0$, which contradicts (ii) in (7). The invertibility of $DK^* - C$ can be proved similarly.

(ii) If $\det(D) \neq 0$, then the matrix $A := D^{-1}C$ is self-adjoint and it follows that
\[ [(DK - C)^{-1}D]^* = [(K - A)^{-1}]^* = (K^* - A)^{-1} = (DK^* - C)^{-1}D. \]
If $D = 0$, then the equality is trivially satisfied. Finally, if $\det(D) = 0$ but $D \neq 0$ one has $\mathcal{L} := \mathbb{C}$. Furthermore, let us define $I : \mathbb{C} \to \mathbb{C}^2$ by $I\mathcal{L} := \ker(D)^\perp$ and let $P : \mathbb{C}^2 \to \mathbb{C}$ be its adjoint map. Then, by the above construction there exists $\ell \in \mathbb{R}$ such that $(DK - C)^{-1}D = I(PKI - \ell)^{-1}P$. It is also easily observed that $PKI$ is just the multiplication by some $k \in \mathbb{C}$ with $\Im k > 0$, and hence $(DK - C)^{-1}D = I(k - \ell)^{-1}P$. Similarly one has $(DK^* - C)^{-1}D = I(\bar k - \ell)^{-1}P$. Taking the adjoint of the first expression leads directly to the expected equality.

4. Fourier Transform and the Dilation Group

Before starting with the scattering theory, we recall some properties of the Fourier transform and of the dilation group in relation with the decomposition (1). Let $\mathcal{F}$ be the usual Fourier transform, explicitly given on any $\mathfrak{f} \in \mathcal{H}$ and $y \in \mathbb{R}^2$ by
\[ [\mathcal{F}\mathfrak{f}](y) = \frac{1}{2\pi}\ \mathrm{l.i.m.}\int_{\mathbb{R}^2} \mathfrak{f}(x)\,e^{-ix\cdot y}\,dx, \]
where l.i.m. denotes the convergence in the mean. Its inverse is denoted by $\mathcal{F}^*$. Since the Fourier transform maps the subspace $\mathcal{H}_m$ of $\mathcal{H}$ onto itself, we naturally set $\mathcal{F}_m : \mathcal{H}_r \to \mathcal{H}_r$ by the relation $\mathcal{F}(f\phi_m) = \mathcal{F}_m(f)\phi_m$ for any $f \in \mathcal{H}_r$. More explicitly, the application $\mathcal{F}_m$ is the unitary map from $\mathcal{H}_r$ to $\mathcal{H}_r$ given on any $f \in \mathcal{H}_r$ and almost every $\kappa \in \mathbb{R}_+$ by
\[ \hat f(\kappa) := [\mathcal{F}_m f](\kappa) = (-i)^{|m|}\ \mathrm{l.i.m.}\int_{\mathbb{R}_+} r J_{|m|}(r\kappa) f(r)\,dr, \]
where $J_{|m|}$ denotes the Bessel function of the first kind and of order $|m|$. The inverse Fourier transform $\mathcal{F}_m^*$ is given by the same formula, with $(-i)^{|m|}$ replaced by $i^{|m|}$.
Now, let us recall that the unitary dilation group $\{U_\tau\}_{\tau\in\mathbb{R}}$ is defined on any $\mathfrak{f} \in \mathcal{H}$ and $x \in \mathbb{R}^2$ by
\[ [U_\tau \mathfrak{f}](x) = e^{\tau}\,\mathfrak{f}(e^{\tau}x). \]
Its self-adjoint generator A is formally given by $\frac{1}{2}(X\cdot(-i\nabla) + (-i\nabla)\cdot X)$, where X is the position operator and $-i\nabla$ is its conjugate operator. All these operators are essentially self-adjoint on the Schwartz space on $\mathbb{R}^2$.

An important property of the operator A is that it leaves each subspace $\mathcal{H}_m$ invariant. For simplicity, we shall keep the same notation for the restriction of A to each subspace $\mathcal{H}_m$. So, for any $m \in \mathbb{Z}$, let $\varphi_m$ be an essentially bounded function on $\mathbb{R}$. Assume furthermore that the family $\{\varphi_m\}_{m\in\mathbb{Z}}$ is bounded. Then the operator $\varphi(A) : \mathcal{H} \to \mathcal{H}$ defined on $\mathcal{H}_m$ by $\varphi_m(A)$ is a bounded operator in $\mathcal{H}$.

Let us finally recall a general formula about the Mellin transform.

Lemma 7. Let $\varphi$ be an essentially bounded function on $\mathbb{R}$ such that its inverse Fourier transform $\check\varphi$ is a distribution on $\mathbb{R}$. Then, for any $\mathfrak{f} \in C_c^\infty(\mathbb{R}^2\setminus\{0\})$ one has
\[ [\varphi(A)\mathfrak{f}](r, \theta) = (2\pi)^{-1/2}\int_0^\infty \check\varphi(-\ln(s/r))\,\mathfrak{f}(s, \theta)\,\frac{ds}{r}, \]
where the right-hand side has to be understood in the sense of distributions.

Proof. The proof is a simple application for n = 2 of the general formulae developed in [18, p. 439]. Let us however mention that the convention of this reference on the minus sign for the operator A in its spectral representation has not been adopted.

As already mentioned $\varphi(A)$ leaves $\mathcal{H}_m$ invariant. More precisely, if $\mathfrak{f} = f\phi_m$ for some $f \in C_c^\infty(\mathbb{R}_+)$, then $\varphi(A)\mathfrak{f} = [\varphi(A)f]\phi_m$ with
\[ [\varphi(A)f](r) = (2\pi)^{-1/2}\int_0^\infty \check\varphi(-\ln(s/r))\,f(s)\,\frac{ds}{r}, \tag{13} \]
where the right-hand side has again to be understood in the sense of distributions.

5. Scattering Theory

In this section, we briefly recall the main definitions of the scattering theory, and then give explicit formulae for the wave operators. The scattering operator will be studied in the following section.

Let $H_1, H_2$ be two self-adjoint operators in $\mathcal{H}$, and assume that the operator $H_1$ is purely absolutely continuous. Then the (time dependent) wave operators are defined by the strong limits
\[ \Omega_\pm(H_2, H_1) := \operatorname*{s-lim}_{t\to\pm\infty} e^{itH_2}e^{-itH_1} \]
whenever these limits exist. In this case, these operators are isometries, and they are said to be complete if their ranges are equal to the absolutely continuous subspace
$\mathcal{H}_{ac}(H_2)$ of $\mathcal{H}$ with respect to $H_2$. In such a situation, the (time dependent) scattering operator for the system $(H_2, H_1)$ is defined by the product
\[ S(H_2, H_1) := \Omega_+^*(H_2, H_1)\,\Omega_-(H_2, H_1) \]
and is a unitary operator in $\mathcal{H}$. Furthermore, it commutes with the operator $H_1$, and thus is unitarily equivalent to a family of unitary operators in the spectral representation of $H_1$.

We shall now prove that the wave operators exist for our model and that they are complete. For that purpose, let us denote by $H_0 := -\Delta$ the Laplace operator on $\mathbb{R}^2$.

Lemma 8. For any self-adjoint extension $H_\alpha^{CD}$, the wave operators $\Omega_\pm(H_\alpha^{CD}, H_0)$ exist and are complete.

Proof. On the one hand, the existence and the completeness of the operators $\Omega_\pm(H_\alpha^{AB}, H_0)$ have been proved in [26]. On the other hand, the existence and the completeness of the operators $\Omega_\pm(H_\alpha^{CD}, H_\alpha^{AB})$ are well known since the difference of the resolvents is a finite rank operator, see for example [19, Sec. X.4.4]. The statement of the lemma then follows by taking the chain rule [31, Theorem 2.1.7] and [31, Theorem 2.3.3] on completeness into account.

The derivation of the explicit formulae for the wave operators is based on the stationary approach, as presented in [31, Secs. 2.7 and 5.2]. For simplicity, we shall consider only $\Omega_-^{CD} := \Omega_-(H_\alpha^{CD}, H_0)$. For that purpose, let $\lambda \in \mathbb{R}_+$ and $\varepsilon > 0$. We first study the expression
\[ \frac{\varepsilon}{\pi}\bigl\langle (H_0 - \lambda + i\varepsilon)^{-1}\mathfrak{f},\ (H_\alpha^{CD} - \lambda + i\varepsilon)^{-1}\mathfrak{g}\bigr\rangle \]
and its limit as $\varepsilon \to 0+$ for suitable $\mathfrak{f}, \mathfrak{g} \in \mathcal{H}$ specified later on. By taking Krein resolvent formula into account, one can consider separately the two expressions:
\[ \frac{\varepsilon}{\pi}\bigl\langle (H_0 - \lambda + i\varepsilon)^{-1}\mathfrak{f},\ (H_\alpha^{AB} - \lambda + i\varepsilon)^{-1}\mathfrak{g}\bigr\rangle \]
and
\[ -\frac{\varepsilon}{\pi}\bigl\langle (H_0 - \lambda + i\varepsilon)^{-1}\mathfrak{f},\ \gamma(\lambda - i\varepsilon)(DM(\lambda - i\varepsilon) - C)^{-1}D\,\gamma(\lambda + i\varepsilon)^*\mathfrak{g}\bigr\rangle. \]
The first term will lead to the wave operator for the original Aharonov–Bohm system, as shown below. So let us now concentrate on the second expression. For simplicity, we set $z = \lambda + i\varepsilon$ and observe that
\begin{align*}
&-\frac{\varepsilon}{\pi}\bigl\langle (H_0 - \bar z)^{-1}\mathfrak{f},\ \gamma(\bar z)(DM(\bar z) - C)^{-1}D\,\gamma(z)^*\mathfrak{g}\bigr\rangle \\
&\qquad = -\frac{\varepsilon}{\pi}\bigl\langle \gamma(z)\,[(DM(\bar z) - C)^{-1}D]^*\,\gamma(\bar z)^*(H_0 - \bar z)^{-1}\mathfrak{f},\ \mathfrak{g}\bigr\rangle.
\end{align*}
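Then, for every $r \in \mathbb{R}_+$ and $\theta \in [0, 2\pi)$ one has
\begin{align*}
&-\frac{\varepsilon}{\pi}\bigl[\gamma(z)\,[(DM(\bar z) - C)^{-1}D]^*\,\gamma(\bar z)^*(H_0 - \bar z)^{-1}\mathfrak{f}\bigr](r, \theta) \\
&\qquad = -\frac{\varepsilon}{\pi}\begin{pmatrix} H^{(1)}_{\alpha}(\sqrt{z}\,r)\,\phi_0(\theta) \\ H^{(1)}_{1-\alpha}(\sqrt{z}\,r)\,\phi_{-1}(\theta) \end{pmatrix}^{T} \cdot A(z)\,[(DM(\bar z) - C)^{-1}D]^*\,A(\bar z)^* \begin{pmatrix} \xi_0(z, \mathfrak{f}) \\ \xi_{-1}(z, \mathfrak{f}) \end{pmatrix}
\end{align*}
with
\[ A(z) = \begin{pmatrix} a_\alpha(z)^{-1} & 0 \\ 0 & a_{1-\alpha}(z)^{-1} \end{pmatrix} \]
and
\begin{align*}
\begin{pmatrix} \xi_0(z, \mathfrak{f}) \\ \xi_{-1}(z, \mathfrak{f}) \end{pmatrix}
&= \begin{pmatrix} \langle H^{(1)}_{\alpha}(\sqrt{\bar z}\,\cdot)\phi_0,\ (H_0 - \bar z)^{-1}\mathfrak{f}\rangle \\ \langle H^{(1)}_{1-\alpha}(\sqrt{\bar z}\,\cdot)\phi_{-1},\ (H_0 - \bar z)^{-1}\mathfrak{f}\rangle \end{pmatrix}
 = \begin{pmatrix} \langle \mathcal{F}(H_0 - z)^{-1}H^{(1)}_{\alpha}(\sqrt{\bar z}\,\cdot)\phi_0,\ \hat{\mathfrak{f}}\rangle \\ \langle \mathcal{F}(H_0 - z)^{-1}H^{(1)}_{1-\alpha}(\sqrt{\bar z}\,\cdot)\phi_{-1},\ \hat{\mathfrak{f}}\rangle \end{pmatrix} \\
&= \begin{pmatrix} \langle \mathcal{F}_0(H_0 - z)^{-1}H^{(1)}_{\alpha}(\sqrt{\bar z}\,\cdot),\ \hat f_0\rangle_{\mathbb{R}_+} \\ \langle \mathcal{F}_{-1}(H_0 - z)^{-1}H^{(1)}_{1-\alpha}(\sqrt{\bar z}\,\cdot),\ \hat f_{-1}\rangle_{\mathbb{R}_+} \end{pmatrix}
 = \begin{pmatrix} \langle (X^2 - z)^{-1}\mathcal{F}_0 H^{(1)}_{\alpha}(\sqrt{\bar z}\,\cdot),\ \hat f_0\rangle_{\mathbb{R}_+} \\ \langle (X^2 - z)^{-1}\mathcal{F}_{-1} H^{(1)}_{1-\alpha}(\sqrt{\bar z}\,\cdot),\ \hat f_{-1}\rangle_{\mathbb{R}_+} \end{pmatrix},
\end{align*}
where $\langle\cdot,\cdot\rangle_{\mathbb{R}_+}$ denotes the scalar product in $L^2(\mathbb{R}_+, r\,dr)$.

We shall now calculate separately the limit as $\varepsilon \to 0$ of the different terms. We recall the convention that for $z \in \mathbb{C}\setminus\mathbb{R}_+$ one chooses $k = \sqrt{z}$ with $\Im(k) > 0$. For $\lambda \in \mathbb{R}_+$ one sets $\lim_{\varepsilon\to 0+}\sqrt{\lambda + i\varepsilon} =: \kappa$ with $\kappa \in \mathbb{R}_+$. We first observe that for $\nu \in (0, 1)$ one has
\[ a_\nu(\lambda_+) := \lim_{\varepsilon\to 0+} a_\nu(\lambda + i\varepsilon) = -\frac{2^\nu i}{\sin(\pi\nu)\Gamma(1-\nu)}\,\kappa^{-\nu} \]
but
\[ a_\nu(\lambda_-) := \lim_{\varepsilon\to 0+} a_\nu(\lambda - i\varepsilon) = -\frac{2^\nu i e^{-i\pi\nu}}{\sin(\pi\nu)\Gamma(1-\nu)}\,\kappa^{-\nu}. \]
Similarly, one observes that
\[ M(\lambda_\pm) := \lim_{\varepsilon\to 0+} M(\lambda \pm i\varepsilon) = -\frac{2}{\pi}\sin(\pi\alpha)\begin{pmatrix} \dfrac{\Gamma(1-\alpha)^2 e^{\mp i\pi\alpha}}{4^{\alpha}}\,\kappa^{2\alpha} & 0 \\ 0 & \dfrac{\Gamma(\alpha)^2 e^{\mp i\pi(1-\alpha)}}{4^{1-\alpha}}\,\kappa^{2(1-\alpha)} \end{pmatrix}. \]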
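Note that $M(\lambda_+) = M(\lambda_-)^*$. Finally, the most elaborate limit is calculated in the next lemma.

Lemma 9. For $m \in \mathbb{Z}$, $\nu \in (0, 1)$ and $f \in C_c^\infty(\mathbb{R}_+)$ one has
\[ \lim_{\varepsilon\to 0+} \varepsilon\,\bigl\langle (X^2 - z)^{-1}\mathcal{F}_m H^{(1)}_\nu(\sqrt{\bar z}\,\cdot),\ f\bigr\rangle_{\mathbb{R}_+} = i e^{i\pi\nu/2}(-1)^{|m|} f(\kappa). \]

Proof. Let us start by recalling that for $w \in \mathbb{C}$ satisfying $-\frac{\pi}{2} < \arg(w) \le \pi$ one has [1, Eq. (9.6.4)]:
\[ H^{(1)}_\nu(w) = \frac{2}{i\pi}\,e^{-i\pi\nu/2}K_\nu(-iw), \]
where $K_\nu$ is the modified Bessel function of the second kind and of order $\nu$. Then, for $r \in \mathbb{R}_+$ it follows that (by using [29, Sec. 13.45] for the last equality)
\begin{align*}
[\mathcal{F}_m H^{(1)}_\nu(\sqrt{\bar z}\,\cdot)](r)
&= (-i)^{|m|}\ \mathrm{l.i.m.}\int_{\mathbb{R}_+} \rho J_{|m|}(r\rho)\,H^{(1)}_\nu(\sqrt{\bar z}\,\rho)\,d\rho \\
&= (-i)^{|m|}\,\frac{2}{i\pi}\,e^{-i\pi\nu/2}\ \mathrm{l.i.m.}\int_{\mathbb{R}_+} \rho J_{|m|}(r\rho)\,K_\nu(-i\sqrt{\bar z}\,\rho)\,d\rho \\
&= (-i)^{|m|}\,\frac{2}{i\pi}\,e^{-i\pi\nu/2}\,\frac{1}{r^2}\ \mathrm{l.i.m.}\int_{\mathbb{R}_+} \rho J_{|m|}(\rho)\,K_\nu\Bigl(-i\frac{\sqrt{\bar z}}{r}\rho\Bigr)\,d\rho \\
&= c\,\frac{1}{r^2}\Bigl(-i\frac{\sqrt{\bar z}}{r}\Bigr)^{-2-|m|}\,{}_2F_1\Bigl(\frac{|m|+\nu}{2}+1,\ \frac{|m|-\nu}{2}+1;\ |m|+1;\ -\Bigl(-i\frac{\sqrt{\bar z}}{r}\Bigr)^{-2}\Bigr),
\end{align*}
where ${}_2F_1$ is the Gauss hypergeometric function [1, Chap. 15] and c is given by
\[ c := (-i)^{|m|}\,\frac{2}{i\pi}\,e^{-i\pi\nu/2}\,\frac{\Gamma\bigl(\frac{|m|+\nu}{2}+1\bigr)\Gamma\bigl(\frac{|m|-\nu}{2}+1\bigr)}{\Gamma(|m|+1)}. \]
Now, observe that $\bigl(-i\frac{\sqrt{\bar z}}{r}\bigr)^{-2-|m|} = -(-i)^{-|m|}\bigl(\frac{\sqrt{\bar z}}{r}\bigr)^{-2-|m|}$ and $-\bigl(-i\frac{\sqrt{\bar z}}{r}\bigr)^{-2} = \frac{r^2}{\bar z}$. Thus, one has obtained
\[ [\mathcal{F}_m H^{(1)}_\nu(\sqrt{\bar z}\,\cdot)](r) = d\,\frac{1}{r^2}\Bigl(\frac{\sqrt{\bar z}}{r}\Bigr)^{-2-|m|}\,{}_2F_1\Bigl(\frac{|m|+\nu}{2}+1,\ \frac{|m|-\nu}{2}+1;\ |m|+1;\ \frac{r^2}{\bar z}\Bigr) \]
with
\[ d = -\frac{2}{i\pi}\,e^{-i\pi\nu/2}\,\frac{\Gamma\bigl(\frac{|m|+\nu}{2}+1\bigr)\Gamma\bigl(\frac{|m|-\nu}{2}+1\bigr)}{\Gamma(|m|+1)}. \]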
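By taking into account [1, Equality (15.3.3)] one can isolate from the ${}_2F_1$-function a factor which is singular when the variable goes to 1:
\begin{align*}
{}_2F_1\Bigl(\frac{|m|+\nu}{2}+1,\ \frac{|m|-\nu}{2}+1;\ |m|+1;\ \frac{r^2}{\bar z}\Bigr)
&= \frac{1}{1 - r^2\bar z^{-1}}\,{}_2F_1\Bigl(\frac{|m|+\nu}{2},\ \frac{|m|-\nu}{2};\ |m|+1;\ \frac{r^2}{\bar z}\Bigr) \\
&= -\frac{\bar z}{r^2 - \bar z}\,{}_2F_1\Bigl(\frac{|m|+\nu}{2},\ \frac{|m|-\nu}{2};\ |m|+1;\ \frac{r^2}{\bar z}\Bigr).
\end{align*}
Altogether, one has thus obtained:
\[ \varepsilon\,[(X^2 - z)^{-1}\mathcal{F}_m H^{(1)}_\nu(\sqrt{\bar z}\,\cdot)](r) = -d\,\frac{\varepsilon}{(r^2 - \bar z)(r^2 - z)}\,\frac{\bar z}{r^2}\Bigl(\frac{\sqrt{\bar z}}{r}\Bigr)^{-2-|m|}\,{}_2F_1\Bigl(\frac{|m|+\nu}{2},\ \frac{|m|-\nu}{2};\ |m|+1;\ \frac{r^2}{\bar z}\Bigr). \]
Now, observe that
\[ \frac{\varepsilon}{(r^2 - \bar z)(r^2 - z)} = \frac{\varepsilon}{(r^2 - \lambda + i\varepsilon)(r^2 - \lambda - i\varepsilon)} = \frac{\varepsilon}{(r^2 - \lambda)^2 + \varepsilon^2} =: \pi\delta_\varepsilon(r^2 - \lambda), \]
which converges to $\pi\delta(r^2 - \lambda)$ in the sense of distributions on $\mathbb{R}$ as $\varepsilon$ goes to 0. Furthermore, the map
\[ \mathbb{R}_+ \ni r \mapsto {}_2F_1\Bigl(\frac{|m|+\nu}{2},\ \frac{|m|-\nu}{2};\ |m|+1;\ \frac{r^2}{\lambda - i\varepsilon}\Bigr) \in \mathbb{C} \]
is locally uniformly convergent as $\varepsilon \to 0$ to a continuous function which is equal for $r = \kappa = \sqrt{\lambda}$ to $\Gamma(|m|+1)\bigl[\Gamma\bigl(\frac{|m|+\nu}{2}+1\bigr)\Gamma\bigl(\frac{|m|-\nu}{2}+1\bigr)\bigr]^{-1}$. By considering trivial extensions on $\mathbb{R}$, it follows that
\begin{align*}
&\lim_{\varepsilon\to 0+} \varepsilon\,\bigl\langle (X^2 - z)^{-1}\mathcal{F}_m H^{(1)}_\nu(\sqrt{\bar z}\,\cdot),\ f\bigr\rangle_{\mathbb{R}_+} \\
&\qquad = -\bar d\,\pi \lim_{\varepsilon\to 0+}\int_{\mathbb{R}_+} r\,\delta_\varepsilon(r^2 - \lambda)\ \overline{\frac{\bar z}{r^2}\Bigl(\frac{\sqrt{\bar z}}{r}\Bigr)^{-2-|m|}\,{}_2F_1\Bigl(\frac{|m|+\nu}{2},\ \frac{|m|-\nu}{2};\ |m|+1;\ \frac{r^2}{\bar z}\Bigr)}\ f(r)\,dr \\
&\qquad = -\frac{\bar d\,\pi}{2\kappa}\,\kappa\,(-1)^{-|m|}\,\frac{\Gamma(|m|+1)}{\Gamma\bigl(\frac{|m|+\nu}{2}+1\bigr)\Gamma\bigl(\frac{|m|-\nu}{2}+1\bigr)}\,f(\kappa) \\
&\qquad = i e^{i\pi\nu/2}(-1)^{|m|} f(\kappa).
\end{align*}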
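The key analytic input of the proof above is that $\delta_\varepsilon(r^2 - \lambda)$ concentrates at $r = \kappa = \sqrt{\lambda}$, so that pairing $r\,\delta_\varepsilon(r^2 - \lambda)$ with a continuous function $g$ on $\mathbb{R}_+$ tends to $g(\kappa)/2$. The following Python sketch illustrates this numerically with a midpoint rule. The test function, the integration range and the number of steps are assumptions chosen for the illustration, not quantities from the paper.

```python
import math


def delta_eps(x, eps):
    """Nascent delta function: eps / (pi * (x^2 + eps^2))."""
    return eps / (math.pi * (x * x + eps * eps))


def pairing(g, lam, eps, upper=10.0, steps=200_000):
    """Midpoint rule for the integral over (0, upper) of
    r * delta_eps(r^2 - lam) * g(r) dr."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r * delta_eps(r * r - lam, eps) * g(r)
    return total * h


def bump(r):
    """Smooth test function centred at r = 2 (so g(kappa) = 1 for lam = 4)."""
    return math.exp(-(r - 2.0) ** 2)


if __name__ == "__main__":
    for eps in (0.5, 0.05, 0.001):
        print(eps, pairing(bump, 4.0, eps))
```

With $\lambda = 4$ the printed values should approach $g(2)/2 = 1/2$ as $\varepsilon$ decreases, reflecting $\delta(r^2 - \lambda) = \delta(r - \kappa)/(2\kappa)$ on $\mathbb{R}_+$.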
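By adding these different results and by taking Lemma 6 into account, one has thus proved:

Lemma 10. For any $\mathfrak{f}$ of the form $\sum_{m\in\mathbb{Z}} f_m\phi_m$ with $f_m = 0$ except for a finite number of m for which $\hat f_m \in C_c^\infty(\mathbb{R}_+)$ and for any $\mathfrak{g} \in C_c^\infty(\mathbb{R}^2\setminus\{0\})$, one has
\begin{align*}
&\lim_{\varepsilon\to 0+} -\frac{\varepsilon}{\pi}\bigl\langle (H_0 - \lambda + i\varepsilon)^{-1}\mathfrak{f},\ \gamma(\lambda - i\varepsilon)(DM(\lambda - i\varepsilon) - C)^{-1}D\,\gamma(\lambda + i\varepsilon)^*\mathfrak{g}\bigr\rangle \\
&\qquad = -\frac{1}{\pi}\Bigl\langle \begin{pmatrix} H^{(1)}_{\alpha}(\kappa\,\cdot)\phi_0 \\ H^{(1)}_{1-\alpha}(\kappa\,\cdot)\phi_{-1} \end{pmatrix}^{T} \cdot A(\lambda_+)(DM(\lambda_+) - C)^{-1}D\,A(\lambda_-)^* \begin{pmatrix} i e^{i\pi\alpha/2}\hat f_0(\kappa) \\ -i e^{i\pi(1-\alpha)/2}\hat f_{-1}(\kappa) \end{pmatrix},\ \mathfrak{g}\Bigr\rangle.
\end{align*}

Before stating the main result on $\Omega_-^{CD}$, let us first present the explicit form of the stationary wave operator $\widehat{\Omega}_-^{AB}$. Note that for this operator the equality between the time dependent approach and the stationary approach is known [2, 13, 26], and that a preliminary version of the following result has been given in [25]. So, let us observe that since the operator $H_\alpha^{AB}$ leaves each subspace $\mathcal{H}_m$ invariant [26], it gives rise to a sequence of channel operators $H_{\alpha,m}^{AB}$ acting on $\mathcal{H}_m$. The Laplacian $H_0$ admitting a similar decomposition, the stationary wave operators $\widehat{\Omega}_\pm^{AB}$ can be defined in each channel, i.e. separately for each $m \in \mathbb{Z}$. Let us immediately observe that the angular part does not play any role for defining such operators. Therefore, we shall omit it as long as it does not lead to any confusion, and consider the channel wave operators $\widehat{\Omega}_{\pm,m}^{AB}$ from $\mathcal{H}_r$ to $\mathcal{H}_r$.

The following notation will be useful: $\mathbb{T} := \{z \in \mathbb{C} \mid |z| = 1\}$ and
\[ \delta_m^\alpha = \frac{1}{2}\pi\bigl(|m| - |m + \alpha|\bigr) = \begin{cases} -\frac{1}{2}\pi\alpha & \text{if } m \ge 0, \\ \phantom{-}\frac{1}{2}\pi\alpha & \text{if } m < 0. \end{cases} \]

Proposition 11. For each $m \in \mathbb{Z}$, one has
\[ \Omega_{\pm,m}^{AB} = \widehat{\Omega}_{\pm,m}^{AB} = \varphi_m^\pm(A), \]
with $\varphi_m^\pm \in C([-\infty, +\infty], \mathbb{T})$ given explicitly by
\[ \varphi_m^\pm(x) := e^{\mp i\delta_m^\alpha}\,\frac{\Gamma\bigl(\frac{1}{2}(|m| + 1 + ix)\bigr)}{\Gamma\bigl(\frac{1}{2}(|m| + 1 - ix)\bigr)}\,\frac{\Gamma\bigl(\frac{1}{2}(|m + \alpha| + 1 - ix)\bigr)}{\Gamma\bigl(\frac{1}{2}(|m + \alpha| + 1 + ix)\bigr)} \tag{14} \]
and satisfying $\varphi_m^\pm(\pm\infty) = 1$ and $\varphi_m^\pm(\mp\infty) = e^{\mp 2i\delta_m^\alpha}$.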
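The two properties stated for $\varphi_m^\pm$ (modulus 1 and the values at $\pm\infty$) can be observed numerically. The following Python sketch is an illustration only: since the standard library has no complex Gamma function, it uses the Lanczos approximation with a commonly used coefficient set, which is an assumption of this sketch and not taken from the paper. The names `cgamma`, `delta` and `phi` are likewise hypothetical.

```python
import cmath
import math

# Lanczos approximation (g = 7, 9 coefficients), a standard published set.
_G = 7
_COEF = [0.99999999999980993, 676.5203681218851, -1259.1392167224028,
         771.32342877765313, -176.61502916214059, 12.507343278686905,
         -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7]


def cgamma(z):
    """Gamma function for complex z via the Lanczos approximation."""
    z = complex(z)
    if z.real < 0.5:
        # Reflection formula for the left half-plane.
        return math.pi / (cmath.sin(math.pi * z) * cgamma(1 - z))
    z -= 1
    series = _COEF[0]
    for i in range(1, len(_COEF)):
        series += _COEF[i] / (z + i)
    t = z + _G + 0.5
    return math.sqrt(2 * math.pi) * t ** (z + 0.5) * cmath.exp(-t) * series


def delta(m, alpha):
    """delta_m^alpha = (pi / 2) * (|m| - |m + alpha|)."""
    return 0.5 * math.pi * (abs(m) - abs(m + alpha))


def phi(m, alpha, x, sign):
    """phi_m^+ (sign = +1) or phi_m^- (sign = -1) as in (14)."""
    a1 = 0.5 * (abs(m) + 1)
    a2 = 0.5 * (abs(m + alpha) + 1)
    ratio = ((cgamma(a1 + 0.5j * x) * cgamma(a2 - 0.5j * x))
             / (cgamma(a1 - 0.5j * x) * cgamma(a2 + 0.5j * x)))
    return cmath.exp(-1j * sign * delta(m, alpha)) * ratio


if __name__ == "__main__":
    m, alpha = -1, 0.3
    print(abs(phi(m, alpha, 2.5, +1)))                     # modulus
    print(phi(m, alpha, 200.0, +1))                        # near 1
    print(phi(m, alpha, -200.0, +1),
          cmath.exp(-2j * delta(m, alpha)))                # near each other
```

The limits are approached only at rate $O(1/|x|)$, so the values at $x = \pm 200$ agree with the stated limits to a few decimal places rather than to machine precision.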
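Proof. As already mentioned, the first equality is proved in [26]. Furthermore it is also proved there that for any $f \in \mathcal{H}_r$ and $r \in \mathbb{R}_+$ one has
\[ [\Omega_{\pm,m}^{AB} f](r) = i^{|m|}\ \mathrm{l.i.m.}\int_{\mathbb{R}_+} \kappa\,J_{|m+\alpha|}(\kappa r)\,e^{\mp i\delta_m^\alpha}\,[\mathcal{F}_m f](\kappa)\,d\kappa. \]
In particular, if $f \in C_c^\infty(\mathbb{R}_+)$, this expression can be rewritten as
\begin{align*}
&\operatorname*{s-lim}_{N\to\infty} e^{\mp i\delta_m^\alpha}\int_0^N \kappa J_{|m+\alpha|}(\kappa r)\Bigl[\int_0^\infty s J_{|m|}(s\kappa) f(s)\,ds\Bigr]d\kappa \\
&\qquad = \operatorname*{s-lim}_{N\to\infty} e^{\mp i\delta_m^\alpha}\int_0^\infty s f(s)\Bigl[\int_0^N \kappa J_{|m|}(s\kappa) J_{|m+\alpha|}(\kappa r)\,d\kappa\Bigr]ds \\
&\qquad = \operatorname*{s-lim}_{N\to\infty} e^{\mp i\delta_m^\alpha}\int_0^\infty \frac{s}{r}\,f(s)\Bigl[\int_0^{Nr} \kappa J_{|m|}\bigl(\tfrac{s}{r}\kappa\bigr) J_{|m+\alpha|}(\kappa)\,d\kappa\Bigr]\frac{ds}{r} \\
&\qquad = e^{\mp i\delta_m^\alpha}\int_0^\infty \frac{s}{r}\Bigl[\int_0^\infty \kappa J_{|m|}\bigl(\tfrac{s}{r}\kappa\bigr) J_{|m+\alpha|}(\kappa)\,d\kappa\Bigr] f(s)\,\frac{ds}{r},
\end{align*}
(15)
where the last term has to be understood in the sense of distributions on $\mathbb{R}_+$. The distribution between square brackets has been computed in [20, Proposition 2] but we shall not use here its explicit form.

Now, by comparing (15) with (13), one observes that the channel wave operator $\Omega_{\pm,m}^{AB}$ is equal on a dense set in $\mathcal{H}_r$ to $\varphi_m^\pm(A)$ for a function $\varphi_m^\pm$ whose inverse Fourier transform is the distribution which satisfies for $y \in \mathbb{R}$:
\[ \check\varphi_m^\pm(y) = \sqrt{2\pi}\,e^{\mp i\delta_m^\alpha}\,e^{-y}\Bigl[\int_0^\infty \kappa J_{|m|}(e^{-y}\kappa) J_{|m+\alpha|}(\kappa)\,d\kappa\Bigr]. \]
The Fourier transform of this distribution can be computed. Explicitly one has (in the sense of distributions):
\begin{align*}
\varphi_m^\pm(x) &= e^{\mp i\delta_m^\alpha}\int_{\mathbb{R}} e^{-ixy}\Bigl[e^{-y}\int_{\mathbb{R}_+} \kappa J_{|m|}(e^{-y}\kappa) J_{|m+\alpha|}(\kappa)\,d\kappa\Bigr]dy \\
&= e^{\mp i\delta_m^\alpha}\Bigl[\int_{\mathbb{R}_+} \kappa^{(1-ix)-1} J_{|m+\alpha|}(\kappa)\,d\kappa\Bigr]\Bigl[\int_{\mathbb{R}_+} s^{(1+ix)-1} J_{|m|}(s)\,ds\Bigr],
\end{align*}
which is the product of two Mellin transforms. The explicit forms of these transforms are presented in [22, Eq. (10.1)] and a straightforward computation leads directly to the expression (14). The second equality of the statement then follows by a density argument.

The additional properties of $\varphi_m^\pm$ can easily be obtained by taking into account the equality $\Gamma(\bar z) = \overline{\Gamma(z)}$ valid for any $z \in \mathbb{C}$ as well as the asymptotic development of the function $\Gamma$ as presented in [1, Eq. (6.1.39)].

Since the wave operators $\Omega_\pm^{AB}$ admit a decomposition into channel wave operators, so does the scattering operator. The channel scattering operator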
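$S_m^{AB} := (\Omega_{+,m}^{AB})^*\,\Omega_{-,m}^{AB}$, acting from $\mathcal{H}_r$ to $\mathcal{H}_r$, is simply given by
\[ S_m^{AB} = \overline{\varphi_m^+}(A)\,\varphi_m^-(A) = e^{2i\delta_m^\alpha}. \]
Now, let us set $\mathcal{H}_{\mathrm{int}} := \mathcal{H}_0 \oplus \mathcal{H}_{-1}$ which is clearly isomorphic to $\mathcal{H}_r \otimes \mathbb{C}^2$, and consider the decomposition $\mathcal{H} = \mathcal{H}_{\mathrm{int}} \oplus \mathcal{H}_{\mathrm{int}}^\perp$. It follows from the considerations of Sec. 2 that for any pair $(C, D)$ the operator $\widehat{\Omega}_\pm^{CD}$ is reduced by this decomposition and that $\widehat{\Omega}_-^{CD}|_{\mathcal{H}_{\mathrm{int}}^\perp} = \Omega_-^{CD}|_{\mathcal{H}_{\mathrm{int}}^\perp} = \Omega_-^{AB}|_{\mathcal{H}_{\mathrm{int}}^\perp}$. Since the form of $\Omega_-^{AB}$ has been exposed above, we shall concentrate only on the restriction of $\widehat{\Omega}_-^{CD}$ to $\mathcal{H}_{\mathrm{int}}$. For that purpose, let us define a matrix valued function which is closely related to the scattering operator. For $\kappa \in \mathbb{R}_+$ we set
\begin{align*}
\widetilde{S}_\alpha^{CD}(\kappa) := {}& 2i\sin(\pi\alpha)\begin{pmatrix} \dfrac{\Gamma(1-\alpha)e^{-i\pi\alpha/2}}{2^{\alpha}}\,\kappa^{\alpha} & 0 \\ 0 & \dfrac{\Gamma(\alpha)e^{-i\pi(1-\alpha)/2}}{2^{1-\alpha}}\,\kappa^{(1-\alpha)} \end{pmatrix} \\
&\cdot\left( D\begin{pmatrix} \dfrac{\Gamma(1-\alpha)^2 e^{-i\pi\alpha}}{4^{\alpha}}\,\kappa^{2\alpha} & 0 \\ 0 & \dfrac{\Gamma(\alpha)^2 e^{-i\pi(1-\alpha)}}{4^{1-\alpha}}\,\kappa^{2(1-\alpha)} \end{pmatrix} + \frac{\pi}{2\sin(\pi\alpha)}\,C \right)^{-1} D \\
&\cdot\begin{pmatrix} \dfrac{\Gamma(1-\alpha)e^{-i\pi\alpha/2}}{2^{\alpha}}\,\kappa^{\alpha} & 0 \\ 0 & -\dfrac{\Gamma(\alpha)e^{-i\pi(1-\alpha)/2}}{2^{1-\alpha}}\,\kappa^{(1-\alpha)} \end{pmatrix}.
\end{align*}
(16)

Theorem 12. For any pair $(C, D)$ satisfying (7), the restriction of the wave operator $\Omega_-^{CD}$ to $\mathcal{H}_{\mathrm{int}}$ satisfies the equality
\[ \Omega_-^{CD}|_{\mathcal{H}_{\mathrm{int}}} = \widehat{\Omega}_-^{CD}|_{\mathcal{H}_{\mathrm{int}}} = \begin{pmatrix} \varphi_0^-(A) & 0 \\ 0 & \varphi_{-1}^-(A) \end{pmatrix} + \begin{pmatrix} \tilde\varphi_0(A) & 0 \\ 0 & \tilde\varphi_{-1}(A) \end{pmatrix}\widetilde{S}_\alpha^{CD}(\sqrt{H_0}), \tag{17} \]
where $\tilde\varphi_m \in C([-\infty, +\infty], \mathbb{C})$ for $m \in \{0, -1\}$. Explicitly, for every $x \in \mathbb{R}$, $\tilde\varphi_m(x)$ is given by
\[ \tilde\varphi_m(x) = \frac{1}{2\pi}\,e^{-i\pi|m|/2}\,e^{\pi x/2}\,\frac{\Gamma\bigl(\frac{1}{2}(|m| + 1 + ix)\bigr)}{\Gamma\bigl(\frac{1}{2}(|m| + 1 - ix)\bigr)}\,\Gamma\bigl(\tfrac{1}{2}(1 + |m + \alpha| - ix)\bigr)\,\Gamma\bigl(\tfrac{1}{2}(1 - |m + \alpha| - ix)\bigr) \]
and satisfies $\tilde\varphi_m(-\infty) = 0$ and $\tilde\varphi_m(+\infty) = 1$.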
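To make the definition (16) concrete, the following Python sketch evaluates the matrix for given α, C, D and κ, and then checks that adding $\mathrm{diag}(e^{-i\pi\alpha}, e^{i\pi\alpha})$ produces a unitary matrix, as one expects from the role this sum plays for the scattering operator in Sec. 6. This is an illustration under the reading of (16) given above; the function names and the nested-list matrix representation are assumptions of the sketch.

```python
import cmath
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_inv(a):
    """Inverse of an invertible 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def s_tilde(alpha, c, d, kappa):
    """The matrix defined in (16)."""
    s = math.sin(math.pi * alpha)
    g = (math.gamma(1 - alpha) * cmath.exp(-0.5j * math.pi * alpha)
         * kappa ** alpha / 2 ** alpha)
    h = (math.gamma(alpha) * cmath.exp(-0.5j * math.pi * (1 - alpha))
         * kappa ** (1 - alpha) / 2 ** (1 - alpha))
    left = [[g, 0], [0, h]]
    right = [[g, 0], [0, -h]]
    # g*g and h*h are exactly the diagonal entries of the middle factor.
    dm = mat_mul(d, [[g * g, 0], [0, h * h]])
    p = math.pi / (2 * s)
    middle = [[dm[i][j] + p * c[i][j] for j in range(2)] for i in range(2)]
    core = mat_mul(mat_inv(middle), d)
    out = mat_mul(mat_mul(left, core), right)
    return [[2j * s * out[i][j] for j in range(2)] for i in range(2)]


def s_full(alpha, c, d, kappa):
    """diag(e^{-i pi alpha}, e^{i pi alpha}) plus the matrix of (16)."""
    st = s_tilde(alpha, c, d, kappa)
    phase = cmath.exp(-1j * math.pi * alpha)
    return [[phase + st[0][0], st[0][1]],
            [st[1][0], phase.conjugate() + st[1][1]]]


def unitarity_defect(s):
    """Largest entry of |S S^* - 1|."""
    prod = mat_mul(s, dagger(s))
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    c = [[1, 2 - 1j], [2 + 1j, -0.5]]   # self-adjoint C, with D = 1
    d = [[1, 0], [0, 1]]
    for kappa in (0.1, 1.0, 7.0):
        print(kappa, unitarity_defect(s_full(0.3, c, d, kappa)))
```

For $(C, D) = (1, 0)$ the matrix (16) vanishes identically, which matches the fact that this choice corresponds to the original Aharonov–Bohm operator.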
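Proof. (a) The stationary representation $\Omega_-^{CD}$ is defined by the formula [31, Definition 2.7.2]:
$$\langle \Omega_-^{CD} f, g\rangle = \lim_{\varepsilon\to 0+} \frac{\varepsilon}{\pi} \int_0^\infty \big\langle (H_0 - \lambda + i\varepsilon)^{-1} f,\ (H_\alpha^{CD} - \lambda + i\varepsilon)^{-1} g \big\rangle\, d\lambda$$
for any $f$ of the form $\sum_{m\in\mathbb{Z}} f_m \phi_m$ with $f_m = 0$ except for a finite number of $m$ for which $\hat f_m \in C_c^\infty(\mathbb{R}_+)$, and $g \in C_c^\infty(\mathbb{R}^2\setminus\{0\})$. By taking Krein resolvent formula into account, we can first consider the expression
$$\lim_{\varepsilon\to 0+} \frac{\varepsilon}{\pi} \int_0^\infty \big\langle (H_0 - \lambda + i\varepsilon)^{-1} f,\ (H_\alpha^{AB} - \lambda + i\varepsilon)^{-1} g \big\rangle\, d\lambda$$
which converges to [2, 13, 26]:
$$\int_0^\infty \sum_{m\in\mathbb{Z}} i^{|m|} e^{i\delta_m^\alpha} \big\langle J_{|m+\alpha|}(\kappa\,\cdot)\, \hat f_m(\kappa)\, \phi_m,\ g \big\rangle\, \kappa\, d\kappa.$$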
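This expression was the starting point for the formulae derived in Proposition 11. This leads to the first term in the right-hand side of (17).

(b) The second term to analyze is
$$-\lim_{\varepsilon\to 0+} \frac{\varepsilon}{\pi} \int_0^\infty \big\langle (H_0 - \lambda + i\varepsilon)^{-1} f,\ \gamma(\lambda - i\varepsilon)\big(D M(\lambda - i\varepsilon) - C\big)^{-1} D\, \gamma(\lambda + i\varepsilon)^* g \big\rangle\, d\lambda. \tag{18}$$
By using Lemma 10 and by performing some simple calculations, one obtains that (18) is equal to
$$\int_0^\infty \left\langle \begin{pmatrix} \tfrac12\, i^{\alpha} H_\alpha^{(1)}(\kappa\,\cdot)\,\phi_0 \\ \tfrac12\, i^{1-\alpha} H_{1-\alpha}^{(1)}(\kappa\,\cdot)\,\phi_{-1} \end{pmatrix}^{T} \tilde S_\alpha^{CD}(\kappa) \begin{pmatrix} \hat f_0(\kappa) \\ \hat f_{-1}(\kappa) \end{pmatrix},\ g \right\rangle \kappa\, d\kappa.$$
Now, it will be proved below that the operator $T_m$ defined for $m \in \{0, -1\}$ on $\mathcal{F}^*[C_c^\infty(\mathbb{R}_+)]$ by
$$[T_m f](r) := \frac12\, i^{|m+\alpha|} \int_0^\infty H_{|m+\alpha|}^{(1)}(\kappa r)\, [\mathcal{F}_m f](\kappa)\, \kappa\, d\kappa \tag{19}$$
satisfies the equality $T_m = \tilde\varphi_m(A)$ with $\tilde\varphi_m$ given in the above statement. The stationary expression is then obtained by observing that $\mathcal{F}^* \tilde S_\alpha^{CD}(k) \mathcal{F} = \tilde S_\alpha^{CD}(\sqrt{H_0})$, where $\tilde S_\alpha^{CD}(k)$ is the operator of multiplication by the function $\tilde S_\alpha^{CD}(\cdot)$. Finally, the equality between the time dependent wave operator and the stationary wave operator is a consequence of Lemma 8 and of [31, Theorem 5.2.4].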
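(c) By comparing (19) with (13), one observes that the operator $T_m$ is equal on a dense set in $\mathcal{H}_r$ to $\tilde\varphi_m(A)$ for a function $\tilde\varphi_m$ whose inverse Fourier transform is the distribution which satisfies for $y \in \mathbb{R}$:
$$\check{\tilde\varphi}_m(y) = \frac12 \sqrt{2\pi}\, e^{-i\delta_m^\alpha}\, e^{y} \int_{\mathbb{R}_+} \kappa\, H_{|m+\alpha|}^{(1)}(e^{y}\kappa)\, J_{|m|}(\kappa)\, d\kappa.$$
As before, the Fourier transform of this distribution can be computed. Explicitly one has (in the sense of distributions):
$$\begin{aligned}
\tilde\varphi_m(x) &= \frac12\, e^{-i\delta_m^\alpha} \int_{\mathbb{R}} e^{-ixy} \left[ e^{y} \int_{\mathbb{R}_+} \kappa\, H_{|m+\alpha|}^{(1)}(e^{y}\kappa)\, J_{|m|}(\kappa)\, d\kappa \right] dy \\
&= \frac12\, e^{-i\delta_m^\alpha} \left[ \int_{\mathbb{R}_+} \kappa^{(1+ix)-1} J_{|m|}(\kappa)\, d\kappa \right] \left[ \int_{\mathbb{R}_+} s^{(1-ix)-1} H_{|m+\alpha|}^{(1)}(s)\, ds \right] \\
&= \frac{1}{2\pi}\, e^{-i\pi|m|/2}\, (-i)^{ix}\, \frac{\Gamma\big(\tfrac12(|m| + 1 + ix)\big)}{\Gamma\big(\tfrac12(|m| + 1 - ix)\big)}\, \Gamma\big(\tfrac12(1 + |m + \alpha| - ix)\big)\, \Gamma\big(\tfrac12(1 - |m + \alpha| - ix)\big).
\end{aligned}$$
The second equality follows from the change of variables $s = e^{y}\kappa$. The last equality is obtained by taking into account the relation between the Hankel function $H_\nu^{(1)}$ and the modified Bessel function $K_\nu$ of the second kind as well as the Mellin transform of the functions $J_\nu$ and the function $K_\nu$ as presented in [22, Eqs. (10.1) and (11.1)]. The additional properties of $\tilde\varphi_m$ can easily be obtained by using the asymptotic development of the function $\Gamma$ as presented in [1, Eq. (6.1.39)].

6. Scattering Operator

In this section, we concentrate on the scattering operator and on its asymptotic values for large and small energies.

Proposition 13. The restriction of the scattering operator $S(H_\alpha^{CD}, H_0)$ to $\mathcal{H}_{int}$ is explicitly given by
$$S(H_\alpha^{CD}, H_0)|_{\mathcal{H}_{int}} = S_\alpha^{CD}\big(\sqrt{H_0}\big) \quad\text{with}\quad S_\alpha^{CD}(\kappa) := \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix} + \tilde S_\alpha^{CD}(\kappa).$$

Proof. Let us first recall that the scattering operator can be obtained from $\Omega_-^{CD}$ by the formula [6, Proposition (4.2)]:
$$\operatorname*{s-lim}_{t\to+\infty} e^{itH_0}\, e^{-itH_\alpha^{CD}}\, \Omega_-^{CD} = S(H_\alpha^{CD}, H_0).$$
We stress that the completeness has been taken into account for this equality. Now, let us set $U(t) := e^{-it\ln(H_0)/2}$, where $\ln(H_0)$ is the self-adjoint operator obtained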
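by functional calculus. By the intertwining property of the wave operators and by the invariance principle, one also has
$$\operatorname*{s-lim}_{t\to+\infty} U(-t)\, \Omega_-^{CD}\, U(t) = S(H_\alpha^{CD}, H_0).$$
On the other hand, the operator $\ln(H_0)/2$ is the generator of translations in the spectrum of $A$, i.e. $U(-t)\varphi(A)U(t) = \varphi(A + t)$ for any $\varphi: \mathbb{R} \to \mathbb{C}$. Since $\{U(t)\}_{t\in\mathbb{R}}$ is also reduced by the decomposition (1), it follows that
$$\begin{aligned}
\operatorname*{s-lim}_{t\to+\infty} U(-t)\, \Omega_-^{CD}|_{\mathcal{H}_{int}}\, U(t)
&= \operatorname*{s-lim}_{t\to+\infty} U(-t) \left[ \begin{pmatrix} \varphi_0^-(A) & 0 \\ 0 & \varphi_{-1}^-(A) \end{pmatrix} + \begin{pmatrix} \tilde\varphi_0(A) & 0 \\ 0 & \tilde\varphi_{-1}(A) \end{pmatrix} \tilde S_\alpha^{CD}\big(\sqrt{H_0}\big) \right] U(t) \\
&= \begin{pmatrix} \varphi_0^-(+\infty) & 0 \\ 0 & \varphi_{-1}^-(+\infty) \end{pmatrix} + \begin{pmatrix} \tilde\varphi_0(+\infty) & 0 \\ 0 & \tilde\varphi_{-1}(+\infty) \end{pmatrix} \tilde S_\alpha^{CD}\big(\sqrt{H_0}\big).
\end{aligned}$$
The initial statement is then obtained by taking the asymptotic values mentioned in Proposition 11 and Theorem 12 into account.

Even if the unitarity of the scattering operator follows from the general theory, we give below a direct verification in order to better understand its structure. In the next statement, we only give the value of the scattering matrix at energy 0 and energy equal to $+\infty$. However, more explicit expressions for $S_\alpha^{CD}(\kappa)$ are exhibited in the proof.

Proposition 14. The map
$$\mathbb{R}_+ \ni \kappa \mapsto S_\alpha^{CD}(\kappa) \in M_2(\mathbb{C}) \tag{20}$$
is continuous, takes values in the set $U(2)$ and has explicit asymptotic values for $\kappa = 0$ and $\kappa = +\infty$. More explicitly, depending on $C$, $D$ or $\alpha$ one has:
(i) If $D = 0$, then $S_\alpha^{CD}(\kappa) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$,
(ii) If $\det(D) \neq 0$, then $S_\alpha^{CD}(+\infty) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(iii) If $\dim[\ker(D)] = 1$ and $\alpha = 1/2$, then $S_\alpha^{CD}(+\infty) = (2P - 1)\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, where $P$ is the orthogonal projection onto $\ker(D)^\perp$,
(iv) If $\ker(D) = \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$ or if $\dim[\ker(D)] = 1$, $\alpha < 1/2$ and $\ker(D) \neq \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$, then $S_\alpha^{CD}(+\infty) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(v) If $\ker(D) = \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$ or if $\dim[\ker(D)] = 1$, $\alpha > 1/2$ and $\ker(D) \neq \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$, then $S_\alpha^{CD}(+\infty) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$.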
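Furthermore,
(a) If $C = 0$, then $S_\alpha^{CD}(0) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(b) If $\det(C) \neq 0$, then $S_\alpha^{CD}(0) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$,
(c) If $\dim[\ker(C)] = 1$ and $\alpha = 1/2$, then $S_\alpha^{CD}(0) = (1 - 2\Pi)\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, where $\Pi$ is the orthogonal projection on $\ker(C)^\perp$,
(d) If $\ker(C) = \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$ or if $\dim[\ker(C)] = 1$, $\alpha > 1/2$ and $\ker(C) \neq \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$, then $S_\alpha^{CD}(0) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(e) If $\ker(C) = \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$ or if $\dim[\ker(C)] = 1$, $\alpha < 1/2$ and $\ker(C) \neq \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$, then $S_\alpha^{CD}(0) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$.

Proof. Let us fix $\kappa > 0$ and set $S := S_\alpha^{CD}(\kappa)$. For shortness, we also set $L := \frac{\pi}{2\sin(\pi\alpha)}\, C$ and
$$B = B(\kappa) := \begin{pmatrix} \frac{\Gamma(1-\alpha)}{2^{\alpha}}\,\kappa^{\alpha} & 0 \\ 0 & \frac{\Gamma(\alpha)}{2^{1-\alpha}}\,\kappa^{(1-\alpha)} \end{pmatrix}, \quad \Phi := \begin{pmatrix} e^{-i\pi\alpha/2} & 0 \\ 0 & e^{-i\pi(1-\alpha)/2} \end{pmatrix}, \quad J := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Note that the matrices $B$, $\Phi$ and $J$ commute with each other, that the matrix $B$ is self-adjoint and invertible, and that $J$ and $\Phi$ are unitary.

(I) It is trivially checked that if $D = 0$ the statement (i) is satisfied.

(II) Let us assume $\det(D) \neq 0$, i.e. $D$ is invertible. Without loss of generality and as explained at the end of Sec. 3, we then assume that $D = 1$ and that the matrix $C$ is self-adjoint. Then one has
$$\begin{aligned}
S &= \Phi^2 J + 2i\sin(\pi\alpha)\, B\Phi (B^2\Phi^2 + L)^{-1} B\Phi J \\
&= B\Phi (B^2\Phi^2 + L)^{-1} \big[ B(\Phi^2 + 2i\sin(\pi\alpha)) + L B^{-1} \big] \Phi J.
\end{aligned}$$
By taking the equality $\Phi^2 + 2i\sin(\pi\alpha) = \Phi^{-2}$ into account, it follows that
$$\begin{aligned}
S &= B\Phi (B^2\Phi^2 + L)^{-1} (B\Phi^{-2} + L B^{-1}) \Phi J \\
&= \Phi (\Phi^2 + B^{-1} L B^{-1})^{-1} (\Phi^{-2} + B^{-1} L B^{-1}) \Phi J \\
&= \Phi \big( B^{-1} L B^{-1} + \cos(\pi\alpha) J - i\sin(\pi\alpha) \big)^{-1} \big( B^{-1} L B^{-1} + \cos(\pi\alpha) J + i\sin(\pi\alpha) \big) \Phi J.
\end{aligned}$$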
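Since the matrix $B^{-1} L B^{-1} + \cos(\pi\alpha) J$ is self-adjoint, the above expression can be rewritten as
$$S = \Phi\, \frac{B^{-1} L B^{-1} + \cos(\pi\alpha) J + i\sin(\pi\alpha)}{B^{-1} L B^{-1} + \cos(\pi\alpha) J - i\sin(\pi\alpha)}\, \Phi J \tag{21}$$
which is clearly a unitary operator. The only dependence on $\kappa$ in the terms $B$ is continuous and one has
$$\lim_{\kappa\to+\infty} S_\alpha^{CD}(\kappa) = \Phi\, \frac{\cos(\pi\alpha) J + i\sin(\pi\alpha)}{\cos(\pi\alpha) J - i\sin(\pi\alpha)}\, \Phi J = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$$
which proves the statement (ii).

(III) We shall now consider the situation $\det(D) = 0$ but $D \neq 0$. Obviously, $\ker(D)$ is of dimension 1. So let $p = (p_1, p_2)$ be a vector in $\ker(D)$ with $\|p\| = 1$. By (12) and by using the notation introduced in that section one has
$$S = \Phi^2 J + 2i\sin(\pi\alpha)\, B\Phi I (P B^2 \Phi^2 I + \ell)^{-1} P B\Phi J. \tag{22}$$
Note that the matrix of $\mathcal{P} := IP : \mathbb{C}^2 \to \mathbb{C}^2$, i.e. the orthogonal projection onto $p^\perp$, is given by
$$\mathcal{P} = \begin{pmatrix} |p_2|^2 & -p_1 \bar p_2 \\ -\bar p_1 p_2 & |p_1|^2 \end{pmatrix}$$
and that $P B^2 \Phi^2 I$ is just the multiplication by the number
$$c(\kappa) = b_1^2(\kappa)\, |p_2|^2\, e^{-i\pi\alpha} - b_2^2(\kappa)\, |p_1|^2\, e^{i\pi\alpha}, \tag{23}$$
with $b_1(\kappa) = \frac{\Gamma(1-\alpha)}{2^{\alpha}}\kappa^{\alpha}$ and $b_2(\kappa) = \frac{\Gamma(\alpha)}{2^{1-\alpha}}\kappa^{(1-\alpha)}$.

In the special case $\alpha = 1/2$, the matrices $B$ and $\Phi$ have the special form $B = \sqrt{\frac{\pi}{2}}\,\kappa^{1/2}$ and $\Phi = e^{-i\pi/4}$. Clearly, one also has $b_1 = b_2 = \sqrt{\frac{\pi}{2}}\,\kappa^{1/2} =: b$ and $c(\kappa) = -ib^2$. In that case, the expression (22) can be rewritten as
$$S = i \left[ \frac{\pi\kappa/2 - i\ell}{\pi\kappa/2 + i\ell}\, \mathcal{P} + (\mathcal{P} - 1) \right] J \tag{24}$$
which is the product of unitary operators and thus is unitary. Furthermore, the dependence in $\kappa$ is continuous and the asymptotic value is easily determined. This proves statement (iii).

If $\alpha \neq 1/2$, let us rewrite $S$ as
$$S = \Phi (c(\kappa) + \ell)^{-1} \big[ 2i\sin(\pi\alpha)\, B \mathcal{P} B + c(\kappa) + \ell \big] \Phi J. \tag{25}$$
Furthermore, by setting $X_- := (b_1^2 |p_2|^2 - b_2^2 |p_1|^2)$ and $X_+ := (b_1^2 |p_2|^2 + b_2^2 |p_1|^2)$ one has
$$c(\kappa) + \ell = \cos(\pi\alpha) X_- + \ell - i\sin(\pi\alpha) X_+$$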
and
$$M := 2i\sin(\pi\alpha)\, B \mathcal{P} B + c(\kappa) + \ell = \begin{pmatrix} e^{i\pi\alpha} X_- + \ell & -2i\sin(\pi\alpha)\, b_1 b_2\, p_1 \bar p_2 \\ -2i\sin(\pi\alpha)\, b_1 b_2\, \bar p_1 p_2 & e^{-i\pi\alpha} X_- + \ell \end{pmatrix}.$$
With these notations, the unitarity of $S$ easily follows from the equality $\det(M) = |c(\kappa) + \ell|^2$. The continuity in $\kappa$ of all the expressions also implies the expected continuity of the map (20). Finally, by taking (23) and the explicit form of $M$ into account, the asymptotic values of $S_\alpha^{CD}(\kappa)$ for the cases (iv) and (v) can readily be obtained.

(IV) Let us now consider the behavior of the scattering matrix near the zero energy. If $C = 0$, then $\det(D) \neq 0$ and one can use (21) with $L = 0$. The statement (a) follows easily.

(V) Assume that $\det(C) \neq 0$. In this case, it directly follows from (16) that $\tilde S_\alpha^{CD}(0) = 0$, and then $S(0) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$, which proves (b).

(VI) We now assume that $\dim[\ker(C)] = 1$ and consider two cases. Firstly, if $\det(D) \neq 0$ we can assume as in (II) that $D = 1$ and that $C$ is self-adjoint and use again (21). Introducing the entries of $L$,
$$L = \begin{pmatrix} l_{11} & l_{12} \\ \overline{l_{12}} & l_{22} \end{pmatrix}$$
one obtains
$$\frac{B^{-1} L B^{-1} + \cos(\pi\alpha) J + i\sin(\pi\alpha)}{B^{-1} L B^{-1} + \cos(\pi\alpha) J - i\sin(\pi\alpha)} = \frac{1}{b_1^2 l_{22} e^{-i\pi\alpha} - b_2^2 l_{11} e^{i\pi\alpha} - b_1^2 b_2^2} \cdot \begin{pmatrix} b_1^2 l_{22} e^{i\pi\alpha} - b_2^2 l_{11} e^{i\pi\alpha} - b_1^2 b_2^2 e^{2i\pi\alpha} & b_1 b_2\, l_{12}\, (e^{-i\pi\alpha} - e^{i\pi\alpha}) \\ b_1 b_2\, \overline{l_{12}}\, (e^{-i\pi\alpha} - e^{i\pi\alpha}) & b_1^2 l_{22} e^{-i\pi\alpha} - b_2^2 l_{11} e^{-i\pi\alpha} - b_1^2 b_2^2 e^{-2i\pi\alpha} \end{pmatrix}.$$
For $\alpha \neq 1/2$ one easily obtains the result stated in (d) and (e). For $\alpha = 1/2$, it follows that
$$\lim_{\kappa\to 0+} \frac{B^{-1} L B^{-1} + \cos(\pi\alpha) J + i\sin(\pi\alpha)}{B^{-1} L B^{-1} + \cos(\pi\alpha) J - i\sin(\pi\alpha)} = \frac{2}{\mathrm{tr}(L)}\, L - 1,$$
and it only remains to observe that $L = \mathrm{tr}(L)\,\Pi$, where $\Pi$ is the orthogonal projection on $\ker(L)^\perp = \ker(C)^\perp$. This proves (c).

Secondly, let us assume that $\dim[\ker(D)] = 1$. By (11), there exists $U \in U(2)$ such that $\ker(C) = \ker(1 - U)$ and $\ker(D) = \ker(1 + U)$. As a consequence, one has $\ker(C) = \ker(D)^\perp$ and then $\mathcal{P} = 1 - \Pi$. On the other hand, we can use the expressions for the scattering operator obtained in (III).
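However, observe that $CI = C|_{\ker(D)^\perp} = C|_{\ker(C)} = 0$, so we only have to consider these expressions in the special case $\ell = 0$. The asymptotics at zero energy are then easily deduced from these expressions. By summing the results obtained for $\det(D) \neq 0$ and for $\dim[\ker(D)] = 1$, and since $D = 0$ is not allowed if $\det(C) = 0$, one proves the cases (c), (d) and (e).

Remark 15. As can be seen from the proof, the scattering matrix is independent of the energy in the following cases only:
• $D = 0$, then $S_\alpha^{CD}(\kappa) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$,
• $C = 0$, then $S_\alpha^{CD}(\kappa) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$, see (21),
• $\ker(C) = \ker(D)^\perp = \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$, then $S_\alpha^{CD}(\kappa) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$, see (25),
• $\ker(C) = \ker(D)^\perp = \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$, then $S_\alpha^{CD}(\kappa) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$, see (25),
• $\alpha = 1/2$ and $\det(C) = \det(D) = 0$, then $S_\alpha^{CD}(\kappa) = (2P - 1)\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, where $P$ is the orthogonal projection on $\ker(D)^\perp \equiv \ker(C)$, see (24).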
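Several statements of Proposition 14 and Remark 15 can be checked numerically from the defining formula $S = \Phi^2 J + 2i\sin(\pi\alpha)\, B\Phi\,(D B^2\Phi^2 + L)^{-1} D\, B\Phi J$ with $L = \frac{\pi}{2\sin(\pi\alpha)} C$. The sketch below is illustrative only: the function names, the sample matrices $C$, $D$ and the value $\alpha = 0.3$ are assumptions chosen for the test, and plain $2\times 2$ tuples are used instead of a linear algebra library.

```python
import cmath
import math


def diag(a, b):
    return ((a, 0j), (0j, b))


def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))


def scale(s, X):
    return tuple(tuple(s * X[i][j] for j in range(2)) for i in range(2))


def inv(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def dagger(X):
    return tuple(
        tuple(complex(X[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def dist(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


def scattering_matrix(kappa, alpha, C, D):
    """S = Phi^2 J + 2i sin(pi a) B Phi (D B^2 Phi^2 + L)^{-1} D B Phi J."""
    s = math.sin(math.pi * alpha)
    B = diag(math.gamma(1 - alpha) * kappa**alpha / 2**alpha,
             math.gamma(alpha) * kappa**(1 - alpha) / 2**(1 - alpha))
    Phi = diag(cmath.exp(-1j * math.pi * alpha / 2),
               cmath.exp(-1j * math.pi * (1 - alpha) / 2))
    J = diag(1, -1)
    BPhi = mul(B, Phi)  # B and Phi are diagonal, so (B Phi)^2 = B^2 Phi^2
    middle = inv(add(mul(D, mul(BPhi, BPhi)), scale(math.pi / (2 * s), C)))
    tilde_S = scale(2j * s, mul(mul(mul(BPhi, middle), D), mul(BPhi, J)))
    return add(mul(mul(Phi, Phi), J), tilde_S)


alpha = 0.3
one = diag(1, 1)
C = ((0.8, 0.3 - 0.5j), (0.3 + 0.5j, -1.1))  # self-adjoint, det(C) != 0
S = scattering_matrix(1.7, alpha, C, one)
print(dist(mul(S, dagger(S)), one))           # deviation from unitarity
print(scattering_matrix(1e8, alpha, C, one))  # compare with statement (ii)
print(scattering_matrix(1e-8, alpha, C, one)) # compare with statement (b)
```

With $D = 1$ and $C$ self-adjoint this is the situation of step (II); replacing `D` by a rank-one matrix such as `diag(0, 1)` probes the cases (iv) and (v).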
7. Final Remarks

As mentioned before, the parametrization of the self-adjoint extensions of $H_\alpha$ with the pair $(C, D)$ satisfying (7) is highly non-unique. For the sake of convenience, we recall here a one-to-one parametrization of all self-adjoint extensions and reinterpret a part of the results obtained before in this framework. So, let $U \in U(2)$ and set
$$C = C(U) := \frac12 (1 - U) \quad\text{and}\quad D = D(U) = \frac{i}{2} (1 + U). \tag{26}$$
It is easy to check that $C$ and $D$ satisfy both conditions (7). In addition, two different elements $U$, $U'$ of $U(2)$ lead to two different self-adjoint operators $H_\alpha^{CD}$ and $H_\alpha^{C'D'}$ with $C = C(U)$, $D = D(U)$, $C' = C(U')$ and $D' = D(U')$, cf. [17]. Thus, without ambiguity we can write $H_\alpha^U$ for the operator $H_\alpha^{CD}$ with $C$, $D$ given by (26). Moreover, the set $\{H_\alpha^U \mid U \in U(2)\}$ describes all self-adjoint extensions of $H_\alpha$, and, by (10), the map $U \mapsto H_\alpha^U$ is continuous in the norm resolvent topology. Let us finally mention that the normalization of the above map has been chosen such that $H_\alpha^{-1} \equiv H_\alpha^{1\,0} = H_\alpha^{AB}$. Obviously, we could use various parametrizations for the set $U(2)$. For example, one could set
$$U = U(\eta, a, b) = e^{i\eta} \begin{pmatrix} a & -\bar b \\ b & \bar a \end{pmatrix}$$
with $\eta \in [0, 2\pi)$ and $a, b \in \mathbb{C}$ satisfying $|a|^2 + |b|^2 = 1$, which is the parametrization used in [2] (note nevertheless that the role of the unitary parameter was quite
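different). We could also use the parametrization inspired by [13]:
$$U = U(\omega, a, b, q) = e^{i\omega} \begin{pmatrix} q e^{ia} & -(1 - q^2)^{1/2} e^{-ib} \\ (1 - q^2)^{1/2} e^{ib} & q e^{-ia} \end{pmatrix}$$
with $\omega, a, b \in [0, 2\pi)$ and $q \in [0, 1]$. However, the following formulae look much simpler without such an arbitrary choice, and such a particularization can always be performed later on. We can now rewrite part of the previous results in terms of $U$:

Lemma 16. Let $U \in U(2)$. Then,
(i) For $z \in \rho(H_\alpha^{AB}) \cap \rho(H_\alpha^U)$ the resolvent equation holds:
$$(H_\alpha^U - z)^{-1} - (H_\alpha^{AB} - z)^{-1} = -\gamma(z) \big[ (1 + U) M(z) + i(1 - U) \big]^{-1} (1 + U)\, \gamma(\bar z)^*,$$
(ii) The number of negative eigenvalues of $H_\alpha^U$ coincides with the number of negative eigenvalues of the matrix $i(U - U^*)$,
(iii) The value $z \in \mathbb{R}_-$ is an eigenvalue of $H_\alpha^U$ if and only if $\det\big((1 + U) M(z) + i(1 - U)\big) = 0$, and in that case one has $\ker(H_\alpha^U - z) = \gamma(z) \ker\big((1 + U) M(z) + i(1 - U)\big)$.

The wave operators can also be rewritten in terms of the single parameter $U$. We shall not do it here but simply express the asymptotic values of the scattering operator $S_\alpha^U := S(H_\alpha^U, H_0)$ in terms of $U$. If $\lambda \in \mathbb{C}$ is an eigenvalue of $U$, we denote by $V_\lambda$ the corresponding eigenspace.

Proposition 17. One has:
(i) If $U = -1$, then $S_\alpha^U(\kappa) \equiv S_\alpha^{AB} = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$,
(ii) If $-1 \notin \sigma(U)$, then $S_\alpha^U(+\infty) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(iii) If $-1 \in \sigma(U)$ with multiplicity one and $\alpha = 1/2$, then $S_\alpha^U(+\infty) = (2P - 1)\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, where $P$ is the orthogonal projection onto $V_{-1}^\perp$,
(iv) If $V_{-1} = \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$ or if $-1 \in \sigma(U)$ with multiplicity one, $\alpha < 1/2$ and $V_{-1} \neq \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$, then $S_\alpha^U(+\infty) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(v) If $V_{-1} = \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$ or if $-1 \in \sigma(U)$ with multiplicity one, $\alpha > 1/2$ and $V_{-1} \neq \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$, then $S_\alpha^U(+\infty) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$.
Furthermore,
(a) If $U = 1$, then $S_\alpha^U(0) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(b) If $1 \notin \sigma(U)$, then $S_\alpha^U(0) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$,
(c) If $1 \in \sigma(U)$ with multiplicity one and $\alpha = 1/2$, then $S_\alpha^U(0) = (1 - 2\Pi)\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, where $\Pi$ is the orthogonal projection on $V_1^\perp$,
(d) If $V_1 = \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$ or if $1 \in \sigma(U)$ with multiplicity one, $\alpha > 1/2$ and $V_1 \neq \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$, then $S_\alpha^U(0) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$,
(e) If $V_1 = \begin{pmatrix} \mathbb{C} \\ 0 \end{pmatrix}$ or if $1 \in \sigma(U)$ with multiplicity one, $\alpha < 1/2$ and $V_1 \neq \begin{pmatrix} 0 \\ \mathbb{C} \end{pmatrix}$, then $S_\alpha^U(0) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$.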
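The parametrization (26) of Sec. 7 lends itself to a quick numerical sanity check. The sketch below assumes that conditions (7), which are stated earlier in the paper, amount to the usual pair "$CD^*$ is self-adjoint" and "$\det(CC^* + DD^*) \neq 0$"; the helper names and the sample parameters are hypothetical. It also illustrates that $4\,CD^* = i(U - U^*)$, the matrix appearing in Lemma 16(ii).

```python
import cmath


def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def add(X, Y):
    return tuple(tuple(X[i][j] + Y[i][j] for j in range(2)) for i in range(2))


def scale(s, X):
    return tuple(tuple(s * X[i][j] for j in range(2)) for i in range(2))


def dagger(X):
    return tuple(
        tuple(complex(X[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


def dist(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))


ONE = ((1, 0), (0, 1))


def unitary(eta, a, b):
    """U(eta, a, b) = e^{i eta} [[a, -conj(b)], [b, conj(a)]], |a|^2 + |b|^2 = 1."""
    phase = cmath.exp(1j * eta)
    return scale(phase, ((a, -b.conjugate()), (b, a.conjugate())))


def boundary_pair(U):
    """Return (C, D) = ((1 - U)/2, i (1 + U)/2) as in (26)."""
    C = scale(0.5, add(ONE, scale(-1, U)))
    D = scale(0.5j, add(ONE, U))
    return C, D


# Sample parameters with |a|^2 + |b|^2 = 0.36 + 0.64 = 1.
U = unitary(0.7, 0.6 * cmath.exp(0.4j), 0.8 * cmath.exp(-1.1j))
C, D = boundary_pair(U)
CDs = mul(C, dagger(D))
print(dist(CDs, dagger(CDs)))                                      # C D* self-adjoint
print(dist(add(mul(C, dagger(C)), mul(D, dagger(D))), ONE))        # C C* + D D* = 1
print(dist(scale(4, CDs), scale(1j, add(U, scale(-1, dagger(U))))))  # 4 C D* = i(U - U*)
```

For $U = -1$ the pair reduces to $C = 1$, $D = 0$, in line with the normalization $H_\alpha^{-1} = H_\alpha^{AB}$ mentioned above.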
Remark 18. The scattering matrix is independent of the energy in the following cases only:
• $U = -1$, then $S_\alpha^U(\kappa) \equiv S_\alpha^{AB} = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$,
• $U = 1$, then $S_\alpha^U(\kappa) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$, see (21),
• $U = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, then $S_\alpha^U(\kappa) = \begin{pmatrix} e^{i\pi\alpha} & 0 \\ 0 & e^{i\pi\alpha} \end{pmatrix}$, see (25),
• $U = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$, then $S_\alpha^U(\kappa) = \begin{pmatrix} e^{-i\pi\alpha} & 0 \\ 0 & e^{-i\pi\alpha} \end{pmatrix}$, see (25),
• $\alpha = 1/2$ and $\sigma(U) = \{-1, 1\}$, then $S_\alpha^U = (2P - 1)\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$, where $P$ is the orthogonal projection on $V_1$, see (24).

Acknowledgment

S. Richard is supported by the Swiss National Science Foundation.

References

[1] M. Abramowitz and I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards Applied Mathematics Series, Vol. 55 (U.S. Government Printing Office, Washington, D.C., 1964).
[2] R. Adami and A. Teta, On the Aharonov–Bohm Hamiltonian, Lett. Math. Phys. 43 (1998) 43–54.
[3] Y. Aharonov and D. Bohm, Significance of electromagnetic potentials in the quantum theory, Phys. Rev. 115 (1959) 485–491.
[4] S. Albeverio, F. Gesztesy, R. Høegh-Krohn and H. Holden, Solvable Models in Quantum Mechanics, with an Appendix by P. Exner, 2nd edn. (AMS, Providence, Rhode Island, 2005).
[5] S. Albeverio and K. Pankrashkin, A remark on Krein's resolvent formula and boundary conditions, J. Phys. A 38 (2005) 4859–4864.
[6] W. O. Amrein, J. M. Jauch and K. B. Sinha, Scattering Theory in Quantum Mechanics, Lecture Notes and Supplements in Physics, Vol. 16 (W. A. Benjamin, Inc., Reading, Mass.-London-Amsterdam, 1977).
[7] M. Ballesteros and R. Weder, High-velocity estimates for the scattering operator and Aharonov–Bohm effect in three dimensions, Comm. Math. Phys. 285 (2009) 345–398.
[8] M. Ballesteros and R. Weder, Aharonov–Bohm effect and Tonomura et al. experiments, rigorous results, J. Math. Phys. 50 (2009) 122108, 54 pp.
[9] M. V. Berry, Aptly named Aharonov–Bohm effect has classical analogue, long history, Phys. Today 63 (2010) 8 pp.
[10] L. Bruneau, J. Dereziński and V. Georgescu, Homogeneous Schrödinger operators on half-line, preprint; arXiv: 0911.5569.
[11] J. Brüning and V. Geyler, Scattering on compact manifolds with infinitely thin horns, J. Math. Phys. 44 (2003) 371–405.
[12] J. Brüning, V. Geyler and K. Pankrashkin, Spectra of self-adjoint extensions and applications to solvable Schrödinger operators, Rev. Math. Phys. 20 (2008) 1–70.
[13] L. Dąbrowski and P. Šťovíček, Aharonov–Bohm effect with δ-type interaction, J. Math. Phys. 39 (1998) 47–62.
[14] V. A. Derkach and M. M. Malamud, Generalized resolvents and the boundary value problems for Hermitian operators with gaps, J. Funct. Anal. 95 (1991) 1–95.
[15] P. Exner, P. Šťovíček and P. Vytřas, Generalised boundary conditions for the Aharonov–Bohm effect combined with a homogeneous magnetic field, J. Math. Phys. 43 (2002) 2151–2168.
[16] N. Goloshchapova and L. Oridoroga, The one-dimensional Schrödinger operator with point δ- and δ′-interactions, Math. Notes 84 (2008) 125–129.
[17] M. Harmer, Hermitian symplectic geometry and extension theory, J. Phys. A 33 (2000) 9193–9203.
[18] A. Jensen, Time-delay in potential scattering theory, some "geometric" results, Comm. Math. Phys. 82(3) (1981/82) 435–456.
[19] T. Kato, Perturbation Theory for Linear Operators, Classics in Mathematics, reprint of the 1980 edition (Springer-Verlag, Berlin, 1995).
[20] J. Kellendonk and S. Richard, Weber–Schafheitlin type integrals with exponent 1, Int. Transf. Spec. Funct. 20 (2009) 147–153.
[21] O. Lisovyy, Aharonov–Bohm effect on the Poincaré disk, J. Math. Phys. 48 (2007) 052112, 17 pp.
[22] F. Oberhettinger, Tables of Mellin Transforms (Springer-Verlag, 1974).
[23] K. Pankrashkin, Resolvents of self-adjoint extensions with mixed boundary conditions, Rep. Math. Phys. 58 (2006) 207–221.
[24] A. Posilicano, Self-adjoint extensions of restrictions, Oper. Matrices 2 (2008) 483–506.
[25] S. Richard, New formulae for the Aharonov–Bohm wave operators, in Spectral and Scattering Theory for Quantum Magnetic Systems, Contemporary Mathematics, Vol. 500 (AMS, Providence, Rhode Island, 2009), pp. 159–168.
[26] S. N. M. Ruijsenaars, The Aharonov–Bohm effect and scattering theory, Ann. Phys. 146(1) (1983) 1–34.
[27] H. Tamura, Magnetic scattering at low energy in two-dimensions, Nagoya Math. J. 155 (1999) 95–151.
[28] H. Tamura, Norm resolvent convergence to magnetic Schrödinger operators with point interactions, Rev. Math. Phys. 13(4) (2001) 465–511.
[29] G. N. Watson, A Treatise on the Theory of Bessel Functions, 2nd edn. (Cambridge University Press, Cambridge, England, 1966).
[30] R. Weder, The Aharonov–Bohm effect and time-dependent inverse scattering theory, Inverse Problems 18 (2002) 1041–1056.
[31] D. R. Yafaev, Mathematical Scattering Theory. General Theory, Translations of Mathematical Monographs, Vol. 105 (American Mathematical Society, Providence, RI, 1992).
Reviews in Mathematical Physics Vol. 23, No. 1 (2011) 83–125 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004230
PERTURBATION OF NEAR THRESHOLD EIGENVALUES: CROSSOVER FROM EXPONENTIAL TO NON-EXPONENTIAL DECAY LAWS
VICTOR DINU
CAQP, Faculty of Physics, University of Bucharest, P. O. Box MG 11, RO-077125 Bucharest, Romania

ARNE JENSEN
Department of Mathematical Sciences, Aalborg University, Fr. Bajers Vej 7G, DK-9220 Aalborg Ø, Denmark
[email protected]

GHEORGHE NENCIU
Institute of Mathematics of the Romanian Academy, P. O. Box 1-764, RO-014700 Bucharest, Romania
and
Department of Mathematical Sciences, Aalborg University, Fr. Bajers Vej 7G, DK-9220 Aalborg Ø, Denmark

Received 6 June 2010

For a two-channel model of the form
$$H_\varepsilon = \begin{pmatrix} H_{op} & 0 \\ 0 & E_0 \end{pmatrix} + \varepsilon \begin{pmatrix} 0 & W_{12} \\ W_{21} & 0 \end{pmatrix} \quad\text{on } \mathcal{H} = \mathcal{H}_{op} \oplus \mathbb{C},$$
appearing in the study of Feshbach resonances, we continue the rigorous study, begun in our paper (J. Math. Phys. 50 (2009) 013516), of the decay laws for resonances produced by perturbation of unstable bound states close to a threshold. The operator $H_{op}$ is assumed to have the properties of a Schrödinger operator in odd dimensions, with a threshold at zero. We consider for $\varepsilon$ small the survival probability $|\langle \Psi_0, e^{-itH_\varepsilon} \Psi_0 \rangle|^2$, where $\Psi_0$ is the eigenfunction corresponding to $E_0$ for $\varepsilon = 0$. For $E_0$ in a small neighborhood of the origin independent of $\varepsilon$, the survival probability amplitude is expressed in terms of some special functions related to the error function, up to error terms vanishing as $\varepsilon \to 0$. This allows for a detailed study of the crossover from exponential to non-exponential decay laws, and then to the bound state regime, as the position of the resonance is tuned across the threshold.

Keywords: Decay laws; exponential decay; non-exponential decay; Fermi Golden Rule.

Mathematics Subject Classification 2010: 81Q10, 35B25, 35J10, 47A55, 47N50, 81U05
1. Introduction

The problem of the decay laws for resonances produced by perturbation of unstable bound states has a long and distinguished history in quantum mechanics. There is an extensive body of literature about decay laws for resonances in general, both at the level of theoretical physics (see e.g. [4, 10, 11, 22, 27–29] and references therein), and at the level of rigorous mathematical physics (see e.g. [3, 5–7, 9, 12, 16–19, 24, 31, 32] and references therein). It started with the computation by Dirac of the decay rate in second order time-dependent perturbation theory, leading to the well-known exponential decay law, $e^{-\Gamma t}$. Here $\Gamma$ is given by the famous "Fermi Golden Rule" (FGR), $\Gamma \sim |\langle \Psi_0, \varepsilon W \Psi_{\mathrm{cont},E_0} \rangle|^2$, where $\Psi_0$, $E_0$ are the unperturbed bound state eigenfunction and energy, respectively, and $\Psi_{\mathrm{cont},E_0}$ is the continuum "eigenfunction" degenerate in energy with the bound state.

The FGR formula met with a fabulous success, and as a consequence, the common wisdom is that the decay law for the resonances produced by perturbation of non-degenerate bound states is exponential, at least in the leading non-trivial order in the perturbation strength (for degenerate bound states Rabi type exponentially decaying oscillations can appear). However, it has been known for a long time, at least for semi-bounded Hamiltonians, that the decay law cannot be purely exponential; there must be deviations at least at short and long times. This implies that, in more precise terms, the question is whether the decay law is exponential up to errors vanishing as the perturbation strength tends to zero. So at the rigorous level the crucial problem is the estimation of the errors. This proved to be a hard problem, and only during the past decades have consistent rigorous results been obtained. The generic result is that (see [3, 5, 12, 16, 24] and references therein) the decay law is indeed (quasi)exponential, i.e. exponential up to error terms vanishing in the limit $\varepsilon \to 0$, as long as the resolvent of the unperturbed Hamiltonian is sufficiently smooth, when projected onto the subspace orthogonal to the eigenvalue under consideration. For most cases of physical interest, this turns out to be the case, as long as the unperturbed eigenvalue lies in the continuum, far away from the energetic thresholds, and this explains the tremendous success of the FGR formula.

The problem with the exponential decay law appears for bound states situated near a threshold, since, in this case, the projected resolvent might not be smooth, or may even blow up, when there is a zero resonance^a at the threshold, see, e.g., [14–16] and references therein. As has been pointed out in [2], at threshold the FGR formula does not apply. Moreover, the fact that the non-smoothness of the resolvent opens the possibility of a non-exponential decay at all times has been mentioned at the heuristic level [20, 22] (although this possibility for the non-degenerate case has been sometimes denied [25]).

^a To clarify the terminology, $H = -\Delta + V$ on $L^2(\mathbb{R}^d)$, $d = 1, 2, 3$, with $V(x) = O(|x|^{-2-\delta})$ as $|x| \to \infty$, is said to have a zero resonance, if $H\Phi = 0$ has a solution, which is not in $L^2(\mathbb{R}^d)$, but in a slightly larger space, see, e.g., [14, 15]. A zero resonance is also called a half-bound state.
Let us mention that the question of the decay law for near threshold bound states is more than an academic one. While having the bound state in the very neighborhood of a threshold is a non-generic situation, recent advances in experimental technique have made it possible to realize this case for the so-called Feshbach resonances, where (with the aid of a magnetic field) it is possible to tune the energy of the bound state (and then the resonance position) throughout a neighborhood of the threshold energy.

The decay law for the case when the resonance position is close to the threshold has been considered at the rigorous level in [16–19]. More precisely, in [16–18], the threshold bound states were considered, but under the condition that the shift in the energy due to perturbation (see [16, (3.1)]) is sufficiently large, such that the resonance position is at a distance of order $\varepsilon$ from the threshold. In this case it turns out that the decay law is still exponential, but the FGR has to be modified. It is interesting to note that since in this case the $\varepsilon$ dependence of the decay rate is a fractional power, the modified FGR cannot be obtained by naive perturbation theory. The other case, when the resonance position is very close to the threshold (in a neighborhood of the threshold, shrinking as $\varepsilon \to 0$), has been considered in [8] for a two-channel model Hamiltonian with the structure used in Feshbach resonance theory [21, 30]. The main result is the proof at the rigorous level that for some energy ranges the decay law is definitely non-exponential. More precisely, we proved that the survival probability, up to some error terms vanishing in the limit $\varepsilon \to 0$, can be written as an explicit integral, which has been analyzed numerically. The numerical study revealed that, depending upon the values of the parameters involved, a remarkable variety of different decay laws appears: close to exponential, definitely non-exponential, or bound state like.

The present paper is a continuation of [8]. The setting is the same, but we add two important things. First, using an appropriate ansatz, close in spirit to the well-known Lorentzian (Breit–Wigner) approximation for perturbed eigenvalues far from the threshold, but with a functional form taking into account the behavior of the resolvent near threshold, we are able to cover a small $\varepsilon$-independent neighborhood of the threshold, improving at the same time the error term. Secondly, we express the approximated survival probability amplitude in terms of some special functions, related to the error function, replacing the exponential function in the decay law. As a result, we are able to obtain a rigorous and detailed description of the crossover of the decay law, as the resonance position is tuned through the threshold from positive to negative energies via tuning of $E_0$: exponential decay with the usual FGR decay rate, to exponential decay with the modified FGR decay rate, then to non-exponential decay, and finally to bound state behavior.

The content of the paper is as follows. In Sec. 2, we recall from [8] the model Hamiltonian and its properties. Section 3 contains the guiding heuristic discussion, and a detailed description of the results. Section 4 contains the proofs. In the Appendix, we discuss the properties of the functions appearing in the expression
for the approximate survival probability amplitude, and their relations with the error function and related special functions.

2. Notation and Assumptions

The setting is the same as in [8]. We repeat it below for the reader's convenience. We develop the theory in a somewhat abstract setting, which is applicable to two channel Schrödinger operators in odd dimensions, as they appear for example in the theory of Feshbach resonances (see, e.g., [21, 30], and references therein). Consider
$$H = \begin{pmatrix} H_{op} & 0 \\ 0 & H_{cl} \end{pmatrix} \quad\text{on } \mathcal{H} = \mathcal{H}_{op} \oplus \mathcal{H}_{cl}.$$
In concrete cases $\mathcal{H}_{op} = L^2(\mathbb{R}^3)$ (or $L^2(\mathbb{R}_+)$ in the spherically symmetric case), and $H_{op} = -\Delta + V_{op}$ with $\lim_{|x|\to\infty} V_{op}(x) = 0$. $H_{op}$ describes the "open" channel. As for the "closed" channel, one starts again with a Schrödinger operator, but with $\lim_{|x|\to\infty} V_{cl}(x) = V_{cl,\infty} > 0$. One assumes that $H_{cl}$ has bound states below $V_{cl,\infty}$, which may be embedded in the continuum spectrum of $H_{op}$. Only these bound states are relevant for the problem at hand. Thus one can retain only one isolated eigenvalue (or a group of almost degenerate eigenvalues isolated from the rest of the spectrum); the inclusion of the rest of the spectrum of $H_{cl}$ merely "renormalizes" the values of some coefficients, without changing the qualitative picture. In this paper, we shall consider only non-degenerate eigenvalues, i.e. we shall take $H_{cl} = E_0$ in $\mathcal{H}_{cl} = \mathbb{C}$, such that
$$H = \begin{pmatrix} H_{op} & 0 \\ 0 & E_0 \end{pmatrix}, \tag{2.1}$$
on
$$\mathcal{H} = \mathcal{H}_{op} \oplus \mathbb{C} = \left\{ \Psi = \begin{pmatrix} \psi \\ \beta \end{pmatrix} \;\middle|\; \psi \in \mathcal{H}_{op},\ \beta \in \mathbb{C} \right\}.$$
Apart from the spectrum of $H_{op}$, $H$ has a bound state
$$\Psi_0 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad\text{such that}\quad H \begin{pmatrix} 0 \\ 1 \end{pmatrix} = E_0 \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.2}$$
The problem is to study the fate of $E_0$, when an interchannel perturbation
$$\varepsilon W = \varepsilon \begin{pmatrix} 0 & W_{12} \\ W_{21} & 0 \end{pmatrix} \tag{2.3}$$
is added to $H$, i.e. the total Hamiltonian is
$$H_\varepsilon = H + \varepsilon W. \tag{2.4}$$
Throughout the paper we assume, without loss of generality, that $\varepsilon > 0$. For simplicity, we assume that $W$ is a bounded self-adjoint operator on $\mathcal{H}$.
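As a purely illustrative toy version of the setting (2.1)–(2.4), one can replace the open channel by a single level $h$, so that $H_\varepsilon$ becomes a $2\times 2$ Hermitian matrix. This is an assumption made only for the sketch (all function names are hypothetical): with no continuous spectrum there is no decay, only Rabi-type oscillation, but the Schur complement (Feshbach-type) identity for the $\Psi_0$ matrix element of the resolvent, $\langle \Psi_0, (H_\varepsilon - z)^{-1} \Psi_0 \rangle = (E_0 - z - \varepsilon^2 g(z))^{-1}$ with $g(z) = |w|^2/(h - z)$, can be verified directly.

```python
import cmath
import math


def eigenvalues(h, E0, eps, w):
    """Eigenvalues of [[h, eps*w], [eps*conj(w), E0]] (assumes they are distinct)."""
    mean = (h + E0) / 2
    gap = math.sqrt(((h - E0) / 2) ** 2 + (eps * abs(w)) ** 2)
    return mean - gap, mean + gap


def weights(h, E0, eps, w):
    """Spectral weights of Psi0 = (0, 1) on the two eigenvectors."""
    lam_minus, lam_plus = eigenvalues(h, E0, eps, w)
    p_plus = (E0 - lam_minus) / (lam_plus - lam_minus)
    p_minus = (lam_plus - E0) / (lam_plus - lam_minus)
    return (lam_minus, p_minus), (lam_plus, p_plus)


def survival_amplitude(h, E0, eps, w, t):
    """A(t) = <Psi0, exp(-i t H_eps) Psi0> for the 2x2 toy model."""
    return sum(p * cmath.exp(-1j * t * lam) for lam, p in weights(h, E0, eps, w))


def resolvent_direct(h, E0, eps, w, z):
    """(2,2) entry of (H_eps - z)^{-1}, from the explicit 2x2 inverse."""
    det = (h - z) * (E0 - z) - (eps * abs(w)) ** 2
    return (h - z) / det


def resolvent_schur(h, E0, eps, w, z):
    """Same entry via F(z, eps) = E0 - z - eps^2 g(z), g(z) = |w|^2 / (h - z)."""
    g = abs(w) ** 2 / (h - z)
    return 1 / (E0 - z - eps ** 2 * g)


h, E0, eps, w, z = 0.9, 0.1, 0.2, 0.5 + 0.3j, 0.3 + 0.2j
print(survival_amplitude(h, E0, eps, w, 0.0))
print(resolvent_direct(h, E0, eps, w, z), resolvent_schur(h, E0, eps, w, z))
```

In the genuine model the scalar $h$ is replaced by $H_{op}$ with continuous spectrum, and the same identity then leads to a function $g$ with a branch cut along the spectrum of $H_{op}$.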
As already mentioned in the Introduction, the quantity to be studied is the so-called survival probability amplitude
$$A_\varepsilon(t) = \langle \Psi_0, e^{-itH_\varepsilon} \Psi_0 \rangle. \tag{2.5}$$
As in [16, 8], we shall use the stationary approach to write down a workable formula for $A_\varepsilon(t)$. For this purpose, we use the Stone formula to express the compressed evolution in terms of the compressed resolvent, and then we use the Schur–Livsic–Feshbach–Grushin (SLFG) partition formula to express the compressed resolvent as an inverse (for details, further references, and historical remarks about the SLFG formula, the reader should refer to [16]). More precisely, by using the Stone formula and the SLFG formula, one arrives at the following basic formula for $A_\varepsilon(t)$, which often appears in the physics literature and is a particular case of the general formula in [16]:
$$A_\varepsilon(t) = \lim_{\eta \searrow 0} \frac{1}{\pi} \int_{-\infty}^{\infty} e^{-itx}\, \mathrm{Im}\, F(x + i\eta, \varepsilon)^{-1}\, dx \tag{2.6}$$
with
$$F(z, \varepsilon) = E_0 - z - \varepsilon^2 g(z), \tag{2.7}$$
where
$$g(z) = \langle \Psi_0, W Q^* (H_{op} - z)^{-1} Q W \Psi_0 \rangle, \tag{2.8}$$
and $Q$ is the orthogonal projection on $\mathcal{H}_{op}$, considered as a map from $\mathcal{H}$ to $\mathcal{H}_{op}$. Since we are interested in the form of $A_\varepsilon(t)$, when $E_0$ is near a threshold of $H_{op}$, we shall assume that 0 is a threshold of $H_{op}$, and that $E_0$ is close to zero. The following assumption is imposed in the sequel and will not be repeated. Condition (iii) is imposed to exclude the trivial case.

Assumption 2.1. (i) There exists $a > 0$, such that $(-a, 0) \subset \rho(H_{op})$ (the resolvent set) and $[0, a] \subset \sigma_{ess}(H_{op})$. (ii) $|E_0| \le \frac12$. (iii) We have $Q W \Psi_0 \neq 0$.

From Assumption 2.1 and (2.8) we get the following result.

Proposition 2.2. (i) $g(z)$ is analytic in $\mathbb{C} \setminus \{(-\infty, -a] \cup [0, \infty)\}$. (ii) $\overline{g(z)} = g(\bar z)$. (iii) $g(z)$ is strictly increasing on $(-a, 0)$. (iv) $\mathrm{Im}\, g(z) > 0$ for $\mathrm{Im}\, z > 0$.

The aim of this paper is to consider at the rigorous mathematical physics level the problem of the decay law, for the case that $E_0$ is tuned past the threshold. For that purpose we need assumptions about the behavior of the function $g(z)$
February 10, J070-S0129055X11004230
V. Dinu, A. Jensen & G. Nenciu
in the neighborhood of the origin. In stating this assumption we use the notation from [8, 16] to facilitate reference to those papers.

Assumption 2.3. For $\operatorname{Re}\kappa \ge 0$ and $z \in \mathbb{C}\setminus[0,\infty)$ we let
$$\kappa = -i\sqrt{z}, \qquad z = -\kappa^2. \tag{2.9}$$
Let for $a > 0$
$$D_a = \{z \in \mathbb{C}\setminus[0,\infty) \mid |z| < a\}. \tag{2.10}$$
Then for $z \in D_a$
$$g(z) = \sum_{j=-1}^{4} \kappa^j g_j + \kappa^5 r(\kappa), \tag{2.11}$$
$$\frac{d}{dz} g(z) = -\frac{1}{2\kappa}\sum_{j=-1}^{4} j\kappa^{j-1} g_j + \kappa^3 s(\kappa), \tag{2.12}$$
$$\sup_{z\in D_a} \{|r(\kappa)|, |s(\kappa)|\} < \infty. \tag{2.13}$$
Furthermore, we assume that $\lim_{\operatorname{Im} z\searrow 0} (g(z) - g_{-1}\kappa^{-1})$ exists and is continuous on $(-a, a)$.

As already explained, Assumption 2.3 includes the case when $H_{\mathrm{op}} = -\Delta + V_{\mathrm{op}}$ in odd dimensions. The expansions for the resolvent of $-\Delta + V_{\mathrm{op}}$ leading to (2.11) are provided in [13–17, 26]. Taking into account that (at least formally)
$$\frac{d}{dz} g(z) = \langle \Psi_0, WQ^*(H_{\mathrm{op}}-z)^{-2}QW\Psi_0\rangle,$$
the result (2.12) can be derived in the same manner. More precisely, it can be shown that the expansion (2.11) is differentiable, see [14, 26, 33]. Examples of expansions with the corresponding explicit expressions for the coefficients $g_j$ are given in [16, Appendix], with references to the literature.

Since the form of the decay law depends strongly upon the behavior of $g(z)$ near 0, we divide the considerations into three cases.

(i) The singular case, in which $g_{-1} \neq 0$. In the Schrödinger case this corresponds to the situation when $H_{\mathrm{op}}$ has a zero resonance at the threshold (see, e.g., [14, 16]). Let us recall that the free particle in one dimension belongs to this class. From Proposition 2.2(iv) it follows that
$$g_{-1} > 0. \tag{2.14}$$
(ii) The regular case, in which $g_{-1} = 0$ and $g_1 \neq 0$. We note that $g_{-1} = 0$ is the generic case for Schrödinger operators in one and three dimensions. Again, from Proposition 2.2(iv), one has
$$g_1 < 0. \tag{2.15}$$
Let us remark that the behavior $\operatorname{Im} g(x + i0) \sim x^{1/2}$ as $x \to 0$ is nothing but the famous Wigner threshold law [30, 23].
(iii) The smooth case, in which $g_{-1} = g_1 = 0$. This case occurs for free Schrödinger operators in odd dimensions larger than three, and in the spherically symmetric case for partial waves $\ell \ge 1$, see [16, 17]. Notice that in this case $\frac{d}{dz} g(z)$ is uniformly bounded in $D_a$.

Let for $x \in (0, a)$,
$$\lim_{\eta\searrow 0} F(x + i\eta, \varepsilon) = F(x + i0, \varepsilon) = R(x, E_0, \varepsilon) + iI(x, E_0, \varepsilon). \tag{2.16}$$
Due to (2.7) and Assumption 2.3, $R(x, E_0, \varepsilon)$ is continuous and strictly decreasing, for sufficiently small $\varepsilon$. Thus for $\varepsilon$ and $E_0$ small enough, with $E_1 = E_0 - g_0\varepsilon^2 > 0$, the equation $R(x, E_0, \varepsilon) = 0$ has a unique solution $x_0(E_0, \varepsilon)$ on $(0, a)$:
$$R(x_0(E_0, \varepsilon), E_0, \varepsilon) = 0. \tag{2.17}$$
To simplify the notation we omit the dependence of $R(x, E_0, \varepsilon)$, $I(x, E_0, \varepsilon)$, and $x_0(E_0, \varepsilon)$ on $E_0$ and $\varepsilon$. Throughout the paper, $H_{\mathrm{op}}$ and $W$ are kept fixed, while $E_0$ and $\varepsilon$ are parameters; $\varepsilon$ is positive and small, and $E_0$ is tuned past the threshold, i.e. takes values in a neighborhood of the origin. A finite number of constants will appear; they are strictly positive, finite and independent of the parameters $\varepsilon$ and $E_0$. We introduce the following notation:

Notation.
(i) $A \lesssim B$ means that there exists a constant $c$ such that $A \le cB$. An analogous definition holds for $A \gtrsim B$.
(ii) $A \simeq B$ means that both $A \lesssim B$ and $A \gtrsim B$ hold.
(iii) $A \cong B$ means that $A$ and $B$ are equal to leading order in a parameter, e.g. $A = B + \delta(\varepsilon)$ with $\lim_{\varepsilon\searrow 0} \delta(\varepsilon) = 0$.

3. Heuristics and the Results

We first give the heuristics, and then we state our results.

3.1. Heuristics

For $E_0$ outside a small (possibly $\varepsilon$-dependent) neighborhood of the origin, the situation is well understood, both at the heuristic level and at the rigorous level. Indeed, for negative $E_0$, using analytic perturbation theory, one can show that
$$|A_\varepsilon(t) - e^{-itE_\varepsilon}| \lesssim \varepsilon^2, \tag{3.1}$$
where $E_\varepsilon$ is the perturbed eigenvalue, which coincides with $E_0$ in the limit $\varepsilon \to 0$. As a consequence, the survival probability remains close to one uniformly in time. On heuristic grounds, if $E_0$ is positive, i.e. embedded in the essential spectrum of $H_{\mathrm{op}}$, $\Psi_0$ turns into a metastable decaying state. The main problem is to compute the “decay law”, i.e. $|A_\varepsilon(t)|^2$, up to error terms vanishing in the limit $\varepsilon \to 0$.
For eigenvalues embedded in the continuum spectrum the heuristics for the exponential decay law (see footnote b) $|A_\varepsilon(t)|^2 \cong e^{-2\Gamma(\varepsilon)t}$ runs as follows. Suppose $F(z, \varepsilon)$ is sufficiently smooth as $z$ approaches the real line from above, with boundary value $F(x + i0, \varepsilon)$, for $x$ in a neighborhood of $E_0$. Let $F(x + i0, \varepsilon) = R(x) + iI(x)$. Then the equation $R(x) = 0$ has a solution $x_0$ near $E_0$. The idea is that the main contribution to the integral in (2.6) comes from the neighborhood of $x_0$, and in this neighborhood
$$F(x + i0, \varepsilon) \cong x_0 - x + iI(x_0), \tag{3.2}$$
and then
$$\operatorname{Im} F(x, \varepsilon)^{-1} \cong \frac{-I(x_0)}{(x - x_0)^2 + I(x_0)^2}, \tag{3.3}$$
i.e. it has a Lorentzian peak shape leading to
$$|A_\varepsilon(t)|^2 \cong e^{-2|I(x_0)|t}. \tag{3.4}$$
For mathematical substantiation of this heuristics in the case where either $E_0 > 0$ (embedded eigenvalues) or $E_0 = 0$ (threshold eigenvalues) but the perturbation “pushes” the eigenvalue sufficiently “far” into the continuum spectrum such that $x_0 \simeq \varepsilon$, we send the reader to [16] and references therein. In the cases where the resolvent has an analytic continuation through the positive semi-axis, $z_0 \cong x_0 + iI(x_0)$ is nothing but the position of the “resonance pole”, and $x_0$ and $-I(x_0)$ are called the resonance position and width, respectively. In this setting, the exponential decay law comes from the resonance pole contribution, while the error term comes from the contribution of the “background” integral, see [12]. It is well known that, irrespective of the approach, the main technical difficulty is to estimate the error term.

The problem with the energies near the threshold is that $F(x + i0, \varepsilon)$ might not be smooth and can even blow up (see Assumption 2.3), if the open channel has a zero resonance at the threshold. Then a Lorentzian approximation might break down. For the case at hand, elaborating on a heuristic argument in [20], one can quantify at the heuristic level how far from the origin $x_0 > 0$ must be in order to have a chance for an exponential decay law: the contribution of the tail at negative $x$ of the Lorentzian must be negligible. Since
$$\int_{-\infty}^{0} \frac{|I(x_0)|}{(x - x_0)^2 + I(x_0)^2}\,dx \simeq \frac{|I(x_0)|}{x_0}, \tag{3.5}$$
one gets the condition
$$|I(x_0)| \ll x_0. \tag{3.6}$$

Footnote b. A better name is probably quasi-exponential decay law, in order to emphasize the fact that the equality is up to errors vanishing as $\varepsilon \to 0$.
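The tail estimate (3.5) can be checked in closed form. The following is a minimal sketch, not part of the original argument: the function names and the symbol `width` (standing for $|I(x_0)|$) are illustrative assumptions. It uses the elementary identity $\int_{-\infty}^{0} \frac{w}{(x-x_0)^2+w^2}\,dx = \arctan(w/x_0)$ for $x_0, w > 0$.

```python
import math


def negative_tail_weight(x0: float, width: float) -> float:
    """Fraction of the normalized Lorentzian (1/pi) * width / ((x - x0)^2 + width^2)
    that lies on the negative half-axis, for x0 > 0 and width > 0.

    'width' plays the role of |I(x0)|; the name is an assumption of this sketch.
    """
    # Closed form of the integral over (-inf, 0]: arctan(width / x0) / pi.
    return math.atan2(width, x0) / math.pi


def small_width_estimate(x0: float, width: float) -> float:
    """Leading-order value width / (pi * x0), valid when width << x0."""
    return width / (math.pi * x0)


if __name__ == "__main__":
    for ratio in (1.0, 0.1, 0.01):
        exact = negative_tail_weight(1.0, ratio)
        approx = small_width_estimate(1.0, ratio)
        print(f"|I(x0)|/x0 = {ratio:5.2f}: tail = {exact:.6f}, estimate = {approx:.6f}")
```

When the width is comparable to $x_0$ a quarter of the Lorentzian weight sits at negative energies, which is the situation in which the heuristics no longer predicts exponential decay.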
Consider first the condition (3.6) in the singular case. For $x > 0$ small enough $I(x) \cong -g_{-1}\varepsilon^2 x^{-1/2}$, and the condition (3.6) gives $g_{-1}\varepsilon^2 x_0^{-1/2} \ll x_0$, i.e.
$$x_0 \gg \varepsilon^{4/3}. \tag{3.7}$$
If we take (by adjusting $E_0$!)
$$x_0 = b\varepsilon^p, \tag{3.8}$$
then one obtains, for $0 \le p < 4/3$, the exponential decay law (see (3.4))
$$|A_\varepsilon(t)|^2 \cong e^{-2g_{-1}b^{-1/2}\varepsilon^{2-p/2}t}. \tag{3.9}$$
Notice that for $p = 0$ (i.e. the resonance stays away from the threshold as $\varepsilon \to 0$), (3.9) is nothing but the usual Fermi Golden Rule (FGR) formula. However, for $p > 0$ but not very large (i.e. the resonance position approaches zero as $\varepsilon \to 0$, but not too fast) one gets a “modified FGR formula” for which the $\varepsilon$-dependence of the resonance width is $\varepsilon^{2-p/2}$ instead of the usual $\varepsilon^2$-dependence.

For the regular case, a similar argument leads to the condition
$$x_0 \gg \varepsilon^4, \tag{3.10}$$
and a decay law
$$|A_\varepsilon(t)|^2 \cong e^{-2|g_1|b^{1/2}\varepsilon^{2+p/2}t}. \tag{3.11}$$
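As a small illustration of how the windows (3.7) and (3.10) arise, the sketch below evaluates the ratio $|I(x_0)|/x_0$ from condition (3.6) under the scaling (3.8), using only the leading-order widths quoted above. The function name, the default values of `b` and `g`, and the string labels are assumptions made for this example.

```python
def width_to_position_ratio(eps: float, p: float, b: float = 1.0,
                            g: float = 1.0, case: str = "singular") -> float:
    """Ratio |I(x0)| / x0 for x0 = b * eps**p, with the leading-order width.

    singular case: |I(x0)| ~ g * eps^2 * x0^(-1/2)   (g stands for g_{-1})
    regular case:  |I(x0)| ~ g * eps^2 * x0^(+1/2)   (g stands for |g_1|)
    """
    x0 = b * eps ** p
    if case == "singular":
        width = g * eps ** 2 * x0 ** -0.5
    elif case == "regular":
        width = g * eps ** 2 * x0 ** 0.5
    else:
        raise ValueError("case must be 'singular' or 'regular'")
    return width / x0


if __name__ == "__main__":
    # The ratio tends to 0 (condition (3.6) holds) below the critical exponent
    # and blows up above it: p = 4/3 (singular) and p = 4 (regular).
    for case, p_values in (("singular", (1.0, 2.0)), ("regular", (3.0, 5.0))):
        for p in p_values:
            ratios = [width_to_position_ratio(e, p, case=case) for e in (1e-2, 1e-4)]
            print(case, p, ratios)
```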
Finally, in the smooth case the condition (3.6) reads
$$\varepsilon^2 x_0^{1/2} \ll 1, \tag{3.12}$$
which holds true irrespective of how close to zero $x_0$ is. In other words, in the smooth case, one observes an exponential decay law (with a resonance width vanishing as $x_0 \to 0$), as the resonance position is tuned past the threshold, via the tuning of the eigenvalue $E_0$.

3.2. Reduction to the case g2 = 0

We argue that it is sufficient to consider the case when $g_2 = 0$, i.e. (2.11) is replaced with
$$g(z) = \sum_{\substack{j \ge -1 \\ j \ne 2}}^{N} \kappa^j g_j + \kappa^{N+1} r(\kappa), \tag{3.13}$$
which leads to a significant simplification of the proofs. Indeed, let
$$\tilde F(z, \varepsilon) = \frac{F(z, \varepsilon)}{1 - \varepsilon^2 g_2}. \tag{3.14}$$
On the one hand, notice that if
$$\tilde A_\varepsilon(t) = \lim_{\eta\searrow 0}\frac{1}{\pi}\int_{-\infty}^{\infty} e^{-itx}\,\operatorname{Im} \tilde F(x + i\eta, \varepsilon)^{-1}\,dx, \tag{3.15}$$
then for sufficiently small $\varepsilon$ (e.g., $|\varepsilon^2 g_2| \le \frac{1}{2}$) we have
$$|\tilde A_\varepsilon(t) - A_\varepsilon(t)| \lesssim \varepsilon^2. \tag{3.16}$$
On the other hand,
$$\tilde F(z, \varepsilon) = \tilde E_1 - z - \varepsilon^2(\tilde g_{-1}\kappa^{-1} + \tilde g_1\kappa + \tilde g_3\kappa^{3} + \cdots), \tag{3.17}$$
with $\tilde E_1 = \frac{E_0 - \varepsilon^2 g_0}{1 - \varepsilon^2 g_2}$ and $\tilde g_j = \frac{g_j}{1 - \varepsilon^2 g_2}$, which is exactly of the same form as $F(z, \varepsilon)$, but without the linear term in the expansion of $g(z)$, and with the other coefficients slightly “renormalized”. In the sequel we consider only $F(z, \varepsilon)$ with $g(z)$ satisfying (3.13).

3.3. Resonance and bound state positions

Summing up, the heuristics predicts that if the resonance position is outside an energy window of size $\varepsilon^{4/3}$ and $\varepsilon^4$ in the singular and regular case, respectively, then the decay law is exponential with a decay rate depending on how rapidly the resonance position approaches zero as $\varepsilon \to 0$. Moreover, it suggests that for the resonance position inside the above energy windows, the decay law is not exponential, but gives no hint about its actual form.

We proceed to the rigorous substantiation of the above heuristics, and to the derivation of decay laws. As the heuristics suggests, the zeroes of $R(x)$ (which for $x < 0$ coincide with those of $F(x + i0, \varepsilon)$) play a central role. The zero on the positive semi-axis, $x_0$, gives the resonance position, while the zero on the negative semi-axis, $x_b$, gives the position of the bound state. The propositions below give estimates on $x_0$ and $x_b$ in terms of the parameters appearing in $F(x + i0, \varepsilon)$, as given by (2.7) and (2.8). In the remainder of this paper we shall take $a > 0$ small enough, such that Assumption 2.1 holds true, and in addition the terms in $g(z)$ with $j \ge 3$ can be treated as perturbations. Let
$$E_1 = E_0 - \varepsilon^2 g_0 \quad\text{with}\quad |E_1| \le \frac{a}{2}. \tag{3.18}$$

Proposition 3.1. For $E_1 > 0$ and $\varepsilon > 0$ sufficiently small, the equation $R(x) = 0$ on $(0, a)$ has a unique solution $x_0$, and
$$x_0 = E_1 + O(\varepsilon^2 x_0^2). \tag{3.19}$$
In particular,
$$\lim_{E_1\searrow 0} x_0 = 0. \tag{3.20}$$
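Proposition 3.1 can be illustrated numerically. The sketch below assumes the form $R(x) = E_1 - x - \varepsilon^2 x^2 f_1(x)$ used in the proof (Sec. 4.1) and a purely hypothetical bounded choice $f_1(x) = 1 + x$; the function name and the default $a = 0.5$ are also assumptions of this example, not values from the paper.

```python
from typing import Callable


def resonance_position(E1: float, eps: float, a: float = 0.5,
                       f1: Callable[[float], float] = lambda x: 1.0 + x) -> float:
    """Solve R(x) = E1 - x - eps^2 x^2 f1(x) = 0 on (0, a) by bisection.

    f1 is a hypothetical stand-in for the bounded remainder in (4.1);
    R must be strictly decreasing on (0, a), which holds for the default f1.
    """
    def R(x: float) -> float:
        return E1 - x - eps ** 2 * x ** 2 * f1(x)

    lo, hi = 0.0, a
    if not (R(lo) > 0.0 > R(hi)):
        raise ValueError("need R(0) = E1 > 0 and R(a) < 0")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if R(mid) > 0.0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies to the left of mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    eps = 0.1
    for E1 in (0.1, 0.01, 0.001):
        x0 = resonance_position(E1, eps)
        print(f"E1 = {E1}: x0 = {x0:.12f}, (E1 - x0) / (eps^2 x0^2) = "
              f"{(E1 - x0) / (eps ** 2 * x0 ** 2):.6f}")
```

The printed ratio stays bounded as $E_1$ decreases, in line with (3.19), and $x_0 \to 0$ as $E_1 \searrow 0$, in line with (3.20).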
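Proposition 3.2. (i) Assume $g_{-1} \neq 0$. Then for $\varepsilon > 0$ sufficiently small (and irrespective of the value of $E_1$) the equation $F(x, \varepsilon) = 0$ has a unique solution $x_b$ on $(-a, 0)$ and
$$|x_b| \simeq \begin{cases} \varepsilon^{4/3}, & \text{if } 0 \le E_1 \lesssim \varepsilon^{4/3}, \\[2pt] \dfrac{\varepsilon^4}{E_1^2}, & \text{if } E_1 \gtrsim \varepsilon^{4/3}, \end{cases} \tag{3.21}$$
$$|x_b| \lesssim \varepsilon^{4/3} + |E_1|, \quad \text{for } E_1 \le 0. \tag{3.22}$$
(ii) Assume $g_{-1} = 0$, $g_1 \neq 0$. Then for $E_1 \ge 0$, the equation $F(x, \varepsilon) = 0$ has no solutions on $(-\infty, 0)$. For $-a/2 \le E_1 < 0$ and $\varepsilon$ sufficiently small the equation $F(x, \varepsilon) = 0$ has a unique solution $x_b$ on $(-a, 0)$ and
$$|x_b| \le |E_1|. \tag{3.23}$$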
3.4. Some previous results

In the case where $F(x + i0, \varepsilon)$ is sufficiently smooth in a neighborhood of $x_0$, the mathematical substantiation of the quasi-exponential decay law (i.e. exponential decay up to errors vanishing as $\varepsilon \to 0$) follows from the results in [16] (see further references in this paper). In particular, for the smooth case, as well as for $x_0 \simeq \varepsilon$ (i.e. $p = 1$ in (3.8)) in the singular and regular case, one still has (quasi)-exponential decay. Let us stress here that in these cases the ansatz (3.2) is nothing but the approximation of $F(z, \varepsilon)$ with a linear function $L(z) = \alpha + i\beta - z$, where the constants $\alpha$ and $\beta$ are fixed by the condition that $F$ and $L$ coincide at $x_0(\varepsilon)$:
$$F(x_0 + i0, \varepsilon) = L(x_0 + i0). \tag{3.24}$$
In the (non-smooth) threshold case it has been proved in [8] that indeed in some energy windows, which depend on the spectral properties of the unperturbed Hamiltonian at the threshold, the decay law is definitely non-exponential for all times. The main idea in [8] is that in the neighborhood of $z = 0$ one can replace $F(z, \varepsilon)$ by the following model function
$$F(z, \varepsilon) \cong E_0 - z - \varepsilon^2 \sum_{j=-1}^{N} \kappa^j g_j \equiv H(z, \varepsilon), \tag{3.25}$$
which leads to non-exponential decay laws. As an example we reproduce below the main result in [8] for the regular case. In this case the model function is (3.25) with $N = 2$ (and $g_{-1} = 0$),
$$H_r(z, \varepsilon) = E_0 - z - \varepsilon^2(g_0 - ig_1\sqrt{z} - g_2 z) = d\,(\tilde E - z + i\varepsilon^2\tilde g_1\sqrt{z}), \tag{3.26}$$
where
$$d = 1 - \varepsilon^2 g_2, \qquad \tilde E = \frac{E_0 - \varepsilon^2 g_0}{1 - \varepsilon^2 g_2} \qquad\text{and}\qquad \tilde g_1 = \frac{g_1}{1 - \varepsilon^2 g_2}. \tag{3.27}$$
It is assumed that $\varepsilon$ is sufficiently small, such that $d$ is close to one.
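Theorem 3.3 ([8, Theorem 2.8]). Suppose $\tilde E \in [-a/2, (c/2)\varepsilon^{3/4}]$ for some $c > 0$. Then for all $t \ge 0$, and for sufficiently small $\varepsilon$, we have the following results.
(i) For $\tilde E \ge 0$ we have
$$\left| A_\varepsilon(t) - \frac{1}{\pi}\int_0^{\infty} \frac{y^{1/2}}{(\tilde f - y)^2 + y}\, e^{-i\tilde s y}\,dy \right| \lesssim \varepsilon^{4/3}, \tag{3.28}$$
where
$$\tilde s = (\varepsilon^2 \tilde g_1)^2 t \qquad\text{and}\qquad \tilde f = (\varepsilon^2 \tilde g_1)^{-2}\tilde E. \tag{3.29}$$
(ii) For $\tilde E \le 0$ we have
$$\left| A_\varepsilon(t) - \frac{\sqrt{1 + 4|\tilde f|} - 1}{\sqrt{1 + 4|\tilde f|}}\, e^{-itx_b} - \frac{1}{\pi}\int_0^{\infty} \frac{y^{1/2}}{(\tilde f - y)^2 + y}\, e^{-i\tilde s y}\,dy \right| \lesssim \varepsilon^{4/3}. \tag{3.30}$$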
Remark 3.4. In terms of $\tilde f$ the scaling in (3.8) can be written as
$$x_0 \cong \tilde f\,\varepsilon^{p-4}, \tag{3.31}$$
so according to the heuristics $\tilde f = \text{const.}$ is just the borderline between exponential and non-exponential decay laws. This is substantiated by the numerical computations presented in [8], as well as by the rigorous results in [16] for $p = 1$, i.e. $\tilde f \simeq \varepsilon^{-3}$.

Remark 3.5. Again in terms of the scaling (3.8), the interval $p \in (0, 3/4)$ is not covered by the results in [8]. One of the main goals here is to fill this gap, in order to have a complete picture of the crossover from exponential to non-exponential decay laws.

3.5. The model functions

We recall first that in the case of embedded eigenvalues (i.e. $p = 0$) the “model function” approximating $F(z, \varepsilon)$ is the linear approximation (3.2), determined by the condition (3.24). The ansatz we shall adopt in this paper for the “model function” approximating $F(z, \varepsilon)$ for all $p \in (0, \infty)$, see (3.8), is to replace $F(z, \varepsilon)$ by a function, $H(z)$, resembling the expansion of $F(z, \varepsilon)$ around the threshold, whose free parameters are fixed by a condition similar to (3.24). More precisely, in the singular case, $g_{-1} \neq 0$,
$$H_s(z) = \alpha - z - \varepsilon^2\beta\kappa^{-1}, \tag{3.32}$$
and in the regular case, $g_{-1} = 0$, $g_1 \neq 0$,
$$H_r(z) = \alpha - z + \varepsilon^2\beta\kappa. \tag{3.33}$$
The signs in (3.32) and (3.33) are chosen to ensure that in all cases $\beta \ge 0$. The condition $\beta > 0$ should be compared with the Fermi Golden Rule condition, imposed in the case of an embedded eigenvalue ($p = 0$ case).
Thus in both cases there are two real parameters $\alpha$ and $\beta$ to be determined. In the case $E_1 \ge 0$ the condition used is $F(x_0, \varepsilon) = H_\iota(x_0)$, $\iota \in \{s, r\}$. Equality of the real and imaginary parts gives the two equations used to determine $\alpha$ and $\beta$. In the case $E_1 < 0$ the functions $F(x)$ and $H(x)$ are real-valued. The conditions used are $F(x_b, \varepsilon) = H_\iota(x_b)$, $\iota \in \{s, r\}$, together with
$$\frac{d}{dx} F(x_b, \varepsilon) = \frac{d}{dx} H_\iota(x_b), \quad \iota \in \{s, r\}. \tag{3.34}$$
Thus in the case $E_1 < 0$ our conditions determining $\alpha$ and $\beta$ give as a result that the residues at the pole $x_b$ of $\frac{1}{F(x, \varepsilon)}$ and $\frac{1}{H_\iota(x)}$, $\iota \in \{s, r\}$, are equal.

It is clear from this discussion that the parameters $\alpha$ and $\beta$ take values depending on which case is being considered. These values will be given in connection with the proofs, since they are only needed there.

Remark 3.6. Let us note that the determination of $\alpha$ leads to
$$\alpha = \begin{cases} x_0, & \text{for } E_1 \ge 0, \\ x_b + \text{small term}, & \text{for } E_1 < 0. \end{cases} \tag{3.35}$$
The precise form of the small term for $E_1 < 0$ depends on the case being considered.

3.6. Main results; error analysis

We are now in a position to formulate the main technical results of this paper: For $\varepsilon$ sufficiently small and $E_0$ in an $\varepsilon$-independent neighborhood of the threshold, the error in $A_\varepsilon(t)$ due to the replacement of $F(z, \varepsilon)$ with the model functions $H_\iota(z)$, $\iota \in \{s, r\}$ (as given by (3.32)–(3.35)) vanishes in the limit $\varepsilon \to 0$. In other words, we have to control
$$\left| A_\varepsilon(t) - \lim_{\eta\searrow 0}\frac{1}{\pi}\int_{-\infty}^{\infty} e^{-itx}\,\operatorname{Im} H_\iota(x + i\eta)^{-1}\,dx \right|$$
as $\varepsilon \to 0$. The contribution of the negative semi-axis in $\lim_{\eta\searrow 0}\frac{1}{\pi}\int_{-\infty}^{\infty} e^{-itx}\,\operatorname{Im} H_\iota(x + i\eta)^{-1}\,dx$ is just the residue at the zero, $\tilde x_b$, of $H_\iota(z)$ (when it exists) and equals $-\frac{1}{\frac{d}{dx}H_\iota(\tilde x_b)}\,e^{-it\tilde x_b}$. Accordingly
$$\lim_{\eta\searrow 0}\frac{1}{\pi}\int_{-\infty}^{\infty} e^{-itx}\,\operatorname{Im} H_\iota(x + i\eta)^{-1}\,dx = -\frac{1}{\frac{d}{dx}H_\iota(\tilde x_b)}\,e^{-it\tilde x_b} + \hat A_{\varepsilon,\iota}(t) \tag{3.36}$$
with
$$\hat A_{\varepsilon,\iota}(t) \equiv \frac{1}{\pi}\int_0^{\infty} e^{-itx}\,\operatorname{Im} H_\iota(x + i0)^{-1}\,dx, \quad \iota \in \{s, r\}. \tag{3.37}$$
Notice that by definition of $H_\iota(z)$, for $E_1 < 0$, we have $\tilde x_b = x_b$, and the contributions of the negative semi-axis in $\lim_{\eta\searrow 0}\frac{1}{\pi}\int_{-\infty}^{\infty} e^{-itx}\,\operatorname{Im} H_\iota(x + i\eta)^{-1}\,dx$ and in $\lim_{\eta\searrow 0}\frac{1}{\pi}\int_{-\infty}^{\infty} e^{-itx}\,\operatorname{Im} F(x + i\eta, \varepsilon)^{-1}\,dx$ are equal. For $\iota = s$, $x_b$ exists also for
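$E_1 \ge 0$, and in this case $\tilde x_b \neq x_b$, but still as $\varepsilon \to 0$, we have (see the proof of Theorem 3.8)
$$\frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \cong \frac{1}{\frac{d}{dx}F(x_b, \varepsilon)}. \tag{3.38}$$

Theorem 3.7 (Regular Case). Let $g_{-1} = 0$, $g_1 \neq 0$. There exists $c$, $\frac{a}{2} \ge c > 0$, such that for sufficiently small $\varepsilon$, $|E_0| \le c$, and all $t \ge 0$, we have
$$\left| A_\varepsilon(t) - \lim_{\eta\searrow 0}\frac{1}{\pi}\int_{-\infty}^{\infty} e^{-itx}\,\operatorname{Im} H_r(x + i\eta)^{-1}\,dx \right| \lesssim \begin{cases} \varepsilon^2(1 + x_0^{1/2}|\ln\varepsilon|), & \text{for } E_1 > 0, \\ \varepsilon^2, & \text{for } E_1 \le 0. \end{cases}$$

Theorem 3.8 (Singular Case). Let $g_{-1} \neq 0$. There exists $c$, $\frac{a}{2} \ge c > 0$, such that for sufficiently small $\varepsilon$, $|E_0| \le c$, and all $t \ge 0$, we have
(i) Let $E_1 \ge 0$, and let $\tilde x_b$ be the unique solution of $H_s(x) = 0$ on the negative semi-axis. Then
$$\left| A_\varepsilon(t) + \frac{1}{\frac{d}{dx}H_s(\tilde x_b)}\,e^{-itx_b} - \lim_{\eta\searrow 0}\frac{1}{\pi}\int_0^{\infty} e^{-itx}\,\operatorname{Im} H_s(x + i\eta)^{-1}\,dx \right| \lesssim \begin{cases} \dfrac{\varepsilon^2}{x_0^{1/2}}|\ln\varepsilon|, & \text{for } x_0 \gtrsim \varepsilon^{4/3}, \\[4pt] \varepsilon^{4/3}, & \text{for } x_0 \lesssim \varepsilon^{4/3}. \end{cases} \tag{3.39}$$
(ii) Let $E_1 < 0$. Then
$$\left| A_\varepsilon(t) + \frac{1}{\frac{d}{dx}H_s(x_b)}\,e^{-itx_b} - \lim_{\eta\searrow 0}\frac{1}{\pi}\int_0^{\infty} e^{-itx}\,\operatorname{Im} H_s(x + i0)^{-1}\,dx \right| \lesssim \begin{cases} \dfrac{\varepsilon^2}{|E_1|^{1/2}}, & \text{for } |E_1| \gtrsim \varepsilon^{4/3}, \\[4pt] \varepsilon^{4/3}, & \text{for } |E_1| \lesssim \varepsilon^{4/3}. \end{cases} \tag{3.40}$$

Remark 3.9. For the smooth case ($g_{-1} = g_1 = 0$), see [8, Theorem 2.10]. This result gives an exponential decay law irrespective of the value of $E_0$.

Remark 3.10. We compare Theorem 3.3 with Theorem 3.7. The latter result is valid in an $\varepsilon$-independent neighborhood of zero, and has a better error estimate. This is due to the choice of $\alpha$ and $\beta$. The disadvantage is that these coefficients are not given in terms of expansion coefficients in $g(z)$, but are solutions to equations, which can be solved by perturbative methods.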
3.7. Error function analysis: Crossover from exponential to non-exponential decay laws

As shown in the previous section, the bound state contribution has a simple form. Thus it remains to compute $\hat A_{\varepsilon,\iota}(t)$ as given by (3.37). Since $H_\iota$, $\iota \in \{s, r\}$, have a simple functional form with only two free parameters, the integral on the right-hand side of (3.37) can be evaluated numerically or expressed in closed form in terms of some special functions. Some examples of numerical computations substantiating the heuristics presented in previous subsections have been presented in [8], but a detailed analytical study of the asymptotics was postponed. One of the main goals of this paper is to perform this analysis. The main point is that $\hat A_\varepsilon(t)$, as well as its asymptotics, can be expressed in terms of some special functions
$$I_1(z) = \frac{2z}{i\pi}\int_0^{\infty} \frac{e^{-ix^2}}{x^2 - z^2}\,dx, \tag{3.41}$$
$$I_p(z) = \frac{1}{(p-1)!}\frac{d^{p-1}}{dz^{p-1}} I_1(z), \quad p = 2, 3, \ldots, \tag{3.42}$$
closely related to the error function (see the Appendix for a detailed study of $I_p(z)$).

3.7.1. The regular case

We begin with the (simpler) regular case. Here $H_r(z) = \alpha - z + \varepsilon^2\beta\kappa$, where $\alpha$ lies in a small neighborhood of the origin and $\beta > 0$. Let
$$\hat A_{\varepsilon,r}(t) \equiv \frac{1}{\pi}\int_0^{\infty} e^{-itx}\,\operatorname{Im} H_r(x + i0)^{-1}\,dx. \tag{3.43}$$
Passing to the variable $k = \sqrt{z} = i\kappa$ we get
$$\hat A_{\varepsilon,r}(t) = \frac{1}{i\pi}\int_{-\infty}^{\infty} \frac{e^{-itk^2}}{P_r(\kappa)}\,k\,dk, \tag{3.44}$$
where
$$P_r(\kappa) = \kappa^2 + \varepsilon^2\beta\kappa + \alpha. \tag{3.45}$$
The integral on the right-hand side of (3.44) is to be understood as $\lim_{A\to\infty}\int_{-A}^{A} \frac{e^{-itk^2}}{P_r(\kappa)}\,k\,dk$. When the zeroes of $P_r$ are distinct, a partial fraction decomposition yields the following result.

Proposition 3.11 (Regular Case).
$$\hat A_{\varepsilon,r}(t) = -\sum_{j=1}^{2} q_j I_1(i\kappa_j\sqrt{t}), \tag{3.46}$$
where
$$\kappa_j = \frac{1}{2}\left(-\varepsilon^2\beta - (-1)^j\sqrt{\varepsilon^4\beta^2 - 4\alpha}\right), \quad j = 1, 2, \tag{3.47}$$
are the roots of $P_r(\kappa)$, and
$$q_j = \frac{1}{2}\left(1 + (-1)^j\frac{\varepsilon^2\beta}{\sqrt{\varepsilon^4\beta^2 - 4\alpha}}\right), \quad j = 1, 2, \tag{3.48}$$
are the corresponding residues of $\frac{\kappa}{P_r(\kappa)}$.

Remark 3.12. For $\alpha = \frac{\beta^2\varepsilon^4}{4}$, the $\kappa_j$, $j = 1, 2$, coincide and the coefficients $q_j$ become infinite. One can show that as $\alpha \to \frac{\beta^2\varepsilon^4}{4}$ the formula (3.46) has a limit
$$\hat A_{\varepsilon,r}(t) = iI_3\!\left(i\frac{\beta\varepsilon^2}{2}\sqrt{t}\right). \tag{3.49}$$
We shall not make use of (3.49) in what follows, since the case $\alpha = \frac{\beta^2\varepsilon^4}{4}$ belongs to the crossover regime (see below), when $\hat A_{\varepsilon,r}$ is given also by (3.67), which has been analyzed in [8].

As a consequence of Proposition 3.11, we can now discuss the various regimes.

3.7.1.1. The exponential regime

According to the heuristics, if we set
$$\alpha = b\varepsilon^p, \quad b > 0, \tag{3.50}$$
then for
$$p \in [0, 4) \tag{3.51}$$
the decay law is still exponential. Fix $p \in [0, 4)$. Notice that as $\varepsilon \to 0$
$$\frac{\varepsilon^2}{\sqrt{\alpha}} \simeq \varepsilon^{2-\frac{p}{2}} \tag{3.52}$$
and then (see Remark 3.12) one can use Proposition 3.11. In this case there is no bound state contribution. Using the properties of $I_p(z)$ (see the Appendix), one obtains

Proposition 3.13 (Regular Case). (i) Using (3.47) and (3.48), we have
$$\hat A_{\varepsilon,r}(t) = 2q_2 e^{i\kappa_2^2 t} - \frac{\beta}{2}\frac{\varepsilon^2}{\sqrt{\alpha}}\, I_3(i\sqrt{\alpha t}) + O(\varepsilon^{4-p}), \tag{3.53}$$
and up to error terms as in Theorem 3.7, we have
$$A_\varepsilon(t) = \hat A_{\varepsilon,r}(t). \tag{3.54}$$
(ii) For $p \in [0, 4)$ we have
$$|A_\varepsilon(t)|^2 = e^{-2\beta b^{1/2}\varepsilon^{2+\frac{p}{2}}t}, \tag{3.55}$$
and for $p \in (0, 4)$ we have
$$|A_\varepsilon(t)|^2 = e^{-2|g_1| b^{1/2}\varepsilon^{2+\frac{p}{2}}t}, \tag{3.56}$$
in both cases up to errors vanishing as $\varepsilon \to 0$.
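The algebra behind (3.47), (3.48) and the rate in (3.55) can be verified numerically. The following is a minimal sketch with illustrative function names and sample parameter values (none of them from the paper); it evaluates the roots and residues with the principal branch of the complex square root and reads off the decay rate from $|e^{i\kappa_2^2 t}|^2 = e^{-2\operatorname{Im}(\kappa_2^2)t}$.

```python
import cmath


def regular_roots_and_residues(alpha: float, beta: float, eps: float):
    """Roots kappa_j of P_r(kappa) = kappa^2 + eps^2 beta kappa + alpha, as in (3.47),
    and residues q_j of kappa / P_r(kappa), as in (3.48), for j = 1, 2."""
    disc = cmath.sqrt(eps ** 4 * beta ** 2 - 4.0 * alpha)  # principal branch
    kappas, residues = [], []
    for j in (1, 2):
        sign = (-1) ** j
        kappas.append(0.5 * (-eps ** 2 * beta - sign * disc))
        residues.append(0.5 * (1.0 + sign * eps ** 2 * beta / disc))
    return kappas, residues


def decay_rate(alpha: float, beta: float, eps: float) -> float:
    """Rate Gamma in |exp(i kappa_2^2 t)|^2 = exp(-Gamma t), i.e. 2 * Im(kappa_2^2)."""
    kappas, _ = regular_roots_and_residues(alpha, beta, eps)
    return 2.0 * (kappas[1] ** 2).imag


if __name__ == "__main__":
    eps, beta, b, p = 0.1, 1.0, 1.0, 1.0   # sample values, chosen for illustration
    alpha = b * eps ** p
    kappas, residues = regular_roots_and_residues(alpha, beta, eps)
    print("roots:", kappas)
    print("residues sum to:", sum(residues))
    print("rate from kappa_2:", decay_rate(alpha, beta, eps))
    print("rate from (3.55): ", 2.0 * beta * b ** 0.5 * eps ** (2.0 + p / 2.0))
```

The two rates agree up to a relative correction of order $\varepsilon^{4-p}$, which is the size of the error term in (3.53).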
The formula (3.56) agrees with the heuristic formula (3.11), as well as with the rigorous result in [16] for $p = 1$. Notice that, as expected, as $p$ approaches 4 from below, the exponential decay law becomes less and less accurate, so one needs to compute corrections. Proposition 3.13 gives only the first order correction in $\varepsilon^2/\sqrt{|\alpha|}$, but the method of proof provides also the higher order corrections.

3.7.1.2. The bound state regime

If
$$\alpha = -b\varepsilon^p, \quad b > 0, \tag{3.57}$$
then for
$$p \in [0, 4) \tag{3.58}$$
one expects (see the heuristics) a bound state regime, i.e. to leading order the contribution comes from the bound state. The result below provides the mathematical substantiation as well as the first order correction. Again the proof gives the means to compute higher order corrections. As in the previous case, as $\varepsilon \to 0$, we have
$$\frac{\varepsilon^2}{\sqrt{|\alpha|}} \simeq \varepsilon^{2-\frac{p}{2}}. \tag{3.59}$$
Note that $\kappa_1 > 0$, and there is a contribution from the pole of $\frac{1}{F(z, \varepsilon)}$ at $x_b = -\kappa_1^2$. The analogue of Proposition 3.13 reads

Proposition 3.14 (Regular Case).
$$\hat A_{\varepsilon,r}(t) = -i\frac{\beta}{2}\frac{\varepsilon^2}{\sqrt{|\alpha|}}\, I_3(i\sqrt{|\alpha|t}) + O(\varepsilon^{(4-p)}), \tag{3.60}$$
and up to error terms as in Theorem 3.7 we have
$$A_\varepsilon(t) = \hat A_{\varepsilon,r}(t) + \left(1 - i\frac{\varepsilon^2\beta}{2\sqrt{|\alpha|}}\right) e^{i\kappa_1^2 t}. \tag{3.61}$$
To leading order one obtains the bound state behavior
$$|A_\varepsilon(t)|^2 = 1. \tag{3.62}$$

3.7.1.3. The non-exponential regime

We come now to the most interesting part of our analysis, when
$$|\alpha| = b\varepsilon^p, \quad b > 0, \quad\text{with } p \ge 4. \tag{3.63}$$
According to the heuristics, for these values of $p$ the decay law is neither (quasi)exponential nor bound state like. We consider two cases separately.

Case 1. The threshold regime given by $p > 4$.
In this case the survival probability amplitude is given by

Proposition 3.15 (Regular Case). Up to errors as in Theorem 3.7 we have
$$A_\varepsilon(t) = I_1(i\varepsilon^2\beta\sqrt{t}) + O(\varepsilon^{p-4}). \tag{3.64}$$
The result (3.64) implies that the decay law is non-exponential for all $p > 4$.

Remark 3.16. The error term becomes more and more important as $p$ approaches the critical value $p = 4$. As before, higher order corrections to the leading term can be computed in terms of $I_p$. The leading term is independent of $\alpha$ and equals the threshold case $x_0 = E_1 = 0$. Since $I_1$ can be expressed (see the Appendix) in terms of the error function, one can rewrite (3.64) as follows:
$$A_\varepsilon(t) = w(e^{i3\pi/4}\varepsilon^2\beta\sqrt{t}) + O(\varepsilon^{p-4}) = e^{is}(1 - \operatorname{erf}(e^{i\pi/4}s^{1/2})) + O(\varepsilon^{p-4}), \tag{3.65}$$
where
$$s = \beta^2\varepsilon^4 t. \tag{3.66}$$
In particular, (3.66) implies that the threshold decay time scale in the regular case is $t \sim \varepsilon^{-4}$.

Case 2. The “crossover regime”, which is given by $p = 4$. This case has been considered in [8]. Indeed, in this case in scaled variables $s = \varepsilon^4\beta^2 t$, $f = \frac{\alpha}{\varepsilon^4\beta^2}$ (note that for $p = 4$, $f = \text{const.}$) we have directly from (3.43) and (3.33):
$$\hat A_\varepsilon(t) = \frac{1}{\pi}\int_0^{\infty} \frac{y^{1/2}}{(f - y)^2 + y}\, e^{-isy}\,dy, \tag{3.67}$$
and the integral has been analyzed numerically in [8]. The decay law is non-exponential for finite $f$, while as $f \to \pm\infty$, one reaches the exponential and bound state behavior, respectively.

3.7.2. The singular case

We now turn to the singular case, where the function approximating $F(z, \varepsilon)$ is given by $H_s(z) = \alpha - z - \varepsilon^2\beta/\kappa$, with $\alpha$ lying in a small neighborhood of the origin, and $\beta > 0$. The results are similar to those in the regular case, but a bit more complicated, due to the singular behavior of $H_s(z)$. We recall (3.37) that
$$\hat A_{\varepsilon,s}(t) = \frac{1}{\pi}\int_0^{\infty} e^{-itx}\,\operatorname{Im} H_s(x + i0)^{-1}\,dx. \tag{3.68}$$
Passing to the variable $k = \sqrt{z} = i\kappa$, we can write it as
$$\hat A_{\varepsilon,s}(t) = \frac{-1}{\pi}\int_{-\infty}^{\infty} \frac{k^2 e^{-itk^2}}{P_s(\kappa)}\,dk, \tag{3.69}$$
where
$$P_s(\kappa) = \kappa^3 + \alpha\kappa - \varepsilon^2\beta, \tag{3.70}$$
and the integral on the r.h.s. of (3.69) is to be understood as the improper integral $\lim_{A\to\infty}\int_{-A}^{A} \frac{e^{-itk^2}}{P_s(\kappa)}\,k^2\,dk$. If the zeroes of $P_s$ are distinct, the partial fraction decomposition leads to the following result.

Proposition 3.17 (Singular Case).
$$\hat A_{\varepsilon,s}(t) = -\sum_{j=1}^{3} q_j I_1(i\kappa_j\sqrt{t}), \tag{3.71}$$
where $\kappa_j$, $j = 1, 2, 3$, are the roots of $P_s(\kappa)$ (as given by the Cardano formula, see (4.82) below), and
$$q_j = \frac{\kappa_j^2}{3\kappa_j^2 + \alpha} \tag{3.72}$$
are the corresponding residues of $\frac{\kappa^2}{P_s(\kappa)}$ at $\kappa = \kappa_j$, $j = 1, 2, 3$.

Without using the explicit formulae for $\kappa_j$ we can get the following properties of the roots of $P_s(\kappa)$:
(i) $P_s(\kappa) = 0$ always has a positive solution, which we label $\kappa_3$. It corresponds to the bound state at $\tilde x_b = -\kappa_3^2$.
(ii) For
$$\alpha \neq \alpha_0(\varepsilon) \equiv -3(\varepsilon^2\beta/2)^{2/3} \tag{3.73}$$
all $\kappa_j$ are distinct. Note again that the case $\alpha = \alpha_0(\varepsilon)$ belongs to the crossover regime.
(iii) For $\alpha > \alpha_0(\varepsilon)$ the other two solutions $\kappa_1$ and $\kappa_2$ are complex conjugates with real part equal to $-\frac{\kappa_3}{2}$. We label by $\kappa_1$ the one with positive imaginary part.
(iv) For $\alpha < \alpha_0(\varepsilon)$, $\kappa_1$ and $\kappa_2$ are real, $\kappa_1, \kappa_2 < 0$, and $\kappa_1 + \kappa_2 = -\kappa_3$.

We recall that we take $|\alpha| = b\varepsilon^p$. For $p \neq \frac{4}{3}$ the somewhat complicated expressions for $\kappa_j$ and $q_j$ have simple expansions in the limit $\varepsilon \to 0$. Combining these expansions with the properties of $I_p$, one arrives at results which are very similar to those in the regular case. As before, we give only “first order” corrections, but higher order corrections can be computed. These expressions are much more complicated.
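The root properties (i)–(iv) and the residue formula (3.72) can be checked numerically without the Cardano formula. The sketch below is an illustration under its own assumptions (function names, sample parameters, bisection for the positive root followed by deflation to a quadratic); it is not the method used in the paper.

```python
import cmath


def singular_roots(alpha: float, beta: float, eps: float):
    """Roots (kappa_1, kappa_2, kappa_3) of P_s(kappa) = kappa^3 + alpha kappa - eps^2 beta,
    labeled as in the text: kappa_3 > 0; kappa_1 has positive imaginary part when complex."""
    c = eps ** 2 * beta

    def P(k: float) -> float:
        return k ** 3 + alpha * k - c

    # P(0) = -c < 0 and P(hi) > 0; P changes sign exactly once on (0, inf).
    lo, hi = 0.0, 1.0 + abs(alpha) + c
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if P(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    k3 = 0.5 * (lo + hi)
    # Deflation: P_s(kappa) = (kappa - k3) * (kappa^2 + k3 kappa + alpha + k3^2).
    root = cmath.sqrt(-3.0 * k3 ** 2 - 4.0 * alpha)
    return 0.5 * (-k3 + root), 0.5 * (-k3 - root), k3


def residues(alpha: float, kappas):
    """Residues q_j = kappa_j^2 / (3 kappa_j^2 + alpha) of kappa^2 / P_s(kappa), as in (3.72)."""
    return [k ** 2 / (3.0 * k ** 2 + alpha) for k in kappas]


def alpha0(beta: float, eps: float) -> float:
    """Critical value (3.73) at which two roots coincide."""
    return -3.0 * (eps ** 2 * beta / 2.0) ** (2.0 / 3.0)


if __name__ == "__main__":
    eps, beta = 0.1, 1.0   # sample values, chosen for illustration
    print("alpha_0 =", alpha0(beta, eps))
    for alpha in (0.05, -0.2):
        kappas = singular_roots(alpha, beta, eps)
        print(alpha, kappas, "sum of residues:", sum(residues(alpha, kappas)))
```

For both sample values the residues add up to one, which reflects the normalization of the spectral weight shared between the bound state and the continuum.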
3.7.2.1. The exponential regime

If we have
$$\alpha = b\varepsilon^p, \quad b > 0; \quad p \in [0, 4/3), \tag{3.74}$$
the decay law is still exponential at small $\varepsilon$. Notice that as $\varepsilon \to 0$
$$\frac{\varepsilon^2}{\sqrt{\alpha^3}} \simeq \varepsilon^{2-\frac{3p}{2}}. \tag{3.75}$$
Using the properties of $I_p(z)$ one gets the following result.

Proposition 3.18 (Singular Case). (i) We have
$$\hat A_{\varepsilon,s}(t) = 2q_2 e^{i\kappa_2^2 t} - i\frac{\beta\varepsilon^2}{2\alpha^{3/2}}\left(I_1(i\sqrt{\alpha t}) - i\sqrt{\alpha t}\, I_2(i\sqrt{\alpha t})\right) + O(\varepsilon^{2(2-3p/2)}). \tag{3.76}$$
(ii) For $p \in [0, 4/3)$, we have
$$|A_\varepsilon(t)|^2 = e^{-2\beta b^{-1/2}\varepsilon^{2-\frac{p}{2}}t}, \tag{3.77}$$
and for $p \in (0, 4/3)$ we have
$$|A_\varepsilon(t)|^2 = e^{-2g_{-1}b^{-1/2}\varepsilon^{2-\frac{p}{2}}t}, \tag{3.78}$$
in both cases up to errors vanishing as $\varepsilon \to 0$.

A remark is in order here. In spite of the fact that there is a bound state at $\tilde x_b = -\kappa_3^2$ for “large” positive $\alpha$, its contribution is small and can be absorbed in the error term.

3.7.2.2. The bound state regime

In the case
$$\alpha = -b\varepsilon^p, \quad b > 0; \quad p \in [0, 4/3), \tag{3.79}$$
the heuristics predicts a bound state regime: the leading order contribution comes only from the bound state at $\tilde x_b = -\kappa_3^2$. The result below, similar to Proposition 3.14, provides the mathematical substantiation, as well as the first order correction.

Proposition 3.19 (Singular Case).
$$\hat A_{\varepsilon,s}(t) = \frac{\beta\varepsilon^2}{2|\alpha|^{3/2}}\left(I_1(i\sqrt{|\alpha|t}) - i\sqrt{|\alpha|t}\, I_2(i\sqrt{|\alpha|t})\right) + O(\varepsilon^{2(2-\frac{p}{2})}), \tag{3.80}$$
and up to error terms as in Theorem 3.8
$$A_\varepsilon(t) = \hat A_{\varepsilon,s}(t) + 2q_3 e^{i\kappa_3^2 t}. \tag{3.81}$$
To leading order one obtains the bound state behavior
$$|A_\varepsilon(t)|^2 = 1. \tag{3.82}$$

3.7.2.3. The non-exponential regime

As in the regular case, according to the heuristics for $\alpha = b\varepsilon^p$, $b \in \mathbb{R}$, $p \ge 4/3$, the decay is neither exponential nor bound state like. As in the regular case, we shall distinguish between two cases.

Case 1. The threshold regime, which is $p > 4/3$. Up to errors as in Theorem 3.8 we have the following result.

Proposition 3.20 (Singular Case). Assume $p > 4/3$. Then we have
$$A_\varepsilon(t) = \frac{2}{3}e^{i\kappa_3^2 t} - \frac{1}{3}\sum_{j=1}^{3} I_1(i\rho_j\beta^{1/3}\varepsilon^{2/3}\sqrt{t}) + O(\varepsilon^{p-\frac{4}{3}}). \tag{3.83}$$
Notice that in contrast to the regular case, there is a non-vanishing contribution from the bound state.

Case 2. The “crossover regime”, which in the singular case takes place at the value $p = 4/3$. When $p = 4/3$, $f = \beta^{-2/3}b$, $b \in \mathbb{R}$, does not depend on $\varepsilon$, so there are no useful expansions for $\kappa_j$, $q_j$, $j = 1, 2, 3$. Accordingly (it has been done in [8] using the scaled variable $s = \varepsilon^{4/3}\beta^{2/3}t$) one writes (3.37) as
$$\hat A_{\varepsilon,s}(t) = \frac{1}{\pi}\int_0^{\infty} \frac{y^{1/2}}{y(f - y)^2 + 1}\, e^{-isy}\,dy,$$
and the integral can be analyzed numerically. In accordance with the above results, as $f \to \pm\infty$ one reaches the exponential and bound state behavior respectively; we refer to [8] for details.

4. The Proofs

We now give the proofs of the results stated in the previous section.

4.1. Proof of Proposition 3.1

Assumption 2.3 and (2.7) (see also (3.13)) imply that for $0 < x < a$, we can write $R$ in the form
$$R(x) = E_1 - x - \varepsilon^2 x^2 f_1(x) \quad\text{with}\quad \sup_{0<x<a} |f_1(x)| < \infty. \tag{4.1}$$
We have $R(0) = E_1$. For sufficiently small $\varepsilon$ we have $R(a) < 0$, and also that $R(x)$ is strictly decreasing. Thus it follows that the equation $R(x) = 0$ has a unique solution, $x_0 \in (0, a)$, which satisfies
$$x_0 = E_1 - \varepsilon^2 x_0^2 f_1(x_0), \tag{4.2}$$
which together with (4.1) finishes the proof.
4.2. Proof of Proposition 3.2

Part (i). One can obtain (3.21) and (3.22) by a rather tedious perturbation procedure for solving the equation for $x_b$; instead we shall give below a simple geometric argument. Consider first, for $m \ge 0$ and $n \simeq 1$, the (unique) positive solution $\tilde y$ of
$$f_{m,n}(y) \equiv m + y - \frac{n\varepsilon^2}{y^{1/2}} = 0. \tag{4.3}$$
Using Cardano's formulae one can see that
$$\tilde y \simeq \begin{cases} \varepsilon^{4/3}, & \text{for } 0 \le m \le \dfrac{n^{2/3}}{2}\varepsilon^{4/3}, \\[4pt] \dfrac{\varepsilon^4}{m^2}, & \text{for } m \gtrsim \varepsilon^{4/3}. \end{cases} \tag{4.4}$$
A simpler argument for (4.4) is as follows. Let $\tilde{\tilde y} = (n\varepsilon^2)^{2/3}$, such that $f_{m,n}(\tilde{\tilde y}) = m$. Then
$$\frac{d}{dy} f_{m,n}(y) \ge 1 \tag{4.5}$$
implies that $\tilde{\tilde y} - m \le \tilde y \le \tilde{\tilde y} \simeq \varepsilon^{4/3}$, which gives the first part of (4.4). Assume $m \gtrsim \varepsilon^{4/3}$. Then for $y \lesssim \varepsilon^{4/3}$ we have
$$m - \frac{n\varepsilon^2}{y^{1/2}} \le m + y - \frac{n\varepsilon^2}{y^{1/2}} \le \text{const.}\, m - \frac{n\varepsilon^2}{y^{1/2}},$$
which implies
$$\left(\frac{n\varepsilon^2}{\text{const.}\, m}\right)^2 \le \tilde y \le \left(\frac{n\varepsilon^2}{m}\right)^2,$$
and thus the second part of (4.4).

Consider now $F(x, \varepsilon)$ for $x < 0$ and $|x|$ sufficiently small. Recall that for $x < 0$ we have $\kappa = |x|^{1/2}$. Then
$$E_1 - x - \varepsilon^2\frac{2g_{-1}}{|x|^{1/2}} \le F(x, \varepsilon) \le E_1 - x - \varepsilon^2\frac{g_{-1}}{2|x|^{1/2}}, \tag{4.6}$$
and the estimates (4.4) lead to (3.21). The argument for (3.22) is similar. Consider $f_{m,n}(y)$ for $m \le 0$. Notice that (4.5) still holds true and implies $\tilde{\tilde y} \le \tilde y \le |m| + \tilde{\tilde y}$. Use again (4.6).

Part (ii). Note that in this case $F(0, \varepsilon) = E_1$, and as always (see (2.7) and Proposition 2.2) for $x < 0$ we have $\frac{d}{dx}F(x, \varepsilon) \le -1$. This implies the non-existence of bound states for $E_1 \ge 0$, the existence and uniqueness of the solution for $E_1 \le 0$ (recall that for $\varepsilon$ sufficiently small and $-\frac{a}{2} \le E_1$ we have $F(-a, \varepsilon) > 0$), as well as (3.23).
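The two scaling regimes in (4.4), used in Part (i) above, are easy to observe numerically. The sketch below is an illustration only; the function name and the sample values of $m$, $n$ and $\varepsilon$ are assumptions of this example. It uses the fact from the proof that $f_{m,n}$ is increasing with $f_{m,n}((n\varepsilon^2)^{2/3}) = m \ge 0$, so the root can be bracketed and found by bisection.

```python
def positive_solution(m: float, n: float, eps: float) -> float:
    """Unique positive root of f(y) = m + y - n eps^2 / sqrt(y) for m >= 0, as in (4.3)."""
    def f(y: float) -> float:
        return m + y - n * eps ** 2 / y ** 0.5

    # f is increasing, f(y) -> -inf as y -> 0+, and f(hi) = m >= 0 at hi = (n eps^2)^(2/3).
    lo, hi = 0.0, (n * eps ** 2) ** (2.0 / 3.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    eps, n = 1e-3, 1.0
    for m in (0.0, 1e-5, 1e-3, 1e-1):
        y = positive_solution(m, n, eps)
        small_m = y / eps ** (4.0 / 3.0)
        large_m = y * m ** 2 / eps ** 4 if m > 0.0 else float("nan")
        print(f"m = {m:g}: y / eps^(4/3) = {small_m:.4g}, y m^2 / eps^4 = {large_m:.4g}")
```

For $m$ well below $\varepsilon^{4/3}$ the first ratio is of order one, and for $m$ well above $\varepsilon^{4/3}$ the second one is.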
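4.3. Proof of Theorem 3.7

Consider first the case $E_1 \ge 0$, such that $x_0 \ge 0$ exists.

Lemma 4.1. Assume $E_1 \ge 0$. For $c > 0$ sufficiently small, and $0 \le x_0 \le \frac{c}{2}$ we have
$$\int_0^{c} \left| \frac{1}{F(x + i0, \varepsilon)} - \frac{1}{H_r(x + i0)} \right| dx \lesssim \begin{cases} \varepsilon^2\ln\dfrac{1}{\varepsilon}, & \text{for } 0 \le x_0 \le \dfrac{c}{2}, \\[4pt] \varepsilon^2, & \text{for } 0 \le x_0 \lesssim \varepsilon^p, \text{ any } p > 0. \end{cases} \tag{4.7}$$

Proof. We recall that we always have $c \le a/2$. Furthermore, recall also that $g_1 < 0$. Then Assumption 2.3 and (3.13) imply
$$R(x) \equiv E_1 - x - \varepsilon^2 x^2 f_1(x), \qquad I(x) = \varepsilon^2\sqrt{x}\,(g_1 + O(x)) \equiv -\varepsilon^2\sqrt{x}\, f_2(x), \tag{4.8}$$
with $f_j$ uniformly Lipschitz on $[0, c]$, $j = 1, 2$. Taking $c$ small enough, (4.8) implies
$$|R(x)| \ge \frac{1}{2}|x - x_0|, \qquad |I(x)| \ge \frac{1}{2}\varepsilon^2\sqrt{x}\,|g_1|. \tag{4.9}$$
By definition of the model function (see Sec. 3.5) we have
$$\alpha = E_1 - \varepsilon^2 x_0^2 f_1(x_0), \qquad \beta = f_2(x_0). \tag{4.10}$$
Since both $f_1$ and $f_2$ are uniformly Lipschitz, we get
$$|F(x + i0, \varepsilon) - H_r(x + i0)| \le \varepsilon^2\left(|x^2 f_1(x) - x_0^2 f_1(x_0)| + \sqrt{x}\,|f_2(x) - f_2(x_0)|\right) \lesssim \varepsilon^2 |x - x_0|(\hat x + \sqrt{x}), \tag{4.11}$$
where $\hat x$ lies between $x$ and $x_0$. Putting together (4.9) and (4.11) we get
$$\int_0^{c} \left| \frac{1}{F(x + i0, \varepsilon)} - \frac{1}{H_r(x + i0)} \right| dx \lesssim \varepsilon^2\int_0^{c} \frac{|x - x_0|(\hat x + \sqrt{x})}{|x - x_0|^2 + \varepsilon^4 x}\,dx. \tag{4.12}$$
We estimate the integral on the right-hand side in (4.12) on three subintervals. Consider first $\int_{2x_0}^{c}$. In this case $\hat x \le x \le \sqrt{x}$, $|x - x_0| < x$, $|x - x_0| > x/2$, and then
$$\varepsilon^2\int_{2x_0}^{c} \frac{|x - x_0|(\hat x + \sqrt{x})}{|x - x_0|^2 + \varepsilon^4 x}\,dx \lesssim \varepsilon^2\int_{2x_0}^{c} \frac{\sqrt{x}}{x + \varepsilon^4}\,dx \lesssim \varepsilon^2. \tag{4.13}$$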
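Consider now $\int_0^{x_0}$. Here we have $\hat x < x_0$, such that
\[
\int_0^{x_0} \frac{|x - x_0|(\hat x + \sqrt{x})}{|x - x_0|^2 + \varepsilon^4 x}\, dx \lesssim \sqrt{x_0} \int_0^{x_0} \frac{x_0 - x}{|x - x_0|^2 + \varepsilon^4 x}\, dx = \sqrt{x_0} \int_0^1 \frac{y}{y^2 + \frac{\varepsilon^4}{x_0}(1 - y)}\, dy. \tag{4.14}
\]
In estimating the last integral in (4.14), we use the notation $m = \frac{\varepsilon^4}{x_0}$. We get
\[
\int_0^{1/2} \frac{y}{y^2 + m(1 - y)}\, dy \lesssim \int_0^{1/2} \frac{y}{y^2 + m}\, dy = \frac{1}{2}\ln\left(1 + \frac{1}{4m}\right), \tag{4.15}
\]
\[
\int_{1/2}^1 \frac{y}{y^2 + m(1 - y)}\, dy \lesssim \int_{1/2}^1 \frac{dy}{1 + m(1 - y)} = \frac{1}{m}\ln\left(1 + \frac{m}{2}\right). \tag{4.16}
\]
Since $\sup_{u>0} u \ln\left(1 + \frac{1}{2u}\right) < \infty$, the estimates (4.15) and (4.16) imply
\[
\int_0^1 \frac{y}{y^2 + \frac{\varepsilon^4}{x_0}(1 - y)}\, dy \lesssim \ln\frac{1}{\varepsilon}. \tag{4.17}
\]
Finally consider
\[
\int_{x_0}^{2x_0} \frac{|x - x_0|(\hat x + \sqrt{x})}{|x - x_0|^2 + \varepsilon^4 x}\, dx \lesssim \sqrt{x_0} \int_{x_0}^{2x_0} \frac{x - x_0}{|x - x_0|^2 + \varepsilon^4 x}\, dx \le \sqrt{x_0} \int_0^{x_0} \frac{u}{u^2 + \varepsilon^4 u}\, du = \sqrt{x_0}\, \ln\left(1 + \frac{x_0}{\varepsilon^4}\right) \lesssim \sqrt{x_0}\, \ln\frac{1}{\varepsilon}. \tag{4.18}
\]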
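The closed-form integrals entering (4.15), (4.16) and (4.18) are elementary, and a quick numerical cross-check is possible. The sketch below is illustrative only: the Simpson helper `simpson` and the sample values of `m`, `a` (standing in for $\varepsilon^4/x_0$ and $\varepsilon^4$) and `x0` are assumptions, and the integrand of (4.16) is taken as $1/(1 + m(1-y))$, which is the form for which the stated logarithm is exact.

```python
import math


def simpson(g, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


m = 1e-3    # sample value playing the role of eps^4 / x0
a = 1e-2    # sample value playing the role of eps^4
x0 = 0.5

# (4.15): int_0^{1/2} y / (y^2 + m) dy = (1/2) ln(1 + 1/(4m))
lhs_415 = simpson(lambda y: y / (y * y + m), 0.0, 0.5)
rhs_415 = 0.5 * math.log(1 + 1 / (4 * m))

# (4.16): int_{1/2}^1 dy / (1 + m(1 - y)) = (1/m) ln(1 + m/2)
lhs_416 = simpson(lambda y: 1 / (1 + m * (1 - y)), 0.5, 1.0)
rhs_416 = math.log(1 + m / 2) / m

# (4.18): int_0^{x0} u / (u^2 + a*u) du = ln(1 + x0/a), after cancelling u
lhs_418 = simpson(lambda u: 1 / (u + a), 0.0, x0)
rhs_418 = math.log(1 + x0 / a)

print(lhs_415 - rhs_415, lhs_416 - rhs_416, lhs_418 - rhs_418)
```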
Putting together (4.13), (4.14), (4.17), and (4.18), one obtains (4.7) and the proof of Lemma 4.1 is finished.

Consider now $E_1 < 0$. In this case we claim the following result.

Lemma 4.2. Assume $E_1 < 0$. For $c > 0$ sufficiently small, and $-\frac{c}{2} < E_1 < 0$ we have
\[
\int_0^c \left|\frac{1}{F(x+i0,\varepsilon)} - \frac{1}{H_r(x+i0)}\right| dx \lesssim \varepsilon^2. \tag{4.19}
\]

Proof. In this case, we write for $x < 0$
\[
F(x,\varepsilon) = E_1 - x - g_1\varepsilon^2\sqrt{|x|} + \varepsilon^2|x|^{3/2} f(x). \tag{4.20}
\]
From Assumption 2.3 (recall that $F(x,\varepsilon)$ is analytic for $x < 0$) it follows that $\left| |x|^{-1/2}\frac{d}{dx}\left(|x|^{3/2} f(x)\right)\right| \lesssim 1$. The equation for $\beta$ (the equality of the derivatives of $F$ and $H_r$ at $x_b$) leads to
\[
\beta = -g_1 + 2\sqrt{|x_b|}\, \frac{d}{dx}\left(|x|^{3/2} f(x)\right)\Big|_{x_b},
\]
which implies
\[
|\beta + g_1| \lesssim |x_b|. \tag{4.21}
\]
The equation for $\alpha$ is
\[
\alpha - x_b + \varepsilon^2\sqrt{|x_b|}\,\beta = F(x_b,\varepsilon) = 0.
\]
Using (4.21), we get
\[
\alpha = x_b - \varepsilon^2\sqrt{|x_b|}\,\beta = x_b + \varepsilon^2\sqrt{|x_b|}\, g_1 + \varepsilon^2 O(|x_b|^{3/2}). \tag{4.22}
\]
On the other hand, $F(x_b,\varepsilon) = 0$ and (4.20) give
\[
E_1 = x_b + \varepsilon^2\sqrt{|x_b|}\, g_1 + \varepsilon^2 O(|x_b|^{3/2}), \tag{4.23}
\]
which together with (4.21) yields
\[
|\alpha - E_1| \lesssim \varepsilon^2 |x_b|^{3/2}. \tag{4.24}
\]
As in the proof of the previous lemma, we need estimates on $|F|$ and $|H_r|$. We claim that
\[
\min\{|F(x+i0,\varepsilon)|, |H_r(x+i0)|\} \ge \frac{1}{2}\left((|E_1| + x)^2 + g_1^2\varepsilon^4 x\right)^{1/2}. \tag{4.25}
\]
Concerning $|F|$, the estimate for the imaginary part is the same as in the previous lemma, i.e. $|I(x)| \ge \frac{1}{2}|g_1|\varepsilon^2\sqrt{x}$. As for $|R(x)|$, (4.8) is written as $R(x) = E_1 - x(1 + \varepsilon^2 x f_1(x))$, such that $R(x) \le E_1 - \frac{x}{2}$, and then $|R(x)| \ge \frac{1}{2}(|E_1| + x)$. As for $|H_r|$, from its definition follows
\[
|H_r(x+i0)|^2 = |\alpha - x|^2 + \varepsilon^4\beta^2 x. \tag{4.26}
\]
For a sufficiently small $\varepsilon$ (and using $E_1 < 0$), the results (4.24) and (3.23) imply
\[
\alpha \le E_1 + |\alpha - E_1| = E_1 + O(\varepsilon^2|E_1|^{3/2}) \le \frac{E_1}{2}. \tag{4.27}
\]
On the other hand, for $|x_b|$ sufficiently small, we get from (4.21) the estimate
\[
\beta = -g_1 + \beta + g_1 \ge |g_1| - O(|x_b|) \ge \frac{|g_1|}{2}. \tag{4.28}
\]
Putting together (4.26)–(4.28), one obtains (4.25). Furthermore, from (4.20), (3.33), (4.21), and (4.24), we get
\[
|F(x+i0) - H_r(x+i0)| = \left|E_1 - \alpha + i\varepsilon^2\sqrt{|x|}\,(g_1 + \beta)\right| + O(\varepsilon^2|x|^{3/2}) \lesssim \varepsilon^2\left(|x_b|^{3/2} + |x_b|\sqrt{|x|} + |x|^{3/2}\right), \tag{4.29}
\]
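which together with (4.25) gives
\[
\int_0^c \left|\frac{1}{F(x+i0,\varepsilon)} - \frac{1}{H_r(x+i0)}\right| dx \lesssim \varepsilon^2 \int_0^c \frac{|x_b|^{3/2} + |x_b|\sqrt{|x|} + |x|^{3/2}}{(|E_1| + x)^2 + g_1^2\varepsilon^4 x}\, dx. \tag{4.30}
\]
Consider now various terms in (4.30). Due to (3.23) we have the following three estimates.
\[
\int_0^c \frac{|x_b|^{3/2}}{(|E_1| + x)^2 + g_1^2\varepsilon^4 x}\, dx \le |x_b|^{3/2} \int_0^c \frac{1}{(|E_1| + x)^2}\, dx \lesssim \frac{|x_b|^{3/2}}{|E_1|} \lesssim 1,
\]
\[
\int_0^c \frac{\sqrt{x}\,|x_b|}{(|E_1| + x)^2 + g_1^2\varepsilon^4 x}\, dx \le |x_b| \int_0^c \frac{\sqrt{x}}{(|E_1| + x)^2}\, dx \lesssim \frac{|x_b|}{|E_1|^{1/2}} \lesssim 1,
\]
\[
\int_0^c \frac{x^{3/2}}{(|E_1| + x)^2 + g_1^2\varepsilon^4 x}\, dx \le \int_0^c \frac{1}{\sqrt{x}}\, dx \lesssim 1,
\]
which together with (4.30) gives (4.19).

Proof of Theorem 3.7. From this point onwards, the proof of Theorem 3.7 follows closely the proof of [8, Theorem 3.3]. For the convenience of the reader we outline it for the case $E_1 \ge 0$. We take $E_1$ sufficiently small (e.g. $E_1 \le \frac{c}{4}$), such that $\alpha \le \frac{c}{2}$. Then for $x \ge c$ we have $|H_r(x+i0)|^2 \gtrsim x^2 + \varepsilon^4 x$. Using (3.33) we get $|\mathrm{Im}\, H_r(x+i0)| \lesssim \varepsilon^2\sqrt{x}$. Thus
\[
\int_c^\infty \frac{|\mathrm{Im}\, H_r(x+i0)|}{|H_r(x+i0)|^2}\, dx \lesssim \varepsilon^2 \int_c^\infty \frac{\sqrt{x}}{x^2 + \varepsilon^4 x}\, dx \lesssim \varepsilon^2. \tag{4.31}
\]
Let now
\[
A_{c,\varepsilon}(t) = \lim_{\eta \searrow 0} \frac{1}{\pi} \int_0^c e^{-itx}\, \mathrm{Im}\, F(x + i\eta, \varepsilon)^{-1}\, dx.
\]
Due to Assumption 2.3, the limit $\eta \searrow 0$ can be taken, such that
\[
A_{c,\varepsilon}(t) = \frac{1}{\pi} \int_0^c e^{-itx}\, \mathrm{Im}\, F(x + i0, \varepsilon)^{-1}\, dx. \tag{4.32}
\]
Lemma 4.1 and (4.31) imply that
\[
\left| A_{c,\varepsilon}(t) - \frac{1}{\pi} \int_0^\infty e^{-itx}\, \mathrm{Im}\, H_r(x+i0)^{-1}\, dx \right| \lesssim
\begin{cases}
\varepsilon^2 \ln\frac{1}{\varepsilon}, & \text{for } 0 \le x_0 \le \frac{c}{2}, \\
\varepsilon^2, & \text{for } 0 \le x_0 \lesssim \varepsilon^p, \text{ any } p > 0.
\end{cases}
\tag{4.33}
\]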
We finish the proof by using "Hunziker's trick", see [12]. More precisely, observe that
\[
|A_{c,\varepsilon}(t) - A_\varepsilon(t)| \le |A_{c,\varepsilon}(0) - A_\varepsilon(0)| = |A_{c,\varepsilon}(0) - 1|. \tag{4.34}
\]
From [8, Lemma 3.4(i)] we get
\[
\frac{1}{\pi} \int_0^\infty \mathrm{Im}\, H_r(x+i0)^{-1}\, dx = 1. \tag{4.35}
\]
Putting together (4.33)–(4.35), and (4.33) for $t = 0$, finishes the proof of Theorem 3.7 for the case $E_1 \ge 0$. The proof for the case $E_1 < 0$ is similar: use Lemma 4.2 and (see [8, Lemma 3.4(ii)])
\[
-\frac{1}{\frac{d}{dx}H_r(x_b)} + \frac{1}{\pi} \int_0^\infty \mathrm{Im}\, H_r(x+i0)^{-1}\, dx = 1. \tag{4.36}
\]
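The normalization (4.35) lends itself to a numerical sanity check. The sketch below assumes, consistently with (4.26) but not taken from Sec. 3.5 itself, that on the positive semi-axis the model function has the form $H_r(x+i0) = \alpha - x - ib\sqrt{x}$ with $b = \varepsilon^2\beta > 0$ and $\alpha \ge 0$; the function name `normalization` and the midpoint rule are likewise assumptions of this illustration. With $x = s^2$ and $s = \tan\theta$ the integral becomes one over the bounded interval $(0, \pi/2)$.

```python
import math


def normalization(alpha, b, n=200000):
    """Midpoint-rule value of (1/pi) * int_0^inf Im H_r(x+i0)^(-1) dx.

    Assumed form (hypothetical): H_r(x+i0) = alpha - x - i*b*sqrt(x), b > 0.
    Substitutions: x = s^2, then s = tan(theta) with theta in (0, pi/2).
    """
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        s = math.tan(theta)
        # Im H_r^{-1} dx = b*s / ((alpha - s^2)^2 + b^2 s^2) * 2*s ds
        integrand = 2 * b * s * s / ((alpha - s * s) ** 2 + b * b * s * s)
        total += integrand * (1 + s * s)    # ds = (1 + s^2) dtheta
    return total * h / math.pi


print(normalization(alpha=1.0, b=0.05))   # resonance case, alpha > 0
print(normalization(alpha=0.0, b=0.05))   # threshold case, alpha = 0
```

Under the stated assumption both values should be close to one, which is the content of (4.35); for $\alpha < 0$ the integral alone falls short of one and the bound state term in (4.36) supplies the difference.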
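4.4. Proof of Theorem 3.8

The proof consists of the same steps as in the proof of Theorem 3.7. In this case, $I(x) = \varepsilon^2 x^{-1/2}(g_{-1} + O(x)) \equiv \varepsilon^2 x^{-1/2} f_2(x)$ (not the same function that was also denoted by $f_2$ above). Consider first the case $E_1 \ge 0$, such that $x_0 \ge 0$ exists.

Lemma 4.3. For $c > 0$ sufficiently small and $0 \le x_0 \le \frac{c}{2}$ we have
\[
\int_0^c \left|\frac{1}{F(x+i0,\varepsilon)} - \frac{1}{H_s(x+i0)}\right| dx \lesssim
\begin{cases}
\dfrac{\varepsilon^2}{x_0^{1/2}} \ln\dfrac{1}{\varepsilon}, & \text{for } \varepsilon^{4/3} \lesssim x_0 \le \frac{c}{2}, \\
\varepsilon^{4/3}, & \text{for } 0 \le x_0 \lesssim \varepsilon^{4/3}.
\end{cases}
\tag{4.37}
\]

A computation similar to the one in the proof of Lemma 4.1 leads to
\[
\int_0^c \left|\frac{1}{F(x+i0,\varepsilon)} - \frac{1}{H_s(x+i0)}\right| dx \lesssim \varepsilon^2 \int_0^c \frac{|x - x_0|(\hat x + x^{-1/2})}{|x - x_0|^2 + \varepsilon^4 x^{-1}}\, dx, \tag{4.38}
\]
where $\hat x$ lies between $x$ and $x_0$. The term containing $\hat x$ is the same as in the regular case, so we have to consider only $\varepsilon^2 \int_0^c \frac{|x - x_0|\, x^{-1/2}}{|x - x_0|^2 + \varepsilon^4 x^{-1}}\, dx$. Consider first the case $x_0 \gtrsim \varepsilon^{4/3}$. Observe that
\[
\varepsilon^2\, \frac{|x - x_0|\, x^{-1/2}}{|x - x_0|^2 + \varepsilon^4 x^{-1}} \le \frac{1}{2}, \tag{4.39}
\]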
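and then
\[
\varepsilon^2 \int_0^{\varepsilon^2} \frac{|x - x_0|\, x^{-1/2}}{|x - x_0|^2 + \varepsilon^4 x^{-1}}\, dx \le \varepsilon^2. \tag{4.40}
\]
We use the following estimate
\[
x^{-1/2} \le |x^{-1/2} - x_0^{-1/2}| + x_0^{-1/2} = \frac{|x - x_0|}{x\, x_0}\, \frac{1}{x^{-1/2} + x_0^{-1/2}} + x_0^{-1/2} \le \frac{|x - x_0|}{x_0^{1/2}\, x} + x_0^{-1/2}.
\]
We then get
\[
\varepsilon^2 \int_{\varepsilon^2}^c \frac{|x - x_0|\, x^{-1/2}}{|x - x_0|^2 + \varepsilon^4 x^{-1}}\, dx \le \frac{\varepsilon^2}{x_0^{1/2}} \int_{\varepsilon^2}^c \frac{|x - x_0|}{|x - x_0|^2 + \varepsilon^4 x^{-1}}\, dx + \frac{\varepsilon^2}{x_0^{1/2}} \int_{\varepsilon^2}^c \frac{|x - x_0|^2}{x|x - x_0|^2 + \varepsilon^4}\, dx \lesssim \frac{\varepsilon^2}{x_0^{1/2}} \ln\frac{1}{\varepsilon}.
\]
Consider now the case $x_0 \lesssim \varepsilon^{4/3}$. Due to (4.39) it remains to estimate $\int_{\mathrm{const.}\,\varepsilon^{4/3}}^c$. Here one has $\frac{x}{2} \le x - x_0 \le x$ and then
\[
\varepsilon^2 \int_{\mathrm{const.}\,\varepsilon^{4/3}}^c \frac{|x - x_0|\, x^{-1/2}}{|x - x_0|^2 + \varepsilon^4 x^{-1}}\, dx \lesssim \varepsilon^2 \int_{\mathrm{const.}\,\varepsilon^{4/3}}^c \frac{x^{1/2}}{x^2 + \varepsilon^4 x^{-1}}\, dx \le \varepsilon^2 \int_{\mathrm{const.}\,\varepsilon^{4/3}}^c x^{-3/2}\, dx \lesssim \varepsilon^{4/3}, \tag{4.41}
\]
and the proof of Lemma 4.3 is finished.

The next step (still for the case $E_1 \ge 0$) is to control the error when $F$ is replaced by $H_s$ in the bound state contribution.

Lemma 4.4. Assume $E_1 \ge 0$. Let $\tilde x_b$ be the unique solution on $(-\infty, 0)$ of the equation $H_s(x) = 0$. Then
\[
\left| \frac{1}{\frac{d}{dx}F(x_b)} - \frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \right| \lesssim
\begin{cases}
\dfrac{\varepsilon^4}{x_0^2}, & \text{for } x_0 \gtrsim \varepsilon^{4/3}, \\
\varepsilon^{4/3}, & \text{for } x_0 \lesssim \varepsilon^{4/3}.
\end{cases}
\tag{4.42}
\]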
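Proof.
\[
\left| \frac{1}{\frac{d}{dx}F(x_b)} - \frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \right| \le \left| \frac{1}{\frac{d}{dx}F(x_b)} - \frac{1}{\frac{d}{dx}H_s(x_b)} \right| + \left| \frac{1}{\frac{d}{dx}H_s(x_b)} - \frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \right|. \tag{4.43}
\]
Consider the first term in (4.43). From Assumption 2.3 and (3.32) it follows that
\[
\left| \frac{d}{dx}F(x_b) - \frac{d}{dx}H_s(x_b) \right| \lesssim |\beta - g_{-1}|\, \frac{\varepsilon^2}{|x_b|^{3/2}} + \frac{\varepsilon^2}{|x_b|^{1/2}}, \tag{4.44}
\]
\[
\left| \frac{d}{dx}F(x_b) \right| \sim \left| \frac{d}{dx}H_s(x_b) \right| \sim 1 + \frac{\varepsilon^2}{|x_b|^{3/2}}. \tag{4.45}
\]
Furthermore, the equation for $\beta$ is
\[
\frac{\beta\varepsilon^2}{|x_0|^{1/2}} = \frac{g_{-1}\varepsilon^2}{|x_0|^{1/2}} + \varepsilon^2 O(x_0^{1/2}),
\]
which implies
\[
\beta = g_{-1} + O(x_0). \tag{4.46}
\]
Using (4.44)–(4.46), we get
\[
\left| \frac{1}{\frac{d}{dx}F(x_b)} - \frac{1}{\frac{d}{dx}H_s(x_b)} \right| \lesssim \left( \frac{x_0\varepsilon^2}{|x_b|^{3/2}} + \frac{\varepsilon^2}{|x_b|^{1/2}} \right) \left( 1 + \frac{\varepsilon^2}{|x_b|^{3/2}} \right)^{-2}. \tag{4.47}
\]
From (3.21) and (2.17), one has
\[
|x_b| \sim
\begin{cases}
\varepsilon^{4/3}, & \text{if } 0 \le x_0 \lesssim \varepsilon^{4/3}, \\
\dfrac{\varepsilon^4}{E_1^2}, & \text{if } x_0 \gtrsim \varepsilon^{4/3}.
\end{cases}
\tag{4.48}
\]
Combining (4.47) with (4.48), one obtains
\[
\left| \frac{1}{\frac{d}{dx}F(x_b)} - \frac{1}{\frac{d}{dx}H_s(x_b)} \right| \lesssim
\begin{cases}
\varepsilon^{4/3}, & \text{for } 0 \le x_0 \lesssim \varepsilon^{4/3}, \\
\dfrac{\varepsilon^4}{x_0^2}, & \text{for } x_0 \gtrsim \varepsilon^{4/3}.
\end{cases}
\tag{4.49}
\]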
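The last step in proving Lemma 4.4 is to estimate $\left| \frac{1}{\frac{d}{dx}H_s(x_b)} - \frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \right|$. Taylor's theorem implies
\[
\left| \frac{1}{\frac{d}{dx}H_s(x_b)} - \frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \right| \le \frac{\left| \frac{d^2}{dx^2}H_s(u) \right|}{\left| \frac{d}{dx}H_s(u) \right|^2}\, |x_b - \tilde x_b|, \tag{4.50}
\]
where $u$ lies between $\tilde x_b$ and $x_b$. Since $H_s(\tilde x_b) = 0$ and $\left|\frac{d}{dx}H_s(x)\right| \ge 1$, we get
\[
|x_b - \tilde x_b| \le |H_s(x_b)|.
\]
Since $F(x_b) = 0$, one has from (2.7), (3.32), (3.35), (3.19), and (4.46) that
\[
|H_s(x_b)| = |H_s(x_b) - F(x_b)| \lesssim |x_0 - E_1| + \varepsilon^2|g_{-1} - \beta|\,|x_b|^{-1/2} + \varepsilon^2|x_b|^{1/2} \lesssim \varepsilon^2\left(x_0^2 + x_0|x_b|^{-1/2} + |x_b|^{1/2}\right). \tag{4.51}
\]
Again we have to consider two cases separately. First we consider the case $x_0 \lesssim \varepsilon^{4/3}$. The result (3.35) and the proof of Proposition 3.2, together with (4.48), lead to
\[
|x_b| \sim |\tilde x_b| \sim \varepsilon^{4/3}.
\]
Since $u$ lies between $x_b$ and $\tilde x_b$, we get $\left|\frac{d}{dx}H_s(u)\right| \sim 1$ and $\left|\frac{d^2}{dx^2}H_s(u)\right| \lesssim \varepsilon^{-4/3}$. Inserting these results into (4.51) one gets from (4.50) the result
\[
\left| \frac{1}{\frac{d}{dx}H_s(x_b)} - \frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \right| \lesssim \varepsilon^{4/3}. \tag{4.52}
\]
Consider now the case $x_0 \gtrsim \varepsilon^{4/3}$. By the same argument as before
\[
|x_b| \sim |\tilde x_b| \sim |u| \sim \frac{\varepsilon^4}{x_0^2}.
\]
Then $\left|\frac{d}{dx}H_s(u)\right| \sim 1 + \frac{x_0^3}{\varepsilon^4}$ and $\left|\frac{d^2}{dx^2}H_s(u)\right| \lesssim \frac{x_0^5}{\varepsilon^8}$, which together with (4.51) and (4.50) leads to
\[
\left| \frac{1}{\frac{d}{dx}H_s(x_b)} - \frac{1}{\frac{d}{dx}H_s(\tilde x_b)} \right| \lesssim \frac{\varepsilon^4}{x_0^2}. \tag{4.53}
\]
Putting together (4.49), (4.52), and (4.53) finishes the proof of Lemma 4.4.

We are left with the estimate of the error for the case $E_1 < 0$. Since the pole positions and the residues for $F$ and $H_s$ coincide by the definition of $H_s$ (see Sec. 3.5) the error comes only from the positive semi-axis integral.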
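The passage from (4.47) and (4.48) to (4.49) is a matter of inserting orders of magnitude, which can be made concrete as follows. This is a sketch under stated assumptions: all implied constants are set to one, $E_1$ is replaced by $x_0$ in (4.48), and the function name `rhs_447` and the sample values are hypothetical.

```python
def rhs_447(x0, xb, eps):
    """Right-hand side of (4.47) with all implied constants set to one."""
    first = x0 * eps**2 / xb**1.5 + eps**2 / xb**0.5
    return first / (1 + eps**2 / xb**1.5) ** 2


eps = 1e-3

# Regime x0 <~ eps^(4/3): |x_b| ~ eps^(4/3) by (4.48).
x0_small = 1e-5
ratio_small = rhs_447(x0_small, eps ** (4 / 3), eps) / eps ** (4 / 3)

# Regime x0 >~ eps^(4/3): |x_b| ~ eps^4 / E_1^2, with E_1 replaced by x0.
x0_large = 1e-2
xb_large = eps**4 / x0_large**2
ratio_large = rhs_447(x0_large, xb_large, eps) / (eps**4 / x0_large**2)

# Both ratios should be of order one, in line with (4.49).
print(ratio_small, ratio_large)
```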
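Lemma 4.5. For $c > 0$ sufficiently small and $-\frac{c}{2} < E_1 < 0$,
\[
\int_0^c \left|\frac{1}{F(x+i0,\varepsilon)} - \frac{1}{H_s(x+i0)}\right| dx \lesssim
\begin{cases}
\dfrac{\varepsilon^2}{|E_1|^{1/2}}, & \text{for } |E_1| \gtrsim \varepsilon^{4/3}, \\
\varepsilon^{4/3}, & \text{for } |E_1| \lesssim \varepsilon^{4/3}.
\end{cases}
\tag{4.54}
\]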
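Proof. The arguments that lead to (4.21), (4.24) and (4.11), can be applied in this case and yield
\[
|\beta - g_{-1}| \lesssim |x_b| \quad \text{and} \quad |\alpha - E_1| \lesssim \varepsilon^2|x_b|^{1/2}, \tag{4.55}
\]
and furthermore
\[
|F(x+i0) - H_s(x+i0)| = \left| E_1 - \alpha + i\varepsilon^2|x|^{-1/2}(g_{-1} - \beta) \right| + \varepsilon^2 O(|x|^{1/2}). \tag{4.56}
\]
Using (4.55) in (4.56), one gets
\[
|F(x+i0) - H_s(x+i0)| \lesssim \varepsilon^2\left(|x_b|^{1/2} + |x_b||x|^{-1/2} + |x|^{1/2}\right). \tag{4.57}
\]
Since $E_1 < 0$, one has $|R(x)| \ge (|E_1| + x)/2$. Furthermore, $|I(x)| = \varepsilon^2 x^{-1/2}(g_{-1} + O(x))$, and then
\[
|F(x+i0)| \gtrsim \left( (|E_1| + x)^2 + \frac{\varepsilon^4}{x} \right)^{1/2}. \tag{4.58}
\]
The result (3.32) leads to
\[
|H_s(x+i0)| \gtrsim \left( (\alpha - x)^2 + \frac{\varepsilon^4}{x} \right)^{1/2}. \tag{4.59}
\]
The problem with $|H_s|$ is that $\alpha - x$ might vanish for some $x > 0$. However, for $x \gtrsim \varepsilon^2$, we can use (4.55) to get
\[
x - \alpha = \frac{x}{2} - E_1 + \frac{x}{2} + E_1 - \alpha \ge \frac{x}{2} - E_1 \ge \frac{x + |E_1|}{2},
\]
and then from (4.57)–(4.59), one has
\[
\int_{\mathrm{const.}\,\varepsilon^2}^c \left|\frac{1}{F(x+i0,\varepsilon)} - \frac{1}{H_s(x+i0)}\right| dx \lesssim \varepsilon^2 \int_{\mathrm{const.}\,\varepsilon^2}^c \frac{|x_b|^{1/2} + |x_b|x^{-1/2} + x^{1/2}}{(|E_1| + x)^2 + \frac{\varepsilon^4}{x}}\, dx. \tag{4.60}
\]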
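We estimate (4.60) for $|E_1| \lesssim \varepsilon^{4/3}$. In this case we get from Proposition 3.2(i) $|x_b| \sim \varepsilon^{4/3}$, and then
\[
\varepsilon^2 \int_{\mathrm{const.}\,\varepsilon^2}^c \frac{|x_b|^{1/2} + |x_b|x^{-1/2} + x^{1/2}}{(|E_1| + x)^2 + \frac{\varepsilon^4}{x}}\, dx \lesssim \varepsilon^2 \int_0^c x\, \frac{\varepsilon^{2/3} + \varepsilon^{4/3}x^{-1/2} + x^{1/2}}{x^3 + \varepsilon^4}\, dx \le \varepsilon^{4/3} \int_0^\infty y\, \frac{1 + y^{1/2} + y^{-1/2}}{1 + y^3}\, dy. \tag{4.61}
\]
Using Proposition 3.2(i) once more, we get for $|E_1| \gtrsim \varepsilon^{4/3}$ that $|x_b| \sim |E_1|$, and then that
\[
\varepsilon^2 \int_{\mathrm{const.}\,\varepsilon^2}^c \frac{|x_b|^{1/2} + |x_b|x^{-1/2} + x^{1/2}}{(|E_1| + x)^2 + \frac{\varepsilon^4}{x}}\, dx \lesssim \varepsilon^2 \int_0^c \frac{|E_1|^{1/2} + |E_1|x^{-1/2} + x^{1/2}}{(x + |E_1|)^2}\, dx \le \frac{\varepsilon^2}{|E_1|^{1/2}} \int_0^\infty \frac{1 + y^{1/2} + y^{-1/2}}{(1 + y)^2}\, dy \lesssim \varepsilon^{4/3}. \tag{4.62}
\]
The last step is to estimate
\[
\varepsilon^2 \int_0^{\mathrm{const.}\,\varepsilon^2} \frac{|x_b|^{1/2} + |x_b|x^{-1/2} + x^{1/2}}{(|E_1| + x)^2 + \frac{\varepsilon^4}{x}}\, dx.
\]
As before, Proposition 3.2(i) implies for $|E_1| \lesssim \varepsilon^{4/3}$ that $|x_b| \sim \varepsilon^{4/3}$, and then
\[
\varepsilon^2 \int_0^{\mathrm{const.}\,\varepsilon^2} \frac{|x_b|^{1/2} + |x_b|x^{-1/2} + x^{1/2}}{(|E_1| + x)^2 + \frac{\varepsilon^4}{x}}\, dx \lesssim \int_0^{\mathrm{const.}\,\varepsilon^2} \frac{\varepsilon^{2/3} + \varepsilon^{4/3}x^{-1/2} + x^{1/2}}{(x^3 + \varepsilon^4)^{1/2}}\, x\, dx = \varepsilon^{4/3} \int_0^{\mathrm{const.}\,\varepsilon^{2/3}} \frac{1 + y^{1/2} + y^{-1/2}}{(1 + y^3)^{1/2}}\, y\, dy \lesssim \varepsilon^{4/3}. \tag{4.63}
\]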
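For $|E_1| \gtrsim \varepsilon^{4/3}$, again from Proposition 3.2(i), $|x_b| \sim |E_1|$, and then
\[
\varepsilon^2 \int_0^{\mathrm{const.}\,\varepsilon^2} \frac{|x_b|^{1/2} + |x_b|x^{-1/2} + x^{1/2}}{(|E_1| + x)^2 + \frac{\varepsilon^4}{x}}\, dx \lesssim \varepsilon^2 \int_0^{\mathrm{const.}\,\varepsilon^2} \frac{|E_1|^{1/2} + |E_1|x^{-1/2} + x^{1/2}}{(x + |E_1|)\, \frac{\varepsilon^2}{x^{1/2}}}\, dx = \int_0^{\mathrm{const.}\,\frac{\varepsilon^2}{|E_1|}} \frac{1 + y^{1/2} + y^{-1/2}}{1 + y}\, y^{1/2}\, |E_1|\, dy \lesssim \varepsilon^2. \tag{4.64}
\]
Now (4.54) follows from (4.61) and (4.64), and the proof of the lemma is finished.

Proof of Theorem 3.8. From this point, the proof is the same as the proof of Theorem 3.7: use Lemmas 4.4 and 4.5 and (see the proof of [8, Lemma 3.2])
\[
-\frac{1}{\frac{d}{dx}H_s(x_b)}\, e^{-itx_b} + \frac{1}{\pi} \int_0^\infty e^{-itx}\, \mathrm{Im}\, H_s(x+i0)^{-1}\, dx = 1. \tag{4.65}
\]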
4.5. Proof of Proposition 3.11 Straightforward computation, which we omit.
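4.6. Proof of Proposition 3.13

With the notation
\[
r = \sqrt{\alpha - \frac{\varepsilon^4\beta^2}{4}} \tag{4.66}
\]
we can write (3.47) and (3.48) as follows.
\[
\kappa_j = (-1)^{j+1} i r - \frac{\varepsilon^2\beta}{2}, \qquad q_j = \frac{1}{2} + (-1)^{j+1} i\, \frac{\varepsilon^2\beta}{4r}, \qquad j = 1, 2. \tag{4.67}
\]
Note that $\frac{\varepsilon^2}{r} \sim \varepsilon^{2 - \frac{p}{2}}$. We have $\kappa_1 = \overline{\kappa_2}$. Thus $i\kappa_1$ and $i\kappa_2$ lie in the third and fourth quadrants respectively. Using (A.8) and (4.67), one obtains by direct computation
\[
\hat A_{\varepsilon,r} = 2q_2 e^{it\kappa_2^2} - \frac{1}{2}\left( \overline{I_1(\kappa_1\sqrt{t})} + \overline{I_1(\overline{\kappa_1}\sqrt{t})} \right) + i\, \frac{\beta\varepsilon^2}{4r}\left( \overline{I_1(\kappa_1\sqrt{t})} - \overline{I_1(\overline{\kappa_1}\sqrt{t})} \right). \tag{4.68}
\]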
Now using (4.67) and the Taylor expansion for $I_1(\kappa_1\sqrt{t})$ around $ir\sqrt{t}$ one gets (with a slight abuse of notation)
\[
I_1(\kappa_1\sqrt{t}) = I_1(ir\sqrt{t}) - \frac{\beta\varepsilon^2}{2}\sqrt{t}\, \frac{d}{dz}I_1(ir\sqrt{t}) + \left( \frac{\beta\varepsilon^2}{2}\sqrt{t} \right)^2 \frac{d^2}{dz^2}I_1(\tilde z),
\]
where $\mathrm{Im}\, \tilde z = r\sqrt{t}$. This expansion (and the corresponding expansion for $I_1(\overline{\kappa_1}\sqrt{t})$) gives, together with (A.5)–(A.7), and Lemma A.4 the result
\[
\hat A_{\varepsilon,r} = 2q_2 e^{it\kappa_2^2} - \frac{\beta\varepsilon^2}{2r}\, I_3(ir\sqrt{t}) + O\left( \frac{\varepsilon^4}{r^2} \right). \tag{4.69}
\]
Expand once again $I_3(ir\sqrt{t})$, this time around $i\sqrt{\alpha t}$, see (4.66), and the proof of the first part of the proposition is concluded. For the second part of the proposition, we first note that $I_p$ are uniformly bounded on the imaginary axis (see Lemma A.5). In order to compute $|A_\varepsilon(t)|^2$ up to errors vanishing as $\varepsilon \to 0$, one needs only to consider the first term on the right-hand side of (3.53). By (3.47) we have $\mathrm{Im}\, \kappa_2^2 = \beta\varepsilon^2 r$, such that neglecting the terms vanishing as $\varepsilon \to 0$ in (3.48), we get
\[
|A_\varepsilon(t)|^2 = e^{-2\beta r\varepsilon^2 t}. \tag{4.70}
\]
To further simplify (4.70), we employ the following elementary inequality. For $a > 0$, $|b| \le \frac{a}{2}$ we have
\[
\sup_{t \ge 0} \left| e^{-(a+b)t} - e^{-at} \right| \le \frac{2|b|}{a} \sup_{y \ge 0} y e^{-y}. \tag{4.71}
\]
Now (3.55) follows from (4.70), (4.71), and the fact that
\[
r = \sqrt{\alpha}\,(1 + O(\varepsilon^{4-p})). \tag{4.72}
\]
Finally, the definition of $\beta$ implies that
\[
\beta = -g_1 + O(\varepsilon^p), \tag{4.73}
\]
which together with (3.55) and (4.71) gives (3.56), and the proof is finished.

4.7. Proof of Proposition 3.14

For the computation of $\hat A_{\varepsilon,r}(t)$ we proceed exactly as in the proof of Proposition 3.13, with the difference that here there is no need to use Lemma A.3. In this case $\kappa_1$ and $\kappa_2$ are real,
\[
\kappa_j = \pm r - \frac{\beta\varepsilon^2}{2}, \qquad r = \sqrt{|\alpha| + \frac{\beta^2\varepsilon^4}{4}}, \qquad j = 1, 2,
\]
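and a computation similar to the one leading to (4.69) gives
\[
\hat A_{\varepsilon,r} = -i\, \frac{\beta\varepsilon^2}{2r}\, I_3(ir\sqrt{t}) + O\left( \frac{\varepsilon^4}{r^2} \right). \tag{4.74}
\]
The contribution of the bound state at $\tilde x_b = -\kappa_1^2$ (see (3.36)) reads after a straightforward computation
\[
-\frac{1}{\frac{d}{dx}H_r(\tilde x_b)}\, e^{-it\tilde x_b} = \left( 1 - \frac{\beta\varepsilon^2}{2r} \right) e^{it\kappa_1^2}. \tag{4.75}
\]
Recall that $|\alpha| = \varepsilon^p$. Thus
\[
r = \sqrt{|\alpha|}\left( 1 + O\left( \frac{\varepsilon^4}{|\alpha|} \right) \right) = \sqrt{|\alpha|}\,(1 + O(\varepsilon^{4-p})).
\]
Expanding in (4.74) $I_3(ir\sqrt{t})$ around $i\sqrt{|\alpha| t}$, and using an argument similar to the one leading to (4.69), one obtains (3.61). Finally (3.62) is a direct consequence of (3.61).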
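The elementary bound (4.71), used in the proof of Proposition 3.13 in the form $\sup_{t \ge 0}|e^{-(a+b)t} - e^{-at}| \le \frac{2|b|}{a}\sup_{y \ge 0} y e^{-y} = \frac{2|b|}{ae}$, can be probed on a grid. The sketch below is illustrative; the helper name `sup_difference`, the cut-off `t_max` and the sample values of $a$ and $b$ are assumptions, and a grid maximum only approximates the supremum from below.

```python
import math


def sup_difference(a, b, n=20000):
    """Grid estimate of sup_{t>=0} |exp(-(a+b)t) - exp(-a t)| (illustrative)."""
    t_max = 60.0 / a            # with |b| <= a/2 both terms are negligible beyond this
    best = 0.0
    for k in range(n + 1):
        t = t_max * k / n
        best = max(best, abs(math.exp(-(a + b) * t) - math.exp(-a * t)))
    return best


a = 2.0
results = []
for b in (-1.0, -0.3, 0.05, 0.7, 1.0):          # all satisfy |b| <= a/2
    bound = 2 * abs(b) / a * math.exp(-1)        # (2|b|/a) * sup_y y*exp(-y)
    results.append((b, sup_difference(a, b), bound))

for b, value, bound in results:
    print(b, value, bound)
```

For $a = 2$, $b = -1$ the supremum is attained at $t = \ln 2$ and equals $1/4$, comfortably below the bound $1/e$.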
4.8. Proof of Proposition 3.15

Recall that $|\alpha| = b\varepsilon^p$, $p > 4$. As $\varepsilon \to 0$, and $|\alpha| \ll \frac{\varepsilon^4\beta^2}{2}$, we get $\kappa_j = \overline{\kappa_j}$. Expanding in (3.47) and (3.48), one gets
\[
\kappa_1 = -\frac{\alpha}{\varepsilon^2\beta}\,(1 + O(\varepsilon^{p-4})), \qquad \kappa_2 = -\varepsilon^2\beta\,(1 + O(\varepsilon^{p-4})), \tag{4.76}
\]
\[
q_1 = O(\varepsilon^{p-4}), \qquad q_2 = 1 + O(\varepsilon^{p-4}). \tag{4.77}
\]
An argument similar to the one leading to (4.69) implies that (see (3.46) and (A.7))
\[
\hat A_{\varepsilon,r} = I_1(i\varepsilon^2\beta\sqrt{t}) + O(\varepsilon^{p-4}). \tag{4.78}
\]
Thus (4.77) implies that the contribution (when it exists) of the bound state is also of the order $O(\varepsilon^{p-4})$, which together with (4.78) finishes the proof of (3.64).

4.9. Proof of Proposition 3.17

As the proof of Proposition 3.11 this is a straightforward computation.

4.10. Proof of Proposition 3.18

In what follows we need the formulae for $\kappa_j$. Using the scaled quantities
\[
f = (\varepsilon^2\beta)^{-2/3}\alpha, \qquad y = (\varepsilon^2\beta)^{-1/3}\kappa, \tag{4.79}
\]
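one has
\[
\kappa_j = (\varepsilon^2\beta)^{1/3} y_j \tag{4.80}
\]
where $y_j$ are the solutions of
\[
y^3 + f y - 1 = 0. \tag{4.81}
\]
The Cardano formula gives (for the labeling of $\kappa_j$ see the discussion following Proposition 3.17)
\[
y_j = \rho_j r - \frac{f}{3\rho_j r} \tag{4.82}
\]
where
\[
\rho_j = e^{\frac{2\pi i}{3}(j-3)}, \qquad r = \left( \frac{1 + (1 + (4f^3/27))^{1/2}}{2} \right)^{1/3}. \tag{4.83}
\]
In (4.83) the principal determination of fractional powers is taken. We recall that we are assuming $\alpha = b\varepsilon^p$ and $0 < p < 4/3$. Thus $f = b\beta^{-2/3}\varepsilon^{-(\frac{4}{3} - p)} \to \infty$ as $\varepsilon \to 0$. Expanding in (4.82) one obtains
\[
\kappa_1 = i\sqrt{\alpha}\left( 1 + i\, \frac{\beta\varepsilon^2}{2\alpha^{3/2}} + O(f^{-3}) \right), \qquad \kappa_2 = \overline{\kappa_1}, \qquad \kappa_3 = \frac{\beta\varepsilon^2}{\alpha}\,(1 + O(f^{-3/2})) \tag{4.84}
\]
and
\[
q_1 = \frac{1}{2} - i\, \frac{\beta\varepsilon^2}{4\alpha^{3/2}} + O(f^{-3}), \qquad q_2 = \overline{q_1}, \qquad q_3 = O(f^{-3}). \tag{4.85}
\]
Now (4.85) and Lemma A.5 imply that
\[
q_3 I_1(i\kappa_3\sqrt{t}) = O(f^{-3})
\]
uniformly in $t > 0$, such that up to errors of order $\varepsilon^{4-3p}$, we have
\[
\hat A_{\varepsilon,s}(t) = -\sum_{j=1}^{2} q_j I_1(i\kappa_j\sqrt{t}).
\]
From this point the proof of Proposition 3.18 is a repetition of the proof of Proposition 3.13.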
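The Cardano representation (4.82), (4.83) of the roots of (4.81) can be checked directly. The sketch below is an illustration, not the paper's computation: it uses the standard Cardano form $y_j = \rho_j r - f/(3\rho_j r)$, the function name `cardano_roots` is hypothetical, and Python's complex power is relied upon for the principal branch of the cube root.

```python
import cmath


def cardano_roots(f):
    """Roots of y^3 + f*y - 1 = 0 via the Cardano form of (4.82), (4.83)."""
    # r^3 = (1 + sqrt(1 + 4 f^3 / 27)) / 2, principal branches throughout
    r = ((1 + cmath.sqrt(1 + 4 * f**3 / 27)) / 2) ** (1.0 / 3.0)
    roots = []
    for j in (1, 2, 3):
        rho = cmath.exp(2j * cmath.pi * (j - 3) / 3)   # rho_j = exp(2*pi*i*(j-3)/3)
        roots.append(rho * r - f / (3 * rho * r))
    return roots


def residual(y, f):
    """Absolute value of the cubic (4.81) evaluated at y."""
    return abs(y**3 + f * y - 1)


for f in (2.0, -3.0, 1000.0):
    print(f, [residual(y, f) for y in cardano_roots(f)])

# For large f the root with j = 3 behaves like 1/f, which after rescaling
# by (4.80) corresponds to kappa_3 ~ beta * eps^2 / alpha in (4.84).
y3_large_f = cardano_roots(1000.0)[2]
print(y3_large_f * 1000.0)
```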
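4.11. Proof of Proposition 3.19

In this case all $\kappa_j$, $q_j$ are real, and as $\varepsilon \to 0$ we have
\[
\kappa_2 = -\frac{\beta\varepsilon^2}{|\alpha|}\,(1 + O(|f|^{-3/2})), \qquad
\kappa_1 = |\alpha|^{1/2}\left( -1 + \frac{\beta\varepsilon^2}{2|\alpha|^{3/2}} + O(|f|^{-3}) \right), \qquad
\kappa_3 = |\alpha|^{1/2}\left( 1 + \frac{\beta\varepsilon^2}{2|\alpha|^{3/2}} + O(|f|^{-3}) \right) \tag{4.86}
\]
and
\[
q_2 = O(|f|^{-3}), \qquad q_3 = \frac{1}{2} - \frac{\beta\varepsilon^2}{4|\alpha|^{3/2}} + O(|f|^{-3}), \qquad q_1 = \frac{1}{2} + \frac{\beta\varepsilon^2}{4|\alpha|^{3/2}} + O(|f|^{-3}). \tag{4.87}
\]
Thus we have
\[
q_2 I_1(i\kappa_2\sqrt{t}) = O(|f|^{-3}),
\]
uniformly in $t > 0$, such that up to errors of order $\varepsilon^{4-3p}$,
\[
\hat A_{\varepsilon,s}(t) = -\sum_{j=1,3} q_j I_1(i\kappa_j\sqrt{t}).
\]
Repetition of the proof of (3.60) leads to (3.80). Adding the contribution of the bound state one obtains (3.81). Finally, taking into account (4.87) one obtains (3.82) and the proof of Proposition 3.19 is finished.

4.12. Proof of Proposition 3.20

In this case we have $f \to 0$ as $\varepsilon \to 0$. Expanding in (4.82) one obtains
\[
\kappa_j = \rho_j\beta^{1/3}\varepsilon^{2/3}\,(1 + O(\varepsilon^{p - \frac{4}{3}})), \qquad q_j = \frac{1}{3} + O(\varepsilon^{p - \frac{4}{3}}). \tag{4.88}
\]
Using the expansion of $I_1(i\kappa_j\sqrt{t})$ around $i\rho_j\beta^{1/3}\varepsilon^{2/3}\sqrt{t}$, one obtains from (3.71) (by an argument similar to the one leading to (4.69))
\[
\hat A_{\varepsilon,s}(t) = -\frac{1}{3}\sum_{j=1}^{3} I_1(i\rho_j\beta^{1/3}\varepsilon^{2/3}\sqrt{t}) + O(\varepsilon^{p - \frac{4}{3}}).
\]
Adding the bound state contribution and using (4.88) in the form
\[
2q_3 e^{i\kappa_3^2 t} = \frac{2}{3}\, e^{i\kappa_3^2 t} + O(\varepsilon^{p - \frac{4}{3}}),
\]
one obtains (3.83).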
Acknowledgment

Arne Jensen was partially supported by the grant "Mathematical Physics" from the Danish Natural Science Research Council. Most of the work was carried out while Gheorghe Nenciu was visiting professor at the Department of Mathematical Sciences, Aalborg University. The support of the Department is gratefully acknowledged.

Appendix. The Function $I_p$

Define for integers $p \ge 1$, and for complex $z$ with $\mathrm{Im}\, z \ne 0$, the function
\[
I_p(z) = \frac{1}{i\pi} \int_{-\infty}^{\infty} \frac{e^{-ix^2}}{(x - z)^p}\, dx. \tag{A.1}
\]
The integral in (A.1) is absolutely convergent, if $p \ge 2$. For $p = 1$ one can define it by
\[
I_1(z) = \lim_{A \to \infty} \frac{1}{i\pi} \int_{-A}^{A} \frac{e^{-ix^2}}{x - z}\, dx. \tag{A.2}
\]
One can also define
\[
I_0(z) = \lim_{A \to \infty} \frac{1}{i\pi} \int_{-A}^{A} e^{-ix^2}\, dx. \tag{A.3}
\]
As an alternative to (A.2) one can use the formula
\[
I_1(z) = \frac{2z}{i\pi} \int_0^\infty \frac{e^{-ix^2}}{x^2 - z^2}\, dx. \tag{A.4}
\]
The functions $I_p$ for $p \ge 2$ are up to a multiplicative constant the derivatives of $I_1$, since we have
\[
I_p(z) = \frac{1}{(p-1)!}\, \frac{d^{p-1}}{dz^{p-1}} I_1(z). \tag{A.5}
\]

Lemma A.1. For $p \ge 2$ we have
\[
|I_p(z)| \le \frac{1}{\pi|\mathrm{Im}\, z|^{p-1}} \int_{-\infty}^{\infty} (1 + x^2)^{-p/2}\, dx. \tag{A.6}
\]

Proof. The result follows from a simple computation, which we omit.

Lemma A.2. For $p \ge 1$ and $b \in \mathbb{R} \setminus \{0\}$ we have
\[
I_p(ib) = (-1)^p I_p(-ib). \tag{A.7}
\]

Proof. Using (A.4) we get
\[
I_1(ib) = \frac{2b}{\pi} \int_0^\infty \frac{e^{-ix^2}}{x^2 + b^2}\, dx = (-1) I_1(-ib).
\]
This result, together with (A.5), implies (A.7).
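The representation of $I_1(ib)$ used in the proof of Lemma A.2 is convenient for numerical work. The sketch below is an illustration under stated assumptions: the function name `I1_imag_axis` is hypothetical, $b > 0$ is assumed, and a plain midpoint rule is used after the substitution $x = b\tan\theta$, which turns the integral into $\frac{2}{\pi}\int_0^{\pi/2} e^{-ib^2\tan^2\theta}\, d\theta$. In this form the bound $|I_1(ib)| \le 1$ is visible at once, since the integrand has modulus one.

```python
import cmath
import math


def I1_imag_axis(b, n=200000):
    """Midpoint-rule approximation of I_1(ib) for b > 0 (illustrative).

    Uses I_1(ib) = (2b/pi) * int_0^inf exp(-i x^2) / (x^2 + b^2) dx and the
    substitution x = b*tan(theta), giving
    (2/pi) * int_0^{pi/2} exp(-i b^2 tan(theta)^2) dtheta.
    """
    h = (math.pi / 2) / n
    total = 0.0 + 0.0j
    for k in range(n):
        theta = (k + 0.5) * h
        total += cmath.exp(-1j * (b * math.tan(theta)) ** 2)
    return total * h * 2 / math.pi


value_small_b = I1_imag_axis(0.01)
value_unit_b = I1_imag_axis(1.0)
print(value_small_b, abs(value_small_b))
print(value_unit_b, abs(value_unit_b))
```

The integrand oscillates rapidly near $\theta = \pi/2$, so the midpoint rule is only a rough approximation there; for small $b$ the value is close to one, in agreement with the substituted integral at $b = 0$.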
Fig. A.1. The integration path $\Gamma_A$.
Lemma A.3. Assume that $\mathrm{Re}\, z \ne 0$ and $\mathrm{Im}\, z \ne 0$. Then we have the result
\[
I_1(z) = \overline{I_1(\overline{-iz})} - (\mathrm{sign}(\mathrm{Re}\, z) - \mathrm{sign}(\mathrm{Im}\, z))\, e^{-iz^2}. \tag{A.8}
\]

Proof. This result follows from the calculus of residues. We sketch the details. Consider the contour $\Gamma_A$ in the complex plane, as shown in Fig. A.1. We assume $A > |z|$. The residue theorem implies that we have
\[
\int_{\Gamma_A} \frac{e^{-i\zeta^2}}{\zeta - z}\, d\zeta = -(\mathrm{sign}(\mathrm{Re}\, z) - \mathrm{sign}(\mathrm{Im}\, z))\, i\pi e^{-iz^2}. \tag{A.9}
\]
The contributions from the two circular arcs vanish as $A \to \infty$, since $\sin(2t)$ is negative for $t$ satisfying $\pi/2 < t < \pi$ and $3\pi/2 < t < 2\pi$. Thus we have
\[
\int_{-A}^{A} \frac{e^{-ix^2}}{x - z}\, dx + \int_{-A}^{A} \frac{e^{iy^2}}{y + iz}\, dy = -(\mathrm{sign}(\mathrm{Re}\, z) - \mathrm{sign}(\mathrm{Im}\, z))\, i\pi e^{-iz^2} + o(1). \tag{A.10}
\]
The result now follows by rewriting the second term on the left using complex conjugation, and by taking the limit $A \to \infty$.

Lemma A.4. For $\mathrm{Im}\, z \ne 0$ and $p \ge 2$ we have
\[
I_{p-1}(z) + z I_p(z) = \frac{ip}{2}\, I_{p+1}(z). \tag{A.11}
\]

Proof. Follows from an integration by parts. The details are omitted.

Lemma A.5. We have for all $p \ge 1$ that
\[
\sup_{b > 0} |I_p(ib)| < \infty. \tag{A.12}
\]
Proof. For $p = 1$ we use (A.4) to get
\[
|I_1(ib)| = \left| \frac{2b}{\pi} \int_0^\infty \frac{e^{-it^2}}{t^2 + b^2}\, dt \right| \le \frac{2}{\pi} \int_0^\infty \frac{b}{t^2 + b^2}\, dt = 1.
\]
For $p \ge 2$ we get a uniform estimate for $b \ge 1$ from (A.6). For $0 < b < 1$ we can deform the integration contour in $I_p(ib)$ into the lower half plane, e.g. follow the real axis from $-\infty$ to $-1$, the unit circle in the lower half plane, and then the real axis from $1$ to $\infty$. Then we get a uniform estimate also in this case.

We now establish the connections with the error function. To this end we recall some definitions, see [1, p. 297].
\[
\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt, \tag{A.13}
\]
\[
\mathrm{erfc}(z) = 1 - \mathrm{erf}(z). \tag{A.14}
\]
The integral in (A.13) can be taken along any path in the complex plane connecting $0$ and $z$. We also need
\[
w(z) = e^{-z^2}\, \mathrm{erfc}(-iz) \tag{A.15}
\]
for $\mathrm{Im}\, z > 0$, see [1, 7.1.3]. Note that all these functions are entire functions. We define the function
\[
\hat w(z) = \frac{i}{\pi} \int_{-\infty}^{\infty} \frac{e^{-t^2}}{z - t}\, dt = \frac{2iz}{\pi} \int_0^\infty \frac{e^{-t^2}}{z^2 - t^2}\, dt \quad \text{for } \mathrm{Im}\, z \ne 0. \tag{A.16}
\]
We have the result [1, (7.1.3)]
\[
w(z) = \hat w(z) \quad \text{for } \mathrm{Im}\, z > 0. \tag{A.17}
\]
In the lemma below, which gives the relation between $I_1(z)$ and $\hat w(z)$, we use $\mathrm{Arg}\, z$ to denote the determination of the argument taking values in $(0, 2\pi)$.

Lemma A.6. Assume that $\mathrm{Im}\, z \ne 0$ and $\mathrm{Im}(e^{i\pi/4}z) \ne 0$. Then we have
\[
I_1(z) = \sigma(z)\, e^{-iz^2} + \hat w(e^{i\pi/4}z), \tag{A.18}
\]
where
\[
\sigma(z) =
\begin{cases}
2, & \frac{3\pi}{4} < \mathrm{Arg}\, z < \pi, \\
-2, & \frac{7\pi}{4} < \mathrm{Arg}\, z < 2\pi, \\
0, & \text{otherwise}.
\end{cases}
\tag{A.19}
\]

Proof. Let $z$ be fixed and satisfying the assumption in the lemma. We denote by $\gamma_A$ the path shown in Fig. A.2, where $A > |z|$.
Fig. A.2. The integration path $\gamma_A$.
The calculus of residues yields that
\[
\int_{\gamma_A} \frac{e^{-i\zeta^2}}{\zeta - z}\, d\zeta = \sigma(z)\, \pi i\, e^{-iz^2},
\]
where $\sigma(z)$ is defined in the lemma. On the other hand, using the explicit parametrization and the fact that the contributions from the two circular arcs tend to zero as $A \to \infty$, cf. the proof of Lemma A.3, we also have
\[
\int_{\gamma_A} \frac{e^{-i\zeta^2}}{\zeta - z}\, d\zeta = \int_{-\infty}^{\infty} \frac{e^{-ix^2}}{x - z}\, dx - \int_{-\infty}^{\infty} \frac{e^{-t^2}}{t - e^{i\pi/4}z}\, dt + o(1).
\]
Thus the result follows by taking the limit $A \to \infty$ and using the definitions of $I_1(z)$ and $\hat w(z)$, noting the choice of signs in these definitions.

We finish by giving also the relation between $I_1(z)$ and $w(z)$. From (A.17) and Lemma A.3, one has the result

Lemma A.7. Let for $\mathrm{Im}\, z \ne 0$, $\Sigma(z) = \mathrm{sign}(\mathrm{Im}\, z)$. Then
\[
I_1(z) = \Sigma(z)\, w(\Sigma(z)\, e^{i\pi/4}z). \tag{A.20}
\]

Proof. For $0 < \mathrm{Arg}\, z < 3\pi/4$ (A.20) follows from (A.17) and Lemma A.3. By analytic continuation one obtains (A.20) for $\Sigma(z) > 0$. Similarly, for $7\pi/4 < \mathrm{Arg}\, z < 2\pi$, (A.20) follows from (A.17), Lemma A.3, and $w(-z) = 2e^{-z^2} - w(z)$, see [1, 7.1.11].

References

[1] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1965).
[2] B. Baumgartner, Interchannel resonances at a threshold, J. Math. Phys. 37 (1996) 5928–5938.
[3] L. Cattaneo, G. M. Graf and W. Hunziker, A general resonance theory based on Mourre's inequality, Ann. H. Poincaré 7 (2006) 583–614.
[4] C. B. Chiu, E. C. G. Sudarshan and B. Misra, The evolution of unstable quantum states and a resolution of Zeno's paradox, Phys. Rev. D 16 (1976) 520–529.
[5] O. Costin and A. Soffer, Resonance theory for Schrödinger operators, Comm. Math. Phys. 224 (2001) 133–152.
[6] E. B. Davies, Resonances, spectral concentration and exponential decay, Lett. Math. Phys. 1 (1975) 31–35.
[7] M. Demuth, Pole approximation and spectral concentration, Math. Nachr. 73 (1976) 65–72.
[8] V. Dinu, A. Jensen and G. Nenciu, Non-exponential decay laws in perturbation theory of near threshold eigenvalues, J. Math. Phys. 50 (2009) 013516, 20 pp.
[9] P. Exner, Open Quantum Systems and Feynman Integrals (Reidel, Dordrecht, 2002).
[10] L. Fonda, G. C. Ghirardi and A. Rimini, Decay theory of unstable quantum systems, Rep. Prog. Phys. 41 (1978) 587–631.
[11] P. T. Greenland, Seeking the non-exponential decay, Nature 335 (1988) 298–299.
[12] W. Hunziker, Resonances, metastable states and exponential decay laws in perturbation theory, Comm. Math. Phys. 132 (1990) 177–188.
[13] A. Jensen, Spectral properties of Schrödinger operators and time-decay of the wave functions. Results in $L^2(\mathbb{R}^m)$, $m \ge 5$, Duke Math. J. 47 (1980) 57–80.
[14] A. Jensen and T. Kato, Spectral properties of Schrödinger operators and time-decay of the wave functions, Duke Math. J. 46 (1979) 583–611.
[15] A. Jensen and G. Nenciu, A unified approach to resolvent expansions at thresholds, Rev. Math. Phys. 13(6) (2001) 717–754.
[16] A. Jensen and G. Nenciu, The Fermi golden rule and its form at thresholds in odd dimensions, Comm. Math. Phys. 261 (2006) 693–727.
[17] A. Jensen and G. Nenciu, Schrödinger operators on the half line: Resolvent expansions and the Fermi golden rule at thresholds, Proc. Indian Acad. Sci. (Math. Sci.) 116 (2006) 375–392.
[18] A. Jensen and G. Nenciu, On the Fermi golden rule: Degenerate eigenvalues, in Perspectives in Operator Algebras and Mathematical Physics, Proc. Conf. Operator Theory and Mathematical Physics, Bucharest (August, 2005) (Theta, Bucharest, 2008), pp. 91–103.
[19] A. Jensen and G. Nenciu, Exponential decay laws in perturbation theory of threshold and embedded eigenvalues, in New Trends in Mathematical Physics, Proceedings of 15th International Congress on Mathematical Physics, Rio de Janeiro, 2006 (Springer, 2009), pp. 525–539.
[20] T. Jittoh, S. Masumoto, J. Sato, Y. Sato and K. Takeda, Non-exponential decay of an unstable quantum system: Small Q-value s-wave decay, Phys. Rev. A 71 (2005) 012109, 7 pp.
[21] T. Köhler, K. Góral and P. Julienne, Production of cold molecules via magnetically tunable Feshbach resonances, Rev. Mod. Phys. 78 (2006) 1311–1361.
[22] M. Lewenstein, J. Zakrewski, T. Mossberg and J. Mostowski, Non-exponential spontaneous decay in cavities and waveguides, J. Phys. B 21 (1988) L9–L14.
[23] B. Marcelis, E. van Kempen, B. Verhaar and J. Kokkelmans, Feshbach resonances with large background scattering length: Interplay with open-channel resonances, Phys. Rev. A 70 (2004) 012701, 15 pp.
[24] M. Merkli and I. M. Sigal, A time-dependent theory of quantum resonances, Comm. Math. Phys. 201 (1999) 549–576.
[25] M. Miyamoto, Bound-state eigenenergy outside and inside the continuum for unstable multilevel systems, Phys. Rev. A 72 (2005) 063405, 9 pp.
[26] M. Murata, Asymptotic expansions in time for solutions of Schrödinger-type equations, J. Funct. Anal. 49(1) (1982) 10–56.
[27] H. Nakazato, M. Namiki and S. Pascazio, Temporal behavior of quantum mechanical systems, Int. J. Mod. Phys. B 3 (1996) 247–295.
[28] R. G. Newton, Quantum Physics (Springer-Verlag, 2002).
[29] C. Nicolaides, Physical constraints on nonstationary states and nonexponential decay, Phys. Rev. A 66 (2002) 022118, 7 pp.
[30] N. Nygaard, B. Schneider and P. Julienne, Two-channel R-matrix analysis of magnetic-field-induced Feshbach resonances, Phys. Rev. A 73 (2006) 042705, 10 pp.
[31] A. Orth, Quantum mechanical resonance and limiting absorption: The many body problem, Comm. Math. Phys. 126 (1990) 559–573.
[32] X. P. Wang, Embedded eigenvalues and resonances of Schrödinger operators with two channels, Ann. Fac. Sci. Toulouse Math. (6) 16(1) (2007) 179–214.
[33] K. Yajima, The $L^p$ boundedness of wave operators for Schrödinger operators with threshold singularities I, Odd dimensional case, J. Math. Sci. Univ. Tokyo 13 (2006) 43–93.
March 23, J070-S0129055X11004242
2011 10:41 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 2 (2011) 127–154 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004242
ON THE STOCHASTIC DEPENDENCE STRUCTURE OF THE LIMIT LOGNORMAL PROCESS
DMITRY OSTROVSKY 125 Field Point Rd. #3, Greenwich, CT 06830, USA dm [email protected] Received 2 February 2010 Revised 8 September 2010
The distribution of a single increment of the limit lognormal process of Mandelbrot, several representations of its Mellin transform, and an explicit analytic continuation of the Selberg integral are reviewed. The intermittency invariance of the limit lognormal construction is used to establish a functional Feynman–Kac equation that captures the entire stochastic dependence structure of the limit process. This equation is a general rule of intermittency differentiation that quantifies how the joint distribution of an arbitrary number of increments of the limit process evolves as a function of intermittency. The solution is represented by means of a formal intermittency expansion and shown to be an exactly renormalized expansion in the joint centered moments of the limit process. The expansion coefficients are related to a novel extension of the Selberg integral. Keywords: Multifractal stochastic processes; intermittency; Selberg integral; stochastic dependence. Mathematics Subject Classification 2010: 60G57, 76F55, 81T40
1. Introduction

In this paper we study the stochastic dependence structure of the limit lognormal process of Mandelbrot [10, 11] and Bacry et al. [1]. The interest in this multifractal stochastic process derives primarily from its remarkable stochastic self-similarity property and its connection to the famous Selberg integral, cf. [7, 17], which gives the positive integral moments of the limit distribution as shown in [2]. The mathematical structure that is behind these properties is the stochastic dependence structure of increments of the limit process. The key feature of the limit lognormal distribution is that it is defined as the limit of an infinite sum of lognormal, logarithmically correlated random variables, which are built from a two-dimensional Gaussian free field. Such constructions have recently attracted substantial interest as they occur in a wide spectrum of problems in mathematical physics ranging from statistical modeling of fully developed turbulence [3], to conformal field theory and quantum gravity [5, 6], to random energy models [8]. The dependence structure of
increments of the limit lognormal process is thus of fundamental interest for our understanding of the mathematics of multifractality. The review part of this paper deals with the distribution of a single increment of the limit lognormal process. The limit lognormal process is defined as the zeroscale limit of the exponential functional of the underlying stationary normal process with strongly dependent increments. In our previous work [13], we introduced the technique of functional Feynman–Kac equations, which translates the invariances of this underlying process with respect to scale, decorrelation length, and intermittency parameters into the corresponding functional equations for the limit lognormal process. We used the latter invariance in [14] to derive the general rule of intermittency differentiation for a single increment of the limit process. This rule is a functional equation for the derivative of the expectation of an arbitrary smooth function of the limit distribution with respect to the intermittency parameter. By iterating this rule ad infinitum, we obtained a formal power series expansion of any such functional with certain universal coefficients that are determined uniquely by the Selberg integral. The resulting intermittency expansion can thus be interpreted as an exactly renormalized expansion in the moments of the limit distribution. We applied it successfully in [15] to compute the expansion of the Mellin transform of a single increment of the limit process. We summed it by a moment constant method and thus obtained an explicit closed-form formula for the Mellin transform. We then verified that the resulting formula reproduces the known values of the integral moments of the limit lognormal distribution and is in fact the Mellin transform of a valid positive probability distribution. 
Hence, we effectively introduced a new probability distribution with the properties that its integral moments at arbitrary intermittency and the asymptotics of its Mellin transform in the limit of small intermittency coincide with the corresponding quantities of the limit lognormal distribution. It is our conjecture that the two are one and the same. In [16], we extended our solution for the Mellin transform to the general transform of the limit distribution and presented the solution in an operator form. We also computed in [16] the cumulants of the logarithm of the limit distribution, which capture the distribution uniquely as the associated moment problem is determinate, unlike the moment problems for the distribution and its reciprocal as shown in [15].

The fact that the positive integral moments of the limit lognormal distribution are given by the Selberg integral was first established in [2]. Hence, the problem of computing the Mellin transform of this distribution is the same as that of finding an analytic continuation of the Selberg integral as a function of its dimension into a domain in the complex plane. We solved this problem in [15] and found an explicit extension of Selberg’s finite product formula in the form of an infinite product of gamma factors. In this paper, we give a novel proof of this fact that is more direct than the original.

The objective of the research part of this paper is to quantify the dependence structure of increments of the limit lognormal process. The main technical difficulty in solving this problem lies in the strongly non-Markovian nature of the
Stochastic Dependence Structure of Limit Lognormal Process
process and, more specifically, in the divergence of its positive integral moments, i.e. the Selberg integral, at any level of intermittency, however small, which renders the naive expansion in the moments useless.

The main result of this paper is an extension of the rule of intermittency differentiation for a single increment to the case of multiple increments. We show how the intermittency invariance of the underlying normal process can be translated into a functional Feynman–Kac equation for an arbitrary number of increments to capture the entire dependence structure of the limit process. This equation is solved exactly by means of a formal intermittency expansion, whose coefficients are determined by positive integral moments and vanish beyond a finite range, thereby providing a type of exact renormalization as they do in the single-increment case. The main innovation is that the moments are now the joint moments and are given by a nontrivial generalization of the Selberg integral. For example, in the simplest case of two adjacent increments of size 1/2, the integral involved is
\[
\int_{[0,1]^{k_1+k_2}} \prod_{i<j}^{k_1} |x_i - x_j|^{-\mu} \prod_{\substack{i=1\cdots k_1 \\ j=k_1+1\cdots k_1+k_2}} |x_i + x_j|^{-\mu} \prod_{k_1+1\le i<j}^{k_1+k_2} |x_i - x_j|^{-\mu}\, dx. \tag{1.1}
\]
We will show how the collection of such integrals for all k1 and k2 determines the stochastic dependence of the two increments, in other words, how this dependence can be reconstructed from the dependence of the multiple integral in Eq. (1.1) on the intermittency parameter µ. The plan of this paper is as follows. In Sec. 2, we present a detailed review of the known results in the case of a single increment. In Sec. 3, we state our main result, describe some of the properties of the ensuing intermittency expansions, and relate them to an extension of the Selberg integral. In Sec. 4, we study the basic properties of the new integral in the case of two adjacent increments. In Sec. 5, we give all the proofs. Conclusions are presented in Sec. 6.
2. Review of Intermittency Differentiation and Intermittency Expansions for a Single Increment

We begin by giving the necessary definitions and refer the reader to [1–3, 12] for the original presentation. Let $\omega_\varepsilon(s) \equiv \omega_{\mu,L,\varepsilon}(s)$ be a Gaussian process in $s$, whose mean and covariance are functions of a finite scale $\varepsilon > 0$. We consider the process
\[
M_{\mu,L,\varepsilon}(t) = \int_0^t e^{\omega_\varepsilon(s)}\, ds. \tag{2.1}
\]
Bacry and Muzy [12] defined the mean and covariance of $\omega_\varepsilon(s)$ to be
\[
E[\omega_\varepsilon(t)] = -\frac{\mu}{2}\Bigl(1 + \log\frac{L}{\varepsilon}\Bigr), \tag{2.2}
\]
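\[
\mathrm{Cov}[\omega_\varepsilon(t), \omega_\varepsilon(s)] = \mu \log\frac{L}{|t-s|}, \qquad \varepsilon \le |t-s| \le L,
\]
\[
\mathrm{Cov}[\omega_\varepsilon(t), \omega_\varepsilon(s)] = \mu\Bigl(1 + \log\frac{L}{\varepsilon} - \frac{|t-s|}{\varepsilon}\Bigr), \tag{2.3}
\]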
if $|t-s| < \varepsilon$, and the covariance is zero in the remaining case of $|t-s| \ge L$. Thus, $\varepsilon$ is used as a truncation scale. $L$ is the fundamental decorrelation length of the process that regulates the extent of long-range dependence. Indeed, nonoverlapping increments of $M_{\mu,L,\varepsilon}(t)$ are dependent only if they are within $L$ apart. $\mu$ is the intermittency parameter.$^{a}$ The two key properties of this construction are, first, that $E[\omega_\varepsilon(t)] = -\mathrm{Var}[\omega_\varepsilon(t)]/2$ so that $E[\exp(\omega_\varepsilon(s))] = 1$ and, second, that $\mathrm{Var}[\omega_\varepsilon(t)]$ is logarithmically divergent as $\varepsilon \to 0$. The first property is essential for convergence, the second is responsible for multifractality, and both are originally due to Mandelbrot [10].

The limit lognormal process $M_{\mu,L}(t)$ is defined to be the zero-scale limit $\varepsilon \to 0$ of the finite-scale processes $M_{\mu,L,\varepsilon}(t)$. Strictly speaking, $M_{\mu,L,\varepsilon}(dt)$ is a random measure on the real line, whose weak a.s. convergence to a nondegenerate limit measure $M_{\mu,L}(dt)$ was established in [3] based on the theory of convergence of a certain class of positive martingales developed by Kahane [9] and the work of Barral and Mandelbrot [4] on log-Poisson cascades. Nondegeneracy means that $E[M_{\mu,L}(dt)] = dt$.

The fundamental property of the limit process, first established in [12], is its stochastic self-similarity, also known as continuous dilation invariance. Given $t < L$, let $W_t$ denote a lognormal random variable that is independent of $M_{\mu,L}(L)$ such that $E[\log W_t] = -(1 + \mu/2)\log(L/t)$ and $\mathrm{Var}[\log W_t] = \mu \log(L/t)$. Then, there holds the following equality in law
\[
M_{\mu,L}(t) = W_t\, M_{\mu,L}(L). \tag{2.4}
\]
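As an informal sanity check of the normalizations in Eqs. (2.2)–(2.4), the following Python sketch evaluates the mean and covariance of $\omega_\varepsilon$ and the lognormal factor $W_t$. It is only an illustration: the function names and the parameter values are our own choices, and the only fact used beyond the text is that $E[e^X] = e^{m + v/2}$ for a normal variable $X$ with mean $m$ and variance $v$.

```python
import math


def omega_mean(mu, L, eps):
    """Mean of omega_eps(t) as in Eq. (2.2)."""
    return -0.5 * mu * (1.0 + math.log(L / eps))


def omega_cov(mu, L, eps, lag):
    """Covariance of omega_eps at time lag |t - s| as in Eq. (2.3)."""
    lag = abs(lag)
    if lag >= L:
        return 0.0                      # no dependence beyond the decorrelation length
    if lag >= eps:
        return mu * math.log(L / lag)   # logarithmic correlations
    return mu * (1.0 + math.log(L / eps) - lag / eps)  # truncation below eps


def lognormal_mean(m, v):
    """E[exp(X)] for a normal X with mean m and variance v."""
    return math.exp(m + 0.5 * v)


# Illustrative (hypothetical) parameter values.
mu, L, eps, t = 0.3, 2.0, 1e-3, 0.5

variance = omega_cov(mu, L, eps, 0.0)
mean_exp_omega = lognormal_mean(omega_mean(mu, L, eps), variance)

# Mean of the lognormal factor W_t in Eq. (2.4).
log_ratio = math.log(L / t)
mean_W = lognormal_mean(-(1.0 + mu / 2.0) * log_ratio, mu * log_ratio)
```

With these definitions `mean_exp_omega` equals 1 up to rounding, which is the first key property quoted above, and `mean_W` equals $t/L$, consistent with $E[M_{\mu,L}(t)] = t$ on both sides of Eq. (2.4).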
It must be emphasized that this equality is strictly in law, that is, $W_t$ is not a stochastic process and $W_t$ and $W_s$ for $s \ne t$ are not defined on the same probability space. In particular, Eq. (2.4) determines the distribution of $M_{\mu,L}(t)$ in terms of that of $M_{\mu,L}(L)$ but says nothing about $M_{\mu,L}(L)$ or their joint distribution.

$^{a}$ The intermittency parameter is denoted by $\lambda^2$ in Bacry and Muzy [12] so that $\mu = \lambda^2$.

Throughout this section, the decorrelation length is set at $L = 1$ without loss of generality because $M_{\mu,L}(t = L) = L\, M_{\mu,1}(1)$ in law as shown in [13]. We will abbreviate $M_{\mu,1,\varepsilon}(t)$ as $M_{\mu,\varepsilon}(t)$, $M_{\mu,1}(t)$ as $M_\mu(t)$, and $M_{\mu,1}(1)$ as $M_\mu$ when no confusion can arise. The positive integral moments of $M_\mu$ were shown in [2] to be given by the celebrated Selberg integral, confer [17]. Given integral $l$ such that $2 \le l < 2/\mu$,
\[
E[M_\mu^l] = \int_0^1 \cdots \int_0^1 \prod_{i<j}^{l} |s_i - s_j|^{-\mu}\, ds^{(l)}
= \prod_{k=0}^{l-1} \frac{\Gamma(1 - (k+1)\mu/2)\,\Gamma^2(1 - k\mu/2)}{\Gamma(1 - \mu/2)\,\Gamma(2 - (l+k-1)\mu/2)}. \tag{2.5}
\]
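The finite product in Eq. (2.5) is straightforward to evaluate numerically. The sketch below, in which the function name `selberg_moment` and the test value of $\mu$ are our own hypothetical choices, computes it through log-gamma values and compares it for $l = 2$ with the elementary double integral $\int_0^1\!\int_0^1 |s_1 - s_2|^{-\mu}\, ds_1 ds_2 = 2/((1-\mu)(2-\mu))$.

```python
import math


def selberg_moment(l, mu):
    """E[M_mu^l] from the product formula in Eq. (2.5); finite only for l * mu < 2."""
    if mu < 0 or l * mu >= 2:
        raise ValueError("the product formula requires 0 <= mu and l < 2/mu")
    log_value = 0.0
    for k in range(l):
        log_value += (math.lgamma(1 - (k + 1) * mu / 2)
                      + 2 * math.lgamma(1 - k * mu / 2)
                      - math.lgamma(1 - mu / 2)
                      - math.lgamma(2 - (l + k - 1) * mu / 2))
    return math.exp(log_value)


mu = 0.4  # illustrative value
second_moment = selberg_moment(2, mu)
second_moment_direct = 2.0 / ((1 - mu) * (2 - mu))
```

Working with `math.lgamma` rather than `math.gamma` is a design choice that avoids overflow for larger $l$; all gamma arguments are positive when $l < 2/\mu$, so no sign bookkeeping is needed.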
From now on we will denote the Selberg integral in Eq. (2.5) by $S_l(\mu)$. In general, it is shown in [3] that for $q > 0$ we have $E[M_\mu^q] < \infty$ if $q < 2/\mu$ and, conversely, $E[M_\mu^q] < \infty$ implies that $q \le 2/\mu$. Let $F(x)$ be an arbitrary smooth function that does not involve the intermittency parameter $0 \le \mu < 1$$^{b}$ and let $F^{(k)}(x)$ denote its $k$th derivative. Our results on general intermittency expansions for a single increment established in [14] are summarized in the following propositions. Consider the expectation of a general functional of the limit lognormal process
\[
v(\mu, f, F) \triangleq E\Bigl[F\Bigl(\int_0^1 e^{\mu f(s)}\, dM_\mu(s)\Bigr)\Bigr], \tag{2.6}
\]
where $f(s)$ is an arbitrary continuous function that does not involve $\mu$. This functional is path-dependent unless $f \equiv 0$. Its somewhat peculiar form is motivated by the fact that it is invariant under intermittency differentiation. The integration with respect to the limit measure $dM_\mu(s)$ is understood in the sense of the $\varepsilon \to 0$ limit so that $v(\mu, f, F) = \lim_{\varepsilon\to 0} v_\varepsilon(\mu, f, F)$ and $v_\varepsilon(\mu, f, F) \triangleq E\bigl[F\bigl(\int_0^1 e^{\mu f(s)}\, dM_{\mu,\varepsilon}(s)\bigr)\bigr]$ with $dM_{\mu,\varepsilon}(s)$ as in Eq. (2.1) with $L = 1$. Also, let $g(s_1, s_2)$ be defined by
\[
g(s_1, s_2) \triangleq -\log|s_1 - s_2|. \tag{2.7}
\]
Its significance is that $\lim_{\varepsilon\to 0} \mathrm{Cov}(\omega_{\mu,1,\varepsilon}(s_1), \omega_{\mu,1,\varepsilon}(s_2)) = \mu\, g(s_1, s_2)$ on $0 < |s_1 - s_2| < 1$. Finally, we will use $[0,1]^k$ to denote the $k$-dimensional unit cube $[0,1] \times \cdots \times [0,1]$. Then, we have the following general rule of intermittency differentiation for a single increment.

Theorem 2.1. The expectation $v(\mu, f, F)$ is invariant under intermittency differentiation and satisfies
\[
\frac{\partial}{\partial\mu} v(\mu, f, F) = \int_{[0,1]} v\bigl(\mu, f + g(\cdot, s), F^{(1)}\bigr)\, e^{\mu f(s)} f(s)\, ds
+ \frac{1}{2} \int_{[0,1]^2} v\bigl(\mu, f + g(\cdot, s_1) + g(\cdot, s_2), F^{(2)}\bigr)\, e^{\mu(f(s_1) + f(s_2) + g(s_1, s_2))}\, g(s_1, s_2)\, ds^{(2)}. \tag{2.8}
\]
The mathematical content of Eq. (2.8) is that differentiation with respect to the intermittency parameter $\mu$ is equivalent to a combination of two functional shifts induced by the $g$ function. The single-integral term corresponds to the exponential prefactor in Eq. (2.6), whereas the double-integral term corresponds to the intrinsic dependence of $M_\mu(t)$ on $\mu$. It is clear that both terms in Eq. (2.8) are of the same functional form as the original functional in Eq. (2.6) so that Theorem 2.1 allows us to compute derivatives of all orders.

$^{b}$ As shown in [3], nondegeneracy is guaranteed by $\mu < 2$; the restriction $\mu < 1$ ensures the finiteness of the second moment. Many of our results remain valid for $\mu < 2$ but we maintain $\mu < 1$ in this paper for simplicity.

Proposition 2.1. The expectation $E[F(M_\mu)]$ has the formal expansion
\[
E[F(M_\mu)] = F(1) + \sum_{n=1}^{\infty} \frac{\mu^n}{n!} \sum_{k=2}^{2n} F^{(k)}(1)\, H_{n,k}. \tag{2.9}
\]
The universal expansion coefficients $H_{n,k}$, $n = 1, 2, 3, \ldots$, are given by the binomial transform of the derivatives of the Selberg integral
\[
H_{n,k} = \frac{(-1)^k}{k!} \sum_{l=2}^{k} (-1)^l \binom{k}{l} \frac{\partial^n S_l}{\partial\mu^n}\Big|_{\mu=0}. \tag{2.10}
\]

Proposition 2.2. The expansion coefficients $H_{n,k}$ satisfy
\[
H_{n,k} = 0 \quad \forall\, k > 2n. \tag{2.11}
\]
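For $n = 1$ the coefficients in Eq. (2.10) can be computed exactly, which gives a quick illustration of Proposition 2.2. The sketch below relies on one auxiliary fact that is not spelled out in the text and should be read as our assumption: differentiating the integrand of Eq. (2.5) under the integral sign at $\mu = 0$ gives $\partial S_l/\partial\mu|_{\mu=0} = \binom{l}{2}\int_0^1\!\int_0^1 (-\log|s_1 - s_2|)\, ds_1 ds_2 = 3l(l-1)/4$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def d_selberg_at_zero(l):
    """First mu-derivative of S_l(mu) at mu = 0, assumed equal to 3 l (l - 1) / 4."""
    return Fraction(3 * l * (l - 1), 4)


def H1(k):
    """H_{1,k} from Eq. (2.10) with n = 1, in exact rational arithmetic."""
    total = sum((-1) ** l * comb(k, l) * d_selberg_at_zero(l) for l in range(2, k + 1))
    return Fraction((-1) ** k, factorial(k)) * total


first_order_coefficients = {k: H1(k) for k in range(2, 9)}
```

Under this assumption only $H_{1,2} = 3/4$ is nonzero, in agreement with Eq. (2.11), so that to first order in $\mu$ the expansion in Eq. (2.9) reads $E[F(M_\mu)] \approx F(1) + \tfrac{3}{4}\mu F^{(2)}(1)$.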
Corollary 2.1. There holds the following formal expansion in terms of the derivatives of moment-related expectations of the process
\[
E[F(M_\mu)] = F(1) + \sum_{n=1}^{\infty} \frac{\mu^n}{n!} \sum_{k=2}^{2n} \frac{F^{(k)}(1)}{k!} \frac{\partial^n}{\partial\mu^n}\Big|_{\mu=0} E[(M_\mu - 1)^k]. \tag{2.12}
\]
The representation in Eq. (2.12) reveals an essential feature of the structure of our expansions. We see that Eq. (2.12) is an exactly renormalized expansion in the centered moments of $M_\mu$. Indeed, it is easy to see that if the positive moments of all orders were finite and Taylor expandable in $\mu$, then Eq. (2.12) would be the same as the naive expansion$^{c}$ in the moments $E[F(M_\mu)] = F(1) + \sum_{k=1}^{\infty} F^{(k)}(1)\, E[(M_\mu - 1)^k]/k!$. Unlike the naive expansion, however, all the coefficients in Eq. (2.12) are finite because the derivatives are taken at $\mu = 0$, and $S_l(\mu)$ is finite so long as $l < 2/\mu$. Moreover, our renormalization is exact as the expansion in Eq. (2.12) is not ad hoc but is rather derived from the exact functional equation in Theorem 2.1.

$^{c}$ To see this, note that the $k$ sum in Eq. (2.12) can be extended to infinity due to $\partial^n/\partial\mu^n|_{\mu=0}\, E[(M_\mu - 1)^k] \equiv k!\, H_{n,k} = 0$ $\forall\, k > 2n$ and $n \ge 1$ by Proposition 2.2.

The main result of [15] was to explicitly calculate the intermittency expansion for the Mellin transform (complex moments) of $M_\mu$. The moments correspond to $F(x) = x^q$ for some given $q \in \mathbb{C}$. By Proposition 2.1, the intermittency expansion for the moments is
\[
E[M_\mu^q] = 1 + \sum_{n=1}^{\infty} \frac{\mu^n}{n!} f_n(q), \tag{2.13}
\]
\[
f_n(q) = \sum_{k=2}^{2n} (q)_k\, H_{n,k}, \qquad n = 1, 2, 3, \ldots. \tag{2.14}
\]
As usual, $(q)_k$ denotes the “falling factorial” $(q)_k \triangleq q(q-1)(q-2)\cdots(q-k+1)$, $\zeta(s)$$^{d}$ denotes the Riemann zeta function, $B_n(s)$ the $n$th Bernoulli polynomial, and $Y_n(x_1, \ldots, x_n)$ the complete exponential Bell polynomial of order $n$.

Theorem 2.2. Let $f_0(q) = 1$ and define the polynomials $b_r(q)$, $r = 0, 1, 2, \ldots$,
\[
b_r(q) = \frac{1}{2^{r+1}} \Bigl[ \zeta(r+1) \Bigl( \frac{B_{r+2}(q+1) + 2B_{r+2}(q) - 3B_{r+2}}{r+2} - q \Bigr) + (\zeta(r+1) - 1)\, \frac{B_{r+2}(q-1) - B_{r+2}(2q-1)}{r+2} \Bigr]. \tag{2.15}
\]
Then, $f_n(q)$ satisfies the recurrence
\[
f_{n+1}(q) = n! \sum_{r=0}^{n} \frac{f_{n-r}(q)}{(n-r)!}\, b_r(q) \tag{2.16}
\]
and is given explicitly in terms of $Y_n$ by
\[
f_n(q) = Y_n\bigl(b_0(q)\,0!,\; b_1(q)\,1!,\; \ldots,\; b_{n-1}(q)\,(n-1)!\bigr). \tag{2.17}
\]
The moments have the following exact formal representation
\[
E[M_\mu^q] = \exp\Bigl( \sum_{r=0}^{\infty} \frac{\mu^{r+1}}{r+1}\, b_r(q) \Bigr), \qquad q \in \mathbb{C}. \tag{2.18}
\]
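To make Eqs. (2.15) and (2.18) concrete, the following Python sketch evaluates $b_r(q)$ for integer $q$ and sums the series in the exponent of Eq. (2.18). It is an illustration under stated assumptions rather than part of the original argument: the helper names are ours, $\zeta(r+1)$ is approximated by a partial sum with an Euler–Maclaurin tail, and for $r = 0$ the term multiplying $\zeta(1)$ is dropped because its coefficient vanishes identically. The check uses small integer $q$ and small $\mu$, where the series is convergent.

```python
from fractions import Fraction
from math import comb, exp, lgamma


def bernoulli_numbers(n_max):
    """Exact Bernoulli numbers B_0, ..., B_{n_max} (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def bernoulli_poly(n, x, B):
    """Bernoulli polynomial B_n(x) = sum_k C(n, k) B_k x^(n - k), evaluated exactly."""
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))


def zeta_minus_one(s, cutoff=200):
    """zeta(s) - 1 for integer s >= 2: partial sum plus an Euler-Maclaurin tail."""
    head = sum(float(n) ** -s for n in range(2, cutoff))
    N = float(cutoff)
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12.0
    return head + tail


def b(r, q, B):
    """b_r(q) of Eq. (2.15) for rational q, returned as a float."""
    q = Fraction(q)
    n = r + 2
    zeta_coeff = (bernoulli_poly(n, q + 1, B) + 2 * bernoulli_poly(n, q, B)
                  - 3 * B[n]) / n - q
    other = (bernoulli_poly(n, q - 1, B) - bernoulli_poly(n, 2 * q - 1, B)) / n
    if r == 0:
        # zeta(1) multiplies zeta_coeff + other, which is identically zero.
        return float(-other) / 2.0
    zm1 = zeta_minus_one(r + 1)
    return ((1.0 + zm1) * float(zeta_coeff) + zm1 * float(other)) / 2.0 ** (r + 1)


def log_moment_series(q, mu, terms=40):
    """Partial sum of the exponent in Eq. (2.18)."""
    B = bernoulli_numbers(terms + 2)
    return sum(mu ** (r + 1) / (r + 1) * b(r, q, B) for r in range(terms))


def selberg_moment(l, mu):
    """Finite product of Eq. (2.5), used here only as a reference value."""
    return exp(sum(lgamma(1 - (k + 1) * mu / 2) + 2 * lgamma(1 - k * mu / 2)
                   - lgamma(1 - mu / 2) - lgamma(2 - (l + k - 1) * mu / 2)
                   for k in range(l)))


mu = 0.1  # illustrative value
series_q2 = exp(log_moment_series(2, mu))
series_q3 = exp(log_moment_series(3, mu))
```

For $q = 2$ the coefficients reduce to $b_r(2) = 1 + 2^{-(r+1)}$, so the exponent sums to $-\log(1-\mu) - \log(1-\mu/2)$ and `series_q2` reproduces $2/((1-\mu)(2-\mu))$; `series_q3` can be compared in the same way with `selberg_moment(3, mu)`.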
The series $\sum_{r=0}^{\infty} \mu^{r+1} b_r(q)/(r+1)$ is divergent in general with the exception of a finite range of positive and negative integral $q$, confer Theorem 2.4 below. This means that the Mellin transform of $M_\mu$ is not analytic in the intermittency parameter, and Eq. (2.18) ought to be interpreted as its asymptotic expansion.

Consider next the general transform of the form $E[G(s + \log M_\mu)]$ for some fixed constant $s$. The corresponding intermittency expansion is
\[
E[G(s + \log M_\mu)] = \sum_{n=0}^{\infty} G_n(s)\, \frac{\mu^n}{n!}. \tag{2.19}
\]
The following theorem and its corollary established in [16] characterize this expansion completely.

$^{d}$ We will write $\zeta(1)$ to denote Euler’s constant. It never enters any of the final formulas as the coefficient it multiplies is identically zero throughout this paper.
Theorem 2.3. The coefficients $G_n(s)$ of the general intermittency expansion satisfy the recurrence
\[
G_{n+1}(s) = \sum_{r=0}^{n} \frac{n!}{(n-r)!}\, b_r\Bigl(\frac{d}{ds}\Bigr) G_{n-r}(s), \qquad G_0(s) = G(s). \tag{2.20}
\]

Corollary 2.2.
\[
G_n(s) = Y_n\Bigl( 0!\, b_0\Bigl(\frac{d}{ds}\Bigr), \ldots, (n-1)!\, b_{n-1}\Bigl(\frac{d}{ds}\Bigr) \Bigr) G(s), \tag{2.21}
\]
\[
E[G(s + \log M_\mu)] = \exp\Bigl( \sum_{r=0}^{\infty} \frac{\mu^{r+1}}{r+1}\, b_r\Bigl(\frac{d}{ds}\Bigr) \Bigr) G(s). \tag{2.22}
\]

The significance of Theorem 2.3 is that it generalizes the recurrence in Theorem 2.2 and shows that the polynomials $b_r(q)$ play a fundamental role not only for the Mellin transform but in fact for the general transform. The analogy of the general and Mellin transforms extends much further in that the solution for the general transform in Corollary 2.2 is obtained by replacing $q$ with $d/ds$ in the solution for the Mellin transform in Eqs. (2.17) and (2.18). Thus, Eq. (2.22) is an exact operator solution for the general intermittency expansion in Eq. (2.19). It is obvious that Eq. (2.22) reproduces the Mellin transform expansion in Eq. (2.18) by taking $G(s) = \exp(qs)$ for some fixed $q \in \mathbb{C}$ and using
\[
b_r\Bigl(\frac{d}{ds}\Bigr) e^{qs} = b_r(q)\, e^{qs}. \tag{2.23}
\]

The solution for the general intermittency expansion in Corollary 2.2 is formal and needs to be regularized, i.e. we need to sum the divergent series in Eq. (2.22). The regularized solution must correspond to a valid positive probability distribution having the correct integral moments given by the Selberg integral in Eq. (2.5). In addition, it must reproduce Eq. (2.22) as its asymptotic expansion in the small intermittency limit. A solution with these properties is given below. The question of uniqueness of such a solution is open, hence we will write $\widetilde{M}_\mu$ to distinguish our construction from the limit lognormal distribution as defined in the beginning of this section.

We begin by regularizing the expansion for the Mellin transform in Eq. (2.18). In [15] we proposed to sum it by means of
\[
\sum_{r=0}^{\infty} \frac{\mu^{r+1}}{r+1}\, b_r(q) \sim \int_0^{\infty} \Bigl\{ \frac{1}{e^x - 1} \Bigl[ \frac{e^{\frac{\mu x}{2}(q+1)} + 2e^{\frac{\mu x}{2} q} - 3 + e^{\frac{\mu x}{2}(q-1)} - e^{\frac{\mu x}{2}(2q-1)}}{e^{\frac{\mu x}{2}} - 1} - \bigl(1 + q + q e^{\frac{\mu x}{2}}\bigr) \Bigr]
+ e^{-x} \Bigl[ \frac{e^{\frac{\mu x}{2}(2q-1)} - e^{\frac{\mu x}{2}(q-1)}}{e^{\frac{\mu x}{2}} - 1} - q \Bigr] \Bigr\} \frac{dx}{x} \triangleq D_\mu(q). \tag{2.24}
\]
The meaning of “$\sim$” is that the series $\sum_{r=0}^{\infty} \mu^{r+1} b_r(q)/(r+1)$ is the asymptotic expansion in $\mu$ of the integral on the right-hand side of Eq. (2.24) that we denote by $D_\mu(q)$. The integral is convergent for $\Re(q) < 2/\mu$. Hence, we define the Mellin transform by
\[
E[\widetilde{M}_\mu^q] \triangleq \exp(D_\mu(q)), \qquad \Re(q) < \frac{2}{\mu}. \tag{2.25}
\]
The motivation for this particular way of summing the divergent series in Eq. (2.18) is explained in the following theorems that were established in [15].

Theorem 2.4. For real integral values of $q$ such that $-2/\mu + 1/2 < q < 2/\mu$ the series in Eq. (2.18) is convergent and satisfies
\[
\sum_{r=0}^{\infty} \frac{\mu^{r+1}}{r+1}\, b_r(q) = D_\mu(q). \tag{2.26}
\]
In addition, for positive integral $q = l$, $l = 1, 2, 3, \ldots$, such that $l < 2/\mu$ we have
\[
E[\widetilde{M}_\mu^l] = \prod_{k=0}^{l-1} \frac{\Gamma(1 - (k+1)\mu/2)\,\Gamma^2(1 - k\mu/2)}{\Gamma(1 - \mu/2)\,\Gamma(2 - (l+k-1)\mu/2)}. \tag{2.27}
\]
Theorem 2.5. Given $0 < \mu < 1$, the function $q \to \exp(D_\mu(iq))$, $q \in \mathbb{R}$, is the characteristic function of an infinitely divisible distribution.

In summary, Theorem 2.5 says that for every $0 < \mu < 1$, $\exp(D_\mu(iq))$ is the characteristic function of some random variable that we call $\log \widetilde{M}_\mu$. Hence $\exp(D_\mu(q))$ is the Mellin transform of the random variable $\widetilde{M}_\mu$, whose positive integral moments coincide with the known moments of the limit lognormal distribution by Theorem 2.4 for every $\mu$, confer Eq. (2.5). Finally, the small intermittency expansion of this Mellin transform coincides with the intermittency expansion of the limit lognormal distribution by construction, $E[\widetilde{M}_\mu^q] \sim \exp\bigl(\sum_{r=0}^{\infty} \mu^{r+1} b_r(q)/(r+1)\bigr)$ as $\mu \to +0$. We conjecture that the two are the same, $M_\mu = \widetilde{M}_\mu$. The primary difficulty in verifying this statement stems from the indeterminacy of the corresponding moment problems.

Theorem 2.6. The moment problem for $\log \widetilde{M}_\mu$ is determinate and those for $\widetilde{M}_\mu$ and $\widetilde{M}_\mu^{-1}$ are indeterminate. The left tail of $\widetilde{M}_\mu$ is lognormal, the right tail is a power law and its asymptotic behavior is $P[\widetilde{M}_\mu > u] \sim \mathrm{const}\; u^{-2/\mu}$ as $u \to +\infty$.

The negative integral moments of $\widetilde{M}_\mu$ can also be computed exactly and are given by a formula that is similar to Selberg’s finite product in Eq. (2.27):
\[
E[\widetilde{M}_\mu^{-l}] = \prod_{j=0}^{l-1} \frac{\Gamma(2 + (l+2+j)\mu/2)\,\Gamma(1 - \mu/2)}{\Gamma^2(1 + (j+1)\mu/2)\,\Gamma(1 + j\mu/2)}. \tag{2.28}
\]
Unlike the positive moments, the negative moments do not become infinite. In fact, the finiteness of the negative moments plays an essential role in the thermodynamic formalism as developed in [11] and has long been conjectured. It is not too difficult to see from Eq. (2.28) that $\log E[\widetilde{M}_\mu^{-l}]$ grows as $l^2$ as $l \to +\infty$. This is the same rate of growth as that of the moments of the lognormal distribution.

The regularization of the general transform was done in [16] in an operator form.

Theorem 2.7. Let $\widetilde{M}_\mu$ be the probability distribution defined by Eq. (2.25). Then,
\[
E[G(s + \log \widetilde{M}_\mu)] = \exp\Bigl( D_\mu\Bigl(\frac{d}{ds}\Bigr) \Bigr) G(s). \tag{2.29}
\]
The small intermittency asymptotic expansion of $E[G(s + \log \widetilde{M}_\mu)]$ coincides with the intermittency expansion in Eq. (2.22)
\[
E[G(s + \log \widetilde{M}_\mu)] \sim \sum_{n=0}^{\infty} G_n(s)\, \frac{\mu^n}{n!} \quad \text{as } \mu \to +0. \tag{2.30}
\]
The action of the operator $D_\mu(d/ds)$ is given by
\[
D_\mu\Bigl(\frac{d}{ds}\Bigr) f(s) = \int_0^{\infty} \frac{dx}{x} \Bigl\{ \frac{1}{e^x - 1} \Bigl[ \frac{e^{\frac{\mu x}{2}} f(s + \mu x/2) + 2f(s + \mu x/2) - 3f(s) + e^{-\frac{\mu x}{2}} f(s + \mu x/2) - e^{-\frac{\mu x}{2}} f(s + 2\mu x/2)}{e^{\frac{\mu x}{2}} - 1} - \Bigl( f(s) + \frac{df}{ds} + e^{\frac{\mu x}{2}} \frac{df}{ds} \Bigr) \Bigr]
+ e^{-x} \Bigl[ \frac{e^{-\frac{\mu x}{2}} f(s + 2\mu x/2) - e^{-\frac{\mu x}{2}} f(s + \mu x/2)}{e^{\frac{\mu x}{2}} - 1} - \frac{df}{ds} \Bigr] \Bigr\}. \tag{2.31}
\]

We conclude our review of the single-increment distribution by giving an alternative representation of the Mellin transform. It is clear that by virtue of Eq. (2.5) the Mellin transform of the limit lognormal distribution is an analytic continuation of the Selberg integral to the complex domain. We know from Theorem 2.4 that Eq. (2.25) gives an explicit continuation to the half-plane $\Re(q) < 2/\mu$. The following formula for the Mellin transform that we established in [15] extends Selberg’s finite product to an infinite product of gamma factors.

Theorem 2.8. Let $D_\mu(q)$ be as in Eq. (2.24) and $\Re(q) < 2/\mu$. Then,
\[
\exp(D_\mu(q)) = \Bigl(\frac{2}{\mu}\Bigr)^{q} \frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)}\, \Gamma(1 - \mu q/2)\, \Gamma^{-q}(1 - \mu/2)
\times \prod_{n=1}^{\infty} \Bigl(\frac{2n}{\mu}\Bigr)^{2q} \frac{\Gamma^3(1 - q + 2n/\mu)}{\Gamma^3(1 + 2n/\mu)}\, \frac{\Gamma(2 - q + 2n/\mu)}{\Gamma(2 - 2q + 2n/\mu)}. \tag{2.32}
\]
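The agreement between the infinite product in Eq. (2.32) and Selberg's finite product can be illustrated numerically. The sketch below truncates the product after a finite number of factors; the truncation level, the function names, and the test values of $q$ and $\mu$ are our own assumptions, and the result is only accurate up to the truncation error, which decays like the reciprocal of the number of factors.

```python
import math


def mellin_product(q, mu, n_terms=5000):
    """Right-hand side of Eq. (2.32) with the infinite product truncated at n_terms."""
    tau = 2.0 / mu
    lg = math.lgamma
    log_value = (q * math.log(tau)
                 + lg(2 + tau - 2 * q) - lg(2 + tau - q)
                 + lg(1 - q / tau) - q * lg(1 - 1 / tau))
    for n in range(1, n_terms + 1):
        z = n * tau
        log_value += (2 * q * math.log(z)
                      + 3 * (lg(1 - q + z) - lg(1 + z))
                      + lg(2 - q + z) - lg(2 - 2 * q + z))
    return math.exp(log_value)


def selberg_moment(l, mu):
    """Finite product of Eq. (2.27), used as the reference value."""
    return math.exp(sum(math.lgamma(1 - (k + 1) * mu / 2) + 2 * math.lgamma(1 - k * mu / 2)
                        - math.lgamma(1 - mu / 2) - math.lgamma(2 - (l + k - 1) * mu / 2)
                        for k in range(l)))


mu = 0.3  # illustrative value, so that q = 1, 2, 3 all satisfy q < 2/mu
comparison = {q: (mellin_product(q, mu), selberg_moment(q, mu)) for q in (1, 2, 3)}
```

The sketch assumes real $q$ small enough that every gamma argument is positive, since `math.lgamma` returns the logarithm of the absolute value and no sign tracking is done here.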
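A direct proof that Eq. (2.32) coincides with Eq. (2.27) in the special case of integral $q$ is given in Sec. 5.

Corollary 2.3. The Mellin transform satisfies the following functional equations
\[
E[\widetilde{M}_\mu^{q + \frac{2}{\mu}}] = (2\pi)^{\frac{2}{\mu} - 1}\, \Gamma^{-\frac{2}{\mu}}\Bigl(1 - \frac{\mu}{2}\Bigr)\, \frac{2\,\Gamma(-q)}{\mu}\, \frac{\Gamma^2(1 - q)\, \Gamma(2 - q + 2/\mu)}{\Gamma(2 - 2q)\, \Gamma(2 - 2q + 2/\mu)}\, E[\widetilde{M}_\mu^{q}], \tag{2.33}
\]
\[
E[\widetilde{M}_\mu^{q-1}] = \frac{\Gamma(1 - \mu/2)\, \Gamma(2 - (2q-2)\mu/2)\, \Gamma(2 - (2q-3)\mu/2)}{\Gamma(1 - \mu q/2)\, \Gamma^2(1 - (q-1)\mu/2)\, \Gamma(2 - (q-2)\mu/2)}\, E[\widetilde{M}_\mu^{q}]. \tag{2.34}
\]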
Eq. (2.33) holds for $\Re(q) < 0$ and Eq. (2.34) for $\Re(q) < 2/\mu$. These functional equations are interesting for at least two reasons. First, they indicate that the underlying mathematical structure is hypergeometric for they are equivalent to integral equations of the Mellin convolution type involving the Appell $F_3$ function in the kernel for the probability density function of $\widetilde{M}_\mu$. The other reason is that Eq. (2.34) follows directly for integral $q$ from Selberg’s formula in Eq. (2.5). If we postulate that it must hold for complex $q$, then it can be used to derive the formula for the Mellin transform in Eq. (2.32). This is the route that was taken in [8], where the authors established a formula that is equivalent to our Eq. (2.32).

In view of the fact that the moment problem of $\log \widetilde{M}_\mu$ is determinate, the cumulants $D_\mu^{(p)}(0) \equiv \partial^p/\partial q^p|_{q=0}\, D_\mu(q)$ capture the distribution of $\log \widetilde{M}_\mu$ uniquely. The following result was obtained in [16].

Corollary 2.4. The cumulants of $\log \widetilde{M}_\mu$ are
\[
D_\mu^{(1)}(0) = \log\frac{2}{\mu} - \log\Gamma\Bigl(1 - \frac{\mu}{2}\Bigr) - \psi\Bigl(2 + \frac{2}{\mu}\Bigr) - \frac{\mu}{2}\,\psi\Bigl(1 + \frac{\mu}{2}\Bigr) - \frac{\mu}{2}\,\psi(1)
+ \mu \int_0^{\infty} \Bigl[ \frac{e^{-t}}{2t} + \frac{1}{e^{t} - 1} \Bigl( \frac{1}{e^{\mu t/2} - 1} - \frac{2}{\mu t} \Bigr) \Bigr] dt, \tag{2.35}
\]
\[
D_\mu^{(p)}(0) = (p-1)! \Bigl[ (2^p - 1)\,\zeta\Bigl(p,\, 2 + \frac{2}{\mu}\Bigr) + \Bigl(\frac{\mu}{2}\Bigr)^{p} \Bigl( (2^p - 1)\,\zeta\Bigl(p,\, 1 + \frac{\mu}{2}\Bigr) + \zeta(p) \Bigr) + 4(1 - 2^{p-2}) \sum_{k=1}^{\infty} \zeta\Bigl(p,\, 1 + \frac{2k}{\mu}\Bigr) \Bigr], \qquad p = 2, 3, 4, \ldots. \tag{2.36}
\]
Hence, the Mellin transform has nontrivial connections with both the gamma and Hurwitz zeta functions. We conclude our review by emphasizing that the uniqueness of our solution for the distribution of the single increment of the limit lognormal distribution remains an open problem.
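The functional equation in Eq. (2.34) and the negative moments in Eq. (2.28) can be cross-checked against each other: starting from $E[\widetilde{M}_\mu^{0}] = 1$ and applying Eq. (2.34) repeatedly at $q = 0, -1, -2, \ldots$ should reproduce Eq. (2.28). The Python sketch below carries this out; the function names and the value of $\mu$ are hypothetical choices made for illustration.

```python
import math


def step_down(q, mu):
    """Ratio E[M^(q-1)] / E[M^q] read off from Eq. (2.34)."""
    g = math.gamma
    numerator = g(1 - mu / 2) * g(2 - (2 * q - 2) * mu / 2) * g(2 - (2 * q - 3) * mu / 2)
    denominator = (g(1 - mu * q / 2) * g(1 - (q - 1) * mu / 2) ** 2
                   * g(2 - (q - 2) * mu / 2))
    return numerator / denominator


def negative_moment_by_recurrence(l, mu):
    """E[M^(-l)] obtained by iterating Eq. (2.34) downward from E[M^0] = 1."""
    value, q = 1.0, 0
    for _ in range(l):
        value *= step_down(q, mu)
        q -= 1
    return value


def negative_moment_closed_form(l, mu):
    """E[M^(-l)] from the finite product in Eq. (2.28)."""
    g = math.gamma
    value = 1.0
    for j in range(l):
        value *= (g(2 + (l + 2 + j) * mu / 2) * g(1 - mu / 2)
                  / (g(1 + (j + 1) * mu / 2) ** 2 * g(1 + j * mu / 2)))
    return value


mu = 0.5  # illustrative value
pairs = [(negative_moment_by_recurrence(l, mu), negative_moment_closed_form(l, mu))
         for l in range(1, 6)]
```

The same `step_down` ratio applied at $q = 2$ multiplies $E[M_\mu^2] = 2/((1-\mu)(2-\mu))$ down to $E[M_\mu] = 1$, which is the integral-$q$ consistency with Selberg's formula mentioned above.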
3. Intermittency Differentiation for Multiple Increments

In this section, we will formulate the rule of intermittency differentiation for multiple increments of the limit lognormal process. We will then state basic properties of the resulting intermittency expansions with the ultimate goal of reconstructing the dependence structure of the limit process from the dependence of its joint integral moments on the intermittency parameter. The proofs of all results in this section are deferred to Sec. 5 for the convenience of the reader.

Throughout this section, we continue to assume that the decorrelation length is set at $L = 1$. We will abbreviate $M_{\mu,1,\varepsilon}(t)$ as $M_{\mu,\varepsilon}(t)$ and $M_{\mu,1}(t)$ as $M_\mu(t)$ when no confusion can arise. We will write $I_j$, $j = 1 \cdots N$, for a sequence of possibly overlapping subintervals of $[0,1]$. In what follows, we let $F_j(x)$, $j = 1 \cdots N$, denote smooth functions that equal their Taylor series on $x \in [0,1]$ and do not involve the intermittency parameter $0 \le \mu < 1$. Consider a general functional of the limit lognormal process
\[
v(\mu, \vec{f}, \vec{F}, \vec{I}) \triangleq E\Bigl[ \prod_{j=1}^{N} F_j\Bigl( \int_{I_j} e^{\mu f_j(s)}\, dM_\mu(s) \Bigr) \Bigr], \tag{3.1}
\]
where $f_j(s)$, $j = 1 \cdots N$, are arbitrary continuous functions that do not involve $\mu$. The integration with respect to the limit measure $dM_\mu(s)$ is understood in the sense of the $\varepsilon \to 0$ limit so that $v(\mu, \vec{f}, \vec{F}, \vec{I}) = \lim_{\varepsilon\to 0} v_\varepsilon(\mu, \vec{f}, \vec{F}, \vec{I})$ and
\[
v_\varepsilon(\mu, \vec{f}, \vec{F}, \vec{I}) \triangleq E\Bigl[ \prod_{j=1}^{N} F_j\Bigl( \int_{I_j} e^{\mu f_j(s)}\, dM_{\mu,\varepsilon}(s) \Bigr) \Bigr], \tag{3.2}
\]
with $dM_{\mu,\varepsilon}(s)$ as in Eq. (2.1) with $L = 1$. Define the function $g(s_1, s_2)$ by
\[
g(s_1, s_2) \triangleq -\log|s_1 - s_2|. \tag{3.3}
\]
We will use $|I|$ to denote the length of $I$, $I^k$ to denote the $k$-dimensional product $I \times \cdots \times I$, and $F^{(k)}(x)$ to denote the $k$th derivative of $F(x)$. Finally, we will write $\vec{f} + g(\cdot, s)$ to denote the vector function with components $f_j(u) + g(u, s)$, $j = 1 \cdots N$. Then, we have the following general rule of intermittency differentiation for multiple increments.

Theorem 3.1.
\[
\frac{\partial}{\partial\mu} v(\mu, \vec{f}, \vec{F}, \vec{I}) = \sum_{j=1}^{N} \int_{I_j} f_j(s)\, e^{\mu f_j(s)}\, v\bigl(\mu, \vec{f} + g(\cdot, s), F_1 \cdots F_j^{(1)} \cdots F_N, \vec{I}\bigr)\, ds
\]
\[
+ \frac{1}{2} \sum_{j=1}^{N} \int_{I_j^2} \exp\bigl(\mu(f_j(s_1) + f_j(s_2) + g(s_1, s_2))\bigr)\, g(s_1, s_2)\, v\bigl(\mu, \vec{f} + g(\cdot, s_1) + g(\cdot, s_2), F_1 \cdots F_j^{(2)} \cdots F_N, \vec{I}\bigr)\, ds_1 ds_2
\]
\[
+ \sum_{j<k}^{N} \int_{I_j \times I_k} \exp\bigl(\mu(f_j(s_1) + f_k(s_2) + g(s_1, s_2))\bigr)\, g(s_1, s_2)\, v\bigl(\mu, \vec{f} + g(\cdot, s_1) + g(\cdot, s_2), F_1 \cdots F_j^{(1)} \cdots F_k^{(1)} \cdots F_N, \vec{I}\bigr)\, ds_1 ds_2. \tag{3.4}
\]
The structure of the intermittency derivative thus amounts to certain functional shifts that are induced by the $g$ function. In the special case of $N = 1$ and $I_1 = [0,1]$ the last term drops out and we recover the rule of intermittency differentiation for a single increment, confer Theorem 2.1. It is worth emphasizing that Theorem 3.1 indicates that the functional in Eq. (3.1) is invariant under intermittency differentiation so that the rule can be iterated ad infinitum. We are primarily interested in the joint distribution of the increments $\int_{I_1} dM_\mu, \ldots, \int_{I_N} dM_\mu$ of the limit process over the intervals $I_1, \ldots, I_N$, which corresponds to the case of $f_j = 0$ $\forall\, j$. This joint distribution is determined by the following formal intermittency expansion.

Proposition 3.1. There exist universal expansion coefficients $H_{n;k_1\cdots k_N}$ such that
\[
E\Bigl[ \prod_{j=1}^{N} F_j\Bigl( \int_{I_j} dM_\mu(s) \Bigr) \Bigr] = \prod_{j=1}^{N} F_j(|I_j|) + \sum_{n=1}^{\infty} \frac{\mu^n}{n!} \sum_{2 \le \sum_{j=1}^{N} k_j \le 2n} H_{n;k_1\cdots k_N} \prod_{j=1}^{N} F_j^{(k_j)}(|I_j|). \tag{3.5}
\]
The coefficients $H_{n;k_1\cdots k_N}$ are universal in the sense of being the same for all $F_1 \cdots F_N$. They can be determined from the joint positive integral moments of the increments $\int_{I_1} dM_\mu, \ldots, \int_{I_N} dM_\mu$. In fact, consider the special case of $F_j(x) = x^{q_j}$ for arbitrary complex numbers $q_j$, $j = 1 \cdots N$, so that $F^{(k)}(x) = (q)_k\, x^{q-k}$ and $(q)_k$ denotes the “falling factorial” $(q)_k \triangleq q(q-1)(q-2)\cdots(q-k+1)$. Then, the intermittency expansion for the joint Mellin transform of $\int_{I_1} dM_\mu, \ldots, \int_{I_N} dM_\mu$ is
\[
E\Bigl[ \prod_{j=1}^{N} \Bigl( \int_{I_j} dM_\mu \Bigr)^{q_j} \Bigr] = \prod_{j=1}^{N} |I_j|^{q_j} + \sum_{n=1}^{\infty} \frac{\mu^n}{n!} \sum_{2 \le \sum_{j=1}^{N} k_j \le 2n} H_{n;k_1\cdots k_N} \prod_{j=1}^{N} (q_j)_{k_j}\, |I_j|^{q_j - k_j}. \tag{3.6}
\]
This object is of particular interest as it determines the joint distribution uniquely. If we restrict our attention to the case of positive integral $q_j \in \mathbb{N}$, we obtain an explicit formula for the expansion coefficients. Denote
\[
S_{q_1\cdots q_N}(\mu, \vec{I}) \triangleq E\Bigl[ \prod_{j=1}^{N} \Bigl( \int_{I_j} dM_\mu \Bigr)^{q_j} \Bigr]. \tag{3.7}
\]
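Corollary 3.1. Given $2 \le k_1 + \cdots + k_N \le 2n$, the expansion coefficients satisfy
\[
H_{n;k_1\cdots k_N} = \frac{(-1)^{k_1 + \cdots + k_N}}{k_1! \cdots k_N!} \sum_{\substack{q_j \le k_j \\ j = 1\cdots N}} (-1)^{q_1 + \cdots + q_N} \prod_{j=1}^{N} \Bigl[ |I_j|^{k_j - q_j} \binom{k_j}{q_j} \Bigr] \frac{\partial^n}{\partial\mu^n}\Big|_{\mu=0} S_{q_1\cdots q_N}(\mu, \vec{I}). \tag{3.8}
\]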
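Finally, the joint integral moments $S_{q_1\cdots q_N}(\mu, \vec{I})$ that enter Eq. (3.8) are given by a generalized Selberg integral.

Proposition 3.2. Given non-negative integers $k_1 \cdots k_N$, let $\vec{s}_j$, $j = 1 \cdots N$, denote the vector $\vec{s}_j \equiv (s_{j1}, \ldots, s_{jk_j})$ such that $s_{jl} \in I_j$ $\forall\, l$. Then,
\[
S_{k_1\cdots k_N}(\mu, \vec{I}) = \int \cdots \int_{I_1^{k_1} \times \cdots \times I_N^{k_N}} \prod_{i=1}^{N} \prod_{p<q}^{k_i} \bigl[ |s_{ip} - s_{iq}|^{-\mu} \bigr] \prod_{i<j}^{N} \prod_{\substack{p \le k_i \\ q \le k_j}} \bigl[ |s_{ip} - s_{jq}|^{-\mu} \bigr]\, d\vec{s}_1 \cdots d\vec{s}_N. \tag{3.9}
\]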
The integrand in Eq. (3.9) is the discriminant, i.e. the product $\prod_{i<j} |x_i - x_j|^2$, on the $k_1 + \cdots + k_N$ variables $s_{j1}, \ldots, s_{jk_j}$ for $j = 1 \cdots N$ that is raised to the power $-\mu/2$. Clearly, the integral coincides with the Selberg integral, confer Eq. (2.5) above, when $N = 1$ and $I_1 = [0,1]$.

Proposition 3.3. The expansion coefficients as defined by Eq. (3.8) for all indices $k_1, \ldots, k_N$ satisfy
\[
H_{n;k_1\cdots k_N} = 0 \quad \text{if} \quad \sum_{j=1}^{N} k_j > 2n. \tag{3.10}
\]
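For two adjacent increments and $n = 1$, Eq. (3.8) can be evaluated in closed form, which illustrates Proposition 3.3 in the simplest joint setting. The sketch below takes $N = 2$, $I_1 = [0, 1/2]$, $I_2 = [1/2, 1]$ and rests on assumptions that are ours rather than the text's: differentiating Eq. (3.9) under the integral sign at $\mu = 0$ reduces the first derivative of the joint moments to pairwise integrals of $g(s_1, s_2) = -\log|s_1 - s_2|$, with $\int_{I_j^2} g = a^2(3/2 - \log a)$ for an interval of length $a$ and $\int_{[0,1]^2} g = 3/2$. All names are hypothetical.

```python
import math
from math import comb, factorial

a = 0.5                                   # common length of I_1 and I_2
g_same = a * a * (1.5 - math.log(a))      # integral of g over I_j x I_j
g_cross = (1.5 - 2.0 * g_same) / 2.0      # integral of g over I_1 x I_2


def d_joint_moment_at_zero(q1, q2):
    """d/dmu of S_{q1 q2}(mu, I) at mu = 0: a sum over pairs of integration variables."""
    total = 0.0
    if q1 >= 2:
        total += comb(q1, 2) * g_same * a ** (q1 - 2) * a ** q2
    if q2 >= 2:
        total += comb(q2, 2) * g_same * a ** q1 * a ** (q2 - 2)
    if q1 >= 1 and q2 >= 1:
        total += q1 * q2 * g_cross * a ** (q1 - 1) * a ** (q2 - 1)
    return total


def H1(k1, k2):
    """H_{1;k1 k2} from Eq. (3.8) with n = 1 and N = 2."""
    total = 0.0
    for q1 in range(k1 + 1):
        for q2 in range(k2 + 1):
            total += ((-1) ** (q1 + q2)
                      * a ** (k1 - q1) * comb(k1, q1)
                      * a ** (k2 - q2) * comb(k2, q2)
                      * d_joint_moment_at_zero(q1, q2))
    return (-1) ** (k1 + k2) * total / (factorial(k1) * factorial(k2))


table = {(k1, k2): H1(k1, k2) for k1 in range(5) for k2 in range(5) if k1 + k2 >= 2}
```

Under these assumptions the only nonzero entries are $H_{1;2,0} = H_{1;0,2} = \tfrac{1}{2}\int_{I_j^2} g$ and $H_{1;1,1} = \int_{I_1 \times I_2} g$, so the first-order dependence between the two increments is carried entirely by the cross term.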
It is important to emphasize that the intermittency expansion in Eq. (3.5) is an exactly renormalized expansion in the joint centered moments. Indeed, the expansion can also be written in the form
\[
E\Bigl[ \prod_{j=1}^{N} F_j\Bigl( \int_{I_j} dM_\mu(s) \Bigr) \Bigr] = \prod_{j=1}^{N} F_j(|I_j|) + \sum_{n=1}^{\infty} \frac{\mu^n}{n!} \sum_{2 \le \sum_{j=1}^{N} k_j \le 2n} \prod_{j=1}^{N} \frac{F_j^{(k_j)}(|I_j|)}{k_j!}
\times \frac{\partial^n}{\partial\mu^n}\Big|_{\mu=0} E\Bigl[ \prod_{j=1}^{N} \Bigl( \int_{I_j} dM_\mu(s) - |I_j| \Bigr)^{k_j} \Bigr] \tag{3.11}
\]
due to the identity
\[
\frac{\partial^n}{\partial\mu^n}\Big|_{\mu=0} E\Bigl[ \prod_{j=1}^{N} \Bigl( \int_{I_j} dM_\mu(s) - |I_j| \Bigr)^{k_j} \Bigr] = k_1! \cdots k_N!\; H_{n;k_1\cdots k_N} \tag{3.12}
\]
that holds for all $k_1 \cdots k_N$. As $E[\int_I dM_\mu(s)] = |I|$, this identity also shows that the expansion coefficients $H_{n;k_1\cdots k_N}$ are essentially intermittency derivatives of the joint centered moments. Now, if all the positive joint moments were finite, then our expansion would coincide with the naive expansion
\[
E\Bigl[ \prod_{j=1}^{N} F_j\Bigl( \int_{I_j} dM_\mu(s) \Bigr) \Bigr] = \prod_{j=1}^{N} F_j(|I_j|) + \sum_{k_1=1}^{\infty} \cdots \sum_{k_N=1}^{\infty} \prod_{j=1}^{N} \frac{F_j^{(k_j)}(|I_j|)}{k_j!}
\times E\Bigl[ \prod_{j=1}^{N} \Bigl( \int_{I_j} dM_\mu(s) - |I_j| \Bigr)^{k_j} \Bigr].
\]
In fact, the $k$ sum in Eq. (3.11) can be extended to infinity as the derivative in Eq. (3.12) vanishes for $\sum_{j=1}^{N} k_j > 2n$ and $n \ge 1$ by Proposition 3.3. Unlike the naive expansion, however, all the coefficients in Eq. (3.11) are finite because the derivatives are taken at $\mu = 0$, and $S_{k_1\cdots k_N}(\mu, \vec{I})$ is finite so long as $\mu$ is sufficiently small. Moreover, our renormalization is exact as the expansion in Eq. (3.5) is derived from the exact functional equation in Theorem 3.1. The integral in Proposition 3.2 is unknown for $N \ge 2$. In the next section we will examine some of its basic properties in relation to the Selberg integral in the special case of $N = 2$.

4. Properties of the Joint Integral Moments

In this section, we will consider the integral in Eq. (3.9) in the case of two adjacent increments: $N = 2$, $I_1 = [a, b]$, and $I_2 = [b, c]$, $a < b < c$. From now on, it will be understood that all products involving two indices are over the set $\{i < j\}$, unless stated otherwise. Then, the joint moments take the form
\[
S_{k_1, k_2}(\mu, [a,b], [b,c]) = \int_{[a,b]^{k_1}} \int_{[b,c]^{k_2}} \prod_{1}^{k_1 + k_2} |x_i - x_j|^{-\mu}\, dx. \tag{4.1}
\]
Denote
\[
\alpha \triangleq b - a, \qquad \beta \triangleq c - b, \qquad m \triangleq k_1 + k_2; \qquad \lambda \triangleq -\frac{\mu}{2}. \tag{4.2}
\]
Changing variables
\[
x_i' = 1 - \frac{x_i - a}{\alpha}, \quad i = 1 \cdots k_1, \qquad x_j' = \frac{x_j - b}{\beta}, \quad j = k_1 + 1 \cdots m, \tag{4.3}
\]
we obtain the equality
\[
S_{k_1, k_2}(\mu, [a,b], [b,c]) = \alpha^{\lambda k_1(k_1 - 1) + k_1}\, \beta^{\lambda k_2(k_2 - 1) + k_2}\, I_{k_1, k_2}(-\mu/2, \alpha, \beta), \tag{4.4}
\]
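where $I_{k_1, k_2}(\lambda, \alpha, \beta)$ is the integral
\[
I_{k_1, k_2}(\lambda, \alpha, \beta) \triangleq \int_{[0,1]^m} \prod_{1}^{k_1} |x_i - x_j|^{2\lambda} \prod_{\substack{i = 1\cdots k_1 \\ j = k_1 + 1\cdots m}} |\alpha x_i + \beta x_j|^{2\lambda} \prod_{k_1 + 1}^{m} |x_i - x_j|^{2\lambda}\, dx. \tag{4.5}
\]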
In the special case of adjacent increments of equal size, we recover the integral in Eq. (1.1). The value of $I_{k_1, k_2}(\lambda, \alpha, \beta)$ is unknown for arbitrary $k_1$ and $k_2$. In the rest of this section, we will establish some of its basic properties. We begin by recalling the full Selberg integral
\[
S_m(\lambda, \lambda_1, \lambda_2) \triangleq \int_{[0,1]^m} \prod_{1}^{m} |x_i - x_j|^{2\lambda} \prod_{i=1}^{m} x_i^{\lambda_1} (1 - x_i)^{\lambda_2}\, dx. \tag{4.6}
\]
It has the essential property that its dependence on $\lambda_1$ and $\lambda_2$ determines the full integral due to the identity
\[
\prod_{1}^{m+1} |x_i - x_j|^{2\lambda} = y^{\lambda m(m+1)} \prod_{1}^{m} |t_i - t_j|^{2\lambda} \prod_{i=1}^{m} |1 - t_i|^{2\lambda}, \tag{4.7}
\]
where $x_i = t_i y$, $i = 1 \cdots m$, and $x_{m+1} = y$, which implies the recurrence
\[
S_{m+1}(\lambda, 0, 0) = \frac{(m+1)}{1 + \lambda m(m+1) + m}\, S_m(\lambda, 0, 2\lambda). \tag{4.8}
\]
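The recurrence in Eq. (4.8) can be checked numerically once a closed form for the full Selberg integral is available. The sketch below uses Selberg's classical product formula for $S_m(\lambda, \lambda_1, \lambda_2)$, which is not written out in this section and is therefore an assumption of the example (it reduces to Eq. (2.5) when $\lambda_1 = \lambda_2 = 0$ and $\lambda = -\mu/2$). The function name and the value of $\lambda$ are illustrative.

```python
import math


def selberg(m, lam, lam1=0.0, lam2=0.0):
    """S_m(lam, lam1, lam2) of Eq. (4.6) via Selberg's product formula (assumed here)."""
    lg = math.lgamma
    log_value = 0.0
    for j in range(m):
        log_value += (lg(1 + lam1 + j * lam) + lg(1 + lam2 + j * lam)
                      + lg(1 + (j + 1) * lam)
                      - lg(2 + lam1 + lam2 + (m + j - 1) * lam)
                      - lg(1 + lam))
    return math.exp(log_value)


lam = -0.1  # corresponds to mu = 0.2
recurrence_check = []
for m in range(1, 5):
    lhs = selberg(m + 1, lam)
    rhs = (m + 1) / (1 + lam * m * (m + 1) + m) * selberg(m, lam, 0.0, 2 * lam)
    recurrence_check.append((lhs, rhs))
```

The parameter values are chosen so that every gamma argument stays positive; for other values the sign of the gamma factors would need to be tracked separately.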
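Our first result is that this property of the Selberg integral extends to our integral. In fact, letting $x_i = t_i y$, $i = 1 \cdots k_1$, $x_{k_1+1} = y$, $x_j = t_{j-1} z$, $j = k_1 + 2 \cdots m + 1$, $x_{m+2} = z$, we obtain
\[
\prod_{1}^{k_1+1} |x_i - x_j|^{2\lambda} \prod_{\substack{i = 1\cdots k_1+1 \\ j = k_1+2\cdots m+2}} |\alpha x_i + \beta x_j|^{2\lambda} \prod_{k_1+2}^{m+2} |x_i - x_j|^{2\lambda}
\]
\[
= y^{\lambda k_1(k_1-1) + 2\lambda k_1}\, z^{\lambda k_2(k_2-1) + 2\lambda k_2} \prod_{1}^{k_1} |t_i - t_j|^{2\lambda}\; |\alpha y + \beta z|^{2\lambda} \prod_{\substack{i = 1\cdots k_1 \\ j = k_1+1\cdots m}} |\alpha y t_i + \beta z t_j|^{2\lambda} \prod_{k_1+1}^{m} |t_i - t_j|^{2\lambda}
\]
\[
\times \prod_{i=1}^{k_1} \bigl[ |1 - t_i|^{2\lambda}\, |\alpha y t_i + \beta z|^{2\lambda} \bigr] \prod_{j=k_1+1}^{m} \bigl[ |\alpha y + \beta z t_j|^{2\lambda}\, |1 - t_j|^{2\lambda} \bigr]. \tag{4.9}
\]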
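We are thus naturally led to generalize the integral in Eq. (4.5) to
\[
I_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta) \triangleq \int_{[0,1]^m} \prod_{1}^{k_1} |x_i - x_j|^{2\lambda} \prod_{\substack{i = 1\cdots k_1 \\ j = k_1+1\cdots m}} |\alpha x_i + \beta x_j|^{2\lambda} \prod_{k_1+1}^{m} |x_i - x_j|^{2\lambda}
\]
\[
\times \prod_{i=1}^{k_1} \bigl[ |1 - x_i|^{\lambda_1}\, |\alpha x_i + \beta|^{\lambda_2}\, |x_i|^{\lambda_3} \bigr] \prod_{j=k_1+1}^{m} \bigl[ |\alpha + \beta x_j|^{\lambda_1}\, |1 - x_j|^{\lambda_2}\, |x_j|^{\lambda_3} \bigr]\, dx. \tag{4.10}
\]
To simplify notation, let us introduce an auxiliary quantity
\[
J_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta) \triangleq |\alpha|^{\lambda k_1(k_1-1) + k_1 + k_1(\lambda_1 + \lambda_3)}\, |\beta|^{\lambda k_2(k_2-1) + k_2 + k_2(\lambda_2 + \lambda_3)}\, I_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta). \tag{4.11}
\]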
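There results the following recurrence, which generalizes Eq. (4.8).

Proposition 4.1.
\[
J_{k_1+1,k_2+1}(\lambda, 0, 0, \lambda_3, \alpha, \beta) = |\alpha|^{1+\lambda_3}\, |\beta|^{1+\lambda_3}\, (k_1 + 1)(k_2 + 1) \int_{[0,1]^2} |\alpha y + \beta z|^{2\lambda}\, y^{\lambda_3} z^{\lambda_3}\, J_{k_1,k_2}(\lambda, 2\lambda, 2\lambda, \lambda_3, \alpha y, \beta z)\, dy\, dz. \tag{4.12}
\]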
Hence, as in the case of the Selberg integral, it is sufficient to determine the dependence of the new integral on $\lambda_1$ and $\lambda_2$. The quantity $J_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta)$ has the properties that it is symmetric and recovers the joint moments by Eq. (4.4):
\[
J_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta) = J_{k_2,k_1}(\lambda, \lambda_2, \lambda_1, \lambda_3, \beta, \alpha), \tag{4.13}
\]
\[
S_{k_1, k_2}(\mu, [a,b], [b,c]) = J_{k_1,k_2}(-\mu/2, 0, 0, 0, \alpha, \beta). \tag{4.14}
\]
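$J_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta)$ is also of interest as its values for positive and negative $\beta$ are related by a functional equation. Consider the integral identity that follows from the symmetry of the integrand
\[
\begin{aligned}
&\int_{[a,c]^{k_1}} \int_{[b,c]^{k_2}} \prod_{i<j}^{m} |x_i - x_j|^{2\lambda} \prod_{i=1}^{m} |x_i - a|^{\lambda_1} |x_i - c|^{\lambda_2} |x_i - b|^{\lambda_3}\, dx\\
&\quad = \sum_{p=0}^{k_1} \binom{k_1}{p} \int_{[a,b]^{p}} \int_{[b,c]^{m-p}} \prod_{i<j}^{m} |x_i - x_j|^{2\lambda} \prod_{i=1}^{m} |x_i - a|^{\lambda_1} |x_i - c|^{\lambda_2} |x_i - b|^{\lambda_3}\, dx.
\end{aligned} \tag{4.15}
\]
Applying the change of variables
\[
x_i' = 1 - \frac{x_i - a}{\alpha + \beta}, \quad i = 1 \cdots k_1, \qquad x_j' = 1 - \frac{x_j - b}{\beta}, \quad j = k_1+1 \cdots m \tag{4.16}
\]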
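to the left-hand side of Eq. (4.15), we obtain
\[
\int_{[a,c]^{k_1}} \int_{[b,c]^{k_2}} \prod_{i<j}^{m} |x_i - x_j|^{2\lambda} \prod_{i=1}^{m} |x_i - a|^{\lambda_1} |x_i - c|^{\lambda_2} |x_i - b|^{\lambda_3}\, dx = J_{k_1,k_2}(\lambda, \lambda_1, \lambda_3, \lambda_2, \alpha + \beta, -\beta). \tag{4.17}
\]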
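Finally, applying the change of variables in Eq. (4.3) to the right-hand side of Eq. (4.15) and recalling Eq. (4.11), we get
\[
\int_{[a,b]^{k_1}} \int_{[b,c]^{k_2}} \prod_{i<j}^{m} |x_i - x_j|^{2\lambda} \prod_{i=1}^{m} |x_i - a|^{\lambda_1} |x_i - c|^{\lambda_2} |x_i - b|^{\lambda_3}\, dx = J_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta), \tag{4.18}
\]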
which generalizes Eqs. (4.4) and (4.14). There results the following functional equation.
Proposition 4.2. For any $k = 0 \cdots m$ there holds the identity
\[
J_{k,m-k}(\lambda, \lambda_1, \lambda_3, \lambda_2, \alpha + \beta, -\beta) = \sum_{p=0}^{k} \binom{k}{p} J_{p,m-p}(\lambda, \lambda_1, \lambda_2, \lambda_3, \alpha, \beta). \tag{4.19}
\]
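If we let $k = m$ and $\lambda_3 = 0$ in Eq. (4.19), we obtain
Corollary 4.1.
\[
(\alpha + \beta)^{\lambda m(m-1)+m+m(\lambda_1+\lambda_2)}\, S_m(\lambda, \lambda_1, \lambda_2) = \sum_{k_1+k_2=m} \binom{m}{k_1} J_{k_1,k_2}(\lambda, \lambda_1, \lambda_2, 0, \alpha, \beta). \tag{4.20}
\]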
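The simplest consequence of Corollary 4.1, namely the case $m = 2$, $\lambda_1 = \lambda_2 = 0$, $\alpha = \beta = 1/2$ recorded in Eq. (4.21), can be checked by brute force. The sketch below reads $S_{1,1}(\mu, [0,1/2], [1/2,1])$ as the double integral of $|x-y|^{-\mu}$ over $[0,1/2]\times[1/2,1]$, which is how Eqs. (4.14) and (4.18) suggest it should be understood, and uses the elementary value $S_2(-\mu/2,0,0) = 2/((1-\mu)(2-\mu))$ for $\mu < 1$. The midpoint rule and the helper names are illustrative choices, not part of the paper.

```python
def s2_closed_form(mu):
    # Elementary integral of |x - y|^(-mu) over the unit square, mu < 1
    return 2.0 / ((1.0 - mu) * (2.0 - mu))

def s11_midpoint(mu, n=400):
    # Midpoint rule for the integral of (y - x)^(-mu) over
    # x in [0, 1/2], y in [1/2, 1]; the only singularity is at the
    # corner x = y = 1/2, which midpoints never hit.
    h = 0.5 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = 0.5 + (j + 0.5) * h
            total += (y - x) ** (-mu)
    return total * h * h

def s11_from_eq_4_21(mu):
    # Right-hand side of Eq. (4.21)
    return (2.0 ** (1.0 - mu) - 1.0) / 2.0 ** (2.0 - mu) * s2_closed_form(mu)
```

Comparing `s11_midpoint(0.3)` with `s11_from_eq_4_21(0.3)` gives agreement to within the quadrature error of this crude scheme.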
In particular, using Eqs. (4.13) and (4.14) it is easy to calculate the moments for $m = 2$ and $m = 3$ and $\alpha = \beta = 1/2$ in terms of the Selberg integral:
\[
S_{1,1}\bigl(\mu, [0, \tfrac{1}{2}], [\tfrac{1}{2}, 1]\bigr) = \frac{(2^{1-\mu} - 1)}{2^{2-\mu}}\, S_2(-\mu/2, 0, 0), \tag{4.21}
\]
\[
S_{1,2}\bigl(\mu, [0, \tfrac{1}{2}], [\tfrac{1}{2}, 1]\bigr) = S_{2,1}\bigl(\mu, [0, \tfrac{1}{2}], [\tfrac{1}{2}, 1]\bigr) = \frac{1}{3}\, \frac{(2^{2-3\mu} - 1)}{2^{3-3\mu}}\, S_3(-\mu/2, 0, 0). \tag{4.22}
\]
5. The Proofs
In this section we will give the proofs of all results that were stated in Sec. 3. We begin with a direct proof of the fact that our infinite product formula in Theorem 2.8 coincides with Selberg's finite product in the special case of $q = l$, $l = 1, 2, 3, \ldots$.
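Proof of Eq. (2.27) from Eq. (2.32). Recall the identity, confer [18], Sec. 12.13,
\[
\prod_{n=1}^{\infty} \prod_{j=1}^{l} \frac{n - a_j}{n - b_j} = \prod_{j=1}^{l} \frac{\Gamma(1 - b_j)}{\Gamma(1 - a_j)} \tag{5.1}
\]
that holds for arbitrary constants $a_j$ and $b_j$ so long as $\sum_j a_j = \sum_j b_j$. Hence, in the special case of $q = l$, we obtain
\[
\prod_{n=1}^{\infty} \Bigl(\frac{2n}{\mu}\Bigr)^{2q} \frac{\Gamma^3(1 - q + 2n/\mu)\, \Gamma(2 - q + 2n/\mu)}{\Gamma^3(1 + 2n/\mu)\, \Gamma(2 - 2q + 2n/\mu)} = \prod_{j=0}^{l-1} \frac{\Gamma^3(1 - \mu j/2)}{\Gamma(1 - \mu(l + j - 1)/2)}. \tag{5.2}
\]
On the other hand, using the functional equation of the gamma function, we have
\[
\Bigl(\frac{2}{\mu}\Bigr)^{q} \frac{\Gamma(2 + 2/\mu - 2q)}{\Gamma(2 + 2/\mu - q)} = \prod_{j=0}^{l-1} \frac{1}{1 - \mu(l + j - 1)/2}. \tag{5.3}
\]
Thus,
\[
\exp(D_\mu(l)) = \frac{\Gamma(1 - \mu l/2)}{\Gamma^{l}(1 - \mu/2)} \prod_{j=0}^{l-1} \frac{\Gamma^3(1 - \mu j/2)}{\Gamma(2 - \mu(l + j - 1)/2)}. \tag{5.4}
\]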
It is elementary now to see that this expression is the same as Eq. (2.27). We remark in passing that the proof of Eq. (2.28) from Eq. (2.32) goes through verbatim.
We now proceed to the proof of Theorem 3.1. In the proof, we work with the general decorrelation length $L$ and set $L = 1$ at the end. Throughout this section, $s \to \omega_{\mu,L,\varepsilon}(s)$ refers to the Gaussian process defined in Eqs. (2.2) and (2.3). As a slight abuse of notation, we write $dM_\varepsilon(s)$ as an abbreviation for $dM_{\mu,L,\varepsilon}(s) \triangleq \exp(\omega_{\mu,L,\varepsilon}(s))\,ds$. We also define $\mu\, g_{L,\varepsilon}(s_1, s_2) \triangleq \mathrm{Cov}(\omega_{\mu,L,\varepsilon}(s_1), \omega_{\mu,L,\varepsilon}(s_2))$, then $\lim_{\varepsilon\to 0} g_{L=1,\varepsilon}(s_1, s_2) = g(s_1, s_2)$ as in Eq. (3.3). The functions $F_j(x)$ and $f_j(x)$ are as in Sec. 3: $F_j(x)$ is smooth, $f_j(x)$ is continuous, and neither involves $\mu$ as a parameter. We need the statements of several lemmas, whose proofs the reader can readily find in [13].
Lemma 5.1. Let $\omega(s)$ be a Gaussian process with respect to some probability measure $P$ that is defined on an interval $s \in [0, L]$, has continuous sample paths, and satisfies $\mathbf{E}_P[\exp(\omega(s))] = 1$ for all $s$. Let $s_1$ and $s_2$ be any two distinct times, $s_1, s_2 \in [0, L]$, and let $C(s, t) \triangleq \mathrm{Cov}_P(\omega(s), \omega(t))$ denote the covariance function of $\omega(s)$, which is assumed to be continuous. Then, the measure $dQ \triangleq e^{\omega(s_1)}\,dP$ is a probability measure that is equivalent to $P$, and the law of the process $s \to \omega(s) + C(s, s_1)$ with respect to $P$ equals the law of the original process $s \to \omega(s)$ with respect to $Q$ on the interval $[0, L]$. Similarly, the measure $dQ \triangleq e^{\omega(s_1)+\omega(s_2)-C(s_1,s_2)}\,dP$ is a probability measure that is equivalent to $P$, and
the law of the process $s \to \omega(s) + C(s, s_1) + C(s, s_2)$ with respect to $P$ equals the law of the original process $s \to \omega(s)$ with respect to $Q$ on the interval $[0, L]$.
Lemma 5.2. The process $\omega_{\mu,L,\varepsilon}(s)$ has continuous sample paths.
Lemma 5.3. Let $f(\delta, s)$ be an arbitrary continuous function that vanishes as $\delta \to 0$. Let $B(s) \triangleq \exp(f(\delta, s) + \omega_{\delta,L,\varepsilon}(s)) - 1$. Then, given any distinct $s_1, \ldots, s_k \in [0, L]$, as $\delta \to 0$,
\[
\mathbf{E}[B(s_1) B(s_2)] = (e^{f(\delta,s_1)} - 1)(e^{f(\delta,s_2)} - 1) + \delta\, g_{L,\varepsilon}(s_1, s_2) + o(\delta), \tag{5.5}
\]
\[
\mathbf{E}[B(s_1) \cdots B(s_k)] = (e^{f(\delta,s_1)} - 1) \cdots (e^{f(\delta,s_k)} - 1) + o(\delta), \quad k \ne 2. \tag{5.6}
\]
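The two-point formula in Eq. (5.5) can be illustrated in a toy setting. The sketch below assumes, for illustration only, that $\omega(s_1)$ and $\omega(s_2)$ are jointly Gaussian with $\mathbf{E}[e^{\omega(s_i)}] = 1$ and covariance $\delta g$ for a fixed number $g$, and that $f(\delta, s_i) = c_i \delta$; the constants `g`, `c1`, `c2` are hypothetical. Under these assumptions $\mathbf{E}[e^{\omega(s_1)+\omega(s_2)}] = e^{\delta g}$, so the left-hand side is available in closed form and the remainder divided by $\delta$ should tend to zero.

```python
import math

def exact_second_moment(delta, g, c1, c2):
    # E[B(s1) B(s2)] with B = exp(f + omega) - 1, E[exp(omega)] = 1 and
    # Cov(omega(s1), omega(s2)) = delta * g, so E[exp(omega1 + omega2)] = exp(delta * g)
    f1, f2 = c1 * delta, c2 * delta  # hypothetical f(delta, s_i), vanishing as delta -> 0
    return math.exp(f1 + f2 + delta * g) - math.exp(f1) - math.exp(f2) + 1.0

def leading_terms(delta, g, c1, c2):
    # Right-hand side of Eq. (5.5) without the o(delta) remainder
    f1, f2 = c1 * delta, c2 * delta
    return (math.exp(f1) - 1.0) * (math.exp(f2) - 1.0) + delta * g

def scaled_remainder(delta, g=0.7, c1=1.5, c2=-0.4):
    # (exact - leading terms) / delta, which should vanish as delta -> 0
    return (exact_second_moment(delta, g, c1, c2)
            - leading_terms(delta, g, c1, c2)) / delta
```

Evaluating `scaled_remainder` at decreasing values of `delta` shows the remainder shrinking roughly linearly in $\delta$, consistent with an $o(\delta)$ error term.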
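Proof of Theorem 3.1. Let $u_\varepsilon(z, \mu, \vec{f}, \vec{F}, \vec{I}) \triangleq \prod_{j=1}^{N} F_j\bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\bigr)$ and let $v_\varepsilon(z, \mu, \vec{f}, \vec{F}, \vec{I}) \triangleq \mathbf{E}[u_\varepsilon(z, \mu, \vec{f}, \vec{F}, \vec{I})]$ so that $v(\mu, \vec{f}, \vec{F}, \vec{I})$ in Eq. (3.1) satisfies $v(\mu, \vec{f}, \vec{F}, \vec{I}) = \lim_{\varepsilon\to 0} v_\varepsilon(z = 1, \mu, \vec{f}, \vec{F}, \vec{I})$. The starting point is the limit
\[
A \triangleq \frac{\partial}{\partial\delta}\Big|_{\delta=0} \mathbf{E}^*\bigl[v_\varepsilon(z e^{B^*(\delta)}, \mu, \vec{f}, \vec{F}, \vec{I})\bigr] \tag{5.7}
\]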
where $B^*(\delta)$ is the standard Brownian motion starting at zero that is independent of the process $s \to \omega_{\mu,L,\varepsilon}(s)$. We will evaluate this limit in two different ways. The first is based on the intermittency parameter invariance
\[
B^*(\delta) + \omega_{\mu,L,\varepsilon}(s) = \omega_{\mu-\delta,L,\varepsilon}(s) + \bar{\omega}_{\delta,eL,\varepsilon}(s) + \frac{\delta}{2}, \tag{5.8}
\]
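that we first introduced in [13]. This equality is understood as the equality in law of stochastic processes viewed as random functions of $s$ on the interval $[0, L]$ at fixed $0 < \delta < \mu$ and $\varepsilon$. The process $\bar{\omega}_{\delta,eL,\varepsilon}(s)$ denotes an independent copy of the $\omega_\varepsilon(s)$ process at the intermittency parameter $\delta$ and rescaled decorrelation length $eL$, where $e$ denotes the base of the natural logarithm. Now, we need to define three auxiliary quantities.
\[
A_{\delta,\varepsilon}(s) \triangleq \omega_{\mu-\delta,L,\varepsilon}(s) - \omega_{\mu,L,\varepsilon}(s), \tag{5.9}
\]
\[
\bar{A}_{\delta,\varepsilon}(s) \triangleq \bar{\omega}_{\delta,eL,\varepsilon}(s), \tag{5.10}
\]
\[
C_\varepsilon(\delta, f, I) \triangleq \int_{I} e^{\mu f(s)} \bigl(e^{A_{\delta,\varepsilon}(s) + \bar{A}_{\delta,\varepsilon}(s)} - 1\bigr)\, dM_\varepsilon(s). \tag{5.11}
\]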
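The key observation is that the intermittency parameter invariance in Eq. (5.8) implies the following identity in law
\[
u_\varepsilon(z e^{B^*(\delta)}, \mu, \vec{f}, \vec{F}, \vec{I}) = \prod_{j=1}^{N} F_j\Bigl(z e^{\frac{\delta}{2}} \int_{I_j} e^{\mu f_j(s) + A_{\delta,\varepsilon}(s) + \bar{A}_{\delta,\varepsilon}(s)}\, dM_\varepsilon(s)\Bigr). \tag{5.12}
\]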
Thus, to compute the limit in Eq. (5.7), we need to expand the right-hand side of Eq. (5.12) in δ up to o(δ) terms. While we do not know how to expand either
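$A_{\delta,\varepsilon}(s)$ or $\bar{A}_{\delta,\varepsilon}(s)$ in $\delta$, they both clearly vanish as $\delta \to 0$ and, therefore, so does $C_\varepsilon(\delta, f, I)$. It follows that we can write
\[
\begin{aligned}
F_j\Bigl(z e^{\frac{\delta}{2}} \int_{I_j} e^{\mu f_j(s) + A_{\delta,\varepsilon}(s) + \bar{A}_{\delta,\varepsilon}(s)}\, dM_\varepsilon(s)\Bigr) ={}& \delta\, \frac{z}{2}\, F_j^{(1)}\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr) \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\\
&+ \sum_{k=0}^{\infty} \frac{z^k}{k!}\, F_j^{(k)}\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr)\, C_\varepsilon^{k}(\delta, f_j, I_j) + o(\delta).
\end{aligned} \tag{5.13}
\]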
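There results the expansion of $\prod_{j=1}^{N} F_j\bigl(z e^{\frac{\delta}{2}} \int_{I_j} e^{\mu f_j(s) + A_{\delta,\varepsilon}(s) + \bar{A}_{\delta,\varepsilon}(s)}\, dM_\varepsilon(s)\bigr)$:
\[
\begin{aligned}
&\delta\, \frac{z}{2} \sum_{j=1}^{N} F_j^{(1)}\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr) \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s) \prod_{l\ne j}^{N} F_l\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr)\\
&\quad + \sum_{k_1\cdots k_N}^{\infty} \frac{z^{k_1+\cdots+k_N}}{k_1! \cdots k_N!} \prod_{l=1}^{N} F_l^{(k_l)}\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr)\, C_\varepsilon^{k_l}(\delta, f_l, I_l) + o(\delta).
\end{aligned} \tag{5.14}
\]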
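The next step in the proof is to compute the expectation of this expression with the goal of eventually evaluating the limit in Eq. (5.7) by means of the identity in Eq. (5.12). There are two expectations involved: the $\mathbf{E}$ with respect to the $\omega_\varepsilon$ process that is inherited from the definition of $v_\varepsilon(z, \mu, \vec{f}, \vec{F}, \vec{I})$ and the $\mathbf{E}^*$ expectation with respect to the $\bar{\omega}_\varepsilon$ process. Interchanging their order, it follows from Eq. (5.14) that computing the $\mathbf{E}^*$ expectation is now reduced to computing $\mathbf{E}^*\bigl[\prod_{l=1}^{N} C_\varepsilon^{k_l}(\delta, f_l, I_l)\bigr]$. As $A_{\delta,\varepsilon}(s)$ and $\bar{A}_{\delta,\varepsilon}(s)$ are independent processes, it follows from Lemma 5.3 applied to $B(s) = \exp(A_{\delta,\varepsilon}(s) + \bar{A}_{\delta,\varepsilon}(s)) - 1$ with $f(\delta, s) \triangleq A_{\delta,\varepsilon}(s)$ that the $\mathbf{E}^*$ expectation equals
\[
\begin{aligned}
\mathbf{E}^*\Bigl[\prod_{l=1}^{N} C_\varepsilon^{k_l}(\delta, f_l, I_l)\Bigr] ={}& \prod_{l=1}^{N} \Bigl(\int_{I_l} e^{\mu f_l(s)} \bigl[e^{A_{\delta,\varepsilon}(s)} - 1\bigr]\, dM_\varepsilon(s)\Bigr)^{k_l}\\
&+ \delta \sum_{l<j}^{N} \delta_{k_l,1}\, \delta_{k_j,1} \prod_{p\ne l,j}^{N} \delta_{k_p,0} \int_{I_l} \int_{I_j} e^{\mu(f_l(s_1) + f_j(s_2))}\, g_{eL,\varepsilon}(s_1, s_2)\, dM_\varepsilon(s_1)\, dM_\varepsilon(s_2)\\
&+ \delta \sum_{l=1}^{N} \delta_{k_l,2} \prod_{p\ne l}^{N} \delta_{k_p,0} \int_{I_l^2} e^{\mu(f_l(s_1) + f_l(s_2))}\, g_{eL,\varepsilon}(s_1, s_2)\, dM_\varepsilon(s_1)\, dM_\varepsilon(s_2) + o(\delta).
\end{aligned} \tag{5.15}
\]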
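It follows that the expectation of the second term in Eq. (5.14) satisfies
\[
\begin{aligned}
&\mathbf{E}^*\Bigl[\sum_{k_1\cdots k_N}^{\infty} \frac{z^{k_1+\cdots+k_N}}{k_1! \cdots k_N!} \prod_{l=1}^{N} F_l^{(k_l)}\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr)\, C_\varepsilon^{k_l}(\delta, f_l, I_l)\Bigr]\\
&\quad = \sum_{k_1\cdots k_N}^{\infty} \frac{z^{k_1+\cdots+k_N}}{k_1! \cdots k_N!} \prod_{l=1}^{N} F_l^{(k_l)}\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr) \Bigl(\int_{I_l} e^{\mu f_l(s)} \bigl[e^{A_{\delta,\varepsilon}(s)} - 1\bigr]\, dM_\varepsilon(s)\Bigr)^{k_l}\\
&\qquad + \delta z^2 \sum_{l<j}^{N} F_l^{(1)}\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr) F_j^{(1)}\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr) \prod_{k\ne l,j} F_k\Bigl(z \int_{I_k} e^{\mu f_k(s)}\, dM_\varepsilon(s)\Bigr)\\
&\qquad\qquad \times \int_{I_l} \int_{I_j} e^{\mu(f_l(s_1) + f_j(s_2))}\, g_{eL,\varepsilon}(s_1, s_2)\, dM_\varepsilon(s_1)\, dM_\varepsilon(s_2)\\
&\qquad + \delta\, \frac{z^2}{2} \sum_{l=1}^{N} F_l^{(2)}\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr) \prod_{j\ne l} F_j\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr)\\
&\qquad\qquad \times \int_{I_l^2} e^{\mu(f_l(s_1) + f_l(s_2))}\, g_{eL,\varepsilon}(s_1, s_2)\, dM_\varepsilon(s_1)\, dM_\varepsilon(s_2) + o(\delta).
\end{aligned} \tag{5.16}
\]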
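The first term on the right-hand side of Eq. (5.16) can be reduced further by means of an elementary identity
\[
\begin{aligned}
\frac{\partial}{\partial\mu} \prod_{j=1}^{N} F_j\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr) ={}& z \sum_{l=1}^{N} F_l^{(1)}\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr) \int_{I_l} e^{\mu f_l(s)} f_l(s)\, dM_\varepsilon(s) \prod_{j\ne l} F_j\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr)\\
&- \lim_{\delta\to 0} \frac{1}{\delta} \Bigl[\sum_{k_1\cdots k_N}^{\infty} \frac{z^{k_1+\cdots+k_N}}{k_1! \cdots k_N!} \prod_{l=1}^{N} F_l^{(k_l)}\Bigl(z \int_{I_l} e^{\mu f_l(s)}\, dM_\varepsilon(s)\Bigr) \Bigl(\int_{I_l} e^{\mu f_l(s)} \bigl[e^{A_{\delta,\varepsilon}(s)} - 1\bigr]\, dM_\varepsilon(s)\Bigr)^{k_l}\\
&\qquad\qquad - \prod_{j=1}^{N} F_j\Bigl(z \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\Bigr)\Bigr].
\end{aligned} \tag{5.17}
\]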
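Putting Eqs. (5.14), (5.16), and (5.17) together, we obtain for the limit in Eq. (5.7)
\[
\begin{aligned}
A ={}& -\frac{\partial}{\partial\mu} \mathbf{E}[u_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_N, \vec{I})] + \frac{z}{2}\, \mathbf{E}\Bigl[\sum_{j=1}^{N} \int_{I_j} e^{\mu f_j(s)}\, dM_\varepsilon(s)\; u_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_j^{(1)} \cdots F_N, \vec{I})\Bigr]\\
&+ z\, \mathbf{E}\Bigl[\sum_{l=1}^{N} \int_{I_l} e^{\mu f_l(s)} f_l(s)\, dM_\varepsilon(s)\; u_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_l^{(1)} \cdots F_N, \vec{I})\Bigr]\\
&+ z^2\, \mathbf{E}\Bigl[\sum_{l<j}^{N} \int_{I_l} \int_{I_j} e^{\mu(f_l(s_1) + f_j(s_2))}\, g_{eL,\varepsilon}(s_1, s_2)\, dM_\varepsilon(s_1)\, dM_\varepsilon(s_2)\; u_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_l^{(1)} \cdots F_j^{(1)} \cdots F_N, \vec{I})\Bigr]\\
&+ \frac{z^2}{2}\, \mathbf{E}\Bigl[\sum_{l=1}^{N} \int_{I_l^2} e^{\mu(f_l(s_1) + f_l(s_2))}\, g_{eL,\varepsilon}(s_1, s_2)\, dM_\varepsilon(s_1)\, dM_\varepsilon(s_2)\; u_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_l^{(2)} \cdots F_N, \vec{I})\Bigr].
\end{aligned} \tag{5.18}
\]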
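Finally, recalling the definition $v_\varepsilon(z, \mu, \vec{f}, \vec{F}, \vec{I}) \triangleq \mathbf{E}[u_\varepsilon(z, \mu, \vec{f}, \vec{F}, \vec{I})]$, the expression in Eq. (5.18) can be simplified by means of Lemma 5.1.
\[
\begin{aligned}
A ={}& -\frac{\partial}{\partial\mu} v_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_N, \vec{I}) + \frac{z}{2} \frac{\partial}{\partial z} v_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_N, \vec{I})\\
&+ z \sum_{l=1}^{N} \int_{I_l} e^{\mu f_l(s)} f_l(s)\, v_\varepsilon(z, \mu, \vec{f} + \vec{g}(\cdot, s), F_1 \cdots F_l^{(1)} \cdots F_N, \vec{I})\, ds\\
&+ z^2 \sum_{l<j}^{N} \int_{I_l} \int_{I_j} \exp\bigl(\mu(f_l(s_1) + f_j(s_2) + g_{L,\varepsilon}(s_1, s_2))\bigr)\, g_{eL,\varepsilon}(s_1, s_2)\\
&\qquad \times v_\varepsilon(z, \mu, \vec{f} + \vec{g}(\cdot, s_1) + \vec{g}(\cdot, s_2), F_1 \cdots F_l^{(1)} \cdots F_j^{(1)} \cdots F_N, \vec{I})\, ds_1\, ds_2\\
&+ \frac{z^2}{2} \sum_{l=1}^{N} \int_{I_l^2} \exp\bigl(\mu(f_l(s_1) + f_l(s_2) + g_{L,\varepsilon}(s_1, s_2))\bigr)\, g_{eL,\varepsilon}(s_1, s_2)\\
&\qquad \times v_\varepsilon(z, \mu, \vec{f} + \vec{g}(\cdot, s_1) + \vec{g}(\cdot, s_2), F_1 \cdots F_l^{(2)} \cdots F_N, \vec{I})\, ds_1\, ds_2.
\end{aligned} \tag{5.19}
\]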
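Now, we compute the limit in Eq. (5.7) in a different way. By the backward Kolmogorov equation we have
\[
A = \frac{1}{2} \Bigl(z^2 \frac{\partial^2}{\partial z^2} + z \frac{\partial}{\partial z}\Bigr) v_\varepsilon(z, \mu, \vec{f}, F_1 \cdots F_N, \vec{I}). \tag{5.20}
\]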
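The generator appearing in Eq. (5.20) can be checked numerically on a single smooth function. The sketch below uses only the fact that $B^*(\delta)$ is Gaussian with mean zero and variance $\delta$; the test function `v`, the step `delta`, and the trapezoid-rule quadrature are hypothetical illustrative choices and stand in for $v_\varepsilon$, which is not computed here.

```python
import math

def gaussian_expectation(func, delta, half_width=8.0, steps=1600):
    # E[func(sqrt(delta) * xi)] for a standard normal xi, via the trapezoid rule
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        xi = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(math.sqrt(delta) * xi) * math.exp(-0.5 * xi * xi)
    return total * h / math.sqrt(2.0 * math.pi)

def v(z):
    # Hypothetical smooth test function standing in for v_eps(z, ...)
    return 1.0 / (1.0 + z)

def generator_fd(z, delta=1e-4):
    # Finite-difference version of d/d(delta) E[v(z exp(B(delta)))] at delta = 0
    mean = gaussian_expectation(lambda b: v(z * math.exp(b)), delta)
    return (mean - v(z)) / delta

def generator_exact(z):
    # (1/2) (z^2 v'' + z v') for v(z) = 1 / (1 + z)
    v1 = -1.0 / (1.0 + z) ** 2
    v2 = 2.0 / (1.0 + z) ** 3
    return 0.5 * (z * z * v2 + z * v1)
```

At `z = 2.0` both functions should be close to $1/27$, the value of $\tfrac12(z^2 v'' + z v')$ for this particular test function.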
It remains to equate Eqs. (5.19) and (5.20) and make use of the identity
\[
g_{eL,\varepsilon}(s_1, s_2) = 1 + g_{L,\varepsilon}(s_1, s_2). \tag{5.21}
\]
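The result follows by setting $z = 1$ and letting $\varepsilon \to 0$.
Proof of Proposition 3.1. We work with general $f_j$. Given non-negative integers $k_j$, $j = 1 \cdots N$, let $\vec{s}_j$, $j = 1 \cdots N$, denote the vector $\vec{s}_j \equiv (s_{j1}, \ldots, s_{jk_j})$ such that $s_{jl} \in I_j$ $\forall\, l$. For a scalar function of a scalar we will write $f(\vec{s}_j)$ to mean the sum $f(\vec{s}_j) \triangleq \sum_{l=1}^{k_j} f(s_{jl})$. Similarly, for a scalar function of two variables we will write $\vec{g}(s, t) \triangleq (g(s, t), \ldots, g(s, t))$ $N$ times, $g(t, \vec{s}_i) \triangleq \sum_{p=1}^{k_i} g(t, s_{ip})$, and $\vec{g}(t, \vec{s}_i) \triangleq (g(t, \vec{s}_i), \ldots, g(t, \vec{s}_i))$ $N$ times. Finally, we set
\[
g(\vec{s}_i, \vec{s}_j) \triangleq
\begin{cases}
\displaystyle\sum_{p=1}^{k_i} \sum_{q=1}^{k_j} g(s_{ip}, s_{jq}) & \text{if } i \ne j,\\[2ex]
\displaystyle\sum_{p<q}^{k_i} g(s_{ip}, s_{iq}) & \text{if } i = j.
\end{cases} \tag{5.22}
\]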
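Then, repeated application of the differentiation rule in Theorem 3.1 shows that there exist functions $h_{n;k_1\cdots k_N}(\vec{s}_1, \ldots, \vec{s}_N)$ of vectors $\vec{s}_j = (s_{j1}, \ldots, s_{jk_j})$ such that
\[
\begin{aligned}
\frac{\partial^n}{\partial\mu^n} v(\mu, \vec{f}, \vec{F}, \vec{I}) ={}& \sum_{1 \le \sum_{j=1}^{N} k_j \le 2n}\; \int_{I_1^{k_1}} \cdots \int_{I_N^{k_N}} \exp\Bigl(\mu\Bigl(\sum_{l=1}^{N} f_l(\vec{s}_l) + \sum_{l\le j}^{N} g(\vec{s}_l, \vec{s}_j)\Bigr)\Bigr)\\
&\times v\Bigl(\mu, \vec{f} + \sum_{l=1}^{N} \vec{g}(\cdot, \vec{s}_l), F_1^{(k_1)} \cdots F_N^{(k_N)}, \vec{I}\Bigr)\, h_{n;k_1\cdots k_N}(\vec{s}_1, \ldots, \vec{s}_N)\, d\vec{s}_1 \cdots d\vec{s}_N.
\end{aligned} \tag{5.23}
\]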
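The functions $h_{n;k_1\cdots k_N}(\vec{s}_1, \ldots, \vec{s}_N)$ are determined by the recurrence
\[
\begin{aligned}
h_{n+1;k_1\cdots k_N} ={}& \Bigl(\sum_{l=1}^{N} f_l(\vec{s}_l) + \sum_{l\le j}^{N} g(\vec{s}_l, \vec{s}_j)\Bigr) h_{n;k_1\cdots k_N} + \sum_{i=1}^{N} \Bigl(f_i(s_{ik_i}) + \sum_{l=1}^{N} g(s_{ik_i}, \vec{s}_l)\Bigr) h_{n;k_1\cdots k_i-1\cdots k_N}\\
&+ \frac{1}{2} \sum_{i=1}^{N} g(s_{ik_i-1}, s_{ik_i})\, h_{n;k_1\cdots k_i-2\cdots k_N} + \sum_{i<j}^{N} g(s_{ik_i}, s_{jk_j})\, h_{n;k_1\cdots k_i-1\cdots k_j-1\cdots k_N}
\end{aligned} \tag{5.24}
\]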
subject to the initial conditions
\[
h_{1;0\cdots k_i=1\cdots 0}(0 \cdots \vec{s}_i = (s) \cdots 0) = f_i(s), \tag{5.25}
\]
\[
h_{1;0\cdots k_i=2\cdots 0}(0 \cdots \vec{s}_i = (s_1, s_2) \cdots 0) = \frac{1}{2}\, g(s_1, s_2), \tag{5.26}
\]
\[
h_{1;0\cdots k_i=1\cdots k_j=1\cdots 0}(0 \cdots \vec{s}_i = (s_1) \cdots \vec{s}_j = (s_2) \cdots 0) = g(s_1, s_2). \tag{5.27}
\]
This is all verified by a straightforward induction argument. Finally, we define the expansion coefficients
\[
H_{n;k_1\cdots k_N} \triangleq \int_{I_1^{k_1}} \cdots \int_{I_N^{k_N}} h_{n;k_1\cdots k_N}(\vec{s}_1, \ldots, \vec{s}_N)\, d\vec{s}_1 \cdots d\vec{s}_N. \tag{5.28}
\]
The intermittency expansion in Eq. (3.5) is then the formal Taylor series that follows from Eq. (5.23) by setting $\mu = 0$ and recalling that $dM(s) = ds$ at zero intermittency. Note that we have established the expansion in Eq. (3.5) for general functions $f_j$, in which case the expansion coefficients $H_{n;k_1\cdots k_N}$ depend on $f_j$, and the sum is over the range $1 \le \sum_{j=1}^{N} k_j \le 2n$. If $f_j = 0$, then this range becomes $2 \le \sum_{j=1}^{N} k_j \le 2n$ because of the structure of the recurrence in Eq. (5.24) and the initial condition in Eq. (5.25).
Proof of Corollary 3.1. The argument is based on binomial inversion. Consider the joint positive integral moments $S_{q_1\cdots q_N}(\mu, \vec{I})$ of $\int_{I_1} dM_\mu, \ldots, \int_{I_N} dM_\mu$ as in Eq. (3.7). Let $\sum_{j=1}^{N} q_j \le 2n$. The corresponding intermittency expansion given in Eq. (3.6) implies
\[
\frac{\partial^n}{\partial\mu^n}\Big|_{\mu=0} S_{q_1\cdots q_N}(\mu, \vec{I}) = \sum_{\substack{k_j \le q_j\\ j=1\cdots N}} H_{n;k_1\cdots k_N} \prod_{j=1}^{N} (q_j)_{k_j}\, |I_j|^{q_j-k_j}. \tag{5.29}
\]
The result now follows from binomial inversion.
Proof of Proposition 3.2. The proof requires a routine calculation using that $\{\omega_{\mu,L,\varepsilon}(s_j)\}$ is jointly Gaussian for any sequence $\{s_j\}$, and for any Gaussian random variable $\omega$ there holds the identity $\mathbf{E}[\exp(\omega)] = \exp(\mathbf{E}[\omega] + (1/2)\,\mathrm{Var}[\omega])$.
Proof of Proposition 3.3. The argument is based on symmetrization. Introduce the following notation. We will write $S_q^k$ to denote the set of all subsets of $\{1 \cdots k\}$ consisting of exactly $q$ elements, $q = 1 \cdots k$. Given the indices $q_j \le k_j$, $j = 1 \cdots N$, and subsets $\sigma_j \in S_{q_j}^{k_j}$ of the sets $\{1 \cdots k_j\}$ each consisting of $q_j$ indices, respectively, we will abbreviate the sum over all the possible pairs of indices as
\[
\sum_{\substack{\sigma_l\\ l=1\cdots N}} \log|s_i - s_j| \triangleq \sum_{i<j}^{N} \sum_{p\in\sigma_i} \sum_{l\in\sigma_j} \log|s_{ip} - s_{jl}| + \sum_{i=1}^{N} \sum_{\substack{p<l\\ p,l\in\sigma_i}} \log|s_{ip} - s_{il}|. \tag{5.30}
\]
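Then, by Proposition 3.2, we can write
\[
\frac{\partial^n}{\partial\mu^n}\Big|_{\mu=0} S_{q_1\cdots q_N}(\mu, \vec{I}) = (-1)^n \int \cdots \int_{I_1^{q_1}\times\cdots\times I_N^{q_N}} d\vec{s}_1 \cdots d\vec{s}_N \Bigl(\sum_{\substack{\sigma_l=\{1\cdots q_l\}\\ l=1\cdots N}} \log|s_i - s_j|\Bigr)^{n}. \tag{5.31}
\]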
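The integrand is clearly symmetric in $(s_{i1}, \ldots, s_{iq_i})$ for every $i = 1 \cdots N$. We can extend the integral to $I_1^{k_1} \times \cdots \times I_N^{k_N}$ by symmetrizing the integrand. Given $q_j \le k_j$, $j = 1 \cdots N$, it is easy to see from Eq. (5.31) that there holds the identity
\[
\prod_{l=1}^{N} |I_l|^{k_l-q_l} \binom{k_l}{q_l}\, \frac{\partial^n}{\partial\mu^n}\Big|_{\mu=0} S_{q_1\cdots q_N}(\mu, \vec{I}) = (-1)^n \int \cdots \int_{I_1^{k_1}\times\cdots\times I_N^{k_N}} d\vec{s}_1 \cdots d\vec{s}_N \sum_{\substack{\sigma_l\in S_{q_l}^{k_l}\\ l=1\cdots N}} \Bigl(\sum_{\substack{\sigma_l\\ l=1\cdots N}} \log|s_i - s_j|\Bigr)^{n}. \tag{5.32}
\]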
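We now substitute this identity into the right-hand side of Eq. (3.8), take the sum over $q_1 \cdots q_N$, and obtain a new representation of the expansion coefficients
\[
\begin{aligned}
H_{n;k_1\cdots k_N} ={}& (-1)^n\, \frac{(-1)^{k_1+\cdots+k_N}}{k_1! \cdots k_N!} \int \cdots \int_{I_1^{k_1}\times\cdots\times I_N^{k_N}} d\vec{s}_1 \cdots d\vec{s}_N\\
&\times \sum_{\substack{q_l\le k_l\\ l=1\cdots N}} (-1)^{q_1+\cdots+q_N} \sum_{\substack{\sigma_l\in S_{q_l}^{k_l}\\ l=1\cdots N}} \Bigl(\sum_{\substack{\sigma_l\\ l=1\cdots N}} \log|s_i - s_j|\Bigr)^{n}.
\end{aligned} \tag{5.33}
\]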
It remains to observe that the integrand in Eq. (5.33) coincides, up to a multiplicative constant, with the terms in the multinomial expansion of
\[
\Bigl(\sum_{\substack{\sigma_l=\{1\cdots k_l\}\\ l=1\cdots N}} \log|s_i - s_j|\Bigr)^{n} \tag{5.34}
\]
that involve all the indices (11) · · · (1k1 ) · · · (N 1) · · · (N kN ). The total number of these indices is k1 + · · · + kN . As each term in the multinomial expansion has exactly n factors and each factor has two indices, 2n is the greatest number of indices that can be involved in any of the expansion terms. Hence the result in Eq. (3.10).
6. Conclusions
We have presented a thorough review of the single-increment distribution of the limit lognormal process. In particular, we have reviewed our work on the Mellin transform of the limit distribution and its connection with the Selberg integral. We have also characterized the general transform and its operator formulation. Despite substantial progress in our understanding of the limit distribution, several problems remain outstanding. The problem of uniqueness of our solution for the Mellin transform is open mainly due to the fact that the corresponding moment problem is indeterminate. The problem of computing transforms other than the Mellin transform is open as the action of our operator on functions other than the exponential is unknown. The problem of computing the probability density function of the limit distribution is also open.
We have derived the rule of intermittency differentiation for multiple increments of the limit lognormal process from the intermittency parameter invariance of the underlying normal process. This rule is an exact functional equation that expresses the intermittency derivative of a general class of functionals of the joint distribution of multiple increments in terms of certain expectations over the paths of the limit lognormal process. We have solved this equation by means of formal intermittency expansions and thereby shown how to reconstruct the whole dependence structure of the limit process from the dependence of its joint integral moments on the intermittency parameter. Our solution is a type of exact renormalization because the naive expansion in the joint moments is divergent. Our intermittency expansion bypasses this divergence in a systematic way: it is an expansion involving derivatives of the joint moments at zero intermittency, as opposed to the moments themselves, so that all the expansion coefficients are finite.
The task of computing particular functionals of the joint distribution of multiple increments is left to future research. Any such calculation requires the knowledge of the joint integral moments as a function of the intermittency parameter. The moments are given by a nontrivial extension of the Selberg integral that is derived in this paper. The value of this integral is unknown in general at the present time. Our results are a first step toward addressing the very practical and equally difficult problem of computing the likelihood of future values of the limit lognormal process given an observation of its past, i.e. the problem of multifractal prediction.
References
[1] E. Bacry, J. Delour and J.-F. Muzy, Multifractal random walk, Phys. Rev. E 64 (2001) 026103.
[2] E. Bacry, J. Delour and J.-F. Muzy, Modelling financial time series using multifractal random walks, Phys. A 299 (2001) 84–92.
[3] E. Bacry and J.-F. Muzy, Log-infinitely divisible multifractal random walks, Comm. Math. Phys. 236 (2003) 449–475.
[4] J. Barral and B. B. Mandelbrot, Multifractal products of cylindrical pulses, Probab. Theory Related Fields 124 (2002) 409–430.
[5] I. Benjamini and O. Schramm, KPZ in one dimensional random geometry of multiplicative cascades, Comm. Math. Phys. 289 (2009) 653–662.
[6] B. Duplantier and S. Sheffield, Liouville quantum gravity and KPZ (2008); arXiv:0808.1560v1 [math.PR].
[7] P. J. Forrester and S. O. Warnaar, The importance of the Selberg integral, Bull. Amer. Math. Soc. 45 (2008) 489–534.
[8] Y. V. Fyodorov, P. Le Doussal and A. Rosso, Statistical mechanics of logarithmic REM: Duality, freezing and extreme value statistics of 1/f noises generated by Gaussian free fields, J. Stat. Mech. 2009 (2009) P10005; doi: 10.1088/1742-5468/2009/10/P10005.
[9] J.-P. Kahane, Positive martingales and random measures, Chinese Ann. Math. Ser. A 8(1) (1987) 1–12.
[10] B. B. Mandelbrot, Possible refinement of the log-normal hypothesis concerning the distribution of energy dissipation in intermittent turbulence, in Statistical Models and Turbulence, eds. M. Rosenblatt and C. Van Atta, Lecture Notes in Physics, Vol. 12 (Springer, New York, 1972), pp. 333–351.
[11] B. B. Mandelbrot, Limit lognormal multifractal measures, in Frontiers of Physics: Landau Memorial Conference, eds. E. A. Gotsman et al. (Pergamon, New York, 1990), pp. 309–340.
[12] J.-F. Muzy and E. Bacry, Multifractal stationary random measures and multifractal random walks with log-infinitely divisible scaling laws, Phys. Rev. E 66 (2002) 056121.
[13] D. Ostrovsky, Functional Feynman–Kac equations for limit lognormal multifractals, J. Stat. Phys. 127 (2007) 935–965.
[14] D. Ostrovsky, Intermittency expansions for limit lognormal multifractals, Lett. Math. Phys. 83 (2008) 265–280.
[15] D. Ostrovsky, Mellin transform of the limit lognormal distribution, Comm. Math. Phys. 288 (2009) 287–310.
[16] D. Ostrovsky, On the limit lognormal and other limit log-infinitely divisible laws, J. Stat. Phys. 138 (2010) 890–911.
[17] A. Selberg, Remarks on a multiple integral, Norske Mat. Tidsskr. 26 (1944) 71–78.
[18] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, 4th edn. (Cambridge University Press, London, 1958).
March 23, J070-S0129055X11004254
2011 10:41 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 2 (2011) 155–178 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004254
EIGENFUNCTIONS AT THE THRESHOLD ENERGIES OF MAGNETIC DIRAC OPERATORS
YOSHIMI SAITŌ∗ and TOMIO UMEDA†
∗Department of Mathematics, University of Alabama at Birmingham, Birmingham, AL 35294, USA
†Department of Mathematical Sciences, University of Hyogo, Himeji 671-2201, Japan
∗[email protected]
†[email protected]
Received 1 October 2009
Revised 6 December 2010
Discussed are $\pm m$ modes and $\pm m$ resonances of Dirac operators with vector potentials $H_A = \alpha \cdot (D - A(x)) + m\beta$. Asymptotic limits of $\pm m$ modes at infinity are derived when $|A(x)| \le C\langle x\rangle^{-\rho}$, $\rho > 1$, provided that $H_A$ has $\pm m$ modes. In wider classes of vector potentials, sparseness of the vector potentials which give rise to the $\pm m$ modes of $H_A$ is established. It is proved that no $H_A$ has $\pm m$ resonances if $|A(x)| \le C\langle x\rangle^{-\rho}$, $\rho > 3/2$.
Keywords: Dirac operators; magnetic potentials; threshold energies; threshold resonances; threshold eigenfunctions; zero modes.
Mathematics Subject Classification 2010: 35Q40, 35P99, 81Q10
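1. Introduction
The introduction is devoted to exhibiting our results as well as to reviewing previous contributions in connection with the results in the present paper. This paper is concerned with eigenfunctions and resonances at the threshold energies of Dirac operators with vector potentials
\[
H_A = \alpha \cdot (D - A(x)) + m\beta, \qquad D = \frac{1}{i} \nabla_x, \quad x \in \mathbb{R}^3. \tag{1.1}
\]
Here $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ is the triple of $4 \times 4$ Dirac matrices
\[
\alpha_j = \begin{pmatrix} 0 & \sigma_j\\ \sigma_j & 0 \end{pmatrix} \quad (j = 1, 2, 3)
\]
with the $2 \times 2$ zero matrix $0$ and the triple of $2 \times 2$ Pauli matrices
\[
\sigma_1 = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix},
\]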
and
\[
\beta = \begin{pmatrix} I_2 & 0\\ 0 & -I_2 \end{pmatrix}.
\]
The constant $m$ is assumed to be positive. Throughout the present paper, we assume that each component of the vector potential $A(x) = (A_1(x), A_2(x), A_3(x))$ is a real-valued measurable function. In addition to this, we shall later impose four different sets of assumptions on $A(x)$ under which the operator $-\alpha \cdot A(x)$ is relatively compact with respect to the free Dirac operator $H_0 = \alpha \cdot D + m\beta$. Therefore, under any set of assumptions to be made, the magnetic Dirac operator $H_A$ is a self-adjoint operator in the Hilbert space $[L^2(\mathbb{R}^3)]^4$, and the essential spectrum of $H_A$ is given by the union of the intervals $(-\infty, -m]$ and $[m, +\infty)$:
\[
\sigma_{\mathrm{ess}}(H_A) = (-\infty, -m] \cup [m, +\infty). \tag{1.2}
\]
By the threshold energies of $H_A$, we mean the values $\pm m$, the edges of the essential spectrum $\sigma_{\mathrm{ess}}(H_A)$. We shall see in Secs. 3–5 that the discrete spectrum of $H_A$ in the gap $(-m, m)$ is empty. This fact is well known from the result of Thaller [31, Theorem 7.1, p. 195], where, however, smoothness of the vector potentials is assumed. In other words, there are no isolated eigenvalues with finite multiplicity in the spectral gap $(-m, m)$. In the present paper, this fact will be obtained as a by-product of Theorem 2.3 in Sec. 2, where we shall deal with an abstract Dirac operator, i.e. a supersymmetric Dirac operator. As a result, we shall have $\sigma(H_A) = \sigma_{\mathrm{ess}}(H_A) = (-\infty, -m] \cup [m, +\infty)$ under any set of the assumptions on $A(x)$ of the present paper. In relation with the relative compactness of $-\alpha \cdot A(x)$ with respect to $H_0$, it is worthwhile to mention a work by Thaller [30], where he showed that (1.2) is true under the assumption that $|B(x)| \to 0$ as $|x| \to \infty$. Here $B(x)$ denotes the magnetic field: $B(x) = \nabla \times A(x)$. It is clear that the assumption that $|B(x)| \to 0$ does not necessarily imply the relative compactness of $-\alpha \cdot A(x)$ with respect to $H_0$. Helffer et al. [16] showed that (1.2) is true under much weaker assumptions on $B(x)$, which do not even need the requirement that $|B(x)| \to 0$ as $|x| \to \infty$; see also [31, §7.3.2]. It is generally expected that eigenfunctions corresponding to a discrete eigenvalue of $H_A$ decay exponentially at infinity (describing bound states), and that (generalized) eigenfunctions corresponding to an energy inside the continuous spectrum $(-\infty, -m] \cup [m, +\infty)$ behave like the sum of a plane wave and a spherical wave at infinity (describing scattering states). At the energies $\pm m$, on which we shall focus in the present paper, (generalized) eigenfunctions are expected to behave like $C_0 + C_1 |x|^{-1} + C_2 |x|^{-2}$ at infinity, where $C_j$, $j = 0, 1, 2$, are constant vectors in $\mathbb{C}^4$.
If $C_0 = 0$ and $C_1 \ne 0$, then the (generalized) eigenfunctions become either of
$\pm m$ resonances, and if $C_0 = 0$ and $C_1 = 0$, then the (generalized) eigenfunctions become either of $\pm m$ modes. For the precise definitions of $\pm m$ resonances and $\pm m$ modes, see Definition 6.1 in Sec. 6, and Definition 3.1 in Sec. 3, respectively. As for the exponential decay of eigenfunctions, we refer the reader to works by Helffer and Parisse [17], Wang [32], and a recent work by Yafaev [34]. As for the generalized eigenfunctions corresponding to an energy in $(-\infty, -m) \cup (m, +\infty)$, we refer the reader to Yamada [36]. As mentioned above, our main concern is the threshold energies $\pm m$ of the magnetic Dirac operator $H_A$. These energies are of particular importance and of interest from the physics point of view. We should like to mention Pickl and Dürr [21], and Pickl [20], where they investigate generalized eigenfunctions not only at the energies $\pm m$ but also at the energies near $\pm m$, with the emphasis on the famous relativistic effect of the pair creation of an electron and a positron. It is worthwhile to note that $\pm m$ modes and $\pm m$ resonances play decisive roles in their results. In the same spirit as in [20, 21], Pickl and Dürr [22] mention the possibility of experimental verification of pair creation by combining laser fields and heavy-ion fields. Therefore, results on $\pm m$ modes and $\pm m$ resonances of magnetic Dirac operators are useful for understanding the physics of pair creation in such laser fields; see [22] for details. The goal of the present paper is to derive a series of new results on $\pm m$ modes and $\pm m$ resonances of the magnetic Dirac operators $H_A$. Precisely speaking, we shall study asymptotic behaviors at infinity of the $\pm m$ modes, show sparseness of vector potentials which give rise to the $\pm m$ modes, and establish non-existence of $\pm m$ resonances.
According to Pickl [20, Theorems 3.4 and 3.5], the behavior of the generalized eigenfunctions of Dirac operators near criticality largely depends on whether Dirac operators with critical potentials have $\pm m$ resonances or not. Since the moduli of their critical potentials are less than or equal to $C(1 + |x|)^{-2}$, we can actually conclude from Theorems 6.1 and 6.2 in Sec. 6 that the magnetic Dirac operators with the critical potentials have no $\pm m$ resonances. However, one has to pay attention to a slight difference between our definition of the threshold resonances (cf. Definition 6.1) and theirs (cf. [20, Definition 2.3 and the paragraph after it]). Finally, we would like to mention that there is a striking difference between two- and three-dimensional Dirac operators with magnetic fields at the threshold energies. Compare the results in Sec. 6 of the present paper with those of Aharonov and Casher [1], where their arguments indicate that one can find magnetic Dirac operators in dimension two which possess threshold resonances. See also [33, Sec. 10]. The plan of the paper is as follows. In Sec. 2, we shall prepare a few results on a supersymmetric Dirac operator, which will be used in all the later sections. In Sec. 3, we shall investigate asymptotic behaviors at infinity of $\pm m$ modes of $H_A$, provided that it has $\pm m$ modes. Sparseness of the set of vector potentials $A(x)$ which yield $\pm m$ modes of $H_A$ will be discussed in Secs. 4 and 5 in different regimes. In Sec. 6, we shall prove that no $H_A$ has $\pm m$ resonances under a stronger
assumption than those made in the previous sections. Finally in Sec. 7, we shall give examples of vector potentials $A(x)$ which yield $\pm m$ modes of magnetic Dirac operators $H_A$, and shall show that these operators $H_A$ do not have $\pm m$ resonances. Then we shall propose an open question in relation with $\pm m$ resonances.
2. Supersymmetric Dirac Operators
This section is devoted to a discussion about spectral properties of a class of supersymmetric Dirac operators. We should like to remark that our approach appears to be in the reverse direction in the sense that we start with two Hilbert spaces, and introduce a supersymmetric Dirac operator on the direct sum of the two Hilbert spaces. We find this approach convenient for our purpose; see [31, Chap. 5] for the standard theory of the supersymmetric Dirac operator. The supersymmetric Dirac operator $H$ which we shall consider in the present paper is defined as follows:
\[
H := \begin{pmatrix} 0 & T^*\\ T & 0 \end{pmatrix} + m \begin{pmatrix} I & 0\\ 0 & -I \end{pmatrix} \quad \text{on } \mathcal{K} = \mathcal{H}_+ \oplus \mathcal{H}_-, \tag{2.1}
\]
where $T$ is a densely defined operator from a Hilbert space $\mathcal{H}_+$ to another Hilbert space $\mathcal{H}_-$, $m$ is a positive constant, and the identity operators in $\mathcal{H}_+$ and $\mathcal{H}_-$ are both denoted by $I$ with an abuse of notation. We recall that the domain of $H$ is given by $D(H) = D(T) \oplus D(T^*)$, and that the inner product of the Hilbert space $\mathcal{K}$ is defined by
\[
(f, g)_{\mathcal{K}} := (\varphi^+, \psi^+)_{\mathcal{H}_+} + (\varphi^-, \psi^-)_{\mathcal{H}_-} \tag{2.2}
\]
for
\[
f = \begin{pmatrix} \varphi^+\\ \varphi^- \end{pmatrix}, \quad g = \begin{pmatrix} \psi^+\\ \psi^- \end{pmatrix} \in \mathcal{K}. \tag{2.3}
\]
The first term on the right-hand side of (2.1) is called the supercharge and the second term the involution, and denoted by $Q$ and $\tau$ respectively:
\[
Q = \begin{pmatrix} 0 & T^*\\ T & 0 \end{pmatrix}, \quad \tau = \begin{pmatrix} I & 0\\ 0 & -I \end{pmatrix}. \tag{2.4}
\]
We now state the main results in this section, which are about the nature of eigenvectors of the supersymmetric Dirac operator (2.1) at the eigenvalues $\pm m$. We mention that $T$ does not need to be a closed operator, and that $T^*$ does not need to be densely defined, because we only focus on the eigenvalues $\pm m$ of $H$ and the corresponding eigenspaces. In the standard theory of the supersymmetric Dirac operator, $T$ is assumed to be a densely defined closed operator and $T^*$ needs to be densely defined; cf. [31, §5.2.2]. We should like to draw attention to the fact that Theorems 2.1 and 2.2 below are simply abstract restatements of Thaller [31, Theorem 7.1, p. 195], where he dealt with the magnetic Dirac operators under the assumption that $A_j \in C^\infty$.
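The block structure in (2.1) can be explored in a finite-dimensional toy model, where $T$ is just a real matrix and $T^*$ its transpose. The sketch below is only an illustration under that assumption (the matrix `T`, the value `M`, and the helper names are hypothetical): vectors of the form ${}^t(\varphi^+, 0)$ with $\varphi^+ \in \mathrm{Ker}(T)$ are eigenvectors for $+m$, vectors ${}^t(0, \varphi^-)$ with $\varphi^- \in \mathrm{Ker}(T^*)$ are eigenvectors for $-m$, and the square of the block matrix is block diagonal, reflecting that $Q$ and $\tau$ anticommute.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply(a, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]

def build_h(t, m):
    """Block matrix [[m I, T*], [T, -m I]] for a real matrix T (T* = transpose)."""
    n_minus, n_plus = len(t), len(t[0])
    t_star = transpose(t)
    size = n_plus + n_minus
    h = [[0.0] * size for _ in range(size)]
    for i in range(n_plus):
        h[i][i] = m
        for j in range(n_minus):
            h[i][n_plus + j] = t_star[i][j]
    for i in range(n_minus):
        h[n_plus + i][n_plus + i] = -m
        for j in range(n_plus):
            h[n_plus + i][j] = t[i][j]
    return h

T = [[1.0, -1.0], [2.0, -2.0]]  # hypothetical T: Ker(T) spanned by (1, 1), Ker(T*) by (2, -1)
M = 1.5                          # hypothetical mass m > 0
H = build_h(T, M)
H_SQUARED = matmul(H, H)         # equals diag(T*T + m^2, TT* + m^2)
```

Here `apply(H, [1, 1, 0, 0])` returns `M` times the input and `apply(H, [0, 0, 2, -1])` returns `-M` times the input, while the off-diagonal blocks of `H_SQUARED` vanish.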
From the mathematically rigorous point of view, it is not appropriate to apply [31, Theorem 7.1] to the magnetic Dirac operators with non-smooth vector potentials. However, the vector potentials we shall treat in Secs. 3–5 are not smooth. In particular we shall deal, in Sec. 4, with vector potentials which can have local singularities. In this case, even self-adjointness of the magnetic Dirac operators is not trivial. Hence [31, Theorem 7.1] is not applicable to this case. These are the reasons why we need to generalize and restate [31, Theorem 7.1] in an abstract setting.

Theorem 2.1. Suppose that $T$ is a densely defined operator from $\mathcal{H}_+$ to $\mathcal{H}_-$. Let $H$ be a supersymmetric Dirac operator defined by (2.1).
(i) If $f = {}^t(\varphi^+, \varphi^-) \in \operatorname{Ker}(H - m)$, then $\varphi^+ \in \operatorname{Ker}(T)$ and $\varphi^- = 0$.
(ii) Conversely, if $\varphi^+ \in \operatorname{Ker}(T)$, then $f = {}^t(\varphi^+, 0) \in \operatorname{Ker}(H - m)$.

Theorem 2.2. Assume that $T$ and $H$ are the same as in Theorem 2.1.
(i) If $f = {}^t(\varphi^+, \varphi^-) \in \operatorname{Ker}(H + m)$, then $\varphi^+ = 0$ and $\varphi^- \in \operatorname{Ker}(T^*)$.
(ii) Conversely, if $\varphi^- \in \operatorname{Ker}(T^*)$, then $f = {}^t(0, \varphi^-) \in \operatorname{Ker}(H + m)$.

As immediate consequences, we have

Corollary 2.1. Assume that $T$ and $H$ are the same as in Theorem 2.1. Then
(i) $\operatorname{Ker}(H - m) = \operatorname{Ker}(T) \oplus \{0\}$, $\dim(\operatorname{Ker}(H - m)) = \dim(\operatorname{Ker}(T))$.
(ii) $\operatorname{Ker}(H + m) = \{0\} \oplus \operatorname{Ker}(T^*)$, $\dim(\operatorname{Ker}(H + m)) = \dim(\operatorname{Ker}(T^*))$.

The eigenspaces corresponding to the eigenvalues $\pm m$ of supersymmetric Dirac operators do not seem to have been explicitly formulated in the literature in the form of Corollary 2.1. It follows immediately from this formulation that the eigenspaces of $H$ corresponding to the eigenvalues $\pm m$ are independent of $m$.

Proof of Theorem 2.1. We first prove Assertion (i). Let $f = {}^t(\varphi^+, \varphi^-) \in \operatorname{Ker}(H - m)$. We then have
\[
\left\{ \begin{pmatrix} 0 & T^* \\ T & 0 \end{pmatrix} + m \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \right\} \begin{pmatrix} \varphi^+ \\ \varphi^- \end{pmatrix} = m \begin{pmatrix} \varphi^+ \\ \varphi^- \end{pmatrix}, \tag{2.5}
\]
hence
\[
\begin{cases}
T^* \varphi^- + m\varphi^+ = m\varphi^+ \\
T \varphi^+ - m\varphi^- = m\varphi^-,
\end{cases} \tag{2.6}
\]
which immediately implies that $T^*\varphi^- = 0$ and $T\varphi^+ = 2m\varphi^-$. It follows that
\[
\|T\varphi^+\|^2_{\mathcal{H}_-} = (T\varphi^+, T\varphi^+)_{\mathcal{H}_-} = (T\varphi^+, 2m\varphi^-)_{\mathcal{H}_-} = (\varphi^+, 2mT^*\varphi^-)_{\mathcal{H}_+} = 0. \tag{2.7}
\]
Thus we see that $\varphi^+ \in \operatorname{Ker}(T)$, and that $\varphi^- = (2m)^{-1} T\varphi^+ = 0$.
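The content of Theorem 2.1 can be checked by hand in a finite-dimensional toy model. The following Python sketch is only an illustration under stated assumptions: it takes $\mathcal{H}_+ = \mathcal{H}_- = \mathbb{C}^2$, a hypothetical $2 \times 2$ matrix `T` whose kernel is spanned by $(1, -1)$, and $m = 2$; the helper names `apply_H` and `is_eigenvector` are not from the paper. It verifies that a kernel vector of $T$ gives an eigenvector ${}^t(\varphi^+, 0)$ of $H$ with eigenvalue $m$, while a vector outside the kernel does not.

```python
# Toy model: H_+ = H_- = C^2 and T is a 2x2 matrix (hypothetical example
# with Ker T spanned by (1, -1)); H acts on pairs as in (2.1).

def matvec(A, v):
    """Matrix-vector product for nested lists of complex numbers."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def adjoint(A):
    """Conjugate transpose of a matrix given as nested lists."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def apply_H(T, m, phi_plus, phi_minus):
    """Return (T* phi_minus + m phi_plus, T phi_plus - m phi_minus)."""
    upper = [u + m * p for u, p in zip(matvec(adjoint(T), phi_minus), phi_plus)]
    lower = [t - m * q for t, q in zip(matvec(T, phi_plus), phi_minus)]
    return upper, lower

def is_eigenvector(T, m, phi_plus, phi_minus, eigenvalue, tol=1e-12):
    """Check H f = eigenvalue * f for f = (phi_plus, phi_minus)."""
    upper, lower = apply_H(T, m, phi_plus, phi_minus)
    residual = [u - eigenvalue * p for u, p in zip(upper, phi_plus)]
    residual += [l - eigenvalue * q for l, q in zip(lower, phi_minus)]
    return max(abs(r) for r in residual) < tol

m = 2.0
T = [[1 + 0j, 1 + 0j], [2j, 2j]]
zero = [0j, 0j]

kernel_vector = [1 + 0j, -1 + 0j]   # T kernel_vector = 0
generic_vector = [1 + 0j, 0j]       # T generic_vector != 0

m_mode_found = is_eigenvector(T, m, kernel_vector, zero, m)
generic_is_m_mode = is_eigenvector(T, m, generic_vector, zero, m)
```

In this sketch `m_mode_found` is true and `generic_is_m_mode` is false, in line with the characterization $\operatorname{Ker}(H - m) = \operatorname{Ker}(T) \oplus \{0\}$; a finite-dimensional check of this kind says nothing, of course, about the domain questions that motivate the abstract formulation.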
We next prove Assertion (ii). Let $\varphi^+ \in \operatorname{Ker}(T)$ and put $f := {}^t(\varphi^+, 0)$. Then it follows that $Hf = {}^t(m\varphi^+, T\varphi^+) = m\,{}^t(\varphi^+, 0) = mf$.

We omit the proof of Theorem 2.2, which is quite similar to that of Theorem 2.1.

In connection with our applications to the magnetic Dirac operator $H_A$ in later sections, we should like to consider the case where the Hilbert space $\mathcal{H}_+$ coincides with $\mathcal{H}_-$ and $T$ is self-adjoint ($T^* = T$). In this case, the supersymmetric Dirac operator $H$ takes the form
\[
H = \begin{pmatrix} 0 & T \\ T & 0 \end{pmatrix} + m \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{2.8}
\]
in the Hilbert space $\mathcal{K} = \mathcal{H} \oplus \mathcal{H}$, and it follows from Theorems 2.1 and 2.2 that the operator $H$ of the form (2.8) possesses a simple but important equivalence:
\[
T \text{ has a zero mode} \iff H \text{ has an } m \text{ mode} \iff H \text{ has a } {-m} \text{ mode}, \tag{2.9}
\]
which is actually a well-known fact: see [31, Corollary 5.14, p. 155]. Here we say that $T$ has a zero mode if $0$ is an eigenvalue of $T$. In a similar manner, we say that $H$ has an $m$ mode (respectively, a $-m$ mode) if $m$ (respectively, $-m$) is an eigenvalue of $H$. Furthermore, Theorems 2.1 and 2.2 imply the following equivalence for a zero mode $\varphi$ of $T$:
\[
T\varphi = 0 \iff H \begin{pmatrix} \varphi \\ 0 \end{pmatrix} = m \begin{pmatrix} \varphi \\ 0 \end{pmatrix} \iff H \begin{pmatrix} 0 \\ \varphi \end{pmatrix} = -m \begin{pmatrix} 0 \\ \varphi \end{pmatrix}. \tag{2.10}
\]
We shall show in Theorem 2.3 below that a sufficient condition for the fact that
\[
\sigma(H) = \sigma_{\mathrm{ess}}(H) = (-\infty, -m] \cup [m, \infty) \tag{2.11}
\]
is given by the inclusion $\sigma(T) \supset (0, \infty)$. Therefore $\pm m$ are always threshold energies of the supersymmetric Dirac operator $H$, provided that $\sigma(T) \supset (0, \infty)$.

Theorem 2.3. Let $T$ be a self-adjoint operator in the Hilbert space $\mathcal{H}$. Suppose that $\sigma(T) \supset [0, +\infty)$. Then $\sigma(H) = (-\infty, -m] \cup [m, +\infty)$. In particular, $\sigma_d(H) = \emptyset$, i.e. the set of discrete eigenvalues of $H$ with finite multiplicity is empty.

Proof. It follows from (2.8) that $D(H^2) = D(T^2) \oplus D(T^2)$ and that
\[
H^2 = \begin{pmatrix} T^2 + m^2 I & 0 \\ 0 & T^2 + m^2 I \end{pmatrix} \ge m^2 \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix}.
\]
This inequality implies that
\[
\sigma(H) \subset (-\infty, -m] \cup [m, +\infty). \tag{2.12}
\]
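To complete the proof, we shall prove the fact that $\sigma(H) \supset (-\infty, -m] \cup [m, +\infty)$. To this end, let $\lambda_0 \in (-\infty, -m] \cup [m, +\infty)$ be given. Since $\lambda_0^2 - m^2 \ge 0$, we see, by the assumption of the theorem, that $\sqrt{\lambda_0^2 - m^2} \in \sigma(T)$. Therefore, we can find a sequence $\{\psi_n\}_{n=1}^{\infty} \subset \mathcal{H}$ such that
\[
\|\psi_n\|_{\mathcal{H}} = 1, \quad \psi_n \in \operatorname{Ran}\, E_T\Big(\Big(\nu_0 - \frac{1}{n},\, \nu_0 + \frac{1}{n}\Big)\Big), \quad \nu_0 := \sqrt{\lambda_0^2 - m^2} \tag{2.13}
\]
for each $n$, where $E_T(\cdot)$ is the spectral measure associated with $T$:
\[
T = \int_{-\infty}^{\infty} \lambda \, dE_T(\lambda). \tag{2.14}
\]
Here we have used a basic property of the spectral measure: see, for example [25, Proposition, p. 236]. It is straightforward to see that
\[
\|(T - \nu_0)\psi_n\|_{\mathcal{H}} \to 0 \quad \text{as } n \to \infty. \tag{2.15}
\]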
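We shall construct a sequence $\{f_n\} \subset D(H) = D(T) \oplus D(T)$ satisfying $\|f_n\|_{\mathcal{K}} = 1$ and $\|(H - \lambda_0) f_n\|_{\mathcal{K}} \to 0$ as $n \to \infty$. To this end, we choose a pair of real numbers $a$ and $b$ so that
\[
a^2 + b^2 = 1 \tag{2.16}
\]
and that
\[
\begin{pmatrix} m & \nu_0 \\ \nu_0 & -m \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \lambda_0 \begin{pmatrix} a \\ b \end{pmatrix}. \tag{2.17}
\]
This is possible because the $2 \times 2$ symmetric matrix in (2.17) has eigenvalues $\pm\lambda_0$. We now put
\[
f_n := \begin{pmatrix} a\psi_n \\ b\psi_n \end{pmatrix}. \tag{2.18}
\]
It is easy to see that $\|f_n\|_{\mathcal{K}} = 1$. By using (2.16) and (2.17), we can show that
\[
\begin{aligned}
\|(H - \lambda_0) f_n\|^2_{\mathcal{K}}
&= \|(m - \lambda_0) a\psi_n + bT\psi_n\|^2_{\mathcal{H}} + \|aT\psi_n - (m + \lambda_0) b\psi_n\|^2_{\mathcal{H}} \\
&= \|b(-\nu_0 + T)\psi_n\|^2_{\mathcal{H}} + \|a(T - \nu_0)\psi_n\|^2_{\mathcal{H}} \\
&= \|(T - \nu_0)\psi_n\|^2_{\mathcal{H}} \to 0 \quad \text{as } n \to \infty.
\end{aligned}
\]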
We thus have shown that $\lambda_0 \in \sigma(H)$.

Here we briefly mention the abstract Foldy–Wouthuysen transformation $U_{FW}$ in connection with Theorem 2.3. The transformation $U_{FW}$ is a unitary operator in $\mathcal{K}$, and transforms the supersymmetric Dirac operator $H$ of the form (2.8) into the diagonal form:
\[
U_{FW} H U_{FW}^* = \begin{pmatrix} \sqrt{T^2 + m^2} & 0 \\ 0 & -\sqrt{T^2 + m^2} \end{pmatrix}.
\]
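In the simplest case where $T$ is multiplication by a real number $t$, the operator (2.8) is the $2 \times 2$ matrix with rows $(m, t)$ and $(t, -m)$, i.e. the matrix appearing in (2.17) with $t = \nu_0$, and the diagonal form above says that its eigenvalues are $\pm\sqrt{t^2 + m^2}$. The Python sketch below illustrates this scalar case only. It computes, for a given $\lambda_0$ with $|\lambda_0| \ge m$, the number $\nu_0$ and a normalized pair $(a, b)$ satisfying (2.16) and (2.17); the function name `threshold_coefficients` and the closed formula $(a, b) \propto (\nu_0, \lambda_0 - m)$ are choices made for this illustration and are not taken from the paper.

```python
import math

def threshold_coefficients(lam0, m):
    """Return (nu0, a, b) with a^2 + b^2 = 1 and
    [[m, nu0], [nu0, -m]] (a, b)^T = lam0 (a, b)^T, assuming |lam0| >= m > 0."""
    nu0 = math.sqrt(lam0 * lam0 - m * m)
    # The first row of (2.17) reads (m - lam0) a + nu0 b = 0,
    # which is solved by (a, b) proportional to (nu0, lam0 - m).
    a, b = nu0, lam0 - m
    if a == 0.0 and b == 0.0:
        # Degenerate case lam0 = m (nu0 = 0): the eigenvector is (1, 0).
        a = 1.0
    norm = math.hypot(a, b)
    return nu0, a / norm, b / norm

def residual(lam0, m):
    """Largest entry of [[m, nu0], [nu0, -m]] (a, b)^T - lam0 (a, b)^T."""
    nu0, a, b = threshold_coefficients(lam0, m)
    return max(abs(m * a + nu0 * b - lam0 * a),
               abs(nu0 * a - m * b - lam0 * b))

m = 1.0
nu0, a, b = threshold_coefficients(1.25, m)   # lam0 above the threshold m
worst = max(residual(lam0, m) for lam0 in (1.25, -1.25, 1.0, -1.0, 7.5))
```

For $\lambda_0 = 1.25$ and $m = 1$ the sketch gives $\nu_0 = 0.75$, and the residual of (2.17) vanishes up to rounding for both signs of $\lambda_0$, including the threshold values $\lambda_0 = \pm m$.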
Note that it is possible to prove (2.11) based on this unitary equivalence. For the abstract Foldy–Wouthuysen transformation for the supersymmetric Dirac operator of the form (2.1), we refer the reader to Thaller [31, Chap. 5, §5.6].

In all the later sections, we shall apply the obtained results on the supersymmetric Dirac operator to the magnetic Dirac operator $H_A$ of the form (1.1) in the Hilbert space $\mathcal{K} = [L^2(\mathbb{R}^3)]^4$, where we take $T$ to be the Weyl–Dirac operator
\[
T_A = \sigma \cdot (D - A(x)) \tag{2.19}
\]
acting in the Hilbert space $\mathcal{H} = [L^2(\mathbb{R}^3)]^2$. As was mentioned above (cf. (2.9) and (2.10)), the investigations of properties of $\pm m$ modes of the magnetic Dirac operator $H_A$ are reduced to the investigations of the corresponding properties of zero modes of the Weyl–Dirac operator $T_A = \sigma \cdot (D - A(x))$.

We have to emphasize the broad applicability of the supersymmetric Dirac operator in the context of the present paper. Namely, thanks to the generality of Theorems 2.1 and 2.2, we are able to utilize most of the existing works on the zero modes of the Weyl–Dirac operator $T_A$ (cf. [2–8, 10–14, 19]) for the purpose of investigating $\pm m$ modes of the magnetic Dirac operator $H_A$.

3. Asymptotic Limits of ±m Modes

In this section, we consider a class of magnetic Dirac operators $H_A$ under Assumption(SU) below, and will focus on the asymptotic behaviors at infinity of $\pm m$ modes of $H_A$, assuming that $\pm m$ are the eigenvalues of $H_A$. In Sec. 7, we shall see that there exist infinitely many $A$'s such that the corresponding magnetic Dirac operators $H_A$ have the threshold eigenvalues $\pm m$.

We now introduce the terminology of $\pm m$ modes for the magnetic Dirac operator $H_A$.

Definition 3.1 (Following [18]). By an $m$ mode (respectively, a $-m$ mode), we mean an eigenfunction corresponding to the eigenvalue $m$ (respectively, $-m$) of $H_A$, provided that the threshold energy $m$ (respectively, $-m$) is an eigenvalue of $H_A$.

Assumption(SU). Each element $A_j(x)$ ($j = 1, 2, 3$) of $A(x)$ is a measurable function satisfying
\[
|A_j(x)| \le C\langle x \rangle^{-\rho} \quad (\rho > 1), \tag{3.1}
\]
where $C$ is a positive constant.

It is easy to see that under Assumption(SU) the Dirac operator $H_A$ is a self-adjoint operator in the Hilbert space $\mathcal{K} = [L^2(\mathbb{R}^3)]^4$ with $\operatorname{Dom}(H_A) = [H^1(\mathbb{R}^3)]^4$, where $H^1(\mathbb{R}^3)$ denotes the Sobolev space of order 1. Also it is easy to see that under Assumption(SU) the Weyl–Dirac operator $T_A$ is a self-adjoint operator in the Hilbert space $\mathcal{H} = [L^2(\mathbb{R}^3)]^2$ with $\operatorname{Dom}(T_A) = [H^1(\mathbb{R}^3)]^2$. Since the operator
$-\sigma \cdot A(x)$ is relatively compact with respect to the operator $T_0 := \sigma \cdot D$, and since $\sigma(T_0) = \mathbb{R}$, it follows that $\sigma(T_A) = \mathbb{R}$. Recalling that
\[
H_A = \begin{pmatrix} 0 & T_A \\ T_A & 0 \end{pmatrix} + m \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \tag{3.2}
\]
we can apply Theorem 2.3 to $H_A$, and get $\sigma(H_A) = \sigma_{\mathrm{ess}}(H_A) = (-\infty, -m] \cup [m, +\infty)$. Hence $\pm m$ are the threshold energies of the operator $H_A$. Assuming that $\pm m$ are the eigenvalues of $H_A$, we find that the eigenspaces corresponding to the eigenvalues $\pm m$ of $H_A$ are given as the direct sum of $\operatorname{Ker}(T_A)$ and the zero space $\{0\}$ (cf. Corollary 2.1 in Sec. 2), and that these two eigenspaces themselves as well as their dimensions are independent of $m$.

Theorem 3.1. Suppose that Assumption(SU) is verified, and that $m$ (respectively, $-m$) is an eigenvalue of $H_A$. Let $f$ be an $m$ mode (respectively, a $-m$ mode) of $H_A$. Then there exists a zero mode $\varphi^+$ (respectively, $\varphi^-$) of $T_A$ such that for any $\omega \in S^2$
\[
\lim_{r \to \infty} r^2 f(r\omega) = \begin{pmatrix} u^+(\omega) \\ 0 \end{pmatrix} \quad \left(\text{respectively, } \begin{pmatrix} 0 \\ u^-(\omega) \end{pmatrix}\right), \tag{3.3}
\]
where
\[
u^{\pm}(\omega) = \frac{i}{4\pi} \int_{\mathbb{R}^3} \{(\omega \cdot A(y)) I_2 + i\sigma \cdot (\omega \times A(y))\}\, \varphi^{\pm}(y)\, dy, \tag{3.4}
\]
and the convergence is uniform with respect to $\omega$.

Theorem 3.1 is a direct consequence of Corollary 2.1, together with [27, Theorem 1.2]. Note that under Assumption(SU) every eigenfunction of $H_A$ corresponding to either one of the eigenvalues $\pm m$ is a continuous function of $x$ (cf. [28, Theorem 2.1]), therefore the expression $f(r\omega)$ in (3.3) makes sense for each $\omega$.

4. Sparseness of Vector Potentials Yielding ±m Modes

In this section, we shall discuss the sparseness of the set of vector potentials $A$ which give rise to $\pm m$ modes of magnetic Dirac operators $H_A$, in the spirit of Balinsky and Evans [5, 6], where they investigated Pauli operators and Weyl–Dirac operators respectively.

We shall make the following assumption:

Assumption(BE). $A_j \in L^3(\mathbb{R}^3)$ for $j = 1, 2, 3$.

Under Assumption(BE) Balinsky and Evans [6, Lemma 2] showed that $-\sigma \cdot A$ is infinitesimally small with respect to $T_0 = \sigma \cdot D$ with $\operatorname{Dom}(T_0) = [H^1(\mathbb{R}^3)]^2$ (see (4.5) below). This fact enables us to define the self-adjoint realization $T_A$ in the Hilbert space $\mathcal{H} = [L^2(\mathbb{R}^3)]^2$ as the operator sum of $T_0$ and $-\sigma \cdot A$, thus $\operatorname{Dom}(T_A) = [H^1(\mathbb{R}^3)]^2$. It turns out that under Assumption(BE), $-\alpha \cdot A$ is infinitesimally small
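with respect to $H_0 := \alpha \cdot D + m\beta$, and hence we can define the self-adjoint realization $H_A$ in the Hilbert space $\mathcal{K} = [L^2(\mathbb{R}^3)]^4$ as the operator sum of $H_0$ and $-\alpha \cdot A$, thus $\operatorname{Dom}(H_A) = [H^1(\mathbb{R}^3)]^4$. Therefore we can regard $H_A$ as a supersymmetric Dirac operator, and shall apply the results in Sec. 2 to $H_A$. (Recall (3.2).)

Proposition 4.1. Let Assumption(BE) be satisfied. Then $\sigma(T_A) = \mathbb{R}$.

We shall prepare a few lemmas for the proof of Proposition 4.1.

Lemma 4.1. Let $z \in \mathbb{C} \setminus \mathbb{R}$. Then $\langle D \rangle^{1/2} (T_0 - z)^{-1}$ is a bounded operator in $\mathcal{H}$. Moreover we have
\[
\operatorname{Ran}(\langle D \rangle^{1/2} (T_0 - z)^{-1}) \subset [H^{1/2}(\mathbb{R}^3)]^2. \tag{4.1}
\]

Proof. It is sufficient to show the conclusions of the lemma for $z = -i$. Let $\varphi \in \operatorname{Dom}(T_0)$. Then we have
\[
\begin{aligned}
\|(T_0 + i)\varphi\|^2_{\mathcal{H}}
&= \int_{\mathbb{R}^3} |((\sigma \cdot \xi) + iI_2)\hat{\varphi}(\xi)|^2_{\mathbb{C}^2}\, d\xi \\
&= \int_{\mathbb{R}^3} (|\xi|^2 + 1)|\hat{\varphi}(\xi)|^2_{\mathbb{C}^2}\, d\xi \\
&= \|\langle D \rangle \varphi\|^2_{\mathcal{H}},
\end{aligned} \tag{4.2}
\]
where we have used the anti-commutation relation $\sigma_j \sigma_k + \sigma_k \sigma_j = 2\delta_{jk} I_2$ in the second equality. It follows from (4.2) that
\[
\|\varphi\|_{\mathcal{H}} = \|\langle D \rangle (T_0 + i)^{-1} \varphi\|_{\mathcal{H}} \tag{4.3}
\]
for all $\varphi \in \mathcal{H}$. Furthermore, we see that
\[
\|\langle D \rangle^{1/2} (T_0 + i)^{-1} \varphi\|_{\mathcal{H}} \le \|\langle D \rangle^{1/2} (T_0 + i)^{-1} \varphi\|_{[H^{1/2}(\mathbb{R}^3)]^2} = \|\langle D \rangle (T_0 + i)^{-1} \varphi\|_{\mathcal{H}} = \|\varphi\|_{\mathcal{H}}. \tag{4.4}
\]
It is evident that (4.4) proves the conclusions of the lemma for $z = -i$.

Lemma 4.2. If $\varphi \in [H^{1/2}(\mathbb{R}^3)]^2$, then $(\sigma \cdot A)\langle D \rangle^{-1/2} \varphi \in \mathcal{H}$.

Proof. By [6, Lemma 2], we see that for any $\varepsilon > 0$, there exists a constant $k_\varepsilon > 0$ such that for all $\varphi \in \operatorname{Dom}(T_0)$
\[
\|(\sigma \cdot A)\varphi\|_{\mathcal{H}} \le \varepsilon \|T_0 \varphi\|_{\mathcal{H}} + k_\varepsilon \|\varphi\|_{\mathcal{H}}. \tag{4.5}
\]
By virtue of the fact that $\langle D \rangle^{-1/2} \varphi \in \operatorname{Dom}(T_0)$ for $\varphi \in [H^{1/2}(\mathbb{R}^3)]^2$, it follows from (4.5) that
\[
\begin{aligned}
\|(\sigma \cdot A)\langle D \rangle^{-1/2} \varphi\|_{\mathcal{H}}
&\le \varepsilon \|(T_0 + i)\langle D \rangle^{-1/2} \varphi\|_{\mathcal{H}} + k_\varepsilon \|\langle D \rangle^{-1/2} \varphi\|_{\mathcal{H}} \\
&\le \varepsilon \|\langle D \rangle^{1/2} \varphi\|_{\mathcal{H}} + k_\varepsilon \|\varphi\|_{\mathcal{H}} < +\infty,
\end{aligned}
\]
where we have used (4.2) and the fact that $\|\langle D \rangle^{1/2} \varphi\|_{\mathcal{H}} = \|\varphi\|_{[H^{1/2}(\mathbb{R}^3)]^2}$.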
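The second equality in (4.2), used in the proofs of Lemmas 4.1 and 4.2, rests on the pointwise identity $|((\sigma \cdot \xi) + iI_2)v|^2_{\mathbb{C}^2} = (|\xi|^2 + 1)|v|^2_{\mathbb{C}^2}$ for $\xi \in \mathbb{R}^3$ and $v \in \mathbb{C}^2$. The following Python sketch checks this identity numerically for one sample pair $(\xi, v)$. It assumes the standard representation of the Pauli matrices; the function names `pauli_dot` and `symbol_norm_sq` and the sample values are chosen for this illustration only.

```python
def pauli_dot(xi):
    """Return the 2x2 matrix sigma . xi for real xi = (x1, x2, x3),
    using the standard Pauli matrices."""
    x1, x2, x3 = xi
    return [[complex(x3, 0.0), complex(x1, -x2)],
            [complex(x1, x2), complex(-x3, 0.0)]]

def symbol_norm_sq(xi, v):
    """Squared C^2 norm of ((sigma . xi) + i I_2) v."""
    s = pauli_dot(xi)
    w = [(s[0][0] + 1j) * v[0] + s[0][1] * v[1],
         s[1][0] * v[0] + (s[1][1] + 1j) * v[1]]
    return sum(abs(c) ** 2 for c in w)

xi = (0.3, -1.2, 2.0)             # sample frequency vector
v = (1 + 2j, -0.5 + 0.25j)        # sample spinor value

lhs = symbol_norm_sq(xi, v)
rhs = (sum(t * t for t in xi) + 1.0) * sum(abs(c) ** 2 for c in v)
identity_gap = abs(lhs - rhs)
```

Here `identity_gap` is zero up to rounding error, reflecting that $(\sigma \cdot \xi)^2 = |\xi|^2 I_2$ and that $\sigma \cdot \xi$ is Hermitian, so the cross terms cancel.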
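Lemma 4.3. $\langle D \rangle^{-1} (\sigma \cdot A) \langle D \rangle^{-1/2}$ is a compact operator in $\mathcal{H}$.

Proof. One can make a factorization
\[
\langle D \rangle^{-1} (\sigma \cdot A) \langle D \rangle^{-1/2} = \frac{|D|^{1/2}}{\langle D \rangle} \cdot \frac{1}{|D|^{1/2}} (\sigma \cdot A) \frac{1}{|D|^{1/2}} \cdot \frac{|D|^{1/2}}{\langle D \rangle^{1/2}}. \tag{4.6}
\]
It is obvious that the first term and the last term on the right-hand side of (4.6) are bounded operators in $\mathcal{H}$. Then it follows from (4.6) and [6, Lemma 1] that the conclusion of the lemma holds true.

Lemma 4.4. Let $z \in \mathbb{C} \setminus \mathbb{R}$. Then $(T_A - z)^{-1} \langle D \rangle |_{[H^1(\mathbb{R}^3)]^2}$ can be extended to a bounded operator $\tilde{R}_A(z)$ in $\mathcal{H}$. Moreover
\[
(T_A - z)^{-1} \varphi = \tilde{R}_A(z) \langle D \rangle^{-1} \varphi \quad \text{for all } \varphi \in \mathcal{H}. \tag{4.7}
\]

Proof. Throughout this proof we write $T$ for $T_A$. We first show that $\langle D \rangle (T - z)^{-1}$ is a closed operator in $\mathcal{H}$. To this end, suppose that $\{\varphi_j\}$ is a sequence in $\mathcal{H}$ such that $\varphi_j \to 0$ in $\mathcal{H}$ and $\langle D \rangle (T - z)^{-1} \varphi_j \to \psi$ in $\mathcal{H}$ as $j \to \infty$. Then $\{(T - z)^{-1} \varphi_j\}$ is a Cauchy sequence in $[H^1(\mathbb{R}^3)]^2$, hence there exists a $\tilde{\psi} \in [H^1(\mathbb{R}^3)]^2$ such that
\[
(T - z)^{-1} \varphi_j \to \tilde{\psi} \ \text{ in } [H^1(\mathbb{R}^3)]^2 \quad \text{as } j \to \infty. \tag{4.8}
\]
Since the topology of $[H^1(\mathbb{R}^3)]^2$ is stronger than that of $\mathcal{H}$, (4.8) implies that
\[
(T - z)^{-1} \varphi_j \to \tilde{\psi} \ \text{ in } \mathcal{H} \quad \text{as } j \to \infty. \tag{4.9}
\]
On the other hand, since $\varphi_j \to 0$ in $\mathcal{H}$, and since $(T - z)^{-1}$ is a bounded operator in $\mathcal{H}$, we see that
\[
(T - z)^{-1} \varphi_j \to 0 \ \text{ in } \mathcal{H} \tag{4.10}
\]
as $j \to \infty$. Combining (4.9) and (4.10), we see that $\tilde{\psi} = 0$. This fact, together with (4.8), implies that $\langle D \rangle (T - z)^{-1} \varphi_j \to 0$ in $\mathcal{H}$ as $j \to \infty$. Hence $\psi = 0$. We have thus shown that $\langle D \rangle (T - z)^{-1}$ is a closed operator. Noting that $\operatorname{Dom}(\langle D \rangle (T - z)^{-1}) = \mathcal{H}$, we can conclude from the Banach closed graph theorem that $\langle D \rangle (T - z)^{-1}$ is a bounded operator in $\mathcal{H}$, which will be denoted by $Q_A(z)$.

We now put $\tilde{R}_A(z) := Q_A(z)^*$, where $Q_A(z)^*$ denotes the adjoint operator of $Q_A(z)$. Then for any $\varphi \in \mathcal{H}$ and any $\psi \in [H^1(\mathbb{R}^3)]^2$, we have
\[
\begin{aligned}
(\varphi, \tilde{R}_A(z)\psi)_{\mathcal{H}} &= (Q_A(z)\varphi, \psi)_{\mathcal{H}} \\
&= (\langle D \rangle (T - z)^{-1} \varphi, \psi)_{\mathcal{H}} \\
&= (\varphi, (T - z)^{-1} \langle D \rangle \psi)_{\mathcal{H}}.
\end{aligned} \tag{4.11}
\]
It follows from (4.11) that
\[
\tilde{R}_A(z)\psi = (T - z)^{-1} \langle D \rangle \psi \tag{4.12}
\]
for all $\psi \in [H^1(\mathbb{R}^3)]^2$. Replacing $\psi$ in (4.12) with $\langle D \rangle^{-1} \varphi$, $\varphi \in \mathcal{H}$, we get (4.7).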
Proof of Proposition 4.1. Since $\sigma(T_0) = \sigma_{\mathrm{ess}}(T_0) = \mathbb{R}$, it is sufficient to show that
\[
\sigma_{\mathrm{ess}}(T_A) = \sigma_{\mathrm{ess}}(T_0). \tag{4.13}
\]
To this end, we shall prove that the difference $(T_A + i)^{-1} - (T_0 + i)^{-1}$ is a compact operator in $\mathcal{H}$. Then, this fact implies (4.13); see [26, Corollary 1, p. 113]. We see that
\[
\begin{aligned}
(T_A + i)^{-1} - (T_0 + i)^{-1} &= (T_A + i)^{-1} (\sigma \cdot A)(T_0 + i)^{-1} \\
&= \tilde{R}_A(-i) \{\langle D \rangle^{-1} (\sigma \cdot A) \langle D \rangle^{-1/2}\} \{\langle D \rangle^{1/2} (T_0 + i)^{-1}\},
\end{aligned} \tag{4.14}
\]
where we have used Lemma 4.4 in (4.14). It follows from Lemmas 4.1–4.4 that (4.14) makes sense as a product of three bounded operators in $\mathcal{H}$ and that the product is a compact operator in $\mathcal{H}$.

Proposition 4.1, together with Theorem 2.3, gives the following result on the spectrum of the magnetic Dirac operator $H_A$.

Theorem 4.1. Let Assumption(BE) be satisfied. Then $\sigma(H_A) = \sigma_{\mathrm{ess}}(H_A) = (-\infty, -m] \cup [m, \infty)$.

We now state the main results in this section, which are concerned with the eigenspaces corresponding to the threshold eigenvalues of the magnetic Dirac operator $H_A$.

Theorem 4.2. Let Assumption(BE) be satisfied. Then
(i) $\operatorname{Ker}(H_A - m)$ is non-trivial if and only if $\operatorname{Ker}(H_A + m)$ is non-trivial; in other words, $\{A \in [L^3(\mathbb{R}^3)]^3 \mid \operatorname{Ker}(H_A - m) \ne \{0\}\} = \{A \in [L^3(\mathbb{R}^3)]^3 \mid \operatorname{Ker}(H_A + m) \ne \{0\}\}$.
(ii) There exists a constant $c$ such that
\[
\dim(\operatorname{Ker}(H_A - m)) = \dim(\operatorname{Ker}(H_A + m)) \le c \int_{\mathbb{R}^3} |A(x)|^3\, dx. \tag{4.15}
\]
Moreover, the dimension of $\operatorname{Ker}(H_A \mp m)$ is independent of $m$.
(iii) The set $\{A \in [L^3(\mathbb{R}^3)]^3 \mid \operatorname{Ker}(H_A \mp m) = \{0\}\}$ contains an open dense subset of $[L^3(\mathbb{R}^3)]^3$.

Proof. By Corollary 2.1, we see that
\[
\operatorname{Ker}(T_A) \text{ is trivial} \iff \operatorname{Ker}(H_A - m) \text{ is trivial} \iff \operatorname{Ker}(H_A + m) \text{ is trivial}. \tag{4.16}
\]
Assertion (i) is equivalent to (4.16). Assertion (ii) follows from Corollary 2.1 and [6, Theorem 3]. Assertion (iii) follows from Corollary 2.1 and [6, Theorem 2].
Remark 4.1. Assertions (i) and (ii) of Theorem 4.2 mean the following facts: The threshold energy $m$ is an eigenvalue of $H_A$ if and only if the threshold energy $-m$ is an eigenvalue of $H_A$. If this is the case, their multiplicities are the same.

Remark 4.2. As for the best constant in the inequality (4.15), see [6, Theorem 3].

5. The Structure of the Set of Vector Potentials Yielding ±m Modes

In this section, we shall discuss a property of non-locality of magnetic vector potentials as well as the sparseness of the set of vector potentials $A$ which give rise to $\pm m$ modes of $H_A$ in the spirit of Elton [11], where he investigated Weyl–Dirac operators.

We make the following assumption:

Assumption(E). Each $A_j$ ($j = 1, 2, 3$) is a real-valued continuous function such that $A_j(x) = o(|x|^{-1})$ as $|x| \to \infty$.

It is straightforward to see that under Assumption(E), $-\sigma \cdot A$ is a bounded self-adjoint operator in the Hilbert space $\mathcal{H} = [L^2(\mathbb{R}^3)]^2$. Hence we can define the self-adjoint realization $T_A$ with $\operatorname{Dom}(T_A) = [H^1(\mathbb{R}^3)]^2$ as the operator sum of $T_0$ and $-\sigma \cdot A$. Also, it is straightforward to see that $-\alpha \cdot A$ is a bounded self-adjoint operator in the Hilbert space $\mathcal{K} = [L^2(\mathbb{R}^3)]^4$, hence we can define the self-adjoint realization $H_A$ with $\operatorname{Dom}(H_A) = [H^1(\mathbb{R}^3)]^4$ in $\mathcal{K}$ as the operator sum of $H_0$ and $-\alpha \cdot A$. Therefore, in the same way as in Sec. 4, we can regard $H_A$ as a supersymmetric Dirac operator, and apply the results in Sec. 2 to $H_A$.

We note that under Assumption(E), $(-\sigma \cdot A)(T_0 + i)^{-1}$ is a compact operator in $\mathcal{H}$. Hence, in the same way as in the proof of Proposition 4.1, we can show that $\sigma(T_A) = \mathbb{R}$. This fact, together with Theorem 2.3, implies the following result.

Theorem 5.1. Let Assumption(E) be satisfied. Then $\sigma(H_A) = \sigma_{\mathrm{ess}}(H_A) = (-\infty, -m] \cup [m, \infty)$.

To state the main results in this section, we need to introduce the following notation:
\[
\mathcal{A} := \{A \mid A \text{ satisfies Assumption(E)}\}. \tag{5.1}
\]
We regard $\mathcal{A}$ as a Banach space with the norm
\[
\|A\|_{\mathcal{A}} = \sup_x \{\langle x \rangle |A(x)|\}.
\]

Theorem 5.2. Let Assumption(E) be satisfied. Define $Z_k^{\pm} = \{A \in \mathcal{A} \mid \dim(\operatorname{Ker}(H_A \mp m)) = k\}$ for $k = 0, 1, 2, \ldots$. Then
(i) $Z_k^+ = Z_k^-$ for all $k$.
(ii) $Z_0^{\pm}$ is an open and dense subset of $\mathcal{A}$.
(iii) For any $k$ and any open subset $\Omega$ ($\ne \emptyset$) of $\mathbb{R}^3$ there exists an $A \in Z_k^{\pm}$ such that $A \in [C_0^\infty(\Omega)]^3$.

Proof. Assertion (i) is a direct consequence of Corollary 2.1. Assertions (ii) and (iii) follow from Corollary 2.1 and [11, Theorem 1].

It is of some interest to point out a conclusion following from Theorem 3.1 and Assertion (iii) of Theorem 5.2. Namely, there are (at least) a countably infinite number of vector potentials $A$ with compact support such that the corresponding Dirac operators $H_A$ have $\pm m$ modes $f^{\pm}$ with the property (3.4). The $\pm m$ modes $f^{\pm}$ behave like $|f^{\pm}(x)| \asymp |x|^{-2}$ for $|x| \to \infty$, in spite of the fact that the vector potentials and the corresponding magnetic fields vanish outside bounded regions. It is obvious that this phenomenon describes a certain kind of non-locality.

Also, it is of some interest to mention that $H_A$ does not have $\pm m$ resonances if the support of the vector potential $A$ is compact. This is an immediate consequence of Theorem 6.1 in the next section.

6. Non-Existence of ±m Resonances

In this section, we will work in bigger Hilbert spaces than $\mathcal{H} = [L^2(\mathbb{R}^3)]^2$ and $\mathcal{K} = [L^2(\mathbb{R}^3)]^4$. Therefore, the results on the supersymmetric Dirac operators in Sec. 2 are not applicable in this section.

In this section, we shall occasionally write the inner product of $\mathcal{H}$ as
\[
(\varphi, \psi)_{\mathcal{H}} = \int_{\mathbb{R}^3} (\varphi(x), \psi(x))_{\mathbb{C}^2}\, dx
\]
for $\varphi, \psi \in \mathcal{H}$, where $(\cdot, \cdot)_{\mathbb{C}^2}$ denotes the inner product of $\mathbb{C}^2$.

We need to introduce weighted $L^2$ spaces in order to deal with $\pm m$ resonances, which do not belong to the Hilbert space $\mathcal{K}$. By $L^{2,s}(\mathbb{R}^3)$, we mean the weighted $L^2$ space defined by
\[
L^{2,s}(\mathbb{R}^3) := \{u \mid \langle x \rangle^s u \in L^2(\mathbb{R}^3)\} \quad (s \in \mathbb{R}),
\]
where $\langle x \rangle = \sqrt{1 + |x|^2}$, and we set $\mathcal{L}^{2,s} = [L^{2,s}(\mathbb{R}^3)]^4$.

Definition 6.1. By an $m$ resonance (respectively, a $-m$ resonance), we mean a function $f \in \mathcal{L}^{2,-s} \setminus \mathcal{K}$, $0 < s \le 3/2$, such that $H_A f = mf$ (respectively, $H_A f = -mf$) in the distributional sense.

We would like to caution that in Definition 6.1 one has to understand $H_A f = \pm mf$ in the distributional sense, because $\pm m$ resonances do not belong to the Hilbert space $\mathcal{K}$, hence do not belong to the domain of the self-adjoint realization of $H_A$. For this reason, we let $H_A$ stand for the formal differential
operator throughout this section, in spite of the fact that $H_A$ has the unique self-adjoint realization in $\mathcal{K}$ under the assumption of Theorem 6.1 below. We hope this will not cause any confusion.

Theorem 6.1. Assume that each element $A_j(x)$ ($j = 1, 2, 3$) of $A(x)$ is a measurable function satisfying
\[
|A_j(x)| \le C\langle x \rangle^{-\rho} \quad (\rho > 3/2), \tag{6.1}
\]
where $C$ is a positive constant. Suppose that $f = {}^t(\varphi^+, \varphi^-)$ belongs to $\mathcal{L}^{2,-s}$ for some $s$ with $0 < s < \min(1, \rho - 1)$ and satisfies $H_A f = mf$ (respectively, $H_A f = -mf$) in the distributional sense. Then $f \in [H^1(\mathbb{R}^3)]^4$ and $\varphi^- = 0$ (respectively, $\varphi^+ = 0$).

Theorem 6.1 implies the non-existence of $\pm m$ resonances in the sense of Definition 6.1, as well as in the sense described in the following theorem.

Theorem 6.2. Let $A(x)$ satisfy the same assumption as in Theorem 6.1. Suppose that $f$ belongs to $[L^2_{\mathrm{loc}}(\mathbb{R}^3)]^4$ and satisfies either of the equations $H_A f = \pm mf$ in the distributional sense. In addition, suppose that $f$ has the asymptotic expansion
\[
f(x) = C_1 |x|^{-1} + C_2 |x|^{-2} + o(|x|^{-2}) \tag{6.2}
\]
as $|x| \to \infty$, where $C_1$ and $C_2$ are constant vectors in $\mathbb{C}^4$. Then $C_1 = 0$.

Proof. It follows from (6.2) that $f \in \mathcal{L}^{2,-s}$ for any $s$ with $1/2 < s < 1$. This fact, together with the assumptions of the theorem, enables us to apply Theorem 6.1 and to conclude that $f \in [H^1(\mathbb{R}^3)]^4$. In particular, $f \in \mathcal{K}$, which leads to the fact that $C_1 = 0$.

We shall give a proof of Theorem 6.1 only for $m$ resonances, since the proof for $-m$ resonances is similar. Roughly speaking, we will mimic the idea of the proof of Assertion (i) of Theorem 2.1. Therefore we need the Weyl–Dirac operator $T_A = \sigma \cdot (D - A(x))$ again. However, we are not allowed to use the Weyl–Dirac operator as a self-adjoint operator in the Hilbert space $\mathcal{H}$, but only allowed to use it as a formal differential operator instead. This is because $\pm m$ resonances do not belong to $\mathcal{K}$. This fact causes complications, to a certain extent, in the proof of Theorem 6.1.

We begin the proof of Theorem 6.1 with a lemma whose proof will be given after the proof of the theorem. The proof of the lemma is lengthy.

Lemma 6.1. Under the hypotheses of Theorem 6.1, $\varphi^{\pm}$ have the following properties:
(i) $(\sigma \cdot D)\varphi^+ \in \mathcal{H}$, $(\sigma \cdot A)\varphi^+ \in \mathcal{H}$ and $\varphi^- \in [H^1(\mathbb{R}^3)]^2$.
(ii) $((\sigma \cdot D)\varphi^+, \varphi^-)_{\mathcal{H}} = ((\sigma \cdot A)\varphi^+, \varphi^-)_{\mathcal{H}}$.
Proof of Theorem 6.1. Let $f$ satisfy $H_A f = mf$ in the distributional sense. Then we have
\[
\begin{cases}
m\varphi^+ + \sigma \cdot (D - A(x))\varphi^- = m\varphi^+ \\
\sigma \cdot (D - A(x))\varphi^+ - m\varphi^- = m\varphi^-
\end{cases} \tag{6.3}
\]
in the distributional sense, which immediately implies
\[
\sigma \cdot (D - A(x))\varphi^- = 0 \tag{6.4}
\]
and
\[
\sigma \cdot (D - A(x))\varphi^+ = 2m\varphi^-. \tag{6.5}
\]
In view of Lemma 6.1, it follows from (6.5) that
\[
\begin{aligned}
4m^2 \|\varphi^-\|^2_{\mathcal{H}} &= 2m(2m\varphi^-, \varphi^-)_{\mathcal{H}} \\
&= 2m(\sigma \cdot (D - A)\varphi^+, \varphi^-)_{\mathcal{H}} \\
&= 2m\{((\sigma \cdot D)\varphi^+, \varphi^-)_{\mathcal{H}} - ((\sigma \cdot A)\varphi^+, \varphi^-)_{\mathcal{H}}\} \\
&= 0.
\end{aligned} \tag{6.6}
\]
Hence $\varphi^- = 0$. This fact, together with (6.5), means that
\[
\sigma \cdot (D - A(x))\varphi^+ = 0 \tag{6.7}
\]
in the distributional sense. It follows from [28, Theorem 2.2] that $\varphi^+ \in [H^1(\mathbb{R}^3)]^2$. (Note that the hypothesis $0 < s < \min(1, \rho - 1)$ is stronger than the one imposed in [28, Theorem 2.2].) This implies that $f \in [H^1(\mathbb{R}^3)]^4$, because $\varphi^- = 0$ as was shown above.

Before proving Lemma 6.1, we should like to remark that (6.4) and (6.5) follow directly from the hypothesis that $H_A f = mf$ in the distributional sense. Therefore we are allowed to use (6.4) and (6.5) in the proof of Lemma 6.1 below.

Proof of Lemma 6.1. Since $\rho - s > 1$ by assumption, we see that
\[
(\sigma \cdot A)\varphi^+ \in [L^{2,\rho - s}(\mathbb{R}^3)]^2 \subset \mathcal{H}. \tag{6.8}
\]
It follows from (6.4) and [28, Theorem 2.2] that $\varphi^- \in [H^1(\mathbb{R}^3)]^2$. This fact, together with (6.5) and (6.8), implies that $(\sigma \cdot D)\varphi^+ = 2m\varphi^- + (\sigma \cdot A)\varphi^+ \in \mathcal{H}$. Thus Assertion (i) is proved.

In order to prove Assertion (ii), we need to introduce a cutoff function. Let $\chi$ be a function in $C^\infty(\mathbb{R})$ such that $0 \le \chi \le 1$, $\chi(r) = 1$ ($r \le 1$), and $\chi(r) = 0$ ($r \ge 2$). Set
\[
\chi_n(x) = \chi(|x|/n) \quad (n = 1, 2, 3, \ldots). \tag{6.9}
\]
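It is evident that
\[
((\sigma \cdot D)\varphi^+, \varphi^-)_{\mathcal{H}} = \lim_{n \to \infty} ((\sigma \cdot D)\varphi^+, \chi_n \varphi^-)_{\mathcal{H}}. \tag{6.10}
\]
Let $\{j_\varepsilon\}_{0 < \varepsilon < 1}$ be Friedrichs' mollifier, i.e. $j_\varepsilon(x) := \varepsilon^{-3} j(x/\varepsilon)$, where $j \in C_0^\infty(\mathbb{R}^3)$ and $\|j\|_{L^1} = 1$. Since $\chi_n \varphi^- \in \mathcal{H}$, we see that $j_\varepsilon * (\chi_n \varphi^-)$ converges to $\chi_n \varphi^-$ in $\mathcal{H}$ as $\varepsilon \downarrow 0$. Hence, for each $n$, we have
\[
((\sigma \cdot D)\varphi^+, \chi_n \varphi^-)_{\mathcal{H}} = \lim_{\varepsilon \downarrow 0} ((\sigma \cdot D)\varphi^+, j_\varepsilon * (\chi_n \varphi^-))_{\mathcal{H}}. \tag{6.11}
\]
It is straightforward that $j_\varepsilon * (\chi_n \varphi^-) \in [C_0^\infty(\mathbb{R}^3)]^2$ and that
\[
\operatorname{supp}[j_\varepsilon * (\chi_n \varphi^-)] \subset \{x \mid |x| \le 2n + 1\}. \tag{6.12}
\]
Appealing to the definition of the distributional derivatives, we get
\[
((\sigma \cdot D)\varphi^+, j_\varepsilon * (\chi_n \varphi^-))_{\mathcal{H}} = \int_{\mathbb{R}^3} (\varphi^+(x), (\sigma \cdot D)(j_\varepsilon * (\chi_n \varphi^-))(x))_{\mathbb{C}^2}\, dx. \tag{6.13}
\]
For each $n$ and $\varepsilon$, we have
\[
\begin{aligned}
(\sigma \cdot D)(j_\varepsilon * (\chi_n \varphi^-))(x)
&= \int_{\mathbb{R}^3} (\sigma \cdot D_x)(j_\varepsilon(x - y))\, \chi_n(y)\varphi^-(y)\, dy \\
&= \int_{\mathbb{R}^3} -(\sigma \cdot D_y)(j_\varepsilon(x - y))\, \chi_n(y)\varphi^-(y)\, dy \\
&= \int_{\mathbb{R}^3} j_\varepsilon(x - y)(\sigma \cdot D_y)(\chi_n(y)\varphi^-(y))\, dy \\
&= j_\varepsilon * \{(\sigma \cdot D)(\chi_n \varphi^-)\}(x).
\end{aligned} \tag{6.14}
\]
In the third equality of (6.14), we have regarded $j_\varepsilon(x - \cdot)$ as a function in $C_0^\infty(\mathbb{R}^3_y)$ and have appealed to the definition of the distributional derivatives with respect to the $y$ variable. Note that
\[
(\sigma \cdot D)(\chi_n \varphi^-) = \{(\sigma \cdot D)\chi_n\}\varphi^- + \chi_n (\sigma \cdot D)\varphi^-. \tag{6.15}
\]
Combining (6.13)–(6.15), we obtain
\[
\begin{aligned}
((\sigma \cdot D)\varphi^+, j_\varepsilon * (\chi_n \varphi^-))_{\mathcal{H}}
&= \int_{\mathbb{R}^3} (\varphi^+(x), j_\varepsilon * [\{(\sigma \cdot D)\chi_n\}\varphi^-](x))_{\mathbb{C}^2}\, dx \\
&\quad + \int_{\mathbb{R}^3} (\varphi^+(x), j_\varepsilon * [\chi_n (\sigma \cdot D)\varphi^-](x))_{\mathbb{C}^2}\, dx.
\end{aligned} \tag{6.16}
\]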
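We examine the limit of each integral on the right-hand side of (6.16) as $\varepsilon \downarrow 0$. As for the first integral, we have
\[
\begin{aligned}
&\left| \int_{\mathbb{R}^3} (\varphi^+(x), j_\varepsilon * [\{(\sigma \cdot D)\chi_n\}\varphi^-](x))_{\mathbb{C}^2}\, dx - \int_{\mathbb{R}^3} (\varphi^+(x), \{(\sigma \cdot D)\chi_n\}(x)\varphi^-(x))_{\mathbb{C}^2}\, dx \right| \\
&\quad \le \int_{|x| \le 2n + 1} |\varphi^+(x)|_{\mathbb{C}^2} \times |j_\varepsilon * [\{(\sigma \cdot D)\chi_n\}\varphi^-](x) - \{(\sigma \cdot D)\chi_n\}(x)\varphi^-(x)|_{\mathbb{C}^2}\, dx \\
&\quad \le \| |\varphi^+|_{\mathbb{C}^2} \|_{L^2(|x| \le 2n + 1)}\, \| j_\varepsilon * [\{(\sigma \cdot D)\chi_n\}\varphi^-] - \{(\sigma \cdot D)\chi_n\}\varphi^- \|_{\mathcal{H}} \\
&\quad \to 0 \quad (\varepsilon \downarrow 0),
\end{aligned} \tag{6.17}
\]
since $\{(\sigma \cdot D)\chi_n\}\varphi^- \in \mathcal{H}$. In the first inequality of (6.17) we have used the Schwarz inequality in $\mathbb{C}^2$, and in the second inequality the Schwarz inequality in $L^2$. Therefore
\[
\lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}^3} (\varphi^+(x), j_\varepsilon * [\{(\sigma \cdot D)\chi_n\}\varphi^-](x))_{\mathbb{C}^2}\, dx = \int_{\mathbb{R}^3} (\varphi^+(x), \{(\sigma \cdot D)\chi_n\}(x)\varphi^-(x))_{\mathbb{C}^2}\, dx. \tag{6.18}
\]
In a similar manner, we see that
\[
\lim_{\varepsilon \downarrow 0} \int_{\mathbb{R}^3} (\varphi^+(x), j_\varepsilon * [\chi_n (\sigma \cdot D)\varphi^-](x))_{\mathbb{C}^2}\, dx = \int_{\mathbb{R}^3} (\varphi^+(x), \chi_n(x)(\sigma \cdot D)\varphi^-(x))_{\mathbb{C}^2}\, dx, \tag{6.19}
\]
where we have used the fact that $\varphi^- \in [H^1(\mathbb{R}^3)]^2$. (Recall that this fact was shown in Assertion (i) of the lemma.) It follows from (6.11), (6.16), (6.18) and (6.19) that
\[
\begin{aligned}
((\sigma \cdot D)\varphi^+, \chi_n \varphi^-)_{\mathcal{H}}
&= \int_{\mathbb{R}^3} (\varphi^+(x), \{(\sigma \cdot D)\chi_n\}(x)\varphi^-(x))_{\mathbb{C}^2}\, dx \\
&\quad + \int_{\mathbb{R}^3} (\varphi^+(x), \chi_n(x)(\sigma \cdot D)\varphi^-(x))_{\mathbb{C}^2}\, dx.
\end{aligned} \tag{6.20}
\]
To estimate the first integral on the right-hand side of (6.20), we need the fact that
\[
\{(\sigma \cdot D)\chi_n\}(x) = \frac{1}{n}\, \chi'\Big(\frac{|x|}{n}\Big)\, \frac{1}{i}\, (\sigma \cdot \omega) \quad (\omega = x/|x|). \tag{6.21}
\]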
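Note that $\operatorname{supp}[(\sigma \cdot D)\chi_n] \subset \{x \mid n \le |x| \le 2n\}$ and that $\sigma \cdot \omega$ is a unitary matrix. Hence we have
\[
\begin{aligned}
&\left| \int_{\mathbb{R}^3} (\varphi^+(x), \{(\sigma \cdot D)\chi_n\}(x)\varphi^-(x))_{\mathbb{C}^2}\, dx \right| \\
&\quad \le \frac{1}{n} \Big(\sup_{r > 0} |\chi'(r)|\Big) \int_{n \le |x| \le 2n} |\varphi^+(x)|_{\mathbb{C}^2}\, |\varphi^-(x)|_{\mathbb{C}^2}\, dx \\
&\quad \le \frac{1}{n} \Big(\sup_{r > 0} |\chi'(r)|\Big) \left\{ \int_{n \le |x| \le 2n} \langle x \rangle^{-2s} |\varphi^+(x)|^2_{\mathbb{C}^2}\, dx \right\}^{1/2} \times \left\{ \int_{n \le |x| \le 2n} \langle x \rangle^{2s} |\varphi^-(x)|^2_{\mathbb{C}^2}\, dx \right\}^{1/2} \\
&\quad \le \frac{1}{n} \Big(\sup_{r > 0} |\chi'(r)|\Big)\, \| |\varphi^+|_{\mathbb{C}^2} \|_{L^{2,-s}} \times (1 + 4n^2)^{s/2}\, \|\varphi^-\|_{\mathcal{H}} \\
&\quad \le \mathrm{const.}\; n^{-1+s}\, \| |\varphi^+|_{\mathbb{C}^2} \|_{L^{2,-s}}\, \|\varphi^-\|_{\mathcal{H}} \to 0 \quad (n \to \infty),
\end{aligned} \tag{6.22}
\]
since $s < 1$ by assumption of Theorem 6.1. Thus the first integral on the right-hand side of (6.20) tends to 0 as $n \to \infty$.

We now investigate the limit of the second integral on the right-hand side of (6.20) as $n \to \infty$. It follows from (6.4) that
\[
\begin{aligned}
&\int_{\mathbb{R}^3} (\varphi^+(x), \chi_n(x)(\sigma \cdot D)\varphi^-(x))_{\mathbb{C}^2}\, dx - \int_{\mathbb{R}^3} ((\sigma \cdot A)(x)\varphi^+(x), \varphi^-(x))_{\mathbb{C}^2}\, dx \\
&\quad = \int_{\mathbb{R}^3} ((\chi_n(x) - 1)(\sigma \cdot A)(x)\varphi^+(x), \varphi^-(x))_{\mathbb{C}^2}\, dx,
\end{aligned} \tag{6.23}
\]
where we have used the fact that $(\sigma \cdot A)(x)$ is a Hermitian matrix for each $x$. Noting (6.8), we find that the absolute value of the right-hand side of (6.23) is less than or equal to
\[
\|(\chi_n - 1)(\sigma \cdot A)\varphi^+\|_{\mathcal{H}}\, \|\varphi^-\|_{\mathcal{H}}, \tag{6.24}
\]
which obviously tends to 0 as $n \to \infty$. Combining this fact with (6.20), (6.22) and (6.23), we obtain
\[
\lim_{n \to \infty} ((\sigma \cdot D)\varphi^+, \chi_n \varphi^-)_{\mathcal{H}} = ((\sigma \cdot A)\varphi^+, \varphi^-)_{\mathcal{H}}. \tag{6.25}
\]
Assertion (ii) of the lemma is a direct consequence of (6.10) and (6.25).

7. Examples, Concluding Remarks and an Open Question

We shall give examples of vector potentials $A(x)$ which yield $\pm m$ modes but do not give rise to $\pm m$ resonances. The basic idea in this section is to exploit the equivalences (2.9), (2.10), and to apply Theorem 6.1. It turns out that beautiful spectral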
properties are common to all the examples of the magnetic Dirac operators in this section. See Properties (i)–(iv) of Examples 7.1 and 7.2.

Example 7.1 ([19]). Let
\[
A_{LY}(x) = 3\langle x \rangle^{-4} \{(1 - |x|^2) w_0 + 2(w_0 \cdot x) x + 2 w_0 \times x\}, \tag{7.1}
\]
where $\phi_0 = {}^t(1, 0)$ ($\phi_0$ can be any unit vector in $\mathbb{C}^2$) and
\[
w_0 = \phi_0 \cdot (\sigma \phi_0) := ((\phi_0, \sigma_1 \phi_0)_{\mathbb{C}^2}, (\phi_0, \sigma_2 \phi_0)_{\mathbb{C}^2}, (\phi_0, \sigma_3 \phi_0)_{\mathbb{C}^2}). \tag{7.2}
\]
Here $w_0 \cdot x$ and $w_0 \times x$ denote the inner product and the exterior product of $\mathbb{R}^3$ respectively. Then the magnetic Dirac operator $H_{LY} := H_{A_{LY}} = \alpha \cdot (D - A_{LY}(x)) + m\beta$ has the following properties:
(i) $\sigma(H_{LY}) = \sigma_{\mathrm{ess}}(H_{LY}) = (-\infty, -m] \cup [m, \infty)$;
(ii) $H_{LY}$ has $\pm m$ modes. Moreover, the point spectrum of $H_{LY}$ consists only of $\pm m$, i.e. $\sigma_p(H_{LY}) = \{-m, m\}$;
(iii) $H_{LY}$ does not have $\pm m$ resonances;
(iv) $H_{LY}$ is absolutely continuous on $(-\infty, -m) \cup (m, \infty)$.

We shall show these properties one by one. It is easy to see that $-\sigma \cdot A_{LY}(x)$ is a relatively compact perturbation of $T_0 = \sigma \cdot D$, hence the Weyl–Dirac operator $T_{LY} := T_{A_{LY}} = \sigma \cdot (D - A_{LY}(x))$ is a self-adjoint operator in the Hilbert space $\mathcal{H} = [L^2(\mathbb{R}^3)]^2$ with the domain $[H^1(\mathbb{R}^3)]^2$. Since the spectrum of the operator $T_0$ equals the whole real line, we see that $\sigma(T_{LY}) = \mathbb{R}$. Property (i) immediately follows from Theorem 2.3.

We shall show Property (ii). According to Loss and Yau [19, Sec. II], the Weyl–Dirac operator $T_{LY}$ has a zero mode $\varphi_{LY}$ defined by
\[
\varphi_{LY}(x) = \langle x \rangle^{-3} (I_2 + i\sigma \cdot x)\phi_0. \tag{7.3}
\]
It follows from (2.9) and (2.10) that ${}^t(\varphi_{LY}, 0)$ (respectively, ${}^t(0, \varphi_{LY})$) is an $m$ mode (respectively, a $-m$ mode) of $H_{LY}$. Hence $\sigma_p(H_{LY}) \supset \{-m, m\}$. On the other hand, it follows from Yamada [35] that $H_{LY}$ has no eigenvalue in $(-\infty, -m) \cup (m, \infty)$. (Note that the vector potential $A_{LY}$ satisfies the assumption of [35, Proposition 2.5].) This fact, together with Property (i), implies that $\sigma_p(H_{LY}) \subset \{-m, m\}$. Summing up, we get Property (ii).

Since $|A_{LY}(x)| \le C\langle x \rangle^{-2}$, Property (iii) follows from Theorem 6.1.

Property (iv) is a direct consequence of [35, Corollary 4.2]. As for absolute continuity and the limiting absorption principle for Dirac operators, see also [36, 9, 24].

Remark 7.1. Since $A_{LY}$ is $C^\infty$, one can apply Thaller [31, Theorem 7.1, p. 195] to conclude that ${}^t(\varphi_{LY}, 0)$ (respectively, ${}^t(0, \varphi_{LY})$) is an $m$ mode (respectively, a $-m$ mode) of $H_{LY}$. This fact is also mentioned in [30].
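The decay rates used in Example 7.1 can be made explicit. A direct computation with $|w_0| = 1$ gives $|(1 - |x|^2) w_0 + 2(w_0 \cdot x) x + 2 w_0 \times x| = 1 + |x|^2$, so that $|A_{LY}(x)| = 3\langle x \rangle^{-2}$, and $(I_2 - i\sigma \cdot x)(I_2 + i\sigma \cdot x) = (1 + |x|^2) I_2$ gives $|\varphi_{LY}(x)|_{\mathbb{C}^2} = \langle x \rangle^{-2}$. The Python sketch below evaluates (7.1)–(7.3) at one sample point and compares with these closed forms. It assumes the standard Pauli matrices and $\phi_0 = {}^t(1, 0)$; the function names `w_vector`, `a_ly` and `phi_ly` are hypothetical and serve this illustration only.

```python
import math

def w_vector(phi0):
    """w0 = ((phi0, s1 phi0), (phi0, s2 phi0), (phi0, s3 phi0)) as in (7.2),
    written out for the standard Pauli matrices."""
    p0, p1 = phi0
    z = p0.conjugate() * p1
    return (2.0 * z.real, 2.0 * z.imag, abs(p0) ** 2 - abs(p1) ** 2)

def cross(u, v):
    """Exterior (vector) product in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def a_ly(x, w0):
    """Vector potential (7.1): 3 <x>^{-4} {(1-|x|^2) w0 + 2 (w0.x) x + 2 w0 x x}."""
    r2 = sum(t * t for t in x)
    wx = sum(a * b for a, b in zip(w0, x))
    c = cross(w0, x)
    scale = 3.0 / (1.0 + r2) ** 2
    return tuple(scale * ((1.0 - r2) * w0[k] + 2.0 * wx * x[k] + 2.0 * c[k])
                 for k in range(3))

def phi_ly(x, phi0):
    """Zero mode (7.3): <x>^{-3} (I_2 + i sigma.x) phi0."""
    x1, x2, x3 = x
    r2 = x1 * x1 + x2 * x2 + x3 * x3
    p0, p1 = phi0
    upper = p0 + 1j * (x3 * p0 + complex(x1, -x2) * p1)
    lower = p1 + 1j * (complex(x1, x2) * p0 - x3 * p1)
    scale = (1.0 + r2) ** -1.5
    return (scale * upper, scale * lower)

phi0 = (1 + 0j, 0j)
w0 = w_vector(phi0)                      # equals (0, 0, 1) for this phi0
x = (0.4, -1.1, 2.3)                     # sample point
bracket_sq = 1.0 + sum(t * t for t in x)  # <x>^2

norm_a = math.sqrt(sum(t * t for t in a_ly(x, w0)))
norm_phi = math.sqrt(sum(abs(c) ** 2 for c in phi_ly(x, phi0)))
```

At the sample point, `norm_a` agrees with $3/\langle x \rangle^2$ and `norm_phi` with $1/\langle x \rangle^2$ up to rounding, consistent with the bound $|A_{LY}(x)| \le C\langle x \rangle^{-2}$ used for Property (iii). The sketch checks only these norms; it does not verify the zero mode equation itself.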
Remark 7.2. As was pointed out in [19, Sec. II], one sees that $\operatorname{div} A_{LY} \neq 0$, and one can find, by a gauge transformation, a vector potential $\tilde A_{LY}$ which satisfies $\operatorname{div} \tilde A_{LY} = 0$ and $\operatorname{rot} \tilde A_{LY} = \operatorname{rot} A_{LY}$ and yields a zero mode $\tilde\varphi_{LY}$ of $\sigma\cdot(D - \tilde A_{LY}(x))$. In fact, defining
\[ \tilde A_{LY} := A_{LY} + \nabla\chi_{LY}, \qquad \tilde\varphi_{LY} := e^{i\chi_{LY}}\varphi_{LY}, \]
with
\[ \chi_{LY}(x) := \frac{1}{4\pi}\int_{\mathbb{R}^3} \frac{1}{|x-y|}\,(\operatorname{div} A_{LY})(y)\,dy, \]
we observe that $\tilde A_{LY}$ and $\tilde\varphi_{LY}$ have the desired properties mentioned above. Moreover, we can show that $|\tilde A_{LY}(x)| \le C\langle x\rangle^{-2}$ ($\in L^6(\mathbb{R}^3)$). Hence, the magnetic Dirac operator $H_{\tilde A_{LY}} = \alpha\cdot(D - \tilde A_{LY}(x)) + m\beta$ shares the Properties (i)–(iv) of Example 7.1 with $H_{A_{LY}} = \alpha\cdot(D - A_{LY}(x)) + m\beta$. The same idea is applicable to the vector potentials $A^{(\ell)}$ in Example 7.2 below.
Example 7.2 ([2]). In the same spirit as in Example 7.1, we can show the existence of a countably infinite number of vector potentials with which the magnetic Dirac operators have the Properties (i)–(iv) in Example 7.1. In fact, we shall exploit a result on the Weyl–Dirac operator by Adam, Muratori and Nash [2], where they construct a series of vector potentials $A^{(\ell)}$ ($\ell = 0, 1, 2, \ldots$), each of which gives rise to a zero mode $\psi^{(\ell)}$ of the Weyl–Dirac operator $T^{(\ell)} := \sigma\cdot(D - A^{(\ell)}(x))$. The idea of [2] is an extension of that of [19, Sec. II]. Indeed $A^{(0)}$ and $\psi^{(0)}$ give the same vector potential and zero mode as in (7.1) and (7.3). For $\ell \ge 1$, the construction of the zero mode $\psi^{(\ell)}(x)$ is based on an ansatz (see [2, Sec. II], (7)) and the definition of $A^{(\ell)}$ is given by
\[ A^{(\ell)}(x) = \frac{h^{(\ell)}(x)}{|\psi^{(\ell)}(x)|^2}\,\{\psi^{(\ell)}(x)\cdot(\sigma\psi^{(\ell)}(x))\}, \tag{7.4} \]
where $h^{(\ell)}(x)$ is a real valued function defined as
\[ h^{(\ell)}(x) = \frac{c_\ell}{\langle x\rangle^{2}} \quad (c_\ell \text{ a real constant depending only on } \ell) \tag{7.5} \]
and $\psi^{(\ell)}(x)\cdot(\sigma\psi^{(\ell)}(x))$ is defined in the same way as in (7.2). (For the definition of $h^{(\ell)}(x)$, see [29].) By the same arguments as in Example 7.1, we can deduce that the magnetic Dirac operator $H^{(\ell)} := \alpha\cdot(D - A^{(\ell)}(x)) + m\beta$, $\ell = 0, 1, 2, \ldots$, has the Properties (i)–(iv) of Example 7.1.
Section 3 was based upon our results on supersymmetric Dirac operators in Sec. 2 of the present paper and those of [27]. It turned out that all $\pm m$ modes have the same asymptotic limit at infinity, i.e. $|x|^{-2}$ as $|x| \to \infty$. This means that the asymptotic limits of $\pm m$ modes of the magnetic Dirac operator are the same as those of zero modes of the Weyl–Dirac operator. Section 4 was based upon our results on supersymmetric Dirac operators in Sec. 2 and those of Balinsky and Evans [6] on the Weyl–Dirac operator. Section 5 was based upon our results on
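supersymmetric Dirac operators in Sec. 2 and those of Elton [11] on the Weyl–Dirac operator. In each of Secs. 3–5, we made a different assumption on the vector potentials. It is meaningful to compare these assumptions with each other. To this end, imitating (5.1), we introduce the following notation
\[ \mathcal{A}_{SU} := \{A \mid A \text{ satisfies Assumption (SU)}\}, \qquad \mathcal{A}_{BE} := \{A \mid A \text{ satisfies Assumption (BE)}\}. \]
We then have
\[ \mathcal{A}_{SU} \subset \mathcal{A}_{BE}, \quad \mathcal{A}\setminus\mathcal{A}_{SU} \neq \emptyset, \quad \mathcal{A}_{SU}\setminus\mathcal{A} \neq \emptyset, \quad \mathcal{A}\setminus\mathcal{A}_{BE} \neq \emptyset, \quad \mathcal{A}_{BE}\setminus\mathcal{A} \neq \emptyset. \]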
In Secs. 4 and 5, it was shown that the set of vector potentials which give rise to $\pm m$ modes is scarce in each regime. The non-existence of $\pm m$ resonances was proved in Sec. 6 under the assumption that $|A_j(x)| \le C\langle x\rangle^{-\rho}$, $\rho > 3/2$. Based on the results in Sec. 6, it follows that none of the examples of vector potentials in this section has $\pm m$ resonances. A natural question arises: Is there a vector potential $A$ which satisfies $|A_j(x)| \le C\langle x\rangle^{-\rho}$, $\rho > 0$, and yields $\pm m$ resonances of the magnetic Dirac operator $H_A$?
Acknowledgments
The authors would like to thank the referees for their valuable comments and constructive suggestions which led them to the consideration of threshold resonances and had them improve the paper. The second author was supported by Grant-in-Aid for Scientific Research (C) No. 21540193, Japan Society for the Promotion of Science.
References
[1] Y. Aharonov and A. Casher, Ground state of a spin-1/2 charged particle in a two-dimensional magnetic field, Phys. Rev. A 19 (1979) 2461–2462.
[2] C. Adam, B. Muratori and C. Nash, Zero modes of the Dirac operator in three dimensions, Phys. Rev. D 60 (1999) 125001-1–125001-8.
[3] C. Adam, B. Muratori and C. Nash, Degeneracy of zero modes of the Dirac operator in three dimensions, Phys. Lett. B 485 (2000) 314–318.
[4] C. Adam, B. Muratori and C. Nash, Multiple zero modes of the Dirac operator in three dimensions, Phys. Rev. D 62 (2000) 085026-1–085026-9.
[5] A. A. Balinsky and W. D. Evans, On the zero modes of Pauli operators, J. Funct. Anal. 179 (2001) 120–135.
[6] A. A. Balinsky and W. D. Evans, On the zero modes of Weyl–Dirac operators and their multiplicity, Bull. London Math. Soc. 34 (2002) 236–242.
[7] A. A. Balinsky and W. D. Evans, Zero modes of Pauli and Weyl–Dirac operators, in Advances in Differential Equations and Mathematical Physics (Birmingham, AL, 2002), Contemp. Math., Vol. 327 (Amer. Math. Soc., Providence, Rhode Island, 2003), pp. 1–9.
[8] A. A. Balinsky, W. D. Evans and Y. Saitō, Dirac–Sobolev inequalities and estimates for the zero modes of massless Dirac operators, J. Math. Phys. 49 (2008) 043514-1–043514-10.
[9] E. Balslev and B. Helffer, Limiting absorption principle and resonances for the Dirac operators, Adv. Appl. Math. 13 (1992) 186–215.
[10] L. Bugliaro, C. Fefferman and G. M. Graf, A Lieb–Thirring bound for a magnetic Pauli Hamiltonian, II, Rev. Mat. Iberoamericana 15 (1999) 593–619.
[11] D. M. Elton, The local structure of zero mode producing magnetic potentials, Comm. Math. Phys. 229 (2002) 121–139.
[12] L. Erdős and J. P. Solovej, The kernel of Dirac operators on S^3 and R^3, Rev. Math. Phys. 13 (2001) 1247–1280.
[13] L. Erdős and J. P. Solovej, Uniform Lieb–Thirring inequality for the three-dimensional Pauli operator with a strong non-homogeneous magnetic field, Ann. Henri Poincaré 5 (2004) 671–741.
[14] L. Erdős and J. P. Solovej, Magnetic Lieb–Thirring inequalities with optimal dependence on the field strength, J. Stat. Phys. 116 (2004) 475–506.
[15] J. Fröhlich, E. H. Lieb and M. Loss, Stability of Coulomb systems with magnetic fields. I. The one-electron atom, Comm. Math. Phys. 104 (1986) 251–270.
[16] B. Helffer, J. Nourrigat and X. P. Wang, Sur le spectre de l'équation de Dirac (dans R^3 ou R^2) avec champ magnétique, Ann. Sci. École Norm. Sup. (4) 22 (1989) 515–533.
[17] B. Helffer and B. Parisse, Comparaison entre la décroissance de fonctions propres pour les opérateurs de Dirac et de Klein–Gordon. Application à l'étude de l'effet tunnel, Ann. Inst. H. Poincaré Phys. Théor. 60 (1994) 147–187.
[18] E. Lieb, Private communication (2009).
[19] M. Loss and H. T. Yau, Stability of Coulomb systems with magnetic fields. III. Zero energy bound states of the Pauli operators, Comm. Math. Phys. 104 (1986) 283–290.
[20] P. Pickl, Generalized eigenfunctions for critical potentials with small perturbations, J. Math. Phys. 48 (2007) 123505-1–123505-31.
[21] P. Pickl and D. Dürr, On adiabatic pair creation, Comm. Math. Phys. 282 (2008) 161–198.
[22] P. Pickl and D. Dürr, Adiabatic pair creation in heavy ion and laser fields, Europhys. Lett. 81 (2008) 40001–40007.
[23] C. Pladdy, Asymptotics of the resolvent of the Dirac operator with a scalar short-range potential, Analysis 21 (2001) 79–97.
[24] C. Pladdy, Y. Saitō and T. Umeda, Radiation condition for Dirac operators, J. Math. Kyoto Univ. 37 (1998) 567–584.
[25] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I, Functional Analysis, Revised and enlarged edition (Academic Press, London, 1980).
[26] M. Reed and B. Simon, Methods of Modern Mathematical Physics. IV, Analysis of Operators (Academic Press, New York, 1978).
[27] Y. Saitō and T. Umeda, The asymptotic limits of zero modes of massless Dirac operators, Lett. Math. Phys. 83 (2008) 97–106.
[28] Y. Saitō and T. Umeda, The zero modes and zero resonances of massless Dirac operators, Hokkaido Math. J. 37 (2008) 363–388.
[29] Y. Saitō and T. Umeda, A sequence of zero modes of Weyl–Dirac operators and an associated sequence of solvable polynomials, to appear in Spectral Theory, Function Spaces and Inequalities: New Techniques and Recent Trends, Operator Theory: Advances and Applications (Birkhäuser).
[30] B. Thaller, Dirac particles in magnetic fields, in Recent Developments in Quantum Mechanics, eds. A. Boutet de Monvel et al. (Kluwer Academic Publishers, Dordrecht, 1991), pp. 351–366.
[31] B. Thaller, The Dirac Equation (Springer-Verlag, Berlin, 1992).
[32] X. P. Wang, Puits multiples pour l'opérateur de Dirac, Ann. Inst. H. Poincaré Phys. Théor. 43 (1985) 269–319.
[33] T. Weidl, On the virtual bound states for semi-bounded operators, Comm. Partial Differential Equations 24 (1999) 25–60.
[34] D. R. Yafaev, Exponential decay of eigenfunctions of first order systems, in Adventures in Mathematical Physics, Contemporary Mathematics, Vol. 447 (American Mathematical Society, 2007), pp. 249–256.
[35] O. Yamada, On the principle of limiting absorption for the Dirac operators, Publ. Res. Inst. Math. Sci. Kyoto Univ. 8 (1972/73) 557–577.
[36] O. Yamada, Eigenfunction expansions and scattering theory for Dirac operators, Publ. Res. Inst. Math. Sci. Kyoto Univ. 11 (1976) 651–689.
March 23, J070-S0129055X11004266
2011 10:41 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 2 (2011) 179–209 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004266
SPECTRAL RENORMALIZATION GROUP AND LOCAL DECAY IN THE STANDARD MODEL OF NON-RELATIVISTIC QUANTUM ELECTRODYNAMICS
JÜRG FRÖHLICH∗, MARCEL GRIESEMER† and ISRAEL MICHAEL SIGAL‡
∗Institute for Theoretical Physics, ETH Zürich, CH-8093 Zürich, Switzerland and IHES, Bures-sur-Yvette, France
†Department of Mathematics, University of Stuttgart, Pfaffenwaldring 57, D-70569 Stuttgart, Germany
‡Department of Mathematics, University of Toronto, 40 St. George Street, Bahen Centre, Toronto, ON M5S 2E4, Canada
∗[email protected]
†[email protected]
‡[email protected]
Received 28 July 2010
Revised 16 January 2011
We prove a limiting absorption principle for the standard model of non-relativistic quantum electrodynamics (QED) and for Nelson’s model describing interactions of electrons with phonons. To this end, we use the spectral renormalization group technique on the continuous spectrum in conjunction with Mourre theory. Keywords: Non-relativistic quantum electrodynamics; renormalization group; local decay; spectral theory; scattering theory; theory of radiation; emission and absorption of light. Mathematics Subject Classification 2010: 81Q10, 81T17, 84U99, 81V10
1. Introduction
The mathematical framework for the theory of non-relativistic matter interacting with the quantized electromagnetic field (non-relativistic quantum electrodynamics) is well established. It is based on the quantum Hamiltonian, which, in notation and units explained below, is of the form
\[ H_g = \sum_{j=1}^{n} \frac{1}{2m_j}\,(i\nabla_{x_j} - gA(x_j))^2 + V(x) + H_f \tag{1.1} \]
acting on the Hilbert space $\mathcal{H} = \mathcal{H}_p \otimes \mathcal{H}_f$, the tensor product of the state spaces of the particle system and the quantized electromagnetic field. Here the coupling constant $g$ is related to the charge of the electron. An ultraviolet cutoff is imposed on the electromagnetic vector potential $A$, with the effect that photons of very high frequency do not interact with the electrons. This model provides the foundation of the physical theory that describes phenomena of emission and absorption of radiation by systems of matter, such as atoms and molecules, as well as other processes of quantum radiation interacting with matter. It has been extensively studied during the last decade (see the books [47, 26] and references therein for a partial list of contributions). For "reasonable" potentials $V(x)$, the operator $H_g$ is self-adjoint and its spectral and resonance structure, and therefore the dynamics for long but finite time intervals, is well understood (see, e.g., [1, 2, 15, 25, 27–29, 32, 38, 39, 45], and references therein, for recent results). However, we still know little about the asymptotic dynamics, as time tends to $\infty$. In particular, a complete scattering theory for this operator does not, at present, exist (see, however, [16–18, 11, 12]). A key notion connected to the asymptotic dynamics is that of local decay. This notion also lies at the foundation of modern quantum scattering theory. It states that a quantum system of the type considered here is either in a bound state, or, as time tends to infinity, it breaks apart, i.e. the probability to occupy any bounded region of physical space tends to zero and, consequently, the average distance between particles diverges to infinity. In our case, this means that photons leave the region of physical space occupied by the particle system. Until recently, local decay for the Hamiltonian $H_g$ has been proven only for energies away from $O(g^2)$-neighborhoods of the ground state energy, $e_g$, and the ionization threshold.
However, starting from a state with energy below the ionization threshold, the part of the system in any bounded space region eventually winds up in a state with energy close to the ground state energy. Indeed, while the total energy is conserved, photons carry away energy from regions of space where matter is concentrated. Understanding the dynamics in this energy interval is an important matter. Recently, local decay was proven for states in the spectral interval $(e_g, e_g + \epsilon^{(p)}_{gap}/12)$ for the Hamiltonian $H_g$ [19]. Here
\[ \epsilon^{(p)}_{gap} := \epsilon^{(p)}_1 - \epsilon^{(p)}_0, \]
where $\epsilon^{(p)}_0$ and $\epsilon^{(p)}_1$ are the energies of the ground state and the first excited state of the particle system. In this paper, we present another proof of this result. The main goal of this paper is to develop a new approach to time-dependent problems in non-relativistic QED that combines the spectral renormalization group (RG), developed in [6, 7, 3] (see also [20]), with more traditional spectral techniques such as the Mourre estimates. Our key result is that the stronger property of the limiting absorption principle (LAP) "propagates" along the renormalization flow. Next, we explain units and notation employed in (1.1). We use units in which the Planck constant divided by $2\pi$, the speed of light and the electron mass are
equal to 1 ($\hbar = 1$, $c = 1$ and $m = 1$). In these units the electron charge is equal to $-\sqrt{\alpha}$ ($e = -\sqrt{\alpha}$), where $\alpha = \frac{e^2}{4\pi\hbar c} \approx \frac{1}{137}$ is the fine-structure constant; length, time and energy are measured in units of $\hbar/mc = 3.86\cdot 10^{-11}$ cm, $\hbar/mc^2 = 1.29\cdot 10^{-21}$ sec and $mc^2 = 0.511$ MeV, respectively (natural units). We show below that one can set $g := \alpha^{3/2}$. Our particle system consists of $n$ particles of masses $m_j$ (the ratio of the mass of the $j$th particle to the mass of an electron) and positions $x_j$, where $j = 1, \ldots, n$. We write $x = (x_1, \ldots, x_n)$. The total potential of the particle system is denoted by $V(x)$. The Hamilton operator of the particle system alone is given by
\[ H_p := -\sum_{j=1}^{n} \frac{1}{2m_j}\Delta_{x_j} + V(x), \tag{1.2} \]
where $\Delta_{x_j}$ is the Laplacian in the variable $x_j$. This operator acts on the Hilbert space of the particle system, denoted by $\mathcal{H}_p$, which is either $L^2(\mathbb{R}^{3n})$ or a subspace of this space determined by the symmetry of the particle system under permutations. The spin of particles will be neglected for simplicity. The electromagnetic field is described by the quantized vector potential in the Coulomb gauge
\[ A(y) = \int (e^{iky}a(k) + e^{-iky}a^*(k))\,\frac{\chi(k)\,d^3k}{\sqrt{2|k|}}, \tag{1.3} \]
where $\chi$ is an ultraviolet cut-off: $\chi(k) = 1$ in a neighborhood of $k = 0$, and vanishing sufficiently fast at infinity, and its dynamics is described by the quantum Hamiltonian
\[ H_f = \int d^3k\, a^*(k)\,\omega(k)\,a(k). \tag{1.4} \]
These operators act on the Fock space $\mathcal{H}_f \equiv \mathcal{F}$. Above, $\omega(k) = |k|$ is the dispersion law, i.e. the energy of a field quantum with wave vector $k$, $a^*(k)$ and $a(k)$ denote creation and annihilation operators on $\mathcal{F}$, and the right-hand side can be understood as a weak integral. The symbols $a^*(k)$ and $a(k)$ denote operator-valued generalized transverse vector fields:
\[ a^{\#}(k) := \sum_{\lambda\in\{0,1\}} e_\lambda(k)\,a^{\#}_\lambda(k), \]
where $e_\lambda(k)$ are polarization vectors, i.e. orthonormal vectors in $\mathbb{R}^3$ satisfying $k\cdot e_\lambda(k) = 0$, and $a^{\#}_\lambda(k)$ are ordinary creation and annihilation operators satisfying canonical commutation relations. The subscript $\lambda$ labels the helicity of the field quantum. (See the supplement for a brief review of the definitions of Fock space, creation and annihilation operators and the definition of the operator $H_f$.) First, we consider (1.1) for an atom or molecule. Then, in the natural units, $g = \sqrt{\alpha}$ and $V(x)$, the total Coulomb potential of the particle system, is proportional
to $\alpha$. Rescaling $x \to \alpha^{-1}x$ and $k \to \alpha^{2}k$ we arrive at (1.1) with $g := \alpha^{3/2}$, $V(x)$ of the order $O(1)$ (see footnote a) and $A(x)$ replaced by $A'(x)$, where $A'(x) = A(\alpha x)|_{\chi(k)\to\chi'(k)}$, and where $\chi'(k) := \chi(\alpha^2 k)$ (see [8]). After that we drop the prime in the vector potential $A'(x)$ and the ultraviolet cut-off $\chi'(k)$ (see a discussion of the latter below). Finally, we relax the restriction on $V(x)$ and consider standard generalized $n$-body potentials (see, e.g., [43, 44, 40]): $V(x) = \sum_i W_i(\pi_i x)$, where $\pi_i$ are linear maps from $\mathbb{R}^{3n}$ to $\mathbb{R}^{\nu_i}$, $\nu_i \le 3n$, and $W_i$ are Kato–Rellich potentials (i.e. $W_i \in L^{p_i}(\mathbb{R}^{\nu_i}) + (L^\infty(\mathbb{R}^{\nu_i}))_\varepsilon$ with $p_i = 2$ for $\nu_i \le 3$, $p_i > 2$ for $\nu_i = 4$, and $p_i \ge \nu_i/2$ for $\nu_i > 4$). In order not to deal with the problem of center-of-mass motion, we assume that either some of the particles (the nuclei) are infinitely heavy, or the system is placed in an external potential field. It is well known that $H_f$ defines a positive, self-adjoint operator on $\mathcal{F}$ with purely absolutely continuous spectrum, except for a simple eigenvalue 0 corresponding to the eigenvector $\Omega$ (the vacuum vector, see Supplement). Thus, for $g = 0$, the low energy spectrum of the operator $H_0$ consists of branches $[\epsilon^{(p)}_i, \infty)$ of absolutely continuous spectrum, where $\epsilon^{(p)}_i$ are the isolated eigenvalues of $H_p$, and of the eigenvalues $\epsilon^{(p)}_i$ sitting at the tip of the branches ("thresholds") of the continuous spectrum. The absence of gaps between the eigenvalues and the thresholds is a consequence of the fact that the photons (and phonons) are massless. This leads to subtle problems in perturbation theory, known collectively as the infrared problem. In this paper, we prove the local decay property for the Hamiltonian $H_g$. In fact, we prove a slightly stronger property, the limiting absorption principle, which states that the resolvent sandwiched between appropriate weighting operators has Hölder continuous limits on the spectrum.
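The value $g = \alpha^{3/2}$ obtained from the rescaling can be checked by tracking powers of $\alpha$. The following is only a sketch under stated assumptions: the potential is taken to be pure Coulomb, $V(x) = \alpha v(x)$ with $v$ homogeneous of degree $-1$, and the rescaling is implemented through the substitutions $x = \alpha^{-1}x'$, $k = \alpha^{2}k'$, $a(k) = \alpha^{-3}a'(k')$, where the last factor is chosen so that $a'$ again satisfies the canonical commutation relations. The primed names $x'$, $k'$, $a'$ and the function $v$ are introduced here for illustration and are not notation of the paper.

```latex
% Power counting in alpha (formal; h.c. denotes the Hermitian conjugate term).
\begin{align*}
H_f &= \int |k|\,a^*(k)a(k)\,d^3k
     = \alpha^{2}\int |k'|\,a'^*(k')a'(k')\,d^3k',\\
A(\alpha^{-1}x') &= \int \bigl(e^{i\alpha k'\cdot x'}a'(k') + \mathrm{h.c.}\bigr)\,
     \chi(\alpha^{2}k')\,\frac{\alpha^{-3}\,\alpha^{6}\,d^3k'}{\alpha\sqrt{2|k'|}}
     = \alpha^{2}A'(x'),\\
\bigl(i\nabla_x - \sqrt{\alpha}\,A(x)\bigr)^2
    &= \bigl(i\alpha\nabla_{x'} - \alpha^{5/2}A'(x')\bigr)^2
     = \alpha^{2}\bigl(i\nabla_{x'} - \alpha^{3/2}A'(x')\bigr)^2,\\
\alpha\,v(\alpha^{-1}x') &= \alpha^{2}\,v(x').
\end{align*}
```

Every term thus carries the same overall factor $\alpha^{2}$, which can be divided out, leaving a Hamiltonian of the form (1.1) with coupling $\alpha^{3/2}$, a potential of order $O(1)$, and the phase $e^{i\alpha k'\cdot x'}$ and cut-off $\chi(\alpha^2 k')$ that define $A'$.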
To be specific, let $B$ denote the self-adjoint generator of dilatations on Fock space $\mathcal{F}$. It can be expressed in terms of creation and annihilation operators as
\[ B = \frac{i}{2}\int d^3k\, a^*(k)\{k\cdot\nabla_k + \nabla_k\cdot k\}a(k). \tag{1.5} \]
We extend it to the Hilbert space $\mathcal{H} = \mathcal{H}_p \otimes \mathcal{F}$. Let $\langle B\rangle := (1 + B^2)^{1/2}$. Our goal is to prove the following
Theorem 1.1. Let $g \ll \epsilon^{(p)}_{gap}$ and let $\Delta \subset (e_g, e_g + \frac{1}{12}\epsilon^{(p)}_{gap})$, where $e_g$ is the ground state energy of $H$. Then, as a function of $\lambda \in \Delta$,
\[ \langle B\rangle^{-\theta}(H_g - \lambda \pm i0)^{-1}\langle B\rangle^{-\theta} \in C^{\nu}(\Delta) \tag{1.6} \]
for $\theta > \frac12$ and $0 < \nu < \theta - \frac12$.
Footnote a. In the case of a molecule in the Born–Oppenheimer approximation, the resulting $V(x)$ also depends on the rescaled coordinates of the nuclei.
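Defining functions of self-adjoint operators by functional calculus, we derive from this theorem the following consequence:
Corollary 1.2. For $\Delta$ as above, for any function $f(\lambda)$ with $\operatorname{supp} f \subseteq \Delta$, and for $\nu < \theta - \frac12$, we have that
\[ \|\langle B\rangle^{-\theta} e^{-iHt} f(H)\langle B\rangle^{-\theta}\| \le Ct^{-\nu}. \tag{1.7} \]
This corollary follows from (1.6) and the formula
\[ \langle B\rangle^{-\theta} e^{-iHt} f(H)\langle B\rangle^{-\theta} = \int_{-\infty}^{\infty} d\lambda\, f(\lambda)e^{-i\lambda t}\,\operatorname{Im}\langle B\rangle^{-\theta}(H - \lambda - i0)^{-1}\langle B\rangle^{-\theta} \]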
(see, e.g., [44] and the detailed discussion in [19]).
Remark 1.3. Let $\Sigma_p := \inf\sigma(H_p)$. We expect that the method of this paper can be extended to the energy interval $(\sigma(H)\setminus\sigma_{pp}(H)) \cap (-\infty, \Sigma_p - \varepsilon]$, for some $\varepsilon > 0$, for QED and Nelson's models.
Previously, the limiting absorption principle and local decay estimates were proven in [7, 9] for the standard model of non-relativistic QED and for the Nelson model away from neighborhoods of the ground state energy and ionization threshold. In [21, 22], they were proven for the Nelson model near the ground state energy and for all values of the coupling constant, but under rather stringent assumptions, including one on the infrared behavior of the coupling functions (see also [6, 8, 42, 46] for earlier works). Finally, as was mentioned above, Theorem 1.1 has been proven in a neighborhood of the ground state energy in [19].
The approach followed in the present paper consists of three steps. First, following [45], we use a generalized Pauli–Fierz transform to map the QED Hamiltonian (1.1) to a new Hamiltonian $H_g^{PF}$ whose interaction has improved infrared behavior. To this new Hamiltonian we apply a suitable power of the renormalization map (depending on $\Delta$), obtaining, in the end, a rather simple Hamiltonian whose spectral properties can be analyzed with the help of the Mourre estimate. This proves the LAP for this particular Hamiltonian. Since, as we prove in this paper, the renormalization map preserves the LAP property, we conclude from this that the Hamiltonian $H_g^{PF}$ enjoys the LAP property as well. The size of the spectral interval on which the LAP holds and the number of iterations of the RG map depend on the distance of that interval to the ground state energy.
We will also consider Nelson's model. The Hamiltonian of this model is given by
\[ H_g^N = H_0^N + I_g^N. \tag{1.8} \]
It acts on the state space $\mathcal{H} = \mathcal{H}_p \otimes \mathcal{F}$, where now $\mathcal{F}$ is the Fock space for phonons, i.e. spinless, massless bosons. Here $g$ is a positive parameter, a coupling
constant, that we assume to be small, and
\[ H_0^N = H_p + H_f^N, \tag{1.9} \]
where $H_p$ and $H_f^N$ are given in (1.2) and (1.4), respectively, but, in Nelson's model, $a^*$ and $a$ are scalar creation and annihilation operators, and the interaction Hamiltonian is
\[ I_g^N \equiv I_g = g\int \frac{d^3k\,\kappa(k)}{|k|^{1/2}}\,\{e^{-ikx}a^*(k) + e^{ikx}a(k)\}. \tag{1.10} \]
(We can also treat terms quadratic in $a$ and $a^*$, but, for the sake of simplicity, we do not consider such terms.) Here, $\kappa$ is a real-valued function with the property that
\[ \|\kappa\|_\mu := \left(\int \frac{d^3k}{|k|^{3+2\mu}}\,|\kappa(k)|^2\right)^{1/2} < \infty, \tag{1.11} \]
for some (arbitrarily small, but) strictly positive $\mu > 0$. In the following, we fix some $\kappa$ with $\|\kappa\|_\mu = 1$ and vary $g$. It is easy to see that the operator $I_g$ is symmetric and bounded relative to $H_0^N$, in the sense of Kato [44, 40], with an arbitrarily small constant. Thus $H_g^N$ is self-adjoint on the domain of $H_0^N$ for arbitrary $g$. Of course, for Nelson's model the dimension of space may be arbitrary, $d \ge 1$. All the results mentioned above for the standard model Hamiltonian $H_g$ (or more precisely for $H_g^{PF}$) also hold for the Hamiltonian of Nelson's model, $H_g^N$, with $\mu > 0$. To simplify our exposition we prove Theorem 1.1 only for the case $0 < \theta \le 1$, leaving the case $\theta > 1$ to the interested reader.
2. Generalized Pauli–Fierz Transform
Here we describe the generalized Pauli–Fierz transform mentioned in the introduction (see [45]). We define the Hamiltonian
\[ H_g^{PF} := e^{-igF}H_g^{SM}e^{igF}, \tag{2.1} \]
which we call the generalized Pauli–Fierz Hamiltonian. In order to keep our notation simple, we define this transformation only for a single charged particle:
\[ F(x) = \sum_\lambda \int \bigl(\bar f_{x,\lambda}(k)a_\lambda(k) + f_{x,\lambda}(k)a^*_\lambda(k)\bigr)\,\frac{d^3k}{\sqrt{2|k|}}, \tag{2.2} \]
with the test function $f_{x,\lambda}(k)$ chosen as
\[ f_{x,\lambda}(k) := \frac{e^{-ikx}\chi(k)}{\sqrt{|k|}}\,\varphi\bigl(|k|^{\frac12}e_\lambda(k)\cdot x\bigr). \]
The function $\varphi$ is assumed to be $C^2$, real-valued, bounded, with bounded second derivative and satisfying $\varphi'(0) = 1$. A straightforward calculation, using
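the commutator expansion $e^{-igF(x)}H_fe^{igF(x)} = H_f - ig[F(x), H_f] - \frac{g^2}{2}[F(x), [F(x), H_f]]$, yields
\[ H_g^{PF} = \frac12\,(p + gA_1(x))^2 + V_g(x) + H_f + gG(x), \tag{2.3} \]
where $A_1(x) = A(x) - \nabla F(x)$, $V_g(x) := V(x) + \frac{g^2}{2}\sum_\lambda\int |f_{x,\lambda}(k)|^2\,d^3k$ and
\[ G(x) := -i\sum_\lambda\int |k|\,\bigl(\bar f_{x,\lambda}(k)a_\lambda(k) - f_{x,\lambda}(k)a^*_\lambda(k)\bigr)\,\frac{d^3k}{\sqrt{2|k|}}. \tag{2.4} \]
(The terms $gG$ and $V_g - V$ come from the last two terms of the expansion of $e^{-igF(x)}H_fe^{igF(x)}$.) Observe that the operator family $A_1(x)$ is of the form
\[ A_1(x) = \sum_\lambda\int \bigl(\bar\chi_{\lambda,x}(k)a_\lambda(k) + \chi_{\lambda,x}(k)a^*_\lambda(k)\bigr)\,\frac{d^3k}{\sqrt{2|k|}}, \tag{2.5} \]
where $\chi_{\lambda,x}(k)$ is given by $\chi_{\lambda,x}(k) := e_\lambda(k)e^{-ikx}\chi(k) - \nabla_xf_{x,\lambda}(k)$. It satisfies the estimates
\[ |\chi_{\lambda,x}(k)| \le \mathrm{const}\,\min\bigl(1, \sqrt{|k|}\,\langle x\rangle\bigr), \tag{2.6} \]
with $\langle x\rangle := (1 + |x|^2)^{1/2}$, and
\[ \int \frac{d^3k}{|k|}\,|\chi_{\lambda,x}(k)|^2 < \infty. \tag{2.7} \]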
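The origin of the terms $gG$ and $V_g - V$, and of the small-$|k|$ bound in (2.6), can be checked by a short formal computation. The sketch below assumes the canonical commutation relations $[a_\lambda(k), a^*_{\lambda'}(k')] = \delta_{\lambda\lambda'}\delta(k - k')$, ignores domain questions, and uses the convention that $f_{x,\lambda}$ multiplies $a^*_\lambda$ and its complex conjugate multiplies $a_\lambda$; this placement of conjugates is an assumption of the sketch.

```latex
% Commutators with H_f = \sum_\lambda \int |k| a^*_\lambda(k) a_\lambda(k) d^3k
\begin{align*}
[a_\lambda(k), H_f] &= |k|\,a_\lambda(k), \qquad
[a^*_\lambda(k), H_f] = -|k|\,a^*_\lambda(k),\\
[F(x), H_f] &= \sum_\lambda\int |k|\,\bigl(\bar f_{x,\lambda}a_\lambda - f_{x,\lambda}a^*_\lambda\bigr)
   \frac{d^3k}{\sqrt{2|k|}} = i\,G(x)
   \;\Longrightarrow\; -ig[F(x), H_f] = g\,G(x),\\
[F(x), [F(x), H_f]] &= \sum_\lambda\int \frac{|k|}{2|k|}\,\bigl(-2|f_{x,\lambda}(k)|^2\bigr)\,d^3k
   = -\sum_\lambda\int |f_{x,\lambda}(k)|^2\,d^3k\\
   &\Longrightarrow\; -\tfrac{g^2}{2}[F(x),[F(x),H_f]] = V_g(x) - V(x).
\end{align*}
% Gradient of the test function and the resulting form factor
\begin{align*}
\nabla_x f_{x,\lambda}(k) &= -ik\,f_{x,\lambda}(k)
   + e_\lambda(k)\,e^{-ikx}\chi(k)\,\varphi'\bigl(|k|^{1/2}e_\lambda(k)\cdot x\bigr),\\
\chi_{\lambda,x}(k) &= e_\lambda(k)\,e^{-ikx}\chi(k)\,
   \bigl(1 - \varphi'\bigl(|k|^{1/2}e_\lambda(k)\cdot x\bigr)\bigr) + ik\,f_{x,\lambda}(k).
\end{align*}
```

Since $\varphi'(0) = 1$ and $\varphi''$ is bounded, $|1 - \varphi'(s)| \le \|\varphi''\|_\infty |s|$, and since $\varphi$ is bounded, $|k f_{x,\lambda}(k)| \le \|\varphi\|_\infty |k|^{1/2}\chi(k)$. Both contributions are therefore bounded by a constant times $|k|^{1/2}\langle x\rangle$, which is the infrared improvement over the original form factor $e_\lambda(k)e^{-ikx}\chi(k)$.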
Using the fact that the operators $A_1$ and $G$ have much better infrared behavior than the original vector potential $A$, we can use our approach and prove the limiting absorption principle for $H_g^{PF}$ and $B$:
\[ \langle B\rangle^{-\theta}(H_g^{PF} - z)^{-1}\langle B\rangle^{-\theta} \text{ is Hölder continuous in } z, \tag{2.8} \]
where the ranges of the parameters $g$, $z$ and $\theta$ and the type of Hölder continuity are quantified in Theorem 1.1. Now we show that (2.8) and an additional restriction on the spectral interval imply the limiting absorption principle for $H_g^{SM}$. Let $B_1 := e^{igF(x)}Be^{-igF(x)}$. We compute
\[ B_1 = B + gC \tag{2.9} \]
where $C := i[F(x), B]$. Note that the operator $C$ contains a term proportional to $x$. Let $f$ be a real-valued function supported in $(-\infty, \Sigma_p)$. Then, using that $(H_g - z)^{-1} = e^{igF}(H_g^{PF} - z)^{-1}e^{-igF}$, we obtain $\langle B\rangle^{-\theta}f(H_g)^2(H_g - z)^{-1}\langle B\rangle^{-\theta} = DE(z)D^*$, where $D := \langle B\rangle^{-\theta}f(H_g)\langle B_1\rangle^{\theta}e^{igF(x)}$ and $E(z) := \langle B\rangle^{-\theta}(H_g^{PF} - z)^{-1}\langle B\rangle^{-\theta}$. The operator $D$ is bounded by standard operator calculus estimates and the
fact that $e^{\delta\langle x\rangle}f(H_g)$ is bounded, for $\delta > 0$ sufficiently small. Furthermore, the operator family $E(z)$ is Hölder continuous by result (2.8). Now, for $z \in (-\infty, \Sigma_p - \varepsilon]$, for some $\varepsilon > 0$, the previous conclusion remains true even if we remove the cut-off function $f(H_g)$. For further reference, we mention that the operator (2.3) can be written as
\[ H_g^{PF} = H_{0g}^{PF} + I_g^{PF}, \tag{2.10} \]
where
\[ H_{0g}^{PF} = H_0 + \frac{g^2}{2}\sum_\lambda\int\left(|f_{x,\lambda}(k)|^2 + \frac{|\chi_{\lambda,x}(k)|^2}{2|k|}\right)d^3k, \tag{2.11} \]
with $H_0 := H^{SM}_{g=0} = H_p + H_f$ (see (1.1)), and $I_g^{PF}$ is defined by this relation. (The term $\frac{g^2}{2}\sum_\lambda\int\frac{|\chi_{\lambda,x}(k)|^2}{2|k|}\,d^3k$ comes from requiring $I_g^{PF}$ to have the normal form.) Note that the operator $I_g^{PF}$ contains linear and quadratic terms in creation and annihilation operators, with coupling functions (form-factors) in the linear terms satisfying estimate (2.6) and with coupling functions in the quadratic terms satisfying a similar estimate. Moreover, the operator $H_{0g}^{PF}$ is of the form $H_{0g}^{PF} = H_{pg}^{PF} + H_f$ where
\[ H_{pg}^{PF} := H_p + \frac{g^2}{2}\sum_\lambda\int\left(|f_{x,\lambda}(k)|^2 + \frac{|\chi_{\lambda,x}(k)|^2}{2|k|}\right)d^3k, \tag{2.12} \]
where $H_p$ is given in (1.2).
3. The Smooth Feshbach–Schur Map
In this section, we review and extend, in a simple but important way, the method of isospectral decimations or Feshbach–Schur maps introduced in [6, 7] and refined in [3]. For further extensions, see [23]. At the root of this method is the isospectral smooth Feshbach–Schur map acting on a set of closed operators and mapping a given operator to one acting on a much smaller space expected to be easier to handle. Let $\chi$, $\bar\chi$ be a partition of unity on a separable Hilbert space $\mathcal{H}$, i.e. $\chi$ and $\bar\chi$ are positive operators on $\mathcal{H}$ whose norms are bounded by one, $0 \le \chi, \bar\chi \le 1$, and $\chi^2 + \bar\chi^2 = 1$. We assume that $\chi$ and $\bar\chi$ are nonzero. Let $\tau$ be a (linear) projection acting from (a subspace of) the space of closed operators on $\mathcal{H}$ to itself, mapping self-adjoint operators into self-adjoint ones and having the property that operators from its image commute with $\chi$ and $\bar\chi$. It is also convenient to assume that $\tau(1) = 1$. Assume that $\tau$ and $\chi$ (and therefore also $\bar\chi$) leave $D(H)$ invariant, $D(\tau(H)) \supseteq D(H)$, and $\chi D(H) \subset D(H)$. Let $\bar\tau := 1 - \tau$ and define
\[ H_{\tau,\chi^{\#}} := \tau(H) + \chi^{\#}\,\bar\tau(H)\,\chi^{\#}, \tag{3.1} \]
where $\chi^{\#}$ stands for either $\chi$ or $\bar\chi$.
Given $\chi$ and $\tau$ as above, we denote by $D_{\tau,\chi}$ the space of closed operators, $H$, on $\mathcal{H}$ which belong to the domain of $\tau$ and satisfy the following conditions:
\[ \tau(H),\ H_{\tau,\bar\chi} \text{ are (bounded) invertible on } \operatorname{Ran}\bar\chi, \qquad \bar\tau(H)\chi \text{ and } \chi\bar\tau(H) \text{ extend to bounded operators on } \mathcal{H}. \tag{3.2} \]
(For more general conditions see [3, 23].) We define $H_0 := \tau(H)$ and $W := \bar\tau(H)$. Then $H_0$ and $W$ are closed operators on $\mathcal{H}$ such that $D(H_0), D(W) \supseteq D(H)$, and $H = H_0 + W$. We remark that the domains of $\chi W\chi$, $\bar\chi W\bar\chi$, $H_{\tau,\chi}$, and $H_{\tau,\bar\chi}$ all contain $D(H)$. The smooth Feshbach–Schur map (FSM) maps operators on $\mathcal{H}$ to operators on $\mathcal{H}$ by $H \mapsto F_{\tau,\chi}(H)$, where
\[ F_{\tau,\chi}(H) := H_0 + \chi W\chi - \chi W\bar\chi\,H_{\tau,\bar\chi}^{-1}\,\bar\chi W\chi. \tag{3.3} \]
Clearly, it is defined for all $H \in D_{\tau,\chi}$.
Remarks. (i) The definition of the smooth Feshbach–Schur map given above is the same as in [20] and differs from the one given in [3]. In [3], the map $F_{\tau,\chi}(H)$ is denoted by $F_\chi(H, \tau(H))$ and the pair of operators $(H, \tau(H))$ is referred to as a Feshbach pair. (ii) The Feshbach–Schur map is obtained from the smooth Feshbach–Schur map by specifying $\chi$ = projection and, usually, $\tau = 0$.
Next we define two operators entering identities involving the Feshbach–Schur map:
\[ Q_{\tau,\chi}(H) := \chi - \bar\chi\,H_{\tau,\bar\chi}^{-1}\,\bar\chi W\chi, \tag{3.4} \]
\[ Q^{\#}_{\tau,\chi}(H) := \chi - \chi W\bar\chi\,H_{\tau,\bar\chi}^{-1}\,\bar\chi. \tag{3.5} \]
Note that $Q_{\tau,\chi}(H) \in B(\operatorname{Ran}\chi, \mathcal{H})$ and $Q^{\#}_{\tau,\chi}(H) \in B(\mathcal{H}, \operatorname{Ran}\chi)$ (see (3.2)). The smooth Feshbach map $F_{\tau,\chi}(H)$ of $H$ is isospectral to $H$ in the sense of the following theorem.
Theorem 3.1. Let $\chi$ and $\tau$ be as above, with properties as specified above, and let $H \in D_{\tau,\chi}$. Then $F_{\tau,\chi}(H)$ has the following properties:
(i) $0 \in \rho(H) \Leftrightarrow 0 \in \rho(F_{\tau,\chi}(H))$, i.e. $H$ is bounded invertible on $\mathcal{H}$ if and only if $F_{\tau,\chi}(H)$ is bounded invertible on $\operatorname{Ran}\chi$.
(ii) If $\psi \in \mathcal{H}\setminus\{0\}$ solves $H\psi = 0$ then $\varphi := \chi\psi \in \operatorname{Ran}\chi\setminus\{0\}$ solves $F_{\tau,\chi}(H)\varphi = 0$.
(iii) If $\varphi \in \operatorname{Ran}\chi\setminus\{0\}$ solves $F_{\tau,\chi}(H)\varphi = 0$ then $\psi := Q_{\tau,\chi}(H)\varphi \in \mathcal{H}\setminus\{0\}$ solves $H\psi = 0$.
(iv) The multiplicity of the spectral value $\{0\}$ is preserved in the sense that $\dim\operatorname{Ker}H = \dim\operatorname{Ker}F_{\tau,\chi}(H)$.
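(v) If one of the inverses, $H^{-1}$ or $F_{\tau,\chi}(H)^{-1}$, exists then so does the other, and they are related as
\[ H^{-1} = Q_{\tau,\chi}(H)\,F_{\tau,\chi}(H)^{-1}\,Q_{\tau,\chi}(H)^{\#} + \bar\chi\,H_{\tau,\bar\chi}^{-1}\,\bar\chi, \tag{3.6} \]
and $F_{\tau,\chi}(H)^{-1} = \chi H^{-1}\chi + \bar\chi\,\tau(H)^{-1}\,\bar\chi$.
This theorem is proven in [3] (see [23] for a more general result). Another property used below is
\[ H \text{ is self-adjoint} \;\Rightarrow\; F_{\tau,\chi}(H) \text{ is self-adjoint}. \tag{3.7} \]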
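The identities in Theorem 3.1(v) can be checked exactly in a finite-dimensional toy model. The following sketch is an illustration under stated assumptions, not the setting of the paper: the Hilbert space is $\mathbb{C}^2$, $\chi$ and $\bar\chi$ are diagonal with rational entries chosen so that $\chi^2 + \bar\chi^2 = 1$, and $\tau$ is taken to be "diagonal part of a matrix", which is a projection with $\tau(1) = 1$ whose image commutes with $\chi$ and $\bar\chi$. All names (`chi`, `chibar`, `H0`, `W`, `F`, `Q`, `Qs`) and the matrix `H` are hypothetical choices made for this example.

```python
from fractions import Fraction as Fr
from functools import reduce

def mat_mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def prod(*Ms):
    """Product of several 2x2 matrices, taken left to right."""
    return reduce(mat_mul, Ms)

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def inv(A):
    """Exact inverse of a 2x2 matrix with rational entries."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ZeroDivisionError("matrix is singular")
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def diag(a, b):
    return [[a, Fr(0)], [Fr(0), b]]

# Toy data (assumptions of this sketch, not taken from the paper).
H = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]
chi = diag(Fr(3, 5), Fr(5, 13))        # smooth "projection"
chibar = diag(Fr(4, 5), Fr(12, 13))    # chi^2 + chibar^2 = 1 entrywise

H0 = diag(H[0][0], H[1][1])            # tau(H): diagonal part of H
W = sub(H, H0)                         # taubar(H) = H - tau(H)

H_chibar = add(H0, prod(chibar, W, chibar))     # (3.1) with chi^# = chibar
R = inv(H_chibar)

# Smooth Feshbach-Schur map (3.3) and the operators (3.4), (3.5).
F = sub(add(H0, prod(chi, W, chi)),
        prod(chi, W, chibar, R, chibar, W, chi))
Q = sub(chi, prod(chibar, R, chibar, W, chi))
Qs = sub(chi, prod(chi, W, chibar, R, chibar))

# Right-hand sides of the two identities in Theorem 3.1(v).
H_inv_via_F = add(prod(Q, inv(F), Qs), prod(chibar, R, chibar))
F_inv_via_H = add(prod(chi, inv(H), chi), prod(chibar, inv(H0), chibar))

print(H_inv_via_F == inv(H))
print(F_inv_via_H == inv(F))
```

Rational arithmetic is used so that the comparisons are exact rather than up to rounding. Because $\chi$ is invertible in this toy model, $\operatorname{Ran}\chi$ is the whole space; choosing `chi = diag(1, 0)` and `chibar = diag(0, 1)` instead reduces (3.3), restricted to $\operatorname{Ran}\chi$, to the usual Schur complement.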
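Next, we establish a key result relating smoothness of the resolvent of an operator on its continuous spectrum with smoothness of the resolvent of its image under a smooth Feshbach–Schur map. In what follows $\Delta$ stands for an open interval in $\mathbb{R}$.
Theorem 3.2. Let $\Delta \subset \mathbb{R}$ and, $\forall\,\lambda \in \Delta$, $H(\lambda)$ be a $C^1$ family of self-adjoint operators, such that $H(\lambda) \in D_{\tau,\chi}$ (see (3.2)). Assume that there is a self-adjoint operator $B$ such that
\[ \mathrm{ad}^j_B(A) \text{ is bounded and differentiable in } \lambda, \quad \forall\, j \le 2, \tag{3.8} \]
where $A$ is one of the operators $\chi$, $\bar\chi$, $\chi\bar\tau(H(\lambda))$, $\bar\tau(H(\lambda))\chi$, $\partial^k_\lambda(\bar\chi H_{\tau,\bar\chi}(\lambda)^{-1}\bar\chi)$, $k = 0, 1$. Then, for any $0 \le \nu \le 1$ and $0 < \theta \le 1$ and in the operator norm, we have
\[ \lim_{\varepsilon\to0+}\langle B\rangle^{-\theta}(F_{\tau,\chi}(H(\lambda)) - i\varepsilon)^{-1}\langle B\rangle^{-\theta} \text{ exists and } \in C^{\nu}(\Delta) \tag{3.9} \]
\[ \Rightarrow\ \lim_{\varepsilon\to0+}\langle B\rangle^{-\theta}(H(\lambda) - i\varepsilon)^{-1}\langle B\rangle^{-\theta} \text{ exists and } \in C^{\nu}(\Delta). \tag{3.10} \]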
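Proof. Let $B_\theta := \langle B\rangle^{-\theta}$. Since $\tau(1) = 1$, we have that $(H(\lambda) - i\varepsilon)_{\tau,\chi^{\#}} = H(\lambda)_{\tau,\chi^{\#}} - i\varepsilon$, where $\chi^{\#}$ is either $\chi$ or $\bar\chi$. Furthermore, since $H(\lambda) \in D_{\tau,\chi}$ the operator family $[(H(\lambda) - i\varepsilon)_{\tau,\bar\chi}]^{-1}$, on $\operatorname{Ran}\bar\chi$, is differentiable in $\lambda$ and analytic in $\varepsilon$ and can be expanded as $[(H(\lambda) - i\varepsilon)_{\tau,\bar\chi}]^{-1} = [H(\lambda)_{\tau,\bar\chi}]^{-1} + i\varepsilon[H(\lambda)_{\tau,\bar\chi}]^{-1}\bar\chi^2[H(\lambda)_{\tau,\bar\chi}]^{-1} + O(\varepsilon^2)$. This together with (3.3) implies that $F_{\tau,\chi}(H(\lambda) - i\varepsilon) = F_{\tau,\chi}(H(\lambda)) - i\varepsilon\kappa + O(\varepsilon^2)$, where $\kappa := 1 + \chi W(\lambda)\bar\chi[H(\lambda)_{\tau,\bar\chi}]^{-1}\bar\chi^2[H(\lambda)_{\tau,\bar\chi}]^{-1}\bar\chi W(\lambda)\chi$, which, in turn, together with (3.9), gives, in the norm topology,
\[ \lim_{\varepsilon\to0+} B_\theta[F_{\tau,\chi}(H(\lambda) - i\varepsilon)]^{-1}B_\theta \text{ exists and } \in C^{\nu}(\Delta). \tag{3.11} \]
Next, conditions (3.8) and the formula $B_\theta = C_\theta\int_0^\infty\frac{d\omega}{\omega^{\theta/2}}(\omega + 1 + B^2)^{-1}$, where $C_\theta := \bigl[\int_0^\infty\frac{d\omega}{\omega^{\theta/2}}(\omega + 1)^{-1}\bigr]^{-1}$, imply that the operators
\[ B_\theta\chi B_\theta^{-1}, \qquad B_\theta\bar\chi B_\theta^{-1}, \qquad B_\theta[H(\lambda)_{\tau,\bar\chi}]^{-1}B_\theta^{-1} \tag{3.12} \]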
and the transposed operators (i.e. $B_\theta^{-1}\chi B_\theta$, etc.) are bounded and differentiable in $\lambda$. Let $Q := Q_{\tau\chi}$ and $Q^\# := Q^\#_{\tau\chi}$. The above property shows that $B_\theta Q B_\theta^{-1}$ and $B_\theta^{-1} Q^\# B_\theta$ are bounded and differentiable in $\lambda \in \Delta$. This, together with (3.6), implies

$$B_\theta (H(\lambda) - i\varepsilon)^{-1} B_\theta = (B_\theta Q B_\theta^{-1})\, B_\theta (F_{\tau,\chi}(H(\lambda)) - i\varepsilon)^{-1} B_\theta\, (B_\theta^{-1} Q^\# B_\theta) + B_\theta\, \chi H(\lambda)_{\chi}^{-1} \chi\, B_\theta .$$

The last relation, together with (3.11), implies (3.10).

Remark. Another way to relate the boundary values of the resolvents of $F_{\tau,\chi}(H)$ and $H$ (also starting with (3.6)) is given in [3, Theorem 2.1(v)].

4. A Banach Space of Hamiltonians

We construct a Banach space of Hamiltonians on which a renormalization transformation, involving the smooth Feshbach–Schur map, is defined. To simplify matters we will think of the creation and annihilation operators used below as scalar operators. We explain at the end of the supplement how to reinterpret the corresponding expressions for the creation and annihilation operators of the electromagnetic field.

Let $B_1$ denote the unit ball in $\mathbb{R}^3$ and $B_1^d$ the $d$-fold product of $B_1$. Let $I := [0, 1]$ and $m, n \ge 0$. Given functions $w_{0,0} : [0, \infty) \to \mathbb{C}$ and $w_{m,n} : I \times B_1^{m+n} \to \mathbb{C}$, $m + n > 0$, we consider monomials, $W_{m,n} \equiv W_{m,n}[w_{m,n}]$, in creation and annihilation operators of the form $W_{0,0} := w_{0,0}[H_f]$ (defined by operator calculus), and

$$W_{m,n}[w_{m,n}] := \int_{B_1^{m+n}} \frac{dk_{(m,n)}}{|k_{(m,n)}|^{1/2}}\; a^*(k_{(m)})\, w_{m,n}[H_f; k_{(m,n)}]\, a(\tilde k_{(n)}), \tag{4.1}$$

for $m + n > 0$. Here we use the notation

$$k_{(m)} := (k_1, \dots, k_m) \in \mathbb{R}^{3m}, \qquad a^*(k_{(m)}) := \prod_{i=1}^{m} a^*(k_i), \tag{4.2}$$

$$k_{(m,n)} := (k_{(m)}, \tilde k_{(n)}), \qquad dk_{(m,n)} := \prod_{i=1}^{m} d^3k_i \prod_{i=1}^{n} d^3\tilde k_i, \tag{4.3}$$

$$|k_{(m,n)}| := |k_{(m)}| \cdot |\tilde k_{(n)}|, \qquad |k_{(m)}| := |k_1| \cdots |k_m|. \tag{4.4}$$
We assume that, for every $m$ and $n$ with $m + n > 0$, the function $w_{m,n}[r, k_{(m,n)}]$ is $1 \le \sigma \le 2$ times continuously differentiable in $r \in I$, for almost every $k_{(m,n)} \in B_1^{m+n}$, and weakly differentiable in $k_{(m,n)} \in B_1^{m+n}$, for almost every $r$ in $I$. As a function of $k_{(m,n)}$, it is totally symmetric with respect to the variables $k_{(m)} = (k_1, \dots, k_m)$ and $\tilde k_{(n)} = (\tilde k_1, \dots, \tilde k_n)$ and obeys the norm bound

$$\|w_{m,n}\|_{\mu,\sigma} := \sum_{n + |q| \le \sigma} \|\partial_r^n (k\partial_k)^q w_{m,n}\|_\mu < \infty, \tag{4.5}$$
March 23, J070-S0129055X11004266
190
2011 10:41 WSPC/S0129-055X
148-RMP
J. Fr¨ ohlich, M. Griesemer & I. M. Sigal
where $q := (q_1, \dots, q_{m+n})$, $(k\partial_k)^q := \prod_{j=1}^{m+n} (k_j \cdot \nabla_{k_j})^{q_j}$, with $k_{m+j} := \tilde k_j$, and where the sum is taken over the indices $n$ and $q$ satisfying $0 \le n + |q| \le \sigma$, and

$$\|w_{m,n}\|_\mu := \max_j \sup_{r \in I,\; k_{(m,n)} \in B_1^{m+n}} \big|\, |k_j|^{-\mu}\, w_{m,n}[r; k_{(m,n)}] \,\big|, \tag{4.6}$$

for $\mu \ge 0$. Here, and in what follows, $k_j$ is the $j$th (3-dimensional) entry of the $k$-vector $k_{(m,n)}$ over which we take the supremum. For $m + n = 0$, the variable $r$ ranges in $[0, \infty)$ and we assume that the following norm is finite:

$$\|w_{0,0}\|_{\mu,\sigma} := |w_{0,0}(0)| + \sum_{1 \le n \le \sigma} \sup_{r \in I} |\partial_r^n w_{0,0}(r)| \tag{4.7}$$

(for $\sigma = 0$ we drop the sum on the right-hand side). This norm is independent of $\mu$, but we keep this index for notational convenience. The Banach space of functions $w_{m,n}$, with $\|w_{m,n}\|_{\mu,\sigma} < \infty$, is denoted by $\mathcal{W}_{m,n}^{\mu,\sigma}$. Moreover, $W_{m,n}[w_{m,n}]$ stresses the dependence of $W_{m,n}$ on $w_{m,n}$. In particular, $W_{0,0}[w_{0,0}] := w_{0,0}[H_f]$.

We fix three numbers $\mu$, $0 < \xi < 1$ and $\sigma \ge 0$ and define the Banach space

$$\mathcal{W}_\xi^{\mu,\sigma} := \bigoplus_{m+n \ge 0} \mathcal{W}_{m,n}^{\mu,\sigma}, \tag{4.8}$$

with the norm

$$\|w\|_{\mu,\sigma,\xi} := \sum_{m+n \ge 0} \xi^{-(m+n)} \|w_{m,n}\|_{\mu,\sigma} < \infty. \tag{4.9}$$

Clearly, $\mathcal{W}_{\xi'}^{\mu',\sigma'} \subset \mathcal{W}_\xi^{\mu,\sigma}$ if $\mu' \ge \mu$, $\sigma' \ge \sigma$ and $\xi' \le \xi$.

Remarks. (1) Though we use the same notation, the Banach spaces $\mathcal{W}_\xi^{\mu,\sigma}$, etc., introduced above differ from the ones used in [45, 20]. The latter are obtained from the former by setting $q = 0$ in (4.5). To extend estimates of [45, 20] to the present setting one has to estimate the effect of the derivatives $(k\partial_k)^q$, which is straightforward.

(2) Actually, one can consider smaller Banach spaces, with $B_1^d$ in (4.1) replaced by $S_1^d := \{k = (k_1, \dots, k_d) \in \mathbb{R}^{3d} \mid \sum_{i=1}^{d} |k_i| \le 1\}$.

Let $\chi_\rho(r)$ be a smooth cut-off function such that $\operatorname{supp} \chi_\rho(r) \subset [0, \rho]$ and $0 \le \chi_\rho(r) \le 1$, and let $\chi_\rho := \chi_\rho(H_f)$. The following basic bound, proven in [3] (see also [26]), links the norm defined in (4.6) to the operator norm on $B[\mathcal{F}]$, where $B[\mathcal{F}]$ is the algebra of all bounded operators on Fock space $\mathcal{F}$.

Theorem 4.1. Fix $m, n \in \mathbb{N}_0$ such that $m + n \ge 1$. Suppose that $w_{m,n} \in \mathcal{W}_{m,n}^{\mu,0}$, and let $W_{m,n} \equiv W_{m,n}[w_{m,n}]$ be as defined in (4.1). Then $\forall\, \nu > 0$

$$\|(H_f + \nu)^{-m/2}\, W_{m,n}\, (H_f + \nu)^{-n/2}\| \le \|w_{m,n}\|_0, \tag{4.10}$$
and (as a consequence)

$$\|\chi_\rho W_{m,n} \chi_\rho\| \le \frac{\rho^{(m+n)(1+\mu)}}{\sqrt{m!\,n!}}\, \|w_{m,n}\|_\mu, \tag{4.11}$$

where $\|\cdot\|$ denotes the operator norm on $B[\mathcal{F}]$.

Theorem 4.1 says that the finiteness of $\|w_{m,n}\|_\mu$ ensures that $\chi_\rho W_{m,n} \chi_\rho$ defines a bounded operator on $\mathcal{F}$. With a sequence $w := (w_{m,n})_{m+n \ge 0}$ in $\mathcal{W}_\xi^{\mu,\sigma}$ we associate an operator by setting

$$H(w) := W_{0,0}[w] + \sum_{m+n \ge 1} \chi_1 W_{m,n}[w] \chi_1, \tag{4.12}$$

where we write $W_{m,n}[w] := W_{m,n}[w_{m,n}]$. This form of operators on Fock space will be called generalized normal (or Wick) form. Theorem 4.1 shows that the series in (4.12) converges in the operator norm and obeys the estimate

$$\|H(w) - W_{0,0}(w)\| \le \xi\, \|w_1\|_{\mu,0,\xi}, \tag{4.13}$$

for any $w = (w_{m,n})_{m+n \ge 0} \in \mathcal{W}_\xi^{\mu,0}$. Here $w_1 = (w_{m,n})_{m+n \ge 1}$. Hence we have the linear map

$$H : w \mapsto H(w) \tag{4.14}$$

from $\mathcal{W}_\xi^{\mu,0}$ into the set of closed operators on Fock space $\mathcal{F}$. Furthermore the following result was proven in [3].

Theorem 4.2. For any $\mu \ge 0$ and $0 < \xi < 1$, the map $H : w \mapsto H(w)$, given in (4.12), is one-to-one.

Define the spaces $\mathcal{W}_{op}^{\mu,\sigma} := H(\mathcal{W}_\xi^{\mu,\sigma})$ and $\mathcal{W}_{mn,op}^{\mu,\sigma} := H(\mathcal{W}_{mn}^{\mu,\sigma})$. Theorem 4.2 implies that $\mathcal{W}_{op}^{\mu,\sigma}$ is a Banach space under the norm $\|H(w)\|_{\mu,\sigma,\xi} := \|w\|_{\mu,\sigma,\xi}$. Similarly, the other spaces defined above are Banach spaces in the corresponding norms. Finally, we mention the following elementary result, which follows from [3, Eq. (3.33)]:

Proposition 4.3. If $H(w)$ is self-adjoint, then so are $W_{0,0}[w]$ and $\sum_{m+n \ge 1} \chi_1 W_{m,n}[w] \chi_1$ (see (4.12)).

5. The Renormalization Transformation $R_\rho$

In this section, we present an operator-theoretic renormalization transformation based on the smooth Feshbach–Schur map, closely related to the one defined in [3, 6, 7]. We fix the parameter $\mu$ of our Banach spaces at some positive value, $\mu > 0$.
The renormalization transformation is homothetic to an isospectral map defined on a subset of a suitable Banach space of Hamiltonians. It has a certain contraction property that ensures that (upon an appropriate tuning of the spectral parameter) its iteration converges to a fixed-point (limiting) Hamiltonian, whose spectral analysis is particularly simple. Thanks to the isospectrality of the renormalization map, certain properties of the spectrum of the initial Hamiltonian can be studied by analyzing the limiting Hamiltonian.

The renormalization map is defined below as a composition of a decimation map, $F_\rho$, and a rescaling map, $S_\rho$. Here $\rho$ is a positive parameter, the photon energy scale, to be chosen later.

The decimation of degrees of freedom is accomplished by the smooth Feshbach map, $F_{\tau,\chi}$. Except in the first step, the decimation map will act on the Banach space $\mathcal{W}_{op}^{\mu,\sigma}$. We define a smooth partition of unity, $\{\chi_\rho, \bar\chi_\rho\}$, $0 < \rho \le 1$, $\chi_\rho^2 + \bar\chi_\rho^2 = 1$, consisting of operators on $\mathcal{F}$, given by

$$\chi_\rho \equiv \chi_1(H_f/\rho) \quad\text{and}\quad \bar\chi_\rho \equiv \bar\chi_1(H_f/\rho), \tag{5.1}$$

where $\{\chi_1(r), \bar\chi_1(r)\}$ form a smooth partition of unity, $\chi_1(r)^2 + \bar\chi_1(r)^2 = 1$, with the properties $\chi_1(r) = 1$ for $r \le 1$, $\chi_1(r) = 0$ for $r \ge 11/10$, $0 \le \chi_1(r) \le 1$ and $\sup |\partial_r^n \chi_1(r)| \le 30$, $\forall\, r$ and for $n = 1, 2$. The operators $\tau$ and $\chi$ will be chosen as

$$\tau(H) = W_{00} := w_{00}(H_f) \quad\text{and}\quad \chi = \chi_\rho, \tag{5.2}$$

where $H = H(w)$ is given in Eq. (4.12). Note that by Proposition 4.3, $\tau$ maps self-adjoint operators into self-adjoint operators. With $\tau$ and $\chi$ identified in this way, we will use the notation

$$F_\rho \equiv F_{\tau,\chi_\rho}. \tag{5.3}$$

Let $w_1 := (w_{m,n})_{m+n \ge 1}$. The following lemma shows that the domain of this map contains the following polydisc in $\mathcal{W}_{op}^{\mu,\sigma}$:

$$\mathcal{D}^{\mu,\sigma}(\alpha, \beta, \gamma) := \Big\{ H(w) \in \mathcal{W}_{op}^{\mu,\sigma} \;\Big|\; |w_{0,0}[0]| \le \alpha,\ \sup_{r \in [0,\infty)} |w'_{0,0}[r] - 1| \le \beta,\ \|w_1\|_{\mu,\sigma,\xi} \le \gamma \Big\}, \tag{5.4}$$

for appropriate numbers $\alpha, \beta, \gamma > 0$.

Lemma 5.1. Fix $0 < \rho < 1$, $\mu > 0$, and $0 < \xi < 1$. Then it follows that the polydisc $\mathcal{D}^{\mu,1}(\rho/8, 1/8, \rho/8)$ is in the domain of the Feshbach–Schur map $F_\rho$.

Proof. Let $H(w) \in \mathcal{D}^{\mu,1}(\rho/8, 1/8, \rho/8)$. We observe that $W := H[w] - W_{0,0}[w]$ defines a bounded operator on $\mathcal{F}$, and we only need to check the invertibility of $H(w)_{\tau,\bar\chi_\rho}$ on $\operatorname{Ran} \bar\chi_\rho$. Now, the operator $W_{0,0}[w]$ is invertible on $\operatorname{Ran} \bar\chi_\rho$, since, for
all $r \in [3\rho/4, \infty)$,

$$\operatorname{Re} w_{0,0}[r] \ge r - |w_{0,0}[r] - r| \ge r\Big(1 - \sup_r |w'_{0,0}[r] - 1|\Big) - |w_{0,0}[0]| \ge \frac{3\rho}{4}\Big(1 - \frac{1}{8}\Big) - \frac{\rho}{8} \ge \frac{\rho}{2}. \tag{5.5}$$

Furthermore, by (4.11), $\|W\| \le \xi\rho/8 \le \rho/8$. Hence $\operatorname{Re}(W_{0,0}[w] + W) \ge \frac{\rho}{3}$ on $\operatorname{Ran} \bar\chi_\rho$, i.e. $H(w)_{\tau,\bar\chi_\rho}$ is invertible on $\operatorname{Ran} \bar\chi_\rho$.

We introduce the scaling transformation $S_\rho : B[\mathcal{F}] \to B[\mathcal{F}]$, by

$$S_\rho(\mathbf{1}) := \mathbf{1}, \qquad S_\rho(a^\#(k)) := \rho^{-3/2}\, a^\#(\rho^{-1} k), \tag{5.6}$$

where $a^\#(k)$ is either $a(k)$ or $a^*(k)$, and $k \in \mathbb{R}^3$. On the domain of the decimation map $F_\rho$ we define the renormalization map $R_\rho$ as

$$R_\rho := \rho^{-1} S_\rho \circ F_\rho. \tag{5.7}$$

Remark 5.2. The renormalization map above is different from the one defined in [3]. The map in [3] contains an additional change of the spectral parameter $\lambda := -\langle \Omega, H\Omega \rangle$, where, recall, $\Omega$ is the vacuum in our Fock space.

We mention here some properties of the scaling transformation. It is easy to check that $S_\rho(H_f) = \rho H_f$, and hence

$$S_\rho(\chi_\rho) = \chi_1 \quad\text{and}\quad \rho^{-1} S_\rho(H_f) = H_f, \tag{5.8}$$

which shows that the operator $H_f$ is a fixed point of $\rho^{-1} S_\rho$. Further, note that $E \cdot \mathbf{1}$, $E \in \mathbb{C}$, is expanded under the scaling map, $\rho^{-1} S_\rho(E \cdot \mathbf{1}) = \rho^{-1} E \cdot \mathbf{1}$, at a rate $\rho^{-1}$.

Next, we show that the interaction $W$ contracts under the scaling transformation. To this end we note that the scaling map $S_\rho$, restricted to $\mathcal{W}_{op,\xi}^{\mu,s}$, induces a scaling map, $s_\rho$, on $\mathcal{W}_\xi^{\mu,s}$ by

$$\rho^{-1} S_\rho(H(w)) =: H(s_\rho(w)), \tag{5.9}$$

where $s_\rho(w) := (s_\rho(w_{m,n}))_{m+n \ge 0}$, and, using (5.6) and the change of variables of integration in the expression for $W_{m,n}$, it is easy to verify that, for all $(m, n) \in \mathbb{N}_0^2$,

$$s_\rho(w_{m,n})[r, k_{(m,n)}] = \rho^{m+n-1}\, w_{m,n}[\rho r, \rho k_{(m,n)}]. \tag{5.10}$$

We note that, by Theorem 4.1, the operator norm of $W_{m,n}[s_\rho(w_{m,n})]$ is controlled by the norm

$$\|s_\rho(w_{m,n})\|_\mu = \max_j \sup_{r \in I,\; k \in B_1^{m+n}} \rho^{m+n-1}\, \frac{|w_{m,n}[\rho r, \rho k_{(m,n)}]|}{|k_j|^\mu} \le \rho^{m+n+\mu-1}\, \|w_{m,n}\|_\mu.$$
Hence, for $m + n \ge 1$, we have that

$$\|s_\rho(w_{m,n})\|_\mu \le \rho^\mu\, \|w_{m,n}\|_\mu. \tag{5.11}$$
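The scaling bound preceding (5.11) can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the kernel `w` is a hypothetical one-photon coupling function ($m + n = 1$) that depends on $k$ only through $\kappa = |k|$, and the supremum in the norm (4.6) is approximated on a finite grid. None of these choices come from the paper.

```python
def w(r, kappa, mu):
    """Hypothetical one-photon kernel (m + n = 1); kappa stands for |k|."""
    return kappa**mu * (1.0 + r) * (1.0 + kappa)


def s_rho(r, kappa, mu, rho, m_plus_n=1):
    """Rescaled kernel as in (5.10): rho^(m+n-1) * w[rho*r, rho*k]."""
    return rho**(m_plus_n - 1) * w(rho * r, rho * kappa, mu)


def mu_norm(kernel, mu, steps=200):
    """Grid approximation of the norm (4.6): sup over r in [0,1], 0 < |k| <= 1."""
    best = 0.0
    for i in range(steps + 1):
        r = i / steps
        for j in range(1, steps + 1):
            kappa = j / steps
            best = max(best, abs(kernel(r, kappa)) / kappa**mu)
    return best


mu, rho = 0.5, 0.1  # illustrative values only
norm_w = mu_norm(lambda r, kappa: w(r, kappa, mu), mu)
norm_s = mu_norm(lambda r, kappa: s_rho(r, kappa, mu, rho), mu)

# For m + n = 1 the bound reads ||s_rho(w)||_mu <= rho^mu * ||w||_mu.
print(norm_s, "<=", rho**mu * norm_w)
```

For this particular kernel the inequality holds with room to spare, because the rescaled kernel only samples $w$ on the smaller region $r \le \rho$, $|k| \le \rho$.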
Since $\mu > 0$, this estimate shows that $S_\rho$ is a contraction on the Banach spaces $\mathcal{W}_{m,n}^{\mu,s}$, $m + n \ge 1$, with contraction rate given by $\rho^\mu < 1$. Our next result shows that contraction is actually a key property of the renormalization map $R_\rho$ along "stable" directions. Recall that $\chi_1$ is the cut-off function introduced at the beginning of Sec. 3. We define a constant $C_\chi$ by

$$C_\chi := \frac{4}{3}\Big( \sum_{n=0}^{2} \sup |\partial_r^n \chi_1| + \sup |\partial_r \chi_1|^2 \Big) \le 200. \tag{5.12}$$

Theorem 5.3. Let $\epsilon_0 : H \mapsto \langle H \rangle_\Omega$, and $\mu > 0$. Then, for any $\sigma \ge 1$, $0 < \rho < 1/2$, $\alpha, \beta \le \frac{\rho}{8}$ and $\gamma \le \frac{\rho}{8 C_\chi}$, we have that

$$R_\rho - \rho^{-1} \epsilon_0 : \mathcal{D}^{\mu,\sigma}(\alpha, \beta, \gamma) \to \mathcal{D}^{\mu,\sigma}(\alpha', \beta', \gamma'), \tag{5.13}$$

continuously, with $\xi := \frac{\sqrt{\rho}}{4 C_\chi}$ in the definition of the corresponding norms, and

$$\alpha' = 3 C_\chi \gamma^2/(2\rho), \qquad \beta' = \beta + 3 C_\chi \gamma^2/(2\rho), \qquad \gamma' = 128\, C_\chi^2 \rho^\mu \gamma. \tag{5.14}$$

Moreover, if $H$ is self-adjoint, then so is $R_\rho(H)$.

With minor modifications, this theorem follows from [3, Theorem 3.8 and its proof], especially [Eqs. (3.104), (3.107) and (3.109)]. For the norms (4.5), with $q = 0$, it is presented in [45, Appendix I]. The generalization to the case with $q > 0$ is straightforward.

Remark 5.4. Subtracting the term $\rho^{-1} \epsilon_0$ from $R_\rho$ allows us to control the expanding direction during the iteration of the map $R_\rho$. In [3], such control was achieved by changing the spectral parameter $\lambda$, which controls $\langle H \rangle_\Omega$ (see [3, Appendix I]).

Recall that $B$ denotes the dilatation generator on Fock space $\mathcal{F}$ (see (1.5)).

Proposition 5.5. Let $\Delta$ be an open interval in $\mathbb{R}$, $\mu > 0$, $\rho$ and $\xi$ be as in Theorem 5.3 and let $H(\lambda) \in \mathcal{D}^{\mu,1}(\alpha, \beta, \gamma)$, with $\alpha, \gamma < \frac{\rho}{8}$, $\beta \le \frac{1}{8}$, and be analytic in $\lambda \in \Delta$ and self-adjoint $\forall\, \lambda \in \Delta$. Then, for $1 \ge \theta > 0$ and $\nu \ge 0$,

$$B^{-\theta} (R_\rho(H(\lambda)) - i0)^{-1} B^{-\theta} \in C^\nu(\Delta)$$

implies that

$$B^{-\theta} (H(\lambda) - i0)^{-1} B^{-\theta} \in C^\nu(\Delta). \tag{5.15}$$
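To see how the parameter map (5.14) of Theorem 5.3 behaves under repeated application, the following sketch iterates it numerically. It is only an illustration under stated assumptions: the value $C_\chi = 200$ is the upper bound from (5.12), while $\mu$, $\rho$ and the starting point are hypothetical values chosen so that the factor $128\, C_\chi^2 \rho^\mu$ is below 1; the function names are not from the paper.

```python
def rg_step(alpha, beta, gamma, rho, mu, c_chi):
    """One application of the parameter map (5.14)."""
    shift = 3.0 * c_chi * gamma**2 / (2.0 * rho)
    return shift, beta + shift, 128.0 * c_chi**2 * rho**mu * gamma


def iterate(alpha0, beta0, gamma0, rho, mu, c_chi, steps):
    """Return the list [(alpha_n, beta_n, gamma_n)] for n = 0..steps."""
    history = [(alpha0, beta0, gamma0)]
    for _ in range(steps):
        history.append(rg_step(*history[-1], rho, mu, c_chi))
    return history


# Illustrative values only: with mu = 1/2, rho must be tiny for contraction.
c_chi, mu, rho = 200.0, 0.5, 1e-14
factor = 128.0 * c_chi**2 * rho**mu          # contraction rate of gamma
gamma0 = rho / (8.0 * c_chi)                 # largest gamma allowed by Theorem 5.3
history = iterate(rho / 8.0, rho / 16.0, gamma0, rho, mu, c_chi, steps=5)

for n, (alpha, beta, gamma) in enumerate(history):
    print(n, alpha, beta, gamma)
```

In this run $\gamma_n$ decays geometrically, $\alpha_n$ becomes quadratically small in $\gamma_{n-1}$, and $\beta_n$ grows only by a summable amount, which is the qualitative picture used in the iteration of $R_\rho$.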
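Proof. We deduce the proof of this proposition from the following:

Lemma 5.6. Let $\chi_\rho^\#$ be either $\chi_\rho$ or $\bar\chi_\rho$. If $H \in \mathcal{W}_{op}^{\mu,1}$, then the operators

$$\mathrm{ad}_B^j(\chi_\rho^\#), \qquad H_f^{-1}\, \mathrm{ad}_B^j(W_{00}), \qquad \mathrm{ad}_B^j(H - W_{00}), \qquad j = 0, 1, \tag{5.16}$$

are bounded.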
Proof. The $j = 0$ part of the statement follows from the definitions. The $j = 1$ part follows from the following relations

$$[B, a^\#(k)] = \pm i\Big( k \cdot \nabla_k + \frac{3}{2} \Big) a^\#(k), \tag{5.17}$$

$$i[B, H_f] = H_f, \qquad i[B, f(H_f)] = H_f f'(H_f). \tag{5.18}$$

Indeed, using Eq. (5.18) and the Helffer–Sjöstrand operator calculus (see, e.g., [43]) one can easily derive the first two statements in (5.16) as well as the inequalities

$$\|\mathrm{ad}_B(w_{mn})\|_{\mu,0} \le c\, \|w_{mn}\|_{\mu,1}, \qquad m + n \ge 1. \tag{5.19}$$

These bounds, together with (5.17), imply that, for $m + n \ge 1$,

$$\|\mathrm{ad}_B(W_{mn})\|_{\mathcal{W}_{mn,op}^{\mu,0}} \le c\,(m + n + 1)\, \|W_{mn}\|_{\mathcal{W}_{mn,op}^{\mu,1}}. \tag{5.20}$$

The latter inequalities, together with (4.11) and the relation $H - W_{00} = \sum_{m+n \ge 1} \chi_1 W_{mn} \chi_1$, imply the last inequality in (5.16).

By Theorem 5.3, $\forall\, \lambda \in \Delta$, $H(\lambda) \in D(R_\rho)$. Moreover, by [20, Proposition C.1], inherited from a result of [24], and by Proposition 4.3, $W(\lambda) := \bar\tau(H(\lambda))$ is analytic in $\lambda \in \Delta$ and self-adjoint $\forall\, \lambda \in \Delta$. Lemma 5.6 implies that condition (3.8) of Theorem 3.2, with $\tau(H) := W_{00}$, is satisfied. Therefore the properties (3.9) and (3.10), with $\chi = \chi_\rho$, hold for $H(\lambda) \in C^1(\Delta, \mathcal{W}_{op}^{\mu,1}) \cap D(F_\rho)$. This and the invariance of the operator $B^{-\theta}$ under rescaling ($S_\rho$) yield the result.

6. Renormalization Group

In this section, we describe some dynamical properties of the "renormalization group" $R_\rho^n$, $\forall\, n \ge 1$ (more precisely, semi-group), generated by the renormalization map $R_\rho$. A closely related iteration scheme is used in [3].

First, we observe that, $\forall\, w \in \mathbb{C}$, $R_\rho(w H_f) = w H_f$ and $R_\rho(w \mathbf{1}) = \frac{1}{\rho} w \mathbf{1}$. This motivates us to define $\mathcal{M}_{fp} := \mathbb{C} H_f$ and $\mathcal{M}_u := \mathbb{C} \mathbf{1}$ as candidates for the manifold of fixed points of $R_\rho$ and the unstable manifold, emerging from $\mathcal{M}_{fp}$. The next theorem identifies the stable manifold passing through $\mathcal{M}_{fp}$, which turns out to be of (complex) codimension 1 and is foliated locally by (complex) codimension-2 stable manifolds passing through the fixed points in $\mathcal{M}_{fp}$. This implies in particular that, in a vicinity of $\mathcal{M}_{fp}$, there are no other fixed points and that $\mathcal{M}_u$ is the entire unstable manifold of $\mathcal{M}_{fp}$.

We require some further definitions. As an initial set of operators we take

$$\mathcal{D} := \mathcal{D}^{\mu,2}(\alpha_0, \beta_0, \gamma_0), \qquad \text{with } \alpha_0, \beta_0, \gamma_0 \ll 1. \tag{6.1}$$

(The choice $\sigma = 2$ of the smoothness index in the definition of the polydiscs is dictated by the needs of Mourre theory, which is applied in the next section.) We
also let Ds := Dµ,2 (0, β0 , γ0 )
(6.2)
(the subindex s stands for “stable”). For H ∈ D, we denote Hu := H Ω and Hs := H − H Ω 1 (the unstable- and stable-central-space components of H, respectively). Note that Hs ∈ Ds . We fix the scale ρ such that 1 2 α0 , β0 , γ0 ρ ≤ min , Cχ , (6.3) 2 where, we recall, the constant Cχ is defined in (5.12). Below, we will use the nth iteration of the numbers α0 , β0 and γ0 under the map (5.14): ∀ n ≥ 1 2 αn := c ρ−1 (cρµ )n−1 γ0 , β n = β0 +
n−1
2 c ρ−1 (cρµ )j γ0 ρ ,
j=1 µ n
γn = (cρ ) γ0 . Recall that a complex function f on an open set Ω in a complex Banach space W is said to be analytic if, for H ∈ Ω and ∀ ξ ∈ W, f (H + τ ξ) is analytic in the complex variable τ , for |τ | sufficiently small, see [10]. Our analysis uses the following result from ([20, Theorems 5.1 and 5.2, and the proof of Proposition 5.3]): Theorem 6.1. Let δn := νn ρn with 4αn ≤ νn ≤ e : Ds → C s.t.
1 18 .
There is an analytic map
(i) e(H) ∈ R, for H = H ∗ ; (ii) If Uδ := {H ∈ D||e(Hs ) + Hu | ≤ δ}, then Uδn ⊂ D(Rnρ )
and
Rnρ (Uδn ) ⊂ Dµ,2 (2νn , βn , γn );
(6.4)
(iii) Rnρ (H) is analytic in λ = −Hu ∈ D(e(H), δn ); (iv) For H ∈ D, the number E := e(Hs ) + Hu is the eigenvalue of the operator H with the smallest real part. (In particular, if H is self-adjoint, then it is the ground state energy of H.) Theorem 6.1 is proven in [20] for somewhat simpler Banach spaces (not containing the derivatives (k∂k )q in the definition of norms). However, an extension to the Banach spaces used in this paper is straightforward and is not dwelled upon here. Theorem 6.1 implies that Mf p := CHf is (locally) a manifold of fixed points of Rρ , and Mu := C1 is the unstable manifold. Moreover (see [20]), Uδn = {H ∈ D|e(Hs ) = −Hu } (6.5) Ms := n
is a local stable manifold for the fixed point manifold Mf p , in the sense that, ∀ H ∈ Ms , ∃w ∈ C such that Rnρ (H) → wHf ,
µ,2 in the sense of convergence in Wop ,
(6.6)
as n → ∞. Moreover, Ms is an invariant manifold for Rρ : Ms ⊂ D(Rρ ) and Rρ (Ms ) ⊂ Ms , though we do not need this property here (and therefore we do not prove it). 7. Mourre Estimate In this section, we prove the Mourre estimate for the operator-family H (n) (λ) := Rnρ (Hs − λ), where Hs = Hs∗ ∈ Ds , where Ds has been defined in (6.2). This will yield the limiting absorption principle for H (n) (λ). The latter is then transferred with the help of Theorem 3.2 to the limiting absorption principle for the operator H = Hs − λ. In Sec. 8, this limiting absorption principle will be connected to the limiting absorption principle for the family Hg − λ, where Hg is either HgP F or HgN . Denote ∆δ := [δ, ∞). Theorem 7.1. Let H(λ) = H(λ)∗ ∈ C 1 (∆, Dµ,2 (α, β, γ)), where ∆ is an open interval in R. If δ γ and β ≤ 13 , then B −θ (H(λ) − i0)−1 B −θ ∈ C ν (∆ ∩ E −1 (∆δ )),
(7.1)
where E : λ → E(λ) with E(λ) := H(λ) Ω , for any 1/2 < θ ≤ 1 and ν < θ − 12 . Proof. In what follows we omit the argument λ. Let E := w0,0 [0], T := w0,0 [Hf ] − w0,0 [0] and W := m+n≥1 χ1 Wm,n [w]χ1 , so that H = E1 + T + W . Let H1 := H − E = T + W. , where T := i [T, B] = T (Hf )Hf and W := i [W, B]. We write i [H1 , B] = T + W µ,σ Let W ∈ Wop,ξ , 1 ≤ σ ≤ 2. (In this argument, we display the index ξ in the space µ,σ := H(Wξµ,σ ).) By relation (5.17) we have that Wop,ξ µ,σ−1 ≤ cγ, W W op,ξ
with ξ < ξ.
The shift in the smoothness index from σ to σ − 1 is due to the fact that the = i [W, B] are (k · ∇k + 3(m+n) )wm,n (r, k), coupling functions for the operator W 2 (m,n) , and one therefore looses one derivative, as compared to the where k := k coupling functions, wm,n (r, k), of W . We write i [H1 , B] =
1 1 − 1 W. H1 + T − T + W 2 2 2
µ,0 Recalling that the operator norm is dominated by the Wop -norm, we see that the last two terms are bounded as − 1 W ≤ Cγ. W (7.2) 2
Furthermore, using the estimate |T (r) − 1| < β and the definition of T, we find that 1 1 1 T(r) − T (r) ≥ (1 − β)r − (1 + β)r = (1 − 3β)r 2 2 2 and therefore 1 T − T ≥ 2
1 T(r) − T (r) 0≤r≤∞ 2 inf
1 (1 − 3β)r = 0. 2 This gives [H1 , B] ≥ 12 H1 − cγ, and therefore, for ∆ := 12 δ, ∞ , δ γ, ≥
inf
0≤r≤∞
E∆ (H1 )i [H1 , B] E∆ (H1 ) ≥
1 δE∆ (H1 )2 . 4
(7.3)
This proves the Mourre estimate for the operator H1 ≡ H1 (λ). Moreover, since H(λ) ∈ C ∞ (∆, Dµ,2 (α, β, γ)), we conclude that the commutators [H1 , B] and [[H1 , B], B] are bounded relative to the operator H1 ; (this is guaranteed by choosing the index σ = 2 for the polydisc Dµ,σ (α, β, γ)). Hence the standard Mourre theory is applicable and gives H¨ older continuity of the resolvent, sandwiched between factors B −θ , in the spectral parameter as well as in the “operator” (see [43]): B −θ R1 (λ, ω)E∆ (H1 (λ))B −θ ∈ C ν (∆ × R),
(7.4)
where R1 (λ, ω) := (H1 (λ) − ω)−1 and ν < θ − 1/2. Here the argument λ has been reintroduced in our notation, and we have transferred the H¨ older continuity from the operator H1 (λ) to its argument λ. Since B −θ R1 (λ, ω)B −θ = B −θ R1 (λ, ω)E∆ (H1 (λ))B −θ + B −θ R1 (λ, ω)(1 − E∆ (H1 (λ)))B −θ
(7.5)
and since the last term on the right-hand side is C ν (∆) in λ and C ∞ (∆δ ) in σ, we conclude from (7.4) that B −θ R1 (λ, ω)B −θ ∈ C ν (∆ × ∆δ ).
(7.6)
Now take σ = E(λ) + i0. Since by the condition of the theorem E(λ) := H(λ) Ω ∈ C ∞ (∆), we conclude that (7.1) holds. In the previous section the parameter δn was allowed to vary over a certain range (see Theorem 6.1). In this section we make a particular choice of δn , namely 1 n ρ . δn := 18 The definitions of the sets Ds and Uδ appearing below have been given at the beginning of Sec. 6 and in Theorem 6.1, respectively.
1 n Theorem 7.2. Assume (6.3). Let n ≥ 1, δn := 18 ρ and let Hs = Hs∗ ∈ Ds and ρ ∆δn := e(Hs ) + δn , e(Hs ) + δn . 8 Then
B −θ (Hs − λ − i0)−1 B −θ ∈ C ν (∆δn )
(7.7)
for any 1/2 < θ ≤ 1 and ν < θ − 12 . Proof. Let Bθ := B −θ and let Dn be the disc of the radius δn centered at e(Hs ). Since ∆δn ⊂ {z ∈ R||e(Hs ) − z| ≤ δn }, we have that Hs − λ ∈ Uδn , for λ ∈ ∆δn . By Theorem 6.1, we have that the operator H (n) (λ) := Rnρ (Hs − λ) is well defined and 1 (n) µ,2 ρ, βn−1 , γn−1 H (λ) ∈ D is analytic in λ ∈ ∆δn , 8 H (n) (λ) = H (n) (λ)∗ ,
∀ λ ∈ ∆δn ∩ R.
Hence, since ρ γn−1 , βn−1 ≤ 13 , due to (6.3), we have, by Theorem 7.1, that Bθ (H (n) (λ) − i0)−1 Bθ ∈ C ν (∆δn ∩ En−1 (∆δ )),
(7.8)
where 0 ≤ ν < θ − 1/2δ γn−1 , and, as before, En (λ) ≡ En (λ, Hs ) := (H (n) (λ))u , which, by the above conclusion, is analytic. We need the following proposition to describe the set En−1 (∆δ ). 1 n Proposition 7.3. Let n ≥ 0, δn := 18 ρ and Aδn := {λ ∈ C| ρ8 δn ≤ |λ − e(Hs )| ≤ δn }. For H ∈ Uδn , we denote En (λ, Hs ) := (Rnρ (H))u ≡ Rnρ (H) Ω , λ = −Hu . Then
|En (λ, Hs )| ≥ 2−8 ρ,
for λ ∈ Aδn .
(7.9)
Proof. In this proof, we do not display the dependence of various quantities on Hs . Let λ ∈ Aδn , with δn as specified above. Define E0n (λ) by the equation En (λ) = ρ−n (E0n (λ) − λ).
(7.10)
The following estimate is shown in ([45, Eq. (V.27)] of the latter paper): 1 |λ − e| + (1 − ρ)−1 ρn+1 αn+1 . 5 This inequality and the definition of αn imply that |E0n (λ) − e| ≤
(7.11)
|E0n (λ) − λ| ≥ |λ − e| − |E0n (λ) − e| ≥
4 |λ − e| − 2γ02 c(c2 ρ2µ+1 )n ρ−1 . 5
(7.12)
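Since $2\gamma_0^2\, c\, \rho^{-1} (c^2 \rho^{2\mu+1})^n \ll \rho^{n+1}$ and since, for $\lambda \in A_{\delta_n}$, $|\lambda - e| \ge \frac{\rho^{n+1}}{8 \cdot 18}$, (7.12) gives

$$|E_{0n}(\lambda) - \lambda| \ge 2^{-8} \rho^{n+1}. \tag{7.13}$$

Due to (7.10), this implies the statement of the proposition.

We now conclude the proof of Theorem 7.2. Proposition 7.3 says that

$$E_n : \Delta_{\delta_n} \ni \lambda \mapsto E_n(\lambda) \in \Delta_{2^{-8}\rho}. \tag{7.14}$$

Hence $E_n^{-1}(\Delta_{2^{-8}\rho}) \supset \Delta_{\delta_n}$ and therefore by (7.8) we have that

$$B^{-\theta} (H^{(n)}(\lambda) - i0)^{-1} B^{-\theta} \in C^\nu(\Delta_{\delta_n}), \tag{7.15}$$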
which, due to Proposition 5.5, gives (7.7). 8. Initial Conditions for the Renormalization Group Now we turn to the operators (2.10) and (1.8). Let Hg stand for either the Pauli– Fierz Hamiltonian, (2.10), or the Nelson Hamiltonian, (1.8). (Not to be confused with the standard model Hamiltonian (1.1), which is not used in the remainder of this paper.) The operators Hg − λ do not belong to the Banach spaces defined above. We define an additional renormalization transformation which acts on such operators and maps them into the disc Dµ,s (α0 , β0 , γ0 ) for an appropriate choice of α0 , β0 , γ0 . PF PF or for H0N and Hpg , for either Hpg = Hp + O(g 2 ) Let H0g stand either for H0g or for Hp (see (2.10)–(2.12) and (1.8)–(1.10)), so that Hg = H0g + Ig , (p)
(p)
H0g = Hpg + Hf (p)
(p)
and Hpg = Hp + O(g 2 ).
(p)
(p)
Let e0 < e1 < · · · < gap := 1 − 0 be the eigenvalues of Hpg , with e0 (p) its ground state energy. (For the Nelson model, ej coincide with the eigenvalues (p)
j
of the particle Hamiltonian Hp .) Let Pp be the orthogonal projection onto the (p)
eigenspace corresponding to e0 . On Hamiltonians acting on Hp ⊗ Hf , which were described above, we define the map −1 R(0) ρ0 = ρ0 Sρ0 ◦ Fτ0 π0 , (p)
(8.1) (p)
(p)
(p)
where ρ0 ∈ (0, gap ] is an initial photon energy scale (recall that gap := 1 − 0 ) and where τ0 (Hg − λ) = H0g − λ and π0 ≡ π0 [Hf ] := Pp ⊗ χHf ≤ρ0 ,
(8.2)
PF given after Eq. (2.10) (see above) and for any λ ∈ C. Define the for H0g := H0g set 1 (p) (8.3) I0 := z ∈ C|Re z ≤ e0 + ρ0 . 2
We assume ρ0 g 2 . To simplify the notation we assume that the ground state (p) energy, 0 , of the operator Hp is simple (otherwise we would have to deal with matrix-valued operators on Hf ). We have Theorem 8.1. Let Hg be the Hamiltonian given either by (2.10) or by (1.8), and (p) let gap ≥ ρ0 g 2 , µ > −1/2 and λ ∈ I0 . Then: Hg − λ ∈ D(R(0) ρ0 ). (0)
(8.4)
(0)
The operators Hλ := Rρ0 (Hg − λ)|RanPp ⊗1 , λ ∈ I0 ∩ R, are self-adjoint; (0)
(p)
µ,2 (α0 , β0 , γ0 ), Hλ − ρ−1 0 (e0 − λ) ∈ D
(8.5)
µ 2 with λ ∈ I0 , µ as in Eq. (1.11), α0 = O(g 2 ρ−1 0 ), β0 = O(g ), and γ0 = O(gρ0 ); (0)
Hλ
is analytic in λ ∈ I0 .
(8.6)
For the Pauli–Fierz and Nelson Hamiltonians, µ = 1/2 and µ > 0, respectively. (p)
Note that if ψ (p) is the ground state of Hpg with the energy e0 (p) ψ ⊗ Ω, then (p)
e0 − λ = ψ0 , (Hg − λ)ψ0 .
and ψ0 = (8.7)
Theorem 8.1 is proven in [45, Appendix II], for somewhat simpler Banach spaces that do not contain the derivatives (k∂k )q . However, an extension to the Banach spaces which are used in the present paper is straightforward. (0) Let, as before, P¯p := 1 − Pp and K := Rρ0 (Hg − λ)|Ran(P¯p ⊗1) . Note that K = (p)
(H0g − λ)|Ran(P¯p ⊗1) and therefore, ∀ λ ∈ I0 ∩ R, σ(K) = σ(Hpg )\{e0 } + [0, ∞) − λ. Hence, ∀ λ ∈ I0 ∩ R,
1 1 (p) (p) (p) (p) K ≥ e1 − e0 − ρ0 ≥ (e1 − e0 ). 2 2 (0)
(0)
Therefore 0 ∈ / σ(K). This, the relation σ(Rρ0 (Hg − λ)) = σ(Hλ ) ∪ σ(K) and Theorem 3.1 imply that (0)
Hλ
is isospectral to Hg − λ
in the sense of Theorem 3.1. Moreover, similarly to Proposition 5.5 and using the relation (0)−1
−1 = Hλ R(0) ρ0 (Hg − λ)
(Pp ⊗ 1) + (H0g − λ)−1 (P¯p ⊗ 1),
(8.8)
one shows the following result Proposition 8.2. Let µ > 0, ρ0 g 2 and ∆0 ⊆ I0 ∩ R. If Hg is given in either (2.10) or (1.8), then, for any 0 ≤ θ, ν ≤ 1, (0)
B −θ (Hλ − i0)−1 B −θ ∈ C ν (∆0 ) ⇒ B −θ (Hg − λ − i0)−1 B −θ ∈ C ν (∆0 ).
(8.9)
9. LAP on Small Sets In this section, we prove the limiting absorption principle (LAP) for Hg on small subsets of (g , g + 12 ρ0 ), where g is the ground state energy of Hg and, as in the previous section, Hg stands for either (2.10) or (1.8). In the next section, we show 1 ρ0 ) completing the proof of Theorem 1.1. Recall that these subsets cover (g , g + 18 the definition (0)
Hλ := R(0) ρ0 (Hg − λ)|RanPp ⊗1 ,
λ ∈ I0 .
(9.1)
The right-hand side is well defined according to Theorem 8.1. By Eq. (8.5), if λ ∈ I0 , then (0)
(p)
µ,2 Hλ − ρ−1 (α0 , β0 , γ0 ), 0 (e0 − λ) ∈ D
(9.2)
where α0 , β0 and γ0 are given in Theorem 8.1. Since, by our assumption, g 1 we (p) can choose ρ0 in the interval (0, gap ] so that g 2 ρ−1 0 ,
gρµ0 1.
(9.3)
Let H ψ := ψ, Hψ . We isolate the stable-central and unstable components of (0) the operator Hλ : (0)
(0)
(0)
Hλs := (Hλ )s = Hλ − Hλ Ω 1
(0)
(0)
and Hλu := (Hλ )u = Hλ Ω
(9.4)
(see Sec. 6), and let e : Ds → C be the map introduced in Theorem 6.1. We pick the parameter ρ so that 1 . (9.5) 2 Then the conditions of Theorem 7.2 are satisfied by Hs = Hλs , µ ∈ I0 ∩ R and this theorem implies that gρµ0 ρ ≤
B −θ (Hλs − ω − i0)−1 B −θ ∈ C ν ({(λ, ω) ∈ (I0 ∩ R) × ∆λδn }), where
1 2
1 n < θ ≤ 1, 0 < ν < θ − 12 , recall, δn = 18 ρ for n ≥ 0 and 1 ∆λδ := e(Hλs ) + δ, e(Hλs ) + δ . 8
(9.6)
(9.7)
(0)
The relations (9.6) and Hλ = Hλs + Hλu yield, in turn, that (0)
B −θ (Hλ − i0)B −θ ∈ C ν (Fδn ), where Fδ := {λ ∈ I0 ∩ R| − Hλu ∈ ∆λδ },
(9.8)
which, due to Proposition 7.4, yields B −θ (Hg − λ − i0)−1 B −θ ∈ C ν (Fδn ). This is the main result of this section.
(9.9)
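10. Covering Lemma. End of Proof of Theorem 1.1

Let $\epsilon_g$ be the solution to the equation $e(H_{\lambda s}) = -H_{\lambda u}$ for $\lambda$ (see the definitions of $H_{\lambda s}$ and $H_{\lambda u}$ in (9.4)). By Theorem 6.1, $0 = e(H_{\epsilon_g s}) + H_{\epsilon_g u}$ is the ground state energy of the operator $H^{(0)}_{\lambda = \epsilon_g}$ and therefore, by Theorem 3.1, $\epsilon_g$ is the ground state energy of the operator $H_g$. In the lemma below we show that for $g$ sufficiently small

$$\Big( \epsilon_g + \frac{3\rho_0}{16}\, \rho\, \delta_n,\ \epsilon_g + \frac{\rho_0}{4}\, \delta_n \Big] \subset F_{\delta_n}. \tag{10.1}$$

Since $\frac{\rho_0}{4}\, \delta_n > \frac{3\rho_0}{16}\, \rho\, \delta_{n-1}$, this embedding implies

$$\bigcup_{n \ge 0} F_{\delta_n} \ \text{covers}\ \Big( \epsilon_g,\ \epsilon_g + \frac{1}{18}\rho_0 \Big). \tag{10.2}$$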
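The overlap argument behind the covering claim can be made concrete with a short numerical check. The sketch below is only an illustration: it works with offsets from the ground state energy, uses $\delta_n = \rho^n/18$ as in the text, and the values of $\rho$ and $\rho_0$ as well as the function names are hypothetical choices.

```python
def interval(n, rho, rho0):
    """Offsets from the ground state energy of the n-th interval in (10.1)."""
    delta_n = rho**n / 18.0
    lower = (3.0 * rho0 / 16.0) * rho * delta_n
    upper = (rho0 / 4.0) * delta_n
    return lower, upper


def consecutive_overlap(rho, rho0, n_max):
    """True if interval n reaches past the lower end of interval n-1, n = 1..n_max."""
    return all(
        interval(n, rho, rho0)[1] > interval(n - 1, rho, rho0)[0]
        for n in range(1, n_max + 1)
    )


rho, rho0 = 0.25, 0.5  # illustrative values only
print(consecutive_overlap(rho, rho0, 30))
print(interval(30, rho, rho0)[0])  # lower ends shrink towards the ground state energy
```

Because $\delta_n = \rho\, \delta_{n-1}$, the overlap condition reduces to $1/4 > 3/16$, so consecutive intervals always overlap and their lower ends accumulate at the ground state energy.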
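The covering (10.2), together with (9.9), implies the statement of Theorem 1.1.

Lemma 10.1. For $g$ sufficiently small, (10.1) holds.

Proof. Recall that $\delta_n = \frac{1}{18} \rho^n$. We first use the relation $e(H_{\epsilon_g s}) + H_{\epsilon_g u} = 0$ to obtain

$$e(H_{\lambda s}) + H_{\lambda u} = H_{\lambda u} - H_{\epsilon_g u} - e(H_{\epsilon_g s}) + e(H_{\lambda s}). \tag{10.3}$$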
Our next goal is to estimate |Hg u − Hλu | and |e(Hλs ) − e(Hg s )|. In order to use the Cauchy bound on derivatives, we pass to complex domains. We define the set Dδ := {λ ∈ I0 ||e(Hλs ) + Hλu | ≤ δ}.
(10.4)
Recall that D(, δ) denotes the disc centered at of a radius δ. We claim that, for g sufficiently small and for n ≥ 0, ρ 0 (10.5) D g , δn ⊂ Dδn . 4 We prove this claim by induction in n and obtain needed estimates on |Hg u − Hλu | and |e(Hλs ) − e(Hg s )| in the process. We assume it is true for n ≤ j − 1 and prove it for n = j. For j = 0, the induction assumption is absent and so our proof of the induction step yields also the first step. We estimate |Hg u − Hλu |. Let ∆0 E(λ) be defined by the relation Hλu =: ρ−1 0 (e0 − λ) + ∆0 E(λ). Then Hλu − Hg u = ρ−1 0 (g − λ) + ∆0 E(λ) − ∆0 E(g ).
(10.6)
We have by (9.2) and analyticity of ∆0 E(λ) on I0 , |∂λ ∆0 E(λ)| ≤ α0 /ρ0 . This gives |∆0 E(g ) − ∆0 E(λ)| ≤ ρ−1 0 α0 |g − λ|.
(10.7)
The last two relations imply that in D(g , ρ40 δj ), |Hg u − Hλu | ≤
1 (1 + α0 )δj . 4
(10.8)
Now we estimate |e(Hλs ) − e(Hg s )|. Recall the definition En (ω, Hλs ) := (Rnρ (Hλs − ω))u ≡ Rnρ (Hλs − ω) Ω . Define ∆n E(ω, Hλs ) := En (ω, Hµs ) − ρ−1 En−1 (ω, Hλs ).
(10.9)
It is shown in [45, Eqs. (V.24)–(V.25)] that e(Hs ) satisfies the (fixed point) equation e(Hs ) =
∞
ρi ∆i E(e(Hs ), Hs )
(10.10)
i=1
(the eigenvalue is equal to the sum of the corrections due to individual steps of the RG). Here the series on the right-hand side converges absolutely by the estimate −m 1 i+1 m ρ |∂ω ∆i E(ω, Hs )| ≤ αi for i ≤ j and m = 0, 1, (10.11) 12 shown in [20]. The relation (10.10) implies e(Hλs ) − e(Hg s ) =
∞
ρi (∆i E(e(Hλs ), Hλs ) − ∆i E(e(g ), Hg s )). (10.12)
i=1
We estimate this series. It follows from the analyticity of En (ω, Hs ) in Hs , see [20, Proposition 5.3], that ∆i E(ω, Hλs ) are analytic in λ ∈ Dδi , i ≤ j − 1. By the induction assumption Dδi ⊃ D(g , ρ40 δi ) for i ≤ j − 1. Hence using the Cauchy formula we conclude from (10.11) that ρ 4αi 0 on D g , δi |∂λ ∆i E(ω, Hλs )| ≤ (1 − ρ)ρ0 δi 4 for i ≤ j − 1. The latter estimate together with (10.11) and the definition δj = gives ∞
1 j 18 ρ
ρi |∆i E(e(Hλs ), Hλs ) − ∆i E(e(Hg s ), Hg s )|
i=1
≤
j−1 i=1
≤
ρ
i
∞ 12αi 4αi |e(Hλs ) − e(Hg s )| + |λ − g | + 2 ρi αi ρi+1 (1 − ρ)ρ0 δi i=j
20α1 80α1 |e(Hλs ) − e(Hg s )| + |λ − g | + 4αj ρj ρ (1 − ρ)ρ0
in D(g , ρ40 δj ). This estimate, together with the relation (10.12), gives |e(Hλs ) − e(Hg s )| ≤
40α1 δj + 160αj δj 1−ρ
(10.13)
ρ in D(g , ρ40 δj ), provided α1 ≤ 40 . This estimate, together with (10.3) and (10.8) and the definition of Dδn , implies (10.5) with n = j, provided 1 + α0 40α1 + + 160α0 ≤ 1. (10.14) 4 1−ρ Remembering the definition of αj , we see that the latter conditions can be easily arranged by taking g sufficiently small. This proves (10.5). Now we proceed to the proof of (10.1). Notice that Fδ can be written in the form ρ (10.15) Fδ := λ ∈ I0 ∩ R| − δ ≤ e(Hλs ) + Hλu < − δ . 8 Hence we estimate e(Hλs ) + Hλu from above and below. ρ0 ρ0 0 Since (g + 3ρ 16 ρδn , g + 4 δn ] ⊂ D(g , 4 δn ) the estimates (10.8) and (10.13), ρ0 0 with j = n, on |Hg u − Hλu | and |e(Hλs ) − e(Hg s )| hold on (g + 3ρ 16 ρδn , g + 4 δn ] and together with the relation (10.3) they give
e(Hλs ) + Hλu ≥ −|Hλu − Hg u | − |e(Hg s ) − e(Hλs )| 1 40α1 ≥ − (1 + α0 )δn − δn − 160αnδn ≥ −δn , (10.16) 4 1−ρ provided (10.14) holds. Furthermore, (10.3), (10.6) and (10.7) and (10.13), with 0 j = n, and the relation g + 3ρ 16 ρδn ≤ λ yield e(Hλs ) + Hλu ≤ ρ−1 0 (g − λ) + |∆0 E(λ) − ∆0 E(g )| + |e(g ) − e(λ)| 3 α0 40α1 ρ0 ρ0 ρδn + δn + δn + 160αn δn ≤ − ρδn , 16 4 1−ρ 8 provided ρ and ρ0 are chosen in such a way that 1 40α1 α0 ρ0 ρ ≥ + + 160αn 16 4 1−ρ ≤−
(10.17)
(10.18)
(the latter is possible if one chooses $g$ sufficiently small, since, by definition, $\alpha_j \le \alpha_0 = O(g^2 \rho_0^{-1})$; see Theorem 8.1). The last two inequalities and (10.15) imply (10.1).

11. Supplement: Background on Fock Space

Let $\mathfrak{h}$ be either $L^2(\mathbb{R}^3, \mathbb{C}, d^3k)$ or $L^2(\mathbb{R}^3, \mathbb{C}^2, d^3k)$. In the first case, we consider $\mathfrak{h}$ to be the Hilbert space of one-particle states of a scalar boson or phonon, and in the second case, of a photon. The variable $k \in \mathbb{R}^3$ is the wave vector or momentum of the particle. (Recall that throughout this paper, the propagation speed, $c$, of photons or phonons and Planck's constant, $\hbar$, are set equal to 1.) The Bosonic Fock space, $\mathcal{F}$, over $\mathfrak{h}$ is defined by

$$\mathcal{F} := \bigoplus_{n=0}^{\infty} S_n\, \mathfrak{h}^{\otimes n}, \tag{11.1}$$

where $S_n$ is the orthogonal projection onto the subspace of totally symmetric $n$-particle wave functions contained in the $n$-fold tensor product $\mathfrak{h}^{\otimes n}$ of $\mathfrak{h}$; and
$S_0 \mathfrak{h}^{\otimes 0} := \mathbb{C}$. The vector $\Omega := (1, 0, \ldots)$ is called the vacuum vector in $\mathcal{F}$. Vectors $\Psi \in \mathcal{F}$ can be identified with sequences $(\psi_n)_{n=0}^{\infty}$ of $n$-particle wave functions, which are totally symmetric in their $n$ arguments, and $\psi_0 \in \mathbb{C}$. In the first case these functions are of the form $\psi_n(k_1, \ldots, k_n)$, while in the second case, of the form $\psi_n(k_1, \lambda_1, \ldots, k_n, \lambda_n)$, where $\lambda_j \in \{-1, 1\}$ are the polarization variables.

In what follows, we present some key definitions in the first case, limiting ourselves to remarks at the end of this appendix on how these definitions have to be modified for the second case. The scalar product of two vectors $\Psi$ and $\Phi$ is given by

$$\langle \Psi, \Phi \rangle := \sum_{n=0}^{\infty} \int \prod_{j=1}^{n} d^3k_j \, \overline{\psi_n(k_1, \ldots, k_n)} \, \varphi_n(k_1, \ldots, k_n). \tag{11.2}$$

Given a one-particle dispersion law $\omega(k) = |k|$, the energy of a configuration of $n$ non-interacting field particles with wave vectors $k_1, \ldots, k_n$ is given by $\sum_{j=1}^{n} \omega(k_j)$. We define the free-field Hamiltonian, $H_f$, by

$$(H_f \Psi)_n(k_1, \ldots, k_n) = \Big( \sum_{j=1}^{n} \omega(k_j) \Big) \psi_n(k_1, \ldots, k_n), \tag{11.3}$$

for $n \ge 1$ and $(H_f \Omega)_n = 0$. Here $\Psi = (\psi_n)_{n=0}^{\infty}$. To ensure that the r.h.s. makes sense, we assume that $\psi_n = 0$, except for finitely many $n$, for which $\psi_n(k_1, \ldots, k_n)$ decreases rapidly at infinity. Clearly, the operator $H_f$ has a single eigenvalue, 0, with eigenvector $\Omega$, and the rest of its spectrum is absolutely continuous.

With each function $\varphi \in \mathfrak{h}$ one associates an annihilation operator $a(\varphi)$ defined as follows. For $\Psi = (\psi_n)_{n=0}^{\infty} \in \mathcal{F}$ with the property that $\psi_n = 0$ for all but finitely many $n$, the vector $a(\varphi)\Psi$ is defined by

$$(a(\varphi)\Psi)_n(k_1, \ldots, k_n) := \sqrt{n+1} \int d^3k \, \overline{\varphi(k)} \, \psi_{n+1}(k, k_1, \ldots, k_n). \tag{11.4}$$

These equations define a closable operator $a(\varphi)$ whose closure is also denoted by $a(\varphi)$. We also define

$$a(\varphi)\Omega = 0. \tag{11.5}$$

The creation operator $a^*(\varphi)$ is defined to be the adjoint of $a(\varphi)$ with respect to the scalar product defined in Eq. (11.2). Since $a(\varphi)$ is anti-linear, and $a^*(\varphi)$ is linear in $\varphi$, we write

$$a(\varphi) = \int d^3k \, \overline{\varphi(k)} \, a(k), \qquad a^*(\varphi) = \int d^3k \, \varphi(k) \, a^*(k), \tag{11.6}$$

where $a(k)$ and $a^*(k)$ are unbounded, operator-valued distributions. The latter are well known to obey the canonical commutation relations (CCR):

$$[a^{\#}(k), a^{\#}(k')] = 0, \qquad [a(k), a^*(k')] = \delta^3(k - k'), \tag{11.7}$$

where $a^{\#} = a$ or $a^*$.
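The definitions (11.4)–(11.7) can be made concrete in the simplest possible setting. The following sketch is an illustration only and rests on a simplifying assumption that is not in the text: the one-particle space is taken to be one-dimensional (a single mode), so that a vector $\Psi$ with finitely many non-zero components is just a finite list of real numbers $(\psi_0, \ldots, \psi_{N-1})$. The function names are hypothetical. In this setting (11.4) reduces to $(a\Psi)_n = \sqrt{n+1}\,\psi_{n+1}$, and the CCR reads $[a, a^*]\Psi = \Psi$.

```python
import math

# Single-mode toy model (an assumption made for illustration): a Fock vector
# with finitely many non-zero components is a list [psi_0, ..., psi_{N-1}].

def annihilate(psi):
    """(a Psi)_n = sqrt(n + 1) * psi_{n+1}; output has the same length as psi."""
    return [math.sqrt(n + 1) * psi[n + 1] for n in range(len(psi) - 1)] + [0.0]

def create(psi):
    """(a* Psi)_n = sqrt(n) * psi_{n-1}; output has one more component than psi."""
    return [0.0] + [math.sqrt(n + 1) * psi[n] for n in range(len(psi))]

def inner(u, v):
    """Real scalar product; components beyond the shorter list are zero."""
    return sum(x * y for x, y in zip(u, v))

def commutator(psi):
    """([a, a*] Psi)_n, which should reproduce psi (padded with one zero)."""
    left = annihilate(create(psi))    # a a* Psi
    right = create(annihilate(psi))   # a* a Psi
    return [x - y for x, y in zip(left, right)]

if __name__ == "__main__":
    psi = [0.5, -1.0, 2.0]
    print(commutator(psi))            # numerically close to psi + [0.0]
    print(annihilate([1.0]))          # the vacuum is annihilated, cf. (11.5)
```

In this model the number operator $a^* a$ multiplies $\psi_n$ by $n$, which is the single-mode analogue of (11.3) with $\omega \equiv 1$.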
One can rewrite the quantum Hamiltonian $H_f$ in terms of the creation and annihilation operators, $a$ and $a^*$, as

$$H_f = \int d^3k \, a^*(k) \, \omega(k) \, a(k). \tag{11.8}$$

More generally, for any operator, $t$, on the one-particle space $\mathfrak{h}$, we define the operator, $T$, on the Fock space $\mathcal{F}$ by the following formal expression

$$T := \int a^*(k) (t a)(k) \, dk,$$

where the operator $t$ acts on the $k$-variable ($T \equiv d\Gamma(t)$ is the second quantization of $t$). The precise meaning of the latter expression can be obtained by using an orthonormal basis $\{\phi_j\}$ in the space $\mathfrak{h}$ to rewrite it as $T := \sum_j a^*(\phi_j) a(t^* \phi_j)$.

To modify the above definitions to the case of photons, one replaces the variable $k$ by the pair $(k, \lambda)$ and adds to the integrals in $k$ sums over $\lambda$. In particular, the creation and annihilation operators now have two variables: $a^{\#}_{\lambda}(k) \equiv a^{\#}(k, \lambda)$; they satisfy the commutation relations

$$[a^{\#}_{\lambda}(k), a^{\#}_{\lambda'}(k')] = 0, \qquad [a_{\lambda}(k), a^*_{\lambda'}(k')] = \delta_{\lambda, \lambda'} \, \delta^3(k - k'). \tag{11.9}$$

One can introduce the operator-valued transverse vector fields by

$$a^{\#}(k) := \sum_{\lambda \in \{-1, 1\}} e_{\lambda}(k) \, a^{\#}_{\lambda}(k),$$
where $e_{\lambda}(k) \equiv e(k, \lambda)$ are polarization vectors, i.e. orthonormal vectors in $\mathbb{R}^3$ satisfying $k \cdot e_{\lambda}(k) = 0$. Then, in order to reinterpret the expressions in this paper for photons, one either adds the variable $\lambda$, as was mentioned above, or replaces, in appropriate places, the usual product of scalar functions or scalar functions and scalar operators by the dot product of vector-functions or vector-functions and operator-valued vector-functions.

Acknowledgments

Part of this work was done while the third author was staying at IAS Princeton and visiting ETH Zürich and ESI Vienna. He is grateful to these institutions for hospitality. He was supported by NSERC Grant No. NA7901. The authors are grateful to the anonymous referee for several useful remarks.

References

[1] W. Abou Salem, J. Faupin, J. Fröhlich and I. M. Sigal, On theory of resonances in non-relativistic QED, Adv. Appl. Math. 43(3) (2009) 201–230.
[2] A. Arai and M. Hirokawa, Ground states of a general class of quantum field Hamiltonians, Rev. Math. Phys. 12(8) (2000) 1085–1135.
[3] V. Bach, Th. Chen, J. Fröhlich and I. M. Sigal, Smooth Feshbach map and operator-theoretic renormalization group methods, J. Funct. Anal. 203 (2003) 44–92.
[4] V. Bach, Th. Chen, J. Fröhlich and I. M. Sigal, The renormalized electron mass in non-relativistic quantum electrodynamics, J. Funct. Anal. 243(2) (2007) 426–535.
[5] V. Bach, J. Fröhlich and A. Pizzo, Infrared-finite algorithms in QED: The groundstate of an atom interacting with the quantized radiation field, Comm. Math. Phys. 264(1) (2006) 145–165.
[6] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998) 299–395.
[7] V. Bach, J. Fröhlich and I. M. Sigal, Renormalization group analysis of spectral problems in quantum field theory, Adv. Math. 137 (1998) 205–298.
[8] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207(2) (1999) 249–290.
[9] V. Bach, J. Fröhlich, I. M. Sigal and A. Soffer, Positive commutators and spectrum of Pauli–Fierz Hamiltonian of atoms and molecules, Comm. Math. Phys. 207(3) (1999) 557–587.
[10] M. Berger, Nonlinearity and Functional Analysis. Lectures on Nonlinear Problems in Mathematical Analysis, Pure and Applied Mathematics (Academic Press, 1977).
[11] T. Chen, J. Fröhlich and A. Pizzo, Infraparticle scattering states in non-relativistic QED: I. The Bloch–Nordsieck paradigm, Comm. Math. Phys. 294(3) (2010) 761–825.
[12] T. Chen, J. Fröhlich and A. Pizzo, Infraparticle scattering states in non-relativistic QED: II. Mass shell properties; arXiv:0709.2812.
[13] T. Chen, J. Faupin, J. Fröhlich and I. M. Sigal, Local decay in non-relativistic QED (2009); arXiv:0911.0828v1.
[14] C. Cohen-Tannoudji, J. Dupont-Roc and G. Grynberg, Photons and Atoms: Introduction to Quantum Electrodynamics (John Wiley, 1991).
[15] J. Faupin, Resonances of the confined hydrogen atom and the Lamb–Dicke effect in non-relativistic QED, Ann. Henri Poincaré 9(4) (2008) 743–773.
[16] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic electromagnetic fields in models of quantum-mechanical matter interacting with the quantized radiation field, Adv. Math. 164(2) (2001) 349–398.
[17] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic completeness for Rayleigh scattering, Ann. Henri Poincaré 3(1) (2002) 107–170.
[18] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic completeness for Compton scattering, Comm. Math. Phys. 252(1–3) (2004) 415–476.
[19] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral theory for the standard model of non-relativistic QED, Comm. Math. Phys. 283(3) (2008) 613–646.
[20] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral renormalization group analysis, Rev. Math. Phys. 21(4) (2009) 511–548.
[21] V. Georgescu, C. Gérard and J. S. Møller, Commutators, C0-semigroups and resolvent estimates, J. Funct. Anal. 216 (2004) 303–361.
[22] V. Georgescu, C. Gérard and J. S. Møller, Spectral theory of massless Pauli–Fierz models, Comm. Math. Phys. 249 (2004) 29–78.
[23] M. Griesemer and D. Hasler, On the smooth Feshbach–Schur map, J. Funct. Anal. 254(9) (2008) 2329–2335.
[24] M. Griesemer and D. Hasler, Analytic perturbation theory and renormalization analysis of matter coupled to quantized radiation, Ann. Henri Poincaré 10(3) (2009) 577–621.
[25] M. Griesemer, E. H. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics, Invent. Math. 145(3) (2001) 557–595.
[26] S. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics, 2nd edn. (Springer, 2006).
[27] C. Hainzl, M. Hirokawa and H. Spohn, Binding energy for hydrogen-like atoms in the Nelson model without cutoffs, J. Funct. Anal. 220(2) (2005) 424–459.
[28] D. Hasler and I. Herbst, Absence of ground states for a class of translation invariant models of non-relativistic QED, Comm. Math. Phys. 279(3) (2008) 769–787.
[29] D. Hasler, I. Herbst and M. Huber, On the lifetime of quasi-stationary states in non-relativistic QED, Ann. Henri Poincaré 9(5) (2008) 1005–1028.
[30] D. Hasler and I. Herbst, Analytic perturbation theory and renormalization analysis of matter coupled to quantized radiation (2008); arXiv:0801.4458v1.
[31] D. Hasler and I. Herbst, Smoothness and analyticity of perturbation expansions in QED (2010); arXiv:1007.0969v1.
[32] D. Hasler and I. Herbst, Convergent expansions in non-relativistic QED (2010); arXiv:1005.3522v1.
[33] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics. I, J. Math. Phys. 40(12) (1999) 6209–6222.
[34] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics. II, J. Math. Phys. 41(2) (2000) 661–674.
[35] F. Hiroshima, Ground states and spectrum of quantum electrodynamics of nonrelativistic particles, Trans. Amer. Math. Soc. 353(11) (2001) 4497–4528 (electronic).
[36] F. Hiroshima, Self-adjointness of the Pauli–Fierz Hamiltonian for arbitrary values of coupling constants, Ann. Henri Poincaré 3(1) (2002) 171–201.
[37] F. Hiroshima, Localization of the number of photons of ground states in nonrelativistic QED, Rev. Math. Phys. 15(3) (2003) 271–312.
[38] F. Hiroshima, Analysis of ground states of atoms interacting with a quantized radiation field, Topics in the Theory of Schrödinger Operators (World Scientific Publishing, 2004), pp. 145–272.
[39] F. Hiroshima and H. Spohn, Ground state degeneracy of the Pauli–Fierz Hamiltonian with spin, Adv. Theor. Math. Phys. 5(6) (2001) 1091–1104.
[40] P. Hislop and I. M. Sigal, Introduction to Spectral Theory. With Applications to Schrödinger Operators, Applied Mathematical Sciences, Vol. 113 (Springer-Verlag, 1996).
[41] M. Hübner and H. Spohn, Radiative decay: Nonperturbative approaches, Rev. Math. Phys. 7(3) (1995) 363–387.
[42] M. Hübner and H. Spohn, Spectral properties of the spin-boson Hamiltonian, Ann. Inst. Henri Poincaré 3 (2002) 269–295.
[43] W. Hunziker and I. M. Sigal, The quantum N-body problem, J. Math. Phys. 41(6) (2000) 3448–3510.
[44] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV, Analysis of Operators (Academic Press, 1978).
[45] I. M. Sigal, Ground state and resonances in the standard model of the non-relativistic quantum electrodynamics, J. Stat. Phys. 134(5–6) (2009) 899–939.
[46] E. Skibsted, Spectral analysis of N-body systems coupled to a bosonic field, Rev. Math. Phys. 10 (1998) 989–1026.
[47] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, 2004).
March 23, J070-S0129055X11004291
2011 10:41 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 2 (2011) 211–232 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004291
RUELLE–LANFORD FUNCTIONS AND LARGE DEVIATIONS FOR ASYMPTOTICALLY DECOUPLED QUANTUM SYSTEMS
YOSHIKO OGATA Graduate School of Mathematics, University of Tokyo, Japan
[email protected] LUC REY-BELLET Department of Mathematics and Statistics, University of Massachusetts Amherst, USA
[email protected]

Received 22 September 2010
Revised 2 February 2011

We recover, expand, and unify quantum (and classical) large deviation results for lattice Gibbs states. The main new ingredient in this paper is a control on the overlap of spectral projections for non-commutative observables. Our proof of large deviations is based on Ruelle–Lanford functions [20, 34], which establish the existence of a rate function directly by subadditivity arguments, as done in the classical case in [23, 32], instead of relying on the Gärtner–Ellis theorem and on cluster expansions or transfer operators, as done in the quantum case in [21, 13, 27, 22, 16, 28]. We assume that the Gibbs states are asymptotically decoupled [23, 32], which controls the dependence of observables localized at different spatial locations. In the companion paper [29], we discuss the characterization of rate functions in terms of relative entropies.

Keywords: Quantum spin systems; large deviations; thermodynamics formalism; quantum statistical mechanics.

Mathematics Subject Classification 2010: 82B10, 60F10
1. Introduction

Consider a spin system on the lattice $\mathbb{Z}^d$ in thermal equilibrium described by a state $\omega$. For a region $\Lambda \subset \mathbb{Z}^d$, let $K_\Lambda$ be a macroscopic observable, for example the total energy or the total magnetization in the region $\Lambda$. One expects, as a rule, that such observables have a distribution which is very sharply concentrated around the equilibrium mean value and that the fluctuations of such observables are exponentially small in the volume $|\Lambda|$ of the domain, except at a first order phase transition where coexisting phases can induce macroscopically large fluctuations. This property is expressed by a large deviation principle: For a Borel set $A \subset \mathbb{R}$
let $I_A(K_\Lambda/|\Lambda|)$ denote the spectral projection onto the eigenspace of $K_\Lambda/|\Lambda|$ corresponding to eigenvalues of $K_\Lambda/|\Lambda|$ in the set $A$. A large deviation principle holds if there exists a rate function $s(x)$ such that

$$\omega\big(I_A(K_\Lambda/|\Lambda|)\big) \asymp \exp\Big( |\Lambda| \sup_{x \in A} s(x) \Big).$$
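The exponential asymptotics above can be checked by hand in the simplest classical case. The sketch below is an illustration under assumptions that are not taken from the paper: $\omega$ is a product state of independent fair two-valued spins, $K_\Lambda$ counts the spins in state 1 among $n = |\Lambda|$ sites, and $A = [a, 1]$ with $a > 1/2$. For this model the function $s(x) = -x\log x - (1-x)\log(1-x) - \log 2$ is the classical Cramér answer, and $\sup_{x \in A} s(x) = s(a)$. The function names are hypothetical.

```python
import math

def rate_s(x):
    """s(x) = -x log x - (1 - x) log(1 - x) - log 2 for fair two-valued spins."""
    def xlogx(t):
        return 0.0 if t == 0.0 else t * math.log(t)
    return -(xlogx(x) + xlogx(1.0 - x) + math.log(2.0))

def log_prob_at_least(n, k_min):
    """Exact log of the probability that at least k_min of n fair spins equal 1."""
    # Integer arithmetic avoids underflow; math.log accepts large integers.
    count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return math.log(count) - n * math.log(2.0)

if __name__ == "__main__":
    a = 0.7
    for n in (200, 2000):
        k_min = 7 * n // 10  # threshold a * n, kept as an exact integer
        print(n, log_prob_at_least(n, k_min) / n, rate_s(a))
```

The normalized logarithm of the probability approaches $s(a)$ as $n$ grows, with a correction of order $(\log n)/n$ coming from the sub-exponential prefactor.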
For classical mechanical systems this problem is mathematically very well understood, and very general large deviation theorems have been proved both for systems on a lattice and in the continuum; see [20, 34, 30, 12, 6, 10, 14, 15, 23, 32, 33]. For quantum mechanical systems, the problem of large deviations has, in comparison, received little attention and is only partially understood. The difficulty lies partly in the non-commutativity of quantum mechanical observables but also, at a deeper level, in the lack of control on the boundary effects in quantum mechanics. Known bulk/boundary estimates are sufficient to prove the existence of thermodynamic functions such as entropy and free energy, see e.g. [35, 19, 5, 36], but they are, so far, not sufficient to prove general large deviation results, especially at low temperatures for spatial dimension more than one.

A number of quantum large deviation results have been proved in the past few years [31, 21, 13, 22, 27, 16, 17, 7, 28, 8] (see also [4] for an information-theoretic interpretation of relative entropy). Common to all these papers is that the large deviation results are obtained by an application of the Gärtner–Ellis theorem; in particular, the smoothness of the logarithmic moment generating function (i.e. a suitable free energy functional) is necessary and is proved by cluster expansion or by using a transfer operator.

For classical Gibbs states there are several different proofs of the large deviation principle and we follow here the approach by Lewis, Pfister and Sullivan [23, 32]. In this approach, the state $\omega$ of the infinite system is assumed to be asymptotically decoupled (see Sec. 3.2 for a formal and more general definition): Given a finite region $V \subset \mathbb{Z}^d$ there exists a function $c(V)$ such that if $A$ is a nonnegative observable supported in the region $V$ and $B$ is a nonnegative observable supported in the complement $\mathbb{Z}^d \setminus V$ one has

$$e^{-c(V)} \omega(A)\omega(B) \le \omega(AB) \le \omega(A)\omega(B) e^{c(V)}, \quad \text{with} \quad \lim_{V \nearrow \mathbb{Z}^d} \frac{c(V)}{|V|} = 0.$$
In the classical case, this property is a fairly easy consequence of the DLR equation for Gibbs states. Using this property and subadditivity arguments one then proves directly the existence of a rate function $s(x)$ for the observable of interest. This general approach to large deviations (summarized in Sec. 2) was pioneered in the papers of Lanford and Ruelle [20, 34], and we follow here the terminology of [23, 32].

To use this strategy for quantum systems we face two obstacles. The first one is that it is not known whether a Gibbs-KMS state for a quantum spin system is asymptotically decoupled in general. This property is only known to hold in spatial
dimension one (proved by Araki in [1]) and at high temperatures in arbitrary dimension (see, e.g., [2]). We believe that it is an important open problem to determine whether this property holds for general quantum spin systems or not, but there is no new result in this direction in this paper. The second obstacle lies in the generalization of the subadditivity argument of [23, 32] to general non-commutative observables. The new ingredient needed here is a control on the overlap of spectral projections for $K_\Lambda$ and the spectral projections for $K_{\Lambda_1} + K_{\Lambda_2}$ with $\Lambda = \Lambda_1 \cup \Lambda_2$, which differs from $K_\Lambda$ by a boundary term. This new estimate is proved in Sec. 4.4, see Proposition 4.10. Using this approach we are able to recover, unify, and extend slightly the known large deviation results for quantum (and classical) spin systems. In addition the proofs given here are quite short and self-contained.

This paper is organized as follows. In Sec. 2, we give a brief exposition of the road to large deviations via proving the existence of the Ruelle–Lanford function, which is a Boltzmann entropy-like functional. In Sec. 3, we recall the elements of the quantum spin system formalism needed in the paper and we introduce the asymptotic decoupling condition for states of quantum systems, which is central in our analysis. In Sec. 4, we prove large deviation theorems for three different cases: (a) Commuting observables, (b) Classical observables, (c) General finite-range observables in dimension 1. The discussion of the rate functions and their characterization in terms of relative entropies is in the companion paper [29].

2. Ruelle–Lanford Functions

Let $X$ be a complete metric space, let $\{\mu_n\}$ be a sequence of Borel probability measures on $X$, and let $\{v_n\}$ be an increasing sequence of positive numbers with $\lim_{n\to\infty} v_n = +\infty$.
We say that $\mu_n$ satisfies a large deviation principle (LDP) on the scale $v_n$ if there exists a function $I : X \to [0, \infty]$, lower semicontinuous and with compact level sets, such that for any closed set $C$

$$\limsup_{n\to\infty} \frac{1}{v_n} \log \mu_n(C) \le -\inf_{x \in C} I(x), \tag{2.1}$$

and for any open set $O$

$$-\inf_{x \in O} I(x) \le \liminf_{n\to\infty} \frac{1}{v_n} \log \mu_n(O). \tag{2.2}$$

The function $I$ is called the rate function for the LDP. In statistical mechanics applications the measures $\mu_n$ are often distributions of sums of $\mathbb{R}$- or $\mathbb{R}^d$-valued weakly dependent random variables. One standard approach to prove an LDP is to combine the exponential Markov inequality for the upper bound (2.1) and a change of measure and ergodicity argument for the lower bound (2.2) (see, e.g., the proofs of the Cramér and Gärtner–Ellis theorems in [9]). In the presence of phase transitions, i.e. lack of ergodicity with respect to spatial translation, additional arguments are needed to provide a lower bound. For example,
in [12], the lower bound for the LDP for classical lattice Gibbs states is obtained by using the Shannon–McMillan theorem and an approximation argument by ergodic states.

Another route to LDPs using subadditivity arguments, much in the spirit of statistical mechanics, was pioneered in a remarkable paper by Lanford [20], itself based on earlier work by Ruelle [35]. We follow closely here the presentation in [23], see also [32]. For Borel sets $B$ let us define the set functions

$$\overline{m}(B) = \limsup_{n\to\infty} \frac{1}{v_n} \log \mu_n(B), \qquad \underline{m}(B) = \liminf_{n\to\infty} \frac{1}{v_n} \log \mu_n(B). \tag{2.3}$$

One has the elementary properties:

(1) For any Borel set $B$, we have $-\infty \le \underline{m}(B) \le \overline{m}(B) \le 0$.
(2) If $B_1 \subset B_2$, then $\underline{m}(B_1) \le \underline{m}(B_2)$ and $\overline{m}(B_1) \le \overline{m}(B_2)$.
(3) For all $B_1$, $B_2$, we have $\overline{m}(B_1 \cup B_2) = \max\{\overline{m}(B_1), \overline{m}(B_2)\}$.

Property (3) is a key property in large deviations and is usually referred to as the principle of the largest term: large deviations occur in the least unlikely way of all possible ways. Let $B_\varepsilon(x)$ denote the ball of radius $\varepsilon$ centered at $x$ and let us define

$$\overline{s}(x) = \inf_{\varepsilon} \overline{m}(B_\varepsilon(x)), \qquad \underline{s}(x) = \inf_{\varepsilon} \underline{m}(B_\varepsilon(x)). \tag{2.4}$$
Definition 2.1. The pair $(\mu_n, v_n)$ has a Ruelle–Lanford function (RL-function) $s(x)$ if $\overline{s}(x) = \underline{s}(x)$ for all $x \in X$. In this case we set $s(x) = \overline{s}(x) = \underline{s}(x)$.

The next proposition is standard and shows that the existence of an RL-function (almost) implies the existence of an LDP.

Proposition 2.2. The Ruelle–Lanford function $s(x)$ is upper semicontinuous and

$$\underline{m}(O) \ge \sup_{x \in O} s(x), \quad O \text{ open}, \tag{2.5}$$

$$\overline{m}(K) \le \sup_{x \in K} s(x), \quad K \text{ compact}. \tag{2.6}$$
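Proof (Sketch). The upper semicontinuity follows from the definition. The lower bound is immediate: for any $x \in O$ and $\varepsilon$ sufficiently small we have $\underline{m}(O) \ge \underline{m}(B_\varepsilon(x))$ and thus $\underline{m}(O) \ge \underline{s}(x) = s(x)$ for all $x \in O$. To prove the upper bound, given $\varepsilon > 0$ we cover the compact set $K$ by $N = N(\varepsilon)$ balls $B_\varepsilon(x_l)$ with centers $x_l \in K$. Using Properties (2) and (3) we have

$$\overline{m}(K) \le \overline{m}\Big( \bigcup_{l=1}^{N} B_\varepsilon(x_l) \Big) \le \max_{l} \overline{m}(B_\varepsilon(x_l)) \le \sup_{x \in K} \overline{m}(B_\varepsilon(x)).$$

Since $\varepsilon$ is arbitrary the upper bound follows.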
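The covering step in the proof rests on the principle of the largest term: on the exponential scale, a finite sum of probabilities behaves like its largest summand. A minimal numerical sketch follows, assuming model probabilities of the hypothetical form $e^{-v r_i}$ with fixed rates $r_i \ge 0$ (these are not objects from the paper). It evaluates $\frac{1}{v} \log \sum_i e^{-v r_i}$, which lies between $-\min_i r_i$ and $-\min_i r_i + (\log N)/v$.

```python
import math

def scaled_log_sum(rates, v):
    """Return (1/v) * log(sum_i exp(-v * rates[i])), computed stably."""
    # Factor out the largest exponent so that no term underflows the sum.
    top = max(-v * r for r in rates)
    total = sum(math.exp(-v * r - top) for r in rates)
    return (top + math.log(total)) / v

if __name__ == "__main__":
    rates = [0.3, 0.5, 1.2]  # hypothetical decay rates of three sets
    for v in (1.0, 10.0, 100.0, 1000.0):
        print(v, scaled_log_sum(rates, v), -min(rates))
```

As $v$ grows, the printed value approaches $-\min_i r_i$, i.e. the maximum of the individual exponential rates, which is the content of Property (3).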
The statement in Proposition 2.2 is usually referred to as a weak large deviation principle since the upper bound holds only for compact sets. In the problems discussed in this paper, the probability measures $\mu_n$ are supported uniformly on compact sets and the previous proposition yields immediately a large deviation principle with rate function $-s(x)$. More generally one obtains a large deviation principle by combining Proposition 2.2 with a proof that the sequence of probability measures $\mu_n$ is exponentially tight (see, e.g., [9, Sec. 1.2]). To identify the rate function we use a standard large deviation result.

Proposition 2.3 (Laplace–Varadhan's Lemma). Suppose that $\mu_n$ satisfies a large deviation principle on the scale $v_n$ with rate function $I(x)$. Let $f$ be any continuous function and suppose that for some $\gamma > 1$ we have the moment condition $\limsup_{n\to\infty} \frac{1}{v_n} \log \mu_n(e^{\gamma v_n f(x)}) < \infty$. Then

$$\lim_{n\to\infty} \frac{1}{v_n} \log \mu_n(e^{v_n f(x)}) = \sup_{x} \big( f(x) - I(x) \big).$$
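Proposition 2.3 can be verified numerically in an elementary example. The sketch below is an illustration under assumptions not taken from the paper: $\mu_n$ is the law of the empirical mean of $n$ independent fair two-valued variables, $v_n = n$, and $f(x) = \alpha x$. In that case the left-hand side equals $\log\frac{1 + e^{\alpha}}{2}$ for every $n$, and the rate function is $I(x) = -s(x)$ with $s(x) = -x \log x - (1-x)\log(1-x) - \log 2$ on $[0, 1]$. The suprema and infima are approximated on grids; all function names are hypothetical.

```python
import math

def s(x):
    """s(x) = -x log x - (1 - x) log(1 - x) - log 2 on [0, 1]."""
    def xlogx(t):
        return 0.0 if t == 0.0 else t * math.log(t)
    return -(xlogx(x) + xlogx(1.0 - x) + math.log(2.0))

def log_mgf(alpha):
    """(1/n) log E[exp(alpha * S_n)] for n fair coin flips (independent of n)."""
    return math.log((1.0 + math.exp(alpha)) / 2.0)

def legendre_sup(alpha, points=10001):
    """Grid approximation of sup over x in [0, 1] of (alpha * x + s(x))."""
    return max(alpha * (i / (points - 1)) + s(i / (points - 1))
               for i in range(points))

def legendre_inf(x, lo=-10.0, hi=10.0, points=20001):
    """Grid approximation of inf over alpha of (log_mgf(alpha) - alpha * x)."""
    step = (hi - lo) / (points - 1)
    return min(log_mgf(lo + i * step) - (lo + i * step) * x
               for i in range(points))

if __name__ == "__main__":
    for alpha in (-2.0, 0.0, 1.0, 3.0):
        print(alpha, log_mgf(alpha), legendre_sup(alpha))
    print(s(0.7), legendre_inf(0.7))
```

The last line illustrates the converse direction: because $s$ is concave in this example, it is recovered from the logarithmic moment generating function by a second Legendre transform.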
If $X = \mathbb{R}^n$ and $f(x) = \alpha \cdot x$, we obtain

$$e(\alpha) \equiv \lim_{n\to\infty} \frac{1}{v_n} \log \mu_n(e^{v_n \alpha \cdot x}) = \sup_{x} \big( \alpha \cdot x + s(x) \big),$$

i.e. the logarithmic moment generating function of $\mu_n$ is the Legendre transform of $-s(x)$. If, in addition, we know, a priori, that the rate function $s(x)$ is concave then by convex duality we obtain that

$$s(x) = \inf_{\alpha} \big( e(\alpha) - \alpha \cdot x \big),$$

that is, the rate function is the Legendre transform of the logarithmic moment generating function. Note that in our examples the moment condition will be trivially satisfied.

3. Quantum Lattice Systems

3.1. Interactions and states

We introduce some notation and briefly recall the mathematical framework for quantum spin systems; see [19, 36, 5, 3].

$C^*$-algebras. Let $\mathcal{A}$ be a finite-dimensional $C^*$-algebra. For any finite subset $\Lambda \subset \mathbb{Z}^d$, let $\mathcal{O}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{O}_x$ where $\mathcal{O}_x$ is isomorphic to $\mathcal{A}$. If $\Lambda \subset \Lambda'$, there is a natural embedding of $\mathcal{O}_\Lambda$ into $\mathcal{O}_{\Lambda'}$ and the algebras $\{\mathcal{O}_\Lambda\}_{\Lambda \subset \mathbb{Z}^d, \text{finite}}$ form a partially ordered family of matrix algebras. The algebra of observables for the infinite system is given by the $C^*$-inductive limit $\mathcal{O}$ of $\bigcup_{\Lambda \subset \mathbb{Z}^d, \text{finite}} \mathcal{O}_\Lambda$.

States. Let $\omega$ be a state on $\mathcal{O}$, i.e. $\omega$ is a positive, normalized linear functional on $\mathcal{O}$. Let $\{\tau_x\}_{x \in \mathbb{Z}^d}$ denote the group of spatial translations. A state $\omega$ is called translation invariant if $\omega(\tau_x A) = \omega(A)$ for all $x \in \mathbb{Z}^d$ and all $A \in \mathcal{O}$. The action of $\mathbb{Z}^d$ on $\mathcal{O}$ is asymptotically abelian [5] and thus the set of translation invariant states is a simplex. We say that a state is ergodic if it is an extremal point of this simplex.
Classical subalgebras and states. A standard probabilistic setting is recovered by considering commutative (sub)algebras. Let $\mathcal{A}^{(cl)}$ be an abelian subalgebra of $\mathcal{A}$ with $N = \dim \mathcal{A}^{(cl)}$. For finite subsets $\Lambda$ of $\mathbb{Z}^d$ let $\mathcal{O}^{(cl)}_\Lambda = \bigotimes_{x \in \Lambda} \mathcal{O}^{(cl)}_x$ with $\mathcal{O}^{(cl)}_x$ isomorphic to $\mathcal{A}^{(cl)}$. We denote by $\mathcal{O}^{(cl)}$ the inductive limit of $\bigcup_{\Lambda \subset \mathbb{Z}^d, \text{finite}} \mathcal{O}^{(cl)}_\Lambda$. The commutative algebra $\mathcal{O}^{(cl)}$ can be identified with $C(L)$, where $L = \{1, \ldots, N\}^{\mathbb{Z}^d}$ with the product topology, and is called a classical $C^*$-algebra. The restriction of any state $\omega$ on $\mathcal{O}$ gives a normalized linear functional $\omega^{(cl)}$ on $\mathcal{O}^{(cl)}$. By the Riesz–Markov theorem there exists a probability measure $d\omega^{(cl)}$ such that for any $A \in \mathcal{O}^{(cl)}$

$$\omega(A) = \omega^{(cl)}(A) = \int_L A(l) \, d\omega^{(cl)}(l).$$

Interactions and Hamiltonians. An interaction $\Psi = \{\psi_X\}_{X \subset \mathbb{Z}^d, \text{finite}}$ is a map from the finite subsets of $\mathbb{Z}^d$ to selfadjoint elements $\psi_X$ in $\mathcal{O}_X$. We will assume throughout this paper that $\Psi$ is translation invariant, i.e. $\tau_x(\psi_X) = \psi_{X+x}$ for any $X \subset \mathbb{Z}^d$ and any $x \in \mathbb{Z}^d$. An interaction $\Psi$ is classical if there exists a classical $C^*$-subalgebra $\mathcal{O}^{(cl)}$ such that $\psi_X \in \mathcal{O}^{(cl)}$ for all $X \subset \mathbb{Z}^d$. We equip translation invariant interactions $\Psi$ with the norm

$$\|\Psi\| \equiv \sum_{X \ni 0} |X|^{-1} \|\psi_X\|,$$

where $|X|$ is the cardinality of the set $X$, and denote by $\mathcal{B}$ the corresponding Banach space. To any interaction $\Psi \in \mathcal{B}$ we associate Hamiltonians (or macroscopic observables) $K_\Lambda = K_\Lambda(\Psi)$: For $\Lambda \subset \mathbb{Z}^d$ finite we define

$$K_\Lambda = \sum_{X \subset \Lambda} \psi_X.$$

Furthermore, to any $\Psi \in \mathcal{B}$, we associate an observable in $\mathcal{O}$ by

$$A_\Psi = \sum_{X \ni 0} \frac{1}{|X|} \psi_X.$$

When we consider Gibbs states, two kinds of interactions, $\Psi$ and $\Phi$, will be considered. The interaction $\Psi$ corresponds to the observables while $\Phi$ defines the Gibbs state. We denote by $K_\Lambda$ the local Hamiltonian associated with $\Psi$ and by $H_\Lambda$ the one associated with $\Phi$.

Large deviations. For $n \in \mathbb{N}$ let $\Lambda(n) = \{z \in \mathbb{Z}^d ;\ 0 \le z_i \le n - 1\}$ denote the cube with $|\Lambda(n)| = n^d$ lattice points and left hand corner at the origin. If $\omega$ is an ergodic state then the von Neumann ergodic theorem implies that

$$\lim_{n\to\infty} \frac{1}{|\Lambda(n)|} K_{\Lambda(n)} = \omega(A_\Psi)$$

strongly in the GNS representation and it is natural to investigate the large deviation properties, on the scale $v_n = |\Lambda(n)|$, of the sequence of Borel measures on $\mathbb{R}$

$$\mu_n(A) \equiv \omega\big( I_A(|\Lambda(n)|^{-1} K_{\Lambda(n)}) \big),$$
where $A$ is a Borel set and $I_A(H)$ denotes the spectral projection onto the eigenspace of $H$ spanned by the eigenvalues contained in the set $A$. We interpret $\mu_n(A)$ as the probability that the observable $|\Lambda(n)|^{-1} K_{\Lambda(n)}$ takes a value in $A$ if the system is in the state $\omega$.

3.2. Asymptotically decoupled states

The states we consider in this paper obey a property of weak dependence between disjoint regions of the lattice. We follow here the terminology used in [32] for the classical case. Let $C(m)$ be an arbitrary cube of side length $m$ and let us denote by $C^r(m)$ the cube of side length $m + 2r$ centered at the same point of $\mathbb{Z}^d$ as $C(m)$.

Definition 3.1. A state $\omega$ on $\mathcal{O}$ is asymptotically decoupled with parameters $g$ and $c$ if

(1) There exist a function $g : \mathbb{N} \to \mathbb{N}$ with $\lim_{m\to\infty} g(m)/m = 0$ and a function $c : \mathbb{N} \to [0, \infty)$ with $\lim_{m\to\infty} c(m)/|C(m)| = 0$.
(2) For any cube $C(m)$, $m \in \mathbb{N}$, any nonnegative $A \in \mathcal{O}_{C(m)}$, and any nonnegative $B \in \mathcal{O}_{C^{g(m)}(m)^c}$ we have

$$e^{-c(m)} \omega(A)\omega(B) \le \omega(AB) \le e^{c(m)} \omega(A)\omega(B).$$

Examples of asymptotically decoupled states are

(a) Product states. Any product state $\omega_0$ is asymptotically decoupled with parameters $c = g = 0$.

(b) Classical Gibbs states. Let $\mathcal{O}^{(cl)}$ be a classical $C^*$-algebra and let $\Phi$ be a classical translation invariant interaction such that $\|\Phi\|_0 \equiv \sum_{X \ni 0} \|\phi_X\|$ is finite. A Gibbs state for the interaction $\Phi$ is a probability measure $\omega^{(\Phi)}$ which satisfies the DLR equation (see, e.g., [35, 36]). Using the DLR equation one proves easily (see, e.g., [23, Sec. 9]) that for any positive $A \in \mathcal{O}_{C(m)}$ we have

$$e^{-c(m)} \omega^{(\Phi)}(A) \le \frac{\operatorname{tr}(A e^{-H_\Lambda})}{\operatorname{tr}(e^{-H_\Lambda})} \le e^{c(m)} \omega^{(\Phi)}(A), \tag{3.1}$$

with $c(m) = \|W_{C(m)}\|$, where $W_{C(m)}$ is the boundary interaction

$$W_{C(m)} = \sum_{\substack{X \cap C(m) \neq \emptyset \\ X \cap C(m)^c \neq \emptyset}} \phi_X.$$

This implies easily that $\omega^{(\Phi)}$ is asymptotically decoupled if $\|\Phi\|_0 < \infty$.

(c) Quantum KMS states. Let $\Phi$ be a translation invariant interaction. A KMS state for the interaction $\Phi$ is a state which satisfies the KMS condition or, equivalently, the Gibbs condition, which is a quantum analog of the DLR equation (see, e.g., [5, 36, 3] for an up-to-date presentation). It is not known if KMS-Gibbs states are asymptotically decoupled in general. Let us assume however that [1, 2] either
(i) $d = 1$ and $\Phi$ finite range (i.e. for some $R > 0$, $\operatorname{diam} X > R$ implies $\phi_X = 0$), or
(ii) $d$ arbitrary and $\|\Phi\|_\lambda \equiv \sum_{X \ni 0} e^{\lambda |X|} \|\phi_X\|$ is sufficiently small,

then one can show that for a Gibbs-KMS state $\omega^{(\Phi)}$ and $A \in \mathcal{O}_{C(m)}$ we have the bound (3.1), where

$$c(m) = C(\Phi) \sum_{\substack{X \cap C(m) \neq \emptyset \\ X \cap C(m)^c \neq \emptyset}} \|\phi_X\|.$$

Contrary to the classical case the bound is highly nontrivial to prove and relies on the Gibbs condition, Araki perturbation theory, and control of imaginary-time dynamics. This bound implies that $\omega^{(\Phi)}$ is asymptotically decoupled.

(d) Markov measures. Let $\omega$ be a stationary Markov chain on a finite state space with transition matrix $Q$ and invariant probability $q$. Then $\omega$ is asymptotically decoupled if and only if $Q$ is irreducible and aperiodic (i.e. mixing). If $m$ is the smallest integer such that $Q^m$ has strictly positive entries then the parameters are

$$g(m) = m - 1, \qquad c(n) = \sup_{\sigma_1, \sigma_2} \log \frac{Q^m(\sigma_1, \sigma_2)}{q(\sigma_2)}.$$

(e) Finitely correlated states. These states are a non-commutative generalization of Markov measures and are asymptotically decoupled if and only if they are mixing, which occurs under suitable conditions similar to the aperiodicity condition for Markov measures. See [18, 11, 28] for details.

4. Quantum Large Deviations Theorems

We prove several large deviations theorems for quantum states (in order of increasing difficulty) by showing the existence of concave RL-functions. This unifies, simplifies and extends a number of quantum large deviation results which have been proved with different techniques (Gärtner–Ellis theorem via transfer operators, cluster expansions, etc.). Our proofs have the advantage of being fairly short and self-contained, and of applying in some situations where the rate function is not smooth.

4.1. Preliminaries

In this section we prove an energy estimate used throughout the paper and explain the strategy (after [23]) used to prove the existence of a concave Ruelle–Lanford function.

The first fact is a very slight variation on the standard bulk/boundary energy estimate, see, e.g., [36, 5, 32]. Given integers $n$ and $m$ and a function $g(m)$ such that $\lim_{m\to\infty} g(m)/m = 0$ we choose $k$ to be the largest even integer such that

$$n = k(m + 2g(m)) + r, \qquad 0 \le r < 2(m + 2g(m)),$$

(having $k$ even will be convenient in the sequel). We next decompose the cube $\Lambda(k(m + 2g(m)))$ into $k^d$ pairwise disjoint and contiguous cubes $\tilde{C}_j$, each of which is a translate of $\Lambda(m + 2g(m))$, and then further divide each cube $\tilde{C}_j$ into a
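cube $C_j$ which is centered at the same point as $\tilde{C}_j$ and is a translate of $\Lambda(m)$, and a “corridor” $\tilde{C}_j \setminus C_j$ of width $g(m)$. We shall need estimates on the difference between the Hamiltonian $K_{\Lambda(n)}$ and the “decoupled” Hamiltonian for the collection of cubes $C_j$, i.e. $\sum_{j=1}^{k^d} K_{C_j}$.

Lemma 4.1. Let $\Psi$ be an interaction with $\|\Psi\| \equiv \sum_{X \ni 0} |X|^{-1} \|\psi_X\| < \infty$. Then there exists a function $F(m) = F(m, \Psi)$ with $\lim_{m\to\infty} F(m) = 0$, such that

$$\limsup_{n\to\infty} \frac{1}{|\Lambda(n)|} \Big\| K_{\Lambda(n)} - \sum_{j=1}^{k^d} K_{C_j} \Big\| \le F(m). \tag{4.1}$$

We will also use an immediate consequence of Lemma 4.1.

Corollary 4.2. Let $\Psi$ be an interaction with $\|\Psi\| < \infty$. Then there exists a function $F(m) = F(m, \Psi)$ with $\lim_{m\to\infty} F(m) = 0$, such that

$$\limsup_{n\to\infty} \Big\| \frac{1}{|\Lambda(n)|} K_{\Lambda(n)} - \frac{1}{|\Lambda(km)|} \sum_{j=1}^{k^d} K_{C_j} \Big\| \le F(m). \tag{4.2}$$

Proof of Lemma 4.1. To simplify notation we set $l = m + 2g(m)$ in the proof. If $D = \{x \in \mathbb{Z}^d ;\ a_i \le x_i < a_i + l\}$ is a cube of side length $l$ and $r \in \mathbb{N}$ is such that $r < l/2$, we denote by $D_r = \{x \in \mathbb{Z}^d ;\ a_i + r \le x_i < a_i + l - r\}$ the cube of side length $l - 2r$ centered at the same point as $D$. Let us consider two cubes $D \subset D' \subset \mathbb{Z}^d$. We have

$$\begin{aligned}
\|K_{D'} - K_D\| &\le \sum_{\substack{X \subset D' \\ X \not\subset D}} \|\psi_X\| \le \sum_{x \in D'} \sum_{\substack{X \ni x \\ X \not\subset D}} \frac{1}{|X|} \|\psi_X\| \\
&\le \sum_{x \in D' \setminus D_r} \sum_{X \ni x} \frac{1}{|X|} \|\psi_X\| + \sum_{x \in D_r} \sum_{\substack{X \ni x \\ X \not\subset D}} \frac{1}{|X|} \|\psi_X\| \\
&\le |D' \setminus D_r| \, \|\Psi\| + |D_r| \sum_{\substack{X \ni 0 \\ \operatorname{diam}(X) > r}} \frac{1}{|X|} \|\psi_X\|.
\end{aligned} \tag{4.3}$$

Using (4.3), we have for any $r$,

$$\begin{aligned}
\limsup_{n\to\infty} \frac{1}{|\Lambda(n)|} \|K_{\Lambda(n)} - K_{\Lambda(kl)}\|
&\le \lim_{n\to\infty} \Big( \frac{|\Lambda(n) \setminus \Lambda_r(kl)|}{|\Lambda(n)|} \|\Psi\| + \frac{|\Lambda_r(kl)|}{|\Lambda(n)|} \sum_{\substack{X \ni 0 \\ \operatorname{diam}(X) > r}} \frac{1}{|X|} \|\psi_X\| \Big) \\
&= \sum_{\substack{X \ni 0 \\ \operatorname{diam}(X) > r}} \frac{1}{|X|} \|\psi_X\|.
\end{aligned}$$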
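Since $r$ is arbitrary, we have
\[ \limsup_{n\to\infty} \frac{1}{|\Lambda(n)|} \| K_{\Lambda(n)} - K_{\Lambda(kl)} \| = 0. \tag{4.4} \]
Using (4.3) again, we have
\begin{align*}
\limsup_{n\to\infty} \frac{1}{|\Lambda(n)|} \Big\| \sum_{j=1}^{k^d} (K_{\tilde C_j} - K_{C_j}) \Big\|
\le \lim_{n\to\infty} \frac{k^d |\Lambda(l)|}{|\Lambda(n)|} \Big( \frac{|\Lambda(l) \setminus \Lambda_r(m)|}{|\Lambda(l)|} \|\Psi\| + \frac{|\Lambda_r(m)|}{|\Lambda(l)|} \sum_{\substack{X \ni 0 \\ \mathrm{diam}(X) > r}} \frac{1}{|X|} \|\psi_X\| \Big).
\end{align*}
If $r = h(m)$ with $\lim_{m\to\infty} h(m) = \infty$ and $\lim_{m\to\infty} h(m)/m = 0$, we get
\[ \limsup_{n\to\infty} \frac{1}{|\Lambda(n)|} \Big\| \sum_{j=1}^{k^d} (K_{\tilde C_j} - K_{C_j}) \Big\| = o(m). \tag{4.5} \]
Finally
\begin{align*}
\Big\| K_{\Lambda(kl)} - \sum_{j=1}^{k^d} K_{\tilde C_j} \Big\|
&\le \sum_{\substack{X \subset \Lambda(kl) \\ X \not\subset \text{some } \tilde C_j}} \|\psi_X\|
= \sum_{\substack{X \subset \Lambda(kl) \\ X \not\subset \text{some } \tilde C_j}} \sum_{j=1}^{k^d} \frac{|X \cap \tilde C_j|}{|X|} \|\psi_X\| \\
&\le |\Lambda(kl)| \, \frac{1}{k^d} \sum_{j=1}^{k^d} \frac{1}{|\tilde C_j|} \sum_{X \not\subset \tilde C_j} \frac{|X \cap \tilde C_j|}{|X|} \|\psi_X\| = |\Lambda(kl)| \, d(\Psi, l) \tag{4.6}
\end{align*}
with
\begin{align*}
d(\Psi, l) &= \sum_{X \not\subset \Lambda(l)} \frac{|X \cap \Lambda(l)|}{|\Lambda(l)|} \frac{1}{|X|} \|\psi_X\|
= \sum_{x \in \Lambda(l)} \sum_{\substack{X \ni x \\ X \not\subset \Lambda(l)}} \frac{1}{|X| \, |\Lambda(l)|} \|\psi_X\| \\
&\le \frac{|\Lambda_r(l)|}{|\Lambda(l)|} \sum_{\substack{X \ni 0 \\ \mathrm{diam}(X) > r}} \frac{1}{|X|} \|\psi_X\| + \frac{|\Lambda(l)| - |\Lambda_r(l)|}{|\Lambda(l)|} \|\Psi\|. \tag{4.7}
\end{align*}
Since $l = m + 2g(m)$, if we pick $r = h(m)$ as above we get
\[ \limsup_{n\to\infty} \frac{1}{|\Lambda(n)|} \Big\| K_{\Lambda(kl)} - \sum_{j=1}^{k^d} K_{\tilde C_j} \Big\| = o(m). \tag{4.8} \]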
Combining the bounds (4.4), (4.5) and (4.8) concludes the proof of Lemma 4.1.
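To see the mechanism of Lemma 4.1 in the simplest setting, the following sketch counts, for $d = 1$ and a nearest-neighbour interaction (an assumption made only for this illustration), the bonds $\{i, i+1\} \subset \Lambda(n)$ that are not contained in any inner block $C_j$. If every bond term has norm at most 1, this count bounds $\| K_{\Lambda(n)} - \sum_j K_{C_j} \|$. The names `cut_bond_fraction` and `limit_fraction` are hypothetical helpers.

```python
def cut_bond_fraction(n: int, m: int, g: int) -> float:
    """Number of nearest-neighbour bonds of {0, ..., n-1} not contained in
    any inner block C_j, divided by n (d = 1, corridors of width g)."""
    block = m + 2 * g
    k = 2 * (n // (2 * block))     # largest even number of blocks
    bonds_total = n - 1            # bonds {i, i+1} inside Lambda(n)
    bonds_inside = k * (m - 1)     # each C_j has m sites, hence m - 1 bonds
    return (bonds_total - bonds_inside) / n


def limit_fraction(m: int, g: int) -> float:
    """Limit of cut_bond_fraction(n, m, g) as n -> infinity."""
    return (2 * g + 1) / (m + 2 * g)


if __name__ == "__main__":
    for m, g in [(10, 1), (100, 10), (10_000, 100)]:
        n = 2 * (m + 2 * g) * 1000
        print(m, g, cut_bond_fraction(n, m, g), limit_fraction(m, g))
```

In this toy case the role of $F(m)$ is played by $(2g(m)+1)/(m+2g(m))$, which tends to 0 as soon as $g(m)/m \to 0$.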
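Proof of Corollary 4.2. An easy estimate shows that the norm of the difference between $|\Lambda(n)|^{-1} K_{\Lambda(n)} - |\Lambda(km)|^{-1} \sum_{j=1}^{k^d} K_{C_j}$ and $|\Lambda(n)|^{-1} \big( K_{\Lambda(n)} - \sum_{j=1}^{k^d} K_{C_j} \big)$ is $O(g(m)/m) \, \|\Psi\|$.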
The second fact is a general remark on the strategy to prove the existence of a concave RL-function [23].

Remark 4.3. Let $x, x_1, x_2$ be such that $\frac12 (x_1 + x_2) = x$ and let $0 < \varepsilon' < \varepsilon$. To prove the existence of a concave RL-function it is enough to prove that
\[ \underline{m}(B_\varepsilon(x)) \ge \frac{\overline{m}(B_{\varepsilon'}(x_1)) + \overline{m}(B_{\varepsilon'}(x_2))}{2}. \tag{4.9} \]
Indeed if we set $x_1 = x_2 = x$ in (4.9), then we obtain $\underline{s}(x) \ge \overline{s}(x)$, and therefore the Ruelle–Lanford function $s(x)$ exists. Using then (4.9) again, we obtain that
\[ s(x) \ge \frac{s(x_1) + s(x_2)}{2}. \]
Since $s(x)$ is upper-semicontinuous, this implies that $s(x)$ is concave.

4.2. Tracial state and conserved quantities

In this section, we prove a quantum large deviation theorem in the simplest possible case. We bypass a number of issues associated with taking thermodynamic limits for the states by considering first the finite volume Gibbs states
\[ \omega_{\Lambda(n)}(A) = \frac{\mathrm{tr}(A e^{-H_{\Lambda(n)}})}{\mathrm{tr}(e^{-H_{\Lambda(n)}})}. \]
In addition, we assume that the macroscopic observable $K_\Lambda$ is a conserved quantity, i.e. the commutators $[K_\Lambda, H_\Lambda]$ vanish for all $\Lambda$. Note that, although very restrictive, this condition is, in general, satisfied for thermodynamic quantities such as magnetization, density, energy, etc. The following theorem provides a (weak) justification that macroscopic conserved quantities are exponentially concentrated in equilibrium. An important special case is the case where $H_\Lambda = 0$, that is, one considers the tracial state $\mathrm{tr}$. In this case the observable $K_\Lambda$ can be chosen arbitrarily and the rate function $s(x)$ is the microcanonical entropy whose existence is of course well known. The large deviation statement for the tracial state can be found, e.g., in [36]; the only novelty here, maybe, is a very simple proof.

Theorem 4.4. Let $\Phi$ and $\Psi$ be interactions with $\|\Phi\| < \infty$ and $\|\Psi\| < \infty$. Suppose that the commutators $[K_{\Lambda(n)}, H_{\Lambda(n)}]$ vanish for all $n$. Then the probability measures
\[ \mu_n(A) = \frac{\mathrm{tr}\big( I_A(|\Lambda(n)|^{-1} K_{\Lambda(n)}) \, e^{-H_{\Lambda(n)}} \big)}{\mathrm{tr}(e^{-H_{\Lambda(n)}})}, \]
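satisfy a large deviation principle on the scale $|\Lambda(n)|$ with a concave rate function $s(x)$. We have
\[ s(x) = \inf_\alpha \big( P(\alpha) - \alpha x \big), \qquad \sup_x \big( \alpha x + s(x) \big) = P(\alpha), \]
where $P(\alpha) = \lim_{n\to\infty} |\Lambda(n)|^{-1} \log \mathrm{tr}\big( e^{-H_{\Lambda(n)} + \alpha K_{\Lambda(n)}} \big)$ is the translated free energy.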
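The Legendre duality in Theorem 4.4 can be checked numerically in a toy case. The sketch below assumes the tracial case $H_\Lambda = 0$ with a one-site observable having eigenvalues $\pm 1$ (an illustrative choice, not taken from the paper), for which $P(\alpha) = \log(2\cosh\alpha)$ with the unnormalized trace. The infimum $\inf_\alpha (P(\alpha) - \alpha x)$ should then reproduce the binary entropy of $(1+x)/2$. All function names are hypothetical.

```python
import math


def pressure(alpha: float) -> float:
    """P(alpha) = log(2 cosh(alpha)), written in an overflow-safe form."""
    a = abs(alpha)
    return a + math.log1p(math.exp(-2.0 * a))


def legendre_inf(x: float, lo: float = -50.0, hi: float = 50.0,
                 iters: int = 200) -> float:
    """inf over alpha in [lo, hi] of P(alpha) - alpha * x.
    The objective is convex in alpha, so ternary search applies."""
    def f(a: float) -> float:
        return pressure(a) - a * x

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return f(0.5 * (lo + hi))


def binary_entropy(x: float) -> float:
    """Entropy (in nats) of a +-1 spin with mean x, for -1 < x < 1."""
    p = (1.0 + x) / 2.0
    q = 1.0 - p
    return -sum(t * math.log(t) for t in (p, q) if t > 0.0)


if __name__ == "__main__":
    for x in (-0.9, -0.5, 0.0, 0.3, 0.8):
        print(x, legendre_inf(x), binary_entropy(x))
```

The two columns printed by the script agree to numerical precision for $|x| < 1$; the minimizer is $\alpha = \operatorname{artanh} x$.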
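Proof. Let us choose $x, x_1, x_2$ and $\varepsilon, \varepsilon'$ as in Remark 4.3. Given $n > m$ let $k$ be the largest even integer such that $n = km + r$ with $0 \le r \le 2m - 1$ (having $k$ even is useful later). Divide the cube $\Lambda(km)$ into $k^d$ disjoint contiguous cubes $C_j$, $j = 1, \ldots, k^d$, each of which is a translate of the cube $\Lambda(m)$.

Let us denote by $\lambda_j^{(n)}$ the eigenvalues of $H_{\Lambda(n)}$ and by $\mu_j^{(n)}$ the eigenvalues of $K_{\Lambda(n)}$. Since $H_{\Lambda(n)}$ and $K_{\Lambda(n)}$ commute we have
\[ \mu_n(B_\varepsilon(x)) = \frac{\sum_{j ;\ \mu_j^{(n)}/|\Lambda(n)| \in B_\varepsilon(x)} e^{-\lambda_j^{(n)}}}{\sum_j e^{-\lambda_j^{(n)}}}. \tag{4.10} \]
By Corollary 4.2, we can choose $M$ and $N = N_m$ so that for $m > M$ and $n > N$ we have
\[ \Big\| |\Lambda(n)|^{-1} K_{\Lambda(n)} - |\Lambda(km)|^{-1} \sum_{j=1}^{k^d} K_{C_j} \Big\| \le (\varepsilon - \varepsilon'). \]
Let $\mu^{(m)}$ be an eigenvalue of $K_{\Lambda(m)}$ with $\mu^{(m)}/|\Lambda(m)| \in B_{\varepsilon'}(x_1)$ and let $\hat\mu^{(m)}$ be an eigenvalue of $K_{\Lambda(m)}$ with $\hat\mu^{(m)}/|\Lambda(m)| \in B_{\varepsilon'}(x_2)$. Let us assign $\mu^{(m)}$ to each cube $C_j$ with $j = 1, \ldots, \frac{k^d}{2}$ and $\hat\mu^{(m)}$ to each cube $C_j$ with $j = \frac{k^d}{2} + 1, \ldots, k^d$. Then $\tilde\mu^{(km)} \equiv \frac{k^d}{2} (\mu^{(m)} + \hat\mu^{(m)})$ is an eigenvalue of $\sum_j K_{C_j}$ such that $\tilde\mu^{(km)}/|\Lambda(km)| \in B_{\varepsilon'}(x)$. For $m > M$ and $n \ge N = N_m$, by Weyl's perturbation theorem, for any choice of $\mu^{(m)}$ and $\hat\mu^{(m)}$ there exists an eigenvalue $\mu^{(n)}$ of $K_{\Lambda(n)}$ such that $\mu^{(n)}/|\Lambda(n)| \in B_\varepsilon(x)$.

Assume that the eigenvalues $\lambda_i^{(n)}$ of $H_{\Lambda(n)}$ are listed in increasing order, counting multiplicity. Let $\tilde\lambda_i^{(n)}$ be the eigenvalues of $\sum_j H_{C_j} \otimes 1_{\Lambda(n) \setminus \Lambda(km)}$, also listed in increasing order. By Weyl's perturbation theorem and Lemma 4.1, there exists $M$ such that for $m > M$ there exists $N = N_m$ such that for $n \ge N$ we have
\[ \tilde\lambda_i^{(n)} - |\Lambda(n)| F(m) \le \lambda_i^{(n)} \le \tilde\lambda_i^{(n)} + |\Lambda(n)| F(m). \]
Using the formula (4.10), we obtain that
\[ \mu_n(B_\varepsilon(x)) \ge \mu_m(B_{\varepsilon'}(x_1))^{\frac{k^d}{2}} \, \mu_m(B_{\varepsilon'}(x_2))^{\frac{k^d}{2}} \, e^{-2 |\Lambda(n)| F(m)} \]
and thus
\[ \frac{\log \mu_n(B_\varepsilon(x))}{|\Lambda(n)|} \ge \Big( \frac{\log \mu_m(B_{\varepsilon'}(x_1))}{2 |\Lambda(m)|} + \frac{\log \mu_m(B_{\varepsilon'}(x_2))}{2 |\Lambda(m)|} \Big) \frac{k^d |\Lambda(m)|}{|\Lambda(n)|} - 2 F(m). \]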
To conclude we take first a $\liminf$ over $n$ keeping $m$ fixed and then choose a subsequence $m_l$ such that $\lim_{l\to\infty} |\Lambda(m_l)|^{-1} \log \mu_{m_l}(B_{\varepsilon'}(x_1)) = \overline{m}(B_{\varepsilon'}(x_1))$. Together with Remark 4.3 this concludes the proof of Theorem 4.4.

Theorem 4.5. Let $\Phi$ and $\Psi$ be interactions with $\|\Phi\| < \infty$ and $\|\Psi\| < \infty$. Suppose that the commutators $[K_{\Lambda(n)}, H_{\Lambda(n)}]$ vanish for all $n$. Suppose $\omega^{(\Phi)}$ satisfies the condition (3.1). Then the probability measures $\mu_n(A) = \omega^{(\Phi)}(I_A(|\Lambda(n)|^{-1} K_{\Lambda(n)}))$ satisfy a large deviation principle on the scale $|\Lambda(n)|$ with a concave rate function $s(x)$. We have
\[ \sup_x \big( \alpha x + s(x) \big) = P(\alpha), \qquad s(x) = \inf_\alpha \big( P(\alpha) - \alpha x \big), \]
where $P(\alpha) = \lim_{n\to\infty} |\Lambda(n)|^{-1} \log \mathrm{tr}\big( e^{-H_{\Lambda(n)} + \alpha K_{\Lambda(n)}} \big)$ is the translated free energy.

Proof. Since
\[ \omega^{(\Phi)}\big( I_A(|\Lambda(n)|^{-1} K_{\Lambda(n)}) \big) \ge e^{-c(n)} \, \frac{\mathrm{tr}\big( I_A(|\Lambda(n)|^{-1} K_{\Lambda(n)}) \, e^{-H_{\Lambda(n)}} \big)}{\mathrm{tr}(e^{-H_{\Lambda(n)}})}, \]
the theorem follows immediately from Theorem 4.4.

Remark 4.6 (Equivalence of Ensembles). For the tracial case it is not difficult [36] to show the variational formula $s(x) = \sup\{ s(\omega) ;\ \omega(A_\Psi) = x \}$, where $s(\omega)$ is the specific entropy of the state $\omega$, and that the supremum is attained exactly if $\omega = \omega^{\beta\Phi}$ is a Gibbs-KMS state at temperature $\beta = \beta(x)$ with $\beta$ chosen in such a way that $\omega^{\beta\Psi}(A_\Psi) = x$. This is the equivalence of ensembles: the thermodynamic function entropy can be computed via microcanonical or canonical prescriptions. Furthermore, the LDP can be used to prove that suitable microcanonical states are equivalent to canonical states, see [36] for the classical case and [24, 25] for the quantum case. Non-commutative versions of equivalence of ensembles are considered in [7].

4.3. Classical subalgebras

In this section, we assume that $\omega$ is an asymptotically decoupled state and that $\Psi \in \mathcal{B}$ is a classical interaction, i.e. there exists a classical subalgebra $\mathcal{O}^{(cl)} \subset \mathcal{O}$ such that, for all $X$, $\psi_X \in \mathcal{O}^{(cl)}$. For example, if $\Psi = \{\psi_x\}_{x \in \mathbb{Z}^d}$ consists only of “one-site” interactions then $\Psi$ is classical. More generally any classical spin system is described by a classical interaction. Note that we do not assume any relation between the interaction $\Psi$ and the state $\omega$; if $\omega = \omega^{\Phi}$ is a Gibbs state for the interaction $\Phi$ then $\Phi$ and $\Psi$ need not commute. As noted in Sec. 3.1 the restriction of $\omega$ to $\mathcal{O}^{(cl)}$ can be identified with a probability measure $d\omega^{(cl)}$ on the configuration space $L$. Furthermore, it is easy to see that the state $\omega^{(cl)}$ on the $C^*$-algebra $\mathcal{O}^{(cl)} \simeq C(L)$ is asymptotically decoupled whenever the state $\omega$ on $\mathcal{O}$ is asymptotically decoupled.
We have

Theorem 4.7. Let $\Psi$ be a classical interaction with $\|\Psi\| < \infty$ and let $\omega$ be an asymptotically decoupled state. Then the sequence of probability measures $\mu_n(A) = \omega(I_A(|\Lambda(n)|^{-1} K_{\Lambda(n)}))$ satisfies a large deviation principle on the scale $|\Lambda(n)|$ with a concave rate function $s(x)$. Moreover
\[ s(x) = \inf_\alpha \big( f(\alpha) - \alpha x \big), \qquad \text{where } f(\alpha) = \lim_{n\to\infty} \frac{1}{|\Lambda(n)|} \log \omega\big( \exp(\alpha K_{\Lambda(n)}) \big). \]
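Proof. The proof reduces to the classical case (see [23]) since the measures $\mu_n$ can be written as
\[ \mu_n(A) = \omega^{(cl)}\big( I_A(|\Lambda(n)|^{-1} K_{\Lambda(n)}) \big) = \int I_{\{ |\Lambda(n)|^{-1} K_{\Lambda(n)} \in A \}}(l) \, d\omega^{(cl)}(l) \]
and the restriction $\omega^{(cl)}$ of $\omega$ to $\mathcal{O}^{(cl)}$ is asymptotically decoupled. Following Remark 4.3 we choose arbitrary $x, x_1, x_2$ such that $\frac{x_1}{2} + \frac{x_2}{2} = x$ and $0 < \varepsilon' < \varepsilon$. We divide the cube $\Lambda(n)$ as explained before Lemma 4.1. We choose $M$ and $N = N_m$ such that for $m > M$ and $n > N$
\[ \Big\| \frac{1}{|\Lambda(n)|} K_{\Lambda(n)} - \frac{1}{|\Lambda(km)|} \sum_{j=1}^{k^d} K_{C_j} \Big\| \le \varepsilon - \varepsilon'. \tag{4.11} \]
Let $l_{C_j}$ be configurations such that $K_{C_j}(l_{C_j})/|C_j| \in B_{\varepsilon'}(x_1)$ for $1 \le j \le \frac{k^d}{2}$ and $K_{C_j}(l_{C_j})/|C_j| \in B_{\varepsilon'}(x_2)$ for $\frac{k^d}{2} + 1 \le j \le k^d$. By (4.11) any configuration $l_{\Lambda(n)}$ which coincides with $l_{C_j}$ on all $C_j$ satisfies $K_{\Lambda(n)}(l_{\Lambda(n)})/|\Lambda(n)| \in B_\varepsilon(x)$. Therefore, using the fact that $\omega^{(cl)}$ is asymptotically decoupled, we have the bound
\begin{align*}
\omega\Big( I_{B_\varepsilon(x)}\Big( \frac{K_{\Lambda(n)}}{|\Lambda(n)|} \Big) \Big)
&= \int I_{\big\{ \frac{K_{\Lambda(n)}}{|\Lambda(n)|} \in B_\varepsilon(x) \big\}} \, d\omega^{(cl)} \\
&\ge \int \prod_{j=1}^{k^d/2} I_{\big\{ \frac{K_{C_j}}{|C_j|} \in B_{\varepsilon'}(x_1) \big\}} \prod_{j=k^d/2+1}^{k^d} I_{\big\{ \frac{K_{C_j}}{|C_j|} \in B_{\varepsilon'}(x_2) \big\}} \, d\omega^{(cl)} \\
&\ge \Big( \int I_{\big\{ \frac{K_{\Lambda(m)}}{|\Lambda(m)|} \in B_{\varepsilon'}(x_1) \big\}} \, d\omega^{(cl)} \Big)^{\frac{k^d}{2}} \Big( \int I_{\big\{ \frac{K_{\Lambda(m)}}{|\Lambda(m)|} \in B_{\varepsilon'}(x_2) \big\}} \, d\omega^{(cl)} \Big)^{\frac{k^d}{2}} e^{-c(m) k^d}.
\end{align*}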
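Thus we obtain
\[ \frac{\log \mu_n(B_\varepsilon(x))}{|\Lambda(n)|} \ge \Big( \frac{\log \mu_m(B_{\varepsilon'}(x_1))}{2 |\Lambda(m)|} + \frac{\log \mu_m(B_{\varepsilon'}(x_2))}{2 |\Lambda(m)|} \Big) \frac{k^d |\Lambda(m)|}{|\Lambda(n)|} - \frac{1}{|\Lambda(n)|} c(m) k^d. \]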
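The product bound behind this inequality can be verified exactly in the simplest classical example. The sketch below assumes $d = 1$, a product (i.i.d. Bernoulli) state, which is decoupled with $g = 0$ and $c = 0$, a one-site observable counting occupied sites, $n = km$ with no remainder, and closed intervals in place of the balls $B_\varepsilon$. These choices and the names `block_prob` and `product_bound` are assumptions made for illustration only.

```python
from fractions import Fraction
from math import comb
from typing import Tuple


def block_prob(m: int, p: Fraction, x: Fraction, eps: Fraction) -> Fraction:
    """Exact probability that K/m lies in [x - eps, x + eps],
    where K is Binomial(m, p)."""
    return sum(
        (Fraction(comb(m, s)) * p ** s * (1 - p) ** (m - s)
         for s in range(m + 1)
         if abs(Fraction(s, m) - x) <= eps),
        Fraction(0),
    )


def product_bound(m: int, k: int, p: Fraction, x1: Fraction, x2: Fraction,
                  eps_prime: Fraction, eps: Fraction) -> Tuple[Fraction, Fraction]:
    """Return (mu_n(B_eps(x)), mu_m(B_eps'(x1))^(k/2) * mu_m(B_eps'(x2))^(k/2))
    for n = k * m and x = (x1 + x2) / 2, with k even."""
    x = (x1 + x2) / 2
    lhs = block_prob(k * m, p, x, eps)
    rhs = (block_prob(m, p, x1, eps_prime) ** (k // 2)
           * block_prob(m, p, x2, eps_prime) ** (k // 2))
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = product_bound(6, 4, Fraction(1, 3), Fraction(1, 6),
                             Fraction(1, 2), Fraction(1, 12), Fraction(1, 6))
    print(float(lhs), float(rhs), lhs >= rhs)
```

Because the arithmetic is done with `Fraction`, the comparison `lhs >= rhs` is exact rather than subject to rounding.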
We conclude by taking the $\liminf$ over $n$ and then choosing a subsequence $m_l$ such that $\lim_{l\to\infty} |\Lambda(m_l)|^{-1} \log \mu_{m_l}(B_{\varepsilon'}(x_1)) = \overline{m}(B_{\varepsilon'}(x_1))$. The identification of the rate function follows from Varadhan's lemma.

Remark 4.8. One can show (see [32, 29] for more details) that the rate function satisfies the following variational characterization: $s(x) = \sup\{ -h_{cl}(\nu, \omega^{(cl)}) ;\ \nu(A_\Psi) = x \}$, where $h_{cl}$ is the classical relative entropy per unit volume, and the supremum is taken over all classical translation invariant states.

4.4. Dimension 1

Throughout this section we assume that $d = 1$ (so we write $|\Lambda(n)| = n$) and that $\omega$ is an asymptotically decoupled state; for example we may assume that $\omega$ is a KMS-Gibbs state for a finite range interaction. We also assume that $\Psi$ is a finite range interaction. The crucial estimate needed to control the effect of non-commutativity is an estimate on the difference between the spectral projections associated to $K_{\Lambda(n)}$ and $\sum_{j=1}^{k} K_{C_j}$ (see Sec. 4.1). To prove this we rely on a “cocycle estimate” proved in [1], which follows from the fact that the time-evolution $\tau_t(A)$ of any local observable $A$ for a finite-range quantum spin system can be extended to an entire analytic function of $t$. This allows us to prove the following “exponential version” of Lemma 4.1.

Proposition 4.9. Let $\Psi$ be a finite range interaction of range $R$ and let $\beta \in \mathbb{R}$. Then there exists a function $F_\beta(m) = F_\beta(m, R, \Psi)$ with
\[ \lim_{m\to\infty} F_\beta(m) = 0 \tag{4.12} \]
such that
\[ \limsup_{n\to\infty} \frac{1}{n} \log \Big\| e^{\beta K_{\Lambda(n)}} e^{-\beta \sum_{j=1}^{k} K_{C_j}} \Big\| \le |\beta| F_\beta(m). \tag{4.13} \]
Proof. The proof is an application of the results in [1], see in particular Secs. 4 and 5. The basic bound in [1, Sec. 5] is that if $A_X \in \mathcal{O}_X$ with $\mathrm{diam}(X) \le R$ then there exists a constant $D(\beta, R, \Psi)$ such that
\[ \Big\| e^{\beta K_{\Lambda(n)}} e^{-\beta (K_{\Lambda(n)} - A_X)} \Big\| \le e^{|\beta| D(\beta, R, \Psi) \|A_X\|}. \tag{4.14} \]
The bound (4.14) follows from the Dyson formula and estimates (uniform in $n$) on the dynamics in imaginary time generated by the Hamiltonian $K_{\Lambda(n)}$. To apply these
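results here we write
\[ K_{\Lambda(n)} = \sum_{j=1}^{k} K_{C_j} + \sum_{\substack{X \subset \Lambda(n) \\ X \not\subset \text{some } C_j}} \psi_X. \]
Let $t_X \in \{0, 1\}$ and let us define the family of interpolating Hamiltonians
\[ K_{\Lambda(n)}(\{t_X\}) = \sum_{j=1}^{k} K_{C_j} + \sum_{\substack{X \subset \Lambda(n) \\ X \not\subset \text{some } C_j}} t_X \psi_X. \]
The estimates on the dynamics in [1] are easily seen to be uniform in $\{t_X\}$ and so we can apply the bound (4.14) iteratively, changing at each step one $t_X$ from 1 to 0. Using that $\Psi$ has a finite range $R$ we obtain the bound
\[ \Big\| e^{\beta K_{\Lambda(n)}} e^{-\beta \sum_{j=1}^{k} K_{C_j}} \Big\| \le \exp\Big( |\beta| D(\beta, R, \Psi) \sum_{\substack{X \subset \Lambda(n) \\ X \not\subset \text{some } C_j}} \|\psi_X\| \Big). \]
But the sum over $X$ is now treated exactly as in Lemma 4.1 and we find $F_\beta(m) = F(m) D(\beta, R, \Psi)$.

We use this bound to prove an exponential estimate which controls how the spectral projections change when we replace $K_{\Lambda(n)}$ by $\sum_{j=1}^{k} K_{C_j}$.

Proposition 4.10. Let $\varepsilon > \varepsilon' > 0$. Then for any $\alpha > 0$ there exists a function $\tilde F_\alpha(m)$ with $\lim_{m\to\infty} \tilde F_\alpha(m) = 0$ such that
\[ \limsup_{n\to\infty} \frac{1}{n} \log \Big\| I_{B_{\varepsilon'}(x)}\Big( (mk)^{-1} \sum_{j=1}^{k} K_{C_j} \Big) I_{B_\varepsilon(x)^C}\big( n^{-1} K_{\Lambda(n)} \big) \Big\| \le -\alpha \big( \varepsilon - \varepsilon' - \tilde F_\alpha(m) \big). \tag{4.15} \]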
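Proof. Let us write
\[ K_{\Lambda(n)} = \sum_i \mu_i P_i, \qquad \sum_{j=1}^{k} K_{C_j} = \sum_l \lambda_l Q_l, \tag{4.16} \]
where $P_i$ and $Q_l$ are rank-one projections and $\mu_i$ and $\lambda_l$ are the eigenvalues of $K_{\Lambda(n)}$ and $\sum_j K_{C_j}$. For any $\beta \in \mathbb{R}$
\begin{align*}
I_{B_\delta(y)}\big( n^{-1} K_{\Lambda(n)} \big) &= \sum_{i ;\ \frac{\mu_i}{n} \in B_\delta(y)} P_i
= e^{\beta (K_{\Lambda(n)} - ny)} \sum_{i ;\ \frac{\mu_i}{n} \in B_\delta(y)} e^{-\beta (\mu_i - ny)} P_i \\
&\equiv e^{\beta (K_{\Lambda(n)} - ny)} V_{\beta, y, \delta} \tag{4.17}
\end{align*}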
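and
\begin{align*}
I_{B_{\varepsilon'}(x)}\Big( (mk)^{-1} \sum_{j=1}^{k} K_{C_j} \Big) &= \sum_{l ;\ \frac{\lambda_l}{mk} \in B_{\varepsilon'}(x)} Q_l
= \sum_{l ;\ \frac{\lambda_l}{mk} \in B_{\varepsilon'}(x)} e^{\beta (\lambda_l - xn)} Q_l \, e^{-\beta (\sum_j K_{C_j} - xn)} \\
&\equiv W_{\beta, x, \varepsilon'} \, e^{-\beta (\sum_j K_{C_j} - xn)}, \tag{4.18}
\end{align*}
with the bounds
\[ \| V_{\beta, y, \delta} \| \le e^{|\beta| n \delta}, \qquad \| W_{\beta, x, \varepsilon'} \| \le e^{|\beta| mk (\varepsilon' + (\frac{n}{mk} - 1) |x|)}. \tag{4.19} \]
If $y > x$ we choose $\beta = \alpha > 0$ and using Eqs. (4.17) and (4.18) as well as the bounds (4.13) and (4.19), we obtain
\begin{align*}
&\limsup_{n\to\infty} \frac{1}{n} \log \Big\| I_{B_{\varepsilon'}(x)}\Big( (mk)^{-1} \sum_{j=1}^{k} K_{C_j} \Big) I_{B_\delta(y)}\big( n^{-1} K_{\Lambda(n)} \big) \Big\| \\
&\quad = \limsup_{n\to\infty} \frac{1}{n} \log \Big\| W_{\alpha, x, \varepsilon'} \, e^{-\alpha (\sum_j K_{C_j} - nx)} e^{\alpha (K_{\Lambda(n)} - ny)} V_{\alpha, y, \delta} \Big\| \\
&\quad \le \limsup_{n\to\infty} \Big( -\alpha (y - x) + \frac{1}{n} \log \Big\| e^{-\alpha \sum_j K_{C_j}} e^{\alpha K_{\Lambda(n)}} \Big\| + \frac{\alpha}{n} \big( n\delta + mk\varepsilon' + (n - mk) |x| \big) \Big) \\
&\quad \le -\alpha (y - x) + \alpha F_\alpha(m) + \alpha (\delta + \varepsilon') + \alpha \frac{g(m)}{m} |x|.
\end{align*}
Similarly, for $y < x$, we choose $\beta = -\alpha$ and obtain a similar bound and finally
\begin{align*}
&\limsup_{n\to\infty} \frac{1}{n} \log \Big\| I_{B_{\varepsilon'}(x)}\Big( (mk)^{-1} \sum_{j=1}^{k} K_{C_j} \Big) I_{B_\delta(y)}\big( n^{-1} K_{\Lambda(n)} \big) \Big\| \\
&\quad \le -\alpha |y - x| + \alpha F_\alpha(m) + \alpha (\delta + \varepsilon') + \alpha \frac{g(m)}{m} |x|. \tag{4.20}
\end{align*}
Next we choose $\delta$ such that $\varepsilon > 2\delta + \varepsilon'$ and choose finitely many intervals $T_l$ and $x_l \in T_l$, $l = 1, \ldots, L$, such that
\[ B_\varepsilon(x)^C \cap [-\|\Psi\|, \|\Psi\|] = \bigcup_l T_l, \qquad T_l \subset B_\delta(x_l). \]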
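By the principle of the largest term, and using the bound (4.20), we obtain
\begin{align*}
&\limsup_{n\to\infty} \frac{1}{n} \log \Big\| I_{B_{\varepsilon'}(x)}\Big( (mk)^{-1} \sum_{j=1}^{k} K_{C_j} \Big) I_{B_\varepsilon(x)^C}\big( n^{-1} K_{\Lambda(n)} \big) \Big\| \\
&\quad \le \limsup_{n\to\infty} \frac{1}{n} \log \sum_{l=1}^{L} \Big\| I_{B_{\varepsilon'}(x)}\Big( (mk)^{-1} \sum_{j=1}^{k} K_{C_j} \Big) I_{T_l}\big( n^{-1} K_{\Lambda(n)} \big) \Big\| \\
&\quad \le \max_l \limsup_{n\to\infty} \frac{1}{n} \log \Big\| I_{B_{\varepsilon'}(x)}\Big( (mk)^{-1} \sum_{j=1}^{k} K_{C_j} \Big) I_{B_\delta(x_l)}\big( n^{-1} K_{\Lambda(n)} \big) \Big\| \\
&\quad \le -\alpha (\varepsilon - \varepsilon' - \delta) + \alpha \Big( F_\alpha(m) + \frac{g(m)}{m} |x| \Big). \tag{4.21}
\end{align*}
Since $\delta$ is arbitrary this concludes the proof with $\tilde F_\alpha(m) = F_\alpha(m) + \frac{g(m)}{m} |x|$.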
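The principle of the largest term used in this proof says that, for finitely many sequences decaying like $e^{-n r_l}$, the quantity $\frac1n \log \sum_l e^{-n r_l}$ converges to $-\min_l r_l$. A minimal numerical sketch follows; the rates are arbitrary illustrative numbers and `scaled_log_sum` is a hypothetical helper name.

```python
import math
from typing import Sequence


def scaled_log_sum(rates: Sequence[float], n: int) -> float:
    """Compute (1/n) * log(sum_l exp(-n * r_l)) with a log-sum-exp shift,
    so that large n does not cause underflow."""
    exponents = [-n * r for r in rates]
    top = max(exponents)
    return (top + math.log(sum(math.exp(e - top) for e in exponents))) / n


if __name__ == "__main__":
    rates = [0.3, 0.7, 1.5]
    for n in (10, 100, 10_000):
        # Approaches -min(rates) = -0.3 from above, at rate log(L)/n.
        print(n, scaled_log_sum(rates, n))
```

For $L$ terms the finite-$n$ value lies between $-\min_l r_l$ and $-\min_l r_l + \frac{\log L}{n}$, which is why only the slowest-decaying term survives in the limit.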
With this estimate we can now prove

Theorem 4.11. Let $d = 1$, let $\omega$ be an asymptotically decoupled translation invariant state, and let $\Psi$ be a finite range interaction. Then the sequence of probability measures $\mu_n(A) = \omega(I_A(n^{-1} K_{\Lambda(n)}))$ satisfies a large deviation principle with a concave rate function $s(x)$. Moreover
\[ s(x) = \inf_\alpha \big( f(\alpha) - \alpha x \big), \qquad \text{where } f(\alpha) = \lim_{n\to\infty} n^{-1} \log \omega\big( \exp(\alpha K_{\Lambda(n)}) \big). \]
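Proof. Let $\omega$ be an asymptotically decoupled state with parameters $g$ and $c$. Let $x, x_1, x_2$ be such that $\frac{x_1}{2} + \frac{x_2}{2} = x$ and $0 < \varepsilon' < \varepsilon$. For any $n > m$ we decompose $\Lambda(n)$ as in Sec. 4.1. Note that
\[ \prod_{j=1}^{k/2} I_{B_{\varepsilon'}(x_1)}\Big( \frac{K_{C_j}}{m} \Big) \prod_{j=k/2+1}^{k} I_{B_{\varepsilon'}(x_2)}\Big( \frac{K_{C_j}}{m} \Big) \le I_{B_{\varepsilon'}(x)}\Big( \frac{\sum_j K_{C_j}}{mk} \Big), \tag{4.22} \]
and that for any projections $P$ and $Q$ and a state $\omega$ we have
\begin{align*}
\omega(P) &= \omega(QPQ) + \omega\big( (1 - Q) P Q + Q P (1 - Q) \big) + \omega\big( (1 - Q) P (1 - Q) \big) \\
&\le \omega(Q) + 2 \| (1 - Q) P Q \| + \| (1 - Q) P (1 - Q) \| \\
&\le \omega(Q) + 3 \| (1 - Q) P \|. \tag{4.23}
\end{align*}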
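Inequality (4.23) can be spot-checked numerically. The sketch below restricts to the smallest nontrivial case, which is an assumption of the illustration and not of the proof: real $2 \times 2$ matrices, rank-one projections $P$ and $Q$, and a vector state $\omega$. All helper names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def unit(theta: float) -> Tuple[float, float]:
    """Unit vector in R^2 at angle theta."""
    return (math.cos(theta), math.sin(theta))


def projection(theta: float) -> Matrix:
    """Rank-one orthogonal projection onto the line spanned by unit(theta)."""
    u = unit(theta)
    return [[u[0] * u[0], u[0] * u[1]], [u[1] * u[0], u[1] * u[1]]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def op_norm(a: Matrix) -> float:
    """Operator norm of a 2x2 matrix: sqrt of the top eigenvalue of a^T a."""
    at = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]
    s = matmul(at, a)
    tr = s[0][0] + s[1][1]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    disc = max(tr * tr - 4.0 * det, 0.0)
    return math.sqrt(max((tr + math.sqrt(disc)) / 2.0, 0.0))


def expect(theta: float, a: Matrix) -> float:
    """Vector state omega(a) = <psi, a psi> with psi = unit(theta)."""
    psi = unit(theta)
    return sum(psi[i] * a[i][j] * psi[j] for i in range(2) for j in range(2))


def slack(a: float, b: float, t: float) -> float:
    """omega(Q) + 3 * ||(1 - Q) P|| - omega(P); (4.23) says this is >= 0."""
    p, q = projection(a), projection(b)
    one_minus_q = [[1.0 - q[0][0], -q[0][1]], [-q[1][0], 1.0 - q[1][1]]]
    return expect(t, q) + 3.0 * op_norm(matmul(one_minus_q, p)) - expect(t, p)


if __name__ == "__main__":
    grid = [i * math.pi / 12 for i in range(12)]
    print(min(slack(a, b, t) for a in grid for b in grid for t in grid))
```

The script prints the smallest slack over a grid of angles; it should be nonnegative up to rounding, with equality when $P = Q$.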
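Using that $\omega$ is asymptotically decoupled, and estimates (4.22) and (4.23), we obtain
\begin{align*}
&\frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_1)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) + \frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_2)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) \\
&\quad \le \frac{1}{mk} \log \omega\Big( \prod_{j=1}^{k/2} I_{B_{\varepsilon'}(x_1)}\Big( \frac{K_{C_j}}{m} \Big) \prod_{j=k/2+1}^{k} I_{B_{\varepsilon'}(x_2)}\Big( \frac{K_{C_j}}{m} \Big) \Big) + \frac{c(m) k}{mk} \\
&\quad \le \frac{1}{mk} \log \omega\Big( I_{B_{\varepsilon'}(x)}\Big( \frac{\sum_j K_{C_j}}{mk} \Big) \Big) + \frac{c(m)}{m} \\
&\quad \le \frac{1}{mk} \log \Big( \omega\Big( I_{B_\varepsilon(x)}\Big( \frac{K_{\Lambda(n)}}{n} \Big) \Big) + 3 \Big\| I_{B_{\varepsilon'}(x)}\Big( \frac{\sum_{j=1}^{k} K_{C_j}}{mk} \Big) I_{B_\varepsilon(x)^C}\Big( \frac{K_{\Lambda(n)}}{n} \Big) \Big\| \Big) + \frac{c(m)}{m}.
\end{align*}
Keeping $m$ fixed we take a $\liminf$ over $n$ and using Proposition 4.10 we obtain
\begin{align*}
&\frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_1)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) + \frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_2)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) \\
&\quad \le \Big( 1 + \frac{g(m)}{m} \Big) \max\big\{ \underline{m}(B_\varepsilon(x)), \ -\alpha (\varepsilon - \varepsilon' - \tilde F_\alpha(m)) \big\} + \frac{c(m)}{m}. \tag{4.24}
\end{align*}
To conclude we will use the bound (4.24) repeatedly.

(a) Assume first $x = x_1 = x_2$ and assume that $\underline{s}(x) > -\infty$. Choose first $\alpha$ so large that
\[ -\frac12 \alpha (\varepsilon - \varepsilon') < \underline{m}(B_\varepsilon(x)), \]
and then $M = M(\alpha)$ so that $\tilde F_\alpha(m) \le \frac12 (\varepsilon - \varepsilon')$ for $m > M$. By (4.24) we have then
\[ \frac{1}{m} \log \omega\Big( I_{B_{\varepsilon'}(x)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) \le \Big( 1 + \frac{g(m)}{m} \Big) \underline{m}(B_\varepsilon(x)) + \frac{c(m)}{m}, \]
and thus $\overline{m}(B_{\varepsilon'}(x)) \le \underline{m}(B_\varepsilon(x))$. This implies that the Ruelle–Lanford function $s(x)$ exists and is finite.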
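(b) Assume that $s(x) > -\infty$ and $x = \frac12 (x_1 + x_2)$. Repeating the same argument as in (a) one obtains, for $m$ large enough,
\begin{align*}
&\frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_1)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) + \frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_2)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) \\
&\quad \le \Big( 1 + \frac{g(m)}{m} \Big) \underline{m}(B_\varepsilon(x)) + \frac{c(m)}{m},
\end{align*}
and this implies that $\frac12 \overline{m}(B_{\varepsilon'}(x_1)) + \frac12 \overline{m}(B_{\varepsilon'}(x_2)) \le \underline{m}(B_\varepsilon(x))$. Thus the rate function $s(x)$ is concave wherever it is finite.

(c) Let us assume that $\underline{s}(x) = -\infty$. Then for any $t > 0$ we can find $\varepsilon_t$ such that for $\varepsilon < \varepsilon_t$ we have $\underline{m}(B_\varepsilon(x)) \le -t$. By (4.24) we have
\[ \frac{1}{m} \log \omega\Big( I_{B_{\varepsilon'}(x)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) \le \Big( 1 + \frac{g(m)}{m} \Big) \max\big\{ -t, \ -\alpha (\varepsilon - \varepsilon' - \tilde F_\alpha(m)) \big\} + \frac{c(m)}{m}, \]
and thus taking $m \to \infty$ we obtain $\overline{m}(B_{\varepsilon'}(x)) \le \max\{ -t, -\alpha (\varepsilon - \varepsilon') \}$, and so $\overline{s}(x) \le \max\{ -t, -\alpha\varepsilon \}$. Since $\alpha$ and $t$ are arbitrary we have $s(x) = -\infty$.

(d) Assume that $s(x) = -\infty$ and $x = \frac12 (x_1 + x_2)$. Repeating the same argument as in (c), for any $t > 0$ there exists $\varepsilon_t > 0$ such that for all $\alpha > 0$,
\begin{align*}
&\frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_1)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) + \frac{1}{2m} \log \omega\Big( I_{B_{\varepsilon'}(x_2)}\Big( \frac{K_{\Lambda(m)}}{m} \Big) \Big) \\
&\quad \le \Big( 1 + \frac{g(m)}{m} \Big) \max\big\{ -t, \ -\alpha (\varepsilon_t - \varepsilon' - \tilde F_\alpha(m)) \big\} + \frac{c(m)}{m}
\end{align*}
and this implies that $\frac12 \overline{m}(B_{\varepsilon'}(x_1)) + \frac12 \overline{m}(B_{\varepsilon'}(x_2)) \le \max\{ -t, -\alpha (\varepsilon_t - \varepsilon') \}$. Hence we obtain
\[ \frac12 \overline{s}(x_1) + \frac12 \overline{s}(x_2) = \frac12 s(x_1) + \frac12 s(x_2) = -\infty \le s(x). \]
Combining (a)–(d) shows the existence of a concave RL-function and this concludes the proof of Theorem 4.11.

Remark 4.12. A characterization of the rate function using classical relative entropies is proved in [29].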
Acknowledgments The first author was supported by JSPS Grant-in-Aid for Young Scientists (B), Hayashi Memorial Foundation for Female Natural Scientists, Sumitomo Foundation, and Inoue Foundation. The second author was supported by NSF, Grant DMS-0605058.
References

[1] H. Araki, Gibbs states of a one dimensional quantum lattice, Comm. Math. Phys. 14 (1969) 120–157.
[2] H. Araki, On the equivalence of the KMS condition and the variational principle for quantum lattice systems, Comm. Math. Phys. 38 (1974) 1–10.
[3] H. Araki and H. Moriya, Equilibrium statistical mechanics of fermion lattice systems, Rev. Math. Phys. 15 (2003) 93–198.
[4] I. Bjelakovic, J.-D. Deuschel, T. Kröger, R. Siegmund-Schultze, A. Szkola and R. Seiler, Typical support and Sanov large deviations of correlated states, Comm. Math. Phys. 279 (2008) 559–584.
[5] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vols. 1 and 2, Texts and Monographs in Physics (Springer, 1981).
[6] F. Comets, Grandes déviations pour des champs de Gibbs sur $\mathbb{Z}^d$, C. R. Acad. Sci. Paris Sér. I Math. 303 (1986) 511–513.
[7] W. De Roeck, C. Maes and K. Netočný, Quantum macrostates, equivalence of ensembles and an H-theorem, J. Math. Phys. 47 (2006) 073303, 12 pp.
[8] W. De Roeck, C. Maes, K. Netočný and L. Rey-Bellet, A note on the noncommutative Laplace–Varadhan integral lemma, Rev. Math. Phys. 22(7) (2010) 839–858.
[9] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, Applications of Mathematics, Vol. 38, 2nd edn. (Springer, 1998).
[10] J.-D. Deuschel, D. W. Stroock and H. Zessin, Microcanonical distributions for lattice gases, Comm. Math. Phys. 139 (1991) 83–101.
[11] M. Fannes, B. Nachtergaele and R. F. Werner, Finitely correlated states on quantum spin chains, Comm. Math. Phys. 144 (1992) 443–490.
[12] H. Föllmer and S. Orey, Large deviations for the empirical field of a Gibbs measure, Ann. Probab. 16 (1988) 961–977.
[13] G. Gallavotti, J. L. Lebowitz and V. Mastropietro, Large deviations in rarefied quantum gases, J. Stat. Phys. 108 (2002) 831–861.
[14] H.-O. Georgii, Large deviations and maximum entropy principle for interacting random fields on $\mathbb{Z}^d$, Ann. Probab. 21 (1993) 1845–1875.
[15] H.-O. Georgii, Large deviations and the equivalence of ensembles for Gibbsian particle systems with superstable interaction, Probab. Theory Related Fields 99 (1994) 171–195.
[16] F. Hiai, M. Mosonyi and T. Ogawa, Large deviations and Chernoff bound for certain correlated states on a spin chain, J. Math. Phys. 48 (2007) 123301, 19 pp.
[17] F. Hiai, M. Mosonyi, H. Ohno and D. Petz, Free energy density for mean field perturbation of states of a one-dimensional spin chain, Rev. Math. Phys. 20 (2008) 335–365.
[18] F. Hiai and D. Petz, Entropy densities for algebraic states, J. Funct. Anal. 125 (1994) 287–308.
[19] R. B. Israel, Convexity in the Theory of Lattice Gases, Princeton Series in Physics (Princeton University Press, 1979).
[20] O. E. Lanford III, Entropy and equilibrium states in classical statistical mechanics, in Statistical Mechanics and Mathematical Problems, Lecture Notes in Physics, Vol. 20 (Springer, 1973), pp. 1–113.
[21] J. L. Lebowitz, M. Lenci and H. Spohn, Large deviations for ideal quantum systems, J. Math. Phys. 41 (2000) 1224–1243.
[22] M. Lenci and L. Rey-Bellet, Large deviations in quantum lattice systems: One phase region, J. Stat. Phys. 119 (2005) 715–746.
[23] J. T. Lewis, C.-E. Pfister and W. G. Sullivan, Entropy, concentration of probability and conditional limit theorems, Markov Process. Related Fields 1 (1995) 319–386.
[24] R. Lima, Equivalence of ensembles in quantum lattice systems, Ann. Inst. H. Poincaré Sect. A 15 (1971) 61–68.
[25] R. Lima, Equivalence of ensembles in quantum lattice systems: States, Comm. Math. Phys. 24 (1972) 180–192.
[26] T. Matsui, On non-commutative Ruelle transfer operator, Rev. Math. Phys. 13 (2001) 1183–1201.
[27] K. Netočný and F. Redig, Large deviations for quantum spin systems, J. Stat. Phys. 117 (2004) 521–547.
[28] Y. Ogata, Large deviations in quantum spin chains, Comm. Math. Phys. 296 (2010) 35–68.
[29] Y. Ogata and L. Rey-Bellet, The rate function for quantum large deviations, in preparation.
[30] S. Olla, Large deviations for Gibbs random fields, Probab. Theory Related Fields 77 (1988) 343–357.
[31] D. Petz, G. A. Raggio and A. Verbeure, Asymptotics of Varadhan-type and the Gibbs variational principle, Comm. Math. Phys. 121 (1989) 271–282.
[32] C.-E. Pfister, Thermodynamical aspects of classical lattice systems, in In and Out of Equilibrium (Mambucaba, 2000), Progr. Probab., Vol. 51 (Birkhäuser, 2002), pp. 393–472.
[33] S. Roelly and H. Zessin, The equivalence of equilibrium principles in statistical mechanics and some applications to large particle systems, Expo. Math. 11 (1993) 385–405.
[34] D. Ruelle, Correlation functionals, J. Math. Phys. 6 (1965) 201–220.
[35] D. Ruelle, Statistical Mechanics: Rigorous Results (World Scientific, 1999).
[36] B. Simon, The Statistical Mechanics of Lattice Gases, Vol. I, Princeton Series in Physics (Princeton University Press, 1993).
April 11, J070-S0129055X11004278
2011 12:3 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 3 (2011) 233–260 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004278
APPLICATIONS OF MAGNETIC ΨDO TECHNIQUES TO SAPT
GIUSEPPE DE NITTIS SISSA, via Bonomea, 265 34136 Trieste TS, Italy
[email protected]

MAX LEIN
Technische Universität München, Zentrum Mathematik, Boltzmannstraße 3, 85747 Garching, Germany
[email protected]

Received 15 September 2010
Revised 24 December 2010

In this review, we show how advances in the theory of magnetic pseudodifferential operators (magnetic ΨDO) can be put to good use in space-adiabatic perturbation theory (SAPT). As a particular example, we extend results of [24] to a more general class of magnetic fields: we consider a single particle moving in a periodic potential which is subjected to a weak and slowly-varying electromagnetic field. In addition to the semiclassical parameter $\varepsilon \ll 1$ which quantifies the separation of spatial scales, we explore the influence of an additional parameter $\lambda$ that allows us to selectively switch off the magnetic field. We find that even in the case of magnetic fields with components in $C_b^\infty(\mathbb{R}^d)$, e.g., for constant magnetic fields, the results of Panati, Spohn and Teufel hold, i.e. to each isolated family of Bloch bands, there exists an associated almost invariant subspace of $L^2(\mathbb{R}^d)$ and an effective hamiltonian which generates the dynamics within this almost invariant subspace. In case of an isolated non-degenerate Bloch band, the full quantum dynamics can be approximated by the hamiltonian flow associated to the semiclassical equations of motion found in [24].

Keywords: Magnetic field; pseudodifferential operators; Weyl calculus; Bloch electron.

Mathematics Subject Classification 2010: 81Q15, 81Q20, 81S10
1. Introduction

A fundamental and well-studied problem is that of a Bloch electron subjected to an electric field and a constant magnetic field where the dynamics is generated by
\[ \hat H \equiv \hat H(\varepsilon, \lambda) := \frac12 \big( -i\nabla_x - \lambda A(\varepsilon x) \big)^2 + V_\Gamma(x) + \phi(\varepsilon x) \tag{1.1} \]
acting on $L^2(\mathbb{R}^d_x)$. Here, $\varepsilon$ and $\lambda$ are dimensionless non-negative parameters whose significance will be discussed momentarily. $A$ is assumed to be a smooth
polynomially bounded vector potential to the magnetic field $B = dA$ which is constant in time and uniform in space. Since the magnetic field is uniform, $A$ needs to grow at least linearly. The potential $V_\Gamma$ generated by the nuclei and all other electrons is periodic with respect to the crystal lattice [3, 4]
\[ \Gamma := \Big\{ \gamma \in \mathbb{R}^d \ \Big|\ \gamma = \sum_{j=1}^{d} \alpha_j e_j, \ \alpha_j \in \mathbb{Z} \Big\} \tag{1.2} \]
where the family of vectors $\{e_1, \ldots, e_d\}$ which defines the lattice forms a basis of $\mathbb{R}^d$. The potential is assumed to be infinitesimally bounded with respect to $-\frac12 \Delta_x$. By [26, Theorem XIII.96], this is ensured by the following

Assumption 1.1 (Periodic Potential). We assume that $V_\Gamma$ is $\Gamma$-periodic, i.e. $V_\Gamma(\cdot + \gamma) = V_\Gamma(\cdot)$ for all $\gamma \in \Gamma$, and $\int_M dy \, |V_\Gamma(y)| < \infty$.

The dual lattice $\Gamma^*$ is spanned by the dual basis $\{e_1^*, \ldots, e_d^*\}$, i.e. the set of vectors which satisfy $e_j \cdot e_k^* = 2\pi \delta_{kj}$. The assumption on $V_\Gamma$ ensures that the unperturbed periodic hamiltonian
\[ \hat H_{\mathrm{per}} = \frac12 (-i\nabla_x)^2 + V_\Gamma \tag{1.3} \]
defines a selfadjoint operator on the second Sobolev space $H^2(\mathbb{R}^d)$ and gives rise to Bloch bands in the usual manner (cf. Sec. 2.1): the unitary Bloch–Floquet–Zak transform $\mathcal{Z}$ defined by Eq. (2.1) decomposes $\hat H_{\mathrm{per}}$ into the fibered operator
\[ \hat H_{\mathrm{per}}^{\mathcal{Z}} := \mathcal{Z} \hat H_{\mathrm{per}} \mathcal{Z}^{-1} = \int_{M^*}^{\oplus} dk \, H_{\mathrm{per}}(k) := \int_{M^*}^{\oplus} dk \, \Big( \frac12 (-i\nabla_y + k)^2 + V_\Gamma(y) \Big) \]
where we have introduced the Brillouin zone
\[ M^* := \Big\{ k \in \mathbb{R}^d \ \Big|\ k = \sum_{j=1}^{d} \alpha_j e_j^*, \ \alpha_j \in [-1/2, +1/2] \Big\} \tag{1.4} \]
as fundamental cell in reciprocal space. For each $k \in M^*$, the eigenvalue equation
\[ H_{\mathrm{per}}(k) \varphi_n(k) = E_n(k) \, \varphi_n(k), \qquad \varphi_n(k) \in L^2(\mathbb{T}^d_y), \]
where $\mathbb{T}^d_y := \mathbb{R}^d / \Gamma$, is solved by the Bloch function associated to the $n$th band. Assume for simplicity we are given a band $E_*$ which does not intersect or merge with other bands (i.e. there is a local gap in the sense of Assumption 3.1). Then common lore is that transitions to other bands are exponentially suppressed and the effective dynamics for an initial state localized in the eigenspace associated to $E_*$ is generated by $E_*(-i\nabla_x)$ [6, 1]. If we switch on a constant magnetic field, no matter how weak, the Bloch bands are gone as there is no BFZ decomposition with respect to $\Gamma$ for hamiltonian (1.1). As a matter of fact, the spectrum of $\hat H$ is a Cantor set [7] if the flux through the Wigner–Seitz cell
\[ M := \Big\{ y \in \mathbb{R}^d \ \Big|\ y = \sum_{j=1}^{d} \alpha_j e_j, \ \alpha_j \in [-1/2, +1/2] \Big\} \tag{1.5} \]
is irrational. Even if the flux through the unit cell is rational, we recover only magnetic Bloch bands that are associated to a larger lattice $\Gamma' \supset \Gamma$. A natural question is whether it is at all possible to see signatures of non-magnetic Bloch bands if the applied magnetic field is weak. Our main result, Theorem 4.1, answers this question in the positive in the following sense: if the electromagnetic field varies on the macroscopic level, i.e. $\varepsilon \ll 1$, then to leading order the dynamics is still generated by the so-called Peierls substitution $E_*\big( -i\nabla_x - \lambda A(\varepsilon x) \big) + \phi(\varepsilon x)$ (defined as a magnetic pseudodifferential operator through Eq. (2.15), cf. Sec. 2.2). Hence, the dynamics are dominated by the Bloch bands even in the presence of a weak, but constant magnetic field. Furthermore, we can derive corrections to any order in $\varepsilon$ in terms of Bloch bands, Bloch functions, the magnetic field and the electric potential. We do not need to choose a “nice” vector potential for $B$; in fact, in all of the calculations only the magnetic field $B$ enters.

Existing theory is ill-equipped to deal with constant or even more general magnetic fields. We tackle this obstacle by incorporating the rich theory of magnetic Weyl calculus [20, 14] with two small parameters [17] into space-adiabatic perturbation theory [25, 24, 27]. Magnetic Weyl calculus does not single out constant magnetic fields; in fact we will only assume that the components of the magnetic field $B$ are bounded with bounded derivatives to any order.

Assumption 1.2 (Electromagnetic Fields). We assume that the components of the external (macroscopic) magnetic field $B$ and the electric potential $\phi$ are $C_b^\infty(\mathbb{R}^d)$ functions, i.e. smooth, bounded functions with bounded derivatives to any order.

Remark 1.1. All vector potentials $A$ associated to magnetic fields $B = dA$ with components in $C_b^\infty(\mathbb{R}^d)$ are always assumed to have components in $C_{\mathrm{pol}}^\infty(\mathbb{R}^d)$. This is always possible as one could pick the transversal gauge,
\[ A_k(x) := -\sum_{j=1}^{n} \int_0^1 ds \, B_{kj}(sx) \, s \, x_j. \]
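The transversal gauge formula can be tested numerically. The sketch below works in $d = 2$ with the hypothetical field $B_{12}(x) = \cos(x_1) = -B_{21}(x)$, which has bounded derivatives of all orders, and assumes the sign convention $B_{kj} = \partial_k A_j - \partial_j A_k$. It builds $A$ by Simpson quadrature and compares a finite-difference curl with $B_{12}$. All function names are illustrative.

```python
import math
from typing import Callable, Tuple


def b12(x1: float, x2: float) -> float:
    """Hypothetical magnetic field component B_12 (smooth and bounded)."""
    return math.cos(x1)


def simpson(f: Callable[[float], float], a: float, b: float,
            n: int = 200) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def transversal_gauge(x1: float, x2: float) -> Tuple[float, float]:
    """A_k(x) = -sum_j int_0^1 ds B_kj(s x) s x_j with B_21 = -B_12."""
    a1 = -simpson(lambda s: b12(s * x1, s * x2) * s * x2, 0.0, 1.0)
    a2 = -simpson(lambda s: -b12(s * x1, s * x2) * s * x1, 0.0, 1.0)
    return a1, a2


def curl(x1: float, x2: float, h: float = 1e-4) -> float:
    """Central-difference approximation of d_1 A_2 - d_2 A_1."""
    d1_a2 = (transversal_gauge(x1 + h, x2)[1]
             - transversal_gauge(x1 - h, x2)[1]) / (2.0 * h)
    d2_a1 = (transversal_gauge(x1, x2 + h)[0]
             - transversal_gauge(x1, x2 - h)[0]) / (2.0 * h)
    return d1_a2 - d2_a1


if __name__ == "__main__":
    for point in [(0.7, -1.3), (2.0, 0.5), (0.0, 0.0)]:
        print(point, curl(*point), b12(*point))
```

For a constant field $B_{12} = b$ the same formula reduces to the symmetric gauge $A(x) = \frac{b}{2}(-x_2, x_1)$, which grows linearly, in line with the polynomial boundedness assumed in Remark 1.1.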
This is to be contrasted with the original work of Panati, Spohn and Teufel where the vector potential had to have components in $C_b^\infty(\mathbb{R}^d)$.

Under Assumptions 1.1 and 1.2, $\hat H$ defines an essentially selfadjoint operator on $C_0^\infty(\mathbb{R}^d_x) \subset L^2(\mathbb{R}^d_x)$. The extension of the rigorous derivation of the Peierls substitution to the case of constant magnetic field is not only an “academic result”; it is crucial for modeling the quantum Hall effect (QHE): the Peierls substitution can be used to link the QHE to the well-studied Harper equation [2, 8–11]. In particular the Peierls substitution led Hofstadter to study a simplified tight binding model for the QHE, today called the Hofstadter model [12]. The Hofstadter model is a paradigm in the study of fractal spectra (Hofstadter butterfly) and was used by Thouless et al. in the
seminal paper [28] to give the first theoretical explanation of the topological quantization of the QHE. More recently, Avron [22] interpreted the results by Thouless et al. from the viewpoint of thermodynamics and connected the QHE to anomalous phase transition diagrams (colored quantum butterflies). Even though this list of publications is very much incomplete, it shows the significance of the Peierls substitution in the study of the QHE. The main merit of this paper is to provide the first rigorous proof of the Peierls substitution under the conditions relevant to the QHE and consequently the first rigorous justification for the use of the Hofstadter model as weak field limit for the analysis of the QHE.

Let us now explain why we have chosen to include the additional parameter $\lambda$ in the hamiltonian $\hat{H} \equiv \hat{H}(\varepsilon, \lambda)$. Our proposal is to model an experimental setup that applies an external, i.e. macroscopic, electric and magnetic field. The parameter $\varepsilon \ll 1$ relates the microscopic scale as given by the crystal lattice to the scale on which the external fields vary. We always assume $\varepsilon$ to be small. It is quite easy to fathom an apparatus where electric and magnetic field can be regulated separately by, say, two dials. We are interested in the case where we can selectively switch off the magnetic field. To this end, we regulate the strength of the magnetic field by varying the relative amplitude $\lambda \le 1$ which quantifies the ratio between scaled electric and magnetic field,
\[ B^{\varepsilon,\lambda}(x) := \varepsilon \lambda \, B(\varepsilon x), \qquad E^{\varepsilon}(x) := \varepsilon \, E(\varepsilon x). \]
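The scaling of the two fields can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the fields are taken to be scalar functions of one variable, and the name `scaled_fields` is hypothetical. It shows that $\lambda$ switches off the magnetic field while leaving the scaled electric field untouched.

```python
import math

def scaled_fields(B, E, eps, lam):
    """Return x -> eps*lam*B(eps*x) and x -> eps*E(eps*x).

    B and E are the unscaled (macroscopic) field profiles; here they are
    scalar functions of one variable, which is a simplifying assumption.
    """
    def B_scaled(x):
        return eps * lam * B(eps * x)

    def E_scaled(x):
        return eps * E(eps * x)

    return B_scaled, E_scaled

# Example: constant magnetic field, oscillating electric field (toy profiles).
B_full, E_full = scaled_fields(lambda x: 2.0, math.cos, eps=0.1, lam=1.0)
B_off, E_off = scaled_fields(lambda x: 2.0, math.cos, eps=0.1, lam=0.0)
```

Setting `lam=0.0` makes `B_off` vanish identically, whereas `E_off` agrees with `E_full` at every point.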
We emphasize that $\lambda$ need not be small; as a matter of fact, $\lambda = 1$ is perfectly admissible. In this sense, $\lambda$ is a perturbative parameter which allows us to take the limit $B^{\varepsilon,\lambda} \to 0$ without changing the external electric field $E^{\varepsilon}$. This is very much relevant to experiments since magnetic fields are typically much weaker than electric fields. Thus, from both a physics and a mathematics perspective, the study of the dynamics under the $\lambda \to 0$ limit is an interesting problem which merits further research.

The aim of this review is to show how recent advances in the theory of magnetic pseudodifferential operators (magnetic ΨDOs) (see [14, 17] as well as Sec. 2.2) can be used to extend the range of validity of the results of Panati, Spohn and Teufel derived via space-adiabatic perturbation theory (SAPT) [24] to magnetic fields with components in $C_b^\infty(\mathbb{R}^d)$. The original proof uses standard pseudodifferential techniques and thus is limited to magnetic vector potentials of class $C_b^\infty(\mathbb{R}^d)$. In a recent work by one of the authors with Panati [5], adiabatic decoupling for the Bloch electron has been proven in the case of constant magnetic field. Their proof rests on a particular choice of gauge, namely the symmetric gauge. We take a different route: according to the philosophy of magnetic Weyl calculus, it is the properties of the magnetic field and not those of the vector potentials which enter the hypotheses of theorems. As the proofs in [24] carry over mutatis mutandis, we feel it is more appropriate to elucidate the structure of the problem and to mention modifications of the proofs where necessary.
Applications of Magnetic ΨDO Techniques to SAPT
Our paper is divided into five sections: in Sec. 2, we will decompose the Hamiltonian using the Bloch–Floquet–Zak transform and rewrite it as magnetic Weyl quantization of an operator-valued function. Section 3 contains a comprehensive description of our technique of choice, SAPT. The main results, adiabatic decoupling to all orders and a semiclassical limit, will be stated and proven in Secs. 4 and 5.

2. Rewriting the Problem

As a preliminary step, we will rewrite the problem: first, we extract the Bloch band picture via the BFZ transform and then we reinterpret the BFZ-transformed hamiltonian as magnetic quantization of an operator-valued symbol. We emphasize that we only rephrase the problem; no additional assumptions are introduced.

2.1. The Bloch–Floquet–Zak transform

Usually, one would exploit lattice periodicity by going to the Fourier basis: each $\Psi \in \mathcal{S}(\mathbb{R}^d_x) \subset L^2(\mathbb{R}^d_x)$ is mapped onto
\[ (\mathcal{F}\Psi)(k, y) := \sum_{\gamma \in \Gamma} e^{-i k \cdot \gamma} \, \Psi(y + \gamma) \]
and the corresponding representation is usually called Bloch–Floquet representation (see, e.g., [16]). It is easily checked that
\[ (\mathcal{F}\Psi)(k - \gamma^*, y) = (\mathcal{F}\Psi)(k, y) \quad \forall \gamma^* \in \Gamma^*, \qquad (\mathcal{F}\Psi)(k, y - \gamma) = e^{-i k \cdot \gamma} (\mathcal{F}\Psi)(k, y) \quad \forall \gamma \in \Gamma, \]
holds and $\mathcal{F}\Psi$ can be written as $(\mathcal{F}\Psi)(k, y) = e^{i k \cdot y} u(k, y)$ where $u(k, y)$ is $\Gamma$-periodic in $y$ and $\Gamma^*$-periodic up to a phase in $k$. For technical reasons, we prefer to use a variant of the Bloch–Floquet transform introduced by Zak [29] which maps $\Psi \in \mathcal{S}(\mathbb{R}^d_x)$ onto $u$,
\[ (\mathcal{Z}\Psi)(k, y) := \sum_{\gamma \in \Gamma} e^{-i k \cdot (y + \gamma)} \, \Psi(y + \gamma). \tag{2.1} \]
The BFZ transform has the following periodicity properties:
\[ (\mathcal{Z}\Psi)(k - \gamma^*, y) = e^{+i \gamma^* \cdot y} (\mathcal{Z}\Psi)(k, y) =: \tau(\gamma^*) (\mathcal{Z}\Psi)(k, y) \quad \forall \gamma^* \in \Gamma^*, \qquad (\mathcal{Z}\Psi)(k, y - \gamma) = (\mathcal{Z}\Psi)(k, y) \quad \forall \gamma \in \Gamma. \tag{2.2} \]
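The periodicity properties (2.2) can be checked numerically. The following Python sketch does so under stated assumptions: $d = 1$, $\Gamma = \mathbb{Z}$, $\Gamma^* = 2\pi\mathbb{Z}$, a Gaussian as test function, and a lattice sum truncated at a finite cutoff. The names `zak`, `tau`, `psi` and `LATTICE_CUTOFF` are illustrative and not taken from the text.

```python
import cmath
import math

LATTICE_CUTOFF = 30  # truncation of the sum over Gamma = Z (assumption)

def psi(x):
    """Gaussian test function standing in for a Schwartz function Psi."""
    return math.exp(-0.5 * x * x)

def zak(k, y):
    """Truncated (Z Psi)(k, y) = sum_gamma exp(-i k (y + gamma)) Psi(y + gamma)."""
    return sum(
        cmath.exp(-1j * k * (y + g)) * psi(y + g)
        for g in range(-LATTICE_CUTOFF, LATTICE_CUTOFF + 1)
    )

def tau(gamma_star, y):
    """Phase factor tau(gamma*) = exp(+i gamma* y) acting as multiplication in y."""
    return cmath.exp(1j * gamma_star * y)

k, y = 0.7, 0.3
periodic_defect = abs(zak(k, y - 1) - zak(k, y))
equivariance_defect = abs(zak(k - 2 * math.pi, y) - tau(2 * math.pi, y) * zak(k, y))
```

Because the Gaussian decays rapidly, the truncation error is negligible, and both defects should vanish up to rounding error.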
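$\tau$ is a unitary representation of the group of dual lattice translations $\Gamma^*$. By density, $\mathcal{Z}$ immediately extends to $L^2(\mathbb{R}^d_x)$ and it maps it unitarily onto
\[ \mathcal{H}_\tau := \bigl\{ \psi \in L^2_{\mathrm{loc}}\bigl(\mathbb{R}^d_k, L^2(\mathbb{T}^d_y)\bigr) \;\big|\; \psi(k - \gamma^*) = \tau(\gamma^*) \psi(k) \ \text{a.e.} \ \forall \gamma^* \in \Gamma^* \bigr\}, \tag{2.3} \]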
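which is equipped with the scalar product
\[ \langle \varphi, \psi \rangle_\tau := \int_{M^*} dk \, \bigl\langle \varphi(k), \psi(k) \bigr\rangle_{L^2(\mathbb{T}^d_y)} . \]
It is obvious from the definition that the left-hand side does not depend on the choice of the unit cell $M^*$ in reciprocal space. The BFZ representation of momentum $-i\nabla_x$ and position operator $\hat{x}$ on $L^2(\mathbb{R}^d_x)$, equipped with the obvious domains, can be computed directly,
\[ \mathcal{Z}(-i\nabla_x)\mathcal{Z}^{-1} = \mathrm{id}_{L^2(M^*)} \otimes (-i\nabla_y) + \hat{k} \otimes \mathrm{id}_{L^2(\mathbb{T}^d_y)} \equiv -i\nabla_y + k, \qquad \mathcal{Z} \hat{x} \mathcal{Z}^{-1} = i\nabla_k^\tau, \tag{2.4} \]
where we have used the identification $\mathcal{H}_\tau \cong L^2(M^*) \otimes L^2(\mathbb{T}^d_y)$. The superscript $\tau$ on $i\nabla_k^\tau$ indicates that the operator's domain $\mathcal{H}_\tau \cap H^1_{\mathrm{loc}}\bigl(\mathbb{R}^d_k, L^2(\mathbb{T}^d_y)\bigr)$ consists of $\tau$-equivariant functions. The BFZ transformed domain for momentum $-i\nabla_y + k$ is $L^2(M^*) \otimes H^1(\mathbb{T}^d_y)$. Since the phase factor $\tau$ depends on $y$, the BFZ transform of $\hat{x}$ does not factor, unless we consider $\Gamma$-periodic functions, in which case we have
\[ \mathcal{Z} V_\Gamma(\hat{x}) \mathcal{Z}^{-1} = \mathrm{id}_{L^2(M^*)} \otimes V_\Gamma(\hat{y}) \equiv V_\Gamma(\hat{y}). \]
Equations (2.4) immediately give us the BFZ transform of $\hat{H}$, namely
\[ \hat{H}^{\mathcal{Z}} := \mathcal{Z} \hat{H} \mathcal{Z}^{-1} = \tfrac{1}{2} \bigl( -i\nabla_y + k - \lambda A(i\varepsilon\nabla_k^\tau) \bigr)^2 + V_\Gamma(\hat{y}) + \phi(i\varepsilon\nabla_k^\tau), \tag{2.5} \]
which defines an essentially selfadjoint operator on $\mathcal{Z} C_0^\infty(\mathbb{R}^d_x)$. If the external electromagnetic field vanishes, the hamiltonian
\[ \hat{H}^{\mathcal{Z}}_{\mathrm{per}} := \mathcal{Z} \hat{H}_{\mathrm{per}} \mathcal{Z}^{-1} = \int^{\oplus}_{M^*} dk \, H^{\mathcal{Z}}_{\mathrm{per}}(k) \tag{2.6} \]
fibers into a family of operators on $L^2(\mathbb{T}^d_y)$ indexed by crystal momentum $k \in M^*$. $\tau$-equivariance relates $H^{\mathcal{Z}}_{\mathrm{per}}(k - \gamma^*)$ and $H^{\mathcal{Z}}_{\mathrm{per}}(k)$ via
\[ H^{\mathcal{Z}}_{\mathrm{per}}(k - \gamma^*) = \tau(\gamma^*) \, H^{\mathcal{Z}}_{\mathrm{per}}(k) \, \tau(\gamma^*)^{-1} \quad \forall \gamma^* \in \Gamma^*, \]
which, among other things, ensures that Bloch bands $\{E_n\}_{n \in \mathbb{N}}$, i.e. the solutions to the eigenvalue equation
\[ H^{\mathcal{Z}}_{\mathrm{per}}(k) \varphi_n(k) = E_n(k) \varphi_n(k), \qquad \varphi_n(k) \in L^2(\mathbb{T}^d_y), \]
are $\Gamma^*$-periodic functions. Standard arguments show that $H^{\mathcal{Z}}_{\mathrm{per}}(k)$ has purely discrete spectrum for all $k \in M^*$ and if Bloch bands are ordered by magnitude, they are smooth functions away from band crossings. Similarly, the Bloch functions $k \mapsto \varphi_n(k)$ are smooth if the associated energy band $E_n$ does not intersect with or touch others [26].

The next subsection shows that the effect of introducing an external electromagnetic field can be interpreted as “replacing” the direct integral with the magnetic quantization of $H^{\mathcal{Z}}_{\mathrm{per}} + \phi$.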
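The $\Gamma^*$-periodicity of Bloch bands and the loss of smoothness at band crossings can be made explicit in the simplest possible case. The sketch below assumes $d = 1$, $\Gamma = \mathbb{Z}$ and a vanishing periodic potential, $V_\Gamma = 0$ (an assumption made purely so that the eigenvalues are known in closed form). Then $H^{\mathcal{Z}}_{\mathrm{per}}(k) = \tfrac{1}{2}(-i\partial_y + k)^2$ has eigenfunctions $e^{2\pi i n y}$ with eigenvalues $\tfrac{1}{2}(k + 2\pi n)^2$. The function name `free_bloch_bands` is hypothetical.

```python
import math

def free_bloch_bands(k, n_bands=4, n_modes=20):
    """Lowest Bloch bands of (1/2)(-i d/dy + k)^2 on the 1D torus, ordered by magnitude.

    The eigenvalues (1/2)(k + 2 pi n)^2 are enumerated for |n| <= n_modes
    (a finite truncation) and then sorted.
    """
    energies = sorted(
        0.5 * (k + 2 * math.pi * n) ** 2 for n in range(-n_modes, n_modes + 1)
    )
    return energies[:n_bands]

bands_at_k = free_bloch_bands(0.4)
bands_shifted = free_bloch_bands(0.4 - 2 * math.pi)   # k shifted by a dual lattice vector
bands_at_crossing = free_bloch_bands(0.0)             # second and third band touch here
```

Shifting $k$ by $2\pi$ merely relabels the Fourier modes, which is the $\tau$-equivariance of the fibers in this basis, so the ordered bands agree. At $k = 0$ the second and third band coincide, which is where the ordered band functions fail to be smooth.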
2.2. Magnetic ΨDO and Weyl calculus

Instead of using regular Weyl calculus, we use a more sophisticated Weyl calculus that is adapted to magnetic problems. It was first proposed by Müller in 1999 [19] in a non-rigorous fashion. Independently, Măntoiu and Purice [20] as well as Iftimie, Măntoiu and Purice [14] have laid the mathematical foundation. All of the main results of ordinary Weyl calculus have been transcribed to the magnetic context; for details, we refer to the two aforementioned publications and those we give in the remainder of this section.

2.2.1. Ordinary magnetic Weyl calculus

The basic building blocks of magnetic pseudodifferential operators are
\[ P^A \equiv P^A_{\varepsilon,\lambda} := -i\nabla_x - \lambda A(Q), \qquad Q \equiv Q_\varepsilon := \varepsilon \hat{x}. \tag{2.7} \]
With this notation, $\hat{H}$ can be written in terms of $P^A$ and $Q$ as the quantization of
\[ H(x, \xi) := \tfrac{1}{2} \xi^2 + V_\Gamma(x/\varepsilon) + \phi(x), \tag{2.8} \]
i.e. $\hat{H} = H(Q, P^A)$. As $H$ is the sum of a contribution quadratic in momentum and a contribution depending only on $x$, this prescription is unambiguous. However, not all objects (e.g., resolvents and projections) we will encounter are of this type. We need a functional calculus for the non-commuting family of operators $Q$ and $P^A$, which are characterized by the commutation relations
\[ i [Q_l, Q_j] = 0, \qquad i [P^A_l, Q_j] = \varepsilon \, \delta_{lj}, \qquad i [P^A_l, P^A_j] = -\varepsilon \lambda \, B_{lj}(Q). \tag{2.9} \]
The commutation relations can be rigorously implemented via the Weyl system
\[ W^A(x, \xi) := e^{-i(\xi \cdot Q - x \cdot P^A)} =: e^{-i \sigma((x,\xi),(Q,P^A))} \]
where $A$ is a smooth, polynomially bounded vector potential associated to a magnetic field with components in $C_b^\infty(\mathbb{R}^d)$ and $\sigma\bigl((x,\xi),(y,\eta)\bigr) := \xi \cdot y - x \cdot \eta$ is the non-magnetic symplectic form. We will also introduce the symplectic Fourier transform
\[ (\mathcal{F}_\sigma f)(x, \xi) := \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d_y} dy \int_{\mathbb{R}^d_\eta} d\eta \, e^{i \sigma((x,\xi),(y,\eta))} f(y, \eta), \qquad f \in \mathcal{S}(T^*\mathbb{R}^d_x), \]
which is also its own inverse on $\mathcal{S}(T^*\mathbb{R}^d_x)$ and extends to a continuous bijection on the space of tempered distributions $\mathcal{S}'(T^*\mathbb{R}^d_x)$. The Weyl quantization of $h \in \mathcal{S}(T^*\mathbb{R}^d_x)$ given by
\[ \mathrm{Op}^A(h) := \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d_x} dx \int_{\mathbb{R}^d_\xi} d\xi \, (\mathcal{F}_\sigma h)(x, \xi) \, W^A(x, \xi) \tag{2.10} \]
defines a magnetic ΨDO. Associated to this, we have a product $\sharp^B$ akin to the usual Moyal product which emulates the product of magnetic operators on the level of
functions on phase space, $\mathrm{Op}^A(f \sharp^B g) = \mathrm{Op}^A(f) \, \mathrm{Op}^A(g)$. For suitable functions $f, g : T^*\mathbb{R}^d_x \to \mathbb{C}$, e.g., Hörmander-class symbols, their magnetic product is given by the oscillatory integral
\[ (f \sharp^B g)(x, \xi) = \frac{1}{(2\pi)^{2d}} \int_{\mathbb{R}^d_y} dy \int_{\mathbb{R}^d_\eta} d\eta \int_{\mathbb{R}^d_z} dz \int_{\mathbb{R}^d_\zeta} d\zeta \, e^{i \sigma((x,\xi),(y,\eta)+(z,\zeta))} \, e^{i \frac{\varepsilon}{2} \sigma((y,\eta),(z,\zeta))} \, e^{-i \varepsilon \gamma^B_\varepsilon(x,y,z)} \, (\mathcal{F}_\sigma f)(y, \eta) \, (\mathcal{F}_\sigma g)(z, \zeta) \tag{2.11} \]
where $\gamma^B_\varepsilon(x, y, z)$ is the scaled magnetic flux through a triangle whose corners depend on $x$, $y$ and $z$. In [17], it was shown that for Hörmander-class symbols, this product has an asymptotic development in $\varepsilon$ and $\lambda$,
\[ f \sharp^B g = f g - \varepsilon \tfrac{i}{2} \{f, g\}_{\lambda B} + O(\varepsilon^2), \tag{2.12} \]
where
\[ \{f, g\}_{\lambda B} := \sum_{l=1}^{d} \bigl( \partial_{\xi_l} f \, \partial_{x_l} g - \partial_{x_l} f \, \partial_{\xi_l} g \bigr) - \lambda \sum_{l,j=1}^{d} B_{lj} \, \partial_{\xi_l} f \, \partial_{\xi_j} g \]
is the magnetic Poisson bracket. In the limit $\lambda \to 0$ it reduces to the standard Poisson bracket.

The crucial fact that this product depends on the magnetic field $B$ rather than the vector potential $A$ can be traced back to the gauge-covariance of magnetic Weyl quantization: if $A' = A + d\chi$ is an equivalent vector potential, $dA = B = dA'$, then the quantizations with respect to either vector potential are unitarily equivalent,
\[ \mathrm{Op}^{A + d\chi}(h) = e^{+i \lambda \chi(Q)} \, \mathrm{Op}^A(h) \, e^{-i \lambda \chi(Q)} . \tag{2.13} \]
This is generically false if we quantize $h_A(x, \xi) := h\bigl(x, \xi - \lambda A(x)\bigr)$ via usual, non-magnetic Weyl quantization, $\mathrm{Op}_A(h) := \mathrm{Op}(h_A)$. (Footnote a: Coincidentally, the quantizations of polynomials of degree $\le 2$ in momentum with respect to $\mathrm{Op}_A$ are covariant. Bloch bands, however, are not quadratic functions, and neither are the other objects, such as the terms of the development of the projection and the unitary, involved in this paper.) Without covariance, the properties of the relevant pseudodifferential operators would depend on the choice of gauge. Fortunately, the difference starts to appear at second order in $\varepsilon$ [17, Sec. 1.1.2], so we generically expect results derived by non-magnetic Weyl calculus to agree to first order.

However, there is a second advantage of magnetic Weyl calculus (beyond being more natural): we can treat more general magnetic fields as properties of $B$ enter rather than those of $A$, and associated vector potentials are always worse behaved than the magnetic field. With some effort, one can treat magnetic fields which admit a vector potential whose derivatives are all bounded, $|\partial_x^a A(x)| \le C_a$ for all $a \in \mathbb{N}_0^d$, $|a| \ge 1$, see [14, Sec. 5], for instance, but with magnetic Weyl calculus we are instantly able to treat magnetic fields whose components are $C_b^\infty(\mathbb{R}^d)$ functions with zero extra effort. The limitation to this class of fields is due to the fact that we are interested in Hörmander class symbols which need to be bounded in the $x$ variable. In fact, the extension of the results of Panati, Spohn and Teufel
to magnetic fields of class $C_b^\infty$ is our main motivation as explained in the introduction. Covariance ensures that our results do not depend on the choice of a nice or symmetric gauge; any vector potential that is a smooth, polynomially bounded function will do.

Many standard results of pseudodifferential theory have been transcribed to the magnetic case for a large class of magnetic fields: typically, it is either assumed that the components of $B$ are $C^\infty_{\mathrm{pol}}(\mathbb{R}^d)$ or $C_b^\infty(\mathbb{R}^d)$ functions, although we shall always assume the latter. The quantization and dequantization have been extended to tempered distributions and it was shown that smooth, uniformly polynomially bounded functions on phase space $C^\infty_{\mathrm{pol}\,u}(T^*\mathbb{R}^d)$ are among those with good composition properties [20, Proposition 23]. Hörmander symbols are preserved under the magnetic Weyl product and quantizations of real-valued, elliptic Hörmander symbols of positive order $m$ define selfadjoint operators on the $m$th magnetic Sobolev space [14]. A magnetic version of the Calderón–Vaillancourt theorem [14] and commutator criteria [15] show the interplay between properties of magnetic pseudodifferential operators and their associated symbols.

Lastly, we mention something that will be important in the next section: the magnetic Weyl quantization is a position representation for a magnetic pseudodifferential operator. However, equivalently, we can use the momentum representation where $x$ is quantized to $Q' := i\varepsilon\nabla_\xi$ and $\xi$ to $P'^A := \hat{\xi} - \lambda A(i\varepsilon\nabla_\xi)$. The commutation relations of the building block operators in momentum representation are again encoded into the Weyl system
\[ W'^A(x, \xi) := e^{-i \sigma((x,\xi),(Q',P'^A))} = \mathcal{F} \, W^A(x, \xi) \, \mathcal{F}^{-1} \]
which is related to the Weyl system in position representation via the Fourier transform $\mathcal{F} : L^2(\mathbb{R}^d_x) \to L^2(\mathbb{R}^d_\xi)$. If $\mathrm{Op}'^A$ is the quantization associated to the Weyl system $W'^A$, then $\mathrm{Op}^A$ and $\mathrm{Op}'^A$ are related via $\mathcal{F}$ as well. An important consequence is that the Weyl product is independent of the choice of representation: for suitable distributions $f$ and $g$, we conclude that the Weyl product $\sharp'^B_{\varepsilon,\lambda}$ associated to $\mathrm{Op}'^A$ must agree with $\sharp^B_{\varepsilon,\lambda}$:
\[ \mathrm{Op}'^A\bigl(f \,\sharp'^B_{\varepsilon,\lambda}\, g\bigr) = \mathrm{Op}'^A(f) \, \mathrm{Op}'^A(g) = \mathcal{F} \, \mathrm{Op}^A(f) \, \mathcal{F}^{-1} \mathcal{F} \, \mathrm{Op}^A(g) \, \mathcal{F}^{-1} = \mathcal{F} \, \mathrm{Op}^A\bigl(f \,\sharp^B_{\varepsilon,\lambda}\, g\bigr) \, \mathcal{F}^{-1} = \mathrm{Op}'^A\bigl(f \,\sharp^B_{\varepsilon,\lambda}\, g\bigr). \]
This is related to an algebraic point of view proposed in [21] and elaborated upon in [18]: the choice of Hilbert space in this case can be seen as a choice of (equivalent) representation of more fundamental $C^*$-algebras of distributions. Properties such as boundedness of the quantization of certain classes of distributions and the form of the composition law are preserved if we choose a unitarily equivalent representation. In fact, we could have replaced $\mathcal{F}$ in the above argument by any other unitary operator $U : L^2(\mathbb{R}^d_x) \to \mathcal{H}$ where the target space $\mathcal{H}$ is again a separable Hilbert
space. Gauge transformations are another particular example of unitaries which connect equivalent representations.

2.2.2. Equivariant magnetic Weyl calculus

For technical reasons, we must adapt magnetic Weyl calculus to deal with equivariant, unbounded operator-valued functions. We follow the general strategy outlined in [24], but we need to be more careful as the roles of $Q$ and $P^A$ are not interchangeable if $B \neq 0$. We would like to reuse results for Weyl calculus on $T^*\mathbb{R}^d$, in particular the two-parameter expansion of the product (Eq. (2.12)). Consider the building block kinetic operators macroscopic position $R$ and magnetic crystal momentum $K^A$,
\[ R = i\varepsilon\nabla_k \otimes \mathrm{id}_{L^2(\mathbb{T}^d_y)} \equiv i\varepsilon\nabla_k, \qquad K^A = \hat{k} - \lambda A(R), \tag{2.14} \]
in momentum representation: they define selfadjoint operators whose domains are dense in $L^2_{\tau'}\bigl(\mathbb{R}^d_k, L^2(\mathbb{T}^d_y)\bigr)$ where $\tau'$ stands for either $\tau : \gamma^* \mapsto e^{+i\gamma^* \cdot \hat{y}}$ (cf. Eq. (2.2)) or $1 : \gamma^* \mapsto 1$. The elements of this Hilbert space can be considered as vector-valued tempered distributions with special properties as $L^2_{\tau'}\bigl(\mathbb{R}^d_k, L^2(\mathbb{T}^d_y)\bigr)$ can be continuously embedded into $\mathcal{S}'\bigl(\mathbb{R}^d_k, L^2(\mathbb{T}^d_y)\bigr)$. For simplicity, let us ignore questions of domains and assume that $h \in C_b^\infty\bigl(T^*\mathbb{R}^d_x, \mathcal{B}(L^2(\mathbb{T}^d_y))\bigr)$ is a bounded operator-valued function. Then its magnetic Weyl quantization
\[ \mathrm{Op}^A(h) := \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d_r} dr \int_{\mathbb{R}^d_k} dk \, (\mathcal{F}_\sigma h)(k, r) \, W^A(k, r) \tag{2.15} \]
defines a continuous operator from $\mathcal{S}\bigl(\mathbb{R}^d_k, L^2(\mathbb{T}^d_y)\bigr)$ to itself which has a continuous extension as an operator from $\mathcal{S}'\bigl(\mathbb{R}^d_k, L^2(\mathbb{T}^d_y)\bigr)$ to itself [20, Proposition 21]. Here, the corresponding Weyl system
\[ W^A(k, r) := e^{-i \sigma((r,k),(R,K^A))} \otimes \mathrm{id}_{L^2(\mathbb{T}^d_y)} \equiv e^{-i(k \cdot R - r \cdot K^A)} \]
is defined in terms of the building block operators $K^A$ and $R$ and acts trivially on $L^2(\mathbb{T}^d_y)$. The Weyl product $f \sharp^B g$ of two suitable distributions associated to the quantization (2.15) is also given by a suitable reinterpretation of Eq. (2.11) as $f$ and $g$ are now operator-valued functions. Furthermore, we can also develop $f \sharp^B g$ asymptotically in $\varepsilon$ and $\lambda$ [17, Theorem 1.1]. To see this, we remark that the difference between the products associated to the quantization (2.10) of Sec. 2.2.1 and the quantization (2.15) is two-fold: first of all, (2.10) is a position representation while (2.15) is a momentum representation. Let $\widetilde{\mathrm{Op}}{}^A$ be the magnetic Weyl quantization defined with respect to $\widetilde{R} := \mathcal{F}^{-1} R \, \mathcal{F} = \varepsilon \hat{r}$ and $\widetilde{K}^A := \mathcal{F}^{-1} K^A \mathcal{F} = -i\nabla_r - \lambda A(\varepsilon \hat{r})$, i.e. the position representation. As explained at the end of the previous subsection, unitarily equivalent representations, here (2.15) and $\widetilde{\mathrm{Op}}{}^A$, have the same Weyl product. Secondly, the functions which are to be quantized by (2.10) and $\widetilde{\mathrm{Op}}{}^A$ take values in $\mathbb{C}$ and the bounded operators on $L^2(\mathbb{T}^d_y)$, respectively. The interested reader may
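check that the proofs regarding the various properties of the product $\sharp^B$ in [20, 14, 17] can be generalized to accommodate operator-valued functions, including Hörmander symbols.

Definition 2.1 (Hörmander Symbols $S^m_\rho(\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2))$). Let $m \in \mathbb{R}$, $\rho \in [0, 1]$ and $\mathcal{H}_1$, $\mathcal{H}_2$ be separable Hilbert spaces. Then a function $f$ is said to be in $S^m_\rho\bigl(\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)\bigr)$ if and only if for all $a, \alpha \in \mathbb{N}_0^d$ the seminorms
\[ \| f \|_{m, a\alpha} := \sup_{(x,\xi) \in \Xi} \bigl( \sqrt{1 + \xi^2} \bigr)^{|\alpha| \rho - m} \, \bigl\| \partial_x^a \partial_\xi^\alpha f(x, \xi) \bigr\|_{\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)} < \infty \]
are finite where $\| \cdot \|_{\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)}$ denotes the operator norm on $\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)$. In case $\rho = 1$, one also writes $S^m := S^m_1$.

Hörmander symbols which have an expansion in $\varepsilon$ that is uniform in the small parameter are called semiclassical.

Definition 2.2 (Semiclassical Symbols $AS^m_\rho(\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2))$). A map $f : [0, \varepsilon_0) \to S^m_\rho$, $\varepsilon \mapsto f_\varepsilon$ is called a semiclassical symbol of order $m \in \mathbb{R}$ and weight $\rho \in [0, 1]$, that is $f \in AS^m_\rho$, if there exists a sequence $\{f_n\}_{n \in \mathbb{N}_0}$, $f_n \in S^{m - n\rho}_\rho$, such that for all $N \in \mathbb{N}_0$, one has
\[ \varepsilon^{-N} \Bigl( f_\varepsilon - \sum_{n=0}^{N-1} \varepsilon^n f_n \Bigr) \in S^{m - N\rho}_\rho \]
uniformly in $\varepsilon$ in the sense that for any $N \in \mathbb{N}_0$ and $a, \alpha \in \mathbb{N}_0^d$, there exist constants $C_{N a \alpha} > 0$ such that
\[ \Bigl\| f_\varepsilon - \sum_{n=0}^{N-1} \varepsilon^n f_n \Bigr\|_{m - N\rho, a\alpha} \le C_{N a \alpha} \, \varepsilon^N \]
holds for all $\varepsilon \in [0, \varepsilon_0)$. If $\rho = 1$, then one abbreviates $AS^m_1$ with $AS^m$.

Lastly, we will need the notion of $\tau$-equivariant symbols.

Definition 2.3 ($\tau$-Equivariant Symbols $AC^\infty_\tau(\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2))$). Let $\tau_j : \Gamma^* \to \mathcal{U}(\mathcal{H}_j)$, $j = 1, 2$, be unitary $*$-representations of the group $\Gamma^*$. Then $f \in AS^0_0$ is $\tau$-equivariant, i.e. an element of $AC^\infty_\tau\bigl(\mathcal{B}(\mathcal{H}_1, \mathcal{H}_2)\bigr)$, if and only if
\[ f(k - \gamma^*, r) = \tau_2(\gamma^*) \, f(k, r) \, \tau_1(\gamma^*)^{-1} \]
holds for all $k \in \mathbb{R}^d_k$, $r \in \mathbb{R}^d_r$ and $\gamma^* \in \Gamma^*$.

Now the reader is in a position to translate the results derived in [27, Appendix B] to the context of magnetic Weyl calculus; the modifications are straightforward and all the necessary references have been given in this section.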
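Since the first-order term of the magnetic product is governed by the magnetic Poisson bracket of Eq. (2.12), it may help to see that bracket evaluated concretely. The following Python sketch approximates the partial derivatives by central finite differences. The function name, the step size `h` and the toy constant field in $d = 2$ are assumptions made for illustration only.

```python
def magnetic_poisson_bracket(f, g, B, lam, x, xi, h=1e-5):
    """Approximate {f, g}_{lam B} at the phase space point (x, xi).

    f and g take two tuples (x, xi); B(x) returns the d x d matrix (B_lj).
    Derivatives are replaced by central finite differences of step h.
    """
    d = len(x)

    def shifted(v, l, delta):
        w = list(v)
        w[l] += delta
        return tuple(w)

    def d_x(func, l):
        return (func(shifted(x, l, h), xi) - func(shifted(x, l, -h), xi)) / (2 * h)

    def d_xi(func, l):
        return (func(x, shifted(xi, l, h)) - func(x, shifted(xi, l, -h))) / (2 * h)

    standard = sum(d_xi(f, l) * d_x(g, l) - d_x(f, l) * d_xi(g, l) for l in range(d))
    Bx = B(x)
    magnetic = sum(
        Bx[l][j] * d_xi(f, l) * d_xi(g, j) for l in range(d) for j in range(d)
    )
    return standard - lam * magnetic

# Toy data: constant field B_12 = 2 = -B_21 in d = 2, coordinate functions as symbols.
constant_B = lambda x: [[0.0, 2.0], [-2.0, 0.0]]
xi_1 = lambda x, xi: xi[0]
xi_2 = lambda x, xi: xi[1]
x_1 = lambda x, xi: x[0]
point = ((0.1, 0.2), (0.3, 0.4))

bracket_momenta = magnetic_poisson_bracket(xi_1, xi_2, constant_B, 0.5, *point)
bracket_canonical = magnetic_poisson_bracket(xi_1, x_1, constant_B, 0.5, *point)
bracket_no_field = magnetic_poisson_bracket(xi_1, xi_2, constant_B, 0.0, *point)
```

With these toy inputs the momenta fail to Poisson-commute by $-\lambda B_{12}$, the canonical pair $(\xi_1, x_1)$ is unaffected by the field, and for $\lambda = 0$ the standard bracket is recovered.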
3. The Magnetic Bloch Electron as a Space-Adiabatic Problem

Our tool of choice to derive effective dynamics is space-adiabatic perturbation theory [25, 24, 27] which uses pseudodifferential techniques to derive perturbation expansions order-by-order in a systematic fashion. We adapt their results by replacing ordinary Weyl calculus with magnetic Weyl calculus. Adiabatic decoupling only hinges on $\varepsilon \ll 1$ and does not rely on $\lambda$ being small.

3.1. The three ingredients of space-adiabatic problems

The insight of [24] was that the slow variation of the external electromagnetic field (quantified by $\varepsilon \ll 1$) leads to a decoupling into slow (macroscopic) and fast (microscopic) degrees of freedom. This is characteristic of adiabatic systems whose three main features are

(i) A distinction between slow and fast degrees of freedom: the original (physical) Hilbert space $\mathcal{H} = L^2(\mathbb{R}^d_x)$ is decomposed unitarily into $\mathcal{H}_{\mathrm{slow}} \otimes \mathcal{H}_{\mathrm{fast}} := L^2(M^*) \otimes L^2(\mathbb{T}^d_y)$ in which the unperturbed hamiltonian is block diagonal (see diagram (3.1)). The names slow and fast Hilbert space are due to the operators defined on them: on the fast Hilbert space, the two conjugate observables are $-i\nabla_y$ and $\hat{y}$ acting on $\mathcal{H}_{\mathrm{fast}} = L^2(\mathbb{T}^d_y)$, the Hilbert space associated to the Wigner–Seitz cell $M \cong \mathbb{T}^d_y$ (cf. Eq. (1.5)); their commutator is of $O(1)$. The operators $R$ and $K^A$ (cf. Eq. (2.14)) defined on $\mathcal{H}_{\mathrm{slow}} = L^2(M^*)$ are considered slow, because their commutator is of $O(\varepsilon)$. Since $\mathcal{H}_{\mathrm{slow}}$ is the Hilbert space over the Brillouin zone $M^*$ (cf. Eq. (1.4)), the dynamics of the slow variables $(R, K^A)$ describes the motion across unit cells in momentum representation whereas the dynamics of the fast variables $(\hat{y}, -i\nabla_y)$ describes what happens within the Wigner–Seitz cell $M$.

(ii) A small, dimensionless parameter $\varepsilon$ that quantifies the separation of spatial scales. In our situation, $\varepsilon \ll 1$ relates the variation of the external electromagnetic field to the microscopic scale as given by the lattice constant. In addition, we have the parameter $\lambda$. However, only the semiclassical parameter $\varepsilon$ is crucial for adiabatic decoupling.

(iii) A relevant part of the spectrum, i.e. a subset of the spectrum which is separated from the remainder by a gap. We are interested in the dynamics associated to a family of Bloch bands $\{E_n\}_{n \in I}$ that does not intersect or merge with bands from the remainder of the spectrum.

Assumption 3.1 (Gap Condition). The spectrum of $\hat{H}^{\mathcal{Z}}_{\mathrm{per}}$ satisfies the gap condition, namely there exists a family of Bloch bands $\{E_n\}_{n \in I}$, $I = [I_-, I_+] \cap \mathbb{N}_0$, such that
\[ \inf_{k \in M^*} \operatorname{dist}\Bigl( \bigcup_{n \in I} \{E_n(k)\}, \ \bigcup_{j \notin I} \{E_j(k)\} \Bigr) =: C_g > 0. \]
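The gap constant $C_g$ of Assumption 3.1 can be estimated on a grid once the band functions are known. The Python sketch below does this for a purely hypothetical four-band toy model in $d = 1$ with Brillouin zone $[-\pi, \pi)$. The band functions, the grid and the names `gap_constant` and `brillouin_grid` are assumptions for illustration and do not come from a real crystal.

```python
import math

def brillouin_grid(n_points):
    """Equidistant grid on the 1D Brillouin zone [-pi, pi)."""
    return [-math.pi + 2 * math.pi * j / n_points for j in range(n_points)]

def gap_constant(bands, relevant, k_grid):
    """Grid approximation of inf_k dist({E_n(k)}_{n in I}, {E_j(k)}_{j not in I}).

    bands is a list of functions k -> E(k); relevant is the index set I.
    """
    best = math.inf
    for k in k_grid:
        energies = [band(k) for band in bands]
        inside = [e for n, e in enumerate(energies) if n in relevant]
        outside = [e for n, e in enumerate(energies) if n not in relevant]
        best = min(best, min(abs(a - b) for a in inside for b in outside))
    return best

# Hypothetical toy bands: bands 1 and 2 cross each other (which is admissible)
# but stay separated from bands 0 and 3.
toy_bands = [
    lambda k: -3.0 + math.cos(k),
    lambda k: 0.5 * math.cos(k),
    lambda k: -0.5 * math.cos(k),
    lambda k: 3.0 + math.cos(k),
]
toy_gap = gap_constant(toy_bands, relevant={1, 2}, k_grid=brillouin_grid(200))
```

For these toy bands the infimum is attained at $k = 0$ and $k = -\pi$, giving $C_g = 3/2$. The crossing of the two relevant bands at $k = \pm\pi/2$ does not enter, since only distances to bands outside $I$ count.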
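The spectral gap ensures that transitions from and to the relevant part of the spectrum are exponentially suppressed. Band crossings within the relevant part of the spectrum are admissible, though.

In the original publication, an additional assumption was made on the existence of a smooth, $\tau$-equivariant basis, a condition that is equivalent to the triviality of a certain $U(N)$ bundle over the torus $\mathbb{T}^d_k$ where $N := |I|$ is the number of bands including multiplicity. At least for the physically relevant cases, i.e. $d \le 3$, Panati has shown that this is always possible for non-magnetic Bloch bands [23]. For $d \ge 4$, our results still hold if we add

Assumption 3.2 (Smooth Frame ($d \ge 4$)). If $d \ge 4$, we assume there exists an orthonormal basis (called smooth frame) $\{\varphi_j(\cdot)\}_{j = 1, \ldots, |I|}$ of $\operatorname{ran} \pi_0(\cdot)$ whose elements are smooth and $\tau$-equivariant with respect to $k$, i.e. $\varphi_j(\cdot - \gamma^*) = \tau(\gamma^*) \varphi_j(\cdot)$ for all $\gamma^* \in \Gamma^*$ and for all $j \in \{1, \ldots, |I|\}$.

3.2. Rewriting the unperturbed problem: An adiabatic point of view

Let us consider the unperturbed case, i.e. the absence of an external electromagnetic field. Then the dynamics on $\mathcal{H}_\tau$ is generated by $\hat{H}^{\mathcal{Z}}_{\mathrm{per}} = \int^{\oplus}_{M^*} dk \, H^{\mathcal{Z}}_{\mathrm{per}}(k)$. Each fiber hamiltonian $H^{\mathcal{Z}}_{\mathrm{per}}(k)$ is an operator on the fast Hilbert space $\mathcal{H}_{\mathrm{fast}} = L^2(\mathbb{T}^d_y)$. Then $\hat{\pi}_0 = \int^{\oplus}_{M^*} dk \, \pi_0(k)$ is the projection onto the relevant part of the spectrum, where
\[ \pi_0(k) := \sum_{n \in I} |\varphi_n(k)\rangle \langle \varphi_n(k)| . \]
Even though the $\varphi_n(k)$ may not be continuous at eigenvalue crossings, the projection $k \mapsto \pi_0(k)$ is smooth due to the spectral gap. Associated to the relevant band is a (non-unique) unitary $\hat{u}_0 = \int^{\oplus}_{M^*} dk \, u_0(k)$ which “straightens” $\mathcal{H}_\tau$ into $L^2(M^*_k) \otimes L^2(\mathbb{T}^d_y)$: for each $k \in M^*$, we define
\[ u_0(k) := \sum_{n \in I} |\chi_n\rangle \langle \varphi_n(k)| + u_0^\perp(k) \]
where $\chi_n \in L^2(\mathbb{T}^d_y)$, $n \in I$, are fixed vectors independent of $k$ and $u_0^\perp(k)$ (also non-unique) acts on the complement of $\operatorname{ran} \pi_0(k)$ and is such that $\hat{u}_0$ is a proper unitary. Even though this means $u_0$ is not unique, the specific choices of the $\{\chi_n\}_{n \in I}$ and $u_0^\perp$ will not enter the derivation. Then we can put all parts of the puzzle into a diagram:
\[ \begin{array}{ccccc} L^2(\mathbb{R}^d_x) & \xrightarrow{\ \mathcal{Z}\ } & \mathcal{H}_\tau & \xrightarrow{\ \hat{u}_0\ } & L^2(M^*) \otimes L^2(\mathbb{T}^d_y) \\ \big\downarrow {\scriptstyle \mathcal{Z}^{-1} \hat{\pi}_0 \mathcal{Z}} & & \big\downarrow {\scriptstyle \hat{\pi}_0} & & \big\downarrow {\scriptstyle \Pi_{\mathrm{ref}}} \\ \mathcal{Z}^{-1} \hat{\pi}_0 \mathcal{Z} L^2(\mathbb{R}^d_x) & \dashrightarrow & \hat{\pi}_0 \mathcal{H}_\tau & \dashrightarrow & L^2(M^*) \otimes \mathbb{C}^N \end{array} \tag{3.1} \]
Here, the time evolutions $e^{-i \frac{t}{\varepsilon} \hat{H}}$ and $e^{-i \frac{t}{\varepsilon} \hat{H}^{\mathcal{Z}}}$ act on $L^2(\mathbb{R}^d_x)$ and $\mathcal{H}_\tau$ in the upper row, and $e^{-i \frac{t}{\varepsilon} \hat{h}_{\mathrm{eff}\,0}}$ acts on $L^2(M^*) \otimes \mathbb{C}^N$ in the lower-right corner.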
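The role of the straightening unitary $u_0(k)$ can be illustrated in a finite-dimensional toy setting. The Python sketch below replaces $L^2(\mathbb{T}^d_y)$ by $\mathbb{C}^2$, takes a single relevant band ($N = 1$) with a hypothetical $k$-dependent unit vector in place of a Bloch function, and checks that $u_0(k) \, \pi_0(k) \, u_0(k)^*$ is the $k$-independent projection $|\chi_1\rangle\langle\chi_1|$. All names and the specific vectors are assumptions for illustration.

```python
import cmath
import math

def outer(u, v):
    """Rank-one operator |u><v| as a nested list."""
    return [[complex(ui) * complex(vj).conjugate() for vj in v] for ui in u]

def add(a, b):
    return [[a[r][c] + b[r][c] for c in range(len(a))] for r in range(len(a))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]

def dagger(a):
    n = len(a)
    return [[complex(a[c][r]).conjugate() for c in range(n)] for r in range(n)]

CHI = (1.0, 0.0)        # fixed reference vector chi_1, independent of k
CHI_PERP = (0.0, 1.0)   # spans the orthogonal complement of chi_1

def phi(k):
    """Toy normalized 'Bloch function' in C^2 (hypothetical)."""
    return (math.cos(k), cmath.exp(1j * k) * math.sin(k))

def phi_perp(k):
    """Unit vector orthogonal to phi(k)."""
    return (-cmath.exp(-1j * k) * math.sin(k), math.cos(k))

def pi0(k):
    """Projection onto the relevant subspace, |phi(k)><phi(k)|."""
    return outer(phi(k), phi(k))

def u0(k):
    """Straightening unitary |chi><phi(k)| + u0_perp(k) with u0_perp = |chi_perp><phi_perp(k)|."""
    return add(outer(CHI, phi(k)), outer(CHI_PERP, phi_perp(k)))

def straightened_projection(k):
    u = u0(k)
    return matmul(matmul(u, pi0(k)), dagger(u))
```

Any other choice of `phi_perp` that keeps `u0(k)` unitary would lead to the same straightened projection, which reflects the remark that the specific choice of $u_0^\perp$ does not matter.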
The reference projection $\Pi_{\mathrm{ref}} = \mathrm{id}_{L^2(M^*)} \otimes \pi_{\mathrm{ref}}$ acts trivially on the first factor, $L^2(M^*)$, and projects via
\[ \pi_{\mathrm{ref}} = \sum_{j=1}^{N} |\chi_j\rangle \langle \chi_j| = u_0(k) \, \pi_0(k) \, u_0^*(k) \tag{3.2} \]
onto an $N$-dimensional subspace of $L^2(\mathbb{T}^d_y)$. We will identify $\pi_{\mathrm{ref}} L^2(\mathbb{T}^d_y)$ with $\mathbb{C}^N$ when convenient and in this sense, we identify the range of $\Pi_{\mathrm{ref}}$ with the reference space $\mathcal{H}_{\mathrm{ref}} := L^2(M^*) \otimes \mathbb{C}^N$. The dynamics in the lower-right corner is generated by the effective hamiltonian
\[ \hat{h}_{\mathrm{eff}\,0} := \Pi_{\mathrm{ref}} \, \hat{u}_0 \, \hat{H}^{\mathcal{Z}}_{\mathrm{per}} \, \hat{u}_0^* \, \Pi_{\mathrm{ref}} \]
which reduces to $E_n(\hat{k})$ if the relevant part of the spectrum consists of an isolated Bloch band.

3.3. Adiabatic decoupling in the presence of external fields

Now the question is whether a similar diagram exists even if the perturbation is present, i.e. whether there exist a tilted projection $\Pi$, an intertwining unitary $U$ and an effective hamiltonian $\hat{h}_{\mathrm{eff}}$ that take the place of $\hat{\pi}_0$, $\hat{u}_0$ and $\hat{h}_{\mathrm{eff}\,0}$. This has been answered in the positive for magnetic fields that admit $C_b^\infty(\mathbb{R}^d, \mathbb{R}^d)$ vector potentials in [24] where these objects are explicitly constructed by recursion. We replace standard Weyl calculus used in the original publication with its magnetic variant (cf. Sec. 2.2) which naturally allows for the treatment of more general magnetic fields with components in $C_b^\infty$. The construction of $\Pi$ and $U$ detailed in the next section is a “defect construction” where recursion relations derived from
\[ \begin{array}{ll} \Pi^2 = \Pi, & [\Pi, \hat{H}^{\mathcal{Z}}] = 0, \\ U^* U = \mathrm{id}_{\mathcal{H}_\tau}, \quad U U^* = \mathrm{id}_{L^2_{\mathrm{per}}(M^*) \otimes L^2_{\mathrm{per}}(M)}, & U \, \Pi \, U^* = \Pi_{\mathrm{ref}}, \end{array} \]
relate the $n$th term to all previous terms. These four conditions merely characterize that $\Pi$ and $U$ are still a projection and a unitary (first column) and adapted to the problem (second column). These equations can be translated via magnetic Weyl calculus to
\[ \begin{array}{ll} \pi \sharp^B \pi = \pi + O(\varepsilon^\infty), & [\pi, H^{\mathcal{Z}}]_{\sharp^B} = O(\varepsilon^\infty), \\ u \sharp^B u^* = 1 + O(\varepsilon^\infty) = u^* \sharp^B u, & u \sharp^B \pi \sharp^B u^* = \pi_{\mathrm{ref}} + O(\varepsilon^\infty), \end{array} \tag{3.3} \]
where
\[ H^{\mathcal{Z}}(k, r) := \tfrac{1}{2} \bigl( -i\nabla_y + k \bigr)^2 + V_\Gamma(\hat{y}) + \phi(r) \tag{3.4} \]
is the operator-valued symbol associated to the magnetic pseudodifferential operator $\hat{H}^{\mathcal{Z}}$ defined by Eq. (2.5). Note that the magnetic vector potential $A$ does not enter the definition of the symbol $H^{\mathcal{Z}}$ and we do not need to impose conditions on
$A$ to ensure the symbol $H^{\mathcal{Z}}$ is well-behaved after minimal substitution. This is why magnetic Weyl calculus can be used to treat much more general magnetic fields. For technical reasons, $\mathrm{Op}^A(u)$ and $U$, for instance, agree only up to an error that is arbitrarily small in $\varepsilon$ with respect to the operator norm, $U = \mathrm{Op}^A(u) + O_{\|\cdot\|}(\varepsilon^\infty)$. The tilted projection and intertwining unitary are now used to define the effective hamiltonian as the magnetic quantization of
\[ h_{\mathrm{eff}} := \pi_{\mathrm{ref}} \, u \sharp^B H^{\mathcal{Z}} \sharp^B u^* \, \pi_{\mathrm{ref}} \]
which generates effective dynamics, i.e. for initial states in $\Pi \mathcal{H}_\tau$ we can approximate the full time evolution in terms of $e^{-i \frac{t}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})}$. In turn, the effective quantum evolution can be approximated by semiclassical dynamics. Theorem 5.2, the main result of the next section, will make this statement precise.

4. Derivation of effective quantum dynamics

The aforementioned “defect construction” yields the tilted projection $\pi$ and the intertwining unitary $u$ as asymptotic expansions in $\varepsilon$ and $\lambda$. It is important that the decoupling is solely due to the separation of spatial scales quantified by $\varepsilon$ and independent of $\lambda$ which regulates the strength of the magnetic field.

4.1. The dynamics in the Almost Invariant Subspace

We will quickly explain how $\Pi$ and $U$ are computed order-by-order in $\varepsilon$ and $\lambda$. We adapt the general recipe explained in [27] to incorporate two parameters: since the decoupling is due to the separation of spatial scales quantified by $\varepsilon \ll 1$, we will order corrections in powers of $\varepsilon$ first. Expanding the magnetic Weyl product to zeroth order, we can check
\[ \begin{array}{ll} \pi_0 \sharp^B \pi_0 = \pi_0 + O(\varepsilon), & [\pi_0, H^{\mathcal{Z}}]_{\sharp^B} = O(\varepsilon), \\ u_0 \sharp^B u_0^* = 1 + O(\varepsilon) = u_0^* \sharp^B u_0, & u_0 \sharp^B \pi_0 \sharp^B u_0^* = \pi_{\mathrm{ref}} + O(\varepsilon). \end{array} \]
Here, $[\pi_0, H^{\mathcal{Z}}]_{\sharp^B} := \pi_0 \sharp^B H^{\mathcal{Z}} - H^{\mathcal{Z}} \sharp^B \pi_0$ denotes the magnetic Weyl commutator. The asymptotic expansion of the product is key to deriving corrections in a systematic manner: the $O(\varepsilon)$ terms can be used to infer $\pi_1$ and $u_1$, the subprincipal symbols. Then, one proceeds by recursion: if $\pi^{(n)} := \sum_{l=0}^{n} \varepsilon^l \pi_l$ and $u^{(n)} := \sum_{l=0}^{n} \varepsilon^l u_l$ satisfy Eqs. (3.3) up to errors of order $\varepsilon^{n+1}$, then we can compute $\pi_{n+1}$ and $u_{n+1}$. The construction of $\pi$ and $u$ follows exactly from [27, Lemmas 3.8 and 3.15]; it is purely algebraic and only uses that we have a recipe to expand the Moyal product in terms of the semiclassical parameter $\varepsilon$. Let us define
\[ \begin{aligned} \pi^{(n)} \sharp^B \pi^{(n)} - \pi^{(n)} &=: \varepsilon^{n+1} G_{n+1} + O(\varepsilon^{n+2}), \\ \bigl[ H^{\mathcal{Z}}, \pi^{(n)} + \varepsilon^{n+1} \pi^{D}_{n+1} \bigr]_{\sharp^B} &=: \varepsilon^{n+1} F_{n+1} + O(\varepsilon^{n+2}) \end{aligned} \tag{4.1} \]
as projection and commutation defects and
\[ \begin{aligned} u^{(n)} \sharp^B u^{(n)\,*} - 1 &=: \varepsilon^{n+1} A_{n+1} + O(\varepsilon^{n+2}), \\ \bigl( u^{(n)} + \varepsilon^{n+1} a_{n+1} u_0 \bigr) \sharp^B \pi^{(n+1)} \sharp^B \bigl( u^{(n)} + \varepsilon^{n+1} a_{n+1} u_0 \bigr)^* - \pi_{\mathrm{ref}} &=: \varepsilon^{n+1} B_{n+1} + O(\varepsilon^{n+2}), \end{aligned} \tag{4.2} \]
as unitarity and intertwining defects. The diagonal part $\pi^{D}_{n+1}$ of the projection can be computed from $G_{n+1}$ via
\[ \pi^{D}_{n+1} := -\pi_0 \, G_{n+1} \, \pi_0 + (1 - \pi_0) \, G_{n+1} \, (1 - \pi_0). \tag{4.3} \]
The term
\[ a_{n+1} = -\tfrac{1}{2} A_{n+1} \tag{4.4} \]
stems from the ansatz $u_{n+1} = (a_{n+1} + b_{n+1}) u_0$ where $a_{n+1}$ and $b_{n+1}$ are symmetric and antisymmetric, respectively. One can solve the second equation for
\[ b_{n+1} = \bigl[ \pi_{\mathrm{ref}}, B_{n+1} \bigr] \tag{4.5} \]
where $\pi_{\mathrm{ref}}$ is the reference projection on $L^2(\mathbb{T}^d_y)$ given by Eq. (3.2). This equation fixes only the off-diagonal part of $b_{n+1}$ as $\pi_{\mathrm{ref}} B_{n+1} \pi_{\mathrm{ref}} = 0 = (1 - \pi_{\mathrm{ref}}) B_{n+1} (1 - \pi_{\mathrm{ref}})$ and in principle one is free to choose the diagonal part of $b_{n+1}$. This means there is a freedom that allows arbitrary unitary transformations within $\pi_{\mathrm{ref}} L^2(\mathbb{T}^d_y)$ as well as its orthogonal complement.

In general, it is not possible to solve
\[ \bigl[ H^{\mathcal{Z}}, \pi^{OD}_{n+1} \bigr] = -F_{n+1} \tag{4.6} \]
explicitly since Bloch functions at band crossings within the relevant part of the spectrum (which are admissible) are no longer differentiable. In any case, $\pi$ can be constructed locally around $(k_0, r_0)$ by asymptotically expanding the Moyal resolvent $(H^{\mathcal{Z}} - z)^{(-1)_B}$, i.e. the symbol defined through the relations
\[ (H^{\mathcal{Z}} - z) \sharp^B (H^{\mathcal{Z}} - z)^{(-1)_B} = 1 = (H^{\mathcal{Z}} - z)^{(-1)_B} \sharp^B (H^{\mathcal{Z}} - z), \]
and setting
\[ \pi(k, r) = \frac{i}{2\pi} \oint_{C(k_0, r_0)} dz \, (H^{\mathcal{Z}} - z)^{(-1)_B}(k, r) + O(\varepsilon^\infty) \tag{4.7} \]
in a neighborhood of $(k_0, r_0)$. A recent result by Iftimie et al. [15] suggests that under these circumstances ($H^{\mathcal{Z}}$ is elliptic and selfadjoint operator-valued) $(H^{\mathcal{Z}} - z)^{(-1)_B}$ always exists and is a Hörmander symbol even in the presence of a magnetic field. We reckon their result extends to the case of operator-valued symbols, but seeing how tedious the proof is, we simply stick to the procedure used by Panati et al. [27, Lemma 5.17]. This construction uniquely fixes the tilted Moyal projection $\pi$, but not the Moyal unitary $u$.
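As the two-parameter expansion of the product
\[ f \sharp^B g \asymp \sum_{n=0}^{\infty} \sum_{k=0}^{n} \varepsilon^n \lambda^k \, (f \sharp^B g)_{(n,k)} \]
contributes only finitely many terms in $\lambda$ for a fixed power $n$ of $\varepsilon$ [17], we can order the terms of the expansion of $\pi$ and $u$ in powers of $\lambda$, e.g.,
\[ \pi_n = \sum_{k=0}^{n} \lambda^k \, \pi_{(n,k)} . \]
The magnetic Weyl product as well as its asymptotic expansion are defined in terms of oscillatory integrals, i.e. integrals which exist in the distributional sense. If we take the limit $\lambda \to 0$ of $f \sharp^B g$, we can interchange oscillatory integration and limit procedure [13, p. 90] and conclude $\lim_{\lambda \to 0} f \sharp^B g = f \sharp g$ where $\sharp$ is the usual Moyal product. Similarly, we can apply this reasoning to the asymptotic expansion: for any fixed $N \in \mathbb{N}_0$, we may write the product as
\[ f \sharp^B g = \sum_{n=0}^{N} \sum_{k=0}^{n} \varepsilon^n \lambda^k \, (f \sharp^B g)_{(n,k)} + \varepsilon^{N+1} R^B_{N+1}(f, g) \]
and taking the limit $\lambda \to 0$ means only the non-magnetic terms $(f \sharp^B g)_{(n,0)}$ survive. The remainder also behaves nicely when taking the limit as it is also just another oscillatory integral and $\lim_{\lambda \to 0} R^B_{N+1}(f, g)$ is exactly the remainder of the non-magnetic Weyl product. Hence, we can now prove the main result of this paper:

Theorem 4.1 (Effective Quantum Dynamics). Let Assumptions 1.1, 1.2 and 3.1 be satisfied. Furthermore, if $d \ge 4$, we add Assumption 3.2. Then there exist

(i) an orthogonal projection $\Pi \in \mathcal{B}(\mathcal{H}_\tau)$,
(ii) a unitary map $U$ which intertwines $\mathcal{H}_\tau$ and $L^2(M^*) \otimes L^2(\mathbb{T}^d_y)$, and
(iii) a selfadjoint operator $\mathrm{Op}^A(h_{\mathrm{eff}}) \in \mathcal{B}\bigl(L^2(M^*) \otimes \mathbb{C}^N\bigr)$, $N := |I|$,

such that
\[ \bigl\| [\hat{H}^{\mathcal{Z}}, \Pi] \bigr\|_{\mathcal{B}(\mathcal{H}_\tau)} = O(\varepsilon^\infty) \tag{4.8} \]
and
\[ \Bigl\| \bigl( e^{-i s \hat{H}^{\mathcal{Z}}} - U^* e^{-i s \mathrm{Op}^A(h_{\mathrm{eff}})} U \bigr) \Pi \Bigr\|_{\mathcal{B}(\mathcal{H}_\tau)} = O\bigl( \varepsilon^\infty (1 + |s|) \bigr). \tag{4.9} \]
The effective hamiltonian is the magnetic quantization of the $\Gamma^*$-periodic symbol
\[ h_{\mathrm{eff}} := \pi_{\mathrm{ref}} \, u \sharp^B H^{\mathcal{Z}} \sharp^B u^* \, \pi_{\mathrm{ref}} \asymp \sum_{n=0}^{\infty} \varepsilon^n h_{\mathrm{eff}\,n} \in AS^0_{\tau \equiv 1}\bigl( \mathcal{B}(\mathbb{C}^N) \bigr) \tag{4.10} \]
whose asymptotic expansion can be computed to any order in $\varepsilon$ and $\lambda$. To each order $n$ in $\varepsilon$, only finitely many terms in $\lambda$ contribute, $h_{\mathrm{eff}\,n} = \sum_{k=0}^{n} \lambda^k h_{\mathrm{eff}\,(n,k)}$.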
One deduces from Eq. (4.8) and a Duhamel argument that the unitary time evolution generated by $\hat{H}^Z$ and $\Pi$ almost commute even for macroscopic times $t = \varepsilon s$ and hence, up to an error of arbitrarily large order in $\varepsilon$, the space $\Pi_{\mathrm{ref}} \mathcal{H}_\tau$ is left invariant by the dynamics,
\[
(1 - \Pi) \, e^{-i \frac{t}{\varepsilon} \hat{H}^Z} \, \Pi = \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty |t|).
\]
Thus we call $\Pi \mathcal{H}_\tau$ the almost invariant subspace and in a sense, it is the tilted "eigenspace" associated to the family of relevant bands. The proof of the above theorem amounts to showing (i)–(iii) separately.

Proposition 4.1 (Tilted Projection). Under the assumptions of Theorem 4.1 there exists an orthogonal projection $\Pi \in \mathcal{B}(\mathcal{H}_\tau)$ such that
\[
[\hat{H}^Z, \Pi] = \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty) \tag{4.11}
\]
and $\Pi = \mathrm{Op}^A(\pi) + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty)$ where $\mathrm{Op}^A(\pi)$ is the magnetic Weyl quantization of a $\tau$-equivariant semiclassical symbol
\[
\pi \asymp \sum_{n=0}^{\infty} \varepsilon^n \pi_n \in AC^\infty_\tau\bigl(\mathcal{B}(\mathcal{H}_{\mathrm{fast}})\bigr)
\]
whose principal part $\pi_0(k, r)$ coincides with the spectral projection of $H^Z(k, r)$ onto the subspace corresponding to the given isolated family of Bloch bands $\{E_n\}_{n \in I}$. Each term in the expansion can be written as a finite sum
\[
\pi_n = \sum_{k=0}^{n} \lambda^k \pi_{(n,k)} \in AC^\infty_\tau\bigl(\mathcal{B}(\mathcal{H}_{\mathrm{fast}})\bigr)
\]
ordered by powers of $\lambda$. For $\lambda \to 0$, the projection $\pi$ reduces to the non-magnetic projection $\pi^0 \asymp \sum_{n=0}^{\infty} \varepsilon^n \pi_{(n,0)}$.

Proof [Sketch]. The proof relies on a well-developed magnetic Weyl calculus adapted to operator-valued symbols (cf. Sec. 2.2) and the gap condition. In particular, one needs a magnetic Calderón–Vaillancourt theorem, composition and quantization of Hörmander symbols [14] and finally, an asymptotic two-parameter expansion of the magnetic Weyl product $\sharp^B$ [17]. The interested reader may check line-by-line that the original proof [27, Proposition 5.16] can be transliterated to the magnetic context with obvious modifications. If we were using standard Weyl calculus, the major obstacle would be to control derivatives of $\pi$ since vector potentials may be unbounded. In magnetic Weyl calculus the vector potential at no point enters the calculations and the assumptions on the magnetic field assure that $\pi \in AC^\infty_\tau\bigl(\mathcal{B}(\mathcal{H}_{\mathrm{fast}})\bigr)$ is a proper $\tau$-equivariant semiclassical Hörmander-class symbol (cf. Definition 2.3). The fact that we can write all of the $\pi_n$ as finite sums of terms ordered by powers of $\lambda$ stems from the fact that calculating $\pi_n$ involves the expansion of the product
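up to $n$th power in $\varepsilon$, e.g., for the projection defect, we find
\begin{align*}
\pi^{(n-1)} \sharp^B \pi^{(n-1)} - \pi^{(n-1)} &= \varepsilon^n \sum_{a+b+c=n} (\pi_a \sharp^B \pi_b)_{(c)} + \mathcal{O}(\varepsilon^{n+1}) \\
&= \varepsilon^n \sum_{a+b+c=n} \, \sum_{a'=0}^{a} \sum_{b'=0}^{b} \sum_{c'=0}^{c} \lambda^{a'+b'+c'} \bigl( \pi_{(a,a')} \sharp^B \pi_{(b,b')} \bigr)_{(c,c')} + \mathcal{O}(\varepsilon^{n+1}).
\end{align*}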
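The index bookkeeping in the sum above can be checked mechanically. The following is a minimal sketch (the function name `lambda_exponents` is ours, not from the paper): it enumerates all index combinations with $a+b+c=n$, $0 \le a' \le a$, $0 \le b' \le b$, $0 \le c' \le c$ and collects the resulting powers of $\lambda$. It says nothing about the symbols themselves, only about which exponents can occur.

```python
from itertools import product


def lambda_exponents(n):
    """Collect all exponents a'+b'+c' of lambda occurring at order eps^n.

    The outer sum runs over a+b+c = n, the inner sums over
    0 <= a' <= a, 0 <= b' <= b, 0 <= c' <= c.
    """
    exponents = set()
    for a, b in product(range(n + 1), repeat=2):
        c = n - a - b
        if c < 0:
            continue  # (a, b) does not extend to a triple with a+b+c = n
        for ap, bp, cp in product(range(a + 1), range(b + 1), range(c + 1)):
            exponents.add(ap + bp + cp)
    return exponents


for n in range(5):
    exps = lambda_exponents(n)
    # Only finitely many powers of lambda appear, and none exceeds n.
    print(n, sorted(exps), max(exps) <= n)
```

For each $n$ the set of exponents is finite and bounded by $n$, which is the combinatorial content of the statement that follows.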
Certainly, the exponent of $\lambda$ is always bounded by $n \geq a' + b' + c'$. And since the sum is finite, this clearly defines a semiclassical symbol in $\varepsilon$ (cf. Definition 2.2). Similar arguments for the commutation defect in conjunction with the comments in the beginning of this section show $\pi$ to be a semiclassical symbol. It is well-behaved under the $\lambda \to 0$ limit and reduces to the projection associated to the case $B = 0$. Lastly, to make the almost projection $\mathrm{Op}^A(\pi)$ into a true projection, we define $\Pi$ to be the spectral projection onto the spectrum in the vicinity of 1,
\[
\Pi := \frac{i}{2\pi} \oint_{|z - 1| = \frac{1}{2}} dz \, \bigl( \mathrm{Op}^A(\pi) - z \bigr)^{-1} .
\]
This concludes the proof.

Similarly, one can modify [27, proof of Proposition 5.18] to show the existence of the intertwining unitary.

Proposition 4.2 (Intertwining Unitary). Let $\{E_n\}_{n \in I}$ be a family of bands separated by a gap from the others and let Assumption 1.1 be satisfied. If $d > 3$, assume $u_0 \in S^0_0\bigl(\mathcal{B}(\mathcal{H}_{\mathrm{fast}})\bigr)$. Then there exists a unitary operator $U : \mathcal{H}_\tau \to L^2(M^*) \otimes L^2(\mathbb{T}^d_y)$ such that $U = \mathrm{Op}^A(u) + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty)$ where
\[
u \asymp \sum_{n=0}^{\infty} \varepsilon^n u_n \in AS^0\bigl(\mathcal{B}(\mathcal{H}_{\mathrm{fast}})\bigr)
\]
is right-$\tau$-covariant at any order and has principal symbol $u_0$. Each term in the expansion can be written as a finite sum
\[
u_n = \sum_{k=0}^{n} \lambda^k u_{(n,k)}
\]
ordered by powers of $\lambda$. For $\lambda \to 0$, the unitary $u$ reduces to the non-magnetic unitary $u^0 \asymp \sum_{n=0}^{\infty} \varepsilon^n u_{(n,0)}$.

Proof [Sketch]. Equations (4.4) and (4.5) give us $a_{n+1}$ and $b_{n+1}$ which combine to $u_{n+1} = (a_{n+1} + b_{n+1}) u_0$; by [17, Theorem 1.1] it is also in the correct symbol class, namely $S^0_0\bigl(\mathcal{B}(\mathcal{H}_{\mathrm{fast}})\bigr)$. The right $\tau$-covariance is also obvious from the ansatz. Lastly, the true unitary $U$ is obtained via the Nagy formula as described in [27].

Proof of Theorem 4.1 [Sketch]. The existence of $\Pi$ and $U$ has been the subject of Propositions 4.1 and 4.2. By right-$\tau$-covariance of $u$, $h_{\mathrm{eff}}$ is a $\Gamma^*$-periodic symbol and since it is the magnetic Weyl product of $C_b^\infty\bigl(T^*\mathbb{R}^d, \mathcal{B}(L^2(\mathbb{T}^d_y))\bigr)$ functions,
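[17, Theorem 1.1] ensures that the product and its asymptotic two-parameter expansion are in $C_b^\infty\bigl(T^*\mathbb{R}^d, \mathcal{B}(L^2(\mathbb{T}^d_y))\bigr)$ as well. Equation (4.9) follows as usual from a Duhamel argument.

The crucial statement of Theorem 4.1 is Eq. (4.9) and it is worthwhile to discuss its implications for applications: in practice, one only computes finitely many terms of the asymptotic expansions of $\Pi$, $U$ and $h_{\mathrm{eff}}$. Let us call their finite resummations $u^{(l)} := \sum_{n=0}^{l} \varepsilon^n u_n$, $\pi^{(l)} := \sum_{n=0}^{l} \varepsilon^n \pi_n$ and $h_{\mathrm{eff}}^{(l)} := \sum_{n=0}^{l} \varepsilon^n h_{\mathrm{eff}\,n}$. Assume we are interested in times $t = \varepsilon^k s$ where $|t| \leq \tau$. Then a closer inspection of the Duhamel argument in the proof of Theorem 4.1 yields
\[
e^{-i \frac{t}{\varepsilon^k} \hat{H}^Z} \Pi = \mathrm{Op}^A\bigl(u^{(n)}\bigr)^* \, e^{-i \frac{t}{\varepsilon^k} \mathrm{Op}^A(h_{\mathrm{eff}}^{(n+k)})} \, \Pi_{\mathrm{ref}} \, \mathrm{Op}^A\bigl(u^{(n)}\bigr) + \mathcal{O}_{\|\cdot\|}\bigl(\varepsilon^{n-k+1}\bigr).
\]
Hence, if we would like to consider macroscopic times $t = \varepsilon s$ and make an error in the propagation of order $\mathcal{O}(\varepsilon^2)$, then we need to expand the tilted projection, intertwining unitary and the effective hamiltonian to second order. However, if the relevant part of the spectrum consists of a single band, computing the first-order correction to the effective hamiltonian suffices as we shall see in Sec. 5.

4.2. Effective dynamics for a single band: The Peierls substitution

In case the relevant part of the spectrum consists of a single non-degenerate band $E_*$, we can calculate the first-order correction to $h_{\mathrm{eff}}$ explicitly: the magnetic Weyl product reduces to the pointwise product to zeroth order in $\varepsilon$. Thus, we can directly compute
\[
h_{\mathrm{eff}\,0} = \pi_{\mathrm{ref}} u_0 H_0 u_0^* \pi_{\mathrm{ref}} =: \pi_{\mathrm{ref}} h_0 \pi_{\mathrm{ref}} = E_* + \phi .
\]
For the first order, we use the recursion formula [27, Eq. (3.35)] and the fact that $h_{\mathrm{eff}\,0}$ is a scalar-valued symbol (repeated indices are summed over):
\begin{align*}
h_{\mathrm{eff}\,1} &= \pi_{\mathrm{ref}} \bigl( u_1 H_0 - h_0 u_1 + (u_0 \sharp^B H_0)_{(1)} - (h_0 \sharp^B u_0)_{(1)} \bigr) u_0^* \pi_{\mathrm{ref}} \\
&= \pi_{\mathrm{ref}} \bigl( [u_1 u_0^*, h_0] + (u_0 \sharp^B H_0)_{(1)} u_0^* - (h_0 \sharp^B u_0)_{(1)} u_0^* \bigr) \pi_{\mathrm{ref}} \\
&= -\tfrac{i}{2} \, \pi_{\mathrm{ref}} \bigl( \{u_0, H_0\}_{\lambda B} - \{h_0, u_0\}_{\lambda B} \bigr) u_0^* \pi_{\mathrm{ref}} .
\end{align*}
The term with the magnetic Poisson bracket can be easily computed:
\begin{align*}
\pi_{\mathrm{ref}} \bigl( \{u_0, H_0\}_{\lambda B} &- \{h_0, u_0\}_{\lambda B} \bigr) u_0^* \pi_{\mathrm{ref}} \\
&= \pi_{\mathrm{ref}} \Bigl( \partial_{k_l} u_0 \, \partial_{r_l} H_0 \, u_0^* - \lambda B_{lj} \, \partial_{k_l} u_0 u_0^* \bigl( \partial_{k_j} h_0 - \partial_{k_j} u_0 u_0^* h_0 - h_0 u_0 \partial_{k_j} u_0^* \bigr) \\
&\qquad\quad + \bigl( \partial_{r_l} h_0 \, \partial_{k_l} u_0 + \lambda B_{lj} \, \partial_{k_l} h_0 \, \partial_{k_j} u_0 \bigr) u_0^* \Bigr) \pi_{\mathrm{ref}} \\
&= 2i \bigl( \partial_{r_l} \phi - \lambda B_{lj} \partial_{k_j} E_* \bigr) \mathcal{A}_l + \lambda B_{lj} \, \pi_{\mathrm{ref}} \, \partial_{k_l} u_0 u_0^* \bigl( \partial_{k_j} u_0 u_0^* h_0 + h_0 u_0 \partial_{k_j} u_0^* \bigr) \pi_{\mathrm{ref}} \\
&= 2i \bigl( \partial_{r_l} \phi - \lambda B_{lj} \partial_{k_j} E_* \bigr) \mathcal{A}_l + \lambda B_{lj} \bigl\langle \partial_{k_l} \varphi_b , (H_{\mathrm{per}} - E_*) \, \partial_{k_j} \varphi_b \bigr\rangle .
\end{align*}
The first term combines to a Lorentz force term, the second one (which is purely imaginary) yields the Rammal–Wilkinson term. The components of the magnetic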
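field are $C_b^\infty(\mathbb{R}^d)$ functions and hence, principal and subprincipal symbol are in $C_b^\infty(T^*\mathbb{R}^d)$ as well. This means we have proven the following corollary to Theorem 4.1:

Corollary 4.1. Under the assumptions of Theorem 4.1, the principal and subprincipal symbol of the effective hamiltonian for a single non-degenerate Bloch band $E_*$ are given by
\begin{align*}
h_{\mathrm{eff}\,0} &= E_* + \phi , \\
h_{\mathrm{eff}\,1} &= -\bigl( -\partial_{r_l} \phi + \lambda B_{lj} \partial_{k_j} E_* \bigr) \mathcal{A}_l - \lambda B_{lj} \mathcal{M}_{lj} \\
&=: -F_{\mathrm{Lor}\,l} \, \mathcal{A}_l - \lambda B_{lj} \mathcal{M}_{lj} \tag{4.12}
\end{align*}
where
\[
\mathcal{A}_l(k) := i \bigl\langle \varphi_b(k), \partial_{k_l} \varphi_b(k) \bigr\rangle
\quad \text{and} \quad
\mathcal{M}_{lj}(k) = \mathrm{Re} \Bigl( \tfrac{i}{2} \bigl\langle \partial_{k_l} \varphi_b , \bigl( H_{\mathrm{per}}(k) - E_*(k) \bigr) \partial_{k_j} \varphi_b \bigr\rangle \Bigr)
\]
are the Berry connection and the so-called Rammal–Wilkinson term, respectively.

To leading order, this is the well-known Peierls substitution. In particular, for zero electric field, $\phi \equiv 0$, and constant magnetic field (and the equations written in symmetric gauge), the ansatz
\[
E_*(k) = \sum_{j=1}^{d} \cos k_j
\]
yields the celebrated Hofstadter model [12].

5. Derivation of Semiclassical Equations of Motion

In the preceding section, we have approximated the full quantum evolution by a simpler effective quantum evolution on a smaller reference space $\mathcal{H}_{\mathrm{ref}} = L^2(M^*) \otimes \mathbb{C}^N$. Now, we will link the time evolution generated by $h_{\mathrm{eff}}$ (and thus $e^{-i \frac{t}{\varepsilon} \hat{H}^Z}$) to a semiclassical flow which contains quantum corrections. Conceptually, this is a two-step process: if we reconsider the diagram of spaces,
\[
\begin{array}{ccccc}
L^2(\mathbb{R}^d_x) & \xrightarrow{\;\; \mathcal{Z} \;\;} & \mathcal{H}_\tau & \xrightarrow{\;\; U \;\;} & L^2(M^*) \otimes L^2(\mathbb{T}^d_y) \\[2pt]
\Big\downarrow {\scriptstyle \mathcal{Z}^{-1} \Pi \mathcal{Z}} & & \Big\downarrow {\scriptstyle \Pi} & & \Big\downarrow {\scriptstyle \Pi_{\mathrm{ref}}} \\[2pt]
\mathcal{Z}^{-1} \Pi \mathcal{Z} \, L^2(\mathbb{R}^d_x) & \dashrightarrow & \Pi \mathcal{H}_\tau & \dashrightarrow & L^2(M^*) \otimes \mathbb{C}^N
\end{array}
\tag{5.1}
\]
where $e^{-i \frac{t}{\varepsilon} \hat{H}}$ acts on $L^2(\mathbb{R}^d_x)$, $e^{-i \frac{t}{\varepsilon} \hat{H}^Z}$ on $\mathcal{H}_\tau$ and $e^{-i \frac{t}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})}$ on $L^2(M^*) \otimes \mathbb{C}^N$,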
we notice that our physical observables live on the upper-left space $L^2(\mathbb{R}^d_x)$ (or equivalently on $\mathcal{H}_\tau$). The effective evolution generated by $h_{\mathrm{eff}}$ approximates the dynamics if the initial states are localized in the almost invariant subspace associated to the relevant bands. In this section, we always assume the relevant part of the spectrum consists of a single non-degenerate band $E_*$ and thus $L^2(M^*) \otimes \mathbb{C}^1 \cong L^2(M^*)$. In a first step, we need to connect the semiclassical dynamics in the left column of the diagram with those in the lower-right corner. The second, much simpler step is to establish an Egorov-type theorem on the reference space $\mathcal{H}_{\mathrm{ref}} = L^2(M^*) \otimes \mathbb{C}^N$.

5.1. Relation between dynamics for macroscopic and effective observables

Since we are concerned with the semiclassical dynamics of a particle in an electromagnetic field, the magnetic field must enter in the classical equations of motion. There are two ways: either one uses minimal coupling, i.e. one writes down the equations of motion for position $r$ and momentum $k'$ with respect to the usual symplectic form for the variables $(k', r)$ and the hamiltonian $H^Z\bigl(k' - \lambda A(r), r\bigr)$, or one works directly with position $r$ and kinetic momentum $k = k' - \lambda A(r)$. In the latter case, the classical flow which enters the Egorov theorem is generated by
\[
\begin{pmatrix} \lambda B(r) & -\mathrm{id} \\ +\mathrm{id} & 0 \end{pmatrix}
\begin{pmatrix} \dot{r} \\ \dot{k} \end{pmatrix}
=
\begin{pmatrix} \nabla_r \\ \nabla_k \end{pmatrix} H^Z(k, r) \tag{5.2}
\]
where the appearance of $B$ in the matrix representation of the symplectic form is due to the fact that $k$ is kinetic momentum.

What constitutes a suitable observable? Physically, we are interested in measurements on the macroscopic scale, i.e. the observable should be independent of the microscopic degrees of freedom. On the level of symbols, this means $f(k, r)$ has to commute pointwise with the Hamiltonian $H^Z(k, r)$ for all $k$ and $r$. Hence, such an observable is a constant of motion with respect to the fast dynamics. In the simplest case, the observables are scalar-valued. This also ensures we are able to "separate" the contributions to the full dynamics band-by-band.

Note that this by no means implies $\mathrm{Op}^A(f)$ commutes with $\mathrm{Op}^A(H^Z)$, but rather that all of the non-commutativity is contained in the slow variables $(k, r)$.

Definition 5.1 (Macroscopic Semiclassical Observable). A macroscopic observable $f$ is a scalar-valued semiclassical symbol (cf. Definition 2.2) $f \in AS^0_0(\mathbb{C})$ which is $\Gamma^*$-periodic in $k$, $f(k + \gamma^*, r) = f(k, r)$ for all $(k, r) \in T^*\mathbb{R}^d$, $\gamma^* \in \Gamma^*$.

Our assumption that our dynamics lives on the almost-invariant subspace $\Pi \mathcal{H}_\tau$ modifies the classical dynamics to first order in $\varepsilon$ as well: instead of using $K^A$ and $R$ as building block observables, the proper observables should be $\Pi K^A \Pi$ and $\Pi R \Pi$. Equivalently, we can switch to the reference space representation and use the
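magnetic quantization of
\begin{align*}
k_{\mathrm{eff}} &:= \pi_{\mathrm{ref}} \, u \sharp^B k \sharp^B u^* \, \pi_{\mathrm{ref}} = k + \varepsilon \lambda B(r) \mathcal{A}(k) + \mathcal{O}(\varepsilon^2), \\
r_{\mathrm{eff}} &:= \pi_{\mathrm{ref}} \, u \sharp^B r \sharp^B u^* \, \pi_{\mathrm{ref}} = r + \varepsilon \mathcal{A}(k) + \mathcal{O}(\varepsilon^2). \tag{5.3}
\end{align*}
The crucial proposition we will prove next says that for suitable observables $f$, the effect of going to the effective representation is, up to errors of order $\varepsilon^2$ at least, equivalent to replacing the arguments $k$ and $r$ by $k_{\mathrm{eff}}$ and $r_{\mathrm{eff}}$,
\begin{align*}
\Pi_{\mathrm{ref}} U \, \mathrm{Op}^A(f) \, U^* \Pi_{\mathrm{ref}} &= \mathrm{Op}^A\bigl( \pi_{\mathrm{ref}} \, u \sharp^B f \sharp^B u^* \, \pi_{\mathrm{ref}} \bigr) + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty) \\
&=: \mathrm{Op}^A(f_{\mathrm{eff}}) + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty). \tag{5.4}
\end{align*}
Then it follows that the effective observable $f_{\mathrm{eff}}$ coincides with the original observable $f$ after a change of variables up to errors of order $\mathcal{O}(\varepsilon^2)$,
\[
f_{\mathrm{eff}} = \pi_{\mathrm{ref}} \, u \sharp^B f \sharp^B u^* \, \pi_{\mathrm{ref}} = f \circ T_{\mathrm{eff}} + \mathcal{O}(\varepsilon^2), \tag{5.5}
\]
where the map $T_{\mathrm{eff}} : (k, r) \mapsto (k_{\mathrm{eff}}, r_{\mathrm{eff}})$ maps the observables $k$ and $r$ onto the effective observables $k_{\mathrm{eff}}$ and $r_{\mathrm{eff}}$ defined via Eqs. (5.3).

Proposition 5.1. Let $f$ be a macroscopic semiclassical observable. Then up to errors of order $\varepsilon^2$ Eq. (5.5) holds and consequently, we have
\[
\Pi_{\mathrm{ref}} U \, \mathrm{Op}^A(f) \, U^* \Pi_{\mathrm{ref}} = \mathrm{Op}^A(f_{\mathrm{eff}}) + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty) = \mathrm{Op}^A(f \circ T_{\mathrm{eff}}) + \mathcal{O}_{\|\cdot\|}(\varepsilon^2). \tag{5.6}
\]

Proof. The equivalence of the left-hand sides of Eqs. (5.6) and (5.5) follows from $U = \mathrm{Op}^A(u) + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty)$ and the fact that we only need to consider the first two terms in the $\varepsilon$ expansion. With the help of [17, Theorem 1.1], we conclude $f_{\mathrm{eff}} \in AS^0$ is also a semiclassical symbol of order 0. The left-hand side of (5.5) can be computed explicitly: to zeroth order, nothing changes as $f$ commutes pointwise with $u$ and $u^*$, $f_{\mathrm{eff}\,0} = \pi_{\mathrm{ref}} u_0 f u_0^* \pi_{\mathrm{ref}} = f_0$. To first order, we have
\begin{align*}
f_{\mathrm{eff}\,1} &= \pi_{\mathrm{ref}} \bigl( u_0 f_1 + u_1 f_0 - f_{\mathrm{eff}\,0} u_1 + (u_0 \sharp^B f_0)_{(1)} - (f_{\mathrm{eff}\,0} \sharp^B u_0)_{(1)} \bigr) u_0^* \pi_{\mathrm{ref}} \\
&= \pi_{\mathrm{ref}} u_0 f_1 u_0^* \pi_{\mathrm{ref}} - \tfrac{i}{2} \, \pi_{\mathrm{ref}} \bigl( \{u_0, f_0\}_{\lambda B} - \{f_{\mathrm{eff}\,0}, u_0\}_{\lambda B} \bigr) u_0^* \pi_{\mathrm{ref}} \\
&= f_1 - i \bigl( \partial_{r_j} f_0 + \lambda B_{lj}(r) \, \partial_{k_l} f_0 \bigr) \, \pi_{\mathrm{ref}} \, \partial_{k_j} u_0 u_0^* \, \pi_{\mathrm{ref}} \\
&= f_1 + \bigl( \partial_{r_j} f_0 + \lambda B_{lj}(r) \, \partial_{k_l} f_0 \bigr) \mathcal{A}_j .
\end{align*}
On the other hand, if we Taylor expand $f \circ T_{\mathrm{eff}} = f(k_{\mathrm{eff}}, r_{\mathrm{eff}})$ to first order in $\varepsilon$, we get
\begin{align*}
f(k_{\mathrm{eff}}, r_{\mathrm{eff}}) &= f_0\bigl( k + \varepsilon \lambda B(r) \mathcal{A}(k) + \mathcal{O}(\varepsilon^2), \, r + \varepsilon \mathcal{A}(k) + \mathcal{O}(\varepsilon^2) \bigr) \\
&\quad + \varepsilon f_1\bigl( k + \varepsilon \lambda B(r) \mathcal{A}(k) + \mathcal{O}(\varepsilon^2), \, r + \varepsilon \mathcal{A}(k) + \mathcal{O}(\varepsilon^2) \bigr) + \mathcal{O}(\varepsilon^2) \\
&= f_0(k, r) + \varepsilon \bigl( f_1(k, r) + \lambda \, \partial_{k_l} f_0(k, r) \, B_{lj}(r) \, \mathcal{A}_j(k) + \partial_{r_j} f_0(k, r) \, \mathcal{A}_j(k) \bigr) + \mathcal{O}(\varepsilon^2)
\end{align*}
which coincides with $f_{\mathrm{eff}}$ up to $\mathcal{O}(\varepsilon^2)$.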
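The Taylor expansion in the last step of the proof can be checked numerically on a toy example. The sketch below assumes $d = 2$, $f_1 = 0$, a constant antisymmetric field `B`, and purely illustrative choices for the observable `f0` and for the stand-in `berry` of the Berry connection; none of these specific functions or values come from the paper. It compares $f_0 \circ T_{\mathrm{eff}}$ with the first-order expression from the proof and prints the difference, which should shrink like $\varepsilon^2$.

```python
import math

LAM = 0.5                        # hypothetical value of the coupling lambda
B = ((0.0, 1.0), (-1.0, 0.0))    # hypothetical constant antisymmetric B_lj


def f0(k, r):
    """Toy observable, 2*pi-periodic in k (illustrative choice)."""
    return math.cos(k[0]) * math.sin(r[0]) + math.sin(k[1]) * math.cos(r[1])


def grad_f0(k, r):
    """Analytic gradients (d f0/dk, d f0/dr) of the toy observable."""
    dk = (-math.sin(k[0]) * math.sin(r[0]), math.cos(k[1]) * math.cos(r[1]))
    dr = (math.cos(k[0]) * math.cos(r[0]), -math.sin(k[1]) * math.sin(r[1]))
    return dk, dr


def berry(k):
    """Hypothetical stand-in for the Berry connection A(k)."""
    return (math.sin(k[1]), math.cos(k[0]))


def t_eff(k, r, eps):
    """Effective variables to first order in eps, following Eq. (5.3)."""
    a = berry(k)
    k_eff = tuple(k[l] + eps * LAM * sum(B[l][j] * a[j] for j in range(2))
                  for l in range(2))
    r_eff = tuple(r[j] + eps * a[j] for j in range(2))
    return k_eff, r_eff


def first_order(k, r, eps):
    """f0 + eps * (lam * d_kl f0 * B_lj * A_j + d_rj f0 * A_j)."""
    a = berry(k)
    dk, dr = grad_f0(k, r)
    corr = LAM * sum(dk[l] * B[l][j] * a[j] for l in range(2) for j in range(2))
    corr += sum(dr[j] * a[j] for j in range(2))
    return f0(k, r) + eps * corr


def taylor_error(eps, k=(0.3, -0.7), r=(1.1, 0.4)):
    """Difference between f0 composed with T_eff and its first-order expansion."""
    return f0(*t_eff(k, r, eps)) - first_order(k, r, eps)


for eps in (1e-1, 1e-2, 1e-3):
    print(eps, taylor_error(eps))
```

Only the elementary Taylor step is illustrated here; the operator-theoretic part of the proof (the computation of $f_{\mathrm{eff}\,1}$ via the magnetic Weyl product) is not touched by this check.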
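Now if the equations of motion (5.2) are an approximation of the full quantum dynamics, what are the equations of motion with respect to the effective variables? The classical flow $\Phi^{\mathrm{eff}}_t$ generated by $h_{\mathrm{eff}}$ with respect to the magnetic symplectic form (Eq. (5.2)) can be rewritten in terms of effective variables,
\[
\Phi^{\mathrm{macro}}_t = T_{\mathrm{eff}} \circ \Phi^{\mathrm{eff}}_t \circ T_{\mathrm{eff}}^{-1} + \mathcal{O}(\varepsilon^2). \tag{5.7}
\]
The right-hand side does not serve as a definition for the flow of the macroscopic observables, but it is a consequence: $\Phi^{\mathrm{macro}}_t$ is the flow associated to a modified symplectic form and a modified hamiltonian. The modified symplectic form includes the Berry curvature associated to $E_*$ acting as a pseudo-magnetic field on the position variables.

Proposition 5.2. Let $\Phi^{\mathrm{macro}}_t$ be the flow on $T^*\mathbb{R}^d$ generated by
\[
\begin{pmatrix} \lambda B(r_{\mathrm{eff}}) & -\mathrm{id} \\ +\mathrm{id} & \varepsilon \Omega(k_{\mathrm{eff}}) \end{pmatrix}
\begin{pmatrix} \dot{r}_{\mathrm{eff}} \\ \dot{k}_{\mathrm{eff}} \end{pmatrix}
=
\begin{pmatrix} \nabla_{r_{\mathrm{eff}}} \\ \nabla_{k_{\mathrm{eff}}} \end{pmatrix} h_{\mathrm{sc}}(k_{\mathrm{eff}}, r_{\mathrm{eff}}) \tag{5.8}
\]
where the semiclassical hamiltonian is given by
\[
h_{\mathrm{sc}} := h_{\mathrm{eff}} \circ T_{\mathrm{eff}}^{-1} . \tag{5.9}
\]
Then Eq. (5.7) holds, i.e. $\Phi^{\mathrm{macro}}_t$ and $T_{\mathrm{eff}} \circ \Phi^{\mathrm{eff}}_t \circ T_{\mathrm{eff}}^{-1}$ agree up to errors of order $\varepsilon^2$.

Proof. We express $k$ and $r$ in terms of $k_{\mathrm{eff}}$ and $r_{\mathrm{eff}}$ in (5.2) since, for $\varepsilon$ small enough, $T_{\mathrm{eff}} : (k, r) \mapsto (k_{\mathrm{eff}}, r_{\mathrm{eff}})$ is a bijection. For instance, the semiclassical hamiltonian $h_{\mathrm{sc}} = h_{\mathrm{eff}} \circ T_{\mathrm{eff}}^{-1}$ simplifies to
\begin{align*}
h_{\mathrm{sc}}(k_{\mathrm{eff}}, r_{\mathrm{eff}}) &= h_{\mathrm{eff}}\bigl( k_{\mathrm{eff}} - \varepsilon \lambda B(r_{\mathrm{eff}}) \mathcal{A}(k_{\mathrm{eff}}), \, r_{\mathrm{eff}} - \varepsilon \mathcal{A}(k_{\mathrm{eff}}) \bigr) + \mathcal{O}(\varepsilon^2) \\
&= E_*(k_{\mathrm{eff}}) + \phi(r_{\mathrm{eff}}) - \varepsilon \lambda B(r_{\mathrm{eff}}) \cdot \mathcal{M}(k_{\mathrm{eff}}) + \mathcal{O}(\varepsilon^2).
\end{align*}
The symplectic form can be easily expanded to
\[
\begin{pmatrix} \lambda B\bigl( r_{\mathrm{eff}} - \varepsilon \mathcal{A}(k_{\mathrm{eff}}) \bigr) & -\mathrm{id} \\ +\mathrm{id} & 0 \end{pmatrix}
=
\begin{pmatrix} \lambda B(r_{\mathrm{eff}}) & -\mathrm{id} \\ +\mathrm{id} & 0 \end{pmatrix}
- \varepsilon \begin{pmatrix} \lambda \, \partial_{r_{\mathrm{eff}\,l}} B(r_{\mathrm{eff}}) \, \mathcal{A}_l(k_{\mathrm{eff}}) & 0 \\ 0 & 0 \end{pmatrix}
+ \mathcal{O}(\varepsilon^2).
\]
The other two terms, the time derivatives and gradients of $k_{\mathrm{eff}}$ and $r_{\mathrm{eff}}$, have slightly more complicated expansions, but they can be worked out explicitly. Then if we put all of them together, we arrive at the modified symplectic form (5.8). This proves the first claim. Hence, the hamiltonian vector fields agree up to $\mathcal{O}(\varepsilon^2)$ and Lemma 5.24 in [27] implies that also the flows differ only by $\mathcal{O}(\varepsilon^2)$.

Remark 5.1. These equations of motion have first been proposed in [24, Appendix] and we have derived them in a more systematic fashion. The effective coordinates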
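$r_{\mathrm{eff}}$ and $k_{\mathrm{eff}}$ are associated to a non-standard Poisson structure on $T^*\mathbb{R}^d$: from Eq. (5.8), one can read off that the Poisson bracket with respect to $r_{\mathrm{eff}}$ and $k_{\mathrm{eff}}$ is given by
\[
\{f, g\}_{\lambda B, \varepsilon \Omega} = \sum_{l=1}^{d} \bigl( \partial_{\xi_l} f \, \partial_{x_l} g - \partial_{x_l} f \, \partial_{\xi_l} g \bigr) - \sum_{l,j=1}^{d} \lambda B_{lj} \, \partial_{\xi_l} f \, \partial_{\xi_j} g - \sum_{l,j=1}^{d} \varepsilon \Omega_{lj} \, \partial_{x_l} f \, \partial_{x_j} g
\]
and thus different components of position $r_{\mathrm{eff}}$ no longer commute, $\{r_{\mathrm{eff}\,l}, r_{\mathrm{eff}\,j}\}_{\lambda B, \varepsilon \Omega} = -\varepsilon \Omega_{lj}$. Hence, $\Omega$ acts as a pseudomagnetic field that is due to quantum effects.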
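To make the structure of the equations of motion (5.8) concrete, here is a minimal numerical sketch for $d = 2$. Everything model-specific is an assumption for illustration only: the toy hamiltonian `h` (a Hofstadter-like band plus a linear potential), the constant field component `B12`, the function `omega12` standing in for the Berry curvature, and the parameter values. With $B = B_{12} J$ and $\Omega = \Omega_{12} J$, $J = \bigl(\begin{smallmatrix} 0 & 1 \\ -1 & 0 \end{smallmatrix}\bigr)$, the linear system (5.8) is solved by hand for $(\dot{r}, \dot{k})$ and integrated with a classical Runge–Kutta step. Since the matrix in (5.8) is antisymmetric, the hamiltonian is conserved along the exact flow, which gives a simple sanity check.

```python
import math

EPS, LAM, B12 = 0.1, 1.0, 0.5    # hypothetical parameter values


def h(k, r):
    """Toy hamiltonian: band energy cos k1 + cos k2 plus potential 0.2 * r1."""
    return math.cos(k[0]) + math.cos(k[1]) + 0.2 * r[0]


def grad_h(k, r):
    """Analytic gradients (grad_k h, grad_r h) of the toy hamiltonian."""
    return (-math.sin(k[0]), -math.sin(k[1])), (0.2, 0.0)


def omega12(k):
    """Hypothetical Berry curvature component Omega_12(k) (assumption)."""
    return 0.3 * math.cos(k[0]) * math.cos(k[1])


def velocity(state):
    """Solve Eq. (5.8) for (r_dot, k_dot); state = (r1, r2, k1, k2)."""
    r, k = state[:2], state[2:]
    gk, gr = grad_h(k, r)
    w = omega12(k)
    # Row 2 of (5.8): r_dot + eps * Omega k_dot = grad_k h
    # Row 1 of (5.8): lam * B r_dot - k_dot = grad_r h
    # Eliminating k_dot: (1 - eps*lam*w*B12) r_dot = grad_k h + eps*w*J grad_r h
    denom = 1.0 - EPS * LAM * w * B12
    r_dot = ((gk[0] + EPS * w * gr[1]) / denom,
             (gk[1] - EPS * w * gr[0]) / denom)
    k_dot = (LAM * B12 * r_dot[1] - gr[0],
             -LAM * B12 * r_dot[0] - gr[1])
    return (*r_dot, *k_dot)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(v, c):
        return tuple(s + c * vi for s, vi in zip(state, v))
    k1 = velocity(state)
    k2 = velocity(shifted(k1, dt / 2))
    k3 = velocity(shifted(k2, dt / 2))
    k4 = velocity(shifted(k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, dt=0.01, steps=1000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


start = (0.0, 0.0, 0.4, -0.9)    # (r1, r2, k1, k2)
end = integrate(start)
energy_drift = h(end[2:], end[:2]) - h(start[2:], start[:2])
print(energy_drift)
```

The term proportional to `EPS * w` in `r_dot` is the anomalous velocity produced by the pseudomagnetic field $\Omega$; setting `EPS = 0` recovers the flow of (5.2).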
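5.2. An Egorov-type theorem

The semiclassical approximation hinges on an Egorov-type theorem which we first prove on the level of effective dynamics:

Theorem 5.1. Let $h_{\mathrm{eff}}$ be the effective hamiltonian as given by Theorem 4.1 associated to an isolated, non-degenerate Bloch band $E_*$. Then for any $\Gamma^*$-periodic semiclassical observable $f \in AS^0$, $f = f_0 + \varepsilon f_1$, the flow $\Phi^{\mathrm{eff}}_t$ generated by $h_{\mathrm{eff}}$ with respect to the magnetic symplectic form (Eq. (5.2)) approximates the quantum evolution uniformly for all $t \in [-T, +T]$,
\[
\Bigl\| e^{+i \frac{t}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \, \mathrm{Op}^A(f) \, e^{-i \frac{t}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} - \mathrm{Op}^A\bigl( f \circ \Phi^{\mathrm{eff}}_t \bigr) \Bigr\|_{\mathcal{B}(L^2(M^*))} \leq C \varepsilon^2 . \tag{5.10}
\]

Proof. Since $h_{\mathrm{eff}} \in C_b^\infty(T^*\mathbb{R}^d)$, the flow inherits the smoothness and $f \circ \Phi^{\mathrm{eff}}_t$, $\frac{d}{dt} f \circ \Phi^{\mathrm{eff}}_t \in AS^0(\mathbb{C})$ remain also $\Gamma^*$-periodic in the momentum variable. To compare the two time-evolutions, we use the usual Duhamel trick which yields
\begin{align*}
e^{+i \frac{t}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \, &\mathrm{Op}^A(f) \, e^{-i \frac{t}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} - \mathrm{Op}^A\bigl( f \circ \Phi^{\mathrm{eff}}_t \bigr) \\
&= \int_0^t ds \, \frac{d}{ds} \Bigl( e^{+i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \, \mathrm{Op}^A\bigl( f \circ \Phi^{\mathrm{eff}}_{t-s} \bigr) \, e^{-i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \Bigr) \\
&= \int_0^t ds \, e^{+i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \Bigl( \tfrac{i}{\varepsilon} \bigl[ \mathrm{Op}^A(h_{\mathrm{eff}}), \mathrm{Op}^A\bigl( f \circ \Phi^{\mathrm{eff}}_{t-s} \bigr) \bigr] + \mathrm{Op}^A\Bigl( \frac{d}{ds} f \circ \Phi^{\mathrm{eff}}_{t-s} \Bigr) \Bigr) e^{-i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \\
&= \int_0^t ds \, e^{+i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \, \mathrm{Op}^A\Bigl( \tfrac{i}{\varepsilon} \bigl[ h_{\mathrm{eff}}, f \circ \Phi^{\mathrm{eff}}_{t-s} \bigr]_{\sharp^B} - \bigl\{ h_{\mathrm{eff}}, f \circ \Phi^{\mathrm{eff}}_{t-s} \bigr\}_{\lambda B} \Bigr) \, e^{-i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} . \tag{5.11}
\end{align*}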
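The magnetic Moyal commutator agrees, to first order, with the magnetic Poisson bracket,
\begin{align*}
\tfrac{i}{\varepsilon} \bigl[ h_{\mathrm{eff}}, f \circ \Phi^{\mathrm{eff}}_{t-s} \bigr]_{\sharp^B} &= \bigl\{ h_{\mathrm{eff}}, f \circ \Phi^{\mathrm{eff}}_{t-s} \bigr\}_{\lambda B} + \mathcal{O}(\varepsilon^2) \\
&= \sum_{l=1}^{d} \bigl( \partial_{k_l} h_{\mathrm{eff}} \, \partial_{r_l} (f \circ \Phi^{\mathrm{eff}}_{t-s}) - \partial_{r_l} h_{\mathrm{eff}} \, \partial_{k_l} (f \circ \Phi^{\mathrm{eff}}_{t-s}) \bigr) - \sum_{l,j=1}^{d} \lambda B_{lj} \, \partial_{k_l} h_{\mathrm{eff}} \, \partial_{k_j} (f \circ \Phi^{\mathrm{eff}}_{t-s}) + \mathcal{O}(\varepsilon^2).
\end{align*}
Hence, the term to be quantized in Eq. (5.11) vanishes up to first order in $\varepsilon$,
\[
\text{r.h.s. of (5.11)} = \int_0^t ds \, e^{+i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} \, \mathrm{Op}^A\bigl( 0 + \mathcal{O}(\varepsilon^2) \bigr) \, e^{-i \frac{s}{\varepsilon} \mathrm{Op}^A(h_{\mathrm{eff}})} = \mathcal{O}_{\|\cdot\|}(\varepsilon^2).
\]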
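This finishes the proof.

The main result combines Proposition 5.1 with the Egorov theorem we have just proven:

Theorem 5.2 (Semiclassical Limit). Let Assumptions 1.1, 1.2 and 3.1 be satisfied; if $d \geq 4$, assume in addition that Assumption 3.2 holds true. Furthermore, let us assume the relevant part of the spectrum consists of a single non-degenerate Bloch band $E_*$. Then for all macroscopic semiclassical observables $f$ (Definition 5.1) the full quantum evolution can be approximated by the hamiltonian flow $\Phi^{\mathrm{macro}}_t$ as given in Proposition 5.2 if the initial state is localized in the corresponding almost invariant subspace $\mathcal{Z}^{-1} \Pi \mathcal{Z} L^2(\mathbb{R}^d_x)$,
\[
\Bigl\| \mathcal{Z}^{-1} \Pi \mathcal{Z} \Bigl( e^{+i \frac{t}{\varepsilon} \hat{H}} \, \mathrm{Op}^A(f) \, e^{-i \frac{t}{\varepsilon} \hat{H}} - \mathrm{Op}^A\bigl( f \circ \Phi^{\mathrm{macro}}_t \bigr) \Bigr) \mathcal{Z}^{-1} \Pi \mathcal{Z} \Bigr\|_{\mathcal{B}(L^2(\mathbb{R}^d_x))} \leq C_T \, \varepsilon^2 . \tag{5.12}
\]

Proof. We now combine all of these results to approximate the dynamics: let $f$ be a macroscopic observable. Then if we start with a state in the range of $\mathcal{Z}^{-1} \Pi \mathcal{Z}$, the time-evolved observable can be written as
\begin{align*}
\mathcal{Z}^{-1} \Pi \mathcal{Z} \, & e^{-i \frac{t}{\varepsilon} \hat{H}} \, \mathcal{Z}^{-1} \mathrm{Op}^A(f) \mathcal{Z} \, e^{+i \frac{t}{\varepsilon} \hat{H}} \, \mathcal{Z}^{-1} \Pi \mathcal{Z} \\
&= \mathcal{Z}^{-1} \Pi \, e^{-i \frac{t}{\varepsilon} \hat{H}^Z} \, \mathrm{Op}^A(f) \, e^{+i \frac{t}{\varepsilon} \hat{H}^Z} \, \Pi \mathcal{Z} \\
&= \mathcal{Z}^{-1} U^{-1} \Pi_{\mathrm{ref}} U \, U^{-1} e^{-i \frac{t}{\varepsilon} \hat{h}} U \, \mathrm{Op}^A(f) \, U^{-1} e^{+i \frac{t}{\varepsilon} \hat{h}} U \, U^{-1} \Pi_{\mathrm{ref}} U \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty) \\
&= \mathcal{Z}^{-1} U^{-1} \Pi_{\mathrm{ref}} \, e^{-i \frac{t}{\varepsilon} \hat{h}} \, U \, \mathrm{Op}^A(f) \, U^{-1} \, e^{+i \frac{t}{\varepsilon} \hat{h}} \, \Pi_{\mathrm{ref}} U \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty) \\
&= \mathcal{Z}^{-1} U^{-1} \Pi_{\mathrm{ref}} \, e^{-i \frac{t}{\varepsilon} \hat{h}_{\mathrm{eff}}} \, \Pi_{\mathrm{ref}} U \, \mathrm{Op}^A(f) \, U^{-1} \Pi_{\mathrm{ref}} \, e^{+i \frac{t}{\varepsilon} \hat{h}_{\mathrm{eff}}} \, \Pi_{\mathrm{ref}} U \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty).
\end{align*}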
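After replacing $U$ with $\mathrm{Op}^A(u)$ (which adds another $\mathcal{O}_{\|\cdot\|}(\varepsilon^\infty)$ error) and $\Pi_{\mathrm{ref}}$ with $\mathrm{Op}^A(\pi_{\mathrm{ref}})$, the term in the middle combines to the quantization of $f_{\mathrm{eff}} = \pi_{\mathrm{ref}} \, u \sharp^B f \sharp^B u^* \, \pi_{\mathrm{ref}}$. We apply Proposition 5.1 and the Egorov theorem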
involving $h_{\mathrm{eff}}$ and obtain
\begin{align*}
\ldots &= \mathcal{Z}^{-1} U^{-1} \Pi_{\mathrm{ref}} \, e^{-i \frac{t}{\varepsilon} \hat{h}_{\mathrm{eff}}} \, \mathrm{Op}^A\bigl( \pi_{\mathrm{ref}} \, u \sharp^B f \sharp^B u^* \, \pi_{\mathrm{ref}} \bigr) \, e^{+i \frac{t}{\varepsilon} \hat{h}_{\mathrm{eff}}} \, \Pi_{\mathrm{ref}} U \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty) \\
&= \mathcal{Z}^{-1} U^{-1} \Pi_{\mathrm{ref}} \, e^{-i \frac{t}{\varepsilon} \hat{h}_{\mathrm{eff}}} \, \mathrm{Op}^A(f_{\mathrm{eff}}) \, e^{+i \frac{t}{\varepsilon} \hat{h}_{\mathrm{eff}}} \, \Pi_{\mathrm{ref}} U \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^\infty) \\
&= \mathcal{Z}^{-1} \Pi U^{-1} \, \mathrm{Op}^A\bigl( f \circ T_{\mathrm{eff}} \circ \Phi^{\mathrm{eff}}_t \bigr) \, U \Pi \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^2).
\end{align*}
Since two flows are $\mathcal{O}(\varepsilon^2)$ close if the corresponding hamiltonian vector fields are [27, Lemma 5.24], we conclude
\begin{align*}
\ldots &= \mathcal{Z}^{-1} \Pi U^{-1} \, \mathrm{Op}^A\bigl( f \circ T_{\mathrm{eff}} \circ \Phi^{\mathrm{eff}}_t \circ T_{\mathrm{eff}}^{-1} \circ T_{\mathrm{eff}} \bigr) \, U \Pi \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^2) \\
&= \mathcal{Z}^{-1} \Pi U^{-1} \, \mathrm{Op}^A\bigl( f \circ \Phi^{\mathrm{macro}}_t \circ T_{\mathrm{eff}} \bigr) \, U \Pi \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^2) \\
&= \mathcal{Z}^{-1} \Pi \, \mathrm{Op}^A\bigl( f \circ \Phi^{\mathrm{macro}}_t \bigr) \, \Pi \mathcal{Z} + \mathcal{O}_{\|\cdot\|}(\varepsilon^2).
\end{align*}
This finishes the proof.

Acknowledgments

The authors would like to thank M. Măntoiu, G. Panati and H. Spohn for useful discussions. M. L. thanks G. Panati for initiating the scientific collaboration with M. Măntoiu. Furthermore, the authors have found the suggestions by one of the referees very useful.

References

[1] N. W. Ashcroft and N. D. Mermin, Solid State Physics (Saunders College Publishing, 2001).
[2] J. Bellissard, C*-Algebras in Solid State Physics. 2D Electrons in a Uniform Magnetic Field, Operator Algebras and Applications, Vol. 2 (University Press, 1988).
[3] E. Cancès, A. Deleurence and M. Lewin, A new approach to the modelling of local defects in crystals: The reduced Hartree–Fock case, Comm. Math. Phys. 281(1) (2008) 129–177.
[4] E. Cancès, A. Deleurence and M. Lewin, Non-perturbative embedding of local defects in crystalline materials, J. Phys.: Condens. Matt. 20(29) (2008) 294213.
[5] G. De Nittis and G. Panati, Effective models for conductance in magnetic fields: derivation of Harper and Hofstadter models (2010); arXiv:0809.3199.
[6] G. Grosso and G. P. Parravicini, Solid State Physics (Academic Press, 2003).
[7] M. J. Gruber, Noncommutative Bloch theory, J. Math. Phys. 42(6) (2001) 2438–2465.
[8] P. G. Harper, Single band motion of conduction electrons in a uniform magnetic field, Proc. Phys. Soc. A 68 (1955) 874–892.
[9] B. Helffer and J. Sjöstrand, Analyse semi-classique pour l'équation de Harper (avec application à l'étude de Schrödinger avec champ magnétique), Mém. Soc. Math. France 34 (1988) 1761–1771.
[10] B. Helffer and J. Sjöstrand, Analyse semi-classique pour l'équation de Harper III, Mém. Soc. Math. France 39 (1989) 1–124.
[11] B. Helffer and J. Sjöstrand, Analyse semi-classique pour l'équation de Harper II (comportement semi-classique près d'un rationnel), Mém. Soc. Math. France 40 (1990) 1–139.
[12] D. R. Hofstadter, Energy levels and wave functions of Bloch electrons in rational and irrational magnetic fields, Phys. Rev. B 14 (1976) 2239–2249.
[13] L. Hörmander, Fourier integral operators 1, Acta Math. 127(1) (1972) 79–183.
[14] V. Iftimie, M. Măntoiu and R. Purice, Magnetic pseudodifferential operators, Publ. Res. Inst. Math. Sci. 44(3) (2007) 585–623.
[15] V. Iftimie, M. Măntoiu and R. Purice, Commutator criteria for magnetic pseudodifferential operators, Comm. Partial Differential Equations 35(6) (2010) 1058–1094.
[16] P. Kuchment, Floquet Theory for Partial Differential Equations (Operator Theory: Advances and Applications) (Birkhäuser, 1993).
[17] M. Lein, Two-parameter asymptotics in magnetic Weyl calculus, J. Math. Phys. 51 (2010) 123519.
[18] M. Lein, M. Măntoiu and S. Richard, Magnetic pseudodifferential operators with coefficients in C*-algebras (2009).
[19] M. Müller, Product rule for gauge invariant Weyl symbols and its application to the semiclassical description of guiding centre motion, J. Phys. A: Math. Gen. 32 (1999) 1035–1052.
[20] M. Măntoiu and R. Purice, The magnetic Weyl calculus, J. Math. Phys. 45(4) (2004) 1394–1417.
[21] M. Măntoiu, R. Purice and S. Richard, Twisted crossed products and magnetic pseudodifferential operators, in Advances in Operator Algebras and Mathematical Physics, Theta Ser. Adv. Math., Vol. 5 (Theta, Bucharest, 2005), pp. 137–172.
[22] D. Osadchy and J. E. Avron, Hofstadter butterfly as quantum phase diagram, J. Math. Phys. 42 (2001) 5665–5671.
[23] G. Panati, Triviality of Bloch and Bloch–Dirac bundles, Ann. Henri Poincaré 8 (2007) 995–1011.
[24] G. Panati, H. Spohn and S. Teufel, Effective dynamics for Bloch electrons: Peierls substitution, Comm. Math. Phys. 242 (2003) 547–578.
[25] G. Panati, H. Spohn and S. Teufel, Space adiabatic perturbation theory, Adv. Theor. Math. Phys. 7 (2003) 145–204.
[26] M. Reed and B. Simon, Methods of Mathematical Physics IV: Analysis of Operators (Academic Press, 1978).
[27] S. Teufel, Adiabatic Perturbation Theory in Quantum Dynamics (Springer Verlag, 2003).
[28] D. J. Thouless, M. Kohmoto, M. P. Nightingale and M. D. Nijs, Quantized Hall conductance in a two-dimensional periodic potential, Phys. Rev. Lett. 49(6) (1982) 405–408.
[29] J. Zak, Dynamics of electrons in solids in external fields, Phys. Rev. 168(3) (1968) 686–695.
April 11, J070-S0129055X1100428X
2011 12:3 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 3 (2011) 261–307 © World Scientific Publishing Company DOI: 10.1142/S0129055X1100428X
THE ADHM CONSTRUCTION OF INSTANTONS ON NONCOMMUTATIVE SPACES
SIMON BRAIN∗ and WALTER D. VAN SUIJLEKOM† Institute for Mathematics, Astrophysics and Particle Physics, Faculty of Science, Radboud University Nijmegen, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands ∗
[email protected] †
[email protected]

Received 29 September 2010
Revised 13 January 2011

We present an account of the ADHM construction of instantons on Euclidean space-time $\mathbb{R}^4$ from the point of view of noncommutative geometry. We recall the main ingredients of the classical construction in a coordinate algebra format, which we then deform using a cocycle twisting procedure to obtain a method for constructing families of instantons on noncommutative space-time, parametrized by solutions to an appropriate set of ADHM equations. We illustrate the noncommutative construction in two special cases: the Moyal–Groenewold plane $\mathbb{R}^4_\hbar$ and the Connes–Landi plane $\mathbb{R}^4_\theta$.

Keywords: Yang–Mills; instantons; noncommutative geometry; quantum groups.

Mathematics Subject Classification 2010: 58B34, 16T05, 53C28, 81T13
1. Introduction

There has been a great deal of interest in recent years in the construction of instanton gauge fields on space-times whose algebra of coordinate functions is noncommutative. In classical geometry, an instanton is a connection with anti-self-dual curvature on a smooth vector bundle over a four-dimensional manifold. The moduli space of instantons on a classical four-manifold is an important invariant of its differential structure [13] and it is only natural to try to generalize this idea to study the differential geometry of noncommutative four-manifolds.

In this article, we study the construction of SU(2) instantons on the Euclidean four-plane $\mathbb{R}^4$ and its various noncommutative generalizations. The problem of constructing instantons on classical space-time was solved by the ADHM method of [2] and consequently it is known that the moduli space of (framed) SU(2) instantons on $\mathbb{R}^4$ with topological charge $k \in \mathbb{Z}$ is a manifold of dimension $8k - 3$. In what follows, we review the ADHM construction of instantons on classical $\mathbb{R}^4$ and its extension to noncommutative geometry. In particular, we study the construction
of instantons on two explicit examples of noncommutative Euclidean space-time: the Moyal–Groenewold plane $\mathbb{R}^4_\hbar$, whose algebra of coordinate functions has commutation relations of the Heisenberg form, and the Connes–Landi plane $\mathbb{R}^4_\theta$ which arises as a localization of the noncommutative four-sphere $S^4_\theta$ constructed in [10] (cf. also [11]).

In order to deform the ADHM construction, we adopt the technique of "functorial cocycle twisting", a very general method which can be used in particular to deform the coordinate algebra of any space carrying an action of a locally compact Abelian group. Both of the above examples of noncommutative space-times are obtained in this way as deformations of classical Euclidean space-time: the Moyal plane $\mathbb{R}^4_\hbar$ as a twist along a group of translational symmetries, the Connes–Landi plane $\mathbb{R}^4_\theta$ as a twist along a group of rotational symmetries. Crucially, the twisting technique does not just deform space-time alone: its functorial nature means that it simultaneously deforms any and all constructions which are covariant under the chosen group of symmetries. In particular, the parameter spaces which occur in the ADHM construction also carry canonical actions of the relevant symmetry groups, whence their coordinate algebras are also deformed by the quantization procedure. In this way, we obtain a method for constructing families of instantons which are parametrized by noncommutative spaces.

As natural as this may seem, it immediately leads to the conceptual problem of how to interpret these spaces of noncommutative parameters. Indeed, our quest in each case is for a moduli space of instantons, which is necessarily modeled on the space of all connections on a given vector bundle over space-time [3]. Even for noncommutative space-times, the set of such connections is an affine space and is therefore commutative [8].
To solve this problem, we adopt the strategy of [5] and incorporate into the ADHM construction the “internal gauge symmetries” of noncommutative space-time [9], in order to “gauge away” the noncommutativity and arrive at a classical space of parameters. The paper is organized as follows. The remainder of Sec. 1 is dedicated to a brief review of the algebraic structures that we shall need, including in particular the notions of Hopf algebras and their comodules, together with the cocycle twisting construction itself. Following this, in Secs. 2 and 3, we review the differential geometry of instantons from the point of view of coordinate algebras. We recall how to generalize these structures to incorporate the notions of noncommutative families of instantons and their gauge theory. In Sec. 4, we sketch the ADHM construction of instantons from the point of view of coordinate algebras, stressing how the method is covariant under the group of isometries of Euclidean space-time. It is precisely this covariance that we use in later sections to deform the ADHM construction by cocycle twisting. This coordinate-algebraic version of the ADHM construction has in fact already been studied in some detail [4,5]. In this sense, the first four sections of the present article consist mainly of review material. However, those earlier works studied the
geometry of the ADHM construction from a somewhat abstract categorical point of view: in the present exposition, our goal is to give an explanation of the ADHM method in which we keep things as concrete as we are able. Moreover, the focus of those papers was on the construction of instantons on the Euclidean four-sphere S 4 . In what follows, we show how to adapt this technique to construct instantons on the local coordinate chart R4 . We illustrate the deformation procedure by giving the noncommutative ADHM construction in two special cases. In Sec. 5, we look at how the method behaves under quantization to give an ADHM construction of instantons on the Moyal plane R4 . On the other hand, Sec. 6 addresses the issue of deforming the ADHM method to obtain a construction of instantons on the Connes–Landi plane R4θ . It is these latter two sections which contain the majority of our new results. Indeed, an algorithm for the construction of instantons on the Moyal plane R4 has been known for some time [24]. However, a good understanding of the noncommutative-geometric origins of this construction has so far been lacking and our goal is to shed some light upon this subject (although it is worth pointing out that a different approach to the twistor theory of R4 , from the point of view of noncommutative algebraic geometry, was carried out in [17]). Using the noncommutative twistor theory developed in [6], we give an explicit construction of families of instantons on Moyal space-time, from which we recover the well-known noncommutative ADHM equations of Nekrasov and Schwarz. Our methods are intended to be similar in spirit to those of [1] for the classical case; for some alternative approaches to the construction of noncommutative instantons, we refer, for example, to [7, 15, 16, 25, 28, 29] and references therein. On the other hand, the noncommutative geometry of instantons and the ADHM construction on the Connes–Landi plane R4θ was investigated in [18, 19, 21, 4, 5]. 
As already mentioned, however, this abstract geometric characterization of the instanton construction is in need of a more concrete description. In the present paper we derive an explicit set of ADHM equations whose solutions parametrize instantons on the Connes–Landi plane (although we do not claim at this stage that all such instantons arise in this way). We stress that our intention throughout the following is to present the various geometrical aspects of the ADHM construction in concrete terms. This means that, throughout the paper, we work purely at the algebraic level, i.e. with algebras of coordinate functions on all relevant spaces. A more detailed approach, which in particular addresses all of the analytic aspects of the construction, will be presented elsewhere.

1.1. Hopf algebraic structures

Let $H$ be a unital Hopf algebra. We denote its structure maps by

$$\Delta : H \to H \otimes H, \qquad \epsilon : H \to \mathbb{C}, \qquad S : H \to H$$
April 11, J070-S0129055X1100428X
264
2011 12:3 WSPC/S0129-055X
148-RMP
S. Brain & W. D. van Suijlekom
for the coproduct, counit and antipode, respectively. We use Sweedler notation for the coproduct, $\Delta h = h_{(1)} \otimes h_{(2)}$, as well as $(\Delta \otimes \mathrm{id}) \circ \Delta h = (\mathrm{id} \otimes \Delta) \circ \Delta h = h_{(1)} \otimes h_{(2)} \otimes h_{(3)}$ and so on, with summation inferred. We say that $H$ is coquasitriangular if it is equipped with a convolution-invertible Hopf bicharacter $R : H \otimes H \to \mathbb{C}$ obeying

$$g_{(1)} h_{(1)} R(h_{(2)}, g_{(2)}) = R(h_{(1)}, g_{(1)}) h_{(2)} g_{(2)} \tag{1.1}$$

for all $h, g \in H$. Convolution-invertibility means that there exists a map $R^{-1} : H \otimes H \to \mathbb{C}$ such that

$$R(h_{(1)}, g_{(1)}) R^{-1}(h_{(2)}, g_{(2)}) = R^{-1}(h_{(1)}, g_{(1)}) R(h_{(2)}, g_{(2)}) = \epsilon(g)\epsilon(h) \tag{1.2}$$

for all $g, h \in H$. Being a Hopf bicharacter means that

$$R(fg, h) = R(f, h_{(1)}) R(g, h_{(2)}), \qquad R(f, gh) = R(f_{(1)}, h) R(f_{(2)}, g), \tag{1.3}$$

for all $f, g, h \in H$. If $R$ also has the property that $R(h_{(1)}, g_{(1)}) R(g_{(2)}, h_{(2)}) = \epsilon(g)\epsilon(h)$ for all $g, h \in H$, then we say that $H$ is cotriangular.

A vector space $V$ is said to be a left $H$-comodule if it is equipped with a linear map $\Delta_V : V \to H \otimes V$ such that

$$(\mathrm{id} \otimes \Delta_V) \circ \Delta_V = (\Delta \otimes \mathrm{id}) \circ \Delta_V, \qquad (\epsilon \otimes \mathrm{id}) \circ \Delta_V = \mathrm{id}.$$

We shall often use the Sweedler notation $\Delta_V(v) = v^{(-1)} \otimes v^{(0)}$ for $v \in V$, again with summation inferred. If $V$, $W$ are left $H$-comodules, a linear transformation $\sigma : V \to W$ is said to be a left $H$-comodule map if it satisfies $\Delta_W \circ \sigma = (\mathrm{id} \otimes \sigma) \circ \Delta_V$. Given a pair $V, W$ of left $H$-comodules, the vector space $V \otimes W$ is also a left $H$-comodule when equipped with the tensor product $H$-coaction

$$\Delta_{V \otimes W}(v \otimes w) = v^{(-1)} w^{(-1)} \otimes (v^{(0)} \otimes w^{(0)}) \tag{1.4}$$

for each $v \in V$, $w \in W$. An algebra $A$ is said to be a left $H$-comodule algebra if it is a left $H$-comodule equipped with a product $m : A \otimes A \to A$ which is an $H$-comodule map, meaning in this case that $(\mathrm{id} \otimes m) \circ \Delta_{A \otimes A} = \Delta_A \circ m$.

Dually, a vector space $V$ is said to be a left $H$-module if there is a linear map $\triangleright : H \otimes V \to V$, denoted $h \otimes v \mapsto h \triangleright v$, such that

$$h \triangleright (g \triangleright v) = (hg) \triangleright v, \qquad 1 \triangleright v = v$$

for all $h, g \in H$ and $v \in V$. An algebra $A$ is said to be a left $H$-module algebra if it is a left $H$-module equipped with a product $m : A \otimes A \to A$ which obeys $h \triangleright (ab) = (h_{(1)} \triangleright a)(h_{(2)} \triangleright b)$ for all $h \in H$ and $a, b \in A$.
ADHM Construction of Instantons on Noncommutative Spaces
In the special case where $H$ is coquasitriangular, every left $H$-comodule $V$ is also a left $H$-module when equipped with the canonical left $H$-action

$$\triangleright : H \otimes V \to V, \qquad h \triangleright v = R(v^{(-1)}, h)\, v^{(0)} \tag{1.5}$$

for each $h \in H$, $v \in V$. The action (1.5) does not commute with the $H$-coaction on $V$; rather it obeys the “crossed module” condition

$$h_{(1)} v^{(-1)} \otimes h_{(2)} \triangleright v^{(0)} = (h_{(1)} \triangleright v)^{(-1)} h_{(2)} \otimes (h_{(1)} \triangleright v)^{(0)} \tag{1.6}$$

for each $h \in H$ and $v \in V$. In particular, if $A$ is a left $H$-comodule algebra then it is also a left $H$-module algebra when equipped with the canonical action (1.5).

To each left $H$-module algebra $A$ there is an associated smash product algebra $A \rtimes H$, which is nothing other than the vector space $A \otimes H$ equipped with the product

$$(a \otimes h)(b \otimes g) = a (h_{(1)} \triangleright b) \otimes h_{(2)} g \tag{1.7}$$

for each $a, b \in A$, $h, g \in H$. The main example of this construction relevant to the present paper is the following.

Example 1.1. Let $G$ be a locally compact Abelian Lie group with Pontryagin dual group $\hat{G}$ and let $\gamma : \hat{G} \to \mathrm{Aut}\, A$ be an action of $\hat{G}$ on a unital ∗-algebra $A$ by ∗-automorphisms. Then associated to this action there is the crossed product algebra $A \rtimes_\gamma \hat{G}$. On the other hand, the $\hat{G}$-action gives $A$ the structure of a $C_0(G)$-module algebra and the Fourier transform on $G$ gives an identification of $A \rtimes_\gamma \hat{G}$ with the smash product $A \rtimes C_0(G)$.

In this paper, our strategy is to keep things as explicit as possible, avoiding the technical details of analytic arguments and working purely at the algebraic level. For this reason, instead of using the function algebra $C_0(G)$, we prefer to work with the bialgebra $\mathcal{A}[G]$ of representative functions on $G$ equipped with pointwise multiplication. In this setting, we can still make sense of Example 1.1: although the algebras $A \rtimes_\gamma \hat{G}$ and $A \rtimes C_0(G)$ are defined analytically using completions of appropriate function algebras, we think of the smash product $A \rtimes \mathcal{A}[G]$ as an “algebraic version” of the crossed product algebra $A \rtimes_\gamma \hat{G}$. This also explains our use throughout the paper of coactions of Hopf algebras in place of group actions: the construction of crossed products by group actions is not defined at the algebraic level, whence we need to replace it by the smash product algebra instead.

1.2. Quantization by cocycle twist

Following [22], in this section we give a brief review of the deformation procedure that we shall use later in the paper to “quantize” the ADHM construction of instantons.
Let $H$ be a unital Hopf algebra whose antipode we assume to be invertible. A two-cocycle on $H$ is a linear map $F : H \otimes H \to \mathbb{C}$ which is unital, convolution-invertible in the sense of Eq. (1.2) and obeys the condition $\partial F = 1$, i.e.

$$F(g_{(1)}, f_{(1)})\, F(h_{(1)}, g_{(2)} f_{(2)})\, F^{-1}(h_{(2)} g_{(3)}, f_{(3)})\, F^{-1}(h_{(3)}, g_{(4)}) = \epsilon(f)\epsilon(h)\epsilon(g) \tag{1.8}$$

for all $f, g, h \in H$. Given such an $F$, there is a twisted Hopf algebra $H_F$ with the same coalgebra structure as $H$, but with modified product $\bullet_F$ and antipode $S_F$ given respectively by

$$h \bullet_F g = F(h_{(1)}, g_{(1)})\, h_{(2)} g_{(2)}\, F^{-1}(h_{(3)}, g_{(3)}), \tag{1.9}$$

$$S_F(h) := U(h_{(1)})\, S(h_{(2)})\, U^{-1}(h_{(3)}), \tag{1.10}$$

for each $h, g \in H_F$, where on the right-hand sides we use the product and antipode of $H$ and define $U(h) := F(h_{(1)}, S h_{(2)})$. The cocycle condition (1.8) is sufficient to ensure that the product (1.9) is associative. In the case where $H$ is a Hopf ∗-algebra we also need to impose upon the cocycle $F$ the reality condition

$$F(h, g) = F((S^2 g)^*, (S^2 h)^*). \tag{1.11}$$

In this situation the twisted Hopf algebra $H_F$ acquires a deformed ∗-structure

$$h^{*_F} := V^{-1}(S^{-1} h_{(1)})\, (h_{(2)})^*\, V(S^{-1} h_{(3)}) \tag{1.12}$$

for each $h \in H_F$, where $V(h) := U^{-1}(h_{(1)})\, U(S^{-1} h_{(2)})$. If $H$ is a coquasitriangular Hopf algebra, then so is $H_F$. In particular, if $H$ is commutative then $H_F$ is cotriangular with “universal R-matrix” given by

$$R(h, g) := F(g_{(1)}, h_{(1)})\, F^{-1}(h_{(2)}, g_{(2)}) \tag{1.13}$$

for all $h, g \in H$. Since as coalgebras $H$ and $H_F$ are the same, every left $H$-comodule is a left $H_F$-comodule and every $H$-comodule map is an $H_F$-comodule map. This means that there is an invertible functor which “functorially quantizes” any $H$-covariant construction to give an $H_F$-covariant one. As already mentioned, our strategy will be to apply this idea to the construction of instantons.

In passing from $H$ to $H_F$, from each left $H$-comodule algebra $A$ we automatically obtain a left $H_F$-comodule algebra $A_F$ which as a vector space is the same as $A$ but has the modified product

$$A_F \otimes A_F \to A_F, \qquad a \otimes b \mapsto a \cdot_F b := F(a^{(-1)}, b^{(-1)})\, a^{(0)} b^{(0)}. \tag{1.14}$$

If $A$ is a left $H$-comodule algebra and a ∗-algebra such that the coaction is a ∗-algebra map, the twisted algebra $A_F$ also has a new ∗-structure defined by

$$a^{*_F} := V^{-1}(S^{-1} a^{(-1)})\, (a^{(0)})^* \tag{1.15}$$

for each $a \in A_F$.
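The twisted product (1.14) can be made concrete in a small numerical sketch. Everything specific here is an assumption chosen for illustration and is not taken from the text: $H$ is the group algebra of $\mathbb{Z}^2$ spanned by group-like elements $t^m$, the comodule algebra consists of Laurent polynomials in two commuting variables $u, v$ with coaction given by the degree, and $F(t^m, t^n) = \exp(i\pi\theta(m_1 n_2 - m_2 n_1))$ for a hypothetical parameter `THETA`. On group-like elements every Sweedler leg is the element itself, so the cocycle condition (1.8) becomes an identity between phases.

```python
import cmath
from itertools import product

THETA = 0.3  # hypothetical deformation parameter, chosen only for illustration


def F(m, n):
    """Assumed two-cocycle on the group-like elements t^m, t^n (m, n in Z^2)."""
    return cmath.exp(1j * cmath.pi * THETA * (m[0] * n[1] - m[1] * n[0]))


def add(m, n):
    """Product of group-likes t^m t^n = t^(m+n), written additively."""
    return (m[0] + n[0], m[1] + n[1])


def cocycle_lhs(f, g, h):
    """Left-hand side of (1.8) on group-likes; F^{-1} is the reciprocal phase."""
    return F(g, f) * F(h, add(g, f)) / (F(add(h, g), f) * F(h, g))


def twisted_product(a, b):
    """Product (1.14) for polynomials stored as {degree: coefficient} dicts."""
    out = {}
    for (m, x), (n, y) in product(a.items(), b.items()):
        d = add(m, n)
        out[d] = out.get(d, 0) + F(m, n) * x * y
    return out


def close(a, b, tol=1e-12):
    """Compare two polynomials coefficient by coefficient."""
    return all(abs(a.get(k, 0) - b.get(k, 0)) < tol for k in set(a) | set(b))


u = {(1, 0): 1.0}
v = {(0, 1): 1.0}
w = {(2, -1): 0.5, (0, 3): -2.0}

# Cocycle condition (1.8): the counit of a group-like element is 1.
assert abs(cocycle_lhs((1, 2), (-3, 1), (0, 5)) - 1) < 1e-12

# Associativity of the twisted product, as guaranteed by the cocycle condition.
assert close(twisted_product(twisted_product(u, v), w),
             twisted_product(u, twisted_product(v, w)))

# The generators no longer commute: u . v = exp(2 pi i THETA) v . u.
phase = cmath.exp(2j * cmath.pi * THETA)
uv, vu = twisted_product(u, v), twisted_product(v, u)
assert close(uv, {k: phase * c for k, c in vu.items()})
```

Setting `THETA = 0` in this sketch makes `F` identically 1, so the twisted product reduces to the original commutative one.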
2. The Twistor Fibration

The Penrose fibration $\mathbb{CP}^3 \to S^4$ is an essential component of the ADHM construction of instantons, since it encodes in its geometry the very nature of the anti-self-duality equations on $S^4$ [1]. Following [6, 4], in this section we sketch the details of the Penrose fibration from the point of view of coordinate algebras, then investigate what happens to the fibration upon passing to a local coordinate chart.

2.1. The Penrose fibration

The ∗-algebra $\mathcal{A}[\mathbb{C}^4]$ of coordinate functions on the classical space $\mathbb{C}^4$ is the commutative unital ∗-algebra generated by the elements $\{z_j, z_l^* \mid j, l = 1, \dots, 4\}$. The coordinate algebra $\mathcal{A}[S^7]$ of the seven-sphere $S^7$ is the quotient of $\mathcal{A}[\mathbb{C}^4]$ by the sphere relation

$$z_1^* z_1 + z_2^* z_2 + z_3^* z_3 + z_4^* z_4 = 1. \tag{2.1}$$

On the other hand, the coordinate algebra $\mathcal{A}[S^4]$ of the four-sphere $S^4$ is the commutative unital ∗-algebra generated by the elements $x_1, x_1^*, x_2, x_2^*$ and $x_0 = x_0^*$ subject to the sphere relation

$$x_1^* x_1 + x_2^* x_2 + x_0^2 = 1.$$

There is a canonical inclusion of algebras $\mathcal{A}[S^4] \to \mathcal{A}[S^7]$ defined on generators by

$$x_1 = 2(z_1 z_3^* + z_2^* z_4), \qquad x_2 = 2(z_2 z_3^* - z_1^* z_4), \qquad x_0 = z_1 z_1^* + z_2 z_2^* - z_3 z_3^* - z_4 z_4^* \tag{2.2}$$

and extended as a ∗-algebra map. Clearly one has

$$x_1^* x_1 + x_2^* x_2 + x_0^2 = (z_1^* z_1 + z_2^* z_2 + z_3^* z_3 + z_4^* z_4)^2 = 1, \tag{2.3}$$

so that the algebra inclusion is well-defined. This is just a coordinate algebra description of the principal bundle $S^7 \to S^4$ with structure group SU(2) (cf. [18] for further details of this construction).

The twistor space of the Euclidean four-sphere $S^4$ is nothing other than the complex projective space $\mathbb{CP}^3$. As a real six-dimensional manifold, twistor space $\mathbb{CP}^3$ may be identified with the set of all 4 × 4 Hermitian projector matrices of rank one, since each such matrix uniquely determines and is uniquely determined by a one-dimensional subspace of $\mathbb{C}^4$. Thus the coordinate ∗-algebra $\mathcal{A}[\mathbb{CP}^3]$ of $\mathbb{CP}^3$ has a defining matrix of (commuting) generators

$$\mathsf{q} := \begin{pmatrix} a_1 & u_1 & u_2 & u_3 \\ u_1^* & a_2 & v_3 & v_2 \\ u_2^* & v_3^* & a_3 & v_1 \\ u_3^* & v_2^* & v_1^* & a_4 \end{pmatrix} \tag{2.4}$$
with $a_j^* = a_j$ for $j = 1, \dots, 4$ and $\mathrm{Tr}\, \mathsf{q} = \sum_j a_j = 1$, subject to the relations coming from the projection condition $\mathsf{q}^2 = \mathsf{q}$, that is to say $\sum_r q_{jr} q_{rl} = q_{jl}$ for each $j, l = 1, \dots, 4$. There is a canonical inclusion of algebras $\mathcal{A}[\mathbb{CP}^3] \to \mathcal{A}[S^7]$ defined on generators by

$$q_{jl} = z_j z_l^*, \qquad j, l = 1, \dots, 4, \tag{2.5}$$

with the relation $q_{11} + q_{22} + q_{33} + q_{44} = 1$ coming from the sphere relation (2.1).

To determine the twistor fibration $\mathbb{CP}^3 \to S^4$ in our coordinate algebra framework, we need the map $J : \mathcal{A}[\mathbb{C}^4] \to \mathcal{A}[\mathbb{C}^4]$ defined on generators by

$$J(z_1, z_2, z_3, z_4) := (-z_2^*, z_1^*, -z_4^*, z_3^*) \tag{2.6}$$

and extended as a ∗-anti-algebra map. Equipping the algebra $\mathcal{A}[\mathbb{C}^4]$ with the map $J$ identifies the underlying space $\mathbb{C}^4$ with the quaternionic vector space $\mathbb{H}^2$ [21]. Accordingly, we define $\mathcal{A}[\mathbb{H}^2]$ to be the ∗-algebra $\mathcal{A}[\mathbb{C}^4]$ equipped with the quaternionic structure $J$. Using the identification of generators (2.5), the map $J$ extends to an automorphism of the algebra $\mathcal{A}[\mathbb{CP}^3]$ given by

$$J(a_1) = a_2, \quad J(a_2) = a_1, \quad J(a_3) = a_4, \quad J(a_4) = a_3,$$

$$J(u_1) = -u_1, \quad J(v_1) = -v_1, \quad J(u_2) = v_2^*, \quad J(u_3) = -v_3^*, \quad J(v_2) = u_2^*, \quad J(v_3) = -u_3^*$$

and extended as a ∗-anti-algebra map. It is straightforward to check that the subalgebra of $\mathcal{A}[\mathbb{CP}^3]$ fixed by this automorphism is precisely the four-sphere algebra $\mathcal{A}[S^4]$. Indeed, there is an algebra inclusion $\mathcal{A}[S^4] \to \mathcal{A}[\mathbb{CP}^3]$ defined on generators by

$$x_0 \mapsto 2(a_1 + a_2) - 1, \qquad x_1 \mapsto 2(u_2 + v_2^*), \qquad x_2 \mapsto 2(v_3 - u_3^*), \tag{2.7}$$

which is just a coordinate-algebraic description of the Penrose fibration $\mathbb{CP}^3 \to S^4$.

2.2. Localization of the twistor bundle

Next we look at what happens to the fibration $\mathcal{A}[S^4] \to \mathcal{A}[\mathbb{CP}^3]$ when we pass to a local chart of $S^4$ by removing a point. Indeed, it is well known that in making such a localization the twistor bundle $\mathbb{CP}^3 \to S^4$ becomes isomorphic to the trivial fibration $\mathbb{R}^4 \times \mathbb{CP}^1$ over $\mathbb{R}^4$. In this section we demonstrate this fact using the language of coordinate algebras.

By definition, the localization $\mathcal{A}_0[S^4]$ of $\mathcal{A}[S^4]$ is the commutative unital ∗-algebra

$$\mathcal{A}_0[S^4] := \mathcal{A}\bigl[\tilde{x}_1, \tilde{x}_1^*, \tilde{x}_2, \tilde{x}_2^*, \tilde{x}_0, (1 + \tilde{x}_0)^{-1} \;\big|\; \tilde{x}_1^* \tilde{x}_1 + \tilde{x}_2^* \tilde{x}_2 + \tilde{x}_0^2 = 1, \; (1 + \tilde{x}_0)(1 + \tilde{x}_0)^{-1} = 1\bigr].$$

It is the algebra obtained from $\mathcal{A}[S^4]$ by adjoining an inverse $(1 + x_0)^{-1}$ to the function $1 + x_0$; geometrically this corresponds to “deleting” the point $(x_1, x_2, x_0) = (0, 0, -1)$ from the spectrum of the (smooth completion of the) algebra $\mathcal{A}[S^4]$, with
$\mathcal{A}_0[S^4]$ being the algebra of coordinate functions on the resulting space. On the other hand, the coordinate algebra $\mathcal{A}[\mathbb{R}^4]$ of the Euclidean four-plane $\mathbb{R}^4$ is the commutative unital ∗-algebra generated by the elements

$$\{\zeta_j, \zeta_l^* \mid j, l = 1, 2\}. \tag{2.8}$$

Defining $|\zeta|^2 := \zeta_1^* \zeta_1 + \zeta_2^* \zeta_2$, the element $(1 + |\zeta|^2)^{-1}$ clearly belongs to the (smooth completion of the) algebra $\mathcal{A}[\mathbb{R}^4]$ and so we have the following result.

Lemma 2.1. The map $\mathcal{A}_0[S^4] \to \mathcal{A}[\mathbb{R}^4]$ defined on generators by

$$\tilde{x}_1 \mapsto 2\zeta_1 (1 + |\zeta|^2)^{-1}, \qquad \tilde{x}_2 \mapsto 2\zeta_2 (1 + |\zeta|^2)^{-1}, \qquad \tilde{x}_0 \mapsto (1 - |\zeta|^2)(1 + |\zeta|^2)^{-1} \tag{2.9}$$

is a ∗-algebra isomorphism.

Proof. The inverse of (2.9) is the map $\mathcal{A}[\mathbb{R}^4] \to \mathcal{A}_0[S^4]$ given on generators by

$$\zeta_1 \mapsto \tilde{x}_1 (1 + \tilde{x}_0)^{-1}, \qquad \zeta_2 \mapsto \tilde{x}_2 (1 + \tilde{x}_0)^{-1} \tag{2.10}$$

and extended as a ∗-algebra map. Thus we have an isomorphism of vector spaces. One checks that the elements $\tilde{x}_1, \tilde{x}_2, \tilde{x}_0$ satisfy the same relation as the generators $x_1, x_2, x_0$ of the algebra $\mathcal{A}[S^4]$. The difference is that the point determined by the coordinate values $(x_1, x_2, x_0) = (0, 0, -1)$ is not in the spectrum of the (smooth) algebra generated by the $\tilde{x}_1, \tilde{x}_2, \tilde{x}_0$. In this way, we obtain $\mathbb{R}^4$ as a local chart of the four-sphere $S^4$, with the identification (2.9) defining the “inverse stereographic projection”. The point $(0, 0, -1)$ will henceforth be called the point at infinity.

At the level of twistor space $\mathbb{CP}^3$, passing to the local chart $\mathbb{R}^4$ by removing the point at infinity corresponds to removing the fiber $\mathbb{CP}^1$ over that point: we refer to this copy of $\mathbb{CP}^1$ as the line at infinity and denote it by $\ell_\infty$. Under the algebra inclusion (2.7), inverting the element $1 + x_0$ in $\mathcal{A}[S^4]$ is equivalent to inverting the element $a_1 + a_2$ in $\mathcal{A}[\mathbb{CP}^3]$. We denote by $\mathcal{A}_0[\mathbb{CP}^3]$ the resulting localized algebra, i.e.

$$\mathcal{A}_0[\mathbb{CP}^3] := \mathcal{A}\Bigl[q_{jl}, (a_1 + a_2)^{-1} \;\Big|\; \sum_r q_{jr} q_{rl} = q_{jl}, \; \mathrm{Tr}\, \mathsf{q} = 1, \; (a_1 + a_2)(a_1 + a_2)^{-1} = 1\Bigr].$$

We now show that this algebra is isomorphic to the algebra of coordinate functions on the Cartesian product $\mathbb{R}^4 \times \mathbb{CP}^1$. The coordinate algebra $\mathcal{A}[\mathbb{CP}^1]$ is the commutative unital ∗-algebra generated by the entries of the matrix

$$\tilde{\mathsf{q}} := \begin{pmatrix} \tilde{a}_1 & \tilde{u}_1 \\ \tilde{u}_1^* & \tilde{a}_2 \end{pmatrix} \tag{2.11}$$
subject to the relations $\tilde{\mathsf{q}}^2 = \tilde{\mathsf{q}}^* = \tilde{\mathsf{q}}$ and $\mathrm{Tr}\, \tilde{\mathsf{q}} = 1$, that is to say $\tilde{a}_1 \tilde{a}_2 = \tilde{u}_1^* \tilde{u}_1$, $\tilde{a}_1^* = \tilde{a}_1$, $\tilde{a}_2^* = \tilde{a}_2$ and $\tilde{a}_1 + \tilde{a}_2 = 1$. With the coordinate algebra $\mathcal{A}[\mathbb{R}^4]$ of (2.8), we have the following result.

Lemma 2.2. There is a ∗-algebra isomorphism $\mathcal{A}[\mathbb{R}^4] \otimes \mathcal{A}[\mathbb{CP}^1] \cong \mathcal{A}_0[\mathbb{CP}^3]$ defined on generators by

$$\zeta_1 \otimes 1 \mapsto (a_1 + a_2)^{-1}(u_2 + v_2^*), \qquad \zeta_2 \otimes 1 \mapsto (a_1 + a_2)^{-1}(v_3 - u_3^*),$$

$$1 \otimes \tilde{a}_1 \mapsto (a_1 + a_2)^{-1} a_1, \qquad 1 \otimes \tilde{u}_1 \mapsto (a_1 + a_2)^{-1} u_1, \qquad 1 \otimes \tilde{a}_2 \mapsto (a_1 + a_2)^{-1} a_2$$

and extended as a ∗-algebra map.

Proof. We need to show that this map is an isomorphism of vector spaces which respects the algebra relations in $\mathcal{A}[\mathbb{R}^4] \otimes \mathcal{A}[\mathbb{CP}^1]$. Using the expressions (2.2) and (2.5), we find in $\mathcal{A}[S^7]$ the identities

$$2(a_1 + a_2) z_3 = x_1^* z_1 + x_2^* z_2, \qquad 2(a_1 + a_2) z_3^* = x_1 z_1^* + x_2 z_2^*,$$

$$2(a_1 + a_2) z_4 = x_1 z_2 - x_2 z_1, \qquad 2(a_1 + a_2) z_4^* = x_1^* z_2^* - x_2^* z_1^*.$$

In the localization where $2(a_1 + a_2) = 1 + x_0$ is invertible, these expressions combined with the identifications $q_{ij} = z_i z_j^*$ define the inverse of the map stated in the lemma, so that we have a vector space isomorphism. The algebra $\mathcal{A}[\mathbb{CP}^1]$ generated by $\tilde{a}_1$, $\tilde{a}_2$, $\tilde{u}_1$ and $\tilde{u}_1^*$ is identified with the subalgebra of $\mathcal{A}_0[\mathbb{CP}^3]$ generated by the localized upper left 2 × 2 block of the matrix (2.4), i.e. the subalgebra generated by the elements $(a_1 + a_2)^{-1} q_{ij}$ for $i, j = 1, 2$. It is easy to check that the relations in $\mathcal{A}[\mathbb{CP}^1]$ are automatically preserved by this identification. To check that the trace relation $\mathrm{Tr}\, \mathsf{q} = \sum_j q_{jj} = 1$ in $\mathcal{A}_0[\mathbb{CP}^3]$ also holds in $\mathcal{A}[\mathbb{R}^4] \otimes \mathcal{A}[\mathbb{CP}^1]$, one first computes that

$$(a_1 + a_2)^{-1}(z_1^* z_1 + z_2^* z_2) \mapsto 1 \otimes 1, \qquad (a_1 + a_2)^{-1}(z_3^* z_3 + z_4^* z_4) \mapsto (\zeta_1^* \zeta_1 + \zeta_2^* \zeta_2) \otimes 1,$$

so that the trace relation holds if and only if $(a_1 + a_2)^{-1} \mapsto (1 + |\zeta|^2) \otimes 1$, which is certainly true. Moreover, in $\mathcal{A}[\mathbb{CP}^3]$ there are relations of the form

$$q_{ij} q_{kl} = z_i z_j^* z_k z_l^* = z_i z_l^* z_k z_j^* = q_{il} q_{kj} \tag{2.12}$$

for $i, j, k, l = 1, \dots, 4$. By adding together various linear combinations and using the trace relation $\mathrm{Tr}\, \mathsf{q} = 1$, one finds that the relations (2.12) are equivalent to the projector relations $\mathsf{q}^2 = \mathsf{q}$. Hence it follows that the projector relations in $\mathcal{A}_0[\mathbb{CP}^3]$ are equivalent to the remaining relations in $\mathcal{A}[\mathbb{R}^4] \otimes \mathcal{A}[\mathbb{CP}^1]$.

In this way, we see that there is a canonical inclusion of algebras $\mathcal{A}[\mathbb{R}^4] \to \mathcal{A}_0[\mathbb{CP}^3]$ in the obvious way; this is a coordinate algebra description of the localized twistor fibration $\mathbb{R}^4 \times \mathbb{CP}^1 \to \mathbb{R}^4$. Moreover, using the isomorphism in Lemma 2.2, the quaternionic structure $J$ of Eq. (2.6) is well-defined on the algebra $\mathcal{A}_0[\mathbb{CP}^3]$, with $\mathcal{A}[\mathbb{R}^4]$ being the $J$-invariant subalgebra.
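The algebraic identities used in Secs. 2.1 and 2.2 can be spot-checked numerically by evaluating the generators at a point of $S^7$. The following is only a sanity check at one random point, not a proof; the function names and the choice of random point are illustrative assumptions. It evaluates the sphere relation (2.3), the projector relations for $q_{jl} = z_j z_l^*$, the relation $2(a_1 + a_2) = 1 + x_0$, two of the identities from the proof of Lemma 2.2, the compatibility of (2.9) with (2.10), and the trace relation.

```python
import random


def random_point_on_s7(rng):
    """Random z in C^4, normalised so that the sphere relation (2.1) holds."""
    z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
    norm = sum(abs(c) ** 2 for c in z) ** 0.5
    return [c / norm for c in z]


def residuals(z):
    """Absolute defects of the identities of Secs. 2.1-2.2 at the point z."""
    z1, z2, z3, z4 = z
    conj = lambda w: w.conjugate()

    # Generators of A[S^4] as in (2.2); x0 is real.
    x1 = 2 * (z1 * conj(z3) + conj(z2) * z4)
    x2 = 2 * (z2 * conj(z3) - conj(z1) * z4)
    x0 = abs(z1) ** 2 + abs(z2) ** 2 - abs(z3) ** 2 - abs(z4) ** 2

    # Rank-one projector (2.5) and the element a1 + a2.
    q = [[zj * conj(zl) for zl in z] for zj in z]
    s = (q[0][0] + q[1][1]).real

    # Local coordinates (2.10) on the chart where 1 + x0 is invertible.
    zeta1, zeta2 = x1 / (1 + x0), x2 / (1 + x0)
    mod2 = abs(zeta1) ** 2 + abs(zeta2) ** 2

    return {
        "(2.3)": abs(abs(x1) ** 2 + abs(x2) ** 2 + x0 ** 2 - 1),
        "q^2 = q": max(
            abs(sum(q[j][r] * q[r][l] for r in range(4)) - q[j][l])
            for j in range(4) for l in range(4)
        ),
        "2(a1 + a2) = 1 + x0": abs(2 * s - (1 + x0)),
        "z3 identity": abs(2 * s * z3 - (conj(x1) * z1 + conj(x2) * z2)),
        "z4 identity": abs(2 * s * z4 - (x1 * z2 - x2 * z1)),
        "(2.9) after (2.10)": max(
            abs(2 * zeta1 / (1 + mod2) - x1),
            abs(2 * zeta2 / (1 + mod2) - x2),
            abs((1 - mod2) / (1 + mod2) - x0),
        ),
        "trace relation": abs(1 / s - (1 + mod2)),
    }


rng = random.Random(0)
for name, defect in residuals(random_point_on_s7(rng)).items():
    print(f"{name}: {defect:.1e}")
```

Each printed defect should be at the level of floating-point rounding, provided the sampled point is not close to the fiber over the point at infinity, where $1 + x_0$ vanishes.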
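2.3. Symmetries of the twistor fibration

In later sections we shall obtain deformations of the twistor fibration by the cocycle twisting of Sec. 1.2; for this we need a group of symmetries acting upon the twistor bundle. Here we describe the general strategy that we shall adopt.

We write $\mathrm{M}(2, \mathbb{H})$ for the algebra of 2 × 2 matrices with quaternion entries. The algebra $\mathcal{A}[\mathrm{M}(2, \mathbb{H})]$ of coordinate functions on $\mathrm{M}(2, \mathbb{H})$ is the commutative unital ∗-algebra generated by the entries of the 4 × 4 matrix

$$A = \begin{pmatrix} a_{ij} & b_{ij} \\ c_{ij} & d_{ij} \end{pmatrix} = \begin{pmatrix} \alpha_1 & -\alpha_2^* & \beta_1 & -\beta_2^* \\ \alpha_2 & \alpha_1^* & \beta_2 & \beta_1^* \\ \gamma_1 & -\gamma_2^* & \delta_1 & -\delta_2^* \\ \gamma_2 & \gamma_1^* & \delta_2 & \delta_1^* \end{pmatrix}. \tag{2.13}$$

We think of this matrix as being generated by a set of quaternion-valued functions, writing

$$a = (a_{ij}) = \begin{pmatrix} \alpha_1 & -\alpha_2^* \\ \alpha_2 & \alpha_1^* \end{pmatrix}$$

and similarly for the other entries $b$, $c$, $d$. The ∗-structure on this algebra is evident from the matrix (2.13). We equip $\mathcal{A}[\mathrm{M}(2, \mathbb{H})]$ with the matrix coalgebra structure

$$\Delta(A_{ij}) = \sum_r A_{ir} \otimes A_{rj}, \qquad \epsilon(A_{ij}) = \delta_{ij} \qquad \text{for } i, j = 1, \dots, 4.$$

Dual to the canonical action of $\mathrm{M}(2, \mathbb{H})$ on $\mathbb{C}^4 \simeq \mathbb{H}^2$ there is a left coaction defined by

$$\Delta_L : \mathcal{A}[\mathbb{C}^4] \to \mathcal{A}[\mathrm{M}(2, \mathbb{H})] \otimes \mathcal{A}[\mathbb{C}^4], \qquad z_j \mapsto \sum_r A_{jr} \otimes z_r, \tag{2.14}$$

extended as a ∗-algebra map. This coaction commutes with the quaternionic structure (2.6), in the sense that $(\mathrm{id} \otimes J) \circ \Delta_L = \Delta_L \circ J$, so that we have a coaction $\Delta_L : \mathcal{A}[\mathbb{H}^2] \to \mathcal{A}[\mathrm{M}(2, \mathbb{H})] \otimes \mathcal{A}[\mathbb{H}^2]$ (cf. [21]).

The Hopf algebra $\mathcal{A}[\mathrm{GL}(2, \mathbb{H})]$ of coordinate functions on the group $\mathrm{GL}(2, \mathbb{H})$ is obtained by adjoining to $\mathcal{A}[\mathrm{M}(2, \mathbb{H})]$ an invertible group-like element $D$ obeying the relation $D^{-1} = \det A$, where $\det A$ is the determinant of the matrix (2.13). This yields a left coaction $\Delta_L : \mathcal{A}[\mathbb{H}^2] \to \mathcal{A}[\mathrm{GL}(2, \mathbb{H})] \otimes \mathcal{A}[\mathbb{H}^2]$, also defined by the formula (2.14) and extended as a ∗-algebra map.

The group $\mathrm{GL}(2, \mathbb{H})$ is the group of conformal symmetries of the twistor fibration $\mathbb{CP}^3 \to S^4$ [27, 23]. However, since we are interested in the localized twistor bundle described in Lemma 2.2, we work instead with the localized group of symmetries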
$\mathrm{GL}^+(2, \mathbb{H})$, which is just the “coordinate patch” of $\mathrm{GL}(2, \mathbb{H})$ in which the 2 × 2 block $a$ is assumed to be invertible. The coordinate algebra $\mathcal{A}[\mathrm{GL}^+(2, \mathbb{H})]$ of this localization is obtained by adjoining to $\mathcal{A}[\mathrm{GL}(2, \mathbb{H})]$ an invertible element $\tilde{D}$ which obeys the relation $\tilde{D}^{-1} = \det a$. The coaction of $\mathcal{A}[\mathrm{GL}^+(2, \mathbb{H})]$ on $\mathcal{A}[\mathbb{H}^2]$ is once again defined by the formula (2.14). We refer to [6] for further details of this construction.

Throughout the paper, our strategy will be to deform the localized twistor fibration and its associated geometry using the action of certain subgroups of $\mathrm{GL}^+(2, \mathbb{H})$. In dual terms, we suppose $H$ to be a commutative Hopf ∗-algebra obtained via a Hopf algebra projection

$$\pi : \mathcal{A}[\mathrm{GL}^+(2, \mathbb{H})] \to H. \tag{2.15}$$

This determines a left coaction of $H$ on $\mathcal{A}[\mathbb{H}^2]$ by projection of the coaction (2.14), namely

$$\Delta_\pi : \mathcal{A}[\mathbb{H}^2] \to H \otimes \mathcal{A}[\mathbb{H}^2], \qquad \Delta_\pi := (\pi \otimes \mathrm{id}) \circ \Delta_L, \tag{2.16}$$

which makes $\mathcal{A}[\mathbb{H}^2]$ into a left $H$-comodule ∗-algebra. Moreover, we assume that this coaction respects the defining relations of the localized twistor algebra $\mathcal{A}_0[\mathbb{CP}^3]$ given in Lemma 2.2, whence it makes $\mathcal{A}_0[\mathbb{CP}^3]$ and $\mathcal{A}[\mathbb{R}^4]$ into left $H$-comodule ∗-algebras in such a way that the algebra inclusion $\mathcal{A}[\mathbb{R}^4] \to \mathcal{A}_0[\mathbb{CP}^3]$ is a left $H$-comodule map.

3. Families of Instantons and Gauge Theory

We are now ready to study differential structures on the twistor fibration. In this section we recall the basic theory of anti-self-dual connections on Euclidean space $\mathbb{R}^4$ from the point of view of noncommutative geometry. Following [21, 5], we then generalize this by recalling what it means to have a family of anti-self-dual connections on $\mathbb{R}^4$ and when such families are gauge equivalent. These notions will pave the way for the algebraic formulation of the ADHM construction to follow.

3.1. Differential structures and instantons

As discussed, our intention is to present the construction of connections and gauge fields in an entirely $H$-covariant framework, from which all of our deformed versions will immediately follow by functorial cocycle twisting. First of all we discuss the various differential structures that we shall need.

We write $\Omega(\mathbb{C}^4)$ for the canonical differential calculus on $\mathcal{A}[\mathbb{C}^4]$. It is the graded differential algebra generated by the degree zero elements $z_j, z_l^*$, $j, l = 1, \dots, 4$, and the degree one elements $dz_j, dz_l^*$, $j, l = 1, \dots, 4$, subject to the relations

$$dz_j \wedge dz_l + dz_l \wedge dz_j = 0, \qquad dz_j \wedge dz_l^* + dz_l^* \wedge dz_j = 0$$

for $j, l = 1, \dots, 4$. The exterior derivative $d$ on $\Omega(\mathbb{C}^4)$ is defined by $d : z_j \mapsto dz_j$ and extended uniquely using a graded Leibniz rule. There is also an involution on $\Omega(\mathbb{C}^4)$ given by graded extension of the map $z_j \mapsto z_j^*$.
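The story is similar for the canonical differential calculus $\Omega(\mathbb{R}^4)$. It is generated by the degree zero elements $\zeta_j, \zeta_l^*$ and the degree one elements $d\zeta_j, d\zeta_l^*$, $j, l = 1, 2$, subject to the relations

$$d\zeta_j \wedge d\zeta_l + d\zeta_l \wedge d\zeta_j = 0, \qquad d\zeta_j \wedge d\zeta_l^* + d\zeta_l^* \wedge d\zeta_j = 0.$$

With $\pi : \mathcal{A}[\mathrm{GL}^+(2, \mathbb{H})] \to H$ a choice of Hopf algebra projection as in Eq. (2.15), we assume throughout that the differential calculi $\Omega(\mathbb{C}^4)$ and $\Omega(\mathbb{R}^4)$ are (graded) left $H$-comodule algebras such that the exterior derivative $d$ is a left $H$-comodule map, i.e. the coaction (2.16) obeys

$$\Delta_\pi(dz_j) = (\mathrm{id} \otimes d)\Delta_\pi(z_j), \qquad j = 1, \dots, 4.$$

In this way, the $H$-coactions on $\Omega(\mathbb{C}^4)$ and on $\Omega(\mathbb{R}^4)$ are given by extending the coaction on $\mathcal{A}[\mathbb{C}^4]$.

Next we come to discuss vector bundles over $\mathbb{R}^4$. Of course, the fact that $\mathbb{R}^4$ is contractible means that the K-theory of the algebra $\mathcal{A}[\mathbb{R}^4]$ is trivial, i.e. all finitely generated projective (let us say right) modules $\mathcal{E}$ over $\mathcal{A}[\mathbb{R}^4]$ have the form $\mathcal{E} = \mathcal{A}[\mathbb{R}^4]^N$ for $N$ a positive integer and are equipped with a canonical $\mathcal{A}[\mathbb{R}^4]$-valued Hermitian structure $\langle \cdot | \cdot \rangle$. A connection on $\mathcal{E}$ is a linear map $\nabla : \mathcal{E} \to \mathcal{E} \otimes_{\mathcal{A}[\mathbb{R}^4]} \Omega^1(\mathbb{R}^4)$ satisfying the Leibniz rule

$$\nabla(\xi x) = (\nabla \xi) x + \xi \otimes dx \qquad \text{for all } \xi \in \mathcal{E}, \; x \in \mathcal{A}[\mathbb{R}^4].$$

The connection $\nabla$ is said to be compatible with the Hermitian structure on $\mathcal{E}$ if it obeys

$$\langle \nabla \xi | \eta \rangle + \langle \xi | \nabla \eta \rangle = d \langle \xi | \eta \rangle \qquad \text{for all } \xi, \eta \in \mathcal{E}.$$

Since $\mathcal{E}$ is necessarily free as an $\mathcal{A}[\mathbb{R}^4]$-module, any compatible connection $\nabla$ can be written $\nabla = d + \alpha$, where $\alpha$ is a skew-adjoint element of $\mathrm{Hom}_{\mathcal{A}[\mathbb{R}^4]}(\mathcal{E}, \mathcal{E} \otimes_{\mathcal{A}[\mathbb{R}^4]} \Omega^1(\mathbb{R}^4))$. The curvature of $\nabla$ is the $\mathrm{End}_{\mathcal{A}[\mathbb{R}^4]}(\mathcal{E})$-valued two-form

$$F := \nabla^2 = d\alpha + \alpha^2.$$

The Euclidean metric on $\mathbb{R}^4$ determines the Hodge ∗-operator on $\Omega(\mathbb{R}^4)$, which on two-forms is a linear map $* : \Omega^2(\mathbb{R}^4) \to \Omega^2(\mathbb{R}^4)$ such that $*^2 = \mathrm{id}$. Since $H$ coacts by conformal transformations on $\mathcal{A}[\mathbb{R}^4]$, there is an $H$-covariant splitting of two-forms

$$\Omega^2(\mathbb{R}^4) = \Omega^2_+(\mathbb{R}^4) \oplus \Omega^2_-(\mathbb{R}^4)$$

into self-dual and anti-self-dual components, i.e. the ±1 eigenspaces of the Hodge operator. The curvature $\nabla^2$ of a connection $\nabla$ is said to be anti-self-dual if it satisfies the equation $*F = -F$.

Definition 3.1. A compatible connection $\nabla$ on $\mathcal{E}$ is said to be an instanton if its curvature $F = \nabla^2$ is an anti-self-dual two-form.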
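The splitting of two-forms into self-dual and anti-self-dual parts can be made explicit on constant-coefficient forms. The sketch below is an illustration under stated assumptions: it uses real coordinates $x^0, \dots, x^3$ on $\mathbb{R}^4$ with the orientation in which $dx^0 \wedge dx^1 \wedge dx^2 \wedge dx^3$ is positive (a convention chosen here, not fixed by the text), and the helper names are hypothetical. It builds the 6 × 6 matrix of the Hodge operator on the basis $dx^i \wedge dx^j$, $i < j$, and checks that it squares to the identity with eigenspaces of dimension three each.

```python
PAIRS = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]  # basis dx^i ^ dx^j, i < j


def perm_sign(p):
    """Sign of a permutation of 0..n-1, computed by sorting with transpositions."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def hodge_star_matrix():
    """Matrix of * on two-forms: *(dx^i ^ dx^j) = sign(i, j, k, l) dx^k ^ dx^l with k < l."""
    star = [[0] * 6 for _ in range(6)]
    for col, (i, j) in enumerate(PAIRS):
        k, l = [m for m in range(4) if m not in (i, j)]
        star[PAIRS.index((k, l))][col] = perm_sign((i, j, k, l))
    return star


def apply(matrix, vec):
    """Matrix-vector product with plain lists."""
    return [sum(row[c] * vec[c] for c in range(len(vec))) for row in matrix]


STAR = hodge_star_matrix()

# *^2 = id on two-forms.
for col in range(6):
    e = [1 if r == col else 0 for r in range(6)]
    assert apply(STAR, apply(STAR, e)) == e

# The trace vanishes, so the +1 and -1 eigenspaces both have dimension 3.
assert sum(STAR[r][r] for r in range(6)) == 0

# Example: dx^0 ^ dx^1 - dx^2 ^ dx^3 is anti-self-dual in this orientation.
asd = [1, 0, 0, 0, 0, -1]
assert apply(STAR, asd) == [-c for c in asd]
```

Reversing the orientation exchanges the two eigenspaces, which is why the orientation convention has to be fixed before speaking of anti-self-dual curvature.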
It is worth pointing out here that the usual definition of an instanton imposes the additional constraint that the Yang–Mills “energy” defined by the integral

$$\mathrm{YM}(\nabla) := \int_{\mathbb{R}^4} \mathrm{tr}(F \wedge F)$$

should be finite or, equivalently, that the instanton connection extends to one on the conformal compactification $S^4$ of $\mathbb{R}^4$. Indeed, it is known that the ADHM construction of instantons on $\mathbb{R}^4$ (to be described in Sec. 4) produces instantons whose Yang–Mills energy is finite and, conversely, that all such instantons arise via the ADHM construction. We shall address the corresponding issue for instantons on noncommutative spaces later on in the paper.

The gauge group of $\mathcal{E}$ is defined to be

$$\mathcal{U}(\mathcal{E}) := \{U \in \mathrm{End}_{\mathcal{A}[\mathbb{R}^4]}(\mathcal{E}) \mid \langle U\xi | U\eta \rangle = \langle \xi | \eta \rangle \text{ for all } \xi, \eta \in \mathcal{E}\}.$$

It acts upon the space of compatible connections by

$$\nabla \mapsto \nabla^U := U \nabla U^*$$

for each compatible connection $\nabla$ and each element $U$ of $\mathcal{U}(\mathcal{E})$. We say that a pair of connections $\nabla_1$, $\nabla_2$ on $\mathcal{E}$ are gauge equivalent if they are related by such a gauge transformation $U$. The curvatures of gauge equivalent connections are related by $F^U = (\nabla^U)^2 = U F U^*$. Note in particular that if $\nabla$ has anti-self-dual curvature then so does the gauge-transformed connection $\nabla^U$.

We observe a posteriori that the above definitions do not depend on the commutativity of the algebras $\mathcal{A}[\mathbb{R}^4]$ and $\Omega(\mathbb{R}^4)$, so that they continue to make sense even if we allow for deformations of the algebras $\mathcal{A}[\mathbb{R}^4]$ and $\Omega(\mathbb{R}^4)$.

3.2. Noncommutative families of instantons

Having given the definition of an instanton on $\mathbb{R}^4$, we now come to discuss what it means to have a family of instantons over $\mathbb{R}^4$. In the following, we let $A$ be an arbitrary (possibly noncommutative) unital ∗-algebra.

Definition 3.2. A family of Hermitian vector bundles over $\mathbb{R}^4$ parametrized by the algebra $A$ is a finitely generated projective right module $\mathcal{E}$ over the algebra $A \otimes \mathcal{A}[\mathbb{R}^4]$ equipped with an $A \otimes \mathcal{A}[\mathbb{R}^4]$-valued Hermitian structure $\langle \cdot | \cdot \rangle$.

By definition, any such module $\mathcal{E}$ is given by a self-adjoint idempotent $\mathsf{P} \in \mathrm{M}_N(A \otimes \mathcal{A}[\mathbb{R}^4])$, i.e. an $N \times N$ matrix with entries in $A \otimes \mathcal{A}[\mathbb{R}^4]$ satisfying $\mathsf{P}^2 = \mathsf{P} = \mathsf{P}^*$; the corresponding module is $\mathcal{E} := \mathsf{P}(A \otimes \mathcal{A}[\mathbb{R}^4])^N$.
Although Definition 3.2 is given in terms of an arbitrary algebra A, it is motivated by the case where A is the (commutative) coordinate algebra of some underlying classical space X. In this situation, for each point x ∈ X there is an evaluation map evx : A → C and
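the object $\mathcal{E}_x := (\mathrm{ev}_x \otimes \mathrm{id})\mathsf{P}\bigl((A \otimes \mathcal{A}[\mathbb{R}^4])^N\bigr)$ is a finitely generated projective right $\mathcal{A}[\mathbb{R}^4]$-module corresponding to a vector bundle over $\mathbb{R}^4$. In this way, the projection $\mathsf{P}$ defines a family of Hermitian vector bundles parametrized by the space $X$. When the algebra $A$ is noncommutative, there need not be enough evaluation maps available, but we may nevertheless work with the whole family at once.

Next we come to say what it means to have a family of connections over $\mathbb{R}^4$. We write $A \otimes \Omega^1(\mathbb{R}^4)$ for the tensor product bimodule over the algebra $A \otimes \mathcal{A}[\mathbb{R}^4]$ and extend the exterior derivative $d$ on $\mathcal{A}[\mathbb{R}^4]$ to $A \otimes \mathcal{A}[\mathbb{R}^4]$ as $\mathrm{id} \otimes d$.

Definition 3.3. A family of connections parametrized by the algebra $A$ consists of a family of Hermitian vector bundles $\mathcal{E} := \mathsf{P}(A \otimes \mathcal{A}[\mathbb{R}^4])^N$ over $\mathcal{A}[\mathbb{R}^4]$, together with a linear map

$$\nabla : \mathcal{E} \to \mathcal{E} \otimes_{A \otimes \mathcal{A}[\mathbb{R}^4]} (A \otimes \Omega^1(\mathbb{R}^4)) \cong \mathcal{E} \otimes_{\mathcal{A}[\mathbb{R}^4]} \Omega^1(\mathbb{R}^4)$$

obeying the Leibniz rule

$$\nabla(\xi x) = (\nabla \xi) x + \xi \otimes (\mathrm{id} \otimes d) x$$

for all $\xi \in \mathcal{E}$ and $x \in A \otimes \mathcal{A}[\mathbb{R}^4]$. The family is said to be compatible with the Hermitian structure if it obeys

$$\langle \nabla \xi | \eta \rangle + \langle \xi | \nabla \eta \rangle = (\mathrm{id} \otimes d) \langle \xi | \eta \rangle$$

for all $\xi, \eta \in \mathcal{E}$.

It is clear that a family of connections parametrized by $A = \mathbb{C}$ (i.e. by a one-point space) is just a connection in the usual sense. In general, a given family of Hermitian vector bundles $\mathcal{E} = \mathsf{P}(A \otimes \mathcal{A}[\mathbb{R}^4])^N$ always carries the family of Grassmann connections defined by $\nabla_0 = \mathsf{P} \circ (\mathrm{id} \otimes d)$. It follows that any family of connections can be written in the form $\nabla = \nabla_0 + \alpha$, where $\alpha$ is a skew-adjoint element of

$$\mathrm{Hom}_{A \otimes \mathcal{A}[\mathbb{R}^4]}\bigl(\mathcal{E}, \mathcal{E} \otimes_{A \otimes \mathcal{A}[\mathbb{R}^4]} (A \otimes \Omega^1(\mathbb{R}^4))\bigr) \cong \mathrm{Hom}_{A \otimes \mathcal{A}[\mathbb{R}^4]}\bigl(\mathcal{E}, \mathcal{E} \otimes_{\mathcal{A}[\mathbb{R}^4]} \Omega^1(\mathbb{R}^4)\bigr).$$

Definition 3.4. Let $\mathcal{E} := \mathsf{P}(A \otimes \mathcal{A}[\mathbb{R}^4])^N$ be a family of Hermitian vector bundles parametrized by the algebra $A$. The gauge group of $\mathcal{E}$ is

$$\mathcal{U}(\mathcal{E}) := \{U \in \mathrm{End}_{A \otimes \mathcal{A}[\mathbb{R}^4]}(\mathcal{E}) \mid \langle U\xi | U\eta \rangle = \langle \xi | \eta \rangle \text{ for all } \xi, \eta \in \mathcal{E}\}.$$

We say that two families of compatible connections $\nabla_1$, $\nabla_2$ on $\mathcal{E}$ are equivalent families and write $\nabla_1 \sim \nabla_2$ if they are related by the action of the unitary group, i.e. there exists $U \in \mathcal{U}(\mathcal{E})$ such that $\nabla_2 = U \nabla_1 U^*$.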
More generally, if ∇1 and ∇2 are two families of connections on E parametrized by ∗-algebras A1 and A2 respectively, we say that ∇1 ∼ ∇2 if there exists a ∗-algebra
$B$ and ∗-algebra maps $\phi_1 : A_1 \to B$ and $\phi_2 : A_2 \to B$ such that $\phi_1^* \nabla_1 \sim \phi_2^* \nabla_2$ in the above sense. Here $\phi_i^* \nabla_i$ is the connection on $\mathcal{E} \otimes_{A_i} B$ naturally induced by $\nabla_i$, where $i = 1, 2$ (cf. [5] for more details). In the case where $A = \mathbb{C}$, i.e. for a family parametrized by a one-point space, the above equivalence relation reduces to the usual definition of gauge equivalence of connections. In the case where the families $\nabla_1$, $\nabla_2$ are Grassmann families associated to projections $\mathsf{P}_1, \mathsf{P}_2 \in \mathrm{M}_N(A \otimes \mathcal{A}[\mathbb{R}^4])$, equivalence means that $\mathsf{P}_2 = U \mathsf{P}_1 U^*$ for some unitary $U$.

Lemma 3.5. With $\mathcal{E} = \mathsf{P}(A \otimes \mathcal{A}[\mathbb{R}^4])^N$, there exists $\mathsf{P}_A \in \mathrm{M}_N(A)$ such that there is an algebra isomorphism

$$\mathrm{End}_{A \otimes \mathcal{A}[\mathbb{R}^4]}(\mathcal{E}) \cong \mathrm{End}_A(\mathsf{P}_A(A^N)) \otimes \mathcal{A}[\mathbb{R}^4]$$

and hence an isomorphism $\mathcal{U}(\mathcal{E}) \cong \mathcal{U}\bigl(\mathrm{End}_A(\mathsf{P}_A(A^N)) \otimes \mathcal{A}[\mathbb{R}^4]\bigr)$ of gauge groups.

Proof. Since $\mathbb{R}^4$ is topologically trivial, there is an isomorphism of K-groups $K_0(A \otimes \mathcal{A}[\mathbb{R}^4]) \cong K_0(A)$. It follows that for each projection $\mathsf{P} \in \mathrm{M}_N(A \otimes \mathcal{A}[\mathbb{R}^4])$ there exists a projection $\mathsf{P}_A \in \mathrm{M}_N(A)$ such that $\mathsf{P}$ and $\mathsf{P}_A \otimes 1$ are equivalent projections in $\mathrm{M}_N(A \otimes \mathcal{A}[\mathbb{R}^4])$. This implies that there is an isomorphism

$$\mathcal{E} = \mathsf{P}(A \otimes \mathcal{A}[\mathbb{R}^4])^N \cong \mathsf{P}_A(A^N) \otimes \mathcal{A}[\mathbb{R}^4] \tag{3.1}$$

of right $A \otimes \mathcal{A}[\mathbb{R}^4]$-modules, from which the result follows immediately.

With these ideas in mind, finally we arrive at the following definition of a family of instantons over $\mathbb{R}^4$ parametrized by a (possibly noncommutative) ∗-algebra $A$.

Definition 3.6. A family of instantons over $\mathbb{R}^4$ is a family of compatible connections $\nabla$ over $\mathbb{R}^4$ whose curvature $F := \nabla^2$ obeys the anti-self-duality equation

$$(\mathrm{id} \otimes *) F = -F,$$

where $*$ is the Hodge operator on $\Omega^2(\mathbb{R}^4)$.

Once again we stress that the “usual” definition of a family of instantons over $\mathbb{R}^4$ would require all members of the family to have finite Yang–Mills energy, in the sense that the $A$-valued integral

$$\mathrm{YM}(\nabla) := \mathrm{id} \otimes \int_{\mathbb{R}^4} \bigl((\mathrm{id} \otimes \mathrm{tr}) F \wedge F\bigr) \tag{3.2}$$

should be well-defined (i.e. finite). We shall comment on this requirement for noncommutative instantons later on in the paper.

4. The ADHM Construction

Next we review the ADHM construction of instantons on the classical Euclidean four-plane $\mathbb{R}^4$. We present the construction in a coordinate algebra format which
is covariant under the coaction of a given Hopf algebra of symmetries, paving the way for a deformation by the cocycle twisting of Sec. 1.2.

4.1. The space of classical monads

We begin by describing the input data for the ADHM construction of instantons. Although the ADHM construction is capable of constructing instanton bundles of arbitrary rank, in this paper we restrict our attention to the construction of vector bundles with rank two.

Definition 4.1. Let $k \in \mathbb{Z}$ be a fixed positive integer. A monad over $A[\mathbb{C}^4]$ is a complex of free right $A[\mathbb{C}^4]$-modules,
$$0 \to \mathcal{H} \otimes A[\mathbb{C}^4] \xrightarrow{\ \sigma_z\ } \mathcal{K} \otimes A[\mathbb{C}^4] \xrightarrow{\ \tau_z\ } \mathcal{L} \otimes A[\mathbb{C}^4] \to 0, \tag{4.1}$$
where $\mathcal{H}$, $\mathcal{K}$ and $\mathcal{L}$ are complex vector spaces of dimensions $k$, $2k + 2$ and $k$ respectively, such that the maps $\sigma_z$, $\tau_z$ are linear in the generators $z_1, \ldots, z_4$ of $A[\mathbb{C}^4]$. The first and last terms of the sequence are required to be exact, so that the only non-trivial cohomology is in the middle term.

Given a monad (4.1), its cohomology $E := \mathrm{Ker}\,\tau_z / \mathrm{Im}\,\sigma_z$ is a finitely-generated projective right $A[\mathbb{C}^4]$-module and hence defines a vector bundle over $\mathbb{C}^4$. In fact, since the maps $\sigma_z$, $\tau_z$ are assumed linear in the coordinate functions $z_1, \ldots, z_4$, this vector bundle is well-defined over the projective space $\mathbb{CP}^3$ [26]. With respect to ordered bases $(u_1, \ldots, u_k)$, $(v_1, \ldots, v_{2k+2})$ and $(w_1, \ldots, w_k)$ for the vector spaces $\mathcal{H}$, $\mathcal{K}$ and $\mathcal{L}$ respectively, the maps $\sigma_z$ and $\tau_z$ have the form
$$\sigma_z : u_b \otimes Z \mapsto \sum_{a,j} M^j_{ab} \otimes v_a \otimes z_j Z, \qquad \tau_z : v_c \otimes Z \mapsto \sum_{d,j} N^j_{dc} \otimes w_d \otimes z_j Z, \tag{4.2}$$
where $Z \in A[\mathbb{C}^4]$ and the quantities $M^j := (M^j_{ab})$ and $N^j := (N^j_{dc})$, $j = 1, \ldots, 4$, are complex matrices with $a, c = 1, \ldots, 2k + 2$ and $b, d = 1, \ldots, k$. In more compact notation, $\sigma_z$ and $\tau_z$ may be written
$$\sigma_z = \sum_j M^j \otimes z_j, \qquad \tau_z = \sum_j N^j \otimes z_j. \tag{4.3}$$
It is immediate from the formulæ (4.2) that the composition $\tau_z \sigma_z$ is given by
$$\tau_z \sigma_z : \mathcal{H} \otimes A[\mathbb{C}^4] \to \mathcal{L} \otimes A[\mathbb{C}^4], \qquad u_b \otimes Z \mapsto \sum_{j,l,c,d} N^j_{dc} M^l_{cb} \otimes w_d \otimes z_j z_l Z,$$
with respect to the bases $(u_1, \ldots, u_k)$ and $(w_1, \ldots, w_k)$ of $\mathcal{H}$ and $\mathcal{L}$. It follows that the condition $\tau_z \sigma_z = 0$ is equivalent to requiring that
$$\sum_r \bigl(N^j_{dr} M^l_{rb} + N^l_{dr} M^j_{rb}\bigr) = 0 \tag{4.4}$$
for all $j, l = 1, \ldots, 4$ and $b, d = 1, \ldots, k$.
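The passage from the composition formula to (4.4) uses only the commutativity of the generators. A short worked version, assuming nothing beyond $z_j z_l = z_l z_j$ in $A[\mathbb{C}^4]$ and the compact notation (4.3), is the following sketch:

```latex
\begin{align*}
\tau_z \sigma_z
  &= \sum_{j,l} N^j M^l \otimes z_j z_l \\
  &= \tfrac{1}{2} \sum_{j,l} \bigl( N^j M^l + N^l M^j \bigr) \otimes z_j z_l
   \qquad \text{(relabel $j \leftrightarrow l$ and use $z_j z_l = z_l z_j$).}
\end{align*}
```

Since the monomials $z_j z_l$ with $j \le l$ are linearly independent in the polynomial algebra, the composition vanishes exactly when every symmetrized coefficient $N^j M^l + N^l M^j$ vanishes, which is (4.4) written entrywise. For $j = l$ this reduces to $N^j M^j = 0$.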
S. Brain & W. D. van Suijlekom
We now introduce the conjugate matrix elements ${M^j_{ab}}^*$ and ${N^l_{dc}}^*$. Just as we did for the matrices $M^j$ and $N^l$ above, we use the compact notation $(M^{j\dagger})_{ab} = {M^j_{ba}}^*$ and $(N^{l\dagger})_{cd} = {N^l_{dc}}^*$. Given a monad (4.1), the corresponding conjugate monad is defined to be
$$0 \to \mathcal{L}^* \otimes J(A[\mathbb{C}^4])^* \xrightarrow{\ \tau^*_{J(z)}\ } \mathcal{K}^* \otimes J(A[\mathbb{C}^4])^* \xrightarrow{\ \sigma^*_{J(z)}\ } \mathcal{H}^* \otimes J(A[\mathbb{C}^4])^*, \tag{4.5}$$
where $\tau^*_{J(z)}$ and $\sigma^*_{J(z)}$ are the “adjoint” maps defined by
$$\sigma^*_{J(z)} = \sum_j M^{j\dagger} \otimes J(z_j)^*, \qquad \tau^*_{J(z)} = \sum_j N^{j\dagger} \otimes J(z_j)^*$$
and $J$ is the quaternionic involution defined in Eq. (2.6) (cf. [4] for further explanation). If a given monad (4.1) is isomorphic to its conjugate (4.5) then we say that it is self-conjugate. A necessary and sufficient condition for a monad to be self-conjugate is that the maps $\sigma_z$ and $\tau_z$ should obey $\tilde\tau^*_{J(z)} = -\tilde\sigma_z$ and $\tilde\sigma^*_{J(z)} = \tilde\tau_z$, equivalently that the matrices $M^j$, $N^l$ should satisfy the reality conditions
$$N^1 = M^{2\dagger}, \qquad N^2 = -M^{1\dagger}, \qquad N^3 = M^{4\dagger}, \qquad N^4 = -M^{3\dagger}. \tag{4.6}$$
This is for fixed maps $\sigma_z$, $\tau_z$. In dual terms, by allowing $\sigma_z$, $\tau_z$ to vary, we think of the elements $M^j_{ab}$, $N^j_{dc}$ as coordinate functions on the space of all possible pairs of $A[\mathbb{C}^4]$-module maps
$$\sigma_z : \mathcal{H} \otimes A[\mathbb{C}^4] \to \mathcal{K} \otimes A[\mathbb{C}^4], \qquad \tau_z : \mathcal{K} \otimes A[\mathbb{C}^4] \to \mathcal{L} \otimes A[\mathbb{C}^4]. \tag{4.7}$$
Imposing the conditions (4.4) and (4.6), we obtain coordinate functions on the space $M_k$ of all self-conjugate monads with index $k \in \mathbb{Z}$.

Definition 4.2. We write $A[M_k]$ for the commutative ∗-algebra generated by the coordinate functions $M^j_{ab}$, $N^j_{dc}$ subject to the relations (4.4) and the ∗-structure (4.6).

Remark 4.3. For each point $x \in M_k$ there is an evaluation map $\mathrm{ev}_x : A[M_k] \to \mathbb{C}$ and the complex matrices $(\mathrm{ev}_x \otimes \mathrm{id})\sigma_z$ and $(\mathrm{ev}_x \otimes \mathrm{id})\tau_z$ define a self-conjugate monad over $\mathbb{C}^4$,
$$0 \to \mathcal{H} \otimes A[\mathbb{C}^4] \xrightarrow{\ (\mathrm{ev}_x \otimes \mathrm{id})\sigma_z\ } \mathcal{K} \otimes A[\mathbb{C}^4] \xrightarrow{\ (\mathrm{ev}_x \otimes \mathrm{id})\tau_z\ } \mathcal{L} \otimes A[\mathbb{C}^4] \to 0. \tag{4.8}$$
As already remarked, the cohomology $E = \mathrm{Ker}\,\tau_z / \mathrm{Im}\,\sigma_z$ of a monad (4.1) defines a vector bundle $E$ over $\mathbb{CP}^3$; the self-conjugacy condition (4.6) ensures that $E$ arises via pull-back along the Penrose fibration $\mathbb{CP}^3 \to S^4$. This means that the bundle $E$ is trivial upon restriction to each of the fibers of the Penrose fibration and, in particular, to the fiber ∞ over the “point at infinity”.

As already mentioned, we wish to view our monads as being covariant under a certain coaction of a Hopf algebra $H$. Recall that $A[\mathbb{C}^4]$ is already a left $H$-comodule
algebra, with $H$-coaction defined using the projection $\pi : A[\mathrm{GL}^+(2, \mathbb{H})] \to H$ and the formula (2.16). It automatically follows that the free modules $\mathcal{H} \otimes A[\mathbb{C}^4]$, $\mathcal{K} \otimes A[\mathbb{C}^4]$ and $\mathcal{L} \otimes A[\mathbb{C}^4]$ are also left $H$-comodules whose $A[\mathbb{C}^4]$-module structures are $H$-equivariant. It remains to address the requirement that the module maps $\sigma_z$ and $\tau_z$ should be $H$-equivariant as well.

Lemma 4.4. The maps $\sigma_z := \sum_j M^j \otimes z_j$ and $\tau_z := \sum_j N^j \otimes z_j$ are $H$-comodule maps if and only if the coordinate functions $M^j_{ab}$, $N^l_{dc}$ carry the left $H$-coaction
$$M^j_{ab} \mapsto \sum_r \pi(S(A_{rj})) \otimes M^r_{ab}, \qquad N^j_{dc} \mapsto \sum_r \pi(S(A_{rj})) \otimes N^r_{dc} \tag{4.9}$$
for each $j = 1, \ldots, 4$ and $a, c = 1, \ldots, 2k + 2$, $b, d = 1, \ldots, k$.

Proof. Upon inspection of Eq. (4.2) we see that $\sigma_z$ cannot possibly be an intertwiner for the $H$-coactions on $\mathcal{H} \otimes A[\mathbb{C}^4]$ and $\mathcal{K} \otimes A[\mathbb{C}^4]$ unless we also allow for a coaction of $H$ on the algebra $A[M_k]$ as well. Since the definition of $\sigma_z$ depends only upon the generators $M^j_{ab}$, it is enough to check equivariance only on these generators. It is immediate that, for fixed $a, b$, the four-dimensional $H$-comodule spanned by the generators $M^j_{ab}$, $j = 1, \ldots, 4$, must be conjugate to the four-dimensional comodule spanned by the generators $z_1, \ldots, z_4$, giving the coaction as stated. Indeed, we verify that
$$\sum_r M^r \otimes z_r \mapsto \sum_{r,s} \pi(S(A_{sr}) A_{rs}) \otimes M^s \otimes z_s = \sum_s \pi(\epsilon(A_{ss})) \otimes M^s \otimes z_s = \sum_s 1 \otimes M^s \otimes z_s,$$
as required. The same analysis applies to the map $\tau_z$.

By extending it as a ∗-algebra map, the formula (4.9) equips $A[M_k]$ with the structure of a left $H$-comodule ∗-algebra. This will be of paramount importance in later sections when we come to deform the ADHM construction.

4.2. The construction of instantons on $\mathbb{R}^4$

For self-conjugate monads, the maps of interest are the $(2k + 2) \times k$ algebra-valued matrices
$$\sigma_z = M^1 \otimes z_1 + M^2 \otimes z_2 + M^3 \otimes z_3 + M^4 \otimes z_4,$$
$$\sigma_{J(z)} = -M^1 \otimes z_2^* + M^2 \otimes z_1^* - M^3 \otimes z_4^* + M^4 \otimes z_3^*.$$
In terms of these generators, the monad condition $\tau_z \sigma_z = 0$ becomes $\sigma^*_{J(z)} \sigma_z = 0$. By polarisation of this identity, one also finds that $\sigma^*_{J(z)} \sigma_{J(z)} = \sigma_z^* \sigma_z$. The identification
of the vector space $\mathcal{K}$ with its dual $\mathcal{K}^*$ means that the module $\mathcal{K} \otimes A[\mathbb{C}^4]$ acquires a bilinear form given by
$$(\xi, \eta) := \langle J\xi \mid \eta \rangle = \sum_a (J\xi)^*_a\, \eta_a \tag{4.10}$$
for $\xi = (\xi_a)$ and $\eta = (\eta_a) \in \mathcal{K} \otimes A[\mathbb{C}^4]$, where $\langle \cdot \mid \cdot \rangle$ is the canonical Hermitian structure on $\mathcal{K} \otimes A[\mathbb{C}^4]$. The monad condition $\sigma^*_{J(z)} \sigma_z = 0$ implies that the columns of the matrix $\sigma_z$ are orthogonal with respect to the form $(\,\cdot\,, \cdot\,)$. Let us introduce the notation
$$\rho^2 := \sigma_z^* \sigma_z = \sigma^*_{J(z)} \sigma_{J(z)}, \tag{4.11}$$
a $k \times k$ matrix with entries in the algebra $A[M_k] \otimes A[\mathbb{C}^4]$. In order to proceed, we need this matrix $\rho^2$ to be invertible, although of course this is not the case in general. Thus we need to slightly enlarge the matrix algebra $M_k(\mathbb{C}) \otimes A[M_k] \otimes A[\mathbb{C}^4]$ by adjoining an inverse element $\rho^{-2}$ for $\rho^2$. Doing so is equivalent to deleting a collection of points from the parameter space $M_k$, corresponding to the so-called “instantons of zero-size” [13]. We henceforth assume that this has been done, although we do not change our notation.

We collect together the matrices $\sigma_z$, $\sigma_{J(z)}$ into the $(2k + 2) \times 2k$ matrix
$$V := (\sigma_z \ \ \sigma_{J(z)}),$$
which by the definition of $\rho^2$ obeys
$$V^* V = \rho^2 \begin{pmatrix} 1_k & 0 \\ 0 & 1_k \end{pmatrix}, \tag{4.12}$$
where $1_k$ denotes the $k \times k$ identity matrix. We form the matrix
$$Q := V \rho^{-2} V^* = \sigma_z \rho^{-2} \sigma_z^* + \sigma_{J(z)} \rho^{-2} \sigma^*_{J(z)} \tag{4.13}$$
and for convenience we denote
$$Q_z := \sigma_z \rho^{-2} \sigma_z^*, \qquad Q_{J(z)} := \sigma_{J(z)} \rho^{-2} \sigma^*_{J(z)}. \tag{4.14}$$
Immediately we have the following result. Proposition 4.5. The quantity Q := Vρ−2 V∗ is a (2k + 2) × (2k + 2) projection, Q2 = Q = Q∗ , with entries in the algebra A[Mk ] ⊗ A[R4 ] and trace equal to 2k. Proof. That Q is a projection is a direct consequence of the fact that V∗ V = ρ2 . The matrices Qz and QJ(z) are also projections: in fact they are orthogonal projections, since Qz QJ(z) = 0. Moreover, both matrices Qz and QJ(z) have entries whose A[C4 ]-components have the form zj∗ zl for j, l = 1, . . . , 4. From the proof of Lemma 2.2, we know that we can rewrite each of these expressions in terms of generators of the algebra A[R4 ]⊗ A[CP1 ]. Since the matrix sum Q has entries which are J-invariant, it follows that the A[C4 ]-components of these entries must lie in
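the $J$-invariant subalgebra of $A[\mathbb{R}^4] \otimes A[\mathbb{CP}^1]$, which is just $A[\mathbb{R}^4]$. For the trace, we compute that
$$\begin{aligned}
\mathrm{Tr}\, Q_z &= \sum_\mu (\sigma_z \rho^{-2} \sigma_z^*)_{\mu\mu} = \sum_{\mu,r,s} (\sigma_z)_{\mu r} (\rho^{-2})_{rs} (\sigma_z^*)_{s\mu} = \sum_{\mu,r,s} (\rho^{-2})_{rs} (\sigma_z)_{\mu r} (\sigma_z^*)_{s\mu} \\
&= \sum_{\mu,r,s} (\rho^{-2})_{rs} (\sigma_z^*)_{s\mu} (\sigma_z)_{\mu r} = \sum_{r,s} (\rho^{-2})_{rs} (\sigma_z^* \sigma_z)_{sr} = \mathrm{Tr}\, 1_k = k.
\end{aligned}$$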
A similar computation establishes that $Q_{J(z)}$ also has trace equal to $k$, whence the trace of $Q$ is $2k$ by linearity.

From the projection $Q$, we construct the complementary projection $P := 1_{2k+2} - Q$, also having entries in the algebra $A[M_k] \otimes A[\mathbb{R}^4]$. It is immediate that the trace of $P$ is equal to two, so it follows that the finitely generated projective right $A[M_k] \otimes A[\mathbb{R}^4]$-module $E := P(A[M_k] \otimes A[\mathbb{R}^4])^{2k+2}$ defines a family of rank two vector bundles over $\mathbb{R}^4$ parametrized by the space $M_k$ of self-conjugate monads. We equip this family of vector bundles with the family of Grassmann connections $\nabla := P \circ (\mathrm{id} \otimes d)$, obtaining the following result.

Proposition 4.6. The curvature $F = P((\mathrm{id} \otimes d)P)^2$ of the family of Grassmann connections $\nabla$ is anti-self-dual, that is to say $(\mathrm{id} \otimes *)F = -F$.

Proof. By applying $\mathrm{id} \otimes d$ to the relation $\rho^{-2} \rho^2 = 1_k$ and using the Leibniz rule, one finds that
$$(\mathrm{id} \otimes d)\rho^{-2} = -\rho^{-2} \bigl((\mathrm{id} \otimes d)\rho^2\bigr) \rho^{-2}$$
(this is a standard formula for calculating the derivative of a matrix-valued function). Using this, one finds that
$$(\mathrm{id} \otimes d)(V \rho^{-2} V^*) = P((\mathrm{id} \otimes d)V) \rho^{-2} V^* + V \rho^{-2} ((\mathrm{id} \otimes d)V^*) P,$$
and hence in turn that
$$((\mathrm{id} \otimes d)P) \wedge ((\mathrm{id} \otimes d)P) = P((\mathrm{id} \otimes d)V) \rho^{-2} ((\mathrm{id} \otimes d)V^*) P + V \rho^{-2} ((\mathrm{id} \otimes d)V^*) P ((\mathrm{id} \otimes d)V) \rho^{-2} V^*,$$
where we have used the facts that $\rho^{-2} V^* P = 0 = P V \rho^{-2}$. The second term in the above expression is identically zero when acting on any element in the image $E$ of $P$, whence the curvature $F$ of the family $\nabla$ works out to be
$$\begin{aligned}
F = P((\mathrm{id} \otimes d)P)^2 &= P((\mathrm{id} \otimes d)V) \rho^{-2} ((\mathrm{id} \otimes d)V^*) P \\
&= P\bigl( ((\mathrm{id} \otimes d)\sigma_z) \rho^{-2} ((\mathrm{id} \otimes d)\sigma_z^*) + ((\mathrm{id} \otimes d)\sigma_{J(z)}) \rho^{-2} ((\mathrm{id} \otimes d)\sigma^*_{J(z)}) \bigr) P.
\end{aligned}$$
It is clear by inspection that on twistor space $A_0[\mathbb{CP}^3]$ this $F$ is a horizontal two-form of type (1, 1) and it is known [1] that such a two-form is necessarily the pull-back of an anti-self-dual two-form on $\mathbb{R}^4$.
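The purely algebraic facts used in Propositions 4.5 and 4.6 (that $Q = V\rho^{-2}V^*$ is a self-adjoint idempotent, that its trace equals the number of columns of $V$, and that $PV = 0$) hold for any matrix $V$ whose Gram matrix $V^*V$ is invertible. The following is a minimal numerical sketch under that assumption only: the random matrix `V` is a stand-in, not an actual ADHM datum, and the helper name `complementary_projections` is hypothetical.

```python
import numpy as np

def complementary_projections(V):
    """Return (Q, P) with Q = V (V* V)^{-1} V* and P = 1 - Q."""
    gram = V.conj().T @ V                      # plays the role of rho^2 in (4.12)
    Q = V @ np.linalg.solve(gram, V.conj().T)  # V rho^{-2} V*, without forming the inverse
    P = np.eye(V.shape[0]) - Q
    return Q, P

k = 3
rng = np.random.default_rng(0)
# Random complex (2k+2) x 2k matrix; generic, hence of full column rank.
V = rng.normal(size=(2 * k + 2, 2 * k)) + 1j * rng.normal(size=(2 * k + 2, 2 * k))
Q, P = complementary_projections(V)

print("Q idempotent:  ", np.allclose(Q @ Q, Q))
print("Q self-adjoint:", np.allclose(Q, Q.conj().T))
print("Tr Q, Tr P:    ", np.trace(Q).real, np.trace(P).real)
print("P V = 0:       ", np.allclose(P @ V, 0))
```

In the ADHM setting the Gram matrix has the special block form (4.12); the sketch does not use that structure, so it only illustrates the projection algebra, not anti-self-duality.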
Thus we have reproduced the ADHM construction of instantons on $\mathbb{R}^4$ in our coordinate algebra framework: as usual, we must now address the question of the extent to which the construction depends on the choice of bases for the vector spaces $\mathcal{H}$, $\mathcal{K}$, $\mathcal{L}$ that we made in Sec. 4.1.

It is clear that we are free to act on the $A[\mathbb{C}^4]$-module $\mathcal{K} \otimes A[\mathbb{C}^4]$ by a unitary element of the matrix algebra $M_{2k+2}(\mathbb{C}) \otimes A[\mathbb{C}^4]$. In order to preserve the instanton construction, we must do so in a way which preserves the bilinear form $(\,\cdot\,, \cdot\,)$ of Eq. (4.10) determined by the identification of $\mathcal{K}$ with its dual $\mathcal{K}^*$. It follows that the map $\sigma_z$ in Eq. (4.2) is defined up to a unitary transformation $U \in \mathrm{End}_{A[\mathbb{C}^4]}(\mathcal{K} \otimes A[\mathbb{C}^4])$ which commutes with the quaternion structure $J$, namely, the elements of the group
$$\mathrm{Sp}(\mathcal{K} \otimes A[\mathbb{C}^4]) := \{ U \in \mathrm{End}_{A[\mathbb{C}^4]}(\mathcal{K} \otimes A[\mathbb{C}^4]) \mid \langle U\xi \mid U\xi \rangle = \langle \xi \mid \xi \rangle,\ J(U\xi) = U J(\xi) \}.$$
Similarly, we are free to change basis in the module $\mathcal{H} \otimes A[\mathbb{C}^4]$, whence the map $\tau_z$ in Eq. (4.2) is defined up to an invertible transformation $W \in \mathrm{GL}(\mathcal{H} \otimes A[\mathbb{C}^4])$. Given $U \in \mathrm{Sp}(\mathcal{K} \otimes A[\mathbb{C}^4])$ and $W \in \mathrm{GL}(\mathcal{H} \otimes A[\mathbb{C}^4])$, the available freedom in the ADHM construction is to map $\sigma_z \mapsto U \sigma_z W$.

Proposition 4.7. For all $W \in \mathrm{GL}(\mathcal{H})$ the projection $P = 1_{2k+2} - Q$ is invariant under the transformation $\sigma_z \mapsto \sigma_z W$. For all $U \in \mathrm{Sp}(\mathcal{K})$, under the transformation $\sigma_z \mapsto U \sigma_z$ the projection $P$ transforms as $P \mapsto U P U^*$.

Proof. One first checks that $\rho^2 \mapsto (\sigma_z W)^* (\sigma_z W) = W^* \rho^2 W$, so that
$$Q_z \mapsto \sigma_z W (W^* \rho^2 W)^{-1} W^* \sigma_z^* = \sigma_z W (W^{-1} \rho^{-2} (W^*)^{-1}) W^* \sigma_z^* = Q_z,$$
whence the projection $P$ is unchanged. Replacing $\sigma_z$ by $U \sigma_z$ leaves $\rho^2$ invariant (since $U$ is unitary) and so has the effect that
$$Q_z \mapsto U \sigma_z \rho^{-2} \sigma_z^* U^* = U Q_z U^*,$$
whence it follows that $P$ is mapped to $U P U^*$.

In this way, such changes of module bases result in gauge equivalent families of instantons. However, from the point of view of constructing equivalence classes of connections it is in fact sufficient to consider the effect of the subgroups of “constant” module automorphisms, i.e. those generated by changes of basis in the vector spaces $\mathcal{H}$, $\mathcal{K}$ and $\mathcal{L}$, described by the group $\mathrm{Sp}(\mathcal{K}) = \mathrm{Sp}(k + 1) \subset \mathrm{Sp}(\mathcal{K} \otimes A[\mathbb{C}^4])$ and the group $\mathrm{GL}(k, \mathbb{R}) \subset \mathrm{GL}(\mathcal{H} \otimes A[\mathbb{C}^4])$ [2], although it is beyond our scope to prove this here. However, it is not difficult to see that the algebra $A[M_k]$ has a total of $4k(2k + 2)$ generators and $5k(k - 1)$ constraints (determined by the orthogonality relations $\sigma^*_{J(z)} \sigma_z = 0$); the $\mathrm{Sp}(k + 1)$ symmetries impose a further $(k + 1)(2(k + 1) + 1)$
constraints and the $\mathrm{GL}(k, \mathbb{R})$ a further $k^2$ constraints. This elementary argument yields that the construction has
$$(8k^2 + 8k) - 5k(k - 1) - (3k^2 + 5k + 3) = 8k - 3$$
degrees of freedom, in precise agreement with the dimension of the moduli space computed in [3].

Definition 4.8. We say that a pair of self-conjugate monads are equivalent if they are related by a change of bases of the vector spaces $\mathcal{H}$, $\mathcal{K}$, $\mathcal{L}$ of the above form, i.e. by a pair of linear transformations $U \in \mathrm{Sp}(k + 1)$ and $W \in \mathrm{GL}(k, \mathbb{R})$. We denote by $\sim$ the resulting equivalence relation on the space $M_k$ of self-conjugate monads.

5. The Moyal–Groenewold Noncommutative Plane $\mathbb{R}^4_\hbar$

The Moyal noncommutative space-time $\mathbb{R}^4_\hbar$ is arguably one of the best known and most widely-studied examples of a noncommutative space. In this section we analyze the construction of instantons on this space from the point of view of cocycle twisting: we show how to deform Euclidean space-time $\mathbb{R}^4$ and its associated geometric structure into that of the Moyal–Groenewold space-time, then we look at what happens to the ADHM construction of instantons under the deformation procedure.

5.1. A Moyal-deformed family of monads

In order to deform the ADHM construction of instantons, we need to choose a Hopf algebra $H$ of symmetries together with a two-cocycle $F$ by which to perform the twisting. For our twisting Hopf algebra we take $H = A[\mathbb{R}^4]$, the algebra of coordinate functions on the additive group $\mathbb{R}^4$. It is the commutative unital ∗-algebra
$$A[\mathbb{R}^4] = A[t_j, t_j^* \mid j = 1, 2] \tag{5.1}$$
equipped with the Hopf algebra structure
$$\Delta(t_j) = 1 \otimes t_j + t_j \otimes 1, \qquad \epsilon(t_j) = 0, \qquad S(t_j) = -t_j, \tag{5.2}$$
with $\Delta$, $\epsilon$ extended as ∗-algebra maps and $S$ extended as a ∗-anti-algebra map. In order to deform the twistor fibration, we have to equip our various algebras with left $H$-comodule algebra structures, which we achieve using the discussion of Sec. 2.3. There is a Hopf algebra projection from $A[\mathrm{GL}^+(2, \mathbb{H})]$ onto $H$, defined on generators by
$$\pi : A[\mathrm{GL}^+(2, \mathbb{H})] \to H, \qquad
\begin{pmatrix}
\alpha_1 & -\alpha_2^* & \beta_1 & -\beta_2^* \\
\alpha_2 & \alpha_1^* & \beta_2 & \beta_1^* \\
\gamma_1 & -\gamma_2^* & \delta_1 & -\delta_2^* \\
\gamma_2 & \gamma_1^* & \delta_2 & \delta_1^*
\end{pmatrix}
\mapsto
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
t_1^* & t_2^* & 1 & 0 \\
-t_2 & t_1 & 0 & 1
\end{pmatrix} \tag{5.3}$$
and extended as a ∗-algebra map. Using Eq. (2.16), this projection determines a left $H$-coaction $\Delta_\pi : A[\mathbb{C}^4] \to H \otimes A[\mathbb{C}^4]$ by
$$z_1 \mapsto 1 \otimes z_1, \qquad z_3 \mapsto t_1^* \otimes z_1 + t_2^* \otimes z_2 + 1 \otimes z_3, \tag{5.4}$$
$$z_2 \mapsto 1 \otimes z_2, \qquad z_4 \mapsto -t_2 \otimes z_1 + t_1 \otimes z_2 + 1 \otimes z_4, \tag{5.5}$$
extended as a ∗-algebra map. Using the identification of generators (2.10), the coordinate algebra $A[\mathbb{R}^4]$ of Euclidean space therefore carries the coaction
$$A[\mathbb{R}^4] \to H \otimes A[\mathbb{R}^4], \qquad \zeta_1 \mapsto 1 \otimes \zeta_1 + t_1 \otimes 1, \qquad \zeta_2 \mapsto 1 \otimes \zeta_2 + t_2 \otimes 1, \tag{5.6}$$
making $A[\mathbb{R}^4]$ into a left $H$-comodule ∗-algebra.

Let $(\partial_j{}^l)$, $j, l = 1, 2$, be the Lie algebra of translation generators dual to $H$. Writing
$$\tau := (\tau_r{}^s) = \begin{pmatrix} t_1^* & t_2^* \\ -t_2 & t_1 \end{pmatrix}, \qquad r, s = 1, 2,$$
this means that there is a non-degenerate pairing
$$\langle \partial_j{}^l, \tau_r{}^s \rangle = \delta_j^s \delta_r^l, \qquad j, l, r, s = 1, 2,$$
which extends to an action on products of the generators $t_j$, $t_l^*$ by differentiation and evaluation at zero. Using this pairing, we define a twisting two-cocycle by
$$F : H \otimes H \to \mathbb{C}, \qquad F(h, g) = \Bigl\langle \exp\Bigl( \tfrac{1}{2}\, i\, \Theta^{rr'ss'}\, \partial_{rr'} \otimes \partial_{ss'} \Bigr),\ h \otimes g \Bigr\rangle$$
for $h, g \in H$, where $\Theta = (\Theta^{rr'ss'})$, $r, r', s, s' = 1, 2$, is a real $4 \times 4$ anti-symmetric matrix with rows $rr'$ and columns $ss'$, which we choose to have the canonical form
$$\Theta = \hbar \begin{pmatrix} 0 & 0 & 0 & \alpha \\ 0 & 0 & \beta & 0 \\ 0 & -\beta & 0 & 0 \\ -\alpha & 0 & 0 & 0 \end{pmatrix}$$
for non-zero real constants $\alpha$, $\beta$ and $\hbar > 0$ a deformation parameter. We assume for simplicity that $\alpha + \beta \neq 0$. This $F$ is multiplicative (i.e. it is a Hopf bicharacter in the sense of Eq. (1.3)) and so it is determined by its values on the generators $(\tau_r{}^s)$, $r, s = 1, 2$. One computes in particular that
$$F(t_1^*, t_1) = \tfrac{1}{2} \hbar \alpha i, \qquad F(t_2^*, t_2) = -\tfrac{1}{2} \hbar \beta i,$$
with $F$ evaluating as zero on all other pairs of generators. From the formulæ (1.9), (1.10) and (1.12), one immediately finds that $H = H_F$ as a Hopf ∗-algebra. However,
the effect of the twisting on the $H$-comodule algebras $A[\mathbb{C}^4]$ and $A[\mathbb{R}^4]$ is not trivial, as shown by the following lemmata.

Lemma 5.1. The algebra relations in the $H$-comodule algebra $A[\mathbb{C}^4]$ are twisted into
$$[z_3, z_4] = i\hbar(\alpha + \beta) z_1 z_2, \qquad [z_3^*, z_4^*] = i\hbar(\alpha + \beta) z_1^* z_2^*, \tag{5.7}$$
$$[z_3, z_3^*] = i\hbar\alpha z_1 z_1^* - i\hbar\beta z_2 z_2^*, \qquad [z_4, z_4^*] = i\hbar\beta z_1 z_1^* - i\hbar\alpha z_2 z_2^*, \tag{5.8}$$
with all other relations left unchanged. In particular, the generators $z_1$, $z_2$ and their conjugates remain central in the deformed algebra.

Proof. The cocycle-twisted product on the $H$-comodule algebra $A[\mathbb{C}^4]$ is defined by the formula (1.14); the corresponding algebra relations can be expressed using the “universal R-matrix” of Eq. (1.13), namely
$$z_j \cdot_F z_l = R(z_l{}^{(-1)}, z_j{}^{(-1)})\, z_l{}^{(0)} \cdot_F z_j{}^{(0)}, \qquad z_j \cdot_F z_l^* = R(z_l^*{}^{(-1)}, z_j{}^{(-1)})\, z_l^*{}^{(0)} \cdot_F z_j{}^{(0)}. \tag{5.9}$$
One finds in particular that the R-matrix has the values
$$R(t_1^*, t_1) = 2F^{-1}(t_1^*, t_1) = -i\hbar\alpha, \qquad R(t_2^*, t_2) = 2F^{-1}(t_2^*, t_2) = i\hbar\beta, \tag{5.10}$$
and gives zero when evaluated on all other pairs of generators. By explicitly computing Eqs. (5.9) (and omitting the product symbol $\cdot_F$), one finds the relations as stated in the lemma.

We denote by $A[\mathbb{C}^4_\hbar]$ the ∗-algebra generated by $\{z_j, z_j^* \mid j = 1, \ldots, 4\}$ modulo the relations (5.7) and (5.8). This makes $A[\mathbb{C}^4_\hbar]$ into a left $H_F$-comodule ∗-algebra.

Lemma 5.2. The algebra relations in the $H$-comodule algebra $A[\mathbb{R}^4]$ are twisted into
$$[\zeta_1^*, \zeta_1] = i\hbar\alpha, \qquad [\zeta_2^*, \zeta_2] = -i\hbar\beta, \tag{5.11}$$
with vanishing commutators between all other pairs of generators.

Proof. The product in $A[\mathbb{R}^4]$ is twisted using the formula (1.14). Once again omitting the product symbol $\cdot_F$, the corresponding algebra relations are computed to be those as stated.

We denote by $A[\mathbb{R}^4_\hbar]$ the algebra generated by $\zeta_1, \zeta_2, \zeta_1^*, \zeta_2^*$, modulo the relations (5.11). This makes $A[\mathbb{R}^4_\hbar]$ into a left $H_F$-comodule ∗-algebra.

Remark 5.3. Since the generators $z_1$, $z_2$ and their conjugates $z_1^*$, $z_2^*$ remain central in the algebra $A[\mathbb{C}^4_\hbar]$, we see immediately from Lemma 2.2 that the localized twistor algebra has the form $A[\mathbb{R}^4_\hbar] \otimes A[\mathbb{CP}^1]$. Only the base $\mathbb{R}^4_\hbar$ of the twistor fibration is deformed; the typical fiber $\mathbb{CP}^1$ remains classical.
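The first relation in (5.11) can be traced through the R-matrix form (5.9) of the twisted product. The following sketch assumes two properties that are not spelled out in this section: the normalisation $R(1, h) = R(h, 1) = \epsilon(h)$, and the antisymmetry $R(t_1, t_1^*) = -R(t_1^*, t_1)$ inherited from the anti-symmetric matrix $\Theta$. It uses the coaction (5.6) and its conjugate $\zeta_1^* \mapsto 1 \otimes \zeta_1^* + t_1^* \otimes 1$.

```latex
\begin{align*}
\zeta_1^* \cdot_F \zeta_1
  &= R\bigl(\zeta_1{}^{(-1)}, \zeta_1^*{}^{(-1)}\bigr)\,
     \zeta_1{}^{(0)} \cdot_F \zeta_1^*{}^{(0)} \\
  &= R(1,1)\,\zeta_1 \cdot_F \zeta_1^*
     + R(1, t_1^*)\,\zeta_1
     + R(t_1, 1)\,\zeta_1^*
     + R(t_1, t_1^*)\,1 \\
  &= \zeta_1 \cdot_F \zeta_1^* + R(t_1, t_1^*),
\end{align*}
so that
\[
  [\zeta_1^*, \zeta_1] = R(t_1, t_1^*) = -R(t_1^*, t_1).
\]
```

By (5.10) the right-hand side is the constant appearing in (5.11). The same steps with $t_2$, $t_2^*$ in place of $t_1$, $t_1^*$ give $[\zeta_2^*, \zeta_2] = -R(t_2^*, t_2)$, again under the stated antisymmetry assumption.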
The canonical differential calculi described in Sec. 3.1 are also deformed using this cocycle twisting procedure. The relations in the deformed calculi are given in the following lemmata.

Lemma 5.4. The twisted differential calculus $\Omega(\mathbb{C}^4_\hbar)$ is generated by the degree zero elements $z_j$, $z_l^*$ and the degree one elements $dz_j$, $dz_l^*$ for $j, l = 1, \ldots, 4$, subject to the bimodule relations between functions and one-forms
$$[z_3, dz_4] = i\hbar(\alpha + \beta) z_1\, dz_2, \qquad [z_3^*, dz_4^*] = i\hbar(\alpha + \beta) z_1^*\, dz_2^*,$$
$$[z_4, dz_3] = -i\hbar(\alpha + \beta) z_2\, dz_1, \qquad [z_4^*, dz_3^*] = -i\hbar(\alpha + \beta) z_2^*\, dz_1^*,$$
$$[z_3, dz_3^*] = i\hbar\alpha z_1\, dz_1^* - i\hbar\beta z_2\, dz_2^*, \qquad [z_4, dz_4^*] = i\hbar\beta z_1\, dz_1^* - i\hbar\alpha z_2\, dz_2^*$$
and the anti-commutation relations between one-forms
$$\{dz_3, dz_4\} = i\hbar(\alpha + \beta)\, dz_1\, dz_2, \qquad \{dz_3^*, dz_4^*\} = i\hbar(\alpha + \beta)\, dz_1^*\, dz_2^*,$$
$$\{dz_3, dz_3^*\} = i\hbar\alpha\, dz_1\, dz_1^* - i\hbar\beta\, dz_2\, dz_2^*, \qquad \{dz_4, dz_4^*\} = i\hbar\beta\, dz_1\, dz_1^* - i\hbar\alpha\, dz_2\, dz_2^*,$$
with all other relations undeformed.

Proof. One views the classical calculus $\Omega(\mathbb{C}^4)$ as a left $H$-comodule algebra and accordingly computes the deformed product using the twisting cocycle $F$. Since the exterior derivative $d$ commutes with the $H$-coaction (5.4), it is straightforward to observe that the (anti-)commutation relations in the deformed calculus $\Omega(\mathbb{C}^4_\hbar)$ are just the same as the algebra relations in $A[\mathbb{C}^4_\hbar]$ but with $d$ inserted appropriately.

Lemma 5.5. The twisted differential calculus $\Omega(\mathbb{R}^4_\hbar)$ is generated by the degree zero elements $\zeta_1, \zeta_1^*, \zeta_2, \zeta_2^*$ and the degree one elements $d\zeta_1, d\zeta_1^*, d\zeta_2, d\zeta_2^*$. The relations in the calculus are not deformed.

Proof. Once again, the classical calculus $\Omega(\mathbb{R}^4)$ is deformed as a twisted left $H$-comodule algebra. Although the products of functions and differential forms in the calculus are indeed twisted, one finds that the extra terms which appear in the twisted product all vanish in the expressions for the (anti-)commutators (cf. [6] for full details).

In particular, we see that the vector space $\Omega^2(\mathbb{R}^4_\hbar)$ is the same as it is classically. Since the coaction of $H$ on $A[\mathbb{R}^4]$ is by isometries, the Hodge ∗-operator $* : \Omega^2(\mathbb{R}^4) \to \Omega^2(\mathbb{R}^4)$ commutes with the $H$-coaction in the sense that
$$\Delta_\pi(*\omega) = (\mathrm{id} \otimes *)\Delta_\pi(\omega), \qquad \omega \in \Omega^2(\mathbb{R}^4),$$
so there is also a Hodge operator $* : \Omega^2(\mathbb{R}^4_\hbar) \to \Omega^2(\mathbb{R}^4_\hbar)$ defined by the same formula as in the classical case. In particular, this means that the decomposition of $\Omega^2(\mathbb{R}^4_\hbar)$ into self-dual and anti-self-dual two-forms,
$$\Omega^2(\mathbb{R}^4_\hbar) = \Omega^2_+(\mathbb{R}^4_\hbar) \oplus \Omega^2_-(\mathbb{R}^4_\hbar),$$
is identical at the level of vector spaces to the corresponding decomposition in the classical case.

The above lemmata are really just special cases of the cocycle twisting procedure: recall that in fact our “quantization map” applies to every suitable $H$-covariant construction. Since the algebra $A[M_k]$ is a left $H$-comodule algebra according to Lemma 4.4, it is also deformed by the quantization. We write $A[M_{k;\hbar}]$ for the resulting cocycle-twisted $H_F$-comodule algebra.

Proposition 5.6. The coordinate ∗-algebra $A[M_{k;\hbar}]$ is generated by the matrix elements $M^j_{ab}$, $N^l_{dc}$ for $a, c = 1, \ldots, k$ and $b, d = 1, \ldots, 2k + 2$, modulo the relations
$$[M^1_{ab}, M^2_{rs}] = i\hbar(\alpha - \beta) M^3_{ab} M^4_{rs}, \qquad [{M^1_{ab}}^*, {M^2_{rs}}^*] = i\hbar(\alpha - \beta) {M^3_{ab}}^* {M^4_{rs}}^*,$$
$$[M^1_{ab}, {M^1_{rs}}^*] = i\hbar\alpha M^3_{ab} {M^3_{rs}}^* + i\hbar\beta M^4_{ab} {M^4_{rs}}^*, \qquad [M^2_{ab}, {M^2_{rs}}^*] = -i\hbar\beta M^3_{ab} {M^3_{rs}}^* - i\hbar\alpha M^4_{ab} {M^4_{rs}}^*$$
and the ∗-structure (4.6). The generators $M^3$, $M^4$, $M^{3*}$, $M^{4*}$ are central in the algebra.

Proof. From Lemma 4.4 we read off the $H$-coaction on generators $M^j$, $j = 1, \ldots, 4$, obtaining
$$M^1 \mapsto 1 \otimes M^1 - t_1^* \otimes M^3 + t_2 \otimes M^4, \qquad M^3 \mapsto 1 \otimes M^3,$$
$$M^2 \mapsto 1 \otimes M^2 - t_2^* \otimes M^3 - t_1 \otimes M^4, \qquad M^4 \mapsto 1 \otimes M^4,$$
which we extend as a ∗-algebra map. The deformed relations follow immediately from an application of the twisting formula (1.14). The coaction of $H$ on $A[M_k]$ does not depend on the matrix indices of the generators $M^j$, $N^l$, $j, l = 1, \ldots, 4$, hence neither do the twisted commutation relations. Similar computations yield the other commutation relations as stated.

In terms of the deformed product, the relations (4.4) are twisted into the relations
$$\sum_r \bigl( N^j_{dr} M^l_{rb} + N^l_{dr} M^j_{rb} \bigr) + i\hbar(\alpha + \beta)(\delta^{j1} \delta^{l2} - \delta^{j2} \delta^{l1}) = 0$$
for each $b, d = 1, \ldots, k$, where $\delta^{rs}$ is the Kronecker delta symbol. We think of $A[M_{k;\hbar}]$ as the coordinate algebra of a noncommutative space $M_{k;\hbar}$ of monads on $\mathbb{C}^4_\hbar$. Although we do not have as many evaluation maps on $A[M_{k;\hbar}]$ as we did in the classical case, we can nevertheless work with the whole family $M_{k;\hbar}$ at once.

5.2. The construction of instantons on $\mathbb{R}^4_\hbar$

Using the noncommutative space of monads $M_{k;\hbar}$, we proceed as in Sec. 4.2 to construct families of instantons, but now on the Moyal space $\mathbb{R}^4_\hbar$.

Let $\widehat{\mathbb{R}}^4$ denote the Pontryagin dual to the additive group $\mathbb{R}^4$ used in Eq. (5.1). Given a pair of complex numbers $c := (c_1, c_2) \in \mathbb{C}^2 \cong \widehat{\mathbb{R}}^4$ we define unitary elements
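$u = (u_1, u_2)$ of the algebra $H_F$ by
$$u_1 = \exp(i(c_1 t_1 + c_1^* t_1^*)), \qquad u_2 = \exp(i(c_2 t_2 + c_2^* t_2^*)). \tag{5.12}$$
It is straightforward to check that $u_1$, $u_2$ are group-like elements of (the smooth completion of) the Hopf algebra $H_F$, i.e. they transform as $\Delta(u_j) = u_j \otimes u_j$ under the coproduct $\Delta : H_F \to H_F \otimes H_F$.

Lemma 5.7. There is a canonical left action of $H_F$ on the algebra $A[M_{k;\hbar}]$ given by
$$u_1 \triangleright M^1 = M^1 - \hbar\alpha c_1 M^3, \qquad u_1 \triangleright M^2 = M^2 + \hbar\alpha c_1^* M^4,$$
$$u_1 \triangleright M^{1*} = M^{1*} - \hbar\alpha c_1^* M^{3*}, \qquad u_1 \triangleright M^{2*} = M^{2*} + \hbar\alpha c_1 M^{4*},$$
$$u_2 \triangleright M^1 = M^1 + \hbar\beta c_2^* M^4, \qquad u_2 \triangleright M^2 = M^2 + \hbar\beta c_2 M^3,$$
$$u_2 \triangleright M^{1*} = M^{1*} + \hbar\beta c_2 M^{4*}, \qquad u_2 \triangleright M^{2*} = M^{2*} + \hbar\beta c_2^* M^{3*},$$
with $u_j \triangleright M^l = M^l$ and $u_j \triangleright M^{l*} = M^{l*}$ for $l = 3, 4$.

Proof. Recall from the proof of Proposition 5.6 that $A[M_{k;\hbar}]$ is a left $H_F$-comodule algebra: it is therefore also a left $H_F$-module algebra according to the formula (1.5). Evaluating the R-matrix by expanding the exponentials as power series, one finds that
$$R(t_1, u_1) = R(t_1, i c_1^* t_1^*) = -\hbar\alpha c_1^*, \qquad R(t_1^*, u_1) = R(t_1^*, i c_1 t_1) = \hbar\alpha c_1,$$
$$R(t_2, u_2) = R(t_2, i c_2^* t_2^*) = -\hbar\beta c_2^*, \qquad R(t_2^*, u_2) = R(t_2^*, i c_2 t_2) = \hbar\beta c_2,$$
with all other combinations evaluating as zero. Using the fact that the unitaries $u_j$ are group-like elements of the Hopf algebra $H_F$, one finds the actions to be as stated.

In turn, there is an infinitesimal version of the $H_F$-action on $A[M_{k;\hbar}]$, given by
$$t_1 \triangleright M^1 = i\hbar\alpha M^3, \qquad t_1^* \triangleright M^{1*} = -i\hbar\alpha M^{3*}, \qquad t_1^* \triangleright M^2 = -i\hbar\alpha M^4, \qquad t_1 \triangleright M^{2*} = i\hbar\alpha M^{4*},$$
$$t_2^* \triangleright M^1 = -i\hbar\beta M^4, \qquad t_2 \triangleright M^{1*} = i\hbar\beta M^{4*}, \qquad t_2 \triangleright M^2 = -i\hbar\beta M^3, \qquad t_2^* \triangleright M^{2*} = i\hbar\beta M^{3*},$$
with $t_j \triangleright M^l = 0$ and $t_j \triangleright M^{l*} = 0$ for all other possible combinations of generators. Either way, we obtain a group action
$$\gamma : \widehat{\mathbb{R}}^4 \to \mathrm{Aut}\, A[M_{k;\hbar}] \tag{5.13}$$
of the Pontryagin dual $\widehat{\mathbb{R}}^4$ on the coordinate algebra $A[M_{k;\hbar}]$ by ∗-automorphisms. We also form the smash product algebra $A[M_{k;\hbar}] \rtimes H_F$ associated to the above $H_F$-action, whose multiplication is defined by the formula (1.7). With the coproduct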
$\Delta(t_j) = 1 \otimes t_j + t_j \otimes 1$ on $H_F$, we find in particular the formula
$$(M^j_{ab} \otimes t_r)(M^l_{cd} \otimes t_s) = M^j_{ab} M^l_{cd} \otimes t_r t_s + M^j_{ab} (t_r \triangleright M^l_{cd}) \otimes t_s,$$
with similar expressions for products involving the conjugate generators $M^{j*}$. The corresponding algebra relations between such elements are given by
$$[M^j_{ab} \otimes t_r,\ M^l_{cd} \otimes t_s] = [M^j_{ab}, M^l_{cd}] \otimes t_r t_s + M^j_{ab} (t_r \triangleright M^l_{cd}) \otimes t_s - M^l_{cd} (t_s \triangleright M^j_{ab}) \otimes t_r \tag{5.14}$$
for $j, l = 1, \ldots, 4$ and $r, s = 1, 2$, with similar formulæ occurring when the generators $M^j$ and $t_r$ are replaced by their conjugates. Of course, these relations are just a small part of the full algebra structure in the smash product $A[M_{k;\hbar}] \rtimes H_F$, but those in Eq. (5.14) are the ones that we will need later on in the paper.

Remark 5.8. This situation is a special case of Example 1.1. Recall that, upon making suitable completions of our algebras, we think of the smash product algebra
$$A[M_{k;\hbar}] \rtimes H_F = A[M_{k;\hbar}] \rtimes A[\mathbb{R}^4]$$
as an algebraic version of the crossed product algebra $A[M_{k;\hbar}] \rtimes_\gamma \widehat{\mathbb{R}}^4$ determined by the action (5.13).

Thanks to the functorial nature of the cocycle twisting, mutatis mutandis the ADHM construction goes through as described in Sec. 4.2. In the following, we highlight the main differences which arise as a consequence of the quantization procedure. The next lemma takes care of an important technical point: as well as twisting the relations in the algebras $A[M_k]$ and $A[\mathbb{C}^4]$, we also have to deform the cross-relations in the tensor product algebra $A[M_k] \otimes A[\mathbb{C}^4]$.

Lemma 5.9. The algebra structure of the twisted tensor product algebra $A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{C}^4_\hbar]$ is given by the relations in the respective subalgebras $A[M_{k;\hbar}]$ and $A[\mathbb{C}^4_\hbar]$ determined above, together with the cross-relations
$$z_3 M^1 = M^1 z_3 - i\hbar\beta M^4 z_2, \qquad z_3 M^2 = M^2 z_3 - i\hbar\alpha M^4 z_1,$$
$$z_4 M^1 = M^1 z_4 + i\hbar\alpha M^3 z_2, \qquad z_4 M^2 = M^2 z_4 + i\hbar\beta M^3 z_1,$$
$$z_3^* M^1 = M^1 z_3^* + i\hbar\alpha M^3 z_1^*, \qquad z_3^* M^2 = M^2 z_3^* - i\hbar\beta M^3 z_2^*,$$
$$z_4^* M^1 = M^1 z_4^* + i\hbar\beta M^4 z_1^*, \qquad z_4^* M^2 = M^2 z_4^* - i\hbar\alpha M^4 z_2^*$$
and their conjugates. The generators $z_1$, $z_2$, $M^3$, $M^4$ are central.

Proof. The classical algebra $A[M_k] \otimes A[\mathbb{C}^4]$ is a left comodule ∗-algebra under the tensor product $H_F$-coaction defined by Eq. (1.4). The twisted product is determined by the formula (1.14), with the non-trivial cross-terms in the deformed algebra being the ones stated in the lemma.

We denote the deformed algebra by $A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{C}^4_\hbar]$, with the symbol $\underline{\otimes}$ to remind us that the tensor product algebra structure is not the usual one, but has been twisted as well.
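The eight cross-relations of Lemma 5.9 can be regenerated mechanically from the coactions (5.4), (5.5), the coaction on $M^1$, $M^2$ from the proof of Proposition 5.6, and the R-matrix values (5.10). The sketch below assumes a relation of the same shape as (5.9) for mixed products, namely $z\,M = R(M^{(-1)}, z^{(-1)})\, M^{(0)} z^{(0)}$, and it assumes the reversed R-matrix entries by antisymmetry; neither is quoted from the text. Coefficients are stored as integer pairs `(a, b)` standing for $i(a\alpha + b\beta)$, up to whatever overall normalisation (5.10) carries. All names (`COACTION_Z`, `cross_terms`, and so on) are hypothetical.

```python
# First-order coaction data: (sign, Hopf-algebra generator, resulting generator).
# The coinvariant term "1 (x) generator" is implicit and contributes m z itself.
COACTION_Z = {
    "z3":  [(+1, "t1*", "z1"),  (+1, "t2*", "z2")],
    "z4":  [(-1, "t2",  "z1"),  (+1, "t1",  "z2")],
    "z3*": [(+1, "t1",  "z1*"), (+1, "t2",  "z2*")],
    "z4*": [(-1, "t2*", "z1*"), (+1, "t1*", "z2*")],
}
COACTION_M = {
    "M1": [(-1, "t1*", "M3"), (+1, "t2", "M4")],
    "M2": [(-1, "t2*", "M3"), (-1, "t1", "M4")],
}

# R on generators as (a, b), meaning i*(a*alpha + b*beta); cf. (5.10).
R = {
    ("t1*", "t1"): (-1, 0),
    ("t2*", "t2"): (0, +1),
    ("t1", "t1*"): (+1, 0),   # ASSUMED by antisymmetry, not quoted
    ("t2", "t2*"): (0, -1),   # ASSUMED by antisymmetry, not quoted
}

def cross_terms(z, m):
    """Correction terms in  z m = m z + sum i(a*alpha + b*beta) m' z'.

    Returns a dict {(m', z'): (a, b)} containing only non-zero terms.
    """
    terms = {}
    for sign_m, h_m, m0 in COACTION_M[m]:
        for sign_z, h_z, z0 in COACTION_Z[z]:
            a, b = R.get((h_m, h_z), (0, 0))
            if (a, b) == (0, 0):
                continue
            old_a, old_b = terms.get((m0, z0), (0, 0))
            sign = sign_m * sign_z
            terms[(m0, z0)] = (old_a + sign * a, old_b + sign * b)
    return terms

for z in COACTION_Z:
    for m in COACTION_M:
        print(f"{z} {m} = {m} {z} + {cross_terms(z, m)}")
```

Under these assumptions each pair produces exactly one correction term, and the signs can be compared entry by entry with the list in Lemma 5.9; for instance the pair `("z3", "M1")` yields a single term on `("M4", "z2")` with coefficient `(0, -1)`.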
Just as in the classical situation, we have a pair of matrices $\sigma_z$ and $\tau_z$,
$$\sigma_z = \sum_j M^j \otimes z_j, \qquad \tau_z = \sum_j N^j \otimes z_j,$$
whose entries this time live in the twisted algebra $A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{C}^4_\hbar]$. The resulting matrix $V := (\sigma_z \ \ \sigma_{J(z)})$ is a $(2k + 2) \times 2k$ matrix with entries in $A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{C}^4_\hbar]$. We set $\rho^2 := V^* V$. From the projection $Q := V \rho^{-2} V^*$ we construct the complementary matrix $P := 1_{2k+2} - Q$, which has entries in the algebra $A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{R}^4_\hbar]$.

It is clear that this matrix $P$ is a self-adjoint idempotent, $P^2 = P = P^*$, but it does not define an honest family of projections in the sense of Definition 3.2. Recall that, to define such a family, we need a matrix with entries in an algebra of the form $A \otimes A[\mathbb{R}^4_\hbar]$ for some “parameter algebra” $A$, whereas the quantization procedure has produced a projection $Q$ with entries in a twisted tensor product $A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{R}^4_\hbar]$. We may nevertheless recover a genuine family of projections using the following lemma, in which we use the Sweedler notation $Z \mapsto Z^{(-1)} \otimes Z^{(0)}$ for the left coaction $A[\mathbb{C}^4_\hbar] \to H_F \otimes A[\mathbb{C}^4_\hbar]$ defined in Eqs. (5.4) and (5.5).

Lemma 5.10. There is a canonical ∗-algebra map
$$\mu : A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{C}^4_\hbar] \to (A[M_{k;\hbar}] \rtimes H_F) \otimes A[\mathbb{C}^4_\hbar]$$
defined by $\mu(M \otimes Z) = M \otimes Z^{(-1)} \otimes Z^{(0)}$ for each $M \in A[M_{k;\hbar}]$ and $Z \in A[\mathbb{C}^4_\hbar]$.

Proof. This follows from a straightforward verification. One checks that
$$\begin{aligned}
\mu(M \otimes Z)\, \mu(M' \otimes Z') &= (M \otimes Z^{(-1)} \otimes Z^{(0)})(M' \otimes Z'^{(-1)} \otimes Z'^{(0)}) \\
&= M (Z^{(-1)}{}_{(1)} \triangleright M') \otimes Z^{(-1)}{}_{(2)} Z'^{(-1)} \otimes Z^{(0)} Z'^{(0)} \\
&= R(M'^{(-1)}, Z^{(-1)}{}_{(1)})\, M M'^{(0)} \otimes Z^{(-1)}{}_{(2)} Z'^{(-1)} \otimes Z^{(0)} Z'^{(0)} \\
&= R(M'^{(-1)}, Z^{(-1)})\, M M'^{(0)} \otimes (Z^{(0)})^{(-1)} Z'^{(-1)} \otimes (Z^{(0)})^{(0)} Z'^{(0)} \\
&= \mu\bigl( R(M'^{(-1)}, Z^{(-1)})\, M M'^{(0)} \otimes Z^{(0)} Z' \bigr) \\
&= \mu((M \otimes Z)(M' \otimes Z')),
\end{aligned}$$
so that $\mu$ is an algebra map, as well as
$$\begin{aligned}
(\mu(M \otimes Z))^* &= (M \otimes Z^{(-1)} \otimes Z^{(0)})^* = (M \otimes Z^{(-1)})^* \otimes Z^{(0)*} \\
&= R(M^{(-1)*}, (Z^{(-1)}{}_{(1)})^*)\, (M^{(0)*} \otimes (Z^{(-1)}{}_{(2)})^*) \otimes Z^{(0)*} \\
&= \mu\bigl( R(M^{(-1)*}, Z^{(-1)*})\, (M^{(0)*} \otimes Z^{(0)*}) \bigr) = \mu((M \otimes Z)^*),
\end{aligned}$$
so that $\mu$ respects the ∗-structures as well.
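As a concrete instance of Lemma 5.10, inserting the coaction (5.4), (5.5) into the defining formula $\mu(M \otimes Z) = M \otimes Z^{(-1)} \otimes Z^{(0)}$ gives the following values on the generators $z_1, \ldots, z_4$. Here $M$ stands for an arbitrary element of the twisted monad algebra; nothing beyond the stated coaction is assumed.

```latex
\begin{align*}
\mu(M \otimes z_1) &= M \otimes 1 \otimes z_1, \\
\mu(M \otimes z_2) &= M \otimes 1 \otimes z_2, \\
\mu(M \otimes z_3) &= M \otimes t_1^* \otimes z_1 + M \otimes t_2^* \otimes z_2 + M \otimes 1 \otimes z_3, \\
\mu(M \otimes z_4) &= -\,M \otimes t_2 \otimes z_1 + M \otimes t_1 \otimes z_2 + M \otimes 1 \otimes z_4 .
\end{align*}
```

The images of $z_3$ and $z_4$ pick up components in the Hopf algebra factor, which is how the map trades the twisted cross-relations for the smash product structure, while $z_1$ and $z_2$ are coinvariant and pass through unchanged.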
Remark 5.11. Lemma 5.10 is an example of Majid’s “bosonization” construction [22], which converts noncommutative “braid statistics” (in our case described by the twisted tensor product $\underline{\otimes}$) into commutative “ordinary statistics” (described by the usual tensor product $\otimes$).

As a consequence of Lemma 5.10, we find that there are maps
$$\tilde\sigma_z : \mathcal{H} \otimes A[\mathbb{C}^4_\hbar] \to (A[M_{k;\hbar}] \rtimes H_F) \otimes \mathcal{K} \otimes A[\mathbb{C}^4_\hbar], \tag{5.15}$$
$$\tilde\tau_z : \mathcal{K} \otimes A[\mathbb{C}^4_\hbar] \to (A[M_{k;\hbar}] \rtimes H_F) \otimes \mathcal{L} \otimes A[\mathbb{C}^4_\hbar], \tag{5.16}$$
defined by composing $\sigma_z$ and $\tau_z$ with the map $\mu$. Explicitly, these maps are given by
$$\tilde\sigma_z := \sum_r M^r \otimes z_r{}^{(-1)} \otimes z_r{}^{(0)}, \qquad \tilde\tau_z := \sum_r N^r \otimes z_r{}^{(-1)} \otimes z_r{}^{(0)},$$
which are, respectively, $(2k + 2) \times k$ and $k \times (2k + 2)$ matrices with entries in the noncommutative algebra $(A[M_{k;\hbar}] \rtimes H_F) \otimes A[\mathbb{C}^4_\hbar]$. With this in mind, we form the $(2k + 2) \times 2k$ matrix $\tilde V := (\tilde\sigma_z \ \ \tilde\sigma_{J(z)})$.

Proposition 5.12. The $(2k + 2) \times (2k + 2)$ matrix $\tilde Q = \tilde V \tilde\rho^{-2} \tilde V^*$ is a projection, $\tilde Q^2 = \tilde Q = \tilde Q^*$, with entries in the algebra $(A[M_{k;\hbar}] \rtimes H_F) \otimes A[\mathbb{C}^4_\hbar]$ and trace equal to $2k$.

Proof. The fact that $\tilde Q$ is a projection follows from the fact that $Q$ is a projection and that $\mu : A[M_{k;\hbar}] \underline{\otimes} A[\mathbb{C}^4_\hbar] \to (A[M_{k;\hbar}] \rtimes H_F) \otimes A[\mathbb{C}^4_\hbar]$ is a ∗-algebra map. By construction, the entries of the matrix $\tilde\rho^2$ are central in the algebra $(A[M_{k;\hbar}] \rtimes H_F) \otimes A[\mathbb{C}^4_\hbar]$ (this follows from the fact that its matrix entries are coinvariant under the left $H_F$-coaction), from which it follows that the trace computation in Proposition 4.5 is valid in the noncommutative case as well [5].

From the projection $\tilde Q$ we construct as before the complementary projection $\tilde P := 1_{2k+2} - \tilde Q$. It has entries in the algebra $(A[M_{k;\hbar}] \rtimes H_F) \otimes A[\mathbb{R}^4_\hbar]$ and has trace equal to two. In analogy with Definition 3.2, the finitely generated projective module $\tilde E := \tilde P((A[M_{k;\hbar}] \rtimes H_F) \otimes A[\mathbb{R}^4_\hbar])^{2k+2}$ defines a family of rank two vector bundles over $\mathbb{R}^4_\hbar$ parametrized by the noncommutative algebra $A[M_{k;\hbar}] \rtimes H_F$. We equip this family of vector bundles with the family of Grassmann connections associated to the projection $\tilde P$.

Proposition 5.13. The curvature $F = \tilde P((\mathrm{id} \otimes d)\tilde P)^2$ of the Grassmann family of connections $\nabla := \tilde P \circ (\mathrm{id} \otimes d)$ is anti-self-dual.

Proof. From Lemma 5.5, we know that the space of two-forms $\Omega^2(\mathbb{R}^4_\hbar)$ and the Hodge ∗-operator $* : \Omega^2(\mathbb{R}^4_\hbar) \to \Omega^2(\mathbb{R}^4_\hbar)$ are undeformed and equal to their classical counterparts; similarly for the decomposition $\Omega^2(\mathbb{R}^4_\hbar) = \Omega^2_+(\mathbb{R}^4_\hbar) \oplus \Omega^2_-(\mathbb{R}^4_\hbar)$ into
self-dual and anti-self-dual two-forms. This identification of the “quantum” with the “classical” spaces of two-forms survives the tensoring with the parameter space $A[M_{k;\hbar}] \rtimes H_F$, which yields that $(A[M_{k;\hbar}] \rtimes H_F) \otimes \Omega^2_\pm(\mathbb{R}^4_\hbar)$ and $(A[M_k] \otimes H) \otimes \Omega^2_\pm(\mathbb{R}^4)$ are isomorphic as vector spaces. Computing the curvature $F$ in exactly the same way as in Proposition 4.6, we see that it must be anti-self-dual, since the same is true in the classical case.

This Grassmann family of connections is a noncommutative deformation of the family of instantons with topological charge $k$ constructed by the classical ADHM construction. In Proposition 5.13 we only show that the deformed ADHM construction produces families of connections with anti-self-dual curvature (which are instantons in the sense of Definition 3.6) and make no claim about whether or not they have finite energy. Since it is mathematically quite difficult to make sense of the finite energy condition (3.2) for instantons on the Moyal plane $\mathbb{R}^4_\hbar$ and of its conformal compactification, we choose to avoid the subject altogether and refer instead to [24], for example.

5.3. The Moyal-deformed ADHM equations

The noncommutative ADHM construction of the previous section produced families of instantons on $\mathbb{R}^4_\hbar$ parametrized by the noncommutative algebra $A[M_{k;\hbar}] \rtimes H_F$. We interpret the latter as an algebra of coordinate functions on some underlying “quantum” parameter space, within which we shall seek a space of classical parameters. To this end, we introduce elements of $A[M_{k;\hbar}] \rtimes H_F$ defined by
\[ \widetilde M^1_{ab} := M^1_{ab} \otimes 1 + \tfrac12 M^3_{ab} \otimes t_1^* - \tfrac12 M^4_{ab} \otimes t_2, \qquad \widetilde M^3_{ab} := M^3_{ab} \otimes 1, \]
\[ \widetilde M^2_{ab} := M^2_{ab} \otimes 1 + \tfrac12 M^3_{ab} \otimes t_2^* + \tfrac12 M^4_{ab} \otimes t_1, \qquad \widetilde M^4_{ab} := M^4_{ab} \otimes 1 \]
for each $a = 1, \ldots, k$ and $b = 1, \ldots, 2k+2$, together with their conjugates $\widetilde M^{j\,*}_{ab}$, $j = 1, \ldots, 4$.

Definition 5.14. We write $A[M(k;\hbar)]$ for the subalgebra of $A[M_{k;\hbar}] \rtimes H_F$ generated by the elements $\widetilde M^j_{ab}$, $\widetilde M^{l\,*}_{dc}$, $j, l = 1, \ldots, 4$.

Proposition 5.15. The algebra $A[M(k;\hbar)]$ is a commutative ∗-subalgebra of the smash product $A[M_{k;\hbar}] \rtimes H_F$.

Proof. This follows from direct computation. The generators $\widetilde M^3$ and $\widetilde M^4$ are clearly central. On the other hand, we also have
\[ \widetilde M^1 \widetilde M^2 = M^1 M^2 \otimes 1 + \tfrac12 M^1 M^3 \otimes t_2^* + \tfrac12 M^1 M^4 \otimes t_1 + \tfrac12 M^3 M^2 \otimes t_1^* - \tfrac12 M^4 M^2 \otimes t_2 - \tfrac12 i\alpha M^3 M^4 \otimes 1 + \tfrac12 i\beta M^4 M^3 \otimes 1, \]
\[ \widetilde M^2 \widetilde M^1 = M^2 M^1 \otimes 1 + \tfrac12 M^3 M^1 \otimes t_2^* + \tfrac12 M^4 M^1 \otimes t_1 + \tfrac12 M^2 M^3 \otimes t_1^* - \tfrac12 M^2 M^4 \otimes t_2 + \tfrac12 i\alpha M^4 M^3 \otimes 1 - \tfrac12 i\beta M^3 M^4 \otimes 1, \]
from which it follows that the commutator is given by
\[ [\widetilde M^1, \widetilde M^2] = [M^1, M^2] \otimes 1 - i(\alpha - \beta) M^3 M^4 \otimes 1 = 0. \]
All other commutators are shown to vanish in the same way.

Although we have made a change of generators, this does not affect the family of instantons constructed in the previous section. In order to show this, let $\triangleright : H_F \otimes A[M(k;\hbar)] \to A[M(k;\hbar)]$ be the left action of $H_F$ on $A[M(k;\hbar)]$ defined on generators by
\[ t_1 \triangleright \widetilde M^1 = i\alpha \widetilde M^3, \qquad t_1^* \triangleright \widetilde M^{1*} = -i\alpha \widetilde M^{3*}, \qquad t_1^* \triangleright \widetilde M^2 = -i\alpha \widetilde M^4, \qquad t_1 \triangleright \widetilde M^{2*} = i\alpha \widetilde M^{4*}, \]
\[ t_2^* \triangleright \widetilde M^1 = -i\beta \widetilde M^4, \qquad t_2 \triangleright \widetilde M^{1*} = i\beta \widetilde M^{4*}, \qquad t_2 \triangleright \widetilde M^2 = -i\beta \widetilde M^3, \qquad t_2^* \triangleright \widetilde M^{2*} = i\beta \widetilde M^{3*}, \]
together with $t_j \triangleright \widetilde M^l = 0$ and $t_j \triangleright \widetilde M^{l*} = 0$ for $l = 3, 4$. Let us write $A[M(k;\hbar)] \rtimes H_F$ for the smash product algebra associated to the action $\triangleright$.

Theorem 5.16. There is a ∗-algebra isomorphism $\phi : A[M(k;\hbar)] \rtimes H_F \to A[M_{k;\hbar}] \rtimes H_F$ defined for each $h \in H_F$ by
\[ \widetilde M^1_{ab} \otimes h \mapsto M^1_{ab} \otimes h + \tfrac12 M^3_{ab} \otimes t_1^* h - \tfrac12 M^4_{ab} \otimes t_2 h, \qquad \widetilde M^3_{ab} \otimes h \mapsto M^3_{ab} \otimes h, \]
\[ \widetilde M^2_{ab} \otimes h \mapsto M^2_{ab} \otimes h + \tfrac12 M^3_{ab} \otimes t_2^* h + \tfrac12 M^4_{ab} \otimes t_1 h, \qquad \widetilde M^4_{ab} \otimes h \mapsto M^4_{ab} \otimes h \]
and extended as a ∗-algebra map.

Proof. It is clear that this map is an isomorphism of vector spaces with inverse
\[ M^1_{ab} \otimes h \mapsto \widetilde M^1_{ab} \otimes h - \tfrac12 \widetilde M^3_{ab} \otimes t_1^* h + \tfrac12 \widetilde M^4_{ab} \otimes t_2 h, \qquad M^3_{ab} \otimes h \mapsto \widetilde M^3_{ab} \otimes h, \]
\[ M^2_{ab} \otimes h \mapsto \widetilde M^2_{ab} \otimes h - \tfrac12 \widetilde M^3_{ab} \otimes t_2^* h - \tfrac12 \widetilde M^4_{ab} \otimes t_1 h, \qquad M^4_{ab} \otimes h \mapsto \widetilde M^4_{ab} \otimes h. \]
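The claim that the two sets of formulas are mutually inverse can be checked mechanically on generators. The following Python sketch treats the tilded and untilded generators and the elements of $H_F$ as formal labels only; the names `PHI`, `PHI_INV`, `apply_map` and `round_trip` are hypothetical. It assumes that the coefficient of $M^4_{ab} \otimes t_1 h$ in the image of $\widetilde M^2_{ab} \otimes h$ is $+\tfrac12$, in agreement with the definition of $\widetilde M^2_{ab}$ given before Definition 5.14. It tests linear invertibility only, not the homomorphism property.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Each rule lists the image of generator j as triples
# (coefficient, j', letters of H_F multiplied onto h from the left).
# PHI acts on tilded generators ("Mt"), PHI_INV on untilded ones ("M").
PHI = {
    1: [(1, 1, ()), (HALF, 3, ("t1*",)), (-HALF, 4, ("t2",))],
    2: [(1, 2, ()), (HALF, 3, ("t2*",)), (HALF, 4, ("t1",))],  # assumed sign: +1/2
    3: [(1, 3, ())],
    4: [(1, 4, ())],
}
PHI_INV = {
    1: [(1, 1, ()), (-HALF, 3, ("t1*",)), (HALF, 4, ("t2",))],
    2: [(1, 2, ()), (-HALF, 3, ("t2*",)), (-HALF, 4, ("t1",))],
    3: [(1, 3, ())],
    4: [(1, 4, ())],
}


def apply_map(rules, source, target, element):
    """Apply the linear map `rules` to a dict {(family, j, word): coefficient}."""
    out = {}
    for (family, j, word), coeff in element.items():
        if family != source:
            raise ValueError("element is not in the domain of this map")
        for c, j2, prefix in rules[j]:
            key = (target, j2, prefix + word)
            out[key] = out.get(key, 0) + coeff * c
    # Drop terms whose coefficients cancelled.
    return {key: value for key, value in out.items() if value != 0}


def round_trip(j):
    """Return (start, PHI_INV(PHI(start))) for start = Mt^j (x) h."""
    start = {("Mt", j, ("h",)): Fraction(1)}
    image = apply_map(PHI, "Mt", "M", start)
    return start, apply_map(PHI_INV, "M", "Mt", image)


if __name__ == "__main__":
    for j in range(1, 5):
        start, back = round_trip(j)
        print(j, back == start)
```

With the opposite sign in the rule for generator 2 of `PHI`, the round trip would leave a residual term in $\widetilde M^4_{ab} \otimes t_1 h$, so the sketch also serves as a consistency check on the signs.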
By definition, the map $\phi$ is a ∗-algebra homomorphism on the subalgebra $A[M(k;\hbar)]$, so we just have to check that it preserves the cross-relations between $A[M(k;\hbar)]$ and the subalgebra $H_F$. This is straightforward to verify: one has for example that
\[ \phi(1 \otimes t_1)\,\phi(\widetilde M^1_{ab} \otimes 1) = (1 \otimes t_1)\big(M^1_{ab} \otimes 1 + \tfrac12 M^3_{ab} \otimes t_1^* - \tfrac12 M^4_{ab} \otimes t_2\big) \]
\[ = M^1_{ab} \otimes t_1 + \tfrac12 M^3_{ab} \otimes t_1^* t_1 - \tfrac12 M^4_{ab} \otimes t_2 t_1 + (t_1 \triangleright M^1_{ab}) \otimes 1 \]
\[ = \phi\big(\widetilde M^1_{ab} \otimes t_1 + (t_1 \triangleright \widetilde M^1_{ab}) \otimes 1\big) = \phi\big((1 \otimes t_1)(\widetilde M^1_{ab} \otimes 1)\big). \]
The remaining relations are verified in the same way.

Our goal is now to see that the parameters corresponding to the subalgebra $H_F$ can be removed and that there is a family of instantons parametrized by the commutative algebra $A[M(k;\hbar)]$. This follows from the fact that there is a right coaction
\[ \delta_R : A[M(k;\hbar)] \rtimes H_F \to (A[M(k;\hbar)] \rtimes H_F) \otimes H_F, \qquad \delta_R := \mathrm{id} \otimes \Delta, \tag{5.17} \]
where $\Delta : H_F \to H_F \otimes H_F$ is the coproduct on the Hopf algebra $H_F$. This coaction is by “gauge transformations”, in the sense that the projections $\widetilde P \otimes 1$ and $\delta_R(\widetilde P)$ are unitarily equivalent in the matrix algebra $M_{2k+2}((A[M(k;\hbar)] \rtimes H_F) \otimes H_F)$ and so they define gauge equivalent families of instantons [5]. This means that the parameters determined by the subalgebra $H_F$ in $A[M(k;\hbar)] \rtimes H_F$ are just gauge parameters and so they may be removed. Indeed, by passing to the subalgebra of $A[M(k;\hbar)] \rtimes H_F$ consisting of coinvariant elements under the coaction (5.17), viz.
\[ A[M(k;\hbar)] \cong \{ a \in A[M(k;\hbar)] \rtimes H_F \mid \delta_R(a) = a \otimes 1 \}, \]
we obtain a projection $P_{k;\hbar}$ with entries in $A[M(k;\hbar)] \otimes A[\mathbb{R}^4_\hbar]$. The precise construction of the projection $P_{k;\hbar}$ goes exactly as in [5], as does the proof of the fact that the Grassmann family of connections $\nabla = P_{k;\hbar} \circ (\mathrm{id} \otimes d)$ has anti-self-dual curvature and hence defines a family of instantons on $\mathbb{R}^4_\hbar$.

The commutative algebra $A[M(k;\hbar)]$ is the algebra of coordinate functions on a classical space of monads $M(k;\hbar)$. For each point $x \in M(k;\hbar)$ there is an evaluation map
\[ \mathrm{ev}_x \otimes \mathrm{id} : A[M(k;\hbar)] \otimes A[\mathbb{C}^4_\hbar] \to A[\mathbb{C}^4_\hbar], \]
which in turn defines a self-conjugate monad over the noncommutative space $\mathbb{C}^4_\hbar$. In analogy with Remark 4.3, the matrices $(\mathrm{ev}_x \otimes \mathrm{id})\tilde\sigma_z$ and $(\mathrm{ev}_x \otimes \mathrm{id})\tilde\tau_z$ determine a complex of free right $A[\mathbb{C}^4_\hbar]$-modules
\[ 0 \to H \otimes A[\mathbb{C}^4_\hbar] \xrightarrow{(\mathrm{ev}_x \otimes \mathrm{id})\tilde\sigma_z} K \otimes A[\mathbb{C}^4_\hbar] \xrightarrow{(\mathrm{ev}_x \otimes \mathrm{id})\tilde\tau_z} L \otimes A[\mathbb{C}^4_\hbar] \to 0. \tag{5.18} \]
The same evaluation map determines a projection $(\mathrm{ev}_x \otimes \mathrm{id})P_{k;\hbar}$ and hence an instanton connection on $\mathbb{R}^4_\hbar$.
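As described in Proposition 4.7, the gauge freedom in the classical ADHM construction is determined by the choice of bases of the vector spaces H, K, L. Clearly we have this freedom in the noncommutative construction as well: we write ∼ for the equivalence relation induced on the space $M(k;\hbar)$ by such changes of basis (cf. Definition 4.8). This leads to the following explicit description of the parameter space $M(k;\hbar)$ (cf. [17]).

Theorem 5.17. For each positive integer $k \in \mathbb{Z}$, the space $M(k;\hbar)/\!\sim$ of equivalence classes of self-conjugate monads over $\mathbb{C}^4_\hbar$ is the quotient of the set of complex matrices $B_1, B_2 \in M_k(\mathbb{C})$, $I \in M_{2\times k}(\mathbb{C})$, $J \in M_{k\times 2}(\mathbb{C})$ satisfying the equations

(i) $[B_1, B_2] + IJ = 0$,
(ii) $[B_1, B_1^*] + [B_2, B_2^*] + II^* - J^*J = -i(\alpha+\beta)1_k$

by the action of U(k) given by
\[ B_1 \mapsto gB_1g^{-1}, \qquad B_2 \mapsto gB_2g^{-1}, \qquad I \mapsto gI, \qquad J \mapsto Jg^{-1} \]
for each $g \in \mathrm{U}(k)$.

Proof. Recall that we write the monad maps $\tilde\sigma_z$, $\tilde\tau_z$ as
\[ \tilde\sigma_z = \widetilde M^1 z_1 + \widetilde M^2 z_2 + \widetilde M^3 z_3 + \widetilde M^4 z_4, \qquad \tilde\tau_z = \widetilde N^1 z_1 + \widetilde N^2 z_2 + \widetilde N^3 z_3 + \widetilde N^4 z_4 \]
for constant matrices $\widetilde M^j$, $\widetilde N^l$, where $j, l = 1, \ldots, 4$. Upon expanding out the condition $\tilde\tau_z\tilde\sigma_z = 0$ and using the commutation relations in Lemma 5.1, we find the conditions
\[ \widetilde N^j \widetilde M^l + \widetilde N^l \widetilde M^j + i(\alpha+\beta)(\delta^{j1}\delta^{l2} - \delta^{j2}\delta^{l1}) = 0 \tag{5.19} \]
for $j, l = 1, \ldots, 4$. Recall from Lemma 2.2 that the typical fiber $\mathbb{CP}^1$ of the twistor fibration $\mathbb{R}^4 \times \mathbb{CP}^1$ has homogeneous coordinates $z_1, z_1^*, z_2, z_2^*$; it follows that the “line at infinity” $\ell_\infty$ is recovered by setting $z_1 = z_2 = 0$. On this line, the monad condition $\tilde\tau_z\tilde\sigma_z = 0$ becomes
\[ \widetilde N^3\widetilde M^4 + \widetilde N^4\widetilde M^3 = 0, \qquad \widetilde N^3\widetilde M^3 = 0, \qquad \widetilde N^4\widetilde M^4 = 0. \tag{5.20} \]
Moreover, when $z_1 = z_2 = 0$ we see from the relations (5.7) and (5.8) that the coordinates $z_3$, $z_4$ and their conjugates are mutually commuting, so that the line $\ell_\infty$ is classical. The self-conjugacy of the monad implies that the restricted bundle over $\ell_\infty$ is trivial, so we can argue as in [26] to show that the map $\widetilde N^3\widetilde M^4 = -\widetilde N^4\widetilde M^3$ is an isomorphism. Using these conditions we choose bases for H, K, L such that $\widetilde N^3\widetilde M^4 = 1_k$ and
\[ \widetilde M^3 = \begin{pmatrix} 1_{k\times k} \\ 0_{k\times k} \\ 0_{2\times k} \end{pmatrix}, \qquad \widetilde M^4 = \begin{pmatrix} 0_{k\times k} \\ 1_{k\times k} \\ 0_{2\times k} \end{pmatrix}, \qquad \widetilde N^3 = \begin{pmatrix} 0_{k\times k} \\ 1_{k\times k} \\ 0_{k\times 2} \end{pmatrix}^{\mathrm{tr}}, \qquad \widetilde N^4 = \begin{pmatrix} -1_{k\times k} \\ 0_{k\times k} \\ 0_{k\times 2} \end{pmatrix}^{\mathrm{tr}}. \]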
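As a sanity check on this choice of bases, the following Python sketch builds the four block matrices for a small value of $k$ and confirms the conditions (5.20) together with the normalisation $\widetilde N^3\widetilde M^4 = 1_k$. It reads $\widetilde M^3, \widetilde M^4$ as arrays with $2k+2$ rows and $k$ columns and $\widetilde N^3, \widetilde N^4$ as arrays with $k$ rows and $2k+2$ columns; this rows-by-columns reading of the block layout is an assumption of the sketch, and all helper names are hypothetical.

```python
def identity(n):
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def zeros(rows, cols):
    return [[0] * cols for _ in range(rows)]


def vstack(*blocks):
    """Stack blocks with equal column counts on top of each other."""
    return [list(row) for block in blocks for row in block]


def hstack(*blocks):
    """Place blocks with equal row counts side by side."""
    return [[x for block in blocks for x in block[r]]
            for r in range(len(blocks[0]))]


def matmul(a, b):
    return [[sum(a[r][i] * b[i][c] for i in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def chosen_bases(k):
    """Return (M3, M4, N3, N4) in the assumed rows-by-columns layout."""
    one, zero = identity(k), zeros(k, k)
    minus_one = [[-x for x in row] for row in one]
    m3 = vstack(one, zero, zeros(2, k))        # (2k+2) x k
    m4 = vstack(zero, one, zeros(2, k))        # (2k+2) x k
    n3 = hstack(zero, one, zeros(k, 2))        # k x (2k+2)
    n4 = hstack(minus_one, zero, zeros(k, 2))  # k x (2k+2)
    return m3, m4, n3, n4


if __name__ == "__main__":
    m3, m4, n3, n4 = chosen_bases(2)
    print(add(matmul(n3, m4), matmul(n4, m3)))
```

The same helpers can be reused for any $k$, since the conditions involve only the block structure and not the matrices $B_1$, $B_2$, $I$, $J$.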
Now invoking conditions (5.19) for $j = 3, 4$ and $l = 1, 2$, the remaining matrices are necessarily of the form
\[ \widetilde M^1 = \begin{pmatrix} B_1 \\ B_2 \\ J \end{pmatrix}, \qquad \widetilde M^2 = \begin{pmatrix} B_1' \\ B_2' \\ J' \end{pmatrix}, \qquad \widetilde N^1 = \begin{pmatrix} -B_2 \\ B_1 \\ I \end{pmatrix}^{\mathrm{tr}}, \qquad \widetilde N^2 = \begin{pmatrix} -B_2' \\ B_1' \\ I' \end{pmatrix}^{\mathrm{tr}}. \]
Using the conditions $\tilde\tau^*_{J(z)} = -\tilde\sigma_z$ and $\tilde\sigma^*_{J(z)} = \tilde\tau_z$, which correspond to the requirement that the monad be self-conjugate, we find that
\[ B_1' = -B_2^*, \qquad B_2' = B_1^*, \qquad J' = I^*, \qquad I' = -J^*. \]
Thus in order to satisfy the condition $\tilde\tau_z\tilde\sigma_z = 0$ it remains only to impose the conditions (5.19) in the cases $j = l = 1$ and $j = 1$, $l = 2$. The first of these is condition (i) in the theorem; the second case is equivalent to requiring
\[ [B_1, B_1^*] + [B_2, B_2^*] + II^* - J^*J + i(\alpha+\beta) = 0, \]
giving condition (ii) in the theorem. Just as in the classical case [12], it is evident that the remaining gauge freedom in this calculation is given by the stated action of U(k), whence the result.

Finally, we comment on a significant difference between the parameter space $M_k$ of instantons on the classical space $\mathbb{R}^4$ and the parameter space $M(k;\hbar)$ of instantons on the Moyal-deformed version $\mathbb{R}^4_\hbar$. Recall that, in the classical ADHM construction of Sec. 4.2, we needed to assume that the algebra-valued matrix $\rho^2$ of Eq. (4.11) is invertible. Formally adjoining an inverse $\rho^{-2}$ to the algebra $M_k(\mathbb{C}) \otimes A[M_k] \otimes A[\mathbb{R}^4]$ resulted in the deletion of a collection of points from the parameter space $M_k$. In contrast, the noncommutative ADHM construction does not require this. Using the Moyal ADHM equations themselves one shows that the matrix $\rho^2$, which after passing to the commutative parameter space has entries in the algebra $A[M(k;\hbar)] \otimes A[\mathbb{R}^4_\hbar]$, is automatically invertible (we refer to [24, 14] for a proof).

6. The Connes–Landi Noncommutative Plane $\mathbb{R}^4_\theta$

Next we turn to the construction of instantons on the noncommutative plane $\mathbb{R}^4_\theta$, which is an example of a toric noncommutative manifold (or isospectral deformation) in the sense of [10]. In particular, $\mathbb{R}^4_\theta$ is obtained as a localization of the Connes–Landi quantum four-sphere $S^4_\theta$, just as in Lemma 2.1 (cf. [20]), although here we shall obtain it directly from classical $\mathbb{R}^4$ by cocycle twisting.

6.1. Toric deformation of the space of monads

Whereas the Moyal space-time $\mathbb{R}^4_\hbar$ was obtained by cocycle twisting along an action of the group of translation symmetries of space-time, the noncommutative
space-time $\mathbb{R}^4_\theta$ is constructed by deforming the classical coordinate algebra $A[\mathbb{R}^4]$ along an action of a group of rotational symmetries. Indeed, for the twisting Hopf algebra we take $H = A[\mathbb{T}^2]$, the algebra of coordinate functions on the two-torus $\mathbb{T}^2$. It is the commutative unital algebra
\[ A[\mathbb{T}^2] := A[s_j, s_j^{-1} \mid j = 1, 2] \]
equipped with the Hopf ∗-algebra structure
\[ s_j^* = s_j^{-1}, \qquad \Delta(s_j) = s_j \otimes s_j, \qquad \epsilon(s_j) = 1, \qquad S(s_j) = s_j^{-1} \tag{6.1} \]
for $j = 1, 2$, with $\Delta$ and $\epsilon$ extended as ∗-algebra maps and $S$ extended as a ∗-anti-algebra map.

In order to deform the twistor fibration, we need to equip the various coordinate algebras with left H-comodule structures. There is a Hopf algebra projection from $A[\mathrm{GL}^+(2,\mathbb{H})]$ onto $H$, defined on generators by
\[ \pi : A[\mathrm{GL}^+(2,\mathbb{H})] \to H, \qquad \begin{pmatrix} \alpha_1 & -\alpha_2^* & \beta_1 & -\beta_2^* \\ \alpha_2 & \alpha_1^* & \beta_2 & \beta_1^* \\ \gamma_1 & -\gamma_2^* & \delta_1 & -\delta_2^* \\ \gamma_2 & \gamma_1^* & \delta_2 & \delta_1^* \end{pmatrix} \mapsto \begin{pmatrix} s_1 & 0 & 0 & 0 \\ 0 & s_1^* & 0 & 0 \\ 0 & 0 & s_2 & 0 \\ 0 & 0 & 0 & s_2^* \end{pmatrix} \tag{6.2} \]
and extended as a ∗-algebra map. Using Eq. (2.16), this projection determines a left H-coaction $\Delta_\pi : A[\mathbb{C}^4] \to H \otimes A[\mathbb{C}^4]$ by
\[ A[\mathbb{C}^4] \to H \otimes A[\mathbb{C}^4], \qquad z_j \mapsto \varsigma_j \otimes z_j, \tag{6.3} \]
extended as a ∗-algebra map, where we use the shorthand notation $(\varsigma_j) = (s_1, s_1^*, s_2, s_2^*)$ for the generators of $H$. Using the identification of generators in Eq. (2.10), this induces a coaction on the space-time algebra,
\[ A[\mathbb{R}^4] \to H \otimes A[\mathbb{R}^4], \qquad \zeta_1 \mapsto \varsigma_1\varsigma_4 \otimes \zeta_1, \qquad \zeta_2 \mapsto \varsigma_2\varsigma_4 \otimes \zeta_2, \tag{6.4} \]
and extended as a ∗-algebra map, making $A[\mathbb{R}^4]$ into a left H-comodule ∗-algebra.

As a twisting cocycle on $H$, we take the linear map defined on generators by
\[ F : H \otimes H \to \mathbb{C}, \qquad F(s_j, s_l) = \exp(i\pi\Theta_{jl}) \tag{6.5} \]
and extended as a Hopf bicharacter in the sense of Eq. (1.3). Here the deformation matrix $\Theta$ is the $2 \times 2$ real anti-symmetric matrix
\[ \Theta = (\Theta_{jl}) = \frac12 \begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix} \]
for $0 < \theta < 1$ a real parameter. It is straightforward to check using the formulæ (1.9), (1.10) and (1.12) that the product, antipode and ∗-structure on $H$ are in fact undeformed by $F$, so that $H = H_F$ as a Hopf ∗-algebra. However, the effect of the twisting on the H-comodule algebras $A[\mathbb{C}^4]$ and $A[\mathbb{R}^4]$ is non-trivial. In what
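follows we write $\eta_{jl} := F^{-2}(\varsigma_j, \varsigma_l)$, namely
\[ (\eta_{jl}) = \begin{pmatrix} 1 & 1 & \bar\mu & \mu \\ 1 & 1 & \mu & \bar\mu \\ \mu & \bar\mu & 1 & 1 \\ \bar\mu & \mu & 1 & 1 \end{pmatrix}, \qquad \mu = e^{i\pi\theta}. \tag{6.6} \]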
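The entries of (6.6) can be recomputed directly from the cocycle (6.5). The Python sketch below is a minimal illustration under one stated assumption: that extending $F$ as a bicharacter means $F(s^a, s^b) = \exp(i\pi\, a^{\mathrm{tr}}\Theta\, b)$ for exponent vectors $a, b \in \mathbb{Z}^2$, so that each $\varsigma_j$ enters through its weight $(\pm 1, 0)$ or $(0, \pm 1)$. The function name `eta_matrix` is hypothetical.

```python
import cmath


def eta_matrix(theta):
    """Return eta[j][l] = F^{-2}(sigma_j, sigma_l) for sigma = (s1, s1*, s2, s2*)."""
    # Torus weights of the generators: s1 -> (1, 0), s1* -> (-1, 0),
    # s2 -> (0, 1), s2* -> (0, -1).
    weights = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    # Deformation matrix Theta = (1/2) [[0, theta], [-theta, 0]].
    big_theta = [[0.0, theta / 2], [-theta / 2, 0.0]]
    eta = []
    for wj in weights:
        row = []
        for wl in weights:
            pairing = sum(wj[a] * big_theta[a][b] * wl[b]
                          for a in range(2) for b in range(2))
            # Assumed bicharacter: F = exp(i*pi*pairing), so F^{-2} = exp(-2*pi*i*pairing).
            row.append(cmath.exp(-2j * cmath.pi * pairing))
        eta.append(row)
    return eta


if __name__ == "__main__":
    for row in eta_matrix(0.3):
        print([f"{x:.3f}" for x in row])
```

Two properties used repeatedly in the sequel are visible in the output: $\eta_{jl}\eta_{lj} = 1$ and $\eta_{lj} = \overline{\eta_{jl}}$ for all $j, l$.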
Lemma 6.1. The relations in the H-comodule algebra $A[\mathbb{C}^4]$ are twisted into
\[ z_j z_l = \eta_{lj} z_l z_j, \qquad z_j z_l^* = \eta_{jl} z_l^* z_j, \qquad z_j^* z_l = \eta_{jl} z_l z_j^*, \qquad z_j^* z_l^* = \eta_{lj} z_l^* z_j^* \tag{6.7} \]
for each $j, l = 1, \ldots, 4$.

Proof. The cocycle-twisted product on the H-comodule algebra $A[\mathbb{C}^4]$ is defined by the formula (1.14). Just as in Lemma 5.1, the corresponding algebra relations can be expressed using the R-matrix (1.13): in this case one finds that the R-matrix takes the values
\[ R(\varsigma_j, \varsigma_l) = F^{-2}(\varsigma_j, \varsigma_l) = \eta_{jl}, \qquad R(\varsigma_j, \varsigma_l^*) = F^{-2}(\varsigma_j, \varsigma_l^*) = \eta_{lj}. \tag{6.8} \]
By explicitly computing Eqs. (5.9) (and omitting the product symbol $\cdot_F$), one obtains the relations stated in the lemma.

We denote by $A[\mathbb{C}^4_\theta]$ the algebra generated by $\{z_j, z_j^* \mid j = 1, \ldots, 4\}$ modulo the relations (6.7). In this way, we have that $A[\mathbb{C}^4_\theta]$ is a left $H_F$-comodule ∗-algebra.

Lemma 6.2. The algebra relations in the H-comodule algebra $A[\mathbb{R}^4]$ are twisted into
\[ \zeta_1\zeta_2 = \lambda\zeta_2\zeta_1, \qquad \zeta_1^*\zeta_2^* = \lambda\zeta_2^*\zeta_1^*, \qquad \zeta_2^*\zeta_1 = \lambda\zeta_1\zeta_2^*, \qquad \zeta_2\zeta_1^* = \lambda\zeta_1^*\zeta_2, \tag{6.9} \]
where the deformation parameter is $\lambda := \mu^2 = e^{2\pi i\theta}$.

Proof. The product on $A[\mathbb{R}^4]$ is once again twisted using the formula (1.14). Again omitting the product symbol $\cdot_F$, the relations are computed to be as stated.

We denote by $A[\mathbb{R}^4_\theta]$ the algebra generated by $\zeta_1$, $\zeta_2$ and their conjugates, subject to these relations. They make $A[\mathbb{R}^4_\theta]$ into a left $H_F$-comodule ∗-algebra.

Remark 6.3. Since the generators $z_1$, $z_2$ and their conjugates generate a commutative subalgebra of $A[\mathbb{C}^4_\theta]$, it is easy to see using Lemma 2.2 that it is only the base space $\mathbb{R}^4_\theta$ of the localized twistor bundle that is deformed. The typical fiber $\mathbb{CP}^1$ remains classical and the localized twistor algebra is isomorphic to the tensor product $A[\mathbb{R}^4_\theta] \otimes A[\mathbb{CP}^1]$.

The canonical differential calculi described in Sec. 3.1 are also deformed. The relations in the quantized calculi are given in the following lemmata.

Lemma 6.4. The twisted differential calculus $\Omega(\mathbb{C}^4_\theta)$ is generated by the degree zero elements $z_j$, $z_l^*$ and the degree one elements $dz_j$, $dz_l^*$ for $j, l = 1, \ldots, 4$, subject to
the bimodule relations between functions and one-forms
\[ z_j\, dz_l = \eta_{lj} (dz_l) z_j, \qquad z_j\, dz_l^* = \eta_{jl} (dz_l^*) z_j \]
for $j, l = 1, \ldots, 4$ and the anti-commutation relations between one-forms
\[ dz_j \wedge dz_l + \eta_{lj}\, dz_l \wedge dz_j = 0, \qquad dz_j \wedge dz_l^* + \eta_{jl}\, dz_l^* \wedge dz_j = 0 \]
for $j, l = 1, \ldots, 4$.

Proof. One views the classical calculus $\Omega(\mathbb{C}^4)$ as a left H-comodule algebra and accordingly computes the deformed product using the twisting cocycle $F$. Since the exterior derivative $d$ is H-equivariant, the (anti-)commutation relations in the deformed calculus $\Omega(\mathbb{C}^4_\theta)$ are exactly the same as the algebra relations in $A[\mathbb{C}^4_\theta]$ but with $d$ inserted appropriately.

Lemma 6.5. The twisted differential calculus $\Omega(\mathbb{R}^4_\theta)$ is generated by the degree zero elements $\zeta_1, \zeta_1^*, \zeta_2, \zeta_2^*$ and the degree one elements $d\zeta_1, d\zeta_1^*, d\zeta_2, d\zeta_2^*$, subject to the relations
\[ \zeta_1\, d\zeta_2 - \lambda\, d\zeta_2\, \zeta_1 = 0, \qquad \zeta_2^*\, d\zeta_1 - \lambda\, d\zeta_1\, \zeta_2^* = 0, \]
\[ d\zeta_1 \wedge d\zeta_2 + \lambda\, d\zeta_2 \wedge d\zeta_1 = 0, \qquad d\zeta_2^* \wedge d\zeta_1 + \lambda\, d\zeta_1 \wedge d\zeta_2^* = 0. \]

Proof. Once again, the classical calculus $\Omega(\mathbb{R}^4)$ is deformed as a twisted left H-comodule algebra, with the relations working out to be as stated.

In particular, it is clear that the vector space $\Omega^2(\mathbb{R}^4_\theta)$ is the same as it is classically. The Hodge operator $* : \Omega^2(\mathbb{R}^4) \to \Omega^2(\mathbb{R}^4)$ commutes with the H-coaction in the sense that
\[ \Delta_\pi(*\omega) = (\mathrm{id} \otimes *)\Delta_\pi(\omega), \qquad \omega \in \Omega^2(\mathbb{R}^4), \]
so that there is also a Hodge operator $*_\theta : \Omega^2(\mathbb{R}^4_\theta) \to \Omega^2(\mathbb{R}^4_\theta)$ defined by the same formula as it is classically. There is a decomposition of $\Omega^2(\mathbb{R}^4_\theta)$ into self-dual and anti-self-dual two-forms
\[ \Omega^2(\mathbb{R}^4_\theta) = \Omega^2_+(\mathbb{R}^4_\theta) \oplus \Omega^2_-(\mathbb{R}^4_\theta) \]
which, at the level of vector spaces, is identical to the corresponding decomposition in the classical case.

We also apply the cocycle deformation to the coordinate algebra $A[M_k]$ of the space of self-conjugate monads by viewing it as a left H-comodule algebra. We write $A[M_{k;\theta}]$ for the resulting cocycle-twisted left $H_F$-comodule algebra.

Proposition 6.6. The noncommutative ∗-algebra $A[M_{k;\theta}]$ is generated by the matrix elements $M^j_{ab}$, $N^l_{dc}$ for $a, c = 1, \ldots, k$ and $b, d = 1, \ldots, 2k+2$, modulo the relations
\[ M^j_{ab} M^l_{cd} = \eta_{lj} M^l_{cd} M^j_{ab}, \qquad N^j_{ba} N^l_{dc} = \eta_{lj} N^l_{dc} N^j_{ba}, \]
together with the ∗-structure (4.6).
Proof. From Lemma 4.4, we read off the H-coaction on generators $M^j$, $j = 1, \ldots, 4$, obtaining
\[ M^j_{ab} \mapsto \varsigma_j^* \otimes M^j_{ab}, \qquad N^l_{dc} \mapsto \varsigma_l^* \otimes N^l_{dc}, \]
which we extend as a ∗-algebra map. The deformed relations follow immediately from an application of the twisting formula (1.14). The coaction of $H$ on $A[M_k]$ does not depend on the matrix indices of the generators $M^j$, $N^l$, $j, l = 1, \ldots, 4$, hence neither do the twisted commutation relations.

In terms of the deformed product, the relations (4.4) are twisted into the relations
\[ \sum_r \big( N^j_{dr} M^l_{rb} + \eta_{jl} N^l_{dr} M^j_{rb} \big) = 0 \]
for each $j, l = 1, \ldots, 4$ and $b, d = 1, \ldots, k$.

6.2. The construction of instantons on $\mathbb{R}^4_\theta$

Just as we did for the Moyal plane, we now use the noncommutative space of monads $M_{k;\theta}$ to construct families of instantons on the Connes–Landi space-time $\mathbb{R}^4_\theta$.

The Pontryagin dual of the torus $\mathbb{T}^2$ is the discrete group $\widehat{\mathbb{T}}^2 \cong \mathbb{Z}^2$. Given a pair of integers $(r_1, r_2) \in \mathbb{Z}^2$ we define unitary elements $u = (u_1, u_2, u_3, u_4)$ of the algebra $H_F$ by
\[ u = (u_1, u_2, u_3, u_4) = (\varsigma_1^{m_1}, \varsigma_2^{m_2}, \varsigma_3^{m_3}, \varsigma_4^{m_4}), \tag{6.10} \]
where $(m_j) = (r_1, r_1, r_2, r_2)$. It is clear that $u_1^* = u_2$ and $u_3^* = u_4$, and that each $u_j$ is a group-like element of the Hopf algebra $H_F$, i.e. it transforms as $\Delta(u_j) = u_j \otimes u_j$ under the coproduct $\Delta : H_F \to H_F \otimes H_F$.

Lemma 6.7. There is a canonical left action of $H_F$ on the algebra $A[M_{k;\theta}]$ defined on generators by
\[ u_l \triangleright M^j_{ab} = R(\varsigma_j^*, \varsigma_l^{m_l}) M^j_{ab} = \eta_{lj}^{m_l} M^j_{ab}, \qquad u_l \triangleright M^{j\,*}_{ab} = R(\varsigma_j, \varsigma_l^{m_l}) M^{j\,*}_{ab} = \eta_{jl}^{m_l} M^{j\,*}_{ab} \]
for $j, l = 1, \ldots, 4$.

Proof. From Proposition 6.6 we know that $A[M_{k;\theta}]$ is a left $H_F$-comodule algebra; it is therefore also a left $H_F$-module algebra according to the formula (1.5), which works out to be as stated.

This gives us an action of the group $\mathbb{Z}^2$ on the algebra $A[M_{k;\theta}]$ by ∗-automorphisms,
\[ \gamma : \mathbb{Z}^2 \to \mathrm{Aut}\, A[M_{k;\theta}]. \tag{6.11} \]
The smash product algebra $A[M_{k;\theta}] \rtimes H_F$ corresponding to the $H_F$-action of Lemma 6.7 works out using the coproduct $\Delta(\varsigma_j) = \varsigma_j \otimes \varsigma_j$ on $H_F$ and the
formula (1.7) to have relations of the form
\[ (M^j_{ab} \otimes u_l)(M^r_{cd} \otimes u_s) = \eta_{lr}^{m_l}\, \eta_{rj}\, \eta_{js}^{m_s}\, (M^r_{cd} \otimes u_s)(M^j_{ab} \otimes u_l), \]
\[ (M^j_{ab} \otimes u_l)(M^{r\,*}_{cd} \otimes u_s) = \eta_{rl}^{m_l}\, \eta_{rj}\, \eta_{sj}^{m_s}\, (M^{r\,*}_{cd} \otimes u_s)(M^j_{ab} \otimes u_l) \]
for $j, l, r, s = 1, \ldots, 4$, together with their conjugates. This is another special case of Example 1.1. We think of this smash product $A[M_{k;\theta}] \rtimes H_F$ as an algebraic version of the crossed product algebra $A[M_{k;\theta}] \rtimes_\gamma \mathbb{Z}^2$ determined by the group action (6.11).

Lemma 6.8. The algebra structure of the tensor product $A[M_{k;\theta}] \,\underline{\otimes}\, A[\mathbb{C}^4_\theta]$ is determined by the relations in the respective subalgebras $A[M_{k;\theta}]$ and $A[\mathbb{C}^4_\theta]$ given above, together with the cross-relations
\[ M^j z_l = \eta_{jl} z_l M^j, \qquad M^j z_l^* = \eta_{lj} z_l^* M^j, \qquad j, l = 1, \ldots, 4, \]
as well as their conjugates.

Proof. The classical algebra $A[M_k] \otimes A[\mathbb{C}^4]$ is equipped with the tensor product $H_F$-coaction of Eq. (1.4). We deform the product in this algebra using the formula (1.14). The cross-terms in the resulting algebra $A[M_{k;\theta}] \,\underline{\otimes}\, A[\mathbb{C}^4_\theta]$ are computed to be as stated [5]. Once again, the symbol $\underline{\otimes}$ is to remind us that the algebra structure on the tensor product is not the standard one and has been twisted by the deformation procedure.

Once again we have a pair of matrices $\sigma_z$ and $\tau_z$,
\[ \sigma_z = \sum_j M^j \otimes z_j, \qquad \tau_z = \sum_j N^j \otimes z_j, \]
but whose entries live in the twisted algebra $A[M_{k;\theta}] \,\underline{\otimes}\, A[\mathbb{C}^4_\theta]$. The matrix $V := (\sigma_z \ \ \sigma_{J(z)})$ is a $2k \times (2k+2)$ matrix with entries in $A[M_{k;\theta}] \,\underline{\otimes}\, A[\mathbb{C}^4_\theta]$, using which we define $\rho^2 := V^*V$. From the projection $Q := V\rho^{-2}V^*$ we construct the complementary matrix $P := 1_{2k+2} - Q$, which has entries in the algebra $A[M_{k;\theta}] \,\underline{\otimes}\, A[\mathbb{R}^4_\theta]$. It is clear that this matrix $P$ is a self-adjoint idempotent, $P^2 = P = P^*$. However, just as was the case for the Moyal plane, it does not define an honest family of projections in the sense of Definition 3.2, since it has values in the twisted tensor product algebra.

We recover a genuine family of projections using the following lemma, in which we use the Sweedler notation $Z \mapsto Z^{(-1)} \otimes Z^{(0)}$ for the left coaction $A[\mathbb{C}^4_\theta] \to H_F \otimes A[\mathbb{C}^4_\theta]$ defined in Eq. (6.3).

Lemma 6.9. There is a canonical ∗-algebra map
\[ \mu : A[M_{k;\theta}] \,\underline{\otimes}\, A[\mathbb{C}^4_\theta] \to (A[M_{k;\theta}] \rtimes H_F) \otimes A[\mathbb{C}^4_\theta] \]
defined by $\mu(M \otimes Z) = M \otimes Z^{(-1)} \otimes Z^{(0)}$ for each $M \in A[M_{k;\theta}]$ and $Z \in A[\mathbb{C}^4_\theta]$.

Proof. The proof is identical to that of Lemma 5.10, save for the replacement of the coaction (5.4) by the coaction (6.3).
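The algebra behind the statement that $Q = V\rho^{-2}V^*$ and $P = 1 - Q$ are self-adjoint idempotents can be seen in a purely commutative toy model. The Python sketch below is only a stand-in: it replaces the algebra-valued entries by rational numbers, takes $k = 1$, and stores $V$ as an array with $2k+2 = 4$ rows and $2k = 2$ columns (a rows-by-columns reading that is an assumption here), so that $V^*$ is the ordinary transpose. The sample matrix and the helper names are hypothetical.

```python
from fractions import Fraction


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(a[r][i] * b[i][c] for i in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("rho^2 is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def projections(v):
    """Return (Q, P) with Q = V (V^T V)^{-1} V^T and P = 1 - Q for a real n x 2 matrix V."""
    v = [[Fraction(x) for x in row] for row in v]
    vt = transpose(v)
    rho2 = matmul(vt, v)  # plays the role of rho^2 = V* V
    q = matmul(matmul(v, inverse_2x2(rho2)), vt)
    n = len(v)
    p = [[(1 if r == c else 0) - q[r][c] for c in range(n)] for r in range(n)]
    return q, p


if __name__ == "__main__":
    q, p = projections([[1, 0], [0, 1], [1, 1], [2, -1]])
    print(trace(q), trace(p))
```

In this toy case the traces of `q` and `p` come out as $2k$ and $2$ for $k = 1$, mirroring the trace statements made for the projections in the text; the noncommutative case additionally needs the centrality of the entries of $\rho^2$.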
As a consequence, we find that there are maps
\[ \tilde\sigma_z : H \otimes A[\mathbb{C}^4_\theta] \to (A[M_{k;\theta}] \rtimes H_F) \otimes K \otimes A[\mathbb{C}^4_\theta], \tag{6.12} \]
\[ \tilde\tau_z : K \otimes A[\mathbb{C}^4_\theta] \to (A[M_{k;\theta}] \rtimes H_F) \otimes L \otimes A[\mathbb{C}^4_\theta] \tag{6.13} \]
defined by composing $\sigma_z$ and $\tau_z$ with the map $\mu$. With the coaction (6.3), they work out to be
\[ \tilde\sigma_z := \sum_r M^r \otimes \varsigma_r \otimes z_r, \qquad \tilde\tau_z := \sum_r N^r \otimes \varsigma_r \otimes z_r, \]
which are, respectively, $k \times (2k+2)$ and $(2k+2) \times k$ matrices with entries in the noncommutative algebra $(A[M_{k;\theta}] \rtimes H_F) \otimes A[\mathbb{C}^4_\theta]$. With this in mind, we form the matrix $\widetilde V := (\tilde\sigma_z \ \ \tilde\sigma_{J(z)})$, this time yielding a $2k \times (2k+2)$ matrix with entries in $(A[M_{k;\theta}] \rtimes H_F) \otimes A[\mathbb{C}^4_\theta]$, and define $\tilde\rho^2 := \widetilde V^*\widetilde V$. Just as in the classical case, in order to proceed we need to slightly enlarge the matrix algebra $M_k(\mathbb{C}) \otimes A[M_{k;\theta}] \otimes A[\mathbb{C}^4_\theta]$ by adjoining an inverse element $\tilde\rho^{-2}$ for $\tilde\rho^2$.

Proposition 6.10. The $(2k+2) \times (2k+2)$ matrix $\widetilde Q = \widetilde V\tilde\rho^{-2}\widetilde V^*$ is a projection, $\widetilde Q^2 = \widetilde Q = \widetilde Q^*$, with entries in the algebra $(A[M_{k;\theta}] \rtimes H_F) \otimes A[\mathbb{C}^4_\theta]$ and trace equal to $2k$.

Proof. The fact that $\widetilde Q$ is a projection follows from the fact that $Q$ is a projection and $\mu$ is a ∗-algebra map. By construction, the entries of the matrix $\tilde\rho^2$ are central in the algebra $(A[M_{k;\theta}] \rtimes H_F) \otimes A[\mathbb{C}^4_\theta]$ (this follows from the fact that the corresponding classical matrix elements are coinvariant under the left H-coaction), from which it follows that the trace computation in Proposition 4.5 is valid in the noncommutative case as well [5].

From the projection $\widetilde Q$ we construct the complementary projection $\widetilde P := 1_{2k+2} - \widetilde Q$. It has entries in the algebra $(A[M_{k;\theta}] \rtimes H_F) \otimes A[\mathbb{R}^4_\theta]$ and has trace equal to two. In analogy with Definition 3.2, the finitely generated projective module
\[ E := \widetilde P\big((A[M_{k;\theta}] \rtimes H_F) \otimes A[\mathbb{R}^4_\theta]\big)^{2k+2} \]
defines a family of rank two vector bundles over $\mathbb{R}^4_\theta$ parametrized by the noncommutative algebra $A[M_{k;\theta}] \rtimes H_F$. We equip this family of vector bundles with the family of Grassmann connections associated to the projection $\widetilde P$.

Proposition 6.11. The curvature $F = \widetilde P((\mathrm{id} \otimes d)\widetilde P)^2$ of the Grassmann family of connections $\nabla := (\mathrm{id} \otimes d) \circ \widetilde P$ is anti-self-dual.

Proof. From Lemma 6.5, we know that the space of two-forms $\Omega^2(\mathbb{R}^4_\theta)$ and the Hodge ∗-operator $*_\theta : \Omega^2(\mathbb{R}^4_\theta) \to \Omega^2(\mathbb{R}^4_\theta)$ are undeformed and equal to their classical counterparts; similarly for the decomposition $\Omega^2(\mathbb{R}^4_\theta) = \Omega^2_+(\mathbb{R}^4_\theta) \oplus \Omega^2_-(\mathbb{R}^4_\theta)$ into self-dual and anti-self-dual two-forms. This identification of the “quantum” with the “classical” spaces of two-forms survives the tensoring with the parameter space
$A[M_{k;\theta}] \rtimes H_F$, which yields that $(A[M_{k;\theta}] \rtimes H_F) \otimes \Omega^2_\pm(\mathbb{R}^4_\theta)$ and $(A[M_k] \otimes H) \otimes \Omega^2_\pm(\mathbb{R}^4)$ are isomorphic as vector spaces. Computing the curvature $F$ in exactly the same way as in Proposition 4.6, we see that it must be anti-self-dual, since the same is true in the classical case.

Recall that, in the case of instantons on the Moyal plane $\mathbb{R}^4_\hbar$ discussed in Sec. 5.2, we chose to avoid altogether the issue of whether or not the families of instantons constructed there have finite energy. However, in the present case of instantons on the toric manifold $\mathbb{R}^4_\theta$, it is clear that the instantons produced by the noncommutative ADHM construction are indeed localizations of the noncommutative families of instantons on the conformal compactification $S^4_\theta$ constructed in [5], whence they must have finite Yang–Mills energy.

6.3. The toric ADHM equations

The previous section produced a family of instantons parametrized by the noncommutative algebra $A[M_{k;\theta}] \rtimes H_F$. Just as we did for the Moyal space-time, we would like to find a suitable commutative subalgebra of $A[M_{k;\theta}] \rtimes H_F$ and hence a family of instantons parametrized by a classical space. In order to do this, we introduce elements of $A[M_{k;\theta}] \rtimes H_F$ defined by
\[ \widetilde M^1_{ab} := M^1_{ab} \otimes \varsigma_1, \qquad \widetilde M^2_{ab} := M^2_{ab} \otimes \varsigma_2, \qquad \widetilde M^3_{ab} := M^3_{ab} \otimes 1, \qquad \widetilde M^4_{ab} := M^4_{ab} \otimes 1 \]
for each $a = 1, \ldots, k$ and $b = 1, \ldots, 2k+2$, together with their conjugates $\widetilde M^{j\,*}_{ab}$, $j = 1, \ldots, 4$.

Definition 6.12. We write $A[M(k;\theta)]$ for the ∗-subalgebra of $A[M_{k;\theta}] \rtimes H_F$ generated by the elements $\widetilde M^j_{ab}$, $\widetilde M^{l\,*}_{ab}$.

Proposition 6.13. The algebra $A[M(k;\theta)]$ is a commutative ∗-subalgebra of the smash product $A[M_{k;\theta}] \rtimes H_F$.

Proof. The generators $\widetilde M^3_{ab}$, $\widetilde M^4_{ab}$ and their conjugates are obviously central. The generators $\widetilde M^1_{ab}$, $\widetilde M^2_{ab}$ and their conjugates are also easily seen to commute amongst themselves. We check the case $j \in \{1, 2\}$, $l \in \{3, 4\}$, yielding
\[ \widetilde M^j_{ab}\widetilde M^l_{cd} = (M^j_{ab} \otimes \varsigma_j)(M^l_{cd} \otimes 1) = \eta_{jl} M^j_{ab} M^l_{cd} \otimes \varsigma_j = \eta_{jl}\eta_{lj} M^l_{cd} M^j_{ab} \otimes \varsigma_j = (M^l_{cd} \otimes 1)(M^j_{ab} \otimes \varsigma_j) = \widetilde M^l_{cd}\widetilde M^j_{ab}, \]
where the penultimate equality uses $\eta_{jl}\eta_{lj} = 1$. All other pairs of generators are shown to commute using similar computations.

Although we have changed our set of generators, we nevertheless combine them with the Hopf algebra $H_F$ using a smash product construction. Let $\triangleright : H_F \otimes A[M(k;\theta)] \to A[M(k;\theta)]$ be the left $H_F$-action defined by
\[ \varsigma_l \triangleright \widetilde M^j_{ab} = \eta_{lj} \widetilde M^j_{ab}, \qquad \varsigma_l \triangleright \widetilde M^{j\,*}_{ab} = \eta_{jl} \widetilde M^{j\,*}_{ab} \]
for $j, l = 1, \ldots, 4$ and let $A[M(k;\theta)] \rtimes H_F$ be the corresponding smash product algebra. The next theorem relates the parameter space $A[M(k;\theta)]$ to the parameter space $A[M_{k;\theta}]$.

Theorem 6.14. There is a ∗-algebra isomorphism $\phi : A[M(k;\theta)] \rtimes H_F \to A[M_{k;\theta}] \rtimes H_F$ defined for each $h \in H_F$ by
\[ \widetilde M^1_{ab} \otimes h \mapsto M^1_{ab} \otimes \varsigma_1 h, \qquad \widetilde M^2_{ab} \otimes h \mapsto M^2_{ab} \otimes \varsigma_2 h, \qquad \widetilde M^3_{ab} \otimes h \mapsto M^3_{ab} \otimes h, \qquad \widetilde M^4_{ab} \otimes h \mapsto M^4_{ab} \otimes h \]
and extended as a ∗-algebra map.

Proof. The given map is clearly a vector space isomorphism with inverse
\[ M^1_{ab} \otimes h \mapsto \widetilde M^1_{ab} \otimes \varsigma_1^* h, \qquad M^2_{ab} \otimes h \mapsto \widetilde M^2_{ab} \otimes \varsigma_2^* h, \qquad M^3_{ab} \otimes h \mapsto \widetilde M^3_{ab} \otimes h, \qquad M^4_{ab} \otimes h \mapsto \widetilde M^4_{ab} \otimes h, \]
extended as a ∗-algebra map. By definition, the map $\phi$ is a ∗-algebra homomorphism on the subalgebra $A[M(k;\theta)]$, so it remains to check that it preserves the cross-relations with the subalgebra $H_F$. This is easy to verify: one has for example that
\[ \phi(1 \otimes \varsigma_j)\,\phi(\widetilde M^1 \otimes 1) = (1 \otimes \varsigma_j)(M^1 \otimes \varsigma_1) = \eta_{j1} M^1 \otimes \varsigma_j\varsigma_1 = \eta_{j1}(M^1 \otimes \varsigma_1)(1 \otimes \varsigma_j) = \eta_{j1}\,\phi(\widetilde M^1 \otimes 1)\,\phi(1 \otimes \varsigma_j). \]
The remaining relations are checked in exactly the same way.

Next we focus on the task of seeing how the parameters corresponding to the subalgebra $H_F$ can be removed in order to leave a family of instantons parametrized by the commutative algebra $A[M(k;\theta)]$. There is a right coaction
\[ \delta_R : A[M(k;\theta)] \rtimes H_F \to (A[M(k;\theta)] \rtimes H_F) \otimes H_F, \qquad \delta_R := \mathrm{id} \otimes \Delta, \tag{6.14} \]
where $\Delta : H_F \to H_F \otimes H_F$ is the coproduct on the Hopf algebra $H_F$. This coaction is by gauge transformations, meaning that the projections $\widetilde P \otimes 1$ and $\delta_R(\widetilde P)$ are unitarily equivalent in the matrix algebra $M_{2k+2}((A[M(k;\theta)] \rtimes H_F) \otimes H_F)$ and so they define gauge equivalent families of instantons [5]. The parameters determined by the subalgebra $H_F$ in $A[M(k;\theta)] \rtimes H_F$ are therefore just gauge parameters and so they may be removed by passing to the subalgebra of $A[M(k;\theta)] \rtimes H_F$ consisting of coinvariant elements under the coaction (6.14), viz.
\[ A[M(k;\theta)] \cong \{ a \in A[M(k;\theta)] \rtimes H_F \mid \delta_R(a) = a \otimes 1 \}. \]
In this way we obtain a projection $P_{k;\theta}$ with entries in $A[M(k;\theta)] \otimes A[\mathbb{R}^4_\theta]$. The explicit details of the construction of the projection $P_{k;\theta}$ are given in [5], together with a proof of the fact that the Grassmann family of connections $\nabla = P_{k;\theta} \circ (\mathrm{id} \otimes d)$ also has anti-self-dual curvature and hence defines a family of instantons on $\mathbb{R}^4_\theta$.
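The identification of the coinvariants of (6.14) with $A[M(k;\theta)]$ rests on the fact that the monomials $s_1^{n_1}s_2^{n_2}$ of $H_F = A[\mathbb{T}^2]$ are group-like. The Python sketch below illustrates just this step in a deliberately simplified model: an element $\sum_n a_n \otimes s^n$ is stored as a dictionary from exponent pairs $n = (n_1, n_2)$ to coefficients, and the coefficients are plain numbers standing in for elements of $A[M(k;\theta)]$. That simplification and the function names are assumptions of the sketch.

```python
def delta_r(element):
    """Apply id (x) Delta, using Delta(s^n) = s^n (x) s^n for group-like monomials."""
    return {(n, n): c for n, c in element.items() if c != 0}


def tensor_one(element):
    """Form a (x) 1, where 1 = s^(0, 0) is the unit of the Hopf algebra."""
    return {(n, (0, 0)): c for n, c in element.items() if c != 0}


def is_coinvariant(element):
    """Check delta_R(a) == a (x) 1 in this toy model."""
    return delta_r(element) == tensor_one(element)


if __name__ == "__main__":
    print(is_coinvariant({(0, 0): 5}))
    print(is_coinvariant({(0, 0): 5, (1, -1): 2}))
```

Comparing the two dictionaries term by term shows that only the component with $n = (0, 0)$ can survive, which is the statement that the coinvariant subalgebra consists of elements of the form $a \otimes 1$.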
Moreover, the commutative algebra $A[M(k;\theta)]$ is the algebra of coordinate functions on a classical space of monads $M(k;\theta)$. For each point $x \in M(k;\theta)$ there is an evaluation map
\[ \mathrm{ev}_x \otimes \mathrm{id} : A[M(k;\theta)] \otimes A[\mathbb{C}^4_\theta] \to A[\mathbb{C}^4_\theta], \]
which in turn determines a monad over the noncommutative space $\mathbb{C}^4_\theta$ in terms of the matrices $(\mathrm{ev}_x \otimes \mathrm{id})\tilde\sigma_z$ and $(\mathrm{ev}_x \otimes \mathrm{id})\tilde\tau_z$, i.e. a sequence of free right $A[\mathbb{C}^4_\theta]$-modules
\[ 0 \to H \otimes A[\mathbb{C}^4_\theta] \xrightarrow{(\mathrm{ev}_x \otimes \mathrm{id})\tilde\sigma_z} K \otimes A[\mathbb{C}^4_\theta] \xrightarrow{(\mathrm{ev}_x \otimes \mathrm{id})\tilde\tau_z} L \otimes A[\mathbb{C}^4_\theta] \to 0. \tag{6.15} \]
Recall from Proposition 4.7 that the gauge freedom in the classical ADHM construction is precisely the freedom to choose linear bases of the vector spaces H, K, L. Clearly we also have this freedom in the noncommutative construction and so we write ∼ for the resulting equivalence relation on the space $M(k;\theta)$ (cf. Definition 4.8). This yields the following description of the space $M(k;\theta)$ of classical parameters in the ADHM construction on $\mathbb{R}^4_\theta$.

Theorem 6.15. For $k \in \mathbb{Z}$ a positive integer, the space $M(k;\theta)/\!\sim$ of equivalence classes of self-conjugate monads over $\mathbb{C}^4_\theta$ is the quotient of the set of complex matrices $B_1, B_2 \in M_k(\mathbb{C})$, $I \in M_{2\times k}(\mathbb{C})$, $J \in M_{k\times 2}(\mathbb{C})$ satisfying the equations

(i) $\bar\mu B_1B_2 - \mu B_2B_1 + IJ = 0$,
(ii) $[B_1, B_1^*] + [B_2, B_2^*] + II^* - J^*J = 0$

by the action of U(k) given by
\[ B_1 \mapsto gB_1g^{-1}, \qquad B_2 \mapsto gB_2g^{-1}, \qquad I \mapsto gI, \qquad J \mapsto Jg^{-1} \]
for each $g \in \mathrm{U}(k)$.

Proof. We express the monad maps $\tilde\sigma_z$, $\tilde\tau_z$ as
\[ \tilde\sigma_z = \widetilde M^1 z_1 + \widetilde M^2 z_2 + \widetilde M^3 z_3 + \widetilde M^4 z_4, \qquad \tilde\tau_z = \widetilde N^1 z_1 + \widetilde N^2 z_2 + \widetilde N^3 z_3 + \widetilde N^4 z_4 \]
for constant matrices $\widetilde M^j$, $\widetilde N^l$, $j, l = 1, \ldots, 4$. Upon expanding out the condition $\tilde\tau_z\tilde\sigma_z = 0$ and using the commutation relations in Lemma 6.1, we find the conditions
\[ \widetilde N^j\widetilde M^l + \eta_{jl}\widetilde N^l\widetilde M^j = 0 \tag{6.16} \]
for $j, l = 1, \ldots, 4$. The typical fiber $\mathbb{CP}^1$ of the twistor fibration $\mathbb{R}^4 \times \mathbb{CP}^1$ has homogeneous coordinates $z_1, z_1^*, z_2, z_2^*$ and the “line at infinity” $\ell_\infty$ is recovered by setting $z_1 = z_2 = 0$. On this line, the monad condition $\tilde\tau_z\tilde\sigma_z = 0$ becomes
\[ \widetilde N^3\widetilde M^4 + \widetilde N^4\widetilde M^3 = 0, \qquad \widetilde N^3\widetilde M^3 = 0, \qquad \widetilde N^4\widetilde M^4 = 0. \tag{6.17} \]
Moreover, when $z_1 = z_2 = 0$ we see from the relations (6.7) that the coordinates $z_3$, $z_4$ and their conjugates are mutually commuting, so that the line $\ell_\infty$ is classical. Self-conjugacy of the monad once again implies that the restricted bundle over $\ell_\infty$
is trivial, whence we can argue as in [26] to show that the map $\widetilde{N}^3 \widetilde{M}^4 = -\widetilde{N}^4 \widetilde{M}^3$ is an isomorphism. We choose bases for $H$, $K$, $L$ such that $\widetilde{N}^3 \widetilde{M}^4 = 1_k$ and
\[
\widetilde{M}^3 = \begin{pmatrix} 1_{k\times k} \\ 0_{k\times k} \\ 0_{2\times k} \end{pmatrix}, \quad
\widetilde{M}^4 = \begin{pmatrix} 0_{k\times k} \\ 1_{k\times k} \\ 0_{2\times k} \end{pmatrix}, \quad
\widetilde{N}^3 = \begin{pmatrix} 0_{k\times k} \\ 1_{k\times k} \\ 0_{k\times 2} \end{pmatrix}^{\mathrm{tr}}, \quad
\widetilde{N}^4 = \begin{pmatrix} -1_{k\times k} \\ 0_{k\times k} \\ 0_{k\times 2} \end{pmatrix}^{\mathrm{tr}}.
\]
Using the conditions (6.16) for $j = 3, 4$ and $l = 1, 2$, the remaining matrices are necessarily of the form
\[
\widetilde{M}^1 = \begin{pmatrix} B_1 \\ B_2 \\ J \end{pmatrix}, \quad
\widetilde{M}^2 = \begin{pmatrix} B_1' \\ B_2' \\ J' \end{pmatrix}, \quad
\widetilde{N}^1 = \begin{pmatrix} -\mu B_2 \\ \bar\mu B_1 \\ I \end{pmatrix}^{\mathrm{tr}}, \quad
\widetilde{N}^2 = \begin{pmatrix} -\bar\mu B_2' \\ \mu B_1' \\ I' \end{pmatrix}^{\mathrm{tr}}.
\]
Invoking the relations $\tilde\tau^*_{J(z)} = -\tilde\sigma_z$ and $\tilde\sigma^*_{J(z)} = \tilde\tau_z$ corresponding to the fact that the monad is self-conjugate, we find that
\[
B_1' = -\bar\mu B_2^*, \qquad B_2' = \mu B_1^*, \qquad J' = I^*, \qquad I' = -J^*.
\]
Thus in order to fulfil the condition $\tilde\tau_z \tilde\sigma_z = 0$ it remains to impose conditions (6.16) in the cases $j = l = 1$ and $j = 1$, $l = 2$. These are precisely conditions (i) and (ii) in the theorem. It is evident just as in the classical case [12] that the remaining freedom in this set-up is given by the stated action of $\mathrm{U}(k)$, whence the result.

Acknowledgments

SJB gratefully acknowledges support from the ESF network "Quantum Geometry and Quantum Gravity" and the NWO grant 040.11.163. WvS acknowledges support from NWO under VENI-project 639.031.827. Both authors thank the Institut des Hautes Études Scientifiques for hospitality during a short visit in 2010.

References

[1] M. F. Atiyah, Geometry of Yang–Mills Fields, Fermi Lectures (Scuola Normale Pisa, 1979).
[2] M. F. Atiyah, V. G. Drinfel'd, N. J. Hitchin and Yu. I. Manin, Construction of instantons, Phys. Lett. A 65 (1978) 185–187.
[3] M. F. Atiyah, N. J. Hitchin and I. M. Singer, Self-duality in four-dimensional Riemannian geometry, Proc. R. Soc. London A 362 (1978) 425–461.
[4] S. Brain and G. Landi, Families of monads and instantons from a noncommutative ADHM construction, in Quanta of Maths, Clay Math. Proc., Vol. 11 (AMS Providence, RI, 2010), pp. 55–84.
[5] S. Brain and G. Landi, Moduli spaces of instantons: Gauging away noncommutative parameters, to appear in Quart. J. Math. (2010); DOI:10.1093/qmath/haq036.
[6] S. Brain and S. Majid, Quantisation of twistor theory by cocycle twist, Comm. Math. Phys. 284 (2008) 713–774.
[7] C.-S. Chu, V. V. Khoze and G. Travaglini, Notes on noncommutative instantons, Nucl. Phys. B 621 (2002) 101–130.
[8] A. Connes, Noncommutative Geometry (Academic Press, New York, 1994).
[9] A. Connes, Gravity coupled with matter and the foundation of noncommutative geometry, Comm. Math. Phys. 182 (1996) 155–176.
[10] A. Connes and G. Landi, Noncommutative manifolds, the instanton algebra and isospectral deformations, Comm. Math. Phys. 221 (2001) 141–159.
[11] A. Connes and M. Dubois-Violette, Noncommutative finite-dimensional manifolds I: Spherical manifolds and related examples, Comm. Math. Phys. 230 (2002) 539–579.
[12] S. K. Donaldson, Instantons and geometric invariant theory, Comm. Math. Phys. 93 (1984) 435–460.
[13] S. K. Donaldson and P. B. Kronheimer, The Geometry of Four-Manifolds (Oxford University Press, 1990).
[14] K. Furuuchi, Instantons on noncommutative R^4 and projection operators, Prog. Theor. Phys. 103 (2000) 1043–1068.
[15] C. R. Gilson, M. Hamanaka and J. C. Nimmo, Bäcklund transformations for noncommutative anti-self-dual Yang–Mills equations, Glasgow Math. J. 51 (2009) 83–93.
[16] Z. Horváth, O. Lechtenfeld and M. Wolf, Noncommutative instantons via dressing and splitting approaches, J. High Energy Phys. 12 (2002) 060.
[17] A. Kapustin, A. Kuznetsov and D. Orlov, Noncommutative instantons and twistor transform, Comm. Math. Phys. 221 (2001) 385–432.
[18] G. Landi and W. D. van Suijlekom, Principal fibrations from noncommutative spheres, Comm. Math. Phys. 260 (2005) 203–225.
[19] G. Landi and W. D. van Suijlekom, Noncommutative instantons from twisted conformal symmetries, Comm. Math. Phys. 271 (2007) 591–634.
[20] G. Landi and W. D. van Suijlekom, Noncommutative instantons in Tehran, in An Invitation to Noncommutative Geometry (World Sci. Publ., Hackensack, NJ, 2008), pp. 275–353.
[21] G. Landi, C. Pagani, C. Reina and W. D. van Suijlekom, Noncommutative families of instantons, Int. Math. Res. Not. 12 (2008) Art. ID rnn038, 32 p.
[22] S. Majid, Foundations of Quantum Group Theory (Cambridge University Press, 1995).
[23] L. J. Mason and N. M. J. Woodhouse, Integrability, Self-Duality and Twistor Theory (Oxford University Press, 1996).
[24] N. A. Nekrasov and A. Schwarz, Instantons on noncommutative R^4 and (2, 0) superconformal six-dimensional theory, Comm. Math. Phys. 198 (1998) 689–703.
[25] N. A. Nekrasov, Trieste lectures on solitons in noncommutative gauge theories, in Superstrings and Related Matters (Trieste, 2000) (World Sci. Publ., River Edge, NJ, 2001), pp. 141–205.
[26] C. Okonek, M. Schneider and H. Spindler, Vector Bundles on Complex Projective Spaces (Birkhäuser, Boston, 1980).
[27] R. Penrose and W. Rindler, Spinors and Space-Time, Vol. 2 (Cambridge University Press, 1986).
[28] Y. Tian and C.-J. Zhu, Instantons on general noncommutative R^4, Commun. Theor. Phys. 38 (2002) 691–697.
[29] Y. Tian and C.-J. Zhu, Remarks on the noncommutative ADHM construction, Phys. Rev. D 67 (2003) 045016.
April 11, J070-S0129055X11004308
2011 12:3 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 3 (2011) 309–346 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004308
ON THE STATIC SPACETIME OF A SINGLE POINT CHARGE
A. SHADI TAHVILDAR-ZADEH Department of Mathematics, Rutgers, The State University of New Jersey, 110 Frelinghuysen Rd., Piscataway, NJ 08854, USA
[email protected]

Received 6 December 2010
Revised 7 February 2011

Among all electromagnetic theories which (a) are derivable from a Lagrangian, (b) satisfy the dominant energy condition, and (c) in the weak field limit coincide with classical linear electromagnetics, we identify a certain subclass with the property that the corresponding spherically symmetric, asymptotically flat, electrostatic spacetime metric has the mildest possible singularity at its center, namely, a conical singularity on the time axis. The electric field moreover has a point defect on the time axis, its total energy is finite, and is equal to the ADM mass of the spacetime. By an appropriate scaling of the Lagrangian, one can arrange the total mass and total charge of these spacetimes to have any chosen values. For small enough mass-to-charge ratio, these spacetimes have no horizons and no trapped null geodesics. We also prove the uniqueness of these solutions in the spherically symmetric class, and we conclude by performing a qualitative study of the geodesics and test-charge trajectories of these spacetimes.

Keywords: Nonlinear electromagnetism; Einstein equations; point charges; Born–Infeld theory; conical singularity; ADM mass; point defects; self-energy; naked singularity; Birkhoff theorem; test-charge orbits.

Mathematics Subject Classification 2010: 83C50, 35Q60, 78A30, 78A35
1. Introduction

In this paper, we study static, spherically symmetric solutions of the Einstein–Maxwell system featuring a single spinless point charge.^a Our main results are summarized in a theorem at the end of this introduction. The Einstein–Maxwell system of PDEs reads:
\[
R_{\mu\nu}[g] - \frac{1}{2} g_{\mu\nu} R[g] = \kappa T_{\mu\nu}, \tag{1.1}
\]
\[
d\mathbf{F} = 0, \tag{1.2}
\]
\[
d\mathbf{M} = 0. \tag{1.3}
\]

^a In a sequel to this paper, we will study solutions with spin.
It describes the geometry of a spacetime $(\mathcal{M}, g)$ endowed with an electromagnetic field. Here $\mathbf{R}[g]$, with components $R_{\mu\nu}[g]$, is the Ricci curvature tensor and $R[g]$ the scalar curvature of the metric $g$ of a four-dimensional Lorentzian manifold $\mathcal{M}$. Moreover, $\mathbf{F}$ is the Faraday tensor of the electromagnetic field and $\mathbf{M}$ is the Maxwell tensor corresponding to $\mathbf{F}$ (for which we will consider a whole family of choices), while $\mathbf{T}$ is the electromagnetic energy(density)-momentum(density)-stress tensor associated to $\mathbf{F}$. Finally, $\kappa = 8\pi G/c^4$, with $G$ being Newton's constant of universal gravitation and $c$ the "speed of light in vacuum". (In the following we will work with units in which $\kappa = 1$.)

Before we discuss the options of choosing the relationship between $\mathbf{M}$ and $\mathbf{F}$, we should recall that an open domain $\mathcal{M}'$ in a Lorentzian manifold $\mathcal{M}$ is (somewhat inappropriately) called static if it has a hypersurface-orthogonal Killing field $K$ whose orbits are complete and everywhere timelike in $\mathcal{M}'$. Such a domain possesses a time-function $t$, i.e. a function defined on it whose gradient is timelike, and the vectorfield dual to its gradient is future-directed everywhere. It can always be chosen such that $Kt = 1$. One can use $t$ as a coordinate function on $\mathcal{M}'$, and the level sets of $t$ provide a foliation of $\mathcal{M}'$ into spacelike leaves $\Sigma_t$. It follows that the induced metric on the leaves is independent of $t$, i.e. the space $\Sigma_t$ is static.

Furthermore, for a "static spacetime" to be spherically symmetric means that the rotation group SO(3) acts as a continuous group of isometries on the manifold, with orbits that are spacelike 2-spheres, and its action commutes with that of the group generated by the timelike Killing field $K$. For $p \in \mathcal{M}'$, let $A(p)$ be the area of the spherical orbit that goes through $p$, and let $r(p) = \sqrt{A(p)/4\pi}$ be the radius of a Euclidean sphere with area $A(p)$. As long as $g(\nabla r, \nabla r) > 0$ one can use $r$ as a spacelike coordinate function on the manifold, and in that case the metric of the spacetime can be put in the form
\[
g_{\mu\nu}\, dx^\mu dx^\nu = -e^{\xi} dt^2 + e^{-\xi} dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2), \tag{1.4}
\]
where $(\theta, \phi)$ are spherical coordinates on the orbit spheres, and $\xi = \xi(r)$ is a smooth function which depends on the choice of relationship between $\mathbf{F}$ and $\mathbf{M}$, here called an "aether law" (for short, and for historical reasons).

The simplest choice of the aether law is Maxwell's $\mathbf{M} = -{*\mathbf{F}}$, in which case the Einstein–Maxwell system will be called the Einstein–Maxwell–Maxwell system (EMM). It is well known that the Reissner–Weyl–Nordström spacetime^b (RWN^c for short) is the unique spherically symmetric, asymptotically flat solution of the EMM system. For the RWN solution, one has
\[
e^{\xi} = 1 - \frac{2m_0}{r} + \frac{q_0^2}{r^2},
\]

^b Generally known as having been discovered independently by Reissner [1] and Nordström [2], this spacetime is also a member of a whole class of electrovacuum solutions found by Weyl [3].
^c See Sec. 6 for a precise statement and proof of uniqueness.
where $q_0$ and $m_0$ are two real parameters. They are in fact integration constants that come from solving the radial Liouville-type equation that arises as the reduction of the EMM system under the stated symmetry assumptions. Since the spacetime is asymptotically flat, its ADM mass [5] is well defined, and it is seen to be equal to the parameter $m_0$. Also, one has a formula for the Faraday tensor in the case of RWN:
\[
\mathbf{F} = dA, \qquad A = \varphi(r)\, dt, \qquad \varphi = \frac{q_0}{r},
\]
which suggests, via the divergence theorem, that $q_0$ is the total charge of the spacetime.

As discussed in detail in [6], the causal structure of the RWN spacetime depends in a crucial way on the ratio $|q_0|/m_0$: When this is less than one, which is referred to as the subextremal case, the metric coefficient $e^{\xi}$, in addition to being singular at $r = 0$, has two zeros, namely, at $r_\pm = m_0 \pm \sqrt{m_0^2 - q_0^2}$. It can be shown that $r = r_+$ is the event horizon, the boundary of the past of the spacetime's future null infinity, and therefore the spacetime has a nonempty black hole region. It is worth noting that the causal structure of the maximal analytic extension of the subextremal RWN spacetime is quite rich and complicated, comprising an infinitude of regions, and is plagued by the breakdown of determinism due to the presence of Cauchy horizons (cf. [6]).

By contrast, the RWN metric in the superextremal regime, corresponding to $|q_0| > m_0$, has a very simple causal structure. The metric coefficient $e^{\xi}$ is always positive, $(t, r, \theta, \phi)$ is a global coordinate system for the manifold, and the only singularity present is the naked one, on the timelike axis $r = 0$. The topology of the manifold is that of $\mathbb{R}^4$ minus a line.
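The dichotomy between the subextremal and superextremal regimes can be checked numerically. The following Python lines are a minimal sketch, assuming units with $\kappa = 1$ and $m_0 > 0$; the function names are hypothetical and chosen only for this illustration. The sketch evaluates the RWN coefficient $e^{\xi} = 1 - 2m_0/r + q_0^2/r^2$ and the radii $r_\pm = m_0 \pm \sqrt{m_0^2 - q_0^2}$, returning no radii when $|q_0| > m_0$.

```python
import math


def rwn_metric_coefficient(r, m0, q0):
    """RWN metric coefficient e^xi = 1 - 2*m0/r + q0^2/r^2 (r > 0)."""
    return 1.0 - 2.0 * m0 / r + q0 ** 2 / r ** 2


def rwn_horizon_radii(m0, q0):
    """Zeros r_- <= r_+ of e^xi, or None in the superextremal case |q0| > m0.

    The zeros solve r^2 - 2*m0*r + q0^2 = 0, so they are real only when
    m0^2 - q0^2 >= 0.
    """
    disc = m0 ** 2 - q0 ** 2
    if disc < 0:
        return None  # superextremal: e^xi has no zeros, hence no horizons
    root = math.sqrt(disc)
    return (m0 - root, m0 + root)


if __name__ == "__main__":
    print(rwn_horizon_radii(1.0, 0.6))  # subextremal example
    print(rwn_horizon_radii(1.0, 2.0))  # superextremal example
```

In the superextremal case the minimum of $e^{\xi}$ over $r > 0$ is attained at $r = q_0^2/m_0$ and equals $1 - m_0^2/q_0^2 > 0$, which is why the coefficient never vanishes there.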
In view of the fact that the empirical charge-to-mass ratios of charged particles such as the electron and the proton are huge ($10^{18}$ and $10^{22}$ respectively), many researchers have been tempted by the prospect that the superextremal RWN solution is but the simplest example of spacetimes featuring one or more point charges.^d Such a spacetime we shall refer to as a "charged-particle spacetime". One of the key questions that needs to be addressed in this regard is the following: According to relativity theory's $E = mc^2$, the proper mass of a charged-particle spacetime should equal its energy, which for a static spacetime is expressed as the integral of the time-time component of $\mathbf{T}$ over a static spacelike hypersurface. Is it then possible to attribute some, or all, of the mass of a charged-particle spacetime to the energy of the electromagnetic field that permeates it^e? One difficulty with such an attribution is that in classical Maxwell–Lorentz electrodynamics^f the self-energy of a point charge is infinite, and that remains to

^d See [7, §21.1] for references, and for a catalog of such solutions.
^e Questions about the origin of the mass of charged particles are as old as the theory of electromagnetism itself, beginning with Heaviside, and continuing with Abraham, Lorentz, Poincaré, Mie, Einstein, Fermi, Born, Dirac, Wheeler, Feynman, Schwinger, Rohrlich and many others.
^f That is to say, Maxwell's equations with "point-particle-like" sources whose formal law of motion is that of Newton's, with the formal force being the Lorentz force. The model is ill-defined unless regularized.
be the case even when the electromagnetic Maxwell–Lorentz field is coupled to gravity via the Einstein equations. This can be clearly seen in the case of RWN because the total electrostatic energy carried by a time-slice would be
\[
\int_0^\infty |d\varphi|^2\, r^2\, dr = \int_0^\infty \frac{q_0^2}{r^2}\, dr,
\]
which is infinite^g unless $q_0 = 0$. In general relativity, the principle of equivalence states that it is the total energy of a system that interacts gravitationally, i.e. unlike the Newtonian theory, there is no distinction between active and passive "gravitational" masses. Thus, the total electrostatic energy will always make a contribution to the ADM mass of the spacetime, which in the case of the superextremal RWN is clearly an infinite contribution. On the other hand, the ADM mass is usually interpreted as representing the total energy content of the spacetime (recall that in relativity there is no distinction between mass and energy (with $c = 1$)). Ingenious proposals have been made to explain the finiteness of the ADM mass. Such ideas have been pursued using renormalization techniques [11], but these techniques are usually very difficult or even impossible to justify rigorously.^h

Another difficulty with taking the superextremal RWN too seriously is the presence of a strong naked singularity on the time axis. Such singularities are expected to be "subject to cosmic censorship", in the sense that they are believed to be nongeneric, and therefore unstable under small perturbations. Furthermore, the presence of a strong "eternal" singularity means that such a spacetime cannot arise as a solution of a classically-posed initial value problem. For the RWN metric the worst part of the singular behavior at $r = 0$ stems from the contribution of the charge $q_0$ to the metric coefficient, as can be seen for example from the Kretschmann scalar, which is a curvature invariant equal to the norm of the Riemann tensor:
\[
K^2 = R_{abcd} R^{abcd} = \frac{48}{r^6}\left( m_0^2 - \frac{2 m_0 q_0^2}{r} + \frac{7 q_0^4}{6 r^2} \right).
\]
Clearly, $K \sim r^{-4} \to \infty$ as $r \to 0$.
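Both divergences above can be made quantitative with a few lines of Python. This is a minimal sketch with hypothetical function names: the first function evaluates the closed form $\int_\varepsilon^\infty q_0^2 r^{-2}\, dr = q_0^2/\varepsilon$ of the electrostatic energy with the inner region $r < \varepsilon$ cut out (the cutoff $\varepsilon$ is introduced here for illustration only), and the second evaluates the expression for $K^2$ displayed above, whose leading term as $r \to 0$ is $56\, q_0^4 / r^8$.

```python
def truncated_self_energy(eps, q0):
    """Integral of q0^2 / r^2 over r in (eps, infinity), i.e. q0^2 / eps.

    eps > 0 is an illustrative cutoff radius; the value grows without
    bound as eps -> 0 unless q0 == 0.
    """
    return q0 ** 2 / eps


def kretschmann_squared(r, m0, q0):
    """K^2 = (48 / r^6) * (m0^2 - 2*m0*q0^2 / r + 7*q0^4 / (6*r^2))."""
    return 48.0 / r ** 6 * (
        m0 ** 2 - 2.0 * m0 * q0 ** 2 / r + 7.0 * q0 ** 4 / (6.0 * r ** 2)
    )


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, truncated_self_energy(eps, q0=1.0))
    for r in (1e-1, 1e-2, 1e-3):
        # r^8 * K^2 should approach 56 * q0^4 as r -> 0
        print(r, r ** 8 * kretschmann_squared(r, m0=1.0, q0=1.0))
```

For $q_0 = 0$ the same expression reduces to $48 m_0^2/r^6$, so it is the charge term that raises the blow-up rate of $K$ from $r^{-3}$ to $r^{-4}$.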
To overcome the first of these two divergence problems while retaining the concept of a point charge, Born [15] proposed to make the Maxwell equations

^g Indeed, this is the same infinity that turns up in the absence of gravity, in flat spacetime, for the self-energy of a point charge, and which led Abraham and Lorentz, and later Mie, to look for alternatives to the point-charge description. We also note that in the current approach to quantum electrodynamics, the corresponding energy integral to the above is still divergent, although less violently [8], thus even in the absence of gravity, the problem of infinite self-energy of point charges is not solved by going over to the quantum description [9, 10].
^h For example, Dirac [12] proposed that the point charge possesses a compensating "bare mass" equal to $-\infty$. However, it is known that a rigorous removal of the regularization using mass renormalization $m_{\mathrm{bare}} \to -\infty$ is impossible in the case of Lorentz electrodynamics, since the renormalization flow terminates at $m_{\mathrm{bare}} = 0$ with regularization still in place [13]. Similar difficulties exist for the renormalization of RWN [14].
nonlinear.^i This is done by choosing a Lagrangian density $\mathcal{L}$ for the electromagnetic action $S[A] = \int_{\mathcal{M}} \mathcal{L}(dA)$ in such a way that, in addition to fulfilling the basic requirement of generating a Lorentz- and Weyl- (gauge) invariant theory, it coincides in the weak field limit with the Lagrangian of the Maxwell–Maxwell system,^j
\[
\mathcal{L}_0 = -\frac{1}{8\pi}\, \mathbf{F} \wedge {*\mathbf{F}},
\]
while its behavior in the strong field limit is such that it leads to finite total energies. One example of such a Lagrangian density is the well-known one-parameter family proposed by Born and Infeld [18]:
\[
\mathcal{L}_\beta = {*}\,\frac{1}{4\pi\beta^4} \left[ 1 - \sqrt{1 - \beta^4\, {*(\mathbf{F} \wedge {*\mathbf{F}})} - \beta^8 \left({*(\mathbf{F} \wedge \mathbf{F})}\right)^2} \right],
\]
for $\beta > 0$ (in the notation of [20]), which even leads to finite limits of the field strengths at the location of the charge.

What is perhaps less common knowledge, though it should be equally well known, is that a nonlinear aether law also has the power to reduce the strength of the spacetime singularity that is present when the electromagnetic field is coupled to gravity.^k This phenomenon was first noticed by Hoffmann [22], who initially claimed that a solution of the Einstein–Maxwell–Born–Infeld system that he had found was free of all singularities.^l In the years since the publication of Hoffmann's paper there have been several attempts at finding static electrovacuum spacetimes which are free of all singularities [23–31], either in the fields or in the metric, while at the same time various "no-go" results have been announced [32, 33] that seem to show that such solutions cannot exist. One of the goals of this paper is to take steps towards dispelling the confusion that seems to persist about this subject.

Our approach here is to characterize aether laws that not only feature finite self-energies for point charges, but also lead to electrostatic^m spacetimes with the mildest form of singularity possible.^n Thus we will initially allow the aether law to have the most general form possible, and let the above requirements, of finiteness of the self-energy and mildness of the singularity, as well as other criteria which

^i Born picked up on the program initiated by Mie [16], who however did not want point charges. See [20, 21] for an excellent account of the development of these ideas, and for their author's key contribution to this program.
^j Which seems to have been discovered by Schwarzschild [17].
^k Even though, as we will prove in this paper, it cannot quite eliminate the singularity.
^l Hoffmann's enthusiasm for nonlinear electrodynamics was not dampened even after it was pointed out to him by Einstein and Rosen (see [23, fn. 15]) that a mild singularity remains at the center of his spacetime.
^m See Sec. 3 for a precise definition.
^n In fact, we will see that the remaining mild singularity in the field and the metric is of the point-defect type. There are indications that this kind of singularity may be "just right" for the Hamilton–Jacobi equations to provide a law of motion for those defects [20, 21]. In this way it may turn out then that the "underwater stone" of nonlinear electrodynamics (in the words of [33]) is a gem after all!
may arise during the course of analyzing the corresponding solutions, dictate the final form that it should have. In particular we will prove the following:

Theorem 1.1 (Informal Version). For any aether law which (a) agrees with that of Maxwell in the weak field limit, (b) is derivable from a Lagrangian, with an energy tensor that satisfies the Dominant Energy Condition, and for which (c) the corresponding Hamiltonian satisfies certain growth conditions (to be made precise below), the following hold:

• There exists a unique electrostatic, spherically symmetric, asymptotically flat solution of the Einstein–Maxwell system (1.1)–(1.3) with that aether law, the maximal analytical extension of which is homeomorphic to $\mathbb{R}^4$ minus a line. It has a conical singularity on the time axis,^o which is the mildest possible singularity for any spherically symmetric electrostatic spacetime whose aether law satisfies (a) and (b). No other singularities are present in the spacetime.
• A generalization of Birkhoff's Theorem shows that this solution is unique in the spherically symmetric class.
• The electrostatic potential is finite on the axis of symmetry, which can be identified with the world line of a point charge. The electric field has a point defect at the location of the charge. The field has finite total electrostatic energy, which is equal to the ADM mass of the spacetime. The mass of the point charge is thus entirely of electromagnetic origin, i.e. it has no bare mass.
• By scaling the Hamiltonian appropriately, one can arrange the total mass and the total charge of this solution to have any chosen values. The mass-to-charge ratio of the spacetime enters as a natural small parameter, measuring the departure from the Minkowski spacetime.
• For small enough mass-to-charge ratio, there are no horizons of any kind and no trapped null geodesics in this spacetime.
• The analysis of geodesics and test-charge trajectories shows that the naked singularity at the center of this spacetime is gravitationally attractive (unlike the case of superextremal RWN).

The rest of this paper is organized as follows: In Sec. 2, we introduce the Lagrangian formulation of electromagnetics. Section 3 gives the equations satisfied by an electrostatic solution of (1.1)–(1.3), with an arbitrary aether law. In Sec. 4, we assume spherical symmetry as well, and obtain the general solution to the equations. Section 5 is devoted to the study of the singularities of this solution. In Sec. 6, we state and prove a Birkhoff-type uniqueness result for these spacetimes. In Sec. 7, we give the precise version of our main result, and in Sec. 8, we carry out the qualitative analysis of geodesics and test-charge orbits.

^o That is to say, the limit as the radius goes to zero of the ratio of the circumference to the radius of a small spacelike circle going around the axis, exists but is not equal to $2\pi$.
2. Nonlinear Electrodynamics
Let $(\mathcal{M}, g)$ be a four-dimensional Lorentzian manifold. Let $\bigwedge^p(\mathcal{M})$ denote the bundle of $p$-forms on $\mathcal{M}$. By an electromagnetic Lagrangian (density) we mean a mapping $\mathcal{L}_{em}$ defined on sections of the vector bundle $\bigwedge^1(\mathcal{M}) \times_{\mathcal{M}} \bigwedge^2(\mathcal{M})$ which takes its values in $\bigwedge^4(\mathcal{M})$ (see [41] for details). Thus if $a$ is a 1-form on $\mathcal{M}$ and $f$ a 2-form, then $\mathcal{L}_{em}(a, f)$ is a 4-form on $\mathcal{M}$. The electromagnetic action is by definition
\[
S[a; \mathcal{D}] := \int_{\mathcal{D}} \mathcal{L}_{em}(a, da),
\]
where $\mathcal{D}$ is a domain in $\mathcal{M}$. A critical point of $S$ with respect to variations of $a$ that are compactly supported in $\mathcal{D}$ is called an electromagnetic potential $A$ in $\mathcal{D}$,
\[
\left. \frac{\delta S}{\delta a} \right|_{a = A} = 0,
\]
and the exterior derivative of it is the electromagnetic Faraday tensor $\mathbf{F} = dA$. The Maxwell tensor $\mathbf{M}$ is by definition
\[
\mathbf{M} = \left. \frac{\partial \mathcal{L}_{em}}{\partial f} \right|_{a = A,\, f = \mathbf{F}} \tag{2.1}
\]
in the sense of evaluation, i.e. the object on the right, when evaluated on a variation $\dot f \in T_p(\bigwedge^2(\mathcal{M})) = \bigwedge^2(T_p\mathcal{M})$, as a 4-form is equal to $\mathbf{M} \wedge \dot f$. The source-free Maxwell equations are the Euler–Lagrange equations for stationary points of the electromagnetic action $S$, and are equivalent to the system
\[
d\mathbf{F} = 0, \qquad d\mathbf{M} = 0.
\]
It can be shown [41] that the only Lorentz-invariant gauge-invariant source-free electromagnetic Lagrangians possible are those of the form
\[
\mathcal{L}_{em}(a, f) = -\ell(x(f), y(f))\, \epsilon[g], \tag{2.2}
\]
where $\epsilon[g]$ is the volume form on $\mathcal{M}$ induced by the metric $g$, $\ell$ is a real-valued function of two variables, and $x$ and $y$ are the electromagnetic invariants
\[
x(f) := -\frac{1}{2} {*(f \wedge {*f})} = \frac{1}{4} f_{\mu\nu} f^{\mu\nu}, \qquad
y(f) := \frac{1}{2} {*(f \wedge f)} = \frac{1}{4} f_{\mu\nu} {*f}^{\mu\nu}.
\]
Here $*$ is the Hodge star operator $* : \bigwedge^k(\mathcal{M}) \to \bigwedge^{4-k}(\mathcal{M})$ with respect to the metric $g$, defined by
\[
({*\sigma})_{\mu_1 \cdots \mu_{4-k}} = \frac{1}{k!}\, \sigma^{\nu_1 \cdots \nu_k}\, \epsilon[g]_{\nu_1 \cdots \nu_k \mu_1 \cdots \mu_{4-k}}.
\]
Note that for $k$-forms on a Lorentzian 4-manifold, $** = (-1)^{k+1}$. Furthermore, conservation of parity implies that $\ell(x, y) = \ell(x, -y)$, so that if we assume that $\ell$ is a $C^1$ function of its variables, then $\ell_y(x, 0) = 0$.
By an aether law we simply mean a particular function $\ell(x, y)$ as the Lagrangian density function, which determines the way the electromagnetic vacuum interacts with the spacetime geometry.^p For example, conventional (linear) electromagnetics corresponds to the choice $\ell = x$ made first by Maxwell. Using $*$ one defines a dot product on $k$-forms, by
\[
\sigma \cdot \tau = -{*(\sigma \wedge {*\tau})} = \frac{1}{k!}\, \sigma^{\mu_1 \cdots \mu_k} \tau_{\mu_1 \cdots \mu_k}.
\]
We also set $|\sigma|^2 := \sigma \cdot \sigma$ even though this is not necessarily positive. It follows from (2.2) that ${*\mathcal{L}_{em}} = \ell$ and
\[
{*\mathbf{M}} = \frac{\partial \ell}{\partial \mathbf{F}} = \ell_x \mathbf{F} + \ell_y\, {*\mathbf{F}}.
\]
The energy tensor $\mathbf{T}$ corresponding to the Lagrangian density function $\ell$ is a symmetric 2-covariant tensor field on $\mathcal{M}$ defined by
\[
T_{\mu\nu} = 2\, \frac{\partial \ell}{\partial g^{\mu\nu}} - g_{\mu\nu}\, \ell,
\]
which in the case of an electromagnetic Lagrangian yields
\[
T_{\mu\nu} = 2 \left( \frac{1}{2}\, \ell_x F_{\mu\lambda} F_\nu{}^{\lambda} + \frac{1}{2}\, \ell_y F_{\mu\lambda}\, {*F}_\nu{}^{\lambda} \right) - g_{\mu\nu}\, \ell = F_{\mu\lambda}\, {*M}_\nu{}^{\lambda} - g_{\mu\nu}\, \ell. \tag{2.3}
\]
Recall that if both of the following hold, the energy tensor $\mathbf{T}$ is said to satisfy the Dominant Energy Condition:

• $T_{\mu\nu} Y^\mu Y^\nu \geq 0$ for every future-directed timelike vector $Y$.
• The vector $-T^{\mu}{}_{\nu} Y^\nu$ is future-directed causal when $Y$ is a future-directed causal vector.

The first of these two is called the Weak Energy Condition. There is also a Strong Energy Condition:
\[
\left( T_{\mu\nu} - \frac{1}{2}\, g_{\mu\nu} \operatorname{tr} \mathbf{T} \right) Y^\mu Y^\nu \geq 0,
\]
for all future-directed timelike $Y$. One can prove [42] that the Dominant Energy Condition is satisfied for this field theory if and only if
\[
\ell_x > 0, \qquad \ell - x \ell_x - y \ell_y \geq 0.
\]
Note that under the above assumption, $\operatorname{tr} \mathbf{T} = -4(\ell - x \ell_x - y \ell_y) \leq 0$. We also note that it is possible for the Strong Energy Condition to be violated in nonlinear electrodynamics (even though, as we shall see, it can hold for nonlinear electrostatics).

^p Traditionally, aether law referred to the relationship between tensors $\mathbf{M}$ and $\mathbf{F}$, similar to the constitutive relations of elastodynamics relating stresses to strains for the medium. In case of a Lagrangian theory, this is given by (2.1), and thus the choice is that of a particular Lagrangian.
Next we define the following 1-forms, which provide a useful decomposition of the Faraday and Maxwell tensors. Let $K$ be an arbitrary^q non-null vectorfield on $\mathcal{M}$, and let^r
\[
E := i_K \mathbf{F}, \tag{2.4}
\]
\[
B := i_K {*\mathbf{F}}, \tag{2.5}
\]
\[
D := i_K {*\mathbf{M}} = \ell_x E + \ell_y B, \tag{2.6}
\]
\[
H := i_K {**\mathbf{M}} = -i_K \mathbf{M} = -\ell_y E + \ell_x B, \tag{2.7}
\]
where $i_K$ denotes the interior product with the vectorfield $K$, e.g. $(i_K \mathbf{F})_\nu = K^\mu F_{\mu\nu}$. For the Maxwell Lagrangian $\ell = x$ one has $D = E$ and $B = H$. A general aether law will specify $D$ and $B$ as functions of $E$ and $H$, or the other way around. Let $X := g(K, K)$. Thus $X > 0$ wherever $K$ is spacelike, $X = 0$ where $K$ is null, and $X < 0$ where $K$ is timelike. From the decomposition of $\mathbf{F}$ in terms of $E$ and $B$ it follows that
\[
x = \frac{|E|^2 - |B|^2}{2X}, \qquad y = \frac{E \cdot B}{X}. \tag{2.8}
\]
We also obtain that, by (2.3),
\[
\mathbf{T}(K, K) = E \cdot D - X \ell, \tag{2.9}
\]
while from (1.1) it follows that
\[
\mathbf{R}(K, K) = \mathbf{T}(K, K) - \frac{X}{2} \operatorname{tr} \mathbf{T} = B \cdot H + X \ell. \tag{2.10}
\]
For $K$ a timelike vectorfield, we can define two electromagnetic Hamiltonians (partial Legendre transforms of the Lagrangian with respect to either $E$ or $B$)^s:
\[
\mathcal{H}(E, H) := |X|^{-1} B \cdot \frac{\partial \ell}{\partial B} - \ell = |X|^{-1} B \cdot H - \ell,
\]
\[
\tilde{\mathcal{H}}(D, B) := -|X|^{-1} E \cdot \frac{\partial \ell}{\partial E} + \ell = |X|^{-1} E \cdot D + \ell.
\]
In the definition of $\mathcal{H}$ (respectively $\tilde{\mathcal{H}}$) we need to think of $B$ (respectively $E$) as given by the aether law. Also, the factors of $X$ in these will disappear in the next section, once the inner products are re-expressed in terms of a different metric on spacelike slices which is conformal to the one induced by $g$.

^q In the next section we will assume that $K$ is a timelike Killing field for $(\mathcal{M}, g)$, but the definitions in this section are independent of that.
^r Note that strictly speaking, only if $K$ were the unit tangent vectorfield to a timelike curve in $\mathcal{M}$ (the world-line of an observer) would we be justified in calling these the (flattened) electric field, magnetic induction, electric displacement, and magnetic field, respectively.
^s These can also be defined for a spacelike vectorfield, but there will be some sign changes.
We then have
\[
\mathbf{T}(K, K) = |X|\, \tilde{\mathcal{H}}(B, D), \qquad \mathbf{R}(K, K) = |X|\, \mathcal{H}(E, H).
\]
The Weak, Dominant, and Strong Energy Conditions, respectively, take the following form [43] in terms of the Hamiltonians
\[
\tilde{\mathcal{H}} \geq 0, \qquad \tilde{\mathcal{H}} \geq |\mathcal{H}|, \qquad \mathcal{H} \geq 0. \tag{2.11}
\]
It is easy to see [44] that $\mathcal{H}$ is actually only a function of the three scalar invariants that one can form out of $E$ and $H$ using the metric. More precisely, $\mathcal{H}(E, H) = h(\nu, \omega, \tau)$, where
\[
\nu := \frac{1}{2|X|} (E \cdot E + H \cdot H), \qquad
\omega := \frac{1}{|X|}\, E \cdot H, \qquad
\tau := \frac{1}{2|X|} (E \cdot E - H \cdot H),
\]
and that $h \in C^1(\mathbb{R}^3)$ is such that its gradient lies on the future unit hyperboloid in $\mathbb{R}^3$, i.e.
\[
h_\nu^2 - h_\omega^2 - h_\tau^2 = 1, \qquad h_\nu > 0.
\]
Similar results hold for $\tilde{\mathcal{H}}$, in terms of the invariants $\mu$, $\varpi$, $\sigma$ defined analogously in terms of $D$, $B$:
\[
\tilde{\mathcal{H}}(B, D) = \tilde{h}(\mu, \varpi, \sigma),
\]
where
\[
\mu := \frac{1}{2|X|} (D \cdot D + B \cdot B), \qquad
\varpi := \frac{1}{|X|}\, D \cdot B, \qquad
\sigma := \frac{1}{2|X|} (D \cdot D - B \cdot B),
\]
and likewise $\tilde{h} \in C^1(\mathbb{R}^3)$ with gradient lying on the future unit hyperboloid.

3. Electrostatic Spacetimes

Let $K$ be a timelike hypersurface-orthogonal Killing field for the spacetime $(\mathcal{M}, g)$. Let $X = g(K, K) < 0$ and define $e = dX$. Let
\[
h_{\mu\nu} = g_{\mu\nu} - \frac{1}{X}\, K_\mu K_\nu
\]
denote the metric induced on the quotient $Q$ of $\mathcal{M}$ under the symmetry generated by $K$, and let
\[
\gamma = |X|\, h,
\]
so that $\gamma$ is also a Riemannian metric on $Q$, conformal to $h$. Since $K$ is assumed to be twist-free, the quotient $Q$ can be identified with a spacelike hypersurface in $\mathcal{M}$.
Let $(x^i)$, $i = 1, 2, 3$, be an arbitrary coordinate system on $Q$. Then the line element of $g$ is given by
\[
ds^2 = X\, dt^2 + |X|^{-1} \gamma_{ij}\, dx^i dx^j.
\]
It follows in particular that $\overset{g}{\nabla} \cdot V = X\, \overset{\gamma}{\nabla} \cdot \left( \frac{1}{X} V \right)$ for any vectorfield $V$ such that $\mathcal{L}_K V = 0$. The Einstein–Maxwell system in the static case reduces [43] to the following set of equations:
\[
\overset{\gamma}{\nabla} \cdot \left( \frac{1}{X}\, e \right) = -\frac{2}{X}\, \mathcal{H}(E, H), \tag{3.1}
\]
\[
\overset{\gamma}{\nabla} \cdot \left( \frac{1}{X}\, D \right) = 0, \tag{3.2}
\]
\[
\overset{\gamma}{\nabla} \cdot \left( \frac{1}{X}\, B \right) = 0, \tag{3.3}
\]
\[
R_{ij}[\gamma] - \frac{1}{2} \gamma_{ij} R[\gamma] = \frac{1}{2X^2}\, e_i e_j + \frac{1}{X} (E_i D_j + B_i H_j) - \gamma_{ij} \left( \frac{1}{4X^2} |e|^2_\gamma + \frac{1}{X}\, \mathcal{H}(E, H) \right), \tag{3.4}
\]
where to close the system one has to remember that
\[
D = \left( \frac{\partial \mathcal{H}}{\partial E} \right)^{\flat}, \qquad B = \left( \frac{\partial \mathcal{H}}{\partial H} \right)^{\flat}, \tag{3.5}
\]
with $\flat$ being the operation of lowering indices with respect to the $\gamma$ metric. Equation (3.4) confirms that we can interpret the above as a system of Einstein equations for the manifold $Q$ coupled to the fields $(e, E, H)$. We further recall that Maxwell's equations for $\mathbf{F}$ and $\mathbf{M}$ also imply that $E$ and $H$ are exact 1-forms
\[
E = d\varphi, \qquad H = d\psi.
\]
Thus the above is a system of equations for the three potentials $X$, $\varphi$, $\psi$, and the metric $\gamma$ of the quotient manifold.

We also observe that the parity conservation assumption about the Lagrangian, $\ell_y(x, 0) = 0$, implies that, away from singularities, $D = 0$ whenever $E = 0$, and likewise $H = 0$ whenever $B = 0$. This fact, together with the invariance of the equations under interchanging $D$ with $B$ and $E$ with $H$, implies that the system of equations in the magnetostatic case $E \equiv 0$ is formally the same as that in the electrostatic case $H \equiv 0$, even though (it turns out) the restrictions on the Hamiltonian under which meaningful solutions can be obtained are different.^t

^t For examples of regular magnetostatic (magnetic monopole) solutions, see [28].
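In this paper we are confining ourselves to the electrostatic case, where ω = = 0, τ = ν and σ = µ. If we define
$$\eta(\nu) = h(\nu, 0, \nu), \qquad \zeta(\mu) = \tilde h(\mu, 0, \mu),$$
then in the electrostatic case $D = \frac{\partial\mathcal{H}}{\partial E} = \eta'(\nu)E$ and $E = \frac{\partial\tilde{\mathcal{H}}}{\partial D} = \zeta'(\mu)D$. We can moreover express η in terms of ζ. This is because in the electrostatic case B = H = 0, and thus
$$\mathcal{H} + \tilde{\mathcal{H}} = \langle D, E\rangle_\gamma = \left\langle D, \frac{\partial\tilde{\mathcal{H}}}{\partial D}\right\rangle_\gamma = 2\mu\,\frac{d\zeta}{d\mu},$$
so that we have
$$\eta(\nu) = 2\mu\zeta'(\mu) - \zeta(\mu). \tag{3.6}$$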
In terms of the reduced Hamiltonian ζ the Dominant Energy Condition (2.11) takes the following simple form
$$\zeta'(\mu) > 0, \qquad \zeta - \mu\zeta'(\mu) \ge 0, \qquad \forall\mu \ge 0, \tag{3.7}$$
while the Strong Energy Condition reads
$$2\mu\zeta'(\mu) - \zeta(\mu) \ge 0, \qquad \forall\mu \ge 0. \tag{3.8}$$
We note that in the electrostatic case, ν = −x and y = 0, thus $\eta(\nu) = -\ell(x, 0) = -\ell(-\nu, 0)$. On the other hand the function ζ can be obtained from ℓ via a Legendre transform: Given ℓ = ℓ(x, y) let $f(t) := -\ell(-\frac12 t^2, 0)$ and let $f_*(s) = \sup_t\,[st - f(t)]$ be the Legendre transform of f. Then it is easy to see that $\zeta(\mu) = f_*(\sqrt{2\mu})$. As an example, here are the Lagrangian density function originally proposed by Born [15], and its two reduced Hamiltonians:
$$\ell_B(x) = \sqrt{1 + 2x} - 1, \qquad \eta_B(\nu) = 1 - \sqrt{1 - 2\nu}, \qquad \zeta_B(\mu) = \sqrt{1 + 2\mu} - 1.$$
Let ξ := log(−X). With $\nu = \frac12|d\varphi|^2_\gamma$ and $D = \eta'(\nu)\,d\varphi$, the electrostatic Einstein–Maxwell system then becomes
$$\nabla^{\gamma}\cdot d\xi = 2e^{-\xi}\eta(\nu), \tag{3.9}$$
$$\nabla^{\gamma}\cdot(e^{-\xi}D) = 0, \tag{3.10}$$
$$R_{ij}[\gamma] - \frac12\gamma_{ij}R[\gamma] = \frac12\,\partial_i\xi\,\partial_j\xi - e^{-\xi}\eta'(\nu)\,\partial_i\varphi\,\partial_j\varphi - \gamma_{ij}\left(\frac14|d\xi|^2_\gamma - e^{-\xi}\eta(\nu)\right). \tag{3.11}$$
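The Born example can be checked numerically. The following is only a sketch under stated assumptions: the function names are hypothetical, the supremum in the Legendre transform is approximated on a uniform grid over 0 ≤ t ≤ 1 (sufficient here because s ≥ 0 and f is even), and the relation ν = µζ′(µ)², which follows from E = ζ′(µ)D, is used to test (3.6).

```python
import math

def zeta_born(mu):
    """Born's reduced Hamiltonian zeta_B(mu) = sqrt(1 + 2*mu) - 1."""
    return math.sqrt(1.0 + 2.0 * mu) - 1.0

def zeta_born_prime(mu):
    """Derivative of zeta_B with respect to mu."""
    return 1.0 / math.sqrt(1.0 + 2.0 * mu)

def eta_born(nu):
    """Born's reduced Hamiltonian eta_B(nu) = 1 - sqrt(1 - 2*nu), for nu < 1/2."""
    return 1.0 - math.sqrt(1.0 - 2.0 * nu)

def f_born(t):
    """f(t) = -l_B(-t**2 / 2) with l_B(x) = sqrt(1 + 2*x) - 1, for |t| <= 1."""
    return 1.0 - math.sqrt(max(0.0, 1.0 - t * t))

def legendre_sup(f, s, lo=0.0, hi=1.0, n=20000):
    """Grid approximation of f_*(s) = sup_t [s*t - f(t)] over t in [lo, hi]."""
    best = -math.inf
    for k in range(n + 1):
        t = lo + (hi - lo) * k / n
        best = max(best, s * t - f(t))
    return best

def check_born(mu):
    """Return the two residuals: zeta = f_*(sqrt(2 mu)) and identity (3.6)."""
    s = math.sqrt(2.0 * mu)
    legendre_residual = legendre_sup(f_born, s) - zeta_born(mu)
    nu = mu * zeta_born_prime(mu) ** 2          # nu = |E|^2 / 2 with E = zeta'(mu) D
    identity_residual = eta_born(nu) - (2.0 * mu * zeta_born_prime(mu) - zeta_born(mu))
    return legendre_residual, identity_residual
```

Calling `check_born(mu)` for a few positive values of µ should return two residuals that are small compared with ζ_B(µ) itself; the first is limited by the grid resolution, the second only by rounding.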
4. Spherical Symmetry

If we assume that the spacetime (and the electromagnetic field) in addition to being static is also spherically symmetric, there will be a further reduction in the Einstein–Maxwell system. In particular, using the area-radius coordinate r we may rewrite the metric γ as follows
$$\gamma_{ij}\,dx^i dx^j = dr^2 + e^{\xi} r^2 (d\theta^2 + \sin^2\theta\, d\phi^2),$$
where now ξ = ξ(r), and also ϕ = ϕ(r), D = D(r)dr. From (3.10),
$$0 = \nabla^{\gamma}\cdot(e^{-\xi}D) = \frac{1}{e^{\xi}r^2}\,\partial_r\!\left(e^{\xi}r^2 e^{-\xi}D\right) = \frac{1}{e^{\xi}r^2}\,\partial_r(r^2 D),$$
and thus one obtains that
$$D = \frac{c}{r^2}\,dr,$$
where c is an arbitrary constant. On the other hand, since $|dr|^2_\gamma = 1$,
$$\mu = \frac12|D|^2_\gamma = \frac{c^2}{2r^4}. \tag{4.1}$$
From (3.9), we now obtain
$$\frac{1}{r^2 e^{\xi}}\,\frac{d}{dr}\!\left(e^{\xi}r^2\,\frac{d\xi}{dr}\right) = 2e^{-\xi}\eta(\nu).$$
Change variable to $u = \frac1r$ to obtain
$$\frac{d^2}{du^2}\,(e^{\xi}) = c^2\,\frac{\eta(\nu)}{\mu}. \tag{4.2}$$
We can now use (3.6), integrate (4.2) twice, change the order of integration and recompute the kernel to obtain a formula for ξ as a function of r:
$$e^{\xi(r)} = c' + \frac{c''}{r} + \frac{c^2}{r}\int_r^\infty \frac{\zeta(\mu)}{\mu}\,\frac{dr}{r^2}, \tag{4.3}$$
where c, c′, c″ are arbitrary constants, and $\mu = \frac{c^2}{2r^4}$. The requirement that the solution be asymptotically flat now forces c′ = 1. On the other hand, setting ζ ≡ 0 should give the Schwarzschild solution, hence c″ = −2m0 where m0 is the mass parameter in the Schwarzschild metric. We can also find an expression for the electrostatic potential ϕ as a function of r. Recall that
$$\varphi'(r)\,dr = d\varphi = E = \zeta'(\mu)D = c\,\zeta'(\mu)\,\frac{dr}{r^2},$$
and thus
$$\varphi(r) = c\int_r^\infty \zeta'(\mu)\,\frac{dr}{r^2}.$$
Since in the Maxwell–Maxwell case ζ(µ) = µ, comparison with the RWN solution shows that c = q0, the charge parameter in RWN. Finally, a direct computation shows [43] that (3.11) is identically satisfied.
Thus we have the following generalization of RWN to nonlinear gravitoelectrostatics:
$$e^{\xi(r)} = 1 - \frac{2m_0}{r} + \frac{2}{r}\int_r^\infty \zeta(\mu)\,r^2\,dr, \tag{4.4}$$
$$\varphi(r) = q_0\int_r^\infty \zeta'(\mu)\,\frac{dr}{r^2}; \qquad \mu = \frac{q_0^2}{2r^4}. \tag{4.5}$$
These simple and elegant formulae seem to have first appeared in [25], with a different derivation, although special cases of them were known long before (see below). Note that only the reduced Hamiltonian ζ makes an appearance in them. The prospect of generating exact solutions to the Einstein–Maxwell system with interesting and desirable properties, just by inserting a judiciously chosen ζ into these formulae, has proven to be irresistible to many theoreticians. In particular we should mention the solution found by Hoffmann [45] to the Einstein–Maxwell–Born–Infeld system, which corresponds to the following Hamiltonian:
$$\zeta_{BI}(\mu) = \sqrt{1 + 2\mu} - 1. \tag{4.6}$$
Another early example is the solution found by Infeld and Hoffmann [23], where they made the following choice
$$\zeta_{IH}(\mu) = \log(1 + \mu), \tag{4.7}$$
and obtained a metric which was completely smooth and free of all singularities! Their work was followed up by Rao [24], who attempted to find a large family of actions leading to such solutions. Infeld and Hoffmann however may also be the first of many investigators who have made the mistake of picking a Hamiltonian that is not admissible because it cannot arise from a Lagrangian: One has to remember that ζ(µ) is the electrostatic reduction of the Hamiltonian $\tilde{\mathcal{H}}(B, D)$, which in turn is subject to the following restrictions:

(1) It must in the weak field limit agree with Maxwell's choice for the aether law.
(2) It must correspond to an energy tensor that satisfies the dominant energy condition.
(3) It must be the Legendre–Fenchel transform in E of a Lagrangian density ℓ = ℓ(E, B), i.e. it must be convex in D.

It follows from the above that the function ζ is subject to the following requirements:

(R1) $\lim_{\mu\to0} \frac{\zeta(\mu)}{\mu} = 1$.
(R2) $\zeta'(\mu) > 0$ and $\zeta - \mu\zeta'(\mu) \ge 0$ for all µ > 0.
(R3) $\zeta'(\mu) + 2\mu\zeta''(\mu) \ge 0$ for all µ > 0.

The Hamiltonian (4.7) proposed by Infeld and Hoffmann violates the third condition above. Therefore it cannot arise from a single-valued Lagrangian.
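The admissibility requirements can be spot-checked on a grid. The sketch below is illustrative only: the names `BORN_INFELD`, `INFELD_HOFFMANN`, `GRID` and `check_requirements` are hypothetical, the derivatives are entered by hand, and a finite grid can expose a violation but cannot prove that a requirement holds for all µ > 0.

```python
import math

# Each candidate is a triple (zeta, zeta', zeta'') of plain Python callables.
BORN_INFELD = (
    lambda mu: math.sqrt(1.0 + 2.0 * mu) - 1.0,      # (4.6)
    lambda mu: (1.0 + 2.0 * mu) ** -0.5,
    lambda mu: -((1.0 + 2.0 * mu) ** -1.5),
)
INFELD_HOFFMANN = (
    lambda mu: math.log1p(mu),                        # (4.7)
    lambda mu: 1.0 / (1.0 + mu),
    lambda mu: -1.0 / (1.0 + mu) ** 2,
)

# Logarithmic grid from 1e-3 to 1e3 (an arbitrary choice for this sketch).
GRID = [10.0 ** (k / 10.0) for k in range(-30, 31)]

def check_requirements(candidate, grid=GRID, tol=1e-12):
    """Evaluate (R1)-(R3) on a grid; True means no violation was found."""
    zeta, d1, d2 = candidate
    small = 1e-6                                      # probe for the limit mu -> 0
    return {
        "R1": abs(zeta(small) / small - 1.0) < 1e-5,
        "R2": all(d1(mu) > 0.0 and zeta(mu) - mu * d1(mu) >= -tol for mu in grid),
        "R3": all(d1(mu) + 2.0 * mu * d2(mu) >= -tol for mu in grid),
    }
```

For the Infeld–Hoffmann choice, ζ′ + 2µζ″ = (1 − µ)/(1 + µ)², which is negative for µ > 1, so the grid check reports a violation of (R3) only.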
The condition (R3) is equivalent to insisting that for the above solution (4.5), $\nu = \frac12(\varphi'(r))^2$ be monotone decreasing in r (note that $\mu = q_0^2/(2r^4)$ is always monotone, independent of the choice of a Lagrangian). It was shown by Bronnikov et al. [32] that this monotonicity requirement of ν rules out the possibility of having electrostatic solutions with a regular center (i.e. no curvature blowup at r = 0) that are at the same time Maxwellian in the weak field limit. The same argument applies to show that the three conditions above are incompatible with having a regular center.^u Therefore spacetimes corresponding to (4.4) and (4.5) must have some kind of a singularity at r = 0. In view of this fact, fantastic claims about existence of singularity-free point-charge metrics in nonlinear electrodynamics, which every now and then appear in the literature, should be viewed with a healthy dose of skepticism, and the "Hamiltonian" involved should be examined carefully, for it may violate one or more of the above conditions.

We note also that (R3) implies in particular that the Strong Energy Condition, which we mentioned before can be violated in nonlinear electrodynamics, nevertheless holds in the electrostatic case, since
$$0 \le 2\mu\zeta'' + \zeta' = (2\mu\zeta' - \zeta)',$$
which upon integration on [0, µ] and using (R1) implies that (3.8) holds. This will have important consequences, as we will see below.

Interestingly, a question that should have been addressed long ago, but was not, is this: Since the above requirements rule out solutions that are everywhere regular, what is then the mildest singular behavior allowed by them? In the next section we characterize those static spherically symmetric point-charge metrics that have the mildest form of singularity possible at their center.

5. Singularities

5.1. Singularities in the metric

We begin by calculating the curvature tensor of a static spherically symmetric metric of the form
$$ds^2 = -f^2\,dt^2 + f^{-2}\,dr^2 + r^2\,d\Omega^2,$$
with f = f(r). The nonzero components of the Riemann tensor are
$$R_{0101} = ff'' + f'^2, \qquad R_{0202} = R_{0303} = -R_{1212} = -R_{1313} = r^{-1}ff', \qquad R_{2323} = r^{-2}(1 - f^2).$$
The indices here refer to the rigid frame $\{\omega^{(\mu)}\}$ defined as follows:
$$\omega^{(0)} = f\,dt, \qquad \omega^{(1)} = f^{-1}\,dr, \qquad \omega^{(2)} = r\,d\theta, \qquad \omega^{(3)} = r\sin\theta\,d\phi.$$

^u Another possibility is, of course, not to have a center at all [28]. The topology of such a spacetime, however, does not seem to lend itself to the point-charge concept.
Thus the Kretschmann scalar is, with $X = -f^2$,
$$K^2 = X''^2 + r^{-2}X'^2 + r^{-4}(1 + X)^2. \tag{5.1}$$
It is evident from (5.1) that for there to be no spacetime curvature singularity at r = 0 it is necessary and sufficient that $1 + X(r) = O_2(r^2)$ as r → 0.^v More generally, the Kretschmann scalar K will blow up like $r^{-\alpha}$ if and only if $1 + X(r) = O_2(r^{2-\alpha})$. For spherically symmetric, electrostatic spacetimes,
$$X = -1 + \frac{2}{r}\,m(r),$$
where
$$m(r) := m_0 - \int_r^\infty \zeta\!\left(\frac{q_0^2}{2r^4}\right) r^2\,dr$$
is the mass function. Thus we see that K will blow up at least like $r^{-3}$ if m(0) ≠ 0, as it is for example in the case of the "negative mass" Schwarzschild solution, where m0 < 0 and q0 = 0. For the superextremal RWN solution, the situation is much worse since m(0) = −∞. Since our goal here is to characterize solutions which are as mildly singular as possible at the location of the charge, we may start by requiring m(0) = 0. Now, since
$$m(0) = m_0 - \frac{|q_0|^{3/2}}{2^{11/4}}\int_0^\infty y^{-7/4}\zeta(y)\,dy = m_0 - |q_0|^{3/2} I_\zeta$$
and m0 is an integration constant, it is always possible to meet this requirement as long as
$$I_\zeta := 2^{-11/4}\int_0^\infty y^{-7/4}\zeta(y)\,dy < \infty. \tag{5.2}$$
From now on we will add this to the list of requirements that the reduced Hamiltonian ζ must satisfy:

(R4) $\int_0^\infty \mu^{-7/4}\zeta(\mu)\,d\mu < \infty$.
^v For a continuous function $f : \mathbb{R}_+ \to \mathbb{R}$, integer k and α > 0 we say that $f = O_k(r^\alpha)$ if $\lim_{r\to0} r^{j-\alpha}\,d^jf/dr^j$ exists and is finite for j = 0, . . . , k.

Note that this new restriction implies the following: If ζ(µ) is assumed to grow like a power $\mu^\alpha$, then we must have α < 3/4, which of course rules out the Maxwellian case. Having made such a choice of the integration constant m0, we now observe that $m_{ADM} = m(\infty) = |q_0|^{3/2} I_\zeta$. This means that the mass of this spacetime is entirely of electrical nature. Moreover, by an appropriate scaling of the aether law, namely $\ell_\beta(x, y) := \beta^{-4}\,\ell(\beta^4 x, \beta^4 y)$,
which for the reduced Hamiltonian amounts to using $\zeta_\beta(\mu) = \beta^{-4}\zeta(\beta^4\mu)$ in place of ζ, it is possible to "fit" the ADM mass and total charge of this solution to the empirical mass and charge of any particle. This is because $I_{\zeta_\beta} = \beta^{-1} I_\zeta$, so for a given pair of numbers (q, m) setting q0 = q and
$$\beta := \frac{|q|^{3/2} I_\zeta}{m} \tag{5.3}$$
will result in m(∞) = m, while at the same time the scaled version of an admissible Hamiltonian function ζ will remain admissible. In this connection it is worth mentioning that if one carries out this procedure for the Born–Infeld Lagrangian, with m0 and q0 set to the mass and charge of the electron, the value for the scaling parameter β thus obtained coincides with the value originally proposed by Born [20]. Once m0 is chosen as above, the mass function can be rewritten as follows:
$$m(r) = \int_0^r \zeta_\beta\!\left(\frac{q^2}{2r^4}\right) r^2\,dr = m - \int_r^\infty \zeta_\beta\!\left(\frac{q^2}{2r^4}\right) r^2\,dr. \tag{5.4}$$
We can now obtain some estimates for m(r). Recall the second part of (R2): $\zeta(\mu) \ge \mu\zeta'(\mu)$. Integrating this inequality on [0, µ] and using (R1) we easily obtain
$$\zeta(\mu) \le \mu, \qquad \forall\mu > 0.$$
This gives the following lower bound for the mass function:
$$m(r) \ge m - \frac{q^2}{2r},$$
which in turn gives the following bound on the metric coefficient $e^\xi$:
$$e^\xi \le 1 - \frac{2m}{r} + \frac{q^2}{r^2}.$$
These two estimates are clearly only useful for large r. In fact, assuming slightly more than (R1), one can turn these into large-r asymptotics for m and $e^\xi$. Namely, let us assume that

(R1′) $\zeta(\mu) = \mu + O(\mu^{5/4})$, as µ → 0.

Substituting into the second expression for the mass function in (5.4) we obtain
$$m(r) = m - \frac{q^2}{2r} + O\!\left(\frac{1}{r^2}\right), \qquad e^\xi = 1 - \frac{2m}{r} + \frac{q^2}{r^2} + O\!\left(\frac{1}{r^3}\right).$$
As advertised, these asymptotics are identical to those of the RWN solution.
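The fitting procedure can be illustrated for the Born–Infeld profile (4.6). This is a sketch, not the paper's computation: the helper names are hypothetical, the quadrature is a plain composite Simpson rule, and the integrals are first rewritten by hand. With q0 = 1, the substitution y = 1/(2r⁴) turns (5.2) into the integral of √(1 + r⁴) − r² over (0, ∞); the part r > 1 is folded onto (0, 1) by r → 1/r, and both pieces are written in a form free of cancellation.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for g on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def i_zeta_born():
    """I_zeta of (5.2) for zeta(mu) = sqrt(1 + 2*mu) - 1, folded onto [0, 1]."""
    def integrand(r):
        root = math.sqrt(1.0 + r ** 4)
        # sqrt(1 + r^4) - r^2 on (0, 1), plus the image of (1, inf) under r -> 1/r
        return 1.0 / (root + r * r) + 1.0 / (root + 1.0)
    return simpson(integrand, 0.0, 1.0)

def fit_beta(q, m, i_zeta):
    """Scaling parameter beta of (5.3)."""
    return abs(q) ** 1.5 * i_zeta / m

def adm_mass_born(q, beta):
    """m(inf): integral of zeta_beta(q^2 / (2 s^4)) s^2 over (0, inf).

    With a = beta * sqrt(|q|) the integrand equals q^2 / (sqrt(s^4 + a^4) + s^2);
    the part s > a is folded onto (0, a) by s -> a^2 / s.
    """
    a = beta * math.sqrt(abs(q))
    def integrand(s):
        root = math.sqrt(s ** 4 + a ** 4)
        return q * q * (1.0 / (root + s * s) + 1.0 / (root + a * a))
    return simpson(integrand, 0.0, a)
```

For example, `adm_mass_born(q, fit_beta(q, m, i_zeta_born()))` should reproduce the prescribed m up to quadrature error, which is the statement that (5.3) results in m(∞) = m.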
We now need to establish the small-r behavior of the mass function. Clearly, $m(r) = O_2(r^3)$ if and only if ζ goes to a constant as µ → ∞. On the other hand, we recall (R3) on ζ, i.e. the convexity requirement. It is equivalent to
$$\left(\mu^{1/2}\zeta'(\mu)\right)' \ge 0. \tag{5.5}$$
Integrating this on an interval [µ0, µ] for a fixed µ0 > 0, we obtain
$$\zeta(\mu) \ge C_1 + C_2\sqrt{\mu},$$
with C1, C2 ≠ 0 constants depending on µ0. Therefore the reduced Hamiltonian must grow at least like $\mu^{1/2}$ in order to satisfy this requirement. In particular, it cannot go to a constant at infinity, hence there will be curvature blowup at r = 0 no matter what aether law is chosen, as anticipated in [32]. Assuming now that the reduced Hamiltonian grows at the slowest possible rate, namely like $\mu^{1/2}$, we obtain that there must be a conical singularity at r = 0 where K blows up like $r^{-2}$, and that there are no horizons for this metric. In order to do this rigorously, we need to make the growth condition more precise:

(R5) There exist positive constants $J_\zeta$, $K_\zeta$, $L_\zeta$ depending only on the profile ζ such that
$$J_\zeta\sqrt{\mu} - K_\zeta \le \zeta(\mu) \le J_\zeta\sqrt{\mu}, \qquad \frac{J_\zeta}{2\mu^{1/2}} - \frac{L_\zeta}{\mu} \le \zeta'(\mu) \le \frac{J_\zeta}{2\mu^{1/2}}.$$

We should here pause to mention that an example of a reduced Hamiltonian satisfying all five requirements (R1)–(R5) is the one of Born–Infeld (4.6). Many other examples can easily be constructed by considering smooth, monotone increasing, concave functions of µ that behave like µ for small µ and like $c\sqrt{\mu}$ for large µ. Assuming (R5), from (5.4) we have
$$m(r) \le \frac{A\varepsilon^2}{2}\,r, \qquad A := \frac{\sqrt{2}\,J_\zeta}{I_\zeta^2}, \qquad \varepsilon := \frac{m}{|q|}. \tag{5.6}$$
This right away implies that there will be no horizons as long as ε is small enough, since
$$-X = 1 - \frac{2m(r)}{r} \ge 1 - A\varepsilon^2 > 0.$$
Combining (R5) with our previously obtained bounds for the reduced Hamiltonian, we obtain
$$\max\{0,\, J_\zeta\sqrt{\mu} - K_\zeta\} \le \zeta(\mu) \le \min\{\mu,\, J_\zeta\sqrt{\mu}\}, \qquad \max\left\{0,\, \frac{J_\zeta}{2\mu^{1/2}} - \frac{L_\zeta}{\mu}\right\} \le \zeta'(\mu) \le \min\left\{1,\, \frac{J_\zeta}{2\mu^{1/2}}\right\}. \tag{5.7}$$
We can now use this to obtain a lower bound for the mass function that does not degenerate near r = 0, namely
$$m(r) \ge \frac{A\varepsilon^2}{2}\,r - \frac{K_\zeta}{3\beta^4}\,r^3, \tag{5.8}$$
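As a numerical illustration of (5.6) and (5.8), the sketch below evaluates the mass function for the Born–Infeld profile and compares it with the two bounds. Everything here is an assumption of the sketch rather than part of the text: the helper names, the constants J_ζ = √2 and K_ζ = 1 (which come from √(2µ) − 1 ≤ ζ_BI(µ) ≤ √(2µ)), and the closed form Γ(1/4)²/(6√π) used for I_ζ of this profile, which can itself be checked by quadrature.

```python
import math

# Assumed constants for the Born-Infeld profile zeta(mu) = sqrt(1 + 2*mu) - 1.
I_ZETA = math.gamma(0.25) ** 2 / (6.0 * math.sqrt(math.pi))   # value of (5.2)
J_ZETA, K_ZETA = math.sqrt(2.0), 1.0                           # constants in (R5)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for g on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def scales(q, m):
    """Return (epsilon, beta, A) as defined in (5.3) and (5.6)."""
    eps = m / abs(q)
    beta = abs(q) ** 1.5 * I_ZETA / m
    big_a = math.sqrt(2.0) * J_ZETA / I_ZETA ** 2
    return eps, beta, big_a

def mass_function(r, q, m):
    """m(r) of (5.4); the integrand is written as q^2 / (sqrt(s^4 + a^4) + s^2)."""
    _, beta, _ = scales(q, m)
    a = beta * math.sqrt(abs(q))          # length scale with a^4 = beta^4 q^2
    return simpson(lambda s: q * q / (math.sqrt(s ** 4 + a ** 4) + s * s), 0.0, r)

def mass_bounds(r, q, m):
    """Lower bound (5.8) and upper bound min{A eps^2 r / 2, m}."""
    eps, beta, big_a = scales(q, m)
    lower = 0.5 * big_a * eps ** 2 * r - K_ZETA * r ** 3 / (3.0 * beta ** 4)
    upper = min(0.5 * big_a * eps ** 2 * r, m)
    return lower, upper
```

Evaluating `mass_function` and `mass_bounds` at a few radii with, say, q = 1 and m = 0.1 gives a value between the two bounds, and 1 − 2m(r)/r stays above 1 − Aε², in line with the no-horizon remark after (5.6).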
as well as the following small-r asymptotics for m:
$$m(r) = \frac{A\varepsilon^2}{2}\,r - \frac{B^2\varepsilon^6}{2m^2}\,r^3 + o_2(r^3), \qquad B^2 := \frac{2K_\zeta}{3I_\zeta^4}.$$
Consequently,
$$-X = e^\xi = a^2 + b^2r^2 + o_2(r^2)$$
with $a = \sqrt{1 - A\varepsilon^2}$ and $b = \frac{B\varepsilon^3}{m}$. This in turn implies that there is a conical singularity at r = 0, because the line element of the spacetime metric is $X\,dt^2 - X^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)$, and the coefficient of $dr^2$ at r = 0 is $\frac{1}{a^2}$, which is greater than one. In fact, introducing standard Cartesian space coordinates $(x^1, x^2, x^3)$ near r = 0, with $r = (\sum_{i=1}^3 (x^i)^2)^{1/2}$, we see that the line element in these coordinates is
$$X\,dt^2 + \sum_{i,j}\left[\left(\frac{1}{|X|} - 1\right)\frac{x^ix^j}{r^2} + \delta_{ij}\right]dx^i dx^j,$$
thus the metric has no continuous extension at r = 0 unless a = 1. We therefore need to take the axis r = 0 out of the spacetime manifold M, which gives it the topology of $\mathbb{R}^4$ minus a line. We also note that, given a profile ζ, the deficit angle of the conical singularity is proportional to ε = m/|q|. This will be quite small precisely when the empirical charge-to-mass ratio of the particle to which this solution is being fitted is large, which happens to be the case for the electron and the proton, etc. This means that in the study of these metrics it is permissible to consider the small-ε regime.

The last observation we would like to make about the metric before moving on to discussing the electric field, is that the metric coefficient $e^\xi = -X$ is monotone increasing. The easiest way to see this is from (4.2). Recall that the assumption (R3) implies that the Strong Energy Condition is satisfied, and thus η(ν) ≥ 0, so that $e^\xi$ is convex as a function of $u = \frac1r$. We have already established that $a^2 \le e^\xi \le 1$, $e^\xi|_{u=0} = 1$, $\lim_{u\to\infty} e^\xi = a^2 < 1$. Thus $e^\xi$ cannot have a local maximum or a local minimum at a finite u, and must therefore be monotone decreasing in u, hence monotone increasing in r. We will see that this fact, which is in great contrast to the behavior of the same metric coefficient in the RWN spacetimes, has important consequences, in particular for the trajectories of test particles.

5.2. Singularities in the electric field

In the spherically symmetric, electrostatic case, the only nonzero component of the Faraday tensor is $F_{rt} = \varphi'(r)$ and the 1-form $E = i_K F = d\varphi$. We have
$$\varphi(r) = q\int_r^\infty \zeta'\!\left(\frac{q^2\beta^4}{2r^4}\right)\frac{dr}{r^2}.$$
One easily computes that
$$\mathrm{sgn}(q)\,\varphi(0) = \frac{3\varepsilon}{2}, \qquad \mathrm{sgn}(q)\,\varphi'(0) = -\frac{A\varepsilon^3}{2m}.$$
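The values of ϕ(0) and ϕ′(0) can be checked numerically for the Born–Infeld profile. The sketch below rests on several assumptions that are not in the text: the helper names are hypothetical, I_ζ for this profile is taken in closed form as Γ(1/4)²/(6√π), J_ζ = √2 so that A = 2/I_ζ², and the integrand ζ_β′(q²/(2s⁴))/s² is simplified by hand to 1/√(s⁴ + a⁴) with a = β√|q|. The slope at the origin is estimated by a one-sided difference.

```python
import math

# Assumed closed-form value of (5.2) for zeta(mu) = sqrt(1 + 2*mu) - 1.
I_ZETA = math.gamma(0.25) ** 2 / (6.0 * math.sqrt(math.pi))
A_BORN = 2.0 / I_ZETA ** 2                 # A = sqrt(2) J / I^2 with J = sqrt(2)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule for g on [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def potential(r, q, m):
    """phi(r) = q * integral from r to infinity of ds / sqrt(s^4 + a^4)."""
    beta = abs(q) ** 1.5 * I_ZETA / m
    a = beta * math.sqrt(abs(q))
    g = lambda s: 1.0 / math.sqrt(s ** 4 + a ** 4)
    if r < a:
        # [r, a] directly; [a, inf) equals the integral over [0, a] after s -> a^2 / s
        return q * (simpson(g, r, a) + simpson(g, 0.0, a))
    # the same substitution maps [r, inf) onto [0, a^2 / r]
    return q * simpson(g, 0.0, a * a / r)

def slope_at_origin(q, m, h=1e-3):
    """One-sided finite-difference estimate of phi'(0)."""
    return (potential(h, q, m) - potential(0.0, q, m)) / h
```

With q = −1 and m = 0.1, for instance, sgn(q)·`potential(0.0, q, m)` should come out close to 3ε/2 and sgn(q)·`slope_at_origin(q, m)` close to −Aε³/(2m), while for large r the product rϕ(r)/q approaches one, i.e. the Coulomb behavior.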
Moreover, using (5.7), we have
$$\max\left\{0,\ \frac{3\varepsilon}{2} - \frac{A\varepsilon^3}{2m}\,r\right\} \le \mathrm{sgn}(q)\,\varphi(r) \le \min\left\{\frac{|q|}{r},\ \frac{3\varepsilon}{2} - \frac{A\varepsilon^3}{2m}\,r + \frac{C\varepsilon^7}{3m^3}\,r^3\right\},$$
$$\max\left\{-\frac{|q|}{r^2},\ -\frac{A\varepsilon^3}{2m}\right\} \le \mathrm{sgn}(q)\,\varphi'(r) \le \min\left\{0,\ -\frac{A\varepsilon^3}{2m} + \frac{C\varepsilon^7}{m^3}\,r^2\right\},$$
with $C := 2L_\zeta/I_\zeta^4$. In particular, ϕ is monotone decreasing, and is asymptotic to the Coulomb potential as r → ∞. Moreover, ϕ′(0) ≠ 0, and thus E becomes undefined at r = 0. More precisely, since E = ϕ′(r)dr and dr is a unit covector whose direction is undefined at r = 0, it follows that the covectorfield E is of finite magnitude and undefined direction, i.e. has a point defect at r = 0. E is otherwise smooth.

We now check that the total electrostatic energy is finite, and in fact is equal to the ADM mass: By virtue of (1.1), the energy tensor T is divergence free, i.e. $\nabla^\mu T_{\mu\nu} = 0$. Let $P_\mu := -T_{\mu\nu}K^\nu$ where $K = \partial_t$ is the timelike Killing field of (M, g). It follows that δP = ∗d∗P = 0 and thus by the divergence theorem ∗P is a conserved current, i.e. $\int_{\Sigma_t} {*P}$ is independent of t. With coordinates (t, r, θ, φ) as before, K = (1, 0, 0, 0) and we calculate that the only nonzero component of ∗P is
$$({*P})_{123} = -g^{00}P_0\sqrt{-\det g} = \frac{-1}{X}\,T(K, K)\sqrt{-\det g} = \tilde{\mathcal{H}}(B, D)\sqrt{-\det g} = \zeta(\mu)\,r^2\sin\theta.$$
Thus, defining the total electromagnetic energy carried by the slice $\Sigma_t$ to be
$$\mathcal{E} := \frac{1}{4\pi}\int_{\Sigma_t} {*P},$$
we see that for the particle-spacetimes under study
$$\mathcal{E} = \frac{1}{4\pi}\int_0^\infty\!\!\int_0^{2\pi}\!\!\int_0^{\pi} ({*P})_{123}\,dr\,d\theta\,d\phi = \int_0^\infty \zeta(\mu)\,r^2\,dr = m,$$
as promised.

6. Uniqueness in the Spherically Symmetric Class

Birkhoff's celebrated theorem [34] states that any spherically symmetric solution of vacuum Einstein's equations $R_{\mu\nu} = 0$ is locally isometric to a region in Schwarzschild spacetime.^w This was generalized to the electrovacuum case by Hoffmann [39], thereby proving the uniqueness of the RWN solution in the spherically symmetric class. Many other extensions and generalizations have followed since, see [40] and references therein. Here we prove that the charged-particle spacetime whose existence we have established in the above, enjoys the same uniqueness property.

^w This result is often paraphrased inaccurately as "any spherically symmetric vacuum solution of Einstein's equations must be static", which makes it no longer true in general [36]. The theorem was discovered first by Jebsen [35], whose proof appeared two years before Birkhoff's. Jebsen's proof, however, contains an error [36]. There is a more general result, by Eiesland [37], on necessary and sufficient conditions for the existence of an extra Killing field for spherically symmetric (not necessarily vacuum) spacetimes, a preliminary version of which was also announced [38] before Birkhoff's book but the final paper appeared two years after it. Birkhoff's theorem is a corollary of Eiesland's result.

To begin, we recall the definition of spherical symmetry: A Lorentzian manifold (M, g) is locally spherically symmetric if every point in M has an open neighborhood V such that

(1) There is a linearly independent set of vector fields $\{U_i\}_{i=1}^3$ on V which generate a faithful representation of the Lie algebra of the rotation group SO(3), and whose orbits are two-dimensional and spacelike. (Note that the orbits do not need to be contained in V.)
(2) $\mathcal{L}_{U_i} g = 0$ in V for i = 1, 2, 3, i.e. $U_i$ are Killing vector fields for the metric.

Next we recall the following (see [37] for a proof):

Proposition 6.1. The necessary and sufficient condition for local spherical symmetry of a smooth Lorentzian manifold (M, g) is that every point in M has a neighborhood in which there exists a system of local coordinates (t, r, θ, φ) such that the line element of the metric in those coordinates has the form
$$ds^2 = -A^2(t, r)\,dt^2 + B^2(t, r)\,dr^2 + C^2(t, r)(d\theta^2 + \sin^2\theta\,d\phi^2), \tag{6.1}$$
with A, B, C smooth functions of (t, r).^x In this case, the three Killing fields generating spherical symmetry are
$$U_1 = \sin\phi\,\partial_\theta + \cot\theta\cos\phi\,\partial_\phi, \qquad U_2 = \cos\phi\,\partial_\theta - \cot\theta\sin\phi\,\partial_\phi, \qquad U_3 = \partial_\phi,$$
and we can check that these vectorfields satisfy the commutation relations of the Lie algebra of the rotation group: $[U_i, U_j] = \epsilon^k{}_{ij}U_k$, where $\epsilon_{ijk}$ is the completely antisymmetric 3-symbol. We also note that $[U_i, Z] = 0$ if Z is either $\partial_t$ or $\partial_r$. Moreover, we can express $\partial_\theta$ and $\partial_\phi$ in terms of $(U_1, U_2)$:
$$\partial_\theta = \sin\phi\,U_1 + \cos\phi\,U_2, \qquad \partial_\phi = \tan\theta\,(\cos\phi\,U_1 - \sin\phi\,U_2). \tag{6.2}$$
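The commutation relations and (6.2) can be verified numerically at a sample point. The sketch below is illustrative only: the function names are hypothetical, each field is represented by its (θ, φ) components, and the Lie bracket is approximated by central differences rather than computed symbolically.

```python
import math

def U1(theta, phi):
    """Components (U^theta, U^phi) of U1 = sin(phi) d_theta + cot(theta) cos(phi) d_phi."""
    return (math.sin(phi), math.cos(phi) / math.tan(theta))

def U2(theta, phi):
    """Components of U2 = cos(phi) d_theta - cot(theta) sin(phi) d_phi."""
    return (math.cos(phi), -math.sin(phi) / math.tan(theta))

def U3(theta, phi):
    """Components of U3 = d_phi."""
    return (0.0, 1.0)

def lie_bracket(U, V, theta, phi, h=1e-5):
    """Components of [U, V] at (theta, phi), derivatives by central differences."""
    def directional(X, W):
        # X^b d_b W^a for a in (theta, phi)
        x_t, x_p = X(theta, phi)
        wt_plus, wt_minus = W(theta + h, phi), W(theta - h, phi)
        wp_plus, wp_minus = W(theta, phi + h), W(theta, phi - h)
        return tuple(
            x_t * (wt_plus[a] - wt_minus[a]) / (2.0 * h)
            + x_p * (wp_plus[a] - wp_minus[a]) / (2.0 * h)
            for a in range(2)
        )
    uv = directional(U, V)
    vu = directional(V, U)
    return tuple(uv[a] - vu[a] for a in range(2))

def close(x, y, tol=1e-7):
    """Componentwise comparison of two 2-tuples."""
    return all(abs(a - b) < tol for a, b in zip(x, y))
```

At a generic point such as (θ, φ) = (1.0, 0.7), `lie_bracket(U1, U2, ...)` agrees with U3 and the cyclic permutations behave likewise; points with sin θ = 0 must be avoided because of the cot θ terms.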
Suppose now that $F = F_{\mu\nu}\,dx^\mu\wedge dx^\nu$ is a spherically symmetric 2-form defined in a neighborhood V of a spherically symmetric manifold (M, g), i.e. $\mathcal{L}_{U_i}F = 0$ for i = 1, 2, 3. Let X, Y, Z be any three vectorfields on M. For any 2-form F we have the identity
$$(\mathcal{L}_X F)(Y, Z) = X\,F(Y, Z) - F([X, Y], Z) - F(Y, [X, Z]).$$

^x Jebsen's error was that he had assumed C(t, r) ≡ r. This can only be achieved if ∇C is spacelike, thus for a complete proof one had to consider three other cases as well, of ∇C being timelike, null, or zero.
Thus, for a spherically symmetric F, we obtain the following:

(1) $U_i\,F(\partial_t, \partial_r) = 0$, i = 1, 2, 3.
(2) $U_i\,F(U_j, Z) = \epsilon^k{}_{ij}\,F(U_k, Z)$ for i, j = 1, 2, 3 and $Z \in \{\partial_t, \partial_r\}$.
(3) $U_i\,F(U_j, U_k) = 0$ for i, j, k distinct, and $U_i\,F(U_i, U_j) = \epsilon^k{}_{ij}\,F(U_i, U_k)$ for i = 1, 2, 3.

Thus, from item (1), $\partial_\theta F(\partial_t, \partial_r) = \partial_\phi F(\partial_t, \partial_r) = 0$ which implies that
$$F(\partial_t, \partial_r) = f(t, r). \tag{6.3}$$
From item (2) we have $U_1 F(U_3, Z) = -F(U_2, Z)$, $U_2 F(U_3, Z) = F(U_1, Z)$ and $U_3 F(U_3, Z) = 0$, giving a differential equation for $F(U_3, Z)$, and solving that we obtain that for each choice of Z there is a function a(t, r) such that
$$F(\partial_\phi, Z) = a(t, r)\sin\theta. \tag{6.4}$$
On the other hand, item (2) also implies $U_3 F(U_1, Z) = F(U_2, Z)$, $U_3 F(U_2, Z) = -F(U_1, Z)$, which gives a differential system for $F(U_i, Z)$, for i = 1, 2. Solving it, making use of (6.2) and (6.4) one obtains that for each choice of Z there is a function b = b(t, r) such that
$$F(\partial_\theta, Z) = b(t, r). \tag{6.5}$$
Finally, in a similar manner item (3) gives rise to a differential system for $F(U_3, U_i)$, i = 1, 2, which upon solving yields that there is a function c(t, r) such that $F(\partial_\theta, \partial_\phi) = c(t, r)\sin\theta$. We have thus shown that a spherically symmetric 2-form F on a spherically symmetric manifold, with coordinates (t, r, θ, φ) as above, must have the following form
$$F_{tr} = f(t, r), \quad F_{t\theta} = b_1(t, r), \quad F_{t\phi} = a_1(t, r)\sin\theta, \quad F_{r\theta} = b_2(t, r), \quad F_{r\phi} = a_2(t, r)\sin\theta, \quad F_{\theta\phi} = c(t, r)\sin\theta. \tag{6.6}$$
Now, assume that F is a closed 2-form: dF = 0. Using the identity $\mathcal{L}_X F = i_X dF + d\,i_X F$, valid for any vectorfield X and any differential form F, we obtain that for a spherically symmetric closed 2-form F,
$$d\,i_{U_j}F = 0, \qquad j = 1, 2, 3.$$
Writing the resulting differential equations for the components of F as found in (6.6) one then sees that the only solution is
$$a_1 = a_2 = b_1 = b_2 = 0, \qquad c(t, r) = c,$$
with c an arbitrary constant. Thus we have proven the following Lemma (stated without proof in [39]):

Lemma 6.1. On a spherically symmetric manifold M with coordinates (t, r, θ, φ) the only nonzero components of a spherically symmetric Faraday tensor F are
$$F_{tr} = f(t, r), \qquad F_{\theta\phi} = c\sin\theta, \tag{6.7}$$
where f is an arbitrary function and c an arbitrary constant.

Note that the constant c here is the total magnetic charge:
$$c = \frac{-1}{4\pi}\int_S F,$$
where S is any spacelike 2-sphere in M that contains the origin. We can now compute the energy tensor T. Recall that the line element of the metric g has the form (6.1). The nonzero components of the dual tensor ∗F are therefore
$${*F}_{tr} = \frac{AB}{C^2}\,c, \qquad {*F}_{\theta\phi} = \frac{-C^2}{AB}\,f(t, r)\sin\theta,$$
and the dual to the Maxwell tensor $*M = \ell_x F + \ell_y\,{*F}$ is computed to have components
$${*M}_{tr} = \ell_x f + c\,\ell_y\frac{AB}{C^2}, \qquad {*M}_{\theta\phi} = \left(c\,\ell_x - \ell_y\frac{C^2}{AB}\,f\right)\sin\theta. \tag{6.8}$$
We now quote the main theorem in [37] (slightly reworded to match our notation):

Theorem 6.1 ([37]). The necessary and sufficient conditions that a locally spherically symmetric Lorentzian manifold with line element
$$ds^2 = -A^2dt^2 + B^2dr^2 + C^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$
A, B and C being arbitrary functions of t and r, and C not a constant, shall admit a one-parameter group of isometries generated by a vectorfield $K = k^0(t, r)\partial_t + k^1(t, r)\partial_r$ are
$$A^2\,G^t{}_r = \Psi'\,\partial_rC\,\partial_tC, \tag{6.9}$$
$$AB\,(G^r{}_r - G^t{}_t) = \Psi'\,C\,[A^2(\partial_rC)^2 + B^2(\partial_tC)^2], \tag{6.10}$$
where $G^\nu{}_\mu = R^\nu{}_\mu - \frac12 R\,\delta^\nu_\mu$ is the Einstein tensor and Ψ = Ψ(C) is an arbitrary function of C, or a constant. If the above conditions are satisfied, then the Killing field K is, up to a constant multiple, given by
$$k^0(t, r) = -\frac{1}{e^{\Psi}AB}\,\partial_rC, \qquad k^1(t, r) = \frac{1}{e^{\Psi}AB}\,\partial_tC.$$
We are now in a position to state our uniqueness result:

Theorem 6.2. Let (M, g, F) be a locally spherically symmetric solution of the Einstein–Maxwell system (1.1)–(1.3). There exists a system of local coordinates (t, r, θ, φ) on M in which the line element of the metric g takes the form (6.1). If ∇C is spacelike in M, then there exists a Killing vectorfield K which is hypersurface-orthogonal and timelike everywhere in M, i.e. the solution is static as well, and in the case of no magnetic charge it is thus isomorphic to a region in the electrostatic charged-particle spacetime found in our Sec. 4.

Proof. The existence of the coordinate system is guaranteed by Proposition 6.1. By Lemma 6.1 the Faraday tensor F in these coordinates has the form (6.7). On the other hand, by virtue of (1.1) we must have $G^\nu{}_\mu = T^\nu{}_\mu$. The relevant components of the energy tensor T are easily computed from (6.7) and (6.8) to be:
$$T^t{}_t = T^r{}_r = \frac{-f}{A^2B^2}\left(\ell_x f + c\,\ell_y\frac{AB}{C^2}\right) - \ell, \qquad T^r{}_t = 0.$$
We thus see that the solution satisfies the conditions of Eiesland's theorem, with Ψ a constant. We can take the components of the Killing field to be $k^0 = -\partial_rC/(AB)$ and $k^1 = \partial_tC/(AB)$. In that case
$$g(K, K) = -(\partial_rC/B)^2 + (\partial_tC/A)^2 = -g^{-1}(\nabla C, \nabla C),$$
and thus K is timelike if and only if ∇C is spacelike. Since dK is easily seen to be proportional to dt ∧ dr, the twist of K vanishes: K ∧ dK = 0, and therefore K is hypersurface-orthogonal, i.e. the solution is static. It would then be a matter of changing the coordinates (t, r) into new coordinates (t′, r′) that satisfy
$$Kt' = k^0\frac{\partial t'}{\partial t} + k^1\frac{\partial t'}{\partial r} = 1, \qquad Kr' = k^0\frac{\partial r'}{\partial t} + k^1\frac{\partial r'}{\partial r} = 0, \tag{6.11}$$
which is solvable by quadratures, in order for the line element of g in the new coordinates to have the form
$$ds^2 = -P^2dt'^2 + Q^2dr'^2 + R^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$
where P, Q, R are functions of r′ only, and R is not a constant. At this point we may again use that $G^0{}_0 - G^1{}_1 = 0$, where the indices now refer to the new coordinates $x^0 = t'$, $x^1 = r'$. From the definition of $G^\nu{}_\mu$ we compute (see, e.g., [37]) that
$$G^0{}_0 = \frac{1}{Q^3R^2}\,(Q^3 + 2Q'RR' - QR'^2 - 2QRR''), \qquad G^1{}_1 = \frac{1}{PQ^2R^2}\,(PQ^2 - PR'^2 - 2RR'P').$$
The equality of the two is now easily seen to imply that there exists a constant λ such that PQ = λR′. Thus letting τ = λt′ and taking (τ, R, θ, φ) as the new
coordinates, the metric takes the form
$$ds^2 = -\frac{P^2}{\lambda^2}\,d\tau^2 + \frac{\lambda^2}{P^2}\,dR^2 + R^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$
which is the form we assumed in Sec. 4 in order to derive the static spherically symmetric solution we found.

Remark 6.1. The above uniqueness result can most likely be strengthened by considering the remaining cases, of ∇C being timelike, null, or zero, finding in each case what the solution reduces to, similar to the treatment of the cosmological vacuum solutions in [40]. We do not pursue this here, however.

7. Precise Statement of the Main Result

We are now in a position to give the precise version of the main result:

Theorem 7.1. Let $\zeta : \mathbb{R}_+ \to \mathbb{R}_+$ be any $C^2$ function satisfying the following conditions:

(1) $\zeta(\mu) = \mu + O(\mu^{5/4})$ as µ → 0.
(2) $\zeta'(\mu) > 0$ and $\zeta - \mu\zeta' \ge 0$ for all µ > 0.
(3) $\zeta'(\mu) + 2\mu\zeta''(\mu) \ge 0$ for all µ > 0.
(4) $I_\zeta := \int_0^\infty \mu^{-7/4}\zeta(\mu)\,d\mu < \infty$.
(5) There exist positive constants $\mu_0$, $J_\zeta$, $K_\zeta$, $L_\zeta$ such that $J_\zeta\sqrt{\mu} - K_\zeta \le \zeta(\mu) \le J_\zeta\sqrt{\mu}$, and $\frac{J_\zeta}{2\mu^{1/2}} - \frac{L_\zeta}{\mu} \le \zeta'(\mu) \le \frac{J_\zeta}{2\mu^{1/2}}$ hold for µ > µ0.

Let m > 0 and q ≠ 0 be two fixed real numbers, and let ε := m/|q|. Define
$$\zeta_\beta(\mu) := \frac{1}{\beta^4}\,\zeta(\beta^4\mu), \qquad \beta := \frac{|q|^{3/2} I_\zeta}{m}.$$
Let ℓ be any electromagnetic Lagrangian density function with the property that the electrostatic reduction of its Hamiltonian is the function $\zeta_\beta$. Then the following hold:

(1) The Einstein–Maxwell system (1.1)–(1.3) with the electromagnetic Lagrangian density $L = -\ell\,{*1}$ has a unique electrostatic, spherically symmetric, asymptotically flat solution, the maximal analytic extension of which is a Lorentzian manifold (M, g), called a charged-particle spacetime. M is topologically equivalent to $\mathbb{R}^4$ minus a line. There exists a global coordinate system (t, r, θ, φ) on M with the property that $K := \partial_t$ is an everywhere-timelike Killing field, r is the area-radius of rotation group orbits, r = 0 is invariant under K, and (θ, φ) are standard spherical coordinates on $S^2$.
(2) Any region in a spherically symmetric electrovacuum spacetime where the area-radius of the rotation group orbits has a spacelike gradient, is necessarily static as well, and in the case of no magnetic charge is isometric to a region in the corresponding charged-particle spacetime (M, g) defined above.
(3) The line element of the metric g has the form
$$ds^2 = -e^\xi dt^2 + e^{-\xi}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2),$$
where
$$e^\xi := 1 - \frac{2m(r)}{r}, \qquad m(r) := \int_0^r \zeta_\beta\!\left(\frac{q^2}{2s^4}\right)s^2\,ds,$$
and m(r) is the mass function of the spacetime M. In particular, m(0) = 0, m is increasing in r, and the ADM mass of M is m(∞) = m. The mass function moreover has the following asymptotics:
$$m(r) = \frac{A\varepsilon^2}{2}\,r - \frac{B^2\varepsilon^6}{2m^2}\,r^3 + o_2(r^3) \quad\text{as } r \to 0,$$
$$m(r) = m - \frac{q^2}{2r} + O\!\left(\frac{1}{r^2}\right) \quad\text{as } r \to \infty,$$
where $A := \frac{\sqrt{2}\,J_\zeta}{I_\zeta^2}$ and $B^2 := \frac{2K_\zeta}{3I_\zeta^4}$. Consequently, $g_{rr}|_{r=0} = (1 - A\varepsilon^2)^{-1} > 1$ and thus the metric has a conical singularity at r = 0.
(4) The mass function satisfies the following bounds
$$\max\left\{\frac{A\varepsilon^2}{2}\,r - \frac{B^2\varepsilon^6}{2m^2}\,r^3,\ m - \frac{q^2}{2r}\right\} \le m(r) \le \min\left\{\frac{A\varepsilon^2}{2}\,r,\ m\right\}.$$
Consequently, for ε small enough, $g_{tt} < 0$ everywhere, which implies that there are no horizons in M, hence the conical singularity at r = 0 is naked.
(5) r = 0 is a curvature singularity, where the Kretschmann scalar blows up like $r^{-2}$. Any spherically symmetric electrostatic spacetime homeomorphic to $\mathbb{R}^4$ minus a line, whose reduced Hamiltonian satisfies the first three conditions listed above for ζ, will necessarily have a singularity at r = 0 which is at least this strong.
(6) The spacetime M is endowed with an electromagnetic field F = dA, where A = −ϕ(r)dt,
$$\varphi = q\int_r^\infty \zeta_\beta'\!\left(\frac{q^2}{2s^4}\right)\frac{ds}{s^2}$$
is the electrostatic potential, and $E := e^{-\xi/2}d\varphi$ is the (flattened) electric field.^y The potential ϕ is smooth, monotone decreasing, and has the following asymptotics
$$\mathrm{sgn}(q)\,\varphi(r) = \frac{3\varepsilon}{2} - \frac{A\varepsilon^3}{2m}\,r + O(r^3) \quad\text{as } r \to 0,$$
$$\varphi(r) = \frac{q}{r} + O\!\left(\frac{1}{r^3}\right) \quad\text{as } r \to \infty.$$
Thus the total charge of the spacetime is q. The electric field E is smooth everywhere except at r = 0 where it has a point-defect, i.e. its magnitude has a finite limit but its direction is undefined.

^y That is, $E = i_{\hat K}F$ where $\hat K$ is the unit vectorfield in the direction of K.
(7) The total electrostatic energy carried by a time slice is finite and is equal to $\mathrm{m}$; therefore the mass of the spacetime is entirely of electric origin.
(8) Radial null geodesics fall into the singularity and are thus incomplete in one direction, while null geodesics with nonzero angular momentum are infinitely extendible in both directions. For $\epsilon$ small, these are deflected by the singularity by an amount that is proportional to $\epsilon^2$, and there are no trapped null geodesics.
(9) Similarly, timelike geodesics and test-charge trajectories can only reach the singularity if they are radial, while non-radial ones have either bound orbits or escape orbits. The singularity at $r = 0$, in contrast to the one in the superextremal RWN solution, is gravitationally attractive.
The only items in the above theorem that we have not yet proved are the last two, regarding geodesics. These will be established below.
8. Geodesics and Test-Charge Trajectories
Consider the following Lagrangian density, defined on a velocity bundle (see [41] for definitions):
\[
L(q, v) = \frac{1}{2}\, g_{\alpha\beta}(q)\, v^\alpha v^\beta + e A_\lambda(q)\, v^\lambda,
\]
where $A$ is an electromagnetic vector potential defined on $\mathcal{M}$, $q \in \mathcal{M}$ and $v \in T_q\mathcal{M}$. The corresponding action
\[
\mathcal{A}[q] = \int_{\mathbb{R}} L(q, \dot q)\, ds
\]
is defined on curves $q(s) = (q^\alpha(s))$ in $\mathcal{M}$, where the dot represents differentiation with respect to an affine parameter $s$, which is related to arclength parametrization ("proper time" for timelike curves) $\tau$ by $\tau = ms$. The stationary points of this action are, for $e = 0$ and $m^2 = 1, 0, -1$, respectively timelike, null, and spacelike geodesics of the spacetime. Moreover, stationary points of the action with $e \neq 0$ and $m^2 = 1$ represent possible worldlines of a "test charge"z of mass $m$ and charge $e$ in the spacetime $(\mathcal{M}, g)$ permeated by the electromagnetic field $F$.
z This is a fairly standard treatment of test charges, cf. [46–50]. Notice however that it is not without conceptual problems: If one thinks, e.g., as in [49, 50], that the charged-particle spacetime in question is the spacetime of an electron, then one cannot possibly "test" an electron with an electron, i.e. a particle which has charge comparable to that of the charged-particle spacetime can by no stretch of imagination be considered a test particle, since its effect on the geometry of spacetime cannot be ignored, while on the other hand if the spacetime is to represent an elementary particle then of course no particle of smaller charge exists to test it with! The situation considered here is thus merely a cartoon, and proper treatment of this subject is postponed to a future paper, where we plan to consider charged-particle spacetimes featuring two point charges.
The Euler–Lagrange equations for these curves are
\[
\frac{D}{d\tau}\frac{dq^\alpha}{d\tau} = \frac{e}{m}\, F^{\alpha}{}_{\beta}(q)\, \frac{dq^\beta}{d\tau},
\]
where $D/d\tau$ is the covariant differentiation operation on tangent vectors, and $F$ is the electromagnetic field tensor. Let
\[
p_\alpha = \frac{\partial L}{\partial \dot q^\alpha} = g_{\alpha\beta}(q)\, v^\beta + eA_\alpha(q)
\]
be the canonical momenta, and $H$ the Hamiltonian corresponding to $L$, defined by
\[
H = p_\alpha v^\alpha - L = \frac{1}{2}\, g^{\alpha\beta}(q)\,(p_\alpha - eA_\alpha(q))(p_\beta - eA_\beta(q)) = \frac{1}{2}|p|^2 - eA\cdot p + \frac{e^2}{2}|A|^2 .
\]
The geodesic equations in Hamiltonian form are $\dot q = \partial H/\partial p$ and $\dot p = -\partial H/\partial q$. The first constant of motion is $H$ itself, and from the normalization condition discussed above we have that along solutions $H = -\frac{1}{2}m^2$. Moreover, $J(p, q)$ is a constant of motion iff $\{J, H\} = 0$.
Let us now take $g$ to be a point-charge metric, with empirical charge $q$ and mass $\mathrm{m}$ (that is entirely of electric origin). That is, $g$ is the spherically symmetric electrostatic solution to the Einstein–Maxwell system with a nonlinear aether law, corresponding to a reduced Hamiltonian $\zeta_\beta$ that is subject to the restrictions (R1)–(R5) discussed in the previous sections but otherwise arbitrary, and $F$ is the corresponding electromagnetic field permeating this spacetime. The line element of $g$ thus has the form
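\[
ds_g^2 = -e^{\xi}\,dt^2 + e^{-\xi}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2),
\]
where $(\theta, \varphi)$ are spherical coordinates on the standard unit sphere $S^2$. Moreover $F = dA$ where $A = \phi(r)\,dt$, and the functions $\xi(r)$, $\phi(r)$ are given in terms of $\zeta$ by
\[
e^{\xi} = 1 - \frac{2\mathrm{m}(r)}{r}; \qquad \mathrm{m}(r) = \int_0^r \zeta_\beta\!\left(\frac{q^2}{2s^4}\right) s^2\, ds; \tag{8.1}
\]
\[
\phi = q\int_r^\infty \zeta_\beta'\!\left(\frac{q^2}{2s^4}\right)\frac{ds}{s^2}; \qquad \zeta_\beta(x) = \beta^{-4}\zeta(\beta^4 x), \tag{8.2}
\]
with $\beta$ as in (5.3). With coordinates $(x^\alpha) = (t, r, \theta, \varphi)$ thus chosen, we have
\[
p_t = -e^{\xi}\dot t + e\phi; \qquad p_r = e^{-\xi}\dot r; \qquad p_\theta = r^2\dot\theta; \qquad p_\varphi = r^2\sin^2\theta\,\dot\varphi,
\]
and
\[
H = \frac{1}{2}\left\{ -e^{-\xi}(p_t - e\phi)^2 + e^{\xi}p_r^2 + \frac{1}{r^2}\,p_\theta^2 + \frac{1}{r^2\sin^2\theta}\,p_\varphi^2 \right\}. \tag{8.3}
\]
Since $\partial H/\partial t = \partial H/\partial\varphi = 0$ we get two more constants of motion, $p_t$ and $p_\varphi$, and we call the values that they take on a solution the energy $E$ and the angular momentum about the $\varphi$ axis $\Phi$:
\[
p_t \equiv E; \qquad p_\varphi \equiv \Phi.
\]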
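As a consistency check, the first half of Hamilton's equations, $\dot q = \partial H/\partial p$, applied to (8.3) should simply invert the expressions for the momenta given above. The following worked computation is a sketch that uses only (8.3) and the definitions of $E$ and $\Phi$; the angular variable is written $\varphi$ and the potential $\phi$.

```latex
\begin{align*}
\dot t &= \frac{\partial H}{\partial p_t} = -e^{-\xi}\,(p_t - e\phi) = -e^{-\xi}\,(E - e\phi),
  &&\text{i.e. } p_t = -e^{\xi}\dot t + e\phi,\\
\dot r &= \frac{\partial H}{\partial p_r} = e^{\xi}\,p_r,
  &&\text{i.e. } p_r = e^{-\xi}\dot r,\\
\dot\theta &= \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{r^2},
  &&\text{i.e. } p_\theta = r^2\dot\theta,\\
\dot\varphi &= \frac{\partial H}{\partial p_\varphi} = \frac{p_\varphi}{r^2\sin^2\theta} = \frac{\Phi}{r^2\sin^2\theta},
  &&\text{i.e. } p_\varphi = r^2\sin^2\theta\,\dot\varphi,\\
\dot p_t &= -\frac{\partial H}{\partial t} = 0, \qquad
\dot p_\varphi = -\frac{\partial H}{\partial \varphi} = 0 .
\end{align*}
```

The last line is the statement that $p_t$ and $p_\varphi$ are conserved, since neither $\xi$ nor $\phi$ depends on $t$ or $\varphi$.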
A fourth constant is also found, by observing that
\[
\frac{d}{ds}\left(r^2\dot\theta\right) = \frac{d}{ds}\,p_\theta = -\frac{\partial H}{\partial\theta} = \frac{\Phi^2}{r^2\sin^3\theta}\cos\theta .
\]
Thus, if one can arrange it so that $\theta = \pi/2$ when $\dot\theta = 0$, then $\ddot\theta = 0$ as well, so that $\theta$ will stay constant at $\pi/2$. But since we still have the freedom of choosing the axes for the two spherical coordinates, we can always choose the $\varphi$ axis in such a way that the plane $\theta = \pi/2$ is the plane through the origin that contains the geodesic's initial velocity vector, so that $\dot\theta = 0$ initially. We have thus shown that all geodesics of these particle-spacetimes are planar, and for any single geodesic we can always assume that it is contained in the equatorial plane $\theta = \pi/2$.
Now, evaluating (8.3) along a solution we obtain that $(-e^{-\xi}(E - e\phi)^2 + e^{-\xi}\dot r^2 + m^2)\,r^2 \equiv -\Phi^2$, which can be used to find an equation for $\dot r$:
\[
\dot r^2 = (E - e\phi)^2 - \left(\frac{\Phi^2}{r^2} + m^2\right)e^{\xi}.
\]
With four constants of motion the equations are reduced to a first-order system of ODEs. We proceed to study this system, first for timelike and null geodesics ($e = 0$) and then for test charges ($e \neq 0$).
8.1. Geodesics
For timelike (respectively null) geodesics, $e = 0$ and $m = 1$ (respectively $m = 0$). The equations are
\[
\dot t = -e^{-\xi}E, \qquad \dot r = \pm\sqrt{E^2 - \left(\frac{\Phi^2}{r^2} + m^2\right)e^{\xi}}, \qquad \dot\varphi = \frac{1}{r^2}\,\Phi. \tag{8.4}
\]
We observe that for $\Phi = 0$ the geodesic is radial, with $\dot r = \pm\sqrt{E^2 - m^2e^{\xi}}$. Thus, since $a^2 \le e^{\xi} < 1$, for $E < ma$ there is no solution. For $ma \le E \le m$ the geodesic does not have enough energy to escape, and will fall radially towards the singularity $r = 0$ where curvature blows up. The geodesic reaches the singularity at a finite parameter value, since the function under the square root has only a simple zero. This is also the case for geodesics with $E > m$ (and therefore for all null geodesics), the only difference being that they can reach infinity, and in fact have a well-defined asymptotic velocity: $(dr/dt)|_{r=\infty} = \sqrt{1 - m^2/E^2}$. All radial geodesics are therefore inextendible in one direction and thus incomplete. Let $\Phi > 0$ now.
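Before turning to $\Phi > 0$, the three energy regimes for radial geodesics described above can be summarized in a few lines of code. This is only an illustrative sketch: the function names `classify_radial_geodesic` and `asymptotic_speed` are hypothetical, and the only input taken from the text is that $e^{\xi}$ ranges over $[a^2, 1)$ with $0 < a < 1$.

```python
import math


def classify_radial_geodesic(E, m, a):
    """Classify a radial geodesic (Phi = 0) by its energy E > 0.

    m is the rest mass of the geodesic (1 for timelike, 0 for null) and
    a is the limiting value of sqrt(e^xi) at r = 0, assumed in (0, 1).
    """
    if E <= 0 or not (0 < a < 1):
        raise ValueError("need E > 0 and 0 < a < 1")
    if E < m * a:
        # E^2 - m^2 e^xi < 0 everywhere, since e^xi >= a^2
        return "no solution"
    if E <= m:
        # cannot reach r = infinity, where e^xi tends to 1
        return "trapped: falls into r = 0"
    return "unbound: reaches infinity in one direction"


def asymptotic_speed(E, m):
    """Value of sqrt(1 - m^2/E^2), the speed at r = infinity when E > m."""
    if E <= m:
        raise ValueError("only defined for E > m")
    return math.sqrt(1.0 - (m / E) ** 2)
```

For a null geodesic ($m = 0$) every positive energy falls in the unbound regime and the asymptotic speed is 1, in agreement with the remark that all radial null geodesics can reach infinity.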
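It is more convenient to reformulate the equations in terms of a reciprocal radial variable, and to measure length in units of $\mathrm{m}$. Let
\[
x := \frac{\mathrm{m}}{r}.
\]
From the $\dot r$ equation in (8.4) one easily obtains
\[
\frac{d\varphi}{dx} = \frac{\pm 1}{\sqrt{\delta^2 - (x^2 + \gamma^2)\left(1 - 2xM\!\left(\dfrac{x}{\epsilon^2}\right)\right)}},
\]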
where
\[
\gamma := \frac{\mathrm{m}\,m}{\Phi}, \qquad \delta := \frac{\mathrm{m}\,E}{\Phi}, \qquad \epsilon := \frac{\mathrm{m}}{|q|},
\]
and $M(y)$ is the normalized mass function
\[
M(y) := \frac{1}{2^{11/4}I_\zeta}\int_{I_\zeta^4y^4/2}^{\infty}\zeta(\mu)\,\mu^{-7/4}\,d\mu = 1 - \frac{1}{2^{11/4}I_\zeta}\int_0^{I_\zeta^4y^4/2}\zeta(\mu)\,\mu^{-7/4}\,d\mu,
\]
which, in view of (5.7), satisfies the following bounds:
\[
\max\left\{1 - \frac{1}{2}y,\ \frac{A}{2y} - \frac{B^2}{2y^3}\right\} \le M(y) \le \min\left\{1,\ \frac{A}{2y}\right\},
\]
\[
\max\left\{-\frac{1}{2},\ \frac{-A}{2y^2}\right\} \le M'(y) \le \min\left\{0,\ \frac{-A}{2y^2} + \frac{3B^2}{2y^4}\right\},
\]
where
\[
A := \frac{\sqrt{2}\,J_\zeta}{I_\zeta^2} \ge 1, \qquad B^2 := \frac{2K_\zeta}{3I_\zeta^4},
\]
are constants that depend only on the profile $\zeta$. Note that for the RWN metric, $M_{\mathrm{RWN}}(y) = 1 - \frac{1}{2}y$. Let
\[
f_{\gamma,\epsilon}(x) := (x^2 + \gamma^2)\left(1 - 2xM\!\left(\frac{x}{\epsilon^2}\right)\right).
\]
We note that for null geodesics, $\gamma = 0$. In that case, it is not hard to see that $f_{0,\epsilon}(x) = x^2e^{\xi(\mathrm{m}/x)}$ is monotone increasing for $\epsilon$ small enough; thus there is a unique $x_0 > 0$ such that $\delta^2 = f_{0,\epsilon}(x_0)$, and we must have $x < x_0$ along any null geodesic. Hence $r_0 := \mathrm{m}/x_0$ is the perihelion for the geodesic, i.e. the closest it can get to the singularity. Furthermore, since $f_{0,\epsilon}$ has no critical points, it follows that there are no bounded orbits for null geodesics, so that all null geodesics with $\Phi > 0$ can be extended in both directions to an infinite value for the parameter, and the events corresponding to those infinite values are points at infinity. This can be easily seen from the $\dot r$ equation
\[
ds = \frac{\pm\,dr}{\sqrt{E^2 - \dfrac{\Phi^2}{r^2}\,e^{\xi}}} \qquad \text{for } r_0 < r < \infty,
\]
where $s$ is any affine parameter along the null geodesic. Since the expression under the square root has only a simple zero at $r = r_0$ and goes to a constant as $r \to \infty$, we see that the integral of the right-hand side of the above is finite over any finite subinterval of $[r_0, \infty)$, and that this integral diverges only when the upper limit is infinite. We thus have $(d\varphi/dx)^2 = 1/(f_{0,\epsilon}(x_0) - f_{0,\epsilon}(x))$. For the Minkowski space, $f_{0,\epsilon}(x) = x^2$ and $x_0 = \delta$, thus the solution is $\varphi(x) - \varphi(0) = \sin^{-1}(x/\delta)$, which
as expected is a straight line at a distance $1/\delta$ from the origin. Since the particle-spacetime is always asymptotically Euclidean, $\Phi/(\mathrm{m}E) = 1/\delta$ is the impact parameter, the distance from the central singularity of the initial asymptote of the geodesic. Geodesics with $1/\delta = 0$ go straight to the central singularity, while geodesics with $1/\delta > 0$ will be deflected by it, as we will see below.
We find it more convenient to parametrize the family of null geodesics in the equatorial plane by their reciprocal perihelion $x_0$. Let $\varphi_0$ denote the angle corresponding to $x_0$. We then have
\[
\varphi - \varphi_0 = \int_x^{x_0}\frac{\pm\,dx'}{\sqrt{f_{0,\epsilon}(x_0) - f_{0,\epsilon}(x')}} .
\]
The integral is convergent at the upper limit since, as we said, the above denominator has only a simple zero there. Letting $\varphi_\pm$ denote the two values for $\varphi$ obtained at $x = 0$, we have
\[
\varphi_+ - \varphi_- = 2\int_0^{x_0}\frac{dx}{\sqrt{f_{0,\epsilon}(x_0) - f_{0,\epsilon}(x)}} =: \pi + d(x_0).
\]
The quantity $d(x_0)$ represents the total deflexion of null geodesics due to spacetime curvature. For the Minkowski space, one easily computes that $d(x_0) = 0$ for all $x_0 > 0$. By contrast, for a static, spherically symmetric spacetime whose line element is $-a^2dt^2 + (1/a^2)dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$ with $a^2 < 1$, so that it has a conical singularity at $r = 0$, we have $f(x) = a^2x^2$ and as a result $d(x_0) = \pi(\frac{1}{a} - 1) > 0$. One can thus say that the null geodesics are "bent" by the gravitational attraction of the conical singularity. Below we will establish the lower bound $d(x_0) \ge c\epsilon^2$ when $x_0$ is sufficiently large, or equivalently, when $r_0/\mathrm{m}$ is small. For large $r_0/\mathrm{m}$ on the other hand, the geodesic stays far away from the singularity and thus its qualitative behavior is the same as in the RWN spacetime, as analyzed in [48] and more extensively in [51, §40].
As to the lower bound, we have, using that $a^2x^2 \le f_{0,\epsilon}(x) \le a^2x^2 + b^2$, with $a^2 = 1 - A\epsilon^2$ and $b^2 = B^2\epsilon^6$,
\[
\frac{1}{2}(\pi + d(x_0)) \ge \int_0^{x_0}\frac{dx}{\sqrt{f_{0,\epsilon}(x_0) - a^2x^2}} = \frac{1}{a}\sin^{-1}\frac{ax_0}{\sqrt{f_{0,\epsilon}(x_0)}} \ge \frac{1}{a}\sin^{-1}\frac{1}{\sqrt{1 + (b/ax_0)^2}} .
\]
Let $0 < c < A\pi/4$ be fixed and assume $x_0 \ge \dfrac{B\epsilon}{A\pi/4 - c}$. We then have
\[
\begin{aligned}
\frac{1}{2}(\pi + d(x_0)) &\ge \frac{1}{\sqrt{1 - A\epsilon^2}}\left[\sin^{-1}\left(1 - \frac{1}{2}\left(\frac{b}{ax_0}\right)^2\right) + O(\epsilon^4)\right]\\
&= \left(1 + \frac{A}{2}\epsilon^2 + O(\epsilon^4)\right)\left(\frac{\pi}{2} - \frac{B}{x_0}\epsilon^3 + O(\epsilon^4)\right)\\
&\ge \frac{\pi}{2} + \left(\frac{A\pi}{4} - \frac{B\epsilon}{x_0}\right)\epsilon^2 + O(\epsilon^4),
\end{aligned}
\]
and thus $d(x_0) \ge c\epsilon^2$, which is the desired lower bound.
Next we consider time-like geodesics with nonzero angular momentum. One sees that $f_{\gamma,\epsilon}(0) = \gamma^2 > 0$, $f'_{\gamma,\epsilon}(0) = -2\gamma^2 < 0$, and that the bounds for $M(y)$ obtained above imply that $f_{\gamma,\epsilon}(x)$ grows like $x^2$ for large $x$. Thus once again geodesics with $\delta \ge \gamma$ (i.e. those with $E \ge m$) will have a perihelion and are extendible to infinity in either direction. In contrast to the null geodesics, however, there are bounded orbits for time-like geodesics. This is because $f_{\gamma,\epsilon}$ will have at least one critical point. Let $f_* := \min_x f_{\gamma,\epsilon}(x) = f_{\gamma,\epsilon}(x_*)$. It then follows that there will be no solution with $\delta < \sqrt{f_*}$, and that for $\sqrt{f_*} < \delta < \gamma$ the equation $f_{\gamma,\epsilon}(x) = \delta^2$ has at least two solutions, corresponding to the perihelion and the aphelion of a bound orbit. There will also be at least one stable circular orbit, for $\delta = \sqrt{f_*}$.
8.2. Trajectories of test charges
For the RWN metric, a detailed study of test-charge trajectories was done in [50]. We are not aware of a comparable study done for the generalized RWN metrics with a nonlinear aether law, other than the particular case of the Born–Infeld law [52] in the black hole regime, and some preliminary discussion of the general case in [53]. Here we are going to analyze the qualitative behavior of all the test-charge trajectories in a given particle-spacetime with a small enough mass-to-charge ratio $\epsilon$. Recall that the orbit of a single test particle of mass $m$ and charge $e$, in an electrostatic, spherically symmetric particle-spacetime of mass $\mathrm{m}$ and charge $q$, satisfies the following system:
\[
\dot t = -e^{-\xi}(E - e\phi), \qquad \dot r = \pm\sqrt{(E - e\phi)^2 - \left(\frac{\Phi^2}{r^2} + m^2\right)e^{\xi}}, \qquad \dot\varphi = \frac{1}{r^2}\,\Phi .
\]
In addition, the orbit lies in a plane, which is taken to be the plane $\theta = \pi/2$. The system is clearly invariant under the simultaneous sign change of $E$, $e$, $\Phi$ and the independent variable $s$. Therefore, it is enough to consider the case $E \ge 0$. Let $x := \mathrm{m}/r$ as before. It then follows that
\[
\dot x = \pm\frac{m}{\mathrm{m}}\,x^2\sqrt{\left(\kappa - \rho N\!\left(\frac{x}{\epsilon^2}\right)\right)^2 - (1 + \lambda^2x^2)\left(1 - 2xM\!\left(\frac{x}{\epsilon^2}\right)\right)},
\]
where $\epsilon = \mathrm{m}/|q|$ as before and we have introduced new parameters
\[
\rho := \frac{e/m}{q/\mathrm{m}}, \qquad \kappa := \frac{E}{m}, \qquad \lambda := \frac{\Phi}{\mathrm{m}\,m},
\]
and where $N(y)$ is the normalized electrostatic potential:
\[
N(y) := \frac{1}{2^{7/4}I_\zeta}\int_0^{I_\zeta^4y^4/2}\zeta'(\mu)\,\mu^{-3/4}\,d\mu,
\]
which in view of (5.7) satisfies the bound
\[
\max\left\{0,\ \frac{3}{2} - \frac{A}{2y}\right\} \le N(y) \le \min\left\{y,\ \frac{3}{2} - \frac{A}{2y} + \frac{C}{3y^3}\right\}.
\]
Moreover, $N'(y) = \zeta'(I_\zeta^4y^4/2)$, so that
\[
\max\left\{0,\ \frac{A}{2y^2} - \frac{C}{y^4}\right\} \le N'(y) \le \min\left\{1,\ \frac{A}{2y^2}\right\}.
\]
Note that for the RWN metric, $N_{\mathrm{RWN}}(y) = y$. The parameters $\kappa$ and $\lambda$ are proportional to the energy and angular momentum of the test charge. The ratio $\rho$ is positive in the case where the particle-spacetime and the test particle have charges of the same sign, and negative if they have the opposite sign. Setting $\rho = 0$ will reproduce the results obtained above for timelike geodesics. Our goal is to obtain a bifurcation diagram in the $\kappa, \rho, \lambda$ parameter space for the above system, for a fixed small value of $\epsilon$. It is clear that the diagram will be invariant under the simultaneous sign change of $\kappa$ and $\rho$, and thus it is enough to restrict our attention to $\kappa \ge 0$. Let
\[
g_{\rho,\kappa,\epsilon}(x) := \kappa - \rho N\!\left(\frac{x}{\epsilon^2}\right), \qquad h_{\lambda,\epsilon}(x) := (1 + \lambda^2x^2)\left(1 - 2xM\!\left(\frac{x}{\epsilon^2}\right)\right).
\]
Consider first the case of orbits with zero angular momentum: $\Phi = \lambda = 0$. The motion of the test charge is then radial. We recall that $h_{0,\epsilon}(x) = e^{\xi(\mathrm{m}/x)}$. Thus, $h_{0,\epsilon}$ is convex, decreasing, $h_{0,\epsilon}(0) = 1$, $h'_{0,\epsilon}(0) = -2$ and $\lim_{x\to\infty}h_{0,\epsilon}(x) = a^2 = 1 - A\epsilon^2$. On the other hand, $g_{\rho,\kappa,\epsilon}(0) = \kappa$, $g'_{\rho,\kappa,\epsilon}(0) = -\rho/\epsilon^2$ and $\lim_{x\to\infty}g_{\rho,\kappa,\epsilon}(x) = \kappa - 3\rho/2$. Accordingly, there are three main parameter regimes to consider:
Case 1. $\rho < 0$. The two charges $q$ and $e$ thus have the opposite sign. The function $g_{\rho,\kappa,\epsilon}$ is increasing, and has a horizontal asymptote. Thus $g^2_{\rho,\kappa,\epsilon}$ must have the same properties, and in particular it will be concave and increasing, while we have already established that $h_{0,\epsilon}$ is convex and decreasing. It then follows that there will be no trajectory with $\kappa \le a + 3\rho/2$; for $a + 3\rho/2 < \kappa < 1$ the trajectory is bounded, and corresponds to a test charge that cannot escape and falls radially inward and into the singularity; and for $\kappa \ge 1$ the test charge has enough energy to escape to infinity.
Case 2. $\rho > 0$, $\kappa > 3\rho/2$. The two charges thus have the same sign, and the test charge has relatively high energy. The function $g_{\rho,\kappa,\epsilon}$ is decreasing, has a positive asymptotic value, and is convex. Thus $g^2_{\rho,\kappa,\epsilon}$ will have these same three properties as well. In order to find the number of intersections of the graphs of $g^2_{\rho,\kappa,\epsilon}$ and $h_{0,\epsilon}$, one needs to compare the values of the derivatives of these two functions at $x = 0$ and near $x = \infty$. We have
\[
2g_{\rho,\kappa,\epsilon}(0)\,g'_{\rho,\kappa,\epsilon}(0) - h'_{0,\epsilon}(0) = 2\left(1 - \frac{\kappa\rho}{\epsilon^2}\right),
\]
while, using the large $y$ asymptotics established above for the two functions $M(y)$ and $N(y)$, we have
\[
2g_{\rho,\kappa,\epsilon}(x)\,g'_{\rho,\kappa,\epsilon}(x) - h'_{0,\epsilon}(x) = -A\rho\,\epsilon^2\left(\kappa - \frac{3}{2}\rho\right)\frac{1}{x^2} + O\!\left(\frac{1}{x^3}\right) < 0 \quad \text{as } x \to \infty .
\]
Thus we need to distinguish two subcases:
Case 2a. $\kappa < \epsilon^2/\rho = \dfrac{\mathrm{m}\,m}{e\,q}$. (Note that this requires $\rho^2 < 2\epsilon^2/3$, or equivalently, $e/m < \sqrt{2/3}$, which does not allow the test charge to have a charge-to-mass ratio typical of actual elementary particles.) In this case there will be a critical value of $\kappa$, say $\kappa_c \in (3\rho/2, a + 3\rho/2)$, such that the graphs of the two functions $g^2_{\rho,\kappa_c,\epsilon}$ and $h_{0,\epsilon}$ are tangent to each other at a positive $x = x_c$. Thus for $\kappa < \kappa_c$ there will be no trajectories; for $\kappa = \kappa_c$ there will be a stable equilibrium, corresponding to the test charge remaining at rest with respect to the singularity; for $\kappa_c < \kappa < a + 3\rho/2$ the test charge shuttles back and forth between a perihelion and an aphelion, which grow further apart as $\kappa$ is increased; for $a + 3\rho/2 \le \kappa < 1$ the perihelion coincides with the singularity; and for $\kappa \ge 1$ the aphelion is at $r = \infty$, i.e. the test charge is allowed to escape.
Case 2b. $\kappa \ge \epsilon^2/\rho$. The two above-mentioned graphs can have no point of tangency, irrespective of $\kappa$. Moreover, there is no intersection at a positive $x$, and hence no trajectory, with $\kappa \le 1$ (i.e. for $E \le m$). For $1 < \kappa < a + 3\rho/2$ there is a single intersection, which corresponds to the perihelion of a hyperbolic trajectory that avoids the singularity, and for $\kappa > a + 3\rho/2$ the test charge in one direction falls into the singularity and in the other direction is allowed to escape to infinity.
Case 3. $\rho > 0$ and $\kappa < 3\rho/2$. The charges have the same sign and the test charge has relatively low energy. The function $g_{\rho,\kappa,\epsilon}$ has a negative asymptotic value, thus the graph of $g^2_{\rho,\kappa,\epsilon}$ will be tangent to the $x$-axis at some point $x = x_*$, where the global minimum is achieved, and is asymptotic to $(\kappa - 3\rho/2)^2$. There are no trajectories with $3\rho/2 - a < \kappa < 1$. For $\kappa < \min\{1, 3\rho/2 - a\}$ the trajectories correspond to falling-in test particles. For each $1 < \kappa < 3\rho/2 - a$ there is an avoiding trajectory as well as one that falls in. For $3\rho/2 - a < \kappa < 3\rho/2$ all trajectories avoid the singularity.
Based on the above analysis, the $\kappa$-$\rho$ parameter plane is divided into several regions as in Fig. 1. The shaded region in this figure is a forbidden zone: there are no trajectories for $(\kappa, \rho)$ in this region. Region I corresponds to shuttling orbits, Region II to test particles that are trapped and fall into the singularity. Particles with parameters in Region III can also fall into the center, but have enough energy so that in the opposite direction they are allowed to escape to infinity. Region IV corresponds to hyperbolic trajectories that avoid the singularity altogether, while each point in Region V corresponds to two trajectories, one that falls into the center and another that avoids it.
Next, we consider charged trajectories with nonzero angular momentum, i.e. $\lambda > 0$. The only thing that changes in the above analysis is the behavior of $h_{\lambda,\epsilon}$,
which now grows like $x^2$ for large $x$, and has a global minimum at a positive $x$. As a result, there will be no trajectories falling into the singularity. The trajectories in fact will be similar to classical Keplerian trajectories, divided into bound and escape (scattering) orbits. The following diagram (Fig. 2) shows the various parameter regimes for each type of orbit. The shaded region in Fig. 2 is a forbidden one. The parameters in regions labeled E give rise to bound orbits, while those in regions labeled H give rise to escape (scattering) ones.
Fig. 1. Bifurcation diagram for trajectories with no angular momentum. (Horizontal axis: ρ; vertical axis: κ, with the values a and 1 marked; regions I–V.)
Fig. 2. Bifurcation diagram for trajectories with positive angular momentum. (Horizontal axis: ρ; vertical axis: κ, with the values a and 1 marked; regions labeled E, H, and E,H.)
One observes that, in sharp contrast to the case of superextremal RWN studied in [50], there is no evidence here of gravity being repulsive in the
vicinity of the naked conical singularity at the center of these particle-spacetimes. On the contrary, the singularity appears to be attractive.
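The attractive character of the conical singularity can be illustrated numerically with the total deflexion $d(x_0)$ of null geodesics from Sec. 8.1, defined by $\pi + d(x_0) = 2\int_0^{x_0}dx/\sqrt{f(x_0) - f(x)}$. The sketch below is not part of the analysis above: the function name `total_deflection`, the substitution $x = x_0\sin\theta$ (used to tame the integrable endpoint singularity) and the midpoint rule are choices made here for illustration. It only treats the two model profiles quoted in the text, $f(x) = x^2$ (Minkowski) and $f(x) = a^2x^2$ (pure conical singularity), for which $d = 0$ and $d = \pi(1/a - 1)$ respectively.

```python
import math


def total_deflection(f, x0, n=2000):
    """Approximate d(x0) = 2 * int_0^{x0} dx / sqrt(f(x0) - f(x)) - pi.

    f must be increasing on [0, x0]. The substitution x = x0*sin(theta)
    removes the square-root singularity at x = x0, and the midpoint rule
    never evaluates the integrand at the endpoint itself.
    """
    fx0 = f(x0)
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        x = x0 * math.sin(theta)
        total += x0 * math.cos(theta) / math.sqrt(fx0 - f(x))
    return 2.0 * total * h - math.pi


def minkowski(x):
    # f(x) = x^2: flat space, straight-line null geodesics
    return x * x


def conical(a):
    # f(x) = a^2 x^2: the model metric with a conical singularity, 0 < a < 1
    return lambda x: (a * x) ** 2
```

For example, `total_deflection(conical(0.8), 3.0)` can be compared against `math.pi * (1 / 0.8 - 1)`; the result is positive and independent of `x0`, which is the sense in which the null geodesics are bent towards the singularity.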
Acknowledgments
I was introduced to nonlinear electrodynamics by my dear friend and colleague Michael Kiessling, and I also owe him the initial impetus for studying solutions with no horizon. I have benefitted greatly from his help and encouragement throughout this project, and I am indebted to him for his critical reading of many drafts. I would also like to thank the anonymous referee for constructive comments, and the Institute for Advanced Study for their hospitality while I was a member there during the final stage of this work.
References
[1] H. Reissner, Über die Eigengravitation des elektrischen Feldes nach der Einsteinschen Theorie, Ann. Phys. (Berlin) 50 (1916) 106–120.
[2] G. Nordström, On the energy of the gravitational field in Einstein's theory, Proc. Kon. Ned. Akad. Wet. 20 (1918) 1238–1245.
[3] H. Weyl, Zur Gravitationstheorie, Ann. Phys. (Berlin) 54 (1917) 117–145.
[4] C. W. Misner, K. S. Thorne and J. A. Wheeler, Gravitation (W. H. Freeman & Co, New York, 1973).
[5] R. Arnowitt, S. Deser and C. W. Misner, Coordinate invariance and energy expressions in general relativity, Phys. Rev. 122 (1961) 997–1006.
[6] S. Hawking and G. Ellis, The Large Scale Structure of Space-Time (Cambridge University Press, Cambridge, 1973).
[7] H. Stephani, D. Kramer, M. MacCallum, C. Hoenselaers and E. Herlt, Exact Solutions of Einstein's Field Equations (Cambridge University Press, Cambridge, 2003).
[8] V. F. Weisskopf, On the self-energy and the electromagnetic field of the electron, Phys. Rev. 56 (1939) 72–85.
[9] R. P. Feynman, Lectures in Physics, Vol. 2, Chap. 28 (Addison-Wesley, Reading, Mass., 1964).
[10] S. Weinberg, The Quantum Theory of Fields, Vol. I (Cambridge University Press, Cambridge, 1995), p. 31.
[11] R. Arnowitt, S. Deser and C. W. Misner, Gravitational-electromagnetic coupling and the classical self-energy problem, Phys. Rev. 120 (1960) 313–319.
[12] P. A. M. Dirac, Classical theory of radiating electrons, Proc. R. Soc. London A 167 (1938) 148–169.
[13] W. Appel and M. K.-H. Kiessling, Mass and spin renormalization in Lorentz electrodynamics, Ann. Phys. (N.Y.) 289 (2001) 24–83.
[14] M. K.-H. Kiessling, personal communication.
[15] M. Born, Modified field equations with a finite radius of the electron, Nature 132 (1933) 1004.
[16] G. Mie, Grundlagen einer Theorie der Materie, Ann. Phys. 37 (1912) 511–534; ibid. 39 (1912) 1–40; ibid. 40 (1913) 1–66.
[17] K. Schwarzschild, Zur Elektrodynamik, I. Zwei Formen des Princips der Action in der Elektronentheorie, in Nachrichten von der Gesellschaft der Wissenschaften zu Goettingen (1903), pp. 126–131.
[18] M. Born and L. Infeld, Foundation of the new field theory, Nature 132 (1933) 1004.
[19] M. Born and L. Infeld, Foundation of the new field theory, Proc. R. Soc. London A 144 (1934) 425–451.
[20] M. K.-H. Kiessling, Electromagnetic field theory without divergence problems 1. The Born legacy, J. Stat. Phys. 116 (2004) 1057–1120.
[21] M. K.-H. Kiessling, On the motion of point defects in relativistic fields, preprint (2011).
[22] B. Hoffmann, Gravitational and electromagnetic mass in the Born–Infeld electrodynamics, Phys. Rev. 47 (1935) 877–880.
[23] B. Hoffmann and L. Infeld, On the choice of the action function in the new field theory, Phys. Rev. 51 (1937) 765–773.
[24] B. S. Madhava Rao, Generalized action-functions in Born's electro-dynamics, Proc. Indian Acad. Sci. Sec. A 6 (1937) 158–173.
[25] R. Pellicer and R. J. Torrence, Nonlinear electrodynamics and general relativity, J. Math. Phys. 10 (1969) 1718–1723.
[26] M. Demianski, Static electromagnetic geon, Found. Phys. 16 (1986) 187–190.
[27] E. Ayón-Beato and A. García, Regular black hole in general relativity coupled to nonlinear electrodynamics, Phys. Rev. Lett. 80 (1998) 5056–5059.
[28] K. A. Bronnikov, Regular magnetic black holes and monopoles from nonlinear electrodynamics, Phys. Rev. D 63 (2001) 044005, 6 pp.
[29] I. Dymnikova, Regular electrically charged vacuum structures with de Sitter centre in nonlinear electrodynamics coupled to general relativity, Classical Quant. Grav. 21 (2004) 4417.
[30] D. J. Cirilo-Lombardo, New spherically symmetric monopole and regular solutions in Einstein–Born–Infeld theories, J. Math. Phys. 46 (2005) 042501.
[31] D. J. Cirilo-Lombardo, Rotating charged black holes in Einstein–Born–Infeld theories and their ADM mass, Gen. Relativ. Gravit. 37 (2005) 847–856.
[32] K. A. Bronnikov, V. N. Melnikov, G. N. Shikin and K. P. Staniukovich, Scalar, electromagnetic, and gravitational fields interaction: Particlelike solutions, Ann. Phys. (N.Y.) 118 (1979) 84.
[33] K. A. Bronnikov, Comment on "Regular black hole in general relativity coupled to nonlinear electrodynamics", Phys. Rev. Lett. 85 (2000) 4641.
[34] G. D. Birkhoff, Relativity and Modern Physics (Harvard University Press, Cambridge MA, 1923), p. 253.
[35] J. T. Jebsen, Über die allgemeinen kugelsymmetrischen Lösungen der Einsteinschen Gravitationsgleichungen im Vakuum, Arkiv för Matematik, Astronomi och Fysik 15 (1921) 1–9; English translation in Gen. Relativ. Gravit. 37 (2005) 2253–2259.
[36] J. Ehlers and A. Krasiński, Comment on the paper by J. T. Jebsen [reprinted in Gen. Relativ. Grav. 37 (2005) 2253–2259], Gen. Relativ. Gravit. 38 (2006) 1329–1330.
[37] J. A. Eiesland, The group of motions of an Einstein space, Trans. Amer. Math. Soc. 27 (1925) 213–245.
[38] J. A. Eiesland, Bull. Amer. Math. Soc. 27 (1921) 410, paragraph.
[39] B. Hoffmann, On the spherically symmetric field in relativity, Quart. J. Math. 3 (1932) 226–237.
[40] K. Schleich and D. M. Witt, A simple proof of Birkhoff's theorem for cosmological constant, J. Math. Phys. 51 (2010) 112502.
[41] D. Christodoulou, The Action Principle and Partial Differential Equations, Chap. 6 (Princeton University Press, Princeton, NJ, 1999).
[42] J. Plebanski, Lectures on Nonlinear Electrodynamics (NORDITA, Copenhagen, 1968).
[43] A. S. Tahvildar-Zadeh, One- and two-Killing field reductions of the Einstein–Maxwell system with arbitrary constitutive laws, in preparation.
[44] I. Białynicki-Birula, Nonlinear electrodynamics: Variations on a theme by Born and Infeld, in Quantum Theory of Particles and Fields, eds. B. Jancewicz and J. Lukierski (World Scientific, Singapore, 1983), pp. 31–48.
[45] B. Hoffmann, On the new field theory, Proc. R. Soc. London A 148 (1935) 353–364.
[46] C. Darwin, The gravity field of a particle, Proc. R. Soc. London A 249 (1959) 180–194.
[47] C. Darwin, The gravity field of a particle, II, Proc. R. Soc. London A 263 (1961) 39–50.
[48] J. C. Graves and D. R. Brill, Oscillatory character of Reissner–Nordström metric for an ideal charged wormhole, Phys. Rev. 120 (1960) 1507–1513.
[49] B. Carter, Global structure of the Kerr family of gravitational fields, Phys. Rev. 174 (1968) 1559–157.
[50] W. B. Bonnor, The equilibrium of a charged test particle in the field of a spherical charged mass in general relativity, Classical Quant. Grav. 10 (1993) 2077–2082.
[51] S. Chandrasekhar, The Mathematical Theory of Black Holes (Oxford University Press, New York, 1983).
[52] N. Bretón, Geodesic structure of the Born–Infeld black hole, Classical Quant. Grav. 19 (2002) 601–612.
[53] J. Diaz-Alonso and D. Rubiera-Garcia, Electrostatic spherically symmetric configurations in gravitating nonlinear-electrodynamics, Phys. Rev. D 81 (2010) 064021.
May 20, J070-S0129055X1100431X
2011 14:34 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 4 (2011) 347–373 © World Scientific Publishing Company DOI: 10.1142/S0129055X1100431X
LOCALIZED ENDOMORPHISMS IN KITAEV’S TORIC CODE ON THE PLANE
PIETER NAAIJKENS Institute for Mathematics, Astrophysics and Particle Physics, Radboud University Nijmegen, Postbus 9010, 6500 GL Nijmegen, The Netherlands
[email protected]
Received 20 December 2010
Revised 28 March 2011
We consider various aspects of Kitaev's toric code model on a plane in the $C^*$-algebraic approach to quantum spin systems on a lattice. In particular, we show that elementary excitations of the ground state can be described by localized endomorphisms of the observable algebra. The structure of these endomorphisms is analyzed in the spirit of the Doplicher–Haag–Roberts program (specifically, through its generalization to infinite regions as considered by Buchholz and Fredenhagen). Most notably, the statistics of excitations can be calculated in this way. The excitations can equivalently be described by the representation theory of $D(\mathbb{Z}_2)$, i.e. Drinfel'd's quantum double of the group algebra of $\mathbb{Z}_2$.
Keywords: Localized endomorphisms; superselection sectors; anyons; toric code.
Mathematics Subject Classification 2010: 81T05, 81T25, 18D10
1. Introduction
Kitaev's quantum double model [1] has attracted much interest in recent years. One of its interesting features is that the model has anyonic excitations. Such models may be relevant to a new approach to quantum computing, where topological properties of a system are used to do computations (see [2, 3] for reviews). Here we consider the simplest case of this model, corresponding to the group $\mathbb{Z}_2$. This model is often called the toric code, although we will consider it on the plane instead of on a torus. This model is not powerful enough for applications to quantum computing, but it has interesting properties nonetheless. In particular, it has anyonic excitations (albeit abelian anyons). The toric code has been studied by many authors by now, for example [1, 4, 5]. We take a different viewpoint, namely that of local quantum physics. Indeed, the model can be discussed in the $C^*$-algebraic approach to quantum spin
systems [6, 7]. We show that single excitations can be described by states that cannot be distinguished from the ground state when restricted to measurements outside a cone extending to infinity. This structure is familiar from the algebraic approach to quantum field theory [8], in particular when massive particles are considered [9]. The states describing these single excitations lead, via the GNS construction, to inequivalent representations (superselection sectors) of the observable algebra. In fact, these states fulfill a certain selection criterion, pertaining to the fact that they are localized and transportable. The analysis of such representations is central to the Doplicher–Haag–Roberts (DHR) program in algebraic quantum field theory [10, 11]. In particular, it turns out that these representations can equivalently be described by endomorphisms of the observable algebra. This description leads in a natural way to the notion of composition of excitations and to statistics of (quasi)particles from first principles. This analysis can be carried out completely for the toric code on the plane.
A related approach is taken for example in [12, 13], where the authors consider G-spin (or, more generally, Hopf-$C^*$) chains. There, excitations localized in bounded regions (satisfying the so-called DHR criterion) are considered. Since every injective endomorphism of a finite-dimensional algebra is in fact an automorphism, the authors consider amplimorphisms to obtain non-abelian charges. Here, we take a different approach, and look instead at endomorphisms localized in certain infinite "cone" regions. In our model the irreducible endomorphisms are all automorphisms, but since we consider excitations localized in infinite regions, finite dimensionality of the algebras is not an obstruction any more. The idea of constructing charged sectors localized in infinite regions is not new: it is used, for example, in the work of Fredenhagen and Marcu [14].
Discrete gauge theories in $d = 2 + 1$ show similar algebraic features (i.e. fusion and braiding) of anyons [15]. Similar models have been studied in the constructive approach to quantum fields in lattice gauge theory, in particular for the gauge group $\mathbb{Z}_2$ in [14, 16]. These results have been generalized to the group $\mathbb{Z}_N$ in [17, 18]. Although the setting considered here is different, some of the methods used are similar. A field theoretic interpretation of the model discussed here can be found in [1, Sec. 4].
The paper is organized as follows. In Sec. 2, we recall the model and discuss the ground state in the $C^*$-algebraic setting. In Sec. 3 localized automorphisms describing excitations are described. Section 4 is devoted to fusion and statistics of excitations. Then follows a discussion of operator-algebraic aspects of von Neumann algebras generated by observables localized in cones. Finally, in the last section we prove that the excitations are described by the representation theory of the quantum double $D(\mathbb{Z}_2)$.
2. The Model
We describe Kitaev's model in the $C^*$-algebraic framework for quantum lattice systems [4]. Consider a square $\mathbb{Z}^2$ lattice. On each bond of the lattice, i.e. an edge
May 20, J070-S0129055X1100431X
2011 14:34 WSPC/S0129-055X
148-RMP
Localized Endomorphisms in Kitaev’s Toric Code on the Plane
349
between two vertices of distance 1, there is a spin-1/2 particle. That is, at each bond $b$ the local state space is $\mathcal{H}_{\{b\}} = \mathbb{C}^2$, with observables $\mathfrak{A}(\{b\}) = M_2(\mathbb{C})$. The set of bonds will be denoted by $\mathbf{B}$. If $\Lambda \subset \mathbf{B}$ is a finite set, $\mathfrak{A}(\Lambda)$ is the algebra of observables living on the bonds of $\Lambda$. It is the tensor product of the observable algebras acting on the individual bonds of $\Lambda$. If $\Lambda_1 \subset \Lambda_2$ there is an obvious inclusion of the corresponding algebras, obtained by identifying $\mathcal{H}_{\Lambda_2} \cong \mathcal{H}_{\Lambda_1} \otimes \mathcal{H}_{\Lambda_2 \setminus \Lambda_1}$. This defines a local net of algebras, with respect to the inclusion $\mathfrak{A}(\Lambda_1) \hookrightarrow \mathfrak{A}(\Lambda_2)$ for $\Lambda_1 \subset \Lambda_2$. Define
\[ \mathfrak{A}_{\mathrm{loc}} = \bigcup_{\Lambda_f \subset \mathbf{B}} \mathfrak{A}(\Lambda_f), \]
the algebra of local observables. The union is over the finite subsets $\Lambda_f$ of $\mathbf{B}$. The algebra $\mathfrak{A}$ of quasi-local observables is the completion of $\mathfrak{A}_{\mathrm{loc}}$ in the norm topology, turning it into a $C^*$-algebra. Alternatively, one can see it as the inductive limit of the net $\Lambda \mapsto \mathfrak{A}(\Lambda)$ in the category of $C^*$-algebras. Note that $\mathfrak{A}$ is a uniformly hyperfinite (UHF) algebra [6]. The algebra of observables localized in an arbitrary subset $\Lambda$ of $\mathbf{B}$ is defined as
\[ \mathfrak{A}(\Lambda) = \overline{\bigcup_{\Lambda_f \subset \Lambda} \mathfrak{A}(\Lambda_f)}^{\,\|\cdot\|}, \]
where the union is again over finite subsets. An operator $A$ is said to have support in $\Lambda$, or to be localized in $\Lambda$, if $A \in \mathfrak{A}(\Lambda)$. The set $\mathrm{supp}(A) \subset \mathbf{B}$ is the smallest subset in which $A$ is localized.

The Hamiltonian of Kitaev’s model is defined in terms of plaquette and star operators, each supported on four bonds (see Fig. 1). If $s$ is a point on the lattice, $\mathrm{star}(s)$ denotes the star based at $s$. Similarly, $\mathrm{plaq}(p)$ are the bonds enclosing a plaquette $p$. The corresponding star and plaquette operators are given by
\[ A_s = \bigotimes_{j \in \mathrm{star}(s)} \sigma_j^x, \qquad B_p = \bigotimes_{j \in \mathrm{plaq}(p)} \sigma_j^z, \]
Fig. 1. The Z2 lattice. The gray bonds each carry a spin-1/2 particle. A star (dashed lines) and plaquette (thick lines) are shown.
P. Naaijkens
where the tensor product is understood as having Pauli matrices $\sigma^x$ (respectively $\sigma^z$) in places $j$, and unit operators in all other positions. It is then straightforward to check that for all stars $s$ and plaquettes $p$, we have $[A_s, B_p] = 0$. These operators are used to define the local Hamiltonians. If $\Lambda_f \subset \mathbf{B}$ is finite, the associated local Hamiltonian is
\[ H_{\Lambda_f} = -\sum_{\mathrm{star}(s) \subset \Lambda_f} A_s - \sum_{\mathrm{plaq}(p) \subset \Lambda_f} B_p. \]
There is a natural action of $\mathbb{Z}^2$ on the quasi-local algebra, acting by translations. Denote this action by $\tau_x$ for $x \in \mathbb{Z}^2$. Note that the interactions are of finite range, and moreover, they are translation invariant. Hence, there exists an action $\alpha_t$ of $\mathbb{R}$ on $\mathfrak{A}$ describing the dynamics of the system [7], as well as a derivation $\delta$ that is the generator of the dynamics. For observables localized in a finite set $\Lambda$, the action of this derivation is given by^a
\[ \delta(A) = i[H_\Lambda, A], \qquad A \in \mathfrak{A}(\Lambda). \]
By definition, ground states for these dynamics are states $\omega$ of $\mathfrak{A}$ such that $-i\omega(X^* \delta(X)) \geq 0$ for all $X \in \mathfrak{A}_{\mathrm{loc}}$. In [4] it is shown that the model admits a unique ground state, which can be computed explicitly. Since we will need the argument later, for the convenience of the reader we summarize the results. The following lemma is crucial in the computation of the ground state. The proof is a straightforward application of the Cauchy–Schwarz inequality and the fact that for $A$ positive, $\omega(A) = 0$ implies that $\omega(A^2) = 0$.

Lemma 2.1. Let $\omega$ be a state on a $C^*$-algebra $\mathfrak{A}$, and suppose $X = X^*$ such that $X \leq I$ and $\omega(X) = 1$. Then $\omega(XY) = \omega(YX) = \omega(Y)$ for any $Y \in \mathfrak{A}$.

Consider now the abelian algebra $\mathfrak{A}_{XZ}$ generated by the star and plaquette operators. This algebra is in fact maximal abelian: $\mathfrak{A}_{XZ}' \cap \mathfrak{A} = \mathfrak{A}_{XZ}$ [4]. Let $\omega$ be the state on $\mathfrak{A}_{XZ}$ such that $\omega(A_s) = \omega(B_p) = 1$ for all plaquette and star operators.^b With help of the lemma, this completely determines the state on $\mathfrak{A}_{XZ}$. Moreover, it minimizes the local Hamiltonians, hence any ground state of the system must be equal to $\omega$ if restricted to $\mathfrak{A}_{XZ}$. The goal is then to show that this state has a unique extension to $\mathfrak{A}$.

^a To be a bit more precise: the derivation $\delta$ defined here is norm-closable and it is the closure $\bar{\delta}$ that generates the dynamics [7, Theorem 6.2.4]. By a density argument, it is often enough to consider $\delta$ instead of its closure.
^b That such a state exists can be seen by mapping the model to an Ising spin model.
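The commutation relation $[A_s, B_p] = 0$ used above can be checked mechanically. The following is a minimal sketch, not taken from the paper: it assumes a hypothetical labelling of bonds as `("h", i, j)` (from vertex $(i,j)$ to $(i+1,j)$) and `("v", i, j)` (from $(i,j)$ to $(i,j+1)$), and represents a Pauli monomial by the pair of bond sets carrying $\sigma^x$ and $\sigma^z$, ignoring phases. Two such monomials commute exactly when the $\sigma^x$ part of one meets the $\sigma^z$ part of the other on an even number of bonds.

```python
# Illustrative sketch (bond labels and helper names are assumptions, not the paper's notation).

def star(s):
    """Bonds of the star based at the vertex s = (i, j)."""
    i, j = s
    return frozenset({("h", i, j), ("h", i - 1, j), ("v", i, j), ("v", i, j - 1)})

def plaq(p):
    """Bonds enclosing the plaquette whose lower-left corner is p = (i, j)."""
    i, j = p
    return frozenset({("h", i, j), ("h", i, j + 1), ("v", i, j), ("v", i + 1, j)})

def commute(a, b):
    """Pauli monomials (x_support, z_support) commute iff the number of bonds
    where sigma^x of one meets sigma^z of the other is even."""
    (ax, az), (bx, bz) = a, b
    return (len(ax & bz) + len(az & bx)) % 2 == 0

def A(s):
    """Star operator: sigma^x on every bond of star(s)."""
    return (star(s), frozenset())

def B(p):
    """Plaquette operator: sigma^z on every bond of plaq(p)."""
    return (frozenset(), plaq(p))

sites = [(i, j) for i in range(-3, 4) for j in range(-3, 4)]
all_commute = all(commute(A(s), B(p)) for s in sites for p in sites)
print(all_commute)
```

The check succeeds because a star and a plaquette share either no bond or exactly two bonds. By contrast, a single $\sigma^x$ on a bond of $\mathrm{plaq}(p)$ anti-commutes with $B_p$, which is the mechanism behind the vanishing expectation values discussed next.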
Let $\omega_0$ be an extension of $\omega$ to the algebra $\mathfrak{A}$.^c Using the lemma one can show that for $X, Y \in \mathfrak{A}_{\mathrm{loc}}$,
\[ -i\omega_0(X^* \delta(Y)) = \sum_s \bigl(\omega_0(X^* Y) - \omega_0(X^* A_s Y)\bigr) + \sum_p \bigl(\omega_0(X^* Y) - \omega_0(X^* B_p Y)\bigr), \tag{2.1} \]
where the variable $s$ runs over all stars in the lattice, and $p$ over all plaquettes. If one takes $X = Y$, an application of the Cauchy–Schwarz inequality shows that the right hand side is positive, hence $\omega_0$ is a ground state.

As mentioned before, in the model at hand this extension is actually unique. In fact, let $X$ be a monomial in the Pauli matrices, say $X = \prod_{i \in \Lambda} \sigma_i^{k_i}$ where $\Lambda \subset \mathbf{B}$ is finite and $k_i = x, y$ or $z$. Then $\omega_0(X)$ is non-zero if and only if $X$ is a product of star and plaquette operators, in which case it is 1. This completely determines the state $\omega_0$, since the value of $\omega_0(X)$ can be computed by a repeated application of Lemma 2.1. For example, to make plausible why $\omega_0$ is zero if $X$ is not a product of star and plaquette operators, consider an operator of the form $A = \sigma_j^x$ for some bond $j$. Then there is a plaquette $p$ such that $j \in \mathrm{plaq}(p)$. But then $\omega_0(A) = \omega_0(B_p \sigma_j^x B_p) = -\omega_0(A)$. In particular, for a local observable $A$ that is a monomial in the Pauli matrices, the set of bonds where $A$ has a $\sigma^x$ component should have the property that the intersection with each plaquette $\mathrm{plaq}(p)$ has an even number of elements. Continuing in this manner, one can show that indeed only products of star and plaquette operators lead to non-zero expectation values [4].

Proposition 2.1. There is a unique (hence pure) ground state $\omega_0$. This state is translation invariant. The self-adjoint $H_0$ generating the dynamics in the GNS representation $(\pi_0, \mathcal{H}_0, \Omega)$, when normalized such that $H_0 \Omega = 0$, satisfies $\mathrm{Sp}(H_0) \subset \{0\} \cup [4, \infty)$.

Proof. We have already discussed existence and uniqueness of $\omega_0$. Translations map star operators into star operators, and plaquette operators into plaquette operators, hence the ground state is translation invariant. Since $\omega_0$ is a ground state, it is invariant under the dynamics and the time evolution can be implemented by a strongly continuous group $t \mapsto U_t$ of unitaries. We can choose $U_t$ such that $U_t \Omega = \Omega$.
It follows that there is an (unbounded) self-adjoint $H_0$ such that $U_t = e^{itH_0}$ and $H_0 \Omega = 0$. We claim that $\mathrm{Sp}\, H_0 \subset \{0\} \cup [M, \infty)$ is equivalent to
\[ -i\omega_0(X^* \delta(X)) \geq M\bigl(\omega_0(X^* X) - |\omega_0(X)|^2\bigr), \tag{2.2} \]
for all $X \in \mathfrak{A}_{\mathrm{loc}}$, because the ground state is non-degenerate. Indeed, since $H_0 \Omega = 0$ with $\Omega$ the GNS vector, the inequality can equivalently be written as
\[ \langle X\Omega, H_0 X\Omega \rangle \geq M\bigl(\|X\Omega\|^2 - |\langle \Omega, X\Omega \rangle|^2\bigr), \]
because $\langle X\Omega, H_0 X\Omega \rangle = -i\omega_0(X^* \delta(X))$. Here we have identified $X$ with its image $\pi_0(X)$, which is possible since $\pi_0$ is a representation of a UHF (hence simple) algebra. On the other hand, the spectrum condition is equivalent to $H_0 + M P_\Omega \geq M I$, where $P_\Omega$ is the projection on the subspace spanned by $\Omega$ (by non-degeneracy, this is the spectral projection corresponding to $\{0\}$). This is equivalent to the condition
\[ \langle \Psi, (H_0 + M P_\Omega)\Psi \rangle = \langle \Psi, H_0 \Psi \rangle + M |\langle \Omega, \Psi \rangle|^2 \geq M \|\Psi\|^2 \]
for all $\Psi$ in the domain $D(H_0)$ of $H_0$. But $\pi_0(\mathfrak{A}_{\mathrm{loc}})\Omega$ is a core for $H_0$ (compare with the proof of [7, Proposition 5.3.19]), hence it is enough to check the inequality for $\Psi = X\Omega$ with $X \in \mathfrak{A}_{\mathrm{loc}}$. This shows that inequality (2.2) is equivalent to the assertion on the spectrum of $H_0$.

^c By the Hahn–Banach theorem an extension $\omega_0$ of $\omega$ to $\mathfrak{A}$ always exists.

We now show that inequality (2.2) indeed holds for $M = 4$. As a first step, we claim that if either $X$ or $Y$ is a local operator in $\mathfrak{A}_{XZ}$,
\[ -i\omega_0(X^* \delta(Y)) = 4\bigl(\omega_0(X^* Y) - \overline{\omega_0(X)}\,\omega_0(Y)\bigr) = 0. \tag{2.3} \]
Under these assumptions, the left-hand side can be seen to vanish by Eq. (2.1) and Lemma 2.1. As for the right-hand side, consider the case where $X \in \mathfrak{A}_{XZ}$ (the other case is proved similarly). In this case, $X = \sum_i \lambda_i X_i$ where each $X_i$ is a product of star and plaquette operators. Using Lemma 2.1 again, it follows that $\omega_0(X^* Y) = \sum_i \bar{\lambda}_i\, \omega_0(Y) = \overline{\omega_0(X)}\,\omega_0(Y)$, proving the claim.

Now consider the general case, with a local operator $X = X_{XZ} + \sum_{i \in I} \lambda_i X_i$, where $X_{XZ} \in \mathfrak{A}_{XZ}$ and each $X_i$ (with $i$ in some finite set $I$) is a monomial in the Pauli matrices such that $X_i \notin \mathfrak{A}_{XZ}$. Since $X_i \notin \mathfrak{A}_{XZ}$, there is some $A_s$ or $B_p$ that does not commute with $X_i$. Suppose this is $A_s$. Since $X_i$ is a monomial in the Pauli matrices, this actually implies that $\{A_s, X_i\} = 0$, in other words, they anti-commute. Note that this implies that $\omega_0(X_i)$ is zero for each $i \in I$, since by the same trick as used before it follows that $\omega_0(X_i) = -\omega_0(X_i)$. By the remarks above, Eq. (2.2) reduces to
\[ -i \sum_{i,j \in I} \omega_0(X_i^* \delta(X_j)) \geq 4 \sum_{i,j \in I} \omega_0(X_i^* X_j). \tag{2.4} \]
Note that for each $X_i$, there is a finite number $n_i$ of plaquette and star operators that anti-commute with $X_i$. In fact, $n_i \geq 2$, since if there is for example one star operator that does not commute with $X_i$, there must necessarily be another one with this property.^d Note that if $n_i \neq n_j$, there is a star or a plaquette operator that commutes with $X_i$ and anti-commutes with $X_j$ (or vice versa). Consequently, $\omega_0(X_i^* X_j) = 0$. Now define for each integer $k$ the finite set $I_k = \{i \in I : n_i = k\}$ and the operators $\widetilde{X}_k = \sum_{i \in I_k} X_i$, with the understanding that $\widetilde{X}_k = 0$ if $I_k$ is the empty set. By the considerations above, it then follows that $\sum_{i,j \in I} \omega_0(X_i^* X_j) = \sum_{k \geq 2} \omega_0(\widetilde{X}_k^* \widetilde{X}_k)$, since $n_i \geq 2$ for each $i \in I$. On the other hand, from Eq. (2.1) it follows that $-i\omega_0(X_i^* \delta(X_j)) = 2 n_i\, \omega_0(X_i^* X_j)$. It then follows that the left-hand side of the inequality (2.4) is equal to $2 \sum_{k \geq 2} k\, \omega_0(\widetilde{X}_k^* \widetilde{X}_k)$. From this it easily follows that inequality (2.4) holds.

^d This amounts to saying that excitations always exist in pairs in finite regions in Kitaev’s model [1].

The spectrum condition has far-reaching consequences for the correlation functions; for example, it implies that ground state correlations decay exponentially [19].

3. Localized Endomorphisms

In this section we describe localized excitations of the system. In his model, Kitaev associates certain string operators to paths on the lattice (or the dual lattice). These string operators create excitations at the endpoints of the paths [1]. The idea is to consider a single excitation by moving one of the excitations to infinity, as is done, for example, in [14]. Before this construction is introduced, we give some preliminary definitions.

By a site, we mean either a point on the lattice, a plaquette, or a pair of a plaquette with one of its vertices (i.e. a combined site). Sites can be seen as the places where excitations can be introduced. Between two sites of the same type, we can consider paths. A path between two points on the lattice is just a path consisting of bonds of the lattice. A path between plaquettes can be viewed as a path on the dual lattice. A path between combined sites is called a ribbon (see Fig. 2). One can think of a ribbon as being composed of a path on the lattice and one on the dual lattice.

Definition 3.1. Let $\gamma$ be a finite path between two sites. If $\gamma$ is a path on the lattice, define the corresponding string operator as $\Gamma_Z^\gamma = \bigotimes_{i \in \gamma} \sigma_i^z$. If it is a path on the dual lattice, the string operator is defined as $\Gamma_X^\gamma = \bigotimes_{i \in \gamma} \sigma_i^x$. Here $i \in \gamma$ means that $i$ is a bond that intersects the path on the dual lattice. Finally, a string operator corresponding to a ribbon is a combination of these constructions. That is, $\Gamma_Y^\gamma = \Gamma_X^{\gamma_1} \Gamma_Z^{\gamma_2}$, where $\gamma_1$ is the path on the dual lattice and $\gamma_2$ the path on the lattice, corresponding to the ribbon.

Fig. 2. A path on the lattice (left black line) and a ribbon. The dots on the ribbon indicate a combined site, i.e. a plaquette with one of its vertices.

It should be clear from the context whether we consider paths on the lattice, paths on the dual lattice, or ribbons. We say that a path or the corresponding string operator is of type $X$, $Y$ or $Z$, corresponding to the subscripts used in the definition.

We first make some observations that will be used later. Consider a plaquette $p$. The corresponding plaquette operator $B_p$ is just the string operator $\Gamma_Z^\gamma$, where $\gamma$ is the closed path consisting of the edges of the plaquette. If $p'$ is, for example, a plaquette adjacent to $p$, then $B_p B_{p'}$ is the string operator corresponding to the closed path on the outer edges of the two plaquettes. Continuing this way, it follows that the string operator corresponding to a closed path on the lattice is the product of plaquette operators corresponding to the plaquettes enclosed by the path. The reader will have no trouble checking that similarly a string operator corresponding to a closed path on the dual lattice is the product of all star operators corresponding to the stars enclosed by the path.

The idea now is to study “elementary” excitations by first considering a pair of excitations (created by a string operator), and then moving one of the excitations to infinity. This technique is also used in, for instance, lattice gauge theory [14, 18]. We show that in Kitaev’s model such excitations can be described by localized automorphisms of $\mathfrak{A}$.

Definition 3.2. Let $\rho$ be a $*$-endomorphism of $\mathfrak{A}$. Let $\Lambda \subset \mathbf{B}$ be arbitrary. Then $\rho$ is said to be localized in $\Lambda$ if $\rho(A) = A$ for all $A \in \mathfrak{A}(\Lambda^c)$. Here $\Lambda^c$ denotes the complement of any subset $\Lambda$ of $\mathbf{B}$.
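The observation that the string operator of a closed lattice path equals the product of the enclosed plaquette operators can be illustrated on a rectangle. The sketch below is only an illustration under assumptions: bonds are labelled `("h", i, j)` and `("v", i, j)` as a hypothetical encoding, and a $\sigma^z$ string is stored as the set of bonds it acts on, so that multiplying strings corresponds to the symmetric difference of bond sets (since $(\sigma^z)^2 = I$).

```python
from functools import reduce

# Illustrative sketch; the bond labels and function names are assumptions.

def plaq(p):
    """Bonds enclosing the plaquette whose lower-left corner is p = (i, j)."""
    i, j = p
    return frozenset({("h", i, j), ("h", i, j + 1), ("v", i, j), ("v", i + 1, j)})

def product(bond_sets):
    """Product of sigma^z strings: bonds hit twice cancel because (sigma^z)^2 = I."""
    return reduce(lambda a, b: a ^ b, bond_sets, frozenset())

def rectangle_boundary(w, h):
    """Bonds of the closed lattice path around the rectangle [0, w] x [0, h]."""
    horizontal = {("h", i, j) for i in range(w) for j in (0, h)}
    vertical = {("v", i, j) for i in (0, w) for j in range(h)}
    return frozenset(horizontal | vertical)

w, h = 3, 2
enclosed = [plaq((i, j)) for i in range(w) for j in range(h)]
loop_equals_product = product(enclosed) == rectangle_boundary(w, h)
print(loop_equals_product)
```

Every interior bond belongs to exactly two enclosed plaquettes and drops out of the product, so only the boundary bonds remain. The same bookkeeping applies to dual-lattice loops and star operators.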
We will primarily be interested in cone regions, although in fact the specific shape of the regions is not important (see also Remark 3.1 below).

Definition 3.3. Consider a point on the lattice $\mathbb{Z}^2$, with two semi-infinite lines emanating from it, such that the angle between those lines is positive but smaller than $\pi$. A cone $\Lambda \subset \mathbf{B}$ consists of all bonds that are in the area bounded by the two lines, or intersected by one of the lines. See Fig. 3 for an example.

Fig. 3. Example of a cone (bold bonds). The shaded region is the area bounded by two lines emanating from a point.

Note that for $x \in \mathbb{Z}^2$ there is a translated cone $\Lambda + x$. Furthermore, $\bigcup_{x \in \mathbb{Z}^2} (\Lambda + x)$ is the set of all bonds. Finally, $\tau_x(\mathfrak{A}(\Lambda)) = \mathfrak{A}(\Lambda + x)$ for any $\Lambda \subset \mathbf{B}$. These properties hold in fact for any subset $\Lambda$ of the bonds containing at least a horizontal and a vertical bond.

The string operators induce localized endomorphisms (in fact, automorphisms) of $\mathfrak{A}$. If $\gamma$ is a path starting at a site $x$ and extending to infinity, write $\gamma_n$ ($n \in \mathbb{N}$) for the finite path consisting of the first $n$ bonds of the path $\gamma$.

Proposition 3.1. Let $\Lambda$ be a cone and let $k = X, Y, Z$. Choose a path $\gamma^k$ of type $k$ in $\Lambda$ extending to infinity. Consider the corresponding string operators $\Gamma_k^{\gamma_n}$ for $n \in \mathbb{N}$. For any $A$ in $\mathfrak{A}$, define
\[ \rho^k(A) = \lim_{n \to \infty} \mathrm{Ad}\, \Gamma_k^{\gamma_n}(A), \tag{3.1} \]
where the limit is taken in norm. Then for each $k$, $\rho^k$ defines an outer automorphism of the quasi-local algebra $\mathfrak{A}$. These automorphisms are localized in $\Lambda$.

Proof. In the proof we will omit the symbol $\gamma$ and write $\Gamma_n^k$. Suppose $A$ is an observable localized in a finite region $\Lambda_0$. Then one can find $n_0$ such that $(\gamma_n \setminus \gamma_{n_0}) \cap \Lambda_0 = \emptyset$ for all $n > n_0$. In other words, new parts of the path all lie outside $\Lambda_0$. But then it follows that $\mathrm{Ad}\, \Gamma_n^k(A) = \mathrm{Ad}\, \Gamma_{n_0}^k(A)$ for all $n > n_0$, hence the limit in Eq. (3.1) converges in norm for any local operator $A$. To define $\rho^k$ on $\mathfrak{A}$, extend by continuity. Indeed, since each $\Gamma_n^k$ is a unitary operator, $\|\rho^k(A)\| = \|A\|$ for each local observable. The local observables are norm-dense in $\mathfrak{A}$, so that $\rho^k$ extends uniquely to $\mathfrak{A}$. By continuity of the $*$-operation and joint continuity of multiplication (in the norm topology), $\rho^k$ is a $*$-endomorphism. The localization property immediately follows from locality: if $B \in \mathfrak{A}(\Lambda^c)$, then it commutes with $\Gamma_n^k$ for each $n$.

The endomorphism $\rho^k$ is in fact an automorphism. Indeed, because Pauli matrices square to the identity, $\rho^k \circ \rho^k$ is the identity. To see that the automorphisms are outer, it is enough to notice that the sequence $\Gamma_n^k$ is not a Cauchy sequence in $\mathfrak{A}$, hence it does not converge to an element in $\mathfrak{A}$. By [20, Theorem 6.3], it follows that the automorphisms are outer.^e

Note that the automorphism $\rho^k$ depends on the choice of path $\gamma^k$. If necessary, this path dependence will be emphasized by using the notation $\rho_\gamma^k$.

^e Alternatively, this follows because the GNS representation of $\omega_0 \circ \rho^k$ is disjoint from the GNS representation of $\omega_0$, see Theorem 3.1.
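The stabilization step in the proof of Proposition 3.1 can be made concrete for Pauli monomials. The following sketch rests on assumptions that are not in the paper: the path is taken along the positive horizontal axis, bonds are labelled `("h", i, j)` and `("v", i, j)`, and only the sign $\epsilon$ in $\Gamma_n^Z A\, \Gamma_n^Z = \epsilon A$ is tracked. That sign is $-1$ exactly when the truncated path meets the $\sigma^x$ (or $\sigma^y$) support of $A$ in an odd number of bonds.

```python
# Illustrative sketch; path, bond labels and names are assumptions for this example.

def conjugation_sign(path_bonds, x_support):
    """Sign eps with Gamma A Gamma = eps * A, where Gamma is the product of
    sigma^z over path_bonds and A has sigma^x or sigma^y exactly on x_support."""
    return -1 if len(set(path_bonds) & set(x_support)) % 2 else 1

def truncated_path(n):
    """First n bonds of a path running along the positive horizontal axis."""
    return [("h", i, 0) for i in range(n)]

# A local monomial with sigma^x on three bonds, two of which lie on the path.
x_support = {("h", 2, 0), ("h", 5, 0), ("v", 1, 1)}
signs = [conjugation_sign(truncated_path(n), x_support) for n in range(10)]
print(signs)
```

Once the truncation length exceeds the part of the path inside the support of $A$ (here $n \geq 6$), the sign no longer changes, which is the statement $\mathrm{Ad}\, \Gamma_n^k(A) = \mathrm{Ad}\, \Gamma_{n_0}^k(A)$ for $n > n_0$.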
The automorphisms defined in Proposition 3.1 induce states by composing with the ground state.

Definition 3.4. Let $x$ be a site and $\gamma$ a path of type $k = X, Y, Z$ starting at $x$ and extending to infinity. Define a state $\omega_k^x$ of $\mathfrak{A}$ by $\omega_k^x(A) = \omega_0(\rho_\gamma^k(A))$.

At first sight, this state appears to depend on the specific choice of path. However, this is not the case.

Lemma 3.1. For each $k = X, Y, Z$ and each site $x$ of the same type, the state $\omega_k^x$ only depends on $x$, but not on the path $\gamma$.

Proof. First consider the case $k = Z$, so that $x$ is a point on the lattice. To prove independence of the path, consider another point $y$ and let $\gamma^1$ and $\gamma^2$ be two paths from $x$ to $y$. Denote the corresponding string operators by $\Gamma_Z^1$ and $\Gamma_Z^2$. This allows us to define two (a priori distinct) states
\[ \omega_i^{x,y}(A) = \omega_0(\Gamma_Z^i A\, \Gamma_Z^i), \qquad i = 1, 2. \]
Note that the string operators commute with plaquette operators, hence clearly $\omega_i^{x,y}(B_p) = 1$ for each plaquette $p$. As for the star operators, note that each star has an even number (0, 2 or 4) of edges in common with the paths $\gamma^i$, except at the endpoints $x$ and $y$, where there is an odd number of edges in common. Let $s$ be the star based at $x$. Suppose for the sake of example that it has one edge in common with the path $\gamma^1$. Then, using the commutation relations for Pauli matrices, $\omega_1^{x,y}(A_s) = \omega_0(\Gamma_Z^1 A_s \Gamma_Z^1) = i^2 \omega_0(A_s) = -1$. A similar calculation holds in the case of three common edges, or for a star $s$ containing the endpoint $y$. Summarizing, we find that $\omega_1^{x,y}$ and $\omega_2^{x,y}$ coincide on the abelian algebra $\mathfrak{A}_{XZ}$, taking the value 1 on all plaquette operators. On the star operators they take the value $-1$ if the star is based at either $x$ or $y$, and 1 otherwise. A similar argument as given for the ground state now allows us to compute the value of the states on arbitrary elements of the local algebras, and it follows that both states coincide.

There is in fact another way to see this. Let for example $\gamma$ be a finite path of type $Z$. Let $p$ be a plaquette such that $p \cap \gamma$ is non-empty. Then it is easy to see that $\Gamma_Z^\gamma B_p = \Gamma_Z^{\gamma'}$, where the path $\gamma'$ is obtained from $\gamma$ by deleting the bonds of $\gamma \cap p$ and adding the bonds $p \setminus \gamma$ to the path $\gamma$. Hence one can use the plaquette operators to deform one path into another, provided the endpoints are the same. Since
\[ \omega_0(\Gamma_Z^{\gamma} A\, \Gamma_Z^{\gamma}) = \omega_0(B_p \Gamma_Z^{\gamma} A\, \Gamma_Z^{\gamma} B_p) = \omega_0(\Gamma_Z^{\gamma'} A\, \Gamma_Z^{\gamma'}) \]
it follows that the states coincide. A similar argument can be given for paths of type $X$.

Now consider the case where $\gamma^1$ and $\gamma^2$ are two paths starting at $x$ and extending to infinity. Let $A$ be a local observable, localized in some finite set $\Lambda \subset \mathbf{B}$. Then there is an $n_0$ such that the paths $\gamma_n^1$ and $\gamma_n^2$ do not return to $\Lambda$ for $n \geq n_0$. Consider a path $\widetilde{\gamma} \subset \Lambda^c$ from $\gamma_{n_0}^1$ to $\gamma_{n_0}^2$. By locality and the result above, we then have
\[ \omega_0(\rho_{\gamma^1}^Z(A)) = \omega_0\bigl(\Gamma_Z^{\gamma_{n_0}^1} A\, \Gamma_Z^{\gamma_{n_0}^1}\bigr) = \omega_0\bigl(\Gamma_Z^{\widetilde{\gamma}} \Gamma_Z^{\gamma_{n_0}^1} A\, \Gamma_Z^{\gamma_{n_0}^1} \Gamma_Z^{\widetilde{\gamma}}\bigr) = \omega_0\bigl(\Gamma_Z^{\gamma_{n_0}^2} A\, \Gamma_Z^{\gamma_{n_0}^2}\bigr) = \omega_0(\rho_{\gamma^2}^Z(A)). \]
By continuity this result extends to observables $A \in \mathfrak{A}$, hence the state $\omega_Z^x$ is independent of the path.

The argument for the states $\omega_X^x$ and $\omega_Y^x$ is essentially the same. The difference is that one has to consider points $x, y$ in the dual lattice, i.e. plaquettes of the lattice, together with paths on the dual lattice. For example, for $k = X$ one finds
\[ \omega_X^{x,y}(A_s) = 1, \qquad \omega_X^{x,y}(B_p) = \begin{cases} -1 & x, y \in p, \\ 1 & \text{otherwise}. \end{cases} \]
The argument is now the same as for $\omega_Z^x$.
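The key step in the proof of Lemma 3.1, that two finite paths with the same endpoints flip exactly the same star operators, can be checked on an example. This is a small sketch under assumptions: vertices are integer pairs, bonds carry the hypothetical labels `("h", i, j)` and `("v", i, j)`, and the two paths below are chosen only for illustration.

```python
# Illustrative sketch; paths, bond labels and helper names are assumptions.

def star(s):
    """Bonds of the star based at the vertex s = (i, j)."""
    i, j = s
    return frozenset({("h", i, j), ("h", i - 1, j), ("v", i, j), ("v", i, j - 1)})

def bond(u, v):
    """Bond joining two neighbouring vertices u and v."""
    (a, b), (c, d) = sorted([u, v])
    if (c - a, d - b) == (1, 0):
        return ("h", a, b)
    if (c - a, d - b) == (0, 1):
        return ("v", a, b)
    raise ValueError("vertices are not nearest neighbours")

def path_bonds(vertices):
    """Bonds on which the sigma^z string of the path acts (retraced bonds cancel)."""
    bonds = set()
    for u, v in zip(vertices, vertices[1:]):
        bonds ^= {bond(u, v)}
    return bonds

def flipped_stars(bonds):
    """Vertices s whose star shares an odd number of bonds with the string,
    i.e. where conjugation by the string operator maps A_s to -A_s."""
    candidates = set()
    for kind, i, j in bonds:
        candidates |= {(i, j), (i + 1, j) if kind == "h" else (i, j + 1)}
    return {s for s in candidates if len(star(s) & bonds) % 2 == 1}

x, y = (0, 0), (2, 1)
gamma1 = [(0, 0), (1, 0), (2, 0), (2, 1)]
gamma2 = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1)]
print(flipped_stars(path_bonds(gamma1)), flipped_stars(path_bonds(gamma2)))
```

Both paths flip only the stars at $x$ and $y$. The product of the two string operators acts on a closed loop and flips no star at all, in line with the deformation argument using plaquette operators.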
The state $\omega_k^x$ describes a single excitation. By the GNS construction, this leads to a corresponding representation $\pi_{\omega_k^x}$ of $\mathfrak{A}$. The GNS triple coming from the ground state $\omega_0$ will be denoted by $(\pi_0, \mathcal{H}_0, \Omega)$. The remarkable feature is that representations corresponding to single excitations cannot be distinguished from the ground state representation when restricted to the complement of a cone.

Theorem 3.1. Let $\Lambda \subset \mathbf{B}$ be any cone. Then
\[ \pi_0 \upharpoonright \mathfrak{A}(\Lambda^c) \cong \pi_{\omega_k^x} \upharpoonright \mathfrak{A}(\Lambda^c), \tag{3.2} \]
for $k = X, Y, Z$ and any site $x$. In addition, $\pi_{\omega_k^x} \cong \pi_{\omega_l^y}$ if and only if $k = l$. This holds for $k = 0, X, Y, Z$, where $\omega_0^x := \omega_0$.

Proof. Let $x$ be a site. Choose a path $\gamma$ (of type $k$) in $\Lambda$, starting at $x$ and going to infinity. Consider $\rho^k := \rho_\gamma^k$ as above. Then $\pi_0 \circ \rho^k$ is localized in $\Lambda$, in the sense that $\pi_0 \circ \rho^k(A) = \pi_0(A)$ for all $A \in \mathfrak{A}(\Lambda^c)$. Moreover, it is a GNS representation for the state $\omega_k^x$, essentially by definition of $\omega_k^x$ (the Hilbert space is $\mathcal{H}_0$ and $\Omega$ the cyclic vector).^f Hence by uniqueness of the GNS representation, $\pi_0 \circ \rho^k \cong \pi_{\omega_k^x}$. Together with localization this yields Eq. (3.2).

Let $y$ be another site. Consider a path $\widetilde{\gamma}$ from $x$ to $y$, with corresponding string operator $\Gamma_k^{\widetilde{\gamma}}$. Note that $\mathrm{Ad}\, \Gamma_k^{\widetilde{\gamma}} \circ \rho^k$ is precisely the automorphism induced by the path from $y$ to infinity, obtained by concatenating $\widetilde{\gamma}$ with $\gamma$. From unitarity of $\Gamma_k^{\widetilde{\gamma}}$ it is easy to see that $\pi_{\omega_k^x} \cong \pi_{\omega_k^y}$, proving that the GNS representations of type $k$ are equivalent, independent of the starting site.

To complete the proof, we show that the representations are globally inequivalent. Note that $\omega_0$ is a pure state, hence the GNS representation is irreducible. The GNS representations of the states $\omega_k^x$ can be obtained by composing $\pi_0$ with an automorphism of $\mathfrak{A}$, hence they are also irreducible. But this implies that $\omega_0$ and $\omega_k^x$ are factor states. Moreover, since the representations are irreducible, unitary equivalence is equivalent to quasi-equivalence of the states [21, Proposition 10.3.7]. Recall that in the situation at hand, two factor states $\omega_1$ and $\omega_2$ are quasi-equivalent if and only if for each $\varepsilon > 0$, there is a finite set of bonds $\widehat{\Lambda}$ such that for all finite sets $\Lambda \subset \widehat{\Lambda}^c$ and $B \in \mathfrak{A}(\Lambda)$, $|\omega_1(B) - \omega_2(B)| < \varepsilon \|B\|$, by [6, Corollary 2.6.11]. We show that this inequality cannot hold.

Consider for the sake of example the case $\omega_0$ and $\omega_Z^x$, for some point $x$ on the lattice. Set $\varepsilon = 1$. Without loss of generality, we can assume that $\widehat{\Lambda}$ contains the star based at $x$. Since $\widehat{\Lambda}$ is finite, it is possible to choose a closed non-self-intersecting path $\gamma$ in the dual lattice, such that the set $\widehat{\Lambda}$ is contained in the region bounded by the path (see Fig. 4). Consider the string operator $\Gamma_X^\gamma$ corresponding to this path. Then clearly this operator is localized in a finite region in the complement of $\widehat{\Lambda}$. Recall that $\Gamma_X^\gamma$ is the product of star operators enclosed by the path $\gamma$, in particular the star based at $x$. That is, $\Gamma_X^\gamma = A_{\mathrm{star}(x)} A_{s_1} \cdots A_{s_n}$ for certain stars $s_1, \ldots, s_n$. But this implies
\[ |\omega_0(\Gamma_X^\gamma) - \omega_Z^x(\Gamma_X^\gamma)| = |1 - \omega_Z^x(A_{\mathrm{star}(x)})| = 2 > \|\Gamma_X^\gamma\|. \]

Fig. 4. Consider the state induced by the thick path on the lattice. A path $\gamma$ on the dual lattice (dashed) defines a string operator $\Gamma_X^\gamma$. The state has value $-1$ on this operator.

^f Note that $\omega_0$ and $\omega_k^x$ are automorphic states in the terminology of [21, Chap. 12]. The statement is then an example of Proposition 12.3.3 of the same reference.
The other cases are similar, if necessary using plaquettes instead of stars.

Remark 3.1. The fact that $\Lambda$ is a cone is not essential at this point. What is important is that it should be possible to choose a path extending to infinity contained in $\Lambda$. In particular, the proof implies that it is not possible to sharpen the result to unitary equivalence when restricted to the complement of a finite set. At one point in the analysis however, notably in the proof of Theorem 5.1, it is essential to be able to translate the support of any local observable to a region completely inside $\Lambda$. If $\Lambda$ is a cone, this is always possible.
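The detection argument in the proof of Theorem 3.1 can be illustrated numerically. In this sketch, which relies on assumptions not made in the paper, the closed dual path is the boundary of the square of vertices with coordinates in $[-R, R]$, the $\sigma^z$ string runs along the horizontal axis to the right of its starting vertex, and bonds are labelled `("h", i, j)` and `("v", i, j)`. The returned sign is the value of the state induced by the string on the loop operator.

```python
from functools import reduce

# Illustrative sketch; geometry, bond labels and names are assumptions.

def star(s):
    """Bonds of the star based at the vertex s = (i, j)."""
    i, j = s
    return frozenset({("h", i, j), ("h", i - 1, j), ("v", i, j), ("v", i, j - 1)})

def enclosing_loop(R):
    """Bonds carrying sigma^x in the product of all star operators A_s with
    both coordinates of s in [-R, R]: the string of a closed dual-lattice path."""
    stars = [star((i, j)) for i in range(-R, R + 1) for j in range(-R, R + 1)]
    return reduce(lambda a, b: a ^ b, stars, frozenset())

def charge_sign(start, n, R):
    """Value on the loop operator of the state induced by the sigma^z string on
    the first n horizontal bonds to the right of the vertex (start, 0)."""
    path = {("h", start + i, 0) for i in range(n)}
    return -1 if len(path & enclosing_loop(R)) % 2 else 1

print(charge_sign(0, 10, 3), charge_sign(0, 2, 3), charge_sign(5, 10, 3))
```

A string that starts inside the loop and leaves it crosses the loop once and yields $-1$, however large $R$ is; a short string that stays inside, or a string that starts outside, yields $+1$. This is why the single-excitation state differs from the ground state on observables arbitrarily far from any finite region.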
In the language of algebraic quantum field theory, the representations $\pi_{\omega_k}$ are said to satisfy a selection criterion. Usually one imposes such a selection criterion to select physically relevant representations. Here, however, we start with physically reasonable constructions and arrive at the criterion. The criterion here can be interpreted as a lattice analogue of localization in spacelike cones, as considered in [9]. An example of a model admitting such representations, albeit a model mainly of mathematical interest, is constructed in [22]. The interpretation is that the excitations cannot be distinguished from the ground state outside a cone region. It would be interesting to know if there are other irreducible representations of $\mathfrak{A}$, not unitarily equivalent to the representations in Theorem 3.1, satisfying this criterion. One probably has to impose additional criteria to select physically relevant representations (cf. the condition on the existence of a mass gap in [9]). For the automorphisms considered here, a similar property can be derived. In particular, the automorphisms are covariant with respect to the time evolution. Moreover the generator has positive spectrum bounded away from zero.

Note that the algebra $\mathfrak{A}$ (being UHF) is simple, hence $\pi_0$ is a faithful representation. To simplify notation, from now on we identify $\pi_0(A)$ with $A$ and often drop the symbol $\pi_0$, as was already done in the proof of Proposition 2.1.

Proposition 3.2. Let $\gamma$ be a path to infinity of type $k$. Then $\rho_\gamma$ is covariant for the action of $\alpha_t$. In fact, suppose $\gamma$ is of type $Z$. Then, for all $t \in \mathbb{R}$ and $A \in \mathfrak{A}$,
\[ \rho_\gamma(\alpha_t(A)) = e^{it(H_0 + 2A_s)} \rho_\gamma(A) e^{-it(H_0 + 2A_s)} \]
with $\mathrm{Sp}(H_0 + 2A_s) \subset [2, \infty)$. Here $s$ is the starting point of $\gamma$. For the case $k = X$ one has to replace $A_s$ by $B_p$, where $p$ is the plaquette where the path starts. The case $k = Y$ has generator $H_0 + 2B_p + 2A_s$, with spectrum contained in $[4, \infty)$.

Proof. We prove the result for paths of type $Z$. The other cases are proved by making the obvious modifications. First note that for $A \in \mathfrak{A}_{\mathrm{loc}}$, $\alpha_t(A) = \lim_{\Lambda \to \mathbb{Z}^2} e^{iH_\Lambda t} A e^{-iH_\Lambda t}$ in norm. By the same reasoning as in the proof of Lemma 2.1, one sees that $\rho_\gamma(A_s) = -A_s$. Hence if $\Lambda \supset \mathrm{star}(s)$, we have $\rho_\gamma(H_\Lambda) = H_\Lambda + 2A_s$. By expanding the exponential into a power series, it is then clear that
\[ \rho_\gamma(e^{itH_\Lambda} A e^{-itH_\Lambda}) = e^{it(H_\Lambda + 2A_s)} \rho_\gamma(A) e^{-it(H_\Lambda + 2A_s)}. \]
One then sees (remark in particular that $A_s$ commutes with all local Hamiltonians) that for all $A \in \mathfrak{A}$ we have $\rho_\gamma(\alpha_t(A)) = U_t \rho_\gamma(A) U_t^*$, where $U_t$ is the unitary $U_t = \exp(it(H_0 + 2A_s))$.

It remains to show the spectrum condition. This can be done by similar methods as used in the proof of Proposition 2.1. The spectrum condition is equivalent to the inequality
\[ -i\omega_0(X^* \delta(X)) + 2\omega_0(X^* A_s X) - 2\omega_0(X^* X) \geq 0 \]
for all $X \in \mathfrak{A}_{\mathrm{loc}}$. We then proceed as before: write $X = X_{XZ} + \sum_i X_i$ where $X_{XZ} \in \mathfrak{A}_{XZ}$ and the $X_i \notin \mathfrak{A}_{XZ}$ are monomials in the Pauli matrices. After substituting this into the inequality, all terms containing $X_{XZ}$ vanish. By the same reasoning as in the proof of Proposition 2.1 one then sees that this inequality is indeed satisfied for all $X \in \mathfrak{A}_{\mathrm{loc}}$.

The following corollary is immediate.

Corollary 3.1. The states $\omega_\gamma^k$ are invariant with respect to $\alpha_t$.

4. Fusion, Statistics and Braiding

The localized endomorphisms considered in the previous section can be endowed with a tensor product. In fact, it is possible to define a braiding in a canonical way. This braiding is related to the statistics of particles.

In the DHR analysis, a crucial role in the construction is played by Haag duality in the vacuum sector. For dealing with cone localized endomorphisms, the appropriate formulation is the condition that for each cone $\Lambda$ the following equality holds:
\[ \pi_0(\mathfrak{A}(\Lambda))'' = \pi_0(\mathfrak{A}(\Lambda^c))'. \]
Note that by locality, one always has $\pi_0(\mathfrak{A}(\Lambda))'' \subset \pi_0(\mathfrak{A}(\Lambda^c))'$. Currently no general conditions from which Haag duality follows are known, but note that there are some results for quantum spin chains, e.g. [23, 24]. At the moment we do not have a proof of Haag duality, but since the ground state is known explicitly, one might hope that a direct proof is possible. Fortunately, in the present situation it is possible to do without Haag duality. To clarify this, first note that Theorem 3.1 implies in particular that the localized automorphisms defined by paths extending to infinity are transportable.

Definition 4.1. Let $\Lambda$ be a cone and suppose that $\rho$ is an endomorphism of $\mathfrak{A}$ localized in $\Lambda$. Then $\rho$ is called transportable, if for any cone $\widehat{\Lambda}$ there is a unitarily equivalent^g endomorphism $\widehat{\rho}$ localized in $\widehat{\Lambda}$.

One of the applications of Haag duality is to get more control over the unitary setting up the equivalence. Specifically, one can show that the intertwiners are elements of the (weak closure) of cone algebras.
Recall that an intertwiner $V$ from an endomorphism $\rho_1$ to $\rho_2$ is an operator such that $V \rho_1(A) = \rho_2(A) V$ for all $A \in \mathfrak{A}$. A unitary intertwiner is also called a charge transportation operator (or simply charge transporter). In our model we will be able to prove, without invoking Haag duality, that the charge transporters are elements of the weak closure of cone algebras. We again identify $\pi_0(A)$ with $A$ in the proof.

^g We do not require that this unitary lives in $\mathfrak{A}$. More precisely, we demand that $\pi_0 \circ \rho \cong \pi_0 \circ \widehat{\rho}$.
Lemma 4.1. Let $\gamma^1$ (respectively $\gamma^2$) be a path of type $k$ starting at a site $x$ (respectively $y$) and extending to infinity. Then there is a unitary intertwiner $V$ from $\rho_{\gamma^1}^k$ to $\rho_{\gamma^2}^k$ such that $\Gamma_{\widehat{\gamma}}^k V \Omega = \Omega$ (where $\Omega$ is the GNS vector for $\omega_0$) for any path $\widehat{\gamma}$ from $x$ to $y$. Moreover, if for each $n$ a path $\widetilde{\gamma}_n$ from the $n$th site of $\gamma^1$ to the $n$th site of $\gamma^2$ is chosen such that $\lim_{n \to \infty} \mathrm{dist}(\widetilde{\gamma}_n, x) = \infty$, then for $V_n = \Gamma_n^1 \Gamma_{\widetilde{\gamma}_n}^k \Gamma_n^2$, where $\Gamma_n^i$ is the string operator corresponding to the path $\gamma_n^i$, we have
\[ V = \operatorname*{w-lim}_{n \to \infty} V_n. \tag{4.1} \]
In other words, $V_n$ is a sequence of operators converging weakly to $V$.

Proof. First note that $A_s \Omega = B_p \Omega = \Omega$ for all star and plaquette operators. Indeed,
\[ \|(A_s - I)\Omega\|^2 = \omega_0((A_s - I)^* (A_s - I)) = 2 - 2 = 0, \]
for all $A_s$. A similar calculation holds for the operators $B_p$. Note that this property can be interpreted as the ground state vector minimizing the value of each local Hamiltonian [1].

Next note that a unitary $V$ as in the statement is necessarily unique because any unitary intertwiner from $\rho_{\gamma^1}^k$ to $\rho_{\gamma^2}^k$ is a scalar multiple of $V$, by Schur’s lemma and irreducibility of $\pi_0$. To show existence, first consider (for simplicity) the case where $\gamma^1$ and $\gamma^2$ start at the same site $x$. As remarked earlier in the proof of Theorem 3.1, $\Omega$ is a cyclic vector for $\rho_1^k$ and for $\rho_2^k$ (we will write $\rho_1^k$ instead of $\rho_{\gamma^1}^k$ in the proof). Moreover, the corresponding vector state is $\omega_k^x$. By uniqueness of the GNS construction, there is a unitary $V$ such that $V \rho_1^k(A) = \rho_2^k(A) V$ for all $A \in \mathfrak{A}$, and $V \Omega = \Omega$.

Choose paths $\widetilde{\gamma}_n$ as in the statement of the lemma. The path obtained by concatenating $\widetilde{\gamma}_n$ with the paths $\gamma_n^1$ and $\gamma_n^2$ can be seen as a loop based at $x$ that gets larger and larger as $n$ gets bigger. Now consider a sequence $V_n$ of unitaries defined by $V_n = \Gamma_n^1 \Gamma_n^2 \Gamma_{\widetilde{\gamma}_n}^k$ where $\Gamma_n^i$ is defined in the statement of the lemma. Note that $V_n$ is a product of star and plaquette operators, since it is the path operator of a closed loop. Hence, $V_n \Omega = \Omega$ by the observation above. Suppose $B \in \mathfrak{A}_{\mathrm{loc}}$. Let $N$ be such that $\widetilde{\gamma}_n \cap \mathrm{supp}(B) = \emptyset$ for all $n \geq N$. Then from locality, one can easily verify that $V_n \rho_1^k(B) = \rho_2^k(B) V_n$ for all $n \geq N$, so that
\[ \lim_{n \to \infty} \langle \rho_1^k(A)\Omega, V_n \rho_1^k(B)\Omega \rangle = \lim_{n \to \infty} \langle \rho_1^k(A)\Omega, \rho_2^k(B) V_n \Omega \rangle = \langle \Omega, \rho_1^k(A)^* \rho_2^k(B)\Omega \rangle, \]
for all $A, B \in \mathfrak{A}_{\mathrm{loc}}$. On the other hand, for each $A, B \in \mathfrak{A}_{\mathrm{loc}}$, $\langle \rho_1^k(A)\Omega, V \rho_1^k(B)\Omega \rangle = \langle \Omega, \rho_1^k(A)^* \rho_2^k(B)\Omega \rangle$, since $V \Omega = \Omega$. The sequence $V_n$ is uniformly bounded and, because $\rho_1^k(\mathfrak{A}_{\mathrm{loc}})\Omega$ is dense in $\mathcal{H}_0$ (since $\rho_1^k$ is an automorphism), it follows that $V_n \to V$ weakly. Seeing that any path $\widehat{\gamma}$ from $x$ to $x$ is a loop, it is clear that $\Gamma_{\widehat{\gamma}}^k V \Omega = \Omega$.
As for the general case, suppose $\gamma^1$ starts at the site $x$ and $\gamma^2$ starts at the site $y$. Choose a path $\tilde\gamma$ from $x$ to $y$. Then $\tilde\rho := \operatorname{Ad}\Gamma^k_{\tilde\gamma}\circ\rho^k_1$ is defined by a path starting at $y$. By the argument above, there is a unitary $\tilde V$ intertwining $\tilde\rho$ and $\rho^k_2$ such that $\tilde V\Omega = \Omega$. Set $V = \Gamma^k_{\tilde\gamma}\tilde V$. It follows that $V$ is an intertwiner from $\rho^k_1$ to $\rho^k_2$ that satisfies $\Gamma^k_{\hat\gamma}V\Omega = \Omega$ for all paths $\hat\gamma$ from $x$ to $y$, because $\Gamma^k_{\tilde\gamma}\Gamma^k_{\hat\gamma}$ is the path operator of a loop. The claim on the converging net follows from the construction.

A pleasant consequence of the above proof is that a specific sequence converging to the intertwiners is given, which makes it possible to do explicit calculations. A direct consequence of the lemma is that we have some control over the algebras containing the unitary intertwiners, a point where usually Haag duality is used.

Theorem 4.1. Suppose $\Lambda_1$ and $\Lambda_2$ are two cones such that there is another cone $\Lambda\supset\Lambda_1\cup\Lambda_2$. For $k = X, Y, Z$, consider $\rho^k_i\cong\pi_{\omega^k}$ localized in $\Lambda_i$ for $i = 1, 2$, defined by paths $\gamma^i$ extending to infinity. Let $W$ be a unitary such that $W\rho^k_1(A) = \rho^k_2(A)W$ for all $A\in\mathfrak{A}$. Then $W\in\mathfrak{A}(\Lambda)''$.

Proof. By Schur's lemma, $W$ is a multiple of the intertwiner $V$ in the previous lemma. The geometric situation makes it clear that a net $V_n$ as in the lemma can be chosen to be a net in $\mathfrak{A}(\Lambda)$. This net converges weakly to $V$, by the previous lemma, so $V$ (and hence $W$) lies in the weak closure $\mathfrak{A}(\Lambda)''$.

Remark 4.1. Again it is not essential that $\Lambda$ as in the theorem is a cone. It is enough to be able to choose paths $\tilde\gamma_n$ as in Lemma 4.1 that lie inside $\Lambda$. But note that the smaller $\Lambda$ is, the more control one has over the algebra in which the intertwiners live.

Proposition 4.1. The representations $\rho^k$ are covariant with respect to the action $\tau_x$ of translations. That is, for each $x\in\mathbb{Z}^2$ there is a unitary $W(x)$ such that $\rho^k(\tau_x(A)) = W(x)\rho^k(A)W(x)^*$ for all $A\in\mathfrak{A}$, and the map $x\mapsto W(x)$ is a group homomorphism.

Proof. Let $\gamma$ denote the string (starting at the site $x_0$) defining $\rho^k$. For $x\in\mathbb{Z}^2$, consider the translated string $\gamma' = \gamma - x$. This defines an automorphism $\rho'^k$. In fact, $\rho'^k = \tau_{-x}\circ\rho^k\circ\tau_x$. Then by Lemma 4.1, there is a unitary intertwiner $V_x$ from $\rho^k$ to $\rho'^k$. We choose $V_x$ such that the condition in Lemma 4.1 is satisfied. Write $U(x)$ for the unitaries that implement the translations in the GNS representation of $\omega_0$. Define $W(x) = U(x)V_x$.
It then follows that $\rho^k(\tau_x(A)) = W(x)\rho^k(A)W(x)^*$ for all $A\in\mathfrak{A}_{\mathrm{loc}}$, and hence by continuity for all $A\in\mathfrak{A}$. It remains to show that $x\mapsto W(x)$ is a representation of $\mathbb{Z}^2$. By irreducibility of $\rho^k$ it follows that $W(x+y) = \lambda(x,y)W(x)W(y)$ with $\lambda$ a 2-cocycle of $\mathbb{Z}^2$ taking values in the unit circle. The claim is that $\lambda$ is in fact trivial. This would follow from the equation $U(y)^*V_xU(y) = V_{x+y}V_y^*$ for all $x, y\in\mathbb{Z}^2$, since then $W(x)W(y) = U(x)U(y)\,[U(y)^*V_xU(y)]\,V_y = U(x+y)V_{x+y} = W(x+y)$. Note that the operator $V_{x+y}V_y^*$ on the right-hand side of this equation is an intertwiner from $\rho^k_{\gamma-y}$
to $\rho^k_{\gamma-(x+y)}$ satisfying the condition in Lemma 4.1. This equation can be verified by noting that $V_{x+y}$ and $V_y$ commute with path operators (this should be clear from the construction of a converging net) and by the following observation: a path operator $\Gamma_{\hat\gamma}$ (where $\hat\gamma$ is a path from $x_0 - y$ to $x_0 - (x+y)$) can be written as $\Gamma_{\gamma_1}\Gamma^*_{\gamma_2}$ with $\gamma_1$ a path from $x_0$ to $x_0 - (x+y)$ and $\gamma_2$ a path from $x_0$ to $x_0 - y$. Let $V^n_x$ be a sequence as in Lemma 4.1 converging weakly to $V_x$. Then for the translated sequence $\tau_{-y}(V^n_x)$,
$$\operatorname{w-lim}_{n\to\infty}\tau_{-y}(V^n_x) = V_{x+y}V_y^*,$$
by the same lemma. The result follows since the map $A\mapsto\tau_{-y}(A) = U(y)^*AU(y)$ is weakly continuous, hence the left-hand side is equal to $U(y)^*V_xU(y)$.

It is possible to define a tensor product of localized endomorphisms. If $\rho_1$ and $\rho_2$ are localized in cones $\Lambda_1$ and $\Lambda_2$, the basic idea is to define an endomorphism $\rho_1\otimes\rho_2$ by $(\rho_1\otimes\rho_2)(A) = \rho_1(\rho_2(A))$. If $\Lambda\supset\Lambda_1\cup\Lambda_2$ is a cone, it follows that $\rho_1\otimes\rho_2$ is localized in $\Lambda$. In order to get a categorical tensor structure, one would then like to define a tensor product for intertwiners. If $T_i$, $i = 1, 2$, are intertwiners from $\rho_i$ to $\sigma_i$ (and $T_i\in\mathfrak{A}$), the reader will have no difficulty showing that $T_1\otimes T_2 := T_1\rho_1(T_2)$ is an intertwiner from $\rho_1\otimes\rho_2$ to $\sigma_1\otimes\sigma_2$. In the terminology of category theory, this would turn the category of cone localized automorphisms with intertwiners as morphisms into a strict tensor category. The trivial endomorphism $\iota$ is the tensor unit. Note that the unit operator $I$ of $\mathfrak{A}$ can be regarded as an intertwiner from $\rho$ to itself for any endomorphism $\rho$. To indicate this, we sometimes write $I_\rho$. The distinction is important in the definition of the tensor product of intertwiners.

There is, however, one problem with this definition: the intertwiners are elements of the algebra $\mathfrak{A}(\Lambda)''$ rather than of $\mathfrak{A}(\Lambda)$ (recall that we identified $\pi_0(\mathfrak{A})$ with $\mathfrak{A}$). There is no reason why they should be contained in the quasi-local algebra $\mathfrak{A}$, because this algebra is not weakly closed in general. Since the localized endomorphisms are (a priori) only defined on $\mathfrak{A}$, the above definition therefore does not make sense. A possible solution is to introduce an auxiliary algebra that contains the intertwiners [9]. Choose an arbitrary cone $\Lambda_a$, which will be fixed from now on. The cone can be interpreted as a "forbidden" direction, not unlike the technique of puncturing the circle. Introduce a partial ordering on $\mathbb{Z}^2$ by defining
$$x\le y \;\Leftrightarrow\; (\Lambda_a + y)\subset(\Lambda_a + x) \;\Leftrightarrow\; (\Lambda_a + x)^c\subset(\Lambda_a + y)^c.$$
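Now $(\mathbb{Z}^2, \le)$ is a directed set (each pair of points has an upper bound with respect to $\le$), hence it is possible to take the ($C^*$-)inductive limit
$$\mathfrak{A}^{\Lambda_a} = \overline{\bigcup_{x\in\mathbb{Z}^2}\mathfrak{A}((\Lambda_a + x)^c)''}^{\,\|\cdot\|}. \qquad (4.2)$$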
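The directedness claim can be made concrete. The following is a minimal sketch, assuming for illustration only that the forbidden cone is modelled by the planar set $C = \{(a,b)\in\mathbb{Z}^2 : a\ge|b|\}$; this choice of cone and the names `in_cone`, `leq` and `upper_bound` are hypothetical and do not appear in the paper. Since $C$ contains the origin and is closed under addition, $C + y\subset C + x$ holds exactly when $y - x\in C$, which gives a one-line test for $x\le y$.

```python
def in_cone(p):
    """Membership in the model cone C = {(a, b) : a >= |b|} (an assumption)."""
    a, b = p
    return a >= abs(b)


def leq(x, y):
    """x <= y  iff  (C + y) is contained in (C + x)  iff  y - x lies in C."""
    return in_cone((y[0] - x[0], y[1] - x[1]))


def upper_bound(x, y):
    """Return one point z with x <= z and y <= z (not necessarily the least)."""
    # Keep the second coordinate of x and move far enough in the cone direction
    # that z - y also lands in C.
    return (max(x[0], y[0] + abs(x[1] - y[1])), x[1])


if __name__ == "__main__":
    x, y = (0, 0), (0, 1)
    print(leq(x, y), leq(y, x))   # neither point dominates the other
    z = upper_bound(x, y)
    print(z, leq(x, z), leq(y, z))
```

The points $(0,0)$ and $(0,1)$ are incomparable in this model, so the order is genuinely partial, yet a common upper bound always exists, which is all the inductive limit requires.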
Note that $\mathfrak{A}^{\Lambda_a + x} = \mathfrak{A}^{\Lambda_a}$ for all $x\in\mathbb{Z}^2$. Clearly, $\mathfrak{A}\subset\mathfrak{A}^{\Lambda_a}$. Moreover, if $\Lambda$ is a cone such that $\Lambda\subset(\Lambda_a + x)^c$ for some $x$, then $\mathfrak{A}(\Lambda)''\subset\mathfrak{A}^{\Lambda_a}$. An important point$^{h}$ is that the automorphisms we consider can be extended to $\mathfrak{A}^{\Lambda_a}$.

Proposition 4.2. Let $\rho$ be an automorphism defined by a path extending to infinity. Then $\rho$ has a unique extension $\rho^{\Lambda_a}$ to $\mathfrak{A}^{\Lambda_a}$ that is weakly continuous on $\mathfrak{A}((\Lambda_a + x)^c)''$ for any $x\in\mathbb{Z}^2$. Moreover, $\rho^{\Lambda_a}(\mathfrak{A}^{\Lambda_a})\subset\mathfrak{A}^{\Lambda_a}$. In other words, it is an endomorphism of the auxiliary algebra.

Proof. The proof is essentially the same as that of [9, Lemma 4.1], except at points where duality is used. First, let $A\in\mathfrak{A}((\Lambda_a + x)^c)$. Since $\rho$ is localizable, there is a unitary $V$ such that $\rho(A) = VAV^*$ (choose a unitarily equivalent endomorphism localized in $\Lambda_a + x$). This implies that $\rho$ is weakly continuous on $\mathfrak{A}((\Lambda_a + x)^c)$, and the unique weakly continuous extension can be given by $\rho^{\Lambda_a}(B) = VBV^*$ for $B\in\mathfrak{A}((\Lambda_a + x)^c)''$. This procedure determines $\rho^{\Lambda_a}$ on all of $\mathfrak{A}^{\Lambda_a}$. To show that $\rho^{\Lambda_a}$ maps $\mathfrak{A}^{\Lambda_a}$ into itself, first note that $\rho(\mathfrak{A}(\Lambda))\subset\mathfrak{A}(\Lambda)$ for every finite set $\Lambda\subset\mathbf{B}$. Hence, by weak continuity,
$$\rho^{\Lambda_a}(\mathfrak{A}((\Lambda_a + x)^c)'') = \rho(\mathfrak{A}((\Lambda_a + x)^c))''\subset\mathfrak{A}((\Lambda_a + x)^c)'',$$
which proves the claim.

Remark 4.2. In the proof of Buchholz and Fredenhagen, Haag duality is used to show that the extensions map the auxiliary algebra into itself (see also Footnote h). The point is that using Haag duality it is possible to show that for representations localized in a cone $\Lambda$ one has $\rho(\mathfrak{A}(\Lambda))\subset\mathfrak{A}(\Lambda)''$. Since we have an explicit description of the representations, we can directly prove the stronger statement $\rho(\mathfrak{A}(\Lambda))\subset\mathfrak{A}(\Lambda)$ for the automorphisms considered in our model. However, the intertwiners are typically not elements of $\mathfrak{A}(\Lambda)$.

We now redefine the tensor product as $\rho_1\otimes\rho_2 = \rho_1^{\Lambda_a}\circ\rho_2$. For the automorphisms that we have considered so far, this definition reduces to the old one. However, to define the tensor product of intertwiners, this definition is necessary.
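For the earlier definition $T_1\otimes T_2 := T_1\rho_1(T_2)$, the intertwining property left to the reader is a short computation. The sketch below uses only that $T_i\rho_i(A) = \sigma_i(A)T_i$ and that $\rho_1$ is a homomorphism; the same lines go through verbatim with $\rho_1$ replaced by its extension $\rho_1^{\Lambda_a}$, which is exactly what is needed when $T_2$ is not an element of $\mathfrak{A}$.

```latex
\begin{align*}
(T_1\otimes T_2)\,(\rho_1\otimes\rho_2)(A)
  &= T_1\,\rho_1(T_2)\,\rho_1(\rho_2(A)) \\
  &= T_1\,\rho_1\bigl(T_2\,\rho_2(A)\bigr) \\
  &= T_1\,\rho_1\bigl(\sigma_2(A)\,T_2\bigr) \\
  &= T_1\,\rho_1(\sigma_2(A))\,\rho_1(T_2) \\
  &= \sigma_1(\sigma_2(A))\,T_1\,\rho_1(T_2)
   = (\sigma_1\otimes\sigma_2)(A)\,(T_1\otimes T_2).
\end{align*}
```

The fourth line is where $\rho_1$ has to be evaluated on $T_2$ itself, which explains why the domain of $\rho_1$ must contain the intertwiners.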
If $S$ is an intertwiner from $\rho_1$ to $\rho_1'$ and $T$ an intertwiner from $\rho_2$ to $\rho_2'$ such that $T\in\mathfrak{A}(\Lambda)''$ for some cone $\Lambda$ asymptotically disjoint from $\Lambda_a$, then $S\otimes T := S\rho_1^{\Lambda_a}(T)$ is a well-defined intertwiner from $\rho_1\otimes\rho_2$ to $\rho_1'\otimes\rho_2'$.

The tensor product gives rise to fusion rules. A fusion rule gives a decomposition of the tensor product of two irreducible representations into a direct sum of irreducible representations. In Kitaev's model the rules are particularly simple. As remarked before, for each $k = X, Y, Z$, $\rho^k\otimes\rho^k = \iota$, where $\iota$ is the trivial endomorphism of $\mathfrak{A}$. Furthermore, essentially by definition, $\rho^X\otimes\rho^Z\cong\rho^Y$. This determines
h In the case of algebraic quantum field theory, the main point is to obtain endomorphisms of the auxiliary algebra from representations of the quasi-local algebra. In the present model, however, we already have automorphisms of $\mathfrak{A}$.
the fusion rules for unitarily equivalent representations as well: unitaries setting up the equivalence can be defined using the tensor product.

Using the tensor product, a braiding can then be defined in this case, similarly as in the DHR analysis [10]. This is a unitary operator $\varepsilon_{\rho_1,\rho_2}$ intertwining $\rho_1\otimes\rho_2$ and $\rho_2\otimes\rho_1$. First, consider two disjoint cones $\Lambda_1$ and $\Lambda_2$ that are both contained in $(\Lambda_a + x)^c$ for some $x$. We say that $\Lambda_1 < \Lambda_2$ if we can rotate $\Lambda_1$ counterclockwise around the apex of the cone until it has non-empty intersection with $\Lambda_a + x$, such that at any intermediate angle it is disjoint from $\Lambda_2$. Note that for two disjoint cones either $\Lambda_1 < \Lambda_2$ or $\Lambda_2 < \Lambda_1$.

Now let $\rho_1, \rho_2$ be two localized automorphisms, as considered above, such that $\rho_1$ is localized in a cone $\Lambda_1$ and $\rho_2$ in $\Lambda_2$. Moreover, we demand that there is a cone $\Lambda\supset\Lambda_1\cup\Lambda_2$. Note that $\rho_1\otimes\rho_2$ is localized in $\Lambda$. Choose a cone $\hat\Lambda_2$ such that $\hat\Lambda_2 < \Lambda_1$. Then there is a unitary $V$ such that $V\rho_2(-)V^*$ is localized in $\hat\Lambda_2$. This unitary can be chosen in $\mathfrak{A}^{\Lambda_a}$ [25]. It then follows that $\varepsilon_{\rho_1,\rho_2} := (V\otimes I_{\rho_1})^*(I_{\rho_1}\otimes V) = V^*\rho_1^{\Lambda_a}(V)$ is an intertwiner from $\rho_1\otimes\rho_2$ to $\rho_2\otimes\rho_1$. With this definition, one can prove the following result by adapting the proof in the DHR analysis (see e.g. [26]) in a suitable way.

Lemma 4.2. The braiding $\varepsilon_{\rho,\sigma}$ only depends on the condition $\hat\Lambda_2 < \Lambda_1$, not on the specific choices made. Moreover, it satisfies the braid equations
$$\varepsilon_{\rho,\sigma\otimes\tau} = (I_\sigma\otimes\varepsilon_{\rho,\tau})(\varepsilon_{\rho,\sigma}\otimes I_\tau), \qquad \varepsilon_{\rho\otimes\sigma,\tau} = (\varepsilon_{\rho,\tau}\otimes I_\sigma)(I_\rho\otimes\varepsilon_{\sigma,\tau}). \qquad (4.3)$$
Furthermore, $\varepsilon_{\rho,\sigma}$ is natural in $\rho$ and $\sigma$: if $T$ is an intertwiner from $\rho$ to $\rho'$, then $\varepsilon_{\rho',\sigma}(T\otimes I) = (I\otimes T)\varepsilon_{\rho,\sigma}$, and similarly for $\sigma$.

In Lemma 4.1, a net converging to the charge transporters was explicitly constructed. This makes it possible to calculate the braiding operators exactly. In the subscript of the braiding, we will sometimes write $X$, $Y$ or $Z$ instead of $\rho^X$, $\rho^Y$ and $\rho^Z$.

Theorem 4.2. Let $\rho_1, \rho_2$ be automorphisms defined by strings extending to infinity in some cone $\Lambda$. Suppose that each automorphism is of type $X$ or type $Z$. The braid operators in each of the possible cases are then given by $\varepsilon_{X,X} = \varepsilon_{Z,Z} = I$ and $\varepsilon_{X,Z} = \pm I$. If $\varepsilon_{X,Z} = I$, then $\varepsilon_{Z,X} = -I$ and vice versa.

Proof. Consider a cone $\hat\Lambda$ disjoint from $\Lambda$, such that $\hat\Lambda < \Lambda$ and such that there is a cone $\tilde\Lambda\supset\Lambda\cup\hat\Lambda$. There is a path $\hat\gamma_2$ in $\hat\Lambda$ such that the corresponding automorphism $\hat\rho_2$ is unitarily equivalent to $\rho_2$ and localized in $\hat\Lambda$. The corresponding unitary charge transporter $V$ is then contained in $\mathfrak{A}(\tilde\Lambda)''$. By definition we then have $\varepsilon_{\rho_1,\rho_2} = V^*\rho_1^{\Lambda_a}(V)$. This can be calculated using weak continuity of $\rho_1^{\Lambda_a}$ and the explicit construction of Lemma 4.1 of a net converging to $V$. Indeed, let $V_n\to V$ be this net. Note that each $V_n$ is a string operator of the same type as $\rho_2$. In particular, if $\rho_1$ is of the
same type as $\rho_2$, then $\rho_1(V_n) = V_n$ for all $n$ and hence $\rho_1^{\Lambda_a}(V) = V$. It follows that $\varepsilon_{X,X} = \varepsilon_{Z,Z} = I$.

Fig. 5. The path $\tilde\gamma_n$ (dashed line) crosses the defining path of $\rho_1$ from the right. The dotted lines represent the defining paths of $\rho_2$ and $\hat\rho_2$.

The situation where $\rho_1$ is of type $X$ and $\rho_2$ is of type $Z$ (or vice versa) is a bit more complicated. Recall that for the definition of the net $V_n$, for each $n$ a path $\tilde\gamma_n$ is chosen such that the distance to the starting points of the paths $\gamma_2$ and $\hat\gamma_2$ goes to infinity. The operator $V_n$ is then the string operator corresponding to the string formed by the first $n$ bonds of $\gamma_2$ and $\hat\gamma_2$, together with $\tilde\gamma_n$. Note that, if $n$ is big enough, this string crosses $\gamma_1$ either an even number of times or an odd number of times, independent of $n$. This property depends on whether the first crossing is from the "left" or from the "right" (see Fig. 5), or if there is no crossing at all. By anti-commutation of the Pauli matrices, it follows that if the number of crossings is even, $\rho_1^{\Lambda_a}(V) = V$, whereas if it is odd then $\rho_1^{\Lambda_a}(V) = -V$. Hence, $\varepsilon_{X,Z} = \pm I$. If the roles of $\rho_1$ and $\rho_2$ are reversed, an odd number of crossings becomes an even number. This observation proves the last claim.
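The sign rule used in the proof can be checked numerically on a handful of qubits. The sketch below is an illustration under simplifying assumptions, not the construction of the paper: bonds are just the indices $0,\dots,n-1$, a type-$X$ string is a product of $\sigma_x$ on a set of bonds, a type-$Z$ string is a product of $\sigma_z$, a crossing is a shared bond, and the automorphism is replaced by conjugation with a finite string operator. The helper names `pauli_string` and `sign_under_conjugation` are hypothetical.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])


def pauli_string(pauli, bonds, n):
    """Tensor product over n bonds: `pauli` on `bonds`, identity elsewhere."""
    factors = [pauli if j in bonds else I2 for j in range(n)]
    return reduce(np.kron, factors)


def sign_under_conjugation(x_bonds, z_bonds, n):
    """Return s in {+1, -1} such that F_x F_z F_x^* = s F_z."""
    fx = pauli_string(SX, x_bonds, n)
    fz = pauli_string(SZ, z_bonds, n)
    conjugated = fx @ fz @ fx.conj().T
    if np.allclose(conjugated, fz):
        return 1
    if np.allclose(conjugated, -fz):
        return -1
    raise ValueError("conjugation did not return a multiple of the input")


if __name__ == "__main__":
    n = 5
    print(sign_under_conjugation({0, 1, 2}, {2, 3}, n))     # one shared bond
    print(sign_under_conjugation({0, 1, 2}, {1, 2, 4}, n))  # two shared bonds
```

In this toy setting the sign is always $(-1)^{c}$ with $c$ the number of shared bonds, matching the even/odd dichotomy in the proof.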
Since $\rho^Y = \rho^X\otimes\rho^Z$, the braid equations allow one to compute the braiding with excitations of type $Y$. The braiding with the trivial automorphism is always trivial. This completely determines the braiding for all irreducible representations we consider.

We note that the sign of, for example, $\varepsilon_{X,Z}$ depends on the relative localization of both strings. Indeed, suppose we have two automorphisms $\rho_1$ and $\rho_2$, defined by strings $\gamma_1$ of type $X$ and $\gamma_2$ of type $Z$, extending to infinity and localized in $\Lambda_1$ respectively $\Lambda_2$. Suppose moreover that $\Lambda_2 < \Lambda_1$. It then follows that $\varepsilon_{\rho_1,\rho_2} = I$, since the paths in the proof, going from $\gamma_2$ to $\hat\gamma_2$, do not cross $\gamma_1$. On the other hand, if $\Lambda_1 < \Lambda_2$ it follows that $\varepsilon_{\rho_1,\rho_2} = -I$. Note that this coincides with the situation in algebraic quantum field theory in low dimensions [27, Sec. 2.2].

The final piece of structure is that of conjugation. A conjugate can be interpreted as an anti-charge. Formally, a conjugate for an endomorphism $\rho$ is a triple $(\bar\rho, R, \bar R)$ such that $R$ intertwines $\iota$ and $\bar\rho\otimes\rho$ and $\bar R$ intertwines $\iota$ and $\rho\otimes\bar\rho$ [28]. Here $\iota$ is the trivial endomorphism. The intertwiners $R, \bar R$ should satisfy
$$\bar R^*\rho(R) = I, \qquad R^*\bar\rho(\bar R) = I.$$
A conjugate for an irreducible endomorphism $\rho$ is called normalized if $R^*R = \bar R^*\bar R$ and standard if $R^*\bar\rho(T)R = \bar R^*T\bar R$ for every intertwiner $T$ from $\rho$ to itself. If a conjugate exists, one can always find a standard conjugate. Note that $\rho^k\otimes\rho^k = \iota$ for $k = X, Y, Z$. It follows that in our model the automorphisms we consider have conjugates. These are particularly simple: $\bar\rho^k = \rho^k$ and one can choose the unit operators for the intertwiners $R$ and $\bar R$. This is trivially a standard conjugate.

With the help of the braiding and conjugates one can define a twist. Let $\rho$ be a cone localized endomorphism and $(\bar\rho, R, \bar R)$ be a standard conjugate. The twist $\Theta_\rho\in\operatorname{End}(\rho)$ is then defined by
$$\Theta_\rho = (R^*\otimes\mathrm{id}_\rho)\circ(\mathrm{id}_{\bar\rho}\otimes\varepsilon_{\rho,\rho})\circ(R\otimes\mathrm{id}_\rho).$$
Note that if $\rho$ is irreducible, $\Theta_\rho = \omega_\rho I$ for some phase factor $\omega_\rho$. The (equivalence class of) $\rho$ is called bosonic if $\omega_\rho = 1$ and fermionic if $\omega_\rho = -1$. Since the conjugates of $\rho^k$, $k = X, Y, Z$, are particularly simple, the following corollary immediately follows from Theorem 4.2.

Corollary 4.1. The excitations $X$ and $Z$ are bosonic and $Y$ is fermionic.

5. Cone Algebras

Let $\Lambda$ be a cone. In this section we consider the von Neumann algebras associated to the observables localized in this cone. More precisely, define $\mathcal{R}_\Lambda := \pi_0(\mathfrak{A}(\Lambda))''$ and $\mathcal{R}_{\Lambda^c} := \pi_0(\mathfrak{A}(\Lambda^c))''$. The main result in this section is that $\mathcal{R}_\Lambda$ is an infinite factor.

Lemma 5.1. With the notation above, $\mathcal{R}_\Lambda\vee\mathcal{R}_{\Lambda^c} = \mathfrak{B}(\mathcal{H}_0)$.

Proof. Note that for each set $\Lambda\subset\mathbf{B}$ one has $\mathcal{R}_\Lambda = \bigvee_{b\in\Lambda}\pi_0(\mathfrak{A}(\{b\}))''$. It follows that $\mathfrak{B}(\mathcal{H}_0) = \pi_0(\mathfrak{A})'' = \mathcal{R}_\Lambda\vee\mathcal{R}_{\Lambda^c}$.
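More can be said about the cone algebras. In fact, they are infinite factors. In other words, $\mathcal{R}_\Lambda$ is a factor of Type I$_\infty$, Type II$_\infty$ or Type III. The basic idea of the proof, which is adapted from [23, Proposition 5.3], is to assume that $\mathcal{R}_\Lambda$ admits a tracial state. It then follows that $\omega_0$ is tracial, which is a contradiction. In fact, Type I$_\infty$ can be ruled out as well.

Theorem 5.1. $\mathcal{R}_\Lambda$ is a factor of Type II$_\infty$ or Type III.

Proof. To show that $\mathcal{R}_\Lambda$ is a factor, we argue as in [23]. The center is $Z(\mathcal{R}_\Lambda) = \mathcal{R}_\Lambda\cap\mathcal{R}_\Lambda'$. By taking commutants, $Z(\mathcal{R}_\Lambda)' = \mathcal{R}_\Lambda\vee\mathcal{R}_\Lambda'$. Note that $\mathcal{R}_{\Lambda^c}\subset\mathcal{R}_\Lambda'$, hence by Lemma 5.1, $Z(\mathcal{R}_\Lambda)' = \mathfrak{B}(\mathcal{H}_0)$, that is, the center is trivial.

Assume that $\mathcal{R}_\Lambda$ is a finite factor. Then there exists a unique tracial state $\psi$ on $\mathcal{R}_\Lambda$. This induces a tracial state $\tilde\psi = \psi\circ\pi_0$ on $\mathfrak{A}(\Lambda)$. By Propositions 10.3.12(i)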
and 10.3.14 of [21], it follows that the state $\tilde\psi$ is factorial and quasi-equivalent to the restriction of $\omega_0$ to $\mathfrak{A}(\Lambda)$.

Let $\varepsilon > 0$. By [6, Corollary 2.6.11], there is a finite set $\hat\Lambda\subset\Lambda$ such that $|\omega_0(A) - \tilde\psi(A)| < \varepsilon\|A\|$ for all $A\in\mathfrak{A}(\Lambda\setminus\hat\Lambda)$. Now, let $k > 0$ be an integer. Consider local observables $A, B$ of norm 1 with localization region contained in $B(0,k)$ (that is, all bonds that can be connected to the origin of $\mathbb{Z}^2$ with a path of length at most $k$). Since $\Lambda$ is a cone and $\hat\Lambda$ is finite, there is an $x\in\mathbb{Z}^2$ such that $\tau_x(AB)$ is localized in $\Lambda\setminus\hat\Lambda$. By translation invariance,
$$|\omega_0(AB) - \tilde\psi(\tau_x(AB))| = |\omega_0(\tau_x(AB)) - \tilde\psi(\tau_x(AB))| < \varepsilon,$$
and similarly for $BA$. Hence, since $\tilde\psi$ is a trace,
$$|\omega_0(AB) - \omega_0(BA)| = |\omega_0(AB) - \tilde\psi(\tau_x(AB)) - \omega_0(BA) + \tilde\psi(\tau_x(BA))| < 2\varepsilon.$$
Because $k$ and $\varepsilon$ were arbitrary, $\omega_0(AB) = \omega_0(BA)$ for all $A, B\in\mathfrak{A}_{\mathrm{loc}}$, which is absurd.

To see that the Type I case can be ruled out, note that $\mathcal{R}_\Lambda$ is of Type I if and only if $\omega_0$ is quasi-equivalent to $\omega_{0,\Lambda}\otimes\omega_{0,\Lambda^c}$. This can be seen by adapting the proof of [29, Proposition 2.2]. Let $\hat\Lambda\subset\mathbf{B}$ be any finite set. Then one can always find a star $s$ in $\hat\Lambda^c$ such that the intersection with both $\Lambda$ and $\Lambda^c$ is not empty. But for this star $s$, one has $\omega_0(A_s) = 1$. On the other hand, $(\omega_{0,\Lambda}\otimes\omega_{0,\Lambda^c})(A_s) = 0$, essentially because $\Lambda\cap s$ is not a star any more. This implies that the states $\omega_0$ and $\omega_{0,\Lambda}\otimes\omega_{0,\Lambda^c}$ are not equal at infinity. It follows that $\omega_0$ cannot be quasi-equivalent to $\omega_{0,\Lambda}\otimes\omega_{0,\Lambda^c}$.

We single out a useful consequence of this result.

Corollary 5.1. Let $\Lambda$ be a cone. Then $\mathcal{R}_\Lambda$ contains isometries $V_1, V_2$ such that $V_i^*V_j = \delta_{i,j}I$ and $V_1V_1^* + V_2V_2^* = I$.

Proof. By [30, Proposition V.1.36], there is a projection $P$ such that $P\sim(I - P)\sim I$, where $\sim$ denotes Murray–von Neumann equivalence with respect to $\mathcal{R}_\Lambda$. Hence, there are isometries $V_1, V_2$ such that $V_1V_1^* = P$ and $V_2V_2^* = I - P$. These isometries suffice.
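The two relations in Corollary 5.1 cannot be realized by finite matrices, since an isometry on a finite-dimensional space is unitary and then $V_1V_1^* + V_2V_2^* = 2I$. A toy model on $\ell^2(\mathbb{N})$ shows how they hold in infinite dimensions. The sketch below assumes the standard choice $V_1e_n = e_{2n}$, $V_2e_n = e_{2n+1}$; these operators live in $\mathfrak{B}(\ell^2(\mathbb{N}))$, not in $\mathcal{R}_\Lambda$, and serve only to illustrate the relations. Finitely supported vectors are stored as dictionaries from index to coefficient, and all function names are hypothetical.

```python
def v(i, vec):
    """Apply V_{i+1}, i in {0, 1}: the basis vector e_n is sent to e_{2n+i}."""
    return {2 * n + i: c for n, c in vec.items()}


def v_adj(i, vec):
    """Apply the adjoint of V_{i+1}: e_m goes to e_{(m-i)/2} if m % 2 == i, else to 0."""
    return {(m - i) // 2: c for m, c in vec.items() if m % 2 == i}


def add(u, w):
    """Sum of two finitely supported vectors, dropping zero coefficients."""
    out = dict(u)
    for k, c in w.items():
        out[k] = out.get(k, 0) + c
    return {k: c for k, c in out.items() if c != 0}


if __name__ == "__main__":
    vec = {0: 1.0, 3: -2.0, 10: 0.5}
    print(v_adj(0, v(0, vec)) == vec)   # V_1^* V_1 = I
    print(v_adj(0, v(1, vec)))          # V_1^* V_2 = 0, so the empty vector
    print(add(v(0, v_adj(0, vec)), v(1, v_adj(1, vec))) == vec)  # ranges sum to I
```

The ranges of the two isometries are the even and the odd basis vectors, which is the analogue of the decomposition $I = P + (I - P)$ used in the proof.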
Although we have no proof for Haag duality for cones, we would like to point out an interesting consequence of this duality. For two cones $\Lambda_1\subset\Lambda_2$, write $\Lambda_1\ll\Lambda_2$ if any star or plaquette in $\Lambda_1\cup\Lambda_2^c$ is either contained in $\Lambda_1$ or in $\Lambda_2^c$.

Definition 5.1. We say that $\omega_0$ satisfies the distal split property for cones if for any pair of cones $\Lambda_1\ll\Lambda_2$ there is a Type I factor $\mathcal{N}$ such that $\mathcal{R}_{\Lambda_1}\subset\mathcal{N}\subset\mathcal{R}_{\Lambda_2}$.

With the assumption of Haag duality we can then prove the following theorem.

Theorem 5.2. Suppose that $\pi_0$ satisfies Haag duality for cones. Then $\omega_0$ has the distal split property for cones.
Proof. Let $\Lambda_1\ll\Lambda_2$ be two cones. Note that it is enough to prove that $\mathcal{R}_{\Lambda_1}\vee\mathcal{R}_{\Lambda_2}'\simeq\mathcal{R}_{\Lambda_1}\otimes\mathcal{R}_{\Lambda_2}'$, where $\simeq$ denotes that the natural map $A\otimes B\mapsto AB$ ($A\in\mathcal{R}_{\Lambda_1}$, $B\in\mathcal{R}_{\Lambda_2}'$) extends to a normal isomorphism. Indeed, if this is the case, the result follows from [31, Theorem 1 and Corollary 1], since $\mathcal{R}_{\Lambda_1}$ and $\mathcal{R}_{\Lambda_2}$ are factors. Note that $\omega_0(AB) = \omega_0(A)\omega_0(B)$ if $A\in\mathfrak{A}(\Lambda_1)$, $B\in\mathfrak{A}(\Lambda_2^c)$. Since $\omega_0$ is normal, this result is also valid for $A\in\mathcal{R}_{\Lambda_1}$ and $B\in\mathcal{R}_{\Lambda_2^c}$. A result of Takesaki [32] then implies that $\mathcal{R}_{\Lambda_1\cup\Lambda_2^c} = \mathcal{R}_{\Lambda_1}\vee\mathcal{R}_{\Lambda_2^c}\simeq\mathcal{R}_{\Lambda_1}\otimes\mathcal{R}_{\Lambda_2^c}$. By Haag duality, $\mathcal{R}_{\Lambda_2^c} = \mathcal{R}_{\Lambda_2}'$, which concludes the proof.

Note that without Haag duality only the existence of a Type I factor $\mathcal{R}_{\Lambda_1}\subset\mathcal{N}\subset\pi_0(\mathfrak{A}(\Lambda_2^c))'$ can be concluded. The condition that $\Lambda_1\ll\Lambda_2$ is needed precisely to avoid the situation at the end of the proof of Theorem 5.1.

6. Equivalence with $\mathrm{Rep}_f\,D(\mathbb{Z}_2)$

If $G$ is a finite group, one can form the quantum double $D(G)$ of the group. The quantum double is a quasi-triangular Hopf algebra (see e.g. [33] for an introduction). It is well-known that $\mathrm{Rep}_f\,D(G)$, the category of finite dimensional $D(G)$-modules, is a modular tensor category [34]. In this section we will introduce the category $\Delta(\Lambda)$ of stringlike localized representations and show that it is equivalent to $\mathrm{Rep}_f\,D(\mathbb{Z}_2)$ (as braided tensor $C^*$-categories). This implies that for all practical purposes, the excitations are described by the representation theory of $D(\mathbb{Z}_2)$.

Lemma 6.1. Let $\rho_1, \rho_2$ be two transportable endomorphisms of $\mathfrak{A}$, localized in a cone $\Lambda$. Then one can define a localized and transportable direct sum $\rho_1\oplus\rho_2$.

Proof. Let $V_1, V_2\in\mathcal{R}_\Lambda$ be isometries as in Corollary 5.1. Define $\rho(A) := V_1\rho_1(A)V_1^* + V_2\rho_2(A)V_2^*$ for all $A\in\mathfrak{A}$. It follows that $\rho$ is a $*$-representation$^{i}$ of $\mathfrak{A}$. Since $V_i\in\mathcal{R}_\Lambda$ and $\mathcal{R}_{\Lambda^c}\subset\mathcal{R}_\Lambda'$, it follows that $\rho(A) = A$ for $A\in\mathfrak{A}(\Lambda^c)$, hence $\rho$ is localized in $\Lambda$. To show transportability, let $\hat\Lambda$ be another cone. Pick isometries $W_1, W_2\in\mathcal{R}_{\hat\Lambda}$ as in Corollary 5.1.
Since $\rho_1$ and $\rho_2$ are transportable, there are unitary operators $U_i$ such that $U_i\rho_i(-)U_i^*$ is localized in $\hat\Lambda$. Define $W = W_1U_1V_1^* + W_2U_2V_2^*$. Then $W^*W = WW^* = I$ and $W\rho(-)W^*$ is localized in $\hat\Lambda$, hence $\rho$ is transportable. This $\rho$, which is unique up to unitary equivalence, will be denoted by $\rho_1\oplus\rho_2$.

We will now introduce the category $\Delta(\Lambda)$. For technical reasons it is convenient to consider only representations localized in a fixed cone $\Lambda$, since in that case clearly all intertwiners are in the algebra $\mathfrak{A}^{\Lambda_a}$. Proceeding in this way, there is no problem in defining the tensor product. It should be noted that the resulting category does
i Note that $\rho$ is not necessarily an endomorphism of $\mathfrak{A}$ any more, but rather of $\mathfrak{A}^{\Lambda_a}$. This is, however, only a minor technicality and is not essential for what follows.
not depend on the specific choice of cone $\Lambda$ (see [25, Proposition 2.11] for a proof and for alternative approaches).

The irreducible objects of the category $\Delta(\Lambda)$ are precisely the automorphisms localized in the cone $\Lambda$ that are given by paths extending to infinity. The morphisms are intertwiners from one endomorphism to another. By the lemma above, finite direct sums can be constructed, turning $\Delta(\Lambda)$ into a category with direct sums. In fact, by construction, each object can be decomposed into irreducibles. It is clear from the construction that the direct sums can be extended to endomorphisms of the auxiliary algebra. Hence the tensor product defined in Sec. 4 can be defined for all objects. Similarly, a braiding for direct sums can be constructed from Theorem 4.2. Conjugates for direct sums can be constructed from conjugates for the irreducible components. Summarizing, freely using terminology from [26, 35], we have the following result:

Theorem 6.1. The category $\Delta(\Lambda)$ is a braided tensor $C^*$-category.

The category obtained in this way is actually equivalent (as a braided tensor $C^*$-category) to the representation category of $D(\mathbb{Z}_2)$ over the field $k = \mathbb{C}$. For the structure of $\mathrm{Rep}_f\,D(\mathbb{Z}_2)$ as a braided tensor $C^*$-category we refer to [13]. A highbrow way of seeing this is to appeal to the classification results of modular tensor categories [36]. It is however possible to give an explicit construction of the equivalence. Note that equivalence as braided categories is in general stronger than equivalence as tensor categories. Indeed, there are non-isomorphic groups whose representation categories are equivalent as tensor categories but not as braided tensor categories [37]. On the other hand, every symmetric tensor category (satisfying certain additional properties) is the representation category of a compact group (determined up to isomorphism) [38].

Theorem 6.2. There is a braided equivalence of tensor $C^*$-categories $\Delta(\Lambda)\to\mathrm{Rep}_f\,D(\mathbb{Z}_2)$.

Proof.
Since $\mathbb{Z}_2$ is abelian, the irreducible representations of $D(\mathbb{Z}_2)$ are labeled by the elements $e, f$ of $\mathbb{Z}_2$ and $\chi_e, \chi_\sigma$ of the dual group $\hat{\mathbb{Z}}_2$ [34, 39]. Here $\chi_e$ and $\chi_\sigma$ denote the trivial and the sign character of $\mathbb{Z}_2$ respectively. Write $V_{g,\chi}$ for the irreducible $D(\mathbb{Z}_2)$-module induced by an element $g$ and character $\chi$. We obtain the following list of all irreducible modules of $D(\mathbb{Z}_2)$:
$$\Pi_0 = V_{e,\chi_e}, \qquad \Pi_X = V_{f,\chi_e}, \qquad \Pi_Y = V_{f,\chi_\sigma}, \qquad \Pi_Z = V_{e,\chi_\sigma}.$$
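The four labels can be manipulated mechanically. The sketch below encodes $V_{g,\chi}$ as a pair of bits $(g, c)$, with $g = 1$ for the nontrivial element $f$ and $c = 1$ for the sign character $\chi_\sigma$. It assumes the standard facts for the quantum double of a finite abelian group, namely that $V_{g,\chi}\otimes V_{h,\psi}\cong V_{gh,\chi\psi}$ and that, in one common convention for the universal R-matrix, the braiding on these one-dimensional modules is the scalar $\psi(g)$. The twist is taken to be the self-braiding, which is what the definition of Sec. 4 reduces to when the conjugate intertwiners are unit operators. The function names are hypothetical.

```python
# (g, c): g = 1 for the group element f, c = 1 for the sign character chi_sigma.
LABELS = {"0": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}
NAMES = {bits: name for name, bits in LABELS.items()}


def fuse(a, b):
    """Label of the tensor product: group elements and characters multiply."""
    (g, c), (h, d) = LABELS[a], LABELS[b]
    return NAMES[((g + h) % 2, (c + d) % 2)]


def braid(a, b):
    """Scalar psi(g) for a = (g, chi), b = (h, psi); a convention, see lead-in."""
    (g, _), (_, d) = LABELS[a], LABELS[b]
    return (-1) ** (g * d)


def twist(a):
    """Self-braiding, assuming unit operators for the conjugate intertwiners."""
    return braid(a, a)


if __name__ == "__main__":
    print(fuse("X", "Y"), fuse("X", "Z"))
    print(braid("X", "Z"), braid("Z", "X"))
    print({k: twist(k) for k in LABELS})
```

With this convention the pair $(X, Z)$ braids with a sign and $(Z, X)$ trivially, and only $Y$ has twist $-1$, in line with Theorem 4.2 and Corollary 4.1.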
Recall that using the coproduct of $D(\mathbb{Z}_2)$ the tensor product $\Pi_i\otimes\Pi_j$ can be made into a left $D(\mathbb{Z}_2)$-module. The tensor product has the same fusion rules as $\Delta(\Lambda)$, e.g. $\Pi_X\otimes\Pi_Y\cong\Pi_Z$ and $\Pi_k\otimes\Pi_0\cong\Pi_0\otimes\Pi_k\cong\Pi_k$. On the side of $\Delta(\Lambda)$, choose paths of type $X$, $Z$ such that the corresponding automorphisms $\rho^X, \rho^Z$ satisfy $\varepsilon_{X,Z} = -I$. Define $\rho^Y = \rho^X\otimes\rho^Z$, and $\rho^0 = \iota$, the trivial endomorphism. Note that each irreducible representation in $\Delta(\Lambda)$ is unitarily
equivalent to one of the $\rho^k$. This suggests defining a functor $F : \mathrm{Rep}_f\,D(\mathbb{Z}_2)\to\Delta(\Lambda)$ as follows: for irreducible modules, the most natural choice is to set $F(\Pi_k) = \rho^k$ for $k = 0, X, Y, Z$. The irreducible modules have dimension one, hence the $D(\mathbb{Z}_2)$-linear maps between the irreducible modules are just the scalars. In order for $F$ to be a linear functor, there is essentially only one choice of $F(T)$ for a morphism $T$. Note that $F$ is full and faithful on the Hom-sets of irreducible objects. By construction every irreducible object of $\Delta(\Lambda)$ is isomorphic to an object in the image of $F$.

In fact, $F$ is a braided monoidal functor. By our particular choice of $\rho^X$, $\rho^Y$ and $\rho^Z$, one can choose the natural transformations $F(V\otimes W)\to F(V)\otimes F(W)$, needed for the definition of a monoidal functor, to be identities. To see that $F$ is indeed a braided functor, recall that for $\pi_1, \pi_2\in\mathrm{Rep}_f\,D(\mathbb{Z}_2)$, the braiding $c_{\pi_1,\pi_2}$ is the linear map intertwining $\pi_1\otimes\pi_2$ and $\pi_2\otimes\pi_1$ defined by $c_{\pi_1,\pi_2} = \sigma\circ(\pi_1\otimes\pi_2)(R)$. Here $\sigma$ is the canonical flip and $R$ is a universal R-matrix for $D(\mathbb{Z}_2)$. It is then straightforward to verify that for irreducible modules, $F$ sends the braiding of $\mathrm{Rep}_f\,D(\mathbb{Z}_2)$ to that of $\Delta(\Lambda)$. For example, $c_{\Pi_X,\Pi_Z} = -1$ (where we omit the isomorphism of the underlying vector spaces).

The extension of the functor to direct sums is left to the reader, as is the verification that $F$ preserves all the relevant structures of a braided tensor $C^*$-category. Since the irreducible objects of both categories are in 1-1 correspondence, and the functor $F$ preserves direct sums and braidings, $F$ sets up an equivalence of braided tensor $C^*$-categories. Note, for example, that $F$ is full, faithful and essentially surjective. Indeed, it is tedious but relatively straightforward to define an inverse functor setting up the equivalence.

Acknowledgments

This research is funded by the Netherlands Organization for Scientific Research (NWO) Grant No. 613.000.608. I would like to thank M.
Fannes for a discussion on the construction of the ground state, P. Fendley for the idea that single excitations can be obtained by moving one excitation of a pair to infinity, and M. Müger and N. P. Landsman for helpful discussions and a critical reading of the manuscript. Professors D. Buchholz and K. Fredenhagen gave useful references at the 27th LQP Workshop in Leipzig, where this work was presented. An anonymous referee pointed out a gap in the proof of the spectral gap in an earlier version, as well as a suggestion on how to fix it.

References

[1] A. Kitaev, Fault-tolerant quantum computation by anyons, Ann. Phys. 303 (2003) 2–30.
[2] C. Nayak, S. H. Simon, A. Stern, M. Freedman and S. Das Sarma, Non-abelian anyons and topological quantum computation, Rev. Mod. Phys. 80 (2008) 1083–1159.
[3] Z. Wang, Topological Quantum Computation, CBMS Regional Conference Series in Mathematics, Vol. 112, Published for the Conference Board of the Mathematical Sciences (Amer. Math. Soc., 2010).
[4] R. Alicki, M. Fannes and M. Horodecki, A statistical mechanics view on Kitaev's proposal for quantum memories, J. Phys. A 40 (2007) 6451–6467.
[5] E. Dennis, A. Kitaev and J. Preskill, Topological quantum memory, J. Math. Phys. 43 (2002) 4452–4505.
[6] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. 1, Texts and Monographs in Physics, 2nd edn. (Springer-Verlag, New York, 1987).
[7] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics. 2, Texts and Monographs in Physics, 2nd edn. (Springer-Verlag, Berlin, 1997).
[8] R. Haag, Local Quantum Physics: Fields, Particles, Algebras, Texts and Monographs in Physics, 2nd edn. (Springer-Verlag, Berlin, 1996).
[9] D. Buchholz and K. Fredenhagen, Locality and the structure of particle states, Comm. Math. Phys. 84 (1982) 1–54.
[10] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics. I, Comm. Math. Phys. 23 (1971) 199–230.
[11] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics. II, Comm. Math. Phys. 35 (1974) 49–85.
[12] F. Nill and K. Szlachányi, Quantum chains of Hopf algebras with quantum double cosymmetry, Comm. Math. Phys. 187 (1997) 159–200.
[13] K. Szlachányi and P. Vecsernyés, Quantum symmetry and braid group statistics in G-spin models, Comm. Math. Phys. 156 (1993) 127–168.
[14] K. Fredenhagen and M. Marcu, Charged states in $\mathbb{Z}_2$ gauge theories, Comm. Math. Phys. 92 (1983) 81–119.
[15] F. A. Bais, P. van Driel and M. de Wild Propitius, Quantum symmetries in discrete gauge theories, Phys. Lett. B 280 (1992) 63–70.
[16] J. C. A. Barata and K. Fredenhagen, Charged particles in $\mathbb{Z}_2$ gauge theories, Comm. Math. Phys. 113 (1987) 403–417.
[17] J. C. A. Barata and F. Nill, Electrically and magnetically charged states and particles in the (2 + 1)-dimensional $\mathbb{Z}_N$-Higgs gauge model, Comm. Math. Phys. 171 (1995) 27–86.
[18] J. C. A. Barata and F. Nill, Dyonic sectors and intertwiner connections in (2 + 1)-dimensional lattice $\mathbb{Z}_N$-Higgs models, Comm. Math. Phys. 191 (1998) 409–466.
[19] B. Nachtergaele and R. Sims, Lieb–Robinson bounds and the exponential clustering theorem, Comm. Math. Phys. 265 (2006) 119–130.
[20] D. E. Evans and Y. Kawahigashi, Quantum Symmetries on Operator Algebras, Oxford Mathematical Monographs (The Clarendon Press, Oxford University Press, New York, 1998).
[21] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras. Vol. II, Graduate Studies in Mathematics, Vol. 16 (American Mathematical Society, Providence, RI, 1997).
[22] D. Buchholz and K. Fredenhagen, Locality and the structure of particle states in gauge field theories, in Mathematical Problems in Theoretical Physics, Lecture Notes in Physics, Vol. 153 (Springer, 1982), pp. 368–371.
[23] M. Keyl, T. Matsui, D. Schlingemann and R. F. Werner, Entanglement Haag-duality and type properties of infinite quantum spin chains, Rev. Math. Phys. 18 (2006) 935–970.
[24] T. Matsui, Spectral gap, and split property in quantum spin chains, J. Math. Phys. 51 (2010) 015216, 8 pp.
[25] P. Naaijkens, On the extension of stringlike localised sectors in 2+1 dimensions, Comm. Math. Phys. 303 (2011) 385–420.
[26] H. Halvorson, Algebraic quantum field theory, in Philosophy of Physics, eds. J. Butterfield and J. Earman (Elsevier, 2006), pp. 731–922.
[27] K. Fredenhagen, K.-H. Rehren and B. Schroer, Superselection sectors with braid group statistics and exchange algebras. II. Geometric aspects and conformal covariance, Rev. Math. Phys. 4 (1992) 113–157.
[28] R. Longo and J. E. Roberts, A theory of dimension, K-Theory 11 (1997) 103–159.
[29] T. Matsui, The split property and the symmetry breaking of the quantum spin chain, Comm. Math. Phys. 218 (2001) 393–416.
[30] M. Takesaki, Theory of Operator Algebras. I, Encyclopaedia of Mathematical Sciences, Vol. 124 (Springer-Verlag, Berlin, 2002).
[31] C. D'Antoni and R. Longo, Interpolation by type I factors and the flip automorphism, J. Funct. Anal. 51 (1983) 361–371.
[32] M. Takesaki, On the direct product of $W^*$-factors, Tôhoku Math. J. (2) 10 (1958) 116–119.
[33] C. Kassel, Quantum Groups, Graduate Texts in Mathematics, Vol. 155 (Springer-Verlag, New York, 1995).
[34] B. Bakalov and A. Kirillov, Jr., Lectures on Tensor Categories and Modular Functors, University Lecture Series, Vol. 21 (American Mathematical Society, Providence, RI, 2001).
[35] M. Müger, Abstract duality for symmetric tensor ∗-categories, appendix to [26].
[36] E. Rowell, R. Stong and Z. Wang, On classification of modular tensor categories, Comm. Math. Phys. 292 (2009) 343–389.
[37] P. Etingof and S. Gelaki, Isocategorical groups, Int. Math. Res. Notices 2001 (2001) 59–76.
[38] S. Doplicher and J. E. Roberts, A new duality theory for compact groups, Invent. Math. 98 (1989) 157–218.
[39] R. Dijkgraaf, V. Pasquier and P. Roche, Quasi Hopf algebras, group cohomology and orbifold models. Recent advances in field theory (Annecy-le-Vieux, 1990), Nuclear Phys. B Proc. Suppl. 18B (1991) 60–72.
May 20, J070-S0129055X11004321
2011 14:34 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 4 (2011) 375–407 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004321
EXISTENCE OF GROUND STATES OF HYDROGEN-LIKE ATOMS IN RELATIVISTIC QED I: THE SEMI-RELATIVISTIC PAULI–FIERZ OPERATOR
MARTIN KÖNENBERG∗
Fakultät für Mathematik und Informatik, FernUniversität Hagen, Lützowstraße 125, D-58084 Hagen, Germany
[email protected]
OLIVER MATTE†
Institut für Mathematik, TU Clausthal, Erzstraße 1, D-38678 Clausthal-Zellerfeld, Germany
[email protected]
EDGARDO STOCKMEYER
Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstraße 39, D-80333 München, Germany
[email protected]
Received 7 March 2011 We consider a hydrogen-like atom in a quantized electromagnetic field which is modeled by means of the semi-relativistic Pauli–Fierz operator and prove that the infimum of the spectrum of the latter operator is an eigenvalue. In particular, we verify that the bottom of its spectrum is strictly less than its ionization threshold. These results hold true, for arbitrary values of the fine-structure constant and the ultraviolet cut-off as long as the Coulomb coupling constant is less than 2/π. For Coulomb coupling constants larger than 2/π, we show that the quadratic form of the Hamiltonian is unbounded below. Keywords: Semi-relativistic Pauli–Fierz operator; quantum electrodynamics; existence of ground states. Mathematics Subject Classification 2010: 81Q10, 81V10
∗ Present address: Fakultät für Physik, Universität Wien, Boltzmanngasse 5, 1090 Vienna, Austria.
† Present address: Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstraße 39, D-80333 München, Germany.
1. Introduction
The existence of atoms described in the framework of non-relativistic quantum electrodynamics (QED) is by now a well-established fact. The general picture is roughly
that all excited bound states of an electronic Hamiltonian modeling an atom turn into resonances when the interaction with the quantized electromagnetic field is taken into account. Only at the lower end of the spectrum does there remain an eigenvalue corresponding to the ground states of the atomic system. Its analysis is particularly subtle as the whole spectrum is continuous up to its minimum in the presence of the quantized radiation field.
The existence of energy minimizing ground states for atoms and molecules in non-relativistic QED was first proven in [3, 5], for small values of the involved physical parameters. The latter are Sommerfeld's fine structure constant, $e^2$, and the ultraviolet cut-off, $\Lambda$. The existence of ground states for a molecular Pauli–Fierz Hamiltonian has been shown in [11], for all values of $e^2$ and $\Lambda$, assuming a certain binding condition which was verified later on in [16]. In the last decade a large number of further mathematical contributions to non-relativistic QED appeared. Here we only mention that ground state energies and projections have also been studied by means of infrared finite algorithms and renormalization group methods [1–6, 9].
In contrast to the situation in non-relativistic QED, only a few mathematical works deal with models where the quantized radiation field is coupled to relativistic particles. For instance, in [15, 16] the authors study a relativistic no-pair model of a molecule. They prove the stability of matter of the second kind and give an upper bound on the (positive) binding energy under certain restrictions on $e^2$, $\Lambda$, and the nuclear charges. In [19] two of the present authors consider a no-pair model of a hydrogenic atom and study the exponential localization of low-lying spectral subspaces. The same result is established in [19] also for the following operator which is investigated further in the present paper,
$$H_\gamma := \sqrt{(\sigma\cdot(-i\nabla + A))^2 + 1} - \frac{\gamma}{|x|} + H_f. \tag{1.1}$$
Here $A$ is the quantized vector potential in the Coulomb gauge, $H_f$ is the radiation field energy, $\sigma$ is a formal vector containing the Pauli spin matrices, and $\gamma = e^2 Z > 0$ is the Coulomb coupling constant, $Z > 0$ denoting the atomic number. (The square root of the fine structure constant is included in the symbol $A$, which also depends on the choice of $\Lambda$.) Previous mathematical works dealing with this operator include [20] where the fiber decomposition of $H_0$ with respect to different values of the total momentum is studied. We adopt the nomenclature of the latter paper and call $H_\gamma$ the semi-relativistic Pauli–Fierz operator. Furthermore, the operator $H_\gamma$ appears in the mathematical analysis of Rayleigh scattering [10] which is connected to the phenomenon of relaxation of an isolated atom to its ground state. (The electron spin has been neglected in [10] for notational simplicity.) An advantageous feature of semi-relativistic Hamiltonians in this situation is that the propagation speed of the electron is strictly less than the speed of light (which equals one in the units chosen in (1.1)). Moreover, it is shown in [27] that $H_\gamma$ converges in norm resolvent sense to the non-relativistic Pauli–Fierz operator when the speed of light is re-introduced in (1.1) and goes to infinity. A bound on the binding energy for $H_\gamma$ is derived in [12]. (The bound obtained in [12] improves
an earlier bound from a preprint version of the present article. Here we prove the same bound as in [12] by a variation of our original argument.) We remark that the existence of ground states in relativistic models of QED where all particles, including the electrons and positrons, are described by quantized fields is proven in [7]. To this end the authors employ infrared cut-offs in the interaction part of the Hamiltonian which will not be necessary in our analysis below.
Thanks to [19] we already know that $H_\gamma$ is semi-bounded below on some natural dense domain, for all $\gamma \in [0, 2/\pi]$, and, hence, has a physically distinguished self-adjoint realization. As already indicated above, it is also shown in [19] that its spectral subspaces corresponding to energies below the ionization threshold are exponentially localized with respect to the electron coordinates. Typically, localization estimates are important ingredients in the proofs of the existence of ground states. The bound on the binding energy derived in [12] and, by a different method, in the present paper ensures that the bottom of the spectrum of $H_\gamma$ is actually smaller than the ionization threshold. As a byproduct of our estimates we shall see that the quadratic form of $H_\gamma$ is unbounded below, if $\gamma > 2/\pi$, so that the critical Coulomb coupling constant of the square-root operator does not change when the interaction with the quantized radiation field is taken into account. The main theorem of this article asserts that the operator $H_\gamma$ has an energy minimizing ground state eigenvector, for arbitrary values of $e^2$ and $\Lambda$ and for $\gamma \in (0, 2/\pi)$. We remark that the ground state energy (in fact, every speculative eigenvalue) of $H_\gamma$ is evenly degenerate since $H_\gamma$ commutes with the time reversal operator [20].
In order to prove the existence of ground states we combine the strategies employed in [3, 5] and [11]. Roughly speaking we construct a sequence of approximating ground state eigenvectors, which are ground states of infrared cut-off Hamiltonians, along the lines of [3, 5], and apply a compactness argument similar to the one given in [11]. As in [3, 5], where the authors assumed $e^2$ or $\Lambda$ to be small, we prove the existence of ground states for the infrared cut-off Hamiltonians by means of a discretization procedure. A new observation based on the localization estimates actually permits us to carry out the discretization argument for all values of $e^2$ and $\Lambda$. Another key ingredient in the proofs are infrared estimates on the approximating ground state eigenvectors, namely, a bound on the number of soft photons [3, 11] and a photon derivative bound [11]. In order to establish these bounds for the model treated here, the formal gauge invariance of $H_\gamma$ is crucial. In fact, the no-pair models investigated in [15, 16, 19] are gauge invariant also and the present authors exploit this property to prove the existence of ground states for a no-pair model of a hydrogenic atom in the companion paper [14]. Although the general strategies to prove the existence of ground states in QED are fairly well known by now, their application to the model studied in the present paper and to no-pair models in QED is non-trivial, mainly due to the non-locality of the corresponding Hamiltonians. In fact, the electronic kinetic energy and the quantized vector potential in the Hamiltonian (1.1)
are always linked together in a non-local way which leads to a variety of new mathematical problems in each of the steps in the existence proof mentioned above. To overcome these difficulties we employ various commutator estimates involving sign functions of the Dirac operator, multiplication operators, and the radiation field energy. Some of them have already been derived in [19].
This article is organized as follows. In the next section, we introduce the semi-relativistic Pauli–Fierz operator and state our main results more precisely. As a technical prerequisite we derive some estimates involving absolute values and sign functions of the Dirac operator in Sec. 3. In Sec. 4, we prove that binding occurs in our model. Section 5 is devoted to the proof of the existence of ground states and starts with a brief outline of the strategy. Finally, in Sec. 6 we prove the infrared bounds.
2. Definition of the Model and Main Results
The semi-relativistic Pauli–Fierz operator acts in the Hilbert space $\mathcal{H}^2$, where
$$\mathcal{H}_m^\nu := L^2(\mathbb{R}^3_x, \mathbb{C}^\nu) \otimes \mathcal{F}_b[\mathcal{K}_m] \cong \int^\oplus_{\mathbb{R}^3_x} \mathbb{C}^\nu \otimes \mathcal{F}_b[\mathcal{K}_m]\, d^3x, \qquad \mathcal{H}^\nu := \mathcal{H}_0^\nu, \tag{2.1}$$
for $m \ge 0$. (In our proofs we employ spaces with $m > 0$ and it shall be convenient to choose $\nu = 4$.) Here the bosonic Fock space,
$$\mathcal{F}_b[\mathcal{K}_m] = \bigoplus_{n=0}^{\infty} \mathcal{F}_b^{(n)}[\mathcal{K}_m],$$
is modeled over the one-photon Hilbert space $\mathcal{K}_m$, $\mathcal{K} := \mathcal{K}_0$, where
$$\mathcal{K}_m := L^2(\mathcal{A}_m \times \mathbb{Z}_2, dk), \qquad \mathcal{A}_m := \{|\mathbf{k}| \ge m\}, \qquad \int dk := \sum_{\lambda \in \mathbb{Z}_2} \int_{\mathcal{A}_m} d^3\mathbf{k}.$$
The letter $k = (\mathbf{k}, \lambda)$ always denotes a tuple consisting of a photon wave vector, $\mathbf{k} \in \mathbb{R}^3$, and a polarization label, $\lambda \in \mathbb{Z}_2$. The components of $\mathbf{k}$ are written as $\mathbf{k} = (k^{(1)}, k^{(2)}, k^{(3)})$. The following subspace is dense in $\mathcal{H}_m^\nu$,
$$\mathcal{D}_m^\nu := C_0^\infty(\mathbb{R}^3_x, \mathbb{C}^\nu) \otimes \mathcal{C}_m \quad \text{(algebraic tensor product)}, \qquad \mathcal{D}^\nu := \mathcal{D}_0^\nu. \tag{2.2}$$
Here $\mathcal{C}_m \subset \mathcal{F}_b[\mathcal{K}_m]$ denotes the subspace of all elements $(\psi^{(n)})_{n=0}^\infty \in \mathcal{F}_b[\mathcal{K}_m]$ such that only finitely many components $\psi^{(n)}$ are non-zero and such that each $\psi^{(n)}$ is bounded and has compact support. The quantized vector potential is defined by means of the following coupling function with sharp ultraviolet cut-off at $\Lambda > 0$,
$$G^{\mathrm{phys}}_x(k) := e^{-i\mathbf{k}\cdot x}\, g(k), \qquad g(k) \equiv g_{e,\Lambda}(k) := -e\, \frac{\mathbf{1}_{\{|\mathbf{k}| \le \Lambda\}}}{2\pi\sqrt{|\mathbf{k}|}}\, \boldsymbol{\varepsilon}(k), \tag{2.3}$$
for every $x \in \mathbb{R}^3$ and almost every $k = (\mathbf{k}, \lambda) \in \mathbb{R}^3 \times \mathbb{Z}_2$. Here the square of the elementary charge, $e > 0$, is equal to Sommerfeld's fine-structure constant in our units where Planck's constant, the speed of light, and the electron mass are equal to one. (Energies are measured in units of the rest energy of the electron and $x$ is
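measured in units of one Compton wavelength divided by $2\pi$; we have $e^2 \approx 1/137$ in nature.) Writing
$$\mathbf{k}_\perp := (k^{(2)}, -k^{(1)}, 0), \qquad \mathbf{k} = (k^{(1)}, k^{(2)}, k^{(3)}) \in \mathbb{R}^3, \tag{2.4}$$
the polarization vectors are given by
$$\boldsymbol{\varepsilon}(\mathbf{k}, 0) = \frac{\mathbf{k}_\perp}{|\mathbf{k}_\perp|}, \qquad \boldsymbol{\varepsilon}(\mathbf{k}, 1) = \frac{\mathbf{k}}{|\mathbf{k}|} \wedge \boldsymbol{\varepsilon}(\mathbf{k}, 0), \tag{2.5}$$
for almost every $\mathbf{k} \in \mathbb{R}^3$. We let $a^\dagger(f)$ and $a(f)$ denote the standard bosonic creation and annihilation operators, for a photon state $f \in \mathcal{K}_m$. For a vector of functions $\mathbf{f} = (f^{(1)}, f^{(2)}, f^{(3)}) \in \mathcal{K}_m^3$, we set $a^\sharp(\mathbf{f}) := (a^\sharp(f^{(1)}), a^\sharp(f^{(2)}), a^\sharp(f^{(3)}))$, where $a^\sharp$ is $a$ or $a^\dagger$. Then the quantized vector potential, $A = (A^{(1)}, A^{(2)}, A^{(3)})$, is the triplet of operators given by the direct integral
$$A := \int^\oplus_{\mathbb{R}^3_x} \mathbf{1}_{\mathbb{C}^\nu} \otimes A(x)\, d^3x, \qquad A(x) := a^\dagger(G^{\mathrm{phys}}_x) + a(G^{\mathrm{phys}}_x). \tag{2.6}$$
(It will be clear from the context when $\nu = 2$ or $\nu = 4$.) We further set
$$p := -i\nabla_x, \qquad \sigma \cdot (p + A) := \sum_{j=1}^{3} \sigma_j\,(-i\partial_{x_j} + A^{(j)}),$$
where $\sigma_1, \sigma_2, \sigma_3$ are the Pauli spin matrices. An application of Nelson's commutator theorem with test operator $-\Delta + H_f + 1$ shows that $\sigma \cdot (p + A)$ is essentially self-adjoint on $\mathcal{D}^2$. We denote its closure again by the same symbol and define
$$T_A := \sqrt{(\sigma \cdot (p + A))^2 + 1}$$
by means of the spectral calculus. Now the semi-relativistic Pauli–Fierz operator is a priori given as
$$H_\gamma \varphi := \Big(T_A - \frac{\gamma}{|x|} + H_f\Big)\varphi, \qquad \varphi \in \mathcal{D}^2. \tag{2.7}$$
Here the radiation field energy, $H_f := d\Gamma(\omega)$, is given as the second quantization of the dispersion relation $\omega(k) = |\mathbf{k}|$, $k = (\mathbf{k}, \lambda) \in \mathbb{R}^3 \times \mathbb{Z}_2$. We recall that the second quantization of some real-valued Borel function, $\varpi$, on $\mathcal{A}_m \times \mathbb{Z}_2$, $m \ge 0$, is the direct sum $d\Gamma(\varpi) = \bigoplus_{n=0}^\infty d\Gamma^{(n)}(\varpi)$, where $d\Gamma^{(0)}(\varpi) := 0$ and $d\Gamma^{(n)}(\varpi)$ is the maximal operator of multiplication with the symmetric function $(k_1, \ldots, k_n) \mapsto \varpi(k_1) + \cdots + \varpi(k_n)$. In (2.7) and henceforth we identify $\frac{\gamma}{|x|} \equiv \frac{\gamma}{|x|} \otimes \mathbf{1}$, $H_f \equiv \mathbf{1} \otimes H_f$, etc. It has been shown in [19] that the quadratic form of $H_\gamma$ is bounded from below on $\mathcal{D}^2$, for $\gamma \in [0, 2/\pi]$ and all values of $e, \Lambda > 0$; compare Inequality (3.5) below.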
In particular, $H_\gamma$ has a self-adjoint Friedrichs extension which we again denote by the same symbol $H_\gamma$. In what follows we denote the ground state energy of $H_\gamma$ by
$$E_\gamma := \inf \sigma[H_\gamma], \qquad \gamma \in (0, 2/\pi],$$
and its ionization threshold by
$$\Sigma := \inf \sigma[H_0].$$
The following theorems are the main results of this paper.
Theorem 2.1 (Binding). Let $e, \Lambda > 0$ and $\gamma \in (0, 2/\pi]$. Then
$$\Sigma - E_\gamma \ge |E_\gamma^{\mathrm{el}}|, \qquad \text{where } E_\gamma^{\mathrm{el}} := \inf \sigma\big[\sqrt{1 - \Delta} - 1 - \gamma/|x|\big]. \tag{2.8}$$
Proof. This theorem is a special case of Corollary 4.1 below.
Theorem 2.2 (Critical Coupling Constant). For $e, \Lambda > 0$ and $\gamma > 2/\pi$, the quadratic form of $T_A - \gamma/|x| + H_f$ is unbounded below on $\mathcal{Q}(\sqrt{-\Delta}) \cap \mathcal{Q}(H_f)$.
Proof. This theorem follows from Corollary 4.2.
Theorem 2.3 (Existence of Ground States). Let $e, \Lambda > 0$ and $\gamma \in (0, 2/\pi)$. Then $E_\gamma$ is an evenly degenerate eigenvalue of $H_\gamma$.
Proof. It is remarked in [20, §4] that every eigenvalue of $H_\gamma$ is evenly degenerate. In fact, this follows from Kramers' degeneracy theorem since $H_\gamma$ commutes with the anti-unitary time reversal operator $\vartheta := \sigma_2 C R$, $\vartheta^2 = -\mathbf{1}$, where $C$ denotes complex conjugation and the electron parity $R$ replaces $x$ by $-x$. The fact that $E_\gamma$ is an eigenvalue is proved in Sec. 5.
Remark 2.1. (i) A binding condition for $H_\gamma$ was first proven in a preprint version of the present article. There we gave, however, an estimate on the binding energy in terms of the lowest eigenvalue of the non-relativistic operator $-\frac{1}{2}\Delta - \gamma/|x|$ which is smaller in absolute value than $E_\gamma^{\mathrm{el}}$. The improved and more natural bound (2.8) was first obtained in [12] under the assumption that $H_\gamma$ be essentially self-adjoint on some suitable dense domain. The latter property of $H_\gamma$ can be verified, at least for all $\gamma \in [0, 1/2)$ [14].
(ii) Every ground state eigenfunction of $H_\gamma$ is exponentially localized with respect to the electron coordinates in the $L^2$-sense [19]; see Proposition 5.2 where we recall the precise statement.
(iii) Theorems 2.1–2.3 actually hold true for arbitrary choices of the polarization vectors $\boldsymbol{\varepsilon}(\mathbf{k}, \lambda)$, $\lambda \in \mathbb{Z}_2$, as long as $\boldsymbol{\varepsilon}(\mathbf{k}, \lambda)$ is homogeneous of degree zero in $\mathbf{k}$ and $\{\mathbf{k}/|\mathbf{k}|, \boldsymbol{\varepsilon}(\mathbf{k}, 0), \boldsymbol{\varepsilon}(\mathbf{k}, 1)\}$ is an orthonormal basis of $\mathbb{R}^3$, for almost every $\mathbf{k}$. For in this case the special form (2.5) of the polarization vectors can always be
achieved by a suitable unitary transformation; see [26, Appendix] for details. Moreover, the sharp ultraviolet cut-off in (2.3) can be replaced by a smooth cut-off implemented by some rapidly decaying function which is symmetric about the origin and Theorems 2.1–2.3 still remain valid. This follows by inspection of the proofs below.
3. The Dirac Operator
3.1. Operators acting on four-spinors
It shall be convenient to work with a two-fold direct sum of the operator $H_\gamma$ defined in (2.7), for this permits us to exploit earlier results on sign functions of the free Dirac operator minimally coupled to the quantized radiation field and to have a familiar notation in the proofs. The full Hilbert spaces we shall work with in the rest of this paper are thus $\mathcal{H}^4$ or $\mathcal{H}_m^4$ defined in (2.1). In order to introduce the Dirac operator we first recall that the Dirac matrices $\alpha_1, \alpha_2, \alpha_3$, and $\beta = \alpha_0$ are Hermitian $(4 \times 4)$-matrices obeying the Clifford algebra relations
$$\alpha_i \alpha_j + \alpha_j \alpha_i = 2\delta_{ij}\mathbf{1}, \qquad i, j \in \{0, 1, 2, 3\}.$$
In the standard representation they are given in terms of the Pauli matrices as $\alpha_j = \sigma_1 \otimes \sigma_j$, $j \in \{1, 2, 3\}$, and $\beta = \sigma_3 \otimes \mathbf{1}$. We shall also work with generalized coupling functions in what follows, for many of the technical results stated below are applied to truncated and discretized versions of the physical form factor (2.3). Recall that $\mathcal{A}_m := \{\mathbf{k} \in \mathbb{R}^3 : |\mathbf{k}| \ge m\}$, for $m \ge 0$.
Hypothesis 3.1. Let $\varpi : \mathcal{A}_m \times \mathbb{Z}_2 \to [0, \infty)$ be a measurable function that depends on $\mathbf{k} \in \mathcal{A}_m$ only such that $0 < \varpi(k) \le |\mathbf{k}|$, for $k = (\mathbf{k}, \lambda) \in \mathcal{A}_m \times \mathbb{Z}_2$ with $\mathbf{k} \ne 0$. For almost every $k \in \mathcal{A}_m \times \mathbb{Z}_2$, let $G(k)$ be a bounded, twice continuously differentiable function, $\mathbb{R}^3_x \ni x \mapsto G_x(k) \in \mathbb{R}^3$, such that the map $(x, k) \mapsto G_x(k)$ is measurable and $G_x(-\mathbf{k}, \lambda) = G_x(\mathbf{k}, \lambda)$, for almost every $\mathbf{k}$ and all $x \in \mathbb{R}^3$, $\lambda \in \mathbb{Z}_2$. Assume there exist $d_{-1}, d_0, d_1, d_2 \in (0, \infty)$ such that
$$2\int \varpi(k)^\ell\, \|G(k)\|_\infty^2\, dk \le d_\ell^2, \qquad \ell \in \{-1, 0, 1, 2\}, \tag{3.1}$$
where $\|G(k)\|_\infty := \sup_x |G_x(k)|$, and
$$2\int \varpi(k)^{-1}\, \|\nabla_x \wedge G(k)\|_\infty^2\, dk \le d_1^2. \tag{3.2}$$
The free Dirac operator minimally coupled to $A$ (defined as in (2.6) with a general $G$ and $\nu = 4$) is now given as
$$D_A := \alpha \cdot (p + A) + \beta = \sum_{j=1}^{3} \alpha_j\,(-i\partial_{x_j} + A^{(j)}) + \beta. \tag{3.3}$$
$D_A$ is essentially self-adjoint on $\mathcal{D}_m^4$ as a straightforward application of Nelson's commutator theorem shows [16, 20]. We use the symbol $D_A$ again to denote its
closure starting from $\mathcal{D}_m^4$. Its spectrum is contained in the union of two half-lines, $\sigma(D_A) \subset (-\infty, -1] \cup [1, \infty)$. The connection between the Dirac operator and the semi-relativistic Pauli–Fierz operator is given by the identities
$$|D_A| = \begin{pmatrix} T_A & 0 \\ 0 & T_A \end{pmatrix}, \qquad T_A := \sqrt{(\sigma \cdot (p + A))^2 + 1}. \tag{3.4}$$
The assumption $G_x(-\mathbf{k}, \lambda) = G_x(\mathbf{k}, \lambda)$ has been used in [19] to show that
$$\frac{2}{\pi}\, \frac{1}{|x|} \le |D_A| + \delta\, d\Gamma(\varpi) + (\delta^{-1} + \delta k^2)\, d_1^2, \tag{3.5}$$
for some constant $k \in (0, \infty)$ and every $\delta > 0$, in the sense of quadratic forms on $\mathcal{D}_m^4$.
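The algebra behind (3.4) can be sanity-checked numerically in a toy setting. The sketch below is not part of the argument: it replaces the operator vector $p + A$ by a fixed real vector `xi` (an assumption made purely for illustration, so that all components commute). In that case $D(\xi) = \alpha \cdot \xi + \beta$ squares to $(|\xi|^2 + 1)\mathbf{1}$, so its spectrum $\pm\sqrt{|\xi|^2 + 1}$ lies in $(-\infty, -1] \cup [1, \infty)$. All function names are hypothetical.

```python
# Pauli matrices as nested lists of (complex) numbers.
S0 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Standard representation: alpha_0 = beta = sigma_3 (x) 1, alpha_j = sigma_1 (x) sigma_j.
ALPHA = [kron(S3, S0), kron(S1, S1), kron(S1, S2), kron(S1, S3)]


def clifford_defect():
    """Largest entry of alpha_i alpha_j + alpha_j alpha_i - 2 delta_ij 1 over all i, j."""
    worst = 0.0
    for i in range(4):
        for j in range(4):
            ab = matmul(ALPHA[i], ALPHA[j])
            ba = matmul(ALPHA[j], ALPHA[i])
            for r in range(4):
                for c in range(4):
                    target = 2.0 if (i == j and r == c) else 0.0
                    worst = max(worst, abs(ab[r][c] + ba[r][c] - target))
    return worst


def dirac_symbol(xi):
    """Toy Dirac matrix D(xi) = alpha . xi + beta for a fixed real vector xi."""
    beta, a1, a2, a3 = ALPHA
    return [[xi[0] * a1[r][c] + xi[1] * a2[r][c] + xi[2] * a3[r][c] + beta[r][c]
             for c in range(4)] for r in range(4)]


def square_defect(xi):
    """Largest entry of D(xi)^2 - (|xi|^2 + 1) 1."""
    d = dirac_symbol(xi)
    d2 = matmul(d, d)
    scale = sum(x * x for x in xi) + 1.0
    return max(abs(d2[r][c] - (scale if r == c else 0.0))
               for r in range(4) for c in range(4))
```

Both `clifford_defect()` and `square_defect((0.3, -1.2, 2.0))` should vanish up to rounding, which reflects $|D(\xi)| = \sqrt{|\xi|^2 + 1}\,\mathbf{1}$ in this commuting toy case. The quantized case is different because the components of $p + A$ do not commute.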
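3.2. Comparison between operators with different form factors
In the following we assume that $\tilde G^{(j)}_x(k)$, $k \in \mathcal{A}_m \times \mathbb{Z}_2$, $j \in \{1, 2, 3\}$, is another form factor fulfilling Hypothesis 3.1 with new constants $\tilde d_{-1}, \ldots, \tilde d_2$, that is,
$$2\int \varpi(k)^\ell \sup_{x \in \mathbb{R}^3} |\tilde G_x(k)|^2\, dk \le \tilde d_\ell^2 < \infty, \qquad \ell \in \{-1, 0, 1, 2\}.$$
We define $\tilde A$ as in (2.6) but with $\tilde G$ in place of $G^{\mathrm{phys}}$ and put
$$\triangle^2(a) := \int \Big(2 + \frac{4}{\varpi(k)}\Big) \sup_{x \in \mathbb{R}^3} \big\{ e^{-2a|x|}\, |G_x(k) - \tilde G_x(k)|^2 \big\}\, dk, \qquad a \ge 0. \tag{3.6}$$
Well-known estimates using $\|\alpha \cdot z\|^2 \le 2|z|^2$, $z \in \mathbb{C}^3$, show that
$$\big\| e^{-a|x|}\, \alpha \cdot (A - \tilde A)\, (d\Gamma(\varpi) + 1)^{-1/2} \big\| \le \triangle(a). \tag{3.7}$$
Next, we state some simple facts which are used repeatedly in the sequel. First, we may represent the sign function of $D_A$ in terms of its resolvent,
$$R_A(iy) := (D_A - iy)^{-1}, \qquad y \in \mathbb{R},$$
as a strongly convergent principal value [13, Lemma VI.5.6],
$$S_A \varphi := D_A |D_A|^{-1} \varphi = \lim_{\tau \to \infty} \int_{-\tau}^{\tau} R_A(iy)\varphi\, \frac{dy}{\pi}, \qquad \varphi \in \mathcal{H}_m^4. \tag{3.8}$$
Second, we recall that, for all $y \in \mathbb{R}$, $a \in [0, 1)$, and $F \in C^\infty(\mathbb{R}^3_x, \mathbb{R})$ with a fixed sign and satisfying $|\nabla F| \le a$, we have $iy \in \varrho(D_A + i\alpha \cdot \nabla F)$,
$$R_A^F(iy) := e^F R_A(iy) e^{-F} = (D_A + i\alpha \cdot \nabla F - iy)^{-1} \quad \text{on } \mathcal{D}(e^{-F}), \tag{3.9}$$
and
$$\| R_A^F(iy) \| \le J(a)\,(1 + y^2)^{-1/2}, \tag{3.10}$$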
where $J(0) := 1$ and $J(a) := \sqrt{6}/(1 - a^2)$, for $a \in (0, 1)$. The bound (3.10) is essentially well known; see, e.g., [8]. The proof given in [18] for classical vector potentials works for quantized ones as well. Finally, we abbreviate
$$\check H_f := d\Gamma(\varpi) + E \tag{3.11}$$
in what follows, for some sufficiently large $E \ge 1$. Assuming $E > (2 d_1 J(a))^2$ we constructed operators $\Xi^F(iy) \in \mathcal{L}(\mathcal{H}_m^4)$, $y \in \mathbb{R}$, in [19] satisfying
$$\check H_f^{-1/2} R_A^F(iy) = \Xi^F(iy)\, R_A^F(iy)\, \check H_f^{-1/2}, \tag{3.12}$$
$$\| \Xi^F(iy) \| \le (1 - c_{a,E})^{-1}, \qquad c_{a,E} := 2 d_1 J(a)/E^{1/2}. \tag{3.13}$$
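Lemma 3.1. Assume that $\varpi$, $G$, and $\tilde G$ fulfill Hypothesis 3.1. Let $a, \kappa \in [0, 1)$ and assume that $F \in C^\infty(\mathbb{R}^3_x, \mathbb{R})$ satisfies $|\nabla F(x)| \le a$ and $F(x) \ge a|x|$, for all $x \in \mathbb{R}^3$, and $F(x) = a|x|$, for large $|x|$. Then we find some $C(\kappa) \in (0, \infty)$, depending only on $\kappa$, such that, for all $E \ge 1$ with $E > (2 d_1 J(a))^2$,
$$\big\| |D_{\tilde A}|^\kappa\, (S_{\tilde A} - S_A)\, \check H_f^{-1/2} e^{-F} \big\| \le C(\kappa)\, \triangle(a)\, J(a)/(1 - c_{a,E}). \tag{3.14}$$
Proof. A short computation using (3.9) and (3.12) yields, for every $\varphi \in \mathcal{D}_m^4$,
$$e^{-F} \check H_f^{-1/2} \big(R_A(-iy) - R_{\tilde A}(-iy)\big)(D_{\tilde A} + iy)\varphi = \Xi^{-F}(-iy)\, R_A^{-F}(-iy)\, \check H_f^{-1/2} e^{-F}\, \alpha \cdot (\tilde A - A)\, R_{\tilde A}(-iy)(D_{\tilde A} + iy)\varphi.$$
Now, $(D_{\tilde A} + iy)\mathcal{D}_m^4$ is dense in $\mathcal{H}_m^4$ since $D_{\tilde A}$ is essentially self-adjoint on $\mathcal{D}_m^4$, and $\check H_f^{-1/2} e^{-F} \alpha \cdot (A - \tilde A)$ is bounded due to (3.7). Therefore, the previous computation implies an operator identity in $\mathcal{L}(\mathcal{H}_m^4)$ whose adjoint reads
$$\big(R_A(iy) - R_{\tilde A}(iy)\big)\, \check H_f^{-1/2} e^{-F} = R_{\tilde A}(iy)\, \Theta(y),$$
where
$$\Theta(y) := \alpha \cdot (\tilde A - A)\, e^{-F} \check H_f^{-1/2}\, R_A^F(iy)\, [\Xi^{-F}(-iy)]^*$$
satisfies $\|\Theta(y)\| \le \triangle(a) J(a) (1 + y^2)^{-1/2} \|\Xi^{-F}(iy)\|$ by (3.7) and (3.10). Combining this with (3.8) and $\| |D_{\tilde A}|^\kappa R_{\tilde A}(iy) \| \le \mathrm{const}(\kappa)(1 + y^2)^{\kappa/2 - 1/2}$ we find, for $\varphi, \psi \in \mathcal{D}_m^4$,
$$\big| \big\langle |D_{\tilde A}|^\kappa \varphi \,\big|\, (S_A - S_{\tilde A})\, \check H_f^{-1/2} e^{-F} \psi \big\rangle \big| = \Big| \int_{\mathbb{R}} \big\langle R_{\tilde A}(-iy)\, |D_{\tilde A}|^\kappa \varphi \,\big|\, \Theta(y)\psi \big\rangle\, \frac{dy}{\pi} \Big| \le \mathrm{const}(\kappa)\, \triangle(a)\, J(a)\, \sup_{y \in \mathbb{R}} \|\Xi^{-F}(iy)\| \int_{\mathbb{R}} \frac{dy}{(1 + y^2)^{1 - \kappa/2}}\, \|\varphi\|\, \|\psi\|.$$
In view of (3.13) this implies (3.14).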
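In the next lemma and the proofs of the infrared bounds we shall need the following bound which follows immediately from [19, Lemma 3.5]. (At least, for bounded $F$. Since its right-hand side depends only on $a$ it can be generalized to all $F$ satisfying the conditions given in the next lemma by some straightforward approximation argument.) We know that $S_A$ maps the domain of $\check H_f^{1/2}$ into itself and, for all $E > (2 d_1 J(a))^2$,
$$\big\| e^F \check H_f^{1/2}\, S_A\, \check H_f^{-1/2} e^{-F} \big\| \le (1 + a J(a))/(1 - c_{a,E}). \tag{3.15}$$
Lemma 3.2. Assume that $\varpi$, $G$, and $\tilde G$ fulfill Hypothesis 3.1. Let $F \in C^\infty(\mathbb{R}^3_x, \mathbb{R})$ satisfy $|\nabla F(x)| \le a < 1$ and $F(x) \ge a|x|$, for all $x \in \mathbb{R}^3$, and $F(x) = a|x|$, for large $|x|$. Then, for every $E \ge 1$ with $E > (2 d_1 J(a))^2$, there is some $C \equiv C(a, E, d_1) \in (0, \infty)$ such that, for all $\varepsilon, \tau \in (0, 1]$ and $\varphi \in \mathcal{D}_m^4$,
$$\big| \big\langle \varphi \,\big|\, (|D_A| - |D_{\tilde A}|)\varphi \big\rangle \big| \le \varepsilon\, \big\| |D_{\tilde A}|^{1/2} \varphi \big\|^2 + \tau\, \big\| \check H_f^{1/2} e^F \varphi \big\|^2 + C\, \frac{\triangle^4(a)}{\varepsilon^3 \tau^2}\, \|\varphi\|^2. \tag{3.16}$$
Proof. Since $S_A$ maps $\mathcal{D}_m^4$ into the domains of $D_0$ and $H_f^{1/2}$ (compare [19, Proof of Lemma 3.4(ii)]) we have the following identity on $\mathcal{D}_m^4$,
$$|D_A| - |D_{\tilde A}| = D_{\tilde A} S_\Delta + \alpha \cdot (A - \tilde A)\, S_A, \qquad S_\Delta := S_A - S_{\tilde A}. \tag{3.17}$$
Thanks to (3.7) and (3.15) the second term on the right-hand side can be estimated as
$$\big| \big\langle \varphi \,\big|\, \alpha \cdot (A - \tilde A) S_A \varphi \big\rangle \big| \le \|\varphi\|\, \big\| e^{-F} \alpha \cdot (A - \tilde A)\, \check H_f^{-1/2} \big\|\, \big\| e^F \check H_f^{1/2} S_A \check H_f^{-1/2} e^{-F} \big\|\, \big\| e^F \check H_f^{1/2} \varphi \big\| \le \tau\, \big\| e^F \check H_f^{1/2} \varphi \big\|^2 + \frac{\triangle^2(a)}{4\tau} \cdot \mathrm{const}(a, E, d_1)\, \|\varphi\|^2,$$
for all $\varphi \in \mathcal{D}_m^4$ and $\tau > 0$. To treat the first term on the right-hand side of (3.17) we apply Lemma 3.1 with $\kappa = 3/4$ and find some $C_* \in (0, \infty)$ such that
$$\big\| |D_{\tilde A}|^{1/2} S_\Delta \varphi \big\|^2 \le \big\| |D_{\tilde A}|^{1/4} S_\Delta \varphi \big\|\, \big\| |D_{\tilde A}|^{3/4} S_\Delta \varphi \big\| \le \big\langle S_\Delta \varphi \,\big|\, |D_{\tilde A}|^{1/2} S_\Delta \varphi \big\rangle^{1/2}\, C_* \triangle(a)\, \big\| e^F \check H_f^{1/2} \varphi \big\| \le \frac{\tau}{2}\, \big\| e^F \check H_f^{1/2} \varphi \big\|^2 + \frac{1}{2}\, \big\| |D_{\tilde A}|^{1/2} S_\Delta \varphi \big\|^2 + \frac{C_*^4 \triangle^4(a)}{8\tau^2} \cdot 2^2\, \|\varphi\|^2, \tag{3.18}$$
for all $\varphi \in \mathcal{D}_m^4$. In the last step we also used that $\|S_\Delta\| \le 2$. Solving (3.18) for $\| |D_{\tilde A}|^{1/2} S_\Delta \varphi \|^2$ and replacing $\tau$ by $4\varepsilon\tau$ we arrive at
$$\big| \big\langle \varphi \,\big|\, D_{\tilde A} S_\Delta \varphi \big\rangle \big| \le \varepsilon\, \big\| |D_{\tilde A}|^{1/2} \varphi \big\|^2 + \frac{1}{4\varepsilon}\, \big\| |D_{\tilde A}|^{1/2} S_\Delta \varphi \big\|^2 \le \varepsilon\, \big\| |D_{\tilde A}|^{1/2} \varphi \big\|^2 + \tau\, \big\| e^F \check H_f^{1/2} \varphi \big\|^2 + \frac{C_*^4 \triangle^4(a)}{64\, \varepsilon^3 \tau^2}\, \|\varphi\|^2.$$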
2011 14:34 WSPC/S0129-055X
148-RMP
Ground States in Semi-Relativistic QED
385
Corollary 3.1. Assume that $\varpi$, $G$, and $\tilde G$ fulfill Hypothesis 3.1. Then, for every $\gamma \in [0, 2/\pi)$ and $\tau \in (0, 1]$, we find $C \equiv C(\gamma, \tau, d_{-1}, d_0, d_1, \tilde d_{-1}, \tilde d_0) \in (0, \infty)$ and $c \equiv c(\gamma, \tau) > 0$ such that, in the sense of quadratic forms on $\mathcal{D}_m^4$,
$$|D_A| - \gamma/|x| + \tau\, d\Gamma(\varpi) \ge c\,\big(|D_{\tilde A}| + 1/|x| + d\Gamma(\varpi)\big) - C. \tag{3.19}$$
Proof. We choose $\varepsilon \in (0, 1)$ such that $(\gamma + \varepsilon)/(1 - \varepsilon) = 2/\pi$. Then (3.5) with $\delta = \tau/(2 - 2\varepsilon)$ implies
$$|D_A| - \frac{\gamma}{|x|} + \tau\, d\Gamma(\varpi) \ge \varepsilon\, |D_{\tilde A}| + \varepsilon\,(|D_A| - |D_{\tilde A}|) + \frac{\varepsilon}{|x|} + \frac{\tau}{2}\, d\Gamma(\varpi) - C',$$
for some $C' \equiv C'(\tau, d_1) \in (0, \infty)$. Applying (3.16) with $F = 0 = a$ and $\tau$ replaced by $\tau/4$ we obtain (3.19) with $c := \min\{\varepsilon - \varepsilon^2, \tau/4\}$.
Corollary 3.2. Assume that $\varpi$ and $G$ fulfill Hypothesis 3.1. Then we find $c \in (0, \infty)$, depending only on $d_1, d_0, d_{-1}$, such that $\inf \sigma[|D_A| + d\Gamma(\varpi)] \le c$.
Proof. By virtue of Corollary 3.1, $|D_A| + d\Gamma(\varpi) \le c\,(|D_0| + d\Gamma(\varpi))$. Picking a minimizing sequence for the quadratic form on the right-hand side we conclude that $\inf \sigma[|D_A| + d\Gamma(\varpi)] \le c$.
4. Existence of Binding
As a first step towards the proof of the existence of ground states we have to verify that the infimum of the spectrum of the semi-relativistic Pauli–Fierz operator is strictly less than its ionization threshold. This information will be exploited mathematically when we apply a bound on the spatial localization of low-lying spectral subspaces from [19]. The localization estimate in turn enters into the proof of the existence of ground states at various places, for instance, into the derivation of the infrared estimates. In our applications, it will actually be necessary to have a bound on the binding energy which is uniform in certain infrared cut-off and discretization parameters introduced later on. In fact, we shall see that in all these cases the (positive) binding energy is always bounded from below by the binding energy of the purely electronic model. When the operator without electric potential is translation invariant such a bound has been derived recently in [12], improving an earlier bound from a preprint version of the present paper. Here we give an alternative proof for the same bound as in [12] which also applies to the discretized version of the semi-relativistic Pauli–Fierz operator, which is not translation invariant anymore.
In the first two subsections below we introduce the infrared cut-off and discretized semi-relativistic Pauli–Fierz operators. After that we introduce a fiber integral representation of these operators for vanishing electric potentials. This representation is a key ingredient in our proof of the binding condition which is presented in the last of the four subsequent subsections.
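From now on we shall consider the following operator acting in $\mathcal{H}^4$,
$$H_\gamma := |D_A| - \gamma/|x| + H_f, \qquad \gamma \in [0, 2/\pi],$$
which, on account of (3.4), is just a two-fold direct sum of the semi-relativistic Pauli–Fierz operator $H_\gamma$ defined in (2.7). For then we can conveniently employ our results on the Dirac operator $D_A$. By virtue of (3.5) we may define $H_\gamma$ as a Friedrichs extension, starting from $\mathcal{D}^4$. Again using (3.5), we may also define the operators $H_{\gamma,m}$, $H^0_{\gamma,m}$, and $H^\varepsilon_{\gamma,m}$ introduced below with $\gamma \in [0, 2/\pi]$ by means of a Friedrichs extension and we shall always do so without further mention.
4.1. The infrared cut-off operator $H_{\gamma,m}$
The infrared cut-off Hamiltonians, $H_{\gamma,m}$, $m > 0$, are given by
$$A_m(x) := a^\dagger(e^{-i\mathbf{k}\cdot x} g_m) + a(e^{-i\mathbf{k}\cdot x} g_m), \qquad g_m := \mathbf{1}_{\mathcal{A}_m}\, g, \tag{4.1}$$
$$H_{\gamma,m} := |D_{A_m}| - \frac{\gamma}{|x|} + H_f. \tag{4.2}$$
Here $\mathcal{A}_m = \{|\mathbf{k}| \ge m\}$, $g$ is defined in (2.3), and $H_f = d\Gamma(\omega)$. We shall also encounter a version of $H_{\gamma,m}$ acting in the truncated Hilbert space $\mathcal{H}_m^4$: For every $m > 0$, we split the one-photon Hilbert space into two mutually orthogonal subspaces
$$\mathcal{K} = L^2(\mathbb{R}^3 \times \mathbb{Z}_2) = \mathcal{K}_m \oplus \mathcal{K}_m^\perp, \qquad \mathcal{K}_m = L^2(\mathcal{A}_m \times \mathbb{Z}_2).$$
It is well known that $\mathcal{F}_b[\mathcal{K}] = \mathcal{F}_b[\mathcal{K}_m] \otimes \mathcal{F}_b[\mathcal{K}_m^\perp]$ and we observe that all the operators $a(\mathbf{1}_{\mathcal{A}_m} e^{-i\mathbf{k}\cdot x} g^{(j)})$, $a^\dagger(\mathbf{1}_{\mathcal{A}_m} e^{-i\mathbf{k}\cdot x} g^{(j)})$, $j \in \{1, 2, 3\}$, $x \in \mathbb{R}^3$, and $H_f$ leave the Fock space factors associated to the subspaces $\mathcal{K}_m$ and $\mathcal{K}_m^\perp$ invariant. Hence, the same holds true also for $D_{A_m}$, $S_{A_m}$, and $|D_{A_m}|$. We designate operators acting in the Fock space factors $\mathcal{F}_b[\mathcal{K}_m]$ or $\mathcal{F}_b[\mathcal{K}_m^\perp]$ by a superscript $0$ or $\perp$, respectively. For instance,
$$H^0_{f,m} := d\Gamma(\omega\!\upharpoonright_{\mathcal{A}_m \times \mathbb{Z}_2}), \qquad H^\perp_{f,m} := d\Gamma(\omega\!\upharpoonright_{\{|\mathbf{k}| < m\} \times \mathbb{Z}_2}).$$
Moreover, $D_{A_m} \cong D^0_{A_m} \otimes \mathbf{1}$ and $S_{A_m} \cong S^0_{A_m} \otimes \mathbf{1}$ under the isomorphism
$$\mathcal{H}^4 \cong \mathcal{H}_m^4 \otimes \mathcal{F}_b[\mathcal{K}_m^\perp], \qquad \mathcal{H}_m^4 = L^2(\mathbb{R}^3_x, \mathbb{C}^4) \otimes \mathcal{F}_b[\mathcal{K}_m].$$
Under the same isomorphism $H_{\gamma,m}$ decomposes as
$$H_{\gamma,m} = H^0_{\gamma,m} \otimes \mathbf{1} + \mathbf{1} \otimes H^\perp_{f,m}, \qquad H^0_{\gamma,m} := |D^0_{A_m}| - \frac{\gamma}{|x|} + H^0_{f,m}. \tag{4.3}$$
We let $\Omega^\perp = (1, 0, 0, \ldots)$ denote the vacuum in $\mathcal{F}_b[\mathcal{K}_m^\perp]$. In view of $H^\perp_{f,m}\Omega^\perp = 0$ we then observe that
$$E_{\gamma,m} := \inf \sigma[H_{\gamma,m}] = \inf \sigma[H^0_{\gamma,m}], \qquad m > 0,\ \gamma \in (0, 2/\pi], \tag{4.4}$$
$$\Sigma_m := \inf \sigma[H_{0,m}] = \inf \sigma[H^0_{0,m}], \qquad m > 0, \tag{4.5}$$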
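by tensor-multiplying minimizing sequences for $H^0_{\gamma,m}$, $\gamma \in [0, 2/\pi]$, with $\Omega^\perp$. For presentational reasons we consider $H^0_{\gamma,m}$ to find a lower bound on $\Sigma_m - E_{\gamma,m}$.
4.2. The discretized operator $H^\varepsilon_{\gamma,m}$
Next, we define a discretized version of $H^0_{\gamma,m}$. To this end we decompose $\mathcal{A}_m$ into a disjoint union of cubes with side length $\varepsilon \in (0, m]$,
$$\mathcal{A}_m = \bigcup_{\nu \in (\varepsilon\mathbb{Z})^3} Q^\varepsilon_m(\nu), \qquad Q^\varepsilon_m(\nu) := \big(\nu + [-\varepsilon/2, \varepsilon/2)^3\big) \cap \mathcal{A}_m, \qquad \nu \in (\varepsilon\mathbb{Z})^3.$$
Of course, for every $\mathbf{k} \in \mathcal{A}_m$, we find a unique vector, $\tilde\nu_\varepsilon(\mathbf{k}) \in (\varepsilon\mathbb{Z})^3$, such that $\mathbf{k} \in Q^\varepsilon_m(\tilde\nu_\varepsilon(\mathbf{k}))$. To each $\nu \in (\varepsilon\mathbb{Z})^3$ with $Q^\varepsilon_m(\nu) \ne \emptyset$ we further associate some $\kappa^\varepsilon_m(\nu) \in Q^\varepsilon_m(\nu)$ such that
$$|\kappa^\varepsilon_m(\nu)| = \inf_{\mathbf{k} \in Q^\varepsilon_m(\nu)} |\mathbf{k}|.$$
In this way we obtain a map
$$\nu_\varepsilon : \mathcal{A}_m \times \mathbb{Z}_2 \to \mathbb{R}^3, \qquad k = (\mathbf{k}, \lambda) \mapsto \nu_\varepsilon(k) := \kappa^\varepsilon_m(\tilde\nu_\varepsilon(\mathbf{k})). \tag{4.6}$$
It is evident that the vectors $\kappa^\varepsilon_m(\nu)$ can be chosen such that
$$\nu_\varepsilon(-\mathbf{k}, \lambda) = -\nu_\varepsilon(\mathbf{k}, \lambda), \qquad \text{for almost every } \mathbf{k} \in \mathcal{A}_m. \tag{4.7}$$
The set of Lebesgue measure zero where the identity (4.7) might not hold is contained in the union of all planes which are perpendicular to some coordinate axis and contain points of the lattice $(\varepsilon\mathbb{Z})^3$. We define the $\varepsilon$-average of a locally integrable function, $f$, on $\mathcal{A}_m \times \mathbb{Z}_2$ by
$$f_\varepsilon(k) := \frac{1}{|Q^\varepsilon_m(\tilde\nu_\varepsilon(\mathbf{k}))|} \int_{Q^\varepsilon_m(\tilde\nu_\varepsilon(\mathbf{k}))} f(\mathbf{p}, \lambda)\, d^3\mathbf{p}, \qquad k = (\mathbf{k}, \lambda) \in \mathcal{A}_m \times \mathbb{Z}_2.$$
Alternatively, we may write, for every $f \in \mathcal{K}_m$,
$$f_\varepsilon = P_\varepsilon f := \sum_{\nu \in (\varepsilon\mathbb{Z})^3 :\ Q^\varepsilon_m(\nu) \ne \emptyset} \big\langle \tilde{\mathbf{1}}_{Q^\varepsilon_m(\nu)} \,\big|\, f \big\rangle\, \tilde{\mathbf{1}}_{Q^\varepsilon_m(\nu)}, \tag{4.8}$$
where $\tilde{\mathbf{1}}_{Q^\varepsilon_m(\nu)}$ denotes the normalized characteristic function of the set $Q^\varepsilon_m(\nu)$ so that $P_\varepsilon$ is an orthogonal projection in $\mathcal{K}_m$. The discretized vector potential is now given as
$$A^\varepsilon_m(x) := a^\dagger(e^{-i\nu_\varepsilon \cdot x} g^\varepsilon_m) + a(e^{-i\nu_\varepsilon \cdot x} g^\varepsilon_m), \qquad g^\varepsilon_m = P_\varepsilon[\mathbf{1}_{\mathcal{A}_m} g]. \tag{4.9}$$
The dispersion relation is discretized in a slightly different way, namely
$$\omega_\varepsilon(k) := \inf\{|\mathbf{p}| : \mathbf{p} \in Q^\varepsilon_m(\tilde\nu_\varepsilon(\mathbf{k}))\}, \qquad k = (\mathbf{k}, \lambda) \in \mathcal{A}_m \times \mathbb{Z}_2,$$
which entails
$$\max\{m, (1 - \sqrt{3}\,\varepsilon/m)\,\omega\} \le \omega_\varepsilon \le \omega \quad \text{on } \mathcal{A}_m \times \mathbb{Z}_2, \tag{4.10}$$
$$H^\varepsilon_{f,m} := d\Gamma(\omega_\varepsilon) \le H^0_{f,m}. \tag{4.11}$$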
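Here the operators in the last line are acting in $\mathcal{F}_b[\mathcal{K}_m]$. For every $m > 0$, we define a discretized Hamiltonian, $H^\varepsilon_{\gamma,m}$, acting in $\mathcal{H}_m^4$, by
$$H^\varepsilon_{\gamma,m} := |D_{A^\varepsilon_m}| - \gamma/|x| + H^\varepsilon_{f,m}. \tag{4.12}$$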
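The discretization of this subsection can be illustrated with a few lines of code. The sketch below is a minimal illustration under stated assumptions: `cell_index` returns the lattice point $\tilde\nu_\varepsilon(\mathbf{k}) \in (\varepsilon\mathbb{Z})^3$ whose half-open cube contains $\mathbf{k}$, and `omega_eps` evaluates $\omega_\varepsilon$ as the larger of $m$ and the distance from the origin to the closed cube, which agrees with the infimum over $Q^\varepsilon_m$ whenever that set is non-empty. All function names are hypothetical.

```python
import math


def cell_index(k, eps):
    """Lattice point nu in (eps*Z)^3 with k in nu + [-eps/2, eps/2)^3."""
    return tuple(eps * math.floor(c / eps + 0.5) for c in k)


def omega_eps(k, eps, m):
    """Discretized dispersion: infimum of |p| over the cell of k intersected with {|p| >= m}."""
    nu = cell_index(k, eps)
    # Closest point of the closed cube to the origin, found by clamping 0 componentwise.
    closest = [min(max(0.0, n - eps / 2), n + eps / 2) for n in nu]
    return max(m, math.sqrt(sum(c * c for c in closest)))


def check_bounds(k, eps, m, tol=1e-9):
    """Check max{m, (1 - sqrt(3)*eps/m)*|k|} <= omega_eps(k) <= |k| for |k| >= m, as in (4.10)."""
    omega = math.sqrt(sum(c * c for c in k))
    lower = max(m, (1.0 - math.sqrt(3.0) * eps / m) * omega)
    w = omega_eps(k, eps, m)
    return lower - tol <= w <= omega + tol
```

The lower bound in (4.10) reflects that a cube of side length $\varepsilon$ has diameter $\sqrt{3}\,\varepsilon$, so $\omega_\varepsilon \ge \omega - \sqrt{3}\,\varepsilon \ge (1 - \sqrt{3}\,\varepsilon/m)\,\omega$ whenever $\omega \ge m$.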
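4.3. Fiber decompositions of the free operators ($\gamma = 0$)
Our bound on the binding energy for $H_\gamma$ is based on a direct fiber decomposition of $\mathcal{H}^4$ with respect to fixed values of the total momentum $p \otimes \mathbf{1} + \mathbf{1} \otimes p_f$, where
$$p_f := d\Gamma(\mathbf{k}) := \big(d\Gamma(k^{(1)}), d\Gamma(k^{(2)}), d\Gamma(k^{(3)})\big), \tag{4.13}$$
is the photon momentum operator. A conjugation of the Dirac operator with the unitary operator $e^{ip_f \cdot x}$ (which is simply a multiplication with the phase $e^{i(\mathbf{k}_1 + \cdots + \mathbf{k}_n) \cdot x}$ in each Fock space sector $\mathcal{F}_b^{(n)}[\mathcal{K}]$) yields
$$e^{ip_f \cdot x}\, D_A\, e^{-ip_f \cdot x} = \alpha \cdot (p - p_f + A(0)) + \beta.$$
When we deal with $H^\varepsilon_{\gamma,m}$, $m \ge \varepsilon \ge 0$ ($H^0_{\gamma,0} := H_\gamma$), then we replace $p_f$ by
$$p^\varepsilon_f := d\Gamma(\nu_\varepsilon) \qquad \text{(all three components acting in } \mathcal{F}_b[\mathcal{K}_m]),$$
where $\nu_\varepsilon$, $\varepsilon > 0$, is defined in (4.6) and $\nu_0 := \mathbf{k}$ on $\mathcal{A}_m \times \mathbb{Z}_2$. Then it is again easy to check that
$$e^{ip^\varepsilon_f \cdot x}\, D_{A^\varepsilon_m}\, e^{-ip^\varepsilon_f \cdot x} = \alpha \cdot (p - p^\varepsilon_f + A^\varepsilon_m(0)) + \beta.$$
A further conjugation with the Fourier transform, $\mathcal{F} : L^2(\mathbb{R}^3_x) \to L^2(\mathbb{R}^3_\xi)$, with respect to the variable $x$ turns the transformed Dirac operators into
$$(\mathcal{F} \otimes \mathbf{1})\, e^{ip^\varepsilon_f \cdot x}\, D_{A^\varepsilon_m}\, e^{-ip^\varepsilon_f \cdot x}\, (\mathcal{F}^{-1} \otimes \mathbf{1}) = \int^\oplus_{\mathbb{R}^3} D^\varepsilon_m(\xi)\, d^3\xi, \qquad A^0_0 := A. \tag{4.14}$$
Here the operators
$$D^\varepsilon_m(\xi) := \alpha \cdot (\xi - p^\varepsilon_f + A^\varepsilon_m(0)) + \beta, \qquad \xi \in \mathbb{R}^3,$$
acting in $\mathbb{C}^4 \otimes \mathcal{F}_b[\mathcal{K}]$, for $m = \varepsilon = 0$, and in $\mathbb{C}^4 \otimes \mathcal{F}_b[\mathcal{K}_m]$, for $m > 0$, are fiber Hamiltonians of the transformed Dirac operator in (4.14) with respect to the isomorphisms
$$\mathcal{H}^4 \cong \int^\oplus_{\mathbb{R}^3} \mathbb{C}^4 \otimes \mathcal{F}_b[\mathcal{K}]\, d^3\xi, \qquad \mathcal{H}_m^4 \cong \int^\oplus_{\mathbb{R}^3} \mathbb{C}^4 \otimes \mathcal{F}_b[\mathcal{K}_m]\, d^3\xi, \tag{4.15}$$
respectively. Corresponding to (4.15) we then have the direct integral representation (compare, e.g., [24, Theorem XIII.85])
$$(\mathcal{F} \otimes \mathbf{1})\, e^{ip^\varepsilon_f \cdot x}\, H^\varepsilon_{0,m}\, e^{-ip^\varepsilon_f \cdot x}\, (\mathcal{F}^{-1} \otimes \mathbf{1}) = \int^\oplus_{\mathbb{R}^3} H^\varepsilon_{0,m}(\xi)\, d^3\xi, \tag{4.16}$$
for $m \ge \varepsilon \ge 0$, where
$$H^\varepsilon_{0,m}(\xi) := |D^\varepsilon_m(\xi)| + H^\varepsilon_{f,m}, \qquad H^0_{f,0} := H_f.$$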
May 20, J070-S0129055X11004321
2011 14:34 WSPC/S0129-055X
148-RMP
Ground States in Semi-Relativistic QED
389
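4.4. Proof of the binding condition
For $m\ge\varepsilon\ge0$, we set
$$E_{\gamma,m}^\varepsilon := \inf\sigma[H_{\gamma,m}^\varepsilon],\quad\gamma\in(0,2/\pi],\qquad \Sigma_m^\varepsilon := \inf\sigma[H_{0,m}^\varepsilon],\tag{4.17}$$
so that $E_{\gamma,m}^0 = E_{\gamma,m}$, $E_{\gamma,0}^0 = E_\gamma$, $\Sigma_m^0 = \Sigma_m$, $\Sigma_0^0 = \Sigma$. We fix some $\rho>0$ in what follows. In view of the fiber decomposition (4.16) we know that the Lebesgue measure of the set of all $\xi\in\mathbb{R}^3$ satisfying $\sigma[H_{0,m}^\varepsilon(\xi)]\cap(\Sigma_m^\varepsilon-\rho,\Sigma_m^\varepsilon+\rho)\neq\emptyset$ is strictly positive [24, Theorem XIII.85]. In particular, we find some $\xi_\star\in\mathbb{R}^3$ and some normalized $\varphi_\star\in Q(H_{0,m}^\varepsilon(\xi_\star))$ such that
$$\langle\varphi_\star\,|\,H_{0,m}^\varepsilon(\xi_\star)\varphi_\star\rangle_{\mathbb{C}^4\otimes\mathscr{F}_b} < \Sigma_m^\varepsilon+\rho.\tag{4.18}$$
We define the unitary transformation $U\equiv U_m^\varepsilon := e^{i(p_f^\varepsilon-\xi_\star)\cdot x}$ and observe as above that
$$U D_{A_m^\varepsilon}U^* = \alpha\cdot(p+t_\star)+\beta,\qquad\text{where}\quad t_\star := \xi_\star - p_f^\varepsilon + A_m^\varepsilon(0).$$
It suffices to prove the binding condition for the unitarily equivalent operator
$$U H_{\gamma,m}^\varepsilon U^* = \sqrt{(\alpha\cdot(p+t_\star))^2+1} - \gamma/|x| + H_{f,m}^\varepsilon.\tag{4.19}$$
Lemma 4.1. Let $\varphi_1\in H^{1/2}(\mathbb{R}^3)$ be real-valued, $\varphi_2\in Q(H_{f,m}^\varepsilon)\subset\mathbb{C}^4\otimes\mathscr{F}_b$, $\varphi:=\varphi_1\otimes\varphi_2$, and $\mu\ge0$. Then
$$\big\langle\varphi\,\big|\,\sqrt{(\alpha\cdot(p+t_\star))^2+\mu^2}\,\varphi\big\rangle \le \big\langle\varphi\,\big|\,\sqrt{p^2+(\alpha\cdot t_\star)^2+\mu^2}\,\varphi\big\rangle \le \big\langle\varphi_1\,\big|\,(\sqrt{p^2+\mu^2}-\mu)\varphi_1\big\rangle\|\varphi_2\|^2 + \|\varphi_1\|^2\big\langle\varphi_2\,\big|\,\sqrt{(\alpha\cdot t_\star)^2+\mu^2}\,\varphi_2\big\rangle.$$
Proof. For $\xi\in\mathbb{R}^3$ and $\eta>0$, we abbreviate
$$R_0(\xi,\eta) := (|\xi|^2+(\alpha\cdot t_\star)^2+\mu^2+\eta)^{-1},\qquad R_I(\xi,\eta) := ((\alpha\cdot(\xi+t_\star))^2+\mu^2+\eta)^{-1},$$
and we observe that $(\alpha\cdot(\xi+t_\star))^2 = |\xi|^2+(\alpha\cdot t_\star)^2+2\xi\cdot t_\star$ on the domain of $(\alpha\cdot t_\star)^2$. Applying the second resolvent identity twice we deduce that
$$R_0 - R_I = 2R_0\,\xi\cdot t_\star\,R_0 - \{4R_0\,\xi\cdot t_\star\,R_I\,\xi\cdot t_\star\,R_0\},$$
where we dropped the fixed arguments $(\xi,\eta)$. The last operator $\{\cdots\}$ is non-negative since $R_I\ge0$. Furthermore, since $\varphi_1$ is real-valued its Fourier transform satisfies $|\widehat{\varphi}_1(\xi)| = |\widehat{\varphi}_1(-\xi)|$. Since also $R_0(\xi,\eta) = R_0(-\xi,\eta)$, the transformation $\xi\to-\xi$ applied to the integral below reveals that
$$\big\langle\widehat{\varphi}_1\otimes\varphi_2\,\big|\,R_0(\cdot,\eta)\,\xi\cdot t_\star\,R_0(\cdot,\eta)\,\widehat{\varphi}_1\otimes\varphi_2\big\rangle = \int_{\mathbb{R}^3}\xi\cdot\big\langle R_0(\xi,\eta)\varphi_2\,\big|\,t_\star R_0(\xi,\eta)\varphi_2\big\rangle_{\mathbb{C}^4\otimes\mathscr{F}_b}\,|\widehat{\varphi}_1(\xi)|^2\,d^3\xi = 0.$$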
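We finally use the formula
$$\sqrt{T}\,\psi = \int_0^\infty\Big(1-\frac{\eta}{T+\eta}\Big)\psi\,\frac{d\eta}{\pi\sqrt{\eta}},\qquad\psi\in D(T),$$
valid for any self-adjoint operator $T\ge0$, to conclude that
$$\Big\langle\varphi\,\Big|\,\Big(\sqrt{(\alpha\cdot(p+t_\star))^2+\mu^2}-\sqrt{p^2+(\alpha\cdot t_\star)^2+\mu^2}\Big)\varphi\Big\rangle = \int_0^\infty\big\langle\widehat{\varphi}_1\otimes\varphi_2\,\big|\,(R_0(\cdot,\eta)-R_I(\cdot,\eta))\,\widehat{\varphi}_1\otimes\varphi_2\big\rangle\,\frac{\sqrt{\eta}\,d\eta}{\pi}\le0.$$
Since $p^2$ and $\alpha\cdot t_\star$ commute, the second asserted inequality follows from the elementary bound $\sqrt{\mu^2+a+b}+\mu\le\sqrt{a+\mu^2}+\sqrt{b+\mu^2}$, for $a,b,\mu\ge0$.
Corollary 4.1. Let $e,\Lambda>0$, $m\ge\varepsilon\ge0$, and $\gamma\in(0,2/\pi]$. Then
$$\Sigma_m^\varepsilon - E_{\gamma,m}^\varepsilon\ge|E_\gamma^{el}|,\qquad E_\gamma^{el} := \inf\sigma[\sqrt{1-\Delta}-1-\gamma/|x|].\tag{4.20}$$
Proof. First, assume $\gamma<2/\pi$. Since the semi-group generated by $\sqrt{1-\Delta}$ is positivity improving we then find some normalized, strictly positive eigenfunction, $\varphi_1\in H^{1/2}(\mathbb{R}^3)$, corresponding to $E_\gamma^{el}$. Then we apply Lemma 4.1 with $\mu=1$, $\varphi_2=\varphi_\star$, $\varphi=\varphi_1\otimes\varphi_2$, to get $E_{\gamma,m}^\varepsilon\le\langle\varphi\,|\,UH_{\gamma,m}^\varepsilon U^*\varphi\rangle\le E_\gamma^{el}+\Sigma_m^\varepsilon+\rho$, by the choice of $\varphi_\star$ in (4.18). Since $\rho>0$ is arbitrary, this proves the assertion if $\gamma<2/\pi$. It is, however, well known that $E_\gamma^{el}\searrow E_{2/\pi}^{el}$, as $\gamma\nearrow2/\pi$, and clear that $E_{\gamma,m}^\varepsilon\ge E_{2/\pi,m}^\varepsilon$, for $\gamma\in(0,2/\pi)$.
Corollary 4.2. Let $e,\Lambda>0$ and $\gamma>2/\pi$. Then we find normalized test functions $\Phi_n$, $n\in\mathbb{N}$, in the algebraic tensor product $H^{1/2}(\mathbb{R}^3)\otimes\mathbb{C}^4\otimes D(H_f^{1/2})$ such that
$$\langle U\Phi_n\,|\,(|D_A|-\gamma/|x|+H_f)\,U\Phi_n\rangle\to-\infty,\qquad n\to\infty.$$
Proof. We set $q_\gamma[\varphi] := \langle\varphi\,|\,(\sqrt{-\Delta}-\gamma/|x|)\varphi\rangle$, $\varphi\in H^{1/2}(\mathbb{R}^3)$, and assume that $\gamma>2/\pi$. Then it is well known that $q_\gamma$ is unbounded below. For instance, the function $\psi$ given in momentum space by $\widehat{\psi}(\xi) := |\xi|^{-2}1_{\{r\le|\xi|\le R\}}$ satisfies $q_\gamma[\psi]<0$, if $0<r<R$ and the ratio $R/r$ is sufficiently large. $\psi$ is real-valued since $\widehat{\psi}$ is real and $\widehat{\psi}(-\xi)=\widehat{\psi}(\xi)$. By scaling, the functions $\varphi_n(x) := n^{3/2}\psi(nx)$ satisfy $q_\gamma[\varphi_n] = n\,q_\gamma[\psi]$, for $n\in\mathbb{N}$. On account of Lemma 4.1 the functions $\Phi_n := \varphi_n\otimes\varphi_\star/\|\psi\|$ thus have the required properties.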
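The two scalar facts used in the proof of Lemma 4.1, the integral representation of the square root and the elementary bound $\sqrt{\mu^2+a+b}+\mu\le\sqrt{a+\mu^2}+\sqrt{b+\mu^2}$, can be checked numerically for real numbers in place of operators. This is a minimal sketch; the function names are hypothetical and the quadrature (midpoint rule after the substitution $\eta=\tan^2\theta$) is one possible choice, not part of the argument in the text.

```python
import math

def sqrt_via_resolvent(t: float, n: int = 2000) -> float:
    """Midpoint-rule value of  int_0^inf (1 - eta/(t+eta)) d eta / (pi sqrt(eta)).

    With eta = tan(theta)^2 the integral becomes
    (2/pi) * int_0^{pi/2} t / (t cos^2(theta) + sin^2(theta)) d theta,
    which has a bounded integrand for t > 0.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        total += t / (t * math.cos(theta) ** 2 + math.sin(theta) ** 2)
    return (2.0 / math.pi) * total * h

def elementary_gap(a: float, b: float, mu: float) -> float:
    """sqrt(a+mu^2) + sqrt(b+mu^2) - (sqrt(mu^2+a+b) + mu); non-negative for a, b, mu >= 0."""
    return (math.sqrt(a + mu ** 2) + math.sqrt(b + mu ** 2)
            - (math.sqrt(mu ** 2 + a + b) + mu))
```

For a scalar $t>0$ the first function should reproduce $\sqrt{t}$ up to quadrature error, which is the one-dimensional version of the formula applied to $T$ via the spectral theorem.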
5. Existence of Ground States 5.1. Outline of the proof In this section we prove our main Theorem 2.3. As in [3, 5] (see also [11] where a photon mass is introduced in a slightly different way) we first show that the infrared cut-off Hamiltonians, Hγ,m , m > 0, defined in (4.1) and (4.2) possess
ground state eigenfunctions, provided $m>0$ is sufficiently small. This is done in Sec. 5.3 by means of a discretization argument similar to the one in [5]. The implementation of the discretization procedure in [3, 5] requires a small coupling condition. By a modification of the argument we observe, however, that this is actually not necessary. Before we turn to these issues we explain in Sec. 5.2 how to infer the existence of ground states for the limit operator $H_\gamma$ from the fact that the $H_{\gamma,m}$ have ground state eigenfunctions. Here we benefit from a result from [11] saying that the spatial localization, a bound on the number of soft photons, and a photon derivative bound introduced in [11] allow one to use standard imbedding theorems for Sobolev spaces to ensure the compactness of the set of approximating ground states. The first of the latter key ingredients, the exponential localization estimate for low-lying spectral subspaces of the operators $H_{\gamma,m}$, has been proven in [19]. The proofs of the two infrared bounds are postponed to Sec. 6. We close this subsection with a general lemma which allows one to prove the existence of imbedded eigenvalues by means of approximating sequences of operators and eigenvectors. It is a modified version of a result we learned from [3] and its assertion is actually stronger than necessary for our application.
Lemma 5.1. Let $T,T_1,T_2,\ldots$ be self-adjoint operators acting in some separable Hilbert space, $\mathscr{X}$, such that $\{T_j\}_{j\in\mathbb{N}}$ converges to $T$ in the strong resolvent sense. Assume that $E_j$ is an eigenvalue of $T_j$ with corresponding eigenvector $\phi_j\in D(T_j)$. Assume further that $\{\phi_j\}_{j\in\mathbb{N}}$ converges weakly to some $0\neq\phi\in\mathscr{X}$. Then $E := \lim_{j\to\infty}E_j$ exists and is an eigenvalue of $T$. If $E_j = \inf\sigma[T_j]$, then $T$ is semibounded below and $E = \inf\sigma[T]$.
Proof. In what follows we abbreviate $f := \arctan$. Then $f(T_j)\psi\to f(T)\psi$, $j\to\infty$, for every $\psi\in\mathscr{X}$, since $T_j\to T$ in the strong resolvent sense. Let us assume for the moment that $\psi\in\mathscr{X}$ fulfills $\langle\psi\,|\,\phi\rangle\neq0$. Then we find some $j_0\in\mathbb{N}$ such that $\langle\psi\,|\,\phi_j\rangle\neq0$, for $j\ge j_0$, and we may write
$$f(E_j) = \frac{\langle f(T)\psi\,|\,\phi_j\rangle+\langle f(T_j)\psi-f(T)\psi\,|\,\phi_j\rangle}{\langle\psi\,|\,\phi_j\rangle},\qquad j\ge j_0.$$
The sequence $\{f(E_j)\}_{j\in\mathbb{N}}$ thus has a limit
$$f(E) := \lim_{j\to\infty}f(E_j) = \frac{\langle\psi\,|\,f(T)\phi\rangle}{\langle\psi\,|\,\phi\rangle}.$$
In the case $\langle\psi\,|\,\phi\rangle=0$ we may replace $\psi$ by $\widetilde{\psi}=\psi+\phi$ in the above argument to see that $\langle f(T)\psi\,|\,\phi\rangle=0$ also. The equality $\langle\psi\,|\,f(T)\phi\rangle=\langle\psi\,|\,f(E)\phi\rangle$ thus holds, for every $\psi\in\mathscr{X}$, whence $f(T)\phi=f(E)\phi$. It follows that $E := \tan(f(E))\ge\inf\sigma[T]$ is an eigenvalue of $T$ since $u(\sigma_{pp}[f(T)])\subset\sigma_{pp}[u(f(T))]=\sigma_{pp}[(u\circ f)(T)]$, for every Borel measurable function $u$. Now assume that $E_j=\inf\sigma[T_j]$. Set $\lambda:=\inf\sigma[T]$, when $T$ is semi-bounded below, and pick some $\lambda\in\sigma[T]\cap(-\infty,E-1)$, when $\inf\sigma[T]=-\infty$. Since $T_j$ converges to $T$ in the strong resolvent sense there is a sequence $\{E_j'\}_{j\in\mathbb{N}}$ with
$E_j\le E_j'\in\sigma[T_j]$ and $E_j'\to\lambda$. So $E\ge\lambda=\lim_{j\to\infty}E_j'\ge\lim_{j\to\infty}E_j=E$. If $\inf\sigma[T]=-\infty$, the first inequality is strict and we get a contradiction.
5.2. Approximation by infrared cut-off electromagnetic fields
In order to prove that $H_\gamma$ has a ground state if this holds true for $H_{\gamma,m}$ with sufficiently small $m>0$ we first show that $H_{\gamma,m}$ converges to $H_\gamma$ in norm resolvent sense. If we choose $a=0$, $\varpi=\omega$, $G=G^{phys}=e^{-i\mathbf{k}\cdot x}g$, and $\widetilde{G}=1_{A_m}G^{phys}$, $m>0$, then the parameter defined in (3.6) is equal to
$$\triangle_*^2(m) := \int_{\{|\mathbf{k}|\le m\}}\Big(2+\frac{4}{\omega(k)}\Big)|g(k)|^2\,dk\to0,\qquad m\searrow0.\tag{5.1}$$
Lemma 5.2. Let $e,\Lambda,m>0$ and $\gamma\in[0,2/\pi)$. Then $H_{\gamma,m}$ and $H_\gamma$ have the same form domain, $Q(H_{\gamma,m})=Q(H_\gamma)=Q(|D_0|)\cap Q(H_f)$, and the form norms associated to $H_{\gamma,m}$ and $H_\gamma$ are equivalent. Moreover, $H_{\gamma,m}$ converges to $H_\gamma$ in the norm resolvent sense, as $m\searrow0$.
Proof. The first assertion, which has already been observed in [27], follows from Corollary 3.1 (with $A=0$ or $\widetilde{A}=0$). Moreover, we know that $\mathscr{D}^4$ is a common form core of $H_\gamma$ and $H_{\gamma,m}$, $m>0$, and on $\mathscr{D}^4$ we have
$$H_\gamma-H_{\gamma,m} = (S_A-S_{A_m})D_A+S_{A_m}\,\alpha\cdot(A-A_m).$$
By virtue of Lemma 3.1 and (3.7) we thus find some $C\in(0,\infty)$ such that, for all $m>0$ and $\varphi\in\mathscr{D}^4$,
$$|\langle\varphi\,|\,(H_\gamma-H_{\gamma,m})\varphi\rangle|\le O(\triangle_*(m))\big(\|\check{H}_f^{1/2}\varphi\|\,\||D_A|^{1/2}\varphi\|+\|\varphi\|\,\|\check{H}_f^{1/2}\varphi\|\big)\le O(\triangle_*(m))\,\langle\varphi\,|\,(H_\gamma+C)\varphi\rangle.\tag{5.2}$$
Here $\check{H}_f=H_f+E$, for some sufficiently large $E>0$, and in the last step we used Corollary 3.1. Since we may replace $\varphi$ in (5.2) by any element of $Q(H_{\gamma,m})=Q(H_\gamma)$ the second assertion follows from [25, Theorem VIII.25].
In what follows we let $[\cdot]_-:\mathbb{R}\to(-\infty,0]$ denote the negative part
$$[t]_- := \min\{t,0\},\qquad t\in\mathbb{R}.$$
Proposition 5.1 (Ground States with Mass). Let $e,\Lambda>0$, $\gamma\in(0,2/\pi)$. Then there exists some $m_0>0$ such that, for every $m\in(0,m_0)$, the operator $[H_{\gamma,m}^0-E_{\gamma,m}-\tfrac{m}{4}]_-$ has finite rank. In particular, $E_{\gamma,m}$ is an eigenvalue of both $H_{\gamma,m}^0$ and $H_{\gamma,m}$.
Proof. We prove the assertion for $H_{\gamma,m}^0$ in Sec. 5.3. However, if $\phi_m^0$ is a ground state eigenvector of $H_{\gamma,m}^0$ then, by (4.3) and (4.4), $\phi_m^0\otimes\Omega^\perp$ is a ground state eigenvector of $H_{\gamma,m}$.
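To make the notion "$[H-E-m/4]_-$ has finite rank" concrete, the following sketch computes the negative part of a Hermitian matrix through its eigendecomposition and counts its rank. It is only a finite-dimensional illustration: the $5\times5$ matrix, its spectrum, and the function name `negative_part` are made up for the example and have no relation to the actual operator $H_{\gamma,m}^0$.

```python
import numpy as np

def negative_part(H: np.ndarray) -> np.ndarray:
    """[H]_- = min{H, 0}, defined via the spectral decomposition of a Hermitian matrix."""
    evals, evecs = np.linalg.eigh(H)
    return (evecs * np.minimum(evals, 0.0)) @ evecs.conj().T

# Toy "Hamiltonian" with prescribed spectrum, rotated into a generic basis.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
H = Q @ np.diag([-1.0, -0.25, 0.5, 2.0, 3.0]) @ Q.T

E = -1.0   # lowest eigenvalue of H by construction
m = 1.0    # plays the role of the photon mass in the shift E + m/4
N = negative_part(H - (E + m / 4) * np.eye(5))
rank = np.linalg.matrix_rank(N)  # number of eigenvalues of H below E + m/4
```

The rank of the negative part equals the number of eigenvalues (with multiplicity) strictly below the threshold $E+m/4$, which is why finite rank forces the bottom of the spectrum to consist of eigenvalues.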
To benefit from this proposition we also need the following results. The first one on the exponential localization of low-lying spectral subspaces is also applied to the ε 0 later on. (Recall our convention Hγ,0 = Hγ .) discretized operator Hγ,m Proposition 5.2 (Exponential Localization). There exist k0 , k1 ∈ (0, ∞) such that the following holds: Let e, Λ > 0, 0 ≤ ε ≤ m, γ ∈ (0, 2/π), and let I ⊂ (−∞, Σεm ) be some compact interval. Pick some a ∈ (0, 1) such that := Σεm − max I − 6a2 /(1 − a2 ) > 0 and ≤ 1. Then 2
ε ea|x|1I (Hγ,m ) ≤ (k1 /2 )(1 + |I|)(Σεm + k0 e2 Λ3 )ec(γ)a(Σm +k0 e ε
Λ3 )/
,
(5.3)
where |I| denotes the length of I and c : (0, 2/π) → (0, ∞) is some universal increasing function. In particular, we find m1 , δ1 > 0, and a1 ∈ (0, 1) such that ε sup{ea1 |x| 1J(m,ε) (Hγ,m ) : 0 ≤ ε ≤ m ≤ m1 } < ∞,
where J(m, ε) :=
ε ε , Eγ,m [Eγ,m
(5.4)
+ δ1 ]. The same estimates hold for Hγ,m , if ε = 0.
Proof. The bound (5.3) with k0 e2 Λ3 replaced by some constant times d21 is stated in [19, Theorem 2.5] for dispersion relations and form factors G fulfilling Hypothesis 3.1. In the cases = ω, G = e−ik·x g1Am , or = ωε , G = e−iν ε ·x gm,ε we can clearly choose d21 = const e2 Λ3 uniformly in 0 ≤ ε ≤ m ≤ m1 , for some m1 > 0. To prove (5.4) we pick some 0 < δ1 < |Eγel |, choose I = J(m, ε), and observe that, by Theorem 4.1, ≥ |Eγel | − δ1 − 6a21 /(1 − a21 ) ≥ const > 0, uniformly in 0 ≤ ε ≤ m ≤ m1 , provided that a1 ∈ (0, 1) is sufficiently small. Finally, we know from Corollary 3.2 that all threshold energies Σεm are bounded from above by some constant depending only on the values of d−1 , d0 , d1 . Since these values can be chosen uniformly in 0 ≤ ε ≤ m ≤ m1 this concludes the proof. We recall the notation (a(k)ψ)(n) (k1 , . . . , kn ) = (n + 1)1/2 ψ (n+1) (k, k1 , . . . , kn ),
$n\in\mathbb{N}_0$, almost everywhere, for $\psi=(\psi^{(n)})_{n=0}^\infty\in\mathscr{F}_b[\mathscr{K}]$, and $a(k)(\psi^{(0)},0,0,\ldots)=0$. The following proposition is proved in Sec. 6.
Proposition 5.3 (Soft Photon Bound). Let $e,\Lambda>0$ and $\gamma\in(0,2/\pi)$. Then there exist constants, $m_2,C\in(0,\infty)$, such that, for all $m\in[0,m_2]$ and every normalized ground state eigenfunction, $\phi_m$, of $H_{\gamma,m}$, we have
$$\|a(k)\phi_m\|^2\le1_{\{m\le|\mathbf{k}|\le\Lambda\}}\frac{C}{|\mathbf{k}|},\tag{5.5}$$
for almost every $k=(\mathbf{k},\lambda)\in\mathbb{R}^3\times\mathbb{Z}_2$.
The next proposition, which is also proved in Sec. 6, is the only place in the whole article where the special choice of the polarization vectors (2.5) enters into the analysis.
Proposition 5.4 (Photon Derivative Bound). Let $e,\Lambda>0$, $\gamma\in(0,2/\pi)$. Then there exist constants, $m_3,C\in(0,\infty)$, such that, for all $m\in[0,m_3]$ and every
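normalized ground state eigenfunction, $\phi_m$, of $H_{\gamma,m}$, we have
$$\|a(\mathbf{k},\lambda)\phi_m-a(\mathbf{p},\lambda)\phi_m\|\le C|\mathbf{k}-\mathbf{p}|\Big(\frac{1}{|\mathbf{k}|^{1/2}|\mathbf{k}_\perp|}+\frac{1}{|\mathbf{p}|^{1/2}|\mathbf{p}_\perp|}\Big),\tag{5.6}$$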
for almost every $\mathbf{k},\mathbf{p}\in\mathbb{R}^3$ with $m<|\mathbf{k}|<\Lambda$, $m<|\mathbf{p}|<\Lambda$, and $\lambda\in\mathbb{Z}_2$. (Here we use the notation introduced in (2.4).)
We have now collected all prerequisites to show that $\inf\sigma[H_\gamma]$ is an eigenvalue of $H_\gamma$.
Proof of Theorem 2.3 by means of Propositions 5.1–5.4. Let $\phi_m$ denote a normalized ground state of $H_{\gamma,m}$, for $m\in(0,m_\star]$, where $m_\star>0$ is the minimum of the constants $m_0,m_1,m_2,m_3$ appearing in Propositions 5.1–5.4. Then the family $\{\phi_m\}_{m\in(0,m_\star]}$ contains a weakly convergent sequence, $\{\phi_{m_j}\}_{j\in\mathbb{N}}$. We denote the weak limit of the latter by $\phi$ and verify that $\phi\neq0$ in the following. The assertion of Theorem 2.3 then follows from Lemmas 5.1 and 5.2. (In fact, we shall show that $\phi_{m_j}\to\phi$ strongly in $\mathscr{H}^4$ along a subsequence.) To verify that $\phi\neq0$ one can argue as in [11]. Essentially, we only have to replace the Rellich–Kondrashov theorem applied there by a suitable imbedding theorem for spaces of functions with fractional derivatives. (In the non-relativistic case the ground states $\phi_m$ possess weak derivatives with respect to the electron coordinates, whereas in our case we only have Inequality (5.11) below.) For the convenience of the reader we present the complete argument. Writing $\phi_m=(\phi_m^{(n)})_{n=0}^\infty\in\bigoplus_{n=0}^\infty L^2(\mathbb{R}^3\times\mathbb{Z}_4)\otimes\mathscr{F}_b^{(n)}[\mathscr{K}]$ we infer from the soft photon bound that
$$\sum_{n=n_0}^\infty\|\phi_m^{(n)}\|^2\le\frac{1}{n_0}\sum_{n=0}^\infty n\,\|\phi_m^{(n)}\|^2=\frac{1}{n_0}\int\|a(k)\phi_m\|^2\,dk\le\frac{C}{n_0},$$
for $m\in(0,m_\star]$ and some $m$-independent constant $C\in(0,\infty)$. Given some $\varepsilon>0$ we fix $n_0\in\mathbb{N}$ so large that
$$C/n_0<\varepsilon.\tag{5.7}$$
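The first inequality in the chain leading to (5.7) is a Markov-type tail estimate: the mass in sectors with at least $n_0$ photons is at most the expected photon number divided by $n_0$. The sketch below illustrates this with hypothetical sector weights $w_n$ standing in for $\|\phi_m^{(n)}\|^2$; the Poisson distribution is chosen purely for illustration and is not claimed to describe the actual ground state.

```python
import math

def poisson_weights(mean: float, n_max: int) -> list:
    """Illustrative sector weights w_n (stand-ins for ||phi^(n)||^2), truncated at n_max."""
    return [math.exp(-mean) * mean ** n / math.factorial(n) for n in range(n_max + 1)]

def tail_mass(weights, n0: int) -> float:
    """sum_{n >= n0} w_n: mass in sectors with at least n0 photons."""
    return sum(weights[n0:])

def mean_number(weights) -> float:
    """sum_n n * w_n: expectation of the photon number."""
    return sum(n * w for n, w in enumerate(weights))

weights = poisson_weights(mean=2.5, n_max=60)
bounds_hold = all(tail_mass(weights, n0) <= mean_number(weights) / n0
                  for n0 in range(1, 40))
```

Once the expected photon number is bounded uniformly in $m$, as the soft photon bound guarantees, the tail can be made smaller than any $\varepsilon$ by choosing $n_0$ large, which is exactly (5.7).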
By virtue of (5.4) we further find some $R>0$ such that, for all $m\in(0,m_\star]$,
$$\int_{|x|\ge R/2}\|\phi_m\|^2_{\mathbb{C}^4\otimes\mathscr{F}_b}(x)\,d^3x<\varepsilon.\tag{5.8}$$
In addition, the soft photon bound ensures that $\phi_m^{(n)}(x,\varsigma,k_1,\ldots,k_n)=0$, for almost every $(x,\varsigma,k_1,\ldots,k_n)\in\mathbb{R}^3\times\{1,2,3,4\}\times(\mathbb{R}^3\times\mathbb{Z}_2)^n$, $k_j=(\mathbf{k}_j,\lambda_j)$, such that $|\mathbf{k}_j|>\Lambda$, for some $j\in\{1,\ldots,n\}$. (Here and henceforth $\varsigma$ labels the four spinor components.) For $0<n<n_0$ and some fixed $\theta=(\varsigma,\lambda_1,\ldots,\lambda_n)\in\{1,2,3,4\}\times\mathbb{Z}_2^n$ we set
$$\phi_{m,\theta}^{(n)}(x,\mathbf{k}_1,\ldots,\mathbf{k}_n) := \phi_m^{(n)}(x,\varsigma,\mathbf{k}_1,\lambda_1,\ldots,\mathbf{k}_n,\lambda_n)$$
and similarly for $\phi$. Moreover, we set, for every $\delta\ge0$,
$$Q_{n,\delta} := \{(x,\mathbf{k}_1,\ldots,\mathbf{k}_n) : |x|<R-\delta,\ \delta<|\mathbf{k}_j|<\Lambda-\delta,\ j=1,\ldots,n\}.$$
Fixing some small $0<\delta<\min\{m_\star,R/2,\Lambda/4\}$ we pick some cut-off function $\chi\in C_0^\infty(\mathbb{R}^{3(n+1)},[0,1])$ such that $\chi\equiv1$ on $Q_{n,2\delta}$ and $\mathrm{supp}(\chi)\subset Q_{n,\delta}$ and define $\psi_{m,\theta}^{(n)} := \chi\,\phi_{m,\theta}^{(n)}$. As a next step the photon derivative bound is used to show that $\{\psi_{m,\theta}^{(n)}\}_{m\in(0,\delta]}$ is a bounded family in the anisotropic Nikol'skiĭ space${}^{a}$ $H_q^s(\mathbb{R}^{3(n+1)})$, where $s=(1/2,1/2,1/2,1,\ldots,1)$ and $q=(2,2,2,p,\ldots,p)$ with $p\in[1,2)$. In fact, employing the Hölder inequality (with respect to $d^3x\,d^3\mathbf{k}_2\cdots d^3\mathbf{k}_n$) and the photon derivative bound (5.6), we obtain as in [11], for $p\in[1,2)$ and $m\in(0,\delta]$,
$$\int_{Q_{n,\delta}\cap\{\delta<|\mathbf{k}_1+\mathbf{h}|<\Lambda\}}|\phi_{m,\theta}^{(n)}(x,\mathbf{k}_1+\mathbf{h},\mathbf{k}_2,\ldots,\mathbf{k}_n)-\phi_{m,\theta}^{(n)}(x,\mathbf{k}_1,\ldots,\mathbf{k}_n)|^p\,d^3x\,d^3\mathbf{k}_1\cdots d^3\mathbf{k}_n$$
$$\le\frac{([4\pi/3]^nR^3\Lambda^{3(n-1)})^{\frac{2-p}{2}}}{n^{p/2}}\sum_{\lambda\in\mathbb{Z}_2}\int_{m<|\mathbf{k}|<\Lambda,\ m<|\mathbf{k}+\mathbf{h}|<\Lambda}\|a(\mathbf{k}+\mathbf{h},\lambda)\phi_m-a(\mathbf{k},\lambda)\phi_m\|^p\,d^3\mathbf{k}$$
$$\le C|\mathbf{h}|^p\int_{|(u,v)|<\Lambda}\Big(\int_0^{|(u,v)|}\frac{dk^{(3)}}{|(u,v)|^{p/2}}+\int_{|(u,v)|}^{\Lambda}\frac{dk^{(3)}}{|k^{(3)}|^{p/2}}\Big)\frac{du\,dv}{|(u,v)|^p}=C'|\mathbf{h}|^p,$$
where the constants $C,C'\in(0,\infty)$ do not depend on $m\in(0,\delta]$. Since $\phi_m^{(n)}$ is symmetric in the photon variables the previous estimate implies [22, §4.8] that the weak first order partial derivatives of $\phi_{m,\theta}^{(n)}$ with respect to its last $3n$ variables exist on $Q_{n,\delta}$ and that
$$\|\phi_{m,\theta}^{(n)}\|^p_{W_p^{r}(Q_{n,\delta})}=\|\phi_{m,\theta}^{(n)}\|^p_{L^p(Q_{n,\delta})}+\sum_{j=1}^n\sum_{i=1}^3\|\partial_{k_j^{(i)}}\phi_{m,\theta}^{(n)}\|^p_{L^p(Q_{n,\delta})}\le C'',$$
for $m\in(0,\delta]$ and some $m$-independent $C''\in(0,\infty)$, with $r := (0,0,0,1,\ldots,1)$. The previous estimate implies $\|\psi_{m,\theta}^{(n)}\|_{W_p^{r}(\mathbb{R}^{3(n+1)})}\le C'''$, for some $C'''\in(0,\infty)$ which does not depend on $m\in(0,\delta]$.
${}^{a}$ For $r_1,\ldots,r_d\in[0,1]$, $q_1,\ldots,q_d\ge1$, we have $H_{q_1,\ldots,q_d}^{(r_1,\ldots,r_d)}(\mathbb{R}^d) := \bigcap_{i=1}^dH_{q_ix_i}^{r_i}(\mathbb{R}^d)$. For $r_i\in[0,1)$, a measurable function $f:\mathbb{R}^d\to\mathbb{C}$ belongs to the class $H_{q_ix_i}^{r_i}(\mathbb{R}^d)$, if $f\in L^{q_i}(\mathbb{R}^d)$ and there is some $M\in(0,\infty)$ such that
$$\|f(\cdot+he_i)-f\|_{L^{q_i}(\mathbb{R}^d)}\le M|h|^{r_i},\qquad h\in\mathbb{R},\tag{5.9}$$
where $e_i$ is the $i$th canonical unit vector in $\mathbb{R}^d$. If $r_i=1$ then (5.9) is replaced by
$$\|f(\cdot+he_i)-2f+f(\cdot-he_i)\|_{L^{q_i}(\mathbb{R}^d)}\le M|h|,\qquad h\in\mathbb{R}.\tag{5.10}$$
$H_{q_1,\ldots,q_d}^{(r_1,\ldots,r_d)}(\mathbb{R}^d)$ is a Banach space with norm
$$\|f\|_{q_1,\ldots,q_d}^{(r_1,\ldots,r_d)} := \max_{1\le i\le d}\|f\|_{L^{q_i}(\mathbb{R}^d)}+\max_{1\le i\le d}M_i,$$
where $M_i$ is the infimum of all constants $M>0$ satisfying (5.9) or (5.10), respectively. Finally, we abbreviate $H_q^{(r_1,\ldots,r_d)}(\mathbb{R}^d) := H_{q,\ldots,q}^{(r_1,\ldots,r_d)}(\mathbb{R}^d)$.
Moreover, the anisotropic Sobolev space
$W_p^{r}(\mathbb{R}^{3(n+1)})$ is continuously imbedded into $H_p^{r}(\mathbb{R}^{3(n+1)})$; see, e.g., [22, §6.2]. Furthermore, since $\mathscr{D}^4$ is a form core of $H_{\gamma,m}$, $m>0$, Corollary 3.1 shows that
$$c^{-1}\langle\phi_m^{(n)}\,|\,|D_0|\,\phi_m^{(n)}\rangle\le\langle\phi_m\,|\,H_{\gamma,m}\phi_m\rangle+c=E_{\gamma,m}+c,\qquad n\in\mathbb{N},\tag{5.11}$$
for some $m$-independent $c\in(0,\infty)$. Therefore, $\{\phi_{m,\theta}^{(n)}\}_{m\in(0,m_\star]}$ and, hence, $\{\psi_{m,\theta}^{(n)}\}_{m\in(0,m_\star]}$ are bounded families in the Bessel potential, or, Liouville space $L_2^{r'}(\mathbb{R}^{3(n+1)})$, $r' := (1/2,1/2,1/2,0,\ldots,0)$, where the fractional derivatives are defined by means of the Fourier transform. The imbedding $L_2^{r'}(\mathbb{R}^{3(n+1)})\to H_2^{r'}(\mathbb{R}^{3(n+1)})$ is continuous, too [22, §9.3]. Altogether it follows that $\{\psi_{m,\theta}^{(n)}\}_{m\in(0,\delta]}$ is a bounded family in $H_q^s(\mathbb{R}^{3(n+1)})$. Now we may apply the compactness theorem [21, Theorem 3.2]. The latter ensures that $\{\psi_{m,\theta}^{(n)}\}_{m\in(0,\delta]}$ contains a sequence which is strongly convergent in $L^2(Q_{n,2\delta})$ provided $1-3n(p^{-1}-2^{-1})>0$. Of course, we can choose $p<2$ large enough such that the latter condition is fulfilled, for all $n=1,\ldots,n_0-1$. By finitely many repeated selections of subsequences we may hence assume without loss of generality that $\{\phi_{m_j,\theta}^{(n)}\}_{j\in\mathbb{N}}$ converges strongly in $L^2(Q_{n,2\delta})$ to $\phi_\theta^{(n)}$, for $0\le n<n_0$. In particular, by the choice of $n_0$ and $R$ in (5.7) and (5.8),
$$\|\phi\|^2\ge\lim_{j\to\infty}\sum_{n=0}^{n_0-1}\sum_\theta\|\phi_{m_j,\theta}^{(n)}\|^2_{L^2(Q_{n,2\delta})}\ge\lim_{j\to\infty}\|\phi_{m_j}\|^2-2\varepsilon-c(\delta)=1-2\varepsilon-c(\delta),$$
where we use the soft photon bound to estimate
$$\sum_{n=1}^{n_0-1}\sum_\theta\big\|\phi_{m_j,\theta}^{(n)}\,1\{\exists\,i : |\mathbf{k}_i|\le2\delta\ \vee\ |\mathbf{k}_i|\ge\Lambda-2\delta\}\big\|^2\le\sum_{\lambda\in\mathbb{Z}_2}\int_{\{|\mathbf{k}|\le2\delta\}\cup\{|\mathbf{k}|\ge\Lambda-2\delta\}}\|a(\mathbf{k},\lambda)\phi_{m_j}\|^2\,d^3\mathbf{k}\le C'\Big(\int_0^{2\delta}+\int_{\Lambda-2\delta}^{\Lambda}\Big)\frac{r^2\,dr}{r}=:c(\delta)\to0,$$
as $\delta\searrow0$. Since $\delta>0$ and $\varepsilon>0$ are arbitrary we get $\|\phi\|=1$, whence $\phi_{m_j}\to\phi$ strongly in $\mathscr{H}^4$.
5.3. Existence of ground states with infrared cut-off
At the end of this subsection we prove Proposition 5.1. For this we first have to extend our results on the spatial localization of low-lying spectral subspaces. This extension requires the following inequality, where $J(a)$ is defined below (3.10).
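Lemma 5.3. Let $e,\Lambda>0$, $m\ge\varepsilon\ge0$, $\gamma\in[0,2/\pi)$, and $a\in[0,1)$. Moreover, let $F\in C^\infty(\mathbb{R}^3_x,[0,\infty))\cap L^\infty$ satisfy $|\nabla F|\le a$. Then
$$|\mathrm{Re}\,\langle\varphi\,|\,[H_{\gamma,m}^\varepsilon,e^F]e^{-F}\varphi\rangle|\le2a^2J(a)\|\varphi\|^2,\qquad\varphi\in\mathscr{D}_m^4.\tag{5.12}$$
Proof. This lemma is a special case of [19, Lemma 5.7].
Lemma 5.4. Let $e,\Lambda>0$, $\gamma\in[0,2/\pi)$, and $m_1,a_1,\delta_1,J(m,\varepsilon)$ be as in Proposition 5.2. Assume that $F\in C^\infty(\mathbb{R}^3_x,[0,\infty))$ satisfies $|\nabla F|\le a_1/2$ and $F(x)=a_1|x|$, for large $|x|$. Then there is some $C\in(0,\infty)$ such that, for all $0\le\varepsilon\le m\le m_1$, and $\psi\in\mathrm{Ran}(1_{J(m,\varepsilon)}(H_{\gamma,m}^\varepsilon))$, we have $e^F\psi\in Q(H_{\gamma,m}^\varepsilon)$ and
$$\|(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}e^F\psi\|^2\le\|e^{2F}\psi\|\,\|\mathring{H}_{\gamma,m}^\varepsilon\psi\|+2a^2J(a)\|e^F\psi\|^2,\tag{5.13}$$
where $\mathring{H}_{\gamma,m}^\varepsilon := H_{\gamma,m}^\varepsilon-E_{\gamma,m}^\varepsilon$. In particular,
$$\|(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}e^F1_{J(m,\varepsilon)}(H_{\gamma,m}^\varepsilon)\|\le C',\tag{5.14}$$
where the constant $C'\in(0,\infty)$ neither depends on $m$ nor on $\varepsilon$. Moreover, for $O\in\{|D_0|,|D_{A_m^\varepsilon}|,H_{f,m}^\varepsilon\}$ and $\psi\in\mathrm{Ran}(1_{J(m,\varepsilon)}(H_{\gamma,m}^\varepsilon))$, we have $e^F\psi\in D(O^{1/2})$ and
$$\sup\{\|O^{1/2}e^F1_{J(m,\varepsilon)}(H_{\gamma,m}^\varepsilon)\| : 0\le\varepsilon\le m\le m_1\}<\infty.\tag{5.15}$$
Proof. First, let $\varphi\in\mathscr{D}_m^4$ and let $\widetilde{F}\in C^\infty(\mathbb{R}^3_x,[0,\infty))\cap L^\infty$ such that $|\nabla\widetilde{F}|\le a_1/2$. Applying (5.12) with $\varphi$ replaced by $e^{\widetilde{F}}\varphi$ we obtain
$$\|(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}e^{\widetilde{F}}\varphi\|^2=\mathrm{Re}\big[\langle e^{2\widetilde{F}}\varphi\,|\,\mathring{H}_{\gamma,m}^\varepsilon\varphi\rangle+\langle e^{\widetilde{F}}\varphi\,|\,[H_{\gamma,m}^\varepsilon,e^{\widetilde{F}}]e^{-\widetilde{F}}(e^{\widetilde{F}}\varphi)\rangle\big]\le|\langle e^{2\widetilde{F}}\varphi\,|\,\mathring{H}_{\gamma,m}^\varepsilon\varphi\rangle|+c(a)\|e^{\widetilde{F}}\varphi\|^2,\tag{5.16}$$
where $c(a)=2a^2J(a)$. In [19, Lemma 5.8] we proved the following inequality,
$$\langle e^G\varphi\,|\,H_{\gamma,m}^\varepsilon e^G\varphi\rangle\le c_1\|e^G\|_\infty^2\langle\varphi\,|\,H_{\gamma,m}^\varepsilon\varphi\rangle+c_2\|e^G\|_\infty^2\|\varphi\|^2,\qquad\varphi\in\mathscr{D}_m^4,$$
for every $G\in C^\infty(\mathbb{R}^3,[0,\infty))\cap L^\infty$ with $\|\nabla G\|_\infty<1$. In fact, we stated this inequality only for $\gamma=0$. Since $\gamma/|x|$ is relatively form bounded with respect to $H_{0,m}^\varepsilon$ with relative bound less than one it is clear, however, that it holds true for $\gamma\in(0,2/\pi)$ also, with new constants $c_1,c_2\in(0,\infty)$ of course. In particular, if $\psi\in Q(H_{\gamma,m}^\varepsilon)$, $\varphi_n\in\mathscr{D}_m^4$, $n\in\mathbb{N}$, and $\varphi_n\to\psi$ with respect to the form norm of $H_{\gamma,m}^\varepsilon$, then $e^{\widetilde{F}}\psi,e^{2\widetilde{F}}\psi\in Q(H_{\gamma,m}^\varepsilon)$ and $e^{\widetilde{F}}\varphi_n\to e^{\widetilde{F}}\psi$ and $e^{2\widetilde{F}}\varphi_n\to e^{2\widetilde{F}}\psi$ with respect to the form norm of $H_{\gamma,m}^\varepsilon$, too. We may thus replace $\varphi$ by any $\psi\in Q(H_{\gamma,m}^\varepsilon)$ in (5.16). We fix some $\psi\in\mathrm{Ran}(1_{J(m,\varepsilon)}(H_{\gamma,m}^\varepsilon))$ in what follows. Then $\psi\in D(H_{\gamma,m}^\varepsilon)$ since $J(m,\varepsilon)$ is bounded and we arrive at
$$\|(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}e^{\widetilde{F}}\psi\|^2\le\|e^{2\widetilde{F}}\psi\|\,\|\mathring{H}_{\gamma,m}^\varepsilon\psi\|+c(a)\|e^{\widetilde{F}}\psi\|^2.$$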
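Furthermore, we know from (5.3) that $\int e^{4F(x)}\|\psi\|^2_{\mathbb{C}^4\otimes\mathscr{F}_b}(x)\,d^3x<\infty$. We pick a sequence of bounded functions $F_n\in C_b^\infty(\mathbb{R}^3,[0,\infty))$ such that $|\nabla F_n|\le a_1/2$, $n\in\mathbb{N}$, and $F_n\nearrow F$. Then $e^{F_n}\psi\to e^F\psi$ and $e^{2F_n}\psi\to e^{2F}\psi$ in $\mathscr{H}_m^4$ by dominated convergence. Inserting $F_n$ for $\widetilde{F}$ in the previous estimate we conclude that the densely defined linear functional
$$f(\eta) := \langle e^F\psi\,|\,(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}\eta\rangle=\lim_{n\to\infty}\langle(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}e^{F_n}\psi\,|\,\eta\rangle,\qquad\eta\in Q(H_{\gamma,m}^\varepsilon),$$
is bounded,
$$|f(\eta)|\le\big(\|e^{2F}\psi\|\,\|\mathring{H}_{\gamma,m}^\varepsilon\psi\|+c(a)\|e^F\psi\|^2\big)^{1/2}\|\eta\|,\qquad\eta\in Q(H_{\gamma,m}^\varepsilon).$$
Since $(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}$ is self-adjoint with domain $Q(H_{\gamma,m}^\varepsilon)$ it follows that $e^F\psi\in Q(H_{\gamma,m}^\varepsilon)$ and $\|(\mathring{H}_{\gamma,m}^\varepsilon)^{1/2}e^F\psi\|=\|f\|$. The inequality (5.14) follows from (5.13) and Proposition 5.2. The bound (5.15) follows from (5.14) and Corollary 3.1 where the constants can be chosen uniformly in $0\le\varepsilon\le m\le m_1$.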
= e−iν ε ·x gm,ε , and = ωε , the parameter Choosing G = Gphys = e−ik·x g, G defined in (3.6) equals 1 2ε := 1Am (k) 1 + sup {e−2a|x| |e−ik·x g(k) − e−iν ε (k)·x gm,ε (k)|2 }dk. ωε (k) x∈R3 It is well known and easy to see that, for fixed e, Λ, a > 0, and m ∈ (0, 1], 2ε = o(ε0 ),
ε 0,
(5.17)
(Here and henceforth the little o-symbols depend on m which does, however, not do any harm.) In the next lemma we compare the ground state energies Eγ,m = 0 ε ε ] and Eγ,m = inf σ[Hγ,m ]. inf σ[Hγ,m Lemma 5.5. Let e, Λ > 0, m ∈ (0, m1 ], and γ ∈ (0, 2/π). Then ε + o(ε0 ). Eγ,m ≤ Eγ,m 0 ε Proof. By virtue of (4.10) and Corollary 3.1 we know that Q(Hγ,m ) = Q(Hγ,m ). In particular, we may pick some ρ ∈ (0, δ1 ] and try some normalized φρε ∈ ε 0 ε ε Ran(1[Eγ,m ,Eγ,m +ρ) (Hγ,m )) as a test function for Hγ,m . Here δ1 is the parameter appearing in Proposition 5.2 and we shall also employ the parameters a1 , m1 , and the interval J(m, ε) introduced there. We obtain 0 φρε Eγ,m ≤ φρε | Hγ,m ε 0 ε ≤ Eγ,m + ρ + φρε | (|DA0m | − |DAεm |)φρε + φρε | (Hf,m − Hf,m )φρε .
1/2
Next, we choose F as in Lemma √ 5.4 and √ use (3.16) with = τ = ε 0 ε ε − Hf,m ≤ m3ε (1 − m3ε )−1 Hf,m , to get which implies Hf,m
and (4.10),
ε ρ ρ 1/2 ε 1/2 F ρ 2 Eγ,m ≤ Eγ,m + ρ + 1/2 e φε ε φε | |DAεm |φε + ε (Hf,m + E) ρ 2 0 ρ ε ρ + C3/2 ε φε + o(ε )φε | Hf,m φε , ε where E ≡ E(e, Λ) ∈ (0, ∞). By virtue of (5.15), (Hf,m + E)1/2 eF φρε ≤ C, where the constant C ∈ (0, ∞) neither depends on ρ nor 0 ≤ ε ≤ m ≤ m1 . Employing Corollary 3.1 once more we conclude that ε ε Eγ,m ≤ Eγ,m + ρ + C1/2 + o(ε0 )φρε | (Hγ,m + 1)φρε , ε
where the little o-symbol is uniform in ρ and ρ is arbitrarily small. Proof of Proposition 5.1. Let m1 , a1 , δ1 , and F be as in Proposition 5.2 and 0 χ := 1(−∞,Eγ,m +m/4] (Hγ,m ).
(5.18)
We assume that m ≤ m1 , m/4 ≤ δ1 , and ε ≤ m in the sequel so that (5.15) applies 1/2 to χ. On account of (3.16) with = τ = ε and (4.11) we have 0 ε χHγ,m χ ≥ χ{|DAεm | − γ/|x| + Hf,m }χ − o(ε0 )T1 , 0 where the norm of T1 := χ{|DA0m | + eF (Hf,m + E)eF }χ is bounded uniformly in m ∈ (0, m1 ] due to Lemma 5.4. (The constant E ≡ E(e, Λ) appears when we apply (3.16).) To proceed further we introduce the subspaces of discrete and fluctuating photon states,
Kmd := Pε Km ,
Kmf := Km Kmd ,
where Pε is defined in (4.8). The splitting Km = Kmd ⊕ Kmf gives rise to an isomorphism L2 (R3 , C4 ) ⊗ Fb (Km ) ∼ = (L2 (R3 , C4 ) ⊗ Fb [Kmd ]) ⊗ Fb [Kmf ] and we observe that the Dirac operator and the field energy decompose under the above isomorphism as DAεm ∼ ⊗ 1f , = DAε,d m
ε,d ε,f ε Hf,m = Hf,m ⊗ 1f + 1d ⊗ Hf,m .
(5.19)
Here and in the following we designate operators acting in the Fock space factors Fb [Km ], ∈ {d, f }, by the corresponding superscript ∈ {d, f }. In fact, the discretized vector potential Aεm acts on the various n-particle sectors in Fb [Km ] by tensor-multiplying or taking scalar products with elements from Kmd (apart from symmetrization and a normalization constant). Denoting the projection onto the
vacuum sector in Fb [Km ] by PΩ , writing PΩ⊥ := 1 − PΩ , ∈ {d, f }, and using ε,f PΩf = 0, we thus obtain Hf,m 0 χ{Hγ,m − Eγ,m − m/2}χ + o(ε0 )T1 ε,d ≥ χ{[|DAε,d | − γ/|x| + Hf,m − Eγ,m − m/2] ⊗ PΩf }χ m
(5.20)
ε,d ε + χ{[|DAε,d | − γ/|x| + Hf,m − Eγ,m ] ⊗ PΩ⊥f }χ m
(5.21)
ε,f ε + χ{1el ⊗ 1d ⊗ (Hf,m + Eγ,m − Eγ,m − m/2)PΩ⊥f }χ.
(5.22)
ε Here Eγ,m is defined in (4.17). Setting ε,d | − γ/|x| + Hf,m Xεd := |DAε,d m ε we observe that Xεd − Eγ,m 1d ≥ 0, so that the term in (5.21) is non-negative. In fact, let ρ > 0 and pick some φd ∈ Q(Xεd ), φd = 1, satisfying φd | Xεd φd < inf σ(Xεd ) + ρ. Then ε )φd ⊗ Ωf = φd | Xεd φd ≤ inf σ(Xεd ) + ρ φd ⊗ Ωf | (|DAεm | − γ/|x| + Hf,m ε because of (5.19). Moreover, we know from Lemma 5.5 that Eγ,m − Eγ,m ≥ o(ε0 ), ε,f ⊥ ε 0. Since Hf,m PΩf ≥ mPΩ⊥f this implies that the term in (5.22) is non-negative also, provided that ε > 0 is sufficiently small. In order to bound the remaining term in (5.20) from below we employ Corol = 0, = ωε ), (5.19), and H ε,f PΩf = 0 to get lary 3.1 (with A f,m ε,d | − γ/|x| + Hf,m ] ⊗ PΩf [|DAε,d m ε = (1 ⊗ PΩf ){|DAεm | − γ/|x| + Hf,m }(1 ⊗ PΩf ) ε,d ≥ ε[|D0 | + |x|2 + Hf,m ] ⊗ PΩf − (C(ε, γ, e, Λ) + ε|x|2 ) ⊗ PΩf ,
for all sufficiently small values of ε > 0. Since χ is exponentially localized we further know that T2 := χ{|x|2 ⊗ PΩf }χ is a bounded operator. Therefore, we arrive at 0 χ{Hγ,m − Eγ,m − m/2}χ + o(ε0 )(T1 + T2 ) ε,d − C (ε, γ, e, Λ)] ⊗ PΩf }χ ≥ χ{[ε|D0 | + ε|x|2 + εHf,m ε,d − C (ε, γ, e, Λ)]− ⊗ PΩf }χ, ≥ χ{[ε|D0 | + ε|x|2 + εHf,m
(5.23)
ε,d have where [· · · ]− ≤ 0 denotes the negative part. Now, both |D0 | + |x|2 and Hf,m purely discrete spectrum as operators on the electron and photon Hilbert spaces and ε,d is the restriction of the discretized PΩf , of course, has rank one. (Recall that Hf,m field energy to the Fock space modeled over the “2 -space” Kmd .) In particular, we observe that ε,d − := [ε|D0 | + ε|x|2 + εHf,m − C (ε, γ, e, Λ)]− ⊗ PΩf Wm,ε
is a finite rank operator, for every sufficiently small ε > 0.
We can now conclude the proof as follows: Given some sufficiently small $m>0$ we choose $\varepsilon>0$ small enough such that, in particular, the terms in (5.21) and (5.22) are non-negative and $o(\varepsilon^0)\|T_1+T_2\|\le m/8$. Since by definition (5.18) it holds $\chi\{H_{\gamma,m}^0-E_{\gamma,m}-m/2\}\chi\le-(m/4)\chi$, we see that the left hand side of (5.23) is bounded from above by $-(m/8)\chi$, whence
$$-(m/8)\,1_{(-\infty,E_{\gamma,m}+m/4]}(H_{\gamma,m}^0)\ge\chi\,W_{m,\varepsilon}^-\,\chi.$$
In particular, $1_{(-\infty,E_{\gamma,m}+m/4]}(H_{\gamma,m}^0)$ is a finite rank projection.
6. Infrared Bounds
6.1. The gauge transformed operator
In order to derive the infrared bounds (5.5) and (5.6) it is necessary to pass to a suitable gauge [5, 11]. For otherwise we would end up with a too singular infrared behavior of their right hand sides. To define an appropriate operator-valued gauge transformation [11] we recall that, for $i,j\in\{1,2,3\}$, the components $A_m^{(i)}(x)$ and $A_m^{(j)}(y)$ of the magnetic vector potential at $x,y\in\mathbb{R}^3$ commute in the sense that all their spectral projections commute; see, e.g., [23, Theorem X.43]. Therefore, it makes sense to define
$$U := \int_{\mathbb{R}^3}^{\oplus}1_{\mathbb{C}^4}\otimes U_x\,d^3x,\qquad U_x := \prod_{j=1}^3e^{ix_jA_m^{(j)}(0)},\qquad x=(x_1,x_2,x_3)\in\mathbb{R}^3.$$
Then the gauge transformed vector potential is given by
$$\widetilde{A}_m := A_m-1\otimes A_m(0)=a^\dagger(\widetilde{G}_m)+a(\widetilde{G}_m),$$
where $a^\sharp(\widetilde{G}_m)=\int_{\mathbb{R}^3}^{\oplus}1_{\mathbb{C}^4}\otimes a^\sharp(\widetilde{G}_{x,m})\,d^3x$, and
$$\widetilde{G}_{x,m}(k) := -e\,\frac{1_{\{m\le|\mathbf{k}|\le\Lambda\}}}{2\pi\sqrt{|\mathbf{k}|}}\,(e^{-i\mathbf{k}\cdot x}-1)\,\varepsilon(k)=(e^{-i\mathbf{k}\cdot x}-1)\,g_m(k),$$
for $x\in\mathbb{R}^3$ and almost every $k=(\mathbf{k},\lambda)\in\mathbb{R}^3\times\mathbb{Z}_2$. In fact, using $[U,\alpha\cdot A_m]=0$ we find $UD_{A_m}U^*=D_{\widetilde{A}_m}$, thus, $US_{A_m}U^*=S_{\widetilde{A}_m}$ and $U|D_{A_m}|U^*=|D_{\widetilde{A}_m}|$. The crucial point observed in [5] is that the transformed vector potential has a better infrared behavior than $A_m$ in view of the estimate
$$|\widetilde{G}_{x,m}(k)|\le|\mathbf{k}|\,|x|\,|g_m(k)|.\tag{6.1}$$
This avoids the appearance of infrared divergent integrals when the expectation value of the photon number operator is estimated by means of the soft photon bound. In this section we also include the case $m=0$ where $H_{\gamma,0}:=H_\gamma$, $A_0:=A$, etc. Assuming that $\phi_m$ is a ground state eigenvector of $H_{\gamma,m}$ we further set
$$\widetilde{H}_f := UH_fU^*,\qquad\widetilde{H}_{\gamma,m} := UH_{\gamma,m}U^*=|D_{\widetilde{A}_m}|-\frac{\gamma}{|x|}+\widetilde{H}_f,\qquad\widetilde{\phi}_m := U\phi_m,$$
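so that $\widetilde{H}_{\gamma,m}\widetilde{\phi}_m=E_{\gamma,m}\widetilde{\phi}_m$. We recall that, for $f\in\mathscr{K}$ with $\omega f\in\mathscr{K}$,
$$[a(f),U]=i\langle f\,|\,g_m\cdot x\rangle\,U,\qquad[\widetilde{H}_f,a(f)]=-a(\omega f)+i\langle f\,|\,\omega\,g_m\cdot x\rangle.$$
6.2. Soft photon and photon derivative bound
In the following we set
$$R(\omega_k) := (H_{\gamma,m}-E_{\gamma,m}+\omega_k)^{-1},\qquad\omega_k := \omega(k) := |\mathbf{k}|,$$
$$T(y,k) := R(\omega_k)D_{A_m}R_{A_m}(iy)=\{|D_{A_m}|^{1/2}R(\omega_k)\}^*\,S_{A_m}|D_{A_m}|^{1/2}R_{A_m}(iy),$$
where $R_{A_m}(iy)=(D_{A_m}-iy)^{-1}$, $y\in\mathbb{R}$. For subcritical $\gamma\in(0,2/\pi)$, we find
$$\|T(y,k)\|\le\mathrm{const}(\gamma,e,\Lambda)\,|\mathbf{k}|^{-1}(1+y^2)^{-1/4}.\tag{6.2}$$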
Lemma 6.1. For m ≥ 0, e, Λ > 0, γ ∈ (0, 2/π), and almost every k ∈ R3 × Z2 , x,m (k)SA φm − i(1 − ωk R(ωk ))gm (k) · xφm a(k)φm = −R(ωk )α · G m ∞ x,m (k)RAm (iy)φm dy . (6.3) + T (y, k)α · G π 0 γ,m − Eγ,m + ν)−1 , for every ν > 0, and fix some Proof. We set R(ν) := (H 3 p = (p, µ) ∈ R × Z2 with p = 0. Let η ∈ U D 4 and f ∈ C0∞ ((R3 \{0}) × Z2 ). By a virial type argument we get γ,m − Eγ,m + ωp )η | a(f )φm (H † = [a† (f ), |DA e m |]η | φm + [a (f ), Hf ]η | φm + ωp η | a(f )φm 1/2 1/2 η | |DA [SA = η | [DA e m , a(f )]SA e m φm + SA e m |DA em| em| e m , a(f )]φm
+ iη | f | ωgm K · xφm + η | a((ωp − ω)f )φm .
(6.4)
Since |DA e m | ≤ Hγ,m +C and a(f )φm ∈ Q(Hγ,m ) by Lemma 6.2 below, the previous γ,m ). In particular, we may choose η = η (p) := identity extends to every η ∈ Q(H p )η, for some η ∈ H 4 . Next, we pick some h ∈ C ∞ (R3 , [0, ∞)) with supp h ⊂ R(ω 0 B1 (0) and R3 h(k)d3 k = 1, set h := −3 h(·/), and specify f := fp, , where fp, (k) := h (k − p)δµλ , for k = (k, λ) ∈ R3 × Z2 and > 0. Then (6.4) yields an identity between two locally bounded functions of p ∈ (R3 \{0}) × Z2 . After multiplying both sides of (6.4) with g ∈ C0∞ ((R3 \{0}) × Z2 , C) and integrating with respect to p we obtain (6.5) g(p)η | a(fp, )φm dp = C1 () + C2 () + C3 () + C4 (),
C1 () := − C2 () := C3 () := i C4 () :=
p )η | α · fp, | G x,m S e φm dp, g(p)R(ω Am
1/2 1/2 R(ωp )η | |DA g(p)SA [SA e m |DA em| em| e m , a(fp, )]φm dp,
p )η | fp, | ωgm K · xφm dp, g(p)R(ω p )η | a((ωp − ω)fp, )φm dp. g(p)R(ω
Because of h ∗g → g in L2 and Fubini’s theorem the left-hand side of (6.5) tends to η | a(g)φm = U ∗ η | a(g)φm + iU ∗ η | g | gm · xφm , p )η | xφm , so that s3 is continuous on as 0. Next, we abbreviate s3 (p) := iR(ω 3 2 R \{0}. Then h ∗ (g s3 ) → g s3 in L and we further get C3 () = fp, | ω gm K · s3 (p)g(p)dp = h ∗ (g s3 ) | ω gm → g | ω s3 · gm , as 0. The term C1 () is discussed similarly. Furthermore, p )2 η2 dp + (4)−1 C5 (), |C4 ()| ≤ |g|(p)R(ω where, since |ωp − ω| ≤ on supp(fp, ), C5 () := |g|(p)a((ωp − ω)fp, )φm 2 dp 2 dp (ω = |g|(p) − ω )f (k)a(k) φ dk p k p, m fp, (k ) dk ≤ 2 |g|(p) fp, (k)ω(k)a(k)φm 2 dkdp. ω(k ) Here the integral in the curly brackets {· · ·} is bounded by some K ∈ (0, ∞) uniformly in p as long as ≤ dist(0, supp(g))/2, whence 2 C5 () ≤ K (|g| ∗ h )(k)ω(k)a(k)φm 2 dk. 1/2 Since |g| ∗ h → |g| in L∞ and φm ∈ D(Hf ) we conclude that C4 () → 0. (iy), a(f )] = R (iy)α · fp, (k) Finally, we treat C2 (). Using (3.8) and [RA p, em em A Gx,m (k)dkR e (iy) we obtain by means of Fubini’s theorem Am
C2 () =
λ∈Z2
R3
R3
g(p, λ)h (k − p)gm (k, λ)s2 (k, p)d3 k d3 p.
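Here we abbreviate
$$s_2(\mathbf{k}, p) := \int_{\mathbb{R}}\langle\tilde T(y,p)\eta\,|\,(e^{-i\mathbf{k}\cdot x} - 1)\,\alpha\,R_{\tilde A_m}(iy)\tilde\phi_m\rangle\,\frac{dy}{\pi},$$
where the norm of $\tilde T(y,p) := \{|D_{\tilde A_m}|^{1/2}R_{\tilde A_m}(-iy)\}\,S_{\tilde A_m}|D_{\tilde A_m}|^{1/2}\tilde R(\omega_p)$ obeys the same bound as the one of $T(y,p)$ in (6.2) and $\|R_{\tilde A_m}(iy)\| \le (1+y^2)^{-1/2}$. Since $s_2$ is continuous on $(\mathbb{R}^3\setminus\{0\}) \times \mathbb{R}^3$ it is easy to see that $C_2(\varepsilon)$ converges to the second term on the right-hand side of
$$\int\bar g(k)\,\langle U^*\eta\,|\,a(k)\phi_m\rangle\,dk + i\int\bar g(k)\,\langle U^*\eta\,|\,g_m(k)\cdot x\,\phi_m\rangle\,dk$$
$$= -\int\bar g(k)\,\langle U^*\eta\,|\,R(\omega_k)\,\alpha\cdot\tilde G_{x,m}(k)\,S_{A_m}\phi_m\rangle\,dk + \int\bar g(k)\int_0^\infty\langle U^*\eta\,|\,T(y,k)\,\alpha\cdot\tilde G_{x,m}(k)\,R_{A_m}(iy)\phi_m\rangle\,\frac{dy}{\pi}\,dk$$
$$\quad + i\int\bar g(k)\,\langle U^*\eta\,|\,\omega_k R(\omega_k)\,g_m(k)\cdot x\,\phi_m\rangle\,dk.$$
Next, we let $g$ run over $C_0^\infty((\mathbb{R}^3\setminus\{0\}) \times \mathbb{Z}_2, \mathbb{C})$ and obtain an identity for all $k$ outside some zero set which depends on $\eta$. Choosing $\eta$ from a countable dense subset of $\mathscr{H}_4$ we readily arrive at the assertion.

Proof of Proposition 5.3. The soft photon bound (5.5) follows from (6.3) in combination with (6.1), (6.2), $\|R(\omega_k)\| \le |\mathbf{k}|^{-1}$, $\||x|\phi_m\| \le C$, $\||x|S_{A_m}\phi_m\| \le C$ (by (3.15) and Proposition 5.2), and $\||x|R_{A_m}(iy)\phi_m\| \le C(1+y^2)^{-1/2}$ (by (3.10)), uniformly in $m \ge 0$.

Proof of Proposition 5.4. Again, we use (6.3) to represent $a(k)\phi_m - a(p)\phi_m$ and apply the bounds recalled in the proof of Proposition 5.3. Then it suffices to observe that, by the resolvent identity, $\|R(\omega_k) - R(\omega_p)\| \le |\mathbf{k} - \mathbf{p}|\,|\mathbf{k}|^{-1}|\mathbf{p}|^{-1}$ and to recall the following bound from [11]: For $m < |\mathbf{k}|, |\mathbf{p}| < \Lambda$ and $\lambda \in \mathbb{Z}_2$,
$$\frac{1}{|\mathbf{k}|}\big\{|\omega_k g_m(\mathbf{k},\lambda) - \omega_p g_m(\mathbf{p},\lambda)| + |\tilde G_{x,m}(\mathbf{k},\lambda) - \tilde G_{x,m}(\mathbf{p},\lambda)|\big\} + \frac{|\mathbf{k} - \mathbf{p}|}{|\mathbf{k}||\mathbf{p}|}\big\{|\omega_p g_m(\mathbf{p},\lambda)| + |\tilde G_{x,m}(\mathbf{p},\lambda)|\big\}$$
$$\le C(1+|x|)\,|\mathbf{k} - \mathbf{p}|\Big(\frac{1}{|\mathbf{k}|^{1/2}|\mathbf{k}_\perp|} + \frac{1}{|\mathbf{p}|^{1/2}|\mathbf{p}_\perp|}\Big). \tag{6.6}$$
For the sake of completeness we re-derive (6.6) in Sec. 6.3. Of course, here the special form of the polarization vectors (2.5) is exploited.

Lemma 6.2. Let $m \ge 0$, $e, \Lambda > 0$, $\gamma \in (0, 2/\pi)$, and $f \in \mathscr{K}$ with $\omega^{-1/2}f \in \mathscr{K}$. Then $a(f)\tilde\phi_m \in Q(\tilde H_{\gamma,m})$.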
Proof. We put $\mathring{H} := H_{\gamma,m} - E_{\gamma,m} + 1$. Thanks to [17, Theorem 5.2] we know that $D(\mathring{H}^{n/2}) \subset D(H_f^{n/2})$ and $H_f^{n/2}\mathring{H}^{-n/2}$ maps $D(H_{\gamma,m})$ into itself, for every $n \in \mathbb{N}$. In particular, $H_f^{n/2}\phi_m = H_f^{n/2}\mathring{H}^{-n/2}\phi_m \in D(H_{\gamma,m}) \subset Q(H_{\gamma,m}) \subset D(|D_0|^{1/2})$. Since $\phi_m \in D(H_f)$ it is also clear that $a(f)\phi_m \in D(H_f^{1/2})$. Furthermore, a standard estimate based on the definitions of $a(f)$ and $H_f$ shows that $a(f)\psi \in D(|D_0|^{1/2})$, for every $\psi \in D(H_f^{1/2})$ with $H_f^{1/2}\psi \in D(|D_0|^{1/2})$. In particular, $a(f)\phi_m \in D(|D_0|^{1/2})$, thus $a(f)\phi_m \in Q(H_{\gamma,m}) = D(|D_0|^{1/2}) \cap D(H_f^{1/2})$.

Now, to verify that $a(f)\tilde\phi_m = a(f)U\phi_m \in Q(\tilde H_{\gamma,m})$ we may equivalently show that $a(f)\phi_m + i\langle f\,|\,g_m\rangle\cdot x\,\phi_m \in Q(H_{\gamma,m})$. Therefore, it remains to prove that $x_j\phi_m \in Q(H_{\gamma,m}) = D(|D_0|^{1/2}) \cap D(H_f^{1/2})$, for every component $x_j$, $j = 1, 2, 3$, of $x$. We already know from Lemma 5.4, however, that $e^F\phi_m \in Q(H_{\gamma,m})$, for every $F \in C^\infty(\mathbb{R}^3, [0,\infty))$ such that $F(x) = a|x|$, for large $|x|$, and $|\nabla F| \le a$, provided that $a > 0$ is sufficiently small. Since multiplication with $x_j e^{-F}$ leaves the space $D(|D_0|^{1/2}) \cap D(H_f^{1/2})$ invariant we conclude that $x_j\phi_m \in Q(H_{\gamma,m})$.

6.3. Elementary estimates on polarization vectors

In this subsection we recall the proof of the bound (6.6). A similar bound has been used before in [11]. The elementary estimates below are the only place in our whole article where the special choice of the polarization vectors (2.5) is exploited. We set $y_\perp := (y^{(2)}, -y^{(1)}, 0)$ and $y^\circ := y/|y|$, for $y = (y^{(1)}, y^{(2)}, y^{(3)}) \in \mathbb{R}^3\setminus\{0\}$, and $\Delta_h f(k,\lambda) := f(k+h,\lambda) - f(k,\lambda)$, for a function $f$ on $\mathbb{R}^3 \times \mathbb{Z}_2$. By means of (2.5) we find
$$(\Delta_{-h}\varepsilon)(k,0) = -|k_\perp|^{-1}h_\perp + \big(|(k-h)_\perp|^{-1} - |k_\perp|^{-1}\big)(k-h)_\perp,$$
$$(\Delta_{-h}\varepsilon)(k,1) = \big((k-h)^\circ - k^\circ\big)\wedge\varepsilon(k-h,0) + k^\circ\wedge(\Delta_{-h}\varepsilon)(k,0),$$
whence
$$|\Delta_{-h}\varepsilon(k,0)| \le 2|h_\perp|/|k_\perp| \le 2|h|/|k_\perp|, \qquad |\Delta_{-h}\varepsilon(k,1)| \le 2|h|/|k| + |\Delta_{-h}\varepsilon(k,0)| \le 4|h|/|k_\perp|.$$
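The two bounds on the polarization vectors can be spot-checked numerically. The following is a minimal sketch, assuming that (2.5) has the form $\varepsilon(k,0) = k_\perp/|k_\perp|$ and $\varepsilon(k,1) = k^\circ\wedge\varepsilon(k,0)$ (this form is inferred from the difference formulas above, not quoted from (2.5)); the function names are illustrative only.

```python
import math
import random


def norm(y):
    return math.sqrt(sum(c * c for c in y))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def perp(y):
    # y_perp := (y^(2), -y^(1), 0)
    return (y[1], -y[0], 0.0)


def unit(y):
    n = norm(y)
    return tuple(c / n for c in y)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def eps(k, lam):
    # ASSUMED form of (2.5): eps(k,0) = k_perp/|k_perp|, eps(k,1) = k° ∧ eps(k,0).
    e0 = unit(perp(k))
    return e0 if lam == 0 else cross(unit(k), e0)


def worst_ratios(trials=20000, seed=0):
    """Largest observed lhs/rhs for the two bounds on Delta_{-h} eps."""
    rng = random.Random(seed)
    worst0 = worst1 = 0.0
    for _ in range(trials):
        k = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        h = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        kh = sub(k, h)
        kp, hp = norm(perp(k)), norm(perp(h))
        if min(kp, hp, norm(perp(kh))) < 1e-9:
            continue  # skip degenerate samples
        d0 = norm(sub(eps(kh, 0), eps(k, 0)))
        d1 = norm(sub(eps(kh, 1), eps(k, 1)))
        worst0 = max(worst0, d0 * kp / (2.0 * hp))       # vs 2|h_perp|/|k_perp|
        worst1 = max(worst1, d1 * kp / (4.0 * norm(h)))  # vs 4|h|/|k_perp|
    return worst0, worst1
```

Both returned ratios should stay at or below one; a random search of this kind is a sanity check only and does not replace the elementary proof.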
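Since $g_m(k) = |k|^{-1/2}\varepsilon(k)\,1_{m\le|k|\le\Lambda}$ and $|a^{-1/2} - b^{-1/2}| \le (|a-b|/2)(a^{-3/2} + b^{-3/2})$, $a, b > 0$, we further have, for $m < |k|, |k-h| < \Lambda$,
$$|\Delta_{-h}g_m(k)| \le \frac{|h|}{2}\Big(\frac{1}{|k|^{3/2}} + \frac{1}{|k-h|^{3/2}}\Big) + \frac{4|h|}{|k|^{1/2}|k_\perp|},$$
$$\frac{1}{|k|}\,|\Delta_{-h}(\omega g_m)| \le |\Delta_{-h}g_m(k)| + \frac{|h|}{|k||k-h|^{1/2}}.$$
Moreover, since $\tilde G_{x,m} = (e^{-ik\cdot x} - 1)\,g_m(k)$ and $|e^{iy\cdot x} - e^{iz\cdot x}| \le |y-z||x|$, we find
$$\frac{1}{|k|}\,|\Delta_{-h}\tilde G_{x,m}| \le |x|\,|\Delta_{-h}g_m(k)| + \frac{|h||x|}{|k||k-h|^{1/2}},$$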
again for $m < |k|, |k-h| < \Lambda$. Furthermore, it is clear that
$$\frac{\big((\Delta_h\omega)(k-h)\big)^2}{|k|^2|k-h|} \le \frac{|h|^2}{|k|^2|k-h|}.$$
Finally, by Young's inequality,
$$\frac{|h|}{|k||k-h|^{1/2}} \le \frac{|h|}{3}\Big(\frac{2}{|k|^{1/2}|k_\perp|} + \frac{1}{|k-h|^{1/2}|(k-h)_\perp|}\Big).$$
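The scalar inequalities behind the last steps are easy to test numerically. The sketch below is an illustration under stated assumptions: it checks Young's inequality in the form $a^{-1}b^{-1/2} \le \tfrac{2}{3}a^{-3/2} + \tfrac{1}{3}b^{-3/2}$ (from which the displayed bound follows with $a = |k|$, $b = |k-h|$ and $|y_\perp| \le |y|$) together with the bound on $|a^{-1/2} - b^{-1/2}|$ used earlier. The function name and sampling ranges are hypothetical choices.

```python
import random


def worst_scalar_ratios(trials=10000, seed=1):
    """Largest observed lhs/rhs for the two scalar inequalities."""
    rng = random.Random(seed)
    worst_young = worst_sqrt = 0.0
    for _ in range(trials):
        a = rng.uniform(1e-3, 10.0)
        b = rng.uniform(1e-3, 10.0)
        # Young: a^{-1} b^{-1/2} <= (2/3) a^{-3/2} + (1/3) b^{-3/2}
        lhs = 1.0 / (a * b ** 0.5)
        rhs = (2.0 / 3.0) * a ** -1.5 + (1.0 / 3.0) * b ** -1.5
        worst_young = max(worst_young, lhs / rhs)
        # |a^{-1/2} - b^{-1/2}| <= (|a-b|/2) (a^{-3/2} + b^{-3/2})
        lhs2 = abs(a ** -0.5 - b ** -0.5)
        rhs2 = 0.5 * abs(a - b) * (a ** -1.5 + b ** -1.5)
        if rhs2 > 0.0:
            worst_sqrt = max(worst_sqrt, lhs2 / rhs2)
    return worst_young, worst_sqrt
```

Equality in Young's inequality is attained at $a = b$, so the first ratio can approach one; the second stays strictly below one for $a \neq b$.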
Combining the above estimates with $p = k - h$ we obtain (6.6).

Acknowledgments

This work has been partially supported by the German Research Foundation (DFG) (SFB/TR12).

References

[1] V. Bach, T. Chen, J. Fröhlich and I. M. Sigal, Smooth Feshbach map and operator-theoretic renormalization group methods, J. Funct. Anal. 203 (2003) 44–92.
[2] V. Bach, J. Fröhlich and A. Pizzo, Infrared-finite algorithms in QED: The groundstate of an atom interacting with the quantized radiation field, Comm. Math. Phys. 264 (2006) 145–165.
[3] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998) 299–395.
[4] V. Bach, J. Fröhlich and I. M. Sigal, Renormalization group analysis of spectral problems in quantum field theory, Adv. Math. 137 (1998) 205–298.
[5] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207 (1999) 249–290.
[6] V. Bach and M. Könenberg, Construction of the ground state in nonrelativistic QED by continuous flows, J. Differ. Equ. 231 (2006) 693–713.
[7] J.-M. Barbaroux, M. Dimassi and J. C. Guillot, Quantum electrodynamics of relativistic bound states with cutoffs, J. Hyperbol. Differ. Eq. 1 (2004) 271–314.
[8] A. Berthier and V. Georgescu, On the point spectrum of Dirac operators, J. Funct. Anal. 71 (1987) 309–338.
[9] J. Fröhlich, M. Griesemer and I. M. Sigal, On spectral renormalization group, Rev. Math. Phys. 21 (2009) 511–548.
[10] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic electromagnetic fields in models of quantum-mechanical matter interacting with the quantized radiation field, Adv. Math. 164 (2001) 349–398.
[11] M. Griesemer, E. H. Lieb and M. Loss, Ground states in non-relativistic quantum electrodynamics, Invent. Math. 145 (2001) 557–595.
[12] F. Hiroshima and I. Sasaki, On the ionization energy of the semi-relativistic Pauli–Fierz model for a single particle, RIMS Kokyuroku Bessatsu 21 (2010) 25–34.
[13] T. Kato, Perturbation Theory for Linear Operators, Classics in Mathematics (Springer, Berlin-Heidelberg, 1995).
[14] M. Könenberg, O. Matte and E. Stockmeyer, Existence of ground states of hydrogen-like atoms in relativistic quantum electrodynamics II: The no-pair operator, preprint (2010), 76 pp.; arXiv:1005.2109v1.
[15] E. H. Lieb and M. Loss, A bound on binding energies and mass renormalization in models of quantum electrodynamics, J. Stat. Phys. 108 (2002) 1057–1069.
[16] E. H. Lieb and M. Loss, Stability of a model of relativistic quantum electrodynamics, Comm. Math. Phys. 228 (2002) 561–588.
[17] O. Matte, On higher order estimates in quantum electrodynamics, Doc. Math. 15 (2010) 207–234.
[18] O. Matte and E. Stockmeyer, On the eigenfunctions of no-pair operators in classical magnetic fields, Integr. Equat. Oper. Th. 65 (2009) 255–283.
[19] O. Matte and E. Stockmeyer, Exponential localization for a hydrogen-like atom in relativistic quantum electrodynamics, Comm. Math. Phys. 295 (2010) 551–583.
[20] T. Miyao and H. Spohn, Spectral analysis of the semi-relativistic Pauli–Fierz Hamiltonian, J. Funct. Anal. 256 (2009) 2123–2156.
[21] S. M. Nikol'skiĭ, An imbedding theorem for functions with partial derivatives considered in different metrics, Izv. Akad. Nauk SSSR Ser. Mat. 22 (1958) 321–336 (Russian); English translation in Amer. Math. Soc. Transl. (2) 90 (1970) 27–43.
[22] S. M. Nikol'skiĭ, Approximation of Functions of Several Variables and Imbedding Theorems, Die Grundlehren der Mathematischen Wissenschaften, Band 205 (Springer, New York, 1975).
[23] M. Reed and B. Simon, Methods of Modern Mathematical Physics. II. Fourier Analysis, Self-Adjointness (Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1975).
[24] M. Reed and B. Simon, Methods of Modern Mathematical Physics. IV. Analysis of Operators (Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1978).
[25] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis, 2nd edn. (Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1980).
[26] I. Sasaki, Ground state of a model in relativistic quantum electrodynamics with a fixed total momentum, preprint (2006), 30 pp.; arXiv:math-ph/0606029v4.
[27] E. Stockmeyer, On the non-relativistic limit of a model in quantum electrodynamics, preprint (2009), 13 pp.; arXiv:0905.1006v1.
May 20, J070-S0129055X11004345
2011 14:35 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 4 (2011) 409–451 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004345
FAST SOLITONS ON STAR GRAPHS
RICCARDO ADAMI∗,§, CLAUDIO CACCIAPUOTI†,¶, DOMENICO FINCO‡,‖ and DIEGO NOJA∗,∗∗
∗Dipartimento di Matematica e Applicazioni, Università di Milano Bicocca, via R. Cozzi, 53, 20125 Milano, Italy
†Hausdorff Center for Mathematics, Institut für Angewandte Mathematik, Endenicher Allee, 60, 53115 Bonn, Germany
‡Facoltà di Ingegneria, Università Telematica Internazionale Uninettuno, Corso Vittorio Emanuele, II 00186 Roma, Italy
§[email protected]
¶[email protected]
‖d.fi[email protected]
∗∗[email protected]
Received 15 January 2011
Revised 30 March 2011

We define the Schrödinger equation with focusing, cubic nonlinearity on one-vertex graphs. We prove global well-posedness in the energy domain and conservation laws for some self-adjoint boundary conditions at the vertex, i.e. Kirchhoff boundary condition and the so-called δ and δ′ boundary conditions. Moreover, in the same setting, we study the collision of a fast solitary wave with the vertex and we show that it splits into reflected and transmitted components. The outgoing waves preserve a soliton character over a time which depends on the logarithm of the velocity of the ingoing solitary wave. Over the same timescale, the reflection and transmission coefficients of the outgoing waves coincide with the corresponding coefficients of the linear problem. In the analysis of the problem, we follow ideas borrowed from the seminal paper [17] about scattering of fast solitons by a delta interaction on the line, by Holmer, Marzuola and Zworski. The present paper represents an extension of their work to the case of graphs and, as a byproduct, it shows how to extend the analysis of soliton scattering by other point interactions on the line, interpreted as a degenerate graph.

Keywords: Quantum graphs; nonlinear Schrödinger equation; solitary waves.

Mathematics Subject Classification 2010: 35Q55, 81Q35, 37K40
1. Introduction

In the present paper, we study nonlinear wave propagation on graphs. As far as we know, the subject of nonlinear Schrödinger evolution on graphs is at its
beginnings. An extensive literature exists on the behavior of linear wave and Schrödinger equations on graphs ([22, 23, 21, 7, 8], and references therein), and there is a certain activity concerning the so-called discrete nonlinear Schrödinger equation (DNLSE) in chains with edges and graphs inserted (“decorations”, interpreted as defects in the chain), both from the physical and the numerical point of view (see, for example, [20, 9]). To our knowledge, however, there are only very few papers in which nonlinear Schrödinger evolution on graphs has been introduced and studied (see [6, 26]). In the first paper [6] (and similar ones quoted therein) NLS on graphs emerges in some models of quantum field theory on “bulks”; the second, more recent paper [26] addresses from a physical point of view some general questions related to the ones studied here and is briefly commented on in the conclusions of the present paper. Besides, several results on existence and stability of stationary states for the NLS on star graphs can be found in [2].

In any case, the study of nonlinear propagation in ramified structures could be of relevance in several branches of pure and applied science, from condensed matter physics to nonlinear fiber optics, hydrodynamics and fluid transport (a nontraditional example is blood flow in veins and arteries), and finally neural networks (see for example the study of reaction-diffusion type FitzHugh–Nagumo–Rall equations on networks in [12], and references therein). In all these examples, there is a strong dependence on the modeling. The nonlinear Schrödinger equation with cubic nonlinearity is especially suitable to describe nonlinear electromagnetic pulse propagation in optical fibers and, under the name of Gross–Pitaevskii equation, the dynamics of Bose–Einstein condensates. Better suited for other applications, for example the hydrodynamic flow, is the KdV equation or its relatives, not treated here.
Of course real networks are not strictly one dimensional, and an abstract graph, which is just a set of copies of $\mathbb{R}^+$ (“edges”, or “branches”) with functions living on it satisfying certain boundary conditions at 0, lacks some of the geometrically meaningful characteristics of a real network, such as thickness and curvature of the branch or orientation between edges. On the other hand, problems related to wave propagation on networks are far from being well understood even for the linear propagation (see, for example, [4, 10]), and so we content ourselves with posing and analyzing the nonlinear problem in the idealized and simplified framework of an abstract graph. We study here the special case of a star graph with three edges. A generalization to a star graph with n edges would be straightforward, but here our interest is in clarifying the main features of the evolution and the techniques involved in its analysis.

A preliminary point, and not a trivial issue, is the definition of the dynamics. Let us recall that for a star graph G, the linear Schrödinger dynamics is defined by giving a self-adjoint operator H on the product of n copies of $L^2(\mathbb{R}^+)$ (briefly $L^2(G)$), with a domain D(H) in which there appears a linear condition involving the values at 0 of the functions of the domain and of their derivatives. The admissible boundary conditions characterize the interaction at the vertex of the star graph. For the nonlinear problem, we establish the well-posedness of the dynamics in the case of a star graph with some distinguished boundary condition at the vertex,
namely the free (or Kirchhoff), and the δ and δ′ boundary conditions (see Sec. 2 for the relevant definitions). To clarify the exact problem we are faced with, the differential equation to be studied is of the form
$$i\frac{d}{dt}\Psi_t = H\Psi_t - |\Psi_t|^2\Psi_t, \qquad t \ge 0, \tag{1.1}$$
where the function Ψ is, for a three-edge graph, a column vector
$$\Psi = \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\end{pmatrix}$$
that lies in the domain D(H) of the linear Hamiltonian H on the graph, thus embodying the relevant boundary conditions. This is the abstract strong form of the equation, which is equivalent to a particular nonlinear coupled system of scalar equations. The coupling is not due to the nonlinearity, because of the definition
$$|\Psi|^2\Psi \equiv \begin{pmatrix}|\psi_1|^2\psi_1\\ |\psi_2|^2\psi_2\\ |\psi_3|^2\psi_3\end{pmatrix},$$
but to the boundary condition at the vertex. For example, in the simple case of a Kirchhoff boundary condition, the coupling between the edges is given by
$$\Psi \in L^2(G) \ \text{ s.t. } \ \psi_i \in H^2(\mathbb{R}^+), \qquad \psi_1(0) = \psi_2(0) = \psi_3(0), \qquad \psi_1'(0) + \psi_2'(0) + \psi_3'(0) = 0.$$
The δ or δ′ boundary conditions allow a coupling between the values of the function Ψ and the values of its derivatives at the origin, but in principle the nature of the problem is unaltered. In the present paper, for several reasons, we prefer to write the dynamics in weak form, which is the following:
$$\Psi_t = e^{-iHt}\Psi_0 + i\int_0^t e^{-iH(t-s)}|\Psi_s|^2\Psi_s\,ds, \qquad t \ge 0; \tag{1.2}$$
here Ψ belongs to the form domain $D(E_{\mathrm{lin}})$, where $E_{\mathrm{lin}}$ is the quadratic form of the linear Hamiltonian H. As is customary, we interpret the form domain as the finite energy space. An adaptation of the methods in [3] gives the local well-posedness of Eq. (1.2) for every initial datum in $D(E_{\mathrm{lin}})$. Moreover, charge and energy conservation laws hold true for such weak solutions and, as a consequence, the NLS on the graph admits global solutions. The generalization of the well-posedness result to the case of general self-adjoint boundary conditions will be treated elsewhere.

Apart from well-posedness, the main goal of this paper is to provide information on the interaction between a solitary wave and the boundary condition at the vertex. As is well known, the NLS on the line admits a family of solitary non-dispersive
solutions, rigidly translating with a fixed velocity. A rich family of solitary solutions (“solitons”) is given by the action of the Galilei group on the function $\phi(x) = \sqrt{2}\,\cosh^{-1}x$, $x \in \mathbb{R}$, or explicitly,
$$\phi_{x_0,v}(x,t) = e^{i\frac{v}{2}x}\,e^{-it\frac{v^2}{4}}\,e^{it}\,\phi(x - x_0 - vt), \qquad x \in \mathbb{R},\ t \in \mathbb{R},\ v \in \mathbb{R}.$$
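That the boosted profile solves the cubic NLS on the line, $i\partial_t u = -\partial_x^2 u - |u|^2u$, can be checked with finite differences. The following is a minimal sketch, reading $\cosh^{-1}x$ as $1/\cosh x$; the function names, the step size and the sample points are illustrative assumptions, not part of the paper.

```python
import cmath
import math


def phi(x):
    # phi(x) = sqrt(2) / cosh(x)
    return math.sqrt(2.0) / math.cosh(x)


def soliton(x, t, x0=0.0, v=1.0):
    # phi_{x0,v}(x,t) = e^{i v x/2} e^{-i t v^2/4} e^{i t} phi(x - x0 - v t)
    phase = cmath.exp(1j * (0.5 * v * x - 0.25 * v * v * t + t))
    return phase * phi(x - x0 - v * t)


def nls_residual(x, t, x0=0.0, v=1.0, h=1e-3):
    """Central-difference residual of i u_t + u_xx + |u|^2 u at (x, t)."""
    u = lambda xx, tt: soliton(xx, tt, x0, v)
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    u0 = u(x, t)
    return 1j * u_t + u_xx + abs(u0) ** 2 * u0
```

For instance, `abs(nls_residual(0.3, 0.2, x0=1.0, v=2.0))` should be of the size of the discretization error, which is small compared with the amplitude $\sqrt{2}$ of the soliton.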
We show that, after the collision of a solitary wave with the vertex, there exists a timescale during which the dynamics can be described as the scattering of three split solitary waves, one reflected on the same branch where the original soliton was running asymptotically in the past, and two transmitted solitary waves on the other branches. On the same timescale, the amplitudes of the reflected and transmitted solitary waves are given by the scattering matrix of the linear dynamics on the graph. The soliton-like character persists over time intervals that depend on the velocity of the impinging original soliton: the faster the original soliton, the longer (logarithmically in the velocity v) the survival time of the solitary wave behavior on every branch of the graph. The non-trivial point is that the persistence time of solitary behavior after collision with the vertex is much longer, for fast solitons, than the time over which it is reasonable to approximate the nonlinear dynamics with the linear one. The same timescale of the order ln v of persistence of solitary behavior appears in the paper [1], where the collision of two solitary waves with an underlying smooth potential is studied, and in the paper by Holmer, Marzuola and Zworski [17] on the fast NLS-soliton scattering by a delta potential on the line, which is the main source of inspiration for our result and for the techniques employed in the present paper.

We give now an outline of our result and its proof. The initial data are of the following form
$$\Psi_0(x) = \begin{pmatrix}\chi(x)\,e^{-i\frac{v}{2}x}\,\phi(x - x_0)\\ 0\\ 0\end{pmatrix}, \qquad x_0 \ge v^{1-\delta},\quad 0 < \delta < 1, \tag{1.3}$$
where χ is a cutoff function, that is $\chi \in C^\infty(\mathbb{R}^+)$, χ = 1 in (2, +∞) and χ = 0 in (0, 1). Apart from a small tail term truncated by the cutoff function, the first component is the initial condition of a free (i.e. without external potentials) NLS which on the line yields a solitary wave running with velocity v; the center $x_0$ of the initial soliton is chosen far from the vertex.
We are interested in the evolution Ψt of this initial condition. The dynamics can be divided into three phases. The pre-interaction phase, where the evolved initial condition is far from the vertex, and the undisturbed NLS evolution dominates. At the end of this phase, the solution enters the vertex zone, and differs (in L2 norm) from the evolved solitary wave by an exponentially small error in the velocity v. The second phase is the interaction phase, in which a substantial
fraction of the mass of the initial soliton has reached the vertex, and the linear dynamics dominates due to the shortness of the interaction time, leaving the system at the end of this phase with three scattered waves, the amplitudes of which are given by the action of the scattering matrix of the associated linear graph on the incoming solitary wave. The size of the corresponding error is (again in $L^2$ norm) a suitable negative power ($v^{-\frac{1}{2}\delta}$) of the velocity. In phase one, the main technical tool consists in the accurate use, as fixed by [17], of Strichartz's estimates to control the deviations between the unperturbed NLS flow and the NLS flow on the graph. In phase two, we need to compare the nonlinear evolution with the linear flow on the graph in the relevant time interval. Finally there is the post-interaction phase, where the free NLS dynamics dominates again. However, now the initial data are not exact solitary waves, but waves with soliton-like profiles and “wrong” amplitudes (due to the scattering process in the interaction phase). The true evolution is compared with a reference dynamics given by the superposition of the nonlinear evolution of the outgoing scattered profiles, and it turns out that the error is, in $L^2$ norm, of the order of an inverse power of the velocity (depending on the size of the time interval of approximation).

For a precise formulation, one has to tackle the problem of representing the reference soliton dynamics to be compared with the true dynamics. This problem arises because one would like to use crucial and known properties of NLS on the line (such as existence of an infinite number of constants of motion), and various associated estimates, while on a star graph one has a NLS on half-lines, jointly with boundary conditions. The problem occurs, of course, in each of the three phases in which the dynamics is decomposed. Our choice of reference dynamics is the following.
We associate to every edge of the star graph a companion edge chosen between the other two, in such a way as to have three fictitious lines; then we glue the soliton on every single edge with the right tail on the companion edge, respecting the free nonlinear dynamics. One of the main technical points in the analysis of the true dynamics is to have control of the errors brought by this schematization. More precisely, let us define
$$\tilde\Phi^1_t(x_1,x_2,x_3) \equiv \begin{pmatrix}\tilde r\,e^{-i\frac{v^2}{4}t}\,e^{i\frac{v}{2}x_1}\,e^{it}\,\phi(x_1 + x_0 - vt)\\ \tilde r\,e^{-i\frac{v^2}{4}t}\,e^{-i\frac{v}{2}x_2}\,e^{it}\,\phi(x_2 - x_0 + vt)\\ 0\end{pmatrix},$$
$$\tilde\Phi^2_t(x_1,x_2,x_3) \equiv \begin{pmatrix}0\\ \tilde t\,e^{-i\frac{v^2}{4}t}\,e^{i\frac{v}{2}x_2}\,e^{it}\,\phi(x_2 + x_0 - vt)\\ \tilde t\,e^{-i\frac{v^2}{4}t}\,e^{-i\frac{v}{2}x_3}\,e^{it}\,\phi(x_3 - x_0 + vt)\end{pmatrix}, \tag{1.4}$$
$$\tilde\Phi^3_t(x_1,x_2,x_3) \equiv \begin{pmatrix}\tilde t\,e^{-i\frac{v^2}{4}t}\,e^{-i\frac{v}{2}x_1}\,e^{it}\,\phi(x_1 - x_0 + vt)\\ 0\\ \tilde t\,e^{-i\frac{v^2}{4}t}\,e^{i\frac{v}{2}x_3}\,e^{it}\,\phi(x_3 + x_0 - vt)\end{pmatrix}.$$
Each of these vectors represents a soliton on the fictitious line given by an edge and its companion, multiplied by the scattering coefficients of the linear dynamics considered, Kirchhoff, δ or δ′ (and here left unspecified). Up to a small error, these functions represent outgoing waves at the end ($t = t_2$) of the interaction phase, which is essentially a scattering process. Taking these as initial data for the free nonlinear dynamics on the pertinent fictitious line, we define their time evolution $\Phi^j_t$ as given by
$$\Phi^j_t = e^{-iH_j(t-t_2)}\tilde\Phi^j_{t_2} + i\int_{t_2}^t ds\,e^{-iH_j(t-s)}|\Phi^j_s|^2\Phi^j_s, \qquad j = 1, 2, 3,\quad t \ge t_2,$$
where the $H_j$ are the linear Hamiltonians that decouple the $(j+2)$-th branch (indices taken modulo 3) from the others. With these premises, the main result of the paper is the following:

Theorem 1.1. Let $\Psi_t$ be the unique, global solution to the Cauchy problem (1.1) with initial data (1.3). There exist $\tau_* > 0$ and $T_* > 0$ such that for $0 < T < T_*$ one has
$$\|\Psi_t - \Phi^1_t - \Phi^2_t - \Phi^3_t\|_{L^2_x(G)} \le C\,v^{-\frac{T_* - T}{\tau_*}}$$
for every time t in the interval $t_2 < t < t_2 + T\ln v$, where C is a positive constant independent of t and v.

To be precise, the Hamiltonians in (1.1) to which the theorem refers have to be rescaled in order to give a nontrivial scattering matrix in the regime of high velocity (see Sec. 4, in particular Theorem 4.3). To get the previous result, Strichartz estimates do not suffice, and more direct properties coming from the integrable character of the cubic NLS are needed. In particular, thanks to the existence of an infinite number of integrals of motion, in [17] a spatial localization property of the solution of NLS with smooth data is proven, with a polynomial (and not exponential) bound in time. This gives control of the tails of the difference between the solution and the reference modified solitary dynamics. An analogous method applies in our case. Let us note that as a consequence of the previous result, we can give the scattered amplitudes in terms of the incoming amplitude and scattering coefficients of the linear dynamics, in the time range of applicability of the main theorem (see Remark 4.4).

We give a brief summary of the content of the various sections of the paper. In Sec. 2, we give some generalities on linear dynamics on graphs, including Hamiltonians, their quadratic forms, resolvents and propagators. Moreover, the essential Strichartz estimates are recalled. In Sec. 3, local and global well-posedness of nonlinear Schrödinger equations on star graphs is proved. In Sec. 4 the main result (Theorem 4.3) is introduced and stated. Section 5 is devoted to the proof of the result. In Sec. 6, some final remarks are given and further possible developments are discussed.
1.1. Setting and notations

We consider a graph G given by three infinite half-lines attached to a common vertex. In order to study a quantum mechanical problem on G, the natural Hilbert space is then $L^2(G) \equiv L^2(\mathbb{R}^+) \oplus L^2(\mathbb{R}^+) \oplus L^2(\mathbb{R}^+)$. We denote the elements of $L^2(G)$ by capital Greek letters, while functions in $L^2(\mathbb{R}^+)$ are denoted by lowercase Greek letters. It is convenient to represent functions in $L^2(G)$ as column vectors of functions in $L^2(\mathbb{R}^+)$, namely
$$\Psi = \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\end{pmatrix}.$$
The norm of $L^2$-functions on G is naturally defined by
$$\|\Psi\|_{L^2(G)} := \Big(\sum_{j=1}^3\|\psi_j\|^2_{L^2(\mathbb{R}^+)}\Big)^{1/2}.$$
Analogously, given $1 \le r \le \infty$, we define the space $L^r(G)$ as the set of functions on the graph whose components are elements of the space $L^r(\mathbb{R}^+)$, and the norm is correspondingly defined by
$$\|\Psi\|_{L^r(G)} = \Big(\sum_{j=1}^3\|\psi_j\|^r_{L^r(\mathbb{R}^+)}\Big)^{1/r}, \quad 1 \le r < \infty, \qquad \|\Psi\|_{L^\infty(G)} = \sup_{1\le j\le 3}\|\psi_j\|_{L^\infty(\mathbb{R}^+)}.$$
When a functional norm refers to a function defined on the graph, we omit the symbol G. Furthermore, from now on, when such a norm is $L^2$, we drop the subscript, and simply write $\|\cdot\|$. Accordingly, we denote by $(\cdot,\cdot)$ the scalar product in $L^2$. As is standard when dealing with Strichartz's estimates, we make use of spaces of functions that are measurable as functions of both time (on the interval $[T_1, T_2]$) and space (on the graph). We denote such spaces by $L^p_{[T_1,T_2]}L^r(G)$, with indices $1 \le r \le \infty$, $1 \le p < \infty$; we endow them with the norm
$$\|\Psi\|_{L^p_{[T_1,T_2]}L^r(G)} = \Big(\int_{T_1}^{T_2}\|\Psi_s\|^p_{L^r}\,ds\Big)^{1/p}, \qquad 1 \le p < \infty.$$
The extension of the definitions given above to the case p = ∞ is straightforward:
$$\|\Psi\|_{L^\infty_{[T_1,T_2]}L^r(G)} = \sup_{t\in[T_1,T_2]}\|\Psi_t\|_{L^r}.$$
Besides, we need to introduce the spaces
$$H^1(G) \equiv H^1(\mathbb{R}^+) \oplus H^1(\mathbb{R}^+) \oplus H^1(\mathbb{R}^+), \qquad H^2(G) \equiv H^2(\mathbb{R}^+) \oplus H^2(\mathbb{R}^+) \oplus H^2(\mathbb{R}^+),$$
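equipped with the norms
$$\|\Psi\|^2_{H^1(G)} = \sum_{i=1}^3\|\psi_i\|^2_{H^1(\mathbb{R}^+)}, \qquad \|\Psi\|^2_{H^2(G)} = \sum_{i=1}^3\|\psi_i\|^2_{H^2(\mathbb{R}^+)}. \tag{1.5}$$
The product of functions is defined componentwise,
$$\Psi\Phi \equiv \begin{pmatrix}\psi_1\phi_1\\ \psi_2\phi_2\\ \psi_3\phi_3\end{pmatrix}, \qquad\text{so that}\qquad |\Psi|^2\Psi \equiv \begin{pmatrix}|\psi_1|^2\psi_1\\ |\psi_2|^2\psi_2\\ |\psi_3|^2\psi_3\end{pmatrix}.$$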
We denote by I the 3 × 3 identity matrix, while J is the 3 × 3 matrix whose elements are all equal to one. When an element of $L^2(G)$ evolves in time, we use in notation the subscript t: for instance, $\Psi_t$. Sometimes we shall write Ψ(t) in order to highlight the dependence on time, or whenever such a notation is more understandable.

2. Summary on Linear Dynamics on Graphs

2.1. Hamiltonians and quadratic forms

Standard references about the linear Schrödinger equation on graphs are [7, 8, 22, 23, 21], to which we refer for more extensive treatments. Here we only give the definitions needed to have a self-contained exposition. We consider three Hamiltonian operators, denoted by $H_F$, $H_\delta^\alpha$, $H_{\delta'}^\beta$ (with α, β ∈ R), and called, respectively, the Kirchhoff, the Dirac's delta, and the delta-prime Hamiltonian. These operators act as
$$\Psi \mapsto \begin{pmatrix}-\psi_1''\\ -\psi_2''\\ -\psi_3''\end{pmatrix} \tag{2.1}$$
on some subspace of $H^2(G)$, to be defined by suitable boundary conditions at the vertex. Here and in the following subsection we collect some basic facts (see [21–23, 7]) on $H_F$, $H_\delta^\alpha$, and $H_{\delta'}^\beta$. The Kirchhoff Hamiltonian $H_F$ acts on the domain
$$D(H_F) := \{\Psi \in H^2(G) \ \text{s.t.}\ \psi_1(0) = \psi_2(0) = \psi_3(0),\ \psi_1'(0) + \psi_2'(0) + \psi_3'(0) = 0\}. \tag{2.2}$$
It is well known, see [21], that (2.2) and (2.1) define a self-adjoint Hamiltonian on $L^2(G)$. Boundary conditions in (2.2) are usually called Kirchhoff boundary conditions. We use the index F to recall that $H_F$ reduces to the free Hamiltonian on the line for a degenerate graph composed of two half-lines. The quadratic form $E_F$ associated to $H_F$ is defined on the subspace
$$D(E_F) = \{\Psi \in H^1(G) \ \text{s.t.}\ \psi_1(0) = \psi_2(0) = \psi_3(0)\}$$
and reads
$$E_F[\Psi] = \sum_{i=1}^3\int_0^{+\infty}|\psi_i'(x)|^2\,dx.$$
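The link between the operator and its quadratic form is integration by parts: for real-valued components, $\sum_i\int_0^\infty\psi_i(-\psi_i'')\,dx = \sum_i\psi_i(0)\psi_i'(0) + \sum_i\int_0^\infty|\psi_i'|^2\,dx$, and the boundary term vanishes exactly under the Kirchhoff conditions. The sketch below illustrates this numerically for the hypothetical test family $\psi_i(x) = (a + b_ix)e^{-x}$, which is continuous at the vertex and satisfies the Kirchhoff derivative condition when $\sum_i b_i = 3a$; the family and all names are assumptions made for illustration.

```python
import math


def simpson(f, a, b, n=8000):
    # Composite Simpson rule on [a, b]; n must be even.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return s * h / 3.0


def edge(a, b):
    # psi(x) = (a + b x) e^{-x} together with its first two derivatives.
    psi = lambda x: (a + b * x) * math.exp(-x)
    d1 = lambda x: (b - a - b * x) * math.exp(-x)
    d2 = lambda x: (a - 2.0 * b + b * x) * math.exp(-x)
    return psi, d1, d2


def operator_and_form(a, slopes, cutoff=40.0):
    """Return ((Psi, H Psi), E_F[Psi]) with H acting as -d^2/dx^2 on each edge."""
    op = form = 0.0
    for b in slopes:
        psi, d1, d2 = edge(a, b)
        op += simpson(lambda x: psi(x) * (-d2(x)), 0.0, cutoff)
        form += simpson(lambda x: d1(x) ** 2, 0.0, cutoff)
    return op, form
```

With `a = 1.0` and `slopes = [0.5, 1.0, 1.5]` the Kirchhoff condition holds and the two numbers agree up to quadrature error; with `slopes = [1.0, 1.0, 2.0]` they differ by the boundary term $a\sum_i(b_i - a) = 1$.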
The Dirac's delta Hamiltonian is defined on the domain
$$D(H_\delta^\alpha) := \{\Psi \in H^2(G) \ \text{s.t.}\ \psi_1(0) = \psi_2(0) = \psi_3(0),\ \psi_1'(0) + \psi_2'(0) + \psi_3'(0) = \alpha\psi_1(0)\}. \tag{2.3}$$
Again, $H_\delta^\alpha$ is a self-adjoint operator on $L^2(G)$ ([21]). It appears that $H_\delta^\alpha$ generalizes the ordinary Dirac's delta interaction with strength parameter α on the line, see, e.g. [5]. The quadratic form $E_\delta^\alpha$ associated to $H_\delta^\alpha$ is defined on $D(E_\delta^\alpha) = \{\Psi \in H^1(G) \ \text{s.t.}\ \psi_1(0) = \psi_2(0) = \psi_3(0)\}$ and is given by
$$E_\delta^\alpha[\Psi] = \sum_{i=1}^3\int_0^{+\infty}|\psi_i'(x)|^2\,dx + \alpha|\psi_1(0)|^2.$$
The delta-prime Hamiltonian is defined on the domain
$$D(H_{\delta'}^\beta) := \{\Psi \in H^2(G) \ \text{s.t.}\ \psi_1'(0) = \psi_2'(0) = \psi_3'(0),\ \psi_1(0) + \psi_2(0) + \psi_3(0) = \beta\psi_1'(0)\}. \tag{2.4}$$
Again, $H_{\delta'}^\beta$ is a self-adjoint operator on $L^2(G)$ ([21]). The quadratic form $E_{\delta'}^\beta$ associated to $H_{\delta'}^\beta$ is defined on $D(E_{\delta'}^\beta) = H^1(G)$ and is given by
$$E_{\delta'}^\beta[\Psi] = \sum_{i=1}^3\int_0^{+\infty}|\psi_i'(x)|^2\,dx + \frac{1}{\beta}\Big|\sum_{i=1}^3\psi_i(0)\Big|^2.$$
Notice that $H_{\delta'}^\beta$ does not reduce to the standard δ′ interaction on the line, see, e.g., [5], when it is restricted to a two-edge graph. Here we are following the notation in [22, 23]. The present δ′ vertex is sometimes called a $\delta'_s$ graph, where s stands for symmetric. A discussion of the correct extension of the usual δ′ interaction is given in [15, 8]. For completeness, we give the operator domain (the action is the same as in the other cases). We use the denomination $\tilde\delta'$ to avoid confusion with the previously defined interaction.
$$D(H_{\tilde\delta'}^\beta) := \Big\{\Psi \in H^2(G) \ \text{s.t.}\ \sum_{j=1}^n\psi_j'(0) = 0,\ \psi_j(0) - \psi_k(0) = \frac{\beta}{n}\big(\psi_j'(0) - \psi_k'(0)\big),\ j, k = 1, 2, \ldots, n\Big\}.$$
Throughout the paper we restrict to the case of repulsive delta and delta-prime (δ′) interaction, i.e. α, β > 0. It is easily proved, for example by inspection of the
operator resolvents in the following subsection, that such a condition prevents the corresponding Hamiltonian operator from possessing bound states. We also point out that the Hamiltonian $H_\delta^\alpha$ is well defined for $\alpha = 0$ and that indeed $H_\delta^\alpha|_{\alpha=0} \equiv H_F$. We finally point out that, for fixed $\tilde\alpha, \tilde\beta > 0$, we shall consider the Hamiltonian operators $H_\delta^{\tilde\alpha v}$ and $H_{\delta'}^{\tilde\beta/v}$; namely, in the following we rescale $\alpha = \tilde\alpha v$ and $\beta = \tilde\beta/v$.
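2.2. Resolvents, propagators and scattering

For any complex number $k$ with $\operatorname{Im} k > 0$ we denote by $R_F(k)$, $R_\delta^\alpha(k)$, $R_{\delta'}^\beta(k)$ the resolvents $(H_F - k^2)^{-1}$, $(H_\delta^\alpha - k^2)^{-1}$, $(H_{\delta'}^\beta - k^2)^{-1}$, respectively. We define the function $U_t$ by
\[
U_t(x) := \frac{e^{i\frac{x^2}{4t}}}{\sqrt{4\pi i t}}, \qquad t \neq 0.
\]
In the following we shall use the same symbol $U_t$ to denote the operator in $L^2(\mathbb{R})$ defined by
\[
[U_t\psi](x) := \int_{-\infty}^{\infty} U_t(x - y)\psi(y)\,dy, \qquad t \neq 0.
\]
Moreover, we define two integral operators $U_t^\pm$ acting on $L^2(\mathbb{R}^+)$:
\[
U_t^\pm : L^2(\mathbb{R}^+) \to L^2(\mathbb{R}^+), \qquad [U_t^\pm\psi](x) = \int_0^{+\infty} U_t(x \pm y)\psi(y)\,dy, \qquad t \neq 0.
\]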
We stress that, according to our definitions, the operators $U_t^-$ and $U_t$ have the same integral kernel, but act on different Hilbert spaces. For the cases we consider, resolvents and propagators can be easily computed (see, e.g. [7, pp. 201–226] from resolvent formulas with generic boundary conditions in the vertex). The results are summarized in the following theorem.

Theorem 2.1. For any complex number $k$ with $\operatorname{Im} k > 0$, the integral kernels of the resolvent operators $R_F(k)$, $R_\delta^\alpha(k)$, $R_{\delta'}^\beta(k)$ are given by
\[
R_F(k; x, y) = \frac{i}{2k} e^{ik|x-y|}\, I + \frac{i}{2k} e^{ik(x+y)}\, \frac{1}{3}
\begin{pmatrix} -1 & 2 & 2 \\ 2 & -1 & 2 \\ 2 & 2 & -1 \end{pmatrix}, \tag{2.5}
\]
\[
R_\delta^\alpha(k; x, y) = \frac{i}{2k} e^{ik|x-y|}\, I - \frac{i}{2k} e^{ik(x+y)}\, \frac{1}{\alpha - 3ik}
\begin{pmatrix} \alpha - ik & 2ik & 2ik \\ 2ik & \alpha - ik & 2ik \\ 2ik & 2ik & \alpha - ik \end{pmatrix}, \tag{2.6}
\]
\[
R_{\delta'}^\beta(k; x, y) = \frac{i}{2k} e^{ik|x-y|}\, I - \frac{i}{2k} e^{ik(x+y)}\, \frac{1}{3 - i\beta k}
\begin{pmatrix} -1 + i\beta k & 2 & 2 \\ 2 & -1 + i\beta k & 2 \\ 2 & 2 & -1 + i\beta k \end{pmatrix}. \tag{2.7}
\]
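Furthermore, the unitary group of the related time evolution operators reads
\[
e^{-iH_F t} = U_t^-\, I + U_t^+\, \frac{1}{3}
\begin{pmatrix} -1 & 2 & 2 \\ 2 & -1 & 2 \\ 2 & 2 & -1 \end{pmatrix}
= (U_t^- - U_t^+)\, I + \frac{2}{3}\, U_t^+ J, \tag{2.8}
\]
\[
U_{\delta,t}^\alpha(x, y) = [U_t(x - y) - U_t(x + y)]\, I + \frac{2}{3}\Big[ U_t(x + y) - \frac{\alpha}{3}\int_0^{+\infty} du\, e^{-\frac{\alpha}{3}u}\, U_t(x + y + u) \Big] J, \tag{2.9}
\]
\[
U_{\delta',t}^\beta(x, y) = [U_t(x - y) + U_t(x + y)]\, I - \frac{2}{\beta}\int_0^{+\infty} du\, e^{-\frac{3}{\beta}u}\, U_t(x + y + u)\, J, \tag{2.10}
\]
where $x, y \in \mathbb{R}^+$, $\alpha, \beta > 0$.

Proof. We start from the proof of (2.6). Let $(\Psi)_i = \psi_i$ for $i = 1, 2, 3$ and define $R\Psi$ by
\[
(R\Psi)_i(x) = \frac{i}{2k}\int_0^{+\infty} e^{ik|x-y|}\psi_i(y)\,dy + \frac{i}{2k}\, e^{ikx} c_i, \qquad i = 1, 2, 3, \tag{2.11}
\]
where $c_i = c_i(\Psi)$ are constants to be specified. Since $\frac{i}{2k}e^{ik|x-y|}$ is the Green function of $-\frac{d^2}{dx^2} - k^2$ on the line and $e^{ikx}$ solves the homogeneous equation, $(R\Psi)_i$ satisfies
\[
\Big(-\frac{d^2}{dx^2} - k^2\Big)(R\Psi)_i = \psi_i, \qquad i = 1, 2, 3, \tag{2.12}
\]
so it is sufficient to fix $c_i$ such that $R\Psi$ belongs to $D(H_\delta^\alpha)$ in order to compute $R_\delta^\alpha$. The boundary conditions in (2.3) translate to the following linear system for $c_i$:
\[
\begin{cases}
\displaystyle\int_0^{+\infty} e^{iky}\psi_1(y)\,dy + c_1 = \int_0^{+\infty} e^{iky}\psi_2(y)\,dy + c_2, \\[2mm]
\displaystyle\int_0^{+\infty} e^{iky}\psi_2(y)\,dy + c_2 = \int_0^{+\infty} e^{iky}\psi_3(y)\,dy + c_3, \\[2mm]
\displaystyle\int_0^{+\infty} e^{iky}\big(\psi_1(y) + \psi_2(y) + \psi_3(y)\big)\,dy - (c_1 + c_2 + c_3) = \frac{i\alpha}{k}\Big(\int_0^{+\infty} e^{iky}\psi_2(y)\,dy + c_2\Big),
\end{cases} \tag{2.13}
\]
the solution is easily computed and it is given by
\[
\begin{aligned}
c_1 &= \int_0^{+\infty} e^{iky}\Big( -\frac{k + i\alpha}{3k + i\alpha}\psi_1(y) + \frac{2k}{3k + i\alpha}\psi_2(y) + \frac{2k}{3k + i\alpha}\psi_3(y) \Big)dy, \\
c_2 &= \int_0^{+\infty} e^{iky}\Big( \frac{2k}{3k + i\alpha}\psi_1(y) - \frac{k + i\alpha}{3k + i\alpha}\psi_2(y) + \frac{2k}{3k + i\alpha}\psi_3(y) \Big)dy, \\
c_3 &= \int_0^{+\infty} e^{iky}\Big( \frac{2k}{3k + i\alpha}\psi_1(y) + \frac{2k}{3k + i\alpha}\psi_2(y) - \frac{k + i\alpha}{3k + i\alpha}\psi_3(y) \Big)dy,
\end{aligned} \tag{2.14}
\]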
which gives (2.6). In order to obtain (2.5) it is sufficient to put $\alpha = 0$ in (2.6). Formula (2.7) can be proved by the same method. Now we prove (2.9). We start from the standard formula
\[
U_{\delta,t}^\alpha(x, y) = \frac{1}{\pi i}\int_{-\infty}^{+\infty} R_\delta^\alpha(k; x, y)\, e^{-ik^2 t}\, k\,dk,
\]
use (2.6) and recall the following identity
\[
\frac{1}{2\pi}\int_{-\infty}^{+\infty} \frac{e^{ik(x+y)}}{a - ik}\, e^{-ik^2 t}\,dk = \int_0^{+\infty} e^{-au}\, \frac{e^{i\frac{(x+y+u)^2}{4t}}}{\sqrt{4\pi i t}}\,du, \tag{2.15}
\]
then we immediately arrive at (2.9). Formula (2.8) can be obtained by putting $\alpha = 0$ into (2.9). Formula (2.10) can be proved in the same way.

Corollary 2.2. From the expression of the resolvent one immediately has the reflection and transmission coefficients:
\[
r(k) = -\frac{1}{3}, \qquad t(k) = \frac{2}{3}, \tag{2.16}
\]
\[
r_{H_\delta^\alpha}(k) = -\frac{k + i\alpha}{3k + i\alpha}, \qquad t_{H_\delta^\alpha}(k) = \frac{2k}{3k + i\alpha}, \tag{2.17}
\]
\[
r_{H_{\delta'}^\beta}(k) = \frac{\beta k + i}{\beta k + 3i}, \qquad t_{H_{\delta'}^\beta}(k) = -\frac{2i}{\beta k + 3i}. \tag{2.18}
\]
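The coefficients of Corollary 2.2 can be sanity-checked numerically. The sketch below is only an illustration: it assumes a scattering state of the form $e^{-ikx} + r\,e^{ikx}$ on the incoming edge and $t\,e^{ikx}$ on the two other edges (a convention not spelled out in the text), and the function names are hypothetical. Under that assumption it checks flux conservation, $|r|^2 + 2|t|^2 = 1$, and the vertex conditions (2.3) and (2.4).

```python
def kirchhoff(k):
    """Coefficients (2.16); they do not depend on k."""
    return complex(-1.0 / 3.0), complex(2.0 / 3.0)


def delta(k, alpha):
    """Coefficients (2.17) for the delta vertex of strength alpha."""
    d = 3.0 * k + 1j * alpha
    return -(k + 1j * alpha) / d, 2.0 * k / d


def delta_prime(k, beta):
    """Coefficients (2.18) for the delta-prime vertex of strength beta."""
    d = beta * k + 3j
    return (beta * k + 1j) / d, -2j / d


def flux(r, t):
    """Outgoing flux: one reflected edge plus two transmitted edges."""
    return abs(r) ** 2 + 2.0 * abs(t) ** 2


def delta_defects(k, alpha):
    """Violation of the conditions in (2.3) by the assumed scattering state."""
    r, t = delta(k, alpha)
    continuity = (1 + r) - t                       # psi_1(0) = psi_2(0) = psi_3(0)
    jump = 1j * k * (-1 + r + 2 * t) - alpha * t   # sum of derivatives = alpha psi(0)
    return abs(continuity), abs(jump)


def delta_prime_defects(k, beta):
    """Violation of the conditions in (2.4) by the assumed scattering state."""
    r, t = delta_prime(k, beta)
    d1 = 1j * k * (-1 + r)                         # psi_1'(0)
    d2 = 1j * k * t                                # psi_2'(0) = psi_3'(0)
    equal_derivatives = d1 - d2
    value_sum = (1 + r + 2 * t) - beta * d1        # sum of values = beta psi_1'(0)
    return abs(equal_derivatives), abs(value_sum)
```

All defects vanish up to rounding for any $k > 0$ and any positive strength, which is consistent with the remark that the coefficients can also be read off directly from the boundary conditions in the vertex.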
We refer to [21] for a comprehensive analysis of the scattering on star graphs. Indeed by using the results in [21] the reflection and transmission coefficients can be obtained directly by the boundary conditions in the vertex.

Throughout the paper we shall need some auxiliary dynamics to be compared with the dynamics described by (1.1), so, for later convenience, we introduce the two-edge Hamiltonians $H_j$ and the corresponding two-edge propagators $e^{-iH_j t}$, $j = 1, 2, 3$. Let $H_j$ be defined by:
\[
D(H_j) := \{\Psi \in H^2(G) \ \text{s.t.}\ \psi_j(0) = \psi_{j+1}(0),\ \psi_j'(0) + \psi_{j+1}'(0) = 0,\ \psi_k(0) = 0,\ k \neq j, j + 1\},
\qquad
H_j\Psi := \begin{pmatrix} -\psi_1'' \\ -\psi_2'' \\ -\psi_3'' \end{pmatrix},
\]
where in Eq. (2.19) it is understood that $j = \{1, 2, 3\}$ modulo 3. The Hamiltonian $H_j$ couples the edges $j$ and $j + 1$ with a Kirchhoff boundary condition and sets a Dirichlet boundary condition for the remaining edge, so that there is free propagation between the edges $j$ and $j + 1$ and no propagation between them and the edge $j + 2$.
Fig. 1. The open graphs depict the Hamiltonians $H_1$, $H_2$ and $H_3$. Under each graph we report the boundary conditions on vectors in the domain of the corresponding Hamiltonian.
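With a straightforward computation, we have
\[
e^{-iH_j t} = U_t^-\, I + U_t^+\, T_j, \qquad t \in \mathbb{R},
\]
where $T_j$ are the matrices
\[
T_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}; \qquad
T_2 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}; \qquad
T_3 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{pmatrix}. \tag{2.19}
\]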
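The algebra behind the vertex matrices is easy to check by exact arithmetic. The following sketch assumes that $J$ in (2.8) is the $3 \times 3$ matrix with all entries equal to one (this is inferred from the second equality in (2.8), not stated explicitly); the helper names are hypothetical. It checks that each $T_j$ is a symmetric involution and that the Kirchhoff matrix equals $-I + \frac{2}{3}J$.

```python
from fractions import Fraction

# Vertex matrices of the two-edge propagators, as displayed in (2.19).
T = {
    1: [[0, 1, 0], [1, 0, 0], [0, 0, -1]],
    2: [[-1, 0, 0], [0, 0, 1], [0, 1, 0]],
    3: [[0, 0, 1], [0, -1, 0], [1, 0, 0]],
}

IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]
ONES = [[Fraction(1)] * 3 for _ in range(3)]  # assumed meaning of J


def matmul(a, b):
    """Exact product of two 3x3 matrices."""
    return [[sum(Fraction(a[i][m]) * Fraction(b[m][j]) for m in range(3))
             for j in range(3)] for i in range(3)]


def is_symmetric_involution(m):
    """True when m equals its transpose and m*m is the identity."""
    symmetric = all(m[i][j] == m[j][i] for i in range(3) for j in range(3))
    return symmetric and matmul(m, m) == IDENTITY


def kirchhoff_matrix():
    """The matrix (1/3)[[-1,2,2],[2,-1,2],[2,2,-1]] appearing in (2.5) and (2.8)."""
    return [[Fraction(-1 if i == j else 2, 3) for j in range(3)] for i in range(3)]


def minus_identity_plus_two_thirds_ones():
    """The combination -I + (2/3) J used in the second form of (2.8)."""
    return [[-IDENTITY[i][j] + Fraction(2, 3) * ONES[i][j] for j in range(3)]
            for i in range(3)]
```

Each $T_j$ swaps the two coupled edges and flips the sign on the Dirichlet edge, which is the method-of-images picture behind $e^{-iH_j t} = U_t^- I + U_t^+ T_j$.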
2.3. Strichartz's estimates

A key tool in our method is the extension of the standard Strichartz's estimates (see e.g. [13]) to the dynamics on $G$ described by the propagators $e^{-iH_F t}$, $e^{-iH_\delta^\alpha t}$, and $e^{-iH_{\delta'}^\beta t}$. In this subsection we use the symbol $H$ to denote any of the three Hamiltonians of interest described in Sec. 2.1. As a preliminary step, we remark that from Eqs. (2.8)–(2.10), the standard dispersive estimate immediately follows:
\[
\|e^{-iHt}\Psi\|_{L^\infty} \le \frac{c}{t^{1/2}}\, \|\Psi\|_{L^1}, \qquad t \neq 0. \tag{2.20}
\]
Proposition 2.3 (Strichartz Estimates for $e^{-iHt}$). Let $\Psi_0 \in L^2(G)$, $\Gamma \in L^q_{\mathbb{R}} L^k$, with $1 \le q, k \le 2$, $\frac{2}{q} + \frac{1}{k} = \frac{5}{2}$, and define
\[
\Psi(t) = e^{-iHt}\Psi_0, \qquad \Phi(t) = \int_0^t ds\, e^{-iH(t-s)}\,\Gamma(s).
\]
The following estimates hold true:
\[
\|\Psi\|_{L^p_{\mathbb{R}} L^r} \le c\,\|\Psi_0\|, \tag{2.21}
\]
\[
\|\Phi\|_{L^p_{\mathbb{R}} L^r} \le c\,\|\Gamma\|_{L^q_{\mathbb{R}} L^k}, \tag{2.22}
\]
for any pair of indices $(r, p)$ satisfying
\[
2 \le p, r \le \infty, \qquad \frac{2}{p} + \frac{1}{r} = \frac{1}{2}. \tag{2.23}
\]
The constants $c$ in (2.21) and (2.22) are independent of $T$.
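The index conditions of Proposition 2.3 are simple to test with exact fractions. The sketch below uses hypothetical helper names and treats $p = \infty$ as $1/p = 0$; it checks the two pairs used later in the paper, $(p, r) = (\infty, 2)$ and $(6, 6)$, together with the source exponents $q = k = 6/5$.

```python
from fractions import Fraction

INF = float("inf")


def reciprocal(p):
    """Return 1/p as an exact fraction, with the convention 1/inf = 0."""
    if p == INF:
        return Fraction(0)
    return 1 / Fraction(p)


def is_admissible(p, r):
    """Condition (2.23): 2 <= p, r <= inf and 2/p + 1/r = 1/2."""
    if p < 2 or r < 2:
        return False
    return 2 * reciprocal(p) + reciprocal(r) == Fraction(1, 2)


def is_source_pair(q, k):
    """Condition on the forcing term: 1 <= q, k <= 2 and 2/q + 1/k = 5/2."""
    if not (1 <= q <= 2 and 1 <= k <= 2):
        return False
    return 2 * reciprocal(q) + reciprocal(k) == Fraction(5, 2)
```

For instance, `is_admissible(INF, 2)` and `is_admissible(6, 6)` both hold, which is why the space $L^\infty L^2 \cap L^6 L^6$ can be used with a single choice $q = k = 6/5$ in (2.22).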
Proof. The proof is standard due to the dispersive estimate (2.20), see for instance [13, 19].

Remark 2.4. If $H = H_\delta^\alpha$ ($H_{\delta'}^\beta$), the constants appearing in (2.20)–(2.22) are independent of $\alpha$ ($\beta$). Indeed, by the change of variable $u \to u\alpha$ ($u \to u/\beta$) the integral term in (2.9) ((2.10)) can be easily estimated independently of $\alpha$ ($\beta$), obtaining a dispersive estimate (2.20) independent of the parameters and therefore, by the standard Strichartz machinery, uniform inequalities (2.21) and (2.22).

3. Well-Posedness and Conservation Laws

Here we treat the problem of the well-posedness (in the sense of, e.g., [13]), i.e., the existence and uniqueness of the solution to Eq. (1.2) in the energy domain of the system. Such a domain turns out to coincide with the form domain of the linear part of Eq. (1.1). Throughout this section, such a linear part is denoted by $H$, and, according to the particular case under consideration, it can be understood as the Hamiltonian operator $H_F$, $H_\delta^\alpha$, or $H_{\delta'}^\beta$. Correspondingly, we denote the associated energy domain simply by $D(E)$. All of the following formulas can be specialized to the particular cases $D(E_F)$, $D(E_\delta^\alpha)$, or $D(E_{\delta'}^\beta)$.

Let us stress that throughout the paper we do not approximate the dynamics in $H^1$, but rather in $L^2$. Furthermore, local well-posedness in $L^2$ is ensured by Strichartz estimates (Proposition 2.3), as is easily seen following the line exposed in [13, Chaps. 2 and 3]. Nonetheless, we prefer to deal with functions in the energy domain, since they are physically more meaningful. We follow the traditional line of proving, first of all, local well-posedness, and then extending it to all times by means of a priori estimates provided by the conservation laws. For a more extended treatment of the analogous problem for a two-edge vertex (namely, the real line with a point interaction at the origin), see [3].

First, we endow the energy domain $D(E)$ with the $H^1$-norm defined in (1.5). Second, we denote by $D(E)^\star$ the dual of $D(E)$, i.e. the set of the continuous linear functionals on $D(E)$. We denote the dual product of $\Gamma \in D(E)^\star$ and $\Psi \in D(E)$ by $\langle \Gamma, \Psi \rangle$. In such a bracket we sometimes exchange the place of the factor in $D(E)^\star$ with the place of the factor in $D(E)$: indeed, the duality product follows the same algebraic rules as the standard scalar product. As usual, one can extend the action of $H$ to the space $D(E)$, with values in $D(E)^\star$, by
\[
\langle H\Psi_1, \Psi_2 \rangle := (H^{\frac12}\Psi_1, H^{\frac12}\Psi_2),
\]
where $(\cdot, \cdot)$ denotes the standard scalar product in $L^2(G)$.
Furthermore, for any $\Psi \in D(E)$ the identity
\[
\frac{d}{dt} e^{-iHt}\Psi = -iH e^{-iHt}\Psi \tag{3.1}
\]
holds in $D(E)^\star$ too. To prove it, one can first test the functional $\frac{d}{dt} e^{-iHt}\Psi$ on an element $\Xi$ in the operator domain $D(H)$, obtaining
\[
\Big\langle \frac{d}{dt} e^{-iHt}\Psi, \Xi \Big\rangle = \lim_{h \to 0}\Big( \Psi, \frac{e^{iH(t+h)}\Xi - e^{iHt}\Xi}{h} \Big) = (\Psi, iHe^{iHt}\Xi) = \langle -iHe^{-iHt}\Psi, \Xi \rangle.
\]
Then, the result can be extended to $\Xi \in D(E)$ by a density argument. Besides, by (3.1), the differential version (1.1) of the Schrödinger equation holds in $D(E)^\star$.

In order to prove a well-posedness result we need to generalize standard one-dimensional Gagliardo–Nirenberg estimates to graphs, i.e.
\[
\|\Psi\|_{L^p} \le C\, \|\Psi'\|_{L^2}^{\frac12 - \frac1p}\, \|\Psi\|_{L^2}^{\frac12 + \frac1p}, \qquad +\infty \ge p \ge 2, \tag{3.2}
\]
where $C > 0$ is a constant which depends on the index $p$ only. The proof of (3.2) follows immediately from the analogous estimates for functions of the real line, considering that any function in $H^1(\mathbb{R}^+)$ can be extended to an even function in $H^1(\mathbb{R})$, and applying this reasoning to each component of $\Psi$.

Proposition 3.1 (Local Well-Posedness in $D(E)$). For any $\Psi_0 \in D(E)$, there exists $T > 0$ such that Eq. (1.2) has a unique solution $\Psi \in C^0([0, T), D(E)) \cap C^1([0, T), D(E)^\star)$. Moreover, Eq. (1.2) has a maximal solution $\Psi^{\max}$ defined on an interval of the form $[0, T^\star)$, and the following "blow-up alternative" holds: either $T^\star = \infty$ or
\[
\lim_{t \to T^\star} \|\Psi_t^{\max}\|_{D(E)} = +\infty,
\]
where we denoted by $\Psi_t^{\max}$ the function $\Psi^{\max}$ evaluated at time $t$.

Proof. We define the space $X := L^\infty([0, T), D(E))$, endowed with the norm $\|\Psi\|_X := \sup_{t \in [0, T)} \|\Psi_t\|_{D(E)}$. Given $\Psi_0 \in D(E)$, we define the map $G : X \to X$ as
\[
G\Phi := e^{-iH\cdot}\Psi_0 + i\int_0^{\cdot} e^{-iH(\cdot - s)}\,|\Phi_s|^2\Phi_s\,ds.
\]
Notice that the nonlinearity preserves the space $D(E)$. Indeed, since any component $\psi_i$ of $\Psi$ belongs to $H^1(\mathbb{R}^+)$, then $|\psi_i|^2\psi_i$ belongs to $H^1(\mathbb{R}^+)$ too, and so the energy space for the delta-prime case is preserved. Furthermore, the product preserves the continuity at zero required by the Kirchhoff and the delta case.
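By estimates (3.2), one obtains $\||\Phi_s|^2\Phi_s\|_{D(E)} \le C\|\Phi_s\|^3_{D(E)}$, so
\[
\|G\Phi\|_X \le \|\Psi_0\|_{D(E)} + C\int_0^T \|\Phi_s\|^3_{D(E)}\,ds \le \|\Psi_0\|_{D(E)} + CT\|\Phi\|^3_X. \tag{3.3}
\]
Analogously, given $\Phi, \Xi \in D(E)$,
\[
\|G\Phi - G\Xi\|_X \le CT\big(\|\Phi\|^2_X + \|\Xi\|^2_X\big)\|\Phi - \Xi\|_X. \tag{3.4}
\]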
We point out that the constant $C$ appearing in (3.3) and (3.4) is independent of $\Psi_0$, $\Phi$, and $\Xi$. Now let us restrict the map $G$ to elements $\Phi$ such that $\|\Phi\|_X \le 2\|\Psi_0\|_{D(E)}$. From (3.3) and (3.4), if $T$ is chosen to be strictly less than $(8C\|\Psi_0\|^2_{D(E)})^{-1}$, then $G$ is a contraction of the ball in $X$ of radius $2\|\Psi_0\|_{D(E)}$, and so, by the contraction lemma, there exists a unique solution to (1.2) in the time interval $[0, T)$. By a standard one-step bootstrap argument one immediately has that the solution actually belongs to $C^0([0, T), D(E))$, and due to the validity of (1.1) in the space $D(E)^\star$ we immediately have that the solution $\Psi$ actually belongs to $C^0([0, T), D(E)) \cap C^1([0, T), D(E)^\star)$.

The proof of the existence of a maximal solution is standard, while the blow-up alternative is a consequence of the fact that, whenever the $D(E)$-norm of the solution is finite, it is possible to extend it for a further time by the same contraction argument.

The next step consists in the proof of the conservation laws.

Proposition 3.2. For any solution $\Psi \in C^0([0, T), D(E)) \cap C^1([0, T), D(E)^\star)$ to the problem (1.2), the following conservation laws hold at any time $t$:
\[
\|\Psi_t\| = \|\Psi_0\|, \qquad E(\Psi_t) = E(\Psi_0),
\]
where the symbol $E$ denotes the energy functional
\[
E(\Psi_t) := \frac12 E_{\mathrm{lin}}(\Psi_t) - \frac14 \|\Psi_t\|^4_{L^4}.
\]
Here the functional $E_{\mathrm{lin}}$ coincides with $E_F$, $E_\delta^\alpha$ or $E_{\delta'}^\beta$, according to the case one considers.

Proof. The conservation of the $L^2$-norm can be immediately obtained by the validity of Eq. (1.1) in the space $D(E)^\star$:
\[
\frac{d}{dt}\|\Psi_t\|^2 = 2\operatorname{Re}\Big\langle \Psi_t, \frac{d}{dt}\Psi_t \Big\rangle = 2\operatorname{Im}\langle \Psi_t, H\Psi_t \rangle = 0
\]
by the self-adjointness of $H$. In order to prove the conservation of the energy, first we notice that $\langle \Psi_t, H\Psi_t \rangle$ is differentiable as a function of time. Indeed,
\[
\frac1h\big[\langle \Psi_{t+h}, H\Psi_{t+h} \rangle - \langle \Psi_t, H\Psi_t \rangle\big]
= \Big\langle \frac{\Psi_{t+h} - \Psi_t}{h}, H\Psi_{t+h} \Big\rangle + \Big\langle H\Psi_t, \frac{\Psi_{t+h} - \Psi_t}{h} \Big\rangle
\]
and then, passing to the limit $h \to 0$,
\[
\frac{d}{dt}(\Psi_t, H\Psi_t) = 2\operatorname{Re}\Big\langle \frac{d}{dt}\Psi_t, H\Psi_t \Big\rangle = 2\operatorname{Im}\langle |\Psi_t|^2\Psi_t, H\Psi_t \rangle, \tag{3.5}
\]
where we used the self-adjointness of $H$ and (1.1). Furthermore,
\[
\frac{d}{dt}(\Psi_t, |\Psi_t|^2\Psi_t) = \frac{d}{dt}(\Psi_t^2, \Psi_t^2) = 4\operatorname{Im}\langle |\Psi_t|^2\Psi_t, H\Psi_t \rangle. \tag{3.6}
\]
From (3.5) and (3.6), one then obtains
\[
\frac{d}{dt}E(\Psi_t) = \frac12\frac{d}{dt}\langle \Psi_t, H\Psi_t \rangle - \frac14\frac{d}{dt}(\Psi_t, |\Psi_t|^2\Psi_t)_{L^2} = 0
\]
and the proposition is proved.

Corollary 3.3. The solutions are globally defined in time.

Proof. By estimate (3.2) with $p = \infty$ and conservation of the $L^2$-norm, there exists a constant $M$, that depends on $\Psi_0$ only, such that
\[
E(\Psi_0) = E(\Psi_t) \ge \frac12\|\Psi_t'\|^2 - M\|\Psi_t'\|.
\]
Therefore a uniform (in $t$) bound on $\|\Psi_t'\|^2$ is obtained. As a consequence, one has that no blow-up in finite time can occur, and therefore, by the blow-up alternative, the solution is global in time.
4. Main Result

In this section, we describe the asymptotic dynamics of a particular initial state, which resembles a soliton for the standard cubic NLS on the line. According to Sec. 3, we use the symbol $H$ to generically denote the linear part of the evolution, regardless of the fact that we are considering the Kirchhoff, delta, or delta-prime boundary conditions. When necessary, we will distinguish between the three of them. We use the notation
\[
\phi(x) = \frac{\sqrt{2}}{\cosh x}, \qquad x \in \mathbb{R},
\]
and for any $x_0 \in \mathbb{R}$ and $v \in \mathbb{R}$ we define
\[
\phi_{x_0, v}(x, t) = e^{i\frac{v}{2}x}\, e^{-it\frac{v^2}{4}}\, e^{it}\, \phi(x - x_0 - vt), \qquad x \in \mathbb{R}, \quad t \in \mathbb{R}. \tag{4.1}
\]
The function $\phi_{x_0, v}$ represents a soliton for the cubic NLS on the line which at time $t = 0$ is centered in $x = x_0$ and has velocity $v$. Therefore, $\phi_{x_0, v}$ is the solution of the integral equation
\[
\phi_{x_0, v}(x, t) = \big[U_t\, e^{i\frac{v}{2}\cdot}\phi(\cdot - x_0)\big](x) + i\int_0^t \big[U_{t-s}\,|\phi_{x_0, v}(\cdot, s)|^2\phi_{x_0, v}(\cdot, s)\big](x)\,ds, \qquad t \in \mathbb{R}. \tag{4.2}
\]
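The soliton profile can be checked numerically. Reading (4.2) as the Duhamel form of $i\partial_t u = -\partial_x^2 u - |u|^2 u$ (an interpretation based on the kernel $U_t$, since Eq. (1.1) is not restated here), the standing wave $e^{it}\phi(x)$ is a solution exactly when $-\phi'' - \phi^3 + \phi = 0$. The sketch below, with hypothetical function names, tests this by centered finite differences and also evaluates the mass $\|\phi\|^2_{L^2(\mathbb{R})} = 4$ by the trapezoidal rule.

```python
import math


def phi(x):
    """Soliton profile phi(x) = sqrt(2) / cosh(x)."""
    return math.sqrt(2.0) / math.cosh(x)


def second_derivative(f, x, h=1e-3):
    """Centered second-order finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def stationary_residual(x):
    """Residual of -phi'' - phi^3 + phi = 0, which should vanish up to O(h^2)."""
    return -second_derivative(phi, x) - phi(x) ** 3 + phi(x)


def mass(half_width=30.0, n=6000):
    """Trapezoidal approximation of the integral of phi^2 over the real line."""
    h = 2.0 * half_width / n
    total = 0.5 * (phi(-half_width) ** 2 + phi(half_width) ** 2)
    for i in range(1, n):
        total += phi(-half_width + i * h) ** 2
    return total * h
```

The value $\|\phi\| = 2$ is the reason why rough bounds such as $\|\Xi_{t_a}\| \le 4$ are available in Sec. 5: both $\Psi_t$ and $\Phi_t$ have norm at most 2.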
Let $\chi$ be the cut off function $\chi \in C^\infty(\mathbb{R}^+)$, $\chi = 1$ in $(2, +\infty)$ and $\chi = 0$ in $(0, 1)$. For later use we define also $\chi_+ = \chi_{[0, +\infty)}$ and $\chi_- = \chi_{(-\infty, 0]}$, where $\chi_{[a, b]}$ denotes the characteristic function of the interval $[a, b]$. Moreover let $x_0$ and $v$ be two positive constants and $0 < \delta < 1$. We take as initial condition the following function
\[
\Psi_0(x) = \begin{pmatrix} \chi(x)\, e^{-i\frac{v}{2}x}\phi(x - x_0) \\ 0 \\ 0 \end{pmatrix}, \qquad x_0 \ge v^{1-\delta}, \tag{4.3}
\]
and we denote by $\Psi_{H,t}$ the solution of the equation
\[
\Psi_{H,t} = e^{-iHt}\Psi_0 + i\int_0^t ds\, e^{-iH(t-s)}\,|\Psi_{H,s}|^2\Psi_{H,s}, \qquad t \ge 0. \tag{4.4}
\]
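The choice of the vector $\Psi_0$ is used to render the idea that the initial condition is a soliton centered away from the vertex and moving towards the vertex with velocity $v$. The cut off function $\chi$ in formula (4.3) is aimed at setting $\Psi_0$ in the domain of the Hamiltonian $H$, see Sec. 2.1. Let us set $t_2 := x_0/v + v^{-\delta}$ and define the following functions:
\[
\Phi^1_{H,t_2}(x_1, x_2, x_3) \equiv \begin{pmatrix}
\tilde r_H\, e^{-i\frac{v^2}{4}t_2}\, e^{i\frac{v}{2}x_1}\, e^{it_2}\, \phi(x_1 + x_0 - vt_2) \\
\tilde r_H\, e^{-i\frac{v^2}{4}t_2}\, e^{-i\frac{v}{2}x_2}\, e^{it_2}\, \phi(x_2 - x_0 + vt_2) \\
0
\end{pmatrix}, \tag{4.5}
\]
\[
\Phi^2_{H,t_2}(x_1, x_2, x_3) \equiv \begin{pmatrix}
0 \\
\tilde t_H\, e^{-i\frac{v^2}{4}t_2}\, e^{i\frac{v}{2}x_2}\, e^{it_2}\, \phi(x_2 + x_0 - vt_2) \\
\tilde t_H\, e^{-i\frac{v^2}{4}t_2}\, e^{-i\frac{v}{2}x_3}\, e^{it_2}\, \phi(x_3 - x_0 + vt_2)
\end{pmatrix}, \tag{4.6}
\]
\[
\Phi^3_{H,t_2}(x_1, x_2, x_3) \equiv \begin{pmatrix}
\tilde t_H\, e^{-i\frac{v^2}{4}t_2}\, e^{-i\frac{v}{2}x_1}\, e^{it_2}\, \phi(x_1 - x_0 + vt_2) \\
0 \\
\tilde t_H\, e^{-i\frac{v^2}{4}t_2}\, e^{i\frac{v}{2}x_3}\, e^{it_2}\, \phi(x_3 + x_0 - vt_2)
\end{pmatrix}. \tag{4.7}
\]
They represent solitons on the line multiplied by the scattering coefficients of the linear dynamics $\tilde r_H$ and $\tilde t_H$, that, in the particular regime we consider, are defined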
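as follows:
\[
\begin{aligned}
\tilde r_{H_F} &= -\frac13, & \tilde t_{H_F} &= \frac23, \\
\tilde r_{H_\delta^{\tilde\alpha v}} &= -\frac{1 + 2i\tilde\alpha}{3 + 2i\tilde\alpha}, & \tilde t_{H_\delta^{\tilde\alpha v}} &= \frac{2}{3 + 2i\tilde\alpha}, \\
\tilde r_{H_{\delta'}^{\tilde\beta/v}} &= \frac{\tilde\beta + 2i}{\tilde\beta + 6i}, & \tilde t_{H_{\delta'}^{\tilde\beta/v}} &= -\frac{4i}{\tilde\beta + 6i}.
\end{aligned} \tag{4.8}
\]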
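As a consistency check, the values in (4.8) can be compared with (2.17) and (2.18). The following worked computation is only a sketch; it evaluates the linear coefficients at $k = v/2$ with the rescaled strengths $\alpha = \tilde\alpha v$ and $\beta = \tilde\beta/v$, and shows that the velocity $v$ cancels:
\[
\begin{aligned}
r_{H_\delta^{\tilde\alpha v}}\big(\tfrac{v}{2}\big) &= -\frac{\frac{v}{2} + i\tilde\alpha v}{\frac{3v}{2} + i\tilde\alpha v} = -\frac{1 + 2i\tilde\alpha}{3 + 2i\tilde\alpha},
& t_{H_\delta^{\tilde\alpha v}}\big(\tfrac{v}{2}\big) &= \frac{v}{\frac{3v}{2} + i\tilde\alpha v} = \frac{2}{3 + 2i\tilde\alpha}, \\
r_{H_{\delta'}^{\tilde\beta/v}}\big(\tfrac{v}{2}\big) &= \frac{\frac{\tilde\beta}{v}\frac{v}{2} + i}{\frac{\tilde\beta}{v}\frac{v}{2} + 3i} = \frac{\tilde\beta + 2i}{\tilde\beta + 6i},
& t_{H_{\delta'}^{\tilde\beta/v}}\big(\tfrac{v}{2}\big) &= -\frac{2i}{\frac{\tilde\beta}{2} + 3i} = -\frac{4i}{\tilde\beta + 6i}.
\end{aligned}
\]
In particular, with this scaling the coefficients stay fixed as $v \to \infty$, and $|\tilde r_H|^2 + 2|\tilde t_H|^2 = 1$ in all three cases.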
Remark 4.1. Notice that the coefficients $\tilde r_H$ and $\tilde t_H$ can be obtained by the scattering coefficients (2.17), (2.18), identifying $k$ with $v/2$ and replacing $\alpha$ by $\tilde\alpha v$ and $\beta$ by $\tilde\beta/v$. This is due to the fact that we implicitly considered a particle with mass equal to $1/2$, therefore the momentum $k$ is linked to the speed $v$ by $k = v/2$.

For any $t > t_2$ we define the vectors $\Phi^j_{H,t}$ as the evolution of $\Phi^j_{H,t_2}$ with the nonlinear flow generated by $H_j$, i.e. they are solutions of the equation
\[
\Phi^j_{H,t} = e^{-iH_j(t - t_2)}\Phi^j_{H,t_2} + i\int_{t_2}^t ds\, e^{-iH_j(t-s)}\,|\Phi^j_{H,s}|^2\Phi^j_{H,s}, \qquad j = 1, 2, 3. \tag{4.9}
\]
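Remark 4.2. The vectors $\Phi^j_{H,t}$ can be represented by
\[
\Phi^1_{H,t}(x_1, x_2, x_3) = \begin{pmatrix}
e^{-i\frac{v^2}{4}t_2}\, e^{it_2}\, \phi^{\mathrm{ref}}_{t-t_2}(x_1) \\
e^{-i\frac{v^2}{4}t_2}\, e^{it_2}\, \phi^{\mathrm{ref}}_{t-t_2}(-x_2) \\
0
\end{pmatrix}, \qquad
\Phi^2_{H,t}(x_1, x_2, x_3) = \begin{pmatrix}
0 \\
e^{-i\frac{v^2}{4}t_2}\, e^{it_2}\, \phi^{\mathrm{tr}}_{t-t_2}(x_2) \\
e^{-i\frac{v^2}{4}t_2}\, e^{it_2}\, \phi^{\mathrm{tr}}_{t-t_2}(-x_3)
\end{pmatrix}, \tag{4.10}
\]
\[
\Phi^3_{H,t}(x_1, x_2, x_3) = \begin{pmatrix}
e^{-i\frac{v^2}{4}t_2}\, e^{it_2}\, \phi^{\mathrm{tr}}_{t-t_2}(-x_1) \\
0 \\
e^{-i\frac{v^2}{4}t_2}\, e^{it_2}\, \phi^{\mathrm{tr}}_{t-t_2}(x_3)
\end{pmatrix}, \tag{4.11}
\]
where, for any $t \ge 0$, the functions $\phi^{\mathrm{ref}}_t$ and $\phi^{\mathrm{tr}}_t$ are the solutions to the following NLS on the line
\[
\phi^{\mathrm{ref}}_t(x) = \tilde r_H\int_{-\infty}^{\infty} U_t(x - y)\, e^{i\frac{v}{2}y}\phi(y - v^{1-\delta})\,dy + i\int_0^t ds\int_{-\infty}^{\infty} U_{t-s}(x - y)\,|\phi^{\mathrm{ref}}_s(y)|^2\phi^{\mathrm{ref}}_s(y)\,dy, \tag{4.12}
\]
\[
\phi^{\mathrm{tr}}_t(x) = \tilde t_H\int_{-\infty}^{\infty} U_t(x - y)\, e^{i\frac{v}{2}y}\phi(y - v^{1-\delta})\,dy + i\int_0^t ds\int_{-\infty}^{\infty} U_{t-s}(x - y)\,|\phi^{\mathrm{tr}}_s(y)|^2\phi^{\mathrm{tr}}_s(y)\,dy. \tag{4.13}
\]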
Our main result is summarized in the following theorem:

Theorem 4.3. Fixed $\tilde\alpha, \tilde\beta > 0$, let $H$ be any of the self-adjoint operators $H_F$, $H_\delta^{\tilde\alpha v}$, $H_{\delta'}^{\tilde\beta/v}$ acting on $L^2(G)$, where $G$ is the three-edge star graph, and defined by (2.1)–(2.4). Call $\Psi_t$ the unique, global solution to the Cauchy problem (4.4) with initial data (4.3). Then, there exist $\tau_* > 0$ and $T_* > 0$ such that for $0 < T < T_*$ one has
\[
\|\Psi_t - \Phi^1_t - \Phi^2_t - \Phi^3_t\|_{L^2_x(G)} \le C v^{-\frac{T_* - T}{\tau_*}} \tag{4.14}
\]
for any time $t$ in the interval $x_0/v + v^{1-\delta} < t < x_0/v + v^{1-\delta} + T\ln v$. In (4.14), $C$ is a positive constant independent of $t$ and $v$, the functions $\Phi^j_t$ are defined by formulas (1.4), and $\tilde t$ and $\tilde r$, given in (4.8), are the scattering coefficients associated to $H$.

The proof of the theorem will be broken into three steps or, equivalently, we break the time evolution of $\Psi_{H,t}$ into three phases.

Remark 4.4. A further consequence of Theorem 4.3, as in the case of [17], is the fact that fast solitons have reflection and transmission coefficients which, up to negligible corrections, coincide with the corresponding coefficients of the linear graph. For example, a definition of the transmission coefficients along the branches $j = 2, 3$ could be given considering the ratio between the amount of mass on the $j$-edge and the total mass, in the limit $t \to \infty$:
\[
\lim_{t \to \infty}\frac{\|\Psi^j_t\|}{\|\Psi_t\|} \equiv |t_j(v)|,
\]
where $\Psi^j$ denotes the restriction of the solution to the $j$-edge. In our case we do not have at our disposal the rigorous asymptotics for $t \to \infty$; nevertheless, we can obtain a weaker result. We have the results of Theorem 1.1, which give, in the time interval $t_2 < t < t_2 + T\ln v$, the estimate
\[
\frac{\|\Psi^j_t\|}{\|\Psi_t\|} = |\tilde t_j| + O(v^{-\sigma})
\]
for a certain $\sigma > 0$ and where $\tilde t_j$ is the scattering coefficient of the linear Hamiltonian which describes the vertex. So, in the limit of fast solitons, i.e. $v \to \infty$, one can assert that the ratio which defines the nonlinear scattering coefficient converges to the corresponding linear scattering coefficient. And analogously for the case of the reflection coefficient $r(v)$, i.e. $j = 1$, one has (for $t > t_2$, as before)
\[
\lim_{v \to \infty}\frac{\|\Psi^1_t\|}{\|\Psi_t\|} = |\tilde r|.
\]
This is true for every coupling between the ones considered, i.e. Kirchhoff, $\delta$ or $\delta'$.
5. Proof of Theorem 4.3

In the proof, we drop the subscript $H$. When convenient, we specify the particular Hamiltonian operator we refer to. For any $t \in \mathbb{R}$, we introduce the soliton
\[
\Phi_t(x_1, x_2, x_3) = \begin{pmatrix}
e^{-i\frac{v^2}{4}t}\, e^{-i\frac{v}{2}x_1}\, e^{it}\, \phi(x_1 - x_0 + vt) \\
e^{-i\frac{v^2}{4}t}\, e^{i\frac{v}{2}x_2}\, e^{it}\, \phi(x_2 + x_0 - vt) \\
0
\end{pmatrix}. \tag{5.1}
\]
Then, the function $\Phi_t$ satisfies the equation
\[
\Phi_t = e^{-iH_1 t}\Phi_0 + i\int_0^t ds\, e^{-iH_1(t-s)}\,|\Phi_s|^2\Phi_s, \qquad t \ge 0. \tag{5.2}
\]
5.1. Phase 1

We call "phase 1" the dynamics in the time interval $(0, t_1)$ with $t_1 = \frac{x_0}{v} - v^{-\delta}$. In this interval we approximate the solution by the soliton (5.1). The content of this subsection is the estimate of the error due to such an approximation, that is contained in Proposition 5.3. Before proving it, we need two lemmas.

Lemma 5.1. Given $0 \le t_a \le t_b \le t_1$, for the functions
\[
\begin{aligned}
K_1(t, x) :={}& \int_0^{\infty} U_{t-t_a}(x + y)\, e^{-i\frac{v}{2}y}\phi(y - x_0)\,dy \\
&+ i\int_{t_a}^t ds\int_0^{\infty} U_{t-s}(x + y)\, e^{-i\frac{v}{2}y}\, e^{-is\frac{v^2}{4}}\, e^{is}\, \phi^3(y - x_0 + vs)\,dy, \\
K_2(t, x) :={}& \int_0^{\infty} U_{t-t_a}(x + y)\, e^{i\frac{v}{2}y}\phi(y + x_0)\,dy \\
&+ i\int_{t_a}^t ds\int_0^{\infty} U_{t-s}(x + y)\, e^{i\frac{v}{2}y}\, e^{-is\frac{v^2}{4}}\, e^{is}\, \phi^3(y + x_0 - vs)\,dy,
\end{aligned} \tag{5.3}
\]
the following estimate holds:
\[
\|K_i\|_{X_{t_a, t_b}(\mathbb{R}^+)} \le C e^{-x_0 + vt_b}, \qquad i = 1, 2,
\]
where $X_{t_a, t_b}(\mathbb{R}^+) := L^\infty_{[t_a, t_b]} L^2(\mathbb{R}^+) \cap L^6_{[t_a, t_b]} L^6(\mathbb{R}^+)$.
Proof. Let us start with $K_1$. Adding and subtracting a contribution to negative values of $y$ one can write
\[
\begin{aligned}
K_1(t, x) ={}& \int_{-\infty}^{\infty} U_{t-t_a}(x + y)\, e^{-i\frac{v}{2}y}\phi(y - x_0)\,dy \\
&+ i\int_{t_a}^t ds\int_{-\infty}^{\infty} U_{t-s}(x + y)\, e^{-i\frac{v}{2}y}\, e^{-is\frac{v^2}{4}}\, e^{is}\, \phi^3(y - x_0 + vs)\,dy \\
&- \int_{-\infty}^{0} U_{t-t_a}(x + y)\, e^{-i\frac{v}{2}y}\phi(y - x_0)\,dy \\
&- i\int_{t_a}^t ds\int_{-\infty}^{0} U_{t-s}(x + y)\, e^{-i\frac{v}{2}y}\, e^{-is\frac{v^2}{4}}\, e^{is}\, \phi^3(y - x_0 + vs)\,dy \\
={}& e^{i\frac{v}{2}x}\, e^{-it\frac{v^2}{4}}\, e^{it}\, \phi(x + x_0 - vt) - \int_{-\infty}^{0} U_{t-t_a}(x + y)\, e^{-i\frac{v}{2}y}\phi(y - x_0)\,dy \\
&- i\int_{t_a}^t ds\int_{-\infty}^{0} U_{t-s}(x + y)\, e^{-i\frac{v}{2}y}\, e^{-is\frac{v^2}{4}}\, e^{is}\, \phi^3(y - x_0 + vs)\,dy,
\end{aligned} \tag{5.4}
\]
where we used the integral equation (4.2). By a straightforward computation, the $X_{t_a, t_b}(\mathbb{R}^+)$-norm of the first term can be bounded by $Ce^{-(x_0 - vt_b)}$. To evaluate the size of the second term, let us write it as follows:
\[
-\int_0^{\infty} U_{t-t_a}(x - y)\, e^{i\frac{v}{2}y}\phi(y + x_0)\,dy = -\big[U^-_{t-t_a}\, e^{i\frac{v}{2}\cdot}\phi(\cdot + x_0)\big](x) = -\big[U_{t-t_a}\,\chi_+\, e^{i\frac{v}{2}\cdot}\phi(\cdot + x_0)\big](x), \qquad x > 0. \tag{5.5}
\]
Using the one-dimensional homogeneous Strichartz's estimates for $U_{t-t_a}$, namely, the analogue of (2.21) for functions of the half line, we can estimate the $X_{t_a, t_b}(\mathbb{R}^+)$-norm of this term as
\[
\big\|U_{t-t_a}\,\chi_+\, e^{i\frac{v}{2}\cdot}\phi(\cdot + x_0)\big\|_{X_{t_a, t_b}(\mathbb{R})} \le C\big\|\chi_+\, e^{i\frac{v}{2}\cdot}\phi(\cdot + x_0)\big\| \le C e^{-x_0},
\]
where we used the notation $X_{t_a, t_b}(\mathbb{R}) := L^\infty_{[t_a, t_b]} L^2(\mathbb{R}) \cap L^6_{[t_a, t_b]} L^6(\mathbb{R})$. The norm of the last term on the right-hand side of Eq. (5.4) can be estimated in a similar way by
\[
\Big\|\int_{t_a}^{\cdot}\big[U_{\cdot - s}\,|\phi_{-x_0, v}(s)|^2\phi_{-x_0, v}(s)\big]\,ds\Big\|_{X_{t_a, t_b}(\mathbb{R}^+)} \le C\|\phi^3_{-x_0, v}\|_{L^1_{[t_a, t_b]} L^2(\mathbb{R}^+)} \le C\,\frac{1}{v}\, e^{-x_0 + vt_b}. \tag{5.6}
\]
Therefore, from (5.4)–(5.6) we get $\|K_1\|_{X_{t_a, t_b}(\mathbb{R}^+)} \le C e^{-x_0 + vt_b}$. To estimate $K_2$, the first term in its definition (5.3) can be treated as in (5.5), while the second is estimated following the line of (5.6).

Lemma 5.2. Given $0 \le t_a \le t_b \le t_1$, let $a$ and $b$ be two strictly positive numbers, with
\[
b \le \frac{1}{8a^2 + 4a}.
\]
Moreover, let $y$ be a real, continuous function such that $0 \le y(t_a) \le a$, and
\[
0 \le y(t) \le a + b\,y^2(t) + b\,y^3(t), \qquad \text{for any } t \in [t_a, t_b]. \tag{5.7}
\]
Then,
\[
\max_{t \in [t_a, t_b]} y(t) \le 2a.
\]
Proof. Consider the function $f_b(x) := bx^3 + bx^2 - x + a$. Denoted $\bar b := \frac{1}{8a^2 + 4a}$, one has $f_{\bar b}(2a) = 0$. If $b \le \bar b$, then $f_b(x) \le f_{\bar b}(x)$ for any $x > 0$. Besides, notice that $f_b(0) = a > 0$, then there must be a point $\tilde x \in (0, 2a]$ s.t. $f_b(\tilde x) = 0$. Finally, since the function $y$ is continuous, in order to satisfy the constraint (5.7) one must have
\[
y(t) \le \tilde x \le 2a, \qquad \text{for any } t \in [t_a, t_b].
\]
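The two algebraic facts used in the proof of Lemma 5.2 can be verified with exact rational arithmetic. The sketch below uses hypothetical function names; it checks that $f_{\bar b}(2a) = 0$, that $f_b(2a) \le 0$ whenever $0 < b \le \bar b$, and that $f_b(0) = a > 0$, so that a sign change occurs in $(0, 2a]$.

```python
from fractions import Fraction


def b_bar(a):
    """Threshold value of b in Lemma 5.2: 1 / (8 a^2 + 4 a)."""
    a = Fraction(a)
    return 1 / (8 * a * a + 4 * a)


def f(b, x, a):
    """The cubic f_b(x) = b x^3 + b x^2 - x + a from the proof."""
    b, x, a = Fraction(b), Fraction(x), Fraction(a)
    return b * x ** 3 + b * x ** 2 - x + a


def has_barrier_in_interval(a, b):
    """True when f_b(0) > 0 and f_b(2a) <= 0, i.e. f_b vanishes in (0, 2a]."""
    return f(b, 0, a) > 0 and f(b, 2 * Fraction(a), a) <= 0
```

A continuous function $y$ that starts below $a$ and satisfies (5.7), i.e. $f_b(y(t)) \ge 0$, cannot cross the first zero $\tilde x$ of $f_b$, which is the continuity argument of the proof.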
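Proposition 5.3. Let $\Psi_t$ be the solution of Eq. (4.4), and $\Phi_t$ be the solution of Eq. (5.2). There exists $C > 0$, independent of $t$ and $v$, such that
\[
\|\Psi_t - \Phi_t\| \le C e^{-v^{1-\delta}} \tag{5.8}
\]
for any $t \in [0, t_1]$.

Proof. Let us define $\Xi_t := \Psi_t - \Phi_t$, and fix $t_a \in [0, t_1]$. Then, from Eqs. (4.4) and (5.2), we have
\[
\begin{aligned}
\Xi_t ={}& e^{-iH(t - t_a)}\Xi_{t_a} + \big(e^{-iH(t - t_a)} - e^{-iH_1(t - t_a)}\big)\Phi_{t_a} + i\int_{t_a}^t \big(e^{-iH(t-s)} - e^{-iH_1(t-s)}\big)|\Phi_s|^2\Phi_s\,ds \\
&+ i\int_{t_a}^t e^{-iH(t-s)}\big[|\Xi_s|^2\Xi_s + |\Xi_s|^2\Phi_s + |\Phi_s|^2\Xi_s + 2\operatorname{Re}(\overline{\Xi_s}\Phi_s)\Xi_s + 2\operatorname{Re}(\overline{\Xi_s}\Phi_s)\Phi_s\big]\,ds \\
={}& e^{-iH(t - t_a)}\Xi_{t_a} + F(t_a, t) \\
&+ i\int_{t_a}^t e^{-iH(t-s)}\big[|\Xi_s|^2\Xi_s + |\Xi_s|^2\Phi_s + |\Phi_s|^2\Xi_s + 2\operatorname{Re}(\overline{\Xi_s}\Phi_s)\Xi_s + 2\operatorname{Re}(\overline{\Xi_s}\Phi_s)\Phi_s\big]\,ds,
\end{aligned}
\]
where we defined
\[
F(t_a, t) := \big(e^{-iH(t - t_a)} - e^{-iH_1(t - t_a)}\big)\Phi_{t_a} + i\int_{t_a}^t \big(e^{-iH(t-s)} - e^{-iH_1(t-s)}\big)|\Phi_s|^2\Phi_s\,ds.
\]
Let us fix $t_b \in [t_a, t_1]$, and denote $X_{t_a, t_b} = L^\infty_{[t_a, t_b]} L^2 \cap L^6_{[t_a, t_b]} L^6$. Then
\[
\begin{aligned}
\|\Xi\|_{X_{t_a, t_b}} \le{}& \|e^{-iH(\cdot - t_a)}\Xi_{t_a}\|_{X_{t_a, t_b}} + \|F(t_a, \cdot)\|_{X_{t_a, t_b}} \\
&+ \Big\|\int_{t_a}^{\cdot} e^{-iH(\cdot - s)}\big[|\Xi_s|^2\Xi_s + |\Xi_s|^2\Phi_s + |\Phi_s|^2\Xi_s + 2\operatorname{Re}(\overline{\Xi_s}\Phi_s)\Xi_s + 2\operatorname{Re}(\overline{\Xi_s}\Phi_s)\Phi_s\big]\,ds\Big\|_{X_{t_a, t_b}}.
\end{aligned} \tag{5.9}
\]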
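Using (2.21) the first term on the right-hand side can be estimated as
\[
\|e^{-iH(\cdot - t_a)}\Xi_{t_a}\|_{X_{t_a, t_b}} \le C\|\Xi_{t_a}\|.
\]
We estimate the integral term on the right-hand side of (5.9) also using Strichartz's estimates. Let us analyze in detail the cubic term. Since both pairs of indices $(\infty, 2)$ and $(6, 6)$ fulfill (2.23), in (2.22) we can choose $q = k = 6/5$ and obtain
\[
\Big\|\int_{t_a}^{\cdot} e^{-iH(\cdot - s)}\,|\Xi_s|^2\Xi_s\,ds\Big\|_{X_{t_a, t_b}} \le C\,\big\||\Xi_\cdot|^2\Xi_\cdot\big\|_{L^{6/5}_{[t_a, t_b]} L^{6/5}}. \tag{5.10}
\]
Moreover, by standard Hölder estimates,
\[
\begin{aligned}
\big\||\Xi_\cdot|^2\Xi_\cdot\big\|_{L^{6/5}_{[t_a, t_b]} L^{6/5}}
&\le \big\|\,\|\Xi_\cdot\|^2_{L^6}\,\|\Xi_\cdot\|_{L^2}\big\|_{L^{6/5}_{[t_a, t_b]}}
\le \|\Xi\|^2_{L^6_{[t_a, t_b]} L^6}\,\|\Xi\|_{L^2_{[t_a, t_b]} L^2} \\
&\le (t_b - t_a)^{\frac12}\,\|\Xi\|^2_{L^6_{[t_a, t_b]} L^6}\,\|\Xi\|_{L^\infty_{[t_a, t_b]} L^2}.
\end{aligned} \tag{5.11}
\]
Then, by (5.10) and (5.11),
\[
\Big\|\int_{t_a}^{\cdot} e^{-iH(\cdot - s)}\,|\Xi_s|^2\Xi_s\,ds\Big\|_{X_{t_a, t_b}} \le C(t_b - t_a)^{1/2}\,\|\Xi\|^3_{X_{t_a, t_b}}. \tag{5.12}
\]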
Notice that the constant $C$ can be chosen independently of $t_a$, $t_b$, and of the boundary condition at the vertex. The other terms in the integral on the right-hand side of (5.9) can be estimated analogously. One ends up with
\[
\|\Xi\|_{X_{t_a, t_b}} \le C\|\Xi_{t_a}\| + \|F(t_a, \cdot)\|_{X_{t_a, t_b}} + C(t_b - t_a)^{\frac12}\|\Xi\|^3_{X_{t_a, t_b}} + C(t_b - t_a)^{\frac23}\|\Xi\|^2_{X_{t_a, t_b}} + C(t_b - t_a)^{\frac56}\|\Xi\|_{X_{t_a, t_b}},
\]
where the arising norms of $\Phi$ were absorbed in the constant $C$. If $t_b$ and $t_a$ are sufficiently close, then $C(t_b - t_a)^{\frac56} < 1/2$. Furthermore, since the quantity $t_b - t_a$ is upper bounded, one can estimate $(t_b - t_a)^{\frac23}$ by $C(t_b - t_a)^{\frac12}$. So, for some $\tilde C > 0$,
\[
\|\Xi\|_{X_{t_a, t_b}} \le \tilde C\|\Xi_{t_a}\| + \tilde C\|F(t_a, \cdot)\|_{X_{t_a, t_b}} + \tilde C(t_b - t_a)^{\frac12}\big[\|\Xi\|^3_{X_{t_a, t_b}} + \|\Xi\|^2_{X_{t_a, t_b}}\big]. \tag{5.13}
\]
Applying Lemma 5.2 to the function $y(t) = \|\Xi\|_{X_{t_a, t}}$, which is continuous and monotone, one has that, if
\[
t_b - t_a \le \Big(8\tilde C^3\big(\|\Xi_{t_a}\| + \|F(t_a, \cdot)\|_{X_{t_a, t_b}}\big)^2 + 4\tilde C^2\big(\|\Xi_{t_a}\| + \|F(t_a, \cdot)\|_{X_{t_a, t_b}}\big)\Big)^{-2},
\]
then $\|\Xi\|_{X_{t_a, t_b}} \le 2\tilde C\|\Xi_{t_a}\| + 2\tilde C\|F(t_a, \cdot)\|_{X_{t_a, t_b}}$. From the immediate estimates
\[
\|\Xi_{t_a}\| \le 4, \qquad \|F(t_a, \cdot)\|_{X_{t_a, t_b}} \le \|F(0, \cdot)\|_{X_{0, t_1}},
\]
if one denotes
\[
\tau := \Big(8\tilde C^3\big(4 + \|F(t_a, \cdot)\|_{X_{0, t_1}}\big)^2 + 4\tilde C^2\big(4 + \|F(t_a, \cdot)\|_{X_{0, t_1}}\big)\Big)^{-2},
\]
then for any $t \in [0, t_1)$
\[
\|\Xi\|_{X_{t, t+\tau}} \le 2\tilde C\|\Xi_t\| + 2\tilde C\|F(t, \cdot)\|_{X_{t, t+\tau}}.
\]
We divide the interval $[0, t_1]$ in $N + 1$ subintervals as follows
\[
[0, t_1] = \bigcup_{j=0}^{N-1}[j\tau, (j + 1)\tau] \cup [N\tau, t_1],
\]
where
\[
N := \Big[\frac{x_0 - v^{1-\delta}}{v\tau}\Big], \qquad [\cdot] = \text{integer part}.
\]
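Making use of Lemma 5.2, and noting that $\|\Xi_{(j+1)\tau}\| \le \|\Xi\|_{X_{j\tau, (j+1)\tau}}$, one proves by induction that
\[
\begin{aligned}
\|\Xi\|_{X_{j\tau, (j+1)\tau}} &\le (2\tilde C)^{j+1}\|\Xi_0\| + \sum_{k=0}^{j}(2\tilde C)^{j+1-k}\,\|F(k\tau, \cdot)\|_{X_{k\tau, (k+1)\tau}}, \qquad j = 0, \ldots, N - 1, \\
\|\Xi\|_{X_{N\tau, t_1}} &\le (2\tilde C)^{N+1}\|\Xi_0\| + \sum_{k=0}^{N-1}(2\tilde C)^{N+1-k}\,\|F(k\tau, \cdot)\|_{X_{k\tau, (k+1)\tau}} + 2\tilde C\,\|F(N\tau, \cdot)\|_{X_{N\tau, t_1}},
\end{aligned} \tag{5.14}
\]
where the last inequality comes from the fact that $t_1 - N\tau \le \tau$, so Lemma 5.2 applies to this last step too. The norm of $\Xi$ as a function of the whole time interval $[0, t_1]$ can be estimated by
\[
\|\Xi\|_{X_{0, t_1}} \le \sum_{j=0}^{N-1}\|\Xi\|_{X_{j\tau, (j+1)\tau}} + \|\Xi\|_{X_{N\tau, t_1}}
\le \sum_{j=0}^{N}(2\tilde C)^{j+1}\|\Xi_0\| + \sum_{j=0}^{N}\sum_{k=0}^{j}(2\tilde C)^{j+1-k}\,\|F(k\tau, \cdot)\|_{X_{k\tau, \min\{t_1, (k+1)\tau\}}}. \tag{5.15}
\]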
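The induction behind (5.14) amounts to unrolling a linear recursion. The sketch below (hypothetical names; `c` stands for $\tilde C$, `xi0` for $\|\Xi_0\|$ and `forcing[k]` for $\|F(k\tau, \cdot)\|_{X_{k\tau, (k+1)\tau}}$) iterates the one-step bound $B_j = 2\tilde C B_{j-1} + 2\tilde C F_j$, with $B_{-1} = \|\Xi_0\|$, and compares it with the closed form on the right-hand side of (5.14) using exact fractions.

```python
from fractions import Fraction


def iterate_bounds(c, xi0, forcing):
    """Apply B_j = 2c * B_{j-1} + 2c * F_j step by step, starting from B_{-1} = xi0."""
    bounds = []
    previous = Fraction(xi0)
    for f_j in forcing:
        previous = 2 * Fraction(c) * previous + 2 * Fraction(c) * Fraction(f_j)
        bounds.append(previous)
    return bounds


def closed_form(c, xi0, forcing, j):
    """Right-hand side of (5.14): (2c)^(j+1) xi0 + sum_{k<=j} (2c)^(j+1-k) F_k."""
    two_c = 2 * Fraction(c)
    total = two_c ** (j + 1) * Fraction(xi0)
    for k in range(j + 1):
        total += two_c ** (j + 1 - k) * Fraction(forcing[k])
    return total
```

Summing the closed form over $j$ gives the double sum in (5.15); the later estimate (5.20) then bounds each geometric sum by twice its largest term.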
In order to prove the theorem using (5.15), we need more precise estimates for $\|\Xi_0\|$ and $\|F(j\tau, \cdot)\|_{X_{j\tau, (j+1)\tau}}$. First,
\[
\|\Xi_0\|^2 \le \int_0^2 \phi^2(x - x_0)\,dx + \int_0^{\infty} \phi^2(x + x_0)\,dx = 2\big(1 - \tanh(x_0 - 2)\big) \le C e^{-2x_0}. \tag{5.16}
\]
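The closed form in (5.16) follows from $\int \phi^2 = 2\tanh$ and can be cross-checked by quadrature. The sketch below is an illustration with hypothetical names; it truncates the half-line integral at a finite cutoff (an assumption that is harmless because $\phi^2$ decays exponentially) and uses the elementary bound $1 - \tanh z \le 2e^{-2z}$, which gives the explicit constant $C = 4e^{4}$.

```python
import math


def phi_squared(x):
    """phi(x)^2 = 2 / cosh(x)^2 for the soliton profile phi."""
    return 2.0 / math.cosh(x) ** 2


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def xi0_bound_numeric(x0, cutoff=40.0):
    """Quadrature value of the two integrals on the left of the equality in (5.16)."""
    first = simpson(lambda x: phi_squared(x - x0), 0.0, 2.0, 2000)
    second = simpson(lambda x: phi_squared(x + x0), 0.0, cutoff, 8000)
    return first + second


def xi0_bound_closed(x0):
    """Closed form 2 (1 - tanh(x0 - 2)) from (5.16)."""
    return 2.0 * (1.0 - math.tanh(x0 - 2.0))


def exponential_bound(x0):
    """Explicit version of C exp(-2 x0) with C = 4 exp(4)."""
    return 4.0 * math.exp(4.0) * math.exp(-2.0 * x0)
```

Since $x_0 \ge v^{1-\delta}$, this bound on $\|\Xi_0\|$ is exponentially small in $v^{1-\delta}$, in line with (5.8).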
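To estimate $\|F(t_a, \cdot)\|_{X_{t_a, t_b}}$ we specialize $F$ to the three cases under analysis. From the explicit propagators (2.8)–(2.10), we get
\[
F_F(t_a, t) = U^+_{t - t_a}\,\frac13\begin{pmatrix} -1 & -1 & 2 \\ -1 & -1 & 2 \\ 2 & 2 & 2 \end{pmatrix}\Phi_{t_a} + i\int_{t_a}^t U^+_{t-s}\,\frac13\begin{pmatrix} -1 & -1 & 2 \\ -1 & -1 & 2 \\ 2 & 2 & 2 \end{pmatrix}|\Phi_s|^2\Phi_s\,ds,
\]
\[
F_\delta^{\tilde\alpha v}(t_a, t) = F_F(t_a, t) - \frac29\tilde\alpha v\int_0^{+\infty} du\, e^{-\frac{\tilde\alpha}{3}uv}\,\big(U^+_{t - t_a}J\Phi_{t_a}\big)(\cdot + u) - i\,\frac29\tilde\alpha v\int_{t_a}^t ds\int_0^{+\infty} du\, e^{-\frac{\tilde\alpha}{3}uv}\,\big(U^+_{t-s}J|\Phi_s|^2\Phi_s\big)(\cdot + u),
\]
\[
\begin{aligned}
F_{\delta'}^{\tilde\beta/v}(t_a, t) ={}& U^+_{t - t_a}\begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}\Phi_{t_a} + i\int_{t_a}^t U^+_{t-s}\begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}|\Phi_s|^2\Phi_s\,ds \\
&- \frac{2v}{\tilde\beta}\int_0^{+\infty} du\, e^{-\frac{3}{\tilde\beta}vu}\,\big(U^+_{t - t_a}J\Phi_{t_a}\big)(\cdot + u) - i\,\frac{2v}{\tilde\beta}\int_{t_a}^t ds\int_0^{+\infty} du\, e^{-\frac{3}{\tilde\beta}vu}\,\big(U^+_{t-s}J|\Phi_s|^2\Phi_s\big)(\cdot + u).
\end{aligned}
\]
It is immediately seen that
\[
F_F(t_a, t, x) = \frac13\begin{pmatrix} -K_1(x, t) - K_2(x, t) \\ -K_1(x, t) - K_2(x, t) \\ 2K_1(x, t) + 2K_2(x, t) \end{pmatrix},
\]
where $K_1$ and $K_2$ were defined in (5.3). Lemma 5.1 yields
\[
\|F_F(t_a, \cdot)\|_{X_{t_a, t_b}} \le C e^{-x_0 + vt_b}. \tag{5.17}
\]
Furthermore, since
\[
F_\delta^{\tilde\alpha v}(t_a, t, x) = F_F(t_a, t, x) - \frac29\tilde\alpha v\int_0^{+\infty} du\, e^{-\frac{\tilde\alpha}{3}uv}\begin{pmatrix} K_1(x + u, t) + K_2(x + u, t) \\ K_1(x + u, t) + K_2(x + u, t) \\ K_1(x + u, t) + K_2(x + u, t) \end{pmatrix},
\]
after the change of variable $u \to uv$ we conclude
\[
\|F_\delta^{\tilde\alpha v}(t_a, \cdot)\|_{X_{t_a, t_b}} \le C e^{-x_0 + vt_b}. \tag{5.18}
\]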
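Finally,
\[
F_{\delta'}^{\tilde\beta/v}(t_a, t, x) = \begin{pmatrix} K_1(x, t) - K_2(x, t) \\ -K_1(x, t) + K_2(x, t) \\ 0 \end{pmatrix} - \frac{2v}{\tilde\beta}\int_0^{+\infty} du\, e^{-\frac{3}{\tilde\beta}uv}\begin{pmatrix} K_1(x + u, t) + K_2(x + u, t) \\ K_1(x + u, t) + K_2(x + u, t) \\ K_1(x + u, t) + K_2(x + u, t) \end{pmatrix}
\]
yields, after the change of variable $u \to uv$,
\[
\|F_{\delta'}^{\tilde\beta/v}(t_a, \cdot)\|_{X_{t_a, t_b}} \le C e^{-x_0 + vt_b}. \tag{5.19}
\]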
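Now we go back to estimate (5.15). Due to (5.16)–(5.19), and estimating any geometric sum by the double of its largest term, which is justified if the rate of the sum is not less than two, we get
\[
\begin{aligned}
\|\Xi\|_{X_{0, t_1}} \le{}& 2\tilde C C e^{-x_0}\sum_{j=0}^{N}(2\tilde C)^j + 2\tilde C C e^{-x_0 + v\tau}\sum_{j=0}^{N-1}(2\tilde C)^j\sum_{k=0}^{j}\Big(\frac{e^{v\tau}}{2\tilde C}\Big)^k \\
&+ C e^{-x_0 + v\tau}(2\tilde C)^{N+1}\sum_{k=0}^{N-1}\Big(\frac{e^{v\tau}}{2\tilde C}\Big)^k + 2\tilde C C e^{-x_0 + vt_1} \\
\le{}& 2C(2\tilde C)^{N+1} e^{-x_0} + 4\tilde C C e^{-x_0 + v\tau}\sum_{j=0}^{N-1} e^{jv\tau} + 8\tilde C^2 C e^{-x_0 + Nv\tau} + 2\tilde C C e^{-x_0 + vt_1} \\
\le{}& 2C(2\tilde C)^{N+1} e^{-x_0} + 8\tilde C C e^{-x_0 + Nv\tau} + 8\tilde C^2 C e^{-x_0 + Nv\tau} + 2\tilde C C e^{-x_0 + vt_1}.
\end{aligned} \tag{5.20}
\]
Concerning the first term on the right-hand side of (5.20), we have
\[
(2\tilde C)^N e^{-x_0} \le \Big(\frac{(2\tilde C)^{\frac{1}{v\tau}}}{e}\Big)^{x_0 - v^{1-\delta}} e^{-v^{1-\delta}} \le C e^{-v^{1-\delta}}. \tag{5.21}
\]
From (5.20) and (5.21), we get $\|\Xi\|_{X_{0, t_1}} \le C e^{-v^{1-\delta}}$, so (5.8) follows and the proof is concluded.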
5.2. Phase 2

We call "phase 2" the evolution of the system in the time interval $(t_1, t_2)$ with $t_2 = \frac{x_0}{v} + v^{-\delta}$. Let us define the vector
\[
\Phi^S_t := \Phi^{S,\mathrm{in}}_t + \Phi^{S,\mathrm{out}}_t
\]
Fig. 2. We represent the state $\Phi^S_t$ for $t = t_1$ and $t = t_2$. The continuous lines stand for the real graph, the dashed lines represent the extension of each edge to negative values of $x$. At time $t = t_1$ only the vector $\Phi^{S,\mathrm{in}}$ is significantly supported on the real graph. At time $t = t_2$ only the state $\Phi^{S,\mathrm{out}}$ has relevant support on the real graph, while the body of the soliton in $\Phi^{S,\mathrm{in}}$ has moved to the negative axis associated with the first edge.

with
\[
\Phi^{S,\mathrm{in}}_t := \begin{pmatrix} \phi_{x_0, -v}(t) \\ 0 \\ 0 \end{pmatrix}, \qquad
\Phi^{S,\mathrm{out}}_t := \begin{pmatrix} \tilde r\,\phi_{-x_0, v}(t) \\ \tilde t\,\phi_{-x_0, v}(t) \\ \tilde t\,\phi_{-x_0, v}(t) \end{pmatrix},
\]
where the function $\phi_{x_0, v}$ was defined in Eq. (4.1) and the reflection and transmission coefficients, $\tilde r$ and $\tilde t$, must be chosen according to the Hamiltonian $H$ taken in Eq. (4.4). The explicit expressions of $\tilde r$ and $\tilde t$ in all the cases $H = H_F$, $H_\delta^{\tilde\alpha v}$, $H_{\delta'}^{\tilde\beta/v}$ can be read in formula (4.8).

Proposition 5.4. Let $t \in (t_1, t_2)$; then there exists $v_0 > 0$ such that for all $v > v_0$
\[
\|\Psi_t - \Phi^S_t\| \le C_1 v^{-\frac{\delta}{2}}, \tag{5.22}
\]
moreover
\[
\|\Psi_{t_2} - \Phi^{S,\mathrm{out}}_{t_2}\| \le C_2 v^{-\frac{\delta}{2}}, \tag{5.23}
\]
where $C_1$ and $C_2$ are positive constants which do not depend on $t$ and $v$.

Proof. From the definition of $\Psi_t$, see Eq. (4.4), we have
\[
\Psi_t = e^{-iH(t - t_1)}\Psi_{t_1} + i\int_{t_1}^t ds\, e^{-iH(t-s)}\,|\Psi_s|^2\Psi_s.
\]
We start with the trivial estimate
\[
\|\Psi_t - \Phi^S_t\| \le \|\Psi_t - e^{-iH(t - t_1)}\Psi_{t_1}\| + \|e^{-iH(t - t_1)}\Psi_{t_1} - e^{-iH(t - t_1)}\Phi^{S,\mathrm{in}}_{t_1}\| + \|e^{-iH(t - t_1)}\Phi^{S,\mathrm{in}}_{t_1} - \Phi^S_t\| \tag{5.24}
\]
and estimate the right-hand side term by term. The estimates involved in the analysis of the first term are similar to the ones used in the previous proposition,
thus we omit the details. Similarly to what was done above we set $X_{t_1,t_2} = L^\infty_{[t_1,t_2]}L^2 \cap L^6_{[t_1,t_2]}L^6$; then by Strichartz estimates (see Eq. (5.12) and Proposition 2.3)
\[
\|\Psi - e^{-iH(\cdot-t_1)}\Psi_{t_1}\|_{X_{t_1,t_2}} \le \Big\|\int_{t_1}^{\cdot} ds\, e^{-iH(\cdot-s)}|\Psi_s|^2\Psi_s\Big\|_{X_{t_1,t_2}} \le C(t_2-t_1)^{1/2}\|\Psi\|^3_{X_{t_1,t_2}}, \tag{5.25}
\]
\[
\|e^{-iH(\cdot-t_1)}\Psi_{t_1}\|_{X_{t_1,t_2}} \le C\|\Psi_{t_1}\|,
\]
which imply
\[
\|\Psi\|_{X_{t_1,t_2}} \le C\|\Psi_{t_1}\| + C(t_2-t_1)^{1/2}\|\Psi\|^3_{X_{t_1,t_2}}.
\]
By Lemma 5.2, one has that if $(t_2-t_1) \le [8C^3\|\Psi_{t_1}\|^2 + 4C^2\|\Psi_{t_1}\|]^{-2}$, then $\|\Psi\|_{X_{t_1,t_2}} \le 2C\|\Psi_{t_1}\|$; using this estimate in the inequality (5.25) we get
\[
\|\Psi_t - e^{-iH(t-t_1)}\Psi_{t_1}\| \le \|\Psi - e^{-iH(\cdot-t_1)}\Psi_{t_1}\|_{X_{t_1,t_2}} \le C(t_2-t_1)^{1/2}\|\Psi_{t_1}\|^3 \le Cv^{-\delta/2}, \tag{5.26}
\]
where we used $t_2 - t_1 = 2v^{-\delta}$.

We proceed now with the estimate of the second term on the right-hand side of inequality (5.24). Let us set
\[
\Phi^{tail}_{t_1} := \begin{pmatrix} 0 \\ \phi_{-x_0,v}(t_1) \\ 0 \end{pmatrix}.
\]
We notice that $\Phi^{S,in}_{t_1} + \Phi^{tail}_{t_1} = \Phi_{t_1}$, where the vector $\Phi_t$ was defined in Eq. (5.1), and rewrite $\Psi_{t_1}$ by adding and subtracting $\Phi_{t_1}$:
\[
\Psi_{t_1} = \Psi_{t_1} - \Phi_{t_1} + \Phi^{S,in}_{t_1} + \Phi^{tail}_{t_1}.
\]
The following trivial inequality holds true:
\[
\|e^{-iH(t-t_1)}\Psi_{t_1} - e^{-iH(t-t_1)}\Phi^{S,in}_{t_1}\| \le \|\Psi_{t_1} - \Phi_{t_1}\| + \|\Phi^{tail}_{t_1}\| \le Ce^{-v^{1-\delta}}, \tag{5.27}
\]
where in the latter estimate we used Proposition 5.3 and the fact that $\|\Phi^{tail}_{t_1}\| \le 2e^{-v^{1-\delta}}$.

Let us consider now the last term on the right-hand side of inequality (5.24). We are going to prove that for all $t \in (t_1,t_2)$ and for $v$ big enough
\[
\|e^{-iH(t-t_1)}\Phi^{S,in}_{t_1} - \Phi^S_t\| \le Cv^{-\delta}. \tag{5.28}
\]
Let us introduce the functions
\[
\phi^-_t(x) := \int_0^\infty U_{t-t_1}(x-y)\,\phi_{x_0,-v}(y,t_1)\,dy \tag{5.29}
\]
and
\[
\phi^+_t(x) := \int_0^\infty U_{t-t_1}(x+y)\,\phi_{x_0,-v}(y,t_1)\,dy. \tag{5.30}
\]
First we prove a preliminary formula for the vector $e^{-iH(t-t_1)}\Phi^{S,in}_{t_1}$ (see Eqs. (5.35) and (5.36) below). For any constant $a > 0$, not dependent on $v$, let us consider the term
\[
va\int_0^\infty e^{-uva}[U^+_{t-t_1}\phi_{x_0,-v}(t_1)](u+x)\,du
= va\int_0^\infty du\int_0^\infty dy\, e^{-uva}U_{t-t_1}(u+x+y)\,e^{i\varphi(t_1)}e^{-i\frac{v}{2}y}\phi(y-x_0+vt_1),
\]
where we set $\varphi(t) := -t\frac{v^2}{4} + t$. By integrating by parts we obtain the equality
\[
va\int_0^\infty e^{-uva}[U^+_{t-t_1}\phi_{x_0,-v}(t_1)](u+x)\,du
= 2ia\int_0^\infty du\int_0^\infty dy\, e^{-uva}U_{t-t_1}(u+x+y)\,e^{i\varphi(t_1)}\Big[\frac{d}{dy}e^{-i\frac{v}{2}y}\Big]\phi(y-x_0+vt_1)
= A_1(x,t) + A_2(x,t) + A_3(x,t) \tag{5.31}
\]
with
\[
A_1(x,t) := -2ia\int_0^\infty du\, e^{-uva}U_{t-t_1}(u+x)\,e^{i\varphi(t_1)}\phi(-x_0+vt_1),
\]
\[
A_2(x,t) := -2ia\int_0^\infty du\int_0^\infty dy\, e^{-uva}\Big[\frac{d}{dy}U_{t-t_1}(u+x+y)\Big]e^{i\varphi(t_1)}e^{-i\frac{v}{2}y}\phi(y-x_0+vt_1),
\]
\[
A_3(x,t) := -2ia\int_0^\infty du\int_0^\infty dy\, e^{-uva}U_{t-t_1}(u+x+y)\,e^{i\varphi(t_1)}e^{-i\frac{v}{2}y}\Big[\frac{d}{dy}\phi(y-x_0+vt_1)\Big].
\]
We notice that
\[
\Big\|\int_0^\infty du\, e^{-uva}U_{t-t_1}(u+\cdot)\Big\|_{L^2(\mathbb{R}^+)} \le \Big\|\int_0^\infty du\, e^{-uva}U_{t-t_1}(u+\cdot)\Big\|_{L^2(\mathbb{R})} = \|\chi_+e^{-va\,\cdot}\|_{L^2(\mathbb{R})} = \frac{1}{\sqrt{2va}}.
\]
Then the following estimate for the term $A_1$ holds true:
\[
\|A_1(t)\|_{L^2(\mathbb{R}^+)} \le 2a\,\Big\|\int_0^\infty du\, e^{-uva}U_{t-t_1}(u+\cdot)\Big\|_{L^2(\mathbb{R}^+)}|\phi(-x_0+vt_1)| \le C\frac{e^{-v^{1-\delta}}}{v^{1/2}}. \tag{5.32}
\]
The term $A_3$ is estimated by
\[
\|A_3(t)\|_{L^2(\mathbb{R}^+)} \le 2a\int_0^\infty du\, e^{-uva}\big\|[U_{t-t_1}\chi_+e^{-i\frac{v}{2}\cdot}\phi'(\cdot-x_0+vt_1)](-(u+\cdot))\big\|_{L^2(\mathbb{R}^+)}
\le 2\|\phi'\|_{L^2(\mathbb{R})}\,a\int_0^\infty du\, e^{-uva} \le \frac{C}{v}, \tag{5.33}
\]
where we used the equality $[U^+_tf](x) = [U_t\chi_+f](-x)$. We compute finally the term $A_2$. By integration by parts
\[
A_2(x,t) = -2ia\int_0^\infty du\int_0^\infty dy\, e^{-uva}\Big[\frac{d}{du}U_{t-t_1}(u+x+y)\Big]e^{i\varphi(t_1)}e^{-i\frac{v}{2}y}\phi(y-x_0+vt_1)
= 2ia\,\phi^+_t(x) - 2iva^2\int_0^\infty e^{-uva}[U^+_{t-t_1}\phi_{x_0,-v}(t_1)](u+x)\,du,
\]
where the function $\phi^+_t$ was defined in Eq. (5.30). Using the last equality in Eq. (5.31) we get
\[
va\int_0^\infty e^{-uva}[U^+_{t-t_1}\phi_{x_0,-v}(t_1)](u+x)\,du = \frac{2ia}{1+2ia}\,\phi^+_t(x) + \frac{A_1(x,t)+A_3(x,t)}{1+2ia}. \tag{5.34}
\]
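The passage from (5.31) to (5.34) is a one-line linear solve, spelled out below as a sketch. The shorthand $\mathcal{I}$ for the left-hand side of (5.31) is introduced here for convenience and does not appear in the paper; the computation only uses that $a$ is real, so that $1 + 2ia \neq 0$.

```latex
% Shorthand (not from the paper): \mathcal{I} is the left-hand side of (5.31).
\begin{align*}
\mathcal{I}(x,t) &:= va\int_0^\infty e^{-uva}\,[U^+_{t-t_1}\phi_{x_0,-v}(t_1)](u+x)\,du, \\
A_2(x,t) &= 2ia\,\phi^+_t(x) - 2ia\,\mathcal{I}(x,t)
  && \text{(integration by parts in } u\text{)}, \\
\mathcal{I} &= A_1 + A_2 + A_3 = A_1 + A_3 + 2ia\,\phi^+_t - 2ia\,\mathcal{I}
  && \text{(by (5.31))}, \\
(1+2ia)\,\mathcal{I} &= 2ia\,\phi^+_t + A_1 + A_3, \\
\mathcal{I} &= \frac{2ia}{1+2ia}\,\phi^+_t + \frac{A_1+A_3}{1+2ia}
  && \text{(this is (5.34))}.
\end{align*}
```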
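From the definition of $\Phi^{S,in}_t$ and using the last equality with $a = \tilde\alpha/3$ in the formula for the integral kernel of $e^{-iH_\delta^{\tilde\alpha v}t}$, see Eq. (2.9), it follows that
\[
e^{-iH_\delta^{\tilde\alpha v}(t-t_1)}\Phi^{S,in}_{t_1} = e^{-iH_\delta^{\tilde\alpha v}(t-t_1)}\begin{pmatrix}\phi_{x_0,-v}(t_1)\\ 0\\ 0\end{pmatrix}
= \begin{pmatrix}\phi^-_t - \dfrac{1+2i\tilde\alpha}{3+2i\tilde\alpha}\,\phi^+_t\\[2mm] \dfrac{2}{3+2i\tilde\alpha}\,\phi^+_t\\[2mm] \dfrac{2}{3+2i\tilde\alpha}\,\phi^+_t\end{pmatrix}
- \frac{2}{3}\begin{pmatrix}A_{\tilde\alpha}(t)\\ A_{\tilde\alpha}(t)\\ A_{\tilde\alpha}(t)\end{pmatrix}, \tag{5.35}
\]
where the function $\phi^-_t$ was defined in Eq. (5.29) and we set $A_{\tilde\alpha}(x,t) := [(A_1(x,t)+A_3(x,t))/(1+2ia)]|_{a=\tilde\alpha/3}$. Similarly, using equality (5.34) with $a = 3/\tilde\beta$ in the formula for the integral kernel of the propagator $e^{-iH_{\delta'}^{\tilde\beta/v}t}$, see Eq. (2.10), we get
\[
e^{-iH_{\delta'}^{\tilde\beta/v}(t-t_1)}\Phi^{S,in}_{t_1}
= \begin{pmatrix}\phi^-_t + \dfrac{\tilde\beta+2i}{\tilde\beta+6i}\,\phi^+_t\\[2mm] -\dfrac{4i}{\tilde\beta+6i}\,\phi^+_t\\[2mm] -\dfrac{4i}{\tilde\beta+6i}\,\phi^+_t\end{pmatrix}
- \frac{2}{3}\begin{pmatrix}A_{\tilde\beta}(t)\\ A_{\tilde\beta}(t)\\ A_{\tilde\beta}(t)\end{pmatrix}, \tag{5.36}
\]
where we introduced the notation $A_{\tilde\beta}(x,t) := [(A_1(x,t)+A_3(x,t))/(1+2ia)]|_{a=3/\tilde\beta}$.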
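As a consistency check on (5.35) and (5.36), one can read off the coefficients multiplying $\phi^+_t$ and verify that they conserve mass across the three edges. The sketch below assumes that these coefficients coincide with the $\tilde r$ and $\tilde t$ of formula (4.8), which is not reproduced in this section, and that $\tilde\alpha$ and $\tilde\beta$ are real.

```latex
% Coefficients of \phi^+_t read off (5.35) and (5.36); identifying them with
% \tilde r, \tilde t of formula (4.8) is an assumption made for this check.
\begin{align*}
\text{from (5.35):}\quad & \tilde r = -\frac{1+2i\tilde\alpha}{3+2i\tilde\alpha}, \qquad
  \tilde t = \frac{2}{3+2i\tilde\alpha}, \\
& |\tilde r|^2 + 2|\tilde t|^2
  = \frac{(1+4\tilde\alpha^2) + 2\cdot 4}{9+4\tilde\alpha^2} = 1, \qquad
  1 + \tilde r = \tilde t, \\
\text{from (5.36):}\quad & \tilde r = \frac{\tilde\beta+2i}{\tilde\beta+6i}, \qquad
  \tilde t = -\frac{4i}{\tilde\beta+6i}, \\
& |\tilde r|^2 + 2|\tilde t|^2
  = \frac{(\tilde\beta^2+4) + 2\cdot 16}{\tilde\beta^2+36} = 1, \qquad
  \tilde r - 1 = \tilde t.
\end{align*}
```

Setting $\tilde\alpha = 0$ in the first pair gives $\tilde r = -1/3$ and $\tilde t = 2/3$, for which the same two identities hold.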
We notice that the analogous formula for $e^{-iH_F(t-t_1)}\Phi^{S,in}_{t_1}$ can be obtained from Eq. (5.35) by setting $\tilde\alpha = 0$ and $A_{\tilde\alpha} = 0$.

To get the estimate (5.28) we show that, at the cost of an error of the order $(t_2 - t_1)$, for $t \in (t_1,t_2)$, the functions $\phi^-_t(x)$ and $\phi^+_t(x)$ can be approximated by the solitons $\phi_{x_0,-v}(x,t)$ and $\phi_{-x_0,v}(x,t)$ respectively. We consider first the function $\phi^+_t$; by adding and subtracting a suitable term to the right-hand side of Eq. (5.30) we get
\[
\begin{aligned}
\phi^+_t(x) ={}& \int_{-\infty}^{\infty}U_{t-t_1}(x+y)\,\phi_{x_0,-v}(y,t_1)\,dy + i\int_{t_1}^{t}ds\int_{-\infty}^{\infty}U_{t-s}(x+y)\,|\phi_{x_0,-v}(y,s)|^2\phi_{x_0,-v}(y,s)\,dy \\
&- \int_{-\infty}^{0}U_{t-t_1}(x+y)\,\phi_{x_0,-v}(y,t_1)\,dy - i\int_{t_1}^{t}ds\int_{-\infty}^{\infty}U_{t-s}(x+y)\,|\phi_{x_0,-v}(y,s)|^2\phi_{x_0,-v}(y,s)\,dy \\
={}& \phi_{-x_0,v}(x,t) + I(x,t) + II(x,t),
\end{aligned} \tag{5.37}
\]
where we used the fact that $\phi_{x_0,-v}(-x,t) = \phi_{-x_0,v}(x,t)$ and we set
\[
I(x,t) := -\int_{-\infty}^{0}U_{t-t_1}(x+y)\,\phi_{x_0,-v}(y,t_1)\,dy
\]
and
\[
II(x,t) := -i\int_{t_1}^{t}ds\int_{-\infty}^{\infty}U_{t-s}(x+y)\,|\phi_{x_0,-v}(y,s)|^2\phi_{x_0,-v}(y,s)\,dy.
\]
For the term $I$, we use the estimate
\[
\|I\|_{L^2(\mathbb{R}^+)} \le \|\phi(\cdot-x_0+vt_1)\|_{L^2(\mathbb{R}^-)} \le 2e^{-v^{1-\delta}}.
\]
The term $II$ is estimated by
\[
\|II\|_{L^2(\mathbb{R}^+)} \le (t-t_1)\|\phi^3\|_{L^2(\mathbb{R})} \le Cv^{-\delta}.
\]
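The bound on $II$ combines Minkowski's inequality with the unitarity of the free propagator. A short sketch of the chain follows, assuming only that $U_{t-s}$ is unitary on $L^2(\mathbb{R})$, that $|\phi_{x_0,-v}(y,s)| = |\phi(y-x_0+vs)|$, and that $t - t_1 \le t_2 - t_1 = 2v^{-\delta}$:

```latex
% Uses: restriction to R^+ and reflection x -> -x do not increase the L^2(R) norm;
% U_{t-s} is unitary on L^2(R); the L^2 norm is translation invariant.
\begin{align*}
\|II(\cdot,t)\|_{L^2(\mathbb{R}^+)}
 &\le \int_{t_1}^{t} \big\| U_{t-s}\big[\,|\phi_{x_0,-v}(s)|^2\phi_{x_0,-v}(s)\big] \big\|_{L^2(\mathbb{R})}\, ds
   && \text{(Minkowski)} \\
 &= \int_{t_1}^{t} \big\| \,|\phi(\cdot - x_0 + vs)|^3 \big\|_{L^2(\mathbb{R})}\, ds
   && \text{(unitarity)} \\
 &= (t-t_1)\,\|\phi^3\|_{L^2(\mathbb{R})}
   && \text{(translation invariance)} \\
 &\le 2\,\|\phi^3\|_{L^2(\mathbb{R})}\, v^{-\delta}.
\end{align*}
```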
Similarly, for the function $\phi^-_t$, we get
\[
\phi^-_t(x) = \phi_{x_0,-v}(x,t) + III(x,t) + IV(x,t), \tag{5.38}
\]
where we set
\[
III(x,t) := -\int_{-\infty}^{0}U_{t-t_1}(x-y)\,\phi_{x_0,-v}(y,t_1)\,dy
\]
and
\[
IV(x,t) := -i\int_{t_1}^{t}ds\int_{-\infty}^{\infty}U_{t-s}(x-y)\,|\phi_{x_0,-v}(y,s)|^2\phi_{x_0,-v}(y,s)\,dy.
\]
For $t \in (t_1,t_2)$ the estimates
\[
\|III\|_{L^2(\mathbb{R}^+)} \le 2e^{-v^{1-\delta}}; \qquad \|IV\|_{L^2(\mathbb{R}^+)} \le Cv^{-\delta}
\]
are similar to the ones given above for the terms $I$ and $II$. Then from Eqs. (5.37) and (5.38) we have
\[
\|\phi^+_t - \phi_{-x_0,v}(t)\|_{L^2(\mathbb{R}^+)} \le C[e^{-v^{1-\delta}} + v^{-\delta}], \qquad
\|\phi^-_t - \phi_{x_0,-v}(t)\|_{L^2(\mathbb{R}^+)} \le C[e^{-v^{1-\delta}} + v^{-\delta}]
\]
for all $t \in (t_1,t_2)$. By using the last estimates and the estimates (5.32) and (5.33) in Eqs. (5.35) and (5.36) we get
\[
\|e^{-iH(t-t_1)}\Phi^{S,in}_{t_1} - \Phi^S_t\| \le C\Big[e^{-v^{1-\delta}} + v^{-\delta} + \frac{e^{-v^{1-\delta}}}{v^{1/2}} + \frac{1}{v}\Big],
\]
which in turn implies that for $v$ big enough the estimate (5.28) holds true. Using estimates (5.26)–(5.28) in the inequality (5.24) we get the estimate (5.22). The estimate (5.23) is a consequence of estimate (5.22) and of the fact that $\|\Phi^S_{t_2} - \Phi^{S,out}_{t_2}\| = \|\Phi^{S,in}_{t_2}\| \le Ce^{-v^{1-\delta}}$.

Remark 5.5. Notice that estimate (5.22), although not strictly necessary for the proof of Theorem 4.3, reinforces the picture that in phase 2 a scattering event is occurring. The true wavefunction can be approximated by the superposition of an incoming and an outgoing wavefunction. At the end of this phase, only the outgoing wavefunction is not negligible.

5.3. Phase 3
Let us put $t_3 = t_2 + T\ln v$. We call “phase 3” the evolution of the system in the time interval $(t_2,t_3)$. The approximation of $\Psi_t$ during this time interval is the content of Theorem 4.3. We recall the following result (see [17]).

Proposition 5.6. Let $\phi^{tr}_t$ and $\phi^{ref}_t$ be defined as in Eqs. (4.12) and (4.13) above. Then $\forall\, k \in \mathbb{N}$ there exist two constants $c(k) > 0$ and $\sigma(k) > 0$ such
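that
\[
\|\phi^\gamma_t\|_{L^2(\mathbb{R}^-)} + \|\phi^\gamma_t\|_{L^\infty(\mathbb{R}^-)} \le \frac{c(k)(\ln v)^{\sigma(k)}}{v^{k(1-\delta)}} \tag{5.39}
\]
for $\gamma \in \{ref, tr\}$, uniformly in $t \in [0, T\ln v]$.

Notice that also the norms $\|\phi^\gamma_t\|_{L^p(\mathbb{R}^-)}$, for $2 \le p \le \infty$, are estimated by the right-hand side of (5.39). Now we can prove Theorem 4.3.

Proof of Theorem 4.3. The strategy of the proof closely follows Proposition 5.3. We will just sketch the common part of the proof while proving in detail the different estimates. Let us define $\Xi_t := \Psi_t - \sum_{j=1}^{3}\Phi^j_t$, where the vectors $\Phi^j_t$ were given in Eq. (4.9), and fix $t_a \in [t_2,t_3]$. From Eqs. (4.4) and (4.9) it follows that the vector $\Xi_t$ satisfies the following integral equation:
\[
\begin{aligned}
\Xi_t ={}& e^{-iH(t-t_a)}\Xi_{t_a} + \sum_{j=1}^{3}\Big[(e^{-iH(t-t_a)} - e^{-iH_j(t-t_a)})\Phi^j_{t_a} + i\int_{t_a}^{t}ds\,(e^{-iH(t-s)} - e^{-iH_j(t-s)})|\Phi^j_s|^2\Phi^j_s\Big] \\
&+ i\int_{t_a}^{t}ds\, e^{-iH(t-s)}\sum_{j_1,j_2,j_3}(1-\delta_{j_1j_2}\delta_{j_2j_3})\,\Phi^{j_1}_s\overline{\Phi^{j_2}_s}\Phi^{j_3}_s \\
&+ i\int_{t_a}^{t}ds\, e^{-iH(t-s)}\Big[\Big|\Xi_s + \sum_{j=1}^{3}\Phi^j_s\Big|^2\Xi_s + |\Xi_s|^2\sum_{j=1}^{3}\Phi^j_s + 2\,\mathrm{Re}\Big(\overline{\Xi_s}\sum_{j=1}^{3}\Phi^j_s\Big)\sum_{j=1}^{3}\Phi^j_s\Big] \\
={}& e^{-iH(t-t_a)}\Xi_{t_a} + G(t_a,t) \\
&+ i\int_{t_a}^{t}ds\, e^{-iH(t-s)}\Big[\Big|\Xi_s + \sum_{j=1}^{3}\Phi^j_s\Big|^2\Xi_s + |\Xi_s|^2\sum_{j=1}^{3}\Phi^j_s + 2\,\mathrm{Re}\Big(\overline{\Xi_s}\sum_{j=1}^{3}\Phi^j_s\Big)\sum_{j=1}^{3}\Phi^j_s\Big].
\end{aligned} \tag{5.40}
\]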
Let us fix $t_b \in [t_a,t_3]$ and let $X_{t_a,t_b} = L^\infty_{[t_a,t_b]}L^2 \cap L^6_{[t_a,t_b]}L^6$. Using the Strichartz estimates as it was done in Proposition 5.3 (see Eqs. (5.9)–(5.13)), it is straightforward to prove that
\[
\|\Xi\|_{X_{t_a,t_b}} \le \widetilde{C}\big[\|\Xi_{t_a}\| + \|G(t_a,\cdot)\|_{X_{t_a,t_b}} + (t_b-t_a)^{1/2}\big(\|\Xi\|^2_{X_{t_a,t_b}} + \|\Xi\|^3_{X_{t_a,t_b}}\big)\big],
\]
where $\widetilde{C}$ depends only on the constants appearing in the Strichartz estimates. Using Lemma 5.2 as it was done in the proof of Proposition 5.3 it follows that there exists $\tau > 0$ such that, for any $t \in [t_2,t_3)$ one has $\|\Xi\|_{X_{t,t+\tau}} \le 2\widetilde{C}\|\Xi_t\| + 2\widetilde{C}\|G(t,\cdot)\|_{X_{t,t+\tau}}$. We divide the interval $[t_2,t_3]$ in $N+1$ subintervals: $[t_2+j\tau, t_2+(j+1)\tau)$, with $j = 0,\dots,N-1$; and $[t_2+N\tau, t_3)$, where $N$ is the integer part of $(t_3-t_2)/\tau$. Then proceeding by induction as we did in the proof of Proposition 5.3, see Eqs. (5.14) and (5.15), we get the inequality
\[
\|\Xi\|_{X_{t_2,t_3}} \le \sum_{j=0}^{N}(2\widetilde{C})^{j+1}\|\Xi_{t_2}\| + \sum_{j=0}^{N}\sum_{k=0}^{j}(2\widetilde{C})^{j+1-k}\|G(t_2+k\tau,\cdot)\|_{X_{t_2+k\tau,\min\{t_3,t_2+(k+1)\tau\}}}. \tag{5.41}
\]
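The induction behind (5.41) can be written out in a few lines. The sketch below uses shorthand introduced only for this purpose ($I_k$, $x_k$, $g_k$, $q$ are not the paper's notation) and assumes that $t \mapsto \|\Xi_t\|$ is continuous, so that the value at the right endpoint of a subinterval is controlled by the $L^\infty L^2$ part of the $X$ norm on that subinterval.

```latex
% Shorthand (ours): I_k = [t_2 + k\tau, \min\{t_3, t_2 + (k+1)\tau\}],
% x_k = \|\Xi\|_{X_{I_k}},  g_k = \|G(t_2 + k\tau, \cdot)\|_{X_{I_k}},  q = 2\widetilde{C}.
\begin{align*}
x_0 &\le q\,\|\Xi_{t_2}\| + q\,g_0, \\
x_{k+1} &\le q\,\|\Xi_{t_2+(k+1)\tau}\| + q\,g_{k+1} \le q\,x_k + q\,g_{k+1}
  && \big(\|\Xi_{t_2+(k+1)\tau}\| \le \|\Xi\|_{L^\infty_{I_k}L^2} \le x_k\big), \\
x_j &\le q^{\,j+1}\|\Xi_{t_2}\| + \sum_{k=0}^{j} q^{\,j+1-k} g_k
  && \text{(induction on } j\text{)}, \\
\|\Xi\|_{X_{t_2,t_3}} &\le \sum_{j=0}^{N} x_j
  \le \sum_{j=0}^{N} q^{\,j+1}\|\Xi_{t_2}\| + \sum_{j=0}^{N}\sum_{k=0}^{j} q^{\,j+1-k} g_k .
\end{align*}
```

The inductive step is $q\big[q^{\,j+1}\|\Xi_{t_2}\| + \sum_{k=0}^{j}q^{\,j+1-k}g_k\big] + q\,g_{j+1} = q^{\,j+2}\|\Xi_{t_2}\| + \sum_{k=0}^{j+1}q^{\,j+2-k}g_k$, which reproduces the double sum in (5.41).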
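Now we estimate the initial data $\Xi_{t_2}$ and the source term $\|G(t_a,\cdot)\|_{X_{t_a,t_b}}$ with $t_2 \le t_a \le t_b \le t_3$ and $t_b - t_a \le \tau$. By Proposition 5.4 (estimate (5.23)) and using the definitions (4.5)–(4.7) one has
\[
\Xi_{t_2} = \Psi_{t_2} - \sum_{j=1}^{3}\Phi^j_{t_2} = \Psi_{t_2} - \Phi^{S,out}_{t_2} - \begin{pmatrix}\tilde t\,\phi_{x_0,-v}(t_2)\\ \tilde r\,\phi_{x_0,-v}(t_2)\\ \tilde t\,\phi_{x_0,-v}(t_2)\end{pmatrix},
\]
with $\|\Psi_{t_2} - \Phi^{S,out}_{t_2}\| \le Cv^{-\delta/2}$. Since
\[
\int_0^\infty|\phi_{x_0,-v}(x,t_2)|^2\,dx = \int_0^\infty|\phi(x-x_0+vt_2)|^2\,dx = 2\int_{v^{1-\delta}}^{\infty}\mathrm{sech}(x)^2\,dx = \frac{4e^{-2v^{1-\delta}}}{1+e^{-2v^{1-\delta}}} \le 4e^{-2v^{1-\delta}},
\]
we have
\[
\|\Xi_{t_2}\| \le C\big(v^{-\frac{\delta}{2}} + e^{-2v^{1-\delta}}\big) \le Cv^{-\frac{\delta}{2}}. \tag{5.42}
\]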
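The closed form of the tail integral used for (5.42) follows from $\frac{d}{dx}\tanh x = \mathrm{sech}^2 x$. The sketch below assumes the soliton profile $\phi(x) = \sqrt{2}\,\mathrm{sech}(x)$, which is what the identity $|\phi|^2 = 2\,\mathrm{sech}^2$ in the text suggests, and writes $b := v^{1-\delta} = -x_0 + vt_2$ as a local abbreviation.

```latex
% Assumption: \phi(x) = \sqrt{2}\,\mathrm{sech}(x).  Abbreviation (ours): b = v^{1-\delta}.
\begin{align*}
\int_0^\infty |\phi(x - x_0 + v t_2)|^2\,dx
 &= 2\int_{b}^{\infty} \mathrm{sech}^2(x)\,dx
  = 2\big[\tanh x\big]_{b}^{\infty}
  = 2\,(1-\tanh b) \\
 &= 2\,\frac{(e^{b}+e^{-b}) - (e^{b}-e^{-b})}{e^{b}+e^{-b}}
  = \frac{4e^{-b}}{e^{b}+e^{-b}}
  = \frac{4e^{-2b}}{1+e^{-2b}}
  \le 4e^{-2b}.
\end{align*}
```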
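Let us now consider the source term $G(t_a,t)$. We use the estimate $\|G(t_a,\cdot)\|_{X_{t_a,t_b}} \le \|G(t_2,\cdot)\|_{X_{t_2,t_3}}$. To simplify the notation we set $G(t) \equiv G(t_2,t)$ and
\[
G_1(t) := \sum_{j=1}^{3}\Big[(e^{-iH(t-t_2)} - e^{-iH_j(t-t_2)})\Phi^j_{t_2} + i\int_{t_2}^{t}ds\,(e^{-iH(t-s)} - e^{-iH_j(t-s)})|\Phi^j_s|^2\Phi^j_s\Big],
\]
\[
G_2(t) := i\int_{t_2}^{t}ds\, e^{-iH(t-s)}\sum_{j_1,j_2,j_3}(1-\delta_{j_1j_2}\delta_{j_2j_3})\,\Phi^{j_1}_s\overline{\Phi^{j_2}_s}\Phi^{j_3}_s.
\]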
By the definition of $G(t_a,t)$, see Eq. (5.40), it follows that $G(t) = G_1(t) + G_2(t)$; we estimate $G_1(t)$ and $G_2(t)$ separately. We proceed first with the estimate of the term $G_1$. From Eqs. (2.8)–(2.10) and (2.19) one can see that for any (column) vector $F = (F_1,F_2,F_3) \in L^2$
\[
[e^{-iHt} - e^{-iH_jt}]F = M_j\begin{pmatrix}U^+_tF_1\\ U^+_tF_2\\ U^+_tF_3\end{pmatrix} - \frac{2}{3}J\begin{pmatrix}va\int_0^\infty e^{-uva}[U^+_tF_1](u+\cdot)\,du\\[1mm] va\int_0^\infty e^{-uva}[U^+_tF_2](u+\cdot)\,du\\[1mm] va\int_0^\infty e^{-uva}[U^+_tF_3](u+\cdot)\,du\end{pmatrix}, \qquad j = 1,2,3,
\]
where the constant $a$ and the matrices $M_j$ must be chosen according to the Hamiltonian $H$:

for $H = H_\delta^{\tilde\alpha v}$, $a = \frac{\tilde\alpha}{3}$, $M_j = -I + \frac{2}{3}J - T_j$;

for $H = H_{\delta'}^{\tilde\beta/v}$, $a = \frac{3}{\tilde\beta}$, $M_j = I - T_j$;

and the formula for $H = H_F$ can be obtained by setting $\tilde\alpha = 0$ in the formula for $H = H_\delta^{\tilde\alpha v}$. Then, denoting by $(G_1(x,t))_l$, $l = 1,2,3$, the $l$th component of the vector $G_1$ one has
\[
(G_1(x,t))_l = (\widetilde{G}_1(x,t))_l + (\widehat{G}_1(x,t))_l,
\]
with
\[
(\widetilde{G}_1(x,t))_l := \sum_{j,k=1}^{3}(M_j)_{lk}\Big\{[U^+_{t-t_2}(\Phi^j_{t_2})_k](x) + i\int_{t_2}^{t}ds\,[U^+_{t-s}|(\Phi^j_s)_k|^2(\Phi^j_s)_k](x)\Big\},
\]
\[
(\widehat{G}_1(x,t))_l := -\frac{2va}{3}\sum_{j,k=1}^{3}\int_0^\infty e^{-vua}\Big\{[U^+_{t-t_2}(\Phi^j_{t_2})_k](u+x) + i\int_{t_2}^{t}ds\,[U^+_{t-s}|(\Phi^j_s)_k|^2(\Phi^j_s)_k](u+x)\Big\}\,du.
\]
From the definition of the vectors $\Phi^j_t$, see Eqs. (4.10) and (4.11), we see that for each $l = 1,2,3$ the function $(\widetilde{G}_1(x,t))_l$ is a linear combination of four functions, $f^{\gamma,+}_t$ and $f^{\gamma,-}_t$, with $\gamma$ being equal to $ref$ and $tr$, given by
\[
f^{\gamma,+}_t(x) := e^{-i\frac{v^2}{4}t_2}e^{it_2}\Big[\int_0^\infty U_{t-t_2}(x+y)\,\phi^\gamma_0(y)\,dy + i\int_{t_2}^{t}ds\int_0^\infty U_{t-s}(x+y)\,|\phi^\gamma_{s-t_2}(y)|^2\phi^\gamma_{s-t_2}(y)\,dy\Big] \tag{5.43}
\]
and
\[
f^{\gamma,-}_t(x) := e^{-i\frac{v^2}{4}t_2}e^{it_2}\Big[\int_0^\infty U_{t-t_2}(x+y)\,\phi^\gamma_0(-y)\,dy + i\int_{t_2}^{t}ds\int_0^\infty U_{t-s}(x+y)\,|\phi^\gamma_{s-t_2}(-y)|^2\phi^\gamma_{s-t_2}(-y)\,dy\Big],
\]
where the functions $\phi^{tr}_t$ and $\phi^{ref}_t$ were defined in Eqs. (4.12) and (4.13) respectively. Similarly one can see that for each $l = 1,2,3$ the function $(\widehat{G}_1(x,t))_l$ is a linear combination of
\[
va\int_0^\infty e^{-vua}f^{ref,+}_t(u+x)\,du, \quad va\int_0^\infty e^{-vua}f^{tr,+}_t(u+x)\,du, \quad va\int_0^\infty e^{-vua}f^{ref,-}_t(u+x)\,du, \quad va\int_0^\infty e^{-vua}f^{tr,-}_t(u+x)\,du.
\]
First we study the function $f^{\gamma,+}_t$. We notice that, adding and subtracting a suitable term in Eq. (5.43) and using the definitions (4.12) and (4.13), $f^{\gamma,+}_t(x)$ can be written as
\[
f^{\gamma,+}_t(x) = I(x,t) + II(x,t) + III(x,t),
\]
with
\[
I(x,t) := e^{-i\frac{v^2}{4}t_2}e^{it_2}\phi^\gamma_{t-t_2}(-x),
\]
\[
II(x,t) := -e^{-i\frac{v^2}{4}t_2}e^{it_2}\int_0^\infty U_{t-t_2}(x-y)\,\phi^\gamma_0(-y)\,dy,
\]
\[
III(x,t) := -ie^{-i\frac{v^2}{4}t_2}e^{it_2}\int_{t_2}^{t}ds\int_0^\infty U_{t-s}(x-y)\,|\phi^\gamma_{s-t_2}(-y)|^2\phi^\gamma_{s-t_2}(-y)\,dy.
\]
Similarly to what was done above, we set $X_{t_2,t_3}(\mathbb{R}^\pm) = L^\infty_{[t_2,t_3]}L^2(\mathbb{R}^\pm) \cap L^6_{[t_2,t_3]}L^6(\mathbb{R}^\pm)$. By Proposition 5.6, we have
\[
\|I\|_{X_{t_2,t_3}(\mathbb{R}^+)} = \|\phi^\gamma_{\cdot-t_2}\|_{X_{t_2,t_3}(\mathbb{R}^-)} \le \frac{c'(k)(\ln v)^{\sigma'(k)}}{v^{k(1-\delta)}}, \tag{5.44}
\]
where $c'(k)$ and $\sigma'(k)$ are constants, different from the ones appearing in Proposition 5.6. For our purposes we do not need to compute them.
Using the one-dimensional Strichartz estimates for $U_t$, we have
\[
\|II\|_{X_{t_2,t_3}(\mathbb{R}^+)} \le C\|\chi_-\phi^\gamma_0\|_{L^2(\mathbb{R})} \le Ce^{-2v^{1-\delta}}. \tag{5.45}
\]
Finally, the term $III$ can be estimated using the inhomogeneous Strichartz estimate and Proposition 5.6:
\[
\|III\|_{X_{t_2,t_3}(\mathbb{R}^+)} \le C\|(\chi_-\phi^\gamma_{\cdot-t_2})^3\|_{L^1_{[t_2,t_3]}L^2(\mathbb{R})} \le C\Big(\frac{c(k)(\ln v)^{\sigma(k)}}{v^{k(1-\delta)}}\Big)^3. \tag{5.46}
\]
Collecting the estimates (5.44)–(5.46), it follows that
\[
\|f^{\gamma,+}_\cdot\|_{X_{t_2,t_3}(\mathbb{R}^+)} \le C\frac{c(k)(\ln v)^{\sigma(k)}}{v^{k(1-\delta)}}.
\]
The estimate of $f^{\gamma,-}_t$ is similar and we omit it. We have proved that for some $c'(k)$ and $\sigma'(k)$ possibly bigger than $c(k)$ and $\sigma(k)$ we have:
\[
\|\widetilde{G}_1\|_{X_{t_2,t_3}} \le C\frac{c'(k)(\ln v)^{\sigma'(k)}}{v^{k(1-\delta)}},
\]
where $\widetilde{G}_1$ is the vector in $L^2$ with components $(\widetilde{G}_1(x,t))_l$, $l = 1,2,3$. The estimate of $\widehat{G}_1(t) = ((\widehat{G}_1(t))_1, (\widehat{G}_1(t))_2, (\widehat{G}_1(t))_3)$ is a trivial consequence of the fact that
\[
\Big\|va\int_0^\infty e^{-vua}f^{\gamma,\pm}_\cdot(u+\cdot)\,du\Big\|_{X_{t_2,t_3}(\mathbb{R}^+)} \le \|f^{\gamma,\pm}_\cdot\|_{X_{t_2,t_3}(\mathbb{R}^+)} \le C\frac{c(k)(\ln v)^{\sigma(k)}}{v^{k(1-\delta)}},
\]
from which it follows that
\[
\|\widehat{G}_1\|_{X_{t_2,t_3}} \le C\frac{c'(k)(\ln v)^{\sigma'(k)}}{v^{k(1-\delta)}};
\]
and
\[
\|G_1\|_{X_{t_2,t_3}} \le C\frac{c'(k)(\ln v)^{\sigma'(k)}}{v^{k(1-\delta)}}. \tag{5.47}
\]
We analyze now the term $G_2$. Due to the presence of $(1-\delta_{j_1j_2}\delta_{j_2j_3})$ the components of the vector $(1-\delta_{j_1j_2}\delta_{j_2j_3})\Phi^{j_1}_s\overline{\Phi^{j_2}_s}\Phi^{j_3}_s$ contain only terms (up to a phase) like
\[
\phi^{\gamma_1}_{t-t_2}(x)\,\phi^{\gamma_2}_{t-t_2}(x)\,\phi^{\gamma_3}_{t-t_2}(-x) \quad\text{or}\quad \phi^{\gamma_1}_{t-t_2}(x)\,\phi^{\gamma_2}_{t-t_2}(-x)\,\phi^{\gamma_3}_{t-t_2}(-x),
\]
where $\gamma_1$, $\gamma_2$ and $\gamma_3$ can be $ref$ or $tr$. This can easily be seen by using Eqs. (4.10) and (4.11). By Strichartz methods, it is sufficient to estimate the $L^1_{[t_2,t_3]}L^2(\mathbb{R}^+)$ norm of these terms. Then using Hölder's inequality we have, for instance
\[
\|\phi^{\gamma_1}_{t-t_2}\phi^{\gamma_2}_{t-t_2}\phi^{\gamma_3}_{t-t_2}(-\cdot)\|_{L^2(\mathbb{R}^+)} \le \|\phi^{\gamma_1}_{t-t_2}\|_{L^4(\mathbb{R}^+)}\|\phi^{\gamma_2}_{t-t_2}\|_{L^4(\mathbb{R}^+)}\|\phi^{\gamma_3}_{t-t_2}\|_{L^\infty(\mathbb{R}^-)} \le C\frac{c(k)(\ln v)^{\sigma(k)}}{v^{k(1-\delta)}}.
\]
The second kind of terms can be estimated in the same way and we obtain
\[
\|G_2\|_{X_{t_2,t_3}} \le C\frac{c'(k)(\ln v)^{\sigma'(k)}}{v^{k(1-\delta)}},
\]
which, together with the estimate (5.47), gives
\[
\|G\|_{X_{t_2,t_3}} \le C\frac{c'(k)(\ln v)^{\sigma'(k)}}{v^{k(1-\delta)}}. \tag{5.48}
\]
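Fix $k$ such that $k(1-\delta) > 2$; then for $v$ sufficiently large (5.48) implies
\[
\|G\|_{X_{t_2,t_3}} \le \frac{1}{v}. \tag{5.49}
\]
Using (5.42) and (5.49) in (5.41), we get
\[
\begin{aligned}
\|\Xi\|_{X_{t_2,t_3}} &\le Cv^{-\delta/2}\sum_{j=0}^{N}(2\widetilde{C})^{j+1} + v^{-1}\sum_{j=0}^{N}\sum_{k=0}^{j}(2\widetilde{C})^{j+1-k} \\
&\le 2Cv^{-\delta/2}(2\widetilde{C})^{N+1} + 2v^{-1}\sum_{j=0}^{N}(2\widetilde{C})^{j+1} \\
&\le 2Cv^{-\delta/2}(2\widetilde{C})^{N+1} + 2v^{-1}(2\widetilde{C})^{N+1} \\
&\le \widehat{C}v^{-\delta/2}(2\widetilde{C})^{N}.
\end{aligned}
\]
Since $N$ is the integer part of $(t_3-t_2)/\tau = \frac{T}{\tau}\ln v$, we have $(2\widetilde{C})^{N} \le v^{\frac{T}{\tau}\ln(2\widetilde{C})}$ and therefore
\[
\|\Xi\|_{X_{t_2,t_3}} \le \widehat{C}v^{-\delta/2+\frac{T}{\tau}\ln(2\widetilde{C})}.
\]
We can finally set $\tau_* \equiv \tau/\ln(2\widetilde{C})$ and $T_* = \delta\tau_*/2$ and obtain
\[
\|\Xi\|_{L^\infty_{[t_2,t_3]}L^2} \le \|\Xi\|_{X_{t_2,t_3}} \le \widehat{C}v^{-\frac{T_*-T}{\tau_*}},
\]
which concludes the proof of Theorem 4.3.

6. Conclusion and Perspectives
In the present paper we have given a first rigorous analysis of nonlinear Schrödinger propagation on graphs. We have given a preliminary proof of local and global well-posedness of the dynamics, and of energy and mass conservation laws for some distinguished vertex couplings, i.e. Kirchhoff, δ and δ′ couplings. Then we concentrated on the problem of collision of a fast solitary wave on the graph vertex (with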
couplings as before). It turns out that the solitary wave splits into reflected and transmitted components, the forms of which are again of solitary type, but with modified amplitudes controlled by the scattering coefficients given by the linear graph dynamics. This behavior holds true over times of the order $\ln v$, where $v$ is the velocity of the impinging soliton. We add some other remarks on the result and further analysis and generalizations.

To begin with, let us note that the real line with a point interaction at 0 can be interpreted as a degenerate graph with two edges. The cited paper [17] treats the special case of a δ interaction on the line, and our description shows how it could be possible to extend their results to other point interactions; among the examples treated in the present paper there is a version of the δ′ interaction, showing how to treat point interactions of a more singular character than the one given by a δ.

Concerning more general issues, a sharper description of the post-interaction phase can be achieved by an explicit characterization of the evolution of the modified solitary profiles, i.e. of the $\Phi^j_t$. This last part is somewhat delicate, and intersects with contemporary intense work on asymptotics for solitons in integrable and quasi-integrable PDE, so we limit ourselves to the following remarks. In the case analyzed in [17], the asymptotic behavior of the nonlinear Schrödinger evolution of solitary waveforms with modified amplitudes is given, and making use of inverse scattering theory it is shown (Appendix B of the cited paper) that the evolution is close to a soliton up to times of order $\ln v$ and up to an error the order of which is an inverse power of $v$. Borrowing from these results, it is possible (we omit the details) to obtain in our case too the asymptotics of the $\Phi^j_t$, i.e. of the free NLS evolution of the modified solitary profiles outgoing from phase two.
It turns out that these outgoing wavefunctions can be approximated, on the same logarithmic timescale of Theorem 1.1, as new solitons with the same waveform of the unperturbed dynamics, modified amplitudes and phases, plus a dispersive (“radiation”) contribution. The meaning of this statement is that the $L^\infty$ norm of the difference between the evolved modified solitary profiles $\Phi^j_t$ and such final outgoing solitons has the usual dispersive behavior, $|t-t_2|^{-\frac{1}{2}}$. Let us note that, following this strategy, at the end of phase three there would be two types of errors: errors due to the approximation procedure in phases one and two ($O_{L^2}$); and errors arising from neglecting dispersion in the reconstruction of the outgoing solitons ($O_{L^\infty}$).

An important question concerns the possibility of extending the timescale of validity of the approximation by the solitary outgoing waves. As a quite generic remark, this possibility could be related to the asymptotic stability of the system, or of systems immediately related to it. More concretely, in a different type of model (scattering of two solitons on the line) in the already cited paper [1], some considerations are given on obtaining longer timescales of quasiparticle approximation in dependence on the initial data and external potential, but it is unclear whether similar considerations can be applied to the present case.
Another issue is the nonlinearity. The fundamental asymptotics proved in [17] and used in the present paper relies on the integrable nature of cubic NLS, and it is not immediate to extend these results to more general nonlinearities. One can conjecture that for nonlinearities close to integrable ones which admit solitary waves, the outgoing waves are close to solitons over suitable timescales. Let us mention, however, the recent results of Perelman on the asymptotics of colliding solitons for nonlinearities close to integrable or $L^2$ critical on the line ([25, 24]).

A final problem is the extension of the results of the present work to more general graphs. We believe that results similar to the ones of the present paper are valid for more general boundary conditions at the vertex of a star graph, with the same proof, under the condition of absence of eigenvalues for the linear Hamiltonian describing the graph. In the presence of eigenvalues, some Strichartz estimates weaken, and a more refined analysis is needed (see [14] for the analogous problem on the line with an attractive δ interaction). Of course, the extension of the present results to the case of star graphs with more than three edges has to be considered straightforward, while the extension to graphs having a less trivial topology is an open problem.

Finally let us comment briefly on the recent paper [26]. In this partly heuristic paper the authors study a star graph (more general types of graphs are also considered) with an NLS in which on every edge there is a different strength $\beta_k$ in front of the cubic term. The authors fix a boundary condition which guarantees that mass and energy of the solution are (formally) conserved. Moreover, according to the authors it is possible to derive a condition on the strengths $\beta_k$ which allows for complete transmission of an incoming solitary wave across the vertex.
In these same situations the authors show that an infinite chain of conserved quantities exists, defined analogously to the case of the NLS on the line. The result, if formal, is interesting, and concerning the relation with ours we note the following. In the case of a three-edge graph, and more generally for an odd edge number, the complete transmission is made possible exactly by the fine tuning of the coupling constants in front of the nonlinearities. For the case of a single medium with the same nonlinearity on every edge and Kirchhoff boundary conditions, one can prove (see [2], where more generally the case of nonlinear bound states for δ boundary conditions is treated) that exact travelling solitons exist only in the case of a graph with an even number of edges, while in the case of an odd number of edges a stationary state is formed which is given by half a free soliton on every edge.

Acknowledgements
The present research was partially supported by INDAM-GNFM research project “Equazione di Schrödinger non lineare interagente con difetti sulla retta e su grafi”. The Hausdorff Research Institute for Mathematics is also acknowledged for the support. The authors are grateful to Sergio Albeverio for comments and discussions.
References
[1] W. K. Abu Salem, J. Fröhlich and I. M. Sigal, Colliding solitons for the nonlinear Schrödinger equation, Comm. Math. Phys. 291 (2009) 151–176.
[2] R. Adami, C. Cacciapuoti, D. Finco and D. Noja, Stationary states of NLS on star graphs (2011), 4 pp.; arXiv: 1104.3839[math-ph].
[3] R. Adami and D. Noja, Existence of dynamics for a 1D NLS equation perturbed with a generalized point defect, J. Phys. A: Math. Theor. 42(49) (2009) 495302, 19 pp.
[4] S. Albeverio, C. Cacciapuoti and D. Finco, Coupling in the singular limit of thin quantum waveguides, J. Math. Phys. 48 (2007) 032103.
[5] S. Albeverio, F. Gesztesy, R. Høegh-Krohn and H. Holden, Solvable Models in Quantum Mechanics, 2nd edn. (AMS Chelsea Publ., 2005); with an Appendix by P. Exner.
[6] B. Bellazzini and M. Mintchev, Quantum fields on star graphs, J. Phys. A: Math. Gen. 39(35) (2006) 1101–1117.
[7] G. Berkolaiko, R. Carlson, S. Fulling and P. Kuchment, Quantum Graphs and Their Applications, Contemporary Math., Vol. 415 (American Math. Society, Providence, RI, 2006).
[8] J. Blank, P. Exner and M. Havlicek, Hilbert Spaces Operators in Quantum Physics (Springer, New York, 2008).
[9] R. Burioni, D. Cassi, P. Sodano, A. Trombettoni and A. Vezzani, Soliton propagation on chains with simple nonlocal defects, Phys. D 216 (2006) 71–76.
[10] C. Cacciapuoti and P. Exner, Nontrivial edge coupling from a Dirichlet network squeezing: The case of a bent waveguide, J. Phys. A: Math. Theor. 40(26) (2007) F511–F523.
[11] D. Cao Xiang and A. B. Malomed, Soliton defect collisions in the nonlinear Schrödinger equation, Phys. Lett. A 206 (1995) 177–182.
[12] S. Cardanobile and D. Mugnolo, Analysis of FitzHugh–Nagumo–Rall model of a neuronal network, Math. Methods Appl. Sci. 30 (2007) 2281–2308.
[13] T. Cazenave, Semilinear Schrödinger Equations, Courant Lecture Notes in Mathematics, Vol. 10 (Amer. Math. Soc., 2003).
[14] K. Datchev and J. Holmer, Fast soliton scattering by attractive delta impurities, Commun. Part. Diff. Eqs. 34 (2009) 1074–1113.
[15] P. Exner, Contact interactions on graph superlattices, J. Phys. A: Math. Gen. 29(1) (1996) 87–102.
[16] R. H. Goodman, P. J. Holmes and M. I. Weinstein, Strong NLS soliton-defect interactions, Phys. D 192 (2004) 215–249.
[17] J. Holmer, J. Marzuola and M. Zworski, Fast soliton scattering by delta impurities, Comm. Math. Phys. 274 (2007) 187–216.
[18] J. Holmer, J. Marzuola and M. Zworski, Soliton splitting by delta impurities, J. Nonlinear Sci. 7 (2007) 349–367.
[19] M. Keel and T. Tao, Endpoint Strichartz estimates, Amer. J. Math. 120 (1998) 955–980.
[20] P. G. Kevrekidis, D. J. Frantzeskakis, G. Theocharis and I. G. Kevrekidis, Guidance of matter waves through Y-junctions, Phys. Lett. A 317 (2003) 513–522.
[21] V. Kostrykin and R. Schrader, Kirchhoff’s rule for quantum wires, J. Phys. A: Math. Gen. 32(4) (1999) 595–630.
[22] P. Kuchment, Quantum graphs. I. Some basic structures, Waves Random Media 14(1) (2004) S107–S128.
[23] P. Kuchment, Quantum graphs. II. Some spectral properties of quantum and combinatorial graphs, J. Phys. A: Math. Gen. 38(22) (2005) 4887–4900.
[24] G. Perelman, A remark on soliton-potential interaction for nonlinear Schrödinger equations, Math. Res. Lett. 16(3) (2009) 477–486.
[25] G. Perelman, Two soliton collision for nonlinear Schrödinger equation in dimension 1, Ann. Inst. Henri Poincaré, in print (2011).
[26] Z. Sobirov, D. Matrasulov, K. Sabirov, S. Sawada and K. Nakamura, Integrable nonlinear Schrödinger equation on simple networks: Connection formula at vertices, Phys. Rev. E 81 (2010) 066602.
June 3, 2011 13:31 WSPC/S0129-055X J070-S0129055X11004333
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 5 (2011) 453–530 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004333
REGULARITY OF BOUND STATES
JEREMY FAUPIN∗,‡, JACOB SCHACH MØLLER†,§ and ERIK SKIBSTED†,¶
∗Institut de Mathématiques de Bordeaux, Université de Bordeaux 1, France
†Department of Mathematical Sciences, Aarhus University, Denmark
‡[email protected]
§[email protected]
¶[email protected]
Received 1 July 2010
Revised 14 March 2011
We study regularity of bound states pertaining to embedded eigenvalues of a self-adjoint operator H, with respect to an auxiliary operator A that is conjugate to H in the sense of Mourre. We work within the framework of singular Mourre theory which enables us to deal with confined massless Pauli–Fierz models, our primary example, and many-body AC-Stark Hamiltonians. In the simpler context of regular Mourre theory, our results boil down to an improvement of results obtained recently in [8, 9]. Keywords: Mourre theory; non-relativistic QED; AC-Stark. Mathematics Subject Classification 2010: 81Q10, 81V10
Contents
1. Introduction
1.1. Singular Mourre theory
1.2. The Nelson model
1.3. The AC-Stark model
2. Assumptions and Statement of Regularity Results
3. Preliminaries
3.1. Improved smoothness for operators of class $C^1(A)$
3.2. Iterated commutators with $N^{1/2}$
3.3. Approximating $A$ by regular bounded operators
4. Proof of the Abstract Results
4.1. Proof of Theorem 2.7
4.2. Proof of Theorem 2.10
4.3. Theorem on more $N$-regularity
5. A Class of Massless Linearly Coupled Models
5.1. The model and the result
5.2. Application to the Nelson model
5.3. Expanded objects
5.4. Mourre estimates
5.5. Checking the abstract assumptions
6. AC-Stark Type Models
6.1. The model and the result
6.2. Regularity of non-threshold bound states
6.3. Regularity of non-threshold atomic type bound states
1. Introduction
This paper is the first in a series of two dealing with embedded eigenvalues and their bound states. Our arguments in both papers revolve around local positive commutator methods originating from Mourre’s seminal paper [30]. In fact, some of the central ideas employed in the present paper can be traced back to [14] by Froese and Herbst, where exponential decay of eigenfunctions for many-body Schrödinger operators was first extracted from a positive commutator estimate. See also [15] for a precursor pertaining to two-body operators. In contrast to the above mentioned works, we do not here study decay of bound states of a self-adjoint operator $H$ in position space, but rather decay in the spectral representation for an auxiliary operator $A$ conjugate to $H$ in the sense of Mourre. More precisely, given a bound state $\psi$ of $H$, we address the question

Q(k): For a given $k \in \mathbb{N}$, under what conditions on the pair of operators $H$ and $A$ does it hold true that $\psi$ is in the domain of $A^k$?

It is a question that arises naturally in the context of second order perturbation theory for embedded eigenvalues because, together with the Limiting Absorption Principle from [30], an affirmative answer allows one to construct and analyze the so-called Fermi Golden Rule operator describing level shifts to second order in perturbation theory. In [27] Fermi’s golden rule was formulated and verified in an abstract setup under the condition that $\psi \in D(A^2)$, following ideas from [1]. See also [5, 11, 13, 31]. For many-body Schrödinger operators the conjugate operator is usually taken to be the generator of dilation and here the condition $\psi \in D(A^2)$ is fulfilled by the Froese–Herbst exponential bound. In other contexts, however, it is a non-trivial question to answer. The first results in an abstract setup are due to Cattaneo [8, 9], and the setting is regular Mourre theory.
The adjective regular refers to setups where multiple commutators between $H$ and $A$, in particular $[H, A]$, are suitably controlled by resolvents of $H$. Results in this category range from Mourre’s original work [30] to the results relying on the $C^k(A)$ type conditions introduced by Amrein et al. [3]. See also [1, 5, 10, 18, 25, 27].
In this paper, we address the question of regularity of bound states with respect to a conjugate operator $A$ in the context of singular Mourre theory. In the second paper [13] the results obtained here are used to do second order perturbation theory of embedded eigenvalues; in particular, we establish the validity of Fermi’s golden rule for an abstract class of Hamiltonians. By singular Mourre theory we refer to the situation where the first commutator $[H, A]$ is not controlled by the Hamiltonian itself, as in [11, 21, 22, 24, 31, 38]. Regular Mourre theory is a special case of the singular setup considered here, and our results thus extend those of [8, 9]. Roughly speaking, our answer to the question Q(k) is that control of $k + 1$ commutators suffices. We stress that even within regular Mourre theory we extend [8, 9] in that we reduce by one, from $k + 2$ to $k + 1$, the number of commutators one needs to control in order to answer the question in the affirmative. Our result is optimal in terms of integer numbers of commutators, cf. Example 1.1 below. See also [32], where the regular Mourre theory analysis is extracted from this paper and conditions are established under which bound states become analytic vectors for $A$.

Our main motivation is applications to massless models from quantum field theory. In particular, our results apply to the massless confined Nelson model at arbitrary coupling strength. We can deal with infrared singularities that are slightly weaker than the physical one, that is, we can handle singularities of the form $|k|^{-\frac12+\epsilon}$, for some $\epsilon > 0$. As a by-product of our methods we also establish that all bound states are in the domain of the number operator. In Sec. 5, we in fact deal with a larger class of quantum field theory models, sometimes called Pauli–Fierz models, which includes the Nelson model. For simplicity and concreteness we present our results in the introduction in the context of the Nelson model. This is done in Sec. 1.2 below.
The reader can also consult [22, Sec. 2.3] for a discussion of the field theory models considered in this paper and its sequel. In Sec. 6, we apply the abstract results of this paper to many-body AC-Stark Hamiltonians, where we obtain a new regularity result. See Sec. 1.3 below for a formulation of the model and the result. The following example illustrates that if one desires bound states to be in the domain of the $k$th power of a conjugate operator, one needs at least control of $k + 1$ commutators.

Example 1.1. Consider the one-dimensional Schrödinger operator $H = -\Delta + V$ on $\mathcal{H} = L^2(\mathbb{R})$, where $V$ is a rank-one potential $V = |\phi\rangle\langle\phi|$. Here $\phi \in \mathcal{H}$ is constructed as follows: Let $k_0 \in \mathbb{N}$ and $\epsilon \in (0, 1/2)$. In momentum space we write $\phi$ as a sum of two functions $\hat\phi = \hat\phi_1 + \hat\phi_2$, where we choose $\phi_2$, or rather its Fourier transform, to be
\[ \hat\phi_2(\xi) = \begin{cases} 0, & |\xi| \le 1, \\ (\xi^2 - 1)^{k_0 + \frac12 + \epsilon}\, e^{-\xi^2}, & |\xi| > 1. \end{cases} \]
J. Faupin, J. S. Møller & E. Skibsted
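Having fixed $\phi_2$, we choose $\phi_1$ such that
\[ \hat\phi_1 \in C_0^\infty\big(\big(-\tfrac12, \tfrac12\big)\big) \quad\text{and}\quad \int_{\mathbb{R}} |\hat\phi(\xi)|^2 (\xi^2 - 1)^{-1}\, d\xi = -1. \tag{1.1} \]
The key to the example is the singular behavior of $\hat\phi$ near $\xi^2 = 1$. We have $\phi \in C^\infty(\mathbb{R})$ and
\[ \Big|\frac{d^\ell \phi}{dx^\ell}(x)\Big| \le C_\ell\, (1 + |x|)^{-k_0 - \frac32 - \epsilon} \ \text{ for all } \ell \ge 0, \qquad x^k \frac{d^\ell \phi}{dx^\ell} \in L^2(\mathbb{R}) \iff k \le k_0 + 1. \tag{1.2} \]
Furthermore, the normalization in (1.1) ensures that $H$ has $\lambda = 1$ as an embedded eigenvalue with eigenfunction $\psi = (-\Delta - 1)^{-1}\phi$. Note that $(\xi^2 - 1)^{-1}\hat\phi$ decays faster than any polynomial. We have $\psi \in C^\infty(\mathbb{R})$ and
\[ x^k \frac{d^\ell \psi}{dx^\ell} \in L^2(\mathbb{R}) \iff k \le k_0. \tag{1.3} \]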
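As a brief check of the eigenvalue claim (a sketch, assuming the unitary normalization of the Fourier transform, under which $-\Delta$ acts as multiplication by $\xi^2$), the normalization (1.1) is exactly what makes $\psi = (-\Delta - 1)^{-1}\phi$ satisfy $H\psi = \psi$ for $H = -\Delta + |\phi\rangle\langle\phi|$:

```latex
\begin{align*}
(H - 1)\psi
  &= (-\Delta - 1)\psi + \langle \phi, \psi\rangle\, \phi \\
  &= \phi + \langle \phi, (-\Delta - 1)^{-1}\phi\rangle\, \phi \\
  &= \Big( 1 + \int_{\mathbb{R}} |\hat\phi(\xi)|^2 (\xi^2 - 1)^{-1}\, d\xi \Big)\, \phi
   = 0 .
\end{align*}
```

The integral is well defined because $\hat\phi_1$ is supported away from $\xi^2 = 1$ and $\hat\phi_2$ vanishes there to the order fixed by $k_0$ in its definition.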
Let $A$ denote the generator of dilations $A = \frac{1}{2i}\big(x \frac{d}{dx} + \frac{d}{dx} x\big)$. Introducing the notation $\mathrm{ad}_A^k(H) = [\mathrm{ad}_A^{k-1}(H), A]$ and $\mathrm{ad}_A^0(H) = H$, we formally compute
\[ i^k\, \mathrm{ad}_A^k(H) = -2^k \Delta + i^k\, \mathrm{ad}_A^k(V). \]
Due to (1.2), the iterated commutator $\mathrm{ad}_A^k(V)$ is compact if and only if $k \le k_0 + 1$. Adding resolvents of $H$ does not help. Furthermore, $i[H, A]$ obviously satisfies a Mourre estimate with compact error at positive energies: For any $E > 0$,
\[ 1_{[H \ge E]}\, i[H, A]\, 1_{[H \ge E]} \ge E\, 1_{[H \ge E]} - K, \]
where $K$ is compact and $1_{[H \ge E]}$ is the spectral projection for $H$ associated with the Borel set $[E, \infty)$. That is, we are within the scope of [8, 9]. We have the first $k_0 + 1$ commutators all $H$-bounded, with the $(k_0 + 2)$nd commutator not controlled by any power of resolvents of $H$. Appealing to (1.3), we see that the bound state $\psi$ is in the domain of $A^{k_0}$, but not in the domain of $A^{k_0 + 1}$. That $\psi \in D(A^{k_0})$ is a conclusion one cannot reach using [8, 9]. It is however attainable by the abstract result of the present paper. See Sec. 2. The additional information that $\psi \notin D(A^{k_0 + 1})$ demonstrates that our result is optimal.

As a last observation, more geared towards our second paper [13], let us perturb $H$ by adding a small multiple of $V$ to obtain $H_\sigma = H + (1 + \sigma)V$, with $|\sigma|$ being small. First note that the operator $H_\sigma$ can have at most one eigenvalue. Repeating some of the analysis from above, and appealing to the implicit function theorem, one can verify the following statements: There exist $\delta > 0$ and a function $\lambda : (-\delta, 0] \to \mathbb{R}$ such that
\[ \forall \sigma \in (0, \delta) : \ \sigma_{pp}(H_\sigma) = \emptyset, \qquad \forall \sigma \in (-\delta, 0] : \ \sigma_{pp}(H_\sigma) = \{\lambda(\sigma)\}. \]
The function $\lambda$ satisfies $\lambda(0) = 1$, $\lambda$ is real analytic in $(-\delta, 0)$, $\lambda \in C^{k_0 + 1}((-\delta, 0])$ and $\lambda \notin C^{k_0 + 2}((-\delta, 0])$. This establishes a natural limit on what one should expect from a perturbation theory for embedded eigenvalues. Indeed, it indicates that control of two commutators (corresponding to $k_0 = 1$) may suffice for second order perturbation theory, and this is in fact accomplished in [13, Sec. 5.1]. The strongest results, cf. [13, Sec. 5.2], require control of three commutators, since they rely on the condition $\psi \in D(A^2)$.

1.1. Singular Mourre theory

Before we formulate our results more precisely, we pause to discuss on a more heuristic level the origin of conjugate operators, and how we are led naturally to singular Mourre theory. Consider the operator $M_\omega$ of multiplication in momentum space $L^2(\mathbb{R}^d)$ by a dispersion relation $\omega$ assumed to be locally Lipschitz. The connection between dynamics and structure of the spectrum of a self-adjoint operator is fairly well understood, starting from Kato-smoothness and the RAGE theorem [33–36]. When looking for a conjugate operator, one should study the dynamics of the operator $M_\omega$. It is natural to identify what states have (at least) ballistic motion, that is, to find states $\psi_0$ satisfying $\langle x^2 \rangle_{\psi_t} \ge c t^2$, for some $c > 0$. Here $\psi_t = \exp(-itM_\omega)\psi_0$. The position operator $x$ is equal to $i\nabla_k$. We can compute this quantity explicitly and we get
\[ \langle x^2 \rangle_{\psi_t} = \langle x^2 \rangle_{\psi_0} + \int_0^t \langle x \cdot \nabla\omega + \nabla\omega \cdot x \rangle_{\psi_s}\, ds = \langle x^2 \rangle_{\psi_0} + t\, \langle x \cdot \nabla\omega + \nabla\omega \cdot x \rangle_{\psi_0} + t^2\, \langle |\nabla\omega|^2 \rangle_{\psi_0}. \]
We observe that if $\psi_0$ has support away from zeroes of $\nabla\omega$, then the motion is at least ballistic. More precisely, this is the case if $\operatorname{ess\,inf}_{k \in \operatorname{supp}\psi_0} |\nabla\omega(k)| \ge c > 0$. If $\omega = k^2$, the standard non-relativistic dispersion relation, we find that $\psi_0$ should be localized away from 0 in momentum space. Since $|\nabla\omega|^2 = 4\omega$, the requirement on $\psi_0$ can also be expressed as $\psi_0 \in E_{M_\omega}([c/4, \infty))L^2(\mathbb{R}^d)$, where $E_{M_\omega}$ denotes the spectral resolution associated with the self-adjoint operator $M_\omega$. We observe that the energy 0 has a special significance for the case $\omega = k^2$ and is called a threshold, in the sense that states localized in energy near a threshold may not have strict ballistic motion. A second example is $\omega = |k|$. Here we observe that $|\nabla\omega| = 1$, and hence all states $\psi_0$ will exhibit ballistic motion. In other words, this dispersion relation does not have thresholds. This of course reflects the constant (momentum independent) speed of light. See [21, Sec. 1.2] for a discussion of general dispersion relations.
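The ballistic growth formula above can be checked numerically in a toy case. The following is a minimal sketch, not part of the original argument: it assumes one dimension, $\omega(k) = k^2$, and a real Gaussian $\psi_0$ centered at a hypothetical momentum `k0` (for real $\psi_0$ the term linear in $t$ vanishes). The function names and grid parameters are illustrative choices only.

```python
import cmath


def mean_x2(psi, dk):
    """<x^2> in the state psi, with x = i d/dk, via central differences."""
    deriv = [(psi[j + 1] - psi[j - 1]) / (2 * dk) for j in range(1, len(psi) - 1)]
    norm2 = sum(abs(p) ** 2 for p in psi) * dk
    return sum(abs(d) ** 2 for d in deriv) * dk / norm2


def ballistic_check(t, k0=3.0, n=4001, half_width=8.0):
    """Return (<x^2> at time t, <x^2>_0 + t^2 <|grad omega|^2>_0) for omega = k^2."""
    dk = 2 * half_width / (n - 1)
    ks = [k0 - half_width + j * dk for j in range(n)]
    # Real Gaussian initial state: the cross term <x.grad w + grad w.x> is zero.
    psi0 = [cmath.exp(-((k - k0) ** 2) / 2) for k in ks]
    # Free evolution is multiplication by exp(-i t omega(k)) in momentum space.
    psit = [cmath.exp(-1j * t * k * k) * p for k, p in zip(ks, psi0)]
    norm2 = sum(abs(p) ** 2 for p in psi0) * dk
    grad2 = sum((2 * k) ** 2 * abs(p) ** 2 for k, p in zip(ks, psi0)) * dk / norm2
    return mean_x2(psit, dk), mean_x2(psi0, dk) + t * t * grad2


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        lhs, rhs = ballistic_check(t)
        print(f"t={t}: numerical <x^2> = {lhs:.4f}, formula = {rhs:.4f}")
```

Since the state is localized near `k0 = 3`, away from the threshold at $k = 0$, the quadratic term dominates quickly, in line with the discussion of thresholds above.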
When picking a conjugate operator in Mourre theory, one is precisely looking for an observable $a$ with at least ballistic growth. The choice often used is the Heisenberg derivative of $x^2$, where $x$ is some suitably chosen position observable. That is, one would naturally be led to consider
\[ a = \tfrac12 (x \cdot \nabla\omega + \nabla\omega \cdot x). \]
This is for example the case for the $N$-body problem, see e.g. [1, 8, 9, 27], and in the case of field theory see [10, 11, 16, 17, 22, 38], where the position is the Newton–Wigner position $d\Gamma(x)$. The free energy is $d\Gamma(M_\omega)$, and we get as conjugate operator $A = d\Gamma(a)$, where $a$ is as above. Here $d\Gamma(b)$ denotes the second quantization of a one-body operator $b$, cf. the following subsection on Nelson’s model. It is often advantageous to modify the conjugate operator so obtained, to simplify proofs or to circumvent some technical issues. In this paper we need the modified generator of translations $A_\delta$ from [22] in order to deal with the confined massless Nelson model, and more generally confined massless Pauli–Fierz models.

There are two issues that come up naturally when following the above guidelines for massless field theory models, like the Nelson model. One is already apparent in the one-particle setup discussed above. If $\omega(k) = |k|$, the resulting conjugate operator $a$, the generator of radial translations, does not have a self-adjoint realization. This appears to be a purely technical complication, but it becomes a serious issue when one is in need of localizations in the operator $a$. The operator is not normal, so we do not have spectral calculus at hand, only resolvents. This has so far not been a serious issue when dealing with the limiting absorption principle [11, 22, 26, 27, 38], and perturbation theory around an uncoupled system [11, 24]. It does however become an obstacle when one tries to apply the conjugate operator $a$ in the context of scattering theory [23].

In the present paper, non-self-adjointness of $a$ is also a serious obstacle, which we overcome, as in [23], by passing to a so-called expanded Hamiltonian. The idea is to write $L^2(\mathbb{R}^d) \simeq L^2(\mathbb{R}_+) \otimes L^2(S^{d-1})$ and double the Hilbert space to $L^2(\mathbb{R}) \otimes L^2(S^{d-1})$. The dispersion relation in polar coordinates is just multiplication by $r$, which when extended linearly to negative $r$ gives rise to the self-adjoint conjugate operator $i\partial/\partial r \otimes 1$. We thus work with an expanded Hamiltonian, and in the end pull our results back to the physical Hamiltonian. The reader should keep this in mind when going through the abstract conditions in the following section. However, passing to an expanded Hamiltonian is not a silver bullet; it comes with a price. The operator of multiplication by $r$ is no longer bounded from below, making it hard to utilize energy localizations. For this reason we have to develop an abstract theory which does not demand that any naturally occurring object can be controlled by the (expanded) Hamiltonian.

The second feature we want to discuss does not occur on the one-particle level, but only after second quantization. The free commutator becomes
\[ i[d\Gamma(|k|), A] = N, \]
where $N$ is the number operator. In the standard (regular) commutator based methods, one typically has the commutator bounded at least as a form on $D(H)$. (This is for example a consequence of a $C^1(A)$ assumption.) This is not the case here and we call such a situation singular. One could of course avoid this issue by observing that the operators involved conserve particle number, and then rescale $A$ by $1/n$ on the $n$-particle sector. However, perturbations are typically expressed in terms of field operators, and straying from second quantized conjugate operators gives rise to terms from the commutator with the perturbation that have so far not been controllable.

The $d\Gamma(|k|)$-unboundedness of the number operator has led authors to use a different conjugate operator instead, namely the second quantized generator of dilation given by $d\Gamma((x \cdot k + k \cdot x)/2)$, normally associated with the dispersion relation $k^2$. Here the commutator with $d\Gamma(|k|)$ is $d\Gamma(|k|)$ itself, so the issue disappears. However, this choice induces an artificial threshold at photon energy 0, which for a coupled system turns all eigenvalues of the atomic system into artificial thresholds. In order to circumvent this problem one can modify the generator of dilation by building the level shift from Fermi’s golden rule into the conjugate operator. This was done in [5] and gives rise to positive relatively bounded commutators, at weak coupling. There are however disadvantages to this approach. It does not cover situations where symmetries may cause embedded eigenvalues to persist to second order in perturbation theory. For the $N$-body problem in quantum mechanics, one can, for example, show that the underlying spectrum is absolutely continuous without a priori imposing Fermi’s golden rule, which can then subsequently be established [1, 27].

Works employing this choice of conjugate operator have, so far, not been able to address what happens outside the regime of weak coupling, which may be an issue since coupling constants typically are explicitly given numbers. In electron-photon models, the coupling constant involves the fine-structure constant 1/137, and in electron-phonon models from solid state physics, the coupling constants occurring may even be of the order 1. Effective coupling constants may also depend on an ultraviolet cutoff, thus imposing apparently artificial limitations on the size of the cutoff. Finally, the restriction on the size of the coupling constant is always locally uniform in energy. That is, all statements of this type hold only below a fixed $E_0$. Papers employing the generator of dilation include [4, 5, 18]. We remark that in [24], the author modifies the generator of radial translation, as was done in [5] for the generator of dilations, in order to establish Fermi’s golden rule. We have no need for this construction since we follow the strategy of [1, 27, 31].

Instead of viewing the unboundedness of the first commutator with respect to $d\Gamma(|k|)$ as a technical problem, one can also adopt the point of view that it is a feature of the model which can be exploited. This is most obviously done for small coupling constants, where one gets a positive commutator globally in energy, modulo a compact error. This was done in [11, 17, 24, 38]. In [22] the extra positivity of the commutator is directly utilized to prove a Mourre estimate
at arbitrary coupling constant, the first (and so far only) such result for massless models. Another piece of information one can extract is that the number operator has finite expectation in bound states. This was done in [38] for small coupling constants and generally in [22]. A more subtle property is that one can obtain a stronger limiting absorption principle, see [21, 31], which has so far not found an application. Here we prove in particular that bound states are in the domain of the number operator, not just in its form domain. We have not discussed positive temperature models, where one has a similar situation, except that so far no positive commutator estimates at arbitrary coupling have been proven, regardless of choice of conjugate operator. See e.g. [12, 19] and references therein.

1.2. The Nelson model

The model describes a confined atomic system coupled to a massless scalar quantum field. The Hamiltonian $K$ of the atomic system is
\[ K = -\sum_{i=1}^{P} \frac{1}{2m_i}\Delta_i + \sum_{i<j} V_{ij}(x_i - x_j) + W(x_1, \dots, x_P) \tag{1.4} \]
acting on $\mathcal{K} = L^2(\mathbb{R}^{3P})$. Here $m_i > 0$ denotes the mass of the $i$th particle located at $x_i \in \mathbb{R}^3$. We write $x = (x_1, \dots, x_P) \in \mathbb{R}^{3P}$. The external potential $W$ is the confinement and must satisfy

(W0) $W \in L^2_{\mathrm{loc}}(\mathbb{R}^{3P})$ and there exist positive constants $c_0$, $c_1$ and $\alpha > 2$ such that $W(x) \ge c_0 |x|^{2\alpha} - c_1$.

As for the pair potentials $V_{ij}$, they should satisfy

(V0) The $V_{ij}$’s are $\Delta$-bounded with relative bound 0.

The Hilbert space for the scalar bosons is the symmetric Fock space $\mathcal{F} = \Gamma(L^2(\mathbb{R}^3))$ and the kinetic energy for the massless bosons is $d\Gamma(|k|)$, the second quantization of the operator of multiplication with the massless dispersion relation $|k|$. The uncoupled Hamiltonian, describing the atomic system and the scalar field, is $K \otimes 1_{\mathcal{F}} + 1_{\mathcal{K}} \otimes d\Gamma(|k|)$, as an operator on the full Hilbert space $\mathcal{H} = \mathcal{K} \otimes \mathcal{F}$. Our next task is to introduce a coupling of the form
\[ I_\rho(x) = \sum_{i=1}^{P} \phi_\rho(x_i), \]
where $\phi_\rho(y)$ is an ultraviolet and infrared regularized field operator
\[ \phi_\rho(y) = \frac{1}{\sqrt{2}} \int_{\mathbb{R}^3} \big( \rho(k) e^{-ik \cdot y} a^*(k) + \overline{\rho(k)}\, e^{ik \cdot y} a(k) \big)\, dk. \tag{1.5} \]
We assume purely for simplicity that $\rho$ only depends on $k$ through its modulus. To conform with the notation used in [22], we introduce
\[ \tilde\rho(r) = r\rho(r, 0, 0), \quad \text{such that } |k|\rho(k) = \tilde\rho(|k|). \]
For the interacting Hamiltonian, indexed by the coupling function $\rho$,
\[ H_\rho^N = K \otimes 1_{\mathcal{F}} + 1_{\mathcal{K}} \otimes d\Gamma(|k|) + I_\rho(x) \tag{1.6} \]
to be essentially self-adjoint on $D(K) \otimes \Gamma_{\mathrm{fin}}(C_0^\infty(\mathbb{R}^3))$, we need the following basic assumption on $\rho$.

(ρ1) $\int_0^\infty (1 + r^{-1}) |\tilde\rho(r)|^2\, dr < \infty$.

Here $\Gamma_{\mathrm{fin}}(V)$ denotes the subspace of $\mathcal{F}$ consisting of elements $\eta$ with only finitely many $n$-particle components $\eta^{(n)}$ nonzero, and those that are nonzero lie in the $n$-fold algebraic tensor product of the subspace $V \subseteq L^2(\mathbb{R}^3)$. Note that $\Gamma_{\mathrm{fin}}(V)$ is dense in $\mathcal{F}$ if $V$ is dense in $L^2(\mathbb{R}^3)$.

In order to formulate the remaining assumption on $\rho$ we introduce a function $d \in C^\infty((0, \infty))$, which measures the amount of infrared regularization carried by $\rho$. It should, for some $C_d > 0$, satisfy
\[ d(r) = 1 \ \text{ for } r \ge 1, \qquad -\frac{C_d}{r}\, d(r) \le d'(r) < 0, \qquad \lim_{r \to 0^+} d(r) = +\infty. \tag{1.7} \]
Note that the conditions above imply that $1 \le d(r) \le r^{-C_d}$, for $r \in (0, 1]$. In order to simplify some expressions below we make the additional assumption that
\[ \forall r \in (0, 1] : \ d(r) \le C_d\, r^{-\frac12}, \tag{1.8} \]
for some $C_d > 0$. In practice we want to construct a $d$ with as weak a singularity as possible, so this extra assumption is no restriction. We formulate the remaining conditions on $\rho$, of which the first two also appeared in [22].

(ρ2) $\int_0^\infty (1 + r^{-1})\, d(r)^2 \big[ r^{-2} |\tilde\rho(r)|^2 + \big|\frac{d\tilde\rho}{dr}(r)\big|^2 \big]\, dr < \infty$.
(ρ3) $\int_0^\infty \big|\frac{d^2\tilde\rho}{dr^2}(r)\big|^2\, dr < \infty$.
(ρ4) $\int_0^\infty r^4 |\tilde\rho(r)|^2\, dr < \infty$.

We remark that (ρ2) and (ρ4) imply (ρ1). A typical form of $\rho$, and hence $\tilde\rho$, would be
\[ \rho(k) = e^{-\frac{|k|^2}{2\Lambda^2}}\, |k|^{-\frac12 + \epsilon}, \qquad \tilde\rho(r) = e^{-\frac{r^2}{2\Lambda^2}}\, r^{\frac12 + \epsilon}. \tag{1.9} \]
One can construct a $d$ by gluing together the functions 1 and $r^{-\epsilon'}$, with $0 < \epsilon' < \min\{\epsilon, 1/2\}$. The parameters $\Lambda$ and $\epsilon$ are the ultraviolet and infrared regularization parameters. Ideally we would like to have $\Lambda = \infty$ and $\epsilon = 0$. For the conditions (ρ1)–(ρ4) to be satisfied we must have $0 < \Lambda < \infty$ and $\epsilon > 1$. Observe that it is the condition (ρ3) on the second derivative of $\tilde\rho$ that causes the strongest restriction on $\epsilon$.
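To see where the restriction from (ρ3) comes from, one can expand the second derivative of $\tilde\rho$ from (1.9) near $r = 0$. The following is only a sketch: $\epsilon$ denotes the infrared parameter of (1.9), and the shorthand $s$ and $g$ is ours, not the paper's.

```latex
% Shorthand (ours): s = 1/2 + \epsilon, g(r) = e^{-r^2/(2\Lambda^2)}, so \tilde\rho(r) = r^s g(r).
\begin{align*}
\frac{d^2\tilde\rho}{dr^2}(r)
  &= s(s-1)\, r^{s-2} g(r) + 2s\, r^{s-1} g'(r) + r^{s} g''(r) \\
  &= \big(\epsilon^2 - \tfrac14\big)\, r^{\epsilon - \frac32} + O\big(r^{\epsilon + \frac12}\big),
  \qquad r \to 0^+ , \\
\int_0^1 \Big|\frac{d^2\tilde\rho}{dr^2}(r)\Big|^2 dr < \infty
  &\iff 2\epsilon - 3 > -1 \iff \epsilon > 1
  \qquad \big(\epsilon \neq \tfrac12\big).
\end{align*}
```

In the exceptional case $\epsilon = \frac12$ the leading coefficient vanishes, but then (ρ2) fails instead, since $d \ge 1$ and $r^{-3}|\tilde\rho(r)|^2$ behaves like $r^{-1}$ near $r = 0$.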
Observe that the set of $\rho$’s satisfying (ρ1)–(ρ4) is a complex vector space $\mathcal{I}_N(d) \subseteq L^2(\mathbb{R}^3)$, which can be equipped with a norm matching the four conditions. That is,
\[ \|\rho\|_N^2 := \int_0^\infty \Big[ (r^4 + d(r)^2 r^{-3}) |\tilde\rho(r)|^2 + (1 + r^{-1})\, d(r)^2 \Big|\frac{d\tilde\rho}{dr}(r)\Big|^2 + \Big|\frac{d^2\tilde\rho}{dr^2}(r)\Big|^2 \Big]\, dr. \tag{1.10} \]
In order to formulate our main theorem, we need to introduce an operator conjugate to $H_\rho^N$. We use the one constructed in [22], for which a Mourre estimate has been established under the assumptions above. Let $\chi \in C_0^\infty(\mathbb{R})$, with $0 \le \chi \le 1$, $\chi(r) = 1$ for $|r| < 1/2$, and $\chi(r) = 0$ for $|r| > 1$. For $0 < \delta \le 1/2$ we define a function on $(0, \infty)$ by
\[ s_\delta(r) = \chi(r/\delta)\, d(\delta)\, r^{-1} + (1 - \chi)(r/\delta)\, d(r)\, r^{-1}. \]
Using this function we construct a vector field by $s_\delta(k) = s_\delta(|k|)k$, which equals $k/|k|$ for $|k| > 1$ and $d(\delta)k/|k|$ for $|k| < \delta/2$. The conjugate operator on the one-particle sector is
\[ a_\delta = \tfrac12 (s_\delta \cdot i\nabla_k + i\nabla_k \cdot s_\delta). \tag{1.11} \]
The operator is symmetric and closable on $\{f \in C_0^\infty(\mathbb{R}^3) \mid f(0) = 0\}$. We denote again by $a_\delta$ its closure, which is a maximally symmetric operator, but not self-adjoint. It is a modification, near $k = 0$, of the generator of radial translations $a = \big(\frac{k}{|k|} \cdot i\nabla_k + i\nabla_k \cdot \frac{k}{|k|}\big)/2$. The conjugate operator is now the maximally symmetric operator $A_\delta = 1_{\mathcal{K}} \otimes d\Gamma(a_\delta)$. The second quantization $d\Gamma(a)$ of the generator of radial translations works as conjugate operator if one stays close to the uncoupled system. See [11, 24, 38]. It is not known if one really needs the modified generator of radial translations $A_\delta$ in order to get a Mourre estimate at arbitrary coupling.

For an eigenvalue $E \in \sigma_{pp}(H_\rho^N)$ we write $P_\rho$ for the associated eigenprojection. It is known from [22] that $P_\rho$ has finite dimensional range. Finally, we need the number operator $N = 1_{\mathcal{K}} \otimes d\Gamma(1_{L^2(\mathbb{R}^3)})$. We will make use of the same notation for the (usual) number operator on $\mathcal{F}$. Our main result of this paper, formulated in terms of the Nelson model, is

Theorem 1.2. Suppose (W0) and (V0). Let $E_0 \in \mathbb{R}$ and $\rho_0 \in \mathcal{I}_N(d)$ be given. There exist $0 < \delta \le 1/2$, $r > 0$ and $C > 0$ such that for any $\rho \in \mathcal{I}_N(d)$, with $\|\rho - \rho_0\|_N \le r$, and $E \in \sigma_{pp}(H_\rho^N) \cap (-\infty, E_0]$ we have
\[ P_\rho : \mathcal{H} \to D(N^{\frac12} A_\delta) \cap D(A_\delta N^{\frac12}) \cap D(N) \]
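and
\[ \|N^{\frac12} A_\delta P_\rho\| + \|A_\delta N^{\frac12} P_\rho\| + \|N P_\rho\| \le C. \]
We remark that for any $\delta > 0$ small enough, one can find $r$ and $C$ such that the conclusion of the theorem holds. See Theorem 5.2. The above suffices for our purpose and is a cleaner statement.

We can implement a unitary transformation, the so-called Pauli–Fierz transform, which has the effect of smoothing the infrared singularity. Let $U_\rho = \exp(-iP\phi_{i\rho/|k|}(0))$ be the unitary transformation with
\[ U_\rho\, a(k)\, U_\rho^* = a(k) - \frac{P\rho(k)}{\sqrt{2}|k|} \quad\text{and}\quad U_\rho\, a^*(k)\, U_\rho^* = a^*(k) - \frac{P\overline{\rho(k)}}{\sqrt{2}|k|}. \]
For the transformation $U_\rho$ to be well-defined we must require that $\int_{\mathbb{R}^3} |k|^{-2} |\rho(k)|^2\, dk < \infty$. To achieve this we strengthen (ρ1) to read

(ρ1′) $\int_0^\infty (1 + r^{-2}) |\tilde\rho(r)|^2\, dr < \infty$.

We then get
\[ H_\rho'^{N} = (1_{\mathcal{K}} \otimes U_\rho) H_\rho^N (1_{\mathcal{K}} \otimes U_\rho)^* = K_\rho \otimes 1_{\mathcal{F}} + 1_{\mathcal{K}} \otimes d\Gamma(|k|) + I_\rho(x) - I_\rho(0), \tag{1.12} \]
where
\[ K_\rho = K - \sum_{i=1}^{P} v_\rho(x_i) + \frac{P^2}{2} \int_0^\infty r^{-1} |\tilde\rho(r)|^2\, dr\; 1_{\mathcal{K}} \tag{1.13} \]
and
\[ v_\rho(y) = P \int_{\mathbb{R}^3} \frac{|\rho(k)|^2}{|k|} \cos(k \cdot y)\, dk. \tag{1.14} \]
Observe that
\[ \phi_\rho(y) - \phi_\rho(0) = \frac{1}{\sqrt{2}} \int_{\mathbb{R}^3} \big( \rho(k)(e^{-ik \cdot y} - 1) a^*(k) + \overline{\rho(k)}\, (e^{ik \cdot y} - 1) a(k) \big)\, dk. \]
The estimate
\[ |e^{\pm ik \cdot y} - 1| \le \min\{2, |k||y|\} \le 2\, \frac{|k|}{\langle k \rangle}\, \langle y \rangle, \tag{1.15} \]
with $\langle \eta \rangle = (1 + |\eta|^2)^{1/2}$, enables us to extract an extra infrared regularization using the decay in $x$ supplied by the confinement condition (W0). Keeping (1.8) and (ρ1′) in mind, the remaining two assumptions on $\rho$ now weaken to

(ρ2′) $\int_0^\infty \big|\frac{d\tilde\rho}{dr}(r)\big|^2\, dr < \infty$.
(ρ3′) $\int_0^\infty \frac{r^2}{1 + r^2} \big|\frac{d^2\tilde\rho}{dr^2}(r)\big|^2\, dr < \infty$.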
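For completeness, here is one way to verify the second inequality in the elementary estimate (1.15), starting from the standard bound $|e^{i\theta} - 1| \le \min\{2, |\theta|\}$ for real $\theta$. This is a sketch in our own shorthand ($t = |k| > 0$, $s = |y|$), not taken from the source; both sides are nonnegative, so it suffices to compare squares.

```latex
% Shorthand (ours): t = |k| > 0, s = |y|, \langle t \rangle = (1 + t^2)^{1/2}.
% Claim: \min\{2, ts\} \le 2 t \langle s \rangle / \langle t \rangle.
\begin{align*}
\text{If } ts \ge 2: &\quad
  \Big( \frac{2t\langle s \rangle}{\langle t \rangle} \Big)^2 \ge 4
  \iff t^2 (1 + s^2) \ge 1 + t^2
  \iff t^2 s^2 \ge 1 . \\
\text{If } ts < 2: &\quad
  \Big( \frac{2t\langle s \rangle}{\langle t \rangle} \Big)^2 \ge t^2 s^2
  \iff 4 (1 + s^2) \ge s^2 (1 + t^2)
  \iff 4 + 3 s^2 \ge t^2 s^2 .
\end{align*}
```

Both final inequalities hold in their respective cases (the second because $t^2 s^2 < 4$), and the case $t = 0$ is trivial.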
The condition (ρ4), being an ultraviolet condition, is unchanged. For the choice (1.9) to satisfy (ρ1′)–(ρ3′) and (ρ4) we must have $0 < \Lambda < \infty$ and $\epsilon > 0$. Here the first three conditions on $\rho$ all require $\epsilon > 0$. Observe again that the set of $\rho$ satisfying (ρ1′)–(ρ3′) and (ρ4) is a complex vector space $\mathcal{I}_N'(d)$. We introduce the natural norm
\[ \|\rho\|_N'^{\,2} := \int_0^\infty \Big[ (r^4 + r^{-2}) |\tilde\rho(r)|^2 + \Big|\frac{d\tilde\rho}{dr}(r)\Big|^2 + \frac{r^2}{1 + r^2} \Big|\frac{d^2\tilde\rho}{dr^2}(r)\Big|^2 \Big]\, dr. \]
Fix a $\rho_0 \in \mathcal{I}_N'(d)$. There are now two avenues one can follow. Either one can continue as above, and for each $\rho$ in a $\|\cdot\|_N'$-ball around $\rho_0$ apply the transformation $U_\rho$ to arrive at the more regular Hamiltonian $H_\rho'^{N}$ that we can fit into our class of Pauli–Fierz models. A second option would be to apply the same transformation $U_{\rho_0}$ regardless of the $\rho$ chosen near $\rho_0$. The advantage of this is two-fold: Firstly, we would be working in the same coordinate system for all $\rho$’s, which in the context of perturbation theory, cf. [13], is the most natural. Secondly, in this way the Hamiltonian will have a linear dependence on the “perturbation” $\rho - \rho_0$, which is a requirement in [13]. The drawback is that $\rho - \rho_0$ has to be an element of $\mathcal{I}_N(d)$, and for example cannot be a small multiple of $\rho_0$.

To implement the latter approach, we now let $\rho = \rho_0 + \rho_1$, with $\rho_1 \in \mathcal{I}_N(d)$, the space of regular interactions. We then employ the transformation $U_{\rho_0}$, which yields the transformed Hamiltonian
\[ H_\rho''^{N} = (1_{\mathcal{K}} \otimes U_{\rho_0}) H_\rho^N (1_{\mathcal{K}} \otimes U_{\rho_0})^* = H_{\rho_0}'^{N} + I_{\rho_1}(x) - \sum_{i=1}^{P} v_{\rho_0, \rho_1}(x_i), \tag{1.16} \]
where
\[ v_{\rho_0, \rho_1}(y) = P\, \mathrm{Re} \int_{\mathbb{R}^3} \frac{\rho_1(k) \overline{\rho_0(k)}}{|k|}\, e^{-ik \cdot y}\, dk. \tag{1.17} \]
For an eigenvalue $E \in \sigma_{pp}(H_\rho^N)$ we write $P_\rho' = (1_{\mathcal{K}} \otimes U_\rho) P_\rho (1_{\mathcal{K}} \otimes U_\rho)^*$ for the associated eigenprojection for $H_\rho'^{N}$, and $P_\rho'' = (1_{\mathcal{K}} \otimes U_{\rho_0}) P_\rho (1_{\mathcal{K}} \otimes U_{\rho_0})^*$ for the associated eigenprojection for $H_\rho''^{N}$. Again $P_\rho'$ and $P_\rho''$ have finite dimensional ranges. Theorem 5.2 can be applied to the transformed Hamiltonian and we arrive at the following theorem.

Theorem 1.3. Suppose (W0) and (V0). Let $E_0 \in \mathbb{R}$ and $\rho_0 \in \mathcal{I}_N'(d)$ be given. There exist $0 < \delta \le 1/2$, $r > 0$ and $C > 0$ such that

(1) For any $\rho \in \mathcal{I}_N'(d)$ with $\|\rho - \rho_0\|_N' \le r$ and $E \in \sigma_{pp}(H_\rho^N) \cap (-\infty, E_0]$ we have
\[ P_\rho' : \mathcal{H} \to D(N^{\frac12} A_\delta) \cap D(A_\delta N^{\frac12}) \cap D(N) \]
and
\[ \|N^{\frac12} A_\delta P_\rho'\| + \|A_\delta N^{\frac12} P_\rho'\| + \|N P_\rho'\| \le C. \]
(2) For any $\rho_1 \in \mathcal{I}_N(d)$ with $\|\rho_1\|_N \le r$ and $E \in \sigma_{pp}(H_\rho^N) \cap (-\infty, E_0]$, where $\rho = \rho_0 + \rho_1$, we have
\[ P_\rho'' : \mathcal{H} \to D(N^{\frac12} A_\delta) \cap D(A_\delta N^{\frac12}) \cap D(N) \]
and
\[ \|N^{\frac12} A_\delta P_\rho''\| + \|A_\delta N^{\frac12} P_\rho''\| + \|N P_\rho''\| \le C. \]
Unfortunately the transformation $U_\rho$, with $\rho \in \mathcal{I}_N'(d)$, is too singular to allow for a recovery of the full set of regularity results for the original Hamiltonian $H_\rho^N$, as in Theorem 1.2. The only thing that remains after undoing the transformation is the following corollary to Theorem 1.3(1). The same argument using Theorem 1.3(2) would give a weaker result. Theorem 1.3(2) will however play a role in [13].

Corollary 1.4. Suppose (W0) and (V0). Let $E_0 \in \mathbb{R}$ and $\rho_0 \in \mathcal{I}_N'(d)$ be given. There exist $0 < \delta \le 1/2$, $r > 0$ and $C > 0$ such that for any $\rho \in \mathcal{I}_N'(d)$ with $\|\rho - \rho_0\|_N' \le r$ and $E \in \sigma_{pp}(H_\rho^N) \cap (-\infty, E_0]$ we have
\[ P_\rho : \mathcal{H} \to D(N) \quad\text{and}\quad \|N P_\rho\| \le C. \]
We make a number of remarks concerning the results above. The domain of $a_\delta$ is independent of $\delta$, and in fact equals the domain of the generator of radial translations. The same is (presumably) false for the second quantized versions. This is the reason for the somewhat unpleasant formulation of the theorems in terms of $A_\delta$. It should be read in the context of Mourre’s commutator method, and in [13] we need the regularity formulated in terms of $A_\delta$.

The statement that bound states are in the domain of the number operator is new. Previously it was only known that bound states are in the domain of $N^{1/2}$. See [22].

The reader should first and foremost read the results above with $\rho = \rho_0$. In the sequel [13] we need the locally uniform version to deduce a Fermi golden rule under minimal assumptions. In traditional approaches to Fermi’s golden rule, one typically requires unperturbed bound states to be in the domain of the square of the conjugate operator. See [1, 27, 31]. In [13] we reduce the requirement to bound states $\psi$ being in the domain of the conjugate operator itself, at the expense of a need for the norm $\|A_\delta \psi\|$ to be bounded uniformly in $\rho$ in a ball around the unperturbed coupling function $\rho_0$ and uniformly in $E$ running over eigenvalues of $H_\rho$ in a fixed compact interval. This motivates the somewhat unorthodox formulation in Theorem 1.2.

The conditions (ρ3) and (ρ3′) come from a need of handling the double commutator $[[H_\rho, A_\delta], A_\delta]$. It is not a priori obvious that we should be able to place bound states in the domain of $A_\delta$ with control of just two commutators. In the context of regular Mourre theory the question is addressed in [8, 9], where the author(s) need three commutators to conclude a result of this type, which in view of Example 1.1 is not optimal. To deal effectively with the infrared singularity, it is crucial to minimize the number of commutators needed.
1.3. The AC-Stark model

The model describes a system of $N$ charged particles in a nonzero time-periodic Stark field with zero mean (AC-Stark field). The particles are here taken to be three-dimensional and we assume that the field is 1-periodic and, for simplicity, that it is continuous, i.e. that $\tilde E \in C([0, 1]; \mathbb{R}^3)$. The Hamiltonian is of the form
\[ \tilde h(t) = \sum_{i=1}^{N} \Big( \frac{p_i^2}{2m_i} - q_i \tilde E(t) \cdot x_i \Big) + V; \tag{1.18} \]
here $x_i$, $m_i$ and $q_i$ are the position, the mass and the charge of the $i$th particle, respectively, and $p_i = -i\nabla_{x_i}$ is its momentum. The potential is of the form
\[ V = \sum_{1 \le i < j \le N} v_{ij}(x_i - x_j), \tag{1.19} \]
where the pair-potentials obey

Conditions 1.5. Let $k_0 \in \mathbb{N}$ be given. For each pair $(i, j)$ the pair-potential $\mathbb{R}^3 \ni y \to v_{ij}(y) \in \mathbb{R}$ splits into a sum $v_{ij} = v_{ij}^1 + v_{ij}^2$ where

(1) Differentiability: $v_{ij}^1 \in C^{k_0 + 1}(\mathbb{R}^3)$ and $v_{ij}^2 \in C^{k_0 + 1}(\mathbb{R}^3 \setminus \{0\})$.
(2) Global bounds: For all $\alpha$ with $|\alpha| \le k_0 + 1$ there are bounds $|y|^{|\alpha|} |\partial_y^\alpha v_{ij}^1(y)| \le C$.
(3) Decay at infinity: $|v_{ij}^1(y)| + |y \cdot \nabla_y v_{ij}^1(y)| = o(1)$.
(4) Local singularity: $v_{ij}^2$ is compactly supported and for all $\alpha$ with $|\alpha| \le k_0 + 1$ there are bounds $|y|^{|\alpha| + 1} |\partial_y^\alpha v_{ij}^2(y)| \le C$; $y \ne 0$.

In the above conditions, the letter $\alpha$ denotes multi-indices. Note that (1.18) and (1.19) with $v_{ij}(y) = q_i q_j |y|^{-1}$ conform with Conditions 1.5 for any $k_0$. Introducing the inner product $x \cdot y = \sum_i 2m_i\, x_i \cdot y_i$ for $x = (x_1, \dots, x_N)$, $y = (y_1, \dots, y_N) \in \mathbb{R}^{3N}$ we can split
\[ \mathbb{R}^{3N} = X_{CM} \oplus X; \qquad X_{CM} = \{x \in \mathbb{R}^{3N} \mid x_1 = \cdots = x_N\}. \]
There is a corresponding splitting
\[ \tilde h(t) = h_{CM}(t) \otimes I + I \otimes h(t) \quad \text{on } L^2(X_{CM}) \otimes L^2(X), \]
where
\[ h_{CM}(t) = p_{CM}^2 - E_{CM}(t) \cdot x \quad\text{and}\quad h(t) = p^2 - E(t) \cdot x + V. \]
Here
\[ E_{CM} = \frac{Q}{2M} (\tilde E, \dots, \tilde E) \quad\text{and}\quad E = \Big( \Big( \frac{q_1}{2m_1} - \frac{Q}{2M} \Big) \tilde E, \dots, \Big( \frac{q_N}{2m_N} - \frac{Q}{2M} \Big) \tilde E \Big), \]
where $Q = q_1 + \cdots + q_N$ and $M = m_1 + \cdots + m_N$ are the total charge and mass of the system, respectively. In the special case where all the particles have identical charge to mass ratio, we see that the center of mass Hamiltonian is just an ordinary time-independent $N$-body Hamiltonian. Otherwise the Hamiltonian $h(t)$ depends
non-trivially on the time-variable $t$. We denote by $\tilde U(t, s)$, $U_{CM}(t, s)$ and $U(t, s)$ the dynamics generated by $\tilde h(t)$, $h_{CM}(t)$ and $h(t)$, respectively, and observe that
\[ \tilde U(t, s) = U_{CM}(t, s) \otimes U(t, s). \]
We shall address spectral properties of the monodromy operator $U(1, 0)$. Note that this is a unitary operator on $L^2(X)$.

Let $\mathcal{A}$ be the set of all cluster partitions $a = \{C_1, \dots, C_{\#a}\}$, $1 \le \#a \le N$, each given by splitting the set of particles $\{1, \dots, N\}$ into non-empty disjoint clusters $C_i$. The spaces $X_a$, $a \in \mathcal{A}$, are the spaces of configurations of the $\#a$ centers of mass of the clusters $C_i$ (in the center of mass frame). The complement $X^a = X^{C_1} \oplus \cdots \oplus X^{C_{\#a}}$ is the space of relative configurations within each of the clusters $C_i$. More precisely,
\[ X^{C_i} = \{x \in X \mid x_j = 0,\ j \notin C_i\} \quad\text{and}\quad X_a = \{x \in X \mid k, l \in C_i \Rightarrow x_k = x_l\}. \]
We will write $x^a$ and $x_a$ for the orthogonal projection of a vector $x$ onto the subspace $X^a$ and its orthogonal complement, respectively. Notice the natural ordering on $\mathcal{A}$: $a \subset b$ if and only if any cluster $C \in a$ is contained in some cluster $C' \in b$. Clearly the minimal and maximal elements are $a_{\min} = \{(1), \dots, (N)\}$ and $a_{\max} = \{(1, \dots, N)\}$, respectively. Any pair $(i, j)$ defines an $N - 1$ cluster decomposition $(ij) \in \mathcal{A}$ by letting $C = \{i, j\}$ constitute a cluster and all others being one-particle clusters.

For each $a \ne a_{\max}$ the sub-Hamiltonian monodromy operator is $U^a(1, 0)$; it is defined as the monodromy operator on $\mathcal{H}^a = L^2(X^a)$ constructed for $a \ne a_{\min}$ from $h^a = (p^a)^2 - E(t)^a \cdot x^a + V^a$, $V^a = \sum_{(ij) \subset a} v_{ij}(x_i - x_j)$. If $a = a_{\min}$ we define $U^a(1, 0) = 1$ (implying $\sigma_{pp}(U^{a_{\min}}(1, 0)) = \{1\}$). The condition $\int_0^1 E(t)\, dt = 0$ leads to the existence of a unique 1-periodic function $b$ such that
\[ \frac{d}{dt} b(t) = E(t) \quad\text{and}\quad \int_0^1 b(t)\, dt = 0. \]
The set of thresholds is
\[ \mathcal{F}(U(1, 0)) = \bigcup_{a \ne a_{\max}} e^{-i\alpha_a}\, \sigma_{pp}(U^a(1, 0)); \qquad \alpha_a = \int_0^1 |b(t)_a|^2\, dt. \tag{1.20} \]
We recall from [31] that the set of thresholds is closed and countable, and non-threshold eigenvalues, i.e. points in σpp (U (1, 0))\F (U (1, 0)), have finite multiplicity and can only accumulate at the set of thresholds. Moreover any corresponding bound state is exponentially decaying, the singular continuous spectrum σsc (U (1, 0)) = ∅ and there are integral propagation estimates for states localized away from the set of eigenvalues and away from F (U (1, 0)). These properties are known under Conditions 1.5 with k0 = 1. For completeness of presentation we mention that some of the results of [31] hold under more general conditions, in particular the exponential decay result does not require that the Coulomb singularity of each pair-potential (if present) is located at the origin (this applies to Born–Oppenheimer molecules in an AC-Stark field).
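Returning briefly to the splitting $\mathbb{R}^{3N} = X_{CM} \oplus X$ introduced at the start of this subsection, the formulas for $E_{CM}$ and $E$ can be sanity-checked numerically. The sketch below is illustrative only: the function names are ours, and `field` stands for the value of $\tilde E(t)$ at one fixed time. It checks that the two parts are orthogonal in the mass-weighted inner product $x \cdot y = \sum_i 2m_i\, x_i \cdot y_i$, and that $E$ vanishes when all charge to mass ratios coincide.

```python
def split_field(masses, charges, field):
    """Split the vector (q_i / (2 m_i)) * field into its X_CM and X components."""
    total_mass = sum(masses)
    total_charge = sum(charges)
    ratio = total_charge / (2 * total_mass)
    # Center-of-mass part: the same vector Q/(2M) * field for every particle.
    e_cm = [tuple(ratio * c for c in field) for _ in masses]
    # Relative part: (q_i/(2 m_i) - Q/(2M)) * field for particle i.
    e_rel = [tuple((q / (2 * m) - ratio) * c for c in field)
             for m, q in zip(masses, charges)]
    return e_cm, e_rel


def inner(masses, x, y):
    """Mass-weighted inner product x.y = sum_i 2 m_i x_i . y_i on R^{3N}."""
    return sum(2 * m * sum(a * b for a, b in zip(xi, yi))
               for m, xi, yi in zip(masses, x, y))


if __name__ == "__main__":
    masses = [1.0, 2.0, 3.5]      # hypothetical masses
    charges = [1.0, -1.0, 2.0]    # hypothetical charges
    field = (0.3, -1.2, 0.5)      # hypothetical field value at a fixed time
    e_cm, e_rel = split_field(masses, charges, field)
    print("orthogonality defect:", inner(masses, e_cm, e_rel))
```

With equal charge to mass ratios the relative part is identically zero, which is the special case mentioned above in which the time dependence drops out of $h(t)$.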
June 3, J070-S0129055X11004333
468
2011 13:31 WSPC/S0129-055X
148-RMP
J. Faupin, J. S. Møller & E. Skibsted
Letting
$$A(t) = \tfrac12\bigl(x \cdot (p - b(t)) + (p - b(t)) \cdot x\bigr), \tag{1.21}$$
and using a different frame, we prove in Sec. 6:

Theorem 1.6. Suppose Conditions 1.5, for some $k_0 \in \mathbb N$. Let $\varphi$ be a bound state for $U(1,0)$ pertaining to an eigenvalue $e^{-i\lambda} \notin \mathcal F(U(1,0))$. Then
(1) $\varphi \in D(A(1)^{k_0})$ where $A(t)$ is given by (1.21).
(2) If for all pairs $(i,j)$ the term $v_{ij}^2 = 0$ then $\varphi \in D(|p|^{k_0+1})$.

The result (1) is new for $k_0 > 1$ while it is essentially contained in [31] for $k_0 = 1$, see [31, Proposition 8.7(ii)]. We remark that the highest degree of smoothness known in general in the case $v_{ij}^2 \ne 0$ is $\varphi \in D(|p|)$, cf. [31, Theorem 1.8]. This holds without the non-threshold condition. The result (2) overlaps with [29, Theorem 1.2], when $N = 2$ and “$k_0 = \infty$”.

2. Assumptions and Statement of Regularity Results

For a self-adjoint operator $A$ on a Hilbert space $\mathcal H$, we will make use of the $C^1(A)$ class of operators. This class consists a priori of bounded operators $B$ with the property that $[B,A]$ extends from a form on $D(A)$ to a bounded form on $\mathcal H$. The class is (consistently) extended to self-adjoint operators $H$, by requiring that $(H-z)^{-1}$ is of class $C^1(A)$, for some (and hence all) $z \in \rho(H)$, the resolvent set of $H$. We will use the notation $H \in C^1(A)$ to indicate that an operator $H$ is of class $C^1(A)$.

If $H$ is of class $C^1(A)$ then $D(H) \cap D(A)$ is dense in $D(H)$ and the form $[H,A]$ extends by continuity from the form domain $D(H) \cap D(A)$ to a bounded form on $D(H)$. The extension is denoted by $[H,A]^0$, and is also interpreted as an element of $\mathcal B(D(H), D(H)^*)$. If in addition $[H,A]^0$ extends by continuity to an element of $\mathcal B(D(H), \mathcal H)$, then we say $H$ is of class $C^1_{\mathrm{Mo}}(A)$. Note that being of class $C^1_{\mathrm{Mo}}(A)$ is equivalent to having the conditions of Mourre [30] satisfied for the first commutator. See [20].

Conditions 2.1. Let $\mathcal H$ be a complex Hilbert space. Suppose there are given some self-adjoint operators $H$, $A$ and $N$ as well as a symmetric operator $H'$ with $D(H') = D(N)$. Suppose $N \ge 1$. Let $R(\eta) = (A - \eta)^{-1}$ for $\eta \in \mathbb C \setminus \mathbb R$.
(1) The operator $N$ is of class $C^1_{\mathrm{Mo}}(A)$. We abbreviate $N' = i[N,A]^0$.
(2) The operator $N$ is of class $C^1(H)$, and there exists $0 < \kappa \le \tfrac12$ such that the commutator obeys
$$i[N,H]^0 \in \mathcal B\bigl(N^{-\frac12+\kappa}\mathcal H,\ N^{\frac12-\kappa}\mathcal H\bigr). \tag{2.1}$$
(3) There exists a (large) $\sigma > 0$ such that for all $\eta \in \mathbb C$ with $|\mathrm{Im}\,\eta| \ge \sigma$ we have as a form on $D(H) \cap D(N^{1/2})$
$$i[H, R(\eta)] = -R(\eta) H' R(\eta). \tag{2.2}$$
(Here it should be noticed that $N^{-1/2} H' N^{-1/2}$ and $N^{\mp 1/2} R(\eta) N^{\pm 1/2}$ are bounded if $\sigma$ is large enough, cf. Remark 2.4(1).)
(4) The commutator form $i[H', A]$ defined on $D(A) \cap D(N)$ extends to a bounded operator
$$H'' := i[H', A]^0 \in \mathcal B\bigl(N^{-\frac12}\mathcal H,\ N^{\frac12}\mathcal H\bigr). \tag{2.3}$$
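Condition 2.2. There are constants $C_1, C_2, C_3 \in \mathbb R$ such that as a form on $D(H) \cap D(N^{1/2})$
$$N \le C_1 H + C_2 H' + C_3 1. \tag{2.4}$$

Condition 2.3. For a given $\lambda \in \mathbb R$ there exist $c_0 > 0$, $C_4 \in \mathbb R$, $f_\lambda \in C_c^\infty(\mathbb R)$ with $0 \le f_\lambda \le 1$ and $f_\lambda = 1$ in a neighborhood of $\lambda$, and a compact operator $K_0$ on $\mathcal H$ such that as a form on $D(H) \cap D(N^{1/2})$
$$H' \ge c_0 1 - C_4 f_\lambda^\perp(H)^2 \langle H\rangle - K_0. \tag{2.5}$$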
Here $f_\lambda^\perp := 1 - f_\lambda$.

Remarks 2.4.
(1) It follows from Condition 2.1(1) and an argument of Mourre [30, Proposition II.3], that there exists $\sigma > 0$ such that for $|\mathrm{Im}\,\eta| \ge \sigma$ we have $(A-\eta)^{-1} D(N) \subseteq D(N)$ and $(A-\eta)^{-1} D(N)$ is dense in $D(N)$. By interpolation the same holds with $N$ replaced by $N^\alpha$, $0 < \alpha < 1$, cf. Lemma 3.4 below.
(2) From Condition 2.1(2) and Lemma 3.2 it follows that $N^{1/2}$ is of class $C^1_{\mathrm{Mo}}(H)$. In particular $D(H) \cap D(N^{1/2})$ is dense in $D(N^{1/2})$.
(3) Combining the above two remarks with Condition 2.1(3) and (3.14), we find that given $H$, $A$ and $N$, there can at most be one $H'$ such that Conditions 2.1(1)–2.1(3) are satisfied.
(4) We remark that in practice we work with the weaker commutator estimate
$$H' \ge c_0 1 - \mathrm{Re}\{B(H-\lambda)\} - K_0, \tag{2.6}$$
where $B = B(\lambda)$ is a bounded operator, with $B D(N^{1/2}) \cup B^* D(N^{1/2}) \subseteq D(N^{1/2})$. The one in Condition 2.3 is however more standard. To see that Condition 2.3 implies the above bound choose $B = C_4 f_\lambda^\perp(H)^2 \langle H\rangle (H-\lambda)^{-1}$ which under our Condition 2.1 satisfies the requirements on $B$ by Lemma 3.3.

We call $H'$ the first derivative of $H$. Similarly $H''$ is the second derivative of $H$. The estimate (2.4) is called the virial estimate, while (2.5) is the Mourre estimate at $\lambda$.
Theorem 2.5. Suppose Conditions 2.1–2.3, and let $\psi$ be a bound state, $(H-\lambda)\psi = 0$ (with $\lambda$ as in Condition 2.3), obeying
$$\psi \in D(N^{\frac12}). \tag{2.7}$$
Then $\psi \in D(A)$ and $A\psi \in D(N^{1/2})$.

By imposing assumptions on higher-order commutators between $H$ and $A$ we obtain a higher-order regularity result. For this we need the following condition, which coincides with Condition 2.1(4) if $k_0 = 1$, but for $k_0 \ge 2$ it is stronger.

Condition 2.6. There exists $k_0 \in \mathbb N$ such that the commutator forms $i^\ell \mathrm{ad}_A^\ell(H')$ defined on $D(A) \cap D(N)$, $\ell = 0,\dots,k_0$, extend to bounded operators
$$i^\ell \mathrm{ad}_A^\ell(H') \in \mathcal B(N^{-1}\mathcal H, \mathcal H); \qquad \ell = 0,\dots,k_0-1, \tag{2.8}$$
$$i^{k_0} \mathrm{ad}_A^{k_0}(H') \in \mathcal B\bigl(N^{-\frac12}\mathcal H,\ N^{\frac12}\mathcal H\bigr). \tag{2.9}$$
We have the following extension of Theorem 2.5 to include higher orders:

Theorem 2.7. Suppose Conditions 2.1–2.3 and Condition 2.6, and let $\psi$ be a bound state, $(H-\lambda)\psi = 0$ (with $\lambda$ as in Condition 2.3), obeying (2.7). Let $k_0$ be given as in Condition 2.6. Then $\psi \in D(A^{k_0})$, and for $k = 1,\dots,k_0$ the states $A^k\psi \in D(N^{1/2})$.

It should be noted that under the assumptions imposed in Theorems 2.5 and 2.7, it is crucial that $N^{1/2}$ is applied after the powers of $A$. The following result requires an additional assumption, and allows for arbitrary placement of $N^{1/2}$ amongst the at most $k_0$ powers of $A$. The new condition (2.10) below is a generalization of Condition 2.1(1).

Condition 2.8. Let $N'$ be given as in Condition 2.1(1). There exists $k_0 \in \mathbb N$ such that the commutator forms $i^\ell \mathrm{ad}_A^\ell(N')$ defined on $D(A) \cap D(N)$, $\ell = 0,\dots,k_0-1$, extend to bounded operators
$$i^\ell \mathrm{ad}_A^\ell(N') \in \mathcal B(N^{-1}\mathcal H, \mathcal H); \qquad \ell = 0,\dots,k_0-1. \tag{2.10}$$
Moreover there exists $\kappa_1 > 0$ such that the commutators (initially defined as forms on $D(N)$)
$$i\,\mathrm{ad}_N\bigl(i^\ell \mathrm{ad}_A^\ell(N')\bigr) \in \mathcal B(N^{-1}\mathcal H,\ N^{1-\kappa_1}\mathcal H); \qquad \ell = 0,\dots,k_0-1. \tag{2.11}$$
We have

Corollary 2.9. Suppose Conditions 2.1–2.3, 2.6 and 2.8 (with the same $k_0$ in Conditions 2.6 and 2.8). Let $\psi \in D(N^{1/2})$ be a bound state, $(H-\lambda)\psi = 0$ (with $\lambda$ as in Condition 2.3). For any $k, \ell \ge 0$, with $k + \ell \le k_0$, we have $\psi \in D(A^k N^{1/2} A^\ell)$.
We end with the following improvement of Theorem 2.5, which concludes in addition that bound states are in the domain of $N$. It requires the added assumption (2.11), with $k_0 = 1$.

Theorem 2.10. Suppose Conditions 2.1–2.3 and (2.11) for $k_0 = 1$, and let $\psi \in D(N^{1/2})$ be a bound state, $(H-\lambda)\psi = 0$ (with $\lambda$ as in Condition 2.3). Then $\psi \in D(N)$, the states $\psi, N^{1/2}\psi \in D(A)$ and $A\psi \in D(N^{1/2})$.

In Sec. 4.3, we in fact prove an extension of the above theorem, to include higher order estimates in $N$. These are applied in Sec. 6 to many-body AC-Stark Hamiltonians.

Remarks 2.11.
(1) The condition that $N \ge 1$ is imposed partly for convenience of formulation. Obviously one can obtain a version of the above results upon imposing only that $N$ is bounded from below (upon “translating” $N \to N + C \ge 1$ at various points in the above conditions).
(2) The “standard” or “regular” Mourre theory, considered for example in [9], fits in the semi-bounded case into the above scheme so that Theorem 2.7 holds. In fact (assuming here for simplicity that $H$ is bounded from below) we have $N := H + C \ge 1$ for a sufficiently large constant $C$. Use this $N$ and the same “conjugate operator” $A$ in Conditions 2.1–2.3, 2.6 and 2.8. Note also that the standard Mourre estimate at energy $\lambda$ reads
$$f_\lambda(H)\, i[H,A]^0 f_\lambda(H) \ge c_0' f_\lambda^2(H) - K_0'; \qquad c_0' > 0,\quad K_0' \text{ compact}. \tag{2.12}$$
From (2.12) we readily conclude (2.5) with $c_0 = c_0'/2$, $K_0 = K_0'$ and a suitable constant $C_4 \ge 0$. Although we shall not elaborate we also remark that the method of proof of Theorem 2.7 essentially can be adapted under the conditions of the standard Mourre theory; in fact only a simplified version is needed. Hence, although we cannot literally conclude this from Theorem 2.7 in the general non-semi-bounded case, the result $\psi \in D(A^{k_0})$ is still valid given standard conditions on repeated commutators $i^k \mathrm{ad}_A^k(H)$ for $k \le k_0 + 1$.
(3) Theorem 2.7 does not hold with one less commutator in Condition 2.6. Alternatively, under the conditions of Theorem 2.7 it is in general false that the bound state $\psi \in D(A^{k_0+1})$. Based on considerations for discrete eigenvalues this statement may at first thought appear surprising. See Example 1.1. Compared to [9] our method works with one less commutator, cf. (2), although the overall scheme of ours and the one of [9] are similar.
(4) The proofs of Theorems 2.5 and 2.7, Corollary 2.9 and Theorem 2.10 are constructive in that they yield explicit bounds. Precisely, if we have a positive lower bound of the constant $c_0$ in (2.5) that is uniform in $\lambda$ belonging to some fixed compact interval $I$ as well as uniform bounds of the absolute value of the constants $C_1,\dots,C_4$ of (2.4) and (2.5) (uniform in the same sense) and similarly for all possible operator norms related to Conditions 2.1, 2.6 and 2.8
(and the $B(\lambda)$ in Remark 2.4 if it is used) then there are bounds of the form, for example,
$$\|N^{\frac12} A^k \psi\| \le C \|N^{\frac12}\psi\|; \qquad C = C(k, I, K_0);$$
here $K_0 = K_0(\lambda)$ is the compact operator of (2.5) and $k \le k_0$. Similar bounds are valid for the states $A^k N^{1/2} A^\ell \psi$ of Corollary 2.9 and for the state $N\psi$ of Theorem 2.10. In the context of perturbation theory typically $I$ will be a small interval centered at some (unperturbed) embedded eigenvalue $\lambda_0$ and $K_0 = K_0(\lambda_0)$. Whence the constant will depend only on the interval. For various models one can verify the condition (2.7) for all bound states $\psi$ by a “virial argument”, cf. [22, 31, 38], along with a similar bound
$$\|N^{\frac12}\psi\| \le C(I)\|\psi\|.$$
This virial argument is in a concrete situation related to the virial estimate (2.4). Clearly the above bounds can be used in combination, and this is precisely how we in Sec. 5 arrive at Theorems 1.2 and 1.3. In [32] the case of regular Mourre theory is considered where the derivation of the bounds is simpler, and care is taken to derive good explicit bounds, which in particular are independent of any proof-technical constructions. The bounds are good enough to formulate a reasonable condition on the growth of norms of multiple commutators which ensures that bound states are analytic vectors with respect to $A$.

3. Preliminaries

In this section, we establish basic consequences of Conditions 2.1, and introduce a calculus of almost analytic extensions tailored to avoid issues with $(A-\eta)^{-1}$, when $|\mathrm{Im}\,\eta|$ is small.

3.1. Improved smoothness for operators of class $C^1(A)$

For an operator $N$ of class $C^1(A)$ not much in the way of regularity can be expected, beyond the $C^1(A)$ property itself, and its equivalent formulations. See [3, 21]. Often one requires some additional smoothness properties to manipulate and estimate expressions in the two operators. The typical way of achieving improved smoothness is to impose conditions on $i[N,A]^0$ stronger than what is implied by the $C^1(A)$ property itself. This is what is done in Conditions 2.1(1) and 2.1(2). This subsection is devoted primarily to the extraction of improved smoothness properties of the pair of operators $N$, $H$, afforded to us by Conditions 2.1.

Lemma 3.1. Let $N \ge 1$ be of class $C^1(H)$ with $[N,H]^0 \in \mathcal B(N^{-1/2}\mathcal H, N^{1/2}\mathcal H)$. For any $\alpha \in\, ]0,1[$, the operator $N^\alpha$ is of class $C^1(H)$.
Proof. Let $0 < \alpha < 1$. It suffices to check for one $\eta \in \rho(N^\alpha)$ that $(N^\alpha - \eta)^{-1}$ is of class $C^1(H)$. To this end we pick $\eta = 0$, and use the representation formula
$$N^{-\alpha} = c_\alpha \int_0^\infty t^{-\alpha}(N+t)^{-1}\,dt, \qquad c_\alpha = \frac{\sin(\alpha\pi)}{\pi}. \tag{3.1}$$
Since $N \in C^1(H)$ we have for all $t > 0$ that the operator $(N+t)^{-1}$ preserves $D(H)$. In fact
$$[H, (N+t)^{-1}]\varphi = (N+t)^{-1}[N,H]^0 (N+t)^{-1}\varphi; \qquad \varphi \in D(H). \tag{3.2}$$
By combining (3.1) and (3.2), we can compute $[N^{-\alpha}, H]$ considered as a form on $D(H)$ as
$$[N^{-\alpha}, H] = c_\alpha \int_0^\infty t^{-\alpha}(N+t)^{-1}[N,H]^0 (N+t)^{-1}\,dt. \tag{3.3}$$
Notice that the integral is absolutely convergent for any $0 < \alpha < 1$. This completes the proof.

Lemma 3.2. Assume $N \ge 1$ and $H$ satisfy Condition 2.1(2) and let $\alpha \in\, ]0,1[$. Then $N^\alpha \in C^1(H)$ and for $\tau_1, \tau_2 \ge 0$, with
$$\max\bigl\{0, \tfrac12 - \kappa - \tau_1\bigr\} + \max\bigl\{0, \tfrac12 - \kappa - \tau_2\bigr\} < 1 - \alpha,$$
we have $[N^\alpha, H]^0 \in \mathcal B(N^{-\tau_1}\mathcal H, N^{\tau_2}\mathcal H)$. In particular $N^{1/2}$ is of class $C^1_{\mathrm{Mo}}(H)$.
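Proof. That $N^\alpha \in C^1(H)$ follows from Lemma 3.1. We compute as a form on $D(N^\alpha) \cap D(H)$
$$[N^\alpha, H] = c_\alpha \int_0^\infty t^\alpha (N+t)^{-1}[N,H]^0 (N+t)^{-1}\,dt, \tag{3.4}$$
where we have used the strongly convergent integral representation formula
$$N^\alpha = c_\alpha \int_0^\infty t^\alpha \bigl(t^{-1} - (N+t)^{-1}\bigr)\,dt, \tag{3.5}$$
which follows from (3.1). We thus get for $\tau_1, \tau_2 \ge 0$
$$|\langle\psi, [N^\alpha,H]\varphi\rangle| \le C \int_0^\infty t^\alpha \bigl\|(N+t)^{-1}N^{\frac12-\kappa-\tau_1}\bigr\|\,\bigl\|(N+t)^{-1}N^{\frac12-\kappa-\tau_2}\bigr\|\,dt \times \|N^{\tau_1}\psi\|\,\|N^{\tau_2}\varphi\|.$$
The integrand is of the order $O(t^{\alpha-2+\theta})$, with $\theta = \max\{0, \tfrac12-\kappa-\tau_1\} + \max\{0, \tfrac12-\kappa-\tau_2\}$. It is integrable provided $\theta < 1-\alpha$, which proves the lemma.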
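As a sanity check on the representation formulas (3.1) and (3.5), it may help to verify them with a positive number $x$ in place of $N$; by the spectral theorem this is what the operator identities reduce to. The sketch below uses only the standard Beta-function integral. The symbol $x$ and the substitution $t = xs$ are notation introduced here for illustration and do not appear in the original.

```latex
% Scalar version of (3.1): for x > 0 and 0 < \alpha < 1, substitute t = x s.
\int_0^\infty \frac{t^{-\alpha}}{x+t}\,dt
  = x^{-\alpha}\int_0^\infty \frac{s^{-\alpha}}{1+s}\,ds
  = x^{-\alpha}\,\Gamma(1-\alpha)\,\Gamma(\alpha)
  = x^{-\alpha}\,\frac{\pi}{\sin(\alpha\pi)} ,
% so multiplying by c_\alpha = \sin(\alpha\pi)/\pi returns x^{-\alpha}.

% Scalar version of (3.5): use t^{-1} - (x+t)^{-1} = x / (t (x+t)).
\int_0^\infty t^{\alpha}\Bigl(\frac{1}{t}-\frac{1}{x+t}\Bigr)dt
  = x\int_0^\infty \frac{t^{-(1-\alpha)}}{x+t}\,dt
  = x\cdot x^{-(1-\alpha)}\,\frac{\pi}{\sin((1-\alpha)\pi)}
  = x^{\alpha}\,\frac{\pi}{\sin(\alpha\pi)} .
```

The second computation is the first one with $\alpha$ replaced by $1-\alpha$, which is why the same constant $c_\alpha$ appears in both (3.1) and (3.5).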
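We shall need a boundedness result:

Lemma 3.3. Assume $N \ge 1$ and $H$ satisfy Condition 2.1(2) and let $\alpha \in\, ]0, 1/2+\kappa[$. Suppose $f \in C^\infty(\mathbb R)$ is given such that
$$\frac{d^k}{dt^k} f(t) = O(\langle t\rangle^{-k}); \qquad k = 0, 1, \dots$$
Then
$$N^\alpha f(H) N^{-\alpha} \in \mathcal B(\mathcal H). \tag{3.6}$$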
Proof. Let $\rho \in\, ]0, 1/2+\kappa[$, where $0 < \kappa \le 1/2$ comes from Condition 2.1(2). From Lemma 3.2 applied with $\tau_1 = \max\{0, \rho-\kappa\}$ and $\tau_2 = 0$, we get
$$[N^\rho, H]^0 \in \mathcal B(N^{-\max\{0,\rho-\kappa\}}\mathcal H, \mathcal H). \tag{3.7}$$
We recall from [30, Proposition II.3] that if an operator $\tilde N$ is of class $C^1_{\mathrm{Mo}}(H)$, then
$$\exists\sigma > 0:\ |\mathrm{Im}\,\eta| \ge \sigma \ \Rightarrow\ (H-\eta)^{-1} \text{ preserves } D(\tilde N) \text{ and}$$
$$\tilde N (H-\eta)^{-1}\psi = (H-\eta)^{-1}\tilde N\psi + i(H-\eta)^{-1}\, i[\tilde N, H]^0 (H-\eta)^{-1}\psi \quad\text{for all } \psi \in D(\tilde N). \tag{3.8}$$
We apply this to $\tilde N = N^\rho$, $0 < \rho < 1/2+\kappa$. The assumption is satisfied by (3.7).

We shall show a representation formula for the special case $f(x) = f_\eta(x) = (x-\eta)^{-1}$ with $v = \mathrm{Im}\,\eta \ne 0$. Now fix $\alpha \in\, ]0, 1/2+\kappa[$. Using (3.7) and (3.8), multiple times with $\rho = \alpha - j\kappa$, we obtain for $|\mathrm{Im}\,\eta|$ sufficiently large and for all $\psi \in D(N^\alpha)$
$$N^\alpha (H-\eta)^{-1}\psi - (H-\eta)^{-1}N^\alpha\psi = \sum_{j=1}^n \bigl((H-\eta)^{-1}B_1\bigr)\cdots\bigl((H-\eta)^{-1}B_j\bigr)(H-\eta)^{-1}N^{\alpha-j\kappa}\psi$$
$$\qquad + \bigl((H-\eta)^{-1}B_1\bigr)\cdots\bigl((H-\eta)^{-1}B_n\bigr)\bigl((H-\eta)^{-1}B_{n+1}\bigr)(H-\eta)^{-1}\psi, \tag{3.9}$$
where $n$ is the biggest natural number for which $\alpha - n\kappa > 0$ and the $B_j$'s are bounded and independent of $\eta$. Next by analytic continuation we conclude that (3.9) is valid for all $\eta \in \mathbb C \setminus \mathbb R$. Hence we have verified the adjoint version of (3.6) for $f = f_\eta$; $v \ne 0$.

We shall now show (3.6) in general. Define a new function by $h(t) = f(t)(t+i)^{-1}$, and let $\tilde h$ denote an almost analytic extension of $h$ such that (using the notation $\eta = u + iv$)
$$\forall n \in \mathbb N:\quad |\bar\partial\tilde h(\eta)| \le C_n \langle\eta\rangle^{-n-2}|v|^n.$$
We shall use the representation
$$f(H) = \frac1\pi\int_{\mathbb C} (\bar\partial\tilde h)(\eta)(H-\eta)^{-1}(H+i)\,du\,dv = \frac1\pi\int_{\mathbb C} (\bar\partial\tilde h)(\eta)\bigl(1 + (\eta+i)(H-\eta)^{-1}\bigr)\,du\,dv, \tag{3.10}$$
which should be read as a strong integral on $D(H)$. We multiply by $N^\alpha$ and $N^{-\alpha}$ from the left and from the right, respectively. Inserting (3.9) we conclude the lemma. Observe that $N^{-\alpha}$ being $C^1(H)$ preserves $D(H)$.

It will be important to work with the following “regularization” operators, cf. [30]: Let for any given self-adjoint operator $\tilde A$ and any positive operator $\tilde N$
$$I_n(\tilde A) = -in(\tilde A - in)^{-1} \quad\text{and}\quad I_{in}(\tilde N) = n(\tilde N + n)^{-1}; \qquad n \in \mathbb N. \tag{3.11}$$
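To see why the operators in (3.11) act as regularizers, one can look at the corresponding scalar functions. The following sketch takes a real number $a$ in place of $\tilde A$ and a positive number $x$ in place of $\tilde N$ (both symbols are introduced here for illustration only) and records the elementary bounds and limits that the spectral theorem then transfers to the operators.

```latex
% Scalar version of I_n: for real a and n in N,
I_n(a) = \frac{-in}{a-in} = \frac{n^2 - i n a}{a^2+n^2},
\qquad |I_n(a)|^2 = \frac{n^2}{a^2+n^2} \le 1,
\qquad \lim_{n\to\infty} I_n(a) = 1 .
% Multiplying by a gives a bounded function of a for each fixed n:
a\,I_n(a) = \frac{-i n a}{a-in} = in\bigl(I_n(a)-1\bigr),
\qquad |a\,I_n(a)| = \frac{n|a|}{\sqrt{a^2+n^2}} \le n .
% Scalar version of I_{in}: for x > 0,
I_{in}(x) = \frac{n}{x+n} \in\, ]0,1[,
\qquad n \mapsto I_{in}(x) \text{ is increasing},
\qquad \lim_{n\to\infty} I_{in}(x) = 1 .
```

The identity $a\,I_n(a) = in(I_n(a) - 1)$ is the scalar counterpart of the regularized Hamiltonian $H_n$ in (4.3).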
In particular we shall use $I_n(A)$ in conjunction with (2.2), $I_n(H)$ in conjunction with (2.2), (2.4) and (2.5), while $I_{in}(N)$ will be used in conjunction with (2.1).

Lemma 3.4. Assume the pairs $N, A$ and $N, H$ satisfy Conditions 2.1(1) and 2.1(2) respectively. Then
$$\operatorname*{s-lim}_{n\to\infty} N^{\frac12} I_n(H) N^{-\frac12} = 1, \tag{3.12}$$
$$\operatorname*{s-lim}_{n\to\infty} N I_n(A) N^{-1} = 1, \tag{3.13}$$
$$\operatorname*{s-lim}_{n\to\infty} N^{\frac12} I_n(A) N^{-\frac12} = 1. \tag{3.14}$$
Proof. Observe first that $\operatorname{s-lim} I_n(A) = 1$ and $\operatorname{s-lim} A(A-in)^{-1} = 0$, and similarly with $A$ replaced by $H$. The statements (3.12) and (3.13) now follow from (3.8) and boundedness of the operators $[N^{1/2},H]^0 N^{-1/2}$ and $[N,A]^0 N^{-1}$. This argument appears also in [30]. As for (3.14) we observe first that $N(I_n(A)-1)N^{-1}$ is bounded uniformly in $n$. By interpolation the same holds true for $N^{1/2}(I_n(A)-1)N^{-1/2}$. The result now follows from observing that the result holds true strongly on the dense set $D(N^{1/2})$ by (3.13).

We end with a small technical remark:

Remark 3.5. Suppose $N$ and $H$ are as in Lemma 3.1 and $0 \le \alpha < 1$. Then $D(H) \cap D(N)$ is dense in $D(H) \cap D(N^\alpha)$ in the intersection topology. To see this let $\psi \in D(H) \cap D(N^\alpha)$. Then $\psi_n = I_{in}(N)\psi \in D(H) \cap D(N)$ since $N$ is of class $C^1(H)$. We claim that $\psi_n \to \psi$ in $D(H) \cap D(N^\alpha)$. Obviously $N^\alpha\psi_n \to N^\alpha\psi$, so it remains to consider
$$H\psi_n = I_{in}(N)H\psi + I_{in}(N)\sqrt{\frac{N}{n}}\,\bigl(N^{-\frac12}[N,H]^0 N^{-\frac12}\bigr)\sqrt{\frac{N}{n}}\,I_{in}(N)\psi.$$
As in the proof above, the last term goes to zero and the first term converges to $H\psi$ proving the claim.

3.2. Iterated commutators with $N^{1/2}$

We address here the following question. Suppose Condition 2.1(1) and (2.10) are satisfied for some $k_0 \ge 1$. One could reasonably assume that $N^{1/2}$ is also of class $C^1_{\mathrm{Mo}}(A)$ and admits $k_0$ iterated $N^{1/2}$-bounded commutators. We have however not been able to establish this, but making the additional assumption (2.11) we answer the question in the affirmative below. This permits us to deduce Corollary 2.9 from Theorem 2.7. The reader primarily interested in Theorem 2.7 may skip this subsection.

We begin with a technical lemma. Let $q \in \mathbb N$ and $\ell \in (\mathbb N \cup \{0\})^q$, with $0 \le \ell_j < k_0$ for all $j = 1,\dots,q$. We abbreviate $N_m = i^m \mathrm{ad}_A^m(N')$, which is the iteratively defined $N$-bounded operator from (2.10). Let for $t \ge 0$ and $q$, $\ell$ as above
$$B_q^\ell(t) = t^{\frac12}(N+t)^{-1}\prod_{j=1}^q N_{\ell_j}(N+t)^{-1}. \tag{3.15}$$
Observe that $B_q^\ell(t)$ is bounded for all $t$. Indeed it satisfies the bound $\|B_q^\ell(t)\| = O(t^{-1/2})$ and is thus not norm integrable. However if $\varphi \in D(N)$ we have $\|B_q^\ell(t)\varphi\| = O(t^{-3/2})$. The extra assumption (2.11) allows us to prove

Lemma 3.6. Suppose Condition 2.1(1) and Condition 2.8. For any $q \in \mathbb N$, $\ell \in (\mathbb N \cup \{0\})^q$ (with $0 \le \ell_j < k_0$ as above) and $\varphi \in D(N)$ the map $t \to B_q^\ell(t)\varphi$ is integrable and there exist constants $C_q^\ell$ such that
$$\Bigl\|\int_0^\infty B_q^\ell(t)\varphi\,dt\Bigr\| \le C_q^\ell\,\|N^{\frac12}\varphi\|.$$

Proof. We only have to prove the bound on the strong integral, since we already discussed strong integrability. We begin by analyzing the leftmost factors in $B_q^\ell(t)$, namely the $N$-bounded operator $(N+t)^{-1}N_{\ell_1}$. We compute strongly on $D(N)$
$$(N+t)^{-1}N_{\ell_1} = (N_{\ell_1}N^{-1})N(N+t)^{-1} - (N+t)^{-1}N\bigl(N^{-1}[N,N_{\ell_1}]N^{-1+\kappa_1}\bigr)N^{1-\kappa_1}(N+t)^{-1} = (N_{\ell_1}N^{-1})N(N+t)^{-1} + O(t^{-\kappa_1}). \tag{3.16}$$
The contribution to the integral $\int_0^\infty B_q^\ell(t)N^{-1/2}\,dt$ coming from the last term is $O(t^{-1-\kappa_1})$ and hence norm-integrable. If $q = 1$ we can now finish the argument because the contribution to the integral coming from the first term on the right-hand side of (3.16) is
$$(N_{\ell_1}N^{-1})\,t^{\frac12}(N+t)^{-2}N,$$
which on the domain of $N$ integrates to the $N^{1/2}$-bounded operator $cN_{\ell_1}N^{-1/2}$, for some $c \in \mathbb R$. If $q > 1$ we write $N(N+t)^{-1} = 1 - t(N+t)^{-1}$. We can now bring out the next term $N_{\ell_2}$, and again the commutators with $(N+t)^{-1}$ give norm-integrable contributions. Repeating this procedure successively until all the terms $N_{\ell_j}$ are brought out to the left yields the formula
$$B_q^\ell(t) = \Bigl(\prod_{j=1}^q N_{\ell_j}N^{-1}\Bigr)\,t^{\frac12}\bigl(1 - t(N+t)^{-1}\bigr)^{q-1}(N+t)^{-2}N + O(t^{-1-\kappa_1})N^{\frac12}.$$
We compute, by a change of variables,
$$\int_0^\infty t^{\frac12}\bigl(1 - t(N+t)^{-1}\bigr)^{q-1}(N+t)^{-2}\,dt = c\,N^{-\frac12},$$
for some $c \in \mathbb R$. This implies the lemma.

Proposition 3.7. Assume Condition 2.1(1) and Condition 2.8. Then $N^{1/2}$ is of class $C^1_{\mathrm{Mo}}(A)$ and the iterated commutators $i^p \mathrm{ad}_A^p(N^{1/2})$, $p \le k_0$, extend from $D(A) \cap D(N^{1/2})$ to $N^{1/2}$-bounded operators.

Proof. We already know from Lemma 3.1 that $N^{1/2}$ is of class $C^1(A)$. Hence we only need to establish that the iterated commutator forms extend to $N^{1/2}$-bounded operators. Recall also that $D(A) \cap D(N)$ is dense in $D(A) \cap D(N^{1/2})$, cf. Remark 3.5, which implies that it suffices to show that the iterated commutator forms extend from $D(A) \cap D(N)$ to $N^{1/2}$-bounded operators. By Lemma 3.6 and the above remark, it suffices to prove, iteratively, the following representation formula
$$i^p \mathrm{ad}_A^p(N^{\frac12})\varphi = \sum_{q=1}^p\ \sum_{\ell_1+\cdots+\ell_q = p-q} \alpha_{p,q}^{\ell} \int_0^\infty B_q^\ell(t)\varphi\,dt, \tag{3.17}$$
for $\varphi \in D(N)$. Note that the integrals are absolutely convergent. Here $B_q^\ell(t)$ are defined in (3.15). For $p = 1$, we compute using (3.5)
$$i[A_n, N^{\frac12}]\varphi = c_{\frac12}\int_0^\infty B_{1,n}^0(t)\varphi\,dt,$$
where the extra subscript $n$ indicates that $N_0 = N'$ has been replaced by $I_n(A)N'I_n(A)$. By (3.22) the integrand is $O(t^{-3/2})$ uniformly in large $n$, and by (3.13) and Lebesgue’s theorem on dominated convergence we can thus compute
$$\lim_{n\to\infty} i[A_n, N^{\frac12}]\varphi = c_{\frac12}\int_0^\infty B_1^0(t)\varphi\,dt.$$
Obviously this together with Lemma 3.6 implies that the form $i\,\mathrm{ad}_A(N^{1/2})$ extends from $D(A) \cap D(N)$ to an $N^{1/2}$-bounded operator represented on $D(N)$ by the strongly convergent integral above. We can now proceed by induction, assuming that the iterated commutator $i^{p-1}\mathrm{ad}_A^{p-1}(N^{1/2})$ exists as an $N^{1/2}$-bounded operator and is represented on $D(N)$ by (3.17). Compute first the commutator $i[A_n, i^{p-1}\mathrm{ad}_A^{p-1}(N^{1/2})]$ strongly on $D(N)$ using that
$$i[A_n, N_\ell] = -I_n(A)N_{\ell+1}I_n(A) \quad\text{and}\quad i[A_n, (N+t)^{-1}] = (N+t)^{-1}N'(N+t)^{-1}.$$
Subsequently take the limit $n \to \infty$ as above and appeal to Lemma 3.6 to conclude that the so computed limit in fact is an $N^{1/2}$-bounded extension of the form $i[A, i^{p-1}\mathrm{ad}_A^{p-1}(N^{1/2})]$ from $D(A) \cap D(N)$ and represented on $D(N)$ as in (3.17).

Proof of Corollary 2.9. We can now argue that Corollary 2.9 is indeed a direct corollary of Theorem 2.7. Note that $\psi \in D(N^{1/2}A^k)$ for all $k \le k_0$ due to Theorem 2.7. We can now repeatedly use the fact that $D(A) \cap D(N^{1/2})$ is dense in $D(A)$ and Proposition 3.7 to compute for $\varphi \in D(A^p)$, with $p + k \le k_0$,
$$\langle A^p\varphi, N^{\frac12}A^k\psi\rangle = \sum_{q=0}^p \beta_q \bigl\langle\varphi, \bigl(\mathrm{ad}_A^{p-q}(N^{\frac12})N^{-\frac12}\bigr)N^{\frac12}A^{q+k}\psi\bigr\rangle,$$
with $\beta_q$ some real combinatorial factors. This completes the proof since the norm of the right-hand side is bounded by $C\|\varphi\|$.

3.3. Approximating $A$ by regular bounded operators

We recall now a construction from [31] (see [31, p. 203]). Consider an odd real-valued function $g \in C^\infty(\mathbb R)$ obeying $g' \ge 0$, that the function $\mathbb R \ni t \to tg'(t)/g(t)$ has a smooth square root, that the function $]0,\infty[\ \ni t \to g(t)$ is concave and the properties
$$g(t) = \begin{cases} 2 & \text{for } t > 3,\\ t & \text{for } |t| < 1,\\ -2 & \text{for } t < -3.\end{cases}$$
Let $h(t) = g(t)/t$. We pick an almost analytic extension of $h$, denoted by $\tilde h$, such that for some $\rho > 0$ (and using again the notation $\eta = u + iv$)
$$\forall N:\ |\bar\partial\tilde h(\eta)| \le C_N\langle\eta\rangle^{-N-2}|v|^N, \qquad \tilde h(\eta) = \begin{cases} 2/\eta & \text{for } u > 6,\ |v| < \rho(u-6),\\ -2/\eta & \text{for } u < -6,\ |v| < \rho(6-u).\end{cases} \tag{3.18}$$
We can choose $\tilde h$ such that $\overline{\tilde h(\eta)} = \tilde h(\bar\eta)$.
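This gives the representation
$$g(t) = \frac1\pi\int_{\mathbb C}(\bar\partial\tilde h)(\eta)\,t(t-\eta)^{-1}\,du\,dv. \tag{3.19}$$
Let $g_m(t) = mg(t/m)$, for $m \ge 1$. Using the properties of $g$ one verifies that
$$\text{for all } t \in \mathbb R \text{ the function } m \to g_m(t)^2 \text{ is increasing}. \tag{3.20}$$
We recall that there exists $\sigma > 0$ such that for $|v| \ge \sigma/m$ the operator
$$R_m(\eta) := \Bigl(\frac{A}{m} - \eta\Bigr)^{-1} \tag{3.21}$$
preserves $D(N)$. See (3.8). Moreover we have uniformly in $\alpha \in [0,1]$, $m \in \mathbb N$ and $\eta$ that
$$\|N^\alpha R_m(\eta)N^{-\alpha}\| \le C|v|^{-1}; \qquad \eta \in V_m^>, \tag{3.22}$$
where
$$V_m^> := \{u + iv \in \mathbb C : |v| \ge \sigma/m\} \quad\text{and}\quad V_m^< := \{u + iv \in \mathbb C : |v| < \sigma/m\}.$$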
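This motivates the decomposition into smooth bounded real-valued functions $g_m = g_{1m} + g_{2m}$, where
$$g_{1m}(t) = \frac{m}{\pi}\int_{V_m^>}(\bar\partial\tilde h)(\eta)\Bigl(1 + \eta\Bigl(\frac{t}{m} - \eta\Bigr)^{-1}\Bigr)du\,dv + C_m, \tag{3.23}$$
$$g_{2m}(t) = \frac{m}{\pi}\int_{V_m^<}(\bar\partial\tilde h)(\eta)\,\eta\Bigl(\frac{t}{m} - \eta\Bigr)^{-1}du\,dv; \qquad C_m = \frac{m}{\pi}\int_{V_m^<}\bar\partial\tilde h(\eta)\,du\,dv. \tag{3.24}$$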
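For the reader's convenience, here is a short sketch of why the two pieces in (3.23) and (3.24) add up to $g_m$. It only rescales (3.19) and splits the domain of integration; we assume without further comment that each of the resulting integrals converges separately, which is plausible from the decay of $\bar\partial\tilde h$ in (3.18) but is not checked here.

```latex
% Rescale (3.19): replace t by t/m and multiply by m.
g_m(t) = m\,g(t/m)
  = \frac{m}{\pi}\int_{\mathbb C}(\bar\partial\tilde h)(\eta)\,
      \frac{t}{m}\Bigl(\frac{t}{m}-\eta\Bigr)^{-1}du\,dv .
% Elementary identity for the integrand:
\frac{t}{m}\Bigl(\frac{t}{m}-\eta\Bigr)^{-1}
  = 1 + \eta\Bigl(\frac{t}{m}-\eta\Bigr)^{-1} .
% Split \mathbb C = V_m^> \cup V_m^< and separate the constant on V_m^<:
g_m(t)
  = \underbrace{\frac{m}{\pi}\int_{V_m^>}(\bar\partial\tilde h)(\eta)
      \Bigl(1+\eta\Bigl(\frac{t}{m}-\eta\Bigr)^{-1}\Bigr)du\,dv
    + \frac{m}{\pi}\int_{V_m^<}(\bar\partial\tilde h)(\eta)\,du\,dv}_{=\,g_{1m}(t)}
  + \underbrace{\frac{m}{\pi}\int_{V_m^<}(\bar\partial\tilde h)(\eta)\,
      \eta\Bigl(\frac{t}{m}-\eta\Bigr)^{-1}du\,dv}_{=\,g_{2m}(t)} .
```

The constant $C_m$ is moved into $g_{1m}$ so that $g_{2m}$ keeps only the resolvent term on the strip $V_m^<$, which is the term covered by the decay estimate (3.25).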
Note that the integral in the expression for $g_{2m}$ is over a compact set (decreasing with $m$). This implies the property
$$\sup_{m\in\mathbb N,\ t\in\mathbb R} m^n\langle t\rangle^{k+1}|g_{2m}^{(k)}(t)| \le C_{n,k} < \infty \quad\text{for } n, k \in \mathbb N \cup \{0\}. \tag{3.25}$$
Since $g_m$ and $g_{2m}$ are bounded functions, we conclude the same for $g_{1m}$.

At a key point in the proof we will need a smooth square root of the function $tg'g$. We pick
$$\hat g = pg \in C_0^\infty(\mathbb R), \tag{3.26}$$
where $p(t) = \sqrt{tg'(t)/g(t)}$, which was assumed smooth. Clearly $\hat g^2 = tg'g$. Let $\tilde p \in C_0^\infty(\mathbb C)$ be an almost analytic extension of $p$. It satisfies
$$\forall N:\quad |\bar\partial\tilde p(\eta)| \le C_N|v|^N. \tag{3.27}$$
As above we put $p_m(t) = p(t/m)$ and make the splitting $p_m = p_{1m} + p_{2m}$, where
$$p_{1m}(t) = \frac1\pi\int_{V_m^>}(\bar\partial\tilde p)(\eta)\Bigl(\frac{t}{m} - \eta\Bigr)^{-1}du\,dv, \tag{3.28}$$
$$p_{2m}(t) = \frac1\pi\int_{V_m^<}(\bar\partial\tilde p)(\eta)\Bigl(\frac{t}{m} - \eta\Bigr)^{-1}du\,dv. \tag{3.29}$$
Let $\hat g_m = p_m g_m$ and split $\hat g_m = \hat g_{1m} + \hat g_{2m}$ by
$$\hat g_{1m} = p_{1m}g_{1m} \quad\text{and}\quad \hat g_{2m} = p_m g_{2m} + p_{2m}g_{1m}. \tag{3.30}$$
Clearly we can choose $C_{n,k}$ in (3.25) possibly larger such that $\hat g_{2m}$ satisfies the same estimates. Since $p_m$ and $p_{2m}$ are uniformly bounded in $m$ we get
$$P := \sup_{m\in\mathbb N}\sup_{t\in\mathbb R}|p_{1m}(t)| < \infty. \tag{3.31}$$
We observe that the operators $g_{2m}(A)$ and $p_{1m}(A)$, $p_{2m}(A)$ are given by norm convergent integrals, whereas $g_m(A)$ and $g_{1m}(A)$ are given on the domain of $\langle A\rangle^s$, for any $s > 0$, as strongly convergent integrals. From (3.20) and Lebesgue’s theorem on monotone convergence, we observe that $\psi \in D(A^k)$ is equivalent to $\sup_m\|g_m(A)^k\psi\| < \infty$. Combining this with (3.25) we find that for $k \ge 1$
$$\psi \in D(A^k) \ \Leftrightarrow\ \psi \in D(A^{k-1}) \ \text{and}\ \sup_m\|g_{1m}(A)^k\psi\| < \infty. \tag{3.32}$$
It will be convenient in the following when dealing with $g_{1m}$ to abbreviate
$$d\lambda(\eta) = \frac1\pi(\bar\partial\tilde h)(\eta)\,du\,dv.$$
This is however not a complex measure, just a notation. Similarly we will on one occasion write $d\lambda_p(\eta) = \frac1\pi(\bar\partial\tilde p)(\eta)\,du\,dv$, which is in fact a complex measure. We have the following
Lemma 3.8. As a result of the above constructions we have for any $m \ge 1$ and $0 \le \alpha \le 1$ that the bounded operators $g_{1m}(A)$, $g_{1m}'(A)$, $p_{1m}(A)$ and $Ag_{1m}'(A)$ preserve $D(N^\alpha)$.

Proof. Let $\psi \in D(N)$ and $\varphi \in D(A)$. Observe that $N^{-1}\varphi \in D(A)$, by the $C^1(A)$ property of $N$, cf. Condition 2.1(1). We can thus compute using the strongly convergent integral representation for $g_{1m}(A)$, and the notation introduced in (3.21),
$$\langle N\psi, g_{1m}(A)N^{-1}\varphi\rangle = m\int_{V_m^>}\langle N\psi, (1 + \eta R_m(\eta))N^{-1}\varphi\rangle\,d\lambda(\eta) + C_m\langle\psi,\varphi\rangle$$
$$\qquad = \langle\psi, g_{1m}(A)\varphi\rangle + i\int_{V_m^>}\eta\,\langle\psi, R_m(\eta)N'R_m(\eta)N^{-1}\varphi\rangle\,d\lambda(\eta). \tag{3.33}$$
By Condition 2.1(1), (3.18) and (3.22) we find that for some constant $K_m$ we have
$$|\langle N\psi, g_{1m}(A)N^{-1}\varphi\rangle| \le K_m\|\psi\|\,\|\varphi\|. \tag{3.34}$$
This together with an interpolation argument concludes the proof. The cases $g_{1m}'(A)$ and $p_{1m}(A)$ are done the same way. As for $Ag_{1m}'(A)$ we write $A_j = AI_j(A)$ and compute
$$NA_j g_{1m}'(A)N^{-1} = A_j N g_{1m}'(A)N^{-1} - iI_j(A)N'N^{-1}NI_j(A)g_{1m}'(A)N^{-1}.$$
To complete the proof by taking $j \to \infty$ we need to argue that
$$Ng_{1m}'(A)D(N) \subseteq D(A).$$
To achieve this we repeat the computation (3.33), with $\psi$ replaced by $A\psi$, $\psi \in D(A)$, and $g_{1m}$ replaced by $g_{1m}'$. We get
$$\langle A\psi, Ng_{1m}'(A)N^{-1}\varphi\rangle = \langle\psi, Ag_{1m}'(A)\varphi\rangle + \int_{V_m^>}\eta\,\Bigl\langle\psi, \frac{A}{m}\bigl\{R_m(\eta)N'R_m(\eta)^2 + R_m(\eta)^2N'R_m(\eta)\bigr\}N^{-1}\varphi\Bigr\rangle\,d\lambda(\eta).$$
The result now follows from writing $\frac{A}{m}R_m(\eta) = 1 + \eta R_m(\eta)$ and appealing to (3.18) and (3.22) as above.
4. Proof of the Abstract Results

In this section we prove the abstract theorems formulated in Sec. 2 as well as an extended version of Theorem 2.10. The proofs are given in separate subsections.

4.1. Proof of Theorem 2.7

Let
$$D_k = \{\varphi \in D(A^k) \mid \forall\, 0 \le j \le k:\ A^j\varphi \in D(N^{\frac12})\}.$$
Using Conditions 2.1–2.3 and 2.6 we shall prove Theorem 2.7 by induction in $k = 0,\dots,k_0$ that $\psi \in D_k$. We can assume without loss of generality that $\lambda = 0$. The proof relies on three estimates which we state first in the form of three propositions. After giving the proof of Theorem 2.7, we then proceed to verify the propositions.

We begin with some abbreviations and a definition. For a state $\psi$ we introduce the notation
$$\psi_m = g_{1m}(A)^k\psi \quad\text{and}\quad \hat\psi_m = \hat g_{1m}(A)g_{1m}(A)^{k-1}\psi = p_{1m}(A)\psi_m.$$
June 3, J070-S0129055X11004333
482
2011 13:31 WSPC/S0129-055X
148-RMP
J. Faupin, J. S. Møller & E. Skibsted
Definition 4.1. Let $k \ge 1$. A family of forms $\{R_m\}_{m=1}^\infty$ on $D_{k-1}$ will be called a $k$-remainder if for all $\epsilon > 0$ there exists $C_\epsilon > 0$ such that
$$|\langle\psi, R_m\psi\rangle| \le \epsilon\|N^{\frac12}\psi_m\|^2 + C_\epsilon\|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2, \tag{4.1}$$
for any $\psi \in D_{k-1}$ and $m \in \mathbb N$.

Lemma 3.8 is repeatedly used below, mostly without comment, to justify manipulations. The first proposition is a virial result, to be proved by a symmetrization of a commutator between $H$ and a regularized version of $A^{2k+1}$.

Proposition 4.2. Let $0 < k \le k_0$ and $\psi \in D_{k-1}$ be a bound state for $H$. There exists a $k$-remainder $R_m$, such that
$$\langle\psi_m, H'\psi_m\rangle + 2k\langle\hat\psi_m, H'\hat\psi_m\rangle = \langle\psi, R_m\psi\rangle.$$
The second result is an implementation of the virial bound (2.4) in Condition 2.2, which together with Proposition 4.2 makes it possible to deal with $N^{1/2}\psi_m$. This is reminiscent of what was done in the proof of [31, Proposition 8.2]. The constant $C_2$ appearing in the proposition comes from Condition 2.2.

Proposition 4.3. Let $\psi \in D_{k-1}$ be a bound state. There exists $C$ independent of $m$ such that
$$\|N^{\frac12}\psi_m\|^2 \le 2C_2\langle\psi_m, H'\psi_m\rangle + C\bigl(\|\psi_m\|^2 + \|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2\bigr)$$
and
$$\|N^{\frac12}\hat\psi_m\|^2 \le 2C_2\langle\hat\psi_m, H'\hat\psi_m\rangle + C\bigl(\|\hat\psi_m\|^2 + \|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2\bigr).$$
The third and final input is an implementation of the positive commutator estimate in Condition 2.3. The constant $c_0$ and the compact operator $K_0$ appearing in the proposition come from Condition 2.3.

Proposition 4.4. Let $\psi \in D_{k-1}$ be a bound state. There exist constants $C, \tilde C > 0$ independent of $m$ such that
$$\langle\psi_m, H'\psi_m\rangle \ge \frac{c_0}{2}\|\psi_m\|^2 - \tilde C\langle\psi_m, K_0\psi_m\rangle - C\|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2$$
and
$$\langle\hat\psi_m, H'\hat\psi_m\rangle \ge \frac{c_0}{2}\|\hat\psi_m\|^2 - \tilde C\langle\hat\psi_m, K_0\hat\psi_m\rangle - C\|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2.$$

Proof of Theorem 2.7. Let $\psi$ be the bound state, which we take to be normalized. By assumption $\psi \in D_0$. Assume by induction that $\psi \in D_{k-1}$, for some $k \le k_0$. We proceed to show that $\psi \in D_k$: From Proposition 4.2 we get the existence of a $k$-remainder $R_m$ such that
$$\langle\psi_m, H'\psi_m\rangle + 2k\langle\hat\psi_m, H'\hat\psi_m\rangle = \langle\psi, R_m\psi\rangle.$$
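Estimating the right-hand side using (4.1) and Proposition 4.3 we find a $C > 0$ such that
$$\langle\psi_m, H'\psi_m\rangle + 2k\langle\hat\psi_m, H'\hat\psi_m\rangle \le \frac{c_0}{4}\|\psi_m\|^2 + C\|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2.$$
Finally, we appeal to Proposition 4.4 to derive the bound
$$\frac{c_0}{4}\|\psi_m\|^2 \le C\|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2 + \tilde C\langle\psi_m, K_0\psi_m\rangle + 2k\tilde C\langle\hat\psi_m, K_0\hat\psi_m\rangle. \tag{4.2}$$
Pick $\Lambda > 0$ large enough such that
$$2\tilde C\|K_0 1_{[|A|>\Lambda]}\| \le \frac{c_0}{12(1 + 2kP^2)},$$
where $P$ is given by (3.31). Write $1_{[|A|\le\Lambda]}\psi_m = [1_{[|A|\le\Lambda]}(g_m(A) - g_{2m}(A))]^k\psi$ and estimate using (3.25)
$$2\tilde C|\langle 1_{[|A|\le\Lambda]}\psi_m, K_0\psi_m\rangle| \le 2\tilde C(\Lambda + C_{0,0})^k\|K_0\|\,\|\psi\|\,\|\psi_m\| \le \frac{c_0}{12}\|\psi_m\|^2 + \frac{12\tilde C^2(\Lambda + C_{0,0})^{2k}\|K_0\|^2}{c_0}\|\psi\|^2$$
and similarly
$$2\tilde C|\langle 1_{[|A|\le\Lambda]}\hat\psi_m, K_0\hat\psi_m\rangle| \le \frac{c_0}{24k}\|\psi_m\|^2 + \frac{24k\tilde C^2(\Lambda + C_{0,0})^{2k}\|K_0\|^2P^4}{c_0}\|\psi\|^2.$$
Inserting $1 = 1_{[|A|\le\Lambda]} + 1_{[|A|>\Lambda]}$ ahead of the $K_0$'s in (4.2) and appealing to the bounds above we get
$$\frac{c_0}{8}\|\psi_m\|^2 \le C\bigl(\|N^{\frac12}(A - i\sigma)^{k-1}\psi\|^2 + \|\psi\|^2\bigr),$$
for a suitable $m$-independent $C$. Recalling (3.32) we conclude that $\psi \in D(A^k)$.

It remains to prove that $A^k\psi \in D(N^{1/2})$. Note that what we just established implies that $\psi_m \to A^k\psi$ in norm, cf. (3.20) and (3.25). We can now compute
$$\langle A^k\psi, NI_{in}(N)A^k\psi\rangle = \lim_{m\to\infty}\langle\psi_m, NI_{in}(N)\psi_m\rangle.$$
But by Propositions 4.2 and 4.3 we have
$$\langle\psi_m, NI_{in}(N)\psi_m\rangle \le \|N^{\frac12}\psi_m\|^2 \le \|N^{\frac12}\psi_m\|^2 + 2k\|N^{\frac12}\hat\psi_m\|^2 \le 2C_2\bigl(\langle\psi_m, H'\psi_m\rangle + 2k\langle\hat\psi_m, H'\hat\psi_m\rangle\bigr) + C = \langle\psi, R_m\psi\rangle + C,$$
where $C > 0$ is a constant independent of $m$. The result now follows from (4.1) by first taking the limit $m \to \infty$, and subsequently $n \to \infty$. Notice that Lebesgue’s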
theorem on monotone convergence applies, since $I_{in}(N) = n(N+n)^{-1} \to 1$ monotonically.

The rest of the section is devoted to establishing Propositions 4.2–4.4. We begin with a definition and a series of lemmata. The $\sigma$ in the definition below is the same $\sigma$ that entered into Definition 4.1.

Definition 4.5. Let $E_m^l$ and $E_m^r$ be families of forms on $\mathcal D_{k-1} \times \mathcal D(N^{1/2})$ and $\mathcal D(N^{1/2}) \times \mathcal D_{k-1}$ respectively. We say that $E_m^l$ is a left-error if
\[
|\langle\psi, E_m^l\varphi\rangle| \le C\|N^{1/2}(A-i\sigma)^{k-1}\psi\|\,\|N^{1/2}\varphi\| .
\]
We say that $E_m^r$ is a right-error if
\[
|\langle\psi, E_m^r\varphi\rangle| \le C\|N^{1/2}\psi\|\,\|N^{1/2}(A-i\sigma)^{k-1}\varphi\| .
\]
Remark 4.6. An example of a right-error that we will encounter below are forms $N^{1/2}B_mN^{1/2}g_{1m}(A)^{\ell}(A-i\sigma)^{-j}$, with $\ell - j \le k-1$ and $\sup_m\|B_m\| < \infty$. To see that this is a right-error, observe that it suffices to prove that $N g_{1m}(A)^{\ell}(A-i\sigma)^{-j-k+1}N^{-1}$ is uniformly bounded in $m$. The result then follows from interpolation. Since $j + k - 1 \ge \ell$, recalling that $\sigma$ was chosen according to (3.8), we reduce the problem to showing that $N g_{1m}(A)(A-i\sigma)^{-1}N^{-1}$ is bounded uniformly in $m$. But this follows by a computation similar to (3.33), where the extra resolvent produces a bound which is uniform in $m$ compared with the pointwise bound (3.34).

We introduce the notation
\[
H_n := H I_n(H) = in(I_n(H) - 1), \tag{4.3}
\]
which plays the role of a regularized Hamiltonian. See (3.11) for the definition of $I_n(H)$.

Lemma 4.7. We have the following limit in the sense of forms on $\mathcal D(N^{1/2})$
\[
\lim_{n\to\infty} i[H_n, g_{1m}(A)] = -\int_{V_m^{>}} \eta R_m(\eta)H'R_m(\eta)\,d\lambda(\eta).
\]

Proof. Observe first that the integral on the right-hand side in the lemma is norm convergent. Compute as a form on $\mathcal D(A)$, using that the integral representation for $g_{1m}(A)$ is strongly convergent on $\mathcal D(A)$,
\[
i[H_n, g_{1m}(A)] = \int_{V_m^{>}} \eta\, i[H_n, R_m(\eta)]\,d\lambda(\eta).
\]
Recalling (4.3) we arrive at
\[
i[H_n, g_{1m}(A)] = in\int_{V_m^{>}} \eta\, i[I_n(H), R_m(\eta)]\,d\lambda(\eta) = \int_{V_m^{>}} \eta\, I_n(H)\, i[H, R_m(\eta)]\, I_n(H)\,d\lambda(\eta).
\]
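The second equality above is a resolvent-type identity. The following is a minimal sketch of it, assuming only (4.3) and that $H - in$ is boundedly invertible (as it is for self-adjoint $H$ and $n \neq 0$). The shorthand $R = R_m(\eta)$ is introduced here for brevity, and domain questions are ignored.

```latex
% From (4.3): H I_n(H) = in (I_n(H) - 1), hence
%   (in - H) I_n(H) = in,  i.e.  I_n(H) = in (in - H)^{-1}.
% Use [X^{-1}, R] = X^{-1} [R, X] X^{-1} with X = in - H.
\begin{align*}
[I_n(H), R]
  &= in\,\bigl[(in-H)^{-1}, R\bigr] \\
  &= in\,(in-H)^{-1}\,\bigl[R,\, in-H\bigr]\,(in-H)^{-1} \\
  &= in\,(in-H)^{-1}\,[H, R]\,(in-H)^{-1} \\
  &= (in)^{-1}\, I_n(H)\,[H,R]\,I_n(H).
\end{align*}
% Multiplying by in \cdot i gives
%   in \cdot i[I_n(H), R] = I_n(H)\, i[H,R]\, I_n(H).
```

The first equality in the display is simply $H_n = in(I_n(H) - 1)$ inserted into the commutator, since the constant term commutes with $R$.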
Finally we employ Condition 2.1(3) to conclude that for each $n$, the following holds as a form identity on $\mathcal D(A) \cap \mathcal D(N^{1/2})$
\[
i[H_n, g_{1m}(A)] = -\int_{V_m^{>}} \eta\, I_n(H)R_m(\eta)H'R_m(\eta)I_n(H)\,d\lambda(\eta).
\]
The integral on the right-hand side of the above identity is absolutely convergent in $\mathcal B(N^{-1/2}\mathcal H; N^{1/2}\mathcal H)$. By density of $\mathcal D(A) \cap \mathcal D(N^{1/2})$ in $\mathcal D(N^{1/2})$, see Remark 2.4(2), the identity therefore extends to a form identity on $\mathcal D(N^{1/2})$. The lemma now follows from (3.12).

Lemma 4.8. Let $1 \le k \le k_0$.
(1) There exist right-errors $E_m^r$, $\hat E_m^r$ such that, as forms on $\mathcal D(N^{1/2}) \times \mathcal D_{k-1}$,
\[
\lim_{n\to\infty} i[H_n, g_{1m}(A)^k] = E_m^r, \qquad \lim_{n\to\infty} i[H_n, \hat g_{1m}(A)g_{1m}(A)^{k-1}] = \hat E_m^r .
\]
(2) There exist a left-error $E_m^l$ and a right-error $E_m^r$ such that, as forms on $\mathcal D_{k-1} \times \mathcal D(N^{1/2})$ and $\mathcal D(N^{1/2}) \times \mathcal D_{k-1}$ respectively,
\[
\lim_{j\to\infty}\lim_{n\to\infty} i[H_n, g_{1m}(A)^k]A_j = k g_{1m}(A)^{k-1}A g'_{1m}(A)H' + E_m^l,
\]
\[
\lim_{j\to\infty}\lim_{n\to\infty} A_j\, i[H_n, g_{1m}(A)^k] = k H' A g'_{1m}(A) g_{1m}(A)^{k-1} + E_m^r .
\]

Proof. (1) also holds if we take the limit in the sense of forms on $\mathcal D_{k-1} \times \mathcal D(N^{1/2})$ and replace the right-error by a left-error. We will however not need that statement. One does however need its proof for the left-error part of (2). In the proof we will only work with right-errors. The other case is similar.

We begin with (1) and prove only the first statement, leaving the second to the reader. We first compute as a form on $\mathcal D(N^{1/2})$
\[
i[H_n, g_{1m}(A)^k] = k\, i[H_n, g_{1m}(A)]g_{1m}(A)^{k-1} + \sum_{\ell=2}^{k} (-1)^{\ell+1}\binom{k}{\ell}\, i\,\mathrm{ad}^{\ell}_{g_{1m}(A)}(H_n)\, g_{1m}(A)^{k-\ell}. \tag{4.4}
\]
We now analyze the large $n$ limit. The first term on the right-hand side of (4.4) can be dealt with using Lemma 4.7 directly, observing that by Lemma 3.8 $g_{1m}(A)$
preserves the domain of $N^{1/2}$. As for the terms involving higher order commutators, we again use Lemma 4.7 to compute
\[
\lim_{n\to\infty} i\,\mathrm{ad}^{\ell}_{g_{1m}(A)}(H_n) = -\int_{V_m^{>}} \eta R_m(\eta)\,\mathrm{ad}^{\ell-1}_{g_{1m}(A)}(H')\,R_m(\eta)\,d\lambda(\eta)
\]
in the sense of forms on $\mathcal D(N^{1/2})$. We can now employ Condition 2.6 to compute as forms on $\mathcal D(N^{1/2})$
\[
\lim_{n\to\infty} i\,\mathrm{ad}^{\ell}_{g_{1m}(A)}(H_n) = (-1)^{\ell} N^{1/2} B_m^{(\ell)} N^{1/2}, \tag{4.5}
\]
where $B_m^{(\ell)}$ is a family of bounded operators with $\sup_m \|B_m^{(\ell)}\| < \infty$, for all $\ell$. They are given by
\[
B_m^{(\ell)} = \int_{(V_m^{>})^{\ell}} \eta_1\cdots\eta_\ell\, N^{-1/2}R_m(\eta_1)\cdots R_m(\eta_\ell)\,\mathrm{ad}^{\ell-1}_{A}(H')\, R_m(\eta_\ell)\cdots R_m(\eta_1)N^{-1/2}\,d\lambda(\eta_1)\cdots d\lambda(\eta_\ell). \tag{4.6}
\]
From (4.4), (4.5) and Lemma 4.7, we thus obtain
\[
\lim_{n\to\infty} i[H_n, g_{1m}(A)^k] = -k\int_{V_m^{>}} \eta R_m(\eta)H'R_m(\eta)\,d\lambda(\eta)\, g_{1m}(A)^{k-1} - \sum_{\ell=2}^{k}\binom{k}{\ell} N^{1/2}B_m^{(\ell)}N^{1/2} g_{1m}(A)^{k-\ell}. \tag{4.7}
\]
Combining this computation with Remark 4.6 yields (1).

We now turn to part (2) of the lemma. In view of (4.7) we begin by computing as a form on $\mathcal D(N^{1/2})$, using Condition 2.1(4),
\[
-k\int_{V_m^{>}} \eta R_m(\eta)H'R_m(\eta)\,d\lambda(\eta) = kH'g'_{1m}(A) - \frac{ik}{m}\int_{V_m^{>}} \eta R_m(\eta)H''R_m(\eta)^2\,d\lambda(\eta). \tag{4.8}
\]
We remark that the identity $i[H', R_m(\eta)] = -m^{-1}R_m(\eta)H''R_m(\eta)$ holds a priori as a form identity on $\mathcal D(N)$. It extends by continuity to a form identity on $\mathcal D(N^{1/2})$, which is what is used in the above computation. Note that the integral on the right-hand side is convergent as a form on $\mathcal D(N^{1/2})$. From (4.7), (4.8) and Remark 4.6, we find that
\[
\lim_{n\to\infty} i[H_n, g_{1m}(A)^k]A_j = \bigl(kH'g'_{1m}(A)Ag_{1m}(A)^{k-1} + E_m^r\bigr)I_j(A)
\]
and hence by (3.14) we conclude the following identity as forms on $\mathcal D(N^{1/2})$
\[
\lim_{j\to\infty}\lim_{n\to\infty} i[H_n, g_{1m}(A)^k]A_j = kH'g'_{1m}(A)Ag_{1m}(A)^{k-1} + E_m^r .
\]
To prove the second statement in (2) it remains to show that the commutator between $A_j$ and $i[H_n, g_{1m}(A)^k]$ converges to a right-error. From (4.8) we get, as a form on $\mathcal D(N^{1/2})$,
\[
\Bigl[-k\int_{V_m^{>}} \eta R_m(\eta)H'R_m(\eta)\,d\lambda(\eta),\, A_j\Bigr] = kI_j(A)H''I_j(A)g'_{1m}(A) - \frac{ik}{m}\int_{V_m^{>}} \eta R_m(\eta)(H''A_j - A_jH'')R_m(\eta)^2\,d\lambda(\eta).
\]
We can now take the limit $j \to \infty$ and obtain
\[
\lim_{j\to\infty}\Bigl[-k\int_{V_m^{>}} \eta R_m(\eta)H'R_m(\eta)\,d\lambda(\eta),\, A_j\Bigr] = N^{1/2}B_m^{(1)}N^{1/2}, \tag{4.9}
\]
where $B_m^{(1)}$ is a family of bounded operators with $\sup_m\|B_m^{(1)}\| < \infty$. It is given by
\[
B_m^{(1)} = kN^{-1/2}\Bigl(H''g'_{1m}(A) - i\int_{V_m^{>}} \eta\bigl(R_m(\eta)H'' - H''R_m(\eta)\bigr)R_m(\eta)\,d\lambda(\eta)\Bigr)N^{-1/2}.
\]
Here we used (3.14), that $A_jR_m(\eta) = R_m(\eta)A_j = m(1 + \eta R_m(\eta))I_j(A)$, as well as Lebesgue's theorem on dominated convergence.

For the commutator between $A_j$ and the second term on the right-hand side of (4.7) we compute
\[
[N^{1/2}B_m^{(\ell)}N^{1/2}, A_j] = I_j(A)N^{1/2}\tilde B_m^{(\ell)}N^{1/2}I_j(A),
\]
where $\tilde B_m^{(\ell)}$ are bounded operators with $\sup_{m\in\mathbb N}\|\tilde B_m^{(\ell)}\| < \infty$, for all $\ell$. They are given by
\[
\tilde B_m^{(\ell)} = \int_{(V_m^{>})^{\ell}} N^{-1/2}R_m(\eta_1)\cdots R_m(\eta_\ell)\,\mathrm{ad}^{\ell}_{A}(H')\, R_m(\eta_\ell)\cdots R_m(\eta_1)N^{-1/2}\,d\lambda(\eta_1)\cdots d\lambda(\eta_\ell).
\]
We can now take the limit $j \to \infty$ using (3.14), and the resulting expression together with (4.9), the formula (4.7) and Remark 4.6 yields that
\[
\lim_{j\to\infty}\lim_{n\to\infty} [i[H_n, g_{1m}(A)^k], A_j] = E_m^r .
\]

Lemma 4.9. There exists a $k$-remainder $R_m$ such that
\[
\lim_{j\to\infty}\lim_{n\to\infty} i[H_n, g_{1m}(A)^kA_jg_{1m}(A)^k] = g_{1m}(A)^kH'g_{1m}(A)^k + 2k\,\mathrm{Re}\{g_{1m}(A)^{k-1}Ag'_{1m}(A)H'g_{1m}(A)^k\} + R_m,
\]
in the sense of forms on $\mathcal D_{k-1}$.
Proof. We compute as a form on $\mathcal D_{k-1}$
\[
i[H_n, g_{1m}(A)^kA_jg_{1m}(A)^k] = i[H_n, g_{1m}(A)^k]A_jg_{1m}(A)^k + g_{1m}(A)^k\, i[H_n, A_j]\, g_{1m}(A)^k + g_{1m}(A)^kA_j\, i[H_n, g_{1m}(A)^k].
\]
Using that $\lim_{n\to\infty} i[H_n, A_j] = I_j(A)H'I_j(A)$, $\lim_{j\to\infty} I_j(A)H'I_j(A) = H'$ (in the sense of forms on $\mathcal D(N^{1/2})$), and Lemma 4.8(2), we conclude the result, with
\[
R_m = E_m^l g_{1m}(A)^k + g_{1m}(A)^k E_m^r .
\]
Note that $R_m$ is a $k$-remainder, in the sense of Definition 4.1.

We now symmetrize the form $g_{1m}(A)^{k-1}Ag'_{1m}(A)H'g_{1m}(A)^k$, defined on $\mathcal D(N^{1/2})$.

Lemma 4.10. There exists a $k$-remainder $R_m$ such that
\[
\mathrm{Re}\{g_{1m}(A)^{k-1}Ag'_{1m}(A)H'g_{1m}(A)^k\} = g_{1m}(A)^kp_{1m}(A)H'p_{1m}(A)g_{1m}(A)^k + R_m,
\]
in the sense of forms on $\mathcal D_{k-1}$.

Proof. Step I. From the proof of Lemma 3.8, it follows that
\[
[N, Ag'_{1m}(A)]N^{-1}, \qquad [N, p_{1m}^2(A)g_{1m}(A)]N^{-1}, \tag{4.10}
\]
and
\[
N^{-1}p_{1m}(A)N \tag{4.11}
\]
extend as forms from $\mathcal D(N)$ to bounded operators with norm bounded uniformly in $m$.

Step II. Boundedness of the forms in (4.10), together with the observation that $\|tg'_{1m} - p_{1m}^2g_{1m}\|_\infty$ is bounded uniformly in $m$, implies after an interpolation argument that
\[
N^{1/2}\bigl(Ag'_{1m}(A) - p_{1m}(A)^2g_{1m}(A)\bigr)N^{-1/2}
\]
is bounded uniformly in $m$. Hence
\[
\mathrm{Re}\{g_{1m}(A)^{k-1}Ag'_{1m}(A)H'g_{1m}(A)^k\} = g_{1m}(A)^k\,\mathrm{Re}\{p_{1m}(A)^2H'\}\,g_{1m}(A)^k + R_m^{(1)},
\]
where $R_m^{(1)}$ is a $k$-remainder.

Step III. We compute as a form on $\mathcal D(N^{1/2})$
\[
(A + i\sigma)[p_{1m}(A), H'] = -i\,\frac{A + i\sigma}{m}\int_{V_m^{>}} R_m(\eta)H''R_m(\eta)\,d\lambda_p(\eta),
\]
which is bounded uniformly in $m$ as a form on $\mathcal D(N^{1/2})$. This together with (4.11) and an interpolation argument as in Step II shows that
\[
g_{1m}(A)^k\,\mathrm{Re}\{p_{1m}(A)^2H'\}\,g_{1m}(A)^k = g_{1m}(A)^kp_{1m}(A)H'p_{1m}(A)g_{1m}(A)^k + R_m^{(2)},
\]
where $R_m^{(2)}$ is a $k$-remainder. Here we used again Remark 4.6. This proves the lemma with $R_m = R_m^{(1)} + R_m^{(2)}$.

Proof of Proposition 4.2. Combine Lemmas 4.9 and 4.10.

Proof of Proposition 4.3. We only prove the first estimate. The second is verified the same way. We can assume that $\lambda = 0$. We estimate using Condition 2.2
\[
\|N^{1/2}I_n(H)\psi_m\|^2 \le C_1\langle I_n(H)\psi_m, HI_n(H)\psi_m\rangle + C_2\langle I_n(H)\psi_m, H'I_n(H)\psi_m\rangle + C_3\|I_n(H)\psi_m\|^2 . \tag{4.12}
\]
Note that $HI_n(H)\psi_m = H_n\psi_m = [H_n, g_{1m}(A)^k]\psi$. By Lemma 4.8(1) we find that for any $\varphi \in \mathcal D(N^{1/2})$ we have
\[
\lim_{n\to\infty}\langle\varphi, H_n\psi_m\rangle = \langle\varphi, E_m^r\psi\rangle . \tag{4.13}
\]
By this observation and the uniform boundedness principle there exists $C = C(m)$ such that $|\langle\varphi, H_n\psi_m\rangle| \le C\|N^{1/2}\varphi\|$ uniformly in $n$, for $\varphi \in \mathcal D(N^{1/2})$. Applying this to $\varphi = (I_n(H) - I)\psi_m$, together with (4.13), now applied with $\varphi = \psi_m$, we get
\[
\lim_{n\to\infty}\langle I_n(H)\psi_m, HI_n(H)\psi_m\rangle = \langle\psi_m, E_m^r\psi\rangle . \tag{4.14}
\]
Here $E_m^r$ is a right-error. We can now take the limit $n \to \infty$ in (4.12), and the result follows from Definition 4.5.

Proof of Proposition 4.4. As above we assume $\lambda = 0$ and prove only the first bound. By Remark 2.4(4) it suffices to estimate using the bound (2.6) instead of the one in Condition 2.3. We get
\[
\langle I_n(H)\psi_m, H'I_n(H)\psi_m\rangle \ge c_0\|I_n(H)\psi_m\|^2 + \mathrm{Re}\langle I_n(H)\psi_m, BHI_n(H)\psi_m\rangle - \langle I_n(H)\psi_m, K_0I_n(H)\psi_m\rangle . \tag{4.15}
\]
Arguing as in the part of the proof of Proposition 4.3 pertaining to (4.14), we find that
\[
\lim_{n\to\infty}\mathrm{Re}\langle I_n(H)\psi_m, BHI_n(H)\psi_m\rangle = \mathrm{Re}\langle\psi_m, E_m^r\psi\rangle,
\]
where $E_m^r$ is a right-error. Here (4.13) was used (twice) with $\varphi$ replaced by $B\varphi$ and $B^*\varphi$, where we used the assumption on $B$ in Remark 2.4(4) to argue that $B\varphi, B^*\varphi \in \mathcal D(N^{1/2})$ in (4.13). Inserting this limit into (4.15) yields
\[
\langle\psi_m, H'\psi_m\rangle = \lim_{n\to\infty}\langle I_n(H)\psi_m, H'I_n(H)\psi_m\rangle \ge c_0\|\psi_m\|^2 - \langle\psi_m, K_0\psi_m\rangle + \mathrm{Re}\langle\psi_m, E_m^r\psi\rangle,
\]
$E_m^r$ being a right-error. Using Definition 4.5 and Proposition 4.3, we conclude the first estimate.

4.2. Proof of Theorem 2.10

We shall show Theorem 2.10, which is an extension of Corollary 2.9 under the minimal condition $k_0 = 1$.

Proof of Theorem 2.10. We can without loss of generality take $\lambda = 0$. Due to Corollary 2.9, only the first statement needs elaboration. The idea of the proof is to apply a virial argument for the commutator $i[H, A]$ and the state $N^{1/2}\psi$. We divide the proof into three steps. Let $N_n^{(1/2)} = N^{1/2}I_{in}(N)$.

Step I. Due to Lemma 3.1, we have $N_n^{(1/2)}\psi \in \mathcal D(H)$. We shall show that
\[
\sup_{n\in\mathbb N}\|HN_n^{(1/2)}\psi\| < \infty . \tag{4.16}
\]
We can use the representation formula (3.5) with $\alpha = 1/2$ and commute $H$ through $N^{1/2}$, cf. (3.4). Whence it suffices to bound
\[
\int_0^\infty t^{1/2}(N+t)^{-1}[H, N]^0(N+t)^{-1}I_{in}(N)N^{-1/2}\,dt
\]
independently of $n$. (Note that the contribution from commuting through the second factor $I_{in}(N)$ indeed is bounded independently of $n$.) By (2.1) we have
\[
[H, N]^0 = N^{1/2-\kappa}BN^{1/2-\kappa} \quad\text{for } B \text{ bounded},
\]
and we can estimate
\[
\|(N+t)^{-1}i[H, N]^0(N+t)^{-1}I_{in}(N)N^{-1/2}\| \le \|B\|\,t^{-3/2-\kappa} \quad\text{uniformly in } n.
\]
Hence the integrand is $O(t^{-1-\kappa})$ uniformly in $n$, and (4.16) follows.

Step II. We shall show that
\[
\sup_{n\in\mathbb N}\|AN_n^{(1/2)}\psi\| < \infty . \tag{4.17}
\]
Since $\varphi := N^{1/2}\psi \in \mathcal D(A)$ due to Corollary 2.9, it suffices to bound the state $[A, I_{in}(N)]\varphi$ independently of $n$. This is obvious from the representation $[A, I_{in}(N)]\varphi = -i(N+n)^{-1}N'I_{in}(N)\varphi$, and whence (4.17) follows.
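The representation used in Step II can be checked in a few lines. The following is a minimal sketch, assuming that $N'$ denotes the commutator form $i[N,A]$ (this reading of the symbol is an assumption, since $N'$ is not defined in this section) and ignoring domain questions.

```latex
% I_{in}(N) = n (N+n)^{-1}; use [A, X^{-1}] = X^{-1} [X, A] X^{-1} with X = N + n.
\begin{align*}
[A, I_{in}(N)]
  &= n\,\bigl[A, (N+n)^{-1}\bigr] \\
  &= n\,(N+n)^{-1}\,[N, A]\,(N+n)^{-1} \\
  &= (N+n)^{-1}\,[N,A]\,I_{in}(N) \\
  &= -i\,(N+n)^{-1}\,N'\,I_{in}(N), \qquad N' := i[N,A].
\end{align*}
% Hypothetical sufficient condition for the uniform bound: if N'(N+1)^{-1}
% is bounded, then so is (N+1)^{-1} N', and
%   \|(N+n)^{-1}(N+1)\| \le 1 for n \ge 1,
% so (N+n)^{-1} N' is bounded uniformly in n.
```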
Step III. We look at
\[
\langle i[H, A]\rangle_{N_n^{(1/2)}\psi} = -2\,\mathrm{Re}\langle iHN_n^{(1/2)}\psi, AN_n^{(1/2)}\psi\rangle .
\]
Due to (4.16) and (4.17), the right-hand side is bounded independently of $n$. We compute using Conditions 2.1(1) and 2.1(3)
\[
\langle i[H, A]\rangle_{N_n^{(1/2)}\psi} = \lim_{\tilde n\to\infty}\langle i[H, AI_{\tilde n}(A)]\rangle_{N_n^{(1/2)}\psi} = \langle H'\rangle_{N_n^{(1/2)}\psi} .
\]
Whence using the virial estimate Condition 2.2 (and also Step I again) we conclude that
\[
\langle N\rangle_{N_n^{(1/2)}\psi} \le C \quad\text{uniformly in } n.
\]
Taking $n \to \infty$ we obtain that indeed $\psi \in \mathcal D(N)$.

4.3. Theorem on more N-regularity

We formulate and prove an extended version of Theorem 2.10. Notice that under Conditions 2.1(1) and 2.1(2), and the additional condition (2.11) for $k_0 = 1$,
\[
N^{1/2} \in C^1_{\mathrm{Mo}}(A) \cap C^1_{\mathrm{Mo}}(H), \tag{4.18}
\]
cf. Lemma 3.2 and Proposition 3.7. We impose the conditions of Corollary 2.9 and aim at an improvement of Corollary 2.9 and Theorem 2.10 in the case $k_0 \ge 2$. Let $M_0 = i[N^{1/2}, A]^0$. Then, cf. Proposition 3.7,
\[
i^m\,\mathrm{ad}^m_A(M_0) \text{ is } N^{1/2}\text{-bounded for } m = 0, \ldots, k_0 - 1. \tag{4.19}
\]
Here the commutators are defined iteratively as extensions of forms on $\mathcal D(N^{1/2}) \cap \mathcal D(A)$ and they are considered as symmetric $N^{1/2}$-bounded operators. We introduce the following $N^{1/2}$-bounded operators:
\[
M_1 = i[N^{1/2}, H]^0 = c_{1/2}\int_0^\infty t^{1/2}(N+t)^{-1}i[N, H]^0(N+t)^{-1}\,dt, \qquad M_2 = H'N^{-1/2} \quad\text{and}\quad M_3 = N^{-1/2}H'.
\]
Notice that
\[
M_3 \subseteq M_2^* \quad\text{and}\quad M_2 \subseteq M_3^* . \tag{4.20}
\]
We need to consider repeated commutation of $M_j$, $j = 1, \ldots, 3$, with factors of $T = A$ or $T = N^{1/2}$.

Condition 4.11. For all $j = 1, \ldots, 3$, $m = 1, \ldots, k_0 - 1$ and all possible combinations of factors $T_n \in \{A, N^{1/2}\}$ where $n = 1, \ldots, m$
\[
i^m\,\mathrm{ad}_{T_m}\cdots\mathrm{ad}_{T_1}(M_j) \text{ is } N^{1/2}\text{-bounded}. \tag{4.21}
\]
Notice that in (4.21) the commutators are defined iteratively as extensions of forms on $\mathcal D(N^{1/2}) \cap \mathcal D(A)$ using (4.20) and the analogous properties for $m \ge 2$
\[
(-1)^{m-1}\mathrm{ad}_{T_{m-1}}\cdots\mathrm{ad}_{T_1}(M_3) \subseteq \bigl(\mathrm{ad}_{T_{m-1}}\cdots\mathrm{ad}_{T_1}(M_2)\bigr)^*,
\]
\[
(-1)^{m-1}\mathrm{ad}_{T_{m-1}}\cdots\mathrm{ad}_{T_1}(M_2) \subseteq \bigl(\mathrm{ad}_{T_{m-1}}\cdots\mathrm{ad}_{T_1}(M_3)\bigr)^* .
\]
We shall prove the following extension of Corollary 2.9 and Theorem 2.10.

Theorem 4.12. Suppose the conditions of Corollary 2.9 and for $k_0 \ge 2$ also Condition 4.11. Let $\psi \in \mathcal D(N^{1/2})$ be a bound state, $(H - \lambda)\psi = 0$ (with $\lambda$ as in Condition 2.3). Then $\psi \in \mathcal D(T_{k_0+1}\cdots T_1)$ where $T_n \in \{A, N^{1/2}, 1\}$ for $n = 1, \ldots, k_0 + 1$ and at least for one such $n$, $T_n \ne A$.

Proof. We proceed by induction in $k_0$. The case $k_0 = 1$ is the content of Theorem 2.10. So suppose $k_0 \ge 2$ and that the statement holds for $k_0 \to k_0 - 1$. Consider any product $S = T_{k_0+1}\cdots T_1$, not all factors being given by $A$. We shall show that $\psi \in \mathcal D(S)$. By Corollary 2.9 and the induction hypothesis we can assume that the factors $T_n \in \{A, N^{1/2}\}$ and that for at least two $n$'s $T_n = N^{1/2}$. By using (4.19) and the induction hypothesis we can assume that $T_{k_0+1} = N^{1/2}$. Whence we can assume $S = N^{1/2}S^{\alpha}_{k,\ell}$ with $k = k_0$, introducing here the following notation for $k = 1, \ldots, k_0$, $\ell = 0, \ldots, k$ and $\alpha$ being a multiindex $\alpha \in \{0,1\}^k$ with $\sum_{j\le k}\alpha_j = \ell$,
\[
S^{\alpha}_{k,\ell} = S_{\alpha_k}\cdots S_{\alpha_1} =: \prod_{j=1}^{k} S_{\alpha_j}, \quad\text{where } S_0 = A \text{ and } S_1 = N^{1/2}.
\]
Partly motivated by the above considerations we introduce the following quantity for $n \in \mathbb N$ large and $\epsilon \in\, ]0,1[$ small
\[
f(n,\epsilon) = \sum_{\ell=0}^{k_0}\epsilon^{-2\ell^2}g(n,\ell); \qquad g(n,\ell) := \sum_{\substack{\alpha\in\{0,1\}^{k_0}\\ \alpha_1+\cdots+\alpha_{k_0}=\ell}} \|N^{1/2}I_{in}(N)S^{\alpha}_{k_0,\ell}\psi\|^2 .
\]
We claim that for some constants $K_1, K_2(\epsilon) > 0$ independent of $n$
\[
f(n,\epsilon) \le \epsilon^2K_1f(n,\epsilon) + K_2(\epsilon). \tag{4.22}
\]
The theorem follows from (4.22) by first choosing $\epsilon$ so small that $\epsilon^2K_1 \le 1/2$, subtraction of the first term on the right-hand side and then letting $n \to \infty$. By Corollary 2.9 (or Theorem 2.7), $\sup_n g(n, \ell = 0) < \infty$, in agreement with (4.22). To see how the factor $\epsilon^2$ comes about let us note that $-2\ell^2 = -(\ell-1)^2 - (\ell+1)^2 + 2$, whence (to be used later) we can for $\ell = 1, \ldots, k_0 - 1$ bound the expression
\[
\epsilon^{-2\ell^2}\sqrt{g(n,\ell-1)}\sqrt{g(n,\ell+1)} \le \epsilon^2f(n,\epsilon). \tag{4.23}
\]
To show (4.22) we mimic the proof of Theorem 2.10. Again this is in three steps and we assume that $\lambda = 0$. We need to bound each term of $g(n,\ell)$ for $\ell \ge 1$.
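The bound (4.23) follows from the stated identity for $-2\ell^2$ together with the elementary inequality $ab \le (a^2 + b^2)/2$. The following is a short sketch, under the reading that (4.23) involves the square roots of $g(n,\ell\pm1)$ (an interpretation of the formula that is consistent with $g$ being a sum of squared norms).

```latex
\begin{align*}
\epsilon^{-2\ell^2}\sqrt{g(n,\ell-1)}\,\sqrt{g(n,\ell+1)}
  &= \epsilon^{2}\,
     \Bigl(\epsilon^{-(\ell-1)^2}\sqrt{g(n,\ell-1)}\Bigr)
     \Bigl(\epsilon^{-(\ell+1)^2}\sqrt{g(n,\ell+1)}\Bigr) \\
  &\le \frac{\epsilon^{2}}{2}
     \Bigl(\epsilon^{-2(\ell-1)^2} g(n,\ell-1)
         + \epsilon^{-2(\ell+1)^2} g(n,\ell+1)\Bigr) \\
  &\le \epsilon^{2} f(n,\epsilon),
\end{align*}
% For 1 \le \ell \le k_0 - 1 both indices \ell - 1 and \ell + 1 lie in
% \{0, ..., k_0\}, so the two terms in the bracket are distinct non-negative
% summands of f(n,\epsilon) = \sum_{\ell=0}^{k_0} \epsilon^{-2\ell^2} g(n,\ell).
```

The weights $\epsilon^{-2\ell^2}$ are thus chosen so that a product of neighbouring levels $\ell - 1$ and $\ell + 1$ gains a factor $\epsilon^2$ relative to $f$.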
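Step I. Bounding $\|HI_{in}(N)S^{\alpha}_{k_0,\ell}\psi\|$. We expand into terms; some can be bounded independently of $n$ (using the induction hypothesis) while others will be estimated as $C\sqrt{g(n,\ell+1)}$ (assuming here that $\ell \le k_0 - 1$). We compute formally
\[
i[H, I_{in}(N)S^{\alpha}_{k_0,\ell}] = i[H, I_{in}(N)]S^{\alpha}_{k_0,\ell} + I_{in}(N)\,i\Bigl[H, \prod_{j=1}^{k_0}S_{\alpha_j}\Bigr], \tag{4.24}
\]
where the second commutator is expanded as
\[
i\Bigl[H, \prod_{j=1}^{k_0}S_{\alpha_j}\Bigr] = \sum_{m=1}^{k_0}\Bigl(\prod_{j=m+1}^{k_0}S_{\alpha_j}\Bigr)\,i[H, S_{\alpha_m}]\,\Bigl(\prod_{j=1}^{m-1}S_{\alpha_j}\Bigr). \tag{4.25}
\]
In turn we have the expressions
\[
i[H, I_{in}(N)] = n^{-1}I_{in}(N)\,i[N, H]^0\,I_{in}(N), \tag{4.26a}
\]
\[
i[H, S_{\alpha_m}] = -M_1 \quad\text{if } \alpha_m = 1, \tag{4.26b}
\]
\[
i[H, S_{\alpha_m}] = M_2S_1 \quad\text{if } \alpha_m = 0. \tag{4.26c}
\]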
We plug (4.26a)–(4.26c) into (4.24) and (4.25) and look at each term separately. Before embarking on such an examination we need to "fix" the above formal computation. This is done in terms of multiple approximation somewhat similar to the one of the proof of Theorem 2.7. We replace $H \to H_p$ and the factors $A \to A_q$ and $N^{1/2} \to N^{1/2}_{iq} = (N^{1/2})_{iq}$. More precisely it is convenient to introduce $k_0$ different $q$'s, say $q_1, \ldots, q_{k_0}$; the $q$ used for the $j$th factor $S_{\alpha_j}$ is $q_j$. For fixed $p$ and $q$'s the product rule applies for computing the commutator of the product and the analogues of (4.24) and (4.25) hold true. Now we can take the limit $p \to \infty$. We can plug the modified expressions of (4.26a)–(4.26c) into (modified) (4.24) and (4.25). Actually (4.26a) is the same, but (4.26b) and (4.26c) are changed as
\[
i[H, N^{1/2}_{iq_j}] = -I_{iq_j}(N^{1/2})M_1I_{iq_j}(N^{1/2}), \tag{4.27a}
\]
\[
i[H, A_{q_j}] = I_{q_j}(A)M_2S_1I_{q_j}(A). \tag{4.27b}
\]
Of course we have a $q$-dependence of the various factors of either $S_1 \to N^{1/2}_{iq_j}$ or $S_0 \to A_{q_j}$. Eventually we take the limits in the $q$'s, done in increasing order, starting by taking $q_1 \to \infty$ and ending by taking $q_{k_0} \to \infty$. Before taking these limits we need to do some further commutation using Condition 4.11. For simplicity of presentation we ignore below in this process commutation with the regularizing factors of $I_{iq_j}(N^{1/2})$ or $I_{q_j}(A)$ since in the limit they will disappear (a manifestation of this occurred also in the proof of Lemma 3.4). In other words we proceed now slightly formally using (4.24) and (4.25) with the plugged in expressions (4.26a)–(4.26c): From (4.26a), we obtain that $\|i[H, I_{in}(N)]\| \le C$, so the contribution from the first term of (4.24) can be estimated (uniformly in $n$) as
\[
\|i[H, I_{in}(N)]S^{\alpha}_{k_0,\ell}\psi\| \le C\|S^{\alpha}_{k_0,\ell}\psi\| \le C. \tag{4.28}
\]
As for the contribution from (4.26b), we compute
\[
-I_{in}(N)\Bigl(\prod_{j=m+1}^{k_0}S_{\alpha_j}\Bigr)M_1\Bigl(\prod_{j=1}^{m-1}S_{\alpha_j}\Bigr) = T_1N^{1/2}\prod_{\substack{1\le j\le k_0\\ j\ne m}}S_{\alpha_j} + T_2,
\]
where
\[
T_1 = -I_{in}(N)M_1N^{-1/2}.
\]
Here $T_2$ is given by repeated commutation using Condition 4.11. We apply this identity to the bound state $\psi$. Since $\|T_1\| \le C$ the induction hypothesis gives similar bounds as (4.28) for the contribution from (4.26b).

It remains to look at the contribution from (4.26c): We commute the factor $M_2$ to the left and get similarly
\[
I_{in}(N)\Bigl(\prod_{j=m+1}^{k_0}S_{\alpha_j}\Bigr)M_2S_1\Bigl(\prod_{j=1}^{m-1}S_{\alpha_j}\Bigr) = T_1N^{1/2}I_{in}(N)\Bigl(\prod_{j=m+1}^{k_0}S_{\alpha_j}\Bigr)S_1\Bigl(\prod_{j=1}^{m-1}S_{\alpha_j}\Bigr) + T_2,
\]
where
\[
T_1 = I_{in}(N)M_2\bigl(N^{1/2}I_{in}(N)\bigr)^{-1}.
\]
As before $\|T_1\| \le C$ (here we use that $H'$ is $N$-bounded) and the contribution from $T_2$ is treated by using Condition 4.11 and the induction hypothesis. Consequently we get for $\ell \le k_0 - 1$ the total bound
\[
\|HI_{in}(N)S^{\alpha}_{k_0,\ell}\psi\| \le \tilde C_1\sqrt{g(n,\ell+1)} + \tilde C_2, \tag{4.29}
\]
where $\tilde C_1$ and $\tilde C_2$ are independent of $n$, and for $\ell = k_0$ this bound without the first term to the right.

Step II. Bounding $\|AI_{in}(N)S^{\alpha}_{k_0,\ell}\psi\|$. We claim that (recall $\ell \ge 1$)
\[
\|AI_{in}(N)S^{\alpha}_{k_0,\ell}\psi\| \le \tilde C_3\sqrt{g(n,\ell-1)} + \tilde C_4, \tag{4.30}
\]
where $\tilde C_3$ and $\tilde C_4$ are independent of $n$.

To prove (4.30) we observe that it suffices by the induction hypothesis to bound $\|I_{in}(N)AS^{\alpha}_{k_0,\ell}\psi\|$. Since $\ell \ge 1$ there is a nearest factor of $N^{1/2}$ in the product $S^{\alpha}_{k_0,\ell}$ that we move to the left in front of the factor $A$:
\[
I_{in}(N)AS^{\alpha}_{k_0,\ell} = N^{1/2}I_{in}(N)AS^{\beta}_{k_0-1,\ell-1} + T.
\]
We apply this identity to the bound state $\psi$. The contribution from $T$ is treated by using (4.19) and the induction hypothesis. This proves (4.30).

Step III. We repeat Step III of the proof of Theorem 2.10 using now the proven estimates (4.29) and (4.30) to bound any term of $g(n,\ell)$ for $\ell \ge 1$. In combination with (4.23) these bounds yield (4.22) with
\[
K_1 = 2C_2\tilde C_1\tilde C_3(2^{k_0} - 1) + 1;
\]
here the constant $C_2$ comes from (2.4) while $\tilde C_1$ and $\tilde C_3$ come from (4.29) and (4.30), respectively. Notice that the cardinality of the set $\{0,1\}^{k_0}$ is $2^{k_0}$, so the factor $2^{k_0} - 1$ arises by counting only those indices $\alpha \in \{0,1\}^{k_0}$ with $\sum\alpha_j \ge 1$.

Corollary 4.13. Suppose the conditions of Corollary 2.9 and for $k_0 \ge 2$ also Condition 4.11. Let $\psi \in \mathcal D(N^{1/2})$ be a bound state, $(H - \lambda)\psi = 0$ (with $\lambda$ as in Condition 2.3). Then $\psi \in \mathcal D(N^{(k_0+1)/2})$.

5. A Class of Massless Linearly Coupled Models

In this section we introduce a class of massless linearly coupled Hamiltonians, sometimes referred to as Pauli–Fierz Hamiltonians [7, 10, 11, 22]. The bulk of this section is spent on checking that an expanded version of the Hamiltonian does indeed satisfy the abstract assumptions of Sec. 2. In Sec. 5.2, we verify that the Nelson model described in Sec. 1.2 is indeed an example of the type of models discussed here.

5.1. The model and the result

Consider the Hilbert space $\mathcal H_{\mathrm{PF}} = \mathcal K \otimes \Gamma(\mathfrak h)$, where $\mathcal K$ is the Hilbert space for a "small" quantum system, and $\Gamma(\mathfrak h)$ is the symmetric Fock space over $\mathfrak h = L^2(\mathbb R^d, dk)$, describing a field of massless scalar bosons. The Pauli–Fierz Hamiltonian $H_v^{\mathrm{PF}}$ acting on $\mathcal H_{\mathrm{PF}}$ is defined by
\[
H_v^{\mathrm{PF}} = K \otimes 1_{\Gamma(\mathfrak h)} + 1_{\mathcal K} \otimes d\Gamma(|k|) + \phi(v), \tag{5.1}
\]
where $K$ is a Hamiltonian on $\mathcal K$ describing the dynamics of the small system. We assume that $K$ is bounded from below, and for convenience we require furthermore that $K \ge 0$. The term $d\Gamma(|k|)$ is the second quantization of the operator of multiplication by $|k|$, and $\phi(v) = (a^*(v) + a(v))/\sqrt 2$. The form factor $v$ is an operator from $\mathcal K$ to $\mathcal K \otimes \mathfrak h$, and $a^*(v)$, $a(v)$ are the usual creation and annihilation operators associated to $v$. See [7, 22].

The hypotheses we make are slightly stronger than the ones considered in [22]. The first one, Hypothesis (H0), expresses the assumption that the small system is confined:

(H0) $(K + i)^{-1}$ is compact on $\mathcal K$.
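Let $0 \le \tau < 1/2$ be fixed. We will introduce a class of interactions which increases with $\tau$. In order to formulate our assumption on the form factor $v$ we introduce the subspace $\mathcal O_\tau$ of $\mathcal B(\mathcal D(K^\tau); \mathcal K \otimes \mathfrak h)$ consisting of those operators which extend by continuity from $\mathcal D(K^\tau)$ to an element of $\mathcal B(\mathcal K; \mathcal D(K^\tau)^* \otimes \mathfrak h)$. In other words
\[
\mathcal O_\tau := \{v \in \mathcal B(\mathcal D(K^\tau); \mathcal K \otimes \mathfrak h) \mid \exists\, C > 0,\ \forall\, \psi \in \mathcal D(K^\tau) : \|[(K+1)^{-\tau} \otimes 1_{\mathfrak h}]v\psi\|_{\mathcal K\otimes\mathfrak h} \le C\|\psi\|_{\mathcal K}\}.
\]
We also write $v$ for the extension. It is natural to introduce a norm on $\mathcal O_\tau$ by
\[
\|v\|_\tau = \|v(K+1)^{-\tau}\|_{\mathcal B(\mathcal K;\mathcal K\otimes\mathfrak h)} + \|[(K+1)^{-\tau} \otimes 1_{\mathfrak h}]v\|_{\mathcal B(\mathcal K;\mathcal K\otimes\mathfrak h)}.
\]
Our first assumption on the form factor interaction is the following:

(I1) $v,\ [1_{\mathcal K} \otimes |k|^{-1/2}]v \in \mathcal O_\tau$.

It is proved in [22] that if (I1) holds, $H_v^{\mathrm{PF}}$ is self-adjoint with domain $\mathcal D(H_v^{\mathrm{PF}}) = \mathcal D(K \otimes 1_{\Gamma(\mathfrak h)} + 1_{\mathcal K} \otimes d\Gamma(|k|))$.

The unitary operator $T : L^2(\mathbb R^d) \to L^2(\mathbb R_+) \otimes L^2(S^{d-1}) =: \tilde{\mathfrak h}$ defined by $(Tu)(\omega, \theta) = \omega^{(d-1)/2}u(\omega\theta)$ allows us to pass to polar coordinates. Lifting $T$ to the full Hilbert space as $1_{\mathcal K} \otimes \Gamma(T)$ gives a unitary map from $\mathcal H_{\mathrm{PF}}$ to $\tilde{\mathcal H}_{\mathrm{PF}} := \mathcal K \otimes \Gamma(\tilde{\mathfrak h})$. The Hamiltonian $H_v^{\mathrm{PF}}$ is unitarily equivalent to
\[
\tilde H_v^{\mathrm{PF}} := K \otimes 1_{\Gamma(\tilde{\mathfrak h})} + 1_{\mathcal K} \otimes d\Gamma(\omega) + \phi(\tilde v), \tag{5.2}
\]
where $\tilde v = [1_{\mathcal K} \otimes T]v \in \mathcal B(\mathcal K; \mathcal K \otimes \tilde{\mathfrak h})$.

In polar coordinates the space of couplings consists of operators of the form $[1_{\mathcal K} \otimes T]v : \mathcal K \to \mathcal K \otimes \tilde{\mathfrak h}$, where $v \in \mathcal O_\tau$. We write $\tilde{\mathcal O}_\tau = [1_{\mathcal K} \otimes T]\mathcal O_\tau$ and equip it with the obvious norm $\|\tilde v\|^{\sim}_\tau = \|[1_{\mathcal K} \otimes T]^*\tilde v\|_\tau$. Observe that $\|\tilde v\|^{\sim}_\tau = \|v\|_\tau$ when $\tilde v = [1_{\mathcal K} \otimes T]v$.

Let $d$ be as in (1.7) and (1.8). We recall that $d$ expresses the least amount of infrared regularization carried by a $v$ satisfying (I2) below. The following further assumptions on the interaction are made:

(I2) The following holds:
\[
[1_{\mathcal K} \otimes (1 + \omega^{-1/2})\omega^{-1}d(\omega) \otimes 1_{L^2(S^{d-1})}]\tilde v \in \tilde{\mathcal O}_\tau, \qquad [1_{\mathcal K} \otimes (1 + \omega^{-1/2})d(\omega)\partial_\omega \otimes 1_{L^2(S^{d-1})}]\tilde v \in \tilde{\mathcal O}_\tau .
\]
(I3) $[1_{\mathcal K} \otimes \partial_\omega^2 \otimes 1_{L^2(S^{d-1})}]\tilde v \in \mathcal B(\mathcal D(K^\tau); \mathcal K \otimes \tilde{\mathfrak h})$.

In this paper we need an additional assumption compared to [22]. For bounded $K$, it is implied by (I1). Its presence is motivated by a desire to deal effectively with infrared singularities.

(I4) The form $[K \otimes 1_{\tilde{\mathfrak h}}]\tilde v - \tilde vK$ extends from $[\mathcal D(K) \otimes \tilde{\mathfrak h}] \times \mathcal D(K)$ to an element of $\tilde{\mathcal O}_{1/2}$.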
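Here $\tilde{\mathcal O}_{1/2}$ is defined as $\tilde{\mathcal O}_\tau$. Supposing (I1), the statement above is meaningful. See also Remark 5.14 below.

Remark 5.1. We remark that for separable Hilbert spaces $\mathcal K_1$ and $\mathcal K_2$ there are two natural subspaces of $\mathcal B(\mathcal K_1; \mathcal K_2 \otimes \mathfrak h)$. Namely
\[
L^2(\mathbb R^d; \mathcal B(\mathcal K_1; \mathcal K_2)) = \Bigl\{v : \mathbb R^d \to \mathcal B(\mathcal K_1; \mathcal K_2) \Bigm| \int_{\mathbb R^d}\|v(k)\|^2_{\mathcal B(\mathcal K_1;\mathcal K_2)}\,dk < \infty\Bigr\},
\]
\[
L^2_w(\mathbb R^d; \mathcal B(\mathcal K_1; \mathcal K_2)) = \Bigl\{v : \mathbb R^d \to \mathcal B(\mathcal K_1; \mathcal K_2) \Bigm| \sup_{\|\psi\|_1 \le 1}\int_{\mathbb R^d}\|v(k)\psi\|_2^2\,dk < \infty\Bigr\}.
\]
The functions $v$ should be weakly measurable, to ensure that $\|v(k)\|_{\mathcal B(\mathcal K_1,\mathcal K_2)}$ and $\|v(k)\psi\|_2$ are measurable. Here $\|\cdot\|_j$ denotes the norm on $\mathcal K_j$. We have the obvious inclusions
\[
L^2(\mathbb R^d; \mathcal B(\mathcal K_1; \mathcal K_2)) \subseteq L^2_w(\mathbb R^d; \mathcal B(\mathcal K_1; \mathcal K_2)) \subseteq \mathcal B(\mathcal K_1; \mathcal K_2 \otimes \mathfrak h).
\]
The first inclusion is a contraction and the second an isometry. Both inclusions are strict, as exemplified by choosing $\mathcal K_1 = \mathcal K_2 = L^2(\mathbb R^3_x)$, $\mathfrak h = L^2(\mathbb R^3_k)$ and $v_1(k) = e^{-|x-k|}$ (read as a multiplication operator) for the first inclusion and $v_2(k, x) = |x-k|^{-1}e^{-|x-k|}$ for the second. Here $v_2$ induces the bounded operator $v_2 : L^2(\mathbb R^3_x) \to L^2(\mathbb R^3_k \times \mathbb R^3_x)$ by the prescription $(v_2\psi)(k, x) = v_2(k, x)\psi(x)$. (In [10, Sec. 2.16] and [22, Sec. 3.4] the second inclusion is claimed to be an equality.)

We denote by $I_{\mathrm{PF}}(d)$ the vector space of interactions $v$ satisfying (I1)–(I4) and turn it into a normed vector space by equipping it (in polar coordinates) with the norm
\[
\begin{aligned}
\|v\|_{\mathrm{PF}} :={}& \|[1_{\mathcal K} \otimes (1 + \omega^{-3/2}d(\omega)) \otimes 1_{L^2(S^{d-1})}]\tilde v\|^{\sim}_\tau + \|[1_{\mathcal K} \otimes (1 + \omega^{-1})d(\omega)\partial_\omega \otimes 1_{L^2(S^{d-1})}]\tilde v\|^{\sim}_\tau \\
&+ \|[(K+1)^{-1/2} \otimes \partial_\omega^2 \otimes 1_{L^2(S^{d-1})}]\tilde v\|_{\mathcal B(\mathcal K;\mathcal K\otimes\tilde{\mathfrak h})} + \|[K \otimes 1_{\tilde{\mathfrak h}}]\tilde v - \tilde vK\|^{\sim}_{1/2}.
\end{aligned} \tag{5.3}
\]
For any $v_0 \in I_{\mathrm{PF}}(d)$ and $r > 0$ write
\[
B_r(v_0) = \{v \in I_{\mathrm{PF}}(d) \mid \|v - v_0\|_{\mathrm{PF}} \le r\} \tag{5.4}
\]
for the closed ball in $I_{\mathrm{PF}}(d)$ with radius $r$ around $v_0$.

Let us recall the definition of the conjugate operator on $\tilde{\mathcal H}_{\mathrm{PF}}$ used in [22]. Let $\chi \in C_0^\infty([0,\infty))$ be such that $\chi(\omega) = 0$ if $\omega \ge 1$ and $\chi(\omega) = 1$ if $\omega \le 1/2$. For $0 < \delta \le 1/2$, the function $m_\delta \in C^\infty([0,\infty))$ is defined by
\[
m_\delta(\omega) = \chi\Bigl(\frac{\omega}{\delta}\Bigr)d(\delta) + (1 - \chi)\Bigl(\frac{\omega}{\delta}\Bigr)d(\omega).
\]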
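On $\tilde{\mathfrak h}$, the operator $\tilde a_\delta$ is defined in the same way as in [22], that is
\[
\tilde a_\delta := im_\delta(\omega)\frac{\partial}{\partial\omega} + \frac{i}{2}\frac{dm_\delta}{d\omega}(\omega), \qquad \mathcal D(\tilde a_\delta) = H_0^1(\mathbb R_+) \otimes L^2(S^{d-1}). \tag{5.5}
\]
Its adjoint is given by
\[
\tilde a_\delta^* := im_\delta(\omega)\frac{\partial}{\partial\omega} + \frac{i}{2}\frac{dm_\delta}{d\omega}(\omega), \qquad \mathcal D(\tilde a_\delta^*) = H^1(\mathbb R_+) \otimes L^2(S^{d-1}). \tag{5.6}
\]
We recall that $H_0^1(\mathbb R_+)$ is the closure of $C_0^\infty((0,\infty))$ in $H^1(\mathbb R_+)$. The conjugate operator $\tilde A_\delta$ on $\tilde{\mathcal H}_{\mathrm{PF}}$ is defined by $\tilde A_\delta := 1_{\mathcal K} \otimes d\Gamma(\tilde a_\delta)$. Going back to $\mathcal H_{\mathrm{PF}}$ we get
\[
a_\delta = T^{-1}\tilde a_\delta T \quad\text{and}\quad A_\delta = d\Gamma(a_\delta) = [1_{\mathcal K} \otimes \Gamma(T^{-1})]\,\tilde A_\delta\,[1_{\mathcal K} \otimes \Gamma(T)].
\]
The operator $a_\delta$ takes the form (1.11) when written in the original coordinates. We write $N$ for the number operator $1_{\mathcal K} \otimes d\Gamma(1_{\mathfrak h})$ on $\mathcal H_{\mathrm{PF}}$. For $E \in \sigma_{\mathrm{pp}}(H_v^{\mathrm{PF}})$, we write $P_v$ for the corresponding eigenprojection. Recall from [22, Theorem 2.4] that the range of $P_v$ is finite dimensional under the assumptions (H0), (I1) and (I2).

Theorem 5.2. Suppose (H0). Let $v_0 \in I_{\mathrm{PF}}(d)$ and $J \subseteq \mathbb R$ be a compact interval. There exists $0 < \delta_0 \le 1/2$ such that for all $0 < \delta \le \delta_0$ the following holds: There exist $\gamma > 0$ and $C > 0$ such that for any $v \in B_\gamma(v_0)$ and $E \in \sigma_{\mathrm{pp}}(H_v^{\mathrm{PF}}) \cap J$ we have
\[
P_v : \mathcal H_{\mathrm{PF}} \to \mathcal D(N^{1/2}A_\delta) \cap \mathcal D(A_\delta N^{1/2}) \cap \mathcal D(N)
\]
and
\[
\|N^{1/2}A_\delta P_v\| + \|A_\delta N^{1/2}P_v\| + \|NP_v\| \le C.
\]

Unfortunately we cannot employ our theory directly to conclude the above theorem, due to $A_\delta$ not being self-adjoint. Instead we use a trick of passing to an "expanded" model, for which we can use our abstract theory. The theorem above will then be a consequence of a corresponding theorem in the expanded picture.

Remark 5.3. Under the hypotheses of Theorem 5.2, we also have that $P_v : \mathcal H_{\mathrm{PF}} \to \mathcal D(A_\delta^*N^{1/2}) \cap \mathcal D(N^{1/2}A_\delta^*)$. This follows from $A_\delta \subseteq A_\delta^*$. In particular this implies that $P_vA_\delta$ extends from $\mathcal D(A_\delta)$ to a bounded operator on $\mathcal H_{\mathrm{PF}}$. Similar statements hold also for $P_vA_\delta N^{1/2}$ and $P_vN^{1/2}A_\delta$.

5.2. Application to the Nelson model

In this subsection we check the conditions (H0) and (I1)–(I4) for the Nelson model introduced in the introduction. After possibly adding a constant to $W$, we can assume that $K \ge 0$. See (1.4) and (W0).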
We begin by remarking that it follows from (W0) and (V0) that
\[
|x|^{\alpha}(K+1)^{-1/2} \in \mathcal B(\mathcal K), \tag{5.7}
\]
\[
|p|(K+1)^{-1/2} \in \mathcal B(\mathcal K). \tag{5.8}
\]
Here $\alpha > 2$ is coming from (W0), $|x| = |x_1| + \cdots + |x_P|$ and $|p| = |p_1| + \cdots + |p_P|$, where $p_\ell = -i\nabla_{x_\ell}$. These bounds imply in particular (H0).

Let $\Psi_N : I_N(d) \to \mathcal B(\mathcal K; \mathcal K \otimes \mathfrak h)$ be defined by
\[
\Psi_N(\rho) = \sum_{\ell=1}^{P} e^{-ik\cdot x_\ell}\rho .
\]
Clearly $\Psi_N$ is a linear map and $\phi(\Psi_N(\rho)) = I_\rho(x)$ such that
\[
H_\rho^N = K \otimes 1_{\mathcal F} + 1_{\mathcal K} \otimes d\Gamma(|k|) + \phi(\Psi_N(\rho))
\]
is a Pauli–Fierz Hamiltonian, cf. (1.5) and (1.6). Verifying the conditions (I1)–(I4) will be achieved if we can show that $\Psi_N$ is a bounded operator from $I_N(d)$ to $I_{\mathrm{PF}}(d)$. This also implies that results valid uniformly for $v$ in a ball in $I_{\mathrm{PF}}(d)$ will translate into results holding uniformly for $\rho$ in a sufficiently small ball in $I_N(d)$. See Remark 2.11(4).

That the terms in the norm $\|\Psi_N(\rho)\|_{\mathrm{PF}}$, cf. (5.3), pertaining to the conditions (I1)–(I3) can be bounded by $\|\cdot\|_N$ (or rather terms in $\|\cdot\|_N$ pertaining to ($\rho$1)–($\rho$3)), follows as in [22] after we have checked that $|x|^2(K+1)^{-\tau}$ is bounded for some positive $\tau < 1/2$. To produce such a $\tau$ we invoke Hadamard's three-line theorem. Consider the function $z \mapsto |x|^{-i\alpha z}(K+1)^{iz/2} \in \mathcal B(\mathcal K)$. Observe that this function is bounded when $\operatorname{Im} z = 0$ or $\operatorname{Im} z = 1$, cf. (5.7). It now follows, cf. [33–36], that $|x|^{s\alpha}(K+1)^{-s/2}$ is bounded for $0 \le s \le 1$. Choosing $s = 2/\alpha$ implies the desired bound with $\tau = \alpha^{-1} < 1/2$. This will be the $\tau$ used in the conditions (I1)–(I3).

It remains to verify (I4). For this we compute
\[
\begin{aligned}
[K \otimes 1_{\mathfrak h}]e^{-ik\cdot x_j}\rho - e^{-ik\cdot x_j}\rho K
&= -\sum_{\ell=1}^{P}\Bigl(\frac{1}{2m_\ell}\Delta_\ell e^{-ik\cdot x_j}\rho - \rho e^{-ik\cdot x_j}\frac{1}{2m_\ell}\Delta_\ell\Bigr) \\
&= -\Bigl(\frac{1}{2m_j}\Delta_j e^{-ik\cdot x_j}\rho - \rho e^{-ik\cdot x_j}\frac{1}{2m_j}\Delta_j\Bigr) \\
&= \frac{e^{-ik\cdot x_j}}{2m_j}[-2k\cdot p_j + k^2]\rho \\
&= [-2k\cdot p_j - k^2]\rho\,\frac{e^{-ik\cdot x_j}}{2m_j}.
\end{aligned} \tag{5.9}
\]
From this computation and (5.8) we conclude that $[K \otimes 1]\Psi_N(\rho) - \Psi_N(\rho)K \in \mathcal O_{1/2}$ as required by (I4), and the $\|\cdot\|_{1/2}$-norm of the difference is bounded by a constant times $\|\rho\|_N$. Here we need the term in $\|\cdot\|_N$ coming from ($\rho$4).
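The last two equalities in (5.9) rest on the fact that $e^{-ik\cdot x_j}$ shifts the momentum $p_j = -i\nabla_{x_j}$ by $k$. The following is a short sketch of this step for a single particle, suppressing the index $j$ and the factor $\rho$. Here $\rho$ is treated as a function of $k$ only, so that it commutes with $x$ and $p$; this is an assumption of the sketch.

```latex
% Conjugation: e^{ik.x} p e^{-ik.x} = p - k, hence
%   p e^{-ik.x} = e^{-ik.x} (p - k)   and   e^{-ik.x} p = (p + k) e^{-ik.x}.
\begin{align*}
p^2 e^{-ik\cdot x} - e^{-ik\cdot x} p^2
  &= e^{-ik\cdot x}\bigl((p-k)^2 - p^2\bigr)
   = e^{-ik\cdot x}\bigl(-2k\cdot p + k^2\bigr), \\
p^2 e^{-ik\cdot x} - e^{-ik\cdot x} p^2
  &= \bigl(p^2 - (p+k)^2\bigr) e^{-ik\cdot x}
   = \bigl(-2k\cdot p - k^2\bigr) e^{-ik\cdot x}.
\end{align*}
% With -\Delta = p^2 and the prefactor 1/(2m), these are the two forms of
% the right-hand side of (5.9). Only one power of p survives, which is why
% the bound (5.8) on |p|(K+1)^{-1/2} suffices.
```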
We can thus conclude Theorem 1.2 from Theorem 5.2.

It remains to discuss the Nelson model after a Pauli–Fierz transformation. We recall that we have two transformations to consider, one giving rise to $H_\rho^{N'}$ and one to $H_\rho^{N''}$. See (1.12) and (1.16). To identify these Hamiltonians as Pauli–Fierz Hamiltonians, we introduce a linear map $\Psi'_N : I'_N(d) \to \mathcal B(\mathcal K; \mathcal K \otimes \mathfrak h)$ by
\[
\Psi'_N(\rho) = \sum_{\ell=1}^{P}(e^{-ik\cdot x_\ell} - 1)\rho .
\]
With this notation we find for $\rho \in I'_N(d)$
\[
H_\rho^{N'} = K_\rho \otimes 1_{\Gamma(\mathfrak h)} + 1_{\mathcal K} \otimes d\Gamma(|k|) + \phi(\Psi'_N(\rho))
\]
and, specializing to $\rho = \rho_0 + \rho_1$ with $\rho_0 \in I'_N(d)$ and $\rho_1 \in I_N(d)$,
\[
H_\rho^{N''} = \Bigl(K_{\rho_0} - \sum_{\ell=1}^{P}v_{\rho_0,\rho_1}(x_\ell)\Bigr) \otimes 1_{\Gamma(\mathfrak h)} + 1_{\mathcal K} \otimes d\Gamma(|k|) + \phi(\Psi'_N(\rho_0) + \Psi_N(\rho_1)).
\]
See (1.13) for $K_\rho$ and (1.17) for $v_{\rho_0,\rho_1}$.

In order to apply Theorem 5.2 one should first observe that $\Psi'_N$ is a bounded map from $I'_N(d)$ to $I_{\mathrm{PF}}(d)$. We leave it to the reader to establish this following the arguments in [22], using the key estimate (1.15). As for (I4), observe that the extra $-\rho$ from $(e^{-ik\cdot x_j} - 1)\rho$ drops out when repeating (5.9) for $\Psi'_N(\rho)$. In particular we do not need (1.15) for (I4).

Observe that for both the transformed Hamiltonians, the Hamiltonian for the confined quantum system $K$ is altered by the transformation, to obtain, e.g., $K_\rho$ in the case of $H^{N'}$. A priori the norm $\|\cdot\|_{\mathrm{PF}}$ is however defined in terms of the operator $K$, and this definition we retain. However, when verifying the Mourre estimate in Sec. 5.4 and our abstract assumptions for Pauli–Fierz Hamiltonians in Sec. 5.5, we will naturally meet norms with the modified $\rho$-dependent $K$'s, and not the original $K$. We proceed to argue that the $\|\cdot\|_{\mathrm{PF}}$ norms arising in this way are equivalent, locally uniformly in $\rho$, with respect to the appropriate normed space. Let for $\rho \in I'_N(d)$
\[
B'_\rho = K_\rho - K = -\sum_{\ell=1}^{P}v_\rho(x_\ell) + \frac{P^2}{2}\int_0^\infty r^{-1}|\tilde\rho(r)|^2\,dr\,1_{\mathcal K}
\]
and for $\rho = \rho_0 + \rho_1$ as above
\[
B''_\rho = -\sum_{\ell=1}^{P}v_{\rho_0,\rho_1}(x_\ell) - \sum_{\ell=1}^{P}v_{\rho_0}(x_\ell) + \frac{P^2}{2}\int_0^\infty r^{-1}|\tilde\rho_0(r)|^2\,dr\,1_{\mathcal K}.
\]
We observe the bounds
\[
\|B'_\rho\| \le C\|\rho\|^2_{N'} \quad\text{and}\quad \|B''_\rho\| \le C\bigl(\|\rho_0\|^2_{N'} + \|\rho_1\|^2_{N}\bigr),
\]
for some ρ-independent constant C. In particular both Bρ and Bρ can be bounded locally uniformly in ρ, with respect to the appropriate norm. By yet another interpolation argument this implies that we can pass between ‖·‖PF norms defined with either K, Kρ , or Kρ0 + Bρ , and still retain bounds that are locally uniform in ρ. Finally we note that the above bounds also imply that by possibly adding to W a positive constant we still have Kρ ≥ 0 and Kρ0 + Bρ ≥ 0 locally in ρ. This ensures that (H0) is satisfied also for transformed Nelson Hamiltonians. In particular we still have e.g. |x|2 (Kρ + 1)−τ bounded. In conclusion, Theorem 1.3 also follows from Theorem 5.2.
5.3. Expanded objects
Let us now define the expanded operator $\widetilde H^e_v$ on $\widetilde{\mathcal H}^e := \widetilde{\mathcal H}^{\mathrm{PF}} \otimes \Gamma(\tilde h)$ by
\[ \widetilde H^e_v := \widetilde H^{\mathrm{PF}}_v \otimes 1_{\Gamma(\tilde h)} - 1_{\widetilde{\mathcal H}^{\mathrm{PF}}} \otimes d\Gamma(\hat h), \tag{5.10} \]
where $\hat h$ is the operator of multiplication by
\[ \hat h(\omega) = e^{\omega} - 1 - \frac{\omega^2}{2}. \tag{5.11} \]
From the bound $\omega \le \tfrac12 + \omega^2/2$ we find that for $\omega \ge 0$
\[ \frac{d}{d\omega}\hat h(\omega) \ge \hat h(\omega) + \frac12. \tag{5.12} \]
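The bound (5.12) can also be checked by hand: since $\hat h'(\omega) = e^{\omega} - \omega$, one finds $\hat h'(\omega) - \hat h(\omega) = 1 - \omega + \omega^2/2 = \tfrac12 + \tfrac12(1-\omega)^2$. The following Python sketch confirms this numerically; the function names and the grid on $[0, 10]$ are illustrative assumptions and not part of the original argument.

```python
import math


def h_hat(w):
    """h-hat from (5.11): e^w - 1 - w^2/2."""
    return math.exp(w) - 1.0 - 0.5 * w * w


def h_hat_prime(w):
    """Derivative of h-hat: e^w - w."""
    return math.exp(w) - w


def gap(w):
    """Left-hand side minus right-hand side of (5.12)."""
    return h_hat_prime(w) - h_hat(w) - 0.5


# Illustrative grid on [0, 10] (an assumption; any grid of w >= 0 would do).
GRID = [k / 1000.0 for k in range(10001)]

# Smallest value of the gap on the grid; nonnegative up to rounding.
MIN_GAP = min(gap(w) for w in GRID)

# Deviation from the closed form (1 - w)^2 / 2 derived above.
MAX_DEV = max(abs(gap(w) - 0.5 * (1.0 - w) ** 2) for w in GRID)
```

The gap vanishes only at $\omega = 1$, so the constant $1/2$ in (5.12) cannot be improved for this choice of $\hat h$.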
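Since $L^2(\mathbb R_+) \oplus L^2(\mathbb R_+) \simeq L^2(\mathbb R)$, it is known (see e.g. [11]) that there exists a unitary operator
\[ U : \Gamma(\tilde h) \otimes \Gamma(\tilde h) \to \Gamma(h^e), \tag{5.13} \]
where $h^e := L^2(\mathbb R) \otimes L^2(S^{d-1})$. On $\mathcal K \otimes \Gamma(\tilde h) \otimes \Gamma(\tilde h)$, the unitary operator $1_{\mathcal K} \otimes U$ is still denoted by $U$. It maps into $\mathcal H^e = \mathcal K \otimes \Gamma(h^e)$. In this representation, the operator $\widetilde H^e_v$ is unitarily equivalent to the "expanded Pauli–Fierz Hamiltonian" $H^e_v$ defined as an operator on $\mathcal H^e$ by
\[ H^e_v := U \widetilde H^e_v U^{-1} = K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h) + \phi(v^e), \tag{5.14} \]
where $v^e \in B(\mathcal K, \mathcal K \otimes h^e)$, and $v^e$ and $h$ are defined by
\[ h(\omega) := \begin{cases} \omega & \text{if } \omega \ge 0, \\ -\hat h(-\omega) & \text{if } \omega \le 0, \end{cases} \qquad v^e(\omega) := \begin{cases} \tilde v(\omega) & \text{if } \omega \ge 0, \\ 0 & \text{if } \omega \le 0. \end{cases} \tag{5.15} \]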
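As a sanity check on the extension (5.15), differentiating $-\hat h(-\omega)$ with $\hat h$ from (5.11) gives $h'(\omega) = e^{-\omega} + \omega$ and $h''(\omega) = 1 - e^{-\omega}$ for $\omega \le 0$, which match the values $h(0) = 0$, $h'(0) = 1$, $h''(0) = 0$ of the linear branch. The Python sketch below tests these one-sided values and the positivity of $h + h'$, which is used later in the verification of Condition 2.2. The helper names and step sizes are illustrative assumptions.

```python
import math


def h_hat(w):
    """h-hat from (5.11)."""
    return math.exp(w) - 1.0 - 0.5 * w * w


def h(w):
    """Expanded dispersion from (5.15)."""
    return w if w >= 0 else -h_hat(-w)


def h_prime(w):
    """First derivative of h, computed by hand from (5.11) and (5.15)."""
    return 1.0 if w >= 0 else math.exp(-w) + w


def h_second(w):
    """Second derivative of h, computed by hand."""
    return 0.0 if w >= 0 else 1.0 - math.exp(-w)


def central_difference(f, w, step=1e-5):
    """Symmetric finite-difference approximation of f'(w)."""
    return (f(w + step) - f(w - step)) / (2.0 * step)
```

The third derivative jumps from $1$ to $0$ at $\omega = 0$, so this sketch is consistent with $h$ being $C^2$ but not $C^3$.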
Note that $h \in C^2(\mathbb R)$. The idea of expanding the Hilbert space in the above fashion has been used previously in [11, 12, 23, 28]. Our choice of expansion for the boson dispersion relation to the unphysical negative ω appears to be new. Previous implementations of the expansion all used the obvious linear expansion h(ω) = ω. We remark that if $C_K \subseteq \mathcal K$ is a core for $K$ and $C \subseteq \Gamma(\tilde h)$ is a core for $d\Gamma(\omega)$, then the algebraic tensor product $C_K \otimes C$ is a core for $\widetilde H_0^{\mathrm{PF}}$, hence for $\widetilde H_v^{\mathrm{PF}}$, and finally $C_K \otimes C \otimes C$ is a core for $\widetilde H^e_v$ for any $v \in I_{\mathrm{PF}}(d)$. The domain $D(H^e_v)$ itself may however be $v$ dependent. (The argument for the contrary in [11, Sec. 5.2] seems wrong.) We have however set up our analysis such that knowledge of the domain of $H^e_v$ is not needed. See also Lemma 5.15 where an intersection domain is computed.
Remark 5.4. We remark that if one is going for higher order results, i.e. $\psi \in D(A^{k_0})$ for $k_0 \ge 2$, one should use a different $\hat h$. The choice
\[ \hat h_{k_0}(\omega) = e^{\omega} - 1 - \sum_{\ell=2}^{k_0+1} \frac{\omega^\ell}{\ell!} \]
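will work since the corresponding $h_{k_0}$ is in $C^{k_0+1}(\mathbb R)$ and the bound ˆ k (ω) 1 d ˆ h 0 + hk0 (ω) ≥ dω (k0 − 1)! 2 holds for ω ≥ 0 and k0 ≥ 1. For k0 = 1 this reduces to (5.12). Before introducing the conjugate operator on $\mathcal H^e$ that we shall use, let $m^e_\delta \in C^\infty(\mathbb R)$ be defined by
\[ m^e_\delta(\omega) := \begin{cases} m_\delta(\omega) & \text{if } \omega \ge 0, \\ d(\delta) & \text{if } \omega \le 0. \end{cases} \]
We set
\[ a^e_\delta := i m^e_\delta(\omega)\frac{\partial}{\partial\omega} + \frac{i}{2}\frac{dm^e_\delta}{d\omega}(\omega), \qquad D(a^e_\delta) = H^1(\mathbb R) \otimes L^2(S^{d-1}), \tag{5.16} \]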
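and $A^e_\delta := 1_{\mathcal K} \otimes d\Gamma(a^e_\delta)$ as an operator on $\mathcal H^e$. Note that both $a^e_\delta$ and $A^e_\delta$ are self-adjoint. We can now formulate the expanded version of our regularity theorem. Let
\[ N^e = 1_{\mathcal K} \otimes d\Gamma(1_{h^e}) = U\big(N \otimes 1_{\Gamma(\tilde h)} + 1_{\widetilde{\mathcal H}^{\mathrm{PF}}} \otimes d\Gamma(1_{\tilde h})\big)U^{-1} \]
denote the expanded number operator. For $E \in \sigma_{pp}(H^e_v)$ we write $P^e_v$ for the associated eigenprojection.
Theorem 5.5. Suppose (H0). Let $v_0 \in I_{\mathrm{PF}}(d)$ and $J \subseteq \mathbb R$ be a compact interval. There exists a $0 < \delta_0 \le 1/2$ such that for any $0 < \delta \le \delta_0$ the following holds: There exist $\gamma > 0$ and $C > 0$ such that for any $v \in B_\gamma(v_0)$ and $E \in \sigma_{pp}(H^e_v) \cap J$ we have
\[ P^e_v : \mathcal H^e \to D\big((N^e)^{1/2} A^e_\delta\big) \cap D\big(A^e_\delta (N^e)^{1/2}\big) \cap D(N^e) \]
and
\[ \big\|(N^e)^{1/2} A^e_\delta P^e_v\big\| + \big\|A^e_\delta (N^e)^{1/2} P^e_v\big\| + \big\|N^e P^e_v\big\| \le C. \]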
In the next two subsections we verify that our abstract theory applies to the expanded model, but before doing so we pause to check that Theorem 5.2 does indeed follow from Theorem 5.5. For that we need a lemma. Let $W_{\delta,t}$, $t \ge 0$, denote the contraction semigroup on $\widetilde{\mathcal H}^{\mathrm{PF}}$ generated by $\tilde A_\delta$.
Lemma 5.6. For any state $\varphi \in \widetilde{\mathcal H}^{\mathrm{PF}}$ we have for $t \ge 0$
\[ e^{-itA^e_\delta} U(\varphi \otimes \Omega) = U(W_{\delta,t}\varphi \otimes \Omega). \]
In particular, $\varphi \in D(\tilde A_\delta^k)$ if and only if $U(\varphi \otimes \Omega) \in D((A^e_\delta)^k)$.
Proof. It suffices to check the identity on a dense set of $\varphi$'s. Let $\varphi \in \mathcal K \otimes \Gamma_{\mathrm{fin}}(H_0^1(\mathbb R_+) \otimes L^2(S^{d-1})) \subseteq D(\tilde A_\delta)$. Then $U(\varphi \otimes \Omega) \in \mathcal K \otimes \Gamma_{\mathrm{fin}}(H^1(\mathbb R) \otimes L^2(S^{d-1})) \subseteq D(A^e_\delta)$. The identity now follows by differentiating both sides of the equation and observing that they satisfy the same differential equation, with the same initial condition. Here we made use of the equality $A^e_\delta U(\varphi \otimes \Omega) = U(\tilde A_\delta \varphi \otimes \Omega)$ valid for $\varphi \in \mathcal K \otimes \Gamma_{\mathrm{fin}}(H_0^1(\mathbb R_+) \otimes L^2(S^{d-1}))$.
Proof of Theorem 5.2. We only have to recall that bound states of $H^e_v$ are precisely states of the form $U(\varphi \otimes \Omega)$, where $\varphi$ is a bound state for $\widetilde H^{\mathrm{PF}}_v$, with the same eigenvalue. This implies that eigenprojections for $H^e_v$ are of the form $U[P \otimes |\Omega\rangle\langle\Omega|]U^{-1}$ where $P$ is an eigenprojection for $\widetilde H^{\mathrm{PF}}_v$. Theorem 5.5, together with Lemma 5.6, now implies Theorem 5.2.
5.4. Mourre estimates
We begin by establishing a Mourre estimate for $H^{\mathrm{PF}}_v$ and $A_\delta$ in a form appropriate for use in this paper. At the end of the subsection we derive a Mourre estimate for $H^e_v$ and $A^e_\delta$. Let $M_\delta := 1_{\mathcal K} \otimes d\Gamma(m_\delta)$ and $R_\delta = R_\delta(v) := -\phi(ia_\delta v)$ as operators on $\mathcal H^{\mathrm{PF}}$. Let $H'$ be the closure of $M_\delta + R_\delta$ with domain $D(H^{\mathrm{PF}}_v) \cap D(M_\delta)$. Recall from [22] that $H' = [H^{\mathrm{PF}}_v, iA_\delta]^0$. Let $f \in C_0^\infty(\mathbb R)$ be such that $0 \le f \le 1$, $f(\lambda) = 1$ if $|\lambda| \le 1/2$ and $f(\lambda) = 0$ if $|\lambda| \ge 1$. In addition we choose $f$ to be monotonically decreasing away from 0, i.e. $\lambda f'(\lambda) \le 0$. For $E \in \mathbb R$ and $\kappa > 0$ we set
\[ f_{E,\kappa}(\lambda) := f\Big(\frac{\lambda - E}{\kappa}\Big). \]
The following "Mourre estimate" for $H^{\mathrm{PF}}_v$ is proved in [22]:
Theorem 5.7 [22, Theorem 7.12]. Assume that Hypotheses (H0), (I1) and (I2) hold. Let $E_0 \in \mathbb R$. There exists $\delta_0 \in\, ]0, 1/2]$ such that: For all $E \le E_0$, $0 < \delta \le \delta_0$
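and $\epsilon_0 > 0$, there exist $C > 0$, $\kappa > 0$, and a compact operator $K_0$ on $\mathcal H^{\mathrm{PF}}$ such that the estimate
\[ M_\delta + f_{E,\kappa}(H^{\mathrm{PF}}_v) R_\delta f_{E,\kappa}(H^{\mathrm{PF}}_v) \ge (1 - \epsilon_0) 1_{\mathcal H^{\mathrm{PF}}} - C f^\perp_{E,\kappa}(H^{\mathrm{PF}}_v)^2 - K_0 \tag{5.17} \]
holds as a form on $D(N^{1/2})$.
The following lemma is just a reformulation of [22, Proposition 4.1(i), Lemma 4.7 and Lemma 6.2(iv)]. We leave the proof to the reader.
Lemma 5.8. Let $v_0 \in I_{\mathrm{PF}}(d)$. There exist $c_0, c_1, c_2 > 0$, depending on $v_0$, such that $H^{\mathrm{PF}}_{v_0} + c_0 \ge 0$ and the following holds: for all $w \in I_{\mathrm{PF}}(d)$ and $0 < \delta \le 1/2$
\[ \pm\phi(w) \le c_1\|w\|_{\mathrm{PF}}(H^{\mathrm{PF}}_{v_0} + c_0) \quad\text{and}\quad \pm R_\delta(w) \le c_1\|w\|_{\mathrm{PF}}(H^{\mathrm{PF}}_{v_0} + c_0), \]
\[ \|\phi(w)(H^{\mathrm{PF}}_{v_0} + i)^{-1}\| \le c_2\|w\|_{\mathrm{PF}} \quad\text{and}\quad \|R_\delta(w)(H^{\mathrm{PF}}_{v_0} + i)^{-1}\| \le c_2\|w\|_{\mathrm{PF}}. \]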
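The first step we take is to translate the commutator estimate above into the form used in this paper, see Condition 2.3. In anticipation of the need for local uniformity of constants, we need to ensure already at this step that $B = C_B 1$ can be chosen uniformly in $E \in J$, where $J$ is a compact interval.
Corollary 5.9. Let $J \subseteq \mathbb R$ be a compact interval and $v_0 \in I_{\mathrm{PF}}(d)$. There exist $\delta_0 \in\, ]0, 1/2]$ and $C_B > 0$ such that for any $E \in J$, $\epsilon_0 > 0$ and $0 < \delta < \delta_0$ the following holds. There exist $\kappa > 0$, $C_4 > 0$ and a compact operator $K_0$ such that the form inequality
\[ M_\delta + R_\delta(v_0) \ge (1 - \epsilon_0) 1_{\mathcal H^{\mathrm{PF}}} - C_4 f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2 - C_B(H^{\mathrm{PF}}_{v_0} - E) - K_0 \tag{5.18} \]
holds on $D(N^{1/2}) \cap D(H^{\mathrm{PF}}_{v_0})$.
Proof. Let $E_0$ be an upper bound for the interval $J$ and take $\delta_0$ to be the one coming from Theorem 5.7, applied with $v = v_0$. Fix $E \in J$, $0 < \delta < \delta_0$ and $\epsilon_0 > 0$. Apply Theorem 5.7 with $\epsilon_0/2$ in place of $\epsilon_0$. Compute as a form on $D(H^{\mathrm{PF}}_{v_0})$
\begin{align*}
R_\delta(v_0) ={}& f_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) R_\delta(v_0) f_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) + f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) R_\delta(v_0) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) \\
& + 2\,\mathrm{Re}\{f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) R_\delta(v_0) f_{E,\kappa}(H^{\mathrm{PF}}_{v_0})\}.
\end{align*}
Using Lemma 5.8 with $w = v_0$ and abbreviating $C_B = c_1\|v_0\|_{\mathrm{PF}}$, we estimate
\begin{align*}
f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) R_\delta(v_0) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) &\ge -c_1\|v_0\|_{\mathrm{PF}}(H^{\mathrm{PF}}_{v_0} + c_0) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2 \\
&= -C_B(H^{\mathrm{PF}}_{v_0} - E) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2 - C_B(c_0 + E) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2 \\
&\ge -C_B(H^{\mathrm{PF}}_{v_0} - E) - 3C_B\kappa - C_B(c_0 + E) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2.
\end{align*}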
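Using Lemma 5.8 again, we get
\begin{align*}
2\,\mathrm{Re}\{f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) R_\delta(v_0) f_{E,\kappa}(H^{\mathrm{PF}}_{v_0})\} &\ge -\frac{\epsilon_0}{4} - \frac{4}{\epsilon_0}\|R_\delta(v_0) f_{E,\kappa}(H^{\mathrm{PF}}_{v_0})\|^2 f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2 \\
&\ge -\frac{\epsilon_0}{4} - \frac{4c_2^2\|v_0\|_{\mathrm{PF}}^2(|E| + \kappa + 1)^2}{\epsilon_0} f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2.
\end{align*}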
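Combining the equations above with Theorem 5.7 yields (5.18) with $C_B$ only depending on $v_0$.
The above corollary suffices to prove Theorem 5.5 without local uniformity in $v$ and $E$. The following lemma is designed to deal with uniformity of estimates in a small ball of interactions $v$ around a fixed (unperturbed) interaction $v_0$. Technically it replaces [22, Lemma 6.2(iv)].
Lemma 5.10. Let $v_0 \in I_{\mathrm{PF}}(d)$. There exist $\gamma_0 > 0$, $C_B > 0$ and $c_0, c_1, c_2 > 0$, only depending on $v_0$, such that
(1) $\forall\, v \in B_{\gamma_0}(v_0)$: $H^{\mathrm{PF}}_v \ge -c_0$.
(2) $\forall\, v \in B_{\gamma_0}(v_0)$: $\pm\phi(v) \le c_1(H^{\mathrm{PF}}_v + c_0)$ and $\|\phi(v)(H^{\mathrm{PF}}_v - i)^{-1}\| \le c_2$.
(3) $\forall\, v \in B_{\gamma_0}(v_0)$ and $0 < \delta \le 1/2$: $\pm R_\delta(v) \le C_B(H^{\mathrm{PF}}_v + c_0)$ and $\|R_\delta(v)(H^{\mathrm{PF}}_v - i)^{-1}\| \le c_2$.
Proof. Let $v_0 \in I_{\mathrm{PF}}(d)$ be given. Let $C_1(r, v) = \|[1_{\mathcal K} \otimes \omega^{-1/2}]\tilde v(K + r)^{-1/2}\|$, for $v \in I_{\mathrm{PF}}(d)$ and $r > 0$.
We begin with (1). Fix $r = r(v_0) \ge 1$ such that $\sqrt2\, C_1(r, v_0) \le 1/3$. This is possible due to (I1). Using [22, Proposition 4.1(i)] we get
\begin{align*}
H^{\mathrm{PF}}_v &= H^{\mathrm{PF}}_0 + \phi(v) = H^{\mathrm{PF}}_0 + \phi(v_0) + \phi(v - v_0) \\
&\ge H^{\mathrm{PF}}_0 - \tfrac13(H^{\mathrm{PF}}_0 + r) - \sqrt2\, C_1(1, v - v_0)(H^{\mathrm{PF}}_0 + 1) \\
&= \big(1 - \tfrac13 - \sqrt2\, C_1(1, v - v_0)\big) H^{\mathrm{PF}}_0 - \tfrac{r}{3} - \sqrt2\, C_1(1, v - v_0).
\end{align*}
Using that $\omega^{-1/2} \le 2/3 + \omega^{-3/2}/3 \le \tfrac23(1 + \omega^{-3/2} d(\omega))$ we get $C_1(r, v) \le 2\|v\|_{\mathrm{PF}}/3$ for any $v \in I_{\mathrm{PF}}(d)$ and $r \ge 1$. This implies
\[ H^{\mathrm{PF}}_v \ge \Big(\frac23 - \frac{2\sqrt2}{3}\|v - v_0\|_{\mathrm{PF}}\Big) H^{\mathrm{PF}}_0 - \frac{r}{3} - \frac{2\sqrt2}{3}\|v - v_0\|_{\mathrm{PF}}. \]
Observe that the choice $\gamma_0 = 1/(2\sqrt2)$ ensures that we arrive at the bound
\[ H^{\mathrm{PF}}_v \ge -\frac{r + 1}{3}. \]
Choose $c_0 = 1 + (r + 1)/3$ such that $H^{\mathrm{PF}}_v + c_0 \ge 1$. This proves (1).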
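The arithmetic behind the choice $\gamma_0 = 1/(2\sqrt2)$ in part (1) can be traced with a short Python sketch. It checks the scalar inequality $x \le 2/3 + x^3/3$ (with $x$ standing for $\omega^{-1/2}$) and evaluates the coefficient of $H^{\mathrm{PF}}_0$ and the constant term at $\|v - v_0\|_{\mathrm{PF}} = \gamma_0$. The function names and the sample value of $r$ are hypothetical, and the sketch assumes $H^{\mathrm{PF}}_0 \ge 0$, as the argument implicitly does.

```python
import math

SQRT2 = math.sqrt(2.0)
GAMMA0 = 1.0 / (2.0 * SQRT2)  # radius chosen in the proof of (1)


def young_gap(x):
    """2/3 + x^3/3 - x, which is nonnegative for x >= 0."""
    return 2.0 / 3.0 + x ** 3 / 3.0 - x


def h0_coefficient(dist):
    """Coefficient of H_0^PF in the lower bound; dist stands for ||v - v0||_PF."""
    return 2.0 / 3.0 - (2.0 * SQRT2 / 3.0) * dist


def constant_term(r, dist):
    """Constant term -r/3 - (2*sqrt(2)/3)*dist in the lower bound."""
    return -r / 3.0 - (2.0 * SQRT2 / 3.0) * dist


# Smallest value of the scalar inequality on an illustrative grid in [0, 10].
MIN_YOUNG_GAP = min(young_gap(k / 100.0) for k in range(1001))
```

At the boundary of the ball the coefficient of $H^{\mathrm{PF}}_0$ equals $1/3$ and the constant term equals $-(r+1)/3$, so dropping the nonnegative $H^{\mathrm{PF}}_0$ term gives the stated bound.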
As for (2) we observe first that $\phi(v) = H^{\mathrm{PF}}_v - H^{\mathrm{PF}}_0 \le H^{\mathrm{PF}}_v$. Next let $r = r(v_0)$ and $\gamma_0 = 1/(2\sqrt2)$ be as in the proof of (1) and estimate
\[ -\phi(v) = -\phi(v_0) + \phi(v_0 - v) \le \frac13(H^{\mathrm{PF}}_0 + r) + \frac13(H^{\mathrm{PF}}_0 + 1) = \frac23 H^{\mathrm{PF}}_0 + \frac{r + 1}{3}. \]
Writing $H^{\mathrm{PF}}_0 = H^{\mathrm{PF}}_v - \phi(v)$ we arrive at $-\phi(v) \le 2H^{\mathrm{PF}}_v + r + 1$. Combining with the choice of $c_0$ in the proof of (1) now yields the first estimate in (2), for a sufficiently large $c_1$. As for the second part of (2) one can employ [22, Proposition 4.1(ii)] in place of [22, Proposition 4.1(i)] and argue as above. This gives a bound of the desired type for $\gamma_0$ small enough. The choice $\gamma_0 = 1/8$ works. Here one should observe that the constants $C_j(r, v)$, $j = 0, 1, 2$, in [22] are all related to the norm $\|\cdot\|_{\mathrm{PF}}$ by $C_j(1, v) \le 2\|v\|_{\mathrm{PF}}/3$ as argued above for $C_1$. The statement in (3) now follows by appealing to [22, Proposition 4.1(i)] again
\begin{align*}
\pm R_\delta(v) &\le \sqrt2\, C_1(1, [1_{\mathcal K} \otimes a_\delta]v)(H^{\mathrm{PF}}_0 + 1) \\
&\le \sqrt2\, C_1(1, [1_{\mathcal K} \otimes a_\delta]v)\big((c_1 + 1)H^{\mathrm{PF}}_v + c_1c_0 + 1\big).
\end{align*}
From (5.5) and (5.3), we conclude the existence of a $C_B$ for which the first estimate in (3) is satisfied. Similarly for the second part of (3), where, as in the discussion of the second part of (2), one can make use of [22, Proposition 4.1(ii)].
We can now state and prove a commutator estimate that is uniform with respect to $v$ from a small ball around $v_0$, and $E$ in a compact interval. Given $v_0$, let $\gamma_0$ denote the radius coming from Lemma 5.10.
Corollary 5.11. Let $J \subseteq \mathbb R$ be a compact interval, $v_0 \in I_{\mathrm{PF}}(d)$, and $\epsilon_0 > 0$. There exists a $\delta_0 \in\, ]0, 1/2]$ such that for any $0 < \delta < \delta_0$ the following holds. There exist $0 < \gamma < \gamma_0$, $\kappa > 0$, $C_4 > 0$ and a compact operator $K_0$, with $\gamma$ only depending on $\delta$, $\epsilon_0$, $J$ and $v_0$, such that the form inequality
\[ M_\delta + R_\delta(v) \ge (1 - \epsilon_0) 1_{\mathcal H^{\mathrm{PF}}} - C_4 f^\perp_{E,\kappa}(H^{\mathrm{PF}}_v)^2 \langle H^{\mathrm{PF}}_v\rangle - K_0 \tag{5.19} \]
holds on $D(N^{1/2}) \cap D(H^{\mathrm{PF}}_v)$, for all $E \in J$ and $v \in B_\gamma(v_0)$.
Remark. We note that the constant $C_4$ in Corollary 5.9 can, on inspection of the proof of [22, Theorem 7.12], be chosen uniformly in $0 < \delta \le \delta_0$. Making use of this would allow us to choose $\gamma$ independent of $\delta \le \delta_0$ here, which would slightly simplify the exposition. We however choose not to test the reader's patience on this issue. See Step II in the proof below.
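Proof. Given $J$, $v_0$ and $\epsilon_0$, let $\gamma_0$ be given by Lemma 5.10 and let $C_B > 0$ and $\delta_0 > 0$ be the constants coming from Corollary 5.9. For $E \in J$ we apply Corollary 5.9, with $\epsilon_0$ replaced by $\epsilon_0/3$, and get the form estimate
\begin{align*}
M_\delta + R_\delta(v_0) \ge{}& (1 - \epsilon_0/3) 1_{\mathcal H^{\mathrm{PF}}} - C_4(v_0, E) f^\perp_{E,\kappa(v_0,E)}(H^{\mathrm{PF}}_{v_0})^2 \\
& - C_B(H^{\mathrm{PF}}_{v_0} - E) - K_0(v_0, E). \tag{5.20}
\end{align*}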
The constants $C_4$, $\kappa$ and the operator $K_0$ also depend on $\delta$, but this dependence does not concern us. We can assume that $K_0 \ge 0$. The key observation is that the constants $C_4$ and $\kappa$, and the operator $K_0$ above can be chosen independently of $E \in J$ and $v \in B_\gamma(v_0)$, for some sufficiently small $\gamma$ which does not depend on $\delta \le \delta_0$. We divide the proof of the corollary into three steps; the first two establish the observation mentioned in the previous paragraph.
Step I. We begin by arguing that $C_4$, $\kappa$ and $K_0$ can be chosen independently of $E \in J$. By a covering argument it suffices to show that they can be chosen independently of $E'$ in a small neighborhood of $E \in J$. For the compact error, we remark that one should replace $K_0$ by a finite sum $K_0(v_0) = K_0(v_0, E_1) + \cdots + K_0(v_0, E_m)$ of non-negative compact operators, which is again compact. Let $E \in J$ be fixed. Pick $\zeta_1 = \epsilon_0/(6C_B)$ such that for $|E - E'| < \zeta_1$ we have
\[ C_B E \ge C_B E' - \epsilon_0/6. \tag{5.21} \]
As for the term involving $f^\perp_{E,\kappa}$ we observe that for any self-adjoint operator $S$ we have
\begin{align*}
f^\perp_{E,\kappa}(S) - f^\perp_{E',\kappa}(S) &= f_{E',\kappa}(S) - f_{E,\kappa}(S) \\
&= \frac1\pi \int_{\mathbb C} (\bar\partial\tilde f)(z) \Big[\Big(\frac{S - E'}{\kappa} - z\Big)^{-1} - \Big(\frac{S - E}{\kappa} - z\Big)^{-1}\Big]\, du\, dv.
\end{align*}
Here $z = u + iv$. Estimating this we find that
\[ \|f^\perp_{E,\kappa}(S) - f^\perp_{E',\kappa}(S)\| \le C\,\frac{|E - E'|}{\kappa}. \]
Writing $a^2 - b^2 = (a - b)(a + b)$ we observe a similar bound for $f^\perp_{E,\kappa}(S)^2 - f^\perp_{E',\kappa}(S)^2$. Again we conclude that for $\zeta_2 = \kappa(v_0, E)\epsilon_0/(6CC_4(v_0, E))$ we find that for $|E - E'| < \zeta_2$:
\[ -C_4(v_0, E) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2 \ge -C_4(v_0, E) f^\perp_{E',\kappa}(H^{\mathrm{PF}}_{v_0})^2 - \epsilon_0/6. \tag{5.22} \]
The estimates (5.21) and (5.22) plus the aforementioned covering argument imply the form estimate
\begin{align*}
M_\delta + R_\delta(v_0) \ge{}& (1 - 2\epsilon_0/3) 1_{\mathcal H^{\mathrm{PF}}} - C_4(v_0) f^\perp_{E,\kappa(v_0)}(H^{\mathrm{PF}}_{v_0})^2 \\
& - C_B(H^{\mathrm{PF}}_{v_0} - E) - K_0(v_0), \quad\text{for all } E \in J. \tag{5.23}
\end{align*}
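Step II. Secondly we argue that one can use the same constants $C_4$, $\kappa$, and compact operator $K_0$ for $v \in B_\gamma(v_0)$, if $\gamma$ is small enough. Using Lemma 5.10, we estimate
\[ R_\delta(v_0) = R_\delta(v) + R_\delta(v_0 - v) \le R_\delta(v) + C_1\|v - v_0\|_{\mathrm{PF}}(H^{\mathrm{PF}}_v + C_2). \]
Writing
\[ C_1\|v - v_0\|_{\mathrm{PF}}(H^{\mathrm{PF}}_v + C_2) = C_1\|v - v_0\|_{\mathrm{PF}}(H^{\mathrm{PF}}_v - E) + C_1\|v - v_0\|_{\mathrm{PF}}(C_2 + E), \]
we see that choosing $\gamma_1 = \gamma_1(\epsilon_0, J, v_0)$ small enough we arrive at the following bound
\[ R_\delta(v_0) \le R_\delta(v) + C(H^{\mathrm{PF}}_v - E) + \frac{\epsilon_0}{9} 1_{\mathcal H^{\mathrm{PF}}}, \tag{5.24} \]
which holds for all $v \in B_{\gamma_1}(v_0)$ and $E \in J$.
For the $f^\perp_{E,\kappa}$ contribution, we compute
\begin{align*}
f_{E,\kappa}(H^{\mathrm{PF}}_{v_0}) - f_{E,\kappa}(H^{\mathrm{PF}}_v) &= \frac1\pi \int_{\mathbb C} (\bar\partial\tilde f)(z) \Big[\Big(\frac{H^{\mathrm{PF}}_{v_0} - E}{\kappa} - z\Big)^{-1} - \Big(\frac{H^{\mathrm{PF}}_v - E}{\kappa} - z\Big)^{-1}\Big]\, du\, dv \\
&= \frac{1}{\kappa\pi} \int_{\mathbb C} (\bar\partial\tilde f)(z) \Big(\frac{H^{\mathrm{PF}}_v - E}{\kappa} - z\Big)^{-1} \phi(v - v_0) \Big(\frac{H^{\mathrm{PF}}_{v_0} - E}{\kappa} - z\Big)^{-1} du\, dv.
\end{align*}
From Lemma 5.8 and the representation formula above, we find that
\[ \|f^\perp_{E,\kappa}(H^{\mathrm{PF}}_v)^2 - f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2\| \le C\|v - v_0\|_{\mathrm{PF}}, \]
uniformly in $E \in J$. Arguing as above we thus find a $\gamma_2 = \gamma_2(\epsilon_0, J, v_0, \delta) > 0$ such that
\[ -C_4(v_0) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_{v_0})^2 \ge -C_4(v_0) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_v)^2 - \frac{\epsilon_0}{9} 1_{\mathcal H^{\mathrm{PF}}} \tag{5.25} \]
for all $v \in B_{\gamma_2}(v_0)$. This is where the $\delta$-dependence enters into the choice of $\gamma$ through $C_4$. See the remark to the corollary. Using Lemma 5.10, we also get a $\gamma_3 = \gamma_3(\epsilon_0, v_0) > 0$ such that
\[ -C_B(H^{\mathrm{PF}}_{v_0} - E) \ge -C_B(H^{\mathrm{PF}}_v - E) - \frac{\epsilon_0}{9} 1_{\mathcal H^{\mathrm{PF}}}, \tag{5.26} \]
for all $v \in B_{\gamma_3}(v_0)$. Combining (5.23) with (5.24)–(5.26) we conclude that the estimate (5.20) holds with the same $C_4$, $\kappa$ and $K_0$, for all $E \in J$ and $v \in B_\gamma(v_0)$, with $\gamma = \min\{\gamma_1, \gamma_2, \gamma_3\}$ only depending on $\epsilon_0$, $J$, $v_0$ and $\delta$.
Step III. To conclude the proof we let $\gamma$, $C_4$, $\kappa$ and $K_0$ be fixed by Steps I and II. Pick $\kappa'$ smaller than $\kappa$ such that $\kappa' C_B(1 + \max_{E \in J}|E|) \le \epsilon_0$. The Corollary now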
follows from (5.20) and the estimate
\[ -C_B(H^{\mathrm{PF}}_v - E) \ge -C_B\Big(1 + \max_{E \in J}|E|\Big) f^\perp_{E,\kappa}(H^{\mathrm{PF}}_v)^2 \langle H^{\mathrm{PF}}_v\rangle. \]
Observe that (5.20) holds with $\kappa$ replaced by $\kappa'$ as well.
The corresponding objects in the expanded Hilbert space are defined as follows: We set $M^e_\delta := 1_{\mathcal K} \otimes d\Gamma(m^e_\delta h')$ and $R^e_\delta = R^e_\delta(v) := -\phi(ia^e_\delta v^e)$. Note that
\[ U^{-1} M^e_\delta U = M_\delta \otimes 1_{\Gamma(\tilde h)} + 1_{\mathcal H^{\mathrm{PF}}} \otimes \widetilde M_\delta, \qquad U^{-1} R^e_\delta U = R_\delta \otimes 1_{\Gamma(\tilde h)}, \tag{5.27} \]
where $\widetilde M_\delta := d\Gamma(d(\delta)\hat h')$ as an operator on $\Gamma(\tilde h)$. From (5.12), we get
\[ \widetilde M_\delta \ge d(\delta)\Big(d\Gamma(\hat h) + \frac12 N\Big). \tag{5.28} \]
The Mourre estimate for $H^e_v$ is stated in the following theorem.
Theorem 5.12. Assume that Hypotheses (H0), (I1) and (I2) hold. Let $v_0 \in I_{\mathrm{PF}}(d)$, $J$ a compact interval, and $\epsilon_0 > 0$. There exists $\delta_0 \in\, ]0, 1/2]$ such that for all $0 < \delta \le \delta_0$, there exist $0 < \gamma < \gamma_0$, $C_4 > 0$, $\kappa > 0$, and a compact operator $K_0$ on $\mathcal H^e$ such that
\[ M^e_\delta + R^e_\delta \ge (1 - \epsilon_0) 1_{\mathcal H^e} - C_4 f^\perp_{E,\kappa}(H^e_v)^2 \langle H^e_v\rangle - K_0 \tag{5.29} \]
for all $E \in J$ and $v \in B_\gamma(v_0)$, as a form on $D((M^e_\delta)^{1/2}) \cap D(H^e_v)$.
Remark. As in Corollary 5.11, the constant $\gamma$ can be chosen to only depend on $\epsilon_0$, $J$, $v_0$ and $\delta$, and as in the associated remark one can in fact choose it uniformly in $0 < \delta \le \delta_0$.
Proof. We fix $v_0$, $J$ and $\epsilon_0$ as in the statement of the theorem. We begin by taking $\delta_0'$ to be the $\delta_0$ coming from Corollary 5.11. Secondly we fix $C_B$ and $c_0$ to be the two constants from Lemma 5.10(3). We can now choose $0 < \delta_0 \le \delta_0'$ such that
\[ d(\delta_0) \ge \max\Big\{C_B + 2,\ \max_{E \in J} 2C_B(E + c_0)\Big\}. \tag{5.30} \]
Here we used that $\lim_{t \to 0+} d(t) = +\infty$. Fix now a $0 < \delta \le \delta_0$ and denote by $\gamma$ the radius coming from Corollary 5.11. The above choices anticipate the proof below, but we make them here to make it evident that we pick the constants in the right order.
We begin the verification of the commutator estimate for $v \in B_\gamma(v_0)$ by computing as a form on $D((M^e_\delta)^{1/2}) \cap D(H^e)$
\[ U^{-1}[M^e_\delta + R^e_\delta]U = [M_\delta + R_\delta] \otimes P_\Omega + [M_\delta \otimes 1 + 1 \otimes \widetilde M_\delta + R_\delta \otimes 1]\, 1 \otimes \bar P_\Omega. \tag{5.31} \]
We apply Corollary 5.11 to the first term on the right-hand side of (5.31), with the given $\delta$ (apart from $v_0$, $J$ and $\epsilon_0$). This yields a $C_4'$, a $\kappa' > 0$, and a compact operator $K_0'$ (apart from $\gamma$) such that the following bound holds
\[ [M_\delta + R_\delta] \otimes P_\Omega \ge \big[(1 - \epsilon_0)1 - C_4' f^\perp_{E,\kappa'}(H^{\mathrm{PF}}_v)^2 \langle H^{\mathrm{PF}}_v\rangle - K_0'\big] \otimes P_\Omega. \tag{5.32} \]
Observe that the bound above also holds with $\kappa'$ replaced by any $0 < \kappa \le \kappa'$. To bound from below the second term on the right-hand side of (5.31), we use Lemma 5.10. Together with (5.28) and (5.30), this implies
\begin{align*}
[1 \otimes \widetilde M_\delta + R_\delta \otimes 1]\, 1 \otimes \bar P_\Omega &\ge \Big\{1 \otimes d(\delta)\Big(d\Gamma(\hat h) + \frac12\Big) - C_B(H^{\mathrm{PF}}_v \otimes 1 + c_0)\Big\}\, 1 \otimes \bar P_\Omega \\
&\ge \Big\{(d(\delta) - C_B)\, 1 \otimes d\Gamma(\hat h) - C_B(\widetilde H^e_v - E) + \frac{d(\delta)}{2} - C_B(E + c_0)\Big\}\, 1 \otimes \bar P_\Omega \\
&\ge \big[2 - C_B(\widetilde H^e_v - E)\big]\, 1 \otimes \bar P_\Omega. \tag{5.33}
\end{align*}
Here we also made use of (5.10) and that $\hat h \ge 0$. We now pick a $0 < \kappa \le \kappa'$ such that $3\kappa C_B \le 1$. Inserting $1 = f_{E,\kappa}^2 + 2f_{E,\kappa}f^\perp_{E,\kappa} + (f^\perp_{E,\kappa})^2$ into (5.33) yields the bound
\[ [1 \otimes \widetilde M_\delta + R_\delta \otimes 1]\, 1 \otimes \bar P_\Omega \ge \big[1 - C_B(1 + E') f^\perp_{E,\kappa}(\widetilde H^e_v)^2 \langle \widetilde H^e_v\rangle\big]\, 1 \otimes \bar P_\Omega, \]
where $E' = \max_{E \in J}|E|$. This estimate together with (5.31) and (5.32) leads to the statement of the theorem with $C_4 = \min\{C_4', C_B(1 + E')\}$ and $K_0 = U[K_0' \otimes P_\Omega]U^{-1}$.
5.5. Checking the abstract assumptions
The purpose of this subsection is to complete the proof of Theorem 5.5. We do this by running through the abstract assumptions in Sec. 2 pertaining to Theorems 2.5 and 2.10, from which Theorem 5.5 then follows. In accordance with Remark 2.11(4), we ensure that all constants can be chosen locally uniformly in energy $E$ and form factor $v$. This ensures local uniformity in Theorem 5.5.
We fix $v_0 \in I_{\mathrm{PF}}(d)$ and $E_0 \in \sigma(H^{\mathrm{PF}}_{v_0})$. Observe that there exists $e_0$ such that $e_0 < \inf\sigma(H^{\mathrm{PF}}_v)$ for all $v \in B_{\gamma_0}(v_0)$, where $\gamma_0$ comes from Lemma 5.10. Put $J = [e_0, E_0]$. Let $\gamma$ and $\delta_0$ be fixed by Theorem 5.12 and choose a $\delta < \delta_0$, which from now on is fixed.
We begin by postulating the objects for which the abstract assumptions in Conditions 2.1 should hold. We take $\mathcal H = \mathcal H^e$, $H = H^e_v$, $A = A^e_\delta$,
\[ N = K^\rho \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h') + 1_{\mathcal H^e}, \qquad \max\Big\{2\tau, \frac12\Big\} < \rho < 1, \]
\[ H' = [M^e_\delta + R^e_\delta]|_{D(N)}. \tag{5.34} \]
The constant $\tau$ appearing above is the one from (I1). Observe that $R^e_\delta$ and $M^e_\delta$ are $N$-bounded. See Lemma 5.13 just below. We make use of the following dense subspace of $\mathcal H$:
\[ S = D(K) \otimes \Gamma_{\mathrm{fin}}(C_0^\infty(\mathbb R) \otimes L^2(S^{d-1})) \subseteq \mathcal H^e. \]
The tensor product is algebraic. Observe that $S$ is a core for $H$, $N$, and $A$. We recall that we can construct the group $e^{itA}$ explicitly. Let $\psi_t$ denote the (global) flow for the 1-dimensional ODE $\dot\psi_t(\omega) = m^e_\delta(\psi_t(\omega))$. Then, for continuous compactly supported $f$,
\[ (e^{ita^e_\delta} f)(\omega) = e^{\frac12\int_0^t (m^e_\delta)'(\psi_s(\omega))\,ds} f(\psi_t(\omega)). \]
This in particular implies that
\[ e^{itA^e_\delta} = \Gamma(e^{ita^e_\delta}) : S \to S. \tag{5.35} \]
We begin with the following lemma which implies that $R^e_\delta$ is $N$-bounded.
Lemma 5.13. Let $v \in O_\tau$ and $\kappa = 1/4 - \tau/(2\rho)$. Then $D(N^{1-2\kappa}) \subseteq D(\phi(v))$, and for $f \in D(N)$ we have $\|\phi(v^e)f\| \le C\|v\|_\tau \|N^{1-2\kappa}f\|$, where $C$ does not depend on $v$ nor on $f$.
Proof. Adopting notation from [22] we put $C_0(v) = \|v(K + 1)^{-\tau}\|^2$ and $C_2(v) = \|[(K + 1)^{-\tau} \otimes 1_h]v\|^2$. We estimate for $f \in S$, repeating the argument for [22, (3.14) and (3.16)], and get
\[ \|a^*(v^e)f\|^2 \le C_0(v)\|(K + 1)^\tau \otimes 1_{\Gamma(h^e)} f\|^2 + C_0(v)\langle f, (K + 1)^{2\tau} \otimes N^e f\rangle \]
and
\[ \|a(v^e)f\|^2 \le C_2(v)\langle f, (K + 1)^{2\tau} \otimes N^e f\rangle. \]
Observing the bound, with $2\kappa = 1/2 - \tau/\rho$ and some $C > 0$,
\[ (K + 1)^{2\tau} \otimes N^e \le \frac{\tau}{\rho(1 - 2\kappa)}(K + 1)^{2\rho(1-2\kappa)} \otimes 1_{\Gamma(h^e)} + \frac{1}{2(1 - 2\kappa)}(N^e)^{2(1-2\kappa)} \le C N^{2(1-2\kappa)}, \]
yields
\[ \|\Phi(v^e)f\| \le C\|v\|_\tau \|N^{1-2\kappa}f\| \tag{5.36} \]
a priori as a bound for elements of $S$. The lemma now follows since $S$ is a core for $N$.
Condition 2.1(1). We make use of the fact (given the invariance of $S$ mentioned in (5.35)) that our Condition 2.1(1) is equivalent to Mourre's conditions, $e^{itA}D(N) \subseteq D(N)$ (i.e. $D(N)$ is invariant) and that $i[N, A]$ extends from a form on $S$ to an element of $B(N^{-1}\mathcal H; \mathcal H)$. See [30, Proposition II.1]. From the computation $i[h', a^e_\delta] = m^e_\delta h''$ it follows that the following identity holds in the sense of forms on $S$
\[ N' = i[N, A^e_\delta] = 1_{\mathcal K} \otimes d\Gamma(m^e_\delta h''). \tag{5.37} \]
Since $m^e_\delta$ is bounded and $\sup_{\omega \in \mathbb R}|h''(\omega)|/h'(\omega) < \infty$, we find that $N'$ extends from $S$ to a bounded operator on $D(N)$, and the extension is in fact an element of $B(N^{-1}\mathcal H; \mathcal H)$ as required.
It remains to check that $D(N)$ is invariant under $e^{itA^e_\delta}$. For this we compute strongly on $S$
\[ N e^{itA^e_\delta} = e^{itA^e_\delta}\big(K^\rho \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h' \circ \psi_{-t})\big). \]
Since $t \to \psi_t(\omega)$ is increasing and $\omega \to h'(\omega)$ is decreasing (and positive) we find for $t \le 0$
\[ 0 \le h' \circ \psi_{-t} \le h'. \]
For positive $t$ we estimate $\omega - Ct \le \psi_{-t}(\omega) \le \omega$, for some $C > 0$, where we used that $m^e_\delta$ is a bounded function. This gives for $t > 0$
\[ 0 \le h' \circ \psi_{-t}(\omega) = \max\{1, e^{-\psi_{-t}(\omega)} + \psi_{-t}(\omega)\} \le \max\{1, e^{-\omega + Ct} + \omega - Ct\}. \]
Using that $e^{-\omega + \alpha} + \omega \le C_\alpha(e^{-\omega} + \omega)$, we get for any $t$ a $C' = C'(t)$ such that $(h' \circ \psi_{-t})^2 \le C'(h')^2$ and hence by [22, Proposition 3.4] we arrive at
\[ d\Gamma(h' \circ \psi_{-t})^2 \le C' d\Gamma(h')^2. \]
Since $S$ was a core for $N$ we now conclude that $e^{itA^e_\delta}D(N) \subseteq D(N)$. This completes the verification of Condition 2.1(1).
Condition 2.1(2). We begin by observing that $N$ and $H^e_0$ commute. In particular we can compute as a form on $S$
\[ i[N^{-1}, H^e_v] = iN^{-1}\phi(v^e) - i\phi(v^e)N^{-1}. \]
This computation in conjunction with Lemma 5.13 implies that $i[N^{-1}, H^e_v]$ extends from a form on $D(H^e_v)$ to a bounded operator and hence $N$ is of class $C^1(H)$.
Since the commutator form $i[N, H]$ extends from $D(N) \cap D(H)$ to a bounded form on $D(N)$ it suffices to compute it on a core for $N$. Here we take again $S$ and compute
\[ i[N, H] = [K^\rho \otimes 1_{\Gamma(h^e)}, \phi(v^e)] + \phi(ih'v^e) = \phi([K^\rho \otimes 1_{h^e}]v^e - v^e K^\rho) + \phi(iv^e). \tag{5.38} \]
That the second term extends by continuity to a bounded form on $D(N^{1/2 - \kappa})$ follows from Lemma 5.13 (applied with $iv^e$ instead of $v^e$) and interpolation.
In order to deal with the first term in (5.38) we write
\[ \phi([K^\rho \otimes 1_{h^e}]v^e - v^e K^\rho) = U\big(\phi([K^\rho \otimes 1_{\tilde h}]\tilde v - \tilde v K^\rho) \otimes 1_{\Gamma(\tilde h)}\big)U^{-1}. \]
Here we need the new assumption (I4). We will immediately verify that the above expression extends to a bounded form on $D(N^{1/2 - \kappa'})$ for some $\kappa' > 0$. This implies the required property for $i[H, N]^0$.
We employ the representation formula (3.5) with $K$ instead of $N$. Compute as a form on $D(K \otimes 1_{\tilde h}) \times D(K)$
\begin{align*}
(K^\rho \otimes 1_{\tilde h})\tilde v - \tilde v K^\rho &= -c_\rho \int_0^\infty t^\rho\big[((K + t)^{-1} \otimes 1_{\tilde h})\tilde v - \tilde v(K + t)^{-1}\big]\, dt \\
&= B - c_\rho \int_1^\infty t^\rho((K + t)^{-1} \otimes 1_{\tilde h})\big[\tilde v K - (K \otimes 1_{\tilde h})\tilde v\big](K + t)^{-1}\, dt,
\end{align*}
where $B$ is the contribution from the integral between 0 and 1, which due to (I1) is a bounded operator. By (I4), we have
\[ c_1 := \|(\tilde v K - (K \otimes 1_{\tilde h})\tilde v)(K + 1)^{-1/2}\| < \infty, \qquad c_2 := \|(K + 1)^{-1/2} \otimes 1_{\tilde h}\,(\tilde v K - (K \otimes 1_{\tilde h})\tilde v)\| < \infty. \]
Let $\tau' < 1/2$ be chosen such that $\rho/2 > \tau' > \rho - 1/2$. This is possible due to the choice of $\rho$. We estimate for $\psi \in D(K \otimes 1_{\tilde h})$ and $\varphi \in D(K)$
\[ |\langle\psi, ((K^\rho \otimes 1_{\tilde h})\tilde v - \tilde v K^\rho)\varphi\rangle| \le \|B\|\|\psi\|\|\varphi\| + \frac{c_1 c_\rho}{\frac12 + \tau' - \rho}\|\psi\|\|(K + 1)^{\tau'}\varphi\|. \]
Similarly we get
\[ |\langle\psi, ((K^\rho \otimes 1_{\tilde h})\tilde v - \tilde v K^\rho)\varphi\rangle| \le \|B\|\|\psi\|\|\varphi\| + \frac{c_2 c_\rho}{\frac12 + \tau' - \rho}\|(K \otimes 1_{\tilde h} + 1)^{\tau'}\psi\|\|\varphi\|. \]
We have thus established that the first term in (5.38) is the (expanded) field operator associated to an operator in $O_{\tau'}$. We can thus employ Lemma 5.13 again, this time with $v^e$ replaced by $[K^\rho \otimes 1_{h^e}]v^e - v^e K^\rho$ and $\kappa$ replaced by $0 < \kappa' = 1/4 - \tau'/(2\rho) < 1/4$. Together with an interpolation argument this
ensures that $\phi((K^\rho \otimes 1_{h^e})v^e - v^e K^\rho)$ extends by continuity to a bounded form on $D(N^{1/2 - \kappa'})$. We have thus verified Condition 2.1(2) with the smallest of the two kappa's. In addition we observe that the $B(N^{-1/2+\kappa}\mathcal H; N^{1/2-\kappa}\mathcal H)$-norm of $i[N, H]^0$ is bounded by a constant times $\|v\|_{\mathrm{PF}}$, cf. Remark 2.11(4).
Remark 5.14. We observe from the discussion above that we could relax (I4) and require instead that $[K^\rho \otimes 1_{\tilde h}]\tilde v - \tilde v K^\rho$ extends to an element of $B(D(K^\eta); \mathcal K \otimes \tilde h) \cap B(\mathcal K; D(K^\eta)^* \otimes \tilde h)$, for some $1/2 \le \eta < 1 - \tau$, where $\tau$ is coming from (I1). This would still leave room to choose $\rho$ and $\tau'$ (in the argument above) such that $1 > \rho > 2\tau$ and $\rho/2 > \tau' > \rho + \eta - 1$.
While we do not know the domain of $H$, it turns out that we can indeed compute the intersection domain $D(H) \cap D(N)$. This is done in the following lemma.
Lemma 5.15. We have the identity
\[ D(H) \cap D(N) = D(K \otimes 1_{\Gamma(h^e)}) \cap D(1_{\mathcal K} \otimes d\Gamma(\max\{h', \omega\})) \tag{5.39} \]
and $S$ is dense in $D(H) \cap D(N)$ with respect to the intersection topology.
Proof. Let for the purpose of this proof $H_0 = H^e_0$, the unperturbed expanded Hamiltonian, and denote by $D$ the right-hand side of (5.39). Since $N$ controls the unphysical part of $d\Gamma(h)$, due to the choice of extension of $\omega$ by an exponential, we observe that the identity (5.39) holds if $H$ is replaced by $H_0$. Since $H_0$ and $N$ commute we find that $T_0 = N + iH_0$ is a closed operator on $D$ and it clearly generates a contraction semigroup.
We now construct the formal operator sum $N + iH$ in two different ways. By Lemma 5.13, $D(\phi(v^e)) \subset D(N^{1-2\kappa})$ and hence for $u \in D$
\[ \|\phi(v^e)u\| \le c\|N^{1-2\kappa}u\| + c\|u\| \le \frac14\|Nu\| + c'\|u\| \le \frac14\|T_0u\| + c'\|u\|. \]
From this estimate, we deduce that $T_1 = T_0 + i\phi(v) =: N + iG$ is a closed operator on $D$ and it generates a contraction semigroup. See [33–36, Lemma preceding Theorem X.50]. Here $G$ is implicitly defined as the operator sum $G = H_0 + \phi(v^e)$ with domain $D$.
On the other hand, since we have just established Condition 2.1(2), we conclude from [21, Theorem 2.25] that $T_{2,\pm} = N \pm iH$ are closed operators on $D(H) \cap D(N)$. In addition we have $T_{2,\pm}^* = T_{2,\mp}$ and since $T_{2,\pm}$ are both accretive we conclude that $T_{2,+}$ generates a contraction semigroup. See [33–36, Corollary to Theorem X.48].
We proceed to argue that $T_2 = T_{2,+}$ is an extension of $T_1$, i.e. $T_1 \subset T_2$. Since $S \subseteq D$, $G$ is a symmetric extension of $H|_S$ and $S$ is a core for $H$ we deduce that $H$ is an extension of $G$. Hence indeed $T_1 \subset T_2$.
We now argue that in fact $T_1 = T_2$, or more precisely that their domains coincide. This will follow if the intersection of the resolvent sets is non-empty. Indeed, let $\zeta \in \rho(T_1) \cap \rho(T_2)$. Then $(T_2 - \zeta)(T_1 - \zeta)^{-1} = (T_1 - \zeta)(T_1 - \zeta)^{-1} = 1$, and hence $(T_2 - \zeta)^{-1} = (T_1 - \zeta)^{-1}$ and the domains must coincide. But by the Hille–Yosida theorem [33–36, Theorem X.47a] we have $(-\infty, 0) \subset \rho(T_1) \cap \rho(T_2)$. Here we used that both $T_1$ and $T_2$ generate contraction semigroups.
It remains to ascertain that $S$ is dense in $D$ with respect to the intersection topology of $D(H) \cap D(N)$. We begin by verifying that $S$ is dense in $D$ with respect to the graph norm of $T_0$, which induces the intersection topology of $D(H_0) \cap D(N) = D$. Let $\psi \in D$. Observe first that $1_{N^e \le n}\psi \to \psi$ as $n \to \infty$ in the graph norm of $T_0$, since $N^e$ and $T_0$ commute. Similarly we find that $1_{\mathcal K} \otimes \Gamma(1_{|\omega| \le \ell})\psi \to \psi$ in the graph norm of $T_0$. Hence it suffices to approximate $\psi \in D$ with $\Gamma(1_{|\omega| \le \ell})1_{N^e \le n}\psi = \psi$, for some $\ell$ and $n$, by elements from $S$ in the graph norm of $T_0$. Fix now such a $\psi$, $n$ and $\ell$. Since $S$ is a core for $K \otimes 1_{\Gamma(h^e)}$ we can find a sequence $\{\psi_j\} \subset S$ with $\psi_j \to \psi$ in $D(K \otimes 1_{\Gamma(h^e)})$. Put $\tilde\psi_j = 1_{N^e \le n}[1_{\mathcal K} \otimes \Gamma(f)]\psi_j \in S$, where $f \in C_0^\infty(\mathbb R)$, with $0 \le f \le 1$ and $f = 1$ on $[-\ell, \ell]$. Then $\tilde\psi_j \to \psi$ in $D(K \otimes 1_{\Gamma(h^e)})$ as well. We now observe that $T_0\tilde\psi_j = (iK \otimes 1_{\Gamma(h^e)} + B_{n,\ell})\tilde\psi_j$, for some bounded operator $B_{n,\ell}$. This implies density of $S$ in $D$ in the graph norm of $T_0$. By the closed graph theorem $H(T_0 - \zeta)^{-1}$ and $N(T_0 - \zeta)^{-1}$ are bounded, and hence $S$ is also dense in $D(H) \cap D(N) = D$ with respect to the indicated intersection topology.
Condition 2.1(3). Let $\sigma$ be such that $R(\eta)$ preserves $D(N)$ for $\eta$ with $|\mathrm{Im}\,\eta| \ge \sigma$. It suffices to establish the identity
\[ R(\eta)H - HR(\eta) = -iR(\eta)H'R(\eta), \]
for $\eta$ with $|\mathrm{Im}\,\eta| \ge \sigma$, as a form on $D(H) \cap D(N)$, since this set is dense in $D(H) \cap D(N^{1/2})$ by Remark 3.5. By Lemma 5.15, we can on the set $D(H) \cap D(N)$ express $H$ and $H'$ as sums of operators
\[ H = H^e_0 + \phi(v^e) \quad\text{and}\quad H' = d\Gamma(h') - \phi(ia^e_\delta v^e). \]
We are thus reduced to verifying the following two form identities on $D(H) \cap D(N)$
\[ R(\eta)H^e_0 - H^e_0R(\eta) = -iR(\eta)\,1_{\mathcal K} \otimes d\Gamma(m^e_\delta h')\,R(\eta), \tag{5.40} \]
\[ R(\eta)\phi(v^e) - \phi(v^e)R(\eta) = iR(\eta)\phi(ia^e_\delta v^e)R(\eta). \tag{5.41} \]
Since all operators appearing in (5.40) commute with $N^e$ it suffices to verify this identity on each fixed expanded particle sector with $N^e = n$. Introduce for a positive integer $\ell$ the semibounded dispersion $h_\ell(\omega) = \max\{-\ell, h(\omega)\}$ and a cutoff
expanded free Hamiltonian $H_{0,\ell} = K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h_\ell)$. Then on a particle sector $H_{0,\ell}$ is of class $C^1_{\mathrm{Mo}}(A)$ such that we can compute for $|\mathrm{Im}\,\eta| \ge \sigma_{n,\ell}$
\[ R(\eta)H_{0,\ell} - H_{0,\ell}R(\eta) = -iR(\eta)\,1_{\mathcal K} \otimes d\Gamma(m^e_\delta h_\ell')\,R(\eta), \]
as a form on $1_{[N^e = n]}D$. Here $\sigma_{n,\ell}$ is some positive constant. Since both sides are analytic in $\eta$ for $|\mathrm{Im}\,\eta| \ge \sigma$ we conclude the above identity for all such $\eta$. Appealing to the explicit form of the domain $D$ we find that we can remove the cutoff $\ell \to \infty$ by the dominated convergence theorem. This yields (5.40) for $|\mathrm{Im}\,\eta| \ge \sigma$.
As for (5.41) we recall that we have already established that $N$ is of class $C^1_{\mathrm{Mo}}(A)$. It is a consequence of the proof of [30, Proposition II.1], that $i[\phi(v^e), A]$ read as a form on $D(N) \cap D(A)$ can be represented by an extension from the form computed on $S$. Here we used (5.35). As a form on $S$ we clearly have $i[\phi(v^e), A] = -\phi(ia^e_\delta v^e)$, which extends to an $N$-bounded operator by Lemma 5.13. The computation
\[ R(\eta)\phi(v^e) - \phi(v^e)R(\eta) = R(\eta)[\phi(v^e), A]R(\eta) \]
as forms on $D(N)$ now concludes the verification of (5.41), and hence of Condition 2.1(3).
Condition 2.1(4). We compute first as a form on $S$
\[ i[H', A] = H'' = 1_{\mathcal K} \otimes d\Gamma\Big(m^e_\delta\frac{dm^e_\delta}{d\omega}h' + (m^e_\delta)^2h''\Big) - \phi((a^e_\delta)^2v^e) \]
and observe that the right-hand side extends by continuity to an $N$-bounded operator, cf. Lemma 5.13. Again, by the proof of [30, Proposition II.1], cf. (5.35), we conclude that the operator on the right-hand side of the formula also represents the commutator form $i[H', A]$ on $D(N) \cap D(A)$.
Condition 2.2. By Lemma 5.15 and Remark 3.5, it suffices to check the form bound in the virial condition on $S$. In addition, since $K^\rho \le 1 + K$, it suffices to check the estimate with $\rho = 1$.
Recalling (5.11) and (5.15), we observe that $\hat h \le \hat h'$, and hence $h + h' \ge 0$. Making use of this observation we find that
\begin{align*}
K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h') &\le K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes (d\Gamma(h) + 2d\Gamma(h')) \\
&\le K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h) + 2M^e_\delta.
\end{align*}
We now add and subtract $\Phi(v^e) + 2R^e_\delta$ to obtain
\[ K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h') \le H^e_v + 2H' - \Phi(v^e) - 2R^e_\delta. \]
We now make use of the fact that
\[ C = \big\|(\Phi(v^e) + 2R^e_\delta)(K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h') + 1)^{-1/2}\big\| < \infty \]
to conclude the form estimate
\[ K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h') \le H^e_v + 2H' + \frac12\big(K \otimes 1_{\Gamma(h^e)} + 1_{\mathcal K} \otimes d\Gamma(h') + 1\big) + \frac12C^2. \]
This completes the verification of the virial bound. We again observe that the constants involved can be chosen independent of $E$ in a bounded set and $v \in B_\gamma(v_0)$.

Condition 2.3. This condition has already been essentially verified in the form of Theorem 5.12. We only need to observe that the form bound extends by continuity from $D(H) \cap D(N)$ to $D(H) \cap D(N^{1/2})$, cf. Remark 3.5.

The condition (2.7). Let $\psi^e$ be a bound state for $H = H_v^e$. That is, $\psi^e \in D(H_v^e)$ and $H_v^e \psi^e = E\psi^e$ for some $E \in \mathbb{R}$. Recall that $\psi^e = \mathcal{U}(\psi \otimes \Omega)$, where $\psi \in D(H_v^{\mathrm{PF}})$ and $H_v^{\mathrm{PF}}\psi = E\psi$. From [22, Proposition 6.5] we conclude that $\psi \in D(N^{1/2})$. Hence we conclude that $\psi^e \in D(d\Gamma(h)^{1/2}) \cap \mathcal{U}D(H_v^{\mathrm{PF}} \otimes 1_{\Gamma(\tilde h)})$. In particular we find that $\psi^e \in D(H) \cap D(N^{1/2})$, and the result follows from the virial estimate in Condition 2.2. Observe again that $\|N^{1/2}\psi\|$ can be bounded uniformly in $v \in B_{\gamma_0}(v_0)$ and $E \in [e_0, E_0]$.

Condition 2.8, $k_0 = 1$. This merely amounts to checking the statement in (2.11) with $\ell = 0$. But this is trivially satisfied since $[N, N'] = 0$. See (5.37).

This completes the verification of the conditions needed to conclude Theorem 5.5 from Theorems 2.5 and 2.10.

6. AC-Stark Type Models

6.1. The model and the result

We will work in the framework of generalized N-body systems, which we review briefly. Let $\mathcal{A}$ be a finite index set and $X$ a finite dimensional real vector space with inner product. There is an injective map from $\mathcal{A}$ into the subspaces of $X$, $\mathcal{A} \ni a \to X^a \subseteq X$, and we write $X_a = (X^a)^\perp$. We introduce a partial ordering on $\mathcal{A}$: $a \subset b \Leftrightarrow X^a \subseteq X^b$, and assume the following:

(1) There exist $a_{\min}, a_{\max} \in \mathcal{A}$ with $X^{a_{\min}} = \{0\}$ and $X^{a_{\max}} = X$.
(2) For each $a, b \in \mathcal{A}$ there exists $c = a \cup b \in \mathcal{A}$ with $X_a \cap X_b = X_c$.

We will write $x^a$ and $x_a$ for the orthogonal projections of a vector $x$ onto the subspaces $X^a$ and $X_a$, respectively. We will work with a generalized potential
\[ V = V(t, x) = \sum_{a \in \mathcal{A}\setminus\{a_{\min}\}} V_a(t, x^a), \]
where $V_a$ is a real-valued function on $\mathbb{R} \times X^a$. In the conditions below $\alpha$ denotes multiindices.
Conditions 6.1. Let $k_0 \in \mathbb{N}$ be given. For each $a \neq a_{\min}$ the following holds. The pair-potential $\mathbb{R} \times X^a \ni (t, y) \to V_a(t, y) \in \mathbb{R}$ is a continuous function satisfying

(1) Periodicity: $V_a(t + 1, y) = V_a(t, y)$, $t \in \mathbb{R}$ and $y \in X^a$.
(2) Differentiability in y: For all $\alpha$ with $|\alpha| \le k_0 + 1$ there exist $\partial_y^\alpha V_a \in C(\mathbb{R} \times X^a)$.
(3) Global bounds: For all $\alpha$ and $k \in \mathbb{N} \cup \{0\}$ with $|\alpha| + k \le k_0 + 1$ there are global bounds $|\partial_y^\alpha (y \cdot \nabla_y)^k V_a(t, y)| \le C$.
(4) Decay at infinity: $|V_a(t, y)| + |y \cdot \nabla_y V_a(t, y)| = o(1)$ uniformly in $t$.
(5) Regularity in t: There exists $\partial_t V_a \in C(\mathbb{R} \times X^a)$ and there is a global bound $|\partial_t V_a(t, y)| \le C$.

We consider under Conditions 6.1 the Hamiltonian $h = h(t) = p^2 + V$, $p = -i\nabla$, on the Hilbert space $L^2(X)$. The corresponding propagator $U$ is a two-parameter strongly continuous family of unitary operators which solves the time-dependent Schrödinger equation
\[ i\frac{d}{dt} U(t, s)\phi = h(t)U(t, s)\phi \quad \text{for } \phi \in D(p^2). \]
The family satisfies the Chapman–Kolmogorov equations
\[ U(s, r)U(r, t) = U(s, t), \quad r, s, t \in \mathbb{R}, \]
the initial condition $U(s, s) = 1$ for any $s \in \mathbb{R}$ and the periodicity equation
\[ U(t + 1, s + 1) = U(t, s), \quad s, t \in \mathbb{R}. \]

The operator $U(1, 0)$ is called the monodromy operator. For each $a \neq a_{\max}$ the sub-Hamiltonian monodromy operator is $U^a(1, 0)$; it is defined as the monodromy operator on $\mathcal{H}^a = L^2(X^a)$ constructed for $a \neq a_{\min}$ from $h^a = (p^a)^2 + V^a$, $V^a = \sum_{a_{\min} \neq b \subset a} V_b(t, x^b)$. If $a = a_{\min}$ we define $U^{a_{\min}}(1, 0) = 1$ (implying $\sigma_{pp}(U^{a_{\min}}(1, 0)) = \{1\}$). The set of thresholds is then
\[ \mathcal{F}(U(1, 0)) = \bigcup_{a \neq a_{\max}} \sigma_{pp}(U^a(1, 0)). \tag{6.1} \]
We recall from [31] that the set of thresholds is closed and countable, and non-threshold eigenvalues, i.e. points in σpp (U (1, 0))\F (U (1, 0)), have finite multiplicity and can only accumulate at the set of thresholds. Moreover any corresponding bound state is exponentially decaying, the singular continuous spectrum σsc (U (1, 0)) = ∅ and there are integral propagation estimates for states localized away from the set of eigenvalues and away from F (U (1, 0)). It should be remarked that the weakest condition, Condition 6.1 with k0 = 1, corresponds to [31, Condition 1.1] (more precisely Condition 6.1 with k0 = 1 is slightly weaker than [31, Condition 1.1], and we also remark that [31] goes through with this modification). All of the above properties are proven in [31] either under [31, Condition 1.1] or under weaker conditions allowing local singularities. In particular local singularities up to the Coulomb singularity are covered in [31]. See Sec. 6.3 for a new result for Coulomb systems.
In the following subsection we establish the theorem below, which implies Theorem 1.6(2).

Theorem 6.2. Suppose Conditions 6.1 for some $k_0 \in \mathbb{N}$. Let $\phi$ be a bound state for $U(1, 0)$ pertaining to an eigenvalue $e^{-i\lambda} \notin \mathcal{F}(U(1, 0))$. Then $\phi \in D(|p|^{k_0+1})$.

6.2. Regularity of non-threshold bound states

The principal tool in the proof of Theorem 6.2 will be Floquet theory (in common with [31] and other papers), which we briefly review. The Floquet Hamiltonian associated with $h(t)$ is
\[ H = \tau + h(t) = H_0 + V \quad \text{on } \mathcal{H} = L^2([0, 1]; L^2(X)). \tag{6.2} \]
Here $\tau$ is the self-adjoint realization of $-i\frac{d}{dt}$ with periodic boundary conditions. The spectral properties of the monodromy operator and the Floquet Hamiltonian are equivalent. We have the following relations
\[ \sigma_{pp}(U(1, 0)) = e^{-i\sigma_{pp}(H)}, \quad \sigma_{ac}(U(1, 0)) = e^{-i\sigma_{ac}(H)}, \quad \sigma_{sc}(U(1, 0)) = e^{-i\sigma_{sc}(H)}, \]
and the multiplicity of an eigenvalue $z = e^{-i\lambda}$ of $U(1, 0)$ is equal to the multiplicity of $\lambda$ as an eigenvalue of $H$ (regardless of the choice of $\lambda$). We also recall that the Floquet Hamiltonian is the self-adjoint generator of the strongly continuous unitary one-parameter group on $\mathcal{H}$ given by
\[ (e^{-isH}\psi)(t) = U(t, t - s)\psi(t - s - [t - s]), \tag{6.3} \]
where $[r]$ is the integer part of $r$. In particular any bound state of the monodromy operator, $U(1, 0)\phi = e^{-i\lambda}\phi$, gives rise to a bound state of the Floquet Hamiltonian, $H\psi = \lambda\psi$, by the formula
\[ \psi(t) = e^{it\lambda}U(t, 0)\phi. \tag{6.4} \]
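The relations between the monodromy operator and (6.4) can be explored numerically in a finite-dimensional toy model. The following sketch is illustrative only: the 2×2 Hamiltonian `h`, the step counts and all function names are our assumptions and do not come from the text. It approximates the propagator by a product of short-time unitaries, extracts an eigenpair of the monodromy matrix, and forms $\psi(t)$ as in (6.4).

```python
import numpy as np

def h(t):
    """Hypothetical 1-periodic Hermitian 2x2 Hamiltonian (illustration only)."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    return 0.7 * sz + 0.4 * np.cos(2 * np.pi * t) * sx

def step(hm, dt):
    """exp(-i*dt*hm) for a Hermitian matrix hm, via its spectral decomposition."""
    w, v = np.linalg.eigh(hm)
    return (v * np.exp(-1j * dt * w)) @ v.conj().T

def propagator(t, s, n=2000):
    """Approximate U(t, s) for t >= s by a midpoint product of n short-time steps."""
    dt = (t - s) / n
    u = np.eye(2, dtype=complex)
    for k in range(n):
        u = step(h(s + (k + 0.5) * dt), dt) @ u
    return u

# Monodromy matrix and one eigenpair: U(1,0) phi = exp(-i*lam) phi.
U10 = propagator(1.0, 0.0)
z, vecs = np.linalg.eig(U10)
lam = -np.angle(z[0])
phi = vecs[:, 0]

def psi(t):
    """Candidate Floquet bound state psi(t) = exp(i*t*lam) U(t,0) phi, cf. (6.4)."""
    return np.exp(1j * t * lam) * (propagator(t, 0.0) @ phi)
```

In this toy model $\psi(1)$ agrees with $\psi(0)$ up to rounding, which is the periodicity needed for $\psi$ to lie in the Floquet space, and the discretized propagator over $[1, 1.5]$ coincides with the one over $[0, 0.5]$, mirroring $U(t+1, s+1) = U(t, s)$.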
Proposition 6.3. Suppose Conditions 6.1 for some $k_0 \in \mathbb{N}$ and suppose $H\psi = \lambda\psi$ for $e^{-i\lambda} \notin \mathcal{F}(U(1, 0))$. Then $\psi \in D(|p|^{k_0+1})$.

Proof. We shall use Corollary 4.13 with $H$ being the Floquet Hamiltonian and $N = p^2 + 1$. This amounts to checking the assumptions given in terms of Conditions 2.1–2.3, 2.6, 2.8 and (for $k_0 \ge 2$ only) 4.11 (same $k_0$). We take $A = \frac{1}{2}(x \cdot p + p \cdot x)$ and compute with direct reference to Conditions 2.1, 2.6 and 2.8
\[ H' = 2p^2 - x \cdot \nabla V, \tag{6.5a} \]
\[ i[N, H]^0 = p \cdot \nabla V + \nabla V \cdot p = \sum_{j=1}^{\dim X} \bigl(p_j \partial_j V + (\partial_j V)p_j\bigr), \tag{6.5b} \]
\[ N' = 2p^2, \tag{6.5c} \]
\[ i^\ell \mathrm{ad}_A^\ell(N') = 2^{\ell+1} p^2, \quad i\,\mathrm{ad}_N\bigl(i^\ell \mathrm{ad}_A^\ell(N')\bigr) = 0; \quad \ell \le k_0 - 1, \tag{6.5d} \]
\[ i^\ell \mathrm{ad}_A^\ell(H') = 2^{\ell+1} p^2 + (-1)^{\ell+1}(x \cdot \nabla)^{\ell+1} V; \quad \ell \le k_0. \tag{6.5e} \]
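The first-order formulas among (6.5a)–(6.5c) can be checked by hand. The following is a formal sketch on Schwartz functions, assuming the convention $\mathrm{ad}_A(B) = [B, A]$ (our assumption; the convention is fixed in an earlier section) and the canonical relations $[x_j, p_k] = i\delta_{jk}$.

```latex
% Assumed convention: ad_A(B) = [B, A];  [x_j, p_k] = i\delta_{jk};  A = (x.p + p.x)/2.
\begin{align*}
i[p_j, A] &= \tfrac{i}{2}\sum_k \bigl([p_j, x_k]\,p_k + p_k\,[p_j, x_k]\bigr) = p_j,\\
i[p^2, A] &= \sum_j \bigl(p_j\, i[p_j, A] + i[p_j, A]\, p_j\bigr) = 2p^2,\\
i[V, A]   &= \tfrac{i}{2}\sum_k \bigl(x_k\,[V, p_k] + [V, p_k]\,x_k\bigr) = -x\cdot\nabla V,
  \qquad [V, p_k] = i\,\partial_k V,\\
i[p^2, V] &= \sum_j \bigl(p_j\, i[p_j, V] + i[p_j, V]\, p_j\bigr)
  = \sum_j \bigl(p_j\,\partial_j V + (\partial_j V)\,p_j\bigr),
  \qquad [p_j, V] = -i\,\partial_j V.
\end{align*}
% Since \tau commutes with the time-independent A and with N = p^2 + 1:
%   i[\tau + p^2 + V, A] = 2p^2 - x\cdot\nabla V,   i[p^2 + 1, A] = 2p^2,   i[N, H] = i[p^2, V].
```

Iterating the first and third lines gives the higher-order expressions: each further commutator with $A$ doubles the $p^2$ term and applies $-x \cdot \nabla$ to the potential term.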
A comment on (6.5a) is due. We need to show Condition 2.1(3) using the expression (6.5a): First we remark that the operators $\tau$, $p^2$ and $H_0$ are simultaneously diagonalizable. Therefore $D(H) \cap D(N) = D(\tau) \cap D(N)$ is dense in $D(H) \cap D(N^{1/2})$. (See also Remark 3.5.) Moreover $p^2$, $V$ and $R(\eta)$ are obviously fibered (i.e. they act on the fiber space $L^2(X)$) and $R(\eta)$ preserves $D(p^2)$ and $D(|p|)$ for $|\eta|$ large enough. Whence as a form on $D(\tau) \cap D(N)$
\[ i[H, R(\eta)] = i[p^2 + V, R(\eta)] = -R(\eta)\,i[p^2 + V, A]\,R(\eta) = -R(\eta)H'R(\eta). \]
The last identity for fiber operators is well known in standard Mourre theory for Schrödinger operators. Finally we extend the shown version of (2.2) by continuity to a form identity on $D(H) \cap D(N^{1/2})$ yielding Condition 2.1(3). Clearly (2.4) holds with $C_1 = 0$, $C_2 = 1/2$ and $C_3 = 1 + \sup x \cdot \nabla V(t, x)/2$. As for (2.5) a stronger version follows from [31, Theorem 4.2]
\[ H' \ge c_0 1 - C_4 f_\lambda^\perp(H)^2 - K_0. \tag{6.6} \]
Finally it follows from [31, Proposition 4.1] that indeed the condition of Corollary 4.13, $\psi \in D(N^{1/2}) = D(|p|)$, is fulfilled. This shows the proposition in the case $k_0 = 1$.

For $k_0 \ge 2$ it remains to verify Condition 4.11. For this purpose it is helpful to notice that
\[ i\,\mathrm{ad}_A(p_j) = p_j, \tag{6.7a} \]
\[ i\,\mathrm{ad}_A\bigl((N + t_j)^{-1}\bigr) = -2(N + t_j)^{-1}(N - 1)(N + t_j)^{-1}. \tag{6.7b} \]
Moreover all computations are in terms of fiber operators (in particular $M_1$, $M_2$ and $M_3$ are all fibered operators), and recalling [30, Proposition II.1] and using the fact that $N^{1/2} \in C^1_{\mathrm{Mo}}(A)$ it suffices to do the computations in terms of forms on the Schwartz space $\mathcal{S}(X)$.

Re $M_1$. We shall apply (6.7a) in combination with (6.5b) to verify the part of Condition 4.11 that involves $M_1$. Let us first look at the particular choice in (4.21) for $M_1$ given by taking all the $T$'s equal to $N^{1/2}$. That is, we will demonstrate that for $m = 1, \dots, k_0 - 1$ the operator $i^m \mathrm{ad}^m_{N^{1/2}}(M_1)$ is $|p|$-bounded. We compute
\[ \begin{aligned} i^m \mathrm{ad}^m_{N^{1/2}}(M_1) = {} & -(i c_{1/2})^{m+1} \int_0^\infty dt_{m+1}\, t_{m+1}^{1/2} \cdots \int_0^\infty dt_2\, t_2^{1/2} \int_0^\infty t_1^{1/2} (N + t_1)^{-1} \cdots \\ & \cdots (N + t_{m+1})^{-1}\, \mathrm{ad}^{m+1}_{p^2}(V)\, (N + t_{m+1})^{-1} \cdots (N + t_1)^{-1}\, dt_1, \end{aligned} \tag{6.8} \]
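and in turn,
\[ \mathrm{ad}^{m+1}_{p^2}(V) = \sum_{|\alpha+\beta| = m+1} c_{\alpha,\beta}\, p^\alpha (\partial^{\alpha+\beta} V) p^\beta = T_1 + T_2 + T_3; \]
\[ T_1 = \sum_{|\alpha+\beta| = m+1,\ |\beta| \ge 1} c_{\alpha,\beta}\, p^\alpha (\partial^{\alpha+\beta} V) p^\beta, \]
\[ T_2 = \sum_{|\alpha+\beta| = m+1,\ |\beta| = 1} -i\, c_{\alpha+\beta,0}\, p^\alpha (\partial^{\alpha+2\beta} V), \]
\[ T_3 = \sum_{|\alpha+\beta| = m+1,\ |\beta| = 1} c_{\alpha+\beta,0}\, p^\alpha (\partial^{\alpha+\beta} V) p^\beta. \]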
Now, for any term of the expressions $T_1$, $T_2$ and $T_3$, we move the factor $p^\alpha$ standing in front of the bounded derivative of $V$ to the left in the integral representation and use the bound
\[ \|N^s (N + t)^{-1}\| \le C_s (1 + t)^{s-1}; \quad s \in [0, 1]. \tag{6.9} \]
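The bound (6.9) follows from the spectral theorem. A short sketch, using only $N = p^2 + 1 \ge 1$, indicates that the constant can be taken to be $C_s = 1$ (this explicit value is our observation, not a statement from the text):

```latex
% For x >= 1, t >= 0 and s in [0,1]:  x/(x+t) <= 1  and  (x+t)^{s-1} <= (1+t)^{s-1}.
\[
  \bigl\| N^{s}(N+t)^{-1} \bigr\|
  \;\le\; \sup_{x \ge 1} \frac{x^{s}}{x+t}
  \;=\; \sup_{x \ge 1} \Bigl(\frac{x}{x+t}\Bigr)^{s} (x+t)^{s-1}
  \;\le\; (1+t)^{s-1}.
\]
```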
We obtain
\[ \|p^\alpha (N + t_{m+1})^{-1} \cdots (N + t_1)^{-1}\| \le C_s^{m+1} \prod_{j=1}^{m+1} (1 + t_j)^{s-1}; \quad s = \frac{|\alpha|}{2(m+1)}. \]
Using (6.9) for the factors of $p^\beta$ to the right (in case of $T_1$ and $T_3$) combined with the resolvents to the right and an additional factor $N^{-1/2}$ we obtain
\[ \|p^\beta (N + t_{m+1})^{-1} \cdots (N + t_1)^{-1} N^{-1/2}\| \le C_\sigma^{m+1} \prod_{j=1}^{m+1} (1 + t_j)^{\sigma-1}; \quad \sigma = \frac{|\beta| - 1}{2(m+1)}. \]
To treat $T_2$ we notice that
\[ \|(N + t_{m+1})^{-1} \cdots (N + t_1)^{-1}\| \le \prod_{j=1}^{m+1} (1 + t_j)^{-1}. \tag{6.10} \]
Now the integrand with an additional factor $N^{-1/2}$ to the right is a sum of terms either bounded (up to a constant) by
\[ \prod_{j=1}^{m+1} t_j^{1/2} (1 + t_j)^{\frac{|\alpha|}{2(m+1)} - 1} (1 + t_j)^{\frac{|\beta| - 1}{2(m+1)} - 1} = \prod_{j=1}^{m+1} t_j^{1/2} (1 + t_j)^{-\frac{3}{2} - \frac{1}{2(m+1)}} \]
(these terms come from $T_1$ and $T_3$), or (for any term of $T_2$) by
\[ \prod_{j=1}^{m+1} t_j^{1/2} (1 + t_j)^{\frac{|\alpha|}{2(m+1)} - 1} (1 + t_j)^{-1} = \prod_{j=1}^{m+1} t_j^{1/2} (1 + t_j)^{\frac{m}{2(m+1)} - 2}. \]
In both cases each factor decays like $t_j^{-1 - \frac{1}{2(m+1)}}$ as $t_j \to \infty$. Whence in all cases the integral with an additional factor $N^{-1/2}$ to the right is convergent in norm, which finishes the proof of the special case where all of the $T$'s are equal to $N^{1/2}$. The general case follows the same scheme. Some of the commutators with $A$ "hit" the potential part introducing a change $W(t, x) \to -x \cdot \nabla W(t, x)$. Other commutators with $A$ hit a factor $p_j$ in which case we apply (6.7a). Finally yet other commutators with $A$ hit a factor $(N + t_j)^{-1}$ in which case we apply (6.7b) and (6.10).

Re $M_2$ and $M_3$. The contributions to (4.21) from the first term of (6.5a), i.e. contributions from the expression $2p^2 N^{-1/2}$, vanish except for the case where all of the $T$'s are equal to $A$. In this case we compute
\[ i^m \mathrm{ad}^m_A(2p^2 N^{-1/2}) = \Bigl(2t\frac{d}{dt}\Bigr)^m f(t)\Big|_{t = p^2}; \quad f(t) = 2t(t + 1)^{-1/2}. \tag{6.11} \]
Obviously the right-hand side of (6.11) is $N^{1/2}$-bounded. The contributions to (4.21) from the expressions $-x \cdot \nabla V N^{-1/2}$ and $-N^{-1/2} x \cdot \nabla V$ are treated like the term $M_1$, in fact slightly simpler. The iterated commutators are all bounded in this case. We leave out the details.

Remark. Since $H$ is not elliptic (more precisely $|p|(H_0 - i)^{-1}$ is unbounded) we do not see an "easy way" to get the conclusion of Proposition 6.3. For instance we need to use the assumption that $e^{-i\lambda}$ is non-threshold. See [29] for a related result for the one-body AC-Stark problem.

Proof of Theorem 6.2. We mimic the proof of [31, Theorem 1.8]. Recall the notation $I_{in}(N) = n(N + n)^{-1}$ and $N_{in} = N I_{in}(N)$. Due to Proposition 6.3 and the representation (6.4) there exists $t_0 \in [0, 1[$ such that
\[ U(t_0, 0)\phi \in D(N^{(k_0+1)/2}). \tag{6.12} \]
In particular $\psi(t) = e^{it\lambda}U(t, 0)\phi \in D(p^2)$ for all $t$. Next we compute
\[ \frac{d}{dt}\langle \psi(t), N_{in}^{k_0+1}\psi(t)\rangle = \langle \psi(t), i[V, N_{in}^{k_0+1}]\psi(t)\rangle, \tag{6.13a} \]
\[ i[V, N_{in}^{k_0+1}] = \sum_{0 \le p \le k_0} N_{in}^{p}\, i[V, N_{in}]\, N_{in}^{k_0 - p}, \tag{6.13b} \]
\[ i[V, N_{in}] = -I_{in}(N) \sum_{j=1}^{\dim X} \bigl(p_j \partial_j V + (\partial_j V)p_j\bigr) I_{in}(N). \tag{6.13c} \]
We plug (6.13c) into (6.13b) and then in turn (6.13b) into the right-hand side of (6.13a). We expand the sum and redistribute for each term at most $k_0$ derivatives by pulling through the factor $\partial_j V$, obtaining terms of a more symmetric form, more precisely of the form
\[ \langle N^{\frac{k_0+1}{2}}\psi(t), B_n N^{\frac{k_0+1}{2}}\psi(t)\rangle \quad \text{where } \sup_n \|B_n\| < \infty. \tag{6.14} \]
Notice that for all terms the operator $B_n$ involves at most $k_0 + 1$ derivatives of $V$. Thanks to the Cauchy–Schwarz inequality and Proposition 6.3 any expression like (6.14) can be integrated on $[t_0, 1]$ and the integral is bounded uniformly in $n$. In combination with (6.12) we conclude that
\[ \sup_n \langle \psi(1), N_{in}^{k_0+1}\psi(1)\rangle < \infty, \]
whence $\phi = \psi(1) \in D(N^{(k_0+1)/2})$.
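The algebra behind (6.13b) and (6.13c) is purely formal and can be sanity-checked with finite matrices. The sketch below is illustrative only: the random Hermitian matrices `P` and `V` are hypothetical stand-ins for $p$ and the potential, and the dimension and the value of `n` are arbitrary choices of ours. In the matrix setting the sum in (6.13c) is replaced by $i[N, V]$, which equals $p \cdot \nabla V + \nabla V \cdot p$ for the actual operators.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k0, n = 6, 2, 50.0   # matrix size, regularity index, regularization parameter

def hermitian(dim):
    """Random Hermitian matrix (hypothetical stand-in for an operator)."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (a + a.conj().T) / 2

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return a @ b - b @ a

P = hermitian(d)            # stand-in for p
N = P @ P + np.eye(d)       # N = p^2 + 1 >= 1
V = hermitian(d)            # stand-in for the potential

I_n = n * np.linalg.inv(N + n * np.eye(d))   # I_{in}(N) = n (N + n)^{-1}
N_n = N @ I_n                                # N_{in} = N I_{in}(N)

power = np.linalg.matrix_power

# (6.13b): Leibniz rule for the commutator with the (k0+1)-st power.
lhs_b = 1j * comm(V, power(N_n, k0 + 1))
rhs_b = sum(power(N_n, p) @ (1j * comm(V, N_n)) @ power(N_n, k0 - p)
            for p in range(k0 + 1))

# (6.13c): i[V, N_in] = -I_in(N) i[N, V] I_in(N).
lhs_c = 1j * comm(V, N_n)
rhs_c = -I_n @ (1j * comm(N, V)) @ I_n
```

Both pairs agree up to rounding. The regularization $I_{in}(N)$ is a contraction here because $N \ge 1$, which is what makes the uniform-in-$n$ bounds in the proof plausible.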
6.3. Regularity of non-threshold atomic type bound states

The generator of the evolution of a system of $N$ particles in a time-periodic Stark field with zero mean (AC-Stark field) is of the form
\[ h_{\mathrm{phy}}(t) = p^2 - E(t) \cdot x + V_{\mathrm{phy}} \]
on $L^2(X)$. Assuming that the field is 1-periodic, the condition $\int_0^1 E(t)\,dt = 0$ leads to the existence of unique 1-periodic functions $b$ and $c$ such that
\[ \frac{d}{dt}b(t) = E(t), \quad \frac{d}{dt}c(t) = 2b(t) \quad \text{and} \quad \int_0^1 c(t)\,dt = 0; \]
see [31] for details. For simplicity let us here assume that $E \in C([0, 1]; X)$, see Remark 6.4 for an extension. The potential $V_{\mathrm{phy}}$ is a sum of time-independent real-valued "pair-potentials"
\[ V_{\mathrm{phy}} = V_{\mathrm{phy}}(x) = \sum_{a \in \mathcal{A}\setminus\{a_{\min}\}} V_a(x^a). \]
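As a concrete illustration of the functions $b$ and $c$, consider a monochromatic field. The field below, with a fixed vector $E_0 \in X$, is a hypothetical example of ours and is not taken from the text.

```latex
% Hypothetical example: E(t) = E_0 \cos(2\pi t), with E_0 \in X fixed.
\[
  E(t) = E_0\cos(2\pi t), \qquad
  b(t) = \frac{E_0}{2\pi}\sin(2\pi t), \qquad
  c(t) = -\frac{E_0}{2\pi^{2}}\cos(2\pi t).
\]
% Check: b' = E, c' = (E_0/\pi)\sin(2\pi t) = 2b, and \int_0^1 c(t)\,dt = 0.
% The integration constant of b is forced to vanish: c can only be
% 1-periodic if \int_0^1 b(t)\,dt = 0.
```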
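In terms of these quantities we introduce Hamiltonians
\[ h_{\mathrm{aux}}(t) = p^2 + 2b(t) \cdot p + V_{\mathrm{phy}}, \quad h(t) = p^2 + V_{\mathrm{phy}}(\cdot + c(t)). \]
The propagators $U_{\mathrm{phy}}$, $U_{\mathrm{aux}}$ and $U$ of $h_{\mathrm{phy}}$, $h_{\mathrm{aux}}$ and $h$, respectively, are linked by Galileo type transformations. Define
\[ S_1(t) = e^{ic(t) \cdot p} \quad \text{and} \quad S_2(t) = e^{i(b(t) \cdot x - \alpha(t))}; \quad \alpha(t) = \int_0^t |b(s)|^2\,ds. \]
Then
\[ U_{\mathrm{phy}}(t, 0) = S_2(t)U_{\mathrm{aux}}(t, 0)S_2(0)^{-1}, \tag{6.15a} \]
\[ U(t, 0) = S_1(t)U_{\mathrm{aux}}(t, 0)S_1(0)^{-1}, \tag{6.15b} \]
\[ U_{\mathrm{phy}}(t, 0) = S_2(t)S_1(t)^{-1}U(t, 0)S_1(0)S_2(0)^{-1}. \tag{6.15c} \]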
The bulk of [31] is a study of the Floquet Hamiltonian of $h$. Spectral information is consequently deduced for the monodromy operator $U(1, 0)$. Finally the formula (6.15c) then gives spectral information for the physical monodromy operator $U_{\mathrm{phy}}(1, 0)$. The part of [31] concerning potentials with local singularities contains an incorrect reference in that it refers to [39] for the existence of the propagator $U$ (see [31, Remark 1.4]). However, although the issue of Yajima's paper is the existence of an appropriate dynamics for singular time-dependent potentials, the paper as well as the method of proof is for the one-body problem only. This point is easily fixed as follows (see Remark 6.4 for a more complicated procedure for $E \in L^1([0, 1]; X)\setminus C([0, 1]; X)$): We use Yosida's theorem, which is in fact also alluded to in [31, Remark 1.4] (see [37, Theorem II.21] for a statement of the theorem). If $V_{\mathrm{phy}}$ is $\epsilon$-bounded relatively to $p^2$ (which is the case under the conditions considered in [31]) then indeed the propagator $U_{\mathrm{aux}}$ exists and we can use (6.15a) and (6.15b) to define $U_{\mathrm{phy}}$ and $U$. In particular we can use (6.15c) and obtain not only the existence of $U_{\mathrm{phy}}$ but various spectral information of the corresponding monodromy operator $U_{\mathrm{phy}}(1, 0)$ (see the introduction of [31] for details). We remark that the construction of the Floquet Hamiltonian of $h$ is done independently of $U$ although of course (6.3) may be taken as a definition.

Let us for completeness note the following by-product of Yosida's theorem (intimately related to its proof): Pick $\lambda_0 \in \mathbb{R}$ such that $h_{\mathrm{aux}}(t) \ge \lambda_0 + 1$ for all $t$. The crucial assumption in the theorem is the boundedness of the function
\[ t \to (h_{\mathrm{aux}}(t) - \lambda_0)^{-1} \frac{d}{dt} (h_{\mathrm{aux}}(t) - \lambda_0)^{-1}. \tag{6.16} \]
Since, by assumption, $E \in C([0, 1]; X)$, clearly the following constant is a bound of (6.16),
\[ C := 2 \sup_t |E(t)| \, \sup_t \||p|(h_{\mathrm{aux}}(t) - \lambda_0)^{-1}\|. \]
We have the explicit bound of the dynamics restricted to $D(p^2)$:
\[ \|(h_{\mathrm{aux}}(t) - \lambda_0)U_{\mathrm{aux}}(t, 0)\phi\|^2 \le e^{2C|t|} \|(h_{\mathrm{aux}}(0) - \lambda_0)\phi\|^2 \quad \text{for } \phi \in D(p^2). \]
Let us also note the following property of the dynamics restricted to $D(|p|)$, cf. [37, Theorems II.23 and II.27],
\[ \|(h_{\mathrm{aux}}(t) - \lambda_0)^{1/2}U_{\mathrm{aux}}(t, 0)\phi\|^2 \le e^{\tilde C|t|} \|(h_{\mathrm{aux}}(0) - \lambda_0)^{1/2}\phi\|^2 \quad \text{for } \phi \in D(|p|); \tag{6.17} \]
here
\[ \tilde C := 2 \sup_t |E(t)| \, \sup_t \||p|^{1/2}(h_{\mathrm{aux}}(t) - \lambda_0)^{-1/2}\|^2. \]
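The origin of the constant $C$ and of the exponential bound on $D(p^2)$ can be seen from a formal Gronwall argument. This is a heuristic sketch of ours that ignores domain questions (which Yosida's theorem takes care of); the abbreviation $w(t)$ is introduced only for this purpose.

```latex
% Abbreviation (ours): w(t) = (h_aux(t) - \lambda_0) U_aux(t,0) \phi,  t >= 0.
% Since b' = E, we have (d/dt) h_aux(t) = 2 E(t)\cdot p.
\begin{align*}
\frac{d}{dt}\|w(t)\|^{2}
  &= 2\,\mathrm{Re}\,\bigl\langle w(t),\, 2E(t)\cdot p\,(h_{\mathrm{aux}}(t)-\lambda_0)^{-1} w(t)\bigr\rangle
     + 2\,\mathrm{Re}\,\bigl\langle w(t),\, -i\,h_{\mathrm{aux}}(t)\, w(t)\bigr\rangle \\
  &\le 2\,\bigl(2|E(t)|\,\||p|(h_{\mathrm{aux}}(t)-\lambda_0)^{-1}\|\bigr)\,\|w(t)\|^{2}
   \;\le\; 2C\,\|w(t)\|^{2},
\end{align*}
% because the second term vanishes by symmetry of h_aux(t).
% Gronwall's inequality then gives \|w(t)\|^2 \le e^{2Ct}\|w(0)\|^2.
```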
Remark 6.4. If $E \in L^1([0, 1]; X)$ but possibly $E \notin C([0, 1]; X)$ we can still show that there exists an appropriate dynamics $U$ under the conditions considered in [31], although possibly not one that preserves $D(p^2)$. We can use [37, Theorem II.27] directly on $h$. For the borderline case, the Coulomb singularity, Hardy's inequality [31, (6.2)] is needed to verify the assumptions of this theorem; the details are not discussed here. This yields a dynamics $U$ preserving $D(|p|)$ which is good enough for getting the conclusions of [31] related to the condition $E \in L^1([0, 1]; X)$. The results presented below can similarly be extended to $E \in L^1([0, 1]; X)$.
The following condition is an extension of [31, Condition 1.3] (which corresponds to $k_0 = 1$ below). The Coulomb potential commonly used to describe atomic and molecular systems (here with moving nuclei) is included.

Conditions 6.5. Let $k_0 \in \mathbb{N}$ be given. For each $a \neq a_{\min}$ the following holds. The pair-potential $X^a \ni y \to V_a(y) \in \mathbb{R}$ splits into a sum $V_a = V_a^1 + V_a^2$ where

(1) Differentiability: $V_a^1 \in C^{k_0+1}(X^a)$ and $V_a^2 \in C^{k_0+1}(X^a\setminus\{0\})$.
(2) Global bounds: For all $\alpha$ with $|\alpha| \le k_0 + 1$ there are bounds $|y|^{|\alpha|} |\partial_y^\alpha V_a^1(y)| \le C$.
(3) Decay at infinity: $|V_a^1(y)| + |y \cdot \nabla_y V_a^1(y)| = o(1)$.
(4) Dimensionality: $V_a^2 = 0$ if $\dim X^a < 3$.
(5) Local singularity: $V_a^2$ is compactly supported and for all $\alpha$ with $|\alpha| \le k_0 + 1$ there are bounds $|y|^{|\alpha|+1} |\partial_y^\alpha V_a^2(y)| \le C$; $y \neq 0$.
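To see how the Coulomb potential fits Conditions 6.5, one possible splitting is sketched below. The coupling constant $q$ and the cutoff function $\chi$ are hypothetical choices of ours, with $X^a = \mathbb{R}^3$.

```latex
% Hypothetical splitting: \chi \in C_c^\infty(\mathbb{R}^3), \chi = 1 near y = 0, q \in \mathbb{R}.
\[
  V_a(y) = \frac{q}{|y|}, \qquad
  V_a^{2}(y) = \chi(y)\,\frac{q}{|y|}, \qquad
  V_a^{1}(y) = \bigl(1-\chi(y)\bigr)\frac{q}{|y|}.
\]
% \partial^\alpha |y|^{-1} is homogeneous of degree -1-|\alpha|, hence
\[
  |y|^{|\alpha|+1}\,\bigl|\partial_y^{\alpha}|y|^{-1}\bigr| \le C_\alpha , \qquad y \neq 0 .
\]
% By the Leibniz rule the terms in which derivatives fall on \chi are supported in a
% compact set away from 0, so V_a^2 satisfies (5). V_a^1 is smooth and behaves like
% q/|y| at infinity, which gives (1)-(3) for every k_0.
```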
We note that the part of the time-dependent potential $V_{\mathrm{phy}}(\cdot + c(t))$ coming from the first term $V_a^1$ of the splitting of $V_a$ in Conditions 6.5 conforms with Conditions 6.1. The part from $V_a^2$ does not, and we do not in general expect there to be an analogue of Theorem 6.2 in this case for $k_0 > 1$. It is an open problem to determine whether there is an analogous statement of Theorem 6.2 for $k_0 = 1$. Notice that the lowest degree of regularity, $\phi \in D(|p|)$, holds even without the non-threshold condition, cf. [31, Theorem 1.8]. On the other hand, since the singularity is located at $x = -c(t)$, we would expect, and we will indeed prove, regularity with respect to the observable
\[ A = A(t) = \frac{1}{2}\bigl((x + c(t)) \cdot p + p \cdot (x + c(t))\bigr) = S_1(t)\,\frac{1}{2}(x \cdot p + p \cdot x)\,S_1(t)^{-1}. \tag{6.18} \]
This regularity is the content of Theorem 6.6 stated below; see [31, Proposition 8.7(ii)] for a related result in the case $k_0 = 1$ at the level of Floquet theory, cf. Proposition 6.7 stated below. The $A$-regularity statement of the theorem for $k_0 > 1$ is new. The set of thresholds is defined as before, see (6.1).

Theorem 6.6. Suppose Conditions 6.5 for some $k_0 \in \mathbb{N}$. Let $\phi$ be a bound state for $U(1, 0)$ pertaining to an eigenvalue $e^{-i\lambda} \notin \mathcal{F}(U(1, 0))$. Then $\phi \in D(A(1)^{k_0})$, where $A(1)$ is given by taking $t = 1$ in (6.18).

The above theorem implies Theorem 1.6(1). We shall prove Theorem 6.6 along the same lines as the proof of Theorem 6.2. Whence we introduce the Floquet Hamiltonian by the expression (6.2) (with $V = V_{\mathrm{phy}}(\cdot + c(t))$). By [31, Theorem 6.2] $V$ is $\epsilon$-bounded relatively to $H_0$, whence $H$ is self-adjoint.

Proposition 6.7. Suppose Conditions 6.5 for some $k_0 \in \mathbb{N}$ and suppose $H\psi = \lambda\psi$ for $e^{-i\lambda} \notin \mathcal{F}(U(1, 0))$. Then for any $k, \ell \ge 0$ with $k + \ell \le k_0$, we have $\psi \in D(A^k \langle p \rangle A^\ell)$, where $A$ is given by (6.18).

Proof. It is tempting to try to apply Corollary 2.9 with $H$ being the Floquet Hamiltonian, $A$ being as stated and $N = p^2 + 1$. In fact all of the conditions of Corollary 2.9 can be verified except for Condition 2.1(2) (notice that the formal analogue of (6.5b) might be too singular). This deficiency will be discussed at the end of the proof. All other conditions can be verified with
\[ H' = 2p^2 - (x + c) \cdot \nabla V + 2b \cdot p, \tag{6.19a} \]
\[ N' = 2p^2, \tag{6.19b} \]
\[ i^\ell \mathrm{ad}_A^\ell(N') = 2^{\ell+1} p^2, \quad i\,\mathrm{ad}_N\bigl(i^\ell \mathrm{ad}_A^\ell(N')\bigr) = 0; \quad \ell \le k_0 - 1, \tag{6.19c} \]
\[ i^\ell \mathrm{ad}_A^\ell(H') = 2^{\ell+1} p^2 + (-1)^{\ell+1}\bigl((x + c) \cdot \nabla\bigr)^{\ell+1} V + 2b \cdot p; \quad \ell \le k_0. \tag{6.19d} \]
Comments are due. First, the second and the third terms of (6.19a) are bounded relatively to $|p|$ uniformly in $t$, cf. the Hardy inequality [31, (6.2)], and whence indeed (6.19a) is $N$-bounded. We need to verify Condition 2.1(3) using the expression (6.19a): The operators $p^2$, $V$ and $R(\eta)$ are fibered and $R(\eta)$ preserves $D(p^2)$ and $D(|p|)$ for $|\eta|$ large enough (uniformly in $t$). Whence as a form on $D(\tau) \cap D(N)$
\[ i[h, R(\eta)] = -R(\eta)\,i[p^2 + V, A]\,R(\eta) = -R(\eta)\bigl(2p^2 - (x + c) \cdot \nabla V\bigr)R(\eta), \]
\[ i[\tau, R(\eta)] = -R(\eta)\,2b \cdot p\,R(\eta), \]
and therefore $i[H, R(\eta)] = -R(\eta)H'R(\eta)$. Using again that $D(H) \cap D(N) = D(\tau) \cap D(N)$ is dense in $D(H) \cap D(N^{1/2})$, cf. Remark 3.5, the latter form identity can be extended by continuity to a form identity on $D(H) \cap D(N^{1/2})$ yielding Condition 2.1(3). As for (6.19b)–(6.19d), Conditions 2.1(1) and 2.1(4), Conditions 2.6 and 2.8, the verification is straightforward (omitted here).

To show (2.4) we first introduce the natural notation $V = V^1 + V^2$ reflecting the splitting of Conditions 6.5. Then we introduce
\[ C = \|(x + c) \cdot \nabla V^1\| \quad \text{and} \quad \tilde C = \bigl\||p|^{-1/2}\bigl((x + c) \cdot \nabla V^2 - 2b \cdot p\bigr)|p|^{-1/2}\bigr\|; \]
the norm is the operator norm on $\mathcal{H}$. Then we note that
\[ N \le \frac{1}{2}H' + \frac{1}{2}\tilde C|p| + 1 + \frac{1}{2}C, \]
understood as a form on $D(N^{1/2})$, yielding (2.4) with $C_1 = 0$, $C_2 = 1$ and $C_3 = 1 + \tilde C^2/4 + C$. We have verified Condition 2.2. As for (2.5) a stronger version follows from [31, Proposition 6.4]
\[ H' \ge c_0 1 - C_4 f_\lambda^\perp(H)^2 - K_0. \tag{6.20} \]
Here we use the condition that $e^{-i\lambda} \notin \mathcal{F}(U(1, 0))$. The estimate (6.20) is valid as a form on $D(N^{1/2})$. Finally it follows from [31, Theorem 6.3] that indeed the condition of Corollary 2.9, $\psi \in D(N^{1/2}) = D(|p|)$, is fulfilled.
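The passage from the form inequality for $N$ to the constants in (2.4) is an absorption argument. The sketch below is our own bookkeeping; it writes $C$ for the norm of $(x + c)\cdot\nabla V^1$, $\tilde C$ for the norm of $|p|^{-1/2}((x + c)\cdot\nabla V^2 - 2b\cdot p)|p|^{-1/2}$, and $H' = 2p^2 - (x+c)\cdot\nabla V + 2b\cdot p$ for the first commutator.

```latex
\begin{align*}
p^{2} &= \tfrac12 H' + \tfrac12 (x+c)\cdot\nabla V^{1}
        + \tfrac12\bigl((x+c)\cdot\nabla V^{2} - 2b\cdot p\bigr)
       \;\le\; \tfrac12 H' + \tfrac12 C + \tfrac12 \tilde C\,|p| ,\\
\tfrac12 \tilde C\,|p| &\le \tfrac12 p^{2} + \tfrac18 \tilde C^{2}
       \;=\; \tfrac12 (N-1) + \tfrac18 \tilde C^{2}
       \qquad\text{(since } \tilde C|p| \le p^{2} + \tilde C^{2}/4\text{)},\\
N = p^{2}+1 &\le \tfrac12 H' + \tfrac12 N + \tfrac12 + \tfrac18 \tilde C^{2} + \tfrac12 C
       \quad\Longrightarrow\quad
       N \le H' + 1 + \tfrac14 \tilde C^{2} + C .
\end{align*}
```

This is consistent with the values $C_2 = 1$ and $C_3 = 1 + \tilde C^2/4 + C$, with all inequalities read in the sense of forms.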
Now to the deficiency given by the lack of Condition 2.1(2). Checking the proof of Corollary 2.9 it is realized that Condition 2.1(2) is used only to assure boundedness of $N^{1/2}BN^{-1/2}$, where under the assumption (2.5) we have $B = C_4 f_\lambda^\perp(H)^2 \langle H \rangle (H - \lambda)^{-1}$. In our case we have a slightly stronger version of the Mourre estimate, (6.20), so what we really need is
\[ N^{1/2} B N^{-1/2} \in \mathcal{B}(\mathcal{H}) \quad \text{where } B = g(H); \quad g(E) = f_\lambda^\perp(E)^2 (E - \lambda)^{-1}. \tag{6.21} \]
So let us show (6.21) without invoking a condition like Condition 2.1(2). Clearly it suffices to show that the commutator
\[ [N^{1/2}, g(H)] \in \mathcal{B}(\mathcal{H}). \tag{6.22} \]
But
\[ [N^{1/2}, g(H)] = c_{1/2} \int_0^\infty t^{1/2} (N + t)^{-1} [N, g(H)] (N + t)^{-1}\,dt, \]
\[ [N, g(H)] = [H - V + I - \tau, g(H)] = -[\tau, g(H)] + T, \]
\[ -[\tau, g(H)] = \frac{1}{\pi} \int_{\mathbb{C}} (\bar\partial \tilde g)(\eta) (H - \eta)^{-1} [\tau, V] (H - \eta)^{-1}\,du\,dv, \]
\[ -[\tau, V] = i2b \cdot \nabla V. \]
Here the term $T$ is bounded since $V$ is bounded relatively to $H$; whence indeed $T$ gives a bounded contribution to the commutator in (6.22). As for the contribution from the term $-[\tau, g(H)]$ only the part from $V^2$ is non-trivial. For that part we use [31, (6.6)] to obtain
\[ \|(H - \eta)^{-1}\, 2b \cdot \nabla V^2\, (H - \eta)^{-1}\| \le C \max(|\mathrm{Im}\,\eta|^{-2}, |\mathrm{Im}\,\eta|^{-1/2}). \]
Whence we can bound the integral
\[ \Bigl\| \int_{\mathbb{C}} (\bar\partial \tilde g)(\eta) (H - \eta)^{-1}\, 2b \cdot \nabla V^2\, (H - \eta)^{-1}\,du\,dv \Bigr\| \le C \int_{\mathbb{C}} |(\bar\partial \tilde g)(\eta)| \max(|\mathrm{Im}\,\eta|^{-2}, |\mathrm{Im}\,\eta|^{-1/2})\,du\,dv < \infty. \]
This means that also the first term $-[\tau, g(H)]$ is bounded and whence in turn its contribution to the commutator in (6.22) agrees with the statement of (6.22). We have proven (6.22).

Proof of Theorem 6.6. We mimic the proof of Theorem 6.2. Recall the notation $I_n(A) = -in(A - in)^{-1}$ and $A_n = A I_n(A)$. Due to Proposition 6.7 and the representation (6.4) there exists $t_0 \in [0, 1[$ such that
\[ U(t_0, 0)\phi \in D(|p|) \cap D(A(t_0)^{k_0}). \tag{6.23} \]
In particular $\psi(t) = e^{it\lambda}U(t, 0)\phi \in D(|p|)$ for all $t$, cf. (6.15b) and (6.17). Moreover $\psi(\cdot)$ is differentiable as a $D(|p|)^*$-valued function, and in this sense
\[ i\frac{d}{dt}\psi(t) = (h(t) - \lambda)\psi(t). \]
Whence we can compute
\[ \frac{d}{dt}\|A_n^{k_0}\psi(t)\|^2 = 2\,\mathrm{Re}\,\Bigl\langle A_n^{k_0}\psi(t), \Bigl(i[h(t), A_n^{k_0}] + \frac{d}{dt}A_n^{k_0}\Bigr)\psi(t) \Bigr\rangle, \tag{6.24a} \]
\[ i[h(t), A_n^{k_0}] + \frac{d}{dt}A_n^{k_0} = \sum_{0 \le p \le k_0 - 1} A_n^{p} \Bigl(i[h(t), A_n] + \frac{d}{dt}A_n\Bigr) A_n^{k_0 - p - 1}, \tag{6.24b} \]
\[ i[h(t), A_n] + \frac{d}{dt}A_n = I_n(A)\bigl(2p^2 + 2b \cdot p - (x + c) \cdot \nabla V\bigr)I_n(A). \tag{6.24c} \]
We plug (6.24c) into (6.24b) and then in turn (6.24b) into the right-hand side of (6.24a). We expand the sum and redistribute for each term at most $k_0 - 1$ factors of $A$, obtaining terms of a more symmetric form, more precisely of the form
\[ \mathrm{Re}\,\langle \langle p \rangle A^{k_0}\psi(t), B \langle p \rangle A^{k}\psi(t) \rangle \quad \text{where } k \le k_0 - 1 \text{ and } \sup_{n,t} \|B\| < \infty. \tag{6.25} \]
Thanks to the Cauchy–Schwarz inequality and Proposition 6.7 any expression like (6.25) can be integrated on $[t_0, 1]$ and the integral is bounded uniformly in $n$. In combination with (6.23) we conclude that
\[ \sup_n \|A(1)_n^{k_0}\psi(1)\|^2 < \infty, \]
whence $\phi = \psi(1) \in D(A(1)^{k_0})$.
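The last step, passing from a uniform bound on the regularized quantities to membership of the domain, can be made explicit with the spectral theorem. The following is a sketch of ours in which $\mu_\phi$ denotes the spectral measure of the self-adjoint operator $A = A(1)$ in the state $\phi$, $I_n(A) = -in(A - in)^{-1}$ and $A_n = A I_n(A)$; the symbol $\mu_\phi$ is our notation.

```latex
% In the spectral representation of A, A_n acts as multiplication by a\,n/(n + ia).
\[
  \|A_n^{k_0}\phi\|^{2}
  = \int_{\mathbb{R}} \Bigl(\frac{a^{2}n^{2}}{n^{2}+a^{2}}\Bigr)^{k_0}\, d\mu_\phi(a)
  \;\nearrow\; \int_{\mathbb{R}} a^{2k_0}\, d\mu_\phi(a)
  \qquad (n \to \infty),
\]
% by monotone convergence, since a^2 n^2/(n^2 + a^2) increases to a^2.
% Hence \sup_n \|A_n^{k_0}\phi\| < \infty is equivalent to \phi \in D(A^{k_0}).
% Note also |n/(n + ia)| \le 1, so that \|I_n(A)\| \le 1 uniformly in n.
```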
Acknowledgment

The first author was partially supported by Center for Theory in Natural Sciences, Aarhus University.

References

[1] S. Agmon, I. Herbst and E. Skibsted, Perturbation of embedded eigenvalues in the generalized N-body problem, Comm. Math. Phys. 122 (1989) 411–438.
[2] J. Aguilar and J.-M. Combes, A class of analytic perturbations for one-body Schrödinger Hamiltonians, Comm. Math. Phys. 22 (1971) 269–279.
[3] W. Amrein, A. Boutet de Monvel and V. Georgescu, C0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians (Birkhäuser, 1996).
[4] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined non-relativistic particles, Adv. Math. 137 (1998) 299–395.
[5] V. Bach, J. Fröhlich, I. M. Sigal and A. Soffer, Positive commutators and the spectrum of Pauli–Fierz Hamiltonian of atoms and molecules, Comm. Math. Phys. 207 (1999) 557–587.
[6] E. Balslev and J.-M. Combes, Spectral properties of many-body Schrödinger operators with dilation analytic interactions, Comm. Math. Phys. 22 (1971) 280–294.
[7] L. Bruneau and J. Dereziński, Pauli–Fierz Hamiltonians defined as quadratic forms, Rep. Math. Phys. 54 (2004) 169–199.
[8] L. Cattaneo, Mourre’s inequality and embedded boundstates, Bull. Sci. Math. 129 (2005) 591–614.
[9] L. Cattaneo, G. M. Graf and W. Hunziker, A general resonance theory based on Mourre’s inequality, Ann. Henri Poincaré 7 (2006) 583–601.
[10] J. Dereziński and C. Gérard, Asymptotic completeness in quantum field theory. Massive Pauli–Fierz Hamiltonians, Rev. Math. Phys. 11 (1999) 383–450.
[11] J. Dereziński and V. Jakšić, Spectral theory of Pauli–Fierz operators, J. Funct. Anal. 180 (2001) 243–327.
[12] J. Dereziński and V. Jakšić, Return to equilibrium for Pauli–Fierz systems, Ann. Henri Poincaré 4 (2003) 739–793.
[13] J. Faupin, J. S. Møller and E. Skibsted, Second order perturbation theory for embedded eigenvalues, to appear in Comm. Math. Phys.
[14] R. Froese and I. Herbst, Exponential bounds and absence of positive eigenvalues for N-body Schrödinger operators, Comm. Math. Phys. 87 (1982) 429–447.
[15] R. Froese, I. Herbst, M. Hoffmann-Ostenhof and T. Hoffmann-Ostenhof, On the absence of positive eigenvalues for one-body Schrödinger operators, J. Anal. Math. 41 (1982) 272–284.
[16] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic completeness for Rayleigh scattering, Ann. Henri Poincaré 3 (2002) 107–170.
[17] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic completeness for Compton scattering, Comm. Math. Phys. 252 (2004) 415–476.
[18] J. Fröhlich, M. Griesemer and I. M. Sigal, Spectral theory for the standard model of non-relativistic QED, Comm. Math. Phys. 283 (2008) 613–646.
[19] J. Fröhlich and M. Merkli, Another return of “return to equilibrium”, Comm. Math. Phys. 251 (2004) 235–262.
[20] V. Georgescu and C. Gérard, On the virial theorem in quantum mechanics, Comm. Math. Phys. 208 (1999) 275–281.
[21] V. Georgescu, C. Gérard and J. S. Møller, Commutators, C0-semigroups and resolvent estimates, J. Funct. Anal. 216 (2004) 303–361.
[22] V. Georgescu, C. Gérard and J. S. Møller, Spectral theory of massless Pauli–Fierz models, Comm. Math. Phys. 249 (2004) 29–78.
[23] C. Gérard, On the scattering theory of massless Nelson models, Rev. Math. Phys. 14 (2002) 1165–1280.
[24] S. Golénia, Positive commutators, Fermi Golden Rule and the spectrum of zero temperature Pauli–Fierz Hamiltonians, J. Funct. Anal. 256 (2009) 2587–2620.
[25] S. Golénia and T. Jecko, A new look at Mourre’s commutator theory, Complex Anal. Oper. Theory 1 (2007) 399–422.
[26] M. Hübner and H. Spohn, Spectral properties of the spin-boson Hamiltonian, Ann. Inst. Henri Poincaré 62 (1995) 289–323.
[27] W. Hunziker and I. M. Sigal, The quantum N-body problem, J. Math. Phys. 41 (2000) 3448–3510.
[28] V. Jakšić and C.-A. Pillet, On a model for quantum friction II: Fermi’s golden rule and dynamics at positive temperature, Comm. Math. Phys. 176 (1996) 619–644.
[29] Y. Kuwabara and K. Yajima, The limiting absorption principle for Schrödinger operators with long-range time-periodic potentials, J. Fac. Sci. Univ. Tokyo Sect. IA Math. 34 (1987) 833–851.
[30] É. Mourre, Absence of singular continuous spectrum for certain selfadjoint operators, Comm. Math. Phys. 78 (1980/81) 391–408.
[31] J. S. Møller and E. Skibsted, Spectral theory of time-periodic many-body systems, Adv. Math. 188 (2004) 137–221.
[32] J. S. Møller and M. Westrich, Regularity of eigenstates in regular Mourre theory, J. Funct. Anal. 260 (2011) 852–878.
[33] M. Reed and B. Simon, Methods of Modern Mathematical Physics I, Functional Analysis (Academic Press, 1972).
[34] M. Reed and B. Simon, Methods of Modern Mathematical Physics II, Fourier Analysis, Self-Adjointness (Academic Press, 1975).
[35] M. Reed and B. Simon, Methods of Modern Mathematical Physics III, Scattering Theory (Academic Press, 1979).
[36] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV, Analysis of Operators (Academic Press, 1978).
[37] B. Simon, Quantum Mechanics for Hamiltonians Defined as Quadratic Forms, Princeton Series in Physics (Princeton University Press, 1971).
[38] E. Skibsted, Spectral analysis of N-body systems coupled to a bosonic field, Rev. Math. Phys. 10 (1998) 989–1026.
[39] K. Yajima, Existence of solutions for Schrödinger evolution equations, Comm. Math. Phys. 110 (1987) 415–426.
June 3, J070-S0129055X11004357
2011 13:30 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 5 (2011) 531–551 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004357
DYNAMICAL BACKREACTION IN ROBERTSON–WALKER SPACETIME
BENJAMIN ELTZNER∗,‡ and HANNO GOTTSCHALK†,§
∗Max-Planck Institute for Mathematics in the Sciences, Inselstr. 22, D-04103 Leipzig, Germany
†Bergische Universität Wuppertal, Fachgruppe Mathematik, Gaußstraße 20, D-42119 Wuppertal, Germany
‡[email protected]
§[email protected]
Received 26 March 2010
Revised 16 March 2011

The treatment of a quantized field in a curved spacetime requires the study of backreaction of the field on the spacetime via the semiclassical Einstein equation. We consider a free scalar field in spatially flat Robertson–Walker spacetime. We require the state of the field to allow for a renormalized semiclassical stress tensor. We calculate the singularities of the stress tensor restricted to equal times in agreement with the usual renormalization prescription for Hadamard states to perform an explicit renormalization. The dynamical system for the Robertson–Walker scale parameter a(t) coupled to the scalar field is finally derived for the case of conformal and also general coupling.

Keywords: Quantum field theory; cosmology.

Mathematics Subject Classification 2010: 81T05
1. Introduction

Studies of quantized fields in curved spacetimes usually assume a fixed background spacetime on which a quantized field is defined [5]. Such a setting has also been used to investigate the simplest cosmological model of a homogeneous isotropic universe. In this context the mode spectrum of a scalar field in a homogeneous isotropic universe undergoing an era of inflation has been found to be the famous and experimentally confirmed scale free Harrison–Zeldovich spectrum [9, 10].

Work concerned with the backreaction of the field on the evolution of the spacetime itself usually takes a mean field approach in which quantum fluctuations only contribute to the effective potential of the (classical) expectation value of the field [11, 12, 21]. The dynamics of the quantum degrees of freedom is not considered. In other approaches, the quantum degrees of freedom either decouple from the spacetime [17] due to conformal coupling for a massless field, or a large mass of the field is assumed [3], which implies another semiclassical approximation. This approximation in fact implies that the field configuration is dominated by renormalization ambiguities. Again the coupling of quantum degrees of freedom to the geometry of the spacetime is only approximate.

In this work, which is partially based on [4], we determine the coupling of a free quantum field to the scale parameter of the spatially flat Robertson–Walker spacetime. The result is an equation for a dynamical system with infinitely many degrees of freedom that can, at least in principle, be solved. The dynamical system is derived via an expansion of Riemann normal coordinates in the Robertson–Walker canonical coordinates up to fifth order [1, 6]. This allows us to calculate the singular terms of the Hadamard bidistribution and its first and second time derivatives restricted to equal time surfaces. Those terms are needed to renormalize the energy-momentum tensor of the free field restricted to equal time surfaces. The latter establishes the dynamics of the Robertson–Walker scale factor (up to renormalization ambiguities).

While finalizing this paper we became aware of the publication [15], which derives exactly the same equation of motion as we do for the case of conformal coupling using a completely independent argument. In fact, that work also establishes, in the conformally coupled case, the existence of solutions for small time intervals given that the initial state fulfills certain conditions, which goes beyond the scope of this article.
2. General Description of Dynamics and Renormalization Approach

Throughout the article we restrict ourselves to flat Robertson–Walker (RW) spacetimes of spacetime dimension 4. This allows us to use the standard Fourier transform on spatial sections of constant RW time in order to formulate the dynamics of the field in terms of modes. As the stress tensor is formulated as a differential operator acting on the two-point function, we will formulate the dynamics of the field in such a way that the two-point function of the field and its associated momentum field operator restricted to the given time is the dynamical variable. We will also restrict our attention to homogeneous, isotropic, quasifree and pure states that either are Hadamard states or are sufficiently close to Hadamard in the sense that they allow the same renormalization prescription for their energy-momentum tensor. Hadamard states and adiabatic vacuum states on RW spacetime have been studied, e.g., by Lüders and Roberts [13], Junker and Schrohe [7] and more recently by Olbermann [14].

The line element of a spatially flat Robertson–Walker metric for a homogeneous, isotropic spacetime is

$$ds^2 = dt^2 - a^2(t)\,d\vec{x}^{\,2}, \qquad (1)$$

where we call $a(t)$ the scale parameter and define the Hubble parameter $H(t) = \dot a/a$. We consider a free scalar field of mass $m$ coupled to the scalar curvature $R(t) = -6(\dot H(t) + 2H^2(t))$ with coupling $\xi$,

$$(\Box - \xi R + m^2)\phi = 0, \qquad (2)$$

where $\Box$ is the D'Alembertian. We use the abbreviation $\omega_k^2 = a^{-2}k^2 + m^2 - \xi R$. The Klein–Gordon equation for the field modes $\phi_k$ is given by

$$\ddot\phi_k + 3H\dot\phi_k + \omega_k^2\phi_k = 0. \qquad (3)$$

Using the canonical momentum field $\pi_k = a^3\dot\phi_k$ we consider the Hamiltonian form of this equation

$$\partial_t\begin{pmatrix}\phi_k\\ \pi_k\end{pmatrix} = \begin{pmatrix}0 & a^{-3}\\ -a^3\omega_k^2 & 0\end{pmatrix}\begin{pmatrix}\phi_k\\ \pi_k\end{pmatrix}. \qquad (4)$$

It is shown in [13] that the equal time two-point function, i.e. the two-point function on a Cauchy surface, of a state can be described by matrices

$$\begin{pmatrix}G_{\phi\phi,k} & G_{\phi\pi,k}\\ G_{\pi\phi,k} & G_{\pi\pi,k}\end{pmatrix}, \qquad (5)$$

where $G_{\phi\phi,k} = G_{\phi\phi}(t,k)$ is the Fourier transform in $\vec z = \vec x - \vec y$ of two equal time field operators, $G(t,\vec x,\vec y) = \omega(\phi(t,\vec x)\phi(t,\vec y))$, $G_{\phi\pi,k}$ is the Fourier transform of one equal time field operator and one canonically conjugate momentum operator etc., and $k$ is the modulus of the momentum conjugate to $\vec z$. Our normalization convention for the Fourier transform is $F(f)(\vec k) = \int_{\mathbb{R}^3} f(\vec x)\,e^{i\vec k\cdot\vec x}\,d\vec x$. The positivity of the state enforces that the matrix is positive semidefinite, with its determinant vanishing for a pure quasifree state. The symmetric part of the two-point function modes then fulfills the linear system of equations

$$\partial_t\begin{pmatrix}G_{\phi\phi,k}\\ G_{(\phi\pi),k}\\ G_{\pi\pi,k}\end{pmatrix} = \begin{pmatrix}0 & 2a^{-3} & 0\\ -a^3\omega_k^2 & 0 & a^{-3}\\ 0 & -2a^3\omega_k^2 & 0\end{pmatrix}\begin{pmatrix}G_{\phi\phi,k}\\ G_{(\phi\pi),k}\\ G_{\pi\pi,k}\end{pmatrix}. \qquad (6)$$

Here and in the following $(\ )$ stands for symmetrization and $[\ ]$ for antisymmetrization in the field and momentum operator. This system of equations has one conserved quantity per mode,

$$J_k = G_{\phi\phi,k}G_{\pi\pi,k} - G_{(\phi\pi),k}^2, \qquad (7)$$

which reduces the number of degrees of freedom per mode to two, as required. The condition for a state to be pure and to induce a representation of the canonical commutation relation (CCR) algebra then is

$$\forall k:\quad J_k = -G_{[\phi\pi],k}^2 = \frac14, \qquad (8)$$

which implies the vanishing of the above-mentioned determinant for all modes.
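The conservation of $J_k$ under (6) can be illustrated numerically. The following is a minimal sketch, not part of the original derivation: it assumes a hypothetical power-law background $a(t) = (1+t)^{2/3}$, illustrative values for $k$, $m$ and $\xi$, and a plain fourth-order Runge–Kutta integrator. All function names are chosen here for illustration only.

```python
import math

def background(t, p=2.0 / 3.0):
    """Hypothetical power-law background a(t) = (1 + t)**p; returns (a, H, R)."""
    a = (1.0 + t) ** p
    H = p / (1.0 + t)
    H_dot = -p / (1.0 + t) ** 2
    R = -6.0 * (H_dot + 2.0 * H * H)  # scalar curvature as defined below (1)
    return a, H, R

def rhs(t, G, k, m, xi):
    """Right-hand side of system (6) for (G_phiphi, G_(phipi), G_pipi)."""
    a, _, R = background(t)
    omega_sq = k * k / (a * a) + m * m - xi * R
    g_ff, g_fp, g_pp = G
    return (2.0 * g_fp / a**3,
            -a**3 * omega_sq * g_ff + g_pp / a**3,
            -2.0 * a**3 * omega_sq * g_fp)

def rk4_step(t, G, h, k, m, xi):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(t, G, k, m, xi)
    k2 = rhs(t + h / 2, tuple(g + h / 2 * d for g, d in zip(G, k1)), k, m, xi)
    k3 = rhs(t + h / 2, tuple(g + h / 2 * d for g, d in zip(G, k2)), k, m, xi)
    k4 = rhs(t + h, tuple(g + h * d for g, d in zip(G, k3)), k, m, xi)
    return tuple(g + h / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
                 for g, d1, d2, d3, d4 in zip(G, k1, k2, k3, k4))

def invariant(G):
    """The quantity J_k of (7)."""
    return G[0] * G[2] - G[1] ** 2

def evolve(k=2.0, m=1.0, xi=0.1, t_end=1.0, steps=2000):
    """Evolve a mode that starts with J_k = 1/4 at t = 0 (where a = 1)."""
    a0, _, R0 = background(0.0)
    omega0 = math.sqrt(k * k / a0**2 + m * m - xi * R0)
    G = (1.0 / (2.0 * omega0), 0.0, omega0 / 2.0)
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        G = rk4_step(t, G, h, k, m, xi)
        t += h
    return G
```

Calling `invariant(evolve())` should return a value that agrees with 1/4 up to the integration error, while the individual entries of `G` change in time.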
Next we turn to the semiclassical Einstein equation. This equation has the form

$$G_{\mu\nu} = 8\pi G\,\langle T_{\mu\nu}\rangle_\omega, \qquad (9)$$

where the expectation value of the stress tensor must be renormalized. In this equation $G_{\mu\nu}$ is the Einstein tensor and $G$ is the gravitational constant; neither should be confused with $G_{\phi\phi,k}$ etc., which are two-point functions. We will apply the point-splitting procedure as formulated in [16], see also [18]. In order to preserve general covariance of the renormalization prescription, we need to subtract the Hadamard bidistribution $\mathcal{H}$, described in some detail in [5], from the two-point function before removing the point splitting. Actually, a sufficiently precise approximation $\mathcal{H}_n$ of the Hadamard parametrix does the job as well:

$$\langle T^{(\eta)}_{\mu\nu}(v)\rangle_{\omega,\lambda,\xi} = \lim_{(x,y)\to(v,v)} D^{(\eta)}_{(v)\mu\nu}(x,y)\,\big[\langle\phi(x)\phi(y)\rangle_\omega - \mathcal{H}_n(x,y)\big] + t_{\mu\nu}(v) \qquad (10)$$

with, cf. [8, 18],

$$t_{\mu\nu}(v) = \frac{\delta}{\delta g^{\mu\nu}(v)}\int\big(Am^4 + Bm^2R + CR^2 + DR_{\alpha\beta}R^{\alpha\beta}\big)\,d_gx, \qquad (11)$$

where $A$, $B$, $C$ and $D$ are real valued renormalization degrees of freedom and $d_gx$ stands for the canonical volume form $\sqrt{|g|}\,dx$. $D^{(\eta)}_{(v)\mu\nu}(x,y)$ is the symmetrization in $x$ and $y$ of the following second order partial differential operator (cf. [16, Eq. (10)] for the details, where we corrected some minor misprints):

$$\begin{aligned}
&\partial_{x,\mu}\partial_{y,\nu} - \frac12 g_{\mu\nu}\big(g^{\gamma\delta}\partial_{x,\gamma}\partial_{y,\delta} - m^2\big)\\
&\quad + \xi\Big[\Big(R_{\mu\nu} - \frac12 g_{\mu\nu}R\Big) + 2g_{\mu\nu}\big(\Box_x + g^{\gamma\delta}\partial_{x,\gamma}\partial_{y,\delta}\big) - 2\big(D_{x,\mu}\partial_{x,\nu} + \partial_{x,\mu}\partial_{y,\nu}\big)\Big]\\
&\quad - \eta\,g_{\mu\nu}\big(\Box_x - \xi R + m^2\big).
\end{aligned} \qquad (12)$$

Here $\partial_{x,\mu}$ is understood as $\delta_\mu^{\ \nu}(v,x)\,\partial_{x,\nu}$, $\delta(v,x)$ being the geodesic transport, and $D_{x,\mu}$ analogously denotes a covariant derivative. The quantities $g$, $R$ and the Ricci tensor $R_{\mu\nu}$ are evaluated at $v$. Note that here we use the sign convention $(+---)$ instead of $(-+++)$ in [16], which flips the sign in the last expression in the brackets (and some more in the text below). The definition of $\mathcal{H}_n$ for $d = 4$ is

$$\mathcal{H}_n(x,y) = -\frac{1}{4\pi^2}\frac{u(x,y)}{\sigma(x,y)} - \frac{1}{4\pi^2}\log\Big(\frac{\sigma(x,y)}{\lambda^2}\Big)\sum_{k=0}^{n}\frac{1}{k!}\,v_k(x,y)\,\sigma(x,y)^k. \qquad (13)$$

Here $\sigma(x,y)$ is the squared geodesic distance and $u(x,y)$ is the square root of the Van Vleck–Morette determinant (which for dimension 4 is $U_0(x,y)$ as defined in [16] up to normalization). $u(x,y)$ and the $v_k(x,y)$ (corresponding to $U_{k+1}(x,y)$ of [16]) are
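functions that can be determined by a recursive system of differential equations that depends exclusively on invariant quantities and the operator $\Box - \xi R + m^2$:

$$2g^{\mu\nu}(x)(\partial^x_\mu\sigma)(\partial^x_\nu u) + (\Box_x\sigma - 4)\,u = 0, \qquad (14)$$

$$2g^{\mu\nu}(x)(\partial^x_\mu\sigma)(\partial^x_\nu v_0) + (\Box_x\sigma - 2)\,v_0 = -(\Box_x + m^2 - \xi R(x))\,u, \qquad (15)$$

$$2g^{\mu\nu}(x)(\partial^x_\mu\sigma)(\partial^x_\nu v_{k+1}) + (\Box_x\sigma + 2k)\,v_{k+1} = -(\Box_x + m^2 - \xi R(x))\,v_k. \qquad (16)$$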
The coincidence limits $v_k(v,v)$ coincide, up to normalization, with the Hadamard–Minakshisundaram–DeWitt–Seeley coefficients [5, 16]. In particular, on RW spacetime, $v_k((t,\vec x),(t,\vec y))$ depends on $\vec x$ and $\vec y$ only through $\vec z^{\,2} = (\vec x - \vec y)^2$, and the coincidence limit $v_k((t,\vec x),(t,\vec x))$ of the $v_k$ depends only on $t$. The same holds true for derivatives of $v_k$ with respect to $\vec z^{\,2}$. The regularization at light-like separated $x$, $y$ is done by adding $i\varepsilon(x^0 - y^0)$ to $\sigma(x,y)$, letting $\varepsilon \downarrow 0$ and taking the real part. We do not need this in the following, as we will approach the coincidence limit from spatial directions exclusively. It is shown that for $n \in \mathbb{N}$, the expression $[\langle\phi(x)\phi(y)\rangle_\omega - \mathcal{H}_n(x,y)]$ can be extended to a function in $C^n(C_v \times C_v)$, where $C_v$ is some convex normal neighborhood of $v$. The quantum averaged field fluctuations are defined as

$$\langle\phi^2\rangle_{\omega,\lambda,\xi} = \lim_{(x,y)\to(v,v)}\big[\langle\phi(x)\phi(y)\rangle_\omega - \mathcal{H}_n(x,y)\big], \qquad (17)$$

where $n = 0$ suffices if no derivatives of this quantity are required. For $\eta \neq 0$, the corresponding term in (10) in the energy-momentum tensor vanishes classically, but not quantum mechanically. For $\eta = 1/3$, the so-defined energy-momentum tensor is conserved. For RW spacetimes and states of the form (5), the quantum averaged energy-momentum tensor depends only on $t$ and is diagonal, so that only energy conservation is non-trivial:

$$\nabla^\mu\langle T^{(1/3)}_{\mu 0}\rangle_{\omega,\lambda,\xi} = \dot\rho + H(\rho + p) = 0, \qquad (18)$$

where $\rho = \langle T^{(1/3,\lambda)}_{00}\rangle_{\omega,\lambda,\xi}$ and $p = a^2\langle T^{(1/3,\lambda)}_{jj}\rangle_{\omega,\lambda,\xi}$ are energy density and pressure, respectively, see, e.g., [21]. Moretti also showed

$$\begin{aligned}
g^{\mu\nu}\langle T^{(1/3)}_{\mu\nu}\rangle_{\omega,\lambda,\xi} = \rho - 3p ={}& \Big[-3\Big(\frac16 - \xi\Big)\Box + m^2\Big]\langle\phi^2\rangle_{\omega,\lambda,\xi}\\
&+ \frac{1}{4\pi^2}v_1 + cm^4 + c'm^2R + c''\Box R,
\end{aligned} \qquad (19)$$

where $v_1 = v_1(t,\xi,m^2) = v_1((t,\vec v),(t,\vec v))$ and $c$, $c'$, $c''$ can be calculated from $A$, $B$, $C$ and $D$. The D'Alembert operator here does not require point splitting and hence on flat RW spacetime can be replaced by $d^2/dt^2 + 3H\,d/dt$. Interestingly, this equation can be seen as the equation of state, and the right-hand side gives the deviation of quantum matter from the state equation of hot (relativistic) matter $p = \rho/3$. For Minkowski space and $\omega$ the Minkowski vacuum, we expect $p = \rho = 0$ which
can be achieved through $\lambda^2 = 4e^{\frac74 - 2\gamma}/m^2$ with $\gamma$ the Euler constant. $v_1$ has been calculated in [3] for the case of flat RW spacetime as

$$\begin{aligned}
v_1 ={}& \frac{1}{60}(\dot HH^2 + H^4) + \frac{1}{24}\Big(\frac15 - \xi\Big)\Box R\\
&- \frac92\Big(\frac16 - \xi\Big)^2(\dot H^2 + 4H^2\dot H + 4H^4) - \frac{m^4}{8} + \frac14\Big(\frac16 - \xi\Big)m^2R.
\end{aligned} \qquad (20)$$

In the case of a massless field with conformal coupling, $m^2 = 0$, $\xi = 1/6$, the terms in the second line on the right-hand side of (19) are called the conformal anomaly. Combining (9) and (19) one arrives at the following equation of motion:

$$-R = 8\pi G\Big(\Big[-3\Big(\frac16 - \xi\Big)\Big(\frac{d^2}{dt^2} + 3H\frac{d}{dt}\Big) + m^2\Big]\langle\phi^2\rangle_{\omega,\lambda,\xi} + \frac{1}{4\pi^2}v_1 + cm^4 + c'm^2R + c''\Box R\Big). \qquad (21)$$
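We would like to find an expression for $\big(\frac{d^2}{dt^2} + 3H\frac{d}{dt}\big)\langle\phi^2\rangle_{\omega,\lambda,\xi}$ that does not include terms $\langle\phi\ddot\phi\rangle_{\omega,\lambda,\xi}$, since only then do second order time derivatives of the field $\phi$ occur only on the left-hand side of (4). It is clear that $\frac{d}{dt}\langle\phi^2\rangle_{\omega,\lambda,\xi}$ equals $\langle\dot\phi\phi\rangle_{\omega,\lambda,\xi} + \langle\phi\dot\phi\rangle_{\omega,\lambda,\xi}$. Let $h(t) = f(s,s')|_{s=s'=t}$. Then $\ddot h(t) = (\partial_s^2 + \partial_{s'}^2 + 2\partial_s\partial_{s'})f(s,s')|_{s=s'=t}$. Applying this to our problem and using the fact [16, Lemma 2.1] that

$$\langle\phi(\Box - \xi R + m^2)\phi\rangle_{\omega,\lambda,\xi} = \frac{3}{2\pi^2}v_1, \qquad (22)$$

we conclude that

$$\Big(\frac{d^2}{dt^2} + 3H\frac{d}{dt}\Big)\langle\phi^2\rangle_{\omega,\lambda,\xi} = 2\langle\dot\phi^2\rangle_{\omega,\lambda,\xi} + 2\langle\phi(a^{-2}\Delta + \xi R - m^2)\phi\rangle_{\omega,\lambda,\xi} + \frac{3}{\pi^2}v_1. \qquad (23)$$

Here $\Delta$ stands for the Laplacian on $\mathbb{R}^3$. We summarize the discussion in the following theorem:

Theorem 2.1. The equation of motion for the semiclassical Einstein equation on flat Robertson–Walker spacetime can be written as follows:

$$\begin{aligned}
-R = 8\pi G\Big(&(6\xi - 1)\big(\langle\dot\phi^2\rangle_{\omega,\lambda,\xi} + a^{-2}\langle\phi\Delta\phi\rangle_{\omega,\lambda,\xi}\big) + \big[(2 - 6\xi)m^2 - (1 - 6\xi)\xi R\big]\langle\phi^2\rangle_{\omega,\lambda,\xi}\\
&+ \frac{36\xi - 5}{4\pi^2}v_1 + cm^4 + c'm^2R + c''\Box R\Big)
\end{aligned} \qquad (24)$$

with $R = -6(\dot H + 2H^2)$, $\Box R = \big(\frac{d^2}{dt^2} + 3H\frac{d}{dt}\big)R$ and $v_1$ given by (20).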
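The step from (21) and (23) to (24) is pure coefficient bookkeeping, which the following sketch reproduces with exact rational arithmetic. It is an illustration only: the function name and the dictionary keys are chosen here, and the coefficient of $v_1$ is stored in units of $1/\pi^2$.

```python
from fractions import Fraction as F

def theorem_2_1_coefficients(xi):
    """Insert (23) into (21) and collect the coefficients that appear in (24)."""
    xi = F(xi)
    # Prefactor of (d^2/dt^2 + 3H d/dt)<phi^2> in (21).
    pref = -3 * (F(1, 6) - xi)
    return {
        # (23) contributes 2<phidot^2> + 2 a^-2 <phi Laplace phi>.
        "kinetic": 2 * pref,
        # (23) contributes -2 m^2 <phi^2>; (21) adds + m^2 <phi^2>.
        "m2_phi2": -2 * pref + 1,
        # (23) contributes + 2 xi R <phi^2>.
        "xiR_phi2": 2 * pref,
        # (23) contributes (3/pi^2) v1; (21) adds (1/(4 pi^2)) v1.
        "v1_over_pi2": 3 * pref + F(1, 4),
    }
```

For any rational `xi` the result should match $(6\xi - 1)$, $(2 - 6\xi)$, $-(1 - 6\xi)$ and $(36\xi - 5)/4$, which are the coefficients stated in Theorem 2.1.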
From the above analysis it is clear that we need to compute $\langle\phi^2\rangle_{\omega,\lambda,\xi}$ and its second time derivative in terms of the functions $G_{\phi\phi},\ldots$ and $a, H, \dot H, R, \dot R, \ddot R, \ldots$ in order to combine (6) and (24) into a closed system of equations. Hence we need to calculate those terms in $\mathcal{H}(x,y)|_{x^0=y^0=t}$, $\Delta_x\mathcal{H}(x,y)|_{x^0=y^0=t}$ and $\frac{\partial}{\partial x^0}\frac{\partial}{\partial y^0}\mathcal{H}(x,y)|_{x^0=y^0=t}$ that either are singular or contribute a (time dependent) quantity to the spatial coincidence limit $\vec z = (\vec x - \vec y) \to 0$.

3. Leading Terms of the Hadamard Distribution

We start the task described at the end of the preceding section with a perturbative calculation of the singularities in the leading term $\mathcal{H}'(x,y) = -(4\pi^2)^{-1}\,u(x,y)/\sigma(x,y)$ of the Hadamard distribution. In normal coordinates $X$ around $y$, $\sigma$ takes the form

$$2\sigma = \eta_{\mu\nu}X^\mu X^\nu \qquad (25)$$

with $\eta_{\mu\nu}$ the Minkowski metric. Thus the computation is reduced to finding normal coordinates perturbatively in dependence on the canonical coordinates. We will need this expansion to fifth order to fix all singularities of $\lim_{y\to x}\mathcal{H}'(x,y)$ and its first and second derivatives as well as all homogeneous terms. The normal coordinates have been calculated perturbatively to fifth order in [1] (using computer algebra support), formulas (11.12)–(11.16), for a general metric. We plug in

$$\partial_0^n\Gamma^\alpha_{\beta\gamma} = \delta^\alpha_0\delta^i_\beta\delta^j_\gamma\delta_{ij}\,L_na^2 + \delta^\alpha_i\big(\delta^0_\beta\delta^i_\gamma + \delta^i_\beta\delta^0_\gamma\big)H^{(n)}, \qquad (26)$$

where $H^{(n)}$ denotes the $n$th derivative of $H$ with respect to time and the $L_n$ are

$$L_0 = H, \qquad (27)$$

$$L_1 = \dot H + 2H^2, \qquad (28)$$

$$L_2 = \ddot H + 6\dot HH + 4H^3, \qquad (29)$$

$$L_3 = \dddot H + 8\ddot HH + 6\dot H^2 + 24\dot HH^2 + 8H^4. \qquad (30)$$

Using these formulae we get

$$\begin{aligned}
2\sigma = z_0^2 - a^2\vec z^{\,2}\Big[&1 + Hz_0 + \frac13(\dot H + H^2)z_0^2 + \frac{1}{12}H^2a^2\vec z^{\,2}\\
&+ \frac{1}{12}(\ddot H + 2\dot HH)z_0^3 + \frac{1}{12}(\dot HH + 2H^3)a^2\vec z^{\,2}z_0\\
&+ \frac{1}{180}(3\dddot H + 6\ddot HH + 2\dot H^2 - 8\dot HH^2 - 4H^4)z_0^4\\
&+ \frac{1}{360}(9\ddot HH + 8\dot H^2 + 74\dot HH^2 + 48H^4)a^2\vec z^{\,2}z_0^2\\
&+ \frac{1}{360}(3\dot HH^2 + 4H^4)a^4(\vec z^{\,2})^2\Big] + O(z^7),
\end{aligned} \qquad (31)$$

where all time dependent terms are evaluated at $y_0$ and we use the abbreviation $z = x - y$.
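The coefficients $L_n$ in (27)–(30) follow a simple pattern: since $\partial_0^n\Gamma^0_{ij} = \delta_{ij}L_na^2$ and $\dot a = Ha$, one more time derivative gives $L_{n+1} = \dot L_n + 2HL_n$. The sketch below regenerates (27)–(30) from this recursion. The representation is an assumption made here for illustration: a polynomial in $H, \dot H, \ddot H, \ldots$ is stored as a dictionary mapping exponent tuples to integer coefficients.

```python
def _norm(exps):
    """Strip trailing zero exponents so that each monomial has a unique key."""
    exps = list(exps)
    while exps and exps[-1] == 0:
        exps.pop()
    return tuple(exps)

def d_dt(poly):
    """Time derivative of a polynomial in H, H', H'', ... (product rule)."""
    out = {}
    for mono, c in poly.items():
        for j, e in enumerate(mono):
            if e == 0:
                continue
            new = list(mono) + [0]
            new[j] -= 1        # lower the power of the j-th derivative of H
            new[j + 1] += 1    # and raise the power of the (j+1)-th derivative
            key = _norm(new)
            out[key] = out.get(key, 0) + c * e
    return out

def times_H(poly, factor=1):
    """Multiply a polynomial by factor * H."""
    out = {}
    for mono, c in poly.items():
        new = list(mono) if mono else [0]
        new[0] += 1
        key = _norm(new)
        out[key] = out.get(key, 0) + factor * c
    return out

def add(p, q):
    """Sum of two polynomials, dropping zero coefficients."""
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0) + c
    return {k: v for k, v in out.items() if v != 0}

def L_coefficients(n_max):
    """Return [L_0, ..., L_n_max] using L_{n+1} = dL_n/dt + 2 H L_n."""
    L = [{(1,): 1}]  # L_0 = H
    for _ in range(n_max):
        L.append(add(d_dt(L[-1]), times_H(L[-1], 2)))
    return L
```

In this encoding the key `(2, 1)` stands for $H^2\dot H$ and `(1, 0, 1)` for $H\ddot H$, so `L_coefficients(3)[3]` can be compared entry by entry with (30).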
Lemma 3.1. $\sigma$ is symmetric under exchange of $x$ and $y$.

Proof. The proof is done by straightforward calculation; however, one has to keep in mind that $a$, $H$ and their derivatives all have a suppressed argument $y_0$, so that they have to be Taylor expanded. The terms of even order in $z$ do not change sign, so it has to be shown that they are not affected by the terms from the Taylor expansion of the lower order coefficients. Using abbreviations $a_y = a(y_0)$ and the like, we get to the relevant orders

$$\frac{a_x^2}{a_y^2} = 1 + 2H_yz_0 + (\dot H_y + 2H_y^2)z_0^2 + \frac13(\ddot H_y + 6\dot H_yH_y + 4H_y^3)z_0^3 + \frac{1}{12}(\dddot H_y + 8\ddot H_yH_y + 6\dot H_y^2 + 24\dot H_yH_y^2 + 8H_y^4)z_0^4, \qquad (32)$$

$$-H_x\frac{a_x^2}{a_y^2}z_0 = -H_yz_0 - (\dot H_y + 2H_y^2)z_0^2 - \frac12(\ddot H_y + 6\dot H_yH_y + 4H_y^3)z_0^3 - \frac16(\dddot H_y + 8\ddot H_yH_y + 6\dot H_y^2 + 24\dot H_yH_y^2 + 8H_y^4)z_0^4, \qquad (33)$$

$$\frac13(\dot H_x + H_x^2)\frac{a_x^2}{a_y^2}z_0^2 = \frac13(\dot H_y + H_y^2)z_0^2 + \frac13(\ddot H_y + 4\dot H_yH_y + 2H_y^3)z_0^3 + \frac16(\dddot H_y + 6\ddot H_yH_y + 4\dot H_y^2 + 14\dot H_yH_y^2 + 4H_y^4)z_0^4, \qquad (34)$$

$$-\frac{1}{12}(\ddot H_x + 2\dot H_xH_x)\frac{a_x^2}{a_y^2}z_0^3 = -\frac{1}{12}(\ddot H_y + 2\dot H_yH_y)z_0^3 - \frac{1}{12}(\dddot H_y + 4\ddot H_yH_y + 2\dot H_y^2 + 4\dot H_yH_y^2)z_0^4, \qquad (35)$$

$$\frac{1}{12}H_x^2\frac{a_x^4}{a_y^4} = \frac{1}{12}H_y^2 + \frac16(\dot H_yH_y + 2H_y^3)z_0 + \frac{1}{12}(\ddot H_yH_y + \dot H_y^2 + 10\dot H_yH_y^2 + 8H_y^4)z_0^2, \qquad (36)$$

$$-\frac{1}{12}(\dot H_xH_x + 2H_x^3)\frac{a_x^4}{a_y^4}z_0 = -\frac{1}{12}(\dot H_yH_y + 2H_y^3)z_0 - \frac{1}{12}(\ddot H_yH_y + \dot H_y^2 + 10\dot H_yH_y^2 + 8H_y^4)z_0^2. \qquad (37)$$

Adding up the first four and the last two terms one gets

$$1 + H_yz_0 + \frac13(\dot H_y + H_y^2)z_0^2 + \frac{1}{12}(\ddot H_y + 2\dot H_yH_y)z_0^3, \qquad (38)$$

$$\frac{1}{12}H_y^2 + \frac{1}{12}(\dot H_yH_y + 2H_y^3)z_0, \qquad (39)$$

which proves the claim.
To calculate $u$ we use the first equation of the well-known Hadamard recursion, which is deduced from the Klein–Gordon equation,

$$2g^{\mu\nu}(x)(\partial^x_\mu\sigma)(\partial^x_\nu u) + (\Box_x\sigma - 4)\,u = 0, \qquad (40)$$

where one has to take into account that the derivatives are with respect to $x$ and the metric is evaluated at $x$ in the above formula, which leads to additional terms in the calculation. We make the ansatz

$$u = 1 + \mu z_0^2 + \nu\vec z^{\,2} + \rho z_0^3 + \tau\vec z^{\,2}z_0 + \phi z_0^4 + \psi\vec z^{\,2}z_0^2 + \chi(\vec z^{\,2})^2, \qquad (41)$$

where all coefficients are evaluated at $y_0$. The first term being 1 and the absence of a first order term follow from the requirements that $u(x,x) = 1$ and $u(x,y) = u(y,x)$. The calculation of the coefficients then yields

$$\begin{aligned}
u ={}& 1 - \frac14(\dot H + H^2)z_0^2 + \frac{1}{12}(\dot H + 3H^2)a^2\vec z^{\,2}\\
&- \frac18(\ddot H + 2\dot HH)z_0^3 + \frac{1}{24}(\ddot H + 8\dot HH + 6H^3)a^2\vec z^{\,2}z_0\\
&+ \frac{1}{480}(-18\dddot H - 36\ddot HH - 17\dot H^2 + 38\dot HH^2 + 19H^4)z_0^4\\
&+ \frac{1}{240}(3\dddot H + 26\ddot HH + 17\dot H^2 + 52\dot HH^2 + H^4)a^2\vec z^{\,2}z_0^2\\
&+ \frac{1}{480}(4\ddot HH + 3\dot H^2 + 36\dot HH^2 + 29H^4)a^4(\vec z^{\,2})^2 + O(z^5),
\end{aligned} \qquad (42)$$

which can be checked to be symmetric under interchange of $x$ and $y$ by a straightforward calculation. This equation has been derived using computer algebra support, but has been validated up to second order by hand calculations. From this we obtain

Theorem 3.1. The most singular order $\mathcal{H}'$ of the Hadamard distribution is given by

$$-4\pi^2\,\mathcal{H}'(z)|_{z_0=0} = -\frac{1}{a^2\vec z^{\,2}} - \frac{1}{12}(\dot H + 2H^2) - \frac{1}{1440}a^2\vec z^{\,2}\,(12\ddot HH + 9\dot H^2 + 86\dot HH^2 + 51H^4) + O((\vec z^{\,2})^2), \qquad (43)$$

$$-4\pi^2\,\dot{\mathcal{H}}'(z)|_{z_0=0} = \frac{H}{a^2\vec z^{\,2}} - \frac{1}{24}(\ddot H + 4\dot HH) + O(\vec z^{\,2}), \qquad (44)$$

$$-4\pi^2\,\frac{\partial}{\partial x^0}\frac{\partial}{\partial y^0}\mathcal{H}'(z)|_{z_0=0} = \frac{2}{a^4(\vec z^{\,2})^2} - \frac{H^2}{a^2\vec z^{\,2}} - \frac{1}{240}(4\dddot H + 16\ddot HH + 27\dot H^2 + 30\dot HH^2 + 17H^4) + O(\vec z^{\,2}). \qquad (45)$$
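As a cross-check of the first equation of Theorem 3.1, the equal-time quotient $u/(2\sigma)$ can be expanded by elementary power series division in $Z = a^2\vec z^{\,2}$. The sketch below is an illustration under stated assumptions: it takes the $z_0 = 0$ coefficients of $2\sigma$ and $u$ as read from (31) and (42), assumes the normalization $-4\pi^2\mathcal{H}' = u/(2\sigma)$, and evaluates everything at rational sample values of $H$, $\dot H$, $\ddot H$. The dimensionally consistent term $86\dot HH^2$ is used in the comparison. Function names are chosen here.

```python
from fractions import Fraction as F

def series_quotient(L, M, N, P, Q, S):
    """Coefficients of (L + M Z + N Z^2) / (P + Q Z + S Z^2) up to order Z^2."""
    return (L / P,
            (M * P - L * Q) / P**2,
            (N * P**2 - L * P * S - M * P * Q + L * Q**2) / P**3)

def leading_term_from_expansions(H, Hd, Hdd):
    """Coefficients of 1/Z, Z^0, Z^1 in u/(2 sigma) at z0 = 0."""
    # Numerator: u = 1 + M Z + N Z^2 at z0 = 0.
    M = F(1, 12) * (Hd + 3 * H**2)
    N = F(1, 480) * (4 * Hdd * H + 3 * Hd**2 + 36 * Hd * H**2 + 29 * H**4)
    # Denominator: 2 sigma = -Z (1 + Q Z + S Z^2) at z0 = 0.
    Q = F(1, 12) * H**2
    S = F(1, 360) * (3 * Hd * H**2 + 4 * H**4)
    c0, c1, c2 = series_quotient(F(1), M, N, F(1), Q, S)
    return (-c0, -c1, -c2)

def leading_term_theorem(H, Hd, Hdd):
    """The same three coefficients as stated in the first equation of Theorem 3.1."""
    return (F(-1),
            -F(1, 12) * (Hd + 2 * H**2),
            -F(1, 1440) * (12 * Hdd * H + 9 * Hd**2 + 86 * Hd * H**2 + 51 * H**4))
```

Agreement at a handful of rational sample points does not replace the symbolic computation, but it is a quick guard against transcription errors in the coefficients.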
The proof is done by straightforward calculations where the occurring fractions in powers of $Z = \vec z^{\,2}$ are expanded in a power series like

$$\frac{L + MZ + NZ^2}{P + QZ + SZ^2} = \frac{L}{P} + \frac{MP - LQ}{P^2}Z + \frac{NP^2 - LPS - MPQ + LQ^2}{P^3}Z^2. \qquad (46)$$

Even if placeholders are inserted for the coefficients of the powers of $Z$ and $z_0$ in $\sigma$ and $u$, the calculations remain very lengthy, as the derivatives of the coefficients have to be taken into account. Therefore we will not show them here explicitly.

We calculate the subleading terms of the Hadamard parametrix in position space to the order relevant to calculate the homogeneous term of the energy-momentum tensor. This means we use the Hadamard recursion

$$2g^{\mu\nu}(x)(\partial^x_\mu\sigma)(\partial^x_\nu v_0) + (\Box_x\sigma - 2)\,v_0 = -(\Box_x + m^2 - \xi R(x))\,u, \qquad (47)$$

$$2g^{\mu\nu}(x)(\partial^x_\mu\sigma)(\partial^x_\nu v_1) + (\Box_x\sigma)\,v_1 = -(\Box_x + m^2 - \xi R(x))\,v_0 \qquad (48)$$

to calculate

$$\begin{aligned}
v_0 ={}& -\frac12\Big[\Big(\frac16 - \xi\Big)R + m^2\Big] + \frac14(\ddot H + 4\dot HH)z_0\\
&+ \frac{1}{240}\big((21 - 120\xi)\dddot H + (87 - 480\xi)\ddot HH + (54 - 300\xi)\dot H^2\\
&\qquad - (76 - 540\xi)\dot HH^2 - (58 - 360\xi)H^4 + 30m^2(\dot H + H^2)\big)z_0^2\\
&+ \frac{1}{240}\big(-\dddot H + (3 - 60\xi)\ddot HH + (6 - 60\xi)\dot H^2 + (76 - 540\xi)\dot HH^2\\
&\qquad + (58 - 360\xi)H^4 - 10m^2(\dot H + 3H^2)\big)a^2\vec z^{\,2} + O(z^3)
\end{aligned} \qquad (49)$$

and

$$v_1 = \frac{1}{120}\Big[(1 - 5\xi)\Box R + \frac{5}{12}(1 - 6\xi)^2R^2 - 2(\dot H + H^2)H^2 + 5(1 - 6\xi)Rm^2 + 15m^4\Big] + O(z). \qquad (50)$$
4. Mode Expansion

Two things remain to be done: the singular terms in (13) have to be related to the mode expansion $G_{\phi\phi,k}$ etc. of the two-point functions, and the homogeneous term of the energy-momentum tensor has to be calculated. First, the expressions in Theorem 3.1 can easily be Fourier transformed with a method explained, e.g., in [2]. Using the distributional Fourier transforms $F(|\vec z|^{-2})(k) = 2\pi^2k^{-1}$ and $F(|\vec z|^{-4})(k) = -\pi^2k$ with $k = |\vec k|$, and ignoring for now homogeneous terms, which only contribute terms proportional to the delta distribution in the zero mode, and positive order terms in $Z$, we get the mode expressions

$$F(\mathcal{H}'|_{z_0=0})(k) = a^{-2}\frac{1}{2k}, \qquad (51)$$

$$F(\dot{\mathcal{H}}'|_{z_0=0})(k) = -a^{-2}\frac{H}{2k}, \qquad (52)$$

$$F\Big(\frac{\partial}{\partial x^0}\frac{\partial}{\partial y^0}\mathcal{H}'\Big|_{z_0=0}\Big)(k) = a^{-4}\frac{k}{2} + a^{-2}\frac{H^2}{2k}. \qquad (53)$$
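The mode expressions (51)–(53) follow from the singular parts in Theorem 3.1 by applying the two Fourier transforms quoted above term by term. A minimal sketch of this bookkeeping is given below. It assumes that the singular part is supplied as the pair of coefficients of $|\vec z|^{-4}$ and $|\vec z|^{-2}$ in $-4\pi^2X$, and it uses arbitrary rational sample values for $a$ and $H$. The function name is chosen here for illustration.

```python
from fractions import Fraction as F

def mode_coefficients(c4, c2):
    """Fourier transform of X, where -4 pi^2 X = c4/|z|^4 + c2/|z|^2.

    Uses F(|z|^-2)(k) = 2 pi^2 / k and F(|z|^-4)(k) = -pi^2 k.
    Returns (coefficient of k, coefficient of 1/k); the powers of pi cancel.
    """
    coeff_k = -F(1, 4) * c4 * (-1)      # -(1/(4 pi^2)) * c4 * (-pi^2)
    coeff_inv_k = -F(1, 4) * c2 * 2     # -(1/(4 pi^2)) * c2 * (2 pi^2)
    return coeff_k, coeff_inv_k

def leading_modes(a, H):
    """Singular parts of the three equations of Theorem 3.1, mode by mode."""
    a, H = F(a), F(H)
    return {
        "H_leading": mode_coefficients(F(0), -1 / a**2),
        "H_leading_dot": mode_coefficients(F(0), H / a**2),
        "H_leading_dxdy": mode_coefficients(2 / a**4, -H**2 / a**2),
    }
```

For example, `leading_modes(2, 3)["H_leading_dxdy"]` should return the pair $(a^{-4}/2,\ a^{-2}H^2/2)$ evaluated at $a = 2$, $H = 3$, in line with (53).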
Here it needs to be taken into account that all singular orders in $\vec z^{\,2}$ that are encountered are locally integrable functions on $\mathbb{R}^3$, with the exception of the $1/\vec z^{\,4}$ term in $\frac{\partial}{\partial x^0}\frac{\partial}{\partial y^0}\mathcal{H}'|_{z_0=0}$. Continuation of $\frac{\partial}{\partial x^0}\frac{\partial}{\partial y^0}[\langle\phi(x)\phi(y)\rangle_\omega - \mathcal{H}_n(x,y)]|_{z_0=0}$ for $n \geq 2$ to $\vec z = 0$ thus implicitly induces a rotationally invariant regularization of $1/\vec z^{\,4}$, as it can be written as a difference of already regularized terms and the newly found continuous function. The analytic regularization [2] of $|\vec z|^\zeta$ is well defined except for poles at $\zeta = -3, -5, -7, \ldots$. It is clearly rotationally invariant. Furthermore, $|\vec z|^\zeta$ with $\zeta = -4$ is the only invariant extension of $1/\vec z^{\,4}$ on $\mathbb{R}^3$ that preserves the scaling degree, as there are no rotation invariant linear combinations of first order derivatives of the delta distribution at zero. But the difference of two extensions with the given properties has to be such a linear combination [2]. Hence we conclude that it has to coincide with the regularization induced by the continuation of $\frac{\partial}{\partial x^0}\frac{\partial}{\partial y^0}[\langle\phi(x)\phi(y)\rangle_\omega - \mathcal{H}_n(x,y)]|_{z_0=0}$ to $\vec z = 0$. The Fourier transform (53) can thus be calculated based on the analytic regularization prescription [2].

We now turn to the mode expansion of the subleading terms. Expanding the logarithmic factor with the help of (31) yields

$$\log\Big(-\frac{\sigma(x,y)}{\lambda^2}\Big)\Big|_{z_0=0} = \log\Big(\frac{a^2}{\lambda^2}\Big) + \log(\vec z^{\,2}) - \sum_{n=1}^{\infty}\Big(-\frac{1}{12}H^2a^2\vec z^{\,2} + O(\vec z^{\,4})\Big)^n, \qquad (54)$$

where the infinite sum can be truncated after a few relevant terms. Inserting (31) to expand the powers $\sigma(x,y)^k$ in (13), we easily see that the singular contribution from the second term in (13) to $[\langle\phi(x)\phi(y)\rangle_\omega - \mathcal{H}_n(x,y)]|_{z_0=0}$ is

$$\frac{1}{4\pi^2}\,v_0((t,\vec x),(t,\vec y))\Big(\log\frac{a^2}{\lambda^2} + \log(\vec z^{\,2})\Big). \qquad (55)$$

Likewise, using $\frac{1}{\sigma(x,y)}\big|_{z_0=0} = -\frac{1}{\vec z^{\,2}}\sum_{n=0}^{\infty}\big(\frac{1}{12}H^2a^2\vec z^{\,2} + O(\vec z^{\,4})\big)^n$, it is easily shown that the expansion of derivatives of the second term in (13) with respect to $\frac{\partial}{\partial x^0}$ and $\frac{\partial^2}{\partial x^0\partial y^0}$ at equal time into singular orders in $\vec z^{\,2}$, truncated after some sufficiently high order, contains only terms $\sim\log(\vec z^{\,2})$ and $\sim(\vec z^{\,2})^k$, $k \geq 0$, where the coefficients are made of $\log(\lambda)$, $\log(a)$ and $U_k((t,\vec x),(t,\vec y))$-terms and their derivatives with respect to $\vec z^{\,2}$ along with their time derivatives; for the case with the two time derivatives an additional term with the singularity structure $1/\vec z^{\,2}$ needs to be taken into account as well.
One could calculate the coefficients of the singular terms described above by calculating the derivatives of (49) and (50) and Fourier transforming the result by hand. However, these calculations are very tedious, so we proceed differently here. As we have analyzed the general shape of the terms to account for, we can Fourier transform these terms and include them in the mode expansion with as yet undetermined coefficients. Doing so, we use that $F(\log(|\vec z|))(k) = -4\pi^{3/2}\Gamma(\frac32)k^{-3}$. Note that this distribution does not extend to a locally integrable function at 0 and hence requires a regularization prescription. The details can be found in the Appendix. The remaining singular orders that occur have the mode expansions $F(\vec z^{\,2}\log(|\vec z|))(k) = 24\pi^{3/2}\Gamma(\frac32)k^{-5}$ and $F((\vec z^{\,2})^n)(k) = (2\pi)^3(-\Delta)^n\delta(k)$, where we again neglect the latter for the time being. Next we multiply the leading terms in the singular order expansion with the powers of $a$ to get expressions similar to the two-point function:

$$\mathcal{H}_{\phi\phi} = \frac{a^{-2}}{2k}, \qquad (56)$$

$$\mathcal{H}_{(\phi\pi)} = -\frac{aH}{2k}, \qquad (57)$$

$$\mathcal{H}_{\pi\pi} = \frac{a^2k}{2} + \frac{a^4H^2}{2k}. \qquad (58)$$

So, taking now into account the discussion above, we consider the ansatz

$$\mathcal{H}_{\phi\phi} = \frac{a^{-2}}{2k} + \frac{\alpha_3}{2k^3} + \frac{\alpha_5}{2k^5} + O(k^{-7}), \qquad (59)$$

$$\mathcal{H}_{(\phi\pi)} = -\frac{aH}{2k} + \frac{\beta_3}{2k^3} + \frac{\beta_5}{2k^5} + O(k^{-7}), \qquad (60)$$

$$\mathcal{H}_{\pi\pi} = \frac{a^2k}{2} + \frac{a^4H^2}{2k} + \frac{\gamma_1}{2k} + \frac{\gamma_3}{2k^3} + O(k^{-5}), \qquad (61)$$

which captures all possible singular orders except for those concentrated in the zero mode. Note that $\mathcal{H}_n$ fulfills the Klein–Gordon equation up to terms that vanish in the coincidence limit and a zero mode term. Plugging our ansatz for the singular order expansion into the system of equations (6), we obtain a set of equations that help us to determine the unspecified coefficients:

$$\frac{\dot\alpha_3}{2k^3} + \frac{\dot\alpha_5}{2k^5} = \frac{2a^{-3}\beta_3}{2k^3} + \frac{2a^{-3}\beta_5}{2k^5}, \qquad (62)$$

$$-\frac{a(\dot H + 2H^2)}{2k} + \frac{\dot\beta_3}{2k^3} = -\frac{(-\xi R + m^2)a + a\alpha_3}{2k} - \frac{(-\xi R + m^2)a^3\alpha_3 + a\alpha_5}{2k^3} + \frac{a^{-3}\gamma_1}{2k} + \frac{a^{-3}\gamma_3}{2k^3}, \qquad (63)$$

$$\frac{\dot\gamma_1}{2k} + \frac{\dot\gamma_3}{2k^3} + \frac{2a^4H(\dot H + 2H^2)}{2k} = -\frac{-2(-\xi R + m^2)a^4H + 2a\beta_3}{2k} - \frac{2(-\xi R + m^2)a^3\beta_3 + 2a\beta_5}{2k^3}, \qquad (64)$$
where we suppress undetermined orders. Assuming these equations to hold order by order in $k$, we get two sets of equations,

$$\dot\alpha_3 = 2a^{-3}\beta_3, \qquad (65)$$

$$\dot\gamma_1 = -2a\beta_3 + 2a^4H\Big[\Big(\frac16 - \xi\Big)R + m^2\Big], \qquad (66)$$

$$\alpha_3 = a^{-4}\gamma_1 - \Big(\frac16 - \xi\Big)R - m^2, \qquad (67)$$

and analogously

$$\dot\alpha_5 = 2a^{-3}\beta_5, \qquad (68)$$

$$\dot\gamma_3 = -2a\beta_5 - 2a^3\beta_3(-\xi R + m^2), \qquad (69)$$

$$\alpha_5 = a^{-4}\gamma_3 - a^{-1}\dot\beta_3 - a^2\alpha_3(-\xi R + m^2). \qquad (70)$$

We now want to solve these differential equations. The combination (65) + (66) $-\ \partial_t$(67) leads to

$$\dot\gamma_1 - 2H\gamma_1 = \frac{a^4}{2}\Big(\frac16 - \xi\Big)(\dot R + 2HR) + a^4Hm^2, \qquad (71)$$
(72) (73) (74)
where A is some constant. Proof. A differential equation of the form x(t) ˙ − 2f (t)x(t) = g(t)
(75)
June 3, J070-S0129055X11004357
544
2011 13:30 WSPC/S0129-055X
148-RMP
B. Eltzner & H. Gottschalk
has the solution t t
t x(t) = g(t ) exp 2 f (t )dt dt + A exp 2 f (t )dt t
0
(76)
0
t with A being constant. In our case f (t) = H, such that exp (2 t12 H(t )dt ) = We get the solution
t 1 1 2 2 ˙ − ξ (R + 2HR) + Hm a2 dt + Aa2 γ1 = a 2 6 0 2 t 1 a = − ξ R + m2 a2 dt + Aa2 ∂t 2 0 6 a4 1 = − ξ R + m2 + Aa2 2 6
a2 (t1 ) a2 (t2 ) .
(77) (78) (79)
where the constant A absorbs all constant terms and thus changes from line to line. 1 4 2 2aβ3 = −γ˙ 1 + 2a H −ξ R+m (80) 6 1 a4 1 4 2 −ξ R+m − − ξ R˙ − 2Aa2 H = −2a H (81) 6 2 6 1 − ξ R + m2 + 2a4 H (82) 6 a4 1 − ξ R˙ − 2Aa2 H =− (83) 2 6 and
1 − ξ R − m2 6 1 1 1 − ξ R + m2 + Aa−2 − − ξ R − m2 = 2 6 6 1 1 − ξ R + m2 + Aa−2 =− 2 6
α3 = a−4 γ1 −
(84) (85) (86)
yield the other two coefficients without solving differential equations. Applying the same strategy as before we get the solutions for the second system of equations 1 1 a2 ¨ + 5H R˙ − HR) ˙ − ξ (R −3 − ξ ξR2 α5 = (87) 8 6 6 − (4H˙ + 6H 2 + 6ξR)m2 − 3m4 , (88)
June 3, J070-S0129055X11004357
2011 13:30 WSPC/S0129-055X
148-RMP
Dynamical Backreaction in Robertson–Walker Spacetime
a3 α˙ 5 , 2 1 1 a6 ¨ + H R˙ + HR) ˙ − ξ (R − − ξ ξR2 γ3 = − 8 6 6 − 2(H 2 + ξR)m2 − m4 ,
β5 =
545
(89) (90) (91)
where we suppress the integration constant as it vanishes for a pure state due an argument given below. Theorem 4.1. The singular part of the Hadamard modes is equal to the singularities of the two-point function of a pure state if the integration constant A in Lemma 4.1 vanishes. That is, (i) ∀ k: Hφφ,k Hππ,k − H2(φπ),k = 14 + O(k −6 ) (ii) ∀ k: Hφφ,k + Hππ,k > 0 + O(k −3 ) holds if and only if A = 0. Proof. The first claim is equivalent to a2 H 2 + a−2 γ1 + a2 α3 − a2 H 2 = 0,
(92)
a−2 γ3 + a2 α5 + a4 H 2 α3 + α3 γ1 + 2aHβ3 = 0,
(93)
where Eq. (92) consists of the terms of order $k^{-2}$ and Eq. (93) represents the order $k^{-4}$. Equation (92) gives $2A = 0$. Analogously, Eq. (93) forces the other integration constant to vanish. To show that the second claim is satisfied we use the fact that $\mathcal{H}$ is a solution of the Klein–Gordon equation (6) up to $O(k^{-5})$. Then we use a form of the well-known deformation argument of Wald, which is explained in [18]. We deform the spacetime for early times such that $a$ is constant at early times. Thus the claim holds exactly at early times, as we just get the Minkowski two-point function. Then we propagate $\mathcal{H}$ with Eq. (6) to some later time, where the spacetime is undeformed. The claim is still satisfied at that time because the first claim shown above ensures that $\mathcal{H}_{\phi\phi,k}\mathcal{H}_{\pi\pi,k} > 0 + O(k^{-6})$. Finally, we undeform the spacetime at early times and can thus show the claim to hold for all times using Eq. (6).

From Theorem 4.1 we obtain a (tentative) parametrization of initial conditions of the field degrees of freedom for the equation of motion for pure states. For $k > 1$ we set $G_{\phi\phi,k} = \mathcal{H}_{\phi\phi,k} + a(k)$, $G_{\phi\pi,k} = \mathcal{H}_{(\phi\pi),k} + b(k)$ and $G_{\pi\pi,k} = (\frac14 + G^2_{(\phi\pi),k})/G_{\phi\phi,k}$, where $a(k)k^7$ and $b(k)k^7$ are in $C_b^\infty((1,\infty))$. Furthermore, $a(k)$ needs to be chosen such that $G_{\phi\phi,k} > 0$ for all such $k$. This can always be achieved as the leading term $a^{-2}/k$ is positive. For $k \leq 1$, these functions need to be continued to $C_b^\infty([0,\infty))$ such that the relations $G_{\phi\phi,k} + G_{\pi\pi,k} > 0$ and (8) are preserved. Obviously, with such initial conditions the trace of the initial energy-momentum tensor is finite. In particular, Hadamard states fall into this class. From the construction of $\mathcal{H}_{\phi\phi,k}$, $\mathcal{H}_{(\phi\pi),k}$ and $\mathcal{H}_{\pi\pi,k}$ it follows that these quantities fulfill (6) up to order $k^{-7}$ and $k^{-5}$, respectively. Thus, if the difference of $G_{\phi\phi,k}$ and $\mathcal{H}_{\phi\phi,k}$ (multiplied by $k^2$) is integrable for large $k$, this property will prevail after an infinitesimal time step. The argument for the difference of $G_{\pi\pi,k}$ and $\mathcal{H}_{\pi\pi,k}$ is analogous. For non-pure states, a positive $c(k)$ can be added to $G_{\pi\pi,k}$ such that $k^5c(k) \in C_b([0,\infty))$.

5. Equation of Motion

We begin with the easier case where the field is conformally coupled to the mean curvature, i.e. $\xi = 1/6$. In this case, (24) simplifies considerably, as it only contains the field dependent term $\langle\phi\phi\rangle_{\omega,\lambda,\xi}$. This term, up to zero mode contributions, is
$$\frac{1}{8\pi^3}\int_{\mathbb{R}^3}\Big(G_{\phi\phi,k} - \frac{a^{-2}}{2k} + \frac{m^2}{4k^3}\Big)dk, \qquad (94)$$

since $\alpha_3 = -\frac{m^2}{2}$. The renormalization prescription of $\int_{\mathbb{R}^3}k^{-3}\,dk$ at 0 is given in the Appendix. The zero mode terms in the conformally coupled case are

$$\frac{1}{72}R - \frac{1}{4\pi^2}v_0\log\Big(\frac{a^2}{\lambda^2}\Big), \qquad (95)$$

where the first term stems from the first equation in Theorem 3.1 and the second one from (55). In the conformally coupled case, $v_0(x,x)$ takes the simple form

$$v_0 = -\frac{m^2}{2}. \qquad (96)$$
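The subtraction in (94) relies on Lemma 4.1 with $A = 0$. As a sanity check of that lemma, the sketch below evaluates the residuals of (65)–(67) and the left-hand side of the purity condition (92) with exact rational arithmetic. It assumes a hypothetical power-law background $a(t) = t^p$ with integer $p$, for which $H = p/t$, $R = -6p(2p-1)/t^2$ and $\dot R = 12p(2p-1)/t^3$, and it uses $\dot a = Ha$ to differentiate (72) and (74) by hand. All names are chosen here for illustration.

```python
from fractions import Fraction as F

def background(t, p):
    """Hypothetical power-law background a = t**p (p integer, t rational)."""
    t = F(t)
    a = t**p
    H = F(p) / t
    R = -6 * F(p) * (2 * p - 1) / t**2
    R_dot = 12 * F(p) * (2 * p - 1) / t**3
    return a, H, R, R_dot

def lemma_4_1_residuals(t, p, xi, m2, A):
    """Residuals of (65), (66), (67) and the left-hand side of (92)."""
    xi, m2, A = F(xi), F(m2), F(A)
    a, H, R, R_dot = background(t, p)
    N = (F(1, 6) - xi) * R + m2           # the bracket [(1/6 - xi) R + m^2]
    N_dot = (F(1, 6) - xi) * R_dot
    # The coefficients as given in Lemma 4.1.
    alpha3 = -N / 2 + A / a**2
    beta3 = -(a**3) / 4 * (F(1, 6) - xi) * R_dot - A * H * a
    gamma1 = a**4 / 2 * N + A * a**2
    # Exact time derivatives, using da/dt = H a.
    alpha3_dot = -N_dot / 2 - 2 * A * H / a**2
    gamma1_dot = 2 * H * a**4 * N + a**4 / 2 * N_dot + 2 * A * H * a**2
    r65 = alpha3_dot - 2 * beta3 / a**3
    r66 = gamma1_dot - (-2 * a * beta3 + 2 * a**4 * H * N)
    r67 = alpha3 - (gamma1 / a**4 - (F(1, 6) - xi) * R - m2)
    purity = gamma1 / a**2 + a**2 * alpha3  # (92) after cancelling a^2 H^2
    return r65, r66, r67, purity
```

The first three residuals should vanish for every choice of parameters, and the last entry should equal $2A$, which is the statement "Equation (92) gives $2A = 0$" used in the proof of Theorem 4.1.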
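Wrapping up, we obtain

Theorem 5.1. For the case of conformal coupling, the equation of motion (24) is

$$\begin{aligned}
-R = 8\pi G\Big(&\frac{m^2}{8\pi^3}\int_{\mathbb{R}^3}\Big(G_{\phi\phi,k} - \frac{a^{-2}}{2k} + \frac{m^2}{4k^3}\Big)dk + m^2\Big(\frac{1}{72} + c'\Big)R\\
&+ \frac{1}{240\pi^2}(\dot HH^2 + H^4) + \Big(\frac{1}{2880\pi^2} + c''\Big)\Box R + \frac{m^4}{4\pi^2}\log\Big(\frac{a^2}{\lambda^2}\Big)\\
&- m^4\Big(\frac{1}{32\pi^2} - c\Big)\Big),
\end{aligned} \qquad (97)$$

where the $k^{-3}$ integral is regularized at zero as explained in the Appendix.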
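The purely geometric coefficients in the conformally coupled equation of motion can be traced back to $v_1$: at $\xi = 1/6$ the prefactor $(36\xi - 5)/(4\pi^2)$ of $v_1$ in (24) reduces to $1/(4\pi^2)$. The sketch below tabulates the coefficients of (20) as read here and rescales them accordingly. The dictionary keys and function names are labels chosen for illustration, and all values are in units of $1/\pi^2$.

```python
from fractions import Fraction as F

def v1_coefficients(xi):
    """Coefficients of v1 as read from (20), for the listed building blocks."""
    xi = F(xi)
    return {
        "HdotH2_plus_H4": F(1, 60),
        "BoxR": F(1, 24) * (F(1, 5) - xi),
        "Hdot2_plus_4H2Hdot_plus_4H4": -F(9, 2) * (F(1, 6) - xi) ** 2,
        "m4": -F(1, 8),
        "m2R": F(1, 4) * (F(1, 6) - xi),
    }

def v1_contribution_to_24(xi):
    """(36 xi - 5)/4 times v1, i.e. the v1 term of (24) in units of 1/pi^2."""
    xi = F(xi)
    pref = (36 * xi - 5) / 4
    return {name: pref * c for name, c in v1_coefficients(xi).items()}
```

At $\xi = 1/6$ this should give $1/240$ for $\dot HH^2 + H^4$, $1/2880$ for $\Box R$ and $-1/32$ for $m^4$, while the two remaining entries vanish. These are the numerical factors that multiply $1/\pi^2$ in Theorem 5.1.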
Wald’s fifth axiom [19, 20] states that the equation of motion should be a second order equation. (97) is second order if and only if the renormalization constants A, B, C and D in (11) are chosen such that c = −1/(2880π 2). Note however, that for this value of c the joint system differential equations (6) and (97) is an implicit (infinite dimensional) system of differential equations and for c = −1/(2880π 2) it is explicit, which is of advantage for proving existence and numerical solutions. The log(a) term that seems to be missing in [15] is due to the fact that the infra red prescription we employ for R3 fk(k) 3 dk (cf. Appendix) is not a dependent, whereas the prescription used in [15] is, leading to the absence of a standalone log(a) term in the latter case. For ξ = 1/6 the homogeneous terms of the Hadamard parametrix take a much more complicated form, as second derivatives of the parametrix must be taken into account. Ignoring a prefactor 8πG 4π 2 we have to calculate zero mode terms of u 1 − v0 log (2σ) − v1 2σ log (2σ) Cξ,m = (6ξ − 1)(∂x0 ∂y0 + a−2 ) −2 2σ 2 1 [(2 − 6ξ)m2 + (6ξ − 1)ξR](H˙ + 2H 2 ) 12 2
a 1 −2 + log (6ξ − 1) ∂x0 ∂y0 + a −v0 − v1 2σ λ2 2 − [(2 − 6ξ)m2 + (6ξ − 1)ξR]v0 −
(98)
in the coincidence limit x → x . This can be done using the expansions of u, v0 , v1 and σ 2 given in Sec. 3, applying the differential operator, setting z 0 = 0 and extracting terms of z -order zero. With a computer aided calculation this yields the result 1 ¨˙ + 53HH ¨ + 11H˙ 2 + 141HH ˙ 2 + 3H 4 ) Cξ,m = − (4H 30 1 ¨˙ + 113HH ¨ − 44H˙ 2 + 11HH ˙ 2 − 277H 4) + ξ(9H 5 ¨˙ + 12HH ¨ − 14H˙ 2 − 38HH ˙ 2 − 68H 4 ) − 108ξ 3 (H˙ + 2H 2 )2 − 6ξ 2 (H m2 ((21H˙ + 44H 2 ) − 6ξ(3H˙ + 8H 2 ) − 36ξ 2 (H˙ + 2H 2 )) 6 2 a m4 (1 − 6ξ) + log + 2 λ2 1 ¨˙ + 21HH ¨ + 42H˙ 2 + 152HH ˙ 2 + 116H 4) × − (3H 60
+
1 ¨˙ + 28HH ¨ + 31H˙ 2 + 106HH ˙ 2 + 58H 4 ) + ξ(4H 5
B. Eltzner & H. Gottschalk
¨˙ + 21HH ¨ − 6H˙ 2 − 36HH ˙ 2 − 72H 4 ) − 108ξ 3 (H˙ + 2H 2 )2 − ξ 2 (3H 3 2 2 4 ˙ + m (1 − 6ξ)(H + 2H ) − m (1 − 3ξ) . (99) 2 Although this result looks very tedious, one can simplify the result subsuming terms proportional to R, and m2 R into the renormalization degrees of freedom and the H independent term into a redefinition of the scale parameter λ. The simplified form is 25 ˙ 5 2 1 2 2 4 ˙ Cξ,m = − − R − R − R + 13HH + 23H 30 3 6 36 1 3 25 ˙ 20 2 2 4 ˙ + ξ − R − R − R + 23HH + 43H 5 2 3 9 5 1 1 ˙ 2 + 4H 4 − 3ξ 3 R2 − 6ξ 2 − R − R˙ − R2 + 2HH 6 6 2 m2 m4 1 7 (1 − 6ξ) + − R + 2H 2 − 6ξ − R + 2H 2 + 6ξ 2 R + 6 2 2 2 2 a 5 2 1 1 2 4 ˙ + log − − R + R − 4HH − 4H λ2 60 2 6 1 2 5 2 2 4 ˙ + ξ − R + R − 2HH − 2H 5 3 12 1 1 1 − ξ 2 − R − R2 − 3ξ 3 R2 + m2 (1 − 6ξ) R − m4 (1 − 3ξ) . 2 2 4 (100) Plugging in the calculated homogeneous term we obtain the equation of motion for the general case. Theorem 5.2. The equation of motion in the general case is given by the following expression −6 a γ3 a2 k a4 H 2 + γ1 − − − −R = 8πG (6ξ − 1) G dk ππ,k 8π 3 R3 2 2k 2k 3
α5 a−2 k α3 a−2 − − 3 dk + 3 k 2 Gφφ,k − 8π R3 2 2k 2k
α3 ξ a−2 − 3 dk + 3 6m2 + (6ξ − 1)R Gφφ,k − 8π 2k 2k R3 36ξ − 5 1 4 2 v1 + cm + c m R + c R . (101) − 2 Cξ,m + 4π 4π 2 The regularization of the integrals over k −3 at zero is given in the Appendix. We would like to point out that due to the 4th time derivatives in γ3 and α5 , (101) again leads to an implicit system of differential equations, as the leading
order time derivative cannot be isolated, regardless of the renormalization degrees of freedom and the specific form of $C_{\xi,m}(t)$. Also, Wald's fifth axiom can never be fulfilled in this case.

Appendix. Regularization Prescription on $k^{-3}$

Here we give the details of the regularization of the seemingly infrared divergent integrals
\[ \int_{\mathbb{R}^3} \Big( G_{\phi\phi,k} - \frac{a^{-2}}{2k} + \frac{m^2}{4k^3} \Big)\, dk \tag{102} \]
and related ones. For any Schwartz test function $\chi$ with $\chi(0) = 1$, this integral can be rewritten as
m2 m2 a−2 a−2 + 3 (1 − χ(k)) + Gφφ,k − χ(k)dk. Gφφ,k − χ(k) dk + 3 2k 4k 2k R3 R3 4k (103) Here the first integral is regular. The second — seemingly infrared divergent — integral needs to be properly traced back to its definition as distributional Fourier transform of log(r) smeared with the test function χ, which gives a regular integral after shifting the Fourier transform to the test function. Let us quickly calculate the Fourier transform in the sense of tempered distributions of the locally integrable function log(r), r = |x| for the convenience of the reader. Let ϕ( ˇ x) = ϕ(−x), then d ζ r , F (ϕ) ˇ F (log(r))(k), ϕ = log(r), F (ϕ) ˇ = lim ζ0 dζ ζ +3 Γ d 2ζ+3 π 3/2 2 k −ζ−3 , ϕ . = lim (104) ζ ζ0 dζ Γ − 2 Since ζ approaches 0 from below, k −ζ−3 is locally integrable. Furthermore, k −ζ−3 , ϕ can be analytically continued in ζ to C\N0 ([2]). The Laurent series of k −ζ−3 in ζ = 0 has a pole proportional to δ0 . Hence, for ϕ(0) = 0, k −ζ−3 , ϕ is analytic at ζ = 0 and in particular has a finite ζ derivative. So let us from now on assume ϕ(0) = 0 and perform the ζ derivative at the right-hand side of (104). Due to the pole of Γ(− 2ζ ) at ζ = 0, the only term that contributes in the limit ζ 0 is
ζ +3 ζ Γ − 2 2 lim 2ζ+3−1 π 3/2 k −ζ−3 , ϕ 2 ζ0 ζ Γ − 2 3 = −4π 3/2 Γ k −3 , ϕ, 2 Γ
(105)
where we used the Laurent series expansion of Γ at zero. Since, by dominated convergence, k −3 , ϕ = R3 k −3 ϕ(x)dx, we see that F (log(r))(k) is a regularization
2 of −4π 3/2 Γ 32 k −3 . Let now ϕ(0) = 0, and χ(k) = e−k . Since ϕ(x) = [ϕ(x) − 2 2 2 ϕ(0)e−k /2 ] + ϕ(0)e−k /2 = ϕ1 (x) + ϕ(0)e−k /2 . We have already shown how to evaluate F (log(r))(k) on ϕ1 . It remains to calculate ϕ(0) ϕ(0) 3 −k2 /2 −k2 /2 F (log(r)), ϕ(0)e = log(r), e = Γ . (106) 2 (2π)3/2 (2π)1/2 References [1] L. Brewin, Riemann normal coordinate expansions using Cadabra, Class. Quantum. Grav. 26(17) (2009) 175017. [2] F. Constantinescu, Distributionen und ihre Anwendungen in der Physik (Teubner, 1974). [3] C. Dappiaggi, K. Fredenhagen and N. Pinamonti, Stable cosmological models driven by a free quantum scalar field, Phys. Rev. D 77 (2008) 1129–1163. [4] B. Eltzner, The semiclassical Einstein equation on Robertson–Walker spacetimes with backreaction, Diploma thesis, Bonn (2008). [5] S. A. Fulling, Aspects of Quantum Field Theory in Curved Spacetime (Cambridge University Press, 1989). ¨ [6] F. Gackstatter, Uber Volumendefekte und Kr¨ ummung bei der Robertson–Walker– Metrik und Konstruktion kosmologischer Modelle, Ann. Phys. 44(6) (1987) 423–439. [7] W. Junker and E. Schrohe, Adiabatic vacuum states on general spacetime manifolds: Definition, construction, and physical properties, Ann. Henri Poincar´e 3(6) (2002) 1113–1181. [8] T.-P. Hack, On the backreaction of scalar and spinor quantum fields on curved space times, Dissertation, Hamburg (2010). [9] E. R. Harrison, Fluctuations at the threshold of classical cosmology, Phys. Rev. D 1 (1970) 2726–2730. [10] Ya. B. Zel’dovich, A hypothesis unifying the structure and the entropy of the universe, Mon. Not. R. Astron. Soc. 160 (1972) 1P–3P. [11] A. D. Linde, Particle Physics and Inflationary Cosmology (Harwood, Chur, Switzerland, 1990). [12] A. D. Linde, Inflation and Quantum Cosmology (Academic Press, Boston, 1990). [13] C. L¨ uders and J. E. Roberts, Local quasiequivalence and adiabatic vacuum states, Comm. Math. Phys. 134 (1990) 29–63. [14] H. 
Olbermann, States of low energy on Robertson–Walker spacetimes, Class. Quantum Grav. 24 (2007) 5011–5030. [15] N. Pinamonti, On the initial conditions and solutions of the semi-classical Einstein equations in a cosmological scenario (January, 2010); to appear in Comm. Math. Phys.; arXiv:1001.0864v1 [gr-qc]. [16] V. Moretti, Comments on the stress-energy tensor operator in curved spacetime, Comm. Math. Phys. 232 (2003) 189–221; arXiv:gr-qc/0109048. [17] A. A. Starobinski, A new type of isotropic cosmological models without singularity, Phys. Lett. B 91 (1980) 99–101. [18] R. M. Wald, Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics (The University of Chicago Press, 1994).
[19] R. M. Wald, The back reaction effect in particle creation in curved spacetime, Comm. Math. Phys. 54 (1977) 1–19. [20] R. M. Wald, Trace anomaly of a conformally invariant quantum field in curved spacetime, Phys. Rev. D 17 (1978) 1477–1484. [21] S. Weinberg, Cosmology (Oxford University Press, 2008).
June 3, J070-S0129055X11004369
2011 13:30 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 5 (2011) 553–574 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004369
HYPERFINE SPLITTING OF THE DRESSED HYDROGEN ATOM GROUND STATE IN NON-RELATIVISTIC QED
L. AMOUR∗,‡ and J. FAUPIN†,§
∗Laboratoire de Mathématiques EDPPM, EA-4535, Université de Reims, Moulin de la Housse, BP 1039, 51687 REIMS Cedex 2, France
†Institut de Mathématiques de Bordeaux, UMR-CNRS 5251, Université de Bordeaux 1, 351 cours de la libération, 33405 Talence Cedex, France
‡[email protected]
§[email protected]
Received 24 June 2010
Revised 31 March 2011
We consider a spin-1/2 electron and a spin-1/2 nucleus interacting with the quantized electromagnetic field in the standard model of non-relativistic QED. For a fixed total momentum sufficiently small, we study the multiplicity of the ground state of the reduced Hamiltonian. We prove that the coupling between the spins of the charged particles and the electromagnetic field splits the degeneracy of the ground state.
Keywords: Quantum electrodynamics; hyperfine splitting; multiplicity of the ground state; Feshbach–Schur map.
Mathematics Subject Classification 2010: 81V10, 81V45, 81Q10, 81Q15
1. Introduction

This paper is concerned with the spectral analysis of the quantum Hamiltonian associated with a free hydrogen atom, in the context of non-relativistic QED. Before describing our result more precisely, we begin with recalling a few well-known facts about the spectrum of Hydrogen in the case where the corrections due to quantum electrodynamics are not taken into account. For more details, we refer the reader to classical textbooks on Quantum Mechanics (see, e.g., [29, 15]). See also [11, 26, 6].

We consider a neutral hydrogenoid system composed of one electron with spin 1/2 and one nucleus with spin 1/2. The Pauli–Hamiltonian in $L^2(\mathbb{R}^6; \mathbb{C}^4)$ associated with this system can be written in the following way:
\[ H^{Pa} := \frac{1}{2m_{el}} \big(p_{el} - \alpha^{1/2} A_n(x_{el})\big)^2 - \frac{\alpha^{1/2}}{2m_{el}}\, \sigma^{el} \cdot B_n(x_{el}) + \frac{1}{2m_n} \big(p_n + \alpha^{1/2} A_{el}(x_n)\big)^2 + \frac{\alpha^{1/2}}{2m_n}\, \sigma^{n} \cdot B_{el}(x_n) - \frac{\alpha}{|x_{el} - x_n|}. \tag{1.1} \]
Here the units are chosen such that $\hbar = c = 1$, where $\hbar = h/2\pi$, $h$ is the Planck constant, and $c$ is the velocity of light. The notations $m_{el}$, $x_{el}$ and $p_{el} = -i\nabla_{x_{el}}$ (respectively, $m_n$, $x_n$ and $p_n = -i\nabla_{x_n}$) stand for the mass, the position and the momentum of the electron (respectively, of the nucleus), and $\alpha = e^2$ is the fine-structure constant (with $e$ the charge of the electron). Moreover, $\sigma^{el} = (\sigma_1^{el}, \sigma_2^{el}, \sigma_3^{el})$ (respectively, $\sigma^{n}$) are the Pauli matrices accounting for the spin of the electron (respectively, of the nucleus), and $A_n(x_{el})$ is the vector potential of the electromagnetic field generated by the nucleus at the position of the electron, that is $A_n(x_{el}) = C\alpha^{1/2} (\sigma^n \wedge (x_{el} - x_n))/(m_n |x_{el} - x_n|^3)$ where $C$ is a positive constant (and similarly for $A_{el}(x_n)$). Finally, $B_n(x_{el}) = i p_{el} \wedge A_n(x_{el})$ and $B_{el}(x_n) = i p_n \wedge A_{el}(x_n)$.

The Hamiltonian $H^{Pa}$ can be derived from the Dirac equation in the non-relativistic regime. It allows one to justify the so-called hyperfine structure of the ground state of the Hydrogen atom. More precisely, let $H^{Pa}(0)$ be the Hamiltonian obtained when the total momentum vanishes. Then $H^{Pa}(0)$ in $L^2(\mathbb{R}^3; \mathbb{C}^4)$ can be decomposed into a sum of four terms, $H^{Pa}(0) = H_0 + H_1 + H_2 + H_3$, where $H_0 = p_r^2/(2\mu) - \alpha/|r|$ (here $\mu$ denotes the reduced mass of the atom and $p_r = -i\nabla_r$), $H_1$ is the orbital interaction, $H_2$ is the spin-orbit interaction, and $H_3$ is the spin-spin interaction (see, e.g., [6, Chap. 4] and [2] for details). It is seen that $H_0$ has a 4-fold degenerate ground state. The correction terms, $H_1$, $H_2$, and $H_3$, produce an energy shift. Moreover, under the influence of the spin-spin interaction, the unperturbed ground state eigenvalue splits into two parts: a simple eigenvalue associated with a unique ground state, and a 3-fold degenerate eigenvalue. This phenomenon is referred to as the hyperfine splitting of the hydrogen atom ground state. Let us mention that this splitting explains the famous observed 21-cm hydrogen line.
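The 1 + 3 pattern of the splitting can be illustrated on the spin space $\mathbb{C}^2 \otimes \mathbb{C}^2$ alone. The following is a minimal sketch under the assumption (not spelled out in the text above) that the spin-spin interaction acts on the spin degrees of freedom through an operator proportional to $\sigma^{el} \cdot \sigma^{n}$; the function names are illustrative. It verifies the identity $X^2 = 3 - 2X$ for $X = \sigma^{el} \cdot \sigma^{n}$, so that the only possible eigenvalues are 1 and −3, and then reads off the multiplicities from the dimension and the trace.

```python
# Pauli matrices as nested lists (entries may be complex).
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]


def kron(a, b):
    """Kronecker product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spin_spin():
    """Return X = sigma^el . sigma^n as a 4x4 matrix (assumed model of H_3)."""
    total = [[0] * 4 for _ in range(4)]
    for s in (S1, S2, S3):
        term = matmul(kron(s, I2), kron(I2, s))
        total = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(total, term)]
    return total


def multiplicities():
    """Multiplicities of the eigenvalues 1 and -3 of X."""
    x = spin_spin()
    x2 = matmul(x, x)
    # Check X^2 = 3 - 2X, which forces every eigenvalue to be 1 or -3.
    for i in range(4):
        for j in range(4):
            expected = (3 if i == j else 0) - 2 * x[i][j]
            assert abs(x2[i][j] - expected) < 1e-12
    trace = sum(x[i][i] for i in range(4)).real
    # With a + b = 4 and a - 3b = trace, b is the multiplicity of -3.
    b = round((4 - trace) / 4)
    return {1: 4 - b, -3: b}
```

Under the stated assumption, the non-degenerate eigenvalue corresponds to the singlet and the 3-fold degenerate one to the triplet.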
In this paper, we investigate the hyperfine structure of the hydrogen atom in the standard model of non-relativistic QED. We aim at establishing that a hyperfine splitting does occur in the framework of non-relativistic QED. The Hamiltonian is still given by the expression (1.1), except that $A_n(x_{el})$ and $A_{el}(x_n)$ are replaced by the vector potentials of the quantized electromagnetic field in the Coulomb gauge (and likewise for $B_n(x_{el})$ and $B_{el}(x_n)$; precise definitions will be given in Sec. 2.1 below). Moreover the energy of the free photon field is added. Since both the electron and the nucleus are treated as moving particles, the total Hamiltonian, $H_g$, is translation invariant. Here $g$ denotes a coupling parameter depending on the fine-structure constant $\alpha$. The translation invariance implies that $H_g$ admits a direct integral decomposition, $H_g \sim \int^{\oplus}_{\mathbb{R}^3} H_g(P)\,dP$, with respect to the total momentum $P$ of the system. We set $E_g(P) := \inf \sigma(H_g(P))$. In [5], it is established that, for $g$ and $P$ sufficiently small, $E_g(P)$ is an eigenvalue of $H_g(P)$, that is $H_g(P)$ has a ground state. We also mention [27] where the existence of a ground state for $H_g(P)$ is obtained for any value of $g$, under the assumption that $E_g(0) \le E_g(P)$. Using a method due to [23], it is proven in [5] that the multiplicity of $E_g(P)$ cannot exceed the multiplicity of $E_0(P) := \inf \sigma(H_0(P))$, where $H_0(P) := H_{g=0}(P)$ denotes the non-interacting Hamiltonian. In other words,
\[ (0 <)\ \dim \mathrm{Ker}(H_g(P) - E_g(P)) \le \dim \mathrm{Ker}(H_0(P) - E_0(P)). \tag{1.2} \]
Our purpose is to determine whether the inequality in (1.2) is strict, or, on the contrary, is an equality. Of course, the multiplicity of $E_g(P)$ depends on the value of the spins of the charged particles. If the spin of the electron is neglected and the spin of the nucleus is equal to 0, then $E_0(P)$ is simple, and hence, according to (1.2), $E_g(P)$ is also a simple eigenvalue. In particular, (1.2) is an equality. If the spin of the electron is taken into account, and the spin of the nucleus is equal to 0, then $E_0(P)$ is twice-degenerate. Using Kramers' degeneracy theorem (see [28]), one can prove that the multiplicity of $E_g(P)$ is even. Therefore, by (1.2), $E_g(P)$ is also twice-degenerate, and hence (1.2) is again an equality. We refer the reader to [25, 32, 30, 22, 28] for results on the twice-degeneracy of the ground state of various QED models.

Consider now a hydrogen atom composed of a spin-1/2 electron and a spin-1/2 nucleus (e.g. a proton). In this case, the multiplicity of $E_0(P)$ is equal to 4. Our main result states that
\[ \dim \mathrm{Ker}(H_g(P) - E_g(P)) < \dim \mathrm{Ker}(H_0(P) - E_0(P)) = 4, \tag{1.3} \]
for $g \neq 0$ small enough. Equation (1.3) can be interpreted as a hyperfine splitting of the ground state of $H_g(P)$. In other words, the Hamiltonian of a freely moving hydrogen atom at a fixed total momentum in non-relativistic QED contains hyperfine interaction terms which split the degeneracy of the ground state, in the same way as for the Pauli–Hamiltonian of Quantum Mechanics mentioned above. Pursuing the analogy with the Pauli–Hamiltonian (1.1), one can conjecture that $E_g(P)$ is simple. Proving this is however beyond the scope of the present paper.

We also mention that non-relativistic QED provides a suitable framework to rigorously justify radiative decay and Bohr's frequency condition (see [9, 10, 1, 31] for the case of atomic systems with an infinitely heavy nucleus). In particular, save for the ground state, all stationary states are expected to turn into metastable states with a finite lifetime. Hence in relation with the 21-cm hydrogen line mentioned above, one can expect that a resonance appears near the ground state energy $E_g(P)$, with a very small imaginary part. Showing this would presumably require the use of complex dilatations together with renormalization techniques as in [9].

The case of a nucleus of spin ≥ 1 is not considered here (for instance, the nucleus of deuterium, composed of one proton and one neutron, can be treated as a spin-1 particle), but we expect that a similar hyperfine splitting of the ground state occurs in this case also. As for positively charged hydrogenoid ions, the question of the existence of a ground state is more subtle than for the hydrogen atom. Indeed, it is proven in [21] that the Hamiltonian of a positive ion at a fixed total momentum in non-relativistic QED does not have a ground state in Fock space. This result should be compared with the corresponding one for the model of a freely moving, dressed
non-relativistic electron in non-relativistic QED, which has been studied recently by several authors (see, among other papers, [12, 13, 8, 21, 14, 28, 19]; see also [4]). Let us finally mention that the ground state degeneracy of the non-relativistic hydrogen atom confined by its center of mass (see [3, 16]) could also be analyzed by the techniques developed here, provided that both the electron and the nucleus have a spin equal to 1/2.

2. Definition of the Model and Statement of the Main Result

2.1. Definition of the model

In the standard model of non-relativistic QED, the Hamiltonian associated with the system we consider acts on the Hilbert space $\mathcal{H} := \mathcal{H}_{at} \otimes \mathcal{H}_{ph}$ where
\[ \mathcal{H}_{at} := L^2(\mathbb{R}^3; \mathbb{C}^2) \otimes L^2(\mathbb{R}^3; \mathbb{C}^2) \sim L^2(\mathbb{R}^6; \mathbb{C}^4) \tag{2.1} \]
is the Hilbert space for the charged particles (the electron and the nucleus), and
\[ \mathcal{H}_{ph} := \mathbb{C} \oplus \bigoplus_{n=1}^{\infty} S_n \big[ L^2(\mathbb{R}^3 \times \{1, 2\})^{\otimes n} \big] \tag{2.2} \]
is the symmetric Fock space for the photons. Here $S_n$ denotes the symmetrization operator. The Hamiltonian of the system, $H^{SM}$, is formally given by the expression
\[ H^{SM} := \frac{1}{2m_{el}} \big(p_{el} - \alpha^{1/2} A(x_{el})\big)^2 + \frac{1}{2m_n} \big(p_n + \alpha^{1/2} A(x_n)\big)^2 + V(x_{el}, x_n) + H_{ph} - \frac{\alpha^{1/2}}{2m_{el}}\, \sigma^{el} \cdot B(x_{el}) + \frac{\alpha^{1/2}}{2m_n}\, \sigma^{n} \cdot B(x_n), \tag{2.3} \]
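where $x_{el}$, $x_n$, $p_{el}$, $p_n$ and $\alpha$ are defined as in (1.1). For $x \in \mathbb{R}^3$, $A(x)$ is defined by
\[ A(x) := \frac{1}{2\pi} \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \frac{\chi_\Lambda(k)}{|k|^{1/2}}\, \varepsilon^\lambda(k) \big[ e^{-ik\cdot x} a^*_\lambda(k) + e^{ik\cdot x} a_\lambda(k) \big]\, dk, \tag{2.4} \]
and $B(x)$ is given by
\[ B(x) := -\frac{i}{2\pi} \sum_{\lambda=1,2} \int_{\mathbb{R}^3} |k|^{1/2} \chi_\Lambda(k) \Big( \frac{k}{|k|} \wedge \varepsilon^\lambda(k) \Big) \big[ e^{-ik\cdot x} a^*_\lambda(k) - e^{ik\cdot x} a_\lambda(k) \big]\, dk, \tag{2.5} \]
where the polarization vectors $\varepsilon^1(k)$ and $\varepsilon^2(k)$ are chosen in the following way:
\[ \varepsilon^1(k) := \frac{(k_2, -k_1, 0)}{\sqrt{k_1^2 + k_2^2}}, \qquad \varepsilon^2(k) := \frac{k}{|k|} \wedge \varepsilon^1(k) = \frac{(-k_1 k_3, -k_2 k_3, k_1^2 + k_2^2)}{\sqrt{k_1^2 + k_2^2}\, \sqrt{k_1^2 + k_2^2 + k_3^2}}. \tag{2.6} \]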
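As a quick sanity check of the polarization vectors, the sketch below evaluates the explicit component formulas of (2.6) at a sample momentum and computes the Gram matrix of $(\varepsilon^1(k), \varepsilon^2(k), k/|k|)$, which should be the identity (orthonormality and transversality). The function names are illustrative, and the sketch assumes $k_1^2 + k_2^2 \neq 0$, where the formulas are well defined.

```python
import math


def polarization_vectors(k):
    """Return (eps1, eps2, k_hat) from the explicit components in (2.6)."""
    k1, k2, k3 = k
    rho = math.hypot(k1, k2)                      # sqrt(k1^2 + k2^2), assumed nonzero
    norm = math.sqrt(k1 * k1 + k2 * k2 + k3 * k3)  # |k|
    eps1 = (k2 / rho, -k1 / rho, 0.0)
    eps2 = (-k1 * k3 / (rho * norm), -k2 * k3 / (rho * norm), rho / norm)
    k_hat = (k1 / norm, k2 / norm, k3 / norm)
    return eps1, eps2, k_hat


def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_matrix(k):
    """Gram matrix of (eps1, eps2, k_hat); the identity for an orthonormal frame."""
    vecs = polarization_vectors(k)
    return [[dot(u, v) for v in vecs] for u in vecs]
```

For instance, `gram_matrix((0.3, -1.2, 2.0))` agrees with the 3 × 3 identity up to rounding.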
In (2.4) and (2.5), $\chi_\Lambda(k)$ denotes an ultraviolet cutoff function which, for the sake of concreteness, we choose as
\[ \chi_\Lambda(k) := \mathbb{1}_{|k| \le \Lambda\alpha^2}(k). \tag{2.7} \]
Here, $\Lambda$ is supposed to be a given arbitrary (large and) positive parameter. As explained in [10, 31], the model is physically relevant if we assume that $1 \ll \Lambda \ll \alpha^{-2}$. The reason for introducing $\alpha^2$ into the definition (2.7) will appear below (see (2.18)). As usual, for any $h \in L^2(\mathbb{R}^3 \times \{1, 2\})$, we set
\[ a^*(h) := \sum_{\lambda=1,2} \int_{\mathbb{R}^3} h(k, \lambda)\, a^*_\lambda(k)\, dk, \qquad a(h) := \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \bar{h}(k, \lambda)\, a_\lambda(k)\, dk, \tag{2.8} \]
and $\Phi(h) := a^*(h) + a(h)$, where the creation and annihilation operators, $a^*_\lambda(k)$ and $a_\lambda(k)$, obey the canonical commutation relations
\[ [a_\lambda(k), a_{\lambda'}(k')] = [a^*_\lambda(k), a^*_{\lambda'}(k')] = 0, \qquad [a_\lambda(k), a^*_{\lambda'}(k')] = \delta_{\lambda\lambda'}\, \delta(k - k'). \tag{2.9} \]
Hence, in particular, for $j \in \{1, 2, 3\}$, we have $A_j(x) = \Phi(h^A_j(x))$ and $B_j(x) = \Phi(h^B_j(x))$, with
\[ h^A_j(x, k, \lambda) := \frac{1}{2\pi} \frac{\chi_\Lambda(k)}{|k|^{1/2}}\, \varepsilon^\lambda_j(k)\, e^{-ik\cdot x}, \tag{2.10} \]
\[ h^B_j(x, k, \lambda) := -\frac{i}{2\pi} |k|^{1/2} \chi_\Lambda(k) \Big( \frac{k}{|k|} \wedge \varepsilon^\lambda(k) \Big)_j e^{-ik\cdot x}. \tag{2.11} \]
The Coulomb potential $V(x_{el}, x_n)$ is given by
\[ V(x_{el}, x_n) \equiv V(x_{el} - x_n) := -\frac{\alpha}{|x_{el} - x_n|}, \tag{2.12} \]
and $H_{ph}$ is the Hamiltonian of the free photon field, defined by
\[ H_{ph} := \sum_{\lambda=1,2} \int_{\mathbb{R}^3} |k|\, a^*_\lambda(k) a_\lambda(k)\, dk. \tag{2.13} \]
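The triples $\sigma^{el} = (\sigma_1^{el}, \sigma_2^{el}, \sigma_3^{el})$ and $\sigma^{n} = (\sigma_1^{n}, \sigma_2^{n}, \sigma_3^{n})$ are the Pauli matrices associated with the spins of the electron and the nucleus, respectively. They can be written as 4 × 4 matrices in the following way:
\[ \sigma_1^{el} = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad \sigma_2^{el} = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & -i \\ i & 0 & 0 & 0 \\ 0 & i & 0 & 0 \end{pmatrix}, \quad \sigma_3^{el} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \tag{2.14} \]
\[ \sigma_1^{n} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad \sigma_2^{n} = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad \sigma_3^{n} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{2.15} \]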
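The 4 × 4 matrices in (2.14) and (2.15) have a tensor product structure: the electron matrices are $\sigma_j \otimes 1$ and the nuclear ones are $1 \otimes \sigma_j$, with $\sigma_j$ the usual 2 × 2 Pauli matrices. The following sketch (helper names are illustrative, not taken from the paper) builds them with a Kronecker product and checks that electron and nuclear spin operators commute.

```python
# Standard 2x2 Pauli matrices and the 2x2 identity.
PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}
I2 = [[1, 0], [0, 1]]


def kron(a, b):
    """Kronecker product of two square matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Electron spin acts on the first tensor factor, nuclear spin on the second.
SIGMA_EL = {j: kron(PAULI[j], I2) for j in PAULI}
SIGMA_N = {j: kron(I2, PAULI[j]) for j in PAULI}


def commutator(a, b):
    """Return the commutator ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def is_zero(m):
    """True if all entries vanish up to rounding."""
    return all(abs(x) < 1e-12 for row in m for x in row)
```

With this ordering of the basis, `SIGMA_EL[3]` is the diagonal matrix with entries (1, 1, −1, −1) and `SIGMA_N[3]` the one with entries (1, −1, 1, −1), matching the displayed matrices.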
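In order to exhibit the perturbative behavior of the interaction between the charged particles and the photon field, we proceed to a change of units. More precisely, let $U : \mathcal{H} \to \mathcal{H}$ be the unitary operator associated with the scaling
\[ (x_{el}, x_n, k_1, \lambda_1, \dots, k_n, \lambda_n) \mapsto (x_{el}/\alpha, x_n/\alpha, \alpha^2 k_1, \lambda_1, \dots, \alpha^2 k_n, \lambda_n). \tag{2.16} \]
We have
\[ \frac{1}{\alpha^2}\, U H^{SM} U^* = \frac{1}{2m_{el}} \big(p_{el} - \alpha^{3/2} \tilde{A}(\alpha x_{el})\big)^2 + \frac{1}{2m_n} \big(p_n + \alpha^{3/2} \tilde{A}(\alpha x_n)\big)^2 + H_{ph} - \frac{1}{|x_{el} - x_n|} - \frac{\alpha^{3/2}}{2m_{el}}\, \sigma^{el} \cdot \tilde{B}(\alpha x_{el}) + \frac{\alpha^{3/2}}{2m_n}\, \sigma^{n} \cdot \tilde{B}(\alpha x_n), \tag{2.17} \]
where $\tilde{A}$ and $\tilde{B}$ are defined in the same way as $A$ and $B$, except that the ultraviolet cutoff function $\chi_\Lambda(k)$ is replaced by
\[ \tilde{\chi}_\Lambda(k) := \chi_\Lambda(\alpha^2 k) = \mathbb{1}_{|k| \le \Lambda}(k). \tag{2.18} \]
To simplify the notations, we redefine $\tilde{\chi}_\Lambda = \chi_\Lambda$, $A = \tilde{A}$ and $B = \tilde{B}$. Setting $g := \alpha^{3/2}$, we are thus led to study the Hamiltonian
\[ H_g^{SM} := \frac{1}{2m_{el}} \big(p_{el} - g A(g^{2/3} x_{el})\big)^2 + \frac{1}{2m_n} \big(p_n + g A(g^{2/3} x_n)\big)^2 + H_{ph} - \frac{1}{|x_{el} - x_n|} - \frac{g}{2m_{el}}\, \sigma^{el} \cdot B(g^{2/3} x_{el}) + \frac{g}{2m_n}\, \sigma^{n} \cdot B(g^{2/3} x_n). \tag{2.19} \]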
Let the total mass, $M$, and the reduced mass, $\mu$, be defined respectively by
\[ M := m_{el} + m_n, \qquad \frac{1}{\mu} := \frac{1}{m_{el}} + \frac{1}{m_n}. \tag{2.20} \]
Let
\[ r := x_{el} - x_n, \qquad R := \frac{m_{el}}{M} x_{el} + \frac{m_n}{M} x_n, \qquad \frac{p_r}{\mu} := \frac{p_{el}}{m_{el}} - \frac{p_n}{m_n}, \qquad P_R := p_{el} + p_n. \tag{2.21} \]
For $g = 0$, the Hamiltonian $H_0^{SM} := H_{g=0}^{SM}$ is given by
\[ H_0^{SM} = \frac{p_{el}^2}{2m_{el}} + \frac{p_n^2}{2m_n} - \frac{1}{|x_{el} - x_n|} + H_{ph} = H_R + H_r + H_{ph}, \tag{2.22} \]
where the Schrödinger operators $H_R$ and $H_r$ on $L^2(\mathbb{R}^3)$ are defined by
\[ H_R := \frac{P_R^2}{2M}, \qquad H_r := \frac{p_r^2}{2\mu} - \frac{1}{|r|}. \tag{2.23} \]
Let $e_0 := -\mu/2$ be the ground state eigenvalue of $H_r$ and $e_1$ be the first eigenvalue above $e_0$. Note that a normalized eigenstate associated with $e_0$ is given by
\[ \phi_0(r) := (\pi^{-1} \mu^3)^{1/2} e^{-\mu |r|}. \tag{2.24} \]
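The normalization of $\phi_0$ in (2.24) and the value $e_0 = -\mu/2$ can be checked numerically by radial integration. The sketch below uses a composite Simpson rule; the cutoff radius `r_max = 40 / mu`, the grid size and the function names are illustrative choices, not taken from the paper. The kinetic term uses $|\nabla \phi_0|^2 = \mu^2 \phi_0^2$, which holds for a radial exponential.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def ground_state_checks(mu):
    """Return (norm squared of phi_0, expectation of H_r in phi_0)."""
    def phi(r):
        return math.sqrt(mu ** 3 / math.pi) * math.exp(-mu * r)

    r_max = 40.0 / mu  # the exponential tail beyond this radius is negligible

    def shell(r):
        return 4 * math.pi * r * r  # area of the sphere of radius r

    norm2 = simpson(lambda r: shell(r) * phi(r) ** 2, 0.0, r_max)
    # Kinetic energy <p_r^2> / (2 mu), using |grad phi_0|^2 = mu^2 phi_0^2.
    kinetic = simpson(lambda r: shell(r) * (mu * phi(r)) ** 2, 0.0, r_max) / (2 * mu)
    # Potential energy -<1/|r|>; the factor 1/r cancels one power of r.
    potential = -simpson(lambda r: 4 * math.pi * r * phi(r) ** 2, 0.0, r_max)
    return norm2, kinetic + potential
```

For any reduced mass `mu > 0`, the first returned value should be close to 1 and the second close to `-mu / 2`.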
To conclude this subsection, we recall the definition of the photon number operator, $N_{ph}$, which will be used in the sequel:
\[ N_{ph} := \sum_{\lambda=1,2} \int_{\mathbb{R}^3} a^*_\lambda(k) a_\lambda(k)\, dk. \tag{2.25} \]

2.2. Fiber decomposition

The Hamiltonian $H_g^{SM}$ is translation invariant in the sense that $H_g^{SM}$ formally commutes with the total momentum operator $P_{tot} := P_R + P_{ph}$, where $P_{ph}$ denotes the momentum operator of the photon field, given by the expression
\[ P_{ph} := \sum_{\lambda=1,2} \int_{\mathbb{R}^3} k\, a^*_\lambda(k) a_\lambda(k)\, dk. \tag{2.26} \]
In the same way as in [5], it follows that $H_g^{SM}$ can be decomposed into a direct integral, which is expressed in the following proposition.

Proposition 2.1 ([5]). There exists $g_c > 0$ such that for all $|g| \le g_c$, the following holds: the Hamiltonian $H_g^{SM}$ given by the formal expression (2.19) identifies with a self-adjoint operator which is unitarily equivalent to the direct integral $\int^{\oplus}_{\mathbb{R}^3} H_g(P)\,dP$. Moreover, for all $P \in \mathbb{R}^3$, $H_g(P)$ is a self-adjoint operator acting on the Hilbert space
\[ \mathcal{H}(P) := L^2(\mathbb{R}^3; \mathbb{C}^4) \otimes \mathcal{H}_{ph} \sim \mathbb{C}^4 \otimes L^2(\mathbb{R}^3, dr) \otimes \mathcal{H}_{ph}, \tag{2.27} \]
with domain $D(H_g(P)) = D(H_0(P))$, and $H_g(P)$ is given by the expression:
\[ H_g(P) = \frac{1}{2m_{el}} \Big( \frac{m_{el}}{M}(P - P_{ph}) + p_r - g A\big(\tfrac{m_{el}}{M} g^{2/3} r\big) \Big)^2 + \frac{1}{2m_n} \Big( \frac{m_n}{M}(P - P_{ph}) - p_r + g A\big(-\tfrac{m_n}{M} g^{2/3} r\big) \Big)^2 + H_{ph} - \frac{1}{|r|} - \frac{g}{2m_{el}}\, \sigma^{el} \cdot B\big(\tfrac{m_{el}}{M} g^{2/3} r\big) + \frac{g}{2m_n}\, \sigma^{n} \cdot B\big(-\tfrac{m_n}{M} g^{2/3} r\big). \tag{2.28} \]

Let us mention that this direct integral decomposition remains true for an arbitrary value of the coupling constant $g$ (see [27]). However, in this paper, we shall only be interested in the small coupling regime. For $g = 0$, the fiber Hamiltonian $H_0(P) := H_{g=0}(P)$ reduces to the diagonal operator
\[ H_0(P) = H_r + \frac{1}{2M}(P - P_{ph})^2 + H_{ph}, \tag{2.29} \]
where $H_r$ is the Schrödinger operator defined in (2.23). Let $\Omega$ denote the photon vacuum in $\mathcal{H}_{ph}$. One can verify that
\[ E_0(P) := \inf \sigma(H_0(P)) = e_0 + \frac{P^2}{2M}, \tag{2.30} \]
and that $e_0 + P^2/2M$ is an eigenvalue of multiplicity 4 of $H_0(P)$. Moreover, the associated normalized eigenstates can be written in the form $y \otimes \phi_0 \otimes \Omega$, where $y$ is an arbitrary normalized element in $\mathbb{C}^4$. The operator $H_0(P)$ is treated as an unperturbed Hamiltonian, the perturbation $W_g(P) := H_g(P) - H_0(P)$ being given by
\[ W_g(P) = -\frac{g}{m_{el}} \Big( \frac{m_{el}}{M}(P - P_{ph}) + p_r \Big) \cdot A\big(\tfrac{m_{el}}{M} g^{2/3} r\big) + \frac{g}{m_n} \Big( \frac{m_n}{M}(P - P_{ph}) - p_r \Big) \cdot A\big(-\tfrac{m_n}{M} g^{2/3} r\big) + \frac{g^2}{2m_{el}} A\big(\tfrac{m_{el}}{M} g^{2/3} r\big)^2 + \frac{g^2}{2m_n} A\big(-\tfrac{m_n}{M} g^{2/3} r\big)^2 - \frac{g}{2m_{el}}\, \sigma^{el} \cdot B\big(\tfrac{m_{el}}{M} g^{2/3} r\big) + \frac{g}{2m_n}\, \sigma^{n} \cdot B\big(-\tfrac{m_n}{M} g^{2/3} r\big). \tag{2.31} \]
Note that, due to the choice of the Coulomb gauge, the operators $A(m_{el} g^{2/3} r/M)$ and $A(-m_n g^{2/3} r/M)$ commute both with $p_r$ and $P_{ph}$.

2.3. Main result and organization of the paper

Our main result is stated in the following theorem.

Theorem 2.2. There exist $g_c > 0$ and $p_c > 0$ such that, for any $0 < |g| \le g_c$ and $0 \le |P| \le p_c$,
\[ \dim \mathrm{Ker}(H_g(P) - E_g(P)) < 4. \tag{2.32} \]

Our proof of Theorem 2.2 is based on a contradiction argument and the use of the Feshbach–Schur identity. The point is that the assumption $\dim \mathrm{Ker}(H_g(P) - E_g(P)) = 4$ will allow us to compute the second order expansion in $g$ of the expression $(E_g(P) - E_0(P))\Pi_0$, where $\Pi_0$ denotes the projection onto the eigenspace associated with the eigenvalue $E_0(P)$ of $H_0(P)$. More precisely, applying in a suitable way the Feshbach–Schur map, we will find that $(E_g(P) - E_0(P))\Pi_0 = \Gamma + O(|g|^{2+\tau})$ for some $\tau > 0$, where $\Gamma$ is an explicitly given 4 × 4 matrix. The previous identity implies in particular that all the coefficients of order $g^2$ in the matrix $\Gamma$ must be located on the diagonal, which will lead to a contradiction.

We decompose the proof of Theorem 2.2 into two main steps. In Sec. 3, we introduce and study some properties of the Feshbach–Schur operator that we consider. Next, in Sec. 4, we assume that the multiplicity of $E_g(P)$ is equal to 4, and we conclude the proof of Theorem 2.2 by a contradiction argument. In the Appendix, we collect some fairly standard estimates which are used in Secs. 3 and 4.

Throughout the paper, $C$, $C'$, $C''$ will denote positive constants that may differ from one line to another.
3. The Feshbach–Schur Operator

In this section, we introduce the Feshbach–Schur operator that we consider, and we study some of its properties. They will be used below in Sec. 4 in order to prove Theorem 2.2.

It is convenient to work with the Hamiltonian $\tilde{H}_g(P)$ obtained from $H_g(P)$ by Wick ordering, that is $\tilde{H}_g(P) = {:}H_g(P){:}$, with the usual notations. It is not difficult to check that $\tilde{H}_g(P) = H_g(P) - g^2 C_\Lambda$, where $C_\Lambda$ is a positive constant depending on the ultraviolet cutoff parameter $\Lambda$. Hence it suffices to prove Theorem 2.2 with $\tilde{H}_g(P)$ replacing $H_g(P)$ and $\tilde{E}_g(P) := \inf \sigma(\tilde{H}_g(P))$ replacing $E_g(P)$. To simplify the notations, we redefine $H_g(P) := \tilde{H}_g(P)$ and $E_g(P) := \tilde{E}_g(P)$. Moreover, in what follows, we drop the dependence on $P$ everywhere unless a confusion may arise. In particular, we set
\[ H_g = H_g(P), \quad H_0 = H_0(P), \quad W_g = W_g(P), \quad E_g = E_g(P), \quad E_0 = E_0(P) = e_0 + \frac{P^2}{2M}. \tag{3.1} \]
For any $\rho \ge 0$, we define the projections $\Pi_\rho$ in the tensor product $\mathbb{C}^4 \otimes L^2(\mathbb{R}^3) \otimes \mathcal{H}_{ph}$ by
\[ \Pi_\rho := \mathbb{1} \otimes \Pi_{\phi_0} \otimes \mathbb{1}_{H_{ph} \le \rho}, \tag{3.2} \]
where $\Pi_{\phi_0}$ denotes the projection onto the eigenspace associated with the eigenvalue $e_0$ of $H_r$. In particular, as above, $\Pi_0 = \mathbb{1} \otimes \Pi_{\phi_0} \otimes \Pi_\Omega$ is the projection onto the eigenspace associated with the eigenvalue $E_0$ of $H_0$ (here $\Pi_\Omega$ is the projection onto the Fock vacuum).

Lemma 3.1. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$, $0 \le |P| \le p_c$, $\varepsilon \ge 0$ and $g^2 \ll \rho \ll 1$, the operator $\bar{\Pi}_\rho H_g \bar{\Pi}_\rho - E_g + \varepsilon : D(H_0) \cap \mathrm{Ran}(\bar{\Pi}_\rho) \to \mathrm{Ran}(\bar{\Pi}_\rho)$ is invertible and satisfies
\[ \big[\bar{\Pi}_\rho H_g \bar{\Pi}_\rho - E_g + \varepsilon\big]^{-1} \bar{\Pi}_\rho = \big[H_0 - E_g + \varepsilon\big]^{-1} \bar{\Pi}_\rho \sum_{n \ge 0} \Big( -W_g \bar{\Pi}_\rho \big[H_0 - E_g + \varepsilon\big]^{-1} \bar{\Pi}_\rho \Big)^n. \tag{3.3} \]
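The expansion (3.3) has the structure of a Neumann series: for an invertible $A$ and a perturbation $W$ with $\|W A^{-1}\| < 1$, one has $(A + W)^{-1} = A^{-1} \sum_{n \ge 0} (-W A^{-1})^n$. The following finite-dimensional sketch illustrates only this algebraic identity on small matrices; the matrices and function names are hypothetical stand-ins and do not model the operators of the paper.

```python
def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_inverse(a_inv, w, terms=40):
    """Approximate (A + W)^{-1} by A^{-1} * sum_{n < terms} (-W A^{-1})^n.

    a_inv is the inverse of A; the series converges when ||W A^{-1}|| < 1.
    """
    n = len(w)
    step = [[-x for x in row] for row in matmul(w, a_inv)]  # -W A^{-1}
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        total = [[t + p for t, p in zip(rt, rp)] for rt, rp in zip(total, power)]
        power = matmul(power, step)
    return matmul(a_inv, total)
```

In the lemma, the role of the smallness condition is played by the bound $C'|g|\rho^{-1/2} \ll 1$, which is where the assumption $\rho \gg g^2$ enters.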
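Proof. Since $\bar{\Pi}_\rho (H_g - E_g + \varepsilon) \bar{\Pi}_\rho = (H_0 - E_g + \varepsilon)\bar{\Pi}_\rho + \bar{\Pi}_\rho W_g \bar{\Pi}_\rho$, it suffices to prove that the Neumann series in the right-hand side of (3.3) is convergent. It follows from Lemmas A.2 and A.8 in the Appendix that, for all $n \in \mathbb{N}$, $\varepsilon \ge 0$ and $\rho > 0$,
\[ \Big\| \big[H_0 - E_g + \varepsilon\big]^{-1} \bar{\Pi}_\rho \Big( -W_g \bar{\Pi}_\rho \big[H_0 - E_g + \varepsilon\big]^{-1} \bar{\Pi}_\rho \Big)^n \Big\| \le C \rho^{-1} \big( C' |g| \rho^{-1/2} \big)^n. \tag{3.4} \]
Therefore, for $1 \gg \rho \gg g^2$, (3.4) implies (3.3).

Lemma 3.2. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$, $0 \le |P| \le p_c$, $\varepsilon \ge 0$ and $g^2 \ll \rho \ll 1$, the Feshbach–Schur operator
\[ F_\rho(\varepsilon) = (H_0 - E_g + \varepsilon)\Pi_\rho + \Pi_\rho W_g \Pi_\rho - \Pi_\rho W_g \bar{\Pi}_\rho \big[\bar{\Pi}_\rho H_g \bar{\Pi}_\rho - E_g + \varepsilon\big]^{-1} \bar{\Pi}_\rho W_g \Pi_\rho \tag{3.5} \]
is a well-defined (bounded) operator on $\mathrm{Ran}(\Pi_\rho)$. Moreover, $F_\rho(\varepsilon)$ satisfies
\[ F_\rho(0) = \lim_{\varepsilon \to 0^+} F_\rho(\varepsilon), \tag{3.6} \]
in the norm topology, and
\[ \| F_\rho(0) \| \le C\rho. \tag{3.7} \]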
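Proof. By Lemma 3.1 and the fact that $\mathrm{Ran}(\Pi_\rho) \subset D(H_0) \subset D(W_g)$, $F_\rho(\varepsilon)$ is obviously well-defined on $\mathrm{Ran}(\Pi_\rho)$, for any $\varepsilon \ge 0$. The boundedness of $F_\rho(\varepsilon)$ and Eq. (3.6) are straightforward verifications. In order to prove (3.7), we proceed as follows: First, it follows from Lemma A.6 that
\[ \| (H_0 - E_g)\Pi_\rho \| = \Big\| (E_0 - E_g)\Pi_\rho + \Big( -\frac{P}{M} \cdot P_{ph} + \frac{P_{ph}^2}{2M} + H_{ph} \Big) \Pi_\rho \Big\| \le C g^2 + C' \rho \le C'' \rho, \tag{3.8} \]
since, by assumption, $\rho \gg g^2$. Next, by Lemma A.8, we have that
\[ \| \Pi_\rho W_g \Pi_\rho \| \le C |g| \rho^{1/2} \le C' \rho. \tag{3.9} \]
Lemma 3.1 gives
\[ \Pi_\rho W_g \bar{\Pi}_\rho \big[\bar{\Pi}_\rho H_g \bar{\Pi}_\rho - E_g\big]^{-1} \bar{\Pi}_\rho W_g \Pi_\rho = \Pi_\rho W_g \big[H_0 - E_g\big]^{-1} \bar{\Pi}_\rho \sum_{n \ge 0} \Big( -W_g \bar{\Pi}_\rho \big[H_0 - E_g\big]^{-1} \bar{\Pi}_\rho \Big)^n W_g \Pi_\rho. \tag{3.10} \]
Using again Lemma A.8, we obtain that, for all $n \ge 0$,
\[ \Big\| \Pi_\rho W_g \big[H_0 - E_g\big]^{-1} \bar{\Pi}_\rho \Big( -W_g \bar{\Pi}_\rho \big[H_0 - E_g\big]^{-1} \bar{\Pi}_\rho \Big)^n W_g \Pi_\rho \Big\| \le C g^2 \big( C' |g| \rho^{-1/2} \big)^n, \tag{3.11} \]
which implies
\[ \Big\| \Pi_\rho W_g \bar{\Pi}_\rho \big[\bar{\Pi}_\rho H_g \bar{\Pi}_\rho - E_g\big]^{-1} \bar{\Pi}_\rho W_g \Pi_\rho \Big\| \le C g^2 \le C' \rho. \tag{3.12} \]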
Equations (3.8), (3.9) and (3.12) give (3.7).

We now turn to the Feshbach–Schur identity. We refer to [9, 7, 20] for definitions and properties of the (smooth) Feshbach–Schur map, and its use in the context of non-relativistic QED. In our case, the operator $H_g - E_g + \varepsilon$ is obviously invertible (for $\varepsilon > 0$), so that the following lemma simply follows from usual second order perturbation theory.

Lemma 3.3. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$, $0 \le |P| \le p_c$, $\varepsilon > 0$ and $g^2 \ll \rho \ll 1$, the operators $H_g - E_g + \varepsilon : D(H_0) \to \mathbb{C}^4 \otimes L^2(\mathbb{R}^3) \otimes \mathcal{H}_{ph}$ and $F_\rho(\varepsilon) : \mathrm{Ran}(\Pi_\rho) \to \mathrm{Ran}(\Pi_\rho)$ are invertible and satisfy
\[ \Pi_\rho \big[H_g - E_g + \varepsilon\big]^{-1} \Pi_\rho = F_\rho(\varepsilon)^{-1}. \tag{3.13} \]
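In finite dimensions, the identity (3.13) is the statement that the compression of an inverse to a subspace is the inverse of the corresponding Schur complement. The sketch below checks this for a 3 × 3 symmetric matrix, with the projection onto the first coordinate playing the role of $\Pi_\rho$ and a real number `z` below the spectrum playing the role of $E_g - \varepsilon$. The matrix, the names and the choice of a rank-one projection are illustrative assumptions only.

```python
def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def feshbach_schur_check(h, z):
    """Return (P (H - z)^{-1} P, F^{-1}) for P = projection on the first coordinate.

    F = P (H - z) P - P H Pbar [Pbar (H - z) Pbar]^{-1} Pbar H P is the
    Schur complement of the lower-right 2x2 block of H - z.
    """
    a = [[h[i][j] - (z if i == j else 0.0) for j in range(3)] for i in range(3)]
    block = [[a[1][1], a[1][2]], [a[2][1], a[2][2]]]
    # Left-hand side: (1,1) entry of the full inverse via the cofactor formula.
    lhs = det2(block) / det3(a)
    # Right-hand side: invert the 2x2 block explicitly and form F.
    d = det2(block)
    block_inv = [[a[2][2] / d, -a[1][2] / d], [-a[2][1] / d, a[1][1] / d]]
    w_row = [a[0][1], a[0][2]]
    w_col = [a[1][0], a[2][0]]
    f = a[0][0] - sum(w_row[i] * block_inv[i][j] * w_col[j]
                      for i in range(2) for j in range(2))
    return lhs, 1.0 / f
```

The two returned numbers coincide up to rounding whenever both `H - z` and its lower-right block are invertible, which mirrors the invertibility hypotheses of Lemmas 3.1 and 3.3.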
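Proof. Since $H_g - E_g \ge 0$, for any $\varepsilon > 0$, the operator $H_g - E_g + \varepsilon$ from $D(H_0)$ to $\mathbb{C}^4 \otimes L^2(\mathbb{R}^3, dr) \otimes \mathcal{H}_{ph}$ is obviously invertible. The identity (3.13) is then easily verified following for instance [7, Theorem 2.1].

As a consequence of Lemmas 3.2 and 3.3, we obtain the following lemma.

Lemma 3.4. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$, $0 \le |P| \le p_c$, and $g^2 \ll \rho \ll 1$,
\[ F_\rho(0)\, \Pi_\rho\, \mathbb{1}_{\{E_g\}}(H_g)\, \Pi_\rho = 0. \tag{3.14} \]

Proof. We obtain from (3.13) that
\[ F_\rho(\varepsilon)\, \Pi_\rho \big[H_g - E_g + \varepsilon\big]^{-1} \Pi_\rho = \Pi_\rho, \tag{3.15} \]
for all $\varepsilon > 0$. It follows from the functional calculus that
\[ \mathrm{s\text{-}lim}_{\varepsilon \to 0^+}\ \varepsilon \big[H_g - E_g + \varepsilon\big]^{-1} = \mathbb{1}_{\{E_g\}}(H_g), \tag{3.16} \]
where s-lim stands for strong limit. Hence, using (3.6), we obtain (3.14) by multiplying (3.15) by $\varepsilon$ and letting $\varepsilon$ go to 0.

The next lemma will be used in the proof of Theorem 2.2.

Lemma 3.5. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$ and $0 \le |P| \le p_c$,
\[ \Pi_0 F_\rho(0) \Pi_0 = (E_0 - E_g)\Pi_0 - \int_{\mathbb{R}^3} \sum_{\lambda=1,2} \Pi_0\, \tilde{w}(r, k, \lambda) \Big[ H_r + \frac{1}{2M}(P - k)^2 + |k| - E_g \Big]^{-1} w(r, k, \lambda)\, \Pi_0\, dk + O(|g|^{2+\tau}), \tag{3.17} \]
where $\rho = |g|^{2-2\tau}$, $\tau > 0$ is fixed sufficiently small,
\[ w(r, k, \lambda) := -\frac{g}{m_{el}} \Big( \frac{m_{el}}{M}(P - P_{ph}) + p_r \Big) \cdot h^A\big(\tfrac{m_{el}}{M} g^{2/3} r, k, \lambda\big) + \frac{g}{m_n} \Big( \frac{m_n}{M}(P - P_{ph}) - p_r \Big) \cdot h^A\big(-\tfrac{m_n}{M} g^{2/3} r, k, \lambda\big) - \frac{g}{2m_{el}}\, \sigma^{el} \cdot h^B\big(\tfrac{m_{el}}{M} g^{2/3} r, k, \lambda\big) + \frac{g}{2m_n}\, \sigma^{n} \cdot h^B\big(-\tfrac{m_n}{M} g^{2/3} r, k, \lambda\big), \tag{3.18} \]
and $\tilde{w}(r, k, \lambda)$ is given by the same expression as $w(r, k, \lambda)$ except that $h^A$ and $h^B$ are replaced by $\bar{h}^A$ and $\bar{h}^B$, respectively.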
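Proof. We have that $\Pi_0 \Pi_\rho = \Pi_\rho \Pi_0 = \Pi_0$ and $H_0 \Pi_0 = E_0 \Pi_0$. Introducing (3.3) into (3.5), we thus obtain that
\[
\begin{aligned}
\Pi_0 F_\rho(0) \Pi_0 = {} & (E_0 - E_g)\Pi_0 + \Pi_0 W_g \Pi_0 - \Pi_0 W_g [H_0 - E_g]^{-1} \bar{\Pi}_\rho W_g \Pi_0 \\
& - \Pi_0 W_g [H_0 - E_g]^{-1} \sum_{n \ge 1} \big( -\bar{\Pi}_\rho W_g \bar{\Pi}_\rho [H_0 - E_g]^{-1} \big)^n \bar{\Pi}_\rho W_g \Pi_0 .
\end{aligned} \tag{3.19}
\]
Observe that $\Pi_0 W_g \Pi_0 = 0$ since $W_g$ is Wick ordered. Hence Estimate (3.11) for $n \ge 1$ yields
\[
\Pi_0 F_\rho(0) \Pi_0 = (E_0 - E_g)\Pi_0 - \Pi_0 W_g [H_0 - E_g]^{-1} \bar{\Pi}_\rho W_g \Pi_0 + O(|g|^3 \rho^{-1/2}). \tag{3.20}
\]
We conclude the proof by applying Lemma A.9 of the Appendix.

4. Proof of Theorem 2.2

From now on we assume that $\dim \mathrm{Ker}(H_g - E_g) = 4$, which will lead to a contradiction at the end of this section.

Lemma 4.1. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$ and $0 \le |P| \le p_c$, the following holds: If $\dim \mathrm{Ker}(H_g - E_g) = 4$, then $\Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0$ is invertible on $\mathrm{Ran}(\Pi_0)$ and satisfies
\[
\big\| [\Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0]^{-1} \big\| \le \frac{1}{1 - C g^2}. \tag{4.1}
\]

Proof. In order to prove that $\Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0$ is invertible on $\mathrm{Ran}(\Pi_0)$, it suffices to show that $\| \Pi_0 - \Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0 \| < 1$. Observe that $\Pi_0 - \Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0$ is a finite rank and positive operator. We have that
\[
\begin{aligned}
\| \Pi_0 - \Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0 \| & \le \mathrm{tr}(\Pi_0 - \Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0) \\
& = \mathrm{tr}(\Pi_0) - \mathrm{tr}(\Pi_0 \mathbf{1}_{\{E_g\}}(H_g)) \\
& = \mathrm{tr}(\Pi_0) - \mathrm{tr}(\mathbf{1}_{\{E_g\}}(H_g)) + \mathrm{tr}(\bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g)) \\
& = 4 - 4 + \mathrm{tr}(\bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g)) = \mathrm{tr}(\bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g)).
\end{aligned} \tag{4.2}
\]
The projection $\bar{\Pi}_0$ can be decomposed as
\[
\bar{\Pi}_0 = 1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega + 1 \otimes 1 \otimes \bar{\Pi}_\Omega. \tag{4.3}
\]
It follows from Lemma A.7 that
\[
\mathrm{tr}\big( (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega) \mathbf{1}_{\{E_g\}}(H_g) \big) \le C g^2, \tag{4.4}
\]
and from Lemma A.5 that
\[
\mathrm{tr}\big( (1 \otimes 1 \otimes \bar{\Pi}_\Omega) P_g \big) \le \mathrm{tr}(N_{ph} P_g) \le C g^2. \tag{4.5}
\]
Therefore, $\| \Pi_0 - \Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0 \| \le C g^2$. The invertibility of $\Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0$ and Eq. (4.1) directly follow from the latter estimate.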
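The mechanism behind (4.1) can be illustrated in finite dimensions. The following sketch is not part of the proof: it assumes a diagonal 4 × 4 stand-in for the defect operator $D = \Pi_0 - \Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0$, with hypothetical values of $g$ and $C$, and checks that $\|D\| \le \mathrm{tr}\, D$ and that the Neumann-series inverse of $1 - D$ obeys the bound $1/(1 - \mathrm{tr}\, D)$.

```python
# Illustrative sketch only: a diagonal, finite-dimensional stand-in for
# D = Pi0 - Pi0 1_{E_g}(H_g) Pi0 on Ran(Pi0). All numbers are hypothetical.

def defect_bounds(defect_eigenvalues, terms=200):
    """Return (norm, trace, inverse_norm, bound) for a positive diagonal defect D."""
    if any(d < 0 for d in defect_eigenvalues):
        raise ValueError("D is assumed positive")
    trace = sum(defect_eigenvalues)
    if trace >= 1:
        raise ValueError("the argument needs tr(D) < 1")
    norm = max(defect_eigenvalues)  # operator norm of a positive diagonal matrix
    # Norm of (1 - D)^{-1}, computed through the truncated Neumann series sum_n D^n.
    inverse_norm = sum(norm ** n for n in range(terms))
    bound = 1.0 / (1.0 - trace)
    return norm, trace, inverse_norm, bound


g, C = 0.1, 2.0  # hypothetical coupling and constant
defect = [0.5 * C * g**2, 0.3 * C * g**2, 0.2 * C * g**2, 0.0]  # tr(D) = C g^2
norm, trace, inverse_norm, bound = defect_bounds(defect)
assert norm <= trace          # norm of a positive operator is at most its trace
assert inverse_norm <= bound  # the analogue of estimate (4.1)
```

In the lemma itself the role of `trace` is played by $\mathrm{tr}(\bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g)) \le C g^2$, which is why the hypothesis $\dim \mathrm{Ker}(H_g - E_g) = 4$ enters through the cancellation $4 - 4$ in (4.2).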
June 3, J070-S0129055X11004369
2011 13:30 WSPC/S0129-055X
148-RMP
Hyperfine Splitting in Non-Relativistic QED
565
As a consequence of Lemma 4.1, we obtain the following lemma:

Lemma 4.2. Let $\Gamma$ denote the operator on $\mathrm{Ran}(\Pi_0)$ defined by
\[
\Gamma := \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \Pi_0 \tilde{w}(r,k,\lambda) \Big[ H_r + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} w(r,k,\lambda) \Pi_0 \, dk, \tag{4.6}
\]
with $w(r,k,\lambda)$ and $\tilde{w}(r,k,\lambda)$ as in (3.18). There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$ and $0 \le |P| \le p_c$, the following holds: If $\dim \mathrm{Ker}(H_g - E_g) = 4$, then
\[
\Gamma = (E_0 - E_g)\Pi_0 + O(|g|^{2+\tau}), \tag{4.7}
\]
where $\tau > 0$ is fixed sufficiently small.

Proof. Fix $\rho = |g|^{2-2\tau}$ for some sufficiently small $\tau > 0$. Multiplying both sides of Eq. (3.14) by $\Pi_0$, we get
\[
\Pi_0 F_\rho(0) \Pi_\rho \mathbf{1}_{\{E_g\}}(H_g) \Pi_0 = 0. \tag{4.8}
\]
Introducing the decomposition $1 = \Pi_0 + \bar{\Pi}_0$ into (4.8) and using Lemma 4.1, this yields
\[
\Pi_0 F_\rho(0) \Pi_0 = -\Pi_0 F_\rho(0) \Pi_\rho \bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0 [\Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0]^{-1}. \tag{4.9}
\]
By Eqs. (4.3)–(4.5), we learn that
\[
\| \bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g) \| \le \mathrm{tr}(\bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g)) \le C g^2, \tag{4.10}
\]
which, combined with (3.7) and (4.1), implies that
\[
\big\| \Pi_0 F_\rho(0) \Pi_\rho \bar{\Pi}_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0 [\Pi_0 \mathbf{1}_{\{E_g\}}(H_g) \Pi_0]^{-1} \big\| \le C g^2 \rho = C |g|^{4-2\tau}. \tag{4.11}
\]
We conclude the proof thanks to Lemma 3.5.

Let us consider the canonical orthonormal basis of $\mathbb{C}^4$ in which the Pauli matrices $\sigma^{el}_j$, $\sigma^{n}_j$, $j \in \{1,2,3\}$, are given by (2.14) and (2.15). Obviously, $\Gamma$ identifies with a $4 \times 4$ matrix in this basis. In the next theorem, we determine a non-diagonal coefficient of $\Gamma$ of the form $-C_0 g^2 + o(g^2)$ with $C_0 > 0$.

Theorem 4.3. Let $\Gamma$ be given as in (4.6). There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$ and $0 \le |P| \le p_c$, the coefficient of $\Gamma$ located on the third line and second column, $\Gamma_{32}$, satisfies
\[
\Gamma_{32} = -C_0 g^2 + O(|g|^{8/3}), \tag{4.12}
\]
where $C_0$ is a strictly positive constant independent of $g$.

Proof. We view $w(r,k,\lambda)$ as a linear combination (some coefficients being given by operators) of the functions $h^A_j(\cdots)$ and $h^B_j(\cdots)$, $j \in \{1,2,3\}$. We introduce the corresponding expression into (4.6) and consider each term separately.
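Since the coefficients located on the third line and second column of the Pauli matrices expressed in (2.14) and (2.15) vanish, the terms containing at least one factor $h^A_j(\cdots)$ do not contribute to $\Gamma_{32}$. The same holds for the terms containing at least one factor $h^B_3(\cdots)$, since the third Pauli matrices, $\sigma^{el}_3$ and $\sigma^{n}_3$, are diagonal. Therefore, $\Gamma_{32}$ is equal to the coefficient located on the third line and second column of the matrix $\Gamma'$ given by
\[
\begin{aligned}
\Gamma' = \sum_{\lambda=1,2} \int_{\mathbb{R}^3} & \Pi_0 \sum_{j=1,2} \Big( -\frac{g}{2m_{el}} \sigma^{el}_j \bar{h}^B_j\Big( \frac{m_{el}}{M} g^{2/3} r, k, \lambda \Big) + \frac{g}{2m_n} \sigma^{n}_j \bar{h}^B_j\Big( -\frac{m_n}{M} g^{2/3} r, k, \lambda \Big) \Big) \\
& \times \Big[ H_r + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} \\
& \times \sum_{j'=1,2} \Big( -\frac{g}{2m_{el}} \sigma^{el}_{j'} h^B_{j'}\Big( \frac{m_{el}}{M} g^{2/3} r, k, \lambda \Big) + \frac{g}{2m_n} \sigma^{n}_{j'} h^B_{j'}\Big( -\frac{m_n}{M} g^{2/3} r, k, \lambda \Big) \Big) \Pi_0 \, dk.
\end{aligned} \tag{4.13}
\]
It follows from the definition (2.11) of $h^B_j$ that
\[
\big| h^B_j(r,k,\lambda) - h^B_j(0,k,\lambda) \big| \le C |k|^{3/2} \chi_\Lambda(k) |r|, \tag{4.14}
\]
for any $j \in \{1,2,3\}$, $\lambda \in \{1,2\}$, $r \in \mathbb{R}^3$ and $k \in \mathbb{R}^3$. Moreover, the expression (2.24) of $\phi_0$ implies that
\[
\| |r| \phi_0(r) \| \le C. \tag{4.15}
\]
Hence, using in addition that, for $|P|$ sufficiently small,
\[
\Big\| \Big[ H_r + \frac{(P-k)^2}{2M} + |k| - E_g \Big]^{-1} \Big\| \le \frac{C}{|k|}, \tag{4.16}
\]
we obtain from (4.13) and (4.14)–(4.16) that
\[
\begin{aligned}
\Gamma' = \sum_{\lambda=1,2} \int_{\mathbb{R}^3} & \Pi_0 \sum_{j=1,2} \Big( -\frac{g}{2m_{el}} \sigma^{el}_j \bar{h}^B_j(0,k,\lambda) + \frac{g}{2m_n} \sigma^{n}_j \bar{h}^B_j(0,k,\lambda) \Big) \\
& \times \Big[ e_0 + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} \\
& \times \sum_{j'=1,2} \Big( -\frac{g}{2m_{el}} \sigma^{el}_{j'} h^B_{j'}(0,k,\lambda) + \frac{g}{2m_n} \sigma^{n}_{j'} h^B_{j'}(0,k,\lambda) \Big) \Pi_0 \, dk + O(|g|^{8/3}).
\end{aligned} \tag{4.17}
\]
Notice now that, for $j, j' \in \{1,2\}$, the coefficient on the third line and second column of the products $\sigma^{el}_j \sigma^{el}_{j'}$ and $\sigma^{n}_j \sigma^{n}_{j'}$ vanishes. We thus obtain from (4.17) that
\[
\Gamma_{32} = \Gamma'_{32} = \gamma_1 + \gamma_2 + O(|g|^{8/3}), \tag{4.18}
\]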
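where
\[
\gamma_1 := -\frac{g^2}{4 m_{el} m_n} \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \Big( \phi_0, \big[ \bar{h}^B_1(0,k,\lambda) + i \bar{h}^B_2(0,k,\lambda) \big] \Big[ e_0 + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} \big[ h^B_1(0,k,\lambda) - i h^B_2(0,k,\lambda) \big] \phi_0 \Big) dk, \tag{4.19}
\]
and
\[
\gamma_2 := -\frac{g^2}{4 m_{el} m_n} \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \Big( \phi_0, \big[ \bar{h}^B_1(0,k,\lambda) - i \bar{h}^B_2(0,k,\lambda) \big] \Big[ e_0 + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} \big[ h^B_1(0,k,\lambda) + i h^B_2(0,k,\lambda) \big] \phi_0 \Big) dk. \tag{4.20}
\]
We remark that the cross terms involving $h^B_1(0,k,\lambda)$ and $h^B_2(0,k,\lambda)$ vanish. Thus, we obtain
\[
\Gamma_{32} = -\frac{g^2}{2 m_{el} m_n} \sum_{j=1,2} \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \bar{h}^B_j(0,k,\lambda) \Big[ e_0 + \frac{(P-k)^2}{2M} + |k| - E_g \Big]^{-1} h^B_j(0,k,\lambda) \, dk + O(|g|^{8/3}). \tag{4.21}
\]
The integral in the right-hand side of (4.21) still depends on $g$ through the ground state energy $E_g$. Nevertheless, one can readily check that
\[
\Big| \Big[ e_0 + \frac{(P-k)^2}{2M} + |k| - E_g \Big]^{-1} - \Big[ e_0 + \frac{(P-k)^2}{2M} + |k| - E_0 \Big]^{-1} \Big| \le |E_0 - E_g| \frac{C}{|k|^2} \le \frac{C g^2}{|k|^2}, \tag{4.22}
\]
where, in the last inequality, we used Lemma A.6. Therefore, since, for any $j \in \{1,2\}$ and $\lambda \in \{1,2\}$, the functions $h^B_j(0,k,\lambda)$ satisfy $|h^B_j(0,k,\lambda)| \le C |k|^{1/2} \chi_\Lambda(k)$, we get
\[
\Gamma_{32} = -\frac{g^2}{2 m_{el} m_n} \sum_{j=1,2} \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \bar{h}^B_j(0,k,\lambda) \Big[ e_0 + \frac{(P-k)^2}{2M} + |k| - E_0 \Big]^{-1} h^B_j(0,k,\lambda) \, dk + O(|g|^{8/3}). \tag{4.23}
\]
Now, the integrals in the right-hand side of (4.23) can be explicitly computed, which leads to
\[
\Gamma_{32} = -\frac{g^2}{8 \pi^2 m_{el} m_n} \int_{\mathbb{R}^3} \frac{|k| \chi_\Lambda(k)^2}{k^2/2M - k \cdot P/M + |k|} \Big( \frac{k_3^2}{|k|^2} + 1 \Big) dk + O(|g|^{8/3}). \tag{4.24}
\]
The integrand in (4.24) is strictly positive (for $P$ sufficiently small), and hence the integral does not vanish. This concludes the proof of the theorem.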
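The cancellation of the cross terms between (4.19) and (4.20), which produces the prefactor $g^2/(2 m_{el} m_n)$ in (4.21), can be checked numerically. The sketch below is an illustration under simplifying assumptions: $h^B_1(0,k,\lambda)$ and $h^B_2(0,k,\lambda)$ are replaced by arbitrary complex numbers `h1`, `h2` at a fixed $(k,\lambda)$, the resolvent by a positive number `resolvent`, and the common prefactor $-g^2/(4 m_{el} m_n)$ is dropped. The function names are hypothetical.

```python
import random


def gamma_integrand_sum(h1, h2, resolvent):
    """Integrand of gamma_1 + gamma_2 at fixed (k, lambda), without the prefactor."""
    plus = (h1.conjugate() + 1j * h2.conjugate()) * resolvent * (h1 - 1j * h2)
    minus = (h1.conjugate() - 1j * h2.conjugate()) * resolvent * (h1 + 1j * h2)
    return plus + minus


def diagonal_integrand(h1, h2, resolvent):
    """Twice the diagonal terms sum_j conj(h_j) * resolvent * h_j, as in (4.21)."""
    return 2 * resolvent * (abs(h1) ** 2 + abs(h2) ** 2)


rng = random.Random(0)
for _ in range(1000):
    h1 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    h2 = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    resolvent = rng.uniform(0.1, 10.0)
    difference = gamma_integrand_sum(h1, h2, resolvent) - diagonal_integrand(h1, h2, resolvent)
    assert abs(difference) < 1e-9  # cross terms cancel pointwise
```

The factor 2 returned by `diagonal_integrand`, multiplied by $-g^2/(4 m_{el} m_n)$, gives the coefficient $-g^2/(2 m_{el} m_n)$ of (4.21).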
We are now able to prove Theorem 2.2:

Proof of Theorem 2.2. By [5], we know that $\dim \mathrm{Ker}(H_g - E_g) \le 4$. Assume by contradiction that $\dim \mathrm{Ker}(H_g - E_g) = 4$. By Lemma 4.2, the matrix $\Gamma$ defined in (4.6) satisfies (4.7). In particular, in any basis of $\mathbb{C}^4$, the non-vanishing terms of order $g^2$ of $\Gamma$ are necessarily located on the diagonal. However, according to Theorem 4.3, in the canonical orthonormal basis of $\mathbb{C}^4$ in which the Pauli matrices are given by (2.14) and (2.15), the non-diagonal coefficient $\Gamma_{32}$ contains a non-vanishing term of order $g^2$. Hence we get a contradiction and the theorem is proven.

Appendix. Standard Estimates

In this Appendix, we collect some estimates which were used in Secs. 3 and 4. Some of them are standard (see for instance [9, 10]). We begin with two lemmas concerning the non-interacting Hamiltonian $H_0$ defined in (2.29).

Lemma A.1. There exists $p_c > 0$ such that for all $0 \le |P| \le p_c$,
\[
H_{ph} \le 2(H_0 - E_0). \tag{A.1}
\]

Proof. For $j \in \{1,2,3\}$, one can easily verify that $|(P_{ph})_j| \le H_{ph}$. Hence, since $E_0 = e_0 + P^2/2M$, we have that
\[
H_0 = H_r + \frac{P^2}{2M} - \frac{1}{M} P \cdot P_{ph} + \frac{1}{2M} P_{ph}^2 + H_{ph} \ge E_0 + \frac{1}{2} H_{ph}, \tag{A.2}
\]
for $P$ sufficiently small, which proves the lemma.
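The inequality (A.2) can be illustrated by a commuting, scalar caricature. The sketch below is not the operator proof: it replaces $P_{ph}$ by a vector `p` in $\mathbb{R}^3$ and $H_{ph}$ by a number `h` with $|p_j| \le h$ for each component, drops the non-negative term $H_r - e_0$, and assumes the explicit threshold $|P| \le M/(2\sqrt{3})$, which is our choice for this model and not a value taken from the text.

```python
import math
import random


def excess(P, p, h, M):
    """Scalar analogue of (H_0 - E_0) - H_ph / 2 with the term H_r - e_0 >= 0 dropped."""
    dot = sum(a * b for a, b in zip(P, p))
    p_squared = sum(b * b for b in p)
    return -dot / M + p_squared / (2 * M) + h - 0.5 * h


def sample(rng, M):
    """Draw P with |P| <= M / (2 sqrt 3) and (p, h) with |p_j| <= h (hypothetical ranges)."""
    p_c = M / (2 * math.sqrt(3))
    while True:
        P = [rng.uniform(-p_c, p_c) for _ in range(3)]
        if math.sqrt(sum(a * a for a in P)) <= p_c:
            break
    h = rng.uniform(0.0, 10.0)
    p = [rng.uniform(-h, h) for _ in range(3)]
    return P, p, h


rng = random.Random(1)
M = 2.0  # hypothetical total mass
for _ in range(10000):
    P, p, h = sample(rng, M)
    assert excess(P, p, h, M) >= -1e-12
```

The check succeeds because $|P \cdot p|/M \le \sqrt{3}\,|P|\,h/M \le h/2$ under the assumed threshold, which is the scalar counterpart of the phrase "for $P$ sufficiently small".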
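Lemma A.2. There exists $p_c > 0$ such that, for all $0 \le |P| \le p_c$ and $\rho \ge 0$,
\[
\bar{\Pi}_\rho H_0 \bar{\Pi}_\rho \ge \Big( \frac{P^2}{2M} + \min\Big( e_0 + \frac{\rho}{2}, e_1 \Big) \Big) \bar{\Pi}_\rho. \tag{A.3}
\]

Proof. Since $\Pi_\rho = 1 \otimes \Pi_{\phi_0} \otimes \mathbf{1}_{H_{ph} \le \rho}$ in the tensor product $\mathbb{C}^4 \otimes L^2(\mathbb{R}^3) \otimes \mathcal{H}_{ph}$, we can write $\bar{\Pi}_\rho = 1 - \Pi_\rho = 1 \otimes \bar{\Pi}_{\phi_0} \otimes \mathbf{1}_{H_{ph} \le \rho} + 1 \otimes 1 \otimes \mathbf{1}_{H_{ph} \ge \rho}$, where $\bar{\Pi}_{\phi_0} = 1 - \Pi_{\phi_0}$. Since $H_r \bar{\Pi}_{\phi_0} \ge e_1 \bar{\Pi}_{\phi_0}$, we get that
\[
H_0 (1 \otimes \bar{\Pi}_{\phi_0} \otimes \mathbf{1}_{H_{ph} \le \rho}) \ge \Big( e_1 + \frac{P^2}{2M} \Big) (1 \otimes \bar{\Pi}_{\phi_0} \otimes \mathbf{1}_{H_{ph} \le \rho}), \tag{A.4}
\]
for $P$ small enough. Moreover, by Lemma A.1,
\[
H_0 (1 \otimes 1 \otimes \mathbf{1}_{H_{ph} \ge \rho}) \ge \Big( e_0 + \frac{P^2}{2M} + \frac{\rho}{2} \Big) (1 \otimes 1 \otimes \mathbf{1}_{H_{ph} \ge \rho}). \tag{A.5}
\]
Hence (A.3) is proven.

The proofs of the next two lemmas being standard, we omit them.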
Lemma A.3. For any $f \in L^2(\mathbb{R}^3 \times \{1,2\})$, the operators $a(f)[N_{ph} \bar{\Pi}_\Omega]^{-1/2}$ and $[N_{ph} \bar{\Pi}_\Omega]^{-1/2} a(f)$ extend to bounded operators on $\mathcal{H}_{ph}$ satisfying
\[
\big\| a(f) [N_{ph} \bar{\Pi}_\Omega]^{-1/2} \big\| \le \| f \|, \tag{A.6}
\]
\[
\big\| [N_{ph} \bar{\Pi}_\Omega]^{-1/2} a(f) \big\| \le \sqrt{2} \, \| f \|. \tag{A.7}
\]

Lemma A.4. Let $f \in L^2(\mathbb{R}^3 \times \{1,2\})$ be such that $(k,\lambda) \mapsto |k|^{-1/2} f(k,\lambda) \in L^2(\mathbb{R}^3 \times \{1,2\})$. Then, for any $\rho > 0$, the operators $a(f)[H_{ph} + \rho]^{-1/2}$ and $[H_{ph} + \rho]^{-1/2} a(f)$ extend to bounded operators on $\mathcal{H}_{ph}$ satisfying
\[
\big\| a(f) [H_{ph} + \rho]^{-1/2} \big\| \le \big\| |k|^{-1/2} f \big\|, \tag{A.8}
\]
\[
\big\| [H_{ph} + \rho]^{-1/2} a(f) \big\| \le \big\| |k|^{-1/2} f \big\| + \rho^{-1/2} \| f \|. \tag{A.9}
\]

The following lemma is taken from [5]. Its proof is based on a “pull-through” formula (see [5]).

Lemma A.5. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$ and $0 \le |P| \le p_c$, the following holds: for all $\Phi_g \in \mathrm{Ker}(H_g - E_g)$, $\| \Phi_g \| = 1$, we have
\[
(\Phi_g, N_{ph} \Phi_g) \le C g^2, \tag{A.10}
\]
where $C$ is a positive constant independent of $g$.

In the next lemma, we estimate the difference between the ground state energies $E_g = \inf \sigma(H_g)$ and $E_0 = \inf \sigma(H_0)$.

Lemma A.6. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$ and $0 \le |P| \le p_c$,
\[
E_g \le E_0 \le E_g + C g^2, \tag{A.11}
\]
where $C$ is a positive constant independent of $g$.

Proof. Note that, since the perturbation $W_g$ is Wick-ordered, we have that $(1 \otimes 1 \otimes \Pi_\Omega) W_g (1 \otimes 1 \otimes \Pi_\Omega) = 0$, where, recall, $\Pi_\Omega$ denotes the orthogonal projection onto the vector space spanned by the Fock vacuum $\Omega$. Hence, by the Rayleigh–Ritz principle,
\[
E_g \le \big( (y \otimes \phi_0 \otimes \Omega), H_g (y \otimes \phi_0 \otimes \Omega) \big) = \big( (y \otimes \phi_0 \otimes \Omega), H_0 (y \otimes \phi_0 \otimes \Omega) \big) = E_0, \tag{A.12}
\]
where, as above, $y$ denotes an arbitrary normalized element in $\mathbb{C}^4$. In order to prove the second inequality in (A.11), we use Lemmas A.3 and A.5. More precisely, let $\Phi_g \in \mathrm{Ker}(H_g - E_g)$, $\| \Phi_g \| = 1$ ($\Phi_g$ exists by [5]). We have that
\[
E_0 - E_g \le (\Phi_g, (H_0 - H_g) \Phi_g) = -(\Phi_g, W_g \Phi_g). \tag{A.13}
\]
Recall that $W_g$ is given by the Wick-ordered expression obtained from (2.31). We express the latter in terms of operators of creation and annihilation, and estimate each term separately. Consider for instance the term
\[
\frac{g}{m_{el}} \Big( \frac{m_{el}}{M}(P - P_{ph}) + p_r \Big) \cdot a\Big( h^A\Big( \frac{m_{el}}{M} g^{2/3} r \Big) \Big). \tag{A.14}
\]
It is not difficult to check that
\[
(P - P_{ph})^2 \le a H_0 + b \quad \text{and} \quad p_r^2 \le a H_0 + b, \tag{A.15}
\]
for some positive constants $a$ and $b$ depending on $\mu$ and $M$. One easily deduces from (A.15) that
\[
\Big\| \Big( \frac{m_{el}}{M}(P - P_{ph}) + p_r \Big) \Phi_g \Big\| \le C. \tag{A.16}
\]
Moreover, by Lemmas A.3 and A.5, we have that
\[
\Big\| a\Big( h^A\Big( \frac{m_{el}}{M} g^{2/3} r \Big) \Big) \Phi_g \Big\| \le C \big\| N_{ph}^{1/2} \Phi_g \big\| \le C' |g|. \tag{A.17}
\]
Equations (A.16) and (A.17) imply that $(\Phi_g, (\mathrm{A.14}) \Phi_g) \le C g^2$, and since the other terms in $W_g$ are estimated similarly, this concludes the proof.

Lemma A.5 gives an estimation of the overlap of the ground state $\Phi_g$ of $H_g$ with the Fock vacuum. We also need to estimate the overlap of $\Phi_g$ with the ground state $\phi_0$ of the electronic Hamiltonian $H_r$ in the sense stated in the following lemma.

Lemma A.7. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$ and $0 \le |P| \le p_c$, the following holds: for all $\Phi_g \in \mathrm{Ker}(H_g - E_g)$, $\| \Phi_g \| = 1$, we have
\[
(\Phi_g, (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega) \Phi_g) \le C g^2, \tag{A.18}
\]
where $C$ is a positive constant independent of $g$.

Proof. Let $\Phi_g$ be a normalized ground state of $H_g$, that is $(H_g - E_g) \Phi_g = 0$, $\| \Phi_g \| = 1$. Since $E_0 - E_g = e_0 + P^2/2M - E_g \ge 0$ by Lemma A.6, we have that
\[
\begin{aligned}
0 & = \big( \Phi_g, (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega)(H_g - E_g) \Phi_g \big) \\
& = \Big( \Phi_g, (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega) \Big( H_r + \frac{P^2}{2M} - E_g + W_g \Big) \Phi_g \Big) \\
& \ge (e_1 - e_0) \big( \Phi_g, (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega) \Phi_g \big) - \big( \Phi_g, (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega) W_g \Phi_g \big),
\end{aligned} \tag{A.19}
\]
and hence
\[
(\Phi_g, (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega) \Phi_g) \le \frac{1}{e_1 - e_0} (\Phi_g, (1 \otimes \bar{\Pi}_{\phi_0} \otimes \Pi_\Omega) W_g \Phi_g). \tag{A.20}
\]
We conclude the proof thanks to Lemmas A.3 and A.5, by arguing in the same way as in the proof of Lemma A.6.
We now give estimates relating the perturbation $W_g$ to $H_0$.

Lemma A.8. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$, $0 \le |P| \le p_c$, $0 < \rho \ll 1$ and $\varepsilon \ge 0$, the following estimates hold:
\[
\big\| [H_0 - E_g + \varepsilon]^{-1/2} \bar{\Pi}_\rho W_g \bar{\Pi}_\rho [H_0 - E_g + \varepsilon]^{-1/2} \big\| \le C |g| \rho^{-1/2}, \tag{A.21}
\]
\[
\big\| \Pi_\rho W_g \bar{\Pi}_\rho [H_0 - E_g + \varepsilon]^{-1/2} \big\| \le C |g|, \tag{A.22}
\]
\[
\big\| [H_0 - E_g + \varepsilon]^{-1/2} \bar{\Pi}_\rho W_g \Pi_\rho \big\| \le C |g|, \tag{A.23}
\]
\[
\| \Pi_\rho W_g \Pi_\rho \| \le C |g| \rho^{1/2}. \tag{A.24}
\]

Proof. Let us begin with proving (A.21). As in the proof of Lemma A.6, we express $W_g$ in terms of creation and annihilation operators from the Wick-ordered expression obtained from (2.31), and we estimate each term separately. Let us consider again the term (A.14) as an example. Using (A.15), Lemma A.2, and the fact that $E_0 \ge E_g$, we obtain
\[
\Big\| [H_0 - E_g + \varepsilon]^{-1/2} \bar{\Pi}_\rho \Big( \frac{m_{el}}{M}(P - P_{ph}) + p_r \Big)_j \Big\| \le C \rho^{-1/2}, \tag{A.25}
\]
for $j \in \{1,2,3\}$. Next, for $j \in \{1,2,3\}$, Lemma A.4 gives
\[
\Big\| a\Big( h^A_j\Big( \frac{m_{el}}{M} g^{2/3} r \Big) \Big) [H_{ph} + \rho]^{-1/2} \Big\| \le C, \tag{A.26}
\]
and it follows from Lemmas A.1 and A.2 that
\[
\big\| [H_{ph} + \rho]^{1/2} \bar{\Pi}_\rho [H_0 - E_g + \varepsilon]^{-1/2} \big\| \le C. \tag{A.27}
\]
Using (A.25)–(A.27), we obtain
\[
\big\| [H_0 - E_g + \varepsilon]^{-1/2} \bar{\Pi}_\rho \, (\mathrm{A.14}) \, \bar{\Pi}_\rho [H_0 - E_g + \varepsilon]^{-1/2} \big\| \le C |g| \rho^{-1/2}. \tag{A.28}
\]
The other terms in $W_g$ are estimated similarly, using in particular Estimate (A.9) (in addition to (A.8)) for the terms quadratic in the annihilation and creation operators. Hence (A.21) is proven. In order to prove (A.22)–(A.24), we proceed similarly, using the following further estimates:
\[
\Big\| \Big( \frac{m_{el}}{M}(P - P_{ph}) + p_r \Big)_j \Pi_\rho \Big\| \le C, \qquad \big\| [H_{ph} + \rho]^{1/2} \Pi_\rho \big\| \le C \rho^{1/2}. \tag{A.29}
\]
The first estimate in (A.29) follows from (A.15), while the second is an obvious consequence of the Spectral Theorem.
Lemma A.9. There exist $g_c > 0$ and $p_c > 0$ such that, for all $0 \le |g| \le g_c$, $0 \le |P| \le p_c$, $0 < \rho \ll 1$, and $\varepsilon \ge 0$, we have
\[
\Pi_0 W_g [H_0 - E_g]^{-1} \bar{\Pi}_\rho W_g \Pi_0 = \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \Pi_0 \tilde{w}(r,k,\lambda) \Big[ H_r + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} w(r,k,\lambda) \Pi_0 \, dk + O(|g|^3) + O(g^2 \rho), \tag{A.30}
\]
where $w(r,k,\lambda)$ and $\tilde{w}(r,k,\lambda)$ are defined in (3.18).

Proof. The perturbation $W_g$ appears twice in $\Pi_0 W_g [H_0 - E_g]^{-1} \bar{\Pi}_\rho W_g \Pi_0$. We introduce the expression (2.31) of $W_g$ into the latter operator, and consider each term separately. First, the terms containing a creation operator in the “first” $W_g$ vanish since $\Pi_0$ projects onto the Fock vacuum. The same holds for the terms containing an annihilation operator in the “second” $W_g$. Next, the terms involving the parts of $W_g$ quadratic in the creation and annihilation operators are (at least) of order $O(|g|^3)$, as follows again from Lemmas A.4 and A.8. Therefore, one can compute
\[
\begin{aligned}
\Pi_0 W_g [H_0 - E_g]^{-1} \bar{\Pi}_\rho W_g \Pi_0 = {} & \sum_{\lambda=1,2} \int_{\mathbb{R}^3} \Pi_0 \tilde{w}(r,k,\lambda) \Big[ H_r + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} w(r,k,\lambda) \Pi_0 \, dk \\
& - \sum_{\lambda=1,2} \int_{|k| \le \rho} \Pi_0 \tilde{w}(r,k,\lambda) \Big[ e_0 + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} (1 \otimes \Pi_{\phi_0} \otimes 1) w(r,k,\lambda) \Pi_0 \, dk + O(|g|^3).
\end{aligned} \tag{A.31}
\]
The second term in the right-hand side of (A.31) is estimated as follows:
\[
\Big\| \sum_{\lambda=1,2} \int_{|k| \le \rho} \Pi_0 \tilde{w}(r,k,\lambda) \Big[ e_0 + \frac{1}{2M}(P-k)^2 + |k| - E_g \Big]^{-1} (1 \otimes \Pi_{\phi_0} \otimes 1) w(r,k,\lambda) \Pi_0 \, dk \Big\| \le \sum_{\lambda=1,2} \int_{|k| \le \rho} \frac{C}{|k|^2} \, dk \le C' \rho. \tag{A.32}
\]
Hence (A.30) is proven.

References

[1] W. K. Abou Salem, J. Faupin, J. Fröhlich and I. M. Sigal, On the theory of resonances in non-relativistic QED and related models, Adv. Appl. Math. 43 (2009) 201–230.
[2] I. E. Abramov and A. V. Andreev, Hyperfine structure of a hydrogen-like atom due to orbit-orbit, spin-orbit and spin-spin interactions, Moscow Univ. Phys. Bull. 62 (2007) 283–286.
[3] L. Amour and J. Faupin, The confined hydrogenoid ion in non-relativistic quantum electrodynamics, Cubo 9 (2007) 103–137.
[4] L. Amour, J. Faupin, B. Grébert and J.-C. Guillot, On the infrared problem for the dressed non-relativistic electron in a magnetic field, in Spectral and Scattering Theory for Quantum Magnetic Systems, Contemp. Math., Vol. 500 (Amer. Math. Soc., Providence, RI, 2009), pp. 1–24.
[5] L. Amour, B. Grébert and J.-C. Guillot, The dressed mobile atoms and ions, J. Math. Pures Appl. 86 (2006) 177–200.
[6] A. V. Andreev, Atomic Spectroscopy. Introduction to the Theory of Hyperfine Structure (Springer, 2005).
[7] V. Bach, T. Chen, J. Fröhlich and I. M. Sigal, Smooth Feshbach map and operator-theoretic renormalization group methods, J. Funct. Anal. 203 (2003) 44–92.
[8] V. Bach, T. Chen, J. Fröhlich and I. M. Sigal, The renormalized electron mass in non-relativistic quantum electrodynamics, J. Funct. Anal. 243 (2007) 426–535.
[9] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998) 299–395.
[10] V. Bach, J. Fröhlich and I. M. Sigal, Spectral analysis for systems of atoms and molecules coupled to the quantized radiation field, Comm. Math. Phys. 207 (1999) 249–290.
[11] H. A. Bethe and E. Salpeter, Quantum Mechanics of One- and Two-Electron Atoms (Springer-Verlag, 1957).
[12] T. Chen, Infrared renormalization in non-relativistic QED and scaling criticality, J. Funct. Anal. 254 (2008) 2555–2647.
[13] T. Chen and J. Fröhlich, Coherent infrared representations in non-relativistic QED, in Spectral Theory and Mathematical Physics: A Festschrift in Honor of Barry Simon’s 60th Birthday, Proc. Sympos. Pure Math., Vol. 76 (Amer. Math. Soc., Providence, RI, 2007), pp. 25–45.
[14] T. Chen, J. Fröhlich and A. Pizzo, Infraparticle scattering states in non-relativistic QED. II. Mass shell properties, J. Math. Phys. 50 (2009) 012103, 34 pp.
[15] C. Cohen-Tannoudji, B. Diu and F. Laloë, Mécanique Quantique II (Hermann, Paris, 1977).
[16] J. Faupin, Resonances of the confined hydrogen atom and the Lamb–Dicke effect in non-relativistic QED, Ann. Henri Poincaré 9 (2008) 743–773.
[17] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic completeness for Compton scattering, Comm. Math. Phys. 252 (2004) 415–476.
[18] J. Fröhlich, M. Griesemer and B. Schlein, Rayleigh scattering at atoms with dynamical nuclei, Comm. Math. Phys. 271 (2007) 387–430.
[19] J. Fröhlich and A. Pizzo, Renormalized electron mass in nonrelativistic QED, Comm. Math. Phys. 294 (2010) 439–470.
[20] M. Griesemer and D. Hasler, On the smooth Feshbach–Schur map, J. Funct. Anal. 254 (2008) 2329–2335.
[21] D. Hasler and I. Herbst, Absence of ground states for a class of translation invariant models of non-relativistic QED, Comm. Math. Phys. 279 (2008) 769–787.
[22] F. Hiroshima, Fiber Hamiltonians in non-relativistic quantum electrodynamics, J. Funct. Anal. 252 (2007) 314–355.
[23] F. Hiroshima, Multiplicity of ground states in quantum field models: Application of asymptotic fields, J. Funct. Anal. 224 (2005) 431–470.
[24] F. Hiroshima and J. Lorinczi, Functional integral representations of nonrelativistic quantum electrodynamics with spin 1/2, J. Funct. Anal. 254 (2008) 2127–2185.
[25] F. Hiroshima and H. Spohn, Ground state degeneracy of the Pauli–Fierz Hamiltonian with spin, Adv. Theor. Math. Phys. 5 (2001) 1091–1104.
[26] C. Itzykson and J.-B. Zuber, Quantum Field Theory (McGraw-Hill, 1980).
[27] M. Loss, T. Miyao and H. Spohn, Lowest energy states in nonrelativistic QED: Atoms and ions in motion, J. Funct. Anal. 243 (2007) 353–393.
[28] M. Loss, T. Miyao and H. Spohn, Kramers degeneracy theorem in nonrelativistic QED, Lett. Math. Phys. 89 (2009) 21–31.
[29] A. Messiah, Mécanique Quantique, Tome 2 (Dunod, Paris, 1995).
[30] I. Sasaki, Ground state of a model in relativistic quantum electrodynamics with a fixed total momentum, preprint (2006); arXiv:math-ph/0606029v4.
[31] I. M. Sigal, Ground state and resonances in the standard model of non-relativistic QED, J. Stat. Phys. 134 (2009) 899–939.
[32] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, Cambridge, 2004).
July 6, J070-S0129055X11004370
2011 11:6 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 6 (2011) 575–613 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004370
CALCULI, HODGE OPERATORS AND LAPLACIANS ON A QUANTUM HOPF FIBRATION
GIOVANNI LANDI∗,‡ and ALESSANDRO ZAMPINI†,§
∗Dipartimento di Matematica e Informatica, Università di Trieste, Via A. Valerio 12/1, I-34127, Trieste, Italy and INFN, Sezione di Trieste, Trieste, Italy
†Mathematisches Institut der Ludwig-Maximilians-Universität, Theresienstraße 39, 80333 München, Germany
‡[email protected]
§[email protected]
Received 13 October 2010
Revised 5 April 2011

We describe Laplacian operators on the quantum group $SU_q(2)$ equipped with the four-dimensional bicovariant differential calculus of Woronowicz as well as on the quantum homogeneous space $S^2_q$ with the restricted left covariant three-dimensional differential calculus. This is done by giving a family of Hodge dualities on both the exterior algebras of $SU_q(2)$ and $S^2_q$. We also study gauged Laplacian operators acting on sections of line bundles over the quantum sphere.

Keywords: Quantum groups; quantum spheres; differential calculi; Hopf bundles; connections; Hodge star operators; Laplacian operators.

Mathematics Subject Classification 2010: 17B37, 58B32
1. Introduction

We continue our program devoted to Laplacian operators on quantum spaces with the study of such operators on the quantum (standard) Podleś sphere $S^2_q$ and their coupling with gauge connections on the quantum principal U(1)-fibration $\mathcal{A}(S^2_q) \hookrightarrow \mathcal{A}(SU_q(2))$. While in [22] one worked with a left 3D covariant differential calculus on $SU_q(2)$ and its restriction to the (unique) 2D left covariant differential calculus on the sphere $S^2_q$, in the present paper we use the somewhat more complicated $4D_+$ bicovariant calculus on $SU_q(2)$ introduced in [36] and its restriction to a 3D left covariant calculus on the sphere $S^2_q$. Laplacian operators on all Podleś spheres, related to the $4D_+$ bicovariant calculus on $SU_q(2)$, were already studied in [29]. Our contribution to Laplacian operators
comes from the use of Hodge $\star$-operators on both the manifold of $SU_q(2)$ and $S^2_q$ that we introduce by improving and diversifying on existing definitions. We then move on to line bundles on the standard sphere $S^2_q$ and to a class of operators on such bundles that are “gauged” with the use of a suitable class of connections on the principal bundle $\mathcal{A}(S^2_q) \hookrightarrow \mathcal{A}(SU_q(2))$ and of the corresponding covariant derivatives on (modules of sections of) the line bundles. These gauged Laplacians are completely diagonalized and are split in terms of a Laplacian operator on the total space $SU_q(2)$ of the bundle minus vertical operators, paralleling what happens on a classical principal bundle (see e.g. [2, Proposition 5.6]) and on the Hopf fibration of the sphere $S^2_q$ with calculi coming from the left covariant one on $SU_q(2)$ as shown in [22, 37].

In Sec. 2 we describe all we need of the principal fibration $\mathcal{A}(S^2_q) \hookrightarrow \mathcal{A}(SU_q(2))$ and associated line bundles over $S^2_q$. We also give a systematic description of the differential calculi we are interested in, the $4D_+$ bicovariant calculus on $SU_q(2)$ and its restriction to a 3D left covariant calculus on the sphere $S^2_q$. A thoughtful construction of Hodge $\star$-dualities on $SU_q(2)$ is in Sec. 3 while the one on $S^2_q$ is in Sec. 4. These are used in Sec. 5 for the definition of Laplacian operators. A digression on connections on the principal bundle and on covariant derivatives on the line bundles is in Sec. 6, and the following Sec. 7 is devoted to the corresponding gauged Laplacian operators on modules of sections of the line bundles. To make the paper relatively self-contained it concludes with two appendices, Appendix A giving general facts on differential calculi on Hopf algebras and Appendix B concerning general facts on quantum principal bundles endowed with connections.
We would like to mention that besides the constructions in [16, 21], examples of Hodge operators on the exterior algebras of the quantum homogeneous q-Minkowski and q-Euclidean spaces, satisfying a covariance requirement with respect to the action of the quantum groups $SO_q(3,1)$ and $SO_q(4)$, have been given in [26, 24] using the formalism of braided geometry and with a construction of a q-epsilon tensor. On the exterior algebra over the quantum planes $\mathbb{R}^N_q$ a Hodge operator has been studied in [12].

Conventions and notations. When writing about connections and covariant derivatives we shall pay attention to keeping the two notions distinct: a connection will be a projection on a principal bundle while a covariant derivative will be an operator on sections, both objects fulfilling suitable properties. For $q \ne 1$ the “q-number” is defined as
\[
[x] = [x]_q := \frac{q^x - q^{-x}}{q - q^{-1}}, \tag{1.1}
\]
for any $x \in \mathbb{R}$. For a coproduct $\Delta$ we use the Sweedler notation $\Delta(x) = x_{(1)} \otimes x_{(2)}$, with implicit summation. This is iterated to $(\mathrm{id} \otimes \Delta) \circ \Delta(x) = (\Delta \otimes \mathrm{id}) \circ \Delta(x) = x_{(1)} \otimes x_{(2)} \otimes x_{(3)}$, and so on.
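As a quick numerical companion to definition (1.1), the sketch below evaluates the q-number and checks a few elementary consequences of the formula: $[1] = 1$, $[2] = q + q^{-1}$, invariance under $q \to q^{-1}$, and the classical limit $[x] \to x$ as $q \to 1$. The function name `q_number` and the special-casing of $q = 1$ are our own conventions, since (1.1) is stated only for $q \ne 1$.

```python
import math


def q_number(x, q):
    """q-number [x]_q = (q^x - q^-x) / (q - q^-1) for q > 0, as in (1.1).

    The value at q = 1 is filled in by the classical limit [x] -> x;
    this extension is an assumption of the sketch, not part of (1.1).
    """
    if q <= 0:
        raise ValueError("q must be positive")
    if math.isclose(q, 1.0):
        return float(x)
    return (q ** x - q ** (-x)) / (q - 1.0 / q)


q = 0.5
assert math.isclose(q_number(1, q), 1.0)
assert math.isclose(q_number(2, q), q + 1.0 / q)
assert math.isclose(q_number(1.5, q), q_number(1.5, 1.0 / q))  # symmetric under q -> 1/q
assert math.isclose(q_number(-1.5, q), -q_number(1.5, q))      # odd in x
assert abs(q_number(1.5, 1.0 + 1e-6) - 1.5) < 1e-6             # classical limit
```

The symmetry under $q \to q^{-1}$ is consistent with restricting the deformation parameter to $0 < q < 1$ later in the text.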
2. Prelude: Calculi and Line Bundles on Quantum Spheres

We introduce the manifolds of the quantum group $SU_q(2)$ and its quantum homogeneous space $S^2_q$, the standard Podleś sphere. The corresponding inclusion $\mathcal{A}(S^2_q) \hookrightarrow \mathcal{A}(SU_q(2))$ of the corresponding coordinate algebras is a (topological) quantum principal bundle. Following Appendix A we then equip $\mathcal{A}(SU_q(2))$ with a 4-dimensional bicovariant calculus, whose restriction gives a 3-dimensional left covariant calculus on $\mathcal{A}(S^2_q)$.

2.1. Spheres and bundles

For the quantum group $SU_q(2)$ its polynomial algebra $\mathcal{A}(SU_q(2))$ is the unital ∗-algebra generated by elements $a$ and $c$, with relations
\[
ac = qca, \qquad ac^* = qc^*a, \qquad cc^* = c^*c, \qquad a^*a + c^*c = aa^* + q^2 cc^* = 1. \tag{2.1}
\]
In the limit $q \to 1$ one recovers the commutative coordinate algebra on the group manifold SU(2). The algebra $\mathcal{A}(SU_q(2))$ can be completed to a $C^*$-algebra in the usual way by considering all its admissible representations and the supremum (universal) norm on them [35]. For the sake of the present paper this is not necessary since we are interested in Laplacian operators on $SU_q(2)$ (and on its homogeneous space, the quantum sphere) and their spectra. Thus we only exhibit a vector space basis for $\mathcal{A}(SU_q(2))$ in (2.13) below, giving an analogue of the classical Wigner D-functions for the SU(2) group, i.e. matrix elements of its unitary irreducible (co)representations. Also, without loss of generality, the deformation parameter $q \in \mathbb{R}$ will be restricted to the interval $0 < q < 1$, the map $q \to q^{-1}$ giving isomorphic algebras.

If we use the matrix
\[
U = \begin{pmatrix} a & -qc^* \\ c & a^* \end{pmatrix},
\]
whose unitarity is equivalent to relations (2.1), the Hopf algebra structure for $\mathcal{A}(SU_q(2))$ is given by coproduct, antipode and counit:
\[
\Delta U = U \otimes U, \qquad S(U) = U^*, \qquad \varepsilon(U) = 1,
\]
that is $\Delta(a) = a \otimes a - qc^* \otimes c$, and $\Delta(c) = c \otimes a + a^* \otimes c$; $S(a) = a^*$ and $S(c) = -qc$; $\varepsilon(a) = 1$ and $\varepsilon(c) = 0$ and their ∗-conjugated relations.

The quantum universal enveloping algebra $\mathcal{U}_q(su(2))$ is the unital Hopf ∗-algebra generated as an algebra by four elements $K^{\pm 1}, E, F$ with $KK^{-1} = 1 = K^{-1}K$ and relations:
\[
K^{\pm 1} E = q^{\pm 1} E K^{\pm 1}, \qquad K^{\pm 1} F = q^{\mp 1} F K^{\pm 1}, \qquad [E, F] = \frac{K^2 - K^{-2}}{q - q^{-1}}. \tag{2.2}
\]
The ∗-structure is $K^* = K$, $E^* = F$, and the Hopf algebra structure is provided by the coproduct
\[
\Delta(K^{\pm 1}) = K^{\pm 1} \otimes K^{\pm 1}, \qquad \Delta(E) = E \otimes K + K^{-1} \otimes E, \qquad \Delta(F) = F \otimes K + K^{-1} \otimes F,
\]
while the antipode is $S(K) = K^{-1}$, $S(E) = -qE$, $S(F) = -q^{-1}F$ and the counit reads $\varepsilon(K) = 1$, $\varepsilon(E) = \varepsilon(F) = 0$. The quadratic element
\[
C_q = \frac{qK^2 - 2 + q^{-1}K^{-2}}{(q - q^{-1})^2} + FE - \frac{1}{4} \tag{2.3}
\]
is a quantum Casimir operator that generates the centre of $\mathcal{U}_q(su(2))$.

The Hopf ∗-algebras $\mathcal{U}_q(su(2))$ and $\mathcal{A}(SU_q(2))$ are dually paired. The ∗-compatible bilinear mapping $\langle \cdot, \cdot \rangle : \mathcal{U}_q(su(2)) \times \mathcal{A}(SU_q(2)) \to \mathbb{C}$ is given on the generators by
\[
\langle K^{\pm 1}, a \rangle = q^{\mp 1/2}, \qquad \langle K^{\pm 1}, a^* \rangle = q^{\pm 1/2}, \qquad \langle E, c \rangle = 1, \qquad \langle F, c^* \rangle = -q^{-1}, \tag{2.4}
\]
with all other couples of generators pairing to zero. This pairing is proved [20] to be non-degenerate. The algebra $\mathcal{U}_q(su(2))$ is recovered as a ∗-Hopf subalgebra in the dual algebra $\mathcal{A}(SU_q(2))^o$, the largest Hopf ∗-subalgebra contained in the vector space dual $\mathcal{A}(SU_q(2))'$. There are [35] ∗-compatible canonical commuting actions of $\mathcal{U}_q(su(2))$ on $\mathcal{A}(SU_q(2))$:
\[
h \triangleright x := x_{(1)} \langle h, x_{(2)} \rangle, \qquad x \triangleleft h := \langle h, x_{(1)} \rangle x_{(2)}.
\]
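On powers of generators one computes, for $s \in \mathbb{N}$, that
\[
\begin{aligned}
K^{\pm 1} \triangleright a^s & = q^{\mp s/2} a^s, & F \triangleright a^s & = 0, & E \triangleright a^s & = -q^{(3-s)/2} [s] a^{s-1} c^*, \\
K^{\pm 1} \triangleright a^{*s} & = q^{\pm s/2} a^{*s}, & F \triangleright a^{*s} & = q^{(1-s)/2} [s] c a^{*s-1}, & E \triangleright a^{*s} & = 0, \\
K^{\pm 1} \triangleright c^s & = q^{\mp s/2} c^s, & F \triangleright c^s & = 0, & E \triangleright c^s & = q^{(1-s)/2} [s] c^{s-1} a^*, \\
K^{\pm 1} \triangleright c^{*s} & = q^{\pm s/2} c^{*s}, & F \triangleright c^{*s} & = -q^{-(1+s)/2} [s] a c^{*s-1}, & E \triangleright c^{*s} & = 0;
\end{aligned} \tag{2.5}
\]
and
\[
\begin{aligned}
a^s \triangleleft K^{\pm 1} & = q^{\mp s/2} a^s, & a^s \triangleleft F & = q^{(s-1)/2} [s] c a^{s-1}, & a^s \triangleleft E & = 0, \\
a^{*s} \triangleleft K^{\pm 1} & = q^{\pm s/2} a^{*s}, & a^{*s} \triangleleft F & = 0, & a^{*s} \triangleleft E & = -q^{(3-s)/2} [s] c^* a^{*s-1}, \\
c^s \triangleleft K^{\pm 1} & = q^{\pm s/2} c^s, & c^s \triangleleft F & = 0, & c^s \triangleleft E & = q^{(s-1)/2} [s] c^{s-1} a, \\
c^{*s} \triangleleft K^{\pm 1} & = q^{\mp s/2} c^{*s}, & c^{*s} \triangleleft F & = -q^{(s-3)/2} [s] a^* c^{*s-1}, & c^{*s} \triangleleft E & = 0.
\end{aligned} \tag{2.6}
\]
Consider the algebra $\mathcal{A}(U(1)) := \mathbb{C}[z, z^*] / \langle zz^* - 1 \rangle$. The map
\[
\pi : \mathcal{A}(SU_q(2)) \to \mathcal{A}(U(1)), \qquad \pi \begin{pmatrix} a & -qc^* \\ c & a^* \end{pmatrix} := \begin{pmatrix} z & 0 \\ 0 & z^* \end{pmatrix} \tag{2.7}
\]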
is a surjective Hopf ∗-algebra homomorphism. As a consequence, U(1) is a quantum subgroup of $SU_q(2)$ with right coaction:
\[
\delta_R := (\mathrm{id} \otimes \pi) \circ \Delta : \mathcal{A}(SU_q(2)) \to \mathcal{A}(SU_q(2)) \otimes \mathcal{A}(U(1)). \tag{2.8}
\]
The coinvariant elements for this coaction, elements $b \in \mathcal{A}(SU_q(2))$ for which $\delta_R(b) = b \otimes 1$, form the algebra of the standard Podleś sphere $\mathcal{A}(S^2_q) \hookrightarrow \mathcal{A}(SU_q(2))$. This inclusion gives a topological quantum principal bundle, following the formulation reviewed in Appendix B.

The above right U(1) coaction on $SU_q(2)$ is dual to the left action of the element $K$, and allows one [25] to give a decomposition
\[
\mathcal{A}(SU_q(2)) = \bigoplus_{n \in \mathbb{Z}} \mathcal{L}_n \tag{2.9}
\]
in terms of $\mathcal{A}(S^2_q)$-bimodules defined by
\[
\mathcal{L}_n := \{ x \in \mathcal{A}(SU_q(2)) : K \triangleright x = q^{n/2} x \Leftrightarrow \delta_R(x) = x \otimes z^{-n} \}, \tag{2.10}
\]
with $\mathcal{A}(S^2_q) = \mathcal{L}_0$. It is easy to see (cf. [19, Proposition 3.1]) that $\mathcal{L}_n^* = \mathcal{L}_{-n}$ and $\mathcal{L}_n \mathcal{L}_m = \mathcal{L}_{n+m}$. Also
\[
E \triangleright \mathcal{L}_n \subset \mathcal{L}_{n+2}, \qquad F \triangleright \mathcal{L}_n \subset \mathcal{L}_{n-2}, \qquad \mathcal{L}_n \triangleleft u \subset \mathcal{L}_n, \tag{2.11}
\]
for any $u \in \mathcal{U}_q(su(2))$. The bimodules $\mathcal{L}_n$ will be described at length later on when we endow them with connections. Here we only mention that the bimodules $\mathcal{L}_n$ have a vector space decomposition (cf., e.g., [23]):
\[
\mathcal{L}_n := \bigoplus_{J = \frac{|n|}{2}, \frac{|n|}{2}+1, \frac{|n|}{2}+2, \ldots} V_J^{(n)}, \tag{2.12}
\]
where $V_J^{(n)}$ is the spin $J$ (with $J \in \frac{1}{2}\mathbb{N}$) irreducible ∗-representation space for the right action of $\mathcal{U}_q(su(2))$, with basis elements
\[
\phi_{n,J,l} = (c^{J-n/2} a^{*J+n/2}) \triangleleft E^l \tag{2.13}
\]
with $n \in \mathbb{Z}$, $J = \frac{|n|}{2} + \mathbb{N}$, $l = 0, \ldots, 2J$.
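The grading (2.10) of the basis elements (2.13) can be checked by exponent bookkeeping. The sketch below assumes that $K$ acts multiplicatively on products, which follows from the group-like coproduct $\Delta(K) = K \otimes K$, and uses the weights $K \triangleright c^s = q^{-s/2} c^s$ and $K \triangleright a^{*s} = q^{s/2} a^{*s}$ from (2.5). It works on the highest element $l = 0$ only; the helper names are hypothetical.

```python
from fractions import Fraction


def k_weight_exponent(J, n):
    """Exponent e such that K acts on c^(J - n/2) a*^(J + n/2) as multiplication by q^e.

    Assumes K acts multiplicatively on products (group-like coproduct) and
    the single-generator weights from (2.5).
    """
    J, n = Fraction(J), Fraction(n)
    power_c, power_a_star = J - n / 2, J + n / 2
    for power in (power_c, power_a_star):
        if power < 0 or power.denominator != 1:
            raise ValueError("J - n/2 and J + n/2 must be non-negative integers")
    return -power_c / 2 + power_a_star / 2


def allowed_spins(n, count):
    """First `count` values J = |n|/2, |n|/2 + 1, ... appearing in (2.12)."""
    return [Fraction(abs(n), 2) + k for k in range(count)]


def dimension(J):
    """Number of basis elements phi_{n,J,l}, l = 0, ..., 2J, in (2.13)."""
    return int(2 * Fraction(J) + 1)


for n in range(-4, 5):
    for J in allowed_spins(n, 4):
        assert k_weight_exponent(J, n) == Fraction(n, 2)  # matches K > x = q^(n/2) x
assert dimension(Fraction(3, 2)) == 4
```

The check confirms that $c^{J-n/2} a^{*J+n/2}$ lies in $\mathcal{L}_n$ for every admissible $J$. Since the left and right actions commute, the same weight holds after applying $\triangleleft E^l$.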
2.2. The 4D exterior algebra over the quantum group $SU_q(2)$

Following the formulation reviewed in Appendix A, we present here the exterior algebra over the so-called $4D_+$ bicovariant calculus on $SU_q(2)$, which was introduced as a first order differential calculus in [36], and described in detail in [33]. One needs an ideal $\mathcal{Q}_{SU_q(2)} \subset \ker \varepsilon_{SU_q(2)}$. The one corresponding to the $4D_+$ calculus is generated by the nine elements
\[
\begin{aligned}
\{ & c^2; \; c(a^* - a); \; q^2 a^{*2} - (1 + q^2)(aa^* - cc^*) + a^2; \; c^*(a^* - a); \; c^{*2}; \\
& [q^2 a + a^* - q^{-1}(1 + q^4)] c; \; [q^2 a + a^* - q^{-1}(1 + q^4)](a^* - a); \; [q^2 a + a^* - q^{-1}(1 + q^4)] c^*; \\
& [q^2 a + a^* - q^{-1}(1 + q^4)][q^2 a + a^* - (1 + q^2)] \}.
\end{aligned}
\]
One has $\mathrm{Ad}(\mathcal{Q}_{SU_q(2)}) \subset \mathcal{Q}_{SU_q(2)} \otimes \mathcal{A}(SU_q(2))$ and $\dim(\ker \varepsilon_{SU_q(2)} / \mathcal{Q}_{SU_q(2)}) = 4$. The associated quantum tangent space as in (A.2) turns out to be a four-dimensional $\mathcal{X}_{\mathcal{Q}} \subset \mathcal{U}_q(su(2))$. A choice for a basis is given by the elements
\[
\begin{aligned}
L_- & = q^{1/2} F K^{-1}, \qquad L_z = \frac{K^{-2} - 1}{q - q^{-1}}, \qquad L_+ = q^{-1/2} E K^{-1}; \\
L_0 & = \frac{q(K^2 - 1) + q^{-1}(K^{-2} - 1)}{(q - q^{-1})^2} + FE = \frac{q(K^{-2} - 1) + q^{-1}(K^2 - 1)}{(q - q^{-1})^2} + EF,
\end{aligned} \tag{2.14}
\]
where the second expression for $L_0$ follows from the last commutation rule in (2.2). The vector $L_0$ belongs to the center of $\mathcal{U}_q(su(2))$: it differs from the quantum Casimir (2.3) by a constant term,
\[
C_q = L_0 + \Big( \frac{q^{1/2} - q^{-1/2}}{q - q^{-1}} \Big)^2 - \frac{1}{4} = L_0 + \Big[ \frac{1}{2} \Big]^2 - \frac{1}{4}. \tag{2.15}
\]
The coproducts of the basis (A.6) give $\Delta L_b = 1 \otimes L_b + \sum_a L_a \otimes f_{ab}$: once chosen the ordering $(-, z, +, 0)$, such a tensor product can be represented as a row by column matrix product where
\[
f_{ab} = \begin{pmatrix}
1 & 0 & 0 & q^{-1/2} KE \\
(q - q^{-1}) q^{1/2} F K^{-1} & K^{-2} & (q - q^{-1}) q^{-1/2} E K^{-1} & (q - q^{-1}) \Big[ FE + q^{-1} \dfrac{K^{-2} - K^{2}}{(q - q^{-1})^2} \Big] \\
0 & 0 & 1 & q^{-1/2} FK \\
0 & 0 & 0 & K^2
\end{pmatrix}. \tag{2.16}
\]
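The differential $d : A(SU_q(2)) \to \Omega^1(SU_q(2))$ in (A.5) is written for any $x \in A(SU_q(2))$ as
\[
dx = \sum_a (L_a \triangleright x)\,\omega_a = \sum_a \omega_a\,(R_a \triangleright x)
\tag{2.17}
\]
on the dual basis of left invariant forms $\omega_a \in \Omega^1(SU_q(2))$ with $\Delta_L^{(1)}(\omega_a) = 1 \otimes \omega_a$. Here $R_a := -S^{-1}(L_a)$ and explicitly:
\[
R_\pm = L_\pm K^2, \qquad R_z = L_z K^2, \qquad R_0 = -L_0.
\tag{2.18}
\]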
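On the generators of the algebra the differential acts as:
\[
\begin{aligned}
da &= (q-q^{-1})^{-1}(q-1)\,a\,\omega_z - q c^*\omega_+ + \lambda_1 a\,\omega_0, \\
da^* &= c\,\omega_- + (q-q^{-1})^{-1}(q^{-1}-1)\,a^*\omega_z + \lambda_1 a^*\omega_0, \\
dc &= (q-q^{-1})^{-1}(q-1)\,c\,\omega_z + a^*\omega_+ + \lambda_1 c\,\omega_0, \\
dc^* &= -q^{-1}a\,\omega_- + (q-q^{-1})^{-1}(q^{-1}-1)\,c^*\omega_z + \lambda_1 c^*\omega_0,
\end{aligned}
\tag{2.19}
\]
with $\lambda_1 = [\tfrac12][\tfrac32]$. These relations can be inverted, giving
\[
\begin{aligned}
\omega_- &= c^*da^* - q a^* dc^*, \\
\omega_+ &= a\,dc - q c\,da, \\
\omega_z &= a^*da + c^*dc - (a\,da^* + q^2 c\,dc^*), \\
\omega_0 &= (1+q)^{-1}\lambda_1^{-1}\,[a^*da + c^*dc + q(a\,da^* + q^2 c\,dc^*)].
\end{aligned}
\tag{2.20}
\]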
It is then easy to see that for $q \to 1$ one has $\omega_0 \to 0$. This differential calculus reduces in the classical limit to the standard three-dimensional bicovariant calculus on SU(2). This first order differential $4D_+$ calculus is a ∗-calculus: the ∗-structure on $A(SU_q(2))$ is extended to an antilinear ∗-structure on $\Omega^1(SU_q(2))$, such that $(dx)^* = d(x^*)$ for any $x \in A(SU_q(2))$. On the basis of left invariant 1-forms it is just
\[
\omega_-^* = -\omega_+, \qquad \omega_z^* = -\omega_z, \qquad \omega_0^* = -\omega_0.
\tag{2.21}
\]
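From (A.7) one works out the bimodule structure of the calculus, obtaining:
\[
\begin{aligned}
\omega_- a &= a\,\omega_- - q c^*\omega_0, & \omega_+ a &= a\,\omega_+, & \omega_0 a &= q^{-1}a\,\omega_0, \\
\omega_- a^* &= a^*\omega_-, & \omega_+ a^* &= a^*\omega_+ + c\,\omega_0, & \omega_0 a^* &= q a^*\omega_0, \\
\omega_- c &= c\,\omega_- + a^*\omega_0, & \omega_+ c &= c\,\omega_+, & \omega_0 c &= q^{-1}c\,\omega_0, \\
\omega_- c^* &= c^*\omega_-, & \omega_+ c^* &= c^*\omega_+ - q^{-1}a\,\omega_0, & \omega_0 c^* &= q c^*\omega_0;
\end{aligned}
\tag{2.22}
\]
as well as:
\[
\begin{aligned}
\omega_z a &= q a\,\omega_z - q(q-q^{-1})c^*\omega_+ + q a\,\omega_0, \\
\omega_z a^* &= (q-q^{-1})c\,\omega_- + q^{-1}a^*\omega_z - q^{-1}a^*\omega_0, \\
\omega_z c &= q c\,\omega_z + (q-q^{-1})a^*\omega_+ + q c\,\omega_0, \\
\omega_z c^* &= -q^{-1}(q-q^{-1})a\,\omega_- + q^{-1}c^*\omega_z - q^{-1}c^*\omega_0.
\end{aligned}
\tag{2.23}
\]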
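The $A(SU_q(2))$-bicovariant bimodule $\Omega^2(SU_q(2))$ of exterior 2-forms is defined by the projection given in (A.10), with $S_Q^{(2)} = \ker A^{(2)} = \ker(1-\sigma) \subset \Omega^1(SU_q(2))^{\otimes 2}$. This necessitates computing the braiding as in (A.9), a preliminary step being the computation as in (A.8) of the right coaction on the left invariant basis forms, $\Delta_R^{(1)}(\omega_a) = \sum_b \omega_b \otimes J_{ba}$. For the calculus at hand:
\[
J_{ba} = \begin{pmatrix}
a^{*2} & (1+q^2)a^*c & -qc^2 & (1-q^2)a^*c \\
-qa^*c^* & aa^*-cc^* & -ac & (q^2-1)cc^* \\
-qc^{*2} & (q+q^{-1})ac^* & a^2 & (q^{-1}-q)ac^* \\
0 & 0 & 0 & 1
\end{pmatrix}.
\tag{2.24}
\]
The braiding map $\sigma : \Omega^1(SU_q(2))^{\otimes 2} \to \Omega^1(SU_q(2))^{\otimes 2}$ is then worked out [7] to be:
\[
\begin{aligned}
\sigma(\omega_-\otimes\omega_-) &= \omega_-\otimes\omega_-, \qquad \sigma(\omega_+\otimes\omega_+) = \omega_+\otimes\omega_+, \qquad \sigma(\omega_0\otimes\omega_0) = \omega_0\otimes\omega_0, \\
\sigma(\omega_z\otimes\omega_z) &= \omega_z\otimes\omega_z + (q^2-q^{-2})(\omega_z\otimes\omega_0 + \omega_-\otimes\omega_+ - \omega_+\otimes\omega_-), \\
\sigma(\omega_-\otimes\omega_+) &= \omega_+\otimes\omega_- - \omega_z\otimes\omega_0, \\
\sigma(\omega_+\otimes\omega_-) &= \omega_-\otimes\omega_+ + \omega_z\otimes\omega_0, \\
\sigma(\omega_-\otimes\omega_z) &= \omega_z\otimes\omega_- + (1+q^2)\,\omega_-\otimes\omega_0, \\
\sigma(\omega_z\otimes\omega_-) &= (1-q^{-2})\,\omega_z\otimes\omega_- + q^{-2}\omega_-\otimes\omega_z - (1+q^{-2})\,\omega_-\otimes\omega_0, \\
\sigma(\omega_-\otimes\omega_0) &= \omega_0\otimes\omega_- + (1-q^2)\,\omega_-\otimes\omega_0, \\
\sigma(\omega_0\otimes\omega_-) &= q^2\omega_-\otimes\omega_0, \\
\sigma(\omega_z\otimes\omega_+) &= q^2\omega_+\otimes\omega_z + (1-q^2)\,\omega_z\otimes\omega_+ + (1+q^2)\,\omega_+\otimes\omega_0, \\
\sigma(\omega_+\otimes\omega_z) &= \omega_z\otimes\omega_+ - (1+q^{-2})\,\omega_+\otimes\omega_0, \\
\sigma(\omega_z\otimes\omega_0) &= \omega_0\otimes\omega_z + (q-q^{-1})^2(\omega_+\otimes\omega_- - \omega_-\otimes\omega_+) - (q-q^{-1})^2\omega_z\otimes\omega_0, \\
\sigma(\omega_0\otimes\omega_z) &= \omega_z\otimes\omega_0, \\
\sigma(\omega_+\otimes\omega_0) &= \omega_0\otimes\omega_+ + (1-q^{-2})\,\omega_+\otimes\omega_0, \\
\sigma(\omega_0\otimes\omega_+) &= q^{-2}\omega_+\otimes\omega_0.
\end{aligned}
\tag{2.25}
\]
Using the general construction of Appendix A, the q-wedge product on 1-forms is defined as $\theta\wedge\theta' = A^{(2)}(\theta\otimes\theta') = (1-\sigma)(\theta\otimes\theta') \subset \mathrm{Range}\,A^{(2)}$. On generators:
\[
\begin{aligned}
&\omega_-\wedge\omega_- = \omega_+\wedge\omega_+ = \omega_0\wedge\omega_0 = 0, \\
&\omega_z\wedge\omega_z - (q^2-q^{-2})\,\omega_+\wedge\omega_- = 0, \\
&\omega_z\wedge\omega_\pm + q^{\pm2}\,\omega_\pm\wedge\omega_z = 0, \\
&\omega_\pm\wedge\omega_0 + \omega_0\wedge\omega_\pm = 0, \\
&\omega_+\wedge\omega_- + \omega_-\wedge\omega_+ = 0, \\
&\omega_z\wedge\omega_0 + \omega_0\wedge\omega_z - (q-q^{-1})^2\,\omega_-\wedge\omega_+ = 0.
\end{aligned}
\tag{2.26}
\]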
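As a numerical sanity check of (2.25), the braiding can be written as a 16 × 16 matrix on the basis ω_a ⊗ ω_b. The following is only a sketch under stated assumptions: it uses Python with NumPy, and the function name `braiding_matrix`, the basis ordering and the sample value q = 1.3 are illustrative choices that do not appear in the text. It transcribes (2.25) and checks that 1 − σ has rank 6 (the range of the anti-symmetrizer, that is the space of invariant 2-forms, is 6-dimensional) and that σ satisfies (σ − 1)(σ + q²)(σ + q⁻²) = 0.

```python
import numpy as np

def braiding_matrix(q):
    """Matrix of the braiding (2.25) on the basis w_a (x) w_b, a, b in (-, +, 0, z).

    Column index = source tensor, row index = target tensor.
    The ordering of the basis is an arbitrary choice made for this sketch.
    """
    idx = {"-": 0, "+": 1, "0": 2, "z": 3}

    def e(a, b):
        return 4 * idx[a] + idx[b]

    d = q**2 - q**-2          # coefficient (q^2 - q^-2)
    c = (q - 1 / q) ** 2      # coefficient (q - q^-1)^2
    images = {
        ("-", "-"): {("-", "-"): 1},
        ("+", "+"): {("+", "+"): 1},
        ("0", "0"): {("0", "0"): 1},
        ("z", "z"): {("z", "z"): 1, ("z", "0"): d, ("-", "+"): d, ("+", "-"): -d},
        ("-", "+"): {("+", "-"): 1, ("z", "0"): -1},
        ("+", "-"): {("-", "+"): 1, ("z", "0"): 1},
        ("-", "z"): {("z", "-"): 1, ("-", "0"): 1 + q**2},
        ("z", "-"): {("z", "-"): 1 - q**-2, ("-", "z"): q**-2, ("-", "0"): -(1 + q**-2)},
        ("-", "0"): {("0", "-"): 1, ("-", "0"): 1 - q**2},
        ("0", "-"): {("-", "0"): q**2},
        ("z", "+"): {("+", "z"): q**2, ("z", "+"): 1 - q**2, ("+", "0"): 1 + q**2},
        ("+", "z"): {("z", "+"): 1, ("+", "0"): -(1 + q**-2)},
        ("z", "0"): {("0", "z"): 1, ("+", "-"): c, ("-", "+"): -c, ("z", "0"): -c},
        ("0", "z"): {("z", "0"): 1},
        ("+", "0"): {("0", "+"): 1, ("+", "0"): 1 - q**-2},
        ("0", "+"): {("+", "0"): q**-2},
    }
    S = np.zeros((16, 16))
    for src, img in images.items():
        for tgt, coeff in img.items():
            S[e(*tgt), e(*src)] = coeff
    return S

q = 1.3                       # sample deformation parameter (assumption)
S = braiding_matrix(q)
I = np.eye(16)
rank_A2 = np.linalg.matrix_rank(I - S)               # dimension of Range A^(2)
min_poly = (S - I) @ (S + q**2 * I) @ (S + q**-2 * I)
print(rank_A2, np.allclose(min_poly, 0))
```

The cubic identity means that the anti-symmetrizer 1 − σ can only have the eigenvalues 0, 1 + q² and 1 + q⁻².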
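These relations show that $\dim\Omega^2(SU_q(2)) = 6$. The exterior derivative on basis 1-forms results in:
\[
d\omega_\pm = \mp q^{\pm1}\,\omega_\pm\wedge\omega_z, \qquad
d\omega_z = (q+q^{-1})\,\omega_+\wedge\omega_-, \qquad
d\omega_0 = (q-q^{-1})\,\omega_-\wedge\omega_+.
\tag{2.27}
\]
The anti-symmetrizer operator $A^{(2)} : \Omega^2(SU_q(2)) \to \Omega^2(SU_q(2))$ has a natural spectral decomposition. This is what we need later on to introduce Hodge operators. A more general analysis of the spectral properties of the anti-symmetrizer operators associated to a class of bicovariant differential calculi over $SL_q(N)$ (for $N \geq 2$) is in [32]. On the basis
\[
\begin{aligned}
\varphi_0 &= \omega_-\wedge\omega_0, & \varphi_z &= \omega_-\wedge\omega_0 + (1-q^{-2})\,\omega_-\wedge\omega_z, \\
\psi_0 &= \omega_+\wedge\omega_0, & \psi_z &= \omega_+\wedge\omega_0 - (1-q^{2})\,\omega_+\wedge\omega_z, \\
\psi_\pm &= \omega_0\wedge\omega_z + (1-q^{\pm2})\,\omega_-\wedge\omega_+,
\end{aligned}
\tag{2.28}
\]
which is such that $\varphi_0^* = \psi_0$, $\varphi_z^* = \psi_z$ and $\psi_-^* = \psi_+$, it holds that
\[
\begin{aligned}
A^{(2)}(\varphi_0) &= (1+q^2)\varphi_0, & A^{(2)}(\psi_z) &= (1+q^2)\psi_z, & A^{(2)}(\psi_+) &= (1+q^2)\psi_+, \\
A^{(2)}(\varphi_z) &= (1+q^{-2})\varphi_z, & A^{(2)}(\psi_0) &= (1+q^{-2})\psi_0, & A^{(2)}(\psi_-) &= (1+q^{-2})\psi_-.
\end{aligned}
\tag{2.29}
\]
Later on we shall use the labeling $\xi_{(\pm)} \in E_{(\pm)}$ with
\[
E_{(+)} = \{\varphi_0, \psi_z, \psi_+\} \quad\text{and}\quad E_{(-)} = \{\varphi_z, \psi_0, \psi_-\}.
\tag{2.30}
\]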
By proceeding further, the $A(SU_q(2))$-bimodule $\Omega^3(SU_q(2))$ is found to be 4-dimensional with left invariant basis elements:
\[
\chi_- = \omega_+\wedge\omega_0\wedge\omega_z, \qquad
\chi_+ = \omega_-\wedge\omega_0\wedge\omega_z, \qquad
\chi_0 = \omega_-\wedge\omega_+\wedge\omega_z, \qquad
\chi_z = \omega_-\wedge\omega_+\wedge\omega_0,
\tag{2.31}
\]
with $\chi_-^* = -q^{-2}\chi_+$, $\chi_0^* = \chi_0$ and $\chi_z^* = \chi_z$. These exterior forms are closed,
\[
d\chi_a = 0,
\tag{2.32}
\]
and in addition satisfy
\[
A^{(3)}(\chi_a) = 2(1+q^2+q^{-2})\,\chi_a
\tag{2.33}
\]
for $a = -, +, z, 0$, thus providing the spectral decomposition for the antisymmetrizer operator $A^{(3)} : \Omega^3(SU_q(2)) \to \Omega^3(SU_q(2))$. The $A(SU_q(2))$-bimodule $\Omega^4(SU_q(2))$ of top forms ($\Omega^k(SU_q(2)) = \emptyset$ for $k > 4$) is 1-dimensional. Its left invariant basis element $\mu = \omega_-\wedge\omega_+\wedge\omega_z\wedge\omega_0$ is central, i.e. $x\mu = \mu x$ for any $x \in A(SU_q(2))$, and its eigenvalue for the action of the
anti-symmetrizer is A(4) (µ) = 2(q 4 + 2q 2 + 6 + 2q −2 + q −4 )µ.
(2.34)
2.3. The exterior algebra over the quantum sphere S²_q

The restriction of the first order $4D_+$ bicovariant calculus endows the sphere $S^2_q$ with a first order left covariant 3-dimensional calculus [1, 29]. The exterior algebra $\Omega(S^2_q)$ can be characterized in terms of some of the bimodules $L_n$ introduced in Sec. 2. Given $f \in A(S^2_q) \simeq L_0$, the exterior derivative $d : A(S^2_q) \to \Omega^1(S^2_q)$ from (2.17) reduces to:
\[
df = (L_-\triangleright f)\,\omega_- + (L_+\triangleright f)\,\omega_+ + (L_0\triangleright f)\,\omega_0.
\tag{2.35}
\]
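Notice that the basis 1-forms $\{\omega_a,\ a = -, +, 0\}$ are graded commutative (cf. (2.26)). Furthermore, relation (2.11) shows that $(L_\pm\triangleright f) \in L_{\pm2}$ and that $(L_0\triangleright f) \in L_0$, while the $A(SU_q(2))$-bimodule structure of $\Omega^1(SU_q(2))$ described by the coproduct (2.16) of the quantum derivations $L_a$ gives:
\[
\begin{aligned}
\phi\,\omega_- &= \omega_-\phi - q^{-1}\omega_0\,(L_+\triangleright\phi), & \omega_-\phi &= \phi\,\omega_- + q\,(L_+K^2\triangleright\phi)\,\omega_0, \\
\phi'\omega_+ &= \omega_+\phi' - q\,\omega_0\,(L_-\triangleright\phi'), & \omega_+\phi' &= \phi'\omega_+ + q^{-1}(L_-K^2\triangleright\phi')\,\omega_0, \\
\phi''\omega_0 &= \omega_0\,(K^{-2}\triangleright\phi''), & \omega_0\phi'' &= (K^2\triangleright\phi'')\,\omega_0.
\end{aligned}
\tag{2.36}
\]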
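These identities are valid for any $\phi, \phi', \phi'' \in A(SU_q(2))$. They allow one to prove by explicit calculations the following identities:
\[
\begin{aligned}
\phi \in L_{-2}:&\quad d(\phi\,\omega_-) = (L_+\triangleright\phi)\,\omega_+\wedge\omega_- + (L_0\triangleright\phi)\,\omega_0\wedge\omega_-, \\
\phi' \in L_{2}:&\quad d(\phi'\omega_+) = (L_-\triangleright\phi')\,\omega_-\wedge\omega_+ + (L_0\triangleright\phi')\,\omega_0\wedge\omega_+, \\
\phi'' \in L_{0}:&\quad d(\phi''\omega_0) = (L_-\triangleright\phi'')\,\omega_-\wedge\omega_0 + (L_+\triangleright\phi'')\,\omega_+\wedge\omega_0 + \phi''d\omega_0,
\end{aligned}
\tag{2.37}
\]
and
\[
\begin{aligned}
\phi \in L_{-2}:&\quad d(\phi\,\omega_-\wedge\omega_0) = (L_+\triangleright\phi)\,\omega_+\wedge\omega_-\wedge\omega_0, \\
\phi' \in L_{2}:&\quad d(\phi'\omega_0\wedge\omega_+) = (L_-\triangleright\phi')\,\omega_-\wedge\omega_0\wedge\omega_+, \\
\phi'' \in L_{0}:&\quad d(\phi''\omega_-\wedge\omega_+) = (L_0\triangleright\phi'')\,\omega_0\wedge\omega_-\wedge\omega_+.
\end{aligned}
\tag{2.38}
\]
Together with the anti-symmetry properties (2.26) of the wedge product in $\Omega(SU_q(2))$, these identities suggest that the following proposition holds.

Proposition 2.1. The exterior algebra $\Omega(S^2_q)$ obtained as a restriction of $\Omega(SU_q(2))$ associated to the $4D_+$ calculus on $SU_q(2)$ can be written in terms of $A(S^2_q)$-bimodule isomorphisms:
\[
\begin{aligned}
\Omega^1(S^2_q) &\simeq L_{-2}\,\omega_- \oplus L_2\,\omega_+ \oplus L_0\,\omega_0, \\
\Omega^2(S^2_q) &\simeq L_{-2}(\omega_-\wedge\omega_0) \oplus L_0(\omega_-\wedge\omega_+) \oplus L_2(\omega_0\wedge\omega_+), \\
\Omega^3(S^2_q) &\simeq L_0\,\omega_-\wedge\omega_+\wedge\omega_0.
\end{aligned}
\tag{2.39}
\]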
Proof. The analysis above proves only the inclusion $\Omega^1(S^2_q) \subset L_{-2}\,\omega_- \oplus L_2\,\omega_+ \oplus L_0\,\omega_0$ and the analogous ones for higher order forms. The proof of the inverse inclusion will be given at the end of Sec. 6, out of the compatibility of the calculi on the principal Hopf bundle.

The basis element $\omega_-\wedge\omega_+\wedge\omega_0$ commutes with all elements in $L_0 \simeq A(S^2_q)$. Such a calculus is 3-dimensional, since from (2.32) one has $d(\phi''\omega_-\wedge\omega_+\wedge\omega_0) = 0$ for any $\phi'' \in A(S^2_q)$, and from (2.26) one has that $\Omega^1(S^2_q)\wedge(\omega_-\wedge\omega_+\wedge\omega_0) = 0$. From (2.17) and (2.18) the differential can also be written as
\[
df = \omega_-(R_-\triangleright f) + \omega_+(R_+\triangleright f) + \omega_0(R_0\triangleright f),
\tag{2.40}
\]
and it is easy to check the following relations, analogues of the previous (2.37) and (2.38):
\[
\begin{aligned}
\phi \in L_{-2}:&\quad d(\omega_-\phi) = -\omega_-\wedge\omega_+(R_+\triangleright\phi) - \omega_-\wedge\omega_0(R_0\triangleright\phi), \\
\phi' \in L_{2}:&\quad d(\omega_+\phi') = -\omega_+\wedge\omega_-(R_-\triangleright\phi') - \omega_+\wedge\omega_0(R_0\triangleright\phi'), \\
\phi'' \in L_{0}:&\quad d(\omega_0\phi'') = d\omega_0\wedge\phi'' - \omega_0\wedge\omega_-(R_-\triangleright\phi'') - \omega_0\wedge\omega_+(R_+\triangleright\phi'');
\end{aligned}
\tag{2.41}
\]
and
\[
\begin{aligned}
\phi \in L_{-2}:&\quad d(\omega_-\wedge\omega_0\,\phi) = \omega_-\wedge\omega_0\wedge\omega_+(R_+\triangleright\phi), \\
\phi' \in L_{2}:&\quad d(\omega_0\wedge\omega_+\phi') = \omega_0\wedge\omega_+\wedge\omega_-(R_-\triangleright\phi'), \\
\phi'' \in L_{0}:&\quad d(\omega_-\wedge\omega_+\phi'') = \omega_-\wedge\omega_+\wedge\omega_0(R_0\triangleright\phi'').
\end{aligned}
\tag{2.42}
\]

3. Hodge Operators on Ω(SU_q(2))

As described in Sec. 2.2, it holds for the bicovariant forms of the $4D_+$ first order bicovariant calculus that the spaces $\Omega^k(SU_q(2))$ of forms are free $A(SU_q(2))$-bimodules with $\dim\Omega^k(SU_q(2)) = \dim\Omega^{4-k}(SU_q(2))$, and $\dim\Omega^4(SU_q(2)) = 1$. Our strategy to introduce Hodge operators on $\Omega(SU_q(2))$ in Sec. 3.2 first uses suitable contraction maps in order to define Hodge operators on the vector spaces $\Omega^k_{\mathrm{inv}}(SU_q(2))$ of left invariant k-forms; we next extend them to the whole $\Omega^k(SU_q(2))$ by requiring (one side) linearity over $A(SU_q(2))$. This follows an alternative, although equivalent, approach to Hodge operators on classical group manifolds that we describe first in Sec. 3.1. A somewhat complementary approach to the one of Sec. 3.2, more suitable when restricting to the sphere $S^2_q$, is then given in Sec. 3.3.

3.1. Hodge operators on classical group manifolds

Let G be an N-dimensional compact connected Lie group given as a real form of a complex connected Lie group. The algebra $A(G) = \mathrm{Fun}(G)$ of complex valued coordinate functions on G is a ∗-algebra, whose ∗-structure can be extended to
the whole tensor algebra. A metric on the group G is a non-degenerate tensor $g : \mathcal{X}(G)\otimes\mathcal{X}(G) \to A(G)$ which is symmetric, i.e. $g(X, Y) = g(Y, X)$ with $X, Y \in \mathcal{X}(G)$, and real, i.e. $g^*(X, Y) = g(Y^*, X^*)$. Any metric has a normal form: there exists a basis $\{\theta_a,\ a = 1, \dots, N\}$ of the $A(G)$-bimodule $\Omega^1(G)$ of 1-forms which is real, $\theta_a^* = \theta_a$, such that
\[
g = \sum_{a,b=1}^{N} \eta_{ab}\,\theta_a\otimes\theta_b
\tag{3.1}
\]
with $\eta_{ab} = \pm1\cdot\delta_{ab}$. Given the volume N-form $\mu = \mu^* := \theta_1\wedge\cdots\wedge\theta_N$, the corresponding Hodge operator $\star : \Omega^k(G) \to \Omega^{N-k}(G)$ is the $A(G)$-linear operator whose action on the above basis is
\[
\star(1) = \mu, \qquad
\star(\theta_{a_1}\wedge\cdots\wedge\theta_{a_k}) = \frac{1}{(N-k)!}\sum_{b_j} \epsilon_{a_1\cdots a_k}{}^{b_1\cdots b_{N-k}}\;\theta_{b_1}\wedge\cdots\wedge\theta_{b_{N-k}},
\tag{3.2}
\]
with
\[
\epsilon_{a_1\cdots a_k}{}^{b_1\cdots b_{N-k}} := \sum_{s_1\cdots s_k} \eta_{a_1 s_1}\cdots\eta_{a_k s_k}\,\epsilon^{s_1\cdots s_k b_1\cdots b_{N-k}}
\]
from the Levi–Civita tensor and the usual expression for the inverse metric tensor $g^{-1} = \sum_{a,b=1}^{N}\eta^{ab}L_a\otimes L_b$ with $\sum_b \eta^{ab}\eta_{bc} = \delta^a_c$ on the dual vector field basis such that $\theta_b(L_a) = \delta_{ab}$. The Hodge operator (3.2) satisfies the identity:
\[
\star^2(\xi) = \mathrm{sgn}(g)\,(-1)^{k(N-k)}\,\xi
\tag{3.3}
\]
on any $\xi \in \Omega^k(G)$. Here $\mathrm{sgn}(g) = \det(\eta_{ab})$ is the signature of the metric.

Hodge operators can indeed be equivalently introduced in terms of contraction maps. By this we mean an $A(G)$-sesquilinear map $\Gamma : \Omega^1(G)\times\Omega^1(G) \to A(G)$ such that $\Gamma(f\phi, \eta) = f^*\Gamma(\phi, \eta)$ while $\Gamma(\phi, \eta f) = \Gamma(\phi, \eta)f$ for $f \in A(G)$. Such a map can be uniquely extended to a consistent map $\Gamma : \Omega^k(G)\times\Omega^{k+k'}(G) \to \Omega^{k'}(G)$. We postpone showing this to the later Sec. 3.2 where we prove a similar statement for the bicovariant calculus on $SU_q(2)$. Having a contraction map, define the tensor $\tilde g : \Omega^1(G)\times\Omega^1(G) \to A(G)$:
\[
\tilde g(\phi, \eta) := \Gamma(\phi^*, \eta).
\tag{3.4}
\]
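The classical identity (3.3) can be checked directly on basis forms. The following is a minimal sketch in Python, assuming a diagonal metric η_ab = ±δ_ab as in (3.1); the function names `perm_sign`, `hodge_on_basis` and `check_square` are hypothetical helpers introduced only for this illustration. For an increasing index set A, formula (3.2) reduces to a sign (the product of the η_aa for a in A times the Levi–Civita sign of the permutation (A, B)) multiplying the basis form on the complementary index set B.

```python
from itertools import combinations

def perm_sign(perm):
    """Sign of a permutation of 0..n-1, obtained by sorting it with transpositions."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # puts the value j at position j
            sign = -sign
    return sign

def hodge_on_basis(A, eta):
    """Apply (3.2) to theta_{a1} ^ ... ^ theta_{ak}, with A = (a1 < ... < ak).

    eta holds the diagonal metric entries (+1 or -1).
    Returns (coeff, B): the result is coeff * theta_{b1} ^ ... ^ theta_{b(N-k)}.
    """
    N = len(eta)
    B = tuple(i for i in range(N) if i not in A)
    coeff = perm_sign(A + B)          # Levi-Civita sign of (A, B)
    for a in A:
        coeff *= eta[a]               # lowering the indices with eta
    return coeff, B

def check_square(eta):
    """Check star^2 = sgn(g) (-1)^{k(N-k)} on every basis k-form, as in (3.3)."""
    N = len(eta)
    sgn_g = 1
    for s in eta:
        sgn_g *= s
    for k in range(N + 1):
        for A in combinations(range(N), k):
            c1, B = hodge_on_basis(A, eta)
            c2, A_back = hodge_on_basis(B, eta)
            if A_back != A or c1 * c2 != sgn_g * (-1) ** (k * (N - k)):
                return False
    return True

print(check_square((1, 1, 1)), check_square((1, 1, 1, -1)))
```

The same bookkeeping of signs is what the factor 1/k! and the contraction with the volume form encode in the contraction-map formulation.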
Next, with a volume form µ such that $\mu^* = \mu$, define the operator $L : \Omega^k(G) \to \Omega^{N-k}(G)$ as
\[
L(\xi) := \frac{1}{k!}\,\Gamma^*(\xi, \mu)
\tag{3.5}
\]
on $\xi \in \Omega^k(G)$, having used the notation $\Gamma^*(\cdot,\cdot) = (\Gamma(\cdot,\cdot))^*$. A second $A(G)$-sesquilinear map $\{\ ,\ \} : \Omega^k(G)\times\Omega^k(G) \to A(G)$ can be implicitly introduced by the relation
\[
\{\xi, \xi'\}\,\mu := \xi^*\wedge L(\xi').
\tag{3.6}
\]
For any pair of k-forms $\xi, \xi'$ it is straightforward to recover that
\[
\{\xi, \xi'\} = \frac{1}{k!}\,\Gamma^*(\xi', \xi).
\tag{3.7}
\]
The operator (3.5) is not in general a Hodge operator: one has for example $L(1) = \mu$ as well as $L(\mu) = \det(\Gamma^*(\mu, \mu))$, which is not necessarily ±1. To recover the standard formulation for a Hodge operator, one has to impose two constraints:

(a) A hermitianity condition. The sesquilinear map Γ is said to be hermitian provided it satisfies:
\[
\{\phi, \eta\} = \Gamma(\phi, \eta),
\tag{3.8}
\]
for any couple of 1-forms φ and η. From (3.7) and (3.6) it holds that $\{\phi, \eta\} = \Gamma^*(\eta, \phi)$. Then
\[
\{\phi, \eta\} = \Gamma(\phi, \eta) \quad\Leftrightarrow\quad \Gamma(\phi, \eta) = \Gamma^*(\eta, \phi).
\tag{3.9}
\]
If the sesquilinear form Γ is hermitian, one can prove that the expression (3.7) becomes
\[
\{\xi, \xi'\} = \frac{1}{k!}\,\Gamma(\xi, \xi').
\tag{3.10}
\]

(b) A reality condition, namely a compatibility of the operator L with the ∗-conjugation:
\[
L(\phi^*) = (L(\phi))^*
\tag{3.11}
\]
on 1-forms.

If these two constraints are fulfilled, the tensor $\tilde g$ in (3.4) is symmetric and real: it is (the inverse of) a metric tensor on the group manifold G. The operator L turns out to be the standard Hodge operator corresponding to the metric given by $\tilde g$, and satisfies the identities:
\[
L^2(\xi) = (-1)^{k(N-k)}\,\mathrm{sgn}(\Gamma)\,\xi, \qquad
\{\xi, \xi'\} = \mathrm{sgn}(\Gamma)\,\{L(\xi), L(\xi')\}
\tag{3.12}
\]
with $\mathrm{sgn}(\Gamma) := \det(\Gamma(\phi_a, \phi_b))\,|\det(\Gamma(\phi_a, \phi_b))|^{-1} = \mathrm{sgn}(\tilde g)$. Moreover, the operator L turns out to be real, that is, it commutes with the hermitian conjugation ∗ on the whole exterior algebra Ω(G).

The above procedure can to some extent be inverted. That is, given a hermitian contraction map Γ as in (3.8), define the operator L by (3.5). The corresponding tensor $\tilde g$ turns out to be real, but not necessarily symmetric. Imposing that L satisfies one of the two conditions in (3.12) (they are proven to be equivalent) makes the tensor $\tilde g$ symmetric, that is the inverse of a metric tensor, whose Hodge operator is L.
3.2. Hodge operators on Ω(SU_q(2))

In this section we shall describe how the classical geometry analysis of the previous section can be used to introduce a Hodge operator on both the exterior algebras $\Omega(SU_q(2))$ and $\Omega(S^2_q)$ built out of the 4D-bicovariant calculus à la Woronowicz on $A(SU_q(2))$. A somewhat different formulation of contraction maps was also used in [16, 17] for a family of Hodge operators on the exterior algebras of bicovariant differential calculi over quantum groups.

We shall then start with a contraction map $\Gamma : \Omega^1_{\mathrm{inv}}(SU_q(2))\times\Omega^1_{\mathrm{inv}}(SU_q(2)) \to \mathbb{C}$, required to satisfy $\Gamma(\lambda\omega, \omega') = \lambda^*\Gamma(\omega, \omega')$ and $\Gamma(\omega, \omega'\lambda) = \Gamma(\omega, \omega')\lambda$ for $\lambda \in \mathbb{C}$. The natural extension to $\Gamma : \Omega^{\otimes k}_{\mathrm{inv}}(SU_q(2))\times\Omega^{\otimes(k+k')}_{\mathrm{inv}}(SU_q(2)) \to \Omega^{\otimes k'}_{\mathrm{inv}}(SU_q(2))$ given by
\[
\Gamma(\omega_{a_1}\otimes\cdots\otimes\omega_{a_k},\ \omega_{b_1}\otimes\cdots\otimes\omega_{b_{k+k'}}) := \Big(\prod_{j=1}^{k}\Gamma(\omega_{a_j}, \omega_{b_j})\Big)\,\omega_{b_{k+1}}\otimes\cdots\otimes\omega_{b_{k+k'}},
\tag{3.13}
\]
with the assumption that $\Gamma(1, \omega) = \omega$ for any $\omega \in \Omega(SU_q(2))$, can be used to define a consistent contraction map $\Gamma : \Omega^k(SU_q(2))\times\Omega^{k+k'}(SU_q(2)) \to \Omega^{k'}(SU_q(2))$, via
\[
\Gamma(\omega_{a_1}\wedge\cdots\wedge\omega_{a_k},\ \omega_{b_1}\wedge\cdots\wedge\omega_{b_{k+k'}}) := \Gamma\big(A^{(k)}(\omega_{a_1}\otimes\cdots\otimes\omega_{a_k}),\ A^{(k+k')}(\omega_{b_1}\otimes\cdots\otimes\omega_{b_{k+k'}})\big).
\tag{3.14}
\]
This comes from the kth order anti-symmetrizer $A^{(k)}$, constructed from the braiding of the calculus, and used to define the exterior product of forms,
\[
\omega_{a_1}\wedge\cdots\wedge\omega_{a_k} := A^{(k)}(\omega_{a_1}\otimes\cdots\otimes\omega_{a_k});
\tag{3.15}
\]
the key identity for the consistency of (3.15) is
\[
A^{(k+k')}(\omega_{a_1}\otimes\cdots\otimes\omega_{a_{k+k'}}) = (A^{(k)}\otimes A^{(k')})\Big(\sum_{\sigma_j\in S(k,k')} (-1)^{\pi_{\sigma_j}}\,\sigma_j(\omega_{a_1}\otimes\cdots\otimes\omega_{a_{k+k'}})\Big),
\tag{3.16}
\]
where $S(k, k')$ is the collection of the $(k, k')$-shuffles, permutations $\sigma_j$ of $\{1, \dots, k+k'\}$ such that $\sigma_j(1) < \cdots < \sigma_j(k)$ and $\sigma_j(k+1) < \cdots < \sigma_j(k+k')$, and $\pi_{\sigma_j}$ is the parity of $\sigma_j$. The identity (3.16) is valid on the whole exterior algebra over any bicovariant calculus à la Woronowicz on a quantum group. It allows one to show that any $(k+k')$-form can be written as a linear combination of tensor products of k-forms times k'-forms.

To proceed further, we use a slightly more general volume form by taking $\mu = \mu^* = i\,m\,\omega_-\wedge\omega_+\wedge\omega_0\wedge\omega_z$, with $m \in \mathbb{R}$. Then we define an operator $\star : \Omega^k_{\mathrm{inv}}(SU_q(2)) \to \Omega^{4-k}_{\mathrm{inv}}(SU_q(2))$,
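in degree zero and one by
\[
\star(1) := \Gamma^*(1, \mu) = \mu \quad\text{and}\quad \star(\omega_a) := \Gamma^*(\omega_a, \mu).
\tag{3.17}
\]
For $\Omega^k_{\mathrm{inv}}(SU_q(2))$ with $k \geq 2$ we use the diagonal bases of the anti-symmetrizer, that is
\[
A^{(k)}(\xi) = \lambda_\xi\,\xi,
\tag{3.18}
\]
with coefficients in (2.29), (2.33) and (2.34), respectively. On these bases we define
\[
\star(\xi) := \frac{1}{\lambda_\xi}\,\Gamma^*(\xi, \mu).
\tag{3.19}
\]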
Here and in the following we denote $(\Gamma(\ ,\ ))^* = \Gamma^*(\ ,\ )$. The definition (3.19) is a natural generalization of the classical (3.5): the classical factor k! (the spectrum of the anti-symmetrizer operator on k-forms in the classical case, where the braiding is the flip operator) is replaced by the spectrum of the quantum anti-symmetrizer. Also, the presence of the ∗-conjugate comes from consistency and in order to have non trivial solutions.

Before we proceed, it is useful to re-express the volume forms in terms of the diagonal bases of the anti-symmetrizer operators. Some little algebra shows that
\[
\begin{aligned}
\mu &= i m\,\{-\omega_-\otimes\chi_+^* + \omega_+\otimes\chi_-^* + \omega_0\otimes\chi_0^* - \omega_z\otimes\chi_z^*\}, \\
\mu &= i m\,\{-\chi_z\otimes\omega_z^* + \chi_-\otimes\omega_+^* - \chi_+\otimes\omega_-^* + \chi_0\otimes\omega_0^*\},
\end{aligned}
\tag{3.20}
\]
and
\[
\mu = \frac{i m}{q^2-1}\Big\{\frac{1}{1+q^2}\,(q^4\psi_-\otimes\psi_+^* - \psi_+\otimes\psi_-^*) + (q^4\varphi_z\otimes\varphi_0^* - \varphi_0\otimes\varphi_z^* + q^2\psi_0\otimes\psi_z^* - q^{-2}\psi_z\otimes\psi_0^*)\Big\}.
\tag{3.21}
\]
A little more algebra shows in turn that on 1-forms
\[
\star(\omega_a) = i m\,\{\Gamma^*(\omega_a, \omega_-)\chi_+ - \Gamma^*(\omega_a, \omega_+)\chi_- - \Gamma^*(\omega_a, \omega_0)\chi_0 + \Gamma^*(\omega_a, \omega_z)\chi_z\};
\tag{3.22}
\]
and that, using the bases (2.30), on 2-forms
\[
\begin{aligned}
\star(\xi_{(+)}) = \frac{i m}{1-q^4}\Big\{&\frac{1}{1+q^2}\,(q^4\Gamma^*(\xi_{(+)}, \psi_-)\psi_+ - \Gamma^*(\xi_{(+)}, \psi_+)\psi_-) \\
&+ (q^4\Gamma^*(\xi_{(+)}, \varphi_z)\varphi_0 - \Gamma^*(\xi_{(+)}, \varphi_0)\varphi_z + q^2\Gamma^*(\xi_{(+)}, \psi_0)\psi_z - q^{-2}\Gamma^*(\xi_{(+)}, \psi_z)\psi_0)\Big\}, \\
\star(\xi_{(-)}) = \frac{i m}{q^{-2}-q^2}\Big\{&\frac{1}{1+q^2}\,(q^4\Gamma^*(\xi_{(-)}, \psi_-)\psi_+ - \Gamma^*(\xi_{(-)}, \psi_+)\psi_-) \\
&+ (q^4\Gamma^*(\xi_{(-)}, \varphi_z)\varphi_0 - \Gamma^*(\xi_{(-)}, \varphi_0)\varphi_z + q^2\Gamma^*(\xi_{(-)}, \psi_0)\psi_z - q^{-2}\Gamma^*(\xi_{(-)}, \psi_z)\psi_0)\Big\}.
\end{aligned}
\tag{3.23}
\]
As for 3-forms one finds
\[
\star(\chi_a) = -\frac{i m}{2(1+q^2+q^{-2})}\,\{-\Gamma^*(\chi_a, \chi_+)\omega_- + \Gamma^*(\chi_a, \chi_-)\omega_+ + \Gamma^*(\chi_a, \chi_0)\omega_0 - \Gamma^*(\chi_a, \chi_z)\omega_z\},
\tag{3.24}
\]
and finally for the top form
\[
\star(\mu) = \frac{1}{2(q^4+2q^2+6+2q^{-2}+q^{-4})}\,\Gamma^*(\mu, \mu).
\tag{3.25}
\]
As in (3.6) we define the sesquilinear map $\{\ ,\ \} : \Omega^k_{\mathrm{inv}}(SU_q(2))\times\Omega^k_{\mathrm{inv}}(SU_q(2)) \to \mathbb{C}$ by
\[
\{\xi, \xi'\}\,\mu := \xi^*\wedge\star(\xi').
\tag{3.26}
\]
Then, mimicking the analogous construction of Sec. 3.1, we impose both a hermitianity and a reality condition on the contraction map.

(a) A contraction map is hermitian provided it satisfies:
\[
\{\omega_a, \omega_b\} = \Gamma(\omega_a, \omega_b), \qquad\text{for } a, b = -, +, z, 0.
\tag{3.27}
\]
Given contraction maps fulfilling such a hermitianity constraint, from the first line in (3.22) one has that $\Gamma(\omega_a, \omega_b) = \Gamma^*(\omega_b, \omega_a)$. That is, $\Gamma_{ab} = \Gamma^*_{ba}$. With such a condition it is moreover possible to prove that, for $k = 2, 3, 4$,
\[
\{\xi, \xi'\} = \frac{\lambda_{\xi^*}}{\lambda_\xi\,\lambda_{\xi'}}\,\Gamma(\xi, \xi'),
\tag{3.28}
\]
on any $\xi, \xi' \in \Omega^k_{\mathrm{inv}}(SU_q(2))$ of a diagonal basis of the anti-symmetrizer as in (3.18). The above expression is the counterpart of (3.10) for a braiding which is not just the flip operator.

(b) A hermitian contraction map is real provided one has
\[
\lambda_{\xi^*}\star(\xi^*) = (\lambda_\xi\star(\xi))^*,
\tag{3.29}
\]
again on a diagonal basis of $A^{(k)}$. This expression generalizes the classical one (3.11). Notice that it is set on any $\Omega^k_{\mathrm{inv}}(SU_q(2))$, and not only on 1-forms as in the classical case.
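The requirement that the contraction be hermitian and real results in a series of constraints. Firstly, the action on $\Omega^1_{\mathrm{inv}}(SU_q(2))$ of the corresponding operator $\star$ as defined in (3.17) is worked out to be given by
\[
\star\begin{pmatrix}\omega_- \\ \omega_+ \\ \omega_0 \\ \omega_z\end{pmatrix}
= i m \begin{pmatrix}
0 & \alpha & 0 & 0 \\
-q^2\alpha & 0 & 0 & 0 \\
0 & 0 & -\nu & \epsilon \\
0 & 0 & -\epsilon & \gamma
\end{pmatrix}
\begin{pmatrix}\chi_- \\ \chi_+ \\ \chi_0 \\ \chi_z\end{pmatrix}.
\tag{3.30}
\]
The only non zero terms of the contraction Γ are given by
\[
\Gamma_{--} = q^{-2}\Gamma_{++} = \alpha, \qquad \Gamma_{0z} = \Gamma_{z0} = \epsilon, \qquad \Gamma_{00} = \nu, \qquad \Gamma_{zz} = \gamma,
\tag{3.31}
\]
with parameters that are real and satisfy in addition the conditions:
\[
\begin{aligned}
&2\nu + (q^2-q^{-2})\,\epsilon = 0, \\
&2(\epsilon^2 - \gamma\nu) + (q-q^{-1})^2(2q^2\alpha^2 + \epsilon^2) = 0.
\end{aligned}
\tag{3.32}
\]
On $\Omega^2_{\mathrm{inv}}(SU_q(2))$ the action of such an operator is block off-diagonal,
\[
\begin{aligned}
\star\begin{pmatrix}\varphi_0 \\ \psi_z \\ \psi_+\end{pmatrix}
&= \frac{i m}{q^4-1}
\begin{pmatrix}
\Gamma(\varphi_0, \varphi_0) & 0 & 0 \\
0 & q^4\,\Gamma(\varphi_z, \varphi_z) & 0 \\
0 & 0 & \dfrac{\Gamma(\psi_+, \psi_+)}{1+q^2}
\end{pmatrix}
\begin{pmatrix}\varphi_z \\ \psi_0 \\ \psi_-\end{pmatrix}, \\
\star\begin{pmatrix}\varphi_z \\ \psi_0 \\ \psi_-\end{pmatrix}
&= \frac{i m}{1-q^4}
\begin{pmatrix}
q^6\,\Gamma(\varphi_z, \varphi_z) & 0 & 0 \\
0 & q^2\,\Gamma(\varphi_0, \varphi_0) & 0 \\
0 & 0 & \dfrac{q^4\,\Gamma(\psi_+, \psi_+)}{1+q^2}
\end{pmatrix}
\begin{pmatrix}\varphi_0 \\ \psi_z \\ \psi_+\end{pmatrix},
\end{aligned}
\tag{3.33}
\]
while on $\Omega^3_{\mathrm{inv}}(SU_q(2))$ it is
\[
\star\begin{pmatrix}\chi_- \\ \chi_+ \\ \chi_0 \\ \chi_z\end{pmatrix}
= \frac{i m}{2(1+q^2+q^{-2})}
\begin{pmatrix}
0 & -\Gamma(\chi_-, \chi_-) & 0 & 0 \\
q^2\,\Gamma(\chi_-, \chi_-) & 0 & 0 & 0 \\
0 & 0 & -\Gamma(\chi_0, \chi_0) & \Gamma(\chi_z, \chi_0) \\
0 & 0 & -\Gamma(\chi_0, \chi_z) & \Gamma(\chi_z, \chi_z)
\end{pmatrix}
\begin{pmatrix}\omega_- \\ \omega_+ \\ \omega_0 \\ \omega_z\end{pmatrix}.
\tag{3.34}
\]
It turns out that the square of the operator $\star$ is not necessarily diagonal. An explicit computation shows moreover that when $q \neq 1$, given the constraints (3.32)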
there is no choice for the contraction Γ, nor for the value of the scale parameter $m \in \mathbb{R}$ in the volume form, such that the spectrum of the operator $\star^2$ is constant on any vector space $\Omega^k_{\mathrm{inv}}(SU_q(2))$. This means that the operator $\star$ does not satisfy the classical expressions in (3.12). We choose a particular value for the parameter m by defining
\[
\det\Gamma := \frac{1}{\lambda_\mu}\,\Gamma(\omega_-\wedge\omega_+\wedge\omega_0\wedge\omega_z,\ \omega_-\wedge\omega_+\wedge\omega_0\wedge\omega_z), \qquad
\mathrm{sgn}(\Gamma) := \frac{\det\Gamma}{|\det\Gamma|}
\tag{3.35}
\]
and imposing
\[
\star^2(1) = \mathrm{sgn}(\Gamma),
\tag{3.36}
\]
which is clearly equivalent to the constraint
\[
m^2 = |\det\Gamma|^{-1}.
\tag{3.37}
\]
An explicit calculation shows that conditions (3.27) and (3.29) fix the quantum determinant (3.35) to be positive, so that we have $\mathrm{sgn}(\Gamma) = 1$.

We finally extend the operator $\star$ to the whole exterior algebra. This can be done in two ways, i.e. we define Hodge operators $\star^L, \star^R : \Omega^k(SU_q(2)) \to \Omega^{4-k}(SU_q(2))$ by:
\[
\star^L(x\,\omega) := x\,\star(\omega), \qquad \star^R(\omega\,x) := (\star\,\omega)\,x,
\tag{3.38}
\]
with $x \in A(SU_q(2))$ and $\omega \in \Omega_{\mathrm{inv}}(SU_q(2))$. Both operators will find their use later on.

3.3. Hodge operators on Ω(SU_q(2)): a complementary approach

The procedure used in the previous section cannot be extended ipso facto to introduce a Hodge operator on the exterior algebra $\Omega(S^2_q)$: although all $\Omega^k(S^2_q)$ are free left $A(S^2_q)$-modules [18], the tensor product $\Omega^{\otimes2}(S^2_q)$ has no braiding like the σ above. In order to construct a suitable Hodge operator on the quantum sphere, we shall export to this quantum homogeneous space the construction of [21], originally conceived on the exterior algebra over a quantum group. The strategy largely coincides with the one described in [37] and presents similarities to that used in [8], where a Hodge operator has been introduced on a quantum projective plane.

We start by briefly recalling the formulation from [21]. Consider a ∗-Hopf algebra H and the exterior algebra Ω(H) over an N-dimensional left covariant first order calculus $(\Omega^1(H), d)$, with $\dim\Omega^{N-k}(H) = \dim\Omega^k(H)$ and $\dim\Omega^N(H) = 1$. Suppose in addition that H has a Haar state $h : H \to \mathbb{C}$, i.e. a unital functional, which is invariant, i.e. $(\mathrm{id}\otimes h)\Delta x = (h\otimes\mathrm{id})\Delta x = h(x)1$ for any $x \in H$, and positive, i.e. $h(x^*x) \geq 0$ for all $x \in H$. A Haar state so defined is unique and automatically faithful: $h(x^*x) = 0$ implies $x = 0$. Upon fixing an inner product on a left invariant
basis of forms, the state h is then used to endow the whole exterior algebra with a left and a right inner product, when requiring left or right invariance,
\[
\langle x\,\omega, x'\omega'\rangle_L := h(x^*x')\,\langle\omega, \omega'\rangle, \qquad
\langle \omega\,x, \omega'x'\rangle_R := h(x^*x')\,\langle\omega, \omega'\rangle
\tag{3.39}
\]
for any $x, x' \in H$ and $\omega, \omega'$ in $\Omega_{\mathrm{inv}}(H)$. The spaces $\Omega^k(H)$ are taken to be pairwise orthogonal (this is stated by saying that the inner product is graded). The differential calculus is said to be non-degenerate if, whenever $\eta \in \Omega^k(H)$ and $\eta'\wedge\eta = 0$ for any $\eta' \in \Omega^{N-k}(H)$, then necessarily $\eta = 0$. Choose in $\Omega^N(H)$ a left invariant hermitian basis element $\mu = \mu^*$, referred to as the volume form of the calculus. For the sake of the present paper, we assume that the differential calculus has a volume form such that $\mu x = x\mu$ for any $x \in H$ (this condition is satisfied by the $4D_+$ bicovariant calculus on $SU_q(2)$ that we are considering). Then one defines an "integral"
\[
\int_\mu : \Omega(H) \to \mathbb{C}, \qquad \int_\mu x\,\mu = h(x) \quad\text{for } x \in H,
\]
and $\int_\mu \eta = 0$ for any k-form η with $k < N$. For a non-degenerate calculus the functional $\int_\mu$ is left-faithful, that is, if $\eta \in \Omega^k(H)$ is such that $\int_\mu \eta'\wedge\eta = 0$ for all $\eta' \in \Omega^{N-k}(H)$, then $\eta = 0$. The central result is [21]:

Proposition 3.1. Consider a left covariant, non-degenerate differential calculus on a ∗-Hopf algebra, whose corresponding exterior algebra is such that $\dim\Omega^{N-k}(H) = \dim\Omega^k(H)$ and $\dim\Omega^N(H) = 1$, with a left-invariant volume form $\mu = \mu^*$ satisfying $x\mu = \mu x$ for any $x \in H$. If Ω(H) is endowed with inner products and integrals as before, there exists a unique left H-linear bijective operator $L : \Omega^k(H) \to \Omega^{N-k}(H)$ for $k = 0, \dots, N$ (respectively, a unique right H-linear bijective operator R) such that
\[
\int_\mu \eta^*\wedge L(\eta') = \langle\eta, \eta'\rangle_L, \qquad
\int_\mu \eta^*\wedge R(\eta') = \langle\eta, \eta'\rangle_R
\tag{3.40}
\]
for any $\eta, \eta' \in \Omega^k(H)$.

We mention that there is no R operator in [21]. It is just to prove its right H-linearity that one needs the condition $x\mu = \mu x$ for the volume form µ with $x \in H$.

We are now ready to make contact with the previous Sec. 3.2. The $4D_+$ differential calculus on $SU_q(2)$ is easily seen to be non degenerate. On the other hand, the Haar state functional h is given by (cf. [20]):
\[
h(1) = 1; \qquad h((cc^*)^k) = \Big(\sum_{j=0}^{k} q^{2j}\Big)^{-1} = \frac{1}{1+q^2+\cdots+q^{2k}},
\tag{3.41}
\]
with $k \in \mathbb{N}$, all other generators mapping to zero.
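The values (3.41) of the Haar state are easy to tabulate. The following Python sketch (the function name `haar_cc_power` is a hypothetical helper, not notation from the text) evaluates h((cc*)^k) from the finite geometric sum and illustrates that for q = 1 it reduces to 1/(k + 1).

```python
def haar_cc_power(q, k):
    """Value h((c c*)^k) = 1 / (1 + q^2 + ... + q^(2k)) from (3.41)."""
    return 1.0 / sum(q ** (2 * j) for j in range(k + 1))

# For q != 1 the sum is geometric, so h((cc*)^k) = (1 - q^2) / (1 - q^(2k+2)).
q = 0.5
for k in range(4):
    closed_form = (1 - q**2) / (1 - q ** (2 * k + 2))
    print(k, haar_cc_power(q, k), closed_form)
```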
Now, use the sesquilinear map (3.26) for an inner product $\langle\omega, \omega'\rangle := \{\omega, \omega'\}$ on generators of $\Omega_{\mathrm{inv}}(SU_q(2))$ and extend it to a left invariant and a right invariant one on the whole of $\Omega(SU_q(2))$ as in (3.39) using the state h. The uniqueness of the operators L and R from Proposition 3.1 then implies that the extended left and right inner products are related to the left and right Hodge operators (3.38) by
\[
\int_\mu \eta^*\wedge(\star^L\eta') = \langle\eta, \eta'\rangle_L, \qquad
\int_\mu \eta^*\wedge(\star^R\eta') = \langle\eta, \eta'\rangle_R
\tag{3.42}
\]
for any $\eta, \eta' \in \Omega^k(H)$.

4. Hodge Operators on Ω(S²_q)

From the previous section, the procedure to introduce Hodge operators on the quantum sphere appears outlined. Inner products on $\Omega(SU_q(2))$ naturally induce inner products on $\Omega(S^2_q)$, and we shall explore the use of relations like (3.42) above to define a class of Hodge operators.

The exterior algebra $\Omega(S^2_q)$ over the quantum sphere $S^2_q$ is described in Sec. 2.3. In particular, we recall its description in terms of the $A(S^2_q)$-bimodules $L_n$ given in (2.10):
\[
\begin{aligned}
\Omega^0(S^2_q) &\simeq A(S^2_q) \simeq L_0, \\
\Omega^1(S^2_q) &\simeq L_{-2}\,\omega_- \oplus L_2\,\omega_+ \oplus L_0\,\omega_0 \simeq \omega_- L_{-2} \oplus \omega_+ L_2 \oplus \omega_0 L_0, \\
\Omega^2(S^2_q) &\simeq L_{-2}(\omega_-\wedge\omega_0) \oplus L_0(\omega_-\wedge\omega_+) \oplus L_2(\omega_0\wedge\omega_+) \\
&\simeq (\omega_-\wedge\omega_0)L_{-2} \oplus (\omega_-\wedge\omega_+)L_0 \oplus (\omega_0\wedge\omega_+)L_2, \\
\Omega^3(S^2_q) &\simeq L_0\,\omega_-\wedge\omega_+\wedge\omega_0 \simeq \omega_-\wedge\omega_+\wedge\omega_0\,L_0.
\end{aligned}
\tag{4.1}
\]
In the rest of this section, to be consistent with the notation introduced in Sec. 2.3, we shall consider elements $\phi, \psi \in L_{-2}$, elements $\phi', \psi' \in L_2$ and elements $\phi'', \psi'' \in L_0$.

Lemma 4.1. The above left covariant 3D calculus on $S^2_q$ is non-degenerate.

Proof. Given $\theta \in \Omega^k(S^2_q)$ the condition of non-degeneracy, namely $\theta'\wedge\theta = 0$ for any $\theta' \in \Omega^{3-k}(S^2_q)$ only if $\theta = 0$, is trivially satisfied for $k = 0, 3$. From (4.1) take the 1-form $\theta = \phi\,\omega_-$ and a 2-form $\theta' = \psi\,\omega_-\wedge\omega_0 + \psi'\omega_+\wedge\omega_0 + \psi''\omega_-\wedge\omega_+$. Using the commutation properties (2.36) between 1-forms and elements in $A(SU_q(2))$, one has $\theta'\wedge\theta = \{\psi'(K^2\triangleright\phi) - \psi''(q^{-1/2}KE\triangleright\phi)\}\,\omega_-\wedge\omega_+\wedge\omega_0$, so that the equation $\theta'\wedge\theta = 0$ for any $\theta' \in \Omega^2(S^2_q)$ is equivalent to the condition $\{\psi'(K^2\triangleright\phi) - \psi''(q^{-1/2}KE\triangleright\phi)\} = 0$ for any $\theta' \in \Omega^2(S^2_q)$; taking $\theta' = \psi'\omega_+\wedge\omega_0$, one shows that this condition is satisfied only if $\phi = 0$. A similar conclusion is reached with a 1-form $\theta = \phi'\omega_+$, and with a 1-form $\theta = \phi''\omega_0$.

Consider then a 2-form $\theta = \phi\,\omega_-\wedge\omega_0$, and a 1-form $\theta' = \psi\,\omega_- + \psi'\omega_+ + \psi''\omega_0$. Their product is $\theta'\wedge\theta = (\psi'\phi)\,\omega_+\wedge\omega_-\wedge\omega_0$, so that the condition $\theta'\wedge\theta = 0$, for
all $\theta' \in \Omega^1(S^2_q)$ is equivalent to the condition $\psi'\phi = 0$ for any $\psi'$; this condition is obviously satisfied only by $\phi = 0$. It is clear that a similar analysis can be performed for any 2-form $\theta \in \Omega^2(S^2_q)$.

The Haar state h of $A(SU_q(2))$ given in (3.41) yields a faithful and invariant state when restricted to $A(S^2_q)$. As a volume form we take $\check\mu = \check m\,\omega_-\wedge\omega_+\wedge\omega_0 = \check\mu^*$ with $\check m \in \mathbb{R}$. It commutes with every algebra element, $f\check\mu = \check\mu f$ for $f \in A(S^2_q)$, so the integral on the exterior algebra $\Omega(S^2_q)$ can be defined by
\[
\begin{aligned}
\int_{\check\mu}\theta &= 0, && \text{on } \theta \in \Omega^k(S^2_q), \text{ for } k = 0, 1, 2, \\
\int_{\check\mu} f\check\mu &= h(f), && \text{on } f\check\mu \in \Omega^3(S^2_q).
\end{aligned}
\tag{4.2}
\]

Lemma 4.2. The integral $\int_{\check\mu} : \Omega(S^2_q) \to \mathbb{C}$ defined by (4.2) is left-faithful.
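Proof. The left-faithfulness of the integral can be easily established from a direct analysis, using the faithfulness of the Haar state h.

The restriction to $\Omega(S^2_q)$ of the left and right $A(SU_q(2))$-linear graded inner products on $\Omega(SU_q(2))$ in (3.42) gives left and right $A(S^2_q)$-linear graded inner products on $\Omega(S^2_q)$:
\[
\langle\theta, \theta'\rangle^L_{S^2_q} := \langle\theta, \theta'\rangle_L; \qquad
\langle\theta, \theta'\rangle^R_{S^2_q} := \langle\theta, \theta'\rangle_R
\tag{4.3}
\]
with $\theta, \theta' \in \Omega(S^2_q)$. The analogue of relation (3.42) is given in the following

Proposition 4.3. On the exterior algebra on the sphere $S^2_q$ endowed with the above graded left (respectively, right) inner product, there exists a unique invertible left $A(S^2_q)$-linear Hodge operator $\check L : \Omega^k(S^2_q) \to \Omega^{3-k}(S^2_q)$ (respectively, a unique invertible right $A(S^2_q)$-linear Hodge operator $\check R$) for $k = 0, 1, 2, 3$, satisfying
\[
\int_{\check\mu}\theta^*\wedge\check L(\theta') = \langle\theta, \theta'\rangle^L_{S^2_q}, \qquad
\int_{\check\mu}\theta^*\wedge\check R(\theta') = \langle\theta, \theta'\rangle^R_{S^2_q}
\tag{4.4}
\]
for any $\theta, \theta' \in \Omega^k(S^2_q)$. They can be written in terms of the sesquilinear map (3.26) as:
\[
\begin{aligned}
\check L(1) &= \check\mu, & \check L(\check\mu) &= \{\check\mu, \check\mu\}, \\
\check L(\phi\,\omega_-) &= \check m\,\alpha\,\phi\,\omega_-\wedge\omega_0, & \check L(\phi\,\omega_-\wedge\omega_0) &= \check m\,\{\omega_-\wedge\omega_0, \omega_-\wedge\omega_0\}\,\phi\,\omega_-, \\
\check L(\phi'\omega_+) &= \check m\,q^2\alpha\,\phi'\omega_0\wedge\omega_+, & \check L(\phi'\omega_0\wedge\omega_+) &= \check m\,\{\omega_+\wedge\omega_0, \omega_+\wedge\omega_0\}\,\phi'\omega_+, \\
\check L(\omega_0) &= -\check m\,\nu\,\omega_-\wedge\omega_+, & \check L(\omega_-\wedge\omega_+) &= -\check m\,\{\omega_-\wedge\omega_+, \omega_-\wedge\omega_+\}\,\omega_0
\end{aligned}
\tag{4.5}
\]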
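and
\[
\begin{aligned}
\check R(1) &= \check\mu, & \check R(\check\mu) &= \{\check\mu, \check\mu\}, \\
\check R(\omega_-\phi) &= \check m\,q^2\alpha\,\omega_-\wedge\omega_0\,\phi, & \check R(\omega_-\wedge\omega_0\,\phi) &= \check m\,q^2\{\omega_-\wedge\omega_0, \omega_-\wedge\omega_0\}\,\omega_-\phi, \\
\check R(\omega_+\phi') &= \check m\,\alpha\,\omega_0\wedge\omega_+\phi', & \check R(\omega_0\wedge\omega_+\phi') &= \check m\,q^{-2}\{\omega_+\wedge\omega_0, \omega_+\wedge\omega_0\}\,\omega_+\phi', \\
\check R(\omega_0) &= -\check m\,\nu\,\omega_-\wedge\omega_+, & \check R(\omega_-\wedge\omega_+) &= -\check m\,\{\omega_-\wedge\omega_+, \omega_-\wedge\omega_+\}\,\omega_0.
\end{aligned}
\tag{4.6}
\]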
Proof. For the rather technical proof we refer to [37], where the same strategy has been adopted for the analysis of a Hodge operator on a two-dimensional exterior algebra on $S^2_q$. Here we only observe that the uniqueness follows from the result in Lemma 4.2. Given two operators $\check L, \check L' : \Omega^k(S^2_q) \to \Omega^{3-k}(S^2_q)$ satisfying (4.4) (or equivalently $\check R, \check R'$), their difference must satisfy the relation $\int_{\check\mu}\theta'^*\wedge(\check L(\theta) - \check L'(\theta)) = 0$ for any $\theta, \theta' \in \Omega^k(S^2_q)$. The left-faithfulness of the integral then allows one to get $\check L(\theta) = \check L'(\theta)$.

From (2.31) and (2.33) it is $\check\mu = \check m\,\chi_z$, so we define
\[
\det\check\Gamma := \frac{\Gamma(\chi_z, \chi_z)}{2(1+q^2+q^{-2})}, \qquad
\mathrm{sgn}(\check\Gamma) := \frac{\det\check\Gamma}{|\det\check\Gamma|}
\tag{4.7}
\]
and set
\[
\check m^2\,\det\check\Gamma := \mathrm{sgn}(\check\Gamma)
\]
as a definition for the scale factor $\check m \in \mathbb{R}$. Clearly this choice gives $\check L^2(1) = \check R^2(1) = \mathrm{sgn}(\check\Gamma)$. Analogously to what happened for $SU_q(2)$ before, the sign in (4.7) turns out to be positive for the class of contractions we are considering, i.e. $\mathrm{sgn}(\check\Gamma) = 1$. We conclude by noticing that the Hodge operators (4.5) and (4.6) are diagonal, but still there is no choice for the parameters (3.31) and (3.32) of a real and hermitian contraction map such that a relation like (3.12) is satisfied.

5. Laplacian Operators

Given the Hodge operators constructed in the previous sections, the corresponding Laplacian operators on the quantum group $SU_q(2)$,
\[
\begin{aligned}
\Box^L_{SU_q(2)} : A(SU_q(2)) \to A(SU_q(2)), &\qquad \Box^L_{SU_q(2)}(x) := -\star^L d \star^L dx, \\
\Box^R_{SU_q(2)} : A(SU_q(2)) \to A(SU_q(2)), &\qquad \Box^R_{SU_q(2)}(x) := -\star^R d \star^R dx,
\end{aligned}
\]
can be readily written in terms of the basic derivations (2.14) and (2.18) for the first order differential calculus as
\[
\Box^L_{SU_q(2)}\,x = \{\alpha(L_+L_- + q^2L_-L_+) + \nu L_0L_0 + \gamma L_zL_z + 2\epsilon L_0L_z\}\triangleright x,
\tag{5.1}
\]
and 2 R SUq (2) x = {α(q R+ R− + R− R+ ) + νR0 R0 + γRz Rz + 2R0 Rz } x,
(5.2)
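with parameters given in (3.31). From the decomposition (2.9) and the action (2.11) it is immediate to see that such Laplacians restrict to operators $\Box^{L,R}_{SU_q(2)} : \mathcal{L}_n \to \mathcal{L}_n$. In order to diagonalize them, we recall the decomposition (2.12). The action of each term of the Laplacians on the basis elements $\{\phi_{n,J,l}\}$ in (2.13) can be explicitly computed by (2.5), giving:
$$ \begin{aligned}
L_-L_+ \triangleright \phi_{n,J,l} &= q^{-1-n}\left[J - \tfrac{1}{2}n\right]\left[J + 1 - \tfrac{1}{2}n\right]\phi_{n,J,l}, \\
L_+L_- \triangleright \phi_{n,J,l} &= q^{1-n}\left[J + \tfrac{1}{2}n\right]\left[J + 1 - \tfrac{1}{2}n\right]\phi_{n,J,l}, \\
L_z \triangleright \phi_{n,J,l} &= -q^{-\frac{1}{2}n}\left[\tfrac{1}{2}n\right]\phi_{n,J,l}, \\
L_0 \triangleright \phi_{n,J,l} &= \left(\left[J + \tfrac{1}{2}\right]^{2} - \left[\tfrac{1}{2}\right]^{2}\right)\phi_{n,J,l} = [J][J+1]\,\phi_{n,J,l}.
\end{aligned} \tag{5.3} $$
Here for the labels one has $n \in \mathbb{N}$ with $J = \tfrac{|n|}{2} + \mathbb{Z}$ and $l = 0, \dots, 2J$.

The Laplacians on the quantum sphere are, with $f \in A(S^2_q)$:
$$ \Box^{L}_{S^2_q} f := -\check L\, d\, \check L\, d f = \{\alpha L_+L_- + q^{2}\alpha L_-L_+ + \nu L_0L_0\} \triangleright f, \tag{5.4} $$
and
$$ \Box^{R}_{S^2_q} f := -\check R\, d\, \check R\, d f = \{q^{2}\alpha R_+R_- + \alpha R_-R_+ + \nu R_0R_0\} \triangleright f. \tag{5.5} $$
They both are the restriction to $S^2_q$ of the Laplacian on $SU_q(2)$, the left and right one respectively. Their actions can be written in terms of the action of the Casimir element $C_q$ of $U_q(su(2))$, immediately giving their spectra. They coincide on $S^2_q$:
$$ \Box^{L,R}_{S^2_q} = 2q\alpha\left(C_q + \tfrac{1}{4} - \left[\tfrac{1}{2}\right]^{2}\right) + \nu\left(C_q + \tfrac{1}{4} - \left[\tfrac{1}{2}\right]^{2}\right)^{2} = 2q\alpha L_0 + \nu L_0^{2} \qquad \text{on } A(S^2_q). $$
Using (5.3), spectra are readily found:
$$ \Box^{L,R}_{S^2_q}(\phi_{0,J,l}) = \left(2q\alpha[J][J+1] + \nu[J]^{2}[J+1]^{2}\right)\phi_{0,J,l}, \tag{5.6} $$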
with J ∈ N, l = 0, . . . , 2J. We end this section by comparing these spectra to the spectrum of D2 , the square of the Dirac operator on S2q studied in [3]. Some straightforward computation leads to: 2 1 L,R 2 ⇔ 2qα = 1, ν = q −2 (q − q −1 )4 . (5.7) spec(S2 ) = spec D − q 2
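The spectrum (5.6) can be cross-checked numerically. The following is a minimal sketch, assuming the $n = 0$ eigenvalues read off from (5.3) and the operator in (5.4) with coefficients $\alpha$ and $q^{2}\alpha$; the function names and the sample values of $q$, $\alpha$, $\nu$ are illustrative choices, not taken from the paper.

```python
from fractions import Fraction

def qnum(x, q):
    """q-number [x] = (q^x - q^-x) / (q - q^-1) for integer x."""
    return (q**x - q**(-x)) / (q - 1 / q)

def laplacian_eigenvalue(J, q, alpha, nu):
    """Eigenvalue of alpha*L+L- + q^2*alpha*L-L+ + nu*L0*L0 on phi_{0,J,l}."""
    y = qnum(J, q) * qnum(J + 1, q)
    lp_lm = q * y        # L+ L- at n = 0, from (5.3)
    lm_lp = y / q        # L- L+ at n = 0, from (5.3)
    l0 = y               # L0, from (5.3)
    return alpha * lp_lm + q**2 * alpha * lm_lp + nu * l0**2

def closed_form(J, q, alpha, nu):
    """Right-hand side of (5.6)."""
    y = qnum(J, q) * qnum(J + 1, q)
    return 2 * q * alpha * y + nu * y**2

q, alpha, nu = Fraction(2, 3), Fraction(5, 7), Fraction(1, 4)
spectrum = [laplacian_eigenvalue(J, q, alpha, nu) for J in range(6)]
```

Exact rational arithmetic is used so that the comparison with the closed form is an equality rather than a floating-point approximation.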
July 6, J070-S0129055X11004370
598
2011 11:6 WSPC/S0129-055X
148-RMP
G. Landi & A. Zampini
6. A Digression: Connections on the Hopf Fibration Over $S^2_q$

A monopole connection for the quantum fibration $A(S^2_q) \to A(SU_q(2))$ on the standard Podleś sphere, with a left-covariant 3D calculus on $SU_q(2)$ and the (corresponding restriction to a) 2D left-covariant calculus on $S^2_q$, was explicitly described in [4]. A slightly different, but to a large extent equivalent [10], formulation of this and of a fibration constructed on the same topological data $A(S^2_q) \to A(SU_q(2))$, but with $SU_q(2)$ equipped with a bicovariant 4D calculus inducing on $S^2_q$ a left-covariant 3D calculus, are presented in [9]. The general problem of finding the conditions between the differential calculi on a base space algebra and on a "structure" group, in a way giving a principal bundle structure with compatible calculi and a consistent definition of connections on it, has been deeply studied [5, 27, 11, 14].

The slightly different perspective of this digression is to follow the path reviewed in Appendix B, namely, to recall from [3] the formulation of a Hopf bundle on the standard Podleś sphere starting from the 4D bicovariant calculus à la Woronowicz on the total space $SU_q(2)$, in order to fully describe the set of its connections. The first step in this analysis consists in describing how the differential calculus on $SU_q(2)$ naturally induces a 1-dimensional bicovariant calculus on the structure group $U(1)$, and in which sense these two calculi are compatible.

6.1. A 1D bicovariant calculus on U(1)

The Hopf projection (2.7) allows one to define an ideal $Q_{U(1)} \subset \ker\varepsilon_{U(1)}$ as the projection $Q_{U(1)} = \pi(Q_{SU_q(2)})$. Then $Q_{U(1)}$ is generated by the three elements
$$ \begin{aligned}
\xi_1 &= (z^{2} - 1) + q^{2}(z^{-2} - 1), \\
\xi_2 &= (q^{2}z + z^{-1} - (q^{3} + q^{-1}))(q^{2}z + z^{-1} - (1 + q^{2})), \\
\xi_3 &= (q^{2}z + z^{-1} - (q^{-1} + q^{3}))(z^{-1} - z),
\end{aligned} $$
and, since $\mathrm{Ad}(Q_{U(1)}) \subset Q_{U(1)} \otimes A(U(1))$, it corresponds to a bicovariant differential calculus on $U(1)$.
The identity
$$ -q(1 + q^{4})^{-1}(1 + q^{2} + q^{3} + q^{5})^{-1}\{(q^{6} - 1)\xi_3 + (1 + q^{4})\xi_2 - q^{2}(1 + q^{2})\xi_1\} = (z - 1) + q(z^{-1} - 1) $$
shows that $\xi = (z - 1) + q(z^{-1} - 1)$ is in $Q_{U(1)}$. By induction one also sees that
$$ \begin{aligned}
j > 0:&\quad z^{j}(z - 1) = \xi \sum_{n=0}^{j-1} q^{n} z^{j-n} + q^{j}(z - 1), \\
j < 0:&\quad z^{-|j|}(z - 1) = -\xi \sum_{n=1}^{|j|} q^{-n} z^{n-|j|} + q^{-|j|}(z - 1).
\end{aligned} \tag{6.1} $$
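Both the identity expressing $\xi$ through $\xi_1, \xi_2, \xi_3$ and the decomposition of $z^{j}(z-1)$ into a multiple of $\xi$ plus $q^{j}(z-1)$ can be spot-checked with exact arithmetic. The sketch below is illustrative only: Laurent polynomials in $z$ are modelled as dictionaries from exponents to rational coefficients, $q$ is fixed to a sample rational value, and for $j < 0$ the sum is taken over $n = 1, \dots, |j|$. All helper names are hypothetical.

```python
from fractions import Fraction

# A Laurent polynomial in z is stored as {exponent: coefficient}.
def add(p, r, scale=1):
    out = dict(p)
    for e, c in r.items():
        out[e] = out.get(e, 0) + scale * c
    return {e: c for e, c in out.items() if c != 0}

def mul(p, r):
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in r.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def scal(c, p):
    return {e: c * v for e, v in p.items() if c * v != 0}

def generators(q):
    """Return (xi1, xi2, xi3, xi) as Laurent polynomials in z."""
    A = {1: q**2, -1: Fraction(1)}                 # q^2 z + z^-1
    a = q**3 + 1 / q
    xi1 = {2: Fraction(1), -2: q**2, 0: -1 - q**2}
    xi2 = mul(add(A, {0: -a}), add(A, {0: -(1 + q**2)}))
    xi3 = mul(add(A, {0: -a}), {-1: Fraction(1), 1: Fraction(-1)})
    xi = {1: Fraction(1), -1: q, 0: -1 - q}         # (z - 1) + q (z^-1 - 1)
    return xi1, xi2, xi3, xi

def identity_holds(q):
    xi1, xi2, xi3, xi = generators(q)
    combo = add(add(scal(q**6 - 1, xi3), scal(1 + q**4, xi2)),
                scal(-q**2 * (1 + q**2), xi1))
    pref = -q / ((1 + q**4) * (1 + q**2 + q**3 + q**5))
    return scal(pref, combo) == xi

def decomposition_holds(q, j):
    """Check z^j (z - 1) = (multiple of xi) + q^j (z - 1)."""
    xi = generators(q)[3]
    zm1 = {1: Fraction(1), 0: Fraction(-1)}
    lhs = mul({j: Fraction(1)}, zm1)
    if j >= 0:
        s = {j - n: q**n for n in range(j)}             # n = 0, ..., j-1
        rhs = add(mul(xi, s), scal(q**j, zm1))
    else:
        m = -j
        s = {n - m: q**(-n) for n in range(1, m + 1)}   # n = 1, ..., |j|
        rhs = add(scal(-1, mul(xi, s)), scal(q**(-m), zm1))
    return lhs == rhs
```

Since $\xi$ lies in $Q_{U(1)}$, the second check makes explicit why the class of $z^{j}(z-1)$ in the quotient is $q^{j}$ times the class of $z - 1$.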
From these relations it is immediate to prove (as in [3]) that there is a complex vector space isomorphism $\ker\varepsilon_{U(1)}/Q_{U(1)} \simeq \mathbb{C}$. The differential calculus induced by $Q_{U(1)}$ is 1-dimensional, and the projection $\pi_{Q_{U(1)}} : \ker\varepsilon_{U(1)} \to \ker\varepsilon_{U(1)}/Q_{U(1)}$ can be written as
$$ \pi_{Q_{U(1)}} : z^{j}(z - 1) \mapsto q^{j}[z - 1], \tag{6.2} $$
on the vector space basis $\varphi(j) = z^{j}(z - 1)$ in $\ker\varepsilon_{U(1)}$, with notation $[z - 1] \in \ker\varepsilon_{U(1)}/Q_{U(1)}$. The projection (6.2) will be used later on to define connection 1-forms on the fibration. As a basis element for the quantum tangent space $X_{Q_{U(1)}}$ we take
$$ X = L_z = \frac{K^{-2} - 1}{q - q^{-1}}. \tag{6.3} $$
The ∗-Hopf algebras $A(U(1))$ and $U(1) \simeq \{K, K^{-1}\}$ are dually paired via the pairing, induced by the one in (2.4) between $A(SU_q(2))$ and $U_q(su(2))$, with
$$ \langle K^{\pm 1}, z\rangle = q^{\mp\frac{1}{2}}, \qquad \langle K^{\pm 1}, z^{-1}\rangle = q^{\pm\frac{1}{2}}, \tag{6.4} $$
on the generators. Thus, the exterior derivative $d : A(U(1)) \to \Omega^{1}(U(1))$ can be written, for any $u \in A(U(1))$, as $du = (X \triangleright u)\,\theta$ on the left invariant basis 1-form $\theta \sim [z - 1]$. On the generators of the coordinate algebra one has
$$ dz = \frac{q - 1}{q - q^{-1}}\, z\,\theta, \qquad dz^{-1} = \frac{q^{-1} - 1}{q - q^{-1}}\, z^{-1}\theta, \tag{6.5} $$
so that $\theta = (q - q^{-1})(q - 1)^{-1} z^{-1} dz$. From the coproduct $\Delta X = 1 \otimes X + X \otimes K^{-2}$ the $A(U(1))$-bimodule structure in $\Omega^{1}(U(1))$ is $\theta z^{\pm 1} = q^{\pm 1} z^{\pm 1}\theta$.

6.2. Connections on the principal bundle

The compatibility, as described in Appendix B and expressed by the exactness of the sequence (B.4), of the differential calculus on $U(1)$ presented above with the 4D differential calculus on $SU_q(2)$ presented in Sec. 2.2 has been proved in [3]. As a consequence, collecting the various terms, the data
$$ (A(SU_q(2)),\, A(S^2_q),\, A(U(1));\; N_{SU_q(2)} = r^{-1}(SU_q(2) \otimes Q_{SU_q(2)}),\, Q_{U(1)}) $$
is a quantum principal bundle with the described calculi. In order to obtain connections on this bundle, that is maps (B.7) splitting the sequence (B.4), we need to compute the action of the map
$$ \sim_{N_{SU_q(2)}} : \Omega^{1}(SU_q(2)) \to A(SU_q(2)) \otimes (\ker\varepsilon_{U(1)}/Q_{U(1)}) $$
defined via the diagram (B.3). Since it is left $A(SU_q(2))$-linear, we take as representative universal 1-forms corresponding to the left invariant 1-forms (2.20) in
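$\Omega^{1}(SU_q(2))$:
$$ \begin{aligned}
\pi^{-1}_{N_{SU_q(2)}}(\omega_+) &= (a\,\delta c - q\,c\,\delta a), \\
\pi^{-1}_{N_{SU_q(2)}}(\omega_-) &= (c^{*}\delta a^{*} - q\,a^{*}\delta c^{*}), \\
\pi^{-1}_{N_{SU_q(2)}}(\omega_0) &= \{a^{*}\delta a + c^{*}\delta c + q(a\,\delta a^{*} + q^{2}c\,\delta c^{*})\}/(q + 1)\lambda_1, \\
\pi^{-1}_{N_{SU_q(2)}}(\omega_z) &= a^{*}\delta a + c^{*}\delta c - (a\,\delta a^{*} + q^{2}c\,\delta c^{*}).
\end{aligned} $$
On them the action of the canonical map (B.2) is found to be:
$$ \begin{aligned}
\chi(a\,\delta c - q\,c\,\delta a) &= (ac - q\,ca) \otimes (z - 1) = 0, \\
\chi(c^{*}\delta a^{*} - q\,a^{*}\delta c^{*}) &= (c^{*}a^{*} - q\,a^{*}c^{*}) \otimes (z^{*} - 1) = 0, \\
\chi((1 + q)^{-1}\lambda_1^{-1}\{a^{*}\delta a + c^{*}\delta c + q(a\,\delta a^{*} + q^{2}c\,\delta c^{*})\}) &= 1 \otimes \{(z - 1) + q(z^{-1} - 1)\} = 1 \otimes \xi, \\
\chi(a^{*}\delta a + c^{*}\delta c - (a\,\delta a^{*} + q^{2}c\,\delta c^{*})) &= 1 \otimes (z - z^{-1}),
\end{aligned} $$
with $\xi \in Q_{U(1)}$ introduced in Sec. 6.1. From the isomorphism (6.2) one finally has:
$$ \sim_{N_{SU_q(2)}}(\omega_\pm) = \sim_{N_{SU_q(2)}}(\omega_0) = 0, \qquad \sim_{N_{SU_q(2)}}(\omega_z) = 1 \otimes (1 + q^{-1})[z - 1]. \tag{6.6} $$
From these one recovers $\Omega^{1}_{hor}(SU_q(2)) = \ker \sim_{N_{SU_q(2)}}$ with, using (2.36),
$$ \ker \sim_{N_{SU_q(2)}} \;\simeq\; A(SU_q(2))\{\omega_\pm, \omega_0\} \;\simeq\; \{\omega_\pm, \omega_0\}A(SU_q(2)). \tag{6.7} $$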
Remark 6.1. From (6.6), for the generator $X = L_z$ in (6.3) one gets that
$$ \tilde X(\omega_z) = \langle X, \sim_{N_{SU_q(2)}}(\omega_z)\rangle = 1, $$
which identifies $L_z \in X_Q$ as a vertical vector for the fibration. In turn it is used to extend the notion of horizontality to higher order forms in $\Omega(SU_q(2))$. One defines [20] a contraction operator $i_{L_z} : \Omega^{k}(SU_q(2)) \to \Omega^{k-1}(SU_q(2))$, giving $i_{L_z}(\omega_\pm) = i_{L_z}(\omega_0) = 0$ and $i_{L_z}(\omega_z) = 1$ on 1-forms, so that $\ker i_{L_z} \simeq \Omega^{1}_{hor}(SU_q(2))$. Then one defines
$$ \Omega^{k}_{hor}(SU_q(2)) := \ker\big|_{\Omega^{k}(SU_q(2))}\, i_{L_z}, \tag{6.8} $$
that is the kernel of the contraction map when restricted to the bimodule of $k$-forms.

Given the explicit expression (6.6) for the canonical map compatible with the differential calculi we are using, and the $A(U(1))$-coaction
$$ \delta_R^{(1)}\omega_z = \omega_z \otimes 1, \qquad \delta_R^{(1)}\omega_0 = \omega_0 \otimes 1, \qquad \delta_R^{(1)}\omega_\pm = \omega_\pm \otimes z^{\pm 2}, \tag{6.9} $$
using the vector space basis ϕ(j) in ker εU(1) of Sec. 6.1, a connection (B.7) is given by σ ˜ (φ ⊗ [ϕ(j)]) = q −2j (1 + q −1 )−1 φ(ωz + a)
(6.10)
for any φ ∈ A(SUq (2)) and any element a ∈ Ω1 (S2q ). On vertical forms, the projection Π associated to this connection turns out to be Π(ω± ) = 0 = Π(ω0 ), Π(ωz ) = σ ˜ (∼NSUq (2) (ωz )) = σ ˜ (1 ⊗ [ϕ(0)]) = ωz + a,
(6.11)
while the corresponding connection 1-form $\omega : A(U(1)) \to \Omega^{1}(SU_q(2))$ is given by
$$ \omega(z^{n}) = \tilde\sigma(1 \otimes [z^{n} - 1]) = q^{n/2}\left[\tfrac{n}{2}\right](\omega_z + a). \tag{6.12} $$
Connections corresponding to $a = s\,\omega_0$ with $s \in \mathbb{R}$ were already considered in [9]. The vertical projector (6.11) allows one to define a covariant derivative $D : A(SU_q(2)) \to \Omega^{1}_{hor}(SU_q(2))$, given (as usual) as the horizontal projection of the exterior derivative:
$$ D\phi := (1 - \Pi)\,d\phi. \tag{6.13} $$
Covariance here clearly refers to the right coaction of the structure group $U(1)$ of the bundle, since $\delta_R\phi = \phi \otimes z^{-n} \Leftrightarrow \delta_R^{(1)}(D\phi) = (D\phi) \otimes z^{-n}$. From (B.9) the action of this operator can be written as
$$ D\phi = d\phi - \phi \wedge \omega(z^{-n}) \tag{6.14} $$
for any $\phi \in \mathcal{L}_n$. From the bimodule structure (2.36) it is easy to check that all the above connections are strong connections in the sense of [13].

The analysis in this section allows us to prove the results in Proposition 2.1. The exterior algebra $\Omega(S^2_q)$ is defined to be the set of horizontal and $U(1)$-coinvariant elements in $\Omega(SU_q(2))$, with respect to the extension $\delta_R^{(k)}$ (introduced in Appendix B) of the canonical coaction (2.8) to higher order forms in $\Omega(SU_q(2))$. It is then easy to check, from (6.8) and (6.9), that the isomorphisms given in expressions (2.39) for $\Omega(S^2_q)$ do hold.

7. Gauged Laplacians on Line Bundles

Each $A(S^2_q)$-bimodule $\mathcal{L}_n$ defined in (2.10) is a bimodule of co-equivariant elements in $A(SU_q(2))$ for the right $U(1)$-coaction (2.8), and as such can be thought of as a module of "sections of a line bundle" over the quantum sphere $S^2_q$. Without requiring any compatibility with additional structures, any $\mathcal{L}_n$ can be realized both as a projective right or left $A(S^2_q)$-module (of rank 1 and winding number $-n$). One such structure is that of a connection on the quantum principal bundle $A(S^2_q) \to A(SU_q(2))$. Transporting the covariant derivative (6.13) on the principal bundle
to a derivative on sections forces to break the symmetry between the left or the right A(S2q )-module realization of Ln . With the choice in Sec. 2 for the principal bundle, we need an isomorphism Ln Fn with Fn a projective left A(S2q )-module [15]. This isomorphism is constructed in terms of a projection operator p(n) . Given this identification, in Sec. 7.1 we shall describe the complete equivalence between covariant derivatives on Fn (associated to the 3D left covariant differential calculus over S2q ) and connections (as described in Sec. 6) on the principal bundle A(S2q ) → A(SUq (2)), corresponding to compatible 4D+ bicovariant calculus over SUq (2) and 3D left covariant calculus over S2q . We shall then move to a family of gauged Laplacian operators on Fn , obtained by coupling the Laplacian operator over the quantum sphere with a set of suitable gauge potentials. We finally show that among them there is one whose action extends to Ln the action of the Laplacian (5.5) on L0 S2q . As we noticed in Sec. 5, the action of the (right) Laplacian (5.5) on S2q is given by the restriction of the action (5.2) of the (right) Laplacian R SUq (2) . Here we obtain that the action of such gauged Laplacian can be written in terms of the ungauged (right) Laplacian on SUq (2), in parallel to what happens on a classical principal bundle (see, e.g., [2, Proposition 5.6]) and on the Hopf fibration of the sphere S2q with calculi coming from the left covariant one on SUq (2) as shown in [22, 37]. 7.1. Line bundles as projective left A(S2q )-modules Every (equivariant, the only ones we use in this paper) finitely generated projective (left or right) A(S2q )-module is a direct sum of Ln ’s (cf. [30]). As said, these are line bundles of degree −n on the quantum sphere and to describe them all one needs is a collection of idempotents p(n) , which we are going to introduce. 
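With $n \in \mathbb{Z}$, we consider the projective left $A(S^2_q)$-module $F_n = (A(S^2_q))^{|n|+1}p^{(n)}$, with projections [6, 15] (cf. also [22])
$$ p^{(n)} = |\Psi^{(n)}\rangle\langle\Psi^{(n)}|, \tag{7.1} $$
written in terms of elements $|\Psi^{(n)}\rangle \in A(SU_q(2))^{|n|+1}$ and their duals $\langle\Psi^{(n)}|$ as follows. One has:
$$ n \le 0:\quad |\Psi^{(n)}\rangle_\mu = \sqrt{\alpha_{n,\mu}}\; c^{|n|-\mu}a^{\mu} \in \mathcal{L}_n, \qquad \alpha_{n,\mu} = \prod_{j=0}^{|n|-\mu-1}\frac{1 - q^{2(|n|-j)}}{1 - q^{2(j+1)}}, \quad \mu = 0, \dots, |n| - 1, \tag{7.2} $$
where $\alpha_{n,|n|} = 1$;
$$ n \ge 0:\quad |\Psi^{(n)}\rangle_\mu = \sqrt{\beta_{n,\mu}}\; c^{*\mu}a^{*\,n-\mu} \in \mathcal{L}_n, \qquad \beta_{n,\mu} = q^{2\mu}\prod_{j=0}^{\mu-1}\frac{1 - q^{-2(n-j)}}{1 - q^{-2(j+1)}}, \quad \mu = 1, \dots, n, \tag{7.3} $$
where $\beta_{n,0} = 1$.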
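The coefficients in (7.2) and (7.3) are finite products that can be evaluated directly. The sketch below computes them exactly for a sample rational $q$ and compares them with Gaussian binomial coefficients in $q^{2}$ and $q^{-2}$; that comparison is an observation added here for illustration (it is not stated in the text), and all function names are hypothetical.

```python
from fractions import Fraction

def alpha(n, mu, q):
    """alpha_{n,mu} of (7.2), for n <= 0 and 0 <= mu <= |n|."""
    N = abs(n)
    out = Fraction(1)
    for j in range(N - mu):          # empty product when mu = |n|
        out *= (1 - q**(2 * (N - j))) / (1 - q**(2 * (j + 1)))
    return out

def beta(n, mu, q):
    """beta_{n,mu} of (7.3), for n >= 0 and 0 <= mu <= n."""
    out = q**(2 * mu)
    for j in range(mu):              # empty product when mu = 0
        out *= (1 - q**(-2 * (n - j))) / (1 - q**(-2 * (j + 1)))
    return out

def gauss_binomial(N, k, Q):
    """Gaussian binomial coefficient, computed with the q-Pascal rule."""
    if k < 0 or k > N:
        return Fraction(0)
    if k == 0 or k == N:
        return Fraction(1)
    return gauss_binomial(N - 1, k - 1, Q) + Q**k * gauss_binomial(N - 1, k, Q)

q = Fraction(2, 3)
alphas = [alpha(-3, mu, q) for mu in range(4)]
betas = [beta(3, mu, q) for mu in range(4)]
```

With these conventions `alpha(-N, mu, q)` agrees with the Gaussian binomial of `(N, N - mu)` at `q**2`, and `beta(N, mu, q)` with `q**(2*mu)` times the Gaussian binomial of `(N, mu)` at `q**(-2)`.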
The coefficients are chosen so that Ψ(n) , Ψ(n) = 1, as a consequence (p(n) )2 = p(n) . Also by construction it holds that (p(n) )† = p(n) . The isomorphism Ln Fn = (A(S2q ))|n|+1 p(n) is realized as follows:
Ln − → Fn ,
φ → σφ | = φΨ(n) |,
(7.4)
with inverse
Fn − → Ln ,
σφ | → φ = σφ , Ψ(n) .
Given the exterior algebra (Ω(S2q ), d) on the quantum sphere we are considering, a covariant derivative on the left A(S2q )-modules Fn is a C-linear map ∇ : Ωk (S2q ) ⊗A(S2q ) Fn → Ωk+1 (S2q ) ⊗A(S2q ) Fn
(7.5)
that satisfies the left Leibniz rule ∇(ξ ∧ σ|) = (dξ) ∧ σ| + (−1)m ξ ∧ ∇σ| for any ξ ∈ Ωm (S2q ) and σ| ∈ Ωk (S2q ) ⊗A(S2q ) Fn . The curvature associated to a covariant derivative is ∇2 : Fn → Ω2 (S2q ) ⊗A(S2q ) Fn , that is ∇2 (ξ σ|) = ξ ∇2 (σ|) = ξ F∇ (σ|) with the last equality defining the curvature 2-form F∇ ∈ HomA(S2q ) (Fn , Ω2 (S2q ) ⊗A(S2q ) Fn ). Any covariant derivative — an element in C(Fn ) — and its curvature can be written as ∇σ| = (dσ|)p(n) + (−1)k σ|A(n) , ∇2 σ| = σ|{−dp(n) ∧ dp(n) + dA(n) − A(n) ∧ A(n) }p(n) .
(7.6)
with σ| ∈ Ωk (S2q ) ⊗A(S2q ) Fn . The negative signs in the second expression above come from the left Leibniz rule, since form valued sections are elements of projective left A(S2q )-modules. For the “gauge potential” A(n) one has A(n) = p(n) A(n) = A(n) p(n) = |Ψ(n) a(n) Ψ(n) | ∈ HomA(S2q ) (Fn , Ω1 (S2q ) ⊗A(S2q ) Fn ), (7.7) with a(n) ∈ Ω1 (S2q ). The monopole (Grassmann) connection corresponds to a(n) = 0. In analogy with the identification (7.4), the covariant derivative ∇ naturally induces an operator D : Ln → Ln ⊗A(S2q ) Ω1 (S2q ) that can be written as Dφ := (∇σφ |)|Ψ(n) = dφ − φ {Ψ(n) , dΨ(n) − a(n) }.
(7.8)
We refer to the 1-form Ω1 (SUq (2)) (n) = (Ψ(n) , dΨ(n) − a(n) )
(7.9)
as the connection 1-form of the gauge potential. It allows to express the curvature as F∇ = −|Ψ(n) (d(n) + (n) ∧ (n) )Ψ(n) | where (d(n) + (n) ∧ (n) ) ∈ Ω2 (S2q ).
(7.10)
The covariant derivatives defined above on the left modules Fn fit in the general theory of connections on the quantum Hopf bundle as described in the Sec. 6.2: any covariant vertical projector, as in (6.11), induces a gauge potential A(n) as in (7.7). The notion (7.9) of connection 1-form of a given gauge potential in C(Fn ) matches the notion (6.12) of connection 1-form ω : A(U(1)) → Ω1 (SUq (2)) on the Hopf bundle. From the A(S2q )-bimodule isomorphism ⊕n∈Z Ln ⊗A(S2q ) Ω1 (S2q ) Ω1hor (SUq (2)) (see Remark 6.1), this matching amounts to equate the actions of the covariant derivative operators (7.8) and (6.13), ∀ φ ∈ Ln : Dφ = Dφ ⇔ (n) = ω(z −n ).
(7.11)
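From formula (6.12), this correspondence can be written as $a^{(n)} = \lambda_n\omega_0 - \xi_{-n}a$, where the coefficients refer to the eigenvalue equations:
$$ L_z \triangleright |\Psi^{(n)}\rangle := \xi_n|\Psi^{(n)}\rangle \;\Rightarrow\; \xi_n = -q^{-\frac{n}{2}}\left[\tfrac{n}{2}\right], \tag{7.12} $$
$$ L_0 \triangleright |\Psi^{(n)}\rangle := \lambda_n|\Psi^{(n)}\rangle \;\Rightarrow\; \lambda_n = \left[\tfrac{|n|}{2}\right]\left[\tfrac{|n|}{2} + 1\right]. \tag{7.13} $$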
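The eigenvalues (7.12) and (7.13) involve half-integer q-numbers. A minimal sketch for evaluating them exactly is given below, assuming $q = t^{2}$ with $t$ rational so that $q^{n/2} = t^{n}$. It also checks two consistency relations suggested by the text: that the coefficient multiplying $\omega_z + a$ in $\omega(z^{-n})$, obtained from the projection (6.2) together with (6.10) at $j = 0$, equals $\xi_n$, and that $[J + \tfrac12]^{2} - [\tfrac12]^{2} = [J][J+1]$. The names `project` and `omega_coefficient` are illustrative.

```python
from fractions import Fraction

def qnum(two_x, t):
    """q-number [x] for x = two_x / 2, with q = t**2 (so q^x = t**two_x)."""
    return (t**two_x - t**(-two_x)) / (t**2 - t**(-2))

def xi_n(n, t):
    """xi_n = -q^{-n/2} [n/2], as in (7.12)."""
    return -(t**(-n)) * qnum(n, t)

def lambda_n(n, t):
    """lambda_n = [|n|/2] [|n|/2 + 1], as in (7.13)."""
    m = abs(n)
    return qnum(m, t) * qnum(m + 2, t)

def project(m, q):
    """Coefficient of [z - 1] in the class of z^m - 1, using (6.2)."""
    if m >= 0:
        return sum(q**j for j in range(m))        # z^m - 1 = sum_j z^j (z - 1)
    return -sum(q**j for j in range(m, 0))

def omega_coefficient(m, t):
    """Coefficient of (omega_z + a) in omega(z^m), using (6.10) at j = 0."""
    q = t**2
    return project(m, q) / (1 + 1 / q)

t = Fraction(3, 2)
table = [(n, xi_n(n, t), lambda_n(n, t)) for n in range(-3, 4)]
```

For example, `table` lists $(n, \xi_n, \lambda_n)$ for $-3 \le n \le 3$ at $q = 9/4$.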
Finally, the equivalence (7.11) allows one to introduce a covariant derivative D : Ωkhor (SUq (2)) → Ωk+1 hor (SUq (2)), thus extending to horizontal forms on the total space of the quantum Hopf bundle the covariant derivative operator on A(SUq (2)) as given in (6.13). This follows the formulation described in [13], since any connection on the principal bundle is strong. Upon defining k −n }, L(k) n := {φ ∈ Ωhor (SUq (2)) : δR φ = φ ⊗ z (k)
(k)
where δR is the natural right U(1)-coaction on Ωk (SUq (2)), one obtains: Dφ = dφ − (−1)k φ ∧ ω(z −n ).
(7.14)
A further extension to the whole exterior algebra $\Omega(SU_q(2))$ is proposed in [9]: a generalization of the analysis in [37, §9] shows how this extension is far from being unique.

We restrict our analysis again to covariant derivatives $\nabla_s$ in (7.6) whose gauge potential and corresponding connection 1-form are of the form:
$$ A^{(n)}_s = s\,|\Psi^{(n)}\rangle\,\omega_0\,\langle\Psi^{(n)}|, \qquad \varpi^{(n)}_s = \xi_n\omega_z + (\lambda_n - s)\omega_0, \tag{7.15} $$
for $s \in \mathbb{R}$ and coefficients as in (7.13), since they reduce in the classical limit to the monopole connection on line bundles associated to the classical Hopf bundle $\pi : S^{3} \to S^{2}$. Relations (2.26) and (2.27) allow one to compute the curvature 2-form (7.10) as
$$ \begin{aligned}
d\varpi^{(n)}_s &= \big((q + q^{-1})\xi_n - (s - \lambda_n)(q - q^{-1})\big)\,\omega_+ \wedge \omega_-, \\
\varpi^{(n)}_s \wedge \varpi^{(n)}_s &= (q - q^{-1})\,\xi_n\big((q + q^{-1})\xi_n + (q - q^{-1})(s - \lambda_n)\big)\,\omega_+ \wedge \omega_-.
\end{aligned} $$
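7.2. Gauged Laplacians

In order to introduce a Hodge operator
$$ \star^{R} : \Omega^{k}(S^2_q) \otimes_{A(S^2_q)} F_n \to \Omega^{3-k}(S^2_q) \otimes_{A(S^2_q)} F_n, \tag{7.16} $$
we use the right $A(S^2_q)$-linear Hodge operator (4.6) on $\Omega(S^2_q)$:
$$ \star^{R}(\xi\,\langle\sigma|) := (\check R\,\xi)\,\langle\sigma| \tag{7.17} $$
so that a gauged Laplacian operator is defined as
$$ \Box^{R}_{\nabla} : F_n \to F_n, \qquad \Box^{R}_{\nabla}\langle\sigma| := -\star^{R}\nabla(\star^{R}\nabla\langle\sigma|). $$
Equivalently we have an operator on $\mathcal{L}_n \simeq F_n$ via the left $A(S^2_q)$-modules isomorphism (7.4). With $\phi = \langle\sigma, \Psi^{(n)}\rangle$, it holds that
$$ \Box^{R}_{\nabla} : \mathcal{L}_n \to \mathcal{L}_n, \qquad \Box^{R}_{\nabla}\phi = (\Box^{R}_{\nabla}\langle\sigma|)\,|\Psi^{(n)}\rangle. \tag{7.18} $$
With the family of connections (7.15) and using the identities
$$ \begin{aligned}
(R_\pm \triangleright \langle\sigma|)\,|\Psi^{(n)}\rangle &= q^{-n}\,R_\pm \triangleright \phi, \\
(R_0 \triangleright \langle\sigma|)\,|\Psi^{(n)}\rangle &= q^{-n}\left(R_0 - \left[\tfrac{|n|}{2}\right]\left[1 - \tfrac{|n|}{2}\right]\right) \triangleright \phi
\end{aligned} \tag{7.19} $$
one readily computes:
$$ \Box^{R}_{\nabla_s}\phi = q^{-2n}\left\{\alpha(q^{2}R_+R_- + R_-R_+) + \nu\left(R_0 + s\,q^{-n} - \left[\tfrac{|n|}{2}\right]\left[1 - \tfrac{|n|}{2}\right]\right)^{2}\right\} \triangleright \phi. \tag{7.20} $$
Finally, fixing the parameter to be
$$ s^{(n)} = q^{n}\left[\tfrac{|n|}{2}\right]\left[1 - \tfrac{|n|}{2}\right], \tag{7.21} $$
the action of the gauged Laplacians extends, apart from a multiplicative factor depending on the label $n$, to elements in the line bundles $\mathcal{L}_n$ the action of the Laplacian operator (5.5) on the quantum sphere, that is,
$$ q^{2n}(\Box^{R}_{\nabla_s}\phi) = \{\alpha(q^{2}R_+R_- + R_-R_+) + \nu R_0^{2}\} \triangleright \phi. \tag{7.22} $$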
From (2.18), the above action can be written on φ ∈ Ln as the left action (2.17) of a polynomial in Uq (su(2)). We get (q + q −1 )(K − K −1 )2 −2 K (q − q −1 )2 2 2 2 1 1 1 1 −2 K + ν Cq − = 2qα Cq − + + K −4 2 4 2 4
2 2 −4 R − qα ∇s = (−2qαR0 K + νR0 )K
− qα
(q + q −1 )(K − K −1 )2 −2 K , (q − q −1 )2
(7.23)
having used (2.15). This relation is the counterpart of what happens on a classical principal bundle (see, e.g., [2, Proposition 5.6]) and on the Hopf fibration of the sphere S2q with calculi coming from the left covariant one on SUq(2), as shown in [22, 37].

Acknowledgments

We are grateful to S. Albeverio, L. S. Cirio and I. Heckenberger for comments and suggestions. AZ thanks P. Lucignano for his help with Maple. GL was partially supported by the Italian Project "Cofin08–Noncommutative Geometry, Quantum Groups and Applications". AZ gratefully acknowledges the support of the Max-Planck-Institut für Mathematik in Bonn, the Hausdorff Zentrum für Mathematik der Universität Bonn, the Stiftelsen Blanceflor Boncompagni-Ludovisi född Bildt (Stockholm), the I.H.E.S. (Bures sur Yvette, Paris).

Appendix A. Exterior Differential Calculi on Hopf Algebras

In this appendix, we briefly recall general definitions and results from the theory of differential calculi on quantum spaces and quantum groups. We confine ourselves to the notions that we need in this paper, in order to construct the exterior algebras over the quantum group SUq(2) and its subalgebra S2q. For a more complete analysis we refer to [36, 20].

Let A be a unital ∗-algebra over C and Ω1(A) an A-bimodule. Given the linear map d : A → Ω1(A), the pair (Ω1(A), d) is a (first order) differential calculus over A if d satisfies the Leibniz rule, d(xy) = (dx)y + x dy for x, y ∈ A, and if Ω1(A) is generated by d(A) as an A-bimodule. Furthermore, it is a ∗-calculus if there is an anti-linear involution ∗ : Ω1(A) → Ω1(A) such that (a1 (da) a2)∗ = a2∗ (d(a∗)) a1∗ for any a, a1, a2 ∈ A. The universal calculus (Ω1(A)un, δ) has universal 1-forms given by the submodule Ω1(A)un = ker(m : A ⊗ A → A) ⊂ A ⊗ A, with m(a ⊗ b) = ab the multiplication map, while the universal differential δ : A → Ω1(A)un is δa = 1 ⊗ a − a ⊗ 1.
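The defining properties of the universal calculus can be illustrated on a toy algebra. The sketch below takes A to be Laurent polynomials in one variable z (an assumption made purely for illustration), stores elements of A ⊗ A as dictionaries keyed by pairs of exponents, and checks that δa = 1 ⊗ a − a ⊗ 1 satisfies the Leibniz rule and takes values in ker m. All function names are hypothetical.

```python
from fractions import Fraction

def clean(t):
    return {k: c for k, c in t.items() if c != 0}

def poly_mul(x, y):
    out = {}
    for i, a in x.items():
        for j, b in y.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return clean(out)

def delta(a):
    """Universal differential: delta(a) = 1 (x) a - a (x) 1."""
    out = {}
    for e, c in a.items():
        out[(0, e)] = out.get((0, e), 0) + c
        out[(e, 0)] = out.get((e, 0), 0) - c
    return clean(out)

def left(x, t):
    """Left module action: x . (a (x) b) = xa (x) b."""
    out = {}
    for i, a in x.items():
        for (l, r), c in t.items():
            out[(i + l, r)] = out.get((i + l, r), 0) + a * c
    return clean(out)

def right(t, y):
    """Right module action: (a (x) b) . y = a (x) by."""
    out = {}
    for (l, r), c in t.items():
        for j, b in y.items():
            out[(l, r + j)] = out.get((l, r + j), 0) + c * b
    return clean(out)

def tensor_add(s, t):
    out = dict(s)
    for k, c in t.items():
        out[k] = out.get(k, 0) + c
    return clean(out)

def multiply(t):
    """Multiplication map m(a (x) b) = ab."""
    out = {}
    for (l, r), c in t.items():
        out[l + r] = out.get(l + r, 0) + c
    return clean(out)

x = {1: Fraction(1), -2: Fraction(3)}      # z + 3 z^-2
y = {0: Fraction(2), 3: Fraction(-1)}      # 2 - z^3
leibniz_ok = delta(poly_mul(x, y)) == tensor_add(right(delta(x), y), left(x, delta(y)))
in_kernel = multiply(delta(x)) == {}
```

The cross terms x ⊗ y cancel between (δx)y and x δy, which is exactly why the Leibniz rule holds for this δ on any unital algebra.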
It is universal since given any sub-bimodule N of Ω1 (A)un with projection πN : Ω1 (A)un → Ω1 (A) = Ω1 (A)un /N , then (Ω1 (A), d), with d := πN ◦ δ, is a first order differential calculus over A and any such a calculus can be obtained in this way. The projection πN : Ω1 (A)un → Ω1 (A) is πN ( i ai ⊗ bi ) = i ai dbi with associated subbimodule N = ker π. Next, suppose A is a left H-comodule algebra for the quantum group H = (H, ∆, ε, S), with left coaction δL : A → H ⊗ A, an algebra map. The calculus is (1) said to be left covariant provided a left coaction δL : Ω1 (A) → H ⊗ Ω1 (A) exists, (1) (1) (1) such that δL (da) = (1 ⊗ d)δL (a) and δL (a1 α a2 ) = δL (a1 ) δL (α) δL (a2 ) for any α ∈ Ω1 (A) and a1 , a2 ∈ A. Left covariance of a calculus can be stated in terms of the subbimodule N ⊂ Ω1 (A). The left coaction δL is naturally extended to the tensor product as δ˜L := (· ⊗ id ⊗ id) ◦ (id ⊗τ ⊗ id) ◦ (δL ⊗ δL )
with τ the standard flip. A calculus is left covariant if and only if δ˜L (N ) ⊂ H ⊗ N . In (1) this case, the coaction δL is the consistent restriction of δ˜L to Ω1 (A). The property of right covariance of a first order differential calculus is stated in complete analogy with respect to a right H-comodule structure of A. Clearly these notions make sense for A to be the algebra H with the coaction ∆ of H on itself extended then to maps (1)
(1)
∆R (dh) = (d ⊗ 1)∆(h)
and ∆L (dh) = (1 ⊗ d)∆(h).
On H there is in addition the notion of a bicovariant calculus, namely a calculus which is both left and right covariant and satisfying the compatibility condition: (1)
(1)
(1)
(1)
(id ⊗∆R ) ◦ ∆L = (∆L ⊗ id) ◦ ∆R . On a quantum group H the covariance of calculi are studied in terms of the bijection r : H ⊗ H → H ⊗ H, r(h ⊗ h ) = (h ⊗ 1)∆(h ),
r−1 (h ⊗ h ) = (h ⊗ 1)(S ⊗ id)∆(h )
(A.1)
which is such that r(Ω1 (H)un ) = H ⊗ ker ε. Left covariant calculi on H are in one to one correspondence with right ideals Q ⊂ ker ε, with subbimodule NQ = r−1 (H ⊗ Q) and Ω1 (H) := Ω1 (H)un /NQ . The left H-modules isomorphism given by Ω1 (H) H ⊗ (ker ε/Q) allows one to recover the complex vector space ker ε/Q as the set of left invariant 1-forms, namely the elements ωa ∈ Ω1 (H) such that (1)
∆L (ωa ) = 1 ⊗ ωa . The dimension of ker ε/Q is referred to as the dimension of the calculus. A left covariant first order differential calculus is a ∗-calculus if and only if (S(Q))∗ ∈ Q for any Q ∈ Q. If this is the case, the left coaction of H on Ω1 (H) is compatible with the (1) ∗-structure: ∆L (dh∗ ) = (∆(1) (dh))∗ . Bicovariant calculi corresponds to right ideals Q ⊂ ker ε which are in addition stable under the right adjoint coaction Ad of H onto itself, that is to say Ad(Q) ⊂ Q ⊗ H. Explicitly, Ad = (id ⊗ m)(τ ⊗ id)(S ⊗ ∆)∆, with τ the flip operator, or Ad(h) = h(2) ⊗ (S(h(1) )h(3) ) in Sweedler notation. The tangent space of the calculus is the complex vector space of elements out of H — the dual space H of functionals on H — defined by XQ := {X ∈ H : X(1) = 0, X(Q) = 0, ∀ Q ∈ Q}.
(A.2)
There exists a unique bilinear form { , } : XQ × Ω1 (H),
{X, xdy} := ε(x)X(y),
(A.3)
giving a non-degenerate dual pairing between the vector spaces XQ and ker ε/Q. We have then also a vector space isomorphism XQ (ker ε/Q). The dual space H has natural left and right (mutually commuting) actions on H: X h := h(1) X(h(2) ),
h X := X(h(1) )h(2) .
(A.4)
If the vector space XQ is finite dimensional its elements belong to the dual Hopf algebra H ⊃ Ho = (Ho , ∆Ho , εHo , SHo ), defined as the largest Hopf ∗-subalgebra
contained in H . In such a case the ∗-structures are compatible with both actions: X h∗ = ((S(X))∗ h)∗ ,
h∗ X = (h (S(X))∗ )∗ ,
for any X ∈ Ho , h ∈ H. Then the exterior derivative can be written as:
dh := (Xa h)ωa = ωa (−S −1 (Xa )) h, a
(A.5)
a
where {Xa , ωb } = δab , and one has the identity S −1 (Xa ) = −S −1 (fba )Xb . The twisted Leibniz rule of derivations of the basis elements Xa is dictated by their coproduct:
∆Ho (Xa ) = 1 ⊗ Xa + Xb ⊗ fba , (A.6) b
where the fab ∈ Ho constitute an algebra representation of H:
∆Ho (fab ) = fac ⊗ fcb , c
εHo (fab ) = δab ,
SHo (fab )fbc = fab SHo (fbc ) = δac . b
b
The elements fab also control the H-bimodule structure of Ω1 (H):
ωa h = (fab h)ωb , hωa = ωb ((S −1 (fab )) h), for h ∈ H. b
(A.7)
b
The right coaction of H on Ω1 (H) defines matrix elements
(1) ∆R (ωa ) = ωb ⊗ Jba ,
(A.8)
b
where Jab ∈ H. This matrix is invertible, since )Jbc = δac and b S(Jab J S(J ) = δ . In addition one finds that ∆(J ) = ab bc ac ab b c Jac ⊗ Jcb and ε(Jab ) = δab . It gives a basis of right invariant 1-forms, ηa = ωb S(Jba ) and, as we shall see in a moment, allows one for an explicit evaluation of the braiding of the calculus. In order to construct an exterior algebra Ω(H) over the bicovariant first order differential calculus (Ω1 (H), d) one uses a braiding map replacing the flip automorphism. Define the tensor product Ω1 (H)⊗k = Ω1 (H) ⊗H · · · ⊗H Ω1 (H) with k factors. There exists a unique H-bimodule homomorphism σ : Ω1 (H)⊗2 → Ω1 (H)⊗2 such that σ(ω ⊗ η) = η ⊗ ω for any left invariant 1-form ω and any right invariant 1-form η. The map σ is invertible and commutes with the left coaction of H: (2)
(2)
(id ⊗ σ) ◦ ∆L = ∆L ◦ σ, (2)
with ∆L the extension of the coaction to the tensor product. There is an analogous invariance for the right coaction. Moreover, σ satisfies a braid equation. On Ω1 (H)⊗3 : (id ⊗ σ) ◦ (σ ⊗ id) ◦ (id ⊗ σ) = (σ ⊗ id) ◦ (id ⊗ σ) ◦ (σ ⊗ id).
All of this was proved in [36], where, using the dual pairing between Ho and H, an explicit form of the braiding σ was given on a basis of left invariant 1-forms:
σabnk ωn ⊗ ωk = fak , Jnb ωn ⊗ ωk . (A.9) σ(ωa ⊗ ωb ) := nk
nk
The braiding map provides a representation of the braid group and an antisym(k) metrizer operator A(k) : Ω1 (H)⊗k → Ω1 (H)⊗k . The Hopf ideals SQ = ker A(k) give the quotients Ωk (H) = Ω1 (H)⊗k /SQ
(k)
(A.10)
the structure of a H-bicovariant bimodule which can be written as Ωk (H) = Range A(k) . The exterior algebra is (Ω(H) = ⊕k Ωk (H), ∧) with the identification Ω0 (H) = H. The exterior derivative is extended to Ω(H) as the only degree one derivation such that d2 = 0. The algebra Ω(H) has natural left and right H-comodule structure, given by recursively setting (k+1)
∆L
(k)
(dθ) = (1 ⊗ d)∆L (θ),
(k+1)
∆R
(k)
(dθ) = (d ⊗ 1)∆R (θ).
Finally, the ∗-structure on Ω1 (H) is extended to an antilinear ∗ : Ω(H) → Ω(H) by (θ ∧ θ )∗ = (−1)kk θ∗ ∧ θ∗ with θ ∈ Ωk (H) and θ ∈ Ωk (H); the exterior derivative operator satisfies the identity (dθ)∗ = d(θ∗ ). The quantum tangent space XQ can be endowed with a bilinear product, given as the functional [ , ]q : XQ ⊗ XQ → XQ : [X, Y ]q (h) := {X ⊗ Y, Ad(h)},
(A.11)
with a natural extension of the bilinear form (A.3). The bicovariance of the calculus ensures that the product is well defined and that, besides being braided antisymmetric, it satisfies a braided Jacobi identity, both properties holding with respect to (the transpose of) the braiding σ. On a basis it is given by
σcdab Xc Xd , (A.12) [Xa , Xb ]q = Xa Xb − cd
and computed in terms of the pairing and the matrix Jab in (A.8) as
σcdab Xc Xd = Xb , Jac Xc . Xa Xb − cd
(A.13)
c
Appendix B. Quantum Principal Bundles and Connections on Them Following [4], we consider as a total space an algebra P (with multiplication m : P ⊗P → P) and as structure group a Hopf algebra H, thus P is a right H-comodule algebra with coaction δR : P → P ⊗H. The subalgebra of right coinvariant elements, B = P H = {p ∈ P : δR p = p ⊗ 1}, is the base space of the bundle. The algebras (P, B, H) define a topological principal bundle provided the sequence: χ
0 → P(Ω1 (B)un )P → Ω1 (P)un → P ⊗ ker εH → 0
(B.1)
is exact, with Ω1 (P)un and Ω1 (B)un the universal calculi and the map χ defined by χ : P ⊗ P → P ⊗ H,
χ := (m ⊗ id)(id ⊗ δR ).
(B.2)
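To see the canonical map (B.2) at work, here is a minimal sketch on a toy example that is not the bundle studied in the paper: P = H = Laurent polynomials in z, with right coaction δR(z^j) = z^j ⊗ z^j and counit ε(z^j) = 1. Elements of P ⊗ P and P ⊗ H are stored as dictionaries keyed by exponent pairs. The code checks that χ maps universal 1-forms δp into P ⊗ ker ε, as required by the sequence (B.1). Function names are illustrative.

```python
from fractions import Fraction

def clean(t):
    return {k: c for k, c in t.items() if c != 0}

def delta(a):
    """Universal differential: delta(a) = 1 (x) a - a (x) 1."""
    out = {}
    for e, c in a.items():
        out[(0, e)] = out.get((0, e), 0) + c
        out[(e, 0)] = out.get((e, 0), 0) - c
    return clean(out)

def chi(t):
    """chi = (m (x) id)(id (x) delta_R): z^i (x) z^j -> z^(i+j) (x) z^j."""
    out = {}
    for (i, j), c in t.items():
        out[(i + j, j)] = out.get((i + j, j), 0) + c
    return clean(out)

def counit_second_leg(t):
    """(id (x) epsilon) with epsilon(z^j) = 1; vanishes on P (x) ker(epsilon)."""
    out = {}
    for (i, j), c in t.items():
        out[i] = out.get(i, 0) + c
    return clean(out)

p = {3: Fraction(1), -1: Fraction(2)}          # z^3 + 2 z^-1
image = chi(delta(p))                          # z^3 (x) (z^3 - 1) + 2 z^-1 (x) (z^-1 - 1)
```

In this toy case χ(δ z^n) = z^n ⊗ (z^n − 1), the same pattern as in the computations χ(a δc − q c δa) = (ac − q ca) ⊗ (z − 1) of Sec. 6.2.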
In fancier parlance, the exactness of this sequence is also referred to as stating (for a structure quantum group which is cosemisimple and has bijective antipode) that the inclusion B → P is a Hopf–Galois extension [31, Theorem I]. Assume now that (Ω1 (P), d) is a right H-covariant differential calculus on P given via the subbimodule NP ⊂ Ω1 (P)un , and (Ω1 (H), d) a bicovariant differential calculus on H given via the Ad-invariant right ideal QH ∈ ker εH . A first order left invariant differential calculus is induced on the algebra basis B via Ω1 (B) = Ω1 (B)un /NB with NB := NP ∩ Ω1 (B)un . This definition is aimed to ensure that Ω1 (B) = BdB. To extend the coaction δR to a coaction of H on Ω1 (P), one requires δR (NP ) ⊂ NP ⊗ H. The compatibility of the calculi are then the requirements that χ(NP ) ⊆ P ⊗ QH and that the map ∼NP : Ω1 (P) → P ⊗ (ker εH /QH ), defined by the diagram Ω1 (P)un ↓χ
π
−−N→ id ⊗πQH
Ω1 (P) ↓ ∼N P
(B.3)
P ⊗ ker εH −−−−−−→ P ⊗ (ker εH /QH ) (with πN and πQH the natural projections), is surjective and has kernel ker(∼NP ) = PΩ1 (B)P =: Ω1hor (P). These conditions ensure the exactness of the sequence: ∼N
0 → PΩ1 (B)P → Ω1 (P) −−−P→ P ⊗ (ker εH /QH ) → 0.
(B.4)
The condition χ(NP ) ⊆ P ⊗ QH is needed to have a well defined map ∼NP . If (P, B, H) is a quantum principal bundle with the universal calculi, the equality χ(NP ) = P ⊗ QH ensures that (P, B, H; NP , QH ) is a quantum principal bundle with the corresponding nonuniversal calculi. Elements in the quantum tangent space XQH (H) act on ker εH /QH via the pairing between Ho and H. Given V ∈ XQH (H) one defines a map V : Ω1 (P) → P,
V := (id ⊗V ) ◦ (∼NP )
(B.5)
and declares a 1-form ω ∈ Ω1 (P) to be horizontal iff V (ω) = 0, for any V ∈ XQH (H). The collection of horizontal 1-forms coincides with Ω1hor (P). The compatibility conditions above allow one to define right coactions (for k = (k+1) (k+1) : Ωk+1 (P) → Ωk+1 (P) ⊗ H, as coalgebra maps, via δR ◦d = 0, 1, . . .) δR (k) (d ⊗ 1) ◦ δR . By direct computation Ad(ker εH ) ⊂ (ker εH ) ⊗ H. Being the right ideal QH Ad-invariant (i.e. the differential calculus on H is bicovariant), it is possible to define a right-adjoint coaction Ad(R) : ker εH /QH → (ker εH /QH ) ⊗ H by the
commutative diagram ker εH ↓ Ad
πQ
−−−H → ker εH /QH ↓ Ad(R)
πQ ⊗id
H ker εH ⊗ H −−− −−→ (ker εH /QH ) ⊗ H.
Such a right-adjoint coaction Ad(R) allows one further to define a right coaction (Ad) δR of H on P ⊗(ker εH /QH ) as a coaction of a Hopf algebra on the tensor product of its comodules. This coaction is explicitly given by the relation: (Ad)
δR
(p ⊗ πQH (h)) = p(0) ⊗ πQH (h(2) ) ⊗ p(1) (Sh(1) )h(3) .
(B.6)
A connection on the quantum principal bundle is a right invariant splitting of the sequence (B.4). Given a left P-linear map σ ˜ : P ⊗ (ker εH /QH ) → Ω1 (P) such that (1)
(Ad)
˜ = (˜ σ ⊗ id)δR δR ◦ σ
and
∼NP ◦˜ σ = id,
(B.7)
the map Π : Ω1 (P) → Ω1 (P) defined by Π = σ ˜ ◦ ∼NP is a right invariant left P-linear projection, whose kernel coincides with the horizontal forms PΩ1 (B)P: Π2 = Π, Π(PΩ1 (B)P) = 0, (1)
(B.8) (1)
δR ◦ Π = (Π ⊗ id) ◦ δR . The image of the projection Π is the set of vertical 1-forms of the principal bundle. A connection on a principal bundle can also be given via a connection one form, ˜ of the exact which is a map ω : H → Ω1 (P). Given a right invariant splitting σ sequence (B.4), define the connection 1-form as ω(h) = σ ˜ (1 ⊗ πQH (h − εH (h))) on h ∈ H. Such a connection 1-form has the following properties: ω(QH ) = 0, ∼NP (ω(h)) = 1 ⊗ πQH (h − εH (h))
∀ h ∈ H,
(1)
δR ◦ ω = (ω ⊗ id) ◦ Ad, Π(dp) = · (id ⊗ ω)δR (p)
(B.9)
∀ p ∈ P.
Conversely, given a linear map $\omega : \ker \varepsilon_H \to \Omega^1(P)$ that satisfies the first three conditions in (B.9), there exists a unique connection on the principal bundle such that $\omega$ is its connection 1-form. The splitting of the sequence (B.4) will be
$$\tilde{\sigma}(p \otimes [h]) = p\,\omega([h]) \tag{B.10}$$
with $[h]$ in $\ker \varepsilon_H / Q_H$, while the projection $\Pi$ will be
$$\Pi = m \circ (\mathrm{id} \otimes \omega) \circ {\sim}_{N_P}. \tag{B.11}$$
The general proof of these results is in [4].
References

[1] J. Apel and K. Schmüdgen, Classification of three dimensional covariant differential calculi on Podleś quantum spheres and on related spaces, Lett. Math. Phys. 32 (1994) 25–36.
[2] N. Berline, E. Getzler and M. Vergne, Heat Kernels and Dirac Operators (Springer, 1991).
[3] S. Brain and G. Landi, The 3D spin geometry of the quantum two-sphere, Rev. Math. Phys. 22 (2010) 963–993.
[4] T. Brzeziński and S. Majid, Quantum group gauge theory on quantum spaces, Comm. Math. Phys. 157 (1993) 591–638; Erratum, ibid. 167 (1995) 235.
[5] T. Brzeziński and S. Majid, Quantum differentials and the q-monopole revisited, Acta Appl. Math. 54 (1998) 185–233.
[6] T. Brzeziński and S. Majid, Line bundles on quantum spheres, in Particles, Fields, and Gravitation (Łódź, 1998), AIP Conf. Proc., Vol. 453 (Amer. Inst. Phys., 1998), pp. 3–8.
[7] L. Cirio, C. Pagani and A. Zampini, The quantum Cartan algebra associated to a bicovariant differential calculus, arXiv:1003.1202[math.QA].
[8] F. D'Andrea and G. Landi, Anti-selfdual connections on the quantum projective plane: Monopoles, Comm. Math. Phys. 297 (2010) 841–893.
[9] M. Đurđević, Geometry of quantum principal bundles II, Rev. Math. Phys. 9 (1997) 531–607.
[10] M. Đurđević, Quantum principal bundles as Hopf–Galois extensions, arXiv:qalg/9507022.
[11] M. Đurđević, Differential structures on quantum principal bundles, Rep. Math. Phys. 41 (1998) 91–115.
[12] G. Fiore, Quantum groups covariant (anti)symmetrizers, ε-tensor, vielbein, Hodge map and Laplacian, arXiv:math/0405096.
[13] P. Hajac, Strong connections on quantum principal bundles, Comm. Math. Phys. 182 (1996) 579–617.
[14] P. Hajac, A note on first order differential calculus on quantum principal bundles, Czech. J. Phys. 47 (1997) 1139–1144.
[15] P. M. Hajac and S. Majid, Projective module description of the q-monopole, Comm. Math. Phys. 206 (1999) 247–264.
[16] I. Heckenberger, Hodge and Laplace Beltrami operators for bicovariant differential calculi on quantum groups, Comp. Math. 123 (2000) 329–354.
[17] I. Heckenberger, Spin geometry on quantum groups via covariant differential calculi, Adv. Math. 175 (2003) 197–242.
[18] I. Heckenberger and S. Kolb, Differential calculus on quantum homogeneous spaces, Lett. Math. Phys. 63 (2003) 255–264.
[19] M. Khalkhali, G. Landi and W. van Suijlekom, Holomorphic structures on the quantum projective line, Int. Math. Res. Not. 4 (2011) 851–884.
[20] A. Klimyk and K. Schmüdgen, Quantum Groups and Their Representations (Springer, 1997).
[21] J. Kustermans, G. J. Murphy and L. Tuset, Quantum groups, differential calculi and the eigenvalues of the Laplacian, Trans. Amer. Math. Soc. 357 (2005) 4681–4717.
[22] G. Landi, C. Reina and A. Zampini, Gauged Laplacians on quantum Hopf bundles, Comm. Math. Phys. 287 (2009) 179–209.
[23] S. Majid, Foundations of Quantum Group Theory (Cambridge Univ. Press, 1995).
[24] S. Majid, q-epsilon tensor for quantum and braided spaces, J. Math. Phys. 36 (1995) 1991–2007.
[25] T. Masuda, K. Mimachi, Y. Nakagami, M. Noumi and K. Ueno, Representations of the quantum group $SU_q(2)$ and the little q-Jacobi polynomials, J. Funct. Anal. 99 (1991) 357–387.
[26] U. Meyer, Wave equations on q-Minkowski spaces, Comm. Math. Phys. 174 (1995) 457–476.
[27] M. J. Pflaum and P. Schauenburg, Differential calculi on noncommutative bundles, Z. Phys. C 76 (1997) 733–744.
[28] P. Podleś, Quantum spheres, Lett. Math. Phys. 14 (1987) 193–202.
[29] P. Podleś, Differential calculus on quantum spheres, Lett. Math. Phys. 18 (1989) 107–119.
[30] K. Schmüdgen and E. Wagner, Representations of cross product algebras of Podleś quantum spheres, J. Lie Theory 17 (2007) 751–790.
[31] H. Schneider, Principal homogeneous spaces for arbitrary Hopf algebras, Israel J. Math. 72 (1990) 167–195.
[32] A. Schüler, Differential Hopf algebras on quantum groups of type A, J. Algebra 214 (1999) 479–518.
[33] P. Stachura, Bicovariant differential calculi on $S_\mu U(2)$, Lett. Math. Phys. 25 (1992) 175–188.
[34] S. L. Woronowicz, Compact matrix pseudogroups, Comm. Math. Phys. 111 (1987) 613–665.
[35] S. L. Woronowicz, Twisted $SU_q(2)$ group. An example of a noncommutative differential calculus, Publ. Res. Inst. Math. Sci. Kyoto Univ. 23 (1987) 117–181.
[36] S. L. Woronowicz, Differential calculus on compact matrix pseudogroups (quantum groups), Comm. Math. Phys. 122 (1989) 125–170.
[37] A. Zampini, Laplacians and gauged Laplacians on a quantum Hopf bundle, to appear in Quantum Groups and Noncommutative Spaces, eds. M. Marcolli et al. (Vieweg Verlag, 2010); arXiv:1003.5598[math.QA].
July 6, J070-S0129055X11004394
2011 10:59 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 6 (2011) 615–641 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004394
SHARP SPECTRAL ESTIMATES IN DOMAINS OF INFINITE VOLUME
LEANDER GEISINGER∗ and TIMO WEIDL†
Universität Stuttgart, Pfaffenwaldring 57, D-70569 Stuttgart, Germany
∗[email protected]
†[email protected]

Received 29 December 2010
Revised 14 April 2011

We consider the Dirichlet Laplace operator on open, quasi-bounded domains of infinite volume. For such domains semiclassical spectral estimates based on the phase-space volume, and therefore on the volume of the domain, must fail. Here we present a method by which one can nevertheless prove uniform bounds on eigenvalues and eigenvalue means which are sharp in the semiclassical limit. We give examples in horn-shaped regions and so-called spiny urchins. Some results are extended to Schrödinger operators defined on quasi-bounded domains with Dirichlet boundary conditions.

Keywords: Dirichlet Laplacian; Lieb–Thirring inequality; Berezin–Li–Yau inequality; domains of infinite volume; horn-shaped regions.

Mathematics Subject Classification 2010: 35P15, 47A75
1. Introduction

Let $V(x)$ be a non-negative function on an open set $\Omega \subset \mathbb{R}^d$, $d \ge 1$. In this paper we study the negative spectrum of Schrödinger operators $H_\Omega = -\Delta - V$ defined in $L^2(\Omega)$ with Dirichlet conditions on the boundary of $\Omega$. More precisely, one defines $H_\Omega$ to be the self-adjoint operator generated by the quadratic form
$$\langle u, H_\Omega u \rangle = \int_\Omega |\nabla u(x)|^2\, dx - \int_\Omega V(x) |u(x)|^2\, dx,$$
with form domain $H_0^1(\Omega)$, see [5] for details. We always assume that $H_\Omega$ has purely discrete spectrum. Then the negative spectrum of $H_\Omega$, if not empty, consists of finitely many eigenvalues $-\lambda_1 \le -\lambda_2 \le \cdots \le -\lambda_N < 0$, $N < \infty$, counted with multiplicity. In general, these eigenvalues cannot be calculated explicitly and for large $N$
it is difficult to approximate them numerically. Hence, to deduce information about the eigenvalues one also studies the Riesz means
$$R_\sigma(V; \Omega) = \operatorname{Tr}(H_\Omega)_-^\sigma = \sum_{k=1}^{N} \lambda_k^\sigma$$
of order $\sigma \ge 0$ and their dependence on $\Omega$ and $V$. The first rigorous step in this direction dates back to Weyl, Courant and Hilbert [43, 6] who calculated the semiclassical limit of the eigenvalues in the case of a constant potential. To state the general result let us introduce a scaling parameter $\lambda > 0$ and replace the potential $V$ by $\lambda V$. Then for $\sigma \ge 0$ and $V \in L^{\sigma+d/2}(\Omega)$ the limit
$$R_\sigma(\lambda V; \Omega) = L^{cl}_{\sigma,d} \int_\Omega V(x)^{\sigma+d/2}\, dx\; \lambda^{\sigma+d/2} + o(\lambda^{\sigma+d/2}), \quad \lambda \to \infty, \tag{1}$$
holds with the semiclassical constant
$$L^{cl}_{\sigma,d} = \frac{\Gamma(\sigma+1)}{(4\pi)^{d/2}\, \Gamma\!\left(\sigma + \frac{d}{2} + 1\right)},$$
see, e.g., [34]. To get information about finite potentials one needs to supplement this asymptotic result with uniform estimates. In [25] it was shown that for $\Omega = \mathbb{R}^d$ and $\sigma > \max\{0, 1 - d/2\}$ the estimate
$$R_\sigma(V; \mathbb{R}^d) \le L_{\sigma,d} \int_{\mathbb{R}^d} V(x)^{\sigma+d/2}\, dx$$
holds with certain positive constants $L_{\sigma,d}$. These inequalities have many important applications, for example, in proving the stability of matter [23, 24]. Finding the best constants for which the Lieb–Thirring inequalities hold poses a substantial mathematical challenge. In [27] the inequalities were established for $\sigma \ge 3/2$ with the sharp constants $L_{\sigma,d} = L^{cl}_{\sigma,d}$. This result immediately implies that for any open set $\Omega \subset \mathbb{R}^d$, $\sigma \ge 3/2$, and any non-negative potential $V \in L^{\sigma+d/2}(\Omega)$
$$R_\sigma(V; \Omega) \le L^{cl}_{\sigma,d} \int_\Omega V(x)^{\sigma+d/2}\, dx. \tag{2}$$
If $V \in L^{\sigma+d/2}(\Omega)$ then both (1) and (2) hold and we see that the bound (2) is sharp: It shows the correct power of $V$ and holds with the sharp constant. In this paper we are interested in the case $V \notin L^{\sigma+d/2}(\Omega)$, where the bound (2) and even the asymptotics (1) must fail and one needs to find a new approach to get sharp uniform bounds on eigenvalue means. If $V \notin L^{\sigma+d/2}(\Omega)$ the leading order of the semiclassical limit depends on the potential $V$ and on the geometry of $\Omega$ and it is challenging to find estimates that take these dependencies into account.
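As a small numerical aid, the semiclassical constant $L^{cl}_{\sigma,d}$ defined above can be evaluated directly from its Gamma-function formula. The following is only a sketch; the function name `l_cl` is a choice made here for illustration and does not come from the paper.

```python
import math


def l_cl(sigma: float, d: int) -> float:
    """Semiclassical constant Gamma(sigma+1) / ((4 pi)^(d/2) Gamma(sigma + d/2 + 1))."""
    numerator = math.gamma(sigma + 1.0)
    denominator = (4.0 * math.pi) ** (d / 2.0) * math.gamma(sigma + d / 2.0 + 1.0)
    return numerator / denominator


if __name__ == "__main__":
    # Print a few values; sigma = 3/2 is the threshold for the sharp bound (2).
    for sigma, d in [(0.0, 1), (0.0, 2), (1.5, 2), (1.5, 3)]:
        print(f"L_cl(sigma={sigma}, d={d}) = {l_cl(sigma, d):.6f}")
```

Two special cases are easy to check by hand: in dimension one with $\sigma = 0$ the formula reduces to $1/\pi$, and in dimension two it reduces to $1/(4\pi(\sigma+1))$.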
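Let us discuss the case of a constant potential $V \equiv \Lambda > 0$ on $\Omega$ in more detail. If $\Omega$ is bounded then the semiclassical limit (1) reads as
$$R_\sigma(\Lambda; \Omega) = L^{cl}_{\sigma,d}\, |\Omega|\, \Lambda^{\sigma+d/2} + o(\Lambda^{\sigma+d/2}), \quad \sigma \ge 0, \quad \Lambda \to \infty, \tag{3}$$
where $|\Omega|$ denotes the volume of $\Omega$. In this case the asymptotic results are supplemented by the Berezin–Lieb–Li–Yau inequality [4, 22, 28]: For $\sigma \ge 1$
$$R_\sigma(\Lambda; \Omega) \le L^{cl}_{\sigma,d}\, |\Omega|\, \Lambda^{\sigma+d/2}, \quad \Lambda > 0. \tag{4}$$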
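The interplay of (3) and (4) can be observed numerically on a domain with explicitly known spectrum. The sketch below uses the unit square, whose Dirichlet eigenvalues are $\pi^2(j^2+k^2)$ with $j, k \ge 1$; the choice of domain and all function names are assumptions made for this illustration only.

```python
import math


def l_cl(sigma, d):
    """Semiclassical constant L^cl_{sigma,d}."""
    return math.gamma(sigma + 1) / ((4 * math.pi) ** (d / 2) * math.gamma(sigma + d / 2 + 1))


def riesz_mean_unit_square(lam, sigma):
    """Sum over j, k >= 1 of (lam - pi^2 (j^2 + k^2))_+^sigma for the unit square."""
    jmax = int(math.sqrt(lam) / math.pi) + 1
    total = 0.0
    for j in range(1, jmax + 1):
        for k in range(1, jmax + 1):
            gap = lam - math.pi ** 2 * (j * j + k * k)
            if gap > 0:
                total += gap ** sigma
    return total


def berezin_ratio(lam, sigma):
    """Riesz mean divided by the semiclassical term L^cl_{sigma,2} * |Q| * lam^(sigma+1), |Q| = 1."""
    return riesz_mean_unit_square(lam, sigma) / (l_cl(sigma, 2) * lam ** (sigma + 1))


if __name__ == "__main__":
    for lam in (1e2, 1e3, 1e4):
        print(lam, berezin_ratio(lam, 1.0))
```

For $\sigma = 1$ the printed ratio stays below one, as (4) requires, and moves toward one as $\Lambda$ grows, in line with (3).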
Again, the constant in this inequality is sharp and cannot be improved. However, under certain conditions on the geometry of $\Omega$ a negative second term exists in the semiclassical limit (3), see [9, 15–18, 36], and the question arises whether (4) can be improved by an additional negative correction term. Recently, several results have been found that answer this question [11–13, 19, 29, 42]. In [11] the corresponding sharp estimate for the discrete Laplacian was improved by a negative remainder term capturing the properties of the second term of the semiclassical limit. The first uniform improvement for the continuous Laplacian is due to Melas [29]. He improved the estimate (4) for $\sigma \ge 1$; however, the remainder does not reflect the correct order of the second term of the semiclassical limit. In [42] this was improved in the case $\sigma \ge 3/2$. Using an inductive argument based on operator-valued Lieb–Thirring inequalities [27] the Berezin inequality (4) was strengthened by a negative remainder term of correct order in comparison with the second term of the semiclassical limit. Here we are not concerned with the remainder term but we apply the same inductive argument to derive sharp spectral inequalities in domains of infinite volume. However, for unbounded domains $\Omega$ even the discreteness of the spectrum of the Dirichlet Laplacian is no longer guaranteed. A necessary condition is the so-called quasi-boundedness of $\Omega$ [2] which is satisfied, by definition, if
$$\lim_{x \in \Omega,\ |x| \to \infty} \operatorname{dist}(x, \partial\Omega) = 0.$$
But even for quasi-bounded domains (3) and (4) must fail if the volume of $\Omega$ is infinite. In this article we show that one can nevertheless prove uniform bounds on the eigenvalue means for certain domains with infinite volume. In this case the leading order of the semiclassical limit depends on the geometry of $\Omega$, see, e.g., [10, 35]. However, applying the induction-in-the-dimension argument from [42] we can prove sharp estimates valid for all $\Lambda > 0$ that capture the correct asymptotic behavior.

If the potential $V$ is not constant the situation is more difficult. The same inductive argument can still be used to reduce the problem to one dimension. But in contrast to the case of constant potential the eigenvalues of the resulting one-dimensional operator cannot be calculated explicitly. Therefore we have to study the one-dimensional problem in more detail. In particular, we have to analyze the effect of different boundary conditions on the eigenvalues. The result yields an improved
version of the semiclassical bound (2). Again, this sharp Lieb–Thirring inequality with remainder term can be applied in situations where all known results, in particular (1) and (2), fail.

The remainder of the paper is structured as follows. First we mention some key ingredients of the proofs. In particular, we review the induction-in-the-dimension argument from [42] and adapt it to our needs here. This is done in Sec. 2. In Sec. 3 we consider constant potentials on domains with infinite volume. We give examples where the leading order of the semiclassical limit depends on the geometry of the domain $\Omega$. In these examples we derive sharp upper bounds on the eigenvalue means. The last part of the article is devoted to the general setting of non-constant potentials. In Sec. 4 we first analyze the effect of different boundary conditions on the eigenvalues of one-dimensional Schrödinger operators. We find an improvement of (2) that can be generalized to higher dimensions. Finally, we give an example to show that the result applies to certain potentials $V \notin L^{\sigma+d/2}(\Omega)$.

2. Induction in the Dimension

In this section we prove an inequality reducing estimates for eigenvalue means of the operator $H_\Omega$ to estimates for one-dimensional Schrödinger operators. The proof relies on a lifting technique from [20] and uses operator-valued Lieb–Thirring inequalities [27]. Here we follow the proof from [42], where this approach of induction-in-the-dimension is employed to derive improvements of (4) for constant potentials.

Fix a Cartesian coordinate system in $\mathbb{R}^d$ and for $x \in \mathbb{R}^d$ write $x = (x', t) \in \mathbb{R}^{d-1} \times \mathbb{R}$. For $x' \in \mathbb{R}^{d-1}$ consider the one-dimensional sections $\Omega(x') = \{t \in \mathbb{R} : (x', t) \in \Omega\}$. If not empty, each section $\Omega(x')$ consists of at most countably many open intervals $J_k(x') \subset \mathbb{R}$, $k = 1, \ldots, N(x') \le \infty$. For $x = (x', t) \in \Omega$ put $V_{x'}(t) = V(x)$ and let the one-dimensional Schrödinger operators
$$H_k(x') = -\frac{d^2}{dt^2} - V_{x'}, \quad k = 1, \ldots, N(x'),$$
be defined in $L^2(J_k(x'))$ with Dirichlet boundary conditions. Finally let
$$W(x', V) = \bigoplus_{k=1}^{N(x')} H_k(x')_- \tag{5}$$
be the negative part of the Schrödinger operator $-d^2/dt^2 - V_{x'}$ given on $\Omega(x')$ with Dirichlet boundary conditions at the endpoints of each interval forming $\Omega(x')$, that is, on the boundary of $\Omega(x')$. Using operator-valued Lieb–Thirring inequalities one can estimate eigenvalue means of $H_\Omega$ in terms of $W(x', V)$.
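For a constant potential $V \equiv \Lambda$ (the case treated in Sec. 3) the operator in (5) is explicit: on an interval of length $\ell$ the Dirichlet eigenvalues of $-d^2/dt^2$ are $\pi^2 j^2/\ell^2$, so the trace of a power of $W$ is a finite sum. The sketch below illustrates this under that assumption; the function names are hypothetical.

```python
import math


def trace_w_power_interval(lam, length, gamma):
    """Trace of the gamma-th power of the negative part of -d^2/dt^2 - lam on a
    Dirichlet interval: sum over j >= 1 of (lam - pi^2 j^2 / length^2)_+^gamma."""
    total = 0.0
    j = 1
    while True:
        gap = lam - (math.pi * j / length) ** 2
        if gap <= 0:
            break
        total += gap ** gamma
        j += 1
    return total


def trace_w_power_section(lam, interval_lengths, gamma):
    """A section made of several disjoint intervals: the direct sum in (5) adds the traces."""
    return sum(trace_w_power_interval(lam, length, gamma) for length in interval_lengths)


if __name__ == "__main__":
    # Interval of length pi: the eigenvalues are 1, 4, 9, ...
    print(trace_w_power_interval(5.0, math.pi, 1.0))
```

Note that the trace vanishes as soon as $\Lambda \le \pi^2/\ell^2$, which is the mechanism behind the statements of the form $R_\sigma = 0$ for small $\Lambda$ later in the text.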
Proposition 2.1. For $\sigma \ge 3/2$ we have
$$R_\sigma(V; \Omega) \le L^{cl}_{\sigma,d-1} \int_{\mathbb{R}^{d-1}} \operatorname{Tr} W(x', V)^{\sigma+(d-1)/2}\, dx'.$$
Remark. In the case of a constant potential $V \equiv \Lambda > 0$ the trace of $W(x', \Lambda)$ can be evaluated explicitly. If $\Omega$ is bounded, a detailed analysis of the resulting estimate leads to improved Berezin–Li–Yau inequalities with a remainder term capturing the properties of the second term of the semiclassical limit [42, 13, 12].

Proof of Proposition 2.1. We consider the quadratic form $\langle u, H_\Omega u \rangle$ and evaluate it on functions $u$ from the form core $C_0^\infty(\Omega)$. We get
$$\langle u, H_\Omega u \rangle_{L^2(\Omega)} = \|\nabla u\|^2_{L^2(\Omega)} - \int_\Omega V |u|^2\, dx = \|\nabla' u\|^2_{L^2(\Omega)} + \int_{\mathbb{R}^{d-1}} dx' \int_{\Omega(x')} \left( |\partial_t u(x', t)|^2 - V_{x'}(t)\, |u(x', t)|^2 \right) dt,$$
where $\nabla'$ denotes the gradient in the first $(d-1)$ coordinates. For fixed $x' \in \mathbb{R}^{d-1}$ the functions $u(x', \cdot)$ belong to $C_0^\infty(\Omega(x'))$ and therefore to the form core of $W(x', V)$. It follows that
$$\langle u, H_\Omega u \rangle_{L^2(\Omega)} \ge \|\nabla' u\|^2_{L^2(\Omega)} - \int_{\mathbb{R}^{d-1}} \langle u(x', \cdot), W(x', V)\, u(x', \cdot) \rangle_{L^2(\Omega(x'))}\, dx'. \tag{6}$$
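To apply operator-valued Lieb–Thirring inequalities we need to extend these forms to $\mathbb{R}^d$. More precisely, we extend both sides of (6) by zero to $C_0^\infty(\mathbb{R}^d \setminus \partial\Omega)$ which is a form core of $(-\Delta_{\mathbb{R}^d \setminus \Omega}) \oplus H_\Omega$. This operator corresponds to the left-hand side of (6), while the semi-bounded form on the right-hand side is closed on the larger domain $H^1(\mathbb{R}^{d-1}, L^2(\mathbb{R}))$, where it corresponds to the operator
$$-\Delta' \otimes I - W(x', V) \tag{7}$$
defined in $L^2(\mathbb{R}^{d-1}, L^2(\mathbb{R}))$, with $\Delta'$ the Laplacian in the variable $x'$. Due to the positivity of $(-\Delta_{\mathbb{R}^d \setminus \Omega})$ the variational principle implies
$$R_\sigma(V; \Omega) = \operatorname{Tr}\left(-\Delta_{\mathbb{R}^d \setminus \Omega} \oplus H_\Omega\right)_-^\sigma \le \operatorname{Tr}\left(-\Delta' \otimes I - W(x', V)\right)_-^\sigma. \tag{8}$$
Now we apply sharp Lieb–Thirring inequalities [27] to the Schrödinger operator (7) with operator-valued potential $W(x', V)$. For $\sigma \ge 3/2$ we obtain
$$\operatorname{Tr}\left(-\Delta' \otimes I - W(x', V)\right)_-^\sigma \le L^{cl}_{\sigma,d-1} \int_{\mathbb{R}^{d-1}} \operatorname{Tr} W(x', V)^{\sigma+(d-1)/2}\, dx' \tag{9}$$
and the claim follows from (8) and (9).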
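To see what Proposition 2.1 gives in a concrete case, one can compare three quantities on the unit square with constant potential $\Lambda$: the exact Riesz mean, the right-hand side of Proposition 2.1, and the Berezin bound (4). The unit square is chosen here only because everything is explicit (every section is the interval $(0,1)$); the function names are illustrative.

```python
import math


def l_cl(sigma, d):
    """Semiclassical constant L^cl_{sigma,d}."""
    return math.gamma(sigma + 1) / ((4 * math.pi) ** (d / 2) * math.gamma(sigma + d / 2 + 1))


def exact_riesz_mean(lam, sigma):
    """Exact R_sigma on the unit square, eigenvalues pi^2 (j^2 + k^2)."""
    jmax = int(math.sqrt(lam) / math.pi) + 1
    total = 0.0
    for j in range(1, jmax + 1):
        for k in range(1, jmax + 1):
            gap = lam - math.pi ** 2 * (j * j + k * k)
            if gap > 0:
                total += gap ** sigma
    return total


def reduced_bound(lam, sigma):
    """Right-hand side of Proposition 2.1 for d = 2 on the unit square:
    L^cl_{sigma,1} * integral over x' in (0,1) of sum_j (lam - pi^2 j^2)_+^(sigma + 1/2)."""
    total = 0.0
    j = 1
    while lam - (math.pi * j) ** 2 > 0:
        total += (lam - (math.pi * j) ** 2) ** (sigma + 0.5)
        j += 1
    return l_cl(sigma, 1) * total  # the x'-integral over (0,1) contributes a factor 1


def berezin_bound(lam, sigma):
    """Right-hand side of (4) with |Q| = 1 and d = 2."""
    return l_cl(sigma, 2) * lam ** (sigma + 1)


if __name__ == "__main__":
    for lam in (50.0, 200.0, 1000.0):
        print(lam, exact_riesz_mean(lam, 1.5), reduced_bound(lam, 1.5), berezin_bound(lam, 1.5))
```

In this example the reduced bound sits between the exact value and the Berezin bound, which is the effect exploited in [42] to obtain remainder terms for bounded domains.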
3. Constant Potentials

In this section we assume $V \equiv \Lambda > 0$ on quasi-bounded open sets $\Omega \subset \mathbb{R}^d$, $d \ge 2$. First we remark the following relations between the eigenvalue means. For $0 \le \gamma < \sigma$ we have [3]
$$R_\sigma(\Lambda; \Omega) = \frac{1}{B(\gamma+1, \sigma-\gamma)} \int_0^\infty \tau^{\sigma-\gamma-1}\, R_\gamma((\Lambda - \tau)_+; \Omega)\, d\tau, \tag{10}$$
where $B$ denotes the Beta function. Hence one can use bounds or asymptotic results for $R_\gamma$ to deduce the respective results for $R_\sigma$ with $\sigma > \gamma \ge 0$. Conclusions from eigenvalue means of higher order to means of lower order are more cumbersome since eigenvalue means of lower order are less smooth. To derive uniform bounds on the counting function, that is, on $R_0(\Lambda; \Omega)$, one can make use of the estimate [20]
$$R_0(\Lambda; \Omega) \le (\tau\Lambda)^{-\sigma}\, R_\sigma((1+\tau)\Lambda; \Omega), \quad \tau > 0, \quad \Lambda > 0, \quad \sigma > 0, \tag{11}$$
and optimize the right-hand side in $\tau > 0$. In general, sharp constants are lost but usually the correct order of growth in $\Lambda$ is preserved.

In the following we consider specific domains with infinite volume. The discreteness of the spectrum of the Dirichlet Laplace operator defined on these domains can be deduced from various sufficient conditions. For example, one can refer to the following result from [1].

Lemma 3.1. Let $\Omega$ be an open subset of $\mathbb{R}^d$ and for $h > 0$ let $Q_h$ be a cube with sides parallel to the coordinate axes. Let $\mu_{d-1}(Q_h, \Omega)$ denote the maximum of the $(d-1)$-dimensional measure of $P(Q_h \setminus \Omega)$, where the maximum is taken over all projections $P$ onto $(d-1)$-dimensional faces of $Q_h$. Assume that for every $\varepsilon > 0$ there exist $h \le 1$ and $r \ge 0$ such that for every cube $Q_h$ of side length $h$ with sides parallel to the coordinate axes and with $Q_h \cap \{x \in \mathbb{R}^d : |x| > r\} \ne \emptyset$ we have
$$\frac{\mu_{d-1}(Q_h, \Omega)}{h^{d+1}} \ge \frac{1}{\varepsilon}.$$
Then the embedding $H_0^1(\Omega) \hookrightarrow L^2(\Omega)$ is compact.

In the following examples the trace of the operator $W(x', \Lambda)$ given in (5) can be calculated explicitly and we find that Proposition 2.1 yields sharp estimates on eigenvalue means.
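The estimate (11) and the optimization in $\tau$ can be tried out on any explicit list of eigenvalues. The sketch below does two things under stated assumptions: it evaluates both sides of (11) for a given spectrum, and it records the minimizer of $(1+\tau)^{\sigma+a}\tau^{-\sigma}$, which is the $\tau$-dependence one obtains from (11) if a bound of the hypothetical form $R_\sigma(\Lambda) \le K\Lambda^{\sigma+a}$ is available. Setting the logarithmic derivative to zero gives $\tau = \sigma/a$. All names are illustrative.

```python
import math


def riesz_mean(eigs, lam, sigma):
    """Sum over eigenvalues mu < lam of (lam - mu)^sigma; sigma = 0 gives the counting function."""
    return sum((lam - mu) ** sigma for mu in eigs if mu < lam)


def counting_bound(eigs, lam, sigma, tau):
    """Right-hand side of (11): (tau * lam)^(-sigma) * R_sigma((1 + tau) * lam)."""
    return (tau * lam) ** (-sigma) * riesz_mean(eigs, (1.0 + tau) * lam, sigma)


def tau_factor(tau, sigma, a):
    """tau-dependent factor (1 + tau)^(sigma + a) / tau^sigma."""
    return (1.0 + tau) ** (sigma + a) / tau ** sigma


def optimal_tau(sigma, a):
    """Minimizer of tau_factor over tau > 0."""
    return sigma / a


if __name__ == "__main__":
    # Dirichlet eigenvalues of the unit interval, used only as a test spectrum.
    eigs = [(math.pi * j) ** 2 for j in range(1, 200)]
    lam = 200.0
    for tau in (0.1, 0.5, 1.0, 3.0):
        print(tau, riesz_mean(eigs, lam, 0.0), counting_bound(eigs, lam, 1.5, tau))
```

With $\sigma = 3/2$ and $a = (d-1+\nu)/2$ the formula $\tau = \sigma/a$ gives $3/(d+\nu-1)$.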
3.1. Horn-shaped regions

First we consider horn-shaped regions, domains stretching to infinity along distinguished directions, see [38] for a general definition. These regions were first examined
Fig. 1. The set $\Omega_2$.
by Rozenbljum [31–33] and turn out to be of interest in different situations, see, e.g., [8, 26, 30, 35, 37, 38]. In [35] the semiclassical limit of the counting function was calculated for the horn-shaped regions
$$\Omega_\nu = \{(x, y) \in \mathbb{R}^2 : |x| \cdot |y|^\nu < 1\}, \quad \nu \ge 1, \tag{12}$$
see Fig. 1. Note that discreteness of the spectrum of $H_{\Omega_\nu}$ can be deduced from Lemma 3.1. In [13] it was shown that the methods introduced in Sec. 2 yield sharp upper bounds on the trace of the heat kernel of the Dirichlet Laplacian on various horn-shaped regions. Here we derive sharp bounds on eigenvalue means and order-sharp bounds on the counting function. Let us recall the following asymptotic results from [35]. For $\nu > 1$ the limit
$$R_0(\Lambda; \Omega_\nu) = \zeta(\nu) \left(\frac{2}{\pi}\right)^{\nu} \frac{\Gamma\!\left(\frac{\nu}{2} + 1\right)}{\sqrt{\pi}\, \Gamma\!\left(\frac{\nu+3}{2}\right)}\, \Lambda^{(\nu+1)/2} + o(\Lambda^{(\nu+1)/2}), \quad \Lambda \to \infty,$$
holds, where $\zeta(\nu)$ denotes the Zeta function. Moreover, for $\nu = 1$
$$R_0(\Lambda; \Omega_1) = \frac{1}{\pi}\, \Lambda \ln \Lambda + o(\Lambda \ln \Lambda), \quad \Lambda \to \infty.$$
These formulas were improved and extended to higher dimensions in [40]. For more general geometries we refer to [21].
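To obtain asymptotics for general eigenvalue means, we apply (10) with $\gamma = 0$ and for $\sigma > 0$ and $\nu > 1$ we get
$$R_\sigma(\Lambda; \Omega_\nu) = \zeta(\nu) \left(\frac{2}{\pi}\right)^{\nu} \frac{B\!\left(\frac{\nu}{2} + 1, \sigma + 1\right)}{B\!\left(\sigma + \frac{\nu+3}{2}, \frac{1}{2}\right)}\, \Lambda^{\sigma+(\nu+1)/2} + o(\Lambda^{\sigma+(\nu+1)/2}), \quad \Lambda \to \infty, \tag{13}$$
and for $\nu = 1$
$$R_\sigma(\Lambda; \Omega_1) = \frac{1}{\pi(\sigma+1)}\, \Lambda^{\sigma+1} \ln \Lambda + o(\Lambda^{\sigma+1} \ln \Lambda), \quad \Lambda \to \infty. \tag{14}$$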
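The step from the counting-function asymptotics to (13) and (14) can be checked at the level of coefficients. Applying (10) with $\gamma = 0$ to a term $c\,\Lambda^{a}$ produces $c\,\sigma B(\sigma, a+1)\,\Lambda^{\sigma+a}$, since $B(1,\sigma) = 1/\sigma$. The sketch below compares this with the Beta-function quotient in (13); the factor $\zeta(\nu)$ is common to both sides and is therefore left out. Function names are illustrative.

```python
import math


def beta(x, y):
    """Euler Beta function B(x, y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def lift_factor(sigma, a):
    """Factor produced by (10) with gamma = 0 acting on lam^a: sigma * B(sigma, a + 1)."""
    return sigma * beta(sigma, a + 1.0)


def counting_coeff_over_zeta(nu):
    """Leading coefficient of R_0(lam; Omega_nu), divided by zeta(nu)."""
    return (2.0 / math.pi) ** nu * math.gamma(nu / 2 + 1) / (math.sqrt(math.pi) * math.gamma((nu + 3) / 2))


def riesz_coeff_over_zeta(sigma, nu):
    """Leading coefficient in (13), divided by zeta(nu)."""
    return (2.0 / math.pi) ** nu * beta(nu / 2 + 1, sigma + 1) / beta(sigma + (nu + 3) / 2, 0.5)


if __name__ == "__main__":
    for sigma, nu in [(1.5, 2.0), (2.0, 3.5), (0.5, 1.2)]:
        lifted = counting_coeff_over_zeta(nu) * lift_factor(sigma, (nu + 1) / 2)
        print(sigma, nu, lifted, riesz_coeff_over_zeta(sigma, nu))
```

For $a = 1$ the lift factor equals $1/(\sigma+1)$, which accounts for the leading coefficient in (14); the logarithm only affects lower-order terms.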
In order to treat domains in higher dimensions we generalize the notions from [35] in the following way:
$$\Omega_\nu = \{(x', x_d) \in \mathbb{R}^{d-1} \times \mathbb{R} : |x'| \cdot |x_d|^{\nu/(d-1)} < 1\}, \quad d \ge 2, \quad \nu > 1.$$
For these domains of infinite volume an application of Proposition 2.1 yields sharp spectral estimates.

Theorem 3.2. For $\sigma \ge 3/2$, $\nu > 1$, and all $\Lambda > 0$ the estimate
$$R_\sigma(\Lambda; \Omega_\nu) \le \frac{\zeta(\nu)}{2^{d-1}(d-1)} \left(\frac{2}{\pi}\right)^{\nu} \frac{\Gamma(\sigma+1)\, \Gamma\!\left(\frac{\nu}{2} + 1\right)}{\Gamma\!\left(\sigma + \frac{d+1+\nu}{2}\right) \Gamma\!\left(\frac{d+1}{2}\right)}\, \Lambda^{\sigma+(d-1+\nu)/2}$$
holds.

Remark. For $d = 2$ we conclude that the bound
$$R_\sigma(\Lambda; \Omega_\nu) \le \zeta(\nu) \left(\frac{2}{\pi}\right)^{\nu} \frac{B\!\left(\frac{\nu}{2} + 1, \sigma + 1\right)}{B\!\left(\sigma + \frac{\nu+3}{2}, \frac{1}{2}\right)}\, \Lambda^{\sigma+(\nu+1)/2}$$
holds for $\sigma \ge 3/2$ and all $\Lambda > 0$. Comparing this bound with the asymptotic relation (13) we see that the estimate is sharp: For horn-shaped regions, just as for bounded domains, the leading term of the semiclassical limit yields a uniform upper bound.

Proof of Theorem 3.2. In this setting the section $\Omega_\nu(x')$ consists of one open interval $(-|x'|^{(1-d)/\nu}, |x'|^{(1-d)/\nu})$.
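Since $V \equiv \Lambda$, the trace of the operator-valued potential $W(x', \Lambda)$ defined in (5) can be evaluated explicitly. We find
$$\operatorname{Tr} W(x', \Lambda) = \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{4 |x'|^{2(1-d)/\nu}} \right)_+ .$$
Applying Proposition 2.1 yields
$$\begin{aligned} R_\sigma(\Lambda; \Omega_\nu) &\le L^{cl}_{\sigma,d-1} \int_{\mathbb{R}^{d-1}} \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{4 |x'|^{2(1-d)/\nu}} \right)_+^{\sigma+(d-1)/2} dx' \\ &= L^{cl}_{\sigma,d-1}\, \omega_{d-1} \sum_{j \in \mathbb{N}} \int_0^\infty \left( 1 - \frac{\pi^2 j^2}{4 \Lambda r^{2(1-d)/\nu}} \right)_+^{\sigma+(d-1)/2} r^{d-2}\, dr\; \Lambda^{\sigma+(d-1)/2}, \end{aligned}$$
where $\omega_{d-1}$ denotes the volume of the unit sphere in $\mathbb{R}^{d-1}$. We substitute
$$s = \frac{\pi^2 j^2}{4\Lambda}\, r^{2(d-1)/\nu}$$
and get
$$\begin{aligned} R_\sigma(\Lambda; \Omega_\nu) &\le L^{cl}_{\sigma,d-1}\, \omega_{d-1} \sum_{j \in \mathbb{N}} \frac{\nu}{2(d-1)} \left( \frac{2\sqrt{\Lambda}}{\pi j} \right)^{\nu} \int_0^\infty (1-s)_+^{\sigma+(d-1)/2}\, s^{\nu/2-1}\, ds \times \Lambda^{\sigma+(d-1)/2} \\ &= L^{cl}_{\sigma,d-1}\, \omega_{d-1}\, \zeta(\nu) \left(\frac{2}{\pi}\right)^{\nu} \frac{\nu\, B\!\left(\sigma + \frac{d+1}{2}, \frac{\nu}{2}\right)}{2(d-1)}\, \Lambda^{\sigma+(d-1+\nu)/2}. \end{aligned}$$
Now we insert the identity
$$L^{cl}_{\sigma,d-1}\, \omega_{d-1}\, \frac{\nu\, B\!\left(\sigma + \frac{d+1}{2}, \frac{\nu}{2}\right)}{2(d-1)} = \frac{\Gamma(\sigma+1)\, \Gamma\!\left(\frac{\nu}{2} + 1\right)}{2^{d-1}(d-1)\, \Gamma\!\left(\frac{d+1}{2}\right) \Gamma\!\left(\sigma + \frac{d+1+\nu}{2}\right)}$$
and arrive at the claimed estimate.

Now we apply (11) to deduce order-sharp bounds on the counting function.

Corollary 3.3. For $\nu > 1$ and all $\Lambda > 0$ the estimate $R_0(\Lambda; \Omega_\nu) \le C_{d,\nu}\, \Lambda^{(d-1+\nu)/2}$ holds with a constant
$$C_{d,\nu} \le \frac{(d+\nu+2)^{(d+\nu+2)/2}}{3^{3/2}\, (d+\nu-1)^{(d+\nu-1)/2}}\; \frac{\zeta(\nu)}{2^{d-1}(d-1)} \left(\frac{2}{\pi}\right)^{\nu} \frac{\Gamma\!\left(\frac{\nu}{2} + 1\right) \Gamma\!\left(\frac{5}{2}\right)}{\Gamma\!\left(\frac{d+1}{2}\right) \Gamma\!\left(\frac{d+\nu}{2} + 2\right)}.$$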
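Proof. We use (11) with $\sigma = 3/2$ and apply Theorem 3.2 to obtain
$$\begin{aligned} R_0(\Lambda; \Omega_\nu) &\le \frac{1}{(\tau\Lambda)^{3/2}}\; \frac{\zeta(\nu)}{2^{d-1}(d-1)} \left(\frac{2}{\pi}\right)^{\nu} \frac{\Gamma\!\left(\frac{\nu}{2} + 1\right) \Gamma\!\left(\frac{5}{2}\right)}{\Gamma\!\left(\frac{d+1}{2}\right) \Gamma\!\left(\frac{d+\nu}{2} + 2\right)} \times ((1+\tau)\Lambda)^{(d+\nu)/2+1} \\ &= \frac{(1+\tau)^{(d+\nu)/2+1}}{\tau^{3/2}}\; \frac{\zeta(\nu)}{2^{d-1}(d-1)} \left(\frac{2}{\pi}\right)^{\nu} \frac{\Gamma\!\left(\frac{\nu}{2} + 1\right) \Gamma\!\left(\frac{5}{2}\right)}{\Gamma\!\left(\frac{d+1}{2}\right) \Gamma\!\left(\frac{d+\nu}{2} + 2\right)} \times \Lambda^{(d-1+\nu)/2}. \end{aligned}$$
Minimizing in $\tau > 0$ yields $\tau_{\min} = 3/(d+\nu-1)$ and inserting this we obtain the claimed result.

Let us now consider the critical case $\nu = 1$ in dimension $d = 2$. Here the domain yields two equally strong singularities and we cannot distinguish one direction. However, choosing an intermediate direction we obtain a sharp estimate with a remainder term.

Theorem 3.4. Let $\sigma \ge 3/2$. Then for $\Lambda \le \pi^2/16$ we have $R_\sigma(\Lambda; \Omega_1) = 0$ and for $\Lambda > \pi^2/16$ the estimate
$$R_\sigma(\Lambda; \Omega_1) \le \frac{1}{\pi(\sigma+1)}\, \Lambda^{\sigma+1} \ln \Lambda + \frac{C}{\sigma+1}\, \Lambda^{\sigma+1}$$
holds with a constant
$$C < \frac{33 + 16 \ln \frac{4}{\pi}}{8\pi} < 1.47.$$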
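The numerical value of the constant in Theorem 3.4 is easy to confirm. The sketch below evaluates the closed form and also reassembles it from the two contributions that appear in the proof, namely $4/\pi$ from (17) and $\frac{1}{2\pi}\left(4\ln\frac{4}{\pi} + \frac14\right)$ from (18) and (19); the split into these two pieces is read off from those equations, and the function names are illustrative.

```python
import math


def theorem_3_4_constant():
    """Closed form (33 + 16 ln(4/pi)) / (8 pi)."""
    return (33.0 + 16.0 * math.log(4.0 / math.pi)) / (8.0 * math.pi)


def assembled_constant():
    """Same number as a sum: 4/pi plus (1/(2 pi)) * (4 ln(4/pi) + 1/4)."""
    return 4.0 / math.pi + (4.0 * math.log(4.0 / math.pi) + 0.25) / (2.0 * math.pi)


if __name__ == "__main__":
    print(theorem_3_4_constant(), assembled_constant())
```

Both expressions agree and lie just below 1.47.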
Remark. Again, comparing this inequality with the asymptotics (14), we see that the main term of the bound is sharp.

Proof of Theorem 3.4. Since the function $|\Omega_1(x)| = 2/|x|$ has non-integrable singularities at zero and at infinity we have to choose a coordinate system $(x_1, x_2) \in \mathbb{R}^2$ rotated by $\pi/4$ with respect to the coordinate system $(x, y) \in \mathbb{R}^2$ which was used in definition (12). We get
$$\Omega_1(x_1) = \left\{ x_2 \in \mathbb{R} : |x_2| \le \sqrt{|x_1|^2 + 2} \right\}$$
for $|x_1| \le \sqrt{2}$ and
$$\Omega_1(x_1) = \left\{ x_2 \in \mathbb{R} : \sqrt{|x_1|^2 - 2} \le |x_2| \le \sqrt{|x_1|^2 + 2} \right\}$$
for $|x_1| > \sqrt{2}$. To simplify the following calculations and the resulting bound we confine ourselves to rough estimates which are nevertheless sufficient to prove the
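sharp constant in the leading term. First, we estimate $|\Omega_1(x_1)| \le 4$ for $|x_1| \le 2$ and
$$|\Omega_1(x_1)| \le 2\left( \sqrt{|x_1|^2 + 2} - \sqrt{|x_1|^2 - 2} \right) \le \frac{4}{|x_1|} + \frac{2}{|x_1|^3}$$
for $|x_1| > 2$. Suppose that $\Lambda \le \pi^2/16$. Since $|\Omega_1(x_1)| \le 4$ for all $x_1 \in \mathbb{R}$ we get
$$\operatorname{Tr} W(x_1, \Lambda) = \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{|\Omega_1(x_1)|^2} \right)_+ = 0$$
for all $x_1 \in \mathbb{R}$. From Proposition 2.1 it follows that $R_\sigma(\Lambda; \Omega_1) = 0$ for $\Lambda \le \pi^2/16$. On the other hand, if $\Lambda > \pi^2/16$, Proposition 2.1 implies
$$\begin{aligned} R_\sigma(\Lambda; \Omega_1) &\le L^{cl}_{\sigma,1} \int_{\mathbb{R}} \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{|\Omega_1(x_1)|^2} \right)_+^{\sigma+1/2} dx_1 \\ &\le 4 L^{cl}_{\sigma,1} \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{16} \right)_+^{\sigma+1/2} + 2 L^{cl}_{\sigma,1} \int_2^\infty \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{l(x_1)^2} \right)_+^{\sigma+1/2} dx_1, \end{aligned} \tag{15}$$
with
$$l(x_1) = \frac{4}{|x_1|} + \frac{2}{|x_1|^3}.$$
Note that for $A > 0$ and $\gamma > 0$ we have
$$\sum_{j \in \mathbb{N}} \left( 1 - \frac{j^2}{A^2} \right)_+^{\gamma} \le \frac{A}{2}\, B\!\left( \frac{1}{2}, \gamma + 1 \right), \tag{16}$$
thus
$$4 L^{cl}_{\sigma,1} \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{16} \right)_+^{\sigma+1/2} \le \frac{8}{\pi}\, L^{cl}_{\sigma,1}\, B\!\left( \frac{1}{2}, \sigma + \frac{3}{2} \right) \Lambda^{\sigma+1} = \frac{4}{\pi}\, \frac{1}{\sigma+1}\, \Lambda^{\sigma+1}. \tag{17}$$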
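Inequality (16) compares a lattice sum with the integral of the decreasing function $t \mapsto (1 - t^2/A^2)_+^\gamma$ over $[0, A]$, which equals $\frac{A}{2} B(\frac12, \gamma+1)$. A quick numerical comparison is sketched below; the function names are illustrative.

```python
import math


def beta(x, y):
    """Euler Beta function B(x, y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def lattice_sum(a, gamma):
    """Left-hand side of (16): sum over integers 1 <= j < a of (1 - j^2 / a^2)^gamma."""
    return sum((1.0 - (j / a) ** 2) ** gamma for j in range(1, int(math.floor(a)) + 1) if j < a)


def integral_bound(a, gamma):
    """Right-hand side of (16): (a / 2) * B(1/2, gamma + 1)."""
    return 0.5 * a * beta(0.5, gamma + 1.0)


if __name__ == "__main__":
    for a in (0.5, 2.3, 10.0, 1000.0):
        print(a, lattice_sum(a, 2.0), integral_bound(a, 2.0))
```

For large $A$ the two sides become close, which is why the leading constants obtained through (16) remain sharp.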
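Now we turn to the second term in (15). Put $x(\Lambda) = (4\sqrt{\Lambda})/\pi + \pi/(4\sqrt{\Lambda})$. For $x_1 \ge x(\Lambda)$ we have $l(x_1) \le \pi/\sqrt{\Lambda}$, hence
$$\sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{l(x_1)^2} \right)_+ = 0.$$
In view of (16) it follows that
$$\begin{aligned} 2 L^{cl}_{\sigma,1} \int_2^\infty \sum_{j \in \mathbb{N}} \left( \Lambda - \frac{\pi^2 j^2}{l(x_1)^2} \right)_+^{\sigma+1/2} dx_1 &\le \frac{1}{\pi}\, L^{cl}_{\sigma,1}\, B\!\left( \frac{1}{2}, \sigma + \frac{3}{2} \right) \int_2^{x(\Lambda)} l(x_1)\, dx_1\; \Lambda^{\sigma+1} \\ &= \frac{1}{2\pi}\, \frac{1}{\sigma+1} \int_2^{x(\Lambda)} l(x_1)\, dx_1\; \Lambda^{\sigma+1}. \end{aligned} \tag{18}$$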
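By definition of $x(\Lambda)$ and $l(x_1)$,
$$\begin{aligned} \int_2^{x(\Lambda)} l(x_1)\, dx_1 &= \int_2^{x(\Lambda)} \left( \frac{4}{x_1} + \frac{2}{x_1^3} \right) dx_1 \\ &\le 2 \ln \Lambda + 4 \ln\!\left( \frac{4}{\pi} + \frac{\pi}{4\Lambda} \right) - 4 \ln 2 + \frac{1}{4} \\ &\le 2 \ln \Lambda + 4 \ln \frac{4}{\pi} + \frac{1}{4} \end{aligned} \tag{19}$$
for $\Lambda > \pi^2/16$. Inserting (17)–(19) into (15) finishes the proof.

Again we can apply (11) to deduce order-sharp bounds on the counting function.

Corollary 3.5. For $\Lambda \le \pi^2/16$ we have $R_0(\Lambda; \Omega_1) = 0$ and for $\Lambda > \pi^2/16$ the estimate
$$R_0(\Lambda; \Omega_1) \le \left( \frac{5}{3} \right)^{3/2} \frac{1}{\pi}\, \Lambda \ln \Lambda + C \Lambda$$
holds, with a constant
$$C < \sqrt{\frac{5}{3}}\; \frac{825 + 400 \ln \frac{4}{\pi} + 360 \pi \ln \frac{5}{3}}{72\pi} < 8.56.$$

3.2. Spiny urchins

In this subsection we study the eigenvalues of the Dirichlet Laplacian on so-called spiny urchins, radially symmetric domains $\Omega_S \subset \mathbb{R}^2$ with infinite volume, which were introduced in [7]. To construct $\Omega_S$ we use polar coordinates $(r, \varphi) \in [0, \infty) \times [0, 2\pi)$ and choose an increasing sequence $(r_n)_{n \in \mathbb{N}}$ of positive real numbers and put $r_0 = 0$. For $n \in \mathbb{N}_0$ and $k = 1, 2, \ldots, 2^{n+2}$ let
$$\Gamma_{n,k} = \left\{ (r, \varphi) : r \ge r_n,\ \varphi = \frac{k-1}{2^{n+1}}\, \pi \right\}$$
be semi-axes and define
$$\Omega_S = \mathbb{R}^2 \setminus \bigcup_{n,k} \Gamma_{n,k},$$
see Fig. 2. Note that this domain, though quasi-bounded, has empty exterior. However, if
$$\lim_{n \to \infty} r_n 2^{-n} = 0, \tag{20}$$
then discreteness of the spectrum of $H_{\Omega_S}$ can be deduced from Lemma 3.1, see also [39].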
Fig. 2. The set $\Omega_S$.
For $r_n = n$ the domain $\Omega_S$ was analyzed in [10], where the leading term of the semiclassical limit was calculated: For $r_n = n$ the asymptotic relation
$$R_0(\Lambda; \Omega_S) = C\, \Lambda (\ln \Lambda)^2 + o(\Lambda (\ln \Lambda)^2), \quad \Lambda \to \infty,$$
holds with a constant $C > 0$. The general setting of an arbitrary increasing sequence $(r_n)_{n \in \mathbb{N}_0}$ was studied in [39]: If $r_0 > 0$ and (20) is satisfied then for all $\Lambda > 2^{14} r_0^{-2}$ the bound
$$R_0(\Lambda; \Omega_S) \le 50\, (8^{-1} + 8\pi)^2\, \Lambda\, r_{K(\Lambda)}^2$$
holds with $K(\Lambda) = \max\{ n \in \mathbb{N} : r_n 2^{-n} > (32\sqrt{\Lambda})^{-1} \}$. Moreover, there is a similar lower bound. Here we extend the upper bound: We derive order-sharp estimates on the eigenvalue means of $H_{\Omega_S}$ valid for all $\Lambda > 0$.

First, we need to adapt Proposition 2.1 to the radially symmetric situation. For $r \in (0, \infty)$ put $\Omega_S(r) = \{ \varphi \in [0, 2\pi) : (r, \varphi) \in \Omega_S \}$.
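Then $\Omega_S(r)$ consists of finitely many open intervals $I_k(r)$, $k = 1, \ldots, N(r)$. Choose $u \in C_0^\infty(\Omega_S)$ and consider the quadratic form
$$\begin{aligned} \langle u, H_{\Omega_S} u \rangle_{L^2(\Omega_S)} &= \int_{\Omega_S} \overline{u(x)}\, \left( -\Delta u(x) - \Lambda u(x) \right) dx \\ &= \int_0^\infty \int_{\Omega_S(r)} \overline{u(r, \varphi)} \left( -\partial_r^2 - \frac{1}{r} \partial_r - \frac{1}{r^2} \partial_\varphi^2 - \Lambda \right) u(r, \varphi)\, d\varphi\; r\, dr. \end{aligned} \tag{21}$$
For fixed $r > 0$ the function $u_r(\varphi) = u(r, \varphi)$ belongs to $C_0^\infty(\Omega_S(r))$. It satisfies Dirichlet boundary conditions at the endpoints of the intervals $I_k(r)$, $k = 1, \ldots, N(r)$.

To rewrite the form in the ground state representation put $v(r, \varphi) = \sqrt{r}\, u(r, \varphi)$. Then again $v(r, \varphi)$ belongs to $C_0^\infty(\Omega_S)$ and for fixed $r > 0$, we have $v_r(\varphi) = v(r, \varphi) \in C_0^\infty(\Omega_S(r))$. Moreover,
$$\int_0^\infty \int_{\Omega_S(r)} |u(r, \varphi)|^2\, d\varphi\; r\, dr = \int_0^\infty \int_{\Omega_S(r)} |v(r, \varphi)|^2\, d\varphi\, dr$$
and
$$\left( -\partial_r^2 - \frac{1}{r} \partial_r - \frac{1}{r^2} \partial_\varphi^2 \right) u(r, \varphi) = \frac{1}{\sqrt{r}} \left( -\partial_r^2 - \frac{1}{4r^2} - \frac{1}{r^2} \partial_\varphi^2 \right) v(r, \varphi).$$
Inserting this into (21) we obtain
$$\langle u, H_{\Omega_S} u \rangle_{L^2(\Omega_S)} = \int_0^\infty \int_{\Omega_S(r)} \left( |\partial_r v|^2 + \frac{1}{r^2} |\partial_\varphi v|^2 - \left( \frac{1}{4r^2} + \Lambda \right) |v|^2 \right) d\varphi\, dr. \tag{22}$$
In this setting, we define the Schrödinger-type operators
$$H_k(r) = -\frac{1}{r^2} \frac{d^2}{d\varphi^2} - \left( \frac{1}{4r^2} + \Lambda \right), \quad k = 1, \ldots, N(r),$$
in $L^2(I_k(r))$ with Dirichlet boundary conditions at the endpoints of $I_k(r)$. In the same way as in (5) let
$$W(r, \Lambda) = \bigoplus_{k=1}^{N(r)} H_k(r)_-$$
be the negative part of the operator
$$-\frac{1}{r^2} \frac{d^2}{d\varphi^2} - \left( \frac{1}{4r^2} + \Lambda \right)$$
in $L^2(\Omega_S(r))$ with Dirichlet boundary conditions. In view of (22) we can apply Proposition 2.1 and for $\sigma \ge 3/2$ we get
$$R_\sigma(\Lambda; \Omega_S) \le L^{cl}_{\sigma,1} \int_0^\infty \operatorname{Tr} W(r, \Lambda)^{\sigma+1/2}\, dr. \tag{23}$$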
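To estimate the right-hand side and to derive bounds on the eigenvalue means we assume that (20) is satisfied and that
$$r_{n+1} \le 2 r_n \tag{24}$$
holds for all $n \in \mathbb{N}$. Then the sequence
$$\frac{2^{2n}}{r_n^2} - \frac{1}{4 r_n^2}, \quad n \in \mathbb{N},$$
is increasing and for all $\Lambda > 15/4 \cdot r_1^{-2}$ there is a unique index $\hat{n}(\Lambda) \in \mathbb{N}$ satisfying
$$\Lambda > \frac{2^{2n}}{r_n^2} - \frac{1}{4 r_n^2} \ \text{ for all } n \le \hat{n}(\Lambda) \quad \text{and} \quad \Lambda \le \frac{2^{2n}}{r_n^2} - \frac{1}{4 r_n^2} \ \text{ for all } n > \hat{n}(\Lambda). \tag{25}$$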
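The index $\hat{n}(\Lambda)$ from (25) is straightforward to compute for a concrete sequence. The sketch below assumes, as in the text, that the thresholds $2^{2n}/r_n^2 - 1/(4r_n^2)$ increase to infinity (which (20) and (24) guarantee); the helper names and the cap `n_max` are choices made for this illustration.

```python
def threshold(n, r):
    """The quantity 2^(2n) / r_n^2 - 1 / (4 r_n^2) appearing in (25)."""
    rn = r(n)
    return (4.0 ** n - 0.25) / (rn * rn)


def n_hat(lam, r, n_max=500):
    """Largest n with lam > threshold(n, r); returns 0 if lam <= threshold(1, r).
    n_max only guards against floating-point overflow of 4.0 ** n."""
    n = 0
    while n < n_max and lam > threshold(n + 1, r):
        n += 1
    return n


def r_hat(lam, r):
    """Effective radius r_{n_hat(lam)}; 0.0 if no index qualifies."""
    n = n_hat(lam, r)
    return r(n) if n >= 1 else 0.0


if __name__ == "__main__":
    linear = lambda n: float(n)          # r_n = n
    geometric = lambda n: 2.0 ** (0.5 * n)  # r_n = 2^(delta n) with delta = 1/2
    for lam in (10.0, 1e3, 1e6):
        print(lam, n_hat(lam, linear), r_hat(lam, linear), r_hat(lam, geometric))
```

For $r_n = n$ the effective radius grows only logarithmically in $\Lambda$, whereas for $r_n = 2^{\delta n}$ it grows like a power of $\Lambda$; this is the source of the different orders in the examples that follow.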
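To simplify notation we put $\hat{r}(\Lambda) = r_{\hat{n}(\Lambda)}$.

Lemma 3.6. Let $\sigma \ge 3/2$ and assume that (20) and (24) are satisfied. Then for $\Lambda \le 15/4 \cdot r_1^{-2}$ we have $R_\sigma(\Lambda; \Omega_S) = 0$ and for $\Lambda > 15/4 \cdot r_1^{-2}$ the estimate
$$R_\sigma(\Lambda; \Omega_S) \le L^{cl}_{\sigma,2}\, \pi\, \hat{r}(\Lambda)^2\, \Lambda^{\sigma+1} + C_\sigma\, \Lambda^\sigma \ln\left( \Lambda \hat{r}(\Lambda) \right)$$
holds with a constant $C_\sigma > 0$ depending only on $\sigma$.

Remark. If we compare the main term of this bound with the Berezin inequality (4) we see that the effective domain that enters into the bound is a disk with radius $\hat{r}(\Lambda)$.

Proof of Lemma 3.6. In view of (23) we have to estimate
$$\operatorname{Tr} W(r, \Lambda) = \operatorname{Tr} \left( -\frac{1}{r^2} \frac{d^2}{d\varphi^2} - \Lambda - \frac{1}{4r^2} \right)_- = \sum_{k=1}^{N(r)} \sum_{j \in \mathbb{N}} \left( \Lambda + \frac{1}{4r^2} - \frac{\pi^2 j^2}{r^2 |I_k(r)|^2} \right)_+ .$$
Fix $r > 0$ and $n_0 \in \mathbb{N}$ such that $r_{n_0-1} < r \le r_{n_0}$. Then the section $\Omega_S(r) \subset [0, 2\pi)$ consists of $2^{n_0+1}$ identical open intervals of length $|I_k(r)| = \pi/2^{n_0}$. Hence,
$$\operatorname{Tr} W(r, \Lambda) = 2^{n_0+1} \sum_{j \in \mathbb{N}} \left( \Lambda + \frac{1}{4r^2} - \frac{2^{2n_0} j^2}{r^2} \right)_+ .$$
Note that for all $j \in \mathbb{N}$
$$\frac{2^{2n_0} j^2}{r^2} - \frac{1}{4r^2} \ge \frac{2^{2n_0+2} - 1}{4r^2} \ge \frac{15}{4 r_1^2}.$$
For $\Lambda \le 15/4 \cdot r_1^{-2}$ we obtain $\operatorname{Tr} W(r, \Lambda) = 0$ and by (23) also $R_\sigma(\Lambda; \Omega_S) = 0$. Hence, we can assume $\Lambda > 15/4 \cdot r_1^{-2}$. Suppose that $r > \hat{r}(\Lambda)$, thus $n_0 > \hat{n}(\Lambda)$. From (25) we get
$$\frac{2^{2n_0} j^2}{r^2} - \frac{1}{4r^2} \ge \frac{2^{2n_0+2} - 1}{4 r_{n_0}^2} \ge \Lambda$$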
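for all $j \in \mathbb{N}$ and it follows that $\operatorname{Tr} W(r, \Lambda) = 0$ for $r > \hat{r}(\Lambda)$. Moreover, if $r^2 \le 15/(4\Lambda)$ we have $r \le r_1$ and
$$\frac{4 j^2}{r^2} - \frac{1}{4r^2} \ge \frac{15}{4r^2} \ge \Lambda$$
for all $j \in \mathbb{N}$. Again it follows that $\operatorname{Tr} W(r, \Lambda) = 0$ and it remains to consider $\sqrt{15}/(2\sqrt{\Lambda}) < r < \hat{r}(\Lambda)$. For such $r$ we apply (16) to estimate
$$\operatorname{Tr} W(r, \Lambda)^{\sigma+1/2} = 2^{n_0+1} \sum_{j \in \mathbb{N}} \left( \Lambda + \frac{1}{4r^2} - \frac{2^{2n_0} j^2}{r^2} \right)_+^{\sigma+1/2} \le r \left( \Lambda + \frac{1}{4r^2} \right)^{\sigma+1} B\!\left( \frac{1}{2}, \sigma + \frac{3}{2} \right).$$
From (23) we conclude
$$\begin{aligned} R_\sigma(\Lambda; \Omega_S) &\le L^{cl}_{\sigma,1}\, B\!\left( \frac{1}{2}, \sigma + \frac{3}{2} \right) \int_{\sqrt{15}/(2\sqrt{\Lambda})}^{\hat{r}(\Lambda)} r \left( \Lambda + \frac{1}{4r^2} \right)^{\sigma+1} dr \\ &= \frac{1}{16(\sigma+1)}\, \Lambda^\sigma \int_{15}^{4\Lambda \hat{r}(\Lambda)^2} \left( 1 + \frac{1}{s} \right)^{\sigma+1} ds \\ &\le \frac{1}{4(\sigma+1)}\, \hat{r}(\Lambda)^2\, \Lambda^{\sigma+1} + \frac{16^{\sigma-1}}{15^{\sigma}}\, \Lambda^\sigma \ln \frac{4\Lambda \hat{r}(\Lambda)^2}{15} \end{aligned}$$
and the claim of the lemma follows from the identity $4\pi(\sigma+1) L^{cl}_{\sigma,2} = 1$.

Before we give examples we supplement Lemma 3.6 with the following lower bound on $R_\sigma(\Lambda; \Omega_S)$.

Lemma 3.7. Assume there exists $N_0 \in \mathbb{N}$ such that $r_{n-1} < (1 - 2^{-n}) r_n$ is satisfied for all $n \ge N_0$. Then for $\sigma \ge 0$ there exist positive constants $C$ and $\mu$ independent of $\Lambda$ such that
$$R_\sigma(\Lambda; \Omega_S) \ge C \sum_{n=N_0}^{\hat{n}(\mu\Lambda)} r_n (r_n - r_{n-1})\, \Lambda^{\sigma+1}$$
holds for $\Lambda > 0$ with $\hat{n}(\mu\Lambda) > N_0$.

Proof. For $n \ge N_0$ and $k \in \{1, \ldots, 2^{n+1}\}$ consider a segment $\Omega_{n,k} \subset \Omega_S$, i.e. a region between $r = r_{n-1}$, $r = r_n$ and two adjacent semi-axes $\Gamma_{n,k}$ and $\Gamma_{n,k+1}$. Note that there are $2^{n+1}$ identical segments $\Omega_{n,k}$. Let $\tau(n)$ denote the maximal number of disjoint squares $Q_{l_n}$ with side length $l_n = r_n/2^{n+1}$ that can be placed in the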
interior of Ωn,k . From the definition of ΩS it follows that τ (n) ≥ C
rn − rn−1 , ln
n ≥ N0 .a
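Hence, the variational principle implies
\[ R_\sigma(\Lambda;\Omega_S) \ge \sum_{n \ge N_0} 2^{n+1} \tau(n)\, R_\sigma(\Lambda; Q_{l_n}) \ge C \sum_{n \ge N_0} 2^{n+1}\, \frac{r_n - r_{n-1}}{l_n}\, R_\sigma(\Lambda; Q_{l_n}). \tag{26} \]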
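To estimate $R_\sigma(\Lambda; Q_{l_n})$ from below, we first consider the square $Q_1$ with side length 1. From Weyl's asymptotic law (3) we know that there are positive constants $C$ and $\Lambda_0$ such that $R_\sigma(\Lambda; Q_1) \ge C \Lambda^{\sigma+1}$ holds for all $\Lambda \ge \Lambda_0$. By scaling, we deduce that
\[ R_\sigma(\Lambda; Q_{l_n}) \ge C\, l_n^2\, \Lambda^{\sigma+1} \tag{27} \]
holds for all $\Lambda \ge \Lambda_0 / l_n^2$. Fix $\Lambda > 0$. From (25) we deduce that
\[ \frac{\Lambda_0}{l_n^2} = 4\Lambda_0\, \frac{2^{2n}}{r_n^2} \le 8\Lambda_0 \Bigl( \frac{2^{2n}}{r_n^2} - \frac{1}{4r_n^2} \Bigr) \le \Lambda \]
holds if $n \le \hat n(\Lambda/(8\Lambda_0))$. Denoting $\mu = 1/(8\Lambda_0)$ we find that (27) is valid for all squares $Q_{l_n}$ with $n \le \hat n(\mu\Lambda)$. In view of (26) it follows that
\[ R_\sigma(\Lambda;\Omega_S) \ge C \sum_{n=N_0}^{\hat n(\mu\Lambda)} 2^{n+1}\, \frac{r_n - r_{n-1}}{l_n}\, l_n^2\, \Lambda^{\sigma+1} \ge C \sum_{n=N_0}^{\hat n(\mu\Lambda)} r_n (r_n - r_{n-1})\, \Lambda^{\sigma+1} \]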
and the proof is complete.

a Here and in the following the letter $C$ denotes various positive constants that are independent of $\Lambda$.

Let us state some examples to show that the bounds capture the correct order in $\Lambda$ and that choosing different sequences $(r_n)_{n\in\mathbb{N}}$ leads to different behavior in the semiclassical limit.

Corollary 3.8. Let $\sigma \ge 0$.
(1) Assume $r_n = n$. Then for $0 < \Lambda \le 15/4$ we have $R_\sigma(\Lambda;\Omega_S) = 0$ and for $\Lambda > 15/4$
\[ R_\sigma(\Lambda;\Omega_S) \le C_\sigma\, \Lambda^{\sigma+1} (\ln \Lambda)^2. \]
(2) Assume $r_n = 2^{\delta n}$ with $0 < \delta < 1$. Then for $0 < \Lambda \le 15 \cdot 2^{-2(1+\delta)}$ we have $R_\sigma(\Lambda;\Omega_S) = 0$ and for $\Lambda > 15 \cdot 2^{-2(1+\delta)}$
\[ R_\sigma(\Lambda;\Omega_S) \le C_{\sigma,\delta}\, \Lambda^{\sigma + 1/(1-\delta)}. \]
All bounds capture the correct order in $\Lambda$ as $\Lambda \to \infty$.

Proof. To prove the bounds for $\sigma \ge 3/2$ we can apply Lemma 3.6 and it remains to estimate $\hat r(\Lambda)$. By definition, $\hat r(\Lambda) = r_{\hat n(\Lambda)}$ and by (25) $r_{\hat n(\Lambda)}$ satisfies
\[ \frac{2^{2\hat n(\Lambda)}}{r_{\hat n(\Lambda)}^2} - \frac{1}{4 r_{\hat n(\Lambda)}^2} \le \Lambda. \]
It follows that $\hat r(\Lambda) \le C \ln \Lambda$ in the case $r_n = n$ and $\hat r(\Lambda) \le C_\delta\, \Lambda^{\delta/(2(1-\delta))}$ in the case $r_n = 2^{\delta n}$, and the bounds for $\sigma \ge 3/2$ follow from Lemma 3.6. To deduce the claimed estimates for $0 \le \sigma < 3/2$ we apply (11) and finally (10).

It remains to prove that the estimates are of correct order in $\Lambda$. Note that in the case $r_n = n$ the assumptions of Lemma 3.7 are satisfied with $N_0 = 1$. Hence, we have
\[ \sum_{n=N_0}^{\hat n(\mu\Lambda)} r_n (r_n - r_{n-1}) = \sum_{n=1}^{\hat n(\mu\Lambda)} n \ge C\, \hat n(\mu\Lambda)^2 = C\, \hat r(\mu\Lambda)^2. \]
In the case $r_n = 2^{\delta n}$ we find for sufficiently large $\Lambda$ that
\[ \sum_{n=N_0}^{\hat n(\mu\Lambda)} r_n (r_n - r_{n-1}) = \sum_{n=N_0}^{\hat n(\mu\Lambda)} 2^{\delta n} \bigl( 2^{\delta n} - 2^{\delta(n-1)} \bigr) \ge C \sum_{n=N_0}^{\hat n(\mu\Lambda)} 2^{2\delta n} \ge C\, 2^{2\delta \hat n(\mu\Lambda)} = C\, \hat r(\mu\Lambda)^2 \]
holds. In both cases, we insert this into Lemma 3.7 and get
\[ R_\sigma(\Lambda;\Omega_S) \ge C\, \Lambda^{\sigma+1}\, \hat r(\mu\Lambda)^2. \tag{28} \]
For $\Lambda$ large enough the relations (25) imply $\hat r(\mu\Lambda) \ge C \ln(\mu\Lambda) \ge C \ln \Lambda$ if $r_n = n$ and $\hat r(\mu\Lambda) \ge C (\mu\Lambda)^{\delta/(2(1-\delta))} \ge C \Lambda^{\delta/(2(1-\delta))}$ if $r_n = 2^{\delta n}$. As $\Lambda \to \infty$ we obtain from (28) that $R_\sigma(\Lambda;\Omega_S) = O(\Lambda^{\sigma+1} (\ln \Lambda)^2)$ in the case $r_n = n$ and $R_\sigma(\Lambda;\Omega_S) = O(\Lambda^{\sigma+1/(1-\delta)})$ in the case $r_n = 2^{\delta n}$. Thus the bounds on $R_\sigma(\Lambda;\Omega_S)$ show the correct order in $\Lambda$.
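The growth rates of $\hat r(\Lambda)$ used in this proof can be checked numerically. The following Python sketch is only an illustration and not part of the original argument. It assumes that (25) defines $\hat n(\Lambda)$ as the largest $n$ with $(2^{2n} - 1/4)/r_n^2 \le \Lambda$; the function names `n_hat`, `r_hat`, `linear` and `geometric` are hypothetical.

```python
import math


def n_hat(lam, r):
    """Largest n >= 1 with (4**n - 1/4) / r(n)**2 <= lam, or 0 if there is none.

    Assumption: this is how (25) defines n_hat. The linear scan also assumes
    that the left-hand side is increasing in n, which holds for both
    sequences defined below.
    """
    n = 0
    while (4.0 ** (n + 1) - 0.25) / r(n + 1) ** 2 <= lam:
        n += 1
    return n


def r_hat(lam, r):
    """r_hat(lam) = r(n_hat(lam)), as in the proof of Corollary 3.8."""
    return r(n_hat(lam, r))


def linear(n):
    """The sequence r_n = n."""
    return float(n)


def geometric(delta):
    """The sequence r_n = 2**(delta * n) with 0 < delta < 1."""
    return lambda n: 2.0 ** (delta * n)


if __name__ == "__main__":
    delta = 0.5
    for lam in (1e2, 1e4, 1e6, 1e8):
        ratio_log = r_hat(lam, linear) / math.log(lam)
        ratio_pow = r_hat(lam, geometric(delta)) / lam ** (delta / (2 * (1 - delta)))
        print(f"Lambda={lam:.0e}  r_hat/ln(Lambda)={ratio_log:.3f}  "
              f"r_hat/Lambda^(delta/(2(1-delta)))={ratio_pow:.3f}")
```

If the assumption on (25) is right, both printed ratios should stay bounded away from zero and infinity as $\Lambda$ grows, in line with $\hat r(\Lambda)$ being of order $\ln \Lambda$ and $\Lambda^{\delta/(2(1-\delta))}$, respectively.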
Let us state one more example, where one encounters exponential growth of the eigenvalue means.

Corollary 3.9. Assume $\sigma \ge 3/2$ and $r_n = 2^n/\sqrt{n}$. Then for $0 < \Lambda < 15/16$ we have $R_\sigma(\Lambda;\Omega_S) = 0$ and for $\Lambda > 15/16$
\[ R_\sigma(\Lambda;\Omega_S) \le C_\sigma\, 2^{2\Lambda} \Lambda^{\sigma}. \]
This bound follows from Lemma 3.6 in the same way as in Corollary 3.8.

4. Non-Constant Potentials

In this section we consider Schrödinger operators $H_\Omega$ with non-constant potentials $V \ge 0$ on open sets $\Omega \subset \mathbb{R}^d$. Since we define $H_\Omega$ with Dirichlet boundary conditions, the variational principle implies that the sharp Lieb–Thirring inequality (2) holds. In fact, the Dirichlet condition gives rise to an improvement of this bound. In this section we use this to derive sharp Lieb–Thirring inequalities with remainder term.

4.1. One-dimensional considerations

As in Sec. 3 we can apply Proposition 2.1 to reduce the problem to one dimension. However, for non-constant potentials $V$ the trace of the operator-valued potential $W(x', V)$ defined in (5) cannot be calculated explicitly. Therefore we need to study the one-dimensional situation in more detail to derive the following improvement of (2).

Theorem 4.1. Let $I \subset \mathbb{R}$ be an open interval of length $l < \infty$ and assume $\sigma \ge 3/2$ and $V \in L^{\sigma+1/2}(I)$ such that
\[ A = l \int_I V(t)\, dt < \infty. \]
Then for $A \le 2 \ln 3$ we have $R_\sigma(V; I) = 0$ and for $A > 2 \ln 3$
\[ R_\sigma(V; I) \le L^{cl}_{\sigma,1} \int_I V(t)^{\sigma+1/2}\, dt - \Bigl( \frac{2 \bigl( \int_I V(t)\, dt \bigr)^2}{\exp(A) - 1} \Bigr)^{\sigma}. \]

The remainder of Sec. 4.1 is devoted to the proof of this result. In particular, we study the effect of different boundary conditions on the eigenvalues. First we assume $I = (0, l)$ and $V \in C_0^\infty(I)$. Recall that
\[ H_I = -\frac{d^2}{dt^2} - V \]
is defined in $L^2(I)$ as the self-adjoint operator generated by the quadratic form
\[ (u, H_I u) = \int_I \bigl( |u'(t)|^2 - V(t) |u(t)|^2 \bigr)\, dt, \tag{29} \]
with form domain $H_0^1(I)$. Moreover, we define the operator
\[ H_{\mathbb{R}} = -\frac{d^2}{dt^2} - V \]
in $L^2(\mathbb{R})$ generated by the form (29) with form domain $H^1(\mathbb{R})$.

We assume that the negative spectrum of $H_I$ consists of $N$ eigenvalues $(-\lambda_k)_{k=1}^{N}$, $N \in \mathbb{N}$, and denote the negative eigenvalues of $H_{\mathbb{R}}$ by $(-\mu_k)_{k=1}^{M}$. The variational principle implies $M \ge N$ and $-\mu_k \le -\lambda_k$ for each $k = 1, \dots, N$. In order to derive relations between the eigenvalues of $H_I$ and $H_{\mathbb{R}}$ we define
\[ H_I^{(\alpha,\beta)} = -\frac{d^2}{dt^2} - V, \qquad 0 \le \alpha, \beta \le \frac{\pi}{2}, \]
as self-adjoint operators generated by the form
\[ (u, H_I^{(\alpha,\beta)} u) = \int_I |u'(t)|^2\, dt - \int_I V(t) |u(t)|^2\, dt + (\cot\alpha)\, |u(0)|^2 + (\cot\beta)\, |u(l)|^2 \]
with form domain $H^1(I)$. Note that eigenfunctions of $H_I^{(\alpha,\beta)}$ satisfy boundary conditions of the third kind: $u'(0) = (\cot\alpha)\, u(0)$ and $u'(l) = -(\cot\beta)\, u(l)$. For $\alpha, \beta \in [0, \frac{\pi}{2}]$ the negative spectrum of $H_I^{(\alpha,\beta)}$ consists of eigenvalues $(-\nu_k(\alpha,\beta))_{k=1}^{N(\alpha,\beta)}$. We point out that for $\alpha = \beta = 0$ we recover Dirichlet boundary conditions:
\[ H_I^{(0,0)} = H_I, \qquad N(0,0) = N, \qquad \text{and} \qquad (\nu_k(0,0))_{k=1}^{N(0,0)} = (\lambda_k)_{k=1}^{N}. \tag{30} \]
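The boundary conditions of the third kind stated above can be read off from the form by a standard integration by parts. The following short computation is a sketch added for the reader's convenience, assuming $0 < \alpha, \beta \le \frac{\pi}{2}$, a sufficiently regular eigenfunction $u$ with eigenvalue $-\nu$ and an arbitrary test function $v \in H^1(I)$:

```latex
\begin{align*}
0 &= \int_I u'\,\overline{v}'\,dt - \int_I (V-\nu)\,u\,\overline{v}\,dt
     + (\cot\alpha)\,u(0)\overline{v(0)} + (\cot\beta)\,u(l)\overline{v(l)} \\
  &= \int_I \bigl(-u'' - Vu + \nu u\bigr)\overline{v}\,dt
     + \bigl(u'(l) + (\cot\beta)\,u(l)\bigr)\overline{v(l)}
     - \bigl(u'(0) - (\cot\alpha)\,u(0)\bigr)\overline{v(0)} .
\end{align*}
```

Since $v$ is arbitrary, the integrand and both boundary brackets have to vanish, which gives $-u'' - Vu = -\nu u$ together with $u'(0) = (\cot\alpha)\, u(0)$ and $u'(l) = -(\cot\beta)\, u(l)$.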
We need the following result from [41] about the behavior of the eigenvalues of $H_I^{(\alpha,\beta)}$. For $\alpha \in [0, \frac{\pi}{2}]$ and $\nu > 0$ let $u(t; \nu, \alpha)$ be the unique solution of
\[ -u''(t) - V(t) u(t) = -\nu\, u(t), \quad t \in I, \qquad u(0; \nu, \alpha) = \sin\alpha, \quad u'(0; \nu, \alpha) = \cos\alpha. \tag{31} \]

Lemma 4.2. Fix $\beta \in [0, \frac{\pi}{2}]$. Then for $\alpha \in (0, \frac{\pi}{2})$ the map $\alpha \mapsto \nu_k(\alpha,\beta)$ is monotone increasing and differentiable and we have
\[ \frac{d\nu_k(\alpha,\beta)}{d\alpha} = \| u(\cdot; \nu_k(\alpha,\beta), \alpha) \|_{L^2(I)}^{-2}. \]

Because of the symmetry of the eigenvalue problem (31) a corresponding result holds for fixed $\alpha \in [0, \frac{\pi}{2}]$ and the map $\beta \mapsto \nu_k(\alpha,\beta)$, $\beta \in [0, \frac{\pi}{2}]$. For $k = 1, \dots, N$ it follows that
\[ -\nu_k(\alpha,\alpha) \le -\nu_k(0,0) = -\lambda_k < 0 \]
for all $\alpha \in [0, \frac{\pi}{2}]$. For $k = 1, \dots, N$ put
\[ \omega_k = \operatorname{arccot} \sqrt{\mu_k} \in \Bigl( 0, \frac{\pi}{2} \Bigr). \tag{32} \]
Then we have $N(\omega_k, \omega_k) \ge N$ and both $-\mu_k$ and $-\nu_k(\omega_k, \omega_k)$ exist as negative eigenvalues of $H_{\mathbb{R}}$ and $H_I^{(\omega_k,\omega_k)}$, respectively.

Proposition 4.3. For $k = 1, \dots, N$ the eigenvalues of $H_{\mathbb{R}}$ and $H_I^{(\omega_k,\omega_k)}$ satisfy
\[ -\mu_k = -\nu_k(\omega_k, \omega_k). \]

Proof. For arbitrary $k \in \{1, \dots, N\}$ let $\Phi_k$ denote the eigenfunction of $H_{\mathbb{R}}$ corresponding to $-\mu_k$. Then $\operatorname{supp} V \subset I = (0, l)$ implies
\[ \Phi_k(t) = c_1 \exp(-\sqrt{\mu_k}\, t) \quad \text{for } t \ge l \qquad \text{and} \qquad \Phi_k(t) = c_2 \exp(+\sqrt{\mu_k}\, t) \quad \text{for } t \le 0 \]
with suitable constants $c_1, c_2 \in \mathbb{R}$. From (32) it follows that $\Phi_k'(0) = (\cot\omega_k)\, \Phi_k(0)$ and $\Phi_k'(l) = -(\cot\omega_k)\, \Phi_k(l)$. Put $\tilde\Phi_k = \Phi_k|_{(0,l)}$. Since $\tilde\Phi_k$ belongs to the domain of $H_I^{(\omega_k,\omega_k)}$ we find that $-\mu_k$ is an eigenvalue of $H_I^{(\omega_k,\omega_k)}$. Note that $\Phi_k$ has $k - 1$ zeros in the interior of $I$. Therefore $\tilde\Phi_k$ has $k - 1$ zeros as well and we conclude $-\mu_k = -\nu_k(\omega_k, \omega_k)$.

Similarly to (31), let $\tilde u(t; \nu, \beta)$, $\beta \in [0, \frac{\pi}{2}]$, $\nu > 0$, be the unique solution of
\[ -\tilde u''(t) - V(t) \tilde u(t) = -\nu\, \tilde u(t), \quad t \in I, \qquad \tilde u(l; \nu, \beta) = \sin\beta, \quad \tilde u'(l; \nu, \beta) = -\cos\beta. \]
Due to the symmetry of the eigenvalue problem (31) there is a result analogous to Lemma 4.2 relating the derivative of the map $\beta \mapsto \nu_k(\alpha,\beta)$ to the $L^2$-norm of $\tilde u(\cdot; \nu_k(\alpha,\beta), \beta)$. In view of (30) and Proposition 4.3 we have
\[ \mu_k - \lambda_k = \nu_k(\omega_k, \omega_k) - \nu_k(0,0) = \nu_k(\omega_k, \omega_k) - \nu_k(0, \omega_k) + \nu_k(0, \omega_k) - \nu_k(0,0). \]
Hence, applying Lemma 4.2 and its analog for the map $\beta \mapsto \nu_k(0, \beta)$ yields
\[ \mu_k - \lambda_k = \int_0^{\omega_k} \| u(\cdot; \nu_k(\alpha, \omega_k), \alpha) \|_{L^2(I)}^{-2}\, d\alpha + \int_0^{\omega_k} \| \tilde u(\cdot; \nu_k(0, \beta), \beta) \|_{L^2(I)}^{-2}\, d\beta \tag{33} \]
for $k = 1, \dots, N$. In the remainder of this subsection we use this identity to complete the proof of Theorem 4.1. In order to get a result valid without further assumptions on the potential $V$ we have to restrict ourselves to considering the ground states.
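Lemma 4.4. Let $I \subset \mathbb{R}$ be an open interval of length $l$ and $V \in C_0^\infty(I)$. Then the inequality
\[ \mu_1 - \lambda_1 \ge \frac{2 \bigl( \int_I V(t)\, dt \bigr)^2}{\exp\bigl( l \int_I V(t)\, dt \bigr) - 1} \]
holds. Moreover, if $l \int_I V(t)\, dt \le 2 \ln 3$ then $-\lambda_1 \ge 0$ and we have $R_\sigma(V; I) = 0$ for $\sigma \ge 0$.

Proof. First we remark that it suffices to prove the result for $I = (0, l)$. To apply (33) we have to analyze the functions $u(\cdot; \nu_1(\alpha, \omega_1), \alpha)$ and $\tilde u(\cdot; \nu_1(0, \beta), \beta)$ for $0 < \alpha, \beta < \omega_1$. By definition, the function $u$ is the first eigenfunction of $H_I^{(\alpha,\omega_1)}$, thus it is non-negative on $I$. As a solution of (31), $u$ solves the integral equation
\[ u(t; \nu, \alpha) = (\sin\alpha)\, \frac12 \bigl( e^{\sqrt{\nu} t} + e^{-\sqrt{\nu} t} \bigr) + (\cos\alpha)\, \frac{e^{\sqrt{\nu} t} - e^{-\sqrt{\nu} t}}{2\sqrt{\nu}} - \int_0^t \frac{\sinh(\sqrt{\nu}(t - s))}{\sqrt{\nu}}\, V(s)\, u(s; \nu, \alpha)\, ds. \tag{34} \]
The first two summands are non-decreasing in $\nu > 0$. For $\alpha \in [0, \omega_1]$, Lemma 4.2 and Proposition 4.3 imply $\nu_1(\alpha, \omega_1) \le \mu_1$. Since the integrand in (34) is positive it follows that
\begin{align*}
u(t; \nu_1(\alpha, \omega_1), \alpha) &\le (\sin\alpha)\, \frac12 \bigl( e^{\sqrt{\mu_1} t} + e^{-\sqrt{\mu_1} t} \bigr) + (\cos\alpha)\, \frac{e^{\sqrt{\mu_1} t} - e^{-\sqrt{\mu_1} t}}{2\sqrt{\mu_1}} \\
&= \frac12\, e^{\sqrt{\mu_1} t} \Bigl( \sin\alpha + \frac{\cos\alpha}{\sqrt{\mu_1}} \Bigr) + \frac12\, e^{-\sqrt{\mu_1} t} \Bigl( \sin\alpha - \frac{\cos\alpha}{\sqrt{\mu_1}} \Bigr).
\end{align*}
Now we use that $\sin\alpha - \cos\alpha/\sqrt{\mu_1} \le 0$ holds for $\alpha \in [0, \omega_1]$ and conclude
\[ 0 < u(t; \nu_1(\alpha, \omega_1), \alpha) \le \frac12\, e^{\sqrt{\mu_1} t} \Bigl( \sin\alpha + \frac{\cos\alpha}{\sqrt{\mu_1}} \Bigr). \]
By explicit calculations it follows that
\[ \int_0^{\omega_1} \| u(\cdot; \nu_1(\alpha, \omega_1), \alpha) \|_{L^2(I)}^{-2}\, d\alpha \ge \frac{4\mu_1}{\exp(2l\sqrt{\mu_1}) - 1}. \]
Similarly, we find
\[ \int_0^{\omega_1} \| \tilde u(\cdot; \nu_1(0, \beta), \beta) \|_{L^2(I)}^{-2}\, d\beta \ge \frac{4\mu_1}{\exp(2l\sqrt{\mu_1}) - 1} \]
and (33) implies
\[ \mu_1 - \lambda_1 \ge \frac{8\mu_1}{\exp(2l\sqrt{\mu_1}) - 1}. \tag{35} \]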
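The "explicit calculations" behind the first of the two integral bounds can be spelled out as follows. This is a reconstruction from the pointwise bound on $u$ derived above and from $\cot\omega_1 = \sqrt{\mu_1}$, see (32); it is meant as a sketch and not as a quotation of the authors' computation.

```latex
\begin{align*}
\|u(\cdot;\nu_1(\alpha,\omega_1),\alpha)\|_{L^2(I)}^{2}
 &\le \frac14\Bigl(\sin\alpha+\frac{\cos\alpha}{\sqrt{\mu_1}}\Bigr)^{2}\int_0^l e^{2\sqrt{\mu_1}\,t}\,dt
  = \frac{e^{2l\sqrt{\mu_1}}-1}{8\sqrt{\mu_1}}\Bigl(\sin\alpha+\frac{\cos\alpha}{\sqrt{\mu_1}}\Bigr)^{2},\\
\sin\alpha+\frac{\cos\alpha}{\sqrt{\mu_1}}
 &= \sin\alpha+\tan\omega_1\cos\alpha=\frac{\sin(\alpha+\omega_1)}{\cos\omega_1},\\
\int_0^{\omega_1}\frac{\cos^{2}\omega_1}{\sin^{2}(\alpha+\omega_1)}\,d\alpha
 &= \cos^{2}\omega_1\,\bigl(\cot\omega_1-\cot 2\omega_1\bigr)
  = \frac{\cos^{2}\omega_1}{\sin 2\omega_1}=\frac{\cot\omega_1}{2}=\frac{\sqrt{\mu_1}}{2},\\
\int_0^{\omega_1}\|u(\cdot;\nu_1(\alpha,\omega_1),\alpha)\|_{L^2(I)}^{-2}\,d\alpha
 &\ge \frac{8\sqrt{\mu_1}}{e^{2l\sqrt{\mu_1}}-1}\cdot\frac{\sqrt{\mu_1}}{2}
  = \frac{4\mu_1}{e^{2l\sqrt{\mu_1}}-1}.
\end{align*}
```

The third line uses the elementary identity $\cot\omega - \cot 2\omega = 1/\sin 2\omega$.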
For $l\sqrt{\mu_1} \le \ln 3$ it follows that $-\lambda_1 \ge 0$. Since the right-hand side of (35) is nonincreasing, the estimate [14]
\[ \sqrt{\mu_1} \le \frac12 \int_I V(t)\, dt \]
implies the claimed result.

The proof of Theorem 4.1 is an immediate consequence of the results above:

Proof of Theorem 4.1. Using convexity of the map $\lambda \mapsto \lambda^\sigma$ and the Lieb–Thirring inequality (2) we estimate
\[ R_\sigma(V; I) = \sum_{k=1}^{N} \lambda_k^\sigma \le \sum_{k=1}^{N} \mu_k^\sigma - (\mu_1 - \lambda_1)^\sigma \le L^{cl}_{\sigma,1} \int_I V(t)^{\sigma+1/2}\, dt - (\mu_1 - \lambda_1)^\sigma. \]
Hence, for $V \in C_0^\infty(I)$ the claim follows from Lemma 4.4. A standard approximation argument allows us to prove the claim for all non-negative potentials $V \in L^{\sigma+1/2}(I)$.
4.2. A sharp Lieb–Thirring inequality with remainder term

Let us now consider general Schrödinger operators $H_\Omega$ on bounded or quasibounded open sets $\Omega \subset \mathbb{R}^d$ with Dirichlet boundary conditions. To apply the inductive argument introduced in Sec. 2, fix a coordinate system in $\mathbb{R}^d$. For $x \in \Omega$ we write $x = (x', t) \in \mathbb{R}^{d-1} \times \mathbb{R}$ and assume that $V_{x'} \in L^{\sigma+d/2}(\Omega(x'))$, a.e. in $x' \in \mathbb{R}^{d-1}$. We use the notation introduced in Sec. 2 and put
\[ A_k(x') = |J_k(x')| \int_{J_k(x')} V_{x'}(t)\, dt, \qquad B_k(x') = \int_{J_k(x')} V_{x'}(t)\, dt. \]
Let $\kappa(x', V) \subset \mathbb{N}$ be the subset of all indices $k$ with $A_k(x') > 2 \ln 3$ and put
\[ \Omega_V(x') = \bigcup_{k \in \kappa(x', V)} J_k(x') \subset \mathbb{R} \qquad \text{and} \qquad \Omega_V = \bigcup_{x' \in \mathbb{R}^{d-1}} \{x'\} \times \Omega_V(x') \subset \Omega. \]
The results from Secs. 2 and 4.1 imply the following sharp Lieb–Thirring inequality with remainder term.

Theorem 4.5. Let $\Omega$ be an open set in $\mathbb{R}^d$, $d \ge 2$, and assume $\sigma \ge 3/2$. Then the estimate
\[ R_\sigma(V; \Omega) \le L^{cl}_{\sigma,d} \int_{\Omega_V} V(x)^{\sigma+d/2}\, dx - L^{cl}_{\sigma,d-1} \int_{\mathbb{R}^{d-1}} \rho(x', V)\, dx' \]
holds with a remainder
\[ \rho(x', V) = \sum_{k \in \kappa(x', V)} \Bigl( \frac{2 B_k(x')^2}{\exp(A_k(x')) - 1} \Bigr)^{\sigma+(d-1)/2}. \]
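Proof. In view of Proposition 2.1 we have to estimate
\[ \operatorname{Tr} W(x', V)^{\sigma+(d-1)/2} = \sum_{k=1}^{N(x')} \operatorname{Tr} H_k(x')_-^{\sigma+(d-1)/2} = \sum_{k=1}^{N(x')} R_{\sigma+(d-1)/2}(V_{x'}; J_k(x')). \]
The potential $V_{x'}$ satisfies the conditions of Theorem 4.1, a.e. in $x' \in \mathbb{R}^{d-1}$. For $k \notin \kappa(x', V)$ we have $|J_k(x')| \int_{J_k(x')} V_{x'}\, dt \le 2 \ln 3$ and Theorem 4.1 yields $\operatorname{Tr} H_k(x')_- = 0$. Hence,
\begin{align*}
\operatorname{Tr} W(x', V)^{\sigma+(d-1)/2} &= \sum_{k \in \kappa(x', V)} R_{\sigma+(d-1)/2}(V_{x'}; J_k(x')) \\
&\le \sum_{k \in \kappa(x', V)} \Bigl[ L^{cl}_{\sigma+(d-1)/2,1} \int_{J_k(x')} V_{x'}(t)^{\sigma+d/2}\, dt - \Bigl( \frac{2 B_k(x')^2}{\exp(A_k(x')) - 1} \Bigr)^{\sigma+(d-1)/2} \Bigr].
\end{align*}
Thus the claim follows from Proposition 2.1 using the identities
\[ \int_{\mathbb{R}^{d-1}} \sum_{k \in \kappa(x', V)} \int_{J_k(x')} V_{x'}(t)^{\sigma+d/2}\, dt\, dx' = \int_{\Omega_V} V(x)^{\sigma+d/2}\, dx \]
and $L^{cl}_{\sigma,d-1}\, L^{cl}_{\sigma+(d-1)/2,1} = L^{cl}_{\sigma,d}$.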
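The product identity for the classical constants used in the last step can be checked numerically. The sketch below assumes the standard normalization $L^{cl}_{\sigma,d} = \Gamma(\sigma+1) / ((4\pi)^{d/2}\, \Gamma(\sigma+1+d/2))$, which is consistent with the identity $4\pi(\sigma+1) L^{cl}_{\sigma,2} = 1$ quoted earlier; the function names `L_cl` and `lifting_defect` are hypothetical.

```python
import math


def L_cl(sigma, d):
    """Classical constant Gamma(sigma+1) / ((4*pi)**(d/2) * Gamma(sigma+1+d/2)).

    Assumption: this standard formula is the normalization used in the text.
    """
    return math.gamma(sigma + 1) / ((4 * math.pi) ** (d / 2) * math.gamma(sigma + 1 + d / 2))


def lifting_defect(sigma, d):
    """Relative error in L_cl(sigma, d-1) * L_cl(sigma + (d-1)/2, 1) = L_cl(sigma, d)."""
    lhs = L_cl(sigma, d - 1) * L_cl(sigma + (d - 1) / 2, 1)
    rhs = L_cl(sigma, d)
    return abs(lhs - rhs) / rhs


if __name__ == "__main__":
    for sigma, d in [(1.5, 2), (1.5, 3), (2.0, 4)]:
        print(sigma, d, lifting_defect(sigma, d))
```

Under this normalization the identity is exact, since the factor $\Gamma(\sigma + 1 + (d-1)/2)$ cancels between the two constants on the left-hand side, so the reported defects should be at the level of rounding errors.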
4.3. An example with $V \notin L^{\sigma+d/2}$

Let us illustrate Theorem 4.5 by an example of a Schrödinger operator defined on a horn-shaped region with a potential such that the classical Lieb–Thirring inequality (2) fails. As in Sec. 3.1 set $\Omega_1 = \{ (x, y) \in \mathbb{R}^2 : |x| \cdot |y| < 1 \}$ and put $V_\alpha(x, y) = |x|^{\alpha} |y|^{-\alpha}$ with $\alpha > 0$. Again, we introduce a scaling parameter $\lambda > 0$ and study the operator
\[ H_\alpha = -\Delta - \lambda V_\alpha, \]
defined in $L^2(\Omega_1)$ with Dirichlet boundary conditions. Since $V_\alpha \notin L^{\sigma+1}(\Omega_1)$ the classical results (2) and (1) fail and it is not clear whether the spectrum of $H_\alpha$ is discrete and whether eigenvalue means are bounded in terms of the potential $\lambda V_\alpha$. While our methods do not answer this question in general, we can apply Theorem 4.5 to get upper bounds on $R_\sigma(\lambda V_\alpha; \Omega_1)$ for $0 < \alpha < 2/5$ and $3/2 \le \sigma < (1 - \alpha)/\alpha$. Indeed, for any $x \in \mathbb{R}$, $x \ne 0$, the section $\Omega_1(x)$ consists of one open interval $(-|x|^{-1}, |x|^{-1})$ and
\[ A_1(x) = \frac{4}{|x|} \int_0^{|x|^{-1}} \lambda |x|^{\alpha} |y|^{-\alpha}\, dy = \frac{4\lambda}{1 - \alpha}\, |x|^{2(\alpha - 1)}. \]
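Since $\alpha < 1$ we find that $A_1(x)$ tends to zero as $|x|$ tends to infinity. Thus $A_1(x) \le 2 \ln 3$ holds for
\[ |x| \ge \Bigl( \frac{2\lambda}{(1 - \alpha) \ln 3} \Bigr)^{1/(2 - 2\alpha)} = x_\alpha(\lambda). \]
From Theorem 4.5 it follows that for $\sigma \ge 3/2$
\[ R_\sigma(\lambda V_\alpha; \Omega_1) \le 4 L^{cl}_{\sigma,2} \int_0^{x_\alpha(\lambda)} \int_0^{x^{-1}} x^{\alpha(\sigma+1)}\, y^{-\alpha(\sigma+1)}\, dy\, dx\; \lambda^{\sigma+1}. \]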
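The threshold $x_\alpha(\lambda)$ and the double integral above can be sanity-checked with a few lines of Python. This is a minimal sketch, not the authors' computation: the function names are hypothetical, and the closed form in `truncated_integral` comes from integrating first in $y$ and then in $x$ under the assumption $\gamma = \alpha(\sigma+1) < 1$.

```python
import math


def A1(x, lam, alpha):
    """A_1(x) = 4*lam/(1-alpha) * |x|**(2*(alpha-1)) for the section Omega_1(x)."""
    return 4.0 * lam / (1.0 - alpha) * abs(x) ** (2.0 * (alpha - 1.0))


def x_alpha(lam, alpha):
    """Threshold beyond which A_1(x) <= 2 ln 3, so the section drops out of Omega_V."""
    return (2.0 * lam / ((1.0 - alpha) * math.log(3.0))) ** (1.0 / (2.0 - 2.0 * alpha))


def truncated_integral(lam, alpha, sigma):
    """Closed form of int_0^{x_alpha} int_0^{1/x} x**g * y**(-g) dy dx, g = alpha*(sigma+1).

    The inner integral equals x**(g-1)/(1-g); the outer one then gives
    x_alpha**(2*g) / (2*g*(1-g)). Both steps need 0 < g < 1.
    """
    g = alpha * (sigma + 1.0)
    if not 0.0 < g < 1.0:
        raise ValueError("need 0 < alpha*(sigma+1) < 1 for a finite integral")
    return x_alpha(lam, alpha) ** (2.0 * g) / (2.0 * g * (1.0 - g))


if __name__ == "__main__":
    lam, alpha, sigma = 2.0, 0.3, 1.5
    print(A1(x_alpha(lam, alpha), lam, alpha), 2.0 * math.log(3.0))
    print(truncated_integral(lam, alpha, sigma))
```

The first printed pair should agree, since $x_\alpha(\lambda)$ is defined by $A_1(x_\alpha(\lambda)) = 2 \ln 3$, and doubling $\lambda$ multiplies the integral by $2^{\alpha(\sigma+1)/(1-\alpha)}$.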
Note that the right-hand side of this inequality is finite if $\sigma < (1 - \alpha)/\alpha$. Thus the estimate
\[ R_\sigma(\lambda V_\alpha; \Omega_1) \le L^{cl}_{\sigma,2}\, \frac{4}{2\alpha(\sigma+1)(1 - \alpha(\sigma+1))} \Bigl( \frac{2}{(1 - \alpha) \ln 3} \Bigr)^{\alpha(\sigma+1)/(1-\alpha)} \lambda^{(\sigma+1)/(1-\alpha)} \]
holds for $3/2 \le \sigma < (1 - \alpha)/\alpha$ and for all $\lambda > 0$.

References

[1] R. A. Adams, Capacity and compact imbeddings, J. Math. Mech. 19 (1970) 923–929.
[2] R. A. Adams and J. F. Fournier, Sobolev Spaces, 2nd edn. (Academic Press, 2003).
[3] M. Aizenman and E. H. Lieb, On semi-classical bounds for eigenvalues of Schrödinger operators, Phys. Lett. 66 (1978) 427–429.
[4] F. A. Berezin, Covariant and contravariant symbols of operators, Izv. Akad. Nauk SSSR Ser. Mat. 13 (1972) 1134–1167.
[5] M. Sh. Birman and M. Z. Solomjak, Spectral Theory of Selfadjoint Operators in Hilbert Space, Mathematics and its Applications (Soviet Series) (D. Reidel Publishing Co., Dordrecht, 1987).
[6] R. Courant and D. Hilbert, Methoden der Mathematischen Physik (Springer, Berlin, 1924).
[7] C. Clark, Rellich's embedding theorem for a 'spiny urchin', Canad. Math. Bull 10 (1967) 731–734.
[8] E. B. Davies and B. Simon, Spectral properties of the Neumann Laplacian of horns, Geom. Funct. Anal. 2 (1992) 105–117.
[9] R. L. Frank and L. Geisinger, Two-term spectral asymptotics of the Dirichlet Laplacian on a bounded domain, in Mathematical Results in Quantum Physics: Proceedings of the Qmath11 Conference, ed. Pavel Exner (World Scientific Publishing Company, 2011), pp. 138–147.
[10] J. Fleckinger, Répartition des valeurs propres d'opérateurs elliptiques sur des ouverts non bornés, C. R. Acad. Sci. Paris Sér. A 286(3) (1978) 149–152.
[11] J. K. Freericks, E. H. Lieb and D. Ueltschi, Segregation in the Falicov–Kimball model, Comm. Math. Phys. 227(2) (2002) 243–279.
[12] L. Geisinger, A. Laptev and T. Weidl, Geometrical versions of improved Berezin–Li–Yau inequalities, J. Spectral Theory 1(1) (2011) 87–109.
[13] L. Geisinger and T. Weidl, Universal bounds for traces of the Dirichlet Laplace operator, J. Lond. Math. Soc. 82(2) (2010) 395–419.
[14] D. Hundertmark, E. H. Lieb and L. E. Thomas, A sharp bound for an eigenvalue moment of the one-dimensional Schrödinger operator, Adv. Theor. Math. Phys. 2 (1998) 719–731.
[15] L. Hörmander, The Analysis of Linear Partial Differential Operators, Vol. 4 (Springer-Verlag, Berlin, 1985).
[16] V. Ja. Ivrii, The second term of the spectral asymptotics for the Laplace–Beltrami operator on manifolds with boundary, Funktsional. Anal. i Prilozhen. 14(2) (1980) 25–34.
[17] V. Ja. Ivrii, The second term of the spectral asymptotics for the Laplace–Beltrami operator on manifolds with boundary and for elliptic operators acting in vector bundles, Soviet Math. Dokl. 20(1) (1980) 1300–1302.
[18] V. Ja. Ivrii, Microlocal Analysis and Precise Spectral Asymptotics, Springer Monographs in Mathematics (Springer-Verlag, Berlin, 1998).
[19] H. Kovařík, S. Vugalter and T. Weidl, Two dimensional Berezin–Li–Yau inequalities with a correction term, Comm. Math. Phys. 287(3) (2009) 959–981.
[20] A. Laptev, Dirichlet and Neumann eigenvalue problems on domains in Euclidean spaces, J. Funct. Anal. 151(2) (1997) 531–545.
[21] M. Lianantonakis, On the eigenvalue counting function for weighted Laplace–Beltrami operators, J. Geom. Anal. 10 (2000) 299–322.
[22] E. H. Lieb, The classical limit of quantum spin systems, Comm. Math. Phys. 31 (1973) 327–340.
[23] E. H. Lieb, The Stability of Matter: From Atoms to Stars, Selecta of Elliott H. Lieb, ed. W. Thirring (Springer, 1997).
[24] E. H. Lieb and R. Seiringer, The Stability of Matter in Quantum Mechanics (Cambridge University Press, Cambridge, 2010).
[25] E. H. Lieb and W. Thirring, Inequalities for the moments of the eigenvalues of the Schrödinger Hamiltonian and their relation to Sobolev inequalities, in Studies in Math. Phys., Essays in Honor of Valentine Bargmann, eds. E. Lieb, B. Simon and A. S. Wightman (Princeton Univ. Press, Princeton, New Jersey, 1976), pp. 269–330.
[26] D. Lundholm, Weighted supermembrane toy model, Lett. Math. Phys. 92(2) (2010) 125–141.
[27] A. Laptev and T. Weidl, Sharp Lieb–Thirring inequalities in high dimensions, Acta Math. 184(1) (2000) 87–111.
[28] P. Li and S. T. Yau, On the Schrödinger equation and the eigenvalue problem, Comm. Math. Phys. 88(3) (1983) 309–318.
[29] A. D. Melás, A lower bound for sums of eigenvalues of the Laplacian, Amer. Math. Soc. 131 (2003) 631–636.
[30] S. G. Matinyan and B. Müller, Adventures of the coupled Yang–Mills oscillators. I. Semiclassical expansion, J. Phys. A 39(1) (2006) 45–59.
[31] G. V. Rozenbljum, On the distribution of eigenvalues of the first boundary problem in unbounded domains, Dokl. Akad. Nauk SSSR 200 (1971) 1539–1542.
[32] G. V. Rozenbljum, On the eigenvalues of the first boundary value problem in unbounded domains, Math. USSR-Sb. 18 (1972) 235–248.
[33] G. V. Rozenbljum, The computation of the spectral asymptotics for the Laplace operator in domains of infinite volume, J. Sov. Math. 6 (1976) 64–71.
[34] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV. Analysis of Operators (Academic Press, 1978).
[35] B. Simon, Non-classical eigenvalue asymptotics, J. Funct. Anal. 53 (1983) 84–98.
[36] Y. Safarov and D. Vassiliev, The Asymptotic Distribution of Eigenvalues of Partial Differential Operators, Translations of Mathematical Monographs, Vol. 155 (American Mathematical Society, Providence, RI, 1997).
[37] M. van den Berg, On the spectrum of the Dirichlet Laplacian for horn-shaped regions in R^n with infinite volume, J. Funct. Anal. 58 (1984) 150–156.
[38] M. van den Berg, Dirichlet–Neumann bracketing for horn-shaped regions, J. Funct. Anal. 104(1) (1992) 110–120.
[39] M. van den Berg, On the spectral counting function for the Dirichlet Laplacian, J. Funct. Anal. 107(2) (1992) 352–361.
[40] M. van den Berg and M. Lianantonakis, Asymptotics for the spectrum of the Dirichlet Laplacian on horn-shaped regions, Indiana Univ. Math. J. 50 (2001) 299–333.
[41] J. Weidmann, Lineare Operatoren in Hilberträumen. Teil II: Anwendungen (B. G. Teubner, Stuttgart, 2003).
[42] T. Weidl, Improved Berezin–Li–Yau inequalities with a remainder term, Amer. Math. Soc. Transl. 225(2) (2008) 253–263.
[43] H. Weyl, Das asymptotische Verteilungsgesetz der Eigenwerte linearer partieller Differentialgleichungen (mit einer Anwendung auf die Theorie der Hohlraumstrahlung), Math. Ann. 71(4) (1912) 441–479.
July 6, J070-S0129055X11004382
2011 11:3 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 6 (2011) 643–667 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004382
SIMPLICITY OF EXTREMAL EIGENVALUES OF THE KLEIN–GORDON EQUATION
MARIO KOPPEN∗,§, CHRISTIANE TRETTER†,¶ and MONIKA WINKLMEIER‡,
∗Mathematics Centre, Technical University of Munich, Boltzmannstr. 3, 85748 Garching, Germany
†Institute of Mathematics, University of Bern, Sidlerstr. 5, 3012 Bern, Switzerland
‡Department of Mathematics, Universidad de Los Andes, Cra. 1a No 18A-70, A.A. 4976 Bogotá, Colombia
§[email protected]
¶[email protected]
[email protected]
Received 2 August 2010
Revised 11 May 2011
Dedicated to Professor Heinz Langer on the occasion of his 75th birthday

We consider the spectral problem associated with the Klein–Gordon equation for unbounded electric potentials such that the spectrum is contained in two disjoint real intervals related to positive and negative energies, respectively. If the two inner boundary points are eigenvalues, we show that these extremal eigenvalues are simple and possess strictly positive eigenfunctions. Examples of electric potentials satisfying these assumptions are given.

Keywords: Klein–Gordon equation; eigenvalue; ground state energy; operator polynomial.

Mathematics Subject Classification 2010: 81Q05, 47A75, 81Q10, 47A56
1. Introduction

The motion of a spin-0 particle with mass $m > 0$ and electric charge $e > 0$ in an exterior electromagnetic field in $n$ spatial dimensions is described by the time-dependent Klein–Gordon equation. With the physical units chosen such that $c = \hbar = 1$, the Klein–Gordon equation takes the form
\[ \Bigl( \Bigl( \frac{\partial}{\partial t} - ieq \Bigr)^2 + \sum_{j=1}^{n} \Bigl( -i \frac{\partial}{\partial x_j} - eA_j \Bigr)^2 + m^2 \Bigr) u = 0 \tag{1.1} \]
in $L^2(\mathbb{R}^n)$ where the electric potential $q$ and the components $A_j$ of the vector potential are real-valued functions. For the corresponding Cauchy problem, solutions $u(\cdot, t) \in L^2(\mathbb{R}^n)$, $t \in \mathbb{R}$, are subject to an initial condition $u(\cdot, 0) = u_0$ with $u_0 \in L^2(\mathbb{R}^n)$. The solvability of this Cauchy problem is closely related to the spectral problem
\[ (H_0 - (\lambda - V)^2) u = 0, \tag{1.2} \]
formally obtained from the Klein–Gordon equation (1.1) by means of the ansatz $u(\cdot, t) = e^{i\lambda t} u$. Here $H_0$ is a self-adjoint realization in $L^2(\mathbb{R}^n)$ of the formal second order differential expression
\[ \sum_{j=1}^{n} \Bigl( -i \frac{\partial}{\partial x_j} - eA_j \Bigr)^2 + m^2 \tag{1.3} \]
and $V$ is the multiplication operator by $eq$ in the Hilbert space $L^2(\mathbb{R}^n)$ with maximal domain, which is in general unbounded. Note that in (1.2) and (1.3), suitable assumptions on $V$ and $A_j$ are required to properly define the sums of operators, at least in the sense of quadratic forms.

For $V = 0$, a point $\lambda$ belongs to the spectrum of the eigenvalue problem (1.2) if and only if $\lambda^2$ belongs to the spectrum of $H_0$. Since the latter is continuous and given by $\sigma(H_0) = \sigma_{ess}(H_0) = [m^2, \infty)$, the spectrum of (1.2) is continuous and consists of the two disjoint intervals $(-\infty, -m] \mathbin{\dot\cup} [m, \infty)$. For bounded $V$ with $\|V\| < m$, classical perturbation arguments show that the spectrum of (1.2) is contained in the two disjoint intervals
\[ (-\infty, -m + \|V\|] \mathbin{\dot\cup} [m - \|V\|, \infty). \tag{1.4} \]
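A toy case makes the enclosure (1.4) concrete. The following Python sketch is only an illustration under a strong simplifying assumption that is not made in the text: $V$ is taken to be a constant multiple $c$ of the identity with $|c| < m$ and $A = 0$, so that (1.2) reads $H_0 u = (\lambda - c)^2 u$ and the spectrum can be written down by hand. The function names are hypothetical.

```python
def in_spectrum_constant_V(lam, m, c):
    """Toy case V = c * identity: lam is in the spectrum of (1.2) exactly when
    (lam - c)**2 lies in sigma(H0) = [m**2, infinity)."""
    return (lam - c) ** 2 >= m ** 2


def in_enclosure_1_4(lam, m, norm_V):
    """Membership in (-inf, -m + ||V||] union [m - ||V||, inf), the set in (1.4)."""
    return lam <= -m + norm_V or lam >= m - norm_V


if __name__ == "__main__":
    m, c = 1.0, 0.4
    # Sample real lam on a grid and check that every spectral point is enclosed.
    grid = [k / 7 for k in range(-35, 36)]
    enclosed = all(in_enclosure_1_4(lam, m, abs(c))
                   for lam in grid if in_spectrum_constant_V(lam, m, c))
    print("spectrum inside the enclosure (1.4):", enclosed)
```

In this toy case the spectrum is $(-\infty, c - m] \cup [c + m, \infty)$, which is contained in the set (1.4) because $\|V\| = |c|$; the enclosure is attained on one side and not sharp on the other.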
For bounded $V$ with $\|V\| > m$, it was observed already in the 1940s that (1.2) may have non-real spectrum (see [29]); the latter is related to the so-called Klein paradox. On the other hand, even for unbounded $V$, the spectrum of (1.2) may remain real and retain a spectral gap (see [17]); e.g. if $V$ is $H_0^{1/2}$-bounded with $\|V H_0^{-1/2}\| < 1$, then the spectrum of (1.2) is contained in the two disjoint intervals
\[ (-\infty, -m + \|V H_0^{-1/2}\|\, m] \mathbin{\dot\cup} [m - \|V H_0^{-1/2}\|\, m, \infty). \tag{1.5} \]
The aim of this paper is to investigate if, in the case when eigenvalues bound the spectral gap, they are simple. In analogy with the terminology for Schrödinger operators, we call these extremal eigenvalues ground state energies. For bounded $V$, the simplicity of the ground state energies was proved in [22]. For relatively bounded $V$, the eigenvalues in the gap of the essential spectrum were studied and estimated by means of variational principles in [20], but their multiplicities were not investigated. In [22], as well as in [17, 18], the Klein–Gordon equation or, equivalently, the spectral problem (1.2) was linearized to obtain a first order system of differential equations or a spectral problem that depends linearly on the eigenvalue parameter $\lambda$, respectively.

For the purpose of the present paper, the original form (1.2) of the Klein–Gordon spectral problem used in [20] is more advantageous for two reasons. Firstly, the operators $T(\lambda)$ associated with the left-hand side of (1.2) depend quadratically on the spectral parameter $\lambda$. This allows us to use the theory of strongly damped quadratic operator polynomials to ensure the reality of the spectrum due to the existence of a spectral gap and to prove that all the eigenvalues are semi-simple, i.e. there are no associated vectors. Secondly, for fixed $\lambda$ in the spectral gap, the operator $T(\lambda)$ is a semi-bounded perturbation of the free Schrödinger operator $H_0 = -\Delta + m^2$. Therefore we may apply a Krein–Rutman type theorem to establish the simplicity of the ground state energies of the Klein–Gordon spectral problem (1.2).

The theory of self-adjoint quadratic operator polynomials in a Hilbert space $H$,
\[ L(\lambda) = \lambda^2 A + \lambda B + C, \qquad \lambda \in \mathbb{C}, \]
with $A = I$, bounded $B$, and compact positive $C$, was developed in the seminal work [8] of Krein and Langer (the results of which first appeared in [7] and were translated in [9, 10]). For non-compact $C$, this theory was further developed by Langer in [14] under the assumption that the pencil $L$ is strongly damped. This notion goes back to Duffin (see [3]) and means that
\[ (Bx, x)^2 > 4 (Ax, x)(Cx, x), \qquad x \in H \setminus \{0\}; \]
as a consequence, each quadratic equation $(L(\lambda)x, x) = 0$ has two different real solutions $p_-(x) < p_+(x)$, and the two (real) root zones $\Delta_\pm := \{ p_\pm(x) : x \in H \setminus \{0\} \}$ are disjoint and may have at most a common boundary point. It is well known that the spectrum of a strongly damped operator polynomial is real and all Jordan chains have length 1 (see e.g. [21, §31], where the term "hyperbolic" is used instead of strongly damped). The case of strongly damped operator pencils with unbounded coefficients was studied by Shkalikov (see [27]; here the two root zones $\Delta_\pm$ may touch each other also at $\infty$, as it is the case for the Klein–Gordon pencil $T$, see (1.6) below).

An operator version of the well-known Perron–Frobenius theorem for matrices with positive entries was established by Krein and Rutman (see [11, 12]). In the original version, if $(M, d\mu)$ is a measure space and $T$ is an operator in $L^2(M, d\mu)$ such that, for some integer $k$, the operator $T^k$ is positivity improving, i.e. $T^k f$ is positive for every non-negative $f \in L^2(M, d\mu) \setminus \{0\}$, then the largest eigenvalue in modulus of $T$ is real, positive and simple, with a strictly positive eigenfunction. There are numerous generalizations and extensions of the Krein–Rutman theorem. Here we apply a corresponding result by Faris for bounded self-adjoint operators in $L^2(M, d\mu)$, which applies to the resolvent or the semi-group generated by semi-bounded Schrödinger operators (see [4]).

The paper is organized as follows. In Sec. 2 we consider an abstract form of the Klein–Gordon spectral problem (1.2), where $H_0$ is assumed to be a uniformly positive operator in a Hilbert space $H$ and $V$ is a symmetric perturbation which is
$H_0$-form-bounded with $H_0$-form bound $< 1$; the latter implies, in particular, that $D(H_0^{1/2}) \subset D(V)$. Under these assumptions, with the left-hand side of (1.2) we can associate a self-adjoint operator
\[ T(\lambda) := H_0 \mathbin{\dot{-}} V^2 + 2\lambda V - \lambda^2, \qquad \lambda \in \mathbb{C}, \tag{1.6} \]
where $H_0 \mathbin{\dot{-}} V^2$ denotes the operator form sum of $H_0$ and $-V^2$. In Sec. 3 we study the quadratic operator polynomial $T$ given by (1.6) by means of the associated quadratic form $t(\lambda)$ defined on $D(t(\lambda)) = D(H_0^{1/2}) \subset D(V)$. If the two root zones $\Lambda_\pm$ consisting of all zeros $p_\pm(x)$ of $t(\lambda)[x] = 0$ for $x \in D(H_0^{1/2})$ are separated by a gap, then $T$ is strongly damped and hence all eigenvalues of $T$ are semi-simple, i.e. there are no associated vectors. In Sec. 4 we establish explicit conditions on the operator $V$ in (1.6) guaranteeing that the two root zones are contained in two disjoint intervals separated by a gap, such as in (1.5). In Sec. 5 we specialize to the Klein–Gordon equation (1.2) in $L^2(\mathbb{R}^n)$ where $H_0 = -\Delta + m^2$. We show that, for $\lambda$ in the gap between the two zones $\Lambda_\pm$, the corresponding operator $T(\lambda)$ is positivity improving, so that a Krein–Rutman type theorem applies. This yields that the ground state energies of the Klein–Gordon equation are simple. Finally, in Sec. 6 we show how our results apply to some concrete potentials, including Coulomb-like and Rollnik potentials with vanishing vector potential $A$.

The following definitions are used throughout the paper. For a linear operator $A$ in a Hilbert space $H$, we denote by $D(A)$ its domain and by $\ker A$ its kernel. A sesquilinear form $a$ with domain $D(a)$ is called symmetric if $a[x, y] = \overline{a[y, x]}$, $x, y \in D(a)$, and we write $a[x] := a[x, x]$, $x \in D(a)$, for the associated quadratic form. For a symmetric linear operator $A$, we write $A \gg 0$ if there exists some $\gamma_0 > 0$ such that $(Ax, x) \ge \gamma_0 \|x\|^2$, $x \in D(A)$; the same notation is used for quadratic forms. The numerical range of a linear operator $A$ and of a quadratic form $a$, respectively, are given by
\[ W(A) := \{ (Ax, x) : x \in D(A),\ \|x\| = 1 \}, \qquad W(a) := \{ a[x] : x \in D(a),\ \|x\| = 1 \}. \tag{1.7} \]
For a symmetric operator $A$, the numerical range $W(A)$ is real; for a self-adjoint operator $A$, the inclusion $\sigma(A) \subset \overline{W(A)}$ holds. If $a$ is a densely defined closed symmetric sesquilinear form bounded from below and $A$ is the self-adjoint operator associated with $a$ by the first representation theorem, i.e. $a[x, y] = (Ax, y)$, $x \in D(A)$, $y \in D(a)$, then $W(A)$ is dense in $W(a)$ (see [6, Theorem VI.2.1, Corollary VI.2.3]). More details on linear operators and quadratic forms may be found in [6].

If $T$ is an operator function on $\mathbb{C}$, i.e. a function $\lambda \mapsto T(\lambda)$ defined on $\mathbb{C}$ whose values $T(\lambda)$ are closed linear operators, the resolvent set, spectrum, point spectrum and essential spectrum of $T$ are defined by
\begin{align*}
\rho(T) &:= \{ \lambda \in \mathbb{C} : 0 \in \rho(T(\lambda)) \} = \{ \lambda \in \mathbb{C} : T(\lambda) \text{ bijective} \}, \\
\sigma(T) &:= \{ \lambda \in \mathbb{C} : 0 \in \sigma(T(\lambda)) \} = \mathbb{C} \setminus \rho(T), \\
\sigma_p(T) &:= \{ \lambda \in \mathbb{C} : 0 \in \sigma_p(T(\lambda)) \} = \{ \lambda \in \mathbb{C} : T(\lambda) \text{ not injective} \}, \\
\sigma_{ess}(T) &:= \{ \lambda \in \mathbb{C} : 0 \in \sigma_{ess}(T(\lambda)) \} = \{ \lambda \in \mathbb{C} : T(\lambda) \text{ not Fredholm} \}.
\end{align*}
If $\lambda_0 \in \mathbb{C}$ is an eigenvalue of an analytic operator function $T$, then a sequence $(x_j)_{j=0}^{m-1}$ is called a Jordan chain of length $m$ of $T$ in $\lambda_0$ if
\[ \sum_{k=0}^{j} \frac{1}{k!}\, T^{(k)}(\lambda_0)\, x_{j-k} = 0, \qquad j = 0, \dots, m - 1. \tag{1.8} \]
An eigenvalue $\lambda_0$ of $T$ is called semi-simple if the maximal length of a Jordan chain is 1; it is called simple if $\lambda_0$ is semi-simple and $\dim \ker T(\lambda_0) = 1$. If the domains $D(T(\lambda)) =: D_0$ are independent of $\lambda$, we define the numerical range of $T$ as
\[ W(T) := \{ \lambda \in \mathbb{C} : \exists\, x \in D_0 \ \ (T(\lambda)x, x) = 0 \}. \tag{1.9} \]
Clearly $\sigma_p(T) \subset W(T)$. For operator functions $T$ with bounded values, the inclusion $\sigma(T) \subset \overline{W(T)}$ holds if there exists a $z_0 \in \mathbb{C}$ such that $0 \notin \overline{W(T(z_0))}$ (see [21, Theorem 26.6]). All the above notions are defined analogously for analytic operator functions defined on some domain $\Omega \subset \mathbb{C}$; since the operator functions occurring in this paper are polynomials, we may restrict ourselves to the case $\Omega = \mathbb{C}$. More details on analytic operator functions may be found in [21] and [6, Chap. VII].

2. The Abstract Klein–Gordon Pencil T

Suppose that $H_0$ is an unbounded self-adjoint uniformly positive operator in a Hilbert space $H$, $H_0 \ge m^2 > 0$, and $V$ is a symmetric operator in $H$. In this section, we associate a self-adjoint operator $T(\lambda)$ with the formal operator sum $H_0 - (\lambda - V)^2$ in (1.2) by means of quadratic forms. To this end, we assume that $V$ satisfies the following two assumptions:
(V1) D(H02 ) ⊆ D(V ); (V2) there exist α, β ≥ 0 with β < 1 such that 1
1
|(V x, V x)| ≤ α x2 + β |(H02 x, H02 x)|,
1
x ∈ D(H02 ).
(2.1)
Assumption (V2) means that the self-adjoint operator $V^*V$, of which $V^2$ is a restriction, is $H_0$-form-bounded with $H_0$-form bound $< 1$; the infimum over all $\beta$ such that (2.1) holds for some $\alpha \ge 0$ is called the $H_0$-form bound of $V^*V$ (see [23, Chap. X.2]).

Remark 2.1. It is well-known (see e.g. [6, Sec. V.4.1]) that the existence of $\alpha, \beta \ge 0$, $\beta < 1$, with (2.1) is equivalent to the existence of $a, b \ge 0$, $b < 1$, such that
\[ \|Vx\| \le a \|x\| + b \|H_0^{1/2}x\|, \qquad x \in D(H_0^{1/2}); \tag{2.2} \]
in fact, (2.1) implies (2.2) with $a = \sqrt{\alpha}$, $b = \sqrt{\beta}$, while (2.2) implies (2.1) with $\alpha = (1 + \varepsilon^{-1})a^2$, $\beta = (1 + \varepsilon)b^2$ for arbitrary $\varepsilon > 0$. Hence assumption (V2) is equivalent to the assumption

(V2′) $V$ is $H_0^{1/2}$-bounded with $H_0^{1/2}$-bound $< 1$.
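The passage from (2.2) to (2.1) in Remark 2.1 rests on the elementary inequality $(as + bt)^2 \le (1 + \varepsilon^{-1})a^2 s^2 + (1 + \varepsilon)b^2 t^2$ for $s, t \ge 0$, applied with $s = \|x\|$ and $t = \|H_0^{1/2}x\|$. The following is a minimal numerical sketch of this conversion; the function names and the sample constants are illustrative assumptions and do not appear in the paper.

```python
import math
import random

def form_constants(a, b, eps):
    """Constants (alpha, beta) for (2.1) obtained from (a, b) in (2.2), as in Remark 2.1."""
    return (1.0 + 1.0 / eps) * a * a, (1.0 + eps) * b * b

def norm_constants(alpha, beta):
    """Constants (a, b) for (2.2) obtained from (alpha, beta) in (2.1)."""
    return math.sqrt(alpha), math.sqrt(beta)

def check_square_bound(a, b, eps, trials=1000, seed=0):
    """Test (a*s + b*t)^2 <= alpha*s^2 + beta*t^2 on random s, t >= 0.

    Here s plays the role of ||x|| and t the role of ||H0^{1/2} x||.
    """
    rng = random.Random(seed)
    alpha, beta = form_constants(a, b, eps)
    for _ in range(trials):
        s, t = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
        if (a * s + b * t) ** 2 > alpha * s * s + beta * t * t + 1e-9:
            return False
    return True

# Hypothetical sample constants with b < 1.
a, b = 2.0, 0.6
eps = 0.5 * (1.0 / b ** 2 - 1.0)  # any eps < 1/b^2 - 1 keeps beta below 1
alpha, beta = form_constants(a, b, eps)
print(beta < 1.0, check_square_bound(a, b, eps))
```

The choice of $\varepsilon$ below $1/b^2 - 1$ is the same one used later in the proof of Lemma 2.3 to keep the relative form bound below 1.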
Assumption (V1) alone already implies that $VH_0^{-1/2}$ is a bounded operator. The norm of $VH_0^{-1/2}$ is related to the constants in assumption (V2) as follows.
Proposition 2.2. Suppose that assumption (V1) holds. If (V2) is satisfied with constants $\alpha, \beta$ in (2.1) or with constants $a, b$ in (2.2), respectively, then
\[ \|VH_0^{-1/2}\| \le \frac{a}{m} + b, \qquad \|VH_0^{-1/2}\| \le \frac{\sqrt{\alpha}}{m} + \sqrt{\beta}, \qquad \|VH_0^{-1/2}\| \le \sqrt{\frac{\alpha}{m^2} + \beta}. \tag{2.3} \]
In particular, the following are equivalent:

(i) $\|VH_0^{-1/2}\| < 1$;
(ii) assumption (V2) holds with $a, b \ge 0$ in (2.2) such that $\frac{a}{m} + b < 1$;
(iii) assumption (V2) holds with $\alpha, \beta \ge 0$ in (2.1) such that $\frac{\sqrt{\alpha}}{m} + \sqrt{\beta} < 1$;
(iv) assumption (V2) holds with $\alpha, \beta \ge 0$ in (2.1) such that $\frac{\alpha}{m^2} + \beta < 1$.

Proof. Since $H_0 \ge m^2$, we have $\|H_0^{-1/2}\| \le 1/m$. If (2.1) or (2.2) hold, then the estimates
\[ \|VH_0^{-1/2}x\|^2 \le \alpha \|H_0^{-1/2}x\|^2 + \beta \|H_0^{1/2}H_0^{-1/2}x\|^2 \le \Big(\frac{\alpha}{m^2} + \beta\Big) \|x\|^2, \tag{2.4} \]
\[ \|VH_0^{-1/2}x\| \le a \|H_0^{-1/2}x\| + b \|H_0^{1/2}H_0^{-1/2}x\| \le \Big(\frac{a}{m} + b\Big) \|x\|, \tag{2.5} \]
for $x \in \mathcal{H}$ imply the first and the third estimate in (2.3). Since (2.1) with $\alpha, \beta$ implies (2.2) with $a = \sqrt{\alpha}$, $b = \sqrt{\beta}$ by Remark 2.1, the second estimate in (2.3) follows from the first.

(i) ⇒ (ii), (i) ⇒ (iii), (i) ⇒ (iv): Since $VH_0^{-1/2}$ is bounded by (V1), the estimate
\[ \|Vx\| = \|VH_0^{-1/2}H_0^{1/2}x\| \le \|VH_0^{-1/2}\|\, \|H_0^{1/2}x\|, \qquad x \in D(H_0^{1/2}), \]
shows that (2.1) holds with $\alpha = 0$, $\beta = \|VH_0^{-1/2}\|^2 < 1$ and that (2.2) holds with $a = 0$, $b = \|VH_0^{-1/2}\| < 1$.

(ii) ⇒ (i), (iii) ⇒ (i), (iv) ⇒ (i): All implications are obvious from the estimates in (2.3).

The next lemma shows that if conditions (V1), (V2) are satisfied, then, for every $\lambda \in \mathbb{C}$, there is a well-defined self-adjoint operator $T(\lambda)$ associated with the formal operator sum $H_0 - (\lambda - V)^2$ in the abstract Klein–Gordon spectral problem (1.2).
Lemma 2.3. Assume that conditions (V1), (V2) hold. Then, for every $\lambda \in \mathbb{C}$, there exists a unique closed m-sectorial operator $T(\lambda)$ in $\mathcal{H}$ such that
\[ (T(\lambda)x, y) = (H_0^{1/2}x, H_0^{1/2}y) - ((\lambda - V)x, (\bar{\lambda} - V)y) =: t(\lambda)[x, y] \]
for all $x, y \in D(H_0^{1/2}) =: D(t(\lambda)) =: D_t$. The corresponding operator function $T$ has the following properties:

(i) $T$ is self-adjoint, i.e. $T(\lambda)^* = T(\bar{\lambda})$ for $\lambda \in \mathbb{C}$, and for $\lambda \in \mathbb{R}$ the self-adjoint operators $T(\lambda)$ are bounded from below;
(ii) the domains $D(T(\lambda)) =: D_T$ are independent of $\lambda \in \mathbb{C}$ and
\[ T(\lambda) = T(\mu) + 2(\lambda - \mu)(V - \mu) - (\lambda - \mu)^2, \qquad \lambda, \mu \in \mathbb{C}; \tag{2.6} \]
(iii) $T$ is analytic in the generalized sense with derivatives given by
\[ T'(\lambda)x = 2(V - \lambda)x, \qquad T''(\lambda)x = -2x, \qquad T^{(j)}(\lambda)x = 0, \quad j = 3, 4, \dots, \]
for $x \in D(T(\lambda))$.

Proof. The operator $H_0$ is positive and self-adjoint and so the corresponding form given by $h_0[x, y] := (H_0^{1/2}x, H_0^{1/2}y)$, $x, y \in D(H_0^{1/2})$, is closed and positive. By (V2) and Remark 2.1, there exist $a, b \ge 0$, $b < 1$, such that
\[ \|(V - \lambda)x\| \le \|Vx\| + |\lambda|\, \|x\| \le (a + |\lambda|) \|x\| + b \|H_0^{1/2}x\|, \qquad x \in D(H_0^{1/2}). \]
Using Remark 2.1 again, we see that, for $x \in D(H_0^{1/2})$,
\[ |((V - \lambda)x, (V - \bar{\lambda})x)| \le \|(V - \lambda)x\|\, \|(V - \bar{\lambda})x\| = \|(V - \lambda)x\|^2 \le (1 + \varepsilon^{-1})(a + |\lambda|)^2 \|x\|^2 + (1 + \varepsilon) b^2 (H_0^{1/2}x, H_0^{1/2}x) \tag{2.7} \]
with arbitrary $\varepsilon > 0$. Choosing $\varepsilon < 1/b^2 - 1$, we see that the form $v(\lambda)$ given by $v(\lambda)[x, y] := ((\lambda - V)x, (\bar{\lambda} - V)y)$, $x, y \in D(V)$, is $h_0$-bounded with $h_0$-bound $< 1$. Hence, for $\lambda \in \mathbb{C}$, the form $t(\lambda)$ is closed and sectorial by [6, Theorem VI.1.33] and the existence of the m-sectorial operator $T(\lambda)$ follows from the first representation theorem (see e.g. [6, Theorem VI.2.1]).

(i) Since $t(\bar{\lambda}) = t^*(\lambda)$ for $\lambda \in \mathbb{C}$, the operator function $T$ is self-adjoint. For $\lambda \in \mathbb{R}$ the self-adjoint operators $T(\lambda)$ are semi-bounded because $T(\lambda)$ is m-sectorial. Note that for $\lambda \in \mathbb{R}$ the existence and semi-boundedness of $T(\lambda)$ also follow from the so-called KLMN theorem (due to Kato–Lions–Lax–Milgram–Nelson, see [23, Theorem X.17] or [2, Sec. 2.1]).
(ii) It is easy to see that, for $\mu, \lambda \in \mathbb{R}$ and $x, y \in D_t = D(H_0^{1/2})$,
\[ t(\lambda)[x, y] = t(\mu)[x, y] + 2(\lambda - \mu)((V - \mu)x, y) - (\lambda - \mu)^2 (x, y). \tag{2.8} \]
In particular, the operator $T(\lambda)$ associated with the form on the left-hand side coincides with the operator which is associated with the sum of forms on the right-hand side; this operator has the same domain as the operator $T(\mu)$ associated with the first term $t(\mu)$ because $((V - \mu)x, y)$ defines a $t(\mu)$-bounded form with $t(\mu)$-bound 0 and $(x, y)$ is a bounded form (see [6, Theorem VI.1.33]).

(iii) For the family $t(\lambda)$ of closed sectorial forms with $\lambda$-independent domains $D_t = D(t(\lambda)) = D(H_0^{1/2})$, clearly, $t(\lambda)[u]$ depends analytically on $\lambda$ for each $u \in D_t$, i.e. $t(\lambda)$ is an analytic family of type (a) (see [6, VII.§4.2]). Hence, by [6, Theorem VII.4.2], the corresponding operator family $T(\lambda)$ is also analytic (in the generalized sense), i.e. it is analytic of type (B). The formulas for the derivatives of $T$ follow from the identity (2.6).

Corollary 2.4. If we denote by $H_0 \dot{-} V^2 := T(0)$ the operator form sum corresponding to $t(0)$, then
\[ T(\lambda) = H_0 \dot{-} V^2 + 2\lambda V - \lambda^2, \qquad D(T(\lambda)) = D_T = D(H_0 \dot{-} V^2), \qquad \lambda \in \mathbb{C}. \]

Proof. The claim is immediate from Lemma 2.3(ii) with $\mu = 0$.

Remark 2.5. Note that, e.g. if the operator $V$ is bounded, then $T(0) = H_0 - V^2$ may be defined as an operator sum; however, the operator form sum $T(0) = H_0 \dot{-} V^2$ may even be defined if $D(H_0) \cap D(V^2) = \{0\}$. In particular cases, e.g. for the Klein–Gordon equation in $\mathbb{R}^3$, it may even be possible to define the operators $T(\lambda)$ without using assumption (V2) by means of the Leinfelder–Simader theorem [19, Theorem 4].
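The shift identity (2.8) is purely algebraic, so it can be sanity-checked in a finite-dimensional model. The sketch below is only an illustration under stated assumptions: $H_0$ and $V$ are replaced by small real symmetric matrices (chosen here arbitrarily, not taken from the paper), $\lambda, \mu$ are real, $y = x$ is a real vector, and $(H_0^{1/2}x, H_0^{1/2}x)$ is evaluated as $(H_0 x, x)$. The helper names are hypothetical.

```python
import random

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    """Real Euclidean inner product."""
    return sum(xi * yi for xi, yi in zip(x, y))

def t_form(H0, V, lam, x):
    """t(lam)[x] = (H0 x, x) - ||(lam - V) x||^2 for real lam and real x."""
    Vx = matvec(V, x)
    r = [lam * xi - vxi for xi, vxi in zip(x, Vx)]
    return dot(matvec(H0, x), x) - dot(r, r)

def rhs_2_8(H0, V, lam, mu, x):
    """Right-hand side of (2.8) evaluated at y = x."""
    Vx = matvec(V, x)
    shifted = [vxi - mu * xi for vxi, xi in zip(Vx, x)]  # (V - mu) x
    return (t_form(H0, V, mu, x)
            + 2.0 * (lam - mu) * dot(shifted, x)
            - (lam - mu) ** 2 * dot(x, x))

# Illustrative data (assumed): H0 positive definite, V symmetric.
H0 = [[4.0, 1.0, 0.0], [1.0, 5.0, 1.0], [0.0, 1.0, 6.0]]
V = [[0.5, -0.2, 0.1], [-0.2, 0.0, 0.3], [0.1, 0.3, -0.4]]

def max_defect(trials=500, seed=0):
    """Largest observed gap between both sides of (2.8) over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        lam, mu = rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)
        gap = abs(t_form(H0, V, lam, x) - rhs_2_8(H0, V, lam, mu, x))
        worst = max(worst, gap)
    return worst

print(max_defect() < 1e-10)
```

Setting `mu = 0.0` in `rhs_2_8` gives the quadratic-form version of Corollary 2.4 in this matrix model.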
By assumption (V1), the operator $S := VH_0^{-1/2}$ is bounded and hence the quadratic operator polynomial $L$ in the Hilbert space $\mathcal{H}$, given by
\[ L(\lambda) = I - (S^* - \lambda H_0^{-1/2})(S - \lambda H_0^{-1/2}), \qquad \lambda \in \mathbb{C}, \tag{2.9} \]
has bounded coefficients. However, the numerical range $W(L)$ of $L$ is not bounded since $L$ is not monic and its leading coefficient $-H_0^{-1}$ is not bounded away from 0. The following relation between the operator polynomials $T$ and $L$ was proved in [20]; a similar factorization may be found in [31, Eq. (4.9)].

Proposition 2.6. Suppose that assumption (V1) holds, i.e. $D(H_0^{1/2}) \subseteq D(V)$. Then
\[ T(\lambda) = H_0^{1/2} L(\lambda) H_0^{1/2}, \qquad \lambda \in \mathbb{C}, \tag{2.10} \]
and we have

(i) $\sigma_p(T) \subset W(T) \subset W(L) = W(t)$;
(ii) $\sigma(T) \subset \sigma(L)$, $\sigma_p(T) = \sigma_p(L)$;
(iii) $\sigma(T) \cap \mathbb{R} = \sigma(L) \cap \mathbb{R}$, $\sigma_{ess}(T) \cap \mathbb{R} = \sigma_{ess}(L) \cap \mathbb{R}$.

Proof. All claims except for (i) were proved in [20, Proposition 2.3]. The first inclusion in (i) is obvious. In fact, if $\lambda_0 \in \sigma_p(T)$, then there exists $x_0 \in D_T \setminus \{0\}$ such that $T(\lambda_0)x_0 = 0$. Taking the scalar product with $x_0$ yields $\lambda_0 \in W(T)$.
The second inclusion in (i) follows from identity (2.10); the last equality follows from the relation
\[ (L(\lambda)H_0^{1/2}x, H_0^{1/2}x) = t(\lambda)[x], \qquad x \in D_t = D(H_0^{1/2}), \tag{2.11} \]
and from the fact that $H_0^{1/2}$ is bijective.

Remark 2.7. In [17], [18], and [20], the abstract Klein–Gordon equation was studied under the assumption

(V3) $VH_0^{-1/2} =: S = S_0 + S_1$ where $\|S_0\| < 1$ and $S_1$ is compact,

which implies condition (V2). In fact, since $S_1$ is compact, the operator $S_1 H_0^{1/2}$ has $H_0^{1/2}$-bound 0 and hence, for $\varepsilon < 1 - \|S_0\|$, there exists an $\alpha \ge 0$ such that $\|S_1 H_0^{1/2}x\| \le \alpha \|x\| + \varepsilon \|H_0^{1/2}x\|$ for $x \in D(H_0^{1/2})$ and so
\[ \|Vx\| = \|S_1 H_0^{1/2}x + S_0 H_0^{1/2}x\| \le \alpha \|x\| + (\varepsilon + \|S_0\|) \|H_0^{1/2}x\|. \]

3. Semi-Simplicity of the Eigenvalues

In this section we establish conditions on $V$ which guarantee that the quadratic operator polynomial $T(\lambda)$, $\lambda \in \mathbb{C}$, induced by the formal operator sum $H_0 - (\lambda - V)^2$ in Sec. 2, is strongly damped and that its spectrum splits into two parts of different type. Clearly, this holds for $V = 0$ since in this case $T(\lambda) = H_0 - \lambda^2$ and hence $\sigma(T) = \sigma_{ess}(T) = (-\infty, -m] \,\dot{\cup}\, [m, \infty)$. For $V \ne 0$, corresponding conditions for $V$ were established in [17]; for bounded $V \ne 0$, weaker conditions were given in [22] (see also [20]).

The notion of strongly damped operator polynomials was first introduced in [3] in the finite-dimensional case; in the infinite-dimensional case with bounded coefficients it was elaborated in [7, 8] (see also [9, 10]), and in [14]; for unbounded coefficients and constant domain, it was considered in [27].

Definition 3.1. The operator polynomial $T$ defined in Lemma 2.3 is called strongly damped if, for every $x \in D_T$, the quadratic polynomial $(T(\cdot)x, x)$ on $\mathbb{R}$ has two real and distinct roots. The form polynomial $t$ defined in Lemma 2.3 is called strongly damped if, for every $x \in D_t$, the quadratic polynomial $t(\cdot)[x]$ on $\mathbb{R}$ has two real and distinct roots.

Remark 3.2. (i) If $T$ is strongly damped, then $W(T) \subset \mathbb{R}$. (ii) If $t$ is strongly damped, then $T$ is strongly damped (since $D_T \subset D_t$).

The following lemma and its proof generalize a result for strongly damped quadratic operator polynomials $L(\lambda) = \lambda^2 + \lambda B + C$, $\lambda \in \mathbb{C}$, which was proved in [14, Behauptung 5.1] for the case of an unbounded self-adjoint coefficient $B$ and bounded $C \ge 0$; for the case of bounded coefficients, a less direct proof may be found in [21, Theorem 31.1].

Lemma 3.3. Let $t$ be strongly damped and denote the two different real zeros of the quadratic equation $t(\lambda)[x] = 0$ by $p_-(x) < p_+(x)$ for $x \in D(H_0^{1/2}) \setminus \{0\}$. If we let
\[ \Lambda_- := \{p_-(x) : x \in D(H_0^{1/2}) \setminus \{0\}\}, \qquad \nu_- := \sup \Lambda_-, \tag{3.1} \]
\[ \Lambda_+ := \{p_+(x) : x \in D(H_0^{1/2}) \setminus \{0\}\}, \qquad \nu_+ := \inf \Lambda_+, \tag{3.2} \]
then the sets $\Lambda_-$ and $\Lambda_+$ are disjoint; in particular, $\nu_- \le \nu_+$.
Proof. Assume, to the contrary, that there exist elements $x, y \in D(H_0^{1/2}) \setminus \{0\}$ with
\[ \lambda_0 := p_-(x) = p_+(y). \tag{3.3} \]
Then, by the assumption on $t$, we have
\[ t(\lambda_0)[x] = t(\lambda_0)[y] = 0, \tag{3.4} \]
\[ \frac{d}{d\lambda} t(\lambda)[x] \Big|_{\lambda = \lambda_0} > 0, \qquad \frac{d}{d\lambda} t(\lambda)[y] \Big|_{\lambda = \lambda_0} < 0. \tag{3.5} \]
Moreover, without loss of generality, we may assume that $\operatorname{Re} t(\lambda_0)[x, y] \le 0$. Otherwise, we may replace $x$ by $-x$ since $p_-(-x) = p_-(x)$; in fact,
\[ p_\pm(\alpha w) = p_\pm(w), \qquad w \in D(H_0^{1/2}) \setminus \{0\}, \quad \alpha \in \mathbb{C} \setminus \{0\}, \tag{3.6} \]
as $t(\lambda)[\alpha w] = |\alpha|^2 t(\lambda)[w]$ and so the two equations $t(\lambda)[\alpha w] = 0$ and $t(\lambda)[w] = 0$ have the same roots.

Set $z(t) := tx + (1 - t)y \in D_t$, $t \in [0, 1]$. First we show that $z(t) \ne 0$, $t \in [0, 1]$. Clearly, $z(0) = y \ne 0$ and $z(1) = x \ne 0$. If $z(t_0) = 0$ for some $t_0 \in (0, 1)$, then $x = -\frac{1 - t_0}{t_0}\, y$ and hence, by (3.6), it follows that $p_+(y) = p_-(x) = p_-\big(-\frac{1 - t_0}{t_0}\, y\big) = p_-(y)$, a contradiction to (3.3).

By the definition of $z(t)$ and by (3.4), we see that, for all $t \in [0, 1]$,
\[ t(\lambda_0)[z(t)] = t^2\, t(\lambda_0)[x] + (1 - t)^2\, t(\lambda_0)[y] + 2t(1 - t) \operatorname{Re} t(\lambda_0)[x, y] = 2t(1 - t) \operatorname{Re} t(\lambda_0)[x, y] \le 0. \]
Moreover, the function
\[ h(t) := \frac{d}{d\lambda} t(\lambda)[z(t)] \Big|_{\lambda = \lambda_0}, \qquad t \in [0, 1], \]
depends continuously on $t$ and, by (3.5), we have $h(0) < 0$ and $h(1) > 0$. Hence there exists a $t_0 \in [0, 1]$ such that $h(t_0) = 0$. Altogether, we have shown that the
quadratic polynomial $q(\lambda) := t(\lambda)[z(t_0)]$, $\lambda \in \mathbb{R}$, satisfies
\[ q(\lambda_0) = t(\lambda_0)[z(t_0)] \le 0, \qquad q'(\lambda_0) = \frac{d}{d\lambda} t(\lambda)[z(t_0)] \Big|_{\lambda = \lambda_0} = 0. \]
As $\lim_{\lambda \to \pm\infty} q(\lambda) = -\infty$ (see the definition of $t$ in Lemma 2.3), it follows that $q$ is non-positive and possesses at most one real zero, a contradiction to the assumption that $t$ is strongly damped.

Proposition 3.4. The form polynomial $t$ satisfies the following implications:

(i) If $t$ is strongly damped, then $t(\lambda) \ge 0$ for all $\lambda \in [\nu_-, \nu_+]$.
(ii) If $t(\lambda_0) > 0$ for some $\lambda_0 \in \mathbb{R}$, then $t$ is strongly damped and $\nu_- \le \lambda_0 \le \nu_+$.

Proof. (i) Let $\lambda \in [\nu_-, \nu_+]$ and $x \in D_t$, $x \ne 0$, be arbitrary. Then, by the definition of $p_\pm(x)$ as the zeros of the quadratic equation $t(\lambda)[x] = 0$ and by the definition of $\nu_\pm$ in (3.1), (3.2), we have
\[ t(\lambda)[x] = (\lambda - p_-(x))(p_+(x) - \lambda)(x, x) \ge (\lambda - \nu_-)(\nu_+ - \lambda)(x, x) =: \gamma\, (x, x) \tag{3.7} \]
where $\gamma \ge 0$.

(ii) Let $x \in D_t$, $x \ne 0$. As $\lim_{\lambda \to \pm\infty} t(\lambda)[x] = -\infty$ and $t(\lambda_0)[x] > 0$ by assumption, it follows that $t(\lambda)[x] = 0$ has two real zeros $p_\pm(x)$ and $p_-(x) < \lambda_0 < p_+(x)$. By the definition of $\nu_\pm$, it is immediate that $\nu_- \le \lambda_0 \le \nu_+$.

For strongly damped quadratic operator polynomials for which at least one of the two root zones is bounded, it is well-known that the length of every Jordan chain is 1 (see [14, p. 164] and also [21, Lemma 30.13]). The proof for two unbounded root zones is similar; we repeat it for its simplicity and for the convenience of the reader.

Theorem 3.5. If the operator polynomial $T$ is strongly damped, then all eigenvalues of $T$ are real and semi-simple, i.e. all Jordan chains of $T$ have length 1.

Proof. Since $T$ is strongly damped, Proposition 2.6(i) and Remark 3.2(i) imply that $\sigma_p(T) \subset W(T) \subset \mathbb{R}$. Assume that $\lambda_0 \in \mathbb{R}$ is an eigenvalue of $T$ that is not semi-simple. Then, by (1.8), there exist elements $x_0, x_1 \in D(T(\lambda_0)) = D_T$, $x_0 \ne 0$, such that
\[ T(\lambda_0)x_0 = 0, \qquad T'(\lambda_0)x_0 + T(\lambda_0)x_1 = 0. \]
This implies that
\[ (T(\lambda_0)x_0, x_0) = 0, \qquad \frac{d}{d\lambda} (T(\lambda)x_0, x_0) \Big|_{\lambda = \lambda_0} = (T'(\lambda_0)x_0, x_0) = -(T(\lambda_0)x_1, x_0) = -(x_1, T(\lambda_0)x_0) = 0; \]
here we have used that $T(\lambda_0)$ is self-adjoint by Lemma 2.3(i) since $\lambda_0$ is real. This shows that the quadratic polynomial $\lambda \mapsto (T(\lambda)x_0, x_0)$ has a double zero at $\lambda_0$, a contradiction to the assumption that $T$ is strongly damped.

If $T$ is strongly damped, it is not immediate that the whole spectrum of $T$ is real. The reason for this is that we only have a spectral inclusion theorem for analytic operator functions with bounded coefficients (see [21, Theorem 26.6]). Using the quadratic form polynomial $t$, we shall now show that $\sigma(T) \subset \overline{W(T)} \subset \overline{W(t)} \subset \mathbb{R}$ if $\nu_- < \nu_+$. The following definiteness properties were proved in [21, Lemma 31.15] for operator polynomials with bounded coefficients (see also the original work [14, Abschnitt II.5.1]). Here and in the sequel, $t(\lambda) \gg 0$ means that $t(\lambda)$ is uniformly positive, i.e. $t(\lambda) \ge \gamma$ for some $\gamma > 0$.

Proposition 3.6. The form polynomial $t$ satisfies the following implications:

(i) If $t$ is strongly damped with $\nu_- < \nu_+$, then $t(\lambda) \gg 0$ for all $\lambda \in (\nu_-, \nu_+)$, and $t(\nu_\pm) \ge 0$.
(ii) If $t(\lambda_0) \gg 0$ for some $\lambda_0 \in \mathbb{R}$, then $t$ is strongly damped with $\nu_- < \lambda_0 < \nu_+$.

Proof. (i) Since $t$ is strongly damped, Proposition 3.4(i) shows that $t(\lambda) \ge 0$ for all $\lambda \in [\nu_-, \nu_+]$. If $\nu_- < \nu_+$ and $\lambda \in (\nu_-, \nu_+)$, then $\gamma = (\lambda - \nu_-)(\nu_+ - \lambda) > 0$ in (3.7) and hence $t(\lambda) \ge \gamma > 0$.

(ii) By Proposition 3.4(ii), $t$ is strongly damped. By assumption, there exists $\gamma_0$ such that $t(\lambda_0) \ge \gamma_0 > 0$. To prove that $\nu_- < \lambda_0 < \nu_+$, we show that for every $\gamma \in (0, \gamma_0)$ there exists an $\varepsilon > 0$ such that $t(\lambda) \ge \gamma > 0$ for all $\lambda \in (\lambda_0 - \varepsilon, \lambda_0 + \varepsilon)$; then Proposition 3.4(ii) implies that $\nu_- \le \lambda_0 - \varepsilon < \lambda_0 + \varepsilon \le \nu_+$.

Let $\gamma \in (0, \gamma_0)$ and assume, to the contrary, that there exist sequences $(\mu_n)_{n \in \mathbb{N}} \subset \mathbb{R}$, $\mu_n \to \lambda_0$ for $n \to \infty$, and $(x_n)_{n \in \mathbb{N}} \subset D_t$, $\|x_n\| = 1$, such that
\[ \gamma > t(\mu_n)[x_n], \qquad n \in \mathbb{N}. \]
Set $\delta_n := 2(\mu_n - \lambda_0)\lambda_0 + (\mu_n - \lambda_0)^2$, $n \in \mathbb{N}$. Then, by (2.8),
\[ \gamma + \delta_n > t(\mu_n)[x_n] + 2(\mu_n - \lambda_0)\lambda_0 + (\mu_n - \lambda_0)^2 = t(\lambda_0)[x_n] + 2(\mu_n - \lambda_0)(Vx_n, x_n) \ge t(\lambda_0)[x_n] - 2|\mu_n - \lambda_0|\, |(Vx_n, x_n)| \ge \gamma_0 - 2|\mu_n - \lambda_0|\, |(Vx_n, x_n)|. \tag{3.8} \]
If $|(Vx_n, x_n)|$ were bounded, the left-hand side in the above inequalities (3.8) would tend to $\gamma$, while the right-hand side would tend to $\gamma_0 > \gamma$, a contradiction. Hence $|(Vx_n, x_n)| \to \infty$ for $n \to \infty$. By the Cauchy–Schwarz inequality and assumption (V2), there exist $\alpha, \beta \ge 0$, $\beta < 1$, such that
\[ |(Vx_n, x_n)|^2 \le \|Vx_n\|^2 \le \alpha \|x_n\|^2 + \beta \|H_0^{1/2}x_n\|^2, \qquad n \in \mathbb{N}, \]
which implies that also $\|H_0^{1/2}x_n\| \to \infty$ for $n \to \infty$. By (3.8), the definition of $t$ in Lemma 2.3, and the above inequality, it follows that
\[ \gamma + \delta_n \ge \|H_0^{1/2}x_n\|^2 - \|(V - \lambda_0)x_n\|^2 - 2|\mu_n - \lambda_0|\, |(Vx_n, x_n)| \]
\[ \ge \|H_0^{1/2}x_n\|^2 - \|Vx_n\|^2 - 2|\lambda_0|\, \|Vx_n\| - |\lambda_0|^2 - 2|\mu_n - \lambda_0|\, \|Vx_n\| \]
\[ \ge (1 - \beta) \|H_0^{1/2}x_n\|^2 - \alpha - 2(|\lambda_0| + |\mu_n - \lambda_0|)\big(\sqrt{\alpha} + \sqrt{\beta}\, \|H_0^{1/2}x_n\|\big) - \lambda_0^2. \tag{3.9} \]
Since $\beta < 1$ and $\|H_0^{1/2}x_n\| \to \infty$ for $n \to \infty$, the right-hand side of the inequalities (3.9) tends to $\infty$, whereas the left-hand side tends to $\gamma$, a contradiction.

The proof of the following theorem relies implicitly on the Langer factorization theorem on quadratic operator polynomials (see [14, Abschnitt II.3] or [15]) which is the main ingredient for the two lemmas from [27] that we use.

Theorem 3.7. If $t$ is strongly damped and $\nu_- < \nu_+$, then
\[ \sigma(T) \subset (-\infty, \nu_-] \,\dot{\cup}\, [\nu_+, \infty) \subset \mathbb{R} \tag{3.10} \]
and $\nu_\pm \in \sigma(T)$; if $\nu_\pm \in W(t)$, then $\nu_\pm \in \sigma_p(T)$.

Proof. By Proposition 2.6(i), we have $W(T) \subset W(t) \subset (-\infty, \nu_-] \,\dot{\cup}\, [\nu_+, \infty) \subset \mathbb{R}$. Hence $T$ satisfies the assumptions of [27, Lemmas 1.1 and 1.2], which yield that $\sigma(T) \subset \mathbb{R}$. By Proposition 3.6(i), we have $t(\lambda) \gg 0$ for all $\lambda \in (\nu_-, \nu_+)$. Since $D(T(\lambda)) = D_T \subset D_t = D(H_0^{1/2})$, this implies that also $T(\lambda) \gg 0$ for all $\lambda \in (\nu_-, \nu_+)$. Since $T(\lambda)$ is self-adjoint, it follows that $0 \in \rho(T(\lambda))$ and hence $\lambda \in \rho(T)$ for all $\lambda \in (\nu_-, \nu_+)$, which proves (3.10).

By (3.10) and Propositions 2.6(ii) and (iii), it follows that $\sigma_p(L) = \sigma_p(T)$ and $\sigma(L) \cap \mathbb{R} = \sigma(T)$. By Proposition 2.6(i), we have $W(t) = W(L)$. From [14, Behauptung 5.1], it follows that the boundary points $\nu_\pm$ of $W(t) = W(L)$ belong to $\sigma(L)$ and hence $\nu_\pm \in \sigma(L) \cap \mathbb{R} = \sigma(T)$.

If $\nu_- \in W(t) = W(L)$, then there is an $x_- \in \mathcal{H} \setminus \{0\}$ with $(L(\nu_-)x_-, x_-) = 0$. By (2.11) and Proposition 3.6(i), we have $(L(\nu_-)x, x) = t(\nu_-)[H_0^{-1/2}x, H_0^{-1/2}x] \ge 0$ for all $x \in \mathcal{H}$. Now the Cauchy–Schwarz inequality for the positive semi-definite
inner product $(L(\nu_-)\,\cdot, \cdot)$ shows that
\[ |(L(\nu_-)x_-, y)| \le |(L(\nu_-)x_-, x_-)|^{1/2}\, |(L(\nu_-)y, y)|^{1/2} = 0 \]
for arbitrary $y \in \mathcal{H}$ and hence $L(\nu_-)x_- = 0$, i.e. $\nu_- \in \sigma_p(L)$. By Proposition 2.6(ii), we know that $\sigma_p(L) = \sigma_p(T)$, which completes the proof of the last claim.

4. Criteria for Real Spectrum

In this section we establish conditions on the operator $V$ guaranteeing that the abstract Klein–Gordon pencil $T$ is strongly damped and $\nu_- < \nu_+$, thus ensuring that $T$ has real spectrum $\sigma(T) \subset (-\infty, \nu_-] \,\dot{\cup}\, [\nu_+, \infty)$. They generalize the conditions given in [22] for bounded $V$ and they weaken the conditions given in [20] in the case when $V$ is definite, i.e. $V \ge 0$ or $V \le 0$.

Lemma 4.1. Assume that (V1) and (V2) hold, and let $a, b \ge 0$, $b < 1$, be the constants according to (2.2). Then, for $\lambda \in \mathbb{R}$, the form $t(\lambda)$ is semi-bounded with

(i) $t(\lambda) \ge m^2 - (a + bm + |\lambda|)^2$;
(ii) if $a + bm < m$, then $t(\lambda) \gg 0$ for $|\lambda| < m - (a + bm)$;
(iii) if $\|VH_0^{-1/2}\| < 1$, then $t(\lambda) \gg 0$ for $|\lambda| < m - \|VH_0^{-1/2}\|\, m$.

Proof. (i) Let $\lambda \in \mathbb{R}$ and $x \in D_t = D(H_0^{1/2})$. By (2.2) and (2.7), we have, for arbitrary $\varepsilon > 0$,
\[ t(\lambda)[x] = \|H_0^{1/2}x\|^2 - \|(V - \lambda)x\|^2 \ge (1 - (1 + \varepsilon)b^2) \|H_0^{1/2}x\|^2 - (1 + \varepsilon^{-1})(a + |\lambda|)^2 \|x\|^2 \]
\[ \ge \big((1 - (1 + \varepsilon)b^2)m^2 - (1 + \varepsilon^{-1})(a + |\lambda|)^2\big) \|x\|^2 =: h(\varepsilon) \|x\|^2. \]
It is not difficult to check that the function $h : (0, \infty) \to \mathbb{R}$ has a maximum at $\varepsilon_0 = (a + |\lambda|)/(bm)$ and hence $t(\lambda) \ge h(\varepsilon_0) = m^2 - (a + bm + |\lambda|)^2$.

(ii) is immediate from (i).

(iii) Let $\lambda \in \mathbb{R}$ and $x \in D_t = D(H_0^{1/2})$. Since $H_0 \ge m^2$, we have $\|H_0^{-1/2}\| \le 1/m$, $\|H_0^{1/2}x\| \ge m \|x\|$ and hence the estimate
\[ t(\lambda)[x] = \|H_0^{1/2}x\|^2 - \|(VH_0^{-1/2} - \lambda H_0^{-1/2})H_0^{1/2}x\|^2 \ge \Big(1 - \Big(\|VH_0^{-1/2}\| + \frac{|\lambda|}{m}\Big)^2\Big) \|H_0^{1/2}x\|^2 \ge \big(m^2 - (\|VH_0^{-1/2}\|\, m + |\lambda|)^2\big) \|x\|^2; \tag{4.1} \]
here, for the last estimate, we have used that the first factor is $> 0$ if (and only if) $|\lambda| < m - \|VH_0^{-1/2}\|\, m$.
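The optimization over $\varepsilon$ in the proof of Lemma 4.1(i) can be checked numerically. The sketch below evaluates $h(\varepsilon)$ from that proof at $\varepsilon_0 = (a + |\lambda|)/(bm)$ and compares it with the closed form $m^2 - (a + bm + |\lambda|)^2$. The sample values of $a$, $b$, $m$, $\lambda$ are hypothetical, chosen so that $b > 0$ and $a + |\lambda| > 0$ (otherwise $\varepsilon_0$ is not a point of $(0, \infty)$).

```python
def h(eps, a, b, m, lam):
    """h(eps) from the proof of Lemma 4.1(i)."""
    c = a + abs(lam)
    return (1.0 - (1.0 + eps) * b * b) * m * m - (1.0 + 1.0 / eps) * c * c

def lower_bound(a, b, m, lam):
    """Closed-form lower bound m^2 - (a + b*m + |lam|)^2 from Lemma 4.1(i)."""
    return m * m - (a + b * m + abs(lam)) ** 2

# Hypothetical sample constants with a + b*m < m, so that Lemma 4.1(ii) applies.
a, b, m, lam = 0.3, 0.4, 2.0, 0.5
eps0 = (a + abs(lam)) / (b * m)      # maximizer of h
gap_radius = m - (a + b * m)         # t(lam) is uniformly positive for |lam| below this

print(abs(h(eps0, a, b, m, lam) - lower_bound(a, b, m, lam)) < 1e-12)
print(all(h(eps0, a, b, m, lam) >= h(s * eps0, a, b, m, lam) for s in (0.5, 0.9, 1.1, 2.0)))
print(abs(lam) < gap_radius, lower_bound(a, b, m, lam) > 0.0)
```

Since $h$ is concave on $(0, \infty)$, comparing $h(\varepsilon_0)$ with a few neighbouring values is only a spot check of the maximum, not a proof.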
Remark 4.2. Claim (iii) of Lemma 4.1 is stronger than claim (ii) since (2.3) shows that $\|VH_0^{-1/2}\|\, m \le a + bm$. However, the estimate (4.1) in this case does not permit one to derive a lower bound for $t(\lambda)$ as in (i) for all $\lambda \in \mathbb{R}$.

Part (i) of the following theorem was proved in [17, Lemma 5.1] by means of the operator polynomial $L$ (using the inclusions $W(T) \subset W(L)$, $\sigma(T) \subset \sigma(L)$, see Propositions 2.6(i) and (ii)); parts (ii) and (iii) were proved in [22] for bounded $V$.

Theorem 4.3. Assume that (V1) holds, i.e. $D(H_0^{1/2}) \subset D(V)$.

(i) If $\|VH_0^{-1/2}\| < 1$, then
\[ \nu_- \le -m + \|VH_0^{-1/2}\|\, m < 0 < m - \|VH_0^{-1/2}\|\, m \le \nu_+ \]
and hence
\[ \sigma(T) \subset (-\infty, -m + \|VH_0^{-1/2}\|\, m] \,\dot{\cup}\, [m - \|VH_0^{-1/2}\|\, m, \infty). \]
(ii) If $V$ is self-adjoint with $V \ge 0$ and $\|VH_0^{-1/2}\| < 2$, then
\[ \nu_- \le -m + \|VH_0^{-1/2}\|\, m < m \le \nu_+ \]
and hence
\[ \sigma(T) \subset (-\infty, -m + \|VH_0^{-1/2}\|\, m] \,\dot{\cup}\, [m, \infty). \]
(iii) If $V$ is self-adjoint with $V \le 0$ and $\|VH_0^{-1/2}\| < 2$, then
\[ \nu_- \le -m < m - \|VH_0^{-1/2}\|\, m \le \nu_+ \]
and hence
\[ \sigma(T) \subset (-\infty, -m] \,\dot{\cup}\, [m - \|VH_0^{-1/2}\|\, m, \infty). \]
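The root zones in Theorem 4.3(i) can be illustrated in a finite-dimensional model. The sketch below is an assumption-laden toy example, not the setting of the paper: $H_0$ is a diagonal matrix with entries $h_i \ge m^2$, $V$ is a small real symmetric matrix, and the operator norm $\|VH_0^{-1/2}\|$ is replaced by its Frobenius norm, which is an upper bound and therefore gives a (possibly smaller) guaranteed gap. The roots $p_\pm(x)$ of $\lambda \mapsto t(\lambda)[x] = -\lambda^2 \|x\|^2 + 2\lambda (Vx, x) + (H_0 x, x) - \|Vx\|^2$ are computed from the quadratic formula. All names and data are hypothetical.

```python
import math
import random

def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    """Real Euclidean inner product."""
    return sum(xi * yi for xi, yi in zip(x, y))

def roots(h, V, x):
    """Zeros p_-(x) < p_+(x) of lam -> t(lam)[x] for H0 = diag(h) and symmetric V."""
    n = dot(x, x)
    Vx = matvec(V, x)
    v = dot(Vx, x)
    c = sum(hi * xi * xi for hi, xi in zip(h, x)) - dot(Vx, Vx)
    disc = math.sqrt(v * v + n * c)
    return (v - disc) / n, (v + disc) / n

def frobenius_bound(h, V):
    """Frobenius norm of V H0^{-1/2}; an upper bound for its operator norm."""
    size = len(h)
    return math.sqrt(sum(V[i][j] ** 2 / h[j] for i in range(size) for j in range(size)))

def sample_root_zones(h, V, trials=2000, seed=1):
    """Sampled sup of p_- and sampled inf of p_+ over random nonzero vectors."""
    rng = random.Random(seed)
    sup_minus, inf_plus = -math.inf, math.inf
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in h]
        if dot(x, x) == 0.0:
            continue
        p_minus, p_plus = roots(h, V, x)
        sup_minus, inf_plus = max(sup_minus, p_minus), min(inf_plus, p_plus)
    return sup_minus, inf_plus

# Illustrative data (assumed): H0 = diag(h) with m^2 = min(h), V symmetric and small.
h = [4.0, 5.0, 9.0]
m = math.sqrt(min(h))
V = [[0.4, 0.2, 0.0], [0.2, -0.3, 0.1], [0.0, 0.1, 0.5]]
s = frobenius_bound(h, V)

nu_minus_sampled, nu_plus_sampled = sample_root_zones(h, V)
print(s < 1.0, nu_minus_sampled <= -m + s * m, nu_plus_sampled >= m - s * m)
```

Random sampling only explores part of $\Lambda_\pm$, so the sampled values bound $\nu_-$ from below and $\nu_+$ from above; the check is therefore consistent with, but weaker than, the statement of the theorem.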
Proof. (i) The estimates for $\nu_\pm$ are immediate from Lemma 4.1(iii) and Proposition 3.6(ii). Together with Theorem 3.7, the inclusion for $\sigma(T)$ follows.

(ii) The condition $\|VH_0^{-1/2}\| < 2$ implies that $-m + \|VH_0^{-1/2}\|\, m < m$. Then, for arbitrary $\lambda \in (-m + \|VH_0^{-1/2}\|\, m, m)$ and $x \in D_t = D(H_0^{1/2})$, $\|x\| = 1$, we have
\[ t(\lambda)[x] = (H_0^{1/2}x, H_0^{1/2}x) - ((V - \lambda + m)x, (V - \lambda - m)x) - m^2 \|x\|^2 \ge -((V + (m - \lambda))x, (V - m - \lambda)x) \]
because $H_0 \ge m^2$. Since $\lambda < m$ and $V \ge 0$, we have $V + m - \lambda > 0$. Moreover, for arbitrary $y \in D(V)$, we can estimate
\[ ((V - m - \lambda)y, y) \le |(Vy, y)| - (m + \lambda)(y, y) \le \big(\|VH_0^{-1/2}H_0^{1/2}y\| - (m + \lambda) \|H_0^{-1/2}H_0^{1/2}y\|\big) \|y\| \le \Big(\|VH_0^{-1/2}\| - 1 - \frac{\lambda}{m}\Big) \|H_0^{1/2}y\|\, \|y\| \le 0 \]
because $\lambda > -m + \|VH_0^{-1/2}\|\, m$. Thus, as $V$ is self-adjoint, the square roots of the non-negative operators $V + m - \lambda$, $-(V - m - \lambda)$ exist and we arrive at the estimate
\[ t(\lambda)[x] \ge \big((V + m - \lambda)^{1/2}(-(V - m - \lambda))^{1/2}x,\ (V + m - \lambda)^{1/2}(-(V - m - \lambda))^{1/2}x\big) > 0. \]
Now Proposition 3.4(ii) applied twice with $\lambda_0 = -m + \|VH_0^{-1/2}\|\, m$ and $\lambda_0 = m$ implies that $\nu_- \le -m + \|VH_0^{-1/2}\|\, m < m \le \nu_+$.

(iii) The proof of (iii) is completely analogous to the proof of (ii).

Corollary 4.4. Suppose that assumptions (V1) and (V2) hold, let $\alpha, \beta \ge 0$, $\beta < 1$ and $a, b \ge 0$, $b < 1$, be the constants according to (2.1) and (2.2), respectively. Define
\[ \delta_1 := a + bm, \qquad \delta_2 := \sqrt{\alpha} + \sqrt{\beta}\, m, \qquad \delta_3 := \sqrt{\alpha + \beta m^2}, \]
and let $i \in \{1, 2, 3\}$.

(i) If $\delta_i < m$, then $\sigma(T) \subset (-\infty, -m + \delta_i] \,\dot{\cup}\, [m - \delta_i, \infty)$.
(ii) If $V$ is self-adjoint with $V \ge 0$ and $\delta_i < 2m$, then $\sigma(T) \subset (-\infty, -m + \delta_i] \,\dot{\cup}\, [m, \infty)$.
(iii) If $V$ is self-adjoint with $V \le 0$ and $\delta_i < 2m$, then $\sigma(T) \subset (-\infty, -m] \,\dot{\cup}\, [m - \delta_i, \infty)$.

Proof. The claims are immediate from Theorem 4.3 since $\|VH_0^{-1/2}\| \le \delta_i / m$ due to the estimates in (2.3).

Remark 4.5. Note that $\delta_3 \le \delta_2$ so that the assumption $\delta_3 < m$ is weaker than $\delta_2 < m$. Hence if $\delta_2 < m$, then we have the spectral inclusions for $i = 2$ and $i = 3$ and thus
\[ \sigma(T) \subset (-\infty, -m + \delta_3] \,\dot{\cup}\, [m - \delta_3, \infty) \subset (-\infty, -m + \delta_2] \,\dot{\cup}\, [m - \delta_2, \infty). \]

Remark 4.6. If $\widetilde{V}$ is a symmetric operator in $\mathcal{H}$ of the form $\widetilde{V} = V + c$ where $V$ satisfies (V1) and (V2), then all the above results as well as Theorem 5.5 below also apply to $\widetilde{V}$; in this case, we only have to replace the spectral parameter $\lambda$ by the new spectral parameter $\widetilde{\lambda} = \lambda - c$ and $\nu_\pm$ by $\widetilde{\nu}_\pm := \nu_\pm - c$.

5. Simplicity of the Ground State Energies

In this section we consider the particular case of the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^n)$, the self-adjoint operator $H_0 = -\Delta + m^2$, and a (real-valued) multiplication operator
$V$ therein. We assume that $V$ satisfies the assumptions of the previous sections so that the corresponding Klein–Gordon pencil $T(\lambda) = H_0 \dot{-} V^2 + 2\lambda V - \lambda^2$, $\lambda \in \mathbb{C}$, is strongly damped and $\nu_- < \nu_+$. We shall show that if, in this case, $\nu_-$ or $\nu_+$ are eigenvalues of the Klein–Gordon pencil $T$, then they are not only semi-simple, but even simple with strictly positive eigenfunction. To this end, we need the concept of positivity preserving and improving linear operators and a few other definitions.

Definition 5.1. Let $(M, d\mu)$ be a measure space.

(i) A function $f \in L^2(M, d\mu)$ is called strictly positive if $f(x) > 0$ for almost all $x \in M$ and positive if $f(x) \ge 0$ for almost all $x \in M$ and $f \not\equiv 0$.
(ii) A linear operator $B$ in $L^2(M, d\mu)$ is called positivity preserving if $Bf$ is positive for all positive $f \in D(B)$ and positivity improving if $Bf$ is strictly positive for all positive $f \in L^2(M, d\mu)$.
(iii) A configuration projection is a projection that is a multiplication operator in $L^2(M, d\mu)$; the range of such a configuration projection is called a configuration subspace of $L^2(M, d\mu)$.
(iv) A bounded linear operator $B$ in $L^2(M, d\mu)$ is called indecomposable if no non-trivial configuration subspace is left invariant under $B$.

The simplicity of the ground state energies of Schrödinger operators is usually proved by means of the following Krein–Rutman type theorem for bounded self-adjoint operators in $L^2(M, d\mu)$, applied to the resolvent or to the corresponding semi-group. The following two theorems were proved in [4].

Theorem 5.2 ([4, Theorem 10.3]). Let $B$ be a bounded self-adjoint operator in $L^2(M, d\mu)$ which is positivity preserving and indecomposable. If $b_+ := \max \sigma(B)$ is an eigenvalue of $B$, then $b_+$ is simple with strictly positive eigenfunction.

Theorem 5.3 ([4, Theorem 10.5]). Assume that $T_0$ is a non-negative self-adjoint operator in $L^2(M, d\mu)$ and let $v$ be a real-valued measurable function on $M$ such that the corresponding multiplication operator $V$ is $T_0$-form-bounded with relative bound $< 1$. Denote by $T$ the operator form sum $T_0 + V$, which is self-adjoint and bounded from below. Then

(i) if $T_0$ is indecomposable, then $T$ is indecomposable;
(ii) if $T_0$ is indecomposable and $(T_0 + c)^{-1}$ is positivity preserving for all $c > 0$ and $t_- := \min \sigma(T)$ is an eigenvalue of $T$, then $t_-$ is simple with strictly positive eigenfunction.

In the sequel we apply these results to the Klein–Gordon equation with vanishing vector potential $A$ (i.e. $A_j \equiv 0$, $j = 1, \dots, n$); here the following lemma will be used.

Lemma 5.4. The operator $H_0 = -\Delta + m^2$ in $L^2(\mathbb{R}^n)$ is indecomposable and $(H_0 + c)^{-1}$ is positivity preserving for all $c > 0$.
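A finite-difference analogue may help to visualize the positivity statement of Lemma 5.4. The sketch below is not the operator of the paper: it assumes a one-dimensional grid with Dirichlet boundary conditions, on which $-\Delta + m^2 + c$ becomes a symmetric tridiagonal matrix with positive diagonal and negative off-diagonal entries. For such a matrix the inverse has only positive entries, which is the discrete counterpart of a positivity preserving (here even positivity improving) resolvent. The grid size, step width and all function names are illustrative assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0] and upper[-1] are unused."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def resolvent_columns(n, step, m, c):
    """Columns of (H0 + c)^{-1} for the discrete model H0 = -Laplacian_h + m^2."""
    off = -1.0 / step ** 2
    main = 2.0 / step ** 2 + m * m + c
    cols = []
    for j in range(n):
        e = [1.0 if i == j else 0.0 for i in range(n)]
        cols.append(solve_tridiagonal([off] * n, [main] * n, [off] * n, e))
    return cols

# Hypothetical discretization parameters.
cols = resolvent_columns(n=8, step=0.5, m=1.0, c=1.0)
smallest_entry = min(min(col) for col in cols)
print(smallest_entry > 0.0)
```

Applying such an entrywise positive matrix to any nonzero vector with non-negative components gives a vector with strictly positive components.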
Proof. In [4] it was shown that $H_0$ is indecomposable. By the Trotter product formula (see [25, Theorem VIII.31]), we have
\[ e^{-t(-\Delta + m^2)} = \operatorname*{s-lim}_{n \to \infty} \big(e^{t\Delta/n}\, e^{-tm^2/n}\big)^n. \]
Since $e^{t\Delta}$ is positivity preserving (see [24, XIII.12, Ex. 1]) and this property is preserved under strong limits (see [23]), it follows that $e^{-tH_0}$ is positivity preserving. Now the well-known formula (see [23, (X.98)])
\[ (H_0 + c)^{-1}x = \int_0^\infty e^{-ct}\, e^{-tH_0}x \, dt, \qquad x \in L^2(\mathbb{R}^n), \]
shows that $(H_0 + c)^{-1}$ is positivity preserving as well.

Theorem 5.5. Let $H_0 = -\Delta + m^2$ and let $V$ be a symmetric operator in $L^2(\mathbb{R}^n)$ with $D(H_0^{1/2}) \subset D(V)$ and $\|VH_0^{-1/2}\| < 1$. Suppose that $\nu_- < \nu_+$ for $\nu_\pm$ defined in (3.1), (3.2). If $\nu_+$ ($\nu_-$, respectively) is an eigenvalue of $T$, then it is simple with strictly positive eigenfunction.

Proof. Assume that $\nu_-$ is an eigenvalue of $T$; the proof for $\nu_+$ is analogous. Then 0 is an eigenvalue of $T(\nu_-)$ and $T(\nu_-) \ge 0$ by Proposition 3.6(i). By Theorem 5.3, 0 is a simple eigenvalue of $T(\nu_-)$ and there is a strictly positive eigenfunction of $T(\nu_-)$ at 0 which is simultaneously an eigenfunction of $T$ at $\nu_-$. Since $\nu_-$ is a semi-simple eigenvalue of $T$ by Theorem 3.5, the assertion is proved.

Remark 5.6. For non-vanishing magnetic vector potential $A$, the extremal eigenvalues cannot be expected to be simple in general. This phenomenon already manifests itself in the situation of two spatial dimensions and a constant magnetic field $B$ (in which case we can choose $A = x \wedge B$). Then
\[ H_0 = \sum_{j=1}^{2} \Big(-i \frac{\partial}{\partial x_j} - eA_j\Big)^2 + m^2 \]
is the so-called Landau Hamiltonian (shifted by the constant $m^2$), for which the spectral resolution is explicitly known and whose eigenvalues are infinitely degenerate (see [13]). Another related result is the Aharonov–Casher theorem (see [1, Theorem 6.5]) which asserts that the degeneracies of the eigenvalues of the Landau Hamiltonian with a bounded compactly supported magnetic field are proportional to the total flux of the magnetic field. It is also known (see [30, 28]) that, for $A \ne 0$, the operator $H_0$ does not generate a positivity-preserving semigroup and thus does not satisfy the assumptions of the Krein–Rutman theory we have employed in this section.

6. Examples of Potentials

In this final section we apply the above simplicity results for the ground state energies to the spectral problem associated with the Klein–Gordon equation (1.1)
in $\mathbb{R}^n$ with $n \ge 3$. We show, in particular, how the assumptions of our abstract theorems work out for certain classes of scalar potentials, including Coulomb-like and Rollnik potentials. The first class of scalar potentials was motivated by a remark in [22], although the potentials therein were, in general, assumed to be bounded. In the sequel, we denote by $H^1(\mathbb{R}^n)$ the first order Sobolev space associated with $L^2(\mathbb{R}^n)$.

Theorem 6.1. Let $A \equiv 0$ and suppose that the potential $q$ satisfies
\[ q_- := \operatorname*{ess\,sup}_{x \in \mathbb{R}^n} \Big(eq(x) - \sqrt{m^2 + \frac{\gamma^2}{|x|^2}}\Big) < \operatorname*{ess\,inf}_{x \in \mathbb{R}^n} \Big(eq(x) + \sqrt{m^2 + \frac{\gamma^2}{|x|^2}}\Big) =: q_+ \]
with $n \ge 3$ and $0 \le \gamma < (n - 2)/2$. Then the form polynomial $t$ given by
\[ t(\lambda)[\Psi] = (-i\nabla\Psi, -i\nabla\Psi) + m^2 \|\Psi\|^2 - ((eq - \lambda)\Psi, (eq - \lambda)\Psi), \qquad \Psi \in H^1(\mathbb{R}^n), \]
is strongly damped and the boundary points $\nu_\pm$ of its numerical range defined in (3.1), (3.2) satisfy $\nu_- \le q_- < q_+ \le \nu_+$. Hence the operator polynomial $T$ associated with the Klein–Gordon equation (1.1) is strongly damped, its spectrum $\sigma(T)$ is real, $\sigma(T) \subset (-\infty, q_-] \,\dot{\cup}\, [q_+, \infty)$, $\nu_\pm \in \sigma(T)$, all eigenvalues of $T$ are semi-simple, and if $\nu_-$ or $\nu_+$ are eigenvalues of $T$, they are simple with strictly positive eigenfunctions.

Proof. The claims follow from Theorems 3.5 and 5.5 if we show that $V = eq$ satisfies conditions (V1) and (V2) and that $t(\lambda) \gg 0$ for $\lambda \in (q_-, q_+)$. So let $\lambda \in (q_-, q_+)$. Then, by the definition of $q_\pm$ and by assumption, there exists an $\varepsilon > 0$ such that
\[ |eq(x) - \lambda| \le \sqrt{m^2 + \frac{\gamma^2}{|x|^2}} - \varepsilon \quad \text{for almost all } x \in \mathbb{R}^n. \tag{6.1} \]
Together with Hardy's inequality (see e.g. [26, Sec. 3.3]), we thus obtain, for $\Psi \in D(H_0^{1/2}) = H^1(\mathbb{R}^n)$,
\[ \|(V - \lambda)\Psi\|^2 = \int_{\mathbb{R}^n} |(eq(x) - \lambda)\Psi(x)|^2 \, dx \]
\[ \le \int_{\mathbb{R}^n} \Big(\sqrt{m^2 + \frac{\gamma^2}{|x|^2}} - \varepsilon\Big)^2 |\Psi(x)|^2 \, dx \]
\[ \le (m^2 - \varepsilon^2) \|\Psi\|^2 + \gamma^2 \int_{\mathbb{R}^n} \frac{1}{|x|^2} |\Psi(x)|^2 \, dx \]
\[ \le (m^2 - \varepsilon^2) \|\Psi\|^2 + \frac{4\gamma^2}{(n - 2)^2} \int_{\mathbb{R}^n} |\nabla\Psi(x)|^2 \, dx \]
\[ = (m^2 - \varepsilon^2) \|\Psi\|^2 + \frac{4\gamma^2}{(n - 2)^2} \|\nabla\Psi\|^2 \]
\[ \le \Big(\Big(1 - \frac{4\gamma^2}{(n - 2)^2}\Big) m^2 - \varepsilon^2\Big) \|\Psi\|^2 + \frac{4\gamma^2}{(n - 2)^2} \|H_0^{1/2}\Psi\|^2. \tag{6.2} \]
Here the third line uses that $\varepsilon \le \sqrt{m^2 + \gamma^2/|x|^2}$ almost everywhere by (6.1), so that $(\sqrt{s} - \varepsilon)^2 \le s - \varepsilon^2$ for $s = m^2 + \gamma^2/|x|^2$. This shows that $D(H_0^{1/2}) = H^1(\mathbb{R}^n) \subset D(V)$, i.e. condition (V1) holds and, since $\gamma < (n - 2)/2$ and $n \ge 3$, condition (V2) is satisfied with some $\alpha \ge 0$ and $\beta = 4\gamma^2/(n - 2)^2 < 1$. The inequality (6.2) yields that
\[ \|(V - \lambda)\Psi\|^2 \le (m^2 - \varepsilon^2) \|\Psi\|^2 + \|\nabla\Psi\|^2 = -\varepsilon^2 \|\Psi\|^2 + \|H_0^{1/2}\Psi\|^2 \]
for $\lambda \in (q_-, q_+)$ and $\Psi \in D(H_0^{1/2}) = H^1(\mathbb{R}^n)$ and hence
\[ t(\lambda)[\Psi] = \|H_0^{1/2}\Psi\|^2 - \|(V - \lambda)\Psi\|^2 \ge \varepsilon^2 \|\Psi\|^2, \]
i.e. $t(\lambda) \gg 0$ for $\lambda \in (q_-, q_+)$, as required.

Remark 6.2. The assumptions on $q$ in [22] are slightly different; it is only required that $q$ satisfies a pointwise estimate as in Theorem 6.1 with $\gamma = 1/2$. It is not clear why, under this weaker assumption, still $\|VH_0^{-1/2}\| < 1$ as claimed in [22].

A particular case of Theorem 6.1 is the Coulomb potential in $\mathbb{R}^n$; in the case $n = 3$, the explicitly known formulas for the eigenvalues and eigenfunctions (see [31] and also [20, Sec. V]) confirm our results.

Corollary 6.3 (Coulomb Potential in $\mathbb{R}^n$). Let $A \equiv 0$ and let $q(x) = -Ze/|x|$, $x \in \mathbb{R}^n \setminus \{0\}$, where $Z$ is the nuclear charge. If $Ze^2 < (n - 2)/2$, then the Klein–Gordon pencil $T$ is strongly damped with $\nu_- \le -m < 0 \le \nu_+$, the spectrum of $T$ is real, $\sigma_p(T) \subset \sigma(T) \subset (-\infty, -m] \,\dot{\cup}\, [0, \infty)$, $\nu_\pm \in \sigma(T)$, and all eigenvalues of $T$ are semi-simple. If $\nu_+ < m$, then $\nu_+ \in \sigma_p(T)$ and $\nu_+$ is simple with strictly positive eigenfunction.

Proof. By Theorem 6.1 (or by directly applying Hardy's inequality), we see that $V = eq$ is $(-\Delta + m^2)^{1/2}$-bounded,
\[ D(H_0^{1/2}) \subset D(V), \qquad \|VH_0^{-1/2}\| \le \frac{2Ze^2}{n - 2}. \]
July 6, J070-S0129055X11004382
2011 11:3 WSPC/S0129-055X
148-RMP
Simplicity of Extremal Eigenvalues of the Klein–Gordon Equation
663
Hence all claims but the last follow from Theorem 6.1 if we show that $q_- \le -m$ and $q_+ \ge 0$. These two estimates follow from the inequalities
\[
-\frac{Ze^2}{|x|} - \sqrt{m^2 + \frac{(Ze^2)^2}{|x|^2}} \le -m, \qquad -\frac{Ze^2}{|x|} + \sqrt{m^2 + \frac{(Ze^2)^2}{|x|^2}} \ge 0, \qquad x \in \mathbb{R}^n \setminus \{0\}.
\]
Notice that the bounds $\nu_- \le -m$, $\nu_+ \ge 0$ also follow from Theorem 4.3(iii) since the Coulomb potential is negative.

In order to prove the last claim, we observe that $V = eq$ is $H_0$-compact where $H_0 = -\Delta + m^2$ (see [6, Lemma V.5.8]). This implies that the difference of the resolvents of $T(\lambda)$ and of $H_0 - \lambda^2$ is compact for every $\lambda \in \mathbb{R}$ and
\[
\sigma_{\mathrm{ess}}(T) = \{\lambda \in \mathbb{C} : \lambda^2 \in \sigma_{\mathrm{ess}}(H_0)\} = (-\infty, -m] \,\dot\cup\, [m, \infty)
\]
(compare [32]). Since $\nu_\pm \in \sigma(T)$ by Theorem 3.7, we conclude that, if $0 \le \nu_+ < m$, then $\nu_+ \notin \sigma_{\mathrm{ess}}(T)$ or, equivalently, $0 \notin \sigma_{\mathrm{ess}}(T(\nu_+))$. Since $T(\nu_+)$ is self-adjoint, it follows that $0 \in \sigma(T(\nu_+)) \setminus \sigma_{\mathrm{ess}}(T(\nu_+)) \subset \sigma_p(T(\nu_+))$. Thus $\nu_+ \in \sigma_p(T)$ is simple with strictly positive eigenfunction by Theorem 5.5.

Example 6.4 (Coulomb Potential in $\mathbb{R}^3$). The above results agree with the explicit formulas for the eigenvalues and eigenfunctions of the Klein–Gordon problem in $\mathbb{R}^3$. In fact, if $|Ze^2| < 1/2$, the eigenvalues are given by
\[
\lambda_{k,l} = m \left( 1 + \frac{(Ze^2)^2}{\Big( k - l - \frac12 + \sqrt{\big(l + \frac12\big)^2 - (Ze^2)^2} \Big)^2} \right)^{-\frac12},
\]
$k = 1, 2, \dots$, $l = 0, 1, \dots, k-1$. All eigenvalues $\lambda_{k,l}$ lie in the interval $(0, m)$, they are semi-simple, and $\lambda_{k,l}$ has (geometric and algebraic) multiplicity $2l + 1$. A corresponding set $\{\Psi_{k,l,j} : j = 0, \pm 1, \pm 2, \dots, \pm l\}$ of linearly independent eigenfunctions is given by
\[
\Psi_{k,l,j}(x) = N_{k,l} \, Y_{l,j}(\theta, \phi) \, \beta_{k,l}^{\mu_l - \frac12} \, |x|^{\mu_l - \frac12} \, e^{-\beta_{k,l}|x|/2} \, {}_1F_1(l + 1 - k, 2\mu_l + 1; \beta_{k,l}|x|)
\]
for $x \in \mathbb{R}^3$ in spherical coordinates $(|x|, \theta, \phi)$ and $k = 1, 2, \dots$, $l = 0, 1, 2, \dots$, $j = 0, \pm 1, \pm 2, \dots, \pm l$; here $N_{k,l} > 0$ is a normalization constant, $Y_{l,j}$ are the spherical harmonics, $\beta_{k,l} := 2\sqrt{m^2 - \lambda_{k,l}^2}$, $\mu_l := \sqrt{(l + \frac12)^2 - (Ze^2)^2}$, and ${}_1F_1$ are the confluent hypergeometric functions. In particular, the smallest eigenvalue $\lambda_{1,0}$ in the gap $(-m, m)$ of the essential spectrum is simple; $\lambda_{1,0}$ and the corresponding eigenfunction $\Psi_{1,0,0}$ are given by
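(see [5, Exercise 1.11], [31] or [20, Sec. 5, Example 1])
\[
\lambda_{1,0} = m \left( 1 + \frac{(Ze^2)^2}{\Big( \frac12 + \sqrt{\frac14 - (Ze^2)^2} \Big)^2} \right)^{-\frac12},
\]
\[
\Psi_{1,0,0}(x) = N_{1,0} \Big( 2|x| \sqrt{m^2 - \lambda_{1,0}^2} \Big)^{\sqrt{1/4 - (Ze^2)^2} - \frac12} e^{-|x| \sqrt{m^2 - \lambda_{1,0}^2}}, \qquad x \in \mathbb{R}^3.
\]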
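As a quick numerical sanity check of the formulas in Example 6.4, the following Python sketch evaluates $\lambda_{k,l}$ for a few quantum numbers. The function name `kg_coulomb_eigenvalue` and the sample values $m = 1$, $Ze^2 = 0.3$ are illustrative assumptions and are not taken from the paper.

```python
import math


def kg_coulomb_eigenvalue(k, l, m=1.0, ze2=0.3):
    """Evaluate lambda_{k,l} from Example 6.4 (hypothetical helper).

    k = 1, 2, ...; l = 0, ..., k - 1; m > 0 is the mass; ze2 stands for Z e^2
    and must satisfy |ze2| < 1/2 so that the square root below is real.
    """
    if not (k >= 1 and 0 <= l <= k - 1):
        raise ValueError("need k >= 1 and 0 <= l <= k - 1")
    if not abs(ze2) < 0.5:
        raise ValueError("need |Ze^2| < 1/2")
    mu_l = math.sqrt((l + 0.5) ** 2 - ze2 ** 2)  # mu_l as defined in the example
    denom = k - l - 0.5 + mu_l
    return m * (1.0 + ze2 ** 2 / denom ** 2) ** (-0.5)


if __name__ == "__main__":
    m, ze2 = 1.0, 0.3
    for k in range(1, 4):
        for l in range(k):
            lam = kg_coulomb_eigenvalue(k, l, m, ze2)
            # report whether the eigenvalue lies in the interval (0, m)
            print(k, l, lam, 0.0 < lam < m)
```

For $k = 1$, $l = 0$ the expression simplifies algebraically to $m\sqrt{1/2 + \sqrt{1/4 - (Ze^2)^2}}$, which gives an independent check of the closed form for $\lambda_{1,0}$.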
Obviously, $\Psi_{1,0,0}$ is strictly positive (note that we chose $N_{1,0} > 0$). Since $\lambda_{1,0} < m$ and $\lambda_{1,0} = p_+(\Psi_{1,0,0})$, we know that $\nu_+ \le \lambda_{1,0} < m$ and hence, by Corollary 6.3, $\nu_+ \in \sigma_p(T)$. This illustrates the claim of Theorem 5.5 that $\nu_+ = \lambda_{1,0}$ is simple with strictly positive eigenfunction $\Psi_{1,0,0}$.

Proposition 6.5 (Rollnik Potentials). Let $n = 3$ and $A \equiv 0$. The scalar potential $V = eq$ is said to be in the Rollnik class $R$ if $q : \mathbb{R}^3 \to \mathbb{C}$ is a measurable function and
\[
\|V\|_R^2 := \int_{\mathbb{R}^3} \int_{\mathbb{R}^3} \frac{|V(x)V(y)|}{|x - y|^2} \, dx \, dy < \infty
\]
(see [23]). If $V^2 \in R$ and $\|V^2\|_R < 4\pi$, then the Klein–Gordon pencil $T$ is strongly damped with $\nu_- < \nu_+$, its spectrum $\sigma(T)$ is real with
\[
\sigma(T) \subset \Big( -\infty, -m + \frac{\|V^2\|_R}{4\pi} m \Big] \,\dot\cup\, \Big[ m - \frac{\|V^2\|_R}{4\pi} m, \infty \Big),
\]
all eigenvalues of $T$ are semi-simple, $\nu_\pm \in \sigma(T)$, and if $\nu_-$ or $\nu_+$ are eigenvalues of $T$, they are simple with strictly positive eigenfunctions.

Proof. It was shown in [23, Theorem X.19] that if $V^2 \in R$, then $V$ is $H_0^{1/2}$-bounded with $H_0^{1/2}$-bound 0 and
\[
\|V(-\Delta + m^2)^{-\frac12}\| \le \frac{\|V^2\|_R}{4\pi}.
\]
Hence condition (V1) holds and, if $\|V^2\|_R < 4\pi$, then condition (V2) is satisfied with $a = 0$ and $b = \|V^2\|_R/(4\pi) < 1$. Thus Theorems 4.3(i), 3.5, and 5.5 apply and yield all claims.
While the simplicity results of Sec. 5 only apply for vanishing magnetic potentials $A \equiv 0$, the semi-simplicity results of Sec. 3 also apply if $A \not\equiv 0$. Here we only mention that the condition (V1), i.e. $D(H_0^{1/2}) \subset D(V)$, may be guaranteed by the following assumptions on $A$ and $V$:
(KG1) $A = (A_j)_{j=1}^n \in L^2_{\mathrm{loc}}(\mathbb{R}^n)^n$;
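(KG2) $H_0$ is the self-adjoint realization of $\sum_{j=1}^n \big( -i \frac{\partial}{\partial x_j} - eA_j \big)^2 + m^2$ in $L^2(\mathbb{R}^n)$;
(KG3) the multiplication operator $V$ by $eq$ in $L^2(\mathbb{R}^n)$ satisfies
\[
H_A^1(\mathbb{R}^n) := \Big\{ \Psi \in L^2(\mathbb{R}^n) : \Big( -i \frac{\partial}{\partial x_j} - eA_j \Big) \Psi \in L^2(\mathbb{R}^n), \ j = 1, \dots, n \Big\} \subset D(V)
\]
(see e.g. [16, Definition 7.20]). Note that $H_A^1(\mathbb{R}^n) = H^1(\mathbb{R}^n)$ if $A \equiv 0$. For $A \not\equiv 0$, the inclusion $H_A^1(\mathbb{R}^n) \subset H^1(\mathbb{R}^n)$ need not be true; $\Psi \in H_A^1(\mathbb{R}^n)$ only implies $|\Psi| \in H^1(\mathbb{R}^n)$.

If (KG1), (KG2) hold, then the quadratic form $t_0 := (H_0^{1/2} \cdot, H_0^{1/2} \cdot)$ with domain $D(t_0) := D(H_0^{1/2})$ is given by
\[
t_0[\Psi] = ((-i\nabla - eA)\Psi, (-i\nabla - eA)\Psi) + m^2 \|\Psi\|^2, \qquad D(t_0) = H_A^1(\mathbb{R}^n).
\]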
In fact, it is easy to see that the formula for $t_0[\Psi]$ holds for $\Psi \in C_0^\infty(\mathbb{R}^n)$; since $C_0^\infty(\mathbb{R}^n)$ is a core of $t_0$ by [16, Theorem 7.22], it extends to $\Psi \in D(t_0) = H_A^1(\mathbb{R}^n)$.

It now remains to establish conditions on $V = eq$ guaranteeing that $V$ also satisfies the relative form-boundedness assumption (V2) with respect to $H_0$, so that the operators $T(\lambda) = H_0 \mathbin{\dot{-}} V^2 + 2\lambda V - \lambda^2$ are defined for $\lambda \in \mathbb{C}$ according to Lemma 2.3. Such conditions will, in general, depend on the particular properties of the magnetic potential $A$. For certain classes of vector potentials $A$, it may even be possible to define the operators $T(\lambda)$ for $\lambda \in \mathbb{R}$ without using Lemma 2.3; if e.g. $A = (A_j)_{j=1}^n \in L^4_{\mathrm{loc}}(\mathbb{R}^n)^n$, $\nabla \cdot A \in L^2_{\mathrm{loc}}(\mathbb{R}^n)$, and $q^2 \in L^2_{\mathrm{loc}}(\mathbb{R}^n)$, then it can be shown that $T(\lambda)$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^n)$ by the Leinfelder–Simader theorem (see [19], [1, Theorem 1.15]).

Acknowledgments

We thank an anonymous referee for a careful reading of our paper and thoughtful comments. The support for this work of Deutsche Forschungsgemeinschaft DFG, grant no. TR368/6-1, and of Schweizerischer Nationalfonds, SNF, grant no. 200021119826/1, is greatly appreciated.

References

[1] H. L. Cycon, R. G. Froese, W. Kirsch and B. Simon, Schrödinger Operators with Application to Quantum Mechanics and Global Geometry, Texts and Monographs in Physics, Study edition (Springer-Verlag, Berlin, 1987).
[2] M. Demuth and M. Krishna, Determining Spectra in Quantum Theory, Progress in Mathematical Physics, Vol. 44 (Birkhäuser Boston Inc., Boston, MA, 2005).
[3] R. J. Duffin, A minimax theory for overdamped networks, J. Ration. Mech. Anal. 4 (1955) 221–233.
[4] W. G. Faris, Self-Adjoint Operators, Lecture Notes in Mathematics, Vol. 433 (Springer-Verlag, Berlin, 1975).
[5] W. Greiner, Relativistic Quantum Mechanics, 3rd edn. (Springer-Verlag, Berlin, 2000); Wave equations, translated from the second German (1987) edition, with a foreword by D. A. Bromley.
[6] T. Kato, Perturbation Theory for Linear Operators, 2nd edn. (Springer-Verlag, 1980).
[7] M. G. Kreĭn and G. K. Langer, On the theory of quadratic pencils of self-adjoint operators, Dokl. Akad. Nauk SSSR 154 (1964) 1258–1261 (Russian).
[8] M. G. Kreĭn and G. K. Langer, Certain mathematical principles of the linear theory of damped vibrations of continua, in Appl. Theory of Functions in Continuum Mechanics (Proc. Internat. Sympos., Tbilisi, 1963), Vol. II, Fluid and Gas Mechanics, Math. Methods (Izdat. “Nauka”, Moscow, 1965), pp. 283–322 (Russian).
[9] M. G. Kreĭn and H. Langer, On some mathematical principles in the linear theory of damped oscillations of continua. I, Integral Equations Operator Theory 1(3) (1978) 364–399; translated from the Russian by R. Troelstra.
[10] M. G. Kreĭn and H. Langer, On some mathematical principles in the linear theory of damped oscillations of continua. II, Integral Equations Operator Theory 1(4) (1978) 539–566; translated from the Russian by R. Troelstra.
[11] M. G. Kreĭn and M. A. Rutman, Linear operators leaving invariant a cone in a Banach space, Uspehi Matem. Nauk (N.S.) 3(1(23)) (1948) 3–95 (Russian).
[12] M. G. Kreĭn and M. A. Rutman, Linear operators leaving invariant a cone in a Banach space, Amer. Math. Soc. Trans. 1950 (1950), No. 26, 128 pp.
[13] L. D. Landau, Diamagnetismus der Metalle, Z. Phys. 64 (1930) 629.
[14] H. Langer, Spektraltheorie linearer Operatoren in J-Räumen und einige Anwendungen auf die Schar L(λ) = λ²I + λB + C, Habilitationsschrift, Technische Universität Dresden (1965) (German).
[15] H. Langer, Über stark gedämpfte Scharen im Hilbertraum, J. Math. Mech. 17 (1967/1968) 685–705.
[16] E. H. Lieb and M. Loss, Analysis, Graduate Studies in Mathematics, Vol. 14, 2nd edn. (American Mathematical Society, Providence, RI, 2001).
[17] H. Langer, B. Najman and C. Tretter, Spectral theory of the Klein–Gordon equation in Pontryagin spaces, Comm. Math. Phys. 267(1) (2006) 159–180.
[18] H. Langer, B. Najman and C. Tretter, Spectral theory of the Klein–Gordon equation in Krein spaces, Proc. Edinburgh Math. Soc. (2) 51(3) (2008) 711–750.
[19] H. Leinfelder and C. G. Simader, Schrödinger operators with singular magnetic vector potentials, Math. Z. 176(1) (1981) 1–19.
[20] M. Langer and C. Tretter, Variational principles for eigenvalues of the Klein–Gordon equation, J. Math. Phys. 47(10) (2006) 103506, 18 pp.
[21] A. S. Markus, Introduction to the Spectral Theory of Polynomial Operator Pencils, Translations of Mathematical Monographs, Vol. 71 (American Mathematical Society, Providence, RI, 1988); translated from the Russian by H. H. McFaden, with an appendix by M. V. Keldysh.
[22] B. Najman, Eigenvalues of the Klein–Gordon equation, Proc. Edinburgh Math. Soc. (2) 26(2) (1983) 181–190.
[23] M. Reed and B. Simon, Methods of Modern Mathematical Physics. II. Fourier Analysis, Self-Adjointness (Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1975).
[24] M. Reed and B. Simon, Methods of Modern Mathematical Physics. IV. Analysis of Operators (Academic Press [Harcourt Brace Jovanovich Publishers], New York, 1978).
[25] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis, 2nd edn. (Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1980).
[26] G. Rozenblum, M. Solomyak and M. A. Shubin, Partial Differential Equations VII. Spectral Theory of Differential Operators, Encyclopaedia of Mathematical Sciences, Vol. 64 (Springer-Verlag, Berlin, 1994); English translation of the 1989 Russian original.
[27] A. A. Shkalikov, Strongly damped operator pencils and the solvability of the corresponding operator-differential equations, Mat. Sb. (N.S.) 135(177)(1) (1988) 96–118, 143.
[28] B. Simon, Functional Integration and Quantum Physics, Pure and Applied Mathematics, Vol. 86 (Academic Press Inc. [Harcourt Brace Jovanovich Publishers], New York, 1979).
[29] L. I. Schiff, H. Snyder and J. Weinberg, On the existence of stationary states of the mesotron field, Phys. Rev. 57 (1940) 315–318.
[30] E. H. Sondheimer and A. H. Wilson, The diamagnetism of free electrons, Proc. R. Soc. London Ser. A 210 (1951) 173–190.
[31] K. Veselić, On the nonrelativistic limit of the bound states of the Klein–Gordon equation, J. Math. Anal. Appl. 96(1) (1983) 63–84.
[32] R. Weder, Selfadjointness and invariance of the essential spectrum for the Klein–Gordon equation, Helv. Phys. Acta 50(1) (1977) 105–115.
July 6, J070-S0129055X11004400
2011 11:7 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 6 (2011) 669–690 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004400
INTRODUCTION TO SUPERGEOMETRY
ALBERTO S. CATTANEO∗,‡ and FLORIAN SCHÄTZ†,§
∗Institut für Mathematik, Universität Zürich, Switzerland
†Center for Mathematical Analysis, Geometry and Dynamical Systems, IST Lisbon, Portugal
‡[email protected]
§[email protected]
Received 18 January 2011
These notes are based on a series of lectures given by the first author at the school of “Poisson 2010”, held at IMPA, Rio de Janeiro. They contain an exposition of the theory of super- and graded manifolds, cohomological vector fields, graded symplectic structures, reduction and the AKSZ-formalism. Keywords: Supermanifolds and graded manifolds; graded symplectic geometry; AKSZ formalism. Mathematics Subject Classification 2010: 58A50, 51P05, 53D20
1. Introduction

The main idea of supergeometry is to extend classical geometry by allowing for odd coordinates. These are coordinates which anticommute, in contrast to usual coordinates which commute. The global objects obtained by gluing such extended coordinate systems are supermanifolds. A prominent example is obtained by considering the one-forms $(dx^i)_{i=1}^n$ as odd coordinates, accompanying the usual “even” coordinates $(x^i)_{i=1}^n$ on $\mathbb{R}^n$. The corresponding supermanifold is known as $\Pi T\mathbb{R}^n$. The use of odd coordinates has its roots in physics, but it turned out to have interesting mathematical applications as well. Let us briefly mention those which are explained in more detail below:

(i) Some classical geometric structures can be encoded in simple supergeometric structures. For instance, Poisson manifolds and Courant algebroids can be described in a uniform way in terms of supermanifolds equipped with a supersymplectic structure and a symplectic cohomological vector field, see Sec. 4.3. One can also treat generalized complex structures in this setting.
(ii) As an application, a unifying approach to the reduction of Poisson manifolds, Courant algebroids and generalized complex structures can be developed. An outline of this approach is presented in Sec. 4.4.
(iii) The AKSZ-formalism ([1]) allows one to associate topological field theories to supermanifolds equipped with additional structures, see Sec. 5. Combining this with the supergeometric description of classical geometric structures mentioned in (i), one obtains topological field theories associated to Poisson manifolds and Courant algebroids. These field theories include the Poisson Sigma model as well as Chern–Simons theory for trivial principal bundles.

1.1. Plan of the notes

Section 2 is a short review of the basics of supergeometry. In Sec. 3, graded manifolds, as well as graded and cohomological vector fields, are introduced. The concept of cohomological vector fields allows one to think of “symmetries”, which appear in a wide variety of examples, in a unified and geometric way. For instance, $L_\infty$-algebras and Lie algebroid structures can be seen as special instances of cohomological vector fields. In Sec. 4, graded symplectic manifolds are explained. Due to the additional grading, graded symplectic geometry often behaves much more rigidly than its ungraded counterpart. A dg symplectic manifold is a graded symplectic manifold with a compatible cohomological vector field. Poisson manifolds, Courant algebroids and generalized complex structures fit naturally into the framework of dg symplectic manifolds. This is explained in Sec. 4.3, while Sec. 4.4 outlines a unified approach to the reduction of these structures via graded symplectic geometry. Finally, Sec. 5 provides an introduction to the AKSZ-formalism ([1]). This is a procedure which allows one to associate topological field theories to dg symplectic manifolds. As particular examples, one recovers the Poisson Sigma model and Chern–Simons theory (for trivial principal bundles).
In this section, we basically follow the expositions of the AKSZ-formalism from [17] and [8], respectively.

2. Supermanifolds

2.1. Definition

A supermanifold $\mathcal{M}$ is a locally ringed space $(M, \mathcal{O}_M)$ which is locally isomorphic to $(U, C^\infty(U) \otimes \wedge W^*)$, where $U$ is an open subset of $\mathbb{R}^n$ and $W$ is some finite-dimensional real vector space. The isomorphism mentioned above is in the category of $\mathbb{Z}_2$-graded algebras, i.e. the parity
\[
\bigoplus_{k \ge 0} C^\infty(U) \otimes \wedge^k W^* \to \mathbb{Z}_2, \qquad f \otimes x \mapsto |f \otimes x| := |x| = k \bmod 2
\]
has to be preserved.
Loosely speaking, every supermanifold is glued from pieces that look like open subsets of $\mathbb{R}^n$, together with some odd coordinates, which correspond to a basis of $W^*$. The supermanifold corresponding to such a local piece is denoted by $U \times \Pi W$, and we write $C^\infty(U \times \Pi W) := C^\infty(U) \otimes \wedge W^*$. Similarly, the algebra of polynomial functions on $V \times \Pi W$, for $V$ and $W$ real, finite dimensional vector spaces, is $S(V^*) \otimes \wedge W^*$. Here $S(V^*)$ denotes the symmetric algebra of the vector space $V^*$.

In the global situation, the algebra of smooth functions $C^\infty(\mathcal{M})$ on a supermanifold $\mathcal{M}$ is defined to be the algebra of global sections of the sheaf associated to $\mathcal{M}$. The parity extends to $C^\infty(\mathcal{M})$ and $C^\infty(\mathcal{M})$ is a graded commutative algebra with respect to this parity, i.e. for $f$ and $g$ homogeneous elements of degree $|f|$ and $|g|$ respectively, one has
\[
f \cdot g = (-1)^{|f||g|} \, g \cdot f.
\]

Examples 2.1.
(1) The algebra of differential forms $\Omega(M)$ on a manifold $M$ is locally isomorphic to $C^\infty(U) \otimes \wedge T_x^* M$ where $x$ is some point on $U$. Hence the sheaf of differential forms on a manifold corresponds to a supermanifold.
(2) Let $\mathfrak{g}$ be a real, finite dimensional Lie algebra. The cochains of the Chevalley–Eilenberg complex of $\mathfrak{g}$ are the elements of $\wedge \mathfrak{g}^*$, which is the same as the algebra of smooth functions on the supermanifold $\Pi\mathfrak{g}$.
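The sign rule $f \cdot g = (-1)^{|f||g|} g \cdot f$ can be checked by hand on the purely odd part $\wedge W^*$ of the local model. The following Python sketch is one minimal way to do so; the representation (dictionaries keyed by increasing index tuples) and the names `wedge` and `parity` are assumptions made for this illustration only.

```python
from itertools import product


def _merge(a, b):
    """Multiply two basis monomials given as strictly increasing index tuples.

    Returns (sign, monomial); the sign is 0 if an odd generator is repeated.
    """
    if set(a) & set(b):
        return 0, ()
    merged = a + b
    # number of transpositions needed to sort the concatenated indices
    inversions = sum(
        1
        for i in range(len(merged))
        for j in range(i + 1, len(merged))
        if merged[i] > merged[j]
    )
    return (-1) ** inversions, tuple(sorted(merged))


def wedge(f, g):
    """Product in the exterior algebra; elements are {monomial: coefficient}."""
    result = {}
    for (a, ca), (b, cb) in product(f.items(), g.items()):
        sign, mono = _merge(a, b)
        if sign:
            result[mono] = result.get(mono, 0) + sign * ca * cb
    return {mono: c for mono, c in result.items() if c != 0}


def parity(f):
    """Parity (0 or 1) of a non-zero element that is homogeneous in parity."""
    parities = {len(mono) % 2 for mono in f}
    if len(parities) != 1:
        raise ValueError("element is zero or not homogeneous in parity")
    return parities.pop()


if __name__ == "__main__":
    theta1, theta2 = {(1,): 1}, {(2,): 1}
    print(wedge(theta1, theta2))  # theta1 * theta2
    print(wedge(theta2, theta1))  # the same monomial with the opposite sign
    print(wedge(theta1, theta1))  # an odd coordinate squares to zero
```

Here the empty dictionary plays the role of the zero element, so the last line illustrates that an odd coordinate squares to zero.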
2.2. Morphisms of supermanifolds

Since supermanifolds are defined as certain locally ringed spaces, it is natural to define morphisms of supermanifolds as morphisms of these locally ringed spaces. In the smooth setting, one can equivalently define morphisms from $\mathcal{M}$ to $\mathcal{N}$ to be morphisms of superalgebras from $C^\infty(\mathcal{N})$ to $C^\infty(\mathcal{M})$, see [24] or [7].

Let us spell this out in more detail for the local case, i.e. consider a patch of $\mathcal{N}$ isomorphic to $\tilde{V} \times \Pi\tilde{W}$. In this situation one has
\[
\mathrm{Mor}(\mathcal{M}, \mathcal{N}) \cong \mathrm{Mor}_{\mathrm{alg}}(C^\infty(\mathcal{N}), C^\infty(\mathcal{M})) \cong \big( (\tilde{V} \oplus \Pi\tilde{W}) \otimes C^\infty(\mathcal{M}) \big)_{\mathrm{even}}.
\]
Here, $(\tilde{V} \oplus \Pi\tilde{W}) \otimes C^\infty(\mathcal{M})$ is a super vector space, i.e. a vector space with a decomposition into an even and an odd part, and the subscript denotes its even part. $C^\infty(\mathcal{M})$ is graded by the parity, $\tilde{V} \oplus \Pi\tilde{W}$ is considered with its obvious decomposition and the parity of a tensor product $a \otimes b$ is the product of the parities of $a$ and $b$, respectively.
The last isomorphism uses the fact that it suffices to know the restriction of a morphism
\[
C^\infty(\tilde{V} \times \Pi\tilde{W}) \to C^\infty(\mathcal{M})
\]
to the subspace of linear functions, in order to be able to recover the morphism itself.

3. Graded Manifolds

3.1. Definition

Let us first introduce the relevant linear theory: A graded vector space $V$ is a collection of vector spaces $(V_i)_{i \in \mathbb{Z}}$. The algebra of polynomial functions on $V$ is the graded symmetric algebra $S(V^*)$ over $V^*$. In more detail:
• The dual $V^*$ of a graded vector space $V$ is the graded vector space $(V_{-i}^*)_{i \in \mathbb{Z}}$.
• The graded symmetric algebra $S(W)$ over a graded vector space $W$ is the quotient of the tensor algebra of $W$ by the ideal generated by the elements of the form $v \otimes w - (-1)^{|v||w|} w \otimes v$ for any homogeneous elements $v$ and $w$ of $W$.

A morphism $f : V \to W$ of graded vector spaces is a collection of linear maps $(f_i : V_i \to W_i)_{i \in \mathbb{Z}}$. The morphisms between graded vector spaces are also referred to as graded linear maps. Moreover, $V$ shifted by $k$ is the graded vector space $V[k]$ given by $(V_{i+k})_{i \in \mathbb{Z}}$. By definition, a graded linear map of degree $k$ between $V$ and $W$ is a graded linear map between $V$ and $W[k]$.

The definition of a graded manifold is analogous to that of a supermanifold, but now in the graded setting:
• The local model is $(U, C^\infty(U) \otimes S(W^*))$, where $U$ is an open subset of $\mathbb{R}^n$ and $W$ is a graded vector space.
• The isomorphism between the structure sheaf and the local model is in the category of $\mathbb{Z}$-graded algebras.

The algebra of smooth functions of a graded manifold $(M, \mathcal{O}_M)$ (i.e. the algebra of global sections) automatically inherits a $\mathbb{Z}$-grading. Morphisms between graded manifolds are morphisms of locally ringed spaces. In the smooth setting, one can equivalently consider morphisms of the $\mathbb{Z}$-graded algebras of smooth functions.
Essentially all examples of graded manifolds come from graded vector bundles, which are generalizations of graded vector spaces. A graded vector bundle $E$ over a manifold $M$ is a collection of ordinary vector bundles $(E_i)_{i \in \mathbb{Z}}$ over $M$. The sheaf
\[
U \mapsto \Gamma(U, S(E|_U^*))
\]
corresponds to a graded manifold which we will also denote by $E$ from now on. It can be shown that any graded manifold is isomorphic to a graded manifold associated to a graded vector bundle.

Examples 3.1.
(1) For $V$ an ordinary vector space, one has $C^\infty(V[1]) = \wedge V^*$. In particular, the space of cochains of the Chevalley–Eilenberg complex of a Lie algebra $\mathfrak{g}$ is equal to $C^\infty(\mathfrak{g}[1])$.
(2) The algebra of differential forms $\Omega(M)$ is the algebra of smooth functions on $T[1]M$.

3.2. Graded vector fields

Let $V$ be a graded vector space with homogeneous coordinates $(x^i)_{i=1}^n$ corresponding to a basis of $V^*$. A vector field on $V$ is a linear combination of the form
\[
X = \sum_{i=1}^n X^i \frac{\partial}{\partial x^i}
\]
where $(X^i)_{i=1}^n$ is a tuple of functions on $V$, i.e. of elements of $S(V^*)$, and $\big( \frac{\partial}{\partial x^i} \big)_{i=1}^n$ is the basis of $V$ dual to $(x^i)_{i=1}^n$. The vector field $X$ acts on the algebra of functions according to the following rules:
• $\frac{\partial}{\partial x^i}(x^j) = \delta_i^j$ and
• $\frac{\partial}{\partial x^i}(fg) = \Big( \frac{\partial}{\partial x^i}(f) \Big) g + (-1)^{|x^i||f|} f \Big( \frac{\partial}{\partial x^i}(g) \Big)$.
A vector field is graded if it maps functions of degree $m$ to functions of degree $m + k$ for some fixed $k$. In this case, the integer $k$ is called the degree of $X$.

Globally, graded vector fields on a graded manifold can be identified with graded derivations of the algebra of smooth functions. Accordingly, a graded vector field on $\mathcal{M}$ is a graded linear map
\[
X : C^\infty(\mathcal{M}) \to C^\infty(\mathcal{M})[k]
\]
which satisfies the graded Leibniz rule, i.e.
\[
X(fg) = X(f)g + (-1)^{k|f|} f X(g)
\]
holds for all homogeneous smooth functions $f$ and $g$. The integer $k$ is called the degree of $X$.
Example 3.2. Every graded manifold comes equipped with the graded Euler vector field which can be defined in two ways:
• In local coordinates $(x^i)_{i=1}^n$, it is given by
\[
E = \sum_{i=1}^n |x^i| \, x^i \frac{\partial}{\partial x^i}.
\]
• Equivalently, it is the derivation which acts on homogeneous smooth functions via $E(f) = |f| f$.

3.3. Cohomological vector fields

Definition 3.3. A cohomological vector field is a graded vector field of degree $+1$ which commutes with itself.

Remark 3.4. Observe that the graded commutator equips the graded vector space of graded vector fields with the structure of a graded Lie algebra: if $X$ and $Y$ are graded derivations of degree $k$ and $l$, respectively, then
\[
[X, Y] := X \circ Y - (-1)^{kl} \, Y \circ X
\]
is a graded derivation of degree $k + l$. It is left as an exercise to the reader to verify that in local coordinates $(x^i)_{i=1}^n$, the graded commutator $[X, Y]$ of
\[
X = \sum_{i=1}^n X^i \frac{\partial}{\partial x^i} \quad \text{and} \quad Y = \sum_{j=1}^n Y^j \frac{\partial}{\partial x^j}
\]
is equal to
\[
\sum_{i,j=1}^n X^i \frac{\partial Y^j}{\partial x^i} \frac{\partial}{\partial x^j} - (-1)^{\epsilon} \sum_{i,j=1}^n Y^j \frac{\partial X^i}{\partial x^j} \frac{\partial}{\partial x^i},
\]
for some appropriate sign $\epsilon$. Furthermore, one can check that the degree of a graded vector field $X$ is its eigenvalue with respect to the Lie derivative along $E$, i.e.
\[
[E, X] = \deg(X) X.
\]
Let $Q$ be a graded vector field of degree $+1$, i.e. $Q$ is a linear map
\[
Q : C^\infty(\mathcal{M}) \to C^\infty(\mathcal{M})[1]
\]
which satisfies the graded Leibniz rule. Because of $[Q, Q] = 2(Q \circ Q)$, every cohomological vector field on $\mathcal{M}$ corresponds to a differential on the graded algebra of smooth functions $C^\infty(\mathcal{M})$.
Examples 3.5.
(1) Consider the shifted tangent bundle $T[1]M$, whose algebra of smooth functions is equal to the algebra of differential forms $\Omega(M)$. The de Rham differential on $\Omega(M)$ corresponds to a cohomological vector field $Q$ on $T[1]M$. We fix local coordinates $(x^i)_{i=1}^n$ on $M$ and denote the induced fiber coordinates on $T[1]M$ by $(dx^i)_{i=1}^n$. In the coordinate system $(x^i, dx^i)_{i=1}^n$, the cohomological vector field $Q$ is given by
\[
Q = \sum_{i=1}^n dx^i \frac{\partial}{\partial x^i}.
\]
(2) Let $\mathfrak{g}$ be a real, finite dimensional Lie algebra. The graded manifold $\mathfrak{g}[1]$ carries a cohomological vector field $Q$ which corresponds to the Chevalley–Eilenberg differential on $\wedge \mathfrak{g}^* = C^\infty(\mathfrak{g}[1])$. In more detail, let $(e_i)_{i=1}^n$ be a basis of $\mathfrak{g}$, and $(f_{ij}^k)$ be the corresponding structure constants given by
\[
[e_i, e_j] = \sum_{k=1}^n f_{ij}^k e_k. \tag{1}
\]
Then, the cohomological vector field $Q$ reads
\[
\frac{1}{2} \sum_{i,j,k=1}^n x^i x^j f_{ij}^k \frac{\partial}{\partial x^k},
\]
where $(x^i)_{i=1}^n$ are the coordinates on $\mathfrak{g}[1]$ which correspond to the basis dual to $(e_i)_{i=1}^n$. Actually, one can check that $[Q, Q] = 0$ is equivalent to the statement that the bracket $[-, -] : \mathfrak{g} \otimes \mathfrak{g} \to \mathfrak{g}$ defined via formula (1) satisfies the Jacobi identity.
(3) One can generalize the last example in two directions:
(a) Allowing for higher degrees: let $V$ be a graded vector space with only finitely many non-zero homogeneous components, all of which are finite dimensional. Formal$^a$ cohomological vector fields on $V$ are in one-to-one correspondence with $L_\infty$-algebra structures on $V$.
(b) Allowing for a non-trivial base: let $A$ be a vector bundle over a manifold $M$. Cohomological vector fields on $A[1]$ are in one-to-one correspondence with Lie algebroid structures on $A$. This observation is due to Vaintrob [23].

Definition 3.6. A graded manifold endowed with a cohomological vector field is called a differential graded manifold, or dg manifold for short. A morphism of dg manifolds is a morphism of graded manifolds, with respect to which the cohomological vector fields are related.

$^a$ Formal cohomological vector fields are elements of the completion of the space of vector fields with respect to the degree, i.e. one considers $\hat{S}(V^*) \otimes V$ instead of $S(V^*) \otimes V$. The subset of (ordinary) cohomological vector fields corresponds to $L_\infty$-algebra structures on $V$ whose structure maps $V^{\otimes n} \to V[1]$ vanish for all but finitely many $n$.
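The equivalence in Example 3.5(2) between $[Q, Q] = 0$ and the Jacobi identity can be tested numerically. Applying $Q$ twice to a coordinate gives $Q(Q(x^k)) = \frac{1}{2} \sum f_{ab}^i f_{ij}^k \, x^a x^b x^j$, and since the $x$'s anticommute only the totally antisymmetric part of the coefficients survives. The Python sketch below computes that antisymmetric part; the helper names and the sample brackets are assumptions chosen for illustration.

```python
from itertools import permutations


def permutation_sign(p):
    """Sign of a permutation given as a tuple of distinct comparable entries."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def structure_constants(n, brackets):
    """Build f[i][j][k] = f_{ij}^k from {(i, j): {k: value}}, enforcing antisymmetry."""
    f = [[[0] * n for _ in range(n)] for _ in range(n)]
    for (i, j), image in brackets.items():
        for k, value in image.items():
            f[i][j][k] = value
            f[j][i][k] = -value
    return f


def q_squared_coefficients(f):
    """Antisymmetrised coefficients of x^a x^b x^c (a < b < c) in Q(Q(x^k)).

    Up to a non-zero numerical factor these are the components of the
    Jacobiator of the bracket, so they all vanish iff [Q, Q] = 0.
    """
    n = len(f)
    coeffs = {}
    for a in range(n):
        for b in range(a + 1, n):
            for c in range(b + 1, n):
                for k in range(n):
                    total = 0
                    for p in permutations((a, b, c)):
                        inner = sum(f[p[0]][p[1]][i] * f[i][p[2]][k] for i in range(n))
                        total += permutation_sign(p) * inner
                    coeffs[(a, b, c, k)] = total
    return coeffs


if __name__ == "__main__":
    # sl(2) with basis h = 0, e = 1, f = 2: [h, e] = 2e, [h, f] = -2f, [e, f] = h
    sl2 = structure_constants(3, {(0, 1): {1: 2}, (0, 2): {2: -2}, (1, 2): {0: 1}})
    # an antisymmetric bracket that is not a Lie bracket: [e0, e1] = e2, [e2, e0] = e0
    bad = structure_constants(3, {(0, 1): {2: 1}, (2, 0): {0: 1}})
    print(all(v == 0 for v in q_squared_coefficients(sl2).values()))
    print(all(v == 0 for v in q_squared_coefficients(bad).values()))
```

The second bracket is antisymmetric but violates the Jacobi identity, so the corresponding vector field of degree $+1$ is not cohomological.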
Remark 3.7. Morphisms of dg manifolds can be defined equivalently by requiring that the corresponding morphism between the algebras of smooth functions is a chain map with respect to the differentials given by the cohomological vector fields.

Remark 3.8. So far we elaborated on the condition $[Q, Q] = 0$ mostly from an algebraic perspective. However, it also has geometric significance, as we will see now. Assume $X$ is a graded vector field of degree $k$ on a graded manifold $\mathcal{M}$. We want to construct the flow of $X$, i.e. solve the ordinary differential equation
\[
\frac{dx^i(t)}{dt} = X^i(x(t)) \tag{2}
\]
in a coordinate chart of $\mathcal{M}$, where $X$ is given by
\[
\sum_{i=1}^n X^i \frac{\partial}{\partial x^i}.
\]
Observe that the degree of the components $X^i$ is $|x^i| + k$. Consequently we have to assign degree $-k$ to the time parameter $t$ in order for the two sides of the flow equation to have the same degree. Although we will not introduce the concept of maps between graded manifolds until Sec. 5, let us mention that one can think of the solution of Eq. (2) as a map $\mathbb{R}[k] \to \mathcal{M}$.

Now, assume that $X$ is of degree $+1$. This implies that $t$ is of degree $-1$ and hence squares to zero. The expansion of the flow with respect to $t$ looks like
\[
x^i(t) = x^i + t v^i
\]
where $v^i$ is of degree $|x^i| + 1$. On the one hand, this implies
\[
\frac{dx^i(t)}{dt} = v^i
\]
while on the other hand,
\[
X^i(x(t)) = X^i(x) + t \sum_{j=1}^n \frac{\partial X^i}{\partial x^j} v^j
\]
holds. Using the flow equation, we obtain
\[
v^i = X^i(x) \quad \text{and} \quad \sum_{j=1}^n \frac{\partial X^i}{\partial x^j} v^j = 0,
\]
which combines into
\[
\sum_{j=1}^n \frac{\partial X^i}{\partial x^j} X^j = 0 \iff [X, X] = 0.
\]
So $[X, X] = 0$ turns out to be a necessary and sufficient condition for the integrability of $X$. This conclusion can be seen as a special instance of the Frobenius Theorem for smooth graded manifolds, see [6].
4. Graded Symplectic Geometry

4.1. Differential forms

Locally, the algebra of differential forms on a graded manifold $\mathcal{M}$ is constructed by adding new coordinates $(dx^i)_{i=1}^n$ to a system of homogeneous local coordinates $(x^i)_{i=1}^n$ of $\mathcal{M}$. Moreover, one assigns the degree $|x^i| + 1$ to $dx^i$.

Remark 4.1.
(1) If $x^i$ is a coordinate of odd degree, $dx^i$ is of even degree and consequently $(dx^i)^2 \neq 0$.
(2) On an ordinary smooth manifold, differential forms have two important properties: they can be differentiated (hence the name differential forms) and they also provide the right objects for an integration theory on submanifolds. It turns out that on graded manifolds, this is no longer true, since the differential forms we introduced do not come along with a nice integration theory. To solve this problem, one has to introduce new objects, called “integral forms”. The interested reader is referred to [14].

A global description of differential forms on $\mathcal{M}$ is as follows: the shifted tangent bundle $T[1]\mathcal{M}$ carries a natural structure of a dg manifold with a cohomological vector field $Q$ that reads
\[
\sum_{i=1}^n dx^i \frac{\partial}{\partial x^i}
\]
in local coordinates. The de Rham complex $(\Omega(\mathcal{M}), d)$ of $\mathcal{M}$ is $C^\infty(T[1]\mathcal{M})$, equipped with the differential corresponding to the cohomological vector field $Q$.

Remark 4.2. The classical Cartan calculus extends to the graded setting:
(1) Let $X$ be a graded vector field on $\mathcal{M}$ and $\omega$ a differential form. Assume their local expressions in a coordinate system $(x^i)_{i=1}^n$ are
\[
X = \sum_{i=1}^n X^i \frac{\partial}{\partial x^i} \quad \text{and} \quad \omega = \sum_{i_1, \dots, i_k = 1}^n \omega_{i_1 \cdots i_k} \, dx^{i_1} \cdots dx^{i_k},
\]
respectively. The contraction $\iota_X \omega$ of $X$ and $\omega$ is given locally by
\[
\sum_{i_1, \dots, i_k = 1}^n \sum_{l=1}^k \pm \, \omega_{i_1 \cdots i_k} X^{i_l} \, dx^{i_1} \cdots \widehat{dx^{i_l}} \cdots dx^{i_k}.
\]
Alternatively, and to get the signs right, one uses the following rules:
• $\iota_{\frac{\partial}{\partial x^i}} dx^j = \delta_i^j$,
• $\iota_{\frac{\partial}{\partial x^i}}$ is a graded derivation of the algebra of differential forms of degree $|x^i| - 1$.
(2) The Lie derivative is defined via Cartan's magic formula
\[
L_X \omega := \iota_X d\omega + (-1)^{|X|} d\iota_X \omega.
\]
If one considers both $\iota_X$ and $d$ as graded derivations, $L_X$ is their graded commutator $[d, \iota_X]$. It follows immediately that $[L_X, d] = 0$ holds.

Let us compute the Lie derivative with respect to the graded Euler vector field $E$ in local coordinates $(x^i)$. By definition
\[
L_E x^i = |x^i| x^i
\]
and consequently
\[
L_E dx^i = d L_E x^i = |x^i| dx^i.
\]
This implies that $L_E$ acts on homogeneous differential forms on $\mathcal{M}$ by multiplication by the difference between the total degree (i.e. the degree in $C^\infty(T[1]\mathcal{M})$) and the form degree (which is given by counting the $d$'s, loosely speaking). We call this difference the degree of a differential form $\omega$ and denote it by $\deg \omega$.

4.2. Basic graded symplectic geometry

Definition 4.3. A graded symplectic form of degree $k$ on a graded manifold $\mathcal{M}$ is a two-form $\omega$ which has the following properties:
• $\omega$ is homogeneous of degree $k$,
• $\omega$ is closed with respect to the de Rham differential,
• $\omega$ is non-degenerate, i.e. the induced morphism of graded vector bundles $\omega : T\mathcal{M} \to T^*[k]\mathcal{M}$ is an isomorphism.
A graded symplectic manifold of degree $k$ is a pair $(\mathcal{M}, \omega)$ of a graded manifold $\mathcal{M}$ and a graded symplectic form $\omega$ of degree $k$ on $\mathcal{M}$.

Examples 4.4.
(1) Ordinary symplectic structures on smooth manifolds can be seen as graded symplectic structures.
(2) Let $V$ be a real vector space. The contraction between $V$ and $V^*$ defines a symmetric non-degenerate pairing on $V \oplus V^*$. This pairing is equivalent to a symplectic form of degree $k + l$ on $V[k] \oplus V^*[l]$.
(3) Consider $\mathbb{R}[1]$ with the two-form $\omega = dx\,dx$. This is a symplectic form of degree 2.
Lemma 4.5. Let ω be a graded symplectic form of degree k = 0. Then ω is exact. Proof. One computes kω = LE ω = dιE ω. This implies ω =
dιE ω k .
Definition 4.6. Let $\omega$ be a graded symplectic form on a graded manifold $\mathcal{M}$. A vector field $X$ is called ...
• symplectic if the Lie derivative of $\omega$ with respect to $X$ vanishes, i.e. $L_X \omega = 0$,
• Hamiltonian if the contraction of $X$ and $\omega$ is an exact one-form, i.e. there is a smooth function $H$ such that $\iota_X \omega = dH$.

Lemma 4.7. Suppose $\omega$ is a graded symplectic form of degree $k$ and $X$ is a symplectic vector field of degree $l$. If $k + l \neq 0$, then $X$ is Hamiltonian.

Proof. By definition, we have
\[ [E, X] = lX \quad\text{and}\quad L_X \omega = d\iota_X \omega = 0. \]
Set $H := \iota_E \iota_X \omega$ and compute
\[ dH = d\iota_E \iota_X \omega = L_E \iota_X \omega - \iota_E d\iota_X \omega = \iota_{[E,X]} \omega + \iota_X L_E \omega = (k + l)\,\iota_X \omega. \]
Hence
\[ \iota_X \omega = \frac{dH}{k+l}. \]
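A minimal instance of Lemma 4.7 can be worked out by hand. The sketch below assumes the local model $T^*[1]\mathbb{R}$ with coordinates $x$ (degree 0) and $p$ (degree 1) and $\omega = dp\,dx$; the sign $\varepsilon$ depends on the contraction conventions and is deliberately left unspecified.

```latex
% Assumed setting: \omega = dp\,dx on T^*[1]\mathbb{R}, so k = 1, and E = p\,\partial_p.
% X = \partial_x has degree l = 0, satisfies [E, X] = 0 = lX and L_X\omega = 0,
% and k + l = 1 \neq 0.
\begin{align*}
\iota_X \omega &= \varepsilon\, dp, \qquad \varepsilon \in \{+1, -1\}
  \text{ (convention dependent)}, \\
H &= \iota_E \iota_X \omega = \varepsilon\, \iota_E\, dp = \varepsilon\, E(p) = \varepsilon\, p, \\
dH &= \varepsilon\, dp = \iota_X \omega = (k + l)\,\iota_X \omega .
\end{align*}
```

Whatever the sign convention, $H = \varepsilon p$ is a Hamiltonian function for $\partial_x$, in agreement with $\iota_X\omega = dH/(k+l)$.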
Remark 4.8. Lemmas 4.5 and 4.7 can be found in [16].

Example 4.9. Let $(\mathcal{M}, \omega)$ be a graded symplectic manifold and $Q$ a symplectic cohomological vector field. By definition, the degree of $Q$ is 1. Lemma 4.7 implies that $Q$ is Hamiltonian if $k = \deg \omega \neq -1$. We remark that the exceptional case $k = -1$ is also relevant since it appears in the BV-formalism ([4, 19, 22]). Assume that $Q$ is Hamiltonian. Similar to the ungraded case, the graded symplectic form induces a bracket $\{-,-\}$ via
\[ \{f, g\} := (-1)^{|f|+1} X_f(g), \]
where $X_f$ is the unique graded vector field that satisfies $\iota_{X_f} \omega = df$. It can be checked that $\{-,-\}$ satisfies relations similar to the ordinary Poisson bracket. Using the bracket, one can express $Q$ with the help of a Hamiltonian function $S$ by $Q = \{S, -\}$. Since $[Q, Q](f) = \{\{S, S\}, f\}$, the relation $[Q, Q] = 0$ is equivalent to $\{S, S\}$ being a constant.
Observe that $S$ can be chosen of degree $k + 1$, while the bracket $\{-,-\}$ decreases the degree by $k$. Consequently, the degree of $\{S, S\}$ is $k + 2$. Since constants are of degree 0, $k \neq -2$ implies $\{S, S\} = 0$. This last equation is known as the classical master equation.

Definition 4.10. A graded manifold endowed with a graded symplectic form and a symplectic cohomological vector field is called a differential graded symplectic manifold, or dg symplectic manifold for short.

4.3. Examples of dg symplectic manifolds

Next, we study some special cases of dg symplectic manifolds $(\mathcal{M}, \omega, Q)$, where we assume the cohomological vector field $Q$ to be Hamiltonian with Hamiltonian function $S$. As before, $k$ denotes the degree of the graded symplectic form $\omega$. As mentioned before, the case $k = -1$ occurs in the BV-formalism ([4] and [19, 22]).

(1) Consider the case $k = 0$. This implies that $S$ is of degree 1. Since the degree of the graded symplectic form is zero, it induces an isomorphism between the coordinates of positive and negative degree. Assuming that $S$ is non-trivial, there must be coordinates of positive degree, and hence of negative degree as well. We remark that the situation just described appears in the BFV-formalism ([2, 3]).

(2) Suppose $k > 0$ and that all the coordinates are of non-negative degree. Dg symplectic manifolds with that property were called dg symplectic N-manifolds by Ševera ([20, letter nr. 8] and [21]). Let us look at the cases $k = 1$ and $k = 2$ in more detail:

(a) $k = 1$. The graded symplectic structure induces an isomorphism between the coordinates of degree 0, which we denote by $(x^i)_{i=1}^n$, and the coordinates in degree 1, which we denote by $(p_i)_{i=1}^n$. All other degrees are excluded, since they would imply that coordinates of negative degree are around. The Hamiltonian $S$ has degree 2, so locally it must be of the form
\[ S = \frac{1}{2} \sum_{i,j=1}^{n} \pi^{ij}(x)\, p_i p_j. \]
Hence, locally $S$ corresponds to a bivector field and the classical master equation $\{S, S\} = 0$ implies that $S$ actually corresponds to a Poisson bivector field. This also holds globally, as the following theorem due to Schwarz ([19]) asserts:

Theorem 4.11. Let $(\mathcal{M}, \omega)$ be a graded symplectic structure of degree 1. Then $(\mathcal{M}, \omega)$ is symplectomorphic to $T^*[1]M$, equipped with the standard symplectic form. Moreover, one can choose $M$ to be an ordinary manifold.
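The link between the classical master equation and the Poisson condition can be checked by hand for linear bivector fields. The sketch below assumes $\pi^{ij}(x) = \sum_k c^{ij}_k x_k$ (an illustrative choice, not taken from the text) and uses the standard fact that $\{S, S\} = 0$ for $S = \frac{1}{2}\pi^{ij} p_i p_j$ amounts to the vanishing of the Jacobiator $\sum_l (\pi^{il}\partial_l \pi^{jk} + \text{cyclic})$. All function names are hypothetical.

```python
from itertools import product


def jacobiator(c):
    """Jacobiator of the linear bivector pi^{ij}(x) = sum_k c[i][j][k] * x_k.

    Returns J with J[(i, j, k)][m] equal to the coefficient of x_m in
    sum_l (pi^{il} d_l pi^{jk} + pi^{jl} d_l pi^{ki} + pi^{kl} d_l pi^{ij}),
    using d_l pi^{jk} = c[j][k][l].
    """
    n = len(c)
    J = {}
    for i, j, k in product(range(n), repeat=3):
        J[(i, j, k)] = [
            sum(
                c[i][l][m] * c[j][k][l]
                + c[j][l][m] * c[k][i][l]
                + c[k][l][m] * c[i][j][l]
                for l in range(n)
            )
            for m in range(n)
        ]
    return J


def is_poisson(c):
    """True if c is antisymmetric in (i, j) and its Jacobiator vanishes."""
    n = len(c)
    antisymmetric = all(
        c[i][j][k] == -c[j][i][k] for i, j, k in product(range(n), repeat=3)
    )
    jacobi = all(v == 0 for coeffs in jacobiator(c).values() for v in coeffs)
    return antisymmetric and jacobi


def structure_constants(n, brackets):
    """Build c[i][j][k] from {(i, j): {k: value}}, filling in antisymmetry."""
    c = [[[0] * n for _ in range(n)] for _ in range(n)]
    for (i, j), vec in brackets.items():
        for k, value in vec.items():
            c[i][j][k] = value
            c[j][i][k] = -value
    return c


# pi^{ij} = epsilon_{ijk} x_k: the linear Poisson structure dual to so(3).
SO3 = structure_constants(3, {(0, 1): {2: 1}, (1, 2): {0: 1}, (2, 0): {1: 1}})
# An antisymmetric bracket that violates the Jacobi identity.
NOT_POISSON = structure_constants(3, {(0, 1): {1: 1}, (1, 2): {0: 1}})

print(is_poisson(SO3), is_poisson(NOT_POISSON))
```

The second example is antisymmetric but not Poisson, so in this toy setting the corresponding quadratic function in the $p_i$ would not solve the classical master equation.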
The theorem provides us with an isomorphism $C^\infty(\mathcal{M}) \cong C^\infty(T^*[1]M) = \Gamma(\wedge TM)$, which maps the bracket $\{-,-\}$ induced by $\omega$ to the Schouten–Nijenhuis bracket on $\Gamma(\wedge TM)$. A smooth function $S$ of degree 2 is mapped to a bivector field $\pi$ and if $S$ satisfies the classical master equation, $\pi$ will be Poisson. Hence, there is a one-to-one correspondence
\[ \{\text{isomorphism classes of dg symplectic N-manifolds of degree 1}\} \;\overset{1:1}{\longleftrightarrow}\; \{\text{isomorphism classes of Poisson manifolds}\}. \]

(b) $k = 2$. It was noticed by Ševera (see [20, letter nr. 7]) that there is a one-to-one correspondence
\[ \{\text{isomorphism classes of dg symplectic N-manifolds of degree 2}\} \;\overset{1:1}{\longleftrightarrow}\; \{\text{isomorphism classes of Courant algebroids}\}. \]
We will not spell out the details of this correspondence; the interested reader can find them in [20] or [16]. Observe that the degree of the graded symplectic form $\omega$ allows for coordinates in degree 0, 1 and 2. We denote them by $(x^i)_{i=1}^n$, $(\xi^\alpha)_{\alpha=1}^A$ and $(p_i)_{i=1}^n$, respectively. The graded symplectic form can be written as
\[ \omega = \sum_{i=1}^{n} dp_i\, dx^i + \frac{1}{2} \sum_{\alpha,\beta=1}^{A} d(g_{\alpha\beta}(x)\xi^\alpha)\, d\xi^\beta, \]
where $(g_{\alpha\beta})$ is a symmetric non-degenerate form. Globally, the graded symplectic form $\omega$ corresponds to $T^*[2]M$ and an additional vector bundle $E$ over $M$, equipped with a non-degenerate fiber pairing $g$. A Hamiltonian function $S$ for a cohomological vector field on such a graded manifold is locally of the form
\[ S = \sum_{i,\alpha} \rho^i_\alpha(x)\, p_i \xi^\alpha + \frac{1}{6} \sum_{\alpha,\beta,\gamma} f_{\alpha\beta\gamma}(x)\, \xi^\alpha \xi^\beta \xi^\gamma. \]
The first term corresponds to a bundle map ρ : E → T M , while the second one gives a bracket [−, −] on Γ(E). The classical master equation {S, S} = 0 is equivalent to the statement that (ρ, [−, −]) equips (E, g) with the structure of a Courant algebroid.
(c) Following Grabowski ([11]), it is possible to include generalized complex structures on Courant algebroids in the graded picture as follows: Assume that $S$ is a Hamiltonian function for a symplectic cohomological vector field on a graded symplectic manifold of degree 2. We want to construct a two-parameter family of solutions of the classical master equation, i.e. we look for a smooth function $T$ of degree 3 such that $\{\alpha S + \beta T, \alpha S + \beta T\} = 0$ is satisfied for all constants $\alpha$ and $\beta$. An important class of solutions arises when one finds a smooth function $J$ of degree 2 such that $T = \{S, J\}$ satisfies $\{T, T\} = 0$. Under these circumstances, $T$ will solve the above two-parameter version of the classical master equation. One way to ensure $\{T, T\} = 0$ is to require $\{\{S, J\}, J\} = \lambda S$ to hold for some constant $\lambda$. Up to rescaling, $\lambda$ is $-1$, 0 or 1. For $\lambda = -1$, such a smooth function $J$ yields a generalized complex structure on the Courant algebroid corresponding to $S$, if $J$ does not depend on the dual coordinates $(p_i)_{i=1}^n$.
4.4. Graded symplectic reduction

We recall some facts about reduction of presymplectic submanifolds. This can be extended to graded symplectic manifolds and provides a unified approach to the reduction of Poisson structures, Courant algebroids and generalized complex structures. This subsection relies on [5, 6, 9, 10].

Definition 4.12. Let $(M, \omega)$ be a symplectic manifold. A submanifold $i : S \hookrightarrow M$ is presymplectic if the two-form $i^*\omega$ has constant rank. In this case the kernel of $i^*\omega$ forms an integrable distribution $D$ of $S$, called the characteristic distribution, and we denote its space of leaves by $\underline{S}$.

Example 4.13. A special class of presymplectic manifolds are coisotropic submanifolds. A submanifold $S$ is coisotropic if for every point $x \in S$, the symplectic orthogonal of $T_x S$ is contained in $T_x S$. Equivalently, one can require that $S$ is given locally by the zero-set of constraints in involution, i.e. there is a submanifold chart of $S$ with coordinates $(x^i, y^a)$ such that
• $S$ is given locally by $\{y^a \equiv 0\}$ and
• the Poisson bracket $\{y^a, y^b\}$ of any two constraints lies in the ideal of the algebra of smooth functions generated by the transverse coordinates $(y^a)$.
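The "constraints in involution" criterion is easy to test in the linear case. The sketch below assumes $\mathbb{R}^{2n}$ with coordinates $(q_1, \dots, q_n, p_1, \dots, p_n)$, the standard bracket $\{q_i, p_j\} = \delta_{ij}$, and linearly independent linear constraints $y^a(z) = c_a \cdot z$. For such constraints the bracket $\{y^a, y^b\}$ is a constant, and a constant lies in the ideal generated by the $y^a$ only if it vanishes. The function names are illustrative.

```python
def poisson_bracket_linear(c_a, c_b):
    """Bracket {y_a, y_b} of linear functions y(z) = c . z on R^{2n}.

    Coordinates are ordered z = (q_1..q_n, p_1..p_n) with {q_i, p_j} = delta_ij,
    so the bracket of two linear functions is a constant.
    """
    n = len(c_a) // 2
    return sum(c_a[i] * c_b[n + i] - c_a[n + i] * c_b[i] for i in range(n))


def is_coisotropic(constraints):
    """True if the subspace {c . z = 0 for all c in constraints} is coisotropic.

    Assumes the constraint vectors are linearly independent, so that they can
    serve as transverse coordinates of a submanifold chart.
    """
    return all(
        poisson_bracket_linear(a, b) == 0 for a in constraints for b in constraints
    )


# Coordinate functions on R^4 with z = (q1, q2, p1, p2).
Q1, Q2 = (1, 0, 0, 0), (0, 1, 0, 0)
P1, P2 = (0, 0, 1, 0), (0, 0, 0, 1)

print(is_coisotropic([P1]))      # hypersurface {p1 = 0}
print(is_coisotropic([P1, P2]))  # Lagrangian subspace {p1 = p2 = 0}
print(is_coisotropic([Q1, P1]))  # symplectic subspace {q1 = p1 = 0}
```

In this toy setting a single constraint always passes the test, while the pair $(q_1, p_1)$ fails because $\{q_1, p_1\} = 1$.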
Lemma 4.14. Let $i : S \hookrightarrow M$ be a presymplectic submanifold of $(M, \omega)$. If $\underline{S}$ carries a smooth structure such that the canonical projection $\pi : S \to \underline{S}$ is a submersion, there is a unique symplectic structure $\underline{\omega}$ on $\underline{S}$ which satisfies $\pi^* \underline{\omega} = i^* \omega$.

Remark 4.15. From now on, we will always assume that $\underline{S}$ can be equipped with a smooth structure such that the natural projection from $S$ is a submersion. Assuming this, the symplectic manifold $(\underline{S}, \underline{\omega})$ is called the reduction of $S$.

Definition 4.16. Let $i : S \hookrightarrow M$ be a presymplectic submanifold of $(M, \omega)$. A function $f \in C^\infty(M)$ is called
• $S$-reducible, if $i^* f$ is invariant under the characteristic distribution $D$,
• strongly $S$-reducible, if the Hamiltonian vector field of $f$ is tangent to $S$.

Lemma 4.17.
• Let $f$ be an $S$-reducible function. There is a unique smooth function $\underline{f}$ on $\underline{S}$ satisfying $\pi^* \underline{f} = i^* f$.
• Let $f$ be strongly $S$-reducible. Then the following assertions hold:
(1) $f$ is $S$-reducible.
(2) The restriction $X_f|_S$ of the Hamiltonian vector field of $f$ to $S$ is projectable, i.e. $[X_f|_S, D] \subset D$. This implies that there is a unique vector field $\underline{X_f}$ on $\underline{S}$ which is $\pi$-related to $X_f|_S$, i.e. $T_x \pi (X_f)_x = (\underline{X_f})_{\pi(x)}$ holds for all $x \in S$.
(3) The vector field $\underline{X_f}$ is the Hamiltonian vector field of $\underline{f}$.

Proof. Let $f$ be an $S$-reducible function. To establish the existence and uniqueness of $\underline{f}$, one observes that $i^* f$ is constant along the fibers of $\pi : S \to \underline{S}$ and hence descends to a function $\underline{f}$ on $\underline{S}$. Smoothness of $\underline{f}$ follows from the assumption that $\pi$ is a surjective submersion, which implies that a function on $\underline{S}$ is smooth if and only if its pull back by $\pi$ is. From now on, let $f$ be a strongly $S$-reducible function. We first show that $f$ is also $S$-reducible, i.e. that it is invariant under the characteristic distribution $D$ of $S$. The identities
\[ L_X i^* f = \iota_X (d i^* f) \quad\text{and}\quad (i^* df)(X) = i^* \omega(X_f|_S, X), \]
where $X_f$ is the Hamiltonian vector field of $f$, imply $L_X i^* f = 0$ for all $X \in \Gamma(D)$, since $D$ is the kernel of $i^* \omega$. Next, we claim that the commutator of $X_f|_S$ and an arbitrary vector field $X$ on $S$, with values in $D$, is a vector field with values in $D$ too. To this end, we compute
\[ \iota_{[X, X_f|_S]}\, i^* \omega = ([L_X, \iota_{X_f|_S}])\, i^* \omega = L_X(\iota_{X_f|_S}\, i^* \omega) = L_X i^* df = 0. \]
This guarantees that $X_f|_S(\pi^* g)$ is $S$-reducible and one can define a linear endomorphism of $C^\infty(\underline{S})$ by $g \mapsto \underline{X_f|_S(\pi^* g)}$. It is easy to check that this is a derivation of $C^\infty(\underline{S})$ and hence corresponds to a vector field $\underline{X_f}$ on $\underline{S}$. By construction, $X_f|_S$ and $\underline{X_f}$ are $\pi$-related; uniqueness of $\underline{X_f}$ follows from $\pi$ being a surjective submersion. The final claim follows from
\[ \pi^*(d\underline{f}) = i^* df = \iota_{X_f|_S}\, i^* \omega = \pi^*(\iota_{\underline{X_f}}\, \underline{\omega}). \]

Lemma 4.18. Let $S$ be a coisotropic submanifold. Then the notion of $S$-reducibility is equivalent to the notion of strong $S$-reducibility.

Remark 4.19.
• The above definitions and statements can be extended to graded symplectic manifolds.
• In the last subsection, we saw that any Poisson manifold gives rise to a dg symplectic manifold $(\mathcal{M}, \omega)$ of degree 1 with Hamiltonian function $\Theta$ of degree 2. Let $S$ be a graded presymplectic submanifold of $\mathcal{M}$ and suppose that $\Theta$ is strongly reducible. Graded reduction yields a new graded symplectic manifold $(\underline{S}, \underline{\omega})$ of degree 1. Moreover, $\Theta$ induces a smooth function $\underline{\Theta}$ on $\underline{S}$ which satisfies
\[ \{\underline{\Theta}, \underline{\Theta}\} = X_{\underline{\Theta}}(\underline{\Theta}) = \underline{X_\Theta}(\underline{\Theta}) = \underline{X_\Theta(\Theta)} = 0. \]
Hence, one obtains a new dg symplectic manifold of degree 1, which corresponds to a new Poisson manifold.
• Similarly, one can reduce dg manifolds of degree 2, which correspond to Courant algebroids. Furthermore, it is possible to include generalized complex structures if one assumes that the corresponding function $J$ of degree 2 is strongly reducible as well.
Remark 4.20. It is also possible to extend Marsden–Weinstein reduction to graded symplectic manifolds by allowing Hamiltonian graded group actions, see [9].

Example 4.21. Let us consider the case of the dg symplectic manifolds arising from a Poisson manifold $(M, \pi)$ in more detail. Recall that the corresponding graded symplectic manifold is $\mathcal{M} := T^*[1]M$, equipped with the standard symplectic form. Every coisotropic submanifold $S$ corresponds to a submanifold $C$ of $M$ and an integrable distribution $B$ on $C$. Reducibility of the Hamiltonian function $\Theta$, which corresponds to the Poisson bivector field $\pi$, is equivalent to:
(1) The image of the restriction of the bundle map $\pi : T^*M \to TM$ to the conormal bundle $N^*C$ of $C$ is contained in $B$.
(2) For any two functions $f$ and $g$ on $M$ whose restrictions to $C$ are invariant with respect to the distribution $B$, the restriction of the Poisson bracket of $f$ and $g$ to $C$ is invariant too.
Suppose that these conditions hold and that the leaf space $\underline{M}$ of $B$ admits a smooth structure such that the natural projection $p : C \to \underline{M}$ is a surjective submersion. Setting
\[ \{\underline{f}, \underline{g}\} := \{\widetilde{p^*\underline{f}}, \widetilde{p^*\underline{g}}\} \]
defines a Poisson structure on $\underline{M}$. Here, $\widetilde{p^*\underline{f}}$ denotes a smooth extension of $p^*\underline{f}$ to $M$, and the right-hand side is restricted to $C$, where by condition (2) it descends to a function on $\underline{M}$. What one recovers here is a particular case of Marsden–Ratiu reduction, see [15]. More interesting examples can be obtained by considering presymplectic submanifolds, see [9, 10].
5. An Introduction to the AKSZ-Formalism The AKSZ-formalism goes back to the article [1] of Alexandrov et al. It is a procedure that allows one to construct solutions to the classical master equation on mapping spaces between graded manifolds that are equipped with additional structures. Particularly interesting examples arise from mapping spaces between shifted tangent bundles and dg symplectic manifolds. This allows one to associate topological field theories to dg symplectic manifolds, encompassing examples such as Chern–Simons theory (on trivial principal bundles) and the Poisson Sigma model. Let us describe the AKSZ-formalism in a nutshell. The “input data” are: • The source N : a dg manifold, equipped with a measure which is invariant under the cohomological vector field. • The target M: a dg symplectic manifold, whose cohomological vector field is Hamiltonian.
Out of this, the AKSZ-formalism tailors: • a graded symplectic structure on the space of maps C ∞ (N , M) and • a symplectic cohomological vector field on C ∞ (N , M). Under mild conditions, symplectic cohomological vector fields are Hamiltonian and one might find a Hamiltonian function which commutes with itself under the Poisson bracket. Hence, in many cases the AKSZ-formalism produces a solution of the classical master equation on C ∞ (N , M). In the following, we (1) describe the relevant mapping spaces between graded manifolds and (2) give an outline of the AKSZ-formalism.
5.1. Maps of graded manifolds

Given two graded manifolds $X$ and $Y$, the set of morphisms $\mathrm{Mor}(X, Y)$ was defined to be the set of morphisms of $\mathbb{Z}$-graded algebras from $C^\infty(Y)$ to $C^\infty(X)$. The category of graded manifolds admits a monoidal structure $\times$, which is the coproduct of locally ringed spaces. From a categorical perspective, one might wonder whether the set $\mathrm{Mor}(X, Y)$ can be equipped with the structure of a graded manifold in a natural way, such that it is the adjoint to the monoidal structure, i.e. such that there is a natural isomorphism
\[ \mathrm{Mor}(Z \times X, Y) \cong \mathrm{Mor}(Z, \mathrm{Mor}(X, Y)) \]
for arbitrary graded manifolds $X$, $Y$ and $Z$. This turns out not to be possible. However, there actually is a (usually infinite-dimensional) graded manifold $\mathrm{Map}(X, Y)$ canonically associated to a pair $(X, Y)$, which satisfies
\[ \mathrm{Mor}(Z \times X, Y) \cong \mathrm{Mor}(Z, \mathrm{Map}(X, Y)). \]
Moreover, there is a natural inclusion of $\mathrm{Mor}(X, Y)$ into $\mathrm{Map}(X, Y)$ as the submanifold of degree 0.

Remark 5.1.
(1) Usually, $\mathrm{Map}(X, Y)$ is an infinite-dimensional object. However, there are noteworthy finite-dimensional examples such as $\mathrm{Map}(\mathbb{R}[1], X) = T[1]X$.
(2) The difference between Mor and Map can be illustrated in the following example: While $\mathrm{Mor}(X, \mathbb{R})$ is equal to the elements of $C^\infty(X)$ in degree 0, the mapping space $\mathrm{Map}(X, \mathbb{R})$ is equal to the whole of $C^\infty(X)$.
(3) Similarly, the infinitesimal object associated to the group of invertible morphisms from $X$ to $X$ is the Lie algebra of vector fields of degree 0, whereas considering the group of invertible elements of the mapping space $\mathrm{Map}(X, X)$ yields the graded Lie algebra of all vector fields on $X$.
Remark 5.2. Let us spell out Mor and Map in the local picture, i.e. for two graded vector spaces $V$ and $W$. One has
\[ \mathrm{Mor}(V, W) = \mathrm{Mor}(C^\infty(W), C^\infty(V)) = (W \otimes C^\infty(V))^0, \]
where $W \otimes C^\infty(V)$ is considered as a graded vector space and the superscript 0 refers to the elements in degree 0. In contrast to this, $\mathrm{Map}(V, W) = W \otimes C^\infty(V)$ holds.

5.2. Lifting geometric structures

Geometric structures on the graded manifolds $X$ and $Y$ induce interesting structures on the mapping space $\mathrm{Map}(X, Y)$. For instance, cohomological vector fields on $X$ and $Y$ can be lifted to commuting cohomological vector fields on $\mathrm{Map}(X, Y)$. Another example is a graded symplectic structure on $Y$ and an invariant measure on $X$, which allow one to construct a graded symplectic structure on $\mathrm{Map}(X, Y)$. Let us elaborate on this in more detail:

(1) The groups of invertible maps $\mathrm{Diff}(X)$ and $\mathrm{Diff}(Y)$ act on $\mathrm{Map}(X, Y)$ by composition and these two actions commute. Differentiation yields commuting infinitesimal actions
\[ \mathfrak{X}(X) \xrightarrow{\;L\;} \mathfrak{X}(\mathrm{Map}(X, Y)) \xleftarrow{\;R\;} \mathfrak{X}(Y). \]
Now, suppose $Q_X$ and $Q_Y$ are cohomological vector fields on $X$ and $Y$. We denote their images under $L$ and $R$ by $Q_X^L$ and $Q_Y^R$, respectively. Their sum
\[ Q := Q_X^L + Q_Y^R \]
is then a cohomological vector field on $\mathrm{Map}(X, Y)$.

(2) Suppose $Y$ carries a graded symplectic form $\omega$. Any form $\alpha \in \Omega(Y)$ can be pulled back to a form on $\mathrm{Map}(X, Y) \times X$ via the evaluation map
\[ \mathrm{ev} : \mathrm{Map}(X, Y) \times X \to Y. \]
To produce a differential form on the mapping space $\mathrm{Map}(X, Y)$, some notion of push forward along $X$ is required. To this end, the theory of Berezinian measures and Berezinian integration is needed; the interested reader might consult [14]. Basically, this is an extension of the usual (Lebesgue-) integration theory by adding the rule
\[ \int \xi\, d\xi = 1 \]
for each odd coordinate $\xi$. For instance, if one considers integration on a graded vector space $V$ concentrated in odd degrees, the integration map $C^\infty(V) = \wedge V^* \to \mathbb{R}$ is just the projection to the top exterior product (which we identify with $\mathbb{R}$). Another special case is provided by $X = T[1]\Sigma$, where $\Sigma$ is some compact,
smooth, oriented manifold. The graded manifold $X$ carries a canonical measure which maps a function $f$ on $X$ to
\[ \int_X f := \int_\Sigma j(f), \]
where $j$ denotes the isomorphism between $C^\infty(X)$ and $\Omega(\Sigma)$. Assuming that $X = T[1]\Sigma$ is equipped with its canonical measure, we obtain a map
\[ \Omega(Y) \to \Omega(\mathrm{Map}(X, Y)), \qquad \alpha \mapsto \hat{\alpha} := \int_X \mathrm{ev}^* \alpha, \]
which is of degree $-\dim(\Sigma)$. Furthermore, one can check that if $\omega$ is a graded symplectic form of degree $k$ on $Y$, $\hat{\omega}$ is a graded symplectic form of degree $k - \dim(\Sigma)$ on $\mathrm{Map}(X, Y)$.

(3) We want to combine the structures obtained in (1) and (2). First, we assume that the cohomological vector field $Q_Y$ is Hamiltonian with Hamiltonian function $\Theta$. This implies that $Q_Y^R$ is also Hamiltonian with respect to the graded symplectic structure $\hat{\omega}$; in fact, $\hat{\Theta}$ is a Hamiltonian function. Concerning the source $X$, we concentrate on the case $X = T[1]\Sigma$, equipped with the cohomological vector field $Q_X$ which corresponds to the de Rham differential. The canonical measure on $X$ is invariant with respect to the cohomological vector field in the sense that
\[ \int_X Q_X(f) = 0 \]
holds for every smooth function $f$. It follows from invariance that $Q_X^L$ is also Hamiltonian with respect to $\hat{\omega}$. We denote the Hamiltonian function by $S_0$.

Example 5.3. Let $Y$ be a graded symplectic vector space of degree 1, equipped with the standard exact symplectic form $d\alpha$. By Theorem 4.11, $Y$ is symplectomorphic to $T^*[1]V$, for some vector space $V$. Moreover, a symplectic cohomological vector field on $Y$ corresponds to a Poisson bivector field $\pi$ on $V$. Suppose $(q^i)$ are local coordinates on $V$ and denote the dual fiber coordinates on $T^*[1]V$ by $(p_i)$. The set of coordinates $(q^i, p_i)$ of $Y$, together with a set of coordinates $(x^i, dx^i)$ of $X = T[1]\Sigma$, induces a set of coordinates $(X^i, \eta_i)$ on $\mathrm{Map}(X, Y)$. In these coordinates, the Hamiltonian $S_0$ for the lift of the de Rham differential on $X$ to the mapping space reads
\[ S_0 = \pm \int_\Sigma \eta_i\, dX^i, \]
while the Hamiltonian function for the lift of the cohomological vector field on $Y$ is given by
\[ \hat{\Theta} = \frac{1}{2} \int_\Sigma \pi^{ij}(X)\, \eta_i \eta_j. \]
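The integrals over $X = T[1]\Sigma$ used in this example combine ordinary integration over $\Sigma$ with Berezin integration over the odd coordinates. As a toy illustration of the odd part alone, the following sketch assumes a purely odd vector space with generators $\xi_0, \dots, \xi_{n-1}$ and the normalization $\int \xi_0 \cdots \xi_{n-1} = 1$ (one possible ordering convention). The dictionary encoding and the function names are purely illustrative.

```python
def multiply(f, g):
    """Product in a Grassmann algebra.

    Elements are dicts {sorted tuple of generator indices: coefficient};
    for example {(0, 1): 2} stands for 2 * xi_0 * xi_1.
    """
    result = {}
    for mono_f, coeff_f in f.items():
        for mono_g, coeff_g in g.items():
            if set(mono_f) & set(mono_g):
                continue  # a repeated odd generator squares to zero
            merged = list(mono_f + mono_g)
            sign = 1
            # Bubble sort; each adjacent swap of odd generators flips the sign.
            for i in range(len(merged)):
                for j in range(len(merged) - 1 - i):
                    if merged[j] > merged[j + 1]:
                        merged[j], merged[j + 1] = merged[j + 1], merged[j]
                        sign = -sign
            key = tuple(merged)
            result[key] = result.get(key, 0) + sign * coeff_f * coeff_g
    return {k: v for k, v in result.items() if v != 0}


def berezin_integral(f, n):
    """Projection onto the coefficient of xi_0 * ... * xi_{n-1}."""
    return f.get(tuple(range(n)), 0)


ONE = {(): 1}
XI0 = {(0,): 1}
XI1 = {(1,): 1}

print(berezin_integral(ONE, 1))                 # integral of the constant 1
print(berezin_integral(XI0, 1))                 # integral of xi
print(berezin_integral(multiply(XI1, XI0), 2))  # order of the odd factors matters
```

With one generator this reproduces the rule that the integral of $\xi$ is 1 while the integral of a constant vanishes; with two generators, swapping the odd factors changes the sign of the result.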
The sum of $S_0$ and $\hat{\Theta}$ is the BV action functional of the Poisson Sigma model on $\Sigma$. This is a topological field theory associated to Poisson structures, which was discovered by Ikeda ([12]) and Schaller and Strobl ([18]).

Remark 5.4. Assuming that the graded symplectic form $\omega$ on $Y$ is of degree $k$, the Hamiltonian function $\Theta$ has degree $k + 1$. If one assumes in addition that $\Sigma$ is of dimension $n$, the graded symplectic form $\hat{\omega}$ on $\mathrm{Map}(X, Y)$ is of degree $k - n$ and the Hamiltonian function $\hat{\Theta}$ is of degree $k + 1 - n$. The case $n = k + 1$ corresponds to the BV-formalism, introduced by Batalin and Vilkovisky ([4]), while $n = k$ corresponds to the BFV-formalism, introduced by Batalin, Fradkin and Vilkovisky ([2, 3]). These together with the cases $k < n$ should be related to extended topological field theories in the sense of Lurie ([13]).

Acknowledgments

We thank Dmitry Roytenberg, Pavol Ševera and Marco Zambon for helpful comments. Moreover, we thank the school of "Poisson 2010" for partial financial support. The first author was partially supported by SNF Grants 20113439 and 20-131813. The second author was partially supported by the FCT through program POCI 2010/FEDER, by a postdoctoral grant and by project PTDC/MAT/098936/2008.

References
[1] M. Alexandrov, M. Kontsevich, A. Schwarz and O. Zaboronsky, The geometry of the Master equation and topological quantum field theory, Int. J. Mod. Phys. A 12(7) (1997) 1405–1429.
[2] I. A. Batalin and E. S. Fradkin, A generalized canonical formalism and quantization of reducible gauge theories, Phys. Lett. B 122 (1983) 157–164.
[3] I. A. Batalin and G. A. Vilkovisky, Relativistic S-matrix of dynamical systems with boson and fermion constraints, Phys. Lett. B 69 (1977) 309–312.
[4] I. A. Batalin and G. A. Vilkovisky, Gauge algebra and quantization, Phys. Lett. B 102 (1981) 27–31.
[5] H. Bursztyn, A. S. Cattaneo, R. Mehta and M. Zambon, Generalized reduction via graded geometry, in preparation.
[6] H. Bursztyn, A. S. Cattaneo, R. Mehta and M. Zambon, The Frobenius theorem for graded manifolds and applications in graded symplectic geometry, in preparation.
[7] C. Carmeli, L. Caston and R. Fioresi, Mathematical Foundation of Supersymmetry, with an appendix with I. Dimitrov, EMS Ser. Lect. Math. (European Math. Soc., Zurich, 2011).
[8] A. S. Cattaneo and G. Felder, On the AKSZ formulation of the Poisson sigma model, Lett. Math. Phys. 56 (2001) 163–179.
[9] A. S. Cattaneo and M. Zambon, A supergeometric approach to Poisson reduction, arXiv:1009.0948.
[10] A. S. Cattaneo and M. Zambon, Graded geometry and Poisson reduction, American Institute of Physics Conference Proceedings 1093 (2009) 48–56.
[11] J. Grabowski, Courant–Nijenhuis tensors and generalized geometries, Monografías de la Real Academia de Ciencias de Zaragoza 29 (2006) 101–112.
[12] N. Ikeda, Two-dimensional gravity and nonlinear gauge theory, Ann. Phys. 235 (1994) 435–464.
[13] J. Lurie, On the classification of topological field theories, in Current Developments in Mathematics (Int. Press, 2009), pp. 129–280.
[14] Yu. Manin, Gauge Fields and Complex Geometry (Springer-Verlag, Berlin, 1997).
[15] J. E. Marsden and T. Ratiu, Reduction of Poisson manifolds, Lett. Math. Phys. 11 (1986) 161–169.
[16] D. Roytenberg, On the structure of graded symplectic supermanifolds and Courant algebroids, in Quantization, Poisson Brackets and Beyond, ed. Th. Voronov, Contemp. Math., Vol. 315 (Amer. Math. Soc., Providence, RI, 2002).
[17] D. Roytenberg, AKSZ-BV formalism and Courant algebroid-induced topological field theories, Lett. Math. Phys. 79 (2007) 143–159.
[18] P. Schaller and T. Strobl, Poisson structure induced (topological) field theories, Mod. Phys. Lett. A 9(33) (1994) 3129–3136.
[19] A. Schwarz, Geometry of Batalin–Vilkovisky quantization, Comm. Math. Phys. 155 (1993) 249–260.
[20] P. Ševera, Some Letters to Alan Weinstein, http://sophia.dtp.fmph.uniba.sk/~severa/letters/ (1998–2000).
[21] P. Ševera, Some title containing the words "homotopy" and "symplectic", e.g. this one, in Travaux Mathématiques XVI (Univ. Luxembourg, 2005), pp. 121–137.
[22] P. Ševera, On the origin of the BV operator on odd symplectic supermanifolds, Lett. Math. Phys. 78(1) (2006) 55–59.
[23] A. Vaintrob, Lie algebroids and homological vector fields, Uspekhi Mat. Nauk 52(2) (1997) 428–429.
[24] V. S. Varadarajan, Supersymmetry for Mathematicians: An Introduction, Courant Lecture Notes in Mathematics, Vol. 11 (Amer. Math. Soc., New York, 2004).
August 23, J070-S0129055X11004412
2011 10:39 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 7 (2011) 691–747 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004412
QUANTUM f -DIVERGENCES AND ERROR CORRECTION
FUMIO HIAI∗,§, MILÁN MOSONYI∗,†,¶, DÉNES PETZ‡,‖ and CÉDRIC BÉNY†,∗∗
∗Graduate School of Information Sciences, Tohoku University, Aoba-ku, Sendai 980-8579, Japan
†Centre for Quantum Technologies, National University of Singapore, 3 Science Drive 2, 117543 Singapore
‡Department of Analysis, Budapest University of Technology and Economics, Egry József u. 1., Budapest, 1111 Hungary
§[email protected]
¶[email protected]
‖[email protected]
∗∗[email protected]
Received 2 August 2010
Revised 11 May 2011

Quantum f-divergences are a quantum generalization of the classical notion of f-divergences, and are a special case of Petz' quasi-entropies. Many well-known distinguishability measures of quantum states are given by, or derived from, f-divergences. Special examples include the quantum relative entropy, the Rényi relative entropies, and the Chernoff and Hoeffding measures. Here we show that the quantum f-divergences are monotonic under substochastic maps whenever the defining function is operator convex. This extends and unifies all previously known monotonicity results for this class of distinguishability measures. We also analyze the case where the monotonicity inequality holds with equality, and extend Petz' reversibility theorem for a large class of f-divergences and other distinguishability measures. We apply our findings to the problem of quantum error correction, and show that if a stochastic map preserves the pairwise distinguishability on a set of states, as measured by a suitable f-divergence, then its action can be reversed on that set by another stochastic map that can be constructed from the original one in a canonical way. We also provide an integral representation for operator convex functions on the positive half-line, which is the main ingredient in extending previously known results on the monotonicity inequality and the case of equality. We also consider some special cases where the convexity of f is sufficient for the monotonicity, and obtain the inverse Hölder inequality for operators as an application. The presentation is completely self-contained and requires only standard knowledge of matrix analysis.

Keywords: Relative entropy; quasi-entropy; f-divergences; Rényi relative entropies; Schwarz maps; stochastic maps; substochastic maps; operator convex functions; Chernoff distance; Hoeffding distances.

Mathematics Subject Classification 2010: 81P16, 81P50, 94A17, 62F03
1. Introduction

In the stochastic modeling of systems, the probabilities of the different outcomes of possible measurements performed on the system are given by a state, which is a probability distribution in the case of classical systems and a density operator on the Hilbert space of the system in the quantum case. In applications, it is important to have a measure of how different two states are from each other and, as it turns out, such measures arise naturally in statistical problems like state discrimination. Probably the most important statistically motivated distance measure is the relative entropy, given as
\[ S(\rho\|\sigma) := \begin{cases} \operatorname{Tr} \rho(\log \rho - \log \sigma), & \operatorname{supp} \rho \le \operatorname{supp} \sigma, \\ +\infty, & \text{otherwise}, \end{cases} \]
for two density operators $\rho, \sigma$ on a finite-dimensional Hilbert space. Its operational interpretation is given as the optimal exponential decay rate of an error probability in the state discrimination problem of Stein's lemma [7, 21, 38, 45], and it is the mother quantity for many other relevant notions in information theory, like the entropy, the conditional entropy, the mutual information and the channel capacity [7, 45]. Indisputably the most relevant mathematical property of the relative entropy is its monotonicity under stochastic maps, i.e.
\[ S(\Phi(\rho)\|\Phi(\sigma)) \le S(\rho\|\sigma) \tag{1.1} \]
for any two states $\rho, \sigma$ and quantum stochastic map $\Phi$ [45]. Heuristically, (1.1) means that the distinguishability of two states cannot increase under further randomization. The monotonicity inequality yields immediately that if the action of $\Phi$ can be reversed on the set $\{\rho, \sigma\}$, i.e. there exists another stochastic map $\Psi$ such that $\Psi(\Phi(\rho)) = \rho$ and $\Psi(\Phi(\sigma)) = \sigma$, then $\Phi$ preserves the relative entropy of $\rho$ and $\sigma$, i.e. inequality (1.1) holds with equality. A highly non-trivial observation, made by Petz in [43, 44], is that the converse is also true: If $\Phi$ preserves the relative entropy of $\rho$ and $\sigma$ then it is reversible on $\{\rho, \sigma\}$ and, moreover, the reverse map can be given in terms of $\Phi$ and $\sigma$ in a canonical way. This fact has found applications in the theory of quantum error correction [25, 26, 39], the characterization of quantum Markov chains [18] and the description of states with zero quantum discord [10, 14], among many others. Relative entropy has various generalizations, most notably Rényi's $\alpha$-relative entropies [47] that share similar monotonicity and convexity properties with the relative entropy and are also related to error exponents in binary state discrimination problems [9, 35]. A general approach to quantum relative entropies was developed by Petz in 1985 [41], who introduced the concept of quasi-entropies (see also [42] and [40, Chap. 7]). Let $\mathcal{A} := \mathcal{B}(\mathbb{C}^n)$ denote the algebra of linear operators on the finite-dimensional Hilbert space $\mathbb{C}^n$ (which is essentially the algebra of $n \times n$ matrices with complex entries, and hence we also use the term matrix
algebra). For a positive $A \in \mathcal{A}$ and a strictly positive $B \in \mathcal{A}$, a general $K \in \mathcal{A}$ and a real-valued continuous function $f$ on $[0, +\infty)$, the quasi-entropy is defined as
\[ S_f^K(A\|B) := \langle KB^{1/2}, f(\Delta(A/B))(KB^{1/2}) \rangle_{\mathrm{HS}} = \operatorname{Tr} B^{1/2} K^* f(\Delta(A/B))(KB^{1/2}), \]
where $\langle X, Y \rangle_{\mathrm{HS}} := \operatorname{Tr} X^* Y$, $X, Y \in \mathcal{A}$, is the Hilbert–Schmidt inner product, and $\Delta(A/B) : \mathcal{A} \to \mathcal{A}$ is the so-called relative modular operator acting on $\mathcal{A}$ as $\Delta(A/B)X := AXB^{-1}$, $X \in \mathcal{A}$. The relative entropy can be obtained as a special case, corresponding to the function $f(x) := x \log x$ and $K := I$, and Rényi's $\alpha$-relative entropies are related to the quasi-entropies corresponding to $f(x) := x^\alpha$. The two most important properties of the quasi-entropy are its monotonicity and joint convexity. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a linear map between two matrix algebras $\mathcal{A}_1$ and $\mathcal{A}_2$, and let $\Phi^* : \mathcal{A}_2 \to \mathcal{A}_1$ denote its dual with respect to the Hilbert–Schmidt inner products. A trace-preserving map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is called a stochastic map if $\Phi^*$ satisfies the Schwarz inequality $\Phi^*(Y^*)\Phi^*(Y) \le \Phi^*(Y^* Y)$, $Y \in \mathcal{A}_2$. The following monotonicity property of the quasi-entropies was shown in [41, 42]: Assume that $f$ is an operator monotone decreasing function on $[0, +\infty)$ with $f(0) \le 0$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a stochastic map. Then
\[ S_f^K(\Phi(A)\|\Phi(B)) \le S_f^{\Phi^*(K)}(A\|B) \tag{1.2} \]
holds for any $K \in \mathcal{A}_2$ and invertible positive operators $A, B \in \mathcal{A}_1$. If $f$ is an operator convex function on $[0, +\infty)$, then $S_f^K(A\|B)$ is jointly convex in the variables $A$ and $B$ [40–42], i.e.
\[ S_f^K\Big(\sum_i p_i A_i \,\Big\|\, \sum_i p_i B_i\Big) \le \sum_i p_i\, S_f^K(A_i\|B_i) \]
for any finite set of positive invertible operators $A_i, B_i \in \mathcal{A}$ and probability weights $\{p_i\}$. Quasi-entropy is a quantum generalization of the f-divergence of classical probability distributions, introduced independently by Csiszár [8] and Ali and Silvey [1], which is a widely used concept in classical information theory and statistics [31, 32]. This motivates the terminology "quantum f-divergence", which we will use in this paper for the quasi-entropies with $K = I$. Actually, our notion of f-divergence is also a slight generalization of the quasi-entropy in the sense that we extend it to cases where the second operator is not invertible. This extension is the same as in the classical setting, and was already considered in the quantum setting, e.g., in [51]. We give the precise definition of the quantum f-divergences in Sec. 2, where we also give some of their basic properties, and prove that they are continuous in their second variable; the latter seems to be a new result. In Sec. 3, we collect various technical statements on positive maps, which are necessary for the succeeding
sections. In particular, we introduce a generalized notion of Schwarz maps, and investigate the properties of this class of positive maps.

The monotonicity $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ of the f-divergences was proved in [42] for the case where $f$ is operator monotone decreasing and Φ is a stochastic map, and where $f$ is operator convex and Φ is the restriction onto a subalgebra; in both cases $B$ was assumed to be invertible. This was extended in [30] to the case where $f$ is operator convex, Φ is stochastic and both $A$ and $B$ are invertible, using an integral representation of operator convex functions on $(0, +\infty)$, and in [51] to the case where $f$ is operator convex and Φ is a completely positive trace-preserving map, without assuming the invertibility of $A$ or $B$, using the monotonicity under restriction onto a subalgebra and Lindblad's representation of completely positive maps. In Sec. 4, we give a common generalization of these results by proving the monotonicity relation for the case where $f$ is operator convex, Φ is a substochastic map which preserves the trace of $B$, and both $A$ and $B$ are arbitrary positive semidefinite operators. This is based on the continuity result proved in Sec. 2 and an integral representation of operator convex functions on $[0, +\infty)$ that we provide in Sec. 8. To the best of our knowledge, this representation is new, and might be interesting in itself.

It has been known [25, 26, 43] for the relative entropy and some Rényi relative entropies that the monotonicity inequality for two operators and a 2-positive trace-preserving map holds with equality if and only if the action of the map can be reversed on the given operators. We extend this result to a large class of f-divergences in Sec. 5, where we show that if a stochastic map Φ preserves the f-divergence of two operators $A$ and $B$ corresponding to an operator convex function which is not a polynomial then it preserves a certain set of "primitive" f-divergences, corresponding to the functions $\varphi_t(x) := -x/(x + t)$ for a set $T$ of $t$'s. Moreover, if this set has large enough cardinality (depending on $A$, $B$ and Φ) and Φ is 2-positive then there exists another stochastic map Ψ reversing the action of Φ on $\{A, B\}$, i.e. such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$. In Sec. 6, we formulate equivalent conditions for reversibility in terms of the preservation of measures relevant to state discrimination, namely, the Chernoff distance and the Hoeffding distances, and we also show that these measures cannot be represented as f-divergences.

In Sec. 7, we apply the above results on reversibility to the problem of quantum error correction, and give equivalent conditions for the reversibility of a quantum operation on a set of states in terms of the preservation of pairwise f-divergences, Chernoff and Hoeffding distances, and many-copy trace-norm distances. Related to the latter, we also analyze the connection with the recent results of [6], where reversibility was obtained from the preservation of single-copy trace-norm distances under some extra technical conditions, and show that the approach of [6] is unlikely to be recovered from our analysis of the preservation of f-divergences, as the quantum trace-norm distances cannot be represented as f-divergences. This is in contrast with the classical case, and is another manifestation of the significantly more complicated structure of
quantum states and their distinguishability measures, as compared to their classical counterparts.

In our analysis of the monotonicity inequality $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ and the case of the equality, it is essential that $f$ is operator convex; it is an open question though whether this is actually necessary. In the Appendix, we consider some situations where convexity of $f$ is sufficient; this includes the case of commuting operators, which is essentially a reformulation of the classical case, and the monotonicity under the pinching operation defined by the reference operator $B$, which was first proved in [14] for the Rényi relative entropies. Although both of these cases are very special and their proofs are considerably simpler than the general case, they are important for applications. As an illustration, we derive from these results the exponential version of the operator Hölder inequality and the inverse Hölder inequality, and analyze the case when they hold with equality.

2. Quantum f-Divergences: Definition and Basic Properties

Let $\mathcal{A}$ be a finite-dimensional $C^*$-algebra. Unless otherwise stated, we will always assume that $\mathcal{A}$ is a $C^*$-subalgebra of $\mathcal{B}(\mathcal{H})$ for some finite-dimensional Hilbert space $\mathcal{H}$, i.e. $\mathcal{A}$ is a subalgebra of $\mathcal{B}(\mathcal{H})$ that is closed under taking the adjoint of operators. For simplicity, we also assume that the unit of $\mathcal{A}$ coincides with the identity operator $I$ on $\mathcal{H}$; if this is not the case, we can simply consider a smaller Hilbert space. The Hilbert–Schmidt inner product on $\mathcal{A}$ is defined as

\[ \langle A, B \rangle_{HS} := \mathrm{Tr}\, A^* B, \qquad A, B \in \mathcal{A}, \]

with induced norm $\|A\|_{HS} := \sqrt{\mathrm{Tr}\, A^* A}$, $A \in \mathcal{A}$. We will follow the convention that powers of a positive semidefinite operator are only taken on its support; in particular, if $0 \le X \in \mathcal{A}$ then $X^{-1}$ denotes the generalized inverse of $X$ and $X^0$ is the projection onto the support of $X$. For a real $t \in \mathbb{R}$, $X^{it}$ is a unitary on $\mathrm{supp}\, X$ but not on the whole Hilbert space unless $X^0 = I$. We denote by $\log^*$ the extension of $\log$ to the domain $[0, +\infty)$, defined to be 0 at 0. With these conventions, we have $\frac{d}{dz} X^z \big|_{z=0} = \log^* X$. We also set

\[ 0 \cdot \pm\infty := 0, \qquad \log 0 := -\infty \qquad \text{and} \qquad \log +\infty := +\infty. \]

For a linear operator $A \in \mathcal{A}$, let $L_A, R_A \in \mathcal{B}(\mathcal{A})$ denote the left and the right multiplications by $A$, respectively, defined as

\[ L_A : X \mapsto AX, \qquad R_A : X \mapsto XA, \qquad X \in \mathcal{A}. \]

Left and right multiplications commute with each other, i.e. $L_A R_B = R_B L_A$, $A, B \in \mathcal{A}$. If $A, B$ are positive elements in $\mathcal{A}$ with spectral decompositions $A = \sum_{a \in \mathrm{spec}(A)} a P_a$ and $B = \sum_{b \in \mathrm{spec}(B)} b Q_b$ (where $\mathrm{spec}(X)$ denotes the spectrum of $X \in \mathcal{A}$) then the spectral decomposition of $L_A R_{B^{-1}}$ is given by $L_A R_{B^{-1}} = \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B)} a b^{-1} L_{P_a} R_{Q_b}$, and for any function $f$ on
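$\{ab^{-1} : a \in \mathrm{spec}(A),\ b \in \mathrm{spec}(B)\}$, we have

\[ f(L_A R_{B^{-1}}) = \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B)} f(ab^{-1})\, L_{P_a} R_{Q_b}. \tag{2.1} \]

(Note that we have $0^{-1} = 0$ in the above formulas due to our convention.)

Definition 2.1. Let $A$ and $B$ be positive semidefinite operators on $\mathcal{H}$ and let $f : [0, +\infty) \to \mathbb{R}$ be a real-valued function on $[0, +\infty)$ such that $f$ is continuous on $(0, +\infty)$ and the limit

\[ \omega(f) := \lim_{x \to +\infty} \frac{f(x)}{x} \]

exists in $[-\infty, +\infty]$. The f-divergence of $A$ with respect to $B$ is defined as

\[ S_f(A\|B) := \langle B^{1/2}, f(L_A R_{B^{-1}}) B^{1/2} \rangle_{HS} \]

when $\mathrm{supp}\, A \le \mathrm{supp}\, B$. In the general case, we define

\[ S_f(A\|B) := \lim_{\varepsilon \searrow 0} S_f(A\|B + \varepsilon I). \tag{2.2} \]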
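As a rough numerical illustration of the regularisation in (2.2), the following Python sketch treats only the commuting case, where $A$ and $B$ are diagonal with non-negative entries $a_i$ and $b_i$, so that $S_f(A\|B + \varepsilon I)$ reduces to $\sum_i (b_i + \varepsilon) f(a_i/(b_i + \varepsilon))$. The function name `regularized_divergence` and the test function $f(x) = \sqrt{1 + x^2}$, for which $\omega(f) = 1$, are choices made here for illustration and do not appear in the text.

```python
import math

def regularized_divergence(a, b, f, eps):
    """S_f(A || B + eps*I) for commuting A = diag(a), B = diag(b), eps > 0."""
    return sum((bi + eps) * f(ai / (bi + eps)) for ai, bi in zip(a, b))

def f(x):
    # Illustrative choice with omega(f) = lim f(x)/x = 1.
    return math.sqrt(1.0 + x * x)

a = [0.3, 0.7]   # eigenvalues of A
b = [1.0, 0.0]   # eigenvalues of B; supp A is not contained in supp B

for eps in (1e-1, 1e-3, 1e-6, 1e-9):
    print(eps, regularized_divergence(a, b, f, eps))

# Candidate limit: the b > 0 term plus omega(f) * Tr A(I - B^0).
limit_candidate = 1.0 * f(0.3 / 1.0) + 1.0 * 0.7
print(limit_candidate)
```

In this example the printed values settle near the candidate limit as ε decreases, which is consistent with the formula stated in Proposition 2.2; it is a sanity check for one diagonal pair, not a substitute for the proof.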
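Proposition 2.2. The limit in (2.2) exists, and

\[ \lim_{\varepsilon \searrow 0} S_f(A\|B + \varepsilon I) = \langle B^{1/2}, f(L_A R_{B^{-1}}) B^{1/2} \rangle_{HS} + \omega(f)\, \mathrm{Tr}\, A(I - B^0). \]

In particular, Definition 2.1 is consistent in the sense that if $\mathrm{supp}\, A \le \mathrm{supp}\, B$ then

\[ \lim_{\varepsilon \searrow 0} S_f(A\|B + \varepsilon I) = \langle B^{1/2}, f(L_A R_{B^{-1}}) B^{1/2} \rangle_{HS}. \]

Proof. By (2.1), we have $S_f(A\|B + \varepsilon I) = \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B)} (b + \varepsilon) f(a/(b + \varepsilon))\, \mathrm{Tr}\, P_a Q_b$, and the assertion follows by a straightforward computation using that for any $a, b \ge 0$,

\[ \lim_{0 < \tilde b \to b} \tilde b f(a/\tilde b) = \begin{cases} b f(a/b), & b > 0, \\ a\,\omega(f), & b = 0. \end{cases} \tag{2.3} \]

Corollary 2.3. For $A$, $B$ and $f$ as in Definition 2.1,

\[ S_f(A\|B) = \langle B^{1/2}, f(L_A R_{B^{-1}}) B^{1/2} \rangle_{HS} + \omega(f)\, \mathrm{Tr}\, A(I - B^0) \tag{2.4} \]

\[ = f(0)\, \mathrm{Tr}\, B + \langle B^{1/2}, (f - f(0))(L_A R_{B^{-1}}) B^{1/2} \rangle_{HS} + \omega(f)\, \mathrm{Tr}\, A(I - B^0) \tag{2.5} \]

\[ = \sum_{a \in \mathrm{spec}(A)} \Big( \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} b f(a/b)\, \mathrm{Tr}\, P_a Q_b + a\,\omega(f)\, \mathrm{Tr}\, P_a Q_0 \Big), \tag{2.6} \]

and $S_f(A\|B) = \langle B^{1/2}, f(L_A R_{B^{-1}}) B^{1/2} \rangle_{HS}$ if and only if $\mathrm{supp}\, A \le \mathrm{supp}\, B$ or $\lim_{x \to +\infty} f(x)/x = 0$.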
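In the commuting case formula (2.6) becomes a finite sum that is easy to evaluate directly. The sketch below assumes $A = \mathrm{diag}(a)$ and $B = \mathrm{diag}(b)$ in a common eigenbasis, so that each index $i$ contributes $b_i f(a_i/b_i)$ when $b_i > 0$ and $\omega(f)\, a_i$ when $b_i = 0$. The helper name `f_divergence_commuting` and the convention of passing $\omega(f)$ as an explicit argument are assumptions of this illustration.

```python
import math

def f_divergence_commuting(a, b, f, omega_f):
    """Evaluate (2.6) for commuting A = diag(a), B = diag(b) with entries >= 0.

    omega_f is the limit of f(x)/x as x -> +infinity (may be +/- math.inf).
    """
    total = 0.0
    for ai, bi in zip(a, b):
        if bi > 0:
            total += bi * f(ai / bi)
        elif ai > 0:
            # Contribution of omega(f) Tr A (I - B^0); the case ai = 0
            # is skipped, in line with the convention 0 * (+/-inf) := 0.
            total += omega_f * ai
    return total

def x_log_x(x):
    return 0.0 if x == 0 else x * math.log(x)

# Relative entropy: f(x) = x log x, omega(f) = +inf.
print(f_divergence_commuting([1.0, 0.0], [0.5, 0.5], x_log_x, math.inf))
print(f_divergence_commuting([0.5, 0.5], [1.0, 0.0], x_log_x, math.inf))
# f(x) = sqrt(x) has omega(f) = 0, so the support condition plays no role.
print(f_divergence_commuting([0.5, 0.5], [1.0, 0.0], math.sqrt, 0.0))
```

The first call evaluates to $\log 2$, the second to $+\infty$ because the support condition fails while $\omega(f) = +\infty$, and the third stays finite, in line with the last statement of Corollary 2.3.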
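Remark 2.4. Note that $L_A R_{B^{-1}} = \Delta(A/B)$, given in the Introduction, and hence the f-divergence is a special case of the quasi-entropy (with $K = I$) when $\mathrm{supp}\, A \le \mathrm{supp}\, B$ or $\lim_{x \to +\infty} f(x)/x = 0$.

Corollary 2.5. Let $A, A_1, A_2, B, B_1, B_2$ and $f$ be as in Definition 2.1. We have the following:
(i) For every $\lambda \in [0, +\infty)$, $S_f(\lambda A\|\lambda B) = \lambda S_f(A\|B)$.
(ii) If $A_1^0 \vee B_1^0 \perp A_2^0 \vee B_2^0$ then $S_f(A_1 + A_2\|B_1 + B_2) = S_f(A_1\|B_1) + S_f(A_2\|B_2)$.
(iii) If $V : \mathcal{H} \to \mathcal{K}$ is a linear or anti-linear isometry then $S_f(VAV^*\|VBV^*) = S_f(A\|B)$.
(iv) If $x$ is a unit vector in some Hilbert space $\mathcal{K}$ then $S_f(A \otimes |x\rangle\langle x| \,\|\, B \otimes |x\rangle\langle x|) = S_f(A\|B)$.

Proof. Immediate from (2.6).

Remark 2.6. Note that if $V$ is an anti-linear isometry then there exists a linear isometry $\tilde V$ and a basis $\mathcal{B}$ such that $VAV^* = \tilde V A^T \tilde V^*$, $A \in \mathcal{A}_+$, where the transposition is in the basis $\mathcal{B}$. Hence, Corollary 2.5(iii) is equivalent to the f-divergences being invariant under conjugation by an isometry and transposition in an arbitrary basis.

Example 2.7. Let $f_\alpha(x) := x^\alpha$ for $\alpha > 0$, $x \ge 0$. For $\alpha = 0$, we define $f_0(x) := 1$, $x > 0$, $f_0(0) := 0$. A straightforward computation yields that

\[ S_{f_\alpha}(A\|B) = \mathrm{Tr}\, A^\alpha B^{1-\alpha} + \lim_{x \to +\infty} x^{\alpha - 1}\, \mathrm{Tr}\, A(I - B^0) \tag{2.7} \]

for any $A, B \in \mathcal{A}_+$, and hence, if $0 \le \alpha < 1$ then $S_{f_\alpha}(A\|B) = \mathrm{Tr}\, A^\alpha B^{1-\alpha}$, whereas for $\alpha > 1$ we have

\[ S_{f_\alpha}(A\|B) = \begin{cases} \mathrm{Tr}\, A^\alpha B^{1-\alpha}, & \mathrm{supp}\, A \le \mathrm{supp}\, B, \\ +\infty, & \text{otherwise.} \end{cases} \]

The Rényi relative entropy of $A$ and $B$ with parameter $\alpha \in [0, +\infty) \setminus \{1\}$ is defined as

\[ S_\alpha(A\|B) := \frac{1}{\alpha - 1} \log S_{f_\alpha}(A\|B) = \begin{cases} \frac{1}{\alpha - 1} \log \mathrm{Tr}\, A^\alpha B^{1-\alpha}, & \mathrm{supp}\, A \le \mathrm{supp}\, B \text{ or } \alpha < 1, \\ +\infty, & \text{otherwise.} \end{cases} \]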
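The case distinction in the definition of $S_\alpha$ can be made concrete for commuting operators. The following sketch assumes $A = \mathrm{diag}(a)$ and $B = \mathrm{diag}(b)$ with non-negative entries and applies the convention that powers are taken on supports only; the function name `renyi_commuting` is hypothetical and chosen for this illustration.

```python
import math

def renyi_commuting(a, b, alpha):
    """S_alpha(A || B) for commuting A = diag(a), B = diag(b), alpha != 1."""
    if alpha == 1:
        raise ValueError("alpha = 1 is excluded from the definition")
    # supp A <= supp B means: b_i > 0 wherever a_i > 0.
    support_ok = all(bi > 0 for ai, bi in zip(a, b) if ai > 0)
    if alpha > 1 and not support_ok:
        return math.inf
    # Powers are taken on the supports, so terms with a_i = 0 or b_i = 0 vanish.
    q = sum(ai ** alpha * bi ** (1 - alpha)
            for ai, bi in zip(a, b) if ai > 0 and bi > 0)
    log_q = math.log(q) if q > 0 else -math.inf   # convention log 0 := -inf
    return log_q / (alpha - 1)

a, b = [0.5, 0.5], [1.0, 0.0]          # supp A is not contained in supp B
print(renyi_commuting(a, b, 0.5))      # finite, since alpha < 1
print(renyi_commuting(a, b, 2))        # +inf, since alpha > 1
print(renyi_commuting([1.0, 0.0], [0.5, 0.5], 2))
```

For the pair above the value at $\alpha = 1/2$ is $\log 2$, while any $\alpha > 1$ gives $+\infty$; swapping the roles of the two operators restores the support condition and gives a finite value for $\alpha = 2$ as well.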
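The choice $f(x) := x \log x$ yields the relative entropy of $A$ and $B$,

\[ S_f(A\|B) = \begin{cases} \mathrm{Tr}\, A(\log^* A - \log^* B), & \mathrm{supp}\, A \le \mathrm{supp}\, B, \\ +\infty, & \text{otherwise,} \end{cases} \]

where the second case follows from $\lim_{x \to +\infty} \frac{x \log x}{x} = +\infty$.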
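The first case can be checked directly from (2.6). The following is a sketch of that computation, assuming $\mathrm{supp}\, A \le \mathrm{supp}\, B$, so that $\mathrm{Tr}\, P_a Q_0 = 0$ for every $a > 0$ and the $\omega(f)$ term drops out, and using $f(0) = 0$ for $f(x) = x \log x$, so that the terms with $a = 0$ vanish as well.

```latex
\begin{align*}
S_f(A\|B)
  &= \sum_{a\in\operatorname{spec}(A)\setminus\{0\}}\;
     \sum_{b\in\operatorname{spec}(B)\setminus\{0\}}
     b\,\frac{a}{b}\log\frac{a}{b}\,\operatorname{Tr}P_aQ_b \\
  &= \sum_{a>0}\sum_{b>0} a\log a\,\operatorname{Tr}P_aQ_b
     \;-\;\sum_{a>0}\sum_{b>0} a\log b\,\operatorname{Tr}P_aQ_b \\
  &= \operatorname{Tr}\,(A\log^{*}A)\,B^{0} \;-\; \operatorname{Tr}A\log^{*}B \\
  &= \operatorname{Tr}A(\log^{*}A-\log^{*}B).
\end{align*}
```

The third line uses $\sum_{b>0} Q_b = B^0$, $\sum_{a>0} a \log a\, P_a = A \log^* A$ and $\sum_{b>0} \log b\, Q_b = \log^* B$; the last line uses $(A \log^* A) B^0 = A \log^* A$, which holds because the support of $A$ lies in that of $B$.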
The following shows that the representing function for an f-divergence is unique:

Proposition 2.8. Assume that a function $D : \mathcal{A}_+ \times \mathcal{A}_+ \to \mathbb{R}$ can be represented as an f-divergence. Then the representing function $f$ is uniquely determined by the restriction of $D$ onto the trivial subalgebra as

\[ f(x) = S_f(xI\|I)/\dim \mathcal{H}, \qquad x \in [0, +\infty). \tag{2.8} \]

In particular, for every $D : \mathcal{A}_+ \times \mathcal{A}_+ \to \mathbb{R}$ there is at most one function $f$ such that $D = S_f$ holds.

Proof. Formula (2.8) is obvious from (2.6), and the rest follows immediately.

In most of the applications, f-divergences are used to compare probability distributions in the classical, and density operators in the quantum case, and one might wonder whether there is more freedom in representing a measure as an f-divergence if we are only interested in density operators instead of general positive semidefinite operators. The following simple argument shows that if a measure can be represented as an f-divergence on quantum states then its values are uniquely determined by its values on classical probability distributions. Given density operators ρ and σ with spectral decompositions $\rho = \sum_{a \in \mathrm{spec}(\rho)} a P_a$ and $\sigma = \sum_{b \in \mathrm{spec}(\sigma)} b Q_b$, we can define classical probability density functions $(\rho : \sigma)_1$ and $(\rho : \sigma)_2$ on $\mathrm{spec}(\rho) \times \mathrm{spec}(\sigma)$ as

\[ (\rho : \sigma)_1(a, b) := a\, \mathrm{Tr}\, P_a Q_b, \qquad (\rho : \sigma)_2(a, b) := b\, \mathrm{Tr}\, P_a Q_b. \]

This kind of mapping from pairs of quantum states to pairs of classical states was introduced in [37], and is one of the main ingredients in the proofs of the quantum Chernoff and Hoeffding bound theorems.

Lemma 2.9. For any two density operators ρ, σ and any function $f$ as in Definition 2.1,

\[ S_f(\rho\|\sigma) = S_f((\rho : \sigma)_1 \| (\rho : \sigma)_2). \]

Proof. It is immediate from (2.6).

Corollary 2.10. Let $f$ and $g$ be functions as in Definition 2.1. If $S_f$ and $S_g$ coincide on classical probability distributions then they coincide on quantum states as well.
Proof. Obvious from Lemma 2.9.

Example 2.11. For two density operators ρ, σ, their quantum fidelity is given by $F(\rho, \sigma) := \mathrm{Tr}\, (\rho^{1/2} \sigma \rho^{1/2})^{1/2}$ [53]. For classical probability distributions, the fidelity coincides with $S_{f_{1/2}}$, where $f_{1/2}(x) = x^{1/2}$. If the fidelity could be represented as an f-divergence for quantum states then the representing function should be $f_{1/2}$, due to Corollary 2.10. However, the corresponding quantum f-divergence is $S_{f_{1/2}}(\rho\|\sigma) = \mathrm{Tr}\, \rho^{1/2} \sigma^{1/2}$, which is not equal to $F(\rho, \sigma)$ in general. This shows that the fidelity of quantum states cannot be represented as an f-divergence.

In Secs. 6 and 7, we give similar non-representability results for measures related to state discrimination on the state spaces of individual algebras.

Our last proposition in this section says that the f-divergences are continuous in their second variable. Note that continuity in the first variable is not true in general. As a counterexample, consider $A := B := P$ for some non-trivial projection $P$ on a Hilbert space, and let $f(x) := x \log x$. Then $S_f(A + \varepsilon I\|B) = +\infty$, $\varepsilon > 0$, while $S_f(A\|B) = 0$.

Proposition 2.12. Let $A, B, B_k \in \mathcal{A}$ with $A, B, B_k \ge 0$ for all $k \in \mathbb{N}$, and assume that $\lim_{k \to \infty} B_k = B$. Then

\[ \lim_{k \to \infty} S_f(A\|B_k) = S_f(A\|B). \]

Proof. By the definition (2.2), we can choose a sequence $\varepsilon_k > 0$, $k \in \mathbb{N}$, such that $\lim_{k \to \infty} \varepsilon_k = 0$, and for all $k \in \mathbb{N}$,

\[ S_f(A\|B_k + \varepsilon_k I) - \frac{1}{k} < S_f(A\|B_k) < S_f(A\|B_k + \varepsilon_k I) + \frac{1}{k} \]

if $S_f(A\|B_k)$ is finite, and

\[ S_f(A\|B_k + \varepsilon_k I) > k \qquad \text{or} \qquad S_f(A\|B_k + \varepsilon_k I) < -k \]

if $S_f(A\|B_k) = +\infty$ or $S_f(A\|B_k) = -\infty$, respectively. Let $\tilde B_k := B_k + \varepsilon_k I$, which is strictly positive for all $k \in \mathbb{N}$. Obviously, $\lim_{k \to \infty} \tilde B_k = B$, and the assertion will follow if we can show that

\[ \lim_{k \to \infty} S_f(A\|\tilde B_k) = S_f(A\|B). \]

Let $A = \sum_{a \in \mathrm{spec}(A)} a P_a$, $B = \sum_{b \in \mathrm{spec}(B)} b Q_b$ and $\tilde B_k = \sum_{c \in \mathrm{spec}(\tilde B_k)} c\, Q_c^{(k)}$ be the spectral decompositions of the respective operators. Then

\[ S_f(A\|\tilde B_k) = \sum_{a \in \mathrm{spec}(A)} \sum_{c \in \mathrm{spec}(\tilde B_k)} f(a/c)\, c\, \mathrm{Tr}\, P_a Q_c^{(k)}. \]

From the continuity of the eigenvalues and the spectral projections when $\tilde B_k \to B$, we see that, for every $\delta > 0$ with $\delta < \frac{1}{2} \min\{|b - b'| : b, b' \in \mathrm{spec}(B),\ b \ne b'\}$, if $k$ is
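sufficiently large, then we have

\[ \mathrm{spec}(\tilde B_k) \subset \bigcup_{b \in \mathrm{spec}(B)} (b - \delta, b + \delta) \qquad \text{(disjoint union)} \]

and moreover,

\[ \hat Q_b^{(k)} := \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (b - \delta, b + \delta)}} Q_c^{(k)} \to Q_b \qquad \text{as } k \to +\infty, \text{ for all } b \in \mathrm{spec}(B). \]

Assume that $S_f(A\|B) \in (-\infty, +\infty)$. Then by (2.4), it follows that $\omega(f) a \in (-\infty, +\infty)$ when $a \in \mathrm{spec}(A)$ and $P_a Q_0 \ne 0$. Due to (2.3), for every $\varepsilon > 0$ there exists a $\delta > 0$ as above such that, for $a \in \mathrm{spec}(A)$, $b \in \mathrm{spec}(B)$ and $c \in \mathrm{spec}(\tilde B_k)$,

\[ |f(a/c)c - f(a/b)b| < \varepsilon \quad \text{if } b > 0 \text{ and } c \in (b - \delta, b + \delta), \]
\[ |f(a/c)c - \omega(f)a| < \varepsilon \quad \text{if } c \in (0, \delta) \text{ and } P_a Q_0 \ne 0. \]

Hence, if $k$ is sufficiently large, then we have

\[
\begin{aligned}
|S_f(A\|\tilde B_k) - S_f(A\|B)|
&= \Big| \sum_{a \in \mathrm{spec}(A)} \sum_{c \in \mathrm{spec}(\tilde B_k)} f(a/c)c\, \mathrm{Tr}\, P_a Q_c^{(k)}
 - \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} f(a/b)b\, \mathrm{Tr}\, P_a Q_b
 - \sum_{a \in \mathrm{spec}(A)} \omega(f)a\, \mathrm{Tr}\, P_a Q_0 \Big| \\
&\le \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}}
 \Big| \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (b - \delta, b + \delta)}} f(a/c)c\, \mathrm{Tr}\, P_a Q_c^{(k)} - f(a/b)b\, \mathrm{Tr}\, P_a Q_b \Big| \\
&\quad + \sum_{a \in \mathrm{spec}(A)}
 \Big| \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (0, \delta)}} f(a/c)c\, \mathrm{Tr}\, P_a Q_c^{(k)} - \omega(f)a\, \mathrm{Tr}\, P_a Q_0 \Big| \\
&\le \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}}
 \Big[ \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (b - \delta, b + \delta)}} |f(a/c)c - f(a/b)b|\, \mathrm{Tr}\, P_a Q_c^{(k)}
 + |f(a/b)b\, \mathrm{Tr}\, P_a(\hat Q_b^{(k)} - Q_b)| \Big] \\
&\quad + \sum_{a \in \mathrm{spec}(A)}
 \Big[ \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (0, \delta)}} |f(a/c)c - \omega(f)a|\, \mathrm{Tr}\, P_a Q_c^{(k)}
 + |\omega(f)a\, \mathrm{Tr}\, P_a(\hat Q_0^{(k)} - Q_0)| \Big] \\
&\le \varepsilon\, \mathrm{Tr}\, I
 + \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} |f(a/b)b|\, \|\hat Q_b^{(k)} - Q_b\|_1
 + \sum_{a \in \mathrm{spec}(A)} |\omega(f)a|\, \|\hat Q_0^{(k)} - Q_0\|_1.
\end{aligned}
\]

This implies that

\[ \limsup_{k \to \infty} |S_f(A\|\tilde B_k) - S_f(A\|B)| \le \varepsilon\, \mathrm{Tr}\, I \]

for every $\varepsilon > 0$, and so

\[ \lim_{k \to \infty} S_f(A\|\tilde B_k) = S_f(A\|B). \]

Next, assume that $S_f(A\|B) = +\infty$. Then $\omega(f) = +\infty$ and there is an $a_0 \in \mathrm{spec}(A) \setminus \{0\}$ such that $P_{a_0} Q_0 \ne 0$. For every $\varepsilon > 0$ there exists a $\delta > 0$ as above such that, for $a \in \mathrm{spec}(A)$, $b \in \mathrm{spec}(B)$ and $c \in \mathrm{spec}(\tilde B_k)$,

\[ |f(a/c)c - f(a/b)b| < \varepsilon \quad \text{if } b > 0 \text{ and } c \in (b - \delta, b + \delta), \]
\[ f(a/c)c > 1/\varepsilon \quad \text{if } a > 0 \text{ and } c \in (0, \delta). \]

Hence, if $k$ is sufficiently large, then we have

\[
\begin{aligned}
S_f(A\|\tilde B_k)
&\ge \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (b - \delta, b + \delta)}} (f(a/b)b - \varepsilon)\, \mathrm{Tr}\, P_a Q_c^{(k)} \\
&\quad + \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (0, \delta)}} (-|f(0)|\delta)\, \mathrm{Tr}\, P_0 Q_c^{(k)}
 + \sum_{\substack{a \in \mathrm{spec}(A) \\ a > 0}} \sum_{\substack{c \in \mathrm{spec}(\tilde B_k) \\ c \in (0, \delta)}} (1/\varepsilon)\, \mathrm{Tr}\, P_a Q_c^{(k)} \\
&\ge -(\mathrm{Tr}\, I) \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} |f(a/b)b - \varepsilon|
 - |f(0)|\delta\, \mathrm{Tr}\, P_0 \hat Q_0^{(k)} + (1/\varepsilon)\, \mathrm{Tr}\, P_{a_0} \hat Q_0^{(k)},
\end{aligned}
\]

which implies that

\[ \liminf_{k \to \infty} S_f(A\|\tilde B_k) \ge -(\mathrm{Tr}\, I) \sum_{a \in \mathrm{spec}(A)} \sum_{b \in \mathrm{spec}(B) \setminus \{0\}} |f(a/b)b - \varepsilon| - |f(0)|\delta\, \mathrm{Tr}\, P_0 Q_0 + (1/\varepsilon)\, \mathrm{Tr}\, P_{a_0} Q_0. \]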
Since $\mathrm{Tr}\, P_{a_0} Q_0 > 0$ and both $\varepsilon > 0$ and $\delta > 0$ can be chosen to be arbitrarily small, we have

\[ \lim_{k \to \infty} S_f(A\|\tilde B_k) = +\infty = S_f(A\|B). \]

The case where $S_f(A\|B) = -\infty$ is similar.

3. Preliminaries on Positive Maps

Let $\mathcal{A}_i \subset \mathcal{B}(\mathcal{H}_i)$ be finite-dimensional $C^*$-algebras with unit $I_i$ for $i = 1, 2$. For a subset $\mathcal{B} \subset \mathcal{A}_i$, we will denote the set of positive elements in $\mathcal{B}$ by $\mathcal{B}_+$; in particular, $\mathcal{A}_{i,+}$ denotes the set of positive elements in $\mathcal{A}_i$. For a linear map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, we denote its adjoint with respect to the Hilbert–Schmidt inner products by $\Phi^*$. Note that Φ and $\Phi^*$ uniquely determine each other and, moreover, Φ is positive/n-positive/completely positive if and only if $\Phi^*$ is positive/n-positive/completely positive, and Φ is trace-preserving/trace non-increasing if and only if $\Phi^*$ is unital/sub-unital. For given $B \in \mathcal{A}_{1,+}$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, we define $\Phi_B : \mathcal{A}_1 \to \mathcal{A}_2$ and $\Phi_B^* : \mathcal{A}_2 \to \mathcal{A}_1$ as

\[ \Phi_B(X) := \Phi(B)^{-1/2}\, \Phi(B^{1/2} X B^{1/2})\, \Phi(B)^{-1/2}, \qquad X \in \mathcal{A}_1, \tag{3.1} \]

\[ \Phi_B^*(Y) := B^{1/2}\, \Phi^*(\Phi(B)^{-1/2} Y \Phi(B)^{-1/2})\, B^{1/2}, \qquad Y \in \mathcal{A}_2. \tag{3.2} \]

With these notations, we have $(\Phi_B)^* = \Phi_B^*$ and $(\Phi_B^*)^* = \Phi_B$. For a normal operator $X \in \mathcal{A}_1$, let $P_{\{1\}}(X)$ denote the spectral projection of $X$ onto its fixed-point set. Note that if $B \in \mathcal{A}_{1,+}$ then $B^0$ is a projection in $\mathcal{A}_1$ and hence $B^0 \mathcal{A}_1 B^0$ is a $C^*$-algebra with unit $B^0$.

Lemma 3.1. If $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a positive map and $A, B$ are positive elements in $\mathcal{A}_1$ such that $A^0 = B^0$ then $\Phi(A)^0 = \Phi(B)^0$. In particular, $\Phi(B)^0 = \Phi(B^0)^0$ for any positive $B \in \mathcal{A}_1$.

Proof. The assumption $A^0 = B^0$ is equivalent to the existence of strictly positive numbers α, β such that $\alpha A \le B \le \beta A$, which yields $\alpha \Phi(A) \le \Phi(B) \le \beta \Phi(A)$ and hence $\Phi(A)^0 = \Phi(B)^0$.

Lemma 3.2. Let $B \in \mathcal{A}_{1,+}$ and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a positive map such that $\Phi^*(\Phi(B)^0) \le I_1$ (in particular, this is the case if Φ is trace non-increasing). Then $\mathrm{Tr}\, \Phi(B) \le \mathrm{Tr}\, B$, and the following are equivalent:
(i) $\mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B$.
(ii) For any function $f$ on $\mathrm{spec}(B)$ such that $f(0) = 0$ if $0 \in \mathrm{spec}(B)$, we have $f(B)\Phi^*(\Phi(B)^0) = \Phi^*(\Phi(B)^0) f(B) = f(B)$.
(iii) $B^0 \le P_{\{1\}}(\Phi^*(\Phi(B)^0))$.
(iv) Φ is trace-preserving on $B^0 \mathcal{A}_1 B^0$. (In particular, if $A \in \mathcal{A}_{1,+}$ is such that $A^0 \le B^0$ then $\mathrm{Tr}\, \Phi(A) = \mathrm{Tr}\, A$.)
(v) For the map $\Phi_B^*$ given in (3.2), we have $\Phi_B^*(\Phi(B)) = B$.

Proof. By assumption, $\Phi^*(\Phi(B)^0) \le I_1$ and hence,

\[ 0 \le \mathrm{Tr}\, (I_1 - \Phi^*(\Phi(B)^0))B = \mathrm{Tr}\, B - \mathrm{Tr}\, \Phi^*(\Phi(B)^0)B = \mathrm{Tr}\, B - \mathrm{Tr}\, \Phi(B)^0 \Phi(B) = \mathrm{Tr}\, B - \mathrm{Tr}\, \Phi(B). \]

If $\mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B$ then $(I_1 - \Phi^*(\Phi(B)^0))B = 0$, i.e. $B = \Phi^*(\Phi(B)^0)B$, so we get $B^n = \Phi^*(\Phi(B)^0)B^n$, $n \in \mathbb{N}$, which yields (ii). Hence, the implication (i) ⇒ (ii) holds. If (ii) holds then we have $B^0 = \Phi^*(\Phi(B)^0)B^0$ and hence, for any $x \in \mathcal{H}$ such that $B^0 x = x$, we have $x = B^0 x = \Phi^*(\Phi(B)^0)B^0 x = \Phi^*(\Phi(B)^0)x$, or equivalently, $x \in \mathrm{ran}\, P_{\{1\}}(\Phi^*(\Phi(B)^0))$. This yields (iii), and the converse direction (iii) ⇒ (ii) is obvious. Assume now that (ii) holds. If $X \in B^0 \mathcal{A}_1 B^0$, then $XB^0 = B^0 X = X$, and

\[ \mathrm{Tr}\, \Phi(X) = \mathrm{Tr}\, \Phi(X)\Phi(B)^0 = \mathrm{Tr}\, X\Phi^*(\Phi(B)^0) = \mathrm{Tr}\, XB^0 \Phi^*(\Phi(B)^0) = \mathrm{Tr}\, XB^0 = \mathrm{Tr}\, X, \]

showing (iv). The implication (iv) ⇒ (i) is obvious. Assume that (ii) holds. Then $\Phi_B^*(\Phi(B)) = B^{1/2} \Phi^*(\Phi(B)^0) B^{1/2} = B$, showing (v). On the other hand, if (v) holds then $B^{1/2} \Phi^*(\Phi(B)^0) B^{1/2} = B$, and hence $0 = B^{1/2}(I_1 - \Phi^*(\Phi(B)^0))B^{1/2}$. Since $I_1 - \Phi^*(\Phi(B)^0) \ge 0$, we obtain $B^{1/2}(I_1 - \Phi^*(\Phi(B)^0))^{1/2} = 0$, which in turn yields $B = B\Phi^*(\Phi(B)^0)$. From this (ii) follows as above.

Corollary 3.3. Let $A, B \in \mathcal{A}_{1,+}$, and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace non-increasing positive map. Then Φ is trace-preserving on $(A + B)^0 \mathcal{A}_1 (A + B)^0$ if and only if

\[ \mathrm{Tr}\, \Phi(A) = \mathrm{Tr}\, A \qquad \text{and} \qquad \mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B. \]

Proof. Obvious from Lemma 3.2.

Corollary 3.4. Let $A, B \in \mathcal{A}_{1,+}$ and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace non-increasing positive map such that $\mathrm{Tr}\, \Phi(A) = \mathrm{Tr}\, A$. Then

\[ \mathrm{Tr}\, \Phi(B)\Phi(A)^0 \ge \mathrm{Tr}\, BA^0 \qquad \text{and} \qquad \mathrm{Tr}\, \Phi(B)(I_2 - \Phi(A)^0) \le \mathrm{Tr}\, B(I_1 - A^0). \]

Note that the first inequality means the monotonicity of the Rényi 0-relative entropy $S_0(A\|B) \ge S_0(\Phi(A)\|\Phi(B))$ under the given conditions.

Proof. Due to Lemma 3.2, the assumptions yield that $A^0 \le P_{\{1\}}(\Phi^*(\Phi(A)^0)) \le \Phi^*(\Phi(A)^0)$, and hence $0 \le \mathrm{Tr}\, B(\Phi^*(\Phi(A)^0) - A^0) = \mathrm{Tr}\, \Phi(B)\Phi(A)^0 - \mathrm{Tr}\, BA^0$. The second inequality follows by taking into account that $\mathrm{Tr}\, \Phi(B) \le \mathrm{Tr}\, B$.
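In the commutative case the two inequalities of Corollary 3.4 can be checked by hand. The sketch below models a trace non-increasing positive map on diagonal operators by a matrix `M` with non-negative entries and column sums at most 1, acting on the vector of diagonal entries. The particular matrix and vectors are illustrative choices made here, not data from the text.

```python
def apply_map(M, x):
    """Phi(x) = M x for a matrix M given as a list of rows."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in M]

def support(x):
    """Indicator vector of the support projection x^0."""
    return [1.0 if xi > 0 else 0.0 for xi in x]

def trace_prod(x, y):
    """Tr XY for diagonal X = diag(x), Y = diag(y)."""
    return sum(xi * yi for xi, yi in zip(x, y))

# Columns sum to 1, 1 and 0.6: trace non-increasing overall,
# and trace-preserving on the support of A.
M = [[0.5, 0.0, 0.2],
     [0.5, 0.0, 0.3],
     [0.0, 1.0, 0.1]]
A = [1.0, 0.0, 0.0]
B = [0.2, 0.3, 0.5]

PhiA, PhiB = apply_map(M, A), apply_map(M, B)
assert abs(sum(PhiA) - sum(A)) < 1e-12      # hypothesis Tr Phi(A) = Tr A

lhs1 = trace_prod(PhiB, support(PhiA))      # Tr Phi(B) Phi(A)^0
rhs1 = trace_prod(B, support(A))            # Tr B A^0
lhs2 = sum(PhiB) - lhs1                     # Tr Phi(B)(I_2 - Phi(A)^0)
rhs2 = sum(B) - rhs1                        # Tr B(I_1 - A^0)
print(lhs1, ">=", rhs1)
print(lhs2, "<=", rhs2)
```

Here the support of $A$ is a single point that Φ spreads over two points, so the mass of $\Phi(B)$ on $\mathrm{supp}\, \Phi(A)$ can only be larger than the mass of $B$ on $\mathrm{supp}\, A$, which is the content of the first inequality.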
The following lemma yields the monotonicity of the Rényi 2-relative entropies, and is needed to prove the monotonicity of general f-divergences. The statement and its proof can be obtained by following the proofs of [5, Theorem 1.3.3], [5, Theorem 2.3.2 (Kadison's inequality)] and [5, Proposition 2.7.3] using the weaker conditions given here. For readers' convenience, we include a self-contained proof here.

Lemma 3.5. Let $A, B \in \mathcal{A}_{1,+}$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a positive map. Then

\[ \Phi(B^0 A B^0)\, \Phi(B)^{-1}\, \Phi(B^0 A B^0) \le \Phi(B^0 A B^{-1} A B^0). \tag{3.3} \]

In particular, if $A^0 \le B^0$ then

\[ \Phi(A)\, \Phi(B)^{-1}\, \Phi(A) \le \Phi(A B^{-1} A). \tag{3.4} \]

If, moreover, Φ is also trace non-increasing then

\[ S_{f_2}(\Phi(A)\|\Phi(B)) = \mathrm{Tr}\, \Phi(A)^2 \Phi(B)^{-1} \le \mathrm{Tr}\, A^2 B^{-1} = S_{f_2}(A\|B). \tag{3.5} \]

Proof. Define $\Psi : \mathcal{A}_1 \to \mathcal{A}_2$ as $\Psi(X) := \Phi(B^{1/2} X B^{1/2})$, $X \in \mathcal{A}_1$. Let $X := B^{-1/2} A B^{-1/2}$ and let $X = \sum_{x \in \sigma(X)} x P_x$ be its spectral decomposition. Then

\[ \hat X := \begin{bmatrix} \Psi(X^2) & \Psi(X) \\ \Psi(X) & \Psi(I_1) \end{bmatrix} = \sum_{x \in \sigma(X)} \begin{bmatrix} x^2 & x \\ x & 1 \end{bmatrix} \otimes \Psi(P_x) \ge 0, \]

and hence we have

\[ 0 \le \hat Y \hat X \hat Y^* = \begin{bmatrix} \Psi(X^2) - \Psi(X)\Psi(I_1)^{-1}\Psi(X) & \Psi(X)(I_2 - \Psi(I_1)^0) \\ (I_2 - \Psi(I_1)^0)\Psi(X) & \Psi(I_1) \end{bmatrix}, \]

where

\[ \hat Y := \begin{bmatrix} I_2 & -\Psi(X)\Psi(I_1)^{-1} \\ 0 & I_2 \end{bmatrix}. \]

Hence $\Psi(X^2) \ge \Psi(X)\Psi(I_1)^{-1}\Psi(X)$, which is exactly (3.3). The inequalities in (3.4) and (3.5) follow immediately.

We say that a map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a Schwarz map if

\[ \|\Phi\|_S := \inf\{c \in [0, +\infty) : \Phi(X)^* \Phi(X) \le c\, \Phi(X^* X),\ X \in \mathcal{A}_1\} < +\infty. \]

Obviously, if Φ is a Schwarz map then Φ is positive, and we have $\|\Phi\| = \|\Phi(I_1)\| \le \|\Phi\|_S$. (Note that $\|\Phi\| = \|\Phi(I_1)\|$ is true for any positive map Φ [5, Corollary 2.3.8].) We say that Φ is a Schwarz contraction if it is a Schwarz map with $\|\Phi\|_S \le 1$. A Schwarz contraction Φ is also a contraction, due to $\|\Phi\| \le \|\Phi\|_S$. Note that a positive map Φ is a contraction if and only if it is subunital, which is equivalent to $\Phi^*$ being trace non-increasing. We say that a map Φ between two finite-dimensional $C^*$-algebras is a substochastic map if its Hilbert–Schmidt adjoint $\Phi^*$ is a Schwarz contraction, and Φ is stochastic if it is a trace-preserving substochastic map. Note
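that in the commutative finite-dimensional case substochastic/stochastic maps are exactly the ones that can be represented by substochastic/stochastic matrices.

It is known that if Φ is 2-positive then it is a Schwarz map with $\|\Phi\|_S = \|\Phi\|$. In general, however, we might have $\|\Phi\| < \|\Phi\|_S < +\infty$, as the following example shows. In particular, not every Schwarz map is 2-positive.

Example 3.6. Let $\mathcal{H}$ be a finite-dimensional Hilbert space, and for every $\varepsilon \in \mathbb{R}$, let $\Phi_\varepsilon : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ be the map

\[ \Phi_\varepsilon(X) := (1 - \varepsilon) X^T + \varepsilon (\mathrm{Tr}\, X) I/d, \qquad X \in \mathcal{B}(\mathcal{H}), \]

where $d := \dim \mathcal{H} > 1$ and $X^T$ denotes the transpose of $X$ in some fixed basis $\{e_1, \ldots, e_d\}$ of $\mathcal{H}$. It was shown in [52] that $\Phi_\varepsilon$ is positive if and only if $0 \le \varepsilon \le 1 + 1/(d - 1)$, for $k \ge 2$ it is k-positive if and only if $1 - 1/(d + 1) \le \varepsilon \le 1 + 1/(d - 1)$, and it is a Schwarz contraction if and only if $1 - 1/(1/2 + \sqrt{d + 1/4}) \le \varepsilon \le 1 + 1/(d - 1)$. This already shows that there are parameter values ε for which $\Phi_\varepsilon$ is a Schwarz contraction but not 2-positive. Moreover, if $\varepsilon \in [0, 1)$ then for every $c \in [0, +\infty)$ we have

\[
\begin{aligned}
c\,\Phi_\varepsilon(X^* X) - \Phi_\varepsilon(X^*)\Phi_\varepsilon(X)
&= c(1 - \varepsilon)(X^* X)^T + c\varepsilon (\mathrm{Tr}\, X^* X) I/d - (1 - \varepsilon)^2 (X^*)^T X^T \\
&\quad - \varepsilon(1 - \varepsilon)(\mathrm{Tr}\, X)(X^*)^T/d - \varepsilon(1 - \varepsilon)(\mathrm{Tr}\, X^*) X^T/d - \varepsilon^2 |\mathrm{Tr}\, X|^2 I/d^2 \\
&\ge (\mathrm{Tr}\, X^* X) I/d\, \big[c\varepsilon - d(1 - \varepsilon)^2 - 2\varepsilon(1 - \varepsilon)\sqrt{d} - \varepsilon^2\big],
\end{aligned}
\]

where we used that $|\mathrm{Tr}\, X|^2 \le (\mathrm{Tr}\, I)(\mathrm{Tr}\, X^* X)$ and $X^* X \le \|X\|^2 I \le (\mathrm{Tr}\, X^* X) I$. This shows that $\Phi_\varepsilon$ is a Schwarz map for every $\varepsilon \in (0, 1)$ and $\|\Phi_\varepsilon\|_S \le (1/\varepsilon)(d(1 - \varepsilon)^2 + 2\varepsilon(1 - \varepsilon)\sqrt{d} + \varepsilon^2)$. Note that for $X := |e_1\rangle\langle e_2|$ we have

\[ 0 \le \langle e_1, (\|\Phi_\varepsilon\|_S\, \Phi_\varepsilon(X^* X) - \Phi_\varepsilon(X^*)\Phi_\varepsilon(X)) e_1 \rangle = \|\Phi_\varepsilon\|_S\, \varepsilon/d - (1 - \varepsilon)^2, \]

which yields that $\|\Phi_\varepsilon\|_S \ge d(1 - \varepsilon)^2/\varepsilon$. In particular, $\lim_{\varepsilon \searrow 0} \|\Phi_\varepsilon\|_S = +\infty$. Since $\Phi_\varepsilon$ is a positive unital map for every $\varepsilon \in [0, 1 + 1/(d - 1)]$, we have $\|\Phi_\varepsilon\| = 1$ for every $\varepsilon \in [0, 1 + 1/(d - 1)]$, while $\|\Phi_\varepsilon\|_S > 1$ and hence $\|\Phi_\varepsilon\| < \|\Phi_\varepsilon\|_S$ whenever $(1 - \varepsilon)^2/\varepsilon > d$.

Similarly, it was shown in [52] that the map

\[ \Psi_\varepsilon(X) := (1 - \varepsilon) X + \varepsilon (\mathrm{Tr}\, X) I/d, \qquad X \in \mathcal{B}(\mathcal{H}), \]

is completely positive if and only if $0 \le \varepsilon \le 1 + 1/(d^2 - 1)$, for $1 \le k \le d - 1$ it is k-positive if and only if $0 \le \varepsilon \le 1 + 1/(dk - 1)$, and it is a Schwarz contraction if and only if $0 \le \varepsilon \le 1 + 1/d$. A similar computation as above shows that $\Psi_\varepsilon$ is a Schwarz map if and only if $0 \le \varepsilon < 1 + 1/(d - 1)$, and $\lim_{\varepsilon \nearrow 1 + 1/(d - 1)} \|\Psi_\varepsilon\|_S = +\infty$.

Finally, the map

\[ \Lambda_\varepsilon(X) := (1 - \varepsilon) X^T + \varepsilon X, \qquad X \in \mathcal{B}(\mathcal{H}), \]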
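is positive if and only if $0 \le \varepsilon \le 1$, for each $k \ge 2$ it is k-positive if and only if $\varepsilon = 1$, and it is a Schwarz contraction if and only if $\varepsilon = 1$ [52]. Moreover, for $X := |e_1\rangle\langle e_2|$ and every $c \in \mathbb{R}$ we have $\langle e_1, (c\,\Lambda_\varepsilon(X^* X) - \Lambda_\varepsilon(X^*)\Lambda_\varepsilon(X)) e_1 \rangle = -(1 - \varepsilon)^2$, and hence $\Lambda_\varepsilon$ is a Schwarz map if and only if $\varepsilon = 1$.

Lemma 3.7. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map, and assume that there exists a $B \in \mathcal{A}_{1,+} \setminus \{0\}$ such that $\mathrm{Tr}\, \Phi(B) = \mathrm{Tr}\, B$. Then $\|\Phi^*\|_S = \|\Phi^*\| = 1$.

Proof. Let $\tilde{\mathcal{A}}_1 := B^0 \mathcal{A}_1 B^0$, $\tilde{\mathcal{A}}_2 := \Phi(B)^0 \mathcal{A}_2 \Phi(B)^0$, and define $\tilde\Phi : \tilde{\mathcal{A}}_1 \to \tilde{\mathcal{A}}_2$ as $\tilde\Phi(X) := \Phi(B^0 X B^0) = \Phi(X)$, $X \in \tilde{\mathcal{A}}_1$. Then $\tilde\Phi^*(Y) = B^0 \Phi^*(Y) B^0$, $Y \in \tilde{\mathcal{A}}_2$, and Lemma 3.2 yields that $\tilde\Phi^*(\Phi(B)^0) = B^0$, i.e. $\tilde\Phi^*$ is unital. Hence, $1 = \|\tilde\Phi^*\| \le \|\Phi^*\| \le \|\Phi^*\|_S \le 1$, from which the assertion follows.

Lemma 3.8. The set of Schwarz maps is closed under composition, taking the adjoint, and positive linear combinations. Moreover, for $\alpha \ge 0$ and $\Phi, \Phi_1, \Phi_2 : \mathcal{A}_1 \to \mathcal{A}_2$,

\[ \|\alpha\Phi\|_S = \alpha \|\Phi\|_S, \qquad \|\Phi_1 + \Phi_2\|_S \le \|\Phi_1\|_S + \|\Phi_2\|_S. \tag{3.6} \]

Proof. The assertion about the composition is obvious. To prove closedness under the adjoint, assume that $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a Schwarz map. Our goal is to prove that $\Phi^*$ is a Schwarz map, too. Let $\iota_k$ be the trivial embedding of $\mathcal{A}_k$ into $\mathcal{B}(\mathcal{H}_k)$ for $k = 1, 2$. The adjoint $\pi_k := \iota_k^*$ of $\iota_k$ is the trace-preserving conditional expectation (or equivalently, the Hilbert–Schmidt orthogonal projection) from $\mathcal{B}(\mathcal{H}_k)$ onto $\mathcal{A}_k$. Since $\iota_k$ is completely positive, so is $\pi_k$, and since $\pi_k$ is unital, it is also a Schwarz contraction. Let $\tilde\Phi := \iota_2 \circ \Phi \circ \pi_1$, the adjoint of which is $\tilde\Phi^* = \iota_1 \circ \Phi^* \circ \pi_2$. Note that $\tilde\Phi$ is a Schwarz map, too, with $\|\tilde\Phi\|_S = \|\Phi\|_S$, since for any $X \in \mathcal{B}(\mathcal{H}_1)$,

\[ \tilde\Phi(X^*)\tilde\Phi(X) = \iota_2\big(\Phi(\pi_1(X^*))\Phi(\pi_1(X))\big) \le \|\Phi\|_S\, \iota_2 \Phi(\pi_1(X^*)\pi_1(X)) \le \|\Phi\|_S\, \tilde\Phi(X^* X). \]

Hence, for any vector $v \in \mathcal{H}_1$ and any orthonormal basis $\{e_i\}_{i=1}^{d_1}$ in $\mathcal{H}_1$, we have

\[ \|\Phi\|_S\, \tilde\Phi(|v\rangle\langle v|) \ge \tilde\Phi(|v\rangle\langle e_i|)\, \tilde\Phi(|e_i\rangle\langle v|), \qquad i = 1, \ldots, d_1, \]

where $d_1 := \dim \mathcal{H}_1$. Let $Y \in \mathcal{A}_2$ be arbitrary. Multiplying the above inequality with $Y$ from the left and $Y^*$ from the right, and taking the trace, we obtain

\[ \|\Phi\|_S \langle v, \tilde\Phi^*(Y^* Y) v \rangle = \|\Phi\|_S\, \mathrm{Tr}\, Y \tilde\Phi(|v\rangle\langle v|) Y^* \ge \mathrm{Tr}\, Y \tilde\Phi(|v\rangle\langle e_i|)\, \tilde\Phi(|e_i\rangle\langle v|) Y^*. \]

Note that $\mathrm{Tr} : \mathcal{A}_2 \to \mathbb{C}$ is completely positive, and hence it is a Schwarz map with $\|\mathrm{Tr}\|_S = \mathrm{Tr}(I_2) = d_2 := \dim \mathcal{H}_2$. Hence, the above inequality can be continued as

\[ d_2 \|\Phi\|_S \langle v, \tilde\Phi^*(Y^* Y) v \rangle \ge \mathrm{Tr}\, Y \tilde\Phi(|v\rangle\langle e_i|)\; \mathrm{Tr}\, \tilde\Phi(|e_i\rangle\langle v|) Y^* = \langle v, \tilde\Phi^*(Y^*) e_i \rangle \langle e_i, \tilde\Phi^*(Y) v \rangle, \]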
and summing over $i$ yields

\[ d_1 d_2 \|\Phi\|_S \langle v, \tilde\Phi^*(Y^* Y) v \rangle \ge \langle v, \tilde\Phi^*(Y^*)\tilde\Phi^*(Y) v \rangle. \]

Since the above inequality is true for any $v \in \mathcal{H}_1$, and $\tilde\Phi^*(Y) = \Phi^*(Y)$ for any $Y \in \mathcal{A}_2$, the assertion follows.

The assertion on positive linear combinations follows from (3.6), and the first identity in (3.6) is obvious. To see the inequality in (3.6), assume first that $\Phi_1$ and $\Phi_2$ are Schwarz contractions. Then, for any $\varepsilon \in [0, 1]$ and any $X \in \mathcal{A}_1$ we have

\[
\begin{aligned}
&((1 - \varepsilon)\Phi_1 + \varepsilon\Phi_2)(X^* X) - ((1 - \varepsilon)\Phi_1 + \varepsilon\Phi_2)(X^*)\,((1 - \varepsilon)\Phi_1 + \varepsilon\Phi_2)(X) \\
&\quad = (1 - \varepsilon)[\Phi_1(X^* X) - \Phi_1(X^*)\Phi_1(X)] + \varepsilon[\Phi_2(X^* X) - \Phi_2(X^*)\Phi_2(X)] \\
&\qquad + \varepsilon(1 - \varepsilon)[(\Phi_1(X) - \Phi_2(X))^* (\Phi_1(X) - \Phi_2(X))] \ge 0,
\end{aligned}
\]

and hence $(1 - \varepsilon)\Phi_1 + \varepsilon\Phi_2$ is a Schwarz contraction for any $\varepsilon \in [0, 1]$. Finally, let $\Phi_1, \Phi_2 : \mathcal{A}_1 \to \mathcal{A}_2$ be non-zero Schwarz maps. Then $\tilde\Phi_k := \Phi_k/\|\Phi_k\|_S$ is a Schwarz contraction for $k = 1, 2$, and choosing $\varepsilon := \|\Phi_2\|_S/(\|\Phi_1\|_S + \|\Phi_2\|_S)$, we get

\[ \|\Phi_1 + \Phi_2\|_S = (\|\Phi_1\|_S + \|\Phi_2\|_S)\, \|(1 - \varepsilon)\tilde\Phi_1 + \varepsilon\tilde\Phi_2\|_S \le \|\Phi_1\|_S + \|\Phi_2\|_S. \]

Lemma 3.9 and Corollary 3.10 below are well known when Φ and γ are unital 2-positive maps. Their proofs are essentially the same for Schwarz contractions, which we provide here for the readers' convenience.

Lemma 3.9. Let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a Schwarz map, and let

\[ \mathcal{M}_\Phi := \{X \in \mathcal{A}_1 : \Phi(X)\Phi(X^*) = \|\Phi\|_S\, \Phi(XX^*)\}. \]

Then

\[ X \in \mathcal{M}_\Phi \quad \text{if and only if} \quad \Phi(X)\Phi(Z) = \|\Phi\|_S\, \Phi(XZ), \quad Z \in \mathcal{A}_1. \tag{3.7} \]

Moreover, the set $\mathcal{M}_\Phi$ is a vector space that is closed under multiplication.

Proof. We may assume that $\|\Phi\|_S > 0$, since otherwise $\Phi = 0$ and the assertions become trivial. Define $\gamma(X_1, X_2) := \|\Phi\|_S\, \Phi(X_1 X_2^*) - \Phi(X_1)\Phi(X_2)^*$, $X_1, X_2 \in \mathcal{A}_1$. Let $X \in \mathcal{M}_\Phi$, $Z \in \mathcal{A}_1$ and $t \in \mathbb{R}$. Then

\[ 0 \le \gamma(tX + Z, tX + Z) = t^2 \gamma(X, X) + t[\gamma(X, Z) + \gamma(Z, X)] + \gamma(Z, Z) = t[\gamma(X, Z) + \gamma(Z, X)] + \gamma(Z, Z). \]

Since this is true for any $t \in \mathbb{R}$, we get $\gamma(X, Z) + \gamma(Z, X) = 0$, and repeating the same argument with $iZ$ in place of $Z$, we get $\gamma(X, Z) - \gamma(Z, X) = 0$. Hence, $\Phi(X)\Phi(Z) = \|\Phi\|_S\, \Phi(XZ)$. The implication in the other direction is obvious. The assertion about the algebraic structure of $\mathcal{M}_\Phi$ follows immediately from (3.7).

For a map γ from a $C^*$-algebra into itself, we denote by $\ker(\mathrm{id} - \gamma)$ the set of fixed points of γ.
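Corollary 3.10. Let $\gamma : \mathcal{A} \to \mathcal{A}$ be a Schwarz contraction, and assume that there exists a strictly positive linear functional $\alpha$ on $\mathcal{A}$ such that $\alpha \circ \gamma = \alpha$. Then $\|\gamma\|_S = \|\gamma\| = 1$, $\ker(\mathrm{id} - \gamma)$ is a non-zero $C^*$-algebra, $\gamma$ is a $C^*$-algebra morphism on $\ker(\mathrm{id} - \gamma)$, and $\gamma_\infty := \lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} \gamma^k$ is an $\alpha$-preserving conditional expectation onto $\ker(\mathrm{id} - \gamma)$.

Proof. The assumption $\alpha \circ \gamma = \alpha$ is equivalent to $\gamma^*(A) = A$, where $\alpha(X) = \mathrm{Tr}\, AX$, $X \in \mathcal{A}$, and $A$ is strictly positive definite. Thus 1 is an eigenvalue of $\gamma^*$ and therefore also of $\gamma$. Hence, the fixed-point set of $\gamma$ is non-empty, and it is obviously a linear subspace in $\mathcal{A}$, which is also self-adjoint due to the positivity of $\gamma$. If $X \in \ker(\mathrm{id} - \gamma)$ then
$$0 \le \alpha(\gamma(X^*X) - \gamma(X^*)\gamma(X)) = \alpha(\gamma(X^*X)) - \alpha(X^*X) = 0,$$
and hence $\gamma(X^*X) = \gamma(X^*)\gamma(X) = X^*X$, i.e. $X^*X \in \ker(\mathrm{id} - \gamma)$. The polarization identity then yields that $\ker(\mathrm{id} - \gamma)$ is closed also under multiplication, so it is a $C^*$-subalgebra of $\mathcal{A}$. Let $\tilde I$ be the unit of $\ker(\mathrm{id} - \gamma)$; then $1 = \|\tilde I\| = \|\gamma(\tilde I)\| \le \|\gamma\| \le \|\gamma\|_S \le 1$, so $\|\gamma\|_S = 1$. Repeating the above argument with $X^*$ yields that $\ker(\mathrm{id} - \gamma) \subset \mathcal{M}_\gamma \cap \mathcal{M}_\gamma^*$, where $\mathcal{M}_\gamma$ is defined as in Lemma 3.9. Moreover, by Lemma 3.9, $\gamma$ is a $C^*$-algebra morphism on $\mathcal{M}_\gamma \cap \mathcal{M}_\gamma^*$, and hence also on $\ker(\mathrm{id} - \gamma)$. Note that $\langle X, Y\rangle := \alpha(X^*Y)$ defines an inner product on $\mathcal{A}$ with respect to which $\gamma$ is a contraction, and hence $\gamma_\infty$ exists and is the orthogonal projection onto $\ker(\mathrm{id} - \gamma)$, due to von Neumann's mean ergodic theorem. By Lemma 3.9 we have $\gamma(XY) = \gamma(X)\gamma(Y) = X\gamma(Y)$ for any $X \in \ker(\mathrm{id} - \gamma)$ and $Y \in \mathcal{A}$, which yields that $\gamma_\infty$ is a conditional expectation.

Lemma 3.11. Let $B_1 := B \in \mathcal{A}_{1,+}$ be non-zero, and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a trace non-increasing 2-positive map such that $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$. Let $B_2 := \Phi(B)$. Then there exist decompositions $\mathrm{supp}\, B_m = \bigoplus_{k=1}^{r} \mathcal{H}_{m,k,L} \otimes \mathcal{H}_{m,k,R}$, $m = 1, 2$, invertible density operators $\omega_{B,k}$ on $\mathcal{H}_{1,k,R}$ and $\tilde\omega_{B,k}$ on $\mathcal{H}_{2,k,R}$, and unitaries $U_k : \mathcal{H}_{1,k,L} \to \mathcal{H}_{2,k,L}$ such that
$$\ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+ = \bigoplus_{k=1}^{r} \mathcal{B}(\mathcal{H}_{1,k,L})_+ \otimes \omega_{B,k},$$
$$\Phi(A_{1,k,L} \otimes \omega_{B,k}) = U_k A_{1,k,L} U_k^* \otimes \tilde\omega_{B,k}, \qquad A_{1,k,L} \in \mathcal{B}(\mathcal{H}_{1,k,L}). \tag{3.8}$$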
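Proof. Let $\tilde{\mathcal{A}}_1 := B^0 \mathcal{A}_1 B^0$, $\tilde{\mathcal{A}}_2 := \Phi(B)^0 \mathcal{A}_2 \Phi(B)^0$, and define $\tilde\Phi : \tilde{\mathcal{A}}_1 \to \tilde{\mathcal{A}}_2$ as $\tilde\Phi(X) := \Phi(B^0 X B^0) = \Phi(X)$, $X \in \tilde{\mathcal{A}}_1$. Then $\tilde\Phi^*(Y) = B^0 \Phi^*(Y) B^0$, $Y \in \tilde{\mathcal{A}}_2$, and a straightforward computation verifies that
$$\tilde\Phi_B(X) := \tilde\Phi(B)^{-1/2}\, \tilde\Phi(B^{1/2} X B^{1/2})\, \tilde\Phi(B)^{-1/2} = \Phi_B(X), \qquad X \in \tilde{\mathcal{A}}_1,$$
$$\tilde\Phi_B^*(Y) := B^{1/2}\, \tilde\Phi^*(\tilde\Phi(B)^{-1/2}\, Y\, \tilde\Phi(B)^{-1/2})\, B^{1/2} = \Phi_B^*(Y), \qquad Y \in \tilde{\mathcal{A}}_2.$$
Let $\gamma_1 := \tilde\Phi^* \circ \tilde\Phi_B$ and $\gamma_2 := \tilde\Phi_B \circ \tilde\Phi^*$. Obviously, $\gamma_1$ and $\gamma_2$ are again 2-positive and, since
$$\gamma_1(B^0) = \tilde\Phi^*(\Phi(B)^0) = B^0 \Phi^*(\Phi(B)^0) B^0 = B^0,$$
$$\gamma_2(\Phi(B)^0) = \Phi(B)^{-1/2}\, \Phi(B^{1/2} \Phi^*(\Phi(B)^0) B^{1/2})\, \Phi(B)^{-1/2} = \Phi(B)^0$$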
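due to Lemma 3.2, they are also unital. Hence, $\|\gamma_i\|_S = \|\gamma_i\| = 1$, $i = 1, 2$. Note that if $A_1 := A \in \ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+$ then $A^0 \le B^0$ and hence $A \in \tilde{\mathcal{A}}_1$, and
$$\gamma_1^*(A + B) = \Phi_B^*(\Phi(A + B)) = A + B, \qquad \gamma_2^*(\Phi(A + B)) = \Phi(\Phi_B^*(\Phi(A + B))) = \Phi(A + B).$$
Let $A_2 := \Phi(A_1)$. By the above, $\gamma_m$ leaves the faithful state $\alpha_m$ with density $(A_m + B_m)/\mathrm{Tr}(A_m + B_m)$ invariant, and hence, by Corollary 3.10, $\ker(\mathrm{id} - \gamma_m)$ is a $C^*$-algebra of the form $\ker(\mathrm{id} - \gamma_m) = \bigoplus_{k=1}^{r} \mathcal{B}(\mathcal{H}_{m,k,L}) \otimes I_{m,k,R}$, where $\bigoplus_{k=1}^{r} \mathcal{H}_{m,k,L} \otimes \mathcal{H}_{m,k,R}$ is a decomposition of $\mathrm{supp}\, B_m$. Moreover, $\lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} \gamma_m^k$ gives an $\alpha_m$-preserving conditional expectation onto $\ker(\mathrm{id} - \gamma_m)$, for $m = 1, 2$. Hence, by Takesaki's theorem [50],
$$(A_m + B_m)^{it}\, \ker(\mathrm{id} - \gamma_m)\, (A_m + B_m)^{-it} = \ker(\mathrm{id} - \gamma_m).$$
Now the argument of [34, Sec. 3] yields the existence of invertible density operators $\omega_{A,B,k}$ on $\mathcal{H}_{1,k,R}$ and positive definite operators $X_{1,k,L,A,B}$ on $\mathcal{H}_{1,k,L}$ such that $A + B = \bigoplus_{k=1}^{r} X_{1,k,L,A,B} \otimes \omega_{A,B,k}$. By [40, Theorem 9.11], we have $(A + B)^{it} B^{-it} \in \ker(\mathrm{id} - \gamma_1)$ for every $t \in \mathbb{R}$, which yields that $\omega_{A,B,k}$ is independent of $A$, and hence that every $A \in \ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+$ can be written in the form $A = \bigoplus_{k=1}^{r} A_{1,k,L} \otimes \omega_{B,k}$ with $\omega_{B,k} := \omega_{A,B,k}$ and some positive semidefinite operators $A_{1,k,L}$ on $\mathcal{H}_{1,k,L}$. This shows that $\ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+ \subset \bigoplus_{k=1}^{r} \mathcal{B}(\mathcal{H}_{1,k,L})_+ \otimes \omega_{B,k}$. For the proof of (3.8), we refer to [33, Theorem 4.2.1]. Finally, the decomposition $B = \bigoplus_{k=1}^{r} B_{1,k,L} \otimes \omega_{B,k}$ together with (3.8) shows that $\ker(\mathrm{id} - \Phi_B^* \circ \Phi)_+ \supset \bigoplus_{k=1}^{r} \mathcal{B}(\mathcal{H}_{1,k,L})_+ \otimes \omega_{B,k}$.

4. Monotonicity

Now we turn to the proof of the monotonicity of the $f$-divergences under substochastic maps. Let $\mathcal{A}_i \subset \mathcal{B}(\mathcal{H}_i)$ be finite-dimensional $C^*$-algebras for $i = 1, 2$. Recall that we call a map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ substochastic if $\Phi^*$ satisfies the Schwarz inequality
$$\Phi^*(Y^*)\Phi^*(Y) \le \Phi^*(Y^*Y), \qquad Y \in \mathcal{A}_2,$$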
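and $\Phi$ is called stochastic if it is a trace-preserving substochastic map. For a $B \in \mathcal{A}_{1,+}$ and a substochastic map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, we define the map $V : \mathcal{A}_2 \to \mathcal{A}_1$ as
$$V(X) := \Phi^*(X\Phi(B)^{-1/2})\, B^{1/2}, \qquad X \in \mathcal{A}_2. \tag{4.1}$$
Note that $V = R_{B^{1/2}} \circ \Phi^* \circ R_{\Phi(B)^{-1/2}}$ and hence $V^* = R_{\Phi(B)^{-1/2}} \circ \Phi \circ R_{B^{1/2}}$, which yields
$$V^*(B^{1/2}) = \Phi(B)^{1/2}. \tag{4.2}$$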
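As a numerical sanity check of (4.1) and (4.2), the following sketch takes $\Phi$ to be the partial trace over the second factor of $\mathbb{C}^{d_a} \otimes \mathbb{C}^{d_b}$, so that $\Phi^*(Y) = Y \otimes I$. This choice of $\Phi$ is an illustrative assumption and is not made in the text; the helper names (`rand_pos`, `mpow`, `ptrace2`, `make_v`, `check_v`) are hypothetical, and $B$ is assumed to be invertible.

```python
import numpy as np

def rand_pos(d, rng):
    """Random positive definite d x d matrix (illustrative test data)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T + 0.1 * np.eye(d)

def mpow(h, p):
    """Real power of a positive definite Hermitian matrix via its eigendecomposition."""
    w, u = np.linalg.eigh(h)
    return (u * w**p) @ u.conj().T

def ptrace2(x, da, db):
    """Partial trace over the second tensor factor; plays the role of Phi."""
    return np.einsum('ikjk->ij', x.reshape(da, db, da, db))

def make_v(b, da, db):
    """Build V of (4.1) and its Hilbert-Schmidt adjoint for Phi = ptrace2."""
    phi_b = ptrace2(b, da, db)
    sqrt_b, inv_sqrt_phi_b = mpow(b, 0.5), mpow(phi_b, -0.5)

    def v(x):      # V(X) = Phi*(X Phi(B)^{-1/2}) B^{1/2}, with Phi*(Y) = Y (x) I
        return np.kron(x @ inv_sqrt_phi_b, np.eye(db)) @ sqrt_b

    def v_adj(z):  # V*(Z) = Phi(Z B^{1/2}) Phi(B)^{-1/2}
        return ptrace2(z @ sqrt_b, da, db) @ inv_sqrt_phi_b

    return v, v_adj, phi_b, sqrt_b

def check_v(da=2, db=3, seed=0):
    rng = np.random.default_rng(seed)
    b = rand_pos(da * db, rng)
    v, v_adj, phi_b, sqrt_b = make_v(b, da, db)
    x = rng.normal(size=(da, da)) + 1j * rng.normal(size=(da, da))
    z = rng.normal(size=(da * db, da * db)) + 0j
    # <VX, Z>_HS = <X, V*Z>_HS, where np.vdot conjugates its first argument
    adjoint_ok = np.isclose(np.vdot(v(x), z), np.vdot(x, v_adj(z)))
    # Equation (4.2): V*(B^{1/2}) = Phi(B)^{1/2}
    eq_42_ok = np.allclose(v_adj(sqrt_b), mpow(phi_b, 0.5))
    return bool(adjoint_ok), bool(eq_42_ok)
```

Calling `check_v()` returns two booleans: whether the adjoint relation holds on a random pair and whether (4.2) holds for the sampled $B$.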
Lemma 4.1. We have the following equivalence:
$$V(\Phi(B)^{1/2}) = B^{1/2} \quad \text{if and only if} \quad \mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B.$$

Proof. By definition, $V(\Phi(B)^{1/2}) = \Phi^*(\Phi(B)^{1/2}\Phi(B)^{-1/2})\, B^{1/2} = \Phi^*(\Phi(B)^0)\, B^{1/2}$.
Hence, if $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$ then $V(\Phi(B)^{1/2}) = B^{1/2}$ due to Lemma 3.2. On the other hand, $B^{1/2} = V(\Phi(B)^{1/2}) = \Phi^*(\Phi(B)^0)\, B^{1/2}$ yields $\Phi^*(\Phi(B)^0)\, B^n = B^n$, $n \in \mathbb{N}$, and hence also (ii) of Lemma 3.2, which in turn yields $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$.

Lemma 4.2. The map $V$ is a contraction and
$$V^*(L_A R_{B^{-1}})V \le L_{\Phi(A)} R_{\Phi(B)^{-1}}. \tag{4.3}$$
Moreover, when $\Phi^*$ is a $C^*$-algebra morphism, $V$ is an isometry if $\Phi(B)$ is invertible, and (4.3) holds with equality if $B$ is invertible.

Proof. Let $X \in \mathcal{A}_2$. Then,
$$\|VX\|_{HS}^2 = \mathrm{Tr}\,(VX)^*(VX) = \mathrm{Tr}\, B^{1/2}\, \Phi^*(\Phi(B)^{-1/2}X^*)\, \Phi^*(X\Phi(B)^{-1/2})\, B^{1/2}$$
$$\le \|\Phi^*\|_S\, \mathrm{Tr}\, B^{1/2}\, \Phi^*(\Phi(B)^{-1/2} X^*X\, \Phi(B)^{-1/2})\, B^{1/2} \tag{4.4}$$
$$= \|\Phi^*\|_S\, \mathrm{Tr}\,\Phi(B)\Phi(B)^{-1/2} X^*X\, \Phi(B)^{-1/2} = \|\Phi^*\|_S\, \mathrm{Tr}\,\Phi(B)^0 X^*X \le \|\Phi^*\|_S\, \mathrm{Tr}\, X^*X = \|\Phi^*\|_S\, \|X\|_{HS}^2 \le \|X\|_{HS}^2. \tag{4.5}$$
If $\Phi^*$ is a $C^*$-algebra morphism then $\|\Phi^*\|_S = 1$ and the inequality in (4.4) holds with equality, and if $\Phi(B)$ is invertible then the inequality in (4.5) holds with equality. Similarly,
$$\langle X, V^*(L_A R_{B^{-1}})VX\rangle_{HS} = \mathrm{Tr}\,(VX)^* A (VX) B^{-1}$$
$$= \mathrm{Tr}\, B^{1/2}\, \Phi^*(\Phi(B)^{-1/2}X^*)\, A\, \Phi^*(X\Phi(B)^{-1/2})\, B^{1/2} B^{-1}$$
$$= \mathrm{Tr}\, A\, \Phi^*(X\Phi(B)^{-1/2})\, B^0\, \Phi^*(\Phi(B)^{-1/2}X^*)$$
$$\le \mathrm{Tr}\, A\, \Phi^*(X\Phi(B)^{-1/2})\, \Phi^*(\Phi(B)^{-1/2}X^*) \tag{4.6}$$
$$\le \|\Phi^*\|_S\, \mathrm{Tr}\, A\, \Phi^*(X\Phi(B)^{-1/2}\Phi(B)^{-1/2}X^*) \tag{4.7}$$
$$= \|\Phi^*\|_S\, \mathrm{Tr}\,\Phi(A)\, X\, \Phi(B)^{-1} X^*$$
$$= \|\Phi^*\|_S\, \langle X, L_{\Phi(A)} R_{\Phi(B)^{-1}} X\rangle_{HS} \le \langle X, L_{\Phi(A)} R_{\Phi(B)^{-1}} X\rangle_{HS}. \tag{4.8}$$
If $\Phi^*$ is a $C^*$-algebra morphism then $\|\Phi^*\|_S = 1$ and the inequalities in (4.7) and (4.8) hold with equality, and if $B$ is invertible then (4.6) holds with equality.

Recall that a real-valued function $f$ on $[0, +\infty)$ is operator convex if $f(tA + (1 - t)B) \le t f(A) + (1 - t) f(B)$, $t \in [0, 1]$, for any positive semi-definite operators $A, B$ on any finite-dimensional Hilbert space (or equivalently, on some infinite-dimensional Hilbert space). For a continuous real-valued function $f$ on $[0, +\infty)$, the following are equivalent (see [13, Theorem 2.1]):
(i) $f$ is operator convex on $[0, +\infty)$ and $f(0) \le 0$;
(ii) $f(V^*AV) \le V^* f(A) V$ for any contraction $V$ and any positive semidefinite operator $A$.
The function $f$ is operator monotone decreasing if $f(A) \ge f(B)$
whenever $A$ and $B$ are such that $0 \le A \le B$. If $f$ is operator monotone decreasing on $[0, +\infty)$ then it is also operator convex (see the proof of [13, Theorem 2.5] or [4, Theorem V.2.5]). A function $f$ is operator concave (respectively, operator monotone increasing) if $-f$ is operator convex (respectively, operator monotone decreasing). An operator convex function on $[0, +\infty)$ is automatically continuous on $(0, +\infty)$, but might be discontinuous at 0. For instance, a straightforward computation shows that the characteristic function $\mathbf{1}_{\{0\}}$ of the set $\{0\}$ is operator convex on $[0, +\infty)$. It is easy to verify that the functions
$$\varphi_t(x) := -\frac{x}{x + t} = -1 + \frac{t}{x + t} \tag{4.9}$$
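are operator monotone decreasing and hence operator convex on $[0, +\infty)$ for every $t \in (0, +\infty)$.

Theorem 4.3. Let $A, B \in \mathcal{A}_{1,+}$, let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$, and let $f$ be an operator convex function on $[0, +\infty)$. Assume that
$$\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\, A \quad \text{or} \quad 0 \le \omega(f). \tag{4.10}$$
Then,
$$S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B). \tag{4.11}$$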
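Proof. First we prove the theorem when $f$ is continuous at 0. Due to Theorem 8.1, we have the representation
$$f(x) = f(0) + ax + bx^2 + \int_{(0,\infty)} \left( \frac{x}{1 + t} + \varphi_t(x) \right) d\mu(t), \qquad x \in [0, +\infty),$$
where $b \ge 0$ and $\varphi_t(x)$ is given in (4.9). Define
$$\Delta := L_A R_{B^{-1}} \quad \text{and} \quad \tilde\Delta := L_{\Phi(A)} R_{\Phi(B)^{-1}}.$$
Then
$$S_f(A\|B) = f(0)\, \mathrm{Tr}\, B + a\, \mathrm{Tr}\, AB^0 + b\, \mathrm{Tr}\, A^2 B^{-1} + \int_{(0,+\infty)} \left( \frac{\mathrm{Tr}\, AB^0}{1 + t} + S_{\varphi_t}(A\|B) \right) d\mu(t) + \omega(f)\, \mathrm{Tr}\, A(I - B^0). \tag{4.12}$$
Note that $\mathrm{Tr}\, B = \mathrm{Tr}\,\Phi(B)$ by assumption and, since $b \ge 0$, we have $b\, \mathrm{Tr}\, A^2 B^{-1} \ge b\, \mathrm{Tr}\,\Phi(A)^2 \Phi(B)^{-1}$ due to Lemma 3.5. Since $\varphi_t$ is operator convex, operator monotone decreasing and $\varphi_t(0) = 0$, we have
$$V^* \varphi_t(\Delta) V \ge \varphi_t(V^* \Delta V) \ge \varphi_t(\tilde\Delta) \tag{4.13}$$
for the contraction $V$ defined in (4.1), due to (4.3) and [13, Theorem 2.1] as mentioned above. Hence, by Lemma 4.1,
$$S_{\varphi_t}(A\|B) = \langle B^{1/2}, \varphi_t(\Delta) B^{1/2}\rangle_{HS} = \langle V\Phi(B)^{1/2}, \varphi_t(\Delta) V\Phi(B)^{1/2}\rangle_{HS} \ge \langle \Phi(B)^{1/2}, \varphi_t(\tilde\Delta)\Phi(B)^{1/2}\rangle_{HS} = S_{\varphi_t}(\Phi(A)\|\Phi(B)). \tag{4.14}$$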
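Therefore, in order to prove the monotonicity inequality (4.11), it suffices to prove the monotonicity of the remaining terms in (4.12). Assume first that $\mathrm{supp}\, A \le \mathrm{supp}\, B$, and hence also $\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\, A$ (see Lemma 3.2). Then $\mathrm{Tr}\, AB^0 = \mathrm{Tr}\, A = \mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\,\Phi(A)\Phi(B)^0$, which also yields $\mathrm{Tr}\, A(I_1 - B^0) = \mathrm{Tr}\,\Phi(A)(I_2 - \Phi(B)^0)$. Hence, all the terms in (4.12) are monotonic non-increasing under $\Phi$, and therefore we have the inequality (4.11).

Next, assume that $\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\, A$, and define $B_\varepsilon := B + \varepsilon A$, $\varepsilon > 0$. Then $\mathrm{Tr}\,\Phi(B_\varepsilon) = \mathrm{Tr}\,\Phi(B) + \varepsilon\, \mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\, B + \varepsilon\, \mathrm{Tr}\, A = \mathrm{Tr}\, B_\varepsilon$, and $\mathrm{supp}\, A \le \mathrm{supp}\, B_\varepsilon$. Hence, by the previous argument, $S_f(\Phi(A)\|\Phi(B_\varepsilon)) \le S_f(A\|B_\varepsilon)$. Taking $\varepsilon \searrow 0$ and using Proposition 2.12, we obtain (4.11).

If $\omega(f) = +\infty$, then either $\mathrm{supp}\, A \nleq \mathrm{supp}\, B$, in which case $S_f(A\|B) = +\infty \ge S_f(\Phi(A)\|\Phi(B))$, or we have $\mathrm{supp}\, A \le \mathrm{supp}\, B$, and hence (4.11) follows by the previous argument. Finally, assume that $0 \le \omega(f) < +\infty$. By Proposition 8.4, this yields the representation
$$f(x) = f(0) + \omega(f)\, x + \int_{(0,\infty)} \varphi_t(x)\, d\mu(t),$$
and hence
$$S_f(A\|B) = f(0)\, \mathrm{Tr}\, B + \omega(f)\, \mathrm{Tr}\, AB^0 + \int_{(0,+\infty)} S_{\varphi_t}(A\|B)\, d\mu(t) + \omega(f)\, \mathrm{Tr}\, A(I - B^0)$$
$$= f(0)\, \mathrm{Tr}\, B + \omega(f)\, \mathrm{Tr}\, A + \int_{(0,+\infty)} S_{\varphi_t}(A\|B)\, d\mu(t).$$
Since $\mathrm{Tr}\,\Phi(A) \le \mathrm{Tr}\, A$, inequality (4.11) follows.

So far, we have proved the theorem for the case where $f$ is continuous at 0. Consider the functions $\tilde f_\alpha(x) := -x^\alpha$, $x \ge 0$, $0 < \alpha < 1$. Then $\tilde f_\alpha$ is operator convex, continuous at 0 and $\omega(\tilde f_\alpha) = 0$ for all $\alpha \in (0, 1)$. Hence, by the above, we have
$$-\mathrm{Tr}\,\Phi(A)^\alpha \Phi(B)^{1-\alpha} = S_{\tilde f_\alpha}(\Phi(A)\|\Phi(B)) \le S_{\tilde f_\alpha}(A\|B) = -\mathrm{Tr}\, A^\alpha B^{1-\alpha}, \qquad \alpha \in (0, 1). \tag{4.15}$$
Taking the limit $\alpha \searrow 0$, we obtain
$$\mathrm{Tr}\,\Phi(A)^0 \Phi(B) \ge \mathrm{Tr}\, A^0 B, \tag{4.16}$$
which in turn yields
$$S_{\mathbf{1}_{\{0\}}}(\Phi(A)\|\Phi(B)) = \mathrm{Tr}\,\Phi(B) - \mathrm{Tr}\,\Phi(A)^0 \Phi(B) \le \mathrm{Tr}\, B - \mathrm{Tr}\, A^0 B = S_{\mathbf{1}_{\{0\}}}(A\|B). \tag{4.17}$$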
Assume now that $f$ is an operator convex function on $[0, +\infty)$ that is not necessarily continuous at 0. Convexity of $f$ yields that $f(0^+) := \lim_{x \searrow 0} f(x)$ is finite, and $\alpha := f(0) - f(0^+) \ge 0$. Note that $\tilde f := f - \alpha \mathbf{1}_{\{0\}}$ is operator convex and continuous at 0, $\omega(\tilde f) = \omega(f)$, and $S_f(A\|B) = S_{\tilde f}(A\|B) + \alpha S_{\mathbf{1}_{\{0\}}}(A\|B)$ for any $A, B \in \mathcal{A}_{1,+}$. Applying the previous argument to $\tilde f$ and using (4.17), we see that
$$S_f(\Phi(A)\|\Phi(B)) = S_{\tilde f}(\Phi(A)\|\Phi(B)) + \alpha S_{\mathbf{1}_{\{0\}}}(\Phi(A)\|\Phi(B)) \le S_{\tilde f}(A\|B) + \alpha S_{\mathbf{1}_{\{0\}}}(A\|B) = S_f(A\|B)$$
if any of the conditions in (4.10) holds, completing the proof of the theorem.

Remark 4.4. Note that $\mathrm{supp}\, A \le \mathrm{supp}\, B$ is also sufficient for (4.11) to hold, due to Lemma 3.2.

Example 4.5. Let $A, B \in \mathcal{A}_{1,+}$ and $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$. Let $\mathrm{sgn}\, x := x/|x|$, $x \ne 0$, and define $\tilde f_\alpha := \mathrm{sgn}(\alpha - 1) f_\alpha$, $0 < \alpha \ne 1$, where $f_\alpha$ is given in Example 2.7. Since $\tilde f_\alpha$ is operator convex, and $\omega(\tilde f_\alpha) \ge 0$ for all $\alpha \in [0, 2] \setminus \{1\}$, Theorem 4.3 yields that
$$\mathrm{sgn}(\alpha - 1)\, \mathrm{Tr}\,\Phi(A)^\alpha \Phi(B)^{1-\alpha} = S_{\tilde f_\alpha}(\Phi(A)\|\Phi(B)) \le S_{\tilde f_\alpha}(A\|B) = \mathrm{sgn}(\alpha - 1)\, \mathrm{Tr}\, A^\alpha B^{1-\alpha} \tag{4.18}$$
when $\alpha \in (1, 2]$ and $\mathrm{supp}\, A \le \mathrm{supp}\, B$. (Note that $S_{\tilde f_\alpha}(\Phi(A)\|\Phi(B)) \le S_{\tilde f_\alpha}(A\|B) = +\infty$ is trivial when $\alpha \in (1, 2]$ and $\mathrm{supp}\, A \nleq \mathrm{supp}\, B$.) The same inequality has been shown in the proof of Theorem 4.3 for $\alpha \in [0, 1)$; see (4.15) and (4.16). This yields the monotonicity of the Rényi relative entropies,
$$S_\alpha(\Phi(A)\|\Phi(B)) = \frac{1}{\alpha - 1} \log S_{f_\alpha}(\Phi(A)\|\Phi(B)) \le \frac{1}{\alpha - 1} \log S_{f_\alpha}(A\|B) = S_\alpha(A\|B) \tag{4.19}$$
for $\alpha \in [0, 2] \setminus \{1\}$. Since $\omega(f) \ge 0$ for $f(x) := x \log x$, Theorem 4.3 also yields the monotonicity of the relative entropy, $S(\Phi(A)\|\Phi(B)) \le S(A\|B)$.

Remark 4.6. In the proof of Theorem 4.3 it was essential that $f$ is operator convex, but it is not known if it is actually necessary. See the Appendix for some special cases where convexity of $f$ is sufficient.
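The monotonicity inequality (4.11) can be probed numerically. The sketch below is only an illustration under stated assumptions: $\Phi$ is taken to be a partial trace (completely positive and trace-preserving, hence stochastic), $A$ and $B$ are random positive definite matrices, and $S_f(A\|B)$ is evaluated from the spectral decompositions $A = \sum_i a_i P_i$, $B = \sum_j b_j Q_j$ as $\sum_{i,j} b_j f(a_i/b_j)\, \mathrm{Tr}\, P_i Q_j$, which is what $\langle B^{1/2}, f(L_A R_{B^{-1}}) B^{1/2}\rangle_{HS}$ from (4.14) reduces to for invertible $B$. All function names are hypothetical.

```python
import numpy as np

def rand_pos(d, rng):
    """Random positive definite d x d matrix (illustrative test data)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T + 0.1 * np.eye(d)

def ptrace2(x, da, db):
    """Partial trace over the second tensor factor (a stochastic map)."""
    return np.einsum('ikjk->ij', x.reshape(da, db, da, db))

def f_divergence(f, a, b):
    """S_f(A||B) = sum_{i,j} b_j f(a_i/b_j) |<a_i|b_j>|^2, for invertible A and B."""
    wa, ua = np.linalg.eigh(a)
    wb, ub = np.linalg.eigh(b)
    overlap = np.abs(ua.conj().T @ ub) ** 2
    ratios = wa[:, None] / wb[None, :]
    return float(np.sum(wb[None, :] * f(ratios) * overlap))

# A few operator convex functions on (0, +inf) mentioned in the text
OPERATOR_CONVEX = {
    "x log x": lambda x: x * np.log(x),
    "x^2": lambda x: x**2,
    "-sqrt(x)": lambda x: -np.sqrt(x),
    "phi_1": lambda x: -x / (x + 1.0),   # phi_t of (4.9) with t = 1
}

def monotonicity_gaps(da=2, db=2, seed=1):
    """Return S_f(A||B) - S_f(Phi(A)||Phi(B)) for each test function."""
    rng = np.random.default_rng(seed)
    a, b = rand_pos(da * db, rng), rand_pos(da * db, rng)
    pa, pb = ptrace2(a, da, db), ptrace2(b, da, db)
    return {name: f_divergence(f, a, b) - f_divergence(f, pa, pb)
            for name, f in OPERATOR_CONVEX.items()}
```

By Theorem 4.3 every gap returned by `monotonicity_gaps()` should be non-negative up to rounding; a negative value beyond rounding error would indicate a bug in the sketch rather than a counterexample.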
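Theorem 4.3 yields the joint convexity of the $f$-divergences:

Corollary 4.7. Let $A_i, B_i \in \mathcal{A}_+$ and $p_i \ge 0$ for $i = 1, \ldots, r$, and let $f$ be an operator convex function on $[0, +\infty)$. Then
$$S_f\Big(\sum_i p_i A_i \Big\| \sum_i p_i B_i\Big) \le \sum_i p_i S_f(A_i\|B_i).$$

Proof. Let $\delta_1, \ldots, \delta_r$ be a set of orthogonal rank-one projections on $\mathbb{C}^r$, and define $A := \sum_{i=1}^{r} p_i A_i \otimes \delta_i$, $B := \sum_{i=1}^{r} p_i B_i \otimes \delta_i$. The map $\Phi : \mathcal{A} \otimes \mathcal{B}(\mathbb{C}^r) \to \mathcal{A}$, given by $\Phi(X \otimes Y) := X\, \mathrm{Tr}\, Y$, $X \in \mathcal{A}$, $Y \in \mathcal{B}(\mathbb{C}^r)$, is completely positive and trace-preserving and hence, by Theorem 4.3,
$$S_f\Big(\sum_i p_i A_i \Big\| \sum_i p_i B_i\Big) = S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B) = \sum_i p_i S_f(A_i\|B_i), \tag{4.20}$$
where the last identity is due to Corollary 2.5.

Remark 4.8. For an operator convex function $f$ on $[0, +\infty)$ let $\mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ denote the set of positive linear maps $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ such that the monotonicity $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ holds for all $A, B \in \mathcal{A}_1$. The joint convexity of the $f$-divergences shows that $\mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ is convex. Indeed, if $\Phi_1, \Phi_2 \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ then Corollary 4.7 yields
$$S_f\big((1 - \lambda)\Phi_1(A) + \lambda\Phi_2(A) \,\big\|\, (1 - \lambda)\Phi_1(B) + \lambda\Phi_2(B)\big) \le (1 - \lambda) S_f(\Phi_1(A)\|\Phi_1(B)) + \lambda S_f(\Phi_2(A)\|\Phi_2(B))$$
$$\le (1 - \lambda) S_f(A\|B) + \lambda S_f(A\|B) = S_f(A\|B)$$
for any $\lambda \in [0, 1]$ and $A, B \in \mathcal{A}_1$. Note also that if $\Phi_1 \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ and $\Phi_2 \in \mathcal{M}_f(\mathcal{A}_2, \mathcal{A}_3)$ then $\Phi_2 \circ \Phi_1 \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_3)$. We say that a linear map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a co-Schwarz map if there is a $c \in [0, \infty)$ such that
$$\Phi(X^*)\Phi(X) \le c\, \Phi(XX^*), \qquad X \in \mathcal{A}_1,$$
and it is a co-Schwarz contraction if the above inequality holds with $c = 1$. It is easy to see that a linear map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a co-Schwarz map (respectively, a co-Schwarz contraction) if and only if there is a Schwarz map (respectively, a Schwarz contraction) $\tilde\Phi : \mathcal{A}_1^T \to \mathcal{A}_2$ such that $\Phi = \tilde\Phi \circ T$, where $T(X) := X^T$ denotes the transpose of $X \in \mathcal{A}_1$ with respect to a fixed orthonormal basis of $\mathcal{H}_1$, and $\mathcal{A}_1^T := \{X^T : X \in \mathcal{A}_1\} \subset \mathcal{B}(\mathcal{H}_1)$. Furthermore, we say that $\Phi$ is co-substochastic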
(respectively, co-stochastic) if $\Phi^*$ is a co-Schwarz contraction (respectively, a unital co-Schwarz contraction). Theorem 4.3 holds also when $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is a co-substochastic map. This follows immediately from Theorem 4.3 and the fact that transpositions leave every $f$-divergence invariant (see Corollary 2.5(iii)). Alternatively, this can be proved by replacing the operator $V$ defined in (4.1) with the conjugate-linear map
$$\hat V(X) := \Phi^*(\Phi(B)^{-1/2} X^*)\, B^{1/2}, \qquad X \in \mathcal{A}_2, \tag{4.21}$$
and following the proofs of Lemma 4.2 and Theorem 4.3 with $\hat V$ in place of $V$. Recall that a positive map is called decomposable if it can be written as the sum of a completely positive map and a completely positive map composed with a transposition. By the above, a similar notion of decomposability is sufficient for the monotonicity of the $f$-divergences. Namely, if a trace-preserving positive map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ is decomposable in the sense that it can be written as a convex combination of a stochastic and a co-stochastic map then $\Phi \in \mathcal{M}_f(\mathcal{A}_1, \mathcal{A}_2)$ for any operator convex function $f$ on $[0, +\infty)$. Example 3.6 provides simple examples of trace-preserving positive maps that are decomposable in this sense but which are neither stochastic nor co-stochastic.
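The joint convexity of Corollary 4.7 and the transposition invariance used in Remark 4.8 can be illustrated numerically. The sketch below makes the same assumptions as any spectral evaluation of $S_f$: all operators are random positive definite matrices, and $S_f(A\|B)$ is computed as $\sum_{i,j} b_j f(a_i/b_j)\, |\langle a_i | b_j\rangle|^2$ from the eigen-decompositions of $A$ and $B$. The function names and the choice $f(x) = x \log x$ are illustrative, not taken from the text.

```python
import numpy as np

def rand_pos(d, rng):
    """Random positive definite d x d matrix (illustrative test data)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T + 0.1 * np.eye(d)

def f_divergence(f, a, b):
    """S_f(A||B) via the eigen-decompositions of A and B (both invertible)."""
    wa, ua = np.linalg.eigh(a)
    wb, ub = np.linalg.eigh(b)
    overlap = np.abs(ua.conj().T @ ub) ** 2
    return float(np.sum(wb[None, :] * f(wa[:, None] / wb[None, :]) * overlap))

def joint_convexity_gap(f, pairs, weights):
    """sum_i p_i S_f(A_i||B_i) - S_f(sum_i p_i A_i || sum_i p_i B_i)."""
    a_mix = sum(p * a for p, (a, _) in zip(weights, pairs))
    b_mix = sum(p * b for p, (_, b) in zip(weights, pairs))
    rhs = sum(p * f_divergence(f, a, b) for p, (a, b) in zip(weights, pairs))
    return rhs - f_divergence(f, a_mix, b_mix)

def demo(seed=2, d=3):
    rng = np.random.default_rng(seed)
    pairs = [(rand_pos(d, rng), rand_pos(d, rng)) for _ in range(3)]
    weights = [0.2, 0.5, 0.3]
    f = lambda x: x * np.log(x)          # relative entropy
    gap = joint_convexity_gap(f, pairs, weights)
    a, b = pairs[0]
    # Transposition leaves the f-divergence invariant
    transpose_diff = abs(f_divergence(f, a.T, b.T) - f_divergence(f, a, b))
    return gap, transpose_diff
```

Here `demo()` returns the convexity gap, which Corollary 4.7 predicts to be non-negative, and the change of $S_f$ under transposition, which should vanish up to rounding.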
5. Equality in the Monotonicity

In this section we analyze the situation where the monotonicity inequality $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ holds with equality, based on the integral representation of operator convex functions that we give in Sec. 8. By Theorem 8.1, every operator convex function $f$ on $[0, +\infty)$ admits a decomposition
$$f(x) = \alpha \mathbf{1}_{\{0\}}(x) + f(0^+) + ax + b f_2(x) + \int_{(0,\infty)} \left( \frac{x}{1 + t} + \varphi_t(x) \right) d\mu_f(t), \qquad x \in [0, +\infty), \tag{5.1}$$
where $\alpha, b \ge 0$, $f(0^+) := \lim_{x \searrow 0} f(x)$, $\mathbf{1}_{\{0\}}$ is the characteristic function of the singleton $\{0\}$, $f_2(x) := x^2$, $\varphi_t(x)$ is given in (4.9), and $\mu_f$ is a positive measure on $(0, +\infty)$. Recall that $\mathrm{spec}(X)$ denotes the spectrum of an operator $X$. We will use the notation $|H|$ to denote the cardinality of a set $H$. Given $B \in \mathcal{A}_{1,+}$ and a positive map $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$, let $\Phi_B : \mathcal{A}_1 \to \mathcal{A}_2$ and $\Phi_B^* : \mathcal{A}_2 \to \mathcal{A}_1$ be the maps defined in (3.1) and (3.2).
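To make the role of $\Phi_B^*$ concrete, the following sketch evaluates it numerically, assuming the form $\Phi_B^*(Y) = B^{1/2}\, \Phi^*(\Phi(B)^{-1/2}\, Y\, \Phi(B)^{-1/2})\, B^{1/2}$ that appears in the proof of Lemma 3.11. As an illustration only, $\Phi$ is taken to be a partial trace and $A = A_1 \otimes \omega$, $B = B_1 \otimes \omega$ share a common factor $\omega$, in the spirit of the decomposition (3.8); for such pairs $\Phi_B^*(\Phi(A)) = A$, while a generic $A$ is not recovered. The names `petz_recovery` and `demo` are hypothetical.

```python
import numpy as np

def rand_pos(d, rng):
    """Random positive definite d x d matrix (illustrative test data)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return g @ g.conj().T + 0.1 * np.eye(d)

def mpow(h, p):
    """Real power of a positive definite Hermitian matrix."""
    w, u = np.linalg.eigh(h)
    return (u * w**p) @ u.conj().T

def ptrace2(x, da, db):
    """Partial trace over the second tensor factor; plays the role of Phi."""
    return np.einsum('ikjk->ij', x.reshape(da, db, da, db))

def petz_recovery(y, b, da, db):
    """Phi_B^*(Y) = B^{1/2} Phi^*(Phi(B)^{-1/2} Y Phi(B)^{-1/2}) B^{1/2},
    with Phi^*(Z) = Z (x) I for the partial trace."""
    s = mpow(ptrace2(b, da, db), -0.5)
    sqrt_b = mpow(b, 0.5)
    return sqrt_b @ np.kron(s @ y @ s, np.eye(db)) @ sqrt_b

def demo(da=2, db=2, seed=3):
    rng = np.random.default_rng(seed)
    omega = rand_pos(db, rng)
    omega = omega / np.trace(omega).real          # density operator on the right factor
    a = np.kron(rand_pos(da, rng), omega)         # A = A_1 (x) omega
    b = np.kron(rand_pos(da, rng), omega)         # B = B_1 (x) omega
    recovers_b = np.allclose(petz_recovery(ptrace2(b, da, db), b, da, db), b)
    recovers_a = np.allclose(petz_recovery(ptrace2(a, da, db), b, da, db), a)
    a_generic = rand_pos(da * db, rng)            # no product structure
    err = np.linalg.norm(petz_recovery(ptrace2(a_generic, da, db), b, da, db) - a_generic)
    return bool(recovers_b), bool(recovers_a), float(err)
```

The first two entries returned by `demo()` report whether $B$ and the structured $A$ are recovered; the third is the recovery error for a generic $A$, which is expected to be clearly non-zero.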
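Theorem 5.1. Let $A, B \in \mathcal{A}_{1,+}$ be such that $\mathrm{supp}\, A \le \mathrm{supp}\, B$, let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$, and define
$$\Delta := L_A R_{B^{-1}} \quad \text{and} \quad \tilde\Delta := L_{\Phi(A)} R_{\Phi(B)^{-1}}.$$
Then, for the following conditions (i)–(x), we have (i)⇒(ii)⇒(iii)⇒(iv)⇔(v)⇔(vi)⇔(vii)⇔(viii)⇔(ix)⇒(x), and if $\Phi$ is 2-positive then (x)⇒(i) holds as well.

(i) There exists a stochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such that
$$\Psi(\Phi(A)) = A, \qquad \Psi(\Phi(B)) = B. \tag{5.2}$$
(ii) There exists a substochastic map $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$ such that (5.2) holds.
(iii) For every operator convex function $f$ on $[0, +\infty)$,
$$S_f(\Phi(A)\|\Phi(B)) = S_f(A\|B). \tag{5.3}$$
(iv) The equality in (5.3) holds for some operator convex function $f$ on $[0, +\infty)$ such that
$$|\mathrm{supp}\, \mu_f| \ge |\mathrm{spec}(\Delta) \cup \mathrm{spec}(\tilde\Delta)|. \tag{5.4}$$
(v) There exists a $T \subset (0, +\infty)$ such that $|T| \ge |\mathrm{spec}(\Delta) \cup \mathrm{spec}(\tilde\Delta)|$ and
$$S_{\varphi_t}(\Phi(A)\|\Phi(B)) = S_{\varphi_t}(A\|B), \qquad t \in T.$$
(vi) $B^0 \Phi^*(\Phi(B)^{-z}\Phi(A)^z) = B^{-z} A^z$ for all $z \in \mathbb{C}$.
(vii) $B^0 \Phi^*(\Phi(B)^{-\alpha}\Phi(A)^\alpha) = B^{-\alpha} A^\alpha$ for some $\alpha \in (0, 2) \setminus \{1\}$.
(viii) $B^0 \Phi^*(\Phi(B)^{-it}\Phi(A)^{it}) = B^{-it} A^{it}$ for all $t \in \mathbb{R}$.
(ix) $B^0 \Phi^*(\log^* \Phi(A) - (\log^* \Phi(B))\Phi(A)^0) = \log^* A - (\log^* B) A^0$.
(x) $\Phi_B^*(\Phi(A)) = A$.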
Moreover, (ii)⇒(iii) holds without assuming that $\mathrm{supp}\, A \le \mathrm{supp}\, B$. If $\Phi$ is $n$-positive/completely positive then $\Psi$ in (i) can also be assumed to be $n$-positive/completely positive.

Proof. The implication (i)⇒(ii) is obvious. Assume that (ii) holds, and let $\tilde A := \Phi(A)$, $\tilde B := \Phi(B)$. Then $\mathrm{Tr}\, A = \mathrm{Tr}\,\Psi(\tilde A) \le \mathrm{Tr}\, \tilde A = \mathrm{Tr}\,\Phi(A) \le \mathrm{Tr}\, A$ and similarly for $B$ and $\tilde B$, which yields $\mathrm{Tr}\,\Psi(\tilde A) = \mathrm{Tr}\, \tilde A$, $\mathrm{Tr}\,\Psi(\tilde B) = \mathrm{Tr}\, \tilde B$ and $\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\, A$, $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$ (note that this latter is automatic here, and not necessary to assume from the beginning). Applying Theorem 4.3 twice, we get that
$$S_f(A\|B) = S_f(\Psi(\tilde A)\|\Psi(\tilde B)) \le S_f(\tilde A\|\tilde B) = S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$$
for any operator convex function $f$ on $[0, +\infty)$, proving (iii). The implication (iii)⇒(iv) is again obvious.

Note that if $A = 0$ then $S_f(A\|B) = f(0)\, \mathrm{Tr}\, B$ for any function $f$, and (i)–(x) hold true automatically. Hence, for the rest we will assume that $A \ne 0$ and hence also $B \ne 0$.
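Assume that (iv) holds, i.e. $S_f(\Phi(A)\|\Phi(B)) = S_f(A\|B)$ for an operator convex function $f$ on $[0, +\infty)$ satisfying (5.4). By (5.1), we have
$$S_f(A\|B) = \alpha S_{\mathbf{1}_{\{0\}}}(A\|B) + f(0^+)\, \mathrm{Tr}\, B + a\, \mathrm{Tr}\, A + b S_{f_2}(A\|B) + \int_{(0,+\infty)} \left( \frac{\mathrm{Tr}\, A}{1 + t} + S_{\varphi_t}(A\|B) \right) d\mu_f(t)$$
(cf. (4.12)). Note that $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$ by assumption and $\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\, A$ follows due to Lemma 3.2. Thus,
$$0 = S_f(A\|B) - S_f(\Phi(A)\|\Phi(B)) = \alpha\big(S_{\mathbf{1}_{\{0\}}}(A\|B) - S_{\mathbf{1}_{\{0\}}}(\Phi(A)\|\Phi(B))\big) + b\big(S_{f_2}(A\|B) - S_{f_2}(\Phi(A)\|\Phi(B))\big) + \int_{(0,+\infty)} \big(S_{\varphi_t}(A\|B) - S_{\varphi_t}(\Phi(A)\|\Phi(B))\big)\, d\mu_f(t).$$
By Theorem 4.3, the $f$-divergences corresponding to $\mathbf{1}_{\{0\}}$, $f_2$ and $\varphi_t$ are monotonic non-increasing under $\Phi$, and hence the above equality yields that $S_{\varphi_t}(\Phi(A)\|\Phi(B)) = S_{\varphi_t}(A\|B)$ for all $t \in \mathrm{supp}\, \mu_f$. This gives (v) with $T := \mathrm{supp}\, \mu_f$.

Assume now that (v) holds. This means that for every $t \in T$,
$$0 = S_{\varphi_t}(A\|B) - S_{\varphi_t}(\Phi(A)\|\Phi(B)) = \langle \Phi(B)^{1/2}, (V^* \varphi_t(\Delta) V - \varphi_t(\tilde\Delta))\Phi(B)^{1/2}\rangle_{HS},$$
where we used that $V\Phi(B)^{1/2} = B^{1/2}$ due to Lemma 4.1 (note that $\omega(\varphi_t) = 0$, $t > 0$). By (4.13) this is equivalent to
$$V^* \varphi_t(\Delta) V \Phi(B)^{1/2} = \varphi_t(\tilde\Delta)\Phi(B)^{1/2}, \qquad t \in T,$$
or equivalently,
$$V^*[-I_1 + t(\Delta + tI_1)^{-1}] B^{1/2} = [-I_2 + t(\tilde\Delta + tI_2)^{-1}]\Phi(B)^{1/2}, \qquad t \in T.$$
By (4.2) we get
$$V^*(\Delta + tI_1)^{-1} B^{1/2} = (\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2}, \qquad t \in T.$$
Using Lemma 5.2 below and the assumption that $|T| \ge |\mathrm{spec}(\Delta) \cup \mathrm{spec}(\tilde\Delta)|$, we obtain
$$V^* h(\Delta) B^{1/2} = h(\tilde\Delta)\Phi(B)^{1/2} \tag{5.5}$$
for any function $h$ on $\mathrm{spec}(\Delta) \cup \mathrm{spec}(\tilde\Delta)$. In particular,
$$V^*(\Delta + tI_1)^{-\gamma} B^{1/2} = (\tilde\Delta + tI_2)^{-\gamma}\Phi(B)^{1/2}, \qquad \gamma, t > 0. \tag{5.6}$$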
Using (5.6) with $\gamma = 1$ and $\gamma = 2$, we obtain
$$\|V^*(\Delta + tI_1)^{-1} B^{1/2}\|_{HS}^2 = \langle (\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2}, (\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2}\rangle_{HS} = \langle (\tilde\Delta + tI_2)^{-2}\Phi(B)^{1/2}, \Phi(B)^{1/2}\rangle_{HS}$$
$$= \langle V^*(\Delta + tI_1)^{-2} B^{1/2}, \Phi(B)^{1/2}\rangle_{HS} = \langle (\Delta + tI_1)^{-2} B^{1/2}, B^{1/2}\rangle_{HS} = \|(\Delta + tI_1)^{-1} B^{1/2}\|_{HS}^2.$$
Therefore, we have $\|V^* x\|_{HS}^2 = \|x\|_{HS}^2$ for $x := (\Delta + tI_1)^{-1} B^{1/2}$, and since $V$ is a contraction, we get
$$0 \le \|VV^* x - x\|_{HS}^2 = \|VV^* x\|_{HS}^2 - 2\|V^* x\|_{HS}^2 + \|x\|_{HS}^2 = \|VV^* x\|_{HS}^2 - \|x\|_{HS}^2 \le 0,$$
by which $VV^*(\Delta + tI_1)^{-1} B^{1/2} = (\Delta + tI_1)^{-1} B^{1/2}$. Substituting (5.6) with $\gamma = 1$, we finally obtain
$$V(\tilde\Delta + tI_2)^{-1}\Phi(B)^{1/2} = (\Delta + tI_1)^{-1} B^{1/2}, \qquad t > 0, \tag{5.7}$$
and using again Lemma 5.2, we get
$$V h(\tilde\Delta)\Phi(B)^{1/2} = h(\Delta) B^{1/2}$$
for any function $h$ on $\mathrm{spec}(\Delta) \cup \mathrm{spec}(\tilde\Delta)$. By the definition (4.1) of $V$, this means that
$$\Phi^*\big((h(\tilde\Delta)\Phi(B)^{1/2})\Phi(B)^{-1/2}\big) B^{1/2} = h(\Delta) B^{1/2}.$$
In particular, the choice $h(x) := x^z$, $x > 0$, $h(0) := 0$, yields
$$\Phi^*(\Phi(A)^z \Phi(B)^{-z}) B^{1/2} = A^z B^{1/2 - z}, \qquad z \in \mathbb{C}. \tag{5.8}$$
Multiplying from the right with $B^{-1/2}$ and taking the adjoint, we obtain (vi).

The implication (vi)⇒(vii) is obvious. Assume now that (vii) holds, i.e. $B^{-\alpha} A^\alpha = B^0 \Phi^*(\Phi(B)^{-\alpha}\Phi(A)^\alpha)$ for some $\alpha \in (0, 2) \setminus \{1\}$. Multiplying by $B$ and taking the trace, we obtain
$$S_{f_\alpha}(A\|B) = \mathrm{Tr}\, A^\alpha B^{1-\alpha} = \mathrm{Tr}\, B\Phi^*(\Phi(B)^{-\alpha}\Phi(A)^\alpha) = \mathrm{Tr}\,\Phi(B)\Phi(B)^{-\alpha}\Phi(A)^\alpha = S_{f_\alpha}(\Phi(A)\|\Phi(B)),$$
where $f_\alpha(x) := x^\alpha$, $x \ge 0$. Since the support of the representing measure $\mu_{f_\alpha}$ is $(0, +\infty)$ (see Example 8.3), we see that (vii) implies (iv), and hence also (vi). The equivalence of (vi) and (viii) is obvious from the fact that the functions $z \mapsto B^0 \Phi^*(\Phi(B)^{-z}\Phi(A)^z)$ and $z \mapsto B^{-z} A^z$ are both analytic on the whole complex plane. Differentiating (viii) at $t = 0$, we obtain (ix). A straightforward computation shows that (ix) yields (iv) for $f(x) := x \log x$, that is, the equality for the standard relative entropy (note that the support of the representing measure for $x \log x$ is $(0, +\infty)$ by Example 8.3). Hence, we have proved that (i)⇒(ii)⇒(iii)⇒(iv)⇔(v)⇔(vi)⇔(vii)⇔(viii)⇔(ix).

Assume now that (vi) holds. In particular, the choice $z = 0$ yields
$$B^0 \Phi^*(\Phi(A)^0) = A^0 \tag{5.9}$$
(recall that $A^0 \le B^0$). Since $\Phi$ is substochastic, we have $\Phi^*(Y^*Y) \ge \Phi^*(Y^*)\Phi^*(Y) \ge \Phi^*(Y^*) B^0 \Phi^*(Y)$, and multiplying from both sides by $B^0$, we obtain that $\Psi(Y) := B^0 \Phi^*(Y) B^0$, $Y \in \mathcal{A}_2$, is a Schwarz contraction. For $u_t := \Phi(B)^{-it}\Phi(A)^{it}$ and $w_t := B^{-it} A^{it}$, we have
$$u_t u_t^* = \Phi(B)^{-it}\Phi(A)^0 \Phi(B)^{it}, \qquad w_t w_t^* = B^{-it} A^0 B^{it}, \qquad t \in \mathbb{R}.$$
Note that (vi) says that $B^0 \Phi^*(u_t) = w_t$, and hence $\Psi(u_t) = w_t B^0 = w_t$. Thus,
$$0 \le \mathrm{Tr}\, B^{1/2}\big(\Psi(u_t u_t^*) - \Psi(u_t)\Psi(u_t^*)\big) B^{1/2} = \mathrm{Tr}\, B\Phi^*(u_t u_t^*) - \mathrm{Tr}\, B w_t w_t^*$$
$$= \mathrm{Tr}\,\Phi(B)\Phi(B)^{-it}\Phi(A)^0 \Phi(B)^{it} - \mathrm{Tr}\, B B^{-it} A^0 B^{it} = \mathrm{Tr}\,\Phi(B)\Phi(A)^0 - \mathrm{Tr}\, BA^0$$
$$= \mathrm{Tr}\, B\Phi^*(\Phi(A)^0) - \mathrm{Tr}\, BA^0 = \mathrm{Tr}\, BA^0 - \mathrm{Tr}\, BA^0 = 0,$$
where we used (5.9). Hence, $B^{1/2}\Psi(u_t u_t^*) B^{1/2} = B^{1/2}\Psi(u_t)\Psi(u_t^*) B^{1/2}$, and multiplying from both sides with $B^{-1/2}$, we obtain $\Psi(u_t u_t^*) = \Psi(u_t)\Psi(u_t^*)$. Since $\Psi(u_t) \ne 0$, and $\Psi$ is a Schwarz contraction, this yields that $\|\Psi\|_S = 1$ and $u_t \in \mathcal{M}_\Psi$. Hence, by Lemma 3.9, $\Psi(u_t Y) = \Psi(u_t)\Psi(Y) = w_t \Phi^*(Y) B^0$ for all $Y \in \mathcal{A}_2$ and $t \in \mathbb{R}$, i.e.
$$B^0 \Phi^*(\Phi(B)^{-it}\Phi(A)^{it} Y) B^0 = B^{-it} A^{it} \Phi^*(Y) B^0, \qquad t \in \mathbb{R}, \quad Y \in \mathcal{A}_2.$$
Note that the maps $z \mapsto B^0 \Phi^*(\Phi(B)^{-z}\Phi(A)^z Y) B^0$ and $z \mapsto B^{-z} A^z \Phi^*(Y) B^0$ are analytic on the whole complex plane and coincide on $i\mathbb{R}$ and thus they are equal for every $z \in \mathbb{C}$. Choosing $z = 1/2$ and $Y := \Phi(A)^{1/2}\Phi(B)^{-1/2}$, we get
$$B^0 \Phi^*(\Phi(B)^{-1/2}\Phi(A)^{1/2}\Phi(A)^{1/2}\Phi(B)^{-1/2}) B^0 = B^{-1/2} A^{1/2} \Phi^*(\Phi(A)^{1/2}\Phi(B)^{-1/2}) B^0 = B^{-1/2} A^{1/2} A^{1/2} B^{-1/2},$$
where we used the adjoint of (vi) with $z = 1/2$. Multiplying from both sides by $B^{1/2}$, we obtain (x).

Finally, assume that (x) holds, and hence
$$\Phi_B^*(\Phi(A)) = A, \qquad \Phi_B^*(\Phi(B)) = B.$$
Note that $\Phi_B^*$ is not necessarily trace-preserving, as $(\Phi_B^*)^*(I_1) = \Phi_B(I_1) = \Phi(B)^0$, which might be strictly smaller than $I_2$. However, if $\rho$ is a density operator on $\mathcal{H}_1$ then the map $X \mapsto \Phi_B(X) + (\mathrm{Tr}\, \rho X)(I_2 - \Phi(B)^0)$ is obviously unital and hence its adjoint $\Psi : \mathcal{A}_2 \to \mathcal{A}_1$, $\Psi(Y) = \Phi_B^*(Y) + [\mathrm{Tr}\,(I_2 - \Phi(B)^0) Y]\rho$ is trace-preserving. Moreover, $\Psi(\Phi(A)) = \Phi_B^*(\Phi(A))$ and $\Psi(\Phi(B)) = \Phi_B^*(\Phi(B))$, as one can easily verify. Since $\Psi$ is obtained from $\Phi^*$ by composing it with completely positive maps and adding a completely positive map, it inherits the positivity of $\Phi^*$, i.e. if $\Phi$, and hence $\Phi^*$, is $n$-positive/completely positive then so is $\Psi$. In particular, if $\Phi$ is 2-positive then $\Psi^*$ is a unital 2-positive map and hence it is also a Schwarz contraction, i.e. $\Psi$ is stochastic. Thus (x)⇒(i) holds in this case.

Lemma 5.2. If $f$ is a complex-valued function on finitely many points $\{x_i\}_{i \in I} \subset [0, +\infty)$ then for any pairwise different positive numbers $\{t_i\}_{i \in I}$, there exist complex numbers $\{c_i\}_{i \in I}$ such that $f(x_i) = \sum_{j \in I} c_j \frac{1}{x_i + t_j}$, $i \in I$.
Proof. The matrix $C$ with entries $C_{ij} := \frac{1}{x_i + t_j}$, $i, j \in I$, is a Cauchy matrix which is invertible due to the assumptions that $x_i \ne x_j$ and $t_i \ne t_j$ for $i \ne j$. From this the statement follows.
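The interpolation in Lemma 5.2 amounts to solving one linear system with a Cauchy matrix. The following minimal sketch illustrates this; the function name `cauchy_coefficients` and the sample data are hypothetical, and the nodes $x_i \ge 0$ and the numbers $t_j > 0$ are each assumed to be pairwise different.

```python
import numpy as np

def cauchy_coefficients(x, t, fvals):
    """Solve f(x_i) = sum_j c_j / (x_i + t_j) for the coefficients c_j.

    x: pairwise different points in [0, +inf); t: pairwise different positive
    numbers of the same length; fvals: (complex) values f(x_i).
    """
    x, t = np.asarray(x, dtype=float), np.asarray(t, dtype=float)
    cauchy = 1.0 / (x[:, None] + t[None, :])   # C_ij = 1 / (x_i + t_j)
    return np.linalg.solve(cauchy, np.asarray(fvals, dtype=complex))

def interpolate(x, t, coeffs):
    """Evaluate sum_j c_j / (x_i + t_j) at the points x_i."""
    x, t = np.asarray(x, dtype=float), np.asarray(t, dtype=float)
    return (1.0 / (x[:, None] + t[None, :])) @ coeffs
```

For example, with `x = [0.0, 0.5, 2.0]`, `t = [1.0, 2.0, 3.0]` and arbitrary complex values `fvals`, `interpolate(x, t, cauchy_coefficients(x, t, fvals))` reproduces `fvals` up to rounding. Cauchy matrices become ill-conditioned quickly as the number of points grows, so this direct solve is only meant for small index sets.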
Corollary 5.3. Assume that $\mathrm{supp}\, A_i \le \mathrm{supp}\, B_i$, $i = 1, \ldots, r$, in the setting of Corollary 4.7. Then equality holds in (4.20) if and only if
$$p_i A_i = p_i B_i^{1/2} \Big(\sum_j p_j B_j\Big)^{-1/2} \Big(\sum_j p_j A_j\Big) \Big(\sum_j p_j B_j\Big)^{-1/2} B_i^{1/2}, \qquad i = 1, \ldots, r.$$
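Proof. It is immediate from writing out the equality $A = \Phi_B^*(\Phi(A))$ given in (x) in the setting of Corollary 4.7.

Remark 5.4. Note that if $\mathrm{supp}\, A \le \mathrm{supp}\, B$ and $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$ then for a linear function $f(x) = f(0) + ax$, the preservation of the $f$-divergence is automatic, and has no implication on the reversibility of $\Phi$ on $\{A, B\}$. Indeed, we have $\mathrm{Tr}\,\Phi(A) = \mathrm{Tr}\, A$ due to Lemma 3.2, and $S_f(\Phi(A)\|\Phi(B)) = f(0)\, \mathrm{Tr}\,\Phi(B) + a\, \mathrm{Tr}\,\Phi(A) = f(0)\, \mathrm{Tr}\, B + a\, \mathrm{Tr}\, A = S_f(A\|B)$. The $f$-divergence corresponding to the quadratic function $f_2(x) := x^2$ is $S_{f_2}(A\|B) = \mathrm{Tr}\, A^2 B^{-1}$ (when $\mathrm{supp}\, A \le \mathrm{supp}\, B$). Preservation of the $f$-divergence by a stochastic map is not automatic in this case; however, it is not sufficient for the reversibility of the map, either. Indeed, it was shown in [28, Example 2.2] that there exists a positive definite operator $D_{123}$ on a tripartite Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3$, such that
$$D_{123}(\tau_1 \otimes D_{23})^{-1} = (D_{12} \otimes \tau_3)(\tau_1 \otimes D_2 \otimes \tau_3)^{-1}, \tag{5.10}$$
but
$$D_{123}^{it}(\tau_1 \otimes D_{23})^{-it} \ne (D_{12} \otimes \tau_3)^{it}(\tau_1 \otimes D_2 \otimes \tau_3)^{-it} \quad \text{for some } t \in \mathbb{R}, \tag{5.11}$$
where $\tau_i := \frac{1}{\dim \mathcal{H}_i} I_i$, and $D_{23} := \mathrm{Tr}_{\mathcal{H}_1} D_{123}$, $D_{12} := \mathrm{Tr}_{\mathcal{H}_3} D_{123}$, $D_2 := \mathrm{Tr}_{\mathcal{H}_1 \otimes \mathcal{H}_3} D_{123}$. Define $\mathcal{H} := \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \mathcal{H}_3$, $A := D_{123}$ and $B := \tau_1 \otimes D_{23}$. Let $\mathcal{A}_1 := \mathcal{B}(\mathcal{H})$, $\mathcal{A}_2 := \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2) \otimes I_3$ and let $\Phi^*$ be the identical embedding of $\mathcal{A}_2$ into $\mathcal{A}_1$. Then, (5.10) reads as $AB^{-1} = \Phi(A)\Phi(B)^{-1}$. Multiplying both sides by $A$ and taking the trace, we obtain
$$\mathrm{Tr}\, A^2 B^{-1} = \mathrm{Tr}\, A\Phi(A)\Phi(B)^{-1}. \tag{5.12}$$
Note that $\Phi$ is the orthogonal (with respect to the Hilbert–Schmidt inner product) projection from $\mathcal{A}_1$ onto $\mathcal{A}_2$, i.e. $\Phi$ is the conditional expectation onto $\mathcal{A}_2$ with respect to $\mathrm{Tr}$, and $\Phi(A)\Phi(B)^{-1} \in \mathcal{A}_2$. Hence, we have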
$\mathrm{Tr}\, A\Phi(A)\Phi(B)^{-1} = \mathrm{Tr}\,\Phi(A)^2 \Phi(B)^{-1}$. Hence, (5.12) can be rewritten as $S_{f_2}(A\|B) = \mathrm{Tr}\, A^2 B^{-1} = \mathrm{Tr}\,\Phi(A)^2 \Phi(B)^{-1} = S_{f_2}(\Phi(A)\|\Phi(B))$. However, (5.11) tells that $A^{it} B^{-it} \ne \Phi^*(\Phi(A)^{it}\Phi(B)^{-it})$ for some $t \in \mathbb{R}$, and hence (viii) in Theorem 5.1 is not satisfied. Since $\Phi$ is 2-positive (actually, completely positive), it means that none of (i)–(x) of Theorem 5.1 are satisfied.

Remark 5.5. It was shown in [8] that, in the classical setting, preservation of an $f$-divergence by $\Phi$ is equivalent to the reversibility condition (x) of Theorem 5.1 whenever $f$ is strictly convex. We reformulate the classical case in our setting in the Appendix, and use the condition for equality to give a necessary and sufficient condition for the equality in the operator Hölder and inverse Hölder inequalities.

Remark 5.6. The classical case suggests that the support condition (5.4) might be too restrictive in general. On the other hand, [24] provides an example where the $f$-divergence corresponding to a function $f$ with $|\mathrm{supp}\, \mu_f| = 1$ is preserved and yet the reversibility property (x) of Theorem 5.1 fails to hold. This shows that the support condition (5.4) cannot be completely removed in general.

Remark 5.7. Theorem 5.1 holds also if we replace $\Phi$ and $\Psi$ with co-(sub)stochastic maps, and change conditions (vi)–(viii) to the following:
(vi)′ $B^0 \Phi^*(\Phi(A)^z \Phi(B)^{-z}) = B^{-z} A^z$ for all $z \in \mathbb{C}$.
(vii)′ $B^0 \Phi^*(\Phi(A)^\alpha \Phi(B)^{-\alpha}) = B^{-\alpha} A^\alpha$ for some $\alpha \in (0, 2) \setminus \{1\}$.
(viii)′ $B^0 \Phi^*(\Phi(A)^{it}\Phi(B)^{-it}) = B^{-it} A^{it}$ for all $t \in \mathbb{R}$.
In the proof of (v)⇒(vi)′, the previous equality $V h(\tilde\Delta)\Phi(B)^{1/2} = h(\Delta) B^{1/2}$ in (5.5) is replaced with
$$\hat V h(\tilde\Delta)\Phi(B)^{1/2} = \bar h(\Delta) B^{1/2}$$
due to the conjugate-linearity of $\hat V$, where $\hat V$ is given in (4.21). In the proof of (vi)′⇒(x), let $u_t := \Phi(A)^{it}\Phi(B)^{-it}$ and $w_t := B^{-it} A^{it}$; then
$$u_t^* u_t = \Phi(B)^{it}\Phi(A)^0 \Phi(B)^{-it}, \qquad w_t w_t^* = B^{-it} A^0 B^{it}, \qquad t \in \mathbb{R}.$$
Using that $\Phi$ is a co-Schwarz contraction, we have $\Phi(u_t^* u_t) = \Phi(u_t)\Phi(u_t^*)$. From the multiplicative domain for a co-Schwarz contraction, we have $\Phi(Y u_t) = \Phi(u_t)\Phi(Y) = w_t \Phi^*(Y) B^0$ for all $Y \in \mathcal{A}_2$ and $t \in \mathbb{R}$. The rest of the proof is as before with $Y = \Phi(B)^{-1/2}\Phi(A)$. The implication (x)⇒(i) holds also if we assume $\Phi$ to be 2-copositive.

Remark 5.8. Note that the assumption that $\Phi$ is substochastic guarantees that $(\Phi_B^*)^* = \Phi_B$ is a Schwarz map, which is also subunital. However, as Example 3.6 shows, there exist subunital Schwarz maps that are not Schwarz contractions.
Even more, it was shown in [24] that if $\Phi$ is not 2-positive then there exists a positive invertible $B$ such that $\Phi_B$ is not a Schwarz contraction. To circumvent this problem, we assumed that $\Phi$ is 2-positive in the proof of (x)⇒(i) of Theorem 5.1. Note on the other hand that the monotonicity inequality holds not only for substochastic maps but also for Schwarz decomposable maps, i.e. for those maps that can be decomposed as a convex combination of a substochastic and a co-substochastic map; see Remark 4.8. Hence, the implication (x)⇒(iii) might still hold even if $\Phi_B^*$ is not a substochastic map. It is easy to see that this is the case, for instance, if $\Phi$ is 2-decomposable, i.e. it is the convex combination of two trace non-increasing maps, one being 2-positive and the other a composition of a 2-positive map with a transposition. It is an open question whether the Schwarz decomposability of $\Phi$ implies that $\Phi_B^*$ is Schwarz decomposable for every positive semidefinite $B$.

6. Distinguishability Measures Related to Binary State Discrimination

Let $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ be a $C^*$-algebra, where $\mathcal{H}$ is a finite-dimensional Hilbert space, and let $\mathcal{S}(\mathcal{A})$ be the state space of $\mathcal{A}$, i.e. $\mathcal{S}(\mathcal{A}) := \{A \in \mathcal{A}_+ : \mathrm{Tr}\, A = 1\}$ is the set of density operators in $\mathcal{A}$.

Definition 6.1. For $A, B \in \mathcal{A}_+$, the Chernoff distance $C(A\|B)$ of $A$ and $B$ is defined as
$$C(A\|B) := \sup_{0 \le \alpha < 1} \{(1 - \alpha) S_\alpha(A\|B)\} = -\min_{0 \le \alpha \le 1} \psi(\alpha|A\|B), \tag{6.1}$$
where $S_\alpha(A\|B)$ is the Rényi relative entropy defined in Example 2.7, and
$$\psi(\alpha|A\|B) := \log \mathrm{Tr}\, A^\alpha B^{1-\alpha}, \qquad \alpha \in \mathbb{R}. \tag{6.2}$$
For every $r \in \mathbb{R}$, we define the Hoeffding distance $H_r(A\|B)$ of $A$ and $B$ as
$$H_r(A\|B) := \sup_{0 \le \alpha < 1} S_\alpha(e^r A\|B) = \sup_{0 \le \alpha < 1} \left\{ -\frac{\alpha r}{1 - \alpha} + S_\alpha(A\|B) \right\} = \sup_{0 \le \alpha < 1} \frac{-\alpha r - \psi(\alpha|A\|B)}{1 - \alpha}. \tag{6.3}$$

Remark 6.2. Note that
$$H_r(A\|B) = \sup_{s \ge 0} \{-sr - \tilde\psi(s|A\|B)\}, \tag{6.4}$$
where
$$\tilde\psi(s|A\|B) := (1 + s)\, \psi(s/(1 + s)|A\|B), \quad s \in [0, +\infty), \qquad \tilde\psi(s|A\|B) := +\infty, \quad s < 0.$$
For simplicity, we will use the notation $\psi(\alpha) = \psi(\alpha|A\|B)$ and $\tilde\psi(s) := \tilde\psi(s|A\|B)$. Let $\tilde\psi^*(r) := \sup_{s \in \mathbb{R}} \{sr - \tilde\psi(s)\}$ be the polar function, or Legendre–Fenchel transform of $\tilde\psi$ [12]. By (6.4), $H_r(\rho\|\sigma) = \tilde\psi^*(-r)$, $r \in \mathbb{R}$. It is easy to see (by computing its second derivative) that $\psi$ is convex, and hence so is $\tilde\psi$. Furthermore, $\tilde\psi'(s) = \psi(s/(1 + s)) + \psi'(s/(1 + s))/(1 + s)$, $s \in (0, +\infty)$, and $\partial^+ \tilde\psi(0) = \psi(0) + \psi'(0)$, where $\partial^+ \tilde\psi(0)$ is the right derivative of $\tilde\psi$ at 0. In particular, $\lim_{s \to +\infty} \tilde\psi'(s) = \psi(1)$. Hence,
$$H_r(A\|B) = \tilde\psi^*(-r) = \begin{cases} -\tilde\psi(0) = -\psi(0), & -r < \psi(0) + \psi'(0), \\ +\infty, & -r > \psi(1). \end{cases}$$
It is easy to see that
$$\psi(0) = -S_0(A\|B), \quad \text{and if } A^0 \ge B^0 \text{ then } \psi'(0) = -S(B\|A),$$
$$\psi(1) = -S_0(B\|A), \quad \text{and if } A^0 \le B^0 \text{ then } \psi'(1) = S(A\|B).$$
Being a polar function, $\tilde\psi^*$ is convex, and hence so is the function $r \mapsto H_r(\rho\|\sigma)$. Moreover, $\tilde\psi$ is lower semicontinuous and thus the bipolar theorem (see, e.g. [12, Proposition 4.1]) yields that $\tilde\psi$ is the polar function of its polar $\tilde\psi^*$. Hence, for every $s \in [0, +\infty)$, we have
$$(1 + s)\, \psi\!\left(\frac{s}{1 + s}\right) = \tilde\psi(s) = \sup_{r \in \mathbb{R}} \{sr - \tilde\psi^*(r)\} = \sup_{\psi(0) + \psi'(0) \le -r \le \psi(1)} \{-rs - \tilde\psi^*(-r)\}.$$
Replacing $s$ with $\alpha/(1 - \alpha)$, we finally get that for every $\alpha \in [0, 1)$,
$$-S_\alpha(A\|B) = \frac{\psi(\alpha)}{1 - \alpha} = \sup_{r \in \mathbb{R}} \left\{ -\frac{r\alpha}{1 - \alpha} - H_r(A\|B) \right\} = \sup_{-\psi(1) \le r \le -\psi(0) - \psi'(0)} \left\{ -\frac{r\alpha}{1 - \alpha} - H_r(A\|B) \right\}. \tag{6.5}$$
That is, the Rényi $\alpha$-relative entropies with parameter $\alpha \in [0, 1)$ and the Hoeffding distances mutually determine each other. If $\mathrm{Tr}\, A \le 1$ then $\psi(1) = \log \mathrm{Tr}\, AB^0 \le 0$, and hence the optimization is over nonnegative values of $r$ in the last formula of (6.5). Thus, $\alpha \mapsto S_\alpha(A\|B)$ is monotonic increasing on $[0, 1)$ and hence
$$H_0(A\|B) = \lim_{\alpha \nearrow 1} S_\alpha(A\|B) =: S_1(A\|B).$$
Note that $\tilde\psi^*$ is lower semicontinuous (see, e.g., [12, Proposition 4.1 and Corollary 4.1]), and hence $\tilde\psi^*(0) \le \liminf_{r \searrow 0} \tilde\psi^*(-r)$. On the other hand, it is
obvious from the definition that $r \mapsto H_r(A\|B) = \tilde\psi^*(-r)$ is monotonic decreasing on $\mathbb{R}$, and hence we finally obtain
$$\lim_{r \searrow 0} H_r(A\|B) = \lim_{r \searrow 0} \tilde\psi^*(-r) = \tilde\psi^*(0) = H_0(A\|B) = S_1(A\|B). \tag{6.6}$$
Finally, it is easy to verify that
$$S_1(A\|B) = S(A\|B) \quad \text{if } \mathrm{Tr}\, A = 1. \tag{6.7}$$
The importance of the above measures comes from the problem of binary state discrimination, that we briefly describe below. Assume that we have several identical copies of a quantum system, and we know that either all of them are in a state described by a density operator $\rho$, or all of them are in a state described by a density operator $\sigma$. We assume that the system's Hilbert space $\mathcal{H}$ is finite-dimensional. Our goal is to give a good guess on the true state of the system, based on the outcome of a binary POVM measurement $(T, I-T)$ on a fixed number (say $n$) copies, where $T$ is an operator on $\mathcal{H}^{\otimes n}$ satisfying $0\le T\le I$. If the outcome corresponding to $T$ happens then we conclude that the state of the system is $\rho$, and an error occurs if the true state is $\sigma$, which has probability $\beta_n(T) := \operatorname{Tr}\sigma^{\otimes n}T$. Similarly, the outcome corresponding to $I-T$ yields the guess $\sigma$ for the true state, and the probability of error in this case is $\alpha_n(T) := \operatorname{Tr}\rho^{\otimes n}(I-T)$. If, moreover, there are prior probabilities $p$ and $1-p$ assigned to $\rho$ and $\sigma$, then the optimal Bayesian error probability is given by
\[
P_{n,p} := \min_{0\le T\le I}\{p\,\alpha_n(T) + (1-p)\,\beta_n(T)\} = (1 - \|p\rho - (1-p)\sigma\|_1)/2,
\]
where the minimum is reached at $T = \{p\rho - (1-p)\sigma > 0\}$, the spectral projection corresponding to the positive part of the spectrum of $p\rho - (1-p)\sigma$. For every $p\in(0,1)$, let
\[
T_p(\rho\|\sigma) :=
\begin{cases}
-\log\dfrac{1}{2p}\,(1 - \|p\rho - (1-p)\sigma\|_1) = -\log\dfrac{1}{p}\,P_{n,p}, & 0 < p \le 1/2,\\[2ex]
-\log\dfrac{1}{2(1-p)}\,(1 - \|p\rho - (1-p)\sigma\|_1) = -\log\dfrac{1}{1-p}\,P_{n,p}, & 1/2 < p < 1.
\end{cases} \tag{6.8}
\]
The theorem for the quantum Chernoff bound [3, 37] says that, as the number of copies $n$ tends to infinity, the error probabilities $P_{n,p}$ decay exponentially, and the rate of the decay is given by the Chernoff distance. More formally,
\[
-\lim_{n\to\infty}\frac{1}{n}\log P_{n,p} = \lim_{n\to\infty}\frac{1}{n}\,T_p(\rho^{\otimes n}\|\sigma^{\otimes n}) = C(\rho\|\sigma), \qquad p\in(0,1). \tag{6.9}
\]
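For commuting states the quantities above reduce to classical sums, which makes them easy to check numerically. The following is a minimal sketch under that assumption: states are represented as tuples of eigenvalues in a common eigenbasis, and the helper names (`psi`, `chernoff`, `bayes_error`, `helstrom`) are illustrative and not taken from the paper. It evaluates $\psi(\alpha|\rho\|\sigma) = \log\operatorname{Tr}\rho^\alpha\sigma^{1-\alpha}$, approximates the Chernoff distance as $-\min_{0\le\alpha\le1}\psi$, and computes the $n$-copy Bayesian error by brute force.

```python
import math
from itertools import product

def psi(alpha, rho, sigma):
    """log Tr rho^alpha sigma^(1-alpha) for commuting states given as eigenvalue tuples.
    Zero eigenvalues are skipped, so rho^0 and sigma^0 act as support projections.
    Assumes the supports overlap (otherwise the sum is zero and log fails)."""
    total = sum(r ** alpha * s ** (1.0 - alpha)
                for r, s in zip(rho, sigma) if r > 0 and s > 0)
    return math.log(total)

def chernoff(rho, sigma, iters=200):
    """Approximate -min over [0, 1] of psi by ternary search (psi is convex)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if psi(m1, rho, sigma) <= psi(m2, rho, sigma):
            hi = m2
        else:
            lo = m1
    # Include both endpoints explicitly, since the minimum may sit on the boundary.
    return -min(psi(a, rho, sigma) for a in (0.0, 0.5 * (lo + hi), 1.0))

def bayes_error(n, p, rho, sigma):
    """Optimal Bayesian error for n i.i.d. copies of commuting states:
    sum over outcome strings x of min(p * rho^n(x), (1 - p) * sigma^n(x))."""
    total = 0.0
    for idx in product(range(len(rho)), repeat=n):
        pr = math.prod(rho[i] for i in idx)
        ps = math.prod(sigma[i] for i in idx)
        total += min(p * pr, (1 - p) * ps)
    return total

def helstrom(p, rho, sigma):
    """Single-copy value (1 - ||p*rho - (1-p)*sigma||_1) / 2 for commuting states."""
    return (1 - sum(abs(p * r - (1 - p) * s) for r, s in zip(rho, sigma))) / 2
```

For a single copy `bayes_error(1, p, rho, sigma)` agrees with `helstrom(p, rho, sigma)`, and for every `n` the brute-force error stays below `exp(-n * chernoff(rho, sigma))`, which is the easy half of the exponential decay described above.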
In the asymmetric setting of the quantum Hoeffding bound, the error probabilities $\alpha_n$ are required to be exponentially small, and $\beta_n$ is optimized under this constraint, i.e. one is interested in the quantities
\[
\beta_{n,r} := \min\{\beta_n(T) : \alpha_n(T)\le e^{-nr},\ T\in\mathcal{B}(\mathcal{H}^{\otimes n}),\ 0\le T\le I\},
\]
where $r$ is some fixed positive number. The theorem for the quantum Hoeffding bound [15, 36] says that, for every $r>0$, the error probabilities $\beta_{n,r}$ decay exponentially fast as $n$ goes to infinity, and the decay rate is given by the Hoeffding distance with parameter $r$. Moreover, if $\operatorname{supp}\rho\le\operatorname{supp}\sigma$, then for every $r>0$ we have a real number $a_r$ such that [22, 36]
\[
-\lim_{n\to\infty}\frac{1}{n}\log\beta_{n,r} = \lim_{n\to\infty}\frac{1}{n}\,T_{\frac{e^{-na_r}}{1+e^{-na_r}}}(\rho^{\otimes n}\|\sigma^{\otimes n}) = H_r(\rho\|\sigma). \tag{6.10}
\]
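Note that for density operators $\rho$ and $\sigma$, $\psi(\alpha|\rho\|\sigma) = \log\operatorname{Tr}\rho^\alpha\sigma^{1-\alpha}\le 0$ for every $\alpha\in[0,1]$ due to Hölder's inequality (A.8). Hence, $C(\rho\|\sigma)\ge 0$, and $C(\rho\|\sigma) = 0$ if and only if equality holds in Hölder's inequality, which is equivalent to $\rho = \sigma$. Similarly, $H_r(\rho\|\sigma)\ge 0$ for every $r\in\mathbb{R}$, and $H_r(\rho\|\sigma) = 0$ if and only if $\rho = \sigma$, or $\operatorname{supp}\rho\ge\operatorname{supp}\sigma$ and $r\ge S(\sigma\|\rho)$.
Proposition 6.3. Let $A, B\in\mathcal{A}_{1,+}$ and let $\Phi:\mathcal{A}_1\to\mathcal{A}_2$ be a substochastic map such that $\operatorname{Tr}\Phi(B) = \operatorname{Tr}B$. Then
\[
C(\Phi(A)\|\Phi(B))\le C(A\|B) \quad\text{and}\quad H_r(\Phi(A)\|\Phi(B))\le H_r(A\|B), \quad r\in\mathbb{R}. \tag{6.11}
\]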
If there exists a substochastic map $\Psi:\mathcal{A}_2\to\mathcal{A}_1$ such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$ then the inequalities in (6.11) hold with equality.
Proof. By Example 4.5, $S_\alpha(\Phi(A)\|\Phi(B))\le S_\alpha(A\|B)$ for every $\alpha\in[0,1)$, and equality holds for every $\alpha\in[0,1)$ if there exists a substochastic map $\Psi:\mathcal{A}_2\to\mathcal{A}_1$ such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$, due to Theorem 5.1. The assertion then follows immediately from the definitions (6.1) and (6.3).
Our goal now is to give the converse of the above proposition, i.e. to show that equality in the inequalities of (6.11) yields the existence of a substochastic map $\Psi:\mathcal{A}_2\to\mathcal{A}_1$ such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$. This would be immediate from Theorem 5.1 if the Chernoff and the Hoeffding distances could be represented as $f$-divergences (at least when $\Phi$ is also assumed to be 2-positive). However, no such representation is possible, as is shown in the following proposition:
Proposition 6.4. The Chernoff and the Hoeffding distances cannot be represented as $f$-divergences on the state space of any non-trivial finite-dimensional $C^*$-algebra.
Proof. Let $\mathcal{A}\subset\mathcal{B}(\mathcal{H})$ where $\dim\mathcal{H}\ge 2$, and let $e_1, e_2$ be orthonormal vectors in $\mathcal{H}$ such that $|e_j\rangle\langle e_j|\in\mathcal{A}$, $j = 1, 2$. Define $\rho := |e_1\rangle\langle e_1|$, $\sigma_p := p|e_1\rangle\langle e_1| + (1-p)|e_2\rangle\langle e_2|$, $p\in(0,1)$. One can easily check that $C(\rho\|\sigma_p) = H_r(\rho\|\sigma_p) = -\log p$ for every $r>0$, while $S_f(\rho\|\sigma_p) = pf(1/p) + (1-p)f(0)$ for any function $f$ on $[0,+\infty)$. Hence, if any of the above measures can be represented as an $f$-divergence, then we have $pf(1/p) + (1-p)f(0) = -\log p$ for the representing function $f$, and taking the limit $p\searrow 0$ yields $\omega(f) = +\infty$. In particular, $S_f(\sigma_p\|\rho) = +\infty$ for every $p\in(0,1)$. On the other hand, $C(\sigma_p\|\rho) = -\log p$ and $H_r(\sigma_p\|\rho) = 0$ if $r\ge-\log p$. That is, $C(\sigma_p\|\rho)$ is finite for every $p\in(0,1)$ and for every $r>0$ there exists a $p\in(0,1)$ such that $H_r(\sigma_p\|\rho)$ is finite.
Note, however, that for the applications of Theorems 4.3 and 5.1, it is sufficient to have a more general representability. Indeed, let $\mathcal{A}$ be a finite-dimensional $C^*$-algebra and $D:\mathcal{S}(\mathcal{A})\times\mathcal{S}(\mathcal{A})\to\mathbb{R}$. We say that $D$ is a monotone function of an $f$-divergence on the state space of $\mathcal{A}$ if there exists an operator convex function $f:[0,+\infty)\to\mathbb{R}$ and a strictly monotonic increasing function $g:\{S_f(\rho\|\sigma) : \rho,\sigma\in\mathcal{S}(\mathcal{A})\}\to\mathbb{R}\cup\{\pm\infty\}$ such that
\[
D(\rho\|\sigma) = g(S_f(\rho\|\sigma)), \qquad \rho,\sigma\in\mathcal{S}(\mathcal{A}).
\]
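Obviously, if $D$ is a monotone function of an $f$-divergence then it is monotonic non-increasing under stochastic maps due to Theorem 4.3. Moreover, if $D(\Phi(\rho)\|\Phi(\sigma)) = D(\rho\|\sigma)$ for some stochastic map $\Phi$ and $\rho,\sigma\in\mathcal{S}(\mathcal{A})$ such that $\operatorname{supp}\rho\le\operatorname{supp}\sigma$, and the representing function $f$ satisfies $|\operatorname{supp}\mu_f|\ge|\operatorname{spec}(L_\rho R_{\sigma^{-1}})\cup\operatorname{spec}(L_{\Phi(\rho)}R_{\Phi(\sigma)^{-1}})|$ then $\Phi^*_\sigma(\Phi(\rho)) = \rho$, due to Theorem 5.1(iv). For instance, the Rényi $\alpha$-relative entropy is a monotone function of the $\tilde f_\alpha$-divergence with $g(x) := \frac{1}{\alpha-1}\log(\operatorname{sgn}(\alpha-1)\,x)$, for every $\alpha\in[0,2]\setminus\{1\}$. However, the same argument as in Proposition 6.4 yields that none of the Rényi relative entropies with parameter $\alpha\in(0,1)$ can be represented as $f$-divergences.
Proposition 6.5. For any $r\in(0,+\infty)$ and any non-trivial $C^*$-algebra $\mathcal{A}$, the Hoeffding distance $H_r$ cannot be represented on the state space of $\mathcal{A}$ as a monotone function of an $f$-divergence with an operator convex function $f$ on $[0,+\infty)$ such that $|\operatorname{supp}\mu_f|\ge 6$.
Proof. Let $\mathcal{A}\subset\mathcal{B}(\mathcal{H})$ be a $C^*$-algebra and let $e_1, e_2$ be orthogonal vectors in $\mathcal{H}$ such that $|e_1\rangle\langle e_1|, |e_2\rangle\langle e_2|\in\mathcal{A}$. Choose $p, q\in(0,1)$ such that $p\ne q$ and $q\log\frac{q}{p} + (1-q)\log\frac{1-q}{1-p} < r$, and define $\rho := p|e_1\rangle\langle e_1| + (1-p)|e_2\rangle\langle e_2|$ and $\sigma := q|e_1\rangle\langle e_1| + (1-q)|e_2\rangle\langle e_2|$. Then $\psi(0|\rho\|\sigma) = 0$ and
\[
-\psi(0|\rho\|\sigma) - \psi'(0|\rho\|\sigma) = S(\sigma\|\rho) = q\log\frac{q}{p} + (1-q)\log\frac{1-q}{1-p} < r,
\]
and hence $H_r(\rho\|\sigma) = -\psi(0|\rho\|\sigma) = 0$. Define $\Phi:\mathcal{A}\to\mathcal{A}$, $\Phi(X) := (\operatorname{Tr}X)I/(\dim\mathcal{H})$. Then $\Phi$ is completely positive and trace-preserving, $\Phi(\rho) = \Phi(\sigma)$, and hence $H_r(\Phi(\rho)\|\Phi(\sigma)) = 0 = H_r(\rho\|\sigma)$. Note that $|\operatorname{spec}(L_\rho R_{\sigma^{-1}})|\le 5$ and $|\operatorname{spec}(L_{\Phi(\rho)}R_{\Phi(\sigma)^{-1}})| = 1$. If we had $H_r(\rho\|\sigma) = g(S_f(\rho\|\sigma))$ and $H_r(\Phi(\rho)\|\Phi(\sigma)) = g(S_f(\Phi(\rho)\|\Phi(\sigma)))$ for some strictly monotone $g$ and an operator convex $f$ on $[0,+\infty)$ such that $|\operatorname{supp}\mu_f|\ge 6$ then Theorem 5.1 would yield $\Phi^*_\sigma(\Phi(\rho)) = \rho$. However, $\Phi(\rho) = \Phi(\sigma)$ and hence $\Phi^*_\sigma(\Phi(\rho)) = \Phi^*_\sigma(\Phi(\sigma)) = \sigma\ne\rho$.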
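The vanishing of the Hoeffding distance for distinct states can be observed numerically. The sketch below assumes commuting full-rank states given as eigenvalue tuples, and uses the form $H_r = \sup_{0\le\alpha<1}\{(-\alpha r - \psi(\alpha))/(1-\alpha)\}$, which follows from $H_r = \tilde\psi^*(-r)$ after the substitution $s = \alpha/(1-\alpha)$. The supremum is approximated on a uniform grid, so the values are approximations; the function names are hypothetical.

```python
import math

def psi(alpha, rho, sigma):
    """log Tr rho^alpha sigma^(1-alpha) for commuting states (eigenvalue tuples)."""
    total = sum(r ** alpha * s ** (1.0 - alpha)
                for r, s in zip(rho, sigma) if r > 0 and s > 0)
    return math.log(total)

def relative_entropy(a, b):
    """S(a||b) = sum_i a_i (log a_i - log b_i), assuming supp a <= supp b."""
    return sum(x * (math.log(x) - math.log(y)) for x, y in zip(a, b) if x > 0)

def hoeffding(r, rho, sigma, grid=10000):
    """Grid approximation of sup over alpha in [0, 1) of (-alpha*r - psi(alpha)) / (1 - alpha)."""
    return max((-a * r - psi(a, rho, sigma)) / (1.0 - a)
               for a in (k / grid for k in range(grid)))

rho, sigma = (0.6, 0.4), (0.3, 0.7)        # distinct states, p = 0.6 and q = 0.3
threshold = relative_entropy(sigma, rho)    # S(sigma||rho), roughly 0.18
print(hoeffding(1.0, rho, sigma))           # r above the threshold: numerically zero
print(hoeffding(threshold / 2, rho, sigma)) # r below the threshold: strictly positive
```

With these illustrative numbers, any $r \ge S(\sigma\|\rho)$ gives a Hoeffding distance of zero although $\rho\ne\sigma$, which is the situation exploited in the proof.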
The above proposition also shows that the preservation of a Hoeffding distance of a pair $(\rho,\sigma)$ by a stochastic map for a given parameter $r$ might not be sufficient for the reversibility of $\Phi$ on $\{\rho,\sigma\}$ in the sense of Theorem 5.1; the reason for this in the above proof is that the Hoeffding distance might be equal to zero even for nonequal states. The Chernoff distance, on the other hand, is always strictly positive for unequal states; yet the following example shows that the preservation of the Chernoff distance is not sufficient for reversibility in general, either.
Example 6.6. Let $\mathcal{H} := \mathbb{C}^3$ and let $\mathcal{A}$ be the commutative $C^*$-algebra of operators on $\mathcal{H}$ that are diagonal in some fixed basis $e_1, e_2, e_3$. Let $\rho := (2/3)|e_1\rangle\langle e_1| + (1/3)|e_2\rangle\langle e_2|$, $\sigma := (1/6)|e_1\rangle\langle e_1| + (1/3)|e_2\rangle\langle e_2| + (1/2)|e_3\rangle\langle e_3|$, and define $\Phi:\mathcal{A}\to\mathcal{A}$ as
\[
\Phi(|e_1\rangle\langle e_1|) := \Phi(|e_2\rangle\langle e_2|) := |e_1\rangle\langle e_1|, \qquad \Phi(|e_3\rangle\langle e_3|) := |e_3\rangle\langle e_3|.
\]
Then $\Phi$ is completely positive and trace-preserving, and we have $\Phi(\rho) = |e_1\rangle\langle e_1|$, $\Phi(\sigma) = (1/2)|e_1\rangle\langle e_1| + (1/2)|e_3\rangle\langle e_3|$. For every $\alpha\in\mathbb{R}$, we have $\operatorname{Tr}\rho^\alpha\sigma^{1-\alpha} = \frac{2+4^\alpha}{6}$ and $\operatorname{Tr}\Phi(\rho)^\alpha\Phi(\sigma)^{1-\alpha} = 2^{\alpha-1}$, and hence
\[
C(\Phi(\rho)\|\Phi(\sigma)) = -\psi(0|\Phi(\rho)\|\Phi(\sigma)) = S_0(\Phi(\rho)\|\Phi(\sigma)) = \log 2 = S_0(\rho\|\sigma) = -\psi(0|\rho\|\sigma) = C(\rho\|\sigma).
\]
On the other hand, it is easy to see that $\Phi^*_\sigma(\Phi(\rho)) = (1/3)|e_1\rangle\langle e_1| + (2/3)|e_2\rangle\langle e_2|\ne\rho$, and therefore (x) of Theorem 5.1 does not hold, and hence $\Phi$ is not reversible on the pair $\{\rho,\sigma\}$.
Remark 6.7. Note that in the setting of Theorem 5.1, if $\Phi$ is 2-positive and $S_\alpha(\Phi(A)\|\Phi(B)) = S_\alpha(A\|B)$ for some $\alpha\in(0,1)$ then $\Phi^*_B(\Phi(A)) = A$, i.e. the preservation of a Rényi $\alpha$-relative entropy with some $\alpha\in(0,1)$ is sufficient for the reversibility of $\Phi$ on $\{A, B\}$. The above example shows that the same is not true for the 0-relative entropy.
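Since everything in Example 6.6 is diagonal, the computation can be replayed with exact rational arithmetic. The sketch below treats the states as probability vectors and the channel as a map on indices; in this commutative setting the map $\Phi^*_\sigma$ takes the form $x_i = \sigma_i\, y_{j(i)}/\Phi(\sigma)_{j(i)}$, where $j(i)$ is the output index of input $i$. The helper names are illustrative, and the Chernoff distance is approximated by a grid minimum.

```python
import math
from fractions import Fraction as F

rho = [F(2, 3), F(1, 3), F(0)]
sigma = [F(1, 6), F(1, 3), F(1, 2)]
target = [0, 0, 2]   # e1 -> e1, e2 -> e1, e3 -> e3

def apply_channel(x):
    """Push a diagonal state through the classical channel defined by `target`."""
    y = [F(0)] * 3
    for i, xi in enumerate(x):
        y[target[i]] += xi
    return y

def petz_recovery(y, ref):
    """Commutative form of Phi*_ref: x_i = ref_i * y_j / Phi(ref)_j with j = target[i]."""
    phi_ref = apply_channel(ref)
    return [ref[i] * y[target[i]] / phi_ref[target[i]]
            if phi_ref[target[i]] != 0 else F(0)
            for i in range(len(ref))]

def Q(alpha, a, b):
    """Tr a^alpha b^(1-alpha) on the common support, evaluated in floating point."""
    return sum(float(x) ** alpha * float(y) ** (1 - alpha)
               for x, y in zip(a, b) if x > 0 and y > 0)

def chernoff(a, b, grid=1000):
    """-log of the grid minimum of Q over alpha in [0, 1]."""
    return -math.log(min(Q(k / grid, a, b) for k in range(grid + 1)))

recovered = petz_recovery(apply_channel(rho), sigma)
print(recovered)                                   # (1/3, 2/3, 0), different from rho
print(chernoff(rho, sigma),
      chernoff(apply_channel(rho), apply_channel(sigma)))  # both close to log 2
```

The two Chernoff distances coincide while the recovered state differs from $\rho$, in line with the example.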
Corollary 6.8. Let $\mathcal{A}$ be a $C^*$-algebra which contains at least three orthogonal non-zero projections. Then the Chernoff distance cannot be represented on its state space as a monotone function of an $f$-divergence with an operator convex $f$ on $[0,+\infty)$ such that $|\operatorname{supp}\mu_f|\ge 6$.
Proof. Immediate from Example 6.6.
After the above preparation, we are ready to prove the analogue of Theorem 5.1 for the preservation of the Chernoff and the Hoeffding distances. The preservation of the Chernoff distance was already treated in the proof of [23, Theorem 6] in the case where both operators are invertible density operators and the substochastic map is the trace-preserving conditional expectation onto a subalgebra. We use essentially the same proof to treat the general case below.
Theorem 6.9. Let $A, B\in\mathcal{A}_{1,+}$ be such that $\operatorname{supp}A\le\operatorname{supp}B$, let $\Phi:\mathcal{A}_1\to\mathcal{A}_2$ be a substochastic map such that $\operatorname{Tr}\Phi(B) = \operatorname{Tr}B$, and assume that (i) or (ii) below holds:
(i) $C(\Phi(A)\|\Phi(B))\ne S_0(\Phi(A)\|\Phi(B))$, $C(\Phi(A)\|\Phi(B))\ne S_0(\Phi(B)\|\Phi(A))$, and $C(\Phi(A)\|\Phi(B)) = C(A\|B)$.
(ii) For some $r\in(-\psi(1|\Phi(A)\|\Phi(B)),\ -\psi(0|\Phi(A)\|\Phi(B)) - \psi'(0|\Phi(A)\|\Phi(B)))$,
\[
H_r(\Phi(A)\|\Phi(B)) = H_r(A\|B). \tag{6.12}
\]
Then $\Phi^*_B(\Phi(A)) = A$, and if $\Phi$ is 2-positive then there exists a stochastic map $\Psi:\mathcal{A}_2\to\mathcal{A}_1$ such that $\Psi(\Phi(A)) = A$ and $\Psi(\Phi(B)) = B$.
Proof. Assume first that (i) holds. Due to the assumptions $C(\Phi(A)\|\Phi(B))\ne S_0(\Phi(A)\|\Phi(B)) = -\psi(0|\Phi(A)\|\Phi(B))$, $C(\Phi(A)\|\Phi(B))\ne S_0(\Phi(B)\|\Phi(A)) = -\psi(1|\Phi(A)\|\Phi(B))$, and the Definition (6.1) of the Chernoff distance, there exists an $\alpha^*\in(0,1)$ such that $C(\Phi(A)\|\Phi(B)) = -\psi(\alpha^*|\Phi(A)\|\Phi(B))$. Using the monotonicity relation (4.15), we get
\[
C(\Phi(A)\|\Phi(B)) = -\log\operatorname{Tr}\Phi(A)^{\alpha^*}\Phi(B)^{1-\alpha^*}\le-\log\operatorname{Tr}A^{\alpha^*}B^{1-\alpha^*}\le C(A\|B) = C(\Phi(A)\|\Phi(B)).
\]
Hence, $\operatorname{Tr}\Phi(A)^{\alpha^*}\Phi(B)^{1-\alpha^*} = \operatorname{Tr}A^{\alpha^*}B^{1-\alpha^*}$, which yields $\Phi^*_B(\Phi(A)) = A$ due to Theorem 5.1(iv).
Assume next that (6.12) holds for some $r\in(-\psi(1|\Phi(A)\|\Phi(B)),\ -\psi(0|\Phi(A)\|\Phi(B)) - \psi'(0|\Phi(A)\|\Phi(B)))$. Then there exists an $s^*\in(0,+\infty)$ such that $H_r(\Phi(A)\|\Phi(B)) = -s^*r - \tilde\psi(s^*|\Phi(A)\|\Phi(B))$ (see Remark 6.2). Thus, $H_r(\Phi(A)\|\Phi(B)) = -\alpha^*r/(1-\alpha^*) + S_{\alpha^*}(\Phi(A)\|\Phi(B))$, where $\alpha^* := \frac{s^*}{1+s^*}\in(0,1)$. Using the monotonicity (4.19), we obtain
\[
H_r(\Phi(A)\|\Phi(B)) = -\frac{\alpha^*r}{1-\alpha^*} + S_{\alpha^*}(\Phi(A)\|\Phi(B))\le-\frac{\alpha^*r}{1-\alpha^*} + S_{\alpha^*}(A\|B)\le H_r(A\|B) = H_r(\Phi(A)\|\Phi(B)).
\]
Hence, $\operatorname{Tr}\Phi(A)^{\alpha^*}\Phi(B)^{1-\alpha^*} = \operatorname{Tr}A^{\alpha^*}B^{1-\alpha^*}$, which yields $\Phi^*_B(\Phi(A)) = A$ due to Theorem 5.1(iv).
Finally, if $\Phi$ is 2-positive then $\Phi^*_B(\Phi(A)) = A$ yields the existence of $\Psi$ in the last assertion the same way as in the proof of (x)⇒(i) in Theorem 5.1.
Corollary 6.10. Assume in the setting of Theorem 6.9 that $\operatorname{supp}A = \operatorname{supp}B$ and $\operatorname{Tr}A = \operatorname{Tr}B$. If $C(\Phi(A)\|\Phi(B)) = C(A\|B)$ then $\Phi^*_B(\Phi(A)) = A$.
Proof. Let $\psi(\alpha) := \psi(\alpha|\Phi(A)\|\Phi(B))$, $\alpha\in\mathbb{R}$. By the assumptions, we have $\operatorname{supp}\Phi(A) = \operatorname{supp}\Phi(B)$ and $\operatorname{Tr}\Phi(A) = \operatorname{Tr}\Phi(B)$, and hence $\psi(0) = \psi(1)$. Since $\psi$ is convex, there are two possibilities: either $\psi$ is constant, or the minimum of $\psi$ on $[0,1]$ is attained at some $\alpha^*\in(0,1)$. In the latter case we have $C(\Phi(A)\|\Phi(B))\ne S_0(\Phi(A)\|\Phi(B))$, $C(\Phi(A)\|\Phi(B))\ne S_0(\Phi(B)\|\Phi(A))$, and
hence the assertion follows due to Theorem 6.9. If $\psi$ is constant then we have
\[
\operatorname{Tr}\Phi(A)^\alpha\Phi(B)^{1-\alpha} = e^{\psi(\alpha)} = e^{\psi(1)} = \operatorname{Tr}\Phi(A) = (\operatorname{Tr}\Phi(A))^\alpha(\operatorname{Tr}\Phi(B))^{1-\alpha}
\]
for every $\alpha\in[0,1]$, and the equality case in Hölder's inequality yields that $\Phi(A)$ is a constant multiple of $\Phi(B)$ (see Corollary A.5). Since $\operatorname{Tr}\Phi(A) = \operatorname{Tr}\Phi(B)$, this yields that $\Phi(A) = \Phi(B)$. Similarly,
\[
-\min_{0\le\alpha\le1}\psi(\alpha|A\|B) = C(A\|B) = C(\Phi(A)\|\Phi(B)) = -\log\operatorname{Tr}\Phi(A) = -\log\operatorname{Tr}A = -\psi(0|A\|B),
\]
and since $\operatorname{Tr}A = \operatorname{Tr}B$, we also have $-\log\operatorname{Tr}A = -\log\operatorname{Tr}B = -\psi(1|A\|B)$. Hence, $\alpha\mapsto\psi(\alpha|A\|B)$ is constant on $[0,1]$, and the same argument as above yields that $A = B$. Therefore, $\Phi^*_B(\Phi(A)) = \Phi^*_B(\Phi(B)) = B = A$.
Remark 6.11. Note that the interval $(-\psi(1|\Phi(A)\|\Phi(B)),\ -\psi(0|\Phi(A)\|\Phi(B)) - \psi'(0|\Phi(A)\|\Phi(B)))$ in Theorem 6.9(ii) might be empty; this happens if and only if $\alpha\mapsto\psi(\alpha|\Phi(A)\|\Phi(B))$ is constant. A characterization of this situation was given in [22, Lemma 3.2].
7. Error Correction
Noise in quantum mechanics is usually modeled by completely positive trace non-increasing maps. The aim of error correction is, given a noise operation $\Phi$, to identify a subset $\mathcal{C}$ of the state space (called the code) and a quantum operation $\Psi$ such that it reverses the action of the noise on the code, i.e. $\Psi(\Phi(\rho)) = \rho$, $\rho\in\mathcal{C}$. It was first noticed in [43] that the preservation of certain distinguishability measures of two states by the noise operation is a sufficient condition for correctability of the noise on those two states. This result was later extended to general families of states in [25, 26]. The measures considered in these papers were the Rényi relative entropies and the standard relative entropy. Recently, the same problem was considered in [6] using the measures $T_p$ given in (6.8), and similar results were found, although only under some extra technical conditions. Below we summarize these results and extend them to a wide class of measures, based on Theorem 5.1.
Let $\mathcal{A}_i$ be a $C^*$-algebra on $\mathcal{H}_i$ for $i = 1, 2$, and let $\mathcal{S}(\mathcal{A}_i)$ denote the set of density operators in $\mathcal{A}_i$. For a non-empty set $\mathcal{C}\subset\mathcal{S}(\mathcal{A}_1)$, let $\operatorname{co}\mathcal{C}$ denote the closed convex hull of $\mathcal{C}$, and let $\operatorname{supp}\mathcal{C}$ be the supremum of the supports of all states in $\mathcal{C}$. Note that there exists a state $\sigma\in\operatorname{co}\mathcal{C}$ such that $\operatorname{supp}\sigma = \operatorname{supp}\mathcal{C}$. We introduce the notation d2 := $(\dim\mathcal{H}_1)^2 + (\dim\mathcal{H}_2)^2$.
Note that if $X\in\mathcal{A}_1$ and $\Phi:\mathcal{A}_1\to\mathcal{A}_2$ is a trace non-increasing positive map then
\[
\begin{aligned}
\|\Phi(X)\|_1 &= \max\{\operatorname{Tr}\Phi(X)S : S\in\mathcal{A}_2\ \text{self-adjoint},\ -I_2\le S\le I_2\}\\
&= \max\{\operatorname{Tr}X\Phi^*(S) : S\in\mathcal{A}_2\ \text{self-adjoint},\ -I_2\le S\le I_2\}\\
&\le\max\{\operatorname{Tr}XR : R\in\mathcal{A}_1\ \text{self-adjoint},\ -I_1\le R\le I_1\} = \|X\|_1,
\end{aligned}
\]
which in particular yields that the measures $T_p$ are monotonic non-increasing under substochastic maps.
Theorem 7.1. Let $\Phi:\mathcal{A}_1\to\mathcal{A}_2$ be a trace-preserving 2-positive map, and let $\mathcal{C}\subset\mathcal{S}(\mathcal{A}_1)$ be a non-empty set of states. The following are equivalent:
(i) There exists a stochastic map $\Psi:\mathcal{A}_2\to\mathcal{A}_1$ such that for every $\rho\in\operatorname{co}\mathcal{C}$,
\[
\Psi(\Phi(\rho)) = \rho. \tag{7.1}
\]
(7.2)
(iii) The equality (7.2) holds for every ρ ∈ C and for some σ ∈ S(A1 ) such that supp σ ≥ supp C, and some operator convex f on [0, +∞) such that |supp µf | ≥ d2 . (iv) Sϕt (Φ(ρ)Φ(σ)) = Sϕt (ρσ) for every ρ ∈ C and for some σ ∈ S(A1 ) such that supp σ ≥ supp C, and a set T of t’s such that |T | ≥ d2 . (v) For every ρ, σ ∈ co C and every r ∈ R, Hr (Φ(ρ)Φ(σ)) = Hr (ρσ).
(7.3)
(vi) The equality in (7.3) holds for every ρ ∈ C and for some σ ∈ S(A1 ) such that supp σ ≥ supp C, and for every r ∈ (0, δ) for some δ > 0. (vii) For every ρ ∈ co C and every σ ∈ co C such that supp σ = supp C, Φ∗σ (Φ(ρ)) = ρ.
(7.4)
(viii) The equality (7.4) holds for every ρ ∈ C and some σ ∈ S(A1 ). r (ix) There exist decompositions supp C = k=1 H1,k,L ⊗ H1,k,R and supp Φ(C) = r ˜ k on k=1 H2,k,L ⊗ H2,k,R , invertible density operators ωk on H1,k,R and ω H2,k,R , and unitaries Uk : H1,k,L → H2,k,L , k = 1, . . . , r, such that every ρ ∈ C can be written in the form ρ=
r
pk ρk,L ⊗ ωk
k=1
with some density operators ρk,L on H1,k,L and probability distribution {pk }rk=1 , and Φ(A ⊗ ωk ) = Uk AUk∗ ⊗ ω ˜k,
A ∈ B(H1,k,L ).
Moreover, if Φ is n-positive/completely positive then Ψ in (i) can also be chosen to be n-positive/completely positive. The implications (i)⇒(ii)⇒(iii)⇒(iv)⇔(viii) hold also if we only assume Φ to be substochastic. Furthermore, criterion (x) below is sufficient for (i)–(viii) to hold, and it is also necessary if Φ is completely positive.
August 23, J070-S0129055X11004412
2011 10:39 WSPC/S0129-055X
148-RMP
Quantum f -Divergences and Error Correction
731
(x) For every ρ ∈ C, every p ∈ (0, 1), every n ∈ N, and for some σ ∈ S(A1 ) such that supp σ ≥ supp C, Tp (Φ⊗n (ρ⊗n ) Φ⊗n (σ ⊗n )) = Tp (ρ⊗n σ ⊗n ).
(7.5)
Proof. The implications (i)⇒(ii)⇒(iii)⇒(iv)⇔(viii) follow immediately from Theorem 5.1 under the condition that $\Phi$ is substochastic (note that in the implication (iii)⇒(iv), $T$ can be chosen to be $\operatorname{supp}\mu_f$, and hence it is independent of the pair $(\rho,\sigma)$). If (viii) holds then
\[
\rho = \Phi^*_\sigma(\Phi(\rho)) = \sigma^{1/2}\,\Phi^*\!\left(\Phi(\sigma)^{-1/2}\Phi(\rho)\Phi(\sigma)^{-1/2}\right)\sigma^{1/2}
\]
implies that $\operatorname{supp}\rho\le\operatorname{supp}\sigma$ for every $\rho\in\operatorname{co}\mathcal{C}$, and hence $\Phi^*_\sigma$ can be completed to a map $\Psi$ as required in (i) the same way as in the proof of (x)⇒(i) in Theorem 5.1. This proves (viii)⇒(i).
Assume that (i) holds. Fixing any $\rho\in\operatorname{co}\mathcal{C}$ and $\sigma\in\operatorname{co}\mathcal{C}$ such that $\operatorname{supp}\sigma = \operatorname{supp}\mathcal{C}$, we have $\Psi(\Phi(\rho)) = \rho$ and $\Psi(\Phi(\sigma)) = \sigma$, and Theorem 5.1 yields (7.4) for this pair $(\rho,\sigma)$, proving (i)⇒(vii). The implication (vii)⇒(viii) is obvious.
The implication (i)⇒(v) follows by Proposition 6.3, and the implication (v)⇒(vi) is obvious. Assume now that (vi) holds. Then, by (6.6) and (6.7), we have $S(\Phi(A)\|\Phi(B)) = S(A\|B)$, i.e. the equality holds for the standard relative entropy, which is the $f$-divergence corresponding to $f(x) = x\log x$. Since the support of the representing measure for $x\log x$ is $(0,+\infty)$, this yields (iii). The implication (x)⇒(vi) follows from (6.10).
Assume that $\Phi$ is completely positive and (i) holds. Then we can assume $\Psi$ to be completely positive, and hence $\Phi^{\otimes n}$ and $\Psi^{\otimes n}$ are positive and trace-preserving for every $n\in\mathbb{N}$. Thus, by the monotonicity of the measures $T_p$,
\[
T_p(\rho^{\otimes n}\|\sigma^{\otimes n}) = T_p(\Psi^{\otimes n}(\Phi^{\otimes n}(\rho^{\otimes n}))\|\Psi^{\otimes n}(\Phi^{\otimes n}(\sigma^{\otimes n})))\le T_p(\Phi^{\otimes n}(\rho^{\otimes n})\|\Phi^{\otimes n}(\sigma^{\otimes n}))\le T_p(\rho^{\otimes n}\|\sigma^{\otimes n}),
\]
and hence (x) holds. Finally, (vii)⇒(ix) follows due to Lemma 3.11, and (ix)⇒(vii) is a matter of straightforward computation.
Briefly, the above theorem tells that if the noise does not decrease some suitable measure of the pairwise distinguishability on a set of states then its action can be reversed on that set with some other quantum operation; moreover, the reversion operation can be constructed by using the noise operation and any state with maximal support.
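A commutative toy case may help to see how the reversion map built from the noise and a full-support reference state acts on a code. The sketch below is only an illustration under strong assumptions: all states are probability vectors on four outcomes, the channel forgets the position inside each of two blocks, and the code consists of the states whose conditional distribution inside each block is fixed (a classical analogue of the block structure in condition (ix), with one-dimensional left factors). All names are hypothetical.

```python
from fractions import Fraction as F

# Fixed conditional distributions inside the two blocks {0, 1} and {2, 3}.
omega = [[F(1, 4), F(3, 4)], [F(2, 3), F(1, 3)]]
target = [0, 0, 1, 1]   # the channel keeps only the block label

def code_state(s):
    """Code state with weight s on the first block and 1 - s on the second."""
    return [s * omega[0][0], s * omega[0][1],
            (1 - s) * omega[1][0], (1 - s) * omega[1][1]]

def apply_channel(x):
    y = [F(0), F(0)]
    for i, xi in enumerate(x):
        y[target[i]] += xi
    return y

def petz_recovery(y, ref):
    """Commutative form of Phi*_ref: x_i = ref_i * y_j / Phi(ref)_j with j = target[i].
    Assumes Phi(ref) has full support."""
    phi_ref = apply_channel(ref)
    return [ref[i] * y[target[i]] / phi_ref[target[i]] for i in range(len(ref))]

sigma = code_state(F(1, 2))                 # reference state with maximal support
for s in (F(0), F(1, 5), F(1)):
    rho = code_state(s)
    assert petz_recovery(apply_channel(rho), sigma) == rho   # reversed exactly

outside = [F(1, 2), F(0), F(1, 2), F(0)]    # not of the code's block form
print(petz_recovery(apply_channel(outside), sigma) == outside)  # False
```

The recovery is exact on the code and fails for a state that does not have the prescribed block form, which is the qualitative content of the theorem in this simple setting.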
There are apparent differences between the conditions given above; indeed, (iii) tells that the preservation of one single $f$-divergence is sufficient, while (iv) requires the preservation of sufficiently (but finitely) many $f$-divergences, (v) requires the preservation of a continuum number of measures, and (x) requires even more. The equivalence between (iii) and (iv) is easy to understand; as we have seen in the proof of Theorem 5.1, as far as monotonicity and equality in the monotonicity are considered, any $f$-divergence with an operator convex $f$ which is not a polynomial is equivalent to the collection of $\varphi_t$-divergences with $t\in\operatorname{supp}\mu_f$, and the condition on the cardinality of $\operatorname{supp}\mu_f$ is imposed so that any function on the joint spectrum of the relative modular operators can be decomposed as a linear combination of $\varphi_t$'s, which in turn is used to construct the inversion map $\Phi^*_\sigma$. It is an open question how much the condition on the cardinality of $\operatorname{supp}\mu_f$ can be improved; cf. Remark 5.6. Note that (iii) tells in particular that the preservation of the pairwise Rényi relative entropies for one single parameter value $\alpha\in(0,2)$ is sufficient for reversibility. This is in contrast with (vi), where the preservation of continuum many Hoeffding distances is required, despite the symmetry suggested by (6.3) and (6.5). On the other hand, we have the following:
Proposition 7.2. In the setting of Theorem 7.1, assume that there exists a $\mathcal{C}_0\subset\mathcal{S}(\mathcal{A}_1)$ such that $\operatorname{co}\mathcal{C}_0 = \operatorname{co}\mathcal{C}$, and a $\sigma\in\mathcal{S}(\mathcal{A}_1)$ such that $\operatorname{supp}\sigma\ge\operatorname{supp}\mathcal{C}$, and the following hold:
\[
0 < m := \inf_{\rho\in\mathcal{C}_0}\{-\psi(0|\Phi(\rho)\|\Phi(\sigma)) - \psi'(0|\Phi(\rho)\|\Phi(\sigma))\}
\]
and for some $r\in(0,m)$,
\[
H_r(\Phi(\rho)\|\Phi(\sigma)) = H_r(\rho\|\sigma), \qquad \rho\in\mathcal{C}_0.
\]
Then $\Phi^*_\sigma(\Phi(\rho)) = \rho$ for every $\rho\in\operatorname{co}\mathcal{C}$.
Proof. Immediate from Theorem 6.9.
Finally, if all the states in $\mathcal{C}$ have the same support then some of the conditions in Theorem 7.1 and Proposition 7.2 can be simplified, and we can give a simple condition in terms of preservation of the Chernoff distance:
Proposition 7.3. Let $\Phi:\mathcal{A}_1\to\mathcal{A}_2$ be a trace-preserving 2-positive map and let $\mathcal{C}\subset\mathcal{S}(\mathcal{A}_1)$ be a non-empty set of states such that $\operatorname{supp}\rho = \operatorname{supp}\mathcal{C}$ for every $\rho\in\mathcal{C}$. Assume that there exists a $\sigma\in\mathcal{S}(\mathcal{A}_1)$ such that $\operatorname{supp}\sigma = \operatorname{supp}\mathcal{C}$ and one of the following holds:
(i) There exists a $p\in(0,1)$ such that
\[
T_p(\Phi^{\otimes n}(\rho^{\otimes n})\|\Phi^{\otimes n}(\sigma^{\otimes n})) = T_p(\rho^{\otimes n}\|\sigma^{\otimes n}), \qquad \rho\in\mathcal{C},\ n\in\mathbb{N}. \tag{7.6}
\]
(ii) For every $\rho\in\mathcal{C}$, $C(\Phi(\rho)\|\Phi(\sigma)) = C(\rho\|\sigma)$.
(iii) There exists a $\mathcal{C}_0$ such that $\operatorname{co}\mathcal{C}_0 = \operatorname{co}\mathcal{C}$ and an $r\in(0,\inf_{\rho\in\mathcal{C}_0}S(\Phi(\sigma)\|\Phi(\rho)))$ such that for every $\rho\in\mathcal{C}_0$,
\[
H_r(\Phi(\rho)\|\Phi(\sigma)) = H_r(\rho\|\sigma). \tag{7.7}
\]
Then
\[
\Phi^*_\sigma(\Phi(\rho)) = \rho, \qquad \rho\in\operatorname{co}\mathcal{C}. \tag{7.8}
\]
Proof. The implication (i)⇒(ii) is immediate from (6.9), and (ii) implies (7.8) due to Corollary 6.10. Assume now that (iii) holds. Since $\operatorname{supp}\rho = \operatorname{supp}\sigma$, $\rho\in\mathcal{C}_0$, we have $\psi(0|\Phi(\rho)\|\Phi(\sigma)) = 0$ and $-\psi'(0|\Phi(\rho)\|\Phi(\sigma)) = S(\Phi(\sigma)\|\Phi(\rho))$, $\rho\in\mathcal{C}_0$. Hence, (7.7) yields (7.8) due to Proposition 7.2.
Note that the conditions (7.5) and (7.6) are very different from the others, as they require the preservation of some measure for arbitrary tensor powers. These conditions could be simplified if the trace-norm distance could be represented as an $f$-divergence. Note that this is possible in the classical case; indeed, if $p$ and $q$ are probability density functions on some finite set $\mathcal{X}$, and $f(x) := |x-1|$, $x\in\mathbb{R}$, then
\[
S_f(p\|q) = \sum_{x\in\mathcal{X}}q(x)\,|p(x)/q(x) - 1| = \sum_{x\in\mathcal{X}}|p(x) - q(x)| = \|p - q\|_1.
\]
Note, however, that the above $f$ is not operator convex, and hence the proof given in Theorem 5.1 would not work for it. Even worse, the trace-norm distance cannot be represented as an $f$-divergence, as we show below by a simple argument.
Corollary 7.4. If the observable algebra of a quantum system is non-commutative then the trace-norm distance on its state space cannot be represented as an $f$-divergence.
Proof. Assume that $\mathcal{A}\subset\mathcal{B}(\mathcal{H})$ is non-commutative; then we can find orthonormal vectors $e_1, e_2\in\mathcal{H}$ such that $|e_i\rangle\langle e_j|\in\mathcal{A}$, $i, j = 1, 2$. (For simplicity, we neglect possible higher multiplicities; taking them into account would only result in a constant multiplication factor in the formulas below.) Assume that the trace-norm distance can be represented as an $f$-divergence. Then, for every $s\in[0,1]$ and $t\in(0,1)$, when $\rho := s|e_1\rangle\langle e_1| + (1-s)|e_2\rangle\langle e_2|$ and $\sigma := t|e_1\rangle\langle e_1| + (1-t)|e_2\rangle\langle e_2|$, we have
\[
tf(s/t) + (1-t)f((1-s)/(1-t)) = S_f(\rho\|\sigma) = \|\rho - \sigma\|_1 = 2|s-t|.
\]
Letting $s = t$ gives $f(1) = 0$. Letting $t\searrow 0$ gives $s\,\omega(f) + f(1-s) = 2s$ for all $s\in(0,1]$. This implies that $\omega(f)$ is finite and $\omega(f) + f(0) = 2$. Now let $\rho := |e_1\rangle\langle e_1|$ and $\sigma := |\psi\rangle\langle\psi|$, where $\psi := (e_1 + e_2)/\sqrt{2}$. Then $\|\rho - \sigma\|_1 = \sqrt{2}$, while by (2.6) one can easily compute
\[
S_f(\rho\|\sigma) = \frac{1}{2}f(1) + \frac{1}{2}\omega(f) + \frac{1}{2}f(0) = \frac{1}{2}(\omega(f) + f(0)) = 1.
\]
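The two trace-norm values used in this argument are easy to confirm numerically. The following sketch, with illustrative helper names, computes the trace norm of a real symmetric $2\times2$ matrix from its eigenvalues, and also checks the classical identity $\sum_x q(x)|p(x)/q(x)-1| = \|p-q\|_1$ on an arbitrary pair of distributions.

```python
import math

def trace_norm_sym2(m):
    """Trace norm of a real symmetric 2x2 matrix [[a, b], [b, d]] via its eigenvalues."""
    (a, b), (_, d) = m
    mean = (a + d) / 2
    rad = math.hypot((a - d) / 2, b)
    return abs(mean + rad) + abs(mean - rad)

def diff(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def classical_f_divergence(p, q):
    """S_f(p||q) for f(x) = |x - 1| and a strictly positive q."""
    return sum(qi * abs(pi / qi - 1) for pi, qi in zip(p, q))

rho = [[1.0, 0.0], [0.0, 0.0]]       # |e1><e1|
sigma = [[0.5, 0.5], [0.5, 0.5]]     # |psi><psi| with psi = (e1 + e2)/sqrt(2)
print(trace_norm_sym2(diff(rho, sigma)))   # close to sqrt(2), not 1

s, t = 0.7, 0.2                      # commuting case: diag(s, 1-s) versus diag(t, 1-t)
commuting = [[s - t, 0.0], [0.0, (1 - s) - (1 - t)]]
print(trace_norm_sym2(commuting))    # close to 2|s - t| = 1.0
```

The non-commuting pair gives $\sqrt2$, which is the value that contradicts the $f$-divergence value $1$ derived above.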
Remark 7.5. A similar argument as above can be used to show that for any $p\in(0,1)$, the measure $D_p(\rho\|\sigma) := 1 - \|p\rho - (1-p)\sigma\|_1$ cannot be represented as an $f$-divergence on the state space of any non-commutative finite-dimensional $C^*$-algebra.
Remark 7.6. In general, a function on pairs of classical probability distributions might have several different extensions to quantum states. A function that can be represented as an $f$-divergence has an extension given by the corresponding quantum $f$-divergence. It is not clear whether this extension has any operational significance in the case of $f(x) := |x-1|$.
While the impossibility to represent the trace-norm distance as an $f$-divergence shows that the approach followed in Theorem 7.1 cannot be used to simplify the condition in (x) of the theorem, other approaches might lead to better results. Indeed, the results of the recent paper [6] can be reformulated in the following way:
Theorem 7.7. Let $\mathcal{C}\subset\mathcal{S}(\mathcal{A}_1)$ be a convex set of states and let $\Phi:\mathcal{A}_1\to\mathcal{A}_2$ be a completely positive trace-preserving map such that
\[
T_p(\Phi(\rho)\|\Phi(\sigma)) = T_p(\rho\|\sigma), \qquad p\in(0,1).
\]
Then the fixed-point set of $\Phi^*_P\circ\Phi$ is a $C^*$-subalgebra of $P\mathcal{A}_1P$, where $P$ is the projection onto $\operatorname{supp}\mathcal{C}$, and the trace-preserving conditional expectation $\mathcal{P}$ from $P\mathcal{A}_1P$ onto $\ker(\mathrm{id} - \Phi^*_P\circ\Phi)$ is $T_p$-preserving for all $p\in(0,1)$. If, moreover, the restriction of $\mathcal{P}$ onto $\mathcal{C}$ is surjective onto the state space of $\ker(\mathrm{id} - \Phi^*_P\circ\Phi)$ then Theorem 7.1 (i)–(x) hold.
Note that the continuum many conditions requiring the preservation of $T_p$ for all $p\in(0,1)$ in Theorem 7.7 can be simplified to a single condition, requiring that $\Phi$ is trace-norm preserving on the real subspace generated by $\mathcal{C}$. Note also that the surjectivity condition is sufficient but obviously not necessary. It is, however, an open question whether it can be completely removed. In the approach followed in [6], it is important that one starts with a convex set of states. The same problem was studied in [23] in a different setting, and the following has been shown:
Theorem 7.8. Let $\rho,\sigma\in\mathcal{S}(\mathcal{A})$ be invertible density operators and $\Phi$ be the trace-preserving conditional expectation onto a subalgebra $\mathcal{A}_0$ of $\mathcal{A}$. Assume that $T_p(\Phi(\rho)\|\Phi(\sigma)) = T_p(\rho\|\sigma)$ for every $p\in(0,1)$, and $\mathcal{A}_0$ is commutative or $\rho$ and $\sigma$ commute. Then $\Phi^*_\sigma(\Phi(\rho)) = \rho$ and $\Phi^*_\rho(\Phi(\sigma)) = \sigma$.
Remark 7.9. In [23] the condition $T_p(\Phi(\rho)\|\Phi(\sigma)) = T_p(\rho\|\sigma)$, $p\in(0,1)$, was called 2-sufficiency, and
\[
T_p(\Phi^{\otimes n}(\rho^{\otimes n})\|\Phi^{\otimes n}(\sigma^{\otimes n})) = T_p(\rho^{\otimes n}\|\sigma^{\otimes n}), \qquad p\in(0,1),\ n\in\mathbb{N}, \tag{7.9}
\]
was called $(2,n)$-sufficiency. It was also shown in [23, Theorem 6] that in the setting of Theorem 7.8, (7.9) is sufficient for the conclusion of Theorem 7.8 to hold.
8. An Integral Representation for Operator Convex Functions
Operator monotone and operator convex functions play an important role in quantum information theory [45]. Several ways are known to decompose them as integrals of some families of functions of simpler forms [4, 19]. Here we present a representation that is well-suited for our analysis of $f$-divergences, and seems to be a new result.
Theorem 8.1. A continuous real-valued function $f$ on $[0,+\infty)$ is operator convex if and only if there exist a real number $a$, a non-negative number $b$, and a non-negative measure $\mu$ on $(0,+\infty)$, satisfying
\[
\int_{(0,+\infty)}\frac{d\mu(t)}{(1+t)^2} < +\infty, \tag{8.1}
\]
such that
\[
f(x) = f(0) + ax + bx^2 + \int_{(0,+\infty)}\left(\frac{x}{1+t} - \frac{x}{x+t}\right)d\mu(t), \qquad x\in[0,+\infty). \tag{8.2}
\]
Moreover, the numbers $a$, $b$, and the measure $\mu$ are uniquely determined by $f$, and
\[
b = \lim_{x\to+\infty}\frac{f(x)}{x^2}, \qquad a = f(1) - f(0) - b.
\]
Proof. Obviously, if $f$ admits an integral representation as in (8.2) then $f$ is operator convex, and
\[
f(1) = f(0) + a + b, \qquad b = \lim_{x\to+\infty}\frac{f(x)}{x^2},
\]
where the latter follows by the Lebesgue dominated convergence theorem, using (8.1) and that, for $x>1$,
\[
0\le\frac{1}{x^2}\left(\frac{x}{1+t} - \frac{x}{x+t}\right) = \frac{x-1}{x(x+t)(1+t)}\le\frac{2x}{x(1+t)(1+t)} = \frac{2}{(1+t)^2}.
\]
Hence what is left to prove is that any operator convex function admits a representation as in (8.2), and that the measure $\mu$ is uniquely determined by $f$.
Assume now that $f$ is an operator convex function on $[0,+\infty)$. Then, by Kraus' theorem (see [29] or [19, Corollary 2.7.8]), the function
\[
g(x) := \frac{f(x) - f(1)}{x-1}, \quad x\in[0,+\infty)\setminus\{1\}, \qquad g(1) := f'(1),
\]
is an operator monotone function on $(0,+\infty)$. Therefore, it admits an integral representation
\[
g(x) = a' + bx + \int_{(0,+\infty)}\frac{x(1+t)}{x+t}\,dm(t), \qquad x\in[0,+\infty), \tag{8.3}
\]
where $m$ is a positive finite measure on $(0,+\infty)$, and
\[
a' = g(0) = f(1) - f(0), \qquad 0\le b = \lim_{x\to+\infty}\frac{g(x)}{x} = \lim_{x\to+\infty}\frac{f(x)}{x^2}
\]
(see [19, Theorem 2.7.11] or [4, pp. 144–145]). Here, the measure $m$, as well as $a'$, $b$, are unique and $m((0,+\infty)) = g(1) - a' - b = f'(1) - f(1) + f(0) - b$. Thus, we have
\[
\begin{aligned}
f(x) &= f(1) + g(x)(x-1)\\
&= f(1) + (f(1) - f(0))(x-1) + bx(x-1) + \int_{(0,+\infty)}\frac{x(x-1)(1+t)}{x+t}\,dm(t)\\
&= f(0) + (f(1) - f(0) - b)x + bx^2 + \int_{(0,+\infty)}\left(\frac{x}{1+t} - \frac{x}{x+t}\right)(1+t)^2\,dm(t)\\
&= f(0) + ax + bx^2 + \int_{(0,+\infty)}\left(\frac{x}{1+t} - \frac{x}{x+t}\right)d\mu(t),
\end{aligned}
\]
where we have defined $a := f(1) - f(0) - b$ and $d\mu(t) := (1+t)^2\,dm(t)$. Finiteness of $m$ yields that $\mu$ satisfies (8.1).
Finally, to see the uniqueness of the measure $\mu$, assume that $f$ admits an integral representation as in (8.2). Then, $f$ is operator convex, and hence the function $g$ on $[0,+\infty)$, defined as
\[
g(x) := \frac{f(x) - f(1)}{x-1} = (a+b) + bx + \int_{(0,+\infty)}\frac{x(1+t)}{x+t}\,\frac{d\mu(t)}{(1+t)^2}, \tag{8.4}
\]
is operator monotone. Therefore, it admits an integral representation as in (8.3), and the uniqueness of the parameters of that representation yields that $d\mu(t) = (1+t)^2\,dm(t)$. Hence, the measure $\mu$ is uniquely determined by $f$.
Corollary 8.2. Assume that $f$ is a continuous operator convex function on $[0,+\infty)$ that is not a polynomial. Then it can be written in the form
\[
f(x) = f(0) + bx^2 + \int_{(0,+\infty)}\left(\psi(t)x - \frac{x}{x+t}\right)d\mu(t), \qquad x\in[0,+\infty), \tag{8.5}
\]
where $b = \lim_{x\to+\infty}f(x)/x^2\ge 0$, and $\mu$ is a non-negative measure on $(0,+\infty)$. Moreover, we can choose
\[
\psi(t) := \frac{1}{1+t} + \frac{f(1) - f(0) - b}{f'(1) - f(1) + f(0) - b}\cdot\frac{1}{(1+t)^2}, \tag{8.6}
\]
and if $b = 0$ and $f'(1)\ge 0$ then $\psi(t)\ge 0$, $t\in(0,+\infty)$.
Proof. Since $f$ is operator convex, it can be written in the form (8.2) due to Theorem 8.1. Since $f$ is not a polynomial, we have $m((0,+\infty)) > 0$, where
$dm(t) := d\mu(t)/(1+t)^2$. Moreover, by (8.4), $f'(1) = g(1) = a + 2b + m((0,+\infty))$, from which $m((0,+\infty)) = f'(1) - a - 2b$. Using that $a = f(1) - f(0) - b$, we finally obtain
\[
a = \frac{a}{m((0,+\infty))}\int_{(0,+\infty)} dm(t) = \frac{f(1)-f(0)-b}{f'(1)-f(1)+f(0)-b}\int_{(0,+\infty)} \frac{1}{(1+t)^2}\,d\mu(t).
\]
Substituting it into (8.2), we obtain (8.5) with $\psi$ as in (8.6). Note that
\[
(1+t)^2\psi(t) = 1 + t + \frac{a}{m((0,+\infty))} \ge 1 + \frac{a}{m((0,+\infty))}.
\]
Hence, if $b = 0$ and $0 \le f'(1) = a + 2b + m((0,+\infty)) = a + m((0,+\infty))$ then $\psi(t) \ge 0$, proving the last assertion.

Example 8.3.
(i) $f(x) := x\log x$ admits the integral representation
\[
x\log x = \int_{(0,+\infty)} \Bigl(\frac{x}{1+t} - \frac{x}{x+t}\Bigr)\,dt.
\]
($f(0) = a = b = 0$ and $\mu$ is the Lebesgue measure in (8.2).)

(ii) $f(x) := -x^\alpha$ ($0 < \alpha < 1$) admits the integral representation (see [4, Exercise V.1.10])
\[
-x^\alpha = -\frac{\sin\alpha\pi}{\pi}\int_{(0,+\infty)} \frac{x}{x+t}\,t^{\alpha-1}\,dt.
\]
($f(0) = b = 0$, $d\mu(t) = \frac{\sin\alpha\pi}{\pi}\,t^{\alpha-1}\,dt$, and $\psi \equiv 0$ in (8.5).) Using that $\frac{\sin\alpha\pi}{\pi}\int_{(0,+\infty)} \frac{x\,t^{\alpha-1}}{1+t}\,dt = x$, we have
\[
-x^\alpha = -x + \frac{\sin\alpha\pi}{\pi}\int_{(0,+\infty)} \Bigl(\frac{x}{1+t} - \frac{x}{x+t}\Bigr)\,t^{\alpha-1}\,dt.
\]
($f(0) = b = 0$, $a = -1$, and $d\mu(t) = \frac{\sin\alpha\pi}{\pi}\,t^{\alpha-1}\,dt$ in (8.2).)

(iii) By the previous point, $f(x) := x^\alpha$ ($1 < \alpha < 2$) admits the representation
\[
x^\alpha = \frac{\sin(\alpha-1)\pi}{\pi}\int_{(0,+\infty)} \frac{x^2}{x+t}\,t^{\alpha-2}\,dt
= \frac{\sin(\alpha-1)\pi}{\pi}\int_{(0,+\infty)} \Bigl(\frac{x}{t} - \frac{x}{x+t}\Bigr)\,t^{\alpha-1}\,dt.
\]
($f(0) = b = 0$, $\psi(t) = 1/t$, and $d\mu(t) = \frac{\sin(\alpha-1)\pi}{\pi}\,t^{\alpha-1}\,dt$ in (8.5).) Using that
\[
\frac{\sin(\alpha-1)\pi}{\pi}\int_{(0,+\infty)} \Bigl(\frac{x}{t} - \frac{x}{1+t}\Bigr)\,t^{\alpha-1}\,dt
= \frac{\sin(\alpha-1)\pi}{\pi}\int_{(0,+\infty)} \frac{x\,t^{\alpha-2}}{1+t}\,dt = x,
\]
we also obtain
\[
x^\alpha = x + \frac{\sin(\alpha-1)\pi}{\pi}\int_{(0,+\infty)} \Bigl(\frac{x}{1+t} - \frac{x}{x+t}\Bigr)\,t^{\alpha-1}\,dt.
\]
($f(0) = 0$, $a = 1$, $b = 0$ and $d\mu(t) = \frac{\sin(\alpha-1)\pi}{\pi}\,t^{\alpha-1}\,dt$ in (8.2).)
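The representation in Example 8.3(i) can be sanity-checked numerically. The following is a minimal sketch and not part of the original argument: the function name `xlogx_via_integral`, the substitution $t = u/(1-u)$ and the midpoint rule are illustrative choices made here. Under that substitution the integrand $\bigl(\frac{x}{1+t} - \frac{x}{x+t}\bigr)\,dt$ becomes $\frac{x(x-1)}{x + u(1-x)}\,du$ on $(0,1)$, which is smooth for every $x > 0$.

```python
import math


def xlogx_via_integral(x, n=20000):
    """Approximate the integral of Example 8.3(i) for a given x > 0.

    After the substitution t = u / (1 - u), the integral over (0, +inf) of
    (x/(1+t) - x/(x+t)) dt equals the integral over (0, 1) of
    x*(x-1) / (x + u*(1-x)) du, which is evaluated with the midpoint rule.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h  # midpoint of the k-th subinterval of (0, 1)
        total += x * (x - 1.0) / (x + u * (1.0 - x))
    return total * h


# Compare the quadrature value with x log x at a few sample points.
for x in (0.5, 1.0, 2.0, 5.0):
    print(x, xlogx_via_integral(x), x * math.log(x))
```

The two printed columns should agree up to the quadrature error, which shrinks as `n` grows.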
Note that the function $\psi$ in (8.5) is not unique. For instance, if $\mu$ is finitely supported on a set $\{t_1, \ldots, t_r\}$ then only the sum $\sum_{i=1}^r \psi(t_i)$ is determined by $f$ while the individual values $\psi(t_1), \ldots, \psi(t_r)$ are not.

Note also that in general, $\int_{(0,+\infty)} \frac{1}{1+t}\,d\mu(t)$ might not be finite and hence the term
\[
x\int_{(0,+\infty)} \frac{d\mu(t)}{1+t}
\]
cannot be merged with $ax$ in (8.2). Similarly, the integral $\int_{(0,+\infty)} \psi(t)\,d\mu(t)$ might be infinite and hence it might not be possible to separate it as a linear term in the representation (8.5) of $f$. This is clear, for instance, from (i) of Example 8.3. We have the following:

Proposition 8.4. For a continuous real-valued function $f$ on $[0,+\infty)$ the following are equivalent:
(i) $f$ is operator convex on $[0,+\infty)$ with $\lim_{x\to+\infty} f(x)/x < +\infty$;
(ii) there exist an $\alpha \in \mathbb{R}$ and a positive measure $\mu$ on $(0,+\infty)$, satisfying
\[
\int_{(0,+\infty)} \frac{d\mu(t)}{1+t} < +\infty, \tag{8.7}
\]
such that
\[
f(x) = f(0) + \alpha x - \int_{(0,+\infty)} \frac{x}{x+t}\,d\mu(t), \qquad x \in [0,+\infty). \tag{8.8}
\]
Proof. First, note that if $f$ is convex on $[0,+\infty)$ as a numerical function, then $\lim_{x\to+\infty} f(x)/x$ exists in $(-\infty,+\infty]$. In fact, by convexity, $(f(x)-f(1))/(x-1)$ is non-decreasing for $x > 1$, so that
\[
\lim_{x\to+\infty} \frac{f(x)}{x} = \lim_{x\to+\infty} \frac{f(x)-f(1)}{x-1}
\]
exists in $(-\infty,+\infty]$. Also, note that condition (8.7) is necessary for $f(1)$ to be defined in (8.8), and also sufficient to define $f(x)$ by (8.8) for all $x \in [0,+\infty)$.

(i) ⇒ (ii). By assumption, $f$ is an operator convex function on $[0,+\infty)$ such that $\lim_{x\to+\infty} f(x)/x$ is finite, hence $\lim_{x\to+\infty} f(x)/x^2 = 0$. By Theorem 8.1, we have
\[
f(x) = f(0) + ax + \int_{(0,+\infty)} \Bigl(\frac{x}{1+t} - \frac{x}{x+t}\Bigr)\,d\mu(t), \qquad x \in [0,+\infty),
\]
where $a \in \mathbb{R}$ and $\mu$ is a positive measure on $(0,+\infty)$. We write
\[
\frac{f(x)}{x} = \frac{f(0)}{x} + a + \int_{(0,+\infty)} \Bigl(\frac{1}{1+t} - \frac{1}{x+t}\Bigr)\,d\mu(t).
\]
Since
\[
0 < \frac{1}{1+t} - \frac{1}{x+t} \nearrow \frac{1}{1+t} \quad\text{as } 1 < x \nearrow +\infty,
\]
the monotone convergence theorem yields that
\[
\lim_{x\to+\infty} \frac{f(x)}{x} = a + \int_{(0,+\infty)} \frac{d\mu(t)}{1+t},
\]
which implies (8.7) and
\[
f(x) = f(0) + \Bigl(a + \int_{(0,+\infty)} \frac{d\mu(t)}{1+t}\Bigr)x - \int_{(0,+\infty)} \frac{x}{x+t}\,d\mu(t).
\]
Hence $f$ admits a representation of the form (8.8).

(ii) ⇒ (i). It is obvious that $f$ given in (8.8) is operator convex on $[0,+\infty)$. Since $\frac{1}{x+t} \le \frac{1}{1+t}$ for all $x > 1$ and all $t \in [0,+\infty)$, the Lebesgue convergence theorem yields that
\[
\lim_{x\to+\infty} \int_{(0,+\infty)} \frac{d\mu(t)}{x+t} = 0
\]
and so
\[
\frac{f(x)}{x} = \frac{f(0)}{x} + \alpha - \int_{(0,+\infty)} \frac{d\mu(t)}{x+t} \to \alpha \quad\text{as } x \to +\infty.
\]
Hence (i) follows.

Remark 8.5. Note that the condition $\lim_{x\to+\infty} f(x)/x < +\infty$ puts a strong restriction on an operator convex function $f$. Important examples for which it is not satisfied include $f(x) = x\log x$ and $f(x) = x^\alpha$ for $\alpha \in (1, 2]$.

9. Closing Remarks

Quantum f-divergences are a quantum generalization of classical f-divergences, a class that, in the classical case, contains most of the distinguishability measures that are relevant to classical statistics. Although our Corollary 7.4 shows that f-divergences are less universal in the quantum case, they still provide a very efficient tool to obtain monotonicity and convexity properties of several distinguishability measures that are relevant to quantum statistics, including the relative entropy, the Rényi relative entropies, and the Chernoff and Hoeffding distances.

There are also differences between the classical and the quantum cases in the technical conditions needed to prove the monotonicity. For the approach followed here, it is important that the defining function is not only convex but operator
convex, and the map is not only positive but it is also decomposable in the sense of Remark 4.8. It is unknown whether the monotonicity can be proved without these assumptions in general, although Corollary 3.4 and Lemma 3.5 show for instance that positivity of $\Phi$ might be sufficient in some special cases. For measures that have an operational interpretation in state discrimination, like the relative entropy, the Rényi α-relative entropies with $\alpha \in (0, 1)$, and the Chernoff and Hoeffding distances, the monotonicity holds for any positive trace-preserving map $\Phi$ such that $\Phi^{\otimes n}$ is positive for every $n \in \mathbb{N}$ [14, 35]. Note that both the set of maps satisfying this latter property and the set of maps that are decomposable in the sense of Remark 4.8 contain all the completely positive trace-preserving maps, but we are not aware of any other explicit relation between these two sets. Moreover, the only example we know for a map $\Phi$ which is not completely positive but for which $\Phi^{\otimes n}$ is positive for every $n \in \mathbb{N}$ is the transposition, which is trivial in the sense that it preserves any f-divergence (where $f$ does not even need to be convex; see Corollary 2.5 and Remark 2.6).

Quantum f-divergences are essentially a special case of Petz’ quasi-entropies with $K = I$ (see the Introduction) with the minor modification of allowing operators that are not strictly positive definite. While the monotonicity inequality in Theorem 4.3 can be proved for the quasi-entropies with general $K$ quite similarly to the case $K = I$, our analysis of the equality case in Theorem 5.1 does not seem to extend to $K \ne I$. A special case has been treated recently in [27], where a characterization for the equality case in the joint convexity of the quasi-entropies $S^K_{f_\alpha}(\cdot\,\|\,\cdot)$ (see Example 2.7 for $K = I$) was given for arbitrary $K$ and $\alpha \in (0, 2)$. Note that joint convexity is a special case of the monotonicity under partial traces (see [42, Theorem 6] or Corollary 4.7 of this paper), while monotonicity under partial traces can also be proven from the joint convexity for $K$’s of special type [30], which in turn implies the monotonicity under completely positive trace-preserving maps by using their Lindblad representation [51]. For a particularly elegant recent proof of the joint convexity for general $K$’s, see [11].

Various characterizations of the equality in the case $K = I$ have been given before for different types of maps and classes of functions, including the equality case for the strong subadditivity of entropy and the joint convexity of the Rényi relative entropies [18, 25, 27, 40, 43–46, 48, 49]. Our Theorem 5.1 extends all these results and it seems to be the most general characterization of the equality, at least in finite dimension. The relevant part from the point of view of application to quantum error correction is that the preservation of some suitable distinguishability measure yields the reversibility of the stochastic operation, and the reversal map can be constructed from the original one in a canonical way.
Acknowledgments

Partial funding was provided by the JSPS-HAS Japan–Hungary Joint Project, the Grant-in-Aid for Scientific Research (C)21540208 (FH), and the Hungarian Research Grant OTKA T068258 (M.M. and D.P.). The Centre for Quantum Technologies is funded by the Singapore Ministry of Education and the National Research Foundation as part of the Research Centres of Excellence program. Part of this work was done when M.M. was a Research Fellow at the Erwin Schrödinger Institute for Mathematical Physics in 2009 and later when the first three authors participated in the Quantum Information Theory program of the Mittag-Leffler Institute in 2010. Discussions with Fernando Brandao, Tomohiro Ogawa and David Reeb (M.M.) and with Hui Khoon Ng (C.B. and M.M.) helped to improve the paper and are gratefully acknowledged here. The authors are grateful to anonymous referees for their comments, especially for pointing out [23].

Appendix. Commuting Operators and the Operator Hölder Inequality

We will need the following two well-known lemmas in this section. The first one is a generalization of the so-called log-sum inequality, while the second one is a generalization of Jensen’s inequality for the expectation values of self-adjoint operators.

Lemma A.1. Let $f : [0,+\infty) \to \mathbb{R}$ be a convex function. Let $a_i \ge 0$, $b_i > 0$, $i = 1, \ldots, r$, and define $a := \sum_{i=1}^r a_i$, $b := \sum_{i=1}^r b_i$. Then,
\[
b f(a/b) \le \sum_{i=1}^r b_i f(a_i/b_i). \tag{A.1}
\]
Moreover, if $f$ is strictly convex, then equality holds if and only if $a_i/b_i$ is independent of $i$.

Proof. Convexity of $f$ yields that
\[
f(a/b) = f\Bigl(\sum_{i=1}^r \frac{b_i}{b}\,\frac{a_i}{b_i}\Bigr) \le \sum_{i=1}^r \frac{b_i}{b}\, f\Bigl(\frac{a_i}{b_i}\Bigr),
\]
which yields (A.1), and the characterization of equality is immediate from the strict convexity of $f$.

Lemma A.2. Let $A$ be a self-adjoint operator and $\rho$ be a density operator on a finite-dimensional Hilbert space $\mathcal{H}$. If $f$ is a convex function on the convex hull of $\mathrm{spec}(A)$ then
\[
f(\mathrm{Tr}\, A\rho) \le \mathrm{Tr}\, f(A)\rho. \tag{A.2}
\]
If $f$ is strictly convex then equality holds in (A.2) if and only if $\rho^0$ is a subprojection of a spectral projection of $A$.

Proof. Let $A = \sum_a a P_a$ be the spectral decomposition of $A$. Since $\{\mathrm{Tr}\, P_a\rho : a \in \mathrm{spec}(A)\}$ is a probability distribution on $\mathrm{spec}(A)$, Jensen’s inequality yields $f(\mathrm{Tr}\, A\rho) = f(\sum_a a\, \mathrm{Tr}\, P_a\rho) \le \sum_a f(a)\, \mathrm{Tr}\, P_a\rho$, and it is obvious that equality holds whenever $\mathrm{Tr}\, P_a\rho = 0$ for all but one $a \in \mathrm{spec}(A)$. On the other hand, if there are more than one $a \in \mathrm{spec}(A)$ such that $\mathrm{Tr}\, P_a\rho > 0$ then the above inequality is strict whenever $f$ is strictly convex.

Proposition A.3. Let $A, B \in \mathcal{A}_{1,+}$ be such that $A$ commutes with $B$ and let $\Phi : \mathcal{A}_1 \to \mathcal{A}_2$ be a substochastic map such that $\Phi(A)$ commutes with $\Phi(B)$ and $\mathrm{Tr}\,\Phi(B) = \mathrm{Tr}\, B$. For any convex function $f : [0,+\infty) \to \mathbb{R}$,
\[
S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B). \tag{A.3}
\]
If $\mathrm{supp}\, A \le \mathrm{supp}\, B$ and $f$ is strictly convex then equality holds in (A.3) if and only if $\Phi^*_B(\Phi(A)) = A$.

Proof. Let us consider first the inequality (A.3). Due to the continuity property given in Proposition 2.12, we can assume without loss of generality that $\mathrm{supp}\, A \le \mathrm{supp}\, B$. Since $A$ and $B$ commute, there exists a basis $\{e_x\}_{x\in\mathcal{X}}$ in $\mathrm{supp}\, B$ such that $A = \sum_{x\in\mathcal{X}} A(x)|e_x\rangle\langle e_x|$ and $B = \sum_{x\in\mathcal{X}} B(x)|e_x\rangle\langle e_x|$, where $A(x) := \langle e_x, Ae_x\rangle$, $B(x) := \langle e_x, Be_x\rangle$, $x \in \mathcal{X}$. Similarly, there exists a basis $\{f_y\}_{y\in\mathcal{Y}}$ in $\mathrm{supp}\,\Phi(B)$ such that $\Phi(A) = \sum_{y\in\mathcal{Y}} \Phi(A)(y)|f_y\rangle\langle f_y|$ and $\Phi(B) = \sum_{y\in\mathcal{Y}} \Phi(B)(y)|f_y\rangle\langle f_y|$, where $\Phi(A)(y) := \langle f_y, \Phi(A)f_y\rangle$, $\Phi(B)(y) := \langle f_y, \Phi(B)f_y\rangle$. We have
\[
S_f(A\|B) = \sum_x B(x) f\Bigl(\frac{A(x)}{B(x)}\Bigr), \qquad
S_f(\Phi(A)\|\Phi(B)) = \sum_y \Phi(B)(y) f\Bigl(\frac{\Phi(A)(y)}{\Phi(B)(y)}\Bigr).
\]
Let $T_{xy} := \langle f_y, \Phi(|e_x\rangle\langle e_x|)f_y\rangle$; then $\Phi(A)(y) = \sum_{x\in\mathcal{X}} T_{xy}A(x)$, $\Phi(B)(y) = \sum_{x\in\mathcal{X}} T_{xy}B(x)$, and Lemma A.1 yields
\[
\Phi(B)(y) f\Bigl(\frac{\Phi(A)(y)}{\Phi(B)(y)}\Bigr) \le \sum_x T_{xy}B(x) f\Bigl(\frac{T_{xy}A(x)}{T_{xy}B(x)}\Bigr). \tag{A.4}
\]
The assumption $\mathrm{supp}\, A \le \mathrm{supp}\, B$ guarantees that $\mathrm{Tr}\,\Phi(|e_x\rangle\langle e_x|) = 1$, $x \in \mathcal{X}$, due to Lemma 3.2, and hence $\sum_{y\in\mathcal{Y}} T_{xy} = 1$, $x \in \mathcal{X}$. Summing over $y$ in (A.4) yields (A.3).

Assume now that $\mathrm{supp}\, A \le \mathrm{supp}\, B$. Obviously, equality holds in (A.3) if and only if (A.4) holds with equality for every $y \in \mathcal{Y}$. Assuming that $f$ is strictly convex, we obtain, due to Lemma A.1, that for every $y \in \mathcal{Y}$ there exists a positive constant $c(y)$ such that $T_{xy}A(x) = c(y)T_{xy}B(x)$, i.e.
\[
A(x) = c(y)B(x) \tag{A.5}
\]
for every $x$ such that $T_{xy} > 0$. Assume that (A.5) holds; then we have $\Phi(A)(y) = \sum_x T_{xy}A(x) = \sum_x T_{xy}c(y)B(x) = c(y)\Phi(B)(y)$ and hence,
\[
\Phi^*_B(\Phi(A))(x) = B(x)\sum_y T_{xy}\frac{\Phi(A)(y)}{\Phi(B)(y)} = B(x)\sum_y T_{xy}\frac{A(x)}{B(x)} = A(x), \qquad x \in \mathcal{X}.
\]
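In the commuting case, Proposition A.3 reduces to a statement about non-negative vectors and a stochastic matrix, which is easy to test numerically. The sketch below is only an illustration under assumptions made here: the names `s_f` and `push_forward`, the choice $f(t) = t\log t$, and the sample data `A`, `B`, `T` are hypothetical and do not come from the paper. `T[x][y]` plays the role of $T_{xy}$, with each row summing to 1.

```python
import math


def f(t):
    """Convex test function f(t) = t log t on [0, +inf), with f(0) = 0."""
    return t * math.log(t) if t > 0 else 0.0


def s_f(a, b):
    """S_f(A||B) = sum_x B(x) f(A(x)/B(x)) for commuting A and B.

    `a` and `b` hold the diagonal entries A(x) and B(x); every B(x) is
    assumed to be strictly positive.
    """
    return sum(bx * f(ax / bx) for ax, bx in zip(a, b))


def push_forward(t, v):
    """Return the vector y -> sum_x T[x][y] * V(x) for a row-stochastic T."""
    return [sum(t[x][y] * v[x] for x in range(len(v)))
            for y in range(len(t[0]))]


# Illustrative data: three input points, two output points.
A = [0.2, 0.5, 0.3]
B = [0.4, 0.4, 0.2]
T = [[0.5, 0.5], [0.1, 0.9], [0.7, 0.3]]

lhs = s_f(push_forward(T, A), push_forward(T, B))  # S_f(Phi(A)||Phi(B))
rhs = s_f(A, B)                                    # S_f(A||B)
print(lhs <= rhs, lhs, rhs)
```

Taking `A` proportional to `B` makes both sides coincide, which matches the equality condition (A.5) with a constant $c(y)$ independent of $y$.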
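The following proposition gives an important special case where the monotonicity inequality (A.3) holds even though $A$ and $B$ do not commute and $f$ is only assumed to be convex.

Proposition A.4. Let $A, B \in \mathcal{A}_+$ be such that $B \ne 0$, let $B = \sum_{b\in\mathrm{spec}(B)} bQ_b$ be the spectral decomposition of $B$ and let $E_B : X \mapsto \sum_{b\in\mathrm{spec}(B)} Q_bXQ_b$ be the pinching defined by $B$. For every convex function $f : [0,+\infty) \to \mathbb{R}$,
\[
S_f(A\|B) \ge S_f(E_B(A)\|E_B(B)) = S_f(E_B(A)\|B) \ge (\mathrm{Tr}\, B)\, f\Bigl(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B}\Bigr). \tag{A.6}
\]
Moreover, if $\mathrm{supp}\, A \le \mathrm{supp}\, B$ and $f$ is strictly convex then the first inequality in (A.6) holds with equality if and only if $A$ commutes with $B$, and the second inequality holds with equality if and only if $E_B(A)$ is a constant multiple of $B$. In particular, $S_f(A\|B) = (\mathrm{Tr}\, B) f\bigl(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B}\bigr)$ if and only if $A$ is a constant multiple of $B$.

Proof. All the assertions are obvious when $A = 0$, so for the rest we assume $A \ne 0$. Assume first that $\mathrm{supp}\, A \le \mathrm{supp}\, B$. For every $b \in \mathrm{spec}(B)$ and $\lambda \in \mathbb{R}$, let $P^{(b)}_\lambda$ be the spectral projection of $Q_bAQ_b$ corresponding to the singleton $\{\lambda\}$, and let $\tilde P^{(b)}_\lambda := Q_bP^{(b)}_\lambda Q_b$. Note that $\tilde P^{(b)}_\lambda = P^{(b)}_\lambda$ for every $\lambda \ne 0$, and $Q_b = \sum_\lambda \tilde P^{(b)}_\lambda$. The spectral projection of $E_B(A)$ corresponding to the singleton $\{\lambda\}$ is $\sum_{b\in\mathrm{spec}(B)} \tilde P^{(b)}_\lambda$. For every $b \in \mathrm{spec}(B)\setminus\{0\}$ and $\lambda \in \mathbb{R}$, let $\rho_{b,\lambda}$ be a density operator such that $\rho_{b,\lambda} = \tilde P^{(b)}_\lambda/\mathrm{Tr}\,\tilde P^{(b)}_\lambda$ whenever $\tilde P^{(b)}_\lambda \ne 0$. By (2.6), we have
\begin{align*}
S_f(E_B(A)\|E_B(B)) = S_f(E_B(A)\|B)
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b f(\lambda/b)\, \mathrm{Tr}\Bigl(\sum_{b'\in\mathrm{spec}(B)} \tilde P^{(b')}_\lambda\Bigr) Q_b \\
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b f(\lambda/b)\, \mathrm{Tr}\,\tilde P^{(b)}_\lambda \\
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b f(\mathrm{Tr}((A/b)\rho_{b,\lambda}))\, \mathrm{Tr}\,\tilde P^{(b)}_\lambda \\
&\le \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b\, \mathrm{Tr}\, f(A/b)\rho_{b,\lambda}\, \mathrm{Tr}\,\tilde P^{(b)}_\lambda \tag{A.7} \\
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_\lambda b\, \mathrm{Tr}\, f(A/b)\tilde P^{(b)}_\lambda \\
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} b\, \mathrm{Tr}\, f(A/b)Q_b \\
&= \sum_{b\in\mathrm{spec}(B)\setminus\{0\}} \sum_{a\in\mathrm{spec}(A)} b f(a/b)\, \mathrm{Tr}\, P_aQ_b \\
&= S_f(A\|B),
\end{align*}
where $A = \sum_a aP_a$ is the spectral decomposition of $A$, and the inequality in (A.7) follows due to Lemma A.2. This yields the first inequality in (A.6). If $A$ commutes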
with $B$ then $E_B(A) = A$ and hence the first inequality in (A.6) holds with equality. Conversely, assume that the first inequality in (A.6) holds with equality; then the inequality in (A.7) has to hold with equality as well. If $f$ is strictly convex then this implies that for every $b \in \mathrm{spec}(B)\setminus\{0\}$ and $\lambda \in \mathbb{R}$, there exists an $a(b,\lambda)$ such that $\tilde P^{(b)}_\lambda \le P_{a(b,\lambda)}$, due to Lemma A.2. In particular, $\tilde P^{(b)}_\lambda$ commutes with $A$, and, since $Q_b = \sum_\lambda \tilde P^{(b)}_\lambda$, so does also $Q_b$, which finally implies that $B$ commutes with $A$.

Consider now the stochastic map $\Phi : \mathcal{A} \to \mathbb{C}$, $\Phi(X) := \mathrm{Tr}\, X$, $X \in \mathcal{A}$. Since $E_B(A)$ and $B$, as well as $\Phi(E_B(A)) = \mathrm{Tr}\, A$ and $\Phi(B) = \mathrm{Tr}\, B$, commute, the second inequality in (A.6) follows due to Proposition A.3, which also yields that this inequality holds with equality if and only if $E_B(A) = \Phi^*_B(\Phi(E_B(A))) = (\mathrm{Tr}\, A/\mathrm{Tr}\, B)B$.

Finally, consider the general case where $\mathrm{supp}\, A \le \mathrm{supp}\, B$ does not necessarily hold. For every $\varepsilon > 0$, let $B_\varepsilon := B + \varepsilon I$. Note that $\mathrm{supp}\, A \le \mathrm{supp}\, B_\varepsilon$ and $E_{B_\varepsilon} = E_B$ for every $\varepsilon > 0$, and hence by the above, $S_f(A\|B_\varepsilon) \ge S_f(E_B(A)\|B_\varepsilon) \ge (\mathrm{Tr}\, B_\varepsilon) f\bigl(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B_\varepsilon}\bigr)$ for every $\varepsilon > 0$. Taking the limit $\varepsilon \searrow 0$ then yields (A.6).

The first inequality above was proved for the case $f = f_\alpha$, $\alpha > 1$, in [14, Sec. 3.7], and we followed essentially the same proof here. It was also proved in [14, Sec. 3.7] that the monotonicity inequality (4.18) extends for the values $\alpha \in (2,+\infty)$ if $\Phi(A)$ and $\Phi(B)$ commute. We conjecture that this holds in more generality, namely that the monotonicity inequality $S_f(\Phi(A)\|\Phi(B)) \le S_f(A\|B)$ holds for every convex $f$ if $A$ and $B$ or $\Phi(A)$ and $\Phi(B)$ commute. The inequality $S_f(A\|B) \ge (\mathrm{Tr}\, B) f\bigl(\frac{\mathrm{Tr}\, A}{\mathrm{Tr}\, B}\bigr)$ was given in [42, Theorem 3] for the case where $A$ and $B$ are invertible density operators and $f$ is a non-linear operator convex function. Note that the inequality between the first and the last term in (A.6) is a non-commutative generalization of the generalized log-sum inequality (A.1).

Corollary A.5. For any positive semidefinite operators $A, B$ on a finite-dimensional Hilbert space $\mathcal{H}$, we have
\[
\mathrm{Tr}\, A^\alpha B^{1-\alpha} \le (\mathrm{Tr}\, A)^\alpha (\mathrm{Tr}\, B)^{1-\alpha}, \qquad \alpha \in [0, 1]. \tag{A.8}
\]
If, moreover, $\mathrm{supp}\, A \le \mathrm{supp}\, B$ then
\[
\mathrm{Tr}\, A^\alpha B^{1-\alpha} \ge (\mathrm{Tr}\, A)^\alpha (\mathrm{Tr}\, B)^{1-\alpha}, \qquad \alpha \in [1,+\infty). \tag{A.9}
\]
If $\mathrm{supp}\, A \le \mathrm{supp}\, B$ then $\mathrm{Tr}\, A^\alpha B^{1-\alpha} = (\mathrm{Tr}\, A)^\alpha (\mathrm{Tr}\, B)^{1-\alpha}$ for some $\alpha \in (0,+\infty)\setminus\{1\}$ if and only if $A$ is a constant multiple of $B$.

Proof. The assertions are trivial when $A$ or $B$ is equal to zero, and hence we assume that both of them are non-zero. The inequality in (A.8) is obvious when $\alpha = 0$ or $\alpha = 1$, and the inequality in (A.9) is obvious when $\alpha = 1$. For $\alpha \in (0,+\infty)\setminus\{1\}$, the inequalities in (A.8) and (A.9) follow immediately by applying Proposition A.4 to the functions $\tilde f_\alpha(x) := \mathrm{sgn}(\alpha - 1)x^\alpha$. Since these functions are strictly convex for every $\alpha \in (0,+\infty)\setminus\{1\}$, if equality holds in (A.8) or (A.9), and $\mathrm{supp}\, A \le \mathrm{supp}\, B$, then $A$ is a constant multiple of $B$, due to Proposition A.4.
Conversely, the inequalities (A.8) and (A.9) obviously hold with equality if $A$ is a constant multiple of $B$.

Let $\mathcal{H}$ be a finite-dimensional Hilbert space. For every $A \in \mathcal{B}(\mathcal{H})$ and $p \in \mathbb{R}\setminus\{0\}$, let
\[
\|A\|_p := \begin{cases} 0, & A = 0, \\ (\mathrm{Tr}\, |A|^p)^{1/p}, & A \ne 0, \end{cases}
\]
where $|A| := \sqrt{A^*A}$. For $p \in [1,+\infty)$, this is the well-known p-norm. Note that
\[
\|A^*\|_p = \|A\|_p = \||A|\|_p
\]
for every $A \in \mathcal{B}(\mathcal{H})$ and $p \in \mathbb{R}\setminus\{0\}$. Corollary A.5 yields the following inverse Hölder inequality:

Proposition A.6. Let $p \in (0, 1)$ and $q < 0$ be such that $1/p + 1/q = 1$. Let $A, B \in \mathcal{B}(\mathcal{H})$ for some finite-dimensional Hilbert space $\mathcal{H}$, and assume that $\mathrm{supp}\, |A| \le \mathrm{supp}\, |B^*|$. Then
\[
\|AB\|_1 \ge \|A\|_p \|B\|_q. \tag{A.10}
\]
Moreover, the equality case occurs in the above inequality if and only if $|A|^p$ and $|B^*|^q$ are proportional, i.e. $|A|^p = \alpha|B^*|^q$ for some $\alpha \ge 0$.

Proof. The assertion is obvious if $A$ or $B$ is zero, and hence we assume that both of them are non-zero. Let $A = U|A|$ and $B^* = V|B^*|$ be the polar decompositions with $U, V$ unitaries. Then $AB = U|A|\,|B^*|V^*$, and hence $\|AB\|_1 = \||A|\,|B^*|\|_1$. Let $\tilde A := |A|^p$, $\tilde B := |B^*|^q$ and $\alpha := 1/p$. Then $\alpha > 1$ and $\mathrm{supp}\, \tilde A \le \mathrm{supp}\, \tilde B$ by assumption, and hence
\[
\mathrm{Tr}\, |A|\,|B^*| = \mathrm{Tr}\, \tilde A^\alpha \tilde B^{1-\alpha} \ge (\mathrm{Tr}\, \tilde A)^\alpha (\mathrm{Tr}\, \tilde B)^{1-\alpha} = (\mathrm{Tr}\, |A|^p)^{1/p} (\mathrm{Tr}\, |B^*|^q)^{1/q} = \|A\|_p \|B\|_q,
\]
where the inequality follows due to Corollary A.5. It is well-known that $|\mathrm{Tr}\, X| \le \|X\|_1$ for every $X \in \mathcal{B}(\mathcal{H})$; indeed, if $X = \sum_i s_i|f_i\rangle\langle e_i|$ is a singular-value decomposition then $|\mathrm{Tr}\, X| = |\sum_i s_i\langle e_i, f_i\rangle| \le \sum_i s_i = \mathrm{Tr}\, |X| = \|X\|_1$. Hence, $\mathrm{Tr}\, |A|\,|B^*| \le \||A|\,|B^*|\|_1 = \|AB\|_1$, which completes the proof of the inequality (A.10). The characterization of the equality case is immediate from Corollary A.5.
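For commuting positive operators, (A.10) becomes a reverse Hölder inequality for vectors of positive numbers, which can be checked directly. The sketch below covers only this diagonal special case; the helper names `p_norm` and `reverse_hoelder_gap` and the sample vectors are assumptions introduced here for illustration and are not taken from the paper.

```python
def p_norm(v, p):
    """(sum_i |v_i|^p)^(1/p) for p != 0.

    For p < 0 every entry of v is assumed to be non-zero.
    """
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def reverse_hoelder_gap(a, b, p):
    """Return ||AB||_1 - ||A||_p ||B||_q for diagonal A = diag(a), B = diag(b).

    Requires 0 < p < 1; q is fixed by 1/p + 1/q = 1 and is then negative.
    """
    q = p / (p - 1.0)
    trace_norm = sum(abs(x * y) for x, y in zip(a, b))
    return trace_norm - p_norm(a, p) * p_norm(b, q)


# Generic positive data: the gap is expected to be strictly positive.
print(reverse_hoelder_gap([1.0, 2.0, 3.0], [2.0, 1.0, 4.0], 0.5))

# Equality case |A|^p = alpha |B*|^q: with p = 1/2 and q = -1 this reads
# a_i^(1/2) = alpha / b_i, here with alpha = 1.
print(reverse_hoelder_gap([1.0, 0.25, 0.0625], [1.0, 2.0, 4.0], 0.5))
```

The second call illustrates the equality condition of Proposition A.6: the gap vanishes up to rounding when $|A|^p$ and $|B^*|^q$ are proportional.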
Remark A.7. Our interest in the inverse operator Hölder inequality was motivated by [16]. The inequality was proved in [17] for positive semidefinite operators, using the usual Hölder inequality. An alternative direct proof for the general case and the condition for the equality was obtained in [20], based on majorization theory [4, 19].
References

[1] S. M. Ali and S. D. Silvey, A general class of coefficients of divergence of one distribution from another, J. Roy. Statist. Soc. Ser. B 28 (1966) 131–142.
[2] H. Araki, On an inequality of Lieb and Thirring, Lett. Math. Phys. 19 (1990) 167–170.
[3] K. M. R. Audenaert, J. Calsamiglia, Ll. Masanes, R. Munoz-Tapia, A. Acin, E. Bagan and F. Verstraete, Discriminating states: The quantum Chernoff bound, Phys. Rev. Lett. 98 (2007) 160501.
[4] R. Bhatia, Matrix Analysis (Springer, 1997).
[5] R. Bhatia, Positive Definite Matrices (Princeton University Press, 2007).
[6] R. Blume-Kohout, H. K. Ng, D. Poulin and L. Viola, Information preserving structures: A general framework for quantum zero-error information, Phys. Rev. A 82 (2010) 062306.
[7] T. Cover and J. A. Thomas, Elements of Information Theory (Wiley-Interscience, 1991).
[8] I. Csiszár, Information type measure of difference of probability distributions and indirect observations, Stud. Sci. Math. Hung. 2 (1967) 299–318.
[9] I. Csiszár, Generalized cutoff rates and Rényi’s information measures, IEEE Trans. Inform. Theory 41 (1995) 26–34.
[10] A. Datta, A condition for the nullity of quantum discord, arXiv:1003.5256.
[11] E. G. Effros, A matrix convexity approach to some celebrated quantum inequalities, Proc. Natl. Acad. Sci. 106 (2009) 1006–1008.
[12] I. Ekeland and R. Temam, Convex Analysis and Variational Problems (North-Holland, American Elsevier, 1976).
[13] F. Hansen and G. K. Pedersen, Jensen’s inequality for operators and Löwner’s theorem, Math. Ann. 258 (1982) 229–241.
[14] M. Hayashi, Quantum Information: An Introduction (Springer, 2006).
[15] M. Hayashi, Error exponent in asymmetric quantum hypothesis testing and its application to classical-quantum channel coding, Phys. Rev. A 76 (2007) 062301.
[16] M. Hayashi, private communication.
[17] M. Hayashi, Symmetry and Quantum Information (Iwanami-Shoten, in press); in Japanese.
[18] P. Hayden, R. Jozsa, D. Petz and A. Winter, Structure of states which satisfy strong subadditivity of quantum entropy with equality, Comm. Math. Phys. 246 (2004) 359–374.
[19] F. Hiai, Matrix analysis: Matrix monotone functions, matrix means, and majorization (GSIS selected lectures), Interdiscip. Inform. Sci. 16 (2010) 139–248.
[20] F. Hiai, unpublished.
[21] F. Hiai and D. Petz, The proper formula for relative entropy and its asymptotics in quantum probability, Comm. Math. Phys. 143 (1991) 99–114.
[22] F. Hiai, M. Mosonyi and T. Ogawa, Error exponents in hypothesis testing for correlated states on a spin chain, J. Math. Phys. 49 (2008) 032112.
[23] A. Jenčová, Quantum hypothesis testing and sufficient subalgebras, Lett. Math. Phys. 93 (2010) 15–27.
[24] A. Jenčová, Reversibility conditions for quantum operations, arXiv:1107.0453.
[25] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference, Comm. Math. Phys. 263 (2006) 259–276.
[26] A. Jenčová and D. Petz, Sufficiency in quantum statistical inference. A survey with examples, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 9 (2006) 331–351.
[27] A. Jenčová and M. B. Ruskai, A unified treatment of convexity of relative entropy and related trace functions, with conditions for equality, Rev. Math. Phys. 22 (2010) 1099–1121.
[28] A. Jenčová, D. Petz and J. Pitrik, Markov triplets on CCR algebras, Acta Sci. Math. (Szeged) 76 (2010) 27–50.
[29] F. Kraus, Über konvexe matrixfunktionen, Math. Z. 41 (1936) 18–42.
[30] A. Lesniewski and M. B. Ruskai, Monotone Riemannian metrics and relative entropy on non-commutative probability spaces, J. Math. Phys. 40 (1999) 5702–5724.
[31] F. Liese and I. Vajda, Convex Statistical Distances (B. G. Teubner Verlagsgesellschaft, Leipzig, 1987).
[32] F. Liese and I. Vajda, On divergences and informations in statistics and information theory, IEEE Trans. Inform. Theory 52 (2006) 4394–4412.
[33] M. Mosonyi, Entropy, information and structure of composite quantum states, Ph.D. thesis, Catholic University of Leuven (2005), https://repository.cc.kuleuven.be/dspace/handle/1979/41.
[34] M. Mosonyi and D. Petz, Structure of sufficient quantum coarse grainings, Lett. Math. Phys. 68 (2004) 19–30.
[35] M. Mosonyi and F. Hiai, On the quantum Renyi relative entropies and related capacity formulas, IEEE Trans. Inform. Theory 57 (2011) 2474–2487.
[36] H. Nagaoka, The converse part of the theorem for quantum Hoeffding bound, preprint; quant-ph/0611289.
[37] M. Nussbaum and A. Szkola, A lower bound of Chernoff type for symmetric quantum hypothesis testing, Ann. Statist. 37 (2009) 1040–1057.
[38] T. Ogawa and H. Nagaoka, Strong converse and Stein’s lemma in quantum hypothesis testing, IEEE Trans. Inform. Theory 47 (2000) 2428–2433.
[39] T. Ogawa, Perfect quantum error-correcting condition revisited (2005); arXiv:quant-ph/0505167.
[40] M. Ohya and D. Petz, Quantum Entropy and Its Use, 2nd edn. (Springer-Verlag, Heidelberg, 2004).
[41] D. Petz, Quasi-entropies for states of a von Neumann algebra, Publ. RIMS. Kyoto Univ. 21 (1985) 781–800.
[42] D. Petz, Quasi-entropies for finite quantum systems, Rep. Math. Phys. 23 (1986) 57–65.
[43] D. Petz, Sufficiency of channels over von Neumann algebras, Quart. J. Math. Oxford Ser. (2) 39(153) (1988) 97–108.
[44] D. Petz, Monotonicity of quantum relative entropy revisited, Rev. Math. Phys. 15 (2003) 79–91.
[45] D. Petz, Quantum Information Theory and Quantum Statistics (Springer, 2008).
[46] D. Petz, From f-divergence to quantum quasi-entropies and their use, Entropy 12 (2010) 304–325.
[47] A. Rényi, On measures of entropy and information, in Proc. 4th Berkeley Symp. on Math. Statist. Probability, Vol. 4, Berkeley, CA, USA (1961), pp. 547–561.
[48] M. B. Ruskai, Inequalities for quantum entropy: A review with conditions for equality, J. Math. Phys. 43 (2002) 4358–4375.
[49] N. Sharma, On the quantum f-relative entropy and generalized data processing inequalities (2009); arXiv:0906.4755.
[50] M. Takesaki, Conditional expectations in von Neumann algebras, J. Funct. Anal. 9 (1972) 306–321.
[51] M. Tomamichel, R. Colbeck and R. Renner, A fully quantum asymptotic equipartition property, IEEE Trans. Inform. Theory 55 (2009) 5840–5847.
[52] J. Tomiyama, On the geometry of positive maps in matrix algebras. II, Linear Algebra Appl. 69 (1985) 169–177.
[53] A. Uhlmann, The “transition probability” in the state space of a ∗-algebra, Rep. Math. Phys. 9 (1976) 273–279.
August 12, 2011 10:27 WSPC/S0129-055X S0129055X11004424
148-RMP
J070-
Reviews in Mathematical Physics Vol. 23, No. 7 (2011) 749–822 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004424
SELF-DUAL CONE ANALYSIS IN CONDENSED MATTER PHYSICS
TADAHIRO MIYAO Institute for Fundamental Sciences, Setsunan University, Ikeda-Naka-Machi 17-8, Neyagawa 572-8508, Japan
[email protected]

Received 10 August 2010
Revised 31 May 2011

The self-dual cone, the central object of this review, is introduced. Several operator inequalities associated with the self-dual cone are defined and their mathematical properties are investigated. In general there are infinitely many choices of self-dual cones in a Hilbert space. Each of these leads to a distinct family of operator inequalities in the Hilbert space, which enables us to analyze quantum physical models with respect to several aspects. We refer to these applications as self-dual cone analysis. The focus of this review lies on the self-dual cone analysis of models in condensed matter physics. In particular, by taking a physically proper self-dual cone, the interaction term of the Hamiltonian of the system becomes attractive from the viewpoint of our new operator inequalities. This attractive term enables us to analyze the system and various aspects of physical interest in detail. For instance, if the attractive term is ergodic, it is shown that the ground state is unique. By the uniqueness and the conservation laws, the physically symmetric state is realized as the ground state. This could be regarded as a physical order. As applications, the BCS model and the one-dimensional Fröhlich model are studied. We explain, from the viewpoint of the self-dual cone analysis, the appearance of macroscopic phase angles in superconductors, the Josephson effect and the Peierls instability.

Keywords: Self-dual cone; operator inequality; condensed matter physics; BCS theory; Peierls instability.

Mathematical Subject Classification 2010: 46N50, 47A63, 81Q10, 82C99
Contents 1. Introduction 1.1. Self-dual cone analysis . . . . 1.2. A new viewpoint of attraction 1.3. A typical way of application . 1.4. Motivations . . . . . . . . . . 1.5. Structure of the paper . . . .
. . . . .
. . . . .
749
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
752 752 753 753 754 755
August 12, 2011 10:27 WSPC/S0129-055X J070-S0129055X11004424
750
148-RMP
T. Miyao
2. Mathematical Basis for the Self-Dual Cone Analysis 2.1. Self-dual cones and related operator inequalities . 2.2. Partial orders . . . . . . . . . . . . . . . . . . . . 2.3. Basic tools . . . . . . . . . . . . . . . . . . . . . 2.4. Traces and self-dual cones . . . . . . . . . . . . . 2.5. Operator monotonicity . . . . . . . . . . . . . . . 2.6. Beurling–Deny criterion . . . . . . . . . . . . . . 2.7. Perron–Frobenius–Faris theorem . . . . . . . . . 2.8. Direct sums of self-dual cones . . . . . . . . . . . 2.9. Tensor products of self-dual cones . . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
3. Fundamental Theorems of Self-Dual Cone Analysis 4. Canonical Self-Dual Cone in L (h) 4.1. Hilbert–Schmidt operators . . . . . . . . . . . . 4.1.1. Algebraic structures . . . . . . . . . . . 4.1.2. A natural identification . . . . . . . . . 4.2. Definition of canonical self-dual cone in L2 (h) . 4.3. The Lieb cone . . . . . . . . . . . . . . . . . . . 4.4. Reflection positivity and positivity with respect 4.4.1. Reflection positivity . . . . . . . . . . . 4.4.2. General properties . . . . . . . . . . . .
755 755 757 758 758 759 760 760 761 761 762
2
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
765 765 765 767 768 769 770 770 770
inequality . . . . . . . . . . . . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
771 771 771 773 774 776
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . to L2 (h)+ . . . . . . . . . . . .
5. Self-Dual Cone Analysis in L2 (h) 5.1. Derivation of the attractive interaction by the DLS 5.1.1. Structures of general interactions . . . . . . 5.1.2. Dyson–Lieb–Simon inequality . . . . . . . . 5.2. Analysis of Hamiltonian with attractive interaction 5.3. Representation in h ⊗ h . . . . . . . . . . . . . . . 6. Self-Dual Cone Analysis in Hilbert Spaces with Tensor Product Structure 6.1. General settings . . . . . . . . . . . . . . . . . . . . 6.2. Choice of a natural self-dual cone . . . . . . . . . . 6.3. Derivation of attractive interaction . . . . . . . . . 6.4. Analysis of Hamiltonian with attractive interaction
. . . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
777 777 778 779 781
7. The BCS Hamiltonian and Macroscopic Quantum Effects 7.1. Background . . . . . . . . . . . . . . . . . . . . . . . 7.2. A warmup: The BCS Hamiltonian . . . . . . . . . . 7.2.1. Cooper-pair space . . . . . . . . . . . . . . . 7.2.2. Uniqueness of the ground state . . . . . . . . 7.3. Appearance of a macroscopic phase angle . . . . . . 7.4. Josephson effect . . . . . . . . . . . . . . . . . . . . . 7.5. A remark on the Josephson effect . . . . . . . . . . . 7.5.1. The Duhamel expectations . . . . . . . . . . 7.5.2. Free energy equation . . . . . . . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
. . . . . . . . .
782 782 783 783 784 784 785 786 786 787
August 12, 2011 10:27 WSPC/S0129-055X S0129055X11004424
148-RMP
J070-
Self-Dual Cone Analysis in Condensed Matter Physics
  7.6. Proof of Theorem 7.1
    7.6.1. Representation on L^2(F)
    7.6.2. Structures of H(s)
    7.6.3. Proof of Theorem 7.1
  7.7. Proof of Theorem 7.3
    7.7.1. A natural self-dual cone in CN
    7.7.2. Proof of Theorem 7.3
  7.8. Proof of Theorem 7.5
    7.8.1. A natural self-dual cone in C
    7.8.2. General properties of G(θ)
    7.8.3. Proof of Theorem 7.5
  7.9. Proof of Theorem 7.7
    7.9.1. Ergodicity of G([0])
    7.9.2. Proof of Theorem 7.7
  7.10. Proof of Theorem 7.9
    7.10.1. A natural self-dual cone in CL ⊗ CR
    7.10.2. Structures of interactions
    7.10.3. Proof of Theorem 7.9
  7.11. Proof of Theorem 7.11
    7.11.1. Ergodicity of Vg
    7.11.2. Proof of Theorem 7.11
  7.12. Proof of Theorem 7.12
    7.12.1. Proof of (i)
    7.12.2. Proof of (ii)
8. Peierls Instabilities
  8.1. Background
  8.2. Peierls instability I
  8.3. Peierls instability II
  8.4. The most realizable Hilbert space
  8.5. Periodic structure in the unique ground state
    8.5.1. Elimination of marginal phonons
    8.5.2. Hamiltonian at a fixed total momentum
  8.6. Proof of Theorem 8.1
    8.6.1. Factorization properties
    8.6.2. A natural identification between FL and FR
    8.6.3. Representation on L^2(FL)
    8.6.4. A canonical self-dual cone in the bosonic Fock space
    8.6.5. Proof of Theorem 8.1 and the Fermi surface nesting
  8.7. Proof of Theorem 8.3
  8.8. Proof of Theorem 8.5
    8.8.1. Structure of H|Λ|/2 (s)
    8.8.2. Proof of Theorem 8.5
August 12, J070-S0129055X11004424
752
2011 10:27 WSPC/S0129-055X
148-RMP
T. Miyao
  8.9. Proof of Theorem 8.7
  8.10. Proofs of Proposition 8.8 and Theorem 8.10
    8.10.1. Expression of Ptot on H(Q) and proof of Proposition 8.8
    8.10.2. Structures of H(Q; P)
    8.10.3. A natural self-dual cone in H(Q; P)
    8.10.4. Uniqueness of a ground state of KP
    8.10.5. Proof of Theorem 8.10
1. Introduction

1.1. Self-dual cone analysis

Let p be a convex cone in the Hilbert space H. The cone p is called a self-dual cone if it satisfies

p = {x ∈ H | ⟨x, y⟩ ≥ 0 ∀ y ∈ p}. (1.1)

Once we fix a self-dual cone p, we can define several operator inequalities associated with p. For instance, if two bounded operators A and B satisfy (A − B)p ⊆ p, then we denote this by A ⊵ B with respect to p. The binary relation "⊵" is a partial order, as we will show, so that we can really regard it as an operator inequality. Moreover, we can define not only the inequality "⊵" but also several other operator inequalities related to p. Precise definitions and properties of these inequalities will be discussed in Sec. 2. In general, there are infinitely many self-dual cones in the Hilbert space. Corresponding to each self-dual cone, we can define operator inequalities with respect to it. The proper choice of a self-dual cone is an essential step for our analysis. These inequalities are completely different from the standard quadratic form inequality A ≥ B (which means ⟨x, Ax⟩ ≥ ⟨x, Bx⟩ for all x ∈ H). In particular, the inequality "⊵" is closed under the product: if A ⊵ 0 and B ⊵ 0 with respect to p, then AB ⊵ 0 with respect to p. This special nature will be used repeatedly in applications. We expect that these differences provide novel viewpoints and results both mathematically and physically.

The term "self-dual cone analysis" is a generic name for analysis related to self-dual cones and for applications of operator inequalities with respect to self-dual cones. The self-dual cone analysis mainly originates from the Perron–Frobenius theorem [11, 45] in linear algebra and from reflection positivity in quantum field theory [43]. Both theories are already classical; here the word "classical" carries an affirmative nuance, namely that both are widespread in mathematics and physics. The Perron–Frobenius theorem has been applied in physics to show the uniqueness of the ground state and has become a standard method [3, 9, 10, 14, 15, 21, 22, 24, 34, 36, 37, 40, 41, 49–51]. Similarly, reflection positivity has an important application to the theory of phase transitions in statistical physics and has led to many fruitful results [1, 8, 16–20, 26, 29, 30]. On the other
hand, these two developments have been regarded as independent or different. The self-dual cone analysis is a general theory unifying these two seemingly different mathematical and physical viewpoints.

1.2. A new viewpoint of attraction

This review concerns applications of the self-dual cone analysis to condensed matter physics. We will explain the foundation of the self-dual cone analysis and apply this analysis to the BCS theory and the Peierls instability in detail. So far the idea of this analysis has been used only implicitly. If we recognize the idea of the self-dual cone analysis clearly, what can we see physically? To address this question, let us consider a certain quantum system. Suppose the system is described by the Hamiltonian H_V = H_0 − V, where H_0 is the free Hamiltonian and −V is the interaction term. First, depending on the physical phenomenon we want to study, we select a proper self-dual cone p so that e^{−βH_0} ⊵ 0 with respect to p for all β ≥ 0. Then the interaction term becomes attractive in the sense that −V ⊴ 0 with respect to p. On the other hand, if we choose an irrelevant self-dual cone, the interaction will never be attractive. There are infinitely many possible choices of self-dual cones, so making good use of these possibilities is one of the vital points of the self-dual cone analysis. When the interaction is attractive in the sense mentioned above, the attractive interaction lowers the energy. Furthermore, the ground state of the system becomes unique by this attraction, i.e. the ground state entropy is zero. As a result, if the system has a conserved physical quantity represented by a self-adjoint operator S, such as the total spin, then the ground state must belong to ker[S], i.e. the ground state forms a symmetric distribution. This symmetric ground state can be regarded as a physical order. For more precise discussions, see the following sections, in particular Sec. 3.

Grasping the attraction in this manner provides useful viewpoints. It is noteworthy that the effectiveness of the self-dual cone analysis has already been confirmed in [37, 38]. With further developments and applications, this analysis will be able to cover a wider area of physical phenomena.

1.3. A typical way of application

Here we give a representative strategy of the self-dual cone analysis in condensed matter physics. Roughly speaking, the analysis consists of the following four steps.

Step I: Choice of a proper self-dual cone. First of all, we have to determine a proper self-dual cone. Once we pick a correct one, we can define several operator inequalities, which will be our main language in the analysis below. Hence this step is the heart of our study. If we choose an irrelevant self-dual cone, then the following steps will be futile.

Step II: Derivation of attractive Hamiltonians. Consider a certain family of Hamiltonians. Each Hamiltonian in the family describes a physically
possible situation in the system. By the DLS inequality or applications of positivity associated with the self-dual cone in Step I, we can see that the free energy is minimized when the interaction is attractive (applications of Theorems 2.8, 5.4 and 6.10). Therefore, among the various Hamiltonians in the family, we can concentrate our attention on the attractive Hamiltonian only. In many cases, the attractive Hamiltonian is given from the outset. Then we can skip this optimization process.

Step III: Operator inequality analysis. By Step II, we can select a Hamiltonian with attractive interaction. In this step, we analyze this Hamiltonian from the viewpoint of operator inequalities associated with the self-dual cone given by Step I. For instance, the positivity preserving property of the heat semigroup of the Hamiltonian can be proven (Theorem 5.7). This is an important input for the next step.

Step IV: Uniqueness of the ground state. We show the uniqueness of the ground state of the Hamiltonian discussed in Step III. This means the ground state entropy equals zero. Thus a physical order would occur in the ground state. What kind of physical order appears depends on the model under consideration, and additional observations will be needed; see Secs. 7 and 8.

If readers lose their way in the labyrinth of inequality symbols, we advise them to return to this chart each time.

1.4. Motivations

This paper is mainly motivated by Fröhlich [14, 15] and Lieb [32, 33]. These works seem rather far from each other, but by reading the works on phase transitions [8, 16, 18, 19], readers can notice relations among them. In Lieb's construction of ferromagnetism in the Hubbard model [32], the reflection positivity used in [8, 16–19] and the Perron–Frobenius–Faris theorem played essential roles. After [32], Lieb determined the electron number which minimizes the energy of the Hubbard system by the Dyson–Lieb–Simon (DLS) inequality, which also appeared in [8]. On the other hand, in his famous thesis [14, 15], Fröhlich investigated a general model of a charged particle coupled with a Bose field. He proved the uniqueness of the ground state of this system by a clever choice of a self-dual cone. As a result, the system has a unique, rotation invariant ground state, i.e. an orderly state. In this discussion, the Perron–Frobenius–Faris theorem is important too. The order found by Fröhlich and by Lieb comes from the uniqueness of a ground state; furthermore, it is interesting that their proofs are very different but the philosophies behind the proofs are similar. By using the inequality "⊵", the spin reflection positivity in [32] and the particle-field interaction in [14, 15] can be regarded as attractions. Then, by the effect of the attraction, the energy of the system becomes lower, and by the Perron–Frobenius–Faris theorem, the ground state of the system is unique. As a consequence, physical orders are realized in ground states.
1.5. Structure of the paper

Roughly speaking, this paper consists of two parts.

(I) Mathematical foundation of the self-dual cone analysis (Secs. 2–6).
(II) Applications to the BCS theory (Sec. 7) and to the Peierls instability (Sec. 8).

Technically, noncommutative integration will be used heavily; however, since only a special case is discussed, readers do not need any prior knowledge of this topic. (Readers who want to learn about noncommutative integration may consult [42, 48, 52].) To show the uniqueness of a ground state, we will often apply the Perron–Frobenius theorem abstracted by Faris (the Perron–Frobenius–Faris theorem) [9], and we give a brief review in Sec. 2.

In Sec. 2, we provide a mathematical language describing the self-dual cone analysis. By mastering this language, readers will be able to handle the self-dual cone analysis freely. In Sec. 3, we give fundamental theorems of the self-dual cone analysis. All results in this section are abstract, and hence will often be applied in the following sections. Through the theorems and proofs, it is clarified that our notion of attraction plays an important role. Section 4 is devoted to an introduction of a special class of self-dual cones which will be crucial in Secs. 7 and 8. Using this self-dual cone, the notion of reflection positivity can naturally be regarded as an attraction in the sense of Sec. 1.2. In Sec. 5, we perform the self-dual cone analysis by using the self-dual cone introduced in Sec. 4. The line of argument here will appear frequently in applications. An inequality introduced by Dyson, Lieb and Simon (the DLS inequality) is examined within the framework of the self-dual cone analysis. Section 6 is concerned with the self-dual cone analysis in interacting systems. This section is a natural generalization of Sec. 5. Sections 7 and 8 are devoted to applications of the self-dual cone analysis to the BCS theory and the Peierls instability, respectively.

2. Mathematical Basis for the Self-Dual Cone Analysis

2.1. Self-dual cones and related operator inequalities

Let H be a complex Hilbert space and p be a convex cone in H. Then p is called self-dual if

p = {x ∈ H | ⟨x, y⟩ ≥ 0 ∀ y ∈ p}. (2.1)

The following properties of p are well known [6, 25]:

Proposition 2.1. We have the following:
(i) p ∩ (−p) = {0}.
(ii) There exists a unique involution j in H such that jx = x for all x ∈ p.
(iii) Each element x ∈ H with jx = x has a unique decomposition x = x_+ − x_−, where x_+, x_− ∈ p and ⟨x_+, x_−⟩ = 0.
(iv) H is linearly spanned by p.

Proof. For the reader's convenience, we give a sketch of the proof.

(i) Let x ∈ p ∩ (−p). Since −x, x ∈ p, we see that 0 ≤ ⟨x, −x⟩ = −‖x‖² ≤ 0. Thus x must be 0, i.e. p ∩ (−p) = {0}.

(ii) Let H_R = {Σ_{j=1}^{n} a_j x_j | a_j ∈ R, x_j ∈ p, n ∈ N}. Then H_R is a real subspace of H. Set H_0 = {x + iy | x, y ∈ H_R}^−, where {···}^− means the closure of the set {···} under the strong topology. We will show that H = H_0. Assume H ≠ H_0. Then, since H_0 is a closed subspace, there exists a z ≠ 0 such that ⟨z, x⟩ = 0 for all x ∈ H_0. In particular, for all x ∈ p, we have ⟨z, x⟩ = 0 = ⟨z, −x⟩, which implies z ∈ p ∩ (−p) by the self-duality of p. Hence by (i) we have z = 0, which is a contradiction. Now we see that H = {x + iy | x, y ∈ H_R}^−. Hence we can define an involution j by j(x + iy) = x − iy for all x, y ∈ H_R. To see the uniqueness of the involution, assume there exists another involution j′ in H so that j′x = x for all x ∈ p. Then, for all x ∈ H_R, we have j′x = x as well. Thus, for all x, y ∈ H_R, j′(x + iy) = x − iy = j(x + iy). This means j = j′.

(iii) Let x ∈ H with jx = x. Consider the distance between x and p, i.e. dist(x, p) = inf_{y∈p} ‖x − y‖. Then there exists a unique x_+ ∈ p such that dist(x, p) = ‖x − x_+‖. By defining x_− = x_+ − x, we obtain x = x_+ − x_−. By the orthogonal projection theorem, ⟨x_+, x_−⟩ = 0. For all z ∈ p and s ∈ R_+, we see that ‖−x_−‖ = dist(x, p) ≤ ‖−x_− − sz‖. Thus we obtain s²‖z‖² + 2s⟨x_−, z⟩ ≥ 0, which implies ⟨x_−, z⟩ ≥ 0. Then by the self-duality of p, we conclude x_− ∈ p. The uniqueness of the decomposition is easy to see.

(iv) By the proof of (ii), any element z in H can be written as z = x + iy with jx = x and jy = y. Then by (iii), we have x = x_+ − x_− and y = y_+ − y_− with x_± ∈ p and y_± ∈ p. Therefore z = (x_+ − x_−) + i(y_+ − y_−).

If x − y ∈ p, then we will write x ≥ y (or y ≤ x) with respect to p. We denote x_+ + x_− by |x|_p. Clearly one has |x|_p ≥ x_± with respect to p.

Let A and B be densely defined linear operators on H. If Ax ≥ Bx with respect to p for all x ∈ dom(A) ∩ dom(B) ∩ p, then we will write A ⊵ B (or B ⊴ A) with respect to p. In particular, if A satisfies 0 ⊴ A with respect to p, then we say that A preserves positivity with respect to p. We remark that the symbol "⊵" was first introduced by Miura [39].

An element x in p is called strictly positive if ⟨x, y⟩ > 0 for all y ∈ p\{0}. We will write this as x > 0 with respect to p. Of course, the inequality x > y with respect to p means that x − y is strictly positive with respect to p. If bounded operators A and B satisfy Ax > Bx with respect to p for all x ∈ p\{0}, then we will express this as A ▷ B (or B ◁ A) with respect to p. Clearly, if A ▷ B with respect to p, then A ⊵ B with respect to p. We say that A improves positivity with respect to p if A ▷ 0 with respect to p.
Let A ∈ L^∞(H), where L^∞(H) is the set of all bounded linear operators on H. Assume A ⊵ 0 with respect to p. Then we say that A is ergodic with respect to p if for any x, y ∈ p\{0}, there is an N ∈ N_0 such that ⟨x, A^N y⟩ > 0. (Note that N depends on x, y.) If A is ergodic with respect to p, we denote A 0 with respect to p. For any A, B ∈ L^∞(H), the notation A B with respect to p means that the bounded linear operator A − B is ergodic with respect to p. We remark that the notion of ergodicity can be extended to unbounded operators; however, to keep the arguments simple, we do not enter into the details. Finally we note the following.

Proposition 2.2. Let A ∈ L^∞(H). Then one has the following:

A ▷ 0 with respect to p ⇒ A is ergodic with respect to p ⇒ A ⊵ 0 with respect to p. (2.2)

Proof. This proposition follows immediately from the definitions above.
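As a sanity check on the three notions above, the following worked example may be useful. It is only a sketch under our own assumptions: the choice H = C^2 with p = R_+^2 and the matrix A are illustrative and not taken from the source, and we write \unrhd for "preserves positivity" and \rhd for "improves positivity".

```latex
% Illustrative example (our assumption): H = \mathbb{C}^2, \mathfrak{p} = \mathbb{R}_+^2.
\[
\mathfrak{p}=\{(a,b)\mid a,b\ge 0\},\qquad
\{x\in\mathbb{C}^2 \mid \langle x,y\rangle\ge 0\ \ \forall\, y\in\mathfrak{p}\}=\mathfrak{p}.
\]
% For this cone a matrix preserves positivity iff all entries are >= 0,
% and improves positivity iff all entries are > 0.
\[
A=\begin{pmatrix}0&1\\ 1&0\end{pmatrix}\unrhd 0,\qquad
\langle e_1, A e_1\rangle = 0 \ \Longrightarrow\ A \not\rhd 0 .
\]
% Ergodicity: for x, y in p \ {0} pick indices i, j with x_i > 0 and y_j > 0.
\[
\langle x, A^{N} y\rangle \ \ge\ x_i\,(A^{N})_{ij}\,y_j \;>\; 0
\quad\text{with } N=0 \text{ if } i=j,\qquad N=1 \text{ if } i\ne j .
\]
```

In this example A is ergodic but does not improve positivity, so the first implication in (2.2) cannot be reversed in general.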
2.2. Partial orders

In the remainder of this section, we review some results about the operator inequalities introduced in Sec. 2.1. Almost all of the results here are taken from the author's previous work [37].

In Sec. 2.1, we introduced three binary relations on the set of all densely defined linear operators on H: positivity preservation "⊵", positivity improvement "▷" and the ergodicity relation. Here we observe that these binary relations possess the properties of a partial order. To see the mathematical structures clearly, we consider bounded operators only. To extend the observations in this subsection to unbounded operators, we have to take care of the domains of the operators under consideration.

The first two lemmas are straightforward.

Lemma 2.3. The binary relation "⊵" is a partial order, namely one has the following. Let A, B, C ∈ L^∞(H).
(i) (Reflexivity) A ⊵ A with respect to p.
(ii) (Antisymmetry) If A ⊵ B and B ⊵ A with respect to p, then A = B.
(iii) (Transitivity) If A ⊵ B and B ⊵ C with respect to p, then A ⊵ C with respect to p.

Lemma 2.4. The binary relation "▷" is a strict partial order, namely one has the following. Let A, B, C ∈ L^∞(H).
(i) (Irreflexivity) ¬(A ▷ A with respect to p).
(ii) (Asymmetry) If A ▷ B with respect to p, then ¬(B ▷ A with respect to p).
(iii) (Transitivity) If A ▷ B and B ▷ C with respect to p, then A ▷ C with respect to p.
Lemma 2.5. The ergodicity relation of Sec. 2.1 is a strict partial order, namely one has the following. Let A, B, C ∈ L^∞(H).
(i) (Irreflexivity) A − A is not ergodic with respect to p.
(ii) (Asymmetry) If B − A is ergodic with respect to p, then A − B is not ergodic with respect to p.
(iii) (Transitivity) If B − A and C − B are ergodic with respect to p, then C − A is ergodic with respect to p.

Proof. (i) and (ii) are easy to check.
(iii) For each x, y ∈ p\{0}, there exist m, n ∈ N such that ⟨x, (B − A)^m y⟩ > 0 and ⟨x, (C − B)^n y⟩ > 0. Hence

⟨x, (C − A)^m y⟩ = ⟨x, {(C − B) + (B − A)}^m y⟩ = ⟨x, (B − A)^m y⟩ + (Others). (2.3)

Since (B − A) ⊵ 0 and (C − B) ⊵ 0 with respect to p, all terms contained in (Others) are nonnegative. Thus the right-hand side of (2.3) is strictly positive.
2.3. Basic tools

Anticipating concrete applications to physics, we treat unbounded operators from this subsection on. The following two lemmas are immediate consequences of the definitions.

Lemma 2.6. Suppose that 0 ⊴ A_1 ⊴ B_1 and 0 ⊴ A_2 ⊴ B_2 with respect to p. Then one has the following:
(i) 0 ⊴ A_1 A_2 with respect to p. Moreover, if A_1, B_1 ∈ L^∞(H), then 0 ⊴ A_1 A_2 ⊴ B_1 B_2 with respect to p.
(ii) 0 ⊴ aA_1 + bA_2 ⊴ aB_1 + bB_2 with respect to p, for all a, b ∈ R_+ = {x ∈ R | x ≥ 0}.
(iii) Let A be positivity preserving: 0 ⊴ A with respect to p. Suppose that p ∩ dom(A) is dense in p. Then 0 ⊴ A^* with respect to p.

Lemma 2.7. Let A, B ∈ L^∞(H). Suppose that 0 ◁ A and 0 ⊴ B with respect to p. Then we have the following properties:
(i) 0 ◁ A^* with respect to p.
(ii) Suppose that ker B^# = {0}, where a^# stands for a or a^*. Then 0 ◁ AB and 0 ◁ BA with respect to p.
(iii) 0 ◁ aA + bB with respect to p for a > 0 and b ≥ 0.

2.4. Traces and self-dual cones

Let tr_H be the trace on the Hilbert space H. In this subsection, we discuss an important connection between the trace and positivity preserving operators associated with p. To this end, we assume the following.
(C) There exists a complete orthonormal system (CONS) {x_n} in H so that {x_n} ⊆ p.

In various applications, the above assumption is satisfied naturally. The following simple theorem will play a crucial role in applications.

Theorem 2.8. Assume (C). Assume A ⊵ 0 with respect to p. If A is a trace class operator, then tr_H[A] ≥ 0.

Proof. For each n ∈ N, we see that Ax_n ≥ 0 with respect to p, which implies ⟨x_n, Ax_n⟩ ≥ 0. Thus tr_H[A] = Σ_n ⟨x_n, Ax_n⟩ ≥ 0.

Corollary 2.9. Assume (C). Let A_1, ..., A_N be trace class operators in H. Assume A_j ⊵ 0 with respect to p for all j. Then one has the following:
(i) tr_H[α_1 A_1 + ··· + α_N A_N] ≥ 0 for any α_1, ..., α_N ∈ R_+.
(ii) tr_H[A_1 ··· A_N] ≥ 0.

2.5. Operator monotonicity

In the theory of the standard operator inequality, operator monotonicity is a fundamental notion. For example, if A, B ∈ L^∞(H) are positive and A ≥ B, then A^{1/2} ≥ B^{1/2} holds, so the order is preserved. For our theory of new operator inequalities, we can also construct a counterpart as below. Note that, in this subsection, our term "operator monotonicity" includes not only preserving the order but also reversing the order.

Proposition 2.10 (Monotonicity). Let A and B be positive self-adjoint operators. We assume the following.
(a) dom(A) ⊆ dom(B) or dom(A) ⊇ dom(B).
(b) (A + s)^{−1} ⊵ 0 and (B + s)^{−1} ⊵ 0 with respect to p for all s > 0.
Then the following are equivalent to each other.
(i) B ⊵ A with respect to p.
(ii) (A + s)^{−1} ⊵ (B + s)^{−1} with respect to p for all s > 0.
(iii) e^{−tA} ⊵ e^{−tB} with respect to p for all t ≥ 0.

Proof. (i) ⇒ (ii): By the assumptions (a) and (b), we see that

(A + s)^{−1} − (B + s)^{−1} = (A + s)^{−1}(B − A)(B + s)^{−1} ⊵ 0.

(ii) ⇒ (iii):

e^{−tA} = s-lim_{n→∞} (1 + tA/n)^{−n} ⊵ s-lim_{n→∞} (1 + tB/n)^{−n} = e^{−tB}.
(iii) ⇒ (i):

A = s-lim_{t↓0} (1 − e^{−tA})/t ⊴ s-lim_{t↓0} (1 − e^{−tB})/t = B.

Corollary 2.11. Let A be a positive self-adjoint operator and let B be a symmetric operator. Assume the following:
(i) B is A-bounded with relative bound a < 1, i.e. dom(A) ⊆ dom(B) and ‖Bx‖ ≤ a‖Ax‖ + b‖x‖ for all x ∈ dom(A).
(ii) 0 ⊴ e^{−tA} with respect to p for all t ≥ 0.
(iii) 0 ⊴ −B with respect to p.
Then 0 ⊴ e^{−t(A+B)} with respect to p for all t ≥ 0.

Proof. Let C = A + B. Then by the assumptions we see that dom(A) = dom(C) and A − C = −B ⊵ 0. Thus, applying Proposition 2.10, one obtains e^{−tC} ⊵ e^{−tA} ⊵ 0.
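A two-dimensional instance of Proposition 2.10 may help to fix the direction of the inequalities. The matrices below are our own illustrative choice (H = C^2, p = R_+^2, so that \unrhd means entrywise nonnegativity of the difference) and are not taken from the source.

```latex
% Assumed example: \sigma_x = [[0,1],[1,0]], A = 1 - \sigma_x, B = 1 (both positive).
\[
B-A=\sigma_x\unrhd 0 .
\]
% Resolvents for s > 0, using ((1+s)-\sigma_x)((1+s)+\sigma_x) = s(s+2):
\[
(A+s)^{-1}=\frac{(1+s)+\sigma_x}{s(s+2)},\qquad
(B+s)^{-1}=\frac{1}{1+s},
\]
\[
(A+s)^{-1}-(B+s)^{-1}=\frac{1+(1+s)\,\sigma_x}{s(s+1)(s+2)}\unrhd 0 .
\]
% Semigroups for t >= 0, using e^{t\sigma_x} = \cosh t + \sigma_x \sinh t:
\[
e^{-tA}-e^{-tB}=e^{-t}\bigl[(\cosh t-1)+\sigma_x\sinh t\bigr]\unrhd 0 .
\]
```

All three statements (i)–(iii) hold at once here, and the order is reversed when passing from the operators themselves to their resolvents and semigroups.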
2.6. Beurling–Deny criterion

Let j be the involution given in Sec. 2.1. Let A be a linear operator acting in H. We say that A is j-real if j dom(A) ⊆ dom(A) and jAx = Ajx for all x ∈ dom(A). Set H_R = {x ∈ H | jx = x}. Then for any x ∈ H_R, we have a unique decomposition x = x_+ − x_− with x_± ∈ p and ⟨x_+, x_−⟩ = 0. Recall the notation |x|_p = x_+ + x_−. The following theorem is an abstract version of the Beurling–Deny criterion [5].

Theorem 2.12 (Beurling–Deny Criterion). Let A be a positive self-adjoint operator on H. Assume that A is j-real. Then the following are equivalent:
(i) 0 ⊴ e^{−tA} with respect to p for all t ≥ 0.
(ii) If x ∈ dom(A) ∩ H_R, then |x|_p ∈ dom(A^{1/2}) ∩ H_R and ⟨|x|_p, A|x|_p⟩ ≤ ⟨x, Ax⟩.
(iii) If x ∈ dom(A) ∩ H_R, then x_+ ∈ dom(A^{1/2}) ∩ H_R and ⟨x_+, Ax_+⟩ ≤ ⟨x, Ax⟩.
(iv) If x ∈ dom(A) ∩ H_R, then x_± ∈ dom(A^{1/2}) ∩ H_R and ⟨x_+, Ax_+⟩ + ⟨x_−, Ax_−⟩ ≤ ⟨x, Ax⟩.

Proof. The proof is a slight modification of [47, Theorem XIII.50].

2.7. Perron–Frobenius–Faris theorem

Theorem 2.13 (Perron–Frobenius–Faris). Let A be a positive self-adjoint operator on H. Suppose that 0 ⊴ e^{−tA} with respect to p for all t ≥ 0 and that inf spec(A) is an eigenvalue. (For a closed operator A, the spectrum of A is denoted by spec(A).) Let P_A be the orthogonal projection onto the closed subspace spanned by the eigenvectors associated with inf spec(A). Then the following are
equivalent:
(i) dim ran P_A = 1 and P_A ▷ 0 with respect to p.
(ii) 0 ◁ (A + s)^{−1} with respect to p for some s > 0.
(iii) For all x, y ∈ p\{0}, there exists a t > 0 such that 0 < ⟨x, e^{−tA} y⟩.
(iv) 0 ◁ (A + s)^{−1} with respect to p for all s > 0.
(v) 0 ◁ e^{−tA} with respect to p for all t > 0.

Proof. See, e.g., [9, 37, 47].

2.8. Direct sums of self-dual cones

Let {H_n}_{n∈N} be a family of Hilbert spaces and let p_n be a self-dual cone in H_n. The direct sum of {p_n} is now defined by

⊕_{n∈N} p_n = {x = ⊕_{n∈N} x_n ∈ ⊕_{n∈N} H_n | x_n ∈ p_n ∀ n ∈ N}. (2.4)

It is easily checked that ⊕_{n∈N} p_n is a self-dual cone in ⊕_{n∈N} H_n.
Proposition 2.14. Let A_n be a densely defined closable operator on H_n. Then A = ⊕_{n∈N} A_n ⊵ 0 with respect to ⊕_{n∈N} p_n if and only if A_n ⊵ 0 with respect to p_n for all n ∈ N.

Proof. Easy exercise.

2.9. Tensor products of self-dual cones

Let p_1 and p_2 be self-dual cones in Hilbert spaces H_1 and H_2, respectively. Set

p_1 ⊗ p_2 = {z ∈ H_1 ⊗ H_2 | ⟨x ⊗ y, z⟩ ≥ 0 ∀ x ∈ p_1 ∀ y ∈ p_2}. (2.5)

In general, p_1 ⊗ p_2 is not a self-dual cone. The following proposition is a standard criterion for p_1 ⊗ p_2 to be a self-dual cone.

Proposition 2.15. One has the following:
(i) If there exists a CONS {x_n} in H_1 such that {x_n} ⊆ p_1, then p_1 ⊗ p_2 is self-dual.
(ii) If there exists a CONS {y_n} in H_2 such that {y_n} ⊆ p_2, then p_1 ⊗ p_2 is self-dual.

Proof. We will show (i) only. Set (p_1 ⊗ p_2)^† = {w ∈ H_1 ⊗ H_2 | ⟨w, z⟩ ≥ 0 ∀ z ∈ p_1 ⊗ p_2}, the dual cone of p_1 ⊗ p_2. We will prove p_1 ⊗ p_2 = (p_1 ⊗ p_2)^†. It is easy to check (p_1 ⊗ p_2)^† ⊆ p_1 ⊗ p_2. To see the converse, let z ∈ p_1 ⊗ p_2. Then, since {x_n} ⊆ p_1 is a CONS of H_1, we have

z = Σ_n x_n ⊗ ⟨x_n, z⟩_{H_1}. (2.6)

Here, for each z ∈ H_1 ⊗ H_2 and x ∈ H_1, a vector ⟨x, z⟩_{H_1} ∈ H_2 is defined by ⟨y, ⟨x, z⟩_{H_1}⟩_{H_2} = ⟨x ⊗ y, z⟩_{H_1⊗H_2} for all y ∈ H_2. Since ⟨x_n ⊗ y, z⟩_{H_1⊗H_2} ≥ 0 for all
n ∈ N and y ∈ p_2, we get ⟨x_n, z⟩_{H_1} ≥ 0 with respect to p_2. Next choose z′ ∈ p_1 ⊗ p_2. Then one also sees z′ = Σ_n x_n ⊗ ⟨x_n, z′⟩_{H_1} with ⟨x_n, z′⟩_{H_1} ≥ 0 with respect to p_2. Hence ⟨z, z′⟩_{H_1⊗H_2} = Σ_n ⟨⟨x_n, z⟩_{H_1}, ⟨x_n, z′⟩_{H_1}⟩_{H_2} ≥ 0. This means z ∈ (p_1 ⊗ p_2)^†.

Henceforth we assume p_1 ⊗ p_2 is self-dual.

Proposition 2.16. Let A, B be densely defined closable operators on H_1 and H_2, respectively. Assume A ⊵ 0 with respect to p_1 and B ⊵ 0 with respect to p_2. Then one has A ⊗ B ⊵ 0 with respect to p_1 ⊗ p_2.

Proof. For each z ∈ dom(A ⊗ B) ∩ p_1 ⊗ p_2, x ∈ dom(A) ∩ p_1 and y ∈ dom(B) ∩ p_2, we observe

⟨x ⊗ y, A ⊗ B z⟩ = ⟨Ax ⊗ By, z⟩ ≥ 0. (2.7)

Since dom(A) ∩ p_1 and dom(B) ∩ p_2 are dense in p_1 and p_2, the inequality (2.7) can be extended to any x ∈ p_1 and y ∈ p_2.

3. Fundamental Theorems of Self-Dual Cone Analysis

Let H_0 be a positive self-adjoint operator acting in the Hilbert space H and let V be a self-adjoint operator acting in H. We will investigate the linear operator

H_V = H_0 − V. (3.1)

For simplicity, we assume that V is bounded in this section. (All results in this section can be extended to unbounded V such that V is H_0-bounded with relative bound less than one; see [37].) Then, by the Kato–Rellich theorem [46], H_V is self-adjoint and bounded from below with dom(H_V) = dom(H_0). H_0 and −V are referred to as a free Hamiltonian and an interaction, respectively. H_V is called the Hamiltonian with an interaction −V.

Definition 3.1. Fix a self-dual cone p in H. We say that the interaction −V is attractive with respect to p if V ⊵ 0 (equivalently −V ⊴ 0) with respect to p. On the other hand, if V ⊴ 0 (equivalently −V ⊵ 0) with respect to p, then we say that the interaction −V is repulsive with respect to p.

Remark 3.2. Clearly the choice of p is not unique. Thus our notion of attraction is not unique in this sense. For instance, −V may fail to be attractive with respect to one self-dual cone p_1, and yet be attractive with respect to another choice p_2. Hence, in concrete applications to physics, a proper choice of p is essential to clarify the mathematical structure of the phenomenon under consideration.

We also assume the following in this section.

(H) e^{−βH_0} ⊵ 0 with respect to p for all β ∈ R_+.
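The dependence of attraction on the cone, as described in Remark 3.2, can be seen in a two-level toy model. This is a sketch under our own assumptions: H_0 = 0, V = −σ_x and the two cones below are illustrative choices that do not appear in the source; \unrhd and \unlhd denote the positivity preserving order and its reverse.

```latex
% Assumed toy model: H = \mathbb{C}^2, H_0 = 0, V = -\sigma_x, \sigma_x = [[0,1],[1,0]].
\[
\mathfrak{p}_1=\{(a,b)\mid a,b\ge0\},\qquad
\mathfrak{p}_2=\sigma_z\mathfrak{p}_1=\{(a,-b)\mid a,b\ge0\}.
\]
% With respect to p_1 the interaction -V is repulsive:
\[
V\begin{pmatrix}a\\ b\end{pmatrix}=\begin{pmatrix}-b\\ -a\end{pmatrix}\in-\mathfrak{p}_1
\quad\Longrightarrow\quad V\unlhd 0 \ \text{ with respect to } \mathfrak{p}_1 .
\]
% With respect to p_2 the same interaction is attractive:
\[
V\begin{pmatrix}a\\ -b\end{pmatrix}=\begin{pmatrix}b\\ -a\end{pmatrix}\in\mathfrak{p}_2
\quad\Longrightarrow\quad V\unrhd 0 \ \text{ with respect to } \mathfrak{p}_2 .
\]
```

Since e^{−βH_0} = 1 maps every cone into itself, assumption (H) holds for both choices; p_2 is self-dual because it is the image of the self-dual cone p_1 under the unitary σ_z.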
Theorem 3.3. Assume (H). Assume V ⊵ 0 with respect to p, that is, −V is attractive with respect to p. Then one has e^{−βH_V} ⊵ 0 with respect to p for all β ∈ R_+.

Proof. This immediately follows from Corollary 2.11.

Theorem 3.4. Let E(V) = inf spec(H_V). Under the assumptions in Theorem 3.3, we have

E(V) = inf{⟨ϕ, H_V ϕ⟩ | ϕ ∈ dom(H_0) ∩ p, ‖ϕ‖ = 1}. (3.2)

Proof. Since e^{−βH_V} ⊵ 0 with respect to p by Theorem 3.3, H_V is j-real, i.e. H_V commutes with j. For each ϕ ∈ H, put ℜϕ = (1/2)(1 + j)ϕ and ℑϕ = (1/(2i))(1 − j)ϕ. Clearly ‖ϕ‖² = ‖ℜϕ‖² + ‖ℑϕ‖². Let

Ẽ(V) = inf{⟨ϕ, H_V ϕ⟩ | ϕ ∈ dom(H_0) ∩ H_R, ‖ϕ‖ = 1}. (3.3)

First we will show E(V) = Ẽ(V). Since H_V is j-real, we have

⟨ϕ, H_V ϕ⟩ = ⟨ℜϕ, H_V ℜϕ⟩ + ⟨ℑϕ, H_V ℑϕ⟩ ≥ Ẽ(V)‖ϕ‖². (3.4)

Hence E(V) ≥ Ẽ(V). On the other hand, it is easy to check E(V) ≤ Ẽ(V). For any ϕ ∈ dom(H_0) ∩ H_R, we have

⟨ϕ, H_V ϕ⟩ ≥ ⟨|ϕ|_p, H_V |ϕ|_p⟩ ≥ RHS of (3.2) (3.5)

by Theorem 2.12. This completes the proof.

Corollary 3.5. In addition to the assumptions in Theorem 3.3, we assume that H_V has a ground state ϕ_g. Then we can always choose ϕ_g to be positive with respect to p, i.e. ϕ_g ≥ 0 with respect to p.

Theorem 3.6. Assume (H). Let −V_1 and −V_2 be interactions. If V_1 ⊵ V_2 ⊵ 0 with respect to p, then we have e^{−βH_{V_1}} ⊵ e^{−βH_{V_2}} ⊵ 0 with respect to p for all β ∈ R_+.

Proof. Apply Proposition 2.10.

Corollary 3.7. Under the assumptions in Theorem 3.6, we have E(V_1) ≤ E(V_2).

Proof. By Theorem 3.4, for any ε > 0, there exists a ϕ ∈ dom(H_0) ∩ p so that ‖ϕ‖ = 1 and ⟨ϕ, H_{V_2} ϕ⟩ ≤ E(V_2) + ε. Then, by Theorem 3.4 again,

E(V_2) + ε ≥ ⟨ϕ, H_{V_1} ϕ⟩ + ⟨ϕ, (V_1 − V_2)ϕ⟩ ≥ ⟨ϕ, H_{V_1} ϕ⟩ ≥ E(V_1). (3.6)

This proves the assertion in the corollary.

Theorem 3.8. Assume (H). Assume V is ergodic with respect to p. Then e^{−βH_V} ▷ 0 with respect to p for all β > 0.
Proof. The essential idea comes from Fröhlich [14, 15]. By the Perron–Frobenius–Faris theorem (Theorem 2.13), it suffices to show that, for any x, y ∈ p\{0}, there exists a β > 0 such that

⟨x, e^{−βH_V} y⟩ > 0. (3.7)

By Duhamel's formula, we see that

e^{−βH_V} = Σ_{N=0}^{∞} D_N(β),

D_N(β) = ∫_{S_N(β)} e^{−t_1 H_0} V e^{−t_2 H_0} V ··· e^{−t_N H_0} V e^{−(β − Σ_{j=1}^{N} t_j)H_0},

where ∫_{S_N(β)} = ∫_0^β dt_1 ∫_0^{β−t_1} dt_2 ··· ∫_0^{β − Σ_{j=1}^{N−1} t_j} dt_N. Then, by the assumptions, D_N(β) ⊵ 0 with respect to p for all N ∈ N_0 and β ∈ R_+. Hence e^{−βH_V} ⊵ D_N(β) with respect to p for all N ∈ N_0. Thus, to see (3.7), it suffices to show that the sequence {D_N(β)}_{N≥0} is ergodic in the sense that, for any x, y ∈ p\{0}, there exists an N ∈ N_0 so that ⟨x, D_N(β)y⟩ > 0. But since V is ergodic with respect to p, there exists an N ∈ N_0 such that ⟨x, V^N e^{−βH_0} y⟩ > 0, which implies ⟨x, D_N(β)y⟩ > 0.
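The Duhamel expansion used in this proof can be made fully explicit in a two-level toy model. The following is a sketch under our own assumptions: H_0 = 0, V = σ_x and p = R_+^2 are illustrative choices not taken from the source, \unrhd denotes positivity preservation and \rhd positivity improvement.

```latex
% Assumed toy model: H = \mathbb{C}^2, \mathfrak{p} = \mathbb{R}_+^2, H_0 = 0, V = \sigma_x.
% With H_0 = 0 the integrand is V^N and the simplex S_N(\beta) has volume \beta^N / N!.
\[
D_N(\beta)=\frac{\beta^{N}}{N!}\,\sigma_x^{N}\unrhd 0,\qquad
e^{-\beta H_V}=\sum_{N=0}^{\infty}D_N(\beta)=\cosh\beta+\sigma_x\sinh\beta .
\]
% Every matrix entry is > 0 for \beta > 0, hence e^{-\beta H_V} \rhd 0.
\[
\inf\operatorname{spec}(H_V)=-1,\qquad
\varphi_g=\frac{1}{\sqrt2}\begin{pmatrix}1\\ 1\end{pmatrix}>0
\ \text{ with respect to } \mathfrak{p}.
\]
```

Here no single D_N(β) improves positivity (each is a multiple of 1 or of σ_x), but their sum does, and the ground state of H_V = −σ_x is unique and strictly positive.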
Corollary 3.9. Under the assumptions in Theorem 3.8, assume H_V has a ground state. Then it is unique and can be chosen to be strictly positive with respect to p.

Proof. This is a direct consequence of the Perron–Frobenius–Faris theorem (Theorem 2.13).

Corollary 3.10. Under the assumptions in Theorem 3.8, let U be a unitary operator on H. Assume U ⊵ 0 with respect to p and U H_V U^{−1} = H_V. If H_V has a unique ground state ϕ_g, then Uϕ_g = ϕ_g.

Proof. Since Uϕ_g is a ground state of H_V and strictly positive, it must be equal to ϕ_g by the uniqueness.

Theorem 3.11. Assume (H). Assume V_1 ⊵ 0 and that V_2 is ergodic with respect to p. Moreover assume V_1 ⊵ V_2 with respect to p and V_1 ≠ V_2. If H_{V_2} has a ground state, then one has E(V_1) < E(V_2).

Proof. Before we enter the proof, we remark the following fact. Let A be a bounded linear operator such that A ⊵ 0 with respect to p and A ≠ 0. Then there exists y ∈ p\{0} such that Ay ∈ p\{0}. [Proof: Assume Ay = 0 for all y ∈ p\{0}. Then, since H is linearly spanned by p, we have Ax = 0 for all x, which means A = 0. This is a contradiction.]

Let ϕ be a ground state of H_{V_2}. Then, by Corollary 3.9, it is unique and ϕ > 0 with respect to p. Then, by the above remark, (V_1 − V_2)ϕ ∈ p\{0}. [Proof: Assume
$(V_1 - V_2)\varphi = 0$. Then for all $y \in \mathfrak{p}$, we have $\langle y, (V_1 - V_2)\varphi \rangle = 0$. By the above remark we can choose $y \in \mathfrak{p}\setminus\{0\}$ so that $(V_1 - V_2)y \in \mathfrak{p}\setminus\{0\}$. Then, since $\varphi > 0$ with respect to $\mathfrak{p}$, we have $0 < \langle (V_1 - V_2)y, \varphi \rangle = \langle y, (V_1 - V_2)\varphi \rangle = 0$, which is a contradiction.] Observe that, since $\langle \varphi, (V_1 - V_2)\varphi \rangle > 0$,
\[ E(V_2) = \langle \varphi, H_{V_1}\varphi \rangle + \langle \varphi, (V_1 - V_2)\varphi \rangle > E(V_1). \]
Now the assertion is clear.

4. Canonical Self-Dual Cone in $L^2(\mathfrak{h})$

In this section, a special class of self-dual cones will be considered. The characteristic features of this self-dual cone will be essential for the applications in Secs. 7 and 8. Using this self-dual cone, we can clarify the connection between reflection positivity and the positivity preserving operators introduced in Sec. 2.1. In this way, reflection positivity can be included naturally in the self-dual cone analysis.

4.1. Hilbert–Schmidt operators

4.1.1. Algebraic structures

Let $L^2(\mathfrak{h})$ be the set of all Hilbert–Schmidt operators in a complex Hilbert space $\mathfrak{h}$:
\[ L^2(\mathfrak{h}) = \{\xi \in L^\infty(\mathfrak{h}) \mid \mathrm{tr}\, \xi^* \xi < \infty\}, \tag{4.1} \]
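where tr is the trace on $\mathfrak{h}$. Then $L^2(\mathfrak{h})$ is a $*$-algebra and a two-sided ideal in $L^\infty(\mathfrak{h})$. Moreover $L^2(\mathfrak{h})$ becomes a Hilbert space under the inner product $\langle \xi, \eta \rangle_{L^2} = \mathrm{tr}\, \xi^* \eta$ for each $\xi, \eta \in L^2(\mathfrak{h})$. Then, by the cyclicity of the trace, we have $\langle \xi, \eta \rangle_{L^2} = \langle \eta^*, \xi^* \rangle_{L^2}$. For each $\xi \in L^2(\mathfrak{h})$, we define the left multiplication $\pi_\ell(\xi)$ by
\[ \pi_\ell(\xi)\eta = \xi\eta, \quad \forall\, \eta \in L^2(\mathfrak{h}). \tag{4.2} \]
Similarly the right multiplication $\pi_r(\xi)$ is defined by
\[ \pi_r(\xi)\eta = \eta\xi, \quad \forall\, \eta \in L^2(\mathfrak{h}). \tag{4.3} \]
It is easily verified that $\pi_\ell(\cdot)$ is a $*$-representation of $L^2(\mathfrak{h})$, namely, one has the following:
\[ \pi_\ell(a\xi + b\eta) = a\pi_\ell(\xi) + b\pi_\ell(\eta), \quad \pi_\ell(\xi\eta) = \pi_\ell(\xi)\pi_\ell(\eta), \quad \pi_\ell(\xi)^* = \pi_\ell(\xi^*) \tag{4.4} \]
for all $a, b \in \mathbb{C}$ and $\eta, \xi \in L^2(\mathfrak{h})$. On the other hand, $\pi_r(\cdot)$ is a $*$-antirepresentation of $L^2(\mathfrak{h})$:
\[ \pi_r(a\xi + b\eta) = a\pi_r(\xi) + b\pi_r(\eta), \quad \pi_r(\xi\eta) = \pi_r(\eta)\pi_r(\xi), \quad \pi_r(\xi)^* = \pi_r(\xi^*). \tag{4.5} \]
Note that the difference between the middle equations in (4.4) and (4.5) will play an important role.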
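The contrast between the middle equations of (4.4) and (4.5) can be seen in a toy case. The following is a minimal sketch, assuming $\mathfrak{h} = \mathbb{C}^2$ so that Hilbert–Schmidt operators are just $2 \times 2$ complex matrices; the helper names `matmul`, `left` and `right` are illustrative and do not come from the text.

```python
# Toy model (assumption): h = C^2, so L^2(h) is the space of 2x2 complex matrices.

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def left(xi):
    """Left multiplication pi_l(xi): eta -> xi eta."""
    return lambda eta: matmul(xi, eta)

def right(xi):
    """Right multiplication pi_r(xi): eta -> eta xi."""
    return lambda eta: matmul(eta, xi)

# Small integer-valued entries keep the arithmetic exact.
xi = [[1, 2j], [0, 1]]
eta = [[0, 1], [1, 1j]]
zeta = [[2, 1], [1j, 3]]   # test vector in L^2(h)

# Representation property: pi_l(xi eta) = pi_l(xi) pi_l(eta).
rep_ok = left(matmul(xi, eta))(zeta) == left(xi)(left(eta)(zeta))

# Antirepresentation property: pi_r(xi eta) = pi_r(eta) pi_r(xi).
antirep_ok = right(matmul(xi, eta))(zeta) == right(eta)(right(xi)(zeta))

# The "wrong" order pi_r(xi) pi_r(eta) differs because xi and eta do not commute.
wrong_order_ok = right(matmul(xi, eta))(zeta) == right(xi)(right(eta)(zeta))

print(rep_ok, antirep_ok, wrong_order_ok)
```

The reversal of order for $\pi_r$ is what later turns a product of the form $\pi_\ell(A)\pi_r(A^*)\pi_\ell(B)\pi_r(B^*)$ into $\pi_\ell(AB)\pi_r((AB)^*)$.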
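We define an involution $J$ on $L^2(\mathfrak{h})$ by
\[ J\xi = \xi^*, \quad \forall\, \xi \in L^2(\mathfrak{h}). \tag{4.6} \]

Proposition 4.1. $L^2(\mathfrak{h})$ is a Hilbert algebra. Namely one has the following:
(i) One has
\[ \|\pi_\ell(\xi)\eta\|_{L^2} \le \|\xi\|_{L^\infty} \|\eta\|_{L^2}, \quad \|\pi_r(\xi)\eta\|_{L^2} \le \|\xi\|_{L^\infty} \|\eta\|_{L^2}, \tag{4.7} \]
where $\|a\|_{L^\infty}$ means the operator norm on $\mathfrak{h}$ for each $a \in L^\infty(\mathfrak{h})$. Hence $\pi_\ell(\xi), \pi_r(\xi) \in L^\infty(L^2(\mathfrak{h}))$ for each $\xi \in L^2(\mathfrak{h})$.
(ii) $\langle \pi_\ell(\xi)\zeta, \eta \rangle_{L^2} = \langle \zeta, \pi_\ell(\xi^*)\eta \rangle_{L^2}$, $\langle \pi_r(\xi)\zeta, \eta \rangle_{L^2} = \langle \zeta, \pi_r(\xi^*)\eta \rangle_{L^2}$.
(iii) For each $\xi, \eta \in L^2(\mathfrak{h})$, one has $\langle J\xi, J\eta \rangle_{L^2} = \langle \eta, \xi \rangle_{L^2}$.

By (i) in the above proposition and density of $L^2(\mathfrak{h})$ in $L^\infty(\mathfrak{h})$, we can extend the left and right multiplications to $L^\infty(\mathfrak{h})$. The von Neumann algebra $\pi_\ell(L^\infty(\mathfrak{h}))$ is sometimes called the left von Neumann algebra. Similarly $\pi_r(L^\infty(\mathfrak{h}))$ is called the right von Neumann algebra.

Proposition 4.2. For all $a \in L^\infty(\mathfrak{h})$, one has
\[ J\pi_\ell(a)J = \pi_r(a^*), \quad J\pi_r(a)J = \pi_\ell(a^*). \tag{4.8} \]
Accordingly one obtains
\[ J\pi_\ell(L^\infty(\mathfrak{h}))J = \pi_\ell(L^\infty(\mathfrak{h}))' = \pi_r(L^\infty(\mathfrak{h})), \quad J\pi_r(L^\infty(\mathfrak{h}))J = \pi_r(L^\infty(\mathfrak{h}))' = \pi_\ell(L^\infty(\mathfrak{h})), \tag{4.9} \]
where $\pi_\#(L^\infty(\mathfrak{h}))' = \{a \in L^\infty(L^2(\mathfrak{h})) \mid ab = ba,\ \forall\, b \in \pi_\#(L^\infty(\mathfrak{h}))\}$ ($\# = \ell$ or $r$) is the commutant of $\pi_\#(L^\infty(\mathfrak{h}))$.

Let
\[ \pi_\ell(L^\infty(\mathfrak{h})) \cdot \pi_r(L^\infty(\mathfrak{h})) = \mathrm{Lin}\{\pi_\ell(a)\pi_r(b) \in L^\infty(L^2(\mathfrak{h})) \mid a, b \in L^\infty(\mathfrak{h})\}, \tag{4.10} \]
where $\mathrm{Lin}\{\cdots\}$ is the linear span of the set $\{\cdots\}$.

Proposition 4.3. One has $L^\infty(L^2(\mathfrak{h})) = [\pi_\ell(L^\infty(\mathfrak{h})) \cdot \pi_r(L^\infty(\mathfrak{h}))]^{-w} \cap L^\infty(L^2(\mathfrak{h}))$, where $-w$ means the closure in the weak topology.

Proof. We will show $L^\infty(L^2(\mathfrak{h})) \subseteq [\pi_\ell(L^\infty(\mathfrak{h})) \cdot \pi_r(L^\infty(\mathfrak{h}))]^{-w} \cap L^\infty(L^2(\mathfrak{h}))$. The converse inclusion is trivial. Let $P$ be a one-dimensional orthogonal projection on $L^2(\mathfrak{h})$. Then there exist orthogonal projections $p$ and $q$ on $\mathfrak{h}$ so that $P = \pi_\ell(p)\pi_r(q)$. Hence $P \in \pi_\ell(L^\infty(\mathfrak{h})) \cdot \pi_r(L^\infty(\mathfrak{h}))$. From this it follows that any orthogonal projection on $L^2(\mathfrak{h})$ is in $[\pi_\ell(L^\infty(\mathfrak{h})) \cdot \pi_r(L^\infty(\mathfrak{h}))]^{-w} \cap L^\infty(L^2(\mathfrak{h}))$. Hence, by the spectral theorem, any bounded self-adjoint operator on $L^2(\mathfrak{h})$ belongs to $[\pi_\ell(L^\infty(\mathfrak{h})) \cdot \pi_r(L^\infty(\mathfrak{h}))]^{-w} \cap L^\infty(L^2(\mathfrak{h}))$. Since any element in $L^\infty(L^2(\mathfrak{h}))$ can be written as a linear combination of self-adjoint operators, one concludes the assertion.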
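Corollary 4.4. Let $H \in L^\infty(L^2(\mathfrak{h}))$ be self-adjoint. Then $H$ has the following form:
\[ H = H_0 - V, \tag{4.11} \]
\[ H_0 = \pi_\ell(A) + \pi_r(B), \tag{4.12} \]
and $V$ is a weak limit of operators of the form
\[ \sum_{j=1}^{M} \pi_\ell(C_j)\pi_r(D_j), \tag{4.13} \]
where $A, B, C_j, D_j$ are self-adjoint on $\mathfrak{h}$ so that $C_j \notin \mathbb{C}1$ and $D_j \notin \mathbb{C}1$.

Remark 4.5. In general, $V$ is a weak limit of operators of the form
\[ \sum_{j=1}^{M} [\pi_\ell(C_j)\pi_r(D_j^*) + \pi_\ell(C_j^*)\pi_r(D_j)] \tag{4.14} \]
with non-self-adjoint $C_j, D_j$. However this expression is essentially equivalent to (4.13) because
\[ \sum_{j=1}^{M} [\pi_\ell(C_j)\pi_r(D_j^*) + \pi_\ell(C_j^*)\pi_r(D_j)] = \frac{1}{2} \sum_{j=1}^{M} [\pi_\ell(C_j + C_j^*)\pi_r(D_j + D_j^*) + \pi_\ell(i(C_j - C_j^*))\pi_r(i(D_j - D_j^*))]. \tag{4.15} \]
Thus all arguments about $V$ of the form (4.14) can be reduced to the case of (4.13).

4.1.2. A natural identification

Let $\{x_n\}_{n \in \mathbb{N}}$ be a CONS of $\mathfrak{h}$ and let $\vartheta$ be an involution on $\mathfrak{h}$. Then each element $\xi$ in $L^2(\mathfrak{h})$ has the following expression
\[ \xi = \sum_{m,n \in \mathbb{N}} \xi_{m,n} |x_m\rangle\langle x_n|, \quad \xi_{m,n} = \langle x_m, \xi x_n \rangle, \tag{4.16} \]
where $|x\rangle\langle y|$ is defined by $|x\rangle\langle y| z = \langle y, z \rangle x$ for all $x, y, z \in \mathfrak{h}$. Now we define a map $\Phi_\vartheta : L^2(\mathfrak{h}) \to \mathfrak{h} \otimes \mathfrak{h}$ by
\[ \Phi_\vartheta(\xi) = \sum_{m,n} \xi_{m,n}\, x_m \otimes \vartheta x_n. \tag{4.17} \]
Then $\Phi_\vartheta(\xi)$ is independent of the choice of $\{x_n\}_n$. The following proposition can be easily shown.

Proposition 4.6. $\Phi_\vartheta$ has the following properties:
(i) $\Phi_\vartheta$ is linear: $\Phi_\vartheta(a\xi + b\eta) = a\Phi_\vartheta(\xi) + b\Phi_\vartheta(\eta)$ for any $\xi, \eta \in L^2(\mathfrak{h})$ and $a, b \in \mathbb{C}$.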
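(ii) $\Phi_\vartheta$ is an isometric isomorphism, namely, $\langle \Phi_\vartheta(\xi), \Phi_\vartheta(\eta) \rangle_{\mathfrak{h} \otimes \mathfrak{h}} = \langle \xi, \eta \rangle_{L^2}$ and, for any $f \in \mathfrak{h} \otimes \mathfrak{h}$, there exists a $\xi \in L^2(\mathfrak{h})$ such that $f = \Phi_\vartheta(\xi)$.

Proposition 4.7. Let $A \in L^\infty(\mathfrak{h})$. Then one has the following:
(i) $\Phi_\vartheta \pi_\ell(A) \Phi_\vartheta^{-1} = A \otimes 1$.
(ii) $\Phi_\vartheta \pi_r(\vartheta A^* \vartheta) \Phi_\vartheta^{-1} = 1 \otimes A$.

Proof. A direct computation.

Remark 4.8. In concrete applications to physics, the choice of the involution $\vartheta$ depends heavily on the problem under consideration. If we choose an unsuitable $\vartheta$, then we will never find a proper self-dual cone which governs the phenomenon.

4.2. Definition of canonical self-dual cone in $L^2(\mathfrak{h})$

Let $L^2(\mathfrak{h})_+ = \{\xi \in L^2(\mathfrak{h}) \mid \xi \ge 0\}$. Then $L^2(\mathfrak{h})_+$ is self-dual, namely, $L^2(\mathfrak{h})_+ = \{\xi \in L^2(\mathfrak{h}) \mid \langle \xi, \eta \rangle_{L^2} \ge 0,\ \forall\, \eta \in L^2(\mathfrak{h})_+\}$. Furthermore one immediately concludes the following:
(i) $L^2(\mathfrak{h})_+ \cap (-L^2(\mathfrak{h})_+) = \{0\}$.
(ii) Let $J$ be given by (4.6). Each element $\xi \in L^2(\mathfrak{h})$ with $J\xi = \xi$ has a unique decomposition $\xi = \xi_+ - \xi_-$ such that $\xi_\pm \in L^2(\mathfrak{h})_+$ and $\langle \xi_+, \xi_- \rangle_{L^2} = 0$. Moreover $\xi_\pm$ can be expressed as $\xi_+ = \int_{\mathbb{R}} \max\{\lambda, 0\}\, dE_\xi(\lambda)$ and $\xi_- = -\int_{\mathbb{R}} \min\{\lambda, 0\}\, dE_\xi(\lambda)$, where $E_\xi(\cdot)$ is the spectral measure of $\xi$.
(iii) $L^2(\mathfrak{h})$ is linearly spanned by $L^2(\mathfrak{h})_+$.

Remark 4.9. Some of the results in this subsection are special cases of the pioneering work by Gross [22]. However, in order to make this note self-contained, we start from the very beginning. We also remark that many of the results in this note can be extended to the more general framework established in [22].

Proposition 4.10. Let $A \in L^\infty(\mathfrak{h})$. Then one has, for each $\xi \in L^2(\mathfrak{h})$ with $J\xi = \xi$,
\[ \langle |\xi|, \pi_\ell(A)|\xi| \rangle_{L^2} = \langle \xi, \pi_\ell(A)\xi \rangle_{L^2}, \quad \langle |\xi|, \pi_r(A)|\xi| \rangle_{L^2} = \langle \xi, \pi_r(A)\xi \rangle_{L^2}, \tag{4.18} \]
where $|\xi| = \xi_+ + \xi_-$.

Proof. Observe that, by the cyclicity of the trace,
\[ \langle |\xi|, \pi_\ell(A)|\xi| \rangle_{L^2} = \mathrm{tr}\, |\xi| A |\xi| = \mathrm{tr}\, A|\xi|^2 = \mathrm{tr}\, A\xi\xi = \mathrm{tr}\, \xi A \xi = \langle \xi, \pi_\ell(A)\xi \rangle_{L^2}. \tag{4.19} \]
Similarly we see that $\langle |\xi|, \pi_r(A)|\xi| \rangle_{L^2} = \langle \xi, \pi_r(A)\xi \rangle_{L^2}$.

Proposition 4.11. Let $C \in L^\infty(\mathfrak{h})$. Then one has
\[ \pi_\ell(C^*)\pi_r(C) \unrhd 0 \quad \text{with respect to } L^2(\mathfrak{h})_+. \tag{4.20} \]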
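Proof. For each $\xi \in L^2(\mathfrak{h})_+$, we see that $\pi_\ell(C^*)\pi_r(C)\xi = C^* \xi C \ge 0$. Hence we have the result.

Proposition 4.12. Let $A \in L^\infty(\mathfrak{h})$ be self-adjoint. Then one has
\[ e^{-t(\pi_\ell(A) + \pi_r(A))} \unrhd 0 \tag{4.21} \]
with respect to $L^2(\mathfrak{h})_+$ for all $t \in \mathbb{R}$.

Proof. Note that $e^{-t(\pi_\ell(A) + \pi_r(A))} = \pi_\ell(e^{-tA})\pi_r(e^{-tA})$. Then, applying Proposition 4.11, we have the desired assertion.

Let $P$ be an orthogonal projection acting in $\mathfrak{h}$. Then $\tau(P) = \pi_\ell(P)\pi_r(P)$ is an orthogonal projection and $\tau(P)L^2(\mathfrak{h}) = L^2(\mathrm{ran}(P))$. Moreover
\[ L^2(\mathrm{ran}(P))_+ = \tau(P) L^2(\mathfrak{h})_+. \tag{4.22} \]
The following claim is easily proven.

Proposition 4.13. Let $A$ be a bounded operator acting in $L^2(\mathfrak{h})$. Assume that $A$ commutes with $\tau(P)$. Now set $A_P = A \upharpoonright L^2(\mathrm{ran}(P))$, the restriction of $A$ to $L^2(\mathrm{ran}(P))$. If $A \unrhd 0$ with respect to $L^2(\mathfrak{h})_+$, then $A_P \unrhd 0$ with respect to $L^2(\mathrm{ran}(P))_+$.

4.3. The Lieb cone

The Lieb cone $(\mathfrak{h} \otimes \mathfrak{h})_+$ is defined by
\[ (\mathfrak{h} \otimes \mathfrak{h})_+ = \Phi_\vartheta(L^2(\mathfrak{h})_+). \tag{4.23} \]
Then $(\mathfrak{h} \otimes \mathfrak{h})_+$ is a self-dual cone as well. [Proof: Let $(\mathfrak{h} \otimes \mathfrak{h})_+^\dagger = \{x \in \mathfrak{h} \otimes \mathfrak{h} \mid \langle x, y \rangle \ge 0\ \forall\, y \in (\mathfrak{h} \otimes \mathfrak{h})_+\}$. The inclusion $(\mathfrak{h} \otimes \mathfrak{h})_+^\dagger \supseteq (\mathfrak{h} \otimes \mathfrak{h})_+$ is trivial. To show the converse, we will use the self-duality of $L^2(\mathfrak{h})_+$. Any element $f \in \mathfrak{h} \otimes \mathfrak{h}$ is expressed as $f = \sum_{m,n} \eta_{m,n}\, x_m \otimes \vartheta x_n = \Phi_\vartheta(\eta)$ with some $\eta \in L^2(\mathfrak{h})$. Hence if $f = \Phi_\vartheta(\eta)$ is in $(\mathfrak{h} \otimes \mathfrak{h})_+^\dagger$, then $\langle \Phi_\vartheta(\eta), \Phi_\vartheta(\xi) \rangle = \langle \eta, \xi \rangle_{L^2} \ge 0$ for each $\xi \in L^2(\mathfrak{h})_+$. Then, by the self-duality of $L^2(\mathfrak{h})_+$, $\eta$ must be in $L^2(\mathfrak{h})_+$.] Moreover one sees
\[ \Phi_\vartheta(\xi)_\pm = \Phi_\vartheta(\xi_\pm), \quad |\Phi_\vartheta(\xi)|_{(\mathfrak{h} \otimes \mathfrak{h})_+} = \Phi_\vartheta(|\xi|) \tag{4.24} \]
and
\[ \Phi_\vartheta(\xi) > 0 \text{ with respect to } (\mathfrak{h} \otimes \mathfrak{h})_+ \iff \xi > 0. \tag{4.25} \]

Remark 4.14. In this note, we will mainly use the canonical cone $L^2(\mathfrak{h})_+$ in the later sections. But it would be useful for the readers to study the Lieb cone here. Although the Lieb cone was not emphasized clearly in the original paper [32], the fundamental idea was already used in that work.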
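Proposition 4.15. One has the following:
(i) Let $A \in L^\infty(\mathfrak{h})$. If $A$ is self-adjoint, then one has $e^{-t(A \otimes 1 + 1 \otimes \vartheta A \vartheta)} \unrhd 0$ with respect to $(\mathfrak{h} \otimes \mathfrak{h})_+$.
(ii) Let $C \in L^\infty(\mathfrak{h})$. Then one has $C \otimes (\vartheta C \vartheta) \unrhd 0$ with respect to $(\mathfrak{h} \otimes \mathfrak{h})_+$.

4.4. Reflection positivity and positivity with respect to $L^2(\mathfrak{h})_+$

4.4.1. Reflection positivity

For each $A \in L^\infty(\mathfrak{h})$, let us define
\[ \tau(A) = \pi_\ell(A)\pi_r(A^*). \tag{4.26} \]
If $A \in L^1(\mathfrak{h})$, then $\tau(A) \in L^1(L^2(\mathfrak{h}))$ and
\[ \mathrm{tr}_{L^2}[\tau(A)] = \mathrm{tr}_{L^2}[\pi_\ell(A)\pi_r(A^*)] = |\mathrm{tr}\, A|^2 \ge 0, \tag{4.27} \]
where $\mathrm{tr}_{L^2}$ means the trace on $L^2(\mathfrak{h})$ (see footnote d). In the representation on $\mathfrak{h} \otimes \mathfrak{h}$, (4.27) is expressed as
\[ \mathrm{tr}_{\mathfrak{h} \otimes \mathfrak{h}}[\Phi_\vartheta \tau(A) \Phi_\vartheta^{-1}] = \mathrm{tr}_{\mathfrak{h} \otimes \mathfrak{h}}[A \otimes \vartheta A \vartheta] = |\mathrm{tr}\, A|^2 \ge 0. \tag{4.28} \]
This property is often referred to as reflection positivity. Let $R(L^2(\mathfrak{h}))$ be the closure, under the weak topology on $L^2(\mathfrak{h})$, of the following set of conical combinations:
\[ \Big\{ \sum_{j=1}^{N} \alpha_j \tau(A_j) \,\Big|\, A_j \in L^\infty(\mathfrak{h}),\ \alpha_j \in \mathbb{R}_+,\ j = 1, \dots, N,\ N \in \mathbb{N} \Big\}, \tag{4.29} \]
where $\mathbb{R}_+ = \{r \in \mathbb{R} \mid r \ge 0\}$.

Definition 4.16. In general, we say that $A$ is reflection positive with respect to $L^2(\mathfrak{h})_+$ if $A \in R(L^2(\mathfrak{h}))$ [8, 18]. If $A \in R(L^2(\mathfrak{h}))$, then we write $A \succeq 0$ with respect to $L^2(\mathfrak{h})_+$. Similarly if $A - B \in R(L^2(\mathfrak{h}))$, then we write $A \succeq B$ with respect to $L^2(\mathfrak{h})_+$.

Remark 4.17. Reflection positivity was introduced by Osterwalder and Schrader in quantum field theory [43]. Later its importance for phase transitions was discovered in [1, 8, 16–19].

4.4.2. General properties

Proposition 4.18. If $A \succeq 0$ with respect to $L^2(\mathfrak{h})_+$, then $A \unrhd 0$ with respect to $L^2(\mathfrak{h})_+$.

Proof. This immediately follows from Proposition 4.11.

Footnote d: For a Hilbert space $\mathfrak{h}$, $L^1(\mathfrak{h})$ is defined by $L^1(\mathfrak{h}) = \{A \in L^\infty(\mathfrak{h}) \mid \mathrm{tr}|A| < \infty\}$.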
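The trace identity (4.27) can be checked by hand in a finite-dimensional toy case. The sketch below is only an illustration under the assumption $\mathfrak{h} = \mathbb{C}^2$: it uses the matrix units $|x_m\rangle\langle x_n|$ as a CONS of $L^2(\mathfrak{h})$ and sums the diagonal matrix elements of $\tau(A)\xi = A\xi A^*$. The function names are hypothetical helpers, not notation from the text.

```python
# Toy check (assumption: h = C^2) of tr_{L^2}[pi_l(A) pi_r(A^*)] = |tr A|^2.

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def hs_inner(x, y):
    """Hilbert-Schmidt inner product <x, y> = tr(x^* y)."""
    n = len(x)
    return sum(x[k][i].conjugate() * y[k][i] for i in range(n) for k in range(n))

def unit(m, n, dim):
    """Matrix unit |x_m><x_n| in the standard basis."""
    return [[1 if (i, j) == (m, n) else 0 for j in range(dim)] for i in range(dim)]

def trace_tau(a):
    """Trace on L^2(h) of tau(A): xi -> A xi A^*, summed over matrix units."""
    dim = len(a)
    total = 0
    for m in range(dim):
        for n in range(dim):
            e = unit(m, n, dim)
            total += hs_inner(e, matmul(matmul(a, e), dagger(a)))
    return total

A = [[1 + 2j, 3], [-1j, 2 - 1j]]
lhs = trace_tau(A)
tr_a = A[0][0] + A[1][1]
rhs = (tr_a * tr_a.conjugate()).real   # |tr A|^2

print(lhs, rhs)
```

Each term of the sum equals $A_{mm}\overline{A_{nn}}$, so the double sum factorizes into $(\mathrm{tr}\, A)\overline{(\mathrm{tr}\, A)}$, which is the mechanism behind (4.27).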
Remark 4.19. Note that the binary relation “$\succeq$” is not a partial order. More precisely, we can easily check the reflexivity and the antisymmetry; however, the transitivity fails.

Proposition 4.20. $R(L^2(\mathfrak{h}))$ is a multiplicative convex cone. Namely one has the following:
(i) If $A \succeq 0$, $B \succeq 0$ with respect to $L^2(\mathfrak{h})_+$ and $a, b \in \mathbb{R}_+$, then $aA + bB \succeq 0$ with respect to $L^2(\mathfrak{h})_+$.
(ii) If $A \succeq 0$, $B \succeq 0$ with respect to $L^2(\mathfrak{h})_+$, then $AB \succeq 0$ and $BA \succeq 0$ with respect to $L^2(\mathfrak{h})_+$.

Proof. (i) is trivial. To show (ii), we just remark that $\pi_\ell(A)\pi_r(A^*)\pi_\ell(B)\pi_r(B^*) = \pi_\ell(AB)\pi_r(B^* A^*) = \pi_\ell(AB)\pi_r((AB)^*)$.

Proposition 4.21. If $A \succeq 0$ with respect to $L^2(\mathfrak{h})_+$ and $A \in L^1(L^2(\mathfrak{h}))$, then $\mathrm{tr}_{L^2} A \ge 0$.

Remark 4.22. Notice the differences between Theorem 2.8 and Proposition 4.21. In the case of reflection positivity, the assumption (C) is unnecessary to show Proposition 4.21.

Proof. Fix a CONS $\{x_n\}_{n \in \mathbb{N}}$ in $\mathfrak{h}$ arbitrarily and set $\xi_{m,n} = |x_m\rangle\langle x_n|$. Then $\{\xi_{m,n}\}_{m,n \in \mathbb{N}}$ is a CONS of $L^2(\mathfrak{h})$. For each $N \in \mathbb{N}$, define an orthogonal projection $P_N$ by $P_N = \sum_{m,n=1}^{N} |\xi_{m,n}\rangle\langle \xi_{m,n}|$. Then we see $\lim_{N \to \infty} \mathrm{tr}_{L^2}[P_N A P_N] = \mathrm{tr}_{L^2} A$. By the definition, for each $A \in R(L^2(\mathfrak{h}))$, there exists a sequence $\{A_n\}_{n \in \mathbb{N}}$ of operators such that $\text{w-}\lim_{n \to \infty} A_n = A$ and each $A_n$ belongs to the set given by (4.29). Then, since $\mathrm{tr}_{L^2}[P_N A_n P_N] \ge 0$ for all $n, N \in \mathbb{N}$, we have $\mathrm{tr}_{L^2}[P_N A P_N] = \lim_{n \to \infty} \mathrm{tr}_{L^2}[P_N A_n P_N] \ge 0$ for any $N \in \mathbb{N}$. Next, taking $N \to \infty$, one obtains $\mathrm{tr}_{L^2} A = \lim_{N \to \infty} \mathrm{tr}_{L^2}[P_N A P_N] \ge 0$.

5. Self-Dual Cone Analysis in $L^2(\mathfrak{h})$

$L^2(\mathfrak{h})_+$ is a special example of a self-dual cone. Although its definition is simple, it carries rich mathematical structures, as we saw in Sec. 4. In this section, we present an analysis of operators on $L^2(\mathfrak{h})$ by making good use of these structures. Not only the results but also the flow of the arguments in this section is important for concrete applications to physics and will appear repeatedly in Secs. 7 and 8.

5.1. Derivation of the attractive interaction by the DLS inequality

5.1.1. Structures of general interactions

In this section, we will study a self-adjoint operator $H_V \in L^\infty(L^2(\mathfrak{h}))$ of the form
\[ H_V = H_0 - V, \tag{5.1} \]
\[ H_0 = \pi_\ell(A) + \pi_r(B), \tag{5.2} \]
\[ V = \sum_{j=1}^{M} \pi_\ell(C_j)\pi_r(D_j) = \pi_\ell(\mathbf{C}) \cdot \pi_r(\mathbf{D}), \tag{5.3} \]
where $A$, $B$, $\mathbf{C} = \{C_j\}$, $\mathbf{D} = \{D_j\}$ are self-adjoint on $\mathfrak{h}$. (Be careful about the sign in front of $V$.) $H_0$ and $-V$ are called the free Hamiltonian and the interaction, respectively.

Remark 5.1. We can also treat general $H_V$ in $L^\infty(L^2(\mathfrak{h}))$. In this case, by Corollary 4.4, $V$ is a weak limit of operators of the form (5.3). Then all arguments in this section can be extended to this general $H_V$. However, to avoid unnecessary complexity, we restrict our attention to the simple $H_V$ given by (5.1)–(5.3).

First of all, we clarify some structures of $V$.

Proposition 5.2 (Structures of $V$). Although $V$ is a self-adjoint operator acting in the Hilbert space $L^2(\mathfrak{h})$, it contains both $J$-real and $J$-imaginary parts. Namely one has the following:
(i) $V$ can be written as
\[ V = V_R + iV_I \tag{5.4} \]
with
\[ V_R = \frac{1}{2}(V + JVJ), \quad V_I = \frac{1}{2i}(V - JVJ). \tag{5.5} \]
Both $V_R$ and $V_I$ are $J$-real in the sense that $JV_R = V_R J$, $JV_I = V_I J$.
(ii) The $J$-real part $V_R$ can be expressed as $V_R = V_{R,+} - V_{R,-}$ with
\[ V_{R,+} = \frac{1}{4} \pi_\ell(\mathbf{C} + \mathbf{D}) \cdot \pi_r(\mathbf{C} + \mathbf{D}), \quad V_{R,-} = \frac{1}{4} \pi_\ell(\mathbf{C} - \mathbf{D}) \cdot \pi_r(\mathbf{C} - \mathbf{D}). \]
$-V_{R,+}$ is the attractive part of $-V$ and $V_{R,-}$ is the repulsive part of $-V$, namely, $-V_{R,+} \preceq 0$ and $V_{R,-} \succeq 0$ with respect to $L^2(\mathfrak{h})_+$. In particular, if $C_j = \alpha_j D_j$ with $\alpha_j > 0$ for all $j$, then $-V$ is purely attractive in the sense that $-V \preceq 0$ with respect to $L^2(\mathfrak{h})_+$.
(iii) The $J$-imaginary part $V_I$ can be expressed as $V_I = V_{I,+} - V_{I,-}$ with
\[ V_{I,+} = \frac{1}{4} \pi_\ell(\mathbf{C} + i\mathbf{D}) \cdot \pi_r(\mathbf{C} - i\mathbf{D}), \quad V_{I,-} = \frac{1}{4} \pi_\ell(\mathbf{C} - i\mathbf{D}) \cdot \pi_r(\mathbf{C} + i\mathbf{D}). \]
$V_I$ is an anti-self-adjoint operator acting in $L^2(\mathfrak{h})$, namely, one has $(V_I)^* = -V_I$. In particular, if $C_j = \alpha_j D_j$ with $\alpha_j > 0$ for all $j$, then the $J$-imaginary part vanishes: $V_I = 0$.

Corollary 5.3. If $C_j = \alpha_j D_j$ with $\alpha_j > 0$ for all $j$, then $-V$ is attractive, i.e. $-V \preceq 0$ with respect to $L^2(\mathfrak{h})_+$.
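The decompositions in Proposition 5.2 can be verified by a short computation. The following sketch takes $M = 1$ for brevity and writes $V_R$, $V_I$ for the $J$-real and $J$-imaginary parts of $V = \pi_\ell(C)\pi_r(D)$ (these labels are chosen here for readability). It uses only $J\pi_\ell(a)J = \pi_r(a^*)$ from Proposition 4.2, the self-adjointness of $C$ and $D$, and the fact that left and right multiplications commute.

```latex
\begin{align*}
JVJ &= J\pi_\ell(C)J \, J\pi_r(D)J = \pi_r(C)\pi_\ell(D) = \pi_\ell(D)\pi_r(C), \\
V_R &= \tfrac{1}{2}(V + JVJ)
     = \tfrac{1}{2}\bigl[\pi_\ell(C)\pi_r(D) + \pi_\ell(D)\pi_r(C)\bigr] \\
    &= \tfrac{1}{4}\pi_\ell(C + D)\pi_r(C + D) - \tfrac{1}{4}\pi_\ell(C - D)\pi_r(C - D), \\
V_I &= \tfrac{1}{2i}(V - JVJ)
     = \tfrac{1}{2i}\bigl[\pi_\ell(C)\pi_r(D) - \pi_\ell(D)\pi_r(C)\bigr] \\
    &= \tfrac{1}{4}\pi_\ell(C + iD)\pi_r(C - iD) - \tfrac{1}{4}\pi_\ell(C - iD)\pi_r(C + iD).
\end{align*}
```

If $C = \alpha D$ with $\alpha > 0$, the two terms in the bracket for $V_I$ coincide, so $V_I = 0$, and $V = \alpha\, \pi_\ell(D)\pi_r(D)$ is of the form $\pi_\ell(X)\pi_r(X^*)$ with $X = \sqrt{\alpha}\, D$.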
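5.1.2. Dyson–Lieb–Simon inequality

Keeping the above results in mind, we recall the following well known inequality by Dyson et al. [8]. To display the dependence on $A, B, \mathbf{C}, \mathbf{D}$, we denote $H_V$ by $H(A, B; \mathbf{C}, \mathbf{D})$.

Theorem 5.4 (Dyson–Lieb–Simon Inequality). For any $E, F \in L^\infty(\mathfrak{h})$, one has
\[ |\mathrm{tr}_{L^2}[\pi_\ell(E)\pi_r(F^*) e^{-\beta H(A,B;\mathbf{C},\mathbf{D})}]|^2 \le \mathrm{tr}_{L^2}[\pi_\ell(E)\pi_r(E^*) e^{-\beta H(A,A;\mathbf{C},\mathbf{C})}]\ \mathrm{tr}_{L^2}[\pi_\ell(F)\pi_r(F^*) e^{-\beta H(B,B;\mathbf{D},\mathbf{D})}], \tag{5.6} \]
where $\mathrm{tr}_{L^2}$ is the trace on $L^2(\mathfrak{h})$.

Remark 5.5. (i) Note that all interactions on the right-hand side of (5.6) are attractive with respect to $L^2(\mathfrak{h})_+$, whereas the interaction on the left-hand side contains not only a repulsive part but also an imaginary part. Hence the DLS inequality tells us that the attractive interaction makes the (free) energy lower. (ii) The inequality in the theorem still holds true for the interaction given by (4.14), see Remark 4.5.

Proof. It suffices to show the assertions when $\dim \mathfrak{h} < \infty$. By the Duhamel formula,
\[ e^{-\beta H(A,B;\mathbf{C},\mathbf{D})} = \sum_{N \ge 0} D_{N,\beta}(A, B; \mathbf{C}, \mathbf{D}), \tag{5.7} \]
\[ D_{N,\beta}(A, B; \mathbf{C}, \mathbf{D}) = \int_{S_N(\beta)} e^{-t_1 H_0} V e^{-t_2 H_0} \cdots e^{-t_N H_0} V e^{-(\beta - \sum_{j=1}^{N} t_j) H_0}, \tag{5.8} \]
where $\int_{S_N(\beta)} = \int_0^{\beta} dt_1 \int_0^{\beta - t_1} dt_2 \cdots \int_0^{\beta - \sum_{j=1}^{N-1} t_j} dt_N$. Observe that
\[ D_{N,\beta}(A, B; \mathbf{C}, \mathbf{D}) = \sum_{k_1, \dots, k_N \ge 1} \int_{S_N(\beta)} \pi_\ell[L_{A;\mathbf{C}}(k^{(N)}; t^{(N)})]\, \pi_r[L_{B;\mathbf{D}}(k^{(N)}; t^{(N)})^*], \tag{5.9} \]
where $k^{(N)} = (k_1, \dots, k_N) \in \mathbb{N}^N$, $t^{(N)} = (t_1, \dots, t_N) \in \mathbb{R}_+^N$ and
\[ L_{X;\mathbf{Y}}(k^{(N)}; t^{(N)}) = e^{-t_1 X} Y_{k_1} e^{-t_2 X} \cdots e^{-t_N X} Y_{k_N} e^{-(\beta - \sum_{j=1}^{N} t_j) X} \tag{5.10} \]
with $\mathbf{Y} = \{Y_j\}_j$. By a direct computation, one sees
\[ \mathrm{tr}_{L^2}[\pi_\ell(E)\pi_r(F^*) D_{N,\beta}(A, B; \mathbf{C}, \mathbf{D})] = \sum_{k_1, \dots, k_N \ge 1} \int_{S_N(\beta)} \{\mathrm{tr}[E L_{A;\mathbf{C}}(k^{(N)}; t^{(N)})]\} \times \{\mathrm{tr}[F L_{B;\mathbf{D}}(k^{(N)}; t^{(N)})]\}^*. \tag{5.11} \]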
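Thus, introducing the inner product
\[ \langle F, G \rangle_{N,\beta} = \sum_{k_1, \dots, k_N \ge 1} \int_{S_N(\beta)} F(k^{(N)}; t^{(N)})\, G(k^{(N)}; t^{(N)})^*, \tag{5.12} \]
one obtains
\[ \mathrm{tr}_{L^2}[\pi_\ell(E)\pi_r(F^*) D_{N,\beta}(A, B; \mathbf{C}, \mathbf{D})] = \langle F^{(N)}_{A;\mathbf{C};E}, F^{(N)}_{B;\mathbf{D};F} \rangle_{N,\beta}, \tag{5.13} \]
where
\[ F^{(N)}_{X;\mathbf{Y};Z}(k^{(N)}; t^{(N)}) = \mathrm{tr}[Z L_{X;\mathbf{Y}}(k^{(N)}; t^{(N)})]. \tag{5.14} \]
Then, by the Schwarz inequality,
\[ |\mathrm{tr}_{L^2}[\pi_\ell(E)\pi_r(F^*) e^{-\beta H_V}]|^2 = \Big| \sum_{N \ge 0} \langle F^{(N)}_{A;\mathbf{C};E}, F^{(N)}_{B;\mathbf{D};F} \rangle_{N,\beta} \Big|^2 \le \Big( \sum_{N \ge 0} \|F^{(N)}_{A;\mathbf{C};E}\|^2_{N,\beta} \Big) \Big( \sum_{N \ge 0} \|F^{(N)}_{B;\mathbf{D};F}\|^2_{N,\beta} \Big). \tag{5.15} \]
Finally we remark that
\begin{align*}
\sum_{N \ge 0} \|F^{(N)}_{A;\mathbf{C};E}\|^2_{N,\beta} &= \sum_{N \ge 0} \sum_{k_1, \dots, k_N \ge 1} \int_{S_N(\beta)} |\mathrm{tr}[E L_{A;\mathbf{C}}(k^{(N)}; t^{(N)})]|^2 \\
&= \sum_{N \ge 0} \mathrm{tr}_{L^2}[\pi_\ell(E)\pi_r(E^*) D_{N,\beta}(A, A; \mathbf{C}, \mathbf{C})] \\
&= \mathrm{tr}_{L^2}[\pi_\ell(E)\pi_r(E^*) e^{-\beta H(A,A;\mathbf{C},\mathbf{C})}]. \tag{5.16}
\end{align*}
Combining (5.15) and (5.16), one concludes the assertion.

Corollary 5.6. Let
\[ Z_\beta(A, B; \mathbf{C}, \mathbf{D}) = \mathrm{tr}_{L^2}[e^{-\beta H(A,B;\mathbf{C},\mathbf{D})}]. \tag{5.17} \]
Then one has
\[ Z_\beta(A, B; \mathbf{C}, \mathbf{D})^2 \le Z_\beta(A, A; \mathbf{C}, \mathbf{C})\, Z_\beta(B, B; \mathbf{D}, \mathbf{D}). \tag{5.18} \]
In particular, if $A = B$ and $C_j = D_j$ for all $j$, then equality holds in (5.18). In this case, $V$ is purely attractive, i.e. $V \succeq 0$ with respect to $L^2(\mathfrak{h})_+$.

5.2. Analysis of Hamiltonian with attractive interaction

By the DLS inequality, we see that the partition function $Z_\beta(A, B; \mathbf{C}, \mathbf{D})$ is maximized when $A = B$ and $C_j = D_j$. In this case, the Hamiltonian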
$H_V = H(A, B; \mathbf{C}, \mathbf{D})$ has the form
\[ H_V = H(A, A; \mathbf{C}, \mathbf{C}) = \pi_\ell(A) + \pi_r(A) - \pi_\ell(\mathbf{C}) \cdot \pi_r(\mathbf{C}). \tag{5.19} \]
Hence it is quite natural to concentrate our attention on investigating properties of $H(A, A; \mathbf{C}, \mathbf{C})$. To this end, we recall the following notation:
\[ H_V = H(A, A; \mathbf{C}, \mathbf{C}) = H_0 - V, \tag{5.20} \]
\[ H_0 = \pi_\ell(A) + \pi_r(A), \quad V = \pi_\ell(\mathbf{C}) \cdot \pi_r(\mathbf{C}). \tag{5.21} \]
with respect to L2 (h)+
(5.22)
for all β ≥ 0. In particular trL2 [π (D)πr (D∗ )e−βHV ] ≥ 0.
(5.23)
(ii) (Monotonicity) If V1 V2 0 with respect to L2 (h)+ , one has π (D)πr (D∗ )e−βHV1 π (D)πr (D∗ )e−βHV2
(5.24)
with respect to L2 (h)+ , for any D ∈ L∞ (h) and β ≥ 0. In particular trL2 [π (D)πr (D∗ )e−βHV1 ] ≥ trL2 [π (D)πr (D∗ )e−βHV2 ]. Proof. By the Duhamel formula, we have V DN,β , e−βHV =
(5.25)
(5.26)
N ≥0 V DN,β =
SN (β)
e−t1 H0 V e−t2 H0 · · · e−tN H0 V e−(β−
PN
j=1 tj )H0
,
(5.27)
where we use the notations in the proof of Theorem 3.8. (i) By Propositions 4.11 and 4.12, we immediately see that V 0 π (D)πr (D∗ )DN,β
(5.28)
for all N ∈ N0 and β ∈ R+ . Then, by (5.26), we conclude (i). (ii) If V1 V2 0, then, by Proposition 4.20, we have V1 V2 π (D)πr (D∗ )DN,β 0 π (D)πr (D∗ )DN,β
for all N ≥ 0 and β ∈ R+ . Hence we can conclude (ii).
(5.29)
As mentioned already, the DLS inequality tells us that the attractive interaction with respect to $L^2(\mathfrak{h})_+$ makes the free energy lower. Moreover, by Theorem 5.7, the attractive interaction with respect to $L^2(\mathfrak{h})_+$ induces the positivity of $e^{-\beta H(A,A;\mathbf{C},\mathbf{C})}$ with respect to $L^2(\mathfrak{h})_+$. Furthermore we can describe some facts about the ground states of $H(A, A; \mathbf{C}, \mathbf{C})$.

Theorem 5.8. (i) For all $\xi \in L^2(\mathfrak{h})$ with $J\xi = \xi$, we have
\[ \langle \xi, H_V \xi \rangle_{L^2} \ge \langle |\xi|, H_V |\xi| \rangle_{L^2}. \tag{5.30} \]
(ii) Let $e$ be an eigenvalue of $H_V$ and $\xi \in L^2(\mathfrak{h})$ be a corresponding eigenvector. Then $\xi_R = \frac{1}{2}(1 + J)\xi$ and $\xi_I = \frac{1}{2i}(1 - J)\xi$ are also eigenvectors corresponding to $e$.
(iii) (Positivity of a ground state) Let $e_g = \inf \mathrm{spec}(H_V)$. Assume that $e_g$ is an eigenvalue. If $\xi_g \in L^2(\mathfrak{h})$ is a real eigenvector corresponding to $e_g$, then $|\xi_g|$ is also an eigenvector corresponding to $e_g$, that is, a ground state.

Proof. (i) Apply Theorem 2.12. (iii) directly follows from (i). (ii) is trivial.

With an additional assumption on $V$, we will see the uniqueness of the ground state. This means that, due to the attractive interaction with respect to $L^2(\mathfrak{h})_+$, the ground state entropy of $H_V$ vanishes. This suggests that, in the ground state of $H_V$, an order could appear as a physical phenomenon.

Theorem 5.9. Assume $V \rhd 0$ with respect to $L^2(\mathfrak{h})_+$. Then $e^{-\beta H_V} \rhd 0$ with respect to $L^2(\mathfrak{h})_+$ for all $\beta > 0$. Consequently the ground state of $H_V$ is unique and strictly positive with respect to $L^2(\mathfrak{h})_+$.

Proof. Apply Theorem 3.8.

5.3. Representation in $\mathfrak{h} \otimes \mathfrak{h}$

Let $H_V$ be the Hamiltonian defined by (5.20) and (5.21). Set $\hat{H}_V = \Phi_\vartheta H_V \Phi_\vartheta^{-1}$. Then
\[ \hat{H}_V = A \otimes 1 + 1 \otimes \vartheta A \vartheta - \hat{V} \tag{5.31} \]
with $\hat{V} = \Phi_\vartheta V \Phi_\vartheta^{-1} = \sum_{j=1}^{M} C_j \otimes \vartheta C_j \vartheta$. Translating the theorems in the previous subsection from $L^2(\mathfrak{h})$ into $\mathfrak{h} \otimes \mathfrak{h}$, we have the following.

Corollary 5.10. Let $\hat{H}_V$ be the self-adjoint operator defined by (5.31). Assume that $e^{-\beta \hat{H}_V} \in L^1(\mathfrak{h} \otimes \mathfrak{h})$. Then one has the following:
(i) For any $D \in L^\infty(\mathfrak{h})$, one has
\[ \mathrm{tr}_{\mathfrak{h} \otimes \mathfrak{h}}[D \otimes \vartheta D \vartheta\, e^{-\beta \hat{H}_V}] \ge 0. \tag{5.32} \]
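(ii) If $V_1 \succeq V_2 \succeq 0$, one has
\[ \mathrm{tr}_{\mathfrak{h} \otimes \mathfrak{h}}[D \otimes \vartheta D \vartheta\, e^{-\beta \hat{H}_{V_1}}] \ge \mathrm{tr}_{\mathfrak{h} \otimes \mathfrak{h}}[D \otimes \vartheta D \vartheta\, e^{-\beta \hat{H}_{V_2}}]. \tag{5.33} \]

Corollary 5.11. We have the following:
(i) For all $\xi \in L^2(\mathfrak{h})$ with $J\xi = \xi$, we have
\[ \langle \Phi_\vartheta(\xi), \hat{H}_V \Phi_\vartheta(\xi) \rangle_{\mathfrak{h} \otimes \mathfrak{h}} \ge \langle \Phi_\vartheta(|\xi|), \hat{H}_V \Phi_\vartheta(|\xi|) \rangle_{\mathfrak{h} \otimes \mathfrak{h}}. \tag{5.34} \]
(ii) (Positivity of a ground state) Let $\hat{e}_g = \inf \mathrm{spec}(\hat{H}_V)$. Assume that $\hat{e}_g$ is an eigenvalue. If $\Phi_\vartheta(\xi_g) \in \mathfrak{h} \otimes \mathfrak{h}$ is a real eigenvector corresponding to $\hat{e}_g$, then $|\Phi_\vartheta(\xi_g)|_{(\mathfrak{h} \otimes \mathfrak{h})_+} = \Phi_\vartheta(|\xi_g|)$ is also an eigenvector corresponding to $\hat{e}_g$, that is, a ground state.

6. Self-Dual Cone Analysis in Hilbert Spaces with Tensor Product Structure

6.1. General settings

In this section, we will consider a special class of interacting systems. Namely our Hilbert space has the following tensor product structure
\[ L^2(\mathfrak{h}) \otimes \mathfrak{X}, \tag{6.1} \]
where $\mathfrak{X}$ is a complex Hilbert space. (For instance, a class of electron–phonon interacting systems is described by the abstract theory in this section. In this case, electrons live in $L^2(\mathfrak{h})$ and phonons live in $\mathfrak{X}$.) Suppose the system is described by the Hamiltonian $\mathbf{H}_V$ given by
\[ \mathbf{H}_V = \mathbf{H}_0 - \mathbf{V} \tag{6.2} \]
with
\[ \mathbf{H}_0 = H_{L^2} \otimes 1 + 1 \otimes H_{\mathfrak{X}}, \tag{6.3} \]
\[ H_{L^2} = \pi_\ell(A) + \pi_r(B), \tag{6.4} \]
\[ \mathbf{V} = \sum_{j=1}^{M} \phi_j \otimes \ell_j. \tag{6.5} \]
Both $A, B \in L^\infty(\mathfrak{h})$ are self-adjoint, and each $\phi_j$ is a bounded self-adjoint operator acting in $L^2(\mathfrak{h})$. In addition, $H_{\mathfrak{X}}$ is self-adjoint and positive in $\mathfrak{X}$ and $\ell_j$ is self-adjoint in $\mathfrak{X}$ for all $j$. To guarantee the self-adjointness and semiboundedness of $\mathbf{H}_V$, we assume the following:
(A.1) Each $\ell_j$ is infinitesimally small with respect to $H_{\mathfrak{X}}$ and $\bigcap_{j=1}^{M} \mathrm{dom}(\ell_j)$ is dense in $\mathfrak{X}$.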
To discuss attractive effects of $\mathbf{H}_V$, we will assume the following:
(A.2) There exists a self-dual cone $\mathfrak{X}_+$ in $\mathfrak{X}$ such that
\[ e^{-\beta H_{\mathfrak{X}}} \unrhd 0, \quad \ell_j \unrhd 0 \tag{6.6} \]
with respect to $\mathfrak{X}_+$ for all $\beta \ge 0$ and $j$.

Although most results in this section are variations on the theme of the previous section, we repeat the arguments without hesitation because the results here will be needed in Secs. 7 and 8.

6.2. Choice of a natural self-dual cone

As a first step of our analysis, we have to select a relevant self-dual cone in $L^2(\mathfrak{h}) \otimes \mathfrak{X}$. In this paper, we simply take a tensor product of $L^2(\mathfrak{h})_+$ and $\mathfrak{X}_+$ as follows:
\[ L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+ = \{\varphi \in L^2(\mathfrak{h}) \otimes \mathfrak{X} \mid \langle \varphi, \xi \otimes x \rangle \ge 0\ \forall\, \xi \in L^2(\mathfrak{h})_+\ \forall\, x \in \mathfrak{X}_+\}. \tag{6.7} \]
In many concrete cases, $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$ becomes a self-dual cone. Thus it is natural to assume the following:
(A.3) $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$ is self-dual.

Remark 6.1. A simple example satisfying (A.3) is $L^2(\mathfrak{h})_+ \otimes L^2(M, d\mu)_+$, where $(M, \mu)$ is a $\sigma$-finite measure space and $L^2(M, d\mu)_+ = \{f \in L^2(M, d\mu) \mid f(m) \ge 0\ \mu\text{-a.e.}\}$. Additional examples will appear in Secs. 7 and 8.

Proposition 6.2. Assume (A.1)–(A.3). If $\phi_j \unrhd 0$ with respect to $L^2(\mathfrak{h})_+$ for all $j$, then $\mathbf{V} \unrhd 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$.

Proof. Apply Proposition 2.16.

Instead of (A.3), we will often assume the condition below.
(A.4) There exists a CONS $\{x_n\}_n$ of $\mathfrak{X}$ such that $\{x_n\}_n \subset \mathfrak{X}_+$.

Remark 6.3. (A.4) is a stronger condition than (A.3). Indeed, under (A.4), $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$ is self-dual by Proposition 2.15.

We will observe the following simple case as a motivation.

Lemma 6.4. Assume (A.4). Let $A \in L^1(L^2(\mathfrak{h}))$ and $B \in L^1(\mathfrak{X})$. Assume $A \succeq 0$ with respect to $L^2(\mathfrak{h})_+$ and $B \unrhd 0$ with respect to $\mathfrak{X}_+$. Then one has $\mathrm{tr}_{L^2 \otimes \mathfrak{X}}[A \otimes B] \ge 0$.

Proof. Since $\mathrm{tr}_{\mathfrak{X}} B \ge 0$ holds by (A.4), we obtain
\[ \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[A \otimes B] = (\mathrm{tr}_{L^2} A)(\mathrm{tr}_{\mathfrak{X}} B) \ge 0. \tag{6.8} \]
This completes the proof.
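Motivated by the above lemma, we introduce the following.

Definition 6.5. Let $R(L^2(\mathfrak{h}) \otimes \mathfrak{X})$ be the weak closure of the convex cone generated by the conical combinations of $A \otimes B$ with $A \succeq 0$ with respect to $L^2(\mathfrak{h})_+$ and $B \unrhd 0$ with respect to $\mathfrak{X}_+$. In general, if $X \in R(L^2(\mathfrak{h}) \otimes \mathfrak{X})$, then we write $X \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$. Of course $X \succeq Y$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$ means $X - Y \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$.

Remark 6.6. For later use, we comment briefly on the case of unbounded operators. Let $A \succeq 0$ with respect to $L^2(\mathfrak{h})_+$ and $B \unrhd 0$ with respect to $\mathfrak{X}_+$. Even if $B$ is unbounded, we write $A \otimes B \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$. Furthermore, when $X$ is an unbounded operator given by a conical combination of unbounded $A_j \otimes B_j \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$ ($j = 1, \dots, N$), we still write $X \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$.

Proposition 6.7. Assume (A.4). If $X \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$, then $X \unrhd 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$.

Proof. An easy exercise.

We can check the following in a similar way as in the proof of Proposition 4.20.

Proposition 6.8. Assume (A.3). Then $R(L^2(\mathfrak{h}) \otimes \mathfrak{X})$ is a multiplicative convex cone. Namely one has the following:
(i) If $X \succeq 0$ and $Y \succeq 0$, then $XY \succeq 0$ and $YX \succeq 0$.
(ii) If $X \succeq 0$ and $Y \succeq 0$, then $\alpha X + \beta Y \succeq 0$ for all $\alpha, \beta \in \mathbb{R}_+$.

Proposition 6.9. Assume (A.4). If $X \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$ and $X \in L^1(L^2(\mathfrak{h}) \otimes \mathfrak{X})$, then $\mathrm{tr}_{L^2 \otimes \mathfrak{X}} X \ge 0$.

Proof. The proof is parallel to that of Proposition 4.21.

6.3. Derivation of attractive interaction

In the remainder of this section, we will treat the Hamiltonian $\mathbf{H}_V$ with
\[ \phi_j = \pi_\ell(C_j)\pi_r(D_j), \tag{6.9} \]
where $\mathbf{C} = \{C_j\}_{j=1}^{M}$ and $\mathbf{D} = \{D_j\}_{j=1}^{M}$ are families of bounded self-adjoint operators on $\mathfrak{h}$. In order to display the dependence on $A, B, \mathbf{C}, \mathbf{D}$, we denote $\mathbf{H}_V$ by $\mathbf{H}(A, B; \mathbf{C}, \mathbf{D})$.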
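Theorem 6.10 (Dyson–Lieb–Simon Inequality II). Assume (A.1), (A.2) and (A.4). Then, for any $E, F \in L^\infty(\mathfrak{h})$ and $\beta > 0$, one has
\[ |\mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(E)\pi_r(F^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}(A,B;\mathbf{C},\mathbf{D})}]|^2 \le \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(E)\pi_r(E^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}(A,A;\mathbf{C},\mathbf{C})}] \times \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(F)\pi_r(F^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}(B,B;\mathbf{D},\mathbf{D})}], \tag{6.10} \]
where $\mathrm{tr}_{L^2 \otimes \mathfrak{X}}$ is the trace on $L^2(\mathfrak{h}) \otimes \mathfrak{X}$.

Remark 6.11. (i) As we have observed in Remark 5.5, all interactions $\mathbf{V}$ on the right-hand side of (6.10) are attractive with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$. (ii) Non-self-adjoint $\mathbf{C}, \mathbf{D}$ can also be treated by Remark 4.5.

Proof. We will use the notations in the proof of Theorem 5.4. By the Duhamel formula,
\[ e^{-\beta \mathbf{H}_V} = \sum_{N \ge 0} \mathbf{D}_{N,\beta}(A, B; \mathbf{C}, \mathbf{D}), \tag{6.11} \]
\[ \mathbf{D}_{N,\beta}(A, B; \mathbf{C}, \mathbf{D}) = \int_{S_N(\beta)} e^{-t_1 \mathbf{H}_0} \mathbf{V} e^{-t_2 \mathbf{H}_0} \cdots e^{-t_N \mathbf{H}_0} \mathbf{V} e^{-(\beta - \sum_{j=1}^{N} t_j) \mathbf{H}_0}. \tag{6.12} \]
Then we have
\[ \mathbf{D}_{N,\beta}(A, B; \mathbf{C}, \mathbf{D}) = \sum_{k_1, \dots, k_N \ge 1} \int_{S_N(\beta)} \pi_\ell[L_{A;\mathbf{C}}(k^{(N)}; t^{(N)})]\, \pi_r[L_{B;\mathbf{D}}(k^{(N)}; t^{(N)})^*] \otimes L_{H_{\mathfrak{X}};\boldsymbol{\ell}}(k^{(N)}; t^{(N)}), \tag{6.13} \]
where $\boldsymbol{\ell} = \{\ell_j\}_{j=1}^{M}$. Hence one sees
\[ \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(E)\pi_r(F^*) \otimes 1_{\mathfrak{X}}\, \mathbf{D}_{N,\beta}(A, B; \mathbf{C}, \mathbf{D})] = \sum_{k_1, \dots, k_N \ge 1} \int_{S_N(\beta)} \{\mathrm{tr}[E L_{A;\mathbf{C}}(k^{(N)}; t^{(N)})]\} \times \{\mathrm{tr}[F L_{B;\mathbf{D}}(k^{(N)}; t^{(N)})]\}^* \times \{\mathrm{tr}_{\mathfrak{X}}[L_{H_{\mathfrak{X}};\boldsymbol{\ell}}(k^{(N)}; t^{(N)})]\}. \tag{6.14} \]
Note that, by the assumptions (A.2) and (A.4),
\[ \Gamma_N(k^{(N)}; t^{(N)}) = \mathrm{tr}_{\mathfrak{X}}[L_{H_{\mathfrak{X}};\boldsymbol{\ell}}(k^{(N)}; t^{(N)})] \ge 0. \tag{6.15} \]
Using the inner product (5.12), we have
\[ \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(E)\pi_r(F^*) \otimes 1_{\mathfrak{X}}\, \mathbf{D}_{N,\beta}(A, B; \mathbf{C}, \mathbf{D})] = \langle \Gamma_N^{1/2} F^{(N)}_{A;\mathbf{C};E}, \Gamma_N^{1/2} F^{(N)}_{B;\mathbf{D};F} \rangle_{N,\beta}. \]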
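By the Schwarz inequality,
\begin{align*}
|\mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(E)\pi_r(F^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}_V}]|^2 &= \Big| \sum_{N \ge 0} \langle \Gamma_N^{1/2} F^{(N)}_{A;\mathbf{C};E}, \Gamma_N^{1/2} F^{(N)}_{B;\mathbf{D};F} \rangle_{N,\beta} \Big|^2 \\
&\le \Big( \sum_{N \ge 0} \|\Gamma_N^{1/2} F^{(N)}_{A;\mathbf{C};E}\|^2_{N,\beta} \Big) \Big( \sum_{N \ge 0} \|\Gamma_N^{1/2} F^{(N)}_{B;\mathbf{D};F}\|^2_{N,\beta} \Big). \tag{6.16}
\end{align*}
Finally we remark that
\begin{align*}
\sum_{N \ge 0} \|\Gamma_N^{1/2} F^{(N)}_{A;\mathbf{C};E}\|^2_{N,\beta} &= \sum_{N \ge 0} \sum_{k_1, \dots, k_N \ge 1} \int_{S_N(\beta)} |\mathrm{tr}[E L_{A;\mathbf{C}}(k^{(N)}; t^{(N)})]|^2\, \mathrm{tr}_{\mathfrak{X}}[L_{H_{\mathfrak{X}};\boldsymbol{\ell}}(k^{(N)}; t^{(N)})] \\
&= \sum_{N \ge 0} \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(E)\pi_r(E^*) \otimes 1_{\mathfrak{X}}\, \mathbf{D}_{N,\beta}(A, A; \mathbf{C}, \mathbf{C})] \\
&= \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(E)\pi_r(E^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}(A,A;\mathbf{C},\mathbf{C})}]. \tag{6.17}
\end{align*}
Combining (6.16) with (6.17), one concludes the assertion.

6.4. Analysis of Hamiltonian with attractive interaction

Theorem 6.10 tells us that the free energy is minimized when the interaction term is attractive with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$. Taking this fact into consideration, we will study the Hamiltonian $\mathbf{H}(A, A; \mathbf{C}, \mathbf{C})$. We can extend Theorems 5.7–5.9 to the interacting system under consideration. Here, for later use, we only state the following theorem, which is a generalized version of Theorem 5.7.

Theorem 6.12. Assume (A.1), (A.2) and (A.4). Moreover assume that $e^{-\beta \mathbf{H}(A,A;\mathbf{C},\mathbf{C})} \in L^1(L^2(\mathfrak{h}) \otimes \mathfrak{X})$. Then one has the following:
(i) (Positivity) For any $D \in L^\infty(\mathfrak{h})$ and $\beta \in \mathbb{R}_+$, one has
\[ \pi_\ell(D)\pi_r(D^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}(A,A;\mathbf{C},\mathbf{C})} \succeq 0 \tag{6.18} \]
with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$, where $1_{\mathfrak{X}}$ is the identity operator on $\mathfrak{X}$. In particular
\[ \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(D)\pi_r(D^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}(A,A;\mathbf{C},\mathbf{C})}] \ge 0. \tag{6.19} \]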
(ii) (Monotonicity) Let $\mathbf{V}_1$ and $\mathbf{V}_2$ be interactions of the same form as (6.5). If $\mathbf{V}_1 \succeq \mathbf{V}_2 \succeq 0$ with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$, one has
\[ \pi_\ell(D)\pi_r(D^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}_{V_1}(A,A)} \succeq \pi_\ell(D)\pi_r(D^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}_{V_2}(A,A)} \tag{6.20} \]
with respect to $L^2(\mathfrak{h})_+ \otimes \mathfrak{X}_+$ for any $D \in L^\infty(\mathfrak{h})$, where $\mathbf{H}_V(A, A)$ denotes the Hamiltonian (6.2) with $A = B$. In particular
\[ \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(D)\pi_r(D^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}_{V_1}(A,A)}] \ge \mathrm{tr}_{L^2 \otimes \mathfrak{X}}[\pi_\ell(D)\pi_r(D^*) \otimes 1_{\mathfrak{X}}\, e^{-\beta \mathbf{H}_{V_2}(A,A)}]. \tag{6.21} \]

7. The BCS Hamiltonian and Macroscopic Quantum Effects

7.1. Background

In this section, we will study the famous BCS theory of superconductivity and related macroscopic quantum effects, including the Josephson effect, from the viewpoint of the self-dual cone analysis. It is hard to give a detailed history of superconductivity because so many physicists contributed toward understanding the phenomenon. Thus we roughly follow the main stream that relates to this section.

In 1911, Onnes discovered that an electrical resistance of exactly zero occurs in a certain metal below a transition temperature. Since then, many physicists tried to explain this phenomenon theoretically. In 1956, Cooper observed that, in a metal, two electrons with different spins and opposite momenta are bound together by a small attraction coming from the electron–phonon interaction [7]. Taking this observation into consideration, Bardeen, Cooper and Schrieffer proposed the first microscopic theory of superconductivity, the BCS theory, in 1957 [4]. They started from a general Hamiltonian of electrons; then, by incorporating Cooper's idea of the electron pair, they arrived at a reduced Hamiltonian, nowadays called the BCS Hamiltonian. By analyzing the BCS Hamiltonian, superconductivity can be explained very well. In 1962, Josephson predicted that an electric current flows across two weakly coupled superconductors [27]. This phenomenon is called the Josephson effect and has various applications in quantum-mechanical circuits, such as the SQUID.

Applying the general theory developed in the previous sections, we will first investigate the BCS Hamiltonian. Then, by the self-dual cone analysis, we will see that if the phase angles in the superconductors are macroscopically aligned, the free energy of the system attains its minimum. Taking this result into consideration, we will study the Josephson effect by the self-dual cone analysis.

Our analysis in this section, as far as we know, is novel and shows that the self-dual cone analysis is very useful both methodologically and conceptually in condensed matter physics.
August 12, J070-S0129055X11004424
2011 10:27 WSPC/S0129-055X
148-RMP
Self-Dual Cone Analysis in Condensed Matter Physics
783
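7.2. A warmup: The BCS Hamiltonian

7.2.1. Cooper-pair space

Here we discuss the BCS Hamiltonian [4] as a warmup. The BCS Hamiltonian is defined by
\[ H = \sum_{k\in\Lambda,\,\sigma=\uparrow,\downarrow} \varepsilon(k)\, a^*_{k\sigma}a_{k\sigma} - \sum_{k,k'\in\Lambda} v_{k,k'}\, a^*_{k\uparrow}a^*_{-k\downarrow}a_{-k'\downarrow}a_{k'\uparrow}, \tag{7.1} \]
where $\varepsilon(k) = k^2/2 - \mu$ and $\Lambda \subseteq \mathbb{Z}^3$ is such that $-\Lambda = \Lambda$ and $|\Lambda| < \infty$ (see footnote e). Throughout this section, we always assume that $v_{k,k'} \in \mathbb{R}$ and $v_{k,k'} = v_{k',k}$ for all $k, k' \in \Lambda$. Then $H$ is a self-adjoint operator acting in $F \otimes F$, where $F$ is the fermionic Fock space over $\ell^2(\Lambda)$. The operator $a_{k\sigma}$ is defined by $a_{k\uparrow} = a_k \otimes 1$ and $a_{k\downarrow} = (-1)^{N_f} \otimes a_k$, where $a_k$ is the standard annihilation operator in $F$ with the anticommutation relations
\[ \{a_k, a_q^*\} = \delta_{k,q}, \qquad \{a_k, a_q\} = 0, \]
and $N_f$ is the number operator on $F$ defined by $N_f = \sum_{k\in\Lambda} a_k^* a_k$. Then $a_{k\sigma}$ satisfies the well-known anticommutation relations $\{a_{k\sigma}, a^*_{k'\sigma'}\} = \delta_{k,k'}\delta_{\sigma,\sigma'}$. Let $S_k = n_{k\uparrow} - n_{-k\downarrow}$ for all $k \in \Lambda$, where $n_{k\uparrow} = a^*_{k\uparrow}a_{k\uparrow} = a^*_k a_k \otimes 1$ and $n_{k\downarrow} = a^*_{k\downarrow}a_{k\downarrow} = 1 \otimes a^*_k a_k$. Clearly $\mathrm{spec}(S_k) = \{-1, 0, 1\}$. We easily see that
\[ e^{-i\phi S_k} H e^{i\phi S_k} = H \tag{7.2} \]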
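for all $\phi \in \mathbb{R}$ and $k \in \Lambda$. Then, for each $\mathbf{s} = \{s_k\}_{k\in\Lambda} \in \times^{|\Lambda|}\{-1, 0, 1\}$, we introduce a closed subspace $H(\mathbf{s})$ by
\[ H(\mathbf{s}) = \{\varphi \in F \otimes F \mid S_k\varphi = s_k\varphi\ \ \forall k \in \Lambda\}. \tag{7.3} \]
In particular,
\[ C = H(\mathbf{0}) = \{\varphi \in F \otimes F \mid \varphi \in \ker(S_k)\ \ \forall k \in \Lambda\} \tag{7.4} \]
is called the Cooper-pair space, where $\mathbf{0} = \{0\}_{k\in\Lambda} \in \{-1, 0, 1\}^{\times|\Lambda|}$. By (7.2), $H$ is reduced by $H(\mathbf{s})$ for all $\mathbf{s} \in \times^{|\Lambda|}\{-1, 0, 1\}$ (see footnote f).

Theorem 7.1. Assume $v_{k,k'} > 0$ for all $k, k' \in \Lambda$. For each $\mathbf{s} \in \{-1, 0, 1\}^{\times|\Lambda|}$, one obtains
\[ \mathrm{tr}_{H(\mathbf{s})}[e^{-\beta H}] \le \mathrm{tr}_{C}[e^{-\beta H}] \tag{7.5} \]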
for all $\beta > 0$. There is equality only if $\mathbf{s} = \mathbf{0}$, that is, $H(\mathbf{s}) = C$.

Remark 7.2. $\mathrm{tr}_{H(\mathbf{s})}[e^{-\beta H}]$ can be regarded as a partition function parametrized by the Hilbert spaces $\{H(\mathbf{s})\}_{\mathbf{s}}$. The Hilbert space which maximizes this partition function is the space of physical states that occur most easily. Thus the above theorem claims that the Cooper-pair space $C$ is the most realizable Hilbert space. Accordingly, the ground state of $H$ belongs to this space. Note that the Cooper-pair space consists of the so-called Cooper pairs, that is, electron pairs with different spins and opposite momenta. We will prove Theorem 7.1 in Sec. 7.6.

Footnote e. For a set $S$, $|S|$ means the cardinality of $S$.

Footnote f. Let $H$ be a self-adjoint operator on a Hilbert space $h$. Let $h_0$ be a closed subspace of $h$ and $P_0$ be the orthogonal projection onto $h_0$. If $P_0\,\mathrm{dom}(H) \subseteq \mathrm{dom}(H)$ and $P_0 H = H P_0$ on $\mathrm{dom}(H)$, then we say that $H$ is reduced by $h_0$. If $H$ is reduced by $h_0$, then $H$ can be decomposed as $H = (H \restriction h_0) \oplus (H \restriction h_0^{\perp})$.

7.2.2. Uniqueness of the ground state

Let $N_\uparrow = \sum_{k\in\Lambda} a^*_{k\uparrow}a_{k\uparrow}$ and $N_\downarrow = \sum_{k\in\Lambda} a^*_{k\downarrow}a_{k\downarrow}$ be the number operators. $C$ has the direct sum decomposition
\[ C = \bigoplus_{N=0}^{|\Lambda|} C_N, \qquad C_N = \{\varphi \in C \mid N_\uparrow\varphi = N_\downarrow\varphi = N\varphi\}. \tag{7.6} \]
Each $C_N$ is called the $N$-Cooper-pair subspace. Since $H$ commutes with $S_k$ and $N_{\mathrm{tot}} = N_\uparrow + N_\downarrow$, $H$ is reduced by $C_N$ for all $N \ge 0$. Therefore $H$ can be decomposed as
\[ H = \bigoplus_{N=0}^{|\Lambda|} H_N, \qquad H_N = H \restriction C_N. \tag{7.7} \]
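As a numerical sanity check of Theorem 7.1, the sector traces $\mathrm{tr}_{H(\mathbf{s})}[e^{-\beta H}]$ can be computed for a very small model. The following is only a sketch under stated assumptions: the momentum set is the one-dimensional toy lattice $\{-1, 1\}$ (not a subset of $\mathbb{Z}^3$), the coupling is constant, $v_{k,k'} = v$, and the helper names `jordan_wigner` and `sector_traces` are hypothetical. The Jordan-Wigner ordering "all spin-up modes first" reproduces $a_{k\uparrow} = a_k \otimes 1$ and $a_{k\downarrow} = (-1)^{N_f} \otimes a_k$.

```python
import itertools
import numpy as np

def jordan_wigner(n_modes):
    """Real matrices a_0..a_{n-1} obeying the canonical anticommutation relations."""
    eye, z = np.eye(2), np.diag([1.0, -1.0])
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])  # maps occupied -> empty
    ops = []
    for j in range(n_modes):
        op = np.eye(1)
        for factor in [z] * j + [lower] + [eye] * (n_modes - j - 1):
            op = np.kron(op, factor)
        ops.append(op)
    return ops

def sector_traces(momenta, v=0.3, mu=0.0, beta=1.0):
    """Return {s: tr_{H(s)} exp(-beta H)} for the BCS Hamiltonian (7.1).

    Assumptions of this sketch: scalar momenta with -momenta == momenta as a set,
    and a constant coupling v_{k,k'} = v > 0.
    """
    m = len(momenta)
    idx = {k: i for i, k in enumerate(momenta)}
    a = jordan_wigner(2 * m)
    up = {k: a[idx[k]] for k in momenta}        # a_{k,up}
    down = {k: a[m + idx[k]] for k in momenta}  # a_{k,down}
    num = lambda op: op.T @ op                  # a^* a (the matrices are real)
    eps = {k: 0.5 * k * k - mu for k in momenta}

    H = sum(eps[k] * (num(up[k]) + num(down[k])) for k in momenta)
    for k, kp in itertools.product(momenta, repeat=2):
        H = H - v * up[k].T @ down[-k].T @ down[-kp] @ up[kp]

    # Diagonal of exp(-beta H) in the occupation-number basis.
    w, U = np.linalg.eigh(H)
    gibbs_diag = np.einsum("ij,j,ij->i", U, np.exp(-beta * w), U)

    # S_k = n_{k,up} - n_{-k,down} is diagonal in the occupation basis.
    S = np.array([np.diag(num(up[k]) - num(down[-k])) for k in momenta])
    traces = {}
    for s in itertools.product((-1, 0, 1), repeat=m):
        mask = np.all(np.isclose(S.T, s), axis=1)
        traces[s] = gibbs_diag[mask].sum()
    return traces

traces = sector_traces(momenta=[-1, 1])
cooper = traces[(0, 0)]  # trace over the Cooper-pair space C = H(0)
for s, value in sorted(traces.items()):
    print(s, round(float(value), 6), "<= tr_C" if value <= cooper else "> tr_C")
```

In this toy model the sector $\mathbf{s} = \mathbf{0}$ should give the largest value, in line with (7.5); the check says nothing about larger lattices or non-constant couplings.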
Theorem 7.3. Assume that $v_{k,k'} > 0$ for all $k, k' \in \Lambda$. For each $N \ge 0$, $H_N$ has a unique ground state.

Remark 7.4. The Hilbert space $C_N$ is composed of $N$ Cooper pairs. Thus, by Theorem 7.3, the $N$-Cooper-pair system described by $H_N$ has a unique ground state. Combining this with Theorem 7.1, we conclude that, in the Cooper-pair space, which is energetically most stable, the ground state entropy vanishes. The proof of Theorem 7.3 will be given in Sec. 7.7.

7.3. Appearance of a macroscopic phase angle

For all $\boldsymbol{\theta} = \{\theta_k\}_{k\in\Lambda} \in \mathbb{T}^{|\Lambda|}$ with $\mathbb{T} = [0, 2\pi)$, let us define
\[ H(\boldsymbol{\theta}) = H - \sum_{k\in\Lambda} g_k\bigl[e^{i\theta_k} a^*_{k\uparrow}a^*_{-k\downarrow} + e^{-i\theta_k} a_{-k\downarrow}a_{k\uparrow}\bigr], \tag{7.8} \]
where $H$ is defined by (7.1) and $g_k \in \mathbb{R}$. This Hamiltonian acts in $C$, the Cooper-pair space. $H(\boldsymbol{\theta})$ describes a superconductor with the phase angle configuration $\boldsymbol{\theta}$. Among the various phase angle configurations, which one is most stable?

Theorem 7.5 (Macroscopic Phase Angle). Let $\alpha \in \mathbb{T}$. If $\theta_k = \alpha$ for all $k \in \Lambda$, then we write $\boldsymbol{\theta} = [\alpha]$. Let
\[ Z_\beta(\boldsymbol{\theta}) = \mathrm{tr}_{C}[e^{-\beta H(\boldsymbol{\theta})}] \tag{7.9} \]
for all $\beta > 0$. Assume that $v_{k,k'} > 0$ for all $k, k' \in \Lambda$ and that $g_k > 0$ for all $k \in \Lambda$. Then one has
\[ Z_\beta(\boldsymbol{\theta}) \le Z_\beta([\alpha]) \tag{7.10} \]
for any $\alpha \in \mathbb{T}$. There is equality only if $\boldsymbol{\theta} = [\alpha]$ with $\alpha \in \mathbb{T}$.

Remark 7.6. (i) Since $e^{i\alpha N_{\mathrm{tot}}/2} H([0]) e^{-i\alpha N_{\mathrm{tot}}/2} = H([\alpha])$, $Z_\beta([\alpha])$ is independent of $\alpha$.
(ii) By (7.10), when $\boldsymbol{\theta} = [\alpha]$, that is, when the phase angles are aligned macroscopically, the superconductor is energetically stable. This is the reason why a macroscopically aligned phase angle appears in superconductors. Using this mechanism, the Josephson effect, one of the most famous macroscopic quantum effects, will be explained in the next subsection. Moreover, Theorem 7.7 below tells us that, in this case, the ground state entropy vanishes.
(iii) Since the new Hamiltonian (7.8) contains terms which annihilate and create Cooper pairs, the number of Cooper pairs is not conserved. In contrast, as mentioned in (ii), the phase angles are aligned macroscopically. This can be understood as a reflection of the uncertainty relation between the phase angle and the particle number.

Theorem 7.7. Assume that $v_{k,k'} > 0$ for all $k, k' \in \Lambda$ and $g_k > 0$ for all $k \in \Lambda$. Then $H([\alpha])$ has a unique ground state for all $\alpha \in \mathbb{T}$.

We will give proofs of Theorems 7.5 and 7.7 in Secs. 7.8 and 7.9, respectively.

7.4. Josephson effect

In this subsection, we discuss the Josephson effect [27]. Let us consider the Hamiltonian of the form
\[ H(\phi) = H_L \otimes 1_R + 1_L \otimes H_R - W(\phi). \tag{7.11} \]
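$H_L$ and $H_R$ are the Hamiltonians for the “left” and “right” superconductors, respectively. They are defined, with $\# = L$ or $R$, by
\[ H_{\#} = \sum_{k\in\Lambda_{\#},\,\sigma=\uparrow,\downarrow} \varepsilon(k)\, a^*_{k\sigma}a_{k\sigma} - \sum_{k\in\Lambda_{\#}} g^{\#}_k\bigl[a^*_{k\uparrow}a^*_{-k\downarrow} + a_{-k\downarrow}a_{k\uparrow}\bigr] - \sum_{k,k'\in\Lambda_{\#}} v^{\#}_{k,k'}\, a^*_{k\uparrow}a^*_{-k\downarrow}a_{-k'\downarrow}a_{k'\uparrow}. \tag{7.12} \]
$H_L$ and $H_R$ act in the Cooper-pair spaces $C_L$ and $C_R$ associated with the momentum lattices $\Lambda_L \subset (k_\ell\mathbb{Z})^3$ and $\Lambda_R \subset (k_r\mathbb{Z})^3$ for some $k_\ell, k_r > 0$, respectively.

Remark 7.8. The superconductors described by $(C_L, H_L)$ and $(C_R, H_R)$ need not be identical. In particular, the two superconductors may occupy different volumes, so the units of momentum $k_\ell$ and $k_r$ need not be the same.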
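The interaction between the superconductors is given by
\[ W(\phi) = \sum_{k\in\Lambda_L,\,k'\in\Lambda_R} w_{k,k'}\bigl[e^{i(\theta_L-\theta_R)}\, a^*_{k\uparrow}a^*_{-k\downarrow} \otimes a_{-k'\downarrow}a_{k'\uparrow} + e^{-i(\theta_L-\theta_R)}\, a_{-k\downarrow}a_{k\uparrow} \otimes a^*_{k'\uparrow}a^*_{-k'\downarrow}\bigr] \tag{7.13} \]
with $\phi = \theta_L - \theta_R \in \mathbb{T}$. The Hamiltonian $H(\phi)$ acts in $C_L \otimes C_R$ and describes a system of two weakly interacting superconductors. The phase angles of the “left” (respectively, “right”) superconductor are macroscopically aligned at $\theta_L$ (respectively, $\theta_R$).

Theorem 7.9 (Josephson Effect I). Assume $v^{\#}_{k,k'} > 0$ and $g^{\#}_k > 0$ for all $k, k' \in \Lambda_{\#}$ and $\# = L, R$. Moreover, assume $w_{k,k'} > 0$ for all $k \in \Lambda_L$, $k' \in \Lambda_R$. Let
\[ Z_\beta(\phi) = \mathrm{tr}_{C_L\otimes C_R}[e^{-\beta H(\phi)}]. \tag{7.14} \]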
Then one has $Z_\beta(\phi) \le Z_\beta(0)$ for all $\beta > 0$. There is equality only if $\phi = 0$, or equivalently $\theta_L = \theta_R$.

Remark 7.10. The above theorem tells us that the interacting superconductors are energetically most favorable if $\theta_L = \theta_R$. Consequently, if two superconductors with different phase angles are coupled, an electric current flows until the two phase angles are aligned. The current exists even if the chemical potentials of the two superconductors are the same. This phenomenon is called the Josephson effect. Once the two phase angles are aligned, the energy of the system reaches its lowest value; in addition, the ground state entropy equals zero, as stated in Theorem 7.11, so that the system becomes static.

Theorem 7.11. $H(0)$ has a unique ground state in $C_L \otimes C_R$.

We will prove Theorems 7.9 and 7.11 in Secs. 7.10 and 7.11, respectively.

7.5. A remark on the Josephson effect

Here we study the Josephson effect more precisely.

7.5.1. The Duhamel expectations

Let $A, B$ be bounded operators. Let $H_0 = H_L \otimes 1_R + 1_L \otimes H_R$. Define
\[ \langle A\rangle_{\beta,\phi}\, Z_\beta(\phi) = \sum_{N=1}^{\infty}\sum_{j=1}^{N}\int_{S_N(1)} \beta^{N-1}\, \mathrm{tr}_{C_L\otimes C_R}\Bigl[e^{-t_1\beta H_0} W(\phi) e^{-t_2\beta H_0} \cdots \underbrace{A}_{j\text{th}} e^{-t_{j+1}\beta H_0} \cdots W(\phi) e^{-(1-\sum_{j=1}^{N} t_j)\beta H_0}\Bigr], \tag{7.15} \]
\[ \langle A; B\rangle_{\beta,\phi}\, Z_\beta(\phi) = \sum_{N=2}^{\infty}\sum_{\substack{j,k=1\\ j\neq k}}^{N}\int_{S_N(1)} \beta^{N-2}\, \mathrm{tr}_{C_L\otimes C_R}\Bigl[e^{-t_1\beta H_0} W(\phi) e^{-t_2\beta H_0} \cdots \underbrace{A}_{j\text{th}} e^{-t_{j+1}\beta H_0} \cdots \underbrace{B}_{k\text{th}} e^{-t_{k+1}\beta H_0} \cdots W(\phi) e^{-(1-\sum_{j=1}^{N} t_j)\beta H_0}\Bigr]. \tag{7.16} \]
At first glance, these definitions may seem artificial. However, as we will see later, these objects arise naturally in our discussion. For instance, we see
\[ \frac{\partial}{\partial\phi} Z_\beta(\phi) = -\beta\sin\phi\,\langle C\rangle_{\beta,\phi} Z_\beta(\phi) + \beta\cos\phi\,\langle S\rangle_{\beta,\phi} Z_\beta(\phi). \tag{7.17} \]
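Trivially, $\langle A; B\rangle_{\beta,\phi} = \langle B; A\rangle_{\beta,\phi}$.

7.5.2. Free energy equation

Let
\[ C = \sum_{k\in\Lambda_L,\,k'\in\Lambda_R} w_{k,k'}\bigl[a^*_{k\uparrow}a^*_{-k\downarrow} \otimes a_{-k'\downarrow}a_{k'\uparrow} + a_{-k\downarrow}a_{k\uparrow} \otimes a^*_{k'\uparrow}a^*_{-k'\downarrow}\bigr], \tag{7.18} \]
\[ S = i\sum_{k\in\Lambda_L,\,k'\in\Lambda_R} w_{k,k'}\bigl[a^*_{k\uparrow}a^*_{-k\downarrow} \otimes a_{-k'\downarrow}a_{k'\uparrow} - a_{-k\downarrow}a_{k\uparrow} \otimes a^*_{k'\uparrow}a^*_{-k'\downarrow}\bigr]. \tag{7.19} \]
Then both $C$ and $S$ are self-adjoint and we have
\[ W(\phi) = \cos\phi\, C + \sin\phi\, S. \tag{7.20} \]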
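The decomposition (7.20) is Euler's formula applied term by term to (7.13). A short sketch of the computation, where $X_{k,k'}$ is a shorthand introduced only for this check and $w_{k,k'}$ is assumed to be real:

```latex
% X_{k,k'} is a shorthand used only in this check; w_{k,k'} is assumed real.
\begin{align*}
X_{k,k'} &:= a^*_{k\uparrow}a^*_{-k\downarrow}\otimes a_{-k'\downarrow}a_{k'\uparrow},
\qquad
X_{k,k'}^* = a_{-k\downarrow}a_{k\uparrow}\otimes a^*_{k'\uparrow}a^*_{-k'\downarrow},\\
W(\phi) &= \sum_{k\in\Lambda_L,\,k'\in\Lambda_R} w_{k,k'}
  \bigl[e^{i\phi}X_{k,k'} + e^{-i\phi}X_{k,k'}^*\bigr]\\
&= \cos\phi \sum_{k,k'} w_{k,k'}\bigl[X_{k,k'} + X_{k,k'}^*\bigr]
 + \sin\phi\; i\sum_{k,k'} w_{k,k'}\bigl[X_{k,k'} - X_{k,k'}^*\bigr]\\
&= \cos\phi\, C + \sin\phi\, S .
\end{align*}
% Self-adjointness: C^* = C is immediate, and
% S^* = -i \sum w_{k,k'} [X^* - X] = i \sum w_{k,k'} [X - X^*] = S.
```

In particular $W(0) = C$, which is the form used when the phase difference vanishes.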
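Theorem 7.12 (Josephson Effect II). Let $\Delta_\beta(\phi) = \langle C\rangle_{\beta,\phi}$. One has the following:
(i) $\Delta_\beta(0)$ is strictly positive.
(ii) Let
\[ F_\beta(\phi) = -\frac{1}{\beta}\ln Z_\beta(\phi). \tag{7.21} \]
Then we have
\[ \begin{aligned} F_\beta(\phi) - F_\beta(0) ={}& \Delta_\beta(0)(1-\cos\phi)\\ &+ \beta\int_0^{\phi} d\alpha\,(\cos\phi-\cos\alpha)\bigl\{\sin\alpha\,(\langle C; C\rangle_{\beta,\alpha} - \langle C\rangle^2_{\beta,\alpha}) - \cos\alpha\,(\langle C; S\rangle_{\beta,\alpha} - \langle C\rangle_{\beta,\alpha}\langle S\rangle_{\beta,\alpha})\bigr\}\\ &+ \beta\int_0^{\phi} d\alpha\,(\sin\phi-\sin\alpha)\bigl\{\cos\alpha\,(\langle S; S\rangle_{\beta,\alpha} - \langle S\rangle^2_{\beta,\alpha}) - \sin\alpha\,(\langle S; C\rangle_{\beta,\alpha} - \langle S\rangle_{\beta,\alpha}\langle C\rangle_{\beta,\alpha})\bigr\}. \end{aligned} \tag{7.22} \]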
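Remark 7.13. By Theorem 7.9, the left-hand side of (7.22) must be nonnegative. The fluctuation terms on the right-hand side of (7.22) should be small, so that the term $\Delta_\beta(0)(1-\cos\phi)$ dominates the behavior of the system. To see this clearly, it is instructive to consider the weak coupling case $w_{k,k'} \ll 1$. Then we have $F_\beta(\phi) - F_\beta(0) \cong \Delta_\beta(0)(1-\cos\phi)$, the well-known equation of the Josephson effect.

The proof of Theorem 7.12 will be given in Sec. 7.12.

7.6. Proof of Theorem 7.1

7.6.1. Representation on $L^2(F)$

$H$ can be expressed as
\[ H = d\Gamma(\varepsilon) \otimes 1 + 1 \otimes d\Gamma(\varepsilon) - \sum_{k,k'\in\Lambda} v_{k,k'}\, c^*_k c_{k'} \otimes a^*_{-k}a_{-k'}, \tag{7.23} \]
where $c_k = (-1)^{N_f} a_k$ and $d\Gamma(\varepsilon) = \sum_{k\in\Lambda}\varepsilon(k)a^*_k a_k = \sum_{k\in\Lambda}\varepsilon(k)c^*_k c_k$. Next we want to express $H$ in the representation $L^2(F)$. To this end, we first determine an involution $\vartheta$, see Remark 4.8. The following choice of $\vartheta$ is natural for our purpose:
\[ \vartheta a^*_{-k_1}\cdots a^*_{-k_n}\Omega = c^*_{k_1}\cdots c^*_{k_n}\Omega, \qquad \vartheta\Omega = \Omega \tag{7.24} \]
for all $k_1, \ldots, k_n \in \Lambda$ with $k_1 \neq k_2 \neq \cdots \neq k_n$, where $\Omega$ is the Fock vacuum. Then, since $\vartheta a_{-k}\vartheta = c_k$, we have
\[ H = d\Gamma(\varepsilon) \otimes 1 + 1 \otimes \vartheta d\Gamma(\varepsilon)\vartheta - \sum_{k,k'\in\Lambda} v_{k,k'}\, c^*_k c_{k'} \otimes \vartheta c^*_k c_{k'}\vartheta. \tag{7.25} \]
Hence, under the identification $F \otimes F = L^2(F)$ by $\Phi_\vartheta$, we have
\[ H = H_0 - V, \tag{7.26} \]
\[ H_0 = \pi_\ell(d\Gamma(\varepsilon)) + \pi_r(d\Gamma(\varepsilon)), \tag{7.27} \]
\[ V = \sum_{k,k'\in\Lambda} v_{k,k'}\, \pi_\ell(c^*_k c_{k'})\,\pi_r(c^*_k c_{k'}). \tag{7.28} \]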
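The passage from (7.1) to (7.23) can be checked directly from the definitions $a_{k\uparrow} = a_k \otimes 1$, $a_{k\downarrow} = (-1)^{N_f} \otimes a_k$ and $c_k = (-1)^{N_f} a_k$. A sketch of the computation, using only that $(-1)^{N_f}$ is self-adjoint and squares to the identity:

```latex
\begin{align*}
a^*_{k\uparrow}a^*_{-k\downarrow}a_{-k'\downarrow}a_{k'\uparrow}
&= (a_k^*\otimes 1)\bigl((-1)^{N_f}\otimes a^*_{-k}\bigr)
   \bigl((-1)^{N_f}\otimes a_{-k'}\bigr)(a_{k'}\otimes 1)\\
&= a_k^*(-1)^{N_f}(-1)^{N_f}a_{k'}\otimes a^*_{-k}a_{-k'}\\
&= c_k^*c_{k'}\otimes a^*_{-k}a_{-k'},
\qquad\text{since } c_k^* = a_k^*(-1)^{N_f}.\\[4pt]
\sum_{k\in\Lambda}\varepsilon(k)\bigl(n_{k\uparrow}+n_{k\downarrow}\bigr)
&= \sum_{k\in\Lambda}\varepsilon(k)\bigl(a_k^*a_k\otimes 1 + 1\otimes a_k^*a_k\bigr)
 = d\Gamma(\varepsilon)\otimes 1 + 1\otimes d\Gamma(\varepsilon).
\end{align*}
```

Summing the first identity against $v_{k,k'}$ and combining it with the second gives (7.23).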
7.6.2. Structures of $H(\mathbf{s})$

Here we investigate some structures of the orthogonal projection onto $H(\mathbf{s})$. Under the identification $F \otimes F = L^2(F)$ by $\Phi_\vartheta$, $S_k$ can be expressed as $S_k = \pi_\ell(n_k) - \pi_r(n_k)$ with $n_k = c^*_k c_k$. For each $j \in \{0, 1\}$, let $[n_k](j)$ be the orthogonal projection onto $\ker(n_k - j)$. In addition, for each $s \in \{-1, 0, 1\}$, let $[S_k](s)$ be the orthogonal projection onto $\ker(S_k - s)$. Then one sees
\[ [S_k](-1) = \pi_\ell([n_k](0))\,\pi_r([n_k](1)), \tag{7.29} \]
\[ [S_k](0) = \pi_\ell([n_k](0))\,\pi_r([n_k](0)) + \pi_\ell([n_k](1))\,\pi_r([n_k](1)), \tag{7.30} \]
\[ [S_k](1) = \pi_\ell([n_k](1))\,\pi_r([n_k](0)). \tag{7.31} \]
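The formulas (7.29)–(7.31) amount to sorting the four joint eigenspaces of $\pi_\ell(n_k)$ and $\pi_r(n_k)$ by the value of $S_k$. A sketch of the bookkeeping, assuming (as in the earlier sections) that $\pi_\ell(A)$ and $\pi_r(B)$ commute, so that the products below are orthogonal projections:

```latex
% Resolution of the identity, using [n_k](0) + [n_k](1) = 1:
\[
1 = \sum_{i,j\in\{0,1\}} \pi_\ell([n_k](i))\,\pi_r([n_k](j)),
\qquad
S_k\,\pi_\ell([n_k](i))\,\pi_r([n_k](j)) = (i-j)\,\pi_\ell([n_k](i))\,\pi_r([n_k](j)).
\]
% Eigenvalue i - j of S_k on each joint eigenspace:
\[
\begin{array}{c|cc}
 & j=0 & j=1\\ \hline
 i=0 & 0 & -1\\
 i=1 & 1 & 0
\end{array}
\]
% Collecting the entries with value -1, 0, 1 gives (7.29), (7.30), (7.31).
```

The two diagonal entries of the table produce the two summands in (7.30).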
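The orthogonal projection onto $H(\mathbf{s})$ is given by
\[ P(\mathbf{s}) = \prod_{k\in\Lambda} [S_k](s_k) \tag{7.32} \]
for all $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda|}\{-1, 0, 1\}$.

Notation: We will use the following notation:
(i) For each $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda|}\{-1, 0, 1\}$ and $j \in \{-1, 0, 1\}$, we denote $\mathbf{s}(j) = \{s_k \mid s_k = j\}$. Hence $\mathbf{s} = \mathbf{s}(-1) \cup \mathbf{s}(0) \cup \mathbf{s}(1)$.
(ii) For each $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda|}\{-1, 0, 1\}$ and $j \in \{-1, 0, 1\}$, we denote $\Lambda(\mathbf{s}(j)) = \{k \in \Lambda \mid s_k \in \mathbf{s}(j)\}$. Hence $\Lambda = \Lambda(\mathbf{s}(-1)) \cup \Lambda(\mathbf{s}(0)) \cup \Lambda(\mathbf{s}(1))$.
(iii) For each subset $\Lambda_0 \subseteq \Lambda$ and $\mathbf{N} = \{N_k\}_k \in \times^{|\Lambda_0|}\{0, 1\}$, we denote
\[ [n]_{\Lambda_0}(\mathbf{N}) = \prod_{k\in\Lambda_0} [n_k](N_k). \tag{7.33} \]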
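Lemma 7.14. For each $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda|}\{-1, 0, 1\}$ and $j \in \{-1, 0, 1\}$, set
\[ P(\mathbf{s}(j)) = \prod_{k\in\Lambda(\mathbf{s}(j))} [S_k](s_k). \tag{7.34} \]
Then one has the following:
(i) $P(\mathbf{s}) = P(\mathbf{s}(-1))P(\mathbf{s}(0))P(\mathbf{s}(1))$.
(ii) $P(\mathbf{s}(-1)) = \pi_\ell([n]_{\Lambda(\mathbf{s}(-1))}(\mathbf{0}))\,\pi_r([n]_{\Lambda(\mathbf{s}(-1))}(\mathbf{1}))$, where $\mathbf{0} = \{0\}_{k\in\Lambda(\mathbf{s}(-1))}$ and $\mathbf{1} = \{1\}_{k\in\Lambda(\mathbf{s}(-1))}$.
(iii) $P(\mathbf{s}(0)) = \sum_{\mathbf{N}\in\times^{|\Lambda(\mathbf{s}(0))|}\{0,1\}} \pi_\ell([n]_{\Lambda(\mathbf{s}(0))}(\mathbf{N}))\,\pi_r([n]_{\Lambda(\mathbf{s}(0))}(\mathbf{N}))$.
(iv) $P(\mathbf{s}(1)) = \pi_\ell([n]_{\Lambda(\mathbf{s}(1))}(\mathbf{1}))\,\pi_r([n]_{\Lambda(\mathbf{s}(1))}(\mathbf{0}))$.

Proof. These formulas are immediate consequences of the definition of $P(\mathbf{s}(j))$.

Corollary 7.15. The orthogonal projection onto the Cooper-pair subspace $C$ is given by
\[ P(\mathbf{0}) = \sum_{\mathbf{N}\in\times^{|\Lambda|}\{0,1\}} \pi_\ell([n]_{\Lambda}(\mathbf{N}))\,\pi_r([n]_{\Lambda}(\mathbf{N})), \tag{7.35} \]
where $\mathbf{0} = \{0\}_k \in \times^{|\Lambda|}\{-1, 0, 1\}$.

7.6.3. Proof of Theorem 7.1

We will apply the DLS inequality. Note that a similar idea can be found in [1]. By Lemma 7.14, one obtains
\[ \begin{aligned} \mathrm{tr}_{H(\mathbf{s})}[e^{-\beta H}] &= \mathrm{tr}[P(\mathbf{s})e^{-\beta H}] = \mathrm{tr}[P(\mathbf{s}(-1))P(\mathbf{s}(0))P(\mathbf{s}(1))e^{-\beta H}]\\ &= \sum_{\mathbf{N}\in\times^{|\Lambda(\mathbf{s}(0))|}\{0,1\}} \mathrm{tr}\bigl[\pi_\ell\bigl([n]_{\Lambda(\mathbf{s}(-1))}(\mathbf{0})\,[n]_{\Lambda(\mathbf{s}(0))}(\mathbf{N})\,[n]_{\Lambda(\mathbf{s}(1))}(\mathbf{1})\bigr)\\ &\qquad\qquad \times \pi_r\bigl([n]_{\Lambda(\mathbf{s}(-1))}(\mathbf{1})\,[n]_{\Lambda(\mathbf{s}(0))}(\mathbf{N})\,[n]_{\Lambda(\mathbf{s}(1))}(\mathbf{0})\bigr)e^{-\beta H}\bigr]. \end{aligned} \tag{7.36} \]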
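Now set
\[ A(\mathbf{N}) = [n]_{\Lambda(\mathbf{s}(-1))}(\mathbf{0})\,[n]_{\Lambda(\mathbf{s}(0))}(\mathbf{N})\,[n]_{\Lambda(\mathbf{s}(1))}(\mathbf{1}), \tag{7.37} \]
\[ B(\mathbf{N}) = [n]_{\Lambda(\mathbf{s}(-1))}(\mathbf{1})\,[n]_{\Lambda(\mathbf{s}(0))}(\mathbf{N})\,[n]_{\Lambda(\mathbf{s}(1))}(\mathbf{0}). \tag{7.38} \]
Then, using the DLS inequality (Theorem 5.4), we have
\[ \begin{aligned} \text{RHS of (7.36)} &= \sum_{\mathbf{N}\in\times^{|\Lambda(\mathbf{s}(0))|}\{0,1\}} \mathrm{tr}[\pi_\ell(A(\mathbf{N}))\pi_r(B(\mathbf{N}))e^{-\beta H}]\\ &\le \sum_{\mathbf{N}\in\times^{|\Lambda(\mathbf{s}(0))|}\{0,1\}} \bigl(\mathrm{tr}[\pi_\ell(A(\mathbf{N}))\pi_r(A(\mathbf{N}))e^{-\beta H}]\bigr)^{1/2} \bigl(\mathrm{tr}[\pi_\ell(B(\mathbf{N}))\pi_r(B(\mathbf{N}))e^{-\beta H}]\bigr)^{1/2}\\ &\le \frac{1}{2}\sum_{\mathbf{N}\in\times^{|\Lambda(\mathbf{s}(0))|}\{0,1\}} \bigl\{\mathrm{tr}[\pi_\ell(A(\mathbf{N}))\pi_r(A(\mathbf{N}))e^{-\beta H}] + \mathrm{tr}[\pi_\ell(B(\mathbf{N}))\pi_r(B(\mathbf{N}))e^{-\beta H}]\bigr\}. \end{aligned} \tag{7.39} \]
By Corollary 7.15, we see that
\[ \begin{aligned} \sum_{\mathbf{N}\in\times^{|\Lambda(\mathbf{s}(0))|}\{0,1\}} \mathrm{tr}[\pi_\ell(A(\mathbf{N}))\pi_r(A(\mathbf{N}))e^{-\beta H}] &\le \sum_{\mathbf{N}\in\times^{|\Lambda|}\{0,1\}} \mathrm{tr}[\pi_\ell([n]_{\Lambda}(\mathbf{N}))\pi_r([n]_{\Lambda}(\mathbf{N}))e^{-\beta H}]\\ &= \mathrm{tr}_{C}[e^{-\beta H}]. \end{aligned} \tag{7.40} \]
There is equality only if $\mathbf{s} = \mathbf{0}$. A similar inequality holds if we replace $A(\mathbf{N})$ with $B(\mathbf{N})$. Thus we arrive at
\[ \text{RHS of (7.39)} \le \mathrm{tr}_{C}[e^{-\beta H}]. \tag{7.41} \]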
There is equality only if $\mathbf{s} = \mathbf{0}$.

7.7. Proof of Theorem 7.3

7.7.1. A natural self-dual cone in $C_N$

Let $X_N = \{\mu = (k_1, \ldots, k_N) \in \times^{N}\Lambda \mid k_1 \neq \cdots \neq k_N\}$. If $\mu, \nu \in X_N$ are the same as a set (see footnote g), then we write $\mu \sim \nu$. This relation is indeed an equivalence relation.

Footnote g. Write $\mu = (\mu_1, \ldots, \mu_N)$ and $\nu = (\nu_1, \ldots, \nu_N)$. If there exists a permutation $\sigma$ so that $(\mu_{\sigma(1)}, \ldots, \mu_{\sigma(N)}) = (\nu_1, \ldots, \nu_N)$, then we say that $\mu$ and $\nu$ are the same as a set.
Now let $\Theta_N$ be the quotient set of $X_N$ by $\sim$. For each $\mu = (\mu_1, \ldots, \mu_N) \in \Theta_N$, set
\[ \tilde{e}_\mu = c^*_{\mu_1}\cdots c^*_{\mu_N}\Omega \in F_N. \tag{7.42} \]
Then we immediately obtain the following.

Lemma 7.16. Set $e_\mu = |\tilde{e}_\mu\rangle\langle\tilde{e}_\mu|$ for each $\mu \in \Theta_N$. Then $\{e_\mu\}_{\mu\in\Theta_N}$ is a CONS of $C_N$.

Remark 7.17. The vectors in (7.42) do not seem to be well-defined, since their sign depends on the order of the indices. This observation is certainly correct. However, the vector $e_\mu$ in the above lemma is well-defined, i.e. $e_\mu$ is independent of the order of the indices.

Definition 7.18. A natural self-dual cone in $C_N$ is defined by
\[ C_{N,+} = \Bigl\{\sum_{\mu\in\Theta_N} \alpha_\mu e_\mu \Bigm| \alpha_\mu \in \mathbb{R}_+\ \ \forall \mu \in \Theta_N\Bigr\} \tag{7.43} \]
with $C_{0,+} = \mathbb{R}_+|\Omega\rangle\langle\Omega|$.

Lemma 7.19. For all $N \in \mathbb{N}_0$, one has the following:
(i) $e^{-\beta H_0} \unrhd 0$ with respect to $C_{N,+}$ for all $\beta \in \mathbb{R}_+$.
(ii) (Attraction) $V \unrhd 0$ with respect to $C_{N,+}$.
(iii) $e^{-\beta H_N} \unrhd 0$ with respect to $C_{N,+}$ for all $\beta \in \mathbb{R}_+$.

Proof. These immediately follow from (7.26)–(7.28) and Theorem 5.7.

Lemma 7.20. For any $\mu, \nu \in \Theta_N$, there exists an $n \in \mathbb{N}_0$ such that
\[ \langle e_\mu, V^n e_\nu\rangle > 0. \tag{7.44} \]

Proof. For notational simplicity, we put $v_{k,k'} = 1$ for all $k, k' \in \Lambda$. Write $V = \sum_{k,k'\in\Lambda} V_{kk'}$ with $V_{kk'} = \pi_\ell(c^*_k c_{k'})\,\pi_r(c^*_k c_{k'}) \unrhd 0$ with respect to $C_{N,+}$. Observe that
\[ V_{kk'} e_\mu = e_{(\mu\setminus\{k'\})\cup\{k\}}, \tag{7.45} \]
where the right-hand side is understood as $e_{(\mu\setminus\{k'\})\cup\{k\}} = 0$ if $k' \notin \mu$ or $k \in \mu$. We consider $\mu, \nu$ as sets and write $\mu = \{\mu_1, \ldots, \mu_N\}$, $\nu = \{\nu_1, \ldots, \nu_N\}$. Since $V \unrhd V_{kk'}$ with respect to $C_{N,+}$ for all $k, k'$, we have
\[ V^n \unrhd V_{\mu_1\nu_1}\cdots V_{\mu_n\nu_n} \quad \text{with respect to } C_{N,+}, \tag{7.46} \]
where $n = |\nu\setminus(\mu\cap\nu)|$ and $\nu_1, \ldots, \nu_n \in \nu\setminus(\mu\cap\nu)$, $\mu_1, \ldots, \mu_n \in \mu\setminus(\mu\cap\nu)$. Then, by (7.45), one sees that $V^n e_\nu \ge e_\mu$ with respect to $C_{N,+}$. Now we arrive at
\[ \langle e_\mu, V^n e_\nu\rangle \ge \langle e_\mu, e_\mu\rangle = 1 > 0. \tag{7.47} \]
This completes the proof.
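The combinatorial content of Lemma 7.20 is that pair hopping connects any two $N$-pair configurations. The following sketch illustrates this on a small momentum set. It is not part of the proof: the helper names `pair_hopping_matrix` and `is_ergodic` are hypothetical, the coupling is set to a constant $v$, and the diagonal case $k = k'$ is treated as leaving $e_\mu$ unchanged when $k \in \mu$, which is an assumption of this sketch about how (7.45) is to be read for $k = k'$.

```python
import itertools
import numpy as np

def pair_hopping_matrix(momenta, n_pairs, v=1.0):
    """Matrix of V = sum_{k,k'} v V_{kk'} in the basis {e_mu}, using rule (7.45)."""
    configs = [frozenset(c) for c in itertools.combinations(momenta, n_pairs)]
    index = {c: i for i, c in enumerate(configs)}
    V = np.zeros((len(configs), len(configs)))
    for mu in configs:
        for k, kp in itertools.product(momenta, repeat=2):
            # V_{kk'} moves a pair from k' to k. It vanishes unless k' is occupied
            # and k is free; k == k' is assumed to leave the configuration unchanged.
            if kp in mu and (k == kp or k not in mu):
                target = (mu - {kp}) | {k}
                V[index[target], index[mu]] += v
    return V, configs

def is_ergodic(V):
    """True if every pair (mu, nu) has <e_mu, V^n e_nu> > 0 for some 0 <= n <= dim."""
    dim = V.shape[0]
    reach = np.eye(dim)
    power = np.eye(dim)
    for _ in range(dim):
        power = power @ V
        reach += power
    return bool(np.all(reach > 0))

V, configs = pair_hopping_matrix([-2, -1, 1, 2], n_pairs=2)
print(len(configs), "configurations; ergodic:", is_ergodic(V))
```

Because all matrix entries are nonnegative, summing the powers $V^0, \ldots, V^{\dim}$ cannot produce cancellations, so a strictly positive sum is equivalent to the existence of a suitable $n$ for every pair $(\mu, \nu)$.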
7.7.2. Proof of Theorem 7.3

By Theorem 3.8, it suffices to show that $V$ is ergodic with respect to $C_{N,+}$. Choose $x, y \in C_{N,+}\setminus\{0\}$ arbitrarily. Observe that there exist $\mu, \nu \in \Theta_N$ such that $x \ge c_1 e_\mu$ and $y \ge c_2 e_\nu$ with $c_1, c_2 > 0$. Hence, by Lemma 7.20, we have $\langle x, V^n y\rangle \ge c_1 c_2\langle e_\mu, V^n e_\nu\rangle > 0$, which means that $V$ is ergodic with respect to $C_{N,+}$.

7.8. Proof of Theorem 7.5

7.8.1. A natural self-dual cone in $C$

Our choice of a self-dual cone in $C$ is
\[ C_+ = \bigoplus_{N=0}^{|\Lambda|} C_{N,+}, \tag{7.48} \]
where $C_{N,+}$ is given by Definition 7.18. Note that, by Lemma 7.16, $\{e_\mu \mid \mu \in \Theta_N, N \in \mathbb{N}_0\}$ is a CONS of $C$. Furthermore, we have the following.

Lemma 7.21. $\{e_\mu \mid \mu \in \Theta_N, N \in \mathbb{N}_0\} \subset C_+$.

Remark 7.22. By Theorem 2.8 and Lemma 7.21, $A \unrhd 0$ with respect to $C_+$ implies $\mathrm{tr}_{C}[A] \ge 0$. This property will be important.

7.8.2. General properties of $G(\boldsymbol{\theta})$

In the representation $L^2(F)$ by $\Phi_\vartheta$, $H(\boldsymbol{\theta})$ can be expressed as
\[ H(\boldsymbol{\theta}) = \pi_\ell(d\Gamma(\varepsilon)) + \pi_r(d\Gamma(\varepsilon)) - G(\boldsymbol{\theta}) - V, \tag{7.49} \]
where $V$ is given by (7.28) and
\[ G(\boldsymbol{\theta}) = \sum_{k\in\Lambda} g_k\bigl[e^{i\theta_k}\pi_\ell(c^*_k)\pi_r(c_k) + e^{-i\theta_k}\pi_\ell(c_k)\pi_r(c^*_k)\bigr]. \tag{7.50} \]
Lemma 7.23. One has the following:
(i) (Attraction) $G([0]) \unrhd 0$ with respect to $C_+$.
(ii) $e^{-\beta H([0])} \unrhd 0$ with respect to $C_+$ for all $\beta \ge 0$.

Proof. (i) is clear. To see (ii), we just apply Theorem 3.3.

Lemma 7.24. Let $A_1, \ldots, A_n$ be bounded operators acting in $C$. Assume the following:
(i) $A_i \unrhd 0$ with respect to $C_+$ for all $i = 1, \ldots, n$.
(ii) $A_i$ commutes with $N_{\mathrm{tot}} = \pi_\ell(N_f) + \pi_r(N_f)$ for all $i = 1, \ldots, n$.
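Then one has
\[ \bigl|\mathrm{tr}_{C}[A_1 G(\boldsymbol{\theta})A_2\cdots A_{n-1}G(\boldsymbol{\theta})A_n]\bigr| \le \mathrm{tr}_{C}[A_1 G([\alpha])A_2\cdots A_{n-1}G([\alpha])A_n] \tag{7.51} \]
for all $\alpha \in \mathbb{T}$.

Proof. Set $G_{k,0} = g_k\pi_\ell(c_k)\pi_r(c^*_k)$ and $G_{k,1} = G^*_{k,0}$ for all $k \in \Lambda$. Then we have
\[ G(\boldsymbol{\theta}) = \sum_{\alpha=0,1}\sum_{k\in\Lambda} e^{i(-1)^{\alpha}\theta_k} G_{k,\alpha}. \tag{7.52} \]
Hence we get
\[ \mathrm{tr}_{C}[A_1 G(\boldsymbol{\theta})A_2\cdots A_{n-1}G(\boldsymbol{\theta})A_n] = \sum_{k_1,\ldots,k_n\in\Lambda}\ \sum_{\alpha_1,\ldots,\alpha_n=0,1} e^{i\sum_{j=1}^{n}(-1)^{\alpha_j}\theta_{k_j}}\, \mathrm{tr}_{C}[A_1 G_{k_1,\alpha_1}A_2\cdots A_{n-1}G_{k_n,\alpha_n}A_n]. \tag{7.53} \]
Since $A_i \unrhd 0$ and $G_{k,\alpha} \unrhd 0$, Remark 7.22 gives $\mathrm{tr}_{C}[A_1 G_{k_1,\alpha_1}A_2\cdots A_{n-1}G_{k_n,\alpha_n}A_n] \ge 0$. Thus we have
\[ \bigl|\mathrm{tr}_{C}[A_1 G(\boldsymbol{\theta})A_2\cdots A_{n-1}G(\boldsymbol{\theta})A_n]\bigr| \le \mathrm{tr}_{C}[A_1 G([0])A_2\cdots A_{n-1}G([0])A_n]. \tag{7.54} \]
Finally, using the fact
\[ \mathrm{tr}_{C}[A_1 G([0])A_2\cdots A_{n-1}G([0])A_n] = \mathrm{tr}_{C}[A_1 G([\alpha])A_2\cdots A_{n-1}G([\alpha])A_n] \tag{7.55} \]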
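for all $\alpha \in \mathbb{T}$, we obtain the assertion.

7.8.3. Proof of Theorem 7.5

By Duhamel's formula, we have
\[ e^{-\beta H(\boldsymbol{\theta})} = \sum_{n=0}^{\infty} D_{n,\beta}(\boldsymbol{\theta}), \tag{7.56} \]
\[ D_{n,\beta}(\boldsymbol{\theta}) = \int_{S_n(\beta)} e^{-t_1 H}G(\boldsymbol{\theta})e^{-t_2 H}\cdots e^{-t_n H}G(\boldsymbol{\theta})e^{-(\beta-\sum_{j=1}^{n}t_j)H}, \tag{7.57} \]
where $H$ is given by (7.1). Thus we have
\[ Z_\beta(\boldsymbol{\theta}) = \sum_{n\ge 0} \mathrm{tr}_{C}[D_{n,\beta}(\boldsymbol{\theta})]. \tag{7.58} \]
Now, applying the fact that $e^{-\beta H} \unrhd 0$ with respect to $C_+$ together with Lemma 7.24, we see that $|\mathrm{tr}_{C}[D_{n,\beta}(\boldsymbol{\theta})]| \le \mathrm{tr}_{C}[D_{n,\beta}([\alpha])]$ for all $\alpha \in \mathbb{T}$, which implies $Z_\beta(\boldsymbol{\theta}) \le Z_\beta([\alpha])$. Next we prove that $Z_\beta(\boldsymbol{\theta}) < Z_\beta([\alpha])$ if $\boldsymbol{\theta} \neq [\alpha]$ for any $\alpha \in \mathbb{T}$. To this end, we only investigate $D_{1,\beta}(\boldsymbol{\theta})$. Write
\[ \mathrm{tr}_{C}[D_{1,\beta}(\boldsymbol{\theta})] = \int_0^{\beta} dt\, M_t(\boldsymbol{\theta}), \tag{7.59} \]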
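\[ M_t(\boldsymbol{\theta}) = \sum_{k\in\Lambda} g_k\bigl\{e^{i\theta_k}\,\mathrm{tr}_{C}[e^{-tH}\pi_\ell(c^*_k)\pi_r(c_k)e^{-(\beta-t)H}] + e^{-i\theta_k}\,\mathrm{tr}_{C}[e^{-tH}\pi_\ell(c_k)\pi_r(c^*_k)e^{-(\beta-t)H}]\bigr\}. \tag{7.60} \]
Since $Z_\beta(\boldsymbol{\theta})$ is real, it suffices to study $\Re\,\mathrm{tr}_{C}[D_{1,\beta}(\boldsymbol{\theta})]$ only. On the other hand, both
\[ \mathrm{tr}_{C}[e^{-tH}\pi_\ell(c_k)\pi_r(c^*_k)e^{-(\beta-t)H}] \quad\text{and}\quad \mathrm{tr}_{C}[e^{-tH}\pi_\ell(c^*_k)\pi_r(c_k)e^{-(\beta-t)H}] \tag{7.61} \]
are also real numbers, because $\pi_\ell(c^*_k)\pi_r(c_k) \unrhd 0$, $\pi_\ell(c_k)\pi_r(c^*_k) \unrhd 0$ and $e^{-tH} \unrhd 0$. Thus, if $\boldsymbol{\theta} \neq [\alpha]$ for any $\alpha \in \mathbb{T}$, we have
\[ \Re M_t(\boldsymbol{\theta}) = \sum_{k\in\Lambda} g_k\cos\theta_k\bigl\{\mathrm{tr}_{C}[e^{-tH}\pi_\ell(c_k)\pi_r(c^*_k)e^{-(\beta-t)H}] + \mathrm{tr}_{C}[e^{-tH}\pi_\ell(c^*_k)\pi_r(c_k)e^{-(\beta-t)H}]\bigr\} < M_t([0]), \tag{7.62} \]
which means $\Re\,\mathrm{tr}_{C}[D_{1,\beta}(\boldsymbol{\theta})] < \mathrm{tr}_{C}[D_{1,\beta}([0])]$. Therefore we conclude that $Z_\beta(\boldsymbol{\theta}) < Z_\beta([0]) = Z_\beta([\alpha])$ for any $\alpha \in \mathbb{T}$.

7.9. Proof of Theorem 7.7

7.9.1. Ergodicity of $G([0])$

Lemma 7.25. Let $H_0([0])$ be the Hamiltonian defined by
\[ H_0([0]) = \pi_\ell(d\Gamma(\varepsilon)) + \pi_r(d\Gamma(\varepsilon)) - G([0]). \tag{7.63} \]
Then we have
\[ e^{-\beta H([0])} \unrhd e^{-\beta H_0([0])} \tag{7.64} \]
with respect to $C_+$.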
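Proof. Note that $H_0([0]) \unrhd H([0])$ with respect to $C_+$. Hence, applying Theorem 2.10, we have the assertion.

Lemma 7.26 (Ergodicity). Let $G = G([0])$. Then $G$ is ergodic with respect to $C_+$. Namely, for all $x, y \in C_+\setminus\{0\}$, there exists an $N \in \mathbb{N}_0$ such that $\langle x, G^N y\rangle > 0$.

Proof. Write $x = \bigoplus_{j=0}^{|\Lambda|} x_j$ and $y = \bigoplus_{j=0}^{|\Lambda|} y_j$ with $x_j, y_j \in C_{j,+}$. Then there exist $M, N \in \mathbb{N}_0$ such that $x_M \in C_{M,+}\setminus\{0\}$ and $y_N \in C_{N,+}\setminus\{0\}$. Thus $x \ge x_M$ and $y \ge y_N$ with respect to $C_+$, where we identify $x_M$ with $\bigoplus_{j=0}^{|\Lambda|}\delta_{jM}x_j$ and $y_N$ with $\bigoplus_{j=0}^{|\Lambda|}\delta_{jN}y_j$. Then there exist $\mu \in \Theta_M$ and $\nu \in \Theta_N$ such that $x_M \ge c_M e_\mu$ and $y_N \ge c_N e_\nu$ with $c_M, c_N > 0$. Set $A_k = g_k\pi_\ell(c_k)\pi_r(c^*_k)$. Then $G = \sum_{k\in\Lambda}(A_k + A^*_k)$. Since $G \unrhd A_k$ with respect to $C_+$ for all $k$, we have, with $\mu = (\mu_1, \ldots, \mu_M)$ and $\nu = (\nu_1, \ldots, \nu_N)$,
\[ G^M e_\mu \ge A_{\mu_1}\cdots A_{\mu_M}e_\mu = g_{\mu_1}\cdots g_{\mu_M}e_\emptyset, \qquad G^N e_\nu \ge A_{\nu_1}\cdots A_{\nu_N}e_\nu = g_{\nu_1}\cdots g_{\nu_N}e_\emptyset, \]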
where $e_\emptyset = |\Omega\rangle\langle\Omega|$. Thus we arrive at
\[ \langle x, G^{M+N}y\rangle \ge \langle x_M, G^{M+N}y_N\rangle \ge c_M c_N\langle e_\mu, G^{M+N}e_\nu\rangle \ge c_M c_N(g_{\mu_1}\cdots g_{\mu_M})(g_{\nu_1}\cdots g_{\nu_N}) > 0. \]
This completes the proof.

7.9.2. Proof of Theorem 7.7

Since $e^{-i\alpha N_{\mathrm{tot}}/2}H([\alpha])e^{i\alpha N_{\mathrm{tot}}/2} = H([0])$, it suffices to prove the assertion for $\alpha = 0$ only. By Lemma 7.25, if we can show that $e^{-\beta H_0([0])} \rhd 0$ with respect to $C_+$, then we automatically obtain $e^{-\beta H([0])} \unrhd e^{-\beta H_0([0])} \rhd 0$ with respect to $C_+$, and the desired assertion follows from the Perron–Frobenius–Faris theorem (Theorem 2.13). It therefore remains to show that $e^{-\beta H_0([0])} \rhd 0$ with respect to $C_+$. But this can easily be proven by Lemma 7.26 and Theorem 3.8.

7.10. Proof of Theorem 7.9

We will apply the self-dual cone analysis to obtain the assertion. To this end, we have to find a suitable self-dual cone which captures the attraction that governs the phenomenon.

7.10.1. A natural self-dual cone in $C_L \otimes C_R$

Let $C_{L,+}$ (respectively, $C_{R,+}$) be the natural self-dual cone associated with $C_L$ (respectively, $C_R$), defined by (7.48). Our choice of a self-dual cone in this system is the tensor product of $C_{L,+}$ and $C_{R,+}$, namely,
\[ C_{L,+} \otimes C_{R,+} = \{\varphi \in C_L \otimes C_R \mid \langle\eta_L \otimes \eta_R, \varphi\rangle \ge 0\ \ \forall\eta_L \in C_{L,+}\ \forall\eta_R \in C_{R,+}\}. \tag{7.65} \]
It is not hard to check that $C_{L,+} \otimes C_{R,+}$ is self-dual by Proposition 2.15 and Lemma 7.21. We define $\Theta_{L,N}$ and $\Theta_{R,N}$ by replacing $\Lambda$ with $\Lambda_L$ and $\Lambda_R$ in the definition of $\Theta_N$, see Sec. 7.7.1. Then, by Lemma 7.16, $\{e_\mu \otimes e_\nu \mid \mu \in \Theta_{L,M}, \nu \in \Theta_{R,N}, M, N \in \mathbb{N}_0\}$ is a CONS of $C_L \otimes C_R$.

Lemma 7.27. One has the following:
(i) $\{e_\mu \otimes e_\nu \mid \mu \in \Theta_{L,M}, \nu \in \Theta_{R,N}, M, N \in \mathbb{N}_0\} \subset C_{L,+} \otimes C_{R,+}$.
(ii) $C_{L,+} \otimes C_{R,+} = \bigl\{\sum_{\mu\in\Theta_{L,M},\,\nu\in\Theta_{R,N}} \alpha_{\mu\nu}\, e_\mu \otimes e_\nu \bigm| \alpha_{\mu\nu} \in \mathbb{R}_+\ \forall\mu \in \Theta_{L,M}\ \forall\nu \in \Theta_{R,N},\ M, N \in \mathbb{N}_0\bigr\}$.

Proof. These follow from (7.43), (7.48) and the definition of the tensor product of self-dual cones.

Remark 7.28. It must be emphasized that, by Theorem 2.8 and Lemma 7.27, $A \unrhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$ implies $\mathrm{tr}_{C_L\otimes C_R}[A] \ge 0$.
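7.10.2. Structures of interactions

Note that $H(\phi)$ can be expressed as
\[ H(\phi) = H_0 - W(\phi), \tag{7.66} \]
\[ H_0 = K - V_g - V_v \tag{7.67} \]
with
\[ K = [\pi_\ell(d\Gamma(\varepsilon)) + \pi_r(d\Gamma(\varepsilon))] \otimes 1_R + 1_L \otimes [\pi_\ell(d\Gamma(\varepsilon)) + \pi_r(d\Gamma(\varepsilon))], \tag{7.68} \]
\[ V_g = \sum_{k\in\Lambda_L} g^L_k\,[\pi_\ell(c^*_k)\pi_r(c_k) + \text{adj.}] \otimes 1_R + \sum_{k\in\Lambda_R} g^R_k\, 1_L \otimes [\pi_\ell(c^*_k)\pi_r(c_k) + \text{adj.}], \tag{7.69} \]
\[ V_v = \sum_{k,k'\in\Lambda_L} v^L_{k,k'}\,[\pi_\ell(c^*_k c_{k'})\pi_r(c^*_k c_{k'})] \otimes 1_R + \sum_{k,k'\in\Lambda_R} v^R_{k,k'}\, 1_L \otimes [\pi_\ell(c^*_k c_{k'})\pi_r(c^*_k c_{k'})], \tag{7.70} \]
\[ W(\phi) = \sum_{k\in\Lambda_L,\,k'\in\Lambda_R} w_{k,k'}\,[e^{i\phi}\,\pi_\ell(c^*_k)\pi_r(c_k) \otimes \pi_\ell(c_{k'})\pi_r(c^*_{k'}) + \text{adj.}], \tag{7.71} \]
where the symbol “adj.” means the adjoint of the operator immediately preceding it. Using the expressions (7.68)–(7.71), we immediately obtain the following:

Proposition 7.29. One has the following:
(i) $W(\phi) = \Re W(\phi) + i\,\Im W(\phi)$ with $\Re W(\phi) = \cos\phi\, C$ and $\Im W(\phi) = -i\sin\phi\, S$.
(ii) (Attraction) $W(0) \unrhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$.
(iii) $e^{-\beta H_0} \unrhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$ for all $\beta \in \mathbb{R}_+$.

Proof. (i) is trivial. (ii) and (iii) follow from Proposition 2.16 and Lemma 7.23.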
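7.10.3. Proof of Theorem 7.9

Our strategy of the proof is similar to that of Theorem 7.5. Set $W^{(0)}_{k,k'} = w_{k,k'}\,\pi_\ell(c^*_k)\pi_r(c_k) \otimes \pi_\ell(c_{k'})\pi_r(c^*_{k'})$ and $W^{(1)}_{k,k'} = (W^{(0)}_{k,k'})^*$. Then we obtain
\[ W(\phi) = \sum_{\alpha=0,1}\sum_{k,k'} e^{i(-1)^{\alpha}\phi}\, W^{(\alpha)}_{k,k'}. \tag{7.72} \]
By Duhamel's formula, we have
\[ e^{-\beta H(\phi)} = \sum_{N\ge 0} D_{N,\beta}(\phi), \tag{7.73} \]
\[ D_{N,\beta}(\phi) = \int_{S_N(\beta)} e^{-t_1 H_0}W(\phi)e^{-t_2 H_0}\cdots e^{-t_N H_0}W(\phi)e^{-(\beta-\sum_{j=1}^{N}t_j)H_0}. \tag{7.74} \]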
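Then one sees
\[ \mathrm{tr}_{C_L\otimes C_R}[D_{N,\beta}(\phi)] = \sum_{k_1,\ldots,k_N\in\Lambda_L}\ \sum_{k'_1,\ldots,k'_N\in\Lambda_R}\ \sum_{\alpha_1,\ldots,\alpha_N\in\{0,1\}} e^{i\sum_{j=1}^{N}(-1)^{\alpha_j}\phi} \int_{S_N(\beta)} \mathrm{tr}_{C_L\otimes C_R}\bigl[e^{-t_1 H_0}W^{(\alpha_1)}_{k_1,k'_1}\cdots W^{(\alpha_N)}_{k_N,k'_N}e^{-(\beta-\sum_{j=1}^{N}t_j)H_0}\bigr]. \tag{7.75} \]
Since $e^{-tH_0} \unrhd 0$ and $W^{(\alpha)}_{k,k'} \unrhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$, we have
\[ \bigl|\mathrm{tr}_{C_L\otimes C_R}[D_{N,\beta}(\phi)]\bigr| \le \mathrm{tr}_{C_L\otimes C_R}[D_{N,\beta}(0)]. \tag{7.76} \]
Hence $Z_\beta(\phi) \le Z_\beta(0)$. Next we study the term $\mathrm{tr}_{C_L\otimes C_R}[D_{1,\beta}(\phi)]$ carefully. Since $Z_\beta(\phi)$ is real, it suffices to investigate the real part of $\mathrm{tr}_{C_L\otimes C_R}[D_{1,\beta}(\phi)]$, and we have
\[ \begin{aligned} \Re\,\mathrm{tr}_{C_L\otimes C_R}[D_{1,\beta}(\phi)] &= \int_0^{\beta} dt\ \mathrm{tr}_{C_L\otimes C_R}[e^{-tH_0}\,\Re W(\phi)\,e^{-(\beta-t)H_0}]\\ &= \cos\phi\int_0^{\beta} dt\ \mathrm{tr}_{C_L\otimes C_R}[e^{-tH_0}\,C\,e^{-(\beta-t)H_0}]\\ &\le \mathrm{tr}_{C_L\otimes C_R}[D_{1,\beta}(0)]. \end{aligned} \tag{7.77} \]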
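There is equality only if $\phi = 0$. Now we can conclude the assertion.

7.11. Proof of Theorem 7.11

7.11.1. Ergodicity of $V_g$

As we discussed in Sec. 7.10.1, the natural self-dual cone in $C_L \otimes C_R$ is $C_{L,+} \otimes C_{R,+}$.

Lemma 7.30. One has $e^{-\beta H(0)} \unrhd e^{-\beta(K-V_g)} \unrhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$ for all $\beta \in \mathbb{R}_+$.

Proof. Since $e^{-\beta K} \unrhd 0$ and $V_g \unrhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$, we have $e^{-\beta(K-V_g)} \unrhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$. Observe that $H(0) \unlhd K - V_g$ with respect to $C_{L,+} \otimes C_{R,+}$. Then, by the operator monotonicity (Proposition 2.10), we conclude that $e^{-\beta H(0)} \unrhd e^{-\beta(K-V_g)}$ with respect to $C_{L,+} \otimes C_{R,+}$.

Lemma 7.31 (Ergodicity). $V_g$ is ergodic with respect to $C_{L,+} \otimes C_{R,+}$. Namely, for all $x, y \in C_{L,+} \otimes C_{R,+}\setminus\{0\}$, there exists an $L \in \mathbb{N}_0$ such that $\langle x, V_g^L y\rangle > 0$.

Proof. Let $x, y \in C_{L,+} \otimes C_{R,+}\setminus\{0\}$. Then there exist $\mu \in \Theta_{L,M}$, $\mu' \in \Theta_{L,M'}$, $\nu \in \Theta_{R,N}$, $\nu' \in \Theta_{R,N'}$ such that $x \ge C e_\mu \otimes e_\nu$ and $y \ge C' e_{\mu'} \otimes e_{\nu'}$ with $C, C' > 0$. Hence it suffices to show that there exists an $L \in \mathbb{N}_0$ such that
\[ \langle e_\mu \otimes e_\nu, V_g^L\, e_{\mu'} \otimes e_{\nu'}\rangle > 0. \tag{7.78} \]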
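Without loss of generality, we may assume that $g^{\#}_k = 1$ for all $k \in \Lambda_{\#}$ and $\# = L, R$. Since $V_g \unrhd \pi_\ell(c_k)\pi_r(c^*_k) \otimes 1_R$ for all $k \in \Lambda_L$ and $V_g \unrhd 1_L \otimes \pi_\ell(c_k)\pi_r(c^*_k)$ for all $k \in \Lambda_R$, we see that
\[ V_g^{M+N}\, e_\mu \otimes e_\nu \ge \pi_\ell(c_{\mu_1}\cdots c_{\mu_M})\,\pi_r(c^*_{\mu_M}\cdots c^*_{\mu_1}) \otimes \pi_\ell(c_{\nu_1}\cdots c_{\nu_N})\,\pi_r(c^*_{\nu_N}\cdots c^*_{\nu_1})\; e_\mu \otimes e_\nu = e_\emptyset \otimes e_\emptyset. \]
Similarly, $V_g^{M'+N'}\, e_{\mu'} \otimes e_{\nu'} \ge e_\emptyset \otimes e_\emptyset$. Hence we conclude (7.78) with $L = M + M' + N + N'$.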
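7.11.2. Proof of Theorem 7.11

We will show that $e^{-\beta H(0)} \rhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$ for all $\beta > 0$. By Lemma 7.30, it suffices to show that $e^{-\beta(K-V_g)} \rhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$ for all $\beta > 0$. By Theorem 3.8, it suffices to show that $V_g$ is ergodic. But this has already been proven in Lemma 7.31.

7.12. Proof of Theorem 7.12

7.12.1. Proof of (i)

Since $C = W(0) \unrhd 0$ and $e^{-\beta H(0)} \rhd 0$ with respect to $C_{L,+} \otimes C_{R,+}$, one immediately obtains (i).

7.12.2. Proof of (ii)

Proposition 7.32. One has the following:
(i) $\frac{\partial}{\partial\phi}Z_\beta(\phi) = -\beta\sin\phi\,\langle C\rangle_{\beta,\phi}Z_\beta(\phi) + \beta\cos\phi\,\langle S\rangle_{\beta,\phi}Z_\beta(\phi)$,
(ii) $\frac{\partial}{\partial\phi}(\langle C\rangle_{\beta,\phi}Z_\beta(\phi)) = -\beta\sin\phi\,\langle C; C\rangle_{\beta,\phi}Z_\beta(\phi) + \beta\cos\phi\,\langle C; S\rangle_{\beta,\phi}Z_\beta(\phi)$,
(iii) $\frac{\partial}{\partial\phi}(\langle S\rangle_{\beta,\phi}Z_\beta(\phi)) = -\beta\sin\phi\,\langle S; C\rangle_{\beta,\phi}Z_\beta(\phi) + \beta\cos\phi\,\langle S; S\rangle_{\beta,\phi}Z_\beta(\phi)$,
(iv) $\frac{\partial}{\partial\phi}\langle C\rangle_{\beta,\phi} = -\beta\sin\phi\,(\langle C; C\rangle_{\beta,\phi} - \langle C\rangle^2_{\beta,\phi}) + \beta\cos\phi\,(\langle C; S\rangle_{\beta,\phi} - \langle C\rangle_{\beta,\phi}\langle S\rangle_{\beta,\phi})$,
(v) $\frac{\partial}{\partial\phi}\langle S\rangle_{\beta,\phi} = \beta\cos\phi\,(\langle S; S\rangle_{\beta,\phi} - \langle S\rangle^2_{\beta,\phi}) - \beta\sin\phi\,(\langle S; C\rangle_{\beta,\phi} - \langle S\rangle_{\beta,\phi}\langle C\rangle_{\beta,\phi})$.

Proof. (i)–(iii) Direct computations. (iv) By (i),
\[ \begin{aligned} \frac{\partial}{\partial\phi}(\langle C\rangle_{\beta,\phi}Z_\beta(\phi)) &= \Bigl(\frac{\partial}{\partial\phi}\langle C\rangle_{\beta,\phi}\Bigr)Z_\beta(\phi) + \langle C\rangle_{\beta,\phi}\frac{\partial}{\partial\phi}Z_\beta(\phi)\\ &= \Bigl(\frac{\partial}{\partial\phi}\langle C\rangle_{\beta,\phi}\Bigr)Z_\beta(\phi) + \langle C\rangle_{\beta,\phi}\bigl(-\beta\sin\phi\,\langle C\rangle_{\beta,\phi}Z_\beta(\phi) + \beta\cos\phi\,\langle S\rangle_{\beta,\phi}Z_\beta(\phi)\bigr). \end{aligned} \tag{7.79} \]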
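Thus, by (ii), we have
\[ -\beta\sin\phi\,\langle C; C\rangle_{\beta,\phi} + \beta\cos\phi\,\langle C; S\rangle_{\beta,\phi} = \frac{\partial}{\partial\phi}\langle C\rangle_{\beta,\phi} - \beta\sin\phi\,\langle C\rangle^2_{\beta,\phi} + \beta\cos\phi\,\langle C\rangle_{\beta,\phi}\langle S\rangle_{\beta,\phi}. \tag{7.80} \]
Now (iv) is clear. Similarly, we can show (v).

Proof of Theorem 7.12(ii). By Proposition 7.32(i), we have
\[ F_\beta(\phi) - F_\beta(0) = \int_0^{\phi} d\alpha\,\langle C\rangle_{\beta,\alpha}\sin\alpha - \int_0^{\phi} d\alpha\,\langle S\rangle_{\beta,\alpha}\cos\alpha. \tag{7.81} \]
(Here we have used the fact that $F_\beta(\phi)$ is a real-valued function.) By integration by parts, we have
\[ \begin{aligned} \text{RHS of (7.81)} ={}& \langle C\rangle_{\beta,0} - \langle C\rangle_{\beta,\phi}\cos\phi + \int_0^{\phi} d\alpha\,\frac{\partial}{\partial\alpha}\langle C\rangle_{\beta,\alpha}\cos\alpha\\ &+ \langle S\rangle_{\beta,\phi}\sin\phi - \int_0^{\phi} d\alpha\,\frac{\partial}{\partial\alpha}\langle S\rangle_{\beta,\alpha}\sin\alpha. \end{aligned} \tag{7.82} \]
On the other hand, we see that
\[ \langle C\rangle_{\beta,\phi} = \langle C\rangle_{\beta,0} + \int_0^{\phi} d\alpha\,\frac{\partial}{\partial\alpha}\langle C\rangle_{\beta,\alpha}, \qquad \langle S\rangle_{\beta,\phi} = \int_0^{\phi} d\alpha\,\frac{\partial}{\partial\alpha}\langle S\rangle_{\beta,\alpha}, \tag{7.83} \]
where we used the fact that $\langle S\rangle_{\beta,0} = 0$. [Proof: Write $S = iA - iA^*$ with $A = \sum_{k,k'}\pi_\ell(c^*_k)\pi_r(c_k) \otimes \pi_\ell(c_{k'})\pi_r(c^*_{k'})$. Clearly $A \unrhd 0$, which implies $\langle A\rangle_{\beta,0} \ge 0$. Hence $\langle A^*\rangle_{\beta,0} \ge 0$ as well. Thus $\langle S\rangle_{\beta,0}$ is purely imaginary. On the other hand, $\langle S\rangle_{\beta,0}$ is real because $S$ is self-adjoint, so it must vanish.] Combining this with (7.82) and using Proposition 7.32 (iv) and (v), we arrive at the desired result.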
8. Peierls Instabilities

8.1. Background

In [44], Peierls proposed the following problem: Let us consider a system of electrons in a lattice. The energy of the system contains the lattice periodicity as a parameter. Compare all possible energies by varying the parameter. Then the type of the lattice is determined by minimizing the energy, and this type will occur at low temperatures. Based on this variational principle, Peierls predicted a phenomenon in one-dimensional systems, called the Peierls instability. In a word, the Peierls instability means that a system of one-dimensional electrons does not attain its minimum energy at the obvious periodicity $a$, but at the periodicity $2a$, where $a$ is the lattice spacing. This indicates that a distorted lattice configuration will appear.
Independently, Fröhlich studied a one-dimensional electron–phonon system and reached the same conclusion [13]. He tried to explain superconductivity by this result. Unfortunately, as we already know, Fröhlich's attempt failed and the BCS theory has been widely accepted as the theory of superconductivity. However, the development of experimental technology has made it possible to create quasi-one-dimensional systems, and the Peierls instability was detected experimentally. Nowadays the observations of Fröhlich and Peierls provide an active area in condensed matter physics [2, 23, 31].

In this section, we study the Peierls instability from the viewpoint of the self-dual cone analysis. (We refer to [28, 35] as pioneering works.) We will analyze the Fröhlich Hamiltonian [12, 13] in detail. It is interesting that, as we will see, the mathematical structures of this model and of the BCS model are similar. In this sense, we may say that Fröhlich's claim about superconductivity was not so far from the correct answer. The analysis developed in this section is, as far as we know, novel and shows the validity of the self-dual cone analysis.

8.2. Peierls instability I

Let us consider electrons in a one-dimensional crystal. We begin with a simplified Fröhlich Hamiltonian [2, 12, 13] given by
\[ H_{\mathrm{Fröhlich}}(q) = \sum_{k\in\mathbb{Z}} \varepsilon(k)a^*_k a_k + \omega\sum_{q'\in\mathbb{Z}} b^*_{q'}b_{q'} - g\bigl[\rho_q\phi_{-q} + \rho^*_q\phi^*_{-q} + \rho_{-q}\phi_q + \rho^*_{-q}\phi^*_q\bigr] \tag{8.1} \]
with
\[ \rho_q = \sum_{k\in\mathbb{Z}} a^*_{k+q}a_k, \qquad \phi_q = b_{-q} + b^*_q. \tag{8.2} \]
The Hamiltonian $H_{\mathrm{Fröhlich}}(q)$ ($q \in \mathbb{N}$) acts in the Hilbert space $F_{\mathrm{el}} \otimes F_{\mathrm{ph}}$, where $F_{\mathrm{el}}$ is the fermionic Fock space over $\ell^2(\mathbb{Z})$ and $F_{\mathrm{ph}}$ is the bosonic Fock space over $\ell^2(\mathbb{Z})$. $g > 0$ is the coupling strength between electrons and phonons. $a_k$ is the electron annihilation operator satisfying the standard anticommutation relations
\[ \{a_k, a^*_{k'}\} = \delta_{kk'}, \qquad \{a_k, a_{k'}\} = 0 = \{a^*_k, a^*_{k'}\}. \tag{8.3} \]
In addition, $b_q$ ($q \in \mathbb{N}$) is the phonon annihilation operator with the commutation relations
\[ [b_q, b^*_{q'}] = \delta_{qq'}, \qquad [b_q, b_{q'}] = 0 = [b^*_q, b^*_{q'}]. \tag{8.4} \]
By the Kato–Rellich theorem [46], $H_{\mathrm{Fröhlich}}(q)$ is self-adjoint. Fix $k_F \in \mathbb{N}$, the Fermi momentum. Our purpose is to clarify an attractive effect coming from a neighborhood of the Fermi surface. To this end, let us
introduce Λ = ΛL ∪ ΛR , δ δ ΛL = k ∈ Z − kF − < k ≤ −kF + , 2 2 δ δ ΛR = −ΛL = k ∈ Z kF − ≤ k < kF + 2 2
(8.5)
with 0 < δ < kF /2. Then the electron Fock space can be factorized as Fel = FLow ⊗ FΛ ⊗ FHigh , FLow = F( 2 (BkF −δ/2 )), FΛ = F( (Λ)),
(8.6) (8.7)
2
(8.8)
FHigh = F( 2 (BkcF +δ/2 )),
(8.9)
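where $B_r = \{k \in \mathbb{Z} \mid |k| < r\}$ and $B_r^c$ is the complement of $B_r$. A vector
\[
\Psi_{\rm Low} = \mathrm{O}\text{-}\!\!\prod_{k \in B_{k_F - \delta/2}} a_k^* \, \Omega_{\rm Low}
\]
describes a state completely occupied by electrons with momentum $|k| < k_F - \delta/2$, where $\Omega_{\rm Low}$ is the Fock vacuum in $\mathfrak{F}_{\rm Low}$ and, for a set $S = \{i_1, \ldots, i_n\} \subset \mathbb{Z}$ with $i_1 < i_2 < \cdots < i_n$, the symbol $\mathrm{O}\text{-}\!\prod_{k \in S} A_k$ means the ordered product defined by $\mathrm{O}\text{-}\!\prod_{k \in S} A_k = A_{i_n} \cdots A_{i_1}$. Let $\Omega_{\rm High}$ be the Fock vacuum in $\mathfrak{F}_{\rm High}$. Let $P_\Lambda$ be the orthogonal projection onto the closed subspace
\[
\mathfrak{H} = [\Psi_{\rm Low} \otimes \mathfrak{F}_{\Lambda} \otimes \Omega_{\rm High}] \otimes \mathfrak{F}_{\rm ph}. \tag{8.10}
\]
We will concentrate our attention on the following restricted Hamiltonian given by
\[
H(q) = P_\Lambda H_{\text{Fröhlich}}(q) P_\Lambda. \tag{8.11}
\]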
Of course, $H(q)$ acts in $\mathfrak{H}$. For each $q$, $H(q)$ describes a system of electrons coupled with phonons with momentum $\pm q$. Among the various systems described by $H(q)$, which one is energetically favorable? The following theorem answers the question.

Theorem 8.1 (Peierls Instability I). Let
\[
Z_\beta(q) = \mathrm{tr}_{\mathfrak{H}}[e^{-\beta H(q)}] \tag{8.12}
\]
for all $\beta > 0$. Set $Q = 2k_F$. Assume that $\varepsilon(k - Q) = -\varepsilon(k)$ for all $k \in \Lambda_R$. Then we obtain
\[
Z_\beta(q) \le Z_\beta(Q) \tag{8.13}
\]
for all $q > \delta$. There is equality only if $q = Q$.

Remark 8.2. (i) Let us consider the following typical case: $\varepsilon(k) = \frac{1}{2}k^2 - \frac{1}{2}k_F^2$. Since we are studying the neighborhood of the Fermi surface, i.e. $\delta \ll k_F$, $k$
August 12, J070-S0129055X11004424
802
2011 10:27 WSPC/S0129-055X
148-RMP
T. Miyao
can be approximated as $k \cong k_F$ for $k \in \Lambda_R$. Thus, we obtain $\varepsilon(k) \cong k_F(k - k_F)$ for $k \in \Lambda_R$. Similarly we have $\varepsilon(k) \cong -k_F(k + k_F)$ for all $k \in \Lambda_L$. In this case, $\varepsilon(k - Q) = -\varepsilon(k)$ is satisfied for all $k \in \Lambda_R$. (ii) The above theorem tells us that a system of electrons is most stable if it couples with the phonons with momentum $\pm Q$. Hence the system takes a periodic structure with periodicity $2\pi/Q$. This property is called the Peierls instability. We will show Theorem 8.1 in Sec. 8.6.

8.3. Peierls instability II

Let $N_{\rm el}$ be the electron number operator given by $N_{\rm el} = \sum_{k \in \Lambda} a_k^* a_k$. Set $\mathfrak{H}_n = \ker(N_{\rm el} - n)$ for each $n \in \mathbb{N}_0$. Then one sees
\[
\mathfrak{H} = \bigoplus_{n=0}^{|\Lambda|} \mathfrak{H}_n. \tag{8.14}
\]
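Since the number of electrons is conserved, i.e. $e^{-i\theta N_{\rm el}} H(q) e^{i\theta N_{\rm el}} = H(q)$ for all $\theta \in \mathbb{R}$, $H(q)$ is reduced by $\mathfrak{H}_n$. Hence we have a corresponding decomposition
\[
H(q) = \bigoplus_{n=0}^{|\Lambda|} H(q) \restriction \mathfrak{H}_n. \tag{8.15}
\]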
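The reduction (8.14)–(8.15) can be illustrated on a toy model. The sketch below is an assumption-laden stand-in, not the Hamiltonian H(q) itself: it builds a number-conserving nearest-neighbour hopping matrix on a few fermionic modes in the occupation basis (the function names are hypothetical) and checks that all matrix elements between different particle-number sectors vanish.

```python
from itertools import product

def hopping_hamiltonian(eps, t):
    """Matrix of sum_k eps[k] n_k + t sum_k (a_k^* a_{k+1} + h.c.) in the occupation basis."""
    m = len(eps)
    states = list(product((0, 1), repeat=m))
    index = {s: i for i, s in enumerate(states)}
    h = [[0.0] * len(states) for _ in states]
    for s in states:
        i = index[s]
        h[i][i] = sum(e * n for e, n in zip(eps, s))
        for k in range(m - 1):
            # A hop between neighbouring modes carries no fermionic sign,
            # because no occupied mode lies strictly between k and k+1.
            if s[k] != s[k + 1]:
                target = list(s)
                target[k], target[k + 1] = s[k + 1], s[k]
                h[index[tuple(target)]][i] += t
    return states, h

def is_reduced_by_number_sectors(states, h):
    """True if h has no matrix element between sectors with different particle number."""
    return all(h[i][j] == 0.0 or sum(si) == sum(sj)
               for i, si in enumerate(states)
               for j, sj in enumerate(states))

def sector_dimensions(states):
    """Dimensions of the n-particle subspaces, n = 0, ..., number of modes."""
    m = len(states[0])
    return [sum(1 for s in states if sum(s) == n) for n in range(m + 1)]

states, h = hopping_hamiltonian([0.3, -0.1, 0.5], 1.0)
print(is_reduced_by_number_sectors(states, h), sector_dimensions(states))
```

In this toy setting the sector dimensions are binomial coefficients, and the partition function splits into a sum of traces over the sectors, in the same way as the traces over the subspaces in (8.14) are used in Theorem 8.3.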
Theorem 8.3 (Peierls Instability II). Let
\[
Z_\beta^{[n]}(q) = \mathrm{tr}_{\mathfrak{H}_n}[e^{-\beta H(q)}] \tag{8.16}
\]
for all $\beta > 0$. Assume that $\varepsilon(k - Q) = -\varepsilon(k)$ for all $k \in \Lambda_R$. Then we have
\[
Z_\beta^{[n]}(q) \le Z_\beta^{[|\Lambda|/2]}(Q) \tag{8.17}
\]
for all $q > \delta$. There is equality only if $n = |\Lambda|/2$ and $q = Q$.

Remark 8.4. Among the various pairs $(n, \pm q)$ of electron numbers and phonon modes, the pair $(|\Lambda|/2, \pm Q)$ maximizes $Z_\beta^{[n]}(q)$. This means that, among the physically possible systems parametrized by $(n, \pm q)$, the system with $|\Lambda|/2$ electrons and periodicity $2\pi/Q$ is realized. This fact can be understood as a stronger version of the Peierls instability stated in Theorem 8.1. The proof of Theorem 8.3 will be given in Sec. 8.7.

8.4. The most realizable Hilbert space

We will restrict our attention to $\mathfrak{H}_{|\Lambda|/2}$ by Theorem 8.3. For each $k \in \Lambda_L$, let $S_k = n_k + n_{k+Q}$ with $n_k = a_k^* a_k$. Clearly $\mathrm{spec}(S_k) = \{0, 1, 2\}$. For all
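$\mathbf{s} = \{s_k\}_{k \in \Lambda_L} \in \times^{|\Lambda_L|}\{0, 1, 2\}$, set
\[
\mathfrak{H}_{|\Lambda|/2}(\mathbf{s}) = \{\Psi \in \mathfrak{H}_{|\Lambda|/2} \mid S_k \Psi = s_k \Psi, \ \forall k \in \Lambda_L\}. \tag{8.18}
\]
We have
\[
\mathfrak{H}_{|\Lambda|/2} = \bigoplus_{\mathbf{s} \in \times^{|\Lambda_L|}\{0,1,2\}} \mathfrak{H}_{|\Lambda|/2}(\mathbf{s}). \tag{8.19}
\]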
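Since $e^{-iaS_k} H(Q) e^{iaS_k} = H(Q)$ for all $k \in \Lambda_L$ and $a \in \mathbb{R}$, $H(Q)$ is reduced by $\mathfrak{H}_{|\Lambda|/2}(\mathbf{s})$ for each $\mathbf{s} \in \times^{|\Lambda_L|}\{0, 1, 2\}$.

Theorem 8.5. For all $\mathbf{s} \in \times^{|\Lambda_L|}\{0, 1, 2\}$ and $\beta > 0$, set
\[
Z_\beta(Q; \mathbf{s}) = \mathrm{tr}_{\mathfrak{H}_{|\Lambda|/2}(\mathbf{s})}[e^{-\beta H(Q)}]. \tag{8.20}
\]
Assume that $\varepsilon(k - Q) = -\varepsilon(k)$ for all $k \in \Lambda_R$. Then one has, for all $q > \delta$,
\[
Z_\beta(Q; \mathbf{s}) \le Z_\beta(Q; \mathbf{1}), \tag{8.21}
\]
where $\mathbf{1} = \{1\}_{k \in \Lambda_L} \in \times^{|\Lambda_L|}\{0, 1, 2\}$. There is equality only if $\mathbf{s} = \mathbf{1}$.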
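Remark 8.6. It is noteworthy that the above theorem is comparable to Theorem 7.1. Namely, as to the BCS model, the Cooper pair space maximizes the partition function. In contrast, the Hilbert space $\mathfrak{H}_{|\Lambda|/2}(\mathbf{1})$ maximizes the partition function in the above theorem. Thus, the most realizable space of physical states consists of the states $\Psi$ such that $S_k \Psi = \Psi$ for all $k \in \Lambda_L$. We will give the proof of Theorem 8.5 in Sec. 8.8.

8.5. Periodic structure in the unique ground state

8.5.1. Elimination of marginal phonons

Set
\[
M_Q = \{N = \{n_q\}_{q \in \mathbb{Z}\setminus\{-Q,Q\}} \mid n_q \in \mathbb{N}_0 \ \forall q \in \mathbb{Z}\setminus\{-Q, Q\}\}. \tag{8.22}
\]
For all $N = \{n_q\}_{q \in \mathbb{Z}\setminus\{-Q,Q\}} \in M_Q$, define
\[
\mathfrak{H}_{|\Lambda|/2}(\mathbf{1}; N) = \{\Psi \in \mathfrak{H}_{|\Lambda|/2}(\mathbf{1}) \mid b_q^* b_q \Psi = n_q \Psi, \ \forall q \in \mathbb{Z}\setminus\{-Q, Q\}\}. \tag{8.23}
\]
Then
\[
\mathfrak{H}_{|\Lambda|/2}(\mathbf{1}) = \bigoplus_{N \in M_Q} \mathfrak{H}_{|\Lambda|/2}(\mathbf{1}; N). \tag{8.24}
\]
It is easily verified that $H(Q)$ is reduced by $\mathfrak{H}_{|\Lambda|/2}(\mathbf{1}; N)$.

Theorem 8.7. Assume that $\varepsilon(k - Q) = -\varepsilon(k)$ for all $k \in \Lambda_R$. For all $\beta > 0$, we have
\[
\mathrm{tr}_{\mathfrak{H}_{|\Lambda|/2}(\mathbf{1}; N)}[e^{-\beta H(Q)}] \le \mathrm{tr}_{\mathfrak{H}_{|\Lambda|/2}(\mathbf{1}; \mathbf{0})}[e^{-\beta H(Q)}], \tag{8.25}
\]
where $\mathbf{0} = \{0\}_{q \in \mathbb{Z}\setminus\{-Q,Q\}}$. There is equality only if $N = \mathbf{0}$.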
The proof is postponed to Sec. 8.9. Notation. For notational simplicity, we set H (Q) = H|Λ|/2 (1; 0). Hence H (Q) = {Ψ ∈ H|Λ|/2 | Sk Ψ = Ψ, ∀ k ∈ ΛL and b∗q bq Ψ = 0, ∀ q ∈ Z\{−Q, Q}}. 8.5.2. Hamiltonian at a fixed total momentum Taking Theorem 8.7 into consideration, we will restrict the Hamiltonian H(Q) to H (Q). Let Ptot be the total momentum operator given by Ptot = Pel + Pph , where Pel = k∈Λ ka∗k ak and Pph = q ∈Z q b∗q bq .
(8.26)
Proposition 8.8. One obtains spec(Ptot H (Q)) = QZ + P0 , where P0 =
k∈ΛL
(8.27)
k and QZ = {0, ±Q, ±2Q, . . .}.
Proof. See Sec. 8.10. Now define, for all P ∈ QZ, H (Q; P ) = H (Q) ∩ ker(Ptot − P − P0 ). Then we have the following decomposition H (Q) = H (Q; P ).
(8.28)
(8.29)
P ∈QZ
Since the total momentum of the system is conserved, i.e. e−iaPtot H(Q)eiaPtot = H(Q) for all a ∈ R, we have a decomposition KP , KP = H(Q) H (Q; P ). (8.30) H(Q) = P ∈QZ
Definition 8.9. $K_P$ is called the Hamiltonian at a fixed total momentum $P$.

Theorem 8.10 (Periodic Order in the Unique Ground State). Assume that $\varepsilon(k - Q) = -\varepsilon(k)$ for all $k \in \Lambda_R$. For each $P \in Q\mathbb{Z}$, $K_P$ has a unique ground state $\varphi(P)$ such that
(i) $\exp\{i \frac{2\pi}{Q}(P_{\rm tot} - P_0)\}\varphi(P) = \varphi(P)$,
(ii) $N_{\rm el}\varphi(P) = \frac{|\Lambda|}{2}\varphi(P)$, where $N_{\rm el} = \sum_{k \in \Lambda} a_k^* a_k$, the electron number operator,
(iii) $[n_k + n_{k+Q}]\varphi(P) = \varphi(P)$ for all $k \in \Lambda_L$, where $n_k = a_k^* a_k$.
Remark 8.11. The Hamiltonian KP describes |Λ|/2 electrons coupled with momentum ±Q phonons. Moreover it also describes the uniformly translating system with total momentum P . Hence its ground state is the lowest energy state of electron–phonon system with periodicity 2π/Q and total momentum P . By the above theorem, it is unique. By (iii), the unique ground state ϕ(P ) has an additional symmetry so that [nk + nk+Q ]ϕ(P ) = ϕ(P )
(8.31)
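for all $k \in \Lambda_L$. In this sense, the ground state of $K_P$ is a kind of condensation state of electrons satisfying (8.31). This is comparable to the fact that the BCS ground state consists of the Cooper pairs (Theorem 7.3). We will prove Theorem 8.10 in Sec. 8.10.

8.6. Proof of Theorem 8.1

8.6.1. Factorization properties

$\mathfrak{F}_\Lambda$ can be factorized as
\[
\mathfrak{F}_\Lambda = \mathfrak{F}_L \otimes \mathfrak{F}_R, \tag{8.32}
\]
where $\mathfrak{F}_L = \mathfrak{F}(\ell^2(\Lambda_L))$ and $\mathfrak{F}_R = \mathfrak{F}(\ell^2(\Lambda_R))$. Then the electron annihilation operators can be expressed as
\[
a_k = \begin{cases} (-1)^{N_L} \otimes a_k & \text{if } k \in \Lambda_R, \\ a_k \otimes 1 & \text{if } k \in \Lambda_L, \end{cases} \tag{8.33}
\]
where $N_L = \sum_{k \in \Lambda_L} a_k^* a_k$ is the number operator on $\mathfrak{F}_L$. Denoting $\sum_{k \in \Lambda} F(k) a_k^* a_k$ by $d\Gamma(F)$, we have
\[
d\Gamma(\varepsilon) = d\Gamma(\varepsilon_L) \otimes 1 + 1 \otimes d\Gamma(\varepsilon_R), \tag{8.34}
\]
where
\[
\varepsilon_L(k) = \begin{cases} \varepsilon(k) & \text{if } k \in \Lambda_L, \\ 0 & \text{otherwise}, \end{cases} \qquad
\varepsilon_R(k) = \begin{cases} \varepsilon(k) & \text{if } k \in \Lambda_R, \\ 0 & \text{otherwise}. \end{cases} \tag{8.35}
\]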
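For each $k \in \Lambda_L$, set
\[
c_k = (-1)^{N_L} a_k. \tag{8.36}
\]
Then
\[
d\Gamma(\varepsilon_L) = \sum_{k \in \Lambda_L} \varepsilon(k) c_k^* c_k \tag{8.37}
\]
and, under the notation $\tilde{A} = P_\Lambda A P_\Lambda$,
\[
\tilde{\rho}_q = \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} c_k \otimes a_{k+q}^*, \tag{8.38}
\]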
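\[
\tilde{\rho}_{-q} = \sum_{k \in \Lambda_R,\, k-q \in \Lambda_L} c_{k-q}^* \otimes a_k = \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} c_{-k-q}^* \otimes a_{-k} \tag{8.39}
\]
for all $q > \delta$. (Note that if $k + q \notin \Lambda_R$ for all $k \in \Lambda_L$, then (8.38) and (8.39) are equal to 0.)

8.6.2. A natural identification between $\mathfrak{F}_L$ and $\mathfrak{F}_R$

Let $\Omega_L$ and $\Omega_R$ be the Fock vacuums in $\mathfrak{F}_L$ and $\mathfrak{F}_R$, respectively. Define a unitary operator $\tau$ from $\mathfrak{F}_R$ to $\mathfrak{F}_L$ by
\[
\tau \Omega_R = \mathrm{O}\text{-}\!\!\prod_{k \in \Lambda_L} c_k^* \, \Omega_L, \tag{8.40}
\]
\[
\tau a_{k_1}^* \cdots a_{k_n}^* \Omega_R = c_{k_1 - Q} \cdots c_{k_n - Q} \, \tau \Omega_R \tag{8.41}
\]
for $k_1, \ldots, k_n \in \Lambda_R$. Here, $\mathrm{O}\text{-}\!\prod_{k \in \Lambda_L} c_k^*$ is the ordered product. Then we see that, for each $k \in \Lambda_R$,
\[
\tau a_k^* \tau^{-1} = c_{k-Q}, \qquad \tau a_k \tau^{-1} = c_{k-Q}^*. \tag{8.42}
\]
Hence, by introducing $U = 1 \otimes \tau$, we have
\[
U (d\Gamma(\varepsilon_L) \otimes 1) U^{-1} = \sum_{k \in \Lambda_L} \varepsilon(k) c_k^* c_k \otimes 1, \tag{8.43}
\]
\begin{align*}
U (1 \otimes d\Gamma(\varepsilon_R)) U^{-1}
&= 1 \otimes \sum_{k \in \Lambda_R} \varepsilon(k) c_{k-Q} c_{k-Q}^* \\
&= 1 \otimes \sum_{k \in \Lambda_L} \varepsilon(k+Q) c_k c_k^* \\
&= 1 \otimes \sum_{k \in \Lambda_L} (-\varepsilon(k))(1 - c_k^* c_k) \\
&= 1 \otimes \sum_{k \in \Lambda_L} \varepsilon(k) c_k^* c_k - \mathrm{tr}\, \varepsilon_L, \tag{8.44}
\end{align*}
where we have used the assumption $\varepsilon(k + Q) = -\varepsilon(k)$ and the notation $\mathrm{tr}\, \varepsilon_L = \sum_{k \in \Lambda_L} \varepsilon(k)$. Moreover we have
\[
U \tilde{\rho}_q U^{-1} = \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} c_k \otimes c_{k+q-Q}, \tag{8.45}
\]
\[
U \tilde{\rho}_{-q} U^{-1} = \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} c_{-k-q}^* \otimes c_{-k-Q}^* \tag{8.46}
\]
for all $q > \delta$.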
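8.6.3. Representation on $L^2(\mathfrak{F}_L)$

Our choice of $\vartheta$ is the following: Let $\varphi = \bigoplus_{n \ge 0} \varphi_n \in \mathfrak{F}_L$. Then
\[
\vartheta \varphi = \bigoplus_{n \ge 0} \overline{\varphi_n}, \tag{8.47}
\]
where $\overline{\varphi_n}$ is the complex conjugate of the antisymmetric function $\varphi_n$. Under the identification $L^2(\mathfrak{F}_L) = \mathfrak{F}_L \otimes \mathfrak{F}_L$ by $\Phi_\vartheta$, we have
\[
U \widetilde{d\Gamma(\varepsilon)} U^{-1} = \pi_\ell(d\Gamma(\varepsilon_L)) + \pi_r(d\Gamma(\varepsilon_L)) - \mathrm{tr}\, \varepsilon_L \tag{8.48}
\]
and
\[
U \tilde{\rho}_q U^{-1} = \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} \pi_\ell(c_k) \pi_r(c_{k+q-Q}^*), \tag{8.49}
\]
\[
U \tilde{\rho}_{-q} U^{-1} = \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} \pi_\ell(c_{-k-q}^*) \pi_r(c_{-k-Q}). \tag{8.50}
\]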
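Now our Hamiltonian becomes
\[
K(q) = U \tilde{H}_\Lambda(q) U^{-1} = K_0 - V(q), \tag{8.51}
\]
where
\[
K_0 = [\pi_\ell(d\Gamma(\varepsilon_L)) + \pi_r(d\Gamma(\varepsilon_L))] \otimes 1 + 1 \otimes \omega N_{\rm ph}, \tag{8.52}
\]
\[
V(q) = g \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} [\pi_\ell(c_k) \pi_r(c_{k+q-Q}^*) \otimes \phi_{-q} + \pi_\ell(c_{-k-q}^*) \pi_r(c_{-k-Q}) \otimes \phi_q + \text{adj.}], \tag{8.53}
\]
where the symbol "adj." means the adjoint of the operators immediately preceding it and $N_{\rm ph}$ is the phonon number operator: $N_{\rm ph} = \sum_{q' \in \mathbb{Z}} b_{q'}^* b_{q'}$.

Remark 8.12. If $q = Q$, then $V(q)$ is attractive, that is, $V(Q) \unrhd 0$ with respect to $L^2(\mathfrak{F}_L)_+ \otimes \mathfrak{F}_{{\rm ph},+}$, see Lemma 8.17. On the other hand, if $q \ne Q$, $V(q)$ contains repulsive parts. Hence we can expect that $Z_\beta(q)$ attains its maximum at $q = Q$.

8.6.4. A canonical self-dual cone in the bosonic Fock space

Here we define a natural self-dual cone in the bosonic Fock space which will be useful later. For each subset $\Lambda_0 = \{q_1, \ldots, q_{|\Lambda_0|}\} \subseteq \mathbb{Z}$ with $q_1 < \cdots < q_{|\Lambda_0|}$, we denote the bosonic Fock space over $\ell^2(\Lambda_0)$ by $\mathfrak{F}_{\rm ph}(\Lambda_0)$. Set, for each $n_1, \ldots, n_{|\Lambda_0|} \in \mathbb{N}_0$,
\[
\Psi_{\Lambda_0}(n_1, \ldots, n_{|\Lambda_0|}) = \frac{1}{\sqrt{n_1! \cdots n_{|\Lambda_0|}!}} \, b_{q_1}^{*n_1} \cdots b_{q_{|\Lambda_0|}}^{*n_{|\Lambda_0|}} \, \Omega_{\rm ph}(\Lambda_0), \tag{8.54}
\]
where $b_q^*$ is the bosonic creation operator indexed by a momentum $q \in \Lambda_0$ and $\Omega_{\rm ph}(\Lambda_0)$ is the Fock vacuum in $\mathfrak{F}_{\rm ph}(\Lambda_0)$.

Definition 8.13. Let $\mathfrak{F}_{\rm ph}(\Lambda_0)_+$ be the convex cone generated by the conical combinations of $\{\Psi_{\Lambda_0}(n_1, \ldots, n_{|\Lambda_0|})\}_{n_1, \ldots, n_{|\Lambda_0|} \in \mathbb{N}_0}$. Then $\mathfrak{F}_{\rm ph}(\Lambda_0)_+$ is self-dual. If $\Lambda_0 = \mathbb{Z}$,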
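then we simply denote $\mathfrak{F}_{\rm ph}(\mathbb{Z})_+$ by $\mathfrak{F}_{{\rm ph},+}$. Remark that since $\{\Psi_{\Lambda_0}(n_1, \ldots, n_{|\Lambda_0|})\}$ is a CONS of $\mathfrak{F}_{\rm ph}(\Lambda_0)$, the assumption (A.4) in Sec. 6 is satisfied.

Remark 8.14. The self-dual cone $\mathfrak{F}_{\rm ph}(\Lambda_0)_+$ was introduced by Fröhlich in [14, 15] to prove the uniqueness of the ground state of a bosonic quantum field model. Further applications can be found in [37, 38].

By definition we have the following basic properties.

Proposition 8.15. One obtains the following:
(i) $b_q \unrhd 0$, $b_q^* \unrhd 0$ with respect to $\mathfrak{F}_{\rm ph}(\Lambda_0)_+$ for all $q \in \Lambda_0$.
(ii) $e^{-\beta N_{\rm ph}(\Lambda_0)} \unrhd 0$ with respect to $\mathfrak{F}_{\rm ph}(\Lambda_0)_+$ for all $\beta \in \mathbb{R}_+$, where $N_{\rm ph}(\Lambda_0)$ is the number operator on $\mathfrak{F}_{\rm ph}(\Lambda_0)$.

Let $\Lambda_1$ and $\Lambda_2$ be subsets of $\mathbb{Z}$ with $\Lambda_1 \cap \Lambda_2 = \emptyset$. Then it is well known that $\mathfrak{F}_{\rm ph}(\Lambda_1 \cup \Lambda_2) = \mathfrak{F}_{\rm ph}(\Lambda_1) \otimes \mathfrak{F}_{\rm ph}(\Lambda_2)$. Now consider the tensor product of self-dual cones $\mathfrak{F}_{\rm ph}(\Lambda_1)_+ \otimes \mathfrak{F}_{\rm ph}(\Lambda_2)_+$. By Proposition 2.15, $\mathfrak{F}_{\rm ph}(\Lambda_1)_+ \otimes \mathfrak{F}_{\rm ph}(\Lambda_2)_+$ is self-dual.

Proposition 8.16. One has $\mathfrak{F}_{\rm ph}(\Lambda_1 \cup \Lambda_2)_+ = \mathfrak{F}_{\rm ph}(\Lambda_1)_+ \otimes \mathfrak{F}_{\rm ph}(\Lambda_2)_+$.

Proof. We just remark the following fact: $\Psi_{\Lambda_1 \cup \Lambda_2}(n_1, \ldots, n_{|\Lambda_1|}, n'_1, \ldots, n'_{|\Lambda_2|}) = \Psi_{\Lambda_1}(n_1, \ldots, n_{|\Lambda_1|}) \otimes \Psi_{\Lambda_2}(n'_1, \ldots, n'_{|\Lambda_2|})$ for any $n_1, \ldots, n_{|\Lambda_1|} \in \mathbb{N}_0$ and $n'_1, \ldots, n'_{|\Lambda_2|} \in \mathbb{N}_0$.

8.6.5. Proof of Theorem 8.1 and the Fermi surface nesting

Set
\[
K_L(q) = K_0 - V_L(q), \qquad K_R(q) = K_0 - V_R(q), \tag{8.55}
\]
where
\[
V_L(q) = g \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} [\pi_\ell(c_k) \pi_r(c_k^*) \otimes \phi_{-q} + \pi_\ell(c_{-k-q}^*) \pi_r(c_{-k-q}) \otimes \phi_q + \text{adj.}],
\]
\[
V_R(q) = g \sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} [\pi_\ell(c_{k+q-Q}) \pi_r(c_{k+q-Q}^*) \otimes \phi_{-q} + \pi_\ell(c_{-k-Q}^*) \pi_r(c_{-k-Q}) \otimes \phi_q + \text{adj.}].
\]
We wish to apply the results in Sec. 6. To this end, we have to check several conditions.

Lemma 8.17 (Attractions). One has the following:
(i) $V(Q) \unrhd 0$ with respect to $L^2(\mathfrak{F}_L)_+ \otimes \mathfrak{F}_{{\rm ph},+}$.
(ii) $V_L(q) \unrhd 0$ and $V_R(q) \unrhd 0$ with respect to $L^2(\mathfrak{F}_L)_+ \otimes \mathfrak{F}_{{\rm ph},+}$ for all $q > \delta$.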
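Remark 8.18. Recall Definition 6.5 and Remark 6.6.

Proof. By Proposition 8.15(i), we see that $\phi_{\pm q} \unrhd 0$ with respect to $\mathfrak{F}_{{\rm ph},+}$. Hence by the above expressions of $V_L(q)$ and $V_R(q)$, we have the assertions in the lemma.

Then, by Theorem 6.10, we obtain the following.

Lemma 8.19. For all $q > \delta$, we have
\[
Z_\beta(q)^2 \le (\mathrm{tr} \exp\{-\beta K_L(q)\}) \times (\mathrm{tr} \exp\{-\beta K_R(q)\}). \tag{8.56}
\]

Definition 8.20. Note that
\[
|\{k \in \Lambda_L \mid k + q \in \Lambda_R\}| \le |\{k \in \Lambda_L \mid k + Q \in \Lambda_R\}| = |\Lambda_L|. \tag{8.57}
\]
There is equality only if $q = Q$. This property is often called the Fermi surface nesting.

An important consequence of the Fermi surface nesting is the following property:

Lemma 8.21. For each fixed $q > 0$ with $q \ne Q$, let $T_q$ be a unitary operator acting in $\mathfrak{F}_{\rm ph}$ defined by
\[
T_q b_q T_q^{-1} = b_Q, \quad T_q b_{-q} T_q^{-1} = b_{-Q}, \quad T_q b_{-Q} T_q^{-1} = b_{-q}, \quad T_q b_Q T_q^{-1} = b_q, \quad T_q b_{q'} T_q^{-1} = b_{q'} \ \text{ if } q' \notin \{\pm q, \pm Q\} \tag{8.58}
\]
and $T_q \Omega_{\rm ph} = \Omega_{\rm ph}$, where $\Omega_{\rm ph}$ is the Fock vacuum in $\mathfrak{F}_{\rm ph}$. Then one has
\[
K_L(q) \unrhd 1 \otimes T_q^{-1} K(Q) 1 \otimes T_q, \tag{8.59}
\]
\[
K_R(q) \unrhd 1 \otimes T_q^{-1} K(Q) 1 \otimes T_q \tag{8.60}
\]
with respect to $L^2(\mathfrak{F}_L)_+ \otimes \mathfrak{F}_{{\rm ph},+}$, for all $q > \delta$.

Proof. By (8.57), we have
\[
\sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} [\pi_\ell(c_k) \pi_r(c_k^*) \otimes \phi_{-q} + \text{adj.}] \unlhd \sum_{k \in \Lambda_L} [\pi_\ell(c_k) \pi_r(c_k^*) \otimes \phi_{-q} + \text{adj.}]
\]
and
\[
\sum_{k \in \Lambda_L,\, k+q \in \Lambda_R} [\pi_\ell(c_{-k-q}) \pi_r(c_{-k-q}^*) \otimes \phi_q + \text{adj.}] \unlhd \sum_{k \in \Lambda_L} [\pi_\ell(c_k) \pi_r(c_k^*) \otimes \phi_q + \text{adj.}],
\]
which imply
\[
V_L(q) \unlhd 1 \otimes T_q^{-1} V(Q) 1 \otimes T_q, \qquad V_R(q) \unlhd 1 \otimes T_q^{-1} V(Q) 1 \otimes T_q. \tag{8.61}
\]
Hence, noting $T_q^{-1} N_{\rm ph} T_q = N_{\rm ph}$, we have the assertions in the lemma.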
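The counting statement (8.57) behind the Fermi surface nesting can be checked directly for concrete parameters. The following is a minimal sketch with hypothetical helper names, using the half-open intervals of (8.5) as they read in this text. It assumes that δ/2 is not an integer (for example, δ odd), so that the shift by Q = 2k_F maps the left window exactly onto the right one.

```python
from fractions import Fraction

def lambda_left(k_f, delta):
    """Integers k with -k_F - delta/2 < k <= -k_F + delta/2, as in (8.5)."""
    half = Fraction(delta) / 2
    lo, hi = -k_f - half, -k_f + half
    return [k for k in range(-2 * k_f, 1) if lo < k <= hi]

def lambda_right(k_f, delta):
    """The reflected window -Lambda_L."""
    return sorted(-k for k in lambda_left(k_f, delta))

def nesting_count(k_f, delta, q):
    """Number of k in Lambda_L whose shift k + q lands in Lambda_R."""
    right = set(lambda_right(k_f, delta))
    return sum(1 for k in lambda_left(k_f, delta) if k + q in right)

k_f, delta = 10, 3          # illustrative values with 0 < delta < k_F / 2
big_q = 2 * k_f
counts = {q: nesting_count(k_f, delta, q) for q in range(delta + 1, 4 * k_f)}
print(counts[big_q], max(c for q, c in counts.items() if q != big_q))
```

For these illustrative values the full overlap occurs only at q = Q; every other shift q > δ misses at least one point of the right window.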
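Proof of Theorem 8.1. Since $T_q \unrhd 0$ with respect to $\mathfrak{F}_{{\rm ph},+}$, we have, by Lemma 8.21 and Theorem 6.12,
\[
\mathrm{tr} \exp\{-\beta K_L(q)\} \le \mathrm{tr} \exp\{-\beta K(Q)\} = Z_\beta(Q), \tag{8.62}
\]
\[
\mathrm{tr} \exp\{-\beta K_R(q)\} \le \mathrm{tr} \exp\{-\beta K(Q)\} = Z_\beta(Q). \tag{8.63}
\]
Combining these facts with (8.56), we can conclude that $Z_\beta(q) \le Z_\beta(Q)$. By Duhamel's formula, we have
\[
e^{-\beta K_L(q)} = \sum_{N \ge 0} D_{\beta,N}^{(L)}(q), \tag{8.64}
\]
\[
D_{\beta,N}^{(L)}(q) = \int_{S_N(\beta)} e^{-t_1 K_0} V_L(q) e^{-t_2 K_0} \cdots e^{-t_N K_0} V_L(q) e^{-(\beta - \sum_{j=1}^{N} t_j) K_0}. \tag{8.65}
\]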
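The first term of the Duhamel expansion (8.64)–(8.65) can be tested numerically in a finite-dimensional toy setting. The sketch below is only an illustration under stated assumptions: K_0 is replaced by a diagonal 2×2 matrix, V by a small symmetric matrix with non-negative entries, the matrix exponential is a plain Taylor series, and all function names are hypothetical. It compares the closed form of the N = 1 term with a finite difference of the exponential in the coupling.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def expm(a, terms=40):
    """Taylor series of exp(a); adequate for the small, well-scaled matrices used here."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def duhamel_first_order(energies, v, beta):
    """D_1 = int_0^beta e^{-t K0} V e^{-(beta - t) K0} dt for K0 = diag(energies)."""
    n = len(energies)
    d1 = [[0.0] * n for _ in range(n)]
    for i, ei in enumerate(energies):
        for j, ej in enumerate(energies):
            if ei == ej:
                weight = beta * math.exp(-beta * ei)
            else:
                weight = (math.exp(-beta * ej) - math.exp(-beta * ei)) / (ei - ej)
            d1[i][j] = v[i][j] * weight
    return d1

def first_order_gap(energies, v, beta, g=1e-4):
    """Max deviation between (e^{-beta(K0 - gV)} - e^{-beta K0}) / g and D_1."""
    n = len(energies)
    k0 = [[energies[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    full = expm([[-beta * (k0[i][j] - g * v[i][j]) for j in range(n)] for i in range(n)])
    free = expm([[-beta * k0[i][j] for j in range(n)] for i in range(n)])
    d1 = duhamel_first_order(energies, v, beta)
    return max(abs((full[i][j] - free[i][j]) / g - d1[i][j])
               for i in range(n) for j in range(n))

print(first_order_gap([0.0, 1.0], [[0.2, 0.5], [0.5, 0.1]], 1.0))
```

The weights multiplying the entries of V are positive, so a matrix V with non-negative entries yields a first-order term with non-negative entries. This is the elementary counterpart of the positivity of each term in the expansion.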
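Then, by (8.61) and the fact that $1 \otimes T_q$ commutes with $K_0$, we see that $\mathrm{tr}[D_{\beta,N}^{(L)}(q)] \le \mathrm{tr}[D_{\beta,N}^{(L)}(Q)]$. Now set $\Lambda_L(q) = \{k \in \Lambda_L \mid k + q \in \Lambda_R\}$. Then, by (8.57), $|\Lambda_L(q)| \le |\Lambda_L(Q)|$. There is equality only if $q = Q$ (the Fermi surface nesting). Accordingly, if $q \ne Q$, we see that
\begin{align*}
\mathrm{tr}[D_{\beta,1}^{(L)}(Q)] - \mathrm{tr}[D_{\beta,1}^{(L)}(q)]
&= \mathrm{tr}[1 \otimes T_q D_{\beta,1}^{(L)}(Q) 1 \otimes T_q^{-1}] - \mathrm{tr}[D_{\beta,1}^{(L)}(q)] \\
&= \sum_{k \in \Lambda_L \setminus \Lambda_L(q)} \int_0^\beta dt \, \mathrm{tr}[e^{-tK_0}\{\pi_\ell(c_k) \pi_r(c_k^*) \otimes \phi_{-q} + \pi_\ell(c_{-k-q}^*) \pi_r(c_{-k-q}) \otimes \phi_q + \text{adj.}\} e^{-(\beta - t)K_0}] > 0. \tag{8.66}
\end{align*}
Hence we have $\mathrm{tr}[e^{-\beta K_L(q)}] < Z_\beta(Q)$ if $q \ne Q$. Similarly we have $\mathrm{tr}[e^{-\beta K_R(q)}] < Z_\beta(Q)$ if $q \ne Q$. Now we have the desired results in the theorem.

8.7. Proof of Theorem 8.3

We will apply the DLS inequality II (Theorem 6.10). Under the identification $\mathfrak{F} = L^2(\mathfrak{F}_L)$ by $\Phi_\vartheta$, the electron number operator can be expressed as
\[
N_{\rm el} = \pi_\ell(N_L) - \pi_r(N_L) + \frac{|\Lambda|}{2}, \tag{8.67}
\]
where $N_L = \sum_{k \in \Lambda_L} c_k^* c_k$. Let $P_n$ be the orthogonal projection onto $\ker(N_{\rm el} - n - |\Lambda|/2)$. Then one sees
\[
P_n = \sum_{i - j = n} \pi_\ell([N_L](i)) \pi_r([N_L](j)), \tag{8.68}
\]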
where $[N_L](i)$ is the orthogonal projection onto $\ker(N_L - i)$. Thus, denoting the identity operator on $\mathfrak{F}_{\rm ph}$ by $1_{\rm ph}$, we obtain, by Theorem 6.10,
\begin{align*}
Z_\beta^{[n + |\Lambda|/2]}(q)
&= \mathrm{tr}[P_n \otimes 1_{\rm ph} e^{-\beta H(q)}] \\
&= \sum_{i - j = n} \mathrm{tr}[\pi_\ell([N_L](i)) \pi_r([N_L](j)) \otimes 1_{\rm ph} e^{-\beta H(q)}] \\
&\le \sum_{i - j = n} \{\mathrm{tr}[\pi_\ell([N_L](i)) \pi_r([N_L](i)) \otimes 1_{\rm ph} e^{-\beta K_L(q)}]\}^{1/2} \\
&\qquad \times \{\mathrm{tr}[\pi_\ell([N_L](j)) \pi_r([N_L](j)) \otimes 1_{\rm ph} e^{-\beta K_R(q)}]\}^{1/2} \\
&\le \frac{1}{2} \sum_{i - j = 0} \{\mathrm{tr}[\pi_\ell([N_L](i)) \pi_r([N_L](i)) \otimes 1_{\rm ph} e^{-\beta K_L(q)}] \\
&\qquad + \mathrm{tr}[\pi_\ell([N_L](j)) \pi_r([N_L](j)) \otimes 1_{\rm ph} e^{-\beta K_R(q)}]\} \\
&\le Z_\beta^{[|\Lambda|/2]}(Q), \tag{8.69}
\end{align*}
where we have used Lemma 8.21. By arguments similar to those in the proof of Theorem 8.1, we can see
\[
\sum_i \mathrm{tr}[\pi_\ell([N_L](i)) \pi_r([N_L](i)) \otimes 1_{\rm ph} e^{-\beta K_L(q)}] < Z_\beta^{[|\Lambda|/2]}(Q)
\]
and
\[
\sum_i \mathrm{tr}[\pi_\ell([N_L](i)) \pi_r([N_L](i)) \otimes 1_{\rm ph} e^{-\beta K_R(q)}] < Z_\beta^{[|\Lambda|/2]}(Q)
\]
if $q \ne Q$. (Note that the Fermi surface nesting is essential.)

8.8. Proof of Theorem 8.5

8.8.1. Structure of $\mathfrak{H}_{|\Lambda|/2}(\mathbf{s})$

The idea of the proof is a modification of the proof of Theorem 7.1. But for the reader's convenience, we will provide a complete proof. First we remark that, under the identification $\mathfrak{F} = L^2(\mathfrak{F}_L)$ by $\Phi_\vartheta$, $S_k$ can be represented by $S_k = \pi_\ell(n_k) - \pi_r(n_k) + 1$ with $n_k = c_k^* c_k$. Let $[S_k](s)$ be the orthogonal projection onto $\ker(S_k - s)$ for $s = 0, 1, 2$. Similarly let $[n_k](s)$ be the orthogonal projection onto $\ker(n_k - s)$ for $s = 0, 1$. Then we see that
\[
[S_k](0) = \pi_\ell([n_k](0)) \pi_r([n_k](1)), \tag{8.70}
\]
\[
[S_k](1) = \pi_\ell([n_k](0)) \pi_r([n_k](0)) + \pi_\ell([n_k](1)) \pi_r([n_k](1)), \tag{8.71}
\]
\[
[S_k](2) = \pi_\ell([n_k](1)) \pi_r([n_k](0)). \tag{8.72}
\]
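The structure of (8.70)–(8.72) can be read off from a single-mode bookkeeping exercise. The sketch below is a simplified model, not the construction of the paper: it assumes that, on a basis element labelled by a left occupation a and a right occupation b (each 0 or 1), the left and right representations of n_k act as multiplication by a and b respectively, so that S_k acts as a − b + 1. The function names are hypothetical.

```python
from itertools import product

def s_eigenvalue(left_occ, right_occ):
    """Eigenvalue of S_k = pi_l(n_k) - pi_r(n_k) + 1 on the basis element (left_occ, right_occ)."""
    return left_occ - right_occ + 1

def s_eigenspaces():
    """Group the four single-mode basis labels (a, b) by the eigenvalue of S_k."""
    spaces = {0: [], 1: [], 2: []}
    for a, b in product((0, 1), repeat=2):
        spaces[s_eigenvalue(a, b)].append((a, b))
    return spaces

print(s_eigenspaces())
```

In this model the eigenvalue 1 is the only doubly degenerate one, which matches the two-term form of (8.71), while the eigenvalues 0 and 2 each correspond to a single product of projections as in (8.70) and (8.72).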
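Then the orthogonal projection onto the Hilbert space $\mathfrak{H}_{|\Lambda|/2}(\mathbf{s})$ is given by
\[
P(\mathbf{s}) = \prod_{k \in \Lambda_L} [S_k](s_k) \tag{8.73}
\]
for all $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda|/2}\{0, 1, 2\}$.

Notation. As in the proof of Theorem 7.1, we will employ the following notations. (The reader should be careful about the slight differences between the notations of the two proofs.)
(i) For each $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda_L|}\{0, 1, 2\}$ and $j \in \{0, 1, 2\}$, we denote $\mathbf{s}(j) = \{s_k \mid s_k = j\}$. Hence $\mathbf{s} = \mathbf{s}(0) \cup \mathbf{s}(1) \cup \mathbf{s}(2)$.
(ii) For each $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda_L|}\{0, 1, 2\}$ and $j \in \{0, 1, 2\}$, we denote $\Lambda_L(\mathbf{s}(j)) = \{k \in \Lambda_L \mid s_k \in \mathbf{s}(j)\}$. Hence $\Lambda_L = \Lambda_L(\mathbf{s}(0)) \cup \Lambda_L(\mathbf{s}(1)) \cup \Lambda_L(\mathbf{s}(2))$.
(iii) For each subset $\Lambda' \subseteq \Lambda_L$ and $\mathbf{N} = \{N_k\}_k \in \times^{|\Lambda'|}\{0, 1\}$, we denote
\[
[n]_{\Lambda'}(\mathbf{N}) = \prod_{k \in \Lambda'} [n_k](N_k). \tag{8.74}
\]
We immediately obtain the following:

Lemma 8.22. For each $\mathbf{s} = \{s_k\}_k \in \times^{|\Lambda_L|}\{0, 1, 2\}$ and $j \in \{0, 1, 2\}$, set
\[
P(\mathbf{s}(j)) = \prod_{k \in \Lambda_L(\mathbf{s}(j))} [S_k](s_k). \tag{8.75}
\]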
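Then one has the following:
(i) $P(\mathbf{s}) = P(\mathbf{s}(0)) P(\mathbf{s}(1)) P(\mathbf{s}(2))$.
(ii) $P(\mathbf{s}(0)) = \pi_\ell([n]_{\Lambda_L(\mathbf{s}(0))}(\mathbf{0})) \pi_r([n]_{\Lambda_L(\mathbf{s}(0))}(\mathbf{1}))$, where $\mathbf{0} = \{0\}_{k \in \Lambda_L(\mathbf{s}(0))}$ and $\mathbf{1} = \{1\}_{k \in \Lambda_L(\mathbf{s}(0))}$.
(iii) $P(\mathbf{s}(1)) = \sum_{\mathbf{N} \in \times^{|\Lambda_L(\mathbf{s}(1))|}\{0,1\}} \pi_\ell([n]_{\Lambda_L(\mathbf{s}(1))}(\mathbf{N})) \pi_r([n]_{\Lambda_L(\mathbf{s}(1))}(\mathbf{N}))$.
(iv) $P(\mathbf{s}(2)) = \pi_\ell([n]_{\Lambda_L(\mathbf{s}(2))}(\mathbf{1})) \pi_r([n]_{\Lambda_L(\mathbf{s}(2))}(\mathbf{0}))$.

8.8.2. Proof of Theorem 8.5

Recall the definition of $P_0$, i.e. (8.68) with $n = 0$. Then, by Lemma 8.22, one obtains
\begin{align*}
Z_\beta(Q; \mathbf{s})
&= \mathrm{tr}[P_0 P(\mathbf{s}) \otimes 1_{\rm ph} e^{-\beta H(Q)}] \\
&= \sum_{i=0}^{|\Lambda|/2} \mathrm{tr}[\pi_\ell([N_L](i)) \pi_r([N_L](i)) P(\mathbf{s}(0)) P(\mathbf{s}(1)) P(\mathbf{s}(2)) \otimes 1_{\rm ph} e^{-\beta H(Q)}] \\
&= \sum_{i=0}^{|\Lambda|/2} \sum_{\mathbf{N} \in \times^{|\Lambda_L(\mathbf{s}(1))|}\{0,1\}} \mathrm{tr}[\pi_\ell([N_L](i) [n]_{\Lambda_L(\mathbf{s}(0))}(\mathbf{0}) [n]_{\Lambda_L(\mathbf{s}(1))}(\mathbf{N}) [n]_{\Lambda_L(\mathbf{s}(2))}(\mathbf{1})) \\
&\qquad \times \pi_r([N_L](i) [n]_{\Lambda_L(\mathbf{s}(0))}(\mathbf{1}) [n]_{\Lambda_L(\mathbf{s}(1))}(\mathbf{N}) [n]_{\Lambda_L(\mathbf{s}(2))}(\mathbf{0})) \otimes 1_{\rm ph} e^{-\beta H(Q)}]. \tag{8.76}
\end{align*}
Now set
\[
A(\mathbf{N}) = [n]_{\Lambda_L(\mathbf{s}(0))}(\mathbf{0}) [n]_{\Lambda_L(\mathbf{s}(1))}(\mathbf{N}) [n]_{\Lambda_L(\mathbf{s}(2))}(\mathbf{1}), \tag{8.77}
\]
\[
B(\mathbf{N}) = [n]_{\Lambda_L(\mathbf{s}(0))}(\mathbf{1}) [n]_{\Lambda_L(\mathbf{s}(1))}(\mathbf{N}) [n]_{\Lambda_L(\mathbf{s}(2))}(\mathbf{0}). \tag{8.78}
\]
Then using the DLS inequality II (Theorem 6.10), we have
\begin{align*}
\text{RHS of (8.76)}
&= \sum_{i=0}^{|\Lambda|/2} \sum_{\mathbf{N} \in \times^{|\Lambda_L(\mathbf{s}(1))|}\{0,1\}} \mathrm{tr}[\pi_\ell([N_L](i) A(\mathbf{N})) \pi_r([N_L](i) B(\mathbf{N})) \otimes 1_{\rm ph} e^{-\beta H(Q)}] \\
&\le \sum_{i=0}^{|\Lambda|/2} \sum_{\mathbf{N} \in \times^{|\Lambda_L(\mathbf{s}(1))|}\{0,1\}} \{\mathrm{tr}[\pi_\ell([N_L](i) A(\mathbf{N})) \pi_r([N_L](i) A(\mathbf{N})) \otimes 1_{\rm ph} e^{-\beta H(Q)}]\}^{1/2} \\
&\qquad \times \{\mathrm{tr}[\pi_\ell([N_L](i) B(\mathbf{N})) \pi_r([N_L](i) B(\mathbf{N})) \otimes 1_{\rm ph} e^{-\beta H(Q)}]\}^{1/2} \\
&\le Z_\beta(Q; \mathbf{1}). \tag{8.79}
\end{align*}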
It is not so difficult to see that there is equality only if s = 1. 8.9. Proof of Theorem 8.7 Let E = P0 P(1)L2 (FL ),
(8.80)
the electron part of H|Λ|/2 (1). Then one has H|Λ|/2 (1) = E ⊗ Fph . For each subset Λ0 ⊆ Z, we denote the bosonic Fock space over 2 (Λ0 ) by Fph (Λ0 ). Then one has the factorization property Fph = P ⊗ Fph (Z\{−Q, Q}),
P = Fph ({−Q, Q}).
Thus we have H (Q) = E ⊗ P and H|Λ|/2 (1) = E ⊗ [P ⊗ Fph (Z\{−Q, Q})] = H (Q) ⊗ Fph (Z\{−Q, Q}).
(8.81)
For all N = {nq }q∈Z\{−Q,Q} ∈ MQ , let Fph,N (Z\{−Q, Q}) = {Ψ ∈ Fph (Z\{−Q, Q}) | b∗q bq Ψ = nq Ψ, ∀ q ∈ Z\{−Q, Q}}.
Then Fph (Z\{−Q, Q}) =
Fph,N (Z\{−Q, Q}).
(8.82)
N ∈MQ
By the expression (8.82), H|Λ|/2 (1; N ) can be expressed as H|Λ|/2 (1; N ) = H (Q) ⊗ Fph,N (Z\{−Q, Q}). Then, by (8.81) and (8.82), one sees
H|Λ|/2 (1) = H (Q) ⊕
(8.83)
H|Λ|/2 (1; N ) ,
(8.84)
N ∈MQ \{0}
where 0 = {0}q∈Z\{−Q,Q} ∈ MQ . Lemma 8.23 (Decomposition of K(Q)). Let K = [π (dΓ(εL )) + πr (dΓ(εL ))] ⊗ 1 + 1 ⊗ ω[b∗Q bQ + b∗−Q b−Q ] − V (Q). Corresponding to (8.84), one obtains K(Q) = K ⊕ where N 1 =
(8.85)
(K + ω N 1 ) ,
(8.86)
N ∈MQ \{0}
q∈Z\{−Q,Q}
nq for each N = {nq } ∈ MQ \{0}.
Proof of Theorem 8.7. By Lemma 8.23, we have trH|Λ|/2 (1;N ) [e−βH(Q) ] = trH (Q) [e−βK ]e−βω N 1 ≤ trH (Q) [e−βK ].
(8.87)
There is equality only if N = 0. 8.10. Proofs of Proposition 8.8 and Theorem 8.10 8.10.1. Expression of Ptot on H (Q) and proof of Proposition 8.8 In the representation L2 (FL ) ⊗ Fph , Ptot has the following form k + Q|ΛL |, Ptot = π (PL ) − πr (PL ) − Qπr (NL ) + Pph + where PL = expression.
(8.88)
k∈ΛL
k∈ΛL
kc∗k ck . Noting π (PL ) = πr (PL ) on E, we obtain the following
Lemma 8.24. We have Ptot H (Q) = [Qb∗Q bQ − Qb∗−Q b−Q − Qπr (NL ) + P0 + Q|ΛL |] H (Q), where P0 = k∈ΛL k.
(8.89)
Proof of Proposition 8.8. The assertion is a corollary of Lemma 8.24. 8.10.2. Structures of H (Q; P ) Now let P$tot = Pph (Q) − Qπr (NL ),
Pph (Q) = Qb∗Q bQ − Qb∗−Q b−Q .
Then H (Q) has a direct sum decomposition H (Q; P ), H (Q; P ) = ker(P$tot − P ). H (Q) =
(8.90)
P ∈QZ b
b
Since the total momentum is conserved, i.e. e−iaPtot KeiaPtot = K for all a ∈ R, K has a corresponding decomposition as we described in Sec. 8.5.2: K= KP , KP = K H (Q; P ). (8.91) P ∈QZ
Remark 8.25. The definition (8.90) is slightly different from (8.28), i.e. in the definition (8.90), the factor Q|ΛL | appearing in (8.89) is omitted for a notational simplicity. However this modification does not affect any arguments below. (A) Structure of E: Let P(1) be the orthogonal projection given by (8.73) with s = 1. Then, by Lemma 8.22, π ([n]ΛL (N))πr ([n]ΛL (N)). (8.92) P(1) = N∈×|ΛL | {0,1}
For each µ = (µ1 , . . . , µn ) ∈ Θn = {µ = (µ1 , . . . , µn ) ∈ ×n ΛL | µ1 < µ2 < · · · < µn }, define eµ = Oc∗k ΩL ∈ FL . (8.93) k∈µ
Next we define eµ = |eµ eµ | for all µ ∈ Θn with e∅ = |ΩL ΩL |. Then by (8.92), we have the following. Lemma 8.26. (i) Let PE be the orthogonal projection onto E. Then one has |Λ|/2
PE = P0 P(1) = |e∅ e∅ | +
|eµ eµ |.
(8.94)
(ii) (Decomposition of E) For each n ∈ N0 , let |eµ eµ |, PE,0 = |e∅ e∅ |. PE,n =
(8.95)
n=1 µ∈Θn
µ∈Θn
Then one has the following decomposition |Λ|/2
E=
n=0
En ,
En = PE,n L2 (FL ).
(8.96)
(B) Structure of P: By the factorization property of the Fock space, we have P = Fph ({−Q, Q}) = Fph ({−Q}) ⊗ Fph ({Q}) =
∞
Pm,n ,
(8.97)
m,n=0
where Pm,n = Fph,m ({−Q}) ⊗ Fph,n ({Q}) and Fph,m (Λ0 ) means the m-phonon 2 subspace defined by Fph,m (Λ0 ) = ⊗m s (Λ0 ) for any subset Λ0 ⊆ Z. Using this formula, we see the following. Lemma 8.27 (Decomposition of P). Let P(P ) = ker(Pph (Q) − P ) for each P ∈ QZ. Then one has P(P ). (8.98) P= P ∈QZ
Moreover, we have
P(P ) =
Pm,n .
(8.99)
n−m=P/Q
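The bookkeeping behind Lemma 8.27 is that a state with m phonons of momentum −Q and n phonons of momentum +Q carries phonon momentum Q(n − m). The sketch below, with a hypothetical function name and an occupation cutoff that is an assumption of the illustration (the actual Fock space has no cutoff), groups the pairs (m, n) by this momentum.

```python
from collections import defaultdict

def phonon_sectors(big_q, max_occupation):
    """Group pairs (m, n) = (# of -Q phonons, # of +Q phonons) by P = Q (n - m).

    max_occupation truncates the occupation numbers for illustration only.
    """
    sectors = defaultdict(list)
    for m in range(max_occupation + 1):
        for n in range(max_occupation + 1):
            sectors[big_q * (n - m)].append((m, n))
    return dict(sectors)

sectors = phonon_sectors(big_q=20, max_occupation=2)
for momentum in sorted(sectors):
    print(momentum, sectors[momentum])
```

Each key is a multiple of Q, and the list attached to it contains exactly the pairs with n − m = P/Q, which is the index set of the direct sum in (8.99).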
(C) Structure of H (Q;P ): Summarizing the results in (A) and (B), we have the following. Proposition 8.28 (Decomposition of H (Q;P )). For all P ∈ QZ, we have H (Q; P ) = Em ⊗ P(Qn). (8.100) n−m=P/Q
8.10.3. A natural self-dual cone in H (Q; P ) (A) Electron part: We introduce a self-dual cone E+ in E by E+ = P0 P(1)L2 (FL )+ .
(8.101)
Similarly we define a self-dual cone in En by En,+ = PE,n L2 (FL )+
(8.102)
with E0,+ = R+ . Then, by Lemma 8.26, one easily sees the following. Lemma 8.29. E+ has a direct sum decomposition |ΛL |
E+ =
En,+ .
(8.103)
n=0
Each En,+ has the following expression En,+ = a µ e µ a µ ∈ R+ ∀ µ ∈ Θ n . µ∈Θn
(8.104)
(B) Phonon part: The phonon part P has a natural self-dual cone P+ = Fph ({−Q, Q})+. Set Pm,n,+ = Fph,m ({−Q})+ ⊗ Fph,n ({Q})+ . Then Pm,n,+ is selfdual as well. Lemma 8.30. One has the following: ∞ (i) P+ = m,n=0 Pm,n,+ . (ii) A natural self-dual cone P(P )+ in P(P ) is defined by P(P )+ = Pm,n,+
(8.105)
n−m=P/Q
for all P ∈ QZ. (C) A natural self-dual cone in H (Q; P )): We adopt the following definition of a self-dual cone in H (Q; P ). Definition 8.31. A natural self-dual cone in H (Q; P ) is given by H (Q; P )+ = Em,+ ⊗ P(Qn)+
(8.106)
n−m=P/Q
for all P ∈ QZ. 8.10.4. Uniqueness of a ground state of KP Here we will prove the following theorem by series of lemmas. Theorem 8.32. For all β > 0, e−βKP 0 with respect to H (Q; P )+ . Lemma 8.33. One has the following: (i) Let L0 = [π (dΓ(εL ))+πr (dΓ(εL ))]⊗1+1⊗ω[b∗Q bQ +b∗−Q b−Q ]. Then e−βL0 0 with respect to H (Q; P )+ for all β ≥ 0. (ii) (Attraction) V (Q) 0 with respect to H (Q; P )+ . (iii) e−βKP 0 with respect to H (Q; P )+ for all β ≥ 0. Proof. These are immediate consequences of Proposition 6.2 and Corollary 2.11.
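The mechanism of this subsection has an elementary finite-dimensional analogue, sketched below under explicit assumptions: the free part is a diagonal matrix, the attraction is a symmetric matrix with non-negative entries that connects all basis vectors only through its powers, the exponential is a plain Taylor series, and all names and numerical values are hypothetical. In that setting the semigroup has strictly positive entries and power iteration produces a ground state with strictly positive components.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def expm(a, terms=60):
    """Taylor series of exp(a); adequate for the small matrices used here."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for m in range(1, terms):
        term = [[x / m for x in row] for row in matmul(term, a)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def heat_kernel(l0_diag, v, beta):
    """exp(-beta (L0 - V)) for L0 = diag(l0_diag) and an attraction V with entries >= 0."""
    n = len(l0_diag)
    generator = [[beta * (v[i][j] - (l0_diag[i] if i == j else 0.0)) for j in range(n)]
                 for i in range(n)]
    return expm(generator)

def ground_state(l0_diag, v, beta=1.0, steps=500):
    """Power iteration with the heat kernel, started from a positive vector."""
    kernel = heat_kernel(l0_diag, v, beta)
    vec = [1.0] * len(l0_diag)
    for _ in range(steps):
        vec = [sum(k * x for k, x in zip(row, vec)) for row in kernel]
        norm = sum(x * x for x in vec) ** 0.5
        vec = [x / norm for x in vec]
    return vec

L0 = [0.0, 1.0, 2.0]                                     # toy free energies
V = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.0]]  # toy attraction
kernel = heat_kernel(L0, V, 1.0)
print(all(x > 0 for row in kernel for x in row), ground_state(L0, V))
```

Note that V itself has a vanishing entry between the first and last basis vectors, while the heat kernel does not. The positive entry comes from a second-order term of the Duhamel series, which is the toy version of the ergodicity argument used below.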
Since we are treating an unbounded interaction V (Q), we cannot apply Theorem 3.8 directly. To overcome this difficulty, we have to take a detour. Lemma 8.34. Let Pn Dn (β) = e−t1 L0 V (Q)e−t2 L0 · · · e−tn L0 V (Q)e−(β− j=1 tj )L0
(8.107)
Sn (β)
with D0 (β) = e−βL0 , where L0 is defined in Lemma 8.33(i). (Here we use the notation in the proof of Theorem 3.8.) Then the sequence of bounded operators
{Dn (β)}n is ergodic with respect to H (Q; P )+ in the sense that, for any ϕ, ψ ∈ H (Q; P )+ \{0}, there exists an n ∈ N0 such that ⟨ϕ, Dn (β)ψ⟩ > 0. Proof. In the first two steps, we will treat the case where P ≥ 0. In Step III, we will discuss how to extend the proofs in Steps I and II to the case where P < 0. Step I. In this step, we will show the following: Assume P ≥ 0. Suppose that, for each µ ∈ Θm , ν ∈ Θn and F ∈ Pi,i+m+P/Q,+ \{0}, G ∈ Pj,j+n+P/Q,+ \{0}, there exists an N ∈ N0 such that ⟨eµ ⊗ F, V (Q)^N eν ⊗ G⟩ > 0.
(8.108)
Then {Dn (β)}n is ergodic with respect to H (Q; P )+ . Proof of Step I. Let Ξ = {0, 1, . . . , |ΛL |} × N0 . Choose ϕ, ψ ∈ H (Q; P )+ \{0} arbitrarily. Both vectors ϕ, ψ are expressed as ϕm,i , ψ = ψp,j (8.109) ϕ= (m,i)∈Ξ
(p,j)∈Ξ
with ϕm,i , ψm,i ∈ Em,+ ⊗ Pi,i+m+P/Q,+ . Since both ϕ and ψ are non-zero, there exist (m, i) ∈ Ξ and (p, j) ∈ Ξ such that ϕm,i = 0 and ψp,j = 0. Hence, by (8.109), we have ϕ ≥ ϕm,i ,
ψ ≥ ψp,j
with respect to H (Q; P )+ , where we identify ϕm,i with Then one obtains
(8.110)
(α,β)∈Ξ δα,m δβ,i ϕα,β .
ϕ, DN (β)ψ ≥ ϕm,i , DN (β)ψp,j . Since {eµ }µ∈Θm is a CONS of Em , we see that |eµ eµ | ⊗ 1ph ϕm,i . ϕm,i =
(8.111)
(8.112)
µ∈Θm
Equivalently ϕm,i =
eµ ⊗ eµ , ϕm,i Em .
(8.113)
µ∈Θm
Since ϕm,i is non-zero, there exists a µ ∈ Θm such that F = eµ , ϕm,i Em is a non-zero vector in Pi,i+m+P/Q,+ . Hence since |eµ eµ | 0 with respect to Em,+ , we have ϕm,i ≥ eµ ⊗ F with respect to H (Q; P )+ . We note that F ≥ 0 with respect to Pi,i+m+P/Q,+ . Similarly we have ψp,j ≥ eν ⊗ G for some ν ∈ Θn and G ∈ Pj,j+n+P/Q,+ \{0}. Therefore, by (8.111), we arrive at ϕ, DN (β)ψ ≥ ϕm,i , DN (β)ψp,j ≥ eµ ⊗ F, DN (β)eν ⊗ G.
(8.114)
By the assumption (8.108), we have that eµ ⊗ F, DN (β)eν ⊗ G > 0 for some N ∈ N0 . Combining this with (8.114), we conclude ϕ, DN (β)ψ > 0. This completes the proof of Step I.
Step II. In this step, we will prove the following. Assume P ≥ 0. For each µ ∈ Θm , ν ∈ Θn and F ∈ Pi,i+m+P/Q,+ \{0}, G ∈ Pj,j+n+P/Q,+ \{0}, there exists an N ∈ N0 such that (8.108) is satisfied. Proof of Step II. For notational simplicity, we put g = 1/2. Then V (Q) = W ⊗ φ∗Q + W ∗ ⊗ φQ (8.115) with W = Σk∈ΛL πℓ (ck )πr (c∗k ). Write W = Σk∈ΛL Wk with Wk = πℓ (ck )πr (c∗k ). Note that Wk eµ = eµ\{k} ,
Wk∗ eµ = eµ∪{k} ,
(8.116)
/ µ, similarly eµ∪{k} = 0 if k ∈ µ. where eµ\{k} is understood as eµ\{k} = 0 if k ∈ ∗ Write µ = (µ1 , . . . , µm ). Since V (Q) Wk ⊗ φQ with respect to H (Q; P )+ for all k ∈ ΛL , we have V (Q)m Wµ1 Wµ2 · · · Wµm ⊗ bm Q
(8.117)
with respect to H (Q; P )+ . This implies V (Q)m eµ ⊗ F ≥ e∅ ⊗ bm QF
(8.118)
with respect to H (Q; P )+ . Note that bm Q F ∈ Pi,i+P/Q,+ \{0}. Next remark that V (Q)2i Wk1 Wk∗1 · · · Wki Wk∗i ⊗ bi−Q biQ
(8.119)
with respect to H (Q; P )+ . Hence one obtains i+m F V (Q)m+2i eµ ⊗ F ≥ e∅ ⊗ bi−Q bQ
(8.120)
i+m F ∈ P0,P/Q,+ \{0}. Similarly one with respect to H (Q; P )+ . Note that bi−Q bQ has
V (Q)n+2j eν ⊗ G ≥ e∅ ⊗ bj−Q bj+n Q G
(8.121)
with respect to H (Q; P )+ with bj−Q bj+n Q G ∈ P0,P/Q,+ \{0}. Finally we arrive at i+m eµ ⊗ F, V (Q)m+n+2i+2j eν ⊗ G ≥ bi−Q bQ F, bj−Q bj+n Q G > 0.
(8.122)
This completes the proof of Step II. Step III. In this final step, we will sketch how to extend the assertions in Steps I and II to the case where P < 0. The assertion in Step I is modified as follows. Assume P < 0. Suppose that, for each µ ∈ Θm , ν ∈ Θn and F ∈ Pi+|P |/Q,i+m,+ \{0}, G ∈ Pj+|P |/Q,j+n,+ \{0}, there exists an N ∈ N0 such that ⟨eµ ⊗ F, V (Q)^N eν ⊗ G⟩ > 0. Then {Dn (β)}n is ergodic with respect to H (Q; P )+ .
(8.123)
Proof of this statement is similar to the proof of Step I. Finally we remark that to check (8.123) is a slight modification of the proof of Step II. Proof of Theorem 8.32. By Lemma 8.34, for any ϕ, ψ ∈ H (Q; P )+ \{0}, there exists an N ∈ N0 so that ϕ, DN (β)ψ > 0. On the other hand, since DN (β) 0 with respect to H (Q; P )+ for all N ∈ N0 , we see that, by noting the fact e−βKP = n≥0 Dn (β) (Duhamel’s formula), e−βKP DN (β) with respect to H (Q; P )+ . Thus we obtain ϕ, e−βKP ψ ≥ ϕ, DN (β)ψ > 0. Hence by the Perron– Frobenius–Faris theorem (Theorem 2.13), we conclude e−βKP 0 with respect to H (Q; P )+ for all β > 0. 8.10.5. Proof of Theorem 8.10 By Theorems 8.32 and 2.13, the ground state ϕ(P ) of KP is unique. (Moreover ϕ(P ) > 0 with respect to H (Q; P )+ .) Properties (i)–(iii) immediately come from the structure of H (Q; P ) and (8.89). Acknowledgment I am indebted to the anonymous referees for their valuable comments on my previous draft. This work is supported by KAKENHI (20554421).
References
[1] M. Aizenman, E. H. Lieb, R. Seiringer, J. P. Solovej and J. Yngvason, Bose–Einstein quantum phase transition in an optical lattice model, Phys. Rev. A 70 (2004) 023612.
[2] A. Altland and B. Simons, Condensed Matter Field Theory, 2nd edn. (Cambridge University Press, New York, 2010).
[3] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137(2) (1998) 299–395.
[4] J. Bardeen, L. N. Cooper and J. R. Schrieffer, Theory of superconductivity, Phys. Rev. 108 (1957) 1175–1204.
[5] A. Beurling and J. Deny, Dirichlet spaces, Proc. Nat. Acad. Sci. 45 (1959) 208–215.
[6] W. Bös, Direct integrals of selfdual cones and standard forms of von Neumann algebras, Invent. Math. 37(3) (1976) 241–251.
[7] L. N. Cooper, Bound electron pairs in a degenerate Fermi gas, Phys. Rev. 104 (1956) 1189–1190.
[8] F. J. Dyson, E. H. Lieb and B. Simon, Phase transitions in quantum spin systems with isotropic and nonisotropic interactions, J. Stat. Phys. 18 (1978) 335–383.
[9] W. G. Faris, Invariant cones and uniqueness of the ground state for fermion systems, J. Math. Phys. 13 (1972) 1285–1290.
[10] J. Glimm and A. Jaffe, The λ(ϕ4 )2 quantum field theory without cutoffs: II. The field operators and the approximate vacuum, Ann. Math. 91 (1970) 362–401.
[11] G. Frobenius, Über Matrizen aus nicht negativen Elementen, Sitzungsber. Königl. Preuss. Akad. Wiss. 26 (1912) 456–477.
[12] H. Fröhlich, Electrons in lattice fields, Adv. Phys. 3 (1954) 325–361.
[13] H. Fröhlich, On the theory of superconductivity: The one-dimensional case, Proc. R. Soc. A 223 (1954) 296–305.
[14] J. Fröhlich, On the infrared problem in a model of scalar electrons and massless, scalar bosons, Ann. Inst. H. Poincaré Sect. A (N.S.) 19 (1973) 1–103.
[15] J. Fröhlich, Existence of dressed one electron states in a class of persistent models, Fortschr. Phys. 22(3) (1974) 150–198.
[16] J. Fröhlich and E. H. Lieb, Phase transitions in anisotropic lattice spin systems, Comm. Math. Phys. 60 (1978) 233–267.
[17] J. Fröhlich, B. Simon and T. Spencer, Infrared bounds, phase transitions and continuous symmetry breaking, Comm. Math. Phys. 50 (1976) 79–95.
[18] J. Fröhlich, R. Israel, E. H. Lieb and B. Simon, Phase transitions and reflection positivity. I. General theory and long range lattice models, Comm. Math. Phys. 62 (1978) 1–34.
[19] J. Fröhlich, R. Israel, E. H. Lieb and B. Simon, Phase transitions and reflection positivity. II. Lattice systems with short-range and Coulomb interactions, J. Stat. Phys. 22 (1980) 297–347.
[20] J. Fröhlich, The pure phases (harmonic functions) of generalized processes or: Mathematical physics of phase transitions and symmetry breaking, Bull. Amer. Math. Soc. 84 (1978) 165–193.
[21] B. Gerlach and H. Löwen, Analytical properties of polaron systems or: Do polaronic phase transitions exist or not?, Rev. Mod. Phys. 63(1) (1991) 63–90.
[22] L. Gross, Existence and uniqueness of physical ground states, J. Funct. Anal. 10 (1972) 52–109.
[23] G. Grüner, Density Waves in Solids, Frontiers in Physics, Vol. 89 (Addison-Wesley, 1994).
[24] F. Hiroshima, Ground states of a model in nonrelativistic quantum electrodynamics. II, J. Math. Phys. 41(2) (2000) 661–674.
[25] U. Haagerup, The standard form of von Neumann algebras, Math. Scand. 37(2) (1975) 271–283.
[26] O. J. Heilmann and E. H. Lieb, Lattice models for liquid crystals, J. Stat. Phys. 20 (1979) 673–693.
[27] B. D. Josephson, Possible new effects in superconductive tunnelling, Phys. Lett. 1 (1962) 251–253.
[28] T. Kennedy and E. H. Lieb, Proof of the Peierls instability in one dimension, Phys. Rev. Lett. 59 (1987) 1309–1312.
[29] T. Kennedy, E. H. Lieb and B. S. Shastry, Existence of Néel order in some spin-1/2 Heisenberg antiferromagnets, J. Stat. Phys. 53 (1988) 1019–1030.
[30] K. Kubo and T. Kishi, Rigorous bounds on the susceptibilities of the Hubbard model, Phys. Rev. B 41 (1990) 4866–4868.
[31] P. A. Lee, T. M. Rice and P. W. Anderson, Conductivity from charge or spin density waves, Solid State Commun. 14 (1974) 703–709.
[32] E. H. Lieb, Two theorems on the Hubbard model, Phys. Rev. Lett. 62 (1989) 1201–1204.
[33] E. H. Lieb, Flux phase of the half-filled band, Phys. Rev. Lett. 73 (1994) 2158–2161.
[34] E. H. Lieb and D. C. Mattis, Ordering energy levels of interacting spin systems, J. Math. Phys. 3 (1962) 749–751.
[35] E. H. Lieb and B. Nachtergaele, Stability of the Peierls instability for ring-shaped molecules, Phys. Rev. B 51 (1995) 4777–4791.
[36] E. H. Lieb and F. Y. Wu, Absence of Mott transition in an exact solution of the short-range, one-band model in one dimension, Phys. Rev. Lett. 20 (1968) 1445–1448.
[37] T. Miyao, Nondegeneracy of ground states in nonrelativistic quantum field theory, J. Operat. Theor. 64 (2010) 207–241.
[38] T. Miyao, The polaron: From a viewpoint of operator inequalities, preprint.
[39] Y. Miura, On order of operators preserving selfdual cones in standard forms, Far East J. Math. Sci. (FJMS) 8(1) (2003) 1–9.
[40] J. S. Møller, The translation invariant massive Nelson model. I. The bottom of the spectrum, Ann. Henri Poincaré 6(6) (2005) 1091–1135.
[41] Y. Nagaoka, Ferromagnetism in a narrow, almost half-filled s band, Phys. Rev. 147 (1966) 392–405.
[42] E. Nelson, Notes on non-commutative integration, J. Funct. Anal. 15 (1974) 103–116.
[43] K. Osterwalder and R. Schrader, Axioms for Euclidean Green's functions, Comm. Math. Phys. 31 (1973) 83–112.
[44] R. E. Peierls, Quantum Theory of Solids (Oxford at the Clarendon Press, 1956).
[45] O. Perron, Zur Theorie der Matrices, Math. Ann. 64 (1907) 248–263.
[46] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. II (Academic Press, New York, 1975).
[47] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. IV (Academic Press, New York, 1978).
[48] I. E. Segal, A non-commutative extension of abstract integration, Ann. Math. 57 (1953) 401–457.
[49] B. Simon and R. Høegh-Krohn, Hypercontractive semigroups and two dimensional self-coupled Bose fields, J. Funct. Anal. 9 (1972) 121–180.
[50] A. D. Sloan, A nonperturbative approach to nondegeneracy of ground states in quantum field theory: Polaron models, J. Funct. Anal. 16 (1974) 161–191.
[51] A. D. Sloan, Analytic domination with quadratic form type estimates and nondegeneracy of ground states in quantum field theory, Trans. Amer. Math. Soc. 194 (1974) 325–336.
[52] M. Takesaki, Theory of Operator Algebras II (Springer-Verlag, 2003).
September 20, J070-S0129055X11004436
2011 11:25 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 8 (2011) 823–838 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004436
SCHRÖDINGER EQUATIONS WITH TIME-DEPENDENT UNBOUNDED SINGULAR POTENTIALS
KENJI YAJIMA Department of Mathematics, Gakushuin University, 1-5-1 Mejiro, Toshima-ku, Tokyo 171-8588, Japan
[email protected]
Received 16 April 2011
Revised 12 July 2011

We consider time-dependent perturbations by unbounded potentials of Schrödinger operators with scalar and magnetic potentials which are almost critical for the selfadjointness. We show that the corresponding time-dependent Schrödinger equations generate a unique unitary propagator if perturbations of scalar and magnetic potentials are differentiable with respect to the time variable and they increase at the spatial infinity at most quadratically and at most linearly, respectively, where both have mild local singularities. We use time-dependent gauge transforms and apply Kato's abstract theorem on evolution equations.

Keywords: Time-dependent Schrödinger equations; initial value problem; propagators.
Mathematics Subject Classification 2010: 35Q41, 81Q15
1. Introduction and Theorem

We consider the problem of the existence and the uniqueness of a unitary propagator in the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^d)$ for time-dependent Schrödinger equations
\[
i\frac{\partial u}{\partial t} = H(t)u(t) \equiv -\frac12\nabla_{A(t)}^2 u + V(t,x)u, \qquad \nabla_{A(t)} = \nabla - iA(t,x), \tag{1}
\]
where $A(t,x) = (A_1(t,x), \ldots, A_d(t,x)) \in \mathbb{R}^d$ and $V(t,x) \in \mathbb{R}$ are magnetic vector and electric scalar potentials, respectively. When the Hamiltonian $H(t)$ is $t$-independent, $H(t) = H$, it is well known that the problem is equivalent to that of the selfadjointness of $H$ and, after a long and extensive study by many authors, it has almost been settled now ([2, 14]). There are two types of results, one for the essential selfadjointness of $H$ on $C_0^\infty(\mathbb{R}^d)$ and the other for the selfadjointness of the maximal operator $H$ via the theory of quadratic forms. In what follows, the letter $C$ will denote various constants which may change from line to line and whose exact values are not important. We
write $\langle x\rangle = (1+|x|^2)^{1/2}$; $L^p = L^p(\mathbb{R}^d)$, $1 \le p \le \infty$, are Lebesgue spaces and $L^p_{\mathrm{loc}} = L^p_{\mathrm{loc}}(\mathbb{R}^d)$ are their localizations.

(a) If $V = V_1 + V_2$ with $V_1 \in L^2_{\mathrm{loc}}$ such that $V_1(x) \ge -C\langle x\rangle^2$ and $V_2$ of Stummel type and if $A \in L^4_{\mathrm{loc}}$ with $\operatorname{div}A \in L^2$, then $H$ on $C_0^\infty(\mathbb{R}^d)$ is essentially selfadjoint ([12]). If we denote the closure again by $H$, then the solution of (1) with the initial condition $u(0) = \varphi \in \mathcal{H}$ is uniquely given by $u(t,x) = (e^{-itH}\varphi)(x)$ and $e^{-itH}$ is the unique propagator.

When $V$ and $A$ are locally more singular than in (a) and $H$ cannot be defined on $C_0^\infty(\mathbb{R}^d)$, the theory of quadratic forms may be used to define $H$. In what follows, $\|u\|$ is the norm of $\mathcal{H} = L^2(\mathbb{R}^d)$ and $(u,v)$ is the inner product.

(b) If $V = V_1 + V_2$ with $V_1 \in L^1_{\mathrm{loc}}$ such that $V_1(x) \ge -C$ and $V_2$ of Kato class and if $A \in L^2_{\mathrm{loc}}$, then the form $q(u) = \frac12\|\nabla_A u\|^2 + (Vu,u)$ with domain $D(q) = \{\nabla_A u \in L^2,\ |V|^{1/2}u \in L^2\}$ is closed and bounded from below and $C_0^\infty$ is a core. The selfadjoint operator $H$ defined by $q$ has domain $D(H) = \{u \in L^2 : \nabla_A u \in L^2,\ Vu \in L^1_{\mathrm{loc}},\ Hu \in L^2\}$ and is the maximal operator, viz. $D(H) = \{u \in L^2 : Vu \in L^1_{\mathrm{loc}},\ Hu \in L^2\}$ if $A$ is of $C^1$ class ([2, 10]). If $\varphi \in D(H)$, then $u(t,x) = (e^{-itH}\varphi)(x)$ satisfies (1) in the sense of distributions. The unitary propagator which satisfies the last property is unique.

We recall that $W(t,x)$ is of Stummel class uniformly with respect to $t \in I$ if
\[
\lim_{\varepsilon\to 0}\ \sup_{t\in I,\ x\in\mathbb{R}^d}\int_{|x-y|<\varepsilon}\frac{|W(t,y)|^2}{|x-y|^{d-4}}\,dy = 0, \tag{2}
\]
where $|x-y|^{4-d}$ should be replaced by $|\log|x-y||$ if $d = 4$ and by $1$ if $1 \le d \le 3$; $W(t,x)$ is of Kato class uniformly with respect to $t \in I$ if
\[
\lim_{\varepsilon\to 0}\ \sup_{t\in I,\ x\in\mathbb{R}^d}\int_{|x-y|<\varepsilon}\frac{|W(t,y)|}{|x-y|^{d-2}}\,dy = 0, \tag{3}
\]
where $|x-y|^{2-d}$ should be replaced by $|\log|x-y||$ if $d = 2$ and by $1$ if $d = 1$.

As is well known, conditions in (a) and (b) are almost critical in both cases (see, however, [7] for the effect of strong magnetic fields: $H$ can be essentially selfadjoint even if $V$ blows up in the negative direction like $V(x) \le -C\langle x\rangle^{2+\varepsilon}$, $\varepsilon > 0$, if strongly divergent magnetic fields are present near infinity).

In this paper, we ask to what extent the property of the generation of a unique unitary propagator is universally stable if the operators $H$ in (a) or (b) are perturbed by genuinely time-dependent potentials, viz. we seek a class of time-dependent perturbations which, when added to any stationary potentials of (a) or (b), still generate a unique unitary propagator. We also ask how to accommodate large negative potentials $\ge -C\langle x\rangle^2$ in the case (b). We shall show, by using a simple time-dependent gauge transform and by applying Kato's abstract theory of evolution equations, that the property is stable under perturbations by time-dependent scalar and magnetic potentials which increase as
$|x| \to \infty$ at most quadratically and linearly, respectively, where both have mild singularities. In particular, we show for the case (b) that $H(t)$ generates a unique unitary propagator even if $V(t,x)$ has a quadratically decreasing large negative part. We state and prove two theorems corresponding to the cases (a) and (b) under separate assumptions.

Multiplication operators by $V(t,x)$, $A(t,x)$, etc. will be denoted by $V(t)$, $A(t)$, etc., respectively, and $\dot A(t,x) = \partial_t A(t,x)$, etc.;
\[
\Lambda = -\frac12\Delta + \frac12 x^2, \qquad D(\Lambda) = \{u : x^\alpha\partial^\beta u \in L^2 \ \text{for all } |\alpha+\beta| \le 2\}
\]
is the harmonic oscillator; $I$ is a compact interval; for Banach spaces $X$ and $Y$, $B(X,Y)$ is the Banach space of bounded operators from $X$ to $Y$; and $B(X) = B(X,X)$.

Assumption 1.1. (1) For $t \in I$, $A(t,\cdot) \in L^4_{\mathrm{loc}}$ and $\operatorname{div}_x A(t,\cdot) \in L^2_{\mathrm{loc}}$. For almost all $x \in \mathbb{R}^d$, $A(t,x)$ is absolutely continuous with respect to $t \in I$, and with a constant $M$,
\[
\|\operatorname{div}_x\dot A(t)\Lambda^{-1}\|_{B(\mathcal{H})} + \|\dot A(t)^2\Lambda^{-1}\|_{B(\mathcal{H})} + \|\nabla_x(\dot A(t)^2)\Lambda^{-1}\|_{B(\mathcal{H})} \le M, \qquad \text{a.e. } t \in I.
\]
(2) $V = V_1 + V_2$ with $V_1(t,\cdot) \in L^2_{\mathrm{loc}}$ such that, for a constant $C_* > 0$,
\[
-C_*\langle x\rangle^2 \le V_1(t,x), \qquad (t,x) \in I\times\mathbb{R}^d,
\]
and $V_2$ of Stummel class uniformly with respect to $t \in I$. For almost all $x \in \mathbb{R}^d$, $V(t,x)$ is absolutely continuous with respect to $t$, and with a constant $M > 0$
\[
\|\dot V(t)\Lambda^{-1}\|_{B(L^2)} \le M, \qquad \text{a.e. } t \in I.
\]
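We define, using the constant $C_* > 0$ of Assumption 1.1, as follows:
\[
\tilde V(t,x) = V(t,x) + 2C_*\langle x\rangle^2, \qquad \tilde V_1(t,x) = V_1(t,x) + 2C_*\langle x\rangle^2, \tag{4}
\]
so that $\tilde V_1(t,x) \ge C_*\langle x\rangle^2$, and
\[
H_0(t) = -\frac12\nabla_{A(t)}^2 + \tilde V(t,x), \qquad H_1(t) = -\frac12\nabla_{A(t)}^2 + \tilde V_1(t,x). \tag{5}
\]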
By virtue of result (a) above, $H(t)$, $H_0(t)$ and $H_1(t)$ are all essentially selfadjoint on $C_0^\infty(\mathbb{R}^d)$ for every $t \in I$, and we denote the selfadjoint extensions by the same symbols. We have $D(H_0(t)) = D(H_1(t))$ for all $t \in I$ since $V_2(t,x)$ is of Stummel class (see Sec. 4). The operator $H_1(t)$ will not appear in the statement of Theorem 1.2 (but will appear in the proof) and it is introduced here to compare it with $Q_1(t)$ which will appear before Theorem 1.4.

Theorem 1.2. Suppose that $A$ and $V$ satisfy Assumption 1.1. Then:
(1) Operators $H_0(t)$, $t \in I$, have a common domain, which we denote by $\mathcal{D}$, and $\mathcal{D} \subset D(H(t))$ for all $t \in I$. We equip $\mathcal{D}$ with the graph norm of $H_0(t_0)$, $t_0 \in I$ being chosen and fixed arbitrarily.
(2) There uniquely exists a strongly continuous family of unitary operators $\{U(t,s) : t,s \in I\}$ in $\mathcal{H}$ which satisfies the following properties:
(a) $U(t,s)U(s,r) = U(t,r)$, $U(s,s) = 1$ for all $t,s,r \in I$.
(b) $U(t,s)\mathcal{D} = \mathcal{D}$ and $(t,s) \mapsto U(t,s) \in B(\mathcal{D})$ is strongly continuous.
(c) If $\varphi \in \mathcal{D}$, $(t,s) \mapsto U(t,s)\varphi \in \mathcal{H}$ is of class $C^1$ and
\[
i\partial_t U(t,s)\varphi = H(t)U(t,s)\varphi, \qquad i\partial_s U(t,s)\varphi = -U(t,s)H(s)\varphi. \tag{6}
\]
(3) For $\varphi \in \mathcal{H}$, $u(t,x) = (U(t,s)\varphi)(x)$ satisfies Eq. (1) in the sense of distributions.

We emphasize that $H(t)$ can have wildly changing domains; nevertheless, $D(H_0(t)) \equiv \mathcal{D}$ is independent of $t$, and this $\mathcal{D}$ is preserved by the propagator $U(t,s)$. We next formulate the result for the perturbations of potentials of (b). We write $Q(u,u) = Q(u)$ for quadratic forms $Q(u,v)$.

Assumption 1.3. (1) For $t \in I$, $A(t,\cdot) \in L^2_{\mathrm{loc}}$. For almost all $x \in \mathbb{R}^d$, $A(t,x)$ is absolutely continuous with respect to $t$ and, for a constant $M$,
\[
\|\dot A(t)\Lambda^{-1/2}\|_{B(\mathcal{H})} \le M, \qquad \text{a.e. } t \in I.
\]
(2) $V = V_1 + V_2$ with $V_1(t,\cdot) \in L^1_{\mathrm{loc}}$ such that, for a constant $C_* > 0$,
\[
-C_*\langle x\rangle^2 \le V_1(t,x), \qquad (t,x) \in I\times\mathbb{R}^d,
\]
and $V_2$ of Kato class uniformly with respect to $t \in I$. For almost all $x \in \mathbb{R}^d$, $V(t,x)$ is absolutely continuous with respect to $t$ and, for a constant $M$,
\[
\|\Lambda^{-1/2}\dot V(t)\Lambda^{-1/2}\|_{B(\mathcal{H})} \le M, \qquad \text{a.e. } t \in I.
\]
When $V$ satisfies Assumption 1.3, we define $\tilde V(t,x)$ and $\tilde V_1(t,x)$ again by (4). Then, by virtue of the result (b) above, the quadratic form defined by
\[
Q_1(t)(u) = \frac12\|\nabla_{A(t)}u\|^2 + (\tilde V_1(t,x)u, u) \tag{7}
\]
with domain $D(Q_1(t)) = \{u \in L^2 : \nabla_{A(t)}u \in L^2,\ \tilde V_1(t,x)^{1/2}u \in L^2\}$ is positive and closed and $C_0^\infty(\mathbb{R}^d)$ is a core. Since $V_2(t)$ is of Kato class uniformly with respect to $t \in I$, it is uniformly $Q_1(t)$-form bounded with bound 0 (see Lemma 3.1). It follows that the quadratic form
\[
Q_0(t)(u) = \frac12\|\nabla_{A(t)}u\|^2 + (\tilde V(t,x)u, u), \qquad D(Q_0(t)) = D(Q_1(t)) \tag{8}
\]
is again closed and bounded from below and $C_0^\infty(\mathbb{R}^d)$ is a core of $Q_0(t)$. We may and do assume in what follows, without loss of generality, that $Q_0(t) \ge 1$ by choosing $C_*$ sufficiently large. We denote by $H_0(t)$ and $H_1(t)$ the selfadjoint operators defined by $Q_0(t)$ and $Q_1(t)$, respectively. This should not cause confusion since these operators are the same as the previously defined $H_0(t)$ and $H_1(t)$ if Assumption 1.1 is satisfied.
Theorem 1.4. Suppose that $A$ and $V$ satisfy Assumption 1.3. Then:
(1) Quadratic forms $Q_0(t)$, $t \in I$, have a common domain which is contained in $D(\Lambda^{1/2})$, and which we denote by $\mathcal{Y}$. We equip $\mathcal{Y}$ with the inner product $Q_0(t_0)(u,v)$, $t_0 \in I$ being arbitrarily chosen and fixed, so that $\mathcal{Y}$ becomes a Hilbert space. We let $\mathcal{X}$ be its dual space so that $\mathcal{Y} \subset \mathcal{H} \subset \mathcal{X}$ constitutes a Gel'fand triplet of Hilbert spaces.
(2) Operators $H(t)$ from $\mathcal{Y}$ to $\mathcal{X}$ defined for $t \in I$ by
\[
(H(t)u, v) = \frac12(\nabla_{A(t)}u, \nabla_{A(t)}v) + (V(t,x)u, v), \qquad u, v \in \mathcal{Y},
\]
are bounded and norm continuous with respect to $t \in I$.
(3) There uniquely exists a strongly continuous family $\{U(t,s) : t,s \in I\}$ of unitary operators in $\mathcal{H}$ which satisfies the following properties:
(a) $U(t,s)U(s,r) = U(t,r)$, $U(s,s) = I$ for all $t$, $s$ and $r \in I$.
(b) $U(t,s)\mathcal{Y} = \mathcal{Y}$ and $(t,s) \mapsto U(t,s) \in B(\mathcal{Y})$ is strongly continuous.
(c) $U(t,s)$ extends to a bounded operator in $\mathcal{X}$, which we denote by the same symbol, and $(t,s) \mapsto U(t,s) \in B(\mathcal{X})$ is strongly continuous.
(d) If $\varphi \in \mathcal{Y}$, $(t,s) \mapsto U(t,s)\varphi \in \mathcal{X}$ is of class $C^1$ and it satisfies
\[
i\partial_t U(t,s)\varphi = H(t)U(t,s)\varphi, \qquad i\partial_s U(t,s)\varphi = -U(t,s)H(s)\varphi.
\]
Remark 1.5. Since $C_0^\infty(\mathbb{R}^d) \subset \mathcal{Y}$ is dense, $\mathcal{X}$ can be embedded into the space of distributions and for $\varphi \in \mathcal{Y}$, $u(t,x) = U(t,s)\varphi(x)$ satisfies (1) in the sense of distributions. If local singularities of $A$ and $V$ are as in Assumption 1.1 and $A \in L^1_{\mathrm{loc}}(I_t, L^4_{\mathrm{loc}})$, $\operatorname{div}_x A \in L^1_{\mathrm{loc}}(I_t, L^2_{\mathrm{loc}})$ and $V \in L^1(I_t, L^2_{\mathrm{loc}})$, then $u(t,x) = U(t,s)\varphi(x)$ for $\varphi \in \mathcal{H}$ also satisfies (1) in the sense of distributions. This is easily seen by approximating $\varphi$ by $\varphi_n \in \mathcal{Y}$ such that $\|\varphi_n - \varphi\| \to 0$ as $n \to \infty$.

Remark 1.6. If we take $t_0$ arbitrarily and define $A(x) = A(t_0,x)$, $V(x) = V(t_0,x)$ and $A_1(t,x) = A(t,x) - A(t_0,x)$, $V_1(t,x) = V(t,x) - V(t_0,x)$ in Theorems 1.2 and 1.4, then $A(x)$ and $V(x)$ satisfy the conditions of (a) and (b), respectively, and $A(t,x) = A(x) + A_1(t,x)$ and $V(t,x) = V(x) + V_1(t,x)$ are considered as perturbations of $A(x)$ and $V(x)$ by
\[
A_1(t,x) = \int_{t_0}^t \dot A(s,x)\,ds \quad\text{and}\quad V_1(t,x) = \int_{t_0}^t \dot V(s,x)\,ds,
\]
respectively, which are, roughly speaking, $\Lambda^{1/2}$- and $\Lambda$-(form)bounded. We remark that the conditions in Assumptions 1.1 and 1.3 are almost sharp if universality is insisted upon, as is already shown by time-independent perturbations: the perturbation of $-\frac12\Delta$ by a super-quadratic negative potential, $H = -\frac12\Delta - \langle x\rangle^{2+\varepsilon}$, is not essentially selfadjoint on $C_0^\infty(\mathbb{R}^d)$ and has infinitely many selfadjoint extensions determined by boundary conditions at infinity, and different boundary conditions produce different unitary propagators for (1).
The construction of unitary propagators for time-dependent Schrödinger equations is an old problem. There is a large body of literature on the subject and the problem has been studied by many authors by using various methods. We refer readers to the monographs of Reed and Simon [14, Sec. X.12], Lions and Magenes [11] and references therein for older literature on the method of semi-group theory, of parabolic regularization and of smooth quadratic forms. For newer results on the subject, we mention that the energy method has been applied by Ichinose [6] and Doi [3], the extended phase space technique has been introduced by Howland [5], a method for directly constructing the integral kernel of the propagator has been invented by Fujiwara [4] (see also [17]) and that Strichartz estimates, which are a consequence of [4, 17], have been used to construct the propagator for equations with very singular scalar potentials (see [18]). In spite of these, to the best knowledge of the author, there is no literature which studies the problem from the point of view of this paper: We take unperturbed stationary potentials which can behave as wildly as possible locally and globally, in the scope of selfadjointness. Thus, when looked at for any fixed $t \in I$, the class of potentials treated here is much wider than any in the existing literature. However, the time-dependent perturbations considered here are actually rather moderate as stated in the theorems and Remark 1.6 though they are unbounded and not smooth. This is somewhat unavoidable since we are seeking perturbations which apply universally for stationary potentials of (a) or (b) and there is little hope that any additional properties other than unitarity like local smoothing properties or Strichartz' inequality are present for the unperturbed propagator $e^{-itH}$ which could be used for constructing $U(t,s)$.
Indeed, smoothing properties of $e^{-itH}$ are getting weaker indefinitely as the growth rate at infinity of scalar potentials increases indefinitely (see [19, 20]). It is possible to accommodate perturbations with stronger local singularities when unperturbed potentials have nicer properties as mentioned above (e.g., [16–18]). In the following Sec. 2, we define the gauge transform which plays an important role and present the basic idea of the proof of the theorems and the plan of the paper.

2. Gauge Transform and the Idea of the Proof

We use the time-dependent gauge transform $G(t)$ defined by
\[
G(t)u(x) = e^{i2tC_*\langle x\rangle^2}u(x), \qquad t \in I, \tag{9}
\]
by using the constant $C_*$ of Assumptions 1.1 and 1.3. It is obvious that $\{G(t) : t \in I\}$ is a strongly continuous unitary group in $\mathcal{H}$ and that, if $u$ and $v$ are related by $u(t,x) = G(t)v(t,x)$, then $u(t,x)$ satisfies Eq. (1) if and only if $v(t,x)$ satisfies
\[
i\partial_t v = -\frac12\nabla_{\tilde A(t)}^2 v + \tilde V(t,x)v \equiv \tilde H_0(t)v, \qquad \tilde A(t,x) = A(t,x) - 4tC_*x. \tag{10}
\]
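The passage from (1) to (10) can be checked by a short formal computation. The following sketch assumes that $v$ is smooth enough for the product rule to apply and uses only the definitions in (1), (4) and (9); it is meant as an illustration, not as part of the rigorous argument.

```latex
% Formal check (assuming v is smooth) that u = G(t)v solves (1) iff v solves (10).
% Here G(t) = e^{i2tC_*<x>^2} and <x>^2 = 1 + |x|^2, as in (9).
\begin{align*}
i\partial_t\bigl(G(t)v\bigr)
  &= G(t)\bigl(i\partial_t v - 2C_*\langle x\rangle^2 v\bigr), \\
\nabla_{A(t)}\bigl(G(t)v\bigr)
  &= G(t)\bigl(\nabla + 4itC_*x - iA(t,x)\bigr)v
   = G(t)\nabla_{\tilde A(t)}v, \\
H(t)G(t)v
  &= G(t)\Bigl(-\tfrac12\nabla_{\tilde A(t)}^2 + V(t,x)\Bigr)v .
\end{align*}
% Setting i\partial_t(G v) = H(t) G v and cancelling the unitary factor G(t):
\begin{equation*}
i\partial_t v
  = -\tfrac12\nabla_{\tilde A(t)}^2 v + \bigl(V(t,x) + 2C_*\langle x\rangle^2\bigr)v
  = -\tfrac12\nabla_{\tilde A(t)}^2 v + \tilde V(t,x)v .
\end{equation*}
```

In this way the quadratic phase adds $2C_*\langle x\rangle^2$ to the scalar potential at the cost of the linear shift $-4tC_*x$ in the vector potential.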
If $A$ and $V$ satisfy Assumption 1.1 then so do $\tilde A$ and $\tilde V$, and $\tilde H_0(t)$ is essentially selfadjoint on $C_0^\infty(\mathbb{R}^d)$. We denote its closure again by $\tilde H_0(t)$. Then, using the fact that $\tilde V_1(t,x) \ge C_*\langle x\rangle^2$, we prove the following in Sec. 4. Recall that $H_0(t) = -\frac12\nabla_{A(t)}^2 + \tilde V(t,x)$.

• $D(H_0(t)) \equiv \mathcal{D}$ is independent of $t \in I$. $G(t)$ maps $\mathcal{D}$ onto $\mathcal{D}$.
• $D(\tilde H_0(t)) = D(H_0(t))$ and $H_0(t)G(t) = G(t)\tilde H_0(t)$.
• $\mathcal{D} \subset D(H(t))$, the latter being $t$ dependent in general.
• The triplet $(\mathcal{H}, \mathcal{D}, \tilde H_0(t))$, $t \in I$, satisfies the assumptions of Kato's abstract theorem which will be recalled below (see Theorem 3.2).

Thus, $(\mathcal{H}, \mathcal{D}, \tilde H_0(t))$ generates a unique unitary propagator $\tilde U(t,s)$ for (10) which satisfies the properties of Theorem 1.2 with obvious modification. We then define
\[
U(t,s) = G(t)\tilde U(t,s)G(s)^{-1}.
\]
It is then straightforward to check that $U(t,s)$ is the desired propagator of Theorem 1.2.

The situation for Theorem 1.4 is similar. In addition to the quadratic form $Q_0(t)$ on $\mathcal{H}$ associated with operator $H_0(t)$, we define the form $\tilde Q_0(t)$ by
\[
\tilde Q_0(t)(u) = \frac12\|\nabla_{\tilde A(t)}u\|^2 + (\tilde V(t,x)u, u), \qquad
D(\tilde Q_0(t)) = \{u \in L^2 : \nabla_{\tilde A(t)}u \in L^2,\ \tilde V_1(t,x)^{1/2}u \in L^2\}. \tag{11}
\]
Then, $\tilde Q_0(t)$ is closed and bounded from below and $C_0^\infty(\mathbb{R}^d)$ is a core. We denote the selfadjoint operator defined by $\tilde Q_0(t)$ by $\tilde H_0(t)$ (under Assumption 1.1, this is the $\tilde H_0(t)$ defined previously). By using again the property $\tilde V_1(t,x) \ge C_*\langle x\rangle^2$, we prove the following in Sec. 5:

• $D(\tilde Q_0(t)) = \mathcal{Y}$ is independent of $t$ and $G(t)$ maps $\mathcal{Y}$ onto $\mathcal{Y}$.
• $D(\tilde Q_0(t)) = D(Q_0(t))$ and $\tilde Q_0(t)(u,v) = Q_0(t)(G(t)u, G(t)v)$.
• Let $\mathcal{X} = \mathcal{Y}^*$. Then, $\tilde H_0(t)$ may be considered as a closed operator on $\mathcal{X}$ and $H(t)$ is bounded from $\mathcal{Y}$ to $\mathcal{X}$.
• The triplet $(\mathcal{X}, \mathcal{Y}, \tilde H_0(t))$ satisfies the conditions of Kato's Theorem 3.2.

Thus, $(\mathcal{X}, \mathcal{Y}, \tilde H_0(t))$ produces the propagator $\tilde U(t,s)$ associated with Eq. (10) which satisfies the properties of Theorem 1.4 with obvious modification. We define $U(t,s) = G(t)\tilde U(t,s)G(s)^{-1}$ again and check that this $U(t,s)$ is the desired propagator.

3. Preliminaries

Before starting the proof of the theorems, we state two results which play key roles in what follows. The first is the so-called diamagnetic inequality, and a proof may be found in [2, pp. 9–10].

Lemma 3.1. Let $A(x) \in L^2_{\mathrm{loc}}$ and let $V(x) \in L^1_{\mathrm{loc}}$ satisfy $V(x) \ge C\langle x\rangle^2$, $C > 0$. Let $H = -\frac12\nabla_A^2 + V$ be the selfadjoint operator associated with the quadratic form
$q(u) = \frac12\|\nabla_A u\|^2 + \|V^{1/2}u\|^2$ with domain $D(q) = \{u \in L^2 : \nabla_A u,\ V^{1/2}u \in L^2\}$. Then, for every $0 < \gamma \le 1$ and $u \in L^2(\mathbb{R}^d)$, we have that
\[
|H^{-\gamma}u(x)| \le \Bigl(-\frac12\Delta + C\langle x\rangle^2\Bigr)^{-\gamma}|u|(x) \le \Bigl(-\frac12\Delta + C\Bigr)^{-\gamma}|u|(x). \tag{12}
\]
In particular, we have $\|\langle x\rangle^2 u\| \le C\|Hu\|$ if $u \in D(H)$.

Proofs of both Theorems 1.2 and 1.4 will eventually rely upon the following abstract theorem. The theorem is a direct consequence of Theorem 5.2, Remarks 5.3 and 5.4 of Kato's seminal paper [8].

Theorem 3.2. Let $\mathcal{X}$ and $\mathcal{Y}$ be Hilbert spaces such that $\mathcal{Y}$ is continuously and densely embedded in $\mathcal{X}$. Let $\{A(t), t \in I\}$ be a family of closed operators in $\mathcal{X}$ with dense domain such that $\mathcal{Y} \subset D(A(t))$ and $A(t) \in B(\mathcal{Y}, \mathcal{X})$ is norm continuous with respect to $t \in I$. Suppose that for every $t \in I$, $\mathcal{X}$ and $\mathcal{Y}$ are equipped with respective Hilbert structures $(\cdot,\cdot)_{\mathcal{X}_t}$ and $(\cdot,\cdot)_{\mathcal{Y}_t}$ which are equivalent to the original structures and which satisfy, for a constant $c > 0$:
\[
\|u\|_{\mathcal{Y}_t}/\|u\|_{\mathcal{Y}_s} \le e^{c|t-s|}, \qquad \|u\|_{\mathcal{X}_t}/\|u\|_{\mathcal{X}_s} \le e^{c|t-s|}, \qquad u \ne 0. \tag{13}
\]
Let $\mathcal{X}_t$ and $\mathcal{Y}_t$ be Hilbert spaces $\mathcal{X}$ and $\mathcal{Y}$ with respective new inner products $(\cdot,\cdot)_{\mathcal{X}_t}$ and $(\cdot,\cdot)_{\mathcal{Y}_t}$. Suppose further that $A(t)$ is selfadjoint in $\mathcal{X}_t$ and the part $\tilde A(t)$ of $A(t)$ in $\mathcal{Y}_t$ is also selfadjoint in $\mathcal{Y}_t$. Then, there uniquely exists a family of operators $\{U(t,s) : t,s \in I\}$ which satisfies the following properties:
(a) $\{U(t,s) : t,s\}$ is a strongly continuous family of bounded operators in $\mathcal{X}$ and, for constants $M$ and $\beta$, $\|U(t,s)\|_{B(\mathcal{X})} \le Me^{\beta|t-s|}$, $t,s \in I$.
(b) For all $t,s,r \in I$, $U(s,s) = I$ and $U(t,r) = U(t,s)U(s,r)$.
(c) For $t,s \in I$, $U(t,s)\mathcal{Y} \subset \mathcal{Y}$ and $U(t,s) \in B(\mathcal{Y})$. $\{U(t,s) : t,s \in I\}$ is a strongly continuous family of bounded operators in $\mathcal{Y}$ and, for constants $M$ and $\beta$, $\|U(t,s)\|_{B(\mathcal{Y})} \le Me^{\beta|t-s|}$, $t,s \in I$.
(d) For $u \in \mathcal{Y}$, $U(t,s)u$ is $\mathcal{X}$-valued and strongly continuously differentiable with respect to $t,s \in I$ and satisfies
\[
\partial_t U(t,s)u = -iA(t)U(t,s)u, \qquad \partial_s U(t,s)u = iU(t,s)A(s)u. \tag{14}
\]
4. Proof of Theorem 1.2

In this section we assume Assumption 1.1 with sufficiently large $C_* > 0$. Recall definitions (4) of $\tilde V$ and $\tilde V_1$, $H_0(t) = -\frac12\nabla_{A(t)}^2 + \tilde V(t,x)$ and $H_1(t) = -\frac12\nabla_{A(t)}^2 + \tilde V_1(t,x)$. We further define $\tilde H_1(t)$ by
\[
\tilde H_1(t) = -\frac12\nabla_{\tilde A(t)}^2 + \tilde V_1(t,x) \tag{15}
\]
by replacing $A(t,x)$ with $\tilde A(t,x)$ in $H_1(t)$ as we did for $\tilde H_0(t) = -\frac12\nabla_{\tilde A(t)}^2 + \tilde V(t,x)$ in Sec. 2. The operators $H_0(t)$, $H_1(t)$, $\tilde H_0(t)$, $\tilde H_1(t)$ are all essentially selfadjoint on
$C_0^\infty(\mathbb{R}^d)$ ([12]) and we denote the selfadjoint extensions by the same symbols. Note that if $A$ and $V$ satisfy Assumption 1.1 then they also satisfy Assumption 1.3. Thus, they are the same operators as the ones defined via the corresponding quadratic forms and Lemma 3.1 applies to $H_1(t)$ and $\tilde H_1(t)$ ([2]). It follows, since $V_2(t,x)$ is of Stummel class uniformly with respect to $t \in I$, that, for any $\varepsilon > 0$, there exists $\lambda_0$ such that
\[
\|V_2(t)(H_1(t)+\lambda)^{-1}\|_{B(\mathcal{H})} \le \|V_2(t)(-\Delta+\lambda)^{-1}\|_{B(\mathcal{H})} < \varepsilon, \qquad \lambda > \lambda_0,
\]
and likewise with $\tilde H_1(t)$ in place of $H_1(t)$. It follows that
\[
H_0(t) = H_1(t) + V_2(t), \quad D(H_0(t)) = D(H_1(t)), \qquad
\tilde H_0(t) = \tilde H_1(t) + V_2(t), \quad D(\tilde H_0(t)) = D(\tilde H_1(t))
\]
are selfadjoint and, by taking $C_*$ large enough, we have
\[
\|H_0(t)u\| \ge C_*\|u\|, \qquad \|\tilde H_0(t)u\| \ge C_*\|u\|.
\]
Then, we have for a constant $C_1$ independent of $t \in I$
\[
C_1^{-1}\|H_1(t)u\| \le \|H_0(t)u\| \le C_1\|H_1(t)u\|, \tag{16}
\]
\[
C_1^{-1}\|\tilde H_1(t)u\| \le \|\tilde H_0(t)u\| \le C_1\|\tilde H_1(t)u\|. \tag{17}
\]
Lemma 4.1. (1) Domains of operators $H_0(t)$, $H_1(t)$, $\tilde H_0(t)$ and $\tilde H_1(t)$ are identical and are independent of $t \in I$. We denote them by $\mathcal{D}$. $\mathcal{D} \subset D(H(t))$ for any $t \in I$.
(2) There exists a constant $c > 0$ such that
\[
\|\tilde H_0(t)u\| \le e^{c|t-s|}\|\tilde H_0(s)u\|, \qquad t,s \in I, \tag{18}
\]
\[
\|(\tilde H_0(t) - \tilde H_0(s))u\| \le c|t-s|\,\|\tilde H_0(s)u\|, \qquad t,s \in I. \tag{19}
\]
The same holds for $H_0(t)$ replacing $\tilde H_0(t)$.
(3) The gauge transform $G(t) = e^{i2tC_*\langle x\rangle^2}$ satisfies $G(t)\mathcal{D} = \mathcal{D}$ and
\[
H_0(t)G(t) = G(t)\tilde H_0(t), \qquad H_1(t)G(t) = G(t)\tilde H_1(t). \tag{20}
\]
If $\varphi \in \mathcal{D}$, $G(t)\varphi$ is $\mathcal{H}$-valued differentiable with respect to $t$ and
\[
\partial_t G(t)\varphi = 2iC_*\langle x\rangle^2 G(t)\varphi.
\]

Proof. Let $u \in C_0^\infty(\mathbb{R}^d)$. Then, $H_0(t)u$ is $\mathcal{H}$-valued differentiable almost everywhere with respect to $t$ by virtue of Assumption 1.1 and
\[
\dot H_0(t)u = i\dot A(t,x)\cdot\nabla_{A(t)}u + \frac{i}{2}\operatorname{div}_x\dot A(t,x)u + \dot V(t,x)u. \tag{21}
\]
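Formula (21) can be recovered by differentiating $H_0(t) = -\frac12\nabla_{A(t)}^2 + \tilde V(t,x)$ term by term. The sketch below is a formal computation for $u \in C_0^\infty(\mathbb{R}^d)$, assuming the $t$-derivatives exist pointwise as in Assumption 1.1.

```latex
% Formal derivation of (21) for u in C_0^\infty, using
%   \partial_t \nabla_{A(t)} = -i\dot A(t),   \partial_t \tilde V = \dot V
% (the term 2C_*<x>^2 in \tilde V does not depend on t).
\begin{align*}
\dot H_0(t)u
  &= -\tfrac12\Bigl(-i\dot A(t)\cdot\nabla_{A(t)}u
       - i\nabla_{A(t)}\cdot\bigl(\dot A(t)u\bigr)\Bigr) + \dot V(t)u \\
  &= \tfrac{i}{2}\Bigl(2\dot A(t)\cdot\nabla_{A(t)}u
       + \bigl(\operatorname{div}_x\dot A(t)\bigr)u\Bigr) + \dot V(t)u \\
  &= i\dot A(t)\cdot\nabla_{A(t)}u
       + \tfrac{i}{2}\bigl(\operatorname{div}_x\dot A(t)\bigr)u + \dot V(t)u ,
\end{align*}
% where the second line uses the product rule
%   \nabla_{A}\cdot(\dot A u) = (\operatorname{div}_x\dot A)u + \dot A\cdot\nabla_{A}u .
```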
We write the right-hand side in the form
\[
i\dot A(t,x)\cdot\nabla_{A(s,x)}u + \dot A(t,x)\cdot\Bigl(\int_s^t\dot A(r,x)\,dr\Bigr)u + \frac{i}{2}\operatorname{div}_x\dot A(t,x)u + \dot V(t,x)u
= I_1(t,s)u + I_2(t,s)u + I_3(t,s)u.
\]
If $B(t)$ is multiplication by any of the functions $\dot V(t,x)$, $\nabla_x(\dot A(t,x)^2)$, $\dot A(t,x)^2$ and $\operatorname{div}_x\dot A(t,x)$, then Assumption 1.1 and (12) imply
\[
\|B(t)H_1(s)^{-1}u\| \le \|B(t)\Lambda^{-1}u\| \le C_1\|u\|
\]
uniformly with respect to $t,s \in I$, and consequently
\[
\|B(t)u\| \le C_1\|H_1(s)u\|, \qquad t,s \in I. \tag{22}
\]
Since $2|\dot A(t)\dot A(r)| \le |\dot A(t)|^2 + |\dot A(r)|^2$ and $I$ is compact, it follows that
\[
\|I_2(t,s)u\| \le C_2\|H_1(s)u\|, \qquad t,s \in I.
\]
Likewise we have
\[
\|I_3(t,s)u\| \le C_1\|H_1(s)u\|, \qquad t,s \in I.
\]
We next estimate $\|I_1(t,s)u\|$. Write $\nabla^j_{A(t)} = \partial/\partial x_j - iA_j(t)$ and $[F,G] = FG - GF$ for the commutator. Then, $[\nabla^j_{A(s)}, \dot A(t,x)^2] = \partial_{x_j}(\dot A(t,x)^2)$ and we may estimate
\begin{align*}
\sum_j\|\dot A_j(t)\nabla^j_{A(s)}u\|^2
 &\le \sum_j\bigl(\dot A(t)^2\nabla^j_{A(s)}u, \nabla^j_{A(s)}u\bigr) \\
 &= -\sum_j\bigl((\nabla^j_{A(s)})^2u, \dot A(t)^2u\bigr) - \sum_j\bigl(\nabla^j_{A(s)}u, \partial_{x_j}(\dot A(t)^2)u\bigr) \\
 &\le 2\bigl(H_1(s)u, \dot A(t)^2u\bigr) + \sum_j\|\nabla^j_{A(s)}u\|\,\|\partial_{x_j}(\dot A(t)^2)u\|,
\end{align*}
where we used the positivity of $\tilde V_1(t,x)$ in the final stage. Using that $\tilde V_1(t,x) \ge C_*\langle x\rangle^2$ with large $C_* \ge 1$, we have
\[
\|\nabla_{A(s)}u\|^2 \le 2(H_1(s)u, u) \le 2\|H_1(s)u\|^2.
\]
Then, the Schwarz inequality and (22) yield
\[
\|I_1(t,s)u\|^2 \le d\sum_j\|\dot A_j(t)\nabla^j_{A(s)}u\|^2 \le C\|H_1(s)u\|^2.
\]
Thus, combining these with (16), we have proven
\[
\|\dot H_0(t)u\| \le C_2\|H_1(s)u\| \le C_2\|H_0(s)u\|, \qquad t,s \in I. \tag{23}
\]
Then, by integration, it follows that
\[
\|(H_0(t) - H_0(s))u\| \le c|t-s|\,\|H_0(s)u\|, \qquad u \in C_0^\infty(\mathbb{R}^d). \tag{24}
\]
Since C0∞ (Rd ) is a core of H0 (s) for all s ∈ I, (24) extends to u ∈ D(H0 (s)). It follows that D(H0 (s)) ⊂ D(H0 (t)) and by symmetry D(H0 (s)) = D(H0 (t)) for
any $t,s \in I$ and, consequently, (19) for $H_0(t)$ is satisfied. (19) clearly implies (18). Replacing $A(t)$ with $\tilde A(t)$ in the argument above, we obtain the same results for $\tilde H_0(t)$.

Let $u \in \mathcal{D}$. Then, $\langle x\rangle^2u \in \mathcal{H}$ by virtue of (12) and
\[
H(t)u = H_0(t)u - 2C_*\langle x\rangle^2u \in \mathcal{H}. \tag{25}
\]
Since $H(t)$ (and $H_0(t)$) is the maximal operator, viz. $D(H(t)) = \{u \in \mathcal{H} : H(t)u \in L^2\}$, (25) implies $u \in D(H(t))$ and $\mathcal{D} \subset D(H(t))$.

We now prove $D(H_1(t)) = D(\tilde H_1(t))$. For this purpose, we freeze $t$ and introduce the family of operators
\[
H_1(t,\theta) = -\frac12\nabla_{A(t,\theta)}^2 + \tilde V_1(t,x), \qquad A(t,\theta,x) = A(t,x) - 4t\theta C_*x
\]
for $\theta \in [0,1]$ so that $H_1(t,0) = H_1(t)$ and $H_1(t,1) = \tilde H_1(t)$. We then apply the argument of the previous paragraph to $\partial_\theta H_1(t,\theta)$. Then, we obtain entirely similarly as above that
\[
\|(H_1(t,\theta) - H_1(t,\sigma))u\| \le C_1|\theta-\sigma|\,\|H_1(t,\sigma)u\|, \qquad u \in C_0^\infty(\mathbb{R}^d).
\]
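For orientation, the $\theta$-derivative used here can be written out explicitly. The following is a formal sketch for $u \in C_0^\infty(\mathbb{R}^d)$ with $t$ frozen; it repeats the computation behind (21) with $\dot A$ replaced by $\partial_\theta A(t,\theta,x)$.

```latex
% Since A(t,\theta,x) = A(t,x) - 4t\theta C_* x, we have
%   \partial_\theta A(t,\theta,x) = -4tC_* x,   div_x(x) = d,
% and \tilde V_1 does not depend on \theta.
\begin{align*}
\partial_\theta H_1(t,\theta)u
  &= i\,\partial_\theta A(t,\theta)\cdot\nabla_{A(t,\theta)}u
     + \tfrac{i}{2}\bigl(\operatorname{div}_x\partial_\theta A(t,\theta)\bigr)u \\
  &= -4itC_*\,x\cdot\nabla_{A(t,\theta)}u - 2itC_*d\,u .
\end{align*}
```

The multiplier $|\partial_\theta A|^2 = 16t^2C_*^2|x|^2$ is dominated by a multiple of $\langle x\rangle^2$, which suggests why the estimates leading to (23) carry over with $\tilde V_1 \ge C_*\langle x\rangle^2$ in the role of Assumption 1.1.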
Then $D(H_1(t)) = D(\tilde H_1(t))$ and statements (1) and (2) are proved.

It is clear that $G(t)$ maps $C_0^\infty(\mathbb{R}^d)$ onto itself and, for $\varphi \in C_0^\infty(\mathbb{R}^d)$, a direct computation shows $H_0(t)G(t)\varphi = G(t)\tilde H_0(t)\varphi$. Since $C_0^\infty(\mathbb{R}^d)$ is a core of $\tilde H_0(t)$, it follows that $G(t)D(\tilde H_0(t)) \subset D(H_0(t))$. Since this clearly holds for $G(-t) = G(t)^{-1}$ as well, we obtain $G(t)\mathcal{D} = \mathcal{D}$ and $H_0(t)G(t) = G(t)\tilde H_0(t)$. This argument likewise applies to the pair $H_1(t)$ and $\tilde H_1(t)$ and we obtain (20). The last statement is obvious since $\mathcal{D} \subset D(\langle x\rangle^2)$ as remarked previously. This completes the proof of the lemma.

Proof of Theorem 1.2. We arbitrarily choose and fix $t_0 \in I$ and equip $\mathcal{D}$ with the graph norm of $\tilde H_0(t_0)$ and make it a Hilbert space. It is clear that $\mathcal{D} \subset \mathcal{H}$ is a dense and continuous embedding. When $\mathcal{D}$ is equipped with the graph norm of $\tilde H_0(t)$, we denote the corresponding Hilbert space by $\mathcal{Y}_t$. We then apply Theorem 3.2 to the triplet $(\mathcal{X}, \mathcal{Y}, A(t)) \equiv (\mathcal{H}, \mathcal{D}, \tilde H_0(t))$. We use the original Hilbert space structure for $\mathcal{X} = \mathcal{H}$ and $\mathcal{X}_t = \mathcal{H}$ for all $t \in I$. Then, (18) and (19) of Lemma 4.1 imply that the condition (13) is satisfied and that $\tilde H_0(t) \in B(\mathcal{Y}, \mathcal{X})$ is norm continuous with respect to $t \in I$. Leinfelder–Simader's theorem implies $\tilde H_0(t)$ is selfadjoint in $\mathcal{H}$ and the part of $\tilde H_0(t)$ in $\mathcal{Y}_t = D(\tilde H_0(t))$ is again selfadjoint with domain $D(\tilde H_0(t)^2)$. Thus, all the conditions of Theorem 3.2 are satisfied and there uniquely exists a family of operators $\{\tilde U(t,s) : s,t \in I\}$ which satisfies properties (a)–(d) of Theorem 3.2 with $(\mathcal{X}, \mathcal{Y}, A(t))$ being replaced by $(\mathcal{H}, \mathcal{D}, \tilde H_0(t))$. Moreover, $\tilde U(t,s)$ is a unitary operator of $\mathcal{H}$. Indeed, if we set $u(t) = \tilde U(t,s)\varphi$ for $\varphi \in \mathcal{Y}$, we have by virtue of (14) for $\tilde H_0(t)$
\[
i\frac{d}{dt}\|u(t)\|^2 = (\tilde H_0(t)u(t), u(t)) - (u(t), \tilde H_0(t)u(t)).
\]
The right-hand side vanishes since $\tilde H_0(t)$ is selfadjoint, and $\tilde U(t,s)$ is an isometry of $\mathcal{H}$. The unitarity of $\tilde U(t,s)$ then follows by combining this with the property (b) of Theorem 3.2. We define
\[
U(t,s) = G(t)\tilde U(t,s)G(s)^{-1}.
\]
Then, $U(t,s)$ is a strongly continuous family of unitary operators and it is obvious by virtue of Lemma 4.1 that $U(t,s)$ satisfies properties (1) and (a), (b) of (2) of Theorem 1.2. If $\varphi \in \mathcal{D}$, then, by virtue of Lemma 4.1 and the fact that $\tilde U(t,s)G(s)^{-1}\varphi \in \mathcal{D} \subset D(H(t))$, we obtain
\begin{align*}
i\partial_t U(t,s)\varphi
 &= G(t)\bigl(-2C_*\langle x\rangle^2 + \tilde H_0(t)\bigr)\tilde U(t,s)G(s)^{-1}\varphi \\
 &= G(t)\bigl(-2C_*\langle x\rangle^2 + \tilde H_0(t)\bigr)G(t)^{-1}\cdot G(t)\tilde U(t,s)G(s)^{-1}\varphi \\
 &= H(t)U(t,s)\varphi.
\end{align*}
The proof of the other equation of (6) is similar. To prove the uniqueness, we have only to notice that, if $U(t,s)$ satisfies the properties of the theorem, then $\tilde U(t,s) = G(t)^{-1}U(t,s)G(s)$ satisfies those corresponding to $\tilde H_0(t)$. But such $\tilde U(t,s)$ is unique by virtue of Theorem 3.2. When $\varphi \in \mathcal{D}$, it is obvious that $u(t,x) = U(t,s)\varphi(x)$ satisfies (1) in the sense of distributions. Then, the approximation argument as in Remark 1.5 shows that the same holds for $\varphi \in \mathcal{H}$ as well. We omit the details. The proof is completed.

5. Proof of Theorem 1.4

We assume Assumption 1.3 with sufficiently large $C_* > 0$. In addition to the quadratic forms $Q_1(t)$, $Q_0(t)$ and $\tilde Q_0(t)$ defined by (7), (8) and (11), respectively, we use the form $\tilde Q_1(t)$ defined by
\[
\tilde Q_1(t)(u) = \frac12\|\nabla_{\tilde A(t)}u\|^2 + (\tilde V_1u, u), \qquad D(\tilde Q_1(t)) = D(\tilde Q_0(t)) \tag{26}
\]
by removing the negative singular part $V_2(t)$ from $\tilde Q_0(t)$, where $\tilde A(t,x) = A(t,x) - 4tC_*x$ as previously. We let $H_1(t)$ and $\tilde H_1(t)$ denote the selfadjoint operators associated with the quadratic forms $Q_1(t)$ and $\tilde Q_1(t)$, respectively. As explained in the introduction, $H_1(t)$ and $\tilde H_1(t)$ are the same as those of the previous section when $A$ and $V$ satisfy Assumption 1.1. Since $V_2$ is of Kato class uniformly with respect to $t \in I$, (12) for $H_1(t)$ and $\tilde H_1(t)$ replacing $H$ implies that $V_2$ is $Q_1(t)$- and $\tilde Q_1(t)$-form bounded with bound 0 uniformly with respect to $t \in I$. It follows that the forms $Q_0(t)$ and $\tilde Q_0(t)$ with $D(Q_0(t)) = D(Q_1(t))$ and $D(\tilde Q_0(t)) = D(\tilde Q_1(t))$, respectively, are closed and bounded from below. Furthermore, by taking $C_* > 0$ large enough, we may and do in what follows assume that
\[
Q_0(t) \ge C_*, \qquad \tilde Q_0(t) \ge C_*, \tag{27}
\]
$$C_1^{-1} Q_0(t)(u) \le Q_1(t)(u) \le C_1 Q_0(t)(u), \qquad u \in D(Q_0(t)), \tag{28}$$
$$C_1^{-1} \tilde Q_0(t)(u) \le \tilde Q_1(t)(u) \le C_1 \tilde Q_0(t)(u), \qquad u \in D(\tilde Q_0(t)), \tag{29}$$
for a constant $C_1 \ge 1$ independent of $t \in I$. We write $H_0(t)$ and $\tilde H_0(t)$ for the selfadjoint operators associated with the forms $Q_0(t)$ and $\tilde Q_0(t)$, respectively.

Lemma 5.1. (1) The domains of the quadratic forms $Q_0(t)$, $Q_1(t)$, $\tilde Q_0(t)$ and $\tilde Q_1(t)$ are identical and are independent of $t \in I$. We equip this space with the inner product $\tilde Q_0(t_0)(u,v)$ by arbitrarily choosing and fixing $t_0 \in I$ and denote this Hilbert space by $\mathcal{Y}$.
(2) There exists a constant $c > 0$ such that
$$\tilde Q_0(t)(u) \le e^{c|t-s|}\,\tilde Q_0(s)(u), \qquad u \in \mathcal{Y}, \quad t, s \in I. \tag{30}$$
(3) The gauge transform $G(t)$ maps $\mathcal{Y}$ onto $\mathcal{Y}$ and
$$Q_0(t)(G(t)u) = \tilde Q_0(t)(u), \qquad Q_1(t)(G(t)u) = \tilde Q_1(t)(u), \qquad u \in \mathcal{Y}. \tag{31}$$
Proof. Estimate (12) with $\gamma = 1/2$ for $\tilde H_1$ in place of $H$ and Assumption 1.3 imply
$$\|\dot{\tilde A}(r)\tilde H_1^{-1/2}(s)u\| \le \|\dot{\tilde A}(r)\Lambda^{-1/2}u\| \le M\|u\|.$$
Applying this to $\tilde A(t,x) - \tilde A(s,x) = \int_s^t \dot{\tilde A}(r,x)\,dr$ yields
$$\|(\tilde A(t) - \tilde A(s))\tilde H_1^{-1/2}(s)u\| \le M|t-s|\,\|u\|. \tag{32}$$
Likewise, we have $\||\dot V(r)|^{1/2}\tilde H_1^{-1/2}(s)u\| \le C\||\dot V(r)|^{1/2}\Lambda^{-1/2}u\| \le M\|u\|$ and, applying this to $\tilde V(t,x) - \tilde V(s,x) = \int_s^t \dot V(r,x)\,dr$, we obtain
$$\|\tilde H_1^{-1/2}(s)(\tilde V(t) - \tilde V(s))\tilde H_1^{-1/2}(s)u\| \le C|t-s|\,\|u\|. \tag{33}$$
Write $\tilde Q_0(t)(u,v) - \tilde Q_0(s)(u,v)$ for $u, v \in C_0^\infty(\mathbb{R}^d)$ in the form
$$\tfrac{1}{2}(\nabla_{\tilde A(s)}u, i(\tilde A(s) - \tilde A(t))v) + \tfrac{1}{2}(i(\tilde A(s) - \tilde A(t))u, \nabla_{\tilde A(s)}v) + \tfrac{1}{2}((\tilde A(t) - \tilde A(s))u, (\tilde A(t) - \tilde A(s))v) + ((\tilde V(t) - \tilde V(s))u, v).$$
We estimate each term separately by using (32) and (33) and then apply (29). We obtain for $|t-s| \le 1$ that
$$|\tilde Q_0(t)(u,v) - \tilde Q_0(s)(u,v)| \le C|t-s|\,\tilde Q_1(s)(u)^{1/2}\tilde Q_1(s)(v)^{1/2} \le c|t-s|\,\tilde Q_0(s)(u)^{1/2}\tilde Q_0(s)(v)^{1/2}.$$
It follows that $D(\tilde Q_0(t)) = D(\tilde Q_0(s))$ as in the proof of Lemma 4.1 and
$$\tilde Q_0(t)(u) \le (1 + c|t-s|)\,\tilde Q_0(s)(u) \le e^{c|t-s|}\,\tilde Q_0(s)(u). \tag{34}$$
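The passage from the local estimate to (30) for arbitrary $t, s \in I$ is not spelled out. The following is a sketch of one standard way to read that step, assuming the bound preceding (34) is used with $v = u$ and then iterated over a partition; the intermediate points $r_j$ are notation introduced here, not taken from the paper.

```latex
% Assumed reading of the step from the local bound to (30).
% Step 1: for |t - s| \le 1, take v = u in the estimate preceding (34):
\[
  \tilde Q_0(t)(u) \le \tilde Q_0(s)(u) + c|t-s|\,\tilde Q_0(s)(u)
  = (1 + c|t-s|)\,\tilde Q_0(s)(u) \le e^{c|t-s|}\,\tilde Q_0(s)(u).
\]
% Step 2: for |t - s| > 1 (say s < t), pick s = r_0 < r_1 < \dots < r_n = t
% with r_j - r_{j-1} \le 1 and apply Step 1 on each subinterval:
\[
  \tilde Q_0(t)(u) \le \prod_{j=1}^{n} e^{c(r_j - r_{j-1})}\,\tilde Q_0(s)(u)
  = e^{c|t-s|}\,\tilde Q_0(s)(u).
\]
% The case t < s follows by exchanging the roles of t and s.
```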
In particular, statement (2) is satisfied. A similar argument applies to $Q_0(t)$ and we have $D(Q_0(t)) = D(Q_0(s))$. Moreover, since estimate (12) implies $xu \in L^2$ if $u \in D(Q_1(t))$ or $u \in D(\tilde Q_1(t))$, we have $D(Q_1(t)) = D(\tilde Q_1(t))$ and statement (1) follows. Both $\|\nabla_{A(t)}G(t)u\| = \|\nabla_{\tilde A(t)}u\|$ and $(V(t)G(t)u, G(t)u) = (V(t)u, u)$ are obvious for $u \in C_0^\infty(\mathbb{R}^d)$. Since the latter space is a core of the forms $Q_0(t)$ and $\tilde Q_0(t)$, we see that $D(Q_0(t)) = G(t)D(\tilde Q_0(t))$ or that $G(t)$ maps $\mathcal{Y}$ onto $\mathcal{Y}$, and that $Q_0(t)(G(t)u) = \tilde Q_0(t)(u)$ for $u \in \mathcal{Y}$. The corresponding relation for $Q_1(t)$ and $\tilde Q_1(t)$ may be proved similarly.

Before proceeding to the proof of Theorem 1.4, we recall that if $H$ is a positive selfadjoint operator in a Hilbert space $\mathcal{H}$ and $\mathcal{H}_1 \subset \mathcal{H} \subset \mathcal{H}_{-1}$ is the scale of Hilbert spaces associated with $H$, viz. $\mathcal{H}_1 = D(H^{1/2}) = D(Q)$, $Q$ being the quadratic form associated with $H$, and $\mathcal{H}_{-1} = \mathcal{H}_1^*$ with $\mathcal{H}^*$ being identified with $\mathcal{H}$, then:
(a) $\mathcal{H}_{-1}$ is the completion of $\mathcal{H}$ by the norm $\|H^{-1/2}u\|$.
(b) $H$ has a natural extension $H_-$ to $\mathcal{H}_{-1}$ and $H_-$ is selfadjoint in $\mathcal{H}_{-1}$ with domain $D(H^{1/2})$.
(c) The part $H_+$ of $H_-$ in $\mathcal{H}_1$ is again selfadjoint with domain $D(H^{3/2})$.
These should be obvious if we represent $H$ as a multiplication operator by a positive function on $L^2(M, d\mu)$, $(M, d\mu)$ being a suitable measure space.

Proof of Theorem 1.4. For $t \in I$, we let $\mathcal{Y}_t$ be the Hilbert space $\mathcal{Y}$ with the inner product $(u,v)_{\mathcal{Y}_t} = \tilde Q_0(t)(u,v)$. By virtue of Lemma 5.1, the norm $\|u\|_{\mathcal{Y}_t}$ is equivalent to the original norm $\|u\|_{\mathcal{Y}_{t_0}}$ and satisfies the relation (13) for $\mathcal{Y}_t$. It is obvious that $\mathcal{Y} \subset \mathcal{H}$ is a dense and continuous embedding. We define $\mathcal{X}_t = \mathcal{Y}_t^*$. If we identify $\mathcal{H}$ with $\mathcal{H}^*$, then we obtain a Gel’fand triplet of Hilbert spaces
$$\mathcal{Y}_t \subset \mathcal{H} \subset \mathcal{X}_t \tag{35}$$
with continuous and dense inclusions. This is the scale of Hilbert spaces associated with $\tilde H_0(t)$. Since $\mathcal{Y}_t = \mathcal{Y}$ is independent of $t$ with equivalent Hilbert space structures, so is the space $\mathcal{X}_t = \mathcal{X}$ by virtue of the property (a) above, and (30) implies that (13) is satisfied by $\mathcal{X}_t$ as well:
$$\|u\|_{\mathcal{X}_t} \le e^{c|t-s|}\|u\|_{\mathcal{X}_s}, \qquad c = 2M^2. \tag{36}$$
We want to apply Theorem 3.2 to the triplet $(\mathcal{X}, \mathcal{Y}, \tilde H_0(t)_-)$. By virtue of property (b) above, the operator $\tilde H_0(t)_-$ is selfadjoint in $\mathcal{X}_t$ and so is $\tilde H_0(t)_+$ in $\mathcal{Y}_t$ by virtue of property (c). Moreover, the relation (34) implies the estimate
$$\|(\tilde H_0(t)_- - \tilde H_0(s)_-)u\|_{\mathcal{X}_s} \le c|t-s|\,\|\tilde H_0(s)_- u\|_{\mathcal{X}_s} \le c|t-s|\,\|u\|_{\mathcal{Y}_s}.$$
It follows that $\tilde H_0(t)_- \in B(\mathcal{Y}, \mathcal{X})$ is norm continuous with respect to $t \in I$. Thus, the triplet $(\mathcal{X}, \mathcal{Y}, \tilde H_0(t)_-)$ satisfies all the conditions of Theorem 3.2 with $\tilde H_0(t)_-$ replacing $A(t)$, and Theorem 3.2 produces a family of operators $\{\tilde U(t,s) : t, s \in I\}$ which satisfies the properties of Theorem 3.2 for $(\mathcal{X}, \mathcal{Y}, \tilde H_0(t)_-)$. We then define
$$U(t,s) = G(t)\tilde U(t,s)G(s)^{-1}.$$
We know that $G(t)$ maps $\mathcal{Y}$ onto $\mathcal{Y}$ by virtue of Lemma 5.1. Moreover, the multiplication by $x^2$ is bounded from $\mathcal{Y}$ to $\mathcal{X}$ since
$$\|x\tilde H_0^{-1/2}u\| \le C\|x\Lambda^{-1/2}u\| \le C\|u\|$$
by virtue of (12). It follows that $H(t)$ is bounded from $\mathcal{Y}$ to $\mathcal{X}$ and it is norm continuous and that, for $u \in \mathcal{Y}$, $G(t)u$ is strongly continuously differentiable in $\mathcal{X}$ with respect to $t \in I$. Then, it is easy to check that this $U(t,s)$ satisfies all properties (a)–(d) of Theorem 1.4(3). It remains to show that $U(t,s)$ is unitary and strongly continuous in $\mathcal{H}$. Define $u(t) = U(t,s)\varphi$ for $\varphi \in \mathcal{Y}$. Then, with $\langle\cdot,\cdot\rangle$ being the coupling of $\mathcal{X}$ and $\mathcal{Y}$, we have
$$\partial_t(u(t), u(t))_{L^2} = 2\,\mathrm{Re}\,\langle -iH(t)u(t), u(t)\rangle = 2\,\mathrm{Re}\,\{-i\tilde Q_0(t)(u(t), u(t)) + 2iC_*\langle x^2 u(t), u(t)\rangle\} = 0.$$
It follows that $\|u(t)\| = \|\varphi\|$ and, since $\mathcal{Y}$ is dense in $\mathcal{H}$, we conclude $U(t,s)\mathcal{H} \subset \mathcal{H}$ and $\|U(t,s)u\| = \|u\|$ for all $u \in \mathcal{H}$. Then, $U(t,s)$ must be unitary since $U(t,s)U(s,t)u = u$. If $\varphi \in \mathcal{Y}$, $(t,s) \mapsto U(t,s)\varphi \in \mathcal{H}$ is continuous in $\mathcal{H}$. Hence $U(t,s)$ is strongly continuous in $B(\mathcal{H})$ by the unitarity. The uniqueness of $U(t,s)$ of Theorem 1.4 follows from the uniqueness result of Theorem 3.2 by tracing back the argument above.

Acknowledgment

Supported by JSPS grant in aid for scientific research No. 22340029.

References

[1] L. Baudouin, O. Kavian and J.-P. Puel, Regularity for a Schrödinger equation with singular potentials and application to bilinear optimal control, J. Differential Equations 216 (2005) 188–222.
[2] H. Cycon, R. Froese, W. Kirsch and B. Simon, Schrödinger Operators with Application to Quantum Mechanics and Global Geometry (Springer-Verlag, Berlin, 1987).
[3] S. Doi, Smoothness of solutions for Schrödinger equations with unbounded potentials, Publ. RIMS Kyoto Univ. 41 (2005) 175–221.
[4] D. Fujiwara, Remarks on convergence of Feynman path integrals, Duke Math. J. 47 (1980) 41–96.
[5] J. Howland, Stationary theory for time dependent Hamiltonians, Math. Ann. 207 (1974) 315–335.
[6] W. Ichinose, A note on the existence and ℏ-dependency of the solution of equations in quantum mechanics, Osaka J. Math. 32 (1995) 327–345.
[7] A. Iwatsuka, Essential self-adjointness of the Schrödinger operators with magnetic fields diverging at infinity, Publ. RIMS Kyoto Univ. 26 (1990) 841–860.
[8] T. Kato, Linear evolution equations of “hyperbolic type”, J. Fac. Sci. Univ. Tokyo Sec. I 17 (1970) 214–258.
[9] T. Kato, Linear evolution equations of “hyperbolic type” II, J. Math. Soc. Japan 25 (1973) 648–666.
[10] T. Kato, Remarks on the essential selfadjointness and related problems for differential operators, in Spectral Theory of Differential Operators, eds. J. W. Knowles and R. T. Lewis (North Holland, Amsterdam, 1981), pp. 253–266.
[11] J.-L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and Applications, Vol. I, Translated from French by P. Kenneth, Die Grundlehren der Mathematischen Wissenschaften, Vol. 181 (Springer-Verlag, 1972).
[12] H. Leinfelder and C. Simader, Schrödinger operators with singular magnetic vector potentials, Math. Z. 176 (1981) 1–19.
[13] N. Okazawa, T. Yokota and K. Yoshii, Remarks on linear Schrödinger evolution equations with Coulomb potential with moving center, preprint (2010).
[14] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. II, Fourier Analysis, Selfadjointness (Academic Press, 1975).
[15] M. E. Taylor, Pseudodifferential Operators (Princeton Univ. Press, 1981).
[16] K. Yajima, Existence of solutions for Schrödinger evolution equations, Comm. Math. Phys. 110 (1987) 415–426.
[17] K. Yajima, Schrödinger evolution equations with magnetic fields, J. Anal. Math. 56 (1991) 29–76.
[18] K. Yajima, On time dependent Schrödinger equations, in Dispersive Nonlinear Problems in Mathematical Physics, eds. P. D’Ancona and V. Georgiev, Quaderni di Matematica, Vol. 15 (Seconda Università di Napoli, 2005), pp. 267–329.
[19] K. Yajima, Smoothness and non-smoothness of fundamental solution of time dependent Schrödinger equations, Comm. Math. Phys. 181 (1996) 605–629.
[20] G. P. Zhang and K. Yajima, Smoothing property for Schrödinger equations with potential superquadratic at infinity, Comm. Math. Phys. 221 (2001) 573–590.
September 20, J070-S0129055X11004448
2011 11:24 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 8 (2011) 839–863 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004448
OPTIMAL FOCUSING FOR MONOCHROMATIC SCALAR AND ELECTROMAGNETIC WAVES
JEFFREY RAUCH Department of Mathematics, University of Michigan, Ann Arbor 48109 MI, USA
[email protected] Received 8 August 2011 Revised 12 August 2011 For monochromatic solutions of D’Alembert’s wave equation and Maxwell’s equations, we obtain sharp bounds on the sup norm as a function of the far field energy. The extremizer in the scalar case is radial. In the case of Maxwell’s equation, the electric field maximizing the value at the origin follows longitude lines on the sphere at infinity. In dimension d = 3, the highest electric field for Maxwell’s equation is smaller by a factor 2/3 than the highest corresponding scalar waves. The highest electric field densities on the balls BR (0) occur as R → 0. The density dips to half max at R approximately equal to one third the wavelength. For these small R, the extremizing fields are identical to those that attain the maximum field intensity at the origin. Keywords: Maxwell equations; focusing; energy density; extreme light initiative. Mathematics Subject Classification 2010: 35Q60, 35Q61, 35L40, 35P15, 35L25
1. Introduction

The problem we address is to find, for fixed frequency $\omega/2\pi$, the monochromatic solutions of the wave equation and of Maxwell’s equation that achieve the highest field values at a point or, more generally, the greatest electrical energy in a ball of fixed small radius. They are constrained by the energy at $|x| \sim \infty$. This leads to several variational problems:
• Maximize the field strength at a point.
• For fixed $R$, maximize the energy in a ball of radius $R$.
A third problem is,
• Find $R_{\mathrm{optimal}}$ so that the energy density is largest.
We show that the last is degenerate, the maximum occurring at $R = 0$. There are experiments under way whose strategy is to focus a number of coherent high power laser beams on a small volume to achieve very high energy densities.
The problem was proposed to the author by G. Mourou because of his leadership role in the European Extreme Light Initiative. If by better focusing, one can reduce the size of the incoming lasers there would be significant benefits. Identification of the extrema guides the deployment of the lasers. With the experimental design as motivation the maximization for the Maxwell equations is interpreted as maximization of the energy in the electric field $E$, ignoring the magnetic contribution. Including the magnetic contribution creates an analogous problem amenable to the techniques introduced here. As natural as these questions appear, we have been unable to find previous work on them. We study solutions of the scalar wave equation and of Maxwell equations,
$$v_{tt} - \Delta v = 0, \qquad E_t = \mathrm{curl}\, B, \qquad B_t = -\mathrm{curl}\, E, \qquad \mathrm{div}\, E = \mathrm{div}\, B = 0,$$
for spatial dimensions $d \ge 2$. Units are chosen so that the propagation speed is 1.

Definition 1.1. A solution $v$ of the wave equation is monochromatic if it is of the form
$$v = \psi(t)u(x), \qquad \text{with } \psi'' + \omega^2\psi = 0, \quad \omega > 0. \tag{1.1}$$
Monochromatic solutions of Maxwell’s equation are those of the form
$$\psi(t)(E(x), B(x)), \qquad \text{with } \psi'' + \omega^2\psi = 0, \quad \omega > 0. \tag{1.2}$$
They are generated by
$$e^{\pm i\omega t}u(x) \quad \text{and} \quad e^{\pm i\omega t}(E(x), B(x)). \tag{1.3}$$
Scaling $t, x \to \omega t, \omega x$ reduces the study to the case $\omega = 1$. In that case, the reduced wave equations are satisfied,
$$(\Delta + 1)u(x) = 0, \qquad (\Delta + 1)E(x) = (\Delta + 1)B(x) = 0. \tag{1.4}$$
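As a quick numerical illustration of the reduced wave equation (1.4), the sketch below checks by central differences that the radial function $\sin|x|/|x|$ (the $d = 3$ example the text discusses in Example 1.3) is annihilated by $\Delta + 1$ at a sample point. The helper names, the step size `h` and the sample point are choices made here for illustration only.

```python
import math

def u(x, y, z):
    """Radial function sin|x|/|x| in d = 3 (value 1 at the origin)."""
    r = math.sqrt(x * x + y * y + z * z)
    return 1.0 if r == 0.0 else math.sin(r) / r

def reduced_wave_residual(f, p, h=1e-3):
    """Central-difference approximation of (Laplacian + 1) f at the point p."""
    x, y, z = p
    lap = (
        f(x + h, y, z) + f(x - h, y, z)
        + f(x, y + h, z) + f(x, y - h, z)
        + f(x, y, z + h) + f(x, y, z - h)
        - 6.0 * f(x, y, z)
    ) / (h * h)
    return lap + f(x, y, z)

# Arbitrary sample point away from the origin (an illustrative choice).
residual = reduced_wave_residual(u, (0.7, -1.1, 0.4))
print(f"(Delta + 1) u at the sample point ~ {residual:.2e}")
```

The residual is only zero up to discretization and rounding error, so the check is a sanity test rather than a proof.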
Notation. The absolute value sign $|\ |$ is used to denote the modulus of complex numbers, the length of vectors in $\mathbb{C}^d$, surface area, and volume. Examples: $|S^{d-1}|$ and $|B_R(0)|$.

Example 1.2. The plane wave $e^{i(\pm\omega t + \xi x)}$ with $|\xi| = \omega$ is a monochromatic solution of the wave equation. Its temporal period and spatial wavelength are both equal to $2\pi/\omega$. For Maxwell’s equations, the analog is $E = e^{i(\pm\omega t + \xi x)}\mathbf{e}$ with $\mathbf{e} \in \mathbb{C}^d$ satisfying $\xi\cdot\mathbf{e} = 0$ to guarantee the divergence free condition. The solutions that interest us tend to zero as $|x| \to \infty$.

Example 1.3. When $d = 3$, $u(x) := \sin|x|/|x|$ is a solution of the reduced wave equation (see also Example 3.8). The corresponding solution of the wave equation is
$$v = e^{\pm it}\,\frac{\sin|x|}{|x|} = \frac{1}{2i}\left(\frac{e^{i(\pm t + |x|)}}{|x|} - \frac{e^{i(\pm t - |x|)}}{|x|}\right).$$
For the plus sign, the first term represents an incoming spherical wave and the second outgoing. To create such a solution it suffices to generate the incoming
wave. The outgoing wave with the change of sign is then generated by that wave after it focuses at the origin.

Example 1.4. Finite energy solutions of Maxwell’s equations are those for which $\int_{\mathbb{R}^d} |E|^2 + |B|^2\,dx < \infty$. They satisfy
$$\forall R > 0, \qquad \lim_{t\to\infty}\int_{|x|\le R} |E(t,x)|^2 + |B(t,x)|^2\,dx = 0.$$
Therefore, the solution $(E, B) = 0$ is the only monochromatic solution of finite energy.

The solutions $(E(x), B(x))$ that tend to zero as $x \to \infty$ define tempered distributions on $\mathbb{R}^d$. When $(E(x), B(x))$ is a tempered solution of the reduced wave equation, the Fourier Transform satisfies
$$(1 - |\xi|^2)\hat E(\xi) = (1 - |\xi|^2)\hat B(\xi) = 0.$$
Therefore the support of $\hat E$ is contained in the unit sphere $S^{d-1} := \{|\xi| = 1\}$. Since $1 - |\xi|^2$ has nonvanishing gradient on this set it follows that the value of $\hat E$ on a test function $\psi(\xi)$ is determined by the restriction of $\psi$ to $S^{d-1}$. Therefore there is a distribution $e \in \mathcal{D}'(\{|\xi| = 1\})$ so that
$$E(x) := \int_{|\xi|=1} e^{ix\xi}\,e(\xi)\,d\sigma, \tag{1.5}$$
where we use the usual abuse of notation indicating as an integral the pairing of the distribution $e$ with the test function $e^{ix\xi}|_{|\xi|=1}$. Conversely, every such expression is a tempered vector valued solution of the reduced wave equation. If $e$ is smooth, the principle of stationary phase (see Sec. 2.2) shows that as $|x| \to \infty$,
$$E(x) = \frac{1/\sqrt{2\pi}}{|x|^{(d-1)/2}}\Big(e^{-i|x|}e(-x/|x|) + e^{i\pi(d-1)/4}e^{i|x|}e(x/|x|) + O(1/|x|)\Big). \tag{1.6}$$
The field is $O(|r|^{-(d-1)/2})$. In particular,
$$\sup_{R\ge 1} R^{-1}\int_{|x|\le R}|E(x)|^2\,dx < \infty \tag{1.7}$$
and
$$\lim_{R\to\infty} R^{-1}\int_{R\le|x|\le 2R}|E(x)|^2\,dx = c_d\int_{|\xi|=1}|e(\xi)|^2\,d\sigma. \tag{1.8}$$
For a field defined by a distribution $e$, it is known (see Theorem 2.4) that (1.7) holds if and only if $e \in L^2(S^{d-1})$. In that case, (1.8) holds and the stationary phase approximation holds in an $L^2$ sense. This is the class of solutions of the reduced wave equation that we study. Equation (1.8) shows that $\|e\|_{L^2(S^{d-1})}$ is a natural measure of the strength of the field at infinity.
The divergence free condition in Maxwell’s equations is satisfied if and only if $\xi\cdot e(\xi) = 0$ on $S^{d-1}$. In that case, the solutions $e^{\pm it}E(x)$ of the time dependent equation are linear combinations of the plane waves in Example 1.2. Denote by $H$ the closed subspace of $e \in L^2(S^{d-1}; \mathbb{C}^d)$ with $\xi\cdot e = 0$. For $e \in H$ and $x = r\xi$ with $|\xi| = 1$ and $r \gg 1$, the solution $e^{it}E(x)$ of Maxwell’s equations satisfies
$$r^{(d-1)/2}\,e^{it}E(x) \approx \frac{1}{\sqrt{2\pi}}\Big(e^{i(t-r)}e(-\xi) + e^{i\pi(d-1)/4}e^{i(t+r)}e(\xi)\Big).$$
In practice, the incoming wave
$$\frac{1}{\sqrt{2\pi}}\,e^{i\pi(d-1)/4}\,\frac{e^{i(t+r)}}{r^{(d-1)/2}}\int_{S^{d-1}} e(\xi)\,d\sigma$$
is generated at large $r$ and the monochromatic solution is observed for $t \gg 1$. The phase factor $e^{i\pi(d-1)/4}$ corresponds to the phase shift from the focusing at the origin. The first two variational problems for Maxwell’s equations seek to maximize
$$J_1(e) := |E(0)|^2 \quad \text{and} \quad J_2(e) := \int_{|x|\le R}|E(x)|^2\,dx$$
among $e \in H$ with $\int_{S^{d-1}}|e(\xi)|^2\,d\sigma = 1$. Theorems 3.1 and 3.2 compute the maxima of $J_1$ in the scalar and electromagnetic cases. The maximum in the scalar case and also the vector case without divergence free condition is $|S^{d-1}|$. It is attained when and only when $e$ is constant. For the electromagnetic case, $\xi\cdot e(\xi) = 0$ so the constant densities are excluded. The maximum is achieved at multiples and rotations of the field $\ell(\xi)$ from the next definition.

Definition 1.5. For $\xi \in S^{d-1}$ denote by $\ell(\xi)$ the projection of the vector $(1, 0, \ldots, 0)$ orthogonal to $\xi$,
$$\ell(\xi) := (1, 0, \ldots, 0) - (\xi\cdot(1, 0, \ldots, 0))\xi = (1, 0, \ldots, 0) - \xi_1\xi. \tag{1.9}$$
$\ell$ is a vector field whose integral curves are the lines of longitude connecting the pole $(-1, 0, \ldots, 0)$ to the opposite pole $(1, 0, \ldots, 0)$.

The maximum value of $J_1$ for electromagnetic waves is smaller by $(d-1)/d$ than the extremum in the scalar case. The same functions also solve the $J_2$ problem when $R$ is not too large. The study of $J_1$ is reduced to an application of the Cauchy–Schwarz inequality. In Sec. 4, the maximization of $J_2$ is transformed to a problem in spectral theory. Maximizing $J_2$ is equivalent to finding the norm of an operator. In the scalar case, we call the operator $L$. Finding the norm is equivalent to finding the spectral radius of the selfadjoint operator $L^*L$. The operator $L^*L$ is compact and rotation invariant on $L^2(S^{d-1})$. Its spectral theory is reduced by the spaces of spherical harmonics of order $k$. On the space of spherical harmonics of degree $k$, $L^*L$ is multiplication by a constant $\Lambda_{d,k}(R)$ computed exactly in terms of Bessel functions in Theorem 5.2. Theorem 7.1 shows that in the scalar case $\Lambda_{d,0}(R)$ is the largest for $R \le \pi/2$.
For the Maxwell problem, the corresponding operator is denoted $L_M^*L_M$ with $M$ standing for Maxwell. We do not know all its eigenvalues. However, two explicit eigenvalues are $(2/3)(\Lambda_{d,0}(R) - \Lambda_{d,2}(R))$ and $\Lambda_{d,1}(R)$. When $R \le \pi/2$, Theorem 7.3 proves that they are the largest and second largest eigenvalues. The proof uses the minimax principle. The eigenfunctions for the largest eigenvalue are rotates of multiples of $\ell$. For $R > \pi/2$, we derive in Sec. 8 rigorous sufficient conditions guaranteeing that the same functions provide the extremizers. The conditions involve the $\Lambda_{d,k}$. To verify them, we evaluate the integrals defining the $\Lambda_{3,k}$ approximately. By such evaluations, we show that for $d = 3$ and $R \le 2.5$ the solutions maximizing $J_1$ also maximize $J_2$.

The energy density is equal to the largest eigenvalue divided by $|B_R(0)|$. In the range $d = 3$, $R \le 2.5$, in both the scalar and electromagnetic case this quantity is a decreasing function of $R$. This shows that the third problem at the start, of finding the radius with highest energy density, is solved by $R = 0$. However, the graph is fairly flat. The density dips to about 1/2 its maximum at about $R = 2$, which is about a third of the wavelength. For focusing of electromagnetic waves to a ball of radius $R$ no larger than one third of a wavelength the optimal strategy is to choose $e(\xi)$ a multiple of a rotate of $\ell(\xi)$.

The extremizing electric fields are as polarized as a divergence free field can be. When $d = 3$, formula (1.6) and Example 3.10 show that the far field for this choice is equal up to rotations to
$$c\,\ell\!\left(\frac{x}{|x|}\right)\frac{\sin|x|}{|x|}.$$
This field is cylindrically symmetric with axis of symmetry along the $x_1$-axis. The restriction of $\ell$ to the unit sphere is cylindrically symmetric, given by rotating the following figure about the horizontal axis. For the problem of focusing a family of lasers, this suggests using linearly polarized sources concentrated near an equator and sparse near the poles.
In contrast, for scalar waves one should distribute sources as uniformly as possible.
Fig. 1.
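The field of Definition 1.5 is easy to probe numerically. The sketch below, with the field written as `ell` and a sample direction chosen arbitrarily here, checks the two properties used repeatedly later: the field is tangent to the sphere (orthogonal to $\xi$) and its squared length is $1 - \xi_1^2$. It is an illustration under those naming assumptions, not part of the paper.

```python
import math

def dot(a, b):
    """Euclidean inner product of two real vectors given as lists."""
    return sum(p * q for p, q in zip(a, b))

def ell(xi):
    """Field of (1.9): (1, 0, ..., 0) - xi_1 * xi, for a unit vector xi."""
    e1 = [1.0] + [0.0] * (len(xi) - 1)
    return [a - xi[0] * b for a, b in zip(e1, xi)]

# Arbitrary direction in R^3, normalized to the unit sphere (illustrative choice).
raw = [0.3, -0.5, 0.8]
norm = math.sqrt(dot(raw, raw))
xi = [c / norm for c in raw]

field = ell(xi)
tangency = dot(field, xi)                          # xi . ell(xi), expected ~ 0
length_gap = dot(field, field) - (1 - xi[0] ** 2)  # |ell|^2 - (1 - xi_1^2), expected ~ 0
print(tangency, length_gap)
```

At the poles $\xi = (\pm 1, 0, \ldots, 0)$ the same function returns the zero vector, consistent with the integral curves running from pole to pole.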
2. Monochromatic Waves

2.1. Electromagnetic waves and their transforms

Proposition 2.1. (i) $E$ given by (1.5) satisfies $\mathrm{div}\,E = 0$ if and only if
$$e(\xi)\cdot\xi = 0 \quad \text{on } \{|\xi| = 1\}. \tag{2.1}$$
(ii) If a monochromatic solution of the Maxwell equations has electric field given by (1.3) with $\omega = 1$ and $E$ is given by (1.5) then the magnetic field is equal to $e^{it}B(x)$ with
$$B(x) = -\int_{|\xi|=1} e^{ix\xi}\,\xi\wedge e(\xi)\,d\sigma. \tag{2.2}$$

Proof. Differentiating (1.5) yields,
$$\mathrm{div}\,E = \int_{|\xi|=1} e^{ix\xi}\,i\xi\cdot e(\xi)\,d\sigma, \qquad \mathrm{curl}\,E = \int_{|\xi|=1} e^{ix\xi}\,i\xi\wedge e(\xi)\,d\sigma.$$
The first formula proves (i). The Maxwell equations together with (1.3) yield $-\mathrm{curl}\,E = B_t = iB$. Therefore, the second formula proves (ii).

Remark 2.2. The condition (2.1) asserts that $e(\xi)$ is tangent to the unit sphere. Brouwer’s Theorem asserts that if $\xi \mapsto e(\xi)$ is continuous then there must be a $\xi$ where $e(\xi) = 0$.

Example 2.3. If $d = 3$ and $E$ is given by (1.5) with $e(\xi) = \ell(\xi)$ then on $|\xi| = 1$,
$$\xi\wedge\ell(\xi) = \xi\wedge((1, 0, 0) - \xi_1\xi) = \xi\wedge(1, 0, 0) = (0, \xi_3, -\xi_2)$$
is the tangent field to latitude lines winding around the $x_1$-axis. Since this is an odd function, the magnetic field vanishes at the origin, $B(0) = 0$.

2.2. The competing solutions

First verify the stationary phase formula (1.6) from the introduction. Consider $E$ given by (1.5) with $e \in C^\infty(\{|\xi| = 1\})$. For $x$ large, the integral (1.5) has two stationary points, $\xi = \pm x/|x|$. At $\xi = x/|x|$ parametrize the surface by coordinates in the tangent plane at $x$ to find that the phase $x\xi$ has a strict maximum equal to $|x|$ and hessian equal to $-I_{(d-1)\times(d-1)}$. At $\xi = -x/|x|$, the phase has a minimum with value $-|x|$ and hessian equal to the identity. The stationary phase method yields (1.6). The energy in the electric field satisfies (1.7) and (1.8).

Theorem 2.4. Suppose that $E \in \mathcal{S}'(\mathbb{R}^d)$ is a tempered solution of the reduced wave equation given by (1.5) with $e \in \mathcal{D}'(S^{d-1})$. Then, (1.7) holds if and only
if $e \in L^2(S^{d-1}; \mathbb{C}^d)$. In that case (1.8) holds. In addition the stationary phase approximation holds in the sense that as $R \to \infty$,
$$\int_{R\le|x|\le 2R}\left|E(x) - \frac{1/\sqrt{2\pi}}{|x|^{(d-1)/2}}\Big(e^{-i|x|}e(-x/|x|) + e^{i\pi(d-1)/4}e^{i|x|}e(x/|x|)\Big)\right|^2 dx = o(R).$$

Proof. The first two assertions are consequences of Hörmander [1, Theorems 7.1.27 and 7.1.28]. Theorem 7.1.28 also implies that
$$\sup_{R\ge 1}\frac{1}{R}\int_{R\le|x|\le 2R}|E(x)|^2\,dx \le c(d)\int_{|\xi|=1}|e(\xi)|^2\,d\sigma.$$
This estimate shows that to prove the third assertion it suffices to prove it for the dense set of $e \in C^\infty(S^{d-1})$. In that case the result is a consequence of the stationary phase formula (1.6).

Definition 2.5. $H$ is the closed subspace of $e \in L^2(S^{d-1}; \mathbb{C}^d)$ consisting of $e(\xi)$ so that $\xi\cdot e(\xi) = 0$. Denote by $\Pi$ the orthogonal projection of $L^2(S^{d-1}; \mathbb{C}^d)$ onto the closed subspace $H$.

The next example explains a connection between the solutions of the reduced equation that we consider and those satisfying the Sommerfeld radiation conditions.

Example 2.6. If $g \in \mathcal{E}'(\mathbb{R}^d : \mathbb{C}^d)$ is a distribution with compact support, then there are unique solutions of the reduced wave equation
$$(\Delta + 1)E_{\mathrm{out}} = g, \qquad (\text{respectively } (\Delta + 1)E_{\mathrm{in}} = g)$$
satisfying the outgoing (respectively incoming) radiation conditions. The difference $E := E_{\mathrm{out}} - E_{\mathrm{in}}$ is a solution of the homogeneous reduced wave equation. The field $F := e^{it}E(x)$ is the unique solution of the initial value problem
$$\Box F = 0, \qquad F|_{t=0} = g, \qquad F_t|_{t=0} = ig.$$
When $g \in L^2$ the formula $E(x)\delta(\tau - 1) = c\int_{-\infty}^{\infty} e^{-i\tau t}F\,dt$ together with the solution formula for the Cauchy problem imply that (1.7) holds (or see [1, Theorem 14.3.4] showing that both incoming and outgoing fields satisfy (1.7)). More generally, the Fourier transforms in time of solutions of Maxwell’s equations with compactly supported divergence free square integrable initial data yield examples of monochromatic solutions in our class.

2.3. Spherical symmetry is impossible

It is natural to think that focusing is maximized if waves come in equally in all directions. For the scalar wave equation that is the case. However, such waves do
not exist for Maxwell’s equations. Whatever the definition of spherical symmetry, such a field must satisfy the hypotheses of the following theorem.

Theorem 2.7. If $E(x) \in C^1(\mathbb{R}^d)$ satisfies $\mathrm{div}\,E = 0$ and for $x \ne 0$ the angular part
$$E - \left(E\cdot\frac{x}{|x|}\right)\frac{x}{|x|}$$
has length that depends only on $|x|$ then $E$ is identically equal to zero.

Proof. The restriction of the angular part of $E$ to each sphere $|x| = r$ is a $C^1$ vector field tangent to the sphere and of constant length. Brouwer’s Theorem asserts that there is a point $x$ on the sphere where the tangent vector field vanishes. Therefore the constant length is equal to zero and $E$ is radial. Therefore in $x \ne 0$, $E(x) = \varphi(|x|)x$. Since $E \in C^1$ it follows that $\varphi \in C^1(\{|x| > 0\})$. Compute for those $x$,
$$\mathrm{div}\,E = \varphi\,\mathrm{div}\,x + (\nabla_x\varphi)\cdot x = d\varphi + r\varphi_r.$$
Therefore in $x \ne 0$, $r\varphi_r + d\varphi = 0$ so $\varphi = cr^{-d}$. Since $E$ is continuous at the origin it follows that $c = 0$ so $E = 0$ in $x \ne 0$. By continuity, $E$ vanishes identically.

3. Maximum Field Strengths

We solve the variational problems associated to the functional $J_1$ to yield sharp pointwise bounds on monochromatic waves. The fact that the bounds for electromagnetic fields are smaller shows that focusing effects are weaker. The extremizing fields are first characterized by their Fourier Transforms. Explicit formulas in $x$-space are given in Sec. 3.4.

3.1. Scalar waves

Theorem 3.1. If
$$u(x) = \int_{|\xi|=1} e^{ix\xi}f(\xi)\,d\sigma, \qquad f \in L^2(S^{d-1}),$$
and $x \in \mathbb{R}^d$, then
$$|u(x)| \le |S^{d-1}|^{1/2}\,\|f\|_{L^2(S^{d-1})}$$
with equality achieved if and only if $f$ is a scalar multiple of $e^{-ix\xi}$.

Proof. The quantity to maximize is the $L^2(S^{d-1})$ scalar product of $f$ with $e^{-ix\xi}$. The result is exactly the Cauchy–Schwarz inequality.
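The bound of Theorem 3.1 can be illustrated in the simplest case. The sketch below assumes $d = 3$ and the constant density $f \equiv 1$, for which $\|f\|_{L^2(S^2)} = |S^2|^{1/2}$ and the bound reads $|u(x)| \le 4\pi$; the angular integral is reduced to a one-dimensional integral over $s = \cos\theta$ and evaluated with a composite Simpson rule. The function names and the sample radii are choices made here.

```python
import cmath
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def u_constant_density(r):
    """u(x) for f = 1 on S^2 and |x| = r, using the substitution s = cos(theta)."""
    return 2 * math.pi * simpson(lambda s: cmath.exp(1j * r * s), -1.0, 1.0)

# Bound |S^2|^{1/2} * ||f|| for f = 1, i.e. sqrt(4 pi) * sqrt(4 pi).
bound = 4 * math.pi
values = {r: abs(u_constant_density(r)) for r in (0.0, 0.5, 1.0, 2.0, 5.0)}
for r, v in values.items():
    print(f"r = {r}: |u| = {v:.6f} (bound {bound:.6f})")
```

Equality occurs only at $r = 0$ in this sample, in line with the statement that the extremizer at the point $x$ is a multiple of $e^{-ix\xi}$, which is constant exactly when $x = 0$.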
3.2. Electromagnetic waves

From Definition 1.5, $\ell(\xi)$ is tangent to the longitude lines on the unit sphere connecting the pole $(-1, 0, \ldots, 0)$ to the pole $(1, 0, \ldots, 0)$. It is the gradient of the restriction of the function $\xi_1$ to the unit sphere.

Theorem 3.2. If $d \ge 2$ and
$$E(x) = \int_{|\xi|=1} e^{ix\xi}e(\xi)\,d\sigma, \qquad e \in H,$$
then
$$|E(0)| \le \left(\frac{d-1}{d}\,|S^{d-1}|\right)^{1/2}\|e\|_{L^2(S^{d-1})}. \tag{3.1}$$
Equality holds if and only if $e$ is equal to a constant multiple of a rotate of $\ell(\xi)$.

Proof. By homogeneity it suffices to consider $\|e\|_{L^2(S^{d-1})} = 1$. Rotation and multiplication by a complex number of modulus one reduces to the case $E(0) = |E(0)|(1, 0, \ldots, 0)$ and $|E(0)| = \int e_1(\xi)\,d\sigma$. We need to study
$$\sup\left\{\int_{|\xi|=1} e_1(\xi)\,d\sigma \;:\; \xi\cdot e(\xi) = 0, \quad \int_{|\xi|=1}|e(\xi)|^2\,d\sigma = 1\right\}.$$
The quantity to be maximized is
$$\int_{|\xi|=1} e_1(\xi)\,d\sigma = (e, (1, 0, \ldots, 0))_{L^2(S^{d-1})}.$$
The constant function $(1, 0, \ldots, 0)$ does not belong to the subspace $H$. The projection theorem shows that the quantity is maximized for $e$ proportional to the projection of $(1, 0, \ldots, 0)$ on $H$. Equivalently, using (1.9) together with $e\cdot\xi = 0$ yields $e_1 = e\cdot(1, 0, \ldots, 0) = e\cdot(\ell(\xi) + \xi_1\xi) = e\cdot\ell$,
$$\int_{|\xi|=1} e_1(\xi)\,d\sigma = \int_{|\xi|=1} e(\xi)\cdot\ell(\xi)\,d\sigma \tag{3.2}$$
which is equal to the $H$ scalar product of $e$ and $\ell$. Since one has the orthogonal decomposition
$$(1, 0, \ldots, 0) = \ell(\xi) + \xi_1\xi,$$
one has $1 = |\ell(\xi)|^2 + \xi_1^2$. The Cauchy–Schwarz inequality shows that the quantity (3.2) is
$$\le \|e\|_{L^2(S^{d-1})}\,\|\ell\|_{L^2(S^{d-1})} = \|e\|_{L^2(S^{d-1})}\left(\int_{|\xi|=1}(1 - \xi_1^2)\,d\sigma\right)^{1/2}. \tag{3.3}$$
The extremum is attained uniquely when $e = z\ell/\|\ell\|$ with $|z| = 1$.
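To evaluate the integral on the right-hand side of (3.3) compute,
$$\int_{|\xi|=1}\xi_1^2\,d\sigma = \frac{1}{d}\sum_j\int_{|\xi|=1}\xi_j^2\,d\sigma = \frac{1}{d}\int_{|\xi|=1}\sum_j\xi_j^2\,d\sigma = \frac{1}{d}\int_{|\xi|=1}1\,d\sigma = \frac{|S^{d-1}|}{d}.$$
Therefore
$$\int_{|\xi|=1}(1 - \xi_1^2)\,d\sigma = |S^{d-1}| - \frac{|S^{d-1}|}{d} = \frac{d-1}{d}\,|S^{d-1}|.$$
Together with (3.3) this proves (3.1).

Remark 3.3. If one constrains $e$ to have support in a subset $\Omega$ then with $\chi$ denoting the characteristic function of $\Omega$,
$$\int_{|\xi|=1} e(\xi)\cdot\ell(\xi)\,d\sigma = \int_{|\xi|=1} e(\xi)\cdot\ell(\xi)\chi(\xi)\,d\sigma$$
and $E_1$ is maximized by the choice $e = \ell(\xi)\chi(\xi)$. In the extreme light initiative, $\Omega$ is a small number of disks distributed around the equator $x_1 = 0$.

3.3. Derivative bounds

Corollary 3.4. If $d \ge 2$ and $E$ satisfies
$$E(x) = \int_{|\xi|=1} e^{ix\xi}e(\xi)\,d\sigma, \qquad e \in H,$$
then for all $\alpha \in \mathbb{N}^d$ and $x \in \mathbb{R}^d$,
$$|\partial_x^\alpha E(x)| \le \left(\frac{d-1}{d}\,|S^{d-1}|\right)^{1/2}\|e\|_{L^2(S^{d-1})}. \tag{3.4}$$

Proof. The case $\alpha = 0$ follows from Theorem 3.2 applied, for a fixed point $\underline{x}$, to
$$\underline{E}(x) := E(\underline{x} + x) = \int_{|\xi|=1} e^{ix\xi}e^{i\underline{x}\xi}e(\xi)\,d\sigma =: \int_{|\xi|=1} e^{ix\xi}\underline{e}(\xi)\,d\sigma,$$
since $\underline{e}$ is orthogonal to $\xi$ and has the same pointwise modulus as $e$. Compute for $|\alpha| > 0$
$$\partial_x^\alpha E = \partial_x^\alpha\int_{|\xi|=1} e^{ix\xi}e(\xi)\,d\sigma = \int_{|\xi|=1} e^{ix\xi}(i\xi)^\alpha e(\xi)\,d\sigma,$$
which is of the same form as $E$ with density $(i\xi)^\alpha e(\xi)$ orthogonal to $\xi$. Since $|\xi_j| \le 1$ it follows that $|\xi^\alpha| \le 1$ so $\|(i\xi)^\alpha e\|_{L^2(S^{d-1})} \le \|e\|_{L^2(S^{d-1})}$. Therefore the general case follows from the case $\alpha = 0$.

Remark 3.5. Derivative bounds for the scalar case are derived in the same way. They lack the factor $(d-1)/d$.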
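The symmetry computation behind the constant $(d-1)/d$ can be cross-checked by quadrature. The sketch below is an illustration with helper names chosen here: it integrates $\xi_1^2$ over $S^{d-1}$ in the polar angle $\theta$ (with $\xi_1 = \cos\theta$ and surface element $|S^{d-2}|\sin^{d-2}\theta\,d\theta$) and compares with $|S^{d-1}|/d$ for several dimensions.

```python
import math

def sphere_area(n):
    """Surface area |S^{n-1}| of the unit sphere in R^n."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def integral_xi1_squared(d):
    """Integral of xi_1^2 over S^{d-1}, d >= 2, with xi_1 = cos(theta)."""
    latitude_area = sphere_area(d - 1)  # |S^{d-2}|, equal to 2 when d = 2
    return latitude_area * simpson(
        lambda t: math.cos(t) ** 2 * math.sin(t) ** (d - 2), 0.0, math.pi
    )

def squared_bound_constant(d):
    """Integral of (1 - xi_1^2) over S^{d-1}, the squared constant in (3.1)."""
    return sphere_area(d) - integral_xi1_squared(d)

for d in range(2, 7):
    ratio = squared_bound_constant(d) / sphere_area(d)
    print(f"d = {d}: ratio to the scalar constant = {ratio:.6f}, (d-1)/d = {(d - 1) / d:.6f}")
```

For $d = 3$ the ratio is $2/3$, the factor quoted in the abstract for the squared field strength.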
3.4. Formulas for the extremizing fields

The electric field corresponding to the extremizing density is explicitly calculated. The computation relies on relations between Bessel functions, spherical harmonics, and the Fourier Transform. These relations are needed later to analyze $J_2$. Start from identities in Stein–Weiss [2]. Their Fourier transform is defined as $\int f(x)e^{-i2\pi x\xi}\,dx$, n.b. the $2\pi$ in the exponent. We will not follow this convention, so adapt their identities. The Bessel function of order $k$ is ([2, p. 153]),
$$J_k(t) = \frac{(t/2)^k}{\Gamma[(2k+1)/2]\,\Gamma(1/2)}\int_{-1}^{1} e^{its}(1 - s^2)^{(2k-1)/2}\,ds, \qquad -1/2 < k \in \mathbb{R}. \tag{3.5}$$
Theorem 3.10 ([2, p. 158]) is the following.

Theorem 3.6. If $x \in \mathbb{R}^d$, $f(x) = f(|x|)P(x) \in L^1(\mathbb{R}^d)$ with $P$ a homogeneous harmonic polynomial of degree $k$, then $\int f(x)e^{-2\pi ix\xi}\,dx = F(|\xi|)P(\xi)$ with
$$F(r) = 2\pi i^{-k}r^{-(d+2k-2)/2}\int_0^\infty f(s)J_{(d+2k-2)/2}(2\pi rs)\,s^{(d+2k)/2}\,ds.$$
This theorem is equivalent, by scaling and linear combination, to the same formula with $f = \delta(r - 1)$. That case is the identity,
$$\int_{|x|=1} e^{-i2\pi x\xi}P(x)\,d\sigma = 2\pi i^{-k}|\xi|^{-(d+2k-2)/2}J_{(d+2k-2)/2}(2\pi|\xi|)P(\xi). \tag{3.6}$$

Remark 3.7. (i) For $|\xi| \to \infty$, $J(|\xi|) = O(|\xi|^{-1/2})$, and $P(\xi) = O(|\xi|^k)$ so the right-hand side is $O(|\xi|^{-(d-2)/2 - 1/2}) = O(|\xi|^{-(d-1)/2})$ as required by the principle of stationary phase. (ii) For $|\xi| \to 0$, $J_{(d+2k-2)/2}(|\xi|) = O(|\xi|^{(d+2k-2)/2})$ so the right-hand side of (3.6) is $O(|\xi|^k)$. The higher the order of $P$ the smaller is the Fourier transform near the origin.

To adapt to the Fourier transform without the $2\pi$ in the exponent, use the substitution $\eta = 2\pi\xi$, $|\eta| = 2\pi|\xi|$ to find,
$$\int_{|x|=1} e^{-ix\eta}P(x)\,d\sigma = 2\pi i^{-k}(|\eta|/2\pi)^{-(d+2k-2)/2}J_{(d+2k-2)/2}(|\eta|)P(\eta/2\pi).$$
Using the homogeneity of $P$ yields
$$(2\pi)^{1-k}i^{-k}(|\eta|/2\pi)^{-(d+2k-2)/2}J_{(d+2k-2)/2}(|\eta|)P(\eta).$$
The exponent of $2\pi$ is equal to $d/2$ yielding,
$$\int_{|x|=1} e^{-ix\eta}P(x)\,d\sigma = (2\pi)^{d/2}i^{-k}|\eta|^{-(d+2k-2)/2}J_{(d+2k-2)/2}(|\eta|)P(\eta). \tag{3.7}$$
Since $|\eta|^{-k}P(\eta) = P(\eta/|\eta|)$, (3.7) is equivalent to,
$$\int_{|x|=1} e^{-ix\eta}P(x)\,d\sigma = (2\pi)^{d/2}i^{-k}|\eta|^{-(d-2)/2}J_{(d+2k-2)/2}(|\eta|)P(\eta/|\eta|). \tag{3.8}$$
The change of variable $\eta \to -\eta$ yields,
$$\int_{|x|=1} e^{ix\eta}P(x)\,d\sigma = (2\pi)^{d/2}(-i)^{-k}|\eta|^{-(d-2)/2}J_{(d+2k-2)/2}(|\eta|)P(\eta/|\eta|). \tag{3.9}$$
Finally, interchange the role of $x$ and $\eta$ to find,
$$\int_{|\eta|=1} e^{ix\eta}P(\eta)\,d\sigma = (2\pi)^{d/2}(-i)^{-k}|x|^{-(d-2)/2}J_{(d+2k-2)/2}(|x|)P(x/|x|). \tag{3.10}$$
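Identity (3.10) can be spot-checked numerically. The sketch below makes several simplifying assumptions that are not in the paper: $d = 3$, $x = r e_1$ on the first coordinate axis, and $P(\eta) = \eta_1^k$ with $k \in \{0, 1\}$ (both harmonic), so that $P(x/|x|) = 1$ and the sphere integral reduces to a one-dimensional integral in $s = \eta_1$. The Bessel functions of order $1/2$ and $3/2$ are computed from the integral representation (3.5) by Simpson quadrature; all function names are chosen here.

```python
import cmath
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3

def bessel_from_3_5(k, t):
    """J_k(t) from (3.5); used here only for k = 1/2 and k = 3/2 (smooth integrand)."""
    integrand = lambda s: cmath.exp(1j * t * s) * (1.0 - s * s) ** (k - 0.5)
    prefactor = (t / 2) ** k / (math.gamma(k + 0.5) * math.gamma(0.5))
    return (prefactor * simpson(integrand, -1.0, 1.0)).real

def lhs_3_10(k, r):
    """Integral of exp(i x.eta) * eta_1^k over S^2 with x = r * e_1, k in {0, 1}."""
    return 2 * math.pi * simpson(lambda s: cmath.exp(1j * r * s) * s ** k, -1.0, 1.0)

def rhs_3_10(k, r):
    """Right-hand side of (3.10) for d = 3, where P(x/|x|) = 1 for the choices above."""
    order = (3 + 2 * k - 2) / 2
    return (2 * math.pi) ** 1.5 * (-1j) ** (-k) * r ** (-0.5) * bessel_from_3_5(order, r)

for k in (0, 1):
    for r in (0.5, 2.0):
        print(k, r, lhs_3_10(k, r), rhs_3_10(k, r))
```

For $k = 0$ both sides reduce to $4\pi\sin r/r$, the radial field of Example 3.8.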
Example 3.8. The second most interesting example is the extremizing field for the scalar case when $d = 3$. In that case, $P$ = constant and there is a short derivation. The function $u(x) := \int_{|\xi|=1} e^{ix\xi}\,d\sigma$ is a radial solution of $(\Delta + 1)u = 0$. In $x \ne 0$ these are spanned for $d = 3$ by $e^{\pm ir}/r$. Smoothness at the origin forces $u = A\sin r/r$. Since $u(0) = |S^{d-1}|$ it follows that $A = |S^{d-1}|$.

The most interesting case for us is $d = 3$ and the extremizing field $E$ with $e(\xi) = \ell(\xi)$. Since $\ell$ is not a spherical harmonic, the preceding result does not apply directly. To find the exact electric field, decompose $\ell$ in spherical harmonics.

Lemma 3.9. The spherical harmonic expansion of the restriction of $\ell(\xi)$ to the unit sphere $S^{d-1} \subset \mathbb{R}^d$ is
$$\ell(\xi) = \left(\frac{d-1}{d} - \sum_{j=2}^{d}\frac{\xi_1^2 - \xi_j^2}{d},\; -\xi_1\xi_2,\; \ldots,\; -\xi_1\xi_d\right) \quad \text{on } |\xi| = 1. \tag{3.11}$$

Proof. When $|\xi| = 1$,
$$\ell(\xi) = (1, 0, \ldots, 0) - \xi_1\xi = (1, -\xi_1\xi_2, \ldots, -\xi_1\xi_d) - (\xi_1^2, 0, \ldots, 0).$$
The first summand has coordinates that are spherical harmonics. Decompose
$$\xi_1^2 = \frac{\xi_1^2 + \cdots + \xi_d^2}{d} + \sum_{j=2}^{d}\frac{\xi_1^2 - \xi_j^2}{d}, \qquad \xi \in \mathbb{R}^d, \tag{3.12}$$
September 20, J070-S0129055X11004448
2011 11:24 WSPC/S0129-055X
148-RMP
Optimal Focusing
851
to find the expansion in spherical harmonics of the restriction of ξ12 to the unit sphere ξ12 + · · · + ξd2 = 1, ξ12
1 ξ12 − ξj2 = + d j=2 d d
on ξ12 + · · · + ξd2 = 1.
(3.13)
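The decomposition can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: it compares the field $(1,0,\dots,0) - \xi_1\xi$ from the proof with the right-hand side of (3.11) at random points of the unit sphere. The function names and the random sampling are illustrative assumptions.

```python
import math
import random

def ell(xi):
    """Tangential projection (1, 0, ..., 0) - xi_1 * xi at a unit vector xi."""
    return [(1.0 if i == 0 else 0.0) - xi[0] * xi[i] for i in range(len(xi))]

def ell_expansion(xi):
    """Right-hand side of (3.11); only meaningful when |xi| = 1."""
    d = len(xi)
    first = (d - 1) / d - sum((xi[0] ** 2 - xi[j] ** 2) / d for j in range(1, d))
    return [first] + [-xi[0] * xi[j] for j in range(1, d)]

def random_unit_vector(d, rng):
    """Normalize a Gaussian sample to obtain a point on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def max_discrepancy(d, trials=100, seed=0):
    """Largest componentwise gap between the two expressions over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        xi = random_unit_vector(d, rng)
        gap = max(abs(a - b) for a, b in zip(ell(xi), ell_expansion(xi)))
        worst = max(worst, gap)
    return worst
```

Calling `max_discrepancy(3)` should return a value at the level of floating point rounding, since the two expressions agree identically on the sphere.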
Using (3.13) in (3.12) proves (3.11).

Example 3.10. The stationary phase formula (1.6) applied to the extremizing $e = \ell(\xi)$, which is an even function, yields for $d = 3$
$$E(x) = \frac{-1}{\sqrt{2\pi}}\, \frac{\sin|x|}{|x|}\, \ell\Big(\frac{x}{|x|}\Big) + O(|x|^{-2}).$$
This is the sum of incoming and outgoing waves with spherical wave fronts, each with profile on large spheres proportional to $\ell(x/|x|)$. The desired incoming wave is such an $\ell$-wave.

4. Equivalent Selfadjoint Eigenvalue Problems

This section introduces eigenvalue problems equivalent to the maximization of $J_2$.

4.1. The eigenvalue problem for focusing scalar waves

Definition 4.1. For $R > 0$ define the compact linear operator $L : L^2(S^{d-1}) \to L^2(B_R(0))$ by
$$(Lf)(x) := \int_{|\xi|=1} e^{ix\xi} f(\xi)\,d\sigma.$$
The operator $L$ commutes with rotations. The adjoint $L^*$ maps $L^2(B_R(0)) \to L^2(S^{d-1})$.

Proposition 4.2. The following four problems are equivalent:
(i) Maximize the functional $J_2$ on scalar monochromatic waves.
(ii) Find $f \in L^2(S^{d-1})$ with $\|f\|_{L^2(S^{d-1})} = 1$ so that $\|Lf\|_{L^2(B_R(0))}$ is largest.
(iii) Find the norm of $L$.
(iv) Find the largest eigenvalue of the positive compact selfadjoint operator $L^*L$ on $L^2(S^{d-1})$.

Proof. The equivalence of the first three follows from the definitions. The equivalence with the fourth follows from the identity
$$\|Lf\|^2_{L^2(B_R)} = (Lf, Lf)_{L^2(B_R)} = (L^*Lf, f)_{L^2(S^{d-1})}.$$
Definition 4.3. Define a rank one operator
$$L^2(S^{d-1}; \mathbb{C}) \ni f \mapsto L_0 f := \int_{|\xi|=1} f(\xi)\,d\sigma \in \mathbb{C}.$$

Remark 4.4. (i) The problem of maximizing $J_1$ for scalar waves is equivalent to finding the norm of $L_0$ and also to finding the largest eigenvalue of $L_0^* L_0$. (ii) The same formula defines an operator from $L^2(S^{d-1}) \to L^2(B_R(0))$ mapping $f$ to a constant function. With only small risk of confusion we use the same symbol $L_0$ for that operator too.

Definition 4.5. The vector valued versions of $L$ and $L_0$ are defined by
$$(\mathbf{L}e)(x) := \int_{|\xi|=1} e^{ix\xi} e(\xi)\,d\sigma, \qquad e \in L^2(S^{d-1}; \mathbb{C}^d),$$
$$\mathbf{L}_0 e := \int_{|\xi|=1} e(\xi)\,d\sigma, \qquad e \in L^2(S^{d-1}; \mathbb{C}^d).$$
Denote by $\mathbf{L}_M$ and $\mathbf{L}_{0,M}$ the restrictions of $\mathbf{L}$ and $\mathbf{L}_0$ to $H$.

Remark 4.6. (i) The problem of maximizing the functional $J_1$ for monochromatic electromagnetic waves is equivalent to finding the largest eigenvalue of $\mathbf{L}_{0,M}^* \mathbf{L}_{0,M}$. (ii) The problem of maximizing the functional $J_2$ for monochromatic electromagnetic waves is equivalent to finding the largest eigenvalue of $\mathbf{L}_M^* \mathbf{L}_M$. (iii) The operator $\mathbf{L}\Pi$ is equal to $\mathbf{L}_M$ on $H$ and equal to zero on $H^\perp$. Therefore $(\mathbf{L}\Pi)^*(\mathbf{L}\Pi)$ is equal to $\mathbf{L}_M^* \mathbf{L}_M$ on $H$ and equal to zero on $H^\perp$.

5. Exact Eigenvalue Computations

5.1. The operators $L_0^* L_0$, $\mathbf{L}_0^* \mathbf{L}_0$ and $\mathbf{L}_{0,M}^* \mathbf{L}_{0,M}$

Theorem 5.1. (i) The spectrum of $L_0^* L_0$ contains one nonzero eigenvalue, $|S^{d-1}|$, with multiplicity one. The eigenvectors are the constant functions. (ii) The spectrum of the operator $\mathbf{L}_0^* \mathbf{L}_0$ contains one nonzero eigenvalue, $|S^{d-1}|$, with multiplicity $d$. The eigenvectors are the $\mathbb{C}^d$-valued constant functions. (iii) The spectrum of $\mathbf{L}_{0,M}^* \mathbf{L}_{0,M}$ contains one nonzero eigenvalue, $|S^{d-1}|(d-1)/d$, with multiplicity $d$. The corresponding eigenspace consists of $\ell(\xi)$ and functions obtained by rotation and scalar multiplication.

Proof of Theorem 5.1(iii). Suppose that $f$ is an eigenfunction in $H$ so that the norm of $\int f\,d\sigma$ is maximal. Rotating $f$ yields a function in $H$ with the same $\|\mathbf{L}_0 f\|$ and with $\mathbf{L}_0 f$ parallel to $(1, 0, \dots, 0)$. Therefore maximizing $\|\mathbf{L}_0 f\|$ and maximizing $\mathbf{L}_0 f \cdot (1, 0, \dots, 0)$ yield the same extreme value.
Since $f \in H$,
$$\int f \cdot (1, 0, \dots, 0)\,d\sigma = \int f_1\,d\sigma = \int f \cdot \ell\,d\sigma.$$
The extreme value is attained for $f$ parallel to $\ell$. This shows that $\ell$ is an eigenfunction corresponding to the largest eigenvalue. Rotating and taking scalar multiples yields a complex eigenspace of dimension $d$. Since rank $\mathbf{L}_0 = d$, these are all the eigenfunctions. If $\|f\|_{L^2(S^{d-1})} = 1$, then the maximization of $J_0$ shows that $|\mathbf{L}_{0,M} f|^2 = |S^{d-1}|(d-1)/d$, proving the formula for the eigenvalue.

5.2. The eigenfunctions and eigenvalues of $L^*L$ and $\mathbf{L}^*\mathbf{L}$

Theorem 5.2. In dimension $d$, the spherical harmonics of order $k$ are eigenfunctions of $L^*L$ with eigenvalue
$$\Lambda_{d,k}(R) := (2\pi)^{d/2}\, |S^{d-1}| \int_0^R r\, [J_{(d+2k-2)/2}(r)]^2\,dr. \tag{5.1}$$
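For $d = 3$ the Bessel functions in (5.1) have half-integer order and are elementary, so the integral can be approximated with very little machinery. The sketch below evaluates only the integral $\int_0^R r\,[J_{k+1/2}(r)]^2\,dr$ for $k = 0, 1, 2$, leaving out the constant prefactor. The function names, the composite Simpson rule and the step count are illustrative assumptions, not the computation used in the paper.

```python
import math

def bessel_half_integer(k, r):
    """J_{k+1/2}(r) for k = 0, 1, 2 and r > 0, from the elementary closed forms."""
    pref = math.sqrt(2.0 / (math.pi * r))
    s, c = math.sin(r), math.cos(r)
    if k == 0:
        return pref * s
    if k == 1:
        return pref * (s / r - c)
    if k == 2:
        return pref * ((3.0 / r**2 - 1.0) * s - 3.0 * c / r)
    raise ValueError("only k = 0, 1, 2 are implemented")

def integrand(k, r):
    # r * J^2 tends to 0 as r -> 0, so the value at r = 0 is set to 0
    return 0.0 if r == 0.0 else r * bessel_half_integer(k, r) ** 2

def normalized_lambda(k, R, n=2000):
    """Composite Simpson approximation of int_0^R r J_{k+1/2}(r)^2 dr (n must be even)."""
    h = R / n
    total = integrand(k, 0.0) + integrand(k, R)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(k, i * h)
    return total * h / 3.0

def closed_form_k0(R):
    """Exact value for k = 0: (2/pi) * int_0^R sin(r)^2 dr."""
    return (R - 0.5 * math.sin(2.0 * R)) / math.pi
```

For $k = 0$ the quadrature can be compared with `closed_form_k0`, which gives an independent check of the step size before the same routine is used for $k = 1, 2$.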
Remark 5.3. From the point of view of focusing of energy into balls, all spherical harmonics of the same order are equivalent.

Proof of Theorem 5.2. If $P$ is a homogeneous harmonic polynomial of degree $k$, formula (3.10) shows that $(LP)(x) = \phi_{d,k}(|x|) P(x)$, defining the function $\phi$. The operator $L^*$ is an integral operator from $L^2(B_R) \to L^2(S^{d-1})$ with kernel $e^{-ix\xi}$. Therefore
$$L^*L(P) = \int_{|x| \le R} e^{-ix\xi} \phi_{d,k}(|x|) P(x)\,dx.$$
Introduce polar coordinates $x = ry$ with $|y| = 1$ to find
$$L^*L(P) = |S^{d-1}| \int_0^R r^{d-1} \int_{|y|=1} e^{-iry\xi} \phi_{d,k}(r)\, r^k P(y)\,d\sigma(y)\,dr.$$
Formula (3.10) shows that
$$\int_{|y|=1} e^{-iry\xi} P(y)\,d\sigma(y) = \phi_{d,k}(r) P(-r\xi) = (-r)^k \phi_{d,k}(r) P(\xi).$$
Therefore
$$L^*L(P) = \int_0^R (-1)^k\, |S^{d-1}|\, r^{d+2k-1} \phi_{d,k}(r)^2\,dr\ P(\xi).$$
This proves that the harmonic polynomials are eigenvectors with eigenvalue depending only on $d$ and $k$. The formula for the eigenvalue follows on noting that the eigenvalue is equal to the square of the $L^2(B_R(0))$ norm of
$$F(x) := \int_{|\eta|=1} e^{ix\eta} P(\eta)\,d\sigma, \qquad \int_{|\eta|=1} |P(\eta)|^2\,d\sigma = 1. \tag{5.2}$$
Using polar coordinates and (3.10) yields
$$\|F\|^2_{B_R(0)} = (2\pi)^{d/2} |S^{d-1}| \int_0^R r^{-(d-2)} [J_{(d+2k-2)/2}(r)]^2\, r^{d-1}\,dr = (2\pi)^{d/2} |S^{d-1}| \int_0^R r\, [J_{(d+2k-2)/2}(r)]^2\,dr. \tag{5.3}$$
This proves (5.1).

The spectral decomposition of $\mathbf{L}$ is nearly identical to that of $L$. The next result is elementary.

Corollary 5.4. The eigenvalues of $\mathbf{L}^*\mathbf{L}$ are the same as the eigenvalues of $L^*L$. The eigenspaces consist of vector valued functions each of whose components belongs to the corresponding eigenspace of $L^*L$.

5.3. Some eigenfunctions and eigenvalues of $\mathbf{L}_M^* \mathbf{L}_M$

The situation for $\mathbf{L}_M^* \mathbf{L}_M$ is more subtle. Our first two results show that there are eigenfunctions intimately related to the eigenvalues $\Lambda_{d,1}(R)$ and $\Lambda_{d,0}(R)$.

Theorem 5.5. The $d$-dimensional space of functions $e(\xi) := \zeta \wedge \xi$ with $\zeta \in \mathbb{C}^d \setminus 0$ consists of eigenfunctions of $\mathbf{L}_M^* \mathbf{L}_M$. The eigenvalue is $\Lambda_{d,1}(R)$.

Remark 5.6. These $e(\xi)$ are the vector valued spherical harmonics of degree 1 that belong to $H$, that is, that satisfy $\xi \cdot e(\xi) = 0$.

Proof of Theorem 5.5. Follows from $\mathbf{L}^*\mathbf{L}e = \Lambda_{d,1}(R)e$ and $e \in H$.

Though the constant functions that are eigenvectors of $\mathbf{L}^*\mathbf{L}$ do not belong to $H$, their projections on $H$ yield eigenvectors of $\mathbf{L}_M^* \mathbf{L}_M$.

Theorem 5.7. The $d$-dimensional space consisting of scalar multiples of rotates of $\ell(\xi)$ consists of eigenfunctions of $\mathbf{L}_M^* \mathbf{L}_M$ with eigenvalue equal to
$$(\Lambda_{d,0}(R) - \Lambda_{d,2}(R))\, \frac{d-1}{d}. \tag{5.4}$$
Proof. Use (3.11). Since the spherical harmonics are eigenfunctions of $L^*L$ one has, suppressing the $R$ dependence of $\Lambda$,
$$\mathbf{L}^*\mathbf{L}\,\ell = \Big(\Lambda_{d,0} \frac{d-1}{d} - \Lambda_{d,2} \sum_{j=2}^{d} \frac{\xi_1^2 - \xi_j^2}{d},\ -\Lambda_{d,2}\xi_1\xi_2,\ \dots,\ -\Lambda_{d,2}\xi_1\xi_d\Big).$$
Multiply (3.11) by $\Lambda_{d,2}$ to find on $\xi_1^2 + \cdots + \xi_d^2 = 1$,
$$\Lambda_{d,2}\,\ell = \Big(\Lambda_{d,2} \frac{d-1}{d} - \Lambda_{d,2} \sum_{j=2}^{d} \frac{\xi_1^2 - \xi_j^2}{d},\ -\Lambda_{d,2}\xi_1\xi_2,\ \dots,\ -\Lambda_{d,2}\xi_1\xi_d\Big).$$
Subtract from the preceding identity to find
$$\mathbf{L}^*\mathbf{L}\,\ell = (\Lambda_{d,0} - \Lambda_{d,2})\, \frac{d-1}{d}\, (1, 0, \dots, 0).$$
Projecting perpendicular to $\xi$ using $\Pi(1, 0, \dots, 0) = \ell$ yields
$$\Pi \mathbf{L}^*\mathbf{L} \Pi\,\ell = \Pi \mathbf{L}^*\mathbf{L}\,\ell = (\Lambda_{d,0} - \Lambda_{d,2})\, \frac{d-1}{d}\, \ell.$$
This proves that $\ell$ is an eigenfunction of $(\mathbf{L}\Pi)^*(\mathbf{L}\Pi)$ with eigenvalue $(\Lambda_{d,0} - \Lambda_{d,2})(d-1)/d$. Remark 4.6(iii) shows that it is an eigenfunction of $\mathbf{L}_M^* \mathbf{L}_M$ with the same eigenvalue. By rotation invariance the same is true of all scalar multiples of rotates of $\ell$. They form a $d$-dimensional vector space spanned by the projections tangent to the unit sphere of the unit vectors along the coordinate axes.

As in Theorem 5.5, if one defines $H_k$ to consist of spherical harmonics of degree $k$ that belong to $H$, then the $H_k$ are orthogonal eigenspaces of $\mathbf{L}_M^* \mathbf{L}_M$ with eigenvalue $\Lambda_{d,k}(R)$.

Example 5.8. In $\mathbb{R}^2$ the homogeneous $\mathbb{R}^2$ valued polynomials of degree two whose radial components vanish are spanned by $(-x_1 x_2, x_1^2)$ and $(x_2^2, -x_1 x_2)$. There are no harmonic functions in their span, proving that when $d = 2$, $H_2 = 0$. It is clear that $H_0 = 0$.

Though there are a substantial number of eigenvectors of $\mathbf{L}_M^* \mathbf{L}_M$ accounted for by the $H_k$, they are far from the whole story.

6. Spectral Asymptotics

6.1. Behavior of the $\Lambda_{d,k}(R)$

Proposition 6.1. (i) As $R \to 0$, $\Lambda_{d,k}(R) = O(R^{d+2k})$. (ii) As $R \to 0$, $\Lambda_{d,0}(R) = |S^{d-1}|\, |B_R(0)|\, (1 + O(R))$. (iii) $\lim_{k \to \infty} \Lambda_{d,k}(R) = 0$ uniformly on compact sets of $R$.
Proof. (i) Formula (3.5) shows that $J_k(t) = O(t^k)$ as $t \to 0$. Assertion (i) then follows from (5.1). (ii) By definition, $\Lambda_{d,0}(R)$ is the square of the $L^2(B_R)$ norm of $\int_{|\xi|=1} e^{ix\xi} f(\xi)\,d\sigma$ for $f$ a constant function of norm 1. Take $f = |S^{d-1}|^{-1/2}$. For $R$ small, $e^{ix\xi} = 1 + O(R)$, so
$$\int_{|\xi|=1} e^{ix\xi} f(\xi)\,d\sigma = (1 + O(R)) \int_{|\xi|=1} |S^{d-1}|^{-1/2}\,d\sigma = |S^{d-1}|^{1/2} (1 + O(R)).$$
Squaring and integrating over $B_R$ proves (ii). (iii) As $k \to \infty$, the prefactors in the formula for $J_k$ tend to zero uniformly since $\Gamma[(2k+1)/2]$ dominates. The integral in the definition tends to zero uniformly by Lebesgue's Dominated Convergence Theorem.
6.2. Small $R$ asymptotics of the largest eigenvalues

Each of the operators $L$, $\mathbf{L}$, and $\mathbf{L}_M$ has integral kernel $e^{ix\xi}$. They differ in the Hilbert space on which they act. For $x$ small, $e^{ix\xi} \approx 1$, showing that for $R$ small the three operators are approximated by $L_0$, $\mathbf{L}_0$, and $\mathbf{L}_{0,M}$ respectively. We know the exact spectral decomposition of the approximating operators. Each has exactly one nonzero eigenvalue. In performing the approximation some care must be exercised since the operator to be approximated has norm $O(R^{d/2})$ tending to zero as $R \to 0$.

Proposition 6.2. (i) Each of the operators $L$, $\mathbf{L}$, and $\mathbf{L}_M$ has norm no larger than $(|B_R(0)|\, |S^{d-1}|)^{1/2}$. (ii) Each of the differences $L - L_0$, $\mathbf{L} - \mathbf{L}_0$, and $\mathbf{L}_M - \mathbf{L}_{0,M}$ has norm no larger than
$$\frac{|S^{d-1}|\, R^{(d+2)/2}}{(d+2)^{1/2}}. \tag{6.1}$$

Proof. (i) Treat the case of $\mathbf{L}$. For $\|e\| = 1$, the Cauchy–Schwarz inequality implies that for each $x$, $\|\mathbf{L}e(x)\|^2_{\mathbb{C}^d} \le |S^{d-1}|$. Integrating over the ball of radius $R$ proves (i). (ii) The Cauchy–Schwarz inequality estimates the difference by
$$|(\mathbf{L} - \mathbf{L}_0)e| = \Big|\int_{|\xi|=1} (e^{ix\xi} - 1)\, e(\xi)\,d\sigma\Big| \le \int_{|\xi|=1} |x|\,|e|\,d\sigma \le |x|\, |S^{d-1}|^{1/2}\, \|e\|_{L^2(S^{d-1})}.$$
For $e$ of norm one this yields
$$\|(\mathbf{L} - \mathbf{L}_0)e\|^2_{L^2(B_R(0))} \le |S^{d-1}| \int_{B_R} |x|^2\,dx = |S^{d-1}| \int_{|\omega|=1} \int_0^R r^2\, r^{d-1}\,dr\,d\sigma(\omega) = |S^{d-1}|^2\, \frac{R^{d+2}}{d+2},$$
completing the proof.

Theorem 6.3. (i) For each $d$ there is an $R_L(d) > 0$ so that for $0 \le R < R_L(d)$ the eigenvalue $\Lambda_{d,0}(R)$ is the largest eigenvalue of $L^*L$. It has multiplicity one. The eigenfunctions are constants. (ii) For $0 \le R < R_L(d)$ the eigenvalue $\Lambda_{d,0}(R)$ is the largest eigenvalue of $\mathbf{L}^*\mathbf{L}$. It has multiplicity $d$. The eigenfunctions are constant vectors. (iii) For each $d$ there is an $R_M(d) > 0$ so that for $0 \le R < R_M(d)$ the eigenvalue
$$(\Lambda_{d,0}(R) - \Lambda_{d,2}(R))\, \frac{d-1}{d} \tag{6.2}$$
is the largest eigenvalue of $\mathbf{L}_M^* \mathbf{L}_M$. It has multiplicity $d$. The eigenfunctions are rotates of constant multiples of $\ell$. (iv) In all three cases, the other eigenvalues are $O(R^{d+1})$.

Proof. We prove (iii) and (iv) for the operator $\mathbf{L}_M^* \mathbf{L}_M$. Proposition 6.2 implies that the compact selfadjoint operators $\mathbf{L}_M^* \mathbf{L}_M$ and $\mathbf{L}_{0,M}^* \mathbf{L}_{0,M}$ differ by $O(R^{d+1})$ in norm. Theorem 5.1(iii) shows that the spectrum of $\mathbf{L}_{0,M}^* \mathbf{L}_{0,M}$ contains one positive eigenvalue, $\lambda_+ := |B_R(0)|\, |S^{d-1}|(d-1)/d$. The factor $|B_R(0)|$ arises because $\mathbf{L}_{0,M}$ in the present context is viewed as an operator with values in the functions on $B_R(0) \subset \mathbb{R}^d$. The eigenfunctions are scalar multiples of rotates of $\ell$. The rest of the spectrum is the eigenvalue 0. It follows that the spectrum of $\mathbf{L}_M^* \mathbf{L}_M$ lies in the union of disks of radius $O(R^{d+1})$ centered at zero and $\lambda_+$. For $R$ small these disks are disjoint and the eigenspace associated to the disk about $\lambda_+$ has dimension $d$. Theorem 5.7 shows that the eigenfunctions of $\mathbf{L}_{0,M}^* \mathbf{L}_{0,M}$ with eigenvalue $\lambda_+$ are eigenfunctions of $\mathbf{L}_M^* \mathbf{L}_M$. The eigenvalue is given by (5.4). It follows that for $R$ small, the scalar multiples of rotates of $\ell$ form an eigenspace of $\mathbf{L}_M^* \mathbf{L}_M$ of dimension $d$ with eigenvalue in the disk about $\lambda_+$. This completes the proof of (iii). The fact that the other eigenvalues lie in a disk of radius $O(R^{d+1})$ centered at the origin proves (iv). The proofs for the operators $L$ and $\mathbf{L}$ are similar.
7. Largest Eigenvalues for $R \le \pi/2$

7.1. Monotonicity of $\Lambda_{d,k}(R)$ in $k$

Recall that the wavelength is equal to $2\pi$.

Theorem 7.1. For $0 \le R \le \pi/2$, $\Lambda_{d,k}(R)$ is strictly monotonically decreasing in $k = 0, 1, 2, \dots$. In particular the largest eigenvalue of $L^*L$ and $\mathbf{L}^*\mathbf{L}$ is $\Lambda_{d,0}(R)$. The corresponding eigenfunctions are constant scalar and constant vector functions respectively.

Proof. Write
$$J_k(r) = \frac{2 (r/2)^k}{\Gamma[(2k+1)/2]\, \Gamma(1/2)} \int_0^1 \cos(rs)\, (1 - s^2)^{(2k-1)/2}\,ds. \tag{7.1}$$
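The integral representation (7.1) is easy to test against the elementary formulas for half-integer order. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the quadrature is a plain composite Simpson rule, and the comparison is restricted to the orders $k = 1/2$ and $k = 3/2$, for which the integrand is a smooth function of $s$.

```python
import math

def bessel_poisson(k, r, n=2000):
    """J_k(r) from the integral representation (7.1), for real order k >= 1/2."""
    def f(s):
        return math.cos(r * s) * (1.0 - s * s) ** (k - 0.5)
    h = 1.0 / n
    total = f(0.0) + f(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    integral = total * h / 3.0
    prefactor = 2.0 * (r / 2.0) ** k / (math.gamma(k + 0.5) * math.gamma(0.5))
    return prefactor * integral

def bessel_one_half(r):
    """Closed form J_{1/2}(r) = sqrt(2 / (pi r)) sin r."""
    return math.sqrt(2.0 / (math.pi * r)) * math.sin(r)

def bessel_three_halves(r):
    """Closed form J_{3/2}(r) = sqrt(2 / (pi r)) (sin r / r - cos r)."""
    return math.sqrt(2.0 / (math.pi * r)) * (math.sin(r) / r - math.cos(r))
```

For $0 < r \le \pi/2$ the values `bessel_poisson(0.5, r)`, `bessel_poisson(1.5, r)`, `bessel_poisson(2.5, r)` should come out positive and decreasing, which is the pointwise monotonicity that the proof exploits.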
For $0 \le r \le \pi/2$ the cosine factor in the integral is positive. Since $(1 - s^2)^{(2k-1)/2}$ is decreasing in $k$ for $s \in [0, 1]$, the integral is decreasing in $k$. $\Gamma[(2k+1)/2]$ is increasing in $k$. Since $r \le 2$, $(r/2)^k$ decreases with $k$. The proof is complete.

Remark 7.2. Figure 2 in Sec. 8.1 shows that the graphs of the functions $\Lambda_{3,0}(R)$ and $\Lambda_{3,1}(R)$ cross close to $R = \pi$. The proof shows that this cannot happen for $R \le \pi/2$.

7.2. Largest eigenvalues of $\mathbf{L}_M^* \mathbf{L}_M$ when $d = 3$, $R \le \pi/2$

Theorem 7.3. When $d = 3$ and $R \le \pi/2$ the strictly largest eigenvalue of $\mathbf{L}_M^* \mathbf{L}_M$ is $(\Lambda_{3,0}(R) - \Lambda_{3,2}(R))\,2/3$. The eigenfunctions are the scalar multiples of rotates of $\ell$, and $\Lambda_{3,1}(R)$ is the next largest eigenvalue.

The proof uses the following criterion valid for all $d$, $R$. For ease of reading, the $R$ dependence of $\Lambda_{d,k}(R)$ is often suppressed.

Theorem 7.4. (i) The eigenvalue $(\Lambda_{d,0} - \Lambda_{d,2})(d-1)/d$ of $\mathbf{L}_M^* \mathbf{L}_M$ is strictly larger than all others only if
$$(\Lambda_{d,0} - \Lambda_{d,2})\, \frac{d-1}{d} > \Lambda_{d,1}. \tag{7.2}$$
(ii) If in addition to (7.2), the two largest eigenvalues of $L^*L$ are $\Lambda_{d,0}$ and $\Lambda_{d,1}$, then the eigenvalue $(\Lambda_{d,0} - \Lambda_{d,2})(d-1)/d$ of $\mathbf{L}_M^* \mathbf{L}_M$ is strictly larger than the others. The eigenfunctions are the scalar multiples of rotates of $\ell$, and $\Lambda_{d,1}$ is the next largest eigenvalue of $\mathbf{L}_M^* \mathbf{L}_M$.

Remark 7.5. Equation (7.2) implies that $\Lambda_{d,0} > \Lambda_{d,1}$. The additional information in Theorem 7.4(ii) is that for all $k \ge 1$, $\Lambda_{d,1} \ge \Lambda_{d,k}$.

Proof of Theorem 7.4. (i) Since $\Lambda_{d,1}$ is an eigenvalue of $\mathbf{L}_M^* \mathbf{L}_M$, necessity is clear.
(ii) Under these hypotheses Theorem 5.2 shows that $\Lambda_{d,0}$ is the largest eigenvalue of $L^*L$ with one dimensional eigenspace consisting of constant functions. Corollary 5.4 shows that $\mathbf{L}^*\mathbf{L}$ has the same largest eigenvalue with $d$ dimensional eigenspace consisting of $\mathbb{C}^d$ valued constant functions. The next largest eigenvalue of $\mathbf{L}^*\mathbf{L}$ is $\Lambda_{d,1}$. In particular, $\mathbf{L}^*\mathbf{L}$ has exactly $d$ eigenvalues counting multiplicity that are greater than $\Lambda_{d,1}$. Since $\mathbf{L}_M$ is the restriction of $\mathbf{L}$ to a closed subspace, the min-max principle implies that $\mathbf{L}_M^* \mathbf{L}_M$ has at most $d$ eigenvalues counting multiplicity that are greater than $\Lambda_{d,1}$. Theorem 5.7 provides a $d$-dimensional eigenspace with eigenvalue given by the left-hand side of (7.2) and therefore greater than $\Lambda_{d,1}$. In particular there are exactly $d$ eigenvalues counting multiplicity that are greater than $\Lambda_{d,1}$. Theorem 5.5 shows that $\Lambda_{d,1}$ is an eigenvalue of $\mathbf{L}_M^* \mathbf{L}_M$ so it must be the next largest.

Example 7.6. Parts (i) and (ii) of Proposition 6.1 show that the sufficient condition is satisfied for small $R$. This gives a second proof that for small $R$, $\ell$ is an extreme eigenfunction for $\mathbf{L}_M^* \mathbf{L}_M$. The first proof is part (iii) of Theorem 6.3.

Proof of Theorem 7.3. Verify the sufficient condition of Theorem 7.4(ii). Since $R \le \pi/2$, the $\Lambda_{3,k}$ are strictly decreasing in $k$. Therefore $\Lambda_{3,0}$ and $\Lambda_{3,1}$ are the two largest eigenvalues of $L^*L$. It remains to verify (7.2). Use formulas (5.1) and (7.1). Formula (5.1) with $d = 3$ and $k = 0, 1, 2$ involves $J_{1/2}$, $J_{3/2}$, $J_{5/2}$. Since the integral in (7.1) is decreasing in $k$ it follows that
$$\frac{J_{k+1}(r)}{J_k(r)} \le \frac{r}{2}\, \frac{\Gamma((2k+1)/2)}{\Gamma((2k+1)/2 + 1)}.$$
The functional equation $\Gamma(n+1) = (n+1)\Gamma(n)$ yields
$$\frac{J_{k+1}(r)}{J_k(r)} < \frac{r}{2}\, \frac{1}{(2k+3)/2} = \frac{r}{2k+3}.$$
Therefore,
$$\frac{J_{3/2}(r)}{J_{1/2}(r)} < \frac{r}{4} \quad \text{and} \quad \frac{J_{5/2}(r)}{J_{3/2}(r)} \le \frac{r}{6}.$$
Injecting these estimates in (5.1) yields
$$\frac{\Lambda_{3,1}(R)}{\Lambda_{3,0}(R)} < \frac{R^2}{4^2} \quad \text{and} \quad \frac{\Lambda_{3,2}(R)}{\Lambda_{3,1}(R)} < \frac{R^2}{6^2}. \tag{7.3}$$
Fig. 2.
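Therefore,
$$\Lambda_{3,1} \le \frac{R^2}{4^2}\, \Lambda_{3,0} \quad \text{and} \quad \Lambda_{3,2} \le \frac{R^2}{6^2}\, \Lambda_{3,1} \le \frac{R^2}{4^2}\, \frac{R^2}{6^2}\, \Lambda_{3,0},$$
so
$$\frac{2}{3} (\Lambda_{3,0} - \Lambda_{3,2}) - \Lambda_{3,1} > \Lambda_{3,0} \Big[\frac{2}{3} - \frac{2}{3}\, \frac{R^4}{4^2 6^2} - \frac{R^2}{4^2}\Big] := \Lambda_{3,0}\, h(R).$$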
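The arithmetic behind the polynomial $h$ can be reproduced in exact rational arithmetic. This is a small sketch of that check only, assuming the bracket reads $h(R) = \tfrac{2}{3} - \tfrac{2}{3} R^4/(4^2 6^2) - R^2/4^2$; the function name is an illustrative choice.

```python
from fractions import Fraction

def h(R):
    """h(R) = 2/3 - (2/3) R^4 / (4^2 * 6^2) - R^2 / 4^2, evaluated exactly."""
    R = Fraction(R)
    return Fraction(2, 3) - Fraction(2, 3) * R**4 / (4**2 * 6**2) - R**2 / 4**2
```

Because both subtracted terms grow with $R \ge 0$, `h` is decreasing there, so evaluating it at the rational point $R = 2 > \pi/2$ bounds $h(\pi/2)$ from below without any floating point error.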
The polynomial $h(R)$ is equal to 2/3 when $R = 0$ and decreases as $R$ increases. To verify (7.2) it suffices to show that $h(\pi/2) > 0$. Since $2 > \pi/2$, $h(\pi/2) > h(2) = 43/108 > 0$.

8. Numerical Simulations to Determine Largest Eigenvalues

Recall that the wavelength is equal to $2\pi$. In this section, the dimension $d = 3$. Theorem 5.2, Corollary 5.4 and Theorem 7.3 allow one in favorable cases to find the largest eigenvalues of $L^*L$, $\mathbf{L}^*\mathbf{L}$, and $\mathbf{L}_M^* \mathbf{L}_M$ by evaluating the integrals defining $\Lambda_{3,k}(R)$ for $k = 0, 1, 2, \dots$. These quantities decrease rapidly with $k$, so computing the largest ones requires little work.

8.1. Simulations for scalar waves

For scalar waves the eigenvalues are exactly the $\Lambda_{3,k}(R)$. For $R$ small, they are monotone in $k$ so the optimal focusing is for $k = 0$. Our first simulation (performed with the aid of Matlab) computes approximately the integrals defining $\Lambda_{3,k}(R)$ for $R \le 2\pi$ and $k = 0, 1, 2, 3$. The resulting graphs are on the left in Fig. 2. The horizontal axis is $R$ and on the vertical axis is plotted the integral on the right-hand side of (5.1), that is,
$$\frac{\Lambda_{3,k}(R)}{(2\pi)^{3/2}\, |S^2|} = \frac{\Lambda_{3,k}(R)}{2^{7/2} \pi^{5/2}}, \qquad k = 0, 1, 2, 3.$$
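For $k = 0$ and $k = 1$ the plotted integrals can also be written in closed form, because $J_{1/2}$ and $J_{3/2}$ are elementary. The sketch below is an illustration rather than the Matlab computation of the paper: it uses $\int_0^R r J_{1/2}^2\,dr = (R - \sin R \cos R)/\pi$ and $\int_0^R r J_{3/2}^2\,dr = (R + \sin R \cos R - 2\sin^2 R / R)/\pi$, and locates the point where the two curves meet by bisection. The function names and the bracketing interval are assumptions.

```python
import math

def lam0(R):
    """int_0^R r J_{1/2}(r)^2 dr in closed form."""
    return (R - math.sin(R) * math.cos(R)) / math.pi

def lam1(R):
    """int_0^R r J_{3/2}(r)^2 dr in closed form (R > 0)."""
    return (R + math.sin(R) * math.cos(R) - 2.0 * math.sin(R) ** 2 / R) / math.pi

def bisect_root(f, a, b, tol=1e-12):
    """Bisection for a sign change of f on [a, b]."""
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if (fa > 0) == (fm > 0):
            a, fa = mid, fm
        else:
            b = mid
    return 0.5 * (a + b)

def crossing():
    """Radius in [2.5, 3.5] at which the k = 0 and k = 1 curves meet."""
    return bisect_root(lambda R: lam0(R) - lam1(R), 2.5, 3.5)
```

With these formulas the difference factors as $\tfrac{2}{\pi} \sin R\,(\sin R / R - \cos R)$, which vanishes at $R = \pi$, so under these assumptions the crossing sits exactly at half a wavelength.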
Fig. 3.
The four curves correspond to the four values of $k$. The graph with the leftmost hump is $\Lambda_{3,0}(R)$. The graph with the hump second from the left is $\Lambda_{3,1}(R)$ and so on. The conclusion is that $\Lambda_{3,0}(R)$ crosses transversally the graph of $\Lambda_{3,1}(R)$ just to the right of $R = 3$. At that point, $\Lambda_{3,1}(R)$ becomes the largest. On the right is a zoom showing that the crossing is suspiciously close to $R = \pi$.

The graphs of the $\Lambda_{3,k}(R)$ are a little misleading since it is not the total energy but the energy density that is of interest. Figure 3 plots $\Lambda_{3,0}/(2^{7/2} \pi^{5/2} |B_R(0)|)$ as a function of $R$. The small gap near $R = 0$ is because the division by $|B_R(0)|$ is a sensitive operation and leads to numerical errors in that range. The energy density is greatest for balls with radius close to $R = 0$. The density drops to half its maximum value at about $R = 2$, which is about 1/3 of the wavelength.

8.2. Simulations for electromagnetic waves

Using Theorem 7.3 one can investigate the analogous questions for Maxwell's equations by manipulations of the $\Lambda_{3,k}(R)$. The simulations of the preceding subsection show that for $R \le 3$ one has $\Lambda_{3,0}(R) > \Lambda_{3,1}(R)$. To show that the eigenvalue corresponding to $\ell(\xi)$ is the optimum it suffices to verify (7.2). To do so one needs to verify the positivity of
$$2^{-7/2} \pi^{-5/2} \Big[\frac{2}{3}\, \frac{\Lambda_{3,0}(R)}{|B_R(0)|} - \frac{2}{3}\, \frac{\Lambda_{3,2}(R)}{|B_R(0)|} - \frac{\Lambda_{3,1}(R)}{|B_R(0)|}\Big].$$
This is a linear combination of quantities computed in the preceding subsection. Its graph is plotted on the left in Fig. 4. The graph crosses from positive to negative near $R = 2.5$. The criterion is satisfied for all $R$ to the left of this crossing.
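The sign change of the criterion and the half-maximum radius of the scalar density can be explored without plotting. The following sketch is hedged: it assumes Lommel's integral $\int_0^R r J_\nu(r)^2\,dr = \tfrac{R^2}{2}[J_\nu(R)^2 - J_{\nu-1}(R) J_{\nu+1}(R)]$ and the relation $J_{k+1/2}(r) = \sqrt{2r/\pi}\, j_k(r)$ with spherical Bessel functions $j_k$, and it drops the common positive constants, which do not affect signs or ratios. All function names and bracketing intervals are illustrative.

```python
import math

def sph_bessel(R):
    """Spherical Bessel values j_0, j_1, j_2, j_3 at R > 0 (upward recurrence)."""
    j0 = math.sin(R) / R
    j1 = math.sin(R) / R**2 - math.cos(R) / R
    j2 = 3.0 / R * j1 - j0
    j3 = 5.0 / R * j2 - j1
    return j0, j1, j2, j3

def lam(k, R):
    """int_0^R r J_{k+1/2}(r)^2 dr for k = 0, 1, 2 via Lommel's integral."""
    j = sph_bessel(R)
    below = math.cos(R) / R if k == 0 else j[k - 1]  # j_{-1}(R) = cos(R) / R
    return R**3 / math.pi * (j[k] ** 2 - below * j[k + 1])

def criterion(R):
    """(2/3)(Lambda_0 - Lambda_2) - Lambda_1 for d = 3, up to a positive factor."""
    return 2.0 / 3.0 * (lam(0, R) - lam(2, R)) - lam(1, R)

def bisect_root(f, a, b, tol=1e-10):
    """Bisection for a sign change of f on [a, b]."""
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if (fa > 0) == (fm > 0):
            a, fa = mid, fm
        else:
            b = mid
    return 0.5 * (a + b)

def criterion_sign_change():
    """Radius in [2, 3] where the criterion passes from positive to negative."""
    return bisect_root(criterion, 2.0, 3.0)

def scalar_half_max_radius():
    """Radius where lam(0, R) / R^3 falls to half of its R -> 0 limit 2 / (3 pi)."""
    half = 1.0 / (3.0 * math.pi)
    return bisect_root(lambda R: lam(0, R) / R**3 - half, 1.0, 2.5)
```

Dividing by $R^3$ instead of $|B_R(0)|$ changes the density only by a constant factor, so the half-maximum radius is unaffected by that simplification.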
For $R < 2.5$, the energy density for the optimizing monochromatic electromagnetic fields associated with $\ell(\xi)$ is equal to
$$\frac{2}{3}\, \frac{\Lambda_{3,0}(R)}{|B_R(0)|} - \frac{2}{3}\, \frac{\Lambda_{3,2}(R)}{|B_R(0)|}.$$
Because of the factor 2/3 it is smaller than the density in the scalar case by that factor. The subtraction in the formula shows that the density drops off more rapidly in the electromagnetic case than in the scalar case. The graph of $2^{-7/2} \pi^{-5/2}$ times this quantity is plotted in Fig. 5. As in the scalar case the maximal energy density occurs on balls with radius near zero. The intensity drops to one half of this value to the left of $R = 1.9$. This is close to the corresponding value for scalar waves, about one third of a wavelength.

Fig. 4.

Fig. 5.
Acknowledgments

I thank G. Mourou for proposing this problem. I early conjectured that the constants and $\ell(\xi)$ were the extremizers in the scalar and electromagnetic cases, respectively. J. Szeftel, G. Allaire, C. Sogge, P. Gérard, J. Schotland and E. Wolf provided both encouragement and help on the path to the results presented here. The meetings were in Paris, Pisa, and Lansing. In Europe, I was a guest at the École Normale Supérieure, Université de Paris Nord and Università di Pisa. Sogge and Gérard were guests of the Centro De Giorgi. Schotland and I were both guests of the IMA and Michigan State University. I thank all these individuals and institutions as well as the support of the NSF under grant NSF DMS-0807600.

References

[1] L. Hörmander, The Analysis of Linear Partial Differential Operators I, II (Springer-Verlag, Berlin, 1983).
[2] E. Stein and G. Weiss, Fourier Analysis on Euclidean Space (Princeton University Press, Princeton, 1971).
September 20, J070-S0129055X1100445X
2011 11:25 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 8 (2011) 865–882 © World Scientific Publishing Company DOI: 10.1142/S0129055X1100445X
ALL “STATIC” SPHERICALLY SYMMETRIC PERFECT FLUID SOLUTIONS OF EINSTEIN’S EQUATIONS WITH CONSTANT EQUATION OF STATE PARAMETER AND FINITE POLYNOMIAL “MASS FUNCTION”
İBRAHİM SEMİZ
Physics Department, Boğaziçi University, Bebek, İstanbul, Turkey
[email protected] Received 11 February 2011 Revised 18 August 2011 We look for “static” spherically symmetric solutions of Einstein’s Equations for perfect fluid source with equation of state p = wρ, for constant w. We consider all four cases compatible with the standard ansatz for the line element, discussed in previous work. For each case, we derive the equation obeyed by the mass function or its analogs. For these equations, we find all finite-polynomial solutions, including possible negative powers. For the standard case, we find no significantly new solutions, but show that one solution is a static phantom solution, another a black hole-like solution. For the dynamic and/or tachyonic cases we find, among others, dynamic and static tachyonic solutions, a Kantowski–Sachs (KS) class phantom solution, another KS-class solution for dark energy, and a second black hole-like solution. The black hole-like solutions feature segregated normal and tachyonic matter, consistent with the assertion of previous work. In the first black hole-like solution, tachyonic matter is inside the horizon, in the second, outside. The static phantom solution, a limit of an old one, is surprising at first, since phantom energy is usually associated with super-exponential expansion. The KS-phantom solution stands out since its “mass function” is a ninth order polynomial. Keywords: Exact solutions; spherical symmetry; perfect fluid. Mathematics Subject Classification 2010: 83C20
1. Introduction and Motivation

Exact solutions of Einstein's Field Equations
$$G_{\mu\nu} = \kappa T_{\mu\nu} \tag{1.1}$$
are, of course, of interest for various purposes. (Here, $G_{\mu\nu}$ is the Einstein tensor, $T_{\mu\nu}$ the stress-energy-momentum tensor and $\kappa$ the coupling constant.) Since the equations are very complicated, to find solutions one often makes simplifying assumptions about the left-hand side and/or the right-hand side. Popular simplifying assumptions about the left-hand side include staticity and spherical symmetry. As is well known, the use of both assumptions together leads to the ansatz [1, Sec. 23.2]
$$ds^2 = -B(r)\,dt^2 + A(r)\,dr^2 + r^2\,d\Omega^2 \tag{1.2}$$
for the metric. Most often used simplifying assumptions about the right-hand side of (1.1) are that $T_{\mu\nu}$ represents vacuum (i.e. vanishes) or an electromagnetic field or a perfect fluid. For example, the vacuum assumption, together with the ansatz (1.2), gives uniquely the Schwarzschild metric, the simplest and best-known black hole solution. The perfect fluid form of $T_{\mu\nu}$ is
$$T_{\mu\nu} = (\rho + p)\, u_\mu u_\nu + p\, g_{\mu\nu} \tag{1.3}$$
where $\rho$ and $p$ are the energy density and pressure, respectively, as measured by an observer moving with the fluid, and $u^\mu$ is its four-velocity. The use of this $T_{\mu\nu}$ together with ansatz (1.2) describes the interiors of static spherically symmetric stars, for example. But the description (1.3) is not complete: $\rho$ and $p$ should also be specified as functions of particle number density, temperature, etc. One further simplifying assumption, justified under most circumstances, is that there is a relation, called an equation of state, $f(p, \rho) = 0$ between $p$ and $\rho$. In cosmology, one usually assumes that the equation of state is a proportionality,
$$p = w\rho, \tag{1.4}$$
with e.g., $w = 0$ describing the matter-dominated (or "pressureless dust") case, $w = 1/3$ the radiation-dominated case, $w < -1/3$ dark energy, and $w < -1$ phantom energy. The latter two concepts have been introduced into cosmology in the last decade [2, 3], after the discovery of the acceleration of the expansion of the universe [4, 5]. Now that a good case exists that the universe might be dominated by dark energy, even phantom energy, one should look for exact solutions with these sources. In particular, static spherically symmetric solutions would be the easiest to find and might be relevant in the contexts of black holes or static stars. These solutions can be found starting from the ansatz (1.2), which for a "static" perfect fluid source (i.e. $u^\mu = u^0 \delta^\mu_0$) leads to the well-known Oppenheimer–Volkoff (OV) equation [6]
$$p' = -\frac{(\kappa p r^3 + F)}{2r(r - F)}\, (\rho + p) \tag{1.5}$$
where
$$F(r) = \kappa \int \rho r^2\,dr \tag{1.6}$$
is written as $F$ for brevity, and prime denotes $r$-derivative. $F(r)$ can be recognized as $\kappa/4\pi$ times the "mass function" defined in the literature. Into the OV equation (1.5) one must put $p$ in terms of $\rho$ via the equation of state, then $\rho$ in terms of $F$, via (1.6), eventually getting a differential equation for $F$. After solving for $F$, the metric functions can be found via
$$A(r) = \frac{r}{r - F(r)}, \tag{1.7}$$
$$\frac{B'(r)}{B(r)} = \frac{\kappa p r^2 + 1}{r - F(r)} - \frac{1}{r}. \tag{1.8}$$
The solutions can be interpreted as static only for positive $A(r)$ and $B(r)$, however. In general, the ansatz (1.2) admits four classes of solutions, called NS (the standard case), TD, ND (corresponding to the Kantowski–Sachs [8, Sec. 15.6.5], [7] case) and TS in [9]. The ND and TD solutions are not static, hence the quotes on "static" in the title and abstract. For each class, one gets a different OV-like equation. The OV equation is valid in case NS. For equation of state (1.4), it becomes
$$(w + 1) F' (w r F' + F) + 2w (r F'' - 2F')(r - F) = 0 \tag{1.9}$$
where we put no constraint on $w$ other than that it is a constant. This is a nonlinear equation whose general solution is difficult to find, unless $w = -1$ or $w = 0$, so that one half or the other of (1.9) vanishes. For other $w$, one can attempt a series solution
$$F(r) = \sum_{n=0}^{\infty} a_n r^n \tag{1.10}$$
but the recursion expression one gets for $a_n$ involves all of $a_0, \dots, a_{n-1}$ and it seems not possible to even show that (1.10) converges, let alone find a closed expression for $a_n$. We can, however, find all of the finite-polynomial solutions of (1.9). This we do in the next section. In fact, we find all finite Laurent polynomials, i.e. we consider also negative powers^a of $r$, but find none in case NS. Four of the found solutions are valid for particular values of $w$, and two for general $w$. While none of the solutions is totally original, the procedure shows that there are no other finite-polynomial solutions; and in Sec. 3 we discuss properties of the spacetimes. In Sec. 4, we find all finite-polynomial solutions for $F(r)$ in the TD, ND and TS cases; and in Sec. 5, we discuss their properties. In the short Sec. 6, we also ask if we can find any solutions with finite-polynomial $A(r)$. Finally we conclude by pointing out the more interesting, and possibly original solutions; and emphasizing the main point of this work, that there are no other solutions under our restrictions.

^a In the rest of this work, we will use "power" also when we really mean "order of the power". It should be clear from the context which meaning is intended.

2. All Finite-Polynomial Solutions for the Mass Function from the Standard OV Equation

In case NS, any power of $r$ less than 3 in $F(r)$ means a diverging density at the origin; in particular, a constant term corresponds to a point mass there, while negative powers mean a diverging mass function, and therefore seem unnatural. On the other hand, the meaning of $F(r)$ is different in the TD, ND(KS) and TS cases, therefore negative powers could be more acceptable. Before starting the general case, however, let us first consider the special cases mentioned above, after Eq. (1.9); especially since they also give polynomial solutions. They are

Solution 1: $w = -1$, $F(r) = Ar^3 + C$, (2.1)

Solution 2: $w = -1$, $F(r) = r$ (2.2)

and

Solution 3: $w = 0$, $F(r) = C$, (2.3)
where in Solutions 1 and 3, $C$ is a constant.

Now, we will consider general (but constant) $w$. In some cases to be discussed below, the number of terms in the polynomial will be known. Then, substitution into Eq. (1.9) gives a certain number of terms, and the analysis reduces to straightforward, if possibly tedious, algebra, which can be expedited by the use of symbolic computation software. We will call this the "brute force" approach. Otherwise, we will call the highest and lowest power of $r$ in $F(r)$, $m$ and $\tilde{m}$. The second-highest, third-highest, second-lowest and third-lowest powers of $r$ in $F(r)$ we will call $n$, $p$, $\tilde{n}$ and $\tilde{p}$, respectively, when they exist; and $A$, $B$, $C$, $\tilde{A}$, $\tilde{B}$, $\tilde{C}$ will be the respective coefficients. Therefore
$$F(r) = A r^m + B r^n + C r^p + \cdots + \tilde{C} r^{\tilde{p}} + \tilde{B} r^{\tilde{n}} + \tilde{A} r^{\tilde{m}} \tag{2.4}$$
if $F(r)$ has more than five terms. We will substitute the polynomial into the left-hand side of (1.9) and set coefficients of all powers of $r$ equal to zero. We start the general analysis by considering the coefficient of the highest power of $r$ in Eq. (1.9), for $m > 1$:
$$(w + 1)\, m\, (wm + 1) A^2 - 2wm(m - 3) A^2 = 0. \tag{2.5}$$
Therefore, in this case, $A$ is arbitrary and
$$m = \frac{7w + 1}{w(1 - w)} \quad [\text{for } m > 1]. \tag{2.6}$$
If (2.6) had given an integer, we would have found the order of the polynomial for arbitrary $w$. Since it does not, we conclude that in this case finite polynomial solutions exist for certain values of $w$ only. (None at all for $w = 0$ or $w = 1$, since Eq. (2.5) cannot vanish for these values.) Of course, one can also solve for $w$ in terms of $m$:
$$w = \frac{m - 7 \pm \sqrt{(m - 7)^2 - 4m}}{2m} \quad [\text{for } m > 1] \tag{2.7}$$
or one can write
$$w^2 = \frac{(m - 7)w - 1}{m} \quad [\text{for } m > 1]. \tag{2.8}$$
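Relation (2.7) lends itself to a quick enumeration. The sketch below is an illustration, not the author's procedure: for integer $m > 1$ it tests whether the discriminant $(m-7)^2 - 4m$ is a perfect square, which for integer $m$ is exactly the condition for $w$ to be rational. The function names and the scan range are assumptions.

```python
from fractions import Fraction
from math import isqrt

def rational_w(m):
    """Rational roots w of m w^2 - (m - 7) w + 1 = 0 for an integer m, per (2.7)."""
    disc = (m - 7) ** 2 - 4 * m
    if disc < 0:
        return []  # complex roots
    s = isqrt(disc)
    if s * s != disc:
        return []  # irrational roots
    return sorted({Fraction(m - 7 + s, 2 * m), Fraction(m - 7 - s, 2 * m)})

def scan(m_max=10_000):
    """Map each integer 2 <= m <= m_max that admits rational w to those values."""
    return {m: rational_w(m) for m in range(2, m_max + 1) if rational_w(m)}
```

Writing the discriminant as $(m-9)^2 - 32$ shows why only finitely many $m$ can appear: a difference of two squares equal to 32 has only a few integer factorizations, so the scan range is not a real restriction.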
Similarly, for $\tilde{m} < 0$, we can consider the lowest power of $r$ in Eq. (1.9) and find
$$\tilde{m} = \frac{7w + 1}{w(1 - w)} \quad [\text{for } \tilde{m} < 0]. \tag{2.9}$$
A comparison of Eqs. (2.6) and (2.9) shows that $m > 1$ and $\tilde{m} < 0$ are not compatible, so that (considering also that $m \ge \tilde{m}$) we have three possibilities at the top level: (1) $m > 1$ and $\tilde{m} \ge 0$, (2) $\tilde{m} < 0$ and $m \le 1$, (3) $m, \tilde{m} \in \{0, 1\}$. The breakdown of the search for all possible finite-polynomial solutions into a complete set of cases is also shown in tabular form in Table 1.

Case 1. $m > 1 \Rightarrow w \neq 0$, $w \neq 1$, $\tilde{m} \ge 0$. We ask if $n$ exists and if so, in what range its value is.

Case 1.1. $n$ does not exist $\Rightarrow m = \tilde{m} > 1$. This subcase is amenable to the "brute force" approach, since $F(r)$ has a single term. We get $m = 3$ (since $w \neq 0$ and $m \neq 0$), which in turn gives $w = -1$ or $w = -\frac{1}{3}$. The first solution is the $C = 0$ special case of Solution 1, whereas the second one is

Solution 4: $w = -\frac{1}{3}$, $F(r) = Ar^3$. (2.10)
Table 1. Breakdown of all finite polynomial solutions of Eq. (1.9) into cases and subcases.

| Case | Subcase | Condition | Result |
|---|---|---|---|
| Case 0: Simple cases giving linear equations | Case 0.1 | $w = -1$ | $F_1(r) = Ar^3 + C$; $F_2(r) = r$ |
| | Case 0.2 | $w = 0$ | $F_3(r) = C$ |
| Case 1: $m > 1 \Rightarrow m = \frac{7w+1}{w(1-w)}$, $\tilde{m} \ge 0$ | Case 1.1: no $n$ | $w = -1$ | $F(r) = Ar^3$: covered by $F_1(r)$ |
| | | $w = -\frac{1}{3}$ | $F_4(r) = Ar^3$ |
| | Case 1.2: $n > 1 \Rightarrow w$ rational | $m = 18$, $w = 1/2$ | fails (noninteger $n$) |
| | | $m = 18$, $w = 1/9$ | fails (“brute force”) |
| | | $m = 15$, $w = 1/3$ | fails (“brute force”) |
| | | $m = 15$, $w = 1/5$ | fails (“brute force”) |
| | | $m = 3$, $w = -1$ | fails (improper $n$) |
| | | $m = 3$, $w = -1/3$ | fails (improper $n$) |
| | Case 1.3: $n = 0$ or $1 \Rightarrow \tilde{A} = 0$ | $B = 0$ | $m = 3 \to F(r) = Ar^3 + C$: same as $F_1(r)$ |
| | | $B \neq 0$, $w = -3 \pm 2\sqrt{2}$ | fails (improper $m$) |
| | | $B \neq 0$, $w = -1/3$ | $F_5(r) = Ar^3 + \frac{3}{2}r$ |
| Case 2: $m, \tilde{m} \in \{0, 1\}$ | | $A = 0$ | $F_6(r) = C$, $w$ arbitrary; covers $F_3(r)$ |
| | | $B = 0$ | $F_7(r) = \frac{4w}{w^2+6w+1}\,r$, $w$ arbitrary, except $-3 \pm 2\sqrt{2}$ |
| | | $w = -1/5$ | $F_8(r) = 5r + B$ |
| Case 3: $\tilde{m} < 0 \Rightarrow m \le 1$, $\tilde{m} = \frac{7w+1}{w(1-w)}$ | Case 3.1: no $\tilde{n}$ | | $\tilde{m} = 3 \to$ fails (“brute force”) |
| | Case 3.2: $\tilde{n} < 1$ | | fails (rational $w \to$ positive $\tilde{m}$) |
| | Case 3.3: $\tilde{n} = 1$ | | fails (“brute force” $\to$ positive $\tilde{m}$) |

Case 1.2. $n$ exists, $n > 1$. In this case, the second-highest power of $r$ in Eq. (1.9) is $m + n - 1$, with coefficient
$$(w+1)[m(wn+1) + n(wm+1)]AB - 2w[m(m-3) + n(n-3)]AB = 0, \tag{2.11}$$
giving arbitrary $B$ and, after elimination of $w^2$ by using (2.8), the equation
$$(m-n)[(2m - 2n - 7)w - 1] = 0 \tag{2.12}$$
which not only gives $n$ in terms of $m$ and $w$, but also means that $w$ is rational. A careful inspection of (2.7) shows that there are only three values of $m$ giving rational $w$: 18, 15 and 3, with two attendant $w$ values each. The “brute force” approach is applicable now and shows that all the $m = 18$ and $m = 15$ cases fail. (The $m = 3$ cases turn out to belong to Case 1.3.)

Case 1.3. $n$ exists, $n = 1$ or $n = 0$. We put $F(r) = Ar^{m} + Br + \tilde{A}$ into Eq. (1.9), and we use Eq. (2.6) in the coefficients of powers of $r$. Then, the coefficient of $r^{m-1}$ gives
$$\tilde{A}Am[1 + (7 - 2m)w] = 0. \tag{2.13}$$
$A$ and $m$ cannot be zero here. $w = 1/(2m-7)$, combined with Eq. (2.6), gives $m = 0$ or $m = 7/2$, both of which are unacceptable. Therefore, we must consider $\tilde{A} = 0$. This also makes the constant term in Eq. (1.9), $B\tilde{A}(1+5w)$, vanish. Next we consider the coefficient of $r$,
$$B(-4w + B(w^{2} + 6w + 1)) = 0, \tag{2.14}$$
which leads to two subcases:

Case 1.3.1. $B = 0$. The coefficient of $r^{m}$ reduces to
$$2A(m-3)mw = 0 \tag{2.15}$$
which gives $m = 3$, leading to Solution 1.

Case 1.3.2. $B = 4w/(1 + 6w + w^{2})$. Also using Eq. (2.6), the coefficient of $r^{m}$ this time becomes
$$\frac{2A(1+3w)(1+6w+w^{2})}{(w-1)^{2}w} = 0 \tag{2.16}$$
whose $w = -3 \pm 2\sqrt{2}$ solutions give $m = 1$ and are therefore not acceptable for Case 1, whereas the $w = -1/3$ solution gives
$$\text{Solution 5:} \quad w = -\frac{1}{3}, \qquad F(r) = Ar^{3} + \frac{3}{2}r, \tag{2.17}$$
which does not include Solution 4 as a special case. This finishes Case 1, $m > 1$.

Case 2. $m, \tilde{m} \in \{0, 1\}$. In this case, $F(r) = Ar + \tilde{A}$. Then, the “brute force” approach gives
$$\text{Solution 6:} \quad w \text{ arbitrary}, \qquad F(r) = C = \text{constant}, \tag{2.18}$$
a solution that includes Solution 3;
$$\text{Solution 7:} \quad w \text{ arbitrary (except } -3 \pm 2\sqrt{2}), \qquad F(r) = \frac{4w}{w^{2} + 6w + 1}\,r \tag{2.19}$$
(for $w = -3 \pm 2\sqrt{2}$, the left-hand side of Eq. (1.9) cannot vanish at all with $F(r) = Ar$); and
$$\text{Solution 8:} \quad w = -\frac{1}{5}, \qquad F(r) = 5r + B. \tag{2.20}$$
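The rational arithmetic behind Cases 1.2, 1.3 and 2 can be spot-checked in a few lines. This is only a sketch under the formulas quoted above: the function names are hypothetical, and Eq. (1.9) itself is not re-derived. It evaluates $n$ from the second factor of (2.12), the nonzero root $B$ of (2.14), and the numerator of (2.16).

```python
from fractions import Fraction

def m_from_w(w):
    """Eq. (2.6): m = (7w + 1) / (w (1 - w))."""
    return (7 * w + 1) / (w * (1 - w))

def n_from_m_w(m, w):
    """Second factor of Eq. (2.12): (2m - 2n - 7) w - 1 = 0, so n = m - (7 + 1/w) / 2."""
    return m - (7 + 1 / w) / 2

def linear_coefficient(w):
    """Nonzero root B of Eq. (2.14): B = 4w / (w^2 + 6w + 1)."""
    return 4 * w / (w * w + 6 * w + 1)

def cubic_condition(w):
    """Numerator of Eq. (2.16) without the factor 2A: (1 + 3w)(1 + 6w + w^2)."""
    return (1 + 3 * w) * (1 + 6 * w + w * w)

# Case 1.2: the n values behind the "noninteger n" and "improper n" entries of Table 1.
assert n_from_m_w(18, Fraction(1, 2)) == Fraction(27, 2)
assert n_from_m_w(3, Fraction(-1)) == 0
assert n_from_m_w(3, Fraction(-1, 3)) == 1

# Case 1.3.2: w = -1/3 satisfies (2.16), gives m = 3 and B = 3/2 (Solution 5).
w5 = Fraction(-1, 3)
assert cubic_condition(w5) == 0
assert m_from_w(w5) == 3
assert linear_coefficient(w5) == Fraction(3, 2)

# Case 2: at w = -1/5 the factor (1 + 5w) vanishes and the coefficient of r is 5 (Solution 8).
w8 = Fraction(-1, 5)
assert 1 + 5 * w8 == 0
assert linear_coefficient(w8) == 5
```

The last two assertions are consistent with a free constant term in Solution 8, since the constant term $B\tilde{A}(1+5w)$ then drops out for any value of the constant.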
Case 3. $\tilde{m} < 0 \Rightarrow w \neq 0$, $w \neq 1$, $m \le 1$. Similar to Case 1.

Case 3.1. $\tilde{n}$ does not exist $\Rightarrow m = \tilde{m}$. Since the “brute force” approach gives $m = 3$, there is no solution in this case.

Case 3.2. $\tilde{n}$ exists, $\tilde{n} < 1$. The coefficient of $r^{\tilde{m}+\tilde{n}-1}$ is given by the same expression as Eq. (2.11) with $m \to \tilde{m}$, $n \to \tilde{n}$, $A \to \tilde{A}$ and $B \to \tilde{B}$. This again makes $w$ rational, but now $\tilde{m}$ must be 18 or 15 or 3, unacceptable because they are positive.

Case 3.3. $\tilde{n}$ exists, $\tilde{n} = 1 = m$. The “brute force” approach, with $F(r) = Ar + \tilde{A}r^{\tilde{m}}$, gives the unacceptable (positive) $\tilde{m}$ values 1 and 3.

This completes all finite polynomial solutions of Eq. (1.9). Since Solution 3 is a special case of Solution 6, we will not consider it separately in the following section.

3. Discussion of the Solutions Found from the Standard (NS) OV Equation

To finalize the solutions, we calculate the metric functions $A(r)$ and $B(r)$ by using (1.7), (1.8), (1.4) and (1.6). The calculation of $B(r)$ involves an arbitrary multiplicative constant at the last stage, the change of which is usually interpreted as a rescaling of $t$ and is therefore physically irrelevant. But such rescaling cannot change the sign of that constant, so we consider the two choices of sign as two separate solutions, unless the requirement of correct signature forces a choice upon us. This happens for Solutions 1, 5 and 6, whereas for Solutions 4, 7 and 8 we have to consider both signs. The results are shown in Table 2, where the well-known solutions are indicated in italics.

When the metric functions are negative, the spacetime cannot be supported by a normal perfect fluid; the source fluid must be tachyonic. In other words, such a spacetime is of type TD in the terminology of [9]. In that case, the OV equation, (1.5), is not valid, but still, $A(r)$–$B(r)$ pairs satisfy the same equation of pressure isotropy for cases NS and TD. Therefore negative metric functions found from the NS equations represent a valid TD solution, but not with the equation of state that one has started with. If the NS equation of state is (1.4), the corresponding TD equation of state becomes $p = -\frac{w}{1+2w}\rho$.

Solution 1 is the well-known Köttler (aka Schwarzschild–de Sitter) solution, the de Sitter part sometimes being called anti-de Sitter if $A$ is negative.
Table 2. All finite-polynomial solutions of Eq. (1.9) for the mass function in the standard (NS) OV case, together with the corresponding metric functions.

| No. | $w$ | $F(r)$ | $A(r) = g_{rr}$ | $B(r) = -g_{tt}$ | Comments |
|---|---|---|---|---|---|
| 1 | $-1$ | $Ar^3 + C$ | $\dfrac{1}{1 - \frac{C}{r} - Ar^2}$ | $1 - \dfrac{C}{r} - Ar^2$ | *Köttler (SdS)* |
| 2 | $-1$ | $r$ | $\infty$ | ? | – |
| 4 | $-\frac{1}{3}$ | $Ar^3$ | $\dfrac{1}{1 - Ar^2}$ | $\pm 1$ | 4a+ ($A > 0$): *ESU*; 4a− ($A < 0$): Open, static; 4b: Type TD |
| 5 | $-\frac{1}{3}$ | $Ar^3 + \frac{3}{2}r$ | $-\dfrac{1}{\frac{1}{2} + Ar^2}$ | $-\dfrac{\frac{1}{2} + Ar^2}{r^2}$ | 5+ ($A > 0$): Type TD; 5− ($A < 0$): BH-like |
| 6 | arbitrary | $A$ | $\dfrac{1}{1 - \frac{A}{r}}$ | $1 - \dfrac{A}{r}$ | *Schwarzschild* |
| 7 | arbitrary, except $-1$, $-3 \pm 2\sqrt{2}$ | $\dfrac{4wr}{w^2 + 6w + 1}$ | $\dfrac{w^2 + 6w + 1}{(w+1)^2}$ | $\pm\left(\dfrac{r}{r_0}\right)^{\frac{4w}{w+1}}$ | 7a: Type NS, incl. static phantom; 7b: Type TD |
| 8 | $-\frac{1}{5}$ | $5r + B$ | $\dfrac{1}{-4 + \frac{C}{r}}$ | $\pm\dfrac{r_0}{r}$ | 8a: Type NS; 8b: Type TD |

Note:
a Solution 3 does not appear because it is a special case of Solution 6.
b The well-known solutions are indicated in italics.
c Although we started with the NS OV equation, some of the solutions belong to class TD, as defined in [9]. In TD solutions, $w$ should be replaced by $-w/(1+2w)$.
d In Solutions 4, 7 and 8, the upper signs in $B(r)$ apply to solutions a and lower signs to solutions b.
e There is also a Solution 9, coming from the TD OV equation, of type TD or type NS (Secs. 4.1 and 5.1).
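The $g_{rr}$ column can be cross-checked against the $F(r)$ column with exact arithmetic. The sketch below assumes that Eq. (1.7), which is not reproduced in this section, has the form $A(r) = r/(r - F(r))$, by analogy with Eq. (4.3); the function name `grr` and all sample values are hypothetical.

```python
from fractions import Fraction

def grr(F, r):
    """A(r) = r / (r - F(r)); assumed form of Eq. (1.7), by analogy with Eq. (4.3)."""
    return r / (r - F(r))

r = Fraction(7, 5)      # arbitrary sample radius
A = Fraction(-2, 3)     # arbitrary sample coefficient
C = Fraction(11, 2)     # arbitrary sample constant

# Solution 5: F = A r^3 + (3/2) r  gives  A(r) = -1 / (1/2 + A r^2)
assert grr(lambda x: A * x**3 + Fraction(3, 2) * x, r) == -1 / (Fraction(1, 2) + A * r**2)

# Solution 7: F = 4w r / (w^2 + 6w + 1)  gives  A(r) = (w^2 + 6w + 1) / (w + 1)^2
w = Fraction(1, 3)
k = 4 * w / (w * w + 6 * w + 1)
assert grr(lambda x: k * x, r) == (w * w + 6 * w + 1) / (w + 1) ** 2

# Solution 8: F = 5r + B with B = -C  gives  A(r) = 1 / (-4 + C/r)
assert grr(lambda x: 5 * x - C, r) == 1 / (-4 + C / r)
```

Under that assumption, the constant $g_{rr}$ of Solution 7 changes sign exactly where $w^2 + 6w + 1$ does, which is the origin of the excluded values $-3 \pm 2\sqrt{2}$.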
Although Solution 2 satisfies Eq. (1.9), it does not correspond to a spacetime: the function $A(r)$ is singular, and $B(r)$ is indeterminate.^b

Solution 4a+ is also well known: It is the Einstein static universe, with intimate historical connection to the cosmological constant $\Lambda$, equivalent to $w = -1$. But this universe also contains matter ($w = 0$), whose attraction is precisely balanced by the repulsion of $\Lambda$. So the matter density is proportional to $\Lambda$ and the net effect is equivalent to a single fluid with $w = -\frac{1}{3}$. Of course, “in the universe” $Ar^2 < 1$, so $A(r)$ is positive and the signature correct. Solution 4a− represents an open static universe, albeit with negative energy density, and no coordinate restriction.

The third well-known solution in the table is Solution 6, the Schwarzschild solution. It may at first seem surprising that there is no restriction on $w$. But since $\rho$ vanishes, the value of $w$ does not matter. In other words, it corresponds to a situation where all the fluid, whatever its equation of state parameter is, has already collapsed to the origin. Also, here we do not apply the usual restriction that $A$ must be positive. If $A$ is negative, the spacetime will give a naked singularity.

Now we turn to the discussion of less well-known solutions in Table 2.

Solution 4b: $A$ must be positive for correct signature in this solution. It is a dynamic spacetime, $r$ being timelike (it is solution TD1 of [9]), and describes a spacetime that first contracts, then expands in angular directions, while distances in the orthogonal spacelike direction stay fixed.^c Even though we found this solution for $w = -\frac{1}{3}$, the equation of state is actually $p = \rho$.

Solution 5+: This solution is also of type TD, contracting in the angular directions and expanding in the orthogonal spacelike direction^d as $r \to 0$; the density also diverges like $\frac{1}{r^2}$. The solution can be identified with the $a = 1$, $b = -1$, $m = 0$, $\frac{1}{R^2} = A$ (and the trivial $B = 1$ or const $= 1$) choice of Tolman VIII [13].

Solution 5−: Both metric functions switch sign at $r = r_H = \sqrt{-\frac{1}{2A}}$, so that the spacetime is static (NS) for $r > r_H$ and dynamic (TD) for $r < r_H$. As far as test particle motion is concerned, this spacetime would be that of a black hole; but it must be supported by normal matter in the NS region, and tachyonic matter (with $p = \rho$) in the TD region. As unreasonable as this may seem, it is the only possible perfect fluid interpretation [9]. Again, the density diverges like $\frac{1}{r^2}$ near the origin (which is in the TD region).

Solution 7a: This solution has correct signature only for $A(r) > 0$, which means that the solution is valid except for $-3 - 2\sqrt{2} < w < -3 + 2\sqrt{2}$ (and it is of type NS). The cases $w < -3 - 2\sqrt{2}$, for example, $w = -6$, represent static (ultra)phantom solutions. The $w = \frac{1}{3}$ case is well known [1, Prob. 23.10]; the $w \to \infty$ limit, meaning zero density but nonzero pressure, is the metric called S1 in [10]; other valid cases with integer power of $r$ in $B(r)$ are $w = 1$ and $w = 3$. The density is proportional to $\frac{1}{r^2}$, but this is a mild singularity because the mass function goes to zero as $r \to 0$, i.e. there is no mass point at the origin. Of course, there is no event horizon, so the singularity is naked.

For test particles, the sign of attraction to the origin is the same as that of $w(1+w)$, that is, the origin is attractive for $w < -3 - 2\sqrt{2}$ and for $w > 0$, repulsive for $-3 + 2\sqrt{2} < w < 0$. On the other hand, the pressure is positive for all $w$ values, and since $p \propto \rho \propto \frac{1}{r^2}$, the pressure gradient is always towards the origin.

^b In fact, if one expresses $F(r)$ as the result of some limit process, e.g., $F(r) = (1+\epsilon)r$ or $F(r) = r + \epsilon$, and takes the limit $\epsilon \to 0$, the function $B(r)$ and the kind of divergence of $A(r)$ depend on the process used.
^c The KS-like form of the metric is $ds^2 = -d\tau^2 + d\rho^2 + \frac{1}{A}\cosh^2(\sqrt{A}\,\tau)\,d\Omega^2$.
^d The KS-like form of the metric is $ds^2 = -d\tau^2 + A\coth^2(\sqrt{A}\,\tau)\,d\rho^2 + \frac{1}{2A}\sinh^2(\sqrt{A}\,\tau)\,d\Omega^2$.
Naively thinking in terms of $\rho$, it would seem that both forces would be pushing a fluid element towards the origin in the ultraphantom case ($\rho$ is negative), but the “density of inertial mass” (e.g., [11]), $(\rho + p)$ [here proportional to $w(1+w)$], is positive so that static equilibrium is possible. These static ultraphantom solutions constitute a counterexample to the impression in the literature (e.g., see [12]) that everywhere-phantom static spherically symmetric solutions cannot exist. This solution can be identified with the $n = \frac{2w}{w+1}$, $R \to \infty$ and $B = r_0^{-\frac{2w}{w+1}}$ (or const $= r_0^{-\frac{4w}{w+1}}$) choice of Tolman V [13].

Solution 7b: This is a TD solution (a subcase^e of TD2 of [9]) valid for $-3 - 2\sqrt{2} < w < -3 + 2\sqrt{2}$, except $w = -1$. Assuming $r$ is future-directed, this spacetime expands in the angular directions, and either expands (for $w < -1$) or contracts (for $w > -1$) in the orthogonal spacelike direction.^f An infinite number of $w$-values, crowding $-1$, exist that give integer power of $r$ in $B(r)$. The density is again proportional to $\frac{1}{r^2}$; the solution can be identified with almost the same subcase of Tolman V [13] as Solution 7a, except^g const $= -r_0^{-\frac{4w}{w+1}}$.

Solution 8a: This solution is of type NS. $C$ must be positive and $r < \frac{C}{4}$. Interestingly, radially moving free particles oscillate between a minimum radius and $\frac{C}{4}$, which may be understood in terms of the repulsion of the negative mass point at the origin ($C = -B$ and $F(r)$ is the mass function) versus the attraction of the fluid, whose “enclosed active gravitational mass” (e.g., [11]) grows with $r$ (here, both $\rho$ and $\rho + 3p$ are positive). The origin is a naked singularity, and not only due to the negative point mass there: The scalar curvature is $\frac{8}{r^2}$, that is, it diverges without containing $C$. But, after all, the scalar curvature does not contain $M$ in the Schwarzschild case, either (in fact, it vanishes). $r = \frac{C}{4}$ is a type of boundary; it is a turning point for all radial timelike geodesics. This solution can be identified with the $n = -\frac{1}{2}$, $R \to -C$ and $B^2 = r_0$ choice of Tolman V [13].

Solution 8b: This TD solution (with $p = \rho/3$) can be identified with the $n = -\frac{1}{2}$, $R \to -C$ and const $= -r_0$ choice^g of Tolman V [13]. There is no coordinate restriction for negative $C$, but $r$ must be larger than $\frac{C}{4}$ for positive $C$. In the latter case, again $r = \frac{C}{4}$ is a turning point for timelike radial geodesics, but $r$ is timelike, so this spacetime first contracts in the angular directions while expanding in the orthogonal spacelike direction, then the evolution reverses.

^e Which subcase it is depends on the sign of $w + 1$.
^f Metric in KS-like form: $ds^2 = -d\tau^2 + \left(\frac{\tau^2}{|A| r_0^2}\right)^{\frac{2w}{w+1}} d\rho^2 + \frac{\tau^2}{|A|}\,d\Omega^2$, where $A = \frac{w^2 + 6w + 1}{(w+1)^2}$.
^g Tolman chooses const $= B^2$ and later literature reports this form (e.g., [14]).
On the other hand, for negative $C$, the spacetime expands in the angular directions while contracting in the orthogonal spacelike direction,^h assuming $r$ is future-directed.

4. All Finite-Polynomial Solutions for $F(r)$ from the OV-Like Equations in the TD, ND(KS) and TS Cases

4.1. The TD case

The TD OV equation [9] is
$$p' = \frac{(\kappa p r^{3} + F_{TD})(\rho + p)}{2r(r - F_{TD})} \tag{4.1}$$
where
$$F_{TD}(r) = -\kappa \int (\rho + 2p)\,r^{2}\,dr, \tag{4.2}$$
and the metric functions are found by
$$A(r) = \frac{r}{r - F_{TD}(r)}, \tag{4.3}$$
$$\frac{B'(r)}{B(r)} = \frac{\kappa p r^{2} + 1}{r - F_{TD}(r)} - \frac{1}{r}. \tag{4.4}$$
The substitution $\tilde{\rho} = -(\rho + 2p)$ brings the TD OV Eq. (4.1) into the same form as the regular one (1.5), in terms of $\tilde{\rho}$ and $p$. When expressed in terms of $F_{TD}(r)$, with equation of state (1.4), we get Eq. (1.9), but with the replacement $w \to -\frac{w}{1+2w}$. Since this is another constant equation of state parameter, we will not get any finite-polynomial solutions that are not already in Table 2, unless $1 + 2w = 0$. This should not be taken as an indication that the TD and TS solutions are trivial relabelings; for more complicated equations of state than Eq. (1.4), the solutions’ mathematical forms will be different. For $w = -1/2$, $F_{TD}(r)$ becomes a constant,
$$\text{Solution 9:} \quad w = -\frac{1}{2}, \qquad F_{TD}(r) = C. \tag{4.5}$$
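The replacement $w \to -w/(1+2w)$ follows directly from $\tilde{\rho} = -(\rho + 2p)$ and $p = w\rho$, and can be illustrated with exact fractions. This is a minimal sketch; the function name `td_parameter` and the sample density are assumptions made for the illustration.

```python
from fractions import Fraction

def td_parameter(w):
    """Replacement w -> -w / (1 + 2w); undefined at w = -1/2."""
    if 1 + 2 * w == 0:
        raise ZeroDivisionError("1 + 2w = 0: the replacement is singular (Solution 9)")
    return -w / (1 + 2 * w)

# Consistency with rho_tilde = -(rho + 2p) and p = w * rho, for a sample density.
rho = Fraction(3)
for w in (Fraction(-1, 3), Fraction(-1, 5), Fraction(2, 3)):
    p = w * rho
    rho_tilde = -(rho + 2 * p)
    assert p == td_parameter(w) * rho_tilde

assert td_parameter(Fraction(-1, 3)) == 1               # Solution 4b: p = rho
assert td_parameter(Fraction(-1, 5)) == Fraction(1, 3)  # Solution 8b: p = rho / 3

# Applying the replacement twice returns the original parameter.
assert td_parameter(td_parameter(Fraction(2, 3))) == Fraction(2, 3)
```

The two numerical values reproduce the equations of state quoted for Solutions 4b and 8b, and the guard clause marks the one value, $w = -1/2$, where the relabeling argument does not apply.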
^h The KS-like form of the metric is $ds^2 = -d\tau^2 + \frac{r_0}{r(\tau)}\,d\rho^2 + r^2(\tau)\,d\Omega^2$, where $\frac{dr}{d\tau} = \pm\sqrt{4 - \frac{C}{r}}$.

4.2. The ND case

The ND OV equation [9] is
$$\rho' = \frac{(3F_{ND} - 4r + \kappa\rho r^{3})(\rho + p)}{2r(r - F_{ND})} \tag{4.6}$$
where
$$F_{ND}(r) = -\kappa \int p\,r^{2}\,dr \tag{4.7}$$
and the metric functions can be found by
$$A(r) = \frac{r}{r - F_{ND}(r)}, \tag{4.8}$$
$$\frac{B'(r)}{B(r)} = \frac{1 - \kappa\rho r^{2}}{r - F_{ND}(r)} - \frac{1}{r}. \tag{4.9}$$
In this case, $F_{ND}(r)$ obeys
$$(1+w)F_{ND}'\,(3wF_{ND} - 4wr - rF_{ND}') + 2w(rF_{ND}'' - 2F_{ND}')(F_{ND} - r) = 0. \tag{4.10}$$
To find all finite-polynomial solutions of this equation (dropping label ND), we follow the same procedure as in Sec. 2, and show the results in Table 3.

Table 3. Breakdown of all finite polynomial solutions of Eq. (4.10) into cases and subcases.

| Case | Subcase | Sub-subcase | Result |
|---|---|---|---|
| Case 0: Simple cases giving linear equations | Case 0.1: $w = -1$ | | $F_{10}(r) = Ar^3 + C$; $F_{11}(r) = r$ |
| | Case 0.2: $w = 0$ | | $F_{12}(r) = C$ |
| Case 1: $m > 1 \Rightarrow (w-1)(3w+m) = 0$, $\tilde{m} \ge 0$ | Case 1.1: $w = 1$ | C. 1.1.1: No $n$ | fails (“brute force”) |
| | | C. 1.1.2: $n > 1$ | fails (coefficient of $r^{m+n-1}$) |
| | | C. 1.1.3: $n = 1$ or $0$ | fails (“brute force”) |
| | Case 1.2: $w = -m/3$ | C. 1.2.1: No $n$ | $w = -1$, $F(r) = Ar^3$ (covered by $F_{10}$) |
| | | C. 1.2.2: $n > 1 \to m = 2n + 3$ | No $p$ $\to$ fails (“brute force”); $p$ exists $\to$ fails (coefficient of $r^{m+p-1}$) |
| | | C. 1.2.3: $n = 1$ or $0$; C. 1.2.3.1: $m = 3 \to B = 0$ | $w = -1$, $F(r) = Ar^3 + C$: same as $F_{10}$ |
| | | C. 1.2.3: $n = 1$ or $0$; C. 1.2.3.2: $C = 0$ | (1) $m = 3 \to B = 0$: covered by $F_{10}$; (2) $w = -3$, $F_{13}(r) = Ar^9 + \frac{9r}{8}$ |
| Case 2: $m, \tilde{m} \in \{0, 1\}$ | | | $w$ arbitrary, $F_{14}(r) = C$; covers $F_{12}$ |
| | | | $F_{15}(r) = \frac{4w^2}{3w^2 - 2w - 1}\,r$, $w$ arbitrary, except $-\frac{1}{3}$ and $1$ |
| | | | $w = \frac{1}{3} \to F_{16}(r) = -\frac{1}{3}r + B$ |
| Case 3: $\tilde{m} < 0 \Rightarrow (w-1)(3w+\tilde{m}) = 0$, $m \le 1$ | Case 3.1: $w = 1$ | C. 3.1.1: No $\tilde{n}$ | $F_{17}(r) = \frac{C}{r}$, $w = 1$ |
| | | C. 3.1.2: $\tilde{n} < 1$ | fails (coefficient of $r^{\tilde{m}+\tilde{n}-1}$) |
| | | C. 3.1.3: $\tilde{n} = 1$ | fails (“brute force”) |
| | Case 3.2: $w = -\tilde{m}/3$ | C. 3.2.1: No $\tilde{n}$ | $\tilde{m} = 3 \to$ fails |
| | | C. 3.2.2: $\tilde{n} = 1$ | fails (“brute force”, $\tilde{m} = 1, 9$) |
| | | C. 3.2.3: $\tilde{n} < 1 \Rightarrow \tilde{m} = 2\tilde{n} + 3$ (coefficient of $r^{\tilde{m}+\tilde{n}-1}$); C. 3.2.3.1: No $\tilde{p}$ | fails (“brute force”) |
| | | C. 3.2.3.2: $\tilde{p} < -3$ | fails ($\tilde{p} = \tilde{n}$) |
| | | C. 3.2.3.3: $\tilde{p} > -3$ | fails ($\tilde{p} = \tilde{m}$) |
| | | C. 3.2.3.4: $\tilde{p} = -3$ | fails (“brute force”) |

4.3. The TS case

The TS OV equation [9] is
$$(\rho + 2p)' = \frac{(3F_{TS} - 4r - \kappa(\rho + 2p)r^{3})(\rho + p)}{2r(r - F_{TS})} \tag{4.11}$$
where
$$F_{TS}(r) = -\kappa \int p\,r^{2}\,dr, \tag{4.12}$$
and the metric functions are found by
$$A(r) = \frac{r}{r - F_{TS}(r)}, \tag{4.13}$$
$$\frac{B'(r)}{B(r)} = \frac{1 + \kappa(\rho + 2p)r^{2}}{r - F_{TS}(r)} - \frac{1}{r}. \tag{4.14}$$
Again, for $F_{TS}(r)$, with equation of state (1.4) we get Eq. (4.10), but with the replacement $w \to -\frac{w}{1+2w}$, leading to the only potentially new solutions for $1 + 2w = 0$. Then, we get two solutions:
$$\text{Solution 18:} \quad w = -\frac{1}{2}, \qquad F_{TS}(r) = C, \tag{4.15}$$
$$\text{Solution 19:} \quad w = -\frac{1}{2}, \qquad F_{TS}(r) = \frac{4}{3}r. \tag{4.16}$$
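One way to see where the coefficient $4/3$ in Solution 19 can come from is to track the coefficient of $r$ in $F_{15}$ (Table 3) under the replacement $w \to -w/(1+2w)$. The sketch below is an observation based on the formulas quoted above, not a derivation taken from the paper; the function names and sample values are hypothetical.

```python
from fractions import Fraction

def td_parameter(w):
    """Replacement w -> -w / (1 + 2w) quoted in the text (singular at w = -1/2)."""
    return -w / (1 + 2 * w)

def f15_coefficient(w):
    """Coefficient of r in F15 (Table 3): 4 w^2 / (3 w^2 - 2w - 1)."""
    return 4 * w * w / (3 * w * w - 2 * w - 1)

# For these sample values the coefficient is unchanged by the replacement.
for w in (Fraction(1, 4), Fraction(-2, 3), Fraction(5)):
    assert f15_coefficient(td_parameter(w)) == f15_coefficient(w)

# The expression stays finite at w = -1/2, where the replacement itself is singular.
c19 = f15_coefficient(Fraction(-1, 2))
assert c19 == Fraction(4, 3)     # matches F_TS = (4/3) r in Solution 19
assert 1 / (1 - c19) == -3       # A(r) = r / (r - F) = -3, from Eq. (4.13)
```

The value $A(r) = -3$ agrees with the entry for Solution 19 in Table 4.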
5. Discussion of Solutions Found from the OV-Like Equations in the Other Cases

In this section, we calculate the metric functions $A(r)$ and $B(r)$ for each solution from the previous section by using the relevant formulae, and discuss the solutions.

5.1. The TD case

For Solution 9, we get
$$A(r) = \frac{1}{1 - C/r} \tag{5.1}$$
which for $r < C$ (only possible if $C > 0$) gives
$$B(r) = -r_1^{-4}\left[(2r^2 + 5Cr - 15C^2) + \sqrt{\frac{C - r}{r}}\left(C_1 - 15C^2 \tan^{-1}\sqrt{\frac{C - r}{r}}\right)\right]^{2}, \tag{5.2}$$
i.e. Solution TD3 of [9]. On the other hand, for $r > C$ we find
$$B = r_1^{-4}\left[(2r^2 + 5Cr - 15C^2) + \sqrt{\frac{r - C}{r}}\left(C_1 + 15C^2 \ln\frac{\sqrt{r - C} + \sqrt{r}}{\sqrt{|C|}}\right)\right]^{2}, \tag{5.3}$$
the solution called NS1 in [9], found in [15] and named Kuch68 I in [14]. It describes a spacetime where pure pressure is in static equilibrium with its own gravitational attraction.
5.2. The ND(KS) and TS cases

The solutions found for the ND(KS) and TS cases, together with their metric functions, are shown in Table 4 (Solutions 12 and 18 do not appear because they are special cases of Solution 14). As in Sec. 3, the sign of $B(r)$ is arbitrary, unless forced by the signature requirement.

The Schwarzschild and Köttler (SdS) solutions, which appeared in Table 2, are found in this table as well, because they cannot really be classified in this scheme. Our classification is based upon the nature and direction of motion of the fluid, but for these solutions, the stress-energy-momentum tensor is independent of the fluid four-velocity: The $u_\mu u_\nu$ term in $T_{\mu\nu}$ is multiplied by $p + \rho$; and $p + \rho = 0$ for the Köttler solution, $p = \rho = 0$ for Schwarzschild. Hence, these solutions satisfy the equations for all four cases. The other solutions in the table are less well known.

Table 4. All finite-polynomial solutions for $F(r)$ in the ND(KS) and TS cases as defined in [9], together with the corresponding metric functions.

| No. | $w$ | $F(r)$ | $A(r) = g_{rr}$ | $B(r) = -g_{tt}$ | Comments |
|---|---|---|---|---|---|
| 10 | $-1$ | $Ar^3 + C$ | $\dfrac{1}{1 - \frac{C}{r} - Ar^2}$ | $1 - \dfrac{C}{r} - Ar^2$ | *Köttler (SdS)* |
| 11 | $-1$ | $r$ | $\infty$ | ? | – |
| 13 | $-3$ | $Ar^9 + \frac{9}{8}r$ | $-\dfrac{8}{1 + 8Ar^8}$ | $-\dfrac{1 + 8Ar^8}{r^6}$ | 13+ ($A > 0$): Type ND(KS), phantom-filled dynamic universe; 13− ($A < 0$): BH-like |
| 14 | arbitrary | $C$ | $\dfrac{1}{1 - \frac{C}{r}}$ | $1 - \dfrac{C}{r}$ | *Schwarzschild* |
| 15 | arbitrary, except $-\frac{1}{3}$ and $1$ | $\dfrac{4w^2}{3w^2 - 2w - 1}\,r$ | $\dfrac{(1-w)(3w+1)}{(w+1)^2}$ | $\pm\left(\dfrac{r}{r_0}\right)^{-\frac{4w}{w+1}}$ | 15a ($-\frac{1}{3} < w < 1$): Type TS; 15b (otherwise): Type ND(KS), incl. DE, incl. phantom |
| 16 | $\frac{1}{3}$ | $-\frac{1}{3}r + B$ | $\dfrac{3r}{4r - 3B}$ | $\pm\dfrac{r_0}{r}$ | 16a: Type TS; 16b: Type ND(KS) |
| 17 | $1$ | $\dfrac{C}{r}$ | $\dfrac{1}{1 - \frac{C}{r^2}}$ | $\pm 1$ | 17a: Type TS; 17b: Type ND(KS) |
| 19 | $-\frac{1}{2}$ | $\frac{4}{3}r$ | $-3$ | $-\left(\dfrac{r}{r_0}\right)^{-4}$ | Type ND(KS) |

Note:
a Solutions 12 and 18 do not appear because they are special cases of Solution 14.
b In Solutions 15–17, the upper signs in $B(r)$ apply to solutions a and lower signs to solutions b.
c The well-known solutions are indicated in italics.

Solution 13+: This solution is of type ND(KS), representing a dynamic spacetime filled with a phantom perfect fluid. Assuming $r$ is future-directed, the spacetime expands in the angular directions; in the perpendicular spacelike direction, it first contracts, reaches a minimum, then expands.^i It is singular at both ends of the evolution, that is, at $r = 0$ and as $r \to \infty$, the first singularity being in the finite past, the second in the infinite future. Of course, these attributes switch if $r$ is past-directed.

Solution 13−: This solution, like Solution 5−, represents a black hole spacetime, as far as test particle motion is concerned; but it must be supported by two different fluids on the two sides of the horizon: tachyonic fluid in the outside, static region and normal fluid in the dynamic region inside/in the future.

Solution 15a: This is a TS solution. For positive $w$, that is, for $0 < w < 1$, radially incoming test particles are reflected near the origin back to infinity, whereas for negative $w$, that is, $-\frac{1}{3} < w < 0$, the origin constitutes a potential well from which they cannot escape.

Solution 15b: This solution is identical to the $C_1 = 0$ special case of Solution ND2 of [9]. If $r$ is future-directed, it expands in the angular directions, and it expands or contracts in the perpendicular spacelike direction if the sign of $\frac{w}{w+1}$ is negative or positive, respectively.^j Note that this means expansion for non-phantom dark energy ($-1 < w < -\frac{1}{3}$) and “radial” contraction for phantom energy.

Solution 16a: This is a TS solution, where we must have $4r > 3B$, i.e. we have a restriction on $r$ if $B$ is positive. Either way, the equation of motion for test particles shows that tachyonic $w = \frac{1}{3}$ fluid is repulsive, consistent with Solution 15a.

Solution 16b: This is an ND(KS) solution, where $4r < 3B$. It represents a radiation-filled universe that expands and recollapses in angular directions while contracting and reexpanding in the perpendicular spacelike direction^k; first found in [16].

Solution 17a: This solution is identical to solution TS1 of [9], where we must have $r^2 > C$. For positive $C$, $r_0 = \sqrt{C}$ is a turning point for radial geodesics; for negative $C$, there are no such turning points.

^i The KS form of the metric is $ds^2 = -d\tau^2 + \frac{1 + 8Ar^8(\tau)}{r^6(\tau)}\,d\rho^2 + r^2(\tau)\,d\Omega^2$, where $\frac{dr}{d\tau} = \pm\sqrt{Ar^8 + \frac{1}{8}}$.
^j The KS form of the metric is $ds^2 = -d\tau^2 + \left(\frac{\tau^2}{|A| r_0^2}\right)^{-\frac{2w}{w+1}} d\rho^2 + \frac{\tau^2}{|A|}\,d\Omega^2$, where $A = -\frac{(w-1)(3w+1)}{(w+1)^2}$.
^k The KS form of the metric is $ds^2 = -d\tau^2 + \frac{r_0}{r(\tau)}\,d\rho^2 + r^2(\tau)\,d\Omega^2$, where $\frac{dr}{d\tau} = \pm\sqrt{\frac{B}{r} - \frac{4}{3}}$. The arbitrary $r_0$ must be chosen as $\frac{3}{4}B$ for agreement with [16, p. 1684].
Solution 17b: This is solution ND1 of [9], apparently first found in [17], describing a finite-lifetime universe containing stiff matter, expanding and recollapsing in the angular directions.^l

Solution 19: This solution is the $C_1 = 0$, $A = -3$ special case of solution ND2 of [9], describing a spacetime containing pressure, but no density (because it is an ND(KS) solution found from the TS equations, its equation of state is not $p = -\frac{\rho}{2}$); expanding in angular directions while contracting in the perpendicular spacelike direction,^m if $r$ is taken to be future-directed.

^l The KS form of the metric is $ds^2 = -d\tau^2 + d\rho^2 + (C - \tau^2)\,d\Omega^2$.
^m The KS form of the metric is $ds^2 = -d\tau^2 + \left(\frac{\tau_0}{\tau}\right)^{4} d\rho^2 + \frac{\tau^2}{3}\,d\Omega^2$.

6. Finite-Polynomial A(r)?

Another possible way to look for solutions is to work in terms of $A(r)$ rather than $F(r)$ by using Eq. (1.7). This leads to an equation with terms of second to fourth order in $A(r)$ and/or its derivatives. In trying to find a finite-polynomial solution for $A(r)$, if the highest power of $r$ in $A(r)$ is $m$, the highest power of $r$ in the equation is $4m + 1$; but it is multiplied by $A^2(w+1)^2$ in cases NS and TD, and $-A^2(w+1)^2$ in cases ND and TS, unless $m = 0$. Setting the trivial $w = -1$ case aside, therefore, the highest possible value for $m$ is zero. A similar argument shows that the lowest power in the $A(r)$ polynomial must be zero or higher. Hence the only finite polynomial $A(r)$ can be for equation of state (1.4) is a constant.

7. Summary and Conclusions

We have considered spherically symmetric perfect fluid solutions in General Relativity and found all finite-polynomial solutions (including negative powers) of the equation satisfied by the so-called “mass function” and its mathematical analogs for the equation of state $p = w\rho$; and discussed the associated spacetimes.

The equation for the mass function follows from the Oppenheimer–Volkoff (OV) equation in the standard case where the fluid is static and normal (i.e. timelike fluid four-velocities, $u^\mu u_\mu = -1$). However, the metric ansatz used in that analysis can also accommodate cases where the spacetime is dynamic in a certain way, or the fluid is tachyonic, as discussed in [9]. In these other cases analogous, but different functions exist, satisfying their own equations.

The solutions we found for the standard case, NS, are mathematically not very original; they are either some limiting cases of solutions found long ago by Tolman [13] or simple modifications thereof. Some aspects of the physical nature of these solutions can be seen in a new light, however, considering the classification in [9] and the newly cosmologically relevant concepts of dark energy and phantom energy. The solutions (Table 2) include dynamic spacetimes supported by tachyonic fluids (4b, 5+, 7b and 8b) and a static spacetime containing a $w = -\frac{1}{5}$ fluid around a negative point mass (8a). The TD case gives two extra solutions, one describing a spacetime where pure pressure is in static equilibrium with its own gravitational attraction.

Some interesting solutions are also found from the ND(KS) and TS cases (Table 4): There are static solutions supported by tachyonic fluids (15a, 16a, 17a), the first two presumably original. Some solutions (13+, 15b, 16b, 17b, 19) are of the Kantowski–Sachs (KS) class: Solutions 16b, 17b and 19 describe dynamic KS universes containing radiation, stiff matter and pure pressure, respectively.

We would like to particularly point out the following solutions:
• Solution 5− is a black hole-like spacetime, which must be supported by normal matter outside the horizon and tachyonic fluid on the inside.
• Solution 7a for $w < -3 - 2\sqrt{2}$ represents, perhaps unexpectedly, a family of static “ultraphantom” solutions.
• Solution 13+ is a phantom KS solution, probably new.
• Solution 13− is similar to Solution 5−, a black hole-like spacetime, supported by segregated normal and tachyonic matter, except in this solution, the tachyonic fluid is outside and normal fluid is inside. It was concluded in [9] that black holes supported by perfect fluids cannot be “simple”.
• Solutions 13+ and 13− both have “mass functions” containing the 9th power of $r$, therefore $g_{rr}$ containing $r^8$ in the denominator and $g_{tt}$ containing $r^{-6}$.
• Solution 15b can also be valid for dark energy, including phantom, exhibiting anisotropic expansion for non-phantom dark energy.

There are no other solutions where $F(r)$ is a finite polynomial of $r$ for the assumed equation of state.

One can also express the problem(s) in terms of $A(r)$, and then try to find finite polynomial solutions. The only such solution is $A(r)$ = constant.

Acknowledgments

We would like to thank M. Özbek and N. M. Uzun for helpful discussions. This work was partially supported by Grant No. 06B303 of the Boğaziçi University Research Fund.

References

[1] C. W. Misner, K. S. Thorne and J. A. Wheeler, Gravitation (Freeman, New York, 1973).
[2] M. S. Turner and D. Huterer, Cosmic acceleration, dark energy and fundamental physics, J. Phys. Soc. Jpn. 76 (2007) 111015; arXiv:0706.2186.
[3] R. R. Caldwell, A phantom menace? Cosmological consequences of a dark energy component with super-negative equation of state, Phys. Lett. B 545 (2002) 23–29; arXiv:astro-ph/9908168.
[4] A. G. Riess et al. (High-z Supernova Search Team), Observational evidence from supernovae for an accelerating universe and a cosmological constant, Astron. J. 116 (1998) 1009–1038.
[5] S. Perlmutter et al. (Supernova Cosmology Project), Measurements of omega and lambda from 42 high redshift supernovae, Astrophys. J. 517 (1999) 565–586.
[6] J. R. Oppenheimer and G. M. Volkoff, On massive neutron cores, Phys. Rev. 55 (1939) 374–381.
[7] R. Kantowski and R. K. Sachs, Some spatially homogeneous anisotropic relativistic cosmological models, J. Math. Phys. 7 (1966) 443–446.
[8] H. Stephani, D. Kramer, M. MacCallum, C. Hoenselaers and E. Herlt, Exact Solutions of Einstein’s Equations, 2nd edn. (Cambridge University Press, Cambridge, 2003).
[9] İ. Semiz, The standard “static” spherically symmetric ansatz with perfect fluid source revisited, Int. J. Mod. Phys. 19 (2010) 1–20.
[10] P. Boonserm, M. Visser and S. Weinfurtner, Generating perfect fluid spheres in general relativity, Phys. Rev. D 71 (2005) 124037.
[11] B. Schutz, Gravity from the Ground Up (Cambridge University Press, Cambridge, 2003).
[12] V. Dzhunushaliev, V. Folomeev, R. Myrzakulov and D. Singleton, Non-singular solutions to Einstein–Klein–Gordon equations with a phantom scalar field, JHEP 07 (2008) 094; arXiv:0805.3211.
[13] R. C. Tolman, Static solutions of Einstein’s field equations for spheres of fluid, Phys. Rev. 55 (1939) 364–373.
[14] M. S. R. Delgaty and K. Lake, Physical acceptability of isolated, static, spherically symmetric, perfect fluid solutions of Einstein’s equations, Comp. Phys. Commun. 115 (1998) 395–415; arXiv:gr-qc/9809013.
[15] B. Kuchowicz, Extensions of external Schwarzschild solution, Bull. Acad. Pol. Sci., Ser. Sci. Math. Astron. Phys. 16 (1968) 341–342.
[16] R. Kantowski, Some relativistic cosmological models, Ph.D. thesis, University of Texas (1966); reprinted in Gen. Relativ. Gravit. 30 (1998) 1665–1700.
[17] K. S. Thorne, Geometrodynamics of cylindrical systems, Ph.D. thesis, Princeton University (1965).
September 20, J070-S0129055X11004461
2011 11:23 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 8 (2011) 883–902 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004461
EXPLOSIVE SOLUTIONS OF STOCHASTIC VISCOELASTIC WAVE EQUATIONS WITH DAMPING
FEI LIANG∗,†,§ and HONGJUN GAO∗,‡,¶
∗Jiangsu Provincial Key Laboratory for Numerical Simulation of Large Scale Complex Systems, School of Mathematical Science, Nanjing Normal University, Nanjing 210046, P. R. China
†Department of Mathematics, An Hui Science and Technology University, Feng Yang, 233100 Anhui, P. R. China
‡Center of Nonlinear Science, Nanjing University, Nanjing 210093, P. R. China
§[email protected]
¶[email protected]
Received 28 January 2011
Revised 25 August 2011

In this paper, a nonlinear stochastic viscoelastic wave equation with linear damping is considered. By an appropriate energy inequality and estimates, we show that, under some sufficient conditions, the local solution of the stochastic equation either blows up with positive probability or is explosive in the $L^2$ sense. Moreover, an upper bound for the blow-up time is given.

Keywords: Stochastic viscoelastic wave equations; explosive solutions; energy inequality.

Mathematics Subject Classification 2010: 60H15, 35L05, 35L70
¶Corresponding author.

1. Introduction

The viscoelastic wave equation of the following form
\[
\begin{cases}
u_{tt} - \Delta u + \displaystyle\int_0^t g(t-\tau)\Delta u(\tau)\,d\tau + h(u_t) = f(u), & (x,t) \in D \times (0,T),\\
u(x,t) = 0, & (x,t) \in \partial D \times (0,T),\\
u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), & x \in D,
\end{cases}
\tag{1.1}
\]
describes a viscoelastic material, with $u(x,t)$ giving the position of material particle $x$ at time $t$. The term $g$ is the relaxation function, $f$ denotes the body force and $h$ is the damping term. The properties of the solution to (1.1) have been studied by many authors (see [1–5]). For instance, Messaoudi [2] studied (1.1) for $h(u_t) = a|u_t|^{m-2}u_t$ and $f(u) = b|u|^{p-2}u$ and proved a blow-up result for solutions with negative initial energy if $p > m \ge 2$ and a global result for $2 \le p \le m$. This result was later improved by the same author in [3] to accommodate certain solutions with positive initial energy. In [4], Song et al. considered (1.1) for $h(u_t) = -\Delta u_t$ and $f(u) = |u|^{p-2}u$ and proved a blow-up result for solutions with positive initial energy by using the ideas of the “potential well” theory introduced by Payne and Sattinger [6]. More recently, Wang [5] gave sufficient conditions on the initial data, with arbitrarily positive initial energy, such that the corresponding solution of (1.1) with $h(u_t) = u_t$ and $f(u) = |u|^{p-2}u$ blows up in finite time. For related results, we refer the reader to [7–11].

In fact, the driving force may be affected by the environment randomly. In view of this, we consider the following stochastic viscoelastic wave equations
\[
\begin{cases}
u_{tt} - \Delta u + \displaystyle\int_0^t g(t-\tau)\Delta u(\tau)\,d\tau + \mu u_t = \kappa|u|^p u + \varepsilon\sigma(u,\nabla u,x,t)\,\dfrac{dW(t,x)}{dt}, & x \in D,\ t \in (0,T),\\
u(x,t) = 0, & x \in \partial D,\ t \in (0,T),\\
u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), & x \in D,
\end{cases}
\tag{1.2}
\]
where $D$ is a bounded domain in $\mathbb{R}^d$ with a smooth boundary $\partial D$, $\kappa, p > 0$ are constants, $g$ is a positive function satisfying some conditions to be specified later, and $\mu$, $\varepsilon$ are given positive constants which measure the strength of the damping and the noise, respectively. $\sigma$ is a given function and $W(t,x)$ is an infinite-dimensional Wiener process which may be treated as the random force.

To motivate our work, let us recall some results regarding stochastic wave equations.
In [12], Chow discussed a class of non-dissipative stochastic wave equations with polynomial nonlinearity in $\mathbb{R}^d$ with $d \le 3$. Using the energy inequality, the author showed for an example that the solution either blows up in finite time with positive probability or is explosive in the $L^2$ norm, and studied the global existence of solutions for the equation. This blow-up result was later generalized by the same author in [13]. Furthermore, Chow [14, 15] studied properties of the solution such as asymptotic stability and invariant measure, and Brzeźniak et al. [16] studied global existence and stability of solutions for the stochastic nonlinear beam equations. In those papers, the energy inequality plays a central role in proving global existence of solutions. In a recent paper, using the energy inequality, Bo et al. [17] proposed sufficient conditions under which the solutions of a class of stochastic wave equations blow up with positive probability or in the $L^2$ sense. However, for the current Eq. (1.2), the memory part makes it difficult to estimate the energy by using these methods.
Hence, Wei and Jiang [18] studied (1.2) with $\sigma \equiv 1$ in another way. They showed the existence and uniqueness of the solution for (1.2) and obtained a decay estimate of the energy function of the solution. Motivated by the above research, in this article we first use the definition of solutions in [19] and extend it to the stochastic case. Then we prove the existence and uniqueness of a local mild solution. Furthermore, using the energy inequality we prove that the solution either blows up in finite time with positive probability or is explosive in $L^2$.

This paper is organized as follows. In Sec. 2, we present some assumptions and definitions needed for our work. In Sec. 3, we show the local existence and uniqueness of the mild solution. Section 4 is devoted to the proof of the explosive solutions for (1.2).

2. Preliminaries

First, let us introduce some notation used throughout this paper. We denote by $\|\cdot\|_q$ the $L^q(D)$ norm for $1 \le q \le \infty$ and by $\|\nabla\cdot\|_2$ the Dirichlet norm in $H_0^1(D)$, which is equivalent to the $H^1(D)$ norm. Moreover, we set
\[
\langle \varphi, \psi \rangle = \int_D \varphi(x)\psi(x)\,dx
\]
as the usual $L^2(D)$ inner product. For the relaxation function $g(t)$, we assume

(G1) $g \in C^1[0,\infty)$ is a non-negative and non-increasing function satisfying
\[
1 - \int_0^\infty g(s)\,ds = l > 0.
\]

(G2)
\[
\int_0^\infty g(s)\,ds < \frac{p(p+2)}{(p+1)^2}.
\]
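As a quick sanity check on (G1) and (G2), the following sketch evaluates both conditions for an exponential kernel $g(s) = e^{-as}$, whose total mass is $1/a$. The function names are hypothetical and chosen only for illustration. The last helper evaluates the combination $p(1-m) - m/(p+2)$ with $m = \int_0^\infty g(s)\,ds$, which is nonnegative exactly when $m \le p(p+2)/(p+1)^2$.

```python
def kernel_mass(a: float) -> float:
    """Integral of g(s) = exp(-a*s) over [0, inf), i.e. 1/a (requires a > 0)."""
    if a <= 0:
        raise ValueError("a must be positive")
    return 1.0 / a


def g2_threshold(p: float) -> float:
    """Right-hand side of (G2): p(p+2)/(p+1)^2."""
    return p * (p + 2.0) / (p + 1.0) ** 2


def check_kernel(a: float, p: float) -> dict:
    """Evaluate (G1) and (G2) for the exponential kernel with rate a."""
    mass = kernel_mass(a)
    l = 1.0 - mass                      # the constant l in (G1)
    return {
        "l": l,
        "G1": l > 0.0,                  # 1 - int g > 0
        "G2": mass < g2_threshold(p),   # int g < p(p+2)/(p+1)^2
    }


def gradient_coefficient(mass: float, p: float) -> float:
    """p(1 - m) - m/(p+2), which equals p(1 - (p+1)^2 m / (p(p+2)))."""
    return p * (1.0 - mass) - mass / (p + 2.0)
```

For instance, `check_kernel(2.0, 1.0)` reports both conditions as satisfied, while `check_kernel(1.2, 1.0)` satisfies (G1) but not (G2).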
Remark 2.1. Condition (G1) is necessary to guarantee the hyperbolicity of (1.2), and an example of a function satisfying (G1) and (G2) is $g(s) = e^{-as}$, $a > (p+1)^2/[p(p+2)]$.

According to the arguments in [19, 20], we solve Eq. (1.1) as an integro-differential equation. More precisely, we consider the following integro-differential equation
\[
\begin{cases}
u_{tt} + Au + \displaystyle\int_0^t B(t-\tau)u(\tau)\,d\tau + h(u_t) = f(u), & (x,t) \in D \times (0,T),\\
u(x,t) = 0, & (x,t) \in \partial D \times (0,T),\\
u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), & x \in D
\end{cases}
\tag{2.1}
\]
with $u \in L^1([0,T]; X)$, where $X$ is a real Hilbert space, $A$ and $B$ satisfy the conditions in [19], i.e. $A$ and $B(\cdot)$ are linear unbounded self-adjoint operators with domains $D(A)$ and $D(B(\cdot))$, respectively, satisfying that
(A1) $D(A) \subset D(B(t))$ for any $t \ge 0$ and $D(A)$ is dense in $X$.
(A2) $\langle Ay, y \rangle \ge c_0\|y\|_2^2$ for any $y \in D(A)$ and some constant $c_0 > 0$.
(A3) $B(\cdot)y \in W^{1,1}_{\mathrm{loc}}(0,+\infty; X)$ for any $y \in D(A)$.
(A4) $B(t)$ commutes with $A$, that is $B(t)D(A^2) \subset D(A)$ and $AB(t)y = B(t)Ay$, $y \in D(A^2)$, $t \ge 0$.
Definition 2.1. A family of bounded linear operators $\{S(t)\}_{t \ge 0}$ in $X$ is called a resolvent for Eq. (2.1) with $f = 0$ and $h = 0$, if the following conditions are satisfied:
(S1) $S(0) = I$ and $S(t)$ is strongly continuous on $[0,\infty)$. That is, for all $x \in X$, $S(\cdot)x$ is continuous on $[0,\infty)$.
(S2) $S(t)$ commutes with $A$, which means that $S(t)D(A) \subset D(A)$ and $AS(t)y = S(t)Ay$ for all $y \in D(A)$ and $t \ge 0$.
(S3) For any $y \in D(A)$, $S(\cdot)y$ is twice continuously differentiable in $X$ on $[0,\infty)$ and $S'(0) = 0$.
(S4) For any $y \in D(A)$ and $t \ge 0$, the resolvent equation is
\[
S''(t)y + AS(t)y + \int_0^t B(t-\tau)S(\tau)y\,d\tau = 0.
\]

Define the linear operators $A$ and $B$ by
\[
Au = -\Delta u, \qquad B(t)u = g(t)\Delta u, \qquad u \in D(A) = H^2(D) \cap H_0^1(D),
\tag{2.2}
\]
where $\Delta$ is the Laplace operator on $D$ with Dirichlet boundary condition and $g(t)$ satisfies (G1). It is easy to check that $A$ and $B$ satisfy the conditions (A1)–(A4). Hence we have the following theorem, which is a consequence of [19, Theorem 2].

Theorem 2.1. Assume that $A$ and $B$ satisfy (2.2), $f = 0$ and $h = 0$. Then there exists a unique resolvent $\{S(t)\}_{t \ge 0}$ for Eq. (2.1). Furthermore, the resolvent satisfies the following properties:
(i) The operators $S(t)$ are self-adjoint.
(ii) $S(t)$ commutes with $\sqrt{A}$, that is $S(t)D(\sqrt{A}) \subset D(\sqrt{A})$ and $\sqrt{A}S(t)x = S(t)\sqrt{A}x$ for all $x \in D(\sqrt{A})$ and $t \ge 0$.
(iii) For any $x \in L^2(D)$, the function $t \mapsto \int_0^t S(\tau)x\,d\tau$ belongs to $C([0,\infty); D(\sqrt{A}))$ and for any $T > 0$, there exists a constant $C_T$ such that
\[
\|S(t)x\|_2 + \Big\|\sqrt{A}\int_0^t S(\tau)x\,d\tau\Big\|_2 \le C_T\|x\|_2 \quad \text{for any } t \in [0,T].
\tag{2.3}
\]
(iv) For any $x \in D(\sqrt{A})$, the function $t \mapsto \int_0^t S(\tau)x\,d\tau$ belongs to $C([0,\infty); D(A))$ and for any $T > 0$, there exists a constant $C_T$ such that
\[
\Big\|A\int_0^t S(\tau)x\,d\tau\Big\|_2 \le C_T\|\sqrt{A}x\|_2, \quad t \in [0,T],
\tag{2.4}
\]
\[
\|S'(t)x\|_2 \le C_T(\|x\|_2 + \|\sqrt{A}x\|_2), \quad t \in [0,T],
\tag{2.5}
\]
\[
S'(t)x + A\int_0^t S(\tau)x\,d\tau + B * 1 * S(t)x = 0, \quad t > 0,
\tag{2.6}
\]
where $*$ stands for the convolution of two functions.
(v) For any $x \in D(\sqrt{A})$, the function $S'(\cdot)x$ belongs to $C([0,\infty); D(\sqrt{A}))$.

With the resolvent, we can define the solutions of the stochastic wave equation (1.2). From the definition (2.2), we can rewrite Eq. (1.2) as
\[
\begin{cases}
u_{tt} + Au + \displaystyle\int_0^t B(t-\tau)u(\tau)\,d\tau + \mu u_t = \kappa|u|^p u + \varepsilon\sigma(u,\nabla u,x,t)\,\partial_t W(t,x), & (x,t) \in D \times (0,T),\\
u(x,t) = 0, & (x,t) \in \partial D \times (0,T),\\
u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), & x \in D,
\end{cases}
\tag{2.7}
\]
where $W(t,x)$ is a $Q$-Wiener process in $X$ on some probability space $(\Omega, P, \mathcal{F})$ with the covariance operator $Q$ satisfying $\mathrm{Tr}\,Q < \infty$ and $\{\mathcal{F}_t, t \ge 0\}$ as its natural filtration satisfying the usual conditions. Moreover, we can assume that $Q$ has the following form
\[
Qe_i = \lambda_i e_i, \quad i = 1, 2, \ldots,
\]
where $\lambda_i$ are eigenvalues of $Q$ satisfying $\sum_{i=1}^\infty \lambda_i < \infty$ and $\{e_i\}$ are the corresponding eigenfunctions with $c_0 := \sup_{i \ge 1}\|e_i\|_\infty < \infty$ (where $\|\cdot\|_\infty$ denotes the sup-norm) which form an orthonormal basis of $X$. In this case,
\[
W(t,x) = \sum_{i=1}^\infty \sqrt{\lambda_i}\,B_i(t)e_i,
\]
where $\{B_i(t)\}$ is a sequence of independent copies of standard Brownian motions in one dimension. Let $H$ be the set of $L_2^0 = L^2(Q^{1/2}X, X)$-valued processes with the norm
\[
\|\Psi(t)\|_H = \Big(E\int_0^t \|\Psi(s)\|_{L_2^0}^2\,ds\Big)^{1/2} = \Big(E\int_0^t \mathrm{Tr}(\Psi(s)Q\Psi^*(s))\,ds\Big)^{1/2} < \infty,
\]
where $\Psi^*(s)$ denotes the adjoint operator of $\Psi(s)$. Let $\{t_k\}_{k=1}^n$ be a partition of $[0,T]$ such that $0 = t_0 < t_1 < \cdots < t_n = T$. For a process $\Psi(t) \in H$, define the stochastic integral with respect to the $Q$-Wiener process as
\[
\int_0^t \Psi(s)\,dW(s) = \lim_{n \to \infty}\sum_{k=0}^{n-1}\Psi(t_k)\big(W(t_{k+1} \wedge t) - W(t_k \wedge t)\big),
\]
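where the sequence converges in the $H$-sense. It is not difficult to check that the integral process $\int_0^t \Psi(s)\,dW(s)$ is a martingale for any $\Psi(t) \in H$, and the quadratic variation process is given by
\[
\Big\langle\!\Big\langle \int_0^{\cdot}\Psi(s)\,dW(s) \Big\rangle\!\Big\rangle_t = \int_0^t \mathrm{Tr}(\Psi(s)Q\Psi^*(s))\,ds.
\]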
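The series expansion of $W(t,x)$ can be illustrated by a truncated simulation. The sketch below is only an illustration under assumptions not made in the text: $D = (0,\pi)$, $e_i(x) = \sqrt{2/\pi}\sin(ix)$ (so that $\sup_i\|e_i\|_\infty < \infty$), eigenvalues $\lambda_i = i^{-2}$, and truncation after finitely many modes. It checks empirically that $E\|W(t)\|_2^2 \approx t\,\mathrm{Tr}(Q)$ for the truncated operator.

```python
import math
import random


def q_wiener_coefficients(lams, t_final, n_steps, rng):
    """Simulate sqrt(lam_i) * B_i(t) on a uniform time grid for each retained mode.

    Returns a list of paths; paths[i][k] approximates sqrt(lam_i) * B_i(t_k).
    """
    dt = t_final / n_steps
    paths = []
    for lam in lams:
        b = 0.0
        path = [0.0]
        for _ in range(n_steps):
            b += rng.gauss(0.0, math.sqrt(dt))  # independent N(0, dt) increment
            path.append(math.sqrt(lam) * b)
        paths.append(path)
    return paths


def squared_norm_at(paths, k):
    """||W(t_k)||^2 = sum_i lam_i B_i(t_k)^2, by orthonormality of {e_i}."""
    return sum(path[k] ** 2 for path in paths)


def evaluate(paths, k, x):
    """W(t_k, x) on D = (0, pi) with the assumed basis e_i(x) = sqrt(2/pi) sin(i x)."""
    return sum(
        path[k] * math.sqrt(2.0 / math.pi) * math.sin((i + 1) * x)
        for i, path in enumerate(paths)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    lams = [1.0 / (i * i) for i in range(1, 21)]  # hypothetical summable eigenvalues
    trace_q = sum(lams)
    n_samples, n_steps, t_final = 2000, 20, 1.0
    mean_sq = sum(
        squared_norm_at(q_wiener_coefficients(lams, t_final, n_steps, rng), n_steps)
        for _ in range(n_samples)
    ) / n_samples
    # The two printed numbers should be close, since E||W(t)||^2 = t * Tr(Q).
    print(trace_q, mean_sq)
```

Only the coefficients $\sqrt{\lambda_i}B_i(t)$ are simulated; spatial values are recovered on demand through `evaluate`, which keeps the sketch independent of any spatial grid.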
For more details about the infinite-dimensional Wiener process and the stochastic integral, we refer to [21]. Now we give the definitions of strong solution, weak solution and mild solution to (1.2).

Definition 2.2. (i) We say that $u$ is a strong solution to (1.2) if $u$ is $\{\mathcal{F}_t\}_{t \ge 0}$-adapted, belongs to $C^2([0,T] \times D; X) \cap C([0,T] \times D; D(A))$ and satisfies (1.2).
(ii) An $\{\mathcal{F}_t\}_{t \ge 0}$-adapted $X$-valued stochastic process $u$ is said to be a weak solution to (1.2) if $u \in C^1([0,T] \times D; X) \cap C([0,T] \times D; D(\sqrt{A}))$, and for any $\varphi \in D(\sqrt{A})$, the following equation holds:
\[
\frac{d}{dt}\langle u_t, \varphi\rangle + \langle \nabla u, \nabla\varphi\rangle - \Big\langle \int_0^t g(t-\tau)\nabla u(\tau)\,d\tau, \nabla\varphi \Big\rangle + \mu\langle u_t, \varphi\rangle
= \kappa\langle |u|^p u, \varphi\rangle + \Big\langle \varphi, \varepsilon\sigma(u)\frac{d}{dt}W(t) \Big\rangle.
\]
(iii) An $\{\mathcal{F}_t\}_{t \ge 0}$-adapted $X$-valued stochastic process $u$ is said to be a mild solution to (1.2) if $u \in C^1([0,T] \times D; X) \cap C([0,T] \times D; D(\sqrt{A}))$ and the following equation holds:
\[
\begin{aligned}
u(t) ={}& S(t)u_0 + \mu\int_0^t S(\tau)u_0\,d\tau + \int_0^t S(\tau)u_1\,d\tau - \mu\int_0^t S(t-\tau)u(\tau)\,d\tau\\
&+ \kappa\int_0^t 1 * S(t-\tau)|u|^p u\,d\tau + \varepsilon\int_0^t 1 * S(t-\tau)\sigma(u,\nabla u,x,\tau)\,dW(\tau),
\end{aligned}
\tag{2.8}
\]
where $S(t)$ is the resolvent of (2.1) with $f = h = 0$.

Remark 2.2. By the definitions above and Theorem 2.1, we know that the stochastic integral in (2.8) is well-defined and if $u$ is a mild solution to (1.2) satisfying $u_0 \in D(A)$ and $u_1 \in D(\sqrt{A})$, then $u$ is a strong solution.

3. Existence and Uniqueness of Local Solution

In this section, we deal with the local existence and uniqueness of the solution for problem (1.2). From the definition of $A$, we have $D(\sqrt{A}) = H_0^1(D)$. For a given $T > 0$, denote
\[
\Lambda := \Big\{ v \in C([0,T]; D(\sqrt{A}));\ \sup_{0 \le t \le T} E\|\sqrt{A}v\|_2 < \infty \Big\}
\]
equipped with the distance generated by $C([0,T]; D(\sqrt{A}))$, i.e.
\[
d(v, v') = \sup_{0 \le t \le T} E\|\sqrt{A}v - \sqrt{A}v'\|_2, \quad \forall\, v, v' \in \Lambda.
\]
Then $(\Lambda, d)$ is a complete metric space. In addition, assume that $\sigma : \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d \times \mathbb{R}_+ \to \mathbb{R}$ is a continuous function satisfying
\[
|\sigma(u,\nabla u,x,t)|^2 \le c_1\big(1 + |u|^{2(p+1)} + |\nabla u|^2\big)
\tag{3.1}
\]
and
\[
|\sigma(u,\nabla u,x,t) - \sigma(v,\nabla v,x,t)|^2 \le c_2\big[(1 + |u|^{2p} + |v|^{2p})|u - v|^2 + |\nabla u - \nabla v|^2\big]
\tag{3.2}
\]
for all $u, v \in D(\sqrt{A})$, where $c_1$ and $c_2$ are some constants.

Theorem 3.1. Assume that $0 < p \le \frac{d}{d-2} - 1$ when $d \ge 3$ and $0 < p < \infty$ when $d = 1, 2$, and (G1), (3.1) and (3.2) hold. Then (1.2) admits a unique local mild solution $u \in C([0,T]; D(\sqrt{A}))$ provided $(u_0, u_1) \in H_0^1(D) \times L^2(D)$.
Proof. Since $f$ is locally Lipschitz, existence of a local in time solution is proved by standard arguments. For each $n \ge 1$, define a $C^1$ function $k_n$ by
\[
k_n(x) =
\begin{cases}
1, & \text{if } x \le n,\\
\in (0,1), & \text{if } n < x < n+1,\\
0, & \text{if } x \ge n+1,
\end{cases}
\tag{3.3}
\]
and further assume that $\|k_n'\|_\infty \le 2$. Let $f_n(u) = \kappa k_n(\|\sqrt{A}u\|_2)|u|^p u$ and $\Sigma_n(u) = \varepsilon k_n(\|\sqrt{A}u\|_2)\sigma(u)$ for all $u \in \Lambda$. We will prove the existence and uniqueness of the solution to the following stochastic integral equation
\[
\begin{aligned}
u_n(t) ={}& S(t)u_0 + \mu\int_0^t S(\tau)u_0\,d\tau + \int_0^t S(\tau)u_1\,d\tau - \mu\int_0^t S(t-\tau)u_n(\tau)\,d\tau\\
&+ \int_0^t 1 * S(t-\tau)f_n(u_n)\,d\tau + \int_0^t 1 * S(t-\tau)\Sigma_n(u_n(\tau))\,dW(\tau).
\end{aligned}
\tag{3.4}
\]
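The text fixes $k_n$ only up to its behaviour on $(n, n+1)$. One admissible choice, assumed here purely for illustration, is a cubic blend; the sketch below implements it together with its derivative, whose absolute value never exceeds $3/2$, so the requirement $\|k_n'\|_\infty \le 2$ is met.

```python
def k_n(x: float, n: int) -> float:
    """C^1 cutoff: 1 for x <= n, 0 for x >= n + 1, cubic blend in between."""
    if x <= n:
        return 1.0
    if x >= n + 1:
        return 0.0
    s = x - n                      # s lies in (0, 1)
    return 1.0 - 3.0 * s * s + 2.0 * s ** 3


def k_n_prime(x: float, n: int) -> float:
    """Derivative of k_n; it vanishes at both ends, so k_n is C^1 on the real line."""
    if x <= n or x >= n + 1:
        return 0.0
    s = x - n
    return -6.0 * s * (1.0 - s)    # |k_n'| <= 3/2, attained at s = 1/2
```

Any other $C^1$ interpolation with slope bounded by 2 would serve equally well; the cubic is simply the lowest-degree polynomial matching value and slope at both ends.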
Set
\[
u_n^{(0)}(t) = S(t)u_0,
\tag{3.5}
\]
\[
\begin{aligned}
u_n^{(m)}(t) ={}& S(t)u_0 + \mu\int_0^t S(\tau)u_0\,d\tau + \int_0^t S(\tau)u_1\,d\tau - \mu\int_0^t S(t-\tau)u_n^{(m-1)}(\tau)\,d\tau\\
&+ \int_0^t 1 * S(t-\tau)f_n(u_n^{(m-1)})\,d\tau + \int_0^t 1 * S(t-\tau)\Sigma_n(u_n^{(m-1)}(\tau))\,dW(\tau).
\end{aligned}
\tag{3.6}
\]
We will show that $\{u_n^{(m)}\}$ is a Cauchy sequence in $\Lambda$ for each $n$ and the limit of $\{u_n^{(m)}\}$ is a solution to (3.4). Since $p$ satisfies $0 < p \le \frac{d}{d-2} - 1$ when $d \ge 3$ and $0 < p < \infty$ when $d = 1, 2$, the Gagliardo–Nirenberg inequality yields
\[
\|u\|_{2(p+1)} \le c\|\nabla u\|_2,
\tag{3.7}
\]
where $c$ is a constant. Using (3.7), we have the following inequality
\[
\|u^p v\|_2 \le c^{p+1}\|\nabla u\|_2^p\|\nabla v\|_2, \quad \forall\, u, v \in \Lambda.
\tag{3.8}
\]
In fact, when $d = 1, 2$, let $q > 1$ and $k = \frac{q}{q-1}$; by the Hölder inequality and (3.7) we have
\[
\|u^p v\|_2 \le \|u\|_{2pq}^p\|v\|_{2k} \le c^{p+1}\|\nabla u\|_2^p\|\nabla v\|_2.
\tag{3.9}
\]
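The exponent bookkeeping behind (3.7)–(3.9) can be checked with exact rational arithmetic. The sketch below (function names are hypothetical) encodes the admissible range of $p$ and, for $d \ge 3$, the Hölder pair $q = d/((d-2)p)$, $k = q/(q-1)$, so that one can verify that $2pq$ equals the critical Sobolev exponent $2d/(d-2)$ and that $2k$ does not exceed it.

```python
from fractions import Fraction


def sobolev_critical(d: int):
    """Largest r with H_0^1(D) embedded in L^r(D); None means every finite r (d = 1, 2)."""
    return None if d <= 2 else Fraction(2 * d, d - 2)


def p_admissible(p, d: int) -> bool:
    """Range of p in Theorem 3.1: p > 0, and p <= d/(d-2) - 1 when d >= 3."""
    p = Fraction(p)
    if p <= 0:
        return False
    return True if d <= 2 else p <= Fraction(d, d - 2) - 1


def holder_exponents(p, d: int):
    """For d >= 3 and admissible p: q = d/((d-2)p) and its conjugate k = q/(q-1)."""
    if d < 3 or not p_admissible(p, d):
        raise ValueError("expects d >= 3 and an admissible p")
    p = Fraction(p)
    q = Fraction(d, d - 2) / p
    k = q / (q - 1)
    return q, k


def exponents_fit(p, d: int) -> bool:
    """Check 2(p+1), 2pq and 2k against the critical Sobolev exponent for d >= 3."""
    q, k = holder_exponents(p, d)
    r = sobolev_critical(d)
    return 2 * (Fraction(p) + 1) <= r and 2 * Fraction(p) * q <= r and 2 * k <= r
```

For example, with $d = 3$ and the endpoint $p = 2$ one gets $q = 3/2$ and $k = 3$, so both $2pq$ and $2k$ equal the critical exponent 6.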
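When $d > 2$, set $q = \frac{d}{(d-2)p} > 1$. Then $k = \frac{d}{d-(d-2)p} \le \frac{d}{d-2}$, so (3.9) is also valid for $d > 2$. From the assumptions (3.1) and (3.2), Theorem 2.1 and (3.7), we have
\[
\begin{aligned}
\mathrm{Tr}\big(\Sigma_n(u_n^{(0)})Q\Sigma_n^*(u_n^{(0)})\big)
&= \sum_{i=1}^\infty \big\langle \Sigma_n(u_n^{(0)})Qe_i, \Sigma_n(u_n^{(0)})e_i \big\rangle
= \sum_{i=1}^\infty \lambda_i\big\langle \Sigma_n(u_n^{(0)})e_i, \Sigma_n(u_n^{(0)})e_i \big\rangle\\
&\le \varepsilon^2 c_1\sum_{i=1}^\infty \lambda_i k_n^2(\|\sqrt{A}u_n^{(0)}\|_2)\int_D e_i^2\big(1 + |u_n^{(0)}|^{2(p+1)} + |\nabla u_n^{(0)}|^2\big)\,dx\\
&\le \varepsilon^2 c_1\sum_{i=1}^\infty \lambda_i k_n^2(\|\sqrt{A}u_n^{(0)}\|_2)\Big[\|e_i\|_2^2 + \sup_{i \ge 1}\|e_i\|_\infty^2\big(\|u_n^{(0)}\|_{2(p+1)}^{2(p+1)} + \|\nabla u_n^{(0)}\|_2^2\big)\Big]\\
&\le \varepsilon^2 c_1\sum_{i=1}^\infty \lambda_i k_n^2(\|\sqrt{A}u_n^{(0)}\|_2)\Big[\|e_i\|_2^2 + \sup_{i \ge 1}\|e_i\|_\infty^2\big(c^{2(p+1)}\|\nabla u_n^{(0)}\|_2^{2(p+1)} + \|\nabla u_n^{(0)}\|_2^2\big)\Big]\\
&\le \varepsilon^2 C_{n,T}\big(\|\nabla u_0\|_2^{2(p+1)} + \|\nabla u_0\|_2^2\big).
\end{aligned}
\tag{3.10}
\]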
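From Theorem 2.1, Sobolev inequality, (3.8) and (3.10), we get
\[
\begin{aligned}
E\|\sqrt{A}(u_n^{(1)} - u_n^{(0)})\|_2^2
&\le C_T\Big(\|u_0\|_2^2 + \|u_1\|_2^2 + \mu E\|u_n^{(0)}\|_2^2 + E\int_0^t \|f_n(u_n^{(0)})\|_2^2\,d\tau + E\int_0^t \mathrm{Tr}\big(\Sigma_n(u_n^{(0)})Q\Sigma_n^*(u_n^{(0)})\big)\,d\tau\Big)\\
&\le C_T\Big(\|u_0\|_2^2 + \|u_1\|_2^2 + \mu E\|S(t)u_0\|_2^2 + \kappa^2 E\int_0^t \big\||u_n^{(0)}|^p u_n^{(0)}\big\|_2^2\,\chi_{\{\|\sqrt{A}u_n^{(0)}\|_2 \le n+1\}}\,d\tau\\
&\qquad\quad + \varepsilon^2 C_{n,T}\big(\|\nabla u_0\|_2^{2(p+1)} + \|\nabla u_0\|_2^2\big)\Big)\\
&\le C_T\Big(\|u_0\|_2^2 + \|u_1\|_2^2 + \mu C_T\|u_0\|_2^2 + \kappa^2 c^{2(p+1)}E\int_0^t \|\sqrt{A}u_n^{(0)}\|_2^{2(p+1)}\chi_{\{\|\sqrt{A}u_n^{(0)}\|_2 \le n+1\}}\,d\tau\\
&\qquad\quad + \varepsilon^2 C_{n,T}\big(\|\nabla u_0\|_2^{2(p+1)} + \|\nabla u_0\|_2^2\big)\Big)\\
&\le C_{n,T}\big(\|\sqrt{A}u_0\|_2^2 + \|u_1\|_2^2 + \|\sqrt{A}u_0\|_2^{2(p+1)}\big) < \infty,
\end{aligned}
\tag{3.11}
\]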
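where $\chi_B$ denotes the indicator of some set $B$. Let $\eta(t) = \max\{\|\sqrt{A}v_1\|_2, \|\sqrt{A}v_2\|_2\}$ for any $v_1, v_2 \in \Lambda$. Then according to Theorem 2.1, we have
\[
\begin{aligned}
&\Big\|\sqrt{A}\int_0^t 1 * S(t-\tau)\big(f_n(v_1(\tau)) - f_n(v_2(\tau))\big)\,d\tau\Big\|_2\\
&\quad\le \kappa\Big\|\sqrt{A}\int_0^t 1 * S(t-\tau)\big(k_n(\|\sqrt{A}v_1\|_2) - k_n(\|\sqrt{A}v_2\|_2)\big)|v_1(\tau)|^p v_1(\tau)\,d\tau\Big\|_2\\
&\qquad + \kappa\Big\|\sqrt{A}\int_0^t 1 * S(t-\tau)k_n(\|\sqrt{A}v_2\|_2)\big(|v_1(\tau)|^p v_1(\tau) - |v_2(\tau)|^p v_2(\tau)\big)\,d\tau\Big\|_2\\
&\quad\le \kappa\Big\|\int_0^t \big(k_n(\|\sqrt{A}v_1\|_2) - k_n(\|\sqrt{A}v_2\|_2)\big)|v_1(\tau)|^p v_1(\tau)\,d\tau\Big\|_2\\
&\qquad + \kappa\Big\|\int_0^t k_n(\|\sqrt{A}v_2\|_2)\big(|v_1(\tau)|^p v_1(\tau) - |v_2(\tau)|^p v_2(\tau)\big)\,d\tau\Big\|_2\\
&\quad= \kappa I_1 + \kappa I_2.
\end{aligned}
\tag{3.12}
\]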
For $I_1$, using the definition of $k_n(x)$ and (3.7), we have
\[
\begin{aligned}
I_1 &\le \|k_n'\|_\infty\Big\|\int_0^t \big(\|\sqrt{A}v_1\|_2 - \|\sqrt{A}v_2\|_2\big)|v_1(\tau)|^p v_1(\tau)\,d\tau\Big\|_2\\
&\le C\int_0^t \big|\|\sqrt{A}v_1\|_2 - \|\sqrt{A}v_2\|_2\big|\,d\tau\int_0^t \|v_1(\tau)\|_{2(p+1)}^{p+1}\chi_{\{\eta(t) \le n+1\}}\,d\tau\\
&\le Cc^{p+1}\int_0^t \|\sqrt{A}v_1 - \sqrt{A}v_2\|_2\,d\tau\int_0^t \|\sqrt{A}v_1(\tau)\|_2^{p+1}\chi_{\{\eta(t) \le n+1\}}\,d\tau\\
&\le C_{n,T}\int_0^t \|\sqrt{A}v_1 - \sqrt{A}v_2\|_2\,d\tau.
\end{aligned}
\tag{3.13}
\]
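For $I_2$, by the Hölder inequality we have
\[
\begin{aligned}
I_2 &\le \int_0^t k_n(\|\sqrt{A}v_2\|_2)\big\||v_1(\tau)|^p v_1(\tau) - |v_2(\tau)|^p v_2(\tau)\big\|_2\,d\tau\\
&\le C\int_0^t \big\||v_1(\tau) - v_2(\tau)|(|v_1|^p \vee |v_2|^p)\big\|_2\,d\tau\\
&\le C\int_0^t \|v_1(\tau) - v_2(\tau)\|_{2(p+1)}\big(\|v_1\|_{2(p+1)}^p + \|v_2\|_{2(p+1)}^p\big)\,d\tau\\
&\le Cc^{p+1}\int_0^t \|\sqrt{A}v_1(\tau) - \sqrt{A}v_2(\tau)\|_2\big(\|\sqrt{A}v_1\|_2^p + \|\sqrt{A}v_2\|_2^p\big)\,d\tau\\
&\le C_n\int_0^t \|\sqrt{A}v_1 - \sqrt{A}v_2\|_2\,d\tau,
\end{aligned}
\tag{3.14}
\]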
where $t \vee s = \max(t, s)$. Combining (3.12)–(3.14), by the Hölder inequality we get
\[
\Big\|\sqrt{A}\int_0^t 1 * S(t-\tau)\big(f_n(v_1(\tau)) - f_n(v_2(\tau))\big)\,d\tau\Big\|_2^2 \le \kappa C_{n,T}\int_0^t \|\sqrt{A}v_1 - \sqrt{A}v_2\|_2^2\,d\tau.
\tag{3.15}
\]
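Similarly to (3.10), from the assumptions (3.1) and (3.2), one can conclude that
\[
\begin{aligned}
&\mathrm{Tr}\big((\Sigma_n(v_1) - \Sigma_n(v_2))Q(\Sigma_n(v_1) - \Sigma_n(v_2))^*\big)
= \sum_{i=1}^\infty \lambda_i\|(\Sigma_n(v_1) - \Sigma_n(v_2))e_i\|_2^2\\
&\quad\le 2\varepsilon^2\sum_{i=1}^\infty \lambda_i\big\|\big(k_n(\|\sqrt{A}v_1\|_2) - k_n(\|\sqrt{A}v_2\|_2)\big)\sigma(v_1)e_i\big\|_2^2
+ 2\varepsilon^2 k_n^2(\|\sqrt{A}v_2\|_2)\sup_{i \ge 1}\|e_i\|_\infty^2\,\mathrm{Tr}(Q)\|\sigma(v_1) - \sigma(v_2)\|_2^2\\
&\quad\le 8\varepsilon^2\sum_{i=1}^\infty \lambda_i\|\sqrt{A}v_1 - \sqrt{A}v_2\|_2^2\|\sigma(v_1)e_i\|_2^2\chi_{\{\eta(t) \le n+1\}}\\
&\qquad + 2\varepsilon^2 c_0^2 c_2\,\mathrm{Tr}(Q)\Big(\|v_1 - v_2\|_2^2 + \big\||v_1|^p|v_1 - v_2|\big\|_2^2 + \big\||v_2|^p|v_1 - v_2|\big\|_2^2 + \|\sqrt{A}v_1 - \sqrt{A}v_2\|_2^2\Big)\\
&\quad\le \varepsilon^2 C_n\|\sqrt{A}v_1 - \sqrt{A}v_2\|_2^2.
\end{aligned}
\tag{3.16}
\]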
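Combining (3.15) with (3.16), according to Theorem 2.1 it follows that
\[
\begin{aligned}
E\|\sqrt{A}(u_n^{(m+1)} - u_n^{(m)})\|_2^2
&\le \mu^2 C E\Big\|\sqrt{A}\int_0^t S(t-\tau)\big(u_n^{(m)}(\tau) - u_n^{(m-1)}(\tau)\big)\,d\tau\Big\|_2^2\\
&\quad + CE\Big\|\sqrt{A}\int_0^t 1 * S(t-\tau)\big(f_n(u_n^{(m)}(\tau)) - f_n(u_n^{(m-1)}(\tau))\big)\,d\tau\Big\|_2^2\\
&\quad + CE\Big\|\sqrt{A}\int_0^t 1 * S(t-\tau)\big(\Sigma_n(u_n^{(m)}(\tau)) - \Sigma_n(u_n^{(m-1)}(\tau))\big)\,dW(\tau)\Big\|_2^2\\
&\le C_T E\int_0^t \big\|S(t-\tau)\sqrt{A}\big(u_n^{(m)}(\tau) - u_n^{(m-1)}(\tau)\big)\big\|_2^2\,d\tau
+ \kappa C_{T,n}\int_0^t E\big\|\sqrt{A}\big(u_n^{(m)}(\tau) - u_n^{(m-1)}(\tau)\big)\big\|_2^2\,d\tau\\
&\quad + C_T E\int_0^t \mathrm{Tr}\big((\Sigma_n(u_n^{(m)}) - \Sigma_n(u_n^{(m-1)}))Q(\Sigma_n(u_n^{(m)}) - \Sigma_n(u_n^{(m-1)}))^*\big)\,d\tau\\
&\le C_{T,n}\int_0^t E\big\|\sqrt{A}\big(u_n^{(m)}(\tau) - u_n^{(m-1)}(\tau)\big)\big\|_2^2\,d\tau.
\end{aligned}
\]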
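Iterating this inequality, we get
\[
\sup_{0 \le t \le T} E\|\sqrt{A}(u_n^{(m+1)} - u_n^{(m)})\|_2^2 \le \frac{C_{T,n}^m}{m!}\sup_{0 \le t \le T} E\|\sqrt{A}(u_n^{(1)} - u_n^{(0)})\|_2^2.
\]
From (3.11), we have
\[
\sum_{m=1}^\infty \sup_{0 \le t \le T} E\|\sqrt{A}(u_n^{(m+1)} - u_n^{(m)})\|_2^2 < \infty.
\]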
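The summability rests on the factorial decay of the bounds $C_{T,n}^m/m!$. A minimal numerical sketch, with a hypothetical constant `c` standing in for $C_{T,n}$ and `first_gap` standing in for the bound from (3.11), shows that the partial sums stay below $(e^{c} - 1)$ times the first gap:

```python
import math


def picard_bounds(c: float, first_gap: float, m_max: int):
    """Bounds c^m / m! * first_gap for m = 1..m_max on successive Picard differences."""
    return [c ** m / math.factorial(m) * first_gap for m in range(1, m_max + 1)]


def partial_sum(c: float, first_gap: float, m_max: int) -> float:
    """Partial sum of the bounds; it increases to (exp(c) - 1) * first_gap."""
    return sum(picard_bounds(c, first_gap, m_max))


if __name__ == "__main__":
    c, gap = 5.0, 1.0  # hypothetical values chosen for illustration
    for m_max in (5, 10, 20, 40):
        print(m_max, partial_sum(c, gap, m_max), (math.exp(c) - 1.0) * gap)
```

Even for a large constant the terms eventually decay faster than any geometric sequence, which is why no smallness condition on $T$ is needed once the truncation level $n$ is fixed.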
Hence, there exists $u_n \in \Lambda$ such that $\lim_{m \to \infty}u_n^{(m)}(t) = u_n(t)$ uniformly. It is easy to check that $u_n$ is a solution to Eq. (3.4). For the uniqueness, let $u_n$ and $v_n$ be two solutions to (3.4). Then by arguments similar to those above, we have
\[
\sup_{0 \le t \le T} E\|\sqrt{A}(u_n(t) - v_n(t))\|_2^2 \le C_{T,n}\int_0^T \sup_{0 \le t \le T} E\|\sqrt{A}(u_n(\tau) - v_n(\tau))\|_2^2\,d\tau.
\]
Using the Gronwall inequality, we have
\[
\sup_{0 \le t \le T} E\|\sqrt{A}(u_n(t) - v_n(t))\|_2^2 = 0.
\]
Finally, the continuity of $u_n$ follows from the continuity of $S$ and the integrals.
For each $n$, define the stopping time $\tau_n$ by
\[
\tau_n = \inf\{t > 0;\ \|\sqrt{A}u_n\|_2 \ge n\}.
\]
By the uniqueness of the solution, for $m > n$, $u_m(t) = u_n(t)$ on $[0, \tau_n]$. So we can define a local solution $u$ of (1.2) by $u(t) = u_n(t)$ on $[0, T \wedge \tau_n]$. Let $\tau_\infty = \lim_{n \to \infty}\tau_n$. Hence, we construct a unique continuous local solution to (1.2) on $[0, T \wedge \tau_\infty]$.

4. Explosive Solution of (1.2)

In this section, we discuss the explosion of the mild solution of (1.2). Note that we can only apply the Itô formula to a strong solution of Eq. (1.2). However, $D(A)$ is dense in $D(\sqrt{A})$ and a strong solution is also a mild solution. So we can approximate the energy function of a mild solution $u$ by a sequence of energy functions such that the corresponding strong solution sequence $\{u_n\}$ converges to $u$. Hence, the following arguments, which should be derived for a strong solution, can be easily extended to a mild solution (see [12, 16]). As is well known, Eq. (1.2) is equivalent to the following Itô system
\[
\begin{cases}
du_t = v_t\,dt,\\
dv_t = \Big(\Delta u_t - \displaystyle\int_0^t g(t-\tau)\Delta u_\tau\,d\tau - \mu v_t + \kappa|u_t|^p u_t\Big)dt + \varepsilon\sigma(u_t,\nabla u_t,x,t)\,dW(t,x),\\
u_t(x,t) = 0, \quad x \in \partial D,\\
u_0(x,0) = u_0(x), \quad v_0(x,0) = u_1(x).
\end{cases}
\tag{4.1}
\]
For $A = -\Delta$, we will replace $\sqrt{A}$ with $\nabla$ in this section. Set $\sigma(u,\nabla u,x,t) \equiv \sigma(x,t)$ with
\[
\int_0^\infty\!\!\int_D \sigma^2(x,t)\,dx\,dt < \infty.
\tag{4.2}
\]
Define the energy functional $\mathcal{E}(t)$ associated to our system
\[
\mathcal{E}(t) = \frac12\|v_t(t)\|_2^2 + \frac12\Big(1 - \int_0^t g(\tau)\,d\tau\Big)\|\nabla u_t(t)\|_2^2 + \frac12(g \circ \nabla u_t)(t) - \frac{\kappa}{p+2}\|u_t\|_{p+2}^{p+2},
\]
where
\[
(g \circ w)(t) = \int_0^t g(t-\tau)\|w(t) - w(\tau)\|_2^2\,d\tau.
\]
Before we state and prove our explosion result, we need the following lemma.
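Lemma 4.1. Assume (G1) and (4.2) hold. Let $(u_t, v_t)$ be a global mild solution of system (4.1) with initial data $(u_0, u_1) \in H_0^1(D) \times L^2(D)$. Then
\[
E\mathcal{E}(t) \le \mathcal{E}(0) - \mu\int_0^t E\|v_\tau\|_2^2\,d\tau + \frac12\varepsilon^2 c_0^2\,\mathrm{Tr}(Q)\int_0^t\!\!\int_D \sigma^2(x,s)\,dx\,ds,
\tag{4.3}
\]
and
\[
\begin{aligned}
E\langle u_t(t), v_t(t)\rangle ={}& \langle u_0(x), v_0(x)\rangle - \int_0^t E\|\nabla u_\tau\|_2^2\,d\tau + \int_0^t E\|v_\tau(\tau)\|_2^2\,d\tau + \frac{\mu}{2}\|u_0\|_2^2 - \frac{\mu}{2}E\|u_t(t)\|_2^2\\
&+ E\int_0^t \Big\langle \int_0^s g(s-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_s(s) \Big\rangle ds + \kappa\int_0^t E\|u_\tau(\tau)\|_{p+2}^{p+2}\,d\tau,
\end{aligned}
\tag{4.4}
\]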
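where $c_0$ is defined in Sec. 2.

Proof. Applying the Itô formula to $\|v_t\|_2^2$, we have
\[
\begin{aligned}
\|v_t\|_2^2 ={}& \|v_0\|_2^2 + 2\int_0^t \langle v_\tau, dv_\tau\rangle + \int_0^t \langle dv_\tau, dv_\tau\rangle\\
={}& \|v_0\|_2^2 - 2\int_0^t \langle \nabla u_\tau, \nabla v_\tau\rangle\,d\tau - 2\mu\int_0^t \|v_\tau\|_2^2\,d\tau
+ 2\int_0^t \Big\langle \int_0^s g(s-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla v_s(s) \Big\rangle ds\\
&+ 2\int_0^t \langle v_\tau, \kappa|u_\tau|^p u_\tau\rangle\,d\tau + 2\int_0^t \langle v_\tau, \varepsilon\sigma(x,\tau)\,dW(\tau)\rangle
+ \sum_{i=1}^\infty \int_0^t \langle \varepsilon\sigma(x,\tau)Qe_i, \varepsilon\sigma(x,\tau)e_i\rangle\,d\tau\\
={}& 2\mathcal{E}(0) - \|\nabla u_t(t)\|_2^2 - 2\mu\int_0^t \|v_\tau\|_2^2\,d\tau
+ 2\int_0^t \Big\langle \int_0^s g(s-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla v_s(s) \Big\rangle ds\\
&+ \frac{2\kappa}{p+2}\|u_t(t)\|_{p+2}^{p+2} + 2\int_0^t \langle v_\tau, \varepsilon\sigma(x,\tau)\,dW(\tau)\rangle
+ \varepsilon^2\sum_{i=1}^\infty \lambda_i\int_0^t\!\!\int_D \sigma^2(x,\tau)e_i^2(x)\,dx\,d\tau.
\end{aligned}
\tag{4.5}
\]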
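Using the condition (G1), we have
\[
\begin{aligned}
&\Big\langle \int_0^s g(s-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla v_s(s) \Big\rangle
= \int_0^s g(s-\tau)\int_D \nabla v_s(s)\cdot\nabla u_\tau(\tau)\,dx\,d\tau\\
&\quad= \int_0^s g(s-\tau)\int_D \nabla v_s(s)\cdot\big(\nabla u_\tau(\tau) - \nabla u_s(s)\big)\,dx\,d\tau + \int_0^s g(s-\tau)\int_D \nabla v_s(s)\cdot\nabla u_s(s)\,dx\,d\tau\\
&\quad= -\frac12\int_0^s g(s-\tau)\frac{d}{ds}\int_D |\nabla u_\tau(\tau) - \nabla u_s(s)|^2\,dx\,d\tau + \frac12\int_0^s g(\tau)\frac{d}{ds}\int_D |\nabla u_s(s)|^2\,dx\,d\tau\\
&\quad= \frac12\frac{d}{ds}\Big[\int_0^s g(\tau)\,d\tau\,\|\nabla u_s(s)\|_2^2 - (g \circ \nabla u_s)(s)\Big] + \frac12(g' \circ \nabla u_s)(s) - \frac12 g(s)\|\nabla u_s(s)\|_2^2\\
&\quad\le \frac12\frac{d}{ds}\Big[\int_0^s g(\tau)\,d\tau\,\|\nabla u_s(s)\|_2^2 - (g \circ \nabla u_s)(s)\Big],
\end{aligned}
\]
which implies that
\[
2\int_0^t \Big\langle \int_0^s g(s-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla v_s(s) \Big\rangle ds \le \int_0^t g(\tau)\,d\tau\,\|\nabla u_t\|_2^2 - (g \circ \nabla u_t)(t).
\tag{4.6}
\]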
Note that
\[
\mathrm{Tr}(Q) = \sum_{i=1}^\infty \lambda_i, \qquad c_0 := \sup_{i \ge 1}\|e_i\|_\infty < \infty.
\tag{4.7}
\]
Inserting (4.6) and (4.7) into (4.5) and taking the expectation of (4.5), we obtain (4.3).

Next we turn to the proof of (4.4). If $(u_t, v_t)$ is a global mild solution of system (4.1), then for each $n \ge 1$, $\{\langle u_t(t), \xi_n\rangle;\ t \ge 0\}$ is a continuous $\{\mathcal{F}_t, t \ge 0\}$-adapted finite variation process and $\{\langle v_t(t), \xi_n\rangle;\ t \ge 0\}$ is a continuous $\{\mathcal{F}_t, t \ge 0\}$-adapted semimartingale, where $\{\xi_n\}_{n \ge 1}$ is an orthonormal basis of $L^2(D)$. Then by the Itô formula
\[
\langle u_t(t), \xi_n\rangle\langle v_t(t), \xi_n\rangle = \langle u_0, \xi_n\rangle\langle u_1, \xi_n\rangle + \int_0^t \langle u_\tau(\tau), \xi_n\rangle\,d\langle v_\tau(\tau), \xi_n\rangle + \int_0^t \langle v_\tau(\tau), \xi_n\rangle\,d\langle u_\tau(\tau), \xi_n\rangle,
\]
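which implies that
\[
\begin{aligned}
\langle u_t(t), v_t(t)\rangle ={}& \langle u_0, v_0\rangle + \int_0^t \langle u_\tau(\tau), dv_\tau(\tau)\rangle + \int_0^t \langle v_\tau(\tau), du_\tau(\tau)\rangle\\
={}& \langle u_0, v_0\rangle - \int_0^t \|\nabla u_\tau(\tau)\|_2^2\,d\tau + \int_0^t \Big\langle \int_0^s g(s-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_s(s) \Big\rangle ds\\
&- \mu\int_0^t \langle v_\tau, u_\tau(\tau)\rangle\,d\tau + \int_0^t \langle u_\tau, \kappa|u_\tau|^p u_\tau\rangle\,d\tau\\
&+ \int_0^t \langle u_\tau(\tau), \varepsilon\sigma(x,\tau)\,dW(\tau)\rangle + \int_0^t \|v_\tau(\tau)\|_2^2\,d\tau.
\end{aligned}
\tag{4.8}
\]
Note that
\[
\int_0^t \langle v_\tau, u_\tau(\tau)\rangle\,d\tau = \frac12\big(\|u_t(t)\|_2^2 - \|u_0\|_2^2\big)
\tag{4.9}
\]
and
\[
\int_0^t \langle u_\tau, \kappa|u_\tau|^p u_\tau\rangle\,d\tau = \kappa\int_0^t \|u_\tau(\tau)\|_{p+2}^{p+2}\,d\tau.
\tag{4.10}
\]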
Then (4.4) follows from (4.8)–(4.10).

In the following, we discuss the explosion of the mild solution of (1.2). Actually, we have

Theorem 4.1. Assume that $0 < p \le \frac{d}{d-2} - 1$ when $d \ge 3$ and $0 < p < \infty$ when $d = 1, 2$, and (G1), (G2) and (4.2) hold. Let $u_t(t)$ be a mild solution of (1.2) with initial data $(u_0, u_1) \in H_0^1(D) \times L^2(D)$ satisfying
\[
2\mathcal{E}(0) \le -\varepsilon^2 c_0^2\int_0^\infty\!\!\int_D \sigma^2(x,t)\,dx\,dt,
\tag{4.11}
\]
and
\[
\frac{p}{2}\langle u_0, v_0\rangle > \mu\|u_0\|_2^2.
\tag{4.12}
\]
Then, for the mild solution $u_t(t)$ and the lifespan $\tau_\infty$ defined in Sec. 3 with the $L^2$ norm, either
(1) $P(\tau_\infty < \infty) > 0$, i.e. $u_t(t)$ in the $L^2$ norm blows up in finite time with positive probability, or
(2) there exists a positive time $T^* \in (0, T_0]$ such that
\[
\lim_{t \to T^*} E\|u_t(t)\|_2^2 = +\infty,
\]
with
\[
T_0 = \frac{2\|u_0\|_2^2}{p\langle u_0, v_0\rangle - 2\mu\|u_0\|_2^2}.
\]
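Proof. For the lifespan $\tau_\infty$ of the mild solution $\{u_t(t);\ t \ge 0\}$ of (1.2) with the $L^2$ norm, let us consider the case when $P(\tau_\infty = +\infty) = 1$. Then, for sufficiently large $T > 0$ we consider $F(t) : [0,T] \to \mathbb{R}_+$ defined by
\[
F(t) := E\|u_t(t)\|_2^2 + \mu E\int_0^t \|u_\tau(\tau)\|_2^2\,d\tau + \mu(T-t)\|u_0\|_2^2.
\]
From (4.4) in Lemma 4.1, we conclude that
\[
\begin{aligned}
F'(t) ={}& 2E\langle u_t(t), v_t(t)\rangle + \mu E\|u_t(t)\|_2^2 - \mu\|u_0\|_2^2\\
={}& 2E\langle u_t(t), v_t(t)\rangle + 2\mu E\int_0^t \langle u_\tau(\tau), v_\tau(\tau)\rangle\,d\tau\\
={}& 2\langle u_0(x), v_0(x)\rangle - 2\int_0^t E\|\nabla u_\tau\|_2^2\,d\tau + 2\int_0^t E\|v_\tau(\tau)\|_2^2\,d\tau\\
&+ 2E\int_0^t \Big\langle \int_0^s g(s-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_s(s) \Big\rangle ds + 2\kappa\int_0^t E\|u_\tau(\tau)\|_{p+2}^{p+2}\,d\tau,
\end{aligned}
\]
and so
\[
F''(t) = -2E\|\nabla u_t(t)\|_2^2 + 2E\|v_t(t)\|_2^2 + 2E\Big\langle \int_0^t g(t-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_t(t) \Big\rangle + 2\kappa E\|u_t(t)\|_{p+2}^{p+2}.
\]
Therefore, we have
\[
\begin{aligned}
&F(t)F''(t) - \frac{p+4}{4}(F'(t))^2\\
&\quad= 2F(t)\Big[-E\|\nabla u_t(t)\|_2^2 + E\|v_t(t)\|_2^2 + E\Big\langle \int_0^t g(t-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_t(t) \Big\rangle + \kappa E\|u_t(t)\|_{p+2}^{p+2}\Big]\\
&\qquad - (p+4)\Big[E\langle u_t(t), v_t(t)\rangle + \mu E\int_0^t \langle u_\tau(\tau), v_\tau(\tau)\rangle\,d\tau\Big]^2\\
&\quad= 2F(t)\Big[-E\|\nabla u_t(t)\|_2^2 + E\|v_t(t)\|_2^2 + E\Big\langle \int_0^t g(t-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_t(t) \Big\rangle + \kappa E\|u_t(t)\|_{p+2}^{p+2}\Big]\\
&\qquad + (p+4)\Big[H(t) - \big(F(t) - \mu(T-t)\|u_0\|_2^2\big)\Big(E\|v_t(t)\|_2^2 + \mu E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau\Big)\Big],
\end{aligned}
\tag{4.13}
\]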
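where
\[
\begin{aligned}
H(t) ={}& \Big(E\|u_t(t)\|_2^2 + \mu E\int_0^t \|u_\tau(\tau)\|_2^2\,d\tau\Big)\Big(E\|v_t(t)\|_2^2 + \mu E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau\Big)\\
&- \Big[E\langle u_t(t), v_t(t)\rangle + \mu E\int_0^t \langle u_\tau(\tau), v_\tau(\tau)\rangle\,d\tau\Big]^2.
\end{aligned}
\]
Using the Hölder and Schwarz inequalities, we have
\[
\big(E\langle u_t(t), v_t(t)\rangle\big)^2 \le E\|u_t(t)\|_2^2\,E\|v_t(t)\|_2^2,
\]
\[
\Big(E\int_0^t \langle u_\tau(\tau), v_\tau(\tau)\rangle\,d\tau\Big)^2 \le E\int_0^t \|u_\tau(\tau)\|_2^2\,d\tau\;E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau,
\]
and
\[
\begin{aligned}
E\langle u_t(t), v_t(t)\rangle\,E\int_0^t \langle u_\tau(\tau), v_\tau(\tau)\rangle\,d\tau
&\le \big(E\|u_t(t)\|_2^2\big)^{1/2}\Big(E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau\Big)^{1/2}\big(E\|v_t(t)\|_2^2\big)^{1/2}\Big(E\int_0^t \|u_\tau(\tau)\|_2^2\,d\tau\Big)^{1/2}\\
&\le \frac12 E\|u_t(t)\|_2^2\,E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau + \frac12 E\|v_t(t)\|_2^2\,E\int_0^t \|u_\tau(\tau)\|_2^2\,d\tau.
\end{aligned}
\]
These three inequalities entail $H(t) \ge 0$ for every $t \in [0,T]$. Using (4.13), we get
\[
F(t)F''(t) - \frac{p+4}{4}(F'(t))^2 \ge F(t)\Upsilon(t), \quad t \in [0,T],
\tag{4.14}
\]
where
\[
\begin{aligned}
\Upsilon(t) ={}& -2E\|\nabla u_t(t)\|_2^2 - (p+2)E\|v_t(t)\|_2^2 + 2E\Big\langle \int_0^t g(t-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_t(t) \Big\rangle\\
&+ 2\kappa E\|u_t(t)\|_{p+2}^{p+2} - (p+4)\mu E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau.
\end{aligned}
\tag{4.15}
\]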
For the third term on the right-hand side of (4.15), we have
\[
\begin{aligned}
\Big\langle \int_0^t g(t-\tau)\nabla u_\tau(\tau)\,d\tau, \nabla u_t(t) \Big\rangle
&= \int_0^t g(t-\tau)\int_D \nabla u_\tau(\tau)\cdot\nabla u_t(t)\,dx\,d\tau\\
&= \int_0^t g(t-\tau)\int_D \nabla u_t(t)\cdot\nabla\big(u_\tau(\tau) - u_t(t)\big)\,dx\,d\tau + \int_0^t g(t-\tau)\|\nabla u_t(t)\|_2^2\,d\tau\\
&= \int_0^t g(t-\tau)\int_D \nabla u_t(t)\cdot\nabla\big(u_\tau(\tau) - u_t(t)\big)\,dx\,d\tau + \int_0^t g(\tau)\,d\tau\,\|\nabla u_t(t)\|_2^2.
\end{aligned}
\tag{4.16}
\]
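Combining (4.15) with (4.16), we get
\[
\begin{aligned}
\Upsilon(t) ={}& -(p+2)E\|v_t(t)\|_2^2 - 2\Big(1 - \int_0^t g(\tau)\,d\tau\Big)E\|\nabla u_t(t)\|_2^2 + 2\kappa E\|u_t(t)\|_{p+2}^{p+2}\\
&+ 2E\int_0^t g(t-\tau)\int_D \nabla u_t(t)\cdot\nabla\big(u_\tau(\tau) - u_t(t)\big)\,dx\,d\tau - (p+4)\mu E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau\\
\ge{}& -(p+2)E\|v_t(t)\|_2^2 - 2\Big(1 - \int_0^t g(\tau)\,d\tau\Big)E\|\nabla u_t(t)\|_2^2 + 2\kappa E\|u_t(t)\|_{p+2}^{p+2}\\
&- 2E\Big[\frac{p+2}{2}\int_0^t g(t-\tau)\|\nabla u_\tau(\tau) - \nabla u_t(t)\|_2^2\,d\tau + \frac{1}{2(p+2)}\int_0^t g(\tau)\|\nabla u_t(t)\|_2^2\,d\tau\Big]\\
&- (p+4)\mu E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau\\
\ge{}& -2(p+2)E\mathcal{E}(t) + p\Big(1 - \int_0^t g(\tau)\,d\tau\Big)E\|\nabla u_t(t)\|_2^2\\
&- \frac{1}{p+2}\int_0^t g(\tau)\,d\tau\,E\|\nabla u_t(t)\|_2^2 - (p+4)\mu E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau.
\end{aligned}
\]
Using (4.3) in Lemma 4.1, (G2) and (4.11), we have
\[
\begin{aligned}
\Upsilon(t) \ge{}& -(p+2)\Big[2\mathcal{E}(0) + \varepsilon^2 c_0^2\int_0^\infty\!\!\int_D \sigma^2(x,t)\,dx\,dt\Big] + p\Big(1 - \int_0^t g(\tau)\,d\tau\Big)E\|\nabla u_t(t)\|_2^2\\
&- \frac{1}{p+2}\int_0^t g(\tau)\,d\tau\,E\|\nabla u_t(t)\|_2^2 + \mu p E\int_0^t \|v_\tau(\tau)\|_2^2\,d\tau\\
\ge{}& -(p+2)\Big[2\mathcal{E}(0) + \varepsilon^2 c_0^2\int_0^\infty\!\!\int_D \sigma^2(x,t)\,dx\,dt\Big] + p\Big[1 - \frac{(p+1)^2}{p(p+2)}\int_0^\infty g(\tau)\,d\tau\Big]E\|\nabla u_t(t)\|_2^2\\
\ge{}& 0.
\end{aligned}
\]
It follows from (4.14) that for $t \ge 0$,
\[
F(t)F''(t) - \frac{p+4}{4}(F'(t))^2 \ge 0,
\]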
and so
\[
\big(F^{-p/4}(t)\big)'' = -\frac{p}{4}F^{-p/4-2}(t)\Big[F(t)F''(t) - \frac{p+4}{4}(F'(t))^2\Big] \le 0.
\]
The Taylor expansion shows that there exists $\theta \in (0,1)$ such that
\[
F^{-p/4}(t) = F^{-p/4}(0) - \frac{p}{4}F'(0)F^{-p/4-1}(0)\,t + \frac12\big(F^{-p/4}\big)''(\theta t)\,t^2
\le F^{-p/4}(0) - \frac{p}{4}F'(0)F^{-p/4-1}(0)\,t.
\]
Therefore,
\[
F^{p/4}(t) \ge F^{p/4+1}(0)\Big[F(0) - \frac{p}{4}F'(0)\,t\Big]^{-1}.
\]
Recall that $F(0) = \|u_0\|_2^2 + \mu T\|u_0\|_2^2$ and $F'(0) = 2\langle u_0, v_0\rangle$. Then the assumption (4.12) implies that $p\langle u_0, v_0\rangle - 2\mu\|u_0\|_2^2 > 0$. Let
\[
T_0 = \frac{2\|u_0\|_2^2}{p\langle u_0, v_0\rangle - 2\mu\|u_0\|_2^2}.
\]
Then $F(t) \to +\infty$ as $t \to T_0$. This means that there exists a positive time $T^* \in (0, T_0]$ such that
\[
\lim_{t \to T^*} E\|u_t(t)\|_2^2 = +\infty.
\]
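The value of $T_0$ can be recovered from the linear bound by choosing $T = T_0$ in the definition of $F$: with $F(0) = (1 + \mu T)\|u_0\|_2^2$ and $F'(0) = 2\langle u_0, v_0\rangle$, the expression $F(0) - \frac{p}{4}F'(0)t$ vanishes at $t = 2(1 + \mu T)\|u_0\|_2^2/(p\langle u_0, v_0\rangle)$, and this equals $T$ precisely when $T = T_0$. The following sketch (function names are hypothetical) checks this numerically.

```python
def blowup_time_bound(norm_u0_sq: float, inner_u0_v0: float, p: float, mu: float) -> float:
    """T0 = 2||u0||^2 / (p <u0, v0> - 2 mu ||u0||^2), defined only under (4.12)."""
    denom = p * inner_u0_v0 - 2.0 * mu * norm_u0_sq
    if denom <= 0.0:
        raise ValueError("condition (4.12) fails: need (p/2) <u0, v0> > mu ||u0||^2")
    return 2.0 * norm_u0_sq / denom


def zero_of_linear_bound(norm_u0_sq: float, inner_u0_v0: float, p: float, mu: float, T: float) -> float:
    """Time at which F(0) - (p/4) F'(0) t vanishes, for a given horizon T."""
    F0 = (1.0 + mu * T) * norm_u0_sq   # F(0) from the definition of F
    dF0 = 2.0 * inner_u0_v0            # F'(0)
    return 4.0 * F0 / (p * dF0)


if __name__ == "__main__":
    # Hypothetical data: ||u0||^2 = 1, <u0, v0> = 3, p = 2, mu = 1.
    T0 = blowup_time_bound(1.0, 3.0, 2.0, 1.0)
    print(T0, zero_of_linear_bound(1.0, 3.0, 2.0, 1.0, T0))
```

With the sample data both printed numbers coincide, which is the fixed-point property $T = T_0$ described above.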
As for the case when $P(\tau_\infty = +\infty) < 1$ (i.e. $P(\tau_\infty < +\infty) > 0$), $u_t(t)$ in the $L^2$ norm blows up in the finite time interval $[0, \tau_\infty]$ with positive probability.

Remark 4.1. In the classical (deterministic) case of $\varepsilon = 0$, it is well known that for $(u_0, v_0) \in H_0^1(D) \times L^2(D)$, the conditions (G1) and $\mathcal{E}(0) \le 0$ already imply finite-time blowup of (1.2) (see, e.g., [2, 5]). If $\varepsilon > 0$, by our results, to balance the influence of $W(t,x)$ so that the local solution of (1.2) blows up with positive probability or is explosive in the $L^2$ sense, the initial energy should satisfy $\mathcal{E}(0) \le -\frac12\varepsilon^2 c_0^2\int_0^\infty\!\int_D \sigma^2(x,t)\,dx\,dt$.

Acknowledgments

The authors are indebted to the referee for giving some important suggestions which improved the presentation of this paper. Supported in part by the China NSF Grant Nos. 10871097, 11028102 and 11171158, National Basic Research Program of China (973 Program) No. 2007CB814800, Qing Lan Project of Jiangsu Province, the Foundation for Young Talents in College of Anhui Province Grant No. 2011SQRL115 and the program sponsored for scientific innovation research of college graduates in Jiangsu province No. 181200000649.
References

[1] M. Kafini and S. A. Messaoudi, A blow-up result in a Cauchy viscoelastic problem, Appl. Math. Lett. 21 (2008) 549–553.
[2] S. A. Messaoudi, Blow up and global existence in a nonlinear viscoelastic wave equation, Math. Nachr. 260 (2003) 58–66.
[3] S. A. Messaoudi, Blow up of positive-initial-energy solutions of a nonlinear viscoelastic hyperbolic equation, J. Math. Anal. Appl. 320 (2006) 902–915.
[4] H. T. Song and C. K. Zhong, Blow-up of solutions of a nonlinear viscoelastic wave equation, Nonlinear Anal. 11 (2010) 3877–3883.
[5] Y. J. Wang, A global nonexistence theorem for viscoelastic equations with arbitrarily positive initial energy, Appl. Math. Lett. 22 (2009) 1394–1400.
[6] L. Payne and D. Sattinger, Saddle points and instability on nonlinear hyperbolic equations, Israel J. Math. 22 (1975) 273–303.
[7] H. A. Levine and J. Serrin, Global nonexistence theorems for quasilinear evolution equation with dissipation, Arch. Ration. Mech. Anal. 137 (1997) 341–361.
[8] H. A. Levine and S. R. Park, Global existence and global nonexistence of solutions of the Cauchy problem for a nonlinearly damped wave equation, J. Math. Anal. Appl. 228 (1998) 181–205.
[9] E. Vitillaro, Global nonexistence theorems for a class of evolution equations with dissipation, Arch. Ration. Mech. Anal. 149 (1999) 155–182.
[10] Z. J. Yang, Existence and asymptotic behavior of solutions for a class of quasi-linear evolution equations with non-linear damping and source terms, Math. Methods Appl. Sci. 25 (2002) 795–814.
[11] S. A. Messaoudi and B. Said-Houari, Blow up of solutions of a class of wave equations with nonlinear damping and source terms, Math. Methods Appl. Sci. 27 (2004) 1687–1696.
[12] P. L. Chow, Stochastic wave equations with polynomial nonlinearity, Ann. Appl. Probab. 12 (2002) 361–381.
[13] P. L. Chow, Nonlinear stochastic wave equations: Blow-up of second moments in L2 norm, Ann. Appl. Probab. 19 (2009) 2039–2046.
[14] P. L. Chow, Asymptotics of solutions to semilinear stochastic wave equations, Ann. Appl. Probab. 16 (2006) 757–789.
[15] P. L. Chow, Asymptotic solutions of a nonlinear stochastic beam equation, Discrete Contin. Dyn. Syst. Ser. B 6 (2006) 735–749.
[16] Z. Brzeźniak, B. Maslowski and J. Seidler, Stochastic nonlinear beam equations, Probab. Theory Related Fields 132 (2005) 119–149.
[17] L. J. Bo, D. Tang and Y. G. Wang, Explosive solutions of stochastic wave equations with damping on Rd, J. Differential Equations 244 (2008) 170–187.
[18] T. T. Wei and Y. M. Jiang, Stochastic wave equations with memory, Chinese Ann. Math. Ser. B 31 (2010) 329–342.
[19] P. Cannarsa and D. Sforza, An existence result for semilinear equations in viscoelasticity: The case of regular kernels, in Mathematical Models and Methods for Smart Materials, Series on Advances in Mathematics for Applied Sciences, Vol. 62, eds. B. Fabrizio, B. Lazzari and A. Morro (World Scientific Publishing, Singapore, 2002), pp. 343–354.
[20] J. Prüss, Evolutionary Integral Equations and Applications, Monographs in Mathematics, Vol. 87 (Birkhäuser, Basel, 1993).
[21] G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions (Cambridge University Press, Cambridge, 1992).
September 20, J070-S0129055X11004473
2011 11:27 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 8 (2011) 903–931 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004473
SMALL GLOBAL SOLUTIONS FOR NONLINEAR COMPLEX GINZBURG–LANDAU EQUATIONS AND NONLINEAR DISSIPATIVE WAVE EQUATIONS IN SOBOLEV SPACES
MAKOTO NAKAMURA
Mathematical Institute, Tohoku University, Aoba, Sendai, 980-8578 Japan
[email protected]

Received 9 January 2011
Revised 5 September 2011

The Cauchy problems for nonlinear complex Ginzburg–Landau equations and nonlinear dissipative wave equations are considered in Sobolev spaces. The relation between the order of the nonlinear terms and the regularity of solutions is considered in terms of the scaling arguments, and the existence of local solutions and small global solutions is shown in Sobolev and Besov spaces.

Keywords: Ginzburg–Landau equations; Cauchy problems; Sobolev spaces.

Mathematics Subject Classification 2010: 35K15, 35K30, 35Q56, 35L70
1. Introduction

We consider the Cauchy problem for nonlinear complex Ginzburg–Landau equations
\[
\begin{cases}
(\partial_t - (a + ib)\Delta)u(t, x) = f(u(t, x)) & \text{for } (t, x) \in [0, \infty) \times \mathbb{R}^n, \\
u(0, \cdot) = u_0(\cdot)
\end{cases}
\tag{1.1}
\]
in Sobolev spaces $H^s(\mathbb{R}^n)$ of order $s \ge 0$, where $a \neq 0$ and $b$ are real numbers, $n \ge 1$, $i = \sqrt{-1}$, and $\Delta$ denotes the Laplacian given by $\Delta := \sum_{j=1}^{n} \partial^2/\partial x_j^2$. We note that the equations and our results cover nonlinear heat equations since we are able to take $a = 1$ and $b = 0$. When the nonlinear term $f(u)$ is given by the typical nonlinear term $|u|^p$ for $1 < p < \infty$, by the simple scaling argument $u_\lambda(t, x) = \lambda^{2/(p-1)} u(\lambda^2 t, \lambda x)$, $\lambda \in \mathbb{R}$, the relation
\[
p = 1 + \frac{4}{n - 2s}
\tag{1.2}
\]
must hold for the well-posedness of the Cauchy problem for $u_0 \in \dot H^s(\mathbb{R}^n)$. We give local and global solutions of (1.1) based on this scaling argument; in particular, we give
small global solutions under Eq. (1.2). Equation (1.2) breaks down when $s \ge n/2$, and $s = n/2$ corresponds to the critical number for the Sobolev embedding from $H^s(\mathbb{R}^n)$ to $L^\infty(\mathbb{R}^n)$. In this sense, $s = n/2$ is critical both for the scaling arguments and for the $L^\infty$ control of the nonlinear terms. One of our aims is to consider this critical case, and we give the solutions of the Cauchy problem even if the nonlinear terms have exponential growth for large $|u|$. For example, the typical nonlinear terms we treat in this paper are
\[
f(u) =
\begin{cases}
C|u|^p & \text{if } 0 \le s < n/2, \\
C|u|^p \exp(\lambda |u|^\nu) & \text{if } s = n/2, \\
C|u|^p g(u) & \text{if } s > n/2,
\end{cases}
\tag{1.3}
\]
where $C$ is any complex constant, $\lambda \ge 0$ and $0 \le \nu \le 2$ are any real constants, $1 < p \le 1 + 4/(n - 2s)$, $g$ is any fixed smooth complex-valued function on $\mathbb{R}^2$, and $f$ is required to be modified to be smooth corresponding to $s$. In particular, we show the existence of local-in-time solutions for large data and global-in-time solutions for small data in Sobolev spaces $H^s(\mathbb{R}^n)$.

This type of theory has been developed for nonlinear dispersive equations and wave equations by the use of so-called Strichartz estimates, which are constructed from the stationary phase estimates of the fundamental solutions of the linear equations (see [3]). We have shown it for nonlinear Schrödinger equations, wave equations and Klein–Gordon equations in [25], and for nonlinear Dirac equations in [18]. The aim of this paper is to show that we are also able to construct that theory for nonlinear parabolic equations based on the classical energy estimates rather than the stationary phase estimates, which enables us to treat the perturbation of the Laplacian to higher order. For that purpose, we introduce the generalization of the Laplacian and complex Ginzburg–Landau equations to higher order.

Throughout this paper, let $m$ be a positive integer, and let $\{C_\alpha\}_{|\alpha| \le m}$ be complex numbers whose real parts satisfy
\[
\operatorname{Re} C_\alpha
\begin{cases}
\ge 0 & \text{for } |\alpha| < m, \\
> 0 & \text{for } |\alpha| = m,
\end{cases}
\tag{1.4}
\]
where $\alpha$ denotes a multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$ with nonnegative integers $\alpha_1, \ldots, \alpha_n$. We put
\[
\Delta_m := -\sum_{0 \le |\alpha| \le m} C_\alpha (-1)^{|\alpha|} \partial_x^{2\alpha}
= -\sum_{0 \le |\alpha| \le m} C_\alpha (-1)^{|\alpha|} \frac{\partial^{2|\alpha|}}{\partial x_1^{2\alpha_1} \cdots \partial x_n^{2\alpha_n}}.
\tag{1.5}
\]
We note $\Delta_m = \Delta$ if $m = 1$, $C_\alpha = 0$ for $|\alpha| = 0$ and $C_\alpha = 1$ for $|\alpha| = 1$. We consider the Cauchy problem for higher order nonlinear complex Ginzburg–Landau equations
\[
\text{(CGL)} \quad
\begin{cases}
(\partial_t - \Delta_m)u(t, x) = f(u(t, x)) & \text{for } (t, x) \in [0, \infty) \times \mathbb{R}^n, \\
u(0, \cdot) = u_0(\cdot).
\end{cases}
\tag{1.6}
\]
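As a small illustration of definition (1.5), the following sketch evaluates the Fourier symbol of $\Delta_m$. It relies on the elementary fact that $\partial_x^{2\alpha} e^{ix\cdot\xi} = (-1)^{|\alpha|} \xi^{2\alpha} e^{ix\cdot\xi}$, so the symbol is $-\sum_\alpha C_\alpha \xi^{2\alpha}$. The function names and the dictionary representation of $\{C_\alpha\}$ are assumptions made only for this example.

```python
from math import prod

def symbol_delta_m(coeffs, xi):
    """Fourier symbol of Delta_m in (1.5): -sum_alpha C_alpha * xi^(2*alpha).

    coeffs: dict mapping a multi-index tuple alpha to the coefficient C_alpha
            (illustrative representation, not taken from the paper)
    xi:     tuple of real frequencies (xi_1, ..., xi_n)
    """
    return -sum(
        c * prod(x ** (2 * a) for x, a in zip(xi, alpha))
        for alpha, c in coeffs.items()
    )

def laplacian_coeffs(n):
    """Hypothetical helper: C_alpha = 1 for |alpha| = 1, i.e. the case m = 1."""
    return {tuple(int(j == k) for j in range(n)): 1.0 for k in range(n)}

# For m = 1 the symbol reduces to -|xi|^2, the symbol of the Laplacian.
print(symbol_delta_m(laplacian_coeffs(3), (1.0, 2.0, 3.0)))
```

With these coefficients the symbol is $-|\xi|^2$, in agreement with the remark that $\Delta_m = \Delta$ when $m = 1$.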
By the scaling argument for $\lambda^{2m/(p-1)} u(\lambda^{2m} t, \lambda x)$, the critical exponent for the well-posedness for $u_0 \in H^s$ is $p = 1 + 4m/(n - 2s)$.

We now describe the precise assumption for the nonlinear terms $f(u)$ with (1.3) in mind. For a function $f$ from $\mathbb{C}$ to $\mathbb{C}$, we regard $f$ as a function of $z$ and $\bar z$, and we consider the derivatives of $f$ with respect to $\partial/\partial z = (\partial/\partial x - i\partial/\partial y)/2$ and $\partial/\partial \bar z = (\partial/\partial x + i\partial/\partial y)/2$. We denote the $k$th order derivatives of $f$ by $f^{(k)}$, and the maximum of the modulus of those derivatives by $|f^{(k)}|$. For any real numbers $s \ge 0$, $p > 1$ and any nonnegative nondecreasing function $M$ on $[0, \infty)$, we say $f$ satisfies $N(s, p, M)$ if $f$ is an $[s]$ times differentiable function and there exists a positive constant $C$ such that $\{f^{(k)}\}_{k=0}^{[s]}$ satisfies the estimates
\[
|f^{(k)}(z)| \le C|z|^{\max\{p-k,0\}} M(|z|) \quad \text{for any } 0 \le k \le [s]
\tag{1.7}
\]
and
\[
|f^{([s])}(w) - f^{([s])}(z)| \le
\begin{cases}
C|w - z|^{p-[s]} M(\max\{|w|, |z|\}) & \text{if } s < p < [s] + 1, \\
C|w - z| \max\{|w|, |z|\}^{\max\{p-[s]-1,0\}} M(\max\{|w|, |z|\}) & \text{otherwise},
\end{cases}
\tag{1.8}
\]
where $[s]$ is the largest integer which is less than or equal to $s$. We note that for any complex constant $C$, the nonlinear term $C|u|^p$ [respectively $C|u|^{p-1}u$] satisfies $N(s, p, 1)$ if $s < p$ or $p$ is an even [respectively odd] integer (see [7, Lemma 2.4]).

For $s \in \mathbb{R}$ and $1 < r < \infty$, $H^{s,r}(\mathbb{R}^n)$ and $\dot H^{s,r}(\mathbb{R}^n)$ denote the Sobolev space and the homogeneous Sobolev space given by
\[
H^{s,r}(\mathbb{R}^n) = F^{-1} \langle\xi\rangle^{-s} F L^r(\mathbb{R}^n), \qquad
\dot H^{s,r}(\mathbb{R}^n) = F^{-1} |\xi|^{-s} F L^r(\mathbb{R}^n),
\tag{1.9}
\]
where $\langle\xi\rangle := \sqrt{1 + |\xi|^2}$, and $F$ and $F^{-1}$ denote the Fourier transform and its inverse. For $1 \le l \le \infty$, $B^s_{r,l}(\mathbb{R}^n)$ and $\dot B^s_{r,l}(\mathbb{R}^n)$ denote the Besov space and the homogeneous Besov space given by
\[
B^s_{r,l}(\mathbb{R}^n) := \Big\{ f \in S'(\mathbb{R}^n) \;\Big|\;
\|f\|_{B^s_{r,l}} := \Big( \|F^{-1}\psi(\xi)F f\|^l_{L^r(\mathbb{R}^n)} + \sum_{j=1}^{\infty} \|2^{sj} F^{-1}\varphi(\xi/2^j)F f\|^l_{L^r(\mathbb{R}^n)} \Big)^{1/l} < \infty \Big\}
\tag{1.10}
\]
and
\[
\dot B^s_{r,l}(\mathbb{R}^n) := \Big\{ f \in S'(\mathbb{R}^n) \;\Big|\;
\|f\|_{\dot B^s_{r,l}} := \Big( \sum_{j=-\infty}^{\infty} \|2^{sj} F^{-1}\varphi(\xi/2^j)F f\|^l_{L^r(\mathbb{R}^n)} \Big)^{1/l} < \infty \Big\},
\tag{1.11}
\]
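where $\varphi$ is a nonnegative function with its support in $\{\xi \in \mathbb{R}^n \mid 1/2 \le |\xi| \le 2\}$ which satisfies $\sum_{j=-\infty}^{\infty} \varphi(\xi/2^j) = 1$ when $\xi \neq 0$, and $\psi(\xi) := 1 - \sum_{j=1}^{\infty} \varphi(\xi/2^j)$ for any $\xi \in \mathbb{R}^n$. We refer to [2] for their properties. We use abbreviations such as $H^{s,r} = H^{s,r}(\mathbb{R}^n)$, $\dot H^{s,r} = \dot H^{s,r}(\mathbb{R}^n)$, $B^s_{r,l} = B^s_{r,l}(\mathbb{R}^n)$, $\dot B^s_{r,l} = \dot B^s_{r,l}(\mathbb{R}^n)$.

For any real numbers $q$ and $r$, we say the couple $(q, r)$ is an admissible pair for CGL if it satisfies
\[
2 \le q \le \infty, \qquad 2 \le r < \infty, \qquad \frac{1}{r} + \frac{2m}{nq} = \frac{1}{2}.
\tag{1.12}
\]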
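Condition (1.12) ties $q$ to $r$, so an admissible pair is determined by $r$ alone. The following sketch, using exact rational arithmetic, solves (1.12) for $q$ and rejects values of $r$ for which the resulting $q$ falls outside $[2, \infty]$. The function name and the convention of returning `None` for non-admissible input are assumptions of this example.

```python
from fractions import Fraction

def admissible_q(r, n, m=1):
    """Solve 1/r + 2m/(n q) = 1/2 for q, following (1.12).

    Returns q as a Fraction, float('inf') when r = 2, or None when
    the pair (q, r) would violate 2 <= q <= infinity or r >= 2.
    """
    r = Fraction(r)
    if r < 2:
        return None
    gap = Fraction(1, 2) - 1 / r           # equals 2m/(n q)
    if gap == 0:
        return float("inf")                # r = 2 pairs with q = infinity
    q = Fraction(2 * m, n) / gap
    return q if q >= 2 else None

# Example in dimension n = 3 with m = 1:
print(admissible_q(3, 3))    # q = 4
print(admissible_q(6, 3))    # endpoint q = 2, r = 2n/(n - 2m)
print(admissible_q(12, 3))   # None: q would be smaller than 2
```

The endpoint $q = 2$ corresponds to $1/r = 1/2 - m/n$, which is only available when $m < n/2$.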
For any $1 \le r \le \infty$, $r'$ denotes its conjugate with $1/r + 1/r' = 1$.

First we consider the Cauchy problem for CGL in Sobolev spaces $H^s(\mathbb{R}^n)$ with $0 \le s < n/2$.

Theorem 1.1. Let $n \ge 1$, $0 \le s < n/2$, $0 < p - 1 \le 4m/(n - 2s)$. Let $f$ satisfy $N(s, p, 1)$. Then there exists an admissible pair $(q, r)$ with the following properties. If $p - 1 < 4m/(n - 2s)$, then for any $u_0 \in H^s(\mathbb{R}^n)$, the Cauchy problem (1.6) has a unique solution in
\[
\{u \in C([0, T), H^s(\mathbb{R}^n)) \mid \|u\|_{L^\infty((0,T),H^s(\mathbb{R}^n)) \cap L^q((0,T),B^s_{r,2}(\mathbb{R}^n))} < \infty\},
\tag{1.13}
\]
where the time interval $T$ is given by
\[
T = C\|u_0\|_{\dot H^s}^{-(p-1)/\sigma}, \qquad \sigma := 1 - (p - 1)\frac{n - 2s}{4m}
\tag{1.14}
\]
for some constant $C > 0$. If $p - 1 = 4m/(n - 2s)$, then for any $u_0 \in H^s(\mathbb{R}^n)$ for which $\|u_0\|_{\dot H^s}$ is sufficiently small, the Cauchy problem (1.6) has a unique solution in (1.13) with $T = \infty$. Namely, global solutions are obtained.

Remark 1.2. In Theorem 1.1, the existence time $T$ is determined only by $\|u_0\|_{\dot H^s}$ instead of $\|u_0\|_{H^s}$ $(= \|u_0\|_{L^2 \cap \dot H^s})$. Namely, $\|u_0\|_{L^2}$ can be taken arbitrarily large for global solutions.

We consider the Cauchy problem for CGL in critical Sobolev spaces $H^{n/2}(\mathbb{R}^n)$.

Theorem 1.3. Let $n \ge 1$, $0 \le s_0 < n/2$, $n/2 - 1 < p - 1 \le 4m/(n - 2s_0)$. Let $\lambda \ge 0$ and $0 \le \nu \le 2$, and let $f$ satisfy $N(n/2, p, \exp(\lambda|\cdot|^\nu))$. Then there exists an admissible pair $(q, r)$ with the following properties. If $p - 1 < 4m/(n - 2s_0)$, then for
any $u_0 \in H^{n/2}(\mathbb{R}^n)$, provided that $\|u_0\|_{\dot H^{n/2}(\mathbb{R}^n)}$ is sufficiently small when $\nu = 2$, the Cauchy problem (1.6) has a unique solution in
\[
\{u \in C([0, T), H^{n/2}(\mathbb{R}^n)) \mid \|u\|_{L^\infty((0,T),H^{n/2}(\mathbb{R}^n)) \cap L^q((0,T),B^{n/2}_{r,2}(\mathbb{R}^n))} < \infty\},
\tag{1.15}
\]
where the time interval $T$ is given by
\[
T = C\Big\{ \sum_{j=0}^{\infty} a_j (C\|u_0\|_{\dot H^{n/2}})^{\nu j} \|u_0\|_{\dot H^{s_0}}^{p-1} \Big\}^{-1/\sigma}, \qquad
\sigma := 1 - (p - 1)\frac{n - 2s_0}{4m},
\tag{1.16}
\]
\[
a_j := \frac{\lambda^j}{j!} C^{\nu j} r(j)^{\nu j/2 + (p-1)/r(0)}, \qquad
\frac{1}{r(j)} := \frac{1}{p - 1 + \nu j}\Big(1 - \frac{2}{r_0}\Big) \quad \text{for } j \ge 0,
\tag{1.17}
\]
\[
\frac{1}{r_0} := \frac{1}{2} - \frac{p - 1}{p + 1} \cdot \frac{n - 2s_0}{2n}
\tag{1.18}
\]
for some constant $C > 0$. If $p - 1 = 4m/(n - 2s_0)$, then for any $u_0 \in H^{n/2}(\mathbb{R}^n)$ for which $\|u_0\|_{\dot H^{s_0}}$ is sufficiently small, the Cauchy problem (1.6) has a unique solution in (1.15) with $T = \infty$. Namely, global solutions are obtained.

We consider the Cauchy problem for CGL in Sobolev spaces $H^s(\mathbb{R}^n)$ with $s > n/2$.

Theorem 1.4. Let $n \ge 1$, $0 \le s_0 < n/2 < s < \infty$, and let 2m n n − 4m if m < with s0 < n − 2s0 − 2m 2 2 0
(1.19)
Let $M$ be a nonnegative nondecreasing function on $[0, \infty)$, and let $f$ satisfy $N(s, p, M)$. Then there exists an admissible pair $(q, r)$ with the following properties. If $p - 1 < 4m/(n - 2s_0)$, then for any $u_0 \in H^s(\mathbb{R}^n)$, the Cauchy problem (1.6) has a unique solution in
\[
\{u \in C([0, T), H^s(\mathbb{R}^n)) \mid \|u\|_{L^\infty((0,T),H^s(\mathbb{R}^n)) \cap L^q((0,T),B^s_{r,2}(\mathbb{R}^n))} < \infty\},
\tag{1.20}
\]
where the time interval $T$ is given by
\[
T = \big\{ C M(2C_0\|u_0\|_{\dot B^{n/2}_{2,1}}) \|u_0\|_{\dot B^{s_0}_{2,1}}^{p-1} \big\}^{-1/\sigma}, \qquad
\sigma := 1 - (p - 1)\frac{n - 2s_0}{4m}
\tag{1.21}
\]
for some constant $C > 0$. If $p - 1 = 4m/(n - 2s_0)$, then for any $u_0 \in H^s(\mathbb{R}^n)$ for which $\|u_0\|_{\dot B^{s_0}_{2,1}}$ is sufficiently small, the Cauchy problem (1.6) has a unique solution in (1.20) with $T = \infty$. Namely, global solutions are obtained.
Remark 1.5. In Theorem 1.4, the existence time $T$ is determined by $\|u_0\|_{\dot B^{n/2}_{2,1}}$ instead of $\|u_0\|_{H^s}$.

Remark 1.6. The local-in-time solutions in Theorems 1.1, 1.3 and 1.4 become global-in-time solutions if $f$ satisfies $\operatorname{Re}(\bar z f(z)) = 0$ for any $z \in \mathbb{C}$. Indeed, the linear estimates (2.2) and (2.3) give uniform bounds of the $H^s(\mathbb{R}^n)$ norm of the solutions by the $H^s(\mathbb{R}^n)$ norm of the initial data, which enable us to extend the solutions globally.

Another application of our methods is to the Cauchy problem for nonlinear dissipative wave equations
\[
\begin{cases}
(\partial_t^2 - \Delta + \partial_t)u(t, x) = f(u(t, x)) & \text{for } (t, x) \in [0, \infty) \times \mathbb{R}^n, \\
u(0, \cdot) = u_0(\cdot), \quad \partial_t u(0, \cdot) = u_1(\cdot).
\end{cases}
\tag{1.22}
\]
We generalize it to higher order
\[
\text{(DW)} \quad
\begin{cases}
(\partial_t^2 - \Delta_m + \partial_t)u(t, x) = f(u(t, x)) & \text{for } (t, x) \in [0, \infty) \times \mathbb{R}^n, \\
u(0, \cdot) = u_0(\cdot), \quad \partial_t u(0, \cdot) = u_1(\cdot).
\end{cases}
\tag{1.23}
\]
Let us consider the typical case $f(u) = |u|^p$. The scaling arguments fail for Eq. (1.23). However, when we drop the $\partial_t u$ term from (1.23) and consider the scaling $\lambda^{2m/(p-1)} u(\lambda^m t, \lambda x)$, we expect that the relation
\[
p = 1 + \frac{4m}{n - 2(m + s)}
\tag{1.24}
\]
is critical for the well-posedness for $(u_0, u_1) \in H^{m+s} \oplus H^s$, which is the same critical exponent as for the complex Ginzburg–Landau equations (1.6) for $u_0 \in H^{m+s}$. Unfortunately, our arguments below can only treat the smaller exponent $p \le 1 + \frac{2m}{n - 2(m+s)}$ instead of (1.24); however, we can consider the critical exponent $m + s = n/2$.

First, we consider the Cauchy problem for DW in Sobolev spaces $H^{m+s}(\mathbb{R}^n)$ with $m + s < n/2$.

Theorem 1.7. Let $n \ge 3$, $1 \le m < n/2$ be positive integers. Let $s \ge 0$ be a real number with $s + m < n/2$. And let
\[
p = 1 + \frac{2m}{n - 2(m + s)}.
\tag{1.25}
\]
Let $f$ satisfy $N(s, p, 1)$. Let $\sigma$ be a real number with $0 < \sigma \le 1$ and $1 - p/2 \le \sigma$. Then for any $(u_0, u_1) \in H^{m+s}(\mathbb{R}^n) \oplus H^s(\mathbb{R}^n)$, there exists a positive constant $C$ such that the Cauchy problem (1.23) has a unique solution in
\[
\{(u, \partial_t u) \in C([0, T), H^{m+s}(\mathbb{R}^n) \oplus \dot H^s(\mathbb{R}^n)) \mid \|(u, \partial_t u)\|_Z < \infty\},
\tag{1.26}
\]
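where
\[
Z := L^\infty((0, T), H^{m+s}(\mathbb{R}^n) \oplus \dot H^s(\mathbb{R}^n)) \cap L^2((0, T), \dot H^{m+s}(\mathbb{R}^n) \oplus \dot H^s(\mathbb{R}^n))
\tag{1.27}
\]
and the time interval $T$ is given by
\[
T = C\|(u_0, u_1)\|_{(\dot H^s \cap \dot H^{m+s}) \oplus \dot H^s}^{-(p-1)/\sigma}.
\tag{1.28}
\]
Moreover if $p \ge 2$ and $\|(u_0, u_1)\|_{(\dot H^s \cap \dot H^{m+s}) \oplus \dot H^s}$ is sufficiently small, then we can take $T = \infty$. Namely, global solutions are obtained.

We consider the Cauchy problem for DW in critical Sobolev spaces $H^{n/2}(\mathbb{R}^n)$.

Theorem 1.8. Let $n \ge 3$, $1 \le m < n/2$ be positive integers. Let $s_0 \ge 0$ be a real number with $s_0 + m < n/2$. And let n 2m −m
(1.30)
where
\[
Z^\mu := L^\infty((0, T), (\dot H^\mu(\mathbb{R}^n) \cap \dot H^{m+\mu}(\mathbb{R}^n)) \oplus \dot H^\mu(\mathbb{R}^n)) \cap L^2((0, T), \dot H^{m+\mu}(\mathbb{R}^n) \oplus \dot H^\mu(\mathbb{R}^n))
\tag{1.31}
\]
and the time interval $T$ is given by
\[
T = \Big\{ 2C \sum_{j=0}^{\infty} a_j \big(2C_0\|(u_0, u_1)\|_{(\dot H^{n/2-m} \cap \dot H^{n/2})(\mathbb{R}^n) \oplus \dot H^{n/2-m}(\mathbb{R}^n)}\big)^{\nu j}
\cdot \big(2C_0\|(u_0, u_1)\|_{(\dot H^{s_0} \cap \dot H^{s_0+m})(\mathbb{R}^n) \oplus \dot H^{s_0}(\mathbb{R}^n)}\big)^{p-1} \Big\}^{-1/\sigma},
\tag{1.32}
\]
\[
a_j := \frac{\lambda^j}{j!} C^{\nu j} r(j)^{\nu j/2 + (p-1)/r(0)},
\tag{1.33}
\]
\[
\frac{1}{r(j)} := \frac{1}{p - 1 + \nu j}\Big(\frac{1}{2} - \frac{1}{r_0}\Big), \qquad
\frac{1}{r_0} := \frac{1}{2} - \frac{m}{n}.
\tag{1.34}
\]
Moreover if $p \ge 2$ and $\|(u_0, u_1)\|_{(\dot H^{s_0} \cap \dot H^{s_0+m}) \oplus \dot H^{s_0}}$ is sufficiently small, we can take $T = \infty$. Namely, global solutions are obtained.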
We consider the Cauchy problem for DW in Sobolev spaces $H^{m+s}(\mathbb{R}^n)$ with $m + s > n/2$.

Theorem 1.9. Let $n \ge 3$, $1 \le m < n/2$ be positive integers. Let $s_0 \ge 0$, $s \ge 0$ be real numbers with $s_0 + m < n/2 < s + m$. And let
\[
p = 1 + \frac{2m}{n - 2(m + s_0)}.
\tag{1.35}
\]
Let $M$ be a nonnegative nondecreasing function on $[0, \infty)$, and let $f$ satisfy $N(s, p, M)$. Let $\sigma$ be a real number with $0 < \sigma \le 1$ and $1 - p/2 \le \sigma$. Then there exist positive constants $C_0$ and $C$ such that the Cauchy problem (1.23) has a unique solution in
\[
\{(u, \partial_t u) \in C([0, T), H^{m+s}(\mathbb{R}^n) \oplus H^s(\mathbb{R}^n)) \mid \|(u, \partial_t u)\|_Z < \infty\},
\tag{1.36}
\]
where
\[
\begin{aligned}
Z := {} & L^\infty\big((0, T), (\dot B^{n/2-m}_{2,1}(\mathbb{R}^n) \cap \dot B^{n/2}_{2,1}(\mathbb{R}^n) \cap H^{m+s_0}(\mathbb{R}^n)) \oplus (\dot B^{n/2-m}_{2,1}(\mathbb{R}^n) \cap \dot H^{s_0}(\mathbb{R}^n))\big) \\
& \cap L^2\big((0, T), (\dot B^{n/2}_{2,1}(\mathbb{R}^n) \cap \dot H^{m+s_0}(\mathbb{R}^n)) \oplus (\dot B^{n/2-m}_{2,1}(\mathbb{R}^n) \cap \dot H^{s_0}(\mathbb{R}^n))\big),
\end{aligned}
\tag{1.37}
\]
where the time interval $T$ is given by
\[
T = \big\{ C M\big(2C_0\|(u_0, u_1)\|_{(\dot B^{n/2-m}_{2,1} \cap \dot B^{n/2}_{2,1}) \oplus \dot B^{n/2-m}_{2,1}}\big)
\cdot \big(2C_0\|(u_0, u_1)\|_{(\dot B^{s_0}_{2,1} \cap \dot B^{s_0+m}_{2,1}) \oplus \dot B^{s_0}_{2,1}}\big)^{p-1} \big\}^{-1/\sigma}.
\tag{1.38}
\]
Moreover if $p \ge 2$ and $\|(u_0, u_1)\|_{(\dot B^{s_0}_{2,1} \cap \dot B^{s_0+m}_{2,1}) \oplus \dot B^{s_0}_{2,1}}$ is sufficiently small, then we can take $T = \infty$. Namely, we obtain global solutions.

Remark 1.10. In the critical cases of Theorems 1.3 and 1.8, the growth rate $e^{\lambda|u|^2}$ seems to be optimal in view of Trudinger's inequality (see [1, 31]).

So far, we have considered nonlinear terms of power type. We mention here that our arguments to prove the above theorems are also applicable to derivative nonlinear terms.

Theorem 1.11. Let $n \ge 1$, $m \ge 1$, $p \ge 2$ be positive integers. Let $P(u)$ be any polynomial of $u$ and $\bar u$ of order $p$. For any fixed multi-index $\alpha$ with $|\alpha| = m$, let $f(u) = \partial_x^\alpha P(u)$. Let $a$ be any fixed real number with $a \ge m + n + 2$. Then for any $u_0 \in H^a(\mathbb{R}^n)$, (1.6) has a unique solution in
\[
\{u \in C([0, \infty), H^a(\mathbb{R}^n)) \mid \|u\|_{L^\infty((0,\infty),H^a(\mathbb{R}^n)) \cap L^2((0,\infty),H^{a+m}(\mathbb{R}^n))} < \infty\}
\tag{1.39}
\]
if $\|u_0\|_{H^a(\mathbb{R}^n) \cap \dot H^{-m}(\mathbb{R}^n)}$ is sufficiently small.
Remark 1.12. In Theorem 1.11, we can also treat the nonlinear terms $f(u) = \sum_{|\alpha|=m} A_\alpha \partial_x^\alpha P(u)$ for any $\{A_\alpha\}_{|\alpha|=m} \subset \mathbb{C}$. In particular, we obtain small global solutions for Cahn–Hilliard equations
\[
\partial_t u - \Delta(-\Delta u + P(u)) = 0
\tag{1.40}
\]
when we take $m = 2$, $C_\alpha = 1$ for $\alpha = 2e_j$, $1 \le j \le n$, and $C_\alpha = 0$ otherwise, where $\{e_j\}_{j=1}^n$ are the canonical vectors in $\mathbb{R}^n$.

Remark 1.13. With respect to Theorem 1.11, Wang has already considered small global solutions for real numbers $m$ and $p$ with an additional condition which requires large $p$ for large $m$ in [37, 38].

Theorem 1.14. Let $n \ge 1$, $m \ge 1$, $p \ge 2$ be positive integers. Let $P(u)$ be any polynomial of $u$ and $\bar u$ of order $p$. For any fixed multi-index $\alpha$ with $|\alpha| = m$, let $f(u) = \partial_x^\alpha P(u)$. Let $a$ be any fixed real number with $a \ge m + n + 2$. Then for any $(u_0, u_1) \in H^{a+m}(\mathbb{R}^n) \oplus (H^a(\mathbb{R}^n) \cap \dot H^{-m}(\mathbb{R}^n))$, the Cauchy problem (1.23) has a unique global solution in
\[
\{u \in C([0, \infty), H^a(\mathbb{R}^n)) \mid \|u\|_{L^\infty((0,\infty),H^a(\mathbb{R}^n)) \cap L^2((0,\infty),H^{a+m}(\mathbb{R}^n))} < \infty\}
\tag{1.41}
\]
if $\|(u_0, u_1)\|_{H^{a+m}(\mathbb{R}^n) \oplus (H^a(\mathbb{R}^n) \cap \dot H^{-m}(\mathbb{R}^n))}$ is sufficiently small.

Remark 1.15. We can show the well-posedness of the Cauchy problem in the above eight theorems. For example, in Theorem 1.1, if the Cauchy data $\{u_j\}_{j=1}^\infty$ satisfy $u_0 = \lim_{j\to\infty} u_j$ in $L^2(\mathbb{R}^n)$, then the corresponding solutions $\{u^{(j)}\}_{j=1}^\infty$ of (1.6) satisfy
\[
\lim_{j\to\infty} \|u - u^{(j)}\|_{L^\infty((0,T),L^2(\mathbb{R}^n)) \cap L^q((0,T),L^r(\mathbb{R}^n))} = 0
\tag{1.42}
\]
for the admissible pair $(q, r)$.

There is a large literature treating the Cauchy problem for nonlinear parabolic equations. We only refer to some closely related results. The global solutions for nonlinear parabolic equations, mainly the nonlinear heat equations, have been considered in Lebesgue spaces [4, 6, 10, 41] and in Sobolev spaces [30]. In the above papers, the power type nonlinear terms $|u|^{p-1}u$ or $|u|^p$ are mainly studied in terms of the Fujita exponent $p = 1 + 2/n$. In [32], Ruf and Terraneo have considered the exponential nonlinear term $e^{u^2}$ and shown the existence of local-in-time solutions in Orlicz spaces. For the problems for the higher order Laplacian, we refer to [5, 37, 38]. For the Cauchy problems for nonlinear complex Ginzburg–Landau equations, we refer to [8, 9, 17, 36]. In some cases, the methods of the above results for nonlinear heat equations are also applicable to complex Ginzburg–Landau equations. For the Cauchy problem for nonlinear dissipative wave equations, the global existence or blow-up of solutions has been considered by many authors in terms of the Fujita exponent for power type nonlinear terms. See for example [11–14, 16, 19, 26–28, 34, 42]. For a detailed review of the history, we refer to [29] by Nishihara.

One of our aims in this paper is to clarify how the nonlinear terms are controlled by the regularity $s$ of the initial data in $H^s(\mathbb{R}^n)$ in terms of scaling arguments. In particular, we show global solutions for exponential type nonlinear terms $e^{u^2}$ for data in critical Sobolev spaces $H^{n/2}(\mathbb{R}^n)$ in a unified way for complex Ginzburg–Landau equations and dissipative wave equations based on classical energy estimates.

Throughout the paper, the notation $A \lesssim B$ denotes $A \le CB$ for some constant $C$ which is not essential in our arguments.
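The scaling heuristics used in this introduction can be checked with exact arithmetic. The sketch below assumes only the standard scaling law $\|g(\lambda\,\cdot)\|_{\dot H^s} = \lambda^{s - n/2}\|g\|_{\dot H^s}$ for $\lambda > 0$, so that the $\dot H^s$ norm of $\lambda^{2m/(p-1)} u(0, \lambda x)$ carries the factor $\lambda^{2m/(p-1) + s - n/2}$. The function names are illustrative and not taken from the paper.

```python
from fractions import Fraction

def cgl_scaling_defect(p, n, s, m=1):
    """Power of lambda gained by the homogeneous H^s norm of
    lambda^(2m/(p-1)) * u(0, lambda x); zero means scale invariance."""
    return Fraction(2 * m) / (Fraction(p) - 1) + Fraction(s) - Fraction(n, 2)

def cgl_critical_p(n, s, m=1):
    """Critical exponent p = 1 + 4m/(n - 2s) for CGL, valid for s < n/2."""
    return 1 + Fraction(4 * m) / (n - 2 * Fraction(s))

def dw_critical_p(n, s, m=1):
    """Exponent (1.24) for DW with data in H^(m+s) + H^s, valid for m + s < n/2."""
    return 1 + Fraction(4 * m) / (n - 2 * (m + Fraction(s)))

# The defect vanishes exactly at the critical exponent.
print(cgl_scaling_defect(cgl_critical_p(3, 1), 3, 1))
# (1.24) coincides with the CGL exponent for data in H^(m+s).
print(dw_critical_p(7, 1, 1) == cgl_critical_p(7, 2, 1))
```

For $m = 1$ the first function reproduces relation (1.2), and the last comparison reflects the remark that (1.24) is the same critical exponent as for (1.6) with $u_0 \in H^{m+s}$.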
2. Linear Estimates

In this section, we prepare some linear estimates to prove our theorems. Let $m \ge 1$ be any integer.

Lemma 2.1. Let $T > 0$, and let $f$ be a function on $[0, T] \times \mathbb{R}^n$. Let $u$ be the solution of the equation
\[
(\partial_t - \Delta_m)u = f.
\tag{2.1}
\]
Then the following estimates hold.

(1) We have
\[
\|u\|^2_{L^\infty((0,T),L^2)} + \|u\|^2_{L^2((0,T),\dot H^m)} \lesssim \|u(0, \cdot)\|^2_{L^2} + \|\operatorname{Re}(\bar u f)\|_{L^1((0,T)\times\mathbb{R}^n)}.
\tag{2.2}
\]

(2) For any admissible pair $(q, r)$ for CGL, we have
\[
\|u\|^2_{L^\infty((0,T),L^2)} + \|u\|^2_{L^q((0,T),L^r)} \lesssim \|u(0, \cdot)\|^2_{L^2} + \|\operatorname{Re}(\bar u f)\|_{L^1((0,T)\times\mathbb{R}^n)}.
\tag{2.3}
\]

(3) For any $s \in \mathbb{R}$ and any admissible pairs $(q, r)$ and $(q_0, r_0)$ for CGL, we have
\[
\|u\|_{L^\infty((0,T),\dot H^s)} + \|u\|_{L^q((0,T),\dot B^s_{r,2})} \lesssim \|u(0, \cdot)\|_{\dot H^s} + \|f\|_{L^{q_0'}((0,T),\dot B^s_{r_0',2})},
\tag{2.4}
\]
where we can replace $\dot H^s$, $\dot B^s_{r,2}$, $\dot B^s_{r_0',2}$ with $H^s$, $B^s_{r,2}$, $B^s_{r_0',2}$, respectively.

(4) For any admissible pair $(q, r)$ for CGL and any real numbers $\mu$ and $1 \le \eta \le 2$, we have
\[
\|u\|_{L^\infty((0,T),\dot B^\mu_{2,\eta})} + \|u\|_{L^q((0,T),\dot B^\mu_{r,\eta})} \lesssim \|u(0, \cdot)\|_{\dot B^\mu_{2,\eta}} + \|f\|_{L^1((0,T),\dot B^\mu_{2,\eta})},
\tag{2.5}
\]
where we can replace $\dot B^\mu_{2,\eta}$, $\dot B^\mu_{r,\eta}$ with $B^\mu_{2,\eta}$, $B^\mu_{r,\eta}$, respectively.
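Estimate (2.2) can be illustrated in the simplest setting. The sketch below makes several assumptions that are not part of the lemma: it takes $f = 0$, $m = 1$ with $\Delta_m = \Delta$ in one space dimension, and replaces $\mathbb{R}$ by finitely many Fourier modes $k$, so that each mode evolves as $e^{-tk^2}$ and norms are plain sums over modes. In that setting the energy balance $\|u(T)\|_{L^2}^2 + 2\int_0^T \|\partial_x u\|_{L^2}^2\,dt = \|u(0)\|_{L^2}^2$ should hold up to quadrature error.

```python
import math

def energy_balance(u0_hat, T, steps=20000):
    """Return (lhs, rhs) of the energy balance for u_t = u_xx.

    u0_hat: dict {k: amplitude} of initial Fourier coefficients
            (a finite-mode toy model, an assumption of this sketch)
    lhs = ||u(T)||^2 + 2 * int_0^T ||u_x(t)||^2 dt  (midpoint rule in time)
    rhs = ||u(0)||^2
    """
    def l2_sq(t):
        return sum(abs(a) ** 2 * math.exp(-2 * t * k * k) for k, a in u0_hat.items())

    def grad_sq(t):
        return sum(k * k * abs(a) ** 2 * math.exp(-2 * t * k * k) for k, a in u0_hat.items())

    dt = T / steps
    dissipation = dt * sum(grad_sq((i + 0.5) * dt) for i in range(steps))
    return l2_sq(T) + 2 * dissipation, l2_sq(0.0)

lhs, rhs = energy_balance({1: 1.0, 2: 0.5, 3: -0.25}, T=1.0)
print(abs(lhs - rhs))  # small: only the time quadrature error remains
```

Both terms on the left-hand side of (2.2) are therefore controlled by the initial $L^2$ norm in this toy case, which is the content of the estimate when $f = 0$.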
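Proof of Lemma 2.1. The proof of (1) follows from classical energy estimates. First, we note that for any functions $u$ and $v$ and any multi-index $\alpha$, we have
\[
u\,\partial^{2\alpha} v = \sum_{k=1}^{|\alpha|} (-1)^{k-1} \partial^{\delta_k}\big(\partial^{\delta_1+\cdots+\delta_{k-1}} u\; \partial^{\delta_1+\cdots+\delta_k+2(\delta_{k+1}+\cdots+\delta_{|\alpha|})} v\big) + (-1)^{|\alpha|} \partial^\alpha u\, \partial^\alpha v,
\tag{2.6}
\]
where $\{\delta_k\}$ are multi-indices with $\alpha = \delta_1 + \cdots + \delta_{|\alpha|}$, $|\delta_k| = 1$ for $1 \le k \le |\alpha|$. We multiply Eq. (2.1) by $\bar u$ and use (2.6) to obtain
\[
\bar u\,\partial_t u + \sum_\alpha C_\alpha I_\alpha + \sum_\alpha C_\alpha |\partial^\alpha u|^2 = \bar u f,
\tag{2.7}
\]
where we have put
\[
I_\alpha := (-1)^{|\alpha|} \sum_{k=1}^{|\alpha|} (-1)^{k-1} \partial^{\delta_k}\big(\partial^{\delta_1+\cdots+\delta_{k-1}} \bar u\; \partial^{\delta_1+\cdots+\delta_k+2(\delta_{k+1}+\cdots+\delta_{|\alpha|})} u\big).
\tag{2.8}
\]
Taking the real part, using $2\operatorname{Re}(\bar u\,\partial_t u) = \partial_t |u|^2$, and integrating over the spatial variables, we have
\[
\partial_t \|u\|^2_{L^2} + 2 \sum_{|\alpha| \le m} (\operatorname{Re} C_\alpha) \|\partial^\alpha u\|^2_{L^2} = 2 \int_{\mathbb{R}^n} \operatorname{Re}(\bar u f)\,dx,
\tag{2.9}
\]
where we have used the divergence theorem. Integrating in the time variable, we obtain the required result by the condition (1.4).

The proof of (2) follows from (1) and the convexity inequality
\[
\|u\|_{L^q L^r} \lesssim \|u\|^{1-2/q}_{L^\infty L^2} \|u\|^{2/q}_{L^2 \dot H^m}.
\tag{2.10}
\]

For the proof of (3), by (2) and the bound
\[
\|\operatorname{Re}(\bar u f)\|_{L^1((0,T)\times\mathbb{R}^n)} \le \varepsilon \|u\|^2_{L^{q_0} L^{r_0}} + \frac{1}{4\varepsilon} \|f\|^2_{L^{q_0'} L^{r_0'}}
\tag{2.11}
\]
for $\varepsilon > 0$, we have
\[
\|u\|_{L^\infty L^2} + \|u\|_{L^q L^r} \lesssim \|u(0, \cdot)\|_{L^2} + \|f\|_{L^{q_0'} L^{r_0'}}.
\tag{2.12}
\]
By the definition of the Besov spaces and the Minkowski inequality, we obtain the required results.

The proof of (4) follows from (2), the Schwarz inequality
\[
\|\operatorname{Re}(\bar u f)\|_{L^1((0,T)\times\mathbb{R}^n)} \le \varepsilon \|u\|^2_{L^\infty L^2} + \frac{1}{4\varepsilon} \|f\|^2_{L^1 L^2}
\tag{2.13}
\]
for $\varepsilon > 0$, and the definition of the Besov spaces with the Minkowski inequality since $1 \le \eta \le 2 \le q, r$.

Lemma 2.2. Let $T > 0$, and let $f$ be a function on $[0, T] \times \mathbb{R}^n$. Let $u$ be the solution of the equation
\[
(\partial_t - \Delta_m + \partial_t^2)u = f.
\tag{2.14}
\]
Then the following estimates hold.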
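(1) We have
\[
\begin{aligned}
& \|u\|^2_{L^\infty((0,T),H^m)} + \|u\|^2_{L^2((0,T),\dot H^m)} + \|\partial_t u\|^2_{L^\infty((0,T),L^2)} + \|\partial_t u\|^2_{L^2((0,T),L^2)} \\
& \quad \lesssim \|u(0, \cdot)\|^2_{H^m} + \|\partial_t u(0, \cdot)\|^2_{L^2} + \|\operatorname{Re}(\bar u f)\|_{L^1((0,T)\times\mathbb{R}^n)} + \|\operatorname{Re}((\partial_t \bar u) f)\|_{L^1((0,T)\times\mathbb{R}^n)}.
\end{aligned}
\tag{2.15}
\]

(2) For any real number $\mu$ and $1 \le l \le 2$, we have
\[
\begin{aligned}
& \|u\|_{L^\infty \dot B^\mu_{2,l}} + \|u\|_{L^\infty \dot B^{m+\mu}_{2,l}} + \|u\|_{L^2 \dot B^{m+\mu}_{2,l}} + \|\partial_t u\|_{L^\infty \dot B^\mu_{2,l}} + \|\partial_t u\|_{L^2 \dot B^\mu_{2,l}} \\
& \quad \lesssim \|u(0, \cdot)\|_{\dot B^\mu_{2,l} \cap \dot B^{m+\mu}_{2,l}} + \|\partial_t u(0, \cdot)\|_{\dot B^\mu_{2,l}} + \|f\|_{L^1 \dot B^\mu_{2,l}}.
\end{aligned}
\tag{2.16}
\]

Proof of Lemma 2.2. (1) The proof follows from classical energy estimates. Multiplying Eq. (2.14) by $\partial_t \bar u$, we have
\[
2|\partial_t u|^2 + \partial_t |\partial_t u|^2 + \partial_t \sum_\alpha C_\alpha |\partial^\alpha u|^2 + \sum_\alpha C_\alpha\, 2\operatorname{Re} I_\alpha = 2\operatorname{Re}((\partial_t \bar u) f),
\tag{2.17}
\]
where we have used (2.6) and put
\[
I_\alpha := (-1)^{|\alpha|} \sum_{k=1}^{|\alpha|} (-1)^{k-1} \partial^{\delta_k}\big(\partial^{\delta_1+\cdots+\delta_{k-1}} \partial_t \bar u\; \partial^{\delta_1+\cdots+\delta_k+2(\delta_{k+1}+\cdots+\delta_{|\alpha|})} u\big).
\tag{2.18}
\]
Integrating over the spatial and time variables, we obtain
\[
\begin{aligned}
& \|\partial_t u(t, \cdot)\|^2_{L^2} + \sum_\alpha C_\alpha \|\partial^\alpha u(t, \cdot)\|^2_{L^2} + 2\|\partial_t u\|^2_{L^2_t L^2_x} \\
& \quad = \|\partial_t u(0, \cdot)\|^2_{L^2} + \sum_\alpha C_\alpha \|\partial^\alpha u(0, \cdot)\|^2_{L^2} + 2\int_0^t \int_{\mathbb{R}^n} \operatorname{Re}((\partial_t \bar u) f)\,dx\,dt.
\end{aligned}
\tag{2.19}
\]
Next, multiplying Eq. (2.14) by $\bar u$, we have
\[
\partial_t |u|^2 + \partial_t (2\operatorname{Re}(\bar u\,\partial_t u)) - 2|\partial_t u|^2 + \sum_\alpha C_\alpha\, 2\operatorname{Re} J_\alpha + 2\sum_\alpha C_\alpha |\partial^\alpha u|^2 = 2\operatorname{Re}(\bar u f),
\tag{2.20}
\]
where we have put
\[
J_\alpha := (-1)^{|\alpha|} \sum_{k=1}^{|\alpha|} (-1)^{k-1} \partial^{\delta_k}\big(\partial^{\delta_1+\cdots+\delta_{k-1}} \bar u\; \partial^{\delta_1+\cdots+\delta_k+2(\delta_{k+1}+\cdots+\delta_{|\alpha|})} u\big).
\tag{2.21}
\]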
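Integrating over the spatial and time variables, we obtain
\[
\begin{aligned}
& \|u(t, \cdot)\|^2_{L^2} + \int 2\operatorname{Re}(\bar u\,\partial_t u)(t, x)\,dx - 2\|\partial_t u\|^2_{L^2_t L^2_x} + 2\sum_\alpha C_\alpha \|\partial^\alpha u\|^2_{L^2_t L^2_x} \\
& \quad = \|u(0, \cdot)\|^2_{L^2} + \int 2\operatorname{Re}(\bar u\,\partial_t u)(0, x)\,dx + \int_0^t \int 2\operatorname{Re}(\bar u f)\,dx\,dt.
\end{aligned}
\tag{2.22}
\]
So that, by the Schwarz inequality, we have
\[
\|u\|^2_{L^\infty_t L^2_x} + 2\sum_\alpha C_\alpha \|\partial^\alpha u\|^2_{L^2_t L^2_x} \lesssim \|u(0, \cdot)\|^2_{L^2} + \|\operatorname{Re}(\bar u f)\|_{L^1_t L^1_x} + \|\partial_t u\|^2_{L^\infty_t L^2_x} + \|\partial_t u\|^2_{L^2_t L^2_x}.
\tag{2.23}
\]
Combining (2.19) and (2.23), we obtain the required result.

(2) By (1) and the Schwarz inequality, we have
\[
\|u\|_{L^\infty H^m} + \|u\|_{L^2 \dot H^m} + \|\partial_t u\|_{L^\infty L^2} + \|\partial_t u\|_{L^2 L^2} \lesssim \|u(0, \cdot)\|_{H^m} + \|\partial_t u(0, \cdot)\|_{L^2} + \|f\|_{L^1 L^2}.
\tag{2.24}
\]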
Using the definition of Besov spaces and the Minkowski inequality, we obtain the required result.

3. Proof of Theorems 1.1, 1.3 and 1.4

We prove Theorems 1.1, 1.3 and 1.4 in this section. We rewrite the differential equation (1.6) as the integral equation
\[
u(t) = \Phi(u)(t) := U(t)u_0 + \int_0^t U(t - \tau) f(u(\tau, \cdot))\,d\tau
\tag{3.1}
\]
for $t \ge 0$, where $U(t) := e^{t\Delta_m}$. We show that the operator $\Phi$ becomes a contraction map on some complete metric spaces. The solutions in our theorems are given as the fixed points of $\Phi$.

3.1. Proof of Theorem 1.1

We prepare some estimates for nonlinear terms.

Lemma 3.1. Let $m \ge 1$ be an integer. Let $0 \le s < n/2$, $0 < p - 1 \le 4m/(n - 2s)$ be real numbers. Let $f$ satisfy $N(s, p, 1)$. Let
\[
\sigma = 1 - (p - 1)\frac{n - 2s}{4m}.
\tag{3.2}
\]
Then there exist admissible pairs $(q_0, r_0)$, $(q, r)$ such that $q_0 \neq \infty$, $q \neq \infty$,

(1)
\[
\|f(u)\|_{L^{q_0'}((0,T),\dot B^\mu_{r_0',2})} \lesssim T^\sigma \|u\|^{p-1}_{L^q((0,T),\dot B^s_{r,2})} \|u\|_{L^{q_0}((0,T),\dot B^\mu_{r_0,2})}
\tag{3.3}
\]
for any $0 < \mu \le s$, and

(2)
\[
\|f(u) - f(v)\|_{L^{q_0'}((0,T),L^{r_0'})} \lesssim T^\sigma \max_{w=u,v} \|w\|^{p-1}_{L^q((0,T),\dot B^s_{r,2})} \|u - v\|_{L^{q_0}((0,T),L^{r_0})}.
\tag{3.4}
\]
Proof. (1) We take $r_0$ such that
\[
\frac{1}{2} - (p - 1)\frac{n - 2s}{4n} < \frac{1}{r_0} < \frac{1}{2}, \qquad 0 < \frac{1}{r_0}.
\tag{3.5}
\]
When $m < n/2$, moreover we assume
\[
\frac{1}{2} - \frac{m}{n} < \frac{1}{r_0} \le \frac{1}{2} - (p - 1)\frac{n - 2s - 2m}{4n}.
\tag{3.6}
\]
We can take such $r_0$ by $p - 1 \le 4m/(n - 2s)$. We put $1/r := 2(1/2 - 1/r_0)/(p - 1) + s/n$, and we take $q_0$ and $q$ such that $(q_0, r_0)$ and $(q, r)$ are admissible pairs. We start from the fact
\[
\|f(u)\|_{\dot B^\mu_{r_0',2}} \lesssim \|u\|^{p-1}_{\dot B^0_{r_*,2}} \|u\|_{\dot B^\mu_{r_0,2}},
\tag{3.7}
\]
which has been shown in [22, Proposition 2.1], where $1/r_* := 2(1/2 - 1/r_0)/(p - 1) = 1/r - s/n$. By the Sobolev embedding theorem $\dot B^s_{r,2} \hookrightarrow \dot B^0_{r_*,2}$ and the Hölder inequality for the time variable, we obtain the required result.

(2) This follows easily from
\[
|f(u) - f(v)| \lesssim \max_{w=u,v} |w|^{p-1} |u - v|
\tag{3.8}
\]
and the Hölder inequality and the Sobolev embedding.
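The exponent bookkeeping in the proof of Lemma 3.1 can be verified with exact rationals. The sketch below computes $1/r_*$, $1/r$, $1/q_0$, $1/q$ and $\sigma$ from a chosen value of $1/r_0$ and checks the two Hölder relations that the proof relies on: $1/r_0' = (p-1)/r_* + 1/r_0$ in space and $1/q_0' = \sigma + (p-1)/q + 1/q_0$ in time. The function name, the dictionary layout and the sample parameters are assumptions of this example.

```python
from fractions import Fraction

def lemma31_exponents(n, m, s, p, inv_r0):
    """Reciprocal exponents from the proof of Lemma 3.1.

    inv_r0 is 1/r0, assumed to be chosen according to (3.5)
    (and (3.6) when m < n/2).
    """
    n, m, s, p, inv_r0 = map(Fraction, (n, m, s, p, inv_r0))
    half = Fraction(1, 2)
    inv_rstar = 2 * (half - inv_r0) / (p - 1)
    inv_r = inv_rstar + s / n
    inv_q0 = n / (2 * m) * (half - inv_r0)   # admissibility (1.12) for (q0, r0)
    inv_q = n / (2 * m) * (half - inv_r)     # admissibility (1.12) for (q, r)
    sigma = 1 - (p - 1) * (n - 2 * s) / (4 * m)
    return {"inv_rstar": inv_rstar, "inv_r": inv_r,
            "inv_q0": inv_q0, "inv_q": inv_q, "sigma": sigma}

# Sample: n = 3, m = 1, s = 1, p = 3, 1/r0 = 5/12 (inside the range (3.5)).
e = lemma31_exponents(3, 1, 1, 3, Fraction(5, 12))
p, inv_r0 = 3, Fraction(5, 12)
print(1 - inv_r0 == (p - 1) * e["inv_rstar"] + inv_r0)                          # Hoelder in space
print(1 - e["inv_q0"] == e["sigma"] + (p - 1) * e["inv_q"] + e["inv_q0"])       # Hoelder in time
```

The time relation is exactly what produces the factor $T^\sigma$ in (3.3) and (3.4).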
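We put $X := L^\infty L^2 \cap L^{q_0} L^{r_0} \cap L^q L^r$, $X^s := L^\infty \dot H^s \cap L^{q_0} \dot B^s_{r_0,2} \cap L^q \dot B^s_{r,2}$. By Lemmas 2.1 and 3.1, we have
\[
\begin{aligned}
\|\Phi(u)\|_X & \lesssim \|u_0\|_{L^2} + \|f(u)\|_{L^{q_0'} L^{r_0'}} \\
& \lesssim \|u_0\|_{L^2} + T^\sigma \|u\|^{p-1}_{L^q \dot B^s_{r,2}} \|u\|_{L^{q_0} L^{r_0}} \\
& \lesssim \|u_0\|_{L^2} + T^\sigma \|u\|^{p-1}_{X^s} \|u\|_X.
\end{aligned}
\]
Similarly, we also have
\[
\begin{aligned}
\|\Phi(u)\|_{X^s} & \lesssim \|u_0\|_{\dot H^s} + \|f(u)\|_{L^{q_0'} \dot B^s_{r_0',2}} \\
& \lesssim \|u_0\|_{\dot H^s} + T^\sigma \|u\|^{p-1}_{L^q \dot B^s_{r,2}} \|u\|_{L^{q_0} \dot B^s_{r_0,2}} \\
& \lesssim \|u_0\|_{\dot H^s} + T^\sigma \|u\|^{p-1}_{X^s} \|u\|_{X^s}
\end{aligned}
\]
and
\[
\begin{aligned}
\|\Phi(u) - \Phi(v)\|_X & \lesssim \|f(u) - f(v)\|_{L^{q_0'} L^{r_0'}} \\
& \lesssim T^\sigma \max_{w=u,v} \|w\|^{p-1}_{L^q \dot B^s_{r,2}} \|u - v\|_{L^{q_0} L^{r_0}} \\
& \lesssim T^\sigma \max_{w=u,v} \|w\|^{p-1}_{X^s} \|u - v\|_X.
\end{aligned}
\tag{3.9}
\]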
Therefore $\Phi$ becomes a contraction map on
\[
\{u \in L^\infty H^s \mid \|u\|_X \le C_0\|u_0\|_{L^2},\ \|u\|_{X^s} \le C_0\|u_0\|_{\dot H^s}\}
\tag{3.10}
\]
for some positive constant $C_0$ and suitable $T > 0$, where the metric is given by the $X$ norm. Moreover we can take $T = \infty$ when $\sigma = 0$ and $\|u_0\|_{\dot H^s}$ is sufficiently small.

We have obtained the solution $u$ as the fixed point of $\Phi$ in (3.10). We can show that $u$ is in $C([0, T), H^s)$ since $U(\cdot)$ is an operator from $H^s$ to $C([0, \infty), H^s)$. The uniqueness of the solutions in (1.13) follows from (3.9). Indeed, if $u$ and $v$ are fixed points of $\Phi$ in (1.13), then (3.9) shows that
\[
\|u - v\|_{L^{q_0}((t_0,t_0+\varepsilon),L^{r_0})} \le C\varepsilon^\sigma \max_{w=u,v} \|w\|^{p-1}_{L^q((t_0,t_0+\varepsilon),\dot B^s_{r,2})} \|u - v\|_{L^{q_0}((t_0,t_0+\varepsilon),L^{r_0})}
\tag{3.11}
\]
for any $t_0 \ge 0$ and $\varepsilon > 0$ if $u(t_0) = v(t_0)$. So that, taking $\varepsilon$ sufficiently small, we obtain $u = v$ on the interval $(t_0, t_0 + \varepsilon)$ since $q \neq \infty$. The iteration argument shows $u = v$ on $[0, T)$.

3.2. Proof of Theorem 1.3

Lemma 3.2. Let $m \ge 1$ be an integer. Let $0 \le s_0 < n/2$, $n/2 - 1 < p - 1 \le 4m/(n - 2s_0)$. Let $\lambda \ge 0$ and $\nu \ge 0$. Let $f$ satisfy $N(n/2, p, \exp(\lambda|\cdot|^\nu))$. Let
\[
\sigma = 1 - (p - 1)\frac{n - 2s_0}{4m}.
\tag{3.12}
\]
Then there exist an admissible pair $(q_0, r_0)$ and a constant $C > 0$ such that $q_0 \neq \infty$,

(1)
\[
\|f(u)\|_{L^{q_0'}((0,T),\dot B^\mu_{r_0',2})} \lesssim T^\sigma \sum_{j=0}^{\infty} a_j \|u\|^{\nu j}_{L^\infty((0,T),\dot H^{n/2})} \cdot \|u\|^{p-1}_{L^{q_0}((0,T),L^{r_0} \cap \dot B^{s_0}_{r_0,2})} \|u\|_{L^{q_0}((0,T),\dot B^\mu_{r_0,2})}
\tag{3.13}
\]
for any $0 < \mu \le n/2$, where
\[
a_j := \frac{\lambda^j}{j!} C^{\nu j} r(j)^{\nu j/2 + (p-1)/r(0)},
\tag{3.14}
\]
\[
\lim_{j\to\infty} \frac{a_{j+1} \|u\|^{\nu(j+1)}_{L^\infty \dot H^{n/2}}}{a_j \|u\|^{\nu j}_{L^\infty \dot H^{n/2}}} =
\begin{cases}
0 & \text{if } \nu < 2, \\
\lambda C^2 e\, 2(1 - 2/r_0)^{-1} \|u\|^2_{L^\infty \dot H^{n/2}} & \text{if } \nu = 2, \\
\infty & \text{if } \nu > 2,
\end{cases}
\tag{3.15}
\]
and

(2)
\[
\|f(u) - f(v)\|_{L^{q_0'}((0,T),L^{r_0'})} \lesssim T^\sigma \sum_{j=0}^{\infty} a_j \max_{w=u,v} \|w\|^{\nu j}_{L^\infty((0,T),\dot H^{n/2})} \cdot \max_{w=u,v} \|w\|^{p-1}_{L^{q_0}((0,T),\dot B^{s_0}_{r_0,2})} \|u - v\|_{L^{q_0}((0,T),L^{r_0})}.
\tag{3.16}
\]
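The ratio limit (3.15) governs the convergence of the series in (3.13) and can be observed numerically. The sketch below evaluates the ratio of consecutive terms through logarithms to avoid overflow, using $r(j) = (p - 1 + \nu j)/\kappa$ with $\kappa := 1 - 2/r_0$ as in (1.17). The function names and the sample parameters ($\lambda = C = 1$, $p = 3$, $\kappa = 1/6$, unit norm) are assumptions of this example, and it requires $\lambda > 0$.

```python
import math

def log_a(j, lam, C, nu, p, kappa):
    """log a_j for a_j from (3.14); kappa stands for 1 - 2/r0."""
    r_j = (p - 1 + nu * j) / kappa
    r_0 = (p - 1) / kappa
    return (j * math.log(lam) - math.lgamma(j + 1) + nu * j * math.log(C)
            + (nu * j / 2 + (p - 1) / r_0) * math.log(r_j))

def series_ratio(j, norm, lam, C, nu, p, kappa):
    """Ratio a_{j+1} norm^(nu (j+1)) / (a_j norm^(nu j))."""
    return math.exp(log_a(j + 1, lam, C, nu, p, kappa)
                    - log_a(j, lam, C, nu, p, kappa)
                    + nu * math.log(norm))

kappa = 1 / 6
limit_nu2 = 1.0 * 1.0 ** 2 * math.e * 2 / kappa          # value in (3.15) for nu = 2
print(series_ratio(10**5, 1.0, 1.0, 1.0, 2, 3, kappa), limit_nu2)
print(series_ratio(10**6, 1.0, 1.0, 1.0, 1, 3, kappa))   # tends to 0 for nu < 2
```

For $\nu = 2$ the series therefore converges only when the limit in (3.15) is smaller than one, which is the origin of the smallness condition on $\|u_0\|_{\dot H^{n/2}}$ in Theorem 1.3.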
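Proof. (1) Let $r(j)$ and $r_0$ be given by (1.17) and (1.18). By Proposition 3.1 in [24], we have
\[
\|f(u)\|_{\dot B^\mu_{r_0',2}} \lesssim \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} \|u\|^{p-1+\nu j}_{L^{r(j)} \cap \dot B^0_{r(j),2}} \|u\|_{\dot B^\mu_{r_0,2}}.
\tag{3.17}
\]
We use [21, Lemma 2.2] to obtain
\[
\|u\|_{L^{r(j)} \cap \dot B^0_{r(j),2}} \lesssim r(j)^{1/2+(r(0)-2)/2r(j)} \|u\|^{1-r(0)/r(j)}_{\dot H^{n/2}} \|u\|^{r(0)/r(j)}_{L^{r(0)} \cap \dot B^0_{r(0),2}}.
\tag{3.18}
\]
So that, we obtain
\[
\|f(u)\|_{\dot B^\mu_{r_0',2}} \lesssim \sum_{j=0}^{\infty} a_j \|u\|^{\nu j}_{\dot H^{n/2}} \|u\|^{p-1}_{\dot B^{s_0}_{r_0,2}} \|u\|_{\dot B^\mu_{r_0,2}},
\tag{3.19}
\]
where we have used the Sobolev embedding $\dot B^{s_0}_{r_0,2} \hookrightarrow L^{r(0)} \cap \dot B^0_{r(0),2}$. Taking the $L^{q_0'}$ norm for the time variable and using the Hölder inequality, we obtain the required result. The last statement follows from a direct computation by the definition of $a_j$.

(2) We use
\[
|f(u) - f(v)| \lesssim \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} \max_{w=u,v} |w|^{p-1+\nu j} |u - v|.
\tag{3.20}
\]
Taking the $L^{q_0'} L^{r_0'}$ norm and using the same estimates as above, we obtain the required result.

We put $X := L^\infty L^2 \cap L^{q_0} L^{r_0}$, $X^\mu := L^\infty \dot H^\mu \cap L^{q_0} \dot B^\mu_{r_0,2}$ for $\mu = s_0, n/2$. By Lemmas 2.1 and 3.2, we have
\[
\begin{aligned}
\|\Phi(u)\|_{X^\mu} & \lesssim \|u_0\|_{\dot H^\mu} + \|f(u)\|_{L^{q_0'} \dot B^\mu_{r_0',2}} \\
& \lesssim \|u_0\|_{\dot H^\mu} + T^\sigma \sum_{j=0}^{\infty} a_j \|u\|^{\nu j}_{L^\infty \dot H^{n/2}} \|u\|^{p-1}_{L^{q_0} \dot B^{s_0}_{r_0,2}} \|u\|_{L^{q_0} \dot B^\mu_{r_0,2}} \\
& \lesssim \|u_0\|_{\dot H^\mu} + T^\sigma \sum_{j=0}^{\infty} a_j \|u\|^{\nu j}_{X^{n/2}} \|u\|^{p-1}_{X^{s_0}} \|u\|_{X^\mu}
\end{aligned}
\]
for $\mu = s_0, n/2$. Similarly, we also have
\[
\begin{aligned}
\|\Phi(u) - \Phi(v)\|_X & \lesssim \|f(u) - f(v)\|_{L^{q_0'} L^{r_0'}} \\
& \lesssim T^\sigma \sum_{j=0}^{\infty} a_j \max_{w=u,v} \|w\|^{\nu j}_{L^\infty \dot H^{n/2}} \max_{w=u,v} \|w\|^{p-1}_{L^{q_0} \dot B^{s_0}_{r_0,2}} \|u - v\|_{L^{q_0} L^{r_0}} \\
& \lesssim T^\sigma \sum_{j=0}^{\infty} a_j \max_{w=u,v} \|w\|^{\nu j}_{X^{n/2}} \max_{w=u,v} \|w\|^{p-1}_{X^{s_0}} \|u - v\|_X.
\end{aligned}
\]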
Therefore $\Phi$ becomes a contraction map on
\[
\{u \in L^\infty H^{n/2} \mid \|u\|_X \le C_0\|u_0\|_{L^2},\ \|u\|_{X^\mu} \le C_0\|u_0\|_{\dot H^\mu} \text{ for } \mu = s_0, n/2\}
\tag{3.21}
\]
for some positive constant $C_0$ and suitable $T > 0$, where the metric is given by the $X$ norm. We note that we must take $\|u_0\|_{\dot H^{n/2}}$ sufficiently small when $\nu = 2$ since it is required in Lemma 3.2 for the convergence of $\sum_{j=0}^{\infty} a_j \|u\|^{\nu j}_{L^\infty \dot H^{n/2}}$. Moreover we can take $T = \infty$ when $\sigma = 0$ and $\|u_0\|_{\dot H^{s_0}}$ is sufficiently small.

3.3. Proof of Theorem 1.4

Lemma 3.3. Let $m \ge 1$ be an integer. Let $0 \le s_0 < n/2 < s < \infty$, and let $p$ satisfy (1.19). Let $M$ be any fixed nonnegative nondecreasing function on $[0, \infty)$, and let $f$ satisfy $N(s_0, p, M)$. Let
\[
\sigma = 1 - (p - 1)\frac{n - 2s_0}{4m}.
\tag{3.22}
\]
Then there exist admissible pairs $(q_0, r_0)$, $(q, r)$ such that $q_0 \neq \infty$, $q \neq \infty$,

(1)
\[
\|f(u)\|_{L^1((0,T),\dot B^\mu_{2,l})} \lesssim T^\sigma M(\|u\|_{L^\infty((0,T),L^\infty \cap \dot B^0_{\infty,l})}) \|u\|^{p-1}_{L^q((0,T),\dot B^{s_0}_{r,l})} \cdot \|u\|_{L^{q_0}((0,T),\dot B^\mu_{r_0,l})}
\tag{3.23}
\]
for any $0 < \mu \le s$ and $1 \le l \le 2$, and

(2)
\[
\|f(u) - f(v)\|_{L^1((0,T),L^2)} \lesssim T^\sigma \max_{w=u,v} M(\|w\|_{L^\infty((0,T),L^\infty)}) \cdot \|w\|^{p-1}_{L^q((0,T),\dot B^{s_0}_{r,2})} \|u - v\|_{L^{q_0}((0,T),L^{r_0})}.
\tag{3.24}
\]

Proof. (1) By the condition on $s_0$ and $p$, we can take $r_0$ such that
\[
\frac{1}{2} - (p - 1)\frac{n - 2s_0}{2n} < \frac{1}{r_0} < \frac{1}{2}, \qquad 0 < \frac{1}{r_0}.
\tag{3.25}
\]
When $m < n/2$, moreover we assume
\[
\frac{1}{2} - \frac{m}{n} \le \frac{1}{r_0} \le \frac{1}{2} - (p - 1)\frac{n - 2s_0 - 2m}{2n}.
\tag{3.26}
\]
We can take such $r_0$ by $p - 1 \le 4m/(n - 2s_0)$. We put
\[
\frac{1}{r_*} := \frac{1}{p - 1}\Big(\frac{1}{2} - \frac{1}{r_0}\Big), \qquad \frac{1}{r} := \frac{1}{r_*} + \frac{s_0}{n}.
\tag{3.27}
\]
And we take $q$ and $q_0$ such that $(q, r)$ and $(q_0, r_0)$ are admissible pairs. Since $1/2 = (p - 1)/r_* + 1/r_0$ holds, we have the estimate
\[
\|f(u)\|_{\dot B^\mu_{2,l}} \lesssim M(\|u\|_{L^\infty \cap \dot B^0_{\infty,l}}) \|u\|^{p-1}_{L^{r_*} \cap \dot B^0_{r_*,l}} \|u\|_{\dot B^\mu_{r_0,l}}.
\tag{3.28}
\]
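Indeed, this estimate has been shown by [23, Proposition 1.1] when $l = 2$. The proof is also valid if we take $1 \le l \le 2$. Since $1/r_* = 1/r - s_0/n$ holds, $\|u\|_{L^{r_*} \cap \dot B^0_{r_*,l}}$ is bounded by $\|u\|_{\dot B^{s_0}_{r,l}}$ by the Sobolev embedding. Taking the $L^1$ norm for the time variable and using the Hölder inequality, we obtain the required result since $\sigma$ satisfies the equation $\sigma = 1 - (p - 1)/q - 1/q_0$.

(2) The proof follows similarly. By the estimate
\[
|f(u) - f(v)| \lesssim \max_{w=u,v} |w|^{p-1} M(|w|) |u - v|,
\tag{3.29}
\]
we have the estimate
\[
\|f(u) - f(v)\|_{L^2} \lesssim \max_{w=u,v} M(\|w\|_{L^\infty}) \|w\|^{p-1}_{L^{r_*}} \|u - v\|_{L^{r_0}}.
\tag{3.30}
\]
By the embedding $\dot B^{s_0}_{r,2} \hookrightarrow L^{r_*}$, and the Hölder inequality for the time variable, we obtain the required result.

We put $X := L^\infty L^2 \cap L^{q_0} L^{r_0} \cap L^q L^r$, $X^s := L^\infty \dot H^s \cap L^{q_0} \dot B^s_{r_0,2} \cap L^q \dot B^s_{r,2}$, $X^\alpha := L^\infty \dot B^\alpha_{2,1} \cap L^{q_0} \dot B^\alpha_{r_0,1} \cap L^q \dot B^\alpha_{r,1}$ for $\alpha = s_0, n/2$. By Lemmas 2.1 and 3.3, we have
\[
\begin{aligned}
\|\Phi(u)\|_{X^s} & \lesssim \|u_0\|_{\dot H^s} + \|f(u)\|_{L^1 \dot H^s} \\
& \lesssim \|u_0\|_{\dot H^s} + T^\sigma M(\|u\|_{L^\infty \dot B^{n/2}_{2,1}}) \|u\|^{p-1}_{L^q \dot B^{s_0}_{r,2}} \|u\|_{L^{q_0} \dot B^s_{r_0,2}} \\
& \lesssim \|u_0\|_{\dot H^s} + T^\sigma M(\|u\|_{X^{n/2}}) \|u\|^{p-1}_{X^{s_0}} \|u\|_{X^s},
\end{aligned}
\]
where we have used the embedding $\dot B^{n/2}_{2,1} \hookrightarrow L^\infty \cap \dot B^0_{\infty,2}$. Similarly, we also have
\[
\begin{aligned}
\|\Phi(u)\|_{X^\alpha} & \lesssim \|u_0\|_{\dot B^\alpha_{2,1}} + \|f(u)\|_{L^1 \dot B^\alpha_{2,1}} \\
& \lesssim \|u_0\|_{\dot B^\alpha_{2,1}} + T^\sigma M(\|u\|_{L^\infty \dot B^{n/2}_{2,1}}) \|u\|^{p-1}_{L^q \dot B^{s_0}_{r,1}} \|u\|_{L^{q_0} \dot B^\alpha_{r_0,1}} \\
& \lesssim \|u_0\|_{\dot B^\alpha_{2,1}} + T^\sigma M(\|u\|_{X^{n/2}}) \|u\|^{p-1}_{X^{s_0}} \|u\|_{X^\alpha}
\end{aligned}
\]
for $\alpha = s_0, n/2$ and
\[
\begin{aligned}
\|\Phi(u) - \Phi(v)\|_X & \lesssim \|f(u) - f(v)\|_{L^1 L^2} \\
& \lesssim T^\sigma \max_{w=u,v} M(\|w\|_{L^\infty \dot B^{n/2}_{2,1}}) \|w\|^{p-1}_{L^q \dot B^{s_0}_{r,2}} \|u - v\|_{L^{q_0} L^{r_0}} \\
& \lesssim T^\sigma \max_{w=u,v} M(\|w\|_{X^{n/2}}) \|w\|^{p-1}_{X^{s_0}} \|u - v\|_X.
\end{aligned}
\]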
Therefore $\Phi$ becomes a contraction map on
\[
\{u \in L^\infty H^s \mid \|u\|_X \le C_0\|u_0\|_{L^2},\ \|u\|_{X^s} \le C_0\|u_0\|_{\dot H^s},\ \|u\|_{X^\alpha} \le C_0\|u_0\|_{\dot B^\alpha_{2,1}} \text{ for } \alpha = s_0, n/2\}
\tag{3.31}
\]
for some positive constant $C_0$ and suitable $T > 0$, where the metric is given by the $X$ norm. Moreover we can take $T = \infty$ when $\sigma = 0$ and $\|u_0\|_{\dot B^{s_0}_{2,1}}$ is sufficiently small.

4. Proof of Theorems 1.7–1.9

We rewrite the differential equation (1.23) as the integral equation
\[
u(t) = \Psi(u) := \partial_t K(t)u_0 + K(t)(u_0 + u_1) + \int_0^t K(t - \tau) f(u(\tau))\,d\tau,
\tag{4.1}
\]
where
\[
K(t) := F^{-1} \frac{e^{-t\alpha} - e^{-t\beta}}{\beta - \alpha} F,
\tag{4.2}
\]
\[
\alpha := \frac{1 - \sqrt{1 - 4P(\xi)}}{2}, \qquad \beta := \frac{1 + \sqrt{1 - 4P(\xi)}}{2}, \qquad P(\xi) := \sum_{0 \le |\alpha| \le m} C_\alpha \xi^{2\alpha}.
\tag{4.3}
\]
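The multiplier in (4.2) can be checked frequency by frequency. Since $-\Delta_m$ has Fourier symbol $P(\xi)$, the multiplier $\hat K(t) = (e^{-t\alpha} - e^{-t\beta})/(\beta - \alpha)$ should solve $\hat K'' + \hat K' + P \hat K = 0$ with $\hat K(0) = 0$ and $\hat K'(0) = 1$, because $\alpha$ and $\beta$ are the roots of $x^2 - x + P = 0$. The sketch below verifies this for a few sample values of $P$; the function names are illustrative, and the degenerate value $P = 1/4$ (where $\alpha = \beta$) is excluded.

```python
import cmath

def _roots(P):
    """alpha and beta from (4.3) at a frequency where P(xi) = P."""
    root = cmath.sqrt(1 - 4 * P)
    return (1 - root) / 2, (1 + root) / 2

def kernel_hat(t, P, order=0):
    """order-th time derivative of (e^{-t a} - e^{-t b}) / (b - a); needs P != 1/4."""
    a, b = _roots(P)
    return ((-a) ** order * cmath.exp(-t * a)
            - (-b) ** order * cmath.exp(-t * b)) / (b - a)

def ode_residual(t, P):
    """K'' + K' + P K, which should vanish identically."""
    return kernel_hat(t, P, 2) + kernel_hat(t, P, 1) + P * kernel_hat(t, P, 0)

for P in (0.1, 2.0, 5 + 1j):
    print(abs(kernel_hat(0.0, P)), abs(kernel_hat(0.0, P, 1) - 1), abs(ode_residual(0.7, P)))
```

With these properties, $\partial_t K(t)u_0 + K(t)(u_0 + u_1)$ takes the value $u_0$ at $t = 0$ and has time derivative $-u_0 + (u_0 + u_1) = u_1$ there, which matches the initial data in (1.23).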
We show that the operator $\Psi$ becomes a contraction map on some complete metric spaces. The solutions in our theorems are given as the fixed points of $\Psi$.

4.1. Proof of Theorem 1.7

Lemma 4.1. Let $n \ge 3$. Let $m$ be an integer with $1 \le m < n/2$. Let $0 \le s < n/2 - m$, and let
\[
p - 1 = \frac{2m}{n - 2(m + s)}.
\tag{4.4}
\]
Let $2 \le q \le \infty$, $2 \le q_0 \le \infty$ satisfy
\[
\sigma := 1 - \frac{p - 1}{q} - \frac{1}{q_0} \ge 0.
\tag{4.5}
\]
Let $f$ satisfy $N(s, p, 1)$. Then the following estimates hold.

(1)
\[
\|f(u)\|_{L^1((0,T),\dot H^\mu)} \lesssim T^\sigma \|u\|^{p-1}_{L^q((0,T),\dot H^{m+s})} \|u\|_{L^{q_0}((0,T),\dot H^{m+\mu})}
\tag{4.6}
\]
for any $0 < \mu \le s$, and

(2)
\[
\|f(u) - f(v)\|_{L^1((0,T),L^2)} \lesssim T^\sigma \max_{w=u,v} \|w\|^{p-1}_{L^q((0,T),\dot H^{m+s})} \|u - v\|_{L^{q_0}((0,T),\dot H^m)}.
\tag{4.7}
\]

Proof. (1) We put
\[
\frac{1}{r_0} := \frac{1}{2} - \frac{m}{n}, \qquad \frac{1}{r_*} := \frac{1}{2} - \frac{m + s}{n}.
\tag{4.8}
\]
Since we have $1/2 = (p-1)/r_* + 1/r_0$, Proposition 2.1 in [22] gives the estimate
$$\|f(u)\|_{\dot H^\mu} \lesssim \|u\|^{p-1}_{\dot B^{0}_{r_*,2}}\,\|u\|_{\dot B^{\mu}_{r_0,2}}. \tag{4.9}$$
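The Hölder balance $1/2 = (p-1)/r_* + 1/r_0$ used above follows from (4.4) and (4.8): $(p-1)/r_* = \frac{2m}{n-2(m+s)}\cdot\frac{n-2(m+s)}{2n} = m/n$. A small exact-arithmetic sketch of this check is given below; the helper names `holder_exponents` and `holder_balance` are hypothetical and the sample values of $(n, m, s)$ are chosen only for illustration.

```python
from fractions import Fraction


def holder_exponents(n, m, s):
    """Return (p - 1, 1/r_0, 1/r_*) as exact fractions, following (4.4) and (4.8)."""
    n, m, s = Fraction(n), Fraction(m), Fraction(s)
    if not (0 <= s < n / 2 - m):
        raise ValueError("Lemma 4.1 requires 0 <= s < n/2 - m")
    p_minus_1 = 2 * m / (n - 2 * (m + s))
    inv_r0 = Fraction(1, 2) - m / n
    inv_rstar = Fraction(1, 2) - (m + s) / n
    return p_minus_1, inv_r0, inv_rstar


def holder_balance(n, m, s):
    """Value of (p - 1)/r_* + 1/r_0, which should equal 1/2."""
    p_minus_1, inv_r0, inv_rstar = holder_exponents(n, m, s)
    return p_minus_1 * inv_rstar + inv_r0


if __name__ == "__main__":
    for n, m, s in [(5, 1, 0), (5, 1, Fraction(1, 2)), (7, 2, 1)]:
        print(n, m, s, holder_balance(n, m, s))
```

Rational arithmetic is used so that the identity is verified exactly rather than up to rounding.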
By the Sobolev embeddings $\dot H^{m+s} \hookrightarrow \dot B^{0}_{r_*,2}$ and $\dot H^{m+\mu} \hookrightarrow \dot B^{\mu}_{r_0,2}$, and the Hölder inequality in the time variable, we obtain the required result.

(2) Since $f$ satisfies
$$|f(u)-f(v)| \lesssim \max_{w=u,v}|w|^{p-1}\,|u-v|, \tag{4.10}$$
we have
$$\|f(u)-f(v)\|_{L^2} \lesssim \max_{w=u,v}\|w\|^{p-1}_{L^{r_*}}\,\|u-v\|_{L^{r_0}}. \tag{4.11}$$
By the same Sobolev embeddings and the Hölder inequality in the time variable, we obtain the required result.

We put
$$X_0 := L^\infty H^m \cap L^2\dot H^m,\qquad Y_0 := L^\infty L^2 \cap L^2L^2 \tag{4.12}$$
and
$$X_1 := L^\infty \dot H^{m+s} \cap L^2\dot H^{m+s},\qquad Y_1 := L^\infty \dot H^{s} \cap L^2\dot H^{s}. \tag{4.13}$$
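By Lemmas 2.2 and 4.1, we have
$$\begin{aligned}
\|\Psi(u)\|_{X_0} + \|\partial_t\Psi(u)\|_{Y_0} &\lesssim \|(u_0,u_1)\|_{H^m\oplus L^2} + \|f(u)\|_{L^1L^2}\\
&\lesssim \|(u_0,u_1)\|_{H^m\oplus L^2} + T^\sigma\,\|u\|^{p-1}_{L^q\dot H^{m+s}}\,\|u\|_{L^{q_0}\dot H^{m}}\\
&\lesssim \|(u_0,u_1)\|_{H^m\oplus L^2} + T^\sigma\,\|u\|^{p-1}_{X_1}\,\|u\|_{X_0},
\end{aligned}$$
where we have used the embedding $L^\infty \cap L^2 \hookrightarrow L^q$ for the last inequality. Similarly, we also have
$$\begin{aligned}
\|\Psi(u)\|_{X_1} + \|\partial_t\Psi(u)\|_{Y_1} &\lesssim \|(u_0,u_1)\|_{(\dot H^s\cap\dot H^{m+s})\oplus\dot H^s} + \|f(u)\|_{L^1\dot H^s}\\
&\lesssim \|(u_0,u_1)\|_{(\dot H^s\cap\dot H^{m+s})\oplus\dot H^s} + T^\sigma\,\|u\|^{p-1}_{L^q\dot H^{m+s}}\,\|u\|_{L^{q_0}\dot H^{m+s}}\\
&\lesssim \|(u_0,u_1)\|_{(\dot H^s\cap\dot H^{m+s})\oplus\dot H^s} + T^\sigma\,\|u\|^{p}_{X_1}
\end{aligned}$$
and
$$\begin{aligned}
\|\Psi(u)-\Psi(v)\|_{X_0} &\lesssim \|f(u)-f(v)\|_{L^1L^2}\\
&\lesssim T^\sigma \max_{w=u,v}\|w\|^{p-1}_{L^q\dot H^{m+s}}\,\|u-v\|_{L^{q_0}\dot H^{m}}\\
&\lesssim T^\sigma \max_{w=u,v}\|w\|^{p-1}_{X_1}\,\|u-v\|_{X_0}.
\end{aligned}$$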
Therefore $\Psi$ is a contraction map on
$$\big\{u \in L^\infty H^{m+s} \;\big|\; \|u\|_{X_0} \le C_0\|(u_0,u_1)\|_{H^m\oplus L^2},\ \|u\|_{X_1} \le C_0\|(u_0,u_1)\|_{(\dot H^s\cap\dot H^{m+s})\oplus\dot H^s}\big\} \tag{4.14}$$
for some positive constant C0 and suitable T > 0, where the metric is given by the X0 norm. Moreover we can take T = ∞ when σ = 0 with q = q0 = p ≥ 2 and (u0 , u1 ) (H˙ s ∩H˙ m+s )⊕H˙ s is sufficiently small. We note that the fixed point u of Ψ satisfies ∂t u ∈ L∞ H s by the above estimates for ∂t Ψ. 4.2. Proof of Theorem 1.8 Lemma 4.2. Let n ≥ 3. Let m be an integer with 1 ≤ m < n/2. Let 0 ≤ s0 < n/2 − m, and let n 2m −m−1
(4.15)

$$\frac1q,\ \frac1{q_0} \le \min\Big\{\frac12,\ \frac1p\Big\}, \tag{4.16}$$
$$\sigma := 1 - \frac{p-1}{q} - \frac{1}{q_0} \ge 0. \tag{4.17}$$
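Let $\lambda \ge 0$ and $\nu \ge 0$, and let $f$ satisfy $N(n/2-m,\,p,\,\exp(\lambda|\cdot|^\nu))$. We put
$$\frac{1}{r_0} := \frac12 - \frac{m}{n},\qquad \frac{1}{r(j)} := \frac{1}{p-1+\nu j}\Big(\frac12 - \frac{1}{r_0}\Big)\quad\text{for } j \ge 0. \tag{4.18}$$
Then there exists a constant $C$ such that the following inequalities hold.

(1)
$$\|f(u)\|_{L^1((0,T),\dot H^\mu)} \lesssim T^\sigma \sum_{j=0}^\infty a_j\,\|u\|^{\nu j}_{L^\infty((0,T),\dot H^{n/2})}\cdot \|u\|^{p-1}_{L^q((0,T),\dot H^{m+s_0})}\,\|u\|_{L^{q_0}((0,T),\dot H^{m+\mu})} \tag{4.19}$$
for any $0 < \mu \le n/2 - m$, where
$$a_j := \frac{\lambda^j}{j!}\,C^{\nu j}\,r(j)^{\nu j/2+(p-1)/r(0)}, \tag{4.20}$$
$$\lim_{j\to\infty}\frac{a_{j+1}\,\|u\|^{\nu(j+1)}_{L^\infty\dot H^{n/2}}}{a_j\,\|u\|^{\nu j}_{L^\infty\dot H^{n/2}}} =
\begin{cases}
0 & \text{if } \nu < 2,\\
\big(\lambda C^2 e\,2n\,\|u\|^2_{L^\infty\dot H^{n/2}}\big)/m & \text{if } \nu = 2,\\
\infty & \text{if } \nu > 2.
\end{cases} \tag{4.21}$$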
(2)
$$\|f(u)-f(v)\|_{L^1((0,T),L^2)} \lesssim T^\sigma \sum_{j=0}^\infty a_j \max_{w=u,v}\|w\|^{\nu j}_{L^\infty((0,T),\dot H^{n/2})}\cdot \max_{w=u,v}\|w\|^{p-1}_{L^q((0,T),\dot H^{m+s_0})}\,\|u-v\|_{L^{q_0}((0,T),\dot H^{m})}. \tag{4.22}$$
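Proof. (1) By [24, Proposition 3.1], we have
$$\|f(u)\|_{\dot H^\mu} \lesssim \sum_{j=0}^\infty \frac{\lambda^j}{j!}\,\|u\|^{p-1+\nu j}_{L^{r(j)}\cap\dot B^{0}_{r(j),2}}\,\|u\|_{\dot B^{\mu}_{r_0,2}}. \tag{4.23}$$
Since $1/r_0 = 1/2 - m/n$, we have the Sobolev embedding $\dot H^{m+\mu} \hookrightarrow \dot B^{\mu}_{r_0,2}$. We use [21, Lemma 2.2] to obtain
$$\|u\|_{L^{r(j)}\cap\dot B^{0}_{r(j),2}} \lesssim r(j)^{1/2+(r(0)-2)/2r(j)}\,\|u\|^{1-r(0)/r(j)}_{\dot H^{n/2}}\,\|u\|^{r(0)/r(j)}_{L^{r(0)}\cap\dot B^{0}_{r(0),2}}. \tag{4.24}$$
Therefore, we obtain
$$\|f(u)\|_{\dot H^\mu} \lesssim \sum_{j=0}^\infty a_j\,\|u\|^{\nu j}_{\dot H^{n/2}}\,\|u\|^{p-1}_{\dot H^{m+s_0}}\,\|u\|_{\dot H^{m+\mu}}. \tag{4.25}$$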
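Taking the $L^1$ norm and using the Hölder inequality in the time variable, we obtain the required result. The last statement follows by direct computation from the definition of $a_j$.

(2) The proof follows similarly after we use the estimate
$$|f(u)-f(v)| \le \sum_{j=0}^\infty \frac{\lambda^j}{j!}\max_{w=u,v}|w|^{p+\nu j-1}\,|u-v|. \tag{4.26}$$
Taking the $L^1L^2$ norm and using the Hölder inequality, we obtain the required result.

For $\mu = 0, s_0, n/2-m$, we put
$$X_\mu := L^\infty(\dot H^\mu\cap\dot H^{m+\mu}) \cap L^2\dot H^{m+\mu},\qquad Y_\mu := L^\infty\dot H^\mu \cap L^2\dot H^\mu, \tag{4.27}$$
where we regard $\dot H^0$ as $L^2$. By Lemmas 2.2 and 4.2, we have
$$\begin{aligned}
\|\Psi(u)\|_{X_\mu} + \|\partial_t\Psi(u)\|_{Y_\mu} &\lesssim \|(u_0,u_1)\|_{(\dot H^\mu\cap\dot H^{m+\mu})\oplus\dot H^\mu} + \|f(u)\|_{L^1\dot H^\mu}\\
&\lesssim \|(u_0,u_1)\|_{(\dot H^\mu\cap\dot H^{m+\mu})\oplus\dot H^\mu} + T^\sigma\sum_{j=0}^\infty a_j\,\|u\|^{\nu j}_{L^\infty\dot H^{n/2}}\,\|u\|^{p-1}_{L^q\dot H^{m+s_0}}\,\|u\|_{L^{q_0}\dot H^{m+\mu}}\\
&\lesssim \|(u_0,u_1)\|_{(\dot H^\mu\cap\dot H^{m+\mu})\oplus\dot H^\mu} + T^\sigma\sum_{j=0}^\infty a_j\,\|u\|^{\nu j}_{X_{n/2-m}}\,\|u\|^{p-1}_{X_{s_0}}\,\|u\|_{X_\mu}
\end{aligned}$$
for $\mu = 0, s_0, n/2-m$, where we have used the embedding $L^\infty\cap L^2 \hookrightarrow L^q\cap L^{q_0}$ for the last inequality.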
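Similarly, we have
$$\begin{aligned}
\|\Psi(u)-\Psi(v)\|_{X_0} + \|\partial_t\Psi(u)-\partial_t\Psi(v)\|_{Y_0} &\lesssim \|f(u)-f(v)\|_{L^1L^2}\\
&\lesssim T^\sigma\sum_{j=0}^\infty a_j \max_{w=u,v}\|w\|^{\nu j}_{L^\infty\dot H^{n/2}}\,\|w\|^{p-1}_{L^q\dot H^{m+s_0}}\,\|u-v\|_{L^{q_0}\dot H^{m}}\\
&\lesssim T^\sigma\sum_{j=0}^\infty a_j \max_{w=u,v}\|w\|^{\nu j}_{X_{n/2-m}}\,\|w\|^{p-1}_{X_{s_0}}\,\|u-v\|_{X_0}.
\end{aligned}$$
Therefore $\Psi$ is a contraction map on
$$\big\{u \in L^\infty H^{n/2} \;\big|\; \|u\|_{X_\mu} \le C_0\|(u_0,u_1)\|_{(\dot H^\mu\cap\dot H^{m+\mu})\oplus\dot H^\mu} \text{ for } \mu = 0, s_0, n/2-m\big\} \tag{4.28}$$
for some positive constant $C_0$ and suitable $T > 0$, where the metric is given by the $X_0$ norm. We note that $\|(u_0,u_1)\|_{(\dot H^{n/2-m}\cap\dot H^{n/2})\oplus\dot H^{n/2-m}}$ must be taken sufficiently small when $\nu = 2$, since this is required in Lemma 4.2. Moreover, we can take $T = \infty$ when $\sigma = 0$ with $q = q_0 = p \ge 2$ and $\|(u_0,u_1)\|_{(\dot H^{s_0}\cap\dot H^{s_0+m})\oplus\dot H^{s_0}}$ is sufficiently small. We note that the fixed point $u$ of $\Psi$ satisfies $\partial_t u \in L^\infty\dot H^\mu\cap L^2\dot H^\mu$ for $\mu = 0, s_0, n/2-m$ by the above estimates for $\partial_t\Psi$.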
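The smallness requirement for $\nu = 2$ can be read off from the ratio limit (4.21): by the ratio test, the series $\sum_j a_j x^{\nu j}$ converges when that limit is below one. The sketch below evaluates the ratio $a_{j+1}/a_j$ numerically for large $j$ (with $\|u\| = 1$) and compares it with $2en\lambda C^2/m$. It assumes the reading $1/r(j) = (m/n)/(p-1+\nu j)$ of (4.18) and the form of $a_j$ in (4.20); the function names and the sample parameters are illustrative only, and the constant part of the exponent of $r(j)$ does not affect the limit.

```python
import math


def log_a(j: int, lam: float, C: float, nu: float, n: int, m: int, p: float) -> float:
    """log of a_j = lam^j / j! * C^(nu j) * r(j)^(nu j / 2 + (p - 1) / r(0))."""
    r_j = n * (p - 1 + nu * j) / m        # from 1/r(j) = (m/n) / (p - 1 + nu j)
    r_0 = n * (p - 1) / m
    exponent = nu * j / 2 + (p - 1) / r_0
    return (j * math.log(lam) - math.lgamma(j + 1)
            + nu * j * math.log(C) + exponent * math.log(r_j))


def ratio(j: int, lam: float, C: float, nu: float, n: int, m: int, p: float) -> float:
    """a_{j+1} / a_j, computed through logarithms to avoid overflow."""
    return math.exp(log_a(j + 1, lam, C, nu, n, m, p) - log_a(j, lam, C, nu, n, m, p))


def limit_nu2(lam: float, C: float, n: int, m: int) -> float:
    """Claimed limit of a_{j+1} / a_j for nu = 2 (with norm of u equal to 1)."""
    return 2 * math.e * n * lam * C**2 / m


if __name__ == "__main__":
    args = dict(lam=1.0, C=1.0, n=3, m=1, p=3.0)
    print(ratio(10**6, nu=2.0, **args), limit_nu2(1.0, 1.0, 3, 1))
    print(ratio(10**6, nu=1.0, **args))   # tends to 0 for nu < 2
```

Under these assumptions, convergence for $\nu = 2$ requires $2en\lambda C^2\|u\|^2_{L^\infty\dot H^{n/2}}/m < 1$, which is a smallness condition on the solution and hence on the data.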
4.3. Proof of Theorem 1.9

Lemma 4.3. Let $n \ge 3$. Let $m$ be an integer with $1 \le m < n/2$. Let $0 \le s_0 < n/2 - m < s < \infty$. We put
$$p-1 = \frac{2m}{n-2(m+s_0)}. \tag{4.29}$$
Let $q$ and $q_0$ satisfy
$$2 \le q \le \infty,\qquad 2 \le q_0 \le \infty,\qquad \sigma := 1 - \frac{p-1}{q} - \frac{1}{q_0} \ge 0. \tag{4.30}$$
Let $M$ be any fixed nonnegative nondecreasing function on $[0,\infty)$, and let $f$ satisfy $N(s,p,M)$. Let $1 \le l \le 2$. Then the following estimates hold.

(1)
$$\|f(u)\|_{L^1((0,T),\dot B^{\mu}_{2,l})} \lesssim T^\sigma M\big(\|u\|_{L^\infty((0,T),\dot B^{n/2}_{2,1})}\big)\cdot\|u\|^{p-1}_{L^q((0,T),\dot B^{m+s_0}_{2,l})}\,\|u\|_{L^{q_0}((0,T),\dot B^{m+\mu}_{2,l})} \tag{4.31}$$
for any $0 < \mu \le s$, and

(2)
$$\|f(u)-f(v)\|_{L^1((0,T),L^2)} \lesssim T^\sigma \max_{w=u,v} M\big(\|w\|_{L^\infty((0,T),\dot B^{n/2}_{2,1})}\big)\cdot\|w\|^{p-1}_{L^q((0,T),\dot B^{m+s_0}_{2,2})}\,\|u-v\|_{L^{q_0}((0,T),\dot B^{m}_{2,2})}. \tag{4.32}$$
Proof. (1) We put
$$\frac{1}{r_0} := \frac12 - \frac{m}{n},\qquad \frac{1}{r_*} := \frac12 - \frac{m+s_0}{n}. \tag{4.33}$$
Since $1/2 = (p-1)/r_* + 1/r_0$ holds, we have the estimate
$$\|f(u)\|_{\dot B^{\mu}_{2,l}} \lesssim M\big(\|u\|_{L^\infty\cap\dot B^{0}_{\infty,l}}\big)\,\|u\|^{p-1}_{L^{r_*}\cap\dot B^{0}_{r_*,l}}\,\|u\|_{\dot B^{\mu}_{r_0,l}} \tag{4.34}$$
for $0 < \mu \le s$, $1 \le l \le 2$. Indeed, this estimate has been shown in Proposition 1.1 of [23] when $l = 2$, and the proof remains valid for $1 \le l \le 2$. Applying the Sobolev embeddings $\dot B^{n/2}_{2,1} \hookrightarrow L^\infty\cap\dot B^{0}_{\infty,l}$, $\dot B^{m+s_0}_{2,l} \hookrightarrow L^{r_*}\cap\dot B^{0}_{r_*,l}$, $\dot B^{m+\mu}_{2,l} \hookrightarrow \dot B^{\mu}_{r_0,l}$, taking the $L^1$ norm in the time variable and using the Hölder inequality, we obtain the required result.

(2) The proof follows similarly. Since we have
$$|f(u)-f(v)| \lesssim \max_{w=u,v}|w|^{p-1}M(|w|)\,|u-v|, \tag{4.35}$$
we have the estimate
$$\|f(u)-f(v)\|_{L^2} \lesssim \max_{w=u,v} M(\|w\|_{L^\infty})\,\|w\|^{p-1}_{L^{r_*}}\,\|u-v\|_{L^{r_0}}. \tag{4.36}$$
By the Sobolev embeddings and the Hölder inequality in the time variable, we obtain the required result.

For $(\mu,l) = (0,2), (s_0,1), (n/2-m,1), (s,2)$, we put
$$X_{\mu,l} := L^\infty\dot B^{\mu}_{2,l}\cap L^\infty\dot B^{m+\mu}_{2,l}\cap L^2\dot B^{m+\mu}_{2,l},\qquad Y_{\mu,l} := L^\infty\dot B^{\mu}_{2,l}\cap L^2\dot B^{\mu}_{2,l}. \tag{4.37}$$
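By Lemmas 2.2 and 4.3, we have
$$\begin{aligned}
\|\Psi(u)\|_{X_{\mu,l}} + \|\partial_t\Psi(u)\|_{Y_{\mu,l}} &\lesssim \|(u_0,u_1)\|_{(\dot B^{\mu}_{2,l}\cap\dot B^{m+\mu}_{2,l})\oplus\dot B^{\mu}_{2,l}} + \|f(u)\|_{L^1\dot B^{\mu}_{2,l}}\\
&\lesssim \|(u_0,u_1)\|_{(\dot B^{\mu}_{2,l}\cap\dot B^{m+\mu}_{2,l})\oplus\dot B^{\mu}_{2,l}} + T^\sigma M\big(\|u\|_{L^\infty\dot B^{n/2}_{2,1}}\big)\,\|u\|^{p-1}_{L^q\dot B^{m+s_0}_{2,l}}\,\|u\|_{L^{q_0}\dot B^{m+\mu}_{2,l}}\\
&\lesssim \|(u_0,u_1)\|_{(\dot B^{\mu}_{2,l}\cap\dot B^{m+\mu}_{2,l})\oplus\dot B^{\mu}_{2,l}} + T^\sigma M\big(\|u\|_{X_{n/2-m,1}}\big)\,\|u\|^{p-1}_{X_{s_0,l}}\,\|u\|_{X_{\mu,l}},
\end{aligned}$$
where we have used the embedding $L^\infty\cap L^2 \hookrightarrow L^q\cap L^{q_0}$ for the last inequality. Similarly, we have
$$\begin{aligned}
\|\Psi(u)-\Psi(v)\|_{X_{0,2}} + \|\partial_t\Psi(u)-\partial_t\Psi(v)\|_{Y_{0,2}} &\lesssim \|f(u)-f(v)\|_{L^1L^2}\\
&\lesssim T^\sigma \max_{w=u,v} M\big(\|w\|_{L^\infty\dot B^{n/2}_{2,1}}\big)\,\|w\|^{p-1}_{L^q\dot B^{m+s_0}_{2,2}}\,\|u-v\|_{L^{q_0}\dot B^{m}_{2,2}}\\
&\lesssim T^\sigma \max_{w=u,v} M\big(\|w\|_{X_{n/2-m,1}}\big)\,\|w\|^{p-1}_{X_{s_0,2}}\,\|u-v\|_{X_{0,2}}.
\end{aligned}$$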
Therefore $\Psi$ is a contraction map on
$$\big\{u \in L^\infty H^{s} \;\big|\; \|u\|_{X_{\mu,l}} \le C_0\|(u_0,u_1)\|_{(\dot B^{\mu}_{2,l}\cap\dot B^{m+\mu}_{2,l})\oplus\dot B^{\mu}_{2,l}} \text{ for } (\mu,l) = (0,2), (s_0,1), (n/2-m,1), (s,2)\big\} \tag{4.38}$$
for some positive constant $C_0$ and suitable $T > 0$, where the metric is given by the $X_{0,2}$ norm. Moreover, we can take $T = \infty$ when $\sigma = 0$ with $q = q_0 = p \ge 2$ and $\|(u_0,u_1)\|_{(\dot B^{s_0}_{2,1}\cap\dot B^{m+s_0}_{2,1})\oplus\dot B^{s_0}_{2,1}}$ is sufficiently small. We note that the fixed point $u$ of $\Psi$ satisfies $\partial_t u \in L^\infty\dot B^{\mu}_{2,l}\cap L^2\dot B^{\mu}_{2,l}$ for $(\mu,l) = (0,2), (s_0,1), (n/2-m,1), (s,2)$ by the above estimates for $\partial_t\Psi$.

5. Proof of Theorems 1.11 and 1.14

For the proof of Theorems 1.11 and 1.14, we prepare the following estimates for nonlinear terms.

Lemma 5.1. Let $n \ge 1$, $m \ge 1$, $p \ge 2$ be integers. Let $P(u)$ be any polynomial of $u$ and $\bar u$ of order $p$. For any fixed $\alpha$ with $|\alpha| = m$, let $f(u) = \partial_x^\alpha P(u)$. Let $a$ be a real number with $a \ge m+n+2$. Then the following estimates hold.

(1)
$$\|P(u)\|_{H^{a+m}} \lesssim \|u\|^{p-1}_{H^{a}}\,\|u\|_{H^{a+m}}. \tag{5.1}$$

(2)
$$\|P(u)\|_{L^1H^{a+m}} \lesssim \|u\|^{p-2}_{L^\infty H^{a}}\,\|u\|^{2}_{L^2H^{a+m}}. \tag{5.2}$$

(3)
$$\|f(u)-f(v)\|_{L^1\dot H^{-m}} \lesssim \max_{w=u,v}\|w\|^{p-2}_{L^\infty L^\infty}\,\|w\|_{L^2L^\infty}\,\|u-v\|_{L^2L^2}. \tag{5.3}$$

Proof. (1) We obtain the required result by the elementary estimate
$$\|P(u)\|_{H^{a+m}} \lesssim \sum_{|\alpha|\le(a+m)/2}\|\partial_x^\alpha u\|^{p-1}_{L^\infty}\,\|u\|_{H^{a+m}} \tag{5.4}$$
and by the Sobolev embedding theorem.

(2) The proof follows from the Hölder inequality in the time variable.

(3) For any $|\alpha| = m$, we have
$$\|f(u)-f(v)\|_{\dot H^{-m}} \lesssim \|P(u)-P(v)\|_{L^2}. \tag{5.5}$$
The required result follows from the simple fact
$$\|P(u)-P(v)\|_{L^2} \lesssim \max_{w=u,v}\|w\|^{p-1}_{L^\infty}\,\|u-v\|_{L^2} \tag{5.6}$$
and the Hölder inequality in the time variable.
5.1. Proof of Theorem 1.11

We put $X := L^\infty H^a\cap L^2H^{a+m}$. For any fixed $\alpha$ with $|\alpha| = m$, by Lemmas 2.1 and 5.1, we have
$$\|\Phi(u)\|_{L^\infty H^a} \lesssim \|u_0\|_{H^a} + \|f(u)\|_{L^1H^a} \tag{5.7}$$
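and
$$\|f(u)\|_{L^1H^a} \lesssim \|P(u)\|_{L^1H^{a+m}} \lesssim \|u\|^{p-2}_{L^\infty H^a}\,\|u\|^{2}_{L^2H^{a+m}} \lesssim \|u\|^{p}_{X}. \tag{5.8}$$
On the other hand, by the fact
$$\|\Phi(u)\|_{L^2H^{a+m}} \lesssim \sum_{|\beta|\le a+m}\|\partial_x^\beta\Phi(u)\|_{L^2L^2} \tag{5.9}$$
and Lemmas 2.1 and 5.1, we have
$$\|\partial_x^\beta\Phi(u)\|_{L^2L^2} \lesssim \|\partial_x^\beta u_0\|_{\dot H^{-m}} + \|\partial_x^\beta f(u)\|_{L^1\dot H^{-m}} \tag{5.10}$$
and
$$\|\partial_x^\beta f(u)\|_{L^1\dot H^{-m}} \lesssim \|P(u)\|_{L^1H^{a+m}} \lesssim \|u\|^{p-2}_{L^\infty H^a}\,\|u\|^{2}_{L^2H^{a+m}} \lesssim \|u\|^{p}_{X}. \tag{5.11}$$
Therefore, we obtain
$$\|\Phi(u)\|_{X} \lesssim \|u_0\|_{\dot H^{-m}\cap H^a} + \|u\|^{p}_{X}. \tag{5.12}$$
Similarly, we also have
$$\begin{aligned}
\|\Phi(u)-\Phi(v)\|_{L^2L^2} &\lesssim \|f(u)-f(v)\|_{L^1\dot H^{-m}}\\
&\lesssim \max_{w=u,v}\|w\|^{p-2}_{L^\infty L^\infty}\,\|w\|_{L^2L^\infty}\,\|u-v\|_{L^2L^2}\\
&\lesssim \max_{w=u,v}\|w\|^{p-1}_{X}\,\|u-v\|_{L^2L^2}.
\end{aligned}$$
Therefore $\Phi$ is a contraction map on
$$\big\{u \in L^\infty H^a \;\big|\; \|u\|_X \le C_0\|u_0\|_{\dot H^{-m}\cap H^a}\big\} \tag{5.13}$$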
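for some positive constant $C_0$ if $\|u_0\|_{\dot H^{-m}\cap H^a}$ is sufficiently small, where the metric is given by the $L^2_tL^2_x$ norm.

5.2. Proof of Theorem 1.14

We put $X := L^\infty H^a\cap L^2H^{a+m}$. By Lemmas 2.2 and 5.1, we have
$$\|\Psi(u)\|_{L^\infty H^a} \lesssim \|u_0\|_{H^{a+m}} + \|u_1\|_{H^a} + \|f(u)\|_{L^1H^a} \tag{5.14}$$
and
$$\|f(u)\|_{L^1H^a} \lesssim \|P(u)\|_{L^1H^{a+m}} \lesssim \|u\|^{p-2}_{L^\infty H^a}\,\|u\|^{2}_{L^2H^{a+m}} \lesssim \|u\|^{p}_{X}. \tag{5.15}$$
On the other hand, by the fact
$$\|\Psi(u)\|_{L^2H^{a+m}} \lesssim \sum_{|\beta|\le a+m}\|\partial_x^\beta\Psi(u)\|_{L^2L^2} \tag{5.16}$$
and Lemmas 2.2 and 5.1, we have
$$\|\partial_x^\beta\Psi(u)\|_{L^2L^2} \lesssim \|\partial_x^\beta u_0\|_{L^2\cap\dot H^{-m}} + \|\partial_x^\beta u_1\|_{\dot H^{-m}} + \|\partial_x^\beta f(u)\|_{L^1\dot H^{-m}} \tag{5.17}$$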
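and
$$\|\partial_x^\beta f(u)\|_{L^1\dot H^{-m}} \lesssim \|P(u)\|_{L^1H^{a+m}} \lesssim \|u\|^{p-2}_{L^\infty H^a}\,\|u\|^{2}_{L^2H^{a+m}} \lesssim \|u\|^{p}_{X}. \tag{5.18}$$
Similarly, we have
$$\begin{aligned}
\|\Psi(u)-\Psi(v)\|_{L^2L^2} &\lesssim \|f(u)-f(v)\|_{L^1\dot H^{-m}}\\
&\lesssim \max_{w=u,v}\|w\|^{p-2}_{L^\infty L^\infty}\,\|w\|_{L^2L^\infty}\,\|u-v\|_{L^2L^2}\\
&\lesssim \max_{w=u,v}\|w\|^{p-1}_{X}\,\|u-v\|_{L^2L^2}.
\end{aligned} \tag{5.19}$$
Therefore $\Psi$ is a contraction map on
$$\big\{u \in L^\infty H^a \;\big|\; \|u\|_X \le C_0\|(u_0,u_1)\|_{(H^{a+m}\cap\dot H^{-m})\oplus(H^a\cap\dot H^{-m})}\big\} \tag{5.20}$$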
for some positive constant $C_0$ if $\|(u_0,u_1)\|_{(H^{a+m}\cap\dot H^{-m})\oplus(H^a\cap\dot H^{-m})}$ is sufficiently small, where the metric is given by the $L^2_tL^2_x$ norm.

Acknowledgments

I am thankful to the anonymous referee for useful suggestions for the known results.

References

[1] S. Adachi and K. Tanaka, Trudinger type inequalities in $\mathbb{R}^N$ and their best exponents, Proc. Amer. Math. Soc. 128(7) (2000) 2051–2057.
[2] J. Bergh and J. Löfström, Interpolation Spaces. An Introduction, Grundlehren der Mathematischen Wissenschaften, No. 223 (Springer-Verlag, Berlin-New York, 1976), x+207 pp.
[3] T. Cazenave, Semilinear Schrödinger Equations, Courant Lecture Notes in Mathematics, Vol. 10 (American Mathematical Society, Providence, RI, 2003), xiv+323 pp.
[4] H. Fujita, On the blowing up of solutions of the Cauchy problem for $u_t = \Delta u + u^{1+\alpha}$, J. Fac. Sci. Univ. Tokyo Sect. I 13 (1966) 109–124.
[5] V. A. Galaktionov and S. I. Pohozaev, Existence and blow-up for higher-order semilinear parabolic equations: Majorizing order-preserving operators, Indiana Univ. Math. J. 51(6) (2002) 1321–1338.
[6] Y. Giga, Solutions for semilinear parabolic equations in $L^p$ and regularity of weak solutions of the Navier–Stokes system, J. Differential Equations 62(2) (1986) 186–212.
[7] J. Ginibre and G. Velo, Scattering theory in the energy space for a class of nonlinear wave equations, Comm. Math. Phys. 123(4) (1989) 535–573.
[8] J. Ginibre and G. Velo, The Cauchy problem in local spaces for the complex Ginzburg–Landau equation. II. Contraction methods, Comm. Math. Phys. 187(1) (1997) 45–79.
[9] J. Ginibre and G. Velo, Localized estimates and Cauchy problem for the logarithmic complex Ginzburg–Landau equation, J. Math. Phys. 38(5) (1997) 2475–2482.
[10] K. Hayakawa, On nonexistence of global solutions of some semilinear parabolic differential equations, Proc. Japan Acad. 49 (1973) 503–505.
[11] N. Hayashi, E. I. Kaikina and P. I. Naumkin, Damped wave equation with super critical nonlinearities, Differential Integral Equations 17(5–6) (2004) 637–652.
[12] N. Hayashi, E. I. Kaikina and P. I. Naumkin, Damped wave equation in the subcritical case, J. Differential Equations 207(1) (2004) 161–194.
[13] T. Hosono and T. Ogawa, Large time behavior and $L^p$-$L^q$ estimate of solutions of 2-dimensional nonlinear damped wave equations, J. Differential Equations 203(1) (2004) 82–118.
[14] R. Ikehata, Y. Miyaoka and T. Nakatake, Decay estimates of solutions for dissipative wave equations in $\mathbb{R}^N$ with lower power nonlinearities, J. Math. Soc. Japan 56(2) (2004) 365–373.
[15] S. Kawashima, M. Nakao and K. Ono, On the decay property of solutions to the Cauchy problem of the semilinear wave equation with a dissipative term, J. Math. Soc. Japan 47(4) (1995) 617–653.
[16] T. T. Li and Y. Zhou, Breakdown of solutions to $\Box u + u_t = |u|^{1+\alpha}$, Discrete Contin. Dyn. Syst. 1(4) (1995) 503–520.
[17] S. Machihara and Y. Nakamura, The inviscid limit for the complex Ginzburg–Landau equation, J. Math. Anal. Appl. 281(2) (2003) 552–564.
[18] S. Machihara, M. Nakamura and T. Ozawa, Small global solutions for nonlinear Dirac equations, Differential Integral Equations 17(5–6) (2004) 623–636.
[19] A. Matsumura, On the asymptotic behavior of solutions of semi-linear wave equations, Publ. Res. Inst. Math. Sci. 12(1) (1976/77) 169–189.
[20] A. Matsumura, Global existence and asymptotics of the solutions of the second-order quasilinear hyperbolic equations with the first-order dissipation, Publ. Res. Inst. Math. Sci. 13(2) (1977/78) 349–379.
[21] M. Nakamura and T. Ozawa, Nonlinear Schrödinger equations in the Sobolev space of critical order, J. Funct. Anal. 155(2) (1998) 364–380.
[22] M. Nakamura and T. Ozawa, The Cauchy problem for nonlinear wave equations in the homogeneous Sobolev space, Ann. Inst. H. Poincaré Phys. Theor. 71(2) (1999) 199–215.
[23] M. Nakamura and T. Ozawa, Small solutions to nonlinear Schrödinger equations in the Sobolev spaces, J. Anal. Math. 81 (2000) 305–329.
[24] M. Nakamura and T. Ozawa, The Cauchy problem for nonlinear Klein–Gordon equations in the Sobolev spaces, Publ. Res. Inst. Math. Sci. 37(3) (2001) 255–293.
[25] M. Nakamura and T. Ozawa, Small data scattering for nonlinear Schrödinger wave and Klein–Gordon equations, Ann. Sc. Norm. Super. Pisa Cl. Sci. (5) 1(2) (2002) 435–460.
[26] M. Nakao and K. Ono, Existence of global solutions to the Cauchy problem for the semilinear dissipative wave equations, Math. Z. 214(2) (1993) 325–342.
[27] T. Narazaki, $L^p$-$L^q$ estimates for damped wave equations and their applications to semi-linear problem, J. Math. Soc. Japan 56(2) (2004) 585–626.
[28] K. Nishihara, $L^p$-$L^q$ estimates of solutions to the damped wave equation in 3-dimensional space and their application, Math. Z. 244(3) (2003) 631–649.
[29] K. Nishihara, Diffusion phenomena of solutions to the Cauchy problems for a damped wave equation, Sugaku 62(2) (2010) 164–181 (in Japanese).
[30] F. Ribaud, Cauchy problem for semilinear parabolic equations with initial data in $H^s_p(\mathbb{R}^n)$ spaces, Rev. Mat. Iberoamericana 14(1) (1998) 1–46.
[31] B. Ruf, A sharp Trudinger–Moser type inequality for unbounded domains in $\mathbb{R}^2$, J. Funct. Anal. 219(2) (2005) 340–367.
[32] B. Ruf and E. Terraneo, The Cauchy problem for a semilinear heat equation with singular initial data, in Evolution Equations, Semigroups and Functional Analysis (Milano, 2000), Progr. Nonlinear Differential Equations Appl., Vol. 50 (Birkhäuser, Basel, 2002), pp. 295–309.
[33] W. A. Strauss, On weak solutions of semi-linear hyperbolic equations, An. Acad. Brasil. Ci. 42 (1970) 645–651.
[34] G. Todorova and B. Yordanov, Critical exponent for a nonlinear wave equation with damping, J. Differential Equations 174(2) (2001) 464–489.
[35] H. Triebel, Theory of Function Spaces, Monographs in Mathematics, Vol. 78 (Birkhäuser Verlag, Basel, 1983), 284 pp.
[36] B. Wang, The limit behavior of solutions for the Cauchy problem of the complex Ginzburg–Landau equation, Comm. Pure Appl. Math. 55(4) (2002) 481–508.
[37] B. Wang, The Cauchy problem for critical and subcritical semilinear parabolic equations in $L^r$. I, Nonlinear Anal. 48(5) (2002) 747–764.
[38] B. Wang, The Cauchy problem for critical and subcritical semilinear parabolic equations in $L^r$. II. Initial data in critical Sobolev spaces $\dot H^{-s,r}$, Nonlinear Anal. 52(3) (2003) 851–868.
[39] F. B. Weissler, Semilinear evolution equations in Banach spaces, J. Funct. Anal. 32(3) (1979) 277–296.
[40] F. B. Weissler, Local existence and nonexistence for semilinear parabolic equations in $L^p$, Indiana Univ. Math. J. 29(1) (1980) 79–102.
[41] F. B. Weissler, Existence and nonexistence of global solutions for a semilinear heat equation, Israel J. Math. 38(1–2) (1981) 29–40.
[42] Q. S. Zhang, A blow-up result for a nonlinear wave equation with damping: The critical case, C. R. Acad. Sci. Paris Ser. I Math. 333(2) (2001) 109–114.
October 31, J070-S0129055X11004485
2011 13:3 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 9 (2011) 933–967 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004485
SEMI-CLASSICAL WAVE PACKET DYNAMICS FOR HARTREE EQUATIONS
PEI CAO
Department of Mathematical Science, Tsinghua University, Beijing 100084, P. R. China
[email protected]

RÉMI CARLES
CNRS and University Montpellier 2, Mathématiques, CC 051, Place Eugène Bataillon, 34095 Montpellier, France
[email protected]

Received 24 January 2011
Revised 23 September 2011

We study the propagation of wave packets for nonlinear nonlocal Schrödinger equations in the semi-classical limit. When the kernel is smooth, we construct approximate solutions for the wave functions in subcritical, critical and supercritical cases (in terms of the size of the initial data). The validity of the approximation is proved up to Ehrenfest time. For homogeneous kernels, we establish similar results in subcritical and critical cases. Nonlinear superposition principle for two nonlinear wave packets is also considered.

Keywords: Hartree equation; semi-classical analysis; coherent states.

Mathematics Subject Classification 2010: 35Q40, 35Q55, 81Q20, 81S30
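1. Introduction

In this paper, we consider the following semi-classically scaled Hartree equation
$$i\varepsilon\partial_t\psi^\varepsilon + \frac{\varepsilon^2}{2}\Delta\psi^\varepsilon = V(t,x)\psi^\varepsilon + \big(K*|\psi^\varepsilon|^2\big)\psi^\varepsilon,\qquad (t,x)\in\mathbb{R}_+\times\mathbb{R}^d, \tag{1.1}$$
where $K:\mathbb{R}^d\to\mathbb{R}$, $V:\mathbb{R}_+\times\mathbb{R}^d\to\mathbb{R}$, $d\ge1$, with initial data
$$\psi^\varepsilon(0,x) = \varepsilon^{M}\times\varepsilon^{-d/4}\,a\Big(\frac{x-x_0}{\sqrt{\varepsilon}}\Big)e^{i(x-x_0)\cdot\xi_0/\varepsilon},\qquad a\in\mathcal{S}(\mathbb{R}^d),\quad x_0,\xi_0\in\mathbb{R}^d. \tag{1.2}$$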
Such data, which are called semi-classical wave packets (or coherent states), have raised great interest in the linear case (see, e.g., [3, 9, 10, 21, 22]). It is well known
that if the data is a wave packet, then the solution of the linear equation ($K = 0$) associated with (1.1) and (1.2) is still a wave packet at leading order up to times of order $C\log(1/\varepsilon)$, called the Ehrenfest time (see, e.g., [2, 17, 18]). We refer the reader to the recent papers [9, 23–25], where an overview and references on the topic can be found. Throughout this paper, we consider dynamical properties for positive time only: this is just to simplify notation, since the equation is reversible.

This paper is inspired by the two recent papers [1], where (1.1) is considered for a smooth kernel $K$, and [7], where the nonlinearity is local, as opposed to the Hartree nonlinearity. In [7], the authors proved that if the initial data have subcritical size, the leading order behavior of the wave function as $\varepsilon\to0$ is the same as for the linear equation. When the size of the initial data is critical, at leading order the wave function propagates like a coherent state whose envelope is given by a nonlinear equation, up to a nonlinear analogue of the Ehrenfest time. In this paper, we follow a similar approach in the case of nonlocal Schrödinger equations, a case where this analysis is not a priori clear, precisely because the nonlinearity is nonlocal.

Up to changing $\psi^\varepsilon$ to $\varepsilon^{-M}\psi^\varepsilon$, (1.1) and (1.2) can be written as
$$\begin{cases}
i\varepsilon\partial_t\psi^\varepsilon + \dfrac{\varepsilon^2}{2}\Delta\psi^\varepsilon = V(t,x)\psi^\varepsilon + \varepsilon^{\alpha}\big(K*|\psi^\varepsilon|^2\big)\psi^\varepsilon,\\[2mm]
\psi^\varepsilon(0,x) = \varepsilon^{-d/4}\,a\Big(\dfrac{x-x_0}{\sqrt{\varepsilon}}\Big)e^{i(x-x_0)\cdot\xi_0/\varepsilon},
\end{cases} \tag{1.3}$$
with $\alpha = 2M$. Notice that the initial data are of order $O(1)$ in $L^2(\mathbb{R}^d)$, and $\alpha$ accounts for the strength of nonlinear effects in the limit $\varepsilon\to0$.

Consider the trajectories associated with the Hamiltonian $\frac{|\xi|^2}{2}+V(t,x)$:
$$\dot x(t) = \xi(t),\qquad \dot\xi(t) = -\nabla V(t,x(t));\qquad x(0) = x_0,\quad \xi(0) = \xi_0. \tag{1.4}$$

Assumption 1. The external potential $V$ is smooth, real-valued, and at most quadratic in space:
$$V\in C^\infty(\mathbb{R}_+\times\mathbb{R}^d;\mathbb{R})\quad\text{and}\quad \partial_x^\beta V\in L^\infty(\mathbb{R}_+\times\mathbb{R}^d),\quad \forall\,|\beta|\ge2.$$
In addition, we require $t\mapsto\nabla V(t,0)\in L^\infty(\mathbb{R}_+)$.

Remark 1.1. If $V = V(x)$ does not depend on time, the last assumption is automatically fulfilled. This assumption is needed to ensure that the Hamiltonian flow grows at most exponentially in time. Typically, if $V(t,x) = \kappa\cdot x\,e^{e^t}$ for some (constant) $\kappa\in\mathbb{R}^d$, then $\dot\xi(t) = -\kappa e^{e^t}$, so $x$ and $\xi$ grow like a double exponential.

The following lemma is straightforward.

Lemma 1.1. Let $(x_0,\xi_0)\in\mathbb{R}^d\times\mathbb{R}^d$. Under Assumption 1, (1.4) has a unique global, smooth solution $(x,\xi)\in C^\infty(\mathbb{R}_+;\mathbb{R}^d)^2$. It grows at most
exponentially:
$$\exists\,C_0>0,\qquad |x(t)|+|\xi(t)| \lesssim e^{C_0t},\qquad \forall\,t\ge0. \tag{1.5}$$
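To illustrate Lemma 1.1, the following sketch integrates (1.4) in dimension $d = 1$ with a classical fourth-order Runge–Kutta scheme. The integrator `flow`, its step count, and the test potential $V(x) = -x^2/2$ are illustrative choices, not taken from the paper; this potential satisfies Assumption 1 and gives $x(t) = \cosh t$, $\xi(t) = \sinh t$ for $(x_0,\xi_0) = (1,0)$, so that $|x(t)|+|\xi(t)| = e^{t}$ saturates the exponential bound (1.5).

```python
import math
from typing import Callable, Tuple


def flow(grad_V: Callable[[float, float], float],
         x0: float, xi0: float, t_end: float,
         steps: int = 2000) -> Tuple[float, float]:
    """RK4 integration of x' = xi, xi' = -grad_V(t, x) in one space dimension."""
    h = t_end / steps
    t, x, xi = 0.0, x0, xi0
    for _ in range(steps):
        k1x, k1v = xi, -grad_V(t, x)
        k2x, k2v = xi + h / 2 * k1v, -grad_V(t + h / 2, x + h / 2 * k1x)
        k3x, k3v = xi + h / 2 * k2v, -grad_V(t + h / 2, x + h / 2 * k2x)
        k4x, k4v = xi + h * k3v, -grad_V(t + h, x + h * k3x)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        xi += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return x, xi


if __name__ == "__main__":
    # V(x) = -x^2 / 2, so grad V = -x and the exact solution is (cosh t, sinh t).
    x, xi = flow(lambda t, x: -x, 1.0, 0.0, 2.0)
    print(x, xi, abs(x) + abs(xi), math.exp(2.0))
```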
As far as the Hartree kernel is concerned, two cases will be considered, leading to two different interesting phenomena:
• Smooth kernel: $K\in W^{3,\infty}(\mathbb{R}^d)$, with $K$ smooth near the origin.
• Homogeneous kernel: $K(x) = \lambda|x|^{-\gamma}$, with $\lambda\in\mathbb{R}$ and $0<\gamma<\min(2,d)$.
The second case typically includes the three-dimensional Schrödinger–Poisson system.

Remark 1.2. For several results (linearizable case, see the definition below, or finite time propagation), the second assumption could be relaxed to $0<\gamma<\min(4,d)$ (energy-subcritical case). To simplify the presentation, we shall not discuss this extension.

We will focus on the first case in Sec. 2, in which we mostly revisit the results from [1]. In the rest of this introduction, we consider the homogeneous case. We seek the solution in the form
$$\psi^\varepsilon(t,x) = \varepsilon^{-d/4}\,u^\varepsilon\Big(t,\frac{x-x(t)}{\sqrt{\varepsilon}}\Big)e^{i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon}.$$
Here $S(t)$ is the classical Lagrangian action along the Hamiltonian flow generated by (1.4), given by
$$S(t) = \int_0^t\Big(\frac12|\xi(s)|^2 - V(s,x(s))\Big)ds. \tag{1.6}$$
In terms of $u^\varepsilon = u^\varepsilon(t,y)$, (1.3) is equivalent to
$$i\partial_tu^\varepsilon + \frac12\Delta u^\varepsilon = V^\varepsilon(t,y)u^\varepsilon + \lambda\varepsilon^{\alpha-\alpha_c}\big(|y|^{-\gamma}*|u^\varepsilon|^2\big)u^\varepsilon \tag{1.7}$$
with the initial data $u^\varepsilon(0,y) = a(y)$, where
$$\alpha_c = 1+\frac{\gamma}{2}$$
is a critical exponent and the time-dependent potential $V^\varepsilon(t,y)$ is given by
$$V^\varepsilon(t,y) = \frac1\varepsilon\Big(V\big(t,x(t)+\sqrt{\varepsilon}\,y\big) - V(t,x(t)) - \sqrt{\varepsilon}\,\big\langle\nabla V(t,x(t)),y\big\rangle\Big).$$
It is obtained by subtracting the first two terms of the Taylor expansion of $V$ about the point $x(t)$. Passing formally to the limit, $V^\varepsilon$ converges to one half of the Hessian of $V$ at $x(t)$ evaluated at $(y,y)$, that is, to $\frac12\langle y,Q(t)y\rangle$, where throughout the paper we denote
$$Q(t) = \operatorname{Hess}V(t,x(t)). \tag{1.8}$$
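The convergence of $V^\varepsilon$ to the quadratic form built on the Hessian can be checked numerically in one dimension. The sketch below is only an illustration: the helper `V_eps`, the time-independent test potential $V = \cos$, and the sample point are assumptions. By Taylor's formula the error is expected to be of order $\sqrt{\varepsilon}\,|y|^3$.

```python
import math
from typing import Callable


def V_eps(V: Callable[[float], float], dV: Callable[[float], float],
          x: float, y: float, eps: float) -> float:
    """Rescaled potential (V(x + sqrt(eps) y) - V(x) - sqrt(eps) V'(x) y) / eps in 1D."""
    s = math.sqrt(eps)
    return (V(x + s * y) - V(x) - s * dV(x) * y) / eps


def quadratic_limit(d2V: Callable[[float], float], x: float, y: float) -> float:
    """Formal limit (1/2) * V''(x) * y^2 as eps tends to 0."""
    return 0.5 * d2V(x) * y * y


if __name__ == "__main__":
    x, y = 0.3, 1.5
    for eps in (1e-2, 1e-4, 1e-6):
        approx = V_eps(math.cos, lambda z: -math.sin(z), x, y, eps)
        print(eps, approx, quadratic_limit(lambda z: -math.cos(z), x, y))
```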
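1.1. The linear case $\lambda = 0$

Introduce the function
$$\varphi^\varepsilon_{\rm lin}(t,x) = \varepsilon^{-d/4}\,u_{\rm lin}\Big(t,\frac{x-x(t)}{\sqrt{\varepsilon}}\Big)e^{i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon},$$
where $u_{\rm lin}$ solves
$$i\partial_tu_{\rm lin} + \frac12\Delta u_{\rm lin} = \frac12\langle y,Q(t)y\rangle u_{\rm lin};\qquad u_{\rm lin}(0,y) = a(y). \tag{1.9}$$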
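Then the following lemma is well known, see, e.g., [2, 7, 9–11, 16–18] and references therein.

Lemma 1.2. Let $a\in\mathcal{S}(\mathbb{R}^d)$, and let $\psi^\varepsilon$ solve (1.3) with $K = 0$. There exist positive constants $C$ and $C_1$ independent of $\varepsilon$ such that
$$\|\psi^\varepsilon(t)-\varphi^\varepsilon_{\rm lin}(t)\|_{L^2(\mathbb{R}^d)} \le C\sqrt{\varepsilon}\,e^{C_1t}.$$
In particular, there exists $c>0$ independent of $\varepsilon$ such that
$$\sup_{0\le t\le c\ln\frac1\varepsilon}\|\psi^\varepsilon(t)-\varphi^\varepsilon_{\rm lin}(t)\|_{L^2(\mathbb{R}^d)} \underset{\varepsilon\to0}{\longrightarrow} 0.$$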
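The second statement of Lemma 1.2 follows from the first by a one-line computation, recorded here as a worked step; the explicit threshold on $c$ is simply what the stated bound yields and is not claimed to be optimal.

```latex
% For 0 <= t <= c ln(1/eps), the exponential factor is at most eps^{-c C_1}:
\sup_{0\le t\le c\ln\frac{1}{\varepsilon}} C\sqrt{\varepsilon}\,e^{C_1 t}
  \;=\; C\sqrt{\varepsilon}\,e^{C_1 c\ln\frac{1}{\varepsilon}}
  \;=\; C\,\varepsilon^{\frac12 - cC_1}
  \;\underset{\varepsilon\to 0}{\longrightarrow}\; 0
  \qquad\text{as soon as}\qquad 0 < c < \frac{1}{2C_1}.
```

The same computation explains the logarithmic (Ehrenfest) time scale in the nonlinear statements below, where the rate $\sqrt{\varepsilon}$ is replaced by $\varepsilon^{\kappa}$ and the admissible range becomes $c < \kappa/C_1$.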
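1.2. The nonlinear case $\lambda\neq0$

As in [7], we introduce two linear operators, which are essentially $\nabla$ and $x$, up to the wave packet scaling, in the moving frame:
$$A^\varepsilon(t) = \sqrt{\varepsilon}\,\nabla - i\frac{\xi(t)}{\sqrt{\varepsilon}};\qquad B^\varepsilon(t) = \frac{x-x(t)}{\sqrt{\varepsilon}}.$$
For $f\in\Sigma := \{f\in H^1(\mathbb{R}^d);\ xf\in L^2(\mathbb{R}^d)\}$, we define
$$\|f\|_{\mathcal{H}} = \|f\|_{L^2(\mathbb{R}^d)} + \|A^\varepsilon f\|_{L^2(\mathbb{R}^d)} + \|B^\varepsilon f\|_{L^2(\mathbb{R}^d)}.$$

1.2.1. The subcritical case $\alpha>\alpha_c$

In this case, the solution of (1.3) is linearizable in the sense of [14]: $\varphi^\varepsilon_{\rm lin}$ yields a good approximation to $\psi^\varepsilon$, up to Ehrenfest time.

Proposition 1.1. Let $\lambda\in\mathbb{R}$, $0<\gamma<\min(2,d)$ and $\alpha>\alpha_c$. Suppose that $a\in\mathcal{S}(\mathbb{R}^d)$ and $V$ satisfies Assumption 1. Then there exist positive constants $C, C_1, C_2$ independent of $\varepsilon$, and $\varepsilon_0>0$ such that for any $\varepsilon\in\,]0,\varepsilon_0]$,
$$\|\psi^\varepsilon(t)-\varphi^\varepsilon_{\rm lin}(t)\|_{\mathcal{H}} \le C\varepsilon^{\kappa}e^{C_1t},\qquad 0\le t\le C_2\ln\frac1\varepsilon,\qquad \kappa = \min\Big(\frac12,\ \alpha-\alpha_c\Big).$$
In particular, there exists a positive constant $c$ independent of $\varepsilon$ such that
$$\sup_{0\le t\le c\ln\frac1\varepsilon}\|\psi^\varepsilon(t)-\varphi^\varepsilon_{\rm lin}(t)\|_{\mathcal{H}} \underset{\varepsilon\to0}{\longrightarrow} 0.$$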
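1.2.2. The critical case $\alpha = \alpha_c$

By passing formally to the limit $\varepsilon\to0$, (1.7) can be written as
$$i\partial_tu + \frac12\Delta u = \frac12\langle y,Q(t)y\rangle u + \lambda\big(|y|^{-\gamma}*|u|^2\big)u;\qquad u(0,y) = a(y). \tag{1.10}$$
The Cauchy problem for (1.10) is addressed in Sec. 3.2. For $\alpha = \alpha_c$, the solution to (1.3) is not linearizable: the nonlinearity affects the dynamics at leading order. For $k\in\mathbb{N}$, define ($\Sigma^1 = \Sigma$)
$$\Sigma^k = \Big\{f\in L^2(\mathbb{R}^d)\ ;\ \|f\|_{\Sigma^k} := \sum_{|\alpha|+|\beta|\le k}\|x^\alpha\partial_x^\beta f\|_{L^2(\mathbb{R}^d)} < \infty\Big\}.$$
We prove:

Theorem 1.1. Let $\lambda\in\mathbb{R}$, $0<\gamma<\min(2,d)$, $\alpha = \alpha_c$ and $a\in\Sigma^3$. Suppose that $V$ satisfies Assumption 1. Let $u\in C(\mathbb{R}_+;\Sigma^3)$ be the solution to (1.10) and
$$\varphi^\varepsilon(t,x) = \varepsilon^{-d/4}\,u\Big(t,\frac{x-x(t)}{\sqrt{\varepsilon}}\Big)e^{i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon}. \tag{1.11}$$
Then there exist positive constants $C, C_1, C_2$ independent of $\varepsilon$, and $\varepsilon_0>0$ such that for any $\varepsilon\in\,]0,\varepsilon_0]$,
$$\|\psi^\varepsilon(t)-\varphi^\varepsilon(t)\|_{L^2(\mathbb{R}^d)} \le C\sqrt{\varepsilon}\exp(C_1t),\qquad 0\le t\le C_2\ln\frac1\varepsilon.$$
In particular, there exists a positive constant $c$ independent of $\varepsilon$ such that
$$\sup_{0\le t\le c\ln\frac1\varepsilon}\|\psi^\varepsilon(t)-\varphi^\varepsilon(t)\|_{L^2(\mathbb{R}^d)} \underset{\varepsilon\to0}{\longrightarrow} 0.$$
Furthermore, if $a\in\Sigma^4$, then for the same constants $C_1, C_2$ as above,
$$\|\psi^\varepsilon(t)-\varphi^\varepsilon(t)\|_{\mathcal{H}} \le C_3\sqrt{\varepsilon}\exp(C_1t),\qquad 0\le t\le C_2\ln\frac1\varepsilon.$$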
Remark 1.3 (Notion of Criticality). We will see in Sec. 2 that when K is smooth near the origin, the critical value of α is αc = 1. For such K, K(x) = K(0) + O(|x|) near the origin, so this can be viewed as a limiting case γ = 0, since x → K(0) is 0-homogeneous. For more general Hartree kernels, the picture should be as follows. Assume that there exists λ ∈ R\{0}, γ ≥ 0, δ > 0 such that K(x) = λ|x|−γ + O(|x|−γ+δ ) as x → 0, and that K is smooth, bounded as well as its derivatives, away from the origin. Then we expect αc = 1 + γ/2, with critical phenomena similar to the cases studied in this paper: like in the smooth kernel case if γ = 0, and like in the homogeneous kernel case if γ > 0 (since wave packets are extremely localized, the behavior of K near the origin should be the only relevant one).
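The value $\alpha_c = 1+\gamma/2$ can be recovered from a scaling computation, which we record as a short worked step (a sketch of the formal argument, using the wave packet ansatz of this introduction with $y = (x-x(t))/\sqrt{\varepsilon}$).

```latex
% Convolution with the homogeneous kernel, after the change of variable
% z = x(t) + \sqrt{\varepsilon}\, y' (so that dz = \varepsilon^{d/2} dy'):
\big(|\cdot|^{-\gamma} * |\psi^\varepsilon(t)|^2\big)(x)
  = \int_{\mathbb{R}^d} |x-z|^{-\gamma}\,\varepsilon^{-d/2}
      \Big|u^\varepsilon\Big(t,\frac{z-x(t)}{\sqrt{\varepsilon}}\Big)\Big|^2 dz
  = \varepsilon^{-\gamma/2}\,\big(|\cdot|^{-\gamma} * |u^\varepsilon(t)|^2\big)(y).
% The linear part of (1.3) is of order \varepsilon in terms of u^\varepsilon,
% so after dividing by \varepsilon the nonlinear term carries the factor
\varepsilon^{\alpha}\cdot\varepsilon^{-\gamma/2}\cdot\varepsilon^{-1}
  = \varepsilon^{\alpha-\alpha_c},
  \qquad \alpha_c = 1+\frac{\gamma}{2}.
% For a kernel that is smooth near the origin, K(\sqrt{\varepsilon}(y-y')) = K(0) + O(\sqrt{\varepsilon}),
% so no negative power of \varepsilon appears: this is the case \gamma = 0, \alpha_c = 1.
```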
1.3. Nonlinear superposition

In this paragraph, we consider the critical case $\alpha = \alpha_c$. Suppose that the initial data have the form
$$\psi^\varepsilon(0,x) = \varepsilon^{-d/4}a_1\Big(\frac{x-x_1}{\sqrt{\varepsilon}}\Big)e^{i(x-x_1)\cdot\xi_1/\varepsilon} + \varepsilon^{-d/4}a_2\Big(\frac{x-x_2}{\sqrt{\varepsilon}}\Big)e^{i(x-x_2)\cdot\xi_2/\varepsilon},$$
where $a_1,a_2\in\mathcal{S}(\mathbb{R}^d)$, and $(x_1,\xi_1)\neq(x_2,\xi_2)$. For $j\in\{1,2\}$, $(x_j(t),\xi_j(t))$ are the trajectories solving (1.4) with initial data $(x_j,\xi_j)$. Let $S_j(t)$ be the classical action associated with $(x_j(t),\xi_j(t))$ given by (1.6), and let $u_j$ be the solutions of (1.10) with initial data $a_j$. Assume that the $\varphi^\varepsilon_j$'s are defined as in (1.11), and that $\psi^\varepsilon\in C(\mathbb{R}_+;\Sigma)$ is the solution to (1.3). As in [7], for $f\in\Sigma$, define
$$\|f\|_{\Sigma_\varepsilon} = \|f\|_{L^2(\mathbb{R}^d)} + \|\varepsilon\nabla f\|_{L^2(\mathbb{R}^d)} + \|xf\|_{L^2(\mathbb{R}^d)}.$$
For bounded time, we have

Theorem 1.2. Let $0<\gamma<\min(2,d)$ and $a_1,a_2\in\mathcal{S}(\mathbb{R}^d)$. For any $T>0$ independent of $\varepsilon$,
$$\sup_{0\le t\le T}\|\psi^\varepsilon(t)-\varphi^\varepsilon_1(t)-\varphi^\varepsilon_2(t)\|_{\Sigma_\varepsilon} = O\big(\varepsilon^{\frac{\gamma}{2(1+\gamma)}}\big).$$

When time becomes large, we can establish a superposition property for $d = 1$, as in [7], where the condition $\gamma<\min(2,d)$ boils down to $\gamma<1$.

Theorem 1.3. Let $d = 1$, $0<\gamma<1$ and $a_1,a_2\in\mathcal{S}(\mathbb{R}^d)$. Assume that $V$ does not depend on time, and define
$$E_j = \frac{\xi_j^2}{2}+V(x_j).$$
Suppose $E_1\neq E_2$. There exist positive constants $C, C_1, C_2$ independent of $\varepsilon$, and $\varepsilon_0>0$ such that for all $\varepsilon\in\,]0,\varepsilon_0]$,
$$\|\psi^\varepsilon(t)-\varphi^\varepsilon_1(t)-\varphi^\varepsilon_2(t)\|_{\Sigma_\varepsilon} \le C\varepsilon^{\frac{\gamma}{2(1+\gamma)}}e^{C_1t},\qquad 0\le t\le C_2\ln\frac1\varepsilon.$$
In particular, there exists a positive constant $c$ independent of $\varepsilon$ such that
$$\sup_{0\le t\le c\ln\frac1\varepsilon}\|\psi^\varepsilon(t)-\varphi^\varepsilon_1(t)-\varphi^\varepsilon_2(t)\|_{\Sigma_\varepsilon} \underset{\varepsilon\to0}{\longrightarrow} 0.$$

Remark 1.4. In the case of a smooth kernel, and when nonlinear effects are present at leading order ($\alpha\le1$), there is no longer such a nonlinear superposition principle: nonlinear interferences affect the behavior of the solution at leading order. We refer to [6], where these effects are studied in detail.

Notation. Throughout the paper, $C$ denotes a constant independent of $\varepsilon$ and $t$, whose value may change from one line to the other. For two positive numbers $a^\varepsilon$ and $b^\varepsilon$, the notation $a^\varepsilon\lesssim b^\varepsilon$ means that there exists $C>0$ independent of $\varepsilon$ such that for all $\varepsilon\in\,]0,1]$, $a^\varepsilon\le Cb^\varepsilon$.
2. Smooth Kernel

In this section, we shall assume that the kernel $K$ satisfies:

Assumption 2. The kernel is bounded as well as its first three derivatives, and smooth near the origin: for some neighborhood $\omega$ of the origin in $\mathbb{R}^d$, $K\in W^{3,\infty}(\mathbb{R}^d)\cap C^3(\omega)$.

This assumption is similar to the one made in [1]. We first establish the local and global existence of the solution to (1.3) at the $L^2$ level.

2.1. Construction of the exact solution

We prove that for fixed $\varepsilon>0$, (1.1) has a unique, global in time solution under Assumption 1. Since $\varepsilon$ is irrelevant for such a result, we shall consider the case $\varepsilon = 1$.

Lemma 2.1. Let $V$ satisfy Assumption 1, $K\in L^\infty(\mathbb{R}^d)$ and $\psi_0\in L^2(\mathbb{R}^d)$. There exists a unique solution $\psi\in C(\mathbb{R}_+;L^2(\mathbb{R}^d))$ to
$$i\partial_t\psi + \frac12\Delta\psi = V(t,x)\psi + \big(K*|\psi|^2\big)\psi;\qquad \psi_{|t=0} = \psi_0.$$
In addition, it satisfies $\|\psi(t)\|_{L^2(\mathbb{R}^d)} = \|\psi_0\|_{L^2(\mathbb{R}^d)}$ for all time $t\ge0$.

Proof. Since $V$ may depend on time, we consider the more general Cauchy problem with a varying initial time:
$$i\partial_t\psi + \frac12\Delta\psi = V(t,x)\psi + \big(K*|\psi|^2\big)\psi;\qquad \psi_{|t=s} = \psi_s, \tag{2.1}$$
with $\psi_s\in L^2(\mathbb{R}^d)$. In view of Assumption 1, the linear case generates a unitary semigroup ([12, 13]), which we denote by $U(t,s)$: $\psi(t) = U(t,s)\psi_s$ when $K = 0$. In the nonlinear case, using Duhamel's formula, we can write (2.1) as
$$\psi(t) = U(t,s)\psi_s - i\int_s^tU(t,\tau)\big((K*|\psi|^2)\psi\big)(\tau)\,d\tau =: \Phi_s(\psi)(t).$$
For $s\ge0$ and $T>0$, denote $I_{s,T} = [s,s+T]$ and introduce the space
$$X_{s,T} = \big\{\psi\in C(I_{s,T},L^2(\mathbb{R}^d));\ \|\psi\|_{L^\infty(I_{s,T};L^2(\mathbb{R}^d))}\le2\|\psi_s\|_{L^2(\mathbb{R}^d)}\big\}.$$
Let $\psi,\psi_1,\psi_2\in X_{s,T}$. Using the Hölder inequality, we have
$$\begin{aligned}
\|\Phi_s(\psi)\|_{L^\infty(I_{s,T};L^2(\mathbb{R}^d))} &\le \|\psi_s\|_{L^2(\mathbb{R}^d)} + \big\|(K*|\psi|^2)\psi\big\|_{L^1(I_{s,T};L^2(\mathbb{R}^d))}\\
&\le \|\psi_s\|_{L^2(\mathbb{R}^d)} + T\|K\|_{L^\infty(\mathbb{R}^d)}\|\psi\|^3_{L^\infty(I_{s,T};L^2(\mathbb{R}^d))}\\
&\le \|\psi_s\|_{L^2(\mathbb{R}^d)} + 8T\|K\|_{L^\infty(\mathbb{R}^d)}\|\psi_s\|^3_{L^2(\mathbb{R}^d)}.
\end{aligned}$$
October 31, J070-S0129055X11004485
940
2011 13:3 WSPC/S0129-055X
148-RMP
P. Cao & R. Carles
Observe that
\[ (K*|\psi_1|^2)\psi_1 - (K*|\psi_2|^2)\psi_2 = \frac12\big(K*(|\psi_1|^2+|\psi_2|^2)\big)(\psi_1-\psi_2) + \frac12\big(K*(|\psi_1|^2-|\psi_2|^2)\big)(\psi_1+\psi_2). \]
Then by similar arguments as above, we have
\[ \|\Phi_s(\psi_1) - \Phi_s(\psi_2)\|_{L^\infty(I_{s,T};L^2)} \le CT\|K\|_{L^\infty}\|\psi_s\|^2_{L^2}\|\psi_1-\psi_2\|_{L^\infty(I_{s,T};L^2)}. \]
Taking $T$ small enough, we conclude that $\Phi_s$ is a contraction from $X_{s,T}$ into itself, and there exists a unique local solution $\psi \in C(I_{s,T}; L^2(\mathbb{R}^d))$ to (2.1). By classical arguments, the $L^2$-norm of $\psi$ does not depend on time, and since $T$ depends only on $\|\psi_s\|_{L^2}$, the solution is global in time.

2.2. The general strategy

As in [7], we seek an approximate solution of the form
\[ \varphi^\varepsilon(t,x) = \varepsilon^{-d/4}\, u\Big(t, \frac{x-x(t)}{\sqrt\varepsilon}\Big)\, e^{i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon}, \tag{2.2} \]
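for some profile $u$ independent of $\varepsilon$, and some function $S(t)$ to be determined. When $K = 0$, $S$ is the classical action defined in (1.6). We will see that according to the value $\alpha$ in (1.3), the expression of $S$ may vary, accounting for nonlinear effects due to the presence of the Hartree nonlinearity. In the cases $\alpha = 0, 1/2, 1$ and $\alpha > 1$, we will see that we can write
\[ i\varepsilon\partial_t\varphi^\varepsilon + \frac{\varepsilon^2}{2}\Delta\varphi^\varepsilon - V\varphi^\varepsilon - \varepsilon^\alpha(K*|\varphi^\varepsilon|^2)\varphi^\varepsilon = \varepsilon^{-d/4} e^{i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon}\big(b_0 + \sqrt\varepsilon\, b_1 + \varepsilon b_2 + \varepsilon r^\varepsilon\big), \tag{2.3} \]
for $b_0, b_1, b_2$ independent of $\varepsilon$. The approximate solution $\varphi^\varepsilon$ will be determined by the conditions $b_0 = b_1 = b_2 = 0$. The remaining factor $r^\varepsilon$ will account for the error between the exact solution $\psi^\varepsilon$ and the approximate solution $\varphi^\varepsilon$. Denote
\[ \phi(t,x) = S(t) + \xi(t)\cdot(x - x(t)). \]
The linear terms are computed as follows:
\begin{align*}
i\varepsilon\partial_t\varphi^\varepsilon &= \varepsilon^{-d/4} e^{i\phi(t,x)/\varepsilon}\big(i\varepsilon\partial_t u - i\sqrt\varepsilon\,\dot x(t)\cdot\nabla u - u\,\partial_t\phi\big), \\
\frac{\varepsilon^2}{2}\Delta\varphi^\varepsilon &= \varepsilon^{-d/4} e^{i\phi(t,x)/\varepsilon}\Big(\frac{\varepsilon}{2}\Delta u + i\sqrt\varepsilon\,\xi(t)\cdot\nabla u - \frac{|\xi(t)|^2}{2}u\Big).
\end{align*}
Here, as well as below, one should remember that the functions are assessed as in (2.2). Recalling that the relevant space variable for $u$ is
\[ y = \frac{x-x(t)}{\sqrt\varepsilon}, \]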
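we have:
\[ \partial_t\phi = \dot S(t) + \frac{d}{dt}\big(\xi(t)\cdot(x-x(t))\big) = \dot S(t) + \sqrt\varepsilon\,\dot\xi(t)\cdot y - \xi(t)\cdot\dot x(t). \]
For the linear potential term, we compute, in terms of the variable $y$,
\[ V\varphi^\varepsilon = V(t,x)\,\varepsilon^{-d/4} e^{i\phi(t,x)/\varepsilon} u(t,y) = \varepsilon^{-d/4} e^{i\phi(t,x)/\varepsilon}\, V(t, x(t)+y\sqrt\varepsilon)\,u(t,y), \]
and we perform a Taylor expansion for $V$ about $(t, x(t))$:
\[ V(t,x(t)+y\sqrt\varepsilon)u(t,y) = V(t,x(t))u(t,y) + \sqrt\varepsilon\, y\cdot\nabla V(t,x(t))u(t,y) + \frac{\varepsilon}{2}\langle y, \nabla^2 V(t,x(t))y\rangle u(t,y) + \varepsilon^{3/2} r_V^\varepsilon(t,y), \]
with $|r_V^\varepsilon(t,y)| \le C\langle y\rangle^3|u(t,y)|$, for some $C$ independent of $\varepsilon$, $t$ and $y$, in view of Assumption 1. In the case $K = 0$, we come up with the relations:
\begin{align*}
b_0^{\rm lin} &= -u\Big(\dot S(t) - \xi(t)\cdot\dot x(t) + \frac{|\xi(t)|^2}{2} + V(t,x(t))\Big), \\
b_1^{\rm lin} &= -i(\dot x(t) - \xi(t))\cdot\nabla u - y\cdot(\dot\xi(t) + \nabla V(t,x(t)))u, \\
b_2^{\rm lin} &= i\partial_t u + \frac12\Delta u - \frac12\langle y, \nabla^2 V(t,x(t))y\rangle u.
\end{align*}
For the nonlinear term, we compute similarly
\begin{align*}
(K*|\varphi^\varepsilon|^2)(t,x) &= \int K(x-z)|\varphi^\varepsilon(t,z)|^2\,dz = \varepsilon^{-d/2}\int K(x-z)\Big|u\Big(t,\frac{z-x(t)}{\sqrt\varepsilon}\Big)\Big|^2 dz \\
&= \int K(x - x(t) - z\sqrt\varepsilon)|u(t,z)|^2\,dz.
\end{align*}
We have, in terms of the variable $y$:
\[ (K*|\varphi^\varepsilon|^2)(t,x) = \int K((y-z)\sqrt\varepsilon)|u(t,z)|^2\,dz. \]
This is where the smoothness of $K$ near the origin becomes important: performing a Taylor expansion, we write
\[ K((y-z)\sqrt\varepsilon) = K(0) + \sqrt\varepsilon\,(y-z)\cdot\nabla K(0) + \frac{\varepsilon}{2}\langle y-z, \nabla^2 K(0)(y-z)\rangle + \varepsilon^{3/2} r_K^\varepsilon(y-z), \]
with
\[ |r_K^\varepsilon(y-z)| \le C\langle y-z\rangle^3, \]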
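for some $C$ independent of $\varepsilon$, $y$ and $z$. Therefore, using the conservation of mass,
\begin{align*}
\varepsilon^{d/4} e^{-i\phi/\varepsilon}(K*|\varphi^\varepsilon|^2)\varphi^\varepsilon(t,x) ={}& K(0)\|a\|^2_{L^2}u(t,y) + \sqrt\varepsilon\,\|a\|^2_{L^2}\, y\cdot\nabla K(0)u(t,y) - \sqrt\varepsilon\,\nabla K(0)\cdot G(t)u(t,y) \\
& + \frac{\varepsilon}{2}\langle y, \nabla^2 K(0)y\rangle\|a\|^2_{L^2}u(t,y) + \frac{\varepsilon}{2}\int\langle z, \nabla^2 K(0)z\rangle|u(t,z)|^2\,dz \times u(t,y) \\
& - \varepsilon\langle\nabla^2 K(0)G(t), y\rangle u(t,y),
\end{align*}
where the notation $G(t)$ stands for
\[ G(t) = \int_{\mathbb{R}^d} z|u(t,z)|^2\,dz. \]
We then discuss the outcome in (2.3) according to the value of $\alpha$, on a formal level. We present the strategy to justify the approximation in the case $\alpha = 0$ only, since this case contains all the arguments needed to treat the other cases.

2.3. Subcritical case: α > 1

When $\alpha > 1$, we have $b_j = b_j^{\rm lin}$ for $j = 0, 1, 2$, and
\[ r^\varepsilon = \sqrt\varepsilon\, r_V^\varepsilon + \varepsilon^{\alpha-1}(K*|\varphi^\varepsilon|^2)\varphi^\varepsilon. \]
Solving the equations $b_0^{\rm lin} = b_1^{\rm lin} = b_2^{\rm lin} = 0$ leads to the approximate solution $\varphi^\varepsilon_{\rm lin}$ defined in Sec. 1.1.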
2.4. The critical case: α = 1

When $\alpha = 1$, we still have $b_j = b_j^{\rm lin}$ for $j = 0, 1$, but the expression for $b_2$ is altered:
\[ b_2 = i\partial_t u + \frac12\Delta u - \frac12\langle y, \nabla^2 V(t,x(t))y\rangle u - K(0)\|a\|^2_{L^2}u. \]
The equation $b_2 = 0$ is the linear envelope equation (1.9), plus a constant potential, $K(0)\|a\|^2_{L^2}$. We infer
\[ u(t,y) = u_{\rm lin}(t,y)\, e^{-itK(0)\|a\|^2_{L^2}}. \]
The presence of this phase shift accounts for nonlinear effects at leading order in the approximate wave packet $\varphi^\varepsilon$. For the remainder term, we have:
\begin{align*}
r^\varepsilon(t,y) ={}& \sqrt\varepsilon\, r_V^\varepsilon(t,y) + \sqrt\varepsilon\,\|a\|^2_{L^2}\, y\cdot\nabla K(0)u(t,y) - \sqrt\varepsilon\,\nabla K(0)\cdot G(t)u(t,y) \\
& + \frac{\varepsilon}{2}\langle y, \nabla^2 K(0)y\rangle\|a\|^2_{L^2}u(t,y) + \frac{\varepsilon}{2}\int\langle z, \nabla^2 K(0)z\rangle|u(t,z)|^2\,dz \times u(t,y) \\
& - \varepsilon\langle\nabla^2 K(0)G(t), y\rangle u(t,y),
\end{align*}
and we infer the (rough) pointwise estimate
\[ |r^\varepsilon(t,y)| \le C\sqrt\varepsilon\,\langle y\rangle^3|u(t,y)|\big(1 + \|u(t)\|^2_\Sigma\big). \tag{2.4} \]
2.5. A supercritical case: α = 1/2

For $\alpha < 1$, we have to assume either $\alpha = 1/2$ or $\alpha = 0$ in order to derive functions $b_j$ which do not depend on $\varepsilon$. For the simplicity of the presentation, we therefore stick to these cases, but essentially, the case $1/2 < \alpha < 1$ is treated like the case $\alpha = 1/2$, and the case $0 < \alpha < 1/2$ like the case $\alpha = 0$, up to the introduction of a rather weak dependence upon $\varepsilon$ in the functions involved in the definition of the approximate solution. Roughly speaking, when $1/2 < \alpha < 1$ or $0 < \alpha < 1/2$, replace $K$ with $\varepsilon^{\alpha-1/2}$ in all the formulas in Secs. 2.5 and 2.6, respectively.

In the case $\alpha = 1/2$, we still have $b_0 = b_0^{\rm lin}$, but now with
\begin{align*}
b_1 &= -i(\dot x(t) - \xi(t))\cdot\nabla u - y\cdot(\dot\xi(t) + \nabla V(t,x(t)))u - K(0)\|a\|^2_{L^2}u, \\
b_2 &= i\partial_t u + \frac12\Delta u - \frac12\langle y, \nabla^2 V(t,x(t))y\rangle u - \|a\|^2_{L^2}\, y\cdot\nabla K(0)u + \nabla K(0)\cdot G(t)u.
\end{align*}
At this stage, it is easy to convince oneself that the equations $b_0 = b_1 = b_2 = 0$ are not compatible in general (if one wants to consider a non-zero solution $u$). Therefore, we modify our strategy, in order to allow $b_0$ to depend on $\varepsilon$, so we can move the last term in $b_1$ up to $b_0$. This leads to:
\begin{align*}
b_0^\varepsilon &= -u\Big(\dot S(t) - \xi(t)\cdot\dot x(t) + \frac{|\xi(t)|^2}{2} + V(t,x(t)) + \sqrt\varepsilon\, K(0)\|a\|^2_{L^2}\Big), \\
b_1 &= -i(\dot x(t) - \xi(t))\cdot\nabla u - y\cdot(\dot\xi(t) + \nabla V(t,x(t)))u, \\
b_2 &= i\partial_t u + \frac12\Delta u - \frac12\langle y, \nabla^2 V(t,x(t))y\rangle u - \|a\|^2_{L^2}\, y\cdot\nabla K(0)u + \nabla K(0)\cdot G(t)u.
\end{align*}
Keeping $(x(t), \xi(t))$ solution to the Hamiltonian flow (1.4) leads to the $\varepsilon$-dependent action:
\[ S^\varepsilon(t) = \int_0^t\Big(\frac12|\xi(s)|^2 - V(s,x(s))\Big)ds - t\sqrt\varepsilon\, K(0)\|a\|^2_{L^2(\mathbb{R}^d)}. \]
The equation $b_1 = 0$ is then fulfilled as soon as we consider the standard Hamiltonian flow. The equation $b_2 = 0$ is an envelope equation, which is nonlinear since $G$ is a nonlinear function of $u$. Note however that this yields a purely time-dependent potential. Up to the time-dependent gauge transform
\[ u(t,y) \mapsto u(t,y)\exp\Big(i\int_0^t\nabla K(0)\cdot G(s)\,ds\Big), \]
which preserves the modulus of the unknown, hence $G$, the equation for $u$ becomes a linear profile equation. Finally, we still have a remainder term satisfying (2.4).
2.6. Another supercritical case: α = 0

This case corresponds to the one studied in [1]. We consider a more general framework though, since for instance we do not assume that the kernel $K$ is radially symmetric. We now have
\begin{align*}
b_0 &= -u\Big(\dot S(t) - \xi(t)\cdot\dot x(t) + \frac{|\xi(t)|^2}{2} + V(t,x(t)) + K(0)\|a\|^2_{L^2}\Big), \\
b_1 &= -i(\dot x(t) - \xi(t))\cdot\nabla u - y\cdot(\dot\xi(t) + \nabla V(t,x(t)))u - \|a\|^2_{L^2}\, y\cdot\nabla K(0)u + \nabla K(0)\cdot G(t)u, \\
b_2 &= i\partial_t u + \frac12\Delta u - \frac12\langle y, M(t)y\rangle u + \langle\nabla^2 K(0)G(t), y\rangle u - \frac12\int\langle z, \nabla^2 K(0)z\rangle|u(t,z)|^2\,dz \times u,
\end{align*}
where we have denoted
\[ M(t) = \|a\|^2_{L^2(\mathbb{R}^d)}\nabla^2 K(0) + \nabla_x^2 V(t,x(t)). \]
Note that $M \in L^\infty_t(\mathbb{R}_+)$. We will assume $\nabla K(0) = 0$, so $b_1 = 0$ as soon as $(x(t), \xi(t))$ satisfies (1.4). This assumption is a consequence of the framework in [1], since the authors suppose $K(x) = F(|x|)$ with $F$ even. Note that the slightly more general assumption $K(x) = K(-x)$ is physically relevant, in the sense that in that case, an energy can be associated to the Hartree nonlinearity (see, e.g., [8]):
\[ \iint K(x-y)|\psi^\varepsilon(t,x)|^2|\psi^\varepsilon(t,y)|^2\,dx\,dy. \]
In the case where $K$ is even, we obviously have $\nabla K(0) = 0$. We then consider the Hamiltonian flow (1.4), the modified action
\[ S(t) = \int_0^t\Big(\frac12|\xi(s)|^2 - V(s,x(s))\Big)ds - tK(0)\|a\|^2_{L^2(\mathbb{R}^d)}, \tag{2.5} \]
and the envelope equation $b_2 = 0$. The remainder term still satisfies (2.4).

Remark 2.1. Note that the Wigner measure of $\psi^\varepsilon$ is not affected by the nonlinearity:
\[ w(t,x,\xi) = \|a\|^2_{L^2(\mathbb{R}^d)}\,\delta(x - x(t)) \otimes \delta(\xi - \xi(t)), \]
in the four cases $\alpha > 1$, $\alpha = 1$, $\alpha = 1/2$ and $\alpha = 0$, even though we have seen that the Hartree nonlinearity does affect the leading order behavior of the wave function.
2.7. Proof of stability in the case α = 0

We want to construct a solution to
\[ i\partial_t u + \frac12\Delta u = \frac12\langle y, M(t)y\rangle u - \langle\nabla^2 K(0)G(t), y\rangle u + \frac12\int\langle z, \nabla^2 K(0)z\rangle|u(t,z)|^2\,dz \times u, \tag{2.6} \]
with initial datum $a$. Introduce the solution to
\[ i\partial_t v + \frac12\Delta v = \frac12\langle y, M(t)y\rangle v - \langle\nabla^2 K(0)G(t), y\rangle v; \qquad v(0,y) = a(y). \tag{2.7} \]
The functions $u$ and $v$ solve the same equation, up to one term which can be absorbed by the gauge transform $u(t,y) = v(t,y)\exp(i\theta(t))$, where
\[ \theta(t) = -\frac12\int_0^t\int_{\mathbb{R}^d}\langle z, \nabla^2 K(0)z\rangle|u(s,z)|^2\,dz\,ds. \]
Since this gauge transform does not affect the modulus of the solution, one should consider that in (2.7),
\[ G(t) = \int_{\mathbb{R}^d} z|v(t,z)|^2\,dz. \]

Lemma 2.2. Let $k \ge 1$ and assume that $a \in \Sigma^k$. Then (2.7) has a unique solution $v \in C(\mathbb{R}_+; \Sigma^k)$, and there exists $C$ such that
\[ \|v(t)\|_{\Sigma^k} \le Ce^{Ct}, \qquad t \ge 0. \]
As a consequence, (2.6) has a unique solution, which possesses the same properties.

Remark 2.2. The $\Sigma$ regularity is the least one has to demand in this result, for the gauge $\theta$ to be well defined (and the harmonic oscillator rotates the phase space, so the regularity must be the same in space and frequency).

Proof. The main difficulty is that since the last term in the equation involves a time dependent potential which is unbounded in $y$, it cannot be treated as a perturbation. So to construct a local solution, we modify the standard Picard iterative scheme, to consider
\[ i\partial_t v_n + \frac12\Delta v_n = \frac12\langle y, M(t)y\rangle v_n - \langle\nabla^2 K(0)G_{n-1}(t), y\rangle v_n, \qquad n \ge 1, \tag{2.8} \]
with $v_{n|t=0} = a$ for all $n$, $v_0(t,y) = a(y)$, and
\[ G_k(t) = \int_{\mathbb{R}^d} z|v_k(t,z)|^2\,dz. \]
(The sign of the $G_{n-1}$ term is taken as in (2.7); it is the one consistent with the relation for $\ddot G_n$ below.) At each step, we solve a linear equation, with a time dependent potential which is at most quadratic: if $G_{n-1} \in L^\infty_{\rm loc}(\mathbb{R}_+)$, [13] ensures the existence of
$v_n \in C(\mathbb{R}_+; L^2(\mathbb{R}^d))$. Applying the operators $y$ and $\nabla_y$ to (2.8) shows that $v_n \in C(\mathbb{R}_+; \Sigma)$, hence $G_n \in L^\infty_{\rm loc}(\mathbb{R}_+)$. To prove the convergence of this scheme we need more precise (uniform in $n$) estimates. Direct computations show that
\[ \dot G_n(t) = \operatorname{Im}\int_{\mathbb{R}^d}\overline{v_n}(t,z)\nabla v_n(t,z)\,dz, \]
and
\[ \ddot G_n(t) + M(t)G_n(t) = \nabla^2 K(0)\|a\|^2_{L^2(\mathbb{R}^d)}G_{n-1}(t), \qquad n \ge 1. \]
These relations are straightforward on a formal level, and can be justified thanks to standard techniques (see, e.g., [8]). Let $f_n(t) = |\dot G_n(t)|^2 + |G_n(t)|^2$. We have
\[ \dot f_n(t) \le 2|\dot G_n(t)||\ddot G_n(t)| + 2|\dot G_n(t)||G_n(t)| \le Cf_n(t) + C|G_{n-1}(t)|^2, \]
for some $C$ independent of $t$ and $n$. We infer that there exists $C_0$ independent of $t \ge 0$ and $n$ such that
\[ f_n(t) = |\dot G_n(t)|^2 + |G_n(t)|^2 \le C_0 e^{C_0 t}. \]
By using energy estimates (applying the operators $y$ and $\nabla_y$ successively to the equation), we infer that there exists $C_1$ independent of $t \ge 0$ and $n$ such that
\[ \|v_n(t)\|_{\Sigma^k} \le C_1 e^{C_1 t}. \]
The convergence of the sequence $v_n$ then follows: by a standard fixed point argument, $v_n$ converges in $C([0,T]; \Sigma)$ if $T > 0$ is sufficiently small. By using energy estimates, and (exponential) a priori bounds for $G$, we infer the exponential control stated in the lemma, and hence global existence.

Remark 2.3. The above computations show that for $v$, the function $G$ satisfies
\[ \ddot G(t) + \nabla_x^2 V(t,x(t))G(t) = 0, \qquad G(0) = \int_{\mathbb{R}^d} z|a(z)|^2\,dz; \quad \dot G(0) = \operatorname{Im}\int_{\mathbb{R}^d}\overline{a}(z)\nabla a(z)\,dz. \]
In [1], the authors proved that if the initial data $a \in \Sigma^3$ is such that
\[ \int_{\mathbb{R}^d} z|a(z)|^2\,dz = \operatorname{Im}\int_{\mathbb{R}^d}\overline{a}(z)\nabla a(z)\,dz = 0, \]
then $\int z|u(t,z)|^2\,dz = 0$ for all time. The above ODE gives a simple explanation of that property. Note that up to changing $a$ to $b$ with $b(y) = a(y - y_0)e^{iy\cdot\eta_0}$ for $y_0$ and $\eta_0$ which can be computed explicitly, that is up to a translation in the phase space, these two assumptions are satisfied. However, the external potential
is modified, and it is not so easy to keep track of the geometric meaning of the approximation. This is why we have chosen to present a direct approach here, which also shows that (2.7) is more nonlinear than it may seem. Note finally that because of the term $G$, working in $L^2$ only would not be possible. To conclude, we have:

Proposition 2.1. Let $a \in \Sigma^3$, $\alpha = 0$. Suppose $V$ satisfies Assumption 1, and $K$ satisfies Assumption 2 and $\nabla K(0) = 0$. Assume $\varphi^\varepsilon$ is given by (2.2), where the action is given by (2.5) and the envelope is given by (2.6). Then there exists a positive constant $C$ independent of $\varepsilon$ such that
\[ \|\psi^\varepsilon(t) - \varphi^\varepsilon(t)\|_{L^2(\mathbb{R}^d)} \le C\sqrt\varepsilon\, e^{e^{Ct}}, \qquad t \ge 0. \]
In particular, there exists $c > 0$ independent of $\varepsilon$ such that
\[ \sup_{0 \le t \le c\ln\ln\frac1\varepsilon}\|\psi^\varepsilon(t) - \varphi^\varepsilon(t)\|_{L^2(\mathbb{R}^d)} \underset{\varepsilon\to0}{\longrightarrow} 0. \]
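As an illustration of why a double exponential bound only allows times of order $\ln\ln(1/\varepsilon)$, the following minimal Python sketch evaluates the logarithm of $C\sqrt\varepsilon\,e^{e^{Ct}}$ at $t = c\ln\ln(1/\varepsilon)$. The values $C = 1$ and $c = 1/2$ are hypothetical choices made only for this illustration (the proposition merely asserts that suitable constants exist); the function names are likewise not from the paper.

```python
import math


def log_error_bound(eps: float, t: float, C: float = 1.0) -> float:
    """Natural log of C*sqrt(eps)*exp(exp(C*t)), computed in log space to avoid overflow."""
    return math.log(C) + 0.5 * math.log(eps) + math.exp(C * t)


def validity_time(eps: float, c: float = 0.5) -> float:
    """t = c * ln ln(1/eps); needs eps < 1/e so that the inner logarithm exceeds 1."""
    return c * math.log(math.log(1.0 / eps))


if __name__ == "__main__":
    # Hypothetical constants C = 1, c = 1/2: the bound at t = c ln ln(1/eps) shrinks with eps.
    for eps in (1e-4, 1e-8, 1e-16):
        t = validity_time(eps)
        print(f"eps={eps:.0e}  t={t:.3f}  bound={math.exp(log_error_bound(eps, t)):.3e}")
```

With these choices, $e^{Ct} = (\ln(1/\varepsilon))^{cC}$ grows more slowly than $\frac12\ln(1/\varepsilon)$ as soon as $cC < 1$, which is why the printed bound decreases as $\varepsilon \to 0$.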
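Proof. First, we change the unknown function $\psi^\varepsilon$ to $u^\varepsilon$ through the bijective change of unknown function
\[ \psi^\varepsilon(t,x) = \varepsilon^{-d/4}\, u^\varepsilon\Big(t, \frac{x-x(t)}{\sqrt\varepsilon}\Big)\, e^{i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon}, \]
where $S$ is given by (2.5). Then (1.3) (with $\alpha = 0$) is equivalent to:
\[ i\partial_t u^\varepsilon + \frac12\Delta u^\varepsilon = V^\varepsilon u^\varepsilon + (K^\varepsilon * |u^\varepsilon|^2)u^\varepsilon; \qquad u^\varepsilon(0,y) = a(y), \]
where
\begin{align*}
V^\varepsilon(t,y) &= \frac1\varepsilon\big(V(t, x(t)+y\sqrt\varepsilon) - V(t,x(t)) - \sqrt\varepsilon\, y\cdot\nabla V(t,x(t))\big), \\
K^\varepsilon(y) &= \frac1\varepsilon\big(K(y\sqrt\varepsilon) - K(0)\big).
\end{align*}
We have, in view of Assumptions 1 and 2, the property $\nabla K(0) = 0$, and Taylor's formula,
\[ |V^\varepsilon(t,y)| + |K^\varepsilon(y)| \le C|y|^2, \]
for some constant $C$ independent of $\varepsilon$, $t$ and $y$. We already know from Lemma 2.1 that $u^\varepsilon \in C(\mathbb{R}_+; L^2)$, with $\|u^\varepsilon(t)\|_{L^2} = \|a\|_{L^2}$. Proceeding in the same way as in the proof of Lemma 2.2, we can also prove that $u^\varepsilon \in C(\mathbb{R}_+; \Sigma^3)$ and that there exists $C$ such that
\[ \|u^\varepsilon(t)\|_{\Sigma^3} \le Ce^{Ct}. \]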
Set $w^\varepsilon = u^\varepsilon - u$: we have $\|w^\varepsilon(t)\|_{L^2} = \|\psi^\varepsilon(t) - \varphi^\varepsilon(t)\|_{L^2}$, and $w^\varepsilon$ solves
\[ i\partial_t w^\varepsilon + \frac12\Delta w^\varepsilon = V^\varepsilon w^\varepsilon + (K^\varepsilon * |u^\varepsilon|^2)u^\varepsilon - (K^\varepsilon * |u|^2)u - r^\varepsilon; \qquad w^\varepsilon_{|t=0} = 0, \]
where $r^\varepsilon$ satisfies the pointwise estimate (2.4). Write
\[ (K^\varepsilon * |u^\varepsilon|^2)u^\varepsilon - (K^\varepsilon * |u|^2)u = (K^\varepsilon * |u^\varepsilon|^2)w^\varepsilon + \big(K^\varepsilon * (|u^\varepsilon|^2 - |u|^2)\big)u, \]
so the standard $L^2$ estimate yields
\[ \|w^\varepsilon(t)\|_{L^2} \le \int_0^t\big\|\big(K^\varepsilon * (|u^\varepsilon|^2 - |u|^2)\big)u(s)\big\|_{L^2}\,ds + \int_0^t\|r^\varepsilon(s)\|_{L^2}\,ds. \]
Since $|u^\varepsilon|^2 - |u|^2 = 2\operatorname{Re}(\overline{u}w^\varepsilon) + |w^\varepsilon|^2$, and $\|u^\varepsilon(t)\|_{\Sigma^2} + \|u(t)\|_{\Sigma^3} \le Ce^{Ct}$, we come up with
\[ \|w^\varepsilon(t)\|_{L^2} \le C\int_0^t e^{Cs}\|w^\varepsilon(s)\|_{L^2}\,ds + C\sqrt\varepsilon\int_0^t e^{Cs}\,ds. \]
Gronwall lemma yields
\[ \|w^\varepsilon(t)\|_{L^2} \le C\sqrt\varepsilon\, e^{e^{Ct}}, \]
and the result follows.

Remark 2.4. It is quite surprising that even in a supercritical case, the approximation can be proven so simply, eventually by a Gronwall type argument. This is in sharp contrast with supercritical WKB analysis for the nonlinear Schrödinger equation (see [4]). On the other hand, we had to use the a priori control of the approximate solution $u$ and of the exact solution $u^\varepsilon$: in subcritical or critical cases, controlling the approximate solution is sufficient in general, as illustrated below in the case of homogeneous kernels.

Remark 2.5. In the cases $\alpha = 1/2$, $\alpha = 1$, and $\alpha > 1$, a similar statement can be proved, and the time of validity is improved, in the sense that the $c\ln\ln(1/\varepsilon)$ at the end of Proposition 2.1 can be replaced by $c\ln(1/\varepsilon)$. More precisely, the error estimate will be:
• $\alpha > 1$: $\|\psi^\varepsilon(t) - \varphi^\varepsilon_{\rm lin}(t)\|_{L^2} \le C\varepsilon^\kappa e^{Ct}$ with $\kappa = \min(1/2, \alpha - 1)$.
• $\alpha = 1$: $\|\psi^\varepsilon(t) - \varphi^\varepsilon(t)\|_{L^2} \le C\sqrt\varepsilon\, e^{Ct}$, with $\varphi^\varepsilon$ as in Sec. 2.4.
• $\alpha = \frac12$: $\|\psi^\varepsilon(t) - \varphi^\varepsilon(t)\|_{L^2} \le C\sqrt\varepsilon\, e^{Ct}\exp(\sqrt\varepsilon\, e^{Ct})$, with $\varphi^\varepsilon$ as in Sec. 2.5.

3. Homogeneous Kernel: Technical Background

In this section, we present some general technical tools which will be used in the proofs of the main results in the case of a homogeneous kernel. In particular, we establish the global well-posedness for (1.10), and estimate the evolution of weighted Sobolev norms of the solution over large time:
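Proposition 3.1. Let $\lambda \in \mathbb{R}$ and $0 < \gamma < \min(2, d)$. Suppose $V$ satisfies Assumption 1 and the initial data $a \in L^2(\mathbb{R}^d)$. Then there exists a unique solution
\[ u \in C(\mathbb{R}_+; L^2(\mathbb{R}^d)) \cap L^{8/\gamma}_{\rm loc}(\mathbb{R}_+, L^{4d/(2d-\gamma)}(\mathbb{R}^d)) \]
to (1.10). If in addition $a \in \Sigma^k$ for some $k \in \mathbb{N}$, then $u \in C(\mathbb{R}_+; \Sigma^k)$, and there exists $C = C(k)$ such that
\[ \|u(t)\|_{\Sigma^k} \le Ce^{Ct}, \qquad \forall\, t \ge 0. \]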
3.1. Strichartz estimates

Before studying the semi-classical limit, we recall some known facts and establish technical results.

Definition 3.1. A pair $(q, r)$ is admissible if $2 \le r < \frac{2d}{d-2}$ ($2 \le r \le \infty$ if $d = 1$, $2 \le r < \infty$ if $d = 2$) and
\[ \frac2q = d\Big(\frac12 - \frac1r\Big) =: \delta(r). \]

Following [15, 20, 26], Strichartz estimates are available for the Schrödinger equation without external potential. Thanks to the construction of the parametrix performed in [12, 13], similar results are available in the presence of an external potential satisfying

Assumption 3. $W \in L^\infty_{\rm loc}(\mathbb{R}_+ \times \mathbb{R}^d)$ is smooth with respect to $x$ for all $t \ge 0$: $x \mapsto W(t,x)$ is a $C^\infty$ map. Moreover, it is subquadratic in $x$:
\[ \forall\, \beta \in \mathbb{N}^d,\ |\beta| \ge 2, \qquad \partial_x^\beta W \in L^\infty(\mathbb{R}_+ \times \mathbb{R}^d). \]

Define the semigroup $U^\varepsilon(t,s)$ by $u^\varepsilon(t,x) = U^\varepsilon(t,s)\phi(x)$, where
\[ i\varepsilon\partial_t u^\varepsilon + \frac{\varepsilon^2}{2}\Delta u^\varepsilon = W(t,x)u^\varepsilon; \qquad u^\varepsilon(s,x) = \phi(x). \]
From [12, 13], it has the following properties:
• $U^\varepsilon(t,t) = \mathrm{Id}$.
• The map $(t,s) \mapsto U^\varepsilon(t,s)$ is strongly continuous.
• $U^\varepsilon(t,\tau)U^\varepsilon(\tau,s) = U^\varepsilon(t,s)$.
• $U^\varepsilon(t,s)^* = U^\varepsilon(t,s)^{-1}$.
• $U^\varepsilon(t,s)$ is unitary on $L^2$: $\|U^\varepsilon(t,s)\phi\|_{L^2(\mathbb{R}^d)} = \|\phi\|_{L^2(\mathbb{R}^d)}$.
• There exist $\delta, C > 0$ independent of $\varepsilon \in\, ]0,1]$ such that for all $t, s \ge 0$ with $|t-s| < \delta$,
\[ \|U^\varepsilon(t,0)U^\varepsilon(s,0)^*\phi\|_{L^\infty(\mathbb{R}^d)} \le \frac{C}{(\varepsilon|t-s|)^{d/2}}\|\phi\|_{L^1(\mathbb{R}^d)}. \]
Scaled Strichartz estimates follow from the above dispersive relation:
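Proposition 3.2 (Scaled Strichartz Estimates). Let $U^\varepsilon(t,s)$ be defined as above. There exists $\delta > 0$ independent of $\varepsilon$ such that the following holds.

(1) For any admissible pair $(q, r)$, there exists $C(q)$ independent of $\varepsilon$ such that
\[ \varepsilon^{1/q}\|U^\varepsilon(\cdot,s)\phi\|_{L^q([s,s+\delta];L^r(\mathbb{R}^d))} \le C(q)\|\phi\|_{L^2(\mathbb{R}^d)}, \qquad \forall\, \phi \in L^2(\mathbb{R}^d),\ \forall\, s \ge 0. \]

(2) For $s \in \mathbb{R}$, denote
\[ D_s^\varepsilon(F)(t,x) = \int_s^t U^\varepsilon(t,\tau)F(\tau,x)\,d\tau, \]
and $I = [s, s+\eta]$. For all admissible pairs $(q_1, r_1)$ and $(q_2, r_2)$, there exists $C(q_1, q_2)$ independent of $\varepsilon$ and $s \ge 0$ such that
\[ \varepsilon^{1/q_1+1/q_2}\|D_s^\varepsilon(F)\|_{L^{q_1}(I;L^{r_1}(\mathbb{R}^d))} \le C(q_1,q_2)\|F\|_{L^{q_2'}(I;L^{r_2'}(\mathbb{R}^d))} \]
for all $F \in L^{q_2'}(I; L^{r_2'}(\mathbb{R}^d))$ and $0 \le \eta \le \delta$. Here $\frac{1}{q_2} + \frac{1}{q_2'} = 1$ and $\frac{1}{r_2} + \frac{1}{r_2'} = 1$.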
This statement will be used in the two cases $\varepsilon = 1$ (for the envelope equation (1.10)), and $\varepsilon \in\, ]0,1]$ (to justify the approximation of the exact solution $\psi^\varepsilon$).

3.2. Global existence in L2

We consider a rather general potential $W$ satisfying Assumption 3, and consider the Cauchy problem on $\mathbb{R}_+ \times \mathbb{R}^d$:
\[ i\partial_t v + \frac12\Delta v = W(t,y)v + \lambda(|y|^{-\gamma} * |v|^2)v; \qquad v_{|t=s} = v_s. \tag{3.1} \]
This form includes the cases of the exact solution $\psi^\varepsilon$ in (1.3) as well as the envelope equation (1.10). We establish global existence for (3.1) in the $L^2$-subcritical case $0 < \gamma < \min(2, d)$, yielding the first part of Proposition 3.1.

Lemma 3.1. Let $\lambda \in \mathbb{R}$, $0 < \gamma < \min(2, d)$. Suppose $W$ satisfies Assumption 3. Then for $s = 0$ and $v_0 \in L^2(\mathbb{R}^d)$, (3.1) has a unique solution
\[ v \in C(\mathbb{R}_+; L^2(\mathbb{R}^d)) \cap L^{8/\gamma}_{\rm loc}(\mathbb{R}_+, L^{4d/(2d-\gamma)}(\mathbb{R}^d)). \]
In addition, the $L^2$ norm of $v$ is conserved:
\[ \|v(t)\|_{L^2(\mathbb{R}^d)} = \|v_0\|_{L^2(\mathbb{R}^d)}, \qquad \forall\, t \ge 0. \]

Proof. By Duhamel's formula, we write (3.1) as
\[ v(t) = U(t,s)v_s - i\lambda\int_s^t U(t,\tau)\big((|y|^{-\gamma} * |v|^2)v\big)(\tau)\,d\tau =: \Phi_s(v)(t), \]
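where we have dropped the dependence of $U^\varepsilon$ upon $\varepsilon$ in the notation, since we assume $\varepsilon = 1$ here. Introduce the space
\[ Y_{s,T} = \big\{\phi \in C(I_{s,T}; L^2(\mathbb{R}^d)) :\ \|\phi\|_{L^\infty(I_{s,T};L^2(\mathbb{R}^d))} \le 2\|v_s\|_{L^2(\mathbb{R}^d)},\ \|\phi\|_{L^{8/\gamma}(I_{s,T};L^{4d/(2d-\gamma)}(\mathbb{R}^d))} \le 2C(8/\gamma)\|v_s\|_{L^2(\mathbb{R}^d)}\big\}, \]
and the distance
\[ d(\phi_1, \phi_2) = \|\phi_1 - \phi_2\|_{L^{8/\gamma}(I_{s,T};L^{4d/(2d-\gamma)})}, \]
where $I_{s,T} = [s, s+T]$ with $s \ge 0$ and $T > 0$, and $C(8/\gamma)$ stems from Proposition 3.2. Then $(Y_{s,T}, d)$ is a complete metric space, as remarked in [19] (see also [8]). Hereafter, we denote
\[ q = \frac8\gamma, \qquad r = \frac{4d}{2d-\gamma}, \qquad \theta = \frac{8}{4-\gamma}, \]
and $\|\cdot\|_{L^a(I_{s,T};L^b(\mathbb{R}^d))}$ by $\|\cdot\|_{L^aL^b}$ for simplicity. Notice that $(q, r)$ is admissible and
\[ \frac{1}{q'} = \frac{4-\gamma}{4} + \frac1q = \frac12 + \frac1\theta; \qquad \frac{1}{r'} = \frac{\gamma}{2d} + \frac1r; \qquad \frac12 = \frac1\theta + \frac1q. \]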
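The exponent bookkeeping above is easy to get wrong by hand, so here is a small Python sketch that checks it in exact rational arithmetic. It assumes the values $q = 8/\gamma$, $r = 4d/(2d-\gamma)$, $\theta = 8/(4-\gamma)$ and the admissibility condition $2/q = d(1/2 - 1/r)$; the function names are hypothetical and the sample values of $\gamma$ and $d$ are chosen only for illustration.

```python
from fractions import Fraction


def exponents(gamma: Fraction, d: int):
    """Return (q, r, theta) = (8/gamma, 4d/(2d - gamma), 8/(4 - gamma))."""
    q = Fraction(8) / gamma
    r = Fraction(4 * d) / (2 * d - gamma)
    theta = Fraction(8) / (4 - gamma)
    return q, r, theta


def check_relations(gamma: Fraction, d: int) -> bool:
    """Check admissibility of (q, r) and the Hölder relations between q, r, theta."""
    if not (0 < gamma < min(2, d)):
        raise ValueError("need 0 < gamma < min(2, d)")
    q, r, theta = exponents(gamma, d)
    inv_q_conj = 1 - 1 / q  # this is 1/q'
    inv_r_conj = 1 - 1 / r  # this is 1/r'
    half = Fraction(1, 2)
    admissible = 2 / q == d * (half - 1 / r)
    time_holder = inv_q_conj == (4 - gamma) / 4 + 1 / q == half + 1 / theta
    space_holder = inv_r_conj == gamma / (2 * d) + 1 / r
    l2_in_time = half == 1 / theta + 1 / q
    return admissible and time_holder and space_holder and l2_in_time


if __name__ == "__main__":
    for d, gamma in [(1, Fraction(1, 2)), (2, Fraction(1)), (3, Fraction(3, 2))]:
        print(d, gamma, check_relations(gamma, d))
```

Using `Fraction` rather than floats keeps every comparison exact, so a `True` result reflects the algebraic identity rather than a rounding coincidence.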
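By using Strichartz estimates, Hölder inequality and Hardy–Littlewood–Sobolev inequality, we have, for $(\underline{q}, \underline{r}) \in \{(q, r), (\infty, 2)\}$:
\begin{align*}
\|\Phi_s(v)\|_{L^{\underline{q}}L^{\underline{r}}} &\le C(\underline{q})\|v_s\|_{L^2} + C(\underline{q}, q)\big\|(|y|^{-\gamma} * |v|^2)v\big\|_{L^{q'}L^{r'}} \\
&\le C(\underline{q})\|v_s\|_{L^2} + C(\underline{q}, q)\big\||y|^{-\gamma} * |v|^2\big\|_{L^{4/(4-\gamma)}L^{2d/\gamma}}\|v\|_{L^qL^r} \\
&\le C(\underline{q})\|v_s\|_{L^2} + C\|v\|^2_{L^\theta L^r}\|v\|_{L^qL^r} \\
&\le C(\underline{q})\|v_s\|_{L^2} + CT^{1-\gamma/2}\|v\|^3_{L^qL^r}, \tag{3.2}
\end{align*}
for any $v \in Y_{s,T}$, with $C(\infty) = 1$ by the standard energy estimate. To show the contraction property of $\Phi_s$, for any $v, w \in Y_{s,T}$, we get
\begin{align*}
\|\Phi_s(v) - \Phi_s(w)\|_{L^qL^r} &\lesssim \big\||y|^{-\gamma} * |v|^2\big\|_{L^{4/(4-\gamma)}L^{2d/\gamma}}\|v - w\|_{L^qL^r} + \big\||y|^{-\gamma} * \big||v|^2 - |w|^2\big|\big\|_{L^2L^{2d/\gamma}}\|w\|_{L^\theta L^r} \\
&\lesssim \big(\|v\|^2_{L^\theta L^r} + \|w\|^2_{L^\theta L^r}\big)\|v - w\|_{L^qL^r} \\
&\le CT^{1-\gamma/2}\big(\|v\|^2_{L^qL^r} + \|w\|^2_{L^qL^r}\big)\|v - w\|_{L^qL^r}.
\end{align*}
Thus $\Phi_s$ is a contraction from $Y_{s,T}$ to $Y_{s,T}$ provided that $T$ is sufficiently small. Then there exists a unique $v \in Y_{s,T}$ solving (3.1). The global existence of the solution for (3.1) follows from the conservation of the $L^2$-norm of $v$.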
3.3. Growth of higher order Sobolev norms and momenta

We now consider (1.10), that is
\[ i\partial_t u + \frac12\Delta u = \frac12\langle y, Q(t)y\rangle u + \lambda(|y|^{-\gamma} * |u|^2)u; \qquad u_{|t=0} = a, \]
where $Q(t) = \nabla^2 V(t, x(t))$, so $Q \in C(\mathbb{R}_+, \mathbb{R}^{d^2})$ is locally Lipschitzean, and bounded, by Assumption 1. The second part of Proposition 3.1 follows from the following lemma.

Lemma 3.2. Let $\lambda \in \mathbb{R}$ and $0 < \gamma < \min(2, d)$. Suppose $a \in \Sigma^k$ for some $k \in \mathbb{N}$. Then there exists a unique $u \in C(\mathbb{R}_+; \Sigma^k)$ solving (1.10), and there exists $C = C(k)$ such that for every admissible pair $(q_1, r_1)$,
\[ \|y^\alpha\partial_y^\beta u\|_{L^{q_1}([0,t];L^{r_1}(\mathbb{R}^d))} \le Ce^{Ct}, \qquad \forall\, t \ge 0,\ \alpha, \beta \in \mathbb{N}^d,\ |\alpha| + |\beta| \le k. \tag{3.3} \]

Proof. We just state the proof of (3.3), by borrowing the approach in [5]. Applying similar arguments as in the proof of Lemma 3.1 and induction, one can prove global existence and uniqueness of the $\Sigma^k$ solution for (1.10).

Step 1. $k = 0$. For all $t \ge 0$ and $\tau > 0$, set $I = [t, t+\tau]$. Resuming the computations as in (3.2), we have
\[ \|u\|_{L^q(I;L^r(\mathbb{R}^d)) \cap L^\infty(I;L^2(\mathbb{R}^d))} \le C\|u(t)\|_{L^2} + C_1\tau^{1-\gamma/2}\|u\|^3_{L^q(I;L^r(\mathbb{R}^d))}, \]
where $C$ and $C_1$ are independent of $t$ and $\tau$. Then (3.3) for $k = 0$ follows from the following bootstrap argument.

Lemma 3.3 (Bootstrap Argument). Let $g = g(t)$ be a nonnegative continuous function on $[0, T]$ such that for every $t \in [0, T]$,
\[ g(t) \le M + \delta g(t)^\kappa, \]
where $M$, $\delta$ and $\kappa > 1$ are constants such that
\[ M < \Big(1 - \frac1\kappa\Big)\frac{1}{(\kappa\delta)^{1/(\kappa-1)}}; \qquad g(0) \le \frac{1}{(\kappa\delta)^{1/(\kappa-1)}}. \]
Then
\[ g(t) \le \frac{\kappa}{\kappa-1}M, \qquad \forall\, t \in [0, T]. \]
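The mechanism behind the bootstrap lemma can be checked numerically. The sketch below (Python, with hypothetical function names and sample constants that are not from the paper) uses the fact that $g \mapsto g - \delta g^\kappa$ is increasing on $[0, g_*]$ with $g_* = (\kappa\delta)^{-1/(\kappa-1)}$: a continuous $g$ starting below $g_*$ and satisfying $g \le M + \delta g^\kappa$ cannot cross the first root of $g - \delta g^\kappa = M$, and that root lies below $\frac{\kappa}{\kappa-1}M$.

```python
def bootstrap_check(M: float, delta: float, kappa: float):
    """Return (first_root, claimed_bound) for the equation g - delta*g**kappa = M.

    first_root is the smallest positive solution, found by bisection on [0, g_star];
    claimed_bound is kappa/(kappa-1)*M, the bound asserted by the bootstrap lemma.
    """
    if kappa <= 1 or delta <= 0 or M <= 0:
        raise ValueError("need kappa > 1, delta > 0, M > 0")
    g_star = (kappa * delta) ** (-1.0 / (kappa - 1.0))  # maximiser of g - delta*g**kappa
    if not M < (1.0 - 1.0 / kappa) * g_star:
        raise ValueError("smallness condition on M fails")

    def f(g: float) -> float:
        return g - delta * g ** kappa - M

    lo, hi = 0.0, g_star  # f(lo) = -M < 0 and f(hi) = (1 - 1/kappa)*g_star - M > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return hi, kappa / (kappa - 1.0) * M


if __name__ == "__main__":
    # kappa = 3 is the value used in Step 1; M and delta are illustrative.
    root, bound = bootstrap_check(M=0.1, delta=1.0, kappa=3.0)
    print(root, bound)
```

In Step 1 the lemma is applied with $\kappa = 3$ and $\delta = C_1\tau^{1-\gamma/2}$, so taking $\tau$ small makes the smallness condition on $M$ hold.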
For fixed $\tau$ small enough, by the conservation of the $L^2$ norm of $u$, we choose $\kappa = 3$ and $\delta = C_1\tau^{1-\gamma/2}$. Since $0 < \gamma < 2$, at every increment of time of length $\tau$, the $L^qL^r$ and $L^\infty L^2$ norms of $u$ are bounded by $\frac32 C\|u_0\|_{L^2}$, and (3.3) follows in the case $k = 0$, $(q_1, r_1) = (q, r)$. Using Strichartz inequalities again, we have, for any admissible pair $(q_1, r_1)$,
\[ \|u\|_{L^{q_1}(I;L^{r_1}(\mathbb{R}^d))} \lesssim \|u(t)\|_{L^2} + \tau^{1-\gamma/2}\|u\|^3_{L^q(I;L^r(\mathbb{R}^d))}, \]
and (3.3) follows in the case $k = 0$.
Step 2. Suppose Lemma 3.2 holds for $k - 1$ ($k \ge 1$). We denote by $w_\ell$ the family of combinations of $\alpha$ momenta and $\beta$ space derivatives of $u$ with $|\alpha| + |\beta| = \ell$. Applying $y^\alpha\partial_y^\beta$ to (1.10) formally, we obtain
\[ i\partial_t w_k + \frac12\Delta w_k = \frac12\langle y, Q(t)y\rangle w_k + \mathcal{V}(u, w_k) + \mathcal{L}_k(u) + \lambda\sum_{\substack{0 \le j_1, j_2, j_3 \le k-1 \\ j_1+j_2+j_3 = k}} c_{j_1,j_2,j_3}\big(|y|^{-\gamma} * (w_{j_1}\overline{w}_{j_2})\big)w_{j_3}, \]
where
\begin{align*}
\mathcal{V}(u, w_k) &= \big(|y|^{-\gamma} * (w_k\overline{u})\big)u + \big(|y|^{-\gamma} * (\overline{w}_k u)\big)u + \big(|y|^{-\gamma} * |u|^2\big)w_k, \\
\mathcal{L}_k(u) &= \frac12\Omega(t)[y^\alpha\partial_y^\beta, |y|^2]u + \frac12[\Delta, y^\alpha\partial_y^\beta]u.
\end{align*}
Notice that $\mathcal{L}_k(u)$ is controlled pointwise by $w_k$. Still by Strichartz estimates, induction and Step 1, we have
\begin{align*}
\|w_k\|_{L^q(I;L^r) \cap L^\infty(I;L^2)} &\lesssim \|w_k(t)\|_{L^2} + \tau^{1-\frac\gamma2}\|u\|_{L^q(I;L^r)}\|w_k\|_{L^q(I;L^r)} + C_2\|w_k\|_{L^1(I,L^2)} + C_3e^{3C(t+\tau)} \\
&\le C\|w_k(t)\|_{L^2} + C_1\tau^{1-\frac\gamma2}\|w_k\|_{L^q(I;L^r)} + C_2\|w_k\|_{L^1(I,L^2)} + C_3e^{3C(t+\tau)},
\end{align*}
where $C$, $C_1$, $C_2$ and $C_3$ are independent of $t$ and $\tau$. Choosing $\tau < 1$ fixed small enough, the second term on the right-hand side of the above inequality can be absorbed by the left-hand side. For any time interval $[0, t]$, split it into finitely many pieces such that the length of every piece is at most $\tau$; then we have
\[ \|w_k\|_{L^q([0,t];L^r(\mathbb{R}^d)) \cap L^\infty([0,t];L^2(\mathbb{R}^d))} \lesssim \|w_k\|_{L^1([0,t];L^2(\mathbb{R}^d))} + \int_0^t e^{3Cs}\,ds. \]
Lemma 3.2 follows from the Gronwall lemma in the case $(q_1, r_1) = (\infty, 2)$. Using Strichartz inequalities again, the general case follows.

4. Bounded Time Interval for the Critical Case

In this section, we consider the critical case for (1.3) with homogeneous nonlinearity in bounded time interval, and establish a good approximation to the wave function. First, we recall the following lemma from [7].

Lemma 4.1. Suppose $V$ satisfies Assumption 1. Let $(x(t), \xi(t))$ be defined by the trajectories (1.4) and $S(t)$ be the classical action (1.6). Assume $A^\varepsilon$ and $B^\varepsilon$ are
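defined as follows:
\[ A^\varepsilon = \sqrt\varepsilon\,\nabla - i\frac{\xi(t)}{\sqrt\varepsilon} = \sqrt\varepsilon\, e^{i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon}\,\nabla\big(e^{-i(S(t)+\xi(t)\cdot(x-x(t)))/\varepsilon}\,\cdot\big); \qquad B^\varepsilon = \frac{x-x(t)}{\sqrt\varepsilon}. \]
Then $A^\varepsilon$ and $B^\varepsilon$ satisfy the commutation relations:
\begin{align*}
\Big[i\varepsilon\partial_t + \frac{\varepsilon^2}{2}\Delta - V, A^\varepsilon\Big] &= \sqrt\varepsilon\,\big(\nabla V(t,x) - \nabla V(t,x(t))\big); \\
\Big[i\varepsilon\partial_t + \frac{\varepsilon^2}{2}\Delta - V, B^\varepsilon\Big] &= \varepsilon A^\varepsilon.
\end{align*}

Proposition 4.1. Under the assumptions in Theorem 1.1, for all $T > 0$ which is independent of $\varepsilon > 0$, we have
\[ \sup_{0 \le t \le T}\|\psi^\varepsilon(t) - \varphi^\varepsilon(t)\|_{\mathcal{H}} = O(\sqrt\varepsilon). \]

Proof. Set $w^\varepsilon = \psi^\varepsilon - \varphi^\varepsilon$: it satisfies
\[ i\varepsilon\partial_t w^\varepsilon + \frac{\varepsilon^2}{2}\Delta w^\varepsilon = Vw^\varepsilon - L^\varepsilon + N^\varepsilon; \qquad w^\varepsilon_{|t=0} = 0. \tag{4.1} \]
We have denoted by
\begin{align*}
L^\varepsilon &= \big(V(t,x) - T_2(t,x,x(t))\big)\varphi^\varepsilon, \\
N^\varepsilon &= \lambda\varepsilon^{\alpha_c}\big((|x|^{-\gamma} * |\psi^\varepsilon|^2)\psi^\varepsilon - (|x|^{-\gamma} * |\varphi^\varepsilon|^2)\varphi^\varepsilon\big),
\end{align*}
where $T_2$ corresponds to a second order Taylor approximation:
\[ T_2(t,x,x(t)) = V(t,x(t)) + \langle\nabla V(t,x(t)), x - x(t)\rangle + \frac12\langle x - x(t), \nabla^2 V(t,x(t))(x - x(t))\rangle. \]
By Duhamel's formula, using scaled Strichartz estimates and similar arguments as in the proof of Lemma 3.1, we have
\begin{align*}
\|w^\varepsilon\|_{L^qL^r} &\lesssim \varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1L^2} + \varepsilon^{-1-2/q}\|N^\varepsilon\|_{L^{q'}L^{r'}} \\
&\lesssim \varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1L^2} + \varepsilon^{\alpha_c-1-2/q}\big(\|\psi^\varepsilon\|^2_{L^\theta L^r} + \|\varphi^\varepsilon\|^2_{L^\theta L^r}\big)\|w^\varepsilon\|_{L^qL^r}. \tag{4.2}
\end{align*}
We have used the notation $L^aL^b$ for $L^a([t, t+\tau]; L^b(\mathbb{R}^d))$ with $t \ge 0$ and $\tau > 0$. For all $T > 0$, using similar arguments as in the proof of Lemma 3.1, we know $u^\varepsilon$,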
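$u \in C(\mathbb{R}_+; \Sigma)$, then
\[ \|Pu\|_{L^\infty([0,T];L^2(\mathbb{R}^d))} + \|Pu^\varepsilon\|_{L^\infty([0,T];L^2(\mathbb{R}^d))} \le C(T), \qquad \forall\, P \in \{\mathrm{Id}, \nabla, x\}. \]
By the definitions of $A^\varepsilon$ and $B^\varepsilon$, it is easy to check
\[ \|P^\varepsilon\varphi^\varepsilon\|_{L^\infty([0,T];L^2(\mathbb{R}^d))} + \|P^\varepsilon\psi^\varepsilon\|_{L^\infty([0,T];L^2(\mathbb{R}^d))} \le C(T), \qquad \forall\, P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}, \]
where $C(T)$ is independent of $\varepsilon$. Gagliardo–Nirenberg inequality then yields
\[ \|\varphi^\varepsilon(t)\|_{L^r(\mathbb{R}^d)} \lesssim \varepsilon^{-\gamma/8}\|\varphi^\varepsilon\|^{1-\gamma/4}_{L^2(\mathbb{R}^d)}\|A^\varepsilon\varphi^\varepsilon\|^{\gamma/4}_{L^2(\mathbb{R}^d)} \le C(T)\varepsilon^{-\gamma/8}, \qquad \forall\, t \in [0, T]. \]
Similarly, $\|\psi^\varepsilon(t)\|_{L^r}$ is bounded by $C(T)\varepsilon^{-\gamma/8}$ for all $t \in [0, T]$, where $C(T)$ is independent of $\varepsilon$. Let $[t, t+\tau] \subset [0, T]$; we can rewrite (4.2) as
\begin{align*}
\|w^\varepsilon\|_{L^qL^r} &\lesssim \varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1L^2} + \varepsilon^{\alpha_c-1-2/q}\tau^{1-\gamma/4}\big(\|\psi^\varepsilon\|^2_{L^\infty L^r} + \|\varphi^\varepsilon\|^2_{L^\infty L^r}\big)\|w^\varepsilon\|_{L^qL^r} \\
&\lesssim \varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1L^2} + \varepsilon^{\alpha_c-1-2/q-\gamma/4}\tau^{1-\gamma/4}\|w^\varepsilon\|_{L^qL^r} \\
&\lesssim \varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1L^2} + \tau^{1-\gamma/4}\|w^\varepsilon\|_{L^qL^r}. \tag{4.3}
\end{align*}
Choosing $\tau$ sufficiently small, the last term on the right-hand side in (4.3) can be absorbed by the left-hand side. Splitting $[0, T]$ into finitely many such intervals, we have
\[ \|w^\varepsilon\|_{L^q([0,T];L^r(\mathbb{R}^d))} \lesssim \varepsilon^{-1/q}\|w^\varepsilon\|_{L^1([0,T];L^2(\mathbb{R}^d))} + \varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1([0,T];L^2(\mathbb{R}^d))}. \]
Using Strichartz estimates again and resuming the above computations, we get
\[ \|w^\varepsilon\|_{L^\infty([0,t];L^2(\mathbb{R}^d))} \lesssim \|w^\varepsilon\|_{L^1([0,t];L^2(\mathbb{R}^d))} + \varepsilon^{-1}\|L^\varepsilon\|_{L^1([0,t];L^2(\mathbb{R}^d))}, \qquad \forall\, t \in [0, T]. \]
Thanks to Assumption 1 and Proposition 3.1, we have
\[ \|L^\varepsilon\|_{L^1([0,t];L^2(\mathbb{R}^d))} \lesssim \varepsilon^{3/2}\|\langle y\rangle^3 u\|_{L^1([0,t];L^2(\mathbb{R}^d))} \lesssim \varepsilon^{3/2}e^{C_0t}, \]
for any $t \ge 0$. Then the Gronwall inequality yields
\[ \|w^\varepsilon\|_{L^\infty([0,T];L^2(\mathbb{R}^d))} \le C(T)\sqrt\varepsilon, \tag{4.4} \]
where $C(T)$ is independent of $\varepsilon$.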
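The Gronwall step used here turns an integral inequality of the form $f(t) \le a + b\int_0^t f(s)\,ds$ into the bound $f(t) \le ae^{bt}$. The following Python sketch checks this on a discrete analogue, where the extremal sequence $f_n = a + bh\sum_{k<n} f_k$ equals $a(1+bh)^n \le ae^{bnh}$. The constants $a$, $b$, $T$ are hypothetical stand-ins (with $a$ playing the role of a source term of size $\sqrt\varepsilon$); they are not values from the proof.

```python
import math


def gronwall_extremal(a: float, b: float, T: float, n_steps: int):
    """Sequence f_n = a + b*h*sum_{k<n} f_k on a uniform grid of step h = T/n_steps.

    This is the case of equality in a discrete Gronwall inequality (left Riemann sums).
    """
    h = T / n_steps
    values = []
    running_sum = 0.0
    for _ in range(n_steps + 1):
        f_n = a + b * h * running_sum
        values.append(f_n)
        running_sum += f_n
    return values


def gronwall_bound(a: float, b: float, t: float) -> float:
    """Continuous Gronwall bound a*exp(b*t)."""
    return a * math.exp(b * t)


if __name__ == "__main__":
    eps = 1e-6  # hypothetical small parameter; a = sqrt(eps) mimics the source term
    a, b, T, n = math.sqrt(eps), 2.0, 1.0, 1000
    vals = gronwall_extremal(a, b, T, n)
    print(vals[-1], gronwall_bound(a, b, T))
```

Since the bound is linear in $a$, a source term of size $\sqrt\varepsilon$ on a fixed interval $[0, T]$ yields an error of size $C(T)\sqrt\varepsilon$, in line with (4.4).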
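To establish the control of the $\mathcal{H}$ norm, in view of Lemma 4.1, we obtain
\[ \Big(i\varepsilon\partial_t + \frac{\varepsilon^2}{2}\Delta - V\Big)(A^\varepsilon w^\varepsilon) = \sqrt\varepsilon\,\big(\nabla V(t,x) - \nabla V(t,x(t))\big)w^\varepsilon - A^\varepsilon L^\varepsilon + A^\varepsilon N^\varepsilon. \]
Using Duhamel's formula and scaled Strichartz estimates again, we are led to
\begin{align*}
\|A^\varepsilon w^\varepsilon\|_{L^qL^r} \lesssim{}& \varepsilon^{-1/q}\|A^\varepsilon w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1-1/q}\|A^\varepsilon L^\varepsilon\|_{L^1L^2} + \varepsilon^{-1-2/q}\|A^\varepsilon N^\varepsilon\|_{L^{q'}L^{r'}} \\
& + \varepsilon^{-1-1/q}\big\|\sqrt\varepsilon\,(\nabla V(t,x) - \nabla V(t,x(t)))w^\varepsilon\big\|_{L^1L^2}. \tag{4.5}
\end{align*}
Note that we have the pointwise estimate:
\[ \big|\sqrt\varepsilon\,(\nabla V(t,x) - \nabla V(t,x(t)))w^\varepsilon\big| \le C\varepsilon|B^\varepsilon w^\varepsilon|, \]
where $C$ is independent of $t$, $x$ and $\varepsilon$. We also have
\[ \|A^\varepsilon L^\varepsilon\|_{L^2(\mathbb{R}^d)} \lesssim \varepsilon^{3/2}\big(\|\langle y\rangle^2 u\|_{L^2(\mathbb{R}^d)} + \|\langle y\rangle^3\nabla u\|_{L^2(\mathbb{R}^d)}\big). \]
Having in mind Proposition 3.1, we infer
\[ \|A^\varepsilon L^\varepsilon\|_{L^1([0,t];L^2(\mathbb{R}^d))} \lesssim \varepsilon^{3/2}e^{C_0t}, \qquad \forall\, t \ge 0. \tag{4.6} \]
We observe that $A^\varepsilon$ acts on gauge invariant nonlinearities like a derivative:
\[ A^\varepsilon\big((|x|^{-\gamma} * |\phi|^2)\phi\big) = 2\operatorname{Re}\big(|x|^{-\gamma} * (\overline{\phi}A^\varepsilon\phi)\big)\phi + (|x|^{-\gamma} * |\phi|^2)(A^\varepsilon\phi). \]
Then we have
\begin{align*}
\|A^\varepsilon N^\varepsilon\|_{L^{q'}L^{r'}} \lesssim{}& \varepsilon^{\alpha_c}\big\|\big(|x|^{-\gamma} * (\overline{\psi^\varepsilon}A^\varepsilon\psi^\varepsilon)\big)\psi^\varepsilon - \big(|x|^{-\gamma} * (\overline{\varphi^\varepsilon}A^\varepsilon\varphi^\varepsilon)\big)\varphi^\varepsilon\big\|_{L^{q'}L^{r'}} \\
& + \varepsilon^{\alpha_c}\big\|(|x|^{-\gamma} * |\psi^\varepsilon|^2)A^\varepsilon\psi^\varepsilon - (|x|^{-\gamma} * |\varphi^\varepsilon|^2)A^\varepsilon\varphi^\varepsilon\big\|_{L^{q'}L^{r'}} =: \varepsilon^{\alpha_c}(I + II).
\end{align*}
In view of triangle inequality, Hölder inequality and Hardy–Littlewood–Sobolev inequality, by similar arguments as in the proof of Lemma 3.1, we get
\begin{align*}
I &\lesssim \|A^\varepsilon w^\varepsilon\|_{L^qL^r}\|\psi^\varepsilon\|^2_{L^\theta L^r} + \|w^\varepsilon\|_{L^qL^r}\|A^\varepsilon\varphi^\varepsilon\|_{L^\theta L^r}\big(\|\psi^\varepsilon\|_{L^\theta L^r} + \|\varphi^\varepsilon\|_{L^\theta L^r}\big) \\
&\lesssim \big(\|w^\varepsilon\|^2_{L^\theta L^r} + \|\varphi^\varepsilon\|^2_{L^\theta L^r}\big)\|A^\varepsilon w^\varepsilon\|_{L^qL^r} + \big(\|A^\varepsilon\varphi^\varepsilon\|^2_{L^\theta L^r} + \|w^\varepsilon\|^2_{L^\theta L^r} + \|\varphi^\varepsilon\|^2_{L^\theta L^r}\big)\|w^\varepsilon\|_{L^qL^r}.
\end{align*}
The term $II$ satisfies the same estimate. Applying Gagliardo–Nirenberg inequality,
\begin{align*}
\|A^\varepsilon\varphi^\varepsilon(t)\|_{L^r(\mathbb{R}^d)} &\lesssim \varepsilon^{-\gamma/8}\|A^\varepsilon\varphi^\varepsilon\|^{1-\gamma/4}_{L^2(\mathbb{R}^d)}\|(A^\varepsilon)^2\varphi^\varepsilon\|^{\gamma/4}_{L^2(\mathbb{R}^d)} \\
&\lesssim \varepsilon^{-\gamma/8}\|\nabla u\|^{1-\gamma/4}_{L^2(\mathbb{R}^d)}\|\nabla^2 u\|^{\gamma/4}_{L^2(\mathbb{R}^d)} \le C(T)\varepsilon^{-\gamma/8}, \qquad \forall\, t \in [0, T],
\end{align*}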
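which implies that
\[ \|A^\varepsilon N^\varepsilon\|_{L^{q'}L^{r'}} \lesssim \varepsilon^{\alpha_c-\gamma/4}\tau^{1-\gamma/4}\big(\|A^\varepsilon w^\varepsilon\|_{L^qL^r} + \|w^\varepsilon\|_{L^qL^r}\big). \]
Then (4.5) can be estimated as
\[ \|A^\varepsilon w^\varepsilon\|_{L^qL^r} \lesssim \varepsilon^{-1/q}\|A^\varepsilon w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1/q}\|B^\varepsilon w^\varepsilon\|_{L^1L^2} + \varepsilon^{-1-1/q}\|A^\varepsilon L^\varepsilon\|_{L^1L^2} + \tau^{1-\gamma/4}\big(\|A^\varepsilon w^\varepsilon\|_{L^qL^r} + \|w^\varepsilon\|_{L^qL^r}\big). \tag{4.7} \]
Recalling Lemma 4.1, we have
\[ \Big(i\varepsilon\partial_t + \frac{\varepsilon^2}{2}\Delta - V\Big)(B^\varepsilon w^\varepsilon) = \varepsilon A^\varepsilon w^\varepsilon - B^\varepsilon L^\varepsilon + B^\varepsilon N^\varepsilon. \]
Proceeding like above, we come up with
\[ \|B^\varepsilon w^\varepsilon\|_{L^qL^r} \lesssim \varepsilon^{-1/q}\|B^\varepsilon w^\varepsilon(t)\|_{L^2} + \varepsilon^{-1/q}\|A^\varepsilon w^\varepsilon\|_{L^1L^2} + \varepsilon^{-1-1/q}\|B^\varepsilon L^\varepsilon\|_{L^1L^2} + \tau^{1-\gamma/4}\big(\|B^\varepsilon w^\varepsilon\|_{L^qL^r} + \|w^\varepsilon\|_{L^qL^r}\big). \tag{4.8} \]
Summing over (4.3), (4.7) and (4.8), we have
\begin{align*}
\sum_{P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^qL^r} \lesssim{}& \varepsilon^{-1/q}\sum_{P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}}\big(\|P^\varepsilon w^\varepsilon(t)\|_{L^2} + \|P^\varepsilon w^\varepsilon\|_{L^1L^2}\big) \\
& + \varepsilon^{-1-1/q}\sum_{P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}}\|P^\varepsilon L^\varepsilon\|_{L^1L^2} + \tau^{1-\gamma/4}\sum_{P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^qL^r}.
\end{align*}
Take $\tau$ sufficiently small such that the last term of the right-hand side in the above inequality can be absorbed by the left-hand side. Using scaled Strichartz estimates again, resuming the above computations, for any fixed $T > 0$ and any $t \in [0, T]$,
\[ \sum_{P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^\infty([0,t];L^2(\mathbb{R}^d))} \lesssim \sum_{P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^1([0,t];L^2(\mathbb{R}^d))} + \varepsilon^{-1}\sum_{P^\varepsilon \in \{\mathrm{Id}, A^\varepsilon, B^\varepsilon\}}\|P^\varepsilon L^\varepsilon\|_{L^1([0,t];L^2(\mathbb{R}^d))}. \]
We end up with
\[ \|w^\varepsilon\|_{L^\infty([0,t];\mathcal{H})} \lesssim \|w^\varepsilon\|_{L^1([0,t];\mathcal{H})} + \sqrt\varepsilon, \qquad \forall\, t \in [0, T], \tag{4.9} \]
and Proposition 4.1 follows from Gronwall lemma.

5. Large Time Approximation

In this section, we improve the time of validity of the error estimate proven in Sec. 4, in two cases: the linearizable case $\alpha > \alpha_c$, and the non-linearizable case $\alpha = \alpha_c$, thus proving Proposition 1.1 and Theorem 1.1.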
5.1. Proof of Proposition 1.1 Set wε = ψ ε − ϕεlin : it satisfies iε∂t wε +
ε2 ∆u = V wε − Lε + N ε ; 2
wε |t=0 = 0,
(5.1)
where Lε = (V (t, x) − T2 (t, x, x(t)))ϕεlin ;
N ε = λεα |x|−γ ∗ |ϕεlin + wε |2 (ϕεlin + wε ).
Using scaled Strichartz estimates, we have, for t > 0, wε Lqt Lr ε−1−1/q Lε L1t L2 + ε−1−2/q N ε Lq Lr t
ε
−1−1/q
L L1t L2 + ε ε
α−1−2/q
ϕεlin 2Lθ Lr ϕεlin Lqt Lr t
+ εα−1−2/q ( wε 2Lθ Lr + ϕεlin 2Lθ Lr ) wε Lqt Lr , t
t
where Lat Lb stands for La ([0, t]; Lb (Rd )). In view of Proposition 3.1, Gagliardo– Nirenberg inequality yields ϕεlin (t) Lr (Rd ) ε−γ/8 ϕεlin L2 (Rd ) Aε ϕεlin L2 (Rd ) ≤ Cε−γ/8 eC0 t , 1−γ/4
γ/4
∀ t ≥ 0,
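where $C$ and $C_0$ are independent of $\varepsilon$ and $t$. We use a bootstrap argument, relying upon the estimate
\[ \|w^\varepsilon(t)\|_{L^r(\mathbb{R}^d)} \le \varepsilon^{-\gamma/8}e^{C_0t}, \tag{5.2} \]
for $t\in[0,T^\varepsilon]$, where $T^\varepsilon$ may (and will) depend on $\varepsilon$. Since $w^\varepsilon$ is expected to be small compared to $\varphi^\varepsilon_{\rm lin}$, such an estimate looks sensible. We come up with
\[ \|w^\varepsilon\|_{L^q_tL^r} \le C\varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1_tL^2}+C\varepsilon^{\alpha-\alpha_c}e^{2C_0t}\|\varphi^\varepsilon_{\rm lin}\|_{L^q_tL^r}+C\varepsilon^{\alpha-\alpha_c}e^{2C_0t}\|w^\varepsilon\|_{L^q_tL^r}. \]
Now assume $T^\varepsilon$ is chosen so that
\[ C\varepsilon^{\alpha-\alpha_c}e^{2C_0T^\varepsilon}\le\frac12. \]
Then we have
\[ \|w^\varepsilon\|_{L^q_tL^r} \le 2C\varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1_tL^2}+2C\varepsilon^{\alpha-\alpha_c}e^{2C_0t}\|\varphi^\varepsilon_{\rm lin}\|_{L^q_tL^r}. \]
Using scaled Strichartz estimates again, we get
\[ \|w^\varepsilon\|_{L^\infty_tL^2} \lesssim \varepsilon^{-1}\|L^\varepsilon\|_{L^1_tL^2}+\varepsilon^{\alpha-1-1/q}e^{2C_0t}\|\varphi^\varepsilon_{\rm lin}\|_{L^q_tL^r} \le C\big(\sqrt{\varepsilon}\,e^{C_0t}+\varepsilon^{\alpha-\alpha_c}e^{3C_0t}\big). \]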
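The smallness condition imposed on $T^\varepsilon$ above can be made explicit. The following is only a sketch of the elementary computation, assuming $0<\varepsilon<1$ and, without loss of generality, $2C\ge 1$; it shows why the admissible time is logarithmic in $1/\varepsilon$ when $\alpha>\alpha_c$.

```latex
% Solving the absorption condition for T^\varepsilon (sketch).
\[
C\varepsilon^{\alpha-\alpha_c}e^{2C_0T^\varepsilon}\le\frac12
\iff
2C_0T^\varepsilon\le(\alpha-\alpha_c)\ln\frac1\varepsilon-\ln(2C)
\iff
T^\varepsilon\le\frac{\alpha-\alpha_c}{2C_0}\ln\frac1\varepsilon-\frac{\ln(2C)}{2C_0}.
\]
% For \alpha>\alpha_c the right-hand side grows like \ln(1/\varepsilon),
% so any T^\varepsilon = C_2\ln(1/\varepsilon) with C_2<(\alpha-\alpha_c)/(2C_0)
% is admissible once \varepsilon is small enough.
% For \alpha=\alpha_c the factor \varepsilon^{\alpha-\alpha_c} equals 1 and
% this particular condition provides no gain in \varepsilon.
```

This is consistent with the separate treatment of the critical case $\alpha=\alpha_c$, where smallness comes from the length $\tau$ of the time interval rather than from a power of $\varepsilon$.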
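Apply $A^\varepsilon$ and $B^\varepsilon$ respectively to Eq. (5.1), then by the same arguments as in the proof of Proposition 4.1, we get
\[ \sum_{P^\varepsilon\in\{A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^q_tL^r} \lesssim \varepsilon^{-1/q}\sum_{P^\varepsilon\in\{A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^1_tL^2}+\varepsilon^{\alpha-\alpha_c}e^{3C_0t}+\varepsilon^{-1-1/q}\sum_{P^\varepsilon\in\{A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon L^\varepsilon\|_{L^1_tL^2}+\varepsilon^{\alpha-\alpha_c}e^{2C_0t}\sum_{P^\varepsilon\in\{\mathrm{Id},A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^q_tL^r}. \]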
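As long as (5.2) holds, we can use the same absorption argument as above to treat the last term. By using scaled Strichartz again, we obtain
\[ \|w^\varepsilon\|_{L^\infty([0,t];\mathcal{H})} \lesssim \|w^\varepsilon\|_{L^1([0,t];\mathcal{H})}+\varepsilon^{\alpha-\alpha_c}e^{3C_0t}+\varepsilon^{-1}\|L^\varepsilon\|_{L^1([0,t];\mathcal{H})}. \]
Since
\[ \varepsilon^{-1}\|L^\varepsilon\|_{L^1([0,t];\mathcal{H})} \le C\sqrt{\varepsilon}\,e^{Ct}, \]
we end up with
\[ \|w^\varepsilon\|_{L^\infty([0,t];\mathcal{H})} \le C\big(\|w^\varepsilon\|_{L^1([0,t];\mathcal{H})}+\varepsilon^{\kappa}e^{C_1t}\big), \]
where $\kappa=\min\{\tfrac12,\alpha-\alpha_c\}$ and $C_1$ is independent of $\varepsilon$ and $t$. Gronwall lemma yields
\[ \|w^\varepsilon(t)\|_{\mathcal{H}} \le C\varepsilon^{\kappa}e^{C_1t} \]
so long as (5.2) holds. Gagliardo–Nirenberg inequality yields
\[ \|w^\varepsilon(t)\|_{L^r(\mathbb{R}^d)} \lesssim \varepsilon^{-\gamma/8}\|w^\varepsilon\|^{1-\gamma/4}_{L^2(\mathbb{R}^d)}\|A^\varepsilon w^\varepsilon\|^{\gamma/4}_{L^2(\mathbb{R}^d)} \le C\varepsilon^{\kappa-\gamma/8}e^{C_1t}. \]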
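The Gronwall step and the way the bootstrap (5.2) closes on a logarithmic time scale can be spelled out as follows. This is a sketch under the stated estimates only; the auxiliary function $f$ and the generic constant $K$ are notation introduced here for illustration, and constants are allowed to change from line to line.

```latex
% Gronwall step (sketch). Set f(t) := \|w^\varepsilon\|_{L^\infty([0,t];\mathcal{H})},
% so that \|w^\varepsilon\|_{L^1([0,t];\mathcal{H})} \le \int_0^t f(s)\,ds.
\[
f(t)\le K\int_0^t f(s)\,ds+K\varepsilon^{\kappa}e^{C_1t}
\quad\Longrightarrow\quad
f(t)\le K\varepsilon^{\kappa}e^{(C_1+K)t},
\]
% by the integral form of Gronwall's lemma with the nondecreasing source
% term g(t)=K\varepsilon^{\kappa}e^{C_1t}; the exponent C_1+K is then renamed C_1.
%
% Closing the bootstrap: (5.2) is recovered with room to spare as soon as
\[
C\varepsilon^{\kappa-\gamma/8}e^{C_1t}\le\frac12\,\varepsilon^{-\gamma/8}e^{C_0t},
\qquad\text{which holds if}\qquad
C\varepsilon^{\kappa}e^{C_1t}\le\frac12 .
\]
% For t\le C_2\ln(1/\varepsilon) one has e^{C_1t}\le\varepsilon^{-C_1C_2}, hence
\[
C\varepsilon^{\kappa}e^{C_1t}\le C\varepsilon^{\kappa-C_1C_2}\le\frac12
\quad\text{for } C_1C_2<\kappa \text{ and } \varepsilon \text{ sufficiently small.}
\]
```

In this form the role of a small $C_2$, independent of $\varepsilon$, is visible: it only has to satisfy $C_1C_2<\kappa$, together with the absorption constraint on $T^\varepsilon$ used earlier in the proof.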
Proposition 1.1 then follows, by choosing $T^\varepsilon=C_2\ln\frac1\varepsilon$ with $C_2>0$ sufficiently small and independent of $\varepsilon$, and $\varepsilon\in\,]0,\varepsilon_0]$ for $\varepsilon_0>0$ sufficiently small.

5.2. Proof of Theorem 1.1

We now explain how to upgrade Proposition 4.1 to Theorem 1.1, by examining the large time behavior of the quantities involved in the proof. By Proposition 3.1, we know that $u\in C(\mathbb{R}_+,\Sigma^k)$ provided that the initial datum $a\in\Sigma^k$. Recalling Step 1 in Lemma 3.2, we showed that for any $t\ge0$ and $\tau>0$ sufficiently small,
\[ \|u\|_{L^q([t,t+\tau];L^r(\mathbb{R}^d))} \le \frac32C\|u(0,\cdot)\|_{L^2(\mathbb{R}^d)}=\frac32C\|a\|_{L^2(\mathbb{R}^d)}, \]
where $C$ is independent of $t$, $\tau$. For any $t\ge0$, fixed $\tau\in\,]0,1[$, split $[t,t+1]$ into finitely many pieces with length at most $\tau$, then we obtain
\[ \|\varphi^\varepsilon\|_{L^q([t,t+1];L^r(\mathbb{R}^d))}=\varepsilon^{-\gamma/8}\|u\|_{L^q([t,t+1];L^r(\mathbb{R}^d))} \le C\varepsilon^{-\gamma/8}\|a\|_{L^2(\mathbb{R}^d)},\qquad \forall\, t\ge0. \]
Using Proposition 4.1 and Gagliardo–Nirenberg inequality, we know that there exists $\varepsilon_0>0$ such that
\[ \|w^\varepsilon\|_{L^q([t,t+1];L^r(\mathbb{R}^d))} \le \varepsilon^{-\gamma/8}\|a\|_{L^2(\mathbb{R}^d)}, \tag{5.3} \]
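for $t\in[0,1]$. Suppose that (5.3) holds for $t\in[0,T^\varepsilon]$ ($T^\varepsilon\ge1$), and let $t,\tau>0$ with $t+\tau\le T^\varepsilon$. Denote by $L^aL^b$ the space $L^a([t,t+\tau];L^b(\mathbb{R}^d))$. Scaled Strichartz estimates, Hölder inequality and Hardy–Littlewood–Sobolev inequality yield
\[ \begin{aligned} \|w^\varepsilon\|_{L^qL^r} &\lesssim \varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2}+\varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1L^2}+\varepsilon^{\alpha_c-1-2/q}\big(\|\varphi^\varepsilon\|^2_{L^\theta L^r}+\|w^\varepsilon\|^2_{L^\theta L^r}\big)\|w^\varepsilon\|_{L^qL^r} \\ &\le C\big(\varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2}+\varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1L^2}+\tau^{1-\gamma/2}\|w^\varepsilon\|_{L^qL^r}\big), \end{aligned} \]
where $C$ is independent of $\varepsilon$, $t$, $\tau$. Choose $\tau\in\,]0,1]$ sufficiently small such that the last term on the above right-hand side can be absorbed by the left-hand side. Using scaled Strichartz estimates again and resuming the previous computations, we end up with
\[ \|w^\varepsilon\|_{L^\infty([0,t];L^2(\mathbb{R}^d))} \le C\int_0^t\|w^\varepsilon\|_{L^\infty([0,s];L^2(\mathbb{R}^d))}\,ds+C\varepsilon^{-1}\int_0^t\|L^\varepsilon(s)\|_{L^2}\,ds. \]
The last term is controlled thanks to Proposition 3.1, and Gronwall lemma yields
\[ \|w^\varepsilon\|_{L^\infty([0,t];L^2(\mathbb{R}^d))} \le C\sqrt{\varepsilon}\,e^{C_1t}. \tag{5.4} \]
Resuming Strichartz inequalities, and splitting the interval $[t,t+1]$ into finitely many intervals of length $\tau$, the above computation yields:
\[ \|w^\varepsilon\|_{L^q([t,t+1];L^r)} \lesssim \varepsilon^{-1/q}\|w^\varepsilon(t)\|_{L^2}+\varepsilon^{-1-1/q}\|L^\varepsilon\|_{L^1([t,t+1];L^2)} \lesssim \varepsilon^{-\gamma/8}\sqrt{\varepsilon}\,e^{C_1t}. \]
Setting $T^\varepsilon=C_2\ln\frac1\varepsilon$ with $C_2>0$ sufficiently small and independent of $\varepsilon$, and $\varepsilon\in\,]0,\varepsilon_0]$ for $\varepsilon_0>0$ sufficiently small, we see that (5.3) holds for all $t\in[0,T^\varepsilon]$; the first part of Theorem 1.1 follows. As in the proof of Proposition 4.1, we also have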
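\[ \begin{aligned} \sum_{P^\varepsilon\in\{\mathrm{Id},A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^qL^r} &\lesssim \varepsilon^{-1/q}\sum_{P^\varepsilon\in\{\mathrm{Id},A^\varepsilon,B^\varepsilon\}}\big(\|P^\varepsilon w^\varepsilon(t)\|_{L^2}+\|P^\varepsilon w^\varepsilon\|_{L^1L^2}\big)+\varepsilon^{-1-1/q}\sum_{P^\varepsilon\in\{\mathrm{Id},A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon L^\varepsilon\|_{L^1L^2} \\ &\quad+\varepsilon^{\alpha_c-1-2/q}\|w^\varepsilon\|_{L^qL^r}\sum_{P^\varepsilon\in\{A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon\varphi^\varepsilon\|^2_{L^\theta L^r} \\ &\quad+\varepsilon^{\alpha_c-1-2/q}\big(\|w^\varepsilon\|^2_{L^\theta L^r}+\|\varphi^\varepsilon\|^2_{L^\theta L^r}\big)\sum_{P^\varepsilon\in\{\mathrm{Id},A^\varepsilon,B^\varepsilon\}}\|P^\varepsilon w^\varepsilon\|_{L^qL^r}. \end{aligned} \]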
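Choosing $\tau$ sufficiently small, the last term on the right-hand side of the above inequality can be absorbed by the left-hand side. Notice that so long as (5.3) holds,
\[ \varepsilon^{\alpha_c-1-\frac2q}\|w^\varepsilon\|_{L^qL^r}\big(\|A^\varepsilon\varphi^\varepsilon\|^2_{L^\theta L^r}+\|B^\varepsilon\varphi^\varepsilon\|^2_{L^\theta L^r}\big) \lesssim \varepsilon^{-\gamma/8}e^{2C_0t}. \]
Using scaled Strichartz estimates again, mimicking the above computations, and using Proposition 3.1, Gronwall lemma yields
\[ \|w^\varepsilon\|_{L^\infty([0,t];\mathcal{H})} \le C\sqrt{\varepsilon}\,e^{C_1t}, \]
which completes the proof of Theorem 1.1.

6. Nonlinear Superposition

In this section, we consider the nonlinear superposition for the critical case $\alpha=\alpha_c$. The arguments to prove Theorems 1.2 and 1.3 are quite similar to the proof of Proposition 4.1 and Theorem 1.1, respectively. The important aspect to be understood is the interaction between the two profiles $\varphi_1^\varepsilon$ and $\varphi_2^\varepsilon$. The error $w^\varepsilon=\psi^\varepsilon-\varphi_1^\varepsilon-\varphi_2^\varepsilon$ satisfies
\[ i\varepsilon\partial_tw^\varepsilon+\frac{\varepsilon^2}{2}\Delta w^\varepsilon=Vw^\varepsilon-L^\varepsilon+N^\varepsilon;\qquad w^\varepsilon|_{t=0}=0, \]
where
\[ L^\varepsilon=\big(V(t,x)-T_2(t,x,x(t))\big)(\varphi_1^\varepsilon+\varphi_2^\varepsilon),\qquad N^\varepsilon=\lambda\varepsilon^{\alpha_c}\big(F(w^\varepsilon+\varphi_1^\varepsilon+\varphi_2^\varepsilon)-F(\varphi_1^\varepsilon)-F(\varphi_2^\varepsilon)\big),\qquad F(\psi):=(|x|^{-\gamma}*|\psi|^2)\psi. \]
As in [7], decompose $N^\varepsilon$ into two parts: a semilinear term
\[ N_S^\varepsilon=\lambda\varepsilon^{\alpha_c}\big(F(w^\varepsilon+\varphi_1^\varepsilon+\varphi_2^\varepsilon)-F(\varphi_1^\varepsilon+\varphi_2^\varepsilon)\big), \]
and an interaction (source) term
\[ N_I^\varepsilon=\lambda\varepsilon^{\alpha_c}\big(F(\varphi_1^\varepsilon+\varphi_2^\varepsilon)-F(\varphi_1^\varepsilon)-F(\varphi_2^\varepsilon)\big). \]
As noticed in [7], the term $N_S^\varepsilon$ can be treated as in the case of a single wave packet: the key point is to estimate
\[ \varepsilon^{\alpha_c-1}\|N_I^\varepsilon\|_{L^1([0,t];\Sigma_\varepsilon)}, \]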
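since $N_I^\varepsilon$ plays the role of a source term. This is the only new term to control to infer Theorems 1.2 and 1.3 from Proposition 4.1 and Theorem 1.1.

Lemma 6.1. Let $0<\gamma<\min(2,d)$, $T\ge0$ and $0<\sigma<\frac12$. Denote
\[ I^\varepsilon(T)=\{t\in[0,T];\ |x_1(t)-x_2(t)|\le\varepsilon^\sigma\}. \tag{6.1} \]
Then for any $k\in\mathbb{N}$ with $k>\gamma$,
\[ \varepsilon^{-1}\|N_I^\varepsilon\|_{L^1([0,T];\Sigma_\varepsilon)} \lesssim (M_{k+2}(T))^3\big(T\varepsilon^{\gamma(1/2-\sigma)}+|I^\varepsilon(T)|\big)e^{CT}, \]
where
\[ M_k(T)=\sup\{\|u_j\|_{L^\infty([0,T];\Sigma^k)};\ j\in\{1,2\}\}. \]
In particular, if $T>0$ is independent of $\varepsilon$, then
\[ \varepsilon^{-1}\|N_I^\varepsilon\|_{L^1([0,T];\Sigma_\varepsilon)} \lesssim (M_{k+2}(T))^3\big(\varepsilon^{\gamma(1/2-\sigma)}+\varepsilon^\sigma\big). \]

Proof. We compute
\[ N_I^\varepsilon=\varepsilon^{\alpha_c}\big((|x|^{-\gamma}*|\varphi_1^\varepsilon|^2)\varphi_2^\varepsilon+(|x|^{-\gamma}*|\varphi_2^\varepsilon|^2)\varphi_1^\varepsilon\big)+2\varepsilon^{\alpha_c}\big(|x|^{-\gamma}*\mathrm{Re}(\overline{\varphi_1^\varepsilon}\varphi_2^\varepsilon)\big)(\varphi_1^\varepsilon+\varphi_2^\varepsilon). \tag{6.2} \]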
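The decomposition (6.2) is a direct expansion of the cubic Hartree nonlinearity. A short derivation is sketched below; the shorthand $K=|x|^{-\gamma}$ is introduced only for this computation, and the overall constant in front is the one appearing in the definition of $N_I^\varepsilon$.

```latex
% Expansion of F(\varphi_1+\varphi_2)-F(\varphi_1)-F(\varphi_2), with K := |x|^{-\gamma}.
\[
|\varphi_1+\varphi_2|^2=|\varphi_1|^2+|\varphi_2|^2+2\,\mathrm{Re}(\overline{\varphi_1}\varphi_2),
\]
\[
F(\varphi_1+\varphi_2)
=\big(K*|\varphi_1|^2+K*|\varphi_2|^2+2K*\mathrm{Re}(\overline{\varphi_1}\varphi_2)\big)(\varphi_1+\varphi_2).
\]
% Subtracting F(\varphi_1)=(K*|\varphi_1|^2)\varphi_1 and F(\varphi_2)=(K*|\varphi_2|^2)\varphi_2
% removes the two "diagonal" terms and leaves only cross terms:
\[
F(\varphi_1+\varphi_2)-F(\varphi_1)-F(\varphi_2)
=(K*|\varphi_1|^2)\varphi_2+(K*|\varphi_2|^2)\varphi_1
+2\big(K*\mathrm{Re}(\overline{\varphi_1}\varphi_2)\big)(\varphi_1+\varphi_2).
\]
```

Every remaining term couples the two wave packets, which is why the size of $N_I^\varepsilon$ is governed by the separation $|x_1(t)-x_2(t)|$ of the two centers.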
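On the complement of $I^\varepsilon(T)$, we will use Peetre inequality: for $\eta\in\mathbb{R}^d$,
\[ \sup_{x\in\mathbb{R}^d}\big(\langle x\rangle^{-1}\langle x-\eta\rangle^{-1}\big) \lesssim \frac{1}{\langle\eta\rangle}\le\frac{1}{|\eta|}. \]
Denote $\eta^\varepsilon=\frac{x_1(t)-x_2(t)}{\sqrt{\varepsilon}}$. For the last term in (6.2), we have, for $j\in\{1,2\}$,
\[ \begin{aligned} \varepsilon^{\alpha_c-1}\int_{[0,T]\setminus I^\varepsilon(T)}\big\|(|x|^{-\gamma}*|\varphi_1^\varepsilon(t,x)\varphi_2^\varepsilon(t,x)|)\varphi_j^\varepsilon(t,x)\big\|_{L^2(\mathbb{R}^d)}\,dt &=\int_{[0,T]\setminus I^\varepsilon(T)}\big\|(|x|^{-\gamma}*|u_1(t,x)u_2(t,x-\eta^\varepsilon)|)u_j(t,x)\big\|_{L^2(\mathbb{R}^d)}\,dt \\ &\lesssim \big\||x|^{-\gamma}*|\langle x\rangle^ku_1(t,x)\langle x-\eta^\varepsilon\rangle^ku_2(t,x-\eta^\varepsilon)|\big\|_{L^\infty L^{2d/\gamma}}\|u_j\|_{L^\infty L^{2d/(d-\gamma)}}\times\int_{[0,T]\setminus I^\varepsilon(T)}\frac{dt}{|\eta^\varepsilon(t)|^k}. \end{aligned} \]
In view of Hardy–Littlewood–Sobolev inequality and Sobolev embedding, we have:
\[ \begin{aligned} \big\||x|^{-\gamma}*|\langle x\rangle^ku_1(t,x)\langle x-\eta^\varepsilon\rangle^ku_2(t,x-\eta^\varepsilon)|\big\|_{L^\infty L^{2d/\gamma}} &\lesssim \|\langle x\rangle^ku_1(x)\|_{L^\infty L^{4d/(2d-\gamma)}}\|\langle x-\eta^\varepsilon\rangle^ku_2(x-\eta^\varepsilon)\|_{L^\infty L^{4d/(2d-\gamma)}} \\ &\lesssim \|\langle x\rangle^ku_1(x)\|_{L^\infty H^1}\|\langle x-\eta^\varepsilon\rangle^ku_2(x-\eta^\varepsilon)\|_{L^\infty H^1} \lesssim (M_{k+1}(T))^2. \end{aligned} \]
On the other hand, we have
\[ \int_{[0,T]\setminus I^\varepsilon(T)}\frac{dt}{|\eta^\varepsilon(t)|^k}=\int_{[0,T]\setminus I^\varepsilon(T)}\frac{\varepsilon^{k/2}}{|x_1(t)-x_2(t)|^k}\,dt \le \varepsilon^{k(1/2-\sigma)}T. \]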
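The last bound uses only the definition (6.1) of $I^\varepsilon(T)$. As a brief sketch of the step:

```latex
% On the complement of I^\varepsilon(T) the centers are separated by more than \varepsilon^\sigma.
\[
t\in[0,T]\setminus I^\varepsilon(T)
\;\Longrightarrow\;
|x_1(t)-x_2(t)|>\varepsilon^{\sigma}
\;\Longrightarrow\;
|\eta^\varepsilon(t)|=\frac{|x_1(t)-x_2(t)|}{\sqrt{\varepsilon}}>\varepsilon^{\sigma-1/2},
\]
\[
\frac{1}{|\eta^\varepsilon(t)|^{k}}<\varepsilon^{k(1/2-\sigma)},
\qquad
\int_{[0,T]\setminus I^\varepsilon(T)}\frac{dt}{|\eta^\varepsilon(t)|^{k}}
\le\varepsilon^{k(1/2-\sigma)}\,\big|[0,T]\setminus I^\varepsilon(T)\big|
\le\varepsilon^{k(1/2-\sigma)}\,T .
\]
```

Since $0<\sigma<\frac12$, the exponent $k(1/2-\sigma)$ is positive, so this contribution is small as $\varepsilon\to0$.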
It follows that
\[ \varepsilon^{\alpha_c-1}\int_{[0,T]\setminus I^\varepsilon(T)}\big\|(|x|^{-\gamma}*|\varphi_1^\varepsilon\varphi_2^\varepsilon|)\varphi_j^\varepsilon(t,x)\big\|_{L^2(\mathbb{R}^d)}\,dt \lesssim (M_{k+1}(T))^3\varepsilon^{k(1/2-\sigma)}T. \]
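The first two terms in (6.2) are of the same form, so we consider the first one only:
\[ \varepsilon^{\alpha_c-1}\int_{[0,T]\setminus I^\varepsilon(T)}\big\|(|x|^{-\gamma}*|\varphi_1^\varepsilon|^2)\varphi_2^\varepsilon\big\|_{L^2(\mathbb{R}^d)}\,dt=\int_{[0,T]\setminus I^\varepsilon(T)}\Big\|\Big(\int_{\mathbb{R}^d}|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\,dy\Big)u_2(t,x)\Big\|_{L^2(\mathbb{R}^d)}\,dt. \]
In view of Peetre inequality, using similar arguments as above, we have
\[ \begin{aligned} &\int_{[0,T]\setminus I^\varepsilon(T)}\Big\|\Big(\int_{|y+\eta^\varepsilon|>|\eta^\varepsilon|}|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\,dy\Big)\times u_2(t,x)\Big\|_{L^2(\mathbb{R}^d)}\,dt \\ &\quad\lesssim \int_{[0,T]\setminus I^\varepsilon(T)}\sup_{\substack{x\in\mathbb{R}^d\\ |y+\eta^\varepsilon|>|\eta^\varepsilon|}}\langle x-y-\eta^\varepsilon\rangle^{-k}\langle x\rangle^{-k}\,dt\ \|\langle x\rangle^ku_2(t,x)\|_{L^\infty L^{2d/(d-\gamma)}}\times\Big\|\int|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\langle x-y-\eta^\varepsilon\rangle^k\,dy\Big\|_{L^\infty L^{2d/\gamma}} \\ &\quad\lesssim \int_{[0,T]\setminus I^\varepsilon(T)}\frac{dt}{|\eta^\varepsilon|^k}\ \|\langle x\rangle^ku_2(t,x)\|_{L^\infty L^{2d/(d-\gamma)}}\|\langle x\rangle^{k/2}u_1(t,x)\|^2_{L^\infty L^{4d/(2d-\gamma)}} \\ &\quad\lesssim (M_{k+1}(T))^3\varepsilon^{k(1/2-\sigma)}T. \end{aligned} \]
We observe that
\[ \{y\in\mathbb{R}^d:\ |y+\eta^\varepsilon|\le|\eta^\varepsilon|\}\subset\{y\in\mathbb{R}^d:\ |y|\le2|\eta^\varepsilon|\}, \]
and, for $k>0$,
\[ \langle x-\eta^\varepsilon\rangle^k \lesssim \langle x-y-\eta^\varepsilon\rangle^k+|y|^k. \]
Then we have, for all $x\in\mathbb{R}^d$,
\[ \begin{aligned} \int_{\{y:|y+\eta^\varepsilon|\le|\eta^\varepsilon|\}}|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\langle x-\eta^\varepsilon\rangle^k\,dy &\lesssim \int_{\mathbb{R}^d}|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\langle x-y-\eta^\varepsilon\rangle^k\,dy+\int_{\{y:|y|\le2|\eta^\varepsilon|\}}|y|^{k-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\,dy \\ &\lesssim \int_{\mathbb{R}^d}|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\langle x-y-\eta^\varepsilon\rangle^k\,dy+|\eta^\varepsilon|^{k-\gamma}\|u_1(t)\|^2_{L^2}, \end{aligned} \]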
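since $k>\gamma$. It follows that
\[ \begin{aligned} &\int_{[0,T]\setminus I^\varepsilon(T)}\Big\|\Big(\int_{|y+\eta^\varepsilon|\le|\eta^\varepsilon|}|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\,dy\Big)u_2(t,x)\Big\|_{L^2(\mathbb{R}^d)}\,dt \\ &\quad\lesssim \int_{[0,T]\setminus I^\varepsilon(T)}\sup_{x\in\mathbb{R}^d}\langle x-\eta^\varepsilon\rangle^{-k}\langle x\rangle^{-k}\,dt\ \|\langle x\rangle^ku_2(t,x)\|_{L^\infty L^{2d/(d-\gamma)}}\times\Big\|\int_{|y+\eta^\varepsilon|\le|\eta^\varepsilon|}|y|^{-\gamma}|u_1(t,x-y-\eta^\varepsilon)|^2\langle x-\eta^\varepsilon\rangle^k\,dy\Big\|_{L^\infty L^{2d/\gamma}} \\ &\quad\lesssim (M_{k+1}(T))^3\int_{[0,T]\setminus I^\varepsilon(T)}|\eta^\varepsilon|^{-k}\big(1+|\eta^\varepsilon|^{k-\gamma}\big)\,dt \\ &\quad\lesssim (M_{k+1}(T))^3\big(\varepsilon^{k(1/2-\sigma)}+\varepsilon^{\gamma(1/2-\sigma)}\big)T \lesssim (M_{k+1}(T))^3\varepsilon^{\gamma(1/2-\sigma)}T, \end{aligned} \]
since $k>\gamma$. In $I^\varepsilon(T)$, Hölder inequality, Hardy–Littlewood–Sobolev inequality and Sobolev embedding yield similarly
\[ \varepsilon^{\alpha_c-1}\int_{I^\varepsilon(T)}\|N_I^\varepsilon\|_{L^2(\mathbb{R}^d)}\,dt \lesssim (M_1(T))^3|I^\varepsilon(T)|. \]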
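Note that
\[ \|\varepsilon\nabla\varphi_j^\varepsilon\|_{L^2(\mathbb{R}^d)} \lesssim \sqrt{\varepsilon}\,\|\nabla u_j\|_{L^2(\mathbb{R}^d)}+|\xi_j|\,\|u_j\|_{L^2(\mathbb{R}^d)},\qquad \|x\varphi_j^\varepsilon\|_{L^2(\mathbb{R}^d)} \lesssim \sqrt{\varepsilon}\,\|yu_j\|_{L^2(\mathbb{R}^d)}+|x_j|\,\|u_j\|_{L^2(\mathbb{R}^d)}. \]
The $\Sigma_\varepsilon$ estimate of $N_I^\varepsilon$ then follows easily, in view of Lemma 1.1. When $T>0$ does not depend on $\varepsilon$, we simply invoke [7, Lemma 6.2]: $|I^\varepsilon(T)|=O(\varepsilon^\sigma)$, and the proof of the lemma is complete.

6.1. Proof of Theorem 1.2

Due to the lack of natural rescaling for two wave packets, we use a bootstrap argument even on finite time intervals; the operators $A^\varepsilon$ and $B^\varepsilon$ used in Sec. 4 are helpful analytically because they have a precise geometrical meaning in terms of one wave packet, and this meaning is lost in the case of two wave packets. Since for $j=1,2$,
\[ \|\varphi_j^\varepsilon(t)\|_{L^r(\mathbb{R}^d)} \le C(T)\varepsilon^{-\gamma/8},\qquad \forall\, t\in[0,T], \]
the bootstrap argument goes as follows: so long as
\[ \|w^\varepsilon(t)\|_{L^r(\mathbb{R}^d)} \le C(T)\varepsilon^{-\gamma/8}, \tag{6.3} \]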
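we estimate the error $w^\varepsilon$ with a rather precise rate. Resuming the computations of Proposition 4.1 and using the above bootstrap argument, we have
\[ \|w^\varepsilon\|_{L^\infty([0,T];L^2(\mathbb{R}^d))} \lesssim \frac1\varepsilon\|L^\varepsilon\|_{L^1([0,T];L^2(\mathbb{R}^d))}+\frac1\varepsilon\|N_I^\varepsilon\|_{L^1([0,T];L^2(\mathbb{R}^d))}. \]
Similarly, applying $\varepsilon\nabla$ and $x$ to the equation, and carrying out a computation analogous to the one for $A^\varepsilon$ and $B^\varepsilon$, we get
\[ \|w^\varepsilon\|_{L^\infty([0,T];\Sigma_\varepsilon)} \lesssim \frac1\varepsilon\|L^\varepsilon\|_{L^1([0,T];\Sigma_\varepsilon)}+\frac1\varepsilon\|N_I^\varepsilon\|_{L^1([0,T];\Sigma_\varepsilon)} \lesssim \sqrt{\varepsilon}+\varepsilon^\sigma+\varepsilon^{\gamma(1/2-\sigma)} \lesssim \varepsilon^\sigma+\varepsilon^{\gamma(1/2-\sigma)}, \]
since $0<\sigma<1/2$. Optimizing the estimate in $\sigma$, we find
\[ \sigma=\gamma\Big(\frac12-\sigma\Big)\iff\sigma=\frac{\gamma}{2(1+\gamma)}, \]
which is consistent with $0<\sigma<1/2$. Then Gagliardo–Nirenberg inequality yields
\[ \|w^\varepsilon(t)\|_{L^r(\mathbb{R}^d)} \lesssim \varepsilon^{-\gamma/4}\|w^\varepsilon(t)\|^{1-\gamma/4}_{L^2(\mathbb{R}^d)}\|\varepsilon\nabla w^\varepsilon(t)\|^{\gamma/4}_{L^2(\mathbb{R}^d)} \lesssim \varepsilon^{\frac{\gamma}{2(1+\gamma)}-\gamma/4}. \]
Now we notice that
\[ \frac{\gamma}{2(1+\gamma)}-\frac\gamma4>-\frac\gamma8, \]
so for any $T>0$ independent of $\varepsilon$, we can find $\varepsilon_0$ so that (6.3) holds for $t\in[0,T]$ provided that $\varepsilon\in\,]0,\varepsilon_0]$. Theorem 1.2 follows.

6.2. Proof of Theorem 1.3

Since $a_j\in\Sigma^k$, we have, by Proposition 3.1,
\[ M_k(t)\le C_ke^{C_kt}. \]
From [7, Lemma 6.3], there exist positive constants $C$, $C_1$, $C_2$ independent of $\varepsilon$ such that
\[ |I^\varepsilon(t)|\le C\varepsilon^\sigma e^{C_1t}|E_1-E_2|^{-2},\qquad \forall\, t\in\Big[0,C_2\ln\frac1\varepsilon\Big]. \]
It follows from Lemma 6.1 that
\[ \varepsilon^{-1}\|N_I^\varepsilon\|_{L^1([0,t];\Sigma_\varepsilon)} \lesssim e^{Ct}\big(t\varepsilon^{\gamma(1/2-\sigma)}+\varepsilon^\sigma e^{C_1t}\big) \lesssim e^{Ct}\big(\varepsilon^{\gamma(1/2-\sigma)}+\varepsilon^\sigma\big). \]
Choosing $k\ge3$ and the same (optimal) $\sigma$ as before,
\[ \sigma=\frac{\gamma}{2(1+\gamma)}, \]
we have
\[ \varepsilon^{-1}\|N_I^\varepsilon\|_{L^1([0,t];\Sigma_\varepsilon)} \lesssim \varepsilon^{\gamma/2(1+\gamma)}e^{Ct}. \]
Resuming the bootstrap argument and similar arguments as in Sec. 5 yields Theorem 1.3.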
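The choice of $\sigma$ used in Sections 6.1 and 6.2, and the inequality that closes the bootstrap (6.3), both reduce to elementary algebra. The following sketch records the computation, assuming only $0<\gamma<\min(2,d)$ and $0<\varepsilon<1$.

```latex
% Balancing the two competing powers of \varepsilon.
% For 0<\varepsilon<1 the sum \varepsilon^{\sigma}+\varepsilon^{\gamma(1/2-\sigma)} is of order
% \varepsilon^{\min(\sigma,\,\gamma(1/2-\sigma))}; the first exponent increases with \sigma,
% the second decreases, so the minimum is largest when they coincide:
\[
\sigma=\gamma\Big(\frac12-\sigma\Big)
\iff \sigma(1+\gamma)=\frac\gamma2
\iff \sigma=\frac{\gamma}{2(1+\gamma)}\in\Big(0,\frac12\Big),
\]
% the last inclusion because 0<\gamma/(1+\gamma)<1.
%
% Bootstrap condition: dividing by \gamma>0,
\[
\frac{\gamma}{2(1+\gamma)}-\frac\gamma4>-\frac\gamma8
\iff \frac{1}{2(1+\gamma)}>\frac18
\iff 1+\gamma<4
\iff \gamma<3,
\]
% which holds since \gamma<\min(2,d)\le 2.
```

With this $\sigma$, both error contributions are of order $\varepsilon^{\gamma/(2(1+\gamma))}$, which is the rate appearing in the final estimate of Section 6.2.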
Acknowledgments

This work was supported by the French ANR project R.A.S. (ANR-08-JCJC-012401), and was achieved when the first author was visiting the University of Montpellier 2, under a grant from Tsinghua University. She would like to thank these institutions for this opportunity.
References

[1] A. Athanassoulis, T. Paul, F. Pezzotti and M. Pulvirenti, Coherent states propagation for the Hartree equation, to appear in Ann. Henri Poincaré (2011); doi:10.1007/s00023-011-0115-2.
[2] D. Bambusi, S. Graffi and T. Paul, Long time semiclassical approximation of quantum flows: A proof of the Ehrenfest time, Asymptot. Anal. 21(2) (1999) 149–160.
[3] J. M. Bily and D. Robert, The semi-classical Van Vleck formula. Application to the Aharonov–Bohm effect, in Long Time Behaviour of Classical and Quantum Systems (Bologna, 1999), Ser. Concr. Appl. Math., Vol. 1 (World Sci. Publ., River Edge, NJ, 2001), pp. 89–106.
[4] R. Carles, Semi-Classical Analysis for Nonlinear Schrödinger Equations (World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2008).
[5] R. Carles, Nonlinear Schrödinger equation with time dependent potential, Commun. Math. Sci. 9(4) (2011) 937–964.
[6] R. Carles, Interaction of coherent states for Hartree equations, preprint (2011); http://arxiv.org/abs/1104.0781.
[7] R. Carles and C. Fermanian-Kammerer, Nonlinear coherent states and Ehrenfest time for Schrödinger equations, Comm. Math. Phys. 301(2) (2011) 443–472.
[8] T. Cazenave, Semilinear Schrödinger Equations, Courant Lecture Notes in Mathematics, Vol. 10 (New York University Courant Institute of Mathematical Sciences, New York, 2003).
[9] M. Combescure and D. Robert, Semiclassical spreading of quantum wave packets and applications near unstable fixed points of the classical flow, Asymptot. Anal. 14(4) (1997) 377–404.
[10] M. Combescure and D. Robert, Quadratic quantum Hamiltonians revisited, Cubo 8(1) (2006) 61–86.
[11] M. Combescure and D. Robert, A phase-space study of the quantum Loschmidt echo in the semiclassical limit, Ann. Henri Poincaré 8(1) (2007) 91–108.
[12] D. Fujiwara, A construction of the fundamental solution for the Schrödinger equation, J. Anal. Math. 35 (1979) 41–96.
[13] D. Fujiwara, Remarks on the convergence of the Feynman path integrals, Duke Math. J. 47(3) (1980) 559–600.
[14] P. Gérard, Oscillations and concentration effects in semilinear dispersive wave equations, J. Funct. Anal. 141(1) (1996) 60–98.
[15] J. Ginibre and G. Velo, Scattering theory in the energy space for a class of nonlinear Schrödinger equations, J. Math. Pures Appl. (9) 64(4) (1985) 363–401.
[16] G. A. Hagedorn, Semiclassical quantum mechanics. I. The ħ → 0 limit for coherent states, Comm. Math. Phys. 71(1) (1980) 77–93.
[17] G. A. Hagedorn and A. Joye, Exponentially accurate semiclassical dynamics: Propagation, localization, Ehrenfest times, scattering, and more general states, Ann. Henri Poincaré 1(5) (2000) 837–883.
[18] G. A. Hagedorn and A. Joye, A time-dependent Born–Oppenheimer approximation with exponentially small error estimates, Comm. Math. Phys. 223(3) (2001) 583–626.
[19] T. Kato, On nonlinear Schrödinger equations, Ann. IHP (Phys. Théor.) 46(1) (1987) 113–129.
[20] M. Keel and T. Tao, Endpoint Strichartz estimates, Amer. J. Math. 120(5) (1998) 955–980.
[21] R. G. Littlejohn, The semiclassical evolution of wave packets, Phys. Rep. 138(4–5) (1986) 193–291.
[22] T. Paul, Semi-classical methods with emphasis on coherent states, in Quasiclassical Methods (Minneapolis, MN, 1995), IMA Vol. Math. Appl., Vol. 95 (Springer, New York, 1997), pp. 51–88.
[23] D. Robert, On the Herman–Kluk semiclassical approximation, Rev. Math. Phys. 22(10) (2010) 1123–1145.
[24] V. Rousse, Semiclassical simple initial value representations, to appear in Ark. för Mat.; http://arxiv.org/abs/0904.0387.
[25] T. Swart and V. Rousse, A mathematical justification for the Herman–Kluk propagator, Comm. Math. Phys. 286(2) (2009) 725–750.
[26] K. Yajima, Existence of solutions for Schrödinger evolution equations, Comm. Math. Phys. 110 (1987) 415–426.
October 31, J070-S0129055X11004497
2011 13:3 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 9 (2011) 969–1008 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004497

GEOMETRIC APPROACH TO THE HAMILTON–JACOBI EQUATION AND GLOBAL PARAMETRICES FOR THE SCHRÖDINGER PROPAGATOR

SANDRO GRAFFI∗ and LORENZO ZANELLI† Dipartimento di Matematica, Università di Bologna, Piazza di Porta S. Donato 5, 40126 Bologna, Italy ∗graffi@dm.unibo.it †[email protected]

Received 13 February 2011 Revised 29 September 2011

We construct a family of global Fourier Integral Operators, defined for arbitrary large times, representing a global parametrix for the Schrödinger propagator when the potential is quadratic at infinity. This construction is based on the geometric approach to the corresponding Hamilton–Jacobi equation and thus sidesteps the problem of the caustics generated by the classical flow. Moreover, a detailed study of the real phase function allows us to recover a WKB semiclassical approximation which necessarily involves the multivaluedness of the graph of the Hamiltonian flow past the caustics.

Keywords: Schrödinger equation; global Fourier Integral Operators; multivalued WKB semiclassical method; symplectic geometry.

Mathematics Subject Classification 2010: 47D08, 35S30, 53D12
1. Introduction and Statement of the Results

Let us consider the initial value problem for the Schrödinger equation:
\[ i\hbar\partial_t\psi(t,x)=-\frac{\hbar^2}{2m}\Delta\psi(t,x)+V(x)\psi(t,x),\qquad \psi(0,x)=\varphi(x), \tag{1.1} \]
where the potential $V\in C^\infty(\mathbb{R}^n;\mathbb{R})$ is assumed quadratic at infinity. In this case it is well known that the operator $H$ in $L^2(\mathbb{R}^n)$ defined by the maximal action of $-\frac{\hbar^2}{2m}\Delta+V(x)$ is self-adjoint. Hence the Cauchy problem (1.1) considered in $L^2(\mathbb{R}^n)$ admits the unique global solution $\psi(t,x)=e^{-iHt/\hbar}\varphi(x)$, $\forall\, t\in\mathbb{R}$, $\forall\,\varphi\in L^2(\mathbb{R}^n)$. Under the present conditions a parametrix of the propagator under the form of a semiclassical Fourier integral operator (WKB representation) has been constructed long ago by Chazarain [5] (for related results by the same technique, see
also [8, 13]; for recent related work, see [16, 21, 22]). The occurrence of caustics of the Hamilton–Jacobi equation makes this construction local in time; the solution at an arbitrary time $T>0$ requires multiple compositions of the local representations. A global parametrix for the propagator has been constructed through the method of complex valued phase functions (as in [12, 14]), with related complex transport coefficients. A particularly convenient choice of the complex phase function (the Herman–Kluk representation) has been isolated in the chemical physics literature long ago ([9]); its validity has been recently proved in [18, 17]. The relation between the above approaches and the underlying classical flow is however less direct than the standard WKB approximation in which the phase function solves the Hamilton–Jacobi equation.

In this paper, we study the problem through the geometric approach to the Hamilton–Jacobi equation (see, e.g., [6, 23]). In Theorem 1.1, a parametrix is obtained for the propagator $U(t):=e^{-iHt/\hbar}$ valid for $t\in[0,T]$, $0<T<\infty$, under the form of a family of semiclassical global Fourier Integral Operators (FIO), which extend to continuous operators in $L^2(\mathbb{R}^n)$. The corresponding phase function is real and generates the graph of the flow of the Hamiltonian $H=\frac{p^2}{2m}+V(x)$. This technique not only yields globality in time, but also helps to obtain a unified view of Fujiwara's as well as Chazarain's approaches on one side, and of the Laptev–Sigal one on the other. In Theorem 1.2, we prove that a WKB construction is still valid, necessarily multivalued because of the caustics. We assume:
\[ V(x)=\langle Lx,x\rangle+V_0(x),\qquad L\in GL(n),\qquad \det L\neq0, \tag{1.2} \]
\[ V_0\in C^\infty(\mathbb{R}^n),\qquad |\partial_x^\alpha V_0(x)|\le C_0,\qquad |\alpha|=0,1,2,\ldots. \tag{1.3} \]
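Then the main result of the paper is:

Theorem 1.1. Let (1.2) and (1.3) be fulfilled. Let $0<T<\infty$ and $\varphi\in\mathcal{S}(\mathbb{R}^n)$. Then:
\[ \psi(t,x)=\sum_{j=0}^N(2\pi\hbar)^{-n+j}\int_{\mathbb{R}^{2n+k}}e^{\frac{i}{\hbar}(S(t,x,\eta,\theta)-\langle y,\eta\rangle)}b_j(t,x,\eta,\theta)\,d\theta\,d\eta\,\varphi(y)\,dy+O(\hbar^{N+1}) \tag{1.4} \]
where $k>2n+C_{H,N}T^4$ and $C_{H,N}>0$ is shown in Theorem 2.1. Moreover, the following assertions hold:

(i) The phase function $(t,x,\eta,\theta)\mapsto S(t,x,\eta,\theta)$ belongs to $C^\infty([0,T]\times\mathbb{R}^{2n}\times\mathbb{R}^k;\mathbb{R})$ and it is a global generating function for the graph $\Lambda_t$ of the Hamiltonian flow $\phi_H^t:T^*\mathbb{R}^n\to T^*\mathbb{R}^n$ $\forall\, t\in[0,T]$:
\[ \Lambda_t:=\{(y,\eta;x,p)\in T^*\mathbb{R}^n\times T^*\mathbb{R}^n\ |\ (x,p)=\phi_H^t(y,\eta)\}=\{(y,\eta;x,p)\in T^*\mathbb{R}^n\times T^*\mathbb{R}^n\ |\ p=\nabla_xS,\ y=\nabla_\eta S,\ 0=\nabla_\theta S\}. \tag{1.5} \]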
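(ii) The transport coefficients $b_j$, $j=0,\ldots$, are determined by the first order PDE:
\[ \begin{cases} \partial_tb_0+\frac1m\nabla_xS\,\nabla_xb_0+\frac{1}{2m}\Delta_xS\,b_0(t,x,\eta,\theta)=\Theta_N(t,x,\eta,\theta), \\ b_0(0,x,\eta,\theta)=\rho(\theta),\qquad \rho\in\mathcal{S}(\mathbb{R}^k;\mathbb{R}^+),\qquad \|\rho\|_{L^1}=1. \end{cases} \tag{1.6} \]
\[ \begin{cases} \partial_tb_j+\frac1m\nabla_xS\,\nabla_xb_j+\frac{1}{2m}\Delta_xS\,b_j(t,x,\eta,\theta)=\frac{i}{2m}\Delta_xb_{j-1}(t,x,\eta,\theta), \\ b_j(0,x,\eta,\theta)=0,\qquad j\ge1. \end{cases} \tag{1.7} \]
Here $\Theta_N\in C_b^\infty([0,T]\times\mathbb{R}^{2n+k};\mathbb{R})$ (see Theorem 2.4) fulfills the following condition:
\[ \Pi^\alpha\Theta_N\in C_b^\infty(\mathbb{R}^{2n+k};\mathbb{R}),\quad \forall\, 0\le\alpha\le N,\qquad \Pi\Theta_N:=\mathrm{div}_\theta\Big(\Theta_N\frac{\nabla_\theta S}{|\nabla_\theta S|^2}\Big). \]

(iii) $\forall\, 0\le t\le T$, $0\le T<+\infty$, the expansion (1.4) generates an $L^2$ parametrix of the propagator $U(t)=e^{-iHt/\hbar}$: each term is a continuous FIO on $\mathcal{S}(\mathbb{R}^n)$ denoted $B_j(t)$, $j=0,1,\ldots$, which admits a continuous extension to $L^2(\mathbb{R}^n)$, and:
\[ U(t)=\sum_{j=0}^NB_j(t)+O(\hbar^{N+1}). \tag{1.8} \]
The notation $O(\hbar^{N+1})$ means:
\[ \|R_N(t)\|_{L^2\to L^2}\le C_N(T)\hbar^{N+1},\quad \forall\, N\ge0,\quad \forall\, t\in[0,T],\qquad R_N(t):=U(t)-\sum_{j=0}^NB_j(t). \]
Moreover, the expansion (1.8) does not depend on $\rho$ provided $\|\rho\|_{L^1}=1$. Namely:
\[ \sum_{j=0}^NB_j[\rho_1](t)-\sum_{j=0}^NB_j[\rho_2](t)=O(\hbar^{N+1}). \]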
By applying the stationary phase theorem to the oscillatory integral (1.4), the integration over the auxiliary parameters $\theta$ can be eliminated and the WKB approximation to the evolution operator is recovered, necessarily multivalued on account of the caustics.

Theorem 1.2. Let $V(x)=\frac12|x|^2+V_0(x)$ with $\sup_{x\in\mathbb{R}^n}\|\nabla^2V_0(x)\|<1$; let $\hat\varphi_\hbar(\eta)$ be the $\hbar$-Fourier transform of the initial datum $\varphi$. Then $\forall\, t\in[0,T]$, $t\neq(2\tau+1)\frac\pi2$, $\tau\in\mathbb{N}$, there exists a finite open partition $\mathbb{R}^n\times\mathbb{R}^n=\bigcup_{\ell=1}^{N}D_\ell$ such that the solution of (1.1) can be represented as:
\[ \psi(t,x)=\int_{\mathbb{R}^n}\hat U_\hbar(t,x,\eta)\,\hat\varphi_\hbar(\eta)\,d\eta,\qquad 0\le t\le T,\quad t\neq(2\tau+1)\frac\pi2, \]
with kernel
\[ \hat U_\hbar|_{D_\ell}=\sum_{\alpha=1}^{\ell}e^{\frac{i}{\hbar}S_\alpha(t,x,\eta)}\,|\det\nabla_\theta^2S(t,x,\eta,\theta^\star_\alpha(t,x,\eta))|^{-\frac12}\,e^{\frac{i\pi}{4}\sigma_\alpha}\,b_{\alpha,0}(t,x,\eta)+O(\hbar), \]
\[ \sigma_\alpha:=\mathrm{sgn}\,\nabla_\theta^2S(t,x,\eta,\theta^\star_\alpha(t,x,\eta)),\qquad S_\alpha:=S(t,x,\eta,\theta^\star_\alpha(t,x,\eta)), \tag{1.9} \]
\[ b_{\alpha,0}:=b_0(t,x,\eta,\theta^\star_\alpha(t,x,\eta)), \]
where $N$ is a $t$-dependent natural number and:

(i) On each $D_\ell$ the equation $0=\nabla_\theta S(t,x,\eta,\theta)$ has smooth solutions $\theta^\star_\alpha(t,x,\eta)$, $1\le\alpha\le\ell$.

(ii) Any function $S_\alpha(t,x,\eta)$ solves locally the Hamilton–Jacobi equation:
\[ \frac{|\nabla_xS_\alpha|^2}{2m}(t,x,\eta)+V(x)+\partial_tS_\alpha(t,x,\eta)=0. \]

(iii) An explicit upper bound on the $t$-dependent natural number $N$ is computed in (2.71).

Example. In the harmonic oscillator case $V(x)=\frac12|x|^2$, the phase function is exactly quadratic: $S(t,x,\eta,\theta)=\langle x,\eta\rangle-\frac t2(|\eta|^2+|x|^2)+\langle v(t,x,\eta),\theta\rangle+\langle Q(t)\theta,\theta\rangle$. It admits a unique smooth global critical point $\theta^\star(t,x,\eta)$ on $(x,\eta)\in\mathbb{R}^{2n}$ for $t\in[0,T]$ outside resonant times $t=(2\tau+1)\frac\pi2$, $\tau\in\mathbb{N}$. Hence the finite sum (1.9) reduces to just one term coinciding with the well known Mehler formula:
\[ \psi(t,x)=\frac{1}{(\cos t)^{n/2}}\int_{\mathbb{R}^n}e^{\frac{i}{\hbar}\frac{1}{\cos(t)}\left[\langle x,\eta\rangle-\frac12\sin(t)(|\eta|^2+|x|^2)\right]}\,\hat\varphi_\hbar(\eta)\,d\eta. \]

Remarks. (1) The phase function is constructed (Sec. 2) by the Amann–Conley–Zehnder reduction technique of the Hamilton–Helmholtz functional ([1–3, 6]):
\[ S(t,x,\eta,\theta)=\langle x,\eta\rangle+\int_0^t\big[\gamma^p(s)\dot\gamma^x(s)-H(\gamma^x(s),\eta+\gamma^p(s))\big]\,ds\,\Big|_{\gamma(t,x,\theta)(\cdot)}. \tag{1.10} \]
The curves $\gamma(t,x,\theta)(s)=(\gamma^x(t,x,\theta)(s),\gamma^p(t,x,\theta)(s))$ are parametrized as:
\[ \gamma(t,x,\theta)(s):=\begin{cases} \gamma^x(t,x,\theta)(s)=x-\displaystyle\int_s^t\phi^x(t,x,\theta)(\tau)\,d\tau; & \phi^x=\theta^x+f^x(t,x,\theta); \\ \gamma^p(t,x,\theta)(s)=\displaystyle\int_0^s\phi^p(t,x,\theta)(\tau)\,d\tau; & \phi^p=\theta^p+f^p(t,x,\theta). \end{cases} \]
Here $\theta\in P_ML^2([0,T];\mathbb{R}^{2n})\simeq\mathbb{R}^k$ ($P_M$ is the finite-dimensional Fourier orthogonal projector, $k=2n(2M+1)$) so that the parameters $\theta$ can be identified with the finite Fourier components of the derivatives of the curves $\gamma$. The Formula (1.10) represents a global generating function if $k>2n+C_HT^4$. In turn, the functions $(f^x,f^p):[0,T]\times\mathbb{R}^n\times P_ML^2\to Q_ML^2\times Q_ML^2$ are determined by a fixed point functional equation, essentially
the $Q_M$ projection of the Hamilton's equations (Sec. 2.3). We can define $\Gamma(t,x):=\{\gamma(t,x,\theta)\ |\ \theta\in\mathbb{R}^k\}$.

(2) The parametrization (1.5) entails that $S$ is a smooth solution of the problem:
\[ \begin{cases} \dfrac{|\nabla_xS|^2}{2m}(t,x,\eta,\theta)+V(x)+\partial_tS(t,x,\eta,\theta)=0, \\ S(0,x,\eta,\theta)=\langle x,\eta\rangle, \\ \nabla_\theta S(t,x,\eta,\theta)=0. \end{cases} \tag{1.11} \]
Any function $S$ solving (1.11), i.e. the Hamilton–Jacobi equation under the stationarity constraint $\nabla_\theta S=0$, is the central object to determine the so called geometrical solutions of the Hamilton–Jacobi equation (see, for example the recent works [3, 2]). Global generating functions are clearly not unique and this is due to the presence of the $\theta$-auxiliary parameters. Uniqueness holds instead for the geometry of the set of critical points:
\[ \Sigma_S:=\{(x,\eta,\theta)\in\mathbb{R}^{2n+k}\ |\ \nabla_\theta S(t,x,\eta,\theta)=0\}, \]
which does not depend on $S$ because it is globally diffeomorphic to $\Lambda_t$; a detailed study of $\Sigma_S$ is done in Sec. 2. We prove in Sec. 3 that symbols coinciding on $\Sigma_S$ generate semiclassical Fourier Integral Operators differing only by terms $O(\hbar^\infty)$. This will allow us to select symbols in such a way to make essentially trivial the proof of the $L^2$-continuity of the associated operator.

(3) The symbol $b_0$ solving (1.6) is
\[ b_0(t,x,\eta,\theta)=\exp\Big\{-\frac{1}{2m}\int_0^t\Delta_xS(\tau,\gamma^x(t,x,\theta)(\tau),\eta,\theta)\,d\tau\Big\}\rho(\theta). \tag{1.12} \]
If $T_2>T_1$, then $k(T_2)>k(T_1)$ so that $\Gamma(T_1,x)\subset\Gamma(T_2,x)$. In the limit $T\to\infty$, $\theta\to\phi\in L^2(\mathbb{R}^+;\mathbb{R}^{2n})$ and we get the simplified functional:
\[ b_0(t,x,\phi)=\exp\Big\{\frac{1}{2m}\int_0^t\Delta_xV\Big(x-\int_\tau^t\phi^x(\lambda)\,d\lambda\Big)d\tau\Big\}\rho(\phi). \tag{1.13} \]
This is similar to the zeroth order symbol of Laptev–Sigal construction [14]:
\[ v_0(t,y,\eta)=\exp\Big\{\frac{1}{2m}\int_0^t\Delta_xV(x^\tau(y,\eta))\,d\tau\Big\}. \]
Namely, the functional is the same, but it is evaluated on the classical curves (with initial conditions $x^0(y,\eta)=y$, $p^0(y,\eta)=\eta$) instead of all the free curves in (1.13) with $H^1$-regularity and boundary condition $\gamma^x(t,x,\phi)(t)=x$.

(4) For potentials in the class (1.2) and $0\le t\le T$ small enough no caustics develop, and there is a unique smooth solution $\theta^\star(t,x,\eta)$ for $(x,\eta)\in\mathbb{R}^{2n}$. The stationary phase theorem yields the 0th order approximation to the integral (1.4):
\[ \hat U_\hbar^{(0)}(t,x,\eta)=e^{\frac{i}{\hbar}S(t,x,\eta,\theta^\star)}\,|\det\nabla_\theta^2S(t,x,\eta,\theta^\star)|^{-\frac12}\,e^{\frac{i\pi}{4}\sigma}\,b_0(t,x,\eta,\theta^\star)+O(\hbar), \]
which coincides with the WKB semiclassical approximation. This fact suggests a relationship, at any order in $\hbar$, between the present construction and those of Chazarain [5] and Fujiwara [8]. This is the content of Theorem 4.1.
(5) The assertions of Theorem 1.2 represent the counterpart (in the $\eta$ variables) of a result of Fujiwara [8], valid under the additional assumption that the number of classical curves connecting boundary data is finite.

(6) As stated in Theorem 1.1 the lower bound for the number of extra phase variables grows as $T^4$ and not as $T$, a behavior predicted by the so called broken geodesics method (see, for example, [4]). To this purpose, we remark that we construct the (not unique) global generating function through a different variational technique, namely the Amann–Conley–Zehnder reduction of the Hamilton–Helmholtz functional ([1–3, 6]). This method appears, in our paper, to be natural within the construction of the parametrix for the Schrödinger Propagator.

2. Generating Functions for the Graph of the Hamiltonian Flow

2.1. Lagrangian submanifolds and global generating functions

Adopting standard notations and terminology (see e.g. [20]), we denote by $\omega=dp\wedge dx=\sum_{i=1}^ndp_i\wedge dx_i$ the 2-form on $T^*\mathbb{R}^n$ that defines its natural symplectic structure. As usual, a diffeomorphism $C:T^*\mathbb{R}^n\to T^*\mathbb{R}^n$ is a canonical transformation if the pull back of the symplectic form is preserved, $C^*\omega=\omega$. We say that $L\subset T^*\mathbb{R}^n$ is a Lagrangian submanifold if $\omega|_L=0$ and $\dim L=n=\frac12\dim T^*\mathbb{R}^n$. In a natural way, a symplectic structure $\bar\omega$ on $T^*\mathbb{R}^n\times T^*\mathbb{R}^n\cong T^*(\mathbb{R}^n\times\mathbb{R}^n)$ is the twofold pull-back of the standard symplectic 2-form on $T^*\mathbb{R}^n$ defined as $\bar\omega:=pr_2^*\omega-pr_1^*\omega=dp_2\wedge dx_2-dp_1\wedge dx_1$. Similarly, $\Lambda\subset T^*\mathbb{R}^n\times T^*\mathbb{R}^n$ is called a Lagrangian submanifold of $T^*\mathbb{R}^n\times T^*\mathbb{R}^n$ if $\bar\omega|_\Lambda=0$ and $\dim(\Lambda)=2n$.

A Hamiltonian is a $C^2$-function $H:T^*\mathbb{R}^n\to\mathbb{R}$ and its flow is the one-parameter group of canonical transformations $\phi_H^t:U\subseteq T^*\mathbb{R}^n\to T^*\mathbb{R}^n$ solving Hamilton's equations $\dot\gamma=J\nabla H(\gamma)$ ($J$ the unit symplectic matrix) with $\gamma(0)=(x_0,p_0)\in U$. The Hamilton–Helmholtz functional:
\[ A[(\gamma^x,\gamma^p)]:=\int_0^t\big[\gamma^p(s)\dot\gamma^x(s)-H(\gamma^x(s),\gamma^p(s))\big]\,ds \tag{2.1} \]
is well defined and continuous on the path space $H^1([0,t];T^*\mathbb{R}^n)$. The action functional:
\[ A[\gamma^x]:=\int_0^tL(\gamma^x(s),\dot\gamma^x(s))\,ds \tag{2.2} \]
is defined on $H^1([0,t];\mathbb{R}^n)$. In this paper we consider $H=\frac{p^2}{2m}+V(x)$, so that the Legendre transform guarantees the correspondence of the stationary curves of these two functionals.

Definition 2.1. A global generating function for a Lagrangian submanifold $L\subset T^*\mathbb{R}^n$ is a $C^2$ function $S:\mathbb{R}^n\times\mathbb{R}^k\to\mathbb{R}$ such that
• $L=\{(x,p)\in T^*\mathbb{R}^n\ |\ p=\nabla_xS(x,\theta),\ 0=\nabla_\theta S(x,\theta)\}$,
• $\mathrm{rank}\,(\nabla^2_{x\theta}S\ \ \nabla^2_{\theta\theta}S)|_L=\max$.
Similarly, a global generating function for a Lagrangian submanifold $\Lambda\subset T^*\mathbb{R}^n\times T^*\mathbb{R}^n$ is a $C^2$ map $S:\mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R}^k\to\mathbb{R}$ such that
• $\Lambda=\{(x,p;y,\eta)\in T^*\mathbb{R}^n\times T^*\mathbb{R}^n\ |\ p=\nabla_xS,\ y=\nabla_\eta S,\ 0=\nabla_\theta S(x,\eta,\theta)\}$,
• $\mathrm{rank}\,(\nabla^2_{x\theta}S\ \ \nabla^2_{\eta\theta}S\ \ \nabla^2_{\theta\theta}S)|_\Lambda=\max$.

It is important to remark that the following set:
\[ \Sigma_S:=\{(x,\eta,\theta)\in\mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R}^k\ |\ 0=\nabla_\theta S(x,\eta,\theta)\} \tag{2.3} \]
is a submanifold of $\mathbb{R}^{2n+k}$ and it is diffeomorphic to $\Lambda$. We focus our attention on the graphs of a Hamiltonian flow $\phi_H^t:T^*\mathbb{R}^n\to T^*\mathbb{R}^n$, which correspond to a family of Lagrangian submanifolds in $T^*\mathbb{R}^n\times T^*\mathbb{R}^n$:
\[ \Lambda_t:=\{(y,\eta;x,p)\in T^*\mathbb{R}^n\times T^*\mathbb{R}^n\ |\ (x,p)=\phi_H^t(y,\eta)\}. \]
An important object in what follows is the family of global generating functions:
\[ \Lambda_t=\{(y,\eta;x,p)\in T^*\mathbb{R}^n\times T^*\mathbb{R}^n\ |\ p=\nabla_xS,\ y=\nabla_\eta S,\ 0=\nabla_\theta S(t,x,\eta,\theta)\} \]
to be explicitly constructed for arbitrarily large times in the next Section. As is known, this technical tool has been developed in the framework of symplectic geometry and variational analysis (see [1, 6, 4, 15, 19, 23, 24]) to sidestep the locality in time generated by the occurrence of caustics.

2.2. Generating function with infinitely many parameters

In the following we mainly review the construction of a generating function with infinitely many parameters described in [3]. We begin by the following simple result (see [20]):

Lemma 2.1. Let us consider the Hamilton–Helmholtz functional $A[\cdot]$ as in (2.1). A curve $\gamma\in\Gamma^{(0)}:=\{\gamma\in H^1([0,t];T^*\mathbb{R}^n)\ |\ \gamma^p(0)=0,\ \gamma^x(t)=x\}$ satisfies Hamilton's equations with boundary conditions:
\[ \dot\gamma=J\nabla H(\gamma),\qquad \gamma^p(0)=0,\qquad \gamma^x(t)=x \]
if and only if the following stationarity condition of variational type holds:
\[ \frac{DA}{D\gamma}(\gamma)[v]=0,\qquad \forall\, v\in T\Gamma^{(0)}. \]

Proof. By computing the Gâteaux derivative of the functional, we get:
\[ \frac{DA}{D\gamma}(\gamma)[v]=\int_0^t[\dot\gamma-J\nabla H(\gamma)](s)\,v(s)\,ds+\gamma^p(s)v^x(s)\big|_0^t,\qquad \forall\, v\in T\Gamma^{(0)}. \]
Now use the boundary condition $\gamma^p(0)=0$ and recall that for $T\Gamma^{(0)}$ it must be $v^x(t)=0$. The result is proved.

The above lemma has an important consequence: it allows us to introduce the notion of generating function with infinitely many parameters.
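As a concrete finite-dimensional illustration of what a generating function for $\Lambda_t$ does, one can check by hand the harmonic oscillator phase appearing in the Example of Section 1. The sketch below assumes $m=1$, $V(x)=\frac12|x|^2$ and $t\neq(2\tau+1)\frac\pi2$; the symbol $S_0$, denoting the phase with the auxiliary parameters already eliminated, is notation introduced here only for this check.

```latex
% Harmonic oscillator, m=1: the flow is
%   x = y\cos t + \eta\sin t,   p = -y\sin t + \eta\cos t.
\[
S_0(t,x,\eta)=\frac{1}{\cos t}\Big[\langle x,\eta\rangle-\frac12\sin t\,\big(|x|^2+|\eta|^2\big)\Big].
\]
% Generating-function relations y=\nabla_\eta S_0, p=\nabla_x S_0:
\[
\nabla_\eta S_0=\frac{x-\eta\sin t}{\cos t}=y,
\qquad
\nabla_x S_0=\frac{\eta-x\sin t}{\cos t}
=-y\sin t+\eta\cos t=p,
\]
% where the middle equality for p uses y=(x-\eta\sin t)/\cos t and \sin^2 t+\cos^2 t=1.
% Hamilton--Jacobi equation:
\[
\partial_tS_0=\frac{\sin t}{\cos^2 t}\langle x,\eta\rangle-\frac{|x|^2+|\eta|^2}{2\cos^2 t},
\qquad
\frac12|\nabla_xS_0|^2=\frac{|\eta|^2-2\sin t\,\langle x,\eta\rangle+\sin^2 t\,|x|^2}{2\cos^2 t},
\]
\[
\partial_tS_0+\frac12|\nabla_xS_0|^2+\frac12|x|^2
=\frac{|x|^2}{2}\Big(\frac{\sin^2 t-1}{\cos^2 t}+1\Big)=0 .
\]
```

The formula breaks down exactly at the resonant times where $\cos t=0$, which is where the caustics of this flow occur; the auxiliary parameters introduced in the general construction are what allow a single smooth phase to remain valid across such times.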
First of all, it is easy to observe that the set of curves
\[ \gamma(t,x,\phi)(s) := \Bigl( x - \int_s^t \phi^x(\tau)\,d\tau,\ \int_0^s \phi^p(\tau)\,d\tau \Bigr), \qquad \phi \equiv (\phi^x, \phi^p) \tag{2.4} \]
gives a parametrization of the path space $\Gamma^{(0)}$ introduced in the previous lemma, namely: $\Gamma^{(0)}(t,x) := \{\gamma(t,x,\phi)(\cdot) \mid \phi \in L^2([0,T];\mathbb{R}^{2n})\}$. Second, we define the functional with infinitely many parameters specified by $\phi \in L^2([0,T];\mathbb{R}^{2n})$ in the following way:

Definition 2.2.
\[ S(t,x,\eta,\phi) := \langle x,\eta\rangle + \int_0^t \bigl[\gamma^p(s)\,\dot\gamma^x(s) - H(\gamma^x(s), \eta + \gamma^p(s))\bigr]\,ds\,\Big|_{\gamma(\cdot)=\gamma(t,x,\phi)(\cdot)} \tag{2.5} \]
Remark. Introducing the translated curves $\zeta = (\zeta^x, \zeta^p) := (\gamma^x, \eta + \gamma^p)$, it is easy to see that the functional (2.5) admits the equivalent representation
\[ S(t,x,\eta,\phi) = \langle \zeta^x(0), \eta\rangle + \int_0^t \bigl[\zeta^p(s)\,\dot\zeta^x(s) - H(\zeta^x(s), \zeta^p(s))\bigr]\,ds\,\Big|_{\zeta(\cdot)=\zeta(t,x,\phi)(\cdot)} \]
where now $\zeta^x(t) = x$ and $\zeta^p(0) = \eta$.

Let us now make our assumptions on the Hamiltonian more precise:

Definition 2.3.
\[ H(x,p) = \frac{p^2}{2m} + V(x) = \frac{p^2}{2m} + \langle Lx, x\rangle + V_0(x) \tag{2.6} \]
where $V_0 \in C^\infty(\mathbb{R}^n)$, $L \in GL(n)$, and $|\partial_x^\alpha V_0(x)| \le C_0$, $|\alpha| = 0, 1, 2, \ldots$.
This allows us to look more closely at the structure of the generating function:

Lemma 2.2. The functional $S$ admits the representation:
\[ S(t,x,\eta,\phi) = \langle x,\eta\rangle - \frac{t}{2m}\eta^2 - t\langle Lx, x\rangle + \langle R(t)\phi, \phi\rangle + \langle v(t,x,\eta), \phi\rangle + \sigma(t,x,\phi). \tag{2.7} \]
Here $v(t,x,\eta)$ has a linear dependence with respect to the $(x,\eta)$ variables, and $\sigma(t,x,\phi)$ is bounded with respect to $(x,\phi)$.
October 31, J070-S0129055X11004497
2011 13:3 WSPC/S0129-055X
148-RMP
Geometric Approach to the Hamilton–Jacobi Equation and Global Parametrices
977
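Proof. It is easy to see that:
\[ S = \langle x,\eta\rangle + \int_0^t \Bigl(\int_0^s \phi^p(\tau)\,d\tau\Bigr)\cdot\phi^x(s)\,ds - \int_0^t \Bigl[\frac{1}{2m}\Bigl(\eta + \int_0^s \phi^p(\tau)\,d\tau\Bigr)^2 + L\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\cdot\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\Bigr]ds - \int_0^t V_0\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)ds \]
\[ \phantom{S} = \langle x,\eta\rangle - \frac{t}{2m}\eta^2 - t\langle Lx,x\rangle + \langle R(t)\phi,\phi\rangle + \langle v(t,x,\eta),\phi\rangle + \sigma(t,x,\phi) \tag{2.8} \]
where
\[ \langle R(t)\phi,\phi\rangle := \int_0^t \Bigl[\Bigl(\int_0^s \phi^p(\tau)\,d\tau\Bigr)\phi^x(s) - \frac{1}{2m}\Bigl(\int_0^s \phi^p(\tau)\,d\tau\Bigr)^2\Bigr]ds - \int_0^t L\Bigl(\int_s^t \phi^x(\tau)\,d\tau\Bigr)\cdot\Bigl(\int_s^t \phi^x(\tau)\,d\tau\Bigr)ds, \]
\[ \langle v(t,x,\eta),\phi\rangle := \int_0^t \Bigl[-\frac{\eta}{m}\int_0^s \phi^p(\tau)\,d\tau + Lx\cdot\int_s^t \phi^x(\tau)\,d\tau + L\int_s^t \phi^x(\tau)\,d\tau\cdot x\Bigr]ds, \]
\[ \sigma(t,x,\phi) := -\int_0^t V_0\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)ds. \tag{2.9} \]
The boundedness of $\sigma$ is immediate:
\[ \sup_{(x,\phi)\in\mathbb{R}^n\times L^2} |\sigma(t,x,\phi)| \le t \sup_{z\in\mathbb{R}^n} |V_0(z)|. \]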
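Finally, we consider the orthonormal basis of $L^2$, $e_\alpha(s) = \frac{1}{\sqrt{T}}\, e^{\frac{2\pi}{T} i\alpha s}$, $\alpha \in \mathbb{Z}$, and the corresponding Fourier expansion $\phi(s) = \sum_{\alpha\in\mathbb{Z}} \phi^{(\alpha)} e_\alpha(s)$. This entails the identification $\phi(s) \equiv \{\phi^{(\alpha)}\}_{\alpha\in\mathbb{Z}} \in \ell^2$ under the usual norm $|\phi|^2 = \sum_\alpha |\phi^{(\alpha)}|^2$ generated by the scalar product $\langle\psi,\phi\rangle = \sum_\alpha \overline{\psi^{(\alpha)}}\,\phi^{(\alpha)}$.

Proposition 2.1. The graph of the Hamiltonian flow
\[ \Lambda_t := \{(y,\eta;x,p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid (x,p) = \phi^t_H(y,\eta)\} \]
is generated by $S$:
\[ \Lambda_t = \Bigl\{(y,\eta;x,p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \ \Big|\ p = \nabla_x S,\ y = \nabla_\eta S,\ 0 = \frac{DS}{D\phi}\Bigr\}. \]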
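Proof. The first component of the stationarity equation $0 = \frac{DS}{D\phi}$ reads:
\[ 0 = \frac{DS}{D\phi^p}(\phi)[v^p] = \int_0^t \Bigl[\Bigl(\int_0^s v^p(\tau)\,d\tau\Bigr)\cdot\phi^x(s) - \frac{1}{m}\Bigl(\eta + \int_0^s \phi^p(\tau)\,d\tau\Bigr)\cdot\int_0^s v^p(\tau)\,d\tau\Bigr]ds \]
for all $v^p \in L^2$. This is satisfied if and only if
\[ \phi^x(s) = \frac{1}{m}\Bigl(\eta + \int_0^s \phi^p(\tau)\,d\tau\Bigr) \tag{2.10} \]
that is
\[ \dot\gamma^x(t,x,\phi)(s) = \frac{1}{m}\bigl(\eta + \gamma^p(t,x,\phi)(s)\bigr). \tag{2.11} \]
On the other hand, the second equation reads:
\[ 0 = \frac{DS}{D\phi^x}(\phi)[v^x] = \int_0^t \Bigl[\Bigl(\int_0^s \phi^p(\tau)\,d\tau\Bigr)\cdot v^x(s) + \nabla V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\cdot\int_s^t v^x(\tau)\,d\tau\Bigr]ds \]
for all $v^x \in L^2$. Integrating by parts, we get
\[ 0 = \int_0^t \phi^p(s)\,ds\cdot\int_0^t v^x(\tau)\,d\tau - \int_0^t \phi^p(s)\cdot\int_0^s v^x(\tau)\,d\tau\,ds + \int_0^t \nabla V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\cdot\int_s^t v^x(\tau)\,d\tau\,ds \]
\[ \phantom{0} = \int_0^t \Bigl[\phi^p(s)\cdot\int_s^t v^x(\tau)\,d\tau + \nabla V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\cdot\int_s^t v^x(\tau)\,d\tau\Bigr]ds. \]
This entails
\[ \phi^p(s) = -\nabla V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr), \tag{2.12} \]
that is equivalent to:
\[ \dot\gamma^p(t,x,\phi)(s) = -\nabla_x V(\gamma^x(t,x,\phi)(s)). \tag{2.13} \]
By a simple computation and (2.10) we get:
\[ \nabla_\eta S = x - t\frac{\eta}{m} - \frac{1}{m}\int_0^t\!\!\int_0^s \phi^p(\tau)\,d\tau\,ds = x - t\frac{\eta}{m} - \int_0^t \Bigl[\phi^x(s) - \frac{1}{m}\eta\Bigr]ds \]
\[ \phantom{\nabla_\eta S} = x - t\frac{\eta}{m} - \int_0^t \phi^x(s)\,ds + t\frac{\eta}{m} = x - \int_0^t \phi^x(s)\,ds = \gamma^x(t,x,\phi)(0) = y. \]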
Finally, by (2.12), we can complete the verification:
\[ \nabla_x S = \eta - \int_0^t \nabla V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)ds = \eta + \int_0^t \phi^p(s)\,ds = \eta + \gamma^p(t,x,\phi)(t) = p. \]
The following statement is a direct consequence of the above result:

Proposition 2.2. The Hamilton–Jacobi equation is solved on the stationarity points $0 = \frac{DS}{D\phi}$; more precisely $S$ is a smooth solution of the problem:
\[ \begin{cases} \partial_t S(t,x,\eta,\phi) + \dfrac{|\nabla_x S|^2}{2m}(t,x,\eta,\phi) + V(x) = 0, & (t,x) \in \mathbb{R}^+ \times \mathbb{R}^n, \\ S(0,x,\eta,\phi) = \langle x,\eta\rangle, \\ \dfrac{DS}{D\phi}(t,x,\eta,\phi) = 0. \end{cases} \tag{2.14} \]
For the proof we refer to [3, Secs. 3 and 4].

Remark 2.1. As in the proof of Proposition 2.1, fix $t \in [0,T]$ and define the map
\[ G_t : \mathbb{R}^{2n} \times L^2([0,T];\mathbb{R}^{2n}) \to L^2([0,T];\mathbb{R}^{2n}), \tag{2.15} \]
\[ G_t(x,\eta,\phi^x,\phi^p) := \Bigl(\frac{\eta}{m} + \frac{1}{m}\int_0^s \phi^p(\tau)\,d\tau,\ -\nabla V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\Bigr). \tag{2.16} \]
Then the fixed point equation on $L^2([0,T];\mathbb{R}^{2n})$,
\[ \phi = G_t(x,\eta,\phi), \tag{2.17} \]
is equivalent to the stationarity equation:
\[ 0 = \frac{DS}{D\phi}(t,x,\eta,\phi). \]
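The fixed point equation (2.17) can be checked numerically in the simplest case. The following is a minimal sketch, assuming $n = 1$, $m = 1$ and $V(x) = x^2/2$ (so $V_0 = 0$); the function names, the grid size and the sample values of $(t, x, \eta)$ are illustrative choices, not part of the paper. It builds $\phi$ from the exact trajectory with $\zeta^x(t) = x$, $\zeta^p(0) = \eta$ and measures the residual of $\phi = G_t(x,\eta,\phi)$.

```python
import math

def cumulative_trapezoid(values, h):
    """Integrals from grid point 0 up to each grid point (trapezoid rule)."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (a + b))
    return out

def fixed_point_residual(t=1.0, x=0.7, eta=-0.3, steps=2000):
    """Max residual of phi = G_t(x, eta, phi) for H = p^2/2 + x^2/2 (m = 1)."""
    h = t / steps
    grid = [i * h for i in range(steps + 1)]
    # Exact trajectory with zeta^x(t) = x and zeta^p(0) = eta (needs cos(t) != 0).
    y = (x - eta * math.sin(t)) / math.cos(t)
    zeta_x = [y * math.cos(s) + eta * math.sin(s) for s in grid]
    zeta_p = [-y * math.sin(s) + eta * math.cos(s) for s in grid]
    # phi = (d/ds zeta^x, d/ds zeta^p), obtained from Hamilton's equations.
    phi_x = zeta_p[:]               # dot zeta^x = zeta^p / m with m = 1
    phi_p = [-v for v in zeta_x]    # dot zeta^p = -V'(zeta^x) = -zeta^x
    int_p = cumulative_trapezoid(phi_p, h)   # int_0^s phi^p
    int_x = cumulative_trapezoid(phi_x, h)   # int_0^s phi^x
    total_x = int_x[-1]                      # int_0^t phi^x
    residual = 0.0
    for i in range(steps + 1):
        g_x = eta + int_p[i]                 # first component of G_t
        g_p = -(x - (total_x - int_x[i]))    # second component: -V'(x - int_s^t phi^x)
        residual = max(residual, abs(phi_x[i] - g_x), abs(phi_p[i] - g_p))
    return residual

print(fixed_point_residual())
```

The residual is of the order of the quadrature error, which illustrates that solutions of Hamilton's equations with the mixed boundary data correspond to fixed points of $G_t$.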
On the other hand, the solutions of this equation are related to the curves
\[ \zeta(t,x,\phi)(s) := \Bigl(x - \int_s^t \phi^x(\tau)\,d\tau,\ \eta + \int_0^s \phi^p(\tau)\,d\tau\Bigr), \qquad \phi \equiv (\phi^x,\phi^p) \tag{2.18} \]
solving Hamilton's equations $\dot\zeta = J\nabla H(\zeta)$ with boundary conditions $\zeta^x(t) = x$, $\zeta^p(0) = \eta$.

The following result deals with some topological properties of the set of solutions. Let:
\[ t_{\alpha,\beta} := \frac{\pi}{2\sqrt{\lambda_\alpha}}(2\beta + 1), \qquad \alpha = 1, 2, \ldots, n;\ \beta \in \mathbb{N}, \tag{2.19} \]
\[ \lambda(x,\eta) := \sqrt{1 + |x|^2 + |\eta|^2} \tag{2.20} \]
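where $\lambda_\alpha$, $\alpha = 1, 2, \ldots, n$, are the eigenvalues of $L + L^\dagger$. Remark that the $t_{\alpha,\beta}$ are just the resonant times of the Hamiltonian flow generated by $H_0 := \frac{p^2}{2m} + \langle Lx, x\rangle$.

Proposition 2.3. There are $D(T) < +\infty$, $K_2(T) < +\infty$, $K_1(t) < +\infty$ such that the solutions of Eq. (2.17) fulfill the estimates:
\[ \|\phi\|_{L^2} > K_2(T)\lambda(x,\eta), \qquad \forall\, t \in\, ]0,T], \quad |x|^2 + |\eta|^2 > D(T)^2; \tag{2.21} \]
\[ \|\phi\|_{L^2} \le K_1(t)\lambda(x,\eta), \qquad \forall\, t \ne t_{\alpha,\beta}, \quad \forall\, (x,\eta) \in \mathbb{R}^{2n}. \tag{2.22} \]
Moreover, there is $E(t) < +\infty$ such that the difference of any two solutions $\phi, \psi$ of (2.17) fulfills the estimate
\[ \|\phi - \psi\|_{L^2} \le E(t), \qquad \forall\, t \ne t_{\alpha,\beta}, \quad \forall\, (x,\eta) \in \mathbb{R}^{2n}. \tag{2.23} \]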
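To make the excluded times concrete, here is a small sketch that lists the resonant times, assuming unit mass and the reading $t_{\alpha,\beta} = \pi(2\beta+1)/(2\sqrt{\lambda_\alpha})$ of (2.19); the eigenvalues used and the helper names are hypothetical. For unit mass the $\alpha$-th mode satisfies $x(t) = y\cos(\omega t) + (\eta/\omega)\sin(\omega t)$ with $\omega = \sqrt{\lambda_\alpha}$, so the initial position $y$ can be recovered from the boundary data $(x, \eta)$ only when $\cos(\omega t) \neq 0$.

```python
import math

def resonant_times(eigenvalues, beta_max):
    """Return {(alpha, beta): t} with t = pi*(2*beta + 1) / (2*sqrt(lambda_alpha))."""
    times = {}
    for alpha, lam in enumerate(eigenvalues, start=1):
        for beta in range(beta_max + 1):
            times[(alpha, beta)] = math.pi * (2 * beta + 1) / (2.0 * math.sqrt(lam))
    return times

def endpoint_map_is_singular(lam, t, tol=1e-9):
    """True when cos(sqrt(lam) * t) vanishes, i.e. y cannot be solved from (x, eta)."""
    return abs(math.cos(math.sqrt(lam) * t)) < tol

eigs = [1.0, 4.0]  # hypothetical eigenvalues of L + L^T
for (alpha, beta), t in sorted(resonant_times(eigs, beta_max=2).items()):
    print(alpha, beta, round(t, 6), endpoint_map_is_singular(eigs[alpha - 1], t))
```

With $\lambda_\alpha = 1$ this reproduces the times $(2\tau + 1)\pi/2$ excluded in Theorems 2.6 to 2.8.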
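Proof. We begin by remarking that Eq. (2.17),
\[ (\phi^x, \phi^p) = \Bigl(\frac{\eta}{m} + \frac{1}{m}\int_0^s \phi^p(\tau)\,d\tau,\ -\nabla V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\Bigr), \]
can be rewritten as
\[ \phi - \mathcal{L}(t,\phi) = \Psi_0(t,x,\eta) + \Psi_1(t,x,\phi) \tag{2.24} \]
where
\[ \mathcal{L}(t,\phi) := \Bigl(\frac{1}{m}\int_0^s \phi^p(\tau)\,d\tau,\ (L + L^\dagger)\int_s^t \phi^x(\tau)\,d\tau\Bigr), \]
\[ \Psi_0(t,x,\eta) := \Bigl(\frac{\eta}{m},\ -(L + L^\dagger)x\Bigr), \]
\[ \Psi_1(t,x,\phi) := \Bigl(0,\ -\nabla V_0\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\Bigr). \tag{2.25} \]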
To prove the inequality (2.21), remark that the non-degeneracy of $L + L^\dagger$ entails the lower and upper bounds:
\[ W_0(T)\lambda(x,\eta) \le \|\Psi_0(t,x,\eta)\|_{L^2} = T^{1/2}\Bigl(\frac{|\eta|^2}{m^2} + |(L + L^\dagger)x|^2\Bigr)^{1/2} \le C_0(T)\lambda(x,\eta). \tag{2.26} \]
Here $W_0(T) := T^{1/2}\mu_m$, $C_0(T) := T^{1/2}\mu_M$, and $\mu_M$, $\mu_m$ are the maximum and the minimum eigenvalue of the matrix
\[ X = \begin{pmatrix} \frac{1}{m^2} I & 0 \\ 0 & (L + L^\dagger)^2 \end{pmatrix}, \]
respectively. Moreover,
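\[ \|\Psi_1(t,x,\phi)\|_{L^2} = \Bigl(\int_0^T \Bigl|\nabla V_0\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\Bigr|^2 ds\Bigr)^{1/2} \le T^{1/2}\|\nabla V_0\|_{C^0} =: C_1(T). \tag{2.27} \]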
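Now set $\mathcal{M}(t,\phi) := \phi - \mathcal{L}(t,\phi)$. Hence the solutions of Eq. (2.24) fulfill the estimate:
\[ \sup_{t\in[0,T]} \|\mathcal{M}(t,\cdot)\|_{L^2\to L^2}\,\|\phi\|_{L^2} \ge \|\mathcal{M}(t,\phi)\|_{L^2} = \|\Psi_0(t,x,\eta) + \Psi_1(t,x,\phi)\|_{L^2} \ge W_0(T)\lambda(x,\eta) - C_1(T). \]
For $|x|^2 + |\eta|^2 > D(T)^2$ we have $W_0(T)\lambda(x,\eta) - C_1(T) > \frac{W_0(T)}{2}\lambda(x,\eta)$, and this implies
\[ \|\phi\|_{L^2} > \Bigl(\sup_{t\in[0,T]} \|\mathcal{M}(t,\cdot)\|_{L^2\to L^2}\Bigr)^{-1} \frac{W_0(T)}{2}\lambda(x,\eta) =: K_2(T)\lambda(x,\eta). \tag{2.28} \]
Now consider Eq. (2.24) in the particular case of $V_0 = 0$ (so that the Hamiltonian is $H_0 := \frac{p^2}{2m} + \langle Lx,x\rangle$). It becomes:
\[ \phi - \mathcal{L}(t,\phi) = \Psi_0(t,x,\eta), \qquad (x,\eta) \in \mathbb{R}^n \times \mathbb{R}^n. \tag{2.29} \]
The explicit representation of the flow for the harmonic oscillator is $\phi^s_{H_0}(\zeta_0^x, \zeta_0^p) = e^{sU}(\zeta_0^x, \zeta_0^p)$ where
\[ U = \begin{pmatrix} 0 & \frac{I}{m} \\ -(L + L^\dagger) & 0 \end{pmatrix}. \]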
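For one degree of freedom the exponential $e^{sU}$ has a closed form, since $U^2 = -\omega^2 I$ with $\omega = \sqrt{\lambda/m}$ and $\lambda$ the (scalar) value of $L + L^\dagger$. The sketch below compares that closed form with a truncated power series; the case $n = 1$, the sample parameters and the helper names are assumptions made for illustration only.

```python
import math

def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(u, s, terms=40):
    """exp(s*u) for a 2x2 matrix u via the truncated series sum_n (s*u)^n / n!."""
    su = [[s * u[i][j] for j in range(2)] for i in range(2)]
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = mat_mul(term, su)
        term = [[term[i][j] / n for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def flow_closed_form(m, lam, s):
    """cos(w s) I + (sin(w s)/w) U for U = [[0, 1/m], [-lam, 0]], w = sqrt(lam/m)."""
    w = math.sqrt(lam / m)
    return [[math.cos(w * s), math.sin(w * s) / (m * w)],
            [-m * w * math.sin(w * s), math.cos(w * s)]]

m, lam, s = 2.0, 3.0, 1.3          # illustrative values
U = [[0.0, 1.0 / m], [-lam, 0.0]]
series = expm_series(U, s)
closed = flow_closed_form(m, lam, s)
print(max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2)))
```

The upper-left entry $\cos(\omega s)$ is the coefficient of the initial position in $\zeta^x(s)$; its vanishing at $s = t$ is what prevents inverting the flow with respect to the boundary data $(x, \eta)$ at the resonant times.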
It is easy to prove that outside the resonant times $t_{\alpha,\beta}$ the flow can be globally inverted with respect to the boundary conditions $(x,\eta)$, namely: $\zeta^x(s) = \zeta^x(t,x,\eta)(s)$ and $\zeta^p(s) = \zeta^p(t,x,\eta)(s)$. By recalling Remark 2.1, this fact is equivalent to the existence of a unique global smooth solution $\phi_0(t,x,\eta)$ for Eq. (2.29). This argument works $\forall\,(x,\eta) \in \mathbb{R}^n \times \mathbb{R}^n$. In the particular case $x = \eta = 0$, (2.29) reduces to:
\[ \mathcal{M}(t,\phi) := \phi - \mathcal{L}(t,\phi) = 0. \tag{2.30} \]
The uniqueness of the solution implies that the linear operator $\mathcal{M}(t,\cdot) : L^2([0,t];\mathbb{R}^{2n}) \to L^2([0,t];\mathbb{R}^{2n})$, $t \ne t_{\alpha,\beta}$, is invertible. Now, we can come back to the general equation (2.24), written in the equivalent form:
\[ \phi = \mathcal{M}^{-1}\bigl(t, \Psi_0(t,x,\eta) + \Psi_1(t,x,\phi)\bigr) \tag{2.31} \]
for all $t \ne t_{\alpha,\beta}$. By (2.26) and (2.27), we get the inequality
\[ \|\phi\| \le \|\mathcal{M}^{-1}(t,\cdot)\|_{L^2\to L^2}\bigl(C_0(T)\lambda(x,\eta) + C_1(T)\bigr) \le K_1(t)\lambda(x,\eta) \tag{2.32} \]
where $K_1(t) := \|\mathcal{M}^{-1}(t,\cdot)\|_{L^2\to L^2}\,(C_0(T) + C_1(T))$. Finally, we have to prove the bound for the difference of any two solutions $\phi, \psi$ of Eq. (2.31). In order to do this,
we rewrite it in the form $\phi - \mathcal{M}^{-1}(t, \Psi_1(t,x,\phi)) = \mathcal{M}^{-1}(t, \Psi_0(t,x,\eta))$. As a consequence, $\phi - \psi = \mathcal{M}^{-1}(t, \Psi_1(t,x,\phi)) - \mathcal{M}^{-1}(t, \Psi_1(t,x,\psi))$. Recalling (2.27) we have
\[ \|\phi - \psi\|_{L^2} \le \|\mathcal{M}^{-1}(t,\cdot)\|_{L^2\to L^2}\,\|\Psi_1(t,x,\phi) - \Psi_1(t,x,\psi)\|_{L^2} \le \|\mathcal{M}^{-1}(t,\cdot)\|_{L^2\to L^2}\, 2C_1(T) =: E(t) \]
and this concludes the proof.

2.3. Generating function with finitely many parameters

In this section we describe how the global parametrization of the graph $\Lambda_t$ of the Hamiltonian flow can actually be obtained through a generating function with finitely many parameters. To this end we use the reduction of the Hamilton–Helmholtz functional due to Amann, Conley and Zehnder (see [1–3, 6]). In this way we find, for the graph $\Lambda_t$ of the Hamiltonian flow, a global parametrization of type:
\[ \Lambda_t = \{(y,\eta;x,p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid p = \nabla_x S,\ y = \nabla_\eta S,\ 0 = \nabla_\theta S(t,x,\eta,\theta)\}. \]
The essence of the Amann–Conley–Zehnder reduction is the existence of an underlying finite-dimensional structure for the equation investigated in the previous section:
\[ (\phi^x, \phi^p) = G_t(x,\eta,\phi^x,\phi^p), \qquad (\phi^x,\phi^p) \in L^2([0,T];\mathbb{R}^{2n}). \tag{2.33} \]
We can indeed consider the two orthogonal projectors
\[ P_M\phi(s) = \sum_{|r|\le M} \phi^{(r)} e_r(s), \qquad Q_M\phi(s) = \sum_{|r|>M} \phi^{(r)} e_r(s) \]
generated by any orthonormal basis of $L^2([0,T];\mathbb{R}^{2n})$; for instance $e_r(s) := \frac{1}{\sqrt{T}}\, e^{\frac{2\pi}{T} irs}$, $r \in \mathbb{Z}$. Then, let us introduce the decomposition:
\[ (f^x(t,x,\theta), f^p(t,x,\theta)) = Q_M G_t\bigl(x,\eta,\theta^x + f^x(t,x,\theta),\theta^p + f^p(t,x,\theta)\bigr), \tag{2.34} \]
\[ (\theta^x, \theta^p) = P_M G_t\bigl(x,\eta,\theta^x + f^x(t,x,\theta),\theta^p + f^p(t,x,\theta)\bigr) \tag{2.35} \]
and prove the following

Lemma 2.3. For $M \in \mathbb{N}$ large enough the functional equation (2.34) admits a unique solution $f(\theta) : P_M L^2 \to Q_M L^2$. The solutions of (2.33) can then be written in the form $(\phi^x,\phi^p) = (\theta^x + f^x(t,x,\theta), \theta^p + f^p(t,x,\theta))$ where $\theta \in P_M L^2 \simeq \mathbb{R}^k$ are finite dimensional parameters solving the fixed point equation (2.35) on $\mathbb{R}^k$, $k = 2n(2M + 1)$.
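The splitting into low and high Fourier modes can be illustrated on a sampled scalar function. The following is a rough sketch that uses a plain discrete Fourier transform as a stand-in for the expansion in the basis $e_r$; the sample signal, the grid size and all function names are assumptions for illustration. It checks that $P_M\phi + Q_M\phi = \phi$, that the two parts are orthogonal, and it evaluates the parameter count $k = 2n(2M+1)$.

```python
import cmath
import math

def dft_coefficients(samples):
    """Coefficients c_r, r = -N/2 .. N/2 - 1, with samples[j] = sum_r c_r e^{2 pi i r j / N}."""
    n = len(samples)
    return {
        r: sum(samples[j] * cmath.exp(-2j * math.pi * r * j / n) for j in range(n)) / n
        for r in range(-n // 2, n // 2)
    }

def synthesize(coeffs, n):
    """Rebuild the n samples from a (possibly truncated) set of coefficients."""
    return [sum(c * cmath.exp(2j * math.pi * r * j / n) for r, c in coeffs.items())
            for j in range(n)]

def split_modes(coeffs, big_m):
    """Discrete analogue of (P_M, Q_M): keep |r| <= M, respectively |r| > M."""
    low = {r: c for r, c in coeffs.items() if abs(r) <= big_m}
    high = {r: c for r, c in coeffs.items() if abs(r) > big_m}
    return low, high

def parameter_count(n_dim, big_m):
    """k = 2n(2M + 1): 2n real components, each with 2M + 1 retained modes."""
    return 2 * n_dim * (2 * big_m + 1)

N, M = 32, 2
signal = [math.sin(2 * math.pi * j / N) + 0.5 * math.cos(2 * math.pi * 5 * j / N)
          for j in range(N)]
low, high = split_modes(dft_coefficients(signal), M)
p_part, q_part = synthesize(low, N), synthesize(high, N)
recon_error = max(abs(p + q - s) for p, q, s in zip(p_part, q_part, signal))
inner = sum(p * q.conjugate() for p, q in zip(p_part, q_part))
print(recon_error, abs(inner), parameter_count(1, M))
```

In this example the low part carries the mode $r = \pm 1$ and the high part the mode $r = \pm 5$; in the reduction, the low part plays the role of the parameter $\theta$ and the high part that of $f(\theta)$.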
Proof. Let us verify that, if $M \in \mathbb{N}$ is large enough, Eq. (2.34) realizes in fact a contraction on $C^0(L^2, L^2)$. Hence it admits a unique solution $f(t,x,\theta) = (f^x(t,x,\theta), f^p(t,x,\theta))$. By (2.16), Eqs. (2.34) and (2.35) read:
\[ (f^x, f^p) = Q_M\Bigl(\frac{1}{m}\int_0^s f^p(\theta)(\tau)\,d\tau,\ -\nabla V\Bigl(x - \int_s^t \theta^x(\tau) + f^x(\theta)(\tau)\,d\tau\Bigr)\Bigr), \]
\[ (\theta^x, \theta^p) = P_M\Bigl(\frac{\eta}{m} + \frac{1}{m}\int_0^s \theta^p(\tau) + f^p(\theta)(\tau)\,d\tau,\ -\nabla V\Bigl(x - \int_s^t \theta^x(\tau) + f^x(\theta)(\tau)\,d\tau\Bigr)\Bigr). \]
It is proved in [3, Lemma 6] that the contraction property holds if:
\[ T \sup_{(x,p)\in T^*\mathbb{R}^n} |\nabla^2 H(x,p)|\ \frac{\sqrt{1 + 2M}}{2\pi M} < 1. \tag{2.36} \]
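As a rough illustration of how the cutoff grows with the time horizon, the sketch below searches for the smallest $M$ satisfying the contraction condition, assuming (2.36) reads $T \sup|\nabla^2 H|\,\sqrt{1+2M}/(2\pi M) < 1$; the function names and the sample values of $T$ and of the Hessian bound are hypothetical.

```python
import math

def contraction_factor(T, hess_sup, M):
    """Left-hand side of the assumed form of (2.36)."""
    return T * hess_sup * math.sqrt(1 + 2 * M) / (2 * math.pi * M)

def smallest_cutoff(T, hess_sup):
    """Smallest integer M >= 1 with contraction_factor < 1.

    The factor is decreasing in M, so a linear search terminates."""
    M = 1
    while contraction_factor(T, hess_sup, M) >= 1.0:
        M += 1
    return M

for T in (1.0, 5.0, 10.0):
    M = smallest_cutoff(T, hess_sup=2.0)
    # For n = 1 the number of parameters is k = 2n(2M + 1) = 2(2M + 1).
    print(T, M, 2 * (2 * M + 1))
```

Since the factor behaves like $T \sup|\nabla^2 H| / (\pi\sqrt{2M})$ for large $M$, this particular condition forces $M$ to grow roughly like $T^2$.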
By Definition 2.3 it follows that $\sup_{(x,p)\in T^*\mathbb{R}^n} |\nabla^2 H(x,p)| < +\infty$ and consequently, given $0 < T < \infty$, we get the contraction property for the first equation by choosing $M(T)$ large enough. In a general setting, the second equation has many solutions depending on the values of $(t,x)$.

In this finite-dimensional setting, we can consider the following set of curves $\Gamma(t,x)$ with $0 \le t \le T$:
\[ \gamma^x(t,x,\theta + f(\theta))(s) = x - \int_s^t \phi^x(t,x,\theta)(\tau)\,d\tau, \qquad \phi^x(t,x,\theta) = \theta^x + f^x(t,x,\theta), \]
\[ \gamma^p(t,x,\theta + f(\theta))(s) = \int_0^s \phi^p(t,x,\theta)(\tau)\,d\tau, \qquad \phi^p(t,x,\theta) = \theta^p + f^p(t,x,\theta). \tag{2.37} \]
We note that this is a finite reduction of (2.4), but it still contains all curves solving Hamilton's equations with boundary data $\gamma^x(t) = x$ and $\gamma^p(0) = 0$, because the $\phi$ solve Eq. (2.33). Moreover, by a Sobolev embedding theorem, $\Gamma(T,x) \subset H^1([0,T]; T^*\mathbb{R}^n) \subset C^0([0,T]; T^*\mathbb{R}^n)$ and this entails their continuity. We can now proceed to define the main object of this section.

Definition 2.4. The finitely-many parameters global generating function of $\Lambda_t$ is defined as:
\[ S(t,x,\eta,\theta) := \langle x,\eta\rangle + \int_0^t \bigl[\gamma^p(s)\,\dot\gamma^x(s) - H(\gamma^x(s), \eta + \gamma^p(s))\bigr]\,ds\,\Big|_{\gamma(\cdot)=\gamma(t,x,\theta+f(t,x,\theta))(\cdot)} = S(t,x,\eta,\theta + f(t,x,\theta)). \tag{2.38} \]
Here the $S$ on the right-hand side is the infinite-dimensional generating function of Definition 2.2.

Remark. The above generating function is fully parametrized by $\theta + f(t,x,\theta)$, $\theta \in \mathbb{R}^k$, and not by an arbitrary $\phi \in L^2$. This is the core of the finite reduction.
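Now we provide a more detailed study of the analytical properties of $f$.

Lemma 2.4. Consider the pair of functions $(f^x, f^p)$. Then:

(1) $(f^x, f^p)$ fulfill the following equations
\[ f^x(s) = \frac{1}{m} Q_M \int_0^s Q_M \int_\tau^t (L + L^\dagger) f^x(r)\,dr\,d\tau + \Phi^x(t,x,\theta,f^x)(s), \]
\[ f^p(s) = Q_M \int_s^t (L + L^\dagger) f^x(\tau)\,d\tau + \Phi^p(t,x,\theta,f^x)(s), \]
\[ \Phi^x(t,x,\theta,f^x)(s) := -\frac{1}{m} Q_M \int_0^s Q_M \nabla V_0\Bigl(x - \int_\tau^t \theta^x(r) + f^x(r)\,dr\Bigr)d\tau, \]
\[ \Phi^p(t,x,\theta,f^x)(s) := -Q_M \nabla V_0\Bigl(x - \int_s^t \theta^x(r) + f^x(r)\,dr\Bigr). \tag{2.39} \]

(2) Under the condition
\[ d := \frac{T^2}{m}\|Q_M\|^2\,\|L + L^\dagger\| < 1, \tag{2.40} \]
they fulfill the estimates:
\[ \|f^x(t,x,\theta)(\cdot)\|_{L^2} \le (1 - d)^{-1}\frac{T^{3/2}}{m}\|Q_M\|^2\,\|\nabla V_0\|_{C^0}, \]
\[ \|f^p(t,x,\theta)(\cdot)\|_{L^2} \le T\,\|L\|\,\|Q_M\|\,\|f^x(t,x,\theta)(\cdot)\|_{L^2} + \frac{T^{1/2}}{m}\|Q_M\|\,\|\nabla V_0\|_{C^0}. \tag{2.41} \]

(3) If in addition
\[ \frac{T^2}{m}\|Q_M\|^2\Bigl(\|L + L^\dagger\| + \sum_{|i|\le|\alpha|+|\sigma|}\ \sup_{x\in\mathbb{T}^n} |\partial_x^i V_0(x)|\Bigr) < 1, \tag{2.42} \]
then there exist $C_{\alpha\sigma}(T) > 0$ such that: $\|\partial_x^\alpha \partial_\theta^\sigma f(t,x,\theta)(\cdot)\|_{L^2} \le C_{\alpha\sigma}(T)$.

Proof. By direct computation, the first functional equation reads:
\[ f^x(s) = \frac{1}{m} Q_M \int_0^s f^p(\theta)(\tau)\,d\tau \]
\[ \phantom{f^x(s)} = \frac{1}{m} Q_M \int_0^s \Bigl[-Q_M \nabla V\Bigl(x - \int_\tau^t \theta^x(r) + f^x(r)\,dr\Bigr)\Bigr]d\tau \]
\[ \phantom{f^x(s)} = -\frac{1}{m} Q_M \int_0^s Q_M (L + L^\dagger)\Bigl(x - \int_\tau^t \theta^x(r) + f^x(r)\,dr\Bigr)d\tau - \frac{1}{m} Q_M \int_0^s Q_M \nabla V_0\Bigl(x - \int_\tau^t \theta^x(r) + f^x(r)\,dr\Bigr)d\tau \]
\[ \phantom{f^x(s)} = \frac{1}{m} Q_M \int_0^s Q_M \int_\tau^t (L + L^\dagger) f^x(r)\,dr\,d\tau + \Phi^x(t,x,\theta,f^x)(s) \tag{2.43} \]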
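where the last equality follows by (2.39). An analogous computation for $f^p(s)$ gives:
\[ f^p(t,x,\theta)(s) = Q_M \int_s^t (L + L^\dagger) f^x(\tau)\,d\tau + \Phi^p(t,x,\theta,f^x)(s). \]
This proves Assertion (1). To see Assertion (2), remark that (2.39) also entails:
\[ \|\Phi^x(t,x,\theta,f^x)(\cdot)\|_{L^2} \le \frac{T^{3/2}}{m}\|Q_M\|^2\,\|\nabla V_0\|_{C^0} \tag{2.44} \]
whence we obtain:
\[ \|f^x(t,x,\theta)(\cdot)\|_{L^2} \le \frac{T^2}{m}\|Q_M\|^2\,\|L + L^\dagger\|\,\|f^x(t,x,\theta)(\cdot)\|_{L^2} + \|\Phi^x(t,x,\theta,f^x)(\cdot)\|_{L^2}. \]
If we choose $M$ large enough, then $d := \frac{T^2}{m}\|Q_M\|^2\,\|L + L^\dagger\| < 1$ and hence we get
\[ \|f^x(t,x,\theta)(\cdot)\|_{L^2} \le (1 - d)^{-1}\frac{T^{3/2}}{m}\|Q_M\|^2\,\|\nabla V_0\|_{C^0}. \]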
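The passage from the self-referential inequality to the final bound is a standard absorption step. Spelled out as a short worked derivation (it uses only (2.44), the condition $d < 1$, and the finiteness of $\|f^x\|_{L^2}$):

```latex
\|f^x\|_{L^2} \le d\,\|f^x\|_{L^2} + \|\Phi^x\|_{L^2}
\;\Longrightarrow\;
(1-d)\,\|f^x\|_{L^2} \le \|\Phi^x\|_{L^2}
\;\Longrightarrow\;
\|f^x\|_{L^2} \le (1-d)^{-1}\,\frac{T^{3/2}}{m}\,\|Q_M\|^2\,\|\nabla V_0\|_{C^0}.
```

The same pattern, with $d$ replaced by a larger constant that also absorbs the derivatives of $V_0$, is what drives the estimates for the derivatives of $f$.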
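In the same way we have the estimate:
\[ \|f^p(t,x,\theta)(\cdot)\|_{L^2} \le T^{1/2}\|L + L^\dagger\|\,\|Q_M\|\,\|f^x(t,x,\theta)(\cdot)\|_{L^2} + \|\Phi^p(t,x,\theta,f^x)(\cdot)\|_{L^2} \]
\[ \phantom{\|f^p(t,x,\theta)(\cdot)\|_{L^2}} \le T^{1/2}\|L + L^\dagger\|\,\|Q_M\|\,\|f^x(t,x,\theta)(\cdot)\|_{L^2} + T^{1/2}\|Q_M\|\,\|\nabla V_0\|_{C^0}. \]
This proves Assertion (2). The equation for the first order partial derivatives reads:
\[ \frac{\partial f^{x,\alpha}}{\partial x_i}(t,x,\theta) = \frac{Q_M}{m}\int_0^s Q_M \int_\tau^t (L + L^\dagger)\frac{\partial f^{x,\alpha}}{\partial x_i}(t,x,\theta)(r)\,dr\,d\tau \]
\[ \qquad + \frac{Q_M}{m}\int_0^s Q_M \frac{\partial^2 V_0}{\partial x_\alpha \partial x_\beta}\Bigl(x - \int_\tau^t \theta^x(r) + f^x(t,x,\theta)(r)\,dr\Bigr)\cdot\Bigl(\delta_{\beta i} + \int_\tau^t \frac{\partial f^{x,\beta}}{\partial x_i}(t,x,\theta)(r)\,dr\Bigr)d\tau. \]
If $M$ is large enough, then $d' := \frac{T^2}{m}\|Q_M\|^2\,\|L + L^\dagger\| + \frac{T^2}{m}\|Q_M\|^2\,\|\nabla^2 V_0\|_{C^0} < 1$ and we get:
\[ \Bigl\|\frac{\partial f^x}{\partial x_i}(t,x,\theta)(\cdot)\Bigr\|_{L^2} < (1 - d')^{-1}\frac{T^{3/2}}{m}\|Q_M\|^2\,\|\nabla^2 V_0\|_{C^0}. \]
The equation for the second order partial derivatives reads:
\[ \frac{\partial^2 f^{x,\alpha}}{\partial x_i \partial x_j}(t,x,\theta) = \frac{Q_M}{m}\int_0^s Q_M \int_\tau^t (L + L^\dagger)\frac{\partial^2 f^{x,\alpha}}{\partial x_i \partial x_j}(t,x,\theta)(r)\,dr\,d\tau \]
\[ \qquad + \frac{Q_M}{m}\int_0^s Q_M \frac{\partial^2 V_0}{\partial x_\alpha \partial x_\beta}\Bigl(x - \int_\tau^t \theta^x(r) + f^x(t,x,\theta)(r)\,dr\Bigr) \times \int_\tau^t \frac{\partial^2 f^{x,\beta}}{\partial x_i \partial x_j}(t,x,\theta)(r)\,dr\,d\tau \]
\[ \qquad + \frac{Q_M}{m}\int_0^s Q_M F_{\alpha\beta k}(t,x,\theta)(\tau)\Bigl(\delta_{kj} + \int_\tau^t \frac{\partial f^{x,k}}{\partial x_j}(t,x,\theta)(r)\,dr\Bigr)\cdot\Bigl(\delta_{\beta i} + \int_\tau^t \frac{\partial f^{x,\beta}}{\partial x_i}(t,x,\theta)(r)\,dr\Bigr)d\tau \]
where
\[ F_{\alpha\beta k}(t,x,\theta)(\tau) := \frac{\partial^3 V_0}{\partial x_\alpha \partial x_\beta \partial x_k}\Bigl(x - \int_\tau^t \theta^x(r) + f^x(t,x,\theta)(r)\,dr\Bigr). \]
As before, if we require $d'' := \frac{T^2}{m}\|Q_M\|^2\bigl(\|L + L^\dagger\| + \|\nabla^2 V_0\|_{C^0} + \|F\|_{C^0}\bigr) < 1$ then
\[ \Bigl\|\frac{\partial^2 f^x}{\partial x_i \partial x_j}(t,x,\theta)(\cdot)\Bigr\|_{L^2} < (1 - d'')^{-1}\frac{1}{m}\|Q_M\|^2\,\|F\|_{C^0}\bigl(T^{3/2} + T^4\|\nabla f^x\|^2_{L^2} + 2T^2\|\nabla f^x\|_{L^2}\bigr). \]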
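For the higher order derivatives in $x$ and also for the $\theta$-partial derivatives we can proceed in the same way, with the general condition
\[ \frac{T^2}{m}\|Q_M\|^2\Bigl(\|L + L^\dagger\| + \sum_{|i|\le|\alpha|+|\sigma|}\ \sup_{x\in\mathbb{T}^n} |\partial_x^i V_0(x)|\Bigr) < 1, \tag{2.45} \]
in order to conclude the existence of $C_{\alpha\sigma}(T) > 0$ such that:
\[ \|\partial_x^\alpha \partial_\theta^\sigma f(t,x,\theta)(\cdot)\|_{L^2} \le C_{\alpha\sigma}(T). \tag{2.46} \]
This proves Assertion (3) and thus concludes the proof of the lemma.

Theorem 2.1. The generating function (2.38) admits the following representation:
\[ S = \langle x,\eta\rangle - \frac{t}{2m}\eta^2 - t\langle Lx,x\rangle + \langle Q(t)\theta,\theta\rangle + \langle v(t,x,\eta), \theta + f(t,x,\theta)\rangle + \langle \nu(t,x,\theta),\theta\rangle + g(t,x,\theta). \tag{2.47} \]
Here $\theta \in \mathbb{R}^k$ and $t \mapsto Q(t) \in GL(n)$, $Q(0) = 0$; moreover there are $\bar{C}_{\alpha\beta\sigma}(T) > 0$ such that: $|\partial_x^\alpha \partial_\eta^\beta \partial_\theta^\sigma g| + |\partial_x^\alpha \partial_\eta^\beta \partial_\theta^\sigma \nu| + |\partial_x^\alpha \partial_\theta^\sigma f| \le \bar{C}_{\alpha\beta\sigma}(T)$. The function $(t,x,\eta) \mapsto v(t,x,\eta)$ is linear in $x, \eta$, and finally:
\[ k > 2n + C_{H,N} T^4. \tag{2.48} \]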
T 2 Proof. In the light of the explicit upper bound QM ≤ 2π M and (2.42) we require: T2 T2 2 † i + sup |∂ V (x)| < 1. L + L 0 x m 4π 2 M x∈Tn |i|≤|α|+|σ|
Equivalently: 4
M>
T L + L† + 2π 2 m
|i|≤|α|+|σ|
sup |∂xi V0 (x)| . x∈Tn
However, by definition $k = 2n(2M + 1)$ so that:
\[ k > 2n + C_{H,N} T^4 \tag{2.49} \]
where
\[ C_{H,N} := \frac{4n}{2\pi^2 m}\Bigl(\|L + L^\dagger\| + \sum_{|i|\le N+3}\ \sup_{x\in\mathbb{T}^n} |\partial_x^i V_0(x)|\Bigr). \tag{2.50} \]
Now we realize that to use this lower bound within Theorem 1.1 we need to fix $|\alpha| = 2$ and $|\sigma| = N + 1$, because the transport equations of Theorem 1.1 contain $\nabla_x S$, $\Delta_x S$, and the term $\Theta_N$ involves the manifold $\Sigma_S$ localized by the stationarity equation $0 = \nabla_\theta S(t,x,\eta,\theta)$. Moreover, this is needed because we have to iterate the operator $\Pi$ (which involves $\nabla_\theta$) $N$ times in order to get the $O(\hbar^{N+1})$ approximation of the Schrödinger propagator.

We recall the structure of the infinite dimensional generating function:
\[ S(t,x,\eta,\phi) = \langle x,\eta\rangle - \frac{t}{2m}\eta^2 - t\langle Lx,x\rangle + \langle R(t)\phi,\phi\rangle + \langle v(t,x,\eta),\phi\rangle + \sigma(t,x,\phi). \]
As a consequence:
\[ S(t,x,\eta,\theta) := S(t,x,\eta,\theta + f(t,x,\theta)) \]
\[ \phantom{S(t,x,\eta,\theta)} = \langle x,\eta\rangle - \frac{t}{2m}\eta^2 - t\langle Lx,x\rangle + \langle R(t)(\theta + f(t,x,\theta)), \theta + f(t,x,\theta)\rangle + \langle v(t,x,\eta), \theta + f(t,x,\theta)\rangle + \sigma(t,x,\theta + f(t,x,\theta)). \]
We can thus make the identifications:
\[ \langle Q(t)\theta,\theta\rangle := \langle R(t)\theta,\theta\rangle, \qquad \langle \nu(t,x,\theta),\theta\rangle := 2\langle R(t)f(t,x,\theta),\theta\rangle, \]
\[ g(t,x,\theta) := \sigma(t,x,\theta + f(t,x,\theta)) + \langle R(t)f(t,x,\theta), f(t,x,\theta)\rangle. \tag{2.51} \]
By using the above results we then get the existence of $\bar{C}_{000}(T)$ such that
\[ |\nu(t,x,\theta)| \le 2\|R(t)\|\,\|f(t,x,\theta)(\cdot)\|_{L^2} \le \bar{C}_{000}(T), \]
\[ |g(t,x,\theta)| \le \|\sigma(t,x,\phi)(\cdot)\|_{C^0} + \|R(t)\|\,\|f(t,x,\theta)(\cdot)\|^2_{L^2} \le \bar{C}_{000}(T). \tag{2.52} \]
The estimate for all the partial derivatives follows in the same way.
Theorem 2.2. The graph of the Hamiltonian flow $\Lambda_t := \{(y,\eta;x,p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid (x,p) = \phi^t_H(y,\eta)\}$ admits a global generating function with finitely many parameters:
\[ \Lambda_t = \{(y,\eta;x,p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid p = \nabla_x S,\ y = \nabla_\eta S,\ 0 = \nabla_\theta S(t,x,\eta,\theta)\}. \]

Proof. By Proposition 2.1 we can write:
\[ \Lambda_t = \Bigl\{(y,\eta;x,p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \ \Big|\ p = \nabla_x S,\ y = \nabla_\eta S,\ 0 = \frac{DS}{D\phi}\Bigr\} \]
where $S$ is the infinite-dimensional generating function of Definition 2.2. Now we remark that the finite-dimensional stationarity condition $0 = \nabla_\theta S(t,x,\eta,\theta)$ is equivalent to the variational equation expressing the stationarity:
\[ 0 = \frac{DS}{D\phi}(t,x,\eta,\phi). \]
Indeed, by Lemma 2.3 and [3, Lemma 7], there is a bijective correspondence $\phi = \theta + f(t,x,\theta)$ between the solutions of the two equations. Moreover, it is easy to prove that on the critical points we have:
\[ \nabla_x S|_{(t,x,\eta,\theta)} = \nabla_x S|_{(t,x,\eta,\phi)}, \qquad \nabla_\eta S|_{(t,x,\eta,\theta)} = \nabla_\eta S|_{(t,x,\eta,\phi)}. \]
This is true because of the definition $S(t,x,\eta,\theta) := S(t,x,\eta,\theta + f(t,x,\theta))$, and the computation
\[ \nabla_x S(t,x,\eta,\theta) = \nabla_x S(t,x,\eta,\phi)|_{\phi=\theta+f(t,x,\theta)} + \frac{DS}{D\phi}(t,x,\eta,\phi)\Big|_{\phi=\theta+f(t,x,\theta)}[\nabla_x f(t,x,\theta)]. \]
Evaluating both sides on the solutions $\theta$, where the second term vanishes, we get the relation. The same argument applies to $\nabla_\eta S$, and this concludes the proof.

Theorem 2.3. The Hamilton–Jacobi equation is solved by the smooth function $S(t,x,\eta,\theta)$ on the stationary points $\Sigma_S = \{(x,\eta,\theta) \in \mathbb{R}^{2n+k} \mid \nabla_\theta S(t,x,\eta,\theta) = 0\}$. More precisely:
\[ \begin{cases} \partial_t S(t,x,\eta,\theta) + \dfrac{|\nabla_x S|^2}{2m}(t,x,\eta,\theta) + V(x) = 0, \\ S(0,x,\eta,\theta) = \langle x,\eta\rangle, \\ \nabla_\theta S(t,x,\eta,\theta) = 0. \end{cases} \tag{2.53} \]
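Proof. We recall that, by Proposition 2.2, the Hamilton–Jacobi equation is solved by $S(t,x,\eta,\phi)$ on the infinite dimensional stationary points defined by $\frac{DS}{D\phi} = 0$:
\[ \begin{cases} \partial_t S(t,x,\eta,\phi) + \dfrac{|\nabla_x S|^2}{2m}(t,x,\eta,\phi) + V(x) = 0, & (t,x) \in \mathbb{R}^+ \times \mathbb{R}^n, \\ S(0,x,\eta,\phi) = \langle x,\eta\rangle, \\ \dfrac{DS}{D\phi}(t,x,\eta,\phi) = 0. \end{cases} \tag{2.54} \]
On the stationary points we have:
\[ \nabla_x S|_{(t,x,\eta,\theta)} = \nabla_x S|_{(t,x,\eta,\phi)}, \qquad \partial_t S|_{(t,x,\eta,\theta)} = \partial_t S|_{(t,x,\eta,\phi)}. \tag{2.55} \]
Indeed the first equality is proved in the previous theorem, whereas for the second one we observe:
\[ \partial_t S(t,x,\eta,\theta) = \partial_t S(t,x,\eta,\phi)|_{\phi=\theta+f(t,x,\theta)} + \frac{DS}{D\phi}(t,x,\eta,\phi)\Big|_{\phi=\theta+f(t,x,\theta)}[\partial_t f(t,x,\theta)]. \]
Since $\frac{DS}{D\phi}(t,x,\eta,\phi) = 0$ the second equality in (2.55) is proved. (2.54) and (2.55) then yield the assertion.

Theorem 2.4. Let $S$ and $\Sigma_S$ be as in Theorem 2.3. Then there exists $\Theta_N \in C_b^\infty([0,T] \times \mathbb{R}^{2n+k};\mathbb{R})$ with $\Theta_N|_{\Sigma_S} = 0$, such that an equivalent generating function $S_N$ is given by the solution of the problem
\[ \begin{cases} \dfrac{|\nabla_x S_N|^2}{2m}(t,x,\eta,\theta) + V(x) + \partial_t S_N(t,x,\eta,\theta) = \Theta_N, \\ S_N(0,x,\eta,\theta) = \langle x,\eta\rangle. \end{cases} \tag{2.56} \]
Moreover, defining:
\[ \Pi(\Theta_N) := \mathrm{div}_\theta\Bigl(\Theta_N \frac{\nabla_\theta S_N}{|\nabla_\theta S_N|^2}\Bigr) = \Theta_N\,\mathrm{div}_\theta\Bigl(\frac{\nabla_\theta S_N}{|\nabla_\theta S_N|^2}\Bigr) + \nabla_\theta \Theta_N \cdot \frac{\nabla_\theta S_N}{|\nabla_\theta S_N|^2} = (\Pi_1 + \Pi_2)\Theta_N, \tag{2.57} \]
$\Theta_N$ enjoys the property:
\[ \Pi_1^i \circ \Pi_2^j(\Theta_N) \in C_b^\infty([0,T] \times \mathbb{R}^{2n+k};\mathbb{R}), \qquad \forall\, 1 \le i + j \le N, \quad N = 1, 2, \ldots. \tag{2.58} \]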
Proof. We recall that $\Sigma_S \subset \mathbb{R}^{2n+k}$ is a submanifold of dimension $2n$ thanks to the non-degeneracy condition $\mathrm{rk}(\nabla^2_{x\theta} S, \nabla^2_{\eta\theta} S, \nabla^2_{\theta\theta} S)|_{\Sigma_S} = \max = k$. We define $z := (x,\eta,\theta) \in \mathbb{R}^{2n+k}$ and around any point $\bar z \in \Sigma_S$ define furthermore $\tilde S$ (not necessarily uniquely) through the conditions:
\[ \partial_t \tilde S(t,z) = \partial_t S(t,\bar z) + L(t,z), \tag{2.59} \]
\[ \nabla_z \tilde S(t,z) = \nabla_z S(t,\bar z) + F(t,z), \tag{2.60} \]
\[ \Delta_x \tilde S(t,z) = \Delta_x S(t,\bar z) + G(t,z), \tag{2.61} \]
where $L = (L^x, L^\eta, L^\theta) \in C_b^\infty([0,T] \times \mathbb{R}^{2n+k};\mathbb{R})$ and $L(t,\bar z) = 0$, the perturbation of the gradient in (2.60) is $F = (F^x, F^\eta, F^\theta) \in C_b^\infty([0,T] \times \mathbb{R}^{2n+k};\mathbb{R}^{2n+k})$ with $F(t,\bar z) = 0$, while in (2.61) we require $G \in C_b^\infty([0,T] \times \mathbb{R}^{2n+k};\mathbb{R})$ and $G(t,\bar z) = 0$. In addition, we require that $F^\theta(t,z) \ne 0$ for $z \notin \Sigma_S$. Hence the new stationarity equation:
\[ \nabla_\theta \tilde S(t,z) = F^\theta(t,z) \tag{2.62} \]
implies $\Sigma_{\tilde S} = \Sigma_S$. In order to verify (2.57) we require a suitable asymptotic behavior of $L, F, G$ around $\Sigma_S$. Indeed,
\[ \partial_t \tilde S(t,z) = \partial_t S(t,\bar z) + L(t,z), \qquad \nabla_x \tilde S(t,z) = \nabla_x S(t,\bar z) + F^x(t,z). \]
So, by easy computations and by (2.53), we have
\[ \Theta(t,z) = \frac{|\nabla_x \tilde S|^2}{2m}(t,z) + V(x) + \partial_t \tilde S(t,z) \]
\[ \phantom{\Theta(t,z)} = \frac{1}{2m}|\nabla_x S(t,\bar z) + F^x(t,z)|^2 + V(x) + \partial_t S(t,\bar z) + L(t,z) \]
\[ \phantom{\Theta(t,z)} = \frac{1}{2m}|\nabla_x S(t,\bar z)|^2 + V(x) + \partial_t S(t,\bar z) + \frac{1}{m}\nabla_x S(t,\bar z) F^x(t,z) + \frac{1}{2m}|F^x(t,z)|^2 + L(t,z) \]
\[ \phantom{\Theta(t,z)} = \frac{1}{m}\nabla_x S(t,\bar z) F^x(t,z) + \frac{1}{2m}|F^x(t,z)|^2 + L(t,z). \tag{2.63} \]
Now we can always require that the vanishing asymptotic behavior of $F^x$, $F^\theta$, $L$ around $\Sigma_S$ is such that: $\Pi_1(\Theta), \Pi_2(\Theta) \in C_b^\infty([0,T] \times \mathbb{R}^{2n+k};\mathbb{R})$. By the same arguments as above, we can look for $S_N$ such that $\Sigma_{S_N} = \Sigma_S$ and
\[ \partial_t S_N(t,z) = \partial_t S(t,\bar z) + L_N(t,z), \tag{2.64} \]
\[ \nabla_z S_N(t,z) = \nabla_z S(t,\bar z) + F_N(t,z), \tag{2.65} \]
\[ \Delta_x S_N(t,z) = \Delta_x S(t,\bar z) + G_N(t,z), \tag{2.66} \]
where $F_N^x$, $F_N^\theta$ and $L_N$ are chosen in such a way that:
\[ \Pi_1^i \circ \Pi_2^j(\Theta_N) \in C_b^\infty([0,T] \times \mathbb{R}^{2n+k};\mathbb{R}), \qquad \forall\, 1 \le i + j \le N. \]
Let us examine the topology of the set of finite-dimensional critical points.

Theorem 2.5. Let $S$ be as in Definition 2.4, and $t_{\alpha,\beta}$, $\lambda(x,\eta)$ as in (2.19) and (2.20), respectively. Consider $(t,x,\eta) \in\, ]0,T] \times \mathbb{R}^n \times \mathbb{R}^n$. Then:

(1) All solutions $\theta \in P_M L^2([0,T];\mathbb{R}^{2n}) \simeq \mathbb{R}^k$ of the stationarity equation $0 = \nabla_\theta S(t,x,\eta,\theta)$ fulfill the estimates
\[ \|\theta\|_{L^2} > \tilde K_2(T)\lambda(x,\eta), \qquad \forall\, t \in\, ]0,T], \quad |x|^2 + |\eta|^2 > D(T)^2; \tag{2.67} \]
\[ \|\theta\|_{L^2} \le \tilde K_1(t)\lambda(x,\eta), \qquad \forall\, t \ne t_{\alpha,\beta}, \quad \forall\, (x,\eta) \in \mathbb{R}^{2n}, \tag{2.68} \]
where $D(T) < +\infty$, $\tilde K_2(T) < +\infty$, while $\tilde K_1(t) < +\infty$ is a constant defined for $t \ne t_{\alpha,\beta}$.

(2) The difference of any two solutions $\theta, \omega$ fulfills the inequality
\[ \|\theta - \omega\|_{L^2} \le \tilde E(t), \qquad \forall\, t \ne t_{\alpha,\beta}, \quad \forall\, (x,\eta) \in \mathbb{R}^{2n}. \tag{2.69} \]

Proof. By Proposition 2.3 and Lemma 2.3 all solutions $\phi \in L^2$ of the variational equation
\[ 0 = \frac{DS}{D\phi}(t,x,\eta,\phi) \]
are such that $\phi = \theta + f(t,x,\theta)$ and fulfill the inequalities (2.21)–(2.23) for some constants $K_1(t)$, $K_2(T)$ and $E(t)$. Using the uniform bound proved in Lemma 2.4, $\|f(t,x,\theta)(\cdot)\|_{L^2} \le C_{00}(T)$, we easily establish the existence of the new constants $\tilde K_1(t)$, $\tilde K_2(T)$ and $\tilde E(t)$.

Theorem 2.6. Let us suppose $V(x) = \frac{1}{2}|x|^2 + V_0(x)$ with $\sup_{x\in\mathbb{R}^n} \|\nabla^2 V_0(x)\| < 1$, $(t,x,\eta) \in\, ]0,T] \times \mathbb{R}^{2n}$ and $t \ne (2\tau + 1)\frac{\pi}{2}$, $\tau \in \mathbb{N}$. Then the following quadratic form:
\[ \langle \nabla^2_\theta S(t,x,\eta,\theta)u, u\rangle, \qquad u \in \mathbb{R}^k, \]
is non-degenerate on all points solving $\nabla_\theta S(t,x,\eta,\theta) = 0$.

Proof. The quadratic form is non-degenerate if and only if the solutions $\theta$ of the equation
\[ \nabla_\theta S(t,x,\eta,\theta) = 0 \tag{2.70} \]
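are isolated points. This property can be translated to the infinite-dimensional setting of the equation
\[ \frac{DS}{D\phi}(t,x,\eta,\phi) = 0 \]
thanks to the equivalence $\phi = \theta + f(t,x,\theta)$ shown in Lemma 2.3. We recall that
\[ S = \langle x,\eta\rangle + \int_0^t \Bigl[\Bigl\langle \int_0^s \phi^p(\tau)\,d\tau, \phi^x(s)\Bigr\rangle - \frac{1}{2m}\Bigl(\eta + \int_0^s \phi^p(\tau)\,d\tau\Bigr)^2\Bigr]ds - \int_0^t V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)ds. \]
Now we perform the partial reduction of the infinite-dimensional parameters, by means of the first stationarity equation $\frac{DS}{D\phi^p}(t,x,\eta,\phi) = 0$ corresponding to $m\phi^x(s) = \eta + \int_0^s \phi^p(\tau)\,d\tau$ (essentially, the Legendre transform). Therefore we get the new functional:
\[ \tilde S(t,x,\eta,\phi^x) = \Bigl\langle x - \int_0^t \phi^x(\tau)\,d\tau, \eta\Bigr\rangle + \int_0^t \Bigl[\frac{m}{2}|\phi^x(s)|^2 - V\Bigl(x - \int_s^t \phi^x(\tau)\,d\tau\Bigr)\Bigr]ds. \]
Setting $\gamma^x(s) = x - \int_s^t \phi^x(\tau)\,d\tau$, we can consider the equivalent form
\[ A[\gamma^x] = \langle \gamma^x(0), \eta\rangle + \int_0^t \Bigl[\frac{m}{2}|\dot\gamma^x(s)|^2 - V(\gamma^x(s))\Bigr]ds \]
with the boundary conditions: $\gamma^x(t) = x$ and $m\dot\gamma^x(0) = \eta$. The second variation is:
\[ \frac{D^2 A}{D\gamma^2}(\gamma^x)[\delta\gamma, \delta\dot\gamma] = \int_0^t \bigl[m|\delta\dot\gamma(s)|^2 - \nabla^2 V(\gamma^x(s))\,\delta\gamma(s)\,\delta\gamma(s)\bigr]ds. \]
Writing down the integrand in the form
\[ \bigl(\delta\gamma(s),\ \delta\dot\gamma(s)\bigr)\begin{pmatrix} -\nabla^2 V(\gamma^x(s)) & 0 \\ 0 & mI \end{pmatrix}\begin{pmatrix} \delta\gamma(s) \\ \delta\dot\gamma(s) \end{pmatrix} \]
we realize that, requiring $\nabla^2 V(x)$ to be non-degenerate $\forall\, x \in \mathbb{R}^n$, the second variation is a bilinear non-degenerate functional. This implies that all the stationary curves of the action functional, namely the curves solving
\[ \frac{DA}{D\gamma}(\gamma^x)[v] = 0, \qquad \forall\, v \in T\Gamma, \]
are isolated points belonging to $H^1([0,t];\mathbb{R}^n)$. We conclude that the same property must hold for the points $\theta \in \mathbb{R}^k$ solving Eq. (2.70).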
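Next, we investigate the number of solutions of the stationarity equation.

Theorem 2.7. Let us suppose $V(x) = \frac{1}{2}|x|^2 + V_0(x)$ with $\sup_{x\in\mathbb{R}^n} \|\nabla^2 V_0(x)\| < 1$, $(t,x,\eta) \in\, ]0,T] \times \mathbb{R}^{2n}$ and $t \ne (2\tau + 1)\frac{\pi}{2}$, $\tau \in \mathbb{N}$. Then the stationarity equation $\nabla_\theta S(t,x,\eta,\theta) = 0$ has a finite number of solutions $\theta_\alpha(t,x,\eta)$, $1 \le \alpha \le N(t)$. The upper bound has the expression:
\[ N(t) \le \frac{(2\tilde E(t))^k}{\varepsilon(T)^k}. \tag{2.71} \]
Here $\tilde E(t)$ is as in Theorem 2.5 whereas
\[ \varepsilon(T) := \frac{1}{k}\ \frac{\inf_{(t,x,\eta,\theta)} \sup_{i,j} \Bigl|\dfrac{\partial^2 S}{\partial\theta_i \partial\theta_j}(t,x,\eta,\theta)\Bigr|}{\sup_{(t,x,\eta,\theta)} \sup_{i,j,m} \Bigl|\dfrac{\partial^3 S}{\partial\theta_i \partial\theta_j \partial\theta_m}(t,x,\eta,\theta)\Bigr| + 1}. \tag{2.72} \]

Proof. By Theorem 2.5 all critical parameters must be contained in the compact set $\bar B_r \subset \mathbb{R}^k$ with $r := 2\tilde E(t)$. Suppose, by contradiction, that there were infinitely many of them. Then there exists a subsequence $\{\theta_{\alpha(j)}\}_{j\in\mathbb{N}}$ converging to some point $\bar\theta$ in $\bar B_r(0)$. However the function $\nabla_\theta S(t,x,\eta,\cdot)$ is continuous on $\mathbb{R}^k$. Hence the limit is also a critical point, namely $0 = \nabla_\theta S(t,x,\eta,\bar\theta)$. By the previous theorem all the critical points of $S$ are isolated. This is a contradiction, so their number must be finite. In order to obtain an upper bound for this number, we first observe that
\[ \nabla^2_\theta S(t,x,\eta,\theta) = \nabla^2_\theta S(t,x,\eta,\theta^*) + \int_0^1 \frac{d}{d\lambda}\nabla^2_\theta S(t,x,\eta,\theta^* + \lambda(\theta - \theta^*))\,d\lambda \]
\[ \phantom{\nabla^2_\theta S(t,x,\eta,\theta)} = \nabla^2_\theta S(t,x,\eta,\theta^*) + \Bigl(\int_0^1 D_\theta \nabla^2_\theta S(t,x,\eta,\theta^* + \lambda(\theta - \theta^*))\,d\lambda\Bigr)(\theta - \theta^*). \]
We know that, thanks to Theorem 2.6, the first matrix on the right-hand side is non-degenerate. In order to verify that the addition of the second one does not change this property, we establish the matrix norm inequality:
\[ \Bigl\|\int_0^1 D_\theta \nabla^2_\theta S(t,x,\eta,\theta^* + \lambda(\theta - \theta^*))\,d\lambda\,(\theta - \theta^*)\Bigr\|_2 < \|\nabla^2_\theta S(t,x,\eta,\theta^*)\|_2. \tag{2.73} \]
Here $\|\cdot\|_2$ is the usual norm for the matrix viewed as an operator. Now denote $\varepsilon := \|\theta - \theta^*\|$. The above inequality is a fortiori verified if:
\[ \varepsilon\sqrt{k}\ \sup_{(t,x,\eta,\theta)} \sup_{i,j,m} \Bigl|\frac{\partial^3 S}{\partial\theta_i \partial\theta_j \partial\theta_m}(t,x,\eta,\theta)\Bigr| < \frac{1}{\sqrt{k}}\ \inf_{(t,x,\eta,\theta)} \sup_{i,j} \Bigl|\frac{\partial^2 S}{\partial\theta_i \partial\theta_j}(t,x,\eta,\theta)\Bigr| \tag{2.74} \]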
because the left-hand side is an upper bound for the left-hand side of (2.73) and the right-hand side a lower bound for the right-hand side of (2.73). (2.74) is in turn a fortiori verified if:
\[ \varepsilon\sqrt{k}\Bigl(\sup_{(t,x,\eta,\theta)} \sup_{i,j,m} \Bigl|\frac{\partial^3 S}{\partial\theta_i \partial\theta_j \partial\theta_m}(t,x,\eta,\theta)\Bigr| + 1\Bigr) < \frac{1}{\sqrt{k}}\ \inf_{(t,x,\eta,\theta)} \sup_{i,j} \Bigl|\frac{\partial^2 S}{\partial\theta_i \partial\theta_j}(t,x,\eta,\theta)\Bigr| \tag{2.75} \]
and this yields (2.72). In this way, we have found the radius $\varepsilon(T)$ of the balls in $\mathbb{R}^k$ in which each $\theta$ is the unique local critical point. This local confinement of critical points, together with the global one proved in Theorem 2.5, allows us to get an estimate of their total number $N$. We simply compute the ratio between the volume of the ball $B_r$ containing all the points and the volume of the small isolating balls:
\[ N(t) = \frac{\mathrm{vol}(B_r)}{\mathrm{vol}(B_\varepsilon)} = \frac{(2\tilde E(t))^k}{\varepsilon(T)^k}. \]
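Because the bound is a $k$-th power, it grows very quickly with the number of parameters. A minimal sketch for evaluating it, working in logarithms so that large $k$ does not overflow; the function name and the sample values of $\tilde E$, $\varepsilon$ and $k$ are hypothetical and serve only to show the scaling.

```python
import math

def log_critical_point_bound(e_tilde, eps, k):
    """Natural log of (2*e_tilde)^k / eps^k, the volume ratio vol(B_r)/vol(B_eps)."""
    return k * (math.log(2.0 * e_tilde) - math.log(eps))

for k in (6, 30, 300):
    log_bound = log_critical_point_bound(e_tilde=1.5, eps=0.1, k=k)
    # Report the bound as a power of ten to keep the numbers readable.
    print(k, round(log_bound / math.log(10.0), 3))
```

With these sample values the ratio $2\tilde E/\varepsilon$ equals 30, so each additional parameter multiplies the bound by 30.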
We use Theorem 2.7 in order to study the global behavior of the stationarity equation.

Theorem 2.8. Let us suppose $V(x) = \frac{1}{2}|x|^2 + V_0(x)$ with $\sup_{x\in\mathbb{R}^n} \|\nabla^2 V_0(x)\| < 1$, $(t,x,\eta) \in\, ]0,T] \times \mathbb{R}^{2n}$ and $t \ne (2\tau + 1)\frac{\pi}{2}$, $\tau \in \mathbb{N}$. Let the number $N(t)$ be given by (2.71). Then there exists a finite open partition $\mathbb{R}^{2n} = \bigcup_{\ell=1}^{N(t)} D_\ell$ such that the equation
\[ 0 = \nabla_\theta S(t,x,\eta,\theta) \tag{2.76} \]
admits on each $D_\ell$ exactly $\ell$ smooth solutions $\theta_\alpha(t,x,\eta)$, $1 \le \alpha \le \ell$.

Proof. We recall that $\Sigma_S := \{(x,\eta,\theta) \in \mathbb{R}^{2n+k} \mid 0 = \nabla_\theta S(t,x,\eta,\theta)\}$ is a $2n$-dimensional submanifold of $\mathbb{R}^{2n+k}$ diffeomorphic to $\Lambda_t$. Moreover, by the non-degeneracy hypothesis on $\nabla^2 V$ we have the transversal behavior of $\Sigma_S$ (with respect to $(x,\eta) \in \mathbb{R}^{2n}$) almost everywhere; namely the rank of $\nabla^2_\theta S$ can differ from its maximum value $k$ only on subsets whose projection on $(x,\eta) \in \mathbb{R}^{2n}$ is of zero measure. The condition of transversality is fulfilled on components $D_\ell$ (locally diffeomorphic to open sets of $\mathbb{R}^{2n}$) where the local smooth inversion of Eq. (2.76) is possible, yielding functions $\theta_\alpha(t,x,\eta)$. This argument works up to the finite maximum value $N(t)$.

2.4. Transport equations

We conclude this section by introducing transport equations in a global geometrical setting.
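Theorem 2.9. Let us consider $\rho \in \mathcal{S}(\mathbb{R}^k;\mathbb{R}^+)$ with $\|\rho\|_{L^1} = 1$. The transport equation written on the stationary points $\Sigma_S$ of the generating function $S$,
\[ \begin{cases} \partial_t b_0 + \dfrac{1}{m}\nabla_x S \nabla_x b_0 + \dfrac{1}{2m}\Delta_x S\, b_0\,(t,x,\eta,\theta) = 0, \\ b_0(0,x,\eta,\theta) = \rho(\theta), \\ \nabla_\theta S(t,x,\eta,\theta) = 0, \end{cases} \tag{2.77} \]
admits the following solution:
\[ b_0(t,x,\eta,\theta) = \exp\Bigl\{-\frac{1}{2m}\int_0^t \Delta_x S(\tau, \gamma^x(t,x,\theta)(\tau), \eta, \theta)\,d\tau\Bigr\}\rho(\theta) \tag{2.78} \]
where $\gamma^x$ is the family of curves defined in (2.37).

Proof. The initial condition is immediately verified: $b_0(0,x,\eta,\theta) = \rho(\theta)$. Recalling the results of Proposition 2.1 and Theorem 2.2, we compute the expression of the differential operator $\partial_t b_0 + \frac{1}{m}\nabla_x S \nabla_x b_0(t,x,\eta,\theta)$ when evaluated on the submanifold $\Sigma_S := \{(x,\eta,\theta) \in \mathbb{R}^{2n+k} \mid 0 = \nabla_\theta S(t,x,\eta,\theta)\}$. Namely,
\[ \partial_t b_0 + \frac{1}{m}\nabla_x S \nabla_x b_0(t,x,\eta,\theta)\Big|_{\Sigma_S} = \partial_t b_0 + \frac{1}{m}\bigl(\eta + \gamma^p(t,x,\theta)(t)\bigr)\nabla_x b_0(t,x,\eta,\theta)\Big|_{\Sigma_S} \]
\[ \qquad = \partial_t b_0 + \dot\gamma^x(t,x,\theta)(t)\nabla_x b_0(t,x,\eta,\theta)\Big|_{\Sigma_S} \]
\[ \qquad = \bigl[\partial_\mu b_0(\mu,x,\eta,\theta) + \dot\gamma^x(t,x,\theta)(\mu)\nabla_x b_0(\mu,x,\eta,\theta)\bigr]\Big|_{\Sigma_S}\Big|_{\mu=t} \]
\[ \qquad = \frac{d}{d\mu} b_0(\mu, \gamma^x(t,x,\theta)(\mu), \eta, \theta)\Big|_{\Sigma_S}\Big|_{\mu=t} \tag{2.79} \]
where the expression of $\frac{1}{2m}\Delta_x S(t,x,\eta,\theta)\,b_0(t,x,\eta,\theta)$ is:
\[ \frac{1}{2m}\Delta_x S(t,x,\eta,\theta)\,b_0(t,x,\eta,\theta) = \frac{1}{2m}\Delta_x S(t, \gamma^x(t,x,\theta)(\mu), \eta, \theta)\,b_0(\mu, \gamma^x(t,x,\theta)(\mu), \eta, \theta)\Big|_{\mu=t}. \]
Now, we write down the equation in the new variable $\mu$ and for all $(x,\eta,\theta) \in \mathbb{R}^{2n+k}$:
\[ \frac{d}{d\mu} b_0(\mu, \gamma^x(t,x,\theta)(\mu), \eta, \theta) + \frac{1}{2m}\Delta_x S(\mu, \gamma^x(t,x,\theta)(\mu), \eta, \theta)\,b_0(\mu, \gamma^x(t,x,\theta)(\mu), \eta, \theta) = 0. \]
If we define $\alpha(\mu) := b_0(\mu, \gamma^x(t,x,\theta)(\mu), \eta, \theta)$ we can rewrite the previous equation as
\[ \frac{d}{d\mu}\alpha(\mu) = -\frac{1}{2m}\Delta_x S(\mu, \gamma^x(t,x,\theta)(\mu), \eta, \theta)\,\alpha(\mu) \]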
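where the variables $(t, x, \eta, \theta)$ have to be considered as fixed. This yields:
\[
b_0(\mu, \gamma^x(t, x, \theta)(\mu), \eta, \theta) = \exp\Big\{ -\frac{1}{2m} \int_0^\mu \Delta_x S(\tau, \gamma^x(t, x, \theta)(\tau), \eta, \theta)\, d\tau \Big\}\, \rho(\theta).
\]
Finally, we set $\mu = t$ and so we obtain the solution of the original problem (2.77):
\[
b_0(t, x, \eta, \theta) = \exp\Big\{ -\frac{1}{2m} \int_0^t \Delta_x S(\tau, \gamma^x(t, x, \theta)(\tau), \eta, \theta)\, d\tau \Big\}\, \rho(\theta).
\]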
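The step from the linear equation for $\alpha(\mu)$ to the exponential formula is the usual integrating-factor argument. A minimal numerical sketch of it follows; the coefficient `laplacian_along_curve` is a hypothetical stand-in for $\Delta_x S$ evaluated along $\gamma^x$ (it is not taken from the paper), and `rho` and `m` are set to 1 only for illustration.

```python
import math


def laplacian_along_curve(mu):
    # Hypothetical stand-in for Delta_x S(mu, gamma^x(mu), eta, theta);
    # chosen so that its integral is known in closed form.
    return 1.0 + math.cos(mu)


def closed_form(mu, rho=1.0, m=1.0):
    # exp(-(1/2m) * int_0^mu (1 + cos tau) d tau) * rho
    integral = mu + math.sin(mu)
    return math.exp(-integral / (2.0 * m)) * rho


def rk4_solution(mu_end, rho=1.0, m=1.0, steps=2000):
    # Integrate d(alpha)/d(mu) = -(1/2m) * Delta_x S * alpha with alpha(0) = rho.
    h = mu_end / steps
    alpha, mu = rho, 0.0

    def rhs(s, a):
        return -laplacian_along_curve(s) / (2.0 * m) * a

    for _ in range(steps):
        k1 = rhs(mu, alpha)
        k2 = rhs(mu + h / 2, alpha + h * k1 / 2)
        k3 = rhs(mu + h / 2, alpha + h * k2 / 2)
        k4 = rhs(mu + h, alpha + h * k3)
        alpha += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        mu += h
    return alpha


if __name__ == "__main__":
    print(rk4_solution(3.0), closed_form(3.0))
```

The two printed values should agree up to the discretisation error of the Runge-Kutta scheme, which illustrates that the exponential of the integrated coefficient solves the equation with initial value $\rho$.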
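Theorem 2.10. Let $b_0$ be defined as in (2.78) with $\rho(\theta) := e^{-|\theta|^2} \xi(\theta)$ and $\xi \in C_b^\infty(\mathbb{R}^k; \mathbb{R}_+)$. Then $b_0(t, x, \eta, \theta) \in C^\infty([0, T] \times \mathbb{R}^{2n+k}; \mathbb{R}_+)$ and $b_0(t, x, \eta, \cdot) \in \mathcal{S}(\mathbb{R}^k; \mathbb{R}_+)$ for every fixed $(t, x, \eta)$. Moreover, there exist constants $C^+_{\alpha\sigma}(T), d_{\alpha\sigma}(T) > 0$ such that
\[
|\partial_x^\alpha \partial_\theta^\sigma b_0(t, x, \eta, \theta)| \le C^+_{\alpha\sigma}(T)\, e^{d_{\alpha\sigma}(T) \lambda(x, \eta)}\, e^{-|\theta|^2}, \qquad \forall (x, \eta, \theta) \in \mathbb{R}^{2n+k}.
\tag{2.80}
\]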
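Proof. Let us first obtain a more explicit expression for $\Delta_x S(\cdot)$:
\[
\begin{aligned}
\Delta_x S(t, x, \eta, \theta) &= 2 \operatorname{tr}(L)\, t + \langle \Delta_x \nu(t, x, \theta), \theta \rangle + \langle v(t, x, \eta), \Delta_x f(t, x, \theta) \rangle \\
&\quad + 2 \langle \nabla_x v(t, x, \eta), \nabla_x f(t, x, \theta) \rangle + \Delta_x g(t, x, \theta) \\
&= 2 \operatorname{tr}(L)\, t + 2 \langle R(t) \Delta_x f(t, x, \theta), \theta \rangle + \langle v(t, x, \eta), \Delta_x f(t, x, \theta) \rangle \\
&\quad + 2 \langle \nabla_x v(t, x, \eta), \nabla_x f(t, x, \theta) \rangle + \Delta_x g(t, x, \theta).
\end{aligned}
\tag{2.81}
\]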
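Now, we recall that
\[
\gamma^x(t, x, \theta)(\tau) = x - \int_\tau^t \big( \theta^x(r) + f^x(t, x, \theta^x)(r) \big)\, dr
\]
where $f$ and all its derivatives are $L^2$ uniformly bounded, as proved in Lemma 2.4, whereas $v$ is linear in $(x, \eta)$ and $g$ is $L^\infty$ bounded. Now we observe that by setting $\rho(\theta) := e^{-|\theta|^2} \xi(\theta)$ with a bounded $\xi \in C^\infty(\mathbb{R}^k; \mathbb{R}_+)$, the function $b_0(t, x, \eta, \cdot)$ is a Schwartz function on $\mathbb{R}^k$. Indeed,
\[
|b_0(t, x, \eta, \theta)| \le \exp\Big\{ \frac{1}{2m} \int_0^t |\Delta_x S(\tau, \gamma^x(t, x, \theta)(\tau), \eta, \theta)|\, d\tau \Big\}\, e^{-|\theta|^2} \xi(\theta).
\]
But by the above detailed computation we see that
\[
\begin{aligned}
|\Delta_x S(\tau, \gamma^x(t, x, \theta), \eta, \theta)| &\le |2 \operatorname{tr}(L)\, t| + 2 \|R(t)\|\, \|\Delta_x f(t, \gamma^x, \theta)\|_{L^2}\, \|\theta\| + \|v(t, \gamma^x, \eta)\|_{L^2}\, \|\Delta_x f(t, \gamma^x, \theta)\|_{L^2} \\
&\quad + 2 \|\nabla_x v(t, \gamma^x, \eta)\|_{L^2}\, \|\nabla_x f(t, \gamma^x, \theta)\|_{L^2} + \|\Delta_x g(t, \gamma^x, \theta)\|_{L^\infty}.
\end{aligned}
\]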
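Here $\|\Delta_x f(t, \gamma^x(t, x, \theta), \theta)\|_{L^2}\, \|\theta\|$ is linear in $\theta$, $\|v(t, \gamma^x(t, x, \theta), \eta)\|_{L^2}$ has a linear uniform growth in $(x, \eta, \theta)$ and $\|\Delta_x f(t, \gamma^x(t, x, \theta), \theta)\|_{L^2}$ is bounded. To see this, remark that
\[
\|v(t, \gamma^x(t, x, \theta), \eta)\|_{L^2} \le \|v(t, x, \eta)\|_{L^2} + \Big\| v\Big(t, \int_\tau^t \theta^x(r)\, dr, \eta\Big) \Big\|_{L^2} + \Big\| v\Big(t, \int_\tau^t f^x(t, x, \theta^x)(r)\, dr, \eta\Big) \Big\|_{L^2}.
\]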
The first and third terms on the right-hand side generate a linear growth in $(x, \eta)$, the second term has a linear dependence on $\theta$. The other terms above are bounded with respect to all variables. We conclude that $|b_0|$ has a uniform exponential behavior in all its variables, and that the factor $e^{-|\theta|^2} \|\xi\|_{C^0}$ makes the effective dependence of $|b_0|$ on $\theta$ of Schwartz type. All the estimates on the partial derivatives in $x, \theta$ follow as above.

Theorem 2.11. Let $S$, $S_N$ and $\Sigma_S$ be as in Theorem 2.4. Then there exists $\tilde\Theta_N \in C_b^\infty([0, T] \times \mathbb{R}^{2n+k}; \mathbb{R})$ with $\tilde\Theta_N|_{\Sigma_S} = 0$, such that $S_N$ solves the problem
\[
\begin{cases}
\partial_t b_{0,N} + \frac{1}{m} \nabla_x S_N \nabla_x b_{0,N} + \frac{1}{2m} \Delta_x S_N\, b_{0,N}\, (t, x, \eta, \theta) = \tilde\Theta_N, \\
b_{0,N}(0, x, \eta, \theta) = \rho(\theta)
\end{cases}
\tag{2.82}
\]
and the following property is fulfilled:
\[
\Pi^j(\tilde\Theta_N) \in C_b^\infty([0, T] \times \mathbb{R}^{2n+k}; \mathbb{R}), \qquad \forall\, 1 \le j \le N, \quad N = 1, 2, \ldots
\tag{2.83}
\]
where, as in Theorem 2.4:
\[
\Pi(\tilde\Theta_N) := \operatorname{div}_\theta \Big( \tilde\Theta_N \frac{\nabla_\theta S_N}{|\nabla_\theta S_N|^2} \Big).
\]
Proof. Let us define
\[
b_{0,N}(t, x, \eta, \theta) := \exp\Big\{ -\frac{1}{2m} \int_0^t \Delta_x S_N(\tau, \gamma^x(t, x, \theta)(\tau), \eta, \theta)\, d\tau \Big\}\, \rho(\theta)
\tag{2.84}
\]
and prove that it solves the above problem. Indeed, we can write down the expansions for $z := (x, \eta, \theta)$ around $\bar z \in \Sigma_{S_N} = \Sigma_S$:
\[
b_{0,N}(t, z) = b_0(t, \bar z) + f_N(t, z),
\tag{2.85}
\]
\[
\nabla_x b_{0,N}(t, z) = \nabla_x b_0(t, \bar z) + g_N(t, z),
\tag{2.86}
\]
\[
\partial_t b_{0,N}(t, z) = \partial_t b_0(t, \bar z) + h_N(t, z);
\tag{2.87}
\]
all these terms are related to the choice of $G_N$ in Theorem 2.4, and their rates of convergence to zero near $\bar z$ are related as well. By the unperturbed equation (2.77), we
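compute:
\[
\begin{aligned}
\tilde\Theta_N(t, z) &= \partial_t b_{0,N} + \frac{1}{m} \nabla_x S_N \nabla_x b_{0,N} + \frac{1}{2m} \Delta_x S_N\, b_{0,N}\, (t, z) \\
&= \partial_t b_0(t, \bar z) + h_N(t, z) + \frac{1}{m} \big( \nabla_x S(t, \bar z) + F_N^x(t, z) \big) \big( \nabla_x b_0(t, \bar z) + g_N(t, z) \big) \\
&\quad + \frac{1}{2m} \big( \Delta_x S(t, \bar z) + G_N(t, z) \big) \big( b_0(t, \bar z) + f_N(t, z) \big) \\
&= h_N(t, z) + \frac{1}{m} \Big[ \nabla_x S(t, \bar z)\, g_N(t, z) + F_N^x(t, z) \big( \nabla_x b_0(t, \bar z) + g_N(t, z) \big) \Big] \\
&\quad + \frac{1}{2m} \Delta_x S(t, \bar z)\, f_N(t, z) + \frac{1}{2m} G_N(t, z) \big( b_0(t, \bar z) + f_N(t, z) \big).
\end{aligned}
\tag{2.88}
\]
Moreover, by Theorem 2.4, we recall that around $\bar z \in \Sigma_{S_N} = \Sigma_S$ the new stationarity equation is $\nabla_\theta S_N(t, z) = F_N^\theta(t, z)$. Now we can state that a suitable choice of $F_N^\theta$, $F_N^x$ and $G_N$ leads to the following property:
\[
\Pi^j(\tilde\Theta_N) \in C_b^\infty([0, T] \times \mathbb{R}^{2n+k}; \mathbb{R}), \qquad \forall\, 1 \le j \le N.
\]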
3. A Class of Global FIO

In this section we follow the general setting of Hörmander [10] and in particular we study a class of global FIO related to the Hamiltonian flow $\phi^t_H$ with generating functions as in the previous section. The study of the topology of the set of critical points of such functions will be useful here in order to determine important analytical properties of the global FIO, such as the asymptotic behavior of the kernel and $L^2$ continuity.

3.1. Basic definition and main properties

First, we introduce the set of phase functions:

Definition 3.1. The set of phase functions $S : [0, T] \times \mathbb{R}^{2n} \times \mathbb{R}^k \to \mathbb{R}$ is the set of smooth global generating functions of the graphs $\Lambda_t \subset T^*\mathbb{R}^n \times T^*\mathbb{R}^n$ of the canonical maps $\phi^t_H : T^*\mathbb{R}^n \to T^*\mathbb{R}^n$. Each $\Lambda_t$ admits the parametrization:
\[
\begin{aligned}
\Lambda_t &:= \{(y, \eta; x, p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid (x, p) = \phi^t_H(y, \eta)\} \\
&= \{(y, \eta; x, p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid p = \nabla_x S,\ y = \nabla_\eta S,\ 0 = \nabla_\theta S(t, x, \eta, \theta)\}.
\end{aligned}
\]
Before going further we recall that, by Theorem 2.5, all the global generating functions $S$ enjoy an important property. Namely, consider the set of critical
points
\[
\Sigma_S := \{(x, \eta, \theta) \in \mathbb{R}^{2n+k} \mid 0 = \nabla_\theta S(t, x, \eta, \theta)\}.
\tag{3.1}
\]
Then $\Sigma_S$ is a manifold globally diffeomorphic to $\Lambda_t$; moreover for all $t > 0$ the following set
\[
\Upsilon_S := \{(x, \eta, \theta) \in \mathbb{R}^{2n+k} \mid |x|^2 + |\eta|^2 > D(T)^2,\ |\theta| \le \tilde K_2(T) \lambda(x, \eta)\}
\tag{3.2}
\]
is free from critical points, i.e.: $\Upsilon_S \subset \mathbb{R}^{2n+k} \setminus \Sigma_S$.

Second, we introduce the relevant class of symbols associated to $S$:

Definition 3.2. The set of symbols consists of all $b \in C^\infty([0, T] \times \mathbb{R}^{2n} \times \mathbb{R}^k; \mathbb{R})$ such that
(i) $b(0, x, \eta, \theta) = \rho(\theta)$, $\rho \in \mathcal{S}(\mathbb{R}^k; \mathbb{R}_+)$, $\int_{\mathbb{R}^k} \rho(\theta)\, d\theta = 1$.
(ii) For $t \in\, ]0, T]$ the inequalities
\[
|b(t, x, \eta, \theta)| \le
\begin{cases}
C^+(T)\, e^{\lambda(x, \eta)}\, e^{-|\theta|^2}, & (x, \eta, \theta) \notin \Upsilon_S, \\
C^-(T)\, \lambda^{-n}(x, \eta)\, e^{-|\theta|^2}, & (x, \eta, \theta) \in \Upsilon_S
\end{cases}
\tag{3.3}
\]
hold for some constants $C^\pm(T) > 0$.

Remark 3.1. The exponential upper bound outside $\Upsilon_S$ is verified by the symbol $b_0$ (see Theorem 2.10) and also, as we will see, by any other symbol $b_j$, $j = 1, \ldots$, entering in Theorem 1.1. Moreover, on the domain $\Upsilon_S$ there are no critical points for the function $S$, and this leads us to require an asymptotic vanishing behavior of type $\lambda^{-n}(x, \eta)\, e^{-|\theta|^2}$ in this region for $b_0$; as a consequence the same asymptotic property is fulfilled by all $b_j$. This setting is motivated by the fact that the contribution of this region to the FIO can be of order $O(\hbar^\infty)$, as we see in Corollary 3.1. In this framework, we provide a very simple proof of global $L^2$ continuity.

Finally, we introduce the class of global FIO associated to the Hamiltonian flow:

Definition 3.3. Fix a phase function $S$ as in Definition 3.1, and a symbol $b$ as in Definition 3.2. Then the global $\hbar$-Fourier Integral Operator on $\mathcal{S}(\mathbb{R}^n)$ is defined as:
\[
B(t)\varphi(x) = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \int_{\mathbb{R}^k} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, b(t, x, \eta, \theta)\, d\theta\, d\eta\, \varphi(y)\, dy.
\tag{3.4}
\]
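In an equivalent way, it can be rewritten in the form:
\[
B(t)\varphi(x) = (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} \tilde S(t, x, y, u)}\, \tilde b(t, x, u)\, du\, \varphi(y)\, dy
\tag{3.5}
\]
where $u := (\eta, \theta)$, $\tilde S(t, x, y, u) := S(t, x, \eta, \theta) - \langle y, \eta \rangle$ and $\tilde b(t, x, u) := b(t, x, \eta, \theta)$. Indeed, if $S$ generates the Lagrangian submanifold $\Lambda_t$, then $\tilde S$ does the same in the new variables:
\[
\Lambda_t = \{(x, p; y, \eta) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid p = \nabla_x \tilde S,\ \eta = -\nabla_y \tilde S,\ 0 = \nabla_u \tilde S\}.
\]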
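Theorem 3.1. The operators of the $t$-family of global FIO $B(t) : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ as in (3.4) are continuous and admit a continuous extension as operators in $L^2(\mathbb{R}^n)$.

Proof. We begin by rewriting the FIO in the form of an integral operator acting on the $\hbar$-Fourier transform of the initial datum:
\[
\begin{aligned}
B(t)\varphi(x) &= (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} \int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, b(t, x, \eta, \theta)\, d\theta\, \hat\varphi_\hbar(\eta)\, d\eta \\
&= (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} \hat\sigma(t, x, \eta)\, \hat\varphi_\hbar(\eta)\, d\eta,
\end{aligned}
\tag{3.6}
\]
\[
\hat\sigma(t, x, \eta) := \int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, b(t, x, \eta, \theta)\, d\theta.
\]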
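This is possible because the integral in the $\theta$-variables is absolutely convergent, since $b(t, x, \eta, \cdot) \in \mathcal{S}(\mathbb{R}^k)$, and $\varphi(y)$ is also a Schwartz function and therefore admits an $\hbar$-Fourier transform in $\mathcal{S}(\mathbb{R}^n)$. The absolute convergence of the integral, as well as the $L^2$-continuity, is a consequence of the following computations.
\[
\hat\sigma(t, x, \eta) = \hat\sigma_+(t, x, \eta) + \hat\sigma_-(t, x, \eta),
\tag{3.7}
\]
\[
\hat\sigma_-(t, x, \eta) := \int_{B_\delta(0) \subset \mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, b(t, x, \eta, \theta)\, d\theta,
\tag{3.8}
\]
\[
\hat\sigma_+(t, x, \eta) := \int_{\mathbb{R}^k \setminus B_\delta(0)} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, b(t, x, \eta, \theta)\, d\theta,
\tag{3.9}
\]
where $\delta := \tilde K_2(T) \lambda(x, \eta)$. For $t = 0$ we have $\hat\sigma_+(0, x, \eta) + \hat\sigma_-(0, x, \eta) = e^{\frac{i}{\hbar} \langle x, \eta \rangle}$, $B(0)\varphi = \varphi$, and the continuity is obvious. For $t > 0$ we can apply the estimates of Property (ii) of Definition 3.2. In the region containing the critical points we have:
\[
\begin{aligned}
|\hat\sigma_+(t, x, \eta)| &\le \int_{\mathbb{R}^k \setminus B_\delta(0)} |b(t, x, \eta, \theta)|\, d\theta \le \int_{\mathbb{R}^k \setminus B_\delta(0)} C^+(T)\, e^{\lambda(x, \eta)}\, e^{-|\theta|^2}\, d\theta \\
&= C^+(T)\, e^{\lambda(x, \eta)} \int_{\mathbb{R}^k \setminus B_\delta(0)} e^{-|\theta|^2}\, d\theta.
\end{aligned}
\tag{3.10}
\]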
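By writing down the integral in spherical coordinates, we have the following simple estimates
\[
\int_{\mathbb{R}^k \setminus B_\delta(0)} e^{-|\theta|^2}\, d\theta = c_k \int_\delta^\infty e^{-\rho^2} \rho^{k-1}\, d\rho \le c_k\, d_k(L) \int_\delta^\infty e^{-\rho L}\, d\rho = c_k\, d_k(L)\, L^{-1} e^{-L\delta}
\]
for all $L > 0$ and $d_k(L) := \sup_{\rho \ge 0} e^{-\rho^2} \rho^{k-1} e^{\rho L}$. In particular we choose $L := 1 + \tilde K_2^{-1}(T)$, so that $L^{-1} < 1$ and $L\delta = \tilde K_2(T) \lambda(x, \eta) + \lambda(x, \eta)$, and it follows that
\[
|\hat\sigma_+(t, x, \eta)| \le C^+(T)\, e^{\lambda(x, \eta)}\, c_k\, d_k(L)\, e^{-\tilde K_2(T) \lambda(x, \eta) - \lambda(x, \eta)} = C^+(T)\, c_k\, d_k(L)\, e^{-\tilde K_2(T) \lambda(x, \eta)}.
\tag{3.11}
\]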
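The radial tail estimate used for (3.11) can be checked numerically. The sketch below is only an illustration under stated assumptions: it drops the common factor $c_k$, evaluates $d_k(L)$ at the maximiser of $e^{-\rho^2} \rho^{k-1} e^{\rho L}$ (the positive root of $2\rho^2 - L\rho - (k-1) = 0$), and compares the truncated tail integral with the bound $d_k(L) e^{-L\delta}/L$ obtained by integrating $e^{-\rho L}$ over $[\delta, \infty)$. The function names are hypothetical.

```python
import math


def sup_constant(k, L):
    # d_k(L) = sup_{rho >= 0} exp(-rho^2) * rho^(k-1) * exp(rho * L).
    # The logarithm of the function is concave, so the supremum is attained
    # at the positive root of 2 rho^2 - L rho - (k - 1) = 0.
    rho_star = (L + math.sqrt(L * L + 8.0 * (k - 1))) / 4.0
    return math.exp(-rho_star ** 2 + rho_star * L) * rho_star ** (k - 1)


def radial_tail(k, delta, width=12.0, n=20000):
    # Composite Simpson approximation of int_delta^infinity exp(-rho^2) rho^(k-1) d rho,
    # truncated at delta + width (the neglected part is far below rounding level).
    h = width / n

    def f(rho):
        return math.exp(-rho * rho) * rho ** (k - 1)

    total = f(delta) + f(delta + width)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(delta + i * h)
    return total * h / 3.0


def tail_bound(k, delta, L):
    # d_k(L) * int_delta^infinity exp(-rho L) d rho = d_k(L) * exp(-L delta) / L
    return sup_constant(k, L) * math.exp(-L * delta) / L


if __name__ == "__main__":
    for k in (1, 2, 3):
        for delta in (0.0, 1.0, 2.0):
            print(k, delta, radial_tail(k, delta), tail_bound(k, delta, L=1.5))
```

For $L \ge 1$ the factor $1/L$ is at most one, so the printed bound is in turn dominated by $d_k(L) e^{-L\delta}$, which is the quantity entering (3.11).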
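In the other region, instead, we can write:
\[
\begin{aligned}
|\hat\sigma_-(t, x, \eta)| &\le \int_{B_\delta(0) \subset \mathbb{R}^k} |b(t, x, \eta, \theta)|\, d\theta \le \int_{B_\delta(0) \subset \mathbb{R}^k} C^-(T)\, \lambda^{-n}(x, \eta)\, e^{-|\theta|^2}\, d\theta \\
&\le \int_{\mathbb{R}^k} C^-(T)\, \lambda^{-n}(x, \eta)\, e^{-|\theta|^2}\, d\theta = C^-(T)\, \pi^{\frac{k}{2}}\, \lambda^{-n}(x, \eta).
\end{aligned}
\tag{3.12}
\]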
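We can now apply Schur's Lemma to both integral operators, and this yields the $L^2$-boundedness.

Now we prove a result, based on an argument of Duistermaat (see [7, Proposition 2.1.1]), showing that for a particular class of symbols the related FIO exhibit an order $O(\hbar)$.

Lemma 3.1. Let us consider a FIO of type (3.4) with phase function $S$ and symbol $g$ leading to a convergent integral. Suppose that
\[
\Pi(g) := \operatorname{div}_\theta \Big( g\, \frac{\nabla_\theta S}{\|\nabla_\theta S\|^2} \Big) \in C^\infty([0, T] \times \mathbb{R}^{2n} \times \mathbb{R}^k; \mathbb{R})
\tag{3.13}
\]
and that the integral with symbol $\Pi(g)$ is convergent. Then, the following equivalence holds:
\[
\int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, g(t, x, \eta, \theta)\, d\theta = -i\hbar \int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, \Pi g(t, x, \eta, \theta)\, d\theta.
\tag{3.14}
\]
Proof. The differential operator
\[
L\psi := \frac{\langle \nabla_\theta S, \nabla_\theta \psi \rangle}{\|\nabla_\theta S\|^2}
\]
verifies the relation
\[
-i\hbar\, L e^{\frac{i}{\hbar} S} = e^{\frac{i}{\hbar} S}.
\]
Now, by using integration by parts and the definition of the operator $L$, we get:
\[
\int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S} g\, d\theta = -i\hbar \int_{\mathbb{R}^k} L(e^{\frac{i}{\hbar} S})\, g\, d\theta = -i\hbar \int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S}\, \Pi(g)\, d\theta.
\]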
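Corollary 3.1. Let $\tilde g \in C^\infty([0, T] \times \mathbb{R}^{2n+k}; \mathbb{R})$ be such that $\Pi^j(\tilde g) \in C^\infty([0, T] \times \mathbb{R}^{2n+k}; \mathbb{R})$ and that the corresponding Fourier integral is convergent $\forall\, 0 \le j \le N$. Then:
\[
\int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, \tilde g(t, x, \eta, \theta)\, d\theta = (-i\hbar)^N \int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, \Pi^N \tilde g(t, x, \eta, \theta)\, d\theta.
\]
Proof. The iterated application of the previous lemma gives the result.

Remark 3.2. If we take two symbols $g_1$, $g_2$ coinciding on $\Sigma_S$ and moreover such that $g_1 - g_2 = \tilde g$, with $\tilde g$ as in the above corollary, then the related FIO coincide up to order $O(\hbar^N)$.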
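The gain of one power of $\hbar$ in Lemma 3.1 can be illustrated in one dimension, where $\Pi(g) = (g / S')'$. The sketch below uses a hypothetical phase $S(\theta) = \theta + \theta^3/3$ (no real critical points, since $S' = 1 + \theta^2$) and the symbol $g = e^{-\theta^2}$; neither is taken from the paper. Only the moduli of the two integrals are compared, which is what the order statement uses.

```python
import cmath
import math


def simpson(f, a, b, n):
    # Composite Simpson rule for a (possibly complex-valued) integrand; n must be even.
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def phase(theta):
    # Hypothetical phase without real critical points: S'(theta) = 1 + theta^2 > 0.
    return theta + theta ** 3 / 3.0


def symbol(theta):
    return math.exp(-theta * theta)


def pi_symbol(theta):
    # One-dimensional Pi(g) = d/dtheta ( g / S' ), computed by hand for the choices above.
    return (-2.0 * theta * math.exp(-theta * theta) * (2.0 + theta * theta)
            / (1.0 + theta * theta) ** 2)


def oscillatory_pair(hbar, cutoff=7.0, n=40000):
    # Returns the integrals of e^{iS/hbar} g and of e^{iS/hbar} Pi(g) over [-cutoff, cutoff].
    integral_g = simpson(lambda th: cmath.exp(1j * phase(th) / hbar) * symbol(th),
                         -cutoff, cutoff, n)
    integral_pi = simpson(lambda th: cmath.exp(1j * phase(th) / hbar) * pi_symbol(th),
                          -cutoff, cutoff, n)
    return integral_g, integral_pi


if __name__ == "__main__":
    for hbar in (0.5, 0.2):
        I, J = oscillatory_pair(hbar)
        print(hbar, abs(I), hbar * abs(J))
```

For each value of $\hbar$ the two printed moduli should coincide up to quadrature error, so the integral with symbol $g$ is $\hbar$ times an integral of the same type with symbol $\Pi(g)$.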
4. Global Parametrices of the Evolution Operator

Here we prove the main result of this paper. Consider the initial-value problem for the Schrödinger equation
\[
\begin{cases}
i\hbar\, \partial_t \psi(t, x) = -\frac{\hbar^2}{2m} \Delta \psi(t, x) + V(x) \psi(t, x), \\
\psi(0, x) = \varphi(x) \in \mathcal{S}(\mathbb{R}^n),
\end{cases}
\tag{4.1}
\]
with a potential $V$ quadratic at infinity, of the type (2.6). We proceed to apply the results of the previous two sections in order to prove Theorem 1.1; namely, to construct a parametrix for the evolution operator in the form of a series of global FIO such that the solution of the Schrödinger equation (4.1) admits the following representation:
\[
\psi(t, x) = \sum_{j=0}^N (2\pi\hbar)^{-n+j} \int_{\mathbb{R}^{2n+k}} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, b_j(t, x, \eta, \theta)\, d\theta\, d\eta\, \varphi(y)\, dy + O(\hbar^{N+1})
\]
within the time interval $t \in [0, T]$ with $T$ arbitrarily large.

Proof of Theorem 1.1. Denoting by $H_x := -\frac{\hbar^2}{2m} \Delta_x + V(x)$ the action of the Schrödinger operator, we look for a family of global FIO $\{B_j(t)\}_{j \in \mathbb{N}}$ with symbols $b_j(t, x, \eta, \theta)$ enjoying Properties (i) and (ii) of Definition 3.2 such that
\[
0 = (H_x - i\hbar\, \partial_t) \sum_{j=0}^N \int_{\mathbb{R}^{2n+k}} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, \hbar^j\, b_j(t, x, \eta, \theta)\, d\theta\, d\eta\, \varphi(y)\, dy + O(\hbar^{N+1}).
\]
First of all, the approximation of order zero is the operator $B_0(t)$ defined as:
\[
B_0(t)\varphi := (2\pi\hbar)^{-n} \int_{\mathbb{R}^{2n+k}} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, b_0(t, x, \eta, \theta)\, d\theta\, d\eta\, \varphi(y)\, dy.
\tag{4.2}
\]
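It has to reduce to the identity for $t = 0$ and to represent the semiclassical approximation of the propagator. To this end, the related phase function solves the H-J problem (2.56) and moreover the symbol $b_0$ solves the regularized geometric version of the transport equation as in Theorem 2.11. As we observed in Remark 3.1, we require that in the region $\Upsilon_S$ free from critical points of $S$, the symbol $b_0$ behaves as $\lambda^{-n}(x, \eta)\, e^{-|\theta|^2}$. Now, we easily see that
\[
\begin{aligned}
(H_x - i\hbar\, \partial_t)\, e^{\frac{i}{\hbar} S} b_0 &= \Big( \frac{|\nabla_x S|^2}{2m} + V(x) + \partial_t S \Big) e^{\frac{i}{\hbar} S} b_0 \\
&\quad - i\hbar \Big( \partial_t b_0 + \frac{1}{m} \nabla_x S\, \nabla_x b_0 + \frac{\Delta_x S}{2m} b_0 \Big) e^{\frac{i}{\hbar} S} - \frac{\hbar^2}{2m} e^{\frac{i}{\hbar} S} \Delta_x b_0.
\end{aligned}
\]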
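For the reader's convenience, the expansion of $(H_x - i\hbar\,\partial_t)$ applied to $e^{\frac{i}{\hbar} S} b_0$ can be obtained step by step from the product rule. The following is a short worked derivation, assuming only $H_x = -\frac{\hbar^2}{2m}\Delta_x + V(x)$ as defined in the proof; summing the last two lines and adding $V e^{\frac{i}{\hbar}S} b_0$ gives the three groups of terms of orders $\hbar^0$, $\hbar^1$ and $\hbar^2$.

```latex
\begin{align*}
\nabla_x\bigl(e^{\frac{i}{\hbar}S} b_0\bigr)
  &= e^{\frac{i}{\hbar}S}\Bigl(\tfrac{i}{\hbar}\,\nabla_x S\, b_0 + \nabla_x b_0\Bigr),\\
\Delta_x\bigl(e^{\frac{i}{\hbar}S} b_0\bigr)
  &= e^{\frac{i}{\hbar}S}\Bigl(-\tfrac{1}{\hbar^2}\,|\nabla_x S|^2 b_0
     + \tfrac{i}{\hbar}\,\Delta_x S\, b_0
     + \tfrac{2i}{\hbar}\,\nabla_x S\,\nabla_x b_0 + \Delta_x b_0\Bigr),\\
-\tfrac{\hbar^2}{2m}\,\Delta_x\bigl(e^{\frac{i}{\hbar}S} b_0\bigr)
  &= e^{\frac{i}{\hbar}S}\Bigl(\tfrac{|\nabla_x S|^2}{2m}\, b_0
     - i\hbar\bigl(\tfrac{1}{m}\,\nabla_x S\,\nabla_x b_0 + \tfrac{\Delta_x S}{2m}\, b_0\bigr)
     - \tfrac{\hbar^2}{2m}\,\Delta_x b_0\Bigr),\\
-i\hbar\,\partial_t\bigl(e^{\frac{i}{\hbar}S} b_0\bigr)
  &= e^{\frac{i}{\hbar}S}\bigl(\partial_t S\, b_0 - i\hbar\,\partial_t b_0\bigr).
\end{align*}
```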
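The first two symbols of this sum vanish on the set of critical points $\Sigma_S$, so that we can apply Corollary 3.1. Indeed, for
\[
\tilde g = \Big( \frac{|\nabla_x S|^2}{2m} + V(x) + \partial_t S \Big) b_0
\]
we use the results proved in Theorems 2.4 and 2.10, whereas for
\[
\tilde g = \partial_t b_0 + \frac{1}{m} \nabla_x S\, \nabla_x b_0 + \frac{\Delta_x S}{2m} b_0
\]
we recall Theorem 2.11. As a consequence, the related two operators are bounded and $O(\hbar^{N+1})$. So,
\[
(H_x - i\hbar\, \partial_t) \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} S} b_0\, d\theta\, \hat\varphi_\hbar(\eta)\, d\eta = -\frac{\hbar^2}{2m} \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} S} \Delta_x b_0\, d\theta\, \hat\varphi_\hbar(\eta)\, d\eta + O(\hbar^{N+1}).
\tag{4.3}
\]
The operator $B_1(t)$ and the related symbol $b_1$ fulfill the analogous relation. The transport equation we now require for this symbol is the following:
\[
\begin{cases}
\partial_t b_1 + \frac{1}{m} \nabla_x S\, \nabla_x b_1 + \frac{1}{2m} \Delta_x S\, b_1 = \frac{i}{2m} \Delta_x b_0, \\
b_1(0, x, \eta, \theta) = 0.
\end{cases}
\tag{4.4}
\]
As a consequence,
\[
(H_x - i\hbar\, \partial_t) \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} S} (b_0 + \hbar b_1)\, d\theta\, \hat\varphi_\hbar(\eta)\, d\eta = -\frac{\hbar^3}{2m} \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} S} \Delta_x b_1\, d\theta\, \hat\varphi_\hbar(\eta)\, d\eta + O(\hbar^{N+1}).
\]
The equation for the second order symbol:
\[
\begin{cases}
\partial_t b_2 + \frac{1}{m} \nabla_x S\, \nabla_x b_2 + \frac{1}{2m} \Delta_x S\, b_2 = \frac{i}{2m} \Delta_x b_1, \\
b_2(0, x, \eta, \theta) = 0,
\end{cases}
\]
implies
\[
(H_x - i\hbar\, \partial_t) \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} S} (b_0 + \hbar b_1 + \hbar^2 b_2)\, d\theta\, \hat\varphi_\hbar(\eta)\, d\eta = -\frac{\hbar^4}{2m} \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} S} \Delta_x b_2\, d\theta\, \hat\varphi_\hbar(\eta)\, d\eta + O(\hbar^{N+1}).
\]
Therefore we can deal with functions $b_j$, $j \ge 1$, fulfilling the recurrent equations
\[
\begin{cases}
\partial_t b_j + \frac{1}{m} \nabla_x S\, \nabla_x b_j + \frac{1}{2m} \Delta_x S\, b_j = \frac{i}{2m} \Delta_x b_{j-1}, \\
b_j(0, x, \eta, \theta) = 0.
\end{cases}
\tag{4.5}
\]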
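The cancellation behind the recurrent equations can be written out once for a general truncation order $J$. In the sketch below, $\mathcal{T}$ and $\mathcal{E}$ are shorthand introduced here only for illustration (they are not the paper's notation): $\mathcal{T}b := \partial_t b + \frac{1}{m}\nabla_x S\,\nabla_x b + \frac{1}{2m}\Delta_x S\, b$ and $\mathcal{E} := \frac{|\nabla_x S|^2}{2m} + V + \partial_t S$.

```latex
\begin{align*}
(H_x - i\hbar\,\partial_t)\Bigl(e^{\frac{i}{\hbar}S}\sum_{j=0}^{J}\hbar^j b_j\Bigr)
 &= e^{\frac{i}{\hbar}S}\sum_{j=0}^{J}\Bigl(\hbar^{j}\,\mathcal{E}\,b_j
    - i\hbar^{j+1}\,\mathcal{T}b_j - \tfrac{\hbar^{j+2}}{2m}\,\Delta_x b_j\Bigr),\\
-\,i\hbar^{j+1}\,\mathcal{T}b_j
 &= -\,i\hbar^{j+1}\,\tfrac{i}{2m}\,\Delta_x b_{j-1}
  = +\tfrac{\hbar^{(j-1)+2}}{2m}\,\Delta_x b_{j-1}
  \qquad (j\ge 1),\\
(H_x - i\hbar\,\partial_t)\Bigl(e^{\frac{i}{\hbar}S}\sum_{j=0}^{J}\hbar^j b_j\Bigr)
 &= e^{\frac{i}{\hbar}S}\Bigl(\mathcal{E}\sum_{j=0}^{J}\hbar^j b_j
    - i\hbar\,\mathcal{T}b_0 - \tfrac{\hbar^{J+2}}{2m}\,\Delta_x b_J\Bigr).
\end{align*}
```

Each term $-\frac{\hbar^{j+1}}{2m}\Delta_x b_{j-1}$ is thus cancelled by the transport term of the next symbol, and only the last one, of order $\hbar^{J+2}$, survives; for $J = 1$ and $J = 2$ this reproduces the factors $\hbar^3$ and $\hbar^4$ displayed above. The terms containing $\mathcal{E}$ and $\mathcal{T}b_0$ are the ones handled through Corollary 3.1.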
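Each solution $b_j$ is a symbol as in Definition 3.2 and therefore (thanks to Theorem 3.1) it defines a bounded operator $B_j(t)$. The same holds true for the remainder operators:
\[
R_j(t)\varphi = (2\pi\hbar)^{-n} \int_{\mathbb{R}^{2n+k}} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, \hbar^{2+j}\, r_j(t, x, \eta, \theta)\, d\theta\, d\eta\, \varphi(y)\, dy
\]
where $r_j := \frac{i}{2m} \Delta_x b_j$. In order to prove it, we need the following

Lemma 4.1. For all $j \ge 1$ the solution of Eq. (4.5) fulfills the estimates:
\[
|b_j|,\ |\Delta_x b_j|\, (t, x, \eta, \theta) \le
\begin{cases}
C_j^+(T)\, e^{d_j(T) \lambda(x, \eta)}\, e^{-|\theta|^2}, & (x, \eta, \theta) \notin \Upsilon_S, \\
C_j^-(T)\, \lambda^{-n}(x, \eta)\, e^{-|\theta|^2}, & (x, \eta, \theta) \in \Upsilon_S.
\end{cases}
\tag{4.6}
\]
Proof. We consider the following problem
\[
\begin{cases}
\frac{d}{d\tau} \zeta(t, x, \eta, \theta)(\tau) = \frac{1}{m} \nabla_x S(t, \zeta(t, x, \eta, \theta)(\tau), \eta, \theta), \\
\zeta(t, x, \eta, \theta)(t) = x
\end{cases}
\tag{4.7}
\]
and define
\[
\Phi(\tau, x, \eta, \theta) := \exp\Big\{ -\frac{1}{2m} \int_0^\tau \Delta_x S(t, \zeta(t, x, \eta, \theta)(r), \eta, \theta)\, dr \Big\}
\]
in order to apply the theory of characteristics (with $(\theta, \eta)$ fixed) and find the solution:
\[
b_j(t, x, \eta, \theta) = \frac{i}{2m} \int_0^t \Phi(t - \tau, x, \eta, \theta)\, \Delta_x b_{j-1}(\tau, \zeta(t, x, \eta, \theta)(\tau), \eta, \theta)\, d\tau.
\tag{4.8}
\]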
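Along a fixed characteristic, a formula of the type (4.8) is the variation-of-constants (Duhamel) solution of a linear ordinary differential equation with zero initial datum. The following sketch illustrates this in the simplest setting, assuming a constant damping coefficient `c` (a hypothetical stand-in for $\frac{1}{2m}\Delta_x S$ along the characteristic) and a hypothetical source `source(tau)` standing in for $\frac{i}{2m}\Delta_x b_{j-1}$; with constant `c` the propagator depends only on $t - \tau$.

```python
import cmath
import math


def source(tau, m=1.0):
    # Hypothetical stand-in for (i/2m) * Delta_x b_{j-1} along the characteristic.
    return 1j / (2.0 * m) * math.cos(tau)


def propagator(elapsed, c):
    # Phi(elapsed) = exp(-c * elapsed) for a constant coefficient c.
    return math.exp(-c * elapsed)


def duhamel(t, c=0.7, n=4000):
    # b(t) = int_0^t Phi(t - tau) * source(tau) d tau, by the composite Simpson rule (n even).
    h = t / n
    total = propagator(t, c) * source(0.0) + propagator(0.0, c) * source(t)
    for i in range(1, n):
        tau = i * h
        total += (4 if i % 2 else 2) * propagator(t - tau, c) * source(tau)
    return total * h / 3.0


def rk4(t, c=0.7, steps=4000):
    # Direct integration of b' + c * b = source, b(0) = 0.
    h = t / steps
    b, s = 0.0 + 0.0j, 0.0

    def rhs(time, value):
        return -c * value + source(time)

    for _ in range(steps):
        k1 = rhs(s, b)
        k2 = rhs(s + h / 2, b + h * k1 / 2)
        k3 = rhs(s + h / 2, b + h * k2 / 2)
        k4 = rhs(s + h, b + h * k3)
        b += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        s += h
    return b


if __name__ == "__main__":
    print(duhamel(2.0), rk4(2.0))
```

The two printed complex numbers should agree up to discretisation error; iterating the same map expresses $b_j$ linearly in terms of $\Delta_x b_0$.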
By the iteration of this map, we have a direct linear relationship between $\Delta_x b_0$ and $b_j$. Now we recall the estimates on $b_0$ proved in Theorem 2.10
\[
|\partial_x^\alpha b_0(t, x, \eta, \theta)| \le C_\alpha^+(T)\, e^{d_\alpha(T) \lambda(x, \eta)}\, e^{-|\theta|^2}, \qquad \forall (x, \eta, \theta) \in \mathbb{R}^{2n+k}
\]
and the explicit analytic structure of $S$ studied in Theorem 2.1:
\[
S = \langle x, \eta \rangle - \frac{t}{2m} \eta^2 - t \langle Lx, x \rangle + \langle Q(t)\theta, \theta \rangle + \langle v(t, x, \eta), \theta + f(t, x, \theta) \rangle + \langle \nu(t, x, \theta), \theta \rangle + g(t, x, \theta)
\tag{4.9}
\]
which implies the exponential behavior of $\Phi$. The exponential upper bound for $|b_j|$ and $|\Delta_x b_j|$ follows directly from that. In the region $\Upsilon_S$ we recall the upper bound of type $\lambda^{-n}(x, \eta)\, e^{-|\theta|^2}$ we required for $\Delta_x b_0$ and the expansion $\Delta_x S(z) = \Delta_x S(\bar z) + G(z)$ with $G \in C_b^\infty$ (see Theorem 2.4). By using (4.8) we obtain this second estimate also for $|b_j|$ and $|\Delta_x b_j|$.
As a consequence we can conclude the proof of Theorem 1.1, by using the boundedness result of Theorem 3.1 to state the existence of constants $K_j(T) > 0$ such that $\|R_j(t)\| \le K_j(T)\, \hbar^{2+j}$. By well known arguments related to the Duhamel formula we obtain the estimate:
\[
\Big\| U(t) - \sum_{j=0}^N B_j(t) \Big\| \le \frac{1}{\hbar} \int_0^t \|R_N(s)\|\, ds \le T K_N(T)\, \hbar^{N+1}, \qquad t \in [0, T].
\]
Now we clarify the relationship between the construction of the previous theorem and Chazarain's formulation [5], as well as with the integral representation of Fujiwara [8].

Theorem 4.1. Let $t \in [0, t_0]$, with $t_0$ so small that the solution of the Hamilton–Jacobi equation does not develop caustics. Consider the construction of Theorem 1.1, truncated at any finite order $J$:
\[
\sum_{j=0}^J B_j(t)\varphi := (2\pi\hbar)^{-n} \sum_{j=0}^J \int_{\mathbb{R}^{2n+k}} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, \hbar^j\, b_j(t, x, \eta, \theta)\, d\theta\, d\eta\, \varphi(y)\, dy.
\tag{4.10}
\]
Then:
(1)
\[
\sum_{j=0}^J B_j(t)\varphi = \sum_{\alpha=0}^J U_\alpha^{ch}(t)\varphi + O(\hbar^{J+1})
\tag{4.11}
\]
Here $U_\alpha^{ch}(t)$ is the term of order $\alpha$ of Chazarain's FIO ([5]).
(2)
\[
\sum_{j=0}^J B_j(t)\varphi = \sum_{\alpha=0}^J U_\alpha^{F}(t)\varphi + O(\hbar^{J+1})
\tag{4.12}
\]
where this time $U_\alpha^{F}(t)$ is the term of order $\alpha$ of Fujiwara's integral operator ([8]).

Proof. In order to prove the first assertion, the main idea is to apply the stationary phase theorem to the oscillatory integrals (4.10) with respect to the $\theta$-variables. In the same way, if we consider the stationarity argument with respect to the $(\theta, \eta)$-variables we obtain the second assertion. In the small time regime $t \in [0, t_0]$ there exists a unique smooth and global critical point $\theta^\star(t, x, \eta)$, solution of $0 = \nabla_\theta S(t, x, \eta, \theta)$. This fact suggests that we consider the translated phase function around this point, $S(t, x, \eta, \theta + \theta^\star(t, x, \eta))$ with $\theta \in B_1(0)$, and the symbol $b_0$ (see Theorem 1.1) for which we choose the regularizing part as $\rho(\theta) := (\operatorname{vol} B_1(0))^{-1} X_1(\theta)$, a $C^\infty$ cut-off function for the ball $B_1(0)$. The compact behavior of $b_j$ in the $\theta$-variables follows as a consequence. The uniqueness
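of $\theta^\star$ and the compact setting in the oscillatory integral allow us to apply the stationary phase theorem to each integral in the $\theta$-variables
\[
B_j(t, x, \eta) = \int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, \hbar^j\, b_j(t, x, \eta, \theta)\, d\theta
\]
obtaining
\[
B_j(t, x, \eta) = e^{\frac{i}{\hbar} S(t, x, \eta, \theta^\star)}\, |\det \nabla^2_\theta S(t, x, \eta, \theta^\star(t, x, \eta))|^{-\frac{1}{2}}\, e^{\frac{i\pi}{4} \sigma}\, \hbar^j\, b_j(t, x, \eta, \theta^\star) + O(\hbar^{j+1})
\tag{4.13}
\]
where $\sigma = \operatorname{sgn} \nabla^2_\theta S(t, x, \eta, \theta^\star(t, x, \eta))$ and we have omitted (to simplify the exposition) the explicit form of the higher order symbols. Now we remark that the function $S(t, x, \eta, \theta^\star)$ equals the phase used in Chazarain's paper (the action functional evaluated on the classical curve with boundary conditions $x$ and $\eta$). Hence, by the uniqueness of the symbol expansion of the propagator in powers of $\hbar$, we get the correspondence between the symbols obtained as in (4.13) and the ones obtained in the above-mentioned paper. This implies the equivalence of the two series in (4.11) up to order $O(\hbar^{J+1})$. By the same argument, applied this time to the integrals over $u := (\theta, \eta)$ and $\Phi(t, x, y, u) := S(t, x, \eta, \theta) - \langle y, \eta \rangle$,
\[
\begin{aligned}
\tilde B_j(t, x, y) &= \int_{\mathbb{R}^n} \int_{\mathbb{R}^k} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, \hbar^j\, b_j(t, x, \eta, \theta)\, d\theta\, d\eta \\
&= \int_{\mathbb{R}^{n+k}} e^{\frac{i}{\hbar} \Phi(t, x, y, u)}\, \hbar^j\, \tilde b_j(t, x, u)\, du
\end{aligned}
\tag{4.14}
\]
we use the uniqueness of the critical point $u^\star(t, x, y)$ to get
\[
\tilde B_j(t, x, y) = e^{\frac{i}{\hbar} \Phi(t, x, y, u^\star)}\, \hbar^j\, |\det \nabla^2_u \Phi(t, x, y, u^\star(t, x, y))|^{-\frac{1}{2}}\, e^{\frac{i\pi}{4} \sigma}\, \tilde b_j(t, x, u^\star(t, x, y)) + O(\hbar^{j+1}).
\tag{4.15}
\]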
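The leading-order behavior given by the stationary phase theorem can be illustrated on a one-dimensional toy integral. The sketch below assumes the hypothetical phase $S(\theta) = \theta^2/2$ (unique critical point $\theta^\star = 0$, $S'' = 1$, signature $\sigma = 1$) and amplitude $b(\theta) = e^{-\theta^2}$, neither taken from the paper. It uses the standard one-dimensional statement, including the factor $(2\pi\hbar)^{1/2}$, which the displayed formulas do not write out.

```python
import cmath
import math


def simpson(f, a, b, n):
    # Composite Simpson rule for a complex-valued integrand; n must be even.
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def oscillatory_integral(hbar, cutoff=6.0, n=40000):
    # int e^{i S(theta)/hbar} b(theta) d theta with S = theta^2/2 and b = exp(-theta^2).
    def integrand(theta):
        return cmath.exp(1j * theta * theta / (2.0 * hbar)) * math.exp(-theta * theta)
    return simpson(integrand, -cutoff, cutoff, n)


def leading_term(hbar):
    # (2 pi hbar)^{1/2} * e^{i S(0)/hbar} * |S''(0)|^{-1/2} * e^{i pi sigma / 4} * b(0),
    # with S(0) = 0, S''(0) = 1, sigma = +1 and b(0) = 1.
    return math.sqrt(2.0 * math.pi * hbar) * cmath.exp(1j * math.pi / 4.0)


def relative_error(hbar):
    lead = leading_term(hbar)
    return abs(oscillatory_integral(hbar) - lead) / abs(lead)


if __name__ == "__main__":
    for hbar in (0.1, 0.05):
        print(hbar, relative_error(hbar))
```

The relative error printed for each $\hbar$ should be of the order of $\hbar$ itself, in line with the $O(\hbar^{j+1})$ remainders after the leading term.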
The phase function $\Phi(t, x, y, u^\star)$ is the same used by Fujiwara and therefore also (4.12) is proved. This concludes the proof of the theorem.

5. Multivalued WKB Semiclassical Approximation

In this final section we prove Theorem 1.2, mainly applying the Stationary Phase theorem to the global FIO (4.2), in order to get a multivalued WKB semiclassical approximation of the Schrödinger evolution operator.

Proof of Theorem 1.2. We start by recalling that (as proved in Theorem 1.1) the $\hbar$-Fourier Integral Operator
\[
B_0(t)\varphi := (2\pi\hbar)^{-n} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \int_{\mathbb{R}^k} e^{\frac{i}{\hbar}(S(t, x, \eta, \theta) - \langle y, \eta \rangle)}\, b_0(t, x, \eta, \theta)\, d\theta\, d\eta\, \varphi(y)\, dy
\tag{5.1}
\]
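is a semiclassical approximation of the Schrödinger propagator for all $t \in [0, T]$. Under the particular hypothesis
\[
V(x) = \frac{1}{2} |x|^2 + V_0(x), \qquad \sup_{x \in \mathbb{R}^n} \|\nabla^2 V_0(x)\| < 1, \qquad t \ne (2\tau + 1) \frac{\pi}{2} \quad (\tau \in \mathbb{N}),
\]
we proved (see Theorems 2.6–2.8) that the phase function has isolated and finitely many critical points; precisely, the equation
\[
\nabla_\theta S(t, x, \eta, \theta) = 0
\tag{5.2}
\]
is solved on a finite open partition $(x, \eta) \in \mathbb{R}^{2n} = \bigcup_{\ell=1}^{N(t)} D_\ell$ in such a way that on each $D_\ell$ there are exactly $\ell$ smooth solutions $\theta_\alpha(t, x, \eta)$, $1 \le \alpha \le \ell$. This property allows us to apply the Stationary Phase Theorem (see [11]) to the oscillatory integral in (5.1). The result is:
\[
\begin{aligned}
B_0(t, x, \eta)\big|_{D_\ell} &= \int_{\mathbb{R}^k} e^{\frac{i}{\hbar} S(t, x, \eta, \theta)}\, b_0(t, x, \eta, \theta)\, d\theta \\
&= \sum_{\alpha=1}^{\ell} e^{\frac{i}{\hbar} S(t, x, \eta, \theta_\alpha(t, x, \eta))}\, |\det \nabla^2_\theta S(t, x, \eta, \theta_\alpha(t, x, \eta))|^{-\frac{1}{2}}\, e^{\frac{i\pi}{4} \sigma_\alpha}\, b_0(t, x, \eta, \theta_\alpha(t, x, \eta)) + O(\hbar)
\end{aligned}
\]
where $\sigma_\alpha = \operatorname{sgn} \nabla^2_\theta S(t, x, \eta, \theta_\alpha(t, x, \eta))$. In the small time regime $t \in [0, t_0]$ and for potentials $V$ quadratic at infinity, it is well known (see, e.g., [20]) that the graph of the Hamiltonian flow
\[
\begin{aligned}
\Lambda_t &:= \{(y, \eta; x, p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid (x, p) = \phi^t_H(y, \eta)\} \\
&= \{(y, \eta; x, p) \in T^*\mathbb{R}^n \times T^*\mathbb{R}^n \mid p = \nabla_x S,\ y = \nabla_\eta S,\ 0 = \nabla_\theta S\}
\end{aligned}
\]
is globally transverse to the base manifold $(x, \eta) \in \mathbb{R}^{2n}$, so Eq. (5.2) admits a unique global smooth solution $\theta^\star(t, x, \eta)$. This simplified setting yields: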
\[
B_0(t, x, \eta) = e^{\frac{i}{\hbar} S(t, x, \eta, \theta^\star(t, x, \eta))}\, |\det \nabla^2_\theta S(t, x, \eta, \theta^\star(t, x, \eta))|^{-\frac{1}{2}}\, e^{\frac{i\pi}{4} \sigma}\, b_0(t, x, \eta, \theta^\star(t, x, \eta)) + O(\hbar)
\]
which is the usual WKB construction, local in time.

Acknowledgment

We thank Johannes Sjöstrand for suggesting to us the formulation of Theorem 1.2, and Kenji Yajima for a critical reading of a first draft of this paper.

References

[1] H. Amann and E. Zehnder, Periodic solutions of asymptotically linear Hamiltonian systems, Manuscripta Math. 32 (1980) 149–189.
[2] O. Bernardi and F. Cardin, Minimax and viscosity solutions of Hamilton–Jacobi equations in the convex case, Commun. Pure Appl. Anal. 5(4) (2006) 793–812.
[3] F. Cardin, The global finite structure of generic envelope loci for Hamilton–Jacobi equations, J. Math. Phys. 43(1) (2002) 417–430.
[4] M. Chaperon, Une idée du type géodésiques brisées pour les systèmes hamiltoniens, C. R. Acad. Sci. Paris Sér. I Math. 298(13) (1984) 293–296.
[5] J. Chazarain, Spectre d'un hamiltonien quantique et mécanique classique, Comm. Partial Differential Equations 5(6) (1980) 595–644.
[6] C. Conley and E. Zehnder, A global fixed point theorem for symplectic maps and subharmonic solutions of Hamiltonian equations on tori, in Nonlinear Functional Analysis and Its Applications, Part 1 (Berkeley, Calif., 1983), Proc. Sympos. Pure Math., Vol. 45 (Amer. Math. Soc., Providence, RI, 1986), pp. 283–299.
[7] J. J. Duistermaat, Fourier Integral Operators, Progress in Mathematics, Vol. 130 (Birkhäuser Boston, Inc., Boston, MA, 1996).
[8] D. Fujiwara, On a nature of convergence of some Feynman path integrals. I–II, Proc. Japan Acad. Ser. A Math. Sci. 55(8) (1979) 273–277.
[9] M. F. Herman and E. Kluk, A semiclassical justification for the use of non-spreading wavepackets in dynamics calculations, Chem. Phys. 91(1) (1984) 27–34.
[10] L. Hörmander, Fourier integral operators I, Acta Math. 127 (1971) 79–183.
[11] L. Hörmander, The Analysis of Linear Partial Differential Operators, Vol. I, Grundlehren der Math. Wiss., Vol. 256 (Springer-Verlag, 1985).
[12] L. Kapitansky and Y. Safarov, A parametrix for the nonstationary Schrödinger equation, in Differential Operators and Spectral Theory, Amer. Math. Soc. Transl. Ser. 2, Vol. 189 (Amer. Math. Soc., 1999), pp. 139–148.
[13] H. Kitada, On a construction of the fundamental solution for Schrödinger equations, J. Fac. Sci. Univ. Tokyo Sect. IA Math. 27(1) (1980) 193–226.
[14] A. Laptev and I. M. Sigal, Global Fourier integral operators and semiclassical asymptotics, Rev. Math. Phys. 12(5) (2000) 749–766.
[15] F. Laudenbach and J. C. Sikorav, Persistance d'intersection avec la section nulle au cours d'une isotopie hamiltonienne dans un fibré cotangent, Invent. Math. 82 (1985) 349–357.
[16] A. Martinez and K. Yajima, On the fundamental solution of semiclassical Schrödinger equations at resonant times, Comm. Math. Phys. 216 (2001) 357–373.
[17] D. Robert, On the Herman–Kluk semiclassical approximation, arXiv: 0908.0847v1 [math-ph].
[18] T. Swart and V. Rousse, A mathematical justification for the Herman–Kluk propagator, Comm. Math. Phys. 286(2) (2009) 725–750.
[19] C. Viterbo, Symplectic topology as the geometry of generating functions, Math. Ann. 292 (1992) 685–710.
[20] A. Weinstein, Lectures on Symplectic Manifolds, CBMS Regional Conference Series in Mathematics, Vol. 29 (American Mathematical Society, Providence, R.I., 1979).
[21] K. Yajima, On the behaviour at infinity of the fundamental solution of time dependent Schrödinger equation, Rev. Math. Phys. 13(7) (2001) 891–920.
[22] K. Yajima, Smoothness and non-smoothness of the fundamental solution of time dependent Schrödinger equations, Comm. Math. Phys. 181(3) (1996) 605–629.
[23] J. C. Sikorav, On Lagrangian immersions in a cotangent bundle defined by a global phase function, C. R. Acad. Sci. Paris Sér. I Math. 302(3) (1986) 119–122.
[24] J. C. Sikorav, Problèmes d'intersection et de points fixes en géométrie hamiltonienne, Comm. Math. Helv. 62 (1987) 61–72.
October 31, J070-S0129055X11004503
2011 13:4 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 9 (2011) 1009–1033 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004503
FERMIONIC FIELDS IN THE FUNCTIONAL APPROACH TO CLASSICAL FIELD THEORY
KATARZYNA REJZNER
II. Inst. f. Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, D-22761 Hamburg, Germany
[email protected]

Received 3 March 2011
Revised 27 September 2011

In this paper, we present a formulation of the classical theory of fermionic (anticommuting) fields, which fits into the general framework proposed by Brunetti, Dütsch and Fredenhagen. It was inspired by the recent developments in perturbative algebraic quantum field theory and it also allows for a deeper structural understanding on the classical level. We propose a modification of this formalism that also allows one to treat fermionic fields. In contrast to other formulations of the classical theory of anticommuting variables, we do not introduce additional Grassmann degrees of freedom. Instead the anticommutativity is introduced in a natural way on the level of functionals. Moreover, our construction incorporates the functional-analytic and topological aspects, which are usually neglected in the treatments of anticommuting fields. We also give an example of an interacting model where our framework can be applied.

Keywords: Perturbative algebraic field theory; fermionic fields; classical field theory.

Mathematics Subject Classification 2010: 81T05, 81T08, 70S05
0. Introduction

Recent developments in perturbative algebraic quantum field theory [3, 4, 14–17] have opened a new perspective also for classical field theory. Using the algebraic approach one can treat the quantum algebra as a deformation of the classical structure. This viewpoint turned out to be very successful, for example in understanding the various notions of renormalization group [4, 31] and for applications in cosmology [12, 13, 22]. In the second case Dirac fields play a very important role. In [13] a suitable modification of the functional approach was developed to describe the anticommuting (fermionic) free quantum fields. The first steps toward a formulation of the classical theory were made as well.
The complete treatment of the classical field theory of bosons shall be presented in [7]. Since the generalization to fermionic fields leads to some new effects, it is treated separately. In this paper, we present a formalism which is based on that of [4] but introduces some new features characteristic of anticommuting fields. The problem of understanding the classical theory of fermions is a long-standing one. Various attempts were made to tackle it. On the mathematical side, there is the variational bicomplex approach [41–43] and the supermanifold [19] or graded manifold [11, 35] formalism. The supermanifold approach was used in [8–10, 28]. The geometrical foundations of supermechanics on graded manifolds were formulated in [27, 33, 35], including the notion of graded Lagrangian, tangent supermanifold, space of velocities and Hamiltonian mechanics of a graded system. The more intuitive but formal approach to the calculus of variations is presented in [52]. It applies to even as well as to odd fields. It seems to be closer to intuition, but does not provide a clear mathematical structure that can be used to understand the classical theory of fermion fields. Following the spirit of [3, 4, 14–17] we propose a way to make the formal notions used in [52] more precise, without losing the intuition known from practical calculations in classical and quantum field theory. In contrast to most of the standard approaches we do not use Grassmann-valued functions. In this respect our treatment resembles that of [45–47]. The field configurations are ordinary sections of some vector bundle and the anticommutativity is introduced at the level of functionals. This way we avoid many of the technical and conceptual problems. The most difficult problem in understanding the classical theory of fermionic fields is the treatment of models with interaction, such as the Gross–Neveu [21] or the Thirring model [51].
We propose a formalism to deal with this kind of theory in a way which agrees with the spirit of pAQFT. The classical structure presented here can be treated as a first step to quantization. In Sec. 4, we comment further on this point and sketch how a deformation quantization can be performed, along the lines of [4, 5]. The approach presented in this paper allows not only the treatment of fermionic fields in the functional approach, but also opens a way to a locally covariant treatment of interacting models. The treatment of the free Dirac field was already presented in [40]. The paper is organized as follows:
(1) In the first section, we present an overview of all mathematical structures that will be needed for the formulation of the functional framework. We also define the kinematical structure of the theory. Since we work in the off-shell formalism, the notion of observables shall be introduced at this level without the need to solve the equations of motion. It will turn out that this is a very crucial point, especially for the fermionic fields.
(2) In the second section, we propose a treatment of the dynamics. We construct the Poisson structure (Peierls bracket) and intertwining maps between different Poisson structures (so-called Møller maps). For the bosonic case, the existence of these maps for nonlinear actions will be proved in [7].
(3) In the next section we treat the example of the Gross–Neveu model, using the functional methods.
(4) Finally we give a sketch of quantization using the deformation of the Poisson structure.
1. Algebra of Functionals

In the functional approach to classical field theory [4, 14–17] the basic structure is a Poisson algebra of classical observables. It is defined in a slightly abstract way as an algebra of functionals on the configuration space $E$. One starts with an off-shell setting, i.e. field configurations are not required to be solutions of some dynamical equations. They are just vector-valued functions of a given type on the Minkowski spacetime $M$. For example, in the case of a scalar field these are smooth functions, $E = C^\infty(M)$. We can identify classical observables with functionals on this space. Intuitively speaking, a measurement performed at a spacetime point $x$ can be associated with an evaluation functional $\Phi_x : E \to \mathbb{R}$, $\Phi_x(\varphi) = \varphi(x)$. Of course the full space of functionals is far too big to introduce a sensible structure on it. We have to impose some regularity conditions. First we restrict ourselves to functionals that are smooth. This notion again requires a few comments, since the calculus on infinite-dimensional spaces (and $E$ is indeed infinite-dimensional) is a subtle issue. Roughly speaking one endows the space $E$ with its natural locally convex topology and defines the derivative of a functional on $E$ as a generalization of the directional derivative (see [24, 36] for details). We do not want to go into the details now, since in the context of fermionic fields we will need a slightly different notion of smoothness anyway. We provide a thorough mathematical discussion of these points in Sec. 1.1. For now let us denote by $C^\infty(E, \mathbb{R})$ the space of smooth functionals on $E$. Using the definition of the Peierls bracket [37] one can introduce the dynamical Poisson structure on $E$ in the Lagrange formalism. For the moment let us focus on the example of a free scalar field. The Peierls bracket of two functionals $F$, $G$ from a suitably chosen domain is defined as:

\[ \{F, G\} := -\langle F^{(1)}, \Delta * G^{(1)} \rangle, \tag{1.1} \]
where $\Delta$ is the causal propagator for the Klein–Gordon operator. We already indicated that this expression is well defined only on a suitably chosen domain. The reason for this is the singular character of the causal propagator $\Delta$. Indeed, the above expression implicitly contains a pointwise multiplication of distributions. To make sense of such an operation one has to control the singularity structure of the objects involved. The whole space $C^\infty(E, \mathbb{R})$ is too big to make a Poisson algebra out of it, since some functionals are simply too singular for the Peierls bracket (1.1) to be well defined. An easy way out is to consider only functionals that are compactly supported and local. The definition of the spacetime support of a smooth functional is simply a generalization of the distributional support:

\[ \operatorname{supp} F \doteq \{x \in M \mid \forall \text{ neighborhoods } U \text{ of } x \ \exists\, \varphi, \psi \in E,\ \operatorname{supp}\psi \subset U, \text{ such that } F(\varphi + \psi) \neq F(\varphi)\}. \tag{1.2} \]
The notion of locality is also quite intuitive. According to the standard definition one calls a functional $F$ local if it can be expressed as:

\[ F(\varphi) = \int_M dx\, f(j_x(\varphi)), \tag{1.3} \]
October 31, J070-S0129055X11004503
1012
2011 13:4 WSPC/S0129-055X
148-RMP
K. Rejzner
where $f$ is a function on the jet space over $M$ and $j_x(\varphi) = (x, \varphi(x), \partial\varphi(x), \ldots)$ is the jet of $\varphi$ at the point $x$. It was already recognized in [4, 17] in the context of perturbative algebraic quantum field theory that the property of locality can be reformulated using the notion of additivity of a functional together with a certain wavefront set condition (we come back to it in Sec. 1.1). It can be shown that the Peierls bracket is well defined on the space of smooth compactly supported local functionals, but it turns out that this space is not closed under $\{\cdot,\cdot\}$. To obtain a Poisson algebra one has to admit objects that are more singular. Here microlocal analysis comes into play. Note that the wavefront set of the causal propagator is characterized as

\[ \mathrm{WF}(\Delta) = \{(x, y, k, -k) \in \dot{T}^*(M)^2 \mid (x - y)^2 = 0,\ k \parallel (x - y),\ k^2 = 0\}. \tag{1.4} \]

By applying Hörmander's criterion [26] for multiplication of distributions one can identify a class of smooth compactly supported functionals for which (1.1) makes sense. The construction outlined here is discussed in detail in references [4–6, 14–17]. Here we only indicated the most important features, which we will now reproduce for the case of fermionic fields. Note that the algebraic formulation allows us to work on a very abstract level and we can avoid some of the conceptual difficulties of other approaches. In the first step we have to specify our algebra of functionals. Here we encounter a first difference with respect to the bosonic case, since our functionals have to be antisymmetric in some sense. To obtain a suitable framework we have to use tools from functional analysis. Although we will need some abstract mathematics at the beginning, it turns out that once the framework is established, it can be easily applied to physical examples (Sec. 2).

1.1. Antisymmetric functionals

We already indicated in the introduction to this section that first we need to define the basic kinematical objects of the theory.
Working off-shell means that at this point we do not specify the dynamics. Let $E \doteq E(M, V)$ be an infinite-dimensional vector space of field configurations. Here $V$ is a $k$-dimensional complex vector space in which fields take values and $M$ is the Minkowski spacetime with the signature $(+, -, -, -)$. Now we want to implement the notion of antisymmetry into our kinematical structure. To this end we construct $\bigwedge^\bullet E$, the exterior algebra of $E$, by taking the quotient of the tensor algebra over $E$ by the ideal generated by elements of the form $x \otimes y + y \otimes x$. The resulting algebra is equipped with the antisymmetric $\wedge$-product and can be written as a direct sum: $\bigwedge^\bullet E = \bigoplus_{p=0}^{\infty} \bigwedge^p E$, where $\bigwedge^p E$ is the subspace of $\bigwedge^\bullet E$ spanned by elements of the form $u_1 \wedge \cdots \wedge u_p$, $u_1, \ldots, u_p \in E$, and $\bigwedge^0 E = \mathbb{C}$. A general element of $\bigwedge^\bullet E$ is a finite sum of elements of the spaces $\bigwedge^p E$ (called homogeneous). Each space $\bigwedge^p E$ can be embedded in the space of antisymmetric sections [13, 22]. The condition of antisymmetry means that:

\[ u^{a_1,\ldots,a_k,a_{k+1},\ldots,a_p}(x_1, \ldots, x_k, x_{k+1}, \ldots, x_p) = -u^{a_1,\ldots,a_{k+1},a_k,\ldots,a_p}(x_1, \ldots, x_{k+1}, x_k, \ldots, x_p). \]

We shall denote the space of antisymmetric sections by $E(M^p, V^{\otimes p})^a$. This space can be equipped with the Fréchet topology of uniform convergence on compact subsets of $M^p$. This also induces a topology on $\bigwedge^p E$ and it follows that $\overline{\bigwedge^p E} = E(M^p, V^{\otimes p})^a$, with respect to this topology. We define now:

\[ C(M, V) \doteq \bigoplus_{p=0}^{\infty} E(M^p, V^{\otimes p})^a. \tag{1.5} \]
The dual of $C(M, V)$ is the direct product:

\[ A \doteq (C(M, V))' = \prod_{p=0}^{\infty} E'(M^p, V^{\otimes p})^a = \prod_{p=0}^{\infty} A^p. \tag{1.6} \]
The elements of $A$ are called here the antisymmetric functionals, written as (possibly infinite) sequences: $T = (T_p)_{p \in \mathbb{N}}$, where the components $T_p \in A^p = E'(M^p, V^{\otimes p})^a$ are referred to as homogeneous functionals. The evaluation of $T \in A$ on an element $C(M, V) \ni u = \sum_{p=0}^{n} u^{(p)}$, $u^{(p)} \in E(M^p, V^{\otimes p})^a$, is understood as:

\[ T(u) = \sum_{p=0}^{n} \langle T_p, u^{(p)} \rangle \equiv \sum_{p=0}^{n} T_p(u^{(p)}), \tag{1.7} \]
where $\langle \cdot, \cdot \rangle$ denotes the natural duality between $E(M^p, V^{\otimes p})^a$ and $E'(M^p, V^{\otimes p})^a$. Note that each $T \in A$ evaluated on an element of $C(M, V)$ is always a sum of finitely many terms. We can equip $A$ with the antisymmetric wedge product [39, 44]:

\[ S \wedge T(u_1, \ldots, u_{p+q}) \doteq \frac{1}{p!\,q!} \sum_{\pi \in P_{p+q}} (-1)^{\pi}\, S(u_{\pi(1)}, \ldots, u_{\pi(p)})\, T(u_{\pi(p+1)}, \ldots, u_{\pi(p+q)}), \tag{1.8} \]
for $S \in A^p(E)$, $T \in A^q(E)$, $u_i \in E$. The definition of $\wedge$ can now be extended to $C(M, V)$ by continuity. The wedge product is associative and antisymmetric for homogeneous elements:

\[ (R \wedge S) \wedge T = R \wedge (S \wedge T), \qquad S \wedge T = (-1)^{|S||T|}\, T \wedge S, \tag{1.9} \]

where $|S|$, $|T|$ denote the grades of $S$, $T$ respectively. One can define a derivative on homogeneous elements and extend it by linearity to the whole of $A$ with the prescription [39]:

\[ d_h : A^p \to A^{p-1}, \quad h \in E, \]
\[ (d_h T)(u) \doteq T(h \wedge u), \quad T \in A^p,\ u \in E(M^{p-1}, V^{\otimes (p-1)})^a,\ p > 0, \]
\[ d_h T = 0, \quad T \in A^0. \tag{1.10} \]
It is easy to verify that the "derivative" $d_h$ defined in (1.10) has the following properties:

(1) $d_h$ is a graded derivation for every $h \in E$ and $S$, $T$ homogeneous, i.e.
\[ d_h(\alpha S + \beta T) = \alpha\, d_h S + \beta\, d_h T, \qquad d_h(S \wedge T) = (d_h S) \wedge T + (-1)^{|S|}\, S \wedge d_h T. \tag{1.11} \]

(2) For each $T \in A$ it induces a map:
\[ T^{(1)} : C(M, V) \to L(E, \mathbb{R}), \tag{1.12} \]
\[ \langle T^{(1)}(u); h \rangle \doteq d_h T(u). \tag{1.13} \]
Moreover $T^{(1)}(u)$ is continuous (i.e. an element of $E'$) for every $u \in C(M, V)$. An object $\langle T^{(1)}(\cdot), h \rangle \in A$ corresponds to the (formal) notion of "the left variational derivative" of $T$. The definition given here agrees with [2, 25, 39]. One can also define the "right derivative" by a suitable modification of the definition of $d_h$.

(3) Property 2 generalizes to:
\[ T^{(k)} : C(M, V) \to L_{\mathrm{alt}}(\underbrace{E \times \cdots \times E}_{k}; \mathbb{R}), \quad T \in A^p,\ k < p, \tag{1.14} \]
\[ \langle T^{(k)}(u); h_k, \ldots, h_1 \rangle \doteq d_{h_k} \cdots d_{h_1} T(u) = T(h_k \wedge \cdots \wedge h_1 \wedge u). \tag{1.15} \]
Moreover $T^{(k)}(u)$ is jointly continuous for every $u \in C(M, V)$.

(4) It is "anticommutative" in the following sense:
\[ d_{h_1} d_{h_2} T = -d_{h_2} d_{h_1} T, \quad \forall\, T \in A. \tag{1.16} \]

Analogously to the commutative case one can consider a particular class of elements of $A^1$, namely the evaluation functionals:

\[ A^1 \ni \Phi^a_x, \quad \Phi^a_x(u) \doteq u(x)^a, \quad \text{where } x \in M,\ a = 1, \ldots, k \text{ and } u \in E. \tag{1.17} \]

Applying the wedge product to these functionals we get a relation:

\[ \Phi^a_x \wedge \Phi^b_y = -\Phi^b_y \wedge \Phi^a_x. \tag{1.18} \]
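The formulas (1.8), (1.10), (1.11) and (1.18) can be checked in a finite-dimensional toy model. The sketch below is only an illustration under simplifying assumptions: the configuration space $E$ is replaced by vectors with three rational components, functionals are plain multilinear Python functions, and the names `perm_sign`, `wedge`, `d` and `Phi` are hypothetical helpers that do not appear in the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def perm_sign(perm):
    """Sign of a permutation given as a tuple of indices (parity of inversions)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def wedge(S, p, T, q):
    """Wedge product of a p-linear S and a q-linear T, following formula (1.8)."""
    norm = factorial(p) * factorial(q)

    def S_wedge_T(*u):
        total = Fraction(0)
        for pi in permutations(range(p + q)):
            s_args = [u[i] for i in pi[:p]]
            t_args = [u[i] for i in pi[p:]]
            total += perm_sign(pi) * S(*s_args) * T(*t_args)
        return total / norm

    return S_wedge_T


def d(h, T):
    """Left derivative d_h T for grade > 0: insert h into the first slot, as in (1.10)."""
    return lambda *u: T(h, *u)


def Phi(a):
    """Toy evaluation functional of grade 1: picks component a of a vector."""
    return lambda u: Fraction(u[a])


if __name__ == "__main__":
    u, v = (1, 2, 3), (4, 5, 6)
    ab = wedge(Phi(0), 1, Phi(1), 1)
    ba = wedge(Phi(1), 1, Phi(0), 1)
    # The two values differ only by a sign, in line with (1.18).
    print(ab(u, v), ba(u, v))
```

In this toy setting a grade-0 functional is a function of no arguments, so `d(h, T)` for a grade-1 `T` can be fed back into `wedge` with `q=0`, which is how the graded Leibniz rule (1.11) can be tested numerically.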
1.2. Distributions

With the kinematical structure introduced in the previous subsection we can start to proceed toward the dynamics. As indicated in the introduction, the functional approach relies heavily on the theory of distributions. Therefore the next step is to include the functional-analytic aspects into our framework. A natural formulation in the case of fermionic fields involves distributions with values in a graded algebra. Since it needs a certain generalization of the usual setting, we devote this section to introducing the abstract mathematical structures that are needed. We start with some basic definitions concerning distributions. For details see [26, 48–50]. To fix the notation, we define: $E(M) \doteq C^\infty(M, \mathbb{R})$, $D(M) \doteq C_0^\infty(M, \mathbb{R})$, and $S(M)$ denotes the space of Schwartz functions. These function spaces are equipped with the topology of uniform convergence on compact subsets of $M$. The corresponding distribution spaces are defined as the topological duals. This can also be generalized to vector-valued distributions:

Definition 1.1. Let $X$ be a locally convex topological vector space with a topology defined by a separating family of seminorms $\{p_\alpha\}_{\alpha \in I}$. We say that $T$ is a distribution on $M$ with values in $X$ if it is a continuous linear mapping from $D(M)$ to $X$.

The theory of vector-valued distributions was developed in [49, 50]. A slight modification was proposed in [32], where one uses sequential completeness instead of quasi-completeness. Most results from the theory of scalar-valued distributions generalize to this setting. The main difficulty lies in the fact that there is no natural notion of tensor product for locally convex vector spaces. Also the approximation property does not hold in general. The situation is much easier if the space $X$ is nuclear and (sequentially) complete (see Appendix A and [30] for details). This is the case if we take $A$ with the weak topology $\tau_\sigma$. A few details concerning the topologies are given in Appendix A. By the space of distributions with values in $A$ we shall understand $D'(M)_c \,\hat{\otimes}\, A$, where $D'(M)_c$ denotes the dual of $D(M)$ with the topology of uniform convergence on compact sets and $\hat{\otimes}$ is the sequential completion of the tensor product with respect to the tensor product topology.^a

The notion of vector-valued distributions enables us to formulate the classical theory of fermionic fields in a mathematically elegant way. Note that the map (1.12) can be treated as an element of $D'(M, V)_c \,\hat{\otimes}\, A$, i.e. a distribution with values in a Grassmann algebra. One can generalize all known operations like convolution, Fourier transform and pullback to such objects. We shall recall them here to set the notation and we refer to [26, 49, 50] for details. Since $D'(M, V)_c \,\hat{\otimes}\, A \cong D'(M)_c \,\hat{\otimes}\, V \,\hat{\otimes}\, A$ and $V$ is finite-dimensional, we provide the definitions for the case $V = \mathbb{R}$, without loss of generality.^b

Definition 1.2. Let $T = t \otimes f$ and $\phi = \varphi \otimes g$, where $f, g \in A$, $t \in D'(M)_c$ and $\varphi \in D(M)$. We have an antisymmetric bilinear product on $A$ defined as: $m_a(T, S) \doteq T \wedge S$. We define the convolution of $T$ and $\phi$ by setting:

\[ (T *_a \phi)(x) \doteq t(\varphi(x - \cdot)) \otimes m_a(f, g). \tag{1.19} \]

The extension by sequential continuity to $D'(M)_c \,\hat{\otimes}\, A$ defines a convolution of a vector-valued distribution with a vector-valued function.

^a In general one has to distinguish between the projective and injective tensor product, but in the case of nuclear vector spaces these notions coincide. For a detailed discussion see [30, 32, 49, 50].

^b In general one needs an inner product structure on $V$. Since $V$ is finite-dimensional, it can always be introduced with the natural pairing of $V$ and $V' \cong V$. In physical examples this dual pairing is usually provided by some natural structure. In the case of Dirac fields, this is the pairing between spinors and cospinors and for the ghost fields in gauge theories it is induced by the Killing form of the gauge algebra.
Definition 1.3. Let $T = t \otimes f$ and $S = s \otimes g$, where $f, g \in A$, $t \in E'(M^2)_c$ and $s \in D'(M)_c$. We define the convolution of $T$ and $S$ by setting:

\[ T *_a S \doteq \int t(\cdot, y)\, s(y)\, dy \otimes m_a(f, g). \tag{1.20} \]

This expression is well defined by [26, 4.2.2] and can be extended by sequential continuity to arbitrary $S \in D'(M)_c \,\hat{\otimes}\, A$, $T \in E'(M)_c \,\hat{\otimes}\, A$.

Definition 1.4. In a similar spirit we define the evaluation of $T = t \otimes f$ on $\phi = \varphi \otimes g$ by:

\[ \langle T, \phi \rangle_a \doteq \langle t, \varphi \rangle \otimes m_a(f, g), \tag{1.21} \]

where $f, g \in A$, $t \in D'(M)_c$ and $\varphi \in D(M)$. This too can be extended by sequential continuity to $D'(M)_c \,\hat{\otimes}\, A$.

Definition 1.5. Let $T \in S'(M)_c \,\hat{\otimes}\, A$. We define $\hat{T} \in S'(M)_c \,\hat{\otimes}\, A$, the Fourier transform of $T$, as:

\[ \hat{T}(\phi) = T(\hat{\phi}), \quad \phi \in S(M). \tag{1.22} \]
Also the notion of the wave front set [26] can be extended to distributions with values in a locally convex vector space (lcvs). The case of Banach spaces was already treated in detail in [40].

Definition 1.6. Let $\{p_\alpha\}_{\alpha \in A}$ be the family of seminorms generating the locally convex topology on $A$. Let $T \in S'(M)_c \,\hat{\otimes}\, A$. A point $(x, \xi_0) \in T^*\mathbb{R}^n \setminus 0$ is not in $\mathrm{WF}(T)$ if and only if $p_\alpha(\widehat{\phi T}(\xi))$ is fast decreasing as $|\xi| \to \infty$ for all $\xi$ in an open conical neighborhood of $\xi_0$, for some $\phi \in D(M)$ with $\phi(x) \neq 0$, $\forall\, \alpha \in A$.

With the notion of the wave front set we can define a "pointwise product" of two distributions $T, S \in D'(M)_c \,\hat{\otimes}\, A$ by a straightforward extension of [26, 8.2.10]:

Proposition 1.1. Let $T, S \in D'(U)_c \,\hat{\otimes}\, A$, $U \subset M$ (open). The product $T \cdot_a S$ can be defined as the pullback of $m_a \circ (T \otimes S)$ by the diagonal map $\delta : U \to U \times U$ unless $(x, \xi) \in \mathrm{WF}(T)$ and $(x, -\xi) \in \mathrm{WF}(S)$ for some $(x, \xi)$.

Obviously we have: $T \cdot_a S = (-1)^{|S||T|}\, S \cdot_a T$, whenever these expressions are well defined. In the following we shall also use a more suggestive notation: $T \cdot_a S \doteq \langle T, S \rangle_a$. Now we want to impose some regularity conditions on the distributions we want to consider. This is important for the definition of the subspace of $A$ where the Peierls bracket is well defined. Let $\Xi_n \doteq \{(x_1, \ldots, x_n, k_1, \ldots, k_n) \mid (k_1, \ldots, k_n) \notin (\bar{V}_+^n \cup \bar{V}_-^n)\}$ be an open cone. We denote by $F^n \doteq E'_{\Xi_n}(M^n, V^{\otimes n})$ the subspace of $A^n = E'(M^n, V^{\otimes n})$ consisting of distributions with wave front set contained in $\Xi_n$. We want to restrict to antisymmetric functionals that are elements of $F \doteq \prod_{n=0}^{\infty} F^n$. In analogy with the bosonic case we call these objects microcausal functionals. The space $F$ has two important subspaces: $F^+ \doteq \prod_{p=0}^{\infty} F^{2p}$ and $F^- \doteq \prod_{p=0}^{\infty} F^{2p+1}$. Elements of $F^+$ will be called even and elements of $F^-$ odd. Since we are using the weak topology on $A$, $F$ can be equivalently characterized as a space of functionals satisfying:

\[ \mathrm{WF}(F^{(n)}(u)) \cap (M^n \times (\bar{V}_+^n \cup \bar{V}_-^n)) = \emptyset, \quad \forall\, u \in C(M, V),\ n \in \mathbb{N}, \tag{1.23} \]

where $F^{(n)}(u)$ is treated as an element of $E'(M^n, V^{\otimes n})_c \,\hat{\otimes}\, A$. We can impose a more restrictive condition on the wave front set to define a subspace $F^p_{\mathrm{loc}} \subset F^p$ consisting of local functionals. A functional $F \in F^p$ is called local if it satisfies:

\[ \mathrm{WF}(F) \perp T\Delta_p(M), \tag{1.24} \]
and is supported on $\Delta_p(M)$, where $\Delta_p(M) \doteq \{(x, \ldots, x) \in M^p : x \in M\}$ is the thin diagonal of $M^p$. This more abstract notion agrees with the notion of locality (1.3) mentioned in the introduction. $F$ can be equipped with various topologies, for example the weak topology $\tau_\sigma$ inherited from $A$. Since we want to have control of the wavefront sets of functional derivatives, we shall use instead the so-called Hörmander topology [26]. Let $\Gamma_n \subset \Xi_n$ be a closed cone contained in $\Xi_n$. We introduce (following [1, 4, 26]) the family of seminorms on $E'_{\Gamma_n}(M^n, V^{\otimes n})$ given by: $p_{n,\varphi,C,k}(T) = \sup_{\xi \in C}\{(1 + |\xi|)^k\, |\widehat{\varphi T}(\xi)|\}$, where the index set consists of $(n, \varphi, C, k)$ such that $k \in \mathbb{N}_0$, $\varphi \in D(U)$ and $C$ is a closed cone in $\mathbb{R}^n$ with $(\operatorname{supp}(\varphi) \times C) \cap \Gamma_n = \emptyset$. These seminorms, together with the seminorms of the weak topology, provide a defining system for a locally convex topology denoted by $\tau_{\Gamma_n}$. To control the wavefront set properties inside open cones, we take an inductive limit. It is easy to see that, to form this inductive limit, one can choose the family of closed cones contained inside $\Xi_n$ to be countable. The resulting topology will be denoted by $\tau_{\Xi_n}$. The space $F$ can now be equipped with the direct product topology denoted by $\tau_\Xi$. Since each of the topologies $\tau_{\Gamma_n}$ is nuclear [4], $\tau_\Xi$ is nuclear as well. It also has the property of sequential completeness, so from now on we shall always take $F$ with this topology, unless stated differently. For $F \in F^p = E'_{\Xi_p}(M^p, V^{\otimes p})$ we shall often use the notation:

\[ F(u) = \sum_{a_1=1}^{k} \cdots \sum_{a_p=1}^{k} \int dx_1 \cdots dx_p\, F_{a_1 \cdots a_p}(x_1, \ldots, x_p)\, u^{a_1,\ldots,a_p}(x_1, \ldots, x_p), \tag{1.25} \]

where $F_{a_1,\ldots,a_p}(x_1, \ldots, x_p)$ is an integral kernel of a compactly supported distribution with the wavefront set contained in $\Xi_p$. Since $\Phi^{a_1}_{x_1} \wedge \cdots \wedge \Phi^{a_p}_{x_p}(u) = u^{a_1,\ldots,a_p}(x_1, \ldots, x_p)$ we can write (1.25) formally, and analogously to the bosonic case [16, 17], as:

\[ F(u) = \sum_{a_1=1}^{k} \cdots \sum_{a_p=1}^{k} \int dx_1 \cdots dx_p\, F_{a_1,\ldots,a_p}(x_1, \ldots, x_p)\, \Phi^{a_1}_{x_1} \wedge \cdots \wedge \Phi^{a_p}_{x_p}(u). \tag{1.26} \]

In the following we shall suppress the vector space indices of the $\Phi$'s whenever possible. With this notation (1.26) shall be written as: $F(u) = \int dx_1 \cdots dx_p\, F(x_1, \ldots, x_p)\, \Phi_{x_1} \wedge \cdots \wedge \Phi_{x_p}(u)$.
2. Dynamical Structure

2.1. Equations of motion

A generalized Lagrangian [4] $S$ is defined to be a map $S : D(M) \to F_{\mathrm{loc}}$ such that

1. $\operatorname{supp}(S(f)) \subseteq \operatorname{supp}(f)$, (2.1)
2. $S(f + g + h) = S(f + g) - S(g) + S(g + h)$, if $\operatorname{supp}(f) \cap \operatorname{supp}(h) = \emptyset$. (2.2)

The second condition is called additivity. One can think of it as a weaker replacement for the notion of linearity. It is crucial in quantum field theory, where one uses the test function as a localized coupling constant [4]. To admit interaction terms that are of higher order in the coupling (for example $\sim f^2$) one has to drop the linearity condition, whereas the additivity still holds. The "variation" of $S(f)$ is understood in the sense of (1.10), namely we require that $\langle S(f)^{(1)}(u), h \rangle = 0$ for all $h \in D(M, V)$ and $f \in D(M)$ such that $f \equiv 1$ on $K \doteq \operatorname{supp} h$. For this choice of $f$ we use the notation:

\[ \langle S(1)^{(1)}(u), h \rangle = 0. \tag{2.3} \]
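The additivity condition (2.2) can be illustrated on a grid. The following sketch is a toy model only: test functions are replaced by integer-valued lists, and the generalized Lagrangian by the hypothetical number-valued map `action(f)` $= \sum_i (f_i + f_i^2)$, which contains a term of second order in the coupling. None of these names come from the text.

```python
def action(f):
    """Toy 'Lagrangian' S(f) = sum_i (f_i + f_i**2): local on the grid, nonlinear in f."""
    return sum(x + x * x for x in f)


def add(*fs):
    """Pointwise sum of grid functions of equal length."""
    return [sum(vals) for vals in zip(*fs)]


def supports_disjoint(f, h):
    """True if f and h are never both nonzero at the same grid point."""
    return all(a == 0 or b == 0 for a, b in zip(f, h))


def additivity_defect(f, g, h):
    """S(f+g+h) - [S(f+g) - S(g) + S(g+h)]; this vanishes when (2.2) holds."""
    return action(add(f, g, h)) - (action(add(f, g)) - action(g) + action(add(g, h)))


if __name__ == "__main__":
    f = [1, 2, 0, 0, 0]
    g = [0, 3, 1, 4, 0]
    h = [0, 0, 0, 5, 2]
    print(supports_disjoint(f, h), additivity_defect(f, g, h))
```

The map is not linear, since `action(add(f, f))` differs from `2 * action(f)`, yet the defect vanishes whenever the supports of `f` and `h` are disjoint. For overlapping supports the defect is in general nonzero.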
Since we can choose $K$ arbitrarily large, it follows that the equations of motion must hold true on the whole of $M$. If $S(f) \in F^2_{\mathrm{loc}}$, we can interpret (2.3) as a system of partial differential equations for $u \in E = E(M, V)$. This is analogous to the bosonic case and allows us to define a solution space $E_S \subset E$. For higher order interactions this concept has to be modified. If a nonlinearity is present, Eq. (2.3) contains Grassmann-valued objects. Hence it cannot be seen as an equation for $\mathbb{C}$-valued functions. Therefore it is more convenient to work purely on the algebraic level. We define an ideal $J_S \subset F$ as the one generated (in the algebraic and topological sense) by the set $\{\langle S(1)^{(1)}(\cdot), h \rangle\}_{h \in D(M,V)} \subset F$ (equations of motion). Then we can construct the quotient $F/J_S$ and write $S(f)$ in terms of equivalence classes $[\Phi_x] \in F/J_S$. One can linearize a given Lagrangian $S(f)$ in terms of the second derivative. We always assume that the Lagrangian is even, i.e. $S(f)^{(2)}$ is an element of $E'(M^2, V^{\otimes 2})^a \,\hat{\otimes}\, F^+$. In some cases it can be inverted, so there exists $\Delta^* \in D'(M^2, V^{\otimes 2})^a \,\hat{\otimes}\, F^+$ such that:

\[ \langle S(1)^{(2)} *_a \Delta^*; h_1, \cdot \rangle = \delta *_a h_1. \tag{2.4} \]

Formally this can be written as:

\[ S(f)^{(2)} *_a \Delta^* = \delta *_a 1. \tag{2.5} \]

By retarded and advanced Green's functions $\Delta^{R/A}$ we mean solutions of (2.5) satisfying in addition:

\[ \operatorname{supp}(\Delta^R) \subset \{(x, y) \in M^2 \mid y \in J^-(x)\}, \tag{2.6} \]
\[ \operatorname{supp}(\Delta^A) \subset \{(x, y) \in M^2 \mid y \in J^+(x)\}. \tag{2.7} \]
In particular, if $S(f)^{(2)} \in E'(M^2, V^{\otimes 2})^a$ and it is a strictly hyperbolic operator, it can be shown that it has retarded and advanced Green's functions $\Delta^R$, $\Delta^A$.

2.2. Møller maps

In Sec. 2.1 we defined the "on-shell" algebra as the quotient $F/J_S$, where $J_S$ is the ideal generated by the equations of motion. Analogously to the classical theory of bosonic fields we would like to compare theories with different actions $S_1$, $S_2$. This can be done most conveniently at the algebraic level. One can construct analogs of off-shell Møller maps [3], $r_{S_1,S_2} : F \to F$, which intertwine the corresponding ideals $J_{S_1}$ and $J_{S_2}$. We require $r_{S_1,S_2}$ to have the following properties:

(1) if $G \in J_{S_1}$, then $r_{S_1,S_2} G \in J_{S_2}$ (the intertwining property),
(2) $r_{S_2,S_3} \circ r_{S_1,S_2} = r_{S_1,S_3}$,
(3) $(r_{S_1,S_2} G)(u) = G(u)$, $u \in C(M, V)$, if $\operatorname{supp}(u) \cap (\operatorname{supp}(S_1 - S_2) + \bar{V}_+) = \emptyset$,
(4) $G \mapsto r_{S_1,S_2}(G)$ is a homomorphism of $F$.

Now let $S_1 = S + \lambda F$ and $S_2 = S$ for $S, F \in F^+_{\mathrm{loc}}$, $|S| = 2$. We assume that for $S^{(2)}(x, y)$ there exist retarded and advanced Green's functions such that $\Delta^R(x, y) = -(\Delta^A(y, x))^T$. We want to construct $r_{S+\lambda F,S}$ as a series in $\lambda$. Assume first that $r_{S+\lambda F,S}$ exists and for fixed $G$ we have a smooth (in the sense of calculus on locally convex vector spaces) map $r_{\cdot,S}(G) : F^+_{\mathrm{loc}} \to F_{\mathrm{loc}}$. Then we can use the generalized Taylor series expansion to obtain:

\[ r_{S+\lambda F,S}(G) = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} (d^k r_{\cdot,S}(G))(S)[F^{\otimes k}] \doteq \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} R_{S,k}(F^{\otimes k}, G). \tag{2.8} \]

To construct $r_{S+\lambda F,S}$ we shall try to reverse this reasoning and define $r_{S+\lambda F,S}$ by its power series. Each term should be a $(k+1)$-linear map, symmetric in the first $k$ arguments. We require that the 0-th order term is the identity map, and the first order term is the retarded product of $F, G \in F_{\mathrm{loc}}$, that is

\[ R_{S,1}(F, G) = R_S(F, G) = \frac{d}{d\lambda}\Big|_{\lambda=0} r_{S+\lambda F,S}(G). \tag{2.9} \]

After [16] we call the higher order terms higher order retarded products. From the conditions on $r_{S+\lambda F,S}$, we deduce those which we want to impose on $R_{S,k}(F^{\otimes k}, \cdot)$. To fulfill 1 we postulate that:

\[ r_{S+\lambda F,S}(\langle S^{(1)} + \lambda F^{(1)}, h \rangle) = \langle S^{(1)}, h \rangle, \tag{2.10} \]

where $h \in D(M, V)$ and $\langle S^{(1)} + \lambda F^{(1)}, h \rangle \in J_{S+\lambda F}$. The fact that $J_{S+\lambda F}$ is generated by elements of this form, together with condition 4, already suffices to fulfill 1. From (2.10) follows a recursive condition on the retarded products:

\[ R_{S,k}(F^{\otimes k}, \langle S^{(1)}, h \rangle) = -k\, R_{S,k-1}(F^{\otimes (k-1)}, \langle F^{(1)}, h \rangle), \quad k > 0. \tag{2.11} \]
In particular, for $k = 1$ we have:

\[ R_S(F, \langle S^{(1)}, h \rangle) = -\langle F^{(1)}, h \rangle. \tag{2.12} \]

There is still a lot of freedom in defining $R_S$. In particular, we can use the analogy with bosonic fields and define it first for $F \in F^2_{\mathrm{loc}}$. In this case the equations of motion can be interpreted in terms of a dynamical system and one can define the corresponding Møller maps on the configuration space $E$. Let $E_S$ and $E_{S+\lambda F}$ be the solution spaces corresponding to the actions $S$ and $S + \lambda F$. It was already shown in [15] that the map $\tilde{r}_{S+\lambda F,S} : E_S \to E_{S+\lambda F}$ can be constructed perturbatively. It was proven for the scalar field but it can be easily generalized to the fermionic case for $|F| = 2$. The existence of nonperturbative solutions will be discussed in [7]. Therefore we postulate two more conditions on $r_{S+\lambda F,S}$:

(5) If $|F| = |S| = 2$ then $r_{S+\lambda F,S}(F^1) \subseteq F^1$,
(6) $r_{S+\lambda F,S}(G)(u) = G \circ \tilde{r}_{S+\lambda F,S}(u)$, $u \in E$, $F, S \in F^2$, $G \in F^1$.

From Conditions 5, 6 and [15, Proposition 1] it is clear that for $F, S \in F^2_{\mathrm{loc}}$ and $G \in F^1_{\mathrm{loc}}$ the retarded product can be expressed as:

\[ R_S(F, G) = -\langle G^{(1)}, \Delta^R *_a F^{(1)} \rangle_a. \tag{2.13} \]

To extend this definition to higher order functionals, we require the (graded) Leibniz rule in the left and right argument:

(7) $R_S(F_1 \wedge F_2, G) = F_1 \wedge R_S(F_2, G) + R_S(F_1, G) \wedge F_2$,
(8) $R_S(F, G_1 \wedge G_2) = G_1 \wedge R_S(F, G_2) + R_S(F, G_1) \wedge G_2$.

By continuity we can now extend $R_S$ to general $F \in F^+_{\mathrm{loc}}$ and $G \in F_{\mathrm{loc}}$. It is given by the same formula, namely (2.13). One can check with an explicit calculation that (2.12) is fulfilled. Now we proceed as in [16]. Condition 2 implies that:

\[ r_{S+\lambda F_1,S} \circ r_{S+\lambda F_1+\mu F_2,\,S+\lambda F_1} = r_{S+\lambda F_1+\mu F_2,\,S}. \tag{2.14} \]

The comparison of the coefficients with respect to the powers of $\mu$ yields:

\[ r_{S+\lambda F_1,S}(R_{S+\lambda F_1}(F_2, G)) = \frac{d}{d\mu}\Big|_{\mu=0} r_{S+\lambda F_1+\mu F_2,\,S}(G). \tag{2.15} \]
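Again we can compare the coefficients with respect to the powers of $\lambda$ and obtain the following recursive condition:

\[ R_{S,n+1}(F_1^{\otimes n} \otimes F_2, G) = -\sum_{l=0}^{n} \binom{n}{l} R_{S,l}\big(F_1^{\otimes l}, (-1)^{|F_2|+1} \langle F_2^{(1)}, \Delta^{A\,(n-l)}_{S+\lambda F_1} *_a G^{(1)} \rangle_a\big), \tag{2.16} \]

where

\[ \Delta^{A\,(k)}_{S+\lambda F_1} \doteq \frac{d^k}{d\lambda^k}\Big|_{\lambda=0} \Delta^A_{S+\lambda F_1} = (-1)^k\, k!\, \Delta^A_S *_a F_1^{(2)} *_a \Delta^A_S *_a \cdots *_a F_1^{(2)} *_a \Delta^A_S. \tag{2.17} \]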
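Formula (2.17) has the same structure as the Taylor expansion of a matrix inverse, $(A + \lambda B)^{-1} = \sum_k (-\lambda)^k A^{-1}(B A^{-1})^k$. The sketch below checks this finite-dimensional analogue exactly with rational arithmetic. It is an illustration only: the $2 \times 2$ matrices `A` and `B` stand in for $S^{(2)}$ and $F_1^{(2)}$, `delta` for $\Delta^A_S$, ordinary matrix multiplication for $*_a$, and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def scale(c, X):
    return [[c * x for x in row] for row in X]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def inverse_2x2(A):
    """Inverse of an invertible 2x2 matrix with Fraction entries."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def taylor_coefficient(delta, B, k):
    """Matrix analogue of (2.17): (-1)^k k! delta (B delta)^k."""
    term = delta
    for _ in range(k):
        term = matmul(term, matmul(B, delta))
    return scale(Fraction((-1) ** k * factorial(k)), term)


def truncated_inverse(delta, B, lam, n):
    """Taylor polynomial sum_{k<=n} lam^k / k! * taylor_coefficient(delta, B, k)."""
    size = len(delta)
    total = [[Fraction(0)] * size for _ in range(size)]
    for k in range(n + 1):
        weight = Fraction(lam) ** k / factorial(k)
        total = matadd(total, scale(weight, taylor_coefficient(delta, B, k)))
    return total
```

Multiplying the truncated series by $A + \lambda B$ gives the identity up to a remainder of order $\lambda^{n+1}$, namely $1 - (-\lambda)^{n+1}(B A^{-1})^{n+1}$, which is the matrix counterpart of the statement that the coefficients (2.17) invert $S^{(2)} + \lambda F_1^{(2)}$ order by order.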
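The formula for $\Delta^{A\,(k)}_{S+\lambda F_1}$ is the graded counterpart of [16, relation (42)]. In particular we can set $F_1 = F_2 = F$ and obtain:

\[ R_{S,n+1}(F^{\otimes (n+1)}, G) = -\sum_{l=0}^{n} \binom{n}{l} R_{S,l}\big(F^{\otimes l}, \langle F^{(1)}(x), \Delta^{A\,(n-l)}_{S+\lambda F} *_a G^{(1)} \rangle_a\big). \tag{2.18} \]

Now we can prove that the Taylor series (2.8) with coefficients given by (2.18) defines a map $r_{S+\lambda F,S} : J_{S+\lambda F,S} \to J_S$. First we have to check whether the recursive condition (2.11) is fulfilled. This is the case since we have:

\[ R_{S,k+1}(F^{\otimes (k+1)}, \langle S^{(1)}, h \rangle_a) = -\sum_{l=0}^{k} \binom{k}{l} R_{S,l}\big(F^{\otimes l}, \langle F^{(1)}, \Delta^{A\,(k-l)}_{S+\lambda F} *_a \langle S^{(2)}; \cdot, h \rangle_a \rangle_a\big) \]
\[ = k \sum_{l=0}^{k-1} \binom{k-1}{l} R_{S,l}\big(F^{\otimes l}, \langle F^{(1)}, \Delta^{A\,(k-l-1)}_{S+\lambda F} *_a \langle F^{(2)}; \cdot, h \rangle_a \rangle_a\big) - R_{S,k}(F^{\otimes k}, \langle F^{(1)}, h \rangle_a) \]
\[ = -(k+1)\, R_{S,k}(F^{\otimes k}, \langle F^{(1)}, h \rangle_a). \]

Next we show that $G \mapsto r_{S+\lambda F,S}(G)$ is a homomorphism of $F$.

Proposition 2.1. Let $S, F \in F^+$ and let $r$ be defined by the Taylor series (2.8) with the coefficients given by (2.18). Then it holds (in every order in $\lambda$):

\[ r_{S+\lambda F,S}(G \wedge H) = r_{S+\lambda F,S}(G) \wedge r_{S+\lambda F,S}(H). \tag{2.19} \]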
Proof. First we show the identity:

\[ R_{S,n}(F^{\otimes n}, G \wedge H) = \sum_{k=0}^{n} \binom{n}{k} R_{S,k}(F^{\otimes k}, G) \wedge R_{S,n-k}(F^{\otimes (n-k)}, H). \tag{2.20} \]
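The identity (2.20) has a familiar commutative analogue, sketched here under strong simplifying assumptions: the functionals are replaced by polynomials in one variable, the wedge product by the ordinary product, and $R_{S,k}(F^{\otimes k}, \cdot)$ by the $k$-th derivative $D^k$. In that toy setting (2.20) becomes the general Leibniz rule, and the Taylor series (2.8) becomes $\exp(\lambda D)$, which is multiplicative. The graded signs of the fermionic case do not appear, and all function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def poly_add(g, h):
    """Sum of two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(g), len(h))
    g = list(g) + [0] * (n - len(g))
    h = list(h) + [0] * (n - len(h))
    return [a + b for a, b in zip(g, h)]


def poly_mul(g, h):
    """Product of two polynomials."""
    out = [0] * (len(g) + len(h) - 1)
    for i, a in enumerate(g):
        for j, b in enumerate(h):
            out[i + j] += a * b
    return out


def trim(g):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    g = list(g)
    while len(g) > 1 and g[-1] == 0:
        g.pop()
    return g


def D(g, k=1):
    """k-th derivative d^k/dx^k; plays the role of the k-th order term."""
    g = list(g)
    for _ in range(k):
        g = [i * c for i, c in enumerate(g)][1:] or [0]
    return g


def leibniz_rhs(g, h, n):
    """Toy right-hand side of (2.20): sum_k C(n, k) D^k g * D^(n-k) h."""
    total = [0]
    for k in range(n + 1):
        term = poly_mul(D(g, k), D(h, n - k))
        total = poly_add(total, [comb(n, k) * c for c in term])
    return total


def exp_lambda_D(g, lam):
    """Finite Taylor series sum_k lam^k / k! D^k g (it terminates on polynomials)."""
    total = [0]
    for k in range(len(g)):
        total = poly_add(total, [Fraction(lam) ** k / factorial(k) * c for c in D(g, k)])
    return total
```

Comparing `D(poly_mul(g, h), n)` with `leibniz_rhs(g, h, n)` checks the toy version of (2.20), and comparing `exp_lambda_D(poly_mul(g, h), lam)` with the product of the two transformed factors checks the toy version of (2.19).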
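This can be proved by induction. For $n = 0$ we have: l.h.s. $= G \wedge H =$ r.h.s., since the 0-th order term is the identity map. Now we assume that (2.20) is satisfied at the order $n$ and prove the induction step:

\[ R_{S,n+1}(F^{\otimes (n+1)}, G \wedge H) = -\sum_{l=0}^{n} \binom{n}{l} R_{S,l}\big(F^{\otimes l}, \langle F^{(1)}, \Delta^{A\,(n-l)}_{S+\lambda F} *_a (G \wedge H)^{(1)} \rangle_a\big). \tag{2.21} \]

First we apply the graded Leibniz rule. With the use of the induction hypothesis and after changing the order of summation and renaming the indices, the first term of (2.21) can be written as:

\[ -\sum_{l=0}^{n} \binom{n}{l} R_{S,l}\big(F^{\otimes l}, \langle F^{(1)}, \Delta^{A\,(n-l)}_{S+\lambda F} *_a G^{(1)} \rangle_a \wedge H\big) \]
\[ = -\sum_{k=0}^{n} \sum_{l=k}^{n} \binom{n}{l} \binom{l}{k} R_{S,l-k}\big(F^{\otimes (l-k)}, \langle F^{(1)}, \Delta^{A\,(n-l)}_{S+\lambda F} *_a G^{(1)} \rangle_a\big) \wedge R_{S,k}(F^{\otimes k}, H) \]
\[ = -\sum_{k=0}^{n} \sum_{l=0}^{k} \binom{n}{k} \binom{k}{l} R_{S,l}\big(F^{\otimes l}, \langle F^{(1)}, \Delta^{A\,(k-l)}_{S+\lambda F} *_a G^{(1)} \rangle_a\big) \wedge R_{S,n-k}(F^{\otimes (n-k)}, H) \]
\[ = \sum_{k=0}^{n} \binom{n}{k} R_{S,k+1}(F^{\otimes (k+1)}, G) \wedge R_{S,n-k}(F^{\otimes (n-k)}, H). \tag{2.22} \]

The last equality is a consequence of the definition (2.18) of higher order retarded products. It follows now that:

\[ R_{S,n+1}(F^{\otimes (n+1)}, G \wedge H) = \sum_{k=0}^{n} \binom{n}{k} R_{S,k+1}(F^{\otimes (k+1)}, G) \wedge R_{S,n-k}(F^{\otimes (n-k)}, H) \]
\[ \quad + (-1)^{|H||G|} \sum_{k=0}^{n} \binom{n}{k} R_{S,k+1}(F^{\otimes (k+1)}, H) \wedge R_{S,n-k}(F^{\otimes (n-k)}, G) \]
\[ = \sum_{k=0}^{n+1} \binom{n+1}{k} R_{S,k}(F^{\otimes k}, G) \wedge R_{S,n-k+1}(F^{\otimes (n-k+1)}, H). \tag{2.23} \]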
This proves the induction step. From the induction principle it follows that (2.20) holds. Yet (2.20) is simply the Taylor expansion of (2.19), so (2.19) holds in the sense of formal power series.

This proves Condition 4. Together with (2.11) this implies that Condition 1 is fulfilled as well. Condition 3 is fulfilled because of the support properties of $\Delta^R$ and Condition 2 holds by definition. It remains to show that the series (2.8) indeed defines an element of $F$. To do this we have to check that in each degree we obtain a finite expression. First we assume that $F$ has only terms of degree higher than 2. Then we get convergence in each grade, since $R_{S,n}(F^{\otimes n}, G)$ has a degree that increases with $n$, i.e. $|R_{S,n}(F^{\otimes n}, G)| = |G| + n(|F| - 2) > |R_{S,n-1}(F^{\otimes (n-1)}, G)|$, and therefore the sum $(r_{S+\lambda F,S}(G))(u) = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} (R_{S,k}(F^{\otimes k}, G))(u)$, $u \in C(M, V)$, has only finitely many non-vanishing terms. For $|F| = 2$ we cannot use this argument, but we can instead construct $r_{S+\lambda F,S}$ with the use of $\tilde{r}_{S+\lambda F,S}$. The existence of $\tilde{r}$ has to be shown with the same argument as for the bosonic case [7].

As a final remark, we discuss the existence of an inverse mapping $r^{-1}_{S+\lambda F,S} : J_S \to J_{S+\lambda F,S}$ for $|F| > 2$. This map can always be defined as a formal power series, since the first term in the expansion (2.8) is the identity. Moreover, as discussed above, this provides a well defined element of $F$.

The retarded product given by formula (2.13) can be extended in the left argument from $F^+_{\mathrm{loc}}$ to $F_{\mathrm{loc}}$ by postulating (for homogeneous elements):

\[ R_S(F, G) \doteq (-1)^{|F|+1} \langle F^{(1)}, \Delta^R_S *_a G^{(1)} \rangle_a. \tag{2.24} \]

We can extend this definition to the whole of $F_{\mathrm{loc}}$ by linearity and continuity. The additional sign factor is necessary since we use only left derivatives instead of right and left ones. Analogously, the advanced product is defined as:

\[ A_S(F, G) \doteq (-1)^{|F|+1} \langle F^{(1)}, \Delta^A_S *_a G^{(1)} \rangle_a. \tag{2.25} \]
2.3. Peierls bracket

Let $\Delta(x, y) = \Delta^R(x, y) - \Delta^A(x, y) = (\Delta(y, x))^T$. For $F \in F^p_{\mathrm{loc}}$, $G \in F^q_{\mathrm{loc}}$ we can define a Poisson structure by a definition analogous to the bosonic case:

\[ \{F, G\}_S = R_S(F, G) - A_S(F, G), \tag{2.26} \]

where the retarded and advanced products are given by (2.24), (2.25). This can be written as:

\[ \{F, G\}_S = (-1)^{|F|+1} \langle F^{(1)}, \Delta_S *_a G^{(1)} \rangle_a. \tag{2.27} \]

We can extend this definition also to $F \in F^p$, $G \in F^q$. By [26, Theorem 8.2.10] the pointwise product of distributions appearing in (2.27) is well defined. Since $S^{(2)}$ is assumed to be even, so is $\Delta$. Therefore, for homogeneous elements, we have the graded anticommutativity:

\[ \{F, G\}_S = -(-1)^{|F||G|} \langle G^{(1)}, \Delta_S *_a F^{(1)} \rangle_a = -(-1)^{|F||G|} \{G, F\}_S. \tag{2.28} \]

By slightly modifying the proof of the Jacobi identity for the bosonic fields given in [29] one can show that for homogeneous elements $F, G, H \in F$:

\[ \{\{F, G\}_S, H\}_S (-1)^{|F||H|} + \{\{G, H\}_S, F\}_S (-1)^{|F||G|} + \{\{H, F\}_S, G\}_S (-1)^{|G||H|} = 0. \tag{2.29} \]

Also the graded derivation law is fulfilled, namely,

\[ \{F \wedge G, H\}_S = (-1)^{|G||H|} \{F, H\}_S \wedge G + F \wedge \{G, H\}_S. \tag{2.30} \]

In Sec. 2.1 we defined the ideal of $(F, \wedge)$ generated by the equations of motion for a given action functional $S$. We denoted this ideal by $J_S$. Now we prove that this is also a Poisson ideal with respect to the $\{\cdot, \cdot\}_S$ structure:

Proposition 2.2. Let $J_S$ be the $(F, \wedge)$-ideal generated by elements of the form $\langle S^{(1)}, h \rangle$. Then $J_S$ is a Poisson ideal of the Poisson algebra $(F, \{\cdot, \cdot\}_S)$.
Proof. A general element of $J_S$ can be written as a limit of a sequence of elements of the form $F \wedge \langle S^{(1)}, h \rangle$ for $F \in F$, $h \in D(M, V)$. Inserting this in formula (2.27) yields:

\[ \{F \wedge \langle S^{(1)}, h \rangle, G\}_S = (-1)^{|F|+|S|} \langle (F \wedge \langle S^{(1)}, h \rangle)^{(1)}, \Delta *_a G^{(1)} \rangle_a. \tag{2.31} \]

With the use of (2.5) and the definition of the causal propagator we obtain:

\[ \{F \wedge \langle S^{(1)}, h \rangle, G\}_S = (-1)^{|F|} (\langle F^{(1)}, \Delta *_a G^{(1)} \rangle_a) \wedge \langle S^{(1)}, h \rangle = -(\{F, G\}_S) \wedge \langle S^{(1)}, h \rangle \in J_S. \tag{2.32} \]

The above proposition shows that we can now take the quotient of $F$ by $J_S$ and we obtain the Poisson algebra of observables: $(F_S, \{\cdot, \cdot\}_S) \doteq (F/J_S, \{\cdot, \cdot\}_S)$. Different action functionals define different Poisson structures on $F$. It turns out that (as in the bosonic case [15]) those structures are intertwined by Møller maps.

Proposition 2.3. The retarded (advanced) Møller maps are canonical transformations for Poisson structures induced by action functionals. Namely:

\[ \{r_{S_2,S_1}(F), r_{S_2,S_1}(G)\}_{S_1} = r_{S_2,S_1}(\{F, G\}_{S_2}). \tag{2.33} \]

Proof. The proof is analogous to the bosonic case [15]. The infinitesimal version of (2.33) is simply:

\[ \{R_S(H, F), G\} + \{F, R_S(H, G)\} = R_S(H, \{F, G\}) + \frac{d}{d\lambda}\Big|_{\lambda=0} \big(R_{S+\lambda H}(F, G) - A_{S+\lambda H}(F, G)\big). \tag{2.34} \]

This in turn can be verified by a straightforward calculation, using the fact that:

\[ \langle (\Delta^A_S)^{(1)}, h \rangle = -\Delta^A_S *_a \langle S^{(3)}; h, \cdot, \cdot \rangle *_a \Delta^A_S, \tag{2.35} \]
\[ \frac{d}{d\lambda}\Big|_{\lambda=0} \Delta^A_{S+\lambda H} = -\Delta^A_S *_a H^{(2)} *_a \Delta^A_S. \tag{2.36} \]

The second identity can be proved analogously to [16, formula (37)].

3. Gross–Neveu Model

After introducing the general formalism, we shall now apply it in a concrete example. We want to construct the algebra of classical observables for the Gross–Neveu model. Since we shall do it perturbatively, we start with the free action, namely we consider a free Dirac field in Minkowski spacetime. Let $DM$ ($D^*M$) be the spinor (cospinor) bundle. We take the Whitney sum $DM \oplus D^*M$ and define the configuration space to be $E = E(DM \oplus D^*M)$, the set of smooth sections. Let $E \ni \tilde{u} = u \oplus \bar{u}$.
To be consistent with the standard approach we introduce the following notation for evaluation functionals:
$$\Psi^A_x(\tilde{u}) = u^A(x), \tag{3.1}$$
$$\bar{\Psi}^{\dot{B}}_x(\tilde{u}) = \bar{u}^{\dot{B}}(x). \tag{3.2}$$
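The generalized action functional for the free Dirac field takes the form:
$$S_0(f)(u) = \int dx\, f(x)\,\big(\bar{\Psi}_x \wedge (i\slashed{\partial} - m)\Psi_x\big)(u), \qquad f \in \mathcal{D}(M), \tag{3.3}$$
where $u \in \mathcal{C}(M, E)$ and the derivative $\partial_\mu$ is a weak derivative, i.e. $(\partial_\mu \Psi_x)(u) \doteq \Psi_x(\partial_\mu u) = (\partial_\mu u)(x)$. The equations of motion take the form:
$$\big\langle S_0(1)^{(1)}(\tilde{u}), \tilde{h}\big\rangle = \int dx\, \big(\bar{\Psi}_x \wedge (i\slashed{\partial} - m)\Psi_x\big)(\tilde{h} \wedge \tilde{u}) = \int dx\, \big\langle \tilde{h}(x), D\Psi_x \oplus -D^*\bar{\Psi}_x\big\rangle_E(\tilde{u}), \tag{3.4}$$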
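where $D\Psi_x \doteq (i\slashed{\partial} - m)\Psi_x$, $D^*\bar{\Psi}_x \doteq -\bar{\Psi}_x(i\overleftarrow{\slashed{\partial}} + m)$ and $\langle\cdot,\cdot\rangle_E$ denotes the dual pairing on $E$ induced by the pairing between spinors and cospinors. Let $J_0$ be the ideal generated by the elements $D\Psi_x \oplus -D^*\bar{\Psi}_x$. The on-shell algebra of functionals is defined as $\mathcal{F}/J_0$. The second derivative of $S_0(f)$ takes the form:
$$\big\langle S_0(1)^{(2)}; \tilde{h}_1, \tilde{h}_2\big\rangle = \int dx\, \big(\bar{h}_1 (i\slashed{\partial} - m) h_2 - \bar{h}_2 (i\slashed{\partial} - m) h_1\big). \tag{3.5}$$
It is convenient to write $S_0(1)^{(2)}(x, y)$ as a block matrix in the basis $(u^A, \bar{u}^{\dot{A}})$:
$$S_0(1)^{(2)}(x, y) = \delta(x - y)\begin{pmatrix} 0 & D^{*T}(x) \\ -D(x) & 0 \end{pmatrix}, \tag{3.6}$$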
where $T$ denotes the transpose of a matrix. We can construct the retarded and advanced Green's functions $\Delta_0^{R/A}$ using the fact that $DD^* = D^*D = \Box + m^2$. Let $G^{R/A}$ be the retarded/advanced Green's function for $(\Box + m^2)$. It can be shown that:
$$\Delta_0^{R/A}(x, y) = \begin{pmatrix} 0 & -D^*(x)G^{R/A}(x, y) \\ D^T(x)G^{R/A}(x, y) & 0 \end{pmatrix} \doteq \begin{pmatrix} 0 & K_0^{R/A} \\ -K_0^{*\,R/A} & 0 \end{pmatrix}. \tag{3.7}$$
The Peierls bracket for the free theory $\{\cdot,\cdot\}_0$ is given by Eq. (2.27) with the causal propagator $\Delta_0 = \Delta_0^R - \Delta_0^A$ determined by (3.7). The interacting action functional
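for the Gross–Neveu model in $D$ dimensions with $N$ Dirac spinors (colors) takes the form (with the spinor indices suppressed):
$$S(f)(u) = \int dx\, f(x)\Big(\sum_{a=1}^{N} \bar{\Psi}^a_x \wedge (i\slashed{\partial} - m)\Psi^a_x + \frac{\lambda g(x)}{2N}\sum_{a,b=1}^{N} (\bar{\Psi}^a_x \wedge \Psi^a_x) \wedge (\bar{\Psi}^b_x \wedge \Psi^b_x)\Big)(u), \tag{3.8}$$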
where $\lambda$ is a coupling constant and $g \in \mathcal{D}(M)$ is the spacetime cutoff for the interaction. Let $J$ be the ideal generated by elements of the form:
$$D\Psi^a_x + \frac{\lambda g(x)}{N}\sum_{b=1}^{N} (\bar{\Psi}^b_x \wedge \Psi^b_x) \wedge \Psi^a_x \;\oplus\; -\Big(D^*\bar{\Psi}^a_x + \frac{\lambda g(x)}{N}\sum_{b=1}^{N} (\bar{\Psi}^b_x \wedge \Psi^b_x) \wedge \bar{\Psi}^a_x\Big). \tag{3.9}$$
The equations of motion are then realized on the algebra $\mathcal{F}/J$. The second derivative reads:
$$\big\langle S(1)^{(2)}(1 \oplus \tilde{u}_1 \wedge \tilde{u}_2); \tilde{h}_1, \tilde{h}_2\big\rangle = \int d^D x\,\Big(\sum_{a=1}^{N} \bar{h}^a_1(x) D h^a_2(x) + \frac{\lambda g(x)}{N}\sum_{a,b=1}^{N} \bar{h}^a_1\,(\bar{u}_{1b} u^b_2 - \bar{u}_{2b} u^b_1)\, h^a_2\Big) - \text{c.c.} \tag{3.10}$$
One can read off from the above equation the distribution kernel of $S(1)^{(2)}$ to be (written formally):
$$S(1)^{(2)}_{ab}(x, y) = \delta(x - y)\,\delta_{ab}\begin{pmatrix} 0 & D^{*T}(x) + F^T(x) \\ -D(x) - F(x) & 0 \end{pmatrix}, \tag{3.11}$$
where $F(x) = \frac{\lambda g(x)}{2N}\sum_{b=1}^{N} \bar{\Psi}^b_x \wedge \Psi^b_x$. It turns out that $S(1)^{(2)}$ can be inverted and one can also impose support conditions on the inverse. Let $\Delta_I^{R/A}$ denote the inverse whose support satisfies condition (2.6) or (2.7), respectively. For every $u = \sum_{p=0}^{n} u^{(p)} \in \mathcal{C}(M, V)$ we have:
$$(\Delta_I^{R/A})_{ab}(x, y)(u) = \delta_{ab}\begin{pmatrix} 0 & K_I^{R/A}(x, y)(u) \\ -K_I^{*\,R/A}(x, y)(u) & 0 \end{pmatrix}, \tag{3.12}$$
where the corresponding distribution kernels are defined as:
$$K_I^{R/A}(x, y)(u) \doteq \Big(K_0^{R/A}(x, y) + \sum_{k=1}^{n/2} (-1)^k \int dz_1 \cdots dz_k\, K_0^{R/A}(x, z_1) K_0^{R/A}(z_1, z_2) \cdots K_0^{R/A}(z_k, y)\, F(z_1) \wedge \cdots \wedge F(z_k)\Big)(u). \tag{3.13}$$
Analogously for $K_I^{*\,R/A}$. We can conclude that the inverse of $S(1)^{(2)}$ exists as a distribution with values in $\mathcal{F}^+$. The Peierls bracket for the interacting theory $\{\cdot,\cdot\}_I$ is given by Eq. (2.27) with the causal propagator $\Delta_I = \Delta_I^R - \Delta_I^A$.

4. Deformation Quantization

Finally we come to the quantization. In this section we want to show how the formalism we introduced for the classical theory fits into the framework of deformation quantization introduced in [5, 14, 15]. We start with the free theory and introduce the interaction perturbatively. Let $S$ be the free action, i.e. $S \in \mathcal{F}^2$. We assume that $S(1)^{(2)}$ is a strictly hyperbolic operator on $E$. Let $\Delta$ be the corresponding causal propagator, i.e. $\Delta = \Delta^R - \Delta^A$. In the first step we consider only very regular elements of $\mathcal{F}$. Let $\mathcal{F}_{\mathrm{reg}} = \prod_{n=0}^{\infty} \mathcal{D}(M^n, V^{\otimes n})^a$. The quantum algebra is defined by deforming the $\wedge$-product on $\mathcal{F}_{\mathrm{reg}}$. We define the star product on $\mathcal{F}_{\mathrm{reg}}$ analogously to [4, 15, 31], but instead of a symmetric tensor product we use $\wedge$. First we introduce a graded functional differential operator $\Gamma_\Delta : \mathcal{F}_{\mathrm{reg}}^2 \to \mathcal{F}_{\mathrm{reg}}^2$:
$$\Gamma_\Delta(F, G) \doteq (-1)^{(|F|+1)}\,\frac{1}{2}\int dx\,dy\, \Delta(x, y) \cdot F^{(1)}(x) \wedge G^{(1)}(y), \tag{4.1}$$
where $F$, $G$ are homogeneous. Clearly $\Gamma_\Delta$ can be extended also to non-homogeneous elements of $\mathcal{F}_{\mathrm{reg}}$ by linearity. Let $\mathcal{F}_{\mathrm{reg}}[[\hbar]]$ denote the space of formal power series in $\hbar$ with coefficients in $\mathcal{F}_{\mathrm{reg}}$. It is equipped with the direct product topology induced by the topology of $\mathcal{F}_{\mathrm{reg}}$. By a slight abuse of notation we denote these topologies by the same symbol. The $\star$-product is defined as:
$$\star : \mathcal{F}_{\mathrm{reg}}[[\hbar]]^2 \to \mathcal{F}_{\mathrm{reg}}[[\hbar]], \qquad F \star G \doteq \exp(i\hbar\Gamma_\Delta)(F, G), \tag{4.2}$$
where $\exp(i\hbar\Gamma_\Delta)$ is a short-hand notation for the formal power series $\sum_{n=0}^{\infty} \frac{1}{n!}(i\hbar\Gamma_\Delta)^n$ and $\Gamma_\Delta^0(F, G) = F \wedge G$. With the star product (4.2) we can define the commutator as:
$$[F, G]_\star \doteq F \star G - (-1)^{|F||G|}\, G \star F. \tag{4.3}$$
In particular $\Phi(f) = \int dx\, f_a(x)\Phi^a_x$, a field smeared with a test function, is an element of $\mathcal{F}_{\mathrm{reg}}^1$. The $\star$-product of two such elements reads:
$$\Phi(f) \star \Phi(g) = \Phi(f) \wedge \Phi(g) + \frac{i\hbar}{2}\langle f, \Delta g\rangle, \tag{4.4}$$
where $\langle f, \Delta g\rangle \doteq \sum_{a,b}\int dx\,dy\, f_a(x)\Delta^{ab}(x, y) g_b(y)$. The corresponding (anti-)commutator takes the form:
$$[\Phi(f), \Phi(g)]_\star = i\hbar\langle f, \Delta g\rangle. \tag{4.5}$$
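As an example we can take the free Dirac field. The causal propagator written in the matrix form reads:
$$\Delta = \begin{pmatrix} 0 & K \\ -K^* & 0 \end{pmatrix}. \tag{4.6}$$
From this, it follows that:
$$[\Psi(g), \bar{\Psi}(f)]_\star = i\hbar\langle g, K f\rangle_E = -i\hbar\langle K^* g, f\rangle_E = [\bar{\Psi}(f), \Psi(g)]_\star, \tag{4.7}$$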
where $f \in \mathcal{D}(DM)$, $g \in \mathcal{D}(D^*M)$. This is the quantized algebra of free fields.

Now we want to introduce the interaction. Following [4] we use the relative S-matrix to this end. First we introduce the time-ordered product. Let $\Delta_D = \frac{1}{2}(\Delta^R + \Delta^A)$ be the Dirac propagator. One can define a map $\Gamma_{\Delta_D} : \mathcal{F}_{\mathrm{reg}}[[\hbar]] \to \mathcal{F}_{\mathrm{reg}}[[\hbar]]$ on homogeneous elements by:
$$\Gamma_{\Delta_D}(F) \doteq (-1)^{|F|}\,\frac{1}{2}\int dx\,dy\, \Delta_D(x, y) F^{(2)}(x, y). \tag{4.8}$$
The time-ordering operator is a map $T : \mathcal{F}_{\mathrm{reg}}[[\hbar]] \to \mathcal{F}_{\mathrm{reg}}[[\hbar]]$ defined as a formal power series:
$$T(F) \doteq \exp(i\hbar\Gamma_{\Delta_D})F. \tag{4.9}$$
The anti-time-ordering operator $T^{-1}$ is the formal inverse of $T$, defined by $T^{-1}(F) \doteq \exp(-i\hbar\Gamma_{\Delta_D})F$. The time-ordered product $\cdot_T$ on $T(\mathcal{F}_{\mathrm{reg}})$ is given by:
$$F \cdot_T G \doteq T(T^{-1}F \wedge T^{-1}G). \tag{4.10}$$
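It may help to note why iterated time-ordered products need no bracketing. The short computation below is only a sketch: it uses nothing beyond the definition (4.10), the relation $T^{-1}T = \mathrm{id}$ on formal power series, and associativity of the $\wedge$-product.

```latex
% Associativity of the time-ordered product, directly from (4.10).
\begin{align*}
(F\cdot_T G)\cdot_T H
  &= T\bigl(T^{-1}\,T(T^{-1}F\wedge T^{-1}G)\wedge T^{-1}H\bigr) \\
  &= T\bigl(T^{-1}F\wedge T^{-1}G\wedge T^{-1}H\bigr) \\
  &= T\bigl(T^{-1}F\wedge T^{-1}\,T(T^{-1}G\wedge T^{-1}H)\bigr) \\
  &= F\cdot_T(G\cdot_T H).
\end{align*}
```

In other words, $\cdot_T$ is the $\wedge$-product transported by $T$, so an $n$-fold product $F \cdot_T \cdots \cdot_T F$ is unambiguous.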
Following [4, 5] we now introduce the interaction by the formula of Bogoliubov. Let $F \in \mathcal{F}_{\mathrm{reg}}^+$ be an interaction term; then we define the formal S-matrix as:
$$\mathcal{S}(F) \doteq \sum_{n=0}^{\infty} \frac{1}{n!}\, T^n(F, \ldots, F) \doteq \sum_{n=0}^{\infty} \frac{1}{n!}\, F \cdot_T \ldots \cdot_T F. \tag{4.11}$$
With the use of $\mathcal{S}(F)$ one can define the interacting algebra along the lines of [4, 5, 14].

Now we want to extend our discussion to more singular elements of $\mathcal{F}$. In order to do this we need to replace the $\star$-product with an equivalent one, defined by means of a Hadamard solution satisfying the microlocal spectrum condition [4, 5, 14]. Since we consider here only Minkowski spacetime, we can choose for concreteness the Wightman 2-point function $\Delta_+ = \frac{i}{2}\Delta + \Delta_1$. We now replace $\frac{i}{2}\Delta$ in the definition of the star product with $\frac{i}{2}\Delta + \Delta_1$. This way we obtain an
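equivalent product $\star_{\Delta_1}$, which is related to the old one by the transformation $\alpha_{\Delta_1}$, defined as
$$\alpha_{\Delta_1}(F) \doteq \exp(\hbar\Gamma_{\Delta_1})F, \tag{4.12}$$
with
$$\Gamma_{\Delta_1}(F) \doteq (-1)^{|F|}\,\frac{1}{2}\int dx\,dy\, \Delta_1(x, y) F^{(2)}(x, y). \tag{4.13}$$
The relation between the star products $\star$ and $\star_{\Delta_1}$ reads:
$$F \star_{\Delta_1} G = \alpha_{\Delta_1}\big(\alpha^{-1}_{\Delta_1}F \star \alpha^{-1}_{\Delta_1}G\big). \tag{4.14}$$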
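Relation (4.14) says that the map $\alpha^{-1}_{\Delta_1}$ intertwines the two products, which is the sense in which they are equivalent. The lines below are a sketch of this standard argument, writing $\star$ for the original product and $\star_{\Delta_1}$ for the new one, and assuming that $\star$ is associative.

```latex
% Write \alpha = \alpha_{\Delta_1}. Applying \alpha^{-1} to (4.14):
\begin{align*}
\alpha^{-1}(F\star_{\Delta_1}G) &= \alpha^{-1}F\star\alpha^{-1}G .
\intertext{Associativity of $\star_{\Delta_1}$ then follows from that of $\star$:}
(F\star_{\Delta_1}G)\star_{\Delta_1}H
  &= \alpha\bigl(\alpha^{-1}(F\star_{\Delta_1}G)\star\alpha^{-1}H\bigr) \\
  &= \alpha\bigl((\alpha^{-1}F\star\alpha^{-1}G)\star\alpha^{-1}H\bigr) \\
  &= \alpha\bigl(\alpha^{-1}F\star(\alpha^{-1}G\star\alpha^{-1}H)\bigr) \\
  &= F\star_{\Delta_1}(G\star_{\Delta_1}H).
\end{align*}
```

Hence $\alpha^{-1}_{\Delta_1}$ is an algebra isomorphism between the regular functionals with the product $\star_{\Delta_1}$ and the same space with the product $\star$.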
The new star product can now be extended to the elements of $\mathcal{F}$. Consider now the map $\alpha^{-1}_{\Delta_1} : \mathcal{F}_{\mathrm{reg}}[[\hbar]] \to \mathcal{F}_{\mathrm{reg}}[[\hbar]]$. We equip the domain $\mathcal{F}_{\mathrm{reg}}[[\hbar]]$ with the topology $\tau_\Xi$ and define a topology $\tau_{\Delta_1}$ on the target space as the finest one that makes $\alpha^{-1}_{\Delta_1}$ continuous. Next we embed $(\mathcal{F}_{\mathrm{reg}}[[\hbar]], \tau_\Xi)$ as a dense topological vector space in $(\mathcal{F}[[\hbar]], \tau_\Xi)$ and take a sequential closure of the target space $(\mathcal{F}_{\mathrm{reg}}[[\hbar]], \tau_{\Delta_1})$ with respect to all the sequences $\alpha^{-1}_{\Delta_1}(F_n)$, where $F_n$ converges to an element of $\mathcal{F}[[\hbar]]$ with respect to $\tau_\Xi$. We denote this closure by $\mathcal{A}[[\hbar]]$, and from the construction it follows that $\alpha^{-1}_{\Delta_1} : \mathcal{F}[[\hbar]] \to \mathcal{A}[[\hbar]]$ is continuous. From the microlocal spectrum condition and the condition (1.23) on wavefront sets of functionals in $\mathcal{F}$ it follows that the star product is also continuous with respect to $\tau_{\Delta_1}$, so it can be extended to a product on $\mathcal{A}[[\hbar]]$. This way we obtain an involutive algebra $(\mathcal{A}[[\hbar]], \star)$.

The situation with the time-ordered product is more complicated. As in the bosonic case, this product is not continuous with respect to the topology $\tau_\Xi$. Nevertheless it is well defined for $F, G \in \mathcal{A}$ with disjoint supports. More generally we can define graded symmetric maps $T^n : \mathcal{F}[[\hbar]]^{\otimes n} \to \mathcal{A}[[\hbar]]$ by means of:
$$T^n(F_1, \ldots, F_n) := \alpha^{-1}_{\Delta_1}(F_1) \cdot_T \ldots \cdot_T \alpha^{-1}_{\Delta_1}(F_n), \tag{4.15}$$
for $F_1, \ldots, F_n$ with pairwise disjoint supports. The problem of renormalization is now formulated as the problem of extending the maps $T^n$ to functionals with coinciding supports. From the mathematical point of view it does not differ much from the bosonic case, except for the fact that we now deal with graded symmetric in place of symmetric distributions. As shown in [18], the extension of $T^n$ can be defined if the arguments are local functionals.

5. Conclusions

A proper understanding of the classical theory of fermions is essential for applying the functional approach in quantum field theory to anticommuting fields. In this paper, we proposed a formalism that generalizes that of [4, 7, 15–17] and stays consistent with [13, 22], where the functional approach was applied to free Dirac fields. We also showed that our setting can be used to treat fermion-fermion interactions, in
the example of the Gross–Neveu model. The algebraic structure we use can also describe more general nonlinear models, since we admit interaction terms that are defined by arbitrary (possibly infinite) power series. This generalizes the approach of [13, 22], where only finite sums are admitted. Our results can also be applied to the quantization of gauge theories with the BRST method, since ghost fields have to be of fermionic type. Furthermore, our approach allows one to treat odd and even variables on an equal footing and is therefore very natural to apply to the BV formalism. We also provide a notion of an "odd" derivative which can be used to make the formal calculations of BRST and BV quantization more precise. The next step is to apply the results concerning odd fields to a complete treatment of gauge field theories in the functional approach. This is done in [20].
Acknowledgments

I would like to thank K. Fredenhagen for enlightening discussions and remarks. Furthermore I want to thank R. Brunetti, C. Dappiaggi, K. Keller, P. Lauridsen Ribeiro and J. Zahn for valuable comments. I am also grateful to the Villigst Stiftung for the financial support of my Ph.D.
Appendix A. Topologies

We recall from [30] some of the important definitions from topology and distribution theory. Let $E$, $F$ be locally convex topological vector spaces (lcvs) such that there exists a dual pairing $\langle\cdot,\cdot\rangle : E \times F \to \mathbb{R}$. $E$ can be regarded as a linear subspace of $\mathbb{R}^F$, and the topology it inherits from the product topology of $\mathbb{R}^F$ is called the weak topology, denoted by $\sigma(E, F)$.

Definition A.1 ([30, 3.2]). A subset $U$ of a topological vector space $X$ is called complete (sequentially complete) if every Cauchy net (sequence) converges in $U$. We say that $X$ is quasi-complete if every closed bounded subset of $X$ is complete.

The properties of (sequential) completeness and quasi-completeness are inherited by infinite direct products and infinite direct sums [30, 3.3.5].

Definition A.2. Let $E$, $F$ be Hausdorff lcvs and $\mathcal{B}$ be the family of bounded sets of the completion of $E$ (bornology). Let $\tau_{\mathcal{B}}$ be the topology of uniform convergence on bounded sets it induces on $L(E, F)$. We say that $E$ has the (sequential) approximation property if one of the following equivalent conditions holds:
(1) $E' \otimes F$ is (sequentially) dense in $(L(E, F), \tau_{\mathcal{B}})$ for every $F$,
(2) $E' \otimes E$ is (sequentially) dense in $(L(E, E), \tau_{\mathcal{B}})$,
(3) $1_E$ is the $\tau_{\mathcal{B}}$-limit of some net (sequence) in $E' \otimes E$.
This property is also inherited by a direct product of a family of lcvs that are Hausdorff [30, 18.2.4]. Now we shall describe the topology of $\mathcal{A}$ in more detail. The family of seminorms for the space $\mathcal{E}(M)$ is given by:
$$p_{K,m}(\varphi) = \sup_{x \in K,\ |\alpha| \le m} |\partial^\alpha \varphi(x)|, \tag{A.1}$$
where $\alpha \in \mathbb{N}^N$ is a multiindex. A set $B \subset \mathcal{E}(M)$ is bounded if $\sup_{\varphi \in B}\{p_{K,m}(\varphi)\} < \infty$ for all seminorms $p_{K,m}$. Let $\mathcal{B}$ be the family of bounded sets in $\mathcal{E}(M)$. The strong topology on the dual space $\mathcal{E}'(M)$ is defined by a family of seminorms: $p_B(T) \doteq \sup_{\varphi \in B}\langle T, \varphi\rangle$, where $B \in \mathcal{B}$, $T \in \mathcal{E}'(M)$, $\varphi \in \mathcal{E}(M)$. The bounded sets in $\mathcal{C}(M, V)$ are finite products of bounded sets in the constituent spaces $\mathcal{E}(M^p, V^{\otimes p})$. It follows that the strong topology on $\mathcal{A}$ is generated by the family of seminorms: $p_{B_{i_1},\ldots,B_{i_m}}(T) \doteq \sum_{k=1}^{m} \sup_{\varphi \in B_{i_k}}\langle T, \varphi\rangle$, where $k \in \mathbb{N}$ and $B_{i_k} \subset \mathcal{E}(M^{i_k}, V^{\otimes i_k})^a$ is bounded.

The function spaces $\mathcal{S}(M)$, $\mathcal{D}(M)$, $\mathcal{E}(M)$, as well as their strong duals $\mathcal{S}'(M)$, $\mathcal{D}'(M)$, $\mathcal{E}'(M)$, are reflexive complete (therefore quasi-complete) nuclear spaces ([38]) and they have the (sequential) approximation property ([30, 49]). It is shown in [23] that every locally convex vector space is nuclear with its weak topology, so the weak duals $\mathcal{S}'(M)$, $\mathcal{D}'(M)$, $\mathcal{E}'(M)$ are trivially nuclear. Note that the spaces $\mathcal{E}(M^n, V^{\otimes n})^a$ defined in Sec. 1.1 are Fréchet nuclear spaces and this holds also for their countable direct sum $\mathcal{C}(M, V)$. The corresponding dual spaces $\mathcal{E}'(M^n, V^{\otimes n})^a$ are nuclear (with strong and weak topologies) and we can equip $\mathcal{A} = \mathcal{C}'(R, V) = \prod_{n=0}^{\infty} \mathcal{E}'(M^n, V^{\otimes n})^a$ with the direct product topology $\tau_b$ (or $\tau_\sigma$) induced by the strong (weak) topologies on each of the factors. The locally convex topological vector space $(\mathcal{A}, \tau_b)$ (or $(\mathcal{A}, \tau_\sigma)$) is also nuclear because nuclearity is preserved under countable direct products. For our purposes it is sufficient to use the weak topology $(\mathcal{A}, \tau_\sigma)$. Quasi-completeness is also preserved under taking direct products, so we conclude that $(\mathcal{A}, \tau_\sigma)$ is quasi-complete.

References

[1] Ch. Bär and K. Fredenhagen, Quantum Field Theory on Curved Spacetimes, Lecture Notes in Physics, Vol. 786 (Springer, 2009).
[2] F. A. Berezin, The Method of Second Quantization (Academic Press, 1966).
[3] F. Brennecke and M. Dütsch, Removal of violations of the Master Ward Identity in perturbative QFT, Rev. Math. Phys. 20 (2008) 119–172.
[4] R. Brunetti, M. Dütsch and K. Fredenhagen, Perturbative algebraic quantum field theory and the renormalization groups; arXiv:math-ph/0901.2038v2.
[5] R. Brunetti and K. Fredenhagen, Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds, Comm. Math. Phys. 208 (2000) 623–661.
[6] R. Brunetti, K. Fredenhagen and M. Köhler, The microlocal spectrum condition and Wick polynomials of free fields on curved spacetimes, Comm. Math. Phys. 180 (1996) 633–652; arXiv:gr-qc/9510056.
[7] R. Brunetti, K. Fredenhagen and P. Lauridsen-Ribeiro, Algebraic structure of classical field theory I. The case of a scalar field, work in progress.
[8] U. Bruzzo and R. Cianci, Structure of supermanifolds and supersymmetry transformations, Comm. Math. Phys. 95 (1984) 393–400.
[9] U. Bruzzo and R. Cianci, On the structure of superfields in a field theory on a supermanifold, Lett. Math. Phys. 11 (1986) 21–26.
[10] U. Bruzzo, Field theories on supermanifolds: General formalism, local supersymmetry, and the limit of global supersymmetry, in Proceedings, Topological Properties and Global Structure of Spacetime, Erice, Italy (1985), pp. 21–29.
[11] J. F. Cariñena and H. Figueroa, Hamiltonian versus Lagrangian formulation of supermechanics, J. Phys. A: Math. Gen. 30 (1997) 2705–2724.
[12] C. Dappiaggi, T. P. Hack, J. Moller and N. Pinamonti, Dark energy from quantum matter, arXiv:astro-ph/1007.5009.
[13] C. Dappiaggi, T.-P. Hack and N. Pinamonti, The extended algebra of observables for Dirac fields and the trace anomaly of their stress-energy tensor, Rev. Math. Phys. 21 (2009) 1241–1312.
[14] M. Dütsch and K. Fredenhagen, Algebraic quantum field theory, perturbation theory, and the loop expansion, Comm. Math. Phys. 219 (2001) 5–30; arXiv:hep-th/0001129.
[15] M. Dütsch and K. Fredenhagen, Perturbative algebraic field theory, and deformation quantization, in Proceedings of the Conference on Mathematical Physics in Mathematics and Physics, Siena, Italy (2000), pp. 151–160; arXiv:hep-th/0101079.
[16] M. Dütsch and K. Fredenhagen, The master Ward identity and generalized Schwinger–Dyson equation in classical field theory, Comm. Math. Phys. 243 (2003) 275–314; arXiv:hep-th/0211242.
[17] M. Dütsch and K. Fredenhagen, Causal perturbation theory in terms of retarded products, and a proof of the action Ward identity, arXiv:hep-th/0403213.
[18] H. Epstein and V. Glaser, The role of locality in perturbation theory, Ann. Inst. H. Poincaré A 19 (1973) 211–295.
[19] D. Franco and C. Polito, Supersymmetric field-theoretic models on a supermanifold, J. Math. Phys. 45 (2004) 1447–1473.
[20] K. Fredenhagen and K. Rejzner, Batalin–Vilkovisky formalism in the functional approach to classical field theory, arXiv:math-ph/1101.5112.
[21] D. J. Gross and A. Neveu, Dynamical symmetry breaking in asymptotically free field theories, Phys. Rev. D 10 (1974) 3235–3253.
[22] T. P. Hack, On the backreaction of scalar and spinor quantum fields in curved spacetimes: From the basic foundations to cosmological applications; arXiv:1008.1776 [gr-qc].
[23] H. Hogbe-Nlend and V. B. Moscatelli, Nuclear and Conuclear Spaces: Introductory Courses on Nuclear and Conuclear Spaces in the Light of the Duality "Topology-Bornology", North-Holland Mathematics Studies, Vol. 52, Notas de Matemática, Vol. 79 (North-Holland Publishing Co., 1981).
[24] R. S. Hamilton, The inverse function theorem of Nash and Moser, Bull. Amer. Math. Soc. (N.S.) 7(1) (1982) 65–222.
[25] S. Hollands, Renormalized quantum Yang–Mills fields in curved spacetime; arXiv:gr-qc/0705.3340v3.
[26] L. Hörmander, The Analysis of Linear Partial Differential Operators I: Distribution Theory and Fourier Analysis (Springer, 2003).
[27] L. A. Ibort and J. Marín-Solano, Geometrical foundations of Lagrangian supermechanics and supersymmetry, Rep. Math. Phys. 32 (1993) 385–409.
[28] A. Jadczyk and K. Pilch, Superspaces and supersymmetries, Comm. Math. Phys. 78 (1981) 373–390.
[29] S. Jakobs, Eichbrücken in der klassischen Feldtheorie, diploma thesis under the supervision of K. Fredenhagen, DESY-THESIS-2009-009, Hamburg (2009).
[30] H. Jarchow, Topological Vector Spaces (B. G. Teubner Stuttgart, 1981).
[31] K. J. Keller, Dimensional regularization in position space and a forest formula for regularized Epstein–Glaser renormalization, doctoral thesis; arXiv:1006.2148v1 [math-ph].
[32] H. Komatsu, Ultradistributions. III. Vector-valued ultradistributions and the theory of kernels, J. Fac. Sci. Univ. Tokyo Sect. IA Math. 29(3) (1982) 653–717.
[33] J. Monterde, Higher order graded and Berezinian Lagrangian densities and their Euler–Lagrange equations, Ann. Inst. H. Poincaré 56 (1992) 3–26.
[34] J. Monterde and J. Muñoz-Masqué, Variational problems on graded manifolds, in Mathematical Aspects of Classical Field Theory, Contemp. Math., Vol. 132 (Amer. Math. Soc., 1992), pp. 551–571.
[35] J. Monterde and J. A. Vallejo, The symplectic structure of Euler–Lagrange superequations and Batalin–Vilkoviski formalism, J. Phys. A 36 (2003) 4993–5009.
[36] K.-H. Neeb, Monastir lecture notes on infinite-dimensional Lie groups, http://www.math.uni-hamburg.de/home/wockel/data/monastir.pdf.
[37] R. E. Peierls, The commutation laws of relativistic field theory, Proc. R. Soc. London A 214 (1952) 143–157.
[38] A. Pietsch, Nuclear Locally Convex Vector Spaces (Springer-Verlag, 1972).
[39] G. Roepstorff, Pfadintegrale in der Quantenphysik (Vieweg, 1991).
[40] K. Sanders, Aspects of locally covariant quantum field theory, Rev. Math. Phys. 22 (2010) 381–430; arXiv:0911.1304v2 [math-ph].
[41] G. Sardanashvily, Classical field theory. Advanced mathematical formulation, Int. J. Geom. Methods Mod. Phys. 5 (2008) 1163–1189; arXiv:0811.0331v2 [math-ph].
[42] G. Sardanashvily, Graded infinite order jet manifolds, Int. J. Geom. Methods Mod. Phys. 4 (2007) 1335–1362.
[43] G. Giachetta, L. Mangiarotti and G. Sardanashvily, Advanced Classical Field Theory (World Scientific, 2009).
[44] G. Scharf, Finite Quantum Electrodynamics (Springer-Verlag, 1995).
[45] T. Schmitt, Functionals of classical fields in quantum field theory, Rev. Math. Phys. 7 (1995) 1249–1301.
[46] T. Schmitt, The Cauchy problem for abstract evolution equations with ghost and fermion degrees of freedom, arXiv:hep-th/9610096.
[47] T. Schmitt, Supermanifolds of classical solutions for Lagrangian field models with ghost and fermion fields, arXiv:hep-th/9707104.
[48] L. Schwartz, Théorie des Distributions (Hermann, 1950).
[49] L. Schwartz, Théorie des distributions à valeurs vectorielles. I, Ann. Inst. Fourier 7 (1957) 1–141.
[50] L. Schwartz, Théorie des distributions à valeurs vectorielles. II, Ann. Inst. Fourier 8 (1958) 1–209.
[51] W. E. Thirring, A soluble relativistic field theory, Ann. Phys. 3 (1958) 91–112.
[52] B. de Witt, The Global Approach to Quantum Field Theory I, II (Oxford University Press Inc., 2003).
December 17, J070-S0129055X11004515
2011 9:23 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 10 (2011) 1035–1062 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004515
REMARKS ON THE REEH–SCHLIEDER PROPERTY FOR HIGHER SPIN FREE FIELDS ON CURVED SPACETIMES
CLAUDIO DAPPIAGGI
Dipartimento di Fisica Nucleare e Teorica, Università degli Studi di Pavia, Italy and INFN Sezione di Pavia, Via Bassi 6, I-27100 Pavia (PV), Italy
[email protected]

Received 4 March 2011
Revised 24 September 2011

The existence of states enjoying a weak form of the Reeh–Schlieder property has been recently established on curved backgrounds and in the framework of locally covariant quantum field theory. Since only the example of a real scalar field has been discussed, we extend the analysis to the case of massive and massless free fields either of spin-1/2 or of spin-1. In the process, it is also shown that both the vector potential and the Proca field can be described as a locally covariant quantum field theory.

Keywords: Quantum field theory on curved spacetimes; Reeh–Schlieder states.

Mathematics Subject Classification 2010: 81T20, 81T05
1. Introduction

One of the main results in the algebraic approach to quantum field theory is the so-called Reeh–Schlieder theorem, which asserts that, in the framework of local algebras on Minkowski spacetime, the set of vectors generated from the vacuum by the polynomial algebra of any open region is dense in the underlying Hilbert space (see [24, Theorem 5.3.1]). From a physical point of view, this statement warns us that making precise the qualitative notion of localized states can be trickier than it appears at first glance. This property was shown, moreover, to hold true for a broader class of free fields and on a larger class of backgrounds in [38], where the existence of an algebraic state enjoying the Reeh–Schlieder property was proven for scalar, Dirac and Proca fields on a four-dimensional static manifold. This result prompted a further analysis in [35]: in the framework of locally covariant field theories it was shown that the construction of a state enjoying the Reeh–Schlieder property on an ultrastatic spacetime suffices to claim the existence of a state which
enjoys the very same property at least for a suitable region on an arbitrary four-dimensional globally hyperbolic spacetime. This is achieved by a careful use of a deformation argument first introduced in [20] and often used in the literature to prove the existence of states enjoying the so-called Hadamard condition [32]. Furthermore, in [35] it was also shown that a free real scalar field falls in the class of theories to which the mentioned result can be applied.

The goal of this paper is to prove that the existence of a state enjoying the Reeh–Schlieder property, at least on a suitable region, is not restricted to the example of a scalar field theory, but can be extended to higher spin field theories, most notably the Dirac field, the Proca field and the vector potential. The reason for this investigation originated not only from the desire to extend the axiomatic results on a Minkowski background to an arbitrary globally hyperbolic spacetime, but is also connected to some recent results in constructive algebraic quantum field theory. In particular, in [11] it was shown that warped convolutions, a deformation technique first developed for quantum field theories on Minkowski space and closely related to Rieffel's deformation [8], can also be applied on a rather broad class of curved backgrounds. Although the whole procedure is carried out in a model-independent, operator-algebraic framework, and although the deformed models share many structural properties with deformations of quantum field theories on a flat background, a natural and yet very difficult question to answer is whether the deformed model is actually inequivalent to the original one. Although a general answer is still not within reach, thanks to the results in [8], it was shown that this is indeed the case for a free Dirac field living on a Friedmann–Robertson–Walker spacetime with flat spatial section. The key point of the argument is the existence of a state for the undeformed theory which enjoys the Reeh–Schlieder property for a suitable open region, a fact which cannot hold true after the deformation. Hence, on account of this example, it is even more important to answer the question whether higher spin free field theories possess a state which enjoys at least locally the Reeh–Schlieder property.

While the case of a Dirac field is rather simple, since it was already proven in [36] that it can be described as a locally covariant quantum field theory, a similar analysis for the Proca field and for the vector potential seems not to be available in the literature. Hence, we organize the paper as follows: In Sec. 2.1, we introduce in the language of bundles the kinematical configurations for spin-1/2 and spin-1 fields on an arbitrary four-dimensional globally hyperbolic spacetime $(M, g)$. In Sec. 2.2 the classical dynamics are discussed thoroughly and the associated algebras of observables are defined. In Sec. 2.3 we recast the previous analyses in the language of general local covariance and prove that both the vector potential and the Proca field fit in this framework. Section 3 is entirely devoted to recalling the deformation argument; this leads to the identification of a state enjoying at least on a suitable region the Reeh–Schlieder property and to the proof that such a state exists also for higher spin free field theories.
2. Higher Spin Field: Classical Dynamics and Field Algebra

The determination of a classical dynamical system essentially involves two key steps: the construction of suitable kinematical configurations and the identification of a distinguished subset selected by the underlying dynamics. If one focuses only on non-interacting theories constructed on the flat Minkowski background, there exists a well-established procedure, first introduced by Wigner, which completely and unambiguously classifies the possible free fields and their equations of motion out of the unitary and irreducible representations of the Poincaré group. On a manifold with a non-trivial Lorentzian metric the situation is less clear, due to the potential absence of any isometry and hence of any Wigner-like analysis. In such a framework even the construction of a free field theory requires a more careful analysis and it is ultimately affected by some arbitrary choices. Even under the reasonable requirement that the generalization to curved backgrounds of a free field should coincide with the Minkowski one in the limit where the metric becomes flat, already the definition of suitable kinematically allowed configurations is a rather subtle affair.

The first problem one faces is the necessity to find a suitable notion of a "field with spin $j$", $j$ being any integer or half-integer number. Although for scalar and for spin-1 fields one can easily consider smooth functions and 1-forms respectively, the case $j = \frac{1}{2}$ is already less transparent. In this case one is forced to characterize the spin of a field in terms of geometric structures which are meaningful also on a large class of manifolds with a non-trivial geometry. An elegant solution to this potential difficulty can be found by combining the language of fiber bundles with the existence of a natural notion of Lorentz group connected to the frames defined at each point of a Lorentzian background. The first goal of this section is to recapitulate such an approach for the non-trivial scenario of the Dirac field; later we shall extend it to the case of spin 1. Since this last case was only briefly presented in this language in [39], we feel it worthwhile to devote a few lines to closing this small gap.

2.1. The bundle structures and the kinematical configurations

As a first step we shall introduce the kinematically allowed configurations for spin-1/2 and spin-1 fields, leaving the discussion of the dynamical ones to the next section. Due to a few intrinsic differences we shall always discuss the two cases separately. Furthermore, since there exists a recent and extensive literature on spinors, particularly [10, 34, 35, 39] but also [14], we will follow these references, to which we refer for a more extensive analysis. Before entering into the details, we fix once and for all our notion of spacetime:

Definition 2.1. A spacetime is a four-dimensional differentiable, connected, Hausdorff manifold $M$ endowed with a smooth Lorentzian metric $g$ whose signature is chosen as $(-, +, +, +)$. Furthermore, if $M$ is oriented, time oriented and it possesses a Cauchy surface, that is a closed achronal set $\Sigma$ whose domain of
dependence coincides with $M$, then $(M, g)$ is called globally hyperbolic. Henceforth we shall only consider these manifolds.

The kinematics of Dirac fields: As we have already mentioned above, the notion of Dirac field requires first of all a suitable definition of spin, and we shall now recall how this can be inferred from the geometric structures available on a generic spacetime.

Definition 2.2. The Lorentz frame bundle of $M$ is $LM = LM[SO_0(3, 1), \pi_L, M]$, where $SO_0(3, 1)$ is the component connected to the identity of the Lorentz group while $\pi_L : LM \to M$. This structure comes together with the free and transitive right action $R_L : SO_0(3, 1) \times LM \to LM$ such that $R_L(\Lambda, [\Lambda', p]) = [\Lambda\Lambda', p]$ for all $\Lambda \in SO_0(3, 1)$ and for all $[\Lambda', p] \in LM$.

In order to define a suitable notion of spin structure, we need to go one step beyond the above definition; in particular we shall exploit that $SL(2, \mathbb{C})$ is the double cover of $SO_0(3, 1)$. Hence,

Definition 2.3. We call spin structure of a spacetime $(M, g)$ the following two data:
(i) a spin bundle $SM \doteq SM[SL(2, \mathbb{C}), \pi_S, M]$ with $\pi_S : SM \to M$ and with a fiberwise right action $R_S$ of $SL(2, \mathbb{C})$,
(ii) a smooth bundle morphism $\rho : SM \to LM$ such that $\pi_L \circ \rho = \pi_S$ and $\rho \circ R_S(\tilde{\Lambda}) = R_L(\Lambda) \circ \rho$, where $\Lambda = \Pi(\tilde{\Lambda})$, $\Pi$ being the canonical surjective group homomorphism from $SL(2, \mathbb{C})$ to $SO_0(3, 1)$.

We recall that a spacetime $(M, g)$ admits a spin structure if and only if the second de Rham cohomology group with $\mathbb{Z}_2$ coefficients is trivial, $H^2(M, \mathbb{Z}_2) = \{0\}$, and that the number of inequivalent ones coincides with that of the equivalence classes in $H^1(M, \mathbb{Z}_2)$. Notice that in this paper all cohomologies will be of de Rham type and hence we will not specify it again. Furthermore it is important to stress that, as proven by Geroch [22, 23], all globally hyperbolic spacetimes have a trivial second cohomology group with $\mathbb{Z}_2$-coefficients. Hence, in the cases we consider, the existence of at least one spin structure is always guaranteed.
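For orientation, the homomorphism $\Pi$ entering Definition 2.3 can be written explicitly. The following is the standard textbook construction, given here only as a reminder; the symbols $X(x)$, $\sigma_\mu$ and $\eta$ are notation introduced for this sketch and do not appear in the text, with $\eta$ the flat metric of signature $(-, +, +, +)$.

```latex
% \sigma_0 is the 2x2 identity, \sigma_1, \sigma_2, \sigma_3 the Pauli matrices.
\begin{align*}
X(x) &\doteq x^\mu\sigma_\mu
      = \begin{pmatrix} x^0 + x^3 & x^1 - i x^2 \\ x^1 + i x^2 & x^0 - x^3 \end{pmatrix},
  & \det X(x) &= (x^0)^2 - |\vec{x}|^2 = -\eta(x,x), \\
\tilde\Lambda\, X(x)\, \tilde\Lambda^{\dagger} &= X\bigl(\Pi(\tilde\Lambda)\,x\bigr),
  & \Pi(-\tilde\Lambda) &= \Pi(\tilde\Lambda).
\end{align*}
```

Since $\det\tilde{\Lambda} = 1$, conjugation preserves $\det X(x)$ and hence $\eta(x, x)$, so $\Pi(\tilde{\Lambda})$ is a Lorentz transformation; the kernel of $\Pi$ is $\{\pm 1\}$, which is the double-cover statement used above.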
We can thus proceed as follows.
Definition 2.4. The Dirac (spinor) bundle of $(SM, \pi_S, R_S, \rho)$ is the associated vector bundle $DM := SM \times_T \mathbb{C}^4$, where $T := D^{(\frac{1}{2},0)} \oplus D^{(0,\frac{1}{2})}$ is a $SL(2,\mathbb{C})$-representation. Hence $DM$ is the set of equivalence classes $[(p, z)]$ where $p \in SM$, $z \in \mathbb{C}^4$ and $(p, z) \sim (p', z')$ if and only if there exists $\Lambda \in SL(2,\mathbb{C})$ such that $R_S(\Lambda)p = p'$ while $T(\Lambda^{-1})z = z'$. Hence $DM \equiv DM[\mathbb{C}^4, \pi_D, M]$ is a fiber bundle over $M$ with $\mathbb{C}^4$ as typical fiber. The projection map is inherited from $SM$, namely for every $[(p, z)] \in DM$, we set $\pi_D([(p, z)]) := \pi_S(p)$. Furthermore, we can introduce the dual Dirac bundle $D^*M$ as the $(\mathbb{C}^4)^*$-bundle over $M$ where we require that the points $(p_1, z_1^*)$ and $(R_S(\Lambda)(p_1), z_1^* T(\Lambda))$ are equivalent. Here $*$ is the adjoint with
December 17, J070-S0129055X11004515
2011 9:23 WSPC/S0129-055X
148-RMP
Remarks on the Reeh–Schlieder Property for Higher Spin Free Fields
1039
respect to the inner product. Consequently, the dual pairing between $\mathbb{C}^4$ and $(\mathbb{C}^4)^*$ extends to a well-defined fiberwise dual pairing between $DM$ and $D^*M$. We can now introduce the key object of a spin-$\frac{1}{2}$ field theory:
Definition 2.5. A spinor field is a smooth global section of the Dirac bundle, namely $\psi \in \Gamma(DM)$. The space $\Gamma(DM) := C^\infty(M, DM)$ is naturally endowed with the usual locally convex topology, that is, a sequence of sections $\psi_n \in \Gamma(DM)$ is said to converge to $\psi \in \Gamma(DM)$ if all derivatives of $\psi_n$ converge to the ones of $\psi$ uniformly on all compact subsets of $M$. Analogously, we call a cospinor field a section $\psi' \in \Gamma(D^*M)$, the latter space being also endowed with the same topology as $\Gamma(DM)$.
Notice that, together with smooth sections, it is important to consider also the space of smooth sections with compact support, $D(DM) = C_0^\infty(M, DM)$, equipped with the following topology: A sequence $f_k \in D(DM)$, $k \in \mathbb{N}$, converges to $f \in D(DM)$ if all $f_k$ and $f$ are supported in a compact subset $K \subset M$ and all derivatives of $f_k$ converge to the ones of $f$ uniformly in $K$. A similar definition holds for $D(D^*M) = C_0^\infty(M, D^*M)$. Furthermore there exists a global pairing between $\Gamma(DM)$ and $D(D^*M)$ (as well as between $\Gamma(D^*M)$ and $D(DM)$) by integrating the local pairing induced by the inner product on $\mathbb{C}^4$:
$$\langle \psi, f \rangle := \int_M d\mu(x)\, \psi(x)(f(x)), \qquad \forall\, \psi \in \Gamma(DM) \text{ and } \forall f \in D(D^*M).$$
To summarize, on a globally hyperbolic spacetime, a kinematically allowed configuration of a spinor is nothing but a smooth section of an associated bundle constructed out of the same representation of SL(2, C) which is used on Minkowski spacetime to characterize Dirac fields. Within this respect we can claim that ψ as in Definition 2.5 has spin- 12 . The generalization of this picture to the real or complex scalar field is rather straightforward, but it is certainly desirable not be limited to these cases, but, quite surprisingly, already spin-1 fields seem to have been treated in this language only briefly in [39] and in the massive case. The usual paradigm to consider in this scenario sections of the cotangent bundle of the underlying spacetime is of course flawless, yet we find worth to supplement it with a complementary approach which justifies also on curved backgrounds the notion of spin-1 fields. The kinematics of spin-1 fields: Let us briefly mention that, on Minkowski spacetime, a spin-1 field, be it the Proca field or the vector potential, is real and unambiguously characterized by its transformation under a suitable unitary 1 1 and irreducible representation of the Poincar´e group induced from the D( 2 , 2 ) representation of SL(2, C). Furthermore the reality condition for this field entails that such a representation boils down to the fundamental one of SO 0 (3, 1).
Therefore, in analogy with the Dirac case, but avoiding the notion of spin structure since it is not strictly necessary, we can proceed as follows:
Definition 2.6. The (real) Proca bundle of a spacetime $(M, g)$ is the associated bundle $PM := LM \times_\Lambda \mathbb{R}^4$, where $PM$ is the set of equivalence classes $[(p, v^\mu)]$, $p \in LM$ and $v^\mu \in \mathbb{R}^4$, where $(p, v^\mu) \sim (p', v'^\mu)$ if and only if there exists $\Lambda \in SO_0(3,1)$ with $p' = R_L(\Lambda)p$ and $v'^\mu = \Lambda^\mu{}_\nu v^\nu$, where $\Lambda^\mu{}_\nu$ stands for the real fundamental representation of $SO_0(3,1)$ on $\mathbb{R}^4$. In other words $PM = PM[\mathbb{R}^4, \pi_P, M]$ where $\pi_P(p, v^\mu) := \pi_L(p)$. A Proca field or a vector potential is $A \in \Gamma(PM)$, all the sections being endowed with the usual locally convex topology, that is, a sequence of sections $A_n \in \Gamma(PM)$ is said to converge to $A \in \Gamma(PM)$ if all derivatives of $A_n$ converge to the ones of $A$ uniformly on all compact subsets of $M$.
We stress that, in complete analogy with Definition 2.4, we could have started from an associated bundle constructed out of the $D^{(\frac{1}{2},\frac{1}{2})}$-representation of $SL(2,\mathbb{C})$, considering ultimately only the real smooth sections. This turns out to be completely equivalent to Definition 2.6 and hence it is justified to call $A \in \Gamma(PM)$ a field of spin-1. In order to see the connection with the standard picture of a Proca field or of a vector potential, we notice that $SO_0(3,1)$, the structure group, acts transitively on each fiber $\pi^{-1}(x) \sim \mathbb{R}^4$ for all $x \in M$. Hence, we are free to solder $PM$ and to conclude that there exists a linear isomorphism $\Theta : TM \to PM$, which, composed with the canonical metric-induced isomorphism $\Theta_g : T^*M \to TM$, yields the sought isomorphism $\tilde\Theta := \Theta_g \circ \Theta : T^*M \to PM$. This entails that to any section $A \in \Gamma(PM)$ we can associate via pull-back a unique one-form $\tilde A := \tilde\Theta^*(A) \in \Gamma(T^*M)$. Henceforth, $\tilde\Theta^*$ will be left implicit and $A$ will always refer to a section of $T^*M$.
To conclude notice also that the definition of P M is sometimes used directly as the one of the tangent bundle [27]. 2.2. The dynamical configurations and the associated algebras Since we have a full control of the kinematically allowed configurations for spin- 12 and spin-1 fields on a globally hyperbolic curved spacetime, we can focus on the dynamics. We shall split our discussion in three parts covering Dirac fields, Proca fields and vector potentials, respectively. Furthermore we will also introduce the algebra of observables of each of these theories since it is unambiguously determined out of the space of solutions of the equations of motion. The dynamics and the algebra of fields of Dirac fields: This scenario has been thoroughly discussed in the recent literature, see for example [10, 14, 25, 34] and we shall thus only recall the main ingredients and results. Notice that in this section we shall adopt the notation and the nomenclature of [10]. The dynamics for a spinor ψ ∈ Γ(DM ) and for a cospinor ψ ∈ Γ(D∗ M ) are ruled by the Dirac equation: Dψ = (−γ µ ∇µ + mI)ψ = 0, (1) D ψ = (γ µ ∇µ + mI)ψ = 0,
where m ≥ 0, I is the identity operator and where the γ-matrices are chosen as I2 0 0 σi , γi = , γ0 = 0 I2 −σi 0 σi with i = 1, 2, 3 being the standard Pauli matrices and I2 the 2 × 2 identity matrix. Furthermore ∆ : Γ(DM ) → Γ(DM ⊗ T ∗ M ) is the standard spin connection — see [10, Definition 2.9]. As proven in [10, 14], the operator D admits an advanced (− ) and a retarded (+ ) fundamental solution S ± : D(DM ) → Γ(DM ) fulfilling DS ± = id = S ± D, id being the identity map on the appropriate spaces. Furthermore S ± enjoy the standard support property supp(S ± (f )) ⊆ J ± (supp(f )) for all f ∈ D(DM ). An identical result holds true for the cospinor and for the operator D which thus also admits a unique advanced and retarded fundamental solution S∗± : D(D∗ M ) → Γ(D∗ M ) enjoying the same properties as S ± . Until this point, spinors and cospinors have been considered as completely distinct objects although, as customary in Minkowski spacetime, it is possible to relate them via a suitable mapping. To this avail, we recall that there exists the Dirac conjugation matrix, that is the unique β ∈ SL(4, C) such that (i) β ∗ = β, (ii) γa∗ = −βγa β −1 with a = 0, . . . , 3 and (iii) iβna γa > 0, n being timelike and future-directed. This object allows us to introduce the Dirac conjugation maps: . ·† : Γ(DM ) → Γ(D∗ M ), f † = f ∗ β, . ·† : Γ(D∗ M ) → Γ(DM ), h† = β −1 h∗ , where ∗ denotes the adjoint with respect to the inner product on C4 . Notice both that β is unique once a choice of the γ-matrices has been made and that, if we apply the Dirac conjugation maps twice consecutively, we obtain the identity. In particular in our scenario β = −iγ0 . As it holds true for bosonic field theories, the full control of the classical dynamics entails the possibility to introduce a natural algebra of observables. 
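The statement that applying the Dirac conjugation maps twice yields the identity can be checked in one line. The sketch below uses only property (i), $\beta^* = \beta$, together with the definitions $f^\dagger = f^*\beta$ and $h^\dagger = \beta^{-1}h^*$ given above; treating $f$ as a column and $h$ as a row vector in each fiber is our assumption about the conventions.

```latex
% Assumption: f is fiberwise a column vector in C^4, h a row vector in (C^4)^*,
% and * is the adjoint, so that (ab)^* = b^* a^* and f^{**} = f.
\begin{align*}
(f^\dagger)^\dagger &= \beta^{-1} (f^* \beta)^* = \beta^{-1} \beta^* f = \beta^{-1}\beta f = f, \\
(h^\dagger)^\dagger &= (\beta^{-1} h^*)^* \beta = h\, (\beta^{-1})^* \beta = h\, \beta^{-1} \beta = h,
\end{align*}
% where (\beta^{-1})^* = (\beta^*)^{-1} = \beta^{-1} follows from \beta^* = \beta.
```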
Within this respect, the first formulation of Dirac fields on curved backgrounds [14] treats “particles” and “antiparticles” as distinct objects although, since we are interested in the locally covariant properties of spinors, we shall follow a different approach first introduced in [1] and later used also by several authors, for example [10, 12, 17, 32–34]. This calls for considering spinors and cospinors as part of a single object; the building block of this idea is the direct sum of vector bundles DM ⊕ D∗ M out of which . we can define D = D(DM ⊕D∗ M ). It is the set of smooth and compactly supported sections endowed with the standard topology induced from that of D(DM ) and of D(D∗ M ) as per Definition 2.5. The Dirac conjugation maps above defined can be joined to form the new application Γ : D → D such that . Γ(f ⊕ h) = h† ⊕ f † , ∀ f ⊕ h ∈ D. (2) In order to define a suitable algebra we need a last datum, that is a sesquilinear form (, ) : D2 → C: . (3) (f, h) = −i f1† , Sh1 + i S∗ h2 , f2† ,
. . where f = f1 ⊕ f2 , h = h1 ⊕ h2 and , is the non-degenerate pairing between Γ(DM ) and D(D∗ M ) introduced in the previous section. Definition 2.7. We call algebra of fields of the Dirac field, the unital ∗-algebra F (M, g) generated by the identity and the abstract elements B(f ) with f ∈ D. They satisfy the following defining relations: (i) the map f → B(f ) is linear, (ii) B(Df1 ⊕ D f2 ) = 0 for all f1 ⊕ f2 ∈ D, where D and D are the operators defined in (1), (iii) B(Γf ) = B(f )∗ for all f ∈ D and with Γ defined as in (2), . (iv) {B(f )∗ , B(h)} = B(f )∗ B(h) + B(h)B(f )∗ = (f, h) for all f, h ∈ D and where ( , ) is defined as in (3). We remark that the notion of spinor and cospinor can be recovered from the generators of the field algebra by a suitable choice of the test-functions, namely . ψ(h) = B(0 ⊕ h) where ψ † (f ) = B(f ⊕ 0). Furthermore, since (, ) is a sesquilinear form, we can consider the coset space D/(Ker(S ⊕ S∗ )) and complete it to a Hilbert space H with respect to (, ). As a by-product the pair (H, Γ) allows the extension of F (M, g) to a C∗ -algebra, F(M, g) whose elements are bounded operators on H itself. Yet it is clear that not all elements of F(M, g) can be considered as genuine observables due to the anticommuting nature of spinors and cospinors. Hence, as a first step, we restrict our attention to Feven (M, g), the even subalgebra of F(M, g) whose elements are invariant under the map B(f ) → −B(f ). As shown in [10, Proposition 3.1], this suffices to guarantee that elements, whose supports are spacelike separated, commute. This is still not sufficient since we have also to ensure that all elements are “well behaving” under the action of an element of SL(2, C). Hence, we shall consider as algebra of observables the set . 
$A(M, g) \subset F_{even}(M, g)$ whose elements $A = \sum_n B(f_{n,1}) \cdots B(f_{n,2k_n})$ are (pointwise) invariant under a particular "action" $L_z(\Lambda)$ of any $\Lambda \in \mathrm{Spin}_0(3,1)$; given an arbitrary but fixed $z \in \mathbb{C}^4$ and any $p \in SM$, we first define $L_z(\Lambda)$ on $[(p, z)] \in DM$ and $[(p, z^*)] \in D^*M$ as
$$L_z(\Lambda)([(p, z)]) := [(p, T(\Lambda)z)], \qquad L_z(\Lambda)([(p, z^*)]) := [(p, z^* T(\Lambda^{-1}))],$$
where $T$ is the same representation of $SL(2,\mathbb{C})$ introduced in Definition 2.5. $L_z(\Lambda)$ can be straightforwardly extended to $DM \oplus D^*M$, subsequently to arbitrary outer tensor products of the latter, and finally to the test sections $f_{n,1} \otimes \cdots \otimes f_{n,2k_n}$ determining $A$. Since $L_z(\Lambda)$ depends on $z$, it is of course not a well-defined action in the strict sense, cf. [34, footnote on p. 74].
The dynamics and the Weyl algebra of Proca fields: We turn our attention to the case of spin-1 fields and, due to some sharp differences, we discuss separately the massless and the massive case. We shall start from the latter, already considered in
[16, 21]. We say that $A \in \Gamma(T^*M)$ is a dynamically allowed Proca field if
$$(\delta d + m^2)A = 0, \tag{4}$$
where $m^2 > 0$, while $d : \Omega^p(M) \to \Omega^{p+1}(M)$ is the exterior derivative, $\Omega^p(M)$ being the space of smooth real-valued $p$-forms. At the same time $\delta : \Omega^p(M) \to \Omega^{p-1}(M)$ is the coderivative defined as $\delta := *^{-1} d\, *$, $*$ being the Hodge dual. A remarkable property of these operators is that they can be combined together in the Laplace–de Rham operator $\Box := d\delta + \delta d$, which coincides up to curvature terms with $-\Box_g = -g^{\mu\nu}\nabla_\mu\nabla_\nu$, minus the wave operator. More precisely, since we are interested in 1-forms, it holds that, for all $\omega \in \Omega^1(M)$, in a local frame $(\Box\omega)_\mu = -\Box_g\,\omega_\mu + R_\mu{}^\nu \omega_\nu$, where $R_{\mu\nu}$ is the Ricci tensor built out of the metric. Hence, (4) can be rewritten as $(\Box - d\delta + m^2)A = 0$, which, in an arbitrary frame of $(M, g)$, assumes the better-known form:
$$(\Box_g - m^2)A_\mu + \nabla_\mu \nabla^\nu A_\nu - R_\mu{}^\nu A_\nu = 0. \tag{5}$$
At first glance this equation does not look hyperbolic, but if one applies the codifferential to (4), one obtains: $\delta(\delta d + m^2)A = m^2 \delta A = 0$. Whenever $m \neq 0$, this entails the Lorenz gauge condition, since $\delta A = 0$ reads $\nabla^\mu A_\mu = 0$ in local coordinates. Hence, every solution of (4) is also coclosed, which guarantees that the dynamics can be equivalently described by a second order hyperbolic partial differential equation:
$$(\Box + m^2)A = 0, \qquad \delta A = 0. \tag{6}$$
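The equivalence between (4) and the system (6) can be spelled out in two short steps. This is a sketch that writes $\Box$ for the Laplace–de Rham operator $d\delta + \delta d$ (the symbol is our notational assumption) and uses only $\delta^2 = 0$ and $m \neq 0$.

```latex
% Step 1: (4) implies coclosedness, because \delta^2 = 0.
% Step 2: on coclosed forms, \delta d coincides with \Box = d\delta + \delta d.
\begin{align*}
0 = \delta(\delta d + m^2)A &= \delta^2 dA + m^2\,\delta A = m^2\,\delta A
  \;\Longrightarrow\; \delta A = 0 \quad (m \neq 0), \\
0 = (\delta d + m^2)A &= (\delta d + d\delta + m^2)A = (\Box + m^2)A
  \quad \text{(using } \delta A = 0\text{)}.
\end{align*}
% Conversely, if (\Box + m^2)A = 0 and \delta A = 0, then
% (\delta d + m^2)A = (\Box - d\delta + m^2)A = 0, which is (4).
```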
The set of solutions of the first equation in (6) can be generated with the help of the so-called advanced ($-$) and retarded ($+$) fundamental solutions $E_m^\pm : \Omega_0^1(M) \to \Omega^1(M)$, where $\Omega_0^1(M)$ represents the set of smooth and compactly supported one-forms on $M$. Furthermore $(\Box + m^2) \circ E_m^\pm = I$ while $E_m^\pm \circ (\Box + m^2)|_{\Omega_0^1(M)} = I$, where $\circ$ stands for the composition of operators and $I$ is the identity map on $\Omega_0^1(M)$. The fundamental solutions enjoy the standard support properties, i.e., $\mathrm{supp}(E_m^\pm(f)) \subseteq J^\pm(\mathrm{supp}(f))$ for all $f \in \Omega_0^1(M)$. Furthermore, since $[\Box, \delta] = [\Box, d] = 0$, $E_m^\pm$ intertwine the action of both $\delta$ and $d$. This allows us to write the space of solutions for (6) as $S_m(M) := \Delta\,\Omega_0^1(M)$, where $\Delta := \Delta^+ - \Delta^-$ and $\Delta^\pm := E_m^\pm (I + m^{-2} d\delta)$, $I$ being the identity operator. See also [16], although a different signature for the metric is employed. Notice that the
Lorenz gauge condition is automatically implemented since, for all $f \in \Omega_0^1(M)$, the following chain of identities holds:
$$\delta\Delta(f) = \delta(E_m(I + m^{-2}d\delta)f) = E_m((\delta + m^{-2}\delta d\delta)f) = m^{-2}E_m((m^2 + \Box)\delta f) = 0.$$
Exactly as for real scalar fields, it is possible to single out the kernel of the causal propagator (see also [16, Lemma A.2]) by introducing the following set of equivalence classes:
$$[\Omega_0^1(M)]_m = \{[f] \mid f \in \Omega_0^1(M), \text{ and } f \sim f' \Leftrightarrow \exists \tilde f \in \Omega_0^1(M) \mid f - f' = (\delta d + m^2)\tilde f\}.$$
The space of solutions can be rewritten as $S_m(M) \equiv \Delta([\Omega_0^1(M)]_m)$ and, on account of the results of [3, Sec. 4.3, p. 129] in particular, it turns out that $S_m(M)$ is a weakly non-degenerate symplectic vector space if endowed with the following antisymmetric bilinear form
$$\sigma_m(A_f, A_h) := \int_M \Delta(f) \wedge *h, \qquad \forall\, A_f, A_h \in S_m(M). \tag{7}$$
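That test forms of the type $(\delta d + m^2)\tilde f$ lie in the kernel of the causal propagator, which is what the equivalence relation above quotients out, can be verified directly. The following sketch assumes that $E_m$ denotes the difference $E_m^+ - E_m^-$ and writes $\Box$ for $d\delta + \delta d$; both are our reading of the notation.

```latex
% Uses \delta^2 = 0 and E_m^\pm (\Box + m^2) = I on compactly supported 1-forms.
\begin{align*}
(I + m^{-2} d\delta)(\delta d + m^2)\tilde f
  &= \delta d \tilde f + m^2 \tilde f + m^{-2} d\,\delta^2 d \tilde f + d\delta \tilde f
   = (\Box + m^2)\tilde f, \\
\Delta\bigl((\delta d + m^2)\tilde f\bigr)
  &= (E_m^+ - E_m^-)(\Box + m^2)\tilde f = \tilde f - \tilde f = 0 .
\end{align*}
```

Hence two test forms differing by $(\delta d + m^2)\tilde f$ generate the same solution, consistently with the independence of (7) from the chosen representatives.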
Notice that both f and h are meant as arbitrary representatives of the equivalence classes in [Ω10 (M )]m generating Af and Ah , respectively; the symplectic form is manifestly independent from such a choice. With all these data it is possible to apply a standard construction to associate an algebra of observables to a massive spin-1 field: Definition 2.8. We call Weyl algebra of a Proca field on a four-dimensional globally hyperbolic spacetime (M, g) the unique (up to ∗-isometries) C ∗ -algebra Wm (M, g) associated to (Sm (M ), σm ) and generated by the elements W (Af ) for all Af ∈ Sm (M ) together with the defining relations: W (0) = 1,
$W(A_f)^* = W(-A_f)$, $\quad W(A_f)W(A_h) = e^{\frac{i}{2}\sigma_m(A_f, A_h)}\, W(A_f + A_h)$.
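Two immediate consequences of these defining relations may help the reader; they are a sketch derived only from the three relations and the antisymmetry of $\sigma_m$, and are not stated in this form in the text.

```latex
% Unitarity: \sigma_m(A_f, -A_f) = -\sigma_m(A_f, A_f) = 0 by antisymmetry.
\begin{align*}
W(A_f)\,W(A_f)^* &= W(A_f)\,W(-A_f)
  = e^{\frac{i}{2}\sigma_m(A_f, -A_f)}\, W(0) = 1 . \\
% Exchange relation: apply the product rule in both orders and compare.
W(A_h)\,W(A_f) &= e^{-\frac{i}{2}\sigma_m(A_f, A_h)}\, W(A_f + A_h)
\;\Longrightarrow\;
W(A_f)\,W(A_h) = e^{\,i\,\sigma_m(A_f, A_h)}\, W(A_h)\,W(A_f) .
\end{align*}
```

In particular, two generators commute whenever $\sigma_m(A_f, A_h) = 0$.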
Furthermore this algebra satisfies the time-slice axiom, that is, if Σ is a Cauchy surface of (M, g) and O an open globally hyperbolic subset of (M, g) containing Σ, it holds Wm (M, g) = Wm (O, g|O ). The last statement of this definition is a direct consequence of [16, Lemma A.3]. The dynamics and the Weyl algebra of massless spin-1 fields: The scenario, in which (4) is taken with m = 0, is more involved and thus it requires a careful analysis on its own. Notice that previous analyses are present in [15, 16] although, in both papers, it is assumed that the Cauchy surface Σ of the underlying globally hyperbolic background is compact, and, in the second, it is added the further hypothesis that the first singular homology group of Σ is trivial. On the contrary we will not make any assumption on the topology of the underlying Cauchy surface. If we focus on
the equation of motion, this could be seen as originating from two different paths:
(i) The first calls for considering the curved spacetime analogue of Maxwell equations, that is a two-form, called field strength, $F \in \Omega^2(M)$ which fulfils $dF = 0$ and $\delta F = 0$. Unless the second cohomology group $H^2(M)$ is trivial, we cannot apply the Poincaré lemma to conclude that there exists a globally defined $A \in \Omega^1(M)$ such that $F = dA$. This obstruction entails that not all field strength tensors which are dynamically admissible descend from a global vector potential; yet those which can be constructed in this way originate from vector potentials $A \in \Omega^1(M)$ obeying the equation of motion $\delta dA = 0$. For a recent analysis focused on the field strength tensor, refer to [28].
(ii) The second calls for considering the vector potentials $A \in \Omega^1(M)$ as the building block of the theory. Their dynamics are ruled by the action $S[A] = \frac{1}{4}\int_M dA \wedge *dA$, where $*$ is the Hodge dual. The associated Euler–Lagrange equation is $\delta dA = 0$ and accordingly we define:
$$\mathcal{M} := \{[A],\ A \in \Omega^1(M),\ \delta dA = 0 \text{ and } A \sim A' \Leftrightarrow \exists \Lambda \in \Omega^1(M),\ d\Lambda = 0 \text{ and } A - A' = \Lambda\}.$$
In this paper we shall focus on the second scenario even though all vector potentials dynamically allowed might not exhaust the possible set of field strengths as if we were considering $F$ as the main object. Nonetheless, even in this picture $dA$ still remains the basic observable and this entails the existence of a gauge freedom, namely two vector potentials which differ by a smooth and closed 1-form are indistinguishable. This justifies the definition of $\mathcal{M}$. Such freedom plays a further key role also in the study of the classical dynamics of a vector potential: Contrary to the massive case, $\delta A = 0$ is not identically satisfied by all solutions of (4) with $m = 0$. Yet the following lemma avoids any possible complication:
Lemma 2.1. Every solution of (4) with $m = 0$ is gauge equivalent to one of
$$\Box A = 0, \qquad \delta A = 0, \tag{8}$$
where $A \in \Omega^1(M)$.
Proof. By direct inspection every solution of (8) solves $\delta dA = 0$. Hence, let us consider any $A$ solving the latter equation and let us look for $\Lambda \in \Omega^1(M)$ such that $d\Lambda = 0$ and $\Box(A + \Lambda) = 0$, $\delta(A + \Lambda) = 0$. In particular let us choose $\chi \in C^\infty(M)$ and let us fix $\Lambda := -d\chi$; in order to guarantee the coclosedness of $A' := A - d\chi$, it must hold $\delta A = \delta d\chi = \Box\chi$. This constraint on $\chi$ entails that $A'$ automatically solves the wave equation since $\Box A' = \Box A - \Box d\chi = \Box A - d\Box\chi = \Box A - d\delta A = \delta dA = 0$. To conclude we notice that the existence of at least one function $\chi$ satisfying the above requisites descends from [2, Chap. 3, Corollary 5]; it guarantees the existence of a smooth solution of a scalar equation of the form $\Box\chi = f$ with $f \in C^\infty(M)$ provided that smooth initial data are assigned.
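As an illustration of Lemma 2.1, the argument can be written in components on Minkowski spacetime. This is a sketch under our own assumptions: Cartesian coordinates, $\partial^2 := \partial^\mu\partial_\mu$ for the flat d'Alembertian, and $F_{\mu\nu} := \partial_\mu A_\nu - \partial_\nu A_\mu$; the component form is chosen so that the overall sign convention for $\delta$ plays no role.

```latex
% Flat-space form of \delta dA = 0 (up to an overall sign):
\begin{align*}
\partial^\mu F_{\mu\nu} = \partial^2 A_\nu - \partial_\nu(\partial^\mu A_\mu) &= 0 . \\
% Gauge fixing: pick \chi with \partial^2 \chi = \partial^\mu A_\mu and set A' = A - d\chi.
\partial^\mu A'_\mu = \partial^\mu A_\mu - \partial^2 \chi &= 0,
\qquad F'_{\mu\nu} = F_{\mu\nu}, \\
\partial^2 A'_\nu = \partial^\mu F'_{\mu\nu} + \partial_\nu(\partial^\mu A'_\mu) &= 0 .
\end{align*}
```

The last line is the flat-space counterpart of (8): the gauge-transformed potential satisfies both the wave equation and the Lorenz condition.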
Yet since the actual physical system is represented by the elements of $\mathcal{M}$, we cannot simply focus on the solutions of (4), but we have to identify those vector potentials solving (8) and lying in the same equivalence class of $\mathcal{M}$:
Lemma 2.2. Let $\mathcal{L}$ be the set of equivalence classes of smooth vector potentials solving (8), where $A \sim A'$ if and only if there exists $\Lambda \in \Omega^1(M)$ such that $d\Lambda = 0$, $\delta\Lambda = 0$ and $A' = A + \Lambda$. Then there exists a bijection $\varphi : \mathcal{M} \to \mathcal{L}$ such that $\varphi([A]) = [A']$, where $[A']$ is the equivalence class in $\mathcal{L}$ generated by the 1-form $A' = A - d\chi$, where $\chi \in C^\infty(M)$ is any solution of $\Box\chi = \delta A$.
Proof. The map $\varphi$ is well-defined since Lemma 2.1 guarantees that $A'$ solves (8) and thus $[A'] \in \mathcal{L}$. Furthermore $[A']$ does not depend on $\chi$. As a matter of fact we have the freedom to choose $\chi, \chi' \in C^\infty(M)$ such that $\Box\chi = \Box\chi' = \delta A$, but $d\chi \neq d\chi'$. Yet $A' = A - d\chi = A - d\chi' - d(\chi - \chi')$ and, if we consider $\Lambda := d(\chi' - \chi)$, it is immediate that $d\Lambda = 0$ and $\delta\Lambda = \Box(\chi' - \chi)$, which vanishes. We show that $\varphi$ is a bijection. Since the map is linear, injectivity of $\varphi$ is guaranteed if we show that only $[0]_{\mathcal{M}} \in \mathrm{Ker}(\varphi)$. Suppose this is not true; then there would exist a non-trivial $[A] \in \mathcal{M}$ such that $\varphi([A]) = [0]_{\mathcal{L}}$. This is not possible since, per definition of $\varphi$, there exists $\chi \in C^\infty(M)$ such that $A + d\chi = 0$ and $\Box\chi = \delta A$. Yet, on account of the definition of $\mathcal{M}$, this entails that $A \in [0]_{\mathcal{M}}$. Surjectivity instead can be easily proven by noticing that every $[A] \in \mathcal{L}$ is generated by $A$, a solution of (8) and thus also of (4). Hence $A$ also generates an equivalence class in $\mathcal{M}$, and any other representative of $[A] \in \mathcal{L}$ would fall in the same equivalence class since two elements differ by a smooth 1-form which is both closed and co-closed.
As a by-product of this lemma we can focus only on the solutions of (8) and, as usual, we are interested in those generated by compactly supported initial data.
In order to account automatically for the Lorenz condition, we use a procedure first implemented in [15], see Proposition 4 in particular: We define as space of test functions
$$\Omega^1_{0,\delta}(M) := \{f \in \Omega_0^1(M) \text{ such that } \delta f = 0\},$$
from which we construct $S(M) = \{A \in \Omega^1(M) \mid \exists f \in \Omega^1_{0,\delta}(M) \mid A = E(f)\}$, where $E$ is the causal propagator associated to the $\Box$-operator. Yet, on account of the definition of $\mathcal{L}$, it is still possible to assign two distinct initial data yielding two 1-forms which are in the same equivalence class and we want to single out such a possibility.
Definition 2.9. We call $[\Omega^1_{0,\delta}(M)]$ the set of equivalence classes $[f]$ constructed out of the following equivalence relation: $f \sim f'$ if $f, f' \in \Omega^1_{0,\delta}(M)$ and if there exists $\beta \in \Omega_0^2(M)$ such that $f - f' = \delta\beta$ and $d\beta = 0$.
Then we define the space of Lorenz solutions with compactly supported initial data $\mathcal{L}_0(M) := E[\Omega^1_{0,\delta}(M)]$, where $E$ is the causal propagator of $\Box$ in $(M, g)$. We also introduce $\mathcal{M}_0(M) := \varphi^{-1}(\mathcal{L}_0(M))$, $\varphi$ being the map defined in Lemma 2.2.
Notice that the equivalence relation between two test functions can be justified as follows: Let $f, f'$ be any two representatives of $[f] \in [\Omega^1_{0,\delta}(M)]$. Then $E(f) - E(f') = E(f - f') = E(\tilde f)$ where $\tilde f \in \Omega^1_{0,\delta}(M)$; hence $\delta E(\tilde f) = 0$. Furthermore, since $f$ and $f'$ should generate two solutions in the same equivalence class of $\mathcal{L}$, it must hold $dE(\tilde f) = E(d\tilde f) = 0$, which entails that there exists $\beta \in \Omega_0^2(M)$ such that $d\tilde f = \Box\beta$. Yet $\Box\tilde f = d\delta\tilde f + \delta d\tilde f = \delta\Box\beta = \Box\delta\beta$, which, due to the compactness of the supports of all forms, yields $\tilde f = \delta\beta$. Furthermore, since $d\tilde f = \Box\beta$, it holds $0 = \Box d\beta$ and thus $d\beta = 0$, $\beta$ being compactly supported.
In order to discuss the quantization of the vector potential, we can proceed by constructing an associated Weyl algebra. To this end we need to identify a suitable symplectic form on $\mathcal{M}$ or rather, in view of the preceding discussion, on $\mathcal{M}_0$. The auxiliary space of Lorenz solutions is rather handy since
Proposition 2.1. The set $\mathcal{L}_0(M)$ is a symplectic vector space if $H^1(M)$ is trivial and if endowed with the following weakly non-degenerate symplectic form:
$$\sigma([A_f], [A_h]) = \int_M E(f) \wedge *h, \tag{9}$$
where $f, h$ are any representatives of $[f], [h] \in [\Omega^1_{0,\delta}(M)]$ while $*$ stands for the Hodge operator. This automatically induces a weakly non-degenerate symplectic form $\sigma_{\mathcal{M}_0} : \mathcal{M}_0 \times \mathcal{M}_0 \to \mathbb{R}$ defined as the pull-back
$$\sigma_{\mathcal{M}_0}([A], [A']) := \varphi^*\sigma([A], [A']) := \sigma(\varphi[A], \varphi[A']), \qquad \forall\, [A], [A'] \in \mathcal{M}_0.$$
Proof. As a starting point let us notice that the right-hand side of (9) contains specific representatives of the equivalence classes appearing on the left-hand side. Hence, in order for (9) to be well-defined, one must prove that the right-hand side is independent of the choice of the representative. Since $\sigma$ is bilinear and antisymmetric, we can just consider one of the arguments and, thus, suppose that $f, f'$ are two representatives of the same class $[f]$. Hence $f - f' = \tilde f \in \Omega^1_{0,\delta}(M)$ with $\tilde f = \delta\beta$, $\beta$ being a closed 2-form of compact support. Since $dE(\tilde f) = 0$ and since the first cohomology group of $M$ is trivial, we know that there exists $\lambda \in C^\infty(M)$ such that $E(\tilde f) = d\lambda$. Therefore
$$\int_M E(\tilde f) \wedge *h = \int_M d\lambda \wedge *h = \int_M \lambda \wedge *\delta h = 0,$$
where in the last identity we used the fact that the intersection of the supports of $\lambda$ and $h$ is compact. Notice that, without the topological assumption on the cohomological structure of $M$, there is no apparent way to claim that $\int_M E(\tilde f) \wedge *h$ vanishes even knowing that $\tilde f$ is both closed and coclosed. Since $\varphi$ is a bijection as per Lemma 2.2 and since $\mathcal{M}_0$ is per definition the pre-image of $\mathcal{L}_0$, we only
need to prove that $(\mathcal{L}_0, \sigma)$ is a weakly non-degenerate symplectic vector space. This is tantamount to showing that, if $\sigma([A_f], [A_h]) = 0$ for all $[A_h] \in \mathcal{L}_0$, then $[A_f]$ must be $[0]$. Let us suppose that this is not true and let us assume that $h = \delta\tilde\lambda$ with $\tilde\lambda \in \Omega_0^2(M)$. This entails that
$$\int_M E(f) \wedge *\delta\tilde\lambda = \int_M dE(f) \wedge *\tilde\lambda = 0.$$
Since both $dE(f)$ and $*\tilde\lambda$ are two-forms, the non-degenerateness of the pairing between 2-forms and the arbitrariness of $\tilde\lambda$ yield that $dE(f) = 0$. Since per hypothesis also $\delta E(f)$ vanishes, it means that $E(f)$ falls in the same equivalence class as $0$ in $\mathcal{L}$. Since $f$ is also compactly supported, the desired conclusion follows.
We are now in a position to define the algebra of observables for a massless spin-1 field:
Definition 2.10. We call Weyl algebra of the spin-1 massless field on $(M, g)$ with $H^1(M) = \{0\}$ the unique (up to $*$-isometries) $C^*$-algebra $W_0(M, g)$ associated to the symplectic space $(\mathcal{M}_0, \sigma_{\mathcal{M}_0})$, which is generated by the abstract elements $W([A])$ with $[A] \in \mathcal{M}_0$ and such that
$$W([0]) = 1, \qquad W([A])^* = W(-[A]), \qquad W([A])W([A']) = e^{\frac{i}{2}\sigma_{\mathcal{M}_0}([A], [A'])}\, W([A + A']).$$
Whenever the first cohomology group of $(M, g)$ is not trivial, even though the straightforward construction of a Weyl $C^*$-algebra fails, one could try to resort to an alternative procedure which stems from the existence of a cover of any globally hyperbolic spacetime with contractible globally hyperbolic subsets. Since on each such subset all topological obstructions would be absent and thus each of them possesses its own Weyl algebra, one could try to associate to $(M, g)$ the so-called universal algebra using the scheme outlined in [19]. Yet, in order to prove the existence of such an algebra one would have to verify that the conditions of [9, Proposition B.0.6] are met; these appear to be rather tricky to prove and we postpone the study of this problem to a later work. Although already proven in [16, Proposition A.3], we recast in our language the following result:
Lemma 2.3. The Weyl algebra $W_0(M, g)$ satisfies the time-slice axiom, that is, if $\Sigma$ is a Cauchy surface of $(M, g)$ and $O$ an open globally hyperbolic subset of $M$ containing $\Sigma$, it holds that $W_0(M, g) = W_0(O, g|_O)$.
Proof. Let us recall that every generator $W([A_f])$ of $W(M, g)$ can be constructed unambiguously from $[\Omega^1_{0,\delta}(M, g)]$. Hence we can equivalently consider the $C^*$-algebra generated by $V([f])$ with $[f] \in [\Omega^1_{0,\delta}(M, g)]$ and subject to the defining relations:
$$V([0]) = 1, \qquad V([f])^* = V(-[f]), \qquad V([f])V([f']) = e^{\frac{i}{2}E([f], [f'])}\, V([f + f']),$$
where $E([f], [f']) := \int_M E(f) \wedge *f'$, $f, f'$ being arbitrary representatives in the respective equivalence classes. Let us now consider an arbitrary globally hyperbolic open subset $O$ of $M$ encompassing a Cauchy surface $\Sigma$ and let us choose two other Cauchy surfaces $\Sigma_\pm$, lying in the future and in the past of $\Sigma$ respectively. Let us also fix two functions $\chi_\pm \in C^\infty(M)$ such that $\chi_+ + \chi_- = 1$ and $\mathrm{supp}(\chi_+) \subset J^-(\Sigma_+)$ while $\mathrm{supp}(\chi_-) \subset J^+(\Sigma_-)$. Consider now any $[f] \in [\Omega^1_{0,\delta}(M, g)]$. Then the following identity holds: $f = \delta d(E^-(f) - \chi_+ E(f)) + \tilde f$. Hence, $\tilde f = f - \delta d(E^-(f) - \chi_+ E(f)) = \delta d(\chi_+ E(f))$, where we exploited that $\delta f$ vanishes. Thanks to the support properties of $\chi_\pm$, it turns out that $\tilde f \in \Omega_0^1(O) \subset \Omega_0^1(M)$. Furthermore, it also holds that $E(f) = E(\tilde f)$ and, since $\delta f = 0$, also $\delta\tilde f = 0$. To conclude we need to show that $\tilde f$ falls in the same equivalence class as $f$ in $(M, g)$ as per Definition 2.9: This is automatic since, per construction, $f - \tilde f = \delta\beta$ where $\beta = d(\chi_+ E(f) - E^-(f))$ and $d\beta = 0$.
2.3. Spin-$\frac{1}{2}$ and spin-1 fields in the language of general local covariance
We remind the reader that our ultimate goal is to prove the existence of states for higher spin free fields enjoying at least locally the Reeh–Schlieder property. In particular we want to follow the approach of [35], which has the advantage of being rather general since it deals with arbitrary locally covariant field theories. First introduced in [6], the so-called principle of general local covariance leads to the realization of a quantum field theory as a covariant functor between the category of globally hyperbolic (four-dimensional) Lorentzian manifolds with isometric embeddings as morphisms and the category of $C^*$-algebras with invertible endomorphisms as morphisms. This also entails a new interpretation of local fields as natural transformations from compactly supported smooth functions to suitable operators.
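For orientation, the functorial requirements behind the phrase "covariant functor" can be written out as follows. This is a sketch of the standard definition; the symbols $\mathscr{A}$ for the functor and $\alpha_\mu$ for the image of a morphism $\mu$ are notation chosen here for illustration.

```latex
% Assumption: \mathscr{A} assigns an algebra \mathscr{A}(M,g) to each spacetime and
% a morphism \alpha_\mu : \mathscr{A}(M,g) \to \mathscr{A}(M',g') to each
% embedding \mu : (M,g) \to (M',g').
\begin{align*}
\alpha_{\mu' \circ \mu} &= \alpha_{\mu'} \circ \alpha_{\mu}
  && \text{for } (M,g) \xrightarrow{\ \mu\ } (M',g') \xrightarrow{\ \mu'\ } (M'',g''), \\
\alpha_{\mathrm{id}_M} &= \mathrm{id}_{\mathscr{A}(M,g)} .
\end{align*}
```

Covariance here means that the direction of arrows is preserved: embedding a spacetime into a larger one induces an embedding of the corresponding algebras in the same direction.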
In the same paper it was proven that the Klein–Gordon field fits perfectly in this scheme whereas Dirac fields were dealt with in [36], although a preliminary analysis was already present in [39]. Therefore, the goal of this subsection will be twofold: On the one hand we will shortly recapitulate some of the results of this last cited paper concerning spin-$\frac{1}{2}$ fields, while, on the other hand, we will prove that also the vector potential and the Proca field can be described as genuine locally covariant quantum field theories. Since, as already mentioned, the natural language of this framework is that of categories, as a first step we will introduce those which will be employed in the forthcoming analysis. Since the composition map between morphisms and the existence of an identity map are straightforwardly defined in all the cases we shall consider, we will omit them.
Definition 2.11. We call:
(i) GlobHyp: The category whose objects are $(M, g)$, that is four-dimensional oriented and time oriented globally hyperbolic spacetimes endowed with a smooth
metric of signature $(-, +, +, +)$. A morphism between two objects $(M, g)$ and $(M', g')$ is a smooth embedding $\mu : M \to M'$ such that orientation and time orientation are preserved, $\mu(M)$ is causally convex (see footnote a) and $\mu_* g = g'$ on $\mu(M)$.
(ii) GlobHyp$_1$: The subcategory of GlobHyp whose objects are $(M, g) \in \mathrm{Obj}(\mathrm{GlobHyp})$ with $H^1(M) = \{0\}$. A morphism between two objects $(M, g)$ and $(M', g')$ is a smooth embedding (see footnote b) $\mu : M \to M'$ such that orientation and time orientation are preserved and $\mu(M)$ is causally convex with $\mu_* g = g'$ on $\mu(M)$.
(iii) Bund: The category whose objects are smooth fiber bundles $\pi : P \to M$ where $(M, g)$ is an object of GlobHyp. Morphisms are smooth maps $\nu : P_1 \to P_2$ such that they restrict to isomorphisms of the fibers and they cover a morphism $\mu : M_1 \to M_2$ in GlobHyp, that is $\pi_2 \circ \nu = \mu \circ \pi_1$.
(iv) SSpac: The subcategory of Bund whose objects are the quintuples $(M, g, SM, R_S, \pi_S)$ where $(M, g)$ is an object of GlobHyp whereas $(SM, \pi_S)$ is a spin bundle over $(M, g)$ as per Definition 2.3. Morphisms are maps $\chi : (M_1, g_1, SM_1, R_{S,1}, \pi_{S,1}) \to (M_2, g_2, SM_2, R_{S,2}, \pi_{S,2})$ covering morphisms $\mu : (M_1, g_1) \to (M_2, g_2)$ in GlobHyp so that $\chi \circ R_{S,1} = R_{S,2} \circ \chi$ and $\pi_{S,2} \circ \chi = \mu_* \circ \pi_{S,1}$, where $\mu_*$ is the push-forward induced by $\mu$.
(v) Alg: The category whose objects are unital $C^*$-algebras whereas morphisms are injective unit-preserving $*$-homomorphisms.
$^{a}$ We recall that an open subset $O$ of a globally hyperbolic spacetime is called causally convex if, for all $x, y \in O$, all causal curves connecting $x$ to $y$ lie entirely inside $O$.
$^{b}$ Notice that $\mu$ is a diffeomorphism between $M$ and $\mu(M)$ and thus [29, Corollary 11.3] entails that the cohomology groups of $M$ and $\mu(M)$ are isomorphic.

We shall now use these ingredients, first discussing the Proca field and the vector potential, which are still treated separately due to some subtleties, and later recollecting the results of [35] on Dirac fields. Hence, on account of [6, Definition 2.1]:
Proposition 2.2. The Proca field is a locally covariant quantum field theory, that is there exists a covariant functor $\mathcal{W}_{A,m}$ : GlobHyp $\to$ Alg which assigns to every object $(M, g) \in$ GlobHyp the $C^*$-algebra $\mathcal{W}_m(M, g)$ of Definition 2.8 with the induced action on the morphisms. In other words, to any arrow $\mu$ between $(M, g)$ and $(M', g')$, both objects in GlobHyp, there corresponds $\alpha_\mu$, the unit-preserving injective $*$-homomorphism defined by its action on the generators as $\alpha_\mu(W(A_f)) \doteq W(A_{\tilde f})$ where $\tilde f \doteq f \circ \mu$ for all $A_f \in S_m(M)$. Furthermore the locally covariant quantum field theory of a Proca field fulfils the time-slice axiom and it is causal, that is for every two morphisms $\mu_j : (M_j, g_j) \to (M, g)$, $j = 1, 2$, between objects in GlobHyp so that $\mu_1(M_1, g_1)$ is causally separated from $\mu_2(M_2, g_2)$, it holds that $[\alpha_{\mu_1}(\mathcal{W}_m(M_1, g_1)), \alpha_{\mu_2}(\mathcal{W}_m(M_2, g_2))] = 0$.
Proof. On account of the analysis of the previous section, the proof is roughly identical to the one for the Klein–Gordon field given in [6], which in turn relies on the
analysis of [13]. As we discussed in Definition 2.8 and in the preceding analysis, to every $(M, g) \in$ Obj(GlobHyp), one can associate both $(S_m(M), \sigma_m)$, the symplectic vector space built out of the solutions of the Proca equation, and $\mathcal{W}_m(M, g)$, a unique (up to $*$-isomorphisms) $C^*$-algebra. Let us consider any morphism $\mu$ between two objects, $(M, g)$ and $(M', g')$, and let us consider $(\mu(M), g'|_{\mu(M)})$ as a globally hyperbolic spacetime on its own. It is easy to see that $S_m(\mu(M)) = \mu_* S_m(M)$, where $\mu_*$ is the pull-back on forms induced by the action of $\mu^{-1}$, which is well-defined, $\mu$ being a diffeomorphism between $M$ and $\mu(M)$. As a matter of fact, if we consider any solution $A$ of the Proca equation, the following identity holds: $\mu_*[(\delta d - m^2)A] = (\delta d - m^2)\mu_*(A) = 0$, where we employed that the pull-back induced by a smooth isometric embedding commutes with $d$ (see [29, Lemma 9.14]) and it intertwines the Hodge star $*$ of the source and of the target manifold. This identity together with the relations $\mu_* \circ \mu^* = \mathrm{id}_{\mu(M)}$ and $\mu^* \circ \mu_* = \mathrm{id}_M$ suffices to prove the above statement. Hence, if we consider the propagator $\Delta = E_m(I + m^{-2} d\delta)$ for the Proca equation in $(M, g)$ and $\tilde\Delta$ for the counterpart in $(\mu(M), g'|_{\mu(M)})$, it holds, similarly to the scenario of a real scalar field, that $\tilde\Delta = \mu_* \circ \Delta \circ \mu^*$. Consequently, on account of definition (7), for all $f, h \in \Omega^1_0(M)$
\[ \int_M \Delta(f) \wedge *h = \int_{\mu(M)} \tilde\Delta(\tilde f) \wedge *\tilde h, \]
where $\tilde f$ and $\tilde h$ are equal to $\mu_*(f)$ and $\mu_*(h)$, respectively. In other words $\mu_*$ is a symplectomorphism. According to a standard theorem for $C^*$-algebras, this entails the existence of a unit-preserving injective $*$-homomorphism $\tilde\alpha_\mu$ between the Weyl algebras $\mathcal{W}_m(M, g)$ and $\mathcal{W}_m(\mu(M), g'|_{\mu(M)})$ whose action on each generator $W(A_f)$, $A_f \in S_m(M)$, is unambiguously fixed as $\tilde\alpha_\mu(W(A_f)) \doteq W(\mu_*(A_f))$. Furthermore, since $\mu(M)$ is a globally hyperbolic open subset of $M'$, the uniqueness of the causal propagator for the Proca equation entails that $\tilde\Delta \equiv \chi(\mu(M))\Delta'$, $\chi(\mu(M))$ and $\Delta'$ being the characteristic function of $\mu(M) \subseteq M'$ and the propagator for $\delta d - m^2$ in $(M', g')$ respectively. This relation yields the existence of a natural immersion $\iota : S_m(\mu(M)) \to S_m(M')$ which associates to each $A_f \in S_m(\mu(M))$ with $f \in \Omega^1_0(\mu(M))$ the 1-form $\Delta'(f) \in S_m(M')$. A direct inspection of (7) shows that $\iota$ is a symplectomorphism and thus it induces an injective unit-preserving $*$-homomorphism $\alpha_\iota : \mathcal{W}_m(\mu(M), g'|_{\mu(M)}) \to \mathcal{W}_m(M', g')$ characterized by its action on the generators as $\alpha_\iota(W(A_f)) \doteq W(\iota(A_f))$. Hence we have constructed an injective unit-preserving $*$-homomorphism $\alpha_\mu \doteq \alpha_\iota \circ \tilde\alpha_\mu : \mathcal{W}_m(M, g) \to \mathcal{W}_m(M', g')$ and it automatically satisfies the covariance properties required in [6, Definition 2.1], namely
\[ \alpha_{\mu'} \circ \alpha_\mu = \alpha_{\mu' \circ \mu}, \qquad \alpha_{\mathrm{id}_M} = \mathrm{id}_{\mathcal{W}_m(M,g)}, \]
where $\mu$ is a morphism between $(M, g)$ and $(M', g')$ in GlobHyp while $\mu'$ is a morphism between $(M', g')$ and $(M'', g'')$ in GlobHyp. This suffices to prove that $\mathcal{W}_{A,m}$ is indeed a covariant functor defining a locally covariant quantum field theory. As already mentioned, the time-slice axiom holds true as proven by Fewster and Pfenning in [16, Lemma A.3], whereas the property of being causal is a by-product of the composition rule of the Weyl algebra in Definition 2.8. Such a rule depends on the symplectic form $\sigma_m$ which in turn is constructed out of the causal propagator $E_m$ for $\Box + m^2$. Hence, if we consider two objects $(M_1, g_1)$ and $(M_2, g_2)$ of GlobHyp isometrically embedded in a third one, $(M, g)$, and such that $\mu_1(M_1)$ is causally separated from $\mu_2(M_2)$, it holds that, for every $W(A_f) \in \alpha_{\mu_1}(\mathcal{W}_m(M_1, g_1))$ and $W(A_h) \in \alpha_{\mu_2}(\mathcal{W}_m(M_2, g_2))$, $W(A_f)W(A_h) = W(A_f + A_h) = W(A_h)W(A_f)$ since $\sigma_m(A_f, A_h)$ vanishes. This is due to the support properties of the causal propagator $E_m$ and thus also of $\Delta$.
We can now focus on the massless case:
Proposition 2.3. The vector potential can be described in terms of a locally covariant quantum field theory, that is there exists a covariant functor $\mathcal{W}_{A,0}$ : GlobHyp$_1 \to$ Alg which assigns to every object $(M, g) \in$ GlobHyp$_1$ the $C^*$-algebra $\mathcal{W}_0(M, g)$ of Definition 2.10 with the induced action on the morphisms. In other words, to any arrow $\mu$ between $(M, g)$ and $(M', g')$, both objects in GlobHyp$_1$, there corresponds $\alpha_\mu$, the unit-preserving injective $*$-homomorphism defined by its action on the generators as $\alpha_\mu(W([A_f])) \doteq W([A_{\tilde f}])$ where $[\tilde f] \doteq [f \circ \mu]$ for all $f \in [\Omega^1_{0,\delta}(M, g)]$. Furthermore this locally covariant quantum field theory fulfils the time-slice axiom and it is causal.
Proof. Let us start from an arbitrary $(M, g) \in$ Obj(GlobHyp$_1$) and let us consider any smooth isometric embedding $\mu : (M, g) \to (M', g')$.
On the image $\mu(M)$, the map $\mu$ is a diffeomorphism and thus, for every $A \in \Omega^1(M)$ solving (8), it is meaningful to consider the pull-back under the action of $\mu^{-1}$, which we indicate as $\mu_*(A)$. As already discussed in the proof of Proposition 2.2, both the exterior derivative and $\delta$ commute with any isometry; thus it holds both that $\Box\mu_*(A) = \mu_*(\Box(A)) = 0$ and that $\delta\mu_*(A) = \mu_*\delta(A) = 0$. Furthermore, if $A$ can be written as $E(f)$ with $f \in \Omega^1_{0,\delta}$, we can proceed as for the Proca equation to conclude that $\mu_*(A) = \mu_* \circ E(f) = \mu_* \circ E \circ \mu^* \circ \mu_*(f) = \tilde E \tilde f$ is a solution of (8) in $\mu(M)$ generated by the action on $\tilde f \doteq \mu_*(f) \in \Omega^1_0(\mu(M))$ of the causal propagator $\tilde E \doteq \mu_* \circ E \circ \mu^*$. Furthermore, on account of Definition 2.9 and of $[\delta, \mu_*] = 0$, it holds that $[\Omega^1_{0,\delta}(\mu(M))] = \mu_*[\Omega^1_{0,\delta}(M)]$ and thus $L_0(\mu(M)) = \mu_* L_0(M)$. For any $[\tilde f], [\tilde h] \in [\Omega^1_{0,\delta}(\mu(M))]$, (9) reads
\[ \tilde\sigma([A_{\tilde f}], [A_{\tilde h}]) = \int_{\mu(M)} \tilde E(\tilde f) \wedge *\tilde h = \int_{\mu(M)} \mu_* \circ E(f) \wedge *\mu_*(h) = \int_{\mu(M)} \mu_*[E(f) \wedge *h] = \int_M E(f) \wedge *h = \sigma([A_f], [A_h]), \]
where $\tilde\sigma$ stands for the symplectic form of $L_0(\mu(M))$. Notice that in the various equalities we employed the definitions of $\tilde E$, $\tilde f$, $\tilde h$ as well as [29, Lemma 9.9]. In other words $\mu_*$ is a symplectomorphism between $(L_0(M), \sigma)$ and $(L_0(\mu(M)), \tilde\sigma)$. We recall now that $(\mu(M), g'|_{\mu(M)})$ is a globally hyperbolic open subset of $(M', g')$, which we describe via the smooth isometric embedding $\iota : \mu(M) \to M'$. Hence, since any $f \in \Omega^1_{0,\delta}(\mu(M))$ also lies in $\Omega^1_{0,\delta}(M')$, every element of the equivalence class $[f]_{\mu(M)} \in [\Omega^1_{0,\delta}(\mu(M))]$ generated by $f$ lies in the equivalence class $[f]_{M'} \in [\Omega^1_{0,\delta}(M')]$ generated still by $f$. Notice that the subscripts are introduced in order to avoid confusion in the notation and they will not be present elsewhere in the text. Furthermore the uniqueness of the solution for the wave equation with smooth compactly supported initial data entails that, if we consider $E'$, the causal propagator for (8) in $M'$, it holds that $\tilde E = \chi[\mu(M)]E'$ where $\chi$ is the characteristic function. Consequently we can claim both that $\iota$ induces an embedding $\tilde\iota : L_0(\mu(M)) \to L_0(M')$ defined as $\tilde\iota(\tilde E[f]_{\mu(M)}) \doteq E'([f]_{M'})$ and that $\tilde\iota$ is a symplectomorphism. This can be proven by direct inspection of (9) and thus we omit the explicit computation. If we now consider the symplectomorphism $\varphi$ from Definition 2.9, we have proven the existence of a map $\tilde\mu \doteq \varphi^{-1}_{M'} \circ \tilde\iota \circ \mu_* \circ \varphi_M : M_0(M) \to M_0(M')$ which preserves the symplectic form. At the level of Weyl algebras this induces an injective unit-preserving $*$-homomorphism $\alpha_\mu : \mathcal{W}_0(M) \to \mathcal{W}_0(M')$ completely determined by its action on the generators:
\[ \alpha_\mu(W([A])) = W(\varphi^{-1}_{M'} \circ \tilde\iota \circ \mu_* \circ \varphi_M([A])), \qquad \forall\, [A] \in M_0(M). \]
On account of this last formula, the map $\alpha_\mu$ automatically satisfies the covariance properties required in [6, Definition 2.1], namely
\[ \alpha_{\mu'} \circ \alpha_\mu = \alpha_{\mu' \circ \mu}, \qquad \alpha_{\mathrm{id}_M} = \mathrm{id}_{\mathcal{W}_0(M,g)}, \]
where $\mu$ is a morphism between $(M, g)$ and $(M', g')$ in GlobHyp$_1$ while $\mu'$ is a morphism between $(M', g')$ and $(M'', g'')$ in GlobHyp$_1$. This suffices to prove that $\mathcal{W}_{A,0}$ is indeed a covariant functor defining a locally covariant quantum field theory. As in the massive case, the time-slice axiom is satisfied as proven by Pfenning and Fewster in [16], whereas causality holds true following exactly the same reasoning as in Proposition 2.2, barring minor amendments to account for the different equivalence classes.
We can now focus on the fields with spin-1/2. As we mentioned before, this case has been treated in full detail in [36] and it is far from our goals to summarize all the results obtained in this cited paper. Bearing in mind the nomenclature of Definition 2.7 and of the subsequent analysis, here we shall only summarize in our setting the results of [36, Theorem 4.7 and Proposition 4.8]:
Proposition 2.4. A Dirac field can be described as a locally covariant quantum field theory, namely as the functor $B$ : SSpac $\to$ Alg which assigns to each $(M, g, SM, \pi_S) \in$
Obj(SSpac) the algebra $A(M, g)$, the subalgebra of observables of $F(M, g)$. Furthermore this theory is causal and it satisfies the time-slice axiom.

3. The Reeh–Schlieder Property
In the previous section we set all the pieces on the chessboard and now it is time to unveil the details of the strategy according to which we shall use them. Hence, as a first step, we shall recollect the notion of "Reeh–Schlieder" property. As customary in the algebraic approach to quantum field theory, a full quantization scheme consists of two ingredients, a unital $C^*$-algebra (a $*$-algebra actually suffices in the most general scenario), here indicated for the sake of simplicity as $\mathcal{W}$, and a state, that is a continuous linear functional $\omega : \mathcal{W} \to \mathbb{C}$ such that
\[ \omega(e) = 1, \qquad \omega(a^* a) \geq 0, \quad \forall\, a \in \mathcal{W}, \]
where $e$ is the unit element of the algebra. The Gelfand–Naimark–Segal (GNS) theorem guarantees that it is possible to assign to the pair $(\mathcal{W}, \omega)$ a triplet $(H_\omega, \pi_\omega, \Omega_\omega)$, where $H_\omega$ is a Hilbert space on which the algebra $\mathcal{W}$ is represented in terms of bounded linear operators via $\pi_\omega$. Furthermore, $\Omega_\omega$ is a norm 1 vector in $H_\omega$, such that $H_\omega = \overline{\pi_\omega(\mathcal{W})\Omega_\omega}$. The collection of all states admits a description in terms of the following category [6]:
States: The objects are sets of states on a $C^*$-algebra seen as an object of Alg. Morphisms are instead all positive maps $\alpha^* : S_1 \to S_2$.
Definition 3.1. For a given functor $\mathcal{W}$ : GlobHyp $\to$ Alg defining a locally covariant quantum field theory in the sense of [6, Definition 2.1], a state space $S$ is a contravariant functor $S$ : GlobHyp $\to$ States such that $S(M, g)$ is a convex subset of the set of states of $\mathcal{W}(M, g)$ for all $(M, g) \in$ Obj(GlobHyp) whereas, for a given morphism $\mu : (M, g) \to (M', g')$, it holds that $S(\mu) \doteq \alpha^*_\mu$, this being the pull-back induced by $\alpha_\mu : \mathcal{W}(M, g) \to \mathcal{W}(M', g')$. The property of being contravariant is encoded both in $\alpha^*_{\mu' \circ \mu} = \alpha^*_\mu \circ \alpha^*_{\mu'}$ and in the requirement that the unit morphisms are mapped to unit morphisms. Furthermore, a state $\omega \in S(M, g)$ with $(M, g) \in$ Obj(GlobHyp) has the Reeh–Schlieder property for a causally convex region $O \subseteq M$ if and only if
\[ \overline{\pi_\omega(\mathcal{W}(O, g|_O))\Omega_\omega} = H_\omega. \]
Notice that the same definition is valid if we replace GlobHyp with GlobHyp$_1$ or with SSpac.

3.1. The deformation argument
The next step consists of explaining the procedure which leads to a proof of the existence of a state which satisfies the Reeh–Schlieder property at least for a causally convex region of a globally hyperbolic spacetime. This issue was addressed in [35] for a generic but abstract locally covariant field theory while only the concrete case of a
scalar field theory was discussed in detail. The underlying philosophy is the same as the one used in the last decade to prove that there exist Hadamard states for free field theories, namely a deformation argument. First introduced in [20], it calls for mapping via a local isometry a suitable neighborhood of a Cauchy surface in an arbitrary globally hyperbolic spacetime into an open neighborhood of a Cauchy surface of a second globally hyperbolic spacetime. The latter is engineered in such a way that its metric becomes isometric to that of an ultrastatic globally hyperbolic spacetime$^{c}$ in a causally convex neighborhood of a third Cauchy surface. The advantage lies in the presence on any ultrastatic spacetime of a complete timelike Killing field which makes possible an explicit construction of states enjoying most of the wanted properties such as the Reeh–Schlieder one and/or the Hadamard condition. Furthermore, the existence of a state with these properties in an ultrastatic manifold suffices to guarantee the existence of a second state in the first manifold which preserves at least locally the very same properties. We now review this procedure in more detail, although all the statements we shall write are proven in [35, Sec. 3]. Notice that, as in the previous section, we can replace GlobHyp with GlobHyp$_1$ without problems. The same conclusion holds true in the case of spin-1/2 fields, where one should work with SSpac. To this end we recall that to each object $(M, g, SM, \pi_S) \in$ SSpac there is associated a unique element $(M, g) \in$ Obj(GlobHyp) and that each morphism in SSpac is the covering of one in GlobHyp. One might think that subtleties arise when there does not exist a unique spin structure associated to a four-dimensional oriented and time oriented globally hyperbolic spacetime $(M, g)$. Yet this is ultimately not a problem since the mentioned non-uniqueness is ruled by the topology of the underlying background.
Hence, since the deformation argument focuses only on the geometry leaving the topology untouched, we are free to associate to all the spacetimes involved the same spin structure and thus all the results we shall derive can be used also for Dirac fields once a spin structure has been fixed. Hence, the geometric side of the deformation argument is the following [34], although we add the requirement that all Cauchy surfaces are oriented-diffeomorphic as it has been recently picked up in [18, Sec. 2.3]:

$^{c}$ A four-dimensional globally hyperbolic spacetime is called ultrastatic if there exists a local chart $(t, x^i)$, $i = 1, \ldots, 3$, such that the line element reads $ds^2 = -dt^2 + h_{ij}\,dx^i dx^j$ where $t$ runs over the whole real line while $h$ is a smooth Riemannian metric independent of $t$.
$^{d}$ We recall that for any open set $K$ of a Lorentzian manifold $(M, g)$, the causal complement is $K^\perp \doteq M \setminus (J^+(K) \cup J^-(K))$.

Proposition 3.1. Let $(M, g), (M', g') \in$ Obj(GlobHyp) be chosen so that the respective Cauchy surfaces $\Sigma \to M$, $\Sigma' \to M'$ are oriented-diffeomorphic, that is there exists a diffeomorphism between $\Sigma$ and $\Sigma'$ which preserves the orientation. Let $O' \subset M'$ be a bounded causally convex (bcc) region with non-empty causal complement.$^{d}$ Then there exist $(\tilde M, \tilde g) \in$ Obj(GlobHyp) with two embedded Cauchy surfaces $\tilde\Sigma \to \tilde M$ and $\tilde\Sigma' \to \tilde M$ as well as bcc. regions $U, V \subset M$ and $U', V' \subset M'$
such that, if we call $I^\pm$ the chronological future and past,
(i) there exist isometries $\psi^-$ between $I^-(\Sigma)$ and $I^-(\tilde\Sigma)$ as well as $\psi^+$ between $I^+(\Sigma')$ and $I^+(\tilde\Sigma')$,
(ii) $U', V' \subset I^+(\Sigma')$, $U' \subset D(O')$ and $O' \subset D(V')$, $D$ being the domain of dependence,
(iii) $U, V \subset I^-(\Sigma)$, $U, V^\perp \neq \emptyset$ and $\psi^-(U) \subset D(\psi^+(U'))$ as well as $\psi^+(V') \subset D(\psi^-(V))$.
At the level of $C^*$-algebras, the above proposition has been used in the proof of the following facts:
Proposition 3.2. Let $\mathcal{W}$ : GlobHyp $\to$ Alg be a covariant functor defining a locally covariant quantum field theory satisfying the time-slice axiom and let $S$ be the associated state space as per Definition 3.1. Then
(i) two $(M, g), (M', g') \in$ Obj(GlobHyp) with oriented-diffeomorphic Cauchy surfaces are mapped by $\mathcal{W}$ into isomorphic $C^*$-algebras,
(ii) for any bcc. region $O' \subset M'$ with $O'^\perp \neq \emptyset$, there exist bcc. open sets $U, V \subset M$ and a $*$-isomorphism $\alpha : \mathcal{W}(M', g') \to \mathcal{W}(M, g)$ such that $V^\perp \neq \emptyset$ and $\mathcal{W}(U, g|_U) \subset \alpha(\mathcal{W}(O', g'|_{O'})) \subset \mathcal{W}(V, g|_V)$.
In turn this last proposition leads to the main statement of [35]:
Proposition 3.3. Under the same assumptions of the previous proposition, the following statements hold true:
(i) Let us consider $(M, g), (M', g') \in$ Obj(GlobHyp) with oriented-diffeomorphic Cauchy surfaces such that $\omega \in S(M, g)$ has the Reeh–Schlieder property. Then for any bcc. region $O' \subset M'$ such that $O'^\perp \neq \emptyset$, there exists a $*$-isomorphism $\alpha : \mathcal{W}(M', g') \to \mathcal{W}(M, g)$ such that $\omega' \doteq \alpha^*\omega$ has the Reeh–Schlieder property for $O'$.
(ii) If the Cauchy surfaces are not compact, for any bcc. region $O_1 \subset M'$, there exists a second bcc. region $O_2 \subset O_1^\perp$ for which $\alpha^*\omega$ has the Reeh–Schlieder property.
(iii) If the locally covariant quantum field theory is causal in the sense of [6, Definition 2.1], then, given the GNS triple $(H_{\omega'}, \pi_{\omega'}, \Omega_{\omega'})$ of $\omega'$, it turns out that $\Omega_{\omega'}$ is cyclic and separating for $\pi_{\omega'}(\mathcal{W}(O', g'|_{O'}))''$. If the Cauchy surfaces are not compact, then $\Omega_{\omega'}$ is also separating for all $\pi_{\omega'}(\mathcal{W}(O_1, g'|_{O_1}))''$, $O_1$ being an arbitrary bcc. region in $M'$.
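As a reading aid, the following is a minimal sketch of how the inclusions of Proposition 3.2(ii) can transfer cyclicity from $\omega$ to the pulled-back state. It is not a substitute for the proofs in [35]. It assumes that $\omega$ has the Reeh–Schlieder property for the region $U$, and it uses the standard fact that, $\alpha$ being a $*$-isomorphism, the GNS triple of $\omega \circ \alpha$ can be realized (up to unitary equivalence) on $H_\omega$ with representation $\pi_\omega \circ \alpha$ and the same cyclic vector.

```latex
% Sketch under the stated assumptions; notation as in Propositions 3.2 and 3.3.
\begin{align*}
\omega' &\doteq \alpha^*\omega = \omega\circ\alpha,
\qquad
(H_{\omega'},\pi_{\omega'},\Omega_{\omega'}) \simeq
(H_{\omega},\ \pi_{\omega}\circ\alpha,\ \Omega_{\omega}), \\[4pt]
\overline{\pi_{\omega'}\bigl(\mathcal{W}(O',g'|_{O'})\bigr)\Omega_{\omega'}}
 &= \overline{\pi_{\omega}\bigl(\alpha(\mathcal{W}(O',g'|_{O'}))\bigr)\Omega_{\omega}}
 \ \supseteq\ \overline{\pi_{\omega}\bigl(\mathcal{W}(U,g|_{U})\bigr)\Omega_{\omega}}
 \ =\ H_{\omega}.
\end{align*}
```

For the separating part one can argue along similar lines: since $\alpha(\mathcal{W}(O', g'|_{O'})) \subset \mathcal{W}(V, g|_V)$ and $V^\perp \neq \emptyset$, causality implies that the algebra of a bcc. region contained in $V^\perp$ commutes with $\mathcal{W}(V, g|_V)$. If $\Omega_\omega$ is cyclic for that algebra, then it is cyclic for the commutant of $\pi_\omega(\mathcal{W}(V, g|_V))$ and hence separating for $\pi_\omega(\mathcal{W}(V, g|_V))''$ and for every subalgebra thereof.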
Notice that the above proposition deals only with the construction of a state which has the Reeh–Schlieder property in a suitably small neighborhood. This is indeed the scenario we are interested in, but it is worthwhile to mention that
a stronger result can be obtained: Let us consider a locally covariant quantum field theory $\mathcal{W}$ : GlobHyp $\to$ Alg, causal and satisfying the time-slice axiom, with a locally quasi-equivalent state space (see [34, Definition 2.4.3] for more details). Suppose that the latter is maximal, that is, for any state $\omega : \mathcal{W}(M, g) \to \mathbb{C}$ which is locally quasi-equivalent to a state in $S(M, g)$, $\omega$ lies in $S(M, g)$. Then, for any $(M, g), (M', g') \in$ Obj(GlobHyp) with oriented-diffeomorphic and non-compact Cauchy surfaces, $S(M', g')$ contains a full Reeh–Schlieder state if one such state exists in $S(M, g)$.

3.2. Existence of a local Reeh–Schlieder state for higher spin field theories
We can now use the analysis of the previous section to prove our main result. The line of reasoning can be summarized as follows: Since every globally hyperbolic spacetime $(M', g')$ is diffeomorphic to $\mathbb{R} \times \Sigma$, $\Sigma$ being a three-dimensional Cauchy surface, it is always possible to apply Propositions 3.1, 3.2 and 3.3 by fixing $(M, g)$ as an ultrastatic spacetime which is in turn diffeomorphic to $\mathbb{R} \times \Sigma$. In the latter case it was proven in [38] that, for a scalar or a Proca field, every quasifree and continuous state which is ground or KMS with respect to the timelike Killing vector field in $M$ has the Reeh–Schlieder property. Hence, since a real massive scalar field can be described as a locally covariant quantum field theory and since a ground state always exists on a static spacetime [26], it is possible to combine the result of Strohmaier with Proposition 3.3 to conclude the existence of a state for a real and massive scalar field theory on $(M', g')$ which satisfies the local Reeh–Schlieder property and, moreover, is of Hadamard form. Our goal is to generalize the above result as follows and we start from the case of Dirac fields.
Proposition 3.4. Let $(M', g', SM', R'_S, \pi'_S) \in$ Obj(SSpac) be given and let $O \subset M'$ be any bcc. region with $O^\perp \neq \emptyset$. Then, for $B(M', g', SM', R'_S, \pi'_S) = A(M', g')$ as in Proposition 2.4, there always exists a Hadamard state $\omega'$ which has the Reeh–Schlieder property for $O$. Furthermore, if $(H_{\omega'}, \pi_{\omega'}, \Omega_{\omega'})$ is the GNS triplet associated to $\omega'$, $\Omega_{\omega'}$ is both cyclic and separating for $\pi_{\omega'}(A(O, g'|_O))''$.
Proof. In Proposition 2.4, it was proven that a Dirac field can be described as a locally covariant quantum field theory which furthermore satisfies the time-slice axiom. Furthermore, a free Dirac field on an ultrastatic spacetime $(M, g)$ always admits a pure and quasifree state $\omega$ which is a ground state with respect to the dynamics induced at the $C^*$-algebra level by the timelike Killing field. This was first explicitly established in [17] following [1, Theorem 2] and [41]. Furthermore, on account of [33], we know that such a state is of Hadamard form and, on account of [38], that it has the Reeh–Schlieder property. Hence, we can invoke the deformation argument and Propositions 3.2 and 3.3 to conclude the existence of a $*$-isomorphism $\alpha : B(M', g', SM', R'_S, \pi'_S) \to B(M, g, SM, R_S, \pi_S)$. This induces in turn a state
$\omega' = \alpha^*\omega$ in $(M', g')$ which, on account of Proposition 3.3, has the sought properties. Furthermore, the very same deformation argument (see [32]) entails also that $\omega'$ enjoys the Hadamard property.
For the massive spin-1 field the outcome is not that different:
Proposition 3.5. Let $(M', g') \in$ Obj(GlobHyp) be given and let $O \subset M'$ be any bcc. region with $O^\perp \neq \emptyset$. Then there always exists a Hadamard state $\omega'$ for $\mathcal{W}_m(M', g')$ as in Definition 2.8 which has the Reeh–Schlieder property for $O$. Furthermore, if $(H_{\omega'}, \pi_{\omega'}, \Omega_{\omega'})$ is the GNS triplet associated to $\omega'$, $\Omega_{\omega'}$ is both cyclic and separating for $\pi_{\omega'}(\mathcal{W}_m(O, g'|_O))''$.
Proof. In Proposition 2.2 it was proven that a Proca field can be described as a locally covariant quantum field theory which, furthermore, satisfies the time-slice axiom. Hence we can apply the procedure depicted in the previous section by deforming the chosen globally hyperbolic spacetime $(M', g')$ into an ultrastatic one, $(M, g)$. On the latter the Weyl algebra for a massive spin-1 field, $\mathcal{W}_m(M, g)$, admits a ground state with respect to the timelike Killing field [21]. On account of [33] we also know that such a state is of Hadamard form and, on account of [38], that it has the Reeh–Schlieder property. Hence, Proposition 3.2 guarantees the existence of a $*$-isomorphism $\alpha : \mathcal{W}_m(M', g') \to \mathcal{W}_m(M, g)$ whereas Proposition 3.3 asserts that $\omega' = \alpha^*\omega$ has the sought properties. Furthermore, the very same deformation argument (see [32]) entails also that $\omega'$ enjoys the Hadamard property.
The discussion for the vector potential is instead slightly more complicated. We remark that, in the case of globally hyperbolic spacetimes with trivial first de Rham cohomology group and compact Cauchy surface, one could shorten the argument by employing the results of [16], in particular those of Sec. IV. It holds:
Proposition 3.6. Let $(M', g') \in$ Obj(GlobHyp$_1$) be given and let $O \subset M'$ be any bcc. region with $O^\perp \neq \emptyset$. Then there always exists a Hadamard state $\omega'$ for $\mathcal{W}_0(M', g')$ as in Definition 2.10 which has the Reeh–Schlieder property for $O$. Furthermore, if $(H_{\omega'}, \pi_{\omega'}, \Omega_{\omega'})$ is the GNS triplet associated to $\omega'$, $\Omega_{\omega'}$ is both cyclic and separating for $\pi_{\omega'}(\mathcal{W}_0(O, g'|_O))''$.
Proof. In Proposition 2.3, it was proven that the vector potential can be described as a locally covariant quantum field theory which satisfies, moreover, the time-slice axiom. Hence we can apply the procedure depicted in the previous section by deforming the chosen globally hyperbolic spacetime $(M', g')$ into an ultrastatic one, say $(M, g)$. Notice that, since the deformed manifold has the same topological structure as the former, it holds that $H^1(M) = \{0\}$. Unfortunately the existence of a state $\omega : \mathcal{W}_0(M, g) \to \mathbb{C}$ which satisfies the Reeh–Schlieder property has never been explicitly established. Yet all the ingredients needed to do so are already available in the literature and we can simply recollect them: First of all the existence of a quasi-free
state $\omega$ for $\mathcal{W}_0(M, g)$ can be inferred from [31, Sec. 3.1] where quantization via a Fock space is discussed. The main ingredients are the same as those needed for a scalar field: a weakly non-degenerate symplectic form and a complex structure. Furthermore, since $(M, g)$ is ultrastatic, it possesses a timelike Killing field $\xi$ which generates a 1-parameter group of isometries, say $g_t(\xi)$. This induces an action on each $f \in \Omega^1_0(M)$ via map composition and, since $g_t(\xi)$ is an isometry, it commutes with the action both of $\delta$ and of $d$. Hence, on account of Definition 2.9, both $[\Omega^1_{0,\delta}(M)]$ and $L_0(M)$ are preserved by the natural action of $g_t(\xi)$. Furthermore, if one writes (9) in a local coordinate system, it is immediate that $g_t(\xi)$ is also a symplectomorphism. Hence, according to these remarks, the state constructed in [31] is a ground state and one could repeat almost slavishly the discussion in [38] to conclude that it also enjoys the Reeh–Schlieder property. On account of [33] we also know that such a state is of Hadamard form. Hence, Proposition 3.2 guarantees the existence of a $*$-isomorphism $\alpha : \mathcal{W}_0(M', g') \to \mathcal{W}_0(M, g)$ which induces a state $\omega' = \alpha^*\omega$ with the sought properties since Proposition 3.3 holds true. Furthermore, the very same deformation argument (see [32]) entails also that $\omega'$ enjoys the Hadamard property.
To conclude the section, we would like to point out an additional feature of the deformation argument which is often neglected but descends almost automatically from the construction employed. Most notably, let us consider a globally hyperbolic spacetime $(M', g')$; up to an isometry, we can split $M'$ as $\mathbb{R} \times \Sigma'$ and we can find a coordinate system $x^\mu = (t, x^i)$, $\mu = 0, \ldots, 3$ and $i = 1, \ldots, 3$, such that $g'_{\mu\nu}\,dx^\mu dx^\nu = -\beta\,dt^2 + h_{ij}\,dx^i dx^j$ where $\beta \in C^\infty(M', \mathbb{R}^+)$ while $h$ is a smooth time-dependent Riemannian metric on the Cauchy surface $\Sigma'$.
If there exists a complete spacelike Killing field $\xi$ for $g'$ whose integral curves lie, for each initial value, on a fixed Cauchy surface $\Sigma'$, it is possible to choose the ultrastatic spacetime $(M, g)$, which is diffeomorphic to $(M', g')$, in such a way that $\xi$ is also a spacelike complete Killing field for $g$. Furthermore each integral curve will lie entirely on a fixed Cauchy surface $\Sigma$ of $(M, g)$ which is diffeomorphic to $\Sigma'$. Hence, under these hypotheses and assuming the standard action of this additional isometry on the algebra of observables (the fermionic and the bosonic case yield the same result), the ground state $\omega$ we consider on the ultrastatic spacetime will be invariant under the action of the spacelike isometry. This invariance property will be preserved under the pull-back action of $\alpha$ introduced in Proposition 3.2 and thus also $\omega' = \alpha^*\omega$ will be invariant under the action induced by $\xi$.

4. Conclusion
We have proven that the Dirac and the Proca field as well as the vector potential admit, even on a generic globally hyperbolic spacetime, a state which enjoys the Reeh–Schlieder property at least on a suitable open region. To get to this result we also had to show that spin-1 fields can be described as locally covariant quantum
field theories regardless of the gauge freedom present in the massless case. From a mathematical point of view, this result guarantees that yet another property, first introduced on a Minkowski background, admits a generalization to non-trivial manifolds. From a physical point of view, we envisage instead at least two possible applications: As already commented in the introduction, we expect that the states we proved to exist could be employed in the framework of warped convolutions where new field theory models are engineered by deforming free ones. The existence of a state which enjoys the Reeh–Schlieder property could allow one to prove on firmer ground that the resulting model is indeed non-equivalent to the original one. From a completely different perspective, the outcome of this paper could be seen as a first step to bring our knowledge of spin-1 fields on curved backgrounds on par with that of the other free fields. In particular the massless case has often been relegated to an ancillary role, mostly due to the presence of the gauge freedom which makes the achievement of any result trickier, to say the least. With this paper we want to start a series of analyses aiming to fill this gap and our next goal will be to extend the results of [16]. In that paper the very same deformation argument we employed was used to prove that, under certain topological restrictions on the Cauchy surface, a vector potential on a globally hyperbolic spacetime always admits at least one Hadamard state. We plan to show that the bulk-to-boundary procedure, first discussed in [9], can be used also for the vector potential. To this end the proof that such a field can be described in the language of general local covariance will play an important role and ultimately we will provide an explicit construction of Hadamard states on a large class of curved backgrounds.

Acknowledgments
The author gratefully acknowledges financial support from the University of Pavia, from the project "Stati Quantistici di Hadamard e radiazione di Hawking da buchi neri rotanti" funded by the GNFM-Indam and from the German Research Foundation DFG through the Emmy Noether Fellowship WO 1447/1-1. The hospitality of the II. Institut für Theoretische Physik, Universität Hamburg, where part of this project was undertaken, is also gratefully acknowledged. The author is also indebted to Ko Sanders, Daniel Siemssen and Benjamin Lang for enlightening discussions.

References
[1] H. Araki, On quasifree states of CAR and Bogoliubov automorphisms, Publ. RIMS Kyoto Univ. 6 (1970/71) 385–442.
[2] C. Bär and K. Fredenhagen (eds.), Quantum Field Theory on Curved Spacetimes (Springer, 2009).
[3] C. Bär, N. Ginoux and F. Pfäffle, Wave Equations on Lorentzian Manifolds and Quantization (European Mathematical Society, 2007).
[4] A. N. Bernal and M. Sanchez, On smooth Cauchy hypersurfaces and Geroch's splitting theorem, Comm. Math. Phys. 243 (2003) 461–470; gr-qc/0306108.
[5] A. N. Bernal and M. Sanchez, Further results on the smoothability of Cauchy hypersurfaces and Cauchy time functions, Lett. Math. Phys. 77 (2006) 183–197; gr-qc/0512095.
[6] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle: A new paradigm for local quantum physics, Comm. Math. Phys. 237 (2003) 31–68; arXiv:math-ph/0112041.
[7] R. Brunetti, L. Franceschini and V. Moretti, Topological features of massive bosons on two dimensional Einstein spacetime. I: Spatial approach, Ann. Henri Poincaré 10 (2009) 1027–1073; arXiv:0812.0533 [gr-qc].
[8] D. Buchholz, G. Lechner and S. J. Summers, Warped convolutions, Rieffel deformations and the construction of quantum field theories, to appear in Comm. Math. Phys.; arXiv:1005.2656 [math-ph].
[9] C. Dappiaggi, V. Moretti and N. Pinamonti, Rigorous steps towards holography in asymptotically flat spacetimes, Rev. Math. Phys. 18 (2006) 349–416; gr-qc/0506069.
[10] C. Dappiaggi, T. P. Hack and N. Pinamonti, The extended algebra of observables for Dirac fields and the trace anomaly of their stress-energy tensor, Rev. Math. Phys. 21 (2009) 1241–1312; arXiv:0904.0612 [math-ph].
[11] C. Dappiaggi, G. Lechner and E. Morfa-Morales, Deformations of quantum field theories on spacetimes with Killing vector fields, Comm. Math. Phys. 305 (2011) 99–130; arXiv:1006.3548 [math-ph].
[12] S. P. Dawson and C. J. Fewster, An explicit quantum weak energy inequality for Dirac fields in curved spacetimes, Classical Quant. Grav. 23 (2006) 6659–6681.
[13] J. Dimock, Algebras of local observables on a manifold, Comm. Math. Phys. 77 (1980) 219–228.
[14] J. Dimock, Dirac quantum fields on a manifold, Trans. Amer. Math. Soc. 269 (1982) 133–147.
[15] J. Dimock, Quantized electromagnetic field on a manifold, Rev. Math. Phys. 4 (1992) 223–233.
[16] C. J. Fewster and M. J. Pfenning, A quantum weak energy inequality for spin one fields in curved spacetime, J. Math. Phys. 44 (2003) 4480–4513; gr-qc/0303106.
[17] C. J. Fewster and R. Verch, A quantum weak energy inequality for Dirac fields in curved spacetime, Comm. Math. Phys. 225 (2002) 331–359; math-ph/0105027.
[18] C. J. Fewster and R. Verch, Dynamical locality and covariance: What makes a physical theory the same in all spacetimes?; arXiv:1106.4785 [math-ph].
[19] K. Fredenhagen, Generalizations of the theory of superselection sectors, in The Algebraic Theory of Superselection Sectors: Introduction and Recent Results, Palermo, Italy (World Scientific, 1989), pp. 379–387.
[20] S. A. Fulling, F. J. Narcowich and R. M. Wald, Singularity structure of the two point function in quantum field theory in curved spacetime. II, Ann. Phys. 136 (1981) 243–272.
[21] E. P. Furlani, Quantization of massive vector fields in curved spacetime, J. Math. Phys. 40 (1999) 2611–2626.
[22] R. Geroch, Spinor structure of spacetimes in general relativity. I, J. Math. Phys. 9 (1968) 1739–1744.
[23] R. P. Geroch, Spinor structure of spacetimes in general relativity. II, J. Math. Phys. 11 (1970) 343–348.
[24] R. Haag, Local Quantum Physics: Fields, Particles, Algebras (Springer, 1992).
[25] T.-P. Hack, On the backreaction of scalar and spinor quantum fields in curved spacetimes, Ph.D. thesis, University of Hamburg; arXiv:1008.1776 [gr-qc].
[26] B. Kay, Linear spin-zero quantum fields in external gravitational and scalar fields — I, Comm. Math. Phys. 62 (1978) 55–70. [27] S. Kobayashi and K. Nomizu, Foundations of Differential Geometry: Volume 1 (Interscience Publisher, 1963). [28] B. Lang, Homologie und die Feldalgebra des quantisierten Maxwellfeldes, Diplomarbeit, Universit¨ at Freiburg (2010). [29] J. M. Lee, Introduction to Smooth Manifolds (Springer, 2000). [30] B. O’Neill, Semi-Riemannian Geometry (Academic Press, 1983). [31] M. J. Pfenning, Quantization of the Maxwell field in curved spacetimes of arbitrary dimension, Classical Quant. Grav. 26 (2009), No. 13, 135017, 20 pp.; arXiv:0902.4887 [math-ph]. [32] H. Sahlmann and R. Verch, Microlocal spectrum condition and Hadamard form for vector valued quantum fields in curved space-time, Rev. Math. Phys. 13 (2001) 1203– 1246; math-ph/0008029. [33] H. Sahlmann and R. Verch, Passivity and microlocal spectrum condition, Comm. Math. Phys. 214 (2000) 705–731; math-ph/0002021. [34] J. A. Sanders, Aspects of locally covariant quantum field theory, Ph.D. thesis, University of York (2008); arXiv:0809.4828 [math-ph]. [35] K. Sanders, On the Reeh–Schlieder property in curved spacetime, Comm. Math. Phys. 288 (2009) 271–285; arXiv:0801.4676 [math-ph]. [36] K. Sanders, The locally covariant Dirac field, Rev. Math. Phys. 22 (2010) 381–430; arXiv:0911.1304 [math-ph]. [37] J. Schlemmer and R. Verch, Local thermal equilibrium states and quantum energy inequalities, Ann. Henri Poincar´e 9 (2008) 945–978; arXiv:0802.2151 [gr-qc]. [38] A. Strohmaier, The Reeh–Schlieder property for quantum fields on stationary spacetimes, Comm. Math. Phys. 215 (2000) 105–118; math-ph/0002054. [39] R. Verch, A spin-statistics theorem for quantum fields on curved spacetime manifolds in a generally covariant framework, Comm. Math. Phys. 223 (2001) 261–288. [40] R. M. Wald, General Relativity (Chicago University Press, 1984). [41] M. 
Weinless, Existence and uniqueness of the vacuum for linear quantized fields, J. Funct. Anal. 4 (1969) 350–379.
December 17, J070-S0129055X11004527
2011 9:23 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 10 (2011) 1063–1113 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004527
ON THE GENERAL ONE-DIMENSIONAL XY MODEL: POSITIVE AND ZERO TEMPERATURE, SELECTION AND NON-SELECTION
A. T. BARAVIERA∗,‡, L. M. CIOLETTI†,§, A. O. LOPES∗,¶, JOANA MOHR∗,‖ and RAFAEL RIGAO SOUZA∗,∗∗
∗Instituto de Matemática, UFRGS, Av. Bento Gonçalves 9500, 91500, Porto Alegre, RS, Brazil
†Departamento de Matemática, Universidade de Brasília, 70910-900, Brasília–DF, Brazil
‡[email protected] §[email protected] ¶[email protected] ‖[email protected] ∗∗[email protected]
Received 16 June 2011
Revised 31 October 2011
We consider (M, d) a connected and compact manifold and we denote by $\mathcal{B}_i$ the Bernoulli space $M^{\mathbb{Z}}$. The analogous problem on the half-line $\mathbb{N}$ is also considered. Let $A : \mathcal{B}_i \to \mathbb{R}$ be an observable. Given a temperature $T$, we analyze the main properties of the Gibbs state $\hat\mu_{\frac{1}{T}A}$. In order to do our analysis, we consider the Ruelle operator associated to $\frac{1}{T}A$, and we get in this procedure the main eigenfunction $\psi_{\frac{1}{T}A}$. Later, we analyze selection problems when the temperature goes to zero: (a) existence, or not, of the limit $V := \lim_{T\to 0} T \log(\psi_{\frac{1}{T}A})$, a question about selection of subactions, and, (b) existence, or not, of the limit $\tilde\mu := \lim_{T\to 0} \hat\mu_{\frac{1}{T}A}$, a question about selection of measures. The existence of subactions and other properties of Ergodic Optimization are also considered. The case where the potential depends just on the coordinates $(x_0, x_1)$ is carefully analyzed. We show, in this case, and under suitable hypotheses, a Large Deviation Principle, when $T \to 0$, graph properties, etc. Finally, we will present in detail a result due to van Enter and Ruszel, where the authors show, for a particular example of potential $A$, that the selection of the measure $\hat\mu_{\frac{1}{T}A}$ does not happen in this case.
Keywords: One-dimensional XY model; thermodynamic formalism; eigenfunction and eigenmeasure; Ruelle operator; zero temperature limit; maximizing probability; subaction; selection of probability. Mathematics Subject Classification 2010: 37A60, 37A50, 82B05, 60G10
0. Introduction
Let (M, d) be a connected and compact manifold. We denote by $\mathcal{B}$ the Bernoulli space $M^{\mathbb{N}}$ of sequences represented by $x = (x_0, x_1, x_2, x_3, \ldots)$, where $x_i$, $i \ge 0$, belongs to the space (alphabet) $M$. By Tychonoff's Theorem of compactness, we know $\mathcal{B}$ is a compact metric space when equipped with the distance given by
$$d_c(x, y) = \sum_{k\ge 0} \frac{d(x_k, y_k)}{c^k},$$
with $c > 1$. The topologies generated by $d_{c_1}$ or $d_{c_2}$ are the same. We write $d$ for this distance when we choose $c = 2$. In several of our results M is the interval [0, 1] or the one-dimensional circle $S^1$.
The shift σ on $\mathcal{B}$ is defined by $\sigma((x_0, x_1, x_2, x_3, \ldots)) = (x_1, x_2, x_3, x_4, \ldots)$. It is a continuous function on $\mathcal{B}$.
Let $A : \mathcal{B} \to \mathbb{R}$ be an observable or potential defined on the Bernoulli space $\mathcal{B}$, i.e. a real-valued function defined on $\mathcal{B}$. The potential A describes an interaction between sites in the one-dimensional lattice $M^{\mathbb{N}}$. For most of the results we consider here we will require A to be Hölder continuous, which means there exist constants $0 < \alpha < 1$ and $\mathrm{Hol}_A > 0$ such that $|A(x) - A(y)| \le \mathrm{Hol}_A\, d(x, y)^\alpha$. We call α the exponent of A and $\mathrm{Hol}_A$ the constant for A.
We will be interested here in the Gibbs state $\mu_A$ associated to such A, which will be a probability measure on $\mathcal{B}$. Note that the set of probability measures on $\mathcal{B}$ is compact for the weak* topology (which is given by a metric). For each value $\beta = 1/T$, where T is the temperature, we can consider the Gibbs state $\mu_{\beta A}$, and we want to show in a particular example (introduced by van Enter and Ruszel [20]) that there is no limit (in the weak* topology) of the family $\mu_{\beta A}$ when $\beta \to \infty$. We will present here in Sec. 6 all the details of the proof of this non-trivial result.
We point out that by a trivial modification of the metric a Hölder potential can be considered a Lipschitz potential (with no change of the topology). Therefore, we can state our results in either case.
The assumption of A being Lipschitz means that the influence of the potential decays fast as we move far away in the lattice. The case of the lattice $\mathbb{Z}$, that is $\mathcal{B}_i = M^{\mathbb{Z}}$, can be treated in a similar way: Let $A : M^{\mathbb{Z}} \to \mathbb{R}$ be a Lipschitz potential, and denote by $\hat\sigma$ the left-shift on $\mathcal{B}_i$. Any Lipschitz potential on $\mathcal{B}_i$ is $\hat\sigma$-cohomologous to a potential on $\mathcal{B}$ (same proof as in [45, Proposition 1.2] or in [7]). We will explain this more carefully later. To consider $\hat\sigma$-invariant probability measures on $\mathcal{B}_i$ means that the position $0 \in \mathbb{Z}$ in the lattice is not distinguished (which in general makes sense). We call the setting described above the general one-dimensional XY model. A particularly interesting case is when we consider $M = S^1$ (the unit one-dimensional circle) [22, 35, 20]. The one-dimensional continuous Ising model is another important example that can be treated in this setting. Below in Sec. 1 our results are for the general case of any M as above. We say that the potential $A : \mathcal{B} \to \mathbb{R}$ depends on the first two coordinates if $A(x) = A(x_0, x_1, x_2, \ldots) = A(x_0, x_1)$, for any $x = (x_0, x_1, x_2, \ldots)$. In this case A
is always Lipschitz. Such potentials are sometimes called nearest neighbor interaction potentials. The so-called one-dimensional XY model in most cases assumes that A depends on the first two coordinates [22]. Special attention to this case will be given in Sec. 4. For example, in [22, 19] $A(x) = A(x_0, x_1) = \cos(x_1 - x_0 - \alpha) + \gamma \cos(2x_0)$, where α and γ are constants. The part $\gamma \cos(2x_0)$ corresponds to the magnetic term while $\cos(x_1 - x_0 - \alpha)$ corresponds to the interaction term.
We point out that this point of view of getting a coboundary and the systematic use of the Ruelle operator is the Thermodynamical Formalism setting (see [45]). This, in principle, is different from the point of view more commonly used in Statistical Mechanics on general lattices, where the Gibbs measures are defined by means of a specification, DLR formalism, or limit of probabilities on finite boxes (see [27, 19, 44]). We briefly address this question for a potential which depends on two coordinates in Sec. 5. In the Classical Thermodynamic Formalism one usually considers $M = \{1, 2, \ldots, d\}$ [45, 33]. Here M is a compact manifold with a volume form.
We point out that we will use the following notation: we call a Gibbs probability measure for A the measure which is derived from a Ruelle operator, and we call the equilibrium probability measure for A the one which is derived from a maximization of Pressure (which requires one to be able to talk about entropy). We will be interested here in Gibbs states because we need to avoid talking about entropy. Note that the shift acting on $M^{\mathbb{N}}$ is such that each point has an uncountable number of pre-images. Only in some later sections will we speak about "entropy" and "pressure" of the potential A (in general in the case where it depends on two coordinates). An interesting discussion about the several possible approaches (DLR, Thermodynamic limit in finite boxes, etc.) to Statistical Mechanics in the one-dimensional lattice appears in [53].
Some of the results presented here will be used in a future related paper [40]. We point out that the understanding of Statistical Mechanics via the Ruelle Operator (Transfer Operator) allows one to get eigenfunctions, and, in the limit (in the logarithmic scale) when the temperature goes to zero, the subaction. This helps in getting Large Deviation properties of Gibbs states when the temperature goes to zero [3, 40].
In the first part of this paper we describe the theory for the case of a general A (Sec. 1 for positive temperature and Sec. 2 for zero temperature). Later (in Sec. 4) we will focus on the case where the potential A depends only on the first two coordinates. Section 5 compares the setting of Thermodynamical Formalism with DLR Formalism. These two sections will help towards a better understanding of Sec. 6, where we present a detailed explanation of an example [20] where there is no selection of measures.
1. Positive Temperature: A Generalized Ruelle–Perron–Frobenius Theorem
Let $C$ be the space of continuous functions from $\mathcal{B} = M^{\mathbb{N}}$ to $\mathbb{R}$. We are interested in the Ruelle operator on $C$ associated to the Lipschitz observable $A : M^{\mathbb{N}} \to \mathbb{R}$, which acts on $\psi \in C$ and sends it to $L_A(\psi) \in C$ defined by
$$L_A(\psi)(x) = \int_M e^{A(ax)} \psi(ax)\, da,$$
for any $x = (x_0, x_1, x_2, \ldots) \in \mathcal{B}$, where $ax$ represents the sequence $(a, x_0, x_1, x_2, \ldots) \in \mathcal{B}$, and $da$ is the Lebesgue probability measure on M. Note that $\sigma(ax) = x$.
A major difference between the Classical Bowen–Ruelle–Sinai Thermodynamic Formalism setting and the XY model is that here, in order to define the Ruelle operator, we need an a priori measure (for which we consider in most of the cases the Lebesgue probability measure $da$ on $S^1$). Some of the results of the present section are generalizations of theorems in [38].
The operator $L_A$ will help us to find the Gibbs state for A. First we will show the existence of a main eigenfunction for $L_A$, when A is Lipschitz. Part of our proof follows the reasoning of [1, Sec. 7] (which considers $M = \{1, 2, \ldots, d\}$), adapted to the present case.
We begin by defining another operator on $C$. Let $0 < s < 1$, and define, for $u \in C$, $T_{s,A}(u)$ given by
$$T_{s,A}(u)(x) = \log \int_M e^{A(ax) + s u(ax)}\, da.$$
Proposition 1. If $0 < s < 1$ then $T_{s,A}$ is a uniform contraction map.
Proof.
$$|T_{s,A}(u_1)(x) - T_{s,A}(u_2)(x)| = \left| \log \frac{\int_M e^{A(ax) + s u_1(ax)}\, da}{\int_M e^{A(ax) + s u_2(ax)}\, da} \right|$$
$$= \left| \log \frac{\int_M e^{A(ax) + s u_2(ax) + s u_1(ax) - s u_2(ax)}\, da}{\int_M e^{A(ax) + s u_2(ax)}\, da} \right|$$
$$\le \left| \log \frac{\int_M e^{A(ax) + s u_2(ax) + s\|u_1 - u_2\|}\, da}{\int_M e^{A(ax) + s u_2(ax)}\, da} \right|$$
$$= s\|u_1 - u_2\|.$$
Let $u_s$ be the unique fixed point for $T_{s,A}$. We have
$$\log \int_M e^{A(ax) + s u_s(ax)}\, da = u_s(x). \tag{1}$$
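The contraction property and the fixed point $u_s$ can be checked numerically in the nearest-neighbor case. The following is only a minimal sketch under stated assumptions: $M = S^1$ is replaced by a uniform grid of `N` angles, the potential is the XY example $A(x_0, x_1) = \cos(x_1 - x_0 - \alpha) + \gamma\cos(2x_0)$ quoted in the Introduction with illustrative parameter values, and $u$ is restricted to functions of the first coordinate, so that $T_{s,A}(u)(x) = \log\int e^{A(a, x_0) + s u(a)}\, da$. All names (`potential`, `T`, `fixed_point`) are hypothetical and not taken from the paper.

```python
import math

N = 64                                          # grid size on S^1 (assumption)
GRID = [2 * math.pi * j / N for j in range(N)]  # angles in [0, 2*pi)


def potential(x0, x1, alpha=0.3, gamma=0.5):
    """A(x0, x1) = cos(x1 - x0 - alpha) + gamma*cos(2*x0); parameter values are illustrative."""
    return math.cos(x1 - x0 - alpha) + gamma * math.cos(2 * x0)


# A_VALUES[i][j] = A(a_j, x_i): the potential at the sequence (a_j, x_i, ...).
A_VALUES = [[potential(a, x) for a in GRID] for x in GRID]


def T(u, s):
    """Discretized T_{s,A}: (T u)(x_i) = log((1/N) * sum_j exp(A(a_j, x_i) + s*u(a_j)))."""
    return [
        math.log(sum(math.exp(a_ij + s * u_j) for a_ij, u_j in zip(row, u)) / N)
        for row in A_VALUES
    ]


def sup_dist(u, v):
    """Uniform distance between two grid functions."""
    return max(abs(p - q) for p, q in zip(u, v))


def fixed_point(s, tol=1e-12, max_iter=5000):
    """Banach iteration u <- T(u, s), started from u = 0."""
    u = [0.0] * N
    for _ in range(max_iter):
        v = T(u, s)
        if sup_dist(u, v) < tol:
            return v
        u = v
    raise RuntimeError("fixed-point iteration did not converge")
```

Because the discrete operator is again a log of a weighted sum of exponentials, the estimate in the proof of Proposition 1 applies to it verbatim, so the iteration converges at rate at most $s$ on the grid as well.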
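Proposition 2. The family $\{u_s\}_{0<s<1}$ is an equicontinuous family of functions.
Proof. Let $H_s(x, y) = u_s(x) - u_s(y)$. By (1) we have
$$e^{u_s(x)} = \int_M e^{A(ax) + s u_s(ax)}\, da = \int_M e^{A(ay) + s u_s(ay)}\, e^{A(ax) - A(ay) + s[u_s(ax) - u_s(ay)]}\, da \le e^{u_s(y)} \max_a \{ e^{A(ax) - A(ay) + s[u_s(ax) - u_s(ay)]} \}.$$
Hence
$$e^{u_s(x) - u_s(y)} \le \max_a \{ e^{A(ax) - A(ay) + s[u_s(ax) - u_s(ay)]} \},$$
and this implies
$$H_s(x, y) = u_s(x) - u_s(y) \le \max_a [A(ax) - A(ay) + s H_s(ax, ay)].$$
Proceeding by induction we get
$$H_s(x, y) \le \max_{\theta \in \mathcal{B}} \sum_{n=0}^{\infty} s^n [A(\theta_n \ldots \theta_0 x) - A(\theta_n \ldots \theta_0 y)]$$
$$\le \mathrm{Hol}_A \max_{\theta \in \mathcal{B}} \sum_{n=0}^{\infty} s^n\, d((\theta_n \ldots \theta_0 x), (\theta_n \ldots \theta_0 y))^\alpha$$
$$\le \mathrm{Hol}_A \sum_{n=0}^{\infty} \left( \frac{s}{2^\alpha} \right)^n d(x, y)^\alpha$$
$$\le \frac{2^\alpha}{2^\alpha - 1} \mathrm{Hol}_A\, d(x, y)^\alpha.$$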
Remark 1. This shows that $u_s$ is Lipschitz, and, moreover, that $u_s$, $0 \le s < 1$, is an equicontinuous family. Note the very important point: the Lipschitz constant of $u_s$ is given by $\frac{2^\alpha}{2^\alpha - 1} \mathrm{Hol}_A$, and depends only on the Hölder constant for A, but does not depend on s.
Let
$$S_n(z) = S_{n,A}(z) = \sum_{k=0}^{n-1} A \circ \sigma^k(z).$$
Note that iterates of the operator $L_A$ can be written with the use of $S_{n,A}(z)$:
$$L_A^n(w)(x) = \int_{a \in M^n} e^{S_{n,A}(ax)} w(ax)\, da.$$
Theorem 3. There exists a strictly positive Lipschitz eigenfunction $\psi_A$ for $L_A : C \to C$ associated to a strictly positive eigenvalue $\lambda_A$. The eigenvalue is simple and it is equal to the spectral radius.
Proof. It follows from the fixed point equation that for any x
$$-\|A\| + s \min u_s \le u_s(x) \le \|A\| + s \max u_s.$$
Therefore, $-\|A\| \le (1 - s) \min u_s \le (1 - s) \max u_s \le \|A\|$, for any s. Consider a subsequence $s_n \to 1$ such that $[(1 - s_n) \max u_{s_n}] \to k$. The family $\{u_s^* = u_s - \max u_s\}_{0<s<1}$ is equicontinuous and uniformly bounded. Therefore, by Arzelà–Ascoli, $\{u_{s_n}^*\}_{n\ge 1}$ has an accumulation point in $C$, which we will call $u$. Observe that for any s
$$e^{u_s^*(x)} = e^{u_s(x) - \max u_s} = e^{-(1-s)\max u_s + u_s(x) - s \max u_s} = e^{-(1-s)\max u_s} \int e^{A(ax) + (s u_s(ax) - s \max u_s)}\, da.$$
Taking the limit as n goes to infinity along the sequence $s_n$, we get that $u$ satisfies
$$e^{u(x)} = e^{-k} \int e^{A(ax) + u(ax)}\, da.$$
In this way we get a positive Lipschitz eigenfunction $\psi_A = e^u$ for $L_A$ associated to the eigenvalue $\lambda_A = e^k$.
Remark 2. To prove that $u$ is Lipschitz, we just use the fact that $u$ is the limit of a sequence of uniformly Lipschitz functions (i.e. Lipschitz functions with the same Lipschitz constant). Using that $u$ is a bounded function we have that $\psi_A = e^u$ is also Lipschitz. Note a very important point: the Lipschitz constant of $u = \log(\psi_A)$ is given by $\frac{2^\alpha}{2^\alpha - 1} \mathrm{Hol}_A$ (see Remark 1 at the end of the proof of Proposition 2).
The property that the eigenvalue is simple and maximal follows from the same reasoning as in [45, pp. 23 and 24]. For example, to prove that the eigenvalue is simple we suppose there are two eigenfunctions $\psi_1$ and $\psi_2$. Let $t = \min\{\psi_1/\psi_2\}$. Then $\psi_3 = \psi_1 - t\psi_2$ is a non-negative eigenfunction which vanishes at some point $z \in \mathcal{B}$. Therefore
$$0 = \lambda_A^n \psi_3(z) = \int_{a \in M^n} e^{S_{n,A}(az)} \psi_3(az)\, da,$$
which implies $\psi_3(az) = 0$ $\forall\, a \in M^n$, $\forall\, n$, which makes $\psi_3 = 0$.
Note that
$$\int_M \frac{e^{A(ax)} \psi_A(ax)}{\lambda_A \psi_A(x)}\, da = 1, \quad \forall\, x \in \mathcal{B}. \tag{2}$$
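As an illustration of Theorem 3 and of identity (2), the eigenpair can be approximated on a grid. This is a hedged sketch, not the construction used in the proof (which goes through the fixed points $u_s$): it substitutes plain power iteration, assumes $M = S^1$ discretized by `N` equally spaced angles, uses the nearest-neighbor XY potential from the Introduction with illustrative parameters, and relies on the observation that for a two-coordinate potential $L_A$ maps functions of $x_0$ to functions of $x_0$. The helper names are hypothetical.

```python
import math

N = 64                                          # grid size on S^1 (assumption)
GRID = [2 * math.pi * j / N for j in range(N)]


def potential(x0, x1, alpha=0.3, gamma=0.5):
    """Illustrative XY potential A(x0, x1) = cos(x1 - x0 - alpha) + gamma*cos(2*x0)."""
    return math.cos(x1 - x0 - alpha) + gamma * math.cos(2 * x0)


# KERNEL[i][j] approximates e^{A(a_j, x_i)} da, with da the normalized Lebesgue measure.
KERNEL = [[math.exp(potential(a, x)) / N for a in GRID] for x in GRID]


def ruelle(psi):
    """Discretized Ruelle operator acting on a function of the first coordinate."""
    return [sum(k * p for k, p in zip(row, psi)) for row in KERNEL]


def leading_eigenpair(tol=1e-12, max_iter=5000):
    """Power iteration, normalized so that max(psi) = 1; returns (lambda, psi)."""
    psi = [1.0] * N
    for _ in range(max_iter):
        image = ruelle(psi)
        lam = max(image)
        new_psi = [v / lam for v in image]
        if max(abs(p - q) for p, q in zip(new_psi, psi)) < tol:
            return lam, new_psi
        psi = new_psi
    raise RuntimeError("power iteration did not converge")


def normalization_defect(lam, psi):
    """Largest deviation from 1 of the left-hand side of identity (2) over the grid."""
    image = ruelle(psi)
    return max(abs(v / (lam * p) - 1.0) for v, p in zip(image, psi))
```

A call such as `lam, psi = leading_eigenpair()` followed by `normalization_defect(lam, psi)` gives a quick check that the discrete analogue of (2) holds up to the iteration tolerance.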
If a potential $B$ satisfies
$$\int_M e^{B(ax)}\, da = 1, \quad \forall\, x \in \mathcal{B},$$
which means $L_B(1) = 1$, we say that $B$ is normalized. Let $\bar A = A + \log \psi_A - \log \psi_A \circ \sigma - \log \lambda_A$, where $\sigma : \mathcal{B} \to \mathcal{B}$ is the usual shift map. Equation (2) shows that $\bar A$ is normalized. It is also Lipschitz (Hölder). In this case the main eigenvalue is 1 and the main eigenfunction is constant equal to 1 (in fact we can prove, using Proposition 4, that there is only one strictly positive eigenfunction, the one associated to the maximal eigenvalue).
Remember that, given $x = (x_0, x_1, x_2, \ldots) \in \mathcal{B}$ and $a \in M$, we denote by $ax \in \mathcal{B}$ the element $ax = (a, x_0, x_1, x_2, \ldots)$, i.e. any $y \in \mathcal{B}$ such that $\sigma(y) = x$ is of this form.
We define the Borel sigma-algebra $\mathcal{F}$ over $\mathcal{B}$ as the σ-algebra generated by the cylinders. By this we mean the sigma-algebra generated by sets of the form $B_1 \times B_2 \times \cdots \times B_n \times M^{\mathbb{N}}$, where $n \in \mathbb{N}$, and $B_j$, $j \in \{1, 2, \ldots, n\}$, are open sets in M. Similar definitions can be considered for $\mathcal{B}_i$.
We say a probability measure µ over $\mathcal{F}$ is invariant if, for any Borel set $B$, we have that $\mu(B) = \mu(\sigma^{-1}(B))$. This corresponds to stationary probability measures for the underlying stochastic process $X_n$, $n \in \mathbb{N}$, with state space M. We denote by $\mathcal{M}_\sigma$ the set of invariant probability measures. Similar definitions can be considered for $\mathcal{B}_i$.
We present below a generalization of results considered in [45]. We define the dual operator $L_A^*$ on the space of the Borel measures on $\mathcal{B}$ as the operator that sends a measure $v$ to the measure $L_A^*(v)$ defined by
$$\int_{\mathcal{B}} \psi\, dL_A^*(v) = \int_{\mathcal{B}} L_A(\psi)\, dv$$
for any $\psi \in C$. Now we want to find an eigen-probability for $L_A^*$. This will help us to find the Gibbs state for the potential A.
Proposition 4. If the observable $\bar A$ is normalized, then there exists a unique fixed point $m = m_{\bar A}$ for $L_{\bar A}^*$. Such a probability measure $m$ is σ-invariant, and for every Hölder continuous function ω we have that, in the uniform convergence topology,
$$L_{\bar A}^n \omega \to \int_{\mathcal{B}} \omega\, dm.$$
Here $L_{\bar A}^n$ denotes the nth iterate of the operator $L_{\bar A} : C \to C$.
Proof. We begin by proving that the normalization property implies that the convex and compact set of Borel probability measures on $\mathcal{B}$ is preserved by the
operator $L_{\bar A}^*$: in order to see that, note that for µ a Borel probability measure on $\mathcal{B}$, we have
$$L_{\bar A}^*(\mu)(\mathcal{B}) = \int_{\mathcal{B}} 1\, dL_{\bar A}^*(\mu) = \int_{\mathcal{B}} L_{\bar A}(1)\, d\mu = \int_{\mathcal{B}} 1\, d\mu = \mu(\mathcal{B}) = 1,$$
where the third equality is precisely the normalization hypothesis. By the Tychonoff–Schauder theorem let $m$ be a fixed point for the operator $L_{\bar A}^*$. To prove that $m$ is σ-invariant, we begin by observing that
$$L_{\bar A}(\psi \circ \sigma)(x) = \int_M e^{\bar A(ax)}\, \psi \circ \sigma(ax)\, da = \int_M e^{\bar A(ax)} \psi(x)\, da = \psi(x).$$
Note that the normalization hypothesis is used in the last equality. Therefore, if $\psi \in C$, then
$$\int_{\mathcal{B}} \psi \circ \sigma\, dm = \int_{\mathcal{B}} \psi \circ \sigma\, dL_{\bar A}^*(m) = \int_{\mathcal{B}} L_{\bar A}(\psi \circ \sigma)\, dm = \int_{\mathcal{B}} \psi\, dm,$$
which implies the invariance property of $m$.
Before finishing the proof of Proposition 4, we will need two claims. The first is a special estimate which will be important in the rest of this section.
Claim 1. For any Hölder potential A, if $\|w\|$ denotes the uniform norm of the Hölder function $w : \mathcal{B} \to \mathbb{R}$, we have
$$|L_A^n(w)(x) - L_A^n(w)(y)| \le \left[ C_{e^A} \|w\| \left( \frac{1}{2^\alpha} + \cdots + \frac{1}{2^{n\alpha}} \right) + \frac{C_w}{2^{n\alpha}} \right] d(x, y)^\alpha,$$
where $C_{e^A}$ is the Hölder constant of $e^A$ and $C_w$ is the Hölder constant of $w$.
Proof. We prove the claim by induction. Suppose $n = 1$. We have
$$|L_A(w)(x) - L_A(w)(y)| \le \int_M |e^{A(ax)} - e^{A(ay)}| \cdot |w(ax)|\, da + \int_M e^{A(ay)} |w(ax) - w(ay)|\, da \le (C_{e^A} \|w\| + C_w) \frac{d(x, y)^\alpha}{2^\alpha},$$
where in the last inequality we used the normalization property of A. In particular we can say that the Hölder constant of $L_A(w)$ is given by
$$C_{L_A(w)} = \frac{C_{e^A} \|w\| + C_w}{2^\alpha}. \tag{3}$$
Now, suppose Claim 1 holds for n. We have
$$|L_A^{n+1}(w)(x) - L_A^{n+1}(w)(y)| = |L_A^n(L_A(w))(x) - L_A^n(L_A(w))(y)|$$
$$\le \left[ C_{e^A} \|L_A(w)\| \left( \frac{1}{2^\alpha} + \cdots + \frac{1}{2^{n\alpha}} \right) + \frac{C_{L_A(w)}}{2^{n\alpha}} \right] d(x, y)^\alpha,$$
and, therefore, the claim is proved when we use (3) and $\|L_A(w)\| \le \|w\|$, which is a consequence of the normalization property of A.
As a consequence, the set $\{L_{\bar A}^n \omega\}_{n\ge 0}$ is equicontinuous. In order to prove that $\{L_{\bar A}^n \omega\}_{n\ge 0}$ is uniformly bounded we use again the normalization condition, which implies $\|L_{\bar A}^n \omega\| \le \|\omega\|$, $\forall\, n \ge 1$. By the Arzelà–Ascoli Theorem let $\bar\omega$ be an accumulation point for $\{L_{\bar A}^n \omega\}_{n\ge 0}$, i.e. suppose there exists a subsequence $\{n_k\}_{k\ge 0}$ such that
$$\bar\omega(x) = \lim_{k} L_{\bar A}^{n_k} \omega(x).$$
Claim 2. $\bar\omega$ is a constant function.
The proof of this second claim is similar to the reasoning of [45, p. 25].
Now that $\bar\omega$ is a constant function we can prove that
$$\bar\omega = \int_{\mathcal{B}} \bar\omega\, dm = \lim_k \int_{\mathcal{B}} L_{\bar A}^{n_k} \omega\, dm = \lim_k \int_{\mathcal{B}} \omega\, d(L_{\bar A}^*)^{n_k}(m) = \int_{\mathcal{B}} \omega\, dm,$$
which shows that $\bar\omega$ does not depend on the subsequence chosen. Therefore, for any $x \in \mathcal{B}$ we have
$$L_{\bar A}^n \omega(x) \to \bar\omega = \int_{\mathcal{B}} \omega\, dm.$$
The last limit shows that the fixed point $m$ is unique.
Proposition 5. Let A be a Hölder, not necessarily normalized potential, and $\psi_A$ and $\lambda_A$ the eigenfunction and eigenvalue given by Theorem 3. To the potential A we associate the normalized potential $\bar A = A + \log \psi_A - \log \psi_A \circ \sigma - \log \lambda_A$. Let $m$ be the unique probability measure that satisfies $L_{\bar A}^*(m) = m$, given by Proposition 4.
(a) The measure
$$\rho_A = \frac{1}{\psi_A}\, m$$
satisfies $L_A^*(\rho_A) = \lambda_A \rho_A$. Therefore, $\rho_A$ is an eigen-probability for $L_A^*$.
(b) For any Hölder $\phi : \mathcal{B} \to \mathbb{R}$, we have that
$$\frac{L_A^n(\phi)}{\lambda_A^n} \to \psi_A \int \phi\, d\rho_A.$$
Proof. (a) $L_{\bar A}^*(m) = m$ implies that for any $\psi \in C$, we have
$$\int \psi\, dm = \int \psi\, dL_{\bar A}^*(m) = \int L_{\bar A}(\psi)\, dm = \int\!\!\int \psi(ax)\, e^{\bar A(ax)}\, da\, dm(x) = \int\!\!\int \psi(ax) \frac{e^{A(ax)} \psi_A(ax)}{\lambda_A \psi_A(x)}\, da\, dm(x).$$
Now, if $\varphi \in C$, making $\psi = \frac{\varphi}{\psi_A}$ in the last equation we have
$$\int \frac{\varphi}{\psi_A}\, dm = \frac{1}{\lambda_A} \int\!\!\int \varphi(ax) \frac{e^{A(ax)}}{\psi_A(x)}\, da\, dm(x),$$
which is equivalent to
$$\lambda_A \int \varphi\, d\rho_A = \int L_A(\varphi)\, d\rho_A \tag{4}$$
or $L_A^*(\rho_A) = \lambda_A \rho_A$.
(b) We have that $A = \bar A - \log \psi_A + \log \psi_A \circ \sigma + \log \lambda_A$, and therefore
$$S_{n,A}(z) \equiv \sum_{k=0}^{n-1} A \circ \sigma^k(z) = S_{n,\bar A}(z) - \log \psi_A(z) + \log \psi_A \circ \sigma^n(z) + n \log \lambda_A,$$
which makes
$$\frac{L_A^n(\phi)(x)}{\lambda_A^n} = \frac{1}{\lambda_A^n} \int_{a \in M^n} e^{S_{n,A}(ax)} \phi(ax)\, da = \psi_A(x) \int_{a \in M^n} \frac{e^{S_{n,\bar A}(ax)} \phi(ax)}{\psi_A(ax)}\, da$$
$$= \psi_A(x)\, L_{\bar A}^n \left( \frac{\phi}{\psi_A} \right)(x) \to \psi_A(x) \int \frac{\phi}{\psi_A}\, dm_{\bar A} = \psi_A(x) \int \phi\, d\rho_A,$$
where the convergence in n in the last line comes from Proposition 4.
Remark 3. From now on we will call $m_{\bar A}$ the eigen-probability for $L_{\bar A}^*$. One can show that the eigen-probability $\rho_A = \frac{1}{\psi_A} m_{\bar A}$ is the unique eigen-probability for $L_A^*$. Also, it is not necessarily invariant for the shift σ. We call $m_{\bar A}$ the Gibbs state for A. This probability measure $m_{\bar A}$ over $\mathcal{B}$ is invariant for the shift and describes the statistics of the interaction described by A. It is usual to call the probability measure $m_{\bar A}$ the Gibbs state (in the Thermodynamic Formalism setting [45]) for the interaction given by A.
We point out that the probability measure $\rho_A$ is positive on open sets of $\mathcal{B}$.
Suppose the metric space $M = S^1$. The projection of this probability measure on the first two coordinates $S^1 \times S^1$ is absolutely continuous with respect to Lebesgue
probability measure on $S^1 \times S^1$. This is so because, if $B$ is Borel in $[0, 1]^2$, then from (4) we have
$$\int I_{(x_0,x_1)\in B}\, d\rho_A = \frac{1}{\lambda_A^2} \int L_{\bar A}^2(I_{(x_0,x_1)\in B})\, d\rho_A,$$
and, for any $x \in \mathcal{B}$,
$$L_{\bar A}^2(I_{(x_0,x_1)\in B})(x) = \int_M \int_M e^{S_{2,\bar A}(abx)}\, I_{(x_0,x_1)\in B}(abx)\, da\, db.$$
Remark 4. If we consider instead a Hölder potential $B : \mathcal{B}_i = M^{\mathbb{Z}} \to \mathbb{R}$, where $\mathcal{B}_i = \{(\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots) \mid x_i \in M,\ i \in \mathbb{Z}\}$, then we first derive (as in [45, Proposition 1.2] or in [7]) the associated cohomologous Hölder potential $A : \mathcal{B} \to \mathbb{R}$ (the Hölder class can change), then proceed as above to get $\rho_A$ over $\mathcal{B}$. Finally, we consider the natural extension $\hat\rho_A$ of $\rho_A$ on $\mathcal{B}_i$ (see [46, 7]), and we solve in this way the Statistical Mechanics problem for the interaction described by $B$ in the lattice $\mathbb{Z}$: it is the probability measure $\hat\rho_{\beta A}$. Note that if $C$ is a set that depends just on the coordinates $x_0, x_1$, then $\rho_{\beta A}(C) = \hat\rho_{\beta A}(C)$. For sets $C \subset \mathcal{B}_i$ of this form, we can use without loss of generality $\rho_{\beta A}(C)$ or $\hat\rho_{\beta A}(C)$.
Proposition 6. The only Lipschitz continuous eigenfunction ψ of $L_A$ which is totally positive is $\psi_A$ (the one associated to the maximal eigenvalue $\lambda_A$).
Proof. Suppose $\psi : \mathcal{B} \to \mathbb{R}$ is a Lipschitz continuous eigenfunction of $L_A$ associated to the eigenvalue β. It follows from the above that $\frac{L_A^n(\psi)}{\lambda_A^n} \to \psi_A \int \psi\, d\rho_A$, when $n \to \infty$. Therefore, if $\psi > c > 0$, then $\int \psi\, d\rho_A > 0$. Moreover, $L_A^n(\psi) = \beta^n \psi$. This is only possible if $\beta = \lambda_A$ and $\psi = \psi_A$.
It is easy to see that if A is Hölder with exponent α, and $H_\alpha$ denotes the set of real-valued functions with Hölder exponent α, then $L_{\bar A} : H_\alpha \to H_\alpha$. For $w \in H_\alpha$, denote
$$|w|_\alpha = \sup_{x \ne y} \frac{|w(x) - w(y)|}{d(x, y)^\alpha}.$$
It is known that $H_\alpha$ is a Banach space for the norm $\|w\|_\alpha = |w|_\alpha + \|w\|$, where $\|w\|$ is the uniform norm of $w$. When $\alpha = 1$ we are considering the space of Lipschitz functions $H_1$.
We note that $K_\alpha \equiv \{w \in H_\alpha,\ \|w\|_\alpha \le 1\}$ is compact in the uniform norm as a subset of $C$. To prove that, we just need to observe that the definition of the norm $\|w\|_\alpha$ implies that $K_\alpha$ is an equicontinuous and uniformly bounded set, and then we have the result directly by using the Arzelà–Ascoli theorem.
We can also prove that $K_\alpha^A \equiv \{w \in H_\alpha,\ \int w\, dm_A = 0,\ \|w\|_\alpha \le 1\}$ is compact in the uniform norm. For doing that, let $I_{m_A} : H_\alpha \to \mathbb{R}$ be given by $I_{m_A}(w) = \int w\, dm_A$. We have that $I_{m_A}$ is a bounded linear operator, and therefore $I_{m_A}^{-1}\{0\}$ is a closed subset of $H_\alpha$. Now $K_\alpha^A = K_\alpha \cap I_{m_A}^{-1}\{0\}$ is compact.
Proposition 7. Suppose $\bar A$ is normalized. Then the eigenvalue $\lambda_{\bar A} = 1$ is maximal. Moreover, the remainder of the spectrum of $L_{\bar A} : H_\alpha \to H_\alpha$ is contained in a disk centered at zero with radius strictly smaller than one.
Proof. Remember that 1 is the eigenfunction associated to the eigenvalue 1. We will show that $L_{\bar A}$ restricted to $K_\alpha^{\bar A}$ has spectral radius strictly smaller than 1. We know from Proposition 4 that $L_{\bar A}^k$ converges to zero on the compact set $K_\alpha^{\bar A}$. The normalization hypothesis implies $\|L_{\bar A}^{n+1}(w)\| \le \|L_{\bar A}^n(w)\|$ $\forall\, n \ge 0$. We will now prove that this monotonicity property implies that the convergence above is uniform. More precisely, we have
Claim 3. Given a small $\epsilon$, there exists $N = N_\epsilon \in \mathbb{N}$ such that
$$\|L_{\bar A}^n(w)\| < \epsilon, \quad \forall\, n \ge N, \quad \forall\, w \in K_\alpha^{\bar A}.$$
To prove this claim, let $C_n \equiv \{w \in K_\alpha^{\bar A} : \|L_{\bar A}^m(w)\| < \epsilon\ \ \forall\, m \ge n\}$. The monotonicity property implies $C_n \subseteq C_{n+1}$ and also that $C_n$ is an open set in the uniform norm, while $\|L_{\bar A}^k(w)\| \to 0$ implies $\cup_n C_n = K_\alpha^{\bar A}$. Therefore, compactness of $K_\alpha^{\bar A}$ implies $K_\alpha^{\bar A} = C_N$ for some $N \in \mathbb{N}$.
The last claim is easy to prove and can be stated as:
Claim 4. There exists $C > 0$ such that $\forall\, n \in \mathbb{N}$ and $w \in H_\alpha$
$$|L_{\bar A}^n(w)|_\alpha \le C\|w\| + \frac{|w|_\alpha}{(2^\alpha)^n}.$$
Now, for any given n and k, using Claim 4 we have for $w \in H_\alpha$
$$|L_{\bar A}^{n+k}(w)|_\alpha \le C\|L_{\bar A}^k(w)\| + \frac{|L_{\bar A}^k(w)|_\alpha}{(2^\alpha)^n} \le C\|L_{\bar A}^k(w)\| + C\frac{\|w\|}{(2^\alpha)^n} + \frac{|w|_\alpha}{(2^\alpha)^{n+k}}.$$
Therefore, if $\epsilon$ is small enough and $n \ge N_\epsilon$, we have that for all $w \in K_\alpha^{\bar A}$
$$\|L_{\bar A}^{n+k}(w)\|_\alpha \le \tilde\epsilon < 1.$$
In this case the spectral radius is smaller than $\tilde\epsilon^{\frac{1}{n+k}}$.
We denote by $\lambda_{\bar A}^1 < \lambda_{\bar A}$ the spectral radius of $L_{\bar A}$ when restricted to the set $\{w \in H_\alpha : \int w\, dm_{\bar A} = 0\}$. Now we will show the exponential decay of correlation for Hölder functions.
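The convergence $L_{\bar A}^n \omega \to \int \omega\, dm$ of Proposition 4, which underlies Proposition 7, can be observed numerically for a nearest-neighbor potential. The sketch below is an illustration under assumptions, not part of the paper's argument: $M = S^1$ is discretized by `N` angles, the potential is the illustrative XY example from the Introduction, and the normalized operator $L_{\bar A}$ becomes a row-stochastic matrix `P` with entries $e^{A(a_j, x_i)} \psi_A(a_j) / (N \lambda_A \psi_A(x_i))$. The fixed point of the dual operator is then the left fixed vector of `P`. All function names are hypothetical.

```python
import math

N = 48                                          # grid size on S^1 (assumption)
GRID = [2 * math.pi * j / N for j in range(N)]


def potential(x0, x1, alpha=0.3, gamma=0.5):
    """Illustrative XY potential A(x0, x1) = cos(x1 - x0 - alpha) + gamma*cos(2*x0)."""
    return math.cos(x1 - x0 - alpha) + gamma * math.cos(2 * x0)


KERNEL = [[math.exp(potential(a, x)) / N for a in GRID] for x in GRID]


def apply(matrix, f):
    """Matrix-vector product: the operator acting on a grid function."""
    return [sum(m * v for m, v in zip(row, f)) for row in matrix]


def leading_eigenpair(iterations=400):
    """Power iteration for the discretized L_A, normalized so that max(psi) = 1."""
    psi, lam = [1.0] * N, 1.0
    for _ in range(iterations):
        image = apply(KERNEL, psi)
        lam = max(image)
        psi = [v / lam for v in image]
    return lam, psi


LAM, PSI = leading_eigenpair()
# P discretizes the normalized operator: P[i][j] = e^{A_bar(a_j, x_i)} / N.
P = [[KERNEL[i][j] * PSI[j] / (LAM * PSI[i]) for j in range(N)] for i in range(N)]


def stationary_measure(iterations=400):
    """Left fixed vector of P, i.e. the discrete fixed point of the dual operator."""
    m = [1.0 / N] * N
    for _ in range(iterations):
        m = [sum(m[i] * P[i][j] for i in range(N)) for j in range(N)]
    return m


def iterate_with_oscillation(omega, steps):
    """Return the oscillations max - min of P^n omega for n = 1..steps, and the last iterate."""
    f, oscillations = list(omega), []
    for _ in range(steps):
        f = apply(P, f)
        oscillations.append(max(f) - min(f))
    return oscillations, f
```

In this discrete picture the oscillation of the iterates never increases, which mirrors the monotonicity $\|L_{\bar A}^{n+1}(w)\| \le \|L_{\bar A}^n(w)\|$ used in the proof, and the iterates flatten to the constant given by the integral of the observable against the stationary vector.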
Proposition 8. If $v, w \in L^2(m_{\bar A})$ are such that $w$ is Hölder and $\int w\, dm_{\bar A} = 0$, then there exists $C > 0$ such that for all n
$$\left| \int (v \circ \sigma^n)\, w\, dm_{\bar A} \right| \le C\, (\lambda_{\bar A}^1)^n.$$
Proof. This follows from
$$\int (v \circ \sigma^n)\, w\, dm_{\bar A} = \int v\, L_{\bar A}^n w\, dm_{\bar A}.$$
The above proposition implies that $m_{\bar A}$ is mixing (same reasoning as in [33, Sec. 2], which considers the case of the shift on $\{1, 2, \ldots, d\}^{\mathbb{N}}$).
Proposition 9. The invariant probability measure $m_{\bar A}$ is ergodic.
Proof. If a dynamical system is mixing then it is ergodic (see [33, Sec. 2]).
A major difference between the general XY Model and the Thermodynamic Formalism setting (in the sense of [45, 33]) in $\{1, 2, \ldots, d\}^{\mathbb{Z}}$ is that here we cannot define in the traditional way (via dynamic partitions) the concept of entropy of an invariant probability measure µ (defined over the sigma-algebra $\mathcal{F}$ of $\mathcal{B}$). Each element $x \in \mathcal{B}$ has an uncountable set of pre-images and this is a problem. Note that there exist invariant probabilities (for instance, singular with respect to Lebesgue measure) for the shift on $\mathcal{B}$ which have arbitrarily large Kolmogorov entropy. On the other hand, in the DLR-Gibbs theory (see [31]) a definition of entropy is presented and the variational principle at positive temperatures is worked out and proved. But here we take another path, just in terms of transfer operators, and we present the theory of the Ruelle operator for continuous-spin models, also including some noncontinuous potentials, which is not part of standard treatments. Note that the Gibbs state formalism via boundary conditions, as in [27], does not require, in principle, talking about entropy (see also our Sec. 5). We will address the question about entropy when the potential depends on two coordinates in Sec. 4.
In Statistical Mechanics, for a fixed interaction A under a certain temperature $T > 0$, up to a multiplicative constant, the natural potential to be considered is $\frac{1}{T} A$. We denote $\beta = \frac{1}{T}$, and, using the results above, we can consider the corresponding eigenfunction $\psi_{\beta A}$, eigenvalue $\lambda_{\beta A} = \lambda_\beta$, and the Gibbs state, which now will be denoted $\mu_{\beta A}$. What happens to these objects when $T \to 0$ (or, $\beta \to \infty$) is the subject of the next section.
2. Zero Temperature: Calibrated Subactions, Maximizing Probability Measures and Selection of Probability Measures
In this section and also in the next two sections we will consider, among other issues, questions involving selections of probability measures when the temperature
goes to zero, maximizing probability measures for a given potential and existence of calibrated subactions. Among other results we will show that, under some conditions, the sequence {µβA } of Gibbs states for the potential βA converges to a measure µ∞ which has the property of maximizing the integral A dµ among all invariant measures µ for the shift map. Sometimes such convergence will not occur (this is what we call non selection of probability measures — a very interesting example due to van Enter and Ruszel will be presented in Sec. 6). We will also consider calibrated subactions, which is an important tool that allows one to identify the support of the maximizing probability measure µ∞ (see Eq. (6) below), and can be used to relate the maximal eigenvalues of the Ruelle operator to the value m(A) = A dµ∞ (see Theorem 11). Existence of calibrated subactions are also related to the existence of large deviation principles for the convergence of {µβA } to µ∞ (see Theorem 18 in Sec. 4). Some of the problems discussed here are usually called ergodic optimization problems (see [32]). We refer the reader to [16] for question related to Ergodic Transport Theory. Consider a fixed Holder potential A and a real variable β > 0. We denote by ψβA the eigenfunction for the Ruelle operator associated to βA. Remark 5. Given β and A, the Lipschitz constant of uβ , such that ψβA = euβ , depends on the Holder constant for βA (see Remarks 1 and 2). More precisely, the α Lipschitz constant of uβ = log(ψβA ) is given by β 2α2−1 HolA . Therefore, β1 log(ψβA ), β > 0, is equicontinuous. Note that it is also uniformly bounded from the reasons described below. A possible renormalization condition for ψβA [15] is ψβA dρβA = 1, where ρβA is the eigen-probability for L∗βA (see Proposition 5 and Remark 3). For each β > 0 the normalization hypothesis ψβA dρβA = 1 implies the existence of xβ ∈ B such that ψβ (xβ ) = 1. Here we are using the connectedness hypothesis of B. 
When $\beta \to \infty$ we have that $x_{\beta_k} \to \bar{x}$, for a subsequence. Note that when we normalize $\psi_{\beta A}$ the Hölder constant of $\log(\psi_{\beta A})$ remains unchanged, which assures the equicontinuity of the family $\frac{1}{\beta} \log(\psi_{\beta A})$, $\beta > 0$. Moreover, the normalization hypothesis and Remark 5 imply that $\frac{1}{\beta} \log(\psi_{\beta A})$, $\beta > 0$, is uniformly bounded. Therefore, there exist a subsequence $\beta_n \to \infty$ and a Lipschitz function $V$ such that, in the sense of uniform convergence,

$$V := \lim_{n\to\infty} \frac{1}{\beta_n} \log(\psi_{\beta_n A}).$$

Consider a point $p_0 \in B$. Another possible normalization for the eigenfunction $\psi_{\beta A}$ is to assume that $\psi_{\beta A}(p_0) = 1$. We will prefer this latter form. By selection of a function $V$, when the temperature goes to zero (or, $\beta \to \infty$), we mean the existence of the limit (in the uniform norm)

$$V := \lim_{\beta\to\infty} \frac{1}{\beta} \log(\psi_{\beta A}).$$
The existence of the limit when $\beta \to \infty$ (not just of a subsequence), in the general case, is not an easy question. In this section we denote by $\mu_{\beta A}$ the Gibbs state for the potential $\beta A$, i.e. the eigen-probability of $L^*_{\bar{A}}$, where $\bar{A} = A + \log \psi_A - \log \psi_A \circ \sigma - \log \lambda_A$. By selection of a measure $\tilde{\mu}_\infty$, when the temperature goes to zero (or, $\beta \to \infty$), we mean the existence of the limit (in the weak* sense)

$$\tilde{\mu}_\infty := \lim_{\beta\to\infty} \mu_{\beta A}.$$
In some sense $V$ is what one can get in the limit, in the log-scale, from the eigenfunction (at non-zero temperature), and $\tilde{\mu}_\infty$ is the Gibbs state at temperature zero. Even if $A$ is Lipschitz, the above limit of $\mu_{\beta A}$, $\beta \to \infty$, does not always exist. In fact we will show an interesting example in Sec. 6 (due to van Enter and Ruszel) where there is no limit for $\mu_{\beta A}$ as $\beta \to \infty$. Some theorems in this section are generalizations of corresponding ones in [38] (which considers only potentials $A$ which depend on two coordinates). Related results appear in [25, 26]. Results about selection (or non-selection) in the setting of Thermodynamic Formalism appear in [5, 4, 9, 36, 8, 40]. Some of the proofs and results presented in the present section are similar to other ones in Ergodic Optimization [32] and Thermodynamic Formalism, but the main point is that we have to avoid in the proofs the concept of entropy and the variational principle of pressure.

Remember that we denote by $M_\sigma$ the set of $\sigma$-invariant Borel probability measures over $B$. As $M_\sigma$ is compact, given $A$, there always exists a subsequence $\beta_n$ such that $\mu_{\beta_n A}$ converges to an invariant probability measure. We consider the following problem: given $A : B \to \mathbb{R}$ Lipschitz, we want to find measures that maximize, over $M_\sigma$, the value $\int A(x)\, d\mu(x)$. We define

$$m(A) = \max_{\mu \in M_\sigma} \left\{ \int A\, d\mu \right\}.$$
Any of these measures will be called a maximizing probability measure, which is sometimes denoted by $\mu_\infty$. As $M_\sigma$ is compact, there always exists at least one maximizing probability measure. It is also true that there exist ergodic maximizing probability measures. Indeed, the set of maximizing probability measures is convex and compact, and the extreme probability measures of this convex set are ergodic (they cannot be expressed as a convex combination of others [33]). Any maximizing probability measure is a convex combination of ergodic ones [46]. Even when $A$ is Hölder the maximizing probability measure $\mu_\infty$ does not have to be unique. For instance, suppose that $A$ is Hölder and has maximum value just in
the union of two different fixed points (for the shift $\sigma$) $p_0 \in B$ and $p_1 \in B$. In this case the set of maximizing probability measures $\mu_\infty$ is $\{t\, \delta_{p_0} + (1 - t)\, \delta_{p_1} \mid t \in [0, 1]\}$. Note that $\delta_{p_0}$ and $\delta_{p_1}$ are ergodic, but the other maximizing probability measures are not.

Similar definitions for a potential $A : B_i \to \mathbb{R}$ and maximization of $\int A\, d\hat{\mu}$, over all the $\hat{\mu}$ which are $\hat{\sigma}$-invariant probability measures, can also be considered. Questions about selection of measure also make sense.

Definition 1. A continuous function $u : B \to \mathbb{R}$ is called a calibrated subaction for $A : B \to \mathbb{R}$, if, for any $y \in B$, we have

$$u(y) = \max_{\sigma(x) = y} [A(x) + u(x) - m(A)]. \qquad (5)$$

This can also be expressed as

$$m(A) = \max_{a \in M} \{A(ay) + u(ay) - u(y)\}.$$
Note that for any $x \in B$ we have $u(\sigma(x)) - u(x) - A(x) + m(A) \geq 0$. The above equation for $u$ can be seen as a kind of discrete version of a subsolution of the Hamilton–Jacobi equation [12, 6, 21]. It can also be seen as a kind of dynamic additive eigenvalue problem [13, 14, 24]. If $u$ is a calibrated subaction, then $u + c$, where $c$ is a constant, is also a calibrated subaction. An interesting question is when such a calibrated subaction $u$ is unique up to an additive constant.

Remember that if $\nu$ is invariant for $\sigma$, then for any continuous function $u : B \to \mathbb{R}$ we have $\int [u(\sigma(x)) - u(x)]\, d\nu = 0$. Suppose $\mu_\infty$ is maximizing for $A$ and $u$ is a calibrated subaction for $A$. It follows at once (see for instance [15, 32, 51] for a similar result) that for any $x$ in the support of $\mu_\infty$ we have

$$u(\sigma(x)) - u(x) - A(x) + m(A) = 0. \qquad (6)$$
In this way, if we know the value $m(A)$, then a calibrated subaction $u$ for $A$ helps us to identify the support of maximizing probabilities. The above equation can be true outside the union of the supports of the maximizing probabilities. Maximizing probability measures are natural candidates for being selected by $\mu_{\beta A}$, as $\beta \to \infty$. But, in our setting, without the maximizing principle of pressure (which one can take advantage of in the classical Thermodynamic Formalism) this is not so obvious. We address the question in Sec. 3.

Proposition 10. For any $\beta$, we have

$$-\|A\| < \frac{1}{\beta} \log \lambda_\beta < \|A\|.$$
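Proof. Fix $\beta > 0$. We choose $\bar{x}$ the maximum of $\psi_{\beta A}$ in $B$ and $\tilde{x}$ the minimum of $\psi_{\beta A}$ in $B$. Now, if $\|A\|$ is the uniform norm of $A$, we have

$$\lambda_\beta = \frac{1}{\psi_{\beta A}(\bar{x})} \int e^{\beta A(a\bar{x})}\, \psi_{\beta A}(a\bar{x})\, da \leq \int e^{\beta A(a\bar{x})}\, da \leq e^{\beta \|A\|}$$

and

$$\lambda_\beta = \frac{1}{\psi_{\beta A}(\tilde{x})} \int e^{\beta A(a\tilde{x})}\, \psi_{\beta A}(a\tilde{x})\, da \geq \int e^{\beta A(a\tilde{x})}\, da \geq e^{-\beta \|A\|},$$

which proves the result.

From now on, we will suppose $M = S^1$ to avoid technical issues. But we claim that the following results hold for more general connected and compact manifolds. Considering a subsequence $\beta_n$ we get the existence of a limit $\frac{1}{\beta_n} \log \lambda_{\beta_n} \to K$, when $n \to \infty$. By taking a subsequence we can assume that it is also true that there exists $V$ Lipschitz such that $V := \lim_{n\to\infty} \frac{1}{\beta_n} \log(\psi_{\beta_n A})$. Given $y \in B$, consider the equation

$$\lambda_{\beta_n} = \frac{1}{\psi_{\beta_n A}(y)} \int e^{\beta_n A(ay)}\, \psi_{\beta_n A}(ay)\, da.$$

It follows from the Laplace method that, when $n \to \infty$,

$$K = \max_{a \in S^1} \{A(ay) + V(ay) - V(y)\}.$$

If we are able to show that $K = m(A)$, then we can say that any limit of a subsequence $\lim_{n\to\infty} \frac{1}{\beta_n} \log(\psi_{\beta_n A})$ is a calibrated subaction, and we will get, finally, that

$$\lim_{\beta\to\infty} \frac{1}{\beta} \log \lambda_{\beta A} = \lim_{\beta\to\infty} \frac{1}{\beta} \log \lambda_\beta = m(A).$$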
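The behaviour of $\frac{1}{\beta} \log \lambda_\beta$ for large $\beta$ can be explored numerically. The following is a minimal sketch, not part of the original argument, under stated assumptions: the potential depends on two coordinates only, $S^1$ is replaced by a uniform grid of `n` points with weights $1/n$, and `potential` is a hypothetical example whose maximal value 1 is attained only at the fixed point $(0, 0, \ldots)$, so that $m(A) = 1$ and $\|A\| = 1.25$. The eigenvalue is approximated by power iteration.

```python
import math

def potential(a, y):
    # Hypothetical example (an assumption): maximum value 1 only at a = y = 0.
    d = min(abs(a - y), 1.0 - abs(a - y))      # distance on the circle [0, 1)
    return math.cos(2.0 * math.pi * a) - d * d

def log_eigenvalue_over_beta(beta, n=30, iters=400):
    """Approximate (1/beta) log(lambda_beta) on a grid of n points by power iteration."""
    grid = [i / n for i in range(n)]
    kernel = [[math.exp(beta * potential(a, y)) for y in grid] for a in grid]
    psi = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        # Discretized Ruelle operator: psi(y) -> (1/n) * sum_a e^{beta A(a, y)} psi(a)
        new = [sum(kernel[i][j] * psi[i] for i in range(n)) / n for j in range(n)]
        lam = max(new)                         # eigenvalue estimate, since max(psi) = 1
        psi = [v / lam for v in new]
    return math.log(lam) / beta

values = {beta: log_eigenvalue_over_beta(beta) for beta in (1.0, 10.0, 50.0, 200.0)}
for beta, value in values.items():
    print(beta, value)
```

On the grid the printed values stay inside the interval given by Proposition 10 and approach $m(A) = 1$ as $\beta$ grows; the discretization error is of order $\log(n)/\beta$.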
The next theorem is inspired by [1, Theorem 1] and [28, Theorem 3.3]. It follows from the last part of its proof that $K = m(A)$.

Theorem 11. Given $A$ Lipschitz there exists $u$ Lipschitz which is a calibrated subaction for $A$. As a consequence, we have that

$$\lim_{\beta\to\infty} \frac{1}{\beta} \log \lambda_\beta = m(A).$$

Proof. Suppose $A : B \to \mathbb{R}$ is Lipschitz. Given $0 < \lambda \leq 1$, consider the operator $\hat{L}_\lambda : C \to C$ given by

$$\hat{L}_\lambda(u)(x) = \sup_{a \in S^1} [A(ax) + \lambda u(ax)].$$

Given $x \in B$, we denote by $a_x \in S^1$ one of the points $a$ where the supremum is attained.
It is easy to see that for any $0 < \lambda < 1$, the transformation $\hat{L}_\lambda$ is a contraction on $C$ with the uniform norm. Indeed, given $x \in B$,

$$\sup_{a \in S^1} [A(ax) + \lambda u(ax)] - \sup_{b \in S^1} [A(bx) + \lambda v(bx)] \leq [A(a_x x) + \lambda u(a_x x)] - [A(a_x x) + \lambda v(a_x x)] \leq \lambda u(a_x x) - \lambda v(a_x x) \leq \lambda \|u - v\|.$$

Denote by $u_\lambda$ the corresponding fixed point in $C$. We want to show that $u_\lambda$ is equicontinuous. Consider $x_0, y_0 \in B$. For the given $x_0$ we take the corresponding $a_{x_0} \in M$, and then we get $x_1 = a_{x_0} x_0$. By induction, given $x_j$, we get $x_{j+1} = a_{x_j} x_j$. We can also get a sequence $y_j \in B$, $j \geq 1$, such that $y_j = a_{x_{j-1}} \cdots a_{x_1} a_{x_0} y_0$. Note that for all $j$ we have $\sigma^j(y_j) = y_0$. As for any $j$ we have $u_\lambda(y_j) \geq A(y_{j+1}) + \lambda u_\lambda(y_{j+1})$, then

$$u_\lambda(x_j) - u_\lambda(y_j) \leq [A(x_{j+1}) - A(y_{j+1})] + \lambda [u_\lambda(x_{j+1}) - u_\lambda(y_{j+1})].$$

Therefore, given $x_0, y_0$,

$$u_\lambda(x_0) - u_\lambda(y_0) \leq \sum_{j=0}^{\infty} \lambda^j [A(x_j) - A(y_j)] \leq (1 - \lambda) \sum_{j=0}^{\infty} \lambda^j \sum_{i=0}^{j} [A(x_i) - A(y_i)] \leq \sup_j \sum_{i=0}^{j} [A(x_i) - A(y_i)] \leq \|A\| \sup_j \sum_{i=0}^{j} \frac{1}{2^i}\, d(x_0, y_0) < \|A\|\, 2\, d(x_0, y_0).$$

This shows that $u_\lambda$ is Lipschitz, and, moreover, that $u_\lambda$, $0 \leq \lambda < 1$, is an equicontinuous family. Note the very important point: the Lipschitz constant of $u_\lambda$ depends on $A$.

Denote $u^*_\lambda = u_\lambda - \max u_\lambda$. Using Arzelà–Ascoli we get the existence of a subsequence $\lambda_n \to 1$ such that $u^*_{\lambda_n} \to u$. We claim that $u$ is a subaction. Indeed, given $x \in B$, as

$$|u_\lambda(x)| \leq \lambda |u_\lambda(a_x x)| + |A(a_x x)| \leq \lambda \|u_\lambda\| + \|A\|,$$

then $(1 - \lambda) \|u_\lambda\| < C$, where $C$ is a constant. From this it follows that there is a constant $k$ such that, for some subsequence (of the previous subsequence $\lambda_n$), which will also be denoted by $\lambda_n$, we have $(1 - \lambda_n) \max u_{\lambda_n} \to k$.
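The contraction property of $\hat{L}_\lambda$ and the behaviour of $(1 - \lambda) \max u_\lambda$ can be checked on a small example. This is a hedged sketch only: it assumes a potential depending on two coordinates (so that $u$ may be taken as a function of the first coordinate), a uniform grid replacing $S^1$, and a hypothetical `potential` with $m(A) = 1$ (its maximum is attained only at the fixed point $0$).

```python
import math

N = 40
GRID = [i / N for i in range(N)]
LAM = 0.95                                     # discount factor, 0 < LAM < 1

def potential(a, x0):
    # Hypothetical example (an assumption): maximum value 1 only at a = x0 = 0.
    d = min(abs(a - x0), 1.0 - abs(a - x0))
    return math.cos(2.0 * math.pi * a) - d * d

A = [[potential(a, x0) for x0 in GRID] for a in GRID]

def discounted_operator(u, lam=LAM):
    # (L_lam u)(x) = max_a [A(a, x) + lam * u(a)]
    return [max(A[i][j] + lam * u[i] for i in range(N)) for j in range(N)]

def sup_dist(u, v):
    return max(abs(p - q) for p, q in zip(u, v))

# Contraction check on two arbitrary test functions.
u_test = [math.sin(7.0 * i) for i in range(N)]
v_test = [math.cos(3.0 * i) for i in range(N)]
lhs = sup_dist(discounted_operator(u_test), discounted_operator(v_test))
rhs = LAM * sup_dist(u_test, v_test)

# Fixed point u_lam by iterating the contraction from u = 0.
u_lam = [0.0] * N
for _ in range(600):
    u_lam = discounted_operator(u_lam)
k_estimate = (1.0 - LAM) * max(u_lam)
print(lhs <= rhs, k_estimate)
```

For this particular potential $\max u_\lambda = 1/(1 - \lambda)$, because the supremum can always be realised by staying at the fixed point $0$, so `k_estimate` agrees with $m(A) = 1$ up to the iteration error.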
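Note that for any $\lambda$

$$u^*_\lambda(x) = u_\lambda(x) - \max u_\lambda = -(1 - \lambda) \max u_\lambda + u_\lambda(x) - \lambda \max u_\lambda = -(1 - \lambda) \max u_\lambda + \max_{a \in S^1} \{A(ax) + (\lambda u_\lambda(ax) - \lambda \max u_\lambda)\}.$$

Taking the limit $n \to \infty$ for the sequence $\lambda_n$ we get

$$u(x) = -k + \max_{a \in S^1} \{A(ax) + u(ax)\} = \max_{a \in S^1} \{A(ax) + u(ax) - k\}.$$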
Now, all we have to show is that $k = m(A)$. From the above it follows at once that $-u(\sigma(y)) + u(y) + A(y) \leq k$. If $\nu$ is a $\sigma$-invariant probability measure, then

$$\int A(y)\, d\nu(y) = \int [-u(\sigma(y)) + u(y) + A(y)]\, d\nu(y) \leq k,$$

and this shows that $m(A) \leq k$.

Now we show that $m(A) \geq k$. Note that for any $x$ there exists $y = a_x x$ such that $\sigma(y) = x$ and $-u(\sigma(y)) + u(y) + A(y) = k$. Therefore, the compact set $K = \{y \mid -u(\sigma(y)) + u(y) + A(y) = k\}$ is such that $K' = \bigcap_n \sigma^{-n}(K)$ is non-empty, compact and $\sigma$-invariant. If we consider a $\sigma$-invariant probability measure $\nu$ with support on $K'$, we have that $\int A(y)\, d\nu(y) = k$. From this it follows that $m(A) \geq k$.

Now we state a general result assuming just that $A$ is continuous (not necessarily Lipschitz). We refer the reader to [23, Theorem 1], [38, Proposition 4], [28, Theorem 2.4] for related results.

Theorem 12. Given a potential $A \in C$, we have

$$m(A) = \inf_{f \in C}\ \max_{(a, x) \in S^1 \times B} [A(ax) + f(ax) - f(x)].$$
Proof. First, consider the convex correspondence $F : C \to \mathbb{R}$ defined by $F(g) = \max(A + g)$. Consider also the subset $\mathcal{G} = \{g \in C : \text{there exists } f \in C \text{ such that } g(ax) = f(ax) - f(x)\} \neq \emptyset$. Now consider the concave correspondence $G : C \to \mathbb{R} \cup \{-\infty\}$ taking $G(g) = 0$ if $g \in \overline{\mathcal{G}}$, and $G(g) = -\infty$ otherwise.
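Let $S$ be the set of the signed measures over the Borel sigma-algebra of $B$. Remember that the corresponding Fenchel transforms, $F^* : S \to \mathbb{R} \cup \{+\infty\}$ and $G^* : S \to \mathbb{R} \cup \{-\infty\}$, are given by

$$F^*(\hat{\mu}) = \sup_{g \in C} \left[ \int g(ax)\, d\hat{\mu}(ax) - F(g) \right], \quad \text{and} \quad G^*(\hat{\mu}) = \inf_{g \in C} \left[ \int g(ax)\, d\hat{\mu}(ax) - G(g) \right].$$

Denote

$$S_0 = \left\{ \hat{\mu} \in S : \int f(ax)\, d\hat{\mu}(ax) = \int f(x)\, d\hat{\mu}(x),\ \forall f \in C \right\}.$$

We denote by $M$ the set of probability measures over $B$. Given $F$ and $G$ as above, we claim that

$$F^*(\hat{\mu}) = \begin{cases} -\int A(y, x)\, d\hat{\mu}(y, x) & \text{if } \hat{\mu} \in M \\ +\infty & \text{otherwise} \end{cases} \quad \text{and} \quad G^*(\hat{\mu}) = \begin{cases} 0 & \text{if } \hat{\mu} \in S_0 \\ -\infty & \text{otherwise.} \end{cases}$$

We refer the reader to [23] or [38] for a proof of this claim (which is basically the same as we need here). Once the correspondence $F$ is Lipschitz, the duality theorem of Fenchel–Rockafellar [47] assures

$$\sup_{g \in C} [G(g) - F(g)] = \inf_{\hat{\mu} \in S} [F^*(\hat{\mu}) - G^*(\hat{\mu})],$$

$$\sup_{g \in \mathcal{G}} \left[ -\max_{(a, x) \in S^1 \times B} (A + g)(ax) \right] = \inf_{\hat{\mu} \in M_\sigma} \left[ -\int A(ax)\, d\hat{\mu}(ax) \right].$$

Finally, from the definition of $\mathcal{G}$, the claim of the theorem follows.

3. A Definition of Entropy for Gibbs States at Positive Temperature and Selection of Probability Measure

Given a Lipschitz function $A$ we have that

$$\int e^{A(ax)}\, \frac{\psi_A(ax)}{\lambda_A\, \psi_A(x)}\, da = 1, \quad \forall x \in B.$$

We denote as before $\bar{A} = A + \log \psi_A - \log \psi_A \circ \sigma - \log \lambda_A$,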
where $\sigma : B \to B$ is the usual shift map. In this case the normalized potential $\bar{A}$ satisfies

$$\int_M e^{\bar{A}(ax)}\, da = 1, \quad \forall x \in B,$$

which means $L_{\bar{A}}(1) = 1$. Therefore,

$$\int_B \int_M e^{\bar{A}(ax)}\, da\, d\mu_A(x) = 1.$$

Note that for a fixed $x$ the value $\bar{A}(ax)$ cannot be smaller than zero for all $a \in M$. This is quite different from the analogous case where we consider the shift over $\{1, 2, \ldots, d\}^{\mathbb{N}}$ in the classical Thermodynamic Formalism.

For each $a \in M$, $x \in B$, we denote $J(ax) = \min\{1, e^{\bar{A}(ax)}\}$.

Definition 2. Given the invariant probability measure $\mu_A$, associated to the Lipschitz potential $A$, we define the entropy of $\mu_A$ as

$$h(\mu_A) = -\int \log J(y)\, d\mu_A(y) > 0.$$

In other words

$$h(\mu_A) = -\int \bar{A}(y)\, I_{\{\bar{A} \leq 0\}}(y)\, d\mu_A(y).$$

The set of probability measures $\mu_A$, with $A$ Lipschitz, is dense in the set of $\sigma$-invariant probability measures [37]. Note that, as $\mu_A$ is $\sigma$-invariant,

$$-h(\mu_A) = \int \log J(y)\, d\mu_A(y) \leq \int \log \left( \frac{e^{A(y)}\, \psi_A(y)}{\lambda_A\, \psi_A(\sigma(y))} \right) d\mu_A(y) = \int A\, d\mu_A - \log \lambda_A.$$

Therefore,

$$\log \lambda_A \leq h(\mu_A) + \int A\, d\mu_A.$$

For a fixed $A$ consider now for each real value $\beta$ the corresponding potential $\beta A$. Therefore,

$$\log \lambda_{\beta A} \leq h(\mu_{\beta A}) + \beta \int A\, d\mu_{\beta A}.$$

Suppose for a certain subsequence $\beta_n$ we have that $\mu_{\beta_n A} \to \mu$.
If we divide the last inequality by $\beta_n$ and take the limit in $n$, we get

$$m(A) \leq \limsup_{n\to\infty} \frac{h(\mu_{\beta_n A})}{\beta_n} + \int A\, d\mu.$$

From the above we can derive:

Theorem 13. Suppose that $\mu = \lim_{n\to\infty} \mu_{\beta_n A}$, for some subsequence $\beta_n$, and

$$\limsup_{n\to\infty} \frac{h(\mu_{\beta_n A})}{\beta_n} = 0,$$

then the limit measure $\mu$ is a maximizing probability measure.

Corollary 14. If the maximizing probability measure $\mu_\infty$ for $A$ is unique, and

$$\limsup_{\beta\to\infty} \frac{h(\mu_{\beta A})}{\beta} = 0,$$

then $\mu_{\beta A}$, when $\beta \to \infty$, selects the maximizing probability measure $\mu_\infty$.

4. Analysis of the Case in Which the Potential Depends on Two Coordinates

In this section we suppose the potential depends on two coordinates and the metric space is $M = S^1$. In this case the Ruelle operator has a simple form. We will make the usual identification of $S^1$ with $[0, 1]$ (in further sections we will make the identification of $S^1$ with $[0, 2\pi]$). We will present several results from [38] which will be needed in future sections. We will need to define the following operators:

Definition 3. Let $L_\beta, \bar{L}_\beta : C([0, 1]) \to C([0, 1])$ be given by

$$L_\beta \psi(y) = \int e^{\beta A(x, y)}\, \psi(x)\, dx, \qquad (7)$$

$$\bar{L}_\beta \psi(x) = \int e^{\beta A(x, y)}\, \psi(y)\, dy. \qquad (8)$$
We refer the reader to [34] and [49, Chap. IV] for general results on positive integral operators. The next theorem (Krein–Rutman) is well known. It will follow that, when $A$ depends just on two coordinates $(x_0, x_1)$, the eigenfunction of the Ruelle operator (as defined in previous sections) depends only on the first coordinate $x_0$ (similar to [55]).

Theorem 15. The operators $L_\beta$ and $\bar{L}_\beta$ have the same positive maximal eigenvalue $\lambda_\beta$, which is simple and isolated. The associated eigenfunctions are positive functions.

Let us call $\psi_\beta$, $\bar{\psi}_\beta$ the positive eigenfunctions for $L_\beta$ and $\bar{L}_\beta$ associated to $\lambda_\beta$, which satisfy the normalization condition $\int \psi_\beta(x)\, dx = 1$ and $\int \bar{\psi}_\beta(x)\, dx = 1$.
We will define a density $\theta_\beta : [0, 1] \to \mathbb{R}$ by

$$\theta_\beta(x) := \frac{\psi_\beta(x)\, \bar{\psi}_\beta(x)}{\pi_\beta}, \qquad (9)$$

where $\pi_\beta = \int \psi_\beta(x)\, \bar{\psi}_\beta(x)\, dx$, and a transition $K_\beta : [0, 1]^2 \to \mathbb{R}$ by

$$K_\beta(x, y) := \frac{e^{\beta A(x, y)}\, \bar{\psi}_\beta(y)}{\bar{\psi}_\beta(x)\, \lambda_\beta}. \qquad (10)$$

The above expressions are consistent with the results obtained in [22, Sec. 3]. This can also be formulated as a variational pressure problem, as we will see soon. Note that if $A(x, y) = A(y, x)$, then $\psi_\beta$ and $\bar{\psi}_\beta$ are constant, and, therefore, $\theta_\beta$ is constant equal to 1. This happens for $M = S^1$ in the case where $A$ is of the form $U(x - y)$ for a periodic function $U$. This case will be considered later.

Consider a probability measure $\nu$ on $[0, 1]^2$ that can be disintegrated as $d\nu(x, y) = d\theta(x)\, dK_x(y)$, where $\theta : [0, 1] \to [0, +\infty)$ and $K : [0, 1]^2 \to [0, +\infty)$ are continuous functions. We will denote this by $\nu = \theta K$, where $\theta$ is a continuous density of probability on $[0, 1]$.

Definition 4. A probability measure $\theta$ on $[0, 1]$ is called stationary for a transition $K(\cdot, \cdot)$, if

$$\theta(B) = \int K(x, B)\, d\theta(x),$$

for every interval $B \subset [0, 1]$.

More explicitly we assume $K : [0, 1]^2 \to [0, +\infty)$ and $\theta : [0, 1] \to [0, +\infty)$ satisfy the following equations:

$$\int K(x, y)\, dy = 1, \quad \forall x \in [0, 1], \qquad (11)$$

$$\iint \theta(x)\, K(x, y)\, dx\, dy = 1, \qquad (12)$$

$$\int \theta(x)\, K(x, y)\, dx = \theta(y), \quad \forall y \in [0, 1]. \qquad (13)$$

Given the initial probability measure $\theta$ and the transition $K$, as above, one can define a Markov process $\{X_n\}_{n \in \mathbb{N}}$ with state space $[0, 1]$ (see [38] for more details). The measure $\mu$ over $[0, 1]^{\mathbb{N}}$ which describes this process is

$$\mu(A_0 \ldots A_n \times [0, 1]^{\mathbb{N}}) := \int_{A_0 \ldots A_n} \theta(x_0)\, K(x_0, x_1) \cdots K(x_{n-1}, x_n)\, dx_n \ldots dx_0$$

for any cylinder $A_0 \ldots A_n \times [0, 1]^{\mathbb{N}}$. If $\theta$ is stationary, the Markov process $X_n$ will be stationary. Note that $\theta_\beta$ above is stationary for $K_\beta(x, y)$. In this way we can define $\nu_\beta = \theta_\beta K_\beta$ on $[0, 1]^2$.
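The construction of $\theta_\beta$ and $K_\beta$ from the eigenfunctions can be illustrated numerically. The sketch below is an assumption-laden illustration, not the method of [38]: the integral operators (7) and (8) are replaced by midpoint-rule sums on a grid of `N` points, the eigenfunctions are computed by power iteration, and `potential` and `BETA` are hypothetical choices. It then evaluates the discrete analogues of Eqs. (11) and (13).

```python
import math

N = 40                                     # grid size (assumption)
W = 1.0 / N                                # quadrature weight for dx on [0, 1]
GRID = [(i + 0.5) / N for i in range(N)]
BETA = 2.0                                 # inverse temperature (assumption)

def potential(x, y):
    # Hypothetical two-coordinate potential, chosen only for illustration.
    return math.cos(2.0 * math.pi * (x - y)) + 0.5 * math.sin(2.0 * math.pi * x)

E = [[math.exp(BETA * potential(x, y)) for y in GRID] for x in GRID]

def apply_L(psi):       # Eq. (7): (L psi)(y) = int e^{beta A(x, y)} psi(x) dx
    return [sum(E[i][j] * psi[i] for i in range(N)) * W for j in range(N)]

def apply_L_bar(psi):   # Eq. (8): (L_bar psi)(x) = int e^{beta A(x, y)} psi(y) dy
    return [sum(E[i][j] * psi[j] for j in range(N)) * W for i in range(N)]

def power_iteration(op, iters=400):
    psi = [1.0] * N                        # integrates to 1 on the grid
    lam = 1.0
    for _ in range(iters):
        new = op(psi)
        lam = sum(new) * W                 # eigenvalue estimate (psi integrates to 1)
        psi = [v / lam for v in new]       # keep the normalization int psi dx = 1
    return lam, psi

lam, psi = power_iteration(apply_L)
lam_bar, psi_bar = power_iteration(apply_L_bar)

pi_beta = sum(p * q for p, q in zip(psi, psi_bar)) * W
theta = [p * q / pi_beta for p, q in zip(psi, psi_bar)]                  # Eq. (9)
K = [[E[i][j] * psi_bar[j] / (psi_bar[i] * lam_bar) for j in range(N)]  # Eq. (10)
     for i in range(N)]

# Discrete versions of Eq. (11) (rows of K integrate to 1) and Eq. (13) (stationarity).
row_error = max(abs(sum(K[i][j] for j in range(N)) * W - 1.0) for i in range(N))
stationarity_error = max(
    abs(sum(theta[i] * K[i][j] for i in range(N)) * W - theta[j]) for j in range(N))
print(lam, lam_bar, row_error, stationarity_error)
```

Both operators share the same leading eigenvalue on the grid, as in Theorem 15, because the matrix of one is the transpose of the other; the two error terms should be at the level of floating-point rounding.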
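For instance,

$$\mu_{\beta, A}([a_1, a_2] \times [b_1, b_2] \times [c_1, c_2] \times [0, 1]^{\mathbb{N}}) = \int_{a_1}^{a_2} \int_{b_1}^{b_2} \int_{c_1}^{c_2} \theta_\beta(x_0)\, K_\beta(x_0, x_1)\, K_\beta(x_1, x_2)\, dx_2\, dx_1\, dx_0. \qquad (14)$$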
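The next result is similar to the one described in [55].

Theorem 16. Suppose $A$ is a Hölder continuous function. Then the probability measure $\mu_{\beta, A}$ defined in (14) is the Gibbs state for the potential $\beta A$.

Proof. We need to show that $L^*_{\bar{A}}(\mu_{\beta, A}) = \mu_{\beta, A}$, where $\beta \bar{A} = \beta A + \log \psi_\beta - \log \psi_\beta \circ \sigma - \log \lambda_\beta$. Indeed, let $g \in C$ be such that $g(x_0, x_1, \ldots) = g(x_0, \ldots, x_k)$. By definition of $L^*_{\bar{A}}$ we have

$$\int_B g\, dL^*_{\bar{A}}(\mu_{\beta, A}) = \int_B L_{\bar{A}}(g)\, d\mu_{\beta, A}$$

$$= \int_{[0,1]^k} \left[ \int_{[0,1]} e^{\beta \bar{A}(a, x_0)}\, g(a, x_0, \ldots, x_{k-1})\, da \right] \theta_\beta(x_0) \prod_{j=0}^{k-2} K_\beta(x_j, x_{j+1})\, dx_{k-1} \ldots dx_0$$

$$= \int_{[0,1]^k} \left[ \int_{[0,1]} e^{\beta A(a, x_0)}\, \frac{\psi_\beta(a)}{\lambda_\beta\, \psi_\beta(x_0)}\, g(a, \ldots, x_{k-1})\, da \right] \theta_\beta(x_0) \prod_{j=0}^{k-2} K_\beta(x_j, x_{j+1})\, dx_{k-1} \ldots dx_0$$

$$= \int_{[0,1]^{k+1}} e^{\beta A(a, x_0)}\, \frac{\psi_\beta(a)\, \bar{\psi}_\beta(x_0)}{\lambda_\beta\, \pi_\beta}\, g(a, \ldots, x_{k-1}) \prod_{j=0}^{k-2} K_\beta(x_j, x_{j+1})\, dx_{k-1} \ldots dx_0\, da$$

$$= \int_{[0,1]^{k+1}} g(a, \ldots, x_{k-1})\, e^{\beta A(a, x_0)}\, \frac{\bar{\psi}_\beta(x_0)}{\lambda_\beta\, \bar{\psi}_\beta(a)}\, \frac{\bar{\psi}_\beta(a)\, \psi_\beta(a)}{\pi_\beta} \prod_{j=0}^{k-2} K_\beta(x_j, x_{j+1})\, dx_{k-1} \ldots dx_0\, da$$

$$= \int_{[0,1]^{k+1}} g(a, x_0, \ldots, x_{k-1})\, \theta_\beta(a)\, K_\beta(a, x_0) \prod_{j=0}^{k-2} K_\beta(x_j, x_{j+1})\, dx_{k-1} \ldots dx_0\, da$$

$$= \int_{[0,1]^{k+1}} g(x_0, x_1, \ldots, x_{k-1}, x_k)\, \theta_\beta(x_0) \prod_{j=0}^{k-2} K_\beta(x_j, x_{j+1})\, K_\beta(x_{k-1}, x_k)\, dx_k \ldots dx_1\, dx_0.$$

Hence, for any continuous $g$,

$$\int_B g(x_0, \ldots, x_k)\, dL^*_{\bar{A}}(\mu_{\beta, A}) = \int_B g(x_0, \ldots, x_k)\, d\mu_{\beta, A}.$$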
The entropy (as defined in Sec. 3) of such a probability measure $\mu_{\beta A}$ is

$$h(\mu_{\beta A}) = -\int A(y)\, I_{\{A - \log \lambda_\beta \leq 0\}}(y)\, d\mu_{\beta A}(y) + \log \lambda_\beta.$$

Definition 5. We denote by $M_0$ the set of all $\nu = \theta K$ on $[0, 1]^2$, where $\theta$ is stationary for $K$.

Definition 6. For an absolutely continuous probability measure $\nu \in M_{[0,1]^2}$, given by a density $\nu(x, y)\, dx\, dy$, we denote by $S[\nu]$ the value

$$S[\nu] = -\iint \nu(x, y) \log \left( \frac{\nu(x, y)}{\int \nu(x, z)\, dz} \right) dx\, dy. \qquad (15)$$

Remark. The quantity $S[\nu]$ was called "penalized entropy" in [38] and it is a kind of relative entropy with respect to Lebesgue measure. It is a different definition of entropy from the one considered before.

It is easy to see that any $\nu = \theta K \in M_0$ satisfies

$$S[\theta K] = -\iint \theta(x)\, K(x, y) \log(K(x, y))\, dx\, dy. \qquad (16)$$

The value $S[\theta K]$ can be negative. We can now consider the variational problem

$$P(A) = \max_{\nu = \theta K \in M_0} \left\{ \int \beta A(x, y)\, d\nu + S[\nu] \right\}. \qquad (17)$$
This is equivalent to computing

$$\max_{\nu = \theta K \in M_0} \left\{ \iint \beta A(x, y)\, \theta(x)\, K(x, y)\, dx\, dy - \iint \theta(x)\, K(x, y) \log(K(x, y))\, dx\, dy \right\}.$$

Definition 7. A probability measure $\nu$ in $M_0$ is called an equilibrium state for $A$ (which depends on two coordinates) if it attains the maximal value $P(A)$. The value $P(A)$ is called the pressure (or Free Energy) of $A$.

We refer the reader to [38] for the proof of the following result.

Proposition 17. The stationary measure $\nu_\beta = \theta_\beta K_\beta$ defined above maximizes

$$\int \beta A(x, y)\, d\nu + S[\nu]$$

over all stationary $\nu = \theta K \in M_0$. Also

$$P(A) = \log \lambda_\beta = \iint \beta A\, \theta_\beta K_\beta\, dx\, dy + S[\theta_\beta K_\beta].$$

When the potential $A$ depends just on two coordinates the equation used in the definition of subaction can be simplified.

Definition 8. A continuous function $u : [0, 1] \to \mathbb{R}$ is called a $[0, 1]$-calibrated forward-subaction if, for any $y \in [0, 1]$, we have

$$u(y) = \max_{a \in [0, 1]} [A(a, y) + u(a) - m(A)]. \qquad (18)$$
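Equation (18) is a max-plus eigenvalue problem, and on a grid it can be explored by a simple iteration. The following is a sketch under explicit assumptions, not the construction used in [38]: $[0, 1]$ is replaced by a uniform grid, `potential` is a hypothetical example whose maximum 1 is attained only at the fixed point $0$ (so $m(A) = 1$ on the grid), and the iteration $u \mapsto \max_a [A(a, \cdot) + u(a)]$ is renormalized at each step. Convergence of this iteration is not guaranteed for general potentials.

```python
import math

N = 20
GRID = [i / N for i in range(N)]

def potential(a, y):
    # Hypothetical example (an assumption): maximum value 1 only at a = y = 0.
    d = min(abs(a - y), 1.0 - abs(a - y))
    return math.cos(2.0 * math.pi * a) - d * d

A = [[potential(a, y) for y in GRID] for a in GRID]

def max_plus_step(u):
    # (T u)(y) = max_a [A(a, y) + u(a)]
    return [max(A[i][j] + u[i] for i in range(N)) for j in range(N)]

u = [0.0] * N
for _ in range(200):
    v = max_plus_step(u)
    shift = max(v)
    u = [value - shift for value in v]     # normalize so that max(u) = 0

v = max_plus_step(u)
m_estimate = max(v) - max(u)               # additive eigenvalue, an estimate of m(A)
residual = max(abs(u[j] - (v[j] - m_estimate)) for j in range(N))
print(m_estimate, residual)
```

When the iteration has stabilized, `u` satisfies the discrete form of Eq. (18) with `m_estimate` in place of $m(A)$, and `residual` measures the defect.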
We refer the reader to [14] for related problems in a different setting. The equation for $u$ above also appears in problems related to the additive eigenvalue [13, 14]. A function $u$ as above can be seen as a function of $x \in [0, 1]^{\mathbb{N}}$, where $x = (x_0, x_1, x_2, x_3, \ldots)$, which depends just on the first coordinate $x_0$. Therefore, a $[0, 1]$-calibrated forward-subaction is also a calibrated subaction (in the previous sense). We point out that $[0, 1]$-calibrated forward-subactions do exist (see [38]).

An interesting question in the case of selection of measures $\mu_\beta \to \mu_\infty$ is: what happens with the measure of a particular subset $D$ of $B$ when $T \to 0$ (or, $\beta \to \infty$)? A Large Deviation Principle (see [18] for general references) is true under certain conditions. We refer the reader to [38] for the proof of the result below.

Theorem 18. If $A$ has only one maximizing probability measure $\mu_\infty$ and there exists a unique $[0, 1]$-calibrated forward-subaction $V$ for $A$, then the following LDP is true: for each cylinder $D = A_0 \ldots A_k \times [0, 1]^{\mathbb{N}}$, the following limit exists

$$\lim_{\beta\to\infty} \frac{1}{\beta} \ln \mu_{\beta A}(D) = -\inf_{x \in D} I(x),$$

where $I : [0, 1]^{\mathbb{N}} \to [0, +\infty]$ is a function defined by

$$I(x) := \sum_{i \geq 0} \left[ V(x_{i+1}) - V(x_i) - (A - m(A))(x_i, x_{i+1}) \right].$$
Results about large deviations in the setting of Thermodynamic Formalism appear in [15, 39].

Definition 9. We say that $A : [0, 1]^2 \to \mathbb{R}$ satisfies the twist condition, if $A$ is $C^2$, and

$$\frac{\partial^2 A}{\partial x\, \partial y} \neq 0.$$

This property is an open condition under the right topology. The next theorem (see [38] for a proof) addresses the question of uniqueness when we add a magnetic term $f(x)$ to $A(x, y)$. Related results in a different setting appear in [2, 6]. The above condition for $A$ replaces the convexity of the Lagrangian which is crucial in Aubry–Mather theory [12].

Definition 10. We will say that a property is generic for $A$, $A \in C^2([0, 1]^2)$, in Mañé's sense, if the property is true for $A + f$, for any $f$, $f \in C^2([0, 1])$, in a set $G$ which is generic (in the Baire sense).

This concept was initially introduced in the Aubry–Mather setting in [42]. We will show below that under the twist condition the uniqueness of the $[0, 1]$-calibrated forward-subaction is generic in Mañé's sense.

Theorem 19. Consider the class of all $A : [0, 1]^2 \to \mathbb{R}$ which are $C^2$ and satisfy the twist condition $\frac{\partial^2 A}{\partial x\, \partial y} \neq 0$. Then there exists a generic set $O$ in $C^2([0, 1])$ (in the Baire sense) such that:
(a) for each $f \in O$, $f : [0, 1] \to \mathbb{R}$, for "any" $A$ we have that given $\mu, \tilde{\mu} \in M_\sigma$ two maximizing measures for $A + f$ (i.e. $m(A + f) = \int (A + f)\, d\mu = \int (A + f)\, d\tilde{\mu}$), then $\nu = \tilde{\nu}$, where $\nu$ and $\tilde{\nu}$ are the projections of $\mu$ and $\tilde{\mu}$ on the first two coordinates;
(b) for "any" $A$ the $[0, 1]$-calibrated forward-subaction for $A + f$ is unique, for each $f \in O$ (up to an additive constant).

In the above theorem the potential $A$ is considered the interaction and $f$ the magnetic term. Therefore, it claims, among other things, that for "any" $A$ we have uniqueness of the calibrated subaction (up to an additive constant) for a generic magnetic term $f$.

The next theorem (see [38] for a proof) addresses the question of the graph property for a probability measure. A related result in the setting of thermodynamic formalism appears in [41].

Theorem 20. If $A : [0, 1]^2 \to \mathbb{R}$ is $C^2$ and satisfies the twist condition $\frac{\partial^2 A}{\partial x\, \partial y} \neq 0$, then the projected measure $\nu$ on $[0, 1]^2$ of the maximizing probability measure $\mu_\infty$ (on $B$) has support on a graph.
The problem we consider above can be seen as a Transshipment Problem (see [38]). For related results see also [24, 16]. The graph property of a measure is of great importance in Aubry–Mather Theory [12, 21, 43].

5. DLR Gibbs Measures and Transfer Operator

5.1. One-dimensional systems and transfer operator

Given the potential $A$ we will use the following terminology: Gibbs-TF for $A$ denotes the set of measures usually considered in the Thermodynamic Formalism (as, for example, in [45], or in the first part of this paper) and Gibbs-DLR for $A$ the set of measures constructed as in the Dobrushin–Lanford–Ruelle formulation of Statistical Mechanics, where the Gibbs measures are obtained from the Specification Theory point of view; for a complete exposition see [27, 48, 50, 53]. For reasons that will be clarified later we adopt the notation $\mu^{A,\sigma}$ for these measures, where $\sigma$ is an element of the state space which is sometimes called a boundary condition. The measures obtained by the first construction (Sec. 1) are denoted here by $m = m_A$, and they are defined over the $\sigma$-algebra of $B = (S^1)^{\mathbb{N}}$ generated by the cylinder sets. The second one is usually defined over the $\sigma$-algebra of $B_i = (S^1)^L$ generated by the cylinder sets, where $L$ is any countable set. In order to show the relation between these two constructions, in this paper we focus on the case $L = \mathbb{Z}$. We will call $M_A$ the Gibbs-TF-$\mathbb{Z}$ for $A$, which is, by definition, the natural extension of $m_A$, the Gibbs-TF for $A$. For a large class of potentials (see [27]) we can show that $\mu^{A,\sigma}$ is independent of the choice of $\sigma \in (S^1)^{\mathbb{N}}$. Here, using a very simple argument, we give a proof of this independence using the Ruelle operator when one considers free boundary conditions on the left and a fixed boundary condition $\sigma \in (S^1)^{\mathbb{N}}$ on the right. We also show that this unique probability measure constructed using the Gibbs-DLR approach is equal to the measure $M_A$ obtained in the Gibbs-TF-$\mathbb{Z}$ for $A$.

In a forthcoming paper we discuss in great generality the equivalence of Gibbs-TF and Gibbs-DLR for one-dimensional systems.

5.2. Gibbs-DLR probability measures on $(S^1)^{\mathbb{Z}}$

For a $B_i$-measurable function $A : B_i \to \mathbb{R}$ depending on the first two coordinates, we associate a family $\Phi = (\Phi_\Gamma)_{\Gamma \subset \mathbb{N}}$ of functions from $B_i$ to $\mathbb{R}$, given by

$$\Phi_\Gamma(x) = \begin{cases} -A(x_n, x_{n+1}), & \text{if } \Gamma = \{n, n+1\}; \\ 0, & \text{otherwise.} \end{cases}$$

We call this family $\Phi$ an interaction. For each $n \in \mathbb{N}$ we consider the associated Hamiltonian

$$H^\Phi_{\Lambda_n}(x) = -\sum_{k=-n}^{n-1} A(x_k, x_{k+1}), \qquad (19)$$

where $\Lambda_n = [-n, n] \cap \mathbb{Z}$.
The first step to obtain a Gibbs-DLR probability measure for a given $A : B_i = (S^1)^{\mathbb{Z}} \to \mathbb{R}$ depending on the first two coordinates, with boundary condition $\sigma \in (S^1)^{\mathbb{N}}$, is to construct a family of probability measures $\mu^{\Phi,\sigma}_{\Lambda_n}$ over $B_i$ and then take cluster points in the weak* topology of this family when $n \to \infty$. Note that at least one cluster point exists because of the Banach–Alaoglu Theorem, and any element of the set of these cluster points will be called a Gibbs-DLR measure. As $n$ goes to infinity, the sequence of sets $\Lambda_n = \{-n, -n+1, \ldots, -1, 0, 1, \ldots, n-1, n\}$ converges in the set-theoretical sense to $\mathbb{Z}$, which allows these measures to capture information in the past and in the future coordinates.

Fix a configuration $\sigma = (\sigma_0, \sigma_1, \ldots, \sigma_n, \ldots) \in (S^1)^{\mathbb{N}}$ and a potential $A$ as above. We define the Hamiltonian on $\Lambda_n$ for the potential $\Phi$ with $\sigma$ right boundary conditions by

$$H^\Phi_{\Lambda_n}(\tau | \sigma) = -\sum_{k=-n}^{n-2} A(\tau_k, \tau_{k+1}) - A(\tau_{n-1}, \sigma_n).$$

Note that $H^\Phi_{\Lambda_n}(\tau | \sigma)$ can also be considered as a function defined on $[0, 2\pi]^{2n}$, i.e.

$$H^\Phi_{\Lambda_n}(\tau_{-n}, \ldots, \tau_{n-1} | \sigma) = -\sum_{k=-n}^{n-2} A(\tau_k, \tau_{k+1}) - A(\tau_{n-1}, \sigma_n).$$

Let $M(\Lambda_n, \sigma) = \{x \in (S^1)^{\mathbb{Z}} \mid x_i = \sigma_i,\ \forall i \geq n\}$, let $d\nu$ be the Lebesgue probability measure on $S^1$ (which we identify with $[0, 2\pi]$) and let $d\nu_n$ be the Lebesgue probability measure on $(S^1)^n$. The partition function associated to the potential $\Phi$ with right boundary condition $\sigma \in B$ on the volume $\Lambda_n$ is defined by

$$Z^{\Phi,\sigma}_{\Lambda_n} := \int_{M(\Lambda_n, \sigma)} e^{-H^\Phi_{\Lambda_n}(\tau | \sigma)}\, d\nu(\tau) = \int_{[0, 2\pi]^{2n}} e^{-H^\Phi_{\Lambda_n}(\tau_{-n}, \ldots, \tau_0, \ldots, \tau_{n-1} | \sigma)}\, d\nu(\tau_{-n}) \ldots d\nu(\tau_0) \ldots d\nu(\tau_{n-1}).$$

We restrict our attention to potentials $\Phi$ for which the partition function $Z^{\Phi,\sigma}_{\Lambda_n}$ is finite for any choice of $n$ and $\sigma$. Hence, for each $n$, this defines a probability measure which acts on continuous functions $f : B_i \to \mathbb{R}$ (depending on finitely many coordinates) by

$$\int_{B_i} f\, d\mu^{\Phi,\sigma}_{\Lambda_n} = \frac{1}{Z^{\Phi,\sigma}_{\Lambda_n}} \int_{M(\Lambda_n, \sigma)} f(\tau)\, e^{-H^\Phi_{\Lambda_n}(\tau | \sigma)}\, d\nu(\tau_{-n})\, d\nu(\tau_{-n+1}) \ldots d\nu(\tau_{n-1}).$$

Note that in this way, for any fixed $\sigma$, the probability measure $\mu^{\Phi,\sigma}_{\Lambda_n}$ depends just on $A$ (and, of course, $\sigma$), thus we could also denote it by $\mu^{A,\sigma}_{\Lambda_n}$. But here we will adopt the Statistical Mechanics notation $\mu^{\Phi,\sigma}_{\Lambda_n}$ as used in [27, 50].
For a fixed $\sigma$ we are interested in the limit of $\mu^{\Phi,\sigma}_{\Lambda_n}$, when $n \to \infty$. Any possible cluster point of this sequence will be denoted by $\mu^{A,\sigma}$ (or, $\mu^{\Phi,\sigma}$). Any one of these is called a Gibbs state for $A$ with a boundary condition $\sigma \in B$ on the right and free on the left.

Given $A : B \to \mathbb{R}$, by the major theorem of Sec. 1, we know there is a maximal positive eigenvalue $\lambda = \lambda_A$ associated to the eigenfunction $\psi_A$. We also have, for any $\psi : B \to \mathbb{R}$,

$$L^n_A \psi(y) = \int_{[0, 2\pi]^n} e^{S_n A(\tau y)}\, \psi(\tau y)\, d\nu_n(\tau). \qquad (20)$$

If $A$ depends on two coordinates, then $\psi_A$ depends on one coordinate (as we get from Sec. 4). Note that for any $\tau \in (S^1)^{\mathbb{N}}$ we have $L^{2n}_A(1)(\sigma^n(\tau)) = Z^{\Phi,\tau}_{\Lambda_n}$, where $\sigma$ is the shift on $(S^1)^{\mathbb{N}}$, and $L^n_{\bar{A}} 1 = 1$ for any $n \in \mathbb{N}$, where $\bar{A} = A + \log \psi_A - \log \psi_A \circ \sigma - \log \lambda_A$.

Let $\bar{\Phi}$ be the potential defined by $\bar{A}$ and $\pi$ the natural projection of $(S^1)^{\mathbb{Z}}$ to $(S^1)^{\mathbb{N}}$. Analogously to the case of the potential $A$, we set for any Borel set $C \subset B$

$$\pi \mu^{\bar{\Phi},\sigma}_{\Lambda_n}(C) = \frac{1}{Z^{\bar{\Phi},\sigma}_{\Lambda_n}} \int_{M(\Lambda_n, \sigma)} 1_C(\tau)\, e^{-H^{\bar{\Phi}}_{\Lambda_n}(\tau | \sigma)}\, d\nu(\tau).$$
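The identity between the partition function and the iterated operator, $L^{2n}_A(1)(\sigma^n(\tau)) = Z^{\Phi,\tau}_{\Lambda_n}$, can be checked by brute force on a coarse grid. The sketch below is only an illustration under assumptions: $S^1$ is replaced by `M_POINTS` equally weighted points of $[0, 2\pi)$, and `potential` is a hypothetical XY-type interaction depending on two coordinates. The argument `sigma_n` plays the role of the boundary spin $\sigma_n$.

```python
import itertools
import math

M_POINTS = 6                               # grid size on the circle (assumption)
GRID = [2.0 * math.pi * i / M_POINTS for i in range(M_POINTS)]

def potential(x, y):
    # Hypothetical XY-type interaction, used only for illustration.
    return math.cos(x - y) + 0.3 * math.sin(x)

def partition_brute_force(n, sigma_n):
    """Z on Lambda_n: average of exp(-H(tau | sigma)) over all grid configurations."""
    total = 0.0
    for tau in itertools.product(GRID, repeat=2 * n):   # tau[0] = tau_{-n}, ..., tau[-1] = tau_{n-1}
        minus_h = sum(potential(tau[k], tau[k + 1]) for k in range(2 * n - 1))
        minus_h += potential(tau[-1], sigma_n)            # coupling to the boundary spin
        total += math.exp(minus_h)
    return total / M_POINTS ** (2 * n)

def partition_transfer(n, sigma_n):
    """Same quantity via 2n applications of the discretized Ruelle operator to 1."""
    g = [1.0] * M_POINTS
    for _ in range(2 * n - 1):
        g = [sum(math.exp(potential(a, y)) * g[i] for i, a in enumerate(GRID)) / M_POINTS
             for y in GRID]
    return sum(math.exp(potential(a, sigma_n)) * g[i] for i, a in enumerate(GRID)) / M_POINTS

z_direct = partition_brute_force(2, 1.0)
z_operator = partition_transfer(2, 1.0)
print(z_direct, z_operator)
```

The two numbers agree up to rounding because both evaluate the same finite sum; the operator form needs only $O(n)$ matrix-vector products, while the direct sum grows exponentially in $n$.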
We point out that a potential $A$ which depends on two coordinates can be seen as a potential defined either on $(S^1)^{\mathbb{N}}$ or on $(S^1)^{\mathbb{Z}}$. Another important remark is that when the spin variables take values in the closed interval $[-1, 1]$ these models are known in the literature as continuous Ising models.

Proposition 21. Consider a fixed $\sigma \in B = (S^1)^{\mathbb{N}}$. Given $A : B_i \to \mathbb{R}$, which depends on two coordinates, if $\bar{A}$ is its normalized associated potential, then for any cluster point $\pi \mu^{\bar{\Phi},\sigma}$ we have that

$$m = \pi \mu^{\bar{\Phi},\sigma},$$

where $m = m_{\bar{A}}$ is the Gibbs-TF measure for $\bar{A}$.

We will show that $\lim_{n\to\infty} \pi \mu^{\bar{\Phi},\sigma}_{\Lambda_n} = m$, so this limit does not depend on the fixed $\sigma$ we choose.

Proof. Consider a given $f : B \to \mathbb{R}$ which depends on finitely many coordinates (let us say $r > 0$). Note that

$$H^{\bar{\Phi}}_{\Lambda_n}(\tau | \sigma) = -\sum_{k=-n}^{n-2} \bar{A}(\tau_k, \tau_{k+1}) - \bar{A}(\tau_{n-1}, \sigma_n),$$
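and that $Z^{\bar{\Phi},\sigma}_{\Lambda_n} = 1$. Suppose that $n > r$. By definition

$$\int f\, d\pi \mu^{\bar{\Phi},\sigma}_{\Lambda_n} = \int_{M(\Lambda_n, \sigma)} f(\tau)\, e^{-H^{\bar{\Phi}}_{\Lambda_n}(\tau | \sigma)}\, d\nu(\tau)$$

$$= \int_{[0, 2\pi]^{2n}} f(\tau_0, \ldots, \tau_r)\, e^{-H^{\bar{\Phi}}_{\Lambda_n}(\tau_{-n} \ldots \tau_{n-1} | \sigma)}\, d\nu(\tau_{-n}) \ldots d\nu(\tau_{n-1})$$

$$= \int_{[0, 2\pi]^n} f(\tau_0, \ldots, \tau_r)\, e^{\sum_{k=0}^{n-2} \bar{A}(\tau_k, \tau_{k+1}) + \bar{A}(\tau_{n-1}, \sigma_n)} \left[ \int_{[0, 2\pi]^n} e^{\sum_{k=-n}^{-1} \bar{A}(\tau_k, \tau_{k+1})}\, d\nu(\tau_{-n}) \ldots d\nu(\tau_{-1}) \right] d\nu(\tau_0) \ldots d\nu(\tau_{n-1})$$

$$= \int_{[0, 2\pi]^n} f(\tau_0, \ldots, \tau_r)\, e^{\sum_{k=0}^{n-2} \bar{A}(\tau_k, \tau_{k+1}) + \bar{A}(\tau_{n-1}, \sigma_n)}\, d\nu(\tau_0) \ldots d\nu(\tau_{n-1}),$$

where in the last equation we used $n$ times $\int_{[0, 2\pi]} e^{\bar{A}(x, y)}\, d\nu(x) = 1$. In this way,

$$\int f\, d\pi \mu^{\bar{\Phi},\sigma}_{\Lambda_n} = L^n_{\bar{A}}(f)(\sigma^n(\sigma)).$$

It is known from Sec. 1 that $L^n_{\bar{A}}(f)$ converges uniformly to $\int f\, dm$, as $n$ goes to infinity, where $m$ is Gibbs-TF for $A$ (or $\bar{A}$). As the convergence of $L^n_{\bar{A}}(f)$, when $n \to \infty$, is uniform, then $\lim_{n\to\infty} \pi \mu^{\bar{\Phi},\sigma}_{\Lambda_n} = m$.

Corollary 22. For any $\sigma \in (S^1)^{\mathbb{N}}$, and any $f$ which depends on finitely many coordinates,

$$\frac{\int_{M(\Lambda_n, \sigma)} f(\tau)\, e^{-H^{\bar{\Phi}}_{\Lambda_n}(\tau | \sigma)}\, d\nu(\tau_{-n})\, d\nu(\tau_{-n+1}) \ldots d\nu(\tau_{n-1})}{\int_{(S^1)^{\Lambda_n}} f(\tau)\, e^{-H^{\bar{\Phi}}_{\Lambda_n}(\tau)}\, d\nu(\tau_{-n})\, d\nu(\tau_{-n+1}) \ldots d\nu(\tau_{n-1})\, d\nu(\tau_n)} \to 1,$$

when $n \to \infty$.

Proof. This follows easily from the above because the convergence of $L^n_{\bar{A}}(f)$ is uniform.

Proposition 23. Suppose $\sigma \in (S^1)^{\mathbb{N}}$. Given $A : B_i \to \mathbb{R}$, which depends on two coordinates, and a coboundary $h : B_i \to \mathbb{R}$, which depends on one coordinate (the 0 coordinate), and such that $\bar{A} = A + h - h \circ \hat{\sigma} + \log \lambda$, where $\hat{\sigma}$ is the shift on $B_i$, then

$$\pi(\mu^{\bar{\Phi},\sigma}) = \pi(\mu^{\Phi,\sigma}).$$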
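Proof. Consider a function $f : B \to \mathbb{R}$ which depends on finitely many coordinates, $f(\tau_0,\tau_1,\ldots,\tau_k)$, $k > 0$. We have first that
\begin{align*}
H^{\bar\Phi}_{\Lambda_n}(\tau|\sigma) &= -\bar A(\tau_{-n},\tau_{-n+1}) - \sum_{k=-n+1}^{n-2} \bar A(\tau_k,\tau_{k+1}) - \bar A(\tau_{n-1},\sigma_n) \\
&= H^{\Phi}_{\Lambda_n}(\tau|\sigma) + h(\sigma_n) - h(\tau_{-n}) - 2n\log\lambda.
\end{align*}
Hence
\[ -H^{\Phi}_{\Lambda_n}(\tau|\sigma) = -H^{\bar\Phi}_{\Lambda_n}(\tau|\sigma) + h(\sigma_n) - h(\tau_{-n}) - 2n\log\lambda. \]
Therefore
\[ \int_{M(\Lambda_n,\sigma)} f(\tau)\, e^{-H^{\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau) = \lambda^{-2n} e^{h(\sigma_n)} \int_{M(\Lambda_n,\sigma)} e^{-h(\tau_{-n})} f(\tau_0,\ldots,\tau_k)\, e^{-H^{\bar\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau_{-n})\ldots d\nu(\tau_{n-1}), \]
and by taking $f = 1$ we have
\begin{align*}
Z^{\Phi,\sigma}_{\Lambda_n} &= \int_{M(\Lambda_n,\sigma)} e^{-H^{\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau) \\
&= \lambda^{-2n} e^{h(\sigma_n)} \int_{M(\Lambda_n,\sigma)} e^{-h(\tau_{-n})}\, e^{-H^{\bar\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau_{-n})\ldots d\nu(\tau_{n-1}) \\
&= \lambda^{-2n} e^{h(\sigma_n)}\, L^{2n}_{\bar A}(e^{-h})(\sigma^n(\sigma)).
\end{align*}
We have already shown in the previous sections that
\[ L^{2n}_{\bar A}(e^{-h})(\sigma^n(\sigma)) \to \int e^{-h}\, dm_{\bar A}, \]
uniformly in $n$. Therefore,
\[ Z^{\Phi,\sigma}_{\Lambda_n} \sim \lambda^{-2n} e^{h(\sigma_n)} \int e^{-h}\, dm_{\bar A}. \]
We also have
\begin{align*}
&\int_{M(\Lambda_n,\sigma)} e^{-h(\tau_{-n})} f(\tau_0,\tau_1,\ldots,\tau_k)\, e^{-H^{\bar\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau) \\
&\quad= \int_{[0,2\pi]^n} f(\tau)\, e^{\sum_{k=0}^{n-2}\bar A(\tau_k,\tau_{k+1}) + \bar A(\tau_{n-1},\sigma_n)} \left[ \int_{[0,2\pi]^n} e^{-h(\tau_{-n})}\, e^{\sum_{k=-n}^{-1}\bar A(\tau_k,\tau_{k+1})} \prod_{i=-n}^{-1} d\nu(\tau_i) \right] \prod_{k=0}^{n-1} d\nu(\tau_k) \\
&\quad= \int_{[0,2\pi]^n} f(\tau)\, e^{\sum_{k=0}^{n-2}\bar A(\tau_k,\tau_{k+1}) + \bar A(\tau_{n-1},\sigma_n)} \bigl(L^n_{\bar A}(e^{-h})(\tau)\bigr) \prod_{k=0}^{n-1} d\nu(\tau_k) \\
&\quad= \int_{[0,2\pi]^n} f(\tau)\, e^{\sum_{k=0}^{n-2}\bar A(\tau_k,\tau_{k+1}) + \bar A(\tau_{n-1},\sigma_n)} \left[ L^n_{\bar A}(e^{-h})(\tau) - \int e^{-h}\, dm_{\bar A} \right] \prod_{k=0}^{n-1} d\nu(\tau_k) \\
&\qquad+ \int_{[0,2\pi]^n} f(\tau)\, e^{\sum_{k=0}^{n-2}\bar A(\tau_k,\tau_{k+1}) + \bar A(\tau_{n-1},\sigma_n)} \prod_{k=0}^{n-1} d\nu(\tau_k) \int e^{-h}\, dm_{\bar A} \\
&\quad\to \int f\, dm_{\bar A} \int e^{-h}\, dm_{\bar A},
\end{align*}
where in the convergence we used the fact that, given any $\varepsilon > 0$, there exists $N$ such that, for $n > N$, we have
\begin{align*}
&\left| \int_{[0,2\pi]^n} f(\tau)\, e^{\sum_{k=0}^{n-2}\bar A(\tau_k,\tau_{k+1}) + \bar A(\tau_{n-1},\sigma_n)} \left[ L^n_{\bar A}(e^{-h})(\tau) - \int e^{-h}\, dm_{\bar A} \right] \prod_{k=0}^{n-1} d\nu(\tau_k) \right| \\
&\quad< \varepsilon \int_{[0,2\pi]^n} f(\tau)\, e^{\sum_{k=0}^{n-2}\bar A(\tau_k,\tau_{k+1}) + \bar A(\tau_{n-1},\sigma_n)} \prod_{k=0}^{n-1} d\nu(\tau_k)
= \varepsilon\, L^n_{\bar A}(f)(\sigma^n(\sigma)) < 2\varepsilon \int f\, dm_{\bar A},
\end{align*}
which means that the first integral vanishes when $n \to \infty$, while the second integral is
\[ \int_{[0,2\pi]^n} f(\tau)\, e^{\sum_{k=0}^{n-2}\bar A(\tau_k,\tau_{k+1}) + \bar A(\tau_{n-1},\sigma_n)} \prod_{k=0}^{n-1} d\nu(\tau_k) \int e^{-h}\, dm_{\bar A} = L^n_{\bar A}(f)(\sigma^n(\sigma)) \int e^{-h}\, dm_{\bar A} \to \int f\, dm_{\bar A} \int e^{-h}\, dm_{\bar A}. \]
Finally,
\begin{align*}
\frac{\int_{M(\Lambda_n,\sigma)} f(\tau)\, e^{-H^{\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau)}{Z^{\Phi,\sigma}_{\Lambda_n}}
&= \frac{\lambda^{-2n} e^{h(\sigma_n)} \int_{M(\Lambda_n,\sigma)} e^{-h(\tau_{-n})} f(\tau)\, e^{-H^{\bar\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau)}{Z^{\Phi,\sigma}_{\Lambda_n}} \\
&\sim \frac{\int_{M(\Lambda_n,\sigma)} e^{-h(\tau_{-n})} f(\tau)\, e^{-H^{\bar\Phi}_{\Lambda_n}(\tau|\sigma)}\, d\nu(\tau)}{\int e^{-h}\, dm_{\bar A}} \to \int f\, dm_{\bar A}.
\end{align*}
Therefore, $\pi\mu^{\bar\Phi,\sigma} = \pi\mu^{\Phi,\sigma}$.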
Corollary 24. Consider a general $\sigma \in B$. Given $A : B \to \mathbb{R}$, then
\[ m_A = \pi\mu^{\Phi,\sigma}, \]
where $m = m_A$ is the Gibbs-TF for $A$.

Proof. It follows from $\bar A = A + \log\psi_A - \log\psi_A \circ \sigma - \log\lambda_A$.

According to [27, Part III, p. 289], for any $\sigma$ the probability measure $\mu^{A,\Phi,\sigma}$ is invariant for $\hat\sigma$ acting on $(S^1)^{\mathbb{Z}}$. By definition, the Gibbs-FT-Z state $M_A$ on $(S^1)^{\mathbb{Z}}$ is the natural extension of $m_A$, and it is also invariant for $\hat\sigma$ acting on $(S^1)^{\mathbb{Z}}$.

Proposition 25. Suppose $A : (S^1)^{\mathbb{Z}} \to \mathbb{R}$ depends on two coordinates, and consider $\sigma \in B$. Then
\[ \mu^{\Phi,\sigma} = M_A. \]

Proof. $\mu^{\Phi,\sigma}$ and $M_A$ are both the natural extension of $m_A$.

Proposition 26. Suppose $A : (S^1)^{\mathbb{Z}} \to \mathbb{R}$ depends on two coordinates, and consider $\sigma, \sigma' \in B$. Then
\[ \mu^{\Phi,\sigma} = \mu^{\Phi,\sigma'}. \]

Proof. $\mu^{\Phi,\sigma}$ and $\mu^{\Phi,\sigma'}$ are both the natural extension of $m_A$.

The final conclusion is that, if the potential depends on two coordinates, then the Gibbs probability measures on $(S^1)^{\mathbb{Z}}$ in both settings, Thermodynamic Formalism and Statistical Mechanics via a boundary condition $\sigma$ on the right side, coincide.

Now we will analyze the free-boundary case. Remember that
\[ H^{\Phi}_{\Lambda_n}(\tau) = -\sum_{k=-n}^{n-1} A(\tau_k,\tau_{k+1}). \]
We are going to define the Gibbs probability measure in the sense of Statistical Mechanics with free boundary condition on the left and on the right. For a given $n > 0$,
\[ Z^{\Phi}_{\Lambda_n} = \int_{(S^1)^{\Lambda_n}} e^{-H^{\Phi}_{\Lambda_n}(\tau)}\, d\nu(\tau_{-n})\, d\nu(\tau_{-n+1})\ldots d\nu(\tau_{n-1})\, d\nu(\tau_n) \]
will be the partition function which corresponds to the case of a free boundary condition on the right and on the left. For each $n$, this defines a probability measure which acts on continuous functions $f$ (depending on finitely many coordinates) by
\[ \int f\, d\mu^{\Phi}_{\Lambda_n} = \frac{1}{Z^{\Phi}_{\Lambda_n}} \int_{(S^1)^{\Lambda_n}} f(\tau)\, e^{-H^{\Phi}_{\Lambda_n}(\tau)}\, d\nu(\tau_{-n})\, d\nu(\tau_{-n+1})\ldots d\nu(\tau_{n-1})\, d\nu(\tau_n). \]
Any weak limit of subsequences of $\mu^{\Phi}_{\Lambda_n}$ will be called a Gibbs state for $A$ with a free boundary condition on the right and on the left. It follows from Corollary 1 above that any Gibbs state for $A$ with a free boundary condition on the right and on the left is equal to $M_A$. The result we will analyze in the next section will be the case of a free boundary condition on the right and on the left.

6. An Example by van Enter and Ruszel Where There is No Selection

In this section we will consider $A$ depending on its first neighbors, and having the form $A(x) = A(x_0,x_1) = U(x_0 - x_1)$. We want to show a particular example (introduced in [20]) where the potential is not continuous and is of the following form: $\tilde U : [0,2\pi] \to \mathbb{R}$ is a function such that $\tilde U|_{[a_n,b_n)}$ is constant for each $n$ and equal to $c_n$, where $[a_n,b_n)$, $n \in \mathbb{N}$, is a partition of $[a,b]$. We will show that for each positive $\beta$ we can also consider an extension of the Gibbs-TF, say $\mu_{\beta,\tilde U}$, over $B$, and also that this measure coincides with the Gibbs-DLR measure for this potential $\tilde U$. In [20] the authors have shown that there is no selection of the family $\mu_{\beta,\tilde U}$ when $\beta \to \infty$. We will present here all the details of the proof of this non-trivial result. Basically, we will show that $\int I_B\, d\mu_{\beta,\tilde U}$ does not converge when $\beta \to \infty$, for a set $B$ which depends just on the coordinates $(x_0,x_1)$. Therefore, this is also the same as to say that $\int I_B\, d\hat\mu_{\beta,\tilde U}$ does not converge (see Remark 4 just before Proposition 6).

The main result of this section is Theorem 32, which is a consequence of Corollary 30 and Lemma 31. Section 6.1 shows that the results of previous sections are still valid even if the potential $A$ belongs to certain classes of non-continuous potentials, including the potential of [20].

6.1. Gibbs measures for non-continuous potentials and DLR formulation of statistical mechanics

So far we have defined Gibbs measures for Hölder continuous potentials in Secs. 1 (general case) and 4 (nearest neighbors interaction, i.e. potential depending on two coordinates). In Sec. 4, we gave an alternative definition based on transition kernels associated to a certain potential (or Hamiltonian) $A$, and proved that this definition is equivalent to the one of Sec. 1.

We will now show that our definition coincides with the usual one in Statistical Mechanics, in the case of a certain special non-continuous potential depending on two coordinates. We assume, among other things, the form $\tilde A(x) = \tilde A(x_0,x_1) = \tilde U(x_0 - x_1)$, where $\tilde U : S^1 \to \mathbb{R}$ is a bounded $L^1$ function which is pointwise approximated by Hölder functions $U_n$. This case will cover the important example to be described later. A potential of this form is called symmetric.

First we will show that the main results of Sec. 4 are true for this potential $\tilde A(x) = \tilde U(x_0 - x_1)$, which is no longer continuous.
Using the notation described in Sec. 4, let $L_{\beta\tilde U}, \bar L_{\beta\tilde U} : C([0,2\pi]) \to C([0,2\pi])$ be given by
\[ L_{\beta\tilde U}\psi(y) = \frac{1}{2\pi} \int_0^{2\pi} e^{\beta\tilde U(x-y)} \psi(x)\, dx, \tag{21} \]
\[ \bar L_{\beta\tilde U}\psi(x) = \frac{1}{2\pi} \int_0^{2\pi} e^{\beta\tilde U(x-y)} \psi(y)\, dy, \tag{22} \]
for any $y \in [0,2\pi]$. In order to simplify the notation we write $L_\beta$ instead of $L_{\beta\tilde U}$.

Lemma 27. The operators $L_\beta$ and $\bar L_\beta$ preserve the set of continuous functions on $[0,2\pi]$, sending continuous functions to uniformly continuous functions. Moreover, a bounded function is mapped to a uniformly continuous one.

The fact that continuous functions are preserved implies the compactness of the operator, as we can see in [11, pp. 43 and 47].

Proof. Consider a fixed $\beta$ and the operator $L_\beta$. Let $f$ be a continuous function. Fix $\varepsilon > 0$. Let $A_c$ be a continuous function such that
\[ \|\tilde A - A_c\|_{L^1} < \frac{\varepsilon}{4\|f\|_{C^0}}. \]
Here we use the $L^1$ norm on the functions defined on the one-dimensional set $[0,2\pi]$. Such a function exists because continuous functions are dense in $L^p[0,2\pi]$ for $p \ge 1$. Let $K_c(x,y) = A_c(x-y)$. We have $\tilde A = A_c + (\tilde A - A_c)$. Moreover, let $\delta > 0$ be such that
\[ |A_c(z) - A_c(w)| < \frac{\varepsilon}{2\|f\|_{C^0}} \]
if $|z - w| < \delta$. Suppose $|y_1 - y_2| < \delta$. Then we have
\begin{align*}
|L(f)(y_1) - L(f)(y_2)| &= \left| \int K(x,y_1) f(x)\, dx - \int K(x,y_2) f(x)\, dx \right| \\
&\le \int |A_c(x-y_1) - A_c(x-y_2)|\, |f(x)|\, dx \\
&\quad + \int |(\tilde A - A_c)(x-y_1)|\, |f(x)|\, dx + \int |(\tilde A - A_c)(x-y_2)|\, |f(x)|\, dx \\
&< \varepsilon.
\end{align*}
The proof of the next theorem is a small modification of the proof of [38, Theorem 3].

Theorem 28. The operators $L_\beta$ and $\bar L_\beta$ have the same positive maximal eigenvalue $\lambda_\beta$, which is simple and isolated. The eigenfunctions associated to these operators, say $\psi_\beta$ and $\bar\psi_\beta$, are positive functions.

Proof. We can see that $L_\beta$ is a compact operator, because Lemma 27 shows that the image of the closed unit ball of $C([0,1])$ under $L_\beta$ is an equicontinuous family in $C([0,1])$. Thus, we can use the Arzelà–Ascoli theorem to prove the compactness of $L_\beta$ (see also [49, Sec. 1, Chap. IV]).

The spectrum of a compact operator consists of a sequence of eigenvalues that converges to zero, and possibly contains zero. This implies that any non-zero eigenvalue of $L_\beta$ is isolated (i.e. there is no sequence in the spectrum of $L_\beta$ which converges to a non-zero eigenvalue).

The definition of $L_\beta$ now shows that $L_\beta$ preserves the cone of positive functions in $C([0,1])$, sending a point in this cone to the interior of the cone. This means that $L_\beta$ is a positive operator.

The Krein–Rutman theorem ([17, Theorem 19.3]) implies that there exists a positive eigenvalue $\lambda_\beta$, which is maximal (i.e. if $\lambda \ne \lambda_\beta$ is in the spectrum of $L_\beta$ then $\lambda_\beta > |\lambda|$) and simple (i.e. the eigenspace associated to $\lambda_\beta$ is one-dimensional). Moreover, $\lambda_\beta$ is associated to a positive eigenfunction $\psi_\beta$.

If we proceed in the same way as in [38], we obtain the same conclusions about the operator $\bar L_\beta$, and we get the respective eigenvalue $\bar\lambda_\beta$ and eigenfunction $\bar\psi_\beta$.

In order to prove that $\lambda_\beta = \bar\lambda_\beta$, we use the positivity of $\psi_\beta$ and $\bar\psi_\beta$ and the fact that $\bar L_\beta$ is the adjoint of $L_\beta$. (Here we see that our operators can be, in fact, defined in the Hilbert space $L^2([0,1])$, which contains $C([0,1])$.) We have
\[ \langle \psi_\beta, \bar\psi_\beta \rangle = \int \psi_\beta(x)\, \bar\psi_\beta(x)\, dx > 0, \]
and
\[ \lambda_\beta \langle \psi_\beta, \bar\psi_\beta \rangle = \langle L_\beta\psi_\beta, \bar\psi_\beta \rangle = \langle \psi_\beta, \bar L_\beta\bar\psi_\beta \rangle = \bar\lambda_\beta \langle \psi_\beta, \bar\psi_\beta \rangle. \]

By the periodicity of $\tilde U$, $L_{\beta\tilde U}(1)$ and $\bar L_{\beta\tilde U}(1)$ are independent of $x$. Therefore $\psi_{\beta,\tilde U}(x) = \bar\psi_{\beta,\tilde U}(x) = 1$ are the eigenfunctions associated to the maximal eigenvalue $\lambda_{\beta,\tilde U}$. It is easy to see that
\[ \lambda_{\beta,\tilde U} = \frac{1}{2\pi} \int_0^{2\pi} e^{\beta\tilde U(x-y)}\, dy. \]
In the notation of Sec. 4, $\theta_{\beta,\tilde U}(x) = 1$ and the transition kernel is given by
\[ K_{\beta,\tilde U}(x,y) := \frac{e^{\beta\tilde U(x-y)}}{\lambda_\beta}. \]
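The claim that the constant function 1 is an eigenfunction, with eigenvalue $\frac{1}{2\pi}\int_0^{2\pi} e^{\beta U}$, can be checked numerically on a grid. The following is only a minimal sketch: the potential `math.cos`, the value of `beta`, and the grid size `N` are illustrative assumptions and do not come from the text, and the integral is replaced by an N-point Riemann sum.

```python
import math

def transfer_row_sums(beta, U, N=64):
    """Apply a grid discretisation of L_beta to the constant function 1.

    Returns [(L_beta 1)(y_j)] for y_j = 2*pi*j/N, where the integral
    (1/2pi) * int_0^{2pi} exp(beta*U(x - y)) dx is replaced by an
    N-point Riemann sum on the same grid.
    """
    grid = [2.0 * math.pi * k / N for k in range(N)]
    return [sum(math.exp(beta * U(x - y)) for x in grid) / N for y in grid]

def eigenvalue(beta, U, N=64):
    """Riemann-sum approximation of (1/2pi) * int_0^{2pi} exp(beta*U(x)) dx."""
    return sum(math.exp(beta * U(2.0 * math.pi * k / N)) for k in range(N)) / N

U = math.cos      # hypothetical smooth 2*pi-periodic potential (assumption)
beta = 2.0        # illustrative inverse temperature (assumption)

values = transfer_row_sums(beta, U)
lam = eigenvalue(beta, U)
# (L_beta 1)(y) should not depend on y and should equal lam
spread = max(values) - min(values)
```

Because a shift by a grid point only permutes the grid, `spread` vanishes up to rounding, which is the discrete counterpart of the periodicity argument used above.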
For instance, for any cylinder
\[ \mu_{\beta,\tilde U}(A_0 \ldots A_k) = \int_{A_0 \ldots A_k} \frac{e^{\beta \sum_{i=0}^{k-1} \tilde U(x_i - x_{i+1})}}{\lambda_\beta^k}\, dx_k \ldots dx_0. \]
This measure does not come from a Hölder potential, but we can approximate it by Gibbs-TF measures associated to Hölder potentials, as we will see next.

Let us now analyze the case $A(x) = A(x_0,x_1) = U(x_0 - x_1)$, where $U : \mathbb{R} \to \mathbb{R}$ is a $2\pi$-periodic Hölder continuous function. By the same arguments used above, it is easy to see that $\psi_{\beta,U}(x) = \bar\psi_{\beta,U}(x) = 1$ are the eigenfunctions of the operators $L_{\beta,U}$, $\bar L_{\beta,U}$ associated to the maximal eigenvalue $\lambda_{\beta,U}$ (see Sec. 4), where
\[ \lambda_{\beta,U} = \frac{1}{2\pi} \int_0^{2\pi} e^{\beta U(x-y)}\, dy. \]
As in Sec. 4, $\theta_{\beta,U}(x) = 1$ and the transition kernel is given by
\[ K_{\beta,U}(x,y) := \frac{e^{\beta U(x-y)}}{\lambda_\beta}. \]
Hence, for any cylinder
\[ \mu_{\beta,U}(A_0 \ldots A_k) = \int_{A_0 \ldots A_k} \frac{e^{\beta \sum_{i=0}^{k-1} U(x_i - x_{i+1})}}{\lambda_\beta^k}\, dx_k \ldots dx_0. \]
By Theorem 16 we see that $\mu_{\beta,U} = m_U$, the Gibbs-TF for $U$.

Let now $\tilde U$ be an $L^1$ potential such that there exists a uniformly bounded sequence of Hölder continuous potentials $U_n$ converging pointwise to $\tilde U$. By the Dominated Convergence Theorem, we have that
\[ \lambda_{\beta,U_n} = \frac{1}{2\pi} \int_0^{2\pi} e^{\beta U_n(x-y)}\, dy \to \frac{1}{2\pi} \int_0^{2\pi} e^{\beta\tilde U(x-y)}\, dy = \lambda_{\beta,\tilde U}, \]
as $n \to \infty$, and also for any cylinder $A_0 \ldots A_k$, we have
\[ \mu_{\beta,U_n}(A_0 \ldots A_k) \to \int_{A_0 \ldots A_k} \frac{e^{\beta \sum_{i=0}^{k-1} \tilde U(x_i - x_{i+1})}}{\lambda_\beta^k}\, dx_k \ldots dx_0 = \mu_{\beta,\tilde U}(A_0 \ldots A_k) \]
as $n \to \infty$.

Note that the measure $\mu_{\beta,\tilde U}$ coincides with the Gibbs-DLR measure of statistical mechanics in the special case of nearest neighbors interaction of the kind $\tilde A(x) = \tilde A(x_0,x_1) = \tilde U(x_0 - x_1)$, as can be seen, for example, in [27]. We also remark that $\frac{1}{\lambda_\beta^k}$ is the partition function of DLR formulations of statistical mechanics.
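The Dominated Convergence step can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the step potential, the piecewise-linear approximations `ramp_potential(n)` (Lipschitz, hence Hölder), the value of `beta`, and the midpoint quadrature with `M` nodes are all hypothetical choices, not taken from the text.

```python
import math

def step_potential(t):
    """Hypothetical discontinuous potential: 1 on [0, pi), 0 on [pi, 2*pi)."""
    t = t % (2.0 * math.pi)
    return 1.0 if t < math.pi else 0.0

def ramp_potential(n):
    """Continuous piecewise-linear approximation U_n of step_potential.

    U_n falls from 1 to 0 on [pi - 1/n, pi] and rises from 0 to 1 on
    [2*pi - 1/n, 2*pi], so U_n(t) -> step_potential(t) for every t.
    """
    w = 1.0 / n
    def U_n(t):
        t = t % (2.0 * math.pi)
        if t < math.pi - w:
            return 1.0
        if t < math.pi:
            return (math.pi - t) / w
        if t < 2.0 * math.pi - w:
            return 0.0
        return (t - (2.0 * math.pi - w)) / w
    return U_n

def lam(beta, U, M=20000):
    """Midpoint-rule approximation of (1/2pi) * int_0^{2pi} exp(beta*U(x)) dx."""
    h = 2.0 * math.pi / M
    return sum(math.exp(beta * U((k + 0.5) * h)) for k in range(M)) / M

beta = 1.0  # illustrative inverse temperature (assumption)
target = lam(beta, step_potential)
# distance between lambda for U_n and lambda for the step potential
errors = [abs(lam(beta, ramp_potential(n)) - target) for n in (1, 10, 100)]
```

In this toy case the limiting eigenvalue is $(e^{\beta}+1)/2$, and the entries of `errors` shrink roughly in proportion to $1/n$, the width of the ramps.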
6.2. One-dimensional systems with symmetric potentials

To explain the no-selection theorem we will use the formalism introduced in the last section. Here we take $L = \mathbb{Z}$, which is the origin of the term "one-dimensional" in the title of this section. We assign to each $i \in \mathbb{Z}$ the measure space $(S^1, B, \nu)$, where $\nu$ is the Lebesgue probability measure on the circle. For each $n \in \mathbb{N}$ we denote $\Lambda_n := [-n,n] \cap \mathbb{Z}$. We will use free conditions on the left and on the right side. For convenience, we use the natural measure isomorphism between the Bernoulli spaces $(S^1)^{\mathbb{Z}}$ and $[0,2\pi)^{\mathbb{Z}}$ to define the Hamiltonian we introduced before in $(S^1)^{\mathbb{Z}}$.

Let $\Phi = (\Phi_\Gamma)_{\Gamma \subset L}$ be a family of functions on $[0,2\pi)^{\mathbb{Z}}$ such that
\[ \Phi_\Gamma(\theta) = \begin{cases} U(\theta_k - \theta_{k+1}), & \text{if } \Gamma = \{k,k+1\}; \\ 0, & \text{otherwise,} \end{cases} \]
where $U$ is a potential defined by the $2\pi$-periodic extension of
\[ \tilde U(x) = \sum_{j=1}^{\infty} c_j\, 1_{A_j}(x), \]
and $\{A_j\}_{j \ge 1}$ is a partition of $[0,2\pi)$ given by intervals of the form $A_j = [a_j,b_j)$. Using the isomorphism and the family $\Phi$ mentioned above, the Hamiltonian in the finite volume $\Lambda_n$, with boundary condition $x'$, will be given by the following expression, if $x = (\theta_k)_{k \in \mathbb{Z}}$:
\[ H_{\Lambda_n}(x_{\Lambda_n} x'_{\Lambda_n^c}) = -\sum_{k=-n}^{n-1} U(\theta_k - \theta_{k+1}) - U(\theta_{-n} - \theta_{-n-1}) - U(\theta_n - \theta_{n+1}). \tag{23} \]
The family $\Phi$ we are considering is associated to a potential $A$ which depends only on the nearest neighbors and is given by $A(x,y) = U(x-y)$. We can prove (see [48]) that for each fixed $\beta \in (0,+\infty)$, the set $G_{\beta,\Phi}$ is a singleton and its unique measure, denoted by $\mu^{\beta}_A$, is given by
\[ \mu^{\beta}_A = w\text{-}\lim_{\Lambda_n \uparrow \mathbb{N}} \mu^{\beta,\Phi}_{\Lambda_n}, \]
where for all $n \in \mathbb{N}$ and $E \in B_i$
\[ \mu^{\beta,\Phi}_{\Lambda_n}(E) = \frac{1}{Z^{\beta,\Phi}_{\Lambda_n}} \int_{(S^1)^{\Lambda_n}} 1_{\pi_{\Lambda_n}(E)}(\theta)\, \exp\left( \beta \sum_{k=-n}^{n-1} U(\theta_k - \theta_{k+1}) \right) \prod_{k=-n}^{n} d\nu(\theta_k). \tag{24} \]
From now on we call $\mu^{\beta,\Phi}_{\Lambda_n}$ the Gibbs measure in the volume $\Lambda_n$ for the Hamiltonian (23) at inverse temperature $\beta$. We are using above free boundary conditions on the left and on the right-hand side.

We will consider here a real parameter $\beta$, which denotes the inverse of the temperature, and the Gibbs probability measure $\hat\mu^{\beta}_A$ over $B_i = (S^1)^{\mathbb{Z}}$ (see the considerations just above Proposition 6).
Note that if $U$ has a unique maximum at $y = 0 \in S^1$, then the support of any maximizing probability measure $\mu_\infty$ for $A(x,y) = U(x-y)$ is always contained in the set
\[ K = \{x = (\ldots,x_{-2},x_{-1},x_0,x_1,x_2,\ldots) : x_i = c \in S^1,\ \forall\, i \in \mathbb{Z}\} \subset B_i. \]
All points in $K$ are fixed points for $\hat\sigma$. The above set $K$ can be indexed by $c \in S^1$. Each fixed point $x$ in this set can be denoted by $x_c$, where $c \in S^1$. The corresponding maximizing probability measure for $A$ over $(S^1)^{\mathbb{Z}}$ is $\delta_{x_c}$. Given any probability measure $P$ over $S^1$, we can consider the probability measure $\nu$ over $(S^1)^{\mathbb{Z}}$ given by $\nu = \int \delta_{x_c}\, dP(c)$. The general maximizing probability measure for $A$ is of this form.

Suppose now that $U$ has two strict maxima, at $y = 0 \in S^1$ and at $y = \pi$. In this case, the support of any maximizing probability measure for $A(x,y) = U(x-y)$ is always contained in the set $K = K_1 \cup K_2$, where
\[ K_1 = \{x = (\ldots,x_{-2},x_{-1},x_0,x_1,x_2,\ldots) : x_i = c \in S^1,\ \forall\, i \in \mathbb{Z}\} \subset B_i, \]
and
\[ K_2 = \{x = (\ldots,x_{-2},x_{-1},x_0,x_1,x_2,\ldots) : x_{i+1} - x_i = \pi \in S^1,\ \forall\, i \in \mathbb{Z}\} \subset B_i. \]
The set $K_1$ is called the set of ferromagnetic states, and the set $K_2$ is called the set of anti-ferromagnetic states. The points in $K_2$ have $\hat\sigma$-period equal to two. A similar result to the above is true for the general maximizing probability measure for $A$.

Now we will state Proposition 29 and its Corollary 30, which, together with Lemma 31, will be used to prove the main result of this section, the non-selection Theorem 32.

Proposition 29. Let $\mu^{\beta,\Phi}_{\Lambda_n}$ be the Gibbs measure in the volume $\Lambda_n$, defined by (24). For any fixed $j \in \mathbb{N}$ and $k \in \{-n,\ldots,n-1\}$, if
\[ B_{k,j} = \{(\theta_{-n},\ldots,\theta_n) \in (0,2\pi]^{2n+1} : \theta_k - \theta_{k+1} \in A_j\}, \]
then
\[ \mu^{\beta,\Phi}_{\Lambda_n}(B_{k,j}) = \frac{1}{Z(\beta)}\, \nu(A_j)\, e^{\beta c_j}, \]
where
\[ Z(\beta) = \frac{1}{2\pi} \int_{(0,2\pi]} e^{\beta U(x)}\, dx, \]
and $\nu(A_j)$ is the Lebesgue probability measure of $A_j$.

In fact, to prove Theorem 32, we will only need to consider the Borel sets $B_j = \{\theta_0 - \theta_1 \in A_j\} \subset B_i$, $j \in \mathbb{N}$, because we are interested in estimating $\mu^{\beta}_A(B_j) = \int I_{B_j}\, d\mu^{\beta}_A$, for each $j$, when $\beta \to \infty$.
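A discrete analogue of Proposition 29 can be verified by brute force. The sketch below replaces the circle by the cyclic group $\mathbb{Z}_N$ with uniform measure, which keeps the translation invariance the statement relies on; the grid size `N`, the step potential `levels`, the chosen level set `ring`, and `beta` are all hypothetical values chosen for illustration, not data from the text.

```python
import math
from itertools import product

N = 12                                   # hypothetical grid: angles 2*pi*m/N
levels = {0: 0.5, 1: 0.25, 11: 0.25}     # hypothetical step potential on Z_N

def U(m):
    """Step potential on Z_N; zero outside the listed points."""
    return levels.get(m % N, 0.0)

def gibbs_probability(beta, ring, n=1):
    """mu_{Lambda_n}(theta_0 - theta_1 in ring) by summing over all configurations.

    Also returns the partition function normalised by the uniform measure.
    """
    sites = 2 * n + 1
    Z = 0.0
    hit = 0.0
    for theta in product(range(N), repeat=sites):
        w = math.exp(beta * sum(U(theta[s] - theta[s + 1]) for s in range(sites - 1)))
        Z += w
        # theta[n] plays the role of theta_0 and theta[n + 1] of theta_1
        if (theta[n] - theta[n + 1]) % N in ring:
            hit += w
    return hit / Z, Z / N ** sites

beta = 3.0                                # illustrative inverse temperature
ring = {1, 11}                            # level set of U with value c = 0.25
prob, Z_vol = gibbs_probability(beta, ring)

Z1 = sum(math.exp(beta * U(m)) for m in range(N)) / N       # analogue of Z(beta)
predicted = (len(ring) / N) * math.exp(beta * 0.25) / Z1    # nu(A_j) e^{beta c_j} / Z(beta)
```

With `n = 1` there are two bonds, so `Z_vol` should agree with `Z1 ** 2`, the discrete counterpart of the identity $Z^{\beta,\Phi}_{\Lambda_n} = Z(\beta)^{2n}$, and `prob` should agree with `predicted`.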
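To state Corollary 30 we will consider the potential introduced in [20].

Corollary 30. Let $\varepsilon > 0$. Consider the special case where
\[ \tilde U(x) = \sum_{i=1}^{\infty} \frac{3}{2^{2i+1}}\, 1_{I_{2i}}(x) + \sum_{i=1}^{\infty} \frac{3}{2^{2i+2}}\, 1_{I_{2i+1}}(x - \pi) + \frac{1}{4}\, 1_{I_1}(x - \pi), \]
where $I_i = \left[ -\frac{\varepsilon^{3^i}}{2}, \frac{\varepsilon^{3^i}}{2} \right]$. For each $j \in \mathbb{N}$ we define the ring $A_j$ as follows. If $j$ is even, then $A_j = A_{2i} = I_{2i} \setminus I_{2i+2}$, and if $j$ is odd then $A_j = A_{2i+1} = (I_{2i+1} \setminus I_{2i+3}) + \pi$. For any fixed $j \in \mathbb{N}$ and $k \in \{-n,\ldots,n-1\}$, we have
\[ \mu^{\beta}_A(\{\theta_k - \theta_{k+1} \in A_j\}) = \mu^{\beta,\Phi}_{\Lambda_n}(\{\theta_k - \theta_{k+1} \in A_j\}) = \frac{1}{Z(\beta)}\, \nu(A_j)\, \exp\left( \frac{\beta}{2} - \frac{\beta}{2^{j+1}} \right), \]
where
\[ Z(\beta) = \frac{1}{2\pi} \int_{(0,2\pi]} e^{\beta U(x)}\, dx \]
and
\[ \nu(A_j) = \varepsilon^{3^j} - \varepsilon^{3^{j+2}}. \]
In particular,
\begin{align*}
\mu^{\beta}_A(\{\theta_0 - \theta_1 \in A_j\}) &= \frac{1}{Z(\beta)}\, \nu(A_j)\, \exp\left( \frac{\beta}{2} - \frac{\beta}{2^{j+1}} \right) \\
&= \frac{e^{\beta/2}}{Z(\beta)} \exp\left( -\frac{\beta}{2^{j+1}} + \log\bigl( \varepsilon^{3^j} - \varepsilon^{3^{j+2}} \bigr) \right). \tag{25}
\end{align*}

Remark 6. Before proceeding to the proof of the proposition we remark that repeated applications of Fubini's Theorem show that the partition function in the volume $\Lambda_n$ for the potential $A$ satisfies $Z^{\beta,\Phi}_{\Lambda_n} = Z(\beta)^{2n}$.

Fig. 1. The graph of the potential $U$.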
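The value $\frac12 - \frac{1}{2^{j+1}}$ that the potential takes on the ring $A_j$ can be checked term by term with exact rational arithmetic. This is only a small sketch; the helper name `ring_value` is hypothetical, and the only inputs are the coefficients $3/2^{2l+1}$, $3/2^{2l+2}$ and $1/4$ from the definition of $\tilde U$.

```python
from fractions import Fraction

def ring_value(j):
    """Value of U on the ring A_j, summed directly from the definition.

    A point of A_{2i} lies in I_{2l} exactly for l = 1..i, so it collects
    3/2^(2l+1) for those l.  A point of A_{2i+1}, shifted by pi, lies in
    I_1 and in I_{2l+1} for l = 1..i, so it collects 1/4 plus 3/2^(2l+2).
    """
    i = j // 2
    if j % 2 == 0:
        return sum(Fraction(3, 2 ** (2 * l + 1)) for l in range(1, i + 1))
    return Fraction(1, 4) + sum(Fraction(3, 2 ** (2 * l + 2)) for l in range(1, i + 1))

# exact values of U on the first ten rings
values = {j: ring_value(j) for j in range(1, 11)}
```

Each `values[j]` increases with `j` towards the supremum $1/2$, alternating between rings near $0$ (even `j`) and rings near $\pi$ (odd `j`).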
Proof of Proposition 29. Let $j \in \mathbb{N}$ and $k \in \{-n,\ldots,n-1\}$. By definition we have
\[ \mu^{\beta,\Phi}_{\Lambda_n}(\{\theta_k - \theta_{k+1} \in A_j\}) = \frac{1}{Z^{\beta,\Phi}_{\Lambda_n}} \int_{(0,2\pi]^{2n+1}} 1_{B_{k,j}}(\theta_{-n},\ldots,\theta_n)\, \exp\left( \beta \sum_{s=-n}^{n-1} U(\theta_s - \theta_{s+1}) \right) \prod_{i=-n}^{n} d\nu(\theta_i). \]
Using the properties of the exponential function, we have that the above integral is given by
\[ \int_{(0,2\pi]^{2n+1}} 1_{B_{k,j}}(\theta) \prod_{s=-n}^{n-1} \exp\bigl( \beta U(\theta_s - \theta_{s+1}) \bigr) \prod_{i=-n}^{n} d\nu(\theta_i). \tag{26} \]
To simplify the exposition we suppose that $k = -n$. The following explanation can easily be modified to work in the general case just by reordering the terms, which can be done by Fubini's Theorem. In the case $k = -n$ it follows from Fubini's Theorem that (26) is equal to
\[ \int_{(0,2\pi]^{2n}} 1_{B_{-n,j}}(\theta) \prod_{s=-n}^{n-2} e^{\beta U(\theta_s - \theta_{s+1})} \left( \int_{(0,2\pi]} e^{\beta U(\theta_{n-1} - \theta_n)}\, d\nu(\theta_n) \right) \prod_{i=-n}^{n-1} d\nu(\theta_i). \]
By the periodicity of $U$ it follows that the integral in parentheses is independent of $\theta_{n-1}$ and equal to $Z(\beta)$. Proceeding by induction, we can see that the above expression simplifies to
\[ (Z(\beta))^{2n-1} \int_{(0,2\pi]^2} 1_{B_{-n,j}}(\theta)\, e^{\beta U(\theta_{-n} - \theta_{-n+1})}\, d\nu(\theta_{-n})\, d\nu(\theta_{-n+1}). \]
To evaluate this, we consider the iterated integral where the innermost integral is taken in the variable $\theta_{-n}$, with $\theta_{-n+1}$ fixed. For any fixed value of $\theta_{-n+1}$, whenever $\theta \in B_{-n,j}$ we have that $\theta_{-n} \in A_j + \theta_{-n+1}$. On this set the potential $U$ is constant, i.e. $U(\theta_{-n} - \theta_{-n+1}) = c_j$. From these observations the previous integral is simply
\[ (Z(\beta))^{2n-1} \int_{(0,2\pi]} \int_{A_j + \theta_{-n+1}} e^{\beta c_j}\, d\nu(\theta_{-n})\, d\nu(\theta_{-n+1}), \]
which is equal to
\[ (Z(\beta))^{2n-1}\, e^{\beta c_j} \int_{(0,2\pi]} \int_{A_j + \theta_{-n+1}} d\nu(\theta_{-n})\, d\nu(\theta_{-n+1}). \]
Finally, by the translation invariance property of the Lebesgue measure, we end up with
\[ (Z(\beta))^{2n-1}\, e^{\beta c_j}\, \nu(A_j). \tag{27} \]
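Dividing this value by the partition function, we get
\[ \mu^{\beta,\Phi}_{\Lambda_n}(\{\theta_k - \theta_{k+1} \in A_j\}) = \frac{\nu(A_j)}{Z(\beta)}\, e^{\beta c_j}. \]
Note that for $|k| < n$, this expression does not depend on $n$. From this it follows easily that for $|k| < n$ and $j \in \mathbb{N}$,
\[ \mu^{\beta,\Phi}_{\Lambda_n}(\{\theta_k - \theta_{k+1} \in A_j\}) = \mu^{\beta}_A(\{\theta_k - \theta_{k+1} \in A_j\}). \]

Proof of Corollary 30. This follows from the fact that, if $j = 2i$ and $x \in A_{2i}$, then
\[ U(x) = \sum_{l=1}^{i} \frac{3}{2^{2l+1}} = \frac{3}{8} \sum_{l=0}^{i-1} \frac{1}{4^l} = \frac{3}{8} \cdot \frac{1 - \frac{1}{4^i}}{1 - \frac{1}{4}} = \frac{1}{2} \left( 1 - \frac{1}{4^i} \right) = \frac{1}{2} - \frac{1}{2^{2i+1}} = \frac{1}{2} - \frac{1}{2^{j+1}}. \]
On the other hand, if $j = 2i+1$ and $x \in A_{2i+1}$, we have that
\[ U(x) = \frac{1}{4} + \sum_{l=1}^{i} \frac{3}{2^{2l+2}} = \frac{1}{4} + \frac{3}{16} \sum_{l=0}^{i-1} \frac{1}{2^{2l}} = \frac{1}{4} + \frac{3}{16} \cdot \frac{1 - \frac{1}{4^i}}{1 - \frac{1}{4}} = \frac{1}{4} + \frac{1}{4} \left( 1 - \frac{1}{4^i} \right) = \frac{1}{2} - \frac{1}{2^{2i+2}} = \frac{1}{2} - \frac{1}{2^{j+1}}. \]

6.3. Maximizing $\mu^{\beta}_A(B_{k,j})$

Now we will present some useful calculations to compute $\mu^{\beta}_A(B_{k,j})$. We point out that later we will need just the case $k = 0$. Let
\[ f_\beta(x) := -\frac{\beta}{2^{x+1}} + \log\bigl( \varepsilon^{3^x} - \varepsilon^{3^{x+2}} \bigr). \]
The maximum of this function can be found by differentiation with respect to $x$:
\begin{align*}
f_\beta'(x) &= -\frac{d}{dx} \frac{\beta}{2^{x+1}} + \frac{d}{dx} \log\bigl( \varepsilon^{3^x} - \varepsilon^{3^{x+2}} \bigr) \\
&= \frac{\beta \log 2}{2^{x+1}} + \frac{\varepsilon^{3^x} (\log\varepsilon \log 3)\, 3^x - 9\, \varepsilon^{3^{x+2}} (\log\varepsilon \log 3)\, 3^x}{\varepsilon^{3^x} - \varepsilon^{3^{x+2}}} \\
&= \frac{\beta \log 2}{2^{x+1}} + (\log\varepsilon \log 3)\, 3^x \left[ \frac{\varepsilon^{3^x} - 9\, \varepsilon^{3^{x+2}}}{\varepsilon^{3^x} - \varepsilon^{3^{x+2}}} \right].
\end{align*}
If $x$ is large enough the equation $f_\beta'(x) = 0$ is solvable and the solution is implicitly given by
\[ 0 = \frac{\beta \log 2}{2^{x+1}} + (\log\varepsilon \log 3)\, 3^x \left[ \frac{\varepsilon^{3^x} - 9\, \varepsilon^{3^{x+2}}}{\varepsilon^{3^x} - \varepsilon^{3^{x+2}}} \right], \]
which is equivalent to
\[ \beta = 6^x\, \frac{-2 \log\varepsilon \log 3}{\log 2} \left[ \frac{\varepsilon^{3^x} - 9\, \varepsilon^{3^{x+2}}}{\varepsilon^{3^x} - \varepsilon^{3^{x+2}}} \right]. \tag{28} \]
The fraction appearing in the above equation can be rewritten as
\[ \theta(\varepsilon,x) \equiv \frac{\varepsilon^{3^x} \bigl( 1 - 9\, \varepsilon^{(3^{x+2} - 3^x)} \bigr)}{\varepsilon^{3^x} \bigl( 1 - \varepsilon^{(3^{x+2} - 3^x)} \bigr)} = \frac{1 - 9\, \varepsilon^{8 \cdot 3^x}}{1 - \varepsilon^{8 \cdot 3^x}}. \tag{29} \]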
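The relation (28) between $\beta$ and the critical point can be sanity-checked with a finite-difference derivative. The sketch below is illustrative only: the values `eps = 0.5` and `x0 = 2.0`, the step `h`, and the function names are assumptions made for the example, and the logarithm is evaluated as $3^x \log\varepsilon + \log(1 - \varepsilon^{8\cdot 3^x})$ to avoid underflow.

```python
import math

def theta(eps, x):
    """theta(eps, x) = (1 - 9 eps^(8*3^x)) / (1 - eps^(8*3^x)), as in (29)."""
    q = math.exp(8.0 * 3.0 ** x * math.log(eps))
    return (1.0 - 9.0 * q) / (1.0 - q)

def f(beta, eps, x):
    """f_beta(x) = -beta / 2^(x+1) + log(eps^(3^x) - eps^(3^(x+2)))."""
    q = math.exp(8.0 * 3.0 ** x * math.log(eps))
    return -beta / 2.0 ** (x + 1) + 3.0 ** x * math.log(eps) + math.log1p(-q)

def critical_beta(eps, x0):
    """The beta for which x0 is a critical point of f_beta, read off from (28)."""
    return 6.0 ** x0 * (-2.0 * math.log(eps)) * (math.log(3.0) / math.log(2.0)) * theta(eps, x0)

eps, x0, h = 0.5, 2.0, 1e-5          # illustrative values (assumptions)
beta = critical_beta(eps, x0)
# central difference approximation of f_beta'(x0); should be close to zero
slope = (f(beta, eps, x0 + h) - f(beta, eps, x0 - h)) / (2.0 * h)
```

Comparing `f(beta, eps, x0)` with the values at `x0 - 0.5` and `x0 + 0.5` also suggests that the critical point is a maximum for these parameters.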
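6.4. An important lemma

In this subsection we present an important lemma that will help us to estimate the probability $\mu^{\beta}_A(B_{k,j})$.

Lemma 31. Let $(\Omega, B)$ be a measurable space and $(C_j)_{j \in \mathbb{N}}$ a measurable partition of $\Omega$. For any positive $\beta$ let $P_\beta$ be a probability measure in $(\Omega, B)$ such that
\[ P_\beta(C_j) = \frac{1}{\bar Z(\beta)} \exp\left( -\frac{\beta}{2^{j+1}} + \log\bigl( \varepsilon^{3^j} - \varepsilon^{3^{j+2}} \bigr) \right), \]
where $\bar Z(\beta)$ is a normalizing constant and $\varepsilon > 0$. Given $\delta > 0$, there exists an $\varepsilon_\delta > 0$ such that, for any $0 < \varepsilon < \varepsilon_\delta$ and for all $j \in \mathbb{N}$, we have
\[ P_{\beta_j}(C_j) > 1 - \delta, \]
where $\beta_j$ is given by
\[ \beta_j = 6^j\, 2 C_\varepsilon\, \frac{\log 3}{\log 2}\, \theta(\varepsilon,j) \]
with
\[ \theta(\varepsilon,j) = \frac{1 - 9\, \varepsilon^{8 \cdot 3^j}}{1 - \varepsilon^{8 \cdot 3^j}} \quad \text{and} \quad C_\varepsilon = -\log\varepsilon. \]

Remark. A more careful analysis of the proof presented here shows that the above lemma also works for a slightly different potential $U$, where we replace in the initial definition the terms $2^{j+1}$ and $3^j$ by $(1+\delta)^{j+1}$ and $(1+\gamma)^j$, respectively, given that $0 < \delta < \gamma$.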
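The concentration stated in Lemma 31 can be observed numerically by computing the weights in log space. This is only a sketch under assumptions: `eps = 0.01` is an arbitrary small value, the partition is truncated at `j_max` (the discarded weights are far smaller than double precision can represent), and the function names are hypothetical.

```python
import math

def log_weight(beta, eps, j):
    """log of exp(-beta / 2^(j+1)) * (eps^(3^j) - eps^(3^(j+2)))."""
    q = math.exp(8.0 * 3.0 ** j * math.log(eps))   # eps^(8*3^j); underflows harmlessly to 0
    return -beta / 2.0 ** (j + 1) + 3.0 ** j * math.log(eps) + math.log1p(-q)

def beta_j(eps, j):
    """beta_j = 6^j * 2 C_eps * (log 3 / log 2) * theta(eps, j), with C_eps = -log(eps)."""
    q = math.exp(8.0 * 3.0 ** j * math.log(eps))
    theta = (1.0 - 9.0 * q) / (1.0 - q)
    return 6.0 ** j * 2.0 * (-math.log(eps)) * (math.log(3.0) / math.log(2.0)) * theta

def probabilities(beta, eps, j_max=15):
    """P_beta(C_j) for j = 1..j_max, with the partition truncated at j_max (assumption)."""
    logs = [log_weight(beta, eps, j) for j in range(1, j_max + 1)]
    top = max(logs)                                 # subtract the maximum to avoid underflow
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return [w / total for w in weights]

eps = 0.01                                          # illustrative small epsilon (assumption)
# mass that P_{beta_j} gives to C_j, for the first few j
mass_at_j = {j: probabilities(beta_j(eps, j), eps)[j - 1] for j in range(1, 7)}
```

For this choice of `eps`, each entry of `mass_at_j` is extremely close to 1, so along the sequence $\beta_j$ the mass jumps from ring to ring; since even-indexed rings sit near $0$ and odd-indexed rings near $\pi$, this alternation is the mechanism exploited in the non-selection result.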
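Proof. Note that $\theta(\varepsilon,x)$ is an increasing function of $x$, and has limit equal to 1 when $\varepsilon \to 0$ or $x \to +\infty$. Consider the function
\begin{align*}
f_\beta(x) &= -\frac{\beta}{2^{x+1}} + \log\bigl( \varepsilon^{3^x} - \varepsilon^{3^{x+2}} \bigr) \\
&= -\frac{\beta}{2^{x+1}} - C_\varepsilon 3^x + \log\bigl( 1 - \varepsilon^{8 \cdot 3^x} \bigr). \tag{30}
\end{align*}
From (28) and (29) it follows that its critical point $x_0$ has to satisfy
\[ \beta = 6^{x_0}\, 2 C_\varepsilon\, \frac{\log 3}{\log 2}\, \theta(\varepsilon,x_0). \tag{31} \]
Note that the last equation allows us to obtain the maximum point $x_0$ of $f_\beta$, thus making $x_0 = x_0(\beta)$ an increasing (therefore invertible) and unbounded function of $\beta$. Arguing in the inverse direction, for each $x_0 = j_0 \in \mathbb{N}$ we can choose $\beta = \beta(j_0)$ as the unique solution to (31), which means $j_0$ is the maximum point of $f_{\beta(j_0)}$. Fix now $j_0 \in \mathbb{N}$. If we set
\[ \kappa^x_\varepsilon = \log\bigl( 1 - \varepsilon^{8 \cdot 3^x} \bigr) \]
(note that $\kappa^x_\varepsilon$ is an increasing function of $x$), it follows from (30) and (31) that, for any $k \in \mathbb{Z}$,
\begin{align*}
f_{\beta(j_0)}(j_0 + k) &= -\frac{\beta(j_0)}{2^{j_0+k+1}} - C_\varepsilon 3^{j_0+k} + \kappa^{j_0+k}_\varepsilon \tag{32} \\
&= -6^{j_0}\, 2 C_\varepsilon\, \frac{\log 3}{\log 2} \cdot \frac{\theta(\varepsilon,j_0)}{2^{j_0+k+1}} - C_\varepsilon 3^{j_0+k} + \kappa^{j_0+k}_\varepsilon \\
&= -3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2} \cdot \frac{\theta(\varepsilon,j_0)}{2^k} + 3^k \right] + \kappa^{j_0+k}_\varepsilon. \tag{33}
\end{align*}
Now we will use these identities to get an upper bound for
\[ \frac{P_{\beta(j_0)}(C_{j_0+k})}{P_{\beta(j_0)}(C_{j_0})}. \]
Before going to the upper bound computations we prove: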
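Identity 1. For any integer $k \ge -j_0 + 1$, we get from (33) the following identity:
\begin{align*}
\frac{P_{\beta(j_0)}(C_{j_0+k})}{P_{\beta(j_0)}(C_{j_0})} &= \frac{e^{f_{\beta(j_0)}(j_0+k)}}{e^{f_{\beta(j_0)}(j_0)}} \\
&= \exp\left( -3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2} \cdot \frac{\theta(\varepsilon,j_0)}{2^k} + 3^k \right] + 3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2} \cdot \theta(\varepsilon,j_0) + 1 \right] + \kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon \right) \\
&= \exp\left( -3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2} \cdot \frac{\theta(\varepsilon,j_0)}{2^k} + 3^k - \frac{\log 3}{\log 2} \cdot \theta(\varepsilon,j_0) - 1 \right] + \kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon \right) \\
&= \exp\left( -3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2} \cdot \left( \frac{\theta(\varepsilon,j_0)}{2^k} - \theta(\varepsilon,j_0) \right) - 1 + 3^k \right] + \kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon \right) \\
&= \exp\bigl( -3^{j_0} C_\varepsilon 3^k \bigr) \times \exp\left( -3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2} \cdot \left( \frac{\theta(\varepsilon,j_0)}{2^k} - \theta(\varepsilon,j_0) \right) - 1 \right] + \kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon \right) \\
&= \exp\bigl( -3^{j_0} C_\varepsilon 3^k \bigr) \times \exp\left( 3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) \cdot \left( 1 - \frac{1}{2^k} \right) + 1 \right] + \kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon \right).
\end{align*}
With the above identities we are ready to show how to get the upper bounds for $\frac{P_{\beta(j_0)}(C_{j_0+k})}{P_{\beta(j_0)}(C_{j_0})}$. This will be done by considering separate cases, according to whether $k$ is positive or negative.

Case $k > 0$. In this case, using the previous identity, $\theta(\varepsilon,j_0) < 1$ and $\kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon < 1$, we have
\begin{align*}
\frac{P_{\beta(j_0)}(C_{j_0+k})}{P_{\beta(j_0)}(C_{j_0})} &< \exp\bigl( -3^{j_0} C_\varepsilon 3^k \bigr) \exp\left( 3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2} + 1 \right] + 1 \right) \\
&\le \exp\left( -3^{j_0} C_\varepsilon \left[ 3^k - \frac{\log 3}{\log 2} - 1 \right] + 1 \right).
\end{align*}
Of course the above inequality implies, for all $k \in \mathbb{N}$, that
\[ P_{\beta(j_0)}(C_{j_0+k}) \le P_{\beta(j_0)}(C_{j_0}) \exp\left( 1 - 3^{j_0} C_\varepsilon \left[ 3^k - \frac{\log 3}{\log 2} - 1 \right] \right), \]
and then, summing over $k$, we obtain
\[ \sum_{k=1}^{\infty} P_{\beta(j_0)}(C_{j_0+k}) \le \sum_{k=1}^{\infty} \exp\left( 1 - 3^{j_0} C_\varepsilon \left[ 3^k - \frac{\log 3}{\log 2} - 1 \right] \right). \]
In order to bound this series, we decompose it as follows:
\[ \exp\left( 1 - 3^{j_0} C_\varepsilon \left[ 2 - \frac{\log 3}{\log 2} \right] \right) + \sum_{k=2}^{\infty} \exp\left( 1 - 3^{j_0} C_\varepsilon \left[ 3^k - \frac{\log 3}{\log 2} - 1 \right] \right). \]
By a simple induction one proves that $k \le 3^k - \frac{\log 3}{\log 2} - 1$ for all $k \ge 2$. From this observation the following upper bound follows:
\begin{align*}
\sum_{k=1}^{\infty} P_{\beta(j_0)}(C_{j_0+k}) &\le \exp\left( 1 - 3^{j_0} C_\varepsilon \left[ 2 - \frac{\log 3}{\log 2} \right] \right) + \sum_{k=2}^{\infty} \exp\bigl( 1 - 3^{j_0} C_\varepsilon k \bigr) \\
&= \exp\left( 1 - 3^{j_0} C_\varepsilon \left[ 2 - \frac{\log 3}{\log 2} \right] \right) + \frac{\exp(1 - 3^{j_0}\, 2 C_\varepsilon)}{1 - \exp(-3^{j_0} C_\varepsilon)} \\
&\le \exp\left( 1 - 3 C_\varepsilon \left[ 2 - \frac{\log 3}{\log 2} \right] \right) + \frac{\exp(1 - 6 C_\varepsilon)}{1 - \exp(-3 C_\varepsilon)}.
\end{align*}
As $C_\varepsilon = -\log\varepsilon \to \infty$ when $\varepsilon \to 0$, we can choose an $\varepsilon_0$ such that for all $0 < \varepsilon < \varepsilon_0$, we have
\[ \exp\left( 1 - 3 C_\varepsilon \left[ 2 - \frac{\log 3}{\log 2} \right] \right) + \frac{\exp(1 - 6 C_\varepsilon)}{1 - \exp(-3 C_\varepsilon)} < \frac{\delta}{2}. \tag{34} \]
Note that $\varepsilon_0 > 0$ does not depend on $j_0$. This implies that
\[ \sum_{k=1}^{\infty} P_{\beta(j_0)}(C_{j_0+k}) < \frac{\delta}{2}, \tag{35} \]
for any $j_0 \in \mathbb{N}$, provided $0 < \varepsilon < \varepsilon_0$.

Case $k < 0$. From Identity 1, we have that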
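\[ \frac{P_{\beta(j_0)}(C_{j_0+k})}{P_{\beta(j_0)}(C_{j_0})} \]
is equal to
\[ \exp\bigl( -3^{j_0} C_\varepsilon 3^k \bigr) \exp\left( 3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) \cdot \left( 1 - \frac{1}{2^k} \right) + 1 \right] + \kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon \right). \]
Note that we can choose $0 < \varepsilon_1 \le \varepsilon_0$ such that, for all $0 < \varepsilon < \varepsilon_1$ and all $j_0 \ge 1$, we have
\[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) - 1 = \frac{\log 3}{\log 2} \cdot \frac{1 - 9\, \varepsilon^{8 \cdot 3^{j_0}}}{1 - \varepsilon^{8 \cdot 3^{j_0}}} - 1 > \frac{\frac{\log 3}{\log 2} - 1}{2} \equiv A. \tag{36} \]
As a consequence we have
\[ \theta(\varepsilon,j_0) > \frac{\log 2}{\log 3}. \tag{37} \]
Then
\[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) \cdot \left( 1 - \frac{1}{2^k} \right) + 1 < 0 \]
for any $k \in \{-j_0+1,\ldots,-1\}$, and we have the following inequality, when we use $\kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon < 0$ and $-3^{j_0} C_\varepsilon 3^k < 0$:
\begin{align*}
&\exp\bigl( -3^{j_0} C_\varepsilon 3^k \bigr) \exp\left( 3^{j_0} C_\varepsilon \left[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) \cdot \left( 1 - \frac{1}{2^k} \right) + 1 \right] + \kappa^{j_0+k}_\varepsilon - \kappa^{j_0}_\varepsilon \right) \\
&\quad\le \exp\left( 3 C_\varepsilon \left[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) \cdot \left( 1 - \frac{1}{2^k} \right) + 1 \right] \right).
\end{align*}
From this we obtain
\[ P_{\beta(j_0)}(C_{j_0+k}) \le \exp\left( 3 C_\varepsilon \left[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) \cdot \left( 1 - \frac{1}{2^k} \right) + 1 \right] \right). \]
Using this upper bound, (36) and (37) again, it follows that
\begin{align*}
\sum_{k=1}^{j_0-1} P_{\beta(j_0)}(C_{j_0-k}) &\le \sum_{k=1}^{\infty} \exp\left( 3 C_\varepsilon \left[ \frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) \cdot (1 - 2^k) + 1 \right] \right) \\
&< \exp\left( 3 C_\varepsilon \left[ -\frac{\log 3}{\log 2}\, \theta(\varepsilon,j_0) + 1 \right] \right) + \sum_{k=2}^{\infty} \exp\bigl( 3 C_\varepsilon [(1 - 2^k) + 1] \bigr) \\
&< e^{-3 C_\varepsilon A} + \sum_{k=2}^{\infty} \exp\bigl( 3 C_\varepsilon (2 - 2^k) \bigr) \\
&< e^{-3 C_\varepsilon A} + \sum_{k=2}^{\infty} \exp(-3 C_\varepsilon k) \\
&= e^{-3 C_\varepsilon A} + \frac{e^{-6 C_\varepsilon}}{1 - e^{-3 C_\varepsilon}}.
\end{align*}
Using again that $C_\varepsilon = -\log\varepsilon \to +\infty$ when $\varepsilon \to 0$, and that $A = \frac{\log 3 - \log 2}{2 \log 2} > 0$, we can choose $0 < \varepsilon_\delta \le \varepsilon_1$ such that for all $0 < \varepsilon < \varepsilon_\delta$ we have
\[ e^{-3\, \frac{\log 3 - \log 2}{2 \log 2}\, C_\varepsilon} + \frac{e^{-6 C_\varepsilon}}{1 - e^{-3 C_\varepsilon}} < \frac{\delta}{2}, \tag{38} \]
which implies
\[ \sum_{k=1}^{j_0-1} P_{\beta(j_0)}(C_{j_0-k}) < \frac{\delta}{2}. \tag{39} \]
Finally, by (35) and (39) we get
\[ \sum_{k \in \mathbb{N} \setminus \{j_0\}} P_{\beta(j_0)}(C_k) < \delta, \]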
if $\varepsilon < \varepsilon_\delta$.

6.5. The non-selection theorem

Now we are ready to state and prove the main result of this section, which is due to van Enter and Ruszel [20]. Note that in the notation we used before the maximizing value is $m(A) = \sup U$.

Theorem 32. For the potential $A$ described above, consider the family of probability measures $\mu^{\beta}_A$, with $\beta \in \mathbb{R}$. Then, in the weak$^*$ topology, there is no selection of measure, that is, there is no limit for $\mu^{\beta}_A$ when $\beta \to \infty$.

Proof. Consider the Borel set $B = \{(\theta_0 - \theta_1) \in [0,\pi] \subset S^1\} \subset B_i$ and the non-continuous function $I_B$. Given small $\delta$ and $\varepsilon$, we can approximate $I_B$ by a continuous function $\varphi : B_i \to \mathbb{R}$, where the set of points where $\varphi \ne I_B$ is contained in the small set
\[ D = \{(\theta_0 - \theta_1) \in [0,\varepsilon] \cup [\pi - \varepsilon,\pi] \subset S^1\} \subset B_i. \]
From the above we can choose a suitable $\varphi$, and also present two sequences $s_n$ and $t_n$, converging to infinity, such that
\[ \int \varphi\, d\mu^{s_n}_A > 1 - \delta \quad \text{and} \quad \int \varphi\, d\mu^{t_n}_A < \delta. \]
This shows that there is no limit for $\mu^{\beta}_A$.

Remark 7. We point out that the example described above can be adapted in order to produce a continuous potential $A$ which does not select in the limit when $\beta \to \infty$ [20].

References

[1] T. Bousch, La condition de Walters, Ann. Sci. École Norm. Sup. (4) 34 (2001) 287–311.
[2] V. Bangert, Mather sets for twist maps and geodesics on tori, in Dynamics Reported, Vol. 1 (Wiley, 1988), pp. 1–56.
[3] A. Baraviera, A. O. Lopes and Ph. Thieullen, A large deviation principle for Gibbs states of Hölder potentials: The zero temperature case, Stoch. Dyn. 6 (2006) 77–96.
[4] A. Baraviera, R. Leplaideur and A. O. Lopes, Selection of measures for a potential with two maxima at the zero temperature limit, to appear in SIAM J. Appl. Dyn.
[5] A. Baraviera, A. O. Lopes and J. Mengue, On the selection of subaction and measure for a subclass of Walters's potentials, UFRGS, preprint (2011).
[6] P. Bernard and G. Contreras, A generic property of families of Lagrangian systems, Ann. Math. (2) 167 (2008) 1099–1108.
[7] R. Bowen, Gibbs States and the Ergodic Theory of Anosov Diffeomorphisms, Lecture Notes in Math., Vol. 470 (Springer-Verlag, 1975).
[8] J. Brémont, Gibbs measures at temperature zero, Nonlinearity 16(2) (2003) 419–426.
[9] J. R. Chazottes and M. Hochman, On the zero-temperature limit of Gibbs states, Comm. Math. Phys. 297 (2010) 265–281.
[10] J. R. Chazottes, J. M. Gambaudo and E. Ugalde, Zero-temperature limit of one dimensional Gibbs states via renormalization: The case of locally constant potentials, Ergodic Theory Dynam. Systems 31(4) (2011) 1109–1161.
[11] J. Conway, A Course in Functional Analysis (Springer-Verlag, 1990).
[12] G. Contreras and R. Iturriaga, Global Minimizers of Autonomous Lagrangians, No. 22, Colóquio Brasileiro de Matemática (IMPA, 1999).
[13] W. Chou and R. J. Duffin, An additive eigenvalue problem of physics related to linear programming, Adv. Appl. Math. 8 (1987) 486–498.
[14] W. Chou and R. Griffiths, Ground states of one-dimensional systems using effective potentials, Phys. Rev. B 34 (1986) 6219–6234.
[15] G. Contreras, A. O. Lopes and Ph. Thieullen, Lyapunov minimizing measures for expanding maps of the circle, Ergodic Theory Dynam. Systems 21 (2001) 1379–1409.
[16] G. Contreras, A. O. Lopes and E. R. Oliveira, Ergodic transport theory, periodic maximizing probabilities and the twist condition, UFRGS, preprint (2011).
[17] K. Deimling, Nonlinear Functional Analysis (Springer-Verlag, 1985).
[18] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications (Springer-Verlag, 1998).
[19] A. C. D. van Enter, R. Fernandez and A. D. Sokal, Regularity properties and pathologies of position-space renormalization-group transformations: Scope and limitations of Gibbsian theory, J. Stat. Phys. 72 (1993) 879–1187.
[20] A. C. D. van Enter and W. M. Ruszel, Chaotic temperature dependence at zero temperature, J. Stat. Phys. 127 (2007) 567–573.
[21] A. Fathi, Théorème KAM faible et théorie de Mather sur les systèmes Lagrangiens, C. R. Acad. Sci. Paris Sér. Math. 324 (1997) 1043–1046.
[22] Y. Fukui and M. Horiguchi, One-dimensional Chiral XY-model at finite temperature, Interdiscip. Inform. Sci. 1 (1995) 133–149.
[23] E. Garibaldi and A. O. Lopes, On Aubry–Mather theory for symbolic dynamics, Ergodic Theory Dynam. Systems 28 (2008) 791–815.
[24] E. Garibaldi and A. O. Lopes, The effective potential and transshipment in thermodynamic formalism at temperature zero, to appear in Stoch. Dyn.
[25] E. Garibaldi and Ph. Thieullen, Minimizing orbits in the discrete Aubry–Mather model, Nonlinearity 24 (2011) 563–611.
[26] E. Garibaldi and Ph. Thieullen, Description of some ground states by Puiseux technics, preprint (2010).
[27] H.-O. Georgii, Gibbs Measures and Phase Transitions (de Gruyter, Berlin, 1988).
[28] D. A. Gomes, Viscosity solution method and the discrete Aubry–Mather problem, Discrete Contin. Dyn. Syst. Ser. A 13 (2005) 103–116.
[29] D. A. Gomes and E. Valdinoci, Entropy penalization methods for Hamilton–Jacobi equations, Adv. Math. 215 (2007) 94–152.
[30] D. A. Gomes, A. O. Lopes and J. Mohr, The Mather measure and a large deviation principle for the entropy penalized method, Commun. Contemp. Math. 13 (2011) 235–268.
[31] R. B. Israel, Convexity in the Theory of Lattice Gases (Princeton University Press, 1979).
[32] O. Jenkinson, Ergodic optimization, Discrete Contin. Dyn. Syst. Ser. A 15 (2006) 197–224.
[33] G. Keller, Gibbs States in Ergodic Theory (Cambridge Press, 1998).
[34] S. Karlin, Total Positivity (Stanford Univ. Press, 1968).
[35] J. Lebowitz and A. Martin-Löf, On the uniqueness of the Gibbs state for Ising spin systems, Comm. Math. Phys. 25 (1972) 276–282.
[36] R. Leplaideur, A dynamical proof for the convergence of Gibbs measures at temperature zero, Nonlinearity 18(6) (2005) 2847–2880.
[37] A. O. Lopes, Entropy and large deviation, Nonlinearity 3 (1990) 527–546.
[38] A. O. Lopes, J. Mohr, R. Souza and Ph. Thieullen, Negative entropy, zero temperature and stationary Markov chains on the interval, Bull. Braz. Math. Soc. (N.S.) 40 (2009) 1–52.
[39] A. O. Lopes and J. Mengue, Zeta measures and thermodynamic formalism for temperature zero, Bull. Braz. Math. Soc. (N.S.) 41(3) (2010) 449–480.
[40] A. O. Lopes and J. Mengue, Selection of measure and a large deviation principle for the general one-dimensional XY model, UFRGS, preprint (2011).
[41] A. O. Lopes, E. R. Oliveira and Ph. Thieullen, The dual potential, the involution kernel and transport in ergodic optimization, preprint (2008).
December 17, J070-S0129055X11004527
2011 9:23 WSPC/S0129-055X
148-RMP
The General One-Dimensional XY Model
1113
[42] R. Ma˜ n´e, Generic properties and problems of minimizing measures of Lagrangian systems, Nonlinearity 9 (1996) 273–310. [43] J. Mather, Action minimizing invariant measures for positive definite Lagrangian systems, Math. Z. 207(2) (1991) 169–207. [44] D. H. Mayer, The Ruelle–Araki Transfer Operator in Classical Statistical Mechanics, Lecture Notes in Physics, Vol. 123 (Springer-Verlag, 1980). [45] W. Parry and M. Pollicott, Zeta functions and the periodic orbit structure of hyperbolic dynamics, Ast´erisque 187–188 (1990), 268 pp. [46] M. Pollicott and M. Yuri, Dynamical Systems and Ergodic Theory (Cambridge Press, 1998). [47] R. T. Rockafellar, Extension of Fenchel’s duality theorem for convex functions, Duke Math. J. 33 (1966) 81–89. [48] D. Ruelle, Thermodynamic Formalism, 2nd edn. (Cambridge Press, 2004). [49] H. H. Schaefer, Banach Lattices and Positive Operators (Springer-Verlag, 1974). [50] B. Simon, The Statistical Mechanics of Lattice Gases (Princeton Univ. Press, 1993). [51] R. R. Souza, Sub-actions for weakly hyperbolic one-dimensional systems, Dyn. Syst. 18 (2003) 165–179. [52] Y. Velenik, Phase separation as a large deviations problem, Ph.D. thesis, Lausanne (2003). [53] O. Sarig, Lecture notes on thermodynamic formalism for topological Markov shifts, PenState, USA, preprint (2009). [54] A. Taylor, Introduction to Functional Analysis (Krieger Pub Co., 1986). [55] F. Spitzer, A variational characterization of finite Markov chains, Ann. Math. Statist. 43 (1972) 303–307. [56] A. N. Shiryaev, Probability, 2nd edn. (Springer, 1984).
December 17, J070-S0129055X11004539
2011 9:22 WSPC/S0129-055X
148-RMP
Reviews in Mathematical Physics Vol. 23, No. 10 (2011) 1115–1156 © World Scientific Publishing Company DOI: 10.1142/S0129055X11004539
SCALING LIMITS OF INTEGRABLE QUANTUM FIELD THEORIES
HENNING BOSTELMANN Department of Mathematics, University of York, York YO10 5DD, United Kingdom
[email protected] GANDALF LECHNER Institute for Theoretical Physics, Vor dem Hospitaltore 1, 04013 Leipzig, Germany
[email protected] GERARDO MORSELLA Department of Mathematics, University of Roma Tor Vergata, viale della Ricerca Scientifica 1, I-00133 Roma, Italy
[email protected]
Dedicated to the memory of Claudio D’Antoni
Received 17 May 2011
Revised 13 October 2011
Short distance scaling limits of a class of integrable models on two-dimensional Minkowski space are considered in the algebraic framework of quantum field theory. Making use of the wedge-local quantum fields generating these models, it is shown that massless scaling limit theories exist, and decompose into (twisted) tensor products of chiral, translation-dilation covariant field theories. On the subspace which is generated from the vacuum by the observables localized in finite light ray intervals, this symmetry can be extended to the Möbius group. The structure of the interval-localized algebras in the chiral models is discussed in two explicit examples.
Keywords: Quantum field theory; scaling limits; conformal symmetry; operator algebras.
Mathematics Subject Classification 2010: 81T05, 81T10, 47C15, 81T40
1. Introduction
In the analysis of quantum field theories, the information gained by computing the ultraviolet scaling limit and determining its properties is most relevant. Probably the most important application of this principle in physics is perturbative asymptotic freedom of QCD, the key feature which led to its general acceptance as the quantum field theory of strong interactions.
In view of its importance, several approaches to the computation of the scaling limit have been developed, adapted to different descriptions of quantum field theory. In the Lagrangian framework, the requirement that the physical amplitudes are independent of the arbitrary choice of the distance (or energy) scale at which the theory is renormalized, provides an equation for the dependence of renormalized correlation function on coupling constants and the renormalization scale (the Callan–Symanzik equation). Using information from perturbation theory, the coefficients of the equation (the β-functions) can usually be determined, and the scaling limit of correlation functions computed by solving it. However, these methods are of little help in cases in which perturbation theory is not reliable, or where the theory is not defined in terms of a Lagrangian at all. In order to circumvent these problems, a different approach, based on the algebraic setting of quantum field theory [33], has been proposed by Buchholz and Verch [22] and extended in [12, 13]. In this approach, one considers the scaling algebra, i.e. the algebra generated by functions λ → Aλ of the scaling parameter with values in the algebra of observables of the theory, satisfying certain specific phase space properties. The scaling limit is then obtained as the GNS representation of the scaling algebra induced by the scaling limit of the vacuum state on the original algebra at finite scales. Relying only on the knowledge of the observables of the theory, this method is completely model-independent, and it proved to be very useful in analyzing the scaling limit of charged sectors, and in providing an intrinsic definition of confined charge [16, 25]. A study of the relations with the Lagrangian approach can be found in [12]. In the present article, we study scaling limits of a certain class of integrable quantum field theories on two-dimensional Minkowski space. 
It is interesting to note that two-dimensional sigma models, which are integrable field theories — although not directly covered by our results — share with QCD the property of asymptotic freedom, as well as several others (see, e.g., [61] and references therein). We will study a simplified version of these: At finite scale, the models we are interested in describe a single type of scalar neutral Bosons of mass m > 0, whose collision theory is governed by a factorizing S-matrix. This means that the particle number is conserved in each scattering process and the n-body S-matrix factorizes into a product of two-body S-matrices, cf. the textbook and review [1, 26] and the references cited therein. A prominent example of a model in the considered class is the Sinh-Gordon model. We are particularly interested in the connection between the long and short distance regimes of such quantum field theories, represented by the S-matrix on the one hand and the scaling limit on the other hand. For the simplified particle spectrum that we consider here, a factorizing S-matrix can be fully characterized by a single complex-valued function S, the so-called scattering function. It is therefore possible to formulate these models in the spirit of inverse scattering theory, taking a scattering function S and a mass value m > 0 as an input. Such a setup is directly related to the long distance regime, and will be more convenient for our analysis
than Euclidean perturbation theory (see, e.g., [30]), where the relation to the real time S-matrix is quite indirect. There exist different, complementary, approaches to the inverse scattering problem. One such approach, known as the form factor program, aims at computing n-point functions of local fields in terms of so-called form factors, i.e. matrix elements of field operators in scattering states [56]. But despite many partial results known in the literature [4], in this approach one usually runs into the problem that the convergence of the appearing infinite series cannot be controlled because of the complicated form of local field operators [3]. Another approach is based on the operator-algebraic framework of quantum field theory and Tomita–Takesaki modular theory, and constructs the models in question by an indirect procedure involving auxiliary field operators with weakened localization properties. Instead of being sharply localized at points in space-time, these fields (“polarization-free generators” [9]) are localized only in infinite wedge-shaped regions (“wedges”). This last approach will be most convenient for our purposes as it is closely connected to the S-matrix and does not rely on series expansions with unknown convergence properties. Starting from a scattering function S and a mass m > 0, a solution to the inverse scattering problem has been rigorously constructed in this setting [52, 55, 37, 18, 40]. The main results of this analysis will be recalled in Sec. 2 in a manner adapted to scaling transformations. The resulting models meet all standard requirements of algebraic quantum field theory, and hence on abstract grounds, a well-defined scaling limit in the sense of Buchholz and Verch exists. In particular, the short distance regime is in principle completely described by the initially chosen S-matrix. We will not fully analyze the Buchholz–Verch limit here, but choose a simplified construction in the same spirit. 
For the limit theory, one has natural candidates: massless models with factorizing scattering. These have been described before in a thermodynamical context; see, e.g., [60, 29]. Here, however, we treat them as rigorously constructed quantum field theories on two-dimensional Minkowski space, given in terms of local algebras of observables. These limit theories are interesting in their own right as they provide non-trivial covariant deformations of free field theories (see also [41] for higher-dimensional generalizations), and still depend on the scattering function S one started with. Furthermore, as expected for a scaling limit [13], they are dilation covariant and, as it turns out, (extensions of) chiral nets. This distinguishes them from other massless deformations of quantum field theories that have recently been constructed in the algebraic framework, on two-dimensional Minkowski space [27] and Minkowski half-space [42, 43]. In this paper, we start to explore the relation between the scattering function defining a massive model of the class mentioned above, and the properties of the corresponding scaling limit. The first step consists of computing the behavior of the scattering function under scaling transformations, and to determine the short distance structure of the n-point functions of the wedge-local generators. This is done in Secs. 2 and 3,
respectively. As expected, the mass vanishes in the short distance limit, and we obtain a class of massless (local extensions of) chiral quantum field theories. They are presented in Sec. 4. As we shall explain, their dependence on S is twofold: On the one hand, S determines the decomposition of the two-dimensional massless generators into twisted or untwisted tensor products of chiral fields on the left and right light ray. On the other hand, the chiral components can be generated by massless chiral quantum fields which are localized on half-lines, similar to the massive situation. The commutation relations of these fields directly involve the scattering function S in a manner very similar to the two-dimensional models at finite scale, despite the difference in mass and space-time dimension. The chiral subtheories always transform covariantly under a representation of the translation-dilation-reflection group of the light ray. Making use of modular theory, we will show that on a subspace of the chiral Hilbert space, one can always extend this affine symmetry by a conformal rotation to the Möbius group (Sec. 5). This conformal subspace is directly related to observables localized in finite intervals on the light ray. But because our construction is based on halfline-local generator fields, such strictly localized observables are derived quantities here, and it is a non-trivial task to characterize them. We obtain two results in this direction: First, we show that for certain scattering functions, the local chiral observables are fixed points under an additional $\mathbb{Z}_2$-symmetry, which restricts the conformal subspace. Second, we investigate the models given by two simple example scattering functions in full detail in Sec. 6. In these examples, we find conformal nets with central charge $c = 1$ and $c = \frac{1}{2}$, respectively, in the limit. This analysis also exemplifies that the scaling limit of a Bosonic theory can be generated by the energy-momentum tensor of a Fermi field.
Section 7 contains our conclusions and an account of further work in progress. 2. Two-Dimensional Integrable Models In this section, we recall the structure of the quantum field theories we are interested in. At finite scale, these models describe a single species of scalar Bosons of mass m ≥ 0 on two-dimensional Minkowski space. Scattering processes of these particles are governed by a factorizing S-matrix [1, 3, 26], i.e. in each collision process the particle number and the momenta are conserved, and the n-particle S-matrix factorizes into a product of two-particle S-matrices. In this situation, the S-matrix is determined by a single function S, called the scattering function. Such a restricted form of the collision operator is typical for completely integrable models [34], which provide a rich class of examples for factorizing S-matrices. The family of model theories we consider is thus parametrized by the two data (m, S), where m is a mass parameter and S a function with a number of properties specified below. Before recalling the construction of these quantum field theories, we define the space of parameters (m, S) and investigate its scaling properties. We will first consider the case m > 0, and then obtain the massless case m = 0 in a suitable limit.
2.1. Scaling limits of scattering functions
The defining properties of a scattering function S can most conveniently be expressed when treating S as a function of the rapidity θ as the momentum space variable, which parametrizes the upper mass shell with mass m > 0 according to
$$p_m(\theta) := m \begin{pmatrix} \cosh\theta \\ \sinh\theta \end{pmatrix}, \qquad \theta \in \mathbb{R}. \tag{2.1}$$
Since Lorentz boosts are translations in the rapidity, the Lorentz invariant scattering function depends only on differences of rapidities. Writing
$$\mathrm{S}(a, b) := \{\zeta \in \mathbb{C} : a < \operatorname{Im}\zeta < b\} \tag{2.2}$$
for two real numbers a < b, and $\overline{\mathrm{S}(a, b)}$ for the closed strip, the family of all scattering functions and two important subfamilies are defined as follows.
Definition 2.1 (Scattering Functions).
(a) A scattering function is a bounded and continuous function $S : \overline{\mathrm{S}(0, \pi)} \to \mathbb{C}$ which is analytic in the interior of this strip and satisfies for θ ∈ R,
$$\overline{S(\theta)} = S(\theta)^{-1} = S(\theta + i\pi) = S(-\theta). \tag{2.3}$$
The family of all scattering functions is denoted $\mathcal{S}$.
(b) A scattering function $S \in \mathcal{S}$ is called regular if there exists κ > 0 such that S continues to a bounded analytic function in the strip $\mathrm{S}(-\kappa, \pi + \kappa)$. The subfamily of all regular scattering functions is denoted $\mathcal{S}_{\mathrm{reg}} \subset \mathcal{S}$.
(c) A regular scattering function $S \in \mathcal{S}_{\mathrm{reg}}$ is called a scattering function with limit if the two limits $\lim_{\theta\to\infty} S(\theta)$ and $\lim_{\theta\to-\infty} S(\theta)$ exist. The family of all scattering functions with limit is denoted $\mathcal{S}_{\lim} \subset \mathcal{S}_{\mathrm{reg}} \subset \mathcal{S}$.
Equations (2.3) express the unitarity, crossing symmetry, and hermitian analyticity of the factorizing S-matrix corresponding to S. For a discussion of these standard properties, we refer to the textbooks and reviews [1, 4, 35, 56, 26]. The regularity assumption in part (b) of Definition 2.1 comes from the fact that for each regular $S \in \mathcal{S}_{\mathrm{reg}}$, a corresponding quantum field theoretic model is known to exist [40], whereas for non-regular scattering functions, this is not known. Particular examples of regular scattering functions (with limit) are the constant functions $S_{\mathrm{free}}(\theta) = 1$ and $S_{\mathrm{Ising}}(\theta) = -1$, corresponding to the interaction-free theory and the Ising model, respectively, and the scattering function of the Sinh-Gordon model with coupling constant g ∈ R [2],
$$S_{\mathrm{ShG}}(\theta) := \frac{\sinh\theta - i\sin\dfrac{\pi g^2}{4\pi + g^2}}{\sinh\theta + i\sin\dfrac{\pi g^2}{4\pi + g^2}}. \tag{2.4}$$
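As a numerical sanity check of Definition 2.1, the following Python sketch evaluates the Sinh-Gordon scattering function (2.4) and compares the four expressions in (2.3) at a real rapidity. It assumes the reading of (2.3) in which the first term is the complex conjugate of S(θ); the function names and the sample values of θ and g are illustrative choices, not taken from the article.

```python
import cmath
import math


def s_sinh_gordon(zeta: complex, g: float) -> complex:
    """Sinh-Gordon scattering function (2.4) at a (complex) rapidity zeta."""
    a = math.sin(math.pi * g**2 / (4 * math.pi + g**2))
    return (cmath.sinh(zeta) - 1j * a) / (cmath.sinh(zeta) + 1j * a)


def max_deviation(theta: float, g: float) -> float:
    """Largest mutual deviation of the four expressions in (2.3) at real theta."""
    s = s_sinh_gordon(theta, g)
    candidates = [
        s.conjugate(),                              # complex conjugate of S(theta)
        1 / s,                                      # S(theta)^(-1)
        s_sinh_gordon(theta + 1j * math.pi, g),     # S(theta + i*pi)
        s_sinh_gordon(-theta, g),                   # S(-theta)
    ]
    return max(abs(c - candidates[0]) for c in candidates)


if __name__ == "__main__":
    print(max_deviation(0.7, 1.3))        # of the order of rounding errors
    print(s_sinh_gordon(30.0, 1.3))       # close to 1 for large rapidity
```

The second printed value illustrates why this example belongs to the subfamily with limit in part (c): for large |θ| the hyperbolic sine dominates and the quotient approaches 1.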
The additional assumption in part (c) of the above definition, concerning the existence of limits of scattering functions, is relevant in the context of scaling limits: If distances in Minkowski space are scaled according to x → λx, and Planck’s unit
of action is kept fixed, momenta have to be rescaled according to $p \to \lambda^{-1} p$. So rapidities scale like $\theta = \sinh^{-1}\frac{p}{m} \to \sinh^{-1}\frac{p}{\lambda m}$ and converge to ±∞ for λ → 0. Looking at the example of the Sinh-Gordon scattering function (2.4), where the coupling constant is dimensionless and therefore does not scale with λ, we see that the only dependence of $S(\theta_1 - \theta_2)$ on the scale λ is via the scale dependence of $\theta_1, \theta_2$. Hence for $S(\theta_1 - \theta_2)$ to have a scaling limit as λ → 0, we need to require the existence of the limits as in part (c). An explicit characterization of such functions is given in the following proposition.
Proposition 2.2 (Scattering Functions with Limits).
(a) The set $\mathcal{S}_{\lim}$ of scattering functions with limits consists precisely of the functions
$$S(\zeta) = \varepsilon \cdot \prod_{k=1}^{N} \frac{\sinh\zeta - \sinh b_k}{\sinh\zeta + \sinh b_k}, \qquad \zeta \in \overline{\mathrm{S}(0, \pi)}, \tag{2.5}$$
where ε = ±1, $N \in \mathbb{N}_0$, and $\{b_1, \ldots, b_N\}$ is a set of complex numbers in the strip $0 < \operatorname{Im} b_1, \ldots, \operatorname{Im} b_N \le \frac{\pi}{2}$, such that with each $b_k$ (counted according to multiplicity) also $-\overline{b_k}$ is contained in $\{b_1, \ldots, b_N\}$.
(b) For each $S \in \mathcal{S}_{\lim}$, the two limits $S(\infty) := \lim_{\theta\to\infty} S(\theta) = \lim_{\theta\to-\infty} S(\theta)$ coincide and are equal to ±1, i.e. $\mathcal{S}_{\lim}$ is the disjoint union of the sets
$$\mathcal{S}_{\lim}^{\pm} := \Big\{ S \in \mathcal{S}_{\lim} : \lim_{\theta\to\infty} S(\theta) = \lim_{\theta\to-\infty} S(\theta) = \pm 1 \Big\}. \tag{2.6}$$
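The product form (2.5) is straightforward to evaluate numerically. The sketch below, in Python, builds such a product from a list of zeros and checks the admissibility condition of part (a), read here as invariance of the set under $b \mapsto -\overline{b}$ (an assumption about the overlines, which the extracted text does not show). The helper names and sample zeros are hypothetical, and multiplicities are not tracked.

```python
import cmath
import math


def s_product(zeta, zeros, eps=1):
    """Evaluate the finite product (2.5) with sign eps and the given zeros b_k."""
    sh = cmath.sinh(zeta)
    value = complex(eps)
    for b in zeros:
        value *= (sh - cmath.sinh(b)) / (sh + cmath.sinh(b))
    return value


def is_admissible(zeros, tol=1e-12):
    """Check 0 < Im b <= pi/2 and closure of the set under b -> -conj(b).

    Multiplicities are ignored in this simplified check.
    """
    in_strip = all(0 < b.imag <= math.pi / 2 + tol for b in zeros)
    closed = all(
        any(abs(-b.conjugate() - c) < tol for c in zeros) for b in zeros
    )
    return in_strip and closed


if __name__ == "__main__":
    zeros = [0.5 + 0.8j, -0.5 + 0.8j, 1j * math.pi / 6]
    print(is_admissible(zeros))                 # True
    print(abs(s_product(0.3, zeros)))           # modulus 1 on the real line
    print(s_product(25.0, zeros, eps=-1))       # approaches eps = -1
```

Dropping one member of the pair `0.5 + 0.8j`, `-0.5 + 0.8j` makes `is_admissible` fail, and the resulting product no longer has unit modulus on the real line.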
Proof. (a) Each factor $s_{b_k} : \zeta \mapsto \pm(\sinh\zeta - \sinh b_k)(\sinh\zeta + \sinh b_k)^{-1}$ satisfies $s_{b_k}(-\zeta) = s_{b_k}(\zeta + i\pi) = s_{b_k}(\zeta)^{-1} = \overline{s_{-\overline{b_k}}(\zeta)}$ for ζ ∈ R. Given any sufficiently small δ > 0, the function $s_{b_k}$ is analytic and bounded in the strip $\mathrm{S}(-\operatorname{Im} b_k + \delta, \pi + \operatorname{Im} b_k - \delta) \supset \mathrm{S}(0, \pi)$. Because the product (2.5) is finite, it follows that S is analytic and bounded in the strip $\mathrm{S}(-\kappa, \pi + \kappa)$ for some κ > 0. Furthermore, the last two equations in (2.3) hold for S because they hold for each factor $s_{b_k}$. The first equation in (2.3) holds because of $s_{b_k}(\zeta)^{-1} = \overline{s_{-\overline{b_k}}(\zeta)}$ and the assumed invariance of $\{b_1, \ldots, b_N\}$ under $b_k \to -\overline{b_k}$. Hence each S of the form (2.5) is a regular scattering function. As θ → ±∞, we clearly have S(θ) → ε, which shows $S \in \mathcal{S}_{\lim}$.
Now we pick some arbitrary $S \in \mathcal{S}_{\lim}$ and show that it is of the form (2.5). As a regular scattering function, S is bounded and analytic in a strip $\mathrm{S}(-\kappa, \pi + \kappa)$ for some κ > 0, and since $S \in \mathcal{S}_{\lim}$, we have a limit value ε ∈ C such that S(θ) → ε as θ → ∞. These properties imply that S(θ + iλ) → ε as θ → ∞, uniformly in λ ∈ [0, π] [58, p. 170]. In view of S(θ + iπ) = S(−θ), we also have S(θ + iλ) → ε for θ → −∞. In particular, the two limits $\lim_{\theta\to\pm\infty} S(\theta)$ along the real line coincide. Since S has unit modulus on the real line (2.3), we have |ε| = 1, and because of the uniform limit S(ζ) → ε as Re(ζ) → ±∞, we find c > 0 such that $|\operatorname{Re}(\zeta_0)| \le c$ for all zeros $\zeta_0$ of S. Taking into account that S is continuous on the closed strip $\overline{\mathrm{S}(0, \pi)}$, and of modulus 1 on its boundary, we conclude that it has only finitely many zeros in $\mathrm{S}(0, \pi)$.
Let us denote by $b_1, \ldots, b_N$ those zeros of S whose imaginary parts λ satisfy $0 < \lambda \le \frac{\pi}{2}$. These zeros come in pairs $\{b_k, -\overline{b_k}\}$ because of (2.3), and there also exist corresponding zeros $i\pi - b_k$, $i\pi + \overline{b_k}$ in the upper half of the strip. Now consider the product
$$B(\zeta) := \varepsilon \cdot \prod_{k=1}^{N} \frac{\sinh\zeta - \sinh b_k}{\sinh\zeta + \sinh b_k},$$
which is a regular scattering function $B \in \mathcal{S}_{\mathrm{reg}}$, of the form specified in (2.5). Since B has precisely the same zeros as S in $\mathrm{S}(0, \pi)$, and B(θ + iλ) → ε for θ → ±∞, also $F := S \cdot B^{-1}$ belongs to $\mathcal{S}$. By construction, F has no zeros in $\mathrm{S}(0, \pi)$, and F(θ + iλ) converges to 1 for θ → ±∞, uniformly in 0 ≤ λ ≤ π. As F is continuous on $\overline{\mathrm{S}(0, \pi)}$ and of modulus 1 on the boundary of this strip, it is bounded from above and below, i.e. there exists K > 0 such that K < |F(ζ)| ≤ 1, $\zeta \in \overline{\mathrm{S}(0, \pi)}$. But any scattering function, and in particular F, can be meromorphically continued to $\mathrm{S}(-\pi, \pi)$ by the equations (2.3). In fact, this continuation is given by
$$F(-\zeta) = F(\zeta)^{-1}, \qquad \zeta \in \overline{\mathrm{S}(0, \pi)}, \tag{2.7}$$
and as F has no zeros in $\mathrm{S}(0, \pi)$, it is actually an analytic continuation for this special scattering function. In view of the boundedness of F on $\mathrm{S}(0, \pi)$, there also holds $|F(\zeta)| < K^{-1} < \infty$ for all $\zeta \in \mathrm{S}(-\pi, \pi)$. Taking ζ = −θ + iπ, θ ∈ R, Eqs. (2.7) and (2.3) give
$$F(\theta - i\pi) = F(i\pi - \theta)^{-1} = F(\theta)^{-1} = F(\theta + i\pi), \qquad \theta \in \mathbb{R},$$
i.e. F continues to a (2πi)-periodic, entire function which in view of the above argument is bounded and hence constant. Thus $F(\theta) = \lim_{\theta\to\infty} F(\theta) = 1$, and we arrive at the claimed representation (2.5) for S, namely S = F · B = B.
(b) The identity of the limits $\lim_{\theta\to\pm\infty} S(\theta)$ has been shown above, and can also be seen directly from (2.5). Also the fact that these limits can take only the values ±1 is clear from (2.5).
As a preparation for the scaling limit of quantum fields, we now compute which effect a space-time scaling x → λx has on a scattering function with limit. As usual, such a limit involves taking the mass to zero. To keep track of the mass scale, we will use momentum variables with explicit mass dependence instead of the rapidity. For spatial momenta $p = m \sinh\theta$, $q = m \sinh\theta'$, we have
$$\theta - \theta' = \sinh^{-1}\frac{p}{m} - \sinh^{-1}\frac{q}{m} = \sinh^{-1}\left(\frac{p\,\omega_q^m - q\,\omega_p^m}{m^2}\right),$$
with the energies $\omega_p^m := (p^2 + m^2)^{1/2}$, $\omega_q^m := (q^2 + m^2)^{1/2}$. Corresponding to any m > 0, $S \in \mathcal{S}_{\lim}$, we therefore introduce the function $S_m : \mathbb{R}^2 \to \mathbb{C}$,
$$S_m(p, q) := S\left(\sinh^{-1}\left(\frac{p\,\omega_q^m - q\,\omega_p^m}{m^2}\right)\right), \tag{2.8}$$
which shows the mass dependence explicitly. Clearly, $S_m$ inherits many properties from S, see Eq. (2.3). For example, one has the symmetry and scaling relations, for p, q ∈ R,
$$S_m(q, p) = S_m(p, q)^{-1} = \overline{S_m(p, q)}, \tag{2.9}$$
$$S_m(\lambda^{-1} p, \lambda^{-1} q) = S_{\lambda m}(p, q), \qquad \lambda > 0. \tag{2.10}$$
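The relations (2.8) to (2.10) can be checked numerically. The following Python sketch is a minimal illustration under stated assumptions: the sample scattering function `s_example` is the single-factor instance of (2.5) with $\sinh b = i/2$, the last term in (2.9) is read as a complex conjugate, and all names and numerical values are chosen for illustration only.

```python
import cmath
import math


def omega(p: float, m: float) -> float:
    """Energy omega_p^m = sqrt(p^2 + m^2)."""
    return math.hypot(p, m)


def rapidity_difference(p: float, q: float, m: float) -> float:
    """Argument of S in (2.8): asinh((p*omega_q - q*omega_p) / m^2)."""
    return math.asinh((p * omega(q, m) - q * omega(p, m)) / m**2)


def s_example(theta: float) -> complex:
    """Illustrative scattering function of the form (2.5) with sinh(b) = i/2."""
    return (cmath.sinh(theta) - 0.5j) / (cmath.sinh(theta) + 0.5j)


def s_m(p: float, q: float, m: float, S=s_example) -> complex:
    """Mass-dependent scattering function S_m(p, q) of (2.8)."""
    return S(rapidity_difference(p, q, m))


if __name__ == "__main__":
    p, q, m, lam = 1.5, -0.7, 2.0, 0.3
    # addition theorem behind (2.8)
    print(rapidity_difference(p, q, m) - (math.asinh(p / m) - math.asinh(q / m)))
    # symmetry relation (2.9)
    print(s_m(q, p, m) - 1 / s_m(p, q, m))
    # scaling relation (2.10)
    print(s_m(p / lam, q / lam, m) - s_m(p, q, lam * m))
```

All three printed differences should vanish up to rounding. The scaling relation holds because $\omega^m_{p/\lambda} = \lambda^{-1}\omega^{\lambda m}_p$, so rescaling both momenta by $\lambda^{-1}$ is equivalent to replacing m by λm.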
The mass zero limit $S_0$ of $S_m$ can be computed in a straightforward manner.
Lemma 2.3. Let $S \in \mathcal{S}_{\lim}^{\pm}$, and m > 0. Then, for p, q ∈ R,
$$S_0(p, q) := \lim_{\lambda\to 0} S_{\lambda m}(p, q) = \begin{cases} S(\log p - \log q), & p > 0,\ q > 0; \\ S(\log(-q) - \log(-p)), & p < 0,\ q < 0; \\ S(0), & p = q = 0; \\ S(\infty), & \text{otherwise}. \end{cases} \tag{2.11}$$
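Proof. For any p, q ∈ R, we have
$$\lim_{\lambda\to 0}\big(p\,\omega_q^{\lambda m} - q\,\omega_p^{\lambda m}\big) = p|q| - q|p| = \begin{cases} \pm 2pq, & p \cdot q < 0; \\ 0, & p \cdot q \ge 0. \end{cases} \tag{2.12}$$
This implies $(\lambda m)^{-2}(p\,\omega_q^{\lambda m} - q\,\omega_p^{\lambda m}) \to \pm\infty$ for λ → 0 if p · q < 0, and since $S \in \mathcal{S}_{\lim}^{\pm}$, we get $S_{\lambda m}(p, q) \to S(\infty)$ for this configuration of momenta. In the case p · q ≥ 0, we use l'Hospital's rule to compute the limit,
$$\lim_{\lambda\to 0} \frac{p\,\omega_q^{\lambda m} - q\,\omega_p^{\lambda m}}{\lambda^2 m^2} = \lim_{\lambda\to 0} \frac{1}{2\lambda m^2}\left(\frac{p\lambda m^2}{\omega_q^{\lambda m}} - \frac{q\lambda m^2}{\omega_p^{\lambda m}}\right) = \frac{1}{2}\lim_{\lambda\to 0}\left(\frac{p}{\omega_q^{\lambda m}} - \frac{q}{\omega_p^{\lambda m}}\right) = \begin{cases} 0, & p = q = 0; \\ \varepsilon(p) \cdot \infty, & p \ne 0,\ q = 0; \\ -\varepsilon(q) \cdot \infty, & p = 0,\ q \ne 0; \\ \dfrac{1}{2}\left(\dfrac{p}{|q|} - \dfrac{q}{|p|}\right), & p \cdot q > 0. \end{cases}$$
Here ε(p), ε(q) denote the signs of p, q, respectively. Evaluating these expressions in $S \circ \sinh^{-1}$ (2.8) gives the claimed result.
Note that the limit $S_0$ is not independent of the scattering function S; in fact, S can be completely recovered from $S_0$ (2.11). This can be seen as an indication that the short distance behavior of the (m, S)-model will depend on S (but not on m). The limit behavior of the scattering functions will be used in the calculation of the scaling limit of the field theory models discussed in the next section.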
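The case distinction of Lemma 2.3 can be illustrated numerically by comparing $S_{\lambda m}$ at a very small mass with the closed-form limit. The Python sketch below is only a plausibility check, not part of the proof. It assumes the sample function `s_example` (a single factor of the form (2.5) with $\sinh b = i/2$, so that $S(\infty) = 1$ and $S(0) = -1$), and it evaluates the rapidity difference as $\sinh^{-1}(p/\mu) - \sinh^{-1}(q/\mu)$, which equals the argument in (2.8) but avoids cancellation at small mass μ.

```python
import cmath
import math


def s_example(theta: float) -> complex:
    """Illustrative scattering function with S(infinity) = 1 and S(0) = -1."""
    return (cmath.sinh(theta) - 0.5j) / (cmath.sinh(theta) + 0.5j)


S_INFINITY = 1.0  # limit of s_example at theta -> +/- infinity


def s_massive(p: float, q: float, mass: float) -> complex:
    """S_mass(p, q), using asinh(p/mass) - asinh(q/mass) as rapidity difference."""
    return s_example(math.asinh(p / mass) - math.asinh(q / mass))


def s_massless(p: float, q: float) -> complex:
    """Closed-form limit S_0(p, q) following the four cases of (2.11)."""
    if p > 0 and q > 0:
        return s_example(math.log(p) - math.log(q))
    if p < 0 and q < 0:
        return s_example(math.log(-q) - math.log(-p))
    if p == 0 and q == 0:
        return s_example(0.0)
    return complex(S_INFINITY)


if __name__ == "__main__":
    tiny_mass = 1e-8
    for p, q in [(2.0, 0.5), (-2.0, -0.5), (2.0, -0.5), (2.0, 0.0), (0.0, 0.0)]:
        print((p, q), abs(s_massive(p, q, tiny_mass) - s_massless(p, q)))
```

For this sample function the values at (0, 0) and at nearby points with p · q < 0 differ in sign, which is a concrete instance of a discontinuity of $S_0$ at the origin.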
2.2. Massive and massless models with factorizing S-matrices We now turn to the description of the family of quantum field theoretic models we are interested in. Each model in this family is specified by two parameters, a mass value m ≥ 0 and a scattering function S ∈ Slim with limit. Whereas the most frequently used setting for the discussion of such models is the form factor program [4], their rigorous construction was accomplished only recently with the help of operator-algebraic techniques. The initial idea of this program is due to Schroer [52, 53] and consists in constructing certain auxiliary field operators depending on (m, S). Despite their weaker than usual localization, these fields can be used to define a strictly local, covariant quantum field theory in an indirect manner. The details of this construction, and the passage to algebras of strictly localized observables, was carried out in [37, 18, 39, 40]. In particular, it has been shown that for any choice of (m, S), m > 0, S ∈ Sreg , there exists a corresponding quantum field theory with the factorizing S-matrix given by S as its collision operator. In the following, we will outline the structure of these models using a momentum space formulation. For details and proofs, we refer to the articles cited above. Fixing arbitrary S ∈ Slim and m ≥ 0, the function Sm is defined via (2.8) for m > 0 and via the limit (2.11) for m = 0. Note that the zero mass function S0 (2.11) can be discontinuous at (0, 0) if the signs of S(0) and S(∞) are different, but still satisfies the symmetry relations (2.9). Most of the objects introduced below depend on the choice of S, but since we will work with a fixed scattering function in the following, we do not reflect this dependence in our notation. The mass dependence, on the other hand, will always be written down explicitly. Having fixed (m, S), we first describe the Hilbert space on which the (m, S)model is constructed. 
Starting from the single particle space $\mathcal{H}_{m,1} := L^2(\mathbb{R}, dp/\omega_p^m)$, the n-particle spaces $\mathcal{H}_{m,n}$, n > 1, are defined as certain $S_m$-symmetrized subspaces of the n-fold tensor product $\mathcal{H}_{m,1}^{\otimes n}$. To this end, one introduces unitaries $D_n(\tau_j)$, j = 1, . . . , n − 1, on $\mathcal{H}_{m,1}^{\otimes n}$,
$$(D_n(\tau_j)\Psi_n)(p_1, \ldots, p_n) := S_m(p_{j+1}, p_j) \cdot \Psi_n(p_1, \ldots, p_{j+1}, p_j, \ldots, p_n). \tag{2.13}$$
Using (2.9), one checks that these operators generate a unitary representation $D_n$ of the group $\mathfrak{S}_n$ of permutations of n letters which represents the transposition exchanging j and j + 1 by $D_n(\tau_j)$. The n-particle space $\mathcal{H}_{m,n}$ of the (m, S)-model is defined as the subspace of $\mathcal{H}_{m,1}^{\otimes n}$ of vectors invariant under this representation. Explicitly, the orthogonal projection $P_n : \mathcal{H}_{m,1}^{\otimes n} \to \mathcal{H}_{m,n}$ has the form
$$(P_n\Psi_n)(p_1, \ldots, p_n) := \frac{1}{n!} \sum_{\pi \in \mathfrak{S}_n} S_m^{\pi}(p_1, \ldots, p_n) \cdot \Psi_n(p_{\pi(1)}, \ldots, p_{\pi(n)}), \tag{2.14}$$
$$S_m^{\pi}(p_1, \ldots, p_n) := \prod_{\substack{1 \le l < r \le n \\ \pi(l) > \pi(r)}} S_m(p_{\pi(l)}, p_{\pi(r)}). \tag{2.15}$$
Setting $\mathcal{H}_{m,0} := \mathbb{C}$, the $S_m$-symmetric Fock space over $\mathcal{H}_{m,1}$ is
$$\mathcal{H}_m := \bigoplus_{n=0}^{\infty} \mathcal{H}_{m,n}, \tag{2.16}$$
i.e. its vectors are sequences $\Psi = (\Psi_0, \Psi_1, \Psi_2, \ldots)$, with $\Psi_0 \in \mathbb{C}$, $\Psi_n \in \mathcal{H}_{m,n}$, n ≥ 1, such that $\|\Psi\|^2 := |\Psi_0|^2 + \sum_{n=1}^{\infty} \int \frac{dp_1}{\omega_1^m} \cdots \frac{dp_n}{\omega_n^m}\, |\Psi_n(p_1, \ldots, p_n)|^2 < \infty$. Here and in the following we use the shorthand notation $\omega_k^m = \omega_{p_k}^m = (p_k^2 + m^2)^{1/2}$.
On $\mathcal{H}_m$, there exists a strongly continuous (anti-)unitary positive energy representation $U_m$ of the full Poincaré group $\mathcal{P}$. Denoting by $(x, \theta) \in \mathcal{P}_+^{\uparrow}$ proper orthochronous transformations consisting of a boost with rapidity θ and a subsequent space-time translation along $x = (x_0, x_1) \in \mathbb{R}^2$, we set
$$(U_m(x, \theta)\Psi)_n(p_1, \ldots, p_n) := e^{i\sum_{j=1}^{n}(\omega_j^m x_0 - p_j x_1)} \cdot \Psi_n(\theta p_1, \ldots, \theta p_n), \tag{2.17}$$
where $\theta p_j := \cosh\theta \cdot p_j - \sinh\theta \cdot \omega_j^m$, j = 1, . . . , n. The space-, time-, and space-time reflections $j_1(x_0, x_1) := (x_0, -x_1)$, $j_0(x_0, x_1) := (-x_0, x_1)$ and $j := j_0 j_1$ are represented as
$$(U_m(j_1)\Psi)_n(p_1, \ldots, p_n) := \Psi_n(-p_n, \ldots, -p_1), \tag{2.18}$$
$$(U_m(j_0)\Psi)_n(p_1, \ldots, p_n) := \overline{\Psi_n(-p_1, \ldots, -p_n)}, \tag{2.19}$$
$$(U_m(j)\Psi)_n(p_1, \ldots, p_n) := \overline{\Psi_n(p_n, \ldots, p_1)}. \tag{2.20}$$
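The action $\theta p := \cosh\theta \cdot p - \sinh\theta \cdot \omega_p^m$ of a boost on the spatial momentum, as used in (2.17), can be checked against the rapidity parametrization (2.1): it shifts the rapidity of p by −θ. The short Python sketch below verifies this and the group law numerically; the function name and the sample values are illustrative assumptions.

```python
import math


def boost_momentum(p: float, theta: float, m: float) -> float:
    """Boosted spatial momentum cosh(theta)*p - sinh(theta)*omega_p^m."""
    return math.cosh(theta) * p - math.sinh(theta) * math.hypot(p, m)


if __name__ == "__main__":
    m, eta, theta1, theta2 = 1.3, 0.4, 0.9, -0.25
    p = m * math.sinh(eta)                      # momentum with rapidity eta
    # rapidity is shifted by -theta
    print(boost_momentum(p, theta1, m), m * math.sinh(eta - theta1))
    # two successive boosts compose additively in the rapidity
    print(boost_momentum(boost_momentum(p, theta1, m), theta2, m),
          boost_momentum(p, theta1 + theta2, m))
    # massless case: boosts act as dilations p -> exp(-/+ theta) * p
    print(boost_momentum(2.0, theta1, 0.0), 2.0 * math.exp(-theta1))
    print(boost_momentum(-2.0, theta1, 0.0), -2.0 * math.exp(theta1))
```

In the massless case the two signs of p are rescaled by inverse factors, which fits the later statement that the limit theories are dilation covariant on the left and right light ray.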
Clearly, all vectors in $\mathcal{H}_{m,1}$ are eigenvectors of the mass operator with eigenvalue m, and the vector $\Omega_m := 1 \oplus 0 \oplus 0 \oplus \cdots \in \mathcal{H}_m$, invariant under $U_m$, represents the vacuum state. The finite particle number subspace of $\mathcal{H}_m$ is denoted $\mathcal{D}_m$.
On $\mathcal{D}_m$, there act creation and annihilation operators $z_m^{\#}(\chi)$, $\chi \in \mathcal{H}_{m,1}$, defined as
$$(z_m(\chi)\Psi)_n(p_1, \ldots, p_n) := \sqrt{n+1} \int \frac{dq}{\omega_q^m}\, \chi(q)\, \Psi_{n+1}(q, p_1, \ldots, p_n), \tag{2.21}$$
$$z_m^{\dagger}(\chi) := z_m(\overline{\chi})^{*} \quad\Leftrightarrow\quad (z_m^{\dagger}(\chi)\Psi)_n = \sqrt{n}\, P_n(\chi \otimes \Psi_{n-1}). \tag{2.22}$$
Because of the $S_m$-symmetrization properties of the vectors in $\mathcal{H}_m$, the distributional kernels $z_m^{\#}(p)$, p ∈ R, related to the above operators by the formal integrals $z_m^{\#}(\chi) = \int \frac{dp}{\omega_p^m}\, \chi(p)\, z_m^{\#}(p)$, satisfy the exchange relations of the Zamolodchikov–Faddeev algebra [59, 28],
$$z_m(p)\, z_m(q) = S_m(p, q)\, z_m(q)\, z_m(p), \tag{2.23}$$
$$z_m^{\dagger}(p)\, z_m^{\dagger}(q) = S_m(p, q)\, z_m^{\dagger}(q)\, z_m^{\dagger}(p), \tag{2.24}$$
$$z_m(p)\, z_m^{\dagger}(q) = S_m(q, p)\, z_m^{\dagger}(q)\, z_m(p) + \omega_p^m\, \delta(p - q) \cdot 1_{\mathcal{H}_m}. \tag{2.25}$$
Having described the Hilbert space of the (m, S)-model, we now construct field operators on it, and first introduce the necessary test functions.^a
^a We will use the symbol $\mathscr{S}(\mathbb{R}^n)$ for the Schwartz space on $\mathbb{R}^n$. Given some set $O \subset \mathbb{R}^n$, we also write $\mathscr{S}(O) := \{f \in \mathscr{S}(\mathbb{R}^n) : \operatorname{supp} f \subset O\}$ for its subspace supported in O.
For $f \in \mathscr{S}(\mathbb{R}^2)$ we write
$$f^{m\pm}(p) := \frac{1}{2\pi} \int d^2x\, f(x)\, e^{\pm i(\omega_p^m, p) \cdot x} \tag{2.26}$$
for the restrictions of the Fourier transform of f to the upper and lower mass shell of mass m ≥ 0. For m > 0, we have $f^{m\pm} \in L^2(\mathbb{R}, dp/\omega_p^m)$, and can therefore consider $f^{m\pm} \in \mathcal{H}_{m,1}$ as a single particle vector. For m = 0, however, the measure $dp/\omega_p^0 = dp/|p|$ is divergent at p = 0, and therefore we can claim $f^{0\pm} \in \mathcal{H}_{0,1}$ only if $f^{0\pm}(0) = 0$, i.e. if f is the derivative (with respect to $x_0$ or $x_1$) of another test function. Bearing this remark in mind, we define a field operator $\phi_m$ as
$$\phi_m(f) := z_m^{\dagger}(f^{m+}) + z_m(f^{m-}). \tag{2.27}$$
For general S, this operator is unbounded, but always contains Dm in its domain and leaves this subspace invariant. Furthermore, one can show that φm (f ) is essentially self-adjoint for real-valued f . Regarding its field-theoretical properties, the field φm is a solution of the Klein-Gordon equation with mass m, has the Reeh–Schlieder property, and transforms covariantly under proper orthochronous Poincar´e transformations, Um (x, θ)φm (f )Um (x, θ)−1 = φm (fθ,x ),
fθ,x (y) = f (Λ−1 θ (y − x)).
(2.28)
Here $\Lambda_\theta = \begin{pmatrix} \operatorname{ch}\theta & \operatorname{sh}\theta \\ \operatorname{sh}\theta & \operatorname{ch}\theta \end{pmatrix}$ denotes the Lorentz boost with rapidity $\theta$. For positive mass, these properties have been established in [37]. For $m = 0$, the proof carries over without changes if restricting to derivative test functions, i.e. if $f \in \mathscr{S}(\mathbb{R}^2)$ is assumed to be of the form $f(x) = \partial g(x)/\partial x_k$, $g \in \mathscr{S}(\mathbb{R}^2)$, $k = 0, 1$.

Regarding locality, we first note that in the trivial case $S = 1$, the field $\phi_m$ coincides with the free scalar field of mass $m$, which is of course point-local. For $S \neq 1$, however, $\phi_m(x)$ is not localized at the space-time point $x \in \mathbb{R}^2$, i.e. in general $[\phi_m(x), \phi_m(x')] \neq 0$ for spacelike separated $x, x' \in \mathbb{R}^2$. Moreover, the covariance property (2.28) does not hold for the space-time reflection $U_m(j)$ (2.20) if $S \neq 1$, i.e. the field
\[ \phi'_m(f) := U_m(j)\,\phi_m(f^j)\,U_m(j)^{-1}, \qquad f^j(x) := \overline{f(-x)}, \tag{2.29} \]
is different from $\phi_m$ in this case. Nonetheless, $\phi'_m$ shares many properties with $\phi_m$, such as the domain and essential self-adjointness and the covariant transformation behavior with respect to proper orthochronous Poincaré transformations (2.28); $\phi'_m$ is also a solution of the Klein–Gordon equation with the Reeh–Schlieder property. For the construction of a local quantum field theory with scattering function $S \neq 1$, one has to make use of both fields, $\phi_m$ and $\phi'_m$, and exploit their relative localization properties. For the formulation of this relative localization, we first recall that the right wedge is the causally complete region $W_R := \{x \in \mathbb{R}^2 : x_1 > |x_0|\}$, and its causal complement is
\[ W_R' = -W_R =: W_L, \quad \text{the left wedge.} \tag{2.30} \]
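Given $m > 0$ and $S \in \mathcal{S}_{\mathrm{reg}}$, it has been shown in [37] that the two fields $\phi_m$, $\phi'_m$ are relatively wedge-local to each other in the sense that
\[ [\phi_m(f), \phi'_m(g)]\Psi = 0, \qquad \operatorname{supp} f \subset W_L,\ \operatorname{supp} g \subset W_R,\ f, g \in \mathscr{S}(\mathbb{R}^2),\ \Psi \in \mathcal{D}_m. \tag{2.31} \]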
The proof of this fact relies on the analytic properties of $S_m$. In particular, the apparent asymmetry of the above commutation relations in $\phi$ and $\phi'$ derives from the fact that the scattering function $S$ associated with $\phi$ is analytic in the “upper” strip $S(0, \pi)$, whereas $S^{-1}$, associated with $\phi'$, is analytic in the “lower” strip $S(-\pi, 0)$. As we saw in Lemma 2.3, $S_0$ can even be discontinuous, and therefore one cannot directly employ the analyticity arguments in the case $m = 0$. However, using a splitting in chiral components, we will see in Sec. 4.4 that (2.31) is nonetheless still valid in the massless situation.

Having collected sufficient information about the auxiliary fields $\phi_m$, $\phi'_m$, one can pass to an operator-algebraic formulation and consider the von Neumann algebras generated by them,
\[ \mathcal{M}_m := \{ e^{i\phi'_m(f)} : f \in \mathscr{S}(W_R)\ \text{real} \}'', \tag{2.32} \]
\[ \widetilde{\mathcal{M}}_m := \{ e^{i\phi_m(f)} : f \in \mathscr{S}(W_L)\ \text{real} \}''. \tag{2.33} \]
Using the relative localization and Reeh–Schlieder property of the fields $\phi_m$, $\phi'_m$, one can show that $\mathcal{M}_m$ and $\widetilde{\mathcal{M}}_m$ commute, and that the vacuum vector $\Omega_m$ is cyclic and separating for both of them. The modular data of these algebras act geometrically as expected from the Bisognano–Wichmann theorem [6]. In particular, the modular conjugation $J$ of $(\mathcal{M}_m, \Omega_m)$ coincides with the space-time reflection $U_m(j)$ (2.20), and with this information, it is easy to see that $\mathcal{M}_m$ and $\widetilde{\mathcal{M}}_m$ are actually commutants of each other, $\mathcal{M}_m' = \widetilde{\mathcal{M}}_m$ [18]. Taking into account the transformation properties of the field $\phi'_m$, it also follows that
\[ U_m(x,\theta)\,\mathcal{M}_m\,U_m(x,\theta)^{-1} \subset \mathcal{M}_m, \qquad x \in W_R,\ \theta \in \mathbb{R}. \]
In view of these properties, one can consistently define von Neumann algebras of observables localized in double cones (intersections of two opposite wedges). For $y - x \in W_R$, one defines $O_{xy} := (W_R + x) \cap (W_L + y)$ and
\[ \mathcal{A}_m(O_{xy}) := U_m(x,0)\,\mathcal{M}_m\,U_m(x,0)^{-1} \cap U_m(y,0)\,\mathcal{M}_m'\,U_m(y,0)^{-1}. \tag{2.34} \]
Algebras associated to arbitrary regions can then be defined by additivity. The assignment $O \mapsto \mathcal{A}_m(O)$ of space-time regions in $\mathbb{R}^2$ to observable algebras in $\mathcal{B}(\mathcal{H}_m)$ is the definition of the $(m, S)$-model in the framework of algebraic quantum field theory [33]. Its main properties are summarized in the following theorem.

Theorem 2.4. Let $m > 0$ and $S \in \mathcal{S}_{\mathrm{reg}}$. Then the map $O \mapsto \mathcal{A}_m(O)$ of double cones in $\mathbb{R}^2$ to von Neumann algebras in $\mathcal{B}(\mathcal{H}_m)$ has the following properties:
(a) Isotony: $\mathcal{A}_m(O_1) \subset \mathcal{A}_m(O_2)$ for double cones $O_1 \subset O_2$.
(b) Locality: $\mathcal{A}_m(O_1) \subset \mathcal{A}_m(O_2)'$ for double cones $O_1 \subset O_2'$.
(c) Covariance: $U_m(g)\mathcal{A}_m(O)U_m(g)^{-1} = \mathcal{A}_m(gO)$ for each Poincaré transformation $g \in \mathcal{P}$ and each double cone $O$.
(d) Reeh–Schlieder property: If $S(0) = 1$, there exists $r_0 > 0$ such that for all double cones $O$ which are Poincaré equivalent to $W_R \cap (W_L + (0, r))$ with some $r > r_0$, there holds $\overline{\mathcal{A}_m(O)\Omega_m} = \mathcal{H}_m$. If $S(0) = -1$, this cyclicity holds without restriction on the size of $O$.
(e) Additivity: $\mathcal{M}_m$ coincides with the smallest von Neumann algebra containing $\mathcal{A}_m(O)$ for all double cones $O \subset W_R$.
(f) Interaction: The collision operator of the quantum field theory defined by the algebras (2.34) is the factorizing S-matrix with scattering function $S$.
Statements (a)–(c) also hold if $m = 0$ and $S \in \mathcal{S}_{\mathrm{lim}}$.

The above list shows that the elements of the algebra $\mathcal{A}_m(O)$ (2.34) can consistently be interpreted as the observables localized in $O$ of a local, covariant quantum field theory complying with all standard assumptions. Furthermore, the last item shows that the net $\mathcal{A}_m$ so constructed provides a solution to the inverse scattering problem for the factorizing S-matrix given by $S$. Statements (d)–(f) of Theorem 2.4 are only known to hold in the massive case, since an important tool for their proof, the split property for wedges [47], and the closely related modular nuclearity condition [17], is not satisfied in the massless case. Thus these properties might or might not be valid in the mass zero limit. In Sec. 6, we will see examples of scattering functions $S \in \mathcal{S}_{\mathrm{lim}}$ for both possibilities.

3. Scaling Limits of Massive Models

As a quantum field theory in the sense of Haag–Kastler [33], the models given by the nets $\mathcal{A}_m$ have a well-defined scaling limit theory [22]. However, for generic scattering function, the local observables of these models are given in a quite indirect manner as elements of an intersection of two wedge algebras.
On the other hand, field operators localized in wedges are explicitly known, so that it is not difficult to calculate their behavior under scaling transformations. To get an idea about the algebraic short distance limit of the $(m, S)$-models, $m > 0$, we can proceed in the following way, inspired by the results in [12] about the behavior of quantum fields under scaling. We consider rescaled wedge-local field operators of the form
\[ N_\lambda\, \phi_m(\lambda x) \tag{3.1} \]
and let $\lambda \to 0$. The constants $N_\lambda$ have to be chosen in such a way that the vacuum expectation values of these rescaled fields do not scale to zero or diverge, but approach a finite limit.
The effect of the space-time rescaling $x \mapsto \lambda x$ is easily calculated: Smearing the scaled field with a test function $f \in \mathscr{S}(\mathbb{R}^2)$ amounts to evaluating $\phi_m$ on a scaled test function $f_\lambda$,
\[ f_\lambda(x) := \lambda^{-2} f(\lambda^{-1} x), \qquad \lambda > 0,\ x \in \mathbb{R}^2. \tag{3.2} \]
The mass shell restrictions (2.26) of the Fourier transforms of such scaled functions are given by scaling the mass and the momentum,
\[ f_\lambda^{m\pm}(p) = f^{\lambda m\pm}(\lambda p). \tag{3.3} \]
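The relation (3.3) can be checked directly from (2.26) and (3.2). The following short computation is a sketch under the assumption (not restated in this passage) that $\omega_p^m = \sqrt{p^2 + m^2}$ denotes the usual relativistic energy; the substitution variable $y$ is introduced only for this illustration.

```latex
\begin{align*}
f_\lambda^{m\pm}(p)
  &= \frac{1}{2\pi} \int d^2x\, \lambda^{-2} f(\lambda^{-1}x)\, e^{\pm i(\omega_p^m,\,p)\cdot x}
     && \text{by (2.26), (3.2)} \\
  &= \frac{1}{2\pi} \int d^2y\, f(y)\, e^{\pm i(\lambda\omega_p^m,\,\lambda p)\cdot y}
     && x = \lambda y,\ d^2x = \lambda^2\, d^2y \\
  &= f^{\lambda m\pm}(\lambda p),
     && \lambda\,\omega_p^m = \sqrt{(\lambda p)^2 + (\lambda m)^2} = \omega_{\lambda p}^{\lambda m}.
\end{align*}
```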
As in the analysis of the free field [23], two different choices for the multiplicative renormalization $N_\lambda$ are possible, namely $N_\lambda = 1$ and $N_\lambda = |\ln \lambda|^{-1/2}$. The latter choice corresponds to an anomalous scaling of the field $\phi_m(f)$ when $f^{0\pm}(0) \neq 0$, which in turn is due to the infrared divergence of the $n$-point functions of the field in the massless limit. As in the case of free fields, it can be expected that it gives rise to an abelian tensor factor in the scaling limit algebra, at least if $S(0) = 1$. We will however not investigate this possibility any further here. Choosing therefore $N_\lambda = 1$, we now consider the $n$-point functions of the rescaled field
\[ W_m^{n,\lambda}(f_1, \ldots, f_n) := \langle \Omega_m, \phi_m(f_{1,\lambda}) \cdots \phi_m(f_{n,\lambda})\, \Omega_m \rangle, \tag{3.4} \]
and study their limit as $\lambda \to 0$.

Proposition 3.1. Let $m > 0$, $S \in \mathcal{S}_{\mathrm{lim}}$ and $f_1, \ldots, f_n \in \mathscr{S}(\mathbb{R}^2)$ with $f_j^{0\pm}(0) = 0$, $j = 1, \ldots, n$. Then
\[ \lim_{\lambda \to 0} W_m^{n,\lambda}(f_1, \ldots, f_n) = W_0^{n,1}(f_1, \ldots, f_n). \tag{3.5} \]
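An analogous statement holds for the expectation values of the fields $\phi'_m$, $\phi'_0$.

Proof. Both fields, $\phi_m$ and $\phi_0$, are defined as sums of certain creation and annihilation operators, which change the particle number by $\pm 1$. Hence vacuum expectation values of products of an odd number of field operators vanish, i.e. the statement holds trivially if $n$ is odd. We may therefore assume that $n = 2k$ is even, and first consider the vacuum expectation value of a particularly ordered product of creation and annihilation operators. Using (2.21), (2.22) and (2.14), we compute
\begin{align*}
&\langle \Omega_m, z_m(f_{1,\lambda}^{m-}) \cdots z_m(f_{k,\lambda}^{m-})\, z_m^\dagger(f_{k+1,\lambda}^{m+}) \cdots z_m^\dagger(f_{n,\lambda}^{m+})\, \Omega_m \rangle \\
&\quad = \sum_{\pi \in S_k} \langle \overline{f_{k,\lambda}^{m-}} \otimes \cdots \otimes \overline{f_{1,\lambda}^{m-}},\ D_k(\pi)\,(f_{k+1,\lambda}^{m+} \otimes \cdots \otimes f_{n,\lambda}^{m+}) \rangle \\
&\quad = \sum_{\pi \in S_k} \int \frac{dp_1}{\omega_1^m} \cdots \frac{dp_k}{\omega_k^m} \prod_{j=1}^{k} \left( f_{k-j+1}^{\lambda m-}(\lambda p_j)\, f_{k+j}^{\lambda m+}(\lambda p_{\pi(j)}) \right) \\
&\qquad \times \prod_{\substack{1 \le l < r \le k \\ \pi(l) > \pi(r)}} S_m(p_{\pi(l)}, p_{\pi(r)}), \tag{3.6}
\end{align*}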
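where we used the scaling relation (3.3) in the last line. Taking into account the scaling relation (2.10) for $S_m$ and $dp/\omega_p^m = d(\lambda p)/\omega_{\lambda p}^{\lambda m}$, the change of variables $p_j \to \lambda p_j$ yields
\begin{align*}
&\langle \Omega_m, z_m(f_{1,\lambda}^{m-}) \cdots z_m(f_{k,\lambda}^{m-})\, z_m^\dagger(f_{k+1,\lambda}^{m+}) \cdots z_m^\dagger(f_{n,\lambda}^{m+})\, \Omega_m \rangle \\
&\quad = \sum_{\pi \in S_k} \int \frac{dp_1}{\omega_1^{\lambda m}} \cdots \frac{dp_k}{\omega_k^{\lambda m}} \prod_{j=1}^{k} \left( f_{k-j+1}^{\lambda m-}(p_j)\, f_{k+j}^{\lambda m+}(p_{\pi(j)}) \right) \\
&\qquad \times \prod_{\substack{1 \le l < r \le k \\ \pi(l) > \pi(r)}} S_{\lambda m}(p_{\pi(l)}, p_{\pi(r)}). \tag{3.7}
\end{align*}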
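The change of variables leading to (3.7) rests on the scale invariance of the measure $dp/\omega_p^m$, and the massive measure is dominated by the massless one. Both facts are one-line computations; the sketch below assumes $\omega_p^m = \sqrt{p^2 + m^2}$ and $\lambda > 0$, and uses an auxiliary variable $q$ that does not appear in the original text.

```latex
\begin{align*}
q = \lambda p:\quad
\frac{dq}{\omega_q^{\lambda m}}
  &= \frac{\lambda\, dp}{\sqrt{\lambda^2 p^2 + \lambda^2 m^2}}
   = \frac{dp}{\omega_p^m}, \\
\frac{1}{\omega_p^{\lambda m}}
  &= \frac{1}{\sqrt{p^2 + \lambda^2 m^2}}
   \le \frac{1}{|p|} \qquad (p \neq 0).
\end{align*}
```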
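In the limit $\lambda \to 0$, the integrand converges pointwise to the corresponding expression with $m = 0$, which is integrable because of our assumption on the test functions $f_j^{0\pm}$. For the Schwartz class functions $f_j^{\lambda m\pm}$, there exist $\lambda$-independent integrable bounds, and since $S_{\lambda m}$ has constant modulus 1, and $|\omega_j^{\lambda m}|^{-1} \le |p_j|^{-1}$, we can use dominated convergence to conclude
\begin{align*}
&\lim_{\lambda \to 0} \langle \Omega_m, z_m(f_{1,\lambda}^{m-}) \cdots z_m(f_{k,\lambda}^{m-})\, z_m^\dagger(f_{k+1,\lambda}^{m+}) \cdots z_m^\dagger(f_{n,\lambda}^{m+})\, \Omega_m \rangle \\
&\quad = \sum_{\pi \in S_k} \int \frac{dp_1}{|p_1|} \cdots \frac{dp_k}{|p_k|} \prod_{j=1}^{k} \left( f_{k-j+1}^{0-}(p_j)\, f_{k+j}^{0+}(p_{\pi(j)}) \right) \prod_{\substack{1 \le l < r \le k \\ \pi(l) > \pi(r)}} S_0(p_{\pi(l)}, p_{\pi(r)}) \\
&\quad = \langle \Omega_0, z_0(f_1^{0-}) \cdots z_0(f_k^{0-})\, z_0^\dagger(f_{k+1}^{0+}) \cdots z_0^\dagger(f_n^{0+})\, \Omega_0 \rangle. \tag{3.8}
\end{align*}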
After this preparation, we consider the $(2k)$-point function of the field $\phi_m$, and expand the fields into creation and annihilation operators,
\[ W_m^{2k,\lambda}(f_1, \ldots, f_{2k}) = \langle \Omega_m, (z_m(f_{1,\lambda}^{m-}) + z_m^\dagger(f_{1,\lambda}^{m+})) \cdots (z_m(f_{2k,\lambda}^{m-}) + z_m^\dagger(f_{2k,\lambda}^{m+}))\, \Omega_m \rangle, \]
and analogously for $m = 0$. These are sums of $2^{2k}$ terms, each of which is the vacuum expectation value of a $(2k)$-fold product of $z_m$'s and $z_m^\dagger$'s (respectively, $z_0$'s and $z_0^\dagger$'s). Because of the annihilation/creation properties of these operators, all terms in which the number of $z$'s is different from the number of $z^\dagger$'s vanish. So each non-zero term is of the form considered before, up to a reshuffling of creation and annihilation operators. Picking any one of these terms, we can use the exchange relations of Zamolodchikov's algebra to write the product of creation and annihilation operators as a sum of products of the particular form considered above, where all creation operators stand to the right of all annihilation operators. The only difference to the previous integral expressions is that the reordering may reduce the number of integrations in (3.7) (due to the term $\omega_p^m \delta(p - q)$ in Zamolodchikov's relation) and introduce various factors of $S_{\lambda m}(p_a, p_b)$, $a, b \in \{1, \ldots, k\}$, in the integrand as well as a permutation of the momenta $p_1, \ldots, p_k$.
But the reorderings are the same for the case $m > 0$ and $m = 0$, and the additional factors $S_{\lambda m}(p_a, p_b)$ converge pointwise and uniformly bounded to their counterparts with $m = 0$ in the limit $\lambda \to 0$. Thus the analogue of the limit (3.8) holds for an arbitrarily ordered product of $z$'s and $z^\dagger$'s, and (3.5) follows.

According to this result, the scaling limit of $n$-point functions of the field $\phi_m$ is given by the $n$-point functions of the field $\phi_0$, with the same scattering function $S$. We will then regard the $(0, S)$-model as the short distance scaling limit of the $(m, S)$-model. For a discussion of the relations with the Buchholz–Verch scaling limit, we refer the reader to the conclusions in Sec. 7.

The massless wedge-localized fields obtained in the above limit are actually chiral, and split into sums of halfline-localized fields on the two light rays. In the following, we will indicate this split on a formal level; the chiral fields will be defined more precisely in Sec. 4. The localization regions of the various fields appearing in this split can be visualized as in Fig. 1. We will use indices $r/\ell$ to distinguish between the right/left moving component fields. (To avoid confusion, we note explicitly that this means that, e.g., the right moving field is a function of $x_r = x_0 - x_1$ only, and therefore lives on the left light ray, defined by $x_0 + x_1 = 0$.) Similar to the fields $\phi_m$, $\phi'_m$ on $\mathbb{R}^2$, operators with/without prime are localized on one or the other side of a fixed light ray. It follows from (2.26), (2.27) that the field $\phi_0$ is formally defined by the operator valued distribution
\[ \phi_0(x) = \int_{\mathbb{R}} \frac{dp}{2\pi |p|} \left( e^{i(|p| x_0 - p x_1)} z_0^\dagger(p) + e^{-i(|p| x_0 - p x_1)} z_0(p) \right). \]
Splitting now the integration into the sum of an integration over $(-\infty, 0)$ and one over $(0, +\infty)$, and changing variable $p \to -p$ in the former integration, one gets
\[ \phi_0(x) = \frac{1}{\sqrt{2\pi}} \left( \varphi_r(x_r) + \varphi_\ell(x_\ell) \right). \]

Fig. 1. Massless wedge- and halfline-localized fields with their localization regions.
Here $x_\ell := x_0 + x_1$, $x_r := x_0 - x_1$ are the left and right light ray components of $x = (x_0, x_1)$, and
\[ \varphi_r(x_r) := \int_0^{+\infty} \frac{dp}{\sqrt{2\pi}\, p} \left( e^{i p x_r} z_0^\dagger(p) + e^{-i p x_r} z_0(p) \right), \tag{3.9} \]
\[ \varphi_\ell(x_\ell) := \int_0^{+\infty} \frac{dp}{\sqrt{2\pi}\, p} \left( e^{i p x_\ell} z_0^\dagger(-p) + e^{-i p x_\ell} z_0(-p) \right), \tag{3.10} \]
are two chiral (one-dimensional) fields living on the left/right light ray of two-dimensional Minkowski space. In order to avoid the infrared divergence which is apparent in the above integrals, one should actually consider the derivatives of these fields. At this formal level this is not really relevant, but we will consistently do so in the following section, where we denote the derivatives of $\varphi_\ell$, $\varphi_r$ as $\phi_\ell$, $\phi_r$, respectively. Notice also that, according to (2.11) and (2.23)–(2.25), one has, for $p, q > 0$,
\[ z_0(-p) z_0(q) = S(\infty)\, z_0(q) z_0(-p), \qquad z_0(-p) z_0^\dagger(q) = S(\infty)\, z_0^\dagger(q) z_0(-p). \]
This implies that $\varphi_\ell$ and $\varphi_r$ commute if $S(\infty) = +1$ and anticommute if $S(\infty) = -1$. Proceeding in the same way for the right-wedge field $\phi'_0$, one gets an analogous split into two chiral fields $\varphi'_r$, $\varphi'_\ell$ defined by substituting $z_0^\#(p)$ with $U_0(j) z_0^\#(p) U_0(j)^*$ in formulas (3.9), (3.10). It is then not difficult to see, following the arguments in [37], that
\[ [\varphi'_\ell(x_\ell), \varphi_\ell(y_\ell)] = 0 \quad \text{if } x_\ell > y_\ell \]
(see also Proposition 4.2(d) below). This shows that $\varphi'_\ell$ and $\varphi_\ell$ can be interpreted as being localized in the right and left half-line, respectively. An analogous statement holds of course for $\varphi_r$, $\varphi'_r$.

The above formal manipulations suggest that the $(0, S)$-model can be written as the (twisted, if $S(\infty) = -1$) tensor product of two chiral models which are again defined in terms of the Zamolodchikov–Faddeev algebra (2.23)–(2.25) with $m = 0$ and $p, q > 0$. These models will be rigorously defined in the next section, and in Sec. 4.4 we will show that such a tensor product decomposition actually holds.

4. Chiral Integrable Models

We saw in the previous section that the wedge-local fields generating the massless $(0, S)$-models factorize into chiral components. To analyze this connection in detail, it turns out to be most convenient to first introduce the chiral fields independently of the previously discussed models on two-dimensional Minkowski space, and discuss the relation to the $(0, S)$-models afterwards. In this section, we will therefore be concerned with quantum fields on the real line. The development of these models is largely parallel to Sec. 2.2, but has some distinctive differences. Our construction will yield dilation and translation covariant
quantum field theory models on $\mathbb{R}$ (thought of as either the right or left light ray), with algebras of observables localized in half lines and intervals. An important question is whether these models extend to conformally covariant theories on the circle. Using results of [32], it turns out that this question is closely related to the size of local algebras associated with bounded intervals. This point will be discussed in detail in Sec. 5.

4.1. S-symmetric Fock space and Zamolodchikov’s algebra on the light ray

As before, we start from a scattering function $S \in \mathcal{S}_{\mathrm{lim}}$. We first define the Hilbert space of the theory. Our “chiral” single particle space is given by $\mathcal{H}_1 := L^2(\mathbb{R}, d\beta)$. The variable $\beta$ is meant to be related to the momentum $p$ by $p = e^\beta$, as will become clear in (4.1) below. Like in (2.14), we have a unitary action $D_n$ of the permutation group $S_n$ on $\mathcal{H}_1^{\otimes n} = L^2(\mathbb{R}^n, d\beta)$ which acts on transpositions $\tau_k$ by
\[ (D_n(\tau_k)\Psi_n)(\beta_1, \ldots, \beta_n) = S(\beta_{k+1} - \beta_k) \cdot \Psi_n(\beta_1, \ldots, \beta_{k+1}, \beta_k, \ldots, \beta_n). \]
Again, our $n$-particle space $\mathcal{H}_n$ is defined as the $D_n$-invariant subspace of $\mathcal{H}_1^{\otimes n}$, and the projector onto it is denoted as $P_n$. We define the Fock space $\mathcal{H} := \bigoplus_{n \ge 0} \mathcal{H}_n$ and its subspace $\mathcal{D} \subset \mathcal{H}$ consisting of vectors of bounded particle number, i.e. of terminating sequences.

We proceed to a representation of the space-time symmetries. On $\mathcal{H}$, we consider a unitary representation $U$ of the affine group $G$ of $\mathbb{R}$, consisting of translations and dilations, $\mathbb{R} \ni \xi \mapsto e^\lambda \xi + \xi'$, and the reflection, $j(\xi) := -\xi$. For translations and dilations, it is defined as, $\xi, \lambda \in \mathbb{R}$,
\[ (U(\xi, \lambda)\Psi)_n(\beta_1, \ldots, \beta_n) := e^{i\xi(e^{\beta_1} + \cdots + e^{\beta_n})} \cdot \Psi_n(\beta_1 + \lambda, \ldots, \beta_n + \lambda), \tag{4.1} \]
and the reflection $j$ is represented antiunitarily by
\[ (U(j)\Psi)_n(\beta_1, \ldots, \beta_n) := \overline{\Psi_n(\beta_n, \ldots, \beta_1)} = \prod_{1 \le l < r \le n} S(\beta_r - \beta_l) \cdot \overline{\Psi_n(\beta_1, \ldots, \beta_n)}. \tag{4.2} \]
Compare this with the two-dimensional case in (2.17), (2.20). We will also use the shorthand notation $U(\xi) := U(\xi, 0)$ for pure translations, and note here that this one parameter group has a positive generator, $H$. Up to scalar multiples, $\Omega := 1 \oplus 0 \oplus 0 \oplus \cdots$ is the only $U$-invariant vector in $\mathcal{H}$; it will play the role of the vacuum vector.

As in (2.21), we will make use of “S-symmetrized” annihilation and creation operators, which we label $y$ and $y^\dagger$, in order to distinguish them from $z_m$, $z_m^\dagger$, since they will take rapidities rather than momenta as arguments. For $\psi \in \mathcal{H}_1$, $\Phi \in \mathcal{D}$, they act by
\[ (y^\dagger(\psi)\Phi)_n := \sqrt{n}\, P_n(\psi \otimes \Phi_{n-1}), \qquad y(\psi) := y^\dagger(\overline{\psi})^*. \tag{4.3} \]
Except for the special case $S = -1$, these are unbounded operators containing $\mathcal{D}$ in their domains. Under symmetry transformations, they behave like
\[ U(\xi, \lambda)\, y^\dagger(\psi)\, U(\xi, \lambda)^{-1} = y^\dagger(U(\xi, \lambda)\psi), \tag{4.4} \]
\[ U(\xi, \lambda)\, y(\psi)\, U(\xi, \lambda)^{-1} = y(U(-\xi, \lambda)\psi), \tag{4.5} \]
whereas with respect to the reflection $j$, no such transformation formula holds. From time to time, we will also work with operator-valued distributions $y(\beta)$, $y^\dagger(\beta)$, $\beta \in \mathbb{R}$, related to the above operators by the formal integrals $y^\#(\psi) = \int d\beta\, \psi(\beta)\, y^\#(\beta)$. They satisfy the relations of the Zamolodchikov–Faddeev algebra in the form
\[ y(\beta_1) y(\beta_2) = S(\beta_1 - \beta_2)\, y(\beta_2) y(\beta_1), \tag{4.6} \]
\[ y^\dagger(\beta_1) y^\dagger(\beta_2) = S(\beta_1 - \beta_2)\, y^\dagger(\beta_2) y^\dagger(\beta_1), \tag{4.7} \]
\[ y(\beta_1) y^\dagger(\beta_2) = S(\beta_2 - \beta_1)\, y^\dagger(\beta_2) y(\beta_1) + \delta(\beta_1 - \beta_2) \cdot 1. \tag{4.8} \]
It is interesting to note that these are exactly the same relations as used in massive two-dimensional models, written in terms of rapidities [40]. We will however see that the interpretation in terms of wedge-local observables must be modified in the chiral case.

4.2. Half-local quantum fields and observable algebras

We now set out to construct a pair of quantum fields on $\mathcal{H}$ as sums of Zamolodchikov type creation and annihilation operators, analogous to the two-dimensional case in Sec. 2.2. For the one-dimensional case, these quantum fields will be localized in half-lines, rather than in wedge regions. While we employ largely the same ideas as in the massive two-dimensional case [40], the chiral situation makes some modifications necessary, so that we will need to look into the construction in more detail.

We first introduce the necessary test functions and discuss their properties. For $f \in \mathscr{S}(\mathbb{R})$, $\psi \in C_0^\infty(\mathbb{R})$, we define their Fourier transforms and the positive/negative frequency components of those with the following conventions.
\[ \hat f^\pm(\beta) := \pm i e^\beta \tilde f(\pm e^\beta) = \pm \frac{i e^\beta}{\sqrt{2\pi}} \int f(\xi) \exp(\pm i e^\beta \xi)\, d\xi, \tag{4.9} \]
\[ \check\psi^\pm(\xi) := \mp \frac{i}{\sqrt{2\pi}} \int d\beta\, \psi(\beta)\, e^{\mp i \xi e^\beta} = \mp \frac{i}{\sqrt{2\pi}} \int_0^\infty dp\, \frac{\psi(\log p)}{p}\, e^{\mp i p \xi}. \tag{4.10} \]

Lemma 4.1. Let $f \in \mathscr{S}(\mathbb{R})$, $\psi \in C_0^\infty(\mathbb{R})$.
(a) $\hat f^\pm, \check\psi^\pm \in \mathscr{S}(\mathbb{R})$. As maps from $\mathscr{S}(\mathbb{R})$ to $L^2(\mathbb{R})$, $f \mapsto \hat f^\pm$ are continuous.
(b) For $\psi \in C_0^\infty(\mathbb{R})$, there holds
\[ \widehat{(\check\psi^\pm)}{}^{\pm} = \psi, \qquad \widehat{(\check\psi^\pm)}{}^{\mp} = 0. \]
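(c) Let $f^{\xi,\lambda}(\xi') := f(e^{-\lambda}(\xi' - \xi))$ and $f^j(\xi) := \overline{f(-\xi)}$. Then
\[ \widehat{(f^{\xi,\lambda})}{}^{\pm}(\beta) = e^{\pm i \xi e^\beta} \hat f^\pm(\beta + \lambda), \qquad \widehat{(f^j)}{}^{\pm}(\beta) = -\overline{\hat f^\pm(\beta)}, \qquad \widehat{\overline{f}}{}^{\pm} = \overline{\hat f^\mp}. \tag{4.11} \]
(d) Let $f, g \in \mathscr{S}(\mathbb{R})$, with $\operatorname{supp} f \subset \mathbb{R}_+$, $\operatorname{supp} g \subset \mathbb{R}_-$. Then $\hat f^+$ and $\hat g^-$ have bounded analytic extensions to the strip $S(0, \pi)$, and $|\hat f^+(\beta + i\lambda)|, |\hat g^-(\beta + i\lambda)| \to 0$ as $\beta \to \pm\infty$, uniformly in $\lambda \in [0, \pi]$. The boundary values at $\operatorname{Im} \beta = \pi$ are
\[ \hat f^+(\beta + i\pi) = \hat f^-(\beta), \qquad \hat g^-(\beta + i\pi) = \hat g^+(\beta), \qquad \beta \in \mathbb{R}. \tag{4.12} \]
If $\operatorname{supp} f \subset (r, \infty)$ and $\operatorname{supp} g \subset (-\infty, -r)$ for some $r > 0$, then there exist $c, c' > 0$ such that
\[ |\hat f^+(\beta + i\lambda)| \le c\, e^{-r e^\beta \sin\lambda}, \qquad |\hat g^-(\beta + i\lambda)| \le c'\, e^{-r e^\beta \sin\lambda}, \qquad 0 \le \lambda \le \pi. \tag{4.13} \]

Proof. (a) It is clear that $\hat f^\pm \in \mathscr{S}(\mathbb{R})$, and by considering the second formula in (4.10), $\check\psi^\pm$ is seen to be the Fourier transform of a function in $C_0^\infty(\mathbb{R})$, and hence of Schwartz class, too. Since $\tilde f \in \mathscr{S}(\mathbb{R})$, one gets the bound $|\hat f^\pm(\beta)| \le c_\pm(f) \cdot e^{-|\beta|}$ for some Schwartz seminorm $c_\pm(f)$, which implies the claimed continuity by estimating the $L^2$-norm of $\hat f^\pm$.
(b) By its definition (4.10), $\check\psi^\pm$ is the inverse Fourier transform of the function $p \mapsto \mp i\, \theta(\pm p)\, \psi(\log|p|)/|p|$, where $\theta$ denotes the step function. The statement now follows from the Fourier inversion formula.
(c) This is obtained by straightforward calculation.
(d) The analyticity of $\hat f^+$ in $S(0, \pi)$ follows from the analyticity of $\tilde f$ in the upper complex half plane (since $\operatorname{supp} f \subset \mathbb{R}_+$), and the fact that the exponential function maps $S(0, \pi)$ onto the upper half plane. The uniform bound follows from the estimate
\begin{align*}
|\hat f^+(\beta + i\lambda)| &= \frac{1}{\sqrt{2\pi}} \left| \int_0^\infty d\xi\, \partial_\xi f(\xi)\, e^{i \xi e^{\beta + i\lambda}} \right| \\
&\le \frac{1}{\sqrt{2\pi}} \int_0^\infty d\xi\, |\partial_\xi f(\xi)|\, e^{-\xi e^\beta \sin\lambda} \\
&\le \frac{\|\partial_\xi f\|_1}{\sqrt{2\pi}}, \tag{4.14}
\end{align*}
where in the last step we used $\xi > 0$, $0 \le \lambda \le \pi$. The claimed boundary value follows directly from the definition of $\hat f^+$ in (4.9):
\[ \hat f^+(\beta + i\pi) = i e^{\beta + i\pi} \tilde f(e^{\beta + i\pi}) = -i e^\beta \tilde f(-e^\beta) = \hat f^-(\beta). \tag{4.15} \]
So $\hat f^+(\beta)$ and $\hat f^+(\beta + i\pi)$ converge to zero for $\beta \to \pm\infty$. Since these functions are bounded and analytic in $S(0, \pi)$, it follows that also $|\hat f^+(\beta + i\lambda)| \to 0$ as $\beta \to \pm\infty$, uniformly in $\lambda \in [0, \pi]$; see, for example, [7, Corollary 1.4.5].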
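For the reader's convenience, the two elementary steps behind (4.14) can be written out as follows. This is a sketch under the stated assumptions $\operatorname{supp} f \subset \mathbb{R}_+$ and $0 \le \lambda \le \pi$; the abbreviation $w := e^{\beta + i\lambda}$ is introduced only here.

```latex
\begin{align*}
\hat f^{+}(\beta + i\lambda)
  &= \frac{i w}{\sqrt{2\pi}} \int_0^\infty d\xi\, f(\xi)\, e^{i\xi w}
   = \frac{1}{\sqrt{2\pi}} \int_0^\infty d\xi\, f(\xi)\, \partial_\xi e^{i\xi w}
   = -\frac{1}{\sqrt{2\pi}} \int_0^\infty d\xi\, (\partial_\xi f)(\xi)\, e^{i\xi w}, \\
\bigl| e^{i\xi w} \bigr|
  &= \bigl| e^{i\xi e^\beta \cos\lambda} \bigr|\, e^{-\xi e^\beta \sin\lambda}
   = e^{-\xi e^\beta \sin\lambda} \le 1
   \qquad (\xi \ge 0,\ 0 \le \lambda \le \pi).
\end{align*}
```

The boundary terms of the integration by parts vanish because $f(0) = 0$ (by continuity and the support condition) and $f$ decays rapidly at infinity while $|e^{i\xi w}| \le 1$.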
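To obtain the sharpened bound (4.13), note that if $\operatorname{supp} f \subset (r, \infty)$, then $f^{-r,0}$ (cf. part (c)) has support in $\mathbb{R}_+$, and $\widehat{(f^{-r,0})}{}^{+}(\beta) = e^{-i r e^\beta} \hat f^+(\beta)$ due to (4.11). So there exists $c > 0$ such that for any $\lambda \in [0, \pi]$,
\[ c > |e^{-i r e^{\beta + i\lambda}} \hat f^+(\beta + i\lambda)| = e^{r e^\beta \sin\lambda}\, |\hat f^+(\beta + i\lambda)|, \tag{4.16} \]
which implies (4.13). Finally, given $g \in \mathscr{S}(\mathbb{R})$ with $\operatorname{supp} g \subset \mathbb{R}_-$, respectively $\operatorname{supp} g \subset (-\infty, -r)$, all corresponding statements about $\hat g^-$ follow from the previous arguments by considering $f(\xi) := g(-\xi)$, since $\operatorname{supp} f \subset \mathbb{R}_+$, and $\hat f^+ = -\hat g^-$.

After these preparations, we define for $f \in \mathscr{S}(\mathbb{R})$ the two field operators,
\[ \phi(f) := y^\dagger(\hat f^+) + y(\hat f^-), \tag{4.17} \]
\[ \phi'(f) := U(j)\, \phi(f^j)\, U(j). \tag{4.18} \]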
These fields should be thought of as the derivatives of the left/right chiral fields $\varphi^{[\prime]}_{\ell/r}$ appearing in the decomposition of the massless two-dimensional field $\phi_0$, cf. also Fig. 1. For reference, we note the “unsmeared”, distributional version of (4.17):
\[ \phi(\xi) = i \int \frac{d\beta}{\sqrt{2\pi}}\, e^\beta \left( e^{i e^\beta \xi}\, y^\dagger(\beta) - e^{-i e^\beta \xi}\, y(\beta) \right). \tag{4.19} \]
The main features of these fields can largely be obtained in the same way as in [37].

Proposition 4.2. $\phi$ and $\phi'$ have the following properties.
(a) The map $f \mapsto \phi(f)$ is an operator-valued tempered distribution such that $\mathcal{D}$ is contained in the domain of $\phi(f)$ for all $f \in \mathscr{S}(\mathbb{R})$. For real $f$, the operator $\phi(f)$ is essentially self-adjoint, with elements from $\mathcal{D}$ as entire analytic vectors.
(b) $\phi$ transforms covariantly under the representation $U$ of the connected component of the affine group, i.e.
\[ U(\xi, \lambda)\, \phi(f)\, U(\xi, \lambda)^{-1} = \phi(f^{\xi,\lambda}). \tag{4.20} \]
(c) The Reeh–Schlieder property holds, i.e. for any non-empty open interval $I \subset \mathbb{R}$, the set
\[ \operatorname{span}\{ \phi(f_1) \cdots \phi(f_n)\Omega : f_1, \ldots, f_n \in \mathscr{S}(I),\ n \in \mathbb{N}_0 \} \tag{4.21} \]
is dense in $\mathcal{H}$.
(d) $\phi$ and $\phi'$ are relatively half-local in the following sense: If $f, g \in \mathscr{S}(\mathbb{R})$ satisfy $\operatorname{supp} f \subset (a, \infty)$, $\operatorname{supp} g \subset (-\infty, a)$ for some $a \in \mathbb{R}$, then
\[ [\phi(f), \phi'(g)]\Psi = 0 \quad \text{for all } \Psi \in \mathcal{D}. \tag{4.22} \]
Statements (a)–(c) also hold when $\phi$ is replaced with $\phi'$.
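Proof. (a) It is clear from the definition of $\phi(f)$ that these operators always contain $\mathcal{D}$ in their domains and depend complex linearly on $f \in \mathscr{S}(\mathbb{R})$. Taking into account that the restrictions of the creation/annihilation operators to an $n$-particle space $\mathcal{H}_n$ are bounded, $\|y^\#(\psi)|_{\mathcal{H}_n}\| \le \sqrt{n+1}\, \|\psi\|_{\mathcal{H}_1}$, and the continuity of $\mathscr{S}(\mathbb{R}) \ni f \mapsto \hat f^\pm \in \mathcal{H}_1$ established in Lemma 4.1(a), it follows that $\phi$ is an operator-valued tempered distribution. In view of (4.3) and (4.11), we have
\[ \phi(f)^* = (y^\dagger(\hat f^+) + y(\hat f^-))^* \supset y(\overline{\hat f^+}) + y^\dagger(\overline{\hat f^-}) = y(\widehat{\overline{f}}{}^{-}) + y^\dagger(\widehat{\overline{f}}{}^{+}) = \phi(\overline{f}). \]
This shows that $\phi(f)$ is hermitian for real $f$, and the proof of essential self-adjointness can now be completed as in [37, Proposition 1] by showing that any vector in $\mathcal{D}$ is entire analytic for $\phi(f)$.
(b) This is a direct consequence of (4.4) and (4.11).
(c) Let $\mathcal{P}(I)$ denote the algebra generated by all polynomials in the field $\phi(f)$ with $\operatorname{supp} f \subset I$. By standard analyticity arguments making use of the positivity of the generator of $\xi \mapsto U(\xi)$, it follows that $\mathcal{P}(I)\Omega$ is dense in $\mathcal{H}$ if and only if $\mathcal{P}(\mathbb{R})\Omega$ is dense in $\mathcal{H}$. But given any $\psi \in C_0^\infty(\mathbb{R})$, the function $f := \check\psi^+ \in \mathscr{S}(\mathbb{R})$ satisfies $\hat f^+ = \psi$ and $\hat f^- = 0$ (Lemma 4.1(b)), and hence $y^\dagger(\psi) = \phi(f) \in \mathcal{P}(\mathbb{R})$. Since $C_0^\infty(\mathbb{R})$ is dense in $\mathcal{H}_1$, polynomials in the $y^\dagger(\psi)$ create a dense set from $\Omega$. The proofs of statements (a)–(c) for the field $\phi'$ are completely analogous.
(d) Since the Zamolodchikov–Faddeev relations (4.6)–(4.8) are the same as in the massive case in rapidity space, we can establish the following commutation relations in complete analogy to [37, Lemma 4]:
\begin{align*}
[y(\psi_1), U(j) y(\psi_2) U(j)] &= 0, \\
[y^\dagger(\psi_1), U(j) y^\dagger(\psi_2) U(j)] &= 0, \\
([U(j) y(\psi_1) U(j), y^\dagger(\psi_2)]\Phi)_n(\boldsymbol\beta) &= C_n^{\psi_1,\psi_2,+}(\boldsymbol\beta) \cdot \Phi_n(\boldsymbol\beta), \tag{4.23} \\
([U(j) y^\dagger(\psi_1) U(j), y(\psi_2)]\Phi)_n(\boldsymbol\beta) &= C_n^{\psi_1,\psi_2,-}(\boldsymbol\beta) \cdot \Phi_n(\boldsymbol\beta),
\end{align*}
where $\psi_1, \psi_2 \in \mathcal{H}_1$, $\Phi \in \mathcal{D}$, and
\[ C_n^{\psi_1,\psi_2,\pm}(\boldsymbol\beta) = \pm \int d\beta_0\, \psi_1(\beta_0) \psi_2(\beta_0) \prod_{k=1}^{n} S(\pm\beta_0 \mp \beta_k). \tag{4.24} \]
In view of the definition of the fields $\phi$ and $\phi'$, the commutator takes the form
\begin{align*}
[\phi(f), \phi'(g)]\Psi_n &= -[y^\dagger(\hat f^+) + y(\hat f^-),\ U(j) y^\dagger(\hat g^+) U(j) + U(j) y(\hat g^-) U(j)]\Psi_n \\
&= (C_n^{\hat f^+, \hat g^-, +} + C_n^{\hat f^-, \hat g^+, -})\Psi_n \quad \text{for } \Psi_n \in \mathcal{H}_n. \tag{4.25}
\end{align*}
Due to the translational covariance of $\phi$ and $\phi'$, it is sufficient to consider the case $a = 0$, i.e. $\operatorname{supp} f \subset \mathbb{R}_+$, $\operatorname{supp} g \subset \mathbb{R}_-$. To show that $C_n^{\hat f^+, \hat g^-, +} + C_n^{\hat f^-, \hat g^+, -} = 0$, we note that in the integral
\[ C_n^{\hat f^+, \hat g^-, +}(\boldsymbol\beta) = \int d\beta_0\, \hat f^+(\beta_0)\, \hat g^-(\beta_0) \prod_{k=1}^{n} S(\beta_0 - \beta_k), \tag{4.26} \]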
all three functions, $\hat f^+$, $\hat g^-$, and $\beta_0 \mapsto S(\beta_0 - \beta_k)$, have analytic continuations to the strip $S(0, \pi)$ (Definition 2.1 and Lemma 4.1(d)). According to Definition 2.1, the continuation of $S$ is bounded on this strip, whereas according to Lemma 4.1(d), the functions $\hat f^+(\beta_0 + i\lambda)$, $\hat g^-(\beta_0 + i\lambda)$ converge to zero for $\beta_0 \to \pm\infty$ uniformly in $\lambda \in [0, \pi]$. This implies that we can shift the contour of integration from $\mathbb{R}$ to $\mathbb{R} + i\pi$ in (4.26). As the boundary values of the integrated functions are given by $\hat f^+(\beta_0 + i\pi) = \hat f^-(\beta_0)$, $\hat g^-(\beta_0 + i\pi) = \hat g^+(\beta_0)$, and $S(\beta_0 + i\pi - \beta_k) = S(\beta_k - \beta_0)$, comparison with (4.24) shows $C_n^{\hat f^+, \hat g^-, +} + C_n^{\hat f^-, \hat g^+, -} = 0$.

Proceeding to the algebraic formulation, we denote the self-adjoint closures of $\phi(f)$ and $\phi'(f)$ (with $f$ real) by the same symbols, and introduce the von Neumann algebras generated by them,
\[ \mathcal{M} := \{ e^{i\phi(f)} : f \in \mathscr{S}(\mathbb{R}_+)\ \text{real} \}'', \tag{4.27} \]
\[ \widetilde{\mathcal{M}} := \{ e^{i\phi'(f)} : f \in \mathscr{S}(\mathbb{R}_-)\ \text{real} \}''. \tag{4.28} \]
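Theorem 4.3. The algebras $\mathcal{M}$ and $\widetilde{\mathcal{M}}$ have the following properties.
(a) For $\xi \ge 0$, $\lambda \in \mathbb{R}$, we have
\[ U(\xi, \lambda)\, \mathcal{M}\, U(\xi, \lambda)^{-1} \subset \mathcal{M}. \tag{4.29} \]
(b) The vector $\Omega$ is cyclic and separating for $\mathcal{M}$.
(c) The Tomita–Takesaki modular data of $(\mathcal{M}, \Omega)$ are
\[ \Delta^{it} = U(0, -2\pi t), \qquad J = U(j). \tag{4.30} \]
(d) $\widetilde{\mathcal{M}} = \mathcal{M}'$.

Proof. (a) Given $f \in \mathscr{S}(\mathbb{R}_+)$ and $\xi \ge 0$, $\lambda \in \mathbb{R}$, also $f^{\xi,\lambda}$ lies in $\mathscr{S}(\mathbb{R}_+)$ by (4.11). Since $U(\xi, \lambda)\mathcal{M}U(\xi, \lambda)^{-1}$ is generated by $U(\xi, \lambda)\, e^{i\phi(f)}\, U(\xi, \lambda)^{-1} = e^{i\phi(f^{\xi,\lambda})}$, cf. (4.20), the claim follows.
(b) Taking into account that the field operator $\phi(f)$ is self-adjoint, we can use standard arguments (see, e.g., [10]) to derive the cyclicity of $\Omega$ for $\mathcal{M}$ from the cyclicity of $\Omega$ for $\phi$, which was established in Proposition 4.2(c). Next we note that our fields $\phi$, $\phi'$ have the vacuum $\Omega$ as an analytic vector (Proposition 4.2(a)), so that we can apply the results of [11] to conclude that also the unitaries $e^{i\phi(f)}$, $e^{i\phi'(g)}$ commute for real $f, g$ with $\operatorname{supp} f \subset \mathbb{R}_+$, $\operatorname{supp} g \subset \mathbb{R}_-$. That is, $\widetilde{\mathcal{M}} \subset \mathcal{M}'$. But $\Omega$ is cyclic for $\widetilde{\mathcal{M}}$ by the same argument as above, and hence $\Omega$ separates $\mathcal{M}$.
(c) The proof of this claim works precisely as in [18, Proposition 3.1] by exploiting the commutation relations between $U(x)$ and $\Delta^{it}$, $J$, which are known from a theorem of Borchers [8].
(d) By definition of $\phi'$, we have $\widetilde{\mathcal{M}} = U(j)\mathcal{M}U(j)$. But $U(j)$ coincides with the modular conjugation of $(\mathcal{M}, \Omega)$, and hence, by Tomita’s theorem, $\widetilde{\mathcal{M}} = U(j)\mathcal{M}U(j) = J\mathcal{M}J = \mathcal{M}'$.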
4.3. Local operators

So far we have constructed a Hilbert space $\mathcal{H}$, a representation $U$ of $G$ on $\mathcal{H}$, and a von Neumann algebra $\mathcal{M} \subset \mathcal{B}(\mathcal{H})$ associated with a scattering function $S \in \mathcal{S}_{\mathrm{lim}}$, such that these data are compatible in the sense of Theorem 4.3. Given such objects, we now recall how a corresponding local field theory can be constructed. The first step is to define a family of von Neumann algebras associated with intervals, $-\infty < a < b < \infty$, as
\[ \mathcal{A}(a, b) := U(a)\mathcal{M}U(a)^{-1} \cap U(b)\mathcal{M}'U(b)^{-1}. \tag{4.31} \]
For general subsets $R$ of $\mathbb{R}$ we set
\[ \mathcal{A}(R) := \bigvee_{(a,b) \subset R} \mathcal{A}(a, b). \tag{4.32} \]
This defines in particular the locally generated half line algebras $\mathcal{A}(\mathbb{R}_+) \subset \mathcal{M}$ and $\mathcal{A}(\mathbb{R}_-) \subset \mathcal{M}'$, as well as the global algebra $\mathcal{A} := \mathcal{A}(\mathbb{R})$. The following properties of the assignment $I \mapsto \mathcal{A}(I)$ are all straightforward consequences of Theorem 4.3, so that we can omit the proof.

Proposition 4.4. The map $I \mapsto \mathcal{A}(I)$ is an isotonous net of von Neumann algebras on $\mathcal{H}$ which transforms covariantly under the affine group $G$,
\[ U(g)\mathcal{A}(I)U(g)^{-1} = \mathcal{A}(gI), \qquad g \in G. \tag{4.33} \]
This net of algebras is local in the sense that
\[ \mathcal{A}(I_1) \subset \mathcal{A}(I_2)' \quad \text{whenever } I_1 \cap I_2 = \emptyset. \tag{4.34} \]
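Since the proof is omitted, here is a brief sketch of how the locality statement (4.34) follows from Theorem 4.3 in the simplest situation of two intervals $I_1 = (a, b)$, $I_2 = (c, d)$ with $b \le c$ (these labels are chosen for illustration). It uses only (4.29), (4.31) and the group law $U(c) = U(b)U(c - b)$ for translations.

```latex
\begin{align*}
\mathcal{A}(c,d)
  &\subset U(c)\,\mathcal{M}\,U(c)^{-1}
   = U(b)\,\bigl[ U(c-b)\,\mathcal{M}\,U(c-b)^{-1} \bigr]\,U(b)^{-1}
   \subset U(b)\,\mathcal{M}\,U(b)^{-1}
   && \text{by (4.31), (4.29) with } \xi = c - b \ge 0, \\
\mathcal{A}(a,b)
  &\subset U(b)\,\mathcal{M}'\,U(b)^{-1}
   = \bigl( U(b)\,\mathcal{M}\,U(b)^{-1} \bigr)'
   && \text{by (4.31)}, \\
\Rightarrow\quad \mathcal{A}(a,b) &\subset \mathcal{A}(c,d)'.
\end{align*}
```

The last line holds because passing to commutants reverses inclusions: $\mathcal{A}(c,d) \subset U(b)\mathcal{M}U(b)^{-1}$ implies $(U(b)\mathcal{M}U(b)^{-1})' \subset \mathcal{A}(c,d)'$.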
Note that no statement regarding the size of the algebras $\mathcal{A}(I)$ is made here. We shall see in Sec. 5 that this question is closely related to the existence of conformal symmetry. However, there is one restriction on the size of $\mathcal{A}(I)$ that we can compute directly: We will show that all local operators commute with $S(\infty)^N$, where $(N\Psi)_n := n \cdot \Psi_n$ is the number operator on $\mathcal{H}$ and $S(\infty)^N$ is defined via spectral calculus. That is, $S(\infty)^N = \sum_{n=0}^{\infty} S(\infty)^n P_n$. In the case $S(\infty) = -1$, this commutation relation limits the size of $\mathcal{A}(I)$. We begin with a preparatory lemma.

Lemma 4.5. Let $\psi_1, \psi_2 \in \mathscr{S}(\mathbb{R})$, $n \in \mathbb{N}_0$. The following sequences of bounded operators converge to zero in the weak operator topology as $\lambda \to \infty$.
\[ y(U(0, \lambda)\psi_1)\, U(j)\, y(U(0, \lambda)\psi_2)\, P_n, \tag{4.35} \]
\[ y^\dagger(U(0, \lambda)\psi_1)\, U(j)\, y^\dagger(U(0, \lambda)\psi_2)\, P_n, \tag{4.36} \]
\[ y^\dagger(U(0, \lambda)\psi_1)\, U(j)\, y(U(0, \lambda)\psi_2)\, P_n, \tag{4.37} \]
\[ [y(U(0, \lambda)\psi_1),\ U(j)\, y^\dagger(U(0, \lambda)\psi_2)\, U(j)] - \langle \psi_2, \psi_1 \rangle\, S(\infty)^N. \tag{4.38} \]
Proof. Let $\Psi_n \in \mathcal{H}_n \cap \mathscr{S}(\mathbb{R}^n)$. Expanding the definition (4.3) of the annihilation operator as in (2.21), we find, $k = 1, 2$,
\[ \|y(U(0, \lambda)\psi_k)\Psi_n\|^2 = n \int d^{n-1}\boldsymbol\beta\, d\beta_0\, d\beta_0'\ \overline{\psi_k(\beta_0 + \lambda)}\, \psi_k(\beta_0' + \lambda)\, \overline{\Psi_n(\beta_0, \boldsymbol\beta)}\, \Psi_n(\beta_0', \boldsymbol\beta). \]
As $\lambda \to \infty$, the integrand goes to zero pointwise, and since the functions are all of Schwartz class, we can apply the dominated convergence theorem to prove that $y(U(0, \lambda)\psi_k)\Psi_n \to 0$ in Hilbert space norm. On the other hand, we have the bound $\|y(U(0, \lambda)\psi_k)P_n\| \le \sqrt{n}\, \|\psi_k\|$, uniform in $\lambda$. Hence
\[ \lim_{\lambda \to \infty} y(U(0, \lambda)\psi_k)P_n = 0 \quad \text{in the strong operator topology.} \tag{4.39} \]
By another application of the uniform bound, and using $\|U(j)\| = 1$, the operator (4.35) converges to zero strongly. The adjoint of this operator then vanishes in the weak operator topology. Since this adjoint differs from (4.36) only by trivial redefinitions, the second claim follows. For proving that (4.37) converges weakly to zero in the limit $\lambda \to \infty$, we just need to apply (4.39) on both sides of the scalar product. For the operator (4.38), we apply (4.23) to obtain, with $\Phi_n, \Psi_n \in \mathcal{H}_n \cap \mathscr{S}(\mathbb{R}^n)$,
\begin{align*}
&\langle \Phi_n, ([y(U(0, \lambda)\psi_1),\ U(j)\, y^\dagger(U(0, \lambda)\psi_2)\, U(j)] - \langle \psi_2, \psi_1 \rangle S(\infty)^N)\Psi_n \rangle \\
&\quad = \int d^n\boldsymbol\beta\, d\beta_0\ \overline{\Phi_n(\boldsymbol\beta)}\, \Psi_n(\boldsymbol\beta)\, \psi_1(\beta_0) \psi_2(\beta_0) \left( \prod_{l=1}^{n} S(\beta_l - \beta_0 + \lambda) - S(\infty)^n \right). \tag{4.40}
\end{align*}
The integrand tends to zero pointwise, and by another application of the dominated convergence theorem, it follows that the above matrix element vanishes as $\lambda \to \infty$. All matrix elements of (4.38) between vectors of different particle number vanish identically. As (4.38) is bounded in operator norm, uniform in $\lambda$, and $\Phi_n$, $\Psi_n$ were chosen from a total set, the operator (4.38) tends to zero in the weak operator topology.

As a consequence, all local operators are even with respect to the particle number in the case $S(\infty) = -1$.

Proposition 4.6. If $A \in \mathcal{A}(I)$ for some bounded interval $I$, then $[A, S(\infty)^N] = 0$.

Proof. Without loss of generality, let $I = (-1, 1)$. We choose $g \in \mathscr{S}(1, \infty)$ and $g' \in \mathscr{S}(-\infty, -1)$ fixed such that $\langle \hat g^-, \hat g'^+ \rangle \neq 0$, and set $f^{[\prime]} := g^{[\prime]\,0,\lambda}$ with $\lambda \ge 0$. For any such $\lambda$, the (closed) field operator $\phi(f)$ is affiliated with $U(1)\mathcal{M}U(1)^{-1}$, and $\phi'(f')$ is affiliated with $U(-1)\mathcal{M}'U(-1)^{-1}$. Since both fields contain $\mathcal{D}$ in their domains and leave this subspace invariant, this implies that their product $\phi(f)\phi'(f')$ commutes with $\mathcal{A}(I) = U(1)\mathcal{M}'U(1)^{-1} \cap U(-1)\mathcal{M}U(-1)^{-1}$ on $\mathcal{D}$, cf. (4.31) and
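Theorem 4.3(d). Hence we find, for $\Phi, \Psi \in \mathcal{D}$,
\[ \langle \Phi, [A, \phi(f)\phi'(f')]\Psi\rangle = 0. \tag{4.41} \]
We can write
\[ \begin{aligned} -\phi(f)\phi'(f') &= \big(y^\dagger(\hat f^+) + y(\hat f^-)\big)\,U(j)\,\big(y^\dagger(\hat f'^+) + y(\hat f'^-)\big)\,U(j) \\ &= \underbrace{y^\dagger(\hat f^+)U(j)y^\dagger(\hat f'^+)U(j) + y^\dagger(\hat f^+)U(j)y(\hat f'^-)U(j) + U(j)y^\dagger(\hat f'^+)U(j)y(\hat f^-) + y(\hat f^-)U(j)y(\hat f'^-)U(j)}_{(*)} \\ &\quad + [\,y(\hat f^-),\, U(j)y^\dagger(\hat f'^+)U(j)\,]. \end{aligned} \tag{4.42} \]
Inserted into the matrix element (4.41), the expression $(*)$ vanishes as $\lambda \to \infty$ due to Lemma 4.5. In the same way, the remaining commutator converges to $\langle \hat g^-, \hat g'^+\rangle\, S(\infty)^N$. Since $\langle \hat g^-, \hat g'^+\rangle \neq 0$, the claim follows.

Thus, at least for the class of models with $S(\infty) = -1$, we have some restriction on the size of the local algebras $\mathcal{A}(I)$; in particular, the inclusion $\mathcal{A}(0,\infty) \subset \mathcal{M}$ is proper in these cases.

4.4. Chiral decomposition of the two-dimensional models

We now explain the decomposition of the two-dimensional massless $(0, S)$-models described in Sec. 2.2 into chiral components of the form described in Secs. 4.1–4.3. Given a scattering function $S \in \mathcal{S}_{\mathrm{lim}}$, consider two copies $\mathcal{H}_{\ell/r}$, $y_{\ell/r}(\beta)$, $y^\dagger_{\ell/r}(\beta)$, $U_{\ell/r}$, $N_{\ell/r}$ and $\phi_{\ell/r}$, $\phi'_{\ell/r}$ of, respectively, the Hilbert space, Zamolodchikov operators, representation of the affine group of $\mathbb{R}$, particle number operators and halfline fields discussed in Secs. 4.1 and 4.2. We will use the notation $y^{\#}_{\ell/r}(\psi)' := U_{\ell/r}(j)\, y^{\#}_{\ell/r}(\bar\psi)\, U_{\ell/r}(j)$.

We also introduce isometries $v_{\ell/r} : L^2(\mathbb{R}, d\beta) \to L^2(\mathbb{R}, dp/|p|)$ defined by
\[ (v_\ell\psi)(p) := \begin{cases} \psi(\log(-p)) & \text{if } p < 0, \\ 0 & \text{if } p \ge 0, \end{cases} \qquad (v_r\psi)(p) := \begin{cases} 0 & \text{if } p \le 0, \\ \psi(\log p) & \text{if } p > 0. \end{cases} \]
It is clear that the map $v : \psi_1 \oplus \psi_2 \in L^2(\mathbb{R}, d\beta) \oplus L^2(\mathbb{R}, d\beta) \mapsto v_\ell\psi_1 + v_r\psi_2 \in L^2(\mathbb{R}, dp/|p|)$ is unitary. Furthermore, to a given $f \in \mathscr{S}(\mathbb{R}^2)$ we associate functions $f_{\ell/r} \in \mathscr{S}(\mathbb{R})$ through
\[ f_\ell(\xi) := \frac{1}{2}\int_{\mathbb{R}} d\xi'\, f\Big(\frac{\xi + \xi'}{2}, \frac{\xi - \xi'}{2}\Big), \qquad f_r(\xi) := \frac{1}{2}\int_{\mathbb{R}} d\xi'\, f\Big(\frac{\xi' + \xi}{2}, \frac{\xi' - \xi}{2}\Big). \tag{4.43} \]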
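If $f = \partial g/\partial x_k$, with $g \in \mathscr{S}(\mathbb{R}^2)$, $k = 0, 1$, a calculation using (2.26), (4.9) shows that
\[ f^{0\pm}(e^\beta) = \frac{(-1)^{k+1}}{\sqrt{2\pi}}\, \hat g_r^\pm(\beta), \qquad f^{0\pm}(-e^\beta) = -\frac{1}{\sqrt{2\pi}}\, \hat g_\ell^\pm(\beta), \qquad \beta \in \mathbb{R}, \tag{4.44} \]
or, equivalently,
\[ f^{0\pm} = -\frac{1}{\sqrt{2\pi}}\big(v_\ell\, \hat g_\ell^\pm + (-1)^k\, v_r\, \hat g_r^\pm\big). \tag{4.45} \]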
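Proposition 4.7. There exists a unitary operator $V : \mathcal{H}_\ell \otimes \mathcal{H}_r \to \mathcal{H}_0$ such that:

(a) For all $\psi \in \mathcal{H}_1$ there holds, on $\mathcal{D}_\ell \otimes \mathcal{D}_r$,
\[ V^* z_0^\dagger(v_\ell\psi) V = y_\ell^\dagger(\psi)' \otimes 1, \tag{4.46} \]
\[ V^* z_0^\dagger(v_r\psi) V = S(\infty)^{N_\ell} \otimes y_r^\dagger(\psi). \tag{4.47} \]

(b) $V^* U_0(x,\theta) V = U_\ell(x_\ell, \theta) \otimes U_r(x_r, -\theta)$, where $x_\ell := x_0 + x_1$, $x_r := x_0 - x_1$ are the left and right light ray components of $x = (x_0, x_1) \in \mathbb{R}^2$.

(c) $V^* U_0(j) V = S(\infty)^{N_\ell \otimes N_r}\,(U_\ell(j) \otimes U_r(j))$.

(d) For every $f \in \mathscr{S}(\mathbb{R}^2)$ such that $f = \partial g/\partial x_k$ with $g \in \mathscr{S}(\mathbb{R}^2)$, $k = 0, 1$, there holds, on $\mathcal{D}_\ell \otimes \mathcal{D}_r$,
\[ V^* \phi_0(f) V = -\frac{1}{\sqrt{2\pi}}\big(\phi'_\ell(g_\ell) \otimes 1 + (-1)^k\, S(\infty)^{N_\ell} \otimes \phi_r(g_r)\big), \tag{4.48} \]
\[ V^* \phi'_0(f) V = -\frac{1}{\sqrt{2\pi}}\big(\phi_\ell(g_\ell) \otimes S(\infty)^{N_r} + (-1)^k\, 1 \otimes \phi'_r(g_r)\big). \tag{4.49} \]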
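Proof. (a) Recalling that $S_0(p, q) = S(\infty) = \pm 1$ for $pq < 0$ (2.11), we see that, for $\psi, \psi' \in \mathcal{H}_1$,
\[ z_0^\dagger(v_r\psi')\, z_0^\dagger(v_\ell\psi) = S(\infty)\, z_0^\dagger(v_\ell\psi)\, z_0^\dagger(v_r\psi'), \qquad z_0(v_r\psi')\, z_0^\dagger(v_\ell\psi) = S(\infty)\, z_0^\dagger(v_\ell\psi)\, z_0(v_r\psi'). \]
Considering then functions $\psi_1, \ldots, \psi_n$, $\psi'_1, \ldots, \psi'_{n'}$, $\chi_1, \ldots, \chi_m$, $\chi'_1, \ldots, \chi'_{m'} \in \mathcal{H}_1$ with $n + m = n' + m'$, one has
\[ \begin{aligned} &\big\langle z_0^\dagger(v_\ell\psi_1) \cdots z_0^\dagger(v_\ell\psi_n)\, z_0^\dagger(v_r\chi_1) \cdots z_0^\dagger(v_r\chi_m)\Omega_0,\; z_0^\dagger(v_\ell\psi'_1) \cdots z_0^\dagger(v_\ell\psi'_{n'})\, z_0^\dagger(v_r\chi'_1) \cdots z_0^\dagger(v_r\chi'_{m'})\Omega_0 \big\rangle \\ &\quad = S(\infty)^{(n+n')m}\, \big\langle z_0(\overline{v_\ell\psi'_{n'}}) \cdots z_0(\overline{v_\ell\psi'_1})\, z_0^\dagger(v_\ell\psi_1) \cdots z_0^\dagger(v_\ell\psi_n)\Omega_0,\; z_0(\overline{v_r\chi_m}) \cdots z_0(\overline{v_r\chi_1})\, z_0^\dagger(v_r\chi'_1) \cdots z_0^\dagger(v_r\chi'_{m'})\Omega_0 \big\rangle \\ &\quad = \delta_{nn'}\delta_{mm'}\, \big\langle z_0^\dagger(v_\ell\psi_1) \cdots z_0^\dagger(v_\ell\psi_n)\Omega_0,\; z_0^\dagger(v_\ell\psi'_1) \cdots z_0^\dagger(v_\ell\psi'_n)\Omega_0 \big\rangle \times \big\langle z_0^\dagger(v_r\chi_1) \cdots z_0^\dagger(v_r\chi_m)\Omega_0,\; z_0^\dagger(v_r\chi'_1) \cdots z_0^\dagger(v_r\chi'_m)\Omega_0 \big\rangle, \end{aligned} \tag{4.50} \]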
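where the second equality follows from the observation that if $n' > n$ (and then $m > m'$), the two vectors in the scalar product vanish, while if $n' < n$, one gets the scalar product of two functions of $n - n' = m' - m$ variables which have supports where all the momenta are positive, respectively, negative. As in (3.6), we have
\[ \begin{aligned} &\big\langle z_0^\dagger(v_\ell\psi_1) \cdots z_0^\dagger(v_\ell\psi_n)\Omega_0,\; z_0^\dagger(v_\ell\psi'_1) \cdots z_0^\dagger(v_\ell\psi'_n)\Omega_0 \big\rangle \\ &\quad = \sum_{\pi \in S_n} \int \frac{dp_1}{|p_1|} \cdots \frac{dp_n}{|p_n|}\, \prod_{j=1}^{n} \Big(\overline{(v_\ell\psi_j)(p_j)}\,(v_\ell\psi'_j)(p_{\pi(j)})\Big) \prod_{\substack{1 \le a < b \le n \\ \pi(a) > \pi(b)}} S_0(p_{\pi(a)}, p_{\pi(b)}) \\ &\quad = \sum_{\pi \in S_n} \int d\beta_1 \cdots d\beta_n\, \prod_{j=1}^{n} \Big(\overline{\psi_j(\beta_j)}\,\psi'_j(\beta_{\pi(j)})\Big) \prod_{\substack{1 \le a < b \le n \\ \pi(a) > \pi(b)}} S(\beta_{\pi(b)} - \beta_{\pi(a)}), \end{aligned} \]
where the last equality follows by the variable change $p_j = -e^{\beta_j}$ and (2.11). If we now perform the further change of variables $\gamma_j = \beta_{\pi(j)}$ and set $\sigma = \pi^{-1}$ we obtain
\[ \begin{aligned} &\big\langle z_0^\dagger(v_\ell\psi_1) \cdots z_0^\dagger(v_\ell\psi_n)\Omega_0,\; z_0^\dagger(v_\ell\psi'_1) \cdots z_0^\dagger(v_\ell\psi'_n)\Omega_0 \big\rangle \\ &\quad = \sum_{\sigma \in S_n} \int d\gamma_1 \cdots d\gamma_n\, \prod_{j=1}^{n} \Big(\psi'_j(\gamma_j)\,\overline{\psi_j(\gamma_{\sigma(j)})}\Big) \prod_{\substack{1 \le a < b \le n \\ \sigma(a) > \sigma(b)}} S(\gamma_{\sigma(a)} - \gamma_{\sigma(b)}) \\ &\quad = \big\langle y_\ell^\dagger(\overline{\psi'_1}) \cdots y_\ell^\dagger(\overline{\psi'_n})\Omega_\ell,\; y_\ell^\dagger(\overline{\psi_1}) \cdots y_\ell^\dagger(\overline{\psi_n})\Omega_\ell \big\rangle \\ &\quad = \big\langle y_\ell^\dagger(\psi_1)' \cdots y_\ell^\dagger(\psi_n)'\,\Omega_\ell,\; y_\ell^\dagger(\psi'_1)' \cdots y_\ell^\dagger(\psi'_n)'\,\Omega_\ell \big\rangle. \end{aligned} \]
A similar (in fact, simpler) calculation shows that
\[ \big\langle z_0^\dagger(v_r\chi_1) \cdots z_0^\dagger(v_r\chi_m)\Omega_0,\; z_0^\dagger(v_r\chi'_1) \cdots z_0^\dagger(v_r\chi'_m)\Omega_0 \big\rangle = \big\langle y_r^\dagger(\chi_1) \cdots y_r^\dagger(\chi_m)\Omega_r,\; y_r^\dagger(\chi'_1) \cdots y_r^\dagger(\chi'_m)\Omega_r \big\rangle. \]
Therefore, we see that the scalar product at the beginning of (4.50) equals
\[ \delta_{nn'}\delta_{mm'}\, \big\langle y_\ell^\dagger(\psi_1)' \cdots y_\ell^\dagger(\psi_n)'\,\Omega_\ell,\; y_\ell^\dagger(\psi'_1)' \cdots y_\ell^\dagger(\psi'_n)'\,\Omega_\ell \big\rangle \times \big\langle y_r^\dagger(\chi_1) \cdots y_r^\dagger(\chi_m)\Omega_r,\; y_r^\dagger(\chi'_1) \cdots y_r^\dagger(\chi'_m)\Omega_r \big\rangle. \]
As the sets
\[ \{ y_\ell^\dagger(\psi_1)' \cdots y_\ell^\dagger(\psi_n)'\,\Omega_\ell \otimes y_r^\dagger(\chi_1) \cdots y_r^\dagger(\chi_m)\Omega_r : \psi_1, \ldots, \chi_m \in \mathcal{H}_1,\; n, m \in \mathbb{N}_0 \}, \]
\[ \{ z_0^\dagger(v_\ell\psi_1) \cdots z_0^\dagger(v_\ell\psi_n)\, z_0^\dagger(v_r\chi_1) \cdots z_0^\dagger(v_r\chi_m)\Omega_0 : \psi_1, \ldots, \chi_m \in \mathcal{H}_1,\; n, m \in \mathbb{N}_0 \} \]
are total in $\mathcal{H}_\ell \otimes \mathcal{H}_r$ and $\mathcal{H}_0$ respectively, the definition
\[ V\, y_\ell^\dagger(\psi_1)' \cdots y_\ell^\dagger(\psi_n)'\,\Omega_\ell \otimes y_r^\dagger(\chi_1) \cdots y_r^\dagger(\chi_m)\Omega_r := z_0^\dagger(v_\ell\psi_1) \cdots z_0^\dagger(v_\ell\psi_n)\, z_0^\dagger(v_r\chi_1) \cdots z_0^\dagger(v_r\chi_m)\Omega_0 \]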
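uniquely determines a unitary operator $V : \mathcal{H}_\ell \otimes \mathcal{H}_r \to \mathcal{H}_0$. Equations (4.46) and (4.47) then follow easily.

(b) We recall that $U_0$ and $U_{\ell/r}$ are second quantized representations, so that, thanks to the definition of $V$, it is sufficient to consider the action of $U_0(x, \theta)$ on $v_\ell\psi$, $v_r\chi$ for $\psi, \chi \in \mathcal{H}_1$. We compute:
\[ (U_0(x,\theta)v_\ell\psi)(p) = e^{i(|p|x_0 - p x_1)}\,(v_\ell\psi)(\cosh\theta\, p - \sinh\theta\, |p|) = e^{-ipx_\ell}\,(v_\ell\psi)(e^\theta p) = (v_\ell U_\ell(x_\ell, \theta)\psi)(p), \tag{4.51} \]
where the second equality follows from the sign properties of $p \mapsto \cosh\theta\, p - \sinh\theta\, |p|$. Similarly $U_0(x,\theta)v_r\chi = v_r U_r(x_r, -\theta)\chi$.

(c) This also follows straightforwardly from (a) thanks to $U_0(j)\, z_0^\dagger(\psi_1) \cdots z_0^\dagger(\psi_n)\Omega_0 = z_0^\dagger(\overline{\psi_n}) \cdots z_0^\dagger(\overline{\psi_1})\Omega_0$ and similar relations for $U_{\ell/r}(j)$.

(d) Using (a), Eq. (4.48) follows by easy computations from the definition of the fields $\phi_0$, $\phi_{\ell/r}$ and Eq. (4.45), while Eq. (4.49) is a consequence of (4.48), of (c) and of the fact that
\[ S(\infty)^{N_\ell \otimes N_r}\,(\phi_\ell(g_\ell) \otimes 1)\,S(\infty)^{N_\ell \otimes N_r} = \phi_\ell(g_\ell) \otimes S(\infty)^{N_r}, \qquad S(\infty)^{N_\ell \otimes N_r}\,(S(\infty)^{N_\ell} \otimes \phi'_r(g_r))\,S(\infty)^{N_\ell \otimes N_r} = 1 \otimes \phi'_r(g_r). \]

Since $\mathcal{H}_{0,1}$ is unitarily equivalent to $\mathcal{H}_{\ell,1} \oplus \mathcal{H}_{r,1}$, the above result can be seen as a generalization to the $S_0$-symmetric Fock space of the classical result on the tensor product decomposition of the symmetric or antisymmetric Fock space built over a direct sum single particle space. Proposition 4.7(d), together with the halfline-locality of the chiral fields, entails in particular that the fields $\phi_0$, $\phi'_0$ are wedge-local. That is, we have proved the commutation relation (2.31) for the case $m = 0$.

We now come to the decomposition of operators on $\mathcal{H}_0$. For an operator $A \in \mathcal{B}(\mathcal{H}_r)$ we define its even/odd parts as
\[ A_{e/o} := \frac{1}{2}\big(A \pm S(\infty)^{N_r} A\, S(\infty)^{N_r}\big), \tag{4.52} \]
and similarly for $A \in \mathcal{B}(\mathcal{H}_\ell)$. Given then von Neumann algebras $\mathcal{R}_{\ell/r}$ on $\mathcal{H}_{\ell/r}$ such that $S(\infty)^{N_{\ell/r}} \mathcal{R}_{\ell/r} S(\infty)^{N_{\ell/r}} = \mathcal{R}_{\ell/r}$, we consider the following twisted tensor product von Neumann algebras:
\[ \mathcal{R}_\ell \,\hat\otimes\, \mathcal{R}_r := \mathcal{R}_\ell \otimes \mathcal{R}_{r,e} + S(\infty)^{N_\ell}\mathcal{R}_\ell \otimes \mathcal{R}_{r,o}, \tag{4.53} \]
\[ \mathcal{R}_\ell \,\check\otimes\, \mathcal{R}_r := \mathcal{R}_{\ell,e} \otimes \mathcal{R}_r + \mathcal{R}_{\ell,o} \otimes S(\infty)^{N_r}\mathcal{R}_r. \tag{4.54} \]
Of course, if $S(\infty) = 1$, then $\mathcal{R}_\ell \,\hat\otimes\, \mathcal{R}_r = \mathcal{R}_\ell \,\check\otimes\, \mathcal{R}_r = \mathcal{R}_\ell \otimes \mathcal{R}_r$, the usual tensor product von Neumann algebras. It can be shown [50] that
\[ (\mathcal{R}_\ell \,\check\otimes\, \mathcal{R}_r)' = (\mathcal{R}_\ell)' \,\hat\otimes\, (\mathcal{R}_r)'. \tag{4.55} \]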
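The even/odd splitting (4.52) and the twist identities quoted in the proof of Proposition 4.7(d) can be checked in a small finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: the truncation to finitely many number eigenstates, the choice $S(\infty) = -1$, the test matrix and all function names are hypothetical and not taken from the paper.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    """Kronecker product; rows are ordered as (row of a, row of b)."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def identity(dim):
    return [[1 if n == m else 0 for m in range(dim)] for n in range(dim)]

def grading(dim):
    """(-1)^N in the basis |0>, ..., |dim-1>; toy stand-in for S(inf)^N with S(inf) = -1."""
    return [[(-1) ** n if n == m else 0 for m in range(dim)] for n in range(dim)]

def even_odd_parts(a, gamma):
    """A_e = (A + gamma A gamma)/2 and A_o = (A - gamma A gamma)/2, cf. (4.52)."""
    conj = matmul(matmul(gamma, a), gamma)
    dim = len(a)
    even = [[(a[i][j] + conj[i][j]) / 2 for j in range(dim)] for i in range(dim)]
    odd = [[(a[i][j] - conj[i][j]) / 2 for j in range(dim)] for i in range(dim)]
    return even, odd

DIM_L, DIM_R = 3, 2                       # assumed truncation dimensions
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]     # arbitrary operator on the left factor
GAMMA_L, GAMMA_R = grading(DIM_L), grading(DIM_R)
A_EVEN, A_ODD = even_odd_parts(A, GAMMA_L)

# Twist (-1)^(N_l N_r): diagonal, entry (-1)^(n_l * n_r), same index order as kron.
TWIST = [[(-1) ** ((row // DIM_R) * (row % DIM_R)) if row == col else 0
          for col in range(DIM_L * DIM_R)] for row in range(DIM_L * DIM_R)]

def conjugate_by_twist(x):
    """Return TWIST x TWIST (TWIST is its own inverse)."""
    return matmul(matmul(TWIST, x), TWIST)
```

In this toy model the even part commutes with the grading and the odd part anticommutes with it, and conjugation by the twist leaves `kron(A_EVEN, identity(DIM_R))` unchanged while turning `kron(A_ODD, identity(DIM_R))` into `kron(A_ODD, GAMMA_R)`, which is the mechanism behind the extra factors $S(\infty)^{N_r}$ in (4.49) and (4.54).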
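The following result will be useful in discussing the splitting of double cone algebras of the two-dimensional theory in the case $S(\infty) = -1$; the analogue for $S(\infty) = 1$ is trivial.

Lemma 4.8. Let $\mathcal{R}^{(i)}_{\ell/r} \subset \mathcal{B}(\mathcal{H}_{\ell/r})$, $i = 1, 2$, be von Neumann algebras such that
\[ (-1)^{N_{\ell/r}}\, \mathcal{R}^{(i)}_{\ell/r}\, (-1)^{N_{\ell/r}} = \mathcal{R}^{(i)}_{\ell/r}, \]
and define $\mathcal{R}_{\ell/r} := \mathcal{R}^{(1)}_{\ell/r} \cap \mathcal{R}^{(2)}_{\ell/r}$, $\bar{\mathcal{R}}_\ell := (-1)^{N_\ell}\mathcal{R}^{(1)}_\ell \cap \mathcal{R}^{(2)}_\ell$, $\bar{\mathcal{R}}_r := \mathcal{R}^{(1)}_r \cap (-1)^{N_r}\mathcal{R}^{(2)}_r$, $\mathcal{R} := (\mathcal{R}^{(1)}_\ell \,\hat\otimes\, \mathcal{R}^{(1)}_r) \cap (\mathcal{R}^{(2)}_\ell \,\check\otimes\, \mathcal{R}^{(2)}_r)$. If $\mathcal{R}_{\ell/r}$ and $\bar{\mathcal{R}}_{\ell/r}$ have trivial odd and even parts, respectively, then $\mathcal{R} = \mathcal{R}_\ell \otimes \mathcal{R}_r + \bar{\mathcal{R}}_\ell \otimes \bar{\mathcal{R}}_r$.

Proof. The von Neumann algebra $\mathcal{R}$ can be decomposed as $\mathcal{R} = \mathcal{R}_{e,e} + \mathcal{R}_{e,o} + \mathcal{R}_{o,e} + \mathcal{R}_{o,o}$ where, denoting by $[\cdot,\cdot]_e$ the commutator and by $[\cdot,\cdot]_o$ the anticommutator,
\[ \mathcal{R}_{i,j} = \{A \in \mathcal{R} : [(-1)^{N_\ell} \otimes 1, A]_i = 0 = [1 \otimes (-1)^{N_r}, A]_j\}, \qquad i, j = e, o. \tag{4.56} \]
Similarly, defining $\mathcal{R}^{(1)} := \mathcal{R}^{(1)}_\ell \,\hat\otimes\, \mathcal{R}^{(1)}_r$, $\mathcal{R}^{(2)} := \mathcal{R}^{(2)}_\ell \,\check\otimes\, \mathcal{R}^{(2)}_r$, one has $\mathcal{R}^{(1)} = \mathcal{R}^{(1)}_e + \mathcal{R}^{(1)}_o$, $\mathcal{R}^{(2)} = \mathcal{R}^{(2)}_e + \mathcal{R}^{(2)}_o$ with respect to the action of $1 \otimes (-1)^{N_r}$ and $(-1)^{N_\ell} \otimes 1$, respectively. It is then clear that $\mathcal{R}_{i,j} = \mathcal{R}^{(1)}_j \cap \mathcal{R}^{(2)}_i$ for $i, j = e, o$. In particular,
\[ \mathcal{R}_{e,e} = (\mathcal{R}^{(1)}_\ell \otimes \mathcal{R}^{(1)}_{r,e}) \cap (\mathcal{R}^{(2)}_{\ell,e} \otimes \mathcal{R}^{(2)}_r) = \mathcal{R}_{\ell,e} \otimes \mathcal{R}_{r,e} = \mathcal{R}_\ell \otimes \mathcal{R}_r. \]
Similarly, $\mathcal{R}_{o,o} = \bar{\mathcal{R}}_\ell \otimes \bar{\mathcal{R}}_r$. In order to get the statement, it is therefore sufficient to show that $\mathcal{R}_{e,o} = \{0\} = \mathcal{R}_{o,e}$. To this end, consider the Tomiyama slice map $E_\omega : \mathcal{B}(\mathcal{H}_\ell \otimes \mathcal{H}_r) \to \mathcal{B}(\mathcal{H}_\ell)$, $\omega \in \mathcal{B}(\mathcal{H}_r)_*$, defined by the fact that $\varphi(E_\omega(A)) = (\varphi \otimes \omega)(A)$ for all $\varphi \in \mathcal{B}(\mathcal{H}_\ell)_*$, $A \in \mathcal{B}(\mathcal{H}_\ell \otimes \mathcal{H}_r)$ [51]. It is then easy to see that if $A \in \mathcal{R}_{o,e} = (\mathcal{R}^{(1)}_\ell \otimes \mathcal{R}^{(1)}_{r,e}) \cap (\mathcal{R}^{(2)}_{\ell,o} \otimes S(\infty)^{N_r}\mathcal{R}^{(2)}_r)$ then $E_\omega(A) \in \mathcal{R}_{\ell,o}$ and therefore $\mathcal{R}_{o,e} = \{0\}$ by hypothesis. Similarly one shows that $\mathcal{R}_{e,o} = \{0\}$.

Given bounded open intervals $I, J \subset \mathbb{R}$ we introduce the double cone
\[ O_{I,J} := \{x \in \mathbb{R}^2 : x_\ell \in I,\; x_r \in J\}. \tag{4.57} \]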
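To make the geometry of (4.57) concrete, the following sketch converts between the coordinates $(x_0, x_1)$ and the light ray components $x_\ell = x_0 + x_1$, $x_r = x_0 - x_1$ of Proposition 4.7(b), tests membership in $O_{I,J}$, and lists the four corners of the double cone. The function names and the sample intervals are illustrative assumptions.

```python
from itertools import product

def light_ray_components(x0, x1):
    """Left and right light ray components x_l = x0 + x1, x_r = x0 - x1."""
    return x0 + x1, x0 - x1

def from_light_ray(x_left, x_right):
    """Inverse map: recover (x0, x1) from (x_l, x_r)."""
    return (x_left + x_right) / 2, (x_left - x_right) / 2

def in_double_cone(x, interval_i, interval_j):
    """True if x = (x0, x1) lies in O_{I,J} for open intervals I and J."""
    x_left, x_right = light_ray_components(*x)
    return (interval_i[0] < x_left < interval_i[1]
            and interval_j[0] < x_right < interval_j[1])

def corners(interval_i, interval_j):
    """Corner points (x0, x1) of the closure of O_{I,J}."""
    return [from_light_ray(xl, xr) for xl, xr in product(interval_i, interval_j)]

# Sample choice I = (-a, 0), J = (0, a) with a = 2.
SAMPLE_I, SAMPLE_J = (-2, 0), (0, 2)
SAMPLE_CORNERS = corners(SAMPLE_I, SAMPLE_J)
```

For this sample choice the corners are $(-1,-1)$, $(0,-2)$, $(0,0)$ and $(1,-1)$, so the double cone has its right spacelike tip at the origin and lies in the region $x_1 < 0$.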
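Proposition 4.9. With $V$ the unitary of Proposition 4.7, and with $\mathcal{M}_{\ell/r}$, $\mathcal{M}'_{\ell/r}$ the von Neumann algebras generated by the fields $\phi_{\ell/r}(f)$, $\phi'_{\ell/r}(g)$ with $\operatorname{supp} f \subset \mathbb{R}_+$, $\operatorname{supp} g \subset \mathbb{R}_-$ respectively, there holds:
\[ V^* \mathcal{M}_0 V = \mathcal{M}'_\ell \,\hat\otimes\, \mathcal{M}_r, \tag{4.58} \]
\[ V^* \mathcal{M}'_0 V = \mathcal{M}_\ell \,\check\otimes\, \mathcal{M}'_r, \tag{4.59} \]
\[ V^* \mathcal{A}_0(O_{I,J}) V = \mathcal{A}_\ell(I) \otimes \mathcal{A}_r(J) + \bar{\mathcal{A}}_\ell(I) \otimes \bar{\mathcal{A}}_r(J), \tag{4.60} \]
where $\bar{\mathcal{A}}_{\ell/r}(a, b) := \alpha^{\ell/r}_a(\mathcal{M}_{\ell/r}) \cap S(\infty)^{N_{\ell/r}}\, \alpha^{\ell/r}_b(\mathcal{M}'_{\ell/r})$, and $\alpha^{\ell/r}_\xi = \operatorname{Ad} U_{\ell/r}(\xi)$.

Proof. We start by showing $V^* \mathcal{M}_0 V \subset \mathcal{M}'_\ell \,\hat\otimes\, \mathcal{M}_r$. First, observe that if $f = \partial g/\partial x_1$ with $g \in \mathscr{S}(\mathbb{R}^2)$, thanks to the fact that the spaces of finite particle vectors $\mathcal{D}_{\ell/r} \subset \mathcal{H}_{\ell/r}$ are cores for $\phi'_\ell(g_\ell)$ and $\phi_r(g_r)$, $\mathcal{D}_0$ is a core for $\phi_0(f)$ and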
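$V\, \mathcal{D}_\ell \otimes \mathcal{D}_r = \mathcal{D}_0$, it follows from (4.48) that
\[ V^* e^{i\sqrt{2\pi}\,\phi_0(f)} V = e^{i(S(\infty)^{N_\ell} \otimes \phi_r(g_r) - \phi'_\ell(g_\ell) \otimes 1)}. \]
If now $\operatorname{supp} g \subset W_L$, one has $\operatorname{supp} g_{\ell/r} \subset \mathbb{R}_\mp$ and then $e^{-i\phi'_\ell(g_\ell) \otimes 1} = e^{-i\phi'_\ell(g_\ell)} \otimes 1 \in \mathcal{M}'_\ell \,\hat\otimes\, \mathcal{M}_r$. Moreover the identity
\[ e^{iS(\infty)^{N_\ell} \otimes \phi_r(g_r)} = 1 \otimes (e^{i\phi_r(g_r)})_e + S(\infty)^{N_\ell} \otimes (e^{i\phi_r(g_r)})_o \tag{4.61} \]
is easily verified on finite particle vectors and entails $e^{iS(\infty)^{N_\ell} \otimes \phi_r(g_r)} \in \mathcal{M}'_\ell \,\hat\otimes\, \mathcal{M}_r$. The desired inclusion is then obtained with the help of the Trotter formula [48, Theorem VIII.31]
\[ e^{i(S(\infty)^{N_\ell} \otimes \phi_r(g_r) - \phi'_\ell(g_\ell) \otimes 1)} = \operatorname*{s-lim}_{n \to \infty} \big(e^{i(S(\infty)^{N_\ell} \otimes \phi_r(g_r))/n}\, e^{-i(\phi'_\ell(g_\ell) \otimes 1)/n}\big)^n, \tag{4.62} \]
and by analogous considerations in the case $f = \partial g/\partial x_0$. Similarly, one gets $V^* \mathcal{M}'_0 V \subset \mathcal{M}_\ell \,\check\otimes\, \mathcal{M}'_r$, but then thanks to (4.55) there holds
\[ V^* \mathcal{M}'_0 V \subset \mathcal{M}_\ell \,\check\otimes\, \mathcal{M}'_r = (\mathcal{M}'_\ell \,\hat\otimes\, \mathcal{M}_r)' \subset V^* \mathcal{M}'_0 V, \tag{4.63} \]
which proves (4.58) and (4.59). In order to show (4.60), we first observe that, thanks to Poincaré covariance, it is sufficient to consider $I = (-a, 0)$, $J = (0, a)$, $a > 0$, so that
\[ \mathcal{A}_\ell(I) = \alpha^\ell_{-a}(\mathcal{M}_\ell) \cap \mathcal{M}'_\ell, \qquad \mathcal{A}_r(J) = \mathcal{M}_r \cap \alpha^r_a(\mathcal{M}'_r), \]
\[ V^* \mathcal{A}_0(O_{I,J}) V = (\alpha^\ell_{-a} \otimes \alpha^r_a)(V^* \mathcal{M}'_0 V) \cap V^* \mathcal{M}_0 V = \big(\alpha^\ell_{-a}(\mathcal{M}_\ell) \,\check\otimes\, \alpha^r_a(\mathcal{M}'_r)\big) \cap \big(\mathcal{M}'_\ell \,\hat\otimes\, \mathcal{M}_r\big), \]
where we used Proposition 4.7(b) and formulas (4.58), (4.59). According to Proposition 4.6, the algebras $\mathcal{A}_\ell(I)$, $\mathcal{A}_r(J)$ have trivial odd parts and if $S(\infty) = -1$, by an analogous statement for the anticommutator, $\bar{\mathcal{A}}_\ell(I)$, $\bar{\mathcal{A}}_r(J)$ have trivial even parts. They thus satisfy the assumptions of Lemma 4.8, which yields (4.60).

This result completely clarifies the split of the massless two-dimensional models into chiral theories, and the influence of the scattering function on this decomposition. We will therefore restrict attention to the chiral theories on the light ray from now on.

5. Local Observables and Conformal Symmetry

The local net $\mathcal{A}$ on the real line, as constructed in Sec. 4, is covariant under the affine group $G$, containing translations and dilations of the light ray. It is a natural question to ask whether this model can be extended to a conformal field theory; that is, whether the net $\mathcal{A}$ can be extended to the one-point compactification of $\mathbb{R}$ (the circle), covariant under an extension of the representation $U$ to the Möbius group $\mathrm{PSL}(2,\mathbb{R}) \supset G$.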
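The inclusion of the affine group into the Möbius group can be made explicit with 2×2 matrices acting on the light ray by fractional linear transformations. The sketch below uses exact rational arithmetic; the parametrization (translations as upper triangular matrices, dilations as diagonal matrices with scale factor equal to the square of the diagonal entry) is the standard one, but its relation to the paper's parameters $(\xi, \lambda)$ of $U$ is an assumption, and all function names are hypothetical.

```python
from fractions import Fraction

def matmul2(m, n):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def mobius(m, xi):
    """Fractional linear action xi -> (a xi + b) / (c xi + d)."""
    (a, b), (c, d) = m
    return (a * xi + b) / (c * xi + d)

def translation(shift):
    """Matrix acting as xi -> xi + shift."""
    return ((Fraction(1), Fraction(shift)), (Fraction(0), Fraction(1)))

def dilation(root_scale):
    """Matrix acting as xi -> root_scale**2 * xi."""
    s = Fraction(root_scale)
    return ((s, Fraction(0)), (Fraction(0), 1 / s))

def special_conformal(c):
    """Matrix acting as xi -> xi / (c xi + 1); not an affine map for c != 0."""
    return ((Fraction(1), Fraction(0)), (Fraction(c), Fraction(1)))

def negate(m):
    """-m, which represents the same element of PSL(2, R)."""
    return tuple(tuple(-x for x in row) for row in m)

AFFINE = matmul2(translation(3), dilation(2))   # acts as xi -> 4 xi + 3
```

Translations and dilations generate exactly the maps $\xi \mapsto s\xi + a$ with $s > 0$, while the additional one-parameter group of special conformal transformations moves the point at infinity, which is why the extension requires the one-point compactification of $\mathbb{R}$.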
The existence of such a conformal extension is a nontrivial question. In the physics literature, conformal symmetry is usually derived from translation-dilation symmetry under the additional (and sometimes implicit) assumption of existence of a local energy density. In our context, however, the energy density is not at our disposal. Without such additional data, dilation symmetry does not in general imply conformal symmetry; counterexamples have been constructed [20]. Thus, we need to exploit other specific properties of the situation at hand in order to obtain conformal extensions. To that end, we first construct a subspace $\mathcal{H}_{\mathrm{loc}} \subset \mathcal{H}$ on which the vacuum is cyclic for the local algebras $\mathcal{A}(I)$.

Lemma 5.1. The subspace $\mathcal{H}_{\mathrm{loc}} := \overline{\mathcal{A}(a,b)\Omega} \subset \mathcal{H}$ is independent of $-\infty \le a < b \le \infty$, and invariant under $U$.

Proof. Given $0 < b < \infty$, we will first show $\overline{\mathcal{A}(0,b)\Omega} = \overline{\mathcal{A}(\mathbb{R})\Omega}$. Let $\Psi \perp \mathcal{A}(0,b)\Omega$. For any $A \in \mathcal{A}(0,b) \subset \mathcal{M}$, we know that $A\Omega \in \operatorname{dom} \Delta^{1/2}$, where $\Delta^{it}$ is the modular group of $(\mathcal{M}, \Omega)$ as before. Thus the function $t \mapsto \langle \Psi, \Delta^{it} A\Omega\rangle$ has an analytic continuation to the strip $S(-\tfrac{1}{2}, 0)$. But since $\Delta^{it}$ acts as a dilation (Theorem 4.3(c)), the function vanishes on the boundary for $t < 0$, and hence everywhere. This shows $\Psi \perp \mathcal{A}(0,b')\Omega$ for any $0 < b' < \infty$. Applying a similar Reeh–Schlieder type argument to the function $\xi \mapsto \langle \Psi, U(\xi)A\Omega\rangle$, using the positivity of the generator of the translation group, we see that $\Psi \perp \mathcal{A}(I)\Omega$ for any finite interval $I \subset \mathbb{R}$. Hence $\Psi \perp \mathcal{A}(\mathbb{R})\Omega$, and we arrive at $(\mathcal{A}(0,b)\Omega)^\perp \subset (\mathcal{A}(\mathbb{R})\Omega)^\perp$. But since $(\mathcal{A}(\mathbb{R})\Omega)^\perp \subset (\mathcal{A}(0,b)\Omega)^\perp$ by isotony, $\overline{\mathcal{A}(0,b)\Omega} = \overline{\mathcal{A}(\mathbb{R})\Omega}$ follows. The latter space is invariant under $U$ by construction of $\mathcal{A}(\mathbb{R})$, which implies the lemma.

The reason for considering $\mathcal{H}_{\mathrm{loc}}$ is that it is the largest space on which we can expect an extension of $\mathcal{A}$ to a net on the circle, and consequently of $U$ to the Möbius group. Namely, if the $\mathcal{A}(I)$ are covariant under such an extension of $U$, one shows by the same methods as above that $\mathcal{H}_{\mathrm{loc}}$ is invariant under the extended representation as well; thus the extended net $\mathcal{A}$ acts on $\mathcal{H}_{\mathrm{loc}}$ with cyclic vacuum vector. It is a noteworthy fact that, after restriction of our net to $\mathcal{H}_{\mathrm{loc}}$, such a conformal extension always exists, as we shall show now.

Theorem 5.2. The representation $U \upharpoonright \mathcal{H}_{\mathrm{loc}}$ extends to a strongly continuous unitary representation of $\mathrm{PSL}(2,\mathbb{R})$ on $\mathcal{H}_{\mathrm{loc}}$, and $I \mapsto \mathcal{A}(I) \upharpoonright \mathcal{H}_{\mathrm{loc}}$ extends to a local net on the circle, conformally covariant under this representation.

Proof. By construction, $\Omega$ is cyclic and separating for $\mathcal{A}(\mathbb{R}_+) \upharpoonright \mathcal{H}_{\mathrm{loc}}$. In fact, the modular group associated with this pair is $\Delta^{it} \upharpoonright \mathcal{H}_{\mathrm{loc}}$, since the modular KMS condition is preserved under the restriction. Hence the translation-dilation covariant net $\mathcal{A} \upharpoonright \mathcal{H}_{\mathrm{loc}}$ has the Bisognano–Wichmann property. Making use of the modular group of the interval algebra $(\mathcal{A}(0,1) \upharpoonright \mathcal{H}_{\mathrm{loc}}, \Omega)$, the extensions of the net and symmetry group now follow from [32, Theorem 1.4].
Thus, the questions of conformal symmetry and the size of the local algebras are intimately connected: The algebras A(a, b) are large if, and only if, the domain Hloc of the extended representation U is large. In particular, for the case S(∞) = −1, Proposition 4.6 already gives us a restriction: All local operators are even. This directly implies: Proposition 5.3. If S(∞) = −1, then Hloc ⊂ He , where He is the space of even particle number vectors. At this point, it is unknown (for general scattering function S) what the actual size of Hloc is; we cannot exclude that it contains just multiples of Ω, and consequently A(I) = C1. In Sec. 6, we will however determine Hloc and the local algebras A(I) explicitly in simple examples of S. 6. Conformal Scaling Limits for Constant Scattering Functions In this section, we illustrate the structure of the local algebras A(I), and of their extension to a conformally covariant theory on Hloc , in the examples of a constant scattering function: S = ±1. The simplest possible case is S = 1. In this case, the Zamolodchikov–Faddeev relations (4.6)–(4.8) are the usual canonical commutation relations for annihilators and creators. In fact, one checks from the definitions that the field φ is nothing else than the free U (1) current, and A(I) the associated local algebras; see, e.g., [19]. It is well known that the vacuum is cyclic for these algebras; thus Hloc = H. In fact, the representation U extends to the well-known representation of the conformal group with central charge c = 1. The first non-trivial example is S = −1. A Euclidean version of the associated massive two-dimensional quantum field theory can be obtained by considering the scaling limit of the spin correlation functions of the two-dimensional Ising model off the critical point [46]. In the context of factorizing S-matrices, this quantum field theoretic model, and in particular its formulation on Minkowski space, is often just referred to as “the Ising model”. 
This model has been investigated from a number of different perspectives. In [54] and previous work cited therein, Schroer and Truong give formulas for associated quantum field operators. In [5], the form factors of one of these fields are calculated; see also [4] for the calculation of the scaling dimension of field operators in the short distance limit. In [38], the existence of local observables in the two-dimensional model, as formulated here in terms of wedge algebras, was proven, and in [21], the model was generalized to higher dimensions, and its local and non-local aspects were discussed. In our context, we are dealing with (a chiral component of) the massless limit of this system, which should hence be related to the Ising model at the critical point. On the field theoretical side, one expects this to be described by a chiral Fermi field, covariant under a representation of the Möbius group with central charge c = 1/2
[45]. In our context, it is not immediately evident that the algebras $\mathcal{A}(I)$ consist of the observables related to a Fermi field. However, we shall show now that this is indeed the case. On the technical side, in the case $S = -1$, our relations (4.6)–(4.8) are canonical anticommutation relations. As a consequence, the "smeared" creation and annihilation operators $y^\dagger(\psi)$, $y(\psi)$ are bounded, namely [14, Proposition 5.2.2]
\[ \|y^\dagger(\psi)\| = \|y(\psi)\| = \|\psi\|_{\mathcal{H}_1}. \tag{6.1} \]
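The norm identity (6.1) is a general feature of the canonical anticommutation relations and can be checked exactly in a finite-mode toy model. The sketch below is an illustration only: it uses a Jordan–Wigner matrix realization of three fermionic modes (an assumption made here for concreteness, not the paper's Fock space construction), a real integer smearing vector, and hypothetical function names.

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]        # parity (-1)^n of a single mode
LOWER = [[0, 1], [0, 0]]     # single-mode annihilator, maps |1> to |0>

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def identity(dim):
    return [[1 if n == m else 0 for m in range(dim)] for n in range(dim)]

def annihilator(k, n_modes):
    """Jordan-Wigner annihilator of mode k: Z x ... x Z x LOWER x I x ... x I."""
    factors = [Z] * k + [LOWER] + [I2] * (n_modes - k - 1)
    return reduce(kron, factors)

def smeared_annihilator(coeffs):
    """Toy analogue of y(psi): sum over modes of coeffs[k] * annihilator(k)."""
    n_modes = len(coeffs)
    terms = [scale(c, annihilator(k, n_modes)) for k, c in enumerate(coeffs)]
    return reduce(add, terms)

COEFFS = [1, 2, 2]                      # real smearing vector, squared norm 9
Y = smeared_annihilator(COEFFS)
Y_DAG = transpose(Y)                    # adjoint, since all entries are real
ANTICOMMUTATOR = add(matmul(Y, Y_DAG), matmul(Y_DAG, Y))
Y_DAG_Y = matmul(Y_DAG, Y)
```

Here `ANTICOMMUTATOR` equals 9 times the identity and `Y` squares to zero, so `Y_DAG_Y` is 9 times a nonzero projection and the operator norm of `Y` is 3, the Euclidean norm of `COEFFS`. In the toy model, as in (6.1), the smeared operators are therefore bounded with norm equal to the norm of the smearing vector.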
This will simplify our arguments considerably. Proposition 4.6 gives only an "upper estimate" for the size of the local algebras $\mathcal{A}(I)$. We now want to determine the size of $\mathcal{A}(I)$ explicitly. In fact, we will show in detail how these algebras are generated by the energy density of a chiral Fermi field. To that end, it is very helpful to introduce a new field operator $\psi$ by
\[ \psi(\xi) := \frac{1}{\sqrt{2\pi}} \int d\beta\, e^{\beta/2} \Big(\sqrt{i}\, e^{ie^\beta \xi}\, y^\dagger(\beta) + \frac{1}{\sqrt{i}}\, e^{-ie^\beta \xi}\, y(\beta)\Big). \tag{6.2} \]
The smeared field $\psi(f)$ is a well-defined bounded operator for any test function $f \in \mathscr{S}(\mathbb{R})$, since the functions $\beta \mapsto e^{-\beta/2} \hat f^\pm(\beta)$ belong to $L^2(\mathbb{R})$. One readily checks that $\psi(f)$ is selfadjoint for real $f$, and transforms covariantly under translations and dilations according to $U(\xi', \lambda)\psi(\xi)U(\xi', \lambda)^{-1} = e^{\lambda/2}\psi(e^\lambda(\xi + \xi'))$. In particular, $\psi$ has scaling dimension $\tfrac{1}{2}$. Using techniques similar to those in [21, Lemma 6.1], we can clarify the relation between $\psi$ and the halfline-local fields $\phi$, $\phi'$.

Proposition 6.1. Let $a < b$, and consider test functions $f \in \mathscr{S}(a,b)$, $g \in \mathscr{S}(b,\infty)$, $h \in \mathscr{S}(-\infty,a)$. Then
\[ \{\psi(f), \phi(g)\} = 0, \qquad [\psi(f), \phi'(h)] = 0. \tag{6.3} \]
The algebra $\mathcal{P}_e(a,b)$ of even polynomials in $\psi$, smeared with test functions having support in $(a,b)$, is a subalgebra of $\mathcal{A}(a,b)$, and we have $\overline{\mathcal{P}_e(a,b)\Omega} = \mathcal{H}_e = \mathcal{H}_{\mathrm{loc}}$. The algebra $\mathcal{P}_e(\mathbb{R})$ acts irreducibly on $\mathcal{H}_e$.

Proof. From the definitions (4.17) and (6.2), we see that we have $\psi(f) = \phi(k)$ if the function $k$ fulfills $\hat k^\pm(\beta) = i^{\mp 1/2} e^{-\beta/2} \hat f^\pm(\beta)$. A short computation shows that such a function $k$ can in fact be found, namely $k = K * f$, where $K$ is the inverse Fourier transform of the distribution $p \mapsto 1/\sqrt{i(p + i0)}$. Due to its analyticity and boundedness properties in Fourier space, $K$ has support in the right half line ([49, Theorem IX.16], see also [21, Lemma 6.1]; note that we use different conventions for the Fourier transform). Thus $\operatorname{supp} k \subset (a, \infty)$. From the relative locality of $\phi$ and $\phi'$, see Proposition 4.2(d), it follows that $[\psi(f), \phi'(h)] = [\phi(k), \phi'(h)] = 0$.
To establish the first relation in (6.3), we compute in the sense of distributions,
\[ \{\psi(\xi), \phi(\xi')\} = \frac{i}{2\pi} \int d\beta\, e^{3\beta/2} \Big(-\sqrt{i}\, e^{ie^\beta(\xi - \xi')} + \frac{1}{\sqrt{i}}\, e^{-ie^\beta(\xi - \xi')}\Big) = -\frac{i^{3/2}}{2\pi} \int_{-\infty}^{\infty} dp\, \sqrt{p + i0}\; e^{-ip(\xi' - \xi)}. \]
As before, this distribution has support only for $\xi - \xi' > 0$, as desired. Now due to (6.3), even polynomials in the field $\psi$, smeared with test functions having support in the interval $(a,b)$, commute with both $\phi(g)$ and $\phi'(h)$. Since all fields involved are bounded operators, this directly implies that any such polynomial is an element of $\mathcal{A}(a,b)$; see Eq. (4.31).

For the proof of the cyclicity statement, let $\mathcal{P}(a,b)$ denote the algebra of all (even and odd) polynomials in $\psi$, smeared with test functions supported in $(a,b)$. Then $\Omega$ is cyclic for $\mathcal{P}(a,b)$. (This follows with arguments as in Proposition 4.2(c); the extra factor $e^{\beta/2}$ in (6.2) can be absorbed in the test functions.) Applying the projector $E_e = \tfrac{1}{2}(1 + (-1)^N)$ onto $\mathcal{H}_e$, we obtain
\[ \mathcal{H}_e = E_e \mathcal{H} = E_e \overline{\mathcal{P}(a,b)\Omega} = \overline{E_e \mathcal{P}(a,b) E_e \Omega} = \overline{\mathcal{P}_e(a,b)\Omega}. \]
Further, $\overline{\mathcal{P}_e(a,b)\Omega} \subset \overline{\mathcal{A}(a,b)\Omega} = \mathcal{H}_{\mathrm{loc}}$. But from Proposition 5.3, we know that $\mathcal{H}_{\mathrm{loc}} \subset \mathcal{H}_e$. Hence $\mathcal{H}_e = \mathcal{H}_{\mathrm{loc}} = \overline{\mathcal{P}_e(a,b)\Omega}$. Irreducibility of $\mathcal{P}_e(\mathbb{R})$ now follows from cyclicity of $\Omega$ and from the spectrum condition for translations by standard arguments [57, Theorem 4-5].

Having seen that the even local algebras $\mathcal{A}(I)$ are non-trivial, we now want to understand their structure more explicitly in terms of local field operators. To this end, we first note that $\psi$ satisfies the anticommutation relation of a free Fermi field,
\[ \{\psi(\xi), \psi(\xi')\} = \frac{1}{2\pi} \int d\beta\, e^\beta \big(e^{ie^\beta(\xi - \xi')} + e^{-ie^\beta(\xi - \xi')}\big) = \delta(\xi - \xi'). \tag{6.4} \]
This observation suggests introducing a normal ordered even field,
\[ T(\xi) := \frac{i}{2} :\!\psi(\xi)\partial_\xi\psi(\xi)\!: \; = \frac{i}{2} \lim_{\xi' \to \xi} \big(\psi(\xi)\partial_{\xi'}\psi(\xi') - \langle \Omega, \psi(\xi)\partial_{\xi'}\psi(\xi')\Omega\rangle\big), \tag{6.5} \]
as a candidate for a local energy density. This limit exists in the sense of matrix elements between vectors from $\mathcal{D}_0$, where $\mathcal{D}_0 \subset \mathcal{D}$ denotes those vectors in which each $n$-particle component is smooth and of compact support. Expressing $T(\xi)$ in terms of creation and annihilation operators, see Eq. (6.8) below, it is also easy to see that $T$ is an operator-valued distribution.

Proposition 6.2. The field $T$ is point-local, relatively local to the algebras $\mathcal{P}_e(a,b)$, transforms covariantly under $U$, and integrates to the generator $H$ of
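translations,
\[ \int_{-\infty}^{\infty} d\xi\, T(\xi) = H, \tag{6.6} \]
where the integral is understood in the sense of matrix elements between vectors from $\mathcal{D}_0$. With central charge $c = \tfrac{1}{2}$, we have the Lüscher–Mack commutation relations,
\[ i[T(\xi), T(\xi')] = -\delta'(\xi - \xi')\,(T(\xi) + T(\xi')) + \frac{c}{24\pi}\, \delta'''(\xi - \xi'). \tag{6.7} \]

Proof. In view of the anticommutation relation (6.4), we also have $\{\psi(\xi), \partial_{\xi'}\psi(\xi')\} = -\delta'(\xi - \xi') \cdot 1$ and $\{\partial_\xi\psi(\xi), \partial_{\xi'}\psi(\xi')\} = -\delta''(\xi - \xi') \cdot 1$, which implies that $T$ is a point-local field. Relative locality to $\mathcal{P}_e$ follows from (6.4) as well. The covariance of $T$ under translations and dilations is clear from its definition; note that $T$ has scaling dimension two. To establish (6.6), we write down the normal ordered product (6.5) in terms of creation and annihilation operators,
\[ T(\xi) = -\frac{1}{4\pi} \int d\beta\, d\gamma\, e^{(\beta + 3\gamma)/2} \Big( i e^{i(e^\beta + e^\gamma)\xi}\, y^\dagger(\beta) y^\dagger(\gamma) + i e^{-i(e^\beta + e^\gamma)\xi}\, y(\beta) y(\gamma) - e^{i(e^\beta - e^\gamma)\xi}\, y^\dagger(\beta) y(\gamma) - e^{-i(e^\beta - e^\gamma)\xi}\, y^\dagger(\gamma) y(\beta) \Big). \tag{6.8} \]
(We read this in the sense of sesquilinear forms on $\mathcal{D}_0 \times \mathcal{D}_0$.) The first two terms, containing two creators and annihilators, respectively, vanish after integration over $\xi$, because they involve exponentials of $\pm i(e^\beta + e^\gamma)\xi$ and the factor $(e^\beta + e^\gamma)$ is strictly positive. Therefore,
\[ \int_{-\infty}^{\infty} d\xi\, T(\xi) = \frac{1}{4\pi} \int_{-\infty}^{\infty} d\xi \int d\beta\, d\gamma\, e^{(\beta + 3\gamma)/2} \Big( e^{i(e^\beta - e^\gamma)\xi}\, y^\dagger(\beta) y(\gamma) + e^{-i(e^\beta - e^\gamma)\xi}\, y^\dagger(\gamma) y(\beta) \Big) = \int d\beta\, e^\beta\, y^\dagger(\beta) y(\beta) = H. \]
In summary, $T$ is a local, translation and dilation covariant field of scaling dimension two that integrates up to $H$ and is relatively local to the net $\mathcal{P}_e$, which acts irreducibly on $\mathcal{H}_e$ (Proposition 6.1). Hence the hypotheses of the Lüscher–Mack theorem [44] are fulfilled (see [31, Theorem 3.1]), and the commutation relation (6.7) follows. The value of the central charge $c$ can then be computed from the vacuum two-point function,
\[ \begin{aligned} \langle \Omega, T(\xi) T(\xi') \Omega\rangle &= -\frac{1}{16\pi^2} \int d\beta_1\, d\beta_2\, d\gamma_1\, d\gamma_2\; e^{(\beta_1 + \beta_2 + 3\gamma_1 + 3\gamma_2)/2} \exp\!\big(-i(e^{\beta_1} + e^{\gamma_1})\xi + i(e^{\beta_2} + e^{\gamma_2})\xi'\big)\, \langle \Omega, y(\beta_1) y(\gamma_1) y^\dagger(\beta_2) y^\dagger(\gamma_2) \Omega\rangle \\ &= \frac{1}{2} \cdot \frac{1}{24\pi} \cdot \frac{1}{2\pi} \int_0^\infty dk\, k^3\, e^{-ik(\xi - \xi')}. \end{aligned} \]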
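The comparison of this two-point function with (6.7) can be spelled out in a few lines. The following is a sketch under the conventions visible in (6.5)–(6.7): the abbreviation $W$ and the variable $x := \xi - \xi'$ are introduced here for illustration, the representation $\delta(x) = \frac{1}{2\pi}\int dk\, e^{-ikx}$ is used, and the vacuum expectation value of $T$ vanishes because of the normal ordering in (6.5).

```latex
\begin{align*}
W(x) &:= \langle \Omega, T(\xi) T(\xi') \Omega \rangle
      = \frac{1}{2} \cdot \frac{1}{24\pi} \cdot \frac{1}{2\pi}
        \int_0^\infty dk\, k^3 e^{-ikx}, \qquad x = \xi - \xi', \\
\langle \Omega, [T(\xi), T(\xi')] \Omega \rangle
     &= W(x) - W(-x)
      = \frac{1}{2} \cdot \frac{1}{24\pi} \cdot \frac{1}{2\pi}
        \int_{-\infty}^{\infty} dk\, k^3 e^{-ikx}
      = -\,i\, \frac{1}{2} \cdot \frac{1}{24\pi}\, \delta'''(x), \\
\langle \Omega, i[T(\xi), T(\xi')] \Omega \rangle
     &= \frac{1}{2} \cdot \frac{1}{24\pi}\, \delta'''(\xi - \xi').
\end{align*}
% The middle step uses \int dk\, k^3 e^{-ikx} = -2\pi i\, \delta'''(x),
% obtained by differentiating \delta(x) = (2\pi)^{-1} \int dk\, e^{-ikx} three times.
```

Since the term proportional to $T(\xi) + T(\xi')$ in (6.7) has vanishing vacuum expectation value, only the $\delta'''$ term survives on the right-hand side of (6.7), and matching coefficients fixes $c/(24\pi) = \tfrac{1}{2} \cdot 1/(24\pi)$.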
Taking the antisymmetric part of this distribution and comparing it with (6.7), we read off $c = \tfrac{1}{2}$.

This identifies the field $T$ as the energy density of a real free chiral Fermi field. Expanding $T$ into its Fourier modes $L_n$, $n \in \mathbb{Z}$, we therefore have a representation of the Virasoro algebra with central charge $\tfrac{1}{2}$ in our chiral net $\mathcal{A}$, and a corresponding subnet $I \mapsto \mathrm{Vir}_{1/2}(I)$ on $\mathcal{H}_e$. This net transforms covariantly under a unitary representation of the Möbius group, with the generator $K$ of the special conformal transformations given by $K = \int d\xi\, \xi^2\, T(\xi)$ [31]. The local algebras are now completely fixed by the following theorem.

Theorem 6.3. In the chiral model with scattering function $S = -1$, the net of local von Neumann algebras is the Virasoro net with central charge $\tfrac{1}{2}$, i.e. $\mathcal{A}(I) = \mathrm{Vir}_{1/2}(I)$ for any interval $I \subset \mathbb{R}$.

Proof. Both nets, $\mathcal{A}$ and $\mathrm{Vir}_{1/2}$, can be restricted to the even subspace $\mathcal{H}_{\mathrm{loc}} = \mathcal{H}_e \subset \mathcal{H}$, and both have the vacuum as a cyclic vector on this space (see Proposition 6.1 regarding cyclicity for $\mathcal{P}_e(I) \subset \mathrm{Vir}_{1/2}(I)$). For every interval $I$, we have $\mathrm{Vir}_{1/2}(I) \subset \mathcal{A}(I)$ by construction, and the same then follows for any subset $I \subset \mathbb{R}$, cf. (4.32). But the Virasoro net on the real line is Haag-dual [36]. Hence
\[ \mathrm{Vir}_{1/2}(I) \subset \mathcal{A}(I) \subset \mathcal{A}(I')' \subset \mathrm{Vir}_{1/2}(I')' = \mathrm{Vir}_{1/2}(I), \]
which implies $\mathrm{Vir}_{1/2}(I) = \mathcal{A}(I)$.

7. Conclusions

In this paper, we have investigated the short-distance scaling limit of (1 + 1)-dimensional models of quantum field theory with a factorizing scattering matrix, for a certain class of two-particle scattering functions $S$. At finite scale, these models are generated by wedge-local field operators depending on $S$ in an explicit manner. Proceeding to scale zero, we showed that this feature is also maintained in the limit, and investigated the limit theories in terms of their generators. As might heuristically be expected, the limit turned out to be a massless, dilation covariant theory which extends (trivially, if $S(\infty) = 1$) a chiral theory.
We were able to establish this fact on the level of local von Neumann algebras: The observable algebras A(O) associated with double cones contain the tensor products of local interval algebras A(I) of the chiral components. For algebras associated with unbounded regions (wedges and half-lines), one obtains a tensor product as well, but with a grading in the case S(∞) = −1. We then investigated in more detail the individual chiral components of the limit theory, which are of interest in their own right. They are translation-dilation-reflection covariant models on the real line; and while they are massless, they are formally very similar to the massive two-dimensional models, viewed in rapidity
space. These theories can be defined on the level of half-line algebras or, by considering intersections, on the level of interval algebras. Our particular approach to the scaling limit via the wedge-local fields has the merit that the computation of the limit is rather easy on the level of the generators, but this comes at the price of an indirect characterization of the local fields and observables of the limit theory. In particular, the nontriviality of the interval algebras A(I) is not guaranteed by our construction. The analysis presented here is thus complementary to other approaches to the short-distance behavior of the models considered, and it is interesting to compare the different procedures. In case the point-local quantum fields contained in the models at finite scale are sufficiently explicit, one might base the scaling limit analysis on these quantities. However, as the S-matrix is taken here as the main input into the construction, for most of the models no Lagrangian formulation or local fields are known. Moreover, even if point-local fields can be constructed, for example by Euclidean perturbation theory, their relation to the real-time S-matrix is very indirect. One can therefore expect a rigorous analysis of the connection between the collision operator on the one hand and the short-distance limit on the other hand to be quite involved with this method. For example, in the Ising model explicit formulas for local fields are available, but have a rather complicated form [46]. By comparison, the S-matrix and wedge-local generators of this model are extremely simple. As we have shown, it is possible to circumvent the construction of the local fields at finite scale, and still completely analyze the corresponding scaling limit theory. Because observables localized in bounded space-time regions are only characterized indirectly in our approach, a detailed comparison with techniques based on local observables is difficult. 
One can expect, however, that the limit of double-cone-local objects would possibly yield fewer (but in no case more) limit points than those obtained when working with wedge-local objects, in some analogy to the scaling limit of charge sectors [25, 24]. In this sense, the limit theory that we compute is maximally large. Another approach to the scaling limit is that of Buchholz and Verch [22]. Here one defines the limit in terms of bounded local operators, and in this sense of more general objects, since unitaries exp iφ(f) and their weak limit points would be included. This might yield a larger limit theory than ours, and indeed, one expects [16] a large center to occur in the limit algebras. Due to technical difficulties in fully describing this central part of the algebras, we have not yet proceeded in this direction. These problems are present even in the free field case, and their complete clarification will probably require a modification of the Buchholz–Verch framework. We hope to return to this point elsewhere. It is not excluded that such a more general approach would yield additional “quantum” observables as well, not only “classical” observables in the center of the algebras. Nevertheless, let us remark that the wedge algebras M0, M0′ that we constructed in the limit theory are Haag-dual, and to this degree maximal; any additional local observables could only be accommodated on an extended Hilbert space.
December 17, J070-S0129055X11004539
2011 9:22 WSPC/S0129-055X
148-RMP
Scaling Limits of Integrable Quantum Field Theories
1153
In the approach chosen here, a central question turned out to be whether the chiral models constituting the scaling limit extend to conformal quantum field theories on the circle, covariant under the Möbius group. We showed that there is indeed always such an extension, namely on the subspace Hloc ⊂ H generated from the vacuum by the local algebras. In this sense, a conformal extension exists if and only if the local algebras are large. As a general feature, we showed that in the case S(∞) = −1, the local subspace Hloc contains only even particle number vectors, and all local operators must be even with respect to the particle number as well. This effect is illustrated by the models with the constant scattering functions S = ±1. In these two cases, we explicitly computed the local algebras of the chiral components. For the free field (S = 1), one obtains the minimal model with conformal charge c = 1, and for the Ising model (S = −1), one obtains the minimal model with c = 1/2. However, in the case of non-constant S, the exact size of the local algebras remains an open question. In fact, our present results do not rule out the possibility that they are trivial in the sense Hloc = CΩ. Another related problem is to clarify the significance of the function S entering in the definition of our chiral models. At finite scale, S directly corresponds to the S-matrix, and its physical interpretation is clear [40]. From a more mathematical point of view, S is an invariant (under unitary equivalence) of the two-dimensional massive nets, and in particular, two models with different scattering function are never equivalent. By comparison, the significance of S is much less understood in the scaling limit, despite it formally being equal to a two-dimensional scattering function. For S(∞) = 1, it seems clear that any formulation of scattering theory of massless two-dimensional models (cf.
[15]) yields a trivial scattering matrix, due to the tensor product structure of the local algebras. In the terminology of [29], we deal with models with trivial left-right scattering, while our scattering function S determines the left-left and right-right scattering. However, on a single light ray or in a single chiral component, scattering theory in the usual sense cannot be formulated, and is not physically meaningful. Furthermore, S is not known to be an invariant of the chiral models. Therefore it is possible that models with different scattering functions, inequivalent at finite scale, become equivalent in the scaling limit. Such an effect would actually be expected for asymptotically free theories, and is supported by results obtained in another approach to massless factorizing scattering: In [60], a model similar to ours, yet with a richer particle spectrum, is analyzed by means of a Thermodynamic Bethe Ansatz. Assuming for a moment that the results of [60] can be transferred to our situation by analogy, one is led to the conjecture that our local net I → A(I) is actually unitarily equivalent to the minimal conformal model with c = 1 (for S(∞) = 1) or c = 1/2 (for S(∞) = −1), irrespective of the details of the function S. This would mean that the interaction described by S vanishes in the scaling limit, and that the limit models are actually complicated reparametrizations of the free Bose or Fermi field. However, the technical arguments of [60] are largely based on thermodynamical considerations and do not directly apply in our context.
This situation would be compatible with our present results as well. A rigorous answer to the question which of the possible scenarios, ranging from trivial local algebras to asymptotically free theories, is realized for which scattering function, would deepen our understanding of the short distance structure of quantum field theory. Further results in this direction will be presented elsewhere.
Acknowledgments
Work on this project originated in many joint discussions with Claudio D’Antoni. His sudden and untimely death in October 2010 robbed us of a friend and collaborator. We dedicate this article to his memory. The work of G.L. is supported by FWF project P22929N16 “Deformations of Quantum Field Theories”, and the work of G.M. is supported in part by the ERC Advanced Grant 227458 “Operator Algebras and Conformal Field Theory”.
References
[1] E. Abdalla, C. Abdalla and K. Rothe, Non-Perturbative Methods in 2-Dimensional Quantum Field Theory (World Scientific, Singapore, 1991).
[2] A. E. Arinshtein, V. A. Fateev and A. B. Zamolodchikov, Quantum S-matrix of the (1 + 1)-dimensional Toda chain, Phys. Lett. B 87 (1979) 389–392.
[3] H. M. Babujian, A. Foerster and M. Karowski, The form factor program: A review and new results - the nested SU(N) off-shell Bethe ansatz, SIGMA 2 (2006) 082, 16pp.
[4] H. M. Babujian and M. Karowski, Towards the construction of Wightman functions of integrable quantum field theories, Int. J. Mod. Phys. A 19 (2004), Suppl., 34–49.
[5] B. Berg, M. Karowski and P. Weisz, Construction of Green’s functions from an exact S matrix, Phys. Rev. D 19 (1979) 2477–2479.
[6] J. J. Bisognano and E. H. Wichmann, On the duality condition for quantum fields, J. Math. Phys. 17 (1976) 303–321.
[7] R. Boas, Entire Functions (Academic Press, New York, 1954).
[8] H.-J. Borchers, The CPT Theorem in two-dimensional theories of local observables, Comm. Math. Phys. 143 (1992) 315–332.
[9] H.-J. Borchers, D. Buchholz and B. Schroer, Polarization-free generators and the S-matrix, Comm. Math. Phys. 219 (2001) 125–140.
[10] H.-J. Borchers and J. Yngvason, Positivity of Wightman functionals and the existence of local nets, Comm. Math. Phys. 127 (1990) 607–615.
[11] H.-J. Borchers and W. Zimmermann, On the self-adjointness of field operators, Nuovo Cimento 31 (1963) 1047–1059.
[12] H. Bostelmann, C. D’Antoni and G. Morsella, Scaling algebras and pointlike fields: A nonperturbative approach to renormalization, Comm. Math. Phys. 285 (2009) 763–798.
[13] H. Bostelmann, C. D’Antoni and G. Morsella, On dilation symmetries arising from scaling limits, Comm. Math. Phys. 294 (2010) 21–60.
[14] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. II (Springer, New York, 1981).
[15] D. Buchholz, Collision theory for waves in two dimensions and a characterization of models with trivial S-matrix, Comm. Math. Phys. 45 (1975) 1–8.
[16] D. Buchholz, Quarks, gluons, colour: Facts or fiction? Nucl. Phys. B 469 (1996) 333–353.
[17] D. Buchholz, C. D’Antoni and R. Longo, Nuclear maps and modular structures. I. General properties, J. Funct. Anal. 88 (1990) 233–250.
[18] D. Buchholz and G. Lechner, Modular nuclearity and localization, Ann. Henri Poincaré 5 (2004) 1065–1080.
[19] D. Buchholz, G. Mack and I. Todorov, The current algebra on the circle as a germ of local field theories, Nucl. Phys. B (Proc. Suppl.) 5 (1988) 20–56.
[20] D. Buchholz and H. Schulz-Mirbach, Haag duality in conformal quantum field theory, Rev. Math. Phys. 2 (1990) 105–125.
[21] D. Buchholz and S. J. Summers, String- and brane-localized causal fields in a strongly nonlocal model, J. Phys. A: Math. Gen. 40 (2007) 2147–2163.
[22] D. Buchholz and R. Verch, Scaling algebras and renormalization group in algebraic quantum field theory, Rev. Math. Phys. 7 (1995) 1195–1239.
[23] D. Buchholz and R. Verch, Scaling algebras and renormalization group in algebraic quantum field theory. II: Instructive examples, Rev. Math. Phys. 10 (1998) 775–800.
[24] C. D’Antoni and G. Morsella, Scaling algebras and superselection sectors: Study of a class of models, Rev. Math. Phys. 18 (2006) 565–594.
[25] C. D’Antoni, G. Morsella and R. Verch, Scaling algebras for charged fields and short-distance analysis for localizable and topological charges, Ann. Henri Poincaré 5 (2004) 809–870.
[26] P. Dorey, Exact S-matrices, preprint (1998); arXiv:hep-th/9810026.
[27] W. Dybalski and Y. Tanimoto, Asymptotic completeness in a class of massless relativistic quantum field theories, Comm. Math. Phys. 305 (2011) 427–440.
[28] L. D. Faddeev, Quantum completely integrable models in field theory, in Mathematical Physics Reviews, Vol. 1 (Harwood Academic, 1980), pp. 107–155.
[29] P. Fendley and H. Saleur, Massless integrable quantum field theories and massless scattering in 1 + 1 dimensions, in Proceedings of the 1993 Summer School on High Energy Physics and Cosmology, eds. E. Gava et al., ICTP Series in Theoretical Physics, Vol. 10 (World Scientific, 1994), pp. 301–332.
[30] J. Fröhlich, Quantized “sine-Gordon” equation with a non-vanishing mass term in two space-time dimensions, Phys. Rev. Lett. 34 (1975) 833–836.
[31] P. Furlan, G. M. Sotkov and I. T. Todorov, Two-dimensional conformal quantum-field theory, Riv. Nuovo Cimento 12 (1989) 1–202.
[32] D. Guido, R. Longo and H. W. Wiesbrock, Extensions of conformal nets and superselection structures, Comm. Math. Phys. 192 (1998) 217–244.
[33] R. Haag, Local Quantum Physics: Fields, Particles, Algebras, 2nd edn. (Springer, Berlin, 1996).
[34] D. Iagolnitzer, Factorization of the multiparticle S-matrix in two-dimensional spacetime models, Phys. Rev. D 18 (1978) 1275–1285.
[35] D. Iagolnitzer, Scattering in Quantum Field Theories (Princeton University Press, Princeton, 1993).
[36] Y. Kawahigashi and R. Longo, Classification of local conformal nets. Case c < 1, Ann. Math. 160 (2004) 493–522.
[37] G. Lechner, Polarization-free quantum fields and interaction, Lett. Math. Phys. 64 (2003) 137–154.
[38] G. Lechner, On the existence of local observables in theories with a factorizing S-matrix, J. Phys. A: Math. Gen. 38 (2005) 3045–3056.
[39] G. Lechner, Towards the construction of quantum field theories from a factorizing S-matrix, Prog. Math. 251 (2007) 175–198.
[40] G. Lechner, Construction of quantum field theories with factorizing S-matrices, Comm. Math. Phys. 277 (2008) 821–860.
[41] G. Lechner, Deformations of quantum field theories and integrable models, to appear in Comm. Math. Phys. (2011); doi: 10.1007/s00220-011-1390-y.
[42] R. Longo and K.-H. Rehren, Local fields in boundary conformal QFT, Rev. Math. Phys. 16 (2004) 909–960.
[43] R. Longo and E. Witten, An algebraic construction of boundary quantum field theory, Comm. Math. Phys. 303 (2011) 213–232.
[44] M. Lüscher and G. Mack, The energy momentum tensor of a critical quantum field theory in 1 + 1 dimensions, unpublished manuscript (1976).
[45] G. Mack and V. Schomerus, Conformal field algebras with quantum symmetry from the theory of superselection sectors, Comm. Math. Phys. 134 (1990) 139–196.
[46] B. M. McCoy, C. A. Tracy and T. T. Wu, Two-dimensional Ising model as an exactly solvable relativistic quantum field theory: Explicit formulas for n-point functions, Phys. Rev. Lett. 38 (1977) 793–796.
[47] M. Müger, Superselection structure of massive quantum field theories in 1 + 1 dimensions, Rev. Math. Phys. 10 (1998) 1147–1170.
[48] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. I: Functional Analysis (Academic Press, New York, 1972).
[49] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. II: Fourier Analysis, Self-Adjointness (Academic Press, New York, 1975).
[50] J. E. Roberts, The structure of sectors reached by a field algebra, in Cargèse Lectures in Physics, Vol. 4 (Gordon and Breach, New York, 1970), pp. 61–78.
[51] S. Sakai, C*-Algebras and W*-Algebras (Springer, Berlin, 1971).
[52] B. Schroer, Modular localization and the bootstrap-formfactor program, Nucl. Phys. B 499 (1997) 547–568.
[53] B. Schroer, Modular wedge localization and the d = 1 + 1 formfactor program, Ann. Phys. 275 (1999) 190–223.
[54] B. Schroer and T. T. Truong, The order/disorder quantum field operators associated with the two-dimensional Ising model in the continuum limit, Nucl. Phys. B 144 (1978) 80–122.
[55] B. Schroer and H. W. Wiesbrock, Modular constructions of quantum field theories with interactions, Rev. Math. Phys. 12 (2000) 301–326.
[56] F. A. Smirnov, Form Factors in Completely Integrable Models of Quantum Field Theory (World Scientific, Singapore, 1992).
[57] R. F. Streater and A. S. Wightman, PCT, Spin and Statistics, and All That (Benjamin, New York, 1964).
[58] E. Titchmarsh, The Theory of Functions, 2nd edn. (Oxford University Press, Oxford, 1939).
[59] A. B. Zamolodchikov and A. B. Zamolodchikov, Factorized S-matrices in two dimensions as the exact solutions of certain relativistic quantum field models, Ann. Phys. 120 (1979) 253–291.
[60] A. B. Zamolodchikov and A. B. Zamolodchikov, Massless factorized scattering and sigma models with topological terms, Nucl. Phys. B 379 (1992) 602–623.
[61] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena (Clarendon Press, Oxford, 1989).
December 17, 2011 9:24 WSPC/S0129-055X 148-RMP J070S0129055X11004540
Reviews in Mathematical Physics Vol. 23, No. 10 (2011) 1157–1159 c World Scientific Publishing Company DOI: 10.1142/S0129055X11004540
REVIEWS IN MATHEMATICAL PHYSICS Author Index Volume 23 (2011)
Adami, R., Cacciapuoti, C., Finco, D. & Noja, D., Fast solitons on star graphs, 4 (2011) 409
Amour, L. & Faupin, J., Hyperfine splitting of the dressed hydrogen atom ground state in non-relativistic QED, 5 (2011) 553
Baraviera, A. T., Cioletti, L. M., Lopes, A. O., Mohr, J. & Souza, R. R., On the general one-dimensional XY model: Positive and zero temperature, selection and non-selection, 10 (2011) 1063
Bény, C., see Hiai, F., 7 (2011) 691
Bostelmann, H., Lechner, G. & Morsella, G., Scaling limits of integrable quantum field theories, 10 (2011) 1115
Brain, S. & van Suijlekom, W. D., The ADHM construction of instantons on noncommutative spaces, 3 (2011) 261
Cacciapuoti, C., see Adami, R., 4 (2011) 409
Cao, P. & Carles, R., Semi-classical wave packet dynamics for Hartree equations, 9 (2011) 933
Carles, R., see Cao, P., 9 (2011) 933
Cattaneo, A. S. & Schätz, F., Introduction to supergeometry, 6 (2011) 669
Cioletti, L. M., see Baraviera, A. T., 10 (2011) 1063
Conrad, F. & Grothaus, M., N/V -limit for Langevin dynamics in continuum, 1 (2011) 1
Dappiaggi, C., Remarks on the Reeh–Schlieder property for higher spin free fields on curved spacetimes, 10 (2011) 1035
De Nittis, G. & Lein, M., Applications of magnetic ΨDO techniques to SAPT, 3 (2011) 233
Dinu, V., Jensen, A. & Nenciu, G., Perturbation of near threshold eigenvalues: Crossover from exponential to non-exponential decay laws, 1 (2011) 83
Eltzner, B. & Gottschalk, H., Dynamical backreaction in Robertson–Walker spacetime, 5 (2011) 531
Faupin, J., Møller, J. S. & Skibsted, E., Regularity of bound states, 5 (2011) 453
Faupin, J., see Amour, L., 5 (2011) 553
Finco, D., see Adami, R., 4 (2011) 409
Fröhlich, J., Griesemer, M. & Sigal, I. M., Spectral renormalization group and local decay in the standard model of non-relativistic quantum electrodynamics, 2 (2011) 179
Gao, H.-J., see Liang, F., 8 (2011) 883
Geisinger, L. & Weidl, T., Sharp spectral estimates in domains of infinite volume, 6 (2011) 615
Gottschalk, H., see Eltzner, B., 5 (2011) 531
Graffi, S. & Zanelli, L., Geometric approach to the Hamilton–Jacobi equation and global parametrices for the Schrödinger propagator, 9 (2011) 969
Griesemer, M., see Fröhlich, J., 2 (2011) 179
Grothaus, M., see Conrad, F., 1 (2011) 1
Hiai, F., Mosonyi, M., Petz, D. & Bény, C., Quantum f -divergences and error correction, 7 (2011) 691
Jensen, A., see Dinu, V., 1 (2011) 83
Koppen, M., Tretter, C. & Winklmeier, M., Simplicity of extremal eigenvalues of the Klein–Gordon equation, 6 (2011) 643
Könenberg, M., Matte, O. & Stockmeyer, E., Existence of ground states of hydrogen-like atoms in relativistic QED I: The semi-relativistic Pauli–Fierz operator, 4 (2011) 375
Landi, G. & Zampini, A., Calculi, Hodge operators and Laplacians on a quantum Hopf fibration, 6 (2011) 575
Lechner, G., see Bostelmann, H., 10 (2011) 1115
Lein, M., see De Nittis, G., 3 (2011) 233
Liang, F. & Gao, H.-J., Explosive solutions of stochastic viscoelastic wave equations with damping, 8 (2011) 883
Lopes, A. O., see Baraviera, A. T., 10 (2011) 1063
Matte, O., see Könenberg, M., 4 (2011) 375
Miyao, T., Self-dual cone analysis in condensed matter physics, 7 (2011) 749
Mohr, J., see Baraviera, A. T., 10 (2011) 1063
Møller, J. S., see Faupin, J., 5 (2011) 453
Morsella, G., see Bostelmann, H., 10 (2011) 1115
Mosonyi, M., see Hiai, F., 7 (2011) 691
Naaijkens, P., Localized endomorphisms in Kitaev’s toric code on the plane, 4 (2011) 347
Nakamura, M., Small global solutions for nonlinear complex Ginzburg–Landau equations and nonlinear dissipative wave equations in Sobolev spaces, 8 (2011) 903
Nenciu, G., see Dinu, V., 1 (2011) 83
Noja, D., see Adami, R., 4 (2011) 409
Ogata, Y. & Rey-Bellet, L., Ruelle–Lanford functions and large deviations for asymptotically decoupled quantum systems, 2 (2011) 211
Ostrovsky, D., On the stochastic dependence structure of the limit lognormal process, 2 (2011) 127
Pankrashkin, K. & Richard, S., Spectral and scattering theory for the Aharonov–Bohm operators, 1 (2011) 53
Petz, D., see Hiai, F., 7 (2011) 691
Rauch, J., Optimal focusing for monochromatic scalar and electromagnetic waves, 8 (2011) 839
Rejzner, K., Fermionic fields in the functional approach to classical field theory, 9 (2011) 1009
Rey-Bellet, L., see Ogata, Y., 2 (2011) 211
Richard, S., see Pankrashkin, K., 1 (2011) 53
Saitō, Y. & Umeda, T., Eigenfunctions at the threshold energies of magnetic Dirac operators, 2 (2011) 155
Schätz, F., see Cattaneo, A. S., 6 (2011) 669
Semiz, İ., All “static” spherically symmetric perfect fluid solutions of Einstein’s equations with constant equation of state parameter and finite polynomial “mass function”, 8 (2011) 865
Sigal, I. M., see Fröhlich, J., 2 (2011) 179
Skibsted, E., see Faupin, J., 5 (2011) 453
Souza, R. R., see Baraviera, A. T., 10 (2011) 1063
Stockmeyer, E., see Könenberg, M., 4 (2011) 375
Tahvildar-Zadeh, A. S., On the static spacetime of a single point charge, 3 (2011) 309
Tretter, C., see Koppen, M., 6 (2011) 643
Umeda, T., see Saitō, Y., 2 (2011) 155
van Suijlekom, W. D., see Brain, S., 3 (2011) 261
Weidl, T., see Geisinger, L., 6 (2011) 615
Winklmeier, M., see Koppen, M., 6 (2011) 643
Yajima, K., Schrödinger equations with time-dependent unbounded singular potentials, 8 (2011) 823
Zampini, A., see Landi, G., 6 (2011) 575
Zanelli, L., see Graffi, S., 9 (2011) 969