March 15, 2005 11:30 WSPC/148-RMP
J070-00228
Reviews in Mathematical Physics Vol. 17, No. 1 (2005) 1–14 © World Scientific Publishing Company
PASSIVITY OF GROUND STATES OF QUANTUM SYSTEMS
WALTER F. WRESZINSKI
Instituto de Física, Universidade de São Paulo, Caixa Postal 66318, 05315-970 São Paulo, Brazil
[email protected]

Received 17 March 2004
Revised 8 December 2004

We consider a quantum system described by a concrete $C^*$-algebra acting on a Hilbert space $\mathcal{H}$ with a vector state $\omega$ induced by a cyclic vector $\Omega$ and a unitary evolution $U_t$ such that $U_t\Omega = \Omega$, $\forall t \in \mathbb{R}$. It is proved that this vector state is a ground state if and only if it is non-faithful and completely passive. This version of a result of Pusz and Woronowicz is reviewed, emphasizing other related aspects: passivity from the point of view of moving observers and stability with respect to local perturbations of the dynamics.

Keywords: Passivity; ground states; moving observers.
In this paper, we shall study properties of the ground state of infinite quantum systems. The results of Theorem 3 apply to any $C^*$-dynamical system and thus also to relativistic quantum field theory (rqft) (see also Remark 1). The subsequent observations on passivity in the reference frame of a moving observer refer, however, to non-relativistic systems.

We suppose that the (a priori) infinite quantum system is characterized by a quasi-local $C^*$-algebra of observables $\mathcal{Q} = \overline{\bigcup_{O}\mathcal{Q}(O)}$, with $\mathcal{Q}(O)$ the $C^*$-algebra associated to a finite region $O$ (of $\mathbb{R}^3$ for non-relativistic systems [2], of Minkowski spacetime for relativistic quantum field theory [9]), and the bar denotes closure in the norm topology of $\mathcal{Q}$; we shall denote the local algebra $\bigcup_{O}\mathcal{Q}(O)$ by $\mathcal{Q}_L$ (we assume that $\mathcal{Q}$ has an identity $1$). Time evolution $\alpha_t$, $t \in \mathbb{R}$, is assumed to be a (norm-continuous) automorphism group of $\mathcal{Q}$. Each state of the system is described by a linear functional $\omega$ on $\mathcal{Q}$, which associates to each $A \in \mathcal{Q}$ the corresponding expectation value $\omega(A) \in \mathbb{C}$ such that $\omega(A^*A) \ge 0$, $\forall A \in \mathcal{Q}$, and $\omega(1) = 1$. By the GNS construction $\omega$ is induced by a cyclic vector $\Omega$, a (physical) separable Hilbert space $\mathcal{H}$ and a representation $\pi_\omega(\mathcal{Q})$ of $\mathcal{Q}$ by bounded operators on $\mathcal{H}$, such that $\omega(A) = (\Omega, \pi_\omega(A)\Omega)$ and $\overline{\pi_\omega(\mathcal{Q})\Omega} = \mathcal{H}$. We assume that $\omega$ is invariant under $\alpha_t$, $t \in \mathbb{R}$. It follows [2] that there exists a strongly continuous unitary group $U_t$, $t \in \mathbb{R}$, implementing $\alpha_t$ and leaving $\Omega$ invariant. By Stone's theorem, $U_t = \exp\{itH\}$, where the self-adjoint $H$ is called the physical Hamiltonian.
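Let $I_\omega$ denote the set of elements $A \in \mathcal{Q}$ such that $\omega(A^*A) = 0$. A state is called faithful if $I_\omega = \{0\}$, non-faithful otherwise. For the purpose of clarity, we shall henceforth identify our $C^*$-algebra with the concrete $C^*$-algebra $\mathcal{A} \equiv \pi_\omega(\mathcal{Q})$ referred to above. Thus,
\[ \alpha_t(A) = U_t A U_t^{-1}, \qquad \forall t \in \mathbb{R},\ \forall A \in \mathcal{A} \tag{1a} \]
with
\[ \|\alpha_t(A)\| = \|A\| \tag{1b} \]
and
\[ U_t\Omega = \Omega,\ \forall t \in \mathbb{R} \iff H\Omega = 0. \tag{1c} \]
The (vector) state $\omega$ on $\mathcal{A}$ is therefore non-faithful if and only if
\[ \exists A \in \mathcal{A}, \text{ with } A \neq 0 \tag{2a} \]
and such that
\[ A\Omega = 0. \tag{2b} \]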
In [3] (see also [4, 5.3.21]) the important notion of passivity was introduced. Let $\delta$ be the infinitesimal generator of $\alpha_t$, i.e., the derivation
\[ \delta(A) \equiv \lim_{t\to 0}\frac{d}{dt}\alpha_t(A), \qquad \forall A \in D(\delta), \]
where $D(\delta)$ denotes the domain of the derivation $\delta$, that is to say, on $\mathcal{H}$, $\delta(A)\psi = i[H,A]\psi$, $\forall\psi$ such that $[H,A]\psi \in \mathcal{H}$. The state $\omega$ is said to be a passive state if
\[ -i\omega(U^*\delta(U)) \ge 0 \tag{3a} \]
for any $U \in \mathcal{U}_0(\mathcal{A}) \cap D(\delta)$, where $\mathcal{U}_0(\mathcal{A})$ denotes the connected component of the identity of the group of unitaries of $\mathcal{A}$ with the uniform topology. One may also take the weak closure $\mathcal{A}''$ and identify our algebra as a von Neumann algebra $\mathcal{M}$. The system is thus passive if for all unitaries $U$ in the norm-connected component of $\mathcal{M}$ which contains the identity, such that $[H,U] \in \mathcal{M}$,
\[ -(U\Omega, [H,U]\Omega) \le 0. \tag{3b} \]
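The expression entering the passivity condition can be rewritten as an energy expectation value. The short computation below is a sketch, not part of the original argument; it assumes only $\delta(U) = i[H,U]$ on $\Omega$, the invariance $H\Omega = 0$ from (1c), and an inner product that is linear in its second argument.

```latex
\begin{align*}
-i\,\omega\bigl(U^*\delta(U)\bigr)
  &= -i\,\bigl(\Omega,\;U^*\,i[H,U]\,\Omega\bigr)
   = \bigl(U\Omega,\;[H,U]\,\Omega\bigr) \\
  &= (U\Omega,\;HU\Omega) - (U\Omega,\;UH\Omega)
   = (U\Omega,\;HU\Omega).
\end{align*}
```

Read this way, (3b) states that $(U\Omega, HU\Omega) \ge 0 = (\Omega, H\Omega)$: no unitary in the identity component lowers the mean energy of the state.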
Condition $[H,U] \in \mathcal{M}$ is equivalent to $U \in D(\delta)$ in (1) because, given the self-adjointness of $H$, there exists a core $D$ for $H$ such that the sesquilinear form $\psi, \phi \in D \times D \mapsto i(H\psi, U\phi) - i(\psi, UH\phi)$ is bounded [4, Proposition 3.2.55]. The physical importance of (3a) is due to [3, p. 280] or [4, Theorem 5.4.28]: $\omega$ is a passive state if the work performed on the system due to a cyclic change of the external conditions is positive, which is related to the second law of thermodynamics. Let $h \in \mathcal{A}$ be a bounded time-dependent self-adjoint operator, strongly continuously differentiable, such that $h(t) = 0$ for $t \notin [0,T]$, $T > 0$, and $\tau_h > 0$ be
the smallest $T$ satisfying this condition. Then the unitaries in (3a) may be interpreted as propagators $\tilde U(\tau_h)$, with
\[ i\frac{\partial}{\partial t}\tilde U(t) = (H + h(t))\,\tilde U(t), \tag{4} \]
with $[H, h(t)] \in \mathcal{A}$, $\forall t \in \mathbb{R}$, and the work $L_h$ performed on the system may be seen to be equal to
\[ -L_h = -(\tilde U(\tau_h)\Omega,\ H\tilde U(\tau_h)\Omega). \tag{5} \]
The passivity condition (3) is related to entropy production and Carnot's version of the second law of thermodynamics; see [5] for an excellent review.

Thermal equilibrium states are characterized by the KMS condition. Let $\beta > 0$ denote the inverse temperature. A state $\omega$ is $(\alpha,\beta)$-KMS if, for all $A, B \in \mathcal{A}$, there exists a function $F_{A,B}$ analytic inside the strip $\{z \mid 0 < \operatorname{Im} z < \beta\}$, bounded and continuous on its closure, and satisfying the KMS boundary conditions
\[ F_{A,B}(t) = \omega(A\alpha_t(B)), \qquad F_{A,B}(t + i\beta) = \omega(\alpha_t(B)A) \tag{6} \]
for all $t \in \mathbb{R}$. A KMS state is $\alpha_t$-invariant. The $C^*$-dynamical system $(\mathcal{A}, \alpha, \omega)$, where $\omega$ is an $(\alpha,\beta)$-KMS state, is said to describe a physical system in thermal equilibrium at temperature $1/\beta$, a temperature state. This is motivated by the fact that (6) is satisfied by the thermodynamic limit [2] of Gibbs states [2]. In a similar way, one may define a ground state as a thermodynamic limit [2] of ground states. It is mathematically convenient to consider ground states as $(\beta = +\infty)$-KMS states, whence the notation $\omega_\infty$. An alternative definition (see [4, Definition 5.3.18] and [2, Sec. 3]), which is satisfied by the thermodynamic limit of ground states and thus similarly motivated, is

Definition 1. $\omega_\infty$ is an $\alpha_t$-ground state if
\[ -i\omega_\infty(A^*\delta(A)) \ge 0, \qquad \forall A \in D(\delta). \tag{7} \]

Proposition 1. A state is a ground state if it is $\alpha_t$-invariant and satisfies the spectrum condition
\[ H \ge 0. \tag{8} \]

Proof. See [4, Proposition 5.3.19] or [2, Theorem 3.4].

By the Zeroth Law of Thermodynamics, if a system in thermodynamic equilibrium is coupled to an identical system in thermodynamic equilibrium and at the same temperature, the combined system is also in thermodynamic equilibrium, and this should also hold if an arbitrary number ($N$) of identical copies is coupled. We thus arrive at
Definition 2. A state $\omega$ is completely passive if and only if the product state, defined by
\[ \mathcal{A}^{\otimes N} \ni A_1 \otimes A_2 \otimes \cdots \otimes A_N \mapsto \omega(A_1)\,\omega(A_2)\cdots\omega(A_N), \]
is passive, i.e., satisfies (3).

Note that for a composed system of $N$ copies the corresponding operators $H_N$ and $\tilde\Delta_N$ are of the form
\[ H_N = H \otimes 1 \otimes \cdots \otimes 1 + \cdots + 1 \otimes \cdots \otimes H, \qquad \tilde\Delta_N = \tilde\Delta \otimes \tilde\Delta \otimes \cdots \otimes \tilde\Delta. \]
It is of interest and importance to extend the KMS property to the von Neumann algebra $\mathcal{M} = \mathcal{A}''$. Let $\tilde\omega$ denote the extension of $\omega$ to $\mathcal{A}''$. If $R \in \mathcal{A}''$, then $\tilde\omega(R) = (\Omega, R\Omega)$. The restriction of $\tilde\omega$ to $\mathcal{A}$ is $\omega$, in fact
\[ \tilde\omega(A) = (\Omega, A\Omega) = \omega(A). \tag{9} \]
If $\omega$ is invariant for $\alpha_t$, then $\alpha_t$ may be extended to an automorphism of the von Neumann algebra $\mathcal{A}''$. If $R \in \mathcal{A}''$, then $\tilde\alpha_t(R) = U_t R U_t^{-1}$, where $U_t$ is the unitary operator which implements $\alpha_t$. The restriction of $\tilde\alpha_t$ to $\mathcal{A}$ is $\alpha_t$, since for $A \in \mathcal{A}$, $\tilde\alpha_t(A) = U_t A U_t^{-1} = \alpha_t(A)$. The structure $(\mathcal{M}, \tilde\alpha, \tilde\omega)$ is called a $W^*$-dynamical system. We have [2, Theorem 4.12]:

Theorem 1. If $\beta < \infty$, and $\omega$ is a $\beta$-KMS state with respect to $\alpha$, then $\tilde\omega$ is faithful, and a $\beta$-KMS state with respect to $\tilde\alpha$.

Hence a temperature state is faithful. This is the separating character of a KMS state. It plays an essential role in the Tomita–Takesaki theory [2, 4]. The situation for a ground state $\omega_\infty$ is quite different. By (2), non-faithfulness of the ground state is equivalent to the existence of annihilators of $\Omega$ in $\mathcal{A}$. We shall see that the ground state of non-relativistic quantum systems, as well as the vacuum of rqft, is always non-faithful.

For the following proposition, we need a few concepts of the Tomita–Takesaki theory. Let $\tilde\omega_\beta$ ($\beta < \infty$) be a temperature state on $\mathcal{M}$, induced by a vector $\Omega$. By Theorem 1, $\tilde\omega_\beta$ is faithful, thus $\Omega$ is separating with respect to $\mathcal{M}$, i.e., given $A \in \mathcal{M}$, $A\Omega = 0 \Rightarrow A = 0$. Therefore the map
\[ A\Omega \to A^*\Omega, \qquad A \in \mathcal{M} \tag{10} \]
defines an (unbounded) anti-linear operator $I$ on the dense domain $\mathcal{M}\Omega$:
\[ I A\Omega = A^*\Omega; \qquad \forall A \in \mathcal{M}. \tag{11} \]
This operator is closable: suppose there exists a sequence $\{A_n\}$, $A_n \in \mathcal{M}$, such that $\lim_{n\to\infty} A_n\Omega = 0$ and $\lim_{n\to\infty} I A_n\Omega = \psi$. Let $X \in \mathcal{M}'$, the commutant of $\mathcal{M}$.
Then
\[ (\psi, X\Omega) = \lim_{n\to\infty}(I A_n\Omega, X\Omega) = \lim_{n\to\infty}(A_n^*\Omega, X\Omega) = \lim_{n\to\infty}(X^*\Omega, A_n\Omega) = 0. \tag{12} \]
The fact that $\Omega$ is separating for $\mathcal{M}$ and hence cyclic for $\mathcal{M}'$ means that the set $\{X\Omega,\ X \in \mathcal{M}'\}$ is dense in $\mathcal{H}$, and thus $\psi = 0$. Similarly, the operator $I'$ defined by
\[ I' X\Omega = X^*\Omega; \qquad \forall X \in \mathcal{M}' \tag{13} \]
is also closable. Let $S$ denote the closure of $I$ (respectively $S'$ the closure of $I'$). It may be proved [2, 4] that
\[ S' = S^*. \tag{14} \]
Since $S$ is a closed, anti-linear operator it has a polar decomposition
\[ S = J\Delta^{1/2} \tag{15} \]
where $\Delta$ is a positive self-adjoint operator, called the modular operator, and $J$ is an anti-linear isometric operator with initial domain the closure of the range of $S^*$ and final domain the closure of the range of $S$. The ranges of $S$ and $S^* = S'$ are, however, dense in $\mathcal{H}$, so that $J$ is an anti-unitary operator, which is also an involution: $J^2 = 1$ because $I^2 = 1$. The Tomita–Takesaki theorem [8] states:
\[ \Delta^{it}\mathcal{M}\Delta^{-it} = \mathcal{M}, \qquad \forall t \in \mathbb{R} \tag{16} \]
\[ J\mathcal{M}J = \mathcal{M}'. \tag{17} \]
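For orientation, the objects $S$, $J$ and $\Delta$ can be written down explicitly in a finite-dimensional toy model, which is not part of the original text. The sketch below assumes the algebra of all $3\times 3$ matrices with the faithful state $\omega(A) = \mathrm{Tr}(\rho A)$, realized by left multiplication on the Hilbert–Schmidt space with cyclic and separating vector $\Omega = \rho^{1/2}$; in this standard realization $\Delta^{1/2}X = \rho^{1/2}X\rho^{-1/2}$ and $JX = X^*$. The names `S`, `J`, `delta_half`, `omega` and the eigenvalues chosen for $\rho$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Faithful density matrix rho (diagonal for simplicity); Omega = rho^{1/2}.
p = np.array([0.5, 0.3, 0.2])
rho = np.diag(p)
rho_half = np.diag(np.sqrt(p))
rho_minus_half = np.diag(1.0 / np.sqrt(p))
rho_inv = np.diag(1.0 / p)

def S(X):
    """Tomita operator: sends A Omega to A* Omega, with A recovered as X rho^{-1/2}."""
    A = X @ rho_minus_half
    return A.conj().T @ rho_half

def delta_half(X):
    """Square root of the modular operator: X -> rho^{1/2} X rho^{-1/2}."""
    return rho_half @ X @ rho_minus_half

def J(X):
    """Modular conjugation: the adjoint on Hilbert-Schmidt space."""
    return X.conj().T

def omega(A):
    """The state omega(A) = Tr(rho A)."""
    return np.trace(rho @ A)

A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# Polar decomposition S = J Delta^{1/2}, checked on the vector A Omega.
X = A @ rho_half
polar_ok = np.allclose(S(X), J(delta_half(X)))

# KMS-type relation: omega(A rho B rho^{-1}) = omega(B A).
kms_ok = np.isclose(omega(A @ rho @ B @ rho_inv), omega(B @ A))
```

Here $B \mapsto \rho B\rho^{-1}$ is the continuation of the group $B \mapsto \rho^{it}B\rho^{-it}$ to $t = -i$, so the second check is a finite-dimensional counterpart of a KMS boundary condition for the modular dynamics.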
The group of automorphisms $A \to A_t = \Delta^{it}A\Delta^{-it}$ is the modular automorphism group and $J$ is the modular conjugation of $\mathcal{M}$. With respect to the dynamics (16), the state $\tilde\omega_\beta$ satisfies the KMS condition, and for each faithful normal state $\tilde\omega_\beta$ of a von Neumann algebra, there exists a unique one-parameter group with respect to which the state satisfies the KMS condition [2, Theorem 7.2].

For a ground state $\tilde\omega_\infty$ we have the following known result (see [3, Theorem 1.4 and p. 287, case 2]; see also [7, Proposition 3.2], a generalization where $H$ is replaced by $H + \vec u\cdot\vec P$), whose proof is presented in order to clarify the main ideas in a more specific context.

Proposition 2. If an invariant state $\omega$ associated to a $W^*$-dynamical system is non-faithful and completely passive, it is a ground state. The same conclusion holds if $\omega$ is associated to a $C^*$-dynamical system.

Proof. We first consider $W^*$-dynamical systems. Since $\Omega$ is cyclic with respect to $\mathcal{M}$, it is separating with respect to $\mathcal{M}'$. The projection operator $E_G$ onto the closed subspace $G \equiv \overline{\mathcal{M}'\Omega}$ is easily seen to be an element of $\mathcal{M}$.
Thus the algebra
\[ E_G\mathcal{M}E_G \equiv \{E_G M E_G;\ M \in \mathcal{M}\} \tag{18} \]
is a von Neumann subalgebra of $\mathcal{M}$, and with respect to the von Neumann algebra
\[ \mathcal{N} = \{E_G M|_G : M \in \mathcal{M}\} \tag{19} \]
of operators in the Hilbert space $G$, $\Omega$ is both cyclic and separating. It may also be checked that the representation $U$ maps $G$ and $G^\perp$ onto themselves, that it strongly commutes with $E_G$ and thus implements automorphisms of $\mathcal{N}$. From $U_t\mathcal{N}U_t^* = \mathcal{N}$ and $U_t\Omega = \Omega$, one finds for all $A \in \mathcal{N}$,
\[ U_t S U_t^* A\Omega = U_t S (U_t^* A U_t)\Omega = U_t (U_t^* A U_t)^*\Omega = A^*\Omega = SA\Omega \tag{20} \]
where $S$ is the closure of the operator (11) associated to the von Neumann algebra $\mathcal{N}$. Thus
\[ U_t S U_t^* = S. \tag{21} \]
Let $\Delta$ be the modular operator of $\mathcal{N}$ and $\Omega$. By (15), (21) and the uniqueness of the polar decomposition (see, e.g., [7, Appendix A]), it follows that the positive self-adjoint operator
\[ \tilde\Delta \equiv \Delta E_G \tag{22} \]
strongly commutes with $H$, so that one may consider the joint spectrum
\[ \sigma_{H,\tilde\Delta} \tag{23} \]
of $H$ and $\tilde\Delta$. If $A \in \mathcal{M}$, then $B \equiv AE_G$ also belongs to $\mathcal{M}$. The inequality [3]
\[ (A\Omega, He^{-H^2}A\Omega) + (A^*\Omega, He^{-H^2}A^*\Omega) \ge 0 \tag{24} \]
holds. Inserting $B$ into the above inequality, one obtains as in [3]
\[ -H(1 - \tilde\Delta)e^{-H^2} \le 0 \tag{25} \]
from which it follows that
\[ \sigma_{H,\tilde\Delta} \subseteq \sigma_{\eta,\delta} \equiv \{(\eta,\delta) \in \mathbb{R}\times\mathbb{R}^{+} : -\eta(1-\delta) \le 0\}. \tag{26} \]
The points in $\sigma_{\eta,\delta}$ of the form $(\eta, 0)$ satisfy
\[ -\eta \le 0. \tag{27} \]
Now comes the crux of the argument: the spectrum $\sigma_{H,\tilde\Delta}$ contains at least one such point: since $\Omega$ is not separating with respect to $\mathcal{M}$, while it is separating with respect to $\mathcal{N}$ by construction, one has $\mathcal{M} \neq \mathcal{N}$ and $G \neq \mathcal{H}$. As $\tilde\Delta$ annihilates all elements of $G^\perp \neq \{0\}$, the elements of $G^\perp$ are zero eigenvectors of $\tilde\Delta$. The representation $U$ maps $G^\perp$ into itself, hence $H_{G^\perp}$ is a self-adjoint operator on $G^\perp$, whose spectral projections are restrictions of the corresponding spectral projections of $H$, implying that $\sigma_{H,\tilde\Delta}$ contains some point of the form $(\eta, 0)$. We now show that, together with the assumption of complete passivity, all points of $\sigma_{H,\tilde\Delta}$
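must satisfy (27), in spite of the fact that they are not all of the form $(\eta, 0)$: take $(\eta, 0) \in \sigma_{H,\tilde\Delta}$, and let $(\eta', \delta')$ be any point in $\sigma_{H,\tilde\Delta}$ which violates (27). By complete passivity of $\omega$,
\[ (\eta + n\eta',\ 0\cdot\delta'^{\,n}) \in \sigma_{\eta,\delta}, \qquad \forall n \in \mathbb{N} \]
whence
\[ -(\eta + n\eta') \le 0, \qquad \forall n \in \mathbb{N}. \tag{28} \]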
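The origin of the point $(\eta + n\eta',\ 0\cdot\delta'^{\,n})$ may be spelled out as follows. This is a sketch which assumes, as in the note after Definition 2, that the $N$-copy system has $H_N$ given by the sum and $\tilde\Delta_N$ by the tensor product of the one-copy operators, and that the inclusion (26) is applied to the system of $n+1$ copies; the symbol $\sigma_{H_{n+1},\tilde\Delta_{n+1}}$ is introduced here only for illustration.

```latex
\begin{align*}
(\eta,0),\ (\eta',\delta') \in \sigma_{H,\tilde\Delta}
 \;&\Longrightarrow\;
 \bigl(\eta + n\eta',\; 0\cdot\delta'^{\,n}\bigr) \in \sigma_{H_{n+1},\tilde\Delta_{n+1}}
 \quad\text{(energies add, eigenvalues of } \tilde\Delta \text{ multiply)},\\
\text{(26) for } n+1 \text{ copies}
 \;&\Longrightarrow\;
 -(\eta+n\eta')\,(1-0) \le 0 .
\end{align*}
```

The single factor with $\delta = 0$ is what removes any dependence on $\delta'$ from the resulting inequality.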
Choosing $n$ sufficiently large in (28), one arrives at a contradiction with the assumption $-\eta' > 0$. Hence (27) holds for all elements of $\sigma_{H,\tilde\Delta}$. By a simple argument (see, e.g. [7, Lemma B.2]), this fact implies that (27) holds for all $\eta \in \sigma_H$, whence (8) holds by the spectral theorem. By assumption $\omega$ is invariant, and thus, by Proposition 1 it is a ground state. If $\omega$ is associated to a $C^*$-dynamical system, the same conclusion holds because the extension (9) is invariant, completely passive and non-faithful, and thus the preceding proof may be applied.

One of the main results of [3] may now be stated in the following form:

Theorem 2. Let the dynamics $\tilde\alpha$ be non-trivial, i.e., $H \neq 0$. Then $\tilde\omega$ is non-faithful on $\mathcal{M}$ and completely passive if and only if it is a ground state.

Proof. By the main result of [3, Theorem 1.4 and Remark], $\omega$ is completely passive if and only if $\omega$ is a $\beta$-KMS state ($\beta \ge 0$) or $\omega$ is a ground state ($H \ge 0$). Thus, by [3, p. 287, case 1], if $\omega$ is completely passive, there are two possibilities: (a) $\omega$ is faithful on $\mathcal{M}$, in which case either (i) $\omega$ is a $\beta$-KMS state ($\beta \ge 0$) or (ii) the dynamics $\tilde\alpha$ is trivial ($H = 0$); or, by Proposition 2: (b) $\omega$ is non-faithful on $\mathcal{M}$, in which case $\omega$ is a ground state ($H \ge 0$). When $H = 0$, the state $\omega$ is a ground state and thus, excluding (ii) by hypothesis, there remains the assertion of the theorem.
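A finite-dimensional toy model, not taken from the original text, may help to fix ideas. The sketch below assumes a three-level system with the full matrix algebra, $\Omega$ the first basis vector and a diagonal Hamiltonian with $H\Omega = 0$; the helper names `random_unitary` and `energy_after_kick` and all numerical values are hypothetical. It checks single-copy passivity only: $(U\Omega, HU\Omega) \ge 0$ on sampled unitaries when $H \ge 0$, the failure of this bound when $H$ has a negative eigenvalue, and the existence of a non-zero annihilator of $\Omega$.

```python
import numpy as np

def random_unitary(n, rng):
    """Unitary Q factor from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

def energy_after_kick(H, omega, U):
    """Mean energy (U omega, H U omega) after applying the unitary U."""
    psi = U @ omega
    return float(np.real(np.vdot(psi, H @ psi)))

rng = np.random.default_rng(0)
omega = np.array([1.0, 0.0, 0.0], dtype=complex)   # invariant vector, H omega = 0

H_ground = np.diag([0.0, 1.0, 2.5])       # H >= 0: spectrum condition (8) holds
H_not_ground = np.diag([0.0, -1.0, 2.5])  # violates H >= 0

# Passivity check for the ground state on sampled unitaries.
min_energy = min(
    energy_after_kick(H_ground, omega, random_unitary(3, rng)) for _ in range(200)
)

# A swap of the first two levels extracts energy when H has a negative eigenvalue.
U_swap = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]], dtype=complex)
lowered = energy_after_kick(H_not_ground, omega, U_swap)

# Non-faithfulness: a non-zero element of the algebra annihilating omega, cf. (2).
annihilator = np.diag([0.0, 1.0, 1.0])
```

Since the unitary group of a full matrix algebra is connected, every sampled unitary lies in the identity component required in (3).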
Non-faithfulness of $\tilde\omega$ on $\mathcal{M}$ does not imply non-faithfulness on $\mathcal{A}$, and thus non-faithfulness on $\mathcal{A}$ is a property which is stronger than non-faithfulness on $\mathcal{M}$. A proof of this deeper property was sketched in [1], but it may be useful to present a detailed and complete version (see also [23, 24] for related ideas) here:

Proposition 3. A ground state $\omega$ is non-faithful on $\mathcal{A}$.

Proof. Let $\tilde f$ denote a real-valued infinitely differentiable function of compact support, with
\[ \operatorname{supp}\tilde f = N_{\lambda_0} \tag{29} \]
with $\lambda_0 > 0$ and
\[ N_{\lambda_0} = [\lambda_0 - \epsilon,\ \lambda_0 + \epsilon], \qquad \epsilon < \lambda_0. \tag{30} \]
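With the above choice
\[ N_{\lambda_0} \cap \{0\} = \emptyset. \tag{31} \]
We also assume that
\[ \tilde f(\lambda) > 0 \quad\text{if } \lambda \in (\lambda_0 - \epsilon,\ \lambda_0 + \epsilon). \tag{32} \]
By the above assumptions,
\[ f(t) \equiv \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} d\lambda\,\tilde f(\lambda)e^{-i\lambda t} \tag{33} \]
satisfies
\[ f \in L^1(\mathbb{R}). \tag{34} \]
By (34), the strong integral
\[ U_f \equiv \sqrt{2\pi}\int_{-\infty}^{\infty} dt\, f(t)e^{iHt} = \int dE(\lambda)\,\tilde f(\lambda), \tag{35} \]
where $\{E(\lambda)\}$ are the spectral projections associated to $H$, is defined everywhere on $\mathcal{H}$. By (32) and (35), $U_f \neq 0$ and thus $\exists A \in \mathcal{A}$, $A \neq 0$, such that
\[ U_f A\Omega \neq 0, \tag{36} \]
as $\Omega$ is cyclic for $\mathcal{A}$. Let (note that $\alpha_t(A)$ is defined by (1a)):
\[ A(f) \equiv \sqrt{2\pi}\int dt\, f(t)\alpha_t(A) \tag{37} \]
which exists as a strong integral on $\mathcal{H}$ by (1b) if $f$ satisfies (34). By (37), the definition (35) of $U_f$, (1c) and (36),
\[ U_f A\Omega = A(f)\Omega \neq 0 \tag{38} \]
from which it follows that
\[ A(f) \neq 0. \tag{39} \]
Thus
\[ A(f)^* \equiv \sqrt{2\pi}\int_{-\infty}^{\infty} dt\, f^*(t)\alpha_t(A^*) \tag{40} \]
is also such that
\[ A(f)^* \neq 0 \tag{41} \]
where $f^*(t)$ denotes the complex conjugate of $f$.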
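The effect of the smearing (37) is easiest to see when $H$ has discrete spectrum; the following toy computation is an illustration under that assumption and is not part of the original proof. In an eigenbasis $He_j = E_je_j$, inserting (33) into (37) gives the matrix elements $A(f)_{ij} = 2\pi\,\tilde f(E_i - E_j)\,A_{ij}$, so $A(f)$ only raises the energy by amounts in $\operatorname{supp}\tilde f \subset (0,\infty)$, and its adjoint only lowers it. The spectrum, the bump function `f_tilde` and the helper `smear` below are hypothetical choices.

```python
import numpy as np

def f_tilde(lam, lam0=1.0, eps=0.5):
    """Smooth bump supported on [lam0 - eps, lam0 + eps], positive inside."""
    x = (np.asarray(lam, dtype=float) - lam0) / eps
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

def smear(A, energies):
    """Matrix of A(f) in the eigenbasis of H: 2*pi * f_tilde(E_i - E_j) * A_ij."""
    gaps = energies[:, None] - energies[None, :]
    return 2.0 * np.pi * f_tilde(gaps) * A

energies = np.array([0.0, 1.0, 2.0])      # spectrum of H, ground state first
omega = np.array([1.0, 0.0, 0.0])         # ground state vector, H omega = 0
A = np.ones((3, 3))                       # any element with A_10 != 0

A_f = smear(A, energies)
raised = A_f @ omega                      # A(f) Omega: non-zero
lowered = A_f.conj().T @ omega            # A(f)* Omega: no state below the ground state
```

In this model `lowered` vanishes although `A_f` itself does not, which is the mechanism that the remainder of the proof makes rigorous for a general spectrum contained in $[0,\infty)$.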
Now, let
\[ B\Omega \equiv \sqrt{2\pi}\int_{-\infty}^{\infty} dt\, f(t)e^{-iHt}A^*\Omega. \tag{42} \]
We have
\[ B\Omega = \sqrt{2\pi}\int_{-\infty}^{\infty} dt\, f^*(-t)\alpha_{-t}(A^*)\Omega = \sqrt{2\pi}\int_{-\infty}^{\infty} dt\, f^*(t)\alpha_t(A^*)\Omega = A(f)^*\Omega \tag{43} \]
by (40) and the fact that $\tilde f(\lambda)$ is real valued, whence $f^*(-t) = f(t)$, $\forall t \in \mathbb{R}$. By (42), $B\Omega$ may also be written as a strong Stieltjes integral
\[ B\Omega = \int \tilde f(-\lambda)\, d(E(\lambda)A^*\Omega). \tag{44} \]
By assumption our state is a ground state, and therefore satisfies the spectrum condition (8) by Proposition 1, the support of $\{E(\lambda)\}$ being thus $[0,\infty)$. Together with (29) and (31), (43) and (44) yield thus $A(f)^*\Omega = 0$ which, together with (41), shows non-faithfulness, by (2).

As a consequence, the analogue of Theorem 2 also holds for $C^*$-dynamical systems:

Theorem 3. Let the dynamics $\alpha$ be non-trivial, i.e., $H \neq 0$. Then $\omega$ is non-faithful on $\mathcal{A}$ and completely passive if and only if it is a ground state.

Proof. The "only if" part follows from [13, Theorem 1.4] and [3, p. 287, case 2], or, alternatively, from the fact that the extension $\tilde\omega$ of $\omega$ to $\mathcal{M}$ is non-faithful and completely passive; thus $\tilde\omega$ is a ground state by Proposition 2, i.e., $H \ge 0$, which also implies that $\omega$ is a ground state over $\mathcal{A}$. The "if" part follows from [3] (see also [4, Theorem 5.3.22, (1) ⇒ (2)]) and Proposition 3.

Remark 1. The results of Theorem 3 apply to rqft: they are not in conflict with the Reeh–Schlieder theorem ([6] or [8, Theorem 4.14, p. 101]), because the latter applies only to the subalgebras $\mathcal{Q}(O)$ of $\mathcal{Q}$ affiliated to bounded spacetime regions $O$, which are not stable under time evolution (a property which we used in the construction of $A(f)$ given by (37)). The procedure of cutting off the total energy of certain states $A\Omega$ (as done to arrive at $A(f)\Omega$) may be considerably sophisticated, leading to much more profound results, namely, to the conditions which must be imposed on a rqft to ensure a reasonable particle interpretation [10, 11].

While Theorems 2 and 3 settle the differences between ground states and temperature states in the context of complete passivity, there remains the general deep
context of stability under local perturbations of the dynamics [12, 13]. The differences between ground states and temperature states in this context have already been pointed out [1, 13], and we summarize them briefly here.

A state $\omega$ is primary or a factor state if in the representation $(\pi, \mathcal{H}, \Omega)$ determined by $\omega$ the von Neumann algebra $\pi(\mathcal{Q})''$ is a factor, which means that the center $z \equiv \pi(\mathcal{Q})'' \cap \pi(\mathcal{Q})' = \{\lambda 1\}$, i.e., consists of multiples of the identity. A state $\omega$ which satisfies the KMS condition and is extremal invariant for $\alpha_t$ is primary [2, Corollary 4.15]. An extremal state is an extremal point of the set of states over $\mathcal{Q}$, which, in physical terms, is a "pure thermodynamic phase" [2, 4]. A $C^*$-dynamical system is defined to be $L^1(\mathcal{Q}_0)$-asymptotically abelian if
\[ \int_{-\infty}^{\infty} dt\,\|[A, \alpha_t(B)]\| < \infty \]
for all $A, B$ in the norm-dense subalgebra $\mathcal{Q}_0$ (see [4, Definition 5.4.8] for a discussion of the meaning of this condition). The above condition is difficult to verify in particular models; it can be verified in the ideal Fermi gas. The related condition of asymptotic abelianness
\[ \lim_{t\to\infty}\|[A, \alpha_t(B)]\| = 0 \tag{45} \]
has been studied for the two-sided infinite XY chain by Araki, and shown to hold there only if $A$ and $B$ have certain properties with respect to the 180° rotation of all spins around the $z$-axis [28, Theorem 11].

Proposition 4 ([13, Theorem 6]; see also [13, Theorems 1.3 and 4.3] for related results). If $\omega$ is a factor state on an $L^1$-asymptotically abelian $C^*$-dynamical system satisfying the "local stability condition" (see [12] or [13] for a physical justification of this designation):
\[ \lim_{T\to\infty}\int_{-T}^{T} dt\,\omega([A, \alpha_t(B)]) = 0 \tag{46} \]
$\forall A, B \in \mathcal{Q}$, then $\omega$ is an extremal $(\alpha,\beta)$-KMS state for some $\beta \in \mathbb{R}\cup\{\pm\infty\}$.

A temperature state satisfying (46) is just a KMS state, but a ground state satisfying (46) not only satisfies the ground-state condition (8), but also has a gap [13, Theorem 3]:
\[ H \ge \epsilon > 0 \tag{47} \]
The superconducting (BCS) ground state is one of the most interesting examples satisfying (47) (see [14] and [15] for a rigorous treatment of the BCS model). In spite of local stability of the ground state, it has been proved in [20] that current-carrying states of a superconductor cannot satisfy the KMS condition — and thus cannot be passive — under the fundamental assumption of local gauge symmetry [20]. For a treatment applicable to real persistent currents in superconductors — which prevail only in multiply-connected regions (rings) — see [27, Sec. 9.6]. For systems with
short-range forces and spontaneous breakdown of a continuous symmetry, Goldstone's theorem [16, 17] implies that (47) cannot hold: there exists a branch of excitations of the ground state, whose energy tends to the ground state energy when a continuous parameter (e.g. the wave-vector) is varied. It is just this phenomenon which accounts for the breakdown of the local stability (46) for the ground state when (47) is not satisfied: the (local) perturbation may cause the formation of an infinite number of infraparticles, each of infinitesimally small energy ([13] or [4, Theorem 5.4.16]). In a quantum Bose liquid, such as He$^4$, these Goldstone excitations are phonons [18], which are expected to account for the startling property of superfluidity, i.e., the flow along a capillary, with velocity $\vec v$, without friction (viscosity) [18].

Since local stability is violated, one might thus be led to study passivity in the presence of moving matter [7], or, which is equivalent, passivity in the reference frame of a moving observer (with respect to the rest frame of the state). The energy
\[ \Delta E \equiv i\,\frac{d}{dt}\,\omega(U^*\alpha_t(U))\Big|_{t=0} \tag{48} \]
may be interpreted as the energy gained in the cyclic process between the initial state $\omega(\cdot)$ and the final state $\omega(U^*\cdot U)$, where $U \in \mathcal{Q}$ is a unitary operator. Conditions (2) and (3) state that this energy is negative, i.e., the work performed on the system is positive by (5):
\[ \Delta E \le 0. \tag{49} \]
This is adequate as an expression of the second law for an observer in the rest frame of the state $\omega$, but in the case of moving observers one has to take into account the energy necessary to maintain the motion of the observer. Thus (49) might be expected to change to
\[ i\,\frac{d}{dt}\,\omega(U^*\alpha_{t,\vec v}(U))\Big|_{t=0} \le \Delta E_{\vec v}(\Lambda) \tag{50} \]
where $U$ is a unitary element of $\mathcal{Q}(\Lambda)$, the local algebra of observables in a region $\Lambda$ in space, e.g. a cube with periodic boundary conditions, and $\alpha_{t,\vec v}$ is the automorphism of $\mathcal{Q}$ with generator:
\[ H'_{\vec v} = H + \vec v\cdot\vec P \tag{51} \]
appropriate to a non-relativistic motion with velocity $\vec v$ (this is a non-relativistic version of [19, (6.2)]). By a principle of local finiteness, we expect that
\[ \Delta E_{\vec v}(\Lambda) = O(|\Lambda|) \tag{52} \]
where $|\Lambda|$ is the volume of region $\Lambda$. Indeed a modification of the passivity condition such as (50) is expected, due to the previously mentioned results of Sewell in [20]. In the case of a (locally Fock or locally normal) ground state, (50) follows from
\[ -(\tilde H_\Lambda + \vec v\cdot\vec P_\Lambda) \le \Delta E_{\vec v}(\Lambda) \]
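or
\[ \tilde H_\Lambda + \vec v\cdot\vec P_\Lambda + \Delta E_{\vec v}(\Lambda) \ge 0. \tag{53} \]
Above
\[ \tilde H_\Lambda = H_\Lambda - E_\Lambda \ge 0 \tag{54} \]
where $H_\Lambda$, $\vec P_\Lambda$ are restrictions of the Hamiltonian and the momentum operator to finite regions $\Lambda$ (we take them to be cubes of side $L$ with periodic boundary conditions) and $E_\Lambda$ is the ground state energy: if $\Omega_\Lambda$ is the ground state eigenvector, in the Hilbert space $\mathcal{H}_\Lambda$ of the system restricted to $\Lambda$,
\[ H_\Lambda\Omega_\Lambda = E_\Lambda\Omega_\Lambda \quad\text{and}\quad \vec P_\Lambda\Omega_\Lambda = \vec 0. \tag{55} \]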
In $\Lambda$ we consider a generic conservative system of $N$ identical particles [25]. In units where $\hbar = m = 1$, where $m$ is the particle mass, $H_\Lambda$ and $\vec P_\Lambda$ take the standard forms
\[ H_\Lambda = -\frac{1}{2}\sum_{r=1}^{N}\Delta_r + V(\vec x_1, \ldots, \vec x_N) \tag{56a} \]
and
\[ \vec P_\Lambda = \sum_{r=1}^{N}(-i\nabla_r). \tag{56b} \]
Introduce the unitary operator $U_\Lambda^{\vec u}$ of Galilei transformations appropriate to velocity $\vec u$:
\[ U_\Lambda^{\vec u} = \exp[i\vec u\cdot(\vec x_1 + \cdots + \vec x_N)] \tag{57} \]
where
\[ \vec u = \frac{2\pi}{L}\,\vec n, \qquad \vec n \equiv (n_1, n_2, n_3),\ n_i \in \mathbb{Z}. \tag{58} \]
By (56) and (57) we find
\[ U_\Lambda^{\vec v\,*}\,\tilde H_\Lambda\,U_\Lambda^{\vec v} = \tilde H_\Lambda + \vec v\cdot\vec P_\Lambda + \frac{1}{2}N\vec v^{\,2}. \tag{59} \]
By (54) and (59), (53) is true with the identification
\[ \Delta E_{\vec v} = \frac{1}{2}N\vec v^{\,2} \tag{60} \]
which satisfies (52). Of course (60) was to be expected: it is the kinetic energy of the $N$ particles ("of the fluid"), each moving with uniform velocity $\vec v$ relative to a "static" frame (a system moving with velocity $(-\vec v)$ relative to the reference frame of the given ground state) and (53) is just the expression of the "covariance" of the passivity- or ground state-condition (54) under Galilean transformations, but it has nothing to do with dissipation. The latter property is detectable in the static frame of reference whenever friction is able to induce a transition from the
ground state of $U_\Lambda^{\vec v\,*}\tilde H_\Lambda U_\Lambda^{\vec v}$ to an eigenstate of the same operator with the lowest energy above the ground state (i.e. containing an "elementary excitation"). The latter corresponds, by (59), to an eigenvalue $\epsilon_\Lambda(\vec p)$, say, of $\tilde H_\Lambda$, corresponding to momentum $\vec p$. Thus, by (59), the lowest eigenvalues of $U_\Lambda^{\vec v\,*}\tilde H_\Lambda U_\Lambda^{\vec v}$ are
\[ \epsilon_\Lambda(\vec p) + \vec v\cdot\vec p + \frac{1}{2}N\vec v^{\,2}. \tag{61} \]
These excitations may indeed be induced by friction (viscosity) if their energy (61) is smaller than the ground state energy ($\frac{1}{2}N\vec v^{\,2}$) of $\tilde H_\Lambda$, but not otherwise. We thus get the condition for frictionless motion (expected to hold for $|\vec v| \le v_c$, where $v_c$ is a critical velocity):
\[ \epsilon_\Lambda(\vec p) + \vec v\cdot\vec p \ge 0 \tag{62} \]
which is Landau's criterion of superfluidity (see [22] for a remarkable model where the criterion is verified). We therefore conclude: (a) condition (53) is true; (b) it is not related to dissipation.

We have seen, in particular, that (60) is essential in (53), and thus the condition
\[ \tilde H_\Lambda + \vec v\cdot\vec P_\Lambda \ge 0 \tag{63} \]
which leads to
\[ H + \vec v\cdot\vec P \ge 0 \tag{64} \]
in the thermodynamic limit, is not true for $\vec v \neq \vec 0$, being in conflict with Galilean covariance. (This is shown [25] by considering $U_\Lambda^{\vec u\,*}(\tilde H_\Lambda + \vec v\cdot\vec P_\Lambda)U_\Lambda^{\vec u}$ which, for suitable $\vec u$, yields $U_\Lambda^{\vec u\,*}(\tilde H_\Lambda + \vec v\cdot\vec P_\Lambda)U_\Lambda^{\vec u}\,\Omega_\Lambda = \frac{1}{2}N(\vec u^{\,2} + 2\vec u\cdot\vec v)\,\Omega_\Lambda$; choosing $\vec u = -\lambda\vec v$, the coefficient of $\Omega_\Lambda$ above is $\frac{1}{2}N\vec v^{\,2}(\lambda^2 - 2\lambda)$, which is minimized for $\lambda = 1$, leading to $U_\Lambda^{\vec u\,*}(\tilde H_\Lambda + \vec v\cdot\vec P_\Lambda)U_\Lambda^{\vec u}\,\Omega_\Lambda = -\frac{1}{2}N\vec v^{\,2}\,\Omega_\Lambda$, with $\vec u = -\vec v$, which contradicts (63).)

Inequality (64) (for some $|\vec v| \le \epsilon$, where $\epsilon > 0$ is a constant related to a condition of "complete semipassivity" [7]) was proved in [7, Proposition 3.2] in the case of a non-faithful state. The proposition is, however, not valid for $\vec v \neq \vec 0$, if the dynamics is required to be Galilean covariant, as we have seen. Inequality (64) (or (63)) is true in Bogoliubov's model [21] (for $|\vec v| \le v_c$, where $v_c$ is a critical velocity). Bogoliubov's model leads to (62) and explains, in this sense, the phenomenon of superfluidity (assuming the existence of Bose–Einstein condensation [21]) but, as we have seen, it is not Galilean covariant, and thus does not satisfy local mass conservation, an essential physical requirement. The model, however, remains very important as a starting point of a mathematically ill-understood, but very successful series of approximations [21].

It is possible that the superfluid state of liquid Helium II, in a suitable model, fits in the framework of NESS (non-equilibrium stationary states), see the review by Jakšić and Pillet [5] and the work of J. Fröhlich, M. Merkli and D. Ueltschi [26].
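The Galilean identity (59) and the counterexample to (63) rest on a short computation, sketched here under the assumptions that $V$ is a multiplication operator (so that it commutes with $U_\Lambda^{\vec u}$) and that $\vec u$ is of the form (58), so that $U_\Lambda^{\vec u}$ respects the periodic boundary conditions. It is offered as an illustration in the notation of (56) and (57), not as a reproduction of the argument of [25].

```latex
\begin{align*}
U_\Lambda^{\vec u\,*}\,(-i\nabla_r)\,U_\Lambda^{\vec u} &= -i\nabla_r + \vec u, \\
U_\Lambda^{\vec u\,*}\,\vec P_\Lambda\,U_\Lambda^{\vec u} &= \vec P_\Lambda + N\vec u, \\
U_\Lambda^{\vec u\,*}\,\tilde H_\Lambda\,U_\Lambda^{\vec u}
  &= \tfrac12\sum_{r=1}^{N}(-i\nabla_r+\vec u)^2 + V - E_\Lambda
   = \tilde H_\Lambda + \vec u\cdot\vec P_\Lambda + \tfrac12 N\vec u^{\,2}, \\
U_\Lambda^{\vec u\,*}\,(\tilde H_\Lambda+\vec v\cdot\vec P_\Lambda)\,U_\Lambda^{\vec u}
  &= \tilde H_\Lambda + (\vec u+\vec v)\cdot\vec P_\Lambda
     + \tfrac12 N\,(\vec u^{\,2}+2\,\vec u\cdot\vec v).
\end{align*}
```

Applied to $\Omega_\Lambda$ and using (54) and (55), only the constant term survives; for $\vec u = -\vec v$ it equals $-\tfrac12 N\vec v^{\,2}$.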
Acknowledgment

I am very grateful to Prof. G. L. Sewell (Queen Mary College, London) for his remarks on the present review, as well as for his contribution to the arguments of passivity versus dissipation in the case of moving observers. I am also very grateful to the referee for important comments and suggestions.

References

[1] M. Requardt and W. F. Wreszinski, J. Phys. A18 (1985) 705.
[2] N. M. Hugenholtz, in Mathematics of Contemporary Physics, ed. R. F. Streater (Academic Press, London, 1972).
[3] W. Pusz and S. L. Woronowicz, Commun. Math. Phys. 58 (1978) 273.
[4] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 2, 2nd edn. (Springer Verlag, Berlin, 1997).
[5] V. Jakšić and C. A. Pillet, Mathematical Theory of Nonequilibrium Quantum Statistical Mechanics, A Review, Dec. 17, 2001, McGill, and Toulon, C.P.T., CNRS Luminy preprint.
[6] R. F. Streater and A. S. Wightman, PCT, Spin and Statistics and All That (W. A. Benjamin, 1964).
[7] B. Kuckert, Ann. Phys. 295 (2002) 216.
[8] M. Takesaki, Lecture Notes in Math. Vol. 128 (Springer-Verlag, Berlin, 1970).
[9] H. Araki, Mathematical Theory of Quantum Fields (Oxford University Press, 1999).
[10] R. Haag and J. A. Swieca, Commun. Math. Phys. 1 (1965) 308.
[11] D. Buchholz and P. Junglas, Lett. Math. Phys. 11 (1986) 51.
[12] R. Haag, D. Kastler and E. B. Trych-Pohlmeyer, Commun. Math. Phys. 38 (1974) 111.
[13] O. Bratteli, A. Kishimoto and D. W. Robinson, Commun. Math. Phys. 61 (1978) 209.
[14] W. Thirring and A. Wehrl, Commun. Math. Phys. 4 (1967) 303.
[15] W. Thirring, Commun. Math. Phys. 7 (1968) 181.
[16] J. A. Swieca, in Cargèse Lectures on Physics, Vol. 4, ed. D. Kastler (Gordon and Breach, 1970).
[17] W. F. Wreszinski, Fortschr. der Physik 35 (1987) 379.
[18] R. P. Feynman, Statistical Mechanics (W. A. Benjamin, Inc., 1972).
[19] J. Bros and D. Buchholz, Nucl. Phys. B429 (1994) 291.
[20] G. L. Sewell, Phys. Rep. 57 (1980) 307.
[21] V. Zagrebnov and J.-B. Bru, Phys. Rep. 350 (2001) 291.
[22] E. Lieb and W. Liniger, Phys. Rev. 130 (1963) 1605; E. Lieb, ibid. 130 (1963) 1616.
[23] M. Sirugue and D. Testard, Commun. Math. Phys. 19 (1971) 161.
[24] D. Kastler, Equilibrium States of Matter, Symposia Mathematica, Vol. XX (Bologna, 1976).
[25] I owe most of these remarks to G. L. Sewell, to whom I am very thankful.
[26] J. Fröhlich, M. Merkli and D. Ueltschi, Ann. Henri Poincaré 4 (2003) 897–945.
[27] G. L. Sewell, Quantum Mechanics and its Emergent Macrophysics (Princeton University Press, 2002).
[28] H. Araki, Publ. RIMS, Kyoto Univ. 20 (1984) 277.
March 15, 2005 11:31 WSPC/148-RMP
J070-00229
Reviews in Mathematical Physics Vol. 17, No. 1 (2005) 15–75 © World Scientific Publishing Company
STATES AND REPRESENTATIONS IN DEFORMATION QUANTIZATION
STEFAN WALDMANN
Fakultät für Mathematik und Physik, Albert-Ludwigs-Universität Freiburg, Physikalisches Institut, Hermann Herder Straße 3, D 79104 Freiburg, Germany
[email protected]

Received 17 August 2004
Revised 15 December 2004

In this review we discuss various aspects of representation theory in deformation quantization starting with a detailed introduction to the concepts of states as positive functionals and the GNS construction. Rieffel induction of representations as well as strong Morita equivalence, Dirac monopole and strong Picard Groupoid are also discussed.

Keywords: Deformation quantization; states; representation theory; Morita equivalence.

Mathematics Subject Classification 2000: 53D55
Contents

1. Introduction
2. Motivation: Why States and Representations?
2.1. Observables and states
2.2. Superpositions and superselection rules
2.3. Deformation quantization
3. Algebraic Background: ∗-Algebras over Ordered Rings
3.1. Ordered rings
3.2. Pre-Hilbert spaces
3.3. ∗-Algebras
4. Examples of Positive Functionals in Deformation Quantization
4.1. The δ-functional for the Weyl and Wick star product
4.2. The Schrödinger functional
4.3. Positive traces and KMS functionals
5. Deformation and Classical Limit of Positive Functionals
5.1. Completely positive deformations
5.2. Complete positivity of Hermitian star products
6. The GNS Construction and Examples
6.1. ∗-Representation theory
6.2. The general GNS construction
6.3. The case of δ, Schrödinger and trace functionals
6.4. Deformation and classical limit of GNS representations
7. General ∗-Representation Theory
7.1. ∗-Representation theory on pre-Hilbert modules
7.2. Tensor products and Rieffel induction
7.3. A non-trivial example: Dirac's monopole
8. Strong Morita Equivalence and the Picard Groupoid
8.1. Morita equivalence in the ring-theoretic setting
8.2. Strong Morita equivalence
8.3. The strong Picard Groupoid
8.4. Actions and invariants
8.5. Strong versus ring-theoretic Morita equivalence
9. (Strong) Morita Equivalence of Star Products
9.1. Deformed ∗-algebras
9.2. Deformed projections
9.3. Morita equivalent star products
10. Outlook: What Comes Next?
1. Introduction
Based on the works of Weyl, Groenewold, Moyal, Berezin and others [12–14, 79, 109, 141] on the physical side and Gerstenhaber’s deformation theory of associative algebras on the mathematical side [73–78], in the 1970s Bayen, Flato, Frønsdal, Lichnerowicz and Sternheimer coined the notion of a star product and laid the foundations of deformation quantization in their seminal work [9]; see [55, 81, 138] for recent reviews. Since then, deformation quantization has developed into one of the most attractive and successful quantization theories, both from the mathematical and from the physical point of view. The principal idea is to quantize the classical observable algebra, which is modeled by the smooth complex-valued functions on a Poisson manifold M, see e.g. [43, 139], by simply replacing the commutative product of functions by a new noncommutative product, the star product. This product depends on Planck’s constant, which controls the noncommutativity, while the underlying vector space of observables is kept. Thereby the interpretation of the quantum observables is trivial: they are the same elements of the algebra of observables as in the classical case. It turns out that many other well-known quantization schemes can actually be cast into this form, whence it is fair to say that deformation quantization is more a theory of quantization itself than a particular quantization scheme.
In formal deformation quantization, one has very strong existence and classification results for star products which depend on ħ in the sense of a formal power series. For the symplectic case the general existence was shown first by DeWilde and Lecomte [51, 52], later by Fedosov [64–66, 68] and Omori, Maeda, and Yoshioka [115]. The case where the classical phase space is a general Poisson manifold turned out to be much more difficult and was finally solved by Kontsevich [99], see also [101, 102], by proving his formality conjecture [100].
The classification of star products was again obtained first for the symplectic case by Nest and Tsygan [110, 111], Bertelson, Cahen and Gutt [15], Deligne [50], Weinstein and Xu [140]. The classification in the Poisson case also follows from the formality theorem of Kontsevich [99]. For an interpretation of Kontsevich’s formality in terms of the Poisson-sigma model as well as globalization aspects see the work of Cattaneo, Felder and Tomassini [44–46] as well as Dolgushev [59].
It should also be mentioned that star products find physical applications far beyond the original quantization problem: recently the most prominent applications come from noncommutative geometry [47] and the noncommutative field theory models arising from it. Here one endows the space-time manifold with the noncommutative star product and studies field theories on this noncommutative space-time, see e.g. [3, 48, 89–91, 128] and references therein.
For physical reasons, it is not enough in the quantization problem to consider only the space of observables and their quantization. One also needs a notion of states and their quantization. Giving an overview of the concept of states in deformation quantization is therefore the main topic of this review. It turns out that the question of states is, as in any other quantum theory based on the notion of observables, intimately linked to the question of representations of the observable algebra. A systematic investigation of representations of the deformed algebras started with the work [29], leading to the general representation theory for algebras with involution defined over a ring C = R(i) with an ordered ring R as developed in a series of articles [23–26, 28, 33–38, 131, 132, 134–136]. The purpose of this work is to give an introduction to these concepts and to discuss some of the basic results in the representation theory of star product algebras.
Since the techniques are fairly general, many of the results will also find applications in other areas of mathematical physics.
The plan of this review is as follows: In Sec. 2 we briefly remind the reader of the basic notions of states as positive functionals and of representations of observable algebras, and discuss the necessity of studying them. Section 3 gives the algebraic background on ordered rings, ∗-algebras and notions of positivity which will be crucial throughout this work. Section 4 is devoted to examples of positive functionals from deformation quantization. In Sec. 5 we discuss the deformation and the classical limit of positive functionals and introduce the important notion of a positive deformation. Section 6 establishes the relation between positive functionals and representations via the GNS construction of representations. Section 7 starts with an introduction to more advanced topics in representation theory like Rieffel induction and related tensor product constructions. This will be used in Sec. 8 to establish the notion of strong Morita equivalence and the strong Picard Groupoid which encodes the whole Morita theory. In Sec. 9 we discuss the Morita theory for deformed and more specifically for star product algebras, yielding a new look at Dirac’s monopole and the corresponding charge quantization. Finally, Sec. 10 contains several open questions and further ideas related to representation theory
in deformation quantization. We have included an extensive though by no means complete bibliography. For more details and references one should also consult the Deformation Quantization Homepage.
2. Motivation: Why States and Representations?
In this section we shall give some well-known remarks on the quantization problem and the general approach to quantum theory based on the notion of an observable algebra and specialize this to deformation quantization.
2.1. Observables and states
When we want to learn something about the relation between the classical and quantum descriptions of a physical system we should first discuss the similarities and differences in as much detail as possible. Here we follow the idea that the observables, i.e. the possible measurements one can perform on the system, characterize the system itself. Moreover, the algebraic structure of the observables determines what the possible states of the system can be. From this point of view classical and quantum theories behave quite similarly. We illustrate this for a system with finitely many degrees of freedom, though the main results can easily be generalized to field theories or thermodynamical systems.
For the classical side, we model the algebra of observables by the Poisson ∗-algebra of complex-valued smooth functions on a manifold M, the phase space of the system. The ∗-involution will always be pointwise complex conjugation. An element of the observable algebra is called observable if it is a real-valued function, f = f̄. The structure of a Poisson bracket {·, ·} for smooth functions is equivalent to a Poisson structure on the manifold, i.e. a smooth anti-symmetric 2-tensor field π ∈ Γ^∞(Λ²TM) with [[π, π]] = 0, where [[·, ·]] denotes the Schouten bracket, and the relation is {f, g} = π(df, dg), see e.g. [43, 106, 139] for more details and references on Poisson geometry. Of course there are situations where the class of functions describing the system most adequately may be a different one.
The pure states are then the points of the phase space while the mixed states correspond to more general positive Borel measures on M. The physical spectrum of an observable f, i.e. the possible values of f in a measurement, coincides with its mathematical spectrum, namely the set of values of the function. Finally, the expectation value of an observable f in a pure state x ∈ M is given by the evaluation E_x(f) = f(x), while in a mixed state μ the expectation value is E_μ(f) = ∫_M f(x) dμ(x). The crucial feature of a classical observable algebra is its commutativity, which allows for sharp measurements of all observables in a pure state.
In quantum theory, the observables are usually modeled by the ∗-algebra of bounded operators B(H) on a complex Hilbert space H, or, more generally, some ∗-algebra of densely defined and possibly unbounded operators on H. It is clear that one has to specify in which sense one wants to understand “∗-algebra” if the
operators are only densely defined, but these technical aspects can be made precise in a completely satisfactory way, see e.g. [127]. The observable elements in the observable algebra are then the self-adjoint operators. The pure states are now complex rays in the Hilbert space. Usually, not all rays have physical relevance, as the vectors defining them have to be in the domain of the observables of interest, which may only be a dense subspace of H. From this point of view only a pre-Hilbert space is needed to describe the physically relevant states, while the Hilbert space in the background is needed to have a “good” spectral calculus. More generally, mixed states are described by density matrices ρ, i.e. positive trace class operators with trace 1. Indeed, the pure states are just the rank-one projection operators from this point of view. The spectrum is now the spectrum in the sense of self-adjoint operators. Finally, the expectation value of an observable A in the pure state defined by φ ∈ H is E_φ(A) = ⟨φ, Aφ⟩/⟨φ, φ⟩, and in a mixed state ρ it is given by E_ρ(A) = tr(ρA).
Up to now this is the standard description of classical and quantum theory as can be found in textbooks. However, the way we presented it allows for a uniform framework for both theories which is better suited to questions of quantization and classical limit. Indeed, the structure of the observables is in both cases encoded in a (unital) complex ∗-algebra A. The difference between the classical and quantum side is that A is noncommutative for the quantum theory, as we have to incorporate uncertainty relations, while in the classical situation A is commutative but has an additional structure, the Poisson bracket. The states are now identified with the expectation value functionals and can thus be described by normalized positive linear functionals ω: A → ℂ, i.e. linear functionals such that ω(a∗a) ≥ 0 for all a ∈ A and ω(𝟙) = 1.
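The two quantum expectation formulas above can be compared on a small example. The following is a minimal sketch, assuming a two-dimensional Hilbert space ℂ² and an arbitrarily chosen self-adjoint matrix A; the helper names (inner, projector, pure_expectation, mixed_expectation) are illustrative and not taken from the text. It checks that the pure state defined by φ and the rank-one projection onto φ give the same expectation value, and that an equal-weight mixture gives the corresponding convex combination.

```python
def inner(phi, psi):
    """<phi, psi>: anti-linear in the first, linear in the second argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))

def apply(A, phi):
    """Matrix A (list of rows) applied to the vector phi."""
    return [sum(a * x for a, x in zip(row, phi)) for row in A]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def pure_expectation(A, phi):
    """E_phi(A) = <phi, A phi> / <phi, phi>."""
    return inner(phi, apply(A, phi)) / inner(phi, phi)

def projector(phi):
    """Rank-one projection onto the ray through phi, viewed as a density matrix."""
    norm2 = inner(phi, phi)
    return [[x * complex(y).conjugate() / norm2 for y in phi] for x in phi]

def mixed_expectation(A, rho):
    """E_rho(A) = tr(rho A)."""
    return trace(matmul(rho, A))

# Assumed example data: a self-adjoint 2x2 matrix and two vectors.
A = [[1, 1j], [-1j, 3]]
phi = [1, 1j]
psi = [0, 1]

rho_pure = projector(phi)
# Equal-weight convex combination of the two pure states.
rho_mixed = [[0.5 * (p + q) for p, q in zip(r1, r2)]
             for r1, r2 in zip(projector(phi), projector(psi))]

print(pure_expectation(A, phi), mixed_expectation(A, rho_pure))
print(mixed_expectation(A, rho_mixed))
```

In this sketch the mixed state is built as a convex combination of density matrices, which is the operation the algebraic framework uses to distinguish mixed from pure states.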
The question whether a state is pure or mixed becomes now the question whether ω can be decomposed in a non-trivial way into a convex combination ω = c1 ω1 + c2 ω2 of two other states ω1 , ω2 . The expectation value of an observable a in a state ω is then Eω (a) = ω(a). Clearly, all our above examples of “states” fit into this framework. Unfortunately, one has to leave this purely algebraic framework as soon as one wants to have a reasonable notion for “spectrum”. Here one has to impose some analytical conditions on the ∗ -algebra in question in order to get physically acceptable answers. In the above examples this corresponds to the choice of an appropriate class of functions on the phase space on the classical side and the questions about self-adjointness on the quantum side. Typically, some C ∗ -algebraic structures underlying A will be responsible for a good spectral calculus. Except for this last difficulty the problem of quantization can now be seen as the task to construct the quantum observable algebra out of the knowledge of the classical observable algebra. The above formulation then gives automatically a construction of the states as well, since the states of a ∗ -algebra are defined in a uniform way, whether the algebra is commutative or not. In this sense the algebraic structure of the observables determines the possible states whence for quantization it is sufficient to find the observable algebra.
Before we discuss one of the approaches to quantization in more detail let us state clearly that from a physicist’s point of view the whole question about quantization is in some sense completely artificial: by our present knowledge the world is already quantum, whence there is nothing left to be quantized. The true physical problem is the inverse question: why and how does a classical world emerge out of this quantum world, at least for certain scales of energy, momentum, length, time, etc.? Nevertheless, as a physicist, one is still interested in quantization since up to now we have not developed sufficient intuition which would allow us to formulate quantum theories a priori without the usage of classical counterparts, except for very few cases. We shall not speculate too much on the more philosophical question why this is (still) the case. Instead, we consider “quantization” as a pragmatic approach to find relevant quantum descriptions for the physical systems we are interested in.
2.2. Superpositions and superselection rules
Once we have succeeded in finding the quantum algebra of observables and have determined the states as its positive functionals, do we then have a complete quantum theory? The answer is no: there is still one important piece missing, most crucial for quantum physics. We still need the superposition principle for the (pure) states. In the usual Hilbert space formulation of quantum physics one simply takes complex linear combinations of the state vectors in order to encode the superposition of the states they represent. Note that this cannot be done so simply in our more advanced formulation where states are identified with their expectation value functionals. Of course we can take convex combinations of positive functionals and get again positive functionals, but this does not correspond to the superposition of the states but to a mixed state.
Thus we need this additional linear structure of the Hilbert space, which is precisely the reason why one has to represent the algebra of observables on a Hilbert space such that the positive functionals become the expectation value functionals for vector states. We want to be able to write
\[ \omega(a) = \frac{\langle \phi, \pi(a)\phi \rangle}{\langle \phi, \phi \rangle} \tag{2.1} \]
for some ∗ -representation π on some Hilbert space H and some vector φ ∈ H. Clearly, at this stage we only need a pre-Hilbert space structure. But which ∗ -representation shall we choose? In particular, for two given positive functionals ω1 and ω2 can we always find a ∗ -representation (H, π) such that both states ω1 , ω2 can be written in the form (2.1) with some φ1 , φ2 ∈ H in order to form their superpositions? If we require in addition that the algebra acts irreducibly then the answer, in general, is no. Dropping the irreducibility gives an easy answer provided we can find ∗ -representations (Hi , πi ) for each ωi separately, since then we simply can take the direct orthogonal sum of the ∗ -representations. Then however,
superpositions of the vectors φ1, φ2 will not produce any interesting interference cross terms. There will be no transitions between these two states. This phenomenon is called a superselection rule: one cannot superpose the two states in a non-trivial way. It is well-known from quantum field theory that this may indeed happen, see e.g. the discussion in [84]. The presence of superselection rules is usually interpreted as the existence of non-trivial charges. Mathematically speaking it corresponds to the existence of inequivalent (faithful) irreducible ∗-representations. Note that in order to “see” the superselection rules it is not enough to choose one particular ∗-representation from the beginning. So the problem of choosing a ∗-representation “is not a bug, it is a feature”.
In usual quantum mechanics of an uncharged particle moving in Euclidean ℝ^n superselection rules are absent: this is the statement of a classical theorem of von Neumann. Note however that this statement is only true after some effort involving the completion of the observable algebra to a C∗-algebra, the Weyl algebra, see e.g. the discussion in [84, Sec. I.1]. One should take this non-trivial result also as a warning: since the absence of superselection rules in the above case is the consequence of some rather strong topological context, one has to expect that one might see “superselection rules” which are artifacts of a possible “noncompleteness” of the observable algebra in the sense that they vanish immediately after one passes to an appropriate completion.
As a conclusion we see that we have to understand the whole ∗-representation theory of the observable algebra and determine the “hard” superselection rules, i.e. those which survive after some (physically motivated) completions. Of course, this has to be made more specific in the sequel. In fact, in formal deformation quantization this turns out to be a highly non-trivial issue.
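The absence of interference terms in a direct orthogonal sum can be made concrete in finite dimensions. The following is a minimal sketch, assuming two hypothetical representations on ℂ² in which one fixed observable a is represented by the matrices a1 and a2; all names and numbers are illustrative. Superposing vectors from different summands gives an expectation value that does not depend on the relative phase and equals that of a mixture, whereas inside a single summand the relative phase matters.

```python
def inner(phi, psi):
    """<phi, psi>: anti-linear in the first, linear in the second argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))

def apply(A, phi):
    """Matrix A (list of rows) applied to the vector phi."""
    return [sum(a * x for a, x in zip(row, phi)) for row in A]

def expectation(A, phi):
    """<phi, A phi> / <phi, phi>, as in (2.1)."""
    return inner(phi, apply(A, phi)) / inner(phi, phi)

def direct_sum(A1, A2):
    """Block-diagonal matrix of the direct orthogonal sum of two representations."""
    n1, n2 = len(A1), len(A2)
    return [row + [0] * n2 for row in A1] + [[0] * n1 + row for row in A2]

# Hypothetical images of one observable a in two representations.
a1 = [[0, 1], [1, 0]]
a2 = [[2, 1j], [-1j, 0]]
a = direct_sum(a1, a2)

phi1, phi2 = [1, 1], [1, 0]

def superpose(z1, z2):
    """z1 * (phi1, 0) + z2 * (0, phi2) in the direct sum."""
    return [z1 * x for x in phi1] + [z2 * y for y in phi2]

# Across the two summands the relative phase is invisible ...
across = [expectation(a, superpose(1, z)) for z in (1, 1j, -1)]
# ... while inside a single summand it changes the expectation value.
inside = [expectation(a1, [1, z]) for z in (1, 1j, -1)]

# The cross-summand value is the mixture weighted by the squared norms.
n1, n2 = inner(phi1, phi1), inner(phi2, phi2)
mixture = (n1 * expectation(a1, phi1) + n2 * expectation(a2, phi2)) / (n1 + n2)

print(across, mixture)
print(inside)
```

The block-diagonal form of a is what kills the cross terms: no element of the represented algebra maps one summand into the other.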
2.3. Deformation quantization
The main idea of deformation quantization, as formulated in [9], is to construct the quantum observable algebra as a noncommutative associative deformation of the classical observable algebra in the sense of Gerstenhaber’s deformation theory [74–77, 87], where the first non-trivial term in the commutator of the deformation is the classical Poisson bracket and the deformation parameter is Planck’s constant ħ. Thus the classical Poisson bracket is the “shadow” of the noncommutativity of quantum theory. Roughly speaking, deformation means that we endow the same underlying vector space of the classical observable algebra with a family of new products ⋆_ħ, called star products, which depend on the deformation parameter ħ in such a way that for ħ = 0 we recover the classical commutative product structure. There are at least two flavors of this quantization scheme: strict deformations and formal deformations. While in strict deformation quantization, see e.g. [106, 122, 123], one wants the products ⋆_ħ to depend in a continuous way on ħ,
usually within a C∗-algebraic framework, in formal deformation quantization the dependence is in the sense of formal power series. At least in some good cases the formal deformations can be seen as an asymptotic expansion of the strict ones. Of course, the deformation parameter, being identified with Planck’s constant ħ, should not be a formal parameter but a physical quantity. Thus starting with formal deformations one should be able to establish at some point the convergence of the formal series. In general this turns out to be a rather delicate problem, usually depending in a very specific way on the particular example one considers, see e.g. [19, 39, 81] and references therein. So, unfortunately, not very much can be said about this point in general. On the other hand, the advantage of formal deformations is that we can decide at which point we want to impose the convergence conditions. This usually gives more freedom in the beginning. In the following we shall always consider the formal framework.
After these general remarks we can now state the definition of a star product according to [9] and recall some of the basic results:
Definition 2.1 (Star Product). Let (M, π) be a Poisson manifold. Then a formal star product ⋆ on (M, π) is an associative ℂ[[λ]]-bilinear multiplication on C^∞(M)[[λ]], written as
\[ f \star g = \sum_{r=0}^{\infty} \lambda^r C_r(f, g) \tag{2.2} \]
for f, g ∈ C^∞(M)[[λ]], such that
(i) C_0(f, g) = fg and C_1(f, g) − C_1(g, f) = i{f, g},
(ii) 1 ⋆ f = f = f ⋆ 1,
(iii) C_r is a bidifferential operator,
(iv) $\overline{f \star g} = \bar{g} \star \bar{f}$, where $\bar{\lambda} = \lambda$ by definition.
The first condition reflects the correspondence principle while the last condition is sometimes omitted. To stress the last condition we shall also call star products satisfying this condition Hermitian star products. In the following we shall mainly be interested in Hermitian star products. If $S = \mathrm{id} + \sum_{r=1}^{\infty} \lambda^r S_r$ is a formal series of differential operators S_r: C^∞(M) → C^∞(M) with the property that S_r vanishes on constants, then for a given star product ⋆ the definition
\[ f \star' g = S^{-1}(Sf \star Sg) \tag{2.3} \]
defines again a star product deforming the same Poisson bracket as ⋆. This is an immediate computation. If in addition $S\bar{f} = \overline{Sf}$ then ⋆′ is a Hermitian star product if ⋆ is Hermitian. If two star products are related by such an operator they are called equivalent or ∗-equivalent, respectively.
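Condition (i) of Definition 2.1 can be checked by hand on polynomials. The following is a minimal sketch, assuming the phase space ℝ² with coordinates (q, p), the bracket {q, p} = 1, and the standard coefficients of the Weyl-Moyal star product, C_r(f, g) = (i/2)^r / r! · Σ_k (−1)^k binom(r, k) (∂_q^{r−k} ∂_p^k f)(∂_p^{r−k} ∂_q^k g). This formula is not stated in the text above and is taken as an assumption; the dictionary encoding of polynomials and all function names are illustrative.

```python
from math import comb, factorial

# A polynomial in (q, p) is a dict {(a, b): coeff} standing for sum coeff * q**a * p**b.

def clean(f):
    """Drop vanishing coefficients."""
    return {k: c for k, c in f.items() if c != 0}

def d(poly, var, times=1):
    """Differentiate `times` times with respect to q (var=0) or p (var=1)."""
    for _ in range(times):
        out = {}
        for exps, c in poly.items():
            e = exps[var]
            if e > 0:
                new = list(exps)
                new[var] = e - 1
                out[tuple(new)] = out.get(tuple(new), 0) + c * e
        poly = out
    return poly

def mul(f, g):
    """Pointwise (commutative) product of two polynomials."""
    out = {}
    for (a, b), c in f.items():
        for (a2, b2), c2 in g.items():
            out[(a + a2, b + b2)] = out.get((a + a2, b + b2), 0) + c * c2
    return clean(out)

def add(f, g, scale=1):
    """f + scale * g."""
    out = dict(f)
    for k, c in g.items():
        out[k] = out.get(k, 0) + scale * c
    return clean(out)

def poisson(f, g):
    """{f, g} = d_q f d_p g - d_p f d_q g."""
    return add(mul(d(f, 0), d(g, 1)), mul(d(f, 1), d(g, 0)), -1)

def moyal_coefficient(f, g, r):
    """Assumed Weyl-Moyal coefficient C_r(f, g) for the bracket {q, p} = 1."""
    total = {}
    for k in range(r + 1):
        left = d(d(f, 0, r - k), 1, k)    # d_q^(r-k) d_p^k f
        right = d(d(g, 1, r - k), 0, k)   # d_p^(r-k) d_q^k g
        total = add(total, mul(left, right), (-1) ** k * comb(r, k))
    factor = (1j / 2) ** r / factorial(r)
    return clean({key: factor * c for key, c in total.items()})

q = {(1, 0): 1}
p = {(0, 1): 1}
f = {(2, 1): 1}          # q**2 * p
g = {(1, 3): 1}          # q * p**3

# Antisymmetric part of C_1, to be compared with i * {f, g}.
commutator_1 = add(moyal_coefficient(f, g, 1), moyal_coefficient(g, f, 1), -1)
i_bracket = {k: 1j * c for k, c in poisson(f, g).items()}
print(commutator_1, i_bracket)
```

For f = q and g = p the same functions give the first-order terms of q ⋆ p and p ⋆ q, whose difference is the canonical commutation relation in the form iλ{q, p}.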
The existence of star products as well as their classification up to equivalence is now well-understood:
Theorem 2.2 (Existence of Star Products). On any Poisson manifold there exists a Hermitian star product.
The first proofs of this theorem for the case of symplectic manifolds were obtained by DeWilde and Lecomte [51, 52] and independently by Fedosov [64–66, 68] and Omori, Maeda and Yoshioka [115]. The much more involved existence problem in the Poisson case is a consequence of Kontsevich’s formality theorem [99, 101]. The classification up to equivalence was first obtained for the symplectic case by Nest and Tsygan [110, 111], Bertelson, Cahen and Gutt [15], Deligne [50] (see also [82, 113]), and Weinstein and Xu [140]. Here the equivalence classes are shown to be in canonical bijection with formal series in the second (complex) de Rham cohomology: One has a characteristic class
\[ c: \star \mapsto c(\star) \in \frac{[\omega]}{i\lambda} + H^2_{\mathrm{dR}}(M, \mathbb{C})[[\lambda]], \tag{2.4} \]
where the origin of the above affine space is chosen by convention, and two symplectic star products are equivalent if and only if their characteristic classes coincide. Moreover, a star product is equivalent to a Hermitian star product if and only if its characteristic class is imaginary [113]. The above classification is a particular case of the classification in the Poisson case, which is also obtained from Kontsevich’s formality. In general, the equivalence classes of star products are in bijection with the formal deformations of the Poisson bivector modulo formal diffeomorphisms [99]. Finally, Hermitian star products are equivalent if and only if they are ∗-equivalent, see e.g. [36, 113].
Having understood this, the next problem is to define positive functionals and ∗-representations in this context and determine (as far as possible) the representation theory of the star product algebras. The first attempt of considering ℂ-linear positive functionals ω: C^∞(M)[[λ]] → ℂ with ω(f̄ ⋆ f) ≥ 0 turns out to be too naive: One is faced immediately with convergence problems or one has to ignore higher orders in λ at some point. Both problems limit this attempt too much. It is simply the wrong category and we should better take the formal power series seriously. Thus the better choice is to look for ℂ[[λ]]-linear functionals
\[ \omega: C^\infty(M)[[\lambda]] \to \mathbb{C}[[\lambda]]. \tag{2.5} \]
Then, of course, we have to define what we mean by positivity. Here we can use the following simple but crucial fact: the real formal power series ℝ[[λ]] form in a natural way an ordered ring. One defines for a ∈ ℝ[[λ]]
\[ a = \sum_{r=r_0}^{\infty} \lambda^r a_r > 0 \quad \text{if and only if} \quad a_{r_0} > 0. \tag{2.6} \]
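The ordering (2.6) compares series by their lowest non-vanishing coefficient. The following is a minimal sketch, assuming that a formal power series is represented by a finite list of its first coefficients [a_0, a_1, ...]; this truncation is an assumption of the example, so a series whose listed coefficients all vanish is treated as zero. The function names are illustrative.

```python
def sign(series):
    """Sign of a truncated real formal power series with respect to (2.6)."""
    for coeff in series:
        if coeff != 0:
            # The lowest non-vanishing coefficient decides.
            return 1 if coeff > 0 else -1
    return 0

def sub(a, b):
    """Coefficientwise difference a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

def less(a, b):
    """a < b if and only if b - a is positive."""
    return sign(sub(b, a)) > 0

def mul(a, b, order):
    """Cauchy product of two series, truncated after lambda**order."""
    out = [0] * (order + 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j <= order:
                out[i + j] += x * y
    return out

zero, one, lam = [0], [1], [0, 1]

print(less(zero, lam))                                # lambda is positive
print(all(less([0, n], one) for n in range(1, 1000)))  # n * lambda stays below 1
print(sign(mul([0, 1, -5], [0, 0, 2, -7], 6)))         # product of two positive series
```

The second line illustrates why this ordering cannot be Archimedean: no integer multiple of λ exceeds 1, although λ itself is positive.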
This allows us to speak of positive linear functionals in a meaningful way and follow the above program once we have adapted the concepts of ∗ -representations
etc. to ∗-algebras defined over such an ordered ring. To provide such a framework, extending the usual framework of ∗-algebras over ℂ, is the main objective of this paper.
Let us conclude this section with moderate warnings on what to expect from this approach. Clearly, we have to expect artifacts like many inequivalent ∗-representations which will disappear in a convergent and more topological context. In some sense the situation might turn out to be even more involved than for ∗-algebras over ℂ. Thus it will be a difficult task to detect the “hard” superselection rules in this framework. On the other hand, the framework will hopefully be wide enough to contain all physically interesting ∗-representations. Obstructions found in this general framework will certainly be difficult to overcome in even more strict frameworks.
3. Algebraic Background: ∗-Algebras over Ordered Rings
In this section we set up the basic theory of ∗-algebras over ordered rings in order to have a unified approach for ∗-algebras over ℂ, such as C∗-algebras or more general O∗-algebras, and for the star product algebras, which are ∗-algebras over ℂ[[λ]]. The well-known case of operator algebras (bounded or unbounded, see e.g. [30, 93, 94, 125, 127]) will be the motivation and guideline throughout this section. As general references for this section see also [28, 35, 37, 134, 135, 137].
3.1. Ordered rings
Let us first recall the definition and some basic properties of ordered rings, which are a slight generalization of ordered fields, see e.g. [88, Sec. 5.1].
Definition 3.1 (Ordered Ring). An ordered ring (R, P) is a commutative, associative unital ring R together with a subset P ⊂ R, the positive elements, such that
(i) R = −P ⊔ {0} ⊔ P (disjoint union),
(ii) P · P ⊆ P and P + P ⊆ P.
The subset P induces an ordering defined by a < b if b − a ∈ P. The symbols ≤, ≥, > will then be used in the usual way.
In the following, we will fix an ordered ring R and denote by
\[ C = R(i) = R \oplus iR, \quad \text{where } i^2 = -1, \tag{3.1} \]
the ring-extension by a square root of −1. In C we have the usual complex conjugation
\[ z = a + ib \mapsto \bar{z} = a - ib, \tag{3.2} \]
where a, b ∈ R and R is considered as a sub-ring of C in the usual way.
Remark 3.2 (Characteristics and Quotient Fields of Ordered Rings).
(i) If a ∈ R, a ≠ 0 then a² > 0. Hence 1 = 1² > 0 and thus 1 + · · · + 1 = n > 0 for all n ∈ ℕ. Thus it follows that ℤ ⊆ R, whence R has characteristic zero.
(ii) Moreover, z z̄ = a² + b² > 0 for z = a + ib ∈ C, z ≠ 0. It also follows that the characteristic of C is zero, too.
(iii) If a, b ≠ 0 in R then we have four cases: a > 0 and b > 0, a > 0 and b < 0, a < 0 and b > 0, a < 0 and b < 0. In each case we obtain ab ≠ 0, whence R has no zero-divisors. The same holds for C. Hence we can pass to the quotient fields R̂ and Ĉ, respectively. The field R̂ is canonically ordered and R ↪ R̂ is order preserving. Finally, Ĉ = R̂(i).
Definition 3.3 (Archimedean Ordering). An ordered ring R is called Archimedean if for a, b > 0 there is an n ∈ ℕ with na > b. Otherwise R is called non-Archimedean.
Example 3.4 (Ordered Rings).
(i) ℤ is the smallest ordered ring and is contained in any other. Clearly ℤ is Archimedean.
(ii) ℚ and ℝ are Archimedean ordered rings, even ordered fields.
(iii) ℝ[[λ]] is non-Archimedean as nλ < 1 for all n ∈ ℕ although λ > 0. The quotient field is the field of formal Laurent series ℝ((λ)).
(iv) More generally, if R is an ordered ring then R[[λ]] is canonically ordered again by the analogous definition as in (2.6) and it is always non-Archimedean.
This already indicates that ordered rings and formal deformations fit together nicely, for the price of non-Archimedean orderings.
3.2. Pre-Hilbert spaces
Having an ordered ring we have the necessary notion of positivity in order to define pre-Hilbert spaces generalizing the usual complex case.
Definition 3.5 (Pre-Hilbert Space). A C-module H with a map ⟨·, ·⟩: H × H → C is called a pre-Hilbert space over C if
(i) ⟨·, ·⟩ is C-linear in the second argument,
(ii) $\langle \phi, \psi \rangle = \overline{\langle \psi, \phi \rangle}$ for all φ, ψ ∈ H,
(iii) ⟨φ, φ⟩ > 0 for φ ≠ 0.
A map A: H_1 → H_2 is called adjointable if there exists a (necessarily unique) map A∗: H_2 → H_1 with
\[ \langle \phi, A\psi \rangle_2 = \langle A^*\phi, \psi \rangle_1 \tag{3.3} \]
for all φ ∈ H_2, ψ ∈ H_1. Clearly, adjointable maps are C-linear and we have the usual rules for adjoints, i.e.
\[ (zA + wB)^* = \bar{z}A^* + \bar{w}B^*, \quad (AB)^* = B^*A^*, \quad \text{and} \quad (A^*)^* = A, \tag{3.4} \]
where the existence of the adjoints on the left is implied. The set of all adjointable maps is denoted by
\[ B(H_1, H_2) = \{A: H_1 \to H_2 \mid A \text{ is adjointable}\} \tag{3.5} \]
and
\[ B(H) = B(H, H). \tag{3.6} \]
Clearly, B(H) is a unital sub-algebra of all C-linear endomorphisms of H. In the particular case that H is a Hilbert space over ℂ the famous Hellinger–Toeplitz theorem states that the adjointable operators coincide with the continuous operators. Particular examples of adjointable operators are the rank-one and finite rank operators. For φ ∈ H_1 and ψ ∈ H_2 we define the rank-one operator
\[ \Theta_{\psi,\phi}: H_1 \ni \chi \mapsto \psi \langle \phi, \chi \rangle_1 \in H_2, \tag{3.7} \]
which is clearly adjointable with adjoint $\Theta_{\psi,\phi}^* = \Theta_{\phi,\psi}$. Moreover, we define the finite-rank operators
\[ F(H_1, H_2) = C\text{-span}\{\Theta_{\psi,\phi} \mid \phi \in H_1, \psi \in H_2\} \tag{3.8} \]
and set
\[ F(H) = F(H, H). \tag{3.9} \]
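The adjoint formula for rank-one operators can be verified directly in finite dimensions. The following is a minimal sketch, assuming the complex case C = ℂ with H_1 = ℂ² and H_2 = ℂ³ and the standard inner product; the vectors and function names are illustrative. It builds the matrix of Θ_{ψ,φ}, compares its conjugate transpose with the matrix of Θ_{φ,ψ}, and checks the defining relation (3.3) on one pair of test vectors.

```python
def inner(phi, psi):
    """<phi, psi>: anti-linear in the first, linear in the second argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))

def apply(A, chi):
    """Matrix A (list of rows) applied to the vector chi."""
    return [sum(a * x for a, x in zip(row, chi)) for row in A]

def theta(psi, phi):
    """Matrix of the rank-one operator chi -> psi * <phi, chi>."""
    return [[p * complex(f).conjugate() for f in phi] for p in psi]

def adjoint(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)]
            for j in range(cols)]

phi = [1, 2j]           # element of H_1 = C^2
psi = [1j, 0, 3]        # element of H_2 = C^3
T = theta(psi, phi)     # 3 x 2 matrix mapping H_1 to H_2

chi1 = [1, 1j]          # test vector in H_1
chi2 = [2, 1j, 1]       # test vector in H_2

lhs = inner(chi2, apply(T, chi1))            # <chi2, T chi1> in H_2
rhs = inner(apply(adjoint(T), chi2), chi1)   # <T* chi2, chi1> in H_1
print(adjoint(T) == theta(phi, psi), lhs, rhs)
```

The entries are chosen with integer real and imaginary parts so that the comparison of the two matrices is exact in floating-point arithmetic.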
We have F(H_1, H_2) ⊆ B(H_1, H_2) and F(H) ⊆ B(H). In general, these inclusions are proper:
Example 3.6 (Standard Pre-Hilbert Space). Let Λ be a set and consider the free C-module generated by Λ, i.e. $H = C^{(\Lambda)} = \bigoplus_{\lambda \in \Lambda} C_\lambda$ with C_λ = C for all λ. Then H becomes a pre-Hilbert space by
\[ \langle (x_\lambda), (y_\lambda) \rangle = \sum_{\lambda \in \Lambda} \bar{x}_\lambda y_\lambda. \tag{3.10} \]
In general F(H) ⊊ B(H) unless #Λ = n < ∞. In this case F(H) = B(H) ≅ M_n(C).
3.3. ∗-Algebras
The subalgebra B(H) ⊆ End(H) will be the motivating example of a ∗-algebra:
Definition 3.7 (∗-Algebra). An associative algebra A over C together with a C-anti-linear, involutive anti-automorphism ∗: A → A is called a ∗-algebra and ∗ is called the ∗-involution of A. A morphism of ∗-algebras is a morphism φ: A → B of associative C-algebras with φ(a∗) = φ(a)∗.
Example 3.8 (∗-Algebras).
(i) Hermitian star products on Poisson manifolds give ∗-algebras (C^∞(M)[[λ]], ⋆) over C = ℂ[[λ]], where the ∗-involution is the complex conjugation.
(ii) For any pre-Hilbert space H the algebra B(H) is a ∗-algebra with ∗-involution given by the adjoint. Moreover, F(H) ⊆ B(H) is a ∗-ideal.
(iii) In particular, M_n(C) is a ∗-algebra.
(iv) If A, B are ∗-algebras then A ⊗ B is again a ∗-algebra with the obvious ∗-involution.
(v) In particular, M_n(A) = A ⊗ M_n(C) is a ∗-algebra.
Having a ∗-algebra we can adapt the definitions of positive functionals, positive algebra elements and positive maps from the well-known theory of ∗-algebras over ℂ, see e.g. [127], immediately to our algebraic context. This motivates the following definitions [29, 35, 37]:
Definition 3.9 (Positivity). Let A be a ∗-algebra over C.
(i) A C-linear functional ω: A → C is called positive if
\[ \omega(a^* a) \ge 0 \tag{3.11} \]
for all a ∈ A. If A is unital then ω is called a state if in addition ω(𝟙) = 1.
(ii) a ∈ A is called positive if ω(a) ≥ 0 for all positive functionals ω of A. We set
\[ A^+ = \{a \in A \mid a \text{ is positive}\}, \tag{3.12} \]
\[ A^{++} = \Big\{a \in A \;\Big|\; a = \sum_{i=1}^{n} \beta_i b_i^* b_i, \text{ with } 0 < \beta_i \in R,\ b_i \in A\Big\}. \tag{3.13} \]
(iii) A linear map φ: A → B into another ∗-algebra B is called positive if φ(A⁺) ⊆ B⁺. Moreover, φ is called completely positive if the componentwise extension φ: M_n(A) → M_n(B) is positive for all n ∈ ℕ.
Remark 3.10 (Positive Elements and Maps).
(i) Clearly, we have A⁺⁺ ⊆ A⁺ but in general A⁺⁺ ≠ A⁺.
(ii) For C∗-algebras we have A⁺⁺ = A⁺ and, moreover, any positive element a has a unique positive square root √a with a = (√a)². This follows from the spectral calculus.
(iii) Any ∗-homomorphism is a completely positive map.
(iv) A linear map φ: A → B is positive if and only if for any positive linear functional ω: B → C the pull-back φ∗ω = ω ◦ φ is a positive functional of A. This is the case if and only if φ(A⁺⁺) ⊆ B⁺.
(v) A positive functional ω: A → C is a completely positive map. However, there are simple counter-examples which show that not every positive map is completely positive, even in the case C = ℂ, see e.g. [94, Exercise 11.5.15].
The following standard examples will be used later:
Example 3.11 (Positive Maps).
(i) The trace functional tr: M_n(A) → A is completely positive.
(ii) The map τ: M_n(A) → A defined by
\[ \tau((a_{ij})) = \sum_{i,j=1}^{n} a_{ij} \tag{3.14} \]
is completely positive.
(iii) For a ∈ A^+ and b ∈ B^+ we have a ⊗ b ∈ (A ⊗ B)^+. Indeed, for b ∈ B the map a ↦ a ⊗ b^∗b is clearly a positive map. Thus for a ∈ A^+ the element a ⊗ b^∗b is positive for all b. Hence also the map b ↦ a ⊗ b is a positive map for positive a. Thus the claim follows. It also follows that the tensor product of positive linear functionals is again a positive linear functional.

Though the above definition of positive functionals and elements is in some sense the canonical one there are other concepts for positivity. Indeed, in the theory of O∗-algebras the above definition does not give the most useful concept, see the discussion in [127]. In general, one defines an m-admissible wedge K ⊆ A of a unital ∗-algebra to be a subset of Hermitian elements such that K is closed under convex combinations, a^∗Ka ⊆ K for all a ∈ A and A^{++} ⊆ K. Then the elements in K are a replacement for the positive elements A^+. Also one can define a linear functional ω: A → C to be positive with respect to K if ω(K) ≥ 0. In particular, ω is positive in the usual sense, but not all positive functionals will be positive with respect to K. Similarly, this gives a refined definition of (completely) positive maps, leading to the notion of strong positivity in the case of O∗-algebras. In the following we shall stick to Definition 3.9 since it seems that for deformation quantization this is the “correct” choice. At least in the classical limit this definition produces the correct positive elements in C^∞(M), see e.g. the discussion in [35, App. B] and [137, Sec. 3].

4. Examples of Positive Functionals in Deformation Quantization

We shall now discuss three basic examples of positive functionals in deformation quantization: the δ-functionals, the Schrödinger functional and the positive traces and KMS functionals, see [24, 25, 29, 131] for these examples.

4.1. The δ-functional for the Weyl and Wick star product

We consider the most simple classical phase space R^{2n} with its standard symplectic structure and Poisson bracket. For this example one knows several explicit formulas for star products quantizing the canonical Poisson bracket. The most prominent one is the Weyl–Moyal star product
\[ f \star_{\mathrm{Weyl}} g = \mu \circ e^{\frac{i\lambda}{2} \sum_{r=1}^{n} \left( \frac{\partial}{\partial q^r} \otimes \frac{\partial}{\partial p_r} - \frac{\partial}{\partial p_r} \otimes \frac{\partial}{\partial q^r} \right)} f \otimes g, \tag{4.1} \]
where f, g ∈ C^∞(R^{2n})[[λ]] and µ(f ⊗ g) = fg is the pointwise (undeformed) product. Consider the Hamiltonian of the isotropic harmonic oscillator H(q, p) = ½(p^2 + q^2), where we put m = ω = 1 for simplicity. Then we have
\[ \bar H \star_{\mathrm{Weyl}} H = H^2 - \frac{\lambda^2}{4} \tag{4.2} \]
whence
\[ \delta_0(\bar H \star_{\mathrm{Weyl}} H) = -\frac{\lambda^2}{4} < 0. \tag{4.3} \]
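The value in (4.2) can be checked directly from (4.1). The following is a sketch under the assumption n = 1 (one degree of freedom), which is the case in which the constant comes out as λ²/4; the symbol P is introduced here only as shorthand for the bidifferential operator in the exponent of (4.1).

```latex
% Assumption: n = 1 and H = (q^2 + p^2)/2 is real, so \bar H = H.
% P is shorthand (not from the text) for the operator in the exponent of (4.1).
\begin{align*}
P &= \frac{\partial}{\partial q} \otimes \frac{\partial}{\partial p}
   - \frac{\partial}{\partial p} \otimes \frac{\partial}{\partial q}, \\
\mu \circ P\,(H \otimes H) &= q\,p - p\,q = 0, \\
\mu \circ P^2 (H \otimes H)
  &= \frac{\partial^2 H}{\partial q^2}\frac{\partial^2 H}{\partial p^2}
   - 2\,\frac{\partial^2 H}{\partial q\,\partial p}\frac{\partial^2 H}{\partial p\,\partial q}
   + \frac{\partial^2 H}{\partial p^2}\frac{\partial^2 H}{\partial q^2}
   = 1 - 0 + 1 = 2, \\
H \star_{\mathrm{Weyl}} H
  &= H^2 + \frac{i\lambda}{2}\cdot 0
   + \frac{1}{2}\Bigl(\frac{i\lambda}{2}\Bigr)^{2}\cdot 2
   = H^2 - \frac{\lambda^2}{4}.
\end{align*}
% All orders beyond \lambda^2 vanish because H is quadratic, and H(0) = 0
% gives \delta_0(H^2) = 0, which leaves -\lambda^2/4 in (4.3).
```

For general n the same steps give 2n in place of 2, so the constant scales with n while its sign, which is all that matters for (4.3), is unchanged.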
Thus the δ-functional at 0 (and similarly at any other point) cannot be a positive functional for the Weyl–Moyal star product, while classically all δ-functionals are of course positive. This has a very simple physical interpretation, namely that points in phase space are no longer valid states in quantum theory: we cannot localize both space and momentum coordinates because of the uncertainty relations.

More interesting and in some sense surprising is the behavior of the Wick star product (or normal ordered star product) which is defined by
\[ f \star_{\mathrm{Wick}} g = \mu \circ e^{2\lambda \sum_{r=1}^{n} \frac{\partial}{\partial z^r} \otimes \frac{\partial}{\partial \bar z^r}} f \otimes g, \tag{4.4} \]
where z^r = q^r + i p_r and z̄^r = q^r − i p_r. First recall that ⋆_Wick is equivalent to ⋆_Weyl by the equivalence transformation
\[ f \star_{\mathrm{Wick}} g = S\bigl( S^{-1} f \star_{\mathrm{Weyl}} S^{-1} g \bigr), \tag{4.5} \]
where
\[ S = e^{\lambda \Delta} \quad\text{and}\quad \Delta = \sum_{r=1}^{n} \frac{\partial^2}{\partial z^r\, \partial \bar z^r}. \tag{4.6} \]
The operator ∆ is, up to a constant multiple, the Laplace operator of the Euclidean metric on the phase space R^{2n}. With the explicit formula (4.4) we find
\[ \bar f \star_{\mathrm{Wick}} f = \sum_{r=0}^{\infty} \frac{(2\lambda)^r}{r!} \sum_{i_1, \dots, i_r = 1}^{n} \left| \frac{\partial^r f}{\partial \bar z^{i_1} \cdots \partial \bar z^{i_r}} \right|^2 \tag{4.7} \]
for f ∈ C^∞(R^{2n})[[λ]]. Thus any classically positive functional of C^∞(R^{2n}) is also positive with respect to the Wick star product. In particular, all the δ-functionals are positive. In some sense they have to be interpreted as coherent states in this context, see e.g. [12–14, 39–42] and references therein. It should be remarked that this simple observation has quite drastic consequences as we shall see in Sec. 5.2 which are far from being obvious. We also remark that this result still holds for Wick-type star products on arbitrary Kähler manifolds [27, 29, 96].

4.2. The Schrödinger functional

Consider again the Weyl–Moyal star product ⋆_Weyl on R^{2n} which we interpret now as the cotangent bundle π: T^∗R^n → R^n of the configuration space R^n. Denote by ι: R^n ↪ T^∗R^n the zero section. We consider the following Schrödinger functional
\[ \omega: C_0^\infty(T^*\mathbb{R}^n)[[\lambda]] \to \mathbb{C}[[\lambda]] \tag{4.8} \]
defined by
\[ \omega(f) = \int_{\mathbb{R}^n} \iota^* f \, d^n q. \tag{4.9} \]
Thanks to the restriction to formal series of functions with compact support the integration with respect to the usual Lebesgue measure d^n q is well-defined. One defines the operator
\[ N = e^{\frac{\lambda}{2i} \Delta}, \quad\text{where}\quad \Delta = \sum_{k=1}^{n} \frac{\partial^2}{\partial p_k\, \partial q^k} \tag{4.10} \]
is now the Laplacian (“d’Alembertian”) with respect to the maximally indefinite metric obtained by pairing the configuration space variables with the momentum variables. In fact, N is the equivalence transformation between the Weyl–Moyal star product and the standard-ordered star product as we shall see later. By some successive integration by parts one finds
\[ \omega(\bar f \star_{\mathrm{Weyl}} g) = \int_{\mathbb{R}^n} \overline{(\iota^* N f)}\,(\iota^* N g)\, d^n q, \tag{4.11} \]
whence immediately
\[ \omega(\bar f \star_{\mathrm{Weyl}} f) \ge 0. \tag{4.12} \]
Thus the Schrödinger functional is a positive functional with respect to the Weyl–Moyal star product. In fact, there is a geometric generalization for any cotangent bundle π: T^∗Q → Q of this construction, see [23–25, 117, 118]: For a given torsion-free connection ∇ on the configuration space Q and a given positive smooth density µ on Q one can construct rather explicitly a star product ⋆_Weyl which is the direct analog of the usual Weyl–Moyal star product in flat space. Moreover, using the connection ∇ one obtains a maximally indefinite pseudo-Riemannian metric on T^∗Q coming from the natural pairing of the vertical spaces with the horizontal space. The Laplacian ∆ (“d’Alembertian”) of this indefinite metric is in a bundle chart locally given by the explicit formula
\[ \Delta = \sum_{k=1}^{n} \frac{\partial^2}{\partial q^k\, \partial p_k} + \sum_{k,\ell,m=1}^{n} p_k\, \pi^*\Gamma^k_{\ell m}\, \frac{\partial^2}{\partial p_\ell\, \partial p_m} + \sum_{k,\ell=1}^{n} \pi^*\Gamma^k_{k\ell}\, \frac{\partial}{\partial p_\ell}, \tag{4.13} \]
generalizing (4.10) to the general curved framework. Here Γ^k_{ℓm} denote the Christoffel symbols of the connection ∇. This gives a geometric version of the operator N (Neumaier’s operator)
\[ N = e^{\frac{\lambda}{2i} (\Delta + F(\alpha))}, \tag{4.14} \]
where α ∈ Γ^∞(T^∗Q) is the one-form determined by ∇_X µ = α(X)µ and F(α) is the differential operator
\[ (F(\alpha) f)(\alpha_q) = \frac{d}{dt}\Big|_{t=0} f(\alpha_q + t\,\alpha(q)), \tag{4.15} \]
where q ∈ Q and α_q ∈ T_q^∗Q. In particular, one has α = 0 if the density µ is covariantly constant. This is the case if we choose ∇ to be the Levi–Civita connection of a Riemannian metric and µ = µ_g to be the corresponding Riemannian volume
density. Note that in a typical Hamiltonian system on T^∗Q we have a kinetic energy in the Hamiltonian which is nothing else than a Riemannian metric. Thus there is a preferred choice in this situation. The Schrödinger functional in this context is simply given by the integration with respect to the a priori chosen density µ
\[ \omega(f) = \int_Q \iota^* f \, \mu, \tag{4.16} \]
where again we restrict to f ∈ C_0^∞(T^∗Q)[[λ]] to have a well-defined integration. Now the non-trivial result is that the above formulas still hold in this general situation. We have
\[ \omega(\bar f \star_{\mathrm{Weyl}} g) = \int_Q \overline{(\iota^* N f)}\,(\iota^* N g)\, \mu \tag{4.17} \]
whence the Schrödinger functional is positive
\[ \omega(\bar f \star_{\mathrm{Weyl}} f) = \int_Q \overline{(\iota^* N f)}\,(\iota^* N f)\, \mu \ge 0. \tag{4.18} \]
The proof consists again in successive integrations by parts which is now much more involved due to the curvature terms coming from ∇, see [24, 25, 112] for details.

4.3. Positive traces and KMS functionals

We consider a connected symplectic manifold (M, ω) with a Hermitian star product ⋆. Then it is well-known that there exists a unique trace functional up to normalization and even the normalization can be chosen in a canonical way [71, 83, 97, 110]. Here a trace functional means a C[[λ]]-linear functional
\[ \mathrm{tr}: C_0^\infty(M)[[\lambda]] \to \mathbb{C}[[\lambda]] \tag{4.19} \]
such that
\[ \mathrm{tr}(f \star g) = \mathrm{tr}(g \star f). \tag{4.20} \]
Furthermore, it is known that tr is of the form
\[ \mathrm{tr}(f) = c_0 \int_M f \, \Omega + \text{higher orders in } \lambda, \tag{4.21} \]
where Ω is the Liouville form on M and c_0 is a normalization constant. If the integration with respect to the Liouville form is already a trace, i.e. one does not need the higher order corrections in (4.21), then the star product is called strongly closed [49]. Since ⋆ is a Hermitian star product, the functional f ↦ \overline{tr(f̄)} is still a trace whence we can assume that the trace we started with is already a real trace. In
particular, for this choice c_0 = c̄_0 is real. Passing to −tr if necessary we can assume that c_0 > 0. Then
\[ \mathrm{tr}(\bar f \star f) = c_0 \int_M \bar f f \, \Omega + \text{higher orders in } \lambda. \tag{4.22} \]
Hence, if f ≠ 0, already the zeroth order in tr(f̄ ⋆ f) is nonzero and clearly positive. Thus by definition of the ordering of R[[λ]] we see that tr is a positive functional. Note however the difference in the argument compared to the δ-functional.

More generally, we can consider thermodynamical states, i.e. KMS functionals. Here we fix a Hamiltonian H = H̄ ∈ C^∞(M)[[λ]] and an “inverse temperature” β > 0. Then the star exponential Exp(−βH) ∈ C^∞(M)[[λ]] is well-defined as the solution of the differential equation
\[ \frac{d}{d\beta} \mathrm{Exp}(-\beta H) = -H \star \mathrm{Exp}(-\beta H) \tag{4.23} \]
with initial condition Exp(0) = 1. The star exponential has all desired functional properties of an “exponential function”, see e.g. the discussion in [26]. Using this, the KMS functional corresponding to this data is defined by
\[ \omega_{H,\beta}(f) = \mathrm{tr}(\mathrm{Exp}(-\beta H) \star f) \tag{4.24} \]
for f ∈ C_0^∞(M)[[λ]]. The positivity of tr and the existence of a square root Exp(−(β/2)H) of Exp(−βH) show that the KMS functional is indeed a positive functional again.

Remark 4.1. Originally, KMS functionals are characterized by the so-called KMS condition [85, 103, 107] in a more operator-algebraic approach which was transferred to deformation quantization in [6, 7]. However, the existence of a unique trace functional allows one to classify the KMS functionals completely, yielding the above characterization [26]. Note that this only holds in the (connected) symplectic framework: in the general Poisson framework traces are no longer unique, see e.g. [16, 72], so the arguments of [26] no longer apply. Thus it would be very interesting to get some more insight into the nature of the KMS functionals in the general Poisson case.
5. Deformation and Classical Limit of Positive Functionals

The interpretation of states as positive linear functionals allows for a simple definition of a classical limit of a state of a Hermitian deformation of a ∗-algebra. The converse question whether any “classical” state can be deformed is much more delicate and sheds some new light on the relevance of the Wick star product.
5.1. Completely positive deformations

Let A be a ∗-algebra over C. In the spirit of star products we consider a Hermitian deformation of A, i.e. an associative C[[λ]]-bilinear multiplication ⋆ for A[[λ]] making 𝐀 = (A[[λ]], ⋆, ∗) a ∗-algebra over C[[λ]]. The C[[λ]]-bilinearity of ⋆ implies that
\[ a \star b = \sum_{r=0}^{\infty} \lambda^r C_r(a, b) \tag{5.1} \]
with C-bilinear maps C_r: A × A → A, extended to 𝐀 by the usual C[[λ]]-bilinearity. As usual, the deformation aspect is encoded in the condition C_0(a, b) = ab. Note that we do not want to deform the ∗-involution though in principle this can also be taken into account, see e.g. the discussion in [33, 35]. As we mentioned already in Example 3.4 we are still in the framework of ∗-algebras over ordered rings as R[[λ]] is canonically ordered.

Now assume ω: 𝐀 → C[[λ]] is a C[[λ]]-linear positive functional. Then the C[[λ]]-linearity implies that ω is actually of the form ω = Σ_{r=0}^∞ λ^r ω_r with C-linear functionals ω_r: A → C, the latter being canonically extended to 𝐀 by C[[λ]]-linearity. Since ⋆ deforms the given multiplication of A we obtain from the positivity of ω
\[ 0 \le \omega(a^* \star a) = \omega_0(a^* a) + \lambda \bigl( \omega_0(C_1(a^*, a)) + \omega_1(a^* a) \bigr) + \text{higher order terms}. \tag{5.2} \]
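The ordering of R[[λ]] invoked in (5.2) compares formal series by their lowest non-vanishing coefficient. A minimal sketch, assuming real coefficients truncated to finitely many orders and stored as a Python list; the function name `series_sign` is purely illustrative and not taken from the text:

```python
from fractions import Fraction

def series_sign(coeffs):
    """Sign of a truncated formal series sum_r lambda^r * coeffs[r].

    In the canonical ordering of R[[lambda]] a non-zero series is positive
    exactly when its lowest non-vanishing coefficient is positive.
    """
    for c in coeffs:
        if c != 0:
            return 1 if c > 0 else -1
    return 0  # the zero series (up to the truncation order)

# delta_0(H *_Weyl H) = -lambda^2/4 from (4.3): negative in R[[lambda]]
weyl_value = [Fraction(0), Fraction(0), Fraction(-1, 4)]
# a positive zeroth order cannot be outweighed by any order-lambda term
dominated = [Fraction(1), Fraction(-1000)]

print(series_sign(weyl_value), series_sign(dominated))  # prints: -1 1
```

The second example illustrates why only ω_0(a^∗a) matters in (5.2) as long as it is non-zero, and why the higher orders become decisive exactly when it vanishes.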
Thus it follows immediately from the ordering of R[[λ]] that ω_0 has to be a positive linear functional of A. In this sense, the classical limit of a quantum state is a classical state. Note that this statement is non-trivial, though physically of course more than plausible.

This observation immediately raises the question whether the converse is true as well: can we always deform states? We know already from the example of the δ-functional and the Weyl–Moyal star product that in general some quantum corrections ω_1, ω_2, . . . are unavoidable. The reason can easily be seen from the expansion (5.2): If ω_0(a^∗a) = 0 then the positivity of ω is decided in the next order. But this involves now the higher order terms C_1(a^∗, a), etc. of the deformed product and these terms usually do not have any reasonable positivity properties. Thus the terms ω_1 etc. have to be well chosen in order to compensate for this. We state the following definition [33, 37]:

Definition 5.1 (Positive Deformations). A Hermitian deformation 𝐀 = (A[[λ]], ⋆, ∗) of a ∗-algebra A over C is called a positive deformation if for any positive C-linear functional ω_0 there exists a deformation ω = Σ_{r=0}^∞ λ^r ω_r: 𝐀 → C[[λ]] into a C[[λ]]-linear positive functional with respect to ⋆. Furthermore 𝐀 is called a completely positive deformation if M_n(𝐀) is a positive deformation of M_n(A) for all n ∈ N.

Example 5.2 (A Non-Positive Deformation). Let A be a ∗-algebra over C with multiplication µ: A ⊗ A → A. Then 𝐀 = (A[[λ]], λµ, ∗) is a Hermitian deformation
of the trivial ∗-algebra A_0 having A as C-module and equipped with the zero multiplication. Since for this trivial ∗-algebra any (real) functional is positive, we cannot expect to have a positive deformation in general since positivity with respect to ⋆ = λµ is a non-trivial condition. Thus not every Hermitian deformation is a positive deformation, showing the non-triviality of the definition.

5.2. Complete positivity of Hermitian star products

Recall that for the Wick star product ⋆_Wick on R^{2n} = C^n we do not need to deform classically positive linear functionals at all: They are positive with respect to ⋆_Wick for free. Thus if ⋆ is any Hermitian symplectic star product on R^{2n} it is equivalent and hence ∗-equivalent to ⋆_Wick by some ∗-equivalence transformation T = id + Σ_{r=1}^∞ λ^r T_r. Hence for any classical positive linear functional ω_0 we see that
\[ \omega = \omega_0 \circ T = \omega_0 + \lambda\, \omega_0 \circ T_1 + \text{higher order terms} \tag{5.3} \]
gives a positive linear functional with respect to ⋆ deforming ω_0.

Example 5.3. The functional f ↦ δ ∘ e^{λ∆} f is a deformation of the δ-functional into a positive linear functional with respect to the Weyl–Moyal star product.

Now let ⋆ be a Hermitian star product on a symplectic manifold M. Using a quadratic partition of unity Σ_α χ̄_α χ_α = 1 subordinate to some Darboux atlas of M we can localize a given classical positive linear functional ω_0 by writing
\[ \omega(f) = \sum_{\alpha} \omega_0(\bar\chi_\alpha \star f \star \chi_\alpha) \tag{5.4} \]
such that each ω_0(χ̄_α ⋆ · ⋆ χ_α) has only support in one Darboux chart. Hence we can replace ω_0 by some deformation (depending on α) in order to make it positive with respect to ⋆ and glue things together in the end. The final result will then be a deformation of ω_0 which is now positive with respect to ⋆. This shows [33]:

Theorem 5.4. Any symplectic Hermitian star product is a positive deformation.
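Example 5.3 can be checked by hand on the harmonic oscillator of Sec. 4.1. The sketch below assumes n = 1 and uses z = q + ip, so that H = z z̄/2 and ∆ = ∂²/∂z∂z̄ as in (4.6); it applies e^{λ∆} to the right-hand side of (4.2) and then evaluates at 0.

```latex
% Assumptions: n = 1, z = q + i p, H = z \bar z / 2,
% \Delta = \partial^2 / (\partial z \, \partial \bar z) as in (4.6).
\begin{align*}
\Delta H^2 &= \Delta\, \frac{z^2 \bar z^2}{4} = z \bar z = 2H, \qquad
\Delta^2 H^2 = 1, \qquad \Delta^3 H^2 = 0, \\
e^{\lambda \Delta}\Bigl( H^2 - \frac{\lambda^2}{4} \Bigr)
  &= H^2 + 2\lambda H + \frac{\lambda^2}{2} - \frac{\lambda^2}{4}
   = H^2 + 2\lambda H + \frac{\lambda^2}{4}, \\
\bigl( \delta_0 \circ e^{\lambda \Delta} \bigr)(\bar H \star_{\mathrm{Weyl}} H)
  &= \frac{\lambda^2}{4} > 0 .
\end{align*}
```

The negative value −λ²/4 from (4.3) is thus compensated by the corrections ω_1 = δ_0 ∘ ∆ and ω_2 = ½ δ_0 ∘ ∆². The result agrees with the r = 0 term of (4.7) applied to e^{λ∆}H = H + λ/2, namely |λ/2|².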
It is easy to see that the same argument applies for M_n(C)-valued functions whence any symplectic Hermitian star product is even a completely positive deformation.

The Poisson case proves to be more involved. First we note that we again have to solve only the local problem of showing that Hermitian star products on R^n are positive since the same gluing argument as in the symplectic case can be applied. Hence we consider the local case R^n with coordinates q^1, . . . , q^n equipped with a Hermitian star product ⋆ deforming some Poisson structure. We define
\[ W_0 = C^\infty(\mathbb{R}^n)[[p_1, \dots, p_n]] \quad\text{and}\quad W = W_0[[\lambda]], \tag{5.5} \]
whence we can equip W with the Weyl–Moyal star product ⋆_Weyl. Clearly the formula (4.1) extends to the “formal” momentum variables. We can think of W_0 as functions on a “formal cotangent bundle” of R^n. Thus we have also the two
canonical maps ι^∗: W_0 → C^∞(R^n) and π^∗: C^∞(R^n) ↪ W_0 which are algebra homomorphisms with respect to the undeformed products. Now the idea is to deform π^∗ into a ∗-algebra homomorphism
\[ \tau = \sum_{k=0}^{\infty} \tau_k : (C^\infty(\mathbb{R}^n)[[\lambda]], \star) \to (W, \star_{\mathrm{Weyl}}), \tag{5.6} \]
where τ_0 = π^∗ and τ_k is homogeneous of degree k with respect to the grading induced by the λ-Euler derivation (homogeneity operator)
\[ H = \lambda \frac{\partial}{\partial \lambda} + \sum_{k=1}^{n} p_k \frac{\partial}{\partial p_k}. \tag{5.7} \]
This can actually be done by an inductive construction of the τ_k using the vanishing of a certain Hochschild cohomology (essentially W as a C^∞(R^n)-bimodule with bimodule multiplication given by ⋆_Weyl), see [22, 114] as well as [37] for details.

Remark 5.5 (Quantized Formal Symplectic Realization). Note that by setting λ = 0 in τ one obtains a “formal” symplectic realization of the Poisson manifold R^n. Here formal is understood in the sense that the dependence on the momentum variables is formal. Thus τ can be seen as a quantized formal symplectic realization. Note also that the same construction can be done for any Poisson manifold Q if one replaces W_0 by the formal functions on the cotangent bundle T^∗Q and ⋆_Weyl by the homogeneous star product of Weyl type for T^∗Q constructed out of a connection on Q as in [24, 25].

Having τ it is very easy to obtain a deformation of a given positive functional ω_0: C^∞(R^n) → C. The key observation is that ω_0 ∘ ι^∗ is a positive linear functional of W_0 and thus, using the Wick star product again, ω_0 ∘ ι^∗ ∘ S is a positive C[[λ]]-linear functional of W, equipped with the Weyl–Moyal star product ⋆_Weyl, where S is defined as in (4.6) using only the formal momentum variables for the definition of the partial derivatives ∂/∂z^k and ∂/∂z̄^k. Then clearly ω_0 ∘ ι^∗ ∘ S ∘ τ gives the deformation of ω_0 into a positive functional with respect to ⋆. It is also clear that the matrix-valued case causes no further problems whence we have the following result [37], answering thereby a question raised in [16]:

Theorem 5.6. Every Hermitian star product on a Poisson manifold is a completely positive deformation.

We discuss now some easy consequences of this result:

Corollary 5.7. For Hermitian star products one has sufficiently many positive linear functionals in the sense that for 0 ≠ H = H̄ ∈ C^∞(M)[[λ]] one finds a positive linear functional ω: C^∞(M)[[λ]] → C[[λ]] with ω(H) ≠ 0.
Indeed, classically this is certainly true and by Theorem 5.6 we just have to deform an appropriate classically positive functional.
Corollary 5.8. If H ∈ (C^∞(M)[[λ]], ⋆)^+ then for the classical limit we have H_0 ∈ C^∞(M)^+ as well.

In general, it is rather difficult to characterize the positive algebra elements in a star product algebra beyond this zeroth order. A nice application is obtained for the following situation: consider a Lie algebra g of a compact Lie group and let ⋆_BCH be the Baker–Campbell–Hausdorff star product on g^∗, as constructed in [80]. Then any g-invariant functional on C^∞(g^∗) defines a trace with respect to ⋆_BCH, see [16]. The question is whether one can deform a classical positive trace into a positive trace with respect to ⋆_BCH. In [16] this was obtained for very particular trace functionals by some BRST-like quantization of a phase space reduction. However, as already indicated in [16, Sec. 8], we can just deform the trace into some positive functional thanks to Theorem 5.6, probably losing the trace property. But averaging over the compact group corresponding to g gives again a g-invariant functional, hence a trace, now without losing the positivity. Thus we have the result:

Corollary 5.9. Any g-invariant functional on C^∞(g^∗) can be deformed into a positive trace functional with respect to the Baker–Campbell–Hausdorff star product ⋆_BCH.

6. The GNS Construction and Examples

As we have argued in Sec. 2.2 we have to construct ∗-representations of the observable algebra in order to implement the superposition principle for states. The GNS construction, which is well-known for ∗-algebras over C, see e.g. [30, 93, 94, 125, 127], provides a canonical way to construct such ∗-representations out of a given positive linear functional.

6.1. ∗-Representation theory
We start with some general remarks on ∗-representations of ∗-algebras by transferring the usual notions to the algebraic framework for ∗-algebras over C.

Definition 6.1 (∗-Representations). Let A be a ∗-algebra over C.
(i) A ∗-representation π of A on a pre-Hilbert space H is a ∗-homomorphism
\[ \pi: A \to B(H). \tag{6.1} \]
(ii) An intertwiner T between two ∗-representations (H_1, π_1) and (H_2, π_2) is a map T ∈ B(H_1, H_2) with
\[ T \pi_1(a) = \pi_2(a) T \tag{6.2} \]
for all a ∈ A.
(iii) Two ∗-representations are called unitarily equivalent if there exists a unitary intertwiner T between them, i.e. T^∗ = T^{−1}.
(iv) A ∗-representation (H, π) is called strongly nondegenerate if π(A)H = H.
(v) A ∗-representation (H, π) is called cyclic with cyclic vector Ω ∈ H if π(A)Ω = H.

Remark 6.2 (∗-Representations).
(i) For unital ∗-algebras we only consider unital ∗-homomorphisms by convention. Thus in the unital case, ∗-representations are always strongly nondegenerate by convention. This is reasonable since if π is a ∗-representation of a unital ∗-algebra then π(𝟙) = P is a projection P = P^∗ = P^2 and thus we can split the representation space H = PH ⊕ (id − P)H into an orthogonal direct sum. Then the ∗-representation π is easily seen to be block-diagonal with respect to this decomposition and π is identically zero on (id − P)H. Thus the only “interesting” part is PH which is strongly nondegenerate.
(ii) The space of intertwiners is a C-module and clearly the composition of intertwiners gives again an intertwiner.

This last observation allows us to state the following definition of the category of ∗-representations:

Definition 6.3 (∗-Representation theory). The ∗-representation theory of A is the category of ∗-representations with intertwiners as morphisms. It is denoted by ∗-rep(A), and ∗-Rep(A) denotes the subcategory of strongly nondegenerate ∗-representations.

Thus the final goal is to understand ∗-Rep(A) for a given ∗-algebra A like e.g. a star product algebra A = (C^∞(M)[[λ]], ⋆, ¯ ), with complex conjugation as ∗-involution. Since direct orthogonal sums of ∗-representations give again ∗-representations one would like to understand if and how a given ∗-representation can be decomposed into a direct sum of nondecomposable (irreducible) ∗-representations. In practice this will be only achievable for very particular and simple examples. In general, it is rather hopeless to obtain such a complete picture of ∗-Rep(A). Thus other strategies have to be developed, like e.g. finding “interesting” subclasses of ∗-representations.
Nevertheless, the GNS construction will allow us to construct at least a large class of ∗ -representations, namely out of given positive functionals. Hence one can discuss those GNS representations which come from positive functionals that are of particular interest.
6.2. The general GNS construction

The whole GNS construction is a consequence of the Cauchy–Schwarz inequality which itself is obtained from the following simple but crucial lemma. Of course this is well-known when C is the field of complex numbers.
Lemma 6.4. Let p(z, w) = a z̄z + b z w̄ + b′ z̄w + c w̄w ≥ 0 for all z, w ∈ C, where a, b, b′, c ∈ C. Then
\[ a \ge 0, \quad c \ge 0, \quad \bar b = b' \quad\text{and}\quad b \bar b \le a c. \tag{6.3} \]

Proof. Taking z = 0 gives c ≥ 0 and similarly a ≥ 0 follows. Taking z = 1, i and w = 1 implies b̄ = b′. Now we first consider the case a = 0 = c. Then taking w = b gives b b̄ (z + z̄) ≥ 0 for all z ∈ C, whence z = −1 gives b = 0. Thus we can assume, say, a > 0. Taking z = b̄ and w = −a gives a(b̄b − b̄b − b̄b + ac) ≥ 0 whence the inequality b̄b ≤ ac holds, too.

Corollary 6.5 (Cauchy–Schwarz Inequality). Let ω: A → C be a positive linear functional. Then
\[ \omega(a^* b) = \overline{\omega(b^* a)} \tag{6.4} \]
and
\[ \omega(a^* b)\, \overline{\omega(a^* b)} \le \omega(a^* a)\, \omega(b^* b) \tag{6.5} \]
for all a, b ∈ A.

For the proof one considers p(z, w) = ω((za + wb)^∗(za + wb)) ≥ 0. In particular, if A is unital then we have
\[ \omega(a^*) = \overline{\omega(a)} \tag{6.6} \]
and ω(𝟙) = 0 implies already ω = 0. Thus (by passing to the quotient field if necessary) we can replace positive linear functionals by normalized ones, i.e. by states ω(𝟙) = 1. Now we consider the following subset
\[ J_\omega = \{ a \in A \mid \omega(a^* a) = 0 \} \subseteq A. \tag{6.7} \]
Using the Cauchy–Schwarz inequality one obtains immediately that J_ω is a left ideal of A, the so-called Gel’fand ideal of ω. Thus we can form the quotient
\[ H_\omega = A / J_\omega, \tag{6.8} \]
which is a left A-module in the usual way. We denote by ψ_b ∈ H_ω the equivalence class of b ∈ A. Then the left module structure can be written as
\[ \pi_\omega(a) \psi_b = \psi_{ab}, \tag{6.9} \]
for a ∈ A and ψ_b ∈ H_ω. Furthermore, H_ω becomes a pre-Hilbert space by setting
\[ \langle \psi_b, \psi_c \rangle_\omega = \omega(b^* c). \tag{6.10} \]
Indeed, this is well-defined thanks to the Cauchy–Schwarz inequality. Moreover, ⟨·, ·⟩_ω is positive definite since we divided precisely by the “null-vectors” in A. Finally, we have
\[ \langle \psi_b, \pi_\omega(a) \psi_c \rangle_\omega = \omega(b^* a c) = \omega((a^* b)^* c) = \langle \pi_\omega(a^*) \psi_b, \psi_c \rangle_\omega, \tag{6.11} \]
whence π_ω is a ∗-representation of A on H_ω.
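For a finite-dimensional illustration of the construction above, one may take A to be the 2 × 2 complex matrices with the vector state ω(a) = a_11. This choice is an assumption made here for concreteness; it is a special case of Example 6.8. In that case ω(a^∗a) is the squared norm of the first column of a, so the Gel’fand ideal consists of the matrices with vanishing first column and ψ_b can be identified with the first column of b. All helper names in the sketch are hypothetical.

```python
# Assumption: A = 2x2 complex matrices as nested lists, omega(a) = a[0][0].

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def omega(a):
    """Vector state induced by the first basis vector."""
    return a[0][0]

def psi(b):
    """Representative of the class of b in A / J_omega: its first column."""
    return [row[0] for row in b]

def inner(b, c):
    """<psi_b, psi_c>_omega = omega(b^* c), cf. (6.10)."""
    return omega(matmul(adjoint(b), c))

a = [[1 + 2j, 3], [0, 1j]]
b = [[2, 5], [1j, -1]]
c = [[1j, 0], [4, 7]]
j = [[0, 1], [0, 2 - 1j]]   # first column zero, so j lies in J_omega

# j is a null vector, and adding it does not change the class of b
b_plus_j = [[b[r][s] + j[r][s] for s in range(2)] for r in range(2)]
assert omega(matmul(adjoint(j), j)) == 0
assert psi(b_plus_j) == psi(b)

# left ideal property: a*j is again in J_omega
aj = matmul(a, j)
assert omega(matmul(adjoint(aj), aj)) == 0

# *-representation property (6.11)
assert inner(b, matmul(a, c)) == inner(matmul(adjoint(a), b), c)

# Cauchy-Schwarz inequality (6.5)
assert abs(inner(b, c)) ** 2 <= inner(b, b).real * inner(c, c).real
```

The entries are Gaussian integers, so the floating-point comparisons are exact. With this identification π_ω(a) acts on first columns by ordinary matrix multiplication, which recovers the defining representation on C^2.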
Definition 6.6 (GNS Representation). For a positive linear functional ω: A → C the ∗-representation (H_ω, π_ω) is called the GNS representation of ω.

If A is unital, we can recover the functional ω as “vacuum expectation value” with respect to the “vacuum vector” ψ_𝟙 as follows
\[ \omega(a) = \langle \psi_{\mathbb{1}}, \pi_\omega(a) \psi_{\mathbb{1}} \rangle_\omega, \tag{6.12} \]
and ψ_𝟙 is obviously a cyclic vector for the GNS representation since
\[ \psi_b = \pi_\omega(b) \psi_{\mathbb{1}} \tag{6.13} \]
for all ψ_b ∈ H_ω. It turns out that these properties already characterize the ∗-representation (H_ω, π_ω, ψ_𝟙) up to unitary equivalence:

Theorem 6.7 (GNS Representation). Let A be unital and let ω: A → C be a positive linear functional. If (H, π, Ω) is a cyclic ∗-representation such that
\[ \omega(a) = \langle \Omega, \pi(a) \Omega \rangle \tag{6.14} \]
for all a ∈ A then (H, π, Ω) is unitarily equivalent to the GNS representation (H_ω, π_ω, ψ_𝟙) via the unitary intertwiner
\[ U: H_\omega \ni \psi_b \mapsto U\psi_b = \pi(b)\Omega \in H. \tag{6.15} \]
The proof consists essentially in showing that U is well-defined. Then the remaining properties are immediate.

Example 6.8. Let H be a pre-Hilbert space and φ ∈ H a unit vector, ⟨φ, φ⟩ = 1. Then for A = B(H) and for
\[ \omega(A) = \langle \phi, A\phi \rangle \tag{6.16} \]
one recovers the defining ∗-representation of A on H as the GNS representation corresponding to ω. Note that one can replace B(H) by F(H) as well.

A slight generalization is obtained for the following situation: Let B ⊆ A be a ∗-ideal and let ω: B → C be a positive linear functional which does not necessarily extend to A. Let J_ω ⊆ B be its Gel’fand ideal and let (H_ω = B/J_ω, π_ω) be the corresponding GNS representation of B. In this situation we have:

Lemma 6.9. J_ω ⊆ A is a left ideal in A as well whence the GNS representation π_ω of B extends canonically to A by the definition π_ω(a)ψ_b = ψ_{ab} and yields a ∗-representation of A on H_ω.

The proof is again a consequence of the Cauchy–Schwarz inequality, see [29, Corollary 1]. Nevertheless, this will be very useful in the examples later.

6.3. The case of δ, Schrödinger and trace functionals

Let us now come back to deformation quantization and the examples of positive functionals as in Sec. 4. We want to determine their GNS representations explicitly.
The δ-functional and the Wick star product

From the explicit formula for the Wick star product (4.7) we see that the Gel’fand ideal of the δ-functional is simply given by
\[ J_\delta = \Bigl\{ f \in C^\infty(M)[[\lambda]] \Bigm| \frac{\partial^r f}{\partial \bar z^{i_1} \cdots \partial \bar z^{i_r}}(0) = 0 \text{ for all } r \in \mathbb{N}_0,\ i_1, \dots, i_r = 1, \dots, n \Bigr\}. \tag{6.17} \]
In order to obtain explicit formulas for the GNS representation we consider the C[[λ]]-module
\[ H = \mathbb{C}[[\bar y^1, \dots, \bar y^n]][[\lambda]] \tag{6.18} \]
which we endow with the inner product
\[ \langle \phi, \psi \rangle = \sum_{r=0}^{\infty} \frac{(2\lambda)^r}{r!} \sum_{i_1, \dots, i_r = 1}^{n} \overline{\frac{\partial^r \phi}{\partial \bar y^{i_1} \cdots \partial \bar y^{i_r}}(0)}\; \frac{\partial^r \psi}{\partial \bar y^{i_1} \cdots \partial \bar y^{i_r}}(0). \tag{6.19} \]
Clearly, this is well-defined as formal power series in λ and turns H into a pre-Hilbert space over C[[λ]]. Then we have the following characterization of the GNS representation corresponding to the δ-functional [29]:

Theorem 6.10 (Formal Bargmann–Fock Representation). The GNS pre-Hilbert space H_δ is isometrically isomorphic to H via the unitary map
\[ U: H_\delta \ni \psi_f \mapsto \sum_{r=0}^{\infty} \frac{1}{r!} \sum_{i_1, \dots, i_r = 1}^{n} \frac{\partial^r f}{\partial \bar z^{i_1} \cdots \partial \bar z^{i_r}}(0)\, \bar y^{i_1} \cdots \bar y^{i_r} \in H, \tag{6.20} \]
i.e. the formal z̄-Taylor expansion around 0. This way the GNS representation π_δ on H_δ becomes the formal Bargmann–Fock representation in Wick ordering
\[ \varrho_{\mathrm{Wick}}(f) = \sum_{r,s=0}^{\infty} \frac{(2\lambda)^r}{r!\, s!} \sum_{i_1, \dots, i_r, j_1, \dots, j_s = 1}^{n} \frac{\partial^{r+s} f}{\partial \bar z^{j_1} \cdots \partial \bar z^{j_s}\, \partial z^{i_1} \cdots \partial z^{i_r}}(0)\; \bar y^{j_1} \cdots \bar y^{j_s}\, \frac{\partial^r}{\partial \bar y^{i_1} \cdots \partial \bar y^{i_r}}, \tag{6.21} \]
i.e. ϱ_Wick(f) = U π_δ(f) U^{−1}.

With the explicit formulas for J_δ, ⋆_Wick and U the proof is a straightforward verification. In particular, ϱ_Wick is indeed the Bargmann–Fock representation in normal ordering (Wick ordering): Specializing (6.21) we have
\[ \varrho_{\mathrm{Wick}}(z^i) = 2\lambda \frac{\partial}{\partial \bar y^i} \quad\text{and}\quad \varrho_{\mathrm{Wick}}(\bar z^i) = \bar y^i, \tag{6.22} \]
together with normal ordering for higher polynomials in z^i and z̄^i. Thus we obtain exactly the creation and annihilation operators.
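As a consistency check of (6.22), the following sketch treats the case n = 1 (an assumption made only to shorten the notation). It shows that the operators reproduce the Wick star commutator of z and z̄, and that ϱ_Wick(z) is the adjoint of ϱ_Wick(z̄) with respect to the inner product (6.19).

```latex
% Assumption: n = 1; the monomials \bar y^k span the formal Bargmann--Fock space.
\begin{align*}
z \star_{\mathrm{Wick}} \bar z - \bar z \star_{\mathrm{Wick}} z
  &= (z \bar z + 2\lambda) - z \bar z = 2\lambda, \\
\bigl[ \varrho_{\mathrm{Wick}}(z), \varrho_{\mathrm{Wick}}(\bar z) \bigr]
  &= 2\lambda \Bigl[ \frac{\partial}{\partial \bar y}, \bar y \Bigr] = 2\lambda, \\
\langle \bar y^k, \bar y^\ell \rangle
  &= \delta_{k\ell}\, (2\lambda)^k\, k!, \\
\langle \bar y^{k+1}, \varrho_{\mathrm{Wick}}(\bar z)\, \bar y^k \rangle
  &= (2\lambda)^{k+1} (k+1)!
   = \langle \varrho_{\mathrm{Wick}}(z)\, \bar y^{k+1}, \bar y^k \rangle .
\end{align*}
```

The third line follows from (6.19) because only the term with r = k = ℓ survives, contributing (k!)² (2λ)^k / k!.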
In fact, one can make the relation to the well-known “convergent” Bargmann–Fock representation even more transparent. Recall that the Bargmann–Fock Hilbert space is given by the anti-holomorphic functions
\[ H_{BF} = \Bigl\{ f \in \overline{\mathcal{O}}(\mathbb{C}^n) \Bigm| \frac{1}{(2\pi\hbar)^n} \int |f(\bar z)|^2\, e^{-\frac{\bar z z}{2\hbar}}\, dz\, d\bar z < \infty \Bigr\}, \tag{6.23} \]
which are square-integrable with respect to the above Gaussian measure, see [4, 5] as well as [106]. Here ħ > 0 is Planck’s constant. The inner product is then the corresponding L^2-inner product and it is well-known that those anti-holomorphic functions form a closed subspace of all square integrable functions L^2(C^n, e^{−z̄z/2ħ} dz dz̄). The quantization of polynomials in z̄ and z is given by
\[ \pi_{BF}(z^k) = 2\hbar \frac{\partial}{\partial \bar z^k} = a_k \tag{6.24} \]
\[ \pi_{BF}(\bar z^k) = \bar z^k = a_k^\dagger \tag{6.25} \]
plus normal ordering for the higher monomials. Here (6.24) and (6.25) are densely defined operators on H_BF which turn out to be mutual adjoints when the domains are chosen appropriately. Then the formal Bargmann–Fock space H together with the inner product (6.19) can be seen as an asymptotic expansion of H_BF for ħ → 0 and similarly ϱ_Wick arises as asymptotic expansion of π_BF. Note that for a wide class of elements in H and a large class of observables like the polynomials the asymptotic expansion is already the exact result. In both cases the “vacuum vector” is just the constant function 1 out of which all anti-holomorphic “functions” are obtained by successively applying the creation operators ϱ_Wick(z̄^k) or π_BF(z̄^k), respectively.

Remark 6.11. In the formal framework, similar results can be obtained easily for any Kähler manifold being equipped with the Fedosov star product of Wick type, i.e. the star product with separation of variables according to the Kähler polarization [27, 29, 96]. In particular, all δ-functionals are still positive linear functionals and essentially all formulas are still valid if one replaces the formal z̄-Taylor expansion with the “anti-holomorphic part” of the Fedosov–Taylor series, see [29]. Note however, that the convergent analog is much more delicate and requires additional assumptions on the topology in the compact case. There is an extensive literature on this topic, see e.g. [12–14, 21, 39–42, 98, 106, 126]. It would be interesting to understand the relations between both situations better, in particular concerning the representation point of view.

The Schrödinger representation

To obtain the most important representation for mathematical physics, the Schrödinger representation, we consider again the Weyl–Moyal star product ⋆_Weyl on T^∗R^n together with the Schrödinger functional ω as in (4.9). The functional is positive and defined on the ∗-ideal C_0^∞(T^∗R^n)[[λ]] of C^∞(T^∗R^n)[[λ]]. Thus we are
in the situation of Lemma 6.9 since $\omega$ does not have an extension to the whole $^*$-algebra. Nevertheless, the GNS representation extends canonically to all observables. From (4.11) we immediately obtain
\[ \mathcal{J}_\omega = \left\{ f \in C_0^\infty(T^*\mathbb{R}^n)[[\lambda]] \;\middle|\; \iota^* N f = 0 \right\}. \tag{6.26} \]
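This allows us to identify the GNS representation of the Schrödinger functional explicitly. It is the formal Schrödinger representation on “formal” wave functions [29]:

Theorem 6.12 (Formal Schrödinger Representation). The GNS pre-Hilbert space $\mathcal{H}_\omega$ of the Schrödinger functional is isometrically isomorphic to the “formal wave functions” $\mathcal{H} = C_0^\infty(\mathbb{R}^n)[[\lambda]]$ with inner product
\[ \langle \phi, \psi \rangle = \int_{\mathbb{R}^n} \overline{\phi(q)}\, \psi(q)\, d^n q \tag{6.27} \]
via the unitary map
\[ U: \mathcal{H}_\omega \ni \psi_f \mapsto \iota^* N f \in C_0^\infty(\mathbb{R}^n)[[\lambda]]. \tag{6.28} \]
The GNS representation $\pi_\omega$ becomes the formal Schrödinger representation (in Weyl ordering)
\[ \varrho_{\mathrm{Weyl}}(f) = \sum_{r=0}^{\infty} \frac{1}{r!} \left( \frac{\lambda}{i} \right)^{r} \sum_{i_1, \ldots, i_r = 1}^{n} \left. \frac{\partial^r (N f)}{\partial p_{i_1} \cdots \partial p_{i_r}} \right|_{p=0} \frac{\partial^r}{\partial q^{i_1} \cdots \partial q^{i_r}}, \tag{6.29} \]
i.e. $\varrho_{\mathrm{Weyl}}(f) = U \pi_\omega(f) U^{-1}$.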
The proof consists again in a simple verification that $U$ is well-defined and has the desired properties. Then (6.29) is a straightforward computation. Note that $\varrho_{\mathrm{Weyl}}$ is indeed the usual Schrödinger representation with $\hbar$ replaced by the formal parameter $\lambda$. In particular,
\[ \varrho_{\mathrm{Weyl}}(q^k) = q^k \quad \text{and} \quad \varrho_{\mathrm{Weyl}}(p_\ell) = -i\lambda \frac{\partial}{\partial q^\ell} \tag{6.30} \]
together with Weyl’s total symmetrization rule for the higher monomials. Here the correct combinatorics is due to the operator $N$. Without $N$ in (6.29) one obtains the Schrödinger representation in standard ordering (all “$p_\ell$” to the right). Using the standard ordered star product
\[ f \star_{\mathrm{Std}} g = \mu \circ e^{\frac{\lambda}{i} \sum_{k=1}^{n} \frac{\partial}{\partial p_k} \otimes \frac{\partial}{\partial q^k}}\, f \otimes g, \tag{6.31} \]
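which is equivalent to the Weyl–Moyal star product via the equivalence transformation $N$, we can write
\[ \varrho_{\mathrm{Weyl}}(f)\psi = \iota^* (N f \star_{\mathrm{Std}} \pi^* \psi) \tag{6.32} \]
for $f \in C^\infty(T^*\mathbb{R}^n)[[\lambda]]$ and $\psi \in C_0^\infty(\mathbb{R}^n)[[\lambda]]$. The corresponding standard ordered representation is then simply given by
\[ \varrho_{\mathrm{Std}}(f)\psi = \iota^* (f \star_{\mathrm{Std}} \pi^* \psi). \tag{6.33} \]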
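The formulas (6.31) and (6.33) can be tried out on polynomials. The following is a minimal sketch for $n = 1$, assuming polynomials in $q$, $p$ and $\lambda$ are stored as dictionaries mapping exponent triples to complex coefficients; the helper names (`star_std`, `rho_std`, and so on) are hypothetical and chosen here only for illustration. For polynomials the exponential series in (6.31) terminates, so no truncation is needed.

```python
from math import factorial

# A polynomial in q, p, lambda is a dict {(a, b, c): coeff},
# meaning coeff * q^a * p^b * lambda^c  (illustrative encoding, n = 1).

def add(f, g):
    """Sum of two polynomials, dropping zero coefficients."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + v
    return {k: v for k, v in h.items() if v != 0}

def scale(s, f):
    """Scalar multiple s * f."""
    return {k: s * v for k, v in f.items() if s * v != 0}

def mul(f, g):
    """Pointwise (commutative) product of two polynomials."""
    h = {}
    for (a1, b1, c1), v1 in f.items():
        for (a2, b2, c2), v2 in g.items():
            k = (a1 + a2, b1 + b2, c1 + c2)
            h[k] = h.get(k, 0) + v1 * v2
    return {k: v for k, v in h.items() if v != 0}

def d_q(f):
    """Partial derivative with respect to q."""
    return {(a - 1, b, c): a * v for (a, b, c), v in f.items() if a > 0}

def d_p(f):
    """Partial derivative with respect to p."""
    return {(a, b - 1, c): b * v for (a, b, c), v in f.items() if b > 0}

def star_std(f, g):
    """Standard ordered star product (6.31) for n = 1:
    f * g = sum_r (1/r!) (lambda/i)^r (d_p^r f)(d_q^r g)."""
    result = {}
    df, dg, r = f, g, 0
    while df and dg:
        factor = {(0, 0, r): (-1j) ** r / factorial(r)}  # (lambda/i)^r / r!
        result = add(result, mul(factor, mul(df, dg)))
        df, dg, r = d_p(df), d_q(dg), r + 1
    return result

def rho_std(f, psi):
    """Standard ordered representation (6.33): restrict f * psi to p = 0.
    psi is assumed to depend on q (and lambda) only."""
    return {k: v for k, v in star_std(f, psi).items() if k[1] == 0}

q = {(1, 0, 0): 1}
p = {(0, 1, 0): 1}
commutator = add(star_std(q, p), scale(-1, star_std(p, q)))
```

In this sketch `commutator` encodes $i\lambda$, the canonical commutation relation, and `rho_std(p, psi)` acts on a polynomial `psi` in $q$ as $-i\lambda\,\partial/\partial q$, in agreement with (6.30).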
The standard ordered representation (6.33) is precisely the usual symbol calculus for differential operators (in standard ordering) when we replace $\lambda$ by $\hbar$ and restrict to polynomial functions in the momenta.

Remark 6.13 (Formal Schrödinger Representation).
(i) The “formal” Schrödinger representation can again be obtained from integral formulas for the Weyl-ordered symbol calculus by asymptotic expansions for $\hbar \to 0$. The asymptotic formulas are already exact for functions which are polynomial in the momenta.
(ii) There are geometric generalizations not only for $\star_{\mathrm{Weyl}}$, $\star_{\mathrm{Std}}$ and $N$ as discussed in Sec. 4.2, but the whole GNS construction can also be translated to the geometric framework of cotangent bundles. Even the formulas for the representations $\varrho_{\mathrm{Weyl}}$ and $\varrho_{\mathrm{Std}}$ are still valid if partial derivatives are replaced by appropriate covariant derivatives, see [23–25].
(iii) For a spectral analysis, the formal Schrödinger representation is only of limited value as we have already argued in Sec. 2.1: One first has to impose some convergence conditions before asking for a reasonable definition of spectra. Note that in the case of the Schrödinger representation this can of course be done in a rather simple way. We only have to restrict to functions $f \in \mathrm{Pol}(T^*\mathbb{R}^n)[\lambda]$ where $\mathrm{Pol}(T^*\mathbb{R}^n)$ denotes those functions in $C^\infty(T^*\mathbb{R}^n)$ which are polynomial in the momentum variables. These functions are easily seen to form a subalgebra with respect to $\star_{\mathrm{Weyl}}$ and the formal Schrödinger representation restricts to a representation of this subalgebra on $C_0^\infty(\mathbb{R}^n)[\lambda]$. Since now only polynomials in $\lambda$ occur, we can simply set $\lambda$ equal to $\hbar$ and recover the usual Schrödinger representation from textbook quantum mechanics. This is still possible in the more general situation of an arbitrary cotangent bundle $T^*Q$ instead of $T^*\mathbb{R}^n$, see [23–25].

GNS representation of traces and KMS functionals

We consider again a connected symplectic manifold $M$ with a Hermitian star product $\star$ and its positive trace functional $\mathrm{tr}: C_0^\infty(M)[[\lambda]] \to \mathbb{C}[[\lambda]]$ as in Sec. 4.3. Here $\mathrm{tr}$ is defined on a $^*$-ideal whence we can again apply Lemma 6.9 to extend the GNS representation of $\mathrm{tr}$ to the whole algebra of observables. From the earlier investigation of the trace functional $\mathrm{tr}$ in (4.21) we see that the Gel’fand ideal of $\mathrm{tr}$ is trivial,
\[ \mathcal{J}_{\mathrm{tr}} = \{0\}. \tag{6.34} \]
Such positive functionals are called faithful. Thus the GNS pre-Hilbert space is simply $\mathcal{H}_{\mathrm{tr}} = C_0^\infty(M)[[\lambda]]$ together with the inner product
\[ \langle f, g \rangle_{\mathrm{tr}} = \mathrm{tr}(\bar{f} \star g). \tag{6.35} \]
The GNS representation is then the left regular representation
\[ \pi_{\mathrm{tr}}(f) g = f \star g \tag{6.36} \]
of $C^\infty(M)[[\lambda]]$ on the $^*$-ideal $C_0^\infty(M)[[\lambda]]$.
It is clear that the analogous result still holds for the KMS states as the star exponential function $\mathrm{Exp}(-\beta H)$ is invertible. Thus the module structure is again the left regular one. Only the inner product changes and is now twisted by the additional factor $\mathrm{Exp}(-\beta H)$ inside (6.35). The characteristic property of the GNS representation of the trace functional and the KMS functionals is that the commutant
\[ \pi(\mathcal{A})' = \{ A \in \mathfrak{B}(\mathcal{H}) \mid [\pi(a), A] = 0 \text{ for all } a \in \mathcal{A} \} \tag{6.37} \]
of the representation is as big as the algebra of observables itself. In fact, it is anti-isomorphic,
\[ \pi(\mathcal{A})' \cong \mathcal{A}^{\mathrm{opp}}, \tag{6.38} \]
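since the commutant is given by all right-multiplications. This is in some sense the beginning of an algebraic “baby-version” of the Tomita–Takesaki theory as is well-known for operator algebras, see [131] for more details.

6.4. Deformation and classical limit of GNS representations

Since the above examples prove that the GNS construction gives physically meaningful representations also in formal deformation quantization we shall now discuss the classical limit of GNS representations and the corresponding deformation problem. Let $\mathcal{A}$ be a $^*$-algebra over $\mathbb{C}$ and let $\star$ be a Hermitian deformation. Thus $\boldsymbol{\mathcal{A}} = (\mathcal{A}[[\lambda]], \star)$ is a $^*$-algebra over $\mathbb{C}[[\lambda]]$. Then we want to understand how one can construct a $^*$-representation $\pi$ out of a $^*$-representation $\boldsymbol{\pi}$ of $\boldsymbol{\mathcal{A}}$ on some pre-Hilbert space $\boldsymbol{\mathcal{H}}$ over $\mathbb{C}[[\lambda]]$. It turns out that we can always take the classical limit of a $^*$-representation. First we consider a pre-Hilbert space $\boldsymbol{\mathcal{H}}$ over $\mathbb{C}[[\lambda]]$ and define
\[ \boldsymbol{\mathcal{H}}_0 = \left\{ \phi \in \boldsymbol{\mathcal{H}} \;\middle|\; \langle \phi, \phi \rangle_{\boldsymbol{\mathcal{H}}} \big|_{\lambda=0} = 0 \right\}. \tag{6.39} \]
By the Cauchy–Schwarz inequality for the inner product
\[ \langle \phi, \psi \rangle_{\boldsymbol{\mathcal{H}}}\, \overline{\langle \phi, \psi \rangle_{\boldsymbol{\mathcal{H}}}} \le \langle \phi, \phi \rangle_{\boldsymbol{\mathcal{H}}}\, \langle \psi, \psi \rangle_{\boldsymbol{\mathcal{H}}}, \tag{6.40} \]
which holds in general thanks to Lemma 6.4, we see that $\boldsymbol{\mathcal{H}}_0$ is a $\mathbb{C}[[\lambda]]$-submodule of $\boldsymbol{\mathcal{H}}$. Thus we can define
\[ \mathrm{cl}(\boldsymbol{\mathcal{H}}) = \boldsymbol{\mathcal{H}} \big/ \boldsymbol{\mathcal{H}}_0 \tag{6.41} \]
as a $\mathbb{C}[[\lambda]]$-module. The canonical projection will be called classical limit map and is denoted by $\mathrm{cl}$. Since $\boldsymbol{\mathcal{H}}_0$ contains $\lambda \boldsymbol{\mathcal{H}}$, the $\mathbb{C}[[\lambda]]$-module structure of $\mathrm{cl}(\boldsymbol{\mathcal{H}})$ is rather trivial: $\lambda$ always acts as zero. Hence we can forget this $\mathbb{C}[[\lambda]]$-module structure and restrict it to the $\mathbb{C}$-module structure. Then we define
\[ \langle \mathrm{cl}(\phi), \mathrm{cl}(\psi) \rangle_{\mathrm{cl}(\boldsymbol{\mathcal{H}})} = \langle \phi, \psi \rangle_{\boldsymbol{\mathcal{H}}} \big|_{\lambda=0} \in \mathbb{C}. \tag{6.42} \]
It is easy to see that this gives indeed a well-defined and positive definite inner product for $\mathrm{cl}(\boldsymbol{\mathcal{H}})$ which thereby becomes a pre-Hilbert space over $\mathbb{C}$.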
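As a small illustration of the construction (6.39)–(6.42), the following worked example may help. The module $\mathbb{C}[[\lambda]]^2$ and its inner product are chosen here only for illustration and are not taken from the text; positivity is understood with respect to the usual ordering of $\mathbb{R}[[\lambda]]$, in which $\lambda > 0$.

```latex
% Illustrative example (an assumption, not from the source):
% a pre-Hilbert space over C[[lambda]] whose null space H_0 is larger than lambda H.
\begin{align*}
  \boldsymbol{\mathcal{H}} &= \mathbb{C}[[\lambda]]^2, &
  \langle \phi, \psi \rangle_{\boldsymbol{\mathcal{H}}}
    &= \overline{\phi_1}\,\psi_1 + \lambda\, \overline{\phi_2}\,\psi_2, \\
  \langle \phi, \phi \rangle_{\boldsymbol{\mathcal{H}}} \big|_{\lambda=0}
    &= \bigl| \phi_1^{(0)} \bigr|^2, &
  \boldsymbol{\mathcal{H}}_0
    &= \lambda\mathbb{C}[[\lambda]] \oplus \mathbb{C}[[\lambda]]
     \;\supsetneq\; \lambda \boldsymbol{\mathcal{H}}, \\
  \mathrm{cl}(\boldsymbol{\mathcal{H}})
    &= \boldsymbol{\mathcal{H}} \big/ \boldsymbol{\mathcal{H}}_0 \cong \mathbb{C}, &
  \langle \mathrm{cl}(\phi), \mathrm{cl}(\psi) \rangle_{\mathrm{cl}(\boldsymbol{\mathcal{H}})}
    &= \overline{\phi_1^{(0)}}\, \psi_1^{(0)} .
\end{align*}
```

Here $\phi_1^{(0)}$ denotes the zeroth order of the first component. The second component has positive but “infinitesimally small” norm, so it disappears completely in the classical limit even though it is not a multiple of $\lambda$.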
Next we consider an adjointable map $A \in \mathfrak{B}(\boldsymbol{\mathcal{H}}_1, \boldsymbol{\mathcal{H}}_2)$ and define a $\mathbb{C}$-linear map $\mathrm{cl}(A): \mathrm{cl}(\boldsymbol{\mathcal{H}}_1) \to \mathrm{cl}(\boldsymbol{\mathcal{H}}_2)$ by
\[ \mathrm{cl}(A)\,\mathrm{cl}(\phi) = \mathrm{cl}(A\phi). \tag{6.43} \]
Since $A$ is adjointable it turns out that this is actually well-defined and $\mathrm{cl}(A)$ is again adjointable. Moreover, it is easy to check that $A \mapsto \mathrm{cl}(A)$ is compatible with $\mathbb{C}$-linear combinations and takes adjoints and compositions of operators to adjoints and compositions of their classical limits. Thus we obtain a functor $\mathrm{cl}$ from the category of pre-Hilbert spaces over $\mathbb{C}[[\lambda]]$ to the category of pre-Hilbert spaces over $\mathbb{C}$ which we shall call the classical limit, see [35, Sec. 8],
\[ \mathrm{cl}: \mathsf{PreHilbert}(\mathbb{C}[[\lambda]]) \to \mathsf{PreHilbert}(\mathbb{C}). \tag{6.44} \]
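Remark 6.14. The $\mathbb{C}[[\lambda]]$-submodule $\boldsymbol{\mathcal{H}}_0$ contains $\lambda \boldsymbol{\mathcal{H}}$ but in general it is much larger as we shall see in the examples. Thus $\mathrm{cl}$ is not just the functor “modulo $\lambda$” but takes into account the whole pre-Hilbert space structure.

Now the classical limit functor also induces a classical limit for $^*$-representations as follows. For a $^*$-representation $(\boldsymbol{\mathcal{H}}, \boldsymbol{\pi})$ of $\boldsymbol{\mathcal{A}}$ we define $\mathrm{cl}(\boldsymbol{\pi}): \mathcal{A} \to \mathfrak{B}(\mathrm{cl}(\boldsymbol{\mathcal{H}}))$ by
\[ \mathrm{cl}(\boldsymbol{\pi})(a) = \mathrm{cl}(\boldsymbol{\pi}(a)) \tag{6.45} \]
for $a \in \mathcal{A}$. It is straightforward to check that this gives indeed a $^*$-representation of the undeformed $^*$-algebra $\mathcal{A}$ on $\mathrm{cl}(\boldsymbol{\mathcal{H}})$. Moreover, for an intertwiner $T: (\boldsymbol{\mathcal{H}}_1, \boldsymbol{\pi}_1) \to (\boldsymbol{\mathcal{H}}_2, \boldsymbol{\pi}_2)$ we use its classical limit $\mathrm{cl}(T)$ to obtain an intertwiner between the classical limits of $(\boldsymbol{\mathcal{H}}_1, \boldsymbol{\pi}_1)$ and $(\boldsymbol{\mathcal{H}}_2, \boldsymbol{\pi}_2)$. Then it is easy to check that this gives a functor
\[ \mathrm{cl}: {}^*\text{-}\mathrm{rep}(\boldsymbol{\mathcal{A}}) \to {}^*\text{-}\mathrm{rep}(\mathcal{A}), \tag{6.46} \]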
still called the classical limit. Thus we can always take the classical limit of $^*$-representations, even in a canonical way [35]. This immediately raises the question whether the converse is true as well: Can we always deform a given $^*$-representation of $\mathcal{A}$ into a $^*$-representation of $\boldsymbol{\mathcal{A}}$ such that the above defined classical limit gives back the $^*$-representation we started with? In general this is a very difficult question whence we consider the particular case of GNS representations. Thus let $\omega: \boldsymbol{\mathcal{A}} \to \mathbb{C}[[\lambda]]$ be a positive functional of the deformed $^*$-algebra with corresponding GNS representation $(\boldsymbol{\mathcal{H}}_\omega, \boldsymbol{\pi}_\omega)$. Then we have the following result [132, Theorem 1]:

Theorem 6.15. The classical limit $\mathrm{cl}(\boldsymbol{\mathcal{H}}_\omega, \boldsymbol{\pi}_\omega)$ is unitarily equivalent to the GNS representation $(\mathcal{H}_{\omega_0}, \pi_{\omega_0})$ corresponding to the classical limit $\omega_0 = \mathrm{cl}(\omega)$ via the unitary intertwiner
\[ U: \mathrm{cl}(\boldsymbol{\mathcal{H}}_\omega) \ni \mathrm{cl}(\psi_a) \mapsto \psi_{\mathrm{cl}(a)} \in \mathcal{H}_{\omega_0}, \tag{6.47} \]
where $a = a_0 + \lambda a_1 + \cdots \in \boldsymbol{\mathcal{A}}$ and $\mathrm{cl}(a) = a_0 \in \mathcal{A}$.
One just has to check that the map $U$ is actually well-defined. The remaining features of $U$ are obvious. Note however, that though the theorem is not very difficult to prove, it is non-trivial in so far as there is no simple relation between the Gel’fand ideal of $\omega_0$ and $\omega$: the latter is usually much smaller than $\mathcal{J}_{\omega_0}[[\lambda]]$. An immediate consequence is that for positive deformations we can always deform GNS representations since we can deform the corresponding positive functionals:

Corollary 6.16. Let $\boldsymbol{\mathcal{A}}$ be a positive deformation of $\mathcal{A}$. Then any direct orthogonal sum of GNS representations of $\mathcal{A}$ can be deformed.

Remark 6.17 (Classical Limit and Deformation of GNS Representations).
(i) Even in the very nice cases there might be representations which are not direct sums of GNS representations. In the $C^*$-algebraic case, every $^*$-representation is known to be a topological direct sum of GNS representations.
(ii) Thanks to Theorem 5.6 the above corollary applies for star products.
(iii) It is a nice exercise to exemplify the theorem for the three GNS representations we have discussed in detail in Sec. 6.3, see also [132].

7. General $^*$-Representation Theory

Given a $^*$-algebra $\mathcal{A}$ over $\mathbb{C}$ the aim of representation theory would be (in principle) to understand first the structure of the convex cone of positive functionals, then the resulting GNS representations and finally the whole category of $^*$-representations ${}^*\text{-}\mathrm{Rep}(\mathcal{A})$. Of course, except for very simple examples this is rather hopeless from the beginning and we cannot expect to get some “final” answers in this fully general algebraic approach. One less ambitious aim would be to compare the representation theories of two given $^*$-algebras in a functorial sense
\[ {}^*\text{-}\mathrm{Rep}(\mathcal{A}) \longleftrightarrow {}^*\text{-}\mathrm{Rep}(\mathcal{B}), \tag{7.1} \]
and determine whether they are equivalent. This is the principal question of Morita theory. It turns out that even if we do not understand the representation theories of $\mathcal{A}$ and $\mathcal{B}$ themselves completely, it might still be possible to understand whether they are equivalent or not. The question of finding some relations between the two representation theories is interesting, even if one does not expect to obtain an equivalence. The physical situation we have in mind is the following: Consider a “big” phase space $(M, \pi)$ with some constraint manifold $\iota: C \hookrightarrow M$, like e.g. the zero level set of a momentum map or just some coisotropic submanifold (which corresponds to first class constraints). Then the “physically interesting” phase space would be the reduced phase space $M_{\mathrm{red}} = C/\!\sim$ endowed with the reduced Poisson structure $\pi_{\mathrm{red}}$. Any gauge theory is an example of this situation. See e.g. the monograph [116] for details and further references on phase space reduction.
Of course we would like to understand the quantum theory of the whole picture, i.e. the quantization of phase space reduction. In deformation quantization this amounts to finding a star product $\star$ for $M$ which induces a star product $\star_{\mathrm{red}}$ on $M_{\mathrm{red}}$. This has been discussed in various ways in deformation quantization, see e.g. [17, 20, 67, 70], culminating probably in the recent work of Bordemann [18]. Having understood the relation between the quantized observable algebras $\mathcal{A} = (C^\infty(M)[[\lambda]], \star)$ and $\mathcal{A}_{\mathrm{red}} = (C^\infty(M_{\mathrm{red}})[[\lambda]], \star_{\mathrm{red}})$ we would like to understand also the relations between their representation theories
\[ {}^*\text{-}\mathrm{Rep}(\mathcal{A}) \longleftrightarrow {}^*\text{-}\mathrm{Rep}(\mathcal{A}_{\mathrm{red}}), \tag{7.2} \]
and now we cannot expect to get an equivalence of categories as the geometrical structure on $M$ may be much richer “far away” from the constraint surface $C$ whence it is not seen in the reduction process. Nevertheless, already some relation would be helpful. Motivated by this we give now a rather general procedure to construct functors ${}^*\text{-}\mathrm{Rep}(\mathcal{A}) \to {}^*\text{-}\mathrm{Rep}(\mathcal{B})$.

7.1. $^*$-Representation theory on pre-Hilbert modules
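First we have to enlarge the notion of representation in order to get a more coherent picture: we have to go beyond representations on pre-Hilbert spaces over $\mathbb{C}$ and use general pre-Hilbert modules instead [37]. We consider an auxiliary $^*$-algebra $\mathcal{D}$ over $\mathbb{C}$.

Definition 7.1 (Pre-Hilbert Module). A pre-Hilbert right $\mathcal{D}$-module $\mathcal{H}_{\mathcal{D}}$ is a right $\mathcal{D}$-module together with a map
\[ \langle \cdot, \cdot \rangle_{\mathcal{D}}: \mathcal{H} \times \mathcal{H} \to \mathcal{D} \tag{7.3} \]
such that
(i) $\langle \cdot, \cdot \rangle_{\mathcal{D}}$ is $\mathbb{C}$-linear in the second argument.
(ii) $\langle \phi, \psi \rangle_{\mathcal{D}} = (\langle \psi, \phi \rangle_{\mathcal{D}})^*$ for $\phi, \psi \in \mathcal{H}$.
(iii) $\langle \phi, \psi \cdot d \rangle_{\mathcal{D}} = \langle \phi, \psi \rangle_{\mathcal{D}}\, d$ for $\phi, \psi \in \mathcal{H}$ and $d \in \mathcal{D}$.
(iv) $\langle \cdot, \cdot \rangle_{\mathcal{D}}$ is nondegenerate, i.e. $\langle \phi, \cdot \rangle_{\mathcal{D}} = 0$ implies $\phi = 0$ for $\phi \in \mathcal{H}$.
(v) $\langle \cdot, \cdot \rangle_{\mathcal{D}}$ is completely positive, i.e. for all $n \in \mathbb{N}$ and all $\phi_1, \ldots, \phi_n \in \mathcal{H}$ we have $(\langle \phi_i, \phi_j \rangle_{\mathcal{D}}) \in M_n(\mathcal{D})^+$.

In addition, $\langle \cdot, \cdot \rangle_{\mathcal{D}}$ is called strongly nondegenerate if the map $x \mapsto \langle x, \cdot \rangle_{\mathcal{D}} \in \mathcal{H}_{\mathcal{D}}^* = \mathrm{Hom}_{\mathcal{D}}(\mathcal{H}_{\mathcal{D}}, \mathcal{D})$ is bijective. As we will have different inner products with values in different algebras simultaneously, we shall sometimes put indices on the symbols for algebras and modules to avoid ambiguities. Clearly we have an analogous definition for pre-Hilbert left $\mathcal{D}$-modules where now the inner product is $\mathbb{C}$-linear and $\mathcal{D}$-linear to the left in the first argument.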
Remark 7.2 (Pre-Hilbert Modules).
(i) This definition generalizes the notion of Hilbert modules over $C^*$-algebras, see e.g. [105, 106]. In this case it is well-known that positivity of the inner product implies complete positivity, see e.g. [105, Lemma 4.2].
(ii) Pre-Hilbert spaces are obtained for $\mathcal{D} = \mathbb{C}$. The complete positivity of $\langle \cdot, \cdot \rangle$ for pre-Hilbert spaces over $\mathbb{C}$ can be shown [35, App. A] to be a consequence of the positivity of the inner product.
(iii) We have obvious definitions for $\mathfrak{B}(\mathcal{H}_{\mathcal{D}}, \mathcal{H}'_{\mathcal{D}})$, $\mathfrak{B}(\mathcal{H}_{\mathcal{D}})$, $\mathfrak{F}(\mathcal{H}_{\mathcal{D}}, \mathcal{H}'_{\mathcal{D}})$, and $\mathfrak{F}(\mathcal{H}_{\mathcal{D}})$ analogous to pre-Hilbert spaces.
(iv) $\mathcal{H}_{\mathcal{D}}$ is a $(\mathfrak{B}(\mathcal{H}_{\mathcal{D}}), \mathcal{D})$-bimodule since adjointable operators are necessarily right $\mathcal{D}$-linear. Moreover, $\mathfrak{F}(\mathcal{H}_{\mathcal{D}})$ is a $^*$-ideal inside the $^*$-algebra $\mathfrak{B}(\mathcal{H}_{\mathcal{D}})$.

The following example shows that such pre-Hilbert modules arise very naturally in differential geometry:

Example 7.3 (Hermitian Vector Bundles). Let $E \to M$ be a complex vector bundle with a Hermitian fiber metric $h$. Then the right module
\[ \Gamma^\infty(E)_{C^\infty(M)} \tag{7.4} \]
with the inner product defined by $\langle s, s' \rangle(x) = h_x(s(x), s'(x))$, where $x \in M$, is a pre-Hilbert right $C^\infty(M)$-module. In this case,
\[ \mathfrak{B}\left( \Gamma^\infty(E)_{C^\infty(M)} \right) = \Gamma^\infty(\mathrm{End}(E)), \tag{7.5} \]
with the usual action on $\Gamma^\infty(E)$ and $^*$-involution induced by $h$. Moreover, for the finite-rank operators we have $\mathfrak{F}(\Gamma^\infty(E)_{C^\infty(M)}) = \Gamma^\infty(\mathrm{End}(E))$ as well. This is clear in the case where $M$ is compact but it is also true for non-compact $M$ as sections of vector bundles are still finitely generated modules over $C^\infty(M)$. In fact, all these statements are consequences of the Serre–Swan Theorem [130].
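Definition 7.4. A $^*$-representation $\pi$ of $\mathcal{A}$ on a pre-Hilbert $\mathcal{D}$-module $\mathcal{H}_{\mathcal{D}}$ is a $^*$-homomorphism
\[ \pi: \mathcal{A} \to \mathfrak{B}(\mathcal{H}_{\mathcal{D}}). \tag{7.6} \]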
Clearly, we can transfer the notion of intertwiners to this framework as well whence we obtain the category of $^*$-representations of $\mathcal{A}$ on pre-Hilbert $\mathcal{D}$-modules which we denote by ${}^*\text{-}\mathrm{rep}_{\mathcal{D}}(\mathcal{A})$. The strongly nondegenerate ones are denoted by ${}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{A})$ where again in the unital case we require $^*$-representations to fulfill $\pi(\mathbb{1}) = \mathrm{id}$.

7.2. Tensor products and Rieffel induction

The advantage of looking at ${}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{A})$ for all $\mathcal{D}$ and not just for $\mathcal{D} = \mathbb{C}$ is that we now have a tensor product operation. This will give us functors for studying ${}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{A})$ and in particular ${}^*\text{-}\mathrm{Rep}(\mathcal{A})$. The construction will be based on Rieffel’s internal tensor product of pre-Hilbert modules. Rieffel proposed this originally for $C^*$-algebras [120, 121], see
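also [105, 106, 119] but the essential construction is entirely algebraic whence we obtain a quite drastic generalization, see [35, 37]. We consider ${}_{\mathcal{C}}\mathcal{F}_{\mathcal{B}} \in {}^*\text{-}\mathrm{rep}_{\mathcal{B}}(\mathcal{C})$ and ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \in {}^*\text{-}\mathrm{rep}_{\mathcal{A}}(\mathcal{B})$. Then we have the algebraic tensor product ${}_{\mathcal{C}}\mathcal{F}_{\mathcal{B}} \otimes_{\mathcal{B}} {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ which is a $(\mathcal{C}, \mathcal{A})$-bimodule in a natural way, since we started with bimodules. Out of the given inner products on ${}_{\mathcal{C}}\mathcal{F}_{\mathcal{B}}$ and ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ we want to construct an inner product $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}}$ with values in $\mathcal{A}$ on this tensor product such that the left $\mathcal{C}$-module structure becomes a $^*$-representation. This can actually be done. We define $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}}$ for elementary tensors by
\[ \langle x \otimes \phi, y \otimes \psi \rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}} = \left\langle \phi, \langle x, y \rangle_{\mathcal{B}}^{\mathcal{F}} \cdot \psi \right\rangle_{\mathcal{A}}^{\mathcal{E}}, \tag{7.7} \]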
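for $x, y \in \mathcal{F}$ and $\phi, \psi \in \mathcal{E}$ and extend this by $\mathbb{C}$-sesquilinearity to ${}_{\mathcal{C}}\mathcal{F}_{\mathcal{B}} \otimes_{\mathcal{B}} {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$.

Remark 7.5. One can show by some simple computations that $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}}$ is indeed well-defined on the $\mathcal{B}$-tensor product. Moreover, it has the correct $\mathcal{A}$-linearity properties and $\mathcal{C}$ acts by adjointable operators. All these are rather straightforward. The problem is the non-degeneracy and the complete positivity. Here we have the following result [37, Theorem 4.7]:

Theorem 7.6. If $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{E}}$ and $\langle \cdot, \cdot \rangle_{\mathcal{B}}^{\mathcal{F}}$ are completely positive then $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}}$ is completely positive as well.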
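Proof. Let $\Phi^{(1)}, \ldots, \Phi^{(n)} \in \mathcal{F} \otimes_{\mathcal{B}} \mathcal{E}$ be given. Then we must show that the matrix $A = \left( \langle \Phi^{(\alpha)}, \Phi^{(\beta)} \rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}} \right) \in M_n(\mathcal{A})$ is positive. Thus let $\Phi^{(\alpha)} = \sum_{i=1}^{N} x_i^{(\alpha)} \otimes \phi_i^{(\alpha)}$ with $x_i^{(\alpha)} \in \mathcal{F}$ and $\phi_i^{(\alpha)} \in \mathcal{E}$ where we can assume without restriction that $N$ is the same for all $\alpha = 1, \ldots, n$. First we claim that the map $f: M_{nN}(\mathcal{B}) \to M_{nN}(\mathcal{A})$ defined by
\[ f: \left( B_{ij}^{\alpha\beta} \right) \mapsto \left( \left\langle \phi_i^{(\alpha)}, B_{ij}^{\alpha\beta} \cdot \phi_j^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{E}} \right) \tag{7.8} \]
is positive. Indeed, we have for any $B = \left( B_{ij}^{\alpha\beta} \right) \in M_{nN}(\mathcal{B})$
\[ f(B^* B) = \left( \left\langle \phi_i^{(\alpha)}, (B^* B)_{ij}^{\alpha\beta} \cdot \phi_j^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{E}} \right) = \sum_{\gamma=1}^{n} \sum_{k=1}^{N} \left( \left\langle B_{ki}^{\gamma\alpha} \cdot \phi_i^{(\alpha)}, B_{kj}^{\gamma\beta} \cdot \phi_j^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{E}} \right), \]
and each term $\left( \left\langle B_{ki}^{\gamma\alpha} \cdot \phi_i^{(\alpha)}, B_{kj}^{\gamma\beta} \cdot \phi_j^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{E}} \right)$ is a positive matrix in $M_{nN}(\mathcal{A})$ since $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{E}}$ is completely positive. Thus $f(B^* B) \in M_{nN}(\mathcal{A})^+$ whence by Remark 3.10(iii) we conclude that $f$ is a positive map. Since $\langle \cdot, \cdot \rangle_{\mathcal{B}}^{\mathcal{F}}$ is completely positive, the matrix $B = \left( \left\langle x_i^{(\alpha)}, x_j^{(\beta)} \right\rangle_{\mathcal{B}}^{\mathcal{F}} \right)$ is positive. Thus
\[ f(B) = \left( \left\langle \phi_i^{(\alpha)}, \left\langle x_i^{(\alpha)}, x_j^{(\beta)} \right\rangle_{\mathcal{B}}^{\mathcal{F}} \cdot \phi_j^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{E}} \right) \]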
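is a positive matrix $f(B) \in M_{nN}(\mathcal{A})^+$. Finally, the summation over $i, j$ is precisely the completely positive map $\tau: M_{nN}(\mathcal{A}) \to M_n(\mathcal{A})$. Hence
\begin{align*}
\tau(f(B)) &= \left( \sum_{i,j=1}^{N} \left\langle \phi_i^{(\alpha)}, \left\langle x_i^{(\alpha)}, x_j^{(\beta)} \right\rangle_{\mathcal{B}}^{\mathcal{F}} \cdot \phi_j^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{E}} \right) \\
&= \left( \left\langle \sum_{i=1}^{N} x_i^{(\alpha)} \otimes \phi_i^{(\alpha)}, \sum_{j=1}^{N} x_j^{(\beta)} \otimes \phi_j^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}} \right) \\
&= \left( \left\langle \Phi^{(\alpha)}, \Phi^{(\beta)} \right\rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}} \right) \in M_n(\mathcal{A})^+
\end{align*}
and thus the theorem is shown.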
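The statement of Theorem 7.6 can be checked numerically in the simplest possible situation. The following is a minimal sketch, assuming all three algebras are the complex numbers and the modules are finite-dimensional spaces of complex tuples with their standard inner products; the function names and the sample vectors are hypothetical and chosen only for illustration. In this special case (7.7) reduces to the product of the two inner products, and the matrix $A$ from the proof is an ordinary Gram matrix.

```python
def inner(u, v):
    """Standard inner product on C^d, linear in the second argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def tensor_inner(Phi, Psi):
    """Inner product (7.7), extended sesquilinearly, for A = B = C = complex numbers.
    Phi and Psi are lists of elementary tensors (x, phi)."""
    total = 0
    for x, phi in Phi:
        for y, psi in Psi:
            b = inner(x, y)                            # <x, y>_B^F, an element of B
            total += inner(phi, [b * c for c in psi])  # <phi, b . psi>_A^E
    return total

def gram(vectors):
    """Matrix of all inner products <Phi^(alpha), Phi^(beta)>."""
    return [[tensor_inner(P, Q) for Q in vectors] for P in vectors]

def is_positive_2x2(G, tol=1e-12):
    """A 2x2 matrix is positive semidefinite iff it is Hermitian with
    nonnegative diagonal entries and nonnegative determinant."""
    hermitian = (abs(G[0][1] - G[1][0].conjugate()) <= tol
                 and abs(G[0][0].imag) <= tol
                 and abs(G[1][1].imag) <= tol)
    det = (G[0][0] * G[1][1] - G[0][1] * G[1][0]).real
    return (hermitian and G[0][0].real >= -tol
            and G[1][1].real >= -tol and det >= -tol)

# Two sample elements of F (x) E, each a sum of elementary tensors (x, phi).
Phi1 = [((1 + 0j, 1j), (1 + 0j, 2 + 0j)), ((0j, 1 + 0j), (1j, 1 + 0j))]
Phi2 = [((2 + 0j, -1j), (0j, 1 + 1j))]
G = gram([Phi1, Phi2])
```

Calling `is_positive_2x2(G)` then confirms positivity of the $2 \times 2$ matrix for this sample. The general algebraic statement of course requires the argument given in the proof.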
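A final check shows that the degeneracy space $(\mathcal{F} \otimes_{\mathcal{B}} \mathcal{E})^\perp$ of $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{F} \otimes \mathcal{E}}$ is preserved under the $(\mathcal{C}, \mathcal{A})$-bimodule structure. Thus we can divide by this degeneracy space and obtain a pre-Hilbert $\mathcal{A}$-module, again together with a $^*$-representation of $\mathcal{C}$. We denote this new $^*$-representation by
\[ {}_{\mathcal{C}}\mathcal{F}_{\mathcal{B}} \mathbin{\widehat{\otimes}_{\mathcal{B}}} {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} = (\mathcal{F} \otimes_{\mathcal{B}} \mathcal{E}) \big/ (\mathcal{F} \otimes_{\mathcal{B}} \mathcal{E})^\perp \in {}^*\text{-}\mathrm{rep}_{\mathcal{A}}(\mathcal{C}). \tag{7.9} \]
The whole procedure is canonical, i.e. compatible with intertwiners at all stages. So finally we have a functor
\[ \widehat{\otimes}_{\mathcal{B}}: {}^*\text{-}\mathrm{rep}_{\mathcal{B}}(\mathcal{C}) \times {}^*\text{-}\mathrm{rep}_{\mathcal{A}}(\mathcal{B}) \to {}^*\text{-}\mathrm{rep}_{\mathcal{A}}(\mathcal{C}) \tag{7.10} \]
which restricts to strongly nondegenerate $^*$-representations in the following way
\[ \widehat{\otimes}_{\mathcal{B}}: {}^*\text{-}\mathrm{Rep}_{\mathcal{B}}(\mathcal{C}) \times {}^*\text{-}\mathrm{rep}_{\mathcal{A}}(\mathcal{B}) \to {}^*\text{-}\mathrm{Rep}_{\mathcal{A}}(\mathcal{C}). \tag{7.11} \]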
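By fixing one of the two arguments of this tensor product we obtain the following two particular cases:
(i) Rieffel induction: We fix ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$. Then the functor
\[ \mathsf{R}_{\mathcal{E}} = {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \mathbin{\widehat{\otimes}_{\mathcal{A}}} \cdot\,: {}^*\text{-}\mathrm{rep}_{\mathcal{D}}(\mathcal{A}) \to {}^*\text{-}\mathrm{rep}_{\mathcal{D}}(\mathcal{B}) \tag{7.12} \]
is called Rieffel induction.
(ii) Change of base ring: We fix ${}_{\mathcal{D}}\mathcal{G}_{\mathcal{D}'}$. Then the functor
\[ \mathsf{S}_{\mathcal{G}} = \cdot \mathbin{\widehat{\otimes}_{\mathcal{D}}} {}_{\mathcal{D}}\mathcal{G}_{\mathcal{D}'}: {}^*\text{-}\mathrm{rep}_{\mathcal{D}}(\mathcal{A}) \to {}^*\text{-}\mathrm{rep}_{\mathcal{D}'}(\mathcal{A}) \tag{7.13} \]
is called the change of base ring functor.

We clearly have the following commutative diagram
\[ \begin{array}{ccc}
{}^*\text{-}\mathrm{rep}_{\mathcal{D}}(\mathcal{A}) & \xrightarrow{\;\mathsf{S}_{\mathcal{G}}\;} & {}^*\text{-}\mathrm{rep}_{\mathcal{D}'}(\mathcal{A}) \\
\mathsf{R}_{\mathcal{E}} \downarrow & & \downarrow \mathsf{R}_{\mathcal{E}} \\
{}^*\text{-}\mathrm{rep}_{\mathcal{D}}(\mathcal{B}) & \xrightarrow{\;\mathsf{S}_{\mathcal{G}}\;} & {}^*\text{-}\mathrm{rep}_{\mathcal{D}'}(\mathcal{B}),
\end{array} \tag{7.14} \]
which commutes in the sense of functors, i.e. up to natural transformations. This is due to the simple fact that the tensor product $\widehat{\otimes}$ is associative up to the usual natural transformations.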
7.3. A non-trivial example: Dirac’s monopole

The following example is a particular case of the results of [36, Sec. 4.2] and can be better understood in the context of Morita equivalence. Nevertheless we mention the example already here. We consider the configuration space $Q = \mathbb{R}^3 \setminus \{0\}$ of an electrically charged particle within the external field of a magnetic monopole, which sits at the origin. Thus the magnetic field is described by a closed two-form $B \in \Gamma^\infty(\Lambda^2 T^*Q)$ which is not exact due to the presence of a “magnetic charge” at $0 \in \mathbb{R}^3$. We assume furthermore that $\frac{1}{2\pi} B$ is an integral two-form, i.e. the magnetic charge satisfies Dirac’s quantization condition. Mathematically this means that $\frac{1}{2\pi}[B] \in H^2_{\mathrm{dR}}(M, \mathbb{Z})$ is in the integral de Rham cohomology. Consider $T^*Q$ with the Weyl–Moyal star product $\star_{\mathrm{Weyl}}$ and replace now the canonical symplectic form $\omega_0$ by the formal symplectic form $\omega_B = \omega_0 - \lambda \pi^* B$. This is the “minimal coupling” corresponding to switching on the magnetic field. One can now construct by this minimal coupling a star product $\star_B$ out of $\star_{\mathrm{Weyl}}$ by essentially replacing locally the momentum variables $p_i$ by $p_i - \lambda A_i$ where $A_i$ are the components of a local potential $A \in \Gamma^\infty(T^*Q)$ of $B$, i.e. $dA = B$, see [23]. It turns out that $\star_B$ is actually globally defined, i.e. independent of the choice of $A$ but only depending on $B$. The characteristic class of $\star_B$ is given by
\[ c(\star_B) = i[\pi^* B]. \tag{7.15} \]
Since $B$ is integral it defines a (non-trivial) line bundle
\[ \ell \to Q, \tag{7.16} \]
whose Chern class is given by the class $\frac{1}{2\pi}[B]$. This line bundle is unique up to isomorphism and up to tensoring with a flat line bundle. On $\ell$ we choose a Hermitian fiber metric $h$. Thus we also have the pull-back bundle $L = \pi^* \ell$ with Chern class $\frac{1}{2\pi}[\pi^* B]$ together with the corresponding pull-back fiber metric $H = \pi^* h$. Then it is a fact that on $\Gamma^\infty(L)[[\lambda]]$ there exists a $(\star_B, \star_{\mathrm{Weyl}})$-bimodule structure deforming the classical bimodule structure of $\Gamma^\infty(L)$ viewed as a $C^\infty(T^*Q)$-bimodule. Moreover, there exists a deformation of the Hermitian fiber metric $H$ into a $(C^\infty(T^*Q)[[\lambda]], \star_{\mathrm{Weyl}})$-valued positive inner product $\boldsymbol{H}$. In this way, the sections of $L$ become a $^*$-representation of $(C^\infty(T^*Q)[[\lambda]], \star_B)$ on a pre-Hilbert module over $(C^\infty(T^*Q)[[\lambda]], \star_{\mathrm{Weyl}})$,
\[ (\Gamma^\infty(L)[[\lambda]], \boldsymbol{H}) \in {}^*\text{-}\mathrm{Rep}_{(C^\infty(T^*Q)[[\lambda]], \star_{\mathrm{Weyl}})} \left( C^\infty(T^*Q)[[\lambda]], \star_B \right). \tag{7.17} \]
This construction can be made very precise using Fedosov’s approach to the construction of symplectic star products; for details we refer to [23, 36, 133]. Having such a bimodule we can use it for Rieffel induction. Since for $\star_{\mathrm{Weyl}}$ we have a representation which is of particular interest, we apply the Rieffel induction functor to the Schrödinger representation $(C_0^\infty(Q)[[\lambda]], \varrho_{\mathrm{Weyl}})$. Then it is another fact that the resulting $^*$-representation of $\star_B$ is precisely the usual “Dirac-type”
representation on the pre-Hilbert space of sections $\Gamma_0^\infty(\ell)[[\lambda]]$ of $\ell$ endowed with the inner product
\[ \langle s, s' \rangle = \int_Q h(s, s')\, d^n q. \tag{7.18} \]
The representation is given as follows: The configuration space variables act by multiplication operators while the corresponding canonical conjugate momenta act by covariant derivatives using a connection on ` whose curvature is given by B. This is exactly the minimal coupling expected for quantization in presence of a magnetic field.
Remark 7.7. The above “ad hoc” construction (observation) finds its deeper explanation in Morita theory stating that the above bimodule is actually an equivalence bimodule, see Sec. 9.3. Moreover, an arbitrary star product $\star'$ on $T^*Q$ turns out to be Morita equivalent to $\star_{\mathrm{Weyl}}$ if and only if the characteristic class of $\star'$ is integral. This is Dirac’s quantization condition for magnetic monopoles in the light of Morita theory and Rieffel induction applied to the usual Schrödinger representation.

Remark 7.8. Note also that the whole construction works for any cotangent bundle $T^*Q$. One has very explicit formulas for star products as well as the representations on the sections of the involved line bundles, see [23, 36, 133].

8. Strong Morita Equivalence and the Picard Groupoid

We shall now give an introduction to Morita theory of $^*$-algebras over $\mathbb{C}$ based on the crucial notion of the Picard groupoid.

8.1. Morita equivalence in the ring-theoretic setting

As a warm-up we start by recalling the ring-theoretic situation. Thus let $\mathcal{A}$, $\mathcal{B}$ be two $k$-algebras, where we consider only the unital case for simplicity. By $\mathcal{A}\text{-}\mathsf{Mod}$ we denote the category of left $\mathcal{A}$-modules where we always assume that $\mathbb{1}_{\mathcal{A}} \cdot m = m$ for all $m \in {}_{\mathcal{A}}\mathcal{M}$ where ${}_{\mathcal{A}}\mathcal{M} \in \mathcal{A}\text{-}\mathsf{Mod}$. The morphisms of this category are just the usual left $\mathcal{A}$-module morphisms. Given a $(\mathcal{B}, \mathcal{A})$-bimodule ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ one obtains a functor by tensoring
\[ {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \otimes_{\mathcal{A}} \cdot\,: \mathcal{A}\text{-}\mathsf{Mod} \to \mathcal{B}\text{-}\mathsf{Mod}. \tag{8.1} \]
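In particular, the canonical bimodule ${}_{\mathcal{A}}\mathcal{A}_{\mathcal{A}}$ gives a functor naturally equivalent to the identity functor $\mathrm{id}_{\mathcal{A}\text{-}\mathsf{Mod}}$. This motivates the following definition of an equivalence bimodule in this ring-theoretic framework: ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ is called a Morita equivalence bimodule if it is “invertible” in the sense that there exist bimodules ${}_{\mathcal{A}}\mathcal{E}'_{\mathcal{B}}$ and ${}_{\mathcal{A}}\mathcal{E}''_{\mathcal{B}}$ such that
\[ {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \otimes_{\mathcal{A}} {}_{\mathcal{A}}\mathcal{E}'_{\mathcal{B}} \cong {}_{\mathcal{B}}\mathcal{B}_{\mathcal{B}} \quad \text{and} \quad {}_{\mathcal{A}}\mathcal{E}''_{\mathcal{B}} \otimes_{\mathcal{B}} {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \cong {}_{\mathcal{A}}\mathcal{A}_{\mathcal{A}} \tag{8.2} \]
as bimodules.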
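In this case, it is easy to see that the functor (8.1) is an equivalence of categories. This is essentially the associativity of the tensor product up to a natural transformation. Moreover, ${}_{\mathcal{A}}\mathcal{E}'_{\mathcal{B}} \cong {}_{\mathcal{A}}\mathcal{E}''_{\mathcal{B}}$ as bimodules and
\[ {}_{\mathcal{A}}\mathcal{E}'_{\mathcal{B}} \cong \mathrm{Hom}_{\mathcal{A}}({}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}, \mathcal{A}) \tag{8.3} \]
as bimodules. In addition, ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ is finitely generated and projective over $\mathcal{A}$, i.e. of the form
\[ \mathcal{E}_{\mathcal{A}} \cong e \mathcal{A}^n \quad \text{with } e = e^2 \in M_n(\mathcal{A}) \tag{8.4} \]
and $\mathcal{B}$ is determined up to isomorphism by
\[ \mathcal{B} \cong \mathrm{End}_{\mathcal{A}}(\mathcal{E}_{\mathcal{A}}) \cong e M_n(\mathcal{A}) e. \tag{8.5} \]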
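Finally, the idempotent $e$ is full in the sense that the ideal in $\mathcal{A}$ generated by all the matrix coefficients $e_{ij}$ of $e = (e_{ij}) \in M_n(\mathcal{A})$ coincides with the whole algebra $\mathcal{A}$. The converse statement is true as well: Given a full idempotent $e \in M_n(\mathcal{A})$ the $(e M_n(\mathcal{A}) e, \mathcal{A})$-bimodule $e \mathcal{A}^n$ is invertible in the above sense and gives an equivalence of the categories of modules over the algebras $\mathcal{B} = e M_n(\mathcal{A}) e$ and $\mathcal{A}$. These are the statements of the classical Morita theory, see e.g. [104, 108].

8.2. Strong Morita equivalence

We want to specialize the notion of Morita equivalence to the case of $^*$-algebras over $\mathbb{C}$ such that the specialized Morita equivalence implies the equivalence of the categories ${}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\cdot)$ for all auxiliary $^*$-algebras $\mathcal{D}$. In fact, it will be an algebraic generalization of Rieffel’s notion of strong Morita equivalence of $C^*$-algebras, hence the name. We state the following definition [37].

Definition 8.1 (Strong Morita Equivalence). Let $\mathcal{A}$, $\mathcal{B}$ be $^*$-algebras over $\mathbb{C}$. A $(\mathcal{B}, \mathcal{A})$-bimodule ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ with inner products $\langle \cdot, \cdot \rangle_{\mathcal{A}}^{\mathcal{E}}$ and ${}_{\mathcal{B}}\langle \cdot, \cdot \rangle^{\mathcal{E}}$ is called a strong Morita equivalence bimodule if the following conditions are satisfied:
(i) Both inner products are nondegenerate and completely positive.
(ii) $\mathcal{B} \cdot {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} = {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} = {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \cdot \mathcal{A}$.
(iii) Both inner products are full, i.e.
\[ \mathbb{C}\text{-span}\{ \langle x, y \rangle_{\mathcal{A}}^{\mathcal{E}} \mid x, y \in {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \} = \mathcal{A}, \tag{8.6} \]
\[ \mathbb{C}\text{-span}\{ {}_{\mathcal{B}}\langle x, y \rangle^{\mathcal{E}} \mid x, y \in {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}} \} = \mathcal{B}. \tag{8.7} \]
(iv) We have the compatibility conditions
\[ \langle x, b \cdot y \rangle_{\mathcal{A}}^{\mathcal{E}} = \langle b^* \cdot x, y \rangle_{\mathcal{A}}^{\mathcal{E}}, \tag{8.8} \]
\[ {}_{\mathcal{B}}\langle x, y \cdot a \rangle^{\mathcal{E}} = {}_{\mathcal{B}}\langle x \cdot a^*, y \rangle^{\mathcal{E}}, \tag{8.9} \]
\[ {}_{\mathcal{B}}\langle x, y \rangle^{\mathcal{E}} \cdot z = x \cdot \langle y, z \rangle_{\mathcal{A}}^{\mathcal{E}} \tag{8.10} \]
for all $b \in \mathcal{B}$, $a \in \mathcal{A}$, and $x, y, z \in {}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$.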
If such a bimodule exists then $\mathcal{A}$ and $\mathcal{B}$ are called strongly Morita equivalent.

Remark 8.2 ($^*$-Morita Equivalence). Without the above complete positivity requirements this notion is called $^*$-Morita equivalence and the bimodules are called $^*$-Morita equivalence bimodules, see Ara’s works [1, 2].

Remark 8.3. It is easy to see that the $\mathcal{B}$-valued inner product is completely determined by (8.10) since this simply means that ${}_{\mathcal{B}}\langle x, y \rangle^{\mathcal{E}}$ acts as $\Theta_{x,y}$ or, in Dirac’s bra-ket notation, as $|x\rangle\langle y|$.

From now on we shall assume that all $^*$-algebras are nondegenerate in the sense that $a \cdot \mathcal{A} = 0$ implies $a = 0$ and idempotent in the sense that $a = \sum_i b_i c_i$ for any $a \in \mathcal{A}$ with some $b_i, c_i \in \mathcal{A}$. In particular, unital $^*$-algebras are nondegenerate and idempotent. This restriction is reasonable according to the following standard example:

Example 8.4. Consider the $(M_n(\mathcal{A}), \mathcal{A})$-bimodule $\mathcal{A}^n$ for $n \ge 1$ with the canonical inner product
\[ \langle x, y \rangle_{\mathcal{A}} = \sum_{i=1}^{n} x_i^* y_i \tag{8.11} \]
and ${}_{M_n(\mathcal{A})}\langle \cdot, \cdot \rangle$ is determined uniquely by the requirement (8.10). Then one can show that both inner products are indeed completely positive, see [37, Exercise 5.11]. Moreover, $\langle \cdot, \cdot \rangle_{\mathcal{A}}$ is a nondegenerate inner product if and only if $\mathcal{A}$ is nondegenerate and it is full if and only if $\mathcal{A}$ is idempotent. Thus, under the above assumption on the class of $^*$-algebras we are interested in, $\mathcal{A}$ is strongly Morita equivalent to $M_n(\mathcal{A})$ via $\mathcal{A}^n$.

Example 8.5. Strong Morita equivalence is implied by $^*$-isomorphism. Indeed, let $\Phi: \mathcal{A} \to \mathcal{B}$ be a $^*$-isomorphism. Then we take $\mathcal{B}$ as a left $\mathcal{B}$-module in the canonical way and endow it with a right $\mathcal{A}$-module structure by setting $x \cdot_\Phi a = x \Phi(a)$ for $x \in \mathcal{B}$ and $a \in \mathcal{A}$. For the inner products we take the canonical one with values in $\mathcal{B}$,
\[ {}_{\mathcal{B}}\langle x, y \rangle = x y^*, \tag{8.12} \]
and
\[ \langle x, y \rangle_{\mathcal{A}} = \Phi^{-1}(x^* y) \tag{8.13} \]
for the $\mathcal{A}$-valued one. A simple verification shows that this gives indeed a strong Morita equivalence bimodule. Hence $^*$-isomorphic $^*$-algebras are strongly Morita equivalent.
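The compatibility conditions of Definition 8.1 can be made concrete for the bimodule of Example 8.4 in the simplest case. The following is a minimal sketch, assuming the algebra is just the complex numbers, so that the bimodule consists of complex $n$-tuples and the left algebra is the algebra of complex $n \times n$ matrices; the function names and sample vectors are hypothetical. The matrix-valued inner product is taken to be the rank-one operator $|x\rangle\langle y|$ as in Remark 8.3.

```python
def inner_A(x, y):
    """A-valued inner product (8.11) for A = complex numbers: sum of conj(x_i) * y_i."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def inner_B(x, y):
    """M_n-valued inner product: the rank-one matrix |x><y| with entries x_i * conj(y_j)."""
    return [[a * b.conjugate() for b in y] for a in x]

def act_left(M, z):
    """Left action of a matrix M on a vector z."""
    return [sum(m * c for m, c in zip(row, z)) for row in M]

def act_right(x, a):
    """Right action of a scalar a on a vector x."""
    return [c * a for c in x]

def adjoint(M):
    """Conjugate transpose of a square matrix."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]

x = [1 + 2j, 3j]
y = [2 + 0j, 1 - 1j]
z = [1j, 4 + 0j]
b = [[1 + 1j, 2 + 0j], [0j, 3 - 1j]]

# (8.10): _B<x, y> . z  versus  x . <y, z>_A
lhs_810 = act_left(inner_B(x, y), z)
rhs_810 = act_right(x, inner_A(y, z))

# (8.8): <x, b . y>_A  versus  <b^* . x, y>_A
lhs_88 = inner_A(x, act_left(b, y))
rhs_88 = inner_A(act_left(adjoint(b), x), y)
```

In this sketch both pairs agree, which is just associativity of matrix multiplication written in the notation of (8.8) and (8.10).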
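Example 8.6. Let ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ be a strong Morita equivalence bimodule. Then we consider the complex-conjugate bimodule $\overline{\mathcal{E}}$: as an $\mathbb{R}$-module it is just $\mathcal{E}$ but $\mathbb{C}$ acts now as
\[ \alpha \bar{x} = \overline{\bar{\alpha} x} \tag{8.14} \]
where $x \mapsto \bar{x}$ is the identity map of the underlying $\mathbb{R}$-modules. Then $\overline{\mathcal{E}}$ becomes an $(\mathcal{A}, \mathcal{B})$-bimodule by the definitions
\[ a \cdot \bar{x} = \overline{x \cdot a^*} \quad \text{and} \quad \bar{x} \cdot b = \overline{b^* \cdot x}. \tag{8.15} \]
Moreover, we can take the “old” inner products of ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ and define
\[ {}_{\mathcal{A}}\langle \bar{x}, \bar{y} \rangle^{\overline{\mathcal{E}}} = \langle x, y \rangle_{\mathcal{A}}^{\mathcal{E}} \quad \text{and} \quad \langle \bar{x}, \bar{y} \rangle_{\mathcal{B}}^{\overline{\mathcal{E}}} = {}_{\mathcal{B}}\langle x, y \rangle^{\mathcal{E}}. \tag{8.16} \]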
Then a simple computation shows that ${}_{\mathcal{A}}\overline{\mathcal{E}}_{\mathcal{B}}$ with these inner products gives indeed a strong Morita equivalence $(\mathcal{A}, \mathcal{B})$-bimodule.

Example 8.4 for $n = 1$ gives that strong Morita equivalence is a reflexive relation, while Example 8.6 gives symmetry. For transitivity, we have to use again the tensor product operation $\widehat{\otimes}$ which can be shown to be compatible with the other inner product as well. Thus we finally arrive at the following statement [37, Theorem 5.9]:

Theorem 8.7. Within the class of non-degenerate and idempotent $^*$-algebras strong Morita equivalence is an equivalence relation.
We shall denote the tensor product of strong equivalence bimodules by $\widetilde{\otimes}$ to emphasize that in this case we have to take care of two inner products instead of one as for $\widehat{\otimes}$. Our original motivation of finding conditions for the equivalence of the representation theories of $^*$-algebras finds now a satisfactory answer:

Theorem 8.8. If ${}_{\mathcal{B}}\mathcal{E}_{\mathcal{A}}$ is a strong Morita equivalence bimodule then the corresponding Rieffel induction functors
\[ \mathsf{R}_{\mathcal{E}}: {}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{A}) \to {}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{B}) \tag{8.17} \]
and
\[ \mathsf{R}_{\overline{\mathcal{E}}}: {}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{B}) \to {}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{A}) \tag{8.18} \]
give an equivalence of categories of strongly nondegenerate $^*$-representations for all auxiliary $^*$-algebras $\mathcal{D}$.

Since it will turn out to be much easier to determine the strongly Morita equivalent $^*$-algebras to a given $^*$-algebra $\mathcal{A}$ than to understand ${}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\mathcal{A})$ itself, we are now interested in finding invariants of strong Morita equivalence like ${}^*\text{-}\mathrm{Rep}_{\mathcal{D}}(\cdot)$. For a detailed comparison of strong Morita equivalence with the original definition of Rieffel [120, 121], which contains also additional completeness requirements, we refer the reader to [34, 37]. It turns out that the strong Morita theory of $C^*$-algebras in Rieffel’s sense is already completely determined by the above algebraic version. Thus it is indeed a generalization of Rieffel’s definition.
8.3. The strong Picard Groupoid

In order to understand strong Morita equivalence and its invariants better, it is useful to consider not only the question of whether there is a strong Morita equivalence bimodule between $\mathcal A$ and $\mathcal B$ at all but also how many there may be.
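Definition 8.9. For $^*$-algebras $\mathcal A$, $\mathcal B$ we define $\mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A)$ to be the class of isometric isomorphism classes of strong Morita equivalence $(\mathcal B, \mathcal A)$-bimodules. We set $\mathrm{Pic}^{\mathrm{str}}(\mathcal A) = \mathrm{Pic}^{\mathrm{str}}(\mathcal A, \mathcal A)$.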
Here isometric isomorphism classes mean isomorphic as $(\mathcal B, \mathcal A)$-bimodules and isometric with respect to both inner products. If $\mathcal A$ and $\mathcal B$ are unital we already know that the strong Morita equivalence bimodules are (particular) finitely generated projective modules. Thus the class $\mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A)$ is a set. In the following we shall ignore the possible logical subtleties which may arise for non-unital $^*$-algebras for which we do not know a priori if $\mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A)$ is a set at all. There are analogous definitions using $^*$-Morita equivalence or ring-theoretic Morita equivalence yielding $\mathrm{Pic}^{*}(\cdot, \cdot)$ and $\mathrm{Pic}(\cdot, \cdot)$, see e.g. [8, 11] for the ring-theoretic version. We have now the following structure for the collection of all $\mathrm{Pic}^{\mathrm{str}}(\cdot, \cdot)$, see [37, Sec. 6.1] and [136, 137]:

Theorem 8.10 (Strong Picard Groupoid). $\mathrm{Pic}^{\mathrm{str}}(\cdot, \cdot)$ is a large groupoid, called the strong Picard Groupoid, with the $^*$-algebras as units, the complex conjugate bimodules $[{}_{\mathcal A}\overline{\mathcal E}_{\mathcal B}]$ as inverses and the tensor product $[{}_{\mathcal C}\mathcal F_{\mathcal B}][{}_{\mathcal B}\mathcal E_{\mathcal A}] = [{}_{\mathcal C}\mathcal F_{\mathcal B} \,\widetilde\otimes_{\mathcal B}\, {}_{\mathcal B}\mathcal E_{\mathcal A}]$ as product.
The proof consists in showing the groupoid requirements up to isomorphisms for the bimodules directly. A large groupoid means that the collection of objects is not necessarily a set. Here it is the class of $^*$-algebras over $\mathsf C$ which are non-degenerate and idempotent.

Corollary 8.11 (Strong Picard Group). $\mathrm{Pic}^{\mathrm{str}}(\mathcal A)$ is a group, called the strong Picard group of $\mathcal A$. It corresponds to the isotropy group of the strong Picard Groupoid at the unit $\mathcal A$.

Corollary 8.12. A $^*$-algebra $\mathcal B$ is strongly Morita equivalent to $\mathcal A$ if and only if they lie in the same orbit of $\mathrm{Pic}^{\mathrm{str}}$. In this case $\mathrm{Pic}^{\mathrm{str}}(\mathcal B) \cong \mathrm{Pic}^{\mathrm{str}}(\mathcal A)$ as groups and $\mathrm{Pic}^{\mathrm{str}}(\mathcal A)$ acts freely and transitively on $\mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A)$.

Corollary 8.13. There are canonical “forgetful” Groupoid morphisms
\[
\begin{array}{ccc}
\mathrm{Pic}^{\mathrm{str}}(\cdot, \cdot) & \xrightarrow{\ (a)\ } & \mathrm{Pic}^{*}(\cdot, \cdot) \\
 & {\scriptstyle (c)}\searrow & \big\downarrow{\scriptstyle (b)} \\
 & & \mathrm{Pic}(\cdot, \cdot)
\end{array}
\tag{8.19}
\]
such that this diagram commutes.
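The statements of Corollary 8.12 about the isotropy groups and the free, transitive action are generic groupoid facts. As a sketch, using only the groupoid structure of Theorem 8.10 (units $[\mathcal A]$, inverses $[\overline{\mathcal E}]$) and writing $[\mathcal E], [\mathcal E'] \in \mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A)$ for two arbitrary classes and $[\mathcal F] \in \mathrm{Pic}^{\mathrm{str}}(\mathcal A)$:

```latex
\begin{align*}
&\text{conjugation:} &
  \mathrm{Pic}^{\mathrm{str}}(\mathcal A) \ni [\mathcal F]
  &\;\longmapsto\; [\mathcal E]\,[\mathcal F]\,[\overline{\mathcal E}]
  \in \mathrm{Pic}^{\mathrm{str}}(\mathcal B), \\
&\text{transitivity:} &
  [\mathcal E'] &= [\mathcal E]\,\bigl([\overline{\mathcal E}]\,[\mathcal E']\bigr),
  \qquad [\overline{\mathcal E}]\,[\mathcal E'] \in \mathrm{Pic}^{\mathrm{str}}(\mathcal A), \\
&\text{freeness:} &
  [\mathcal E]\,[\mathcal F] = [\mathcal E]
  &\;\Longrightarrow\;
  [\mathcal F] = [\overline{\mathcal E}]\,[\mathcal E]\,[\mathcal F]
              = [\overline{\mathcal E}]\,[\mathcal E] = [\mathcal A].
\end{align*}
```

Conjugation is a group morphism with inverse $[\mathcal G] \mapsto [\overline{\mathcal E}]\,[\mathcal G]\,[\mathcal E]$, which gives the isomorphism of the isotropy groups along an orbit.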
Remark 8.14. In general the groupoid morphism (a) is not surjective as there may be more inner products (with different “signatures”) on a $^*$-equivalence bimodule than only the positive ones. For the same reason, (b) is not injective in general. However, even (c) shows a non-trivial and rich behavior: it is neither surjective nor injective in general. For $C^*$-algebras it turns out to be always injective but not necessarily surjective. Thus we obtain interesting information about $\mathcal A$ by considering these Groupoid morphisms.
8.4. Actions and invariants

The idea we want to discuss now is that strong Morita invariants can arise from groupoid actions of $\mathrm{Pic}^{\mathrm{str}}$ on “something”. Then “something” is preserved along the orbits of the groupoid $\mathrm{Pic}^{\mathrm{str}}$, i.e. the strong Morita equivalence classes of $^*$-algebras. This is of course more a philosophical statement than a theorem and we do not want to make any attempt to make this proposal precise. However, we can illustrate this principle by several examples following [137]:

Example 8.15 (Strong Picard Groups). The strong Picard group $\mathrm{Pic}^{\mathrm{str}}(\mathcal A)$ is a strong Morita invariant. Indeed, $\mathrm{Pic}^{\mathrm{str}}$ acts on itself so the isotropy groups are all isomorphic along an orbit. Any element in $\mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A)$ provides then a group isomorphism between $\mathrm{Pic}^{\mathrm{str}}(\mathcal A)$ and $\mathrm{Pic}^{\mathrm{str}}(\mathcal B)$. This is in some sense the most fundamental Morita invariant.

Example 8.16 (Hermitian $K_0$-Groups). Recall that the Hermitian $K_0$-group $K_0^H(\mathcal A)$ of a unital $^*$-algebra $\mathcal A$ is defined as follows: one considers finitely generated projective modules with strongly nondegenerate and completely positive inner products $\langle\cdot,\cdot\rangle_{\mathcal A}$. We can take direct orthogonal sums without losing these properties so taking isometric isomorphism classes gives us an (abelian) semigroup with respect to $\oplus$. Then $K_0^H(\mathcal A)$ is defined as the Grothendieck group of this semigroup. Now if $\mathcal F_{\mathcal B}$ is such a finitely generated projective module and ${}_{\mathcal B}\mathcal E_{\mathcal A}$ is a strong Morita equivalence bimodule then $\mathcal F_{\mathcal B} \,\widehat\otimes_{\mathcal B}\, {}_{\mathcal B}\mathcal E_{\mathcal A}$ is again a finitely generated projective right $\mathcal A$-module and the $\mathcal A$-valued inner product is still strongly nondegenerate. Moreover, $\widehat\otimes$ is clearly compatible with direct orthogonal sums. Thus, by passing to isometric isomorphism classes, one obtains an action of $\mathrm{Pic}^{\mathrm{str}}$ on $K_0^H$
\[ K_0^H(\mathcal B) \times \mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A) \to K_0^H(\mathcal A) \tag{8.20} \]
by group isomorphisms. This has the following consequences: First, the strong Picard group $\mathrm{Pic}^{\mathrm{str}}(\mathcal A)$ acts by group isomorphisms on the abelian group $K_0^H(\mathcal A)$. Second, $K_0^H(\mathcal A)$ is a strong Morita invariant, even as a $\mathrm{Pic}^{\mathrm{str}}(\mathcal A)$-space.

Example 8.17 (Representation Theories). The strong Picard Groupoid acts on the representation theories ${}^*\text{-}\mathrm{Rep}_{\mathcal D}(\cdot)$ by Rieffel induction
\[ \mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A) \times {}^*\text{-}\mathrm{Rep}_{\mathcal D}(\mathcal A) \to {}^*\text{-}\mathrm{Rep}_{\mathcal D}(\mathcal B). \tag{8.21} \]
However, this is not an honest action as the action properties are only fulfilled up to unitary equivalences of representations. Thus this should better be interpreted as an “action” of the strong Picard bigroupoid on the categories ${}^*\text{-}\mathrm{Rep}_{\mathcal D}(\cdot)$, where the strong Picard bigroupoid consists of all equivalence bimodules without identifying them up to isometric isomorphisms. Since it would require 10 additional pages of commutative diagrams to give a definition of what the action of a bigroupoid should be, we do not want to make this more precise but leave it as a heuristic example to challenge the imagination of the reader, see also [11]. Another option is to consider the unitary equivalence classes of $^*$-representations instead of ${}^*\text{-}\mathrm{Rep}_{\mathcal D}(\cdot)$: Then the Picard Groupoid acts by Rieffel induction in a well-defined way.

There are many more examples of strong Morita invariants like the centers of $^*$-algebras or their lattices of closed $^*$-ideals in the sense of [34]. Thus it is interesting to see whether one can view all strong Morita invariants as arising from an appropriate action of the strong Picard Groupoid:

Question 8.18. Can one view any strong Morita invariant as coming from an action of the strong Picard Groupoid?

Probably it becomes tautological if one formulates this in the appropriate context. Nevertheless, a consequence of a positive answer would be that any strong Morita invariant carries an action of the Picard group which is itself invariant. It is clear that also in the ring-theoretic framework as well as for $^*$-Morita equivalence one can pose the same question. In fact, some of the above strong Morita invariants have their immediate and well-known analogs for these coarser notions of Morita equivalence.
8.5. Strong versus ring-theoretic Morita equivalence

Let us now discuss the relation between strong Morita equivalence and ring-theoretic Morita equivalence more closely. For simplicity, we shall focus on unital $^*$-algebras throughout this section. Then it is clear that strong Morita equivalence implies Morita equivalence since we have even a groupoid morphism
\[ \mathrm{Pic}^{\mathrm{str}} \to \mathrm{Pic}. \tag{8.22} \]
Thus if $\mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A)$ is non-empty then the image under (8.22) is non-empty as well. To understand the relation between strong and ring-theoretic Morita equivalence better we want to understand the kernel and the image of the groupoid morphism (8.22). Thus we first have to determine the structure of strong equivalence bimodules as precisely as possible. The following proposition gives a simple proof of the well-known fact that equivalence bimodules are finitely generated and projective using
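the inner products of a strong equivalence bimodule:

Proposition 8.19. Let ${}_{\mathcal B}\mathcal E_{\mathcal A}$ be a strong Morita equivalence bimodule. Then there exist $\xi_i, \eta_i \in {}_{\mathcal B}\mathcal E_{\mathcal A}$, $i = 1, \ldots, n$, such that
\[ x = \sum_{i=1}^n \xi_i \cdot \langle \eta_i, x\rangle^{\mathcal E}_{\mathcal A} \tag{8.23} \]
for all $x \in {}_{\mathcal B}\mathcal E_{\mathcal A}$. It follows that ${}_{\mathcal B}\mathcal E_{\mathcal A}$ is finitely generated and projective as a right $\mathcal A$-module and by symmetry the same statement holds for $\mathcal B$.

Proof. Indeed, let $\mathbb{1}_{\mathcal B} = \sum_{i=1}^n {}_{\mathcal B}\langle \xi_i, \eta_i\rangle^{\mathcal E}$ by fullness. Then the compatibility of the inner products gives
\[ x = \mathbb{1}_{\mathcal B} \cdot x = \sum_{i=1}^n {}_{\mathcal B}\langle \xi_i, \eta_i\rangle^{\mathcal E} \cdot x = \sum_{i=1}^n \xi_i \cdot \langle \eta_i, x\rangle^{\mathcal E}_{\mathcal A}. \]
Since $\langle\cdot,\cdot\rangle^{\mathcal E}_{\mathcal A}$ is $\mathcal A$-linear to the right in the second argument, it follows that the $\xi_i$ together with the functionals $\langle \eta_i, \cdot\rangle^{\mathcal E}_{\mathcal A}$ form a finite dual basis. By the dual basis lemma, see e.g. [104, Lemma 2.9], this is equivalent to the fact that ${}_{\mathcal B}\mathcal E_{\mathcal A}$ is finitely generated (by the generators $\xi_i$), and projective.

We shall call the $\{\xi_i, \eta_i\}_{i=1,\ldots,n}$ with the above property a Hermitian dual basis. Thus we have
\[ \mathcal E_{\mathcal A} \cong e\mathcal A^n \tag{8.24} \]
for some idempotent $e = e^2 \in M_n(\mathcal A)$. In fact, $e$ can be expressed in terms of the Hermitian dual basis explicitly by
\[ e = (e_{ij}) \quad\text{with}\quad e_{ij} = \langle \eta_i, \xi_j\rangle^{\mathcal E}_{\mathcal A}, \tag{8.25} \]
and the isomorphism (8.24) is simply given by
\[ \mathcal E_{\mathcal A} \ni x \mapsto \bigl(\langle \eta_i, x\rangle^{\mathcal E}_{\mathcal A}\bigr)_{i=1,\ldots,n} \in e\mathcal A^n \subseteq \mathcal A^n. \tag{8.26} \]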
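As a consistency check, the Hermitian dual basis property (8.23) alone already implies that the matrix $e$ is idempotent and that the map (8.26) takes values in $e\mathcal A^n$. The following sketch reads the entries as $e_{ij} = \langle \eta_i, \xi_j\rangle^{\mathcal E}_{\mathcal A}$ and uses only the right $\mathcal A$-linearity of the inner product in the second argument; the superscript $\mathcal E$ is suppressed.

```latex
\begin{align*}
\sum_{j=1}^n e_{ij}\, e_{jk}
  &= \sum_{j=1}^n \langle \eta_i, \xi_j\rangle_{\mathcal A}\,
                  \langle \eta_j, \xi_k\rangle_{\mathcal A}
   = \Bigl\langle \eta_i,\ \sum_{j=1}^n \xi_j \cdot
       \langle \eta_j, \xi_k\rangle_{\mathcal A} \Bigr\rangle_{\mathcal A}
   = \langle \eta_i, \xi_k\rangle_{\mathcal A}
   = e_{ik}, \\
\sum_{j=1}^n e_{ij}\, \langle \eta_j, x\rangle_{\mathcal A}
  &= \Bigl\langle \eta_i,\ \sum_{j=1}^n \xi_j \cdot
       \langle \eta_j, x\rangle_{\mathcal A} \Bigr\rangle_{\mathcal A}
   = \langle \eta_i, x\rangle_{\mathcal A}.
\end{align*}
```

The first line applies (8.23) to $x = \xi_k$; the second shows that the column $(\langle \eta_i, x\rangle_{\mathcal A})_i$ is fixed by $e$.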
In particular, it follows that the inner products on a strong Morita equivalence bimodule are always strongly nondegenerate, see Definition 7.1, in the case of unital $^*$-algebras. Note however, that we cannot conclude that $e$ can be chosen to be a Hermitian idempotent, i.e. a projection. Thus the question of what the inner product $\langle\cdot,\cdot\rangle^{\mathcal E}_{\mathcal A}$ looks like when we identify ${}_{\mathcal B}\mathcal E_{\mathcal A}$ with $e\mathcal A^n$ is difficult to answer: How many completely positive, full and nondegenerate $\mathcal A$-valued inner products can we have on $e\mathcal A^n$ up to isometries? In order to be able to say something one has to assume additional properties of the $^*$-algebras in question. Motivated by the case of $C^*$-algebras we state the following conditions:

(I) For all $n \in \mathbb N$ and $A \in M_n(\mathcal A)$ the element $\mathbb{1} + A^*A$ is invertible.
In particular, since we require this condition for all $n$ we also have the invertibility of $\mathbb{1} + A_1^*A_1 + \cdots + A_k^*A_k$ for $A_1, \ldots, A_k \in M_n(\mathcal A)$. The relevance of this condition (I) is classical, see Kaplansky’s book [95, Theorem 26]:

Lemma 8.20. Assume that $\mathcal A$ satisfies (I). Then for any idempotent $e = e^2 \in M_n(\mathcal A)$ there exists a projection $P = P^2 = P^* \in M_n(\mathcal A)$ and $u, v \in M_n(\mathcal A)$ with
\[ e = uv \quad\text{and}\quad P = vu, \tag{8.27} \]
whence the projective modules $e\mathcal A^n$ and $P\mathcal A^n$ are isomorphic via $v$ and $u$.

Thus having the property (I) we can assume for any finitely generated projective module that $e\mathcal A^n \cong P\mathcal A^n$ with some projection instead of a general idempotent. On $P\mathcal A^n$ there is the restriction of the canonical inner product $\langle\cdot,\cdot\rangle$ of $\mathcal A^n$ such that
\[ \mathfrak{B}(P\mathcal A^n, \langle\cdot,\cdot\rangle) \cong P M_n(\mathcal A) P \tag{8.28} \]
as $^*$-algebras, since $P = P^*$. The next question is how many other inner products of interest one has on $P\mathcal A^n$. The following technical condition will guarantee that there is only one up to isometric isomorphisms. Again, $C^*$-algebras are the motivation for this condition:

(II) Let $P_\alpha \in M_n(\mathcal A)$ be finitely many pairwise orthogonal projections $P_\alpha P_\beta = \delta_{\alpha\beta} P_\alpha = \delta_{\alpha\beta} P_\alpha^*$ such that $\sum_\alpha P_\alpha = \mathbb{1}$ and let $H \in M_n(\mathcal A)^+$ be invertible. If $[H, P_\alpha] = 0$ then there exists an invertible $U$ (depending on the $P_\alpha$ and on $H$) such that $H = U^*U$ and $[U, P_\alpha] = 0$.

This mimics in some sense the spectral calculus for matrices. For general $C^*$-algebras this is obviously fulfilled since here for any positive $H$ one even has a unique positive square root $\sqrt{H}$ which commutes with all elements commuting with $H$.

Assume that $\mathcal A$ satisfies (II) and let $h: P\mathcal A^n \times P\mathcal A^n \to \mathcal A$ be a completely positive and strongly nondegenerate inner product. Then we can extend $h$ to $\mathcal A^n$ by using e.g. the restriction of the canonical inner product $\langle\cdot,\cdot\rangle$ on $(\mathbb{1} - P)\mathcal A^n$. The result is a completely positive and strongly nondegenerate inner product on the free module $\mathcal A^n$ which we denote by $\hat h(\cdot,\cdot)$. Then we define the matrix $H \in M_n(\mathcal A)$ by
\[ H_{ij} = \hat h(e_i, e_j) = h(Pe_i, Pe_j) + \langle (\mathbb{1} - P)e_i, (\mathbb{1} - P)e_j\rangle, \tag{8.29} \]
whence
\[ \hat h(x, y) = \langle x, Hy\rangle \tag{8.30} \]
for all $x, y \in \mathcal A^n$. Since $\hat h$ is completely positive $H$ is a positive matrix and since $\hat h$ is strongly nondegenerate one finds that $H$ is invertible. Moreover, it is clear
that $[P, H] = 0$, since $\hat h(Px, y) = h(Px, Py) = \hat h(x, Py)$ for all $x, y \in \mathcal A^n$. Thus we can apply (II) and find an invertible $U \in M_n(\mathcal A)$ with $H = U^*U$ and $[P, U] = 0$. Thus
\[ \hat h(x, y) = \langle x, Hy\rangle = \langle Ux, Uy\rangle \tag{8.31} \]
whence $\hat h$ is isometric to the canonical inner product $\langle\cdot,\cdot\rangle$. Since $[P, U] = 0$ the isometry $U$ restricts to the projective module $P\mathcal A^n$ and gives an isometric isomorphism between $h$ and the restriction of $\langle\cdot,\cdot\rangle$.

Lemma 8.21. Assume that $\mathcal A$ satisfies (II) and let $P = P^* = P^2 \in M_n(\mathcal A)$. Then any two completely positive and strongly nondegenerate inner products on $P\mathcal A^n$ are isometric.

Combining both properties leads to the following characterization of strong Morita equivalence bimodules:

Theorem 8.22. Let $\mathcal A$ and $\mathcal B$ be unital $^*$-algebras and assume that $\mathcal A$ satisfies (I) and (II). If ${}_{\mathcal B}\mathcal E_{\mathcal A}$ is a $^*$-Morita equivalence bimodule with completely positive inner product $\langle\cdot,\cdot\rangle^{\mathcal E}_{\mathcal A}$ then we have:
(i) There exists a full projection $P \in M_n(\mathcal A)$ such that $\mathcal E_{\mathcal A} \cong P\mathcal A^n$ are isometrically isomorphic.
(ii) The left action of $\mathcal B$ on ${}_{\mathcal B}\mathcal E_{\mathcal A}$ and the above isomorphism induce a $^*$-isomorphism $\mathcal B \cong P M_n(\mathcal A) P$ and under this isomorphism ${}_{\mathcal B}\langle\cdot,\cdot\rangle^{\mathcal E}$ becomes the canonical $P M_n(\mathcal A) P$-valued inner product on $P\mathcal A^n$.
(iii) ${}_{\mathcal B}\langle\cdot,\cdot\rangle^{\mathcal E}$ is necessarily completely positive whence ${}_{\mathcal B}\mathcal E_{\mathcal A}$ is already a strong Morita equivalence bimodule.
Conversely, any full projection $P \in M_n(\mathcal A)$ gives a strong Morita equivalence bimodule $P\mathcal A^n$ between $\mathcal A$ and $P M_n(\mathcal A) P$.

The fullness of the projection $P$ is equivalent to the statement that the canonical inner product on $P\mathcal A^n$ is full. One easily obtains the following consequences of this theorem:

Theorem 8.23. Conditions (I) and (II) together are strongly Morita invariant.

To see this, we only have to check it for $P M_n(\mathcal A) P$ by hand which is straightforward.

Theorem 8.24. Within the class of unital $^*$-algebras satisfying (I) and (II) the groupoid morphism
\[ \mathrm{Pic}^{\mathrm{str}} \to \mathrm{Pic} \tag{8.32} \]
is injective (though not necessarily surjective).

This is also clear since on a Morita equivalence bimodule we can have at most one inner product up to isometric isomorphisms according to Theorem 8.22. This also
implies the following result for general finitely generated projective modules:

Theorem 8.25. For a unital $^*$-algebra $\mathcal A$ satisfying (I) and (II) we have canonically
\[ K_0^H(\mathcal A) \cong K_0(\mathcal A). \tag{8.33} \]

The question of surjectivity of (8.32) is actually more subtle. Here we have to impose first another condition on the $^*$-algebras we consider. The condition is not on a single $^*$-algebra but on a whole family of $^*$-algebras under consideration:

(III) Let $\mathcal A$ and $\mathcal B$ be unital $^*$-algebras and let $P \in M_n(\mathcal A)$ be a projection and consider the $^*$-algebra $P M_n(\mathcal A) P$. If $\mathcal B$ and $P M_n(\mathcal A) P$ are isomorphic as unital algebras then they are also $^*$-isomorphic.

In fact, for unital $C^*$-algebras this is always fulfilled as in this case $P M_n(\mathcal A) P$ is a $C^*$-algebra again and thus the $^*$-involution is uniquely determined, see [125, Theorem 4.1.20]. Another class of $^*$-algebras satisfying this condition are the Hermitian star products. In fact, if $\star$ is a star product having a $^*$-involution of the form $f \mapsto \bar f + o(\lambda)$ then it is $^*$-equivalent to a Hermitian star product, see [36, Lemma 5].

Now consider the automorphism group $\mathrm{Aut}(\mathcal B)$ of $\mathcal B$. Then $\mathrm{Aut}(\mathcal B)$ acts on the left on the set $\mathrm{Pic}(\mathcal B, \mathcal A)$ in the following way (see also Example 8.5): The left $\mathcal B$-module structure of ${}_{\mathcal B}\mathcal E_{\mathcal A}$ is twisted by $\Phi$ as
\[ b \cdot_\Phi x = \Phi^{-1}(b) \cdot x \tag{8.34} \]
while the right $\mathcal A$-module structure is untouched. This gives again an equivalence bimodule, now denoted by ${}^\Phi\mathcal E$. It can be checked easily that this descends to a group action of $\mathrm{Aut}(\mathcal B)$ on $\mathrm{Pic}(\mathcal B, \mathcal A)$, see e.g. the discussion in [8, 38]. The problem with the surjectivity is that for a given ring-theoretic equivalence bimodule we may obtain the “wrong” $^*$-involution induced for $\mathcal B$:

Proposition 8.26. Let $\mathcal A$, $\mathcal B$ satisfy condition (III) and let $\mathcal A$ satisfy (I) and (II). Then the map
\[ \mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A) \to \mathrm{Pic}(\mathcal B, \mathcal A) \big/ \mathrm{Aut}(\mathcal B) \tag{8.35} \]
is onto.

There are some immediate consequences when we apply this to the previous examples like $C^*$-algebras and star products:

Corollary 8.27. Within a class of unital $^*$-algebras satisfying (I), (II) and (III) ring-theoretic Morita equivalence implies strong Morita equivalence.

In fact, for $C^*$-algebras this is Beer’s theorem [10] while for star products this was obtained in [36], see Corollary 9.4.
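Returning to the twisted bimodule (8.34): the appearance of $\Phi^{-1}$ rather than $\Phi$ is what makes the twisting a left action of $\mathrm{Aut}(\mathcal B)$. A short check, where $\Psi \in \mathrm{Aut}(\mathcal B)$ is a second automorphism introduced only for this illustration:

```latex
\begin{align*}
(b b') \cdot_\Phi x
  &= \Phi^{-1}(b)\,\Phi^{-1}(b') \cdot x
   = b \cdot_\Phi (b' \cdot_\Phi x), \\
(b \cdot_\Phi x) \cdot a
  &= \Phi^{-1}(b) \cdot (x \cdot a)
   = b \cdot_\Phi (x \cdot a), \\
b \cdot_{\Psi} \bigl(\text{on } {}^{\Phi}\mathcal E\bigr)\, x
  &= \Psi^{-1}(b) \cdot_\Phi x
   = \Phi^{-1}\bigl(\Psi^{-1}(b)\bigr) \cdot x
   = (\Psi \circ \Phi)^{-1}(b) \cdot x .
\end{align*}
```

The last line says ${}^{\Psi}\bigl({}^{\Phi}\mathcal E\bigr) = {}^{\Psi \circ \Phi}\mathcal E$; twisting with $\Phi$ instead of $\Phi^{-1}$ would reverse the order of composition.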
The obstruction whether (8.32) is onto and not only onto up to automorphisms can be encoded in a particular property of the automorphism group of the algebras. We state the last condition:

(IV) For any $\Phi \in \mathrm{Aut}(\mathcal A)$ there is an invertible $U \in \mathcal A$ such that $\Phi^*\Phi^{-1} = \mathrm{Ad}(U^*U)$ where $\Phi^*(a) = \Phi(a^*)^*$.

In particular, if $\Phi$ is even a $^*$-automorphism then $\Phi^* = \Phi$ whence the condition is trivially fulfilled for those. So the condition says that those automorphisms which are not $^*$-automorphisms have to be “essentially inner”.

Theorem 8.28. Consider $^*$-algebras $\mathcal A$, $\mathcal B$ in a class of unital $^*$-algebras satisfying (I), (II) and (III).
(i) $\mathrm{Pic}^{\mathrm{str}}(\mathcal B, \mathcal A) \to \mathrm{Pic}(\mathcal B, \mathcal A)$ is surjective if and only if $\mathcal B$ satisfies (IV).
(ii) Within this class, (IV) is strongly Morita invariant.

The condition (IV) captures very interesting properties of the automorphism group. One can find explicit examples of $C^*$-algebras where (IV) is not satisfied. Moreover, for star products, the automorphisms in question can be written as exponentials of derivations, see [37, Proposition 8.8] whence one arrives at the question whether certain derivations are inner or not, see [37, Theorem 8.9]:

Example 8.29 (Aharonov–Bohm Effect). Let $\star$, $\star'$ be strongly Morita equivalent star products on $M$. Then $\mathrm{Pic}^{\mathrm{str}}(\star', \star) \to \mathrm{Pic}(\star', \star)$ is bijective if and only if all derivations of $\star$ are quasi-inner, i.e. of the form $D = \frac{i}{\lambda}\,\mathrm{ad}(H)$ with some $H \in C^\infty(M)[[\lambda]]$. In particular, if $M$ is symplectic, then this is the case if and only if $H^1_{\mathrm{dR}}(M, \mathbb C) = \{0\}$. On the other hand, as argued in [23] for the case of cotangent bundles and more generally in [38], the first de Rham cohomology is responsible for Aharonov–Bohm like effects in deformation quantization. Thus the question of surjectivity of (8.32) gets the physical interpretation of the question whether there are Aharonov–Bohm effects possible or not.

9. (Strong) Morita Equivalence of Star Products

We shall now apply the results of the last section to Hermitian deformations of $^*$-algebras in order to investigate their strong Morita theory. First we have to discuss how Conditions (I) and (II) behave under deformations, in particular for the case of star products. Again, in this section all $^*$-algebras are assumed to be unital.

9.1. Deformed $^*$-algebras

We consider a Hermitian deformation $\boldsymbol{\mathcal A} = (\mathcal A[[\lambda]], \star)$ of a unital $^*$-algebra $\mathcal A$. Then the following observation is trivial:

Lemma 9.1. The $^*$-algebra $\boldsymbol{\mathcal A}$ satisfies (I) if and only if $\mathcal A$ satisfies (I).
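Although Lemma 9.1 is immediate, it may help to spell out the mechanism: an element of a formal deformation is $\star$-invertible precisely when its classical limit is invertible. The sketch below assumes that the $^*$-involution of a Hermitian deformation is the undeformed one, so that taking the classical limit commutes with taking adjoints; the symbols $a = \sum_r \lambda^r a_r$, $b$ and $c$ are introduced only for this illustration.

```latex
\begin{align*}
&\text{(i)}\quad a \star b = \mathbb{1}
   \;\Longrightarrow\; a_0 b_0 = \mathbb{1}
   \qquad\text{(zeroth order in } \lambda\text{)}, \\
&\text{(ii)}\quad a_0 \text{ invertible}
   \;\Longrightarrow\; a_0^{-1} \star a = \mathbb{1} + \lambda c
   \;\Longrightarrow\;
   \Bigl(\sum_{k=0}^{\infty} (-\lambda)^k\, c^{\star k}\Bigr)
     \star a_0^{-1} \star a = \mathbb{1}, \\
&\text{(iii)}\quad
   \bigl(\mathbb{1} + A^* \star A\bigr)\big|_{\lambda = 0}
   = \mathbb{1} + A_0^* A_0 .
\end{align*}
```

Step (ii) gives a left inverse; the same argument applied to $a \star a_0^{-1}$ gives a right inverse, so $a$ is $\star$-invertible. Applying (i) and (ii) to $a = \mathbb{1} + A^* \star A$ with (iii), and noting that every $A_0 \in M_n(\mathcal A)$ is the classical limit of itself viewed as a constant series, yields both directions of the lemma.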
Thus Condition (I) is rigid under Hermitian deformations. More subtle and surprising is the following rigidity statement:

Theorem 9.2. Condition (II) is rigid under completely positive deformations.

The idea of the proof is to consider an invertible $\boldsymbol H \in M_n(\boldsymbol{\mathcal A})^+$ whence its classical limit $H_0 \in M_n(\mathcal A)^+$ is still invertible and positive according to Corollary 5.8, adapted to this more general formulation. Then if $\boldsymbol P_\alpha$ are pairwise $\star$-commuting projections commuting with $\boldsymbol H$ then their classical limits $P_\alpha$ are pairwise commuting projections commuting with $H_0$ whence we can apply (II) for the classical limit $\mathcal A$ and find a $U_0$. Then the idea is to lift $U_0$ in an appropriate way to find $\boldsymbol U$ with $\boldsymbol H = \boldsymbol U^* \star \boldsymbol U$ and $[\boldsymbol P_\alpha, \boldsymbol U]_\star = 0$.

This rigidity is very nice since star products are completely positive deformations according to Theorem 5.6. Thus we have the following consequences:

Corollary 9.3. Hermitian star products satisfy (I) and (II).

Corollary 9.4. Hermitian star products are strongly Morita equivalent if and only if they are Morita equivalent. Moreover, the groupoid morphism
\[ \mathrm{Pic}^{\mathrm{str}}(\star', \star) \to \mathrm{Pic}(\star', \star) \tag{9.1} \]
is injective.

Thus we only have to understand the ring-theoretic Morita equivalence of star products to get the strong Morita equivalence for free. Note however, that the (non-)surjectivity of (9.1) depends very much on the star products under consideration.

9.2. Deformed projections

We shall now simplify our discussion to the ring-theoretic Morita equivalence as for star products this is sufficient thanks to Corollary 9.4. For a given (Hermitian) deformation $\boldsymbol{\mathcal A} = (\mathcal A[[\lambda]], \star)$ we have to find the full idempotents $\boldsymbol P \in M_n(\boldsymbol{\mathcal A})$ in order to find all other algebras $\boldsymbol{\mathcal B}$ which are Morita equivalent to $\boldsymbol{\mathcal A}$ since then
\[ \boldsymbol{\mathcal B} = \boldsymbol P \star M_n(\boldsymbol{\mathcal A}) \star \boldsymbol P \tag{9.2} \]
gives all Morita equivalent algebras up to isomorphism. Thus we have to investigate the idempotents in $M_n(\boldsymbol{\mathcal A})$. From the defining equation $\boldsymbol P \star \boldsymbol P = \boldsymbol P$ we find in zeroth order
\[ P_0 \cdot P_0 = P_0, \tag{9.3} \]
where $\boldsymbol P = \sum_{r=0}^\infty \lambda^r P_r$. Thus the classical limit of an idempotent $\boldsymbol P$ is an idempotent $P_0$ for the undeformed product.

Lemma 9.5. $\boldsymbol P$ is full if and only if $P_0$ is full.
This is straightforward to show. The next observation is less trivial and can be found implicitly in e.g. [63, 78] while the explicit formula can be found in [69, Eq. (6.1.4)]: If $P_0 \in M_n(\mathcal A)$ is an idempotent with respect to the undeformed product then
\[ \boldsymbol P = \frac12 + \Bigl(P_0 - \frac12\Bigr) \star \frac{1}{\sqrt[\star]{1 + 4(P_0 \star P_0 - P_0)}} \tag{9.4} \]
defines an idempotent $\boldsymbol P \in M_n(\boldsymbol{\mathcal A})$ with respect to $\star$. Here we have to assume $\mathbb Q \subseteq \mathsf R$ in order to make the series well-defined. Note that as a formal power series in $\lambda$ the right hand side in (9.4) is well-defined since in zeroth order $P_0 \star P_0 - P_0$ vanishes. Moreover, the classical limit of this $\boldsymbol P$ coincides with $P_0$ and if $P_0$ is already an idempotent with respect to $\star$, then the formula reproduces $P_0$. Finally, if $\star$ is a Hermitian deformation of a $^*$-algebra and if $P_0$ is even a projection, i.e. $P_0^* = P_0$, then $\boldsymbol P$ is also a projection.

The next statement concerns the uniqueness of the deformation $\boldsymbol P$ of a given projection $P_0$. First recall that two idempotents $P$ and $Q$ are called equivalent if (after embedding into some big matrix algebra $M_n(\mathcal A)$) there exist $U$, $V$ such that $P = UV$ and $Q = VU$. This is the case if and only if the corresponding projective modules $P\mathcal A^n$ and $Q\mathcal A^n$ are isomorphic as right $\mathcal A$-modules. In fact, $U$ and $V$ provide such mutually inverse module isomorphisms when restricted to $P\mathcal A^n$ and $Q\mathcal A^n$, see also Lemma 8.20. For projections one has a slightly refined notion, namely one demands $P = U^*U$ and $Q = UU^*$. For the deformations we have now the following statement:

Lemma 9.6. Two deformed idempotents $\boldsymbol P$ and $\boldsymbol Q$ are equivalent if and only if their classical limits $P_0$ and $Q_0$ are equivalent.

As a consequence one immediately obtains the rigidity of the $K_0$-group under formal deformations, i.e. the classical limit map induces an isomorphism
\[ \mathrm{cl}_*: K_0(\boldsymbol{\mathcal A}) \xrightarrow{\ \cong\ } K_0(\mathcal A), \tag{9.5} \]
see [124]. One can also show that as $\mathsf C[[\lambda]]$-modules we have
\[ (P_0 M_n(\mathcal A) P_0)[[\lambda]] \cong \boldsymbol P \star M_n(\boldsymbol{\mathcal A}) \star \boldsymbol P \tag{9.6} \]
by an isomorphism with the identity as classical limit, when we view both spaces as submodules of $M_n(\mathcal A)[[\lambda]]$.

With these results we have the following picture: Given a Morita equivalence bimodule ${}_{\mathcal B}\mathcal E_{\mathcal A} \cong P_0\mathcal A^n$ we know $\mathcal B \cong P_0 M_n(\mathcal A) P_0$. Moreover, let a deformation $\star$ of $\mathcal A$ be given. Then any choice of a $\mathsf C[[\lambda]]$-isomorphism as in (9.6) gives an isomorphism
\[ \mathcal B[[\lambda]] \cong \boldsymbol P \star M_n(\boldsymbol{\mathcal A}) \star \boldsymbol P \tag{9.7} \]
as $\mathsf C[[\lambda]]$-submodules of $M_n(\mathcal A)[[\lambda]]$ inducing the identity in zeroth order. Since the right-hand side carries an algebra structure this induces a new associative multiplication $\star'$ for $\mathcal B[[\lambda]]$ which turns out to be a deformation of $\mathcal B$. Since everything is unique up to isomorphisms and since the isomorphisms can be adapted in such a
way that they induce the identity in zeroth order we find the following:

(i) $\star'$ is unique up to equivalence.
(ii) $(\mathcal B[[\lambda]], \star')$ is Morita equivalent to $(\mathcal A[[\lambda]], \star)$.

Furthermore, everything depends only on the isomorphism class of the equivalence bimodule and behaves nicely under tensor products. Denoting by
\[ \mathrm{Def}(\mathcal A) = \{\text{equivalence classes of formal associative deformations of } \mathcal A\} \tag{9.8} \]
the deformation theory of $\mathcal A$ we have an action
\[ \mathrm{Pic}(\mathcal B, \mathcal A) \times \mathrm{Def}(\mathcal A) \ni ([\mathcal E], [\star]) \mapsto [\star'] \in \mathrm{Def}(\mathcal B) \tag{9.9} \]
of the Picard Groupoid of the undeformed algebras on their deformation theories $\mathrm{Def}(\cdot)$ such that $\star'$ gives a Morita equivalent deformation to $\star$ if and only if $[\star']$ and $[\star]$ lie in the same orbit of the groupoid action (9.9), see [31].

Remark 9.7. With the deformation theories we just found another Morita invariant coming from an action of the Picard Groupoid, here in the ring-theoretic framework, see also [78].
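To see why formula (9.4) of this subsection produces an idempotent, one can argue as follows. The sketch assumes only that $\star$ is associative with unit $\mathbb{1}$ and that $\tfrac12$ is available in the ring of scalars; the abbreviations $d$ and $s$ are introduced here for illustration and are not notation from the text. All elements involved are $\star$-power series in $P_0$ and therefore $\star$-commute.

```latex
\begin{align*}
d &:= P_0 \star P_0 - P_0 = O(\lambda), \qquad
s := \frac{1}{\sqrt[\star]{\mathbb{1} + 4d}}
   = \sum_{k=0}^{\infty} (-1)^k \binom{2k}{k}\, d^{\star k},
   \qquad s \star s \star (\mathbb{1} + 4d) = \mathbb{1}, \\
\Bigl(P_0 - \tfrac12\Bigr) \star \Bigl(P_0 - \tfrac12\Bigr)
  &= P_0 \star P_0 - P_0 + \tfrac14
   = \tfrac14\,(\mathbb{1} + 4d), \\
\boldsymbol P \star \boldsymbol P
  &= \tfrac14 + \Bigl(P_0 - \tfrac12\Bigr) \star s
     + \tfrac14\,(\mathbb{1} + 4d) \star s \star s
   = \tfrac12 + \Bigl(P_0 - \tfrac12\Bigr) \star s
   = \boldsymbol P .
\end{align*}
```

If $P_0$ is already a $\star$-idempotent then $d = 0$ and $s = \mathbb{1}$, so the formula returns $P_0$; and since $d$ vanishes at $\lambda = 0$, the classical limit of $\boldsymbol P$ is $P_0$.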
9.3. Morita equivalent star products

Now we want to apply these general results to star product algebras. Thus we first have to identify the classical Picard Groupoid and then determine its action on the deformation theories by examining the projective modules. The first task has a well-known solution. The Picard Groupoid for algebras of smooth functions $C^\infty(M)$ is given as follows:
\[ \mathrm{Pic}(C^\infty(M), C^\infty(M')) = \emptyset \quad\text{for } M \not\cong M' \tag{9.10} \]
\[ \mathrm{Pic}(C^\infty(M)) = \mathrm{Diff}(M) \ltimes H^2(M, \mathbb Z) \tag{9.11} \]
Note that this determines $\mathrm{Pic}(\cdot, \cdot)$ completely. Here $\mathrm{Diff}(M) = \mathrm{Aut}(C^\infty(M))$ is the diffeomorphism group of $M$ which twists the bimodules in the usual way. The equivalence bimodules where $C^\infty(M)$ acts the same way from left and right are just the sections $\Gamma^\infty(L)$ of complex line bundles $L \to M$. It is well-known that they are classified by the second integral cohomology of $M$.

For the second step we have to deform the sections $\Gamma^\infty(L)$ such that $(C^\infty(M)[[\lambda]], \star)$ acts from the right on $\Gamma^\infty(L)[[\lambda]]$ while $(C^\infty(M)[[\lambda]], \star')$ acts from the left and both actions commute. We have to compute the characteristic class of $\star'$ in terms of the equivalence class of $\star$ and $[L] \in H^2(M, \mathbb Z)$. The result for the symplectic case has a very appealing formulation using the characteristic classes of
star products, see [36]:

Theorem 9.8. In the symplectic case we have
\[ c(\star') = c(\star) + 2\pi i\, c_1(L), \tag{9.12} \]
where
\[ c(\star) \in \frac{[\omega]}{i\lambda} + H^2_{\mathrm{dR}}(M, \mathbb C)[[\lambda]] \tag{9.13} \]
is the characteristic class of $\star$ and $c_1(L) \in H^2_{\mathrm{dR}}(M, \mathbb C)$ is the Chern class of $L$.

Remark 9.9 (Morita Equivalent Star Products). From this one immediately obtains the full classification of Morita equivalent star products in the symplectic case as we only have to re-implement the automorphisms from $\mathrm{Diff}(M)$. The final answer is therefore that $\star$ and $\star'$ are Morita equivalent symplectic star products on $(M, \omega)$ if and only if there exists a symplectomorphism $\psi$ such that
\[ \psi^* c(\star') - c(\star) \in 2\pi i\, H^2_{\mathrm{dR}}(M, \mathbb Z). \tag{9.14} \]

Remark 9.10. Similar results hold in the Poisson case where (9.12) has to be formally inverted to give the formal deformations of the Poisson tensor which classifies the star products according to the formality theorem, see [92] for a discussion.

Remark 9.11 (Dirac’s Monopole). From this point of view the results on the Dirac monopole as in Sec. 7.3 are much more transparent: Dirac’s quantization condition for the monopole charge of a magnetic monopole is precisely the integrality condition for the two-form $B$ to define a line bundle. But this is the condition for the Morita equivalence of the two quantizations. The Rieffel induction functor is then just the induction by the equivalence bimodule, see also [36].

The proof of the theorem consists in constructing local (even bidifferential) bimodule multiplications $\bullet$ and $\bullet'$ which allow one to construct a deformed version of the transition functions of the line bundle. Now these deformed transition functions obey a cocycle identity with respect to the star product $\star$. From this one can reconstruct the characteristic classes, or better, their difference $t(\star', \star) = c(\star') - c(\star)$ by using techniques from [82]. The other and easier option is to use a Fedosov like construction not only for the star product $\star$ but also for the whole bimodule structure $\bullet$, $\bullet'$ and $\star'$ directly, by using a connection $\nabla^L$ for $L$ in addition to the symplectic connection, see [133]. Then the characteristic classes can be trivially determined in the construction. Since every star product is equivalent to a Fedosov star product this is sufficient to deduce (9.12) in general.

Remark 9.12 (Deformed Vector Bundles). The other projective modules are precisely the sections of higher rank vector bundles, this is just the statement of the Serre–Swan Theorem, see e.g. [130]. From the previous section we already see that vector bundles can always be deformed, see also [32]. Deforming vector bundles into
bimodules in general is useful and interesting for its own sake. In physics this gives the playing ground for a geometric formulation of the so-called noncommutative field theories, see e.g. [32] as well as [3, 48, 89–91, 128] and references therein.

10. Outlook: What Comes Next?

To conclude this review let us just mention a few open questions, further developments and future projects arising from this discussion. Some of them are work in progress.

(i) To what extent can one apply these techniques to field theories, as e.g. the notions of strong Morita equivalence? In particular, it would be interesting to compare formal and strict deformation quantizations, see e.g. [53, 56–58, 60–62] for approaches to quantum field theories using star products.

(ii) The state space of formal star products is in many respects still not physically satisfying: it is much too big in order to allow physical interpretations for all positive functionals. Thus one should find criteria for positive functionals to describe physically relevant situations. In particular, how can one classify deformations of classical states? Which are the “minimal” ones? What is the relevance of mixed and pure states from the deformation point of view? Can one extend the baby versions of the Tomita–Takesaki theorems from [131]?

(iii) Deformed line bundles and more generally deformed vector bundles are the starting point for any geometric description of noncommutative field theories. Here one has still many open questions concerning e.g. the global aspects of these theories, the convergence of star products and bimodule structures or the formulation of gauge transformations.

(iv) Since symmetries play a fundamental role in physics one has to investigate the invariant states and their GNS representations in more detail. First steps in that direction can be found in [25]. Here one would like to understand also the role of coherent states and eigenstates.
It seems that on the purely algebraic level of formal star products one cannot get very far but one needs some convergence conditions. Thus the relation between the formal GNS construction and its convergent counterparts has to be explored. Since a $C^*$-algebraic theory is usually very difficult to obtain in a first step the whole situation is probably better formulated for some locally convex algebras. Here one can rely on the general results on $O^*$-algebras [127] but these techniques have still to be adapted to star products.

(v) What is the relevance of (strong) Morita equivalence from the physical point of view? In particular, can one interpret the Morita invariants in a more physical way, the way it was done for the Dirac monopole?

Acknowledgments

It is a pleasure to thank Didier Arnal, Giuseppe Dito and Paco Turrubiates for many discussions during my stay in Dijon as well as for encouraging me to write these
notes. I would also like to thank the University of Dijon for the warm hospitality. Moreover, I would like to thank Martin Bordemann for valuable discussions and careful reading of the manuscript. Finally, I would like to thank the referee for many useful suggestions and comments.

References

[1] P. Ara, Morita equivalence for rings with involution, Alg. Rep. Theo. 2 (1999) 227–247.
[2] P. Ara, Morita equivalence and Pedersen ideals, Proc. AMS 129(4) (2000) 1041–1049.
[3] D. Bahns, S. Doplicher, K. Fredenhagen and G. Piacitelli, Ultraviolet finite quantum field theory on quantum spacetime, Commun. Math. Phys. 237 (2003) 221–241.
[4] V. Bargmann, On a Hilbert space of analytic functions and an associated integral transform, Part I, Comm. Pure Appl. Math 14 (1961) 187–214.
[5] V. Bargmann, On a Hilbert space of analytic functions and an associated integral transform, Part II. A family of related function spaces application to distribution theory, Comm. Pure Appl. Math 20 (1967) 1–101.
[6] H. Basart, M. Flato, A. Lichnerowicz and D. Sternheimer, Deformation theory applied to quantization and statistical mechanics, Lett. Math. Phys. 8 (1984) 483–494.
[7] H. Basart and A. Lichnerowicz, Conformal symplectic geometry, deformations, rigidity and geometrical (KMS) conditions, Lett. Math. Phys. 10 (1985) 167–177.
[8] H. Bass, Algebraic K-Theory (W. A. Benjamin, Inc., New York, Amsterdam, 1968).
[9] F. Bayen, M. Flato, C. Frønsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization, Ann. Phys. 111 (1978) 61–151.
[10] W. Beer, On Morita equivalence of nuclear C∗-algebras, J. Pure Appl. Algebra 26(3) (1982) 249–267.
[11] J. Bénabou, Introduction to bicategories, in Reports of the Midwest Category Seminar (Springer-Verlag, 1967), pp. 1–77.
[12] F. A. Berezin, General concept of quantization, Commun. Math. Phys. 40 (1975) 153–174.
[13] F. A. Berezin, Quantization, Math. USSR Izvestija 8(5) (1975) 1109–1165.
[14] F. A. Berezin, Quantization in complex symmetric spaces, Math. USSR Izvestija 9(2) (1975) 341–379.
[15] M. Bertelson, M. Cahen and S. Gutt, Equivalence of star products, Class. Quant. Grav. 14 (1997) A93–A107.
[16] P. Bieliavsky, M. Bordemann, S. Gutt and S. Waldmann, Traces for star products on the dual of a Lie algebra, Rev. Math. Phys. 15(5) (2003) 425–445.
[17] M. Bordemann, The deformation quantization of certain super-Poisson brackets and BRST cohomology, in Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, eds. G. Dito and D. Sternheimer [54], pp. 45–68.
[18] M. Bordemann, (Bi)Modules, morphismes et réduction des star-produits: le cas symplectique, feuilletages et obstructions, preprint math.QA/0403334 (2004), 135.
[19] M. Bordemann, M. Brischle, C. Emmrich and S. Waldmann, Subalgebras with converging star products in deformation quantization: An algebraic construction for CP^n, J. Math. Phys. 37 (1996) 6311–6323.
[20] M. Bordemann, H.-C. Herbig and S. Waldmann, BRST cohomology and phase space reduction in deformation quantization, Commun. Math. Phys. 210 (2000) 107–144.
[21] M. Bordemann, E. Meinrenken and M. Schlichenmaier, Toeplitz quantization of Kähler manifolds and gl(N), N → ∞ limit, Commun. Math. Phys. 165 (1994) 281–296.
[22] M. Bordemann, N. Neumaier, C. Nowak and S. Waldmann, Deformation of Poisson brackets, Unpublished discussions on the quantization problem of general Poisson brackets, June 1997.
[23] M. Bordemann, N. Neumaier, M. J. Pflaum and S. Waldmann, On representations of star product algebras over cotangent spaces on Hermitian line bundles, J. Funct. Anal. 199 (2003) 1–47.
[24] M. Bordemann, N. Neumaier and S. Waldmann, Homogeneous Fedosov star products on cotangent bundles I: Weyl and standard ordering with differential operator representation, Commun. Math. Phys. 198 (1998) 363–396.
[25] M. Bordemann, N. Neumaier and S. Waldmann, Homogeneous Fedosov star products on cotangent bundles II: GNS representations, the WKB expansion, traces, and applications, J. Geom. Phys. 29 (1999) 199–234.
[26] M. Bordemann, H. Römer and S. Waldmann, A remark on formal KMS states in deformation quantization, Lett. Math. Phys. 45 (1998) 49–61.
[27] M. Bordemann and S. Waldmann, A Fedosov star product of Wick type for Kähler manifolds, Lett. Math. Phys. 41 (1997) 243–253.
[28] M. Bordemann and S. Waldmann, Formal GNS construction and WKB expansion in deformation quantization, in Deformation Theory and Symplectic Geometry, eds. D. Sternheimer, J. Rawnsley and S. Gutt [129], pp. 315–319.
[29] M. Bordemann and S. Waldmann, Formal GNS construction and states in deformation quantization, Commun. Math. Phys. 195 (1998) 549–583.
[30] O. Bratteli and D. W. Robinson, Operator algebras and quantum statistical mechanics I: C∗- and W∗-algebras. Symmetry groups. Decomposition of states, 2nd edn. (Springer-Verlag, New York, Heidelberg, Berlin, 1987).
[31] H. Bursztyn, Semiclassical geometry of quantum line bundles and Morita equivalence of star products, Int. Math. Res. Not. 2002(16) (2002) 821–846.
[32] H. Bursztyn and S. Waldmann, Deformation quantization of Hermitian vector bundles, Lett. Math. Phys. 53 (2000) 349–365.
[33] H. Bursztyn and S. Waldmann, On positive deformations of ∗-algebras, in Conférence Moshé Flato 1999. Quantization, Deformations and Symmetries, eds. G. Dito and D. Sternheimer [54], pp. 69–80.
[34] H. Bursztyn and S. Waldmann, ∗-Ideals and formal Morita equivalence of ∗-algebras, Int. J. Math. 12(5) (2001) 555–577.
[35] H. Bursztyn and S. Waldmann, Algebraic Rieffel induction, formal Morita equivalence and applications to deformation quantization, J. Geom. Phys. 37 (2001) 307–364.
[36] H. Bursztyn and S. Waldmann, The characteristic classes of Morita equivalent star products on symplectic manifolds, Commun. Math. Phys. 228 (2002) 103–121.
[37] H. Bursztyn and S. Waldmann, Completely positive inner products and strong Morita equivalence, preprint (FR-THEP 2003/12) math.QA/0309402 (September 2003), 36 pages, to appear in Pacific J. Math.
[38] H. Bursztyn and S. Waldmann, Bimodule deformations, Picard groups and contravariant connections, K-Theory 31 (2004) 1–37.
[39] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds I: Geometric interpretation of Berezin’s quantization, J. Geom. Phys. 7 (1990) 45–62.
[40] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. II, Trans. Am. Math. Soc. 337(1) (1993) 73–98.
[41] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. III, Lett. Math. Phys. 30 (1994) 291–305.
[42] M. Cahen, S. Gutt and J. Rawnsley, Quantization of Kähler manifolds. IV, Lett. Math. Phys. 34 (1995) 159–168.
[43] A. Cannas da Silva and A. Weinstein, Geometric models for noncommutative algebras, Berkeley Mathematics Lecture Notes (AMS, 1999).
[44] A. Cattaneo and G. Felder, A path integral approach to the Kontsevich quantization formula, Commun. Math. Phys. 212 (2000) 591–611.
[45] A. S. Cattaneo, G. Felder and L. Tomassini, Fedosov connections on jet bundles and deformation quantization, in Deformation Quantization, ed. G. Halbout [86], pp. 191–202.
[46] A. S. Cattaneo, G. Felder and L. Tomassini, From local to global deformation quantization of Poisson manifolds, Duke Math. J. 115(2) (2002) 329–352.
[47] A. Connes, Noncommutative Geometry (Academic Press, San Diego, New York, London, 1994).
[48] A. Connes, M. R. Douglas and A. Schwarz, Noncommutative geometry and matrix theory: compactification on tori, J. High Energy Phys. 02(003) (1998).
[49] A. Connes, M. Flato and D. Sternheimer, Closed star products and cyclic cohomology, Lett. Math. Phys. 24 (1992) 1–12.
[50] P. Deligne, Déformations de l’Algèbre des Fonctions d’une Variété Symplectique: Comparaison entre Fedosov et DeWilde, Lecomte, Sel. Math. New Series 1(4) (1995) 667–697.
[51] M. DeWilde and P. B. A. Lecomte, Existence of star-products and of formal deformations of the Poisson Lie algebra of arbitrary symplectic manifolds, Lett. Math. Phys. 7 (1983) 487–496.
[52] M. DeWilde and P. B. A. Lecomte, Formal deformations of the Poisson Lie algebra of a symplectic manifold and star-products. Existence, equivalence, derivations, in Deformation Theory of Algebras and Structures and Applications, eds. M. Hazewinkel and M. Gerstenhaber [87], pp. 897–960.
[53] G. Dito, Deformation quantization of covariant fields, in Deformation Quantization, ed. G. Halbout [86], pp. 55–66.
[54] G. Dito and D. Sternheimer (eds.), Conférence Moshé Flato 1999. Quantization, Deformations and Symmetries, Mathematical Physics Studies, No. 22 (Kluwer Academic Publishers, Dordrecht, Boston, London, 2000).
[55] G. Dito and D. Sternheimer, Deformation quantization: genesis, developments and metamorphoses, in Deformation Quantization, ed. G. Halbout [86], pp. 9–54.
[56] J. Dito, Star-product approach to quantum field theory: The free scalar field, Lett. Math. Phys. 20 (1990) 125–134.
[57] J. Dito, Star-products and nonstandard quantization for Klein-Gordon equation, J. Math. Phys. 33(2) (1992) 791–801.
[58] J. Dito, An example of cancellation of infinities in the star-quantization of fields, Lett. Math. Phys. 27 (1993) 73–80.
[59] V. Dolgushev, Covariant and equivariant formality theorems, Adv. Math. 191 (2005) 147–177.
[60] M. Dütsch and K. Fredenhagen, Algebraic quantum field theory, perturbation theory, and the loop expansion, Commun. Math. Phys. 219 (2001) 5–30.
[61] M. Dütsch and K. Fredenhagen, Perturbative algebraic field theory, and deformation quantization, Field Inst. Commun. 30 (2001) 151–160.
[62] M. Dütsch and K. Fredenhagen, The master Ward identity and generalized Schwinger-Dyson equation in classical field theory, Commun. Math. Phys. 243 (2003) 275–314.
[63] C. Emmrich and A. Weinstein, Geometry of the transport equation in multicomponent WKB approximations, Comm. Math. Phys. 176 (1996) 701–711.
[64] B. V. Fedosov, Formal quantization, Some topics of modern mathematics and their applications to problems of mathematical physics (Moscow, 1985) 129–136.
[65] B. V. Fedosov, Quantization and the index, Sov. Phys. Dokl. 31(11) (1986) 877–878.
[66] B. V. Fedosov, Index theorem in the algebra of quantum observables, Sov. Phys. Dokl. 34(4) (1989) 319–321.
[67] B. V. Fedosov, Reduction and eigenstates in deformation quantization, in Pseudodifferential Calculus and Mathematical Physics, Advances in Partial Differential Equations, Vol. 5, eds. M. Demuth, E. Schrohe and B.-W. Schulze (Akademie Verlag, Berlin, 1994), pp. 277–297.
[68] B. V. Fedosov, A simple geometrical construction of deformation quantization, J. Differential Geom. 40 (1994) 213–238.
[69] B. V. Fedosov, Deformation Quantization and Index Theory (Akademie Verlag, Berlin, 1996).
[70] B. V. Fedosov, Non-Abelian reduction in deformation quantization, Lett. Math. Phys. 43 (1998) 137–154.
[71] B. V. Fedosov, On the trace density in deformation quantization, in Deformation Quantization, ed. G. Halbout [86], pp. 67–83.
[72] G. Felder and B. Shoikhet, Deformation quantization with traces, Lett. Math. Phys. 53 (2000) 75–86.
[73] M. Gerstenhaber, Cohomology structure of an associative ring, Ann. Math. 78 (1963) 267–288.
[74] M. Gerstenhaber, On the deformation of rings and algebras, Ann. Math. 79 (1964) 59–103.
[75] M. Gerstenhaber, On the deformation of rings and algebras II, Ann. Math. 84 (1966) 1–19.
[76] M. Gerstenhaber, On the deformation of rings and algebras III, Ann. Math. 88 (1968) 1–34.
[77] M. Gerstenhaber, On the deformation of rings and algebras IV, Ann. Math. 99 (1974) 257–276.
[78] M. Gerstenhaber and S. D. Schack, Algebraic cohomology and deformation theory, in Deformation Theory of Algebras and Structures and Applications, eds. M. Hazewinkel and M. Gerstenhaber [87], pp. 13–264.
[79] H. J. Groenewold, On the principles of elementary quantum mechanics, Physica 12 (1946) 405–460.
[80] S. Gutt, An explicit ∗-product on the cotangent bundle of a Lie group, Lett. Math. Phys. 7 (1983) 249–258.
[81] S. Gutt, Variations on deformation quantization, in Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, Mathematical Physics Studies No. 21, eds. G. Dito and D. Sternheimer (Kluwer Academic Publishers, Dordrecht, Boston, London, 2000), pp. 217–254.
[82] S. Gutt and J. Rawnsley, Equivalence of star products on a symplectic manifold; an introduction to Deligne’s Čech cohomology classes, J. Geom. Phys. 29 (1999) 347–392.
[83] S. Gutt and J. Rawnsley, Traces for star products on symplectic manifolds, J. Geom. Phys. 42 (2002) 12–18.
[84] R. Haag, Local quantum physics (Springer-Verlag, Berlin, Heidelberg, New York, 2nd edn., 1993).
[85] R. Haag, N. M. Hugenholtz and M. Winnink, On the equilibrium states in quantum statistical mechanics, Commun. Math. Phys. 5 (1967) 215–236.
[86] G. Halbout (ed.), Deformation Quantization, IRMA Lectures in Mathematics and Theoretical Physics, Vol. 1 (Walter de Gruyter, Berlin, New York, 2002).
[87] M. Hazewinkel and M. Gerstenhaber (eds.), Deformation Theory of Algebras and Structures and Applications (Kluwer Academic Press, Dordrecht, 1988).
[88] N. Jacobson, Basic Algebra I (Freeman and Company, New York, 2nd edn., 1985).
[89] B. Jurčo, L. Möller, S. Schraml, P. Schupp and J. Wess, Construction of non-Abelian gauge theories on noncommutative spaces, Eur. Phys. J. C21 (2001) 383–388.
[90] B. Jurčo, P. Schupp and J. Wess, Noncommutative gauge theory for Poisson manifolds, Nuclear Phys. B584 (2000) 784–794.
[91] B. Jurčo, P. Schupp and J. Wess, Nonabelian noncommutative gauge theory via noncommutative extra dimensions, Nuclear Phys. B604 (2001) 148–180.
[92] B. Jurčo, P. Schupp and J. Wess, Noncommutative line bundles and Morita equivalence, Lett. Math. Phys. 61 (2002) 171–186.
[93] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras. Volume I: Elementary Theory, Graduate Studies in Mathematics, Vol. 15 (American Mathematical Society, Providence, 1997).
[94] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras. Volume II: Advanced Theory, Graduate Studies in Mathematics, Vol. 16 (American Mathematical Society, Providence, 1997).
[95] I. Kaplansky, Rings of Operators (W. A. Benjamin, Inc., New York, Amsterdam, 1968).
[96] A. V. Karabegov, Deformation quantization with separation of variables on a Kähler manifold, Commun. Math. Phys. 180 (1996) 745–755.
[97] A. V. Karabegov, On the canonical normalization of a trace density of deformation quantization, Lett. Math. Phys. 45 (1998) 217–228.
[98] A. V. Karabegov and M. Schlichenmaier, Identification of Berezin-Toeplitz deformation quantization, J. Reine Angew. Math. 540 (2001) 49–76.
[99] M. Kontsevich, Deformation quantization of Poisson manifolds, Lett. Math. Phys. 66 (2003) 157–216.
[100] M. Kontsevich, Formality conjecture, in Deformation Theory and Symplectic Geometry, eds. D. Sternheimer, J. Rawnsley and S. Gutt [129], pp. 139–156.
[101] M. Kontsevich, Operads and motives in deformation quantization, Lett. Math. Phys. 48 (1999) 35–72.
[102] M. Kontsevich, Deformation quantization of algebraic varieties, Lett. Math. Phys. 56 (2001) 271–294.
[103] R. Kubo, Statistical-mechanical theory of irreversible processes, I. General theory and simple applications to magnetic and conduction problems, J. Phys. Soc. Japan 12 (1957) 570–586.
[104] T. Y. Lam, Lectures on Modules and Rings, Graduate Texts in Mathematics, Vol. 189 (Springer-Verlag, Berlin, Heidelberg, New York, 1999).
[105] E. C. Lance, Hilbert C∗-modules. A Toolkit for Operator Algebraists, London Mathematical Society Lecture Note Series, Vol. 210 (Cambridge University Press, Cambridge, 1995).
[106] N. P. Landsman, Mathematical Topics between Classical and Quantum Mechanics, Springer Monographs in Mathematics (Springer-Verlag, Berlin, Heidelberg, New York, 1998).
[107] P. C. Martin and J. Schwinger, Theory of many-particle systems, I, Phys. Rev. 115 (1959) 1342–1373.
[108] K. Morita, Duality for modules and its applications to the theory of rings with minimum condition, Sci. Rep. Tokyo Kyoiku Daigaku Sect. A6 (1958) 83–142.
[109] J. E. Moyal, Quantum mechanics as a statistical theory, Proc. Camb. Phil. Soc. 45 (1949) 99–124.
[110] R. Nest and B. Tsygan, Algebraic index theorem, Commun. Math. Phys. 172 (1995) 223–262.
[111] R. Nest and B. Tsygan, Algebraic index theorem for families, Adv. Math. 113 (1995) 151–205.
[112] N. Neumaier, Sternprodukte auf Kotangentenbündeln und Ordnungs-Vorschriften, master thesis, Fakultät für Physik, Albert-Ludwigs-Universität, Freiburg (1998).
[113] N. Neumaier, Local ν-Euler derivations and Deligne’s characteristic class of Fedosov star products and star products of special type, Commun. Math. Phys. 230 (2002) 271–288.
[114] C. J. Nowak, Über Sternprodukte auf nichtregulären Poissonmannigfaltigkeiten, PhD thesis, Fakultät für Physik, Albert-Ludwigs-Universität, Freiburg (1997).
[115] H. Omori, Y. Maeda and A. Yoshioka, Weyl manifolds and deformation quantization, Adv. Math. 85 (1991) 224–255.
[116] J. P. Ortega and R. S. Ratiu, Momentum Maps and Hamiltonian Reduction, Progress in Mathematics, Vol. 222 (Birkhäuser, Boston, 2004).
[117] M. J. Pflaum, A deformation-theoretical approach to Weyl quantization on Riemannian manifolds, Lett. Math. Phys. 45 (1998) 277–294.
[118] M. J. Pflaum, A deformation-theoretic approach to normal order quantization, Russ. J. Math. Phys. 7 (2000) 82–113.
[119] I. Raeburn and D. P. Williams, Morita Equivalence and Continuous-Trace C∗-Algebras, Mathematical Surveys and Monographs, Vol. 60 (American Mathematical Society, Providence, RI, 1998).
[120] M. A. Rieffel, Induced representations of C∗-algebras, Adv. Math. 13 (1974) 176–257.
[121] M. A. Rieffel, Morita equivalence for C∗-algebras and W∗-algebras, J. Pure Appl. Algebra 5 (1974) 51–96.
[122] M. A. Rieffel, Deformation quantization of Heisenberg manifolds, Commun. Math. Phys. 122 (1989) 531–562.
[123] M. A. Rieffel, Deformation quantization for actions of R^d, Mem. Amer. Math. Soc. 106(506) (1993) 93 pages.
[124] J. Rosenberg, Rigidity of K-theory under deformation quantization, Preprint q-alg/9607021 (July 1996).
[125] S. Sakai, C∗-Algebras and W∗-Algebras, Ergebnisse der Mathematik und ihrer Grenzgebiete, Vol. 60 (Springer-Verlag, Berlin, Heidelberg, New York, 1971).
[126] M. Schlichenmaier, Deformation quantization of compact Kähler manifolds by Berezin-Toeplitz quantization, in Conférence Moshé Flato 1999. Quantization, Deformations, and Symmetries, eds. G. Dito and D. Sternheimer [54], pp. 289–306.
[127] K. Schmüdgen, Unbounded Operator Algebras and Representation Theory, Operator theory: Advances and applications, Vol. 37 (Birkhäuser Verlag, Basel, Boston, Berlin, 1990).
[128] N. Seiberg and E. Witten, String theory and noncommutative geometry, J. High Energy Phys. 09 (1999) 032.
[129] D. Sternheimer, J. Rawnsley and S. Gutt (eds.), Deformation Theory and Symplectic Geometry, Mathematical Physics Studies, No. 20 (Kluwer Academic Publisher, Dordrecht, Boston, London, 1997).
[130] R. G. Swan, Vector bundles and projective modules, Trans. Amer. Math. Soc. 105 (1962) 264–277.
[131] S. Waldmann, Locality in GNS representations of deformation quantization, Commun. Math. Phys. 210 (2000) 467–495.
[132] S. Waldmann, A remark on the deformation of GNS representations of ∗-algebras, Rep. Math. Phys. 48 (2001) 389–396.
[133] S. Waldmann, Morita equivalence of Fedosov star products and deformed Hermitian vector bundles, Lett. Math. Phys. 60 (2002) 157–170.
[134] S. Waldmann, On the representation theory of deformation quantization, in Deformation Quantization, ed. G. Halbout [86], pp. 107–133.
[135] S. Waldmann, Deformation quantization: Observable algebras, states and representation theory. Preprint (Freiburg FR-THEP 2003/04) hep-th/0303080 (March 2003). Lecture notes for the summer school in Kopaonik, 2002.
[136] S. Waldmann, Morita equivalence, Picard Groupoids and noncommutative field theories. Preprint (Freiburg FR-THEP 2003/06) math.QA/0304011 (April 2003). Contribution to the Proceedings of the Sendai Workshop 2002. To appear in U. Carow-Watamura, Y. Maeda and S. Watamura (eds.), Quantum Field Theory and Noncommutative Geometry, Lecture Notes in Physics 662 (2005).
[137] S. Waldmann, The Picard Groupoid in deformation quantization, Lett. Math. Phys. 69 (2004) 223–235.
[138] A. Weinstein, Deformation Quantization. Séminaire Bourbaki 46ème année 789 (1994).
[139] A. Weinstein, Poisson geometry, Diff. Geom. Appl. 9 (1998) 213–238.
[140] A. Weinstein and P. Xu, Hochschild cohomology and characteristic classes for star-products, in Geometry of Differential Equations. Dedicated to V. I. Arnold on the occasion of his 60th birthday, eds. A. Khovanskij, A. Varchenko and V. Vassiliev (American Mathematical Society, Providence, 1998), pp. 177–194.
[141] H. Weyl, The Theory of Groups and Quantum Mechanics (Dover, New York, 1931).
March 15, 2005 11:33 WSPC/148-RMP
J070-00231
Reviews in Mathematical Physics Vol. 17, No. 1 (2005) 77–112 © World Scientific Publishing Company
ON THE TOPOLOGY OF T-DUALITY
ULRICH BUNKE∗ and THOMAS SCHICK†
Mathematisches Institut, Universität Göttingen, Bunsenstr. 3-5, 37073 Göttingen, Germany
∗[email protected]
†[email protected]

Received 11 August 2004
Revised 21 December 2004

We study a topological version of the T-duality relation between pairs consisting of a principal U(1)-bundle equipped with a degree-three integral cohomology class. We describe the homotopy type of a classifying space for such pairs and show that it admits a selfmap which implements a T-duality transformation. We give a simple derivation of a T-duality isomorphism for certain twisted cohomology theories. We conclude with some explicit computations of twisted K-theory groups and discuss an example of iterated T-duality for higher-dimensional torus bundles.

Keywords: T-duality; twisted K-theory; axiomatic twisted cohomology theory.
Contents

1. Introduction
1.1 Summary
1.2 Description of the results
2. The Classifying Space of Pairs
2.1 Pairs and the classifying space
2.2 Duality of pairs
2.3 The topology of R
2.4 The T-transformation
3. T-duality in Twisted Cohomology Theories
3.1 Axioms of twisted cohomology
3.2 T-admissibility
3.3 T-duality isomorphisms
4. Examples
4.1 The computation of twisted K-theory for 3-manifolds
4.2 Line bundles over CP^r
4.3 An example where torsion plays a role
4.4 Iterated T-duality
1. Introduction

1.1. Summary

1.1.1. In this paper, we describe a new approach to topological T-duality for $U(1)$-principal bundles $E \to B$ ($E$ is the background space-time) equipped with degree-three cohomology classes $h \in H^3(E, \mathbb{Z})$ (the H-flux in the language of the physical literature).

1.1.2. We first define a T-duality relation between such pairs using a Thom class on an associated $S^3$-bundle. Then we introduce the functor $B \mapsto P(B)$ which associates to each space the set of isomorphism classes of pairs. We construct a classifying space $R$ of $P$ and characterize its homotopy type. It admits a homotopy class of selfmaps $T : R \to R$ which implements a natural T-duality transformation $P \to P$ of order two. This transformation maps a class of pairs $[E, h] \in P(B)$ to a canonical class $[\hat{E}, \hat{h}] \in P(B)$ of T-dual pairs. We conclude in particular that our definition of topological T-duality essentially coincides with previous definitions, based on integration of cohomology classes along the fibers.

1.1.3. We describe an axiomatic framework for a twisted generalized cohomology theory $h$. We further introduce the condition of T-admissibility. Examples of T-admissible theories are the usual twisted de Rham cohomology and twisted K-theory. For a T-admissible generalized twisted cohomology theory $h$ we prove a T-duality isomorphism between $h(E, c)$ and $h(\hat{E}, \hat{c})$, where $(E, c)$ and $(\hat{E}, \hat{c})$ are T-dual pairs.

1.1.4. We compute a number of examples. Iterating the construction of T-dual pairs, we can define duals of certain higher-dimensional torus bundles. We show that with our definition of duality the isomorphism type of the dual of a torus bundle, even if it exists, is not always uniquely determined.

1.2. Description of the results

1.2.1. In this paper we try to explain our understanding of the results of the recent paper [2] and parts of [3, 10] (Sec. 4.1) by means of elementary algebraic topology. The notion of T-duality originated in string theory. Instead of providing an elaborate historical account of T-duality here, we refer to the two papers above and the literature cited therein. In fact, the first paper which studies T-duality is, in some sense, [12]. We will explain the relation with the present paper later in this introduction.

1.2.2. However, a few motivating words on what this paper is about, and more importantly what it is not about, are in order.
T-duality first came up in physics in the following situation. The space $E$ appears as part of a “background space-time”. The cohomology class $h \in H^3(E, \mathbb{Z})$ describes the flux for a Neveu–Schwarz 3-form gauge potential $H$. In connection with T-duality, the case where $E$ admits a free $U(1)$-action, and thus has the structure of a principal $U(1)$-bundle, is of particular interest. The natural generalization is a space with a free action of a higher-dimensional torus. Then $E$ is a $U(1)^k$-principal bundle. In such a situation, for physical reasons, one expects to find a dual bundle with a dual flux (i.e. cohomology class), roughly by replacing each fiber by the dual fiber, the so-called T-dual. The T-dual should share many properties with the original bundle. In particular one expects that certain twisted cohomology groups are isomorphic.

In the physical situation the spaces come with geometry. When passing to the dual, the metric on the fiber should be replaced by the dual metric on the dual fiber. Much of the literature about T-duality and its relation to mirror symmetry has the geometry as a major ingredient and focuses on situations in which the dimensions of the fiber and the base coincide. One of the basic contributions in this context is [13].

1.2.3. In the present paper we will completely disregard the geometry and metrics. This also explains the title “topological T-duality”. We are only interested in the resulting topological type. Moreover, we adopt a mathematical definition of the T-duality relation by simply declaring certain cohomological properties which are expected for physical reasons. This approach works best for $U(1)$-bundles. So we will concentrate on those for most of the paper, with the exception of Sec. 4.4, where we study torus bundles by considering them as iterated $U(1)$-bundles.

1.2.4. In the present paper we study T-duality for principal $U(1)$-bundles equipped with an integral cohomology class of degree three. We will call such data a pair (Definition 2.1). We first introduce T-duality as a relation between pairs (Definition 2.9) (in particular, a given pair can have several T-dual pairs). The paper [2] works in almost exactly the same setting: it also starts with a pair and defines what a dual pair is (via a construction which involves some choices, so again it is not unique). This definition unfortunately is not very precise, since torsion in the cohomology is neglected. In Sec. 4.3 we show by an example that it is necessary to take the torsion into account if one studies, e.g., the T-duality isomorphism for twisted K-theory.

1.2.5. At first glance our definition of T-duality, which is based on a Thom class on an auxiliary 3-sphere bundle, looks quite different from the definition given in [2], which relied on integration over the fiber. The link between the two definitions is provided by an explicit universal example over a universal base space $R$ for our definition of T-duality. Using some non-trivial calculations in this universal
example we will obtain a complete characterization of the T-dual (according to our definition) by topological invariants, which contains in particular the same kind of integration over the fiber as in the older notions of T-duality. We will show that, up to passing to real cohomology, the T-duality of [2] is characterized by the same topological invariants. Therefore, we can eventually conclude that our definition is essentially equivalent to the one used there (see 2.2.6, 2.2.7). Later in the present paper we will understand T-duality as a map which associates to an isomorphism class of pairs a canonical dual isomorphism class of pairs in a two-periodic manner. This in particular reproves the result of [2] that each pair admits a T-dual.

1.2.6. A third definition of T-duality is given in [12] (compare also [10], 4.1) or in [2]. In [12], the main object is a continuous trace algebra $A$ with an $\mathbb{R}$-action such that its spectrum $X(A)$ is a free $U(1) \cong \mathbb{R}/\mathbb{Z}$-space. To $A$ we can associate a pair $(X(A), h(A))$ consisting of the $U(1)$-bundle $X(A) \to X(A)/U(1)$ and the Dixmier–Douady class $h(A) \in H^3(X(A), \mathbb{Z})$. Vice versa, each pair can be realized in this way. With an appropriate notion of Morita equivalence we have a bijection between equivalence classes of such algebras and isomorphism classes of pairs. In [12] it is shown that the cross product $\hat{A} := A \rtimes \mathbb{R}$ is again a continuous trace algebra with $\mathbb{R}$-action (the latter $\mathbb{R}$ is in fact the dual group of $\mathbb{R}$) of the same type as above. It follows from the comparison of the topological invariants of the pairs $(X(\hat{A}), h(\hat{A}))$ and the dual pair $(\widehat{X(A)}, \widehat{h(A)})$, and from the naturality of the constructions with respect to the change of the base spaces, that our notion of T-duality of pairs indeed corresponds to the cross product in [12]. It is well known that $\hat{\hat{A}}$ is Morita equivalent to $A$. This fact is reflected in our picture by the result that T-duality is two-periodic.
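The two-periodicity mentioned above can be made explicit on the operator-algebraic side. The following is a short reminder of Takai duality in standard notation, added here only as an illustration; the symbols $\alpha$, $\hat{\alpha}$ and $\mathcal{K}$ do not appear in the text above and are assumptions of this sketch.

```latex
% Takai duality for an action \alpha of \mathbb{R} on a C^*-algebra A,
% with dual action \hat{\alpha} of \hat{\mathbb{R}} \cong \mathbb{R}
% on the crossed product A \rtimes_\alpha \mathbb{R}:
\[
  (A \rtimes_{\alpha} \mathbb{R}) \rtimes_{\hat{\alpha}} \hat{\mathbb{R}}
  \;\cong\; A \otimes \mathcal{K}\bigl(L^2(\mathbb{R})\bigr).
\]
% \mathcal{K} denotes the compact operators. Tensoring with \mathcal{K}
% does not change the Morita equivalence class, so the double crossed
% product is Morita equivalent to A.
```

In this language, applying the crossed product twice returns the original algebra up to Morita equivalence, which is the algebraic counterpart of a duality of order two on isomorphism classes of pairs.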
1.2.7. Given a base space $B$, we study the set $P(B)$ of isomorphism classes of pairs $(E, h)$ over $B$, where $E \to B$ is a $U(1)$-principal bundle and $h$ is a class $h \in H^3(E, \mathbb{Z})$. It turns out that the contravariant set-valued functor $B \mapsto P(B)$ can be represented by a space $R$, the classifying space of pairs. T-duality can then be considered as a natural transformation $T : P \to P$ of functors, and it is represented by a homotopy class of maps $T : R \to R$.

1.2.8. Our first main result (Theorem 2.17) is the characterization of the homotopy type of $R$ as the homotopy fibration
$$K(\mathbb{Z}, 3) \to R \to K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)$$
which is classified by $\mathrm{pr}_1^* c \cup \mathrm{pr}_2^* c \in H^4(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2), \mathbb{Z})$.
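As a quick consistency check on this description, one can read off the homotopy groups of $R$ from the long exact sequence of the fibration. The computation below is a sketch added for illustration; it uses only the fact that $K(\mathbb{Z}, n)$ has a single non-vanishing homotopy group, namely $\mathbb{Z}$ in degree $n$.

```latex
% Long exact sequence of the fibration K(Z,3) -> R -> K(Z,2) x K(Z,2):
\[
  \cdots \to \pi_k\bigl(K(\mathbb{Z},3)\bigr) \to \pi_k(R)
  \to \pi_k\bigl(K(\mathbb{Z},2) \times K(\mathbb{Z},2)\bigr)
  \to \pi_{k-1}\bigl(K(\mathbb{Z},3)\bigr) \to \cdots
\]
% k = 2:  0 -> \pi_2(R) -> Z (+) Z -> 0   (fiber has no \pi_2 or \pi_1)
% k = 3:  0 -> Z -> \pi_3(R) -> 0         (base has no \pi_4 or \pi_3)
\[
  \pi_2(R) \cong \mathbb{Z} \oplus \mathbb{Z}, \qquad
  \pi_3(R) \cong \mathbb{Z}, \qquad
  \pi_k(R) = 0 \quad (k \geq 1,\ k \neq 2, 3).
\]
```

The homotopy groups alone do not see the class $\mathrm{pr}_1^* c \cup \mathrm{pr}_2^* c$; it records how the two layers are glued together.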
Here $K(\mathbb{Z}, n)$ is the Eilenberg–MacLane space, i.e., it is characterized by the property that $\pi_k(K(\mathbb{Z}, n)) = 0$ if $k \neq n$ and $\pi_n(K(\mathbb{Z}, n)) = \mathbb{Z} = H^n(K(\mathbb{Z}, n), \mathbb{Z})$. In particular, we can choose $K(\mathbb{Z}, 2) = \mathbb{CP}^\infty$. The class $c \in H^2(K(\mathbb{Z}, 2), \mathbb{Z})$ is the canonical generator. How fibrations are classified is recalled in 2.3.2.

1.2.9. The space $R$ carries a universal pair, and the map $T$ will of course represent the universal dual pair (Definition 2.27). The classifying space $R$ in fact already appears in [12, proof of Theorem 4.12]. It is used there in order to simplify the verification of the relation of topological invariants which corresponds to the assertion of Lemma 2.33.

1.2.10. As observed in many places, T-duality comes with isomorphisms in certain twisted generalized cohomology theories. In fact, the calculation of such twisted generalized cohomology groups in terms of the (perhaps easier to understand) generalized cohomology groups of the dual is one (topological) motivation for the study of T-duality. If $(E, h)$ and $(\hat{E}, \hat{h})$ are pairs over $M$, and in particular $E$ and $\hat{E}$ are principal $U(1)$-bundles over $M$, which are dual to each other, then (as shown, e.g., in [2]) there is an isomorphism (of degree $-1$) in twisted complex K-theory $K^*(E, h) \cong K^{*-1}(\hat{E}, \hat{h})$ or of real twisted cohomology $H(E, \mathbb{R}, h) \cong H(\hat{E}, \mathbb{R}, \hat{h})$. These isomorphisms are implemented by explicit T-duality transformations (Definition 3.12) which are constructed out of the diagram

$$
\begin{array}{ccccc}
 & & E \times_B \hat{E} & & \\
 & {\scriptstyle p}\swarrow & & \searrow{\scriptstyle \hat{p}} & \\
E & & \downarrow{\scriptstyle q} & & \hat{E} \\
 & {\scriptstyle \pi}\searrow & & \swarrow{\scriptstyle \hat{\pi}} & \\
 & & B & &
\end{array}
\tag{1.1}
$$

using standard operations in twisted cohomology (like pull-back and integration over the fiber).^a

^a In the $C^*$-algebraic context of [12, 10] the T-duality isomorphism is given by Connes’ Thom isomorphism for crossed products with $\mathbb{R}$.

1.2.11. We say that a twisted generalized cohomology theory is T-admissible if the T-duality transformation is an isomorphism in the special case of the pair $(U(1) \to *, 0)$. Our second main result is the observation (Theorem 3.13) that the T-duality transformation for a T-admissible twisted generalized cohomology theory is an isomorphism, and that this fact is an easy consequence of the Mayer–Vietoris principle.

1.2.12. In order to produce a precise statement we fix the axioms for a twisted generalized cohomology theory in Sec. 3.1. In doing so we add some precision to
U. Bunke & T. Schick
the statements in [2], in particular to the observation that the Chern character preserves the $T$-duality transformation ([2], 1.14). The main point is that the cohomology class $h \in H^3(E, \mathbb{Z})$ only determines the isomorphism class of a twist, and so the isomorphism class of $K(E, h)$ or $H(E, \mathbb{R}, h)$ as an abstract group. In order to be able to say that the Chern character is a transformation between twisted cohomology theories, one must use the same explicit objects to twist $K$-theory as one uses to twist real cohomology. In order to twist complex $K$-theory one usually considers a principal $PU$-bundle (but not a three-form as in [2]). More details on twisted $K$-theory can be found in [1]. On the other hand, three-forms are usually used to twist real (de Rham) cohomology. We do not know any natural way to relate these two kinds of twists (but see the proof of Proposition 3.5 in [7], which perhaps solves this problem). In a previous paper [5] we have constructed versions of twisted $K$-theory and twisted real cohomology where the twists in both cases are Hitchin gerbes. For these versions of twisted cohomology theories the Chern character is indeed a natural transformation and is preserved by $T$-duality. Since this gives a framework to work simultaneously with twisted $K$-theory and twisted cohomology, we propose to use Hitchin gerbes in this context. In the paper, however, we simply assume that the twists $H$ and the twisted generalized cohomology theory $h$ satisfy certain natural axioms, and then we go on to prove a natural $T$-duality isomorphism $h(E, H) \xrightarrow{\cong} h(\hat E, \hat H)$ for any theory which satisfies these axioms and for dual pairs $(E, H)$ and $(\hat E, \hat H)$.

1.2.13. For the purpose of illustration we perform some calculations of twisted $K$-theory. For three-manifolds we obtain a complete answer in Sec. 4.1 (compare with the partial results of [11]). We demonstrate the $T$-duality isomorphism in twisted $K$-theory for $U(1)$-principal bundles over surfaces by explicit calculation.
1.2.14. It is a natural question whether $T$-duality can be generalized to principal $U(1)^k$-bundles for $k > 1$. As observed in [3] and [10], not every $U(1)^k$-principal bundle has a $T$-dual in the classical sense. Note the remarkable observation in [10, Theorem 4.4.2] that in general the $T$-dual of a $U(1)^2$-principal bundle equipped with a three-dimensional integral cohomology class is a bundle of non-commutative tori. In the present paper we discuss the approach of defining a $T$-dual of a higher-dimensional principal torus bundle as an iterated $T$-dual of $U(1)$-principal bundles. We demonstrate by an example that this approach does not lead to a unique result.

1.2.15. A $U(1)$-principal bundle $E \to B$ is essentially the same object as the free $U(1)$-space $E$. In a continuation [6] of the present paper we discuss a generalization of $T$-duality to the case of $U(1)$-spaces where $U(1)$ acts with at most finite stabilizers. This seems to be of relevance for applications to physics.
2. The Classifying Space of Pairs

2.1. Pairs and the classifying space

2.1.1. Let $B$ be a topological space.

Definition 2.1. A pair $(E, h)$ over $B$ consists of a $U(1)$-principal bundle $\pi: E \to B$ and a class $h \in H^3(E, \mathbb{Z})$.

2.1.2. If $f: A \to B$ is a continuous map, then we can form the functorial pull-back $f^*(E, h) = (f^*E, F^*h)$, where $F$ is defined by the pull-back
\[
\begin{array}{ccc}
f^*E & \xrightarrow{\;F\;} & E \\
\downarrow & & \downarrow \\
A & \xrightarrow{\;f\;} & B
\end{array}
\]

2.1.3. We say that two pairs are isomorphic (written as $(E_0, h_0) \cong (E_1, h_1)$) if there exists an isomorphism of $U(1)$-principal bundles
\[
\begin{array}{ccc}
E_0 & \xrightarrow{\;F\;} & E_1 \\
\downarrow & & \downarrow \\
B & = & B
\end{array}
\]
such that $F^*h_1 = h_0$.

2.1.4. Let $(E_i, h_i)$, $i = 0, 1$, be pairs over $B$. We say that they are homotopic (written as $(E_0, h_0) \sim (E_1, h_1)$) if there exists a pair $(\tilde E, \tilde h)$ over $I \times B$ such that $f_i^*(\tilde E, \tilde h) = (E_i, h_i)$, $i = 0, 1$, where $f_i: B \to I \times B$ is given by $b \mapsto (i, b)$. Note that we insist here on equality; for later purposes it is not sufficient to have only an isomorphism.

2.1.5. Lemma 2.2. On pairs, the relations "homotopy equivalence" $\sim$ and "isomorphism" $\cong$ coincide.

Proof. Let $(E_0, h_0)$ and $(E_1, h_1)$ be homotopic via $(\bar E, \bar h)$. Then there is an isomorphism $\bar E \to E_0 \times [0, 1]$. Using this, we immediately get an isomorphism $F: f_0^*(\bar E, \bar h) \cong f_1^*(\bar E, \bar h)$. Conversely, if $(E_0, h_0)$ and $(E_1, h_1)$ are isomorphic via an isomorphism $F$, we construct the homotopy $\bar E := E_0 \times [0, 1/2] \cup_{F \times \mathrm{id}_{\{1/2\}}} E_1 \times [1/2, 1]$, with $\bar h$ obtained (uniquely) using the Mayer–Vietoris sequence for the cohomology of $\bar E$. We take the freedom to use canonical isomorphisms between $E_k \times \{k\}$ and $E_k$, $k = 0, 1$.

2.1.6. Definition 2.3. By $K(\mathbb{Z}, n)$ we denote the Eilenberg–MacLane space characterized (up to homotopy equivalence) by its homotopy groups $\pi_k(K(\mathbb{Z}, n)) = 0$ if $k \neq n$, $\pi_n(K(\mathbb{Z}, n)) = \mathbb{Z}$. Recall that for an arbitrary space $X$ the cohomology
with $\mathbb{Z}$-coefficients $H^n(X, \mathbb{Z})$ can be identified with homotopy classes of maps from $X$ to $K(\mathbb{Z}, n)$ (denoted by $[X, K(\mathbb{Z}, n)]$), a fact we are going to use frequently. As a model for $K(\mathbb{Z}, 1)$ we choose $U(1)$. As a model for $K(\mathbb{Z}, 2)$ we can choose $\mathbb{CP}^\infty$. Let $q: U \to K(\mathbb{Z}, 2)$ be the universal $U(1)$-principal bundle. If we choose $K(\mathbb{Z}, 2) = \mathbb{CP}^\infty$, we can choose $U := S(\mathbb{C}^\infty)$, the unit sphere in $\mathbb{C}^\infty = \bigcup_{n \in \mathbb{N}} \mathbb{C}^n$, and $q$ factors out the canonical $U(1)$-action on $\mathbb{C}^\infty$. Furthermore, let $LK(\mathbb{Z}, 3)$ be the free loop space of $K(\mathbb{Z}, 3)$. This space admits an action of $U(1)$ by $(u\gamma)(t) := \gamma(u^{-1}t)$ for $\gamma \in LK(\mathbb{Z}, 3)$ and $u, t \in U(1)$.

Definition 2.4. We define the space $R$ as the total space of the associated bundle
\[
\mathbf{c}: \bigl(R := U \times_{U(1)} LK(\mathbb{Z}, 3)\bigr) \to K(\mathbb{Z}, 2).
\]
Note that $R$ is well-defined up to homotopy equivalence. We consider $\mathbf{c}$ also as a cohomology class $\mathbf{c} \in H^2(R, \mathbb{Z})$.

2.1.7. Over $R$ we have the $U(1)$-principal bundle $\pi: (\mathbf{E} := \mathbf{c}^*U) \to R$ with first Chern class $\mathbf{c} \in H^2(R, \mathbb{Z})$. Furthermore, we have a canonical map
\[
\mathbf{h}: \mathbf{E} \to K(\mathbb{Z}, 3); \qquad \mathbf{h}(u, [v, \gamma]) := \gamma(t),
\]
where $u, v \in U$, $\gamma \in LK(\mathbb{Z}, 3)$ and $t \in U(1)$ satisfy $q(u) = q(v) = \mathbf{c}([v, \gamma])$ and $tv = u$. Note that this is well-defined, independent of the choice of the representative of the class $[v, \gamma] \in R$. We consider this map also as a cohomology class $\mathbf{h} \in H^3(\mathbf{E}, \mathbb{Z})$. In this way we get a pair $(\mathbf{E}, \mathbf{h})$ over $R$.

Definition 2.5. We call this pair $(\mathbf{E}, \mathbf{h})$ the universal pair.

2.1.8. We define the contravariant functor $P$ from the category of topological spaces to the category of sets which associates to the space $B$ the set $P(B)$ of isomorphism classes of pairs and to the map $f: A \to B$ the pull-back $f^*: P(B) \to P(A)$.

Proposition 2.6. The space $R$ is a classifying space for $P$. In fact, we have an isomorphism of functors $\Psi_{\ldots}: [\ldots, R] \to P(\ldots)$ given by $\Psi_B([f]) := [f^*(\mathbf{E}, \mathbf{h})]$ for each homotopy class of maps $[f] \in [B, R]$ and each CW-complex $B$.

Proof. It follows immediately from Lemma 2.2 that the functor $P$ is homotopy invariant. Therefore $\Psi_{\ldots}$ is a well-defined natural transformation. Let $[E, h] \in P(B)$ be given. Up to isomorphism, we can assume that we have a pull-back diagram of $U(1)$-principal bundles
\[
\begin{array}{ccc}
E & \xrightarrow{\;C\;} & U \\
\downarrow & & \downarrow \\
B & \xrightarrow{\;c\;} & K(\mathbb{Z}, 2)
\end{array}
\]
We represent the class $h$ by a map $h: E \to K(\mathbb{Z}, 3)$. We construct a lift $f: B \to R$ of $c$ as follows. For $b \in B$ choose $e \in E_b$. Then we set $f(b) := [C(e), \gamma] \in R$ with $\gamma(t) = h(te)$ for all $t \in U(1)$. Observe that $f(b)$ is independent of the choice of $e$. If $F: E \to \mathbf{E}$ is the $U(1)$-bundle map covering $f$, then $F(e) = (C(e), [C(e), \gamma])$ with $e$ and $\gamma$ as above. Therefore, $\mathbf{h} \circ F = h$ and we have $f^*(\mathbf{E}, \mathbf{h}) \cong (E, h)$. This shows that $\Psi_B$ is surjective.

Let now $\Psi_B([f_0]) = \Psi_B([f_1])$. Using Lemma 2.2, we choose a homotopy $(\bar E, \bar h)$ over $B \times [0, 1]$ between $f_0^*(\mathbf{E}, \mathbf{h})$ and $f_1^*(\mathbf{E}, \mathbf{h})$. The construction used for the surjectivity part provides us with a map $\bar f: B \times [0, 1] \to R$ such that $\bar f^*(\mathbf{E}, \mathbf{h}) = (\bar E, \bar h)$. To achieve equality, we have to choose $\bar E$ in such a way that $\bar E = \bar c^*U$ for an appropriate map $\bar c: B \times [0, 1] \to K(\mathbb{Z}, 2)$ (without changing the bundle at the boundary, i.e., such that $\bar c_k = \mathbf{c} \circ f_k$). This is possible since $K(\mathbb{Z}, 2)$ is a classifying space for principal $U(1)$-bundles. The construction has the property that $\bar f_k = f_k$; therefore $\bar f$ is a homotopy between $f_0$ and $f_1$, proving that $\Psi_B$ is injective.

2.2. Duality of pairs

2.2.1. Let $\pi: E \to B$ and $\hat\pi: \hat E \to B$ be two $U(1)$-principal bundles. Let $\pi: (L := E \times_{U(1)} \mathbb{C}) \to B$ and $\hat\pi: (\hat L := \hat E \times_{U(1)} \mathbb{C}) \to B$ be the associated complex hermitian line bundles. We can consider $E$ and $\hat E$ as unit sphere bundles in $L$ and $\hat L$. We form the complex vector bundle $r: (V := L \oplus \hat L) \to B$ and let $r: S(V) \to B$ be the unit sphere bundle, the fibers consisting of three-dimensional spheres. $V$ being a complex vector bundle, the map $r$ is oriented. In particular, we have an integration map $r_!: H^3(S(V), \mathbb{Z}) \to H^0(B, \mathbb{Z})$ (in de Rham cohomology the corresponding map is really given by integration over the fiber). Let $1_B$ denote the unit in the ring $H(B, \mathbb{Z})$.

Definition 2.7. A Thom class for $S(V)$ is a class $\mathrm{Th} \in H^3(S(V), \mathbb{Z})$ such that $r_!(\mathrm{Th}) = 1_B$.

If $S(V)$ admits a Thom class, then by the Leray–Hirsch theorem its cohomology is a free $H(B, \mathbb{Z})$-module generated by $1_{S(V)}$ and $\mathrm{Th}$. Thom classes in general are not unique. In fact, $\mathrm{Th}'$ is a second Thom class if and only if $\mathrm{Th} - \mathrm{Th}' = r^*d$ for some $d \in H^3(B, \mathbb{Z})$.

2.2.2. Let $c, \hat c \in H^2(B, \mathbb{Z})$ denote the Chern classes of $E$ and $\hat E$. The product $\chi(V) := c \cup \hat c \in H^4(B, \mathbb{Z})$ is the Euler class of $V$.

Lemma 2.8. The bundle $S(V)$ admits a Thom class if and only if $\chi(V) = 0$.
Proof. This follows from the Gysin sequence for $S(V)$. For this question the important segment is
\[
\to H^3(B, \mathbb{Z}) \xrightarrow{\;r^*\;} H^3(S(V), \mathbb{Z}) \xrightarrow{\;r_!\;} H^0(B, \mathbb{Z}) \xrightarrow{\;\chi(V)\;} H^4(B, \mathbb{Z}) \to .
\]

2.2.3. We now consider two pairs $(E, h)$ and $(\hat E, \hat h)$. Let $i: E \to S(V)$ and $\hat i: \hat E \to S(V)$ denote the inclusions of the $S^1$-bundles into the $S^3$-bundle.

Definition 2.9. We say that $(E, h)$ and $(\hat E, \hat h)$ are dual to each other if there exists a Thom class $\mathrm{Th}$ for $S(V)$ such that $h = i^*\mathrm{Th}$ and $\hat h = \hat i^*\mathrm{Th}$.

2.2.4. Let $\pi: E \to B$ and $\hat\pi: \hat E \to B$ be given $U(1)$-principal bundles with first Chern classes $c$ and $\hat c$. Then Lemma 2.8 has the following consequence.

Corollary 2.10. There exist $h \in H^3(E, \mathbb{Z})$ and $\hat h \in H^3(\hat E, \mathbb{Z})$ such that $(E, h)$ and $(\hat E, \hat h)$ form a dual pair if and only if $c \cup \hat c = 0$. If such a dual pair exists, then any other has the form $(E, h + \pi^*b)$ and $(\hat E, \hat h + \hat\pi^*b)$ for some $b \in H^3(B, \mathbb{Z})$.

2.2.5. Let $(E, h)$ and $(\hat E, \hat h)$ be a dual pair. We consider the following part of the Gysin sequence for $E$:
\[
\to H^1(B, \mathbb{Z}) \xrightarrow{\;c\;} H^3(B, \mathbb{Z}) \xrightarrow{\;\pi^*\;} H^3(E, \mathbb{Z}) \to .
\]
We observe the following consequence of Corollary 2.10.

Corollary 2.11. If $(E, h)$ is dual to $(\hat E, \hat h)$ and also to $(\hat E, \hat h')$, then we have $\hat h' - \hat h = \hat\pi^*(c \cup a)$ for some $a \in H^1(B, \mathbb{Z})$.

Lemma 2.12. If $(E, h)$ is dual to $(\hat E, \hat h)$, then $c = -\hat\pi_!(\hat h)$ and $\hat c = -\pi_!(h)$.

Proof. We defer the proof to Lemma 2.33. It follows from the calculation of the cohomology in the universal situation.

Lemma 2.13. Let $(E, h)$ be dual to $(\hat E, \hat h)$. Consider the fiber product
\[
\begin{array}{ccccc}
 & & E \times_B \hat E & & \\
 & \stackrel{p}{\swarrow} & & \stackrel{\hat p}{\searrow} & \\
E & & \downarrow q & & \hat E \\
 & \stackrel{\pi}{\searrow} & & \stackrel{\hat\pi}{\swarrow} & \\
 & & B & &
\end{array}
\tag{2.14}
\]
Then $p^*h = \hat p^*\hat h$.

Proof. This is the parameterized version of the situation considered later in 3.2.1. In particular, we have a homotopy $h: I \times E \times_B \hat E \to S(V)$ from $i \circ p$ to $\hat i \circ \hat p$, where $i: E \to S(V)$ and $\hat i: \hat E \to S(V)$ are the canonical inclusions into the sphere bundle of the complex vector bundle $V$ associated to $E$ and $\hat E$. Then $p^*h = p^*i^*\mathrm{Th} = \hat p^*\hat i^*\mathrm{Th} = \hat p^*\hat h$.
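As an illustration of the relations stated in Lemma 2.12, the following worked example (not part of the original argument) considers the base $B = S^2$. It is a sketch under stated assumptions: the integers $j, k$ are hypothetical parameters, and the generators $z \in H^2(S^2, \mathbb{Z})$, $o_E \in H^3(E, \mathbb{Z})$ and $o_{\hat E} \in H^3(\hat E, \mathbb{Z})$ are assumed to be chosen so that $\pi_!(o_E) = z = \hat\pi_!(o_{\hat E})$; the signs below depend on these orientation choices.

```latex
% Worked example over B = S^2 (illustrative; orientation conventions are assumptions).
% Since H^3(S^2) = 0 = H^4(S^2), the Gysin sequence of \pi : E \to S^2 gives
\[
0 \to H^3(E,\mathbb{Z}) \xrightarrow{\;\pi_!\;} H^2(S^2,\mathbb{Z}) \to 0 ,
\]
% so every pair over S^2 is described by two integers (j,k):
\[
c = j\,z, \qquad h = k\,o_E, \qquad \pi_!(h) = k\,z .
\]
% The obstruction c \cup \hat c lies in H^4(S^2,\mathbb{Z}) = 0, so a dual pair exists
% (Corollary 2.10).  The relations of Lemma 2.12 then force
\[
\hat c = -\pi_!(h) = -k\,z, \qquad \hat\pi_!(\hat h) = -c = -j\,z
\quad\Longrightarrow\quad \hat h = -j\,o_{\hat E} .
\]
% In terms of the integer parameters, duality acts as (j,k) \mapsto (-k,-j),
% and applying it twice returns (j,k).
```

Because $H^3(S^2, \mathbb{Z}) = 0$, Corollary 2.10 leaves no freedom in the choice of $\hat h$ in this example.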
2.2.6. We are now in a position to compare our definition of $T$-duality with the definition used in [2, Sec. 3.1]. When interpreted in cohomological terms instead of using the language of differential forms, [2] constructs from a given pair $(E, h_\mathbb{R})$ (where $h_\mathbb{R} \in \mathrm{im}(H^3(E, \mathbb{Z}) \to H^3(E, \mathbb{R}))$ is a real cohomology class with integral periods) another pair $(\hat E, \hat h_\mathbb{R})$, again with $\hat h_\mathbb{R} \in H^3(\hat E, \mathbb{R})$. Let $c$ be the first Chern class of $E$ and use the notation of (2.14). By $c_\mathbb{R}$ we denote the image of $c$ in $H^2(B, \mathbb{R})$. The construction in [2] depends on a few choices, in particular the choices of connections. An integral lift $h \in H^3(E, \mathbb{Z})$ of $h_\mathbb{R}$ uniquely determines the isomorphism class of the $U(1)$-principal bundle $\hat E$ with Chern class $\hat c := \pi_!(h)$. The cohomology class $\hat h_\mathbb{R}$ is then determined up to addition of a class of the form $\hat\pi^*(c_\mathbb{R} \cup b)$ with some $b \in H^1(B, \mathbb{R})$.

In [2, Sec. 3.1] it is shown that $\pi_*(h_\mathbb{R}) = \hat c_\mathbb{R}$ and $\hat\pi_*(\hat h_\mathbb{R}) = c_\mathbb{R}$. These formulas differ from those of Lemma 2.12 by some signs. The reason is that in [2] the dual bundle is considered with the opposite $U(1)$-action. In [2] it is also shown that $p^*h_\mathbb{R} = \hat p^*\hat h_\mathbb{R}$.

We will now prove that, up to addition of classes of the form $\hat\pi^*(c_\mathbb{R} \cup b)$ for $b \in H^1(B, \mathbb{R})$, the class $\hat h_\mathbb{R} \in H^3(\hat E, \mathbb{R})$ is uniquely determined by these properties. Since our $T$-duality pairs share these properties, we conclude that (upon passing to real cohomology) they are dual in the sense of [2]. It then follows also that $\hat h_\mathbb{R}$ can be chosen with integral periods and with an integral lift $\hat h$ such that $\hat\pi_*\hat h = c$, since we construct an integral lift of some representative. This assertion is also implicit in [2], but without a detailed proof. Note also that the ambiguity in the dual class $\hat h$ is exactly parallel to the ambiguity in the construction of [2].

2.2.7. To prove that $\hat h_\mathbb{R}$ is determined by the properties $\hat\pi_*\hat h_\mathbb{R} = c_\mathbb{R}$ and $\hat p^*\hat h_\mathbb{R} = p^*h_\mathbb{R}$ we consider the following web of Gysin sequences for the $U(1)$-principal bundles $p$, $\hat p$, $\pi$ and $\hat\pi$. Every row and every column is exact, and by the naturality of the Gysin sequence every square commutes. We use cohomology with real coefficients throughout, but the diagram is of course also correct with integral coefficients.

[Diagram (2.15): web of Gysin sequences for $p$, $\hat p$, $\pi$ and $\hat\pi$.]

Assume that $\hat h, \hat h' \in H^3(\hat E)$ both satisfy the above equations and set $d := \hat h - \hat h'$. It follows that $\hat\pi_! d = 0 \in H^2(B)$ and that $\hat p^* d = 0$. The second property implies that
there is a lift $l \in H^1(\hat E)$ with $d = l \cup \hat\pi^*c$. Set $n := \hat\pi_! l \in H^0(B)$. Without loss of generality we can assume that $B$ is connected (else we work one component at a time). Now only two possibilities remain (since [2] uses real coefficients, where no torsion phenomena occur).

(1) Either $n = 0$; then $l = \hat\pi^*a$ for a suitable $a \in H^1(B)$, and consequently $\hat h - \hat h' = d = \hat\pi^*(c \cup a)$, which is exactly what we want to prove.

(2) If $n \neq 0$, then $c_\mathbb{R} = 0$, since $n c_\mathbb{R} = \hat\pi_! d = 0$. In this case, $\hat\pi^*c_\mathbb{R} = 0$ and therefore also $\hat h - \hat h' = d = l \cup \hat\pi^*c_\mathbb{R} = 0$.

2.2.8. Let us fix $(E, h)$.

Theorem 2.16. The equivalence class of pairs which are dual to $(E, h)$ is uniquely determined.

Proof. By Lemma 2.12 the isomorphism class of the underlying $U(1)$-bundle $\hat E$ of a pair dual to $(E, h)$ is determined by the first Chern class $\hat c := -\pi_!(h)$. If $(\hat E, \hat h)$ and $(\hat E, \hat h')$ are both dual to $(E, h)$, then by Corollary 2.11 $\hat h' - \hat h = \hat\pi^*(c \cup a)$ for some $a \in H^1(B, \mathbb{Z})$. It remains to show that there exists an automorphism of $U(1)$-principal bundles
\[
\begin{array}{ccc}
\hat E & \xrightarrow{\;U\;} & \hat E \\
\downarrow & & \downarrow \\
B & = & B
\end{array}
\]
such that $U^*\hat h = \hat h'$. Any automorphism $U$ is given by multiplication by a suitable $g: B \to U(1)$. Then we can factor $U$ as the composition
\[
\hat E \xrightarrow{\;(\hat\pi, \mathrm{id})\;} B \times \hat E \xrightarrow{\;g \times \mathrm{id}\;} U(1) \times \hat E \xrightarrow{\;m\;} \hat E,
\]
where $m$ is given by the principal bundle structure. Observe that we have the pull-back diagram
\[
\begin{array}{ccc}
U(1) \times \hat E & \xrightarrow{\;m\;} & \hat E \\
{\scriptstyle \mathrm{pr}_2}\downarrow & & \downarrow{\scriptstyle \hat\pi} \\
\hat E & \xrightarrow{\;\hat\pi\;} & B
\end{array}
\]
Using $H^3(U(1) \times \hat E) = \mathrm{pr}_2^*H^3(\hat E) \oplus o_{U(1)} \times \mathrm{pr}_2^*H^2(\hat E)$, where $o_{U(1)} \in H^1(U(1))$ is the canonical generator, naturality of integration over the fiber and the split of $\mathrm{pr}_2$, we obtain
\[
m^*(\hat h) = \mathrm{pr}_2^*(\hat h) \oplus o_{U(1)} \times \hat\pi^*\hat\pi_!(\hat h).
\]
Note that $[B, U(1)] \cong H^1(B, \mathbb{Z})$ via $[g] \mapsto g^*o_{U(1)} =: a(g) \in H^1(B, \mathbb{Z})$.

Now we return to the construction of $U$ (and therefore $g$) with $U^*\hat h = \hat h'$. To achieve this, choose $g$ corresponding to $a \in H^1(B, \mathbb{Z})$ such that $\hat h - \hat h' = \hat\pi^*(c \cup a)$. Using $\hat\pi_!(\hat h) = -c$ we get $(g \times \mathrm{id})^*m^*(\hat h) = -a \times \hat\pi^*c + \mathrm{pr}_2^*(\hat h)$. Finally $U^*(\hat h) = \hat h - \hat\pi^*(c \cup a) = \hat h'$.
2.3. The topology of $R$

2.3.1. It is a topological fact that the universal bundle with fiber $K(\mathbb{Z}, 3)$ is $K(\mathbb{Z}, 3) \to PK(\mathbb{Z}, 4) \to K(\mathbb{Z}, 4)$, where $PK(\mathbb{Z}, 4)$ is the path space of $K(\mathbb{Z}, 4)$, i.e., the space of all paths in $K(\mathbb{Z}, 4)$ starting at the base point. The map to $K(\mathbb{Z}, 4)$ is given by evaluation at the end point. The fiber of this evaluation over the base point is the based loop space $\Omega K(\mathbb{Z}, 4)$, which serves here as a model for the homotopy type $K(\mathbb{Z}, 3)$.

2.3.2. If $B$ is a space, then bundles over $B$ with ("oriented") fiber $K(\mathbb{Z}, 3)$ are classified by homotopy classes of maps $[B, K(\mathbb{Z}, 4)]$, i.e., by cohomology classes in $H^4(B, \mathbb{Z})$. The homotopy type of such a bundle is determined by such maps up to self homotopy equivalences of $B$ and of $K(\mathbb{Z}, 4)$, i.e., up to the action of self homotopy equivalences of $B$ and up to multiplication by $-1$ on $H^4(B, \mathbb{Z})$.

We consider a bundle $K(\mathbb{Z}, 3) \to F \to B$ which is classified by $\kappa \in H^4(B, \mathbb{Z})$. For simplicity we assume that $B$ is connected and simply connected. Then $\kappa$ can be read off from the differential $d_4^{0,3}$ in the Serre spectral sequence for the bundle. By the Hurewicz theorem, the relevant part of the $E_4$-page looks like
\[
\begin{array}{c|ccccc}
3 & \mathbb{Z} & 0 & H^2(B, \mathbb{Z}) & H^3(B, \mathbb{Z}) & H^4(B, \mathbb{Z}) \\
2 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & \mathbb{Z} & 0 & H^2(B, \mathbb{Z}) & H^3(B, \mathbb{Z}) & H^4(B, \mathbb{Z}) \\
\hline
 & 0 & 1 & 2 & 3 & 4
\end{array}
\]
The differential $d_4^{0,3}: \mathbb{Z} \to H^4(B, \mathbb{Z})$ is multiplication with $\kappa$.
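To make the role of this differential concrete, the following short derivation (a sketch added for illustration, under the same assumptions that $B$ is connected and simply connected) reads off the cohomology of $F$ in degrees three and four from the displayed $E_4$-page. It uses only the standard facts $H^1(B, \mathbb{Z}) = 0$ and $H^4(K(\mathbb{Z}, 3), \mathbb{Z}) = 0$.

```latex
% Low-degree consequences of d_4^{0,3} = multiplication by \kappa.
% Rows q = 1, 2 of the spectral sequence vanish, E^{1,3} = H^1(B,\mathbb{Z}) = 0,
% and E^{0,4} = H^4(K(\mathbb{Z},3),\mathbb{Z}) = 0, so d_4^{0,3} is the only
% differential that matters in total degrees 3 and 4:
\[
0 \to H^3(B,\mathbb{Z}) \to H^3(F,\mathbb{Z}) \to \ker\bigl(d_4^{0,3}\bigr) \to 0,
\qquad
H^4(F,\mathbb{Z}) \;\cong\; H^4(B,\mathbb{Z}) \,/\, \kappa\,\mathbb{Z} .
\]
% In particular, if \kappa has infinite order, then \ker(d_4^{0,3}) = 0 and
% H^3(F,\mathbb{Z}) \cong H^3(B,\mathbb{Z}).
```

Here $\kappa\,\mathbb{Z}$ denotes the subgroup of $H^4(B, \mathbb{Z})$ generated by $\kappa$.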
2.3.3. The main result of the present subsection is the determination of the homotopy type of $R$. Let $z \in H^2(K(\mathbb{Z}, 2), \mathbb{Z})$ be the canonical generator. By the Künneth theorem, the cohomology of $K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)$ is the polynomial ring in two generators $c := \mathrm{pr}_1^*z$ and $\hat c := \mathrm{pr}_2^*z$, i.e., $H(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2), \mathbb{Z}) = \mathbb{Z}[c, \hat c]$.

Theorem 2.17. $R$ is the total space of a bundle
\[
K(\mathbb{Z}, 3) \to R \to K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2), \tag{2.18}
\]
which is classified by $c \cup \hat c \in H^4(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2), \mathbb{Z})$.

2.3.4. To prove Theorem 2.17, we first compute the homotopy groups $\pi_i(R)$. Observe that $S^0$ and $S^1$ admit only one isomorphism class of pairs. This implies
that $R$ is connected and simply connected. This observation also frees us from base point considerations.

Lemma 2.19. The homotopy groups of $R$ are given by
\[
\pi_i(R) =
\begin{cases}
0 & i \notin \{2, 3\} \\
\mathbb{Z} \oplus \mathbb{Z} & i = 2 \\
\mathbb{Z} & i = 3
\end{cases}
\]

Proof. We first observe that there is exactly one isomorphism class of pairs over $S^i$ for $i \geq 4$, namely $(U(1) \times S^i \to S^i, 0)$. This implies that $\pi_i(R) = 0$ for $i \geq 4$. It remains to determine $\pi_2(R)$ and $\pi_3(R)$. If $(E, h)$ is a pair over $S^3$, then we have $E = S^1 \times S^3$ and $h = n(E, h)\, 1_{S^1} \times o_{S^3}$ for a well-defined integer $n(E, h) \in \mathbb{Z}$, where $o_{S^3} \in H^3(S^3, \mathbb{Z})$ is the canonical generator. The bijection $P(S^3) \cong \mathbb{Z}$ given by $(E, h) \mapsto n(E, h)$ induces the isomorphism $\pi_3(R) \to \mathbb{Z}$ in view of Proposition 2.6. Let us now consider a pair $(E, h)$ over $S^2$. Note that $E$ is canonically oriented; in particular $H^3(E, \mathbb{Z}) = [E] \cdot \mathbb{Z}$. Let $c \in H^2(S^2, \mathbb{Z})$ be its first Chern class. Then we define the tuple of integers
\[
(k(E, h), n(E, h)) = (\langle c, [S^2] \rangle, \langle h, [E] \rangle) \in \mathbb{Z} \oplus \mathbb{Z}.
\]
The bijection $P(S^2) \cong \mathbb{Z} \oplus \mathbb{Z}$ given by $(E, h) \mapsto (k(E, h), n(E, h))$ defines the isomorphism $\pi_2(R) \cong \mathbb{Z} \oplus \mathbb{Z}$ in view of Proposition 2.6. (Footnote b: We leave it to the interested reader to check that these bijections are in fact homomorphisms.)

2.3.5. The computation of the homotopy groups of $R$ implies by the Hurewicz theorem that $H_0(R, \mathbb{Z}) = \mathbb{Z}$, $H_1(R, \mathbb{Z}) \cong 0$ and $H_2(R, \mathbb{Z}) \cong \mathbb{Z} \oplus \mathbb{Z}$. By the universal coefficient theorem $H^2(R, \mathbb{Z}) \cong \mathbb{Z} \oplus \mathbb{Z}$. Recall that $\mathbf{c} \in H^2(R, \mathbb{Z})$ is the class of the projection $\mathbf{c}: R \to K(\mathbb{Z}, 2)$. Let $\pi: \mathbf{E} \to R$ be the universal bundle and $\mathbf{h} \in H^3(\mathbf{E}, \mathbb{Z})$ be the universal class.

Definition 2.20. We define $\hat{\mathbf{c}} := -\pi_!(\mathbf{h}) \in H^2(R, \mathbb{Z})$.

Lemma 2.21. We have $H^2(R, \mathbb{Z}) = \mathbf{c}\mathbb{Z} \oplus \hat{\mathbf{c}}\mathbb{Z}$.

Proof. Use the canonical isomorphisms $H^2(R, \mathbb{Z}) \cong \mathrm{Hom}(H_2(R, \mathbb{Z}), \mathbb{Z}) \cong \mathrm{Hom}(\pi_2(R), \mathbb{Z})$, where the pair of $x \in H^2(R, \mathbb{Z})$ and $[f: S^2 \to R]$ is mapped to $\langle f^*x, [S^2] \rangle$. The identification $\pi_2(R) \cong \mathbb{Z} \oplus \mathbb{Z}$ above gives $H^2(R, \mathbb{Z}) \cong \mathbb{Z} \oplus \mathbb{Z}$. An inspection shows that this isomorphism maps $a\mathbf{c} + b\hat{\mathbf{c}}$ to $(a, -b)$. Therefore, $H^2(R, \mathbb{Z})$ is freely generated by $\mathbf{c}$ and $\hat{\mathbf{c}}$.

2.3.6. Let $\hat{\mathbf{c}}$ be classified by a map $\hat{\mathbf{c}}: R \to K(\mathbb{Z}, 2)$. We will now determine the homotopy fiber $F$ of the map $(\mathbf{c}, \hat{\mathbf{c}}): R \to K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)$.
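Lemma 2.22. The homotopy fiber of $(\mathbf{c}, \hat{\mathbf{c}})$ is $K(\mathbb{Z}, 3)$.

Proof. We consider the long exact sequence of homotopy groups
\[
\cdots \to \pi_i(F) \to \pi_i(R) \to \pi_i(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)) \to \pi_{i-1}(F) \to \cdots .
\]
We immediately conclude that $\pi_i(F) = 0$ if $i \notin \{1, 2, 3\}$. Furthermore we see that $\pi_3(F) \cong \pi_3(R) \cong \mathbb{Z}$. Therefore the relevant part is now
\[
0 \to \pi_2(F) \to \pi_2(R) \xrightarrow{\;\pi_2(\mathbf{c}, \hat{\mathbf{c}})\;} \pi_2(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)) \to \pi_1(F) \to 0.
\]
Now we observe that $(\mathbf{c}, \hat{\mathbf{c}})$ induces an isomorphism in integral cohomology of degree $i \leq 2$. Therefore it induces an isomorphism $\alpha: \pi_2(R) \cong \pi_2(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2))$. It follows that $\pi_i(F) = 0$ for $i \in \{1, 2\}$.

2.3.7. We now have seen that $R$ is the total space of a bundle
\[
K(\mathbb{Z}, 3) \to R \xrightarrow{\;(\mathbf{c}, \hat{\mathbf{c}})\;} K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2).
\]
It remains to determine the invariant $\kappa \in H^4(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2), \mathbb{Z})$ which determines this bundle. To do this we compute the cohomology of $R$ up to degree four and then we determine the differential in the Serre spectral sequence of the bundle. We already know that
\[
\begin{array}{c|c}
n & H^n(R, \mathbb{Z}) \\
\hline
0 & \mathbb{Z} \\
1 & 0 \\
2 & \mathbf{c}\mathbb{Z} \oplus \hat{\mathbf{c}}\mathbb{Z}
\end{array}
\]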
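2.3.8. We start by recalling the low-dimensional integral cohomology of $LK(\mathbb{Z}, 3)$. Note that $K(\mathbb{Z}, 3)$ has the structure of an $H$-space (because one possible model is $\Omega K(\mathbb{Z}, 4)$), so that $LK(\mathbb{Z}, 3)$ is homotopy equivalent to $K(\mathbb{Z}, 3) \times \Omega K(\mathbb{Z}, 3)$. Further note that $\Omega K(\mathbb{Z}, 3) \simeq K(\mathbb{Z}, 2)$. We use that
\[
\begin{array}{c|c|c}
n & H^n(K(\mathbb{Z}, 2), \mathbb{Z}) & H^n(K(\mathbb{Z}, 3), \mathbb{Z}) \\
\hline
0 & \mathbb{Z} & \mathbb{Z} \\
1 & 0 & 0 \\
2 & \mathbb{Z} & 0 \\
3 & 0 & \mathbb{Z} \\
4 & \mathbb{Z} & 0 \\
5 & 0 & 0
\end{array}
\]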
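We now conclude by the Künneth formula that
\[
\begin{array}{c|c}
n & H^n(LK(\mathbb{Z}, 3), \mathbb{Z}) \\
\hline
0 & \mathbb{Z} \\
1 & 0 \\
2 & \mathbb{Z} \\
3 & \mathbb{Z} \\
4 & \mathbb{Z} \\
5 & \mathbb{Z}
\end{array}
\]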
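For readers who want to see where each entry comes from, the Künneth computation can be spelled out as follows. This is an illustrative sketch based on the table of 2.3.8 and the equivalence $LK(\mathbb{Z}, 3) \simeq K(\mathbb{Z}, 3) \times K(\mathbb{Z}, 2)$; no Tor terms appear because the integral cohomology of $K(\mathbb{Z}, 2)$ is free.

```latex
% Kunneth formula for LK(Z,3) ~ K(Z,3) x K(Z,2) in degrees n <= 5.
\[
H^n(LK(\mathbb{Z},3),\mathbb{Z}) \;\cong\;
\bigoplus_{p+q=n} H^p(K(\mathbb{Z},3),\mathbb{Z}) \otimes H^q(K(\mathbb{Z},2),\mathbb{Z}),
\qquad n \le 5 .
\]
% Only the summands with p \in \{0,3\} and q \in \{0,2,4\} are non-zero:
\begin{align*}
n = 0 &: \; H^0 \otimes H^0 = \mathbb{Z}, &
n = 1 &: \; 0, \\
n = 2 &: \; H^0 \otimes H^2 = \mathbb{Z}, &
n = 3 &: \; H^3 \otimes H^0 = \mathbb{Z}, \\
n = 4 &: \; H^0 \otimes H^4 = \mathbb{Z}, &
n = 5 &: \; H^3 \otimes H^2 = \mathbb{Z}.
\end{align*}
```

In each tensor product the first factor refers to $K(\mathbb{Z}, 3)$ and the second to $K(\mathbb{Z}, 2)$.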
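2.3.9. We compute the cohomology $H^3(R, \mathbb{Z})$ using the Gysin sequence of
\[
U(1) \to U \times LK(\mathbb{Z}, 3) \to R. \tag{2.23}
\]
Observe that
\[
\begin{array}{ccc}
U \times LK(\mathbb{Z}, 3) & \to & U \\
\downarrow & & \downarrow \\
R & \xrightarrow{\;\mathbf{c}\;} & K(\mathbb{Z}, 2)
\end{array}
\]
is a pull-back of $U(1)$-principal bundles. Therefore the first Chern class of the $U(1)$-principal bundle $U \times LK(\mathbb{Z}, 3) \to R$ (with the diagonal $U(1)$-action) is $\mathbf{c} \in H^2(R, \mathbb{Z})$. We further use the fact that $U$ is contractible. The relevant part of the Gysin sequence is
\[
0 \to H^3(R, \mathbb{Z}) \to H^3(LK(\mathbb{Z}, 3), \mathbb{Z}) \to H^2(R, \mathbb{Z}) \xrightarrow{\;\mathbf{c}\;} H^4(R, \mathbb{Z}) \to H^4(LK(\mathbb{Z}, 3), \mathbb{Z}) \to H^3(R, \mathbb{Z}).
\]
Since $\mathbf{c}$ is the first Chern class of $\pi: \mathbf{E} \to R$, the above principal bundle is isomorphic to $\mathbf{E}$ and we can use the Gysin sequence for $\pi: \mathbf{E} \to R$,
\[
\to H^3(\mathbf{E}, \mathbb{Z}) \xrightarrow{\;\pi_!\;} H^2(R, \mathbb{Z}) \xrightarrow{\;\mathbf{c}\;} H^4(R, \mathbb{Z}) \to ,
\]
to conclude that $\mathbf{c} \cup \hat{\mathbf{c}} = -\mathbf{c} \cup \pi_!(\mathbf{h}) = 0$. Therefore $\mathbf{c}: H^2(R, \mathbb{Z}) \to H^4(R, \mathbb{Z})$ is not injective. Since $H^3(LK(\mathbb{Z}, 3), \mathbb{Z}) \cong \mathbb{Z}$ and $H^2(R, \mathbb{Z})$ is free abelian, this implies that $H^3(R, \mathbb{Z}) = 0$.

2.3.10. The map $\mathbf{c}: R \to K(\mathbb{Z}, 2)$ admits a natural split $K(\mathbb{Z}, 2) \to R$. It maps $x \in K(\mathbb{Z}, 2)$ to the class $[u, \gamma]$, where $\gamma$ is the constant loop and $u \in U_x$ is any point. The split classifies the pair $(U, 0)$ over $K(\mathbb{Z}, 2)$. The existence of the split implies that $\mathbf{c}$ generates a polynomial ring $\mathbb{Z}[\mathbf{c}]$ as a direct summand inside $H^*(R, \mathbb{Z})$.

2.3.11. In particular, $\mathbf{c}^2 \neq 0$. Therefore the kernel of $\mathbf{c}: H^2(R, \mathbb{Z}) \to H^4(R, \mathbb{Z})$ is generated by $\hat{\mathbf{c}}$. The Gysin sequence for (2.23) now gives
\[
0 \to \mathbb{Z} \xrightarrow{\;\mathbf{c}^2\;} H^4(R, \mathbb{Z}) \to \mathbb{Z} \to 0,
\]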
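where the last copy of $\mathbb{Z}$ is $H^4(LK(\mathbb{Z}, 3), \mathbb{Z})$. This implies that $H^4(R, \mathbb{Z}) \cong \mathbf{c}^2\mathbb{Z} \oplus \mathbb{Z}$. We now show that $\mathbf{c}^2$ and $\hat{\mathbf{c}}^2$ generate $H^4(R, \mathbb{Z})$ as a $\mathbb{Z}$-module. We consider the pair over $K(\mathbb{Z}, 2)$ consisting of the trivial bundle $\Pi: U(1) \times K(\mathbb{Z}, 2) \to K(\mathbb{Z}, 2)$ and the class $h = o_{U(1)} \times z \in H^3(U(1) \times K(\mathbb{Z}, 2), \mathbb{Z})$, where $z \in H^2(K(\mathbb{Z}, 2), \mathbb{Z})$ is a generator. This pair is classified by a map $f: K(\mathbb{Z}, 2) \to R$. Let $F: U(1) \times K(\mathbb{Z}, 2) \to \mathbf{E}$ be defined by the pull-back diagram
\[
\begin{array}{ccc}
U(1) \times K(\mathbb{Z}, 2) & \xrightarrow{\;F\;} & \mathbf{E} \\
{\scriptstyle \Pi}\downarrow & & \downarrow{\scriptstyle \pi} \\
K(\mathbb{Z}, 2) & \xrightarrow{\;f\;} & R
\end{array}
\]
Then we have $f^*\mathbf{c} = 0$ and $f^*\hat{\mathbf{c}} = -f^*\pi_!(\mathbf{h}) = -\Pi_!F^*(\mathbf{h}) = -\Pi_!(h) = -z$. This shows that $\hat{\mathbf{c}} \in H^2(R, \mathbb{Z})$ generates a polynomial ring isomorphic to $\mathbb{Z}[\hat{\mathbf{c}}]$ inside $H^*(R, \mathbb{Z})$. Furthermore, we see that $f^*(\hat{\mathbf{c}}^2) = z^2$ is primitive so that $\hat{\mathbf{c}}^2$ must be primitive, too. Thus $H^4(R, \mathbb{Z}) = \mathbf{c}^2\mathbb{Z} \oplus \hat{\mathbf{c}}^2\mathbb{Z}$. Let us collect the results of our computations:

Lemma 2.24. We have
\[
\begin{array}{c|c}
n & H^n(R, \mathbb{Z}) \\
\hline
0 & \mathbb{Z} \\
1 & 0 \\
2 & \mathbf{c}\mathbb{Z} \oplus \hat{\mathbf{c}}\mathbb{Z} \\
3 & 0 \\
4 & \mathbf{c}^2\mathbb{Z} \oplus \hat{\mathbf{c}}^2\mathbb{Z}
\end{array}
\]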
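The table of Lemma 2.24, together with the relation $\mathbf{c} \cup \hat{\mathbf{c}} = 0$ obtained in 2.3.9, can be summarized in one line. The following is only a restatement of those computations in degrees at most four (a sketch added for orientation; nothing is claimed about higher degrees).

```latex
% Ring structure of H^*(R,\mathbb{Z}) in degrees <= 4, as computed in 2.3.9-2.3.11.
% Generators \mathbf{c}, \hat{\mathbf{c}} in degree 2; single relation in degree 4.
\[
H^{\le 4}(R,\mathbb{Z}) \;\cong\;
\Bigl(\mathbb{Z}[\mathbf{c},\hat{\mathbf{c}}] \,/\, (\mathbf{c}\cup\hat{\mathbf{c}})\Bigr)^{\le 4},
\qquad \deg\mathbf{c} = \deg\hat{\mathbf{c}} = 2 .
\]
% Degree 4 of the quotient has basis \mathbf{c}^2, \hat{\mathbf{c}}^2, matching Lemma 2.24;
% the odd degrees 1 and 3 vanish.
```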
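2.3.12. We now finish the proof of Theorem 2.17. We consider the $E_4$-page of the Serre spectral sequence of the fibration (2.18), where $*$ marks entries that are not needed:
\[
\begin{array}{c|ccccc}
4 & 0 & 0 & 0 & 0 & 0 \\
3 & \mathbb{Z} & 0 & * & 0 & * \\
2 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & \mathbb{Z} & 0 & c\mathbb{Z} \oplus \hat c\mathbb{Z} & 0 & c^2\mathbb{Z} \oplus (c \cup \hat c)\mathbb{Z} \oplus \hat c^2\mathbb{Z} \\
\hline
 & 0 & 1 & 2 & 3 & 4
\end{array}
\]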
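We read off that we get the exact sequence
\[
0 \to \mathbb{Z} \xrightarrow{\;d_4^{0,3}\;} c^2\mathbb{Z} \oplus (c \cup \hat c)\mathbb{Z} \oplus \hat c^2\mathbb{Z} \to \mathbf{c}^2\mathbb{Z} \oplus \hat{\mathbf{c}}^2\mathbb{Z} \to 0.
\]
The last map is the edge homomorphism and therefore induced by the map $R \to K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)$. Since under this map $c$ is mapped to $\mathbf{c}$ and $\hat c$ to $\hat{\mathbf{c}}$, $d_4^{0,3}$ is multiplication by $\pm c \cup \hat c$. This finishes the proof of Theorem 2.17.

2.4. The $T$-transformation

2.4.1. We have already observed that $\mathbf{E}$ and $U \times LK(\mathbb{Z}, 3)$ are isomorphic $U(1)$-principal bundles over $R$, both having first Chern class $\mathbf{c}$. Since $U$ is contractible, we know the low-dimensional cohomology of $\mathbf{E}$ by Sec. 2.3.8. Using the Gysin sequence of $\pi: \mathbf{E} \to R$, we determine the generators in terms of characteristic classes of $\mathbf{E}$. From
\[
0 \to H^0(R, \mathbb{Z}) \xrightarrow{\;\mathbf{c}\;} H^2(R, \mathbb{Z}) \xrightarrow{\;\pi^*\;} H^2(\mathbf{E}, \mathbb{Z}) \to 0,
\]
we conclude that $H^2(\mathbf{E}, \mathbb{Z}) \cong \pi^*\hat{\mathbf{c}}\mathbb{Z}$. Finally, we get
\[
0 \to H^3(\mathbf{E}, \mathbb{Z}) \xrightarrow{\;\pi_!\;} H^2(R, \mathbb{Z}) \xrightarrow{\;\mathbf{c}\;} H^4(R, \mathbb{Z}) \xrightarrow{\;\pi^*\;} H^4(\mathbf{E}, \mathbb{Z}) \to 0.
\]
This shows that $H^3(\mathbf{E}, \mathbb{Z}) \cong \mathbb{Z}$ and $H^4(\mathbf{E}, \mathbb{Z}) \cong \pi^*\hat{\mathbf{c}}^2\mathbb{Z}$. Since $\hat{\mathbf{c}} = -\pi_!(\mathbf{h})$ generates the kernel of $\mathbf{c}: H^2(R, \mathbb{Z}) \to H^4(R, \mathbb{Z})$ we have $H^3(\mathbf{E}, \mathbb{Z}) \cong \mathbf{h}\mathbb{Z}$.

Lemma 2.25.
\[
\begin{array}{c|c}
n & H^n(\mathbf{E}, \mathbb{Z}) \\
\hline
0 & \mathbb{Z} \\
1 & 0 \\
2 & \pi^*\hat{\mathbf{c}}\mathbb{Z} \\
3 & \mathbf{h}\mathbb{Z} \\
4 & \pi^*\hat{\mathbf{c}}^2\mathbb{Z}
\end{array}
\]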
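2.4.2. The class $\hat{\mathbf{c}}$ classifies a $U(1)$-principal bundle $\hat\pi: \hat{\mathbf{E}} \to R$. Since $\mathbf{c} \cup \hat{\mathbf{c}} = 0$ and $H^3(R, \mathbb{Z}) = 0$ there exist unique classes $h \in H^3(\mathbf{E}, \mathbb{Z})$ and $\hat h \in H^3(\hat{\mathbf{E}}, \mathbb{Z})$ such that $(\mathbf{E}, h)$ and $(\hat{\mathbf{E}}, \hat h)$ are dual to each other, where we use Corollary 2.10.

Lemma 2.26. We have $h = \mathbf{h}$.

Proof. Let $r: \mathbf{V} \to R$ denote the two-dimensional complex vector bundle given by $\mathbf{V} := \mathbf{L} \oplus \hat{\mathbf{L}}$, where $\mathbf{L}$ and $\hat{\mathbf{L}}$ are the hermitian line bundles associated to $\mathbf{E}$ and $\hat{\mathbf{E}}$. Then we can factor the associated unit sphere bundle as
\[
S(\mathbf{V}) \xrightarrow{\;s\;} P(\mathbf{V}) \xrightarrow{\;t\;} R,
\]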
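where $P(\mathbf{V})$ is the projective bundle of $\mathbf{V}$. Let $\tilde c \in H^2(P(\mathbf{V}), \mathbb{Z})$ be the first Chern class of the $U(1)$-principal bundle $s: S(\mathbf{V}) \to P(\mathbf{V})$. By the Leray–Hirsch theorem $H^*(P(\mathbf{V}), \mathbb{Z})$ is a free module over $H^*(R, \mathbb{Z})$ generated by $1_{P(\mathbf{V})} \in H^0(P(\mathbf{V}), \mathbb{Z})$ and $\tilde c$. The line bundles $\mathbf{L}$ and $\hat{\mathbf{L}}$ induce two sections $l, \hat l: R \to P(\mathbf{V})$ such that we have the following pull-back diagram
\[
\begin{array}{ccccc}
\mathbf{E} & \xrightarrow{\;i\;} & S(\mathbf{V}) & \xleftarrow{\;\hat i\;} & \hat{\mathbf{E}} \\
{\scriptstyle \pi}\downarrow & & {\scriptstyle s}\downarrow & & \downarrow{\scriptstyle \hat\pi} \\
R & \xrightarrow{\;l\;} & P(\mathbf{V}) & \xleftarrow{\;\hat l\;} & R
\end{array}
\]
Note that $l^*\tilde c = \mathbf{c}$ and $\hat l^*\tilde c = \hat{\mathbf{c}}$. Since $t_!(\tilde c) = 1 = t_! \circ s_!(\mathrm{Th})$ we have $s_!(\mathrm{Th}) = \tilde c + t^*b$ for some $b \in H^2(R, \mathbb{Z})$. This implies that $\pi_!(h) = \pi_! \circ i^*(\mathrm{Th}) = l^* \circ s_!(\mathrm{Th}) = \mathbf{c} + b$. Analogously, we get $\hat\pi_!(\hat h) = \hat{\mathbf{c}} + b$. Furthermore, we deduce from the projection formula that
\[
\mathbf{c} \cup (\mathbf{c} + b) = \mathbf{c} \cup \pi_!(h) = \pi_!(\pi^*\mathbf{c} \cup h) = 0, \qquad \hat{\mathbf{c}} \cup (\hat{\mathbf{c}} + b) = 0.
\]
Using the information about the ring structure of $H^*(R, \mathbb{Z})$ it follows that $\mathbf{c} + b = m\hat{\mathbf{c}}$ and $\hat{\mathbf{c}} + b = n\mathbf{c}$ for some $m, n \in \mathbb{Z}$. Since $H^2(R, \mathbb{Z})$ is freely generated by $\mathbf{c}$ and $\hat{\mathbf{c}}$ we conclude that $m = n = -1$, i.e., $b = -(\hat{\mathbf{c}} + \mathbf{c})$, so that $\pi_!(h) = -\hat{\mathbf{c}}$. By Lemma 2.25 each class $x \in H^3(\mathbf{E}, \mathbb{Z})$ is a multiple of $\mathbf{h}$. Since $\pi_!(h) = \pi_!(\mathbf{h})$ we see that $h = \mathbf{h}$.

2.4.3. We also see that $\hat\pi_!(\hat h) = -\mathbf{c}$. This shows that $\hat{\mathbf{h}} := \hat h \in H^3(\hat{\mathbf{E}}, \mathbb{Z})$ is a generator.

Definition 2.27. We define the dual universal pair to be $(\hat{\mathbf{E}}, \hat{\mathbf{h}})$.

As in 2.4.1 we have

Corollary 2.28.
\[
\begin{array}{c|c}
n & H^n(\hat{\mathbf{E}}, \mathbb{Z}) \\
\hline
0 & \mathbb{Z} \\
1 & 0 \\
2 & \hat\pi^*\mathbf{c}\mathbb{Z} \\
3 & \hat{\mathbf{h}}\mathbb{Z} \\
4 & \hat\pi^*\mathbf{c}^2\mathbb{Z}
\end{array}
\]

2.4.4. Let $T: R \to R$ be the classifying map of the dual pair $(\hat{\mathbf{E}}, \hat{\mathbf{h}})$, covered by the $U(1)$-bundle map $T_E: \hat{\mathbf{E}} \to \mathbf{E}$.

Lemma 2.29. $T \circ T$ classifies $(\mathbf{E}, \mathbf{h})$. In particular, $T^2 \sim \mathrm{id}_R$.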
Proof. We have $T^*\mathbf{c} = \hat{\mathbf{c}}$. Furthermore, $T^*\hat{\mathbf{c}} = T^*\pi_!(-\mathbf{h}) = \hat\pi_!(-T_E^*\mathbf{h}) = \hat\pi_!(-\hat{\mathbf{h}}) = \mathbf{c}$. Thus $(T \circ T)^*\mathbf{c} = \mathbf{c}$ and $(T \circ T)^*\hat{\mathbf{c}} = \hat{\mathbf{c}}$. The underlying bundle of the pair classified by $T^2$ is $\pi: \mathbf{E} \to R$. Since $\pi_!(\mathbf{h}) = -\hat{\mathbf{c}} = -(T \circ T)^*\hat{\mathbf{c}}$ we must have $(T \circ T)^*(\mathbf{E}, \mathbf{h}) \cong (\mathbf{E}, \mathbf{h})$.

2.4.5. Recall from 2.1.8 that $P(B)$ denotes the set of isomorphism classes of pairs over $B$ and that we have a natural isomorphism of functors $\Psi_B: [B, R] \to P(B)$. The map $T: R \to R$ induces an involution $T_*: [\ldots, R] \to [\ldots, R]$.

Definition 2.30. We define the natural transformation of set-valued functors $T_{\ldots}: P(\ldots) \to P(\ldots)$ by $T_B := \Psi_B \circ T_* \circ \Psi_B^{-1}$. We call it the $T$-duality transformation.

2.4.6. The following is a consequence of Lemma 2.29.

Corollary 2.31. We have $T_B^2 = \mathrm{id}$. In particular, the $T$-duality transformation is an isomorphism of functors.

2.4.7. Let $(E, h)$ be a pair over $B$ and $c \in H^2(B, \mathbb{Z})$ be the first Chern class of $E$.

Lemma 2.32. Any pair $(E, h)$ admits a dual pair $(\hat E, \hat h)$ representing the class $T_B([E, h])$. The first Chern class $\hat c \in H^2(B, \mathbb{Z})$ of $\hat E$ is given by $\hat c = -\pi_!(h)$. Furthermore, $c = -\hat\pi_!(\hat h)$.

Proof. Let $f: B \to R$ classify the pair $(E, h)$. Then we let $(\hat E, \hat h) = f^*(\hat{\mathbf{E}}, \hat{\mathbf{h}})$. The relations between the Chern classes and the three-dimensional cohomology classes follow from the corresponding relations over $R$ obtained in 2.4.2. We have compatible pull-back diagrams
\[
\begin{array}{ccc}
E & \to & \mathbf{E} \\
\downarrow & & \downarrow \\
B & \to & R
\end{array}
\qquad
\begin{array}{ccc}
\hat E & \to & \hat{\mathbf{E}} \\
\downarrow & & \downarrow \\
B & \to & R
\end{array}
\qquad
\begin{array}{ccc}
S(V) & \to & S(\mathbf{V}) \\
\downarrow & & \downarrow \\
B & \to & R
\end{array}
\]
We obtain the Thom class of $S(V)$ as a pull-back of the universal Thom class of $S(\mathbf{V})$. Its restrictions to $E$ and $\hat E$ give $h$ and $\hat h$. This shows that $(E, h)$ and $(\hat E, \hat h)$ are in duality.

2.4.8. We consider pairs $(E, h)$ and $(\hat E, \hat h)$ over a space $B$. Let $c, \hat c$ denote the first Chern classes of $E$ and $\hat E$.

Lemma 2.33. If $(E, h)$ and $(\hat E, \hat h)$ are dual to each other, then we have $c = -\hat\pi_!(\hat h)$ and $\hat c = -\pi_!(h)$.
Proof. Denote the canonical generators of the polynomial ring $H^*(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2), \mathbb{Z})$ by $z, \hat z$ (instead of $c, \hat c$; we do this in order to avoid notational conflicts). Recall that we have a bundle
\[
K(\mathbb{Z}, 3) \to R \xrightarrow{\;(\mathbf{c}, \hat{\mathbf{c}})\;} K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)
\]
which is classified by $z \cup \hat z \in H^4(K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2), \mathbb{Z})$. If $\bar f: B \to K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)$ satisfies $\bar f^*z \cup \bar f^*\hat z = 0$, then it admits a lift $f: B \to R$. Let $\bar f: B \to K(\mathbb{Z}, 2) \times K(\mathbb{Z}, 2)$ be the classifying map of the pair $(c, \hat c)$, i.e., $\bar f^*z = c$ and $\bar f^*\hat z = \hat c$. Since $c \cup \hat c = 0$ by Corollary 2.10, we have a lift $f: B \to R$. Pulling back the universal pairs over $R$ we get pairs $(E, h')$ and $(\hat E, \hat h')$ which are dual to each other. Furthermore, $\pi_!(h') = -\hat c$ and $\hat\pi_!(\hat h') = -c$. By Corollary 2.10 we have $h = h' + \pi^*b$ and $\hat h = \hat h' + \hat\pi^*b$ for some $b \in H^3(B)$. Hence $\pi_!(h) = \pi_!(h') = -\hat c$ and $\hat\pi_!(\hat h) = \hat\pi_!(\hat h') = -c$.

2.4.9. Note that there is a natural action of $H^3(B, \mathbb{Z})$ on the set $P(B)$ given by $\beta[E, h] := [E, h + \pi^*\beta]$, $\beta \in H^3(B, \mathbb{Z})$.

Lemma 2.34. The $T$-duality transformation is equivariant with respect to this action of $H^3(B, \mathbb{Z})$.

Proof. This is an immediate consequence of Corollary 2.10.

2.4.10. By Theorem 2.16 we already knew that the equivalence class of pairs dual to $(E, h)$ is unique, if such dual pairs exist at all. The new information obtained from the study of the topology of the classifying space is the existence of pairs dual to $(E, h)$. More significantly, note that our proof of the uniqueness part of Theorem 2.16 involves Lemma 2.33, whose proof also depends on the knowledge of the topology of $R$.

3. $T$-Duality in Twisted Cohomology Theories

3.1. Axioms of twisted cohomology

3.1.1. There may be many explicit models of a twisted cohomology theory which lead to equivalent results, and examples abound in the literature. In particular, this applies to the nature of a twist. What we will describe here is a picture which should be the common core of the various concrete realizations. In any case the twists come as a pre-sheaf of pointed groupoids $B \mapsto T(B)$ on the category of spaces. Let us fix some notation for the main ingredients, which also recalls the concept of a pre-sheaf we use.

First of all, $T(B)$ is a groupoid with a distinguished trivial object $\theta_B$, giving rise to the trivial twist (i.e., to no twist at all). If $f: A \to B$ is a map of spaces, then there is a functor $f^*: T(B) \to T(A)$ preserving the trivial twists. Furthermore, if $g: B \to C$ is a second map, then there exists a natural transformation $\Psi_{f,g}: f^* \circ g^* \to (g \circ f)^*$.
If $h\colon C\to D$ is a third map, then we require that
\[
\Psi_{f,h\circ g}\circ f^*\Psi_{g,h}=\Psi_{g\circ f,h}\circ h^*\Psi_{f,g}.
\]

3.1.2. The following three requirements provide the coupling to topology.

(1) We require that there is a natural transformation $c\colon T(\dots)\to H^3(\dots,\mathbb{Z})$ (the latter is considered as a pre-sheaf of categories in a trivial manner, i.e., with only identity morphisms) which classifies the isomorphism classes of $T(B)$ for each $B$.

(2) If $H,H'\in T(B)$ are equivalent objects, then we require that $\mathrm{Hom}(H,H')$ is a $H^2(B,\mathbb{Z})$-torsor such that the composition with fixed morphisms gives isomorphisms of torsors. Furthermore, we require that the torsor structure is compatible with the pull-back. Note that we have natural bijections $u\colon\mathrm{Hom}(H,H)\to H^2(B,\mathbb{Z})$ which map compositions to sums.

(3) Let $K\in T(\Sigma(B\cup *))$, where $\Sigma(B\cup *):=I\times B/(\{0\}\times B\cup\{1\}\times B)$ is the (reduced) suspension. We have a homotopy $h\colon I\times B\to\Sigma(B\cup *)$ from the constant map $p\colon B\to *\hookrightarrow\Sigma(B\cup *)$ to itself given by $h_t(b)=[t,b]$. It induces a morphism $u(h)\colon p^*K\to p^*K$ as will be explained in 3.1.5. We require that $u(u(h))$ and $c(K)$ correspond to each other under the suspension isomorphism $H^3(\Sigma(B\cup *),\mathbb{Z})\cong H^2(B,\mathbb{Z})$.

3.1.3. Let us list two examples.

(1) In our first example the objects of $T(B)$ are Hitchin gerbes. Recall that a Hitchin gerbe over $X$ is a $U(1)$-extension $H\to G$, where $G$ is an étale groupoid which represents the space $B$. A morphism in $T(B)$ is an equivalence class of equivalences of Hitchin gerbes $u\colon H\to H'$. The isomorphism classes of Hitchin gerbes are classified by the characteristic class $c(H)\in H^3(B,\mathbb{Z})$. We refer to [4] for further details, in particular the torsor structure on the sets of morphisms.

(2) In the second example the objects of $T(B)$ are given by the set of continuous maps $f\colon B\to K(\mathbb{Z},3)$. A morphism $u\colon f\to f'$ is then a homotopy class of homotopies from $f$ to $f'$. We set $c(f):=[f]\in[B,K(\mathbb{Z},3)]\cong H^3(B,\mathbb{Z})$. Recall that $LK(\mathbb{Z},3)\simeq K(\mathbb{Z},2)\times K(\mathbb{Z},3)$. Let $u\colon LK(\mathbb{Z},3)\to K(\mathbb{Z},2)$ be the first projection. The second projection is given by the evaluation map $\mathrm{ev}_0$. An automorphism of $f$ is a homotopy class $[\gamma]\in[B,LK(\mathbb{Z},3)]$ with $\mathrm{ev}_0\circ\gamma=f$. Therefore, automorphisms are classified by $[u\circ\gamma]\in[B,K(\mathbb{Z},2)]\cong H^2(B,\mathbb{Z})$.

3.1.4. In the following we fix some framework of twists and formulate the axioms of a twisted cohomology theory in this framework. We fix a cohomology theory $h$ for which we want to define a twisted extension.

Definition 3.1. A twisted cohomology theory $h$ extending $h$ associates to each space $X$ and each twist $H\in T(X)$ a $\mathbb{Z}$-graded group $h(X,H)$. To a map $f\colon Y\to X$
it associates a homomorphism $f^*\colon h(X,H)\to h(Y,f^*H)$. To a morphism $u\colon H\to H'$ of twists it associates an isomorphism, natural with respect to pull-backs, $u^*\colon h(X,H)\to h(X,H')$. Finally, we require an integration map $p_!\colon h(Y,p^*H)\to h(X,H)$ of degree $\dim(Y)-\dim(X)$ for a proper $h$-oriented map $p\colon Y\to X$. Integration shall be natural with respect to morphisms in $T(X)$. These structures must satisfy the axioms described below.

Axiom 3.2 (Extension). Let $\theta_X\in T(X)$ denote the trivial twist. There exists a canonical isomorphism $h(X,\theta_X)\to h(X)$ which preserves pull-back and integration over the fiber.

Axiom 3.3 (Functoriality). If $g\colon Z\to Y$ is a second map, then we have
\[
\Psi_{g,f}(H)^*\circ(f\circ g)^*=g^*\circ f^*.
\]
If $v\colon H''\to H$ is another morphism of twists, then we have
\[
v^*\circ u^*=(u\circ v)^*.
\]

3.1.5. Assume that $h\colon\mathbb{R}\times Y\to X$ is a homotopy from $f_0$ to $f_1$, i.e., $f_k=i_k^*(h)$, where $i_k\colon Y\to\mathbb{R}\times Y$ is given by $i_k(x)=(k,x)$, $k=0,1$. Define $F\colon\mathbb{R}\times Y\to\mathbb{R}\times X$, $(t,y)\mapsto(t,h(t,y))$. Observe that for $H\in T(X)$ the twists $(\mathrm{id}_{\mathbb{R}}\times f_0)^*\mathrm{pr}_2^*H$ and $F^*\mathrm{pr}_2^*H$ on $\mathbb{R}\times Y$ are isomorphic, since we can by assumption read off the isomorphism class from the pull-backs of the corresponding classifying cohomology class, which are equal by homotopy invariance of cohomology. We define $u(h)\colon(\mathrm{id}_{\mathbb{R}}\times f_0)^*\mathrm{pr}_2^*H\to F^*\mathrm{pr}_2^*H$ to be the unique morphism of twists such that
\[
f_0^*H\overset{\mathrm{can}}{\cong}i_0^*\circ(\mathrm{id}_{\mathbb{R}}\times f_0)^*\circ\mathrm{pr}_2^*H\overset{i_0^*(u(h))}{\cong}i_0^*\circ F^*\circ\mathrm{pr}_2^*H\overset{\mathrm{can}}{\cong}f_0^*H
\]
is the identity. The morphism $u(h)$ is determined uniquely this way since $i_0^*\colon H^2(\mathbb{R}\times Y,\mathbb{Z})\to H^2(Y,\mathbb{Z})$ is an isomorphism. The canonical isomorphisms are induced by Axiom 3.3. Note that $u(h)$ is natural with respect to morphisms in $T(X)$. Finally we define
\[
v(F)\colon f_0^*H\overset{\mathrm{can}}{\cong}i_1^*\circ(\mathrm{id}_{\mathbb{R}}\times f_0)^*\circ\mathrm{pr}_2^*H\overset{i_1^*(u(h))}{\cong}i_1^*\circ F^*\circ\mathrm{pr}_2^*H\overset{\mathrm{can}}{\cong}f_1^*H.
\]

Axiom 3.4 (Homotopy Invariance). With these conventions we require that
\[
v^*\circ f_1^*=f_0^*.
\]
Axiom 3.5 (Integration).

(1) Functoriality. If $q\colon Z\to Y$ is a further proper $h$-oriented map, then we have
\[
p_!\circ q_!\circ\Psi_{p,q}(H)^*=(q\circ p)_!\colon h(Z,(q\circ p)^*(H))\to h(X,H).
\]

(2) Naturality. If $g\colon Z\to X$ is a further map, then we have the Cartesian diagram
\[
\begin{array}{ccc}
Z\times_X Y & \xrightarrow{\;p^*g\;} & Y\\
{\scriptstyle g^*p}\downarrow & & \downarrow{\scriptstyle p}\\
Z & \xrightarrow{\;\;g\;\;} & X
\end{array}
\]
and we require that
\[
(g^*p)_!\circ(\Psi_{g,g^*p})(H)^*\circ(\Psi_{p,p^*g}(H)^*)^{-1}\circ(p^*g)^*=g^*\circ p_!.
\]

Axiom 3.6 (Mayer–Vietoris Sequence). If $X=U\cup V$ is a decomposition by open subsets, then we can find a function $\phi\colon X\to\mathbb{R}$ such that $\phi|_{X\setminus U}=1$, $\phi|_{X\setminus V}=-1$, and the inclusion $i\colon(Y:=\{\phi=0\})\to X$ is a proper naturally $h$-oriented map. Let $j\colon Y\to U\cap V$, $g\colon U\to X$, $h\colon V\to X$, $k\colon U\cap V\to U$, $l\colon U\cap V\to V$ and $r\colon U\cap V\to X$ denote the inclusions, and define $\delta:=i_!\circ j^*$. (Note that $\delta$ is independent of the choice of $\phi$.) Then we require that the following sequence is exact:
\[
\cdots\to h(U\cap V,r^*H)\xrightarrow{\;\delta\;}h(X,H)\xrightarrow{(g^*,h^*)}h(U,g^*H)\oplus h(V,h^*H)\xrightarrow{k^*-l^*}h(U\cap V,r^*H)\to\cdots,
\]
where some canonical isomorphisms are suppressed in the notation.

3.1.6. Examples of twisted cohomology theories which satisfy these axioms (on the category of smooth manifolds and smooth maps) are twisted de Rham cohomology and twisted $\mathrm{Spin}^c$-cobordism theory [4] and [5]. In these examples twists are Hitchin gerbes. As indicated in [5] there should also be a twisted version of complex K-theory. In this case the missing piece in the literature is a nice description of integration over the fiber and also of the boundary operator in the Mayer–Vietoris sequence.

3.2. T-admissibility

3.2.1. We consider the unit sphere $S\subset\mathbb{C}^2=\mathbb{C}\oplus\mathbb{C}$. Let $E:=S^1$ and $\hat E:=S^1$. We consider the embeddings $i\colon E\to S$, $i(z)=(z,0)$ and $\hat i\colon\hat E\to S$, $\hat i(\hat z)=(0,\hat z)$. Let $T:=E\times\hat E$ and let $p\colon T\to E$ and $\hat p\colon T\to\hat E$ denote the projections. We define the homotopy $h\colon I\times T\to S$ from $i\circ p$ to $\hat i\circ\hat p$ by
\[
h_t(z,\hat z):=\frac{1}{\sqrt{2}}\bigl(\sqrt{1-t^2}\,z,\;t\hat z\bigr).
\]
Let $K\in T(S)$ be a twist such that $\langle c(K),[S]\rangle=1$. We define $H:=i^*K$ and $\hat H:=\hat i^*K$. The homotopy $h$ induces a unique morphism
\[
u\colon\hat p^*\hat H=\hat p^*\hat i^*K\overset{\Psi_{\hat p,\hat i}(K)}{\cong}(\hat i\circ\hat p)^*K\overset{u(h)}{\cong}(i\circ p)^*K\overset{\Psi_{p,i}(K)^{-1}}{\cong}p^*i^*K=p^*H,
\]
where $u(h)$ is defined in Sec. 3.1.5.

3.2.2. Let $h$ be a twisted cohomology theory. Note that $\hat p$ is canonically $h$-oriented since $T\hat E$ is canonically trivialized by the $U(1)$-action.

Definition 3.7. We say that the twisted cohomology theory $h$ is T-admissible if
\[
\hat p_!\circ u(h)^*\circ p^*\colon h(E,H)\to h(\hat E,\hat H)
\]
is an isomorphism. Note that the map has degree $-1$.

3.2.3. Naturality implies that T-admissibility does not depend on the choice of $K$ inside its isomorphism class.

3.2.4. We show now how one can check T-admissibility in practice. Let $S/(E\cup\hat E)$ be the quotient space of $S$ where $i(E)$ and $\hat i(\hat E)$ are identified to one point. We have a natural identification $r\colon S/(E\cup\hat E)\cong\Sigma(T\cup *)$ given by the homotopy $h$ used in Sec. 3.2.1. Note that $p^*r^*\colon H^3(\Sigma(T\cup *),\mathbb{Z})\to H^3(S,\mathbb{Z})$ is an isomorphism, where $p\colon S\to S/(E\cup\hat E)$ is the projection. Thus, we can choose $K:=p^*r^*\tilde K$ for some twist $\tilde K\in T(\Sigma(T\cup *))$ such that $c(\tilde K)\in H^3(\Sigma(T\cup *),\mathbb{Z})\cong\mathbb{Z}$ is a generator. Since $H^3(*,\mathbb{Z})=0=H^2(*,\mathbb{Z})$, the restriction of $\tilde K$ to the base point is the trivial twist. Then we obtain canonical morphisms $H\cong\theta_E$ and $\hat H\cong\theta_{\hat E}$. The homotopy $r\circ h$ induces now a canonical morphism $u(r\circ h)\colon\theta_T\to\theta_T$. By the third property stated in 3.1.2 we know that $u(u(r\circ h))\in H^2(T,\mathbb{Z})\cong\mathbb{Z}$ is a generator too. The determination of this generator involves the precise understanding of the isomorphism $p^*r^*$ and of the suspension isomorphism. Note that $H^2(T,\mathbb{Z})$ acts naturally on $h(T)$ via the identifications $H^2(T,\mathbb{Z})\cong\mathrm{Hom}(\theta_T,\theta_T)$ and $h(T)\cong h(T,\theta_T)$. For $g\in H^2(T,\mathbb{Z})$ we denote this action by $g^*$. Therefore, in order to check that the cohomology theory $h$ is T-admissible, it suffices to show that
\[
\hat p_!\circ g^*\circ p^*\colon h(E)\to h(\hat E)
\]
is an isomorphism if $g\in H^2(T,\mathbb{Z})$ is a generator.

3.2.5. Lemma 3.8. Twisted K-theory is T-admissible.

Proof. Let $l\in K^0(T)$ be the class of the line bundle over $T$ with first Chern class equal to $g\in H^2(T,\mathbb{Z})=\mathbb{Z}$. Then $g^*$ is induced by the cup product with $l$.
Let $1\in K^0(S^1)$ and $u\in K^1(S^1)$ be the generators. One can compute
\[
\hat p_!\circ g^*\circ p^*(1)=gB(u),\qquad\hat p_!\circ g^*\circ p^*(u)=1,
\]
where $B\colon K^1\to K^{-1}$ is the Bott periodicity transformation. This is indeed an isomorphism if $g\in\{1,-1\}$.

3.2.6. We consider the graded ring $R:=\mathbb{R}[z,z^{-1}]$, where $\deg(z)=-2$, and the twisted cohomology $H_R(X,H)$, where we use $z$ in order to couple the twist.

Lemma 3.9. Twisted cohomology with coefficients in $R$ is T-admissible.

Proof. The action of $g\in H^2(T,\mathbb{Z})$ is given by the cup product with $1+zg_R$, where $g_R$ is the image of $g$ in $H_R(T)$. By a simple computation
\[
\hat p_!\circ g^*\circ p^*(1)=zgu,\qquad\hat p_!\circ g^*\circ p^*(u)=1.
\]
This is indeed an isomorphism if $g\neq 0$.
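The two computations above can be summarized in matrix form. The following is only a sketch under stated assumptions: $h(E)$ and $h(\hat E)$ are treated as free of rank two over the coefficient ring with basis $\{1,u\}$, the Bott transformation $B$ is treated as an invertible degree-shifting factor, and sign conventions are ignored. The names $M_R$ and $M_K$ are introduced here for illustration only.

```latex
% Columns are the images of the basis elements 1 and u under \hat p_! \circ g^* \circ p^*.
\[
M_R=\begin{pmatrix}0&1\\ zg&0\end{pmatrix},\quad\det M_R=-zg,
\qquad
M_K=\begin{pmatrix}0&1\\ gB&0\end{pmatrix},\quad\det M_K=-gB.
\]
% Over R = \mathbb{R}[z,z^{-1}] the element zg is a unit iff g \neq 0.
% Over the integers, gB is invertible iff g = \pm 1.
\[
M_R^2=\begin{pmatrix}zg&0\\ 0&zg\end{pmatrix}=zg\cdot\mathrm{id}.
\]
```

If the transformation in the reverse direction has the same matrix (an assumption, up to sign), the last line says that the round trip is multiplication by $zg$, an element of degree $-2$.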
3.2.7. T-admissibility is a strong condition on $h$. It implies for example that
\[
p_!\circ g^*\circ\hat p^*\circ\hat p_!\circ g^*\circ p^*\colon h(E)\to h(E)
\]
is an isomorphism of degree $-2$. This isomorphism induces a two-periodicity of $h(E)$. Here is a non-example.

Lemma 3.10. Twisted $\mathrm{Spin}^c$-cobordism is not T-admissible.

Proof. $M\mathrm{Spin}^c(S^1)$ is not two-periodic since it is concentrated in degree $\leq 1$.

3.3. T-duality isomorphisms

3.3.1. We consider two pairs $(E,h)$ and $(\hat E,\hat h)$ over $B$ which are dual to each other. We use the notation of 2.2.1. Let $\mathrm{Th}\in H^3(S(V),\mathbb{Z})$ be a Thom class. We choose a twist $K\in T(S(V))$ such that $c(K)=\mathrm{Th}$. Then we define $H:=i^*K\in T(E)$ and $\hat H:=\hat i^*K\in T(\hat E)$. We have $c(H)=h$ and $c(\hat H)=\hat h$. We consider the diagram
\[
\begin{array}{ccccc}
 & & E\times_B\hat E & & \\
 & {\scriptstyle p}\swarrow & \downarrow{\scriptstyle q} & \searrow{\scriptstyle\hat p} & \\
E & & & & \hat E\\
 & {\scriptstyle\pi}\searrow & & \swarrow{\scriptstyle\hat\pi} & \\
 & & B & &
\end{array}
\tag{3.11}
\]
This is the parameterized version of the situation considered in 3.2.1. In particular, we have a homotopy $h\colon I\times E\times_B\hat E\to S(V)$ from $i\circ p$ to $\hat i\circ\hat p$. It induces
the isomorphism
\[
u\colon\hat p^*\hat H=\hat p^*\hat i^*K\overset{\Psi_{\hat p,\hat i}(K)}{\cong}(\hat i\circ\hat p)^*K\overset{u(h)}{\cong}(i\circ p)^*K\overset{\Psi_{p,i}(K)^{-1}}{\cong}p^*i^*K=p^*H
\]
which is natural under pull-back of bundles. Let $h$ be a twisted cohomology theory.

Definition 3.12. We define the T-duality transformation
\[
T:=\hat p_!\circ u^*\circ p^*\colon h(E,H)\to h(\hat E,\hat H).
\]

3.3.2. The main theorem of the present section is the following. Assume that $B$ is homotopy equivalent to a finite complex.

Theorem 3.13. If $h$ is T-admissible, then the T-duality transformation $T$ is an isomorphism.

Proof. Let $f\colon A\to B$ be a map. Then we use the pull-back of $K$ in order to define the duality transformation $T$ over $A$. Let $F\colon f^*E\to E$ and $\hat F\colon f^*\hat E\to\hat E$ be the induced maps. The statement of the following lemma involves various (not explicitly written) canonical isomorphisms.

Lemma 3.14. We have
\[
T\circ F^*=\hat F^*\circ T\colon h(E,H)\to h(f^*\hat E,\hat F^*\hat H).
\]

Assume that we have a decomposition $B=U\cup V$ with open subsets $U$ and $V$ and let $j\colon U\cap V\to B$ denote the inclusion. By taking pre-images with respect to $\pi$ and $\hat\pi$ we obtain associated decompositions $E=E_U\cup E_V$ and $\hat E=\hat E_U\cup\hat E_V$. Let $f\colon E_U\cap E_V\to E$ and $\hat f\colon\hat E_U\cap\hat E_V\to\hat E$ denote the inclusions. Finally let $\delta\colon h(E_U\cap E_V,f^*H)\to h(E,H)$ and $\hat\delta\colon h(\hat E_U\cap\hat E_V,\hat f^*\hat H)\to h(\hat E,\hat H)$ denote the boundary operators in the Mayer–Vietoris sequences.

Lemma 3.15. We have
\[
T\circ\delta=\hat\delta\circ T\colon h(E_U\cap E_V,f^*H)\to h(\hat E,\hat H).
\]

Assuming these lemmas, the proof of the theorem now goes by induction on the number of cells of $B$. The induction starts with any contractible base since $h$ is T-admissible, using naturality and homotopy invariance. In the induction step we adjoin a cell. We use Lemmas 3.14 and 3.15 in order to see that $T$ induces a map of Mayer–Vietoris sequences. The induction step now follows from the five-lemma.

3.3.3. We now prove Lemma 3.14. Let $G\colon f^*E\times_A f^*\hat E\to E\times_B\hat E$ be the induced map. The assertion follows from the following computation, omitting a number of
canonical isomorphisms:
\begin{align*}
\hat F^*\circ T&=\hat F^*\circ\hat p_!\circ u^*\circ p^*\\
&=(f^*p)_!\circ G^*\circ u^*\circ p^*\\
&=(f^*p)_!\circ(G^*u)^*\circ G^*\circ p^*\\
&=(F^*p)_!\circ(G^*u)^*\circ(F^*p)^*\circ F^*\\
&=T\circ F^*.
\end{align*}

3.3.4. We now prove Lemma 3.15. Let $\phi\in C(B)$ be a function which takes the value $-1$ on $B\setminus V$ and $1$ on $B\setminus U$, and such that the inclusion $i\colon(Y:=\{\phi=0\})\to B$ is canonically $h$-oriented. Let $k\colon Y\to U\cap V$, $I:=\pi^*i$, $\hat I:=\hat\pi^*i$, $K:=\pi^*k$ and $\hat K:=\hat\pi^*k$ denote the corresponding inclusions. Note that $I$ and $\hat I$ have a trivialized normal bundle and thus are canonically $h$-oriented. We have $\delta=I_!\circ K^*$ and $\hat\delta=\hat I_!\circ\hat K^*$. Furthermore, we set $\tilde I:=q^*i$ and $\tilde K:=q^*k$. Finally let $J:=\pi^*j$, $\hat J:=\hat\pi^*j$ and $G:=q^*j$ denote the corresponding embeddings over $j\colon U\cap V\hookrightarrow B$. The assertion of the lemma now follows from the following computation, where canonical isomorphisms are omitted:
\begin{align*}
\hat\delta\circ T&=\hat I_!\circ\hat K^*\circ(\hat J^*p)_!\circ(G^*u)^*\circ(J^*p)^*\\
&=\hat I_!\circ(\hat I^*p)_!\circ(\tilde I^*u)^*\circ(I^*p)^*\circ K^*\\
&=\hat p_!\circ\tilde I_!\circ(\tilde I^*u)^*\circ(I^*p)^*\circ K^*\\
&=\hat p_!\circ u^*\circ\tilde I_!\circ(I^*p)^*\circ K^*\\
&=\hat p_!\circ u^*\circ p^*\circ I_!\circ K^*\\
&=T\circ\delta.
\end{align*}

4. Examples

4.1. The computation of twisted K-theory for 3-manifolds

4.1.1. If $E$ is a closed oriented 3-manifold then isomorphism classes of twists $H$ on $E$ are classified by the number $\langle c(H),[E]\rangle\in\mathbb{Z}$. We fix an equivalence class of twists corresponding to $n\in\mathbb{Z}$. Representatives can be pulled back from $S^3$ using a map of degree one. Note that $K(E,H)$ is independent of the twist in its class up to a non-canonical isomorphism. In the present subsection we want to compute the isomorphism class of this group which we will denote by $K(E,n)$. Our computation is based on the Mayer–Vietoris sequence.

4.1.2. We choose a ball $U\subset E$. Then we have a decomposition $E=U\cup V$ such that $U\cap V\simeq S^2$. We identify the twists on $U$ and $V$ with the trivial twist. We can arrange that under the degree one map to $S^3$ the set $U$ is mapped to the complement of the south pole and $V$ is mapped to the complement of the north pole. Using the relation between twists and morphisms in the suspension $S^3$ of $S^2$ and naturality, we see that a twist in the class $n$ is given by the transition morphism
$v\colon\theta_{S^2}\to\theta_{S^2}$ such that $\langle u(v),[S^2]\rangle=\pm n$. Let $u^*\colon K(S^2)\to K(S^2)$ denote the corresponding automorphism. It acts by the cup product with the class of the line bundle of degree $\pm n$. Then the Mayer–Vietoris sequence reads
\[
\to K(S^2)\xrightarrow{\;\delta\;}K(E,n)\to K(U)\oplus K(V)\xrightarrow{u^*\circ i^*-j^*}K(S^2)\to,
\]
where $i\colon S^2\to U$ and $j\colon S^2\to V$ are the inclusions. At this point we have fixed the sign of the class of the twist.

4.1.3. We identify $K(S^2)\cong\mathbb{Z}I\oplus\mathbb{Z}\theta$, where $I$ is represented by the trivial one-dimensional bundle, and $\theta$ is represented by the difference of a line bundle of degree one and a trivial line bundle. We then have $u^*I=I\pm n\theta$ and $u^*\theta=\theta$.

4.1.4. The Mayer–Vietoris sequence gives
\[
0\to K^0(E,n)\to\mathbb{Z}\oplus K^0(V)\xrightarrow{a:=u^*\circ i^*-j^*}\mathbb{Z}I\oplus\mathbb{Z}\theta\xrightarrow{\;\delta\;}K^1(E,n)\to K^1(V)\to 0.
\]
The restriction of $a$ to the first summand maps $1\in\mathbb{Z}$ to $I\pm n\theta$. If $x\in K^0(V)$, then $x|_{S^2}=\dim(x)I+\langle c_1(x)|_{S^2},[S^2]\rangle\theta$. Now observe that $\langle c_1(x)|_{S^2},[S^2]\rangle=0$ since $S^2$ bounds in $V$. Therefore we have $a(k,x)=(k-\dim(x))I\pm kn\theta$. We conclude that for $n\neq 0$
\[
K^0(E,n)\cong\tilde K^0(V)\cong\tilde K^0(E),
\]
where $\tilde K^0(E):=\ker(\dim)$ is the reduced group. Furthermore, $K^1(E,n)$ fits into a sequence
\[
0\to\mathbb{Z}/n\mathbb{Z}\to K^1(E,n)\to K^1(V)\to 0.
\]
Note that $K^1(V)$ is free abelian and satisfies $\operatorname{rank}K^1(V)=\operatorname{rank}K^1(E)-1$. In particular we get
\[
K^1(E,n)\cong\mathbb{Z}^{\operatorname{rank}K^1(E)-1}\oplus\mathbb{Z}/n\mathbb{Z}.
\]
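The kernel and cokernel of $a$ used above can be read off directly from the formula $a(k,x)=(k-\dim(x))I\pm kn\theta$. The following is a sketch, assuming $n\neq 0$; the element $x_0\in K^0(V)$ (the class of the trivial line bundle, with $\dim x_0=1$) is introduced here for illustration.

```latex
\begin{align*}
\ker a &=\{(k,x) : k=\dim(x),\ kn=0\}
        =\{(0,x) : \dim(x)=0\}\cong\tilde K^0(V),\\
% The \theta-coefficient of a(k,x) always lies in n\mathbb{Z}, so
% \operatorname{im} a \subseteq \mathbb{Z}I \oplus n\mathbb{Z}\theta.
a(1,0) &= I\pm n\theta,\qquad a(0,x_0)=-I
        \quad\Longrightarrow\quad
        \operatorname{im} a=\mathbb{Z}I\oplus n\mathbb{Z}\theta,\\
\operatorname{coker} a &\cong\mathbb{Z}\theta/n\mathbb{Z}\theta\cong\mathbb{Z}/n\mathbb{Z}.
\end{align*}
```

The first line gives $K^0(E,n)\cong\ker a$, and the last line is the subgroup $\mathbb{Z}/n\mathbb{Z}\subset K^1(E,n)$ in the short exact sequence.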
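4.1.5. Let $M$ be a closed oriented surface of genus $g$. The $U(1)$-principal bundles over $M$ are classified by the first Chern class. Let $\pi\colon E_k\to M$ be the bundle with first Chern class $\langle c(E_k),[M]\rangle=k$. We use the Gysin sequence in order to compute the integral cohomology of $E_k$. We get
\[
\begin{array}{c|c|c}
i & H^i(E_k,\mathbb{Z}),\ k\neq 0 & H^i(E_0,\mathbb{Z})\\\hline
0 & \mathbb{Z} & \mathbb{Z}\\
1 & \mathbb{Z}^{2g} & \mathbb{Z}^{2g+1}\\
2 & \mathbb{Z}^{2g}\oplus\mathbb{Z}/k\mathbb{Z} & \mathbb{Z}^{2g+1}\\
3 & \mathbb{Z} & \mathbb{Z}
\end{array}
\]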
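The entries for $k\neq 0$ can be recovered from the Gysin sequence as follows. This is a sketch, assuming the standard values $H^0(M)=\mathbb{Z}$, $H^1(M)=\mathbb{Z}^{2g}$, $H^2(M)=\mathbb{Z}$ (integral coefficients) and that cup product with $c(E_k)$ is multiplication by $k$ from $H^0(M)$ to $H^2(M)$.

```latex
\[
\cdots\to H^{i-2}(M)\xrightarrow{\cup c}H^{i}(M)\xrightarrow{\pi^*}H^{i}(E_k)
\xrightarrow{\pi_!}H^{i-1}(M)\xrightarrow{\cup c}H^{i+1}(M)\to\cdots
\]
\begin{align*}
i=1:&\quad 0\to H^1(M)\to H^1(E_k)\to\ker\bigl(H^0(M)\xrightarrow{\cdot k}H^2(M)\bigr)=0
  &&\Rightarrow\ H^1(E_k)\cong\mathbb{Z}^{2g},\\
i=2:&\quad 0\to\operatorname{coker}\bigl(H^0(M)\xrightarrow{\cdot k}H^2(M)\bigr)\to H^2(E_k)\to H^1(M)\to 0
  &&\Rightarrow\ H^2(E_k)\cong\mathbb{Z}/k\mathbb{Z}\oplus\mathbb{Z}^{2g},\\
i=3:&\quad 0\to H^3(E_k)\to H^2(M)\to 0
  &&\Rightarrow\ H^3(E_k)\cong\mathbb{Z}.
\end{align*}
```

The extension in degree two splits because $H^1(M)$ is free. For $k=0$ the map $\cdot k$ vanishes, so its kernel and cokernel are both $\mathbb{Z}$, which accounts for the extra summand in the second column.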
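4.1.6. We now compute the K-theory of $E_k$ using the Atiyah–Hirzebruch spectral sequence. The second page in the case $k\neq 0$ looks like (vertically periodic)
\[
\begin{array}{c|cccc}
2 & \mathbb{Z} & \mathbb{Z}^{2g} & \mathbb{Z}^{2g}\oplus\mathbb{Z}/k\mathbb{Z} & \mathbb{Z}\\
1 & 0 & 0 & 0 & 0\\
0 & \mathbb{Z} & \mathbb{Z}^{2g} & \mathbb{Z}^{2g}\oplus\mathbb{Z}/k\mathbb{Z} & \mathbb{Z}\\\hline
 & 0 & 1 & 2 & 3
\end{array}
\]
The only possibly non-trivial differential is $d_3^{2,3}$. But since the spectral sequence degenerates rationally, the differential is trivial. We get
\[
\begin{array}{c|c|c}
i & K^i(E_k),\ k\neq 0 & K^i(E_0)\\\hline
0 & \mathbb{Z}^{2g+1}\oplus\mathbb{Z}/k\mathbb{Z} & \mathbb{Z}^{2g+2}\\
1 & \mathbb{Z}^{2g+1} & \mathbb{Z}^{2g+2}
\end{array}
\]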
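The passage from the second page to the table can be made explicit. This is a sketch, assuming degeneration at the second page as argued above and $k\neq 0$; the filtration quotients are the even (respectively odd) cohomology groups of $E_k$.

```latex
\begin{align*}
0\to H^2(E_k)\to K^0(E_k)\to H^0(E_k)\to 0
  &\quad\Longrightarrow\quad
  K^0(E_k)\cong\mathbb{Z}\oplus\mathbb{Z}^{2g}\oplus\mathbb{Z}/k\mathbb{Z},\\
0\to H^3(E_k)\to K^1(E_k)\to H^1(E_k)\to 0
  &\quad\Longrightarrow\quad
  K^1(E_k)\cong\mathbb{Z}\oplus\mathbb{Z}^{2g}.
\end{align*}
```

Both extensions split because the quotients $H^0(E_k)=\mathbb{Z}$ and $H^1(E_k)=\mathbb{Z}^{2g}$ are free abelian.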
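4.1.7. We now use this result in order to compute $K(E_k,n)$. We get for $n\neq 0$
\[
\begin{array}{c|c|c}
i & K^i(E_k,n),\ k\neq 0 & K^i(E_0,n)\\\hline
0 & \mathbb{Z}^{2g}\oplus\mathbb{Z}/k\mathbb{Z} & \mathbb{Z}^{2g+1}\\
1 & \mathbb{Z}^{2g}\oplus\mathbb{Z}/n\mathbb{Z} & \mathbb{Z}^{2g+1}\oplus\mathbb{Z}/n\mathbb{Z}
\end{array}
\]

4.1.8. Let us now verify that this computation confirms T-duality. In fact, the unique dual pair of $(E_k,n\,o_{E_k})$ is $(E_{-n},-k\,o_{E_{-n}})$. Thus T-duality predicts an isomorphism $K(E_k,n)\cong K(E_{-n},-k)$ of degree $-1$. This is in fact compatible with the results above.

4.2. Line bundles over $\mathbb{CP}^r$

4.2.1. Let $p_n\colon E_{n,r}\to\mathbb{CP}^r$ be the $U(1)$-principal bundle with first Chern class $nz$, where $z\in H^2(\mathbb{CP}^r,\mathbb{Z})$ is the canonical generator. We first compute $H(E_n,\mathbb{Z})$ using the Gysin sequence for $p_n$. We get
\[
\begin{array}{c|c|c}
k & H^k(E_n,\mathbb{Z}),\ n\neq 0 & H^k(E_0,\mathbb{Z})\\\hline
0 & \mathbb{Z} & \mathbb{Z}\\
2l,\ 1\leq l\leq r & \mathbb{Z}/n\mathbb{Z} & \mathbb{Z}\\
2l+1,\ 1\leq l\leq r-1 & 0 & \mathbb{Z}\\
2r+1 & \mathbb{Z} & \mathbb{Z}
\end{array}
\]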
Note that $r=\infty$ is permitted in the construction and calculation, and that $E_{n,\infty}$ is a model for $B\mathbb{Z}/n\mathbb{Z}$.
4.2.2. We compute the K-theory of $E_{n,r}$ using the Atiyah–Hirzebruch spectral sequence. We observe that this sequence degenerates. We get
\[
\begin{array}{c|c|c}
i & K^i(E_{n,r}),\ n\neq 0 & K^i(E_{0,r})\\\hline
0 & \mathbb{Z}\oplus A_{n^r} & \mathbb{Z}^{2r+1}\\
1 & \mathbb{Z} & \mathbb{Z}^{2r+1}
\end{array}
\]
where $A_{n^r}$ is an abelian group with $n^r$ elements and with composition series with subquotients $\mathbb{Z}/n\mathbb{Z}$. Using Atiyah's completion theorem $\varprojlim_r\tilde K^0(E_{n,r})\cong\widehat{R(G)}$ we get extra information about these groups, e.g., that the limit is torsion-free. In the particularly simple case $n=2$ we have $\widehat{R(G)}\cong\mathbb{Z}_{(2)}$, which implies that $A_{2^r}\cong\mathbb{Z}/2^r\mathbb{Z}$ is cyclic. For other $n$, in particular if $n$ is a prime number, $A_{n^r}$ can also be computed explicitly by looking at the completion theorem and suitable Leray–Serre spectral sequences; we leave this as an exercise to the reader. A precise answer can be found, e.g., in the book of Gilkey [9, Theorem 4.6.7].

4.2.3. The computation of the cohomology shows that for $r>1$ only $E_{0,r}$ admits non-trivial twists (the case $r=1$ is covered by Sec. 4.1). Let us fix the generator $g\in H^3(E_{0,r})$ such that $(p_0)_!(g)=z$. Then twists $H$ over $E_{0,r}$ are classified by an integer $k\in\mathbb{Z}$ such that $c(H)=kg$. Let $K(E_{0,r},k)$ be the isomorphism class of the twisted K-theory for the twists in the class $k\in\mathbb{Z}$. We can now apply T-duality in order to compute this group. In fact, the unique dual pair of $(E_{0,r},k)$ is $(E_{-k,r},0)$. Thus we get
\[
\begin{array}{c|c}
i & K^i(E_{0,r},k)\\\hline
0 & \mathbb{Z}\\
1 & \mathbb{Z}\oplus A_{n^r}
\end{array}
\]
Note that the calculations of this section, using the results of the present paper, rely on the fact that twisted K-theory is a twisted cohomology theory in the sense of our axioms. As explained earlier, no complete account of such a theory seems to be available in the literature.

4.3. An example where torsion plays a role

4.3.1. As the base space we consider the total space of the bundle $p_k\colon E_{k,r}\to\mathbb{CP}^r$ as in Sec. 4.2 for a prime number $k>1$ and for $r>1$, i.e., we set $B:=E_{k,r}$. We fix a class $0\neq c\in H^2(B,\mathbb{Z})$ and let $F_c$ denote the corresponding $U(1)$-principal bundle over $B$. Since $c$ generates $H^*(B,\mathbb{Z})$ as a ring, except for the top degree, the Gysin sequence of $F_c$ shows that its cohomology vanishes in degrees $1<i<2r+1$, and $H^1(F_c,\mathbb{Z})\cong\mathbb{Z}$.
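The low-degree part of this Gysin computation can be written out. The following is a sketch, assuming the cohomology of $B=E_{k,r}$ from the table in 4.2.1 ($\mathbb{Z}$ in degrees $0$ and $2r+1$, $\mathbb{Z}/k\mathbb{Z}$ in degrees $2l$ with $1\leq l\leq r$, zero otherwise) and that cup product with $c$ is surjective $H^0(B)\to H^2(B)$ and an isomorphism $H^2(B)\to H^4(B)$ (which uses $r>1$ and $k$ prime).

```latex
\begin{align*}
H^1(F_c)&\cong\ker\bigl(H^0(B)\xrightarrow{\cup c}H^2(B)\bigr)
         =\ker(\mathbb{Z}\to\mathbb{Z}/k\mathbb{Z})=k\mathbb{Z}\cong\mathbb{Z},\\
H^2(F_c)&:\quad\operatorname{coker}\bigl(H^0(B)\xrightarrow{\cup c}H^2(B)\bigr)=0,\quad
         \ker\bigl(H^1(B)\to H^3(B)\bigr)=0
         \ \Rightarrow\ H^2(F_c)=0,\\
H^3(F_c)&:\quad\operatorname{coker}\bigl(H^1(B)\to H^3(B)\bigr)=0,\quad
         \ker\bigl(H^2(B)\xrightarrow{\cup c}H^4(B)\bigr)=0
         \ \Rightarrow\ H^3(F_c)=0.
\end{align*}
```

In particular $H^3(F_c,\mathbb{Z})=0$, so the only available twist class on $F_c$ is the trivial one.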
Choose therefore $0=h\in H^3(F_c)$. Since $H^3(B,\mathbb{Z})=0$, there is a unique dual pair $(F_{\hat c},\hat h)$ with Chern class $\hat c=-\pi_!(h)=0$ and $\hat h\in H^3(F_{\hat c})$. Since $\hat c=0$, $F_{\hat c}$ is the trivial bundle, therefore its cohomology is $H^k(F_{\hat c},\mathbb{Z})\cong H^k(B,\mathbb{Z})\oplus H^{k-1}(B,\mathbb{Z})$, e.g., $H^3(F_{\hat c},\mathbb{Z})\cong H^2(B,\mathbb{Z})\cong\mathbb{Z}/k\mathbb{Z}$. Now $\hat h$ is the unique class with $\hat\pi_!(\hat h)=-c$, i.e., $\hat h$ corresponds to $-c$ under the isomorphism $H^3(F_{\hat c},\mathbb{Z})=H^2(B,\mathbb{Z})$. Clearly, if we only worked with differential forms as is done in [2], then we could not distinguish this torsion twist from the trivial one.

4.3.2. The Atiyah–Hirzebruch spectral sequence for $K(F_c)$ degenerates. This shows that $K^0(F_c)\cong\mathbb{Z}^2\cong K^1(F_c)$, whereas $K^0(F_0)\cong K^0(B)\oplus K^1(B)\cong\mathbb{Z}\oplus\mathbb{Z}\oplus A_{k^r}\cong K^1(F_0)$. The T-duality isomorphism identifies $K(F_c)$ with $K(F_0,\hat h)$ for the torsion twist $\hat h$. In particular we see that $K(F_0)\not\cong K(F_0,\hat h)$, which shows that the torsion part of the twist is important.

4.4. Iterated T-duality

4.4.1. Let $T$ denote the group $U(1)\times U(1)$.

Definition 4.1. Two principal $T$-bundles $F\to B$ and $F'\to B$ are isomorphic if there exists an isomorphism of fiber bundles
\[
\begin{array}{ccc}
F & \xrightarrow{\;U\;} & F'\\
\downarrow & & \downarrow\\
B & = & B
\end{array}
\]
such that $U$ is $T$-equivariant.

4.4.2. The group of automorphisms of $T$ is $GL(2,\mathbb{Z})$. If we identify $T\cong\mathbb{R}^2/\mathbb{Z}^2$, then the action of this group on $T$ is induced by the linear action on $\mathbb{R}^2$. Let $\phi\in GL(2,\mathbb{Z})$.

Definition 4.2. Two principal $T$-bundles $F\to B$ and $F'\to B$ are $\phi$-twisted isomorphic if there exists an isomorphism of fiber bundles
\[
\begin{array}{ccc}
F & \xrightarrow{\;U\;} & F'\\
\downarrow & & \downarrow\\
B & = & B
\end{array}
\]
such that $U$ is $\phi$-twisted $T$-equivariant, i.e., $U(p\cdot t)=U(p)\cdot\phi(t)$ for all $p\in F$, $t\in T$.

Assume that $B$ is connected. We say that two $T$-principal bundles over $B$ are twisted isomorphic if they are $\phi$-twisted isomorphic for some (then uniquely determined) $\phi$.

4.4.3. We consider a $T$-principal bundle $\pi\colon F\to B$. We need the subgroups $S_0:=U(1)\times\{1\}\subset T$ and $S_1:=\{1\}\times U(1)\subset T$. We define $E_0:=F/S_0$ and $E_1:=F/S_1$.
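All these spaces fit into a diagram
\[
\begin{array}{ccccc}
 & & F & & \\
 & {\scriptstyle p_0}\swarrow & \downarrow{\scriptstyle\pi} & \searrow{\scriptstyle p_1} & \\
E_0 & & & & E_1\\
 & {\scriptstyle\pi_0}\searrow & & \swarrow{\scriptstyle\pi_1} & \\
 & & B & &
\end{array}
\tag{4.3}
\]
where $p_i$ and $\pi_i$ are $U(1)$-principal bundles in a natural way. We consider a class $h\in H^3(F,\mathbb{Z})$.

Definition 4.4. We say that the pair $(F,h)$ is iteration-dualizable if $h=p_0^*(h_0)+p_1^*(h_1)$ for some $h_i\in H^3(E_i,\mathbb{Z})$.

4.4.4. We can now try to construct a T-dual of $(F,h)$ by iterated T-duality. We first form the dual $({}_0\hat F,{}_0\hat h)$ of the pair $(F\to E_1,h)$. Note that we have the pull-back diagram
\[
\begin{array}{ccc}
F & \xrightarrow{\;p_0\;} & E_0\\
{\scriptstyle p_1=\pi_1^*\pi_0}\downarrow & & \downarrow{\scriptstyle\pi_0}\\
E_1 & \xrightarrow{\;\pi_1\;} & B.
\end{array}
\]
Let $(\hat E_0,\hat h_0)$ be a dual of $(E_0,h_0)$. Then we get ${}_0\hat F$ by the pull-back diagram
\[
\begin{array}{ccc}
{}_0\hat F & \xrightarrow{\;{}_0\hat p_0\;} & \hat E_0\\
{\scriptstyle{}_0\hat p_1=\pi_1^*\hat\pi_0}\downarrow & & \downarrow{\scriptstyle\hat\pi_0}\\
E_1 & \xrightarrow{\;\pi_1\;} & B.
\end{array}
\]
Furthermore we get
\[
{}_0\hat h:={}_0\hat p_0^*(\hat h_0)+{}_0\hat p_1^*(h_1).
\]
Now we form the dual $(\hat F,\hat h)$ of the pair $({}_0\hat F\to\hat E_0,{}_0\hat h)$. Let $(\hat E_1,\hat h_1)$ be the dual of $(E_1,h_1)$. Then we get $\hat F$ by the pull-back
\[
\begin{array}{ccc}
\hat F & \xrightarrow{\;\hat p_0\;} & \hat E_0\\
{\scriptstyle\hat p_1}\downarrow & & \downarrow{\scriptstyle\hat\pi_0}\\
\hat E_1 & \xrightarrow{\;\hat\pi_1\;} & B
\end{array}
\]
and
\[
\hat h=\hat p_0^*(\hat h_0)+\hat p_1^*(\hat h_1).
\tag{4.5}
\]

4.4.5. Note that this construction of the iterated dual involves the choice of a representation $h=p_0^*(h_0)+p_1^*(h_1)$. The goal of the following discussion is to show that the bundle $\hat F\to B$ may depend on this choice even if we consider it up to twisted equivalence. It should be remarked that our examples with a non-unique dual do not depend on the existence of torsion in cohomology and therefore would also show up if we only worked with de Rham cohomology.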
Our example should be contrasted with the constructions of [3], where a very similar definition of T-duality for torus bundles is used, but in which case (at least according to the authors) the T-dual (which exists under conditions similar to ours) is uniquely determined (up to isomorphism). In [10], a different approach to T-duality for torus bundles is used, based on continuous trace algebras over the initial bundle and actions of $\mathbb{R}^n$ on the continuous trace algebra. Under our existence assumption, the construction of [10] also gives rise to a classical dual torus bundle, which is claimed to be uniquely determined, but the proof of this statement is not correct. The relationship to our construction is not quite clear; we plan to investigate this and to give more information about the higher dimensional case in a subsequent paper.

4.4.6. A $T$-principal bundle $F\to B$ gives rise to Chern classes $c_0,c_1\in H^2(B,\mathbb{Z})$ of the bundles $E_0,E_1$. The pair $(c_0,c_1)$ determines the isomorphism class of $F$ by the proof of Corollary 4.7. We consider this pair of Chern classes as a class $c(F)\in H^2(B,\mathbb{Z}^2)$ in the natural way. Then the Chern classes of the dual are $-((\pi_0)_!(h_0),(\pi_1)_!(h_1))$. Note that $GL(2,\mathbb{Z})$ acts on the cohomology with coefficients in $\mathbb{Z}^2$.

4.4.7. Choose now $\phi\in GL(2,\mathbb{Z})$. Then we can define a new $T$-principal bundle ${}^\phi F$. It has the same underlying fiber bundle $F\to B$, but we redefine the action of $T$ such that
\[
\begin{array}{ccc}
F & \xrightarrow{\;\mathrm{id}\;} & {}^\phi F\\
\downarrow & & \downarrow\\
B & = & B
\end{array}
\]
is a $\phi$-twisted isomorphism. Let $\sigma\colon GL(2,\mathbb{Z})\to GL(2,\mathbb{Z})$ be the bijection (of order two)
\[
\begin{pmatrix}a&b\\ c&d\end{pmatrix}\mapsto(ad-bc)\begin{pmatrix}a&-c\\ -b&d\end{pmatrix}.
\]

Lemma 4.6. We have $c({}^\phi F)=\phi^\sigma c(F)$.

Proof. Let $\phi=\begin{pmatrix}a&b\\ c&d\end{pmatrix}$. Then by an easy computation (which has only to be carried out for the generators $\begin{pmatrix}-1&0\\ 0&1\end{pmatrix}$, $\begin{pmatrix}0&1\\ -1&0\end{pmatrix}$, $\begin{pmatrix}1&1\\ -1&0\end{pmatrix}$ of $GL(2,\mathbb{Z})$) we see that ${}^\phi E_0=(E_0^{a}\otimes E_1^{-c})^{\det(\phi)}$ and ${}^\phi E_1=(E_0^{-b}\otimes E_1^{d})^{\det(\phi)}$. Therefore
\[
c({}^\phi F)=\det(\phi)\begin{pmatrix}a&-c\\ -b&d\end{pmatrix}(c_0,c_1).
\]

Corollary 4.7. The twisted isomorphism class of the $T$-principal bundle $F$ is determined precisely by the orbit $GL(2,\mathbb{Z})c(F)\subset H^2(B,\mathbb{Z}^2)$.
Proof. If $F$ and $F'$ are isomorphic $T$-principal bundles over $B$, then $c(F)=c(F')\in H^2(B,\mathbb{Z}^2)$. If $F$ and $F'$ are $\Phi$-twisted isomorphic, then $F$ and ${}^\Phi F'$ are isomorphic, and by Lemma 4.6 $c(F)\in GL(2,\mathbb{Z})c(F')$. Finally, if $F$ and $F'$ are two bundles with $c(F)=c(F')$, then the $U(1)$-principal bundles $E_0$ and $E_0'$ as well as $E_1$ and $E_1'$ are isomorphic. It follows that $F$, as the pull-back of $E_1$ along $E_0$, is isomorphic to the pull-back of $E_1'$ along $E_0'$, i.e., to $F'$. If $c(F)\in GL(2,\mathbb{Z})c(F')$, then $c(F)=c({}^\Phi F')$ for a suitable $\Phi\in GL(2,\mathbb{Z})$ and therefore $F$ and ${}^\Phi F'$ are isomorphic $T$-principal bundles, so that $F$ and $F'$ are twisted isomorphic.

4.4.8. Let $L_i$ be the line bundles associated to $E_i$ and $V:=L_0\oplus L_1$. Let further $s\colon S(V)\to B$ be the unit sphere bundle. Then we have natural embeddings $i_0\colon E_0\to S(V)$ and $i_1\colon E_1\to S(V)$. We have a decomposition $S(V)=D(L_0)\times_B E_1\cup E_0\times_B D(L_1)$. The associated Mayer–Vietoris sequence gives the exact sequence
\[
H^2(F,\mathbb{Z})\to H^3(S(V),\mathbb{Z})\xrightarrow{i_0^*\oplus i_1^*}H^3(E_0,\mathbb{Z})\oplus H^3(E_1,\mathbb{Z})\xrightarrow{p_0^*-p_1^*}H^3(F,\mathbb{Z}).
\tag{4.8}
\]
Let $r_i\in H^3(E_i,\mathbb{Z})$. Then we have $p_0^*(r_0)+p_1^*(r_1)=0$ if and only if $(r_0,-r_1)\in\operatorname{im}(i_0^*\oplus i_1^*)$. If this is satisfied we get a (second) splitting of $0=p_0^*(0)+p_1^*(0)\in H^3(F,\mathbb{Z})$. To understand the corresponding dual, we compute $(\pi_i)_!(r_i)$. The dual of any $T$-bundle with splitting $0=p_0^*(0)+p_1^*(0)$ has Chern class $(0,0)$. If we can find an example as above with $((\pi_0)_!((i_0)^*X),(\pi_1)_!((i_1)^*X))\neq 0$, the latter cannot lie in the $GL(2,\mathbb{Z})$-orbit of $(0,0)$ and therefore not even the underlying bundle of the second dual is twisted isomorphic to the first one.

4.4.9. Choose now $B=S^2$, $c_0=c_1$ the generator of $H^2(S^2,\mathbb{Z})$. Then $E_0=E_1$ has underlying space $S^3$ with the Hopf principal fibration. In this case $(\pi_i)_!\colon H^3(S^3,\mathbb{Z})\to H^2(S^2,\mathbb{Z})$ is an isomorphism. Moreover, $F$ is a $U(1)$-bundle over $S^3$ and therefore $H^2(F,\mathbb{Z})=0$; consequently $i_0^*\oplus i_1^*$ in (4.8) is injective. The Gysin sequence for $S(V)$ gives
\[
H^3(B,\mathbb{Z})\to H^3(S(V),\mathbb{Z})\xrightarrow{\;s_!\;}H^0(B,\mathbb{Z})\xrightarrow{c_0\cup c_1}H^4(B,\mathbb{Z}),
\]
i.e., $H^3(S(V),\mathbb{Z})\cong\mathbb{Z}\neq\{0\}$. It follows that there is $0\neq X\in H^3(S(V),\mathbb{Z})$ such that $i_0^*(X)\oplus i_1^*(X)\neq 0$ and therefore $(\pi_0)_!(i_0^*(X))\oplus(\pi_1)_!(i_1^*(X))\neq 0$, and we are done by the above observation.

Acknowledgments

We thank the referees for their useful comments, in particular with respect to the presentation and the physical interpretation of our results.
References

[1] M. Atiyah and G. Segal, Twisted K-theory, arXiv:math.KT/0407054.
[2] P. Bouwknegt, J. Evslin and V. Mathai, T-Duality: Topology change from H-flux, arXiv:hep-th/0306062.
[3] P. Bouwknegt, K. Hannabuss and V. Mathai, T-Duality for principal torus bundles, arXiv:hep-th/0312284.
[4] U. Bunke and Th. Schick, Twisted Spin^c-cobordism, preprint (2003).
[5] U. Bunke and Th. Schick, Twisted characteristic classes for twisted Spin^c-cobordism, preprint (2004).
[6] U. Bunke and Th. Schick, T-duality for non-free circle actions, in preparation (2004).
[7] D. S. Freed, M. J. Hopkins and C. Teleman, Twisted equivariant K-theory with complex coefficients, preprint (2003), arXiv:math.AT/0206257.
[8] P. Gajer, Geometry of Deligne cohomology, Invent. Math. 127 (1997) 155–207.
[9] P. B. Gilkey, Invariance Theory, the Heat Equation, and the Atiyah–Singer Index Theorem (Publish or Perish, Wilmington, 1984).
[10] V. Mathai and J. Rosenberg, T-Duality for torus bundles with H-fluxes via noncommutative topology, arXiv:hep-th/0401168.
[11] J. Mickelsson, Twisted K-theory invariants, arXiv:math-AT/0401130.
[12] I. Raeburn and J. Rosenberg, Crossed products of continuous-trace C*-algebras by smooth actions, Transactions of the AMS 305 (1988) 1–45.
[13] A. Strominger, S. T. Yau and E. Zaslow, Mirror symmetry is T-duality, Nuclear Phys. B479 (1996) 243–259.
March 29, 2005 8:59 WSPC/148-RMP
J070-00230
Reviews in Mathematical Physics Vol. 17, No. 2 (2005) 113–173 c World Scientific Publishing Company
THERMAL QUANTUM FIELDS WITHOUT CUT-OFFS IN 1+1 SPACE-TIME DIMENSIONS
CHRISTIAN GÉRARD
Université Paris Sud XI, F-91405 Orsay, France
[email protected]

CHRISTIAN D. JÄKEL
Math. Inst. der LMU, Theresienstr. 39, 80333 München
[email protected]

Received 12 May 2004
Revised 04 January 2005
We construct interacting quantum fields in 1+1 dimensional Minkowski space, representing neutral scalar bosons at positive temperature. Our work is based on prior work by Klein and Landau and Høegh-Krohn. Keywords: Constructive field theory; thermal field theory; KMS states. AMS 1991 Subject Classification: 81T08, 82B21, 82B31, 46L55
Contents
1. Introduction
1.1. Content of this paper
2. Stochastically Positive KMS Systems and Generalized Path Spaces
2.1. Stochastically positive KMS systems
2.2. Generalized path spaces
2.3. Perturbations of generalized path spaces
2.4. Perturbed dynamics associated to FKN kernels
3. Gaussian Measures
3.1. Distribution spaces
3.2. Gaussian measures
3.3. Sharp-time fields
3.4. Sharp-space fields
3.5. Some elementary properties
4. Path Spaces Supported by $(S_{\mathbb{R}}(S_\beta\times\mathbb{R}),\Sigma,d\phi_C)$
4.1. The free massive euclidean field on the circle at 0-temperature
4.2. The free massive euclidean field on $\mathbb{R}$ at temperature $\beta^{-1}$
5. Perturbations of Path Spaces
5.1. Interaction terms
5.2. The $P(\phi)_2$ model on the circle $S_\beta$ at temperature 0
5.3. The spatially cutoff $P(\phi)_2$ model on $\mathbb{R}$ at temperature $\beta^{-1}$
6. The Thermodynamic Limit
6.1. Preparations
6.2. The net of local algebras
6.3. Existence of the limiting dynamics
6.4. An identification of local algebras
6.5. Existence of the limiting state
7. Construction of the Interacting Path Space
7.1. Construction of the interacting measure
7.2. Existence and properties of sharp-time fields
7.3. Properties of the interacting β-KMS system
Appendix A. A Time-Dependent Heat Equation
A.1. Existence of solutions
A.2. The dissipative case
A.3. Some additional results
Appendix B. Miscellaneous Results
1. Introduction
Constructive thermal field theory allows one to circumvent (at least in 1+1 space-time dimensions) the severe infrared problems (see e.g. [31]) of thermal perturbation theory. A class of models representing scalar neutral bosons with polynomial interactions was constructed by Høegh-Krohn [21] more than twenty years ago. Shortly afterwards, several related results on the construction of self-interacting thermal fields were announced by Fröhlich [10]. Our first paper was devoted to the construction of neutral and charged thermal fields with spatially cutoff interactions in 1+1 space-time dimensions, using the notion of stochastically positive KMS systems due to Klein and Landau [23].
The construction of interacting thermal quantum fields without cutoffs presented here includes several of the original ideas of Høegh-Krohn [21], but instead of starting from the interacting system in a box we start from the Araki–Woods representation for the free thermal system in infinite volume. This “algebraic” approach eliminates some cumbersome limiting procedures present in Høegh-Krohn’s work due to the introduction of boxes. We provide complete proofs for a number of statements which were only touched upon in Høegh-Krohn’s work. The list of “new” contributions contains the Wick (re-)ordering with respect to different covariance functions, the existence of interacting sharp-time fields, the identification of local algebras, the existence and uniqueness of the solution of Høegh-Krohn’s time-dependent heat equation, local normality of the interacting KMS state, uniqueness of the weak* accumulation point of the sequence of approximating KMS states, and a number of inequalities that enter into a rigorous construction at several points.
Although some of our results were probably already known to the experts more than twenty years ago (most of our work is based on results by Glimm and Jaffe, Høegh-Krohn, Fröhlich, Klein and Landau, and Simon), we feel that it is worthwhile to present the arguments in full detail.
Thermal Quantum Fields without Cut-Offs in 1+1 Space-Time Dimensions
We will provide a detailed description of the content of this paper in the next subsection. But before we do so, we give a rough outline of the main ideas. Let $\mathfrak{h}$ and $\epsilon$ denote the one-particle Hilbert space and the one-particle energy for a single neutral scalar boson. On the Weyl algebra $\mathcal{W}(\mathfrak{h})$ we define a quasi-free $(\tau^\circ, \beta)$-KMS state $\omega_\beta^\circ$ for the time evolution $\{\tau_t^\circ\}_{t\in\mathbb{R}}$ by
\[ \omega_\beta^\circ(W(h)) := e^{-\frac{1}{4}(h,(1+2\rho)h)}, \qquad \tau_t^\circ(W(h)) = W(e^{it\epsilon}h), \qquad h \in \mathfrak{h},\ t \in \mathbb{R}, \]
where $\rho := (e^{\beta\epsilon} - 1)^{-1}$, $\beta > 0$. A convenient realization of the GNS representation associated to the pair $(\mathcal{W}(\mathfrak{h}), \omega_\beta^\circ)$ is the Araki–Woods representation defined by:
\[ \mathcal{H}_{AW} := \Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}}), \qquad \Omega_{AW} := \Omega, \qquad \pi_{AW}(W(h)) = W_{AW}(h) := W_F\big((1+\rho)^{\frac{1}{2}}h \oplus \bar\rho^{\frac{1}{2}}\bar h\big), \qquad h \in \mathfrak{h}. \]
Here $\bar{\mathfrak{h}}$ is the conjugate Hilbert space to $\mathfrak{h}$, $W_F(.)$ denotes the Fock–Weyl operator on $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$ and $\Omega \in \Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$ is the Fock vacuum. The von Neumann algebra generated by $\{\pi_{AW}(W(h)) \mid h \in \mathfrak{h}\}$ is denoted by $\mathcal{R}_{AW}$. The local von Neumann algebra generated by $\{\pi_{AW}(W(h)) \mid h \in \mathfrak{h}_I\}$ is denoted by $\mathcal{R}_{AW}(I)$. Here $I \subset \mathbb{R}$ is an open and bounded interval and $\mathfrak{h}_I$ will be defined in (6.4).
Since $\omega_\beta^\circ$ is $\tau^\circ$-invariant, there exists a standard implementation (see [8]) of the time evolution in the representation $\pi_{AW}$:
\[ e^{iL_{AW}t}\pi_{AW}(A)\Omega_{AW} := \pi_{AW}(\tau_t^\circ(A))\Omega_{AW} \quad\text{and}\quad L_{AW}\Omega_{AW} = 0. \]
The generator $L_{AW}$ of the free time evolution is called the (free) Liouvillean. Euclidean techniques were used in our first paper to define the operator sum
\[ H_l := L_{AW} + \int_{-l}^{l} :\!P(\phi(0,x))\!:_{C_0}\, dx \]
and to show that $H_l$ is essentially selfadjoint. Using Trotter’s product formula as in [15], a finite propagation speed argument shows that
\[ \tau_t^l(A) = e^{iH_l t} A e^{-iH_l t} \]
is independent of $l$ for $t \in \mathbb{R}$ and $A \in \mathcal{R}_{AW}(I)$ fixed, if $I$ is bounded and $l$ is sufficiently large. Thus there exists a limiting dynamics $\tau$ such that
\[ \lim_{l\to\infty} \|\tau_t^l(A) - \tau_t(A)\| = 0 \tag{1.1} \]
for all $A \in \mathcal{R}_{AW}(I)$, $I$ bounded. This norm convergence extends to the norm closure
\[ \mathcal{A} := \overline{\bigcup_{I\subset\mathbb{R}} \mathcal{R}_{AW}(I)}^{(*)} \]
of the local von Neumann algebras. The $C^*$-algebra $\mathcal{A}$ is called the algebra of local observables. It follows from general results of [23] that the vector $\Omega_l \in \mathcal{H}_{AW}$,
\[ \Omega_l := \frac{e^{-\frac{\beta}{2}H_l}\Omega_{AW}}{\|e^{-\frac{\beta}{2}H_l}\Omega_{AW}\|}, \tag{1.2} \]
induces a $(\tau^l, \beta)$-KMS state $\omega_l$ for the $W^*$-dynamical system $(\mathcal{A}, \tau^l)$. Equation (1.2) should be compared with similar expressions which are well known (see e.g. [4, Theorem 5.4.4]) for bounded perturbations and which have recently been derived for a class of unbounded perturbations in [8, Theorem 5.6].
The existence of weak limit points (which are states) of the net $\{\omega_l\}_{l>0}$ is a consequence of the Banach–Alaoglu theorem (see [4, Theorem 2.3.15]). The fact that all limit states satisfy the KMS condition w.r.t. the pair $(\mathcal{A}, \tau)$ follows from (1.1), which itself is a consequence of finite propagation speed. Since $\mathcal{A}$ is the norm closure of the weakly closed local algebras, all limit points are locally normal KMS states w.r.t. the Araki–Woods representation [32]. To prove that there is only one accumulation point is more delicate. Following Høegh-Krohn [21] we will use Nelson symmetry to relate the interacting vacuum theory on the circle to the interacting thermal model on the real line.
1.1. Content of this paper
In Sec. 2 we recall the notions of stochastically positive KMS systems and associated generalized path spaces, due to Klein and Landau [23]. The property corresponding to stochastic positivity in the 0-temperature case is called Nelson–Symanzik positivity.
In Sec. 2.1 we recall the characterization of the thermal equilibrium states of a dynamical system $(\mathcal{B}, \tau)$ by the KMS condition and the definition of Euclidean Green’s functions. The notion of a stochastically positive KMS system $(\mathcal{B}, \mathcal{U}, \tau, \omega)$ rests on the introduction of a distinguished abelian sub-algebra $\mathcal{U}$ of the observable algebra $\mathcal{B}$. In our case this algebra will be the algebra generated by the time-zero fields.
In Sec. 2.2 we recall the notion of a generalized path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$.
It consists of a probability space $(Q, \Sigma, \mu)$, a distinguished sub-σ-algebra $\Sigma_0$, a one-parameter group $t \mapsto U(t)$ of automorphisms of $L^\infty(Q, \Sigma, \mu)$ such that $\Sigma = \bigvee_{t\in\mathbb{R}} U(t)\Sigma_0$ and a reflection $R$, acting as an automorphism on $L^\infty(Q, \Sigma, \mu)$ such that $R^2 = \mathbb{1}$, $RU(t) = U(-t)R$. Klein and Landau (see [23]) have shown that for $\beta > 0$ there is a one to one correspondence between stochastically positive β-KMS systems and β-periodic OS-positive path spaces (for $\beta = \infty$ the object associated to an OS-positive path space is called a positive semigroup structure, see [22]). The role of OS-positivity is to ensure the positivity of the inner product in the Hilbert space $\mathcal{H}$ on which the real time quantum fields act. A similar reconstruction theorem allowing one to go from
Euclidean Green’s functions to a KMS system was also shown in a slightly different framework in [16, 17]. The case of Euclidean structures corresponding to quasi-free KMS states (which give rise to Gaussian path spaces) was considered in [18, 24]. The reconstruction theorem provides a concrete realization of the GNS triple $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ associated to the pair $(\mathcal{B}, \omega)$. The Liouvillean $L$ implements the time evolution in the GNS representation $\pi_\omega$.
In Sec. 2.3 we recall some results from [23] (with some improvements in [12]) concerning perturbations of generalized path spaces obtained from Feynman–Kac–Nelson kernels. The main examples of FKN kernels are those obtained from a selfadjoint operator $V$ on the physical Hilbert space $\mathcal{H}_\omega$, which is affiliated to $\mathcal{U} \cong L^\infty(K, \nu_\omega)$. If $e^{-\beta V} \in L^1(K, \nu_\omega)$ and
\[ V \in L^p(K, \nu_\omega), \qquad e^{-\frac{\beta}{2}V} \in L^q(K, \nu_\omega) \quad\text{for } p^{-1} + q^{-1} = \tfrac{1}{2},\ 2 \le p, q \le \infty, \]
then the operator sum $L + V$ is essentially selfadjoint on $D(L) \cap D(V)$ and the perturbed time-evolution $\tau_V$ on $\mathcal{B}$ is given by
\[ \tau_{V,t}(B) = e^{it\overline{L+V}}\, B\, e^{-it\overline{L+V}}. \]
The KMS state $\omega_V$ for the pair $(\mathcal{B}, \tau_V)$ is the vector state induced by
\[ \Omega_V := \frac{e^{-\frac{\beta}{2}\overline{L+V}}\,\Omega_\omega}{\|e^{-\frac{\beta}{2}\overline{L+V}}\,\Omega_\omega\|}. \]
The Liouvillean $L_V$ for the perturbed β-KMS system $(\mathcal{B}_V, \tau_V, \omega_V)$ equals $\overline{L + V - JVJ}$. ($J$ denotes the modular conjugation associated to the pair $(\mathcal{B}, \Omega)$.) It satisfies
\[ e^{itL_V} A\,\Omega_V = \tau_{V,t}(A)\,\Omega_V \quad\text{and}\quad L_V\Omega_V = 0. \]
In Sec. 3 we recall some standard facts about Gaussian measures on distribution spaces and fix some notation. Gaussian measures are reviewed in Sec. 3.2. Sharp-time free fields are introduced in Sec. 3.3. If the space dimension $d$ is one, then it is possible to define similarly sharp-space free fields. This is done in Sec. 3.4.
In Sec. 4 we recall two well known path spaces supported by $(\mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R}), d\phi_C)$, where $S_\beta$ is the circle of length $\beta$. In Sec. 4.1 we identify the generalized path space on $(\mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R}), d\phi_C)$ corresponding to the free massive scalar field on the circle $S_\beta$ at temperature 0.
In Sec. 4.2 we identify the generalized path space on $(\mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R}), d\phi_C)$ corresponding to the free massive scalar field on the real line $\mathbb{R}$ at temperature $\beta^{-1}$. The physical Hilbert space associated to this path space can be unitarily identified with the Fock space $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$. The KMS vector $\Omega_{AW}$ is identified with the Fock vacuum vector $\Omega$ in $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$. The dynamics $\tau^\circ$ can be unitarily implemented in $\pi_{AW}$: the (free) Liouvillean $L_{AW}$ is identified with $d\Gamma(\epsilon \oplus -\bar\epsilon)$.
In Sec. 5 we describe perturbations of the two path spaces defined in Secs. 4.1 and 4.2. The perturbed path spaces are obtained from FKN kernels corresponding to $P(\phi)_2$ interactions.
In Sec. 5.1 we recall some well known facts concerning the Wick ordering of Gaussian random variables. In 1+1 space-time dimensions Wick ordering is sufficient to eliminate the UV divergences of polynomial interactions. As it turns out, the leading order in the UV divergences is independent of the temperature. Thus it is a matter of convenience whether one uses the thermal covariance function $C_0$ or the vacuum covariance function $C_{\rm vac}$ to define the Wick ordering.
In Sec. 5.2 the $P(\phi)_2$ model on the circle $S_\beta$ at temperature 0 is discussed. It is specified by the formal interaction
\[ V_C = \int_{S_\beta} :\!P(\phi(t,0))\!:_{C_\beta}\, dt. \]
Here $P(\lambda)$ is a real-valued polynomial, which is bounded from below. The time-evolution $x \mapsto e^{ixH_C^{\rm ren}}$ is generated by $H_C^{\rm ren} := H_C - E_C$, where $E_C := \inf(\sigma(H_C))$ and
\[ H_C = d\Gamma\big((D_t^2 + m^2)^{\frac{1}{2}}\big) + V_C. \]
The operator $H_C$ is bounded from below and has a unique vacuum state $\omega_C(\,.\,) = (\Omega_C, \,.\;\Omega_C)$ such that $(\Omega_C, \Omega) > 0$ and $H_C^{\rm ren}\Omega_C = 0$. The renormalized energy operator $H_C^{\rm ren}$ is called the $P(\phi)_2$ Hamiltonian on the circle $S_\beta$. Some bounds are provided in Proposition 5.4, which are used in the sequel to prove the existence of interacting sharp-time fields.
The spatially cutoff $P(\phi)_2$ model on the real line $\mathbb{R}$ at temperature $\beta^{-1}$ is discussed in Sec. 5.3. It is specified by the formal interaction
\[ V_l = \int_{-l}^{l} :\!P(\phi(0,x))\!:_{C_0}\, dx. \]
Here $P(\lambda)$ is once again a real-valued polynomial, which is bounded from below, and $l \in \mathbb{R}^+$ is a spatial cutoff parameter. The perturbed KMS state $\omega_l$ turns out to be normal w.r.t. the Araki–Woods representation $\pi_{AW}$. In fact, it is the vector state induced by
\[ \Omega_l := \frac{e^{-\frac{\beta}{2}H_l}\Omega_{AW}}{\|e^{-\frac{\beta}{2}H_l}\Omega_{AW}\|}, \]
where $H_l$ is the selfadjoint operator $H_l := \overline{L_{AW} + V_l}$. The perturbed time-evolution on $\mathcal{B}$ is given by
\[ \tau_t^l(B) := e^{itH_l} B e^{-itH_l}, \qquad B \in \mathcal{B}. \]
The following consequence of Lemma 5.3 will be important in Sec. 7:
\[ e^{-\int_{-\beta/2}^{\beta/2} U(t) \int_{-l}^{l} :P(\phi(0,x)):_{C_0}\, dx\, dt} = e^{-\int_{-l}^{l} U_C(x) \int_{S_\beta} :P(\phi(t,0)):_{C_\beta}\, dt\, dx}. \tag{1.3} \]
The analog of (1.3) in the zero temperature case is called Nelson symmetry (see e.g. [29]).
The thermodynamic limit is discussed in Sec. 6. We prove that the limits
\[ \lim_{l\to+\infty} \tau_t^l(A) =: \tau_t(A) \quad\text{and}\quad \lim_{l\to+\infty} \omega_l(A) =: \omega_\beta(A) \]
exist for $A$ in the $C^*$-algebra of local observables $\mathcal{A}$ and that $(\mathcal{A}, \tau, \omega_\beta)$ is a β-KMS system, describing the translation invariant $P(\phi)_2$ model at temperature $\beta^{-1}$.
In Sec. 6.1 we recall localization properties of the classical solutions of the Klein–Gordon equation.
In Sec. 6.2 we introduce the net of local algebras $I \mapsto \mathcal{R}_{AW}(I)$ for the free thermal field: for a bounded open interval $I \subset \mathbb{R}$, the symbol $\mathcal{R}_{AW}(I)$ denotes the von Neumann algebra generated by $\{W_{AW}(h) \mid h \in \mathfrak{h}_I\}$. By a result of Araki [1], the local von Neumann algebras for the free thermal scalar field are regular from the inside and from the outside:
\[ \bigvee_{\bar J \subset I} \mathcal{R}_{AW}(J) = \mathcal{R}_{AW}(I) = \bigcap_{J \supset \bar I} \mathcal{R}_{AW}(J). \]
Moreover, if $I$ is bounded, then the local algebra $\mathcal{R}_{AW}(I)$ is ∗-isomorphic to the unique hyper-finite factor of type III$_1$.
In Sec. 6.3 the existence of the limiting dynamics is discussed. For $t \in \mathbb{R}$ fixed, the norm limit
\[ \lim_{l\to\infty} \tau_t^l(B) =: \tau_t(B) \]
exists for all $B$ in
\[ \mathcal{A} := \overline{\bigcup_{I\subset\mathbb{R}} \mathcal{R}_{AW}(I)}^{(*)}, \]
where the $I$’s are open and bounded. Finite propagation speed is used to show that $\tau_t^l(B)$, for $B \in \mathcal{R}_{AW}(I)$ and $|t| \le T$, is independent of $l$ for $l > |I| + T$. The proof uses Trotter’s product formula, which requires that $L_{AW} + V_l$ is essentially self-adjoint on $D(L_{AW}) \cap D(V_l)$.
In order to apply the results of Sec. 7 to the $C^*$-algebra $\mathcal{A}$, it is necessary to identify the local von Neumann algebra $\mathcal{R}_{AW}(I)$ with the von Neumann algebra obtained by applying the interacting dynamics $\tau$ to the local abelian algebra of time-zero fields. This is done in Sec. 6.4: for $I \subset \mathbb{R}$ a bounded open interval, we denote by $\mathcal{U}_{AW}(I)$ the abelian von Neumann algebra generated by $\{W_{AW}(h) \mid h \in \mathfrak{h}_I,\ h \text{ real valued}\}$. We denote by $\mathcal{B}_\alpha(I)$ the von Neumann algebra generated by
\[ \{\tau_t(A) \mid A \in \mathcal{U}_{AW}(I),\ |t| < \alpha\}. \]
We set $\mathcal{B}(I) := \bigcap_{\alpha>0} \mathcal{B}_\alpha(I)$ and show that $\mathcal{B}(I) = \mathcal{R}_{AW}(I)$.
Taking the existence of the interacting path space (which we will construct in Sec. 7) for granted, we show that the net $\{\omega_l\}_{l>0}$ has a unique accumulation point. This is done in Sec. 6.5, using the identification of algebras established in the previous subsection. Thus
\[ \mathop{\text{w-}\lim}_{l\to+\infty} \omega_l =: \omega_\beta \]
exists on $\mathcal{A}$.
The state $\omega_\beta$ is a $(\tau, \beta)$-KMS state on $\mathcal{A}$. It follows from a result of Takesaki and Winnink [32] that $\omega_\beta$ is locally normal, i.e., if $I$ is an open and bounded interval, then $\omega_{\beta|\mathcal{R}_{AW}(I)}$ is normal w.r.t. the Araki–Woods representation; thus $\omega_{\beta|\mathcal{R}_{AW}(I)}$ is also normal with respect to the Fock representation. Moreover, $\omega_\beta$ is invariant under spatial translations and satisfies the space-clustering property:
\[ \lim_{x\to\infty} \omega_\beta(A\,\alpha_x(B)) = \omega_\beta(A)\,\omega_\beta(B), \qquad A, B \in \mathcal{A}. \]
Finally, the main result of this paper, namely the explicit construction of the translation invariant $P(\phi)_2$ model at positive temperature, is given in Sec. 7. Following ideas of Høegh-Krohn [21], Nelson symmetry is used to establish the existence of the model in the thermodynamic limit.
The first step is to construct the interacting path space supported by $\mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R})$ describing the translation invariant $P(\phi)_2$ model at temperature $\beta^{-1}$. Following Høegh-Krohn [21] we consider the operator $W_{[-\infty,\infty]}(f)$ solving the time-dependent heat equation
\[ \frac{d}{db} W_{[a,b]}(f) = W_{[a,b]}(f)\big(-H_C^{\rm ren} + i\phi(f_b)\big), \qquad a \le b, \]
where $f_b(\cdot) := f(\cdot, b) \in \mathcal{S}_{\mathbb{R}}(S_\beta)$ for $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R})$. We show that for $f \in C^\infty_{0\,\mathbb{R}}(S_\beta \times \mathbb{R})$,
\[ \lim_{l\to+\infty} \int_Q e^{i\phi(f)}\, d\mu_l = (\Omega_C, W_{[-\infty,\infty]}(f)\Omega_C) \]
exists and that the map
\[ \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R}) \to \mathbb{R}^+, \qquad f \mapsto (\Omega_C, W_{[-\infty,\infty]}(f)\Omega_C) \]
is the generating functional of a Borel probability measure $\mu$ on $(Q, \Sigma)$. The measure $\mu$ is invariant under space translations, time translations and time reflection.
In Sec. 7.2 we prove the existence of interacting sharp-time fields. (Note that the necessary bounds (5.9) depend on the dimension of space-time.) This result allows us to equip the probability space $(Q, \Sigma, \mu)$ with an OS-positive β-periodic path space structure:
- $U(t)$ is the group of transformations generated by the time translations $T_s$ induced on $Q$ by the map $(t, x) \mapsto (t + s, x)$;
- $R$ is the transformation generated by the (euclidean) time reflection at $t = 0$;
- $\Sigma_0$ is the sub-σ-algebra of $\Sigma$ generated by the functions $\{\phi(0, h) \mid h \in \mathcal{S}_{\mathbb{R}}(\mathbb{R})\}$.
In Sec. 7.3 some properties of the associated interacting β-KMS system $(\mathcal{B}, \mathcal{U}, \tilde\tau, \tilde\omega)$ are discussed. We prove the convergence of sharp-time Schwinger functions and show that
\[ \tilde\omega\big(\alpha_x(W_{AW}(h))\big) = \tilde\omega\big(W_{AW}(h)\big) \]
for all $x \in \mathbb{R}$ and
\[ \lim_{x\to\infty} \tilde\omega\big(W_{AW}(h)\,\alpha_x(W_{AW}(g))\big) = \tilde\omega\big(W_{AW}(h)\big)\,\tilde\omega\big(W_{AW}(g)\big) \]
for $h, g \in C^\infty_{0\,\mathbb{R}}(\mathbb{R})$.
In Appendix A we discuss the abstract time-dependent heat equation
\[ \frac{d}{dt} U(t,s) = \big(-H + iR(t)\big)U(t,s), \quad s \le t, \qquad U(s,s) = \mathbb{1}. \tag{1.4} \]
Here $H \ge 0$ is a selfadjoint operator on a Hilbert space $\mathcal{H}$ and $R(t)$, $t \in \mathbb{R}$, is a family of closed operators with $D(H^\gamma) \subset D(R(t))$ for some $0 \le \gamma < 1$. We show that there exists a unique solution $U(t,s)$ such that $U(s,s) = \mathbb{1}$ and $U(t,r)U(r,s) = U(t,s)$ for $s \le r \le t$.
In Sec. A.2 we consider the dissipative case when $R(t)$ is selfadjoint for $t \in \mathbb{R}$. We establish an approximation of $U(t,s)$ by time-ordered products and prove some bounds on $U(t,s)$, which are used in the main text to show the existence of interacting sharp-time fields and the convergence of sharp-time Schwinger functions. Finally we establish a lemma which is used in the main text to prove spatial clustering for the translation invariant $P(\phi)_2$ model at temperature $\beta^{-1}$.
2. Stochastically Positive KMS Systems and Generalized Path Spaces
In this section we briefly recall the notions of stochastically positive KMS systems and associated generalized path spaces, due to Klein and Landau [23]. We will also need the corresponding notions at 0-temperature, which can be found in [22].
2.1. Stochastically positive KMS systems
Let $\mathcal{B}$ be a $C^*$-algebra and let $\{\tau_t\}_{t\in\mathbb{R}}$ be a one-parameter group of ∗-automorphisms of $\mathcal{B}$. We recall that a state $\omega$ on $\mathcal{B}$ is a $(\tau, \beta)$-KMS state, or that $(\mathcal{B}, \tau, \omega)$ is a β-KMS system, if for each pair $A, B \in \mathcal{B}$ there exists a function $F_{A,B}(z)$ holomorphic in the strip $I_\beta^+ = \{z \in \mathbb{C} \mid 0 < \mathrm{Im}\, z < \beta\}$ and continuous on $\overline{I_\beta^+}$ such that
\[ F_{A,B}(t) = \omega(A\tau_t(B)) \quad\text{and}\quad F_{A,B}(t + i\beta) = \omega(\tau_t(B)A) \qquad \forall\, t \in \mathbb{R}. \]
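As a finite-dimensional illustration of the KMS condition (it plays no role in the construction of this paper), the boundary relation $F_{A,B}(t+i\beta) = \omega(\tau_t(B)A)$ can be checked numerically for a Gibbs state on matrices. The following Python sketch assumes a diagonal Hamiltonian, so that $\tau_z(B)_{kj} = e^{iz(E_k - E_j)}B_{kj}$; the function names and the sample data are hypothetical and chosen only for this example.

```python
import cmath
import math

def gibbs_weights(energies, beta):
    """Normalized Boltzmann weights exp(-beta * E_j) / Z for a diagonal Hamiltonian."""
    boltzmann = [math.exp(-beta * e) for e in energies]
    partition = sum(boltzmann)
    return [w / partition for w in boltzmann]

def f_ab(a, b, energies, beta, z):
    """F_{A,B}(z) = omega(A tau_z(B)), with tau_z(B)_{kj} = exp(i z (E_k - E_j)) B_{kj}."""
    w = gibbs_weights(energies, beta)
    n = len(energies)
    total = 0j
    for j in range(n):
        for k in range(n):
            phase = cmath.exp(1j * z * (energies[k] - energies[j]))
            total += w[j] * a[j][k] * phase * b[k][j]
    return total

def omega_tau_b_a(a, b, energies, beta, t):
    """omega(tau_t(B) A) for real t, computed without analytic continuation."""
    w = gibbs_weights(energies, beta)
    n = len(energies)
    total = 0j
    for k in range(n):
        for j in range(n):
            phase = cmath.exp(1j * t * (energies[k] - energies[j]))
            total += w[k] * phase * b[k][j] * a[j][k]
    return total

# Hypothetical sample data: a three-level system at inverse temperature beta.
energies = [0.0, 1.3, 2.1]
beta = 0.7
a = [[1.0, 2.0 - 1.0j, 0.5], [0.0, -1.0, 1.0j], [3.0, 0.25, 2.0]]
b = [[0.5, 1.0j, -2.0], [1.0, 0.0, 0.75], [-1.0j, 2.0, 1.5]]
t = 0.4

lhs = f_ab(a, b, energies, beta, complex(t, beta))  # F_{A,B}(t + i beta)
rhs = omega_tau_b_a(a, b, energies, beta, t)        # omega(tau_t(B) A)
print(abs(lhs - rhs))  # difference at the level of rounding errors
```

In this matrix setting $F_{A,B}$ is an entire function, so holomorphy in the strip is automatic; the content of the KMS condition is the boundary relation, which follows from $e^{-\beta E_j}e^{-\beta(E_k - E_j)} = e^{-\beta E_k}$.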
For $A_i \in \mathcal{B}$ and $t_i \in \mathbb{R}$, $1 \le i \le n$, the Green’s functions are defined as follows:
\[ G(t_1, \dots, t_n; A_1, \dots, A_n) := \omega\Big(\prod_{i=1}^{n} \tau_{t_i}(A_i)\Big). \]
It is well known (see [2, 3]) that the Green’s functions are holomorphic in
\[ I_\beta^{n+} := \{(z_1, \dots, z_n) \in \mathbb{C}^n \mid \mathrm{Im}\, z_i < \mathrm{Im}\, z_{i+1},\ \mathrm{Im}\, z_n - \mathrm{Im}\, z_1 < \beta\}, \]
continuous on $\overline{I_\beta^{n+}}$ and bounded there by $\prod_{1}^{n} \|A_i\|$. Therefore one can define the Euclidean Green’s functions:
\[ {}^E G(s_1, \dots, s_n; A_1, \dots, A_n) := G(is_1, \dots, is_n; A_1, \dots, A_n) \quad\text{for } s_1 \le \cdots \le s_n,\ s_n - s_1 \le \beta. \]
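To make the definition concrete, the following Python sketch evaluates the Euclidean two-point function ${}^E G(s_1, s_2; A, B)$ for a Gibbs state on real matrices with a diagonal Hamiltonian, where $\tau_{is}(X)_{jk} = e^{-s(E_j - E_k)}X_{jk}$. The two-level system, the matrices and all names are assumptions made for illustration only. For $A = B = \sigma_x$ and levels $0$ and $1$ the result can be compared with the closed form $\cosh(\beta/2 - s)/\cosh(\beta/2)$, which is positive on $[0, \beta]$.

```python
import math

def euclidean_two_point(a, b, energies, beta, s1, s2):
    """E_G(s1, s2; A, B) = omega(tau_{i s1}(A) tau_{i s2}(B)) for H = diag(energies).

    Uses tau_{i s}(X)_{jk} = exp(-s (E_j - E_k)) X_{jk}; real matrices are assumed.
    """
    boltzmann = [math.exp(-beta * e) for e in energies]
    partition = sum(boltzmann)
    n = len(energies)
    total = 0.0
    for j in range(n):
        for k in range(n):
            a_jk = math.exp(-s1 * (energies[j] - energies[k])) * a[j][k]
            b_kj = math.exp(-s2 * (energies[k] - energies[j])) * b[k][j]
            total += boltzmann[j] / partition * a_jk * b_kj
    return total

# Hypothetical two-level example.
energies = [0.0, 1.0]
beta = 2.0
sigma_x = [[0.0, 1.0], [1.0, 0.0]]
b_mat = [[1.0, 0.5], [2.0, -1.0]]
s = 0.6

g_xx = euclidean_two_point(sigma_x, sigma_x, energies, beta, 0.0, s)
closed_form = math.cosh(beta / 2.0 - s) / math.cosh(beta / 2.0)

# KMS symmetry of the two-point function: G(0, s; A, B) = G(0, beta - s; B, A).
g_forward = euclidean_two_point(sigma_x, b_mat, energies, beta, 0.0, s)
g_reflected = euclidean_two_point(b_mat, sigma_x, energies, beta, 0.0, beta - s)
print(g_xx, closed_form, g_forward, g_reflected)
```

The reflection symmetry checked in the last lines is the Euclidean counterpart of the KMS boundary condition; it is the reason why β-periodic path spaces appear in the positive temperature case.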
The following class of β-KMS systems has been introduced by Klein and Landau [23].
Definition 2.1. Let $(\mathcal{B}, \tau, \omega)$ be a β-KMS system and let $\mathcal{U} \subset \mathcal{B}$ be an abelian ∗-sub-algebra. The KMS system $(\mathcal{B}, \mathcal{U}, \tau, \omega)$ is stochastically positive if
(i) the $C^*$-algebra generated by $\bigcup_{t\in\mathbb{R}} \tau_t(\mathcal{U})$ is equal to $\mathcal{B}$;
(ii) the Euclidean Green’s functions ${}^E G(s_1, \dots, s_n; A_1, \dots, A_n)$ are positive for all $A_1, \dots, A_n$ in $\mathcal{U}^+ = \{A \in \mathcal{U} \mid A \ge 0\}$.
In applications it is more convenient to use a version of stochastic positivity, which is adapted to von Neumann algebras.
Definition 2.2. Let $\mathcal{B} \subset \mathcal{B}(\mathcal{H})$ be a von Neumann algebra and let $\mathcal{U} \subset \mathcal{B}(\mathcal{H})$ be a weakly closed abelian sub-algebra of $\mathcal{B}$. Assume that the dynamics $\tau: \mathcal{B} \to \mathcal{B}$ is given by
\[ \tau_t(B) := e^{itL} B e^{-itL}, \qquad B \in \mathcal{B}, \]
where $L$ is a selfadjoint operator on $\mathcal{H}$. Moreover, assume that $\omega$ is a β-KMS state for the $W^*$-dynamical system $(\mathcal{B}, \tau)$. Then the KMS system $(\mathcal{B}, \mathcal{U}, \tau, \omega)$ is stochastically positive if
(i) the von Neumann algebra generated by $\bigcup_{t\in\mathbb{R}} \tau_t(\mathcal{U})$ is equal to $\mathcal{B}$;
(ii) the Euclidean Green’s functions ${}^E G(s_1, \dots, s_n; A_1, \dots, A_n)$ are positive for all $A_1, \dots, A_n$ in $\mathcal{U}^+ = \{A \in \mathcal{U} \mid A \ge 0\}$.
2.2. Generalized path spaces
Stochastically positive β-KMS systems can be associated to generalized path spaces (see [23], [22]). Let us first recall some terminology. If $\Xi_i$, for $i$ in an index set $I$, is a family of subsets of a set $Q$, we denote by $\bigvee_{i\in I} \Xi_i$ the σ-algebra generated by all the $\Xi_i$, $i \in I$.
Definition 2.3. A generalized path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ consists of
(i) a probability space $(Q, \Sigma, \mu)$;
(ii) a distinguished sub-σ-algebra $\Sigma_0 \subset \Sigma$;
(iii) a one-parameter group $\mathbb{R} \ni t \mapsto U(t)$ of measure preserving automorphisms of $L^\infty(Q, \Sigma, \mu)$, strongly continuous in measure, such that $\Sigma = \bigvee_{t\in\mathbb{R}} U(t)\Sigma_0$;
(iv) a measure-preserving automorphism $R$ of $L^\infty(Q, \Sigma, \mu)$ such that $R^2 = \mathbb{1}$, $RU(t) = U(-t)R$ and $RE_0 = E_0R$, where $E_0$ is the conditional expectation with respect to $\Sigma_0$.
A path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ is said to be supported by the probability space $(Q, \Sigma, \mu)$. It follows from (iii) and (iv) that $U(t)$ extends to a strongly continuous group of isometries of $L^p(Q, \Sigma, \mu)$ and $R$ extends to an isometry of $L^p(Q, \Sigma, \mu)$ for $1 \le p < \infty$. We say that the path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ is β-periodic for $\beta > 0$ if $U(\beta) = \mathbb{1}$. On a β-periodic path space one can consider the one-parameter group $U(t)$ as being indexed by the circle $S_\beta = [-\beta/2, \beta/2]$. For $I \subset \mathbb{R}$, we denote by $E_I$ the conditional expectation with respect to the σ-algebra $\Sigma_I := \bigvee_{t\in I} \Sigma_t$.
Definition 2.4. (0-temperature case): A generalized path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ is OS-positive if $E_{[0,+\infty[} R E_{[0,+\infty[} \ge 0$ as an operator on $L^2(Q, \Sigma, \mu)$. (Positive temperature case): A β-periodic path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ is OS-positive if $E_{[0,\beta/2]} R E_{[0,\beta/2]} \ge 0$ as an operator on $L^2(Q, \Sigma, \mu)$.
For simplicity of notation we will consider $\beta$ as a parameter in $]0, +\infty]$, the case $\beta = +\infty$ corresponding to the 0-temperature case. It is shown in [23] that for $\beta > 0$ there is a one to one correspondence between stochastically positive β-KMS systems and β-periodic OS-positive path spaces. For $\beta = \infty$ the object associated to an OS-positive path space is called a positive semigroup structure (see [22]). Let us describe in more detail one part of this correspondence, which is an example of a reconstruction theorem.
Let $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ be an OS-positive path space, β-periodic if $\beta < \infty$. We set $\mathcal{H}_{OS} := L^2(Q, \Sigma_{[0,\beta/2]}, \mu)$. Let $\mathcal{N} \subset \mathcal{H}_{OS}$ be the kernel of the positive quadratic form
\[ (\psi, \psi) := \int_Q \overline{\psi}\, R\psi\, d\mu. \]
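Then the physical Hilbert space is
\[ \mathcal{H} := \text{completion of } \mathcal{H}_{OS}/\mathcal{N}, \]
where the completion is done with respect to the positive definite scalar product $(., .)$. Let us denote by $\mathcal{V}$ the canonical map $\mathcal{V}: \mathcal{H}_{OS} \to \mathcal{H}_{OS}/\mathcal{N}$. Then in $\mathcal{H}$ there is the distinguished unit vector $\Omega := \mathcal{V}1$, where $1 \in \mathcal{H}_{OS}$ is the constant function equal to 1 on $Q$. For $A \in L^\infty(Q, \Sigma_0, \mu)$ one defines $\tilde A \in \mathcal{B}(\mathcal{H})$ by
\[ \tilde A\, \mathcal{V}\psi := \mathcal{V}A\psi. \tag{2.1} \]
(Note that multiplication by $A$ preserves $\mathcal{N}$, since $A$ is by assumption $\Sigma_0$-measurable.) One denotes by $\mathcal{U} \subset \mathcal{B}(\mathcal{H})$ the abelian von Neumann algebra $\mathcal{U} := \{\tilde A \mid A \in L^\infty(Q, \Sigma_0, \mu)\}$. It is shown in [23, 22] that the map $A \mapsto \tilde A$ is a weakly continuous ∗-isomorphism between $L^\infty(Q, \Sigma_0, \mu)$ and $\mathcal{U}$.
Finally, setting $\mathcal{M}_t = L^2(Q, \Sigma_{[0,\beta/2-t]}, \mu)$ for $0 \le t \le \beta/2$ and $\mathcal{D}_t = \mathcal{V}\mathcal{M}_t$, one can define $P(s): \mathcal{D}_t \to \mathcal{H}$ for $0 \le s \le t$ by
\[ P(s)\mathcal{V}\psi := \mathcal{V}U(s)\psi, \qquad \psi \in \mathcal{M}_t. \]
The triple $(P(t), \mathcal{D}_t, \beta/2)$ forms a local symmetric semigroup (see [9, 25]) and there exists a unique selfadjoint operator $L$ on $\mathcal{H}$ such that $P(s)u = e^{-sL}u$ for $u \in \mathcal{D}_t$ and $0 \le s \le t$. The selfadjoint operator constructed in this way is said to be associated to the local symmetric semigroup $(P(t), \mathcal{D}_t, \beta/2)$. Next one defines:
- $\mathcal{B} \subset \mathcal{B}(\mathcal{H})$ as the von Neumann algebra generated by $\{e^{itL}Ae^{-itL} \mid t \in \mathbb{R},\ A \in \mathcal{U}\}$;
- $\tau: t \mapsto \tau_t$ as the weakly continuous group of ∗-automorphisms of $\mathcal{B}$, which is given by $\tau_t(B) = e^{itL}Be^{-itL}$ for $t \in \mathbb{R}$ and $B \in \mathcal{B}$;
- $\omega$ as the vector state on $\mathcal{B}$ given by $\omega(B) = (\Omega, B\Omega)$ for $B \in \mathcal{B}$;
- the modular conjugation $J$ associated to the KMS system $(\mathcal{B}, \tau, \omega_\Omega)$ as the unique extension of
\[ J\mathcal{V}\psi = \mathcal{V}\big(\overline{R_{\beta/4}\psi}\big), \qquad \psi \in L^2(Q, \Sigma, \mu), \]
where $R_{\beta/4} := U(\beta/4)RU(-\beta/4) = RU(-\beta/2) = U(\beta/2)R$ is the reflection at $t = \beta/4$ in $\mathcal{H}_{OS}$.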
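The two expressions for the reflection at $t = \beta/4$ follow directly from the relation $RU(t) = U(-t)R$ of Definition 2.3 (iv). Spelled out as a short derivation (a worked check added for the reader, using only properties stated in Definition 2.3 and β-periodicity):

```latex
\begin{align*}
R_{\beta/4} &= U(\beta/4)\,R\,U(-\beta/4) = U(\beta/4)\,U(\beta/4)\,R = U(\beta/2)\,R, \\
R_{\beta/4} &= U(\beta/4)\,R\,U(-\beta/4) = R\,U(-\beta/4)\,U(-\beta/4) = R\,U(-\beta/2), \\
R_{\beta/4}^2 &= U(\beta/2)\,R\,U(\beta/2)\,R = U(\beta/2)\,U(-\beta/2)\,R^2 = \mathbb{1}.
\end{align*}
```

On a β-periodic path space $U(\beta) = \mathbb{1}$, hence $U(\beta/2) = U(-\beta/2)$, which is consistent with the two forms above. The last line shows that $R_{\beta/4}$ is again an involution.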
It is shown in [23] that $(\mathcal{B}, \mathcal{U}, \tau, \omega)$ is a stochastically positive β-KMS system. The relationship between the two objects is fixed by the following identity:
\[ {}^E G(s_1, \dots, s_n; \tilde A_1, \dots, \tilde A_n) = \int_Q \prod_{i=1}^{n} U(s_i)A_i\, d\mu \tag{2.2} \]
for $A_i \in L^\infty(Q, \Sigma_0, \mu)$, $1 \le i \le n$, and $s_1 \le \cdots \le s_n$, $s_n - s_1 \le \beta$.
2.3. Perturbations of generalized path spaces
We now describe perturbations of generalized path spaces obtained from a Feynman–Kac–Nelson kernel. Unless stated otherwise, we will consider the case $\beta < \infty$. Let $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ be an OS-positive β-periodic path space. Let $V$ be a selfadjoint operator on $\mathcal{H}$, which is affiliated to $\mathcal{U}$. Using the isomorphism between $\mathcal{U}$ and $L^\infty(Q, \Sigma_0, \mu)$ we can view $V$ as a real $\Sigma_0$-measurable function on $Q$, which we still denote by $V$.
Assume that $V \in L^1(Q, \Sigma_0, \mu)$ and $\exp(-\beta V) \in L^1(Q, \Sigma_0, \mu)$. Then (see [23] or [12, Proposition 6.2]) the function $F := \exp\big(-\int_{-\beta/2}^{\beta/2} U(t)V\, dt\big)$ belongs to $L^1(Q, \Sigma, \mu)$. One can hence define the perturbed measure $d\mu_V := \big(\int_Q F\, d\mu\big)^{-1} F\, d\mu$. The perturbed path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu_V)$ is OS-positive and β-periodic (see [23]). Hence we can associate to this perturbed path space a stochastically positive β-KMS system $(\mathcal{B}_V, \mathcal{U}_V, \tau_V, \omega_V)$.
The following concrete realization of the perturbed β-KMS system $(\mathcal{B}_V, \mathcal{U}_V, \tau_V, \omega_V)$ has been obtained in [23] (with some improvements in [12]):
- the physical Hilbert space $\mathcal{H}_V$ obtained from the reconstruction theorem outlined in the previous subsection is equal to the physical Hilbert space $\mathcal{H}$ of the unperturbed β-KMS system $(\mathcal{B}, \mathcal{U}, \tau, \omega)$;
- the von Neumann algebra $\mathcal{B}_V$ and the abelian algebra $\mathcal{U}_V$ are equal to $\mathcal{B}$ and $\mathcal{U}$, respectively;
- if in addition $V \in L^{2+\epsilon}(Q, \Sigma_0, \mu)$ for $\epsilon > 0$ or $V \in L^2(Q, \Sigma_0, \mu)$ and $V \ge 0$, then the operator sum $L + V$ is essentially selfadjoint on $D(L) \cap D(V)$ and if $H_V := \overline{L + V}$, then the perturbed time-evolution $\tau_V$ on $\mathcal{B}$ is given by $\tau_{V,t}(B) = e^{itH_V} B e^{-itH_V}$, $B \in \mathcal{B}$;
- the distinguished vector $\Omega$ of the unperturbed KMS system belongs to $D\big(e^{-\frac{\beta}{2}H_V}\big)$ and the perturbed KMS state $\omega_V$ is given by $\omega_V(B) = (\Omega_V, B\Omega_V)$, where
\[ \Omega_V := \frac{e^{-\frac{\beta}{2}H_V}\Omega}{\|e^{-\frac{\beta}{2}H_V}\Omega\|}. \]
The following result is shown in [12, Theorem 6.12]: If $e^{-\beta V} \in L^1(Q, \Sigma_0, \mu)$ and
\[ V \in L^p(Q, \Sigma_0, \mu), \qquad e^{-\frac{\beta}{2}V} \in L^q(Q, \Sigma_0, \mu) \quad\text{for } p^{-1} + q^{-1} = \tfrac{1}{2},\ 2 \le p, q \le \infty, \]
then the operator sum $H_V - JVJ$ is essentially selfadjoint and the Liouvillean $L_V$ (for a general definition of Liouvilleans see, e.g., [8]) for the perturbed β-KMS system $(\mathcal{B}_V, \tau_V, \omega_V)$ is equal to $\overline{H_V - JVJ}$. Here $J$ denotes the modular conjugation associated to the pair $(\mathcal{B}, \Omega)$.
2.4. Perturbed dynamics associated to FKN kernels
Let us describe in more detail the construction of $H_V = \overline{L + V}$ given in [23] which is based on the Feynman–Kac–Nelson formula. Note that the results of this subsection are also valid in the 0-temperature case $\beta = +\infty$. Let $V$ be a real $\Sigma_0$-measurable function such that $V \in L^1(Q, \Sigma_0, \mu)$ and $e^{-TV} \in L^1(Q, \Sigma_0, \mu)$ for some $T > 0$ if $\beta = \infty$ and for $T = \beta$ if $\beta < \infty$. Set
\[ F_{[0,s]} := e^{-\int_0^s U(t)V\, dt}, \qquad 0 \le s \le \inf(T, \beta)/2, \]
which belongs to $L^2(Q, \Sigma_{[0,\inf(T,\beta)/2]}, \mu)$. The family $\{F_{[0,s]}\}_{0\le s\le \inf(T,\beta)/2}$ is called a Feynman–Kac–Nelson kernel. For $0 \le t \le \inf(T, \beta)/2$ we set
\[ \mathcal{M}_t := \text{linear span of } \bigcup_{0\le s\le \inf(T,\beta)/2-t} F_{[0,s]}\, L^\infty\big(Q, \Sigma_{[0,\inf(T,\beta)/2-t]}, \mu\big) \]
and
\[ U_V(s): \mathcal{M}_t \to L^2\big(Q, \Sigma_{[0,\inf(T,\beta)/2]}, \mu\big), \qquad \psi \mapsto F_{[0,s]}U(s)\psi, \qquad 0 \le s \le t. \]
Setting finally
\[ \mathcal{D}_t = \mathcal{V}(\mathcal{M}_t), \tag{2.3} \]
one can show that
\[ P_V(s): \mathcal{D}_t \to \mathcal{H}, \qquad \mathcal{V}(\psi) \mapsto \mathcal{V}(F_{[0,s]}U(s)\psi) \]
is a well-defined linear operator and that $(P_V(t), \mathcal{D}_t, \inf(T, \beta)/2)$ is a local symmetric semigroup on $\mathcal{H}$. Now let $H_V$ be the unique selfadjoint operator associated to the local symmetric semigroup $(P_V(t), \mathcal{D}_t, \inf(T, \beta)/2)$. If $V \in L^{2+\epsilon}(Q, \Sigma_0, \mu)$ for $\epsilon > 0$ or $V \in L^2(Q, \Sigma_0, \mu)$ and $V \ge 0$, then (see [23]) one knows that $H_V = \overline{L + V}$.
In the sequel we will need the following result.
Proposition 2.5. Let $V \in L^{2+\epsilon}(Q, \Sigma_0, \mu)$ be a real function such that $e^{-TV} \in L^1(Q, \Sigma_0, \mu)$ for some $T > 0$ and $V_n := V \mathbb{1}_{\{|V|\le n\}}$ for $n \in \mathbb{N}$. Denote by $L$ the selfadjoint operator on $\mathcal{H}$ associated to the OS-positive path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$. Let $H_n$ be the closure of $L + V - V_n$. Then
\[ e^{-itL} = \mathop{\text{s-}\lim}_{n\to\infty} e^{-itH_n}, \qquad t \in \mathbb{R}. \]
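Note that the selfadjoint operators $H_n$ are associated to local symmetric semigroups $(P_n(t), \mathcal{D}_t^{(n)}, T/2)$ obtained from the FKN kernels
\[ F^{(n)}_{[0,s]} := e^{-\int_0^s U(t)(V - V_n)\, dt} \]
and the operator $L$ is associated to the local symmetric semigroup $(P_\infty(t), \mathcal{D}_t^{(\infty)}, T/2)$ obtained from the FKN kernels $F^{(\infty)}_{[0,s]} = 1$.
Proof. We first claim that
\[ \sup_{0\le s\le T/2} \big\|F^{(n)}_{[0,s]} - 1\big\|_{L^1(Q,\Sigma,\mu)} \to 0 \quad\text{for } n \to \infty. \tag{2.4} \]
In order to prove (2.4), we recall the following bound from [26, Theorem 6.2 (i)]:
\[ \big\|e^{-\int_a^b U(t)V\, dt}\big\|_{L^p(Q,\Sigma,\mu)} \le \big\|e^{-(b-a)V}\big\|_{L^p(Q,\Sigma,\mu)}, \qquad 1 \le p < \infty. \tag{2.5} \]
Now let $W$ be a real measurable function on $Q$. Using $1 - e^{-a} = a\int_0^1 e^{-\theta a}\, d\theta$ we find
\[ 1 - e^{-\int_0^s U(t)W\, dt} = \int_0^s U(t)W\, dt \int_0^1 e^{-\theta\int_0^s U(t)W\, dt}\, d\theta. \]
This yields
\[ \begin{aligned} \big\|1 - e^{-\int_0^s U(t)W\, dt}\big\|_{L^1} &\le |s|\, \|W\|_{L^2} \int_0^1 \big\|e^{-\theta\int_0^s U(t)W\, dt}\big\|_{L^2}\, d\theta \\ &\le |s|\, \|W\|_{L^2} \int_0^1 \big\|e^{-\theta sW}\big\|_{L^2}\, d\theta \\ &\le |s|\, \|W\|_{L^2} \Big(1 + \int_0^1 \big\|e^{\theta sW_-}\big\|_{L^2}\, d\theta\Big) \\ &\le \frac{T}{2}\, \|W\|_{L^2} \big(1 + \|e^{TW_-}\|_{L^1}\big), \end{aligned} \]
where $W_- = \sup(0, -W)$ denotes the negative part of $W$. In the first line we have used the Cauchy–Schwarz inequality and the fact that $U(t)$ is unitary on $L^2(Q, \Sigma, \mu)$, in the second line the estimate (2.5). By assumption $V \in L^2(Q, \Sigma, \mu)$ and $e^{-TV} \in L^1(Q, \Sigma, \mu)$. Thus $V - V_n \to 0$ in $L^2(Q, \Sigma, \mu)$ and $e^{T(V - V_n)_-} \to 1$ in $L^1(Q, \Sigma, \mu)$; in particular $\|e^{T(V - V_n)_-}\|_{L^1}$ stays bounded. Applying the above bound for $W = V - V_n$, we obtain (2.4).
Before we finish the proof, we extract a lemma.
Lemma 2.6. Let $(P_n(t), \mathcal{D}_t^{(n)}, T)$ for $n \in \mathbb{N} \cup \{\infty\}$ be a family of local symmetric semigroups on a Hilbert space $\mathcal{H}$. Let $H_n$, $n \in \mathbb{N} \cup \{\infty\}$, denote the associated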
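selfadjoint operators. Assume that there exists a family $\{\mathcal{L}_t\}$ for $0 < t \le T' \le T$ of subspaces of $\mathcal{H}$ with
\[ \mathcal{L}_t \subset \mathcal{D}_t^{(n)}, \qquad \bigcup_{0<t\le T'} \mathcal{L}_t \ \text{dense in } \mathcal{H}. \tag{2.6} \]
Assume moreover that
\[ \lim_{n\to\infty} (\Psi, P_n(s)\Psi) = (\Psi, P_\infty(s)\Psi), \qquad \Psi \in \mathcal{L}_t,\ 0 \le s \le t \le T', \tag{2.7} \]
\[ \sup_n \sup_{0\le s\le t} (\Psi, P_n(s)\Psi) < \infty, \qquad \Psi \in \mathcal{L}_t,\ 0 \le t \le T'. \tag{2.8} \]
Then $\text{s-}\lim_{n\to\infty} e^{-itH_n} = e^{-itH_\infty}$ for all $t \in \mathbb{R}$.
Proof. Let us fix $0 < t \le T'$ and $\Psi \in \mathcal{L}_t$. From [25, Lemma 1], we know that there exist positive measures $\{\nu_n\}$ on $\mathbb{R}$ such that
\[ (\Psi, P_n(s)\Psi) = \Big\|P_n\Big(\frac{s}{2}\Big)\Psi\Big\|^2 = \int_{\mathbb{R}} e^{-sa}\, d\nu_n(a), \qquad 0 \le s \le t. \]
Moreover, one has (see [25, Lemma 1])
\[ (\Psi, e^{-iyH_n}\Psi) = \int_{\mathbb{R}} e^{-iya}\, d\nu_n(a). \]
Set
\[ f_n(z) := \int_{\mathbb{R}} e^{-za}\, d\nu_n(a), \qquad z \in\, ]0, t[\, + i\mathbb{R}. \]
The family $\{f_n\}$ is uniformly bounded on $]0, t[\, + i\mathbb{R}$ by (2.8) and converges pointwise to $f_\infty$ on $]0, t[$ by (2.7). Applying Lemma B.3 we conclude that $f_n(z)$ converges to $f_\infty(z)$ for all $z \in i\mathbb{R}$. This implies that on $\mathcal{L}_t$
\[ \mathop{\text{w-}\lim}_{n\to\infty} e^{-iyH_n} = e^{-iyH_\infty} \qquad \forall\, y \in \mathbb{R}. \]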
Since by hypothesis 0
Clearly Lt is included in the spaces Dt defined in (2.3) (with V replaced by V − Vn ). Moreover, 0
n→∞
and supn sup0≤s≤t (Ψ, Pn (s)Ψ) < ∞. Hence hypotheses (2.7) and (2.8) of Lemma 2.6 are satisfied. Thus we can apply Lemma 2.6 and this completes the proof of Proposition 2.5.
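3. Gaussian Measures
In this section we recall some standard facts about Gaussian measures on distribution spaces.
3.1. Distribution spaces
Let $S_\beta = [-\beta/2, \beta/2]$ (with endpoints identified) be the circle of length $\beta > 0$. Points in $S_\beta \times \mathbb{R}^d$, $d \ge 1$, will be denoted by $(t, x)$. The Fréchet space of Schwartz functions on $\mathbb{R}^d$ will be denoted by $\mathcal{S}(\mathbb{R}^d)$. For coherence of notation, the Fréchet space $\mathcal{D}(S_\beta)$ of smooth periodic functions on $S_\beta$ will also be denoted by $\mathcal{S}(S_\beta)$. In addition, we denote by $\mathcal{S}(S_\beta \times \mathbb{R}^d)$ the Fréchet space of Schwartz functions on $S_\beta \times \mathbb{R}^d$, i.e., the space of smooth functions on $S_\beta \times \mathbb{R}^d$, which are β-periodic in $t$ and such that for all $p \in \mathbb{N}$ and $\alpha \in \mathbb{N}^d$
\[ (1 + |x|)^{|\alpha|}\, \big|\partial_t^p \partial_x^\alpha f(t, x)\big| \le C_{p,\alpha}. \]
We will denote by $\mathcal{S}'(\mathbb{R}^d)$, $\mathcal{S}'(S_\beta)$ and $\mathcal{S}'(S_\beta \times \mathbb{R}^d)$ the duals of $\mathcal{S}(\mathbb{R}^d)$, $\mathcal{S}(S_\beta)$ and $\mathcal{S}(S_\beta \times \mathbb{R}^d)$. The spaces of real elements in these spaces will be denoted by $\mathcal{S}'_{\mathbb{R}}(\mathbb{R}^d)$, $\mathcal{S}'_{\mathbb{R}}(S_\beta)$ and $\mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R}^d)$.
We set $D_t = i^{-1}\partial_t$ and $D_x = i^{-1}\partial_x$, and we will denote by $D_t^2$ the selfadjoint operator on $L^2(S_\beta)$ defined by
\[ D_t^2 := -\partial_t^2, \qquad D(D_t^2) := \{u \in L^2(S_\beta) \mid \partial_t^2 u \in L^2(S_\beta),\ u(0) = u(\beta)\}. \]
We denote by $D_t^2 + D_x^2$ the selfadjoint operator on $L^2(S_\beta \times \mathbb{R}^d)$ with domain
\[ D(D_t^2 + D_x^2) := \{u \in L^2(S_\beta \times \mathbb{R}^d) \mid (D_t^2 + D_x^2)u \in L^2(S_\beta \times \mathbb{R}^d),\ u \text{ is β-periodic in } t\}. \]
We denote by $\mathcal{S}(\mathbb{Z} \times \mathbb{R}^d)$ the Fréchet space of sequences $\{u_n\}_{n\in\mathbb{Z}}$ with values in $\mathcal{S}(\mathbb{R}^d)$ such that
\[ \sum_{n\in\mathbb{Z}} |n|^p\, \big\|(D_x^2 + x^2)^{p/2} u_n\big\|_{L^2(\mathbb{R}^d)} < \infty \qquad \forall\, p \in \mathbb{N}. \]
We now fix the notation concerning partial Fourier transforms. We first define the (unitary) partial Fourier transform with respect to $t$:
\[ \mathcal{F}_t: \mathcal{S}(S_\beta \times \mathbb{R}^d) \to \mathcal{S}(\mathbb{Z} \times \mathbb{R}^d), \qquad u \mapsto \{\hat u_n\}, \]
where $\hat u_n(x) = \beta^{-\frac{1}{2}} \int_{S_\beta} e^{-i\nu_n t} u(t, x)\, dt$. (The coefficients $\nu_n = 2\pi n/\beta$, $n \in \mathbb{Z}$, are called in physics Matsubara frequencies.) Its inverse is
\[ u(t, x) = \beta^{-\frac{1}{2}} \sum_{n\in\mathbb{Z}} e^{i\nu_n t}\, \hat u_n(x). \]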
C. Gérard & C. D. Jäkel
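The (unitary) partial Fourier transform with respect to $x$ is
$$F_x : \mathcal{S}(S_\beta \times \mathbb{R}^d) \to \mathcal{S}(S_\beta \times \mathbb{R}^d), \qquad u \mapsto \hat{u},$$
where
$$\hat{u}(t, p) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} e^{-ix\cdot p}\, u(t, x)\, dx.$$
Its inverse is
$$u(t, x) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} e^{ix\cdot p}\, \hat{u}(t, p)\, dp.$$

For later use we fix two approximations of the Dirac $\delta$ functions in $t$ and $x$. We set, for $k \in \mathbb{N}$,
$$\delta_k(s) := \beta^{-1} \sum_{|n| \le k} e^{i\nu_n s} \qquad \text{and} \qquad \delta_k(x) := k\chi(kx),$$
where $\chi$ is a function in $C_{0\,\mathbb{R}}^\infty(\mathbb{R}^d)$ with $\int \chi(x)\, dx = 1$.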
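3.2. Gaussian measures

We set
$$C(f, g) = \big(f, (D_t^2 + D_x^2 + m^2)^{-1} g\big), \qquad f, g \in \mathcal{S}(S_\beta \times \mathbb{R}^d), \tag{3.1}$$
where $(\cdot, \cdot)$ is the scalar product on $L^2(S_\beta \times \mathbb{R}^d)$.

Let $Q := \mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R}^d)$ and let $\Sigma$ be the Borel $\sigma$-algebra on $Q$. If $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R}^d)$, then $\phi(f)$ denotes the coordinate function
$$\phi(f) : Q \to \mathbb{C}, \qquad q \mapsto \langle q, f \rangle.$$
Let $F$ be a Borel function on $\mathbb{R}$. Then $F(\phi(f))$ denotes the function
$$F(\phi(f)) : Q \to \mathbb{C}, \qquad q \mapsto F(\langle q, f \rangle).$$
We denote by $d\phi_C$ the Gaussian measure on $(Q, \Sigma)$ with covariance $C$ defined by
$$\int_Q e^{i\phi(f)}\, d\phi_C = e^{-C(f,f)/2}, \qquad f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R}^d). \tag{3.2}$$
We have
$$\int_Q \phi(f)^p\, d\phi_C = \begin{cases} 0, & p \text{ odd}, \\ (p-1)!!\; C(f, f)^{p/2}, & p \text{ even}, \end{cases} \tag{3.3}$$
where $n!! = n(n-2)(n-4)\cdots 1$. One easily deduces from (3.3) that $e^{\phi(f)} \in L^1(Q, \Sigma, d\phi_C)$ if $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R}^d)$. The cylindrical functions $F\big(\phi(f_1), \ldots, \phi(f_n)\big)$, $f_i \in \mathcal{S}_{\mathbb{R}}(\mathbb{R} \times S_\beta)$, $F$ a Borel function on $\mathbb{R}^n$ and $n \in \mathbb{N}$, are dense in $L^p(Q, \Sigma, d\phi_C)$ for $1 \le p < \infty$.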
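The moment formula (3.3) involves only the one-dimensional Gaussian law of the random variable $\phi(f)$, whose variance is $C(f,f)$. The following sketch, which is an illustration and not part of the original text, compares the right-hand side of (3.3) with a direct trapezoidal quadrature of the corresponding one-dimensional Gaussian integral. The function names and the numerical parameters (`half_width`, `points`) are hypothetical.

```python
import math

def double_factorial(n):
    """n!! = n (n-2) (n-4) ... 1, with the convention (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def moment_formula(p, variance):
    """Right-hand side of (3.3) with C(f, f) replaced by `variance`."""
    if p % 2 == 1:
        return 0.0
    return double_factorial(p - 1) * variance ** (p // 2)

def moment_quadrature(p, variance, half_width=12.0, points=4001):
    """Trapezoidal approximation of E[x^p] for x ~ N(0, variance)."""
    sigma = math.sqrt(variance)
    a, b = -half_width * sigma, half_width * sigma
    step = (b - a) / (points - 1)
    total = 0.0
    for j in range(points):
        x = a + j * step
        weight = 0.5 if j in (0, points - 1) else 1.0
        total += weight * x ** p * math.exp(-x * x / (2.0 * variance))
    return total * step / math.sqrt(2.0 * math.pi * variance)
```

For instance, `moment_formula(6, 0.5)` and `moment_quadrature(6, 0.5)` should agree to high accuracy, and odd moments vanish by symmetry.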
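3.3. Sharp-time fields

We now recall some standard results about the existence of sharp-time fields. We will make use of the following well known identity (see [24]):
$$\frac{1}{\beta} \sum_{n \in \mathbb{Z}} \frac{e^{i\nu_n t}}{\nu_n^2 + \epsilon^2} = \frac{e^{-|t|\epsilon} + e^{-(\beta - |t|)\epsilon}}{2\epsilon\,(1 - e^{-\beta\epsilon})} \qquad \text{for } \epsilon > 0,\quad \nu_n = \frac{2\pi n}{\beta},\quad 0 \le |t| \le \beta. \tag{3.4}$$
For $h_1, h_2 \in \mathcal{S}_{\mathbb{R}}(\mathbb{R}^d)$, $0 \le t_1, t_2 \le \beta$, and $k \in \mathbb{N}$
$$C\big(\delta_k(\cdot - t_1) \otimes h_1,\ \delta_k(\cdot - t_2) \otimes h_2\big) = \beta^{-1} \sum_{|n| \le k} e^{i\nu_n(t_1 - t_2)} \big(h_1, (\nu_n^2 + D_x^2 + m^2)^{-1} h_2\big)_{L^2(\mathbb{R}^d)}.$$
Using (3.4) we see that
$$\lim_{k\to\infty} C\big(\delta_k(\cdot - t_1) \otimes h_1,\ \delta_k(\cdot - t_2) \otimes h_2\big) = \Big(h_1,\ \frac{e^{-|t_2 - t_1|\epsilon} + e^{-(\beta - |t_2 - t_1|)\epsilon}}{2\epsilon\,(1 - e^{-\beta\epsilon})}\, h_2\Big)_{L^2(\mathbb{R}^d)},$$
where $\epsilon := (D_x^2 + m^2)^{1/2}$. Using (3.3) this implies that, for $h \in \mathcal{S}_{\mathbb{R}}(\mathbb{R}^d)$ and $t \in S_\beta$ fixed, the sequence of functions $\{\phi(\delta_k(\cdot - t) \otimes h)\}_{k \in \mathbb{N}}$ is Cauchy in $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$. We set
$$\phi(t, h) := \lim_{k\to\infty} \phi(\delta_k(\cdot - t) \otimes h) \tag{3.5}$$
and
$$C_0(t_1, h_1, t_2, h_2) := \Big(h_1,\ \frac{e^{-|t_2 - t_1|\epsilon} + e^{-(\beta - |t_2 - t_1|)\epsilon}}{2\epsilon\,(1 - e^{-\beta\epsilon})}\, h_2\Big)_{L^2(\mathbb{R}^d)}. \tag{3.6}$$
We note that $\phi(t, h)$ belongs to $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$. For later use we define the temperature $\beta^{-1}$ covariance on $\mathbb{R}^d$:
$$C_0(h_1, h_2) := \Big(h_1,\ \frac{1 + e^{-\beta\epsilon}}{2\epsilon\,(1 - e^{-\beta\epsilon})}\, h_2\Big)_{L^2(\mathbb{R}^d)}, \qquad h_1, h_2 \in \mathcal{S}(\mathbb{R}^d). \tag{3.7}$$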
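The Matsubara sum identity (3.4) can be tested numerically for scalar values of the energy. The sketch below is an illustration under stated assumptions and not part of the original text: it treats the energy as a positive number `eps` rather than as an operator, truncates the sum at a hypothetical `cutoff`, and uses hypothetical function names. It also checks that the closed form at $t = 0$ reduces to the kernel $\coth(\beta\epsilon/2)/(2\epsilon)$ appearing in (3.7).

```python
import math

def matsubara_sum(t, eps, beta, cutoff=100000):
    """Truncated left-hand side of (3.4):
    beta^{-1} * sum_{|n|<=cutoff} exp(i nu_n t) / (nu_n^2 + eps^2)."""
    total = 1.0 / eps ** 2  # the n = 0 term
    for n in range(1, cutoff + 1):
        nu = 2.0 * math.pi * n / beta
        # the terms n and -n combine into a cosine
        total += 2.0 * math.cos(nu * t) / (nu ** 2 + eps ** 2)
    return total / beta

def closed_form(t, eps, beta):
    """Right-hand side of (3.4), valid for 0 <= |t| <= beta."""
    numerator = math.exp(-abs(t) * eps) + math.exp(-(beta - abs(t)) * eps)
    return numerator / (2.0 * eps * (1.0 - math.exp(-beta * eps)))
```

The truncation error of the sum decays like the inverse of `cutoff`, so agreement is expected only up to a correspondingly small tolerance.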
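3.4. Sharp-space fields

If $d = 1$, then it is possible to define similarly sharp-space fields. We first recall another well known identity, which is analogous to (3.4):
$$(2\pi)^{-1} \int_{\mathbb{R}} \frac{e^{ipx}}{p^2 + b^2}\, dp = \frac{e^{-b|x|}}{2b} \qquad \text{for } b > 0,\ x \in \mathbb{R}. \tag{3.8}$$
For $g_1, g_2 \in \mathcal{S}_{\mathbb{R}}(S_\beta)$ and $x_1, x_2 \in \mathbb{R}$ one has
$$C\big(g_1 \otimes \delta_k(\cdot - x_1),\ g_2 \otimes \delta_k(\cdot - x_2)\big) = \int_{\mathbb{R}} \Big|\hat{\chi}\Big(\frac{p}{k}\Big)\Big|^2 e^{ip(x_1 - x_2)} \big(g_1, (D_t^2 + p^2 + m^2)^{-1} g_2\big)_{L^2(S_\beta)}\, dp. \tag{3.9}$$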
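Using (3.8) and $\hat{\chi}(0) = (2\pi)^{-1/2}$ we find
$$\lim_{k\to\infty} C\big(g_1 \otimes \delta_k(\cdot - x_1),\ g_2 \otimes \delta_k(\cdot - x_2)\big) = \Big(g_1,\ \frac{e^{-|x_1 - x_2| b}}{2b}\, g_2\Big)_{L^2(S_\beta)}, \tag{3.10}$$
where $b := (D_t^2 + m^2)^{1/2}$. Now we can use (3.3) again: for $g \in \mathcal{S}_{\mathbb{R}}(S_\beta)$ and $x \in \mathbb{R}$ fixed, the sequence of functions $\{\phi(g \otimes \delta_k(\cdot - x))\}_{k \in \mathbb{N}}$ is Cauchy in $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$. We set
$$\phi(g, x) := \lim_{k\to\infty} \phi\big(g \otimes \delta_k(\cdot - x)\big)$$
and
$$C_\beta(g_1, x_1, g_2, x_2) := \Big(g_1,\ \frac{e^{-|x_1 - x_2| b}}{2b}\, g_2\Big)_{L^2(S_\beta)}. \tag{3.11}$$
We note that $\phi(g, x)$ belongs to $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$. For later use we define the 0-temperature covariance on $S_\beta$:
$$C_\beta(g_1, g_2) := \Big(g_1,\ \frac{1}{2b}\, g_2\Big)_{L^2(S_\beta)}, \qquad g_1, g_2 \in \mathcal{S}(S_\beta). \tag{3.12}$$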
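3.5. Some elementary properties

From (3.3), (3.6) and (3.11) we deduce that the maps
$$H^{-1}_{\mathbb{R}}(S_\beta \times \mathbb{R}^d) \to \bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C), \qquad f \mapsto \phi(f), \tag{3.13}$$
$$S_\beta \times H^{-1/2}_{\mathbb{R}}(\mathbb{R}^d) \to \bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C), \qquad (t, h) \mapsto \phi(t, h), \tag{3.14}$$
and
$$H^{-1/2}_{\mathbb{R}}(S_\beta) \times \mathbb{R} \to \bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C), \qquad (g, x) \mapsto \phi(g, x) \tag{3.15}$$
are continuous.

For $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R})$, $t \in S_\beta$ and $x \in \mathbb{R}$ we set
$$f_t : \mathbb{R} \to \mathbb{C},\quad x \mapsto f(t, x), \qquad\qquad f_x : S_\beta \to \mathbb{C},\quad t \mapsto f(t, x).$$
We note that $f_t \in \mathcal{S}_{\mathbb{R}}(\mathbb{R})$ and $f_x \in \mathcal{S}_{\mathbb{R}}(S_\beta)$.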
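Lemma 3.1. If $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R})$, then the following identity holds on $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$:
$$\int_{\mathbb{R}} \phi(f_x, x)\, dx = \int_{S_\beta} \phi(t, f_t)\, dt = \phi(f).$$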
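Proof. Let $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R})$ and $k \in \mathbb{N}$. The map
$$\mathbb{R} \to H^{-1}(S_\beta \times \mathbb{R}), \qquad x \mapsto f_x \otimes \delta_k(\cdot - x)$$
is continuous. Since $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R})$, the bound $\|f_x \otimes \delta_k(\cdot - x)\|_{H^{-1}(S_\beta \times \mathbb{R})} \in O(|x|^{-\infty})$ holds true. Hence by (3.15) the map
$$\mathbb{R} \to \bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C), \qquad x \mapsto \phi\big(f_x \otimes \delta_k(\cdot - x)\big)$$
is continuous and $\|\phi(f_x \otimes \delta_k(\cdot - x))\|_{L^p(Q,\Sigma,d\phi_C)} \in O(|x|^{-\infty})$. Therefore $\int_{\mathbb{R}} \phi\big(f_x \otimes \delta_k(\cdot - x)\big)\, dx$ is well-defined as an element of $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$. Moreover,
$$\int_{\mathbb{R}} \phi\big(f_x \otimes \delta_k(\cdot - x)\big)\, dx = \phi\Big(\int_{\mathbb{R}} f_x \otimes \delta_k(\cdot - x)\, dx\Big) = \phi(f * \delta_k),$$
where the convolution product $*$ acts only in the space variable $x$. Since $\lim_{k\to\infty} f * \delta_k = f$ holds in $H^{-1}(S_\beta \times \mathbb{R})$, we obtain from (3.13)
$$\lim_{k\to\infty} \int_{\mathbb{R}} \phi\big(f_x \otimes \delta_k(\cdot - x)\big)\, dx = \phi(f) \qquad \text{in } \bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C).$$
It follows from (3.9) and (3.10) that
$$\lim_{k\to\infty} \sup_{x \in \mathbb{R}} |x|^N \big\|\phi\big(f_x \otimes \delta_k(\cdot - x)\big) - \phi(f_x, x)\big\|_{L^p(Q,\Sigma,d\phi_C)} = 0$$
for $f \in \mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R})$ and $N \in \mathbb{N}$. Hence
$$\lim_{k\to\infty} \int_{\mathbb{R}} \phi\big(f_x \otimes \delta_k(\cdot - x)\big)\, dx = \int_{\mathbb{R}} \phi(f_x, x)\, dx.$$
This proves the first identity of the lemma. The second one can be shown by similar arguments.

4. Path Spaces Supported by $(\mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R}), \Sigma, d\phi_C)$

In this section we recall two well known path spaces supported by $(Q, \Sigma, d\phi_C)$. The first is associated to the free neutral scalar field of mass $m$ on $S_\beta$ at temperature 0; the second is associated to the free neutral scalar field of mass $m$ on $\mathbb{R}$ at temperature $\beta^{-1}$.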
We recall that $(t, x)$ denotes a point in $S_\beta \times \mathbb{R}$, and refer to $t$ as the (euclidean) time and to $x$ as space variable. The time translation induced on $Q$ by the map $(t, x) \mapsto (t + s, x)$ will be denoted by $T_s : Q \to Q$ and the spatial translations induced on $Q$ by the map $(t, x) \mapsto (t, x + y)$ will be denoted by $a_y : Q \to Q$. Note that for the thermal model on the real line, $t$ is the (euclidean) time and $x$ is the space variable, while for the model on the circle, $t$ has to be interpreted as the "position variable on the circle" and $x$ has to be identified with the "euclidean time variable".

4.1. The free massive euclidean field on the circle at 0-temperature

In this subsection we identify the generalized path space on $(Q, \Sigma, d\phi_C)$ corresponding to the free massive scalar field on the circle $S_\beta$ at temperature 0.

Let $\Sigma_0^C$ be the sub-$\sigma$-algebra of $\Sigma$ generated by the functions $\{\phi(g, 0) \mid g \in \mathcal{S}_{\mathbb{R}}(S_\beta)\}$. We denote by $\{U_C(x)\}_{x \in \mathbb{R}}$ the one-parameter group generated by the spatial translations $\{a_x\}_{x \in \mathbb{R}}$. More precisely, if $F : Q \to \mathbb{C}$ is a function on $Q$, then $U_C(x)F(q) := F(a_{-x}(q))$ for $q \in Q$. Applying (3.2) we see that $x \mapsto U_C(x)$ is a strongly continuous unitary group on $L^2(Q, \Sigma, d\phi_C)$, and hence extends to a group of measure-preserving automorphisms of $L^\infty(Q, \Sigma, d\phi_C)$ which is continuous in measure. Let $r_C : Q \to Q$ be the space reflection around $x = 0$. We denote by $R_C$ the measure-preserving transformation of $(Q, \Sigma, d\phi_C)$ generated by $r_C$. For $g \in \mathcal{S}_{\mathbb{R}}(S_\beta)$ we have
$$U_C(x)\phi(g, 0) = \phi(g, x). \tag{4.1}$$
Using then Lemma 3.1, we see that $\Sigma = \bigvee_{x \in \mathbb{R}} U_C(x)\Sigma_0^C$. Hence $(Q, \Sigma, \Sigma_0^C, U_C(x), R_C, d\phi_C)$ is a generalized path space. Moreover, it is OS-positive (see e.g. [24]). It describes the free neutral scalar euclidean field of mass $m$ on the circle $S_\beta$ at temperature 0.

Let us now briefly describe a well known concrete form of the physical objects associated to this path space by the reconstruction theorem. Let $H^{-1/2}(S_\beta)$ be the Sobolev space of order $-\frac{1}{2}$ equipped with its canonical complex structure $i$ and scalar product $(h_1, (2b)^{-1} h_2)_{L^2(S_\beta)}$, where $b = (D_t^2 + m^2)^{1/2}$. Then the physical Hilbert space can be unitarily identified with the bosonic Fock space $\Gamma\big(H^{-1/2}(S_\beta)\big)$ over $H^{-1/2}(S_\beta)$. The distinguished unit vector $\Omega_C^\circ := V1$ is identified with the Fock vacuum $\Omega$ in $\Gamma\big(H^{-1/2}(S_\beta)\big)$. The (free) Hamiltonian is $H_C^\circ = d\Gamma(b)$. The abelian von Neumann algebra $\mathcal{U}_C$ obtained from the reconstruction theorem can be identified with the von Neumann algebra generated by $\{W_F(g) \mid g \in H^{-1/2}_{\mathbb{R}}(S_\beta)\}$. In fact, if $A = e^{i\phi(g,0)}$ for $g \in \mathcal{S}_{\mathbb{R}}(S_\beta)$, then the operator $\tilde{A}$ defined in (2.1) is identified with the Fock–Weyl operator $W_F(g) = e^{i\phi_F(g)}$ on $\Gamma\big(H^{-1/2}(S_\beta)\big)$.
4.2. The free massive euclidean field on $\mathbb{R}$ at temperature $\beta^{-1}$

We now identify the generalized path space on $(\mathcal{S}'_{\mathbb{R}}(S_\beta \times \mathbb{R}), \Sigma, d\phi_C)$ corresponding to the free massive scalar euclidean field on $\mathbb{R}$ at temperature $\beta^{-1}$.

Let $\Sigma_0$ be the sub-$\sigma$-algebra of $\Sigma$ generated by the functions $\{\phi(0, h) \mid h \in \mathcal{S}_{\mathbb{R}}(\mathbb{R})\}$. We denote by $\{U(t)\}_{t \in S_\beta}$ the one-parameter group generated by $\{T_t\}_{t \in S_\beta}$. If $F : Q \to \mathbb{C}$ is a function on $Q$, then $U(t)F(q) := F(T_{-t}(q))$ for $q \in Q$. Using (3.2) we see that $t \mapsto U(t)$ is a strongly continuous $\beta$-periodic unitary group on $L^2(Q, \Sigma, d\phi_C)$. Hence it extends to a group of measure-preserving automorphisms of $L^\infty(Q, \Sigma, d\phi_C)$ which is continuous in measure. Let $r$ be the (euclidean) time reflection around $t = 0$. We denote by $R$ the measure-preserving transformation of $(Q, \Sigma, d\phi_C)$ generated by $r$. For $h \in \mathcal{S}_{\mathbb{R}}(\mathbb{R})$ we have
$$U(t)\phi(0, h) = \phi(t, h). \tag{4.2}$$
Again by Lemma 3.1, we see that $\Sigma = \bigvee_{t \in S_\beta} U(t)\Sigma_0$. Hence $(Q, \Sigma, \Sigma_0, U(t), R, d\phi_C)$ is a generalized path space. Moreover, it is $\beta$-periodic and OS-positive (see e.g. [24]). It describes the free neutral scalar field of mass $m$ on $\mathbb{R}$ at temperature $\beta^{-1}$.

We now describe a well known concrete form of the $\beta$-KMS system associated to the generalized path space $(Q, \Sigma, \Sigma_0, U(t), R, d\phi_C)$. Let $\mathfrak{h} := H^{-1/2}(\mathbb{R})$ be the Sobolev space of order $-\frac{1}{2}$, equipped with its canonical complex structure $i$ and scalar product $(h_1, h_2) = (h_1, (2\epsilon)^{-1} h_2)_{L^2(\mathbb{R})}$, where $\epsilon = (D_x^2 + m^2)^{1/2}$. On $\mathfrak{h}$ we consider the unitary dynamics $e^{-it\epsilon}$. On the Weyl algebra $\mathcal{W}(\mathfrak{h})$ we define a state $\omega_\beta^\circ$ and a one-parameter group of automorphisms $\{\tau_t^\circ\}_{t \in \mathbb{R}}$ by
$$\tau_t^\circ\big(W(h)\big) := W(e^{it\epsilon} h), \qquad \omega_\beta^\circ\big(W(h)\big) := e^{-\frac{1}{4}(h, (1 + 2\rho)h)}, \qquad h \in \mathfrak{h},\ t \in \mathbb{R}, \tag{4.3}$$
where $\rho := (e^{\beta\epsilon} - 1)^{-1}$, $\beta > 0$. It can be easily seen that $\omega_\beta^\circ$ is a quasi-free $(\tau^\circ, \beta)$-KMS state on $\mathcal{W}(\mathfrak{h})$.

Let us now recall some terminology. If $\mathfrak{h}$ is a complex vector space, then the complex conjugate vector space $\bar{\mathfrak{h}}$ is the real vector space $\mathfrak{h}$ equipped with the complex structure $-i$. We will denote by $\mathfrak{h} \ni h \mapsto \bar{h} \in \bar{\mathfrak{h}}$ the (anti-linear) identity operator. If $a \in L(\mathfrak{h})$, then we denote by $\bar{a} \in L(\bar{\mathfrak{h}})$ the operator $\bar{a}\bar{h} := \overline{ah}$. If $\mathfrak{h}$ is a Hilbert space, then $\bar{\mathfrak{h}}$ is equipped with the Hilbert space structure $(\bar{h}_1, \bar{h}_2) := (h_2, h_1)$.

We recall a convenient realization of the GNS representation associated to $(\mathcal{W}(\mathfrak{h}), \omega_\beta^\circ)$, which is called the right Araki–Woods representation. It is specified by setting
$$\mathcal{H}_{AW} := \Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}}), \qquad \Omega_{AW} := \Omega,$$
$$\pi_{AW}\big(W(h)\big) = W_{AW}(h) := W_F\big((1 + \rho)^{1/2} h \oplus \bar{\rho}^{1/2} \bar{h}\big), \qquad h \in \mathfrak{h}.$$
Here $W_F(\cdot)$ denotes the Fock–Weyl operator on $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$ and $\Omega \in \Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$ is the Fock vacuum.
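The following short computation is added as a reading aid and is not part of the original text. Writing $\epsilon$ for $(D_x^2 + m^2)^{1/2}$ and $\rho = (e^{\beta\epsilon} - 1)^{-1}$ as in (4.3), it indicates how the operator $1 + 2\rho$ in the quasi-free state relates to the kernel of the temperature $\beta^{-1}$ covariance (3.7) for $d = 1$, using only the definition of the scalar product on $H^{-1/2}(\mathbb{R})$ given above.

```latex
\[
1 + 2\rho
  = 1 + \frac{2}{e^{\beta\epsilon} - 1}
  = \frac{e^{\beta\epsilon} + 1}{e^{\beta\epsilon} - 1}
  = \frac{1 + e^{-\beta\epsilon}}{1 - e^{-\beta\epsilon}}
  = \coth\!\Big(\frac{\beta\epsilon}{2}\Big),
\]
\[
\big(h, (1 + 2\rho)h\big)_{H^{-1/2}(\mathbb{R})}
  = \Big(h, \frac{1 + e^{-\beta\epsilon}}{2\epsilon\,(1 - e^{-\beta\epsilon})}\, h\Big)_{L^2(\mathbb{R})}
  = C_0(h, h), \qquad h \in \mathcal{S}(\mathbb{R}).
\]
```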
The physical Hilbert space associated to the path space $(Q, \Sigma, \Sigma_0, U(t), R, d\phi_C)$ can be unitarily identified with $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$. The distinguished vector $V1$ is identified with the Fock vacuum vector $\Omega$ in $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$. The Liouvillean $L_{AW}$ satisfies
$$e^{iL_{AW}t}\, \pi_{AW}(A)\, \Omega_{AW} = \pi_{AW}\big(\tau_t^\circ(A)\big)\, \Omega_{AW} \qquad \text{and} \qquad L_{AW}\Omega_{AW} = 0,$$
and can be identified with $d\Gamma(\epsilon \oplus -\bar{\epsilon})$. The abelian von Neumann algebra $\mathcal{U}_{AW}$ obtained by the reconstruction theorem can be identified with the abelian von Neumann algebra generated by $\{W_{AW}(h) \mid h \in H^{-1/2}_{\mathbb{R}}(\mathbb{R})\}$. In fact, if $A = e^{i\phi(0,h)}$ for $h \in \mathcal{S}_{\mathbb{R}}(\mathbb{R})$, then the operator $\tilde{A}$ defined in (2.1) is identified with the Weyl operator $W_{AW}(h) = e^{i\phi_{AW}(h)}$ on $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$. The von Neumann algebra $\mathcal{B}_{AW}$ generated by $\bigcup_{t \in \mathbb{R}} \tau_t^\circ(\mathcal{U}_{AW})$ can be identified with the von Neumann algebra $\mathcal{R}_{AW}$ generated by $\{W_{AW}(h) \mid h \in H^{-1/2}(\mathbb{R})\}$.

5. Perturbations of Path Spaces

In this section we describe perturbations of the two path spaces defined in Secs. 4.1 and 4.2, obtained from FKN kernels corresponding to $P(\phi)_2$ interactions.

5.1. Interaction terms

We recall some well known facts concerning the Wick ordering of Gaussian random variables. Let $(K, \nu)$ be a probability space and $X$ a real vector space equipped with a positive quadratic form $f \mapsto c(f, f)$ called a covariance. Let $f \mapsto \phi(f)$ be an $\mathbb{R}$-linear map from $X$ into the space of real measurable functions on $K$. The Wick ordering $:\!\phi(f)^n\!:_c$ with respect to the covariance $c$ is defined by the following generating series:
$$:\!e^{\alpha\phi(f)}\!:_c\ := \sum_{n=0}^{\infty} \frac{\alpha^n}{n!} :\!\phi(f)^n\!:_c\ = e^{\alpha\phi(f)}\, e^{-\frac{\alpha^2}{2} c(f, f)}. \tag{5.1}$$
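Thus
$$:\!\phi(f)^n\!:_c\ = \sum_{m=0}^{[n/2]} \frac{n!}{m!\,(n - 2m)!}\ \phi(f)^{n-2m} \Big(-\frac{1}{2} c(f, f)\Big)^m, \tag{5.2}$$
where $[\,\cdot\,]$ denotes the integer part.

Lemma 5.1.
(i) For $f \in L^1(S_\beta \times \mathbb{R}) \cap L^2(S_\beta \times \mathbb{R})$ the following limit exists in $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$:
$$\lim_{(k,k')\to\infty} \int_{S_\beta \times \mathbb{R}} f(t, x) :\!\phi\big(\delta_k(\cdot - t) \otimes \delta_{k'}(\cdot - x)\big)^n\!:_C\, dt\,dx.$$
It will be denoted by $\int_{S_\beta \times \mathbb{R}} f(t, x) :\!\phi(t, x)^n\!:_C\, dt\,dx$;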
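(ii) For $h \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ the following limit exists in $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$:
$$\lim_{k'\to\infty} \int_{\mathbb{R}} h(x) :\!\phi\big(0, \delta_{k'}(\cdot - x)\big)^n\!:_{C_0}\, dx.$$
It will be denoted by $\int_{\mathbb{R}} h(x) :\!\phi(0, x)^n\!:_{C_0}\, dx$;
(iii) For $g \in L^1(S_\beta) \cap L^2(S_\beta)$ the following limit exists in $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$:
$$\lim_{k\to\infty} \int_{S_\beta} g(t) :\!\phi\big(\delta_k(\cdot - t), 0\big)^n\!:_{C_\beta}\, dt.$$
It will be denoted by $\int_{S_\beta} g(t) :\!\phi(t, 0)^n\!:_{C_\beta}\, dt$.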
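The Wick ordering formula (5.2) can be made concrete for a single Gaussian variable. The sketch below is an illustration and not part of the original text; the function names are hypothetical and the covariance $c(f,f)$ is replaced by a positive number `c`. It builds the coefficients of the Wick-ordered monomial from (5.2) and uses the moment formula (3.3) to check the standard fact that a Wick-ordered power of degree at least one has vanishing Gaussian expectation.

```python
from math import factorial

def wick_coefficients(n, c):
    """Coefficients of :x^n:_c as a dict {power: coefficient}, following (5.2)."""
    coeffs = {}
    for m in range(n // 2 + 1):
        binomial_like = factorial(n) / (factorial(m) * factorial(n - 2 * m))
        coeffs[n - 2 * m] = binomial_like * (-0.5 * c) ** m
    return coeffs

def gaussian_moment(p, c):
    """E[x^p] for a centred Gaussian of variance c, as in (3.3)."""
    if p % 2 == 1:
        return 0.0
    double_factorial = 1
    for j in range(p - 1, 0, -2):
        double_factorial *= j
    return double_factorial * c ** (p // 2)

def wick_expectation(n, c):
    """Gaussian expectation of :x^n:_c, computed term by term."""
    return sum(a * gaussian_moment(p, c) for p, a in wick_coefficients(n, c).items())
```

For `c = 1` the coefficients coincide with those of the probabilists' Hermite polynomials; for example `wick_coefficients(4, 1.0)` represents $x^4 - 6x^2 + 3$.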
We recall that the covariances $C$, $C_0$ and $C_\beta$ have been defined in (3.1), (3.7) and (3.12), respectively. In Lemma 5.1 the probability space is $(Q, \Sigma, d\phi_C)$ and the real vector spaces are equal to $\mathcal{S}_{\mathbb{R}}(S_\beta \times \mathbb{R})$, $\mathcal{S}_{\mathbb{R}}(\mathbb{R})$ and $\mathcal{S}_{\mathbb{R}}(S_\beta)$, respectively.

Proof. The proof is straightforward, adapting standard arguments (see e.g. [29], [12, Sec. 9]) used for the spatially cutoff $P(\phi)_2$ model at 0-temperature.

Remark 5.2. If $P = P(\lambda)$ is a polynomial, then the functions
$$\int_{S_\beta \times \mathbb{R}} f(t, x) :\!P(\phi(t, x))\!:_C\, dt\,dx, \qquad \int_{\mathbb{R}} h(x) :\!P(\phi(0, x))\!:_{C_0}\, dx$$
and
$$\int_{S_\beta} g(t) :\!P(\phi(t, 0))\!:_{C_\beta}\, dt$$
are well-defined, by linearity. It can be easily shown (see [12, Proposition 8.4]) using the so-called Wick reordering identities that there exists a linear invertible map between polynomials $P \mapsto \tilde{P}$ with $\deg P = \deg \tilde{P}$, $\deg(P - \tilde{P}) \le \deg(P) - 1$ such that
$$\int_{\mathbb{R}} h(x) :\!P(\phi(0, x))\!:_{C_0}\, dx = \int_{\mathbb{R}} h(x) :\!\tilde{P}(\phi(0, x))\!:_{\rm vac}\, dx.$$
Here $:\ :_{\rm vac}$ denotes Wick ordering with respect to the 0-temperature covariance $\big(h, \frac{1}{2\epsilon} h\big)_{L^2(\mathbb{R})}$.

Lemma 5.3. Let $P$ be a polynomial, $h \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and $g \in L^1(S_\beta) \cap L^2(S_\beta)$. Set
$$V_0(h) := \int_{\mathbb{R}} h(x) :\!P(\phi(0, x))\!:_{C_0}\, dx, \qquad V_\beta(g) := \int_{S_\beta} g(t) :\!P(\phi(t, 0))\!:_{C_\beta}\, dt \tag{5.3}$$
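as functions on $Q$. Then
$$\int_{S_\beta} g(t)\, U(t) V_0(h)\, dt = \int_{S_\beta \times \mathbb{R}} g(t) \otimes h(x) :\!P(\phi(t, x))\!:_C\, dt\,dx = \int_{\mathbb{R}} h(x)\, U_C(x) V_\beta(g)\, dx \tag{5.4}$$
as functions on $Q$.

Proof. Let $W$ be a function in $L^p(Q, \Sigma, d\phi_C)$ for some $1 \le p < \infty$. The one-parameter groups $\{U(t)\}_{t \in S_\beta}$ and $\{U_C(x)\}_{x \in \mathbb{R}}$, defined in (4.2) and (4.1), are strongly continuous groups of isometries of $\bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$. Therefore the functions $\int_{\mathbb{R}} h(x) U_C(x) W\, dx$ and $\int_{S_\beta} g(t) U(t) W\, dt$ belong to $L^p(Q, \Sigma, d\phi_C)$. Together with Lemma 5.1 this implies that all three functions given in (5.4) belong to $L^p(Q, \Sigma, d\phi_C)$.

Let us now prove that they are identical. By linearity, we may assume that $P(\lambda) = \lambda^n$. Using Lemma 5.1 and the Wick identity (5.2), it follows that
$$\int_{S_\beta \times \mathbb{R}} g(t) \otimes h(x) :\!P(\phi(t, x))\!:_C\, dt\,dx = \lim_{(k,k')\to\infty} F(k, k') \qquad \text{in } L^p(Q, \Sigma, d\phi_C),$$
where
$$F(k, k') = \sum_{m=0}^{[n/2]} \frac{n!}{m!\,(n - 2m)!} \Big(-\frac{1}{2} C(\delta_{k,k'}, \delta_{k,k'})\Big)^m \int_{S_\beta \times \mathbb{R}} g(t) \otimes h(x)\ \phi\big(\delta_k(\cdot - t) \otimes \delta_{k'}(\cdot - x)\big)^{n-2m}\, dt\,dx$$
and $\delta_{k,k'}(t, x) := \delta_k(t) \otimes \delta_{k'}(x)$. Since
$$\lim_{k\to\infty} C(\delta_{k,k'}, \delta_{k,k'}) = C_0(\delta_{k'}, \delta_{k'}),$$
the definition given in (3.5) of sharp-time fields implies that
$$\lim_{k\to\infty} F(k, k') = \int_{S_\beta} g(t)\, V_{k'}(t, h)\, dt \qquad \text{in } L^p(Q, \Sigma, d\phi_C),$$
where
$$V_{k'}(t, h) = \sum_{m=0}^{[n/2]} \frac{n!}{m!\,(n - 2m)!} \Big(-\frac{1}{2} C_0(\delta_{k'}, \delta_{k'})\Big)^m \int_{\mathbb{R}} h(x)\ \phi\big(t, \delta_{k'}(\cdot - x)\big)^{n-2m}\, dx.$$
Note that (4.2) implies $V_{k'}(t, h) = U(t) V_{k'}(0, h)$. By Lemma 5.1 (ii) we know that
$$\lim_{k'\to\infty} V_{k'}(0, h) = \int_{\mathbb{R}} h(x) :\!P(\phi(0, x))\!:_{C_0}\, dx \qquad \text{in } L^p(Q, \Sigma, d\phi_C)$$
and hence
$$\lim_{k'\to\infty} \int_{S_\beta} g(t)\, V_{k'}(t, h)\, dt = \int_{S_\beta} g(t)\, U(t) V_0(h)\, dt \qquad \text{in } L^p(Q, \Sigma, d\phi_C).$$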
Applying Lemma B.1 with $E = L^p(Q, \Sigma, d\phi_C)$ we obtain the first identity in (5.4). The second identity follows by the same argument, taking first the limit $k' \to \infty$ and then using that
$$\lim_{k'\to\infty} C(\delta_{k,k'}, \delta_{k,k'}) = C_\beta(\delta_k, \delta_k).$$
5.2. The $P(\phi)_2$ model on the circle $S_\beta$ at temperature 0

Let $P(\lambda)$ be a real-valued polynomial, which is bounded from below. The $P(\phi)_2$ model on the circle $S_\beta$ is specified by the formal interaction term
$$V_C := V_\beta\big(1_{[-\beta/2, \beta/2]}\big) = \int_{S_\beta} :\!P(\phi(t, 0))\!:_{C_\beta}\, dt.$$
This expression can be given two equivalent meanings: first of all, as recalled in Lemma 5.1, it can be viewed as a $\Sigma_0^C$-measurable function $V_C \in \bigcap_{1 \le p < \infty} L^p(Q, \Sigma_0^C, d\phi_C)$. Secondly, $V_C$ can be considered as a selfadjoint operator on $\Gamma\big(H^{-1/2}(S_\beta)\big)$ affiliated to the abelian algebra $\mathcal{U}_C$. More precisely, for $t \in S_\beta$ and $\Lambda \gg 1$ an UV cutoff parameter, we define an approximation $h_{\Lambda,t} \in H^{-1/2}(S_\beta)$ of the Dirac delta-function $\delta(\cdot - t) \in H^{-1/2}(S_\beta)$ by
$$h_{\Lambda,t} := 1_{[0,\Lambda]}(b)\, \delta(\cdot - t) \in H^{-1/2}(S_\beta),$$
where $b = (D_t^2 + m^2)^{1/2}$. Setting $\phi_\Lambda(t, 0) := \phi_F(h_{\Lambda,t})$ one obtains by well known arguments that
$$V_C = \lim_{\Lambda\to\infty} \int_{S_\beta} :\!P(\phi_\Lambda(t, 0))\!:_{C_\beta}\, dt$$
on a dense set of vectors in $\Gamma\big(H^{-1/2}(S_\beta)\big)$. Since $h_{\Lambda,t} \in H^{-1/2}_{\mathbb{R}}(S_\beta)$ is a real-valued function, it is easy to see that $V_C$ is a selfadjoint operator affiliated to $\mathcal{U}_C$.

It is then easy to verify, by adapting well known results for the spatially cutoff $P(\phi)_2$ model on the real line $\mathbb{R}$ at 0-temperature (see [30]), that $V_C \in \bigcap_{1 \le p < \infty} L^p(Q, \Sigma_0^C, d\phi_C)$ and $e^{-TV_C} \in L^1(Q, \Sigma_0^C, d\phi_C)$ for all $T > 0$. Now consider, for $0 \le b - a < \infty$,
$$G_{[a,b]} := e^{-\int_a^b U_C(x) V_C\, dx} \tag{5.5}$$
as a function on $Q$. It follows from Jensen's inequality (see [26, Theorem 6.2]) that
$$\big\|G_{[a,b]}\big\|_{L^p(Q,\Sigma,d\phi_C)} \le \big\|e^{-(b-a)V_C}\big\|_{L^p(Q,\Sigma,d\phi_C)} \tag{5.6}$$
and hence $G_{[a,b]} \in \bigcap_{1 \le p < \infty} L^p(Q, \Sigma, d\phi_C)$. From the results recalled in Sec. 2.3, we obtain a selfadjoint operator
$$H_C = d\Gamma(b) + V_C$$
on $\Gamma\big(H^{-1/2}(S_\beta)\big)$ associated to the FKN kernel $\{G_{[0,s]}\}$. The Hamiltonian $H_C$ is called the $P(\phi)_2$ Hamiltonian on the circle $S_\beta$.
Proposition 5.4. The Hamiltonian $H_C$ is bounded from below and has a unique normalized ground state $\Omega_C$ such that $(\Omega_C, \Omega) > 0$. We set $\omega_C(\,\cdot\,) = (\Omega_C, \,\cdot\, \Omega_C)$. Moreover, for $c \gg 1$,
$$\big\|\phi_F(g)(H_C + c)^{-1/2}\big\| \le C \|g\|_{H^{-1/2}(S_\beta)}, \tag{5.7}$$
$$\pm\phi_F(g) \le C \|g\|_{H^{-1/2}(S_\beta)}\, (H_C + c)^{1/2} \tag{5.8}$$
and
$$\pm\phi_F(g) \le C \|g\|_{H^{-1}(S_\beta)}\, (H_C + c) \tag{5.9}$$
for all $g \in H^{-1/2}(S_\beta)$. As before, $W_F(g) = e^{i\phi_F(g)}$ is the Fock–Weyl operator on $\Gamma\big(H^{-1/2}(S_\beta)\big)$.
Proof. The existence and uniqueness of the vacuum state can be shown by following the proofs of the corresponding results for spatially cutoff $P(\phi)_2$ models. For example, one easily obtains (see e.g. [29, Theorem V.20] or [7, Theorem 6.4(ii)]) that
$$(d\Gamma(b) + 1) \le C(H_C + c) \qquad \text{for } c \gg 1. \tag{5.10}$$
Since $d\Gamma(b)$ has compact resolvent on $\Gamma\big(H^{-1/2}(S_\beta)\big)$, it follows that $H_C$ is bounded from below with a compact resolvent and hence has a ground state. The uniqueness of the vacuum (i.e., the ground state of $H_C$) follows from a Perron–Frobenius argument (see e.g. [29, Theorem V.17]). Since $b \ge m > 0$, we see that it suffices to check (5.7) and (5.8), with $H_C$ replaced by the number operator $N$, which is immediate. To prove (5.9) we use (5.10) and the well known bound (see e.g. [11, Appendix])
$$\pm\phi_F(g) \le \big\|b^{-1/2} g\big\|_{H^{-1/2}(S_\beta)}\, (d\Gamma(b) + 1).$$

Without proof we quote the following result (see [19]).

Theorem 5.5. Let $H_C^{\rm ren} := H_C - E_C$, where $E_C := \inf(\sigma(H_C))$, and let $P_C$ denote the generator of the translations along the circle $S_\beta$. The joint spectrum of $H_C^{\rm ren}$ and $P_C$ is purely discrete and is contained in the forward light cone. Consequently the correlation function
$$(t, x) \mapsto \big(\Omega_C,\ A\, e^{ixH_C^{\rm ren} + itP_C}\, B\, \Omega_C\big), \qquad A, B \in \mathcal{B}\big(\Gamma(H^{-1/2}(S_\beta))\big),$$
allows an analytic continuation to the tube $\mathbb{R}^2 + iV_+$, where $V_+ := \{(t, x) \mid |t| < x;\ x > 0\}$ denotes the forward light cone (with $t$ and $x$ reversed, due to our conventions).

5.3. The spatially cutoff $P(\phi)_2$ model on $\mathbb{R}$ at temperature $\beta^{-1}$

Let $P(\lambda)$ be a real-valued polynomial, which is bounded from below (as in Sec. 5.2), and let $l \in \mathbb{R}^+$ be a spatial cutoff parameter. The spatially cutoff $P(\phi)_2$ model on
$\mathbb{R}$ is specified by the formal interaction term (see (5.3))
$$V_l := V_0\big(1_{[-l,l]}\big) = \int_{-l}^{l} :\!P(\phi(0, x))\!:_{C_0}\, dx.$$
Again this formal expression can be given two equivalent meanings: first of all, as recalled in Lemma 5.1, it can be viewed as a $\Sigma_0$-measurable function $V_l \in \bigcap_{1 \le p < \infty} L^p(Q, \Sigma_0, d\phi_C)$. Secondly, $V_l$ can be considered as a selfadjoint operator on $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$ affiliated to the abelian von Neumann algebra $\mathcal{U}_{AW}$. As in Sec. 5.2 we define an approximation $h_{\Lambda,x} \in H^{-1/2}(\mathbb{R})$ of the Dirac delta-function $\delta(\cdot - x) \in H^{-1/2}(\mathbb{R})$. For $x \in \mathbb{R}$ and $\Lambda \gg 1$ we set
$$h_{\Lambda,x} := 1_{[0,\Lambda]}(\epsilon)\, \delta(\cdot - x) \in H^{-1/2}(\mathbb{R})$$
and introduce cutoff fields $\phi_\Lambda(0, x) := \phi_{AW}(h_{\Lambda,x})$, where $\phi_{AW}(h)$ is the selfadjoint field operator associated to $W_{AW}(h)$, $h \in \mathfrak{h}$. As before, the limit
$$V_l = \lim_{\Lambda\to\infty} \int_{-l}^{l} :\!P(\phi_\Lambda(0, x))\!:_{C_0}\, dx \tag{5.11}$$
exists on a dense set of vectors in $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$. Since $h_{\Lambda,x} \in H^{-1/2}_{\mathbb{R}}(\mathbb{R})$, one obtains that $V_l$ is a selfadjoint operator affiliated to $\mathcal{U}_{AW}$.

Adapting well known arguments (see [12, Sec. 8.2]) it can be shown that $e^{-TV_l} \in L^1(Q, \Sigma_0, \mu)$ for all $T > 0$. Consequently, we can associate to $V_l$ the FKN kernel
$$F^l_{[a,b]} := e^{-\int_a^b U(t) V_l\, dt}, \qquad 0 \le b - a \le \beta,$$
and the measure
$$d\mu_l := \frac{F^l_{[-\beta/2,\beta/2]}\, d\phi_C}{\int_Q F^l_{[-\beta/2,\beta/2]}\, d\phi_C}.$$
The generalized path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu_l)$ is $\beta$-periodic and OS-positive. The associated $\beta$-KMS system is called the spatially cutoff $P(\phi)_2$ model on $\mathbb{R}$ at temperature $\beta^{-1}$. Applying the abstract results recalled in Sec. 2.3, we obtain the following facts:
- the physical Hilbert space $\mathcal{H}_{V_l}$ is equal to $\mathcal{H}_{AW} = \Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$;
- the $W^*$-algebra $\mathcal{B}_{V_l}$ and the abelian algebra $\mathcal{U}_{V_l}$ are equal to $\mathcal{R}_{AW}$ and $\mathcal{U}_{AW}$, respectively;
- the operator sum $L_{AW} + V_l$ is essentially selfadjoint on $D(L_{AW}) \cap D(V_l)$ and if $H_l := \overline{L_{AW} + V_l}$, then the perturbed time-evolution on $\mathcal{B}$ is given by $\tau_t^l(B) := e^{itH_l} B e^{-itH_l}$, $B \in \mathcal{B}$;
- the GNS vector $\Omega_{AW} \in \Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$ belongs to $D\big(e^{-\frac{\beta}{2} H_l}\big)$ and the perturbed KMS state $\omega_l$ is given by $\omega_l(B) = (\Omega_l, B\Omega_l)$, where $\Omega_l := \big\|e^{-\frac{\beta}{2} H_l}\Omega_{AW}\big\|^{-1} e^{-\frac{\beta}{2} H_l}\Omega_{AW}$.
The following consequence of Lemma 5.3 will be important in Sec. 7:
$$F^l_{[-\beta/2,\beta/2]} = G_{[-l,l]}, \tag{5.12}$$
where $G_{[a,b]}$ was defined in (5.5). The analog identity in the zero temperature case is called Nelson symmetry.
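As a reading aid (this derivation is not spelled out in the original), the identity (5.12) can be traced back to (5.4). The sketch assumes the choices $g = 1_{[-\beta/2,\beta/2]}$ and $h = 1_{[-l,l]}$ in Lemma 5.3, both of which lie in $L^1 \cap L^2$, together with the definitions of $V_l$, $V_C$, the FKN kernel associated to $V_l$ and the kernel (5.5).

```latex
\[
\int_{S_\beta} U(t)\, V_l \, dt
  = \int_{S_\beta} U(t)\, V_0\big(1_{[-l,l]}\big)\, dt
  \overset{(5.4)}{=} \int_{\mathbb{R}} 1_{[-l,l]}(x)\, U_C(x)\,
       V_\beta\big(1_{[-\beta/2,\beta/2]}\big)\, dx
  = \int_{-l}^{l} U_C(x)\, V_C \, dx ,
\]
\[
F^{l}_{[-\beta/2,\beta/2]}
  = \exp\Big(-\int_{S_\beta} U(t)\, V_l \, dt\Big)
  = \exp\Big(-\int_{-l}^{l} U_C(x)\, V_C \, dx\Big)
  = G_{[-l,l]} .
\]
```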
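6. The Thermodynamic Limit

In this section we prove that the limits
$$\lim_{l\to+\infty} \tau_t^l(A) =: \tau_t(A) \qquad \text{and} \qquad \lim_{l\to+\infty} \omega_l(A) =: \omega_\beta(A)$$
exist for $A$ in the $C^*$-algebra of local observables $\mathcal{A}$ and that $(\mathcal{A}, \tau, \omega_\beta)$ is a $\beta$-KMS system, describing the translation invariant $P(\phi)_2$ model at temperature $\beta^{-1}$.

6.1. Preparations

We first recall a well known relationship between $e^{-it\epsilon}$ and the Klein–Gordon equation: let
$$U : H^{-1/2}(\mathbb{R}) \to H^{-1/2}_{\mathbb{R}}(\mathbb{R}) + iH^{1/2}_{\mathbb{R}}(\mathbb{R}), \qquad h \mapsto f := \mathrm{Re}\, h + i\epsilon^{-1}\, \mathrm{Im}\, h. \tag{6.1}$$
(Note that $U$ is $\mathbb{R}$-linear but not $\mathbb{C}$-linear.) Then
$$U e^{-it\epsilon} = T(t)U, \qquad \text{where } T(t)f = f_t, \tag{6.2}$$
and $f_t$ is the solution of the Klein–Gordon equation
$$\big(\partial_t^2 - \partial_x^2 + m^2\big) f_t = 0, \qquad f_{t=0} = f, \qquad \partial_t f_{t=0} = -\epsilon^2\, \mathrm{Im}\, f + i\, \mathrm{Re}\, f.$$
Moreover if $h_i \in H^{-1/2}(\mathbb{R})$ and $U h_i = f_i$ for $i = 1, 2$, then
$$\sigma(h_1, h_2) := \mathrm{Im}\,(h_1, h_2)_{H^{-1/2}(\mathbb{R})} = \mathrm{Im} \int_{\mathbb{R}} \bar{f}_1(x) f_2(x)\, dx. \tag{6.3}$$
For $I \subset \mathbb{R}$ a bounded open interval we define the real vector subspace $\mathfrak{h}_I$ of $\mathfrak{h}$
$$\mathfrak{h}_I := \{h \in \mathfrak{h} \mid \mathrm{supp}\, Uh \subset I\}. \tag{6.4}$$
It follows from (6.2) that $i\epsilon : D(\epsilon) \cap \mathfrak{h}_I \to \mathfrak{h}_I$, and hence $(1 + \alpha\epsilon^2)^{-1} : \mathfrak{h}_I \to \mathfrak{h}_I$ for $\alpha > 0$. In particular $D(\epsilon) \cap \mathfrak{h}_I$ is dense in $\mathfrak{h}_I$. Moreover (6.3) shows that $\mathfrak{h}_I$ and $\mathfrak{h}_J$ are orthogonal for the symplectic form $\sigma$ if $I \cap J = \emptyset$.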
6.2. The net of local algebras

We start by recalling a result of Araki [1, Theorem 1] which will be useful later on. Let us recall a standard notation: if $H_1$, $H_2$ are two vector subspaces of a Hilbert space $\mathcal{H}$, then $H_1 \vee H_2$ denotes $\overline{H_1 + H_2}$. If $\mathcal{R}_1$, $\mathcal{R}_2$ are two $*$-sub-algebras of $\mathcal{B}(\mathcal{H})$, then $\mathcal{R}_1 \vee \mathcal{R}_2$ denotes the von Neumann algebra generated by $\mathcal{R}_1 \cup \mathcal{R}_2$.

Proposition 6.1. Let $X$ be a Hilbert space and let $Z$ be a real vector subspace of $X$. Let $\mathcal{W}(Z) \subset \mathcal{W}(X)$ denote the $C^*$-algebra generated by $\{W(x) \mid x \in Z\}$ and let $\pi_F : \mathcal{W}(X) \to \mathcal{B}(\Gamma(X))$ be the Fock representation. Then
$$\bigcap_\alpha \pi_F\big(\mathcal{W}(Z_\alpha)\big)'' = \pi_F\big(\mathcal{W}(\cap_\alpha Z_\alpha)\big)'', \qquad \bigvee_\alpha \pi_F\big(\mathcal{W}(Z_\alpha)\big)'' = \pi_F\big(\mathcal{W}(\vee_\alpha Z_\alpha)\big)'' \tag{6.5}$$
and
$$\pi_F\big(\mathcal{W}(Z)\big)' = \pi_F\big(\mathcal{W}(Z^\perp)\big)'', \tag{6.6}$$
where $Z_\alpha$ is a family of real vector subspaces of $X$ and $Z^\perp$ is the vector space orthogonal to $Z$ for the symplectic form $\sigma(x_1, x_2) = \mathrm{Im}\,(x_1, x_2)$.

We now define the net of local von Neumann algebras $I \mapsto \mathcal{R}_{AW}(I)$ describing free thermal scalar bosons. Let $I \subset \mathbb{R}$ be a bounded open interval. We denote by $\mathcal{R}_{AW}(I)$ the von Neumann algebra generated by $\{W_{AW}(h) \mid h \in \mathfrak{h}_I\}$.

Lemma 6.2.
(i) The local von Neumann algebras for the free thermal field are regular from the inside and regular from the outside:
$$\bigvee_{\bar{J} \subset I} \mathcal{R}_{AW}(J) = \mathcal{R}_{AW}(I) = \bigcap_{J \supset \bar{I}} \mathcal{R}_{AW}(J);$$
(ii) The net of local von Neumann algebras for the free thermal field is additive:
$$\mathcal{R}_{AW}(I) = \bigvee_{J_i} \mathcal{R}_{AW}(J_i) \qquad \text{if } I = \cup_i J_i;$$
(iii) For each open and bounded interval $I$, the local observable algebra $\mathcal{R}_{AW}(I)$ is $*$-isomorphic to the unique hyper-finite factor of type III$_1$.

Proof. Recalling the definition of the Araki–Woods representation we see that, with the notation introduced above,
$$\mathcal{R}_{AW}(I) = \pi_F\big(\mathcal{W}(Z_I)\big)'',$$
where $Z_I \subset \mathfrak{h} \oplus \bar{\mathfrak{h}}$ is the vector subspace
$$Z_I = \{(1 + \rho)^{1/2} h \oplus \bar{\rho}^{1/2} \bar{h} \mid h \in \mathfrak{h}_I\}.$$
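Clearly $\bigcap_{J \supset \bar{I}} Z_J = \bigvee_{\bar{J} \subset I} Z_J = Z_I$, which using (6.5) implies (i). Part (ii) is a direct consequence of (6.5). To prove (iii) we use (6.6) and (6.5), which imply that
$$\mathcal{R}_{AW}(I) \cap \mathcal{R}_{AW}(I)' = \pi_F\big(\mathcal{W}(Z_I \cap Z_I^\perp)\big)'',$$
where $Z^\perp$ is the orthogonal space to $Z$ in $\mathfrak{h} \oplus \bar{\mathfrak{h}}$ for the symplectic form $\sigma(f, g) = \mathrm{Im}\,(f, g)$ on $\mathfrak{h} \oplus \bar{\mathfrak{h}}$. We claim that
$$Z_I \cap Z_I^\perp = \{0\}, \tag{6.7}$$
which will imply that $\mathcal{R}_{AW}(I)$ is a factor. To prove our claim we pick $h \in \mathfrak{h}_I$ such that $(1 + \rho)^{1/2} h \oplus \bar{\rho}^{1/2} \bar{h} \in Z_I^\perp$. This implies that $\mathrm{Im}\,(h, g) = 0$ for all $g \in \mathfrak{h}_I$. Hence to prove (6.7) it suffices to check that
$$\mathfrak{h}_I \cap \mathfrak{h}_I^\perp = \{0\}. \tag{6.8}$$
But if $h \in \mathfrak{h}_I \cap \mathfrak{h}_I^\perp$, we have $\mathrm{Im}\,\big(h, i\epsilon(1 + \alpha\epsilon^2)^{-1} h\big) = 0$ for $\alpha > 0$, since $i\epsilon(1 + \alpha\epsilon^2)^{-1} h \in \mathfrak{h}_I$ for $h \in \mathfrak{h}_I$. Letting $\alpha \to 0$ this yields $\mathrm{Re}\,(h, \epsilon h) = (h, \epsilon h) = 0$, since $\epsilon$ is selfadjoint. Using that $\epsilon \ge m > 0$ this implies that $h = 0$, which proves (6.8) and hence (6.7). Thus $\mathcal{R}_{AW}(I)$ is a factor, if $I$ is bounded.

Note that (6.8) shows that $\pi_F(\mathcal{W}(\mathfrak{h}_I))''$ is a factor and it is well known (see e.g. [5, 28] and lit. cit.) that $\pi_F(\mathcal{W}(\mathfrak{h}_I))''$ is $*$-isomorphic to the unique hyper-finite factor of type III$_1$. Thus Lemma 6.3 below completes the proof of the lemma.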
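The step from membership in the symplectic complement of $Z_I$ to the vanishing of $\mathrm{Im}\,(h, g)$ uses a short computation that the proof leaves implicit. The following sketch is added as a reading aid and is not part of the original text; it assumes that $\rho$ is a bounded selfadjoint operator on the one-particle space, and uses the conventions for the conjugate space recalled in Sec. 4.2, namely that the conjugate of $\rho^{1/2}$ acts on $\bar h$ as the conjugate of $\rho^{1/2}h$ and that $(\bar h_1, \bar h_2) = (h_2, h_1)$.

```latex
\[
\big( (1+\rho)^{1/2} h \oplus \bar{\rho}^{1/2}\bar{h},\;
      (1+\rho)^{1/2} g \oplus \bar{\rho}^{1/2}\bar{g} \big)
  = \big(h, (1+\rho) g\big) + \big(\rho^{1/2} g, \rho^{1/2} h\big)
  = \big(h, (1+\rho) g\big) + \overline{\big(h, \rho\, g\big)} ,
\]
\[
\operatorname{Im}\big( (1+\rho)^{1/2} h \oplus \bar{\rho}^{1/2}\bar{h},\;
      (1+\rho)^{1/2} g \oplus \bar{\rho}^{1/2}\bar{g} \big)
  = \operatorname{Im}\big(h, (1+\rho) g\big) - \operatorname{Im}\big(h, \rho\, g\big)
  = \operatorname{Im}\,(h, g) .
\]
```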
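We now recall an easy fact about the restriction of the free KMS state $\omega_\beta^\circ$ to the local algebras $\mathcal{W}(\mathfrak{h}_I)$.

Lemma 6.3. Let $I \subset \mathbb{R}$ be a bounded open interval. Then the representations $\pi_{AW}$ and $\pi_F$ of $\mathcal{W}(\mathfrak{h}_I)$ are quasi-equivalent.

Proof. Let $\mathfrak{h}$ be a Hilbert space and let $\epsilon \ge m > 0$ be a positive selfadjoint operator on $\mathfrak{h}$. Let $\omega_\beta^\circ$ be the quasi-free state on $\mathcal{W}(\mathfrak{h})$ defined by $\omega_\beta^\circ\big(W(h)\big) = e^{-\frac{1}{4}(h, (1 + 2\rho)h)}$, where $\rho = (e^{\beta\epsilon} - 1)^{-1}$. Then it is well known that $\omega_\beta^\circ$ is normal with respect to the Fock representation $\pi_F$ of $\mathcal{W}(X)$ iff $\mathrm{Tr}\, e^{-\beta\epsilon} < \infty$ (see e.g. [4, Proposition 5.2.27]). This fact implies that if $\mathfrak{h}_1 \subset \mathfrak{h}$ is a complex vector subspace, then the restriction of $\omega_\beta^\circ$ to $\mathcal{W}(\mathfrak{h}_1)$ is $\pi_F$-normal iff $\mathrm{Tr}(E e^{-\beta\epsilon} E) < \infty$, where $E$ is the orthogonal projection onto $\mathfrak{h}_1$.

We will apply this remark to $\mathfrak{h} = H^{-1/2}(\mathbb{R})$, $\rho = (e^{\beta\epsilon} - 1)^{-1}$ and $\mathfrak{h}_1 = \mathbb{C}\mathfrak{h}_I$. Let $E_I$ denote the orthogonal projection on $\mathbb{C}\mathfrak{h}_I$. Let $\chi \in C_{0\,\mathbb{R}}^\infty(\mathbb{R})$ such that $\chi \equiv 1$ near $I$ and $x = i\partial_k$. If $h \in \mathfrak{h}_I$, then $\mathrm{Re}\, h = \chi(x)\, \mathrm{Re}\, h$ and $\mathrm{Im}\, h = \epsilon\chi(x)\epsilon^{-1}\, \mathrm{Im}\, h$. Using pseudodifferential calculus, we see that the operators $(1 + |x|)^N \chi(x)$ and $(1 + |x|)^N \epsilon\chi(x)\epsilon^{-1}$ are bounded on $H^{-1/2}(\mathbb{R})$ for all $N \in \mathbb{N}$. This implies that
$$\big\|(1 + |x|)^N h\big\|_{H^{-1/2}(\mathbb{R})} \le C \|h\|_{H^{-1/2}(\mathbb{R})}, \qquad h \in \mathfrak{h}_I. \tag{6.9}$$
Clearly (6.9) extends to $\mathbb{C}\mathfrak{h}_I$, which implies that $(1 + |x|)^N E_I$ is bounded for all $N \in \mathbb{N}$. Since $e^{-\beta\epsilon}(1 + |x|)^{-N}$ is trace class for $N$ large enough we see that $E_I e^{-\beta\epsilon} E_I$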
is trace class. Using the arguments given above we obtain that $\omega_\beta^\circ$ restricted to $\mathcal{W}(\mathbb{C}\mathfrak{h}_I)$ (and hence also to $\mathcal{W}(\mathfrak{h}_I)$) is $\pi_F$-normal. Finally we have seen in the proof of Lemma 6.2 that $\pi_F(\mathcal{W}(\mathfrak{h}_I))''$ is a factor, hence $\pi_F$ is a factor representation of $\mathcal{W}(\mathfrak{h}_I)$. It is shown in [27, Proposition 10.3.14] that if $\mathcal{R}$ is a $C^*$-algebra and $\pi$ is a factor representation of $\mathcal{R}$, then $\pi$ is quasi-equivalent to the GNS representation of any $\pi$-normal state $\omega$. Since the restriction of $\pi_{AW}$ to $\mathcal{W}(\mathfrak{h}_I)$ is the GNS representation for the quasi-free state $\omega_\beta^\circ$, this completes the proof of the lemma.

6.3. Existence of the limiting dynamics

The $C^*$-algebra of local observables $\mathcal{A}$ is defined as follows:
$$\mathcal{A} := \overline{\bigcup_{I \subset \mathbb{R}} \mathcal{R}_{AW}(I)}^{(*)},$$
where the union is over all open bounded intervals $I \subset \mathbb{R}$ and the symbol $\overline{\bigcup_{I \subset \mathbb{R}} \mathcal{R}_{AW}(I)}^{(*)}$ denotes the $C^*$-inductive limit (see e.g. [27, Proposition 11.4.1]). We denote by $\{\alpha_x\}_{x \in \mathbb{R}}$ the group of space translations on $\mathcal{A}$, defined by
$$\alpha_x\big(W_{AW}(h)\big) := W_{AW}(e^{ix\cdot k} h), \qquad x \in \mathbb{R},$$
where $k$ is the momentum operator acting on $\mathfrak{h} = H^{-1/2}(\mathbb{R})$.

Theorem 6.4 (Existence of Limiting Dynamics). Let $I \subset \mathbb{R}$ be a bounded open interval. For $t \in \mathbb{R}$ fixed, the norm limit
$$\lim_{l\to\infty} \tau_t^l(B) =: \tau_t(B)$$
exists for all $B \in \mathcal{R}_{AW}(I)$. The map $\tau : t \mapsto \tau_t$ defines a group of $*$-automorphisms of $\mathcal{A}$ such that $\tau_t \circ \alpha_x = \alpha_x \circ \tau_t$ for all $t, x \in \mathbb{R}$. Moreover,
$$\tau_t : \mathcal{R}_{AW}(I) \to \mathcal{R}_{AW}\big(I +\, ]-t, t[\big). \tag{6.10}$$

Proof. The proof follows the well known proof in the 0-temperature case, which is based on finite propagation speed (see [15, Theorem 4.1.2]). To prove the existence of the limit and the group property, it suffices to show that $\tau_t^l(B)$, for $B \in \mathcal{R}_{AW}(I)$ and $|t| \le T$, is independent of $l$ for $l > |I| + T$. It follows from (6.2) and Huygens principle that
$$\tau_t^\circ : \mathcal{R}_{AW}(I) \to \mathcal{R}_{AW}\big(I +\, ]-t, t[\big). \tag{6.11}$$
Moreover (6.3) implies that $\mathcal{R}_{AW}(I_1) \subset \mathcal{R}_{AW}(I_2)'$, if $I_1 \cap I_2 = \emptyset$. The dynamics $\tau_t^l$ is unitarily implemented by $e^{itH_l}$, where $H_l = \overline{L_{AW} + V_l}$ for
$$V_l = \int_{]-l,l[} :\!P(\phi(0, x))\!:_{C_0}\, dx.$$
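Trotter's formula yields $e^{itH_l} = \text{s-}\lim_{n\to\infty} \big(e^{itL_{AW}/n}\, e^{itV_l/n}\big)^n$ and hence
$$\tau_t^l(A) = \text{s-}\lim_{n\to\infty} \big(\tau_{t/n}^\circ \circ \gamma_{t/n}^l\big)^n(A), \qquad A \in \mathcal{B}(\mathcal{H}_{AW}), \tag{6.12}$$
where $\gamma_t^l(A) := e^{itV_l} A e^{-itV_l}$. Note that for $l' > l$
$$V_{l'} = V_l + \int_{]-l',l'[\, \setminus\, ]-l,l[} :\!P(\phi(0, x))\!:_{C_0}\, dx.$$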
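Trotter's product formula, used above for unbounded operators, can be illustrated in finite dimensions. The sketch below is an illustration only and not part of the original argument: it replaces the free generator and the interaction by two hypothetical $2 \times 2$ Hermitian matrices (Pauli matrices), and all function names are chosen for this example. It compares $(e^{itA/n} e^{itB/n})^n$ with $e^{it(A+B)}$ and records how the deviation shrinks as $n$ grows.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[s * x for x in row] for row in a]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def identity(n):
    return [[complex(i == j) for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential via a truncated Taylor series (adequate for small norms)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)
        result = add(result, term)
    return result

def trotter(a, b, t, n):
    """(e^{itA/n} e^{itB/n})^n for Hermitian matrices A and B."""
    step = matmul(expm(scale(a, 1j * t / n)), expm(scale(b, 1j * t / n)))
    out = identity(len(a))
    for _ in range(n):
        out = matmul(out, step)
    return out

def trotter_error(a, b, t, n):
    """Largest entrywise deviation of the Trotter product from e^{it(A+B)}."""
    exact = expm(scale(add(a, b), 1j * t))
    approx = trotter(a, b, t, n)
    return max(abs(x - y) for row_e, row_a in zip(exact, approx) for x, y in zip(row_e, row_a))

SIGMA_Z = [[1, 0], [0, -1]]  # hypothetical stand-in for the free generator
SIGMA_X = [[0, 1], [1, 0]]   # hypothetical stand-in for the interaction
errors = [trotter_error(SIGMA_Z, SIGMA_X, 1.0, n) for n in (10, 100, 1000)]
```

Since the two matrices do not commute, the error is nonzero for every finite `n` but decreases as `n` increases, which is the finite-dimensional shadow of the strong limit in the formula.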
Since $V_{l'} - V_l$ is affiliated to $\mathcal{R}_{AW}\big(]-l', l'[\, \setminus [-l, l]\big)$, we see that $\gamma_t^l = \gamma_t^{l'}$ on $\mathcal{R}_{AW}(I)$ for $l, l' > |I|$. Using (6.11) and (6.12), this implies that $\tau_t^l = \tau_t^{l'}$ on $\mathcal{R}_{AW}(I)$ for $|t| \le T$ and $l, l' > |I| + T$. This proves our claim. The same argument using again (6.11) proves (6.10).

It remains to check that $\tau$ and $\alpha$ commute. Let $T > 0$ and $I$ a bounded interval. For $|t| \le T$ the time evolution is locally (i.e., applied to elements in $\mathcal{R}_{AW}(I)$) generated by $H_l$ if $l > |I| + t$. Now $\alpha_x$ is implemented by $e^{ixP}$ with $P = d\Gamma(k \oplus \bar{k})$. It follows that $\alpha_x \circ \tau_t \circ \alpha_x^{-1}$ is implemented by $e^{itH_{l,x}}$ with $H_{l,x} = e^{ixP} H_l e^{-ixP}$. It is easy to see that
$$H_{l,x} = L_{AW} + \int_{]-l+x,\, l+x[} :\!P(\phi(0, x))\!:_{C_0}\, dx.$$
By the same argument as above, $\tau_t$ is implemented by $e^{itH_{l,x}}$ for $|t| \le T$ if $l > |I| + |T| + |x|$, which implies that $\alpha_x \circ \tau_t \circ \alpha_x^{-1} = \tau_t$.

6.4. An identification of local algebras

In order to apply the results of Sec. 7 to the algebra of local observables $\mathcal{A}$, it is necessary to identify the local Weyl algebra $\mathcal{R}_{AW}(I)$ with the von Neumann algebra $\mathcal{B}(I)$ obtained by applying the interacting dynamics $\tau$ to the local abelian algebra of time-zero fields $\mathcal{U}_{AW}(I)$. This is done in Proposition 6.5 below. Note that by similar arguments the corresponding result holds also in the 0-temperature case.

For $I \subset \mathbb{R}$ a bounded open interval, we denote by $\mathcal{U}_{AW}(I)$ the von Neumann algebra generated by $\{W_{AW}(h) \mid h \in \mathfrak{h}_I,\ h \text{ real-valued}\}$. Note that $\mathcal{U}_{AW}(I) \subset \mathcal{R}_{AW}(I)$ is abelian. We denote by $\mathcal{B}_\alpha(I)$ the von Neumann algebra generated by $\{\tau_t(A) \mid A \in \mathcal{U}_{AW}(I),\ |t| < \alpha\}$.

Proposition 6.5. Set $\mathcal{B}(I) := \bigcap_{\alpha > 0} \mathcal{B}_\alpha(I)$. Then
$$\mathcal{B}(I) = \mathcal{R}_{AW}(I). \tag{6.13}$$

Proof. Let us first prove that $\mathcal{B}(I) \subset \mathcal{R}_{AW}(I)$. Using (6.10) and $\mathcal{U}_{AW}(I) \subset \mathcal{R}_{AW}(I)$, we see that $\mathcal{B}_\alpha(I) \subset \mathcal{R}_{AW}\big(I +\, ]-\alpha, \alpha[\big)$ for all $\alpha > 0$. According to Lemma 6.2(i) this implies $\mathcal{B}(I) \subset \mathcal{R}_{AW}(I)$.
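Let us now prove that $\mathcal{R}_{AW}(I) \subset \mathcal{B}(I)$. Using Lemma 6.2(i) it suffices to show that for all $\bar{J} \subset I$ and $\alpha \ll 1$ one has
\[ \mathcal{R}_{AW}(J) \subset \mathcal{B}_\alpha(I). \tag{6.14} \]
To this end we fix $I$ and $J$ with $\bar{J} \subset I$ and set $\delta = \tfrac12 \operatorname{dist}(J, I^c)$. We will first prove that
\[ e^{itL_{AW}} A e^{-itL_{AW}} \in \mathcal{B}_\alpha(I), \quad A \in \mathcal{U}_{AW}(J),\ |t| < \alpha, \tag{6.15} \]
if $\alpha < \delta$. The proof of Theorem 6.4 shows that for $|t| \le \delta$ the unitary group $e^{itH_I}$, with
\[ H_I := L_{AW} + V_I \quad\text{and}\quad V_I := \int_I :\!P(\phi(0,x))\!:\, dx, \]
induces the correct dynamics $\tau$ on $\mathcal{R}_{AW}(J)$. Applying then Proposition 2.5, we obtain
\[ e^{itL_{AW}} = \text{s-}\lim_{n\to\infty} e^{itH_I^{(n)}}, \quad t \in \mathbb{R}, \]
for $H_I^{(n)} = L_{AW} + V_I - V_I^{(n)}$, where $V_I^{(n)} = V_I \mathbf{1}_{\{|V_I| \le n\}}$. Since $V_I^{(n)}$ is bounded,
\[ H_I^{(n)} = L_{AW} + V_I - V_I^{(n)} = H_I - V_I^{(n)} \]
and hence by Trotter's formula
\[ e^{itH_I^{(n)}} = \text{s-}\lim_{p\to\infty} \bigl(e^{itH_I/p}\, e^{-itV_I^{(n)}/p}\bigr)^p. \]
This yields, for $A \in \mathcal{R}_{AW}(J)$,
\[ e^{itL_{AW}} A e^{-itL_{AW}} = \text{s-}\lim_{n\to\infty}\ \text{s-}\lim_{p\to\infty} \bigl(e^{itH_I/p}\, e^{-itV_I^{(n)}/p}\bigr)^p A \bigl(e^{itV_I^{(n)}/p}\, e^{-itH_I/p}\bigr)^p. \]
Using again Theorem 6.4 we obtain, for $|t| < \alpha$,
\[ \bigl(e^{itH_I/p}\, e^{-itV_I^{(n)}/p}\bigr)^p A \bigl(e^{itV_I^{(n)}/p}\, e^{-itH_I/p}\bigr)^p = \bigl(\tau_{t/p} \circ \gamma^{(n)}_{t/p}\bigr)^p(A), \]
where $\gamma^{(n)}$ is the dynamics implemented by the unitary group $t \mapsto e^{-itV_I^{(n)}}$. Since $V_I^{(n)}$ is affiliated to $\mathcal{U}_{AW}(I)$, $e^{-itV_I^{(n)}} \in \mathcal{U}_{AW}(I)$ and hence $\bigl(\tau_{t/p} \circ \gamma^{(n)}_{t/p}\bigr)^p(A) \in \mathcal{B}_\alpha(I)$ for $|t| < \alpha$. Since $\mathcal{B}_\alpha(I)$ is weakly closed, we obtain (6.15).

Let us now prove (6.14). Clearly the operators $W_{AW}(h)$ for $h \in \mathfrak{h}_J$ and $h$ real-valued belong to $\mathcal{U}_{AW}(J)$ and hence to $\mathcal{B}_\alpha(I)$. Let us now pick $h \in \mathfrak{h}_J \cap D(\epsilon)$ and $h$ real-valued. (This is possible; see the discussion presented at the end of Sec. 6.1.) Applying (6.15) to $A = W_{AW}(h)$, we obtain that $W_{AW}(e^{it\epsilon}h) \in \mathcal{B}_\alpha(I)$ for $|t| < \alpha$. Hence $W_{AW}\bigl(t^{-1}(e^{it\epsilon}h - h)\bigr) \in \mathcal{B}_\alpha(I)$ for $|t| < \alpha$. Letting $t \to 0$ and using the fact that the map $\mathfrak{h} \ni h \mapsto W_{AW}(h)$ is continuous for the strong operator topology, we obtain that $W_{AW}(i\epsilon h) \in \mathcal{B}_\alpha(I)$. But any vector $h \in \mathfrak{h}_J$ can be approximated in norm by vectors of the form $h_1 + i\epsilon h_2$, with $h_i \in \mathfrak{h}_J$ real and $h_2 \in D(\epsilon)$. This implies that for all $h \in \mathfrak{h}_J$ the operators $W_{AW}(h)$ belong to $\mathcal{B}_\alpha(I)$ and hence $\mathcal{R}_{AW}(J) \subset \mathcal{B}_\alpha(I)$. This completes the proof of the proposition.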
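6.5. Existence of the limiting state

Theorem 6.6 (Existence of Limiting State). Let $\{\omega_l\}_{l>0}$ be the family of $(\tau^l, \beta)$-KMS states for the spatially cutoff $P(\phi)_2$ models constructed in Sec. 5.3. Then
\[ \text{w-}\lim_{l\to+\infty} \omega_l =: \omega_\beta \]
exists on $\mathcal{A}$. The state $\omega_\beta$ on $\mathcal{A}$ has the following properties:
(i) $\omega_\beta$ is a $(\tau, \beta)$-KMS state on $\mathcal{A}$;
(ii) $\omega_\beta$ is locally normal, i.e., if $I$ is an open and bounded interval, then $\omega_\beta|_{\mathcal{R}_{AW}(I)}$ is normal w.r.t. the Araki-Woods representation;
(iii) $\omega_\beta$ is invariant under spatial translations, i.e., $\omega_\beta(\alpha_x(A)) = \omega_\beta(A)$ for $x \in \mathbb{R}$, $A \in \mathcal{A}$;
(iv) $\omega_\beta$ has the spatial clustering property, i.e.,
\[ \lim_{x\to\infty} \omega_\beta(A\alpha_x(B)) = \omega_\beta(A)\,\omega_\beta(B) \quad \forall\, A, B \in \mathcal{A}. \]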
Remark 6.7. Let $\mathcal{R}$ be a $C^*$-algebra, $\pi_i : \mathcal{R} \to \mathcal{B}(\mathcal{H}_i)$, $i = 1, 2$, two quasi-equivalent representations of $\mathcal{R}$. Then there exists a $*$-isomorphism $\tau$ between $\pi_1(\mathcal{R})''$ and $\pi_2(\mathcal{R})''$ intertwining the two representations. This isomorphism is automatically weakly continuous. Therefore the representation $\pi_2$ extends uniquely from $\mathcal{R}$ to $\pi_1(\mathcal{R})''$ and is quasi-equivalent to the concrete representation of $\pi_1(\mathcal{R})''$ in $\mathcal{B}(\mathcal{H}_1)$. Applying this easy observation to the representations $\pi_{AW}$ and $\pi_F$ of $\mathcal{W}(\mathfrak{h}_I)$, which are quasi-equivalent by Lemma 6.3, we see that the Fock representation $\pi_F$ extends by weak continuity from $\pi_{AW}(\mathcal{W}(\mathfrak{h}_I))$ to $\mathcal{R}_{AW}(I)$ and is quasi-equivalent to the Araki-Woods representation. Since two quasi-equivalent representations have the same set of normal states, we obtain that $\omega_\beta|_{\mathcal{R}_{AW}(I)}$ is also normal with respect to the Fock representation.

Proof. The family $\{\omega_l\}_{l>0}$ of states on $\mathcal{A}$ is weak$^*$ compact by the Banach-Alaoglu theorem. Let $\omega_1$ be one of the limit points of $\{\omega_l\}_{l>0}$. Then we can find a subnet (see footnote a) $\{\omega^r\}_{r\in R}$ such that $\omega_1 = \text{w-}\lim_{r\in R} \omega^r$. We claim that $\omega_1$ is a $(\tau, \beta)$-KMS state. Let $A, B \in \mathcal{A}$. Writing
\[ \omega_1(A\tau_t(B)) - \omega^r\bigl(A\tau_t^{l_r}(B)\bigr) = (\omega_1 - \omega^r)(A\tau_t(B)) + \omega^r\bigl(A\tau_t(B) - A\tau_t^{l_r}(B)\bigr) \]
and using that $\lim_{l\to\infty} \|\tau_t^l(A) - \tau_t(A)\| = 0$ for $A \in \mathcal{A}$ and $t \in \mathbb{R}$ fixed, we find
\[ \omega_1(A\tau_t(B)) = \lim_{r\in R} \omega^r\bigl(A\tau_t^{l_r}(B)\bigr), \quad t \in \mathbb{R}. \tag{6.16} \]
The same argument shows
\[ \omega_1(\tau_t(B)A) = \lim_{r\in R} \omega^r\bigl(\tau_t^{l_r}(B)A\bigr), \quad t \in \mathbb{R}. \tag{6.17} \]

Footnote a: A net $\{y_\beta\}_{\beta\in B}$ is a subnet of a net $\{x_\alpha\}_{\alpha\in A}$ if there exists a map $B \ni \beta \mapsto \alpha(\beta) \in A$ such that: (i) $y_\beta = x_{\alpha(\beta)}$ for all $\beta \in B$; (ii) for all $\alpha_0 \in A$ there exists some $\beta_0$ such that $\alpha(\beta) \ge \alpha_0$ whenever $\beta \ge \beta_0$.

Since the $\omega^r$'s are $(\tau^{l_r}, \beta)$-KMS states there exist functions $F^r(z)$, which are holomorphic in $I_\beta^+ = \{0 < \operatorname{Im} z < \beta\}$ and continuous in $\bar{I}_\beta^+$, such that
\[ F^r(t) = \omega^r\bigl(A\tau_t^{l_r}(B)\bigr) \quad\text{and}\quad F^r(t + i\beta) = \omega^r\bigl(\tau_t^{l_r}(B)A\bigr). \]
Moreover, one has $\sup_{z\in I_\beta^+} |F^r(z)| \le \|A\|\,\|B\|$. Applying Vitali's theorem and possibly extracting a subnet, we know that $\lim_{r\to\infty} F^r(z) = F(z)$ exists and is holomorphic and bounded in $I_\beta^+$. By Lemma B.3, we obtain that $F$ is continuous on $\bar{I}_\beta^+$ and
\[ F(t) = \lim_{r\to\infty} F^r(t), \quad F(t + i\beta) = \lim_{r\to\infty} F^r(t + i\beta). \]
Using (6.16) and (6.17) this implies that $\omega_1$ is a $(\tau, \beta)$-KMS state.

We now apply a result of Takesaki and Winnink [32]: clearly $I \mapsto \{\mathcal{R}_{AW}(I)\}$ is a net of von Neumann algebras (see [32, Sec. 2]). The algebras $\mathcal{R}_{AW}(I)$ are $\sigma$-finite, since the Hilbert space $\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})$ on which they act is separable. The algebras $\mathcal{R}_{AW}(I)$ have separable preduals since there exists a faithful normal representation (namely the defining Araki-Woods representation $\pi_{AW}$) of $\mathcal{R}_{AW}(I)$ on a separable Hilbert space (namely the Araki-Woods Hilbert space $\mathcal{H}_{AW}$). Moreover, as factors on a separable infinite dimensional Hilbert space they are properly infinite. Applying the following theorem [32, Theorem 1], we obtain that the KMS state $\omega_1$ is normal on $\mathcal{R}_{AW}(I)$.

Theorem 6.8. Let $\mathcal{A}$ be a $C^*$-algebra, $\mathbb{R} \ni t \mapsto \tau_t$ a one-parameter group of $*$-automorphisms of $\mathcal{A}$ and $\omega$ a $(\tau, \beta)$-KMS state on $\mathcal{A}$. If there exists a net of $\sigma$-finite, properly infinite von Neumann algebras $\mathcal{M}_\alpha$ with separable preduals such that
(i) to all pairs $\mathcal{M}_\alpha$, $\mathcal{M}_\beta$ in $\{\mathcal{M}_\alpha\}_{\alpha\in\Gamma}$ there exists $\mathcal{M}_\gamma$ with the property $\mathcal{M}_\alpha \cup \mathcal{M}_\beta \subset \mathcal{M}_\gamma$;
(ii) every $\mathcal{M}_\alpha$ contains the unit of $\mathcal{A}$;
(iii) $\mathcal{A}$ is the norm closure of the union of the von Neumann algebras $\mathcal{M}_\alpha$, i.e.,
\[ \mathcal{A} := \overline{\bigcup_{\alpha\in\Gamma} \mathcal{M}_\alpha}^{(C^*)}, \]
then $\omega$ is locally normal, i.e., the restriction of $\omega$ to each von Neumann algebra $\mathcal{M}_\alpha$ is a normal state.
Let us now show that all limit states are identical. Let us denote by $\mathcal{U}_0(I)$ the abelian $C^*$-algebra generated by
\[ \bigl\{F(\phi_{AW}(h)) \mid h \in C^\infty_{0\,\mathbb{R}}(I),\ F \in C_0^\infty(\mathbb{R})\bigr\} \]
and by $\mathcal{T}_\alpha(I)$ the $*$-algebra generated by $\{\tau_t(A) \mid A \in \mathcal{U}_0(I),\ |t| < \alpha\}$. From Theorem 6.4 and Proposition 7.6 we deduce that
\[ \lim_{l\to\infty} \omega_l\Bigl(\prod_1^n \tau_{t_i}(A_i)\Bigr) = \tilde\omega\Bigl(\prod_1^n \tilde\tau_{t_i}(A_i)\Bigr) = \omega_1\Bigl(\prod_1^n \tau_{t_i}(A_i)\Bigr), \quad A_i \in \mathcal{U}_0(I),\ t_i \in \mathbb{R}, \]
where $\tilde\omega$ and $\tilde\tau$ are defined in Sec. 7.2. Therefore all weak accumulation points of $\{\omega_l\}_{l>0}$ coincide on the algebras $\mathcal{T}_\alpha(I) \subset \mathcal{R}_{AW}\bigl(I + \,]-\alpha,\alpha[\,\bigr)$. We note that $\mathcal{T}_\alpha(I)$ is weakly dense in the von Neumann algebra $\mathcal{B}_\alpha(I)$ defined in (6.13). Moreover, we have seen that all limit states are normal on the local algebras $\mathcal{R}_{AW}(I)$, $I$ open and bounded. Therefore they coincide on the von Neumann algebras $\mathcal{B}_\alpha(I)$, and hence by Proposition 6.5 on $\mathcal{R}_{AW}(I)$. Consequently, they also coincide on the norm closure $\mathcal{A}$. Thus the weak$^*$ compact family $\{\omega_l\}_{l>0}$ has a unique accumulation point, which implies that
\[ \omega_\beta := \text{w-}\lim_{l\to\infty} \omega_l \quad\text{exists on } \mathcal{A}. \]
We have already seen that $\omega_\beta$ is a locally normal $(\tau, \beta)$-KMS state on $\mathcal{A}$, which completes the proof of (i) and (ii). Property (iii) follows from the invariance of the state $\tilde\omega$ under space translations shown in Lemma 7.7 and the same density argument as above.

It remains to prove (iv). Let $(\mathcal{H}_\beta, \pi_\beta, \Omega_\beta)$ denote the GNS objects associated to $(\mathcal{A}, \omega_\beta)$. The group $\{\alpha_x\}_{x\in\mathbb{R}}$ is implemented in $\mathcal{H}_\beta$ by a strongly continuous group of unitary operators $\{e^{ixP_\beta}\}_{x\in\mathbb{R}}$ with $P_\beta\Omega_\beta = 0$. Lemma 7.7(ii) implies that, for $A, B \in \mathcal{T}_\alpha(I)$,
\[ \lim_{x\to\infty} \bigl(\pi_\beta(A)\Omega_\beta,\ e^{ixP_\beta}\pi_\beta(B)\Omega_\beta\bigr) = \bigl(\pi_\beta(A)\Omega_\beta, \Omega_\beta\bigr)\bigl(\Omega_\beta, \pi_\beta(B)\Omega_\beta\bigr). \tag{6.18} \]
Since $\mathcal{R}_{AW}(I)$ is a factor, the representation $\pi_\beta$ provides a weakly continuous $*$-isomorphism between $\mathcal{R}_{AW}(I)$ and $\mathcal{R}_\beta(I) = \pi_\beta(\mathcal{R}_{AW}(I)) = \pi_\beta(\mathcal{R}_{AW}(I))''$. Hence, by the same weak density argument as above, (6.18) extends to all $A, B \in \mathcal{R}_{AW}(I)$. Thus the space clustering property holds on $\mathcal{R}_{AW}(I)$ for all $I$, $I$ open and bounded, and extends to $\mathcal{A}$ by norm density.

7. Construction of the Interacting Path Space

In this section we construct the interacting path space supported by $S'_{\mathbb{R}}(S_\beta \times \mathbb{R})$ describing the translation invariant $P(\phi)_2$ model at temperature $\beta^{-1}$ and study some of its properties.
7.1. Construction of the interacting measure

Let $H_C^{\mathrm{ren}} = H_C - E_C$ be the renormalized $P(\phi)_2$ Hamiltonian on $S_\beta$ defined in Sec. 5.2. Let $f \in S_{\mathbb{R}}(S_\beta \times \mathbb{R})$. For $x \in \mathbb{R}$ the function $f_x$ defined in Sec. 3.5 belongs to $S_{\mathbb{R}}(S_\beta)$. We will apply the results of Appendix A to the selfadjoint operator $H = H_C^{\mathrm{ren}}$, $R(x) = \phi_F(f_x)$ (replacing the variable $t$ in Appendix A by the variable $x$). It follows from the bound (5.7) in Proposition 5.4 and the fact that the map
\[ \mathbb{R} \to \mathcal{B}\bigl(\Gamma(H^{-1/2}(S_\beta))\bigr), \quad x \mapsto \phi(f_x)(H_C^{\mathrm{ren}} + 1)^{-1/2} \]
is infinitely differentiable that the hypothesis (A.3) in Sec. A.1 is satisfied. Similarly, using the bound (5.8) and the fact that the map $x \mapsto \|f_x\|_{H^{-1/2}(S_\beta)}$ is in $L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$, we see that hypotheses (A.7) and (A.8) in Sec. A.2 are satisfied. Therefore we can apply all the abstract results from Secs. A.1 and A.2. In particular there exists a solution $U(b, a)$ of the time-dependent heat equation:
\[ \frac{d}{db} U(b, a) = \bigl(-H_C^{\mathrm{ren}} + i\phi_F(f_b)\bigr) U(b, a), \quad U(a, a) = \mathbf{1}. \]
We will set for $-\infty \le a \le b \le +\infty$:
\[ W_{[a,b]}(f) := U(b, a)^*. \]

Proposition 7.1. Let $f \in S_{\mathbb{R}}(S_\beta \times \mathbb{R})$ and assume that $\operatorname{supp} f \subset S_\beta \times [-a, a]$. Then
\[ \int_Q e^{i\phi(f)} G_{[-l,l]}\, d\phi_C = e^{-2lE_C} \bigl(e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ,\ W_{[-a,a]}(f)\, e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ\bigr), \]
where $\Omega_C^\circ$ is the free vacuum on $\Gamma\bigl(H^{-1/2}(S_\beta)\bigr)$.

Proof. Let us first introduce a notation which we will use throughout the proof. If $A$ is a $\Sigma$-measurable function on $Q$, the image of $A$ under $U_C(x)$ for $x \in \mathbb{R}$ will be denoted by $U_C(x)(A)$. On the other hand, the expression $U_C(x)A$ will denote the operator product of the operator $U_C(x)$ and the operator of multiplication by $A$, acting on $L^2(Q, \Sigma, d\phi_C)$. Using Lemma 3.1 we find
\[ e^{i\phi(f)} = e^{i\int_{-a}^{a} \phi(f_x, x)\,dx} = e^{i\int_{-a}^{a} U_C(x)(\phi(f_x, 0))\,dx}. \]
We will approximate the above integral using Riemann sums. Let $n, p \in \mathbb{N}$ and $0 \le j \le 2np$. We set $x_j = -a + j\frac{a}{np}$ and $z_j = -a + [j/p]\frac{a}{n}$, where $[\cdot]$ denotes the integer part.
It follows from (3.15) that the map $x \mapsto \phi(f_x, x) \in \bigcap_{1\le p<\infty} L^p(Q, \Sigma, d\phi_C)$ is continuous. Therefore
\[ \int_{-a}^{a} U_C(x)\bigl(\phi(f_x, 0)\bigr)\,dx = \lim_{n,p\to\infty} \sum_{j=0}^{2np-1} (x_{j+1} - x_j)\, U_C(x_j)\bigl(\phi(f_{z_j}, 0)\bigr) \]
in $\bigcap_{1\le p<\infty} L^p(Q, \Sigma, d\phi_C)$ and hence
\begin{align*}
e^{i\int_{-a}^{a} U_C(x)(\phi(f_x,0))\,dx}
 &= \lim_{n,p\to\infty} \prod_{j=0}^{2np-1} e^{i(x_{j+1}-x_j)\,U_C(x_j)(\phi(f_{z_j},0))} \\
 &= \lim_{n,p\to\infty} \prod_{j=0}^{2np-1} U_C(x_j)\bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\bigr) \tag{7.1}
\end{align*}
in $\bigcap_{1\le p\le\infty} L^p(Q, \Sigma, d\phi_C)$, where in the last line we use the fact that $U_C(x_j)$ is an automorphism of $L^\infty(Q, \Sigma, d\phi_C)$. Since $G_{[a,b]}$ is a FKN kernel,
\begin{align*}
G_{[-l,l]} &= G_{[-l,-a]} \prod_{j=0}^{2np-1} G_{[x_j,x_{j+1}]}\; G_{[a,l]} \\
 &= G_{[-l,-a]} \prod_{j=0}^{2np-1} U_C(x_j)\bigl(G_{[0,x_{j+1}-x_j]}\bigr)\; G_{[a,l]}.
\end{align*}
Therefore
\[ G_{[-l,l]} \prod_{j=0}^{2np-1} U_C(x_j)\bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\bigr) = G_{[-l,-a]} \prod_{j=0}^{2np-1} U_C(x_j)\bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\, G_{[0,x_{j+1}-x_j]}\bigr)\; G_{[a,l]}. \]
Next let $A_j$, $0 \le j \le 2np - 1$, be the multiplication operators by $\Sigma$-measurable functions. Using the identity $U_C(x_j)(A_j) = U_C(x_j) A_j U_C(-x_j)$ and the fact that $U_C(x)$ is an automorphism of $L^\infty(Q, \Sigma, d\phi_C)$, we obtain as an operator identity on $L^2(Q, \Sigma, d\phi_C)$:
\[ \prod_{j=0}^{2np-1} U_C(x_j)(A_j) = U_C(x_0) \prod_{j=0}^{2np-1} \bigl(A_j\, U_C(x_{j+1} - x_j)\bigr)\, U_C(-x_{2np}). \]
In the above identity the product on the left hand side is the operator of multiplication by the product of the functions $U_C(x_j)(A_j)$ and the product on the right hand side is an operator product. Using that $x_0 = -x_{2np} = -a$ and that $U_C(-a)^* = U_C(a)$
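we get
\begin{align*}
&\int_Q G_{[-l,l]} \prod_{j=0}^{2np-1} U_C(x_j)\bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\bigr)\, d\phi_C \\
&\quad= \int_Q G_{[-l,-a]}\, U_C(-a) \prod_{j=0}^{2np-1} \bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\, G_{[0,x_{j+1}-x_j]}\, U_C(x_{j+1}-x_j)\bigr) \times U_C(-a)\, G_{[a,l]}\, d\phi_C \\
&\quad= \int_Q G_{[-l+a,0]} \prod_{j=0}^{2np-1} \bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\, U_V(x_{j+1}-x_j)\bigr)\, G_{[0,l-a]}\, d\phi_C
\end{align*}
for $U_V(s) = G_{[0,s]} U_C(s)$. Let us now set for $0 \le k \le 2n$, $y_k = -a + k\frac{a}{n}$. We note that $z_j = y_{[j/p]}$ and that $(x_{j+1} - x_j) = \frac{a}{np} = (y_{k+1} - y_k)/p$. We obtain that
\begin{align*}
&\int_Q G_{[-l,l]} \prod_{j=0}^{2np-1} U_C(x_j)\bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\bigr)\, d\phi_C \\
&\quad= \int_Q G_{[-l+a,0]} \prod_{k=0}^{2n-1} \Bigl(e^{i(y_{k+1}-y_k)\phi(f_{y_k},0)/p}\, U_V\Bigl(\frac{y_{k+1}-y_k}{p}\Bigr)\Bigr)^{p} G_{[0,l-a]}\, d\phi_C \\
&\quad= \int_Q R_C\bigl(G_{[0,l-a]}\bigr) \prod_{k=0}^{2n-1} \Bigl(e^{i(y_{k+1}-y_k)\phi(f_{y_k},0)/p}\, U_V\Bigl(\frac{y_{k+1}-y_k}{p}\Bigr)\Bigr)^{p} G_{[0,l-a]}\, d\phi_C.
\end{align*}
Taking into account the construction of $H_C$ recalled in Sec. 5.2 we find
\begin{align*}
&\int_Q G_{[-l,l]} \prod_{j=0}^{2np-1} U_C(x_j)\bigl(e^{i(x_{j+1}-x_j)\phi(f_{z_j},0)}\bigr)\, d\phi_C \\
&\quad= \Bigl(e^{-(l-a)H_C}\Omega_C^\circ,\ \prod_{k=0}^{2n-1} \bigl(e^{i(y_{k+1}-y_k)\phi(f_{y_k})/p}\, e^{-(y_{k+1}-y_k)H_C/p}\bigr)^{p} e^{-(l-a)H_C}\Omega_C^\circ\Bigr) \\
&\quad= e^{-2lE_C} \Bigl(e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ,\ \prod_{k=0}^{2n-1} \bigl(e^{i(y_{k+1}-y_k)\phi(f_{y_k})/p}\, e^{-(y_{k+1}-y_k)H_C^{\mathrm{ren}}/p}\bigr)^{p} \times e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ\Bigr).
\end{align*}
Letting now $n$ and $p$ tend to $\infty$ and using Proposition A.5, we obtain the proposition.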
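As a consistency check on the last step of this proof, the following worked computation sketches where the prefactor $e^{-2lE_C}$ comes from. It assumes only the relation $H_C = H_C^{\mathrm{ren}} + E_C$ with $E_C$ a scalar, and the grid $y_k = -a + ka/n$, $0 \le k \le 2n$, used in the proof; the symbol $T_{\mathrm{tot}}$ is shorthand introduced here for illustration and is not notation from the text.

```latex
% Each factor e^{-sH_C} equals e^{-sE_C} e^{-sH_C^{ren}}, since E_C is a scalar.
% Total "Euclidean time" carried by all heat-semigroup factors in the scalar product:
\begin{align*}
T_{\mathrm{tot}}
  &= (l-a) \;+\; \sum_{k=0}^{2n-1} p\cdot\frac{y_{k+1}-y_k}{p} \;+\; (l-a) \\
  &= 2(l-a) + (y_{2n}-y_0) \\
  &= 2(l-a) + 2a \;=\; 2l .
\end{align*}
% Hence the scalar factors multiply to
\[
  e^{-T_{\mathrm{tot}}E_C} = e^{-2lE_C},
\]
% independently of n and p, which is the prefactor in Proposition 7.1.
```

The point of the check is that the prefactor does not depend on the discretization parameters, so it can be pulled out before letting $n, p \to \infty$.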
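Theorem 7.2.
(i) Let $f \in C^\infty_{0\,\mathbb{R}}(S_\beta \times \mathbb{R})$. Then
\[ \lim_{l\to+\infty} \int_Q e^{i\phi(f)}\, d\mu_l = \bigl(\Omega_C, W_{[-\infty,\infty]}(f)\Omega_C\bigr), \]
where $\Omega_C$ is the unique vacuum state of $H_C$;
(ii) The map $S_{\mathbb{R}}(S_\beta \times \mathbb{R}) \ni f \mapsto \bigl(\Omega_C, W_{[-\infty,\infty]}(f)\Omega_C\bigr)$ is the generating functional of a Borel probability measure $\mu$ on $(Q, \Sigma)$;
(iii) The measure $\mu$ is invariant under space translations $\{a_x\}_{x\in\mathbb{R}}$, time translations $\{T_t\}_{t\in S_\beta}$ and the time reflection $r$;
(iv) The functions $\phi(f)$ belong to $\bigcap_{1\le p<\infty} L^p(Q, \Sigma, \mu)$ for $f \in S_{\mathbb{R}}(S_\beta \times \mathbb{R})$. Moreover,
\[ \int_Q \phi(f)^n\, d\mu = n! \int_{-\infty<x_1\le\cdots\le x_n<\infty} \Bigl(\Omega_C,\ \prod_{k=1}^{n-1} \bigl(\phi(f_{x_k})\, e^{-(x_{k+1}-x_k)H_C^{\mathrm{ren}}}\bigr)\, \phi(f_{x_n})\Omega_C\Bigr)\, dx_1 \cdots dx_n; \]
(v) Let $f_i \in C^\infty_{0\,\mathbb{R}}(S_\beta \times \mathbb{R})$ for $1 \le i \le n$. Then
\[ \lim_{l\to+\infty} \int_Q \prod_{i=1}^n \phi(f_i)\, d\mu_l = \int_Q \prod_{i=1}^n \phi(f_i)\, d\mu. \]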
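To make the moment formula in part (iv) concrete, the following block writes out its two lowest cases. This is only an illustrative sketch: it reads the formula in (iv) as a time-ordered integral with the product running over $k = 1, \dots, n-1$ (the extracted source is ambiguous on the index range), and it uses no input beyond the statement of the theorem.

```latex
% n = 1: the product over k is empty.
\[
  \int_Q \phi(f)\, d\mu
  = \int_{\mathbb{R}} \bigl(\Omega_C,\ \phi(f_{x_1})\,\Omega_C\bigr)\, dx_1 .
\]
% n = 2: one heat-semigroup factor between the two field insertions,
% integrated over the ordered region x_1 <= x_2, with prefactor 2! = 2.
\[
  \int_Q \phi(f)^2\, d\mu
  = 2 \int_{-\infty<x_1\le x_2<\infty}
    \bigl(\Omega_C,\ \phi(f_{x_1})\, e^{-(x_2-x_1)H_C^{\mathrm{ren}}}\,
    \phi(f_{x_2})\,\Omega_C\bigr)\, dx_1\, dx_2 .
\]
```

The factor $n!$ compensates for restricting the integration to one ordering of the variables $x_1, \dots, x_n$.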
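Proof. Note first that applying Proposition 7.1 for $f = 0$, where $W_{[-a,a]}(0) = e^{-2aH_C^{\mathrm{ren}}}$, we obtain
\[ \int_Q G_{[-l,l]}\, d\phi_C = e^{-2lE_C} \bigl(e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ,\ e^{-2aH_C^{\mathrm{ren}}} e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ\bigr) = e^{-2lE_C} \bigl(e^{-lH_C^{\mathrm{ren}}}\Omega_C^\circ,\ e^{-lH_C^{\mathrm{ren}}}\Omega_C^\circ\bigr). \]
Let $f \in C^\infty_{0\,\mathbb{R}}(S_\beta \times \mathbb{R})$ with $\operatorname{supp} f \subset S_\beta \times [-a, a]$ for some $a \in \mathbb{R}$. Using Proposition 7.1 we find
\[ \int_Q e^{i\phi(f)}\, d\mu_l = \frac{\bigl(e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ,\ W_{[-a,a]}(f)\, e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ\bigr)}{\bigl(e^{-lH_C^{\mathrm{ren}}}\Omega_C^\circ,\ e^{-lH_C^{\mathrm{ren}}}\Omega_C^\circ\bigr)}. \]
Now $\lim_{l\to+\infty} e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ = (\Omega_C, \Omega_C^\circ)\Omega_C$, where $\Omega_C$ is the eigenvector for the simple eigenvalue $\{0\}$ of $H_C^{\mathrm{ren}}$. Thus
\[ \lim_{l\to+\infty} \int_Q e^{i\phi(f)}\, d\mu_l = \bigl(\Omega_C, W_{[-a,a]}(f)\Omega_C\bigr). \]
Because $\operatorname{supp} f \subset S_\beta \times [-a, a]$, we see that $\bigl(\Omega_C, W_{[s,t]}(f)\Omega_C\bigr)$ is constant for $s \le -a$, $t \ge a$, which proves (i).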
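To prove (ii) we apply Minlos theorem (see e.g. [13]). As a limit of functionals of Borel probability measures on $(Q, \Sigma)$ the functional $f \mapsto \bigl(\Omega_C, W_{[-\infty,\infty]}(f)\Omega_C\bigr)$ is of positive type. It remains to show that the map
\[ S(S_\beta \times \mathbb{R}) \to \mathbb{C}, \quad f \mapsto \bigl(\Omega_C, W_{[-\infty,\infty]}(f)\Omega_C\bigr) \]
is continuous. Using the bound (5.8) we obtain
\[ \pm\bigl(\phi_F(f_{2,x}) - \phi_F(f_{1,x})\bigr) \le C\, r(x)\, (H_C^{\mathrm{ren}} + 1)^{1/2} \quad\text{for } f_1, f_2 \in S(S_\beta \times \mathbb{R}), \]
where $C > 0$ is some constant and $r(x) := \|f_{2,x} - f_{1,x}\|_{H^{-1/2}(S_\beta)}$. Clearly $\|r\|_{L^2(\mathbb{R})} \le C\|f_1 - f_2\|_p$, where $\|\cdot\|_p$ is a Schwartz semi-norm on $S(S_\beta \times \mathbb{R})$. Applying Lemma A.7 for $\delta = \tfrac12$, we obtain
\[ \|W_{[-\infty,+\infty]}(f_2) - W_{[-\infty,+\infty]}(f_1)\| \le C\|f_2 - f_1\|_p, \]
which proves the desired continuity result.

Let us now verify (iii). The measure $\mu$ is invariant under time translations and time reflection as the weak limit of the time translation and time reflection invariant measures $\mu_l$. The fact that $\mu$ is invariant under space translations follows directly from (i) and Remark A.9.

To prove (iv) we apply Lemma B.2, using the estimates in Proposition A.6(ii). We obtain that $\phi(f) \in \bigcap_{1\le p<\infty} L^p(Q, \Sigma, \mu)$. The formula in (iv) follows from Proposition A.6(iii).

It remains to prove (v). Let $f \in C^\infty_{0\,\mathbb{R}}(S_\beta \times \mathbb{R})$ with $\operatorname{supp} f \subset S_\beta \times [-a, a]$. We consider the family of functions
\[ u_l(\lambda) = \int_Q e^{i\lambda\phi(f)}\, d\mu_l \quad\text{for } \lambda \in \mathbb{C}. \]
Since $e^{\phi(f)} \in \bigcap_{1\le p<\infty} L^p(Q, \Sigma, d\phi_C)$ and $F^l_{[-\beta/2,\beta/2]} \in L^{1+\epsilon}(Q, \Sigma, d\phi_C)$, the functions $u_l(\lambda)$ are entire and
\[ \frac{d^n}{d\lambda^n} u_l(0) = i^n \int_Q \phi(f)^n\, d\mu_l. \tag{7.2} \]
Using Proposition 7.1 and $\lambda\phi(f) = \phi(\lambda f)$ for $\lambda \in \mathbb{R}$ we find
\[ u_l(\lambda) = \frac{\bigl(W_{[-a,a]}(\lambda f)\, e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ,\ e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ\bigr)}{\|e^{-lH_C^{\mathrm{ren}}}\Omega_C^\circ\|^2} \quad\text{for } \lambda \in \mathbb{R}. \]
The right hand side is an entire function by Lemma A.3. Therefore this identity extends to $\lambda \in \mathbb{C}$. Applying (5.8) and Proposition A.6(i) with $\delta = 1/2$ we obtain
\[ |u_l(\lambda)| \le e^{C|\operatorname{Im}\lambda|^2}, \quad l \in \mathbb{R}^+,\ \lambda \in \mathbb{C}. \tag{7.3} \]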
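We have seen above that
\[ \lim_{l\to\infty} u_l(\lambda) = \int_Q e^{i\lambda\phi(f)}\, d\mu \quad\text{for } \lambda \in \mathbb{R}. \]
By Vitali's theorem we obtain
\[ \lim_{l\to\infty} \frac{d^n}{d\lambda^n} u_l(0) = i^n \int_Q \phi(f)^n\, d\mu. \]
Using (7.2) and multi-linearity, this proves (v).

7.2. Existence and properties of sharp-time fields

Proposition 7.3. Let $h \in S_{\mathbb{R}}(\mathbb{R})$ and $t \in S_\beta$. Then the sequence $\phi\bigl(\delta_k(\cdot - t) \otimes h\bigr)$ is Cauchy in $\bigcap_{1\le p<\infty} L^p(Q, \Sigma, \mu)$ and hence
\[ \phi(t, h) := \lim_{k\to\infty} \phi\bigl(\delta_k(\cdot - t) \otimes h\bigr) \in \bigcap_{1\le p<\infty} L^p(Q, \Sigma, \mu). \]
Moreover, the map
\[ S_\beta \to \bigcap_{1\le p<\infty} L^p(Q, \Sigma, \mu), \quad t \mapsto \phi(t, h) \]
is continuous for each $h \in S_{\mathbb{R}}(\mathbb{R})$.

Proof. For $p \ge 1$ we have
\begin{align*}
&\int_Q \bigl(\phi(\delta_k(\cdot - t) \otimes h) - \phi(\delta_{k'}(\cdot - t) \otimes h)\bigr)^{2p}\, d\mu \\
&\quad= (-i)^{2p} \frac{d^{2p}}{d\lambda^{2p}} \Bigl(\Omega_C,\ W_{[-\infty,+\infty]}\bigl(\lambda(\delta_k(\cdot - t) \otimes h - \delta_{k'}(\cdot - t) \otimes h)\bigr)\Omega_C\Bigr)\Big|_{\lambda=0}.
\end{align*}
If $f = \delta_k(\cdot - t) \otimes h$, then for $x \in \mathbb{R}$ the function $f_x \in S_{\mathbb{R}}(S_\beta)$ is equal to $\delta_k(\cdot - t)h(x)$. It follows then from the estimate (5.9) in Proposition 5.4 that
\[ \pm\bigl(\phi_F(\delta_k(\cdot - t)h(x)) - \phi_F(\delta_{k'}(\cdot - t)h(x))\bigr) \le c\, \|\delta_k(\cdot - t) - \delta_{k'}(\cdot - t)\|_{H^{-1}(S_\beta)}\, |h(x)|\, \bigl(H_C^{\mathrm{ren}} + 1\bigr). \]
Applying now Lemma A.8 we obtain that
\[ \Bigl\|\frac{d^{2p}}{d\lambda^{2p}} W_{[-\infty,+\infty]}\bigl(\lambda(\delta_k(\cdot - t) \otimes h - \delta_{k'}(\cdot - t) \otimes h)\bigr)\Bigr\| \le c_p\, \|\delta_k(\cdot - t) - \delta_{k'}(\cdot - t)\|^{2p}_{H^{-1}(S_\beta)}\, \|h\|_\infty^{2p}\, e^{\|h\|_1 \|h\|_\infty^{-1}}. \tag{7.4} \]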
Since $\delta_k(\cdot - t)$ converges to $\delta(\cdot - t)$ in $H^{-1}(S_\beta)$, we see that $\phi\bigl(\delta_k(\cdot - t) \otimes h\bigr)$ is Cauchy in $L^{2p}(Q, \Sigma, \mu)$. A similar argument shows that $t \mapsto \phi(t, h) \in L^{2p}(Q, \Sigma, \mu)$ is continuous, using the fact that $t \mapsto \delta(\cdot - t) \in H^{-1}(S_\beta)$ is continuous.

Using the existence of sharp-time fields, we can equip the probability space $(Q, \Sigma, \mu)$ with an OS-positive $\beta$-periodic path space structure: We recall that $U(t)$ is the group of transformations generated by the (euclidean) time translations $T_t$ and $R$ is the transformation generated by time reflection, and $\Sigma_0$ is the sub-$\sigma$-algebra of $\Sigma$ generated by the functions $\{\phi(0, h) \mid h \in S_{\mathbb{R}}(\mathbb{R})\}$.

Theorem 7.4. $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ is an OS-positive $\beta$-periodic generalized path space.

Proof. Since the measure $\mu$ is invariant under time translations and time reflection, we see that $U(t)$ and $R$ are measure-preserving automorphisms of $L^\infty(Q, \Sigma, \mu)$. Proposition 7.3 implies that the map $S_\beta \ni t \mapsto e^{i\phi(t,h)} \in L^2(Q, \Sigma, \mu)$ is continuous. Hence $U(t)$ is a strongly continuous group on $L^2(Q, \Sigma, \mu)$. This implies that $U(t)$ is strongly continuous in measure on $L^\infty(Q, \Sigma, \mu)$. Clearly it is $\beta$-periodic. The generalized path space $(Q, \Sigma, \Sigma_0, U(t), R, \mu)$ is OS-positive, since $\mu$ is the weak limit of the measures $\mu_l$, which are associated to OS-positive path spaces. Finally we have already seen that $\Sigma = \bigvee_{t\in S_\beta} \Sigma_t$. This completes the proof of the theorem.

By the reconstruction theorem, we obtain a stochastically positive $\beta$-KMS system $(\tilde{\mathcal{B}}, \tilde{\mathcal{U}}, \tilde\tau, \tilde\omega)$ which describes the translation invariant $P(\phi)_2$ model at temperature $\beta^{-1}$.

7.3. Properties of the interacting β-KMS system

We first prove the convergence of sharp-time Schwinger functions.

Proposition 7.5. Let $h_i \in C^\infty_{0\,\mathbb{R}}(\mathbb{R})$ and $t_i \in S_\beta$ for $1 \le i \le n$. Then
\[ \lim_{l\to\infty} \int_Q \prod_1^n e^{i\phi(t_j,h_j)}\, d\mu_l = \int_Q \prod_1^n e^{i\phi(t_j,h_j)}\, d\mu. \]

Proof. Let $a > 0$ such that $\operatorname{supp} h_j \subset [-a, a]$. By Proposition 7.3, we know that
\[ \phi(t_j, h_j) = \lim_{k\to\infty} \phi\bigl(\delta_k(\cdot - t_j) \otimes h_j\bigr) \quad\text{in } L^1(Q, \Sigma, \mu). \]
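After extracting a subsequence, this implies that
\[ \phi(t_j, h_j) = \lim_{k\to\infty} \phi\bigl(\delta_k(\cdot - t_j) \otimes h_j\bigr) \quad\text{pointwise } \mu\text{-a.e. on } Q \]
and hence
\begin{align*}
\int_Q \prod_1^n e^{i\phi(t_j,h_j)}\, d\mu
 &= \lim_{k\to\infty} \int_Q \prod_1^n e^{i\phi(\delta_k(\cdot - t_j) \otimes h_j)}\, d\mu \\
 &= \lim_{k\to\infty} \Bigl(\Omega_C,\ W_{[-a,a]}\Bigl(\sum_1^n \delta_k(\cdot - t_j) \otimes h_j\Bigr)\Omega_C\Bigr), \tag{7.5}
\end{align*}
by Theorem 7.2(i). Note that for all $l > 0$
\[ \phi(t_j, h_j) = \lim_{k\to\infty} \phi\bigl(\delta_k(\cdot - t_j) \otimes h_j\bigr) \quad\text{in } L^1(Q, \Sigma, \mu_l) \]
because this convergence holds in $L^2(Q, \Sigma, d\phi_C)$ and
\[ d\mu_l := \frac{G_{[-l,l]}\, d\phi_C}{\int_Q G_{[-l,l]}\, d\phi_C}, \]
where $G_{[-l,l]} \in L^2(Q, \Sigma, d\phi_C)$ as a consequence of (5.6). By the same arguments as above, we obtain
\[ \int_Q \prod_1^n e^{i\phi(t_j,h_j)}\, d\mu_l = \lim_{k\to\infty} \frac{\Bigl(e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ,\ W_{[-a,a]}\bigl(\sum_1^n \delta_k(\cdot - t_j) \otimes h_j\bigr)\, e^{-(l-a)H_C^{\mathrm{ren}}}\Omega_C^\circ\Bigr)}{\|e^{-lH_C^{\mathrm{ren}}}\Omega_C^\circ\|^2}. \]
Let us denote by $F(k, l)$ the quantity on the right hand side (before taking the limit in $k$). Applying (7.4) we obtain
\[ \lim_{k\to\infty} F(k, l) = \int_Q \prod_1^n e^{i\phi(t_j,h_j)}\, d\mu_l \quad\text{uniformly w.r.t. } l. \]
As we have seen, Theorem 7.2(i) implies
\[ \lim_{l\to\infty} F(k, l) = \int_Q \prod_1^n e^{i\phi(\delta_k(\cdot - t_j) \otimes h_j)}\, d\mu. \]
Applying now Lemma B.1(ii) and using (7.5) we obtain the proposition.

Let us denote by $\mathcal{U}_0 \subset \mathcal{B}\bigl(\Gamma(\mathfrak{h} \oplus \bar{\mathfrak{h}})\bigr)$ the $C^*$-algebra generated by
\[ \bigl\{F\bigl(\phi_{AW}(h_1), \dots, \phi_{AW}(h_n)\bigr) \mid h_i \in C^\infty_{0\,\mathbb{R}}(\mathbb{R}),\ F \in C_0^\infty(\mathbb{R}^n),\ n \in \mathbb{N}\bigr\}. \]
The isomorphism between $L^\infty(Q, \Sigma_0, d\phi_C)$ and $\mathcal{U}_{AW}$, which we recalled in Sec. 2.3, maps the operator $F\bigl(\phi_{AW}(h_1), \dots, \phi_{AW}(h_n)\bigr)$ onto the function $F\bigl(\phi(0, h_1), \dots, \phi(0, h_n)\bigr)$. This function is $\Sigma_0$-measurable. We will still denote by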
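$A$ the image of such a function $A$ in the abelian algebra $\tilde{\mathcal{U}}$ provided by the reconstruction theorem for the translation invariant $P(\phi)_2$ model.

Proposition 7.6. Let $A_i \in \mathcal{U}_0$ and $t_i \in \mathbb{R}$, $1 \le i \le n$. Then
\[ \lim_{l\to+\infty} \omega_l\Bigl(\prod_1^n \tau^l_{t_i}(A_i)\Bigr) = \tilde\omega\Bigl(\prod_1^n \tilde\tau_{t_i}(A_i)\Bigr). \]

Proof. Let us fix $A_i \in \mathcal{U}_0$ and set
\[ G^l(t_1, \dots, t_n) := \omega_l\Bigl(\prod_1^n \tau^l_{t_i}(A_i)\Bigr) \quad\text{and}\quad G(t_1, \dots, t_n) = \tilde\omega\Bigl(\prod_1^n \tilde\tau_{t_i}(A_i)\Bigr). \]
Due to the KMS condition, the functions $G^l$ and $G$ are holomorphic in
\[ I_\beta^{n+} = \bigl\{(z_1, \dots, z_n) \in \mathbb{C}^n \mid \operatorname{Im} z_i < \operatorname{Im} z_{i+1},\ \operatorname{Im} z_n - \operatorname{Im} z_1 < \beta\bigr\}, \]
continuous on $\overline{I_\beta^{n+}}$ and bounded by $\prod_1^n \|A_i\|$. We first claim that
\[ \lim_{l\to\infty} G^l(is_1, \dots, is_n) = G(is_1, \dots, is_n) \quad\text{for } s_1 \le \cdots \le s_n,\ s_n - s_1 \le \beta. \tag{7.6} \]
Using Proposition 7.5 and the identity (2.2) we see that (7.6) holds for $A_j = e^{i\phi_{AW}(h_j)}$, $h_j \in C^\infty_{0\,\mathbb{R}}(\mathbb{R})$. Using functional calculus we can extend (7.6) to arbitrary $A_j \in \mathcal{U}_0$. Let us now consider, for $s_2 \le \cdots \le s_n$ and $s_n - s_2 \le \beta$, the functions
\[ u_l(z) := G^l(z, is_2, \dots, is_n), \]
which are holomorphic in $\{0 < \operatorname{Im} z < s_2\}$ and continuous on $\{0 \le \operatorname{Im} z \le s_2\}$. Since the family $\{u_l\}$ is uniformly bounded, we can apply Lemma B.3. It follows that
\[ \lim_{l\to+\infty} G^l(t_1, is_2, \dots, is_n) = G(t_1, is_2, \dots, is_n) \]
for $s_2 \le \cdots \le s_n$, $s_n - s_2 \le \beta$ and $t_1 \in \mathbb{R}$. Iterating this argument, we obtain
\[ \lim_{l\to+\infty} G^l(t_1, \dots, t_n) = G(t_1, \dots, t_n). \]
This completes the proof of the proposition.

Let us denote by $\{\alpha_x\}_{x\in\mathbb{R}}$ the group of space translations on $\mathcal{U}_0$ defined by $\alpha_x\bigl(W_{AW}(h)\bigr) = W_{AW}\bigl(h(\cdot - x)\bigr)$ for $h \in C^\infty_{0\,\mathbb{R}}(\mathbb{R})$.

Lemma 7.7. Let $A_j \in \mathcal{U}_0$ and $t_j \in \mathbb{R}$, $1 \le j \le n$. Set $A = \prod_{j=1}^{k} \tau_{t_j}(A_j)$ and $B = \prod_{j=k+1}^{n} \tau_{t_j}(A_j)$. It follows that
(i) $\tilde\omega(\alpha_x(A)) = \tilde\omega(A)$ for all $x \in \mathbb{R}$;
(ii) $\lim_{x\to\infty} \tilde\omega(A\alpha_x(B)) = \tilde\omega(A)\,\tilde\omega(B)$.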
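Proof. Property (i) follows directly from the invariance of the measure $\mu$ under the space translations $\{\alpha_x\}_{x\in\mathbb{R}}$ shown in Theorem 7.2(iii). It remains to prove (ii). We set (the index $l$ below plays the role of the splitting index $k$ in the statement of the lemma)
\begin{align*}
G_x(t_1, \dots, t_n) &:= \tilde\omega\Bigl(\prod_{j=1}^{l} \tau_{t_j}(A_j) \prod_{j=l+1}^{n} \alpha_x \circ \tau_{t_j}(A_j)\Bigr), \\
G_\infty(t_1, \dots, t_n) &:= \tilde\omega\Bigl(\prod_{j=1}^{l} \tau_{t_j}(A_j)\Bigr) \cdot \tilde\omega\Bigl(\prod_{j=l+1}^{n} \tau_{t_j}(A_j)\Bigr).
\end{align*}
Due to the KMS condition, the functions $G_x$ and $G_\infty$ are holomorphic in $I_\beta^{n+}$ and bounded by $\prod_{j=1}^n \|A_j\|$. We claim that, for $s_1 \le \cdots \le s_n$ and $s_n - s_1 \le \beta$,
\[ \lim_{x\to\infty} G_x(is_1, \dots, is_n) = G_\infty(is_1, \dots, is_n). \tag{7.7} \]
Let us prove (7.7). Let us first assume that $A_j = e^{i\phi(0,h_j)}$ for $h_j \in C^\infty_{0\,\mathbb{R}}(\mathbb{R})$. Then
\begin{align*}
G_x(is_1, \dots, is_n) &= \int_Q \prod_{j=1}^{l} e^{i\phi(\delta(\cdot - s_j) \otimes h_j)} \prod_{j=l+1}^{n} e^{i\phi(\delta(\cdot - s_j) \otimes h_j(\cdot - x))}\, d\mu, \\
G_\infty(is_1, \dots, is_n) &= \int_Q \prod_{j=1}^{l} e^{i\phi(\delta(\cdot - s_j) \otimes h_j)}\, d\mu \times \int_Q \prod_{j=l+1}^{n} e^{i\phi(\delta(\cdot - s_j) \otimes h_j)}\, d\mu.
\end{align*}
By Proposition 7.3 we have
\begin{align*}
G_x(is_1, \dots, is_n) &= \lim_{k\to+\infty} \int_Q \prod_{j=1}^{l} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j)} \prod_{j=l+1}^{n} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j(\cdot - x))}\, d\mu, \\
G_\infty(is_1, \dots, is_n) &= \lim_{k\to\infty} \int_Q \prod_{j=1}^{l} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j)}\, d\mu \times \int_Q \prod_{j=l+1}^{n} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j)}\, d\mu.
\end{align*}
From Theorem 7.2 we get
\[ \int_Q \prod_{j=1}^{l} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j)} \prod_{j=l+1}^{n} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j(\cdot - x))}\, d\mu = \bigl(\Omega_C,\ W_{[-\infty,+\infty]}(R_{1,k} + a_x(R_{2,k}))\Omega_C\bigr) \]
and
\[ \int_Q \prod_{j=1}^{l} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j)}\, d\mu \times \int_Q \prod_{j=l+1}^{n} e^{i\phi(\delta_k(\cdot - s_j) \otimes h_j)}\, d\mu = \bigl(\Omega_C, W_{[-\infty,+\infty]}(R_{1,k})\Omega_C\bigr) \cdot \bigl(\Omega_C, W_{[-\infty,+\infty]}(R_{2,k})\Omega_C\bigr), \]
where
\[ R_{1,k} = \phi\Bigl(\sum_{j=1}^{l} \delta_k(\cdot - s_j) \otimes h_j\Bigr) \quad\text{and}\quad R_{2,k} = \phi\Bigl(\sum_{j=l+1}^{n} \delta_k(\cdot - s_j) \otimes h_j\Bigr). \]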
As before (see Sec. 4.1), the group of spatial translations induced on $Q$ by the map $(t, y) \mapsto (t, y + x)$ has been denoted by $\{a_x\}_{x\in\mathbb{R}}$. Applying Lemma A.10 we find
\[ \bigl|\bigl(\Omega_C, W(R_{1,k} + a_x(R_{2,k}))\Omega_C\bigr) - \bigl(\Omega_C, W(R_{1,k})\Omega_C\bigr)\bigl(\Omega_C, W(R_{2,k})\Omega_C\bigr)\bigr| \le e^{-(|x|-C)a}, \]
where $a > 0$ is the spectral gap of $H_C^{\mathrm{ren}}$ and $W(\cdot) := W_{[-\infty,+\infty]}(\cdot)$. Letting $k \to \infty$ and using Proposition 7.3 we obtain
\[ \bigl|G_x(is_1, \dots, is_n) - G_\infty(is_1, \dots, is_n)\bigr| \le e^{-(|x|-C)a}. \]
Using functional calculus, we conclude that (7.7) holds for all $A_j \in \mathcal{U}_0$. To complete the proof of the lemma, we can now argue as in the proof of Proposition 7.6, using Lemma B.3.

Acknowledgments

The authors would like to thank Jan Dereziński for useful discussions. C. Jäkel wants to thank Hanno Gottschalk for discussing Høegh-Krohn's original paper. The authors are also grateful to the referees for several suggestions and for pointing out the references [16-18], which contain related material. The second author was supported under the FP5 TMR program of the European Union by the Marie Curie fellowship HPMF-CT-2000-00881 and is currently supported by the IQN network of the DAAD. Both authors benefited from the IHP network HPRN-CT-2002-00277 of the European Union.

Appendix A. A Time-Dependent Heat Equation

Let $H \ge 0$ be a selfadjoint operator on a Hilbert space $\mathcal{H}$ and let $R(t)$, $t \in \mathbb{R}$, be a family of closed operators with $D(H^\gamma) \subset D(R(t))$ for some $0 \le \gamma < 1$. We consider the following time-dependent heat equation:
\[ \frac{d}{dt} U(t, s) = -\bigl(H + i\lambda R(t)\bigr) U(t, s), \quad s \le t, \qquad U(s, s) = \mathbf{1}. \tag{A.1} \]
This equation is (formally) equivalent to the following integral equation:
\[ U(t, s) = e^{-(t-s)H} - i\lambda \int_s^t e^{-(t-\tau)H} R(\tau) U(\tau, s)\, d\tau. \tag{A.2} \]
In the main text we only use the results of this section in the dissipative case, i.e., when $R(t)$ is selfadjoint for all $t \in \mathbb{R}$. However, part of the results are valid and will be proved in the general case. The solution of (A.1) will be denoted by $U(t, s)$ or $U_\lambda(t, s)$. If we want to display its dependence on the family $R(t)$, then the solution of (A.1) will be denoted by $U(t, s; R)$.
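The formal equivalence of (A.1) and (A.2) is the usual Duhamel (variation of constants) argument. The following block is a sketch of that computation only; it ignores all domain questions and assumes that $U(\tau, s)$ is differentiable in $\tau$ in a suitable sense.

```latex
% Differentiate the interpolating family tau -> e^{-(t-tau)H} U(tau,s), using (A.1):
\begin{align*}
\frac{d}{d\tau}\Bigl(e^{-(t-\tau)H}\,U(\tau,s)\Bigr)
  &= e^{-(t-\tau)H} H\, U(\tau,s)
     - e^{-(t-\tau)H}\bigl(H + i\lambda R(\tau)\bigr) U(\tau,s) \\
  &= -\,i\lambda\, e^{-(t-\tau)H} R(\tau)\, U(\tau,s).
\end{align*}
% Integrate over tau in [s,t]; the boundary terms are U(t,s) at tau = t
% and e^{-(t-s)H} at tau = s, because U(s,s) = 1:
\[
  U(t,s) - e^{-(t-s)H}
  = -\,i\lambda \int_s^t e^{-(t-\tau)H} R(\tau)\, U(\tau,s)\, d\tau ,
\]
% which is (A.2).
```

Conversely, differentiating (A.2) in $t$ formally returns (A.1), which is why the two formulations are treated as interchangeable in what follows.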
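A.1. Existence of solutions

We assume that the maps
\[ \mathbb{R} \to \mathcal{B}(\mathcal{H}),\ t \mapsto R(t)(H + 1)^{-\gamma} \quad\text{and}\quad \mathbb{R} \to \mathcal{B}(\mathcal{H}),\ t \mapsto R^*(t)(H + 1)^{-\gamma} \tag{A.3} \]
are Hölder continuous of some order $\epsilon > 0$. In the sequel we will use the following result.

Lemma A.1. Assume (A.3). Then
\[ \|e^{-(t-\tau)H} R(\tau)\| \le c_\gamma\, \|R(\tau)^*(H + 1)^{-\gamma}\|\, \bigl(|t - \tau|^{-\gamma} + 1\bigr) \quad \forall\, \tau \le t. \tag{A.4} \]

Proof. We have
\[ \|e^{-(t-\tau)H} R(\tau)\| \le \|R(\tau)^*(H + 1)^{-\gamma}\|\, \|(H + 1)^\gamma e^{-(t-\tau)H}\|. \]
This proves the lemma, using
\[ |(\lambda + 1)^\gamma e^{-s\lambda}| \le c_\gamma (s^{-\gamma} + 1) \quad\text{for } s, \lambda \ge 0. \tag{A.5} \]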
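The elementary bound (A.5) can be verified by a one-variable maximization. The block below is a sketch of that calculus argument for $0 < \gamma < 1$ and $s > 0$; the function name $g$ and the observation that $c_\gamma = 1$ is admissible are additions made here for illustration and are not taken from the text.

```latex
% Fix s > 0 and set g(\lambda) = (\lambda+1)^\gamma e^{-s\lambda} for \lambda >= 0.
\[
  g'(\lambda) = (\lambda+1)^{\gamma-1} e^{-s\lambda}\,\bigl(\gamma - s(\lambda+1)\bigr),
  \qquad\text{so } g'(\lambda)=0 \iff \lambda + 1 = \gamma/s .
\]
% Case s >= gamma: g is non-increasing on [0,\infty), hence
\[
  \sup_{\lambda\ge 0} g(\lambda) = g(0) = 1 .
\]
% Case 0 < s < gamma: the maximum is attained at \lambda_* = \gamma/s - 1 > 0, and
\[
  g(\lambda_*) = (\gamma/s)^{\gamma}\, e^{-(\gamma - s)}
  \;\le\; \gamma^{\gamma}\, s^{-\gamma} \;\le\; s^{-\gamma},
\]
% using e^{-(\gamma-s)} <= 1 and \gamma^\gamma <= 1 for 0 < \gamma < 1.
% In both cases
\[
  (\lambda+1)^{\gamma} e^{-s\lambda} \;\le\; \max\{1,\, s^{-\gamma}\} \;\le\; s^{-\gamma} + 1 .
\]
```

By the spectral theorem the same bound controls $\|(H+1)^\gamma e^{-sH}\|$ for $H \ge 0$, which is how (A.5) enters the proof of Lemma A.1.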
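The following result is shown in [20, Theorem 7.1.3].

Proposition A.2. There exists a unique solution $U(t, s)$ of (A.1) such that
(i) $U(s, s) = \mathbf{1}$ and $U(t, r)U(r, s) = U(t, s)$ for $s \le r \le t$;
(ii) $t \mapsto U(t, s) \in \mathcal{B}(\mathcal{H})$ is strongly continuous in $[s, +\infty[$ and strongly differentiable in $]s, +\infty[$.

Lemma A.3. The map $\lambda \mapsto U_\lambda(\cdot, s)\Psi \in C\bigl([s, +\infty[, \mathcal{H}\bigr)$ is entire analytic for each $\Psi \in \mathcal{H}$.

Proof. Let $\Psi \in \mathcal{H}$. For $s \le T < \infty$, set $V(t)\Psi = e^{-(t-s)H}\Psi$ and
\[ \|U(\cdot, s)\Psi\| := \sup_{t\in[s,T]} \|U(t, s)\Psi\|_{\mathcal{H}}. \]
If we define a map
\[ K : C([s, T], \mathcal{H}) \to C([s, T], \mathcal{H}), \quad W(\cdot) \mapsto -i \int_s^{(\cdot)} e^{-(\cdot - \tau)H} R(\tau) W(\tau)\, d\tau, \]
then the integral equation (A.2) can be rewritten as $(1 - K)(U(\cdot, s)\Psi) = V(\cdot)\Psi$. Now (A.4) implies
\[ \|KU(t, s)\Psi\| \le \|U(\cdot, s)\Psi\| \sup_{\tau\in[s,T]} \|R^*(\tau)(H + 1)^{-\gamma}\| \int_s^t \bigl(|t - \tau|^{-\gamma} + 1\bigr)\, d\tau, \]
and hence
\[ \|KU(\cdot, s)\Psi\| \le c \sup_{\tau\in[s,T]} \|R^*(\tau)(H + 1)^{-\gamma}\|\, |T - s|^{1-\gamma}\, \|U(\cdot, s)\Psi\|, \]
which shows that $K \in \mathcal{B}\bigl(C([s, T], \mathcal{H})\bigr)$. Then $U_\lambda(\cdot, s)\Psi$ solves $(1 - \lambda K)\, U_\lambda(\cdot, s)\Psi = V(\cdot)\Psi$, which implies that $\lambda \mapsto U_\lambda(\cdot, s)\Psi$ is entire analytic.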
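One way to see why boundedness of $K$ gives analyticity for all $\lambda \in \mathbb{C}$, and not only for $|\lambda| < \|K\|^{-1}$, is the standard Volterra estimate on the iterates $K^n$. The block below is a sketch under stated assumptions: $M$ and $\theta$ are names introduced here, and the argument is the usual one for weakly singular Volterra operators rather than a quotation of the authors' reasoning.

```latex
% Assumption: on s <= tau <= t <= T the kernel bound (A.4) is written as
%   || e^{-(t-\tau)H} R(\tau) || <= M (t-\tau)^{\theta - 1},   \theta := 1 - \gamma \in (0,1],
% with M := c_\gamma \sup_\tau ||R^*(\tau)(H+1)^{-\gamma}|| (1 + (T-s)^{\gamma}).
% Beta-function identity used in the induction:
\[
  \int_\tau^t (t-\sigma)^{\theta-1}(\sigma-\tau)^{n\theta-1}\, d\sigma
  = \frac{\Gamma(\theta)\Gamma(n\theta)}{\Gamma((n+1)\theta)}\,(t-\tau)^{(n+1)\theta-1}.
\]
% By induction the kernel of K^n is bounded by
%   (M\Gamma(\theta))^n (t-\tau)^{n\theta-1} / \Gamma(n\theta),
% and integrating over tau in [s,t] gives
\[
  \|K^n\| \;\le\; \frac{\bigl(M\,\Gamma(\theta)\bigr)^n\,(T-s)^{n\theta}}{\Gamma(n\theta+1)} .
\]
% Hence the Neumann series converges in norm for every complex \lambda:
\[
  U_\lambda(\cdot,s)\Psi \;=\; \sum_{n=0}^{\infty} \lambda^n K^n\, V(\cdot)\Psi ,
  \qquad
  \sum_{n=0}^{\infty} |\lambda|^n \|K^n\| < \infty \quad \forall\, \lambda\in\mathbb{C},
\]
% since \Gamma(n\theta+1) grows faster than any geometric sequence.
```

A power series in $\lambda$ with infinite radius of convergence is entire, which is consistent with the conclusion of Lemma A.3 on each interval $[s, T]$.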
A.2. The dissipative case

We now consider the dissipative case when $R(t)$ is selfadjoint for $t \in \mathbb{R}$. We first prove a result about approximation by time-ordered products. We will make use of an extension of Gronwall's inequality to integral equations, shown in [20, Lemma 7.1.1].

Lemma A.4. Let $b \ge 0$ and $\gamma > 0$. Let $a(t)$ and $u(t)$ be non-negative locally integrable functions on $s \le t \le T < \infty$ such that
\[ u(t) \le a(t) + b \int_s^t (t - \tau)^{-\gamma} u(\tau)\, d\tau \quad\text{for } s \le t \le T. \]
Then
\[ u(t) \le a(t) + c\, b^{(1-\gamma)^{-1}} \int_s^t E(t - \tau) a(\tau)\, d\tau \quad\text{for } s \le t \le T, \]
where $|E(r)| \le c_T(|r|^{-\gamma})$ on $[0, T - s]$.

Proposition A.5. Assume that $R(t)$ is selfadjoint and that (A.3) holds. Then for $s \le t$ there exists a sequence $\{p_n\}_{n\in\mathbb{N}}$ with $\lim_{n\to\infty} p_n = +\infty$ such that
\[ U(t, s) = \text{s-}\lim_{n\to\infty} \prod_{j=n-1}^{0} \bigl(e^{-(t_{j+1}-t_j)H/p_n}\, e^{-i(t_{j+1}-t_j)R(t_j)/p_n}\bigr)^{p_n}, \]
where $t_j = s + \frac{(t-s)j}{n}$ for $0 \le j \le n$.

Proof. Let $R_i$, $i = 1, 2$, be two families of closed operators satisfying (A.3) and let $U^{(i)}(t, s)$ be the associated propagators. Then
\begin{align*}
U^{(1)}(t, s) - U^{(2)}(t, s) = &-i \int_s^t e^{-(t-\tau)H} R_1(\tau) \bigl(U^{(1)}(\tau, s) - U^{(2)}(\tau, s)\bigr)\, d\tau \\
&-i \int_s^t e^{-(t-\tau)H} \bigl(R_1(\tau) - R_2(\tau)\bigr) U^{(2)}(\tau, s)\, d\tau.
\end{align*}
This implies, using Lemma A.1, that
\begin{align*}
\|U^{(1)}(t, s) - U^{(2)}(t, s)\| \le\ &c_\gamma \int_s^t \bigl(|t - \tau|^{-\gamma} + 1\bigr)\, \|(H + 1)^{-\gamma} R_1(\tau)\|\, \|U^{(1)}(\tau, s) - U^{(2)}(\tau, s)\|\, d\tau \\
&+ c_\gamma \int_s^t \bigl(|t - \tau|^{-\gamma} + 1\bigr)\, \|(H + 1)^{-\gamma} \bigl(R_1(\tau) - R_2(\tau)\bigr)\|\, \|U^{(2)}(\tau, s)\|\, d\tau.
\end{align*}
We now apply Gronwall's inequality, as given in Lemma A.4, with
\[ b = \sup_{s\le t\le T} \|(H + 1)^{-\gamma} R_1(t)\|, \qquad a(t) \equiv a = c_T \sup_{s\le t\le T} \|(H + 1)^{-\gamma} \bigl(R_1(t) - R_2(t)\bigr)\| \times \sup_{s\le t\le T} \|U^{(2)}(t, s)\|. \]
We obtain
\[ \sup_{s\le t\le T} \|U^{(1)}(t, s) - U^{(2)}(t, s)\| \le c_T \sup_{s\le t\le T} \|(H + 1)^{-\gamma} R_1(t)\|^{(1-\gamma)^{-1}} \times \sup_{s\le t\le T} \|(H + 1)^{-\gamma} \bigl(R_1(t) - R_2(t)\bigr)\| \times \sup_{s\le t\le T} \|U^{(2)}(t, s)\|. \tag{A.6} \]
Let us now prove the proposition. For $s < t$ fixed, $n \in \mathbb{N}$, we set
\[ t_j := s + \frac{(t - s)j}{n}, \quad 0 \le j \le n, \qquad R_n(\tau) = \sum_{j=0}^{n-1} \mathbf{1}_{[t_j, t_{j+1}[}(\tau)\, R(t_j). \]
Note that $H + iR(t_j)$ with domain $D(H)$ is the generator of a $C_0$-semigroup of contractions, since it is closed and maximal accretive, using (A.3). If $U^{(n)}(t, s)$ is the solution of (A.1) for the piecewise constant family of operators $\{R_n(t)\}$, then one can easily verify that
\[ U^{(n)}(t, s) = \prod_{j=n-1}^{0} e^{-(t_{j+1}-t_j)(H + iR_n(t_j))}. \]
Since $\mathbb{R} \ni t \mapsto (H + 1)^{-\gamma} R(t) \in \mathcal{B}(\mathcal{H})$ is continuous, we conclude that
\[ \lim_{n\to\infty} \sup_{s\le t\le T} \|(H + 1)^{-\gamma} \bigl(R(t) - R_n(t)\bigr)\| = 0. \]
Using (A.6) we get
\[ \lim_{n\to\infty} \sup_{s\le t\le T} \|U(t, s) - U^{(n)}(t, s)\| = 0. \]
Applying next [6] we obtain
\[ e^{-(t_{j+1}-t_j)(H + iR(t_j))} = \text{s-}\lim_{p\to\infty} \bigl(e^{-(t_{j+1}-t_j)H/p}\, e^{-i(t_{j+1}-t_j)R(t_j)/p}\bigr)^{p}. \]
Using the fact that $e^{-\tau(H + iR(t_j))}$, $e^{-\tau H}$ and $e^{-i\tau R(t_j)}$ are all contractions, we conclude that there exists a sequence $p_n \to \infty$ such that
\[ U(t, s) = \text{s-}\lim_{n\to\infty} \prod_{j=n-1}^{0} \bigl(e^{-(t_{j+1}-t_j)H/p_n}\, e^{-i(t_{j+1}-t_j)R(t_j)/p_n}\bigr)^{p_n}. \]
This completes the proof of the proposition.

Proposition A.6. Assume that $R(t)$ is selfadjoint and satisfies (A.3). Assume moreover that
\[ \text{the function } t \mapsto \|(H + 1)^{-\gamma} R(t)\| \text{ is in } L^1(\mathbb{R}) \cap L^\infty(\mathbb{R}) \tag{A.7} \]
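and
$$\pm R(t) \le r(t)(H+1)^{\delta}, \quad 0 \le \delta < 1, \tag{A.8}$$
for some $r \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$. Then the limit
$$U_\lambda(+\infty,-\infty) := \operatorname*{w-lim}_{(t,s)\to(+\infty,-\infty)} U_\lambda(t,s)$$
exists for all $\lambda \in \mathbb{C}$.
(i) The function $\mathbb{C} \ni \lambda \mapsto U_\lambda(+\infty,-\infty)$ is entire and satisfies
$$\|U_\lambda(+\infty,-\infty)\| \le e^{c|\mathrm{Im}\,\lambda|^{(1-\delta)^{-1}}} \quad \forall\, \lambda \in \mathbb{C};$$
(ii) The derivatives w.r.t. $\lambda$ are uniformly bounded:
$$\sup_{\lambda\in\mathbb{R}} |\partial_\lambda^n U_\lambda(+\infty,-\infty)| < \infty \quad \forall\, n \in \mathbb{N};$$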
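(iii) For $n \in \mathbb{N}$ and $\lambda \in \mathbb{R}$ the derivatives at $\lambda = 0$ are given by the following formula:
$$\frac{d^n}{d\lambda^n}U_\lambda(+\infty,-\infty)\Big|_{\lambda=0} = n!(-i)^n \int_{-\infty<t_1\le\cdots\le t_n<+\infty} \mathbb{1}_{\{0\}}(H)\prod_{k=n}^{2} R(t_k)e^{-(t_k-t_{k-1})H} \times R(t_1)\mathbb{1}_{\{0\}}(H)\,dt_1\cdots dt_n.$$

Proof. Let $\Psi \in \mathcal{H}$. Using Proposition A.2 and the fact that $R(t)$ is selfadjoint we obtain
$$\frac{d}{dt}\|U_\lambda(t,s)\Psi\|^2 = -2\,\mathrm{Re}\,\langle U_\lambda(t,s)\Psi, (H + i\lambda R(t))U_\lambda(t,s)\Psi\rangle = -2\,\langle U_\lambda(t,s)\Psi, (H - \mathrm{Im}\,\lambda\, R(t))U_\lambda(t,s)\Psi\rangle.$$
Now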
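$$H - \mathrm{Im}\,\lambda\, R(t) \ge H - |\mathrm{Im}\,\lambda|\, r(t)(H+1)^{\delta} \ge -c\,\bigl(|\mathrm{Im}\,\lambda|\, r(t)\bigr)^{(1-\delta)^{-1}},$$
since $\inf_{s\ge 0}\bigl(s - t(s+1)^{\delta}\bigr) = -c\,t^{(1-\delta)^{-1}}$ for $t \ge 0$. This yields
$$\frac{d}{dt}\|U_\lambda(t,s)\Psi\|^2 \le c\,|\mathrm{Im}\,\lambda|^{(1-\delta)^{-1}}\, r(t)^{(1-\delta)^{-1}}\, \|U_\lambda(t,s)\Psi\|^2,$$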
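and hence
$$\|U_\lambda(t,s)\Psi\| \le \exp\Bigl(c\,|\mathrm{Im}\,\lambda|^{(1-\delta)^{-1}}\int_s^t r(\tau)^{(1-\delta)^{-1}}\,d\tau\Bigr)\,\|\Psi\|.$$
Since $r \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$ we have $r^{(1-\delta)^{-1}} \in L^1(\mathbb{R})$, which yields
$$\sup_{s\le t}\|U_\lambda(t,s)\| \le e^{c|\mathrm{Im}\,\lambda|^{(1-\delta)^{-1}}} \quad \forall\, \lambda \in \mathbb{C}. \tag{A.9}$$
Let us now prove that
$$\operatorname*{w-lim}_{(t,s)\to(+\infty,-\infty)} U_\lambda(t,s) \text{ exists for all } \lambda \in \mathbb{R}. \tag{A.10}$$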
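For $\Psi \in \mathcal{H}$, $\Phi \in D(H)$ and $0 \le \gamma < 1$ we find
$$\langle\Phi, (U_\lambda(t,s) - e^{-(t-s)H})\Psi\rangle = -i\lambda\int_s^t \langle e^{-(t-\tau)H}(H+1)^{\gamma}\Phi, (H+1)^{-\gamma}R(\tau)U_\lambda(\tau,s)\Psi\rangle\,d\tau.$$
Using dominated convergence and hypothesis (A.7) we obtain the existence of
$$\lim_{(t,s)\to(+\infty,-\infty)}\langle\Phi, U_\lambda(t,s)\Psi\rangle \quad \text{for } \Psi \in \mathcal{H} \text{ and } \Phi \in D(H^{\gamma}).$$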
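Applying a density argument and the uniform bound (A.9) this proves (A.10). Now $\{\lambda \mapsto U_\lambda(t,s) \mid s \le t\}$ is a locally uniformly bounded family of entire functions. Applying Lemma B.3 and (A.10) we obtain that
$$U_\lambda(+\infty,-\infty) = \operatorname*{w-lim}_{(t,s)\to(+\infty,-\infty)} U_\lambda(t,s)$$
exists for all $\lambda \in \mathbb{C}$. Moreover, the map $\lambda \mapsto U_\lambda(+\infty,-\infty)$ is entire and
$$\|U_\lambda(+\infty,-\infty)\| \le e^{c|\mathrm{Im}\,\lambda|^{(1-\delta)^{-1}}} \quad \forall\, \lambda \in \mathbb{C}.$$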
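If $f(z)$ is a bounded holomorphic function in a strip $\{|\mathrm{Im}\, z| < a\}$, then it follows easily from Cauchy's formula that $\sup_{x\in\mathbb{R}}|\partial_x^n f(x)| < \infty$ for all $n \in \mathbb{N}$. This completes the proof of (ii).

Let us now prove (iii). Set, as in Sec. A.1,
$$K\colon C([s,T],\mathcal{H}) \to C([s,T],\mathcal{H}), \qquad W(\cdot) \mapsto -i\int_s^{(\cdot)} e^{-(\cdot-\tau)H}R(\tau)W(\tau)\,d\tau,$$
and $V(t)\Psi = e^{-(t-s)H}\Psi$. From the integral equation $(\mathbb{1} - \lambda K)U_\lambda(\,\cdot\,,s)\Psi = V(\,\cdot\,)\Psi$ we deduce that
$$\frac{d^n}{d\lambda^n}U_\lambda(t,t_0)\Big|_{\lambda=0}\Psi = n!\,K^nV(t)\Psi = n!(-i)^n\int_{t_0\le t_1\le\cdots\le t_n\le t} e^{-(t-t_n)H} \times \prod_{k=n}^{1} R(t_k)e^{-(t_k-t_{k-1})H}\,\Psi\,dt_1\cdots dt_n.$$
The function $\mathbb{C} \ni \lambda \mapsto U_\lambda(t,s)\Psi$ is entire and uniformly bounded in $\{|\mathrm{Im}\,\lambda| \le a\}$ for $-\infty < s \le t < +\infty$. Therefore
$$\frac{d^n}{d\lambda^n}U_\lambda(+\infty,-\infty)\Psi = \lim_{(t,s)\to(+\infty,-\infty)}\frac{d^n}{d\lambda^n}U_\lambda(t,s)\Psi.$$
Setting $t_{n+1} = t$ we find
$$\Bigl\|e^{-(t-t_n)H}\prod_{k=n}^{1} R(t_k)e^{-(t_k-t_{k-1})H}\Bigr\| \le c\prod_{k=n}^{1}\bigl(|t_{k+1}-t_k|^{-\gamma}+1\bigr)\,\|(H+1)^{-\gamma}R(t_k)\|.$$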
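From Lebesgue dominated convergence we deduce that
$$\lim_{s\to-\infty}\frac{d^n}{d\lambda^n}U_\lambda(t,s)\Big|_{\lambda=0} = n!(-i)^n\int_{-\infty<t_1\le\cdots\le t_n\le t} e^{-(t-t_n)H}\prod_{k=n}^{2} R(t_k)e^{-(t_k-t_{k-1})H} \times R(t_1)\mathbb{1}_{\{0\}}(H)\,dt_1\cdots dt_n.$$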
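A similar argument yields
$$\lim_{t\to+\infty}\lim_{s\to-\infty}\frac{d^n}{d\lambda^n}U_\lambda(t,s)\Big|_{\lambda=0} = n!(-i)^n\int_{-\infty<t_1\le\cdots\le t_n<+\infty} \mathbb{1}_{\{0\}}(H)\prod_{k=n}^{2} R(t_k)e^{-(t_k-t_{k-1})H} \times R(t_1)\mathbb{1}_{\{0\}}(H)\,dt_1\cdots dt_n.$$
Applying Lemma B.1 we obtain (iii).

We will use the following lemma to show that the limiting functional obtained in Theorem 7.2 defines a Borel measure on $S'(S_\beta \times \mathbb{R})$.

Lemma A.7. Let $R_i(t)$, $i = 1, 2$, be selfadjoint families satisfying (A.3) and (A.7). Assume that
$$\pm\bigl(R_1(t) - R_2(t)\bigr) \le r(t)(H+1)^{\delta}, \quad 0 \le \delta < 1,$$
for $r \in L^{(1-\delta)^{-1}}(\mathbb{R})$. Then
$$\|U(+\infty,-\infty;R_2) - U(+\infty,-\infty;R_1)\| \le c\,\|r\|_{(1-\delta)^{-1}}.$$

Proof. Let us denote by $Z_\lambda(t,s)$ the operator $U(t,s;R_1 + \lambda(R_2 - R_1))$. By the same arguments as used in the proof of Proposition A.6, we see that $\lambda \mapsto Z_\lambda(t,s)$ is an entire analytic function, which satisfies the bound
$$\|Z_\lambda(t,s)\| \le e^{c|\mathrm{Im}\,\lambda|^{\gamma}\|r\|_{\gamma}^{\gamma}} \quad \text{for } \lambda \in \mathbb{C} \text{ and } \gamma = (1-\delta)^{-1}.$$
As in Proposition A.6, the limit of $Z_\lambda(t,s)$ when $(t,s) \to (+\infty,-\infty)$ exists for $\lambda \in \mathbb{R}$ fixed. Applying again Vitali's theorem, we obtain the existence of $Z_\lambda(+\infty,-\infty)$ for all $\lambda \in \mathbb{C}$ and the bound
$$\|Z_\lambda(+\infty,-\infty)\| \le e^{c|\mathrm{Im}\,\lambda|^{\gamma}\|r\|_{\gamma}^{\gamma}} \quad \forall\, \lambda \in \mathbb{C}.$$
Applying Cauchy's formula on the circle of radius $R$ centered around $\lambda \in \mathbb{R}$ yields
$$\Bigl\|\frac{d}{d\lambda}Z_\lambda(+\infty,-\infty)\Bigr\| \le R^{-1}e^{cR^{\gamma}\|r\|_{\gamma}^{\gamma}}.$$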
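Optimizing this bound w.r.t. $R$ we get
$$\Bigl\|\frac{d}{d\lambda}Z_\lambda(+\infty,-\infty)\Bigr\| \le c\,\|r\|_{\gamma}.$$
Integrating in $\lambda$ from 0 to 1 we obtain the lemma.

A.3. Some additional results

We now prove some bounds on $U(t,s)$, which we use in the main text to show the existence of sharp-time fields and the convergence of sharp-time Schwinger functions.

Lemma A.8. Let $R_i(t)$, $i = 1, 2$, be two families of selfadjoint operators satisfying (A.3) and (A.7). Assume that
$$\pm\bigl(R_2(t) - R_1(t)\bigr) \le r(t)(H+1) \tag{A.11}$$
for $r \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$. Set $Z_\lambda(t,s) := U\bigl(t,s;R_1 + \lambda(R_2 - R_1)\bigr)$ for $-\infty \le s \le t \le +\infty$. Then
$$\Bigl\|\frac{d^n}{d\lambda^n}Z_\lambda(t,s)\Bigr\| \le n!\,\|r\|_\infty^{n}\,e^{\|r\|_1\|r\|_\infty^{-1}}.$$

Proof. Since $R_i(t)$ satisfy (A.3) and (A.7), the function $\lambda \mapsto Z_\lambda(t,s)$ is entire. We still denote by $Z_\lambda(t,s)$ its extension to $\lambda \in \mathbb{C}$. As in the proof of Proposition A.6, we find
$$\frac{d}{dt}\|Z_\lambda(t,s)\Psi\|^2 = -2\,\bigl\langle Z_\lambda(t,s)\Psi, \bigl(H - \mathrm{Im}\,\lambda\,(R_2(t) - R_1(t))\bigr)Z_\lambda(t,s)\Psi\bigr\rangle.$$
Now
$$H - \mathrm{Im}\,\lambda\,\bigl(R_2(t) - R_1(t)\bigr) \ge H - r(t)\,|\mathrm{Im}\,\lambda|\,(H+1) \ge -r(t)\,|\mathrm{Im}\,\lambda| \quad \text{for } |\mathrm{Im}\,\lambda| \le \|r\|_\infty^{-1}.$$
This yields
$$\frac{d}{dt}\|Z_\lambda(t,s)\Psi\|^2 \le 2\,|\mathrm{Im}\,\lambda|\,r(t)\,\|Z_\lambda(t,s)\Psi\|^2 \quad \text{for } |\mathrm{Im}\,\lambda| \le \|r\|_\infty^{-1}.$$
Hence
$$\|Z_\lambda(t,s)\| \le e^{|\mathrm{Im}\,\lambda|\,\|r\|_1} \quad \text{for } |\mathrm{Im}\,\lambda| \le \|r\|_\infty^{-1}.$$
We apply Cauchy's formula on a circle of radius $\|r\|_\infty^{-1}$ and obtain
$$\Bigl\|\frac{d^n}{d\lambda^n}Z_\lambda(t,s)\Bigr\| \le n!\,\|r\|_\infty^{n}\,e^{\|r\|_1\|r\|_\infty^{-1}} \quad \text{for } \lambda \in \mathbb{R}.$$
This completes the proof of the lemma.

Finally we prove a lemma which is used in the main text to prove spatial clustering.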
Remark A.9. Let $t_0 \in \mathbb{R}$ and define the time-translated family by $\xi_{t_0}R(t) := R(t - t_0)$. Then clearly
$$U\bigl(t,s;\xi_{t_0}(R)\bigr) = U(t - t_0, s - t_0; R) \quad \text{for } -\infty < s \le t < +\infty.$$
Letting $(s,t) \to (-\infty,+\infty)$ we obtain that $U\bigl(+\infty,-\infty;\xi_{t_0}(R)\bigr) = U(+\infty,-\infty;R)$.

Lemma A.10. Assume that 0 is a simple eigenvalue of $H$ and that $H$ has a spectral gap, i.e.,
$$]0,a] \cap \sigma(H) = \emptyset \quad \text{for some } a > 0.$$
Let $\{R_1(t)\}$, $\{R_2(t)\}$ be two selfadjoint families of operators satisfying (A.3) and (A.7) with $R_i(t) \equiv 0$ for $|t| \ge T$. If $\Omega$ is a normalized ground state of $H$, then
$$\bigl|\langle\Omega, U^\infty(R_1 + \xi_t(R_2))\Omega\rangle - \langle\Omega, U^\infty(R_1)\Omega\rangle\,\langle\Omega, U^\infty(R_2)\Omega\rangle\bigr| \le e^{-(|t|-2T)a} \quad \text{for } |t| > 2T,$$
where $U^\infty(R) := U(+\infty,-\infty;R)$.

Proof. It suffices to consider the case $t > 0$. Using the group property and considering the supports of $R_i(\cdot)$, we find
$$U\bigl(t,s;R_1 + \xi_{t_0}(R_2)\bigr) = U\bigl(t,t_0 - T;\xi_{t_0}(R_2)\bigr)\,e^{-(t_0-2T)H}\,U(T,s;R_1) = U(t - t_0,-T;R_2)\,e^{-(t_0-2T)H}\,U(T,s;R_1) \tag{A.12}$$
for $s \le -T$, $t_0 > 2T$ and $t > t_0 + T$. Since $H$ has a spectral gap of length $a$,
$$\bigl\|e^{-(t_0-2T)H} - |\Omega\rangle\langle\Omega|\bigr\| \le e^{-(t_0-2T)a}. \tag{A.13}$$
Moreover, since $H\Omega = 0$ and $\mathrm{supp}\,R_i(\cdot) \subset [-T,T]$,
$$\langle\Omega, U(t - t_0,-T;R_2)\Omega\rangle = \langle\Omega, U(t - t_0,s;R_2)\Omega\rangle, \qquad \langle\Omega, U(T,s;R_2)\Omega\rangle = \langle\Omega, U(t,s;R_2)\Omega\rangle. \tag{A.14}$$
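The spectral-gap estimate (A.13) can be checked by hand when $H$ is a diagonal matrix. The sketch below is purely illustrative: the spectrum, the gap `a` and the function name are hypothetical, and `t` plays the role of $t_0 - 2T$. For diagonal $H$ the operator $e^{-tH} - |\Omega\rangle\langle\Omega|$ is diagonal with entries $0$ on the ground state and $e^{-t\lambda}$ on the excited states, so its norm is the largest of these entries.

```python
import math

def gap_check(eigenvalues, a, t):
    """Return (||e^{-tH} - |Omega><Omega|||, e^{-t a}) for a diagonal H.

    eigenvalues: hypothetical spectrum of H, containing the simple
    eigenvalue 0 and otherwise only values larger than the gap a.
    """
    excited = [lam for lam in eigenvalues if lam > 0.0]
    norm = max(math.exp(-t * lam) for lam in excited)
    bound = math.exp(-t * a)
    return norm, bound

spectrum = [0.0, 1.5, 2.0, 7.0]  # assumption: ]0, 1] contains no spectrum
a = 1.0
results = [gap_check(spectrum, a, t) for t in (0.0, 0.5, 3.0, 10.0)]
for norm, bound in results:
    print(norm, bound, norm <= bound)
```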
Combining (A.12), (A.13), (A.14) and letting $(t,s) \to (+\infty,-\infty)$ we obtain the lemma.

Appendix B. Miscellaneous Results

Lemma B.1. Let $F\colon \mathbb{R}^2 \to E$ be a map with value in a metric space $E$.
(i) Assume that
$$\lim_{k,k'\to\infty}F(k,k') = F_\infty \text{ exists}, \qquad \lim_{k'\to\infty}F(k,k') = G(k) \text{ exists } \forall\, k \in \mathbb{N}, \qquad \lim_{k\to\infty}G(k) = G_\infty \text{ exists}.$$
Then $F_\infty = G_\infty$;
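(ii) Assume that
$$\lim_{k'\to\infty}F(k,k') = G(k) \text{ exists}, \qquad \lim_{k\to\infty}F(k,k') = F(k') \text{ exists and the convergence is uniform w.r.t. } k', \qquad \lim_{k\to\infty}G(k) = G_\infty \text{ exists}.$$
Then $\lim_{k'\to\infty}F(k') = G_\infty$.

The proof is easy and left to the reader.

Lemma B.2. Let $(Q,\Sigma,\mu)$ be a probability space. Let $f$ be a real measurable function on $Q$ and set
$$C(t) := \int_Q e^{itf}\,d\mu.$$
Then $f \in \bigcap_{1\le p<\infty}L^p(Q,\Sigma,\mu)$ if and only if $\sup_{t\in\mathbb{R}}|\partial_t^n C(t)| < \infty$ for all $n \in \mathbb{N}$. If this is the case, then
$$\partial_t^n C(t) = i^n\int_Q f^n e^{itf}\,d\mu.$$

Proof. The $\Rightarrow$ part and the formula for $\partial_t^n C(t)$ are obvious by differentiating under the integral sign. It remains to prove the $\Leftarrow$ part. Let $\chi(\tau) = e^{-\tau^2/2}$ and let $p \ge 1$. By monotone convergence it suffices to prove that
$$\sup_{n\in\mathbb{N}}\int_Q f^{2p}\chi\Bigl(\frac{f}{n}\Bigr)\,d\mu < \infty$$
in order to show that $f \in L^{2p}(Q,\Sigma,\mu)$. We have
$$\tau^{2p}\chi\Bigl(\frac{\tau}{n}\Bigr) = \frac{n^{2p+1}}{2\pi}\int_{\mathbb{R}} e^{it\tau}\,\partial_t^{2p}\hat\chi(nt)\,dt.$$
Hence
$$\int_Q f^{2p}\chi\Bigl(\frac{f}{n}\Bigr)\,d\mu = \frac{n^{2p+1}}{2\pi}\int_Q\int_{\mathbb{R}} e^{itf}\,\partial_t^{2p}\hat\chi(nt)\,dt\,d\mu = \frac{n^{2p+1}}{2\pi}\int_{\mathbb{R}} C(t)\,\partial_t^{2p}\hat\chi(nt)\,dt = \frac{(-1)^{2p}}{2\pi}\int_{\mathbb{R}} n\,\partial_t^{2p}C(t)\,\hat\chi(nt)\,dt,$$
using Fubini's theorem and integrating by parts $2p$ times. Since $\hat\chi \in L^1(\mathbb{R})$ and $\partial_t^{2p}C$ is uniformly bounded, we obtain that $\sup_{n\in\mathbb{N}}\int_Q f^{2p}\chi(n^{-1}f)\,d\mu < \infty$, which completes the proof of the lemma.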
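The derivative formula of Lemma B.2 can be sanity-checked on a finite probability space, where every $L^p$ norm of $f$ is trivially finite. The following sketch makes that assumption explicit: the three values of $f$, their weights and all function names are hypothetical and chosen only for illustration. It compares $i^n\int f^n e^{itf}\,d\mu$ with finite-difference derivatives of $C(t)$ and evaluates the uniform bound $|\partial_t^n C(t)| \le \int |f|^n\,d\mu$.

```python
import cmath

# Hypothetical finite probability space: f takes value values[k] with weight weights[k].
values = [-1.0, 0.5, 2.0]
weights = [0.2, 0.5, 0.3]

def C(t):
    """Characteristic function C(t) = integral of e^{itf} d(mu)."""
    return sum(w * cmath.exp(1j * t * v) for v, w in zip(values, weights))

def dC_formula(t, n):
    """n-th derivative from the lemma: i^n * integral of f^n e^{itf} d(mu)."""
    return (1j ** n) * sum(w * v ** n * cmath.exp(1j * t * v)
                           for v, w in zip(values, weights))

def dC_numeric(t, n, h=1e-3):
    """Central finite differences for the first and second derivative."""
    if n == 1:
        return (C(t + h) - C(t - h)) / (2 * h)
    if n == 2:
        return (C(t + h) - 2 * C(t) + C(t - h)) / h ** 2
    raise ValueError("only n = 1, 2 are implemented in this sketch")

def moment_bound(n):
    """Uniform bound on |d^n C / dt^n|: the n-th absolute moment of f."""
    return sum(w * abs(v) ** n for v, w in zip(values, weights))

for t in (0.0, 0.7, -2.3):
    print(t, abs(dC_formula(t, 1) - dC_numeric(t, 1)),
          abs(dC_formula(t, 2) - dC_numeric(t, 2)))
```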
Lemma B.3. Let $I$ be a directed set and $\{u_\alpha\}_{\alpha\in I}$ a net of functions which are holomorphic in an open set $\Omega \subset \mathbb{C}$.
(i) Assume that the family $\{u_\alpha\}$ is locally uniformly bounded in $\Omega$ and that there exists a set $\Gamma \subset \Omega$ having an accumulation point in $\Omega$ such that
$$\lim_{\alpha\in I}u_\alpha(z) \text{ exists for } z \in \Gamma.$$
Then $\lim_{\alpha\in I}u_\alpha = u$ exists in the compact–open topology on $\Omega$ and $u$ is a holomorphic function in $\Omega$. (We recall that the compact–open topology on $C(\Omega)$ is the topology of uniform convergence on all compact subsets of $\Omega$.)
(ii) Assume moreover that $\Omega$ is bounded with a smooth boundary and that
$$\sup_{\alpha\in I}\sup_{z\in\Omega}|u_\alpha(z)| < \infty.$$
Then $u$ is continuous on $\bar\Omega$ and $\lim_{\alpha\in I}\sup_{z\in\partial\Omega}|u_\alpha(z) - u(z)| = 0$.

Proof. Let us first prove (i). By Vitali's theorem the family $\{u_\alpha\}$ is compact for the compact–open topology. Let $\{u_\beta\}_{\beta\in J}$ be a subnet converging to a continuous function $u$. Assume that the net $\{u_\alpha\}_{\alpha\in I}$ does not converge to $u$. Then there exists a bounded open set $\Omega_1 \subset \Omega$ and a subnet $\{u_\gamma\}_{\gamma\in J_1}$ such that $\sup_{z\in\Omega_1}|u_\gamma(z) - u(z)| \ge \epsilon_0 > 0$ for $\gamma \in J_1$. Applying again Vitali's theorem to the net $\{u_\gamma\}_{\gamma\in J_1}$, we obtain another subnet $\{u_\delta\}_{\delta\in J_2}$ such that $\lim_{\delta\in J_2}u_\delta = v$, with $v \ne u$. But $u$ and $v$ are holomorphic in $\Omega$, as limits of holomorphic functions for the compact–open topology, and coincide on $\Gamma$ by hypothesis. Since $\Gamma$ has an accumulation point in $\Omega$, we have $u = v$, which gives a contradiction.

Let us now prove (ii). Assume the contrary and let $\{u_\beta\}_{\beta\in J}$ be a subnet such that
$$\inf_{\beta\in J}\sup_{z\in\partial\Omega}|u_\beta(z) - u(z)| \ge \epsilon > 0.$$
Since $\Delta u = 0$ in $\Omega$, we see that $u$ belongs to the Sobolev space $H^2(\Omega)$. Using that $\Delta u_\beta = 0$ in $\Omega$ and the fact that the family $\{u_\beta\}_{\beta\in J}$ is uniformly bounded in $\Omega$, we obtain similarly that $\{u_\beta\}_{\beta\in J}$ is a bounded family in $H^2(\Omega)$. Hence (i) implies $\lim_{\beta\in J}u_\beta = u$ in $\mathcal{D}'(\Omega)$. Finally we note that the injection $H^2(\Omega) \to H^{3/2}(\Omega)$ is compact. Extracting again a subnet, we obtain $\lim_{\gamma\in J_1}u_\gamma = u$ in $H^{3/2}(\Omega)$. Together with the trace theorem this implies that $\lim_{\gamma\in J_1}u_\gamma = u$ in $H^1(\partial\Omega)$ and hence in $C(\partial\Omega)$. This gives a contradiction.

References
[1] H. Araki, A lattice of von Neumann algebras associated with the quantum theory of a free Bose field, J. Math. Phys. 4 (1963) 1343–1362.
[2] H. Araki, Relative Hamiltonian for faithful normal states of a von Neumann algebra, Publ. Res. Inst. Math. Sci. 9 (1973) 165–209.
[3] H. Araki, Positive cone, Radon-Nikodym theorems, relative Hamiltonian and the Gibbs condition in statistical mechanics. An application of Tomita-Takesaki theory, in C*-algebras and their Applications to Statistical Mechanics and Quantum Field Theory (Kastler, D., Ed., North Holland, 1976).
[4] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vols. I, II (Springer-Verlag, New York-Heidelberg-Berlin, 1981).
[5] D. Buchholz, C. D'Antoni and K. Fredenhagen, The universal structure of local algebras, Commun. Math. Phys. 111 (1987) 123–135.
[6] P. R. Chernoff, Note on product formulas for operator semigroups, J. Funct. Anal. 2 (1968) 238–242.
[7] J. Derezinski and C. Gérard, Spectral scattering theory of spatially cut-off $P(\varphi)_2$ Hamiltonians, Comm. Math. Phys. 213 (2000) 39–125.
[8] J. Derezinski, V. Jaksic and C. A. Pillet, Perturbations of W*-dynamics, Liouvilleans and KMS states, preprint mp-arc 03–94 (2003).
[9] J. Fröhlich, Unbounded, symmetric semigroups on a separable Hilbert space are essentially selfadjoint, Adv. Appl. Math. 1 (1980) 237–256.
[10] J. Fröhlich, The reconstruction of quantum fields from Euclidean Green's functions at arbitrary temperatures, Helv. Phys. Acta 48 (1975) 355–363.
[11] C. Gérard, On the existence of ground states for massless Pauli-Fierz Hamiltonians, Ann. Henri Poincaré 1 (2000) 443–458.
[12] C. Gérard and C. Jäkel, Thermal quantum fields with spatially cut-off interactions in 1+1 space-time dimensions, preprint math-ph/0307053 (2003).
[13] I. M. Gelfand and N. J. Vilenkin, Generalized functions. Vol. 4: Applications of harmonic analysis (Academic Press, 1964).
[14] J. Glimm and A. Jaffe, The $\lambda(\varphi^4)_2$ quantum field theory without cutoffs. II. The field operators and the approximate vacuum, Ann. Math. 91 (1970) 362–401.
[15] J. Glimm and A. Jaffe, Collected Papers, Volume 1: Quantum Field Theory and Statistical Mechanics (Birkhäuser, 1985).
[16] R. Gielerak, L. Jakóbczyk and R. Olkiewicz, Reconstruction of Kubo–Martin–Schwinger structure from Euclidean Green functions, J. Math. Phys. 35 (1994) 3726–3744.
[17] R. Gielerak, L. Jakóbczyk and R. Olkiewicz, W*-KMS structure from multi-time Euclidean Green functions, J. Math. Phys. 35 (1994) 6291–6303.
[18] R. Gielerak, L. Jakóbczyk and R. Olkiewicz, Stochastically positive structures on Weyl algebras. The case of quasi-free states, J. Math. Phys. 39 (1998) 6291–6328.
[19] E. P. Heifets and E. P. Osipov, The energy momentum spectrum in the $P(\varphi)_2$ quantum field theory, Comm. Math. Phys. 56 (1977) 161–172.
[20] D. Henry, Geometric Theory of Semilinear Parabolic Equations (Springer Lect. Notes in Math. 840, 1981).
[21] R. Høegh-Krohn, Relativistic quantum statistical mechanics in two-dimensional space-time, Commun. Math. Phys. 38 (1974) 195–224.
[22] A. Klein, The semigroup characterization of Osterwalder-Schrader path spaces and the construction of Euclidean fields, J. Funct. Anal. 27 (1978) 277–291.
[23] A. Klein and L. Landau, Stochastic processes associated with KMS states, J. Funct. Anal. 42 (1981) 368–428.
[24] A. Klein and L. Landau, Periodic Gaussian Osterwalder-Schrader positive processes and the two-sided Markov property on the circle, Pacific J. Math. 94 (1981) 341–367.
[25] A. Klein and L. Landau, Construction of a unique selfadjoint generator for a symmetric local semigroup, J. Funct. Anal. 44 (1981) 121–137.
[26] A. Klein and L. Landau, Singular perturbations of positivity preserving semigroups via path space techniques, J. Funct. Anal. 20 (1975) 44–82.
[27] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras II (Academic Press, New York, 1986).
[28] R. Longo, Algebraic and modular structure of von Neumann algebras of physics, in Proc. Symposia in Pure Mathematics, Vol. 38 (1982), pp. 551–566.
[29] B. Simon, The $P(\varphi)_2$ Euclidean (Quantum) Field Theory (Princeton University Press, 1974).
[30] B. Simon and R. Høegh-Krohn, Hypercontractive semigroups and two dimensional self-coupled Bose fields, J. Funct. Anal. 9 (1972) 121–180.
[31] O. Steinmann, Perturbative quantum field theory at positive temperatures: An axiomatic approach, Commun. Math. Phys. 170 (1995) 405–415.
[32] M. Takesaki and M. Winnink, Local normality in quantum statistical mechanics, Commun. Math. Phys. 30 (1973) 129–152.
March 29, 2005 8:59 WSPC/148-RMP
J070-00232
Reviews in Mathematical Physics Vol. 17, No. 2 (2005) 175–226 © World Scientific Publishing Company
SYSTEMS OF CLASSICAL PARTICLES IN THE GRAND CANONICAL ENSEMBLE, SCALING LIMITS AND QUANTUM FIELD THEORY
SERGIO ALBEVERIO∗ and HANNO GOTTSCHALK†
Institut für angewandte Mathematik, Rheinische Friedrich-Wilhelms-Universität Bonn, Wegelerstr. 6, D-53115 Bonn, Germany
∗[email protected]
†[email protected]

MINORU W. YOSHIDA
Department of Mathematics and Systems Engineering, The University of Electrocommunications 1-5-1, Chofugaku, Tokyo 182-8585, Japan
[email protected]

Received 19 August 2004
Revised 16 February 2005

Euclidean quantum fields obtained as solutions of stochastic partial pseudo differential equations driven by a Poisson white noise have paths given by locally integrable functions. This makes it possible to define a class of ultra-violet finite local interactions for these models (in any space-time dimension). The corresponding interacting Euclidean quantum fields can be identified with systems of classical "charged" particles in the grand canonical ensemble with an interaction given by a nonlinear energy density of the "static field" generated by the particles' charges via a "generalized Poisson equation". A new definition of some well-known systems of statistical mechanics is given by formulating the related field theoretic local interactions. The infinite volume limit of such systems is discussed for models with trigonometric interactions using a representation of such models as Widom–Rowlinson models associated with (formal) Potts models at imaginary temperature. The infinite volume correlation functional of such Potts models can be constructed by a cluster expansion. This leads to the construction of extremal Gibbs measures with trigonometric interactions in the low-density high-temperature (LD-HT) regime. For Poissonian models with certain trigonometric interactions an extension of the well-known relation between the (massive) sine-Gordon model and the Yukawa particle gas connecting characteristic and correlation functionals is given and used to derive infinite volume measures for interacting Poisson quantum field models through an alternative route. The continuum limit of the particle systems under consideration is also investigated and the formal analogy with the scaling limit of renormalization group theory is pointed out. In some simple cases the question of (non-) triviality of the continuum limits is clarified.

Keywords: Euclidean quantum field theory; Poisson random fields; local interactions; particle systems in the grand canonical ensemble; correlation functionals; Potts- and Widom–Rowlinson models; cluster expansion; extremal Gibbs measures; continuum limit of particle systems; sine-Gordon model.

Mathematics Subject Classification 2000: 81T08, 60G55, 60G60, 81T10, 82B21, 82B28
1. Introduction

Strong connections between classical statistical mechanics and quantum field theory have been established in the framework of Euclidean quantum field theory (EQFT), see e.g. [1, 34, 55]. In particular this applies to the approximation of Euclidean quantum fields by lattice spin systems [34, 55], the representation as a gas of interacting random walks [1, 26, 59], or the connection of quantum field models with trigonometric interaction (e.g. the sine-Gordon model) with the gas of particles interacting through Yukawa or Coulomb forces [9, 10, 24, 27–29]. In this way, cluster expansions or correlation inequalities from classical particle or ferromagnetic spin systems have been applied to the solution of the infra-red problem in Euclidean quantum field theory.

Basically, all these constructions concern models of quantum fields given by a classical Euclidean action functional $S(X) = S^0(X) + \beta V_\Lambda(X)$ with the free term $S^0(X) = \frac{1}{2}\int_{\mathbb{R}^d}\bigl[|\nabla X|^2 + m^2X^2\bigr]\,dx$ and the interaction term $V_\Lambda(X)$ being an additive functional in the infra-red regularizer $\Lambda \subseteq \mathbb{R}^d$ of local type, i.e. $V_\Lambda(X) = \int_\Lambda v(X)\,dx$ for some function $v\colon \mathbb{R} \to \mathbb{R}$. Inserting this into a heuristic path integral of Feynman type, one gets the well-known heuristic formula for the vacuum expectation values of the relativistic quantized field continued to imaginary times (Schwinger functions) as
$$S_{\Lambda,n}(y_1,\ldots,y_n) = \frac{1}{Z_\Lambda}\int X(y_1)\cdots X(y_n)\,e^{-S^0(X)-\beta V_\Lambda(X)}\,\mathcal{D}X, \tag{1}$$
where $y_1,\ldots,y_n \in \mathbb{R}^d$, $\beta \in \mathbb{R}$, $m^2 \mathrel{\underset{(-)}{>}} 0$. While the path integral itself makes sense, since $e^{-S^0(X)}\mathcal{D}X/Z_\emptyset$ can be identified with the Gaussian measure with covariance operator $(-\Delta + m^2)^{-1}$, i.e. the Nelson free field measure (for $m^2 = 0$, $d = 1$ the Wiener measure), it is difficult to define the interaction term $V_\Lambda(X)$, since the field configurations $X$ in the support of the Gaussian measure are functions only if $d = 1$. If $d \ge 2$ the field configurations in the integral (1) generically are distributions and expressions such as $v(X)$ are ill-defined. This also limits the EQFT approach essentially to space-time dimension $d = 1$ or $d = 2$, where $v(X)$ for polynomial, trigonometric or exponential $v$ can be regularized by Wick-ordering (for the construction of the $\phi^4$-model in $d = 3$ dimensions see [35]).

In the present paper we suggest replacing Nelson's measure $e^{-S^0(X)}\mathcal{D}X/Z_\emptyset$ in (1) by a convoluted Poisson noise measure [3]. Since Nelson's measure can be seen as a convoluted Gaussian white noise measure, from a mathematical point of view it is natural to generalize Eq. (1) to Poisson path space measures. Furthermore, given the fact that convoluted Poisson white noise measures have support on locally integrable field configurations $X$, for a certain class of functions $v$ we can
define potentials $V_\Lambda(X)$ without any ultra-violet renormalization (not even Wick-ordering) independently of the dimension $d \ge 2$. As we will show in Sec. 4, the Euclidean quantum field models obtained in this way can be identified with systems of classical continuous and interacting particles in the grand canonical ensemble. Since at least in principle the perturbed Gaussian free field models can be recovered from the related interacting "Poissonian" quantum field models by a scaling limit of the associated particle system, we can consider the above replacement as a new approximation of EQFTs by systems of statistical mechanics. Also, properties of EQFT, such as Euclidean invariance, are preserved in the infinite volume limit $\Lambda \uparrow \mathbb{R}^d$. In this sense this new approximation takes care of important structural aspects of EQFT (which are violated e.g. by the lattice approximation, discussed e.g. in [1, 34, 55]).

Another motivation for our suggestion is the constructive approach to quantized gauge type fields developed in [3–6, 10–12]. The basic framework in these references is the one of covariant stochastic partial (pseudo) differential equations driven by noise not necessarily of the Gaussian type, in contrast to Nelson's Euclidean approach [47, 48] which can be considered in the framework of stochastic partial (pseudo) differential equations of the Gaussian type. This approach started in the study of quaternionian vector [11–13, 15] and scalar models [3, 4, 16]; it was then extended to much more general fields, see [6, 19, 32, 33, 39, 41]. In these cases the axiomatic framework for the relativistic fields to be accommodated, when possibly constructed, is the concept of quantum fields with indefinite metric [46, 57]. In fact, analytic continuation for these models from Euclidean imaginary time to relativistic real time is possible and the modified Wightman axioms [46] for quantum fields with indefinite metric can be verified explicitly [3, 6, 15, 19]. In particular, fields with interesting scattering behavior have been found in this class of models, also in the physical space-time dimension 4, cf. [2, 5, 6, 41]. Therefore the connection with relativistic quantum field theory does not get lost if we replace Nelson's measure by a convoluted Poisson noise measure.

An alternative way to describe the main attitude of this paper is to say that a systematic discussion is given on how to introduce perturbations of the basic (indefinite metric) Euclidean quantum fields to construct other such fields. In analogy with the standard constructive approach, this is achieved by constructing Gibbs type measures for a bounded region of space-time (finite volume) and then removing this restriction in the sense of a thermodynamic limit. The main result of this paper consists in showing that such an approach can indeed be developed and yields at the same time interesting new relations with models of classical statistical mechanics. Some results of this work have been announced in [7].

Let us finally describe the content of each section of this paper. In Sec. 2 the basic notions of generalized white noise and convoluted generalized white noise are recalled. It is also described how the corresponding random fields lead, by analytic continuation of their moment functions (Schwinger functions), to relativistic Wightman functions satisfying all axioms of an indefinite metric quantum field theory. Some special
Green’s functions used to perform the convolution are discussed and the scattering behavior of the associated quantum field models is recalled. Finally, we show that the lattice approximation of the Euclidean noise fields canonically leads to the notion of a generalized white noise. In Sec. 3, path properties of convoluted Poisson noise (CPN) are discussed and exploited to construct ultra-violet finite, local interactions. In Sec. 3.2 we recall that pure Poisson noise has paths in the space of locally finite “marked” configurations and hence convolution with an integrable kernel leads to fields with paths which are locally integrable (independently of the dimension d ≥ 2), cf. Sec. 3.3. In Sec. 3.4 we then define the interaction term VΛ for any v measurable such that |v(t)| ≤ a + b|t| for some a, b > 0. Section 4 is devoted to the connection between the particle systems in a grand canonical ensemble (GCE) and quantum fields defined by convoluted Poisson white noise with interaction. Theorem 4.1 shows the stability (in the statistical mechanics sense) of the field theoretic interaction potential for the associated system of classical, continuous particles. In Sec. 5 several models of statistical mechanics are looked upon as systems of classical particles associated (in the sense of Sec. 4) with convoluted, interacting Poisson white noise. In particular the cases of a gas of hard spheres, particle systems with potentials of stochastic geometry or pair potentials which are positive definite fit into this framework. Section 6 is the technical core of this work. We give a complete solution of the problem of taking the infinite volume limit of the models of quantum fields resp. statistical mechanics in the low-density high-temperature regime (LD-HT) and trigonometric interactions (cf. Sec. 6.1 for the definition of the interaction). 
This is presumably one of the first cluster expansions for a continuous particle system for an interaction that is not a pair-interaction (see footnote a). The strategy is to represent such a model as the projection of a (formal) Potts model at imaginary temperature to one of its components (Widom–Rowlinson model), cf. Sec. 6.2. Even though such formal Potts models are only represented as complex valued measures on the space of locally finite configurations with an extra mark indicating the "component" and cannot be interpreted in terms of statistical mechanics, the standard cluster expansion [52] for their correlation functionals goes through (Sec. 6.3). The projection to the first component then defines the correlation functional of the system with trigonometric interaction. Using standard arguments [42, 43] one can then reconstruct the associated infinite volume measure. Verification of Ruelle equations in Sec. 6.4 then implies that such measures are Gibbs. Cluster properties of correlation functionals in the infinite volume follow from the cluster expansion and imply ergodicity of the translation group and hence extremality of the Gibbs state (Sec. 6.5). The case of trigonometric interactions is analyzed in Sec. 6.6. We extend the previously known connection for the massive resp. massless sine-Gordon model and Yukawa resp. Coulomb gas models ("duality transformation").

The continuum (scaling) limit of interacting convoluted Poisson noise (with infra-red cut-off) is discussed in Sec. 7. We start with a rather general discussion of scaling limits for Poisson models and the relation with "renormalization group methods". The case of trigonometric interactions with ultra-violet cut-off is then analyzed with the related, ultra-violet regularized, perturbed free field. Triviality without an ultra-violet cut-off and without renormalization is shown in Sec. 7.3. In Sec. 7.4 the scaling limit for the $d = 2$-dimensional sine-Gordon model without ultra-violet cut-off and with a coupling constant renormalization is established in the sense of a formal power series.

Footnote a: It is probably known to some experts that the cluster expansion for the standard Potts model at positive temperature leads to a construction of the ordinary Widom–Rowlinson model in the LD-HT regime (corresponding to "exponential" interactions for systems of particles with only positive charges [38]). But the details have not been worked out, nor has the flexibility of this method in connection with "charged" or "marked" particles been realized.
2. Generalized White Noise and Convoluted Generalized White Noise

In this section we introduce our notation and recall some results of [3].
2.1. Generalized white noise

For $d \ge 1$ we identify the $d$-dimensional Euclidean space-time with $\mathbb{R}^d$; by $\cdot$ / $|\cdot|$ we denote the Euclidean scalar product/norm and $E(d)$ stands for the group of Euclidean transformations on $\mathbb{R}^d$. The space $S$ is the space of real valued fast falling test functions on $\mathbb{R}^d$ endowed with the Schwartz topology. By $S'$ we denote its topological dual space (space of tempered distributions). Let $\mathcal{B}(S')$ be the Borel σ-algebra on $S'$, i.e. the σ-algebra generated by the open (in the weak topology) subsets of $S'$. Then $(S', \mathcal{B}(S'))$ is a measurable space.

A (tempered) random field over $\mathbb{R}^d$ by definition is a mapping from $S$ into the space of real valued random variables on some probability space, $X\colon S \to RV(\Omega, \mathcal{B}, P)$, such that (i) $X$ is linear $P$-a.s. and (ii) $f_n \to f$ in $S$ $\Rightarrow$ $X(f_n) \xrightarrow{\mathcal{L}} X(f)$, where $\xrightarrow{\mathcal{L}}$ means convergence in probability law. Two processes $X_j$, $j = 1, 2$, on probability spaces $(\Omega_j, \mathcal{B}_j, P_j)$, $j = 1, 2$, are called equivalent in law if
$$P_1\{X_1(f_1) \in B_1, \ldots, X_1(f_n) \in B_n\} = P_2\{X_2(f_1) \in B_1, \ldots, X_2(f_n) \in B_n\}$$
$\forall n \in \mathbb{N}$, $f_1, \ldots, f_n \in S$ and $B_1, \ldots, B_n \in \mathcal{B}(\mathbb{R})$, where $\mathcal{B}(\mathbb{R})$ stands for the Borel sigma-algebra (see footnote b) on $\mathbb{R}$.

By Minlos' theorem [45] there is a one-to-one correspondence (up to equivalence in law) between tempered random fields and the characteristic functionals (i.e. continuous, normalized and positive definite functionals) $C\colon S \to \mathbb{C}$ given by $C(f) = E_P[e^{iX(f)}]$. Furthermore, $X$ can be realized as a coordinate process, i.e. there exists a unique probability measure $P^X$ on $(S', \mathcal{B}(S'))$ such that for the random field $X_c(f)(\omega) = \langle\omega, f\rangle = \omega(f)$, $\forall \omega \in S'$ and $f \in S$, we have $E_P[e^{iX(f)}] = E_{P^X}[e^{iX_c(f)}]$ $\forall f \in S$. In the following we drop the subscript $c$ and we adopt the general rule that a random field $X$ on the probability space $(S', \mathcal{B}(S'), P^X)$ is always the coordinate process.

Let $\psi\colon \mathbb{R} \to \mathbb{C}$ be a Lévy-characteristic, i.e. a continuous, conditionally positive definite function (for $t_j \in \mathbb{R}$, $z_j \in \mathbb{C}$, $j = 1, \ldots, n$ such that $\sum_{j=1}^n z_j = 0$ we have $\sum_{l,j=1}^n \psi(t_l - t_j)z_l\bar z_j \ge 0$) such that $\psi(0) = 0$. We set
$$C_F(f) = e^{\int_{\mathbb{R}^d}\psi(f)\,dx} \quad \forall f \in S \tag{2}$$

Footnote b: The sigma-algebra generated by the open subsets.
and we get from [30, Theorem 6, p. 283] that $C_F$ is a characteristic functional. The associated random field $F$ is called a generalized white noise. $F$ has infinitely divisible probability law, is invariant in law under Euclidean transformations and, for $f, h \in \mathcal{S}$ such that $\operatorname{supp} f \cap \operatorname{supp} h = \emptyset$, $F(f)$ and $F(h)$ are independent random variables. Provided $\psi$ is $C^1$-differentiable at 0, one can derive the following representation for $\psi$ (cf. [20]):
$$\psi(t) = iat - \frac{\sigma^2}{2}\,t^2 + z\int_{\mathbb{R}} \big(e^{ist} - 1\big)\,dr(s). \tag{3}$$
Here $a \in \mathbb{R}$, $z, \sigma^2 \in [0, \infty)$ and $r$ is a probability measure on $\mathbb{R}$ such that $r\{0\} = 0$. The representation (3) is unique (for $z > 0$). Using notions which are slightly different from the standard definitions, we call $r$ the Lévy measure of $\psi$ and $z$ is called the activity. The first term in (3) is called the deterministic part, the second one the Gaussian part and the third one the Poisson part. Inserting (3) into (2) we see that $F$ can be written as the sum of independent deterministic (i.e. constant), Gaussian and Poisson parts which are uniquely determined by $\psi$.

2.2. Convoluted generalized white noise

Let $L: \mathcal{S}' \to \mathcal{S}'$ be a symmetric, Euclidean invariant linear operator. For reasons which will become apparent in Sec. 4, we call an equation of the type $L\xi = \eta$, $\eta \in \mathcal{S}'$, a generalized Poisson equation^c (GPE). Suppose that $L$ is continuously invertible^d by a Green's function $G \in \mathcal{S}'$, i.e. $G * \omega = L^{-1}\omega$ $\forall\omega \in \mathcal{S}'$. Then the stochastic GPE
$$LX = F \tag{4}$$
has a pathwise solution $X = G * F$ and $X$ is called a convoluted generalized white noise.

^c Set $L = -\Delta$ and $\eta$ a signed measure to obtain the Poisson equation of electrostatics.
^d Here we only deal with GPEs leading to short range static fields.
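The conditional positive definiteness required of a Lévy characteristic can be probed numerically for the representation (3). The following is a minimal sketch, assuming a discrete Lévy measure $r = \sum_k w_k \delta_{s_k}$; the function names, the parameter values and the chosen atoms are illustrative assumptions, not taken from the text.

```python
import numpy as np

def levy_characteristic(t, a=0.0, sigma2=0.0, z=0.0, atoms=(), weights=()):
    """psi(t) = i a t - sigma2 t^2 / 2 + z * sum_k w_k (exp(i s_k t) - 1).

    The Levy measure is assumed discrete (atoms s_k with weights w_k summing
    to one); this is an illustrative choice, not the general case.
    """
    t = np.asarray(t, dtype=float)
    poisson = np.zeros(t.shape, dtype=complex)
    for s, w in zip(atoms, weights):
        poisson = poisson + w * (np.exp(1j * s * t) - 1.0)
    return 1j * a * t - 0.5 * sigma2 * t**2 + z * poisson

def quadratic_form(psi, ts, zs):
    """Evaluate sum_{l,j} psi(t_l - t_j) z_l conj(z_j)."""
    ts = np.asarray(ts, dtype=float)
    zs = np.asarray(zs, dtype=complex)
    gram = psi(ts[:, None] - ts[None, :])
    return zs @ gram @ zs.conj()

rng = np.random.default_rng(0)
psi = lambda t: levy_characteristic(
    t, a=0.3, sigma2=1.5, z=2.0, atoms=(-1.0, 0.5), weights=(0.4, 0.6)
)
ts = rng.normal(size=6)
zs = rng.normal(size=6) + 1j * rng.normal(size=6)
zs -= zs.mean()  # enforce the constraint sum_j z_j = 0
q = quadratic_form(psi, ts, zs)
print(q)  # imaginary part vanishes up to rounding, real part is non-negative
```

The deterministic part drops out of the quadratic form once $\sum_j z_j = 0$, while the Gaussian and Poisson parts contribute $\sigma^2 |\sum_l t_l z_l|^2$ and $z \int |\sum_l e^{ist_l} z_l|^2\, dr(s)$ respectively, which is what the check reflects.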
If the Lévy measure $r$ of $F$ has moments of all orders, then the Schwinger functions
$$S_n(f_1 \otimes \cdots \otimes f_n) = E_{P^X}\Big[\prod_{l=1}^n X(f_l)\Big], \quad f_1, \ldots, f_n \in \mathcal{S} \tag{5}$$
exist and can be calculated explicitly. They fulfil the requirements of temperedness, symmetry, invariance, Hermiticity and clustering of the Osterwalder–Schrader axioms [49]. In general they do not fulfil the axiom of reflection positivity, cf. [3, 39] for some counterexamples (but we also note that the question is not yet completely settled in the general case). Nevertheless, if $G$ has a representation of the form
$$G = \int_0^\infty C_m\, d\rho(m^2), \qquad \int_0^\infty \frac{d|\rho|(m^2)}{m^2} < \infty, \tag{6}$$
for some (signed) measure $\rho$ and $C_m$ the covariance function of Nelson's free field of mass $m$, then the Schwinger functions (5) can be analytically continued to a sequence of Wightman functions which fulfil all Wightman's axioms [58] except (possibly) for positivity [3]. The Wightman functions however fulfil the Hilbert space structure condition of Morchio and Strocchi [46] and therefore can be considered as vacuum expectation values of a local, relativistic quantum field with indefinite metric [4].

2.3. Some special Green's functions

The Green's functions $G = G_\alpha$ associated with the partial pseudo differential operators $L_\alpha = (-\Delta + m_0^2)^\alpha$, $m_0 > 0$, $0 < \alpha \le 1/2$, are of particular interest, since for $F$ a purely Gaussian white noise, $X$ is a generalized free field [36], in particular, $X$ is reflection positive [49, 55] (cf. item (i) below). In the special case $\alpha = 1/2$, $X$ is Nelson's free field of mass $m_0 > 0$ [48]. We give a list of the properties of the kernels $G_\alpha$ in the following:

Proposition 2.1. For $m_0 > 0$ and $\alpha \in (0, 1]$ let $G_\alpha = G_{\alpha,m_0}$ be the Green's function of $(-\Delta + m_0^2)^\alpha$. Then
(i) $G_\alpha$ has a representation (6) with
$$d\rho_\alpha(m^2) = \sin(\pi\alpha)\, 1_{\{m^2 > m_0^2\}}(m^2)\, \frac{dm^2}{(m^2 - m_0^2)^\alpha}, \quad 0 < \alpha < 1, \tag{7}$$
and $\rho_1(dm^2) = \delta(m^2 - m_0^2)\,dm^2$;
(ii) $G_\alpha \in L^1(\mathbb{R}^d, dx)$ and $G_\alpha$ is smooth on $\mathbb{R}^d \setminus \{0\}$;
(iii) $G_\alpha(x) > 0$ $\forall x \in \mathbb{R}^d \setminus \{0\}$;
(iv) $\exists\, C > 0$ such that $G_\alpha(x) \le C e^{-m_0|x|}$ $\forall x \in \mathbb{R}^d$: $|x| > 1$;
(v) For $\lambda > 0$, $G_{\alpha,m_0}(\lambda x) = \lambda^{2\alpha-d}\, G_{\alpha,\lambda m_0}(x)$;
(vi) $|G_{\alpha,m_0}(x)| < c_\alpha(d)\, |x|^{-(d-2\alpha)}$ for $x \in \mathbb{R}^d \setminus \{0\}$, where $0 < c_\alpha(d) < \infty$, for $d \ge 2$, $0 < \alpha < 1$, can be chosen optimal as in Eq. (9) below.
Proof. All properties hold for $C_{m_0} = G_{1,m_0}$, cf. [34, p. 126]. The representation (i) has been established in [3, Sec. 6]. (iii) now follows from the fact that $\rho_\alpha$ is a positive measure and $C_m(x) > 0$ $\forall x \ne 0$. (iv) follows from (i) and the related property of $C_m$, $m \ge m_0$. (v) is a consequence of the representation
$$G_{\alpha,m_0}(x) = (2\pi)^{-d} \int_{\mathbb{R}^d} \frac{e^{ik\cdot x}}{(|k|^2 + m_0^2)^\alpha}\,dk \tag{8}$$
where the integral has to be understood in the sense of the Fourier transform of a tempered distribution. (ii) follows from (iv) and (vi); smoothness of $G_\alpha$ for $x \ne 0$ follows from the fact that by (i) $G_\alpha$ can be represented as a Fourier–Laplace transform and therefore is real analytic for such $x$. The same argument (using also the "mass-gap" in (i)) also shows that the partial derivatives of $G$ are in $L^1(\mathbb{R}^d \setminus B_1(0), dx)$.

Finally it remains to prove (vi): let $\lambda = |x|$ and $\hat e \in \mathbb{R}^d$, $|\hat e| = 1$. By rotation invariance of the $G_{\alpha,m_0}$ we get, using (i), (v) and the residue theorem,
$$\begin{aligned}
G_{\alpha,m_0}(x) &= \lambda^{-(d-2\alpha)} \sin(\pi\alpha) \int_{\lambda^2 m_0^2}^\infty C_m(\hat e)\, \frac{dm^2}{(m^2 - \lambda^2 m_0^2)^\alpha} \\
&= \lambda^{-(d-2\alpha)}\, \gamma_\alpha(d) \int_0^\infty \frac{dm^2}{m^{2\alpha}} \int_0^\infty \frac{e^{-\sqrt{t^2 + m^2 + \lambda^2 m_0^2}}}{\sqrt{t^2 + m^2 + \lambda^2 m_0^2}}\, t^{d-2}\,dt \\
&< \lambda^{-(d-2\alpha)}\, \gamma_\alpha(d) \int_0^\infty \frac{dm^2}{m^{2\alpha}} \int_0^\infty \frac{e^{-\sqrt{t^2 + m^2}}}{\sqrt{t^2 + m^2}}\, t^{d-2}\,dt.
\end{aligned} \tag{9}$$
We have set $\gamma_\alpha(d) = \mathrm{Vol}(S^{d-2}) \sin(\pi\alpha)/4\pi$. Here the right-hand side multiplied with $\lambda^{d-2\alpha}$ defines the constants $c_\alpha(d)$ and it is clear from the calculation that these constants are optimal for $\lambda \to 0$. For $d > 2$ it is obvious that the integrals converge. For $d = 2$, $0 < \alpha < 1$, the inner integral has a logarithmic singularity at $m = 0$. This singularity multiplied with $m^{-2\alpha}$ is however $dm^2$-integrable and thus $c_\alpha(d) < \infty$ in this case as well.

Remark 2.2. (i) As Proposition 2.1(v) shows, $G_{\alpha,0} \notin L^1(\mathbb{R}^d, dx)$ but $G_{\alpha,0} \in L^1_{\mathrm{loc}}(\mathbb{R}^d, dx)$. (ii) For $m_0 > 0$, $d/4 \ge \alpha > 0$ we have $G_\alpha \notin L^2(\mathbb{R}^d, dx)$ since $\int_{\mathbb{R}^d} G_\alpha^2\,dx = G_{2\alpha}(0) = \infty$, see also Proposition 2.1(vi). (iii) For $d = 1$, $\alpha > 1/4$ we have $G_\alpha \in L^2(\mathbb{R}^d, dx)$; in particular this applies to $\alpha = 1/2$.
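The scaling relation in Proposition 2.1(v) can be checked by hand in the one case where the kernel has a familiar closed form: for $d = 3$ and $\alpha = 1$ the Green's function of $-\Delta + m_0^2$ is the Yukawa kernel $e^{-m_0|x|}/(4\pi|x|)$. The sketch below is only an illustration under that assumption; the helper name and the sample points are hypothetical, and the bound tested at the end is the elementary estimate $G_{1,m_0}(x) < 1/(4\pi|x|)$ rather than the optimal constant of Eq. (9), which is stated for $0 < \alpha < 1$.

```python
import numpy as np

def yukawa_3d(x, m0):
    """Green's function of (-Laplace + m0^2) in d = 3: exp(-m0 |x|) / (4 pi |x|)."""
    r = np.linalg.norm(np.atleast_2d(x), axis=-1)
    return np.exp(-m0 * r) / (4.0 * np.pi * r)

d, alpha, m0, lam = 3, 1.0, 0.7, 2.5
x = np.array([[0.3, -0.2, 0.1], [1.0, 2.0, -0.5], [0.01, 0.0, 0.0]])

# Scaling relation (v): G_{alpha,m0}(lam x) = lam^(2 alpha - d) G_{alpha,lam m0}(x)
lhs = yukawa_3d(lam * x, m0)
rhs = lam ** (2 * alpha - d) * yukawa_3d(x, lam * m0)
scaling_ok = np.allclose(lhs, rhs)

# Short-distance bound of the type (vi): G(x) < c |x|^(-(d - 2 alpha)), here c = 1/(4 pi)
r = np.linalg.norm(x, axis=-1)
bound_ok = bool(np.all(yukawa_3d(x, m0) < r ** (-(d - 2 * alpha)) / (4.0 * np.pi)))

print(scaling_ok, bound_ok)
```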
We can deduce from Proposition 2.1(i) that the Schwinger functions of the model with $G = G_\alpha$ can be analytically continued to Wightman functions, which have been calculated explicitly in [3]. From these explicit formulas one can see that for $0 < \alpha < 1/2$ the mass-shell singularities of the truncated Wightman functions are of order $\kappa^{-\alpha}$ ($\kappa = k^0 - \omega_{m_0}$, $\omega_m = (|k|^2 + m^2)^{1/2}$) and hence the model does not describe scattering particles.^e In the most important case $\alpha = 1/2$ one can construct incoming and outgoing multi-particle states using the method of [2] but the scattering is trivial, since the mass-shell singularities of the Wightman functions in momentum space are of the order $\kappa^{-1/2}$ and are thus too weak to produce nontrivial scattering (for that one requires order $\kappa^{-1}$). In this sense, the convoluted generalized white noise models can still be considered as "free fields", even though higher order truncated Wightman functions do not vanish. But it should also be noted that such higher order truncated Wightman functions can be decomposed into a superposition of "structure functions" with non-trivial scattering behavior [2].

^e The use of partial pseudo differential operators leads to mass smearing which in some sense is related to the concept of "infra particles", cf. [54].

2.4. Lattice approximation of noise fields and infinitely divisible laws

Finally in this section we want to give some heuristic evidence that it is natural to define the generalized white noise $F$ as in (2) and (3). Heuristically speaking, a noise field is a collection of independent identically distributed (i.i.d.) random variables $\{F(x)\}_{x \in \mathbb{R}^d}$. To make this notion precise, we substitute the continuum $\mathbb{R}^d$ with a lattice $L_n = \frac{1}{n}\mathbb{Z}^d$, $n$ odd, of lattice spacing $1/n$ and we consider the limit $n \to \infty$ for i.i.d. random variables $\{F_n(x)\}_{x \in L_n}$. We require that the distribution of the average of the random variables $F_n(x)$ remains constant in the unit cube $\Lambda_1$ centered at zero, i.e.
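$$F_1(0) \stackrel{\mathcal{L}}{=} \sum_{x \in L_n \cap \Lambda_1} F_n(x)/n^d. \tag{10}$$
We remark that $\sharp(\Lambda_1 \cap L_n) = n^d$. Equation (10) can only be fulfilled for arbitrary $n \in \mathbb{N}$ if $F_1(0)$ has infinitely divisible probability law and thus by Schoenberg's theorem [20] $E[e^{itF_1(0)}] = e^{\psi(t)}$ for some conditionally positive definite function $\psi$ and $E[e^{itF_n(x)/n^d}] = e^{\psi(t)/n^d}$. Furthermore, (if $\psi$ is $C^1$-differentiable) a representation (3) is given by the Lévy–Khintchine theorem [20]. For $f \in \mathcal{S}$ with compact support we set $\langle F_n, f\rangle = \sum_{x \in L_n} F_n(x) f(x)/n^d$ and we get
$$\begin{aligned}
E\big[e^{i\langle F_n, f\rangle}\big] &= \prod_{x \in L_n} E\big[e^{iF_n(x) f(x)/n^d}\big] \\
&= \prod_{x \in L_n} e^{\psi(f(x))/n^d} \\
&= e^{\sum_{x \in L_n} \psi(f(x))/n^d} \to e^{\int_{\mathbb{R}^d} \psi(f)\,dx} \quad \text{as } n \to \infty
\end{aligned} \tag{11}$$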
where the last step shows that the lattice approximation $F_n$ converges to $F$ in law as $n \to \infty$, cf. (2). For further information on the lattice approximation see [16].

3. Path Properties of Convoluted Poisson White Noise and Ultra-Violet Finite Local Interactions

3.1. Path properties and quantum field theory

We say that a random field $X$ realized on the probability space $(\mathcal{S}', \mathcal{B}(\mathcal{S}'), P^X)$ has paths in $E$, where $E \subseteq \mathcal{S}'$ is a continuously embedded topological vector space, if $E$ has $P^X$ inner measure one, i.e. $\sup_{B \in \mathcal{B}(\mathcal{S}'),\, B \subseteq E} P^X\{X \in B\} = 1$.^f The path properties of $X$ are then given by the general properties of the distributions in $E$, e.g. the property that they can be represented as functions. The rather irregular paths of Nelson's free field can be considered as the main source of problems in constructive quantum field theory. For $d \ge 2$ the paths are contained in weighted Sobolev spaces with negative index and cannot be represented as functions [22, 51, 61]. Consequently, energy densities $v(X)$ needed for the construction of local field interactions are ill-defined. In $d = 2$ (and partially also in $d = 3$) local interactions of polynomial, exponential and trigonometric type have been defined via regularization of paths and application of a renormalization procedure leading to the definition of "Wick-ordered" local interactions $:v(X):$ (for the path properties of these $:v(X):$ see e.g. [61]). The increasing irregularity of the paths for $d \ge 4$ (in physical terms: increasing ultra-violet divergences) has so far not allowed an application of these techniques to the physical case $d = 4$. It is therefore an interesting feature of convoluted Poisson noise (CPN), i.e. a convoluted generalized white noise such that the Lévy characteristic (3) has only a Poisson part,^g that for a large class of convolution kernels $G$ the paths are given by locally integrable functions and thus some local interactions can be defined without renormalization and therefore give ultra-violet finite interactions. This works independently of the space-time dimension $d \ge 2$ (and, of course, also for $d = 1$).

^f Usually one only demands that $\{X \in E\}$ has $P^X$ outer measure 1, but for our considerations we need this stronger formulation.
^g With only minor modifications, the considerations of this work can be extended to fields which also have a deterministic part.

3.2. Poisson noise and locally finite marked configurations

Let us first recall a well-known construction, see e.g. [14]. Let $\Lambda_n \subseteq \mathbb{R}^d$ be a monotone sequence of compact sets such that $\Lambda_n \uparrow \mathbb{R}^d$ as $n \to \infty$ and $\Lambda_0 = \emptyset$. For $n \in \mathbb{N}$ let $D_n = \Lambda_n \setminus \Lambda_{n-1}$ and we denote the (Lebesgue) volume of $D_n$ by $|D_n|$. Let $\{N_n\}_{n \in \mathbb{N}}$, $\{Y_n^j\}_{n,j \in \mathbb{N}}$, $\{S_n^j\}_{j,n \in \mathbb{N}}$ be three families of independent random variables on some probability space $(\Omega, \mathcal{B}, P)$ which are distributed as follows: $N_n: \Omega \to \mathbb{N}_0$ has a Poisson law with intensity $z|D_n|$, i.e. $P\{N_n = l\} = e^{-z|D_n|} z^l |D_n|^l / l!$, $Y_n^j: \Omega \to \mathbb{R}^d$ has uniform distribution on $D_n$ (i.e. $1_{D_n}\,dx/|D_n|$) and the distribution of $S_n^j: \Omega \to \mathbb{R}$ is given by the Lévy measure $r$. From now on we will assume that $r$ has compact support, $\operatorname{supp} r \subseteq [-c, c]$, for some $c > 0$. By $\mathcal{D}'$ we denote the space of (not necessarily tempered) distributions. We define a mapping $\phi: \Omega \to \mathcal{D}'$ via
$$\phi = \sum_{n=1}^\infty \phi_n, \qquad \phi_n = \sum_{j=1}^{N_n} S_n^j\, \delta_{Y_n^j} \tag{12}$$
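where $\delta_x$ is the Dirac measure in $x$. Obviously, $\phi$ has range in the space of locally finite marked configurations, which is defined as the space of (real) signed measures $\gamma$ on $\mathbb{R}^d$ such that $\sharp(\operatorname{supp}\gamma \cap \Lambda) < \infty$ for any compact $\Lambda \subseteq \mathbb{R}^d$. By $|\gamma|$ we denote the absolute value of the signed measure $\gamma$. Let $f$ be a positive measurable function on $\mathbb{R}^d$. A signed measure $\gamma$ is called $f$-finite if $\int_{\mathbb{R}^d} f\,d|\gamma| < \infty$. We also use the notation $\langle\gamma, f\rangle = \int_{\mathbb{R}^d} f\,d\gamma$ for a (signed) measure $\gamma$ on $\mathbb{R}^d$, provided that the integral exists. In particular this is always the case if both $f$ and $\gamma$ are non-negative.

Proposition 3.1. (i) $\phi$ is $P$-a.s. $f$-finite $\forall f \in L^1(\mathbb{R}^d, dx) \cap L^\infty(\mathbb{R}^d, dx)$, $f > 0$.
(ii) In particular, $\phi \in \mathcal{S}'$ $P$-a.s. For $\mathcal{N}$ the exceptional null set, $\phi: (\Omega \setminus \mathcal{N}, \mathcal{B} \cap (\Omega \setminus \mathcal{N})) \to (\mathcal{S}', \mathcal{B}(\mathcal{S}'))$ is measurable.
(iii) Let $F$ be the Poisson white noise with pure Poisson Lévy characteristic determined by $r$ and $z$ and let $P^F$ be the associated measure on $(\mathcal{S}', \mathcal{B}(\mathcal{S}'))$ such that $F$ is the coordinate process w.r.t. $P^F$. Then $\phi_* P = P^F$.
(iv) Assume that $f$ as above is also continuous. Then $F$ has paths in the space of $f$-finite, locally finite marked configurations, which is an element of $\mathcal{B}(\mathcal{S}')$.

The estimates obtained in this proposition are actually not better than those known in the literature. We give a proof for the convenience of the reader.

Proof. (i) Since $[-c, c] \times D_n \ni (s, y) \mapsto \langle s\delta_y, f\rangle = s f(y) \in \mathbb{R}$ is measurable, we get that $\langle\phi_n, f\rangle$ and $\langle|\phi_n|, f\rangle$ are measurable real-valued random variables. Since $\langle|\phi|, f\rangle = \sum_{n=1}^\infty \langle|\phi_n|, f\rangle \in [0, \infty]$ converges by monotonicity, the left-hand side of this equation is measurable. $E_P[e^{\langle|\phi|, f\rangle}] < \infty$ implies $P\{\langle|\phi|, f\rangle < \infty\} = 1$. We can now use the following Laplace transform estimate
$$\begin{aligned}
E_P\big[e^{\langle|\phi|, f\rangle}\big] &= \lim_{N \to \infty} E_P\big[e^{\langle|\phi|, 1_{\Lambda_N} f\rangle}\big] \\
&= \lim_{N \to \infty} \prod_{n=1}^N E_P\big[e^{\langle|\phi_n|, f\rangle}\big] \\
&= \lim_{N \to \infty} \prod_{n=1}^N e^{-z|D_n|} \sum_{l=0}^\infty \frac{z^l |D_n|^l}{l!} \int_{D_n^{\times l} \times [-c,c]^{\times l}} e^{\sum_{j=1}^l |s_n^j| f(y_n^j)}\, \frac{dy_n^1}{|D_n|} \cdots \frac{dy_n^l}{|D_n|}\, dr(s_n^1) \cdots dr(s_n^l) \\
&= \lim_{N \to \infty} \prod_{n=1}^N e^{z \int_{D_n \times [-c,c]} (e^{|s| f(y)} - 1)\,dy\,dr(s)} \\
&= e^{z \int_{\mathbb{R}^d} \int_{[-c,c]} (e^{|s| f(y)} - 1)\,dr(s)\,dy} \\
&\le e^{z \int_{\mathbb{R}^d} (e^{c f(y)} - 1)\,dy} \\
&\le e^{z c\, \|f\|_1\, e^{c \|f\|_\infty}} < \infty.
\end{aligned} \tag{13}$$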
Here $\|\cdot\|_p$ denotes the norm of $L^p(\mathbb{R}^d, dx)$, $p \in [1, \infty]$, and the limits in the intermediate steps always exist by monotonicity. (ii) follows immediately, since choosing $f(x) = 1/(1 + |x|^2)^d$ shows that $|\phi|$, and hence also $\phi$, is polynomially bounded $P$-a.s. To show measurability of $\phi$, by definition of $\mathcal{B}(\mathcal{S}')$ it suffices to show that $\langle\phi, f\rangle$ is measurable $\forall f \in \mathcal{S}$ and this can be proven as in (i). (iii) By a calculation which is analogous to (13) one can show that $C_F(f) = E_P[e^{i\langle\phi, f\rangle}] = \int_{\mathcal{S}'} e^{i\langle\omega, f\rangle}\,d\phi_* P(\omega)$ and the statement follows from the uniqueness of $P^F$ which holds by Minlos' theorem. To show (iv) we first remark that by (iii) the range of $\phi$ is in this set. Thus, the set of $f$-finite, locally finite marked configurations has $P^F$ outer measure one. It remains to show that it is a measurable set. Firstly, the set of locally finite marked configurations $\Gamma$ in $\mathcal{S}'$ can be written as
$$\bigcap_{R \in \mathbb{Q}_+} \bigcup_{n \in \mathbb{N}} \bigcap_{\varepsilon \in \mathbb{Q}_+} \bigcup_{\substack{s_1, \ldots, s_n \in \mathbb{Q} \\ y_1, \ldots, y_n \in \mathbb{Q}^d}} \bigcap_{\substack{h \in \tilde{\mathcal{D}}(B_R) \\ \|h\|_\infty < 1}} \Big\{\omega \in \mathcal{S}': \Big|\omega(h) - \sum_{l=1}^n s_l h(y_l)\Big| < \varepsilon\Big\} \tag{14}$$
where $\tilde{\mathcal{D}}(B_R)$ is a countable, dense subset of the set of test functions with support in the ball centered at zero with radius $R$. Thus, $\Gamma$ is measurable. The subset of $f$-finite elements in $\Gamma$ can be written in manifestly measurable form as
$$\bigcup_{C \in \mathbb{Q}_+} \bigcap_{n \in \mathbb{N}} \bigcap_{\substack{h \in \tilde{\mathcal{S}} \\ \|h\|_\infty \le 1}} \{\omega \in \Gamma: |\omega(h f_n)| < C\} \tag{15}$$
where $f_n \in \mathcal{S}$ is a monotone sequence of positive functions approximating $f$ from below in the local uniform topology and $\tilde{\mathcal{S}}$ is a countable, dense subset of $\mathcal{S}$. This concludes the proof.

By item (iii) of Proposition 3.1 we can identify $\phi$ with $F$ and we therefore drop the notation $\phi$ in the following.

3.3. Path properties of convoluted Poisson noise

From the path properties of $F$ we can now deduce the path properties of $X = G * F$ as follows:

Theorem 3.2. Let $F$ be a Poisson noise with Lévy measure $r$ of compact support, $\operatorname{supp} r \subseteq [-c, c]$, and let $G \in L^1(\mathbb{R}^d, dx)$. Then $X = G * F$ has paths in $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$ where $\varepsilon > 0$ and $g_\varepsilon(x) = 1/(1 + |x|^2)^{(d+\varepsilon)/2}$.

Proof. Let $\Gamma_{|G| * g_\varepsilon}$ be the set of $|G| * g_\varepsilon$-finite, locally finite marked configurations. One can easily check that $|G| * g_\varepsilon$ fulfils the conditions on $f$ in Proposition 3.1. As proven there, this set is $\mathcal{B}(\mathcal{S}')$-measurable. By our general assumptions on $L$,
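$L^{-1}: \mathcal{S}' \to \mathcal{S}'$ is continuous and thus is a measurable transformation on $(\mathcal{S}', \mathcal{B}(\mathcal{S}'))$. Since $P^X = (L^{-1})_* P^F = P^F \circ L$, the support of $P^X$ lies in the measurable set $L^{-1}\Gamma_{|G| * g_\varepsilon}$ and we have to prove that this set lies in $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$. Let $\Lambda_n \uparrow \mathbb{R}^d$, $\Lambda_n \subseteq \mathbb{R}^d$ be open and bounded. Furthermore let $D_{nl} = \Lambda_n \setminus \Lambda_l$ for $n > l$. For $\gamma \in \Gamma_{|G| * g_\varepsilon}$, we denote the restriction of $\gamma$ to an open set $A \subseteq \mathbb{R}^d$ by $\gamma_A$. Clearly, $G * \gamma_{\Lambda_n} \in L^1(\mathbb{R}^d, g_\varepsilon\,dx)$ since $G$ is in $L^1(\mathbb{R}^d, dx)$ and $\operatorname{supp}\gamma_{\Lambda_n}$ is finite. The following estimate shows that $G * \gamma_{\Lambda_n}$ forms a Cauchy sequence in $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$. With $\|\cdot\|_{\varepsilon,1}$ the $L^1$-norm on that space, we get
$$\begin{aligned}
\sup_{n > l} \|G * \gamma_{\Lambda_n} - G * \gamma_{\Lambda_l}\|_{\varepsilon,1} &= \sup_{n > l} \|G * \gamma_{D_{nl}}\|_{\varepsilon,1} \\
&\le \sup_{n > l} \int_{\mathbb{R}^d} |G| * g_\varepsilon\,d|\gamma_{D_{nl}}| \\
&= \int_{\mathbb{R}^d \setminus \Lambda_l} |G| * g_\varepsilon\,d|\gamma| \to 0 \quad \text{as } l \to \infty
\end{aligned} \tag{16}$$
since $\gamma$ is $|G| * g_\varepsilon$-finite. Also,
$$\lim_{n \to \infty} \langle G * \gamma_{\Lambda_n}, f\rangle = \lim_{n \to \infty} \langle\gamma_{\Lambda_n}, G * f\rangle = \langle\gamma, G * f\rangle = \langle G * \gamma, f\rangle \quad \forall f \in \mathcal{S}, \tag{17}$$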
and by the fact that convergence in $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$ implies convergence in $\mathcal{S}'$, we get that $G * \gamma$ coincides with the limit of $G * \gamma_{\Lambda_n}$ in the Banach space $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$.

We remark that by Proposition 2.1 the kernels $G_\alpha$ for $0 < \alpha \le 1$ fulfil the requirements of Theorem 3.2. In the context of quantum vector fields obtained from SPDEs driven by a Poisson white noise, path properties have been considered in [11, 12, 13, 32, 33, 60], where in the latter references it is proven that CPN has piecewise smooth paths with discrete singularities. This has been used to define Wilson loop observables or stochastic cosurfaces (for this concept see [14, 26] and references therein). Local $L^1$-integrability of paths does not hold for all of these models, since the Green's functions for vector-valued fields in many cases cannot be represented by locally integrable functions. Nevertheless, most of the analysis of this paper would also be possible using the path properties derived in the references given above, at the price of more restrictive assumptions on the interactions (to be introduced in the following subsection).

3.4. Definition of local potentials

Having established the path properties of the CPN model, we now want to define nonlinear, local interactions. The construction is based on the elementary fact that for a measurable function $v: \mathbb{R} \to \mathbb{R}$ such that $|v(t)| \le a + b|t|$ for some $a, b > 0$ the nonlinear transformation $L^1(\mathbb{R}^d, g_\varepsilon\,dx) \ni f \mapsto v(f) \in L^1(\mathbb{R}^d, g_\varepsilon\,dx)$ is well defined.
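The elementary fact used here is the pointwise bound $|v(f(x))| \le a + b|f(x)|$, which integrates against the finite weight $g_\varepsilon\,dx$ to $\|v(f)\|_{\varepsilon,1} \le a\,\|1\|_{\varepsilon,1} + b\,\|f\|_{\varepsilon,1}$. A minimal numerical sketch in $d = 1$ follows; the density $v$, the path $f$, the truncation of the real line and the grid are all illustrative assumptions.

```python
import numpy as np

def weighted_l1_norm(values, x, eps):
    """Riemann-sum approximation of the L^1(R, g_eps dx) norm in d = 1,
    with g_eps(x) = (1 + x^2)^(-(1 + eps)/2); the line is truncated to the grid."""
    g = (1.0 + x**2) ** (-(1.0 + eps) / 2.0)
    dx = x[1] - x[0]
    return float(np.sum(np.abs(values) * g) * dx)

a, b, eps = 0.5, 2.0, 0.1

def v(t):
    # Hypothetical energy density with |v(t)| <= a + b |t|
    return a * np.cos(t) + b * t * np.sin(t)

x = np.linspace(-200.0, 200.0, 400001)
f = 1.0 + np.sin(3.0 * x)  # bounded path: not in L^1(dx), but in L^1(g_eps dx)

lhs = weighted_l1_norm(v(f), x, eps)
rhs = a * weighted_l1_norm(np.ones_like(x), x, eps) + b * weighted_l1_norm(f, x, eps)
print(lhs, rhs)  # lhs does not exceed rhs
```

The example path is deliberately not integrable with respect to Lebesgue measure, which is the situation the weight $g_\varepsilon$ is meant to handle.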
Theorem 3.3. Let $v: \mathbb{R} \to \mathbb{R}$ be a measurable function such that $|v(t)| \le a + b|t|$ for some $a, b \ge 0$ and let $X$ be a CPN as in Theorem 3.2. Let $\Lambda \subseteq \mathbb{R}^d$ be compact and $\beta \ge 0$. Then
(i) $v(X)$ is a random field with paths in $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$, $\varepsilon > 0$;
(ii) $V_\Lambda = \langle v(X), 1_\Lambda\rangle \in \bigcap_{p \ge 1} L^p(\mathcal{S}', P^X)$;
(iii) $e^{-\beta V_\Lambda} \in \bigcap_{p \ge 1} L^p(\mathcal{S}', P^X)$;
(iv) Let $\Xi_\Lambda = \Xi(z, \beta, \Lambda) = E_{P^X}[e^{-\beta V_\Lambda}]$. Then
$$P^{\bar X_\Lambda} = \frac{e^{-\beta V_\Lambda}}{\Xi_\Lambda}\, P^X \tag{18}$$
defines a probability measure on $(\mathcal{S}', \mathcal{B}(\mathcal{S}'))$.

Proof. (i) That $X \in L^1(\mathbb{R}^d, g_\varepsilon\,dx) \Rightarrow v(X) \in L^1(\mathbb{R}^d, g_\varepsilon\,dx)$ is elementary. It remains to prove that $\langle v(X), f\rangle$ is measurable. To this aim let $v$ be continuous and $\chi_\epsilon$ be a sequence of Schwartz functions such that $\chi_\epsilon \to \delta_0$ as $\epsilon \to 0$. Let $\chi_{\epsilon,x}$ be the translation of $\chi_\epsilon$ by $x$. Then $v(X(\chi_{\epsilon,x}))$ is a random variable. For a fixed random parameter in the set $G * \Gamma_{|G| * g_\varepsilon}$ of $P^X$ measure one, $X(x)$ is an $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$ function in $x$, cf. the proof of Theorem 3.2. For random parameters in the exceptional null set we re-define $v(X(\chi_{\epsilon,x}))$ to be zero. Approximating the integral by a Riemann sum, we also get that $\int_\Lambda v(X(\chi_{\epsilon,x})) f(x)\,dx$ is measurable since the pointwise limit of measurable functions is measurable. Since $X$ is an $L^1(\mathbb{R}^d, g_\varepsilon\,dx)$-function, there exists a subsequence $\epsilon_n \to 0$ such that $X(\chi_{\epsilon_n,x}) \to X(x)$ $dx$-a.e. and $v(X(\chi_{\epsilon_n,x})) \to v(X(x))$ $dx$-a.e. for $v$ continuous. Consequently, the integral $\int_\Lambda v(X(\chi_{\epsilon_n,x})) f(x)\,dx$ converges to $\langle v(X), f\rangle$ by dominated convergence. Thus, this expression is measurable for continuous $v$. By an approximation of a measurable $v$ by continuous functions, using the dominated convergence theorem again, we get that $\langle v(X), f\rangle$ is measurable also for $v$ assumed to be only measurable. (ii) and (iv) follow from (iii) with $v(t)$ replaced with $-|v(t)|$. (iii) Since $(e^{-\beta V_\Lambda})^p = e^{-p\beta V_\Lambda}$ it suffices to prove the statement for $p = 1$. We note that
$$-\beta V_\Lambda = -\beta\langle v(X), 1_\Lambda\rangle \le \beta b\,\langle|F|, |G| * 1_\Lambda\rangle + \beta a|\Lambda| \tag{19}$$
and that $\beta b\,|G| * 1_\Lambda \in L^1(\mathbb{R}^d, dx) \cap L^\infty(\mathbb{R}^d, dx)$. Thus $E_{P^X}[e^{-\beta V_\Lambda}] < \infty$ follows as in the estimate (13).
Remark 3.4. The growth condition on v in Theorem 3.3 can be relaxed in various ways. For example, obviously for v positive, (iii) is trivially satisfied and to see that VΛ < ∞ P X -a.s. the condition v(G) ∈ L1loc (Rd , dx) would be sufficient. A refined analysis of this point is postponed to later work (see however the examples of Sec. 5).
We denote the coordinate process associated to $P^{\bar X_\Lambda}$ by $\bar X_\Lambda$ and we call it interacting CPN with infra-red cut-off $\Lambda$.
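The measure (18) is a reweighting of $P^X$ by the Boltzmann factor $e^{-\beta V_\Lambda}$, so expectations under it can be approximated from independent draws of the free field. The sketch below assumes that values of $V_\Lambda$ on such draws are already available (the array passed in is a hypothetical placeholder), and it uses a standard self-normalized estimator rather than anything specific to this paper.

```python
import numpy as np

def partition_function_estimate(v_samples, beta):
    """Monte Carlo estimate of Xi_Lambda = E[exp(-beta V_Lambda)] from samples of V_Lambda."""
    return float(np.mean(np.exp(-beta * np.asarray(v_samples, dtype=float))))

def gibbs_weights(v_samples, beta):
    """Normalized weights proportional to exp(-beta V_k); the maximum is subtracted
    in the exponent for numerical stability, which leaves the normalized weights unchanged."""
    log_w = -beta * np.asarray(v_samples, dtype=float)
    log_w -= log_w.max()
    w = np.exp(log_w)
    return w / w.sum()

def reweighted_mean(observable_samples, v_samples, beta):
    """Estimate of the expectation of an observable under the reweighted measure (18)."""
    weights = gibbs_weights(v_samples, beta)
    return float(np.sum(weights * np.asarray(observable_samples, dtype=float)))

# Placeholder values of V_Lambda and of some observable on three draws of X
v_values = [0.3, -1.2, 2.0]
obs_values = [1.0, 2.0, 3.0]
print(partition_function_estimate(v_values, beta=1.5))
print(reweighted_mean(obs_values, v_values, beta=1.5))
```

For $\beta = 0$ the weights are uniform and the estimate of $\Xi_\Lambda$ equals one, matching the fact that (18) then reduces to $P^X$.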
4. The Connection with Particle Systems in the Grand Canonical Ensemble In this section we explain, how models of CPN with local interaction can be interpreted as systems of interacting classical, continuous particles in the configurational grand canonical ensemble (GCE).
4.1. Continuous classical particles in the grand canonical ensemble

To begin with, we recall some notions of statistical mechanics following [52]. Let $(y, p, s) \in \mathbb{R}^d \times \mathbb{R}^d \times [-c, c]$ be the "coordinates" of a classical point particle of mass $M > 0$ in $d$-dimensional Euclidean space. Here $y$ gives the position, $p$ the momentum and $s$ is an "internal parameter", called charge, which is not dynamic, i.e. is not altered by the interaction with other particles. The classical Hamiltonian of $n$ such particles is given by
$$H(y_1, \ldots, y_n; p_1, \ldots, p_n; s_1, \ldots, s_n) = \sum_{l=1}^n \frac{|p_l|^2}{2M} + U(y_1, \ldots, y_n; s_1, \ldots, s_n) \tag{20}$$
where $U(y_1, \ldots, y_n; s_1, \ldots, s_n)$ is the potential energy. We assume that there is some a priori distribution of the charges $s$ given by a probability measure $r$ with $\operatorname{supp} r \subseteq [-c, c]$. The GCE at inverse temperature $\beta > 0$ with chemical potential $\mu \in \mathbb{R}$ in the finite volume $\Lambda \subseteq \mathbb{R}^d$, $\Lambda$ compact, is given (up to normalization) by the following measures on the $n$-particle configuration space
$$\frac{1}{n!}\, e^{\beta[n\mu - H(y_1, \ldots, y_n; p_1, \ldots, p_n; s_1, \ldots, s_n)]}\, dy_1 \cdots dy_n\, dp_1 \cdots dp_n\, dr(s_1) \cdots dr(s_n) \tag{21}$$
where $y_1, \ldots, y_n \in \Lambda$, $s_1, \ldots, s_n \in [-c, c]$. Carrying out the Gaussian integral over the momenta, we pass to the configurational GCE (also abbreviated by GCE in the following) defined (up to normalization) through the following measures on spatial $n$-particle configurations ("marked" by charges $s_1, \ldots, s_n$)
$$\frac{z^n}{n!}\, e^{-\beta U(y_1, \ldots, y_n; s_1, \ldots, s_n)}\, dy_1 \cdots dy_n\, dr(s_1) \cdots dr(s_n) \tag{22}$$
where
$$z = e^{\beta\mu} \left(\frac{2\pi M}{\beta}\right)^{d/2} > 0 \tag{23}$$
is the activity of the system.^h The functions $e^{-\beta U(y_1, \ldots, y_n; s_1, \ldots, s_n)}$ are called the Boltzmann weights of the system.

4.2. Interacting Poisson quantum fields and interacting particle systems

Identifying $(y_1, \ldots, y_n; s_1, \ldots, s_n)$, $y_j \ne y_l$, $l \ne j$, with $\sum_{l=1}^n s_l \delta_{y_l}$ it is easy to show (by a calculation analogous to Eq. (13)) that in the case $U \equiv 0$ the measure (22) can be identified with the Poisson noise $F_\Lambda = 1_\Lambda F$ where $F$ has Lévy measure $r$ and activity $z$. Thus, $F_\Lambda$ describes a gas of non-interacting particles in the "box" $\Lambda$, see e.g. [31, 50]. Here we want to extend this analogy to the interacting models of the preceding section:

• We consider (configurational) GCEs of charged, indistinguishable particles in a finite volume $\Lambda$.
• The charges of the particles give rise to a static field; the field of the unit charge in $y$ is given by the Green's function $G(x - y)$; the static field penetrates^i the "walls" of the "box" $\Lambda$.
• The static field $X$ of a charge configuration $(y_1, \ldots, y_n; s_1, \ldots, s_n)$, $y_j \ne y_l$, $l \ne j$, is obtained by superposition from the fields of the single particles and is thus given by $\sum_{l=1}^n s_l G(x - y_l)$; equivalently the static field is obtained as the solution of the generalized Poisson equation $LX = \eta$ with $\eta = \sum_{l=1}^n s_l \delta_{y_l}$ (Fig. 1).
• The potential energy of the particle configuration $\eta = \sum_{l=1}^n s_l \delta_{y_l}$ is given by a (nonlinear) energy density $v: \mathbb{R} \to \mathbb{R}$, $v(0) = 0$, of the static field $X$:
$$\begin{aligned}
U(\eta) = U(y_1, \ldots, y_n; s_1, \ldots, s_n) &= \int_{\mathbb{R}^d} v\Big(\sum_{l=1}^n s_l G(x - y_l)\Big)\,dx \\
&= \int_{\mathbb{R}^d} v(G * \eta)\,dx = \int_{\mathbb{R}^d} v(X)\,dx.
\end{aligned} \tag{24}$$

Fig. 1. Field of a unit charge and ten particles with positive and negative charges $\pm 1/\sqrt{10}$, $G(x) = e^{-m_0|x|}/|x|$, $m_0 = 3$.

^h By an adaptation of $\mu$ and/or $M$ it is possible to consider $z$ and $\beta$ as independent parameters.
^i This assumption can be changed by introducing boundary conditions for $L$, cf. Remark 4.2 below.
The interacting CPN in the finite volume $\Lambda$ is the random field given by the static field of the interacting particle system in the GCE with potential energy $U$ restricted to the box $\Lambda$. We remark that the potential $U$ defined in (24) is Euclidean invariant, provided $G$ is invariant under rotations. Furthermore $U$ is symmetric under permutations of the arguments $(y_1, s_1), \ldots, (y_n, s_n)$.
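The correspondence described in this subsection can be sketched numerically: draw a non-interacting configuration in a box, superpose the single-particle fields, and integrate an energy density of the resulting static field as in (24). The code below is a rough illustration in $d = 2$ only. The bounded kernel $G(x) = e^{-m_0|x|}$, the charge law $r = \frac12(\delta_{-1} + \delta_{+1})$, the grid and all function names are assumptions made for the example, not choices taken from the text.

```python
import numpy as np

def sample_configuration(rng, z, box_length, d=2, charges=(-1.0, 1.0)):
    """Draw (positions, marks) of a Poisson gas in the box [-l/2, l/2]^d:
    N ~ Poisson(z |Lambda|), uniform positions, charges drawn uniformly from `charges`."""
    n = rng.poisson(z * box_length**d)
    positions = rng.uniform(-box_length / 2, box_length / 2, size=(n, d))
    marks = rng.choice(charges, size=n)
    return positions, marks

def static_field(grid, positions, marks, m0=3.0):
    """X(x) = sum_l s_l G(x - y_l) with the assumed kernel G(x) = exp(-m0 |x|)."""
    dist = np.linalg.norm(grid[:, None, :] - positions[None, :, :], axis=-1)
    return (marks[None, :] * np.exp(-m0 * dist)).sum(axis=1)

def potential_energy(positions, marks, v, half_width=3.0, n_grid=121, m0=3.0):
    """Riemann-sum approximation of U = int v(X(x)) dx over a square grid in d = 2."""
    axis = np.linspace(-half_width, half_width, n_grid)
    dx = axis[1] - axis[0]
    xx, yy = np.meshgrid(axis, axis, indexing="ij")
    grid = np.stack([xx.ravel(), yy.ravel()], axis=1)
    return float(np.sum(v(static_field(grid, positions, marks, m0))) * dx**2)

rng = np.random.default_rng(1)
positions, marks = sample_configuration(rng, z=5.0, box_length=1.0)
u = potential_energy(positions, marks, np.abs)  # energy density v(t) = |t|, v(0) = 0

# Relabelling the particles leaves the energy unchanged
perm = rng.permutation(len(marks))
u_perm = potential_energy(positions[perm], marks[perm], np.abs)
print(u, u_perm)
```

Note that the integration grid extends beyond the box, reflecting the remark that the static field penetrates the walls of $\Lambda$.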
4.3. Finite volume versus infra-red cut-off

Let us now put this into mathematical terms. In particular we want to give sufficient conditions such that the potential $U$ in (24) is well-defined and stable. Let $F_\Lambda = 1_\Lambda F$ be the restriction of $F$ to the compact region $\Lambda$. We set $N_\Lambda = \sharp \operatorname{supp} F_\Lambda$ and we recall that $N_\Lambda$ is Poisson distributed with intensity $z|\Lambda|$. We have $N_\Lambda < \infty$ $P^F$-a.s. and hence $X_\Lambda = G * F_\Lambda \in L^1(\mathbb{R}^d, dx)$ $P^F$-a.s. if $G \in L^1(\mathbb{R}^d, dx)$. The crucial observation in (24) is that for the CPN in finite volume $\Lambda$, $X_\Lambda$, we can define local interactions without taking an additional infra-red cut-off as in the usual QFT. Throughout the paper we thus distinguish between the techniques of taking an infra-red cut-off (as in Sec. 3) and restriction of the associated particle system to a finite volume. While it seems conceptually clear that both formulations lead to the same system if the infra-red cut-off is removed or the infinite volume limit is taken, respectively, this remains to be established mathematically. We now get the counterpart to Theorem 3.3 using a finite volume instead of an infra-red cut-off:

Theorem 4.1. Let $F$ be a Poisson noise and $G$, the Green's function of an operator $L$, as in Theorem 3.2. Let $X_\Lambda = G * F_\Lambda$. Furthermore, let $v: \mathbb{R} \to \mathbb{R}$ such that $|v(t)| \le b|t|$ for some $b > 0$ and let $\beta > 0$. Then
(i) $v(X_\Lambda)$ is a random field with paths in $L^1(\mathbb{R}^d, dx)$;
(ii) $\tilde V_\Lambda = \langle v(X_\Lambda), 1_{\mathbb{R}^d}\rangle \in \bigcap_{p \ge 1} L^p(\mathcal{S}', P^{X_\Lambda})$ or, equivalently, $U_\Lambda = \langle v(G * F_\Lambda), 1_{\mathbb{R}^d}\rangle \in \bigcap_{p \ge 1} L^p(\mathcal{S}', P^F)$;
(iii) The potential $U_\Lambda$ is stable, i.e. for $U_\Lambda^-$ the negative part of $U_\Lambda$ we have $U_\Lambda^- \le B N_\Lambda$ where $B = c\,b\,\|G\|_1$;
(iv) The grand partition function
$$\tilde\Xi_\Lambda = \tilde\Xi(z, \beta, \Lambda) = E_{P^{X_\Lambda}}\big[e^{-\beta\tilde V_\Lambda}\big] = E_{P^F}\big[e^{-\beta U_\Lambda}\big] \tag{25}$$
is entire analytic in $z$;
(v) In particular, $e^{-\beta\tilde V_\Lambda} \in \bigcap_{p \ge 1} L^p(\mathcal{S}', P^{X_\Lambda})$ or, equivalently, $e^{-\beta U_\Lambda} \in \bigcap_{p \ge 1} L^p(\mathcal{S}', P^F)$;
(vi) There exist measures on $(\mathcal{S}', \mathcal{B}(\mathcal{S}'))$ defined by
$$P^{\tilde X_\Lambda} = \frac{e^{-\beta\tilde V_\Lambda}}{\tilde\Xi_\Lambda}\, P^{X_\Lambda}, \qquad P^{\tilde F_\Lambda} = \frac{e^{-\beta U_\Lambda}}{\tilde\Xi_\Lambda}\, P^{F_\Lambda} \tag{26}$$
related through $L_* P^{\tilde X_\Lambda} = P^{\tilde F_\Lambda}$. Equivalently, the associated coordinate processes $\tilde X_\Lambda$ and $\tilde F_\Lambda$ fulfil the generalized Poisson equation $L\tilde X_\Lambda = \tilde F_\Lambda \Leftrightarrow \tilde X_\Lambda = G * \tilde F_\Lambda$.
Proof. (i) That $v(X_\Lambda)$ is a random field can be proven as in Theorem 3.3. That the paths are in $L^1(\mathbb{R}^d, dx)$ follows from $\langle|v(X_\Lambda)|, 1_{\mathbb{R}^d}\rangle \le b\,\|X_\Lambda\|_1 < \infty$. (ii) follows from (v) with $v$ replaced by $-|v|$. (v) follows from (iv). By [52, Chap. 3], (iv) is a consequence of the stability of the potential (iii). To prove (iii) we note that
$$\begin{aligned}
U_\Lambda^- &\le \langle v^-(G * F_\Lambda), 1_{\mathbb{R}^d}\rangle \le b \int_{\mathbb{R}^d} |G * F_\Lambda|\,dx \\
&\le b \int_{\mathbb{R}^d} |G| * |F_\Lambda|\,dx = b\,\|G\|_1 \int_{\mathbb{R}^d} d|F_\Lambda|
\end{aligned} \tag{27}$$
and $\int_{\mathbb{R}^d} d|F_\Lambda| \le c\,N_\Lambda$. (vi) now follows from (v), the fact that $\tilde V_\Lambda = U_\Lambda \circ L$, cf. Eq. (24), and the transformation formula for probability measures.

The conditions of Theorem 4.1 on the energy density $v$ are a little more restrictive than those of Theorem 3.3, where, for example, densities of the form $v(t) = |t|$ are admissible. In the framework of Theorem 4.1 such potentials can be dealt with at the price of a more technical treatment if one for example assumes an exponential decay for $G$, since stability is trivial for positive potentials. We also point out that in the framework of Theorem 4.1 we can treat the mass-zero cases (where $G \notin L^1(\mathbb{R}^d, dx)$, cf. Remark 2.2) of Proposition 2.1 if we demand that the (positive) energy density $v$ at $t = 0$ tends to zero sufficiently fast, e.g. $0 \le v(t) \le c\,t^\gamma$, for $0 \le |t| \le \varepsilon$, with $\gamma > d/(d - 2\alpha)$.

Remark 4.2. Most of the constructions presented in Secs. 3 and 4 can be extended to Riemannian manifolds. In particular, we can introduce local interactions on compact manifolds without any cut-off. As a simple example we consider the $d$-dimensional torus $T_l^d$ of length $l$: in this case the Green's functions $G = G_{\alpha,m_0}$, $m_0 > 0$, in Proposition 2.1 have to be modified by introducing periodic boundary conditions for the Laplacian. Then the translation invariant potential $U$ can be
defined by $U(y_1, \ldots, y_n; s_1, \ldots, s_n) = \int_{T_l^d} v\big(\sum_{j=1}^n s_j G(x - y_j)\big)\,dx$. The proof of stability is completely analogous to the one of Theorem 4.1. The infinite volume limit $T_l^d \to \mathbb{R}^d$ can now be studied as $l \to \infty$.

5. Models of Statistical Mechanics seen as "Poisson" Quantum Fields

In this section we show that a number of well-known particle systems can be associated to an interacting CPN (and hence to a "Poisson", Euclidean QFT) in the spirit of Theorem 4.1(vi). Most of the potentials we discuss in this section do not fulfil directly the requirements of Theorem 4.1, however they are known to fulfil the stability condition (e.g. when the potentials are positive or else by applying well-known criteria, cf. [52]). Hence these potentials (with the exception of Sec. 5.1) can also be used to construct Euclidean quantum field models in our spirit. Moreover they can be obtained by approximation from potentials in the class considered in Theorem 4.1.

5.1. The gas of hard spheres

Here we consider a particle system with identical particles carrying a unit charge, hence we set $r = \delta_1$, the Dirac measure in 1. Let $B_R = B_R(0) \subseteq \mathbb{R}^d$ be the open ball centered at zero with radius $R > 0$. We set
$$G(x) = 1_{B_R}(x) = \begin{cases} 1 & \text{if } x \in B_R \\ 0 & \text{else} \end{cases} \tag{28}$$
and we define
$$v^{\mathrm{h.c.}}(t) = \begin{cases} 0 & \text{if } t < 2 \\ \infty & \text{if } t \ge 2 \end{cases}. \tag{29}$$
Then we get for the potential $U$ in Eq. (24)
$$U(y_1, \ldots, y_n) = \int_{\mathbb{R}^d} v^{\mathrm{h.c.}}\Big(\sum_{l=1}^n G(x - y_l)\Big)\,dx = \begin{cases} 0 & \text{if } \min_{l,j = 1, \ldots, n;\, l \ne j} |y_l - y_j| \ge 2R \\ \infty & \text{else} \end{cases}. \tag{30}$$
Here we did not write out the arguments $s_j \equiv 1$, and the integral in (30) is well-defined as an integral of non-negative functions with values in $[0, \infty]$. Obviously, on the right-hand side of (30) we have the potential of particles with a hard core of radius $R$ ("gas of hard spheres"). We also note that if we modify (29) and set $v(t) = 0$ if $t < l$ and $v(t) = \infty$ if $t \ge l$, $l \in \mathbb{N}$, $l \ge 2$, we obtain a system where a non-empty intersection of $l$ (and more) balls of radius $R$ is energetically forbidden, but all configurations without such intersections have zero potential energy. Such systems have pure $l$-point potentials in the sense of statistical mechanics, cf. [52].
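The hard-core energy can be evaluated without any integration: with $G = 1_{B_R}$ the sum $\sum_l G(x - y_l)$ counts the open balls $B_R(y_l)$ containing $x$, and two such balls intersect exactly when their centres are closer than $2R$. The sketch below implements this pairwise test and, as a cross-check in $d = 2$, counts the maximal overlap on a grid; both helper names and the grid parameters are illustrative assumptions.

```python
import numpy as np

def hard_core_energy(positions, radius):
    """Return 0.0 if no two open balls B_R(y_l), B_R(y_j) intersect, else inf."""
    positions = np.atleast_2d(np.asarray(positions, dtype=float))
    n = len(positions)
    for l in range(n):
        for j in range(l + 1, n):
            # Open balls of radius R overlap iff the centres are closer than 2R
            if np.linalg.norm(positions[l] - positions[j]) < 2.0 * radius:
                return float("inf")
    return 0.0

def max_overlap_on_grid(positions, radius, half_width=4.0, n_grid=401):
    """Largest value of sum_l 1_{B_R}(x - y_l) over a square grid (d = 2 only)."""
    positions = np.atleast_2d(np.asarray(positions, dtype=float))
    axis = np.linspace(-half_width, half_width, n_grid)
    xx, yy = np.meshgrid(axis, axis, indexing="ij")
    grid = np.stack([xx.ravel(), yy.ravel()], axis=1)
    dist = np.linalg.norm(grid[:, None, :] - positions[None, :, :], axis=-1)
    return int((dist < radius).sum(axis=1).max())

overlapping = [[0.0, 0.0], [1.5, 0.0]]  # centre distance 1.5 < 2R for R = 1
separated = [[0.0, 0.0], [2.5, 0.0]]    # centre distance 2.5 >= 2R for R = 1
print(hard_core_energy(overlapping, 1.0), max_overlap_on_grid(overlapping, 1.0))
print(hard_core_energy(separated, 1.0), max_overlap_on_grid(separated, 1.0))
```

Replacing the threshold 2 in the overlap count by $l \ge 2$ gives the pure $l$-point variant mentioned above.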
5.2. Potentials from stochastic geometry

Here we give "local" formulations of two potentials of stochastic geometry [53, 56, 44], starting with the threshold potential: let $G$ and $r$ be as in Theorem 4.1 and for $C > 0$ we define the energy density
$$v_C(t) = \begin{cases} 0 & \text{if } t < C \\ 1 & \text{else} \end{cases} \tag{31}$$
which obviously is in the class of Theorem 4.1 (set $b = 1/C$). We now get (cf. Fig. 2)
$$U(y_1, \ldots, y_n; s_1, \ldots, s_n) = \int_{\mathbb{R}^d} v_C\Big(\sum_{l=1}^n s_l G(x - y_l)\Big)\,dx = \Big|\Big\{x \in \mathbb{R}^d: \sum_{l=1}^n s_l G(x - y_l) \ge C\Big\}\Big|. \tag{32}$$
If we, in particular, choose G and r as in Sec. 5.1, we get the so-called Boolean grain model of stochastic geometry [56]. We can also define similar energy densym sym sities vC (t) = vC (|t|) and v−C = vC − vC to obtain related potentials which “threshold” also negative values of XΛ . Next we formulate the isodensity contour potential. Let us assume that G is C 1 -differentiable in Rd \{0} (cf. Proposition 2.1(ii) for examples) and limx→0 |G(x)| = ∞, lim|x|→∞ G(x) = 0. For C > 0 we define heuristically i.d.c. (X) = δ(X − C)|∇X| vC
1
1
0.8
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0
(33)
0 0
0.2
0.4
0.6
0.8
1
0
0.2
0.4
0.6
0.8
1
Fig. 2. Threshold and isodensity contour potentials for n = 30 and n = 300 particles of charge √ ±1/ n and with G as in Fig. 1. Isodensity contours of integer values from −4 to 4 are displayed. The fractal structure of the continuum limit (Sec. 7) becomes visible.
March 29, 2005 8:59 WSPC/148-RMP
J070-00232
Particle Systems in the GCE, Scaling and QFT
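or, more precisely,
\[
\begin{aligned}
U(y_1,\dots,y_n;s_1,\dots,s_n)
&= \int_{\mathbb{R}^d} v_C^{\mathrm{i.d.c.}}\Big(\sum_{l=1}^n s_l\,G(x-y_l)\Big)\,dx \\
&= \lim_{\epsilon\downarrow 0}\,\epsilon^{-1}\int_{\mathbb{R}^d}
\Big[v_{C-\epsilon}\Big(\sum_{l=1}^n s_l\,G(x-y_l)\Big) - v_C\Big(\sum_{l=1}^n s_l\,G(x-y_l)\Big)\Big]
\Big|\sum_{l=1}^n s_l\,\nabla G(x-y_l)\Big|\,dx \\
&= \Big|\Big\{x\in\mathbb{R}^d : \sum_{l=1}^n s_l\,G(x-y_l) = C\Big\}\Big|_{d-1}
\end{aligned} \tag{34}
\]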
where |·|_{d−1} denotes the (d − 1)-dimensional (surface) volume (cf. Fig. 2). Clearly, ∇G(x − y_j) is well-defined on the set of points where v_{C−ε} − v_C does not vanish. The last step follows from the fact that obviously the Hausdorff dimension of the set on the right-hand side is d − 1. This also shows that U is well-defined.

Potentials like v_C and v_C^{i.d.c.} might be of particular interest in the continuum limit (see Sec. 7) since they are designed to measure the fractal properties of the sample paths in that limit, see Fig. 2.

5.3. Particle systems with positive definite pair interactions

Let r = δ_1 (in this subsection we may thus omit the variables s_l ≡ 1) and let G be as in Theorem 4.1, reflection invariant under x → −x, and such that ∫_{R^d} G dx ≠ 0. We set Φ = G ∗ G and we get Φ ∈ L^1(R^d, dx). Φ is positive definite in the sense that Φ is the Fourier transform of a (not necessarily finite) non-negative function on R^d. We consider two separate situations: either Φ is the Fourier transform of a non-negative L^1(R^d, dx)-function and hence is continuous, or Φ is non-negative, in which case possibly Φ(0) = ∞. Also, we remark that choosing G = G_α, 0 < α ≤ 1/2, as in Proposition 2.1 leads to the second case, cf. Remark 2.2.

Let χ ∈ C_0^∞(R^d) be symmetric, non-negative such that ∫_{R^d} χ dx = 1. For ε > 0, we set χ_ε(x) = χ(x/ε)/ε^d and we introduce an ultra-violet cut-off setting G_ε = χ_ε ∗ G and Φ_ε = G_ε ∗ G_ε. We consider the quadratic energy density v(t) = t^2 for the ultra-violet regularized model, namely
\[
\begin{aligned}
U_\epsilon(y_1,\dots,y_n)
&= \int_{\mathbb{R}^d}\Big(\sum_{l=1}^n G_\epsilon(x-y_l)\Big)^2 dx \\
&= \sum_{l=1}^n \int_{\mathbb{R}^d} G_\epsilon(x-y_l)^2\,dx
 + \sum_{\substack{l,j=1 \\ l\neq j}}^n \int_{\mathbb{R}^d} G_\epsilon(x-y_l)\,G_\epsilon(x-y_j)\,dx \\
&= n\,\Phi_\epsilon(0) + \sum_{\substack{l,j=1 \\ l\neq j}}^n \Phi_\epsilon(y_l-y_j).
\end{aligned} \tag{35}
\]
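The decomposition (35) of the quadratic energy into a self-energy term and a pair interaction can be checked numerically in a simple case. The sketch below is a toy under explicit assumptions: d = 1, a Gaussian kernel G(x) = exp(−x²) that is already smooth (so no cut-off ε is needed), for which G ∗ G is known in closed form. The function names are hypothetical.

```python
import math

def G(x):
    # Illustrative reflection-invariant kernel with non-zero integral (an assumption)
    return math.exp(-x * x)

def Phi(y):
    # (G * G)(y) for the Gaussian above, in closed form
    return math.sqrt(math.pi / 2.0) * math.exp(-y * y / 2.0)

def quadratic_energy(points, lo=-12.0, hi=12.0, n=24000):
    """Midpoint-rule value of the integral of (sum_l G(x - y_l))^2 over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += sum(G(x - y) for y in points) ** 2
    return total * h

def pair_form(points):
    """Right-hand side of (35): n * Phi(0) plus the sum of Phi over ordered pairs l != j."""
    n = len(points)
    pairs = sum(Phi(points[l] - points[j])
                for l in range(n) for j in range(n) if l != j)
    return n * Phi(0.0) + pairs

ys = [-1.0, 0.3, 2.0]
lhs, rhs = quadratic_energy(ys), pair_form(ys)
```

The term n·Φ(0) is the part proportional to the particle number; it is finite here only because the illustrative kernel is smooth.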
If Φ(0) = +∞, then the first term on the right-hand side of (35) in the limit ε ↓ 0 gives an infinite contribution, while the second term remains well-defined (for y_l ≠ y_j). Since the first term is proportional to n, it can be seen as a (negative) chemical potential or a self-energy which becomes infinite if ε ↓ 0; this is very similar to the self-energy problem of a charged point-particle in ordinary electrodynamics. Subtracting this infinite contribution (“self energy renormalization”) gives a suitable renormalization for the quadratic potential of the interacting CPN. We want to show that this can be done in a way preserving the local structure of the interaction. We set (note that ∫_{R^d} G_ε dx = ∫_{R^d} G dx ≠ 0)
\[
{:}t^2{:}_{\mathrm{s.e.r.}} = t^2 - c_\epsilon\, t, \qquad c_\epsilon = \Phi_\epsilon(0)\Big/\int_{\mathbb{R}^d} G\,dx \tag{36}
\]
and we get that the renormalized potential
\[
U_\epsilon^{\mathrm{s.e.r.}}(y_1,\dots,y_n) = \int_{\mathbb{R}^d} {:}\Big(\sum_{l=1}^n G_\epsilon(x-y_l)\Big)^2{:}_{\mathrm{s.e.r.}}\,dx
= \sum_{\substack{l,j=1 \\ j\neq l}}^n \Phi_\epsilon(y_l-y_j) \tag{37}
\]
has a well-defined limit as ε ↓ 0, which is just given by the potential resulting from the pair interaction Φ. If Φ ≥ 0, stability is obvious. In the case where Φ is positive definite and continuous, stability follows from [52, Proposition 3.2.7]. It can be seen in the same reference that positive definite pair potentials play a quite special role in the theory of stability. We would like to point out that a quadratic interaction for a CPN is non-trivial, since a particle gas with pair interactions is obviously different from a gas of non-interacting particles. However, the interaction becomes trivial in the Gaussian (continuum) limit of Sec. 7 (the interacting process in that limit becomes Gaussian) as can be seen most easily by performing the continuum limit with an ultra-violet cut-off.^j

^j The continuum limit for the renormalized potential can be performed setting r_{1/√z} = δ_{1/√z} and a_z = −√z in (2) in order to avoid problems with the stability of the potential. This is only slightly different from the techniques in Sec. 7.

6. High Temperature Expansion

In this section we give a construction of the infinite volume limit Λ ↑ R^d (the removal of the infra-red cut-off, respectively) using techniques from continuous particle systems. In particular, we give a high temperature expansion for the correlation functional for the case of trigonometric interaction. The main trick is to write the trigonometric interaction as effective potential of a (formal) two-component marked Potts model at imaginary temperature. Though imaginary temperature might look strange, we show that it does not interfere with the usual cluster expansion method [52]. Once the complex valued correlation functional has been
constructed for the Potts model, real valuedness of the interaction is restored by restriction to one component. The construction of Gibbs measures then follows from the general analysis of the excellent review article [42], see also the original article by Lenard [43]. Obviously, here the techniques are inspired by statistical mechanics of continuous, classical particles. For another construction of infinite volume measures (working also outside the LD-HT regime) with a quantum field flavor that applies to the case where the interaction energy density v is concave and uses FKG inequalities, cf. [37].

6.1. Trigonometric interactions

From now on we focus on the case of trigonometric interactions. Let ν be a complex valued measure on R with ν(A) = \overline{ν(−A)} ∀A ∈ B(R). Furthermore, let ν have compact support ⊆ [−c′, c′], |ν|([−c′, c′]) < ∞. Let b = ∫_{[−c′,c′]} |α| d|ν|(α) and b′ = ∫_{[−c,c]} |s| dr(s) with r as in Eq. (3). For the definition of the modulus |ν| of ν see e.g. [40]. ν is called the interaction measure. We set
\[
v(t) = \int_{\mathbb{R}} \big(1 - e^{i\alpha t}\big)\,d\nu(\alpha), \qquad t\in\mathbb{R}. \tag{38}
\]
Obviously v is real-valued and fulfils the conditions of Theorem 4.1. Suppose also that G is given as in that theorem. For c > 0 let Γ^c_0 (Γ^c) be the space of signed, real-valued measures η on R^d with (locally) finite support such that −c ≤ η{x} ≤ c ∀x ∈ R^d. For reasons that are connected with the use of Potts models in the next section, in this section we work with an infra-red cut-off and a finite volume. For η ∈ Γ^c_0 and Λ ⊂ R^d compact we thus define the interaction U_Λ : Γ^c_0 → R by U_Λ(η) = ⟨v(G ∗ η), 1_Λ⟩. Furthermore let F_Λ be a Poisson noise^k as in Sec. 4.3. We define the correlation functional ρ_Λ : Γ^c_0 → (0, ∞) associated with F_Λ and U_Λ at the inverse temperature β
\[
\rho_\Lambda(\eta) = \frac{1}{\tilde\Xi_\Lambda}\,1_{\{\operatorname{supp}\eta\subseteq\Lambda\}}(\eta)\,
E_{P^{F}}\big[e^{-\beta U_\Lambda(\eta+F_\Lambda)}\big], \quad \forall\eta\in\Gamma^c_0, \qquad
\tilde\Xi_\Lambda = E_{P^{F}}\big[e^{-\beta U_\Lambda(F_\Lambda)}\big]. \tag{39}
\]
What is remarkable is that this correlation functional fulfils the following:

Proposition 6.1. The correlation functional ρ_Λ fulfils the uniform (in Λ) Ruelle bound |ρ_Λ(η)| ≤ (e^{βB})^{♯η} for all η ∈ Γ^c_0 with B = bc‖G‖_1.

Proof. As v is differentiable and |v′| ≤ b, we get
\[
|U_\Lambda(\eta+\gamma) - U_\Lambda(\gamma)| = \Big|\int_0^1 \frac{d}{dt}\,U_\Lambda(\gamma+t\eta)\,dt\Big|
\le b\int_\Lambda |G*\eta|\,dx \le b\,\|G\|_1 \int_\Lambda d|\eta| \le B\,\sharp\eta .
\]
Combining this with the definition of ρ_Λ in (39) gives the assertion.

^k The associated interacting Poisson noise in this section is denoted by F̃_Λ.
The uniform Ruelle bound is crucial for the passage from infinite volume correlation functionals to Gibbs measures, cf. [42]. In Sec. 6.4 we come back to this point.

6.2. Two-component formal Potts model at imaginary temperature

Let c > 0 be as in Sec. 6.1. Clearly Λ^l × R^l ∋ (y_1, . . . , y_l; s_1, . . . , s_l) → Σ_{j=1}^l s_j δ_{y_j} ∈ Γ^c_0 ⊆ S′ is continuous and hence measurable. For A ∈ B(Γ_0) we thus get that 1_{\{Σ_{j=1}^l s_j δ_{y_j} ∈ A\}} is measurable w.r.t. B(Λ^l) ⊗ B(R^l). We set
\[
\begin{aligned}
P^{Z_\Lambda}(A) ={}& e^{-\beta|\Lambda|\nu([-c',c'])}\,\delta_0(A)
 + e^{-\beta|\Lambda|\nu([-c',c'])} \sum_{l=1}^\infty \frac{\beta^l}{l!} \\
&\times \int_{\Lambda^{\times l}\times[-c',c']^{\times l}}
 1_{\{\sum_{j=1}^l \alpha_j\delta_{y_j}\in A\}}(y_1,\dots,y_l;\alpha_1,\dots,\alpha_l)\,
 dy_1\cdots dy_l\,d\nu(\alpha_1)\cdots d\nu(\alpha_l).
\end{aligned} \tag{40}
\]
Here δ0 (A) = 1 if 0 ∈ A and 0 otherwise.
Lemma 6.2. The function P^{Z_Λ} : B(Γ^c_0) → C is a complex valued measure on (Γ^c_0, B(Γ^c_0)).

Proof. This follows from the fact that P^{Z_Λ} is a direct sum of such measures.

We note that P^{Z_Λ} can be seen as a complex valued generalization of a Poisson noise measure. In particular, if ν is a probability measure, P^{Z_Λ} is the defining measure for the marked Poisson process in the finite volume Λ with mark distribution ν and intensity σ. Let F_Λ be as in the preceding subsection with associated measure P^{F_Λ}. For H: Γ^c_0 × Γ^c_0 → C measurable and L^1(P^F ⊗ |P^{Z_Λ}|) integrable, we define the linear functional
\[
E_{P^{F_\Lambda}\otimes P^{Z_\Lambda}}[H] = \int_{\Gamma_0} E_{P^{F_\Lambda}}\big[H(F_\Lambda,\gamma)\big]\,dP^{Z_\Lambda}(\gamma). \tag{41}
\]
Let u: Γ^c_0 × Γ^c_0 → R be defined by u(η, γ) = ⟨η, G ∗ γ⟩. u is the interaction of a two-component marked Potts model where component one interacts with component two but there is no interaction within either component. The formal Potts model at imaginary temperature that we consider here is defined by the complex valued Gibbs measure in the finite volume
\[
dP^{\mathrm{f.P.}}(\eta,\gamma) = \frac{e^{iu(\eta,\gamma)}}{E_{P^{F}\otimes P^{Z_\Lambda}}\big[e^{iu(F_\Lambda,Z_\Lambda)}\big]}\;
dP^{F_\Lambda}\otimes P^{Z_\Lambda}(\eta,\gamma). \tag{42}
\]
Here ZΛ (γ) = γ is the coordinate process of the second component. This does not give an ensemble of statistical mechanics (unless ν is a probability measure and the temperature is real), as the second component has a non-probability “distribution”
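P^{Z_Λ}. Nevertheless, this object can be treated analogously to the measure defining the ordinary Potts model. In particular this applies to the correlation functional
\[
\rho^{\mathrm{f.P.}}_\Lambda(\eta,\gamma) = \frac{1}{\Xi^{\mathrm{f.P.}}_\Lambda}\,
1_{\{\operatorname{supp}\eta\subseteq\Lambda,\ \operatorname{supp}\gamma\subseteq\Lambda\}}(\eta,\gamma)\,
E_{P^{F_\Lambda}\otimes P^{Z_\Lambda}}\big[e^{iu(\eta+F_\Lambda,\gamma+Z_\Lambda)}\big], \tag{43}
\]
∀η ∈ Γ^c_0, γ ∈ Γ^c_0, where Ξ^{f.P.}_Λ = E_{P^{F_Λ}⊗P^{Z_Λ}}[e^{iu(F_Λ,Z_Λ)}]. The following crucial observation implies in particular that Ξ^{f.P.}_Λ > 0:

Proposition 6.3. Let Ξ̃_Λ and ρ_Λ be as in the preceding section. The following identities hold:
(i) Ξ^{f.P.}_Λ = Ξ̃_Λ;
(ii) ρ^{f.P.}_Λ(η, 0) = ρ_Λ(η).

Proof. By the definitions (39) and (43) and Fubini’s theorem, it is sufficient to integrate out the second component and show
\[
e^{-\beta U_\Lambda(\eta)} = \int_{\Gamma^c_0} e^{iu(\eta,\gamma)}\,dP^{Z_\Lambda}(\gamma)
\]
with U_Λ the trigonometric interaction defined in Sec. 6.1. Using the definition (41) to evaluate the right-hand side, we get
\[
\begin{aligned}
\int_{\Gamma^c_0} e^{iu(\eta,\gamma)}\,dP^{Z_\Lambda}(\gamma)
&= e^{-\beta|\Lambda|\nu([-c',c'])} \sum_{l=0}^\infty \frac{\beta^l}{l!}
 \int_{\Lambda^{l}\times[-c',c']^{\times l}} e^{i\langle\eta,\,G*\sum_{j=1}^l\alpha_j\delta_{y_j}\rangle}\,
 dy_1\cdots dy_l\,d\nu(\alpha_1)\cdots d\nu(\alpha_l) \\
&= e^{-\beta|\Lambda|\nu([-c',c'])} \sum_{l=0}^\infty \frac{\beta^l}{l!}
 \Big(\int_{\Lambda\times[-c',c']} e^{i\alpha\,G*\eta(y)}\,d\nu(\alpha)\,dy\Big)^{l} \\
&= e^{\beta\int_\Lambda\left[\int_{\mathbb{R}}\left(e^{i\alpha\,G*\eta(y)}-1\right)d\nu(\alpha)\right]dy} \\
&= e^{-\beta\int_\Lambda v(G*\eta)\,dy}.
\end{aligned}
\]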
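The exponential series in the proof of Proposition 6.3 can be reproduced numerically. The following sketch is illustrative only and rests on several assumptions: d = 1, Λ = [−3, 3], a Gaussian kernel G, and the symmetric two-point probability measure ν = (δ_{α₀} + δ_{−α₀})/2, for which v(t) = 1 − cos(α₀t). Both sides use the same midpoint quadrature, so they should agree up to the truncation of the series. All function names are hypothetical.

```python
import cmath
import math

def G(x):
    # Illustrative kernel (an assumption)
    return math.exp(-x * x)

def conv(y, eta):
    """(G * eta)(y) for eta = sum_j s_j delta_{x_j}, given as (x_j, s_j) pairs."""
    return sum(s * G(y - x) for x, s in eta)

def v(t, alpha0=1.0):
    # Energy density (38) for nu = (delta_{alpha0} + delta_{-alpha0}) / 2
    return 1.0 - math.cos(alpha0 * t)

def both_sides(eta, beta=0.5, half_width=3.0, n=600, alpha0=1.0, terms=80):
    h = 2.0 * half_width / n
    grid = [-half_width + (k + 0.5) * h for k in range(n)]
    volume = 2.0 * half_width  # |Lambda|; nu has total mass 1 here
    # a approximates the double integral of e^{i alpha (G*eta)(y)} over Lambda and nu
    a = sum(0.5 * (cmath.exp(1j * alpha0 * conv(y, eta)) +
                   cmath.exp(-1j * alpha0 * conv(y, eta))) for y in grid) * h
    # Truncated Poisson-type series, as in the second line of the proof
    series = sum((beta * a) ** l / math.factorial(l) for l in range(terms))
    lhs = math.exp(-beta * volume) * series
    # Closed form exp(-beta * U_Lambda(eta)) with the same quadrature
    rhs = math.exp(-beta * sum(v(conv(y, eta), alpha0) for y in grid) * h)
    return lhs, rhs

lhs, rhs = both_sides([(0.0, 1.0), (1.0, -0.5)])
```

In this special case ν is a probability measure, so the second component is an honest marked Poisson process and the identity is the usual formula for its characteristic functional.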
Spelled out in words Proposition 6.3 means that one can obtain the model with trigonometric interaction as the projection (or Widom–Rowlinson model) of a formal Potts model at imaginary temperature. What one has gained from this representation is that the formal Potts model is a model with a pure two-point interaction, hence the usual cluster expansion procedure of Ruelle goes through, cf. the following subsection. That the formal Potts model at imaginary temperature does not possess the necessary positivity properties poses no problems, as we are only interested in the projection, where positivity holds, cf. Sec. 6.4.
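6.3. The cluster expansion

To specify the domain of convergence for our expansion, we define
\[
C_1 = \sup_{y\in\mathbb{R}^d,\ s\in\operatorname{supp} r}\ \int_{\mathbb{R}^d}\int_{\mathbb{R}}
\big|e^{is\alpha G(y-y')} - 1\big|\,d|\nu|(\alpha)\,dy' \le c\,b\,\|G\|_1 < \infty \tag{44}
\]
and
\[
C_2 = \sup_{y'\in\mathbb{R}^d,\ \alpha\in\operatorname{supp}\nu}\ \int_{\mathbb{R}^d}\int_{\mathbb{R}}
\big|e^{is\alpha G(y-y')} - 1\big|\,dr(s)\,dy \le c'\,b'\,\|G\|_1 < \infty. \tag{45}
\]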
The following theorem is based on the convergence of Ruelle’s cluster expansion [52, Chap. 4.4].

Theorem 6.4. The high-temperature low-density expansion of the infinite volume limit of the correlation functional in the case of trigonometric interactions converges for z > 0 and β ∈ R such that |z| < 1/(eC_1) and |β| < 1/(eC_2). In particular
(i) ρ(η) = lim_{Λ↑R^d} ρ_Λ(η) exists for η ∈ Γ^c_0 and depends analytically on β and z. In particular, for z fixed, the high temperature expansion of ρ converges;
(ii) ρ is invariant under the action of the Euclidean group, i.e. ρ(η) = ρ(η_{\{g,a\}}) for g ∈ O(d), a ∈ R^d.

Proof. (i) To obtain the cluster expansion for ρ^{f.P.} = lim_{Λ↑R^d} ρ^{f.P.}_Λ, only a few modifications w.r.t. [52, Chap. 4.4] are necessary. Let S_a = R^d × R × {a}, a = 1, 2 and S = S_1 ∪̇ S_2 = R^d × R × {1, 2}. Any pair of finite marked configurations (η, γ) ∈ Γ^c_0 × Γ^c_0 can then be identified with a non-marked configuration ξ ∈ Γ̂_0(S) on S defined as follows. First, given η = Σ_{j=1}^n s_j δ_{x_j} we define a non-marked configuration η̃ = Σ_{j=1}^n δ_{(x_j,s_j,1)} on S_1 and likewise γ defines a non-marked configuration γ̃ on S_2. Then we set ξ|_{S_1} = η̃ and ξ|_{S_2} = γ̃. Such a non-marked configuration ξ = Σ_{j=1}^n δ_{q_j}, q_j ∈ S, can be identified with the finite subset {q_1, . . . , q_n} of S. Let σ be the complex measure on S obtained by σ|_{R^d×R×{1}} = dy ⊗ r and σ|_{R^d×R×{2}} = dy ⊗ ν with dy the Lebesgue measure. For Λ ⊆ R^d compact let S_Λ = Λ × R × {1, 2}. For q ∈ S let furthermore ζ(q) = z if q = (y, s, 1), y ∈ R^d, s ∈ R and ζ(q) = β otherwise. Let χ be a function on S with supp χ ⊆ S_Λ for some Λ ⊆ R^d compact. For Ψ: Γ̂_0(S) → C measurable and bounded, we can define
\[
\langle\chi,\Psi\rangle(\zeta) = \Psi_0 + \sum_{n=1}^\infty \frac{1}{n!}\int_{S^n}
(\zeta\chi)(q_1)\cdots(\zeta\chi)(q_n)\,\Psi(\{q_1,\dots,q_n\})\,d\sigma(q_1)\cdots d\sigma(q_n).
\]
Letting Ψ(ξ) = e^{iu(η,γ)} with (η, γ) ∈ Γ^c_0 × Γ^c_0 associated with ξ, we obtain the following representation of ρ^{f.P.}_Λ:
\[
\rho^{\mathrm{f.P.}}_\Lambda(\xi) = 1_{\{\xi\subseteq S_\Lambda\}}(\xi)\,
\langle\chi_\Lambda,\Psi\rangle(\zeta)^{-1}\,\langle\chi_\Lambda,D_\xi\Psi\rangle(\zeta), \qquad \xi\in\hat\Gamma_0(S),
\]
where D_ξΨ(τ) = Ψ(ξ ∪ τ), ξ, τ ∈ Γ̂_0(S), and χ_Λ = 1_{S_Λ}.
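Using Ruelle’s ∗-product (known as s-product in QFT [21])
\[
\Psi_1 * \Psi_2(\xi) = \sum_{\tau\subseteq\xi} \Psi_1(\tau)\,\Psi_2(\xi\setminus\tau), \qquad \xi\in\hat\Gamma_0(S), \tag{46}
\]
one obtains as in [52, Chap. 4.4] the following expansion of ρ^{f.P.}_Λ in the formal parameter ζ
\[
\rho^{\mathrm{f.P.}}_\Lambda(\xi) = 1_{\{\xi\subseteq S_\Lambda\}}(\xi)\,\langle\chi_\Lambda,\tilde\varphi_\xi\rangle(\zeta), \qquad
\tilde\varphi_\xi = \Psi^{-1} * D_\xi\Psi. \tag{47}
\]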
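The ∗-product (46) and the inverse that enters (47) are easy to experiment with on finite sets. The sketch below is a minimal illustration, assuming functionals are ordinary Python functions on `frozenset` configurations of real points; the recursive inverse only requires a non-zero value on the empty configuration. The names `subsets`, `star`, `star_inverse` and the sample functional `psi` are hypothetical.

```python
import math
from itertools import combinations

def subsets(xi):
    """All subsets of the frozenset xi."""
    items = sorted(xi)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def star(f, g):
    """*-product: (f * g)(xi) = sum over tau subset of xi of f(tau) g(xi minus tau)."""
    return lambda xi: sum(f(tau) * g(xi - tau) for tau in subsets(xi))

def star_inverse(psi):
    """Inverse of psi w.r.t. the *-product, built recursively on the set size."""
    cache = {}
    def inv(xi):
        if xi not in cache:
            if not xi:
                cache[xi] = 1.0 / psi(xi)
            else:
                # (inv * psi)(xi) = 0 for non-empty xi fixes inv(xi)
                rest = sum(inv(tau) * psi(xi - tau)
                           for tau in subsets(xi) if tau != xi)
                cache[xi] = -rest / psi(frozenset())
        return cache[xi]
    return inv

def psi(xi):
    # Sample functional with psi(empty set) = 1: product of pair factors (an assumption)
    return math.prod(math.exp(-abs(p - q)) for p, q in combinations(sorted(xi), 2))

unit = star(star_inverse(psi), psi)
```

Evaluating `unit` on any configuration should return 1 on the empty set and 0 (up to rounding) otherwise, which is the defining property of the ∗-inverse.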
Here Ψ^{−1} is the inverse of Ψ w.r.t. the ∗-multiplication (as Ψ(∅) = 1 this inverse exists). We have to study the range of convergence of ⟨χ_Λ, φ̃_ξ⟩(ζ) as Λ ↑ R^d. To this aim, we define the pair-potential o(q, q′) = ss′G(y − y′) for q = (y, s, a), q′ = (y′, s′, a′) with a ≠ a′ and o(q, q′) = 0 otherwise. For q ∈ ξ we set W(q, ξ) = Σ_{q′∈ξ\{q}} o(q, q′) and for q ∈ S, ξ ∈ Γ̂_0(S) such that q ∉ ξ, K(q, ξ) = Π_{q′∈ξ} [e^{io(q,q′)} − 1]. As in Ruelle’s book, we then obtain the recurrence formula for φ̃_ξ and q ∈ ξ
\[
\tilde\varphi_\xi(\tau) = e^{iW(q,\xi)} \sum_{\kappa\subseteq\tau(q)^c} K(q,\kappa)\,\tilde\varphi_{\xi\setminus\{q\}\cup\kappa}(\tau\setminus\kappa), \tag{48}
\]
where τ(q)^c for q = (y, s, a) is defined as {q′ = (y′, s′, a′) ∈ τ : a′ ≠ a}. One obtains from (48) by induction over n + m with m = m_1 + m_2 = ♯ξ_1 + ♯ξ_2, ξ_a = ξ ∩ S_a, a = 1, 2, and n = n_1 + n_2 = ♯τ_1 + ♯τ_2 that for θ_1, θ_2 > 0 ∃C = C(θ_1, θ_2) < ∞ such that
\[
\begin{aligned}
\sup_{\substack{(q_1,\dots,q_m)\in S_1^{m_1}\times S_2^{m_2} \\ q_j\neq q_l,\ j\neq l}}
&\int_{S_1^{n_1}\times S_2^{n_2}} \big|\tilde\varphi_{\{q_1,\dots,q_m\}}(\{q'_1,\dots,q'_n\})\big|\,
d|\sigma|(q'_1)\cdots d|\sigma|(q'_n) \\
&\le C\,n_1!\,n_2!\,\theta_1^{m_1}\theta_2^{m_2}
\Big(\frac{e^{\theta_1 C_1}}{\theta_1}\Big)^{n_1+m_1}\Big(\frac{e^{\theta_2 C_2}}{\theta_2}\Big)^{n_2+m_2}.
\end{aligned} \tag{49}
\]
This estimate for θ_a = C_a^{−1}, a = 1, 2, implies that for m = ♯ξ fixed the right-hand side of (47) converges uniformly (in Λ ⊆ R^d compact) if |z| < 1/(eC_1) and |β| < 1/(eC_2). From the uniform convergence of ρ^{f.P.}_Λ(ξ) it follows that ρ^{f.P.}(ξ) = lim_{Λ↑R^d} ρ^{f.P.}_Λ(ξ) exists and is analytic in the above parameter domain. Combining this with Proposition 6.3, one obtains the assertion (i) of the theorem.

(ii) Note that ρ_Λ(η_{\{g,a\}}) = ρ_{gΛ+a}(η). Invariance of ρ now follows from the equivalence of the limits Λ ↑ R^d and gΛ + a ↑ R^d.
Remark 6.5. Let us briefly sketch three methods for an analytic or numerical evaluation of ρ. The details can be worked out by (more or less lengthy) straightforward calculations. (i) Mayer’s series for ρ^{f.P.}: The usual graphical methods connected with the Mayer series [52, p. 88] can now be used for the explicit calculation of the expansion coefficients in β and z of ρ.
(ii) High temperature expansion: One can go back to Eq. (6.1) and calculate the β-expansions of the numerator and the denominator, which essentially amounts to calculating the functional Fourier transforms of P^F. Taking the quotient in the sense of formal power series then yields an expansion, which is known to converge in the infinite volume limit for z and β sufficiently small. (iii) Small G expansion: Finally, it is possible to adapt [25] to the formal Potts model and to obtain a representation of the moments of the infinite volume Gibbs measure with trigonometric interaction (to be constructed in the following section) in terms of generalized Feynman graphs,^l which amounts to a formal expansion in powers of G. This method has the advantage that only finitely many moments of ν appear as parameters in the expansion up to a finite order. Hence the form of the interaction (at least in principle) can be determined by comparing the expansion with experimental data.

^l Here it is necessary that G is a regular function in order to avoid ultra-violet singularities.

6.4. Construction of Gibbs measures

Here we want to construct the Gibbs measure associated with the correlation functional ρ. Before we can do this, some preparations are needed. Here we mostly follow [42]. The σ-finite Lebesgue–Poisson measure λ_z is defined on (Γ^c_0, B(Γ^c_0)) by setting
\[
\lambda_z(A) = \delta_0(A) + \sum_{l=1}^\infty \frac{z^l}{l!}\int_{\mathbb{R}^{dl}\times[-c,c]^{\times l}}
1_A\Big(\sum_{j=1}^l s_j\delta_{y_j}\Big)\,dy_1\cdots dy_l\,dr(s_1)\cdots dr(s_l), \qquad A\in\mathcal{B}(\Gamma^c_0). \tag{50}
\]
It is well-known, see e.g. [42], that ρ_Λ can be represented as
\[
\rho_\Lambda(\eta) = \frac{dP^{\tilde F_\Lambda}(\eta)}{d\lambda_z} \quad \text{for } \lambda_z\text{-a.e. } \eta\in\Gamma^c_0. \tag{51}
\]
Here dP^{F̃_Λ}(η) = (1/Ξ̃_Λ) e^{−βU_Λ(η)} dP^{F_Λ}(η).

Let Γ^c_Λ = {γ ∈ Γ^c : supp γ ⊆ Λ} for Λ ⊆ R^d measurable. Note that for Λ compact, Γ^c_Λ ⊆ Γ^c_0. Suppose ρ̂: Γ^c_0 → R is a given functional that fulfils a Ruelle bound |ρ̂(η)| ≤ C^{♯η} for some C > 0. Assume furthermore that the functional is Lenard positive in the sense
\[
q^\Lambda(\eta) = \int_{\Gamma^c_\Lambda} (-1)^{\sharp\gamma}\,\hat\rho(\eta+\gamma)\,d\lambda_z(\gamma) \ge 0 \tag{52}
\]
for λ_z-a.e. η and all Λ ⊆ R^d compact. One can then check that {P^Λ}_{Λ⊆R^d compact} defined by dP^Λ(η) = q^Λ(η) dλ_z(η) is a projective family of probability measures. Hence the inductive limit P of this family exists by Kolmogorov’s theorem as a measure on (Γ^c, B(Γ^c)), cf. [42, Proposition 4.5 and Theorem 4.5]. In particular
these results imply that the constructed measure has support on tempered marked configurations. Furthermore, let ρ̂_n : Γ^c_0 → R be a sequence of Lenard-positive correlation functionals that fulfil a uniform Ruelle bound |ρ̂_n(η)| ≤ C^{♯η} for some C > 0 independent of n. Let furthermore ρ̂(η) = lim_{n→∞} ρ̂_n(η) exist λ_z-a.e. and suppose that for some D > 0 sufficiently small ‖ρ̂ − ρ̂_n‖_D → 0 as n → ∞ with ‖ρ̂ − ρ̂_n‖_D = ess sup_{η∈Γ^c_0} D^{♯η} |ρ̂(η) − ρ̂_n(η)|. Then the limiting functional ρ̂ is Lenard positive and fulfils the same Ruelle bound as the ρ̂_n’s, [42, Proposition 4.9]. Hence there exists a measure P associated to ρ̂, as explained in the preceding paragraph. Let p_Λ : Γ^c_0 → Γ^c_Λ be the projection given by γ → 1_Λ γ. A function H: Γ^c → R is Λ-measurable, if it is measurable w.r.t. B_Λ(Γ^c) = p_Λ^{−1}(Γ^c_Λ). If ρ̂_n and ρ̂ are as above with associated measures P_n and P on (Γ^c, B(Γ^c)), then the measures P_n converge locally to P, i.e. for all H: Γ^c → R positive that are Λ-measurable for some Λ ⊆ R^d compact, we get lim_{n→∞} E_{P_n}[H] = E_P[H], see [42, Corollary 4.11]. Also, the Euclidean invariance of ρ̂ is equivalent to the Euclidean invariance of the associated measure, cf. [42, Proposition 3.11]. Applying these pieces of general theory to the case of the preceding subsection, we obtain

Proposition 6.6. Let P^{F̃_Λ} be the measures on (Γ_0, B(Γ_0)) associated with the correlation functionals ρ_Λ defined in Sec. 6.1. Then
(i) There exists a uniquely determined measure P^{F̃} on that measurable space which is associated with ρ = lim_{Λ↑R^d} ρ_Λ;
(ii) lim_{Λ↑R^d} P^{F̃_Λ} = P^{F̃} holds in the sense of local convergence;
(iii) P^{F̃} is Euclidean invariant.

Proof. The estimate (49) implies that lim_{Λ↑R^d} ‖ρ_Λ − ρ‖_D = 0 for 0 < D < z. The three assertions therefore follow from the general formalism.

Obviously, Proposition 6.6 also implies the existence of P^{X̃} = L_*^{−1} P^{F̃} as a measure on (S′, B(S′)). The interaction without infra-red cut-off is U(η) = ∫_{R^d} v(G ∗ η) dx, η ∈ Γ^c_0. Let η ∈ Γ^c_0 and γ ∈ Γ^c. Then the mutual interaction W between η and γ is by definition
\[
W(\eta,\gamma) = \int_{\mathbb{R}^d} \big[v(G*(\eta+\gamma)) - v(G*\eta) - v(G*\gamma)\big]\,dx. \tag{53}
\]
As the derivative of v is bounded, it is easy to see that W(η, γ) is well-defined and that |W(η, γ)| < 2B♯η with B = bc‖G‖_1 as above. We say that the measure P on (Γ^c, B(Γ^c)) is a Gibbs measure for the interaction U, inverse temperature β and activity z if for arbitrary Λ ⊆ R^d compact and H: Γ^c → R non-negative the following holds
\[
E_P[H] = \int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}}
H(\eta+\gamma)\,e^{-\beta U(\eta)-\beta W(\eta,\gamma)}\,dP(\gamma)\,d\lambda_z(\eta). \tag{54}
\]
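Equations (54) are called Ruelle equations. Under the given conditions they are equivalent with other definitions of Gibbs measures as for example Dobrushin–Lanford–Ruelle equations, Georgii–Nguyen–Zessin equations and the standard definition of Gibbs measures via conditional probabilities, cf. [42, Theorem 3.12]. The following theorem verifies the Gibbs property for P^{F̃}.

Theorem 6.7. Let z, β be as in Theorem 6.4, G ∈ L^1(R^d, dx). Then P^{F̃} is a Gibbs measure w.r.t. U, z, β.

Proof. On both sides of (54) the function H can be approximated from below by elementary functions. Thus, it suffices to consider the case where H is a characteristic function 1_A, A ∈ B(Γ_0). On both sides of (54) we have to evaluate σ-finite measures. It is therefore sufficient to consider A from a ∩-stable generating subsystem of B(Γ^c). We may thus assume that A ∈ B_Λ(Γ^c) for some Λ ⊆ R^d compact. Let Λ be fixed. For compact Λ′ ⊆ R^d we consider the infra-red cut-off interaction U_{Λ′} as in Sec. 6.1 and we let W_{Λ′}(η, γ) = U_{Λ′}(η + γ) − U_{Λ′}(η) − U_{Λ′}(γ), η, γ ∈ Γ^c_0. Using (51) and
\[
\int_{\Gamma^c_{\tilde\Lambda'}} \tilde H(\eta)\,d\lambda_z(\eta)
= \int_{\Gamma^c_{\tilde\Lambda'\setminus\tilde\Lambda}}\int_{\Gamma^c_{\tilde\Lambda}} \tilde H(\eta+\gamma)\,d\lambda_z(\eta)\,d\lambda_z(\gamma)
\]
for compact sets Λ̃ ⊆ Λ̃′ ⊆ R^d and H̃: Γ^c → R non-negative and measurable, it is easy to show that the Ruelle equations (54) hold for P^{F̃_{Λ′}} instead of P^{F̃} and U_{Λ′} and W_{Λ′} instead of U and W. By Proposition 6.6(ii) we get that lim_{Λ′↑R^d} E_{P^{F̃_{Λ′}}}[H] = E_{P^{F̃}}[H]. In order to verify the Ruelle equations in the infinite volume limit Λ′ ↑ R^d one thus has to prove that the right-hand side of the Ruelle equation with cut-off Λ′ converges to the right-hand side without that cut-off. The modulus of the difference, which we abbreviate by I, can be estimated as follows (Λ″ ⊆ R^d is an arbitrary compact set and γ_{Λ″} = 1_{Λ″}γ):
\[
\begin{aligned}
I \le{}& \Big|\int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U(\eta)-\beta W(\eta,\gamma)}\,dP^{\tilde F}(\gamma)\,d\lambda_z(\eta)
 - \int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U_{\Lambda'}(\eta)-\beta W_{\Lambda'}(\eta,\gamma)}\,dP^{\tilde F}(\gamma)\,d\lambda_z(\eta)\Big| \\
&+ \Big|\int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U_{\Lambda'}(\eta)-\beta W_{\Lambda'}(\eta,\gamma)}\,dP^{\tilde F}(\gamma)\,d\lambda_z(\eta)
 - \int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U_{\Lambda'}(\eta)-\beta W_{\Lambda'}(\eta,\gamma_{\Lambda''})}\,dP^{\tilde F}(\gamma)\,d\lambda_z(\eta)\Big| \\
&+ \Big|\int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U_{\Lambda'}(\eta)-\beta W_{\Lambda'}(\eta,\gamma_{\Lambda''})}\,dP^{\tilde F}(\gamma)\,d\lambda_z(\eta)
 - \int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U_{\Lambda'}(\eta)-\beta W_{\Lambda'}(\eta,\gamma_{\Lambda''})}\,dP^{\tilde F_{\Lambda'}}(\gamma)\,d\lambda_z(\eta)\Big| \\
&+ \Big|\int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U_{\Lambda'}(\eta)-\beta W_{\Lambda'}(\eta,\gamma_{\Lambda''})}\,dP^{\tilde F_{\Lambda'}}(\gamma)\,d\lambda_z(\eta)
 - \int_{\Gamma^c_\Lambda}\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} H(\eta+\gamma)\,e^{-\beta U_{\Lambda'}(\eta)-\beta W_{\Lambda'}(\eta,\gamma)}\,dP^{\tilde F_{\Lambda'}}(\gamma)\,d\lambda_z(\eta)\Big|.
\end{aligned} \tag{55}
\]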
Let us call these terms I_1(Λ′, Λ″), . . . , I_4(Λ′, Λ″). Let ε > 0 be given; we have to show that for Λ′ sufficiently large and a suitable Λ″ we get I_j(Λ′, Λ″) < ε, j = 1, . . . , 4. I_1(Λ′, Λ″) in fact only depends on Λ′. Note that the integrand in (53) is dominated by 2b|G| ∗ |η| ∈ L^1(R^d, dx), hence by dominated convergence U_{Λ′}(η) → U(η) and W_{Λ′}(η, γ) → W(η, γ) for η ∈ Γ^c_0 and γ ∈ Γ^c. At the same time e^{2βB♯η} is an upper bound for |H(η + γ)e^{−βU_{Λ′}(η)−βW_{Λ′}(η,γ)}|, if H is bounded by one. Consequently, I_1(Λ′, Λ″) → 0 as Λ′ ↑ R^d holds by dominated convergence. For Λ″ fixed, I_3(Λ′, Λ″) → 0 as Λ′ ↑ R^d follows from Proposition 6.6(ii). It remains to show that one can find a compact set Λ″ such that I_2(Λ′, Λ″), I_4(Λ′, Λ″) < ε for all compact Λ′ ⊆ R^d. Note that this is trivial for G of finite range R and d(Λ, ∂Λ″) = inf{|x − y|: x ∈ Λ, y ∈ R^d \ Λ″} > 2R since then W_{Λ′}(η, γ_{Λ″}) = W_{Λ′}(η, γ) implies I_2(Λ′, Λ″) = I_4(Λ′, Λ″) = 0. In the next step we consider I_4(Λ′, Λ″) in the general case. Let for R > 0 Λ_R = {x ∈ R^d : ∃y ∈ Λ such that |x − y| ≤ R}. We set Λ″ = Λ_{R′} for some R′ > 0 and we have to show that I_4(Λ′, Λ_{R′}) → 0 uniformly in Λ′ as R′ → ∞. Let us begin with the estimate
\[
\big|e^{-\beta W_{\Lambda'}(\eta,\gamma)} - e^{-\beta W_{\Lambda'}(\eta,\gamma_{\Lambda_{R'}})}\big|
\le e^{2|\beta|B\sharp\eta}\Big(e^{|\beta|\,|W_{\Lambda'}(\eta,\gamma)-W_{\Lambda'}(\eta,\gamma_{\Lambda_{R'}})|} - 1\Big). \tag{56}
\]
For R > 0 arbitrary, we can combine (56) with
\[
\begin{aligned}
\big|W_{\Lambda'}(\eta,\gamma) - W_{\Lambda'}(\eta,\gamma_{\Lambda_{R'}})\big|
&\le 2b\Big[\int_{\Lambda_R} |G|*|\gamma_{\mathbb{R}^d\setminus\Lambda_{R'}}|\,dx + \int_{\mathbb{R}^d\setminus\Lambda_R} |G|*|\eta|\,dx\Big] \\
&= 2b\Big[\big\langle |G|*1_{\Lambda_R}\cdot 1_{\mathbb{R}^d\setminus\Lambda_{R'}},\,|\gamma|\big\rangle + \big\langle |G|*1_{\mathbb{R}^d\setminus\Lambda_R},\,|\eta|\big\rangle\Big]
\end{aligned} \tag{57}
\]
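and we obtain
\[
\begin{aligned}
&\int_{\Gamma^c_{\mathbb{R}^d\setminus\Lambda}} \big|e^{-\beta W_{\Lambda'}(\eta,\gamma)} - e^{-\beta W_{\Lambda'}(\eta,\gamma_{\Lambda_{R'}})}\big|\,dP^{\tilde F_{\Lambda'}}(\gamma) \\
&\qquad\le e^{2B|\beta|\sharp\eta}\Big(e^{2b|\beta|\langle |G|*1_{\mathbb{R}^d\setminus\Lambda_R},\,|\eta|\rangle}
\int_{\Gamma^c} e^{2b|\beta|\langle |G|*1_{\Lambda_R}\cdot 1_{\mathbb{R}^d\setminus\Lambda_{R'}},\,|\gamma|\rangle}\,dP^{\tilde F_{\Lambda'}}(\gamma) - 1\Big).
\end{aligned} \tag{58}
\]
For a function h: R^d × [−c, c] → R we define the Lebesgue–Poisson coherent state e_λ(h, ξ) = Π_{l=1}^n h(x_l, s_l), ξ = Σ_{l=1}^n s_l δ_{x_l} ∈ Γ^c_0. The following identity holds for H̃: Γ^c_{Λ̃} → R non-negative and Λ̃ ⊆ R^d an arbitrary measurable set
\[
\int_{\Gamma^c_{\tilde\Lambda}}\int_{\Gamma^c_{\tilde\Lambda}} \tilde H(\eta+\gamma)\,e_\lambda(h_1,\eta)\,e_\lambda(h_2,\gamma)\,d\lambda_z(\eta)\,d\lambda_z(\gamma)
= \int_{\Gamma^c_{\tilde\Lambda}} \tilde H(\eta)\,e_\lambda(h_1+h_2,\eta)\,d\lambda_z(\eta) \tag{59}
\]
and can be found in [42, Corollary 2.5] for Γ^c_{Λ̃} replaced with Γ^c_0. One can check the above identity along the same lines, cf. [42, proof of Lemma 2.1]. Using the notation h(x, s) = |s| |G| ∗ 1_{Λ_R}(x) 1_{R^d\Λ_{R′}}(x) we thus get
\[
\begin{aligned}
1 &\le \int_{\Gamma^c} e^{2b|\beta|\langle |G|*1_{\Lambda_R}\cdot 1_{\mathbb{R}^d\setminus\Lambda_{R'}},\,|\gamma|\rangle}\,dP^{\tilde F_{\Lambda'}}(\gamma) \\
&= \int_{\Gamma^c_{\Lambda'}}\int_{\Gamma^c_{\Lambda'}} e_\lambda\big(e^{2b|\beta|h},\gamma\big)\,e_\lambda(-1,\xi)\,\rho_{\Lambda'}(\gamma+\xi)\,d\lambda_z(\xi)\,d\lambda_z(\gamma) \\
&= \int_{\Gamma^c_{\Lambda'}} e_\lambda\big(e^{2b|\beta|h}-1,\gamma\big)\,\rho_{\Lambda'}(\gamma)\,d\lambda_z(\gamma) \\
&\le \int_{\Gamma^c_{\Lambda'}} e_\lambda\big(e^{2b|\beta|h}-1,\gamma\big)\,C^{\sharp\gamma}\,d\lambda_z(\gamma) \\
&\le e^{zC\int_{\Lambda'}\left[e^{2b|\beta|h(x,s)}-1\right]dr(s)\,dx} \\
&\le e^{2zCbc|\beta|e^{2B|\beta|}\int_{\mathbb{R}^d\setminus\Lambda_{R'}} |G|*1_{\Lambda_R}\,dx},
\end{aligned} \tag{60}
\]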
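with B = cb‖G‖_1 ≥ sup_{x∈R^d, s∈[−c,c]} h(x, s) and C the Ruelle constant that does not depend on Λ′, cf. Proposition 6.1. Inserting (60) into (58) we obtain for I_4(Λ′, Λ_{R′})
\[
\begin{aligned}
I_4(\Lambda',\Lambda_{R'})
&\le \int_{\Gamma^c_\Lambda} e^{2B|\beta|\sharp\eta}\Big(e^{2b|\beta|\langle |G|*1_{\mathbb{R}^d\setminus\Lambda_R},\,|\eta|\rangle}\,
e^{2zCcb|\beta|e^{2B|\beta|}\int_{\mathbb{R}^d\setminus\Lambda_{R'}}|G|*1_{\Lambda_R}dx} - 1\Big)\,d\lambda_z(\eta) \\
&\le e^{2zB|\beta||\Lambda|}\Big(e^{2zCcb|\beta|e^{2B|\beta|}\int_{\mathbb{R}^d\setminus\Lambda_{R'}}|G|*1_{\Lambda_R}dx
 \,+\, 2zcb|\beta|\int_{\Lambda}|G|*1_{\mathbb{R}^d\setminus\Lambda_R}dx} - 1\Big) \\
&= e^{2zB|\beta||\Lambda|}\Big(e^{2zbc|\beta|\left(Ce^{2B|\beta|}\int_{\mathbb{R}^d\setminus\Lambda_{R'}}|G|*1_{\Lambda_R}dx
 \,+\, \int_{\mathbb{R}^d\setminus\Lambda_R}|G|*1_{\Lambda}dx\right)} - 1\Big).
\end{aligned} \tag{61}
\]
Let ε′ > 0 be arbitrary. We have to show that we can choose R, R′ > 0 such that each of the integrals in the exponent on the right-hand side of (61) is smaller than ε′. We note that |G| ∗ 1_Λ ∈ L^1(R^d, dx), thus ∫_{R^d\Λ_R} |G| ∗ 1_Λ dx < ε′ for R = R(ε′) > 0 large enough. Let such R be fixed, we then see that |G| ∗ 1_{Λ_R} ∈ L^1(R^d, dx), hence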
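we can find an R′ = R′(R, ε′) > 0 large enough, such that ∫_{R^d\Λ_{R′}} |G| ∗ 1_{Λ_R} dx < ε′. Choosing ε′ = ε′(ε) > 0 small enough and R, R′ accordingly, we can finally ensure that the right-hand side of (61) becomes smaller than ε, which establishes the required estimate for I_4(Λ′, Λ_{R′}). To estimate I_2(Λ′, Λ_{R′}), we remark that (57) is independent of Λ′. Thus one obtains (58) with F̃_{Λ′} replaced by F̃. The integral over Γ^c in the first line of (60) with F̃_{Λ′} replaced by F̃ fulfils the same uniform bound as on the right-hand side, as we have by monotone convergence and Proposition 6.6(ii)
\[
\begin{aligned}
\int_{\Gamma^c} e^{2b\beta\langle |G|*1_{\Lambda_R}\cdot 1_{\mathbb{R}^d\setminus\Lambda_{R'}},\,|\gamma|\rangle}\,dP^{\tilde F}(\gamma)
&= \sup_{\tilde\Lambda\subseteq\mathbb{R}^d\ \mathrm{compact}} \int_{\Gamma^c} e^{2b\beta\langle |G|*1_{\Lambda_R}\cdot 1_{\mathbb{R}^d\setminus\Lambda_{R'}},\,|\gamma_{\tilde\Lambda}|\rangle}\,dP^{\tilde F}(\gamma) \\
&\le \sup_{\tilde\Lambda,\Lambda'\subseteq\mathbb{R}^d\ \mathrm{compact}} \int_{\Gamma^c} e^{2b\beta\langle |G|*1_{\Lambda_R}\cdot 1_{\mathbb{R}^d\setminus\Lambda_{R'}},\,|\gamma_{\tilde\Lambda}|\rangle}\,dP^{\tilde F_{\Lambda'}}(\gamma) \\
&= \sup_{\Lambda'\subseteq\mathbb{R}^d\ \mathrm{compact}} \int_{\Gamma^c} e^{2b\beta\langle |G|*1_{\Lambda_R}\cdot 1_{\mathbb{R}^d\setminus\Lambda_{R'}},\,|\gamma|\rangle}\,dP^{\tilde F_{\Lambda'}}(\gamma).
\end{aligned} \tag{62}
\]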
Hence the estimate (61) also holds for I_2(Λ′, Λ_{R′}).

6.5. Cluster property and extremality of the state

The aim of this subsection is to show that P^{F̃} is a pure or extremal Gibbs state, i.e. P^{F̃} cannot be written as the convex combination of two translation invariant measures on (Γ^c, B(Γ^c)). We first prove a cluster property for the correlation functional ρ. Let h: R^{dn} × [−c, c]^n → R and ρ̂: Γ^c_0 → R be a correlation functional. We define
\[
\hat\rho(h) = \int_{\mathbb{R}^{dn}\times[-c,c]^{\times n}} \hat\rho\Big(\sum_{j=1}^n s_j\delta_{y_j}\Big)\,
h(y_1,\dots,y_n,s_1,\dots,s_n)\,dy_1\cdots dy_n\,dr(s_1)\cdots dr(s_n) \tag{63}
\]
and h_{\{g,a\}}(y_1, . . . , y_n, s_1, . . . , s_n) = h(gy_1 + a, . . . , gy_n + a, s_1, . . . , s_n), g ∈ O(d), a ∈ R^d. For η = Σ_{j=1}^n s_j δ_{y_j} ∈ Γ^c_0, η_{\{g,a\}} = Σ_{j=1}^n s_j δ_{g^{−1}(y_j−a)}.

Proposition 6.8. Let z, β be as in Theorem 6.4. Then ρ fulfils the cluster property
\[
\lim_{\tilde\Lambda\uparrow\mathbb{R}^d} \frac{1}{|\tilde\Lambda|}\int_{\tilde\Lambda} \big[\rho(f\otimes h_{\{1,a\}}) - \rho(f)\rho(h)\big]\,da = 0
\]
for h: R^{dn_2} × [−c, c]^{n_2} → R, f: R^{dn_1} × [−c, c]^{n_1} → R infinitely differentiable, bounded and decreasing like a Schwartz test function in all R^d-arguments, n_1, n_2 ∈ N.

Proof. Again, we closely follow [52, Sec. 4.4.7]. Let log_∗ be the logarithm w.r.t. the ∗-product (46) and let ρ^T = log_∗ ρ : Γ^c_0 → R be the cluster functional associated
with ρ. Let φ = log_∗ Ψ, with Ψ as in the proof of Theorem 6.4, then the following representation holds:
\[
\rho^T(\eta) = \sum_{n=0}^\infty \frac{1}{n!}\int_{S^n} \zeta(q_1)\cdots\zeta(q_n)\,D_{(\eta,0)}\varphi(\{q_1,\dots,q_n\})\,d\sigma(q_1)\cdots d\sigma(q_n). \tag{64}
\]
And from (49) one obtains for m = n_1 + n_2, q_1 = (y, s, 1) in combination with φ̃_{\{q\}} = φ_{\{q\}}, cf. [52, Eq. (4.24)],
\[
\int_{S_1^{n_1-1}\times S_2^{n_2}} |\varphi(\{q_1,\dots,q_n\})|\,d|\sigma|(q_2)\cdots d|\sigma|(q_n)
\le C\,(n_1-1)!\,n_2!\,C_1^{-1}(eC_1)^{n_1}(eC_2)^{n_2}.
\]
This finally gives the estimate
\[
\int_{\mathbb{R}^{d(m-1)}\times[-c,c]^{\times(m-1)}} \Big|\rho^T\Big(\sum_{j=1}^m s_j\delta_{y_j}\Big)\Big|\,dy_2\cdots dy_m\,dr(s_2)\cdots dr(s_m)
\le C\,(m-1)!\,C_1^{-1}\,\frac{(eC_1)^m}{1-|z|eC_1}\,\frac{1}{1-|\beta|eC_2}.
\]
By a simple change of variables ∫_{R^d} |ρ^T(f ⊗ h_{\{1,a\}})| da < ∞ follows for f, h as in the assertion, which implies (cf. [52, Sec. 4.4.3]) ∫_{R^d} |ρ(f ⊗ h_{\{1,a\}}) − ρ(f)ρ(h)| da < ∞.
One way to link the cluster property of the correlation functional to ergodicity of the measure is to express the moments of the measure in terms of the correlation functional. This at the same time gives us a formula for the infinite volume Schwinger functions of the associated random field X̃.

Proposition 6.9. Let z, β be as in Theorem 6.4. Then
(i) All moments of P^{F̃} exist. In terms of the correlation functional ρ they are given by
\[
E_{P^{\tilde F}}\Big[\prod_{j=1}^l \tilde F(f_j)\Big]
= \sum_{\substack{\{I_1,\dots,I_j\}:\ 1\le j\le l,\ I_r\subseteq\{1,\dots,l\} \\ I_1\,\dot\cup\,\cdots\,\dot\cup\, I_j=\{1,\dots,l\}}} \rho(f_I) \tag{65}
\]
where for f = (f_1, . . . , f_l) ∈ S^{×l} and I = (I_1, . . . , I_j) as in the sum in (65), f_I(y_1, . . . , y_j, s_1, . . . , s_j) = Π_{q=1}^j s_q^{♯I_q} Π_{p∈I_q} f_p(y_q).
(ii) The Schwinger functions of the associated interacting CPN X̃ exist and are given by S_n(f_1 ⊗ · · · ⊗ f_l) = E_{P^{X̃}}[Π_{j=1}^l X̃(f_j)] = E_{P^{F̃}}[Π_{j=1}^l F̃(G ∗ f_j)]. Furthermore, the Schwinger functions are analytic in the coupling constant (inverse temperature) β and the Feynman series converges on the indicated domain.
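The sum in (65) runs over all partitions of {1, . . . , l} into non-empty blocks. As a sanity check of this combinatorial structure, one may look at the simplest special case, which is an assumption made here purely for illustration: no interaction, unit charges and all f_j equal to the indicator of a set A. Then each partition into j blocks contributes (z|A|)^j and (65) reduces to the classical formula for the moments of a Poisson random variable. The function names below are hypothetical.

```python
def set_partitions(items):
    """Yield all partitions of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ... or into a new block of its own
        yield [[first]] + partition

def poisson_moment(order, mean):
    """E[N^order] for N ~ Poisson(mean), written as a sum over set partitions."""
    return sum(mean ** len(p) for p in set_partitions(list(range(order))))

# Number of partitions of a 4-element set (the Bell number B_4)
bell_4 = len(list(set_partitions([1, 2, 3, 4])))
third_moment = poisson_moment(3, 2.0)
```

For mean 2 the third moment is 2 + 3·4 + 8, matching the five partitions of a three-element set grouped by their number of blocks.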
Proof. (i) From the formula for the local densities q^Λ and (59), one obtains the following formula for the Laplace transform of P^{F̃}
\[
E_{P^{\tilde F}}\big[e^{\langle f,\tilde F\rangle}\big] = \int_{\Gamma^c_\Lambda} e_\lambda\big(e^{f}-1,\eta\big)\,\rho(\eta)\,d\lambda_z(\eta) \tag{66}
\]
where f ∈ D(R^d) with supp f ⊆ Λ and e_λ is the Lebesgue–Poisson coherent state defined in the proof of Theorem 6.7. In (66) we can replace Γ^c_Λ with Γ^c_0. From the Ruelle bound for ρ it follows that the right-hand side is well-defined not only for f with compact support, but also for f ∈ S and hence (66) extends by continuity. Existence of the (two-sided) Laplace transform implies existence of moments of all orders. Taking derivatives of the Laplace transform at zero yields
\[
E_{P^{\tilde F}}\Big[\prod_{j=1}^l \tilde F(f_j)\Big]
= \sum_{j=1}^l \frac{1}{j!} \sum_{\substack{(I_1,\dots,I_j):\ I_q\subseteq\{1,\dots,l\} \\ I_1\,\dot\cup\,\cdots\,\dot\cup\, I_j=\{1,\dots,l\}}} \rho(f_I). \tag{67}
\]
This gives (65) by symmetry of ρ.

(ii) This is an immediate corollary from (i) and X̃ = G ∗ F̃. The analyticity and the range of convergence follow from the related statements for ρ.

Combination of Propositions 6.8 and 6.9 gives us

Theorem 6.10. P^{F̃} is an extremal Gibbs measure.

Proof. Extremality of P^{F̃} is equivalent to the ergodicity property
\[
\lim_{\Lambda\uparrow\mathbb{R}^d} \frac{1}{|\Lambda|}\int_\Lambda E_{P^{\tilde F}}[H_1 H_2^a]\,da = E_{P^{\tilde F}}[H_1]\,E_{P^{\tilde F}}[H_2]
\qquad \forall H_1,H_2\in L^2(\Gamma^c,P^{\tilde F}), \tag{68}
\]
where H_2^a(η) = H_2(η_{\{1,a\}}), η ∈ Γ^c, cf. e.g. [17, Sec. 3.2]. By approximation of both sides of (68) it is furthermore easy to see that it suffices to check (68) for H_1 and H_2 in a set that has dense algebraic span in L^2(Γ^c, P^{F̃}). Since the two-sided Laplace transform exists for P^{F̃}, functions of the form H_p = Π_{j=1}^{l_p} F̃(f_j^p), p = 1, 2, with f_j^p ∈ S form such a set. One can thus use the cluster property of ρ, cf. Theorem 6.4, and (65) as follows. We note that the right-hand side of (65) is a sum over all partitions of {1, . . . , l} into disjoint sets. By (65) (see also [52, Sec. 4.4.3]) it is therefore sufficient to check that for f^a = (f_1^1, . . . , f_{l_1}^1, f^2_{1,\{1,a\}}, . . . , f^2_{l_2,\{1,a\}}) and f^p = (f_1^p, . . . , f_{l_p}^p), p = 1, 2, we have lim_{Λ↑R^d} (1/|Λ|) ∫_Λ ρ(f_I^a) da = 0 if I = {I_1, . . . , I_j} is such that ∃I_q ∈ I with I_q ∩ {1, . . . , l_1} ≠ ∅ and I_q ∩ {l_1 + 1, . . . , l_1 + l_2} ≠ ∅, and lim_{Λ↑R^d} (1/|Λ|) ∫_Λ ρ(f_I^a) da = ρ(f^1_{I^1}) ρ(f^2_{I^2}) with I^1 = {I_q ∈ I: I_q ⊆ {1, . . . , l_1}}, I^2 = {I_q − l_1 : I_q ⊆ {l_1 + 1, . . . , l_1 + l_2}} otherwise. The second condition is just the cluster property of ρ, cf. Proposition 6.8. To verify the first condition, one can look into the definition of f_I^a and ρ(f_I^a) to see (using also Ruelle bounds) that ρ(f_I^a) → 0 faster than any inverse power of |a|, which implies the first condition. In fact, in that case at least one (non-translated)
March 29, 2005 8:59 WSPC/148-RMP
210
J070-00232
S. Albeverio, H. Gottschalk & M. W. Yoshida
$f_j^1$ and one (translated) $f^2_{l,\{1,a\}}$ are evaluated w.r.t. the same integration variable. The product of these two functions thus decreases rapidly as $a$ gets large.
6.6. An alternative construction using “duality”
Here we prove an equality between correlation functionals of interacting particle systems and characteristic functionals^m of interacting CPNs, making it possible to apply the results on the infinite volume limit for the correlation functional from Sec. 6.3 to the removal of the infra-red cut-off for the characteristic functional. The results obtained here are somewhat weaker than those obtained by the detour through correlation functionals, as in the previous section. Still, we hope that this new way of performing the thermodynamic limit for the measure is of independent interest, as, for example, it gives a direct perturbative control over the characteristic function.
We restrict to trigonometric interactions given by energy densities $v(t) = \int_{[-c',c']} (1 - e^{i\alpha t})\, d\nu(\alpha)$ where ν is a symmetric probability measure on $[-c', c']$ for some $c' > 0$. Furthermore, in this subsection we also assume that the Lévy measure r is symmetric. Clearly, in that case the formal Potts model of Sec. 6.2 becomes a real Potts model (however still at imaginary temperature) and one expects a specific symmetry or “duality” depending on whether the Potts model is projected to its first or its second component.
The following technical lemma states that there is a pointwise definition $X(x)$ of the convoluted Poisson noise X and that also the trigonometric interactions $v(X(x))$ have a pointwise meaning.
Lemma 6.11. Let $X_\Lambda = G * F_\Lambda$ be a CPN with assumptions on G and $F_\Lambda$ as above. For $p \geq 1$, the mapping $\mathbb{R}^d \times \mathbb{R} \ni (x, \alpha) \to e^{i\alpha X_\Lambda(x)} \in L^p(S', P^{X_\Lambda})$ is continuous.
Proof. We prove that $e^{i\alpha X_\Lambda(x)}$ is well-defined for $(x, \alpha) \in \mathbb{R}^d \times \mathbb{R}$. Continuity can be proven in an analogous manner. Let $X_{\Lambda,\epsilon} = \chi_\epsilon * X_\Lambda$ be an ultra-violet regularization of $X_\Lambda$, cf. Sec. 5.3. Without loss of generality we restrict ourselves to the case $p = 2n$, $n \in \mathbb{N}$. Then, for $\epsilon, \epsilon' > 0$,
$$E_{P^{X_\Lambda}}\big[|e^{i\alpha X_{\Lambda,\epsilon}(x)} - e^{i\alpha X_{\Lambda,\epsilon'}(x)}|^p\big]$$
$$= E_{P^{X_\Lambda}}\big[(e^{i\alpha X_{\Lambda,\epsilon}(x)} - e^{i\alpha X_{\Lambda,\epsilon'}(x)})^n (e^{-i\alpha X_{\Lambda,\epsilon}(x)} - e^{-i\alpha X_{\Lambda,\epsilon'}(x)})^n\big]$$
$$= \sum_{j,l=0}^{n} \binom{n}{j}\binom{n}{l} (-1)^{l+j}\, E_{P^{X_\Lambda}}\big[e^{i\alpha[(n-l)X_{\Lambda,\epsilon}(x) + lX_{\Lambda,\epsilon'}(x) - (n-j)X_{\Lambda,\epsilon}(x) - jX_{\Lambda,\epsilon'}(x)]}\big]$$
$$= \sum_{j,l=0}^{n} \binom{n}{j}\binom{n}{l} (-1)^{l+j}\, e^{\int_{\Lambda - x} \psi(\alpha(j-l)\, G * (\chi_\epsilon - \chi_{\epsilon'})(y))\,dy}. \qquad (69)$$
^m We are grateful to an anonymous referee for pointing out to us that the “duality transformation” discussed here has already been considered by V. Shkripnik from Kiev in some unpublished preprints in the 1970s.
We note that $\sum_{j,l=0}^{n} \binom{n}{j}\binom{n}{l} (-1)^{l+j} = ((1 - 1)^n)^2 = 0$. In order to prove that $e^{i\alpha X_{\Lambda,\epsilon}(x)}$ forms a Cauchy sequence in $L^p(S', P^{X_\Lambda})$ it is thus sufficient to show that the integrals on the right-hand side of (69) vanish as $\epsilon, \epsilon' \downarrow 0$. Clearly,
$$\Big|\int_{\Lambda - x} \psi(\alpha(j-l)\, G * (\chi_\epsilon - \chi_{\epsilon'})(y))\,dy\Big| \leq z c n |\alpha| \int_{\mathbb{R}^d} |G * \chi_\epsilon - G * \chi_{\epsilon'}|\,dy$$
and now the assertion of the lemma follows from $G * \chi_\epsilon \to G$ in $L^1(\mathbb{R}^d, dx)$ as $\epsilon \downarrow 0$ (the latter again is a consequence of the Riesz convergence theorem [18]).
In particular, Lemma 6.11 shows that the characteristic functional $C_X$ of X can be extended to the space $\Gamma_0 = \bigcup_{c>0} \Gamma_0^c$ of finite configurations. Since the measure of the interacting CPN $\bar X$, $P^{\bar X_\Lambda}$, is absolutely continuous w.r.t. $P^X$, the same holds for the characteristic functional $C_{\bar X_\Lambda}$. Let us now recall all the data entering in the definition of $\bar X_\Lambda$ (cf. the left column of Table 1). We say that an interacting particle system $\tilde F_\Lambda$ is dual^n to the interacting CPN $\bar X_\Lambda$ if the defining data for $\tilde F_\Lambda$ can be obtained from the defining data of $\bar X_\Lambda$ according to Table 1. The following theorem clarifies the sense of this notion.

Table 1. Identifications for the characteristic–correlation functional duality between “Poisson” quantum fields and interacting particle systems for the case of trigonometric interactions.

     interacting CPN $\bar X_\Lambda$              interacting particle system $\tilde F_\Lambda$
1.   activity z                                    inverse temperature β
2.   inv. temperature (coupling const.) β          activity z
3.   IR cut-off and finite volume Λ                finite volume and IR cut-off Λ
4.   Lévy measure r                                interaction measure ν
5.   interaction measure ν                         Lévy measure r
6.   integral kernel G                             integral kernel G

^n There is a conceptual difference between this notion of “duality” and the notion of $\tilde F_\Lambda$ being associated to $\tilde X_\Lambda$.

Theorem 6.12. Let $\bar X_\Lambda$ be an interacting CPN with a trigonometric interaction and let $\tilde F_\Lambda$ be dual to $\bar X_\Lambda$. Then, the characteristic functional of $\bar X_\Lambda$ and the correlation functional of $\tilde F_\Lambda$ are related via $C_{\bar X_\Lambda}(\eta) = \rho_\Lambda(\eta)$ $\forall \eta \in \Gamma_0$, supp $\eta \subseteq \Lambda$.
Proof. By Lemma 6.11 and the estimate $|V_\Lambda| < 2|\Lambda|$ we get that all expressions in the following chain of equations are well-defined:
$$C_{\bar X_\Lambda}(\eta) = \frac{1}{\Xi_\Lambda} E_{P^X}\big[e^{iX(\eta)} e^{-\beta V_\Lambda}\big]$$
$$= \frac{e^{-\beta|\Lambda|}}{\Xi_\Lambda} \sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!} E_{P^X}\big[e^{iX(\eta)} (V_\Lambda - |\Lambda|)^n\big]$$
$$= \frac{e^{-\beta|\Lambda|}}{\Xi_\Lambda} \sum_{n=0}^{\infty} \frac{\beta^n}{n!} \int_{\Lambda^n \times [-b,b]^{\times n}} e^{-z \int_{\mathbb{R}^d} \int_{[-c,c]} [1 - e^{is(G*\eta(y) + \alpha_1 G(y - y_1) + \cdots + \alpha_n G(y - y_n))}]\,dr(s)\,dy}\; dy_1 \cdots dy_n\, d\nu(\alpha_1) \cdots d\nu(\alpha_n). \qquad (70)$$
Defining $U_\Lambda(\eta) = \int_\Lambda v_r(G * \eta)\,dx$ with $v_r(t) = \int_{[-c,c]} (1 - e^{its})\,dr(s)$ we now get the statement of the theorem by comparing the right-hand side of (70) with the defining Eq. (39), cf. also (13). That here the partition function $\Xi_\Lambda$ of the interacting CPN is equal to the partition function $\tilde\Xi_\Lambda$ can be seen by an analogous argument.
Theorem 6.12 generalizes the well-known connection of trigonometric interactions and particle systems with certain pair interactions (equivalence of massive/massless sine-Gordon model and Yukawa/Coulomb gas, respectively), see e.g. [9, 10, 27] and references therein. In fact, in the ultra-violet regularized case one can obtain this classical “duality” from Theorem 6.10 by a scaling in the spirit of Corollary 7.5 below, see also Sec. 7.4.
From Theorem 6.12 we get that for interacting CPNs with negative trigonometric interactions the high-temperature (“Feynman”) expansion of the characteristic functional is equivalent to the low activity expansion of the dual particle system and vice versa. Hence the results of Sec. 6.3 carry over to characteristic functionals of the fields in the following way.
Corollary 6.13. Let β and z be the coupling constant and the activity of the interacting CPN $\tilde X_\Lambda$. If $|\beta| < 1/(eC_1)$ and $|z| < 1/(eC_2)$ with $C_1$, $C_2$ as in (44) and (45), then
(i) $C_{\tilde X}(\eta) = \lim_{\Lambda\uparrow\mathbb{R}^d} C_{\tilde X_\Lambda}(\eta)$ exists for $\eta \in \Gamma_0$ and is analytic in z and β;
(ii) $C_{\tilde X}(\eta)$ is continuous at zero in the sense that $C_{\tilde X}(\sum_{l=1}^{n} \alpha_l \delta_{x_l}) \to 1$ if $\alpha_1, \ldots, \alpha_n \to 0$;
(iii) $C_{\tilde X} : \Gamma_0 \to \mathbb{C}$ hence defines a projective family of measures $(P_J^{\tilde X})_{J \subseteq \mathbb{R}^d\ \mathrm{finite}}$;
(iv) There exists a canonical measure $P^{\tilde X}_{can.}$ on the space of functions $\omega : \mathbb{R}^d \to \mathbb{R}$ equipped with the sigma-algebra generated by pointwise evaluation $f \to f(x)$. The infinite volume interacting CPN $\tilde X(x)(\omega) = \omega(x)$ can be seen as the canonical process of $P^{\tilde X}_{can.}$ in the above sense.
Proof. (i) This follows from Theorems 6.4 and 6.10.
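(ii) One can use the representation through the dual correlation functional and obtain the following uniform estimate
$$|\rho_\Lambda(\eta) - 1| = \frac{1}{\Xi_\Lambda} \Big| E_{P^F}\big[e^{-\beta U_\Lambda(\eta + F_\Lambda)} - e^{-\beta U_\Lambda(F_\Lambda)}\big] \Big|$$
$$\leq \frac{1}{\Xi_\Lambda} E_{P^F}\big[|e^{-\beta U_\Lambda(\eta + F_\Lambda) + \beta U_\Lambda(F_\Lambda)} - 1|\, e^{-\beta U_\Lambda(F_\Lambda)}\big]$$
$$\leq \sup_{\gamma \in \Gamma_0} |e^{-\beta U_\Lambda(\eta + \gamma) + \beta U_\Lambda(\gamma)} - 1| \leq e^{\beta b \|G\|_1 \sum_{l=1}^{n} |\alpha_l|} - 1.$$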
(iii) Thus, $C_{\tilde X}$ defines a family of positive definite (as the limit of positive definite functions) and continuous functions that obviously generates a projective family of finite dimensional distributions $P_J^{\tilde X}$ of the random vectors $(X(x_1), \ldots, X(x_n))$ for $J = \{x_1, \ldots, x_n\}$ (again, the projectivity property is evident for $\Lambda \subseteq \mathbb{R}^d$ compact and it survives the limit as the vectors converge in distribution by Lévy’s theorem).
(iv) follows from (iii) and Kolmogorov’s theorem on the existence of the inductive limit $P^{\tilde X}_{can.}$ of the family of finite dimensional distributions.
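The identifications of Table 1 can be read as a simple swap of roles between the defining data of the two models. The following Python sketch is only an illustration of that bookkeeping: the class `ModelData`, its field names and the function `dual` are hypothetical names introduced here, and the measures and the kernel are represented by plain labels rather than by mathematical objects.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelData:
    """Defining data of either model (hypothetical container, labels only)."""
    activity: float              # z
    inverse_temperature: float   # beta (coupling constant)
    volume: str                  # finite volume / IR cut-off Lambda
    levy_measure: str            # r
    interaction_measure: str     # nu
    kernel: str                  # G


def dual(data: ModelData) -> ModelData:
    """Apply the identifications of Table 1: swap z <-> beta and r <-> nu.

    The volume Lambda and the kernel G are left unchanged.
    """
    return replace(
        data,
        activity=data.inverse_temperature,
        inverse_temperature=data.activity,
        levy_measure=data.interaction_measure,
        interaction_measure=data.levy_measure,
    )


# Illustrative values only; they are not taken from the text.
cpn = ModelData(activity=2.0, inverse_temperature=0.5, volume="Lambda",
                levy_measure="r", interaction_measure="nu", kernel="G")
particles = dual(cpn)
```

Applying `dual` twice returns the original data, which reflects the symmetric form of Table 1.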
7. The Continuum Limit
In this section we discuss the continuum scaling limit of interacting particle systems with infra-red cut-off.^o On the level of interacting CPNs, this scaling can be seen as a kind of implementation of the renormalization group.
7.1. Scaling limits
Here we first consider the situation of a gas of charged, non-interacting particles. The number of positive and negative charges is assumed to be equal on average, hence the gas is macroscopically neutral. If we let the number of particles per unit volume (the activity z) go to infinity and scale charges with a factor $s \to s/\sqrt{z}$, we obtain the so-called continuum limit. See e.g. [23] for an overview of the scaling of particle systems.
Let r be a probability measure on $[-c, c]$ such that $r\{0\} = 0$. For $0 < \lambda < \infty$ and $A \subseteq \mathbb{R}$ measurable, we define $r_\lambda(A) = r(\lambda A)$. Let F be the Poisson noise determined by the Lévy measure r and the activity $z = 1$, cf. Eqs. (2) and (3). We then denote^p the Poisson noise determined by the Lévy measure $r_{1/\sqrt{z}}$ and activity $z \geq 1$ by $F^z$. Throughout the section we assume $\int_{[-c,c]} s\,dr(s) = 0$. We also set $\sigma^2 = \int_{[-c,c]} s^2\,dr(s)$ and $\psi_z(t) = z \int_{[-c/\sqrt{z},\, c/\sqrt{z}]} (e^{ist} - 1)\,dr_{1/\sqrt{z}}(s)$. Finally, by $F_g^\sigma$ we denote the Gaussian noise with intensity $\sigma > 0$ (cf. (2)–(3)) and we write $X^z = G * F^z$, $X_g^\sigma = G * F_g^\sigma$ for the associated convoluted Poisson and Gaussian noise, respectively. The basic facts on the continuum limit are given by the following proposition.
Proposition 7.1. With definitions as above we get
(i) $F^z \xrightarrow{L} F_g^\sigma$ as $z \to \infty$;
(ii) $X^z \xrightarrow{L} X_g^\sigma$ as $z \to \infty$.
Proof. We have (i) ⇔ (ii) and it therefore suffices to prove the first statement. By Lévy’s theorem convergence in law is equivalent to the convergence of characteristic functionals. It is thus sufficient to prove (cf. (2)–(3)) $\int_{\mathbb{R}^d} \psi_z(f)\,dx \to \int_{\mathbb{R}^d} (-\frac{\sigma^2}{2} f^2)\,dx$ as $z \to \infty$ $\forall f \in S$. Since $\psi_z(t) \to -\frac{\sigma^2}{2} t^2$ as $z \to \infty$ we have pointwise convergence, and since $|\psi_z(f)| \leq \frac{\sigma^2}{2} f^2$ $\forall z \geq 1$ the statement follows by dominated convergence.
^o Working with finite volume instead of an infra-red cut-off would lead to Gaussian tail fields outside this volume, which would lead to misleading “tail-effects” in the scaling.
^p The superscript z in this section is used in a different sense than in Sec. 6, since there the charges remained unscaled.
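The two ingredients of this proof, the pointwise limit $\psi_z(t) \to -\sigma^2 t^2/2$ and the uniform bound $|\psi_z(t)| \leq \sigma^2 t^2/2$, can be checked numerically in a simple special case. The sketch below assumes (this choice is not made in the text) that the unscaled charges are $\pm 1$ with probability 1/2 each, so that $\sigma^2 = 1$, and that the scaled charges are $\pm 1/\sqrt{z}$ as described in Sec. 7.1. In that case $\psi_z(t) = z(\cos(t/\sqrt{z}) - 1) = -2z\sin^2(t/(2\sqrt{z}))$.

```python
import math


def psi_z(t: float, z: float) -> float:
    """Levy characteristic for charges +-1/sqrt(z) (probability 1/2 each), activity z.

    Written as -2 z sin^2(t / (2 sqrt(z))), which equals z (cos(t / sqrt(z)) - 1)
    but avoids cancellation for large z.
    """
    return -2.0 * z * math.sin(t / (2.0 * math.sqrt(z))) ** 2


def gaussian_limit(t: float, sigma2: float = 1.0) -> float:
    """Gaussian characteristic -sigma^2 t^2 / 2 (sigma^2 = 1 for charges +-1)."""
    return -0.5 * sigma2 * t * t


if __name__ == "__main__":
    for z in (1.0, 1e2, 1e4, 1e8):
        print(z, psi_z(1.0, z), gaussian_limit(1.0))
```

For this example the difference between `psi_z(t, z)` and `gaussian_limit(t)` is of order $t^4/(24z)$, which makes the rate of the pointwise convergence explicit.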
We recall from Sec. 2.3 that Proposition 7.1 is of particular interest in the case $G = G_{\alpha,m_0}$, cf. Proposition 2.1, since then $X^\sigma_{\alpha,m_0,g} = G_{\alpha,m_0} * F_g^\sigma$ is a generalized free field for $0 < \alpha < 1/2$ and is Nelson’s free field of mass $m_0 > 0$ for $\alpha = 1/2$.
Next we investigate the effect of a length scale transformation $x \to \lambda x$, $x \in \mathbb{R}^d$, $0 < \lambda < \infty$, on the Poisson noise F and the CPN $X_{\alpha,m_0} = G_{\alpha,m_0} * F$, respectively. The basic observation is that increasing the activity can be performed by a scaling of the length ($z \sim \lambda^d$), cf. Fig. 3. Also one has to take into account that for a locally finite marked configuration we have a scaling dimension $\lambda^{-d}$, since $\delta(\lambda x) = \lambda^{-d} \delta(x)$. To obtain the same scaling as in Proposition 7.1, we thus have to define
$$F_\lambda(x) = \lambda^{d/2} F(\lambda x), \qquad (71)$$
$0 < \lambda < \infty$, where this scaling relation has to be understood in the sense of distributions.

Fig. 3. The average number of particles in a region Λ is proportional to |Λ|. A scaling $\Lambda \to \lambda\Lambda$ thus scales the activity by a factor $\lambda^d$. Here d = 2, λ = 2.

Proposition 7.2. With definitions as above
(i) $F_\lambda \stackrel{L}{=} F^z$ for $z = \lambda^d$;
(ii) For $X_{\alpha,m_0,\lambda}(x) = \lambda^{(d-4\alpha)/2} X_{\alpha,m_0/\lambda}(\lambda x)$ we get $X_{\alpha,m_0,\lambda} \stackrel{L}{=} G_{\alpha,m_0} * F_\lambda$;
(iii) For $0 < \alpha \leq 1/2$, $X_{\alpha,m_0,\lambda} \xrightarrow{L} X^\sigma_{\alpha,m_0,g}$ as $\lambda \to \infty$ where the latter is a (generalized) free field.
Proof. (ii) follows from (i) and Proposition 2.1(v). (iii) follows from (i), (ii) and Proposition 7.1. To prove (i) let $f^\lambda(x) = f(x/\lambda)$, $f \in S$, $x \in \mathbb{R}^d$, $\lambda > 0$. Then, by (71), $\langle F_\lambda, f \rangle = \lambda^{-d/2} \langle F, f^\lambda \rangle$. Thus,
$$C_{F_\lambda}(f) = e^{\int_{\mathbb{R}^d} \int_{[-c,c]} (e^{is\lambda^{-d/2} f^\lambda(y)} - 1)\,dr(s)\,dy} = e^{\lambda^d \int_{\mathbb{R}^d} \int_{[-\lambda^{-d/2}c,\, \lambda^{-d/2}c]} (e^{isf(y)} - 1)\,dr_{\lambda^{-d/2}}(s)\,dy}, \quad f \in S, \qquad (72)$$
and the claim follows from Eqs. (2) and (3).
The scaling in Proposition 7.2(iii) is of the same form as for block-spin transformations implementing the renormalization group for lattice systems [26]. In the general sense that the renormalization group is a scaling limit adding more and more “microstructures” to a given region, we can say that the continuum limit for the models studied in this article is a suitable formulation of the renormalization group.
Remark 7.3. (i) It is an interesting fact that it is just property (ii) of Remark 2.2 which prevents us from taking a pointwise continuum limit: If we have $G \in L^2(\mathbb{R}^d, dx)$, then the i.i.d. variables $Z_j(x) = S_j G(x - Y_j)$ with $S_j$, $Y_j$ distributed as $r \otimes dx|_\Lambda/|\Lambda|$ for a finite volume $\Lambda \subseteq \mathbb{R}^d$ have finite variance, since $\sigma^2 \int_\Lambda |G(x - y)|^2\,dy < \infty$, and therefore fulfil the requirements of the central limit theorem. Under such conditions, the quantity $X_\Lambda^z(x) = \sum_{j=1}^{N_\Lambda^z} Z_j(x)/\sqrt{z}$ ($N_\Lambda^z$ being a Poisson random variable with intensity $z|\Lambda|$) converges in law to a Gaussian random variable, and one can thus expect a pointwise definition of the process $X^\sigma_{\Lambda,g}(x)$. If however $G \notin L^2(\mathbb{R}^d, dx)$, as is the case for the examples relevant for QFT, then the variance of $Z_j(x)$ is infinite. Heuristically speaking, $X_\Lambda^z(x)$ then converges to a “Gaussian random variable with infinite fluctuations”; thus there is no pointwise limit. Ultra-violet divergences and renormalization in these cases have to be taken into account. In the case $d = 2$, $G = G_{1/2}$, the variance of $Z_j(x)$ only diverges logarithmically, which already gives a hint that ultra-violet divergences in this specific case^q will be rather mild.
(ii) From the above discussion it is clear that the Gaussian (continuum) limit can also be taken in the canonical ensemble (CE) by replacing $N_\Lambda$ with its expectation |Λ|. Interactions for the CE can be defined as in Sec. 4. It is however open whether the analytic continuation [3] can also be performed in the CE.
7.2. The continuum limit for trigonometric interactions with ultra-violet cut-off
Here we study the continuum limit of CPNs with ultra-violet and infra-red regularized bounded interactions and we show convergence in law to the corresponding perturbed Gaussian models.
Let $G_\epsilon$ be an ultra-violet regularization of the kernel G (cf. Sec. 5.3). $\Lambda \subseteq \mathbb{R}^d$ is assumed to be compact. Let $X_\epsilon^z = G_\epsilon * F^z$ and $X_\epsilon^\sigma = G_\epsilon * F^\sigma$. Here we dropped the superscripts g for notational simplicity and we adopt the convention that a (convoluted) noise with superscript σ is Gaussian. It is clear that $X_\epsilon^\sigma = \chi_\epsilon * X^\sigma$ has paths in the set $C^\infty(\mathbb{R}^d, \mathbb{R})$. For $v : \mathbb{R} \to \mathbb{R}$ being measurable and bounded (by a constant $a > 0$) we can thus define the potentials
$$V^{z/\sigma}_{\Lambda,\epsilon} = \langle v(X_\epsilon^{z/\sigma}), 1_\Lambda \rangle \qquad (73)$$
and measures (for $\beta > 0$; we also note that $|V^{z/\sigma}_{\Lambda,\epsilon}| \leq a|\Lambda|$ a.s.)
$$P^{\bar X^{z/\sigma}_{\epsilon,\Lambda}} = \frac{e^{-\beta V^{z/\sigma}_{\Lambda,\epsilon}}}{\Xi^{z/\sigma}_{\Lambda,\epsilon}}\, P^{X_\epsilon^{z/\sigma}}, \qquad \Xi^{z/\sigma}_{\Lambda,\epsilon} = E_{P^{X_\epsilon^{z/\sigma}}}\big[e^{-\beta V^{z/\sigma}_{\Lambda,\epsilon}}\big]. \qquad (74)$$
Let $\bar X^{z/\sigma}_{\epsilon,\Lambda}$ be the associated coordinate processes. We now obtain the same result as Proposition 7.1 for the perturbed models.
Theorem 7.4. $\bar X^z_{\epsilon,\Lambda} \xrightarrow{L} \bar X^\sigma_{\epsilon,\Lambda}$ as $z \to \infty$.
^q This is the standard case considered usually in constructive QFT in two dimensions, see [1, 34, 55].
Proof. As convergence in law is equivalent to the convergence of characteristic functionals, we have to prove $C_{\bar X^z_{\epsilon,\Lambda}}(f) \to C_{\bar X^\sigma_{\epsilon,\Lambda}}(f)$ $\forall f \in S$. Since $V^z_{\Lambda,\epsilon}$ is a uniformly (in z) bounded random variable, we get that the expression
$$C_{\bar X^z_{\epsilon,\Lambda}}(f) = \frac{\sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!} E_{P^{X_\epsilon^z}}\big[e^{iX_\epsilon^z(f)} (V^z_{\Lambda,\epsilon})^n\big]}{\sum_{n=0}^{\infty} \frac{(-\beta)^n}{n!} E_{P^{X_\epsilon^z}}\big[(V^z_{\Lambda,\epsilon})^n\big]} \qquad (75)$$
converges to the related expression with z replaced with σ if all terms in the numerator and denominator converge separately. Using Fubini’s theorem we get for a term in the numerator
$$E_{P^{X_\epsilon^z}}\big[e^{iX_\epsilon^z(f)} (V^z_{\Lambda,\epsilon})^n\big] = \int_{\Lambda^{\times n}} E_{P^{X^z}}\big[e^{iX^z(\chi_\epsilon * f)}\, v(X^z(\chi_{\epsilon,y_1})) \cdots v(X^z(\chi_{\epsilon,y_n}))\big]\, dy_1 \cdots dy_n \qquad (76)$$
and the corresponding term in the denominator is obtained by setting $f = 0$. Here $\chi_\epsilon$ is the ultra-violet cut-off function (cf. Sec. 5.3) and $\chi_{\epsilon,y}(x) = \chi_\epsilon(x - y)$. Since $\chi_{\epsilon,y} \in S$ we now get the pointwise convergence of the integrand on the right-hand side of (76) to the related integrand with z replaced with σ from the convergence in law of $X^z$, cf. Proposition 7.1(ii). Since the integrand is uniformly bounded by $(a|\Lambda|)^n$, convergence of the right-hand side of (76) then follows from dominated convergence.
We want to modify this result in the following way. We replace the functions $v(t)$ in (73) with functions $:v(t):^{z/\sigma}_\epsilon = \int_{[-b,b]} :\cos(\alpha t):^{z/\sigma}_\epsilon\, d\nu(\alpha)$ where
$$:\cos(\alpha t):^{z/\sigma}_\epsilon = \cos(\alpha t) / E_{P^{X_\epsilon^{z/\sigma}}}\big[\cos(\alpha X_\epsilon^{z/\sigma}(x))\big], \quad x \in \mathbb{R}^d. \qquad (77)$$
Here ν is a finite, complex measure on $[-b, b]$ such that $\nu(A) = \nu(-A)$ for $A \subseteq [-b, b]$ measurable. These energy densities define the (ultra-violet regularized) trigonometric interactions [9, 10]. It is not difficult to prove that under the given conditions $:v(t):^z_\epsilon$ is uniformly bounded (in z and t) and $:v(t):^z_\epsilon \to :v(t):^\sigma_\epsilon$ uniformly in t as $z \to \infty$. Thus, the proof of Theorem 7.4 carries over to the modified interactions.
Corollary 7.5. Let $\bar X^z_{\Lambda,\epsilon}$ be the ultra-violet regularized interacting CPN with trigonometric interaction specified as above and let $\bar X^\sigma_{\Lambda,\epsilon}$ be the related perturbed Gaussian model. Then the statement of Theorem 7.4 still holds.
7.3. Triviality for trigonometric potentials without renormalization
We now want to consider the continuum limit without ultra-violet cut-off in the case of trigonometric potentials without renormalization “$: \; :^z_0$”, i.e. we set the denominator in (77) equal to one. Let $v(t) = \int_{\mathbb{R}} \cos(\alpha t)\,d\nu(\alpha)$ for some finite, complex measure ν on $\mathbb{R}$ such that $\nu\{0\} = 0$ and $\nu(A) = \nu(-A)$ for $A \subseteq \mathbb{R}$ measurable. Let furthermore r be symmetric, $r(A) = r(-A)$, $A \subseteq [-c, c]$ measurable. In this case $\psi_z$ is real and $\psi_z(t) \leq 0$. We choose $G \in L^1(\mathbb{R}^d, dx)$ such that $G \notin L^2(\mathbb{R}^d, dx)$, cf. Remarks 2.2(ii) and 7.3(i) for the motivation. Finally, we define $V_\Lambda^z$ as in Eq. (73) with $\epsilon = 0$ and by Theorem 3.3 we get that this is well-defined (since ν is finite, v is bounded). We get the following lemma.
Lemma 7.6. $\|V_\Lambda^z\|_{L^2(S', P^{X^z})} \to 0$ as $z \to \infty$.
Proof. We get by Fubini’s theorem for bounded functions
$$E_{P^{X^z}}\big[|V_\Lambda^z|^2\big] = \int_{\Lambda^{\times 2} \times \mathbb{R}^2} E_{P^{X^z}}\big[e^{i(\alpha_1 X^z(y_1) + \alpha_2 X^z(y_2))}\big]\, dy_1\, dy_2\, d\nu(\alpha_1)\, d\nu(\alpha_2) \qquad (78)$$
with (cf. Eq. (2) and Lemma 6.11)
$$E_{P^{X^z}}\big[e^{i(\alpha_1 X^z(y_1) + \alpha_2 X^z(y_2))}\big] = e^{\int_{\mathbb{R}^d} \psi_z(\alpha_1 G(x - y_1) + \alpha_2 G(x - y_2))\,dx} \qquad (79)$$
and the integral in the exponent on the right-hand side exists for $0 < z < \infty$, since $|\psi_z(t)| \leq c\sqrt{z}|t|$. If we can show that the right-hand side of (79) vanishes $dy_1\, dy_2\, d\nu(\alpha_1)\, d\nu(\alpha_2)$ a.e., we get the statement of the lemma by dominated convergence (since $\psi_z \leq 0$).
Let $g(t) = \psi_z(\sqrt{z}\,t)/(z t^2)$. One easily verifies that g is continuous and $g(0) = -\sigma^2/2$.
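By Fatou’s lemma we get for $\alpha_1, \alpha_2 \neq 0$ and $y_1 \neq y_2$
$$\limsup_{z\to\infty} \int_{\mathbb{R}^d} \psi_z(\alpha_1 G(x - y_1) + \alpha_2 G(x - y_2))\,dx$$
$$= \limsup_{z\to\infty} \int_{\mathbb{R}^d} g\big([\alpha_1 G(x - y_1) + \alpha_2 G(x - y_2)]/\sqrt{z}\big)\, (\alpha_1 G(x - y_1) + \alpha_2 G(x - y_2))^2\,dx$$
$$\leq \int_{\mathbb{R}^d} \limsup_{z\to\infty} g\big([\alpha_1 G(x - y_1) + \alpha_2 G(x - y_2)]/\sqrt{z}\big)\, (\alpha_1 G(x - y_1) + \alpha_2 G(x - y_2))^2\,dx$$
$$= -\frac{\sigma^2}{2} \int_{\mathbb{R}^d} (\alpha_1 G(x - y_1) + \alpha_2 G(x - y_2))^2\,dx = -\infty. \qquad (80)$$
This concludes the proof.
Let $\bar X_\Lambda^z$ be the interacting CPN with infra-red cut-off Λ associated to $V_\Lambda^z$. We then get:
Theorem 7.7. $\bar X_\Lambda^z \xrightarrow{L} X^\sigma$ as $z \to \infty$, i.e. the limit is trivial (Gaussian).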
Proof. Again we have to show convergence of characteristic functionals. Let $f \in S$, then
$$E_{P^{X^z}}\big[e^{iX^z(f)} e^{-\beta V_\Lambda^z}\big] = C_{X^z}(f) - E_{P^{X^z}}\big[e^{iX^z(f)} (1 - e^{-\beta V_\Lambda^z})\big] \qquad (81)$$
and from Lemma 7.6 we get
$$\big|E_{P^{X^z}}\big[e^{iX^z(f)} (1 - e^{-\beta V_\Lambda^z})\big]\big| \leq E_{P^{X^z}}\big[|1 - e^{-\beta V_\Lambda^z}|^2\big]^{1/2} \leq C\, E_{P^{X^z}}\big[|V_\Lambda^z|^2\big]^{1/2} \to 0 \quad \text{as } z \to \infty \qquad (82)$$
where $C = \beta|\Lambda||\nu|(\mathbb{R})\, e^{\beta|\Lambda||\nu|(\mathbb{R})}$. Likewise one can show that $\Xi_\Lambda^z \to 1$ as $z \to \infty$. The statement now follows from Proposition 7.1(ii).
Remark 7.8. (i) Clearly, Theorem 7.7 is what one would expect from the analysis of the sine-Gordon model [27–29]. The normal ordering $:\cos(\alpha t):^\sigma_\epsilon$, cf. (77), in this case can be understood as a renormalization of the coupling constant, i.e. we choose the energy density $\cos(\alpha t)$ with coupling constant $\beta_\sigma = \beta_0 / E_{P^{X_\epsilon^\sigma}}[\cos(\alpha X_\epsilon^\sigma(x))]$ and one can easily check that $\beta_\sigma \uparrow \infty$ as $\epsilon \downarrow 0$. Since this coupling constant renormalization leads to a well-defined limit potential, it is natural to expect that without renormalization of β the limit is trivial. This is the same statement as in Theorem 7.7, where we however use the continuum limit $z \to \infty$ without ultra-violet cut-off instead of the limit $\epsilon \downarrow 0$. We will continue this discussion in the following subsection.
(ii) Even though Theorem 7.7 does not come as a surprise, its interpretation is of some interest. If $z \to \infty$ the spatial fluctuations of sample paths of $X^z$ increase
Fig. 4. Sample paths of $X^z$ in the unit cube for z = 100 (see also Fig. 1).
rapidly, cf. Fig. 4. This leads to increasing oscillations of the function $\cos(\alpha X^z(x))$ and thus $\int_\Lambda \cos(\alpha X^z(x))\,dx$ integrates out to zero as $z \to \infty$.
(iii) For a different approach to the triviality of the sine-Gordon model without renormalization, based on random Colombeau distributions, see [8].
Another interesting approach to triviality in quantum field theory, which is less motivated by trigonometric interactions than Remark 7.8(ii) but probably works for all bounded non-renormalized interaction densities v, is to look at the “spatial” properties of the sample paths as depending on the strength of the ultra-violet singularity. Plots as in Fig. 3 at high scaling parameter are appropriate, cf. Fig. 5. Already for z = 1000 one can see that in the ultra-violet finite case (Fig. 5(a)) long range “Gaussian tails” dominate the sample path. Hence the fluctuations of the potential energy prevail in the scaling limit. In contrast to this, the ultra-violet divergent case exposes a strong “localization” of the path properties due to the “volume” of the singularities. Thus, each of the one hundred little squares with an average of 10 particles is “approximately independent” of its neighbors and contributes an amount proportional to the covered volume ∼1/100. One thus recognizes the regime of the law of large numbers, and the convergence of the potential to a constant (i.e. triviality of the interaction) is expected. Again, the uv-critical case (Fig. 5(b)) is just the uv-singularity strength of constructive quantum field theory in d = 2 dimensions.
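The mechanism behind Lemma 7.6 and Remark 7.8(ii) can be made visible in a toy computation. The one-point analogue of (79) reads $E[e^{i\alpha X^z(y)}] = \exp(\int \psi_z(\alpha G(x - y))\,dx)$, and this quantity tends to zero when $G \in L^1$ but $G \notin L^2$. The sketch below is an illustration under assumptions that are not taken from the text: dimension d = 1, the hypothetical kernel $G(x) = e^{-|x|}/\sqrt{|x|}$ (integrable, but not square integrable near the origin), and charges $\pm 1/\sqrt{z}$ with probability 1/2 each, for which $\psi_z(t) = -2z\sin^2(t/(2\sqrt{z}))$. The function names and the quadrature parameters are likewise choices made here.

```python
import math


def psi_z(t: float, z: float) -> float:
    """Levy characteristic for charges +-1/sqrt(z) (probability 1/2 each), activity z."""
    return -2.0 * z * math.sin(t / (2.0 * math.sqrt(z))) ** 2


def exponent(z: float, alpha: float = 1.0, n: int = 200_000, cutoff: float = 30.0) -> float:
    """Approximate the integral of psi_z(alpha * G(x)) over the real line.

    G(x) = exp(-|x|) / sqrt(|x|) is a toy kernel in L^1 but not in L^2 (d = 1).
    The substitution x = u^2 removes the singularity of G at the origin;
    a midpoint rule is used on 0 < u < sqrt(cutoff), and the factor 2
    accounts for x < 0 by symmetry.
    """
    u_max = math.sqrt(cutoff)
    h = u_max / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        g = math.exp(-u * u) / u          # G(x) at x = u^2
        total += psi_z(alpha * g, z) * 2.0 * u * h
    return 2.0 * total


if __name__ == "__main__":
    for z in (1.0, 1e2, 1e4):
        e = exponent(z)
        print(f"z = {z:g}: exponent = {e:.3f}, exp(exponent) = {math.exp(e):.3e}")
```

In this toy setting the exponent decreases roughly logarithmically in z, so the expectation of $e^{i\alpha X^z(y)}$ decays like a negative power of z rather than exponentially.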
Fig. 5. Density plot of the static field $\sim |x|^{-\alpha}$ of a two-dimensional system of 1000 non-interacting charged particles: (a) α = 0.2 (uv-finite scaling limit); (b) α = 1 (uv-critical); (c) α = 2 (uv-divergent).
7.4. Some remarks on the continuum limit for the sine-Gordon model
Here we give some remarks on the continuum limit for the sine-Gordon (sG) model in d = 2 dimensions with coupling constant renormalization, namely we show that the Boltzmann weights of the dual particle system converge to those of the Yukawa gas, which is dual to the sine-Gordon model, see e.g. [27]. We also comment on a simultaneous expansion in the coupling constant β and a re-scaled activity ζ and we show that the continuum limit yields convergence in the expansion’s coefficients.
A treatment which goes beyond these very preliminary results and investigates convergence in law of the “Poissonian” sine-Gordon models under the continuum limit would be desirable. But the technical details of such a treatment seem to be rather complicated, as is the case for the proof of the ultra-violet stability of the classical sine-Gordon model [24, 27–29]. It therefore goes beyond the scope of the present article.
We fix $G = G_{1/2,m_0}$, $G_1 = G * G$, cf. Proposition 2.1. We consider the interacting CPN with energy density $:\cos(\alpha t):^z = :\cos(\alpha t):^z_0$, cf. (77). Using the language of particle systems, we define the potential for the dual particle system with external source f as
$$U_z^{sG}(f; y_1, \ldots, y_n; \alpha_1, \ldots, \alpha_n) = \int_{\mathbb{R}^d} \Big[ -\psi_z\Big(G * f(x) + \sum_{l=1}^{n} \alpha_l G(x - y_l)\Big) + \sum_{l=1}^{n} \psi_z(\alpha_l G(x - y_l)) \Big]\,dx \qquad (83)$$
where $f \in S$, $y_1, \ldots, y_n \in \mathbb{R}^d$, $y_j \neq y_l$, $j \neq l$ and $\alpha_1, \ldots, \alpha_n \in \mathrm{supp}\,\nu \subseteq [-b, b]$. Here the integrals of the second term in (83) do not depend on the $y_l$ and these terms arise from the coupling constant renormalization (77). We also define
$$U_\sigma^{sG}(f; y_1, \ldots, y_n; \alpha_1, \ldots, \alpha_n) = \frac{\sigma^2}{2} \Big[ f * G_1 * f(0) + 2 \sum_{l=1}^{n} \alpha_l\, G_1 * f(y_l) + \sum_{l,j=1,\, j \neq l}^{n} \alpha_j \alpha_l\, G_1(y_j - y_l) \Big] \qquad (84)$$
which for $f = 0$ gives the Yukawa potential for particles with charges $\alpha_l$. We consider the f-dependent Boltzmann weights $e^{-\zeta U_z^{sG}(f; y_1, \ldots, y_n; \alpha_1, \ldots, \alpha_n)}$ for the dual particle system of the interacting CPN and $e^{-\zeta U_\sigma^{sG}(f; y_1, \ldots, y_n; \alpha_1, \ldots, \alpha_n)}$ for the Yukawa gas. Here $\zeta > 0$ is an inverse temperature for the dual particle systems and hence is a scaling factor for the activity (the intensity σ, respectively) for the quantum field systems, cf. Theorem 6.10. We get the following expansion in β and ζ for the characteristic functional of $\bar X_\Lambda^{(z,\zeta)}$, defined as the interacting CPN with sG-interaction and Lévy-characteristic $\zeta\psi_z$:
$$C_{\bar X_\Lambda^{(z,\zeta)}}(f) = \frac{1}{\Xi_\Lambda^{(z,\zeta)}} \sum_{l,n=0}^{\infty} \frac{(-\zeta)^l}{l!} \frac{(-\beta)^n}{n!} \int_{\Lambda^{\times n} \times [-b,b]^{\times n}} \big[U_z^{sG}(f; y_1, \ldots, y_n; \alpha_1, \ldots, \alpha_n)\big]^l\, dy_1 \cdots dy_n\, d\nu(\alpha_1) \cdots d\nu(\alpha_n). \qquad (85)$$
The related expansion for the partition function is obtained from the expansion of the numerator by setting $f = 0$. From the fact that $|U_z^{sG}(f; y_1, \ldots, y_n; \alpha_1, \ldots, \alpha_n)| \leq C(n, z, c)$ where $C(n, z, c)$ is linearly bounded in n, we get that the expansion (85) converges absolutely for any fixed $z < \infty$, independently of the dimension d.
For d = 2, the related expansion for the characteristic functional of the Gaussian sine-Gordon model exists term by term, which can be deduced from (84) and the fact that $G_1(x) \sim -\ln|x|/2\pi$ for $|x|$ small. It is known for the special case
ν = (δb + δ−b )/2 that if we sum up over l under the √ integral, then the series con2 verges absolutely for any β provided 0 < ζ < 2/σ 4πb, cf. [27]. From the analysis of that model it seems to us that after summing up n, at most asymptotic convergence in l can be expected, since ultra-violet divergences for ζ < 0 are more severe than in the case ζ > 0. This can be explained from the fact that the Yukawa gas at negative temperatures becomes unstable. Here we ignore the question of convergence and consider (85) as a formal power series in β and ζ. Proposition 7.9. With definitions as above (i) The f -dependent Boltzmann weights potentials UzsG of the dual particle system of CPN converge pointwisely to the f -dependent Boltzmann the interacting weights potentials UσsG of the Yukawa gas as z → ∞. (ii) For d = 2 the expansion (85) converges to the related expansion of the classical (“Gaussian”) sine-Gordon model, where UzsG is replaced by UσsG , in the sense of convergence of formal power series. Proof. (i) Using ψz (t) → −σ 2 t2 /2 as z → ∞, it is elementary to show n n n
$$-\psi_z\Big(\sum_{l=1}^{n} t_l + s\Big) + \sum_{l=1}^{n} \psi_z(t_l) \to \frac{\sigma^2}{2} \Big[ \sum_{j,l=1,\, l \neq j}^{n} t_l t_j + 2s \sum_{l=1}^{n} t_l + s^2 \Big] \quad \text{as } z \to \infty \qquad (86)$$
where $t_1, \ldots, t_n, s \in \mathbb{R}$. If we replace $t_l = \alpha_l G(x - y_l)$ and $s = G * f(x)$ we thus get the convergence of the left-hand side of (86) to the right-hand side whenever $x \neq y_l$, $l = 1, \ldots, n$. We note that under this replacement, the right-hand side of (86) integrated over $\mathbb{R}^d$ w.r.t. dx is just the right-hand side of (84). To prove the convergence of the right-hand side of (83) to the right-hand side of (84) for $y_j \neq y_l$, $l \neq j$, it is thus sufficient to show that the integrand in (83) has a uniform (in z) $L^1(\mathbb{R}^d, dx)$-bound. We note that $|\psi_z(t)| \leq \sigma^2 t^2/2$ and $|\psi_z'(t)| \leq \sigma^2 |t|$ for all $z > 0$. For $j = 1, \ldots, n$ we thus get that the modulus of the left-hand side of (86) can be estimated as follows:
ψ tl + s u + t j tl + s − ψz (tj ) du ... = 0 l=1 l=1 l=1 l=j
≤ σ2
n
l,p=1 l,p=j
|tl tp | + 2σ 2
l=j
n
l=1 l=j
l=j
n 3σ 2 |tl s| + s2 + tl . 2 2
(87)
l=1 l=j
If one replaces on the right-hand side $t_l$ with $\alpha_l G(x - y_l)$ and s with $G * f(x)$ one apparently gets a function of fast decay which is locally integrable on $\mathbb{R}^d \setminus \bigcup_{l=1,\, l \neq j}^{n} B_{R_l^j}(y_l)$ with $R_l^j = |y_l - y_j|/2$ by our assumption $y_j \neq y_l$, $j \neq l$.
A point $x \in \mathbb{R}^d$ is contained in such a set for j such that $|x - y_j| =$
$\min\{|x - y_l| : l = 1, \ldots, n\}$. Therefore, the union over $j = 1, \ldots, n$ of all such sets gives $\mathbb{R}^d$ and there is a global $L^1(\mathbb{R}^d, dx)$-majorant.
(ii) To obtain the convergence in terms of formal power series in (85) it suffices to prove the convergence of each expansion coefficient in the numerator and in the denominator (i.e. in the expansion of $\Xi_\Lambda^{(z,\zeta)}$), since the coefficients of the expansion of the fraction can be calculated from those of the numerator and denominator via a finite combinatorial expression (note that the zero order coefficient of the partition function is one). Furthermore, since the calculation for the partition function is a special case of the calculation for the numerator, namely $f = 0$, we only have to consider the latter.
By (i) we have pointwise convergence of the integrands in (85). For d = 2, n and l fixed, we can find an $L^l(\Lambda^{\times n} \times [-b, b]^{\times n}, d^{2n}x \otimes r^{\otimes n})$-majorant by integrating the majorant constructed in (i) over $\mathbb{R}^2$ w.r.t. dx. The first term on the right-hand side of (87) then gives rise to a term $\sum_{l,j=1,\, l \neq j} |\alpha_l \alpha_j|\, G_1(y_l - y_j)$ which is $L^p$-integrable in the variables $y_1, \ldots, y_l$ for any $p \geq 1$ since $G_1(x) \sim -\ln|x|/2\pi$ for small x. The terms involving s and $s^2$ in (87) trivially have the same property, since under the replacements as above the integration over dx can be estimated by $\sum_{l=1}^{n} |\alpha_l|\, G_1 * |f|(y_l)$ and $|f| * G_1 * |f|(0)$, which are manifestly bounded. Hence, the only really problematic term in (87) is the last one.
This term, $\sum_{l=1,\, l \neq j}^{n} \alpha_l^2 G(x - y_l)^2$, by the construction of the dx-majorant is integrated (in x) over $\mathbb{R}^2 \setminus \bigcup_{l=1,\, l \neq j}^{n} B_{R_l^j}(y_l)$. By applying Proposition 2.1(vi) to the case d = 2, α = 1/2, one gets $|G(x)| < c_{1/2}(2)/|x|$. We can thus dominate this term by $-C_1 \sum_{j,l=1,\, j \neq l}^{n} \ln(|y_j - y_l|)\, 1_{\{|y_j - y_l| < 1\}} + n^2 C_2$ for $C_1(\sigma, b), C_2(\sigma, b) > 0$ sufficiently large.
This establishes $L^p$, $p \geq 1$, integrability also for this last term and we can thus use the dx integral of the majorant found in (i) as an $L^l$-majorant needed to prove dominated convergence in each term of (85).
Acknowledgments
Discussions with Klaus R. Mecke on Sec. 4.2 and Tobias Kuna on Sec. 6 were very helpful for the indicated parts of the article. We also thank Martin Grothaus, Armin Seyfried and Jiang-Lun Wu for interesting discussions and an anonymous referee for reading the typescript very carefully. Financial support for the second-named author via DFG projects “Stochastic analysis and systems with infinitely many degrees of freedom” and “Stochastic methods in QFT”, and for the third-named author by the Grant-in-Aid Science Research No. 12640159 (Ministry of Education and Sciences, Japan) is gratefully acknowledged.
References [1] S. Albeverio, J. E. Fenstad, R. Høegh-Krohn and T. Lindstrøm, Nonstandard Methods in Stochastic Analysis and Mathematical Physics, Pure and Applied Math., 122 (Academic Press, New York, 1987).
[2] S. Albeverio and H. Gottschalk, Scattering theory for quantum fields with indefinite metric, Commun. Math. Phys. 216 (2001) 491. [3] S. Albeverio, H. Gottschalk and J.-L. Wu, Convoluted generalized white noise, Schwinger functions and their continuation to Wightman functions, Rev. Math Phys. 8(6) (1996) 763. [4] S. Albeverio, H. Gottschalk and J.-L. Wu, Models of local relativistic quantum fields with indefinite metric (in all dimensions), Commun. Math. Phys. 184 (1997) 509. [5] S. Albeverio, H. Gottschalk and J.-L. Wu, Nontrivial scattering amplitudes for some local relativistic quantum field models with indefinite metric, Phys. Lett. B405 (1997) 243. [6] S. Albeverio, H. Gottschalk and J.-L. Wu, Scattering behavior of quantum vector fields obtained from Euclidean covariant SPDEs, Rep. Math. Phys. 44(1/2) (1999) 21. [7] S. Albeverio, H. Gottschalk and M. W. Yoshida, Representing Euclidean Quantum fields as scaling limits of particle systems, J. Stat. Phys. 108(1/2) (2002) 631–639. [8] S. Albeverio, Z. Haba and F. Russo, A two-dimensional, semi-linear heat equation perturbed by white noise, Probab. Theory Relat. Fields 121 (2001) 319–366. [9] S. Albeverio and R. Høegh-Krohn, Uniqueness of the physical vacuum and the Wightman functions in the infinite volume limit for some non-polynomial interactions. Commun. Math. Phys. 30 (1973) 171. [10] S. Albeverio and R. Høegh-Krohn, The scattering matrix for some non-polynomial interactions I, Helv. Physica Acta 46 (1973) 504. [11] S. Albeverio and R. Høegh-Krohn, Euclidean Markov fields and relativistic quantum fields from stochastic partial differential equations, Phys. Lett. B177 (1986) 175. [12] S. Albeverio and R. Høegh-Krohn, Quaternionic non–Abelian relativistic quantum fields in four space–time dimensions, Phys. Lett. B189 (1987) 329. [13] S. Albeverio and R. Høegh-Krohn, Construction of interacting local relativistic quantum fields in four space–time dimensions, Phys. Lett. 
B200 (1988) 108–114, with erratum in ibid. B202 (1988) 621. [14] S. Albeverio, R. Høegh-Krohn, H. Holden and T. Kolsrud, Representation and construction of multiplicative noise, J. Funct. Anal. 87 (1989) 250. [15] S. Albeverio, K. Iwata and T. Kolsrud, Random fields as solutions of the inhomogenous quarternionic Cauchy-Riemann equation. I. Invariance and analytic continuation, Commun. Math. Phys. 132 (1990) 550. [16] S. Albeverio and J.-L. Wu, Euclidean random fields obtained by convolution from generalized white noise, J. Math. Phys. 36 (1995) 5217–5245. [17] G. Battle, Wavelets and Renormalization (World Scientific, Singapore/ New Jersy/London/Hong Kong, 1999). [18] H. Bauer, Mass- und Integrationstheorie (W. de Gruyter, Berlin/New York, 1990). [19] C. Becker, R. Gielerak and P. L ugievicz, Covariant SPDEs and quantum field structures, J. Phys. A (1998) 231. [20] C. Berg and G. Forst, Potential Theory on Locally Compact Abelian Groups (SpringerVerlag, Berlin/Heidelberg/New York, 1975). [21] H.-J. Borchers, Algebraic aspects of quantum field theory, in Int. Symp. Math. Probl. Theor. Phys., Kyoto 1975, Lect. Notes Phys. 39 (1975) 283–292. [22] P. Colella and O. E. Lanford, Sample Path Behavior for the Free Markov Field, Lecture Notes Phys. 25 (Springer, Berlin, 1973), p. 44. [23] A. De Masi and E. Presutti, Mathematical Methods for Hydrodynamic Limits, LNM 1501 (Springer-Verlag, Berlin/Heidelberg/New York, 1991).
March 29, 2005 8:59 WSPC/148-RMP
J070-00232
Particle Systems in the GCE, Scaling and QFT
225
[24] C. Deutsch and M. Lavaud, Equilibrium properties of a two-dimensional Coulomb gas, Phys. Rev. A9 (1974) 2598. [25] S.-H. Djah, H. Gottschalk and H. Ouerdiane, Feynman graph representation for the perturbation series for general functional measures, math-ph/0408031, to appear in J. Funct. Anal. [26] R. Fern` andez, J. Fr¨ ohlich and A. D. Sokal, Random Walks, Critical Phenomena and triviality in Quantum Field Theory (Springer-Verlag, Berlin/Heidelberg/New York, 1992). [27] J. Fr¨ ohlich, Classical and quantum statistical mechanics in one and two dimensions: Two-component Yukawa — and Coulomb systems, Commun. Math. Phys. 47 (1976) 233. [28] J. Fr¨ ohlich and Y. M. Park, Remarks on exponential interactions and the quantum sine-Gordon equation in two space-time dimensions, Helv. Phys. Acta 50 (1977) 315. [29] J. Fr¨ ohlich and E. Seiler, The massive Thirring–Schwinger model (QED)2 : Convergence and perturbation structure, Helv. Phys. Acta 49 (1976) 889. [30] I. M. Gelfand and N. Ya. Vilenkin, Generalized Functions, IV. Some Applications of Harmonic Analysis (Academic Press, New York/London, 1964). [31] H.-O. Georgii, Gibbs measures and phase transitions (W. de Gruyter, Berlin/ New York, 1988). [32] R. Gielerak and P. L ugiewicz, From stochastic differential equation to quantum field theory, Rep. Math. Phys. 44(1/2) (1999) 101. [33] R. Gielerak and P. L ugiewicz, 4D local quantum field theory models from covariant stochastic partial differential equations, Rev. Math. Phys. 13(3) (2001) 335–408. [34] J. Glimm and A. Jaffe, Quantum Physics: A Functional Integral Point of View, 2nd edn. (Springer, Berlin/Heidelberg/New York, 1987). [35] J. Glimm and A. Jaffe, Positivity of the φ43 Hamiltonian, Fortschr. Phys. 21 (1973) 327. [36] O. W. Greenberg, Generalized free fields and models of local field theory, Ann. Phys. 16 (1969) 158. [37] H. Gottschalk, Particle systems with weakly attractive interaction, SFB 611 preprint Bonn 2002, math-ph/0409029. [38] H. 
Gottschalk, Wick rotation for holomorphic random fields, in Recent Developments in Stochastic Analysis and Related Topics, World Scientific, Singapore 2004, Proceedings of the First Sino-German Conference on Stochastic Analysis, eds. S. Albeverio, Z.-M. Ma and M. Rckner. [39] M. Grothaus and L. Streit, Construction of relativistic quantum fields in the framework of white noise analysis, J. Math. Phys. 40(11) (1999) 5387. [40] P. R. Halmos, Measure Theory, 2nd edn. (Springer Verlag, Berlin/Heidelberg, 1976). [41] G. E. Johnson, Interacting quantum fields, Rev. Math. Phys. 11(7) (1999) 881, with Erratum ibid. 12 (2000) 687. [42] Yu. Kondratiev and T. Kuna, Correlation functionals for Gibbs measures and Ruelle bounds, Methods Funct. Anal. Topol. 9(1) (2003) 9–58. [43] A. Lenard, Correlation functions and the uniqueness of the state in classical statistical mechanics, Commun. Math. Phys. 30 (1973) 35. [44] K. R. Mecke, Integral geometry in statistical physics, Int. J. Mod. Phys. 12(9) (1998) 861. [45] R. A. Minlos, Generalized Random Processes and their Extension in Measure, Translations in Mathematical Statistics and Probability, Vol. 3 (AMS Providence, 1963), p. 291.
March 29, 2005 8:59 WSPC/148-RMP
226
J070-00232
S. Albeverio, H. Gottschalk & M. W. Yoshida
[46] G. Morchio and F. Strocchi, Infrared singularities, vacuum structure and pure phases in local quantum field theory, Ann. Inst. H. Poincar´ e 33 (1980) 251. [47] E. Nelson, Construction of quantum fields from Markoff fields, J. Funct. Anal. 12 (1973) 97. [48] E. Nelson, The free Markoff field, J. Funct. Anal. 12 (1973) 211. [49] K. Osterwalder and R. Schrader, Axioms for Euclidean Green’s functions I, Commun. Math. Phys. 31 (1973) 83. [50] C. Preston, Random Fields, LNM 534 (Springer, Berlin/Heidelberg/New York, 1976). [51] M. Reed and J. Rosen, Support properties of the free measure for the Boson field, Commun. Math. Phys. 36 (1974) 123. [52] D. Ruelle, Statistical mechanics — rigorous results (Benjamin, London/Amsterdam/ Don Mills (Ontario)/Sydney/Tokyo, 1969). [53] L. A. Santal` o, Integral geometry and geometric probability (Addison–Wesley, Reading, 1976). [54] B. Schroer, Infrateilchen in der Quantenfeldtheorie, Fortschr. Phys. 173 (1963) 1527. [55] B. Simon, The P (φ)2 Euclidean (Quantum) Field Theory (Princeton University Press, Princeton, New Jersey, 1974). [56] D. Stoyan, W. S. Kendall and J. Mecke, Stochastic geometry and it’s applications (Wiley & Sons, 1987). [57] F. Strocchi, Selected Topics on the General Properties of Quantum Field Theory, Lecture Notes in Physics 51 (World Scientific, Singapore/New York/London/ Hong Kong, 1993). [58] R. F. Streater and A. S. Wightman, PCT, Spin, Statistics and All That (Benjamin, New York, 1964). [59] K. Symanzik, Euclidean quantum field theory, in Local Quantum Theory, ed. R. Jost (Academic Press, New York, 1969). [60] H. Tamura, On the possibility of confinement caused by nonlinear electromagnetic interaction, J. Math. Phys. 32 (1991) 897. [61] M. W. Yoshida, Non-linear continuous maps on abstract Wiener spaces defined on space of tempered distributions, Bull Univ. Electro-Commun. 12 (1999) 101–117.
May 19, 2005 1:20 WSPC/148-RMP
J070-00234
Reviews in Mathematical Physics Vol. 17, No. 3 (2005) 227–311 © World Scientific Publishing Company
CONSERVATION OF THE STRESS TENSOR IN PERTURBATIVE INTERACTING QUANTUM FIELD THEORY IN CURVED SPACETIMES
STEFAN HOLLANDS∗ and ROBERT M. WALD†
Enrico Fermi Institute and Department of Physics, University of Chicago, 5640 S. Ellis Avenue, Chicago, IL 60637, USA
∗[email protected]
†[email protected]

Received 20 April 2004
Revised 12 January 2005

We propose additional conditions (beyond those considered in our previous papers) that should be imposed on Wick products and time-ordered products of a free quantum scalar field in curved spacetime. These conditions arise from a simple “Principle of Perturbative Agreement”: for interaction Lagrangians $L_1$ such that the interacting field theory can be constructed exactly (as occurs when $L_1$ is a “pure divergence” or when $L_1$ is at most quadratic in the field and contains no more than two derivatives), time-ordered products must be defined so that the perturbative solution for interacting fields obtained from the Bogoliubov formula agrees with the exact solution. The conditions derived from this principle include a version of the Leibniz rule (or “action Ward identity”) and a condition on time-ordered products that contain a factor of the free field $\varphi$ or the free stress-energy tensor $T_{ab}$. The main results of our paper are (1) a proof that in spacetime dimensions greater than 2, our new conditions can be consistently imposed in addition to our previously considered conditions and (2) a proof that, if they are imposed, then for any polynomial interaction Lagrangian $L_1$ (with no restriction on the number of derivatives appearing in $L_1$), the stress-energy tensor $\Theta_{ab}$ of the interacting theory will be conserved. Our work thereby establishes (in the context of perturbation theory) the conservation of stress-energy for an arbitrary interacting scalar field in curved spacetimes of dimension greater than 2. Our approach requires us to view time-ordered products as maps taking classical field expressions into the quantum field algebra rather than as maps taking Wick polynomials of the quantum field into the quantum field algebra.

Keywords: Quantum field theory on curved space; renormalization theory; stress tensor; perturbation theory.
Contents

1. Introduction
2. The Nature and Properties of Time-Ordered Products
   2.1. The construction of the free quantum field algebra and the nature of time-ordered products
   2.2. Properties of time-ordered products: Axioms T1–T9
3. The Leibniz Rule
   3.1. Formulation of the Leibniz rule, T10, and proof of consistency with axioms T1–T9
   3.2. Anomalies with respect to the equations of motion
4. Quadratic Interaction Lagrangians and Retarded Response
   4.1. Formulation of the general condition T11
   4.2. External source variation: Axiom T11a
   4.3. Metric variation: Axiom T11b
   4.4. External potential variation
5. Some Key Consequences of Our New Requirements
   5.1. Consequences for the free field
   5.2. Consequences for interacting fields
6. Proof that There Exists a Prescription for Time-Ordered Products Satisfying T11a and T11b in Addition to T1–T10
   6.1. Proof that T11a can be satisfied
   6.2. Proof that T11b can be satisfied when $D > 2$
      6.2.1. Proof that $D_n$ is supported on the total diagonal
      6.2.2. Proof that $D_n$ is a c-number
      6.2.3. Proof that $D_n$ is local and covariant and scales almost homogeneously
      6.2.4. Proof that $D_n = 0$ when one of the $\Phi_i$ is equal to $\varphi$
      6.2.5. Proof that $D_n$ satisfies a wave front set condition and depends smoothly and analytically on the metric
      6.2.6. Proof that $D_n$ is symmetric when $\Phi_1 = T_{ab}$
      6.2.7. Proof that $D_n$ can be absorbed in a redefinition of the time-ordered products
7. Outlook
Appendix A. Infinitesimal Retarded Variations
Appendix B. Functional Derivatives
1. Introduction

In [13] and [14], we took an axiomatic approach toward defining Wick powers and time-ordered products of a quantum scalar field, $\varphi$, in curved spacetime. We provided a list of axioms that these quantities are required to satisfy (see conditions T1–T9 of [14] or Sec. 2 below) and then succeeded in proving both their uniqueness (up to specified renormalization ambiguities) [13] and their existence [13, 14]. Our previous analysis restricted attention to the case where the Wick powers and the factors appearing in the time-ordered products do not contain derivatives of the scalar field $\varphi$. In fact, however, as we already noted in [13, 14], our uniqueness and existence results extend straightforwardly to the case where the Wick powers and the factors appearing in the time-ordered products are arbitrary polynomial expressions in $\varphi$ and its derivatives. (Footnote 1: Axiom T9 was explicitly stated in [14] only for the case of expressions that do not contain derivatives. Its generalization to expressions with derivatives is given in Sec. 2 below.) We excluded the explicit consideration of expressions containing derivatives partly for simplicity but also because it was clear to us that additional axioms should be imposed on these quantities and,
consequently, stronger uniqueness and existence theorems should be proven, but it was not clear to us precisely what form these additional axioms should take. The main purpose of this paper is to provide these additional axioms, to investigate some of their consequences (most notably, conservation of the stress-energy of the interacting field), and to prove the desired stronger existence and uniqueness results for our new strengthened set of axioms.

Some simple examples should serve to illustrate the issues involved in determining what additional conditions should be imposed. One obvious possible requirement is the “Leibniz rule”. Consider, for example, the Wick monomials $\varphi^2$ and $\varphi\nabla_a\varphi$ in $D = 4$ spacetime dimensions. The uniqueness theorem of [13] applies to both of these expressions. It establishes that the first is unique up to the addition of $c_1 R\,\mathbb{1}$, where $c_1$ is an arbitrary constant, $R$ denotes the scalar curvature, and $\mathbb{1}$ is the identity. Similarly, the second is unique up to the addition of $c_2 \nabla_a R\,\mathbb{1}$, where $c_2$ is an independent arbitrary constant. However, it would be natural to require that

\[ \nabla_a \varphi^2 = 2\varphi\nabla_a\varphi, \tag{1} \]
where the left side denotes the distributional derivative of $\varphi^2$. If we wished to impose Eq. (1), then we would need to strengthen our previous existence theorem to show that Eq. (1) can be imposed in addition to our previous axioms. (This is easily done.) Our above uniqueness result would then be strengthened: under the allowed redefinitions the two sides of Eq. (1) shift by $c_1 \nabla_a R\,\mathbb{1}$ and $2c_2 \nabla_a R\,\mathbb{1}$, respectively, so we would have $c_1 = 2c_2$, i.e., $c_1$ and $c_2$ would no longer be independent. Note that the Leibniz rule Eq. (1) has an obvious generalization to arbitrary Wick polynomials, but it is not so obvious, a priori, what form the Leibniz rule should take on factors occurring in time-ordered products.

A second “obvious” requirement that one might attempt to impose on Wick polynomials and time-ordered products is that they respect the equations of motion of the free field $\varphi$. Consider the case of a massless Klein–Gordon field, so that $\nabla^a\nabla_a\varphi = 0$. Then it would seem natural to require the vanishing of any Wick monomial containing a factor of $\nabla^a\nabla_a\varphi$, such as the Wick monomials $\varphi\nabla^a\nabla_a\varphi$ and $(\nabla_b\varphi)(\nabla^a\nabla_a\varphi)$. Similarly, it would be natural to require the vanishing of any time-ordered product with the property that any of its arguments contains a factor of this form. However, as we will explicitly prove in Sec. 3 below, it is not possible to impose this “wave equation” requirement together with the Leibniz rule requirement of the previous paragraph.

Should one impose the Leibniz rule or the free equations of motion (or neither of them) on Wick polynomials or time-ordered products? If the Leibniz rule is imposed, what form should it take for time-ordered products? Should any conditions be imposed in addition to the Leibniz rule or, alternatively, to the free equations of motion? In this paper, we will take the view that these and other similar questions should not be answered by attempting to make aesthetic arguments concerning properties of Wick polynomials and time-ordered products for the free field theory defined by the free Lagrangian $L_0$.
Rather, we will consider the properties of the interacting quantum field theory defined by adding to $L_0$ an interaction Lagrangian
density $L_1$, which may contain an arbitrary (but finite) number of powers of $\varphi$ and its derivatives. As discussed in detail in, e.g., [15, Sec. 3] (see also Sec. 4.1 below), an arbitrary interacting quantum field $\Phi_{L_1}$ (with $\Phi$ denoting an arbitrary polynomial in $\varphi$ and its derivatives) is defined perturbatively by the Bogoliubov formula, which expresses $\Phi_{L_1}$ in terms of the free-field time-ordered products with factors composed of $\Phi$ and $L_1$. The basic idea of this paper is to invoke the following simple principle, which we will refer to as the “Principle of Perturbative Agreement”: if the interaction Lagrangian $L_1$ is such that the quantum field theory defined by the full Lagrangian $L_0 + L_1$ can be solved exactly, then the perturbative construction of the quantum field theory must agree with the exact construction.

There are two separate cases in which this principle yields nontrivial conditions. The first is where the interaction Lagrangian corresponds to a pure “boundary term”, i.e., in differential forms notation, the interaction Lagrangian is of the form $dB$, where $B$ is a smooth $(D-1)$-form of compact support depending polynomially on $\varphi$ and its derivatives. Such an “interaction” produces an identically vanishing contribution to the action, and the interacting quantum field theory is therefore identical to the free theory. As we shall show in Sec. 3.1, the requirement that all perturbative corrections vanish for any interaction Lagrangian of the form $dB$ yields precisely the Leibniz rule for Wick polynomials and a generalization of the Leibniz rule for time-ordered products. This generalization states that, in effect, derivatives can be freely commuted through the “time ordering”. We will refer to this condition as the generalized Leibniz rule and will label it as “T10”. Our condition T10 corresponds to the “action Ward identity” proposed in [18, 9] and proven recently in the context of flat spacetime theories in [10].
In order for condition T10 to be mathematically consistent, it is necessary that we adopt the viewpoint of [2] and [8] (which we already adopted in [15] for other reasons) that time-ordered products are maps from classical field expressions (on which the classical equations of motion are not imposed) into the quantum algebra of observables. This viewpoint and the reasons that necessitate its adoption are explained in detail in Sec. 2. A proof that condition T10 can be consistently imposed in addition to conditions T1–T9 is given in Sec. 3.1.

The second case where the above principle yields nontrivial conditions is where the interaction Lagrangian is at most quadratic in the field and contains a total of at most two derivatives. This includes interaction Lagrangians consisting of terms of the form $J\varphi$, $V\varphi^2$, and $h^{ab}\nabla_a\varphi\nabla_b\varphi$, corresponding to the presence of an external classical source, a spacetime variation of the mass, and a variation of the spacetime metric. In all of these cases, the exact quantum field algebra of the theory with Lagrangian $L_0 + L_1$ can be constructed directly, in a manner similar to the theory with Lagrangian $L_0$. Our demand that perturbation theory reproduce this construction yields new, nontrivial conditions on time-ordered products (which are most conveniently formulated in terms of retarded products). The general form of this requirement, which we label as “T11”, is formulated in Sec. 4.1. A useful
infinitesimal version of this condition for the case of an external current interaction, which we label as condition T11a, is derived in Sec. 4.2, and a corresponding infinitesimal version for the case of a metric variation, which we label as condition T11b, is derived in Sec. 4.3.

The consequences of our additional conditions are investigated in Sec. 5. The main results proven there, which also constitute some of the main results of this paper, are that our conditions imply the following: (i) the free stress-energy tensor, $T_{ab}$, in the free quantum theory must be conserved; (ii) for an arbitrary polynomial interaction Lagrangian, $L_1$, (a) the interacting quantum field $\varphi_{L_1}$ always satisfies the interacting equations of motion and (b) the interacting stress-energy tensor, $\Theta^{ab}_{L_1}$, of the interacting theory always is conserved. This is rather remarkable in that, a priori, one might have expected properties (i) and (ii) to be entirely independent of conditions T1–T11. Indeed, one might have expected that if one required that (i) and (ii) be satisfied in perturbation theory, one would obtain a further set of requirements on Wick polynomials and time-ordered products. The fact that no additional conditions are actually needed provides confirmation that T10 and T11 are the appropriate conditions that are needed to supplement our original conditions T1–T9. In effect, the analysis of Sec. 5 shows the following: suppose that the definition of time-ordered products satisfies T1–T10. Then, if the definition of time-ordered products is further adjusted, if necessary, so that in perturbation theory the quantum field satisfies the correct field equation in the presence of an arbitrary classical current source $J$ (as required by T11a), then the interacting field also will satisfy the correct field equation for an arbitrary self-interaction.
Furthermore, if, in perturbation theory, the stress-energy tensor remains conserved in the presence of an arbitrary metric variation (as is a consequence of T11b), it also will remain conserved in the presence of an arbitrary self-interaction.

Finally, in Sec. 6, we prove that condition T11a and (in spacetimes of dimension $D > 2$) condition T11b can be consistently imposed, in addition to conditions T1–T10. The proof that condition T11a can be consistently imposed is relatively straightforward, and is presented in Sec. 6.1. The proof that condition T11b also can be imposed when $D > 2$ is much more complex technically, and is presented in Sec. 6.2.7. Despite its complexity, the proof is logically straightforward except for a significant subtlety that is treated in Sec. 6.2.6. Here we find that a potential obstruction to satisfying T11b arises from the requirement that time-ordered products containing more than one factor of the stress-energy tensor be symmetric in these factors. We show that this potential obstruction does not actually occur for the theory of a scalar field, as treated here. However, this need not be the case for other fields, and, indeed, it presumably is the underlying cause of the inability to impose stress-energy conservation in certain parity violating theories in curved spacetimes of dimension $D = 4k + 2$, as found in [1]. For scalar fields, we are thereby able to show that condition T11b can be consistently imposed in curved spacetimes of dimension $D > 2$. However, for $D = 2$ a further difficulty arises from the simple fact that the freedom to modify the definition of $\varphi\nabla_a\nabla_b\varphi$ by the addition of an
arbitrary local curvature term does not give rise to a similar freedom to modify the definition of $T_{ab}$, and we find that, as a consequence, condition T11b cannot be satisfied for a scalar field in $D = 2$ dimensions.

It is our view that conditions T1–T11 provide the complete characterization of Wick polynomials and time-ordered products of a quantum scalar field in curved spacetime.

Notation and Conventions. Our notation and conventions generally follow those of our previous papers [13]–[15]. The spacetime dimension is denoted as $D$, and $(M, g)$ always denotes an oriented, globally hyperbolic spacetime. We denote by $\epsilon = \sqrt{-g}\, dx^0 \wedge \cdots \wedge dx^{D-1}$ the volume element (viewed as a $D$-form, or density of weight 1) associated with $g$. Abstract index notation is used wherever it does not result in exceedingly many indices. However, abstract index notation is generally not used for $g = g_{ab}$ and $\epsilon = \epsilon_{ab\cdots c}$.

2. The Nature and Properties of Time-Ordered Products

2.1. The construction of the free quantum field algebra and the nature of time-ordered products

Consider a scalar field $\varphi$ on an arbitrary globally hyperbolic spacetime, $(M, g)$, with classical action

\[ S_0 = \int L_0 = -\frac{1}{2} \int \left( g^{ab}\nabla_a\varphi\nabla_b\varphi + m^2\varphi^2 + \xi R\varphi^2 \right) \epsilon. \tag{2} \]

The equations of motion derived from this action have unique fundamental advanced and retarded solutions $\Delta_{\rm adv/ret}(x, y)$ satisfying

\[ (\nabla^a\nabla_a - m^2 - \xi R)\,\Delta_{\rm adv/ret} = \delta, \tag{3} \]

together with the support property

\[ {\rm supp}\,\Delta_{\rm adv/ret} \subset \{ (x, y) \in M \times M \mid x \in J^{-/+}(y) \}, \tag{4} \]
where $J^{-/+}(S)$ is the causal past/future of a set $S$ in spacetime. Here we view the distribution kernel of $\Delta_{\rm adv/ret}$ as undensitized, i.e., acting on test densities rather than scalar test functions; that is, we view $\Delta_{\rm adv/ret}$ as a linear map from compactly supported, smooth densities to smooth scalar functions. (Footnote 2: Consequently, the delta-distribution in Eq. (3) is also undensitized.)

The quantum theory of the field $\varphi$ is defined by constructing a suitable *-algebra of observables as follows: we start with the free *-algebra with identity $\mathbb{1}$ generated by the formal expressions $\varphi(f)$ and $\varphi(h)^*$, where $f, h$ are smooth compactly supported densities on $M$. Now factor this free *-algebra by the following relations:

(i) $\varphi(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1\varphi(f_1) + \alpha_2\varphi(f_2)$, with $\alpha_1, \alpha_2 \in \mathbb{C}$;
(ii) $\varphi(f)^* = \varphi(\bar f)$;
(iii) $\varphi((\nabla^a\nabla_a - m^2 - \xi R)f) = 0$; and
(iv) $\varphi(f_1)\varphi(f_2) - \varphi(f_2)\varphi(f_1) = i\Delta(f_1, f_2)\mathbb{1}$, where $\Delta$ denotes the causal propagator for the Klein–Gordon operator,

\[ \Delta = \Delta_{\rm adv} - \Delta_{\rm ret}. \tag{5} \]
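As a quick consistency check (a sketch added for illustration, not part of the original argument), relations (ii)–(iv) fit together provided the causal propagator is real, antisymmetric, and a bisolution of the Klein–Gordon equation. The computation below assumes the standard facts that $\Delta_{\rm adv}(x,y) = \Delta_{\rm ret}(y,x)$ and that both kernels are real; the symbol $P$ is shorthand introduced here for the operator $\nabla^a\nabla_a - m^2 - \xi R$.

```latex
% Antisymmetry of the causal propagator, matching the antisymmetry of the commutator in (iv):
\begin{align*}
\Delta(f_2, f_1)
  &= \Delta_{\rm adv}(f_2, f_1) - \Delta_{\rm ret}(f_2, f_1)
   = \Delta_{\rm ret}(f_1, f_2) - \Delta_{\rm adv}(f_1, f_2)
   = -\Delta(f_1, f_2).
\end{align*}
% Compatibility of (iv) with the *-operation (ii), using (ab)^* = b^* a^*:
\begin{align*}
[\varphi(f_1), \varphi(f_2)]^*
  &= [\varphi(f_2)^*, \varphi(f_1)^*]
   = [\varphi(\bar f_2), \varphi(\bar f_1)]
   = i\Delta(\bar f_2, \bar f_1)\,\mathbb{1} \\
  &= -i\,\Delta(\bar f_1, \bar f_2)\,\mathbb{1}
   = -i\,\overline{\Delta(f_1, f_2)}\,\mathbb{1}
   = \bigl( i\Delta(f_1, f_2)\,\mathbb{1} \bigr)^*.
\end{align*}
% Compatibility of (iv) with the field equation (iii): P is formally self-adjoint and
% P\Delta_{\rm adv} = \delta = P\Delta_{\rm ret}, so \Delta is annihilated by P:
\begin{align*}
[\varphi(P f_1), \varphi(f_2)] = i\Delta(P f_1, f_2)\,\mathbb{1} = 0 .
\end{align*}
```

Without these three properties of $\Delta$, the quotient by relations (i)–(iv) would force additional, unintended identities between the generators.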
We refer to the algebra, $\mathcal{A}(M, g)$, defined by relations (i)–(iv) as the CCR-algebra (for “canonical commutation relations”). Quantum states on the CCR-algebra $\mathcal{A}$ are simply linear maps $\omega$ from $\mathcal{A}$ into $\mathbb{C}$ that are normalized in the sense that $\omega(\mathbb{1}) = 1$ and that are positive in the sense that $\omega(a^* a)$ is non-negative for any $a \in \mathcal{A}$. This algebraic notion of a quantum state corresponds to the usual notion of a state as a normalized vector in a Hilbert space as follows: given a representation, $\pi$, of $\mathcal{A}$ on a Hilbert space, $\mathcal{H}$ (so that each $a \in \mathcal{A}$ is represented as a linear operator $\pi(a)$ on $\mathcal{H}$), any normalized vector state $|\psi\rangle \in \mathcal{H}$ defines a state $\omega$ in the above sense via taking expectation values, $\omega(a) = \langle\psi|\pi(a)|\psi\rangle$. Conversely, given a state, $\omega$, the GNS construction establishes that one can always find a Hilbert space, $\mathcal{H}$, a representation, $\pi$, of $\mathcal{A}$ on $\mathcal{H}$, and a vector $|\psi\rangle \in \mathcal{H}$ such that $\omega(a) = \langle\psi|\pi(a)|\psi\rangle$.

By construction, the only observables contained in $\mathcal{A}$ are the correlation functions of the quantum field $\varphi$. Even if we were only interested in considering the free quantum field defined by the action Eq. (2), there are observables of interest that are not contained in $\mathcal{A}$, such as the stress-energy tensor of the quantum field

\[ T^{ab} = 2\epsilon^{-1} \frac{\delta L_0}{\delta g_{ab}} = \nabla^a\varphi\nabla^b\varphi - \frac{1}{2} g^{ab}\nabla^c\varphi\nabla_c\varphi - \frac{1}{2} g^{ab} m^2\varphi^2 + \xi \left[ G^{ab}\varphi^2 - 2\nabla^a(\varphi\nabla^b\varphi) + 2 g^{ab}\nabla^c(\varphi\nabla_c\varphi) \right]. \tag{6} \]
We will refer to any polynomial expression, $\Phi$, in $\varphi$ and its derivatives as a “Wick polynomial”. All Wick polynomials, such as $T_{ab}$, that involve quadratic or higher order powers of $\varphi$ are intrinsically ill defined on account of the distributional character of $\varphi$. It is natural, however, to try to interpret Wick polynomials as arising from “unsmeared” elements of $\mathcal{A}$ that are then made well defined via some sort of “regularization” procedure. In Minkowski spacetime, a suitable regularization is accomplished by “normal ordering”, which can be interpreted in terms of a subtraction of expectation values in the Minkowski vacuum state. However, in curved spacetime, regularization via “vacuum subtraction” is, in general, neither available (since there will, in general, not exist a unique, preferred “vacuum state”) nor appropriate (since the resulting Wick polynomials will fail to be local, covariant fields [13]).

The necessity of going beyond observables in $\mathcal{A}$ becomes even more clear if one attempts to construct the theory of a self-interacting field (with a polynomial self-interaction) in terms of a perturbation expansion off of a free field theory. First, the interaction Lagrangian, $L_1$, itself will be a Wick polynomial and thereby corresponds to an observable that does not lie in $\mathcal{A}$. Second, the $n$th order perturbative corrections to $\varphi$ (or, more generally, the $n$th order perturbative corrections to any Wick monomial $\Phi$) are formally given by the Bogoliubov formula (see Eq. (91)
below), which expresses the Wick monomial $\Phi_{L_1}$ for the interacting field as a sum of $\Phi$ and correction terms involving the “time-ordered products” of expressions containing one factor of $\Phi$ and $n$ factors of $L_1$. For the case of two Wick monomials, $\Phi_1$ and $\Phi_2$, the time-ordered product is formally given by

\[ T(\Phi_1(x_1)\Phi_2(x_2)) = \vartheta(x_1^0 - x_2^0)\,\Phi_1(x_1)\Phi_2(x_2) + \vartheta(x_2^0 - x_1^0)\,\Phi_2(x_2)\Phi_1(x_1), \tag{7} \]
where $\vartheta$ denotes the step function. (The formal generalization of Eq. (7) to time-ordered products with $n$ arguments is straightforward.) However, even if Wick monomials have been suitably defined, the time-ordered product (7) is not well defined since the Wick monomials also have a distributional character, and taking their product with a step function is, in general, ill defined. Nevertheless, in Minkowski spacetime, time-ordered products can be defined by well-known renormalization procedures.

Thus, the perturbative construction of the quantum field theory of an interacting field requires the definition of Wick polynomials and time-ordered products, both of which necessitate enlarging the algebra of observables beyond the original CCR-algebra, $\mathcal{A}$. These steps were successfully carried out in [13, 14], based upon prior results obtained in [4, 5]. The first key step is to construct an algebra of observables, $\mathcal{W}(M, g)$, which is large enough to contain all Wick polynomials and time-ordered products. To do so, consider the following expressions in $\mathcal{A}(M, g)$:

\[ W_n(u) = \int u(x_1, \ldots, x_n)\, {:}\prod_{i}^{n} \varphi(x_i){:}_\omega \equiv \int u(x_1, \ldots, x_n)\, \frac{\delta^n}{i^n\, \delta f(x_1) \cdots \delta f(x_n)}\, e^{i\varphi(f) + \frac{1}{2}\omega_2(f, f)} \bigg|_{f=0}, \qquad u \in C_0^\infty, \tag{8} \]
where $\omega_2$ is the two-point function of an arbitrarily chosen Hadamard state. Thus, $\omega_2$ is a distribution on $M \times M$ with antisymmetric part equal to $(i/2)\Delta$, satisfying the spectrum condition given in Eq. (31) and satisfying the Klein–Gordon equation in each entry, i.e., $(P \otimes 1)\omega_2 = 0 = (1 \otimes P)\omega_2$, where $P$ is the Klein–Gordon operator associated with $L_0$,

\[ P = \nabla^a\nabla_a - m^2 - \xi R. \tag{9} \]
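To make the generating-functional definition concrete, the two lowest orders can be worked out by hand. The sketch below reads Eq. (8) as $W_n(u) = \int u\, \frac{\delta^n}{i^n \delta f(x_1)\cdots\delta f(x_n)}\, e^{i\varphi(f) + \frac{1}{2}\omega_2(f,f)}\big|_{f=0}$, treats the unsmeared field $\varphi(x)$ formally, and uses only relation (iv) together with the stated antisymmetric part of $\omega_2$, i.e., $\omega_2(x_1,x_2) - \omega_2(x_2,x_1) = i\Delta(x_1,x_2)$.

```latex
% Expand the generating functional to second order in f:
\begin{align*}
e^{i\varphi(f) + \frac{1}{2}\omega_2(f,f)}
  &= \mathbb{1} + i\varphi(f) - \tfrac{1}{2}\varphi(f)^2
     + \tfrac{1}{2}\omega_2(f,f)\,\mathbb{1} + O(f^3).
\end{align*}
% n = 1: one functional derivative, divided by i
\begin{align*}
{:}\varphi(x){:}_\omega &= \varphi(x).
\end{align*}
% n = 2: two functional derivatives, divided by i^2 = -1
\begin{align*}
{:}\varphi(x_1)\varphi(x_2){:}_\omega
  &= \tfrac{1}{2}\bigl\{\varphi(x_1), \varphi(x_2)\bigr\}
     - \tfrac{1}{2}\bigl[\omega_2(x_1,x_2) + \omega_2(x_2,x_1)\bigr]\mathbb{1} \\
  &= \Bigl(\varphi(x_1)\varphi(x_2) - \tfrac{i}{2}\Delta(x_1,x_2)\,\mathbb{1}\Bigr)
     - \Bigl(\omega_2(x_1,x_2) - \tfrac{i}{2}\Delta(x_1,x_2)\Bigr)\mathbb{1} \\
  &= \varphi(x_1)\varphi(x_2) - \omega_2(x_1,x_2)\,\mathbb{1}.
\end{align*}
```

The second-order result is the familiar subtraction of the two-point function: its expectation value vanishes in any state whose two-point function is $\omega_2$, and it is symmetric in $x_1, x_2$ even though neither term is symmetric on its own.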
It follows from the above relations (i)–(iv) in the CCR-algebra that $W_n(u)^* = W_n(\bar u)$, and that

\[ W_n(u) \cdot W_m(u') = \sum_{2k \le m+n} W_{n+m-2k}(u \otimes_k u'), \tag{10} \]

where the “$k$-times contracted tensor product” $\otimes_k$ is defined by

\[ (u \otimes_k u')(x_1, \ldots, x_{n+m-2k}) \overset{\rm def}{=} S\, \frac{n!\, m!}{(n-k)!\,(m-k)!\,k!} \int_{M^{2k}} u(y_1, \ldots, y_k, x_1, \ldots, x_{n-k}) \times u'(y_{k+1}, \ldots, y_{2k}, x_{n-k+1}, \ldots, x_{n+m-2k}) \prod_{i=1}^{k} \omega_2(y_i, y_{k+i})\, \epsilon(y_i)\, \epsilon(y_{k+i}), \tag{11} \]
where $S$ denotes symmetrization in $x_1, \ldots, x_{n+m-2k}$. If either $m < k$ or $n < k$, then the contracted tensor product is defined to be zero. The above product formula can be recognized as Wick’s theorem for normal ordered products. The enlarged algebra $\mathcal{W}(M, g)$ is now obtained by allowing not only compactly supported smooth functions $u \in C_0^\infty$ as arguments of $W_n(u)$ but more generally any distribution $u$ in the space

\[ \mathcal{E}'_n(M, g) = \{ u \in \mathcal{D}'(M^n) \mid {\rm WF}(u) \cap (V^+)^n = {\rm WF}(u) \cap (V^-)^n = \emptyset \}. \tag{12} \]
Here, $V^{+/-} \subset T^*M$ is the union of all future resp. past lightcones in the cotangent space over $M$, and ${\rm WF}(u)$ is the wave front set [16] of a distribution $u$. The key point is that the Hadamard property of $\omega_2$ and the wave front set condition imposed on $u$ and $u'$ in the definition of the spaces $\mathcal{E}'_n(M, g)$ are necessary and sufficient to show that the distribution products appearing in the contracted tensor product are well defined and give a distribution in the desired class $\mathcal{E}'_{m+n-2k}(M, g)$. Note that the definition of the algebra $\mathcal{W}(M, g)$ a priori depends on the choice of $\omega_2$. However, it can be shown [13] that different choices give rise to *-isomorphic algebras. Thus, as an abstract algebra, $\mathcal{W}(M, g)$ is independent of the choice of $\omega_2$.

Although the algebra $\mathcal{W}(M, g)$ is “large enough” to contain all Wick polynomials and time-ordered products, the above construction does not determine which elements of $\mathcal{W}(M, g)$ correspond to given Wick polynomials or time-ordered products. (In particular, the normal-ordered quantities $W_n$, Eq. (8), with $u$ taken to be a smooth function of one variable times a delta-function, clearly do not provide an acceptable definition of Wick powers, since they fail to define local, covariant fields [13].) In [13, 14], an axiomatic approach was then taken to determine which elements of $\mathcal{W}$ correspond to given Wick polynomials and time-ordered products. In other words, rather than attempting to define Wick polynomials and time-ordered products by the adoption of some particular regularization scheme, we provided a list of properties that these quantities should satisfy. We proved the existence of Wick polynomials and time-ordered products satisfying these properties and also proved their uniqueness up to expected renormalization ambiguities.
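Returning briefly to the product formula, Eqs. (10) and (11): its simplest instance, $n = m = 1$, can be checked directly. The sketch below assumes smooth, compactly supported $u, u'$, writes $\varphi(u)$ for $W_1(u)$ and $\omega_2(u,u')$ for $\int \omega_2(y_1,y_2)\,u(y_1)\,u'(y_2)\,\epsilon(y_1)\,\epsilon(y_2)$ (shorthand introduced here), and uses the fact that the $n = 2$ normal-ordered product defined by Eq. (8) is the field product minus the two-point function.

```latex
% Only k = 0 and k = 1 satisfy 2k <= m + n = 2, and both combinatorial factors equal 1:
\begin{align*}
W_1(u) \cdot W_1(u') &= W_2(u \otimes_0 u') + W_0(u \otimes_1 u'), \\
W_2(u \otimes_0 u') &= \varphi(u)\varphi(u') - \omega_2(u,u')\,\mathbb{1}, \\
W_0(u \otimes_1 u') &= \omega_2(u,u')\,\mathbb{1}.
\end{align*}
% Adding the two contributions recovers the ordinary product in the CCR-algebra:
\begin{align*}
W_2(u \otimes_0 u') + W_0(u \otimes_1 u') = \varphi(u)\varphi(u') = W_1(u) \cdot W_1(u').
\end{align*}
```

In this lowest case the formula simply says that a product of two fields equals its normal-ordered part plus a single contraction, which is the content of Wick's theorem at second order.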
As already discussed in the previous section, one of the main purposes of the present paper is to supplement this list of axioms with additional conditions applicable to Wick polynomials and time-ordered products containing derivatives, and to prove correspondingly stronger existence and uniqueness theorems. We will shortly review the axioms that we previously gave in [13, 14]. However, before doing so, we shall explain a subtle but important shift in our viewpoint on the nature of Wick polynomials and time-ordered products. A Wick polynomial is a distribution, valued in the quantum field algebra W that corresponds to a polynomial expression in the classical field ϕ and its derivatives. It is therefore natural to consider the classical algebra, Cclass , of real polynomial expressions in the (unsmeared) classical field ϕ(x) and its derivatives, where we impose all of the normal rules of algebra (such as the associative, commutative, and distributive laws) and tensor calculus (such as the Leibniz rule) to the expressions in Cclass , and, in addition, we impose the wave equation on ϕ, i.e., we set
$$(\nabla^a \nabla_a - m^2 - \xi R)\varphi(x) = 0.$$
It would then be natural to view Wick polynomials as maps from Cclass into distributions with values in W. However, this viewpoint on Wick polynomials is, in general, inconsistent because of the existence of anomalies. Indeed, we already mentioned in the introduction that (as we will explicitly show in Sec. 3.2 below) under our other assumptions, it will not be consistent to set to zero all Wick monomials containing a factor of $(\nabla^a \nabla_a - m^2 - \xi R)\varphi(x)$, even though elements of Cclass that contain such a factor vanish. This difficulty has a simple remedy: we can instead define a classical field algebra of polynomial expressions in the unsmeared field ϕ(x) and its derivatives where we no longer impose the wave equation. More precisely, let Vclass denote the real vector space of all classical polynomial tensor expressions³ involving ϕ, its symmetrized covariant derivatives⁴ $(\nabla)^k\varphi$, the metric, and arbitrary curvature tensors C,
$$V_{\rm class} = \mathrm{span}_{\mathbb{R}}\{\Phi = C \cdot (\nabla)^{r_1}\varphi \cdots (\nabla)^{r_k}\varphi;\ k, r_i \in \mathbb{N}\} \tag{13}$$
where, as in the case of Cclass, we impose all of the normal rules of algebra and tensor calculus on the expressions in Vclass, but now we do not impose the field equation associated with $L_0$. We denote a generic monomial element in Vclass by the capital Greek letter Φ. We also introduce the space Fclass of all classical D-form functionals of the metric g, the field ϕ and its derivatives, depending in addition on compactly supported (complex) tensor fields f,
$$F_{\rm class} = \mathrm{span}\{ A(x) = \epsilon(x)\,\nabla_{c_1}\cdots\nabla_{c_m} f^{a_1\cdots a_r}(x)\,\Phi_{a_1\cdots a_r}{}^{c_1\cdots c_m}(x) \mid f \text{ smooth, compactly supported tensor field on } M;\ \Phi \text{ a monomial in } V_{\rm class}\}. \tag{14}$$
Again, we do not assume in the definition of Fclass that the classical equations of motion for ϕ hold. In particular, we do not assume that expressions such as $f(\nabla^a\nabla_a - \xi R - m^2)\varphi$ are set to zero. We will often suppress the tensor indices and write a classical D-form functional A ∈ Fclass simply as
$$A = f\Phi \in F_{\rm class}, \tag{15}$$
or $A = [(\nabla)^k f]\Phi$, if we want to emphasize that the functional depends on derivatives of f. We then view the Wick polynomials as linear maps from Fclass into W. Following [2] and [8], we previously explicitly adopted the above viewpoint on Wick polynomials in [15]. This viewpoint does not constitute a significant departure from standard viewpoints, but merely provides a clearer framework for discussing anomalies.

³ The coefficients of these polynomial expressions may have arbitrary polynomial dependence on the dimensionful parameter m² and may have arbitrary analytic dependence on the dimensionless parameter ξ. However, we will not normally explicitly write these possible dependences on the parameters appearing in the theory.

⁴ The notation $(\nabla)^k t_{bc\cdots d}$ is a shorthand for the symmetrized kth derivative of a tensor, $\nabla_{(a_1}\cdots\nabla_{a_k)} t_{bc\cdots d}$. We may write any expression containing k derivatives of a tensor field $t_{bc\cdots d}$ in terms of symmetrized derivatives of $t_{bc\cdots d}$ of kth and lower order and curvature.

However, as we shall now explain, our viewpoint on time-ordered products, which corresponds to the viewpoint taken in [9], does constitute a significant departure from viewpoints that are commonly taken. As indicated above (see Eq. (7)), it would appear natural to view the time-ordered product, T, in n factors as a multilinear map taking Wick polynomials into W. Indeed, our previous papers [13–15] contain the phrase “Wick powers and their time-ordered products” in many places. However, the untenability of this view can be seen from the following simple example. Consider the quantum field theory defined by the classical Lagrangian density $L = L_0 + L_1$, with
$$L_1 = f P\varphi \tag{16}$$
for some smooth function, f, of compact support, where P stands for the Klein–Gordon operator associated with $L_0$, Eq. (9). The classical equations of motion arising from the Lagrangian L are simply
$$P\varphi = Pf \tag{17}$$
i.e., ϕ satisfies the inhomogeneous wave equation with smooth source J = Pf. Clearly, the interacting quantum field, $\varphi_{L_1}$, should also satisfy the inhomogeneous wave equation with source $J = Pf\,\mathbb{1}$. By inspection, it follows that $\varphi_{L_1}$ should be given in terms of the free quantum field ϕ by
$$\varphi_{L_1} = \varphi + f\,\mathbb{1}. \tag{18}$$
Note that this interacting quantum field theory has a trivial S-matrix (since $\varphi_{L_1} = \varphi$ outside of the support of f), but the local field $\varphi_{L_1}$ is, of course, affected by f in the region where f ≠ 0. Now compare Eq. (18) with what is obtained from perturbation theory. As already noted above, in perturbation theory, $\varphi_{L_1}$ is equal to the free quantum field ϕ(x), plus a sum of correction terms, where the nth order correction term involves the quantity
$$\int T\Big(\varphi(x)\prod_{i=1}^{n} P\varphi(y_i)\Big)\, f(y_1)\cdots f(y_n)\,\epsilon(y_1)\cdots\epsilon(y_n). \tag{19}$$
Since P ϕ = 0, it would appear that perturbation theory yields ϕL1 = ϕ rather than Eq. (18). Consequently, we are put in the position of having to choose (at least) one of the following three possibilities: (1) the exact solution (18) for the interacting field is wrong; (2) the Bogoliubov formula for the interacting quantum field is wrong, at least in the case of interactions involving derivatives of the field; (3) the time-ordered product Eq. (19) can be nonvanishing even though the Wick monomial P ϕ vanishes. In our view, choices (1) and (2) are far more unacceptable than (3), and we therefore choose option (3). The results of this paper (specifically, the existence theorem of Sec. 6), will establish that it is mathematically consistent to make this choice.
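The exact solution (18), against which the perturbative expression is compared above, can be verified in one line. The check below is only a sketch; it assumes nothing beyond the linearity of P, the free field equation $P\varphi = 0$, and the fact that $f\,\mathbb{1}$ is a c-number multiple of the identity, with $J = Pf$ as in Eq. (17).

```latex
\begin{aligned}
P\varphi_{L_1} &= P(\varphi + f\,\mathbb{1}) = P\varphi + (Pf)\,\mathbb{1} = (Pf)\,\mathbb{1} = J\,\mathbb{1}, \\
[\varphi_{L_1}(x), \varphi_{L_1}(y)] &= [\varphi(x), \varphi(y)], \\
\varphi_{L_1}(x) &= \varphi(x) \quad \text{for } x \notin \operatorname{supp} f .
\end{aligned}
```

Since this solution is exactly linear in f, agreement with perturbation theory would require the first-order correction to supply the term $f\,\mathbb{1}$, which is possible only if the time-ordered product in Eq. (19) does not vanish for n = 1.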
Thus, we do not view the time-ordered products (with n factors) as an n-times multilinear map on Wick polynomials but rather as an n-times multilinear map
$$T_g : \underbrace{F_{\rm class} \times \cdots \times F_{\rm class}}_{n\ \text{factors}} \to W(M, g), \tag{20}$$
$$(f_1\Phi_1, \dots, f_n\Phi_n) \mapsto T_g\Big(\prod_{i=1}^n f_i\Phi_i\Big). \tag{21}$$
We note that, for a fixed choice of monomials Φi ∈ Vclass, we get a multilinear functional $(f_1, \dots, f_n) \mapsto T(\prod_j f_j\Phi_j)$ mapping test functions on M to the algebra W. In the following, we will sometimes use the more suggestive informal integral notation⁵
$$T\Big(\prod_{i=1}^n f_i\Phi_i\Big) = \int T(\Phi_1(x_1)\cdots\Phi_n(x_n))\, f_1(x_1)\cdots f_n(x_n) \tag{22}$$
for this multilinear map. Note that this notation is exactly analogous to the usual informal integral notation for distributions $u(f) = \int u(x)f(x)$ acting on test densities f. The Wick monomials are simply time-ordered products with a single factor, and we will use the notation
$$T(f\Phi) = \Phi(f) = \int \Phi(x)f(x) \tag{23}$$
for these objects. Note, however, that we will not use the much more standard notation $T(\prod_i \Phi_i(f_i))$ for time-ordered products, since this would suggest that the time-ordered products are functions of the Wick monomials $\Phi_j(f_j)$ rather than of the classical functionals $f_j\Phi_j$ of the field ϕ. We turn now to a review of the properties satisfied by time-ordered products.

2.2. Properties of time-ordered products: Axioms T1–T9

In [14], we imposed a list of requirements on time-ordered products. Since a time-ordered product in a single factor is just a Wick polynomial, these requirements on time-ordered products also apply to Wick polynomials. In addition, Wick polynomials are further restricted by the requirement that if A ∈ Fclass is independent of ϕ, i.e., if A is of the form
$$A = (\nabla)^m f\, C \tag{24}$$
for some test tensor field f and some monomial C in the Riemann tensor and its derivatives, then the corresponding Wick polynomial is given by
$$T(A) = \int_M (\nabla)^m f\, C \cdot \mathbb{1}, \tag{25}$$
where $\mathbb{1}$ is the identity element in W. Similarly, if A = fϕ, then we require that
$$T(f\varphi) = \varphi(f), \tag{26}$$
where ϕ(f) is the free quantum field, i.e., the algebra element in W obeying the relations (i)–(iv) above.

⁵ Note that we do not need to specify an integration element in Eq. (22) since the quantities $f_i\Phi_i$ already have the character of a density.

For the convenience of the reader, we now provide the list of axioms given in [14]. We refer the reader to [13] and [14] for further discussion of the motivation for these conditions as well as further discussion of their meaning and implications.

T1 locality/covariance. The time-ordered products are local, covariant fields, in the following sense. Consider an isometric embedding χ of a spacetime (M′, g′) into a spacetime (M, g) (i.e., $g' = \chi^* g$) preserving the causality structure, and let $\alpha_\chi : W(M', g') \to W(M, g)$ be the corresponding algebra homomorphism. Then time-ordered products are required to satisfy
$$\alpha_\chi\Big[T_{g'}\Big(\prod_i f_i\Phi_i\Big)\Big] = T_g\Big(\prod_i (\chi_* f_i)\Phi_i\Big). \tag{27}$$
Here, $\chi_* f$ denotes the compactly supported tensor field on M obtained by pushing forward the compactly supported tensor field f on M′ via the map χ. [For example, $\chi_* f(x) = f(\chi^{-1}(x))$ if f is scalar and x is in the image of χ.] In particular, for the (scalar) Wick products, the requirement reads
$$\alpha_\chi[\Phi_{g'}(x)] = \Phi_g(\chi(x)) \quad \text{for all } x \in M'. \tag{28}$$
T2 scaling. The time-ordered products scale “almost homogeneously” under rescalings $g \to \lambda^{-2}g$ of the spacetime metric in the following sense. Let $T_g$ be a local, covariant time-ordered product with n factors, and let $S_\lambda T_g$ be the rescaled local, covariant field given by $S_\lambda T_g \equiv \lambda^{-Dn}\sigma_\lambda T_{\lambda^{-2}g}$, where $\sigma_\lambda : W(M, \lambda^{-2}g) \to W(M, g)$ is the canonical isomorphism defined in [13, Lemma 4.2]. The scaling requirement on the time-ordered product is then that there is some N such that
$$\frac{\partial^N}{\partial^N \ln\lambda}\big[\lambda^{-d_T} S_\lambda T_g\big] = 0. \tag{29}$$
Here, $d_T$ is the engineering dimension of the time-ordered product, defined as⁶ $d_T = \sum_i d_{\Phi_i}$, with
$$d_\Phi = \frac{D-2}{2} \times \#(\text{factors of } \varphi) + \#(\text{derivatives}) + 2 \times \#(\text{factors of curvature}) + \#(\text{“up” indices}) - \#(\text{“down” indices}), \tag{30}$$
where D is the dimension of the spacetime M.

⁶ The rule for assigning an engineering dimension to a field is obtained by requiring that the classical action be invariant under scaling. Formula (30) holds only for scalar field theory, i.e., in other theories, the dimension of the basic field(s) may be different from (D − 2)/2.

T3 microlocal spectrum condition. Let ω be any continuous state on W(M, g), so that, as shown in [12], ω has smooth truncated n-point functions for n ≠ 2 and a two-point function $\omega_2(f_1, f_2) = \omega(\varphi(f_1)\varphi(f_2))$ of Hadamard form, i.e., $\mathrm{WF}(\omega_2) \subset C_+(M, g)$, where
$$C_+(M, g) = \{(x_1, k_1; x_2, -k_2) \in T^*M^2 \setminus \{0\} \mid (x_1, k_1) \sim (x_2, k_2);\ k_1 \in (V^+)_{x_1}\}. \tag{31}$$
Here the notation $(x_1, k_1) \sim (x_2, k_2)$ means that $x_1$ and $x_2$ can be joined by a null-geodesic and that $k_1$ and $k_2$ are cotangent and coparallel to that null-geodesic. $(V^+)_x$ is the future lightcone at x. Furthermore, let
$$\omega_T(x_1, \dots, x_n) = \omega\Big[T\Big(\prod_{i=1}^n \Phi_i(x_i)\Big)\Big]. \tag{32}$$
Then we require that
$$\mathrm{WF}(\omega_T) \subset C_T(M, g), \tag{33}$$
where the set $C_T(M, g) \subset T^*M^n \setminus \{0\}$ is described as follows (we use the graphological notation introduced in [4, 5]): let G(p) be a “decorated embedded graph” in (M, g). By this we mean an embedded graph ⊂ M whose vertices are points $x_1, \dots, x_n \in M$ and whose edges, e, are oriented null-geodesic curves. Each such null-geodesic is equipped with a coparallel, cotangent covector field $p_e$. If e is an edge in G(p) connecting the points $x_i$ and $x_j$ with i < j, then s(e) = i is its source and t(e) = j its target. It is required that $p_e$ is future/past directed if $x_{s(e)} \notin J^\pm(x_{t(e)})$. With this notation, we define
$$C_T(M, g) = \Big\{(x_1, k_1; \dots; x_n, k_n) \in T^*M^n \setminus \{0\} \;\Big|\; \exists \text{ decorated graph } G(p) \text{ with vertices } x_1, \dots, x_n \text{ such that } k_i = \sum_{e:\,s(e)=i} p_e - \sum_{e:\,t(e)=i} p_e \ \ \forall i \Big\}. \tag{34}$$
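The counting rule of Eq. (30), which enters the scaling axiom T2 through $d_T = \sum_i d_{\Phi_i}$, is simple enough to evaluate mechanically. The following Python sketch is an illustration only: the function names and the keyword-argument interface are hypothetical and not taken from the text, and, as footnote 6 notes, the rule applies to scalar field theory only.

```python
from fractions import Fraction


def engineering_dimension(D, n_phi, n_derivatives=0, n_curvature=0,
                          n_up=0, n_down=0):
    """Engineering dimension d_Phi of a monomial, following Eq. (30).

    D             -- spacetime dimension
    n_phi         -- number of factors of the scalar field
    n_derivatives -- number of covariant derivatives
    n_curvature   -- number of factors of curvature
    n_up, n_down  -- number of free "up" and "down" tensor indices
    """
    return (Fraction(D - 2, 2) * n_phi
            + n_derivatives
            + 2 * n_curvature
            + n_up
            - n_down)


def product_dimension(D, factors):
    """d_T of a time-ordered product: the sum of d_Phi over its factors.

    `factors` is a list of keyword dictionaries, one per factor.
    """
    return sum((engineering_dimension(D, **kw) for kw in factors),
               Fraction(0))


if __name__ == "__main__":
    # phi^2 in four dimensions
    print(engineering_dimension(4, n_phi=2))                                  # 2
    # (nabla_a phi)(nabla_b phi): two derivatives, two free "down" indices
    print(engineering_dimension(4, n_phi=2, n_derivatives=2, n_down=2))       # 2
    # R phi^2: one factor of curvature
    print(engineering_dimension(4, n_phi=2, n_curvature=1))                   # 4
    # a single field in three dimensions
    print(engineering_dimension(3, n_phi=1))                                  # 1/2
```

Exact rational arithmetic is used because (D − 2)/2 is half-integral in odd dimensions.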
T4 smoothness. The functional dependence of the time-ordered products on the spacetime metric, g, is such that if the metric is varied smoothly, then the time-ordered products vary smoothly, in the following sense. Consider a family of metrics $g^{(s)}$ depending smoothly upon a set of parameters s in a parameter space P. Furthermore, let $\omega^{(s)}$ be a family of Hadamard states with smooth truncated n-point functions (n ≠ 2) depending smoothly on s and with two-point functions $\omega_2^{(s)}$ depending smoothly on s in the sense that, when viewed as a distribution jointly in $(s, x_1, x_2)$, we have
$$\mathrm{WF}(\omega_2^{(s)}) \subset \{(s, \rho; x_1, k_1; x_2, k_2) \in T^*(P \times M^2) \setminus \{0\} \mid (x_1, k_1; x_2, k_2) \in C_+(M, g^{(s)})\}, \tag{35}$$
where the family of cones $C_+(M, g^{(s)})$ is defined by Eq. (31) in terms of the family $g^{(s)}$. Then we require that the family of distributions given by
$$\omega_T^{(s)}(x_1, \dots, x_n) = \omega^{(s)}\Big[T_{g^{(s)}}\Big(\prod_{i=1}^n \Phi_i(x_i)\Big)\Big] \tag{36}$$
(viewed as distributions in the variables $(s, x_1, \dots, x_n)$) depends smoothly on s with respect to the sets $C_T(M, g^{(s)})$ defined in Eq. (34), in the sense that
$$\mathrm{WF}(\omega_T^{(s)}) \subset \{(s, \rho; x_1, k_1; \dots; x_n, k_n) \in T^*(P \times M^n) \setminus \{0\} \mid (x_1, k_1; \dots; x_n, k_n) \in C_T(M, g^{(s)})\}. \tag{37}$$
We similarly demand that the time-ordered products also have a smooth dependence upon the parameters m², ξ in the free theory.

T5 analyticity. Similarly, we require that, for an analytic family of analytic metrics (depending analytically upon a set of parameters), the expectation value of the time-ordered products in an analytic family of states⁷ varies analytically in the same sense as in T4, but with the smooth wave front set replaced by the analytic wave front set. We similarly demand an analytic dependence upon the parameters m², ξ.

⁷ As explained in [13, Remark (2), p. 311], it suffices to consider a suitable analytic family of linear functionals on W that do not necessarily satisfy the positivity condition required for states.

T6 symmetry. The time-ordered products are symmetric under a permutation of the factors.

T7 unitarity. Let $\bar T(\prod_i f_i\Phi_i) = [T(\prod_i \bar f_i\Phi_i)]^*$, Φi ∈ Vclass, be the “anti-time-ordered” product. Then we require
$$\bar T(f_1\Phi_1 \cdots f_n\Phi_n) = \sum_{I_1 \sqcup \cdots \sqcup I_j = \{1,\dots,n\}} (-1)^{n+j}\, T\Big(\prod_{i \in I_1} f_i\Phi_i\Big) \cdots T\Big(\prod_{j \in I_j} f_j\Phi_j\Big), \tag{38}$$
where the sum runs over all partitions of the set {1, . . . , n} into pairwise disjoint subsets $I_1, \dots, I_j$.

T8 causal factorization. For time-ordered products with more than one factor, we require the following causal factorization rule, which reflects the time-ordering of the factors. Consider a set of test functions $(f_1, \dots, f_n)$ and a partition of {1, . . . , n} into two non-empty disjoint subsets I and $I^c$, with the property that no point $x_i \in \operatorname{supp} f_i$ with i ∈ I is in the past of any of the points $x_j \in \operatorname{supp} f_j$ with $j \in I^c$, that is, $x_i \notin J^-(x_j)$ for all i ∈ I and $j \in I^c$. Then the corresponding time-ordered product factorizes in the following way:
$$T\Big(\prod_{k=1}^n f_k\Phi_k\Big) = T\Big(\prod_{i \in I} f_i\Phi_i\Big)\, T\Big(\prod_{j \in I^c} f_j\Phi_j\Big). \tag{39}$$
In the case of 2 factors, this requirement reads (in the informal notation introduced above)
$$T(\Phi(x)\Psi(y)) = \begin{cases} \Phi(x)\Psi(y) & \text{when } x \notin J^-(y); \\ \Psi(y)\Phi(x) & \text{when } y \notin J^-(x). \end{cases} \tag{40}$$

T9 commutator. The commutator of a time-ordered product with a free field is given by lower order time-ordered products times suitable commutator functions, namely
$$\Big[T\Big(\prod_{i=1}^n f_i\Phi_i\Big), \varphi(F)\Big] = i\sum_{i=1}^n T\Big(f_1\Phi_1 \cdots (\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi} \cdots f_n\Phi_n\Big), \tag{41}$$
where $\Delta = \Delta_{\rm adv} - \Delta_{\rm ret}$ is the causal propagator (commutator function), and where we are using the notation $(\Delta F)(x) = \int_M \Delta(x, y)F(y)$ for the action of the causal propagator on a smooth density F of compact support.⁸ Here, the functional derivative, δA/δϕ ∈ Fclass, of an arbitrary element A ∈ Fclass is given by
$$\frac{\delta A}{\delta\varphi} = \sum_r (-1)^r \nabla_{(a_1}\cdots\nabla_{a_r)} \frac{\partial A}{\partial(\nabla_{(a_1}\cdots\nabla_{a_r)}\varphi)}. \tag{42}$$
This formula corresponds to the usual “Euler–Lagrange”-type expression familiar from the calculus of variations; see Appendix B for further discussion.

⁸ As previously noted at the beginning of this section, when writing expressions like ΔF or likewise $\Delta_{\rm adv/ret}F$, we take the point of view that the Green’s functions are linear maps from smooth compactly supported densities on M to smooth scalar functions on M.

Remark: In [14], condition T9 was explicitly stated only for the case where each Φi has no dependence on derivatives of ϕ. Equation (41) is the appropriate generalization to arbitrary Φi. For the case of a Wick power (i.e., a time-ordered product in one argument), Eq. (41) can be motivated by the requirement of maintaining the desired relationship between Poisson brackets and commutators.

The main results of [13] and [14] are that there exists a definition of time-ordered products that satisfies conditions T1–T9 and that, furthermore, this definition is unique up to the expected renormalization ambiguities. Our goal now is to impose additional conditions appropriate to time-ordered products whose factors Φi depend upon derivatives of ϕ, and to then prove the corresponding existence and uniqueness theorems. These additional conditions will arise from the following basic principle already stated in the introduction:

Principle of Perturbative Agreement: If the interaction Lagrangian $L_1 = \sum_i f_i\Phi_i$ (where each $f_i$ is smooth and of compact support and each Φi ∈ Vclass) is such that the quantum field theory defined by the full Lagrangian $L_0 + L_1$ can be solved exactly, then the perturbative construction of the quantum field theory as defined by the Bogoliubov formula must agree with the exact construction.

3. The Leibniz Rule

3.1. Formulation of the Leibniz rule, T10, and proof of consistency with axioms T1–T9

Our first new requirement arises from considering a classical functional A ∈ Fclass of the form
$$A = dB, \tag{43}$$
where B is in the analog of the space Fclass (see Eq. (14)) but with “D-form” replaced by “(D − 1)-form”, and where d is the exterior differential (mapping (D − 1)-forms to D-forms). An example of such a B is
$$B_{a_1\cdots a_{D-1}} = f^b \epsilon_{b a_1\cdots a_{D-1}} \Phi \tag{44}$$
where $f^b$ is a test vector field and Φ is a scalar element of Vclass. The general such B would be of a similar form, except that Φ and f could have additional tensor indices, derivatives could act on f, and ε could be contracted with an index of Φ rather than an index of f. For B of the form Eq. (44), A would take the explicit form
$$A = (\nabla_c f^c)\Phi + f^c \nabla_c\Phi. \tag{45}$$
Classically, the Lagrangian L = L0 + A defines the same theory as the Lagrangian L0 . Consequently, the “interacting” quantum field theory defined by the interaction Lagrangian L1 = A should coincide with the free quantum field theory, i.e., all of the perturbative corrections should vanish for an interaction Lagrangian of this form. To ensure this, we shall now add the following condition to our list of axioms of the previous section: T10 Leibniz rule. Let A ∈ Fclass be any classical functional of the form Eq. (43). Then for all fi Φi ∈ Fclass , we require that T(Af1 Φ1 · · · fn Φn ) = 0,
(46)
i.e., any time-ordered product containing a factor of A = dB must vanish. Remark: Condition T10 has previously been proposed (in the context of quantum field theory in flat spacetime) in [18] and [9, 10] and is referred to as the “action Ward identity” in these references.
Condition T10 for time-ordered products with two or more factors is clearly necessary and sufficient for the vanishing of all perturbative corrections to the interacting fields (including the interacting time-ordered products). However, causal factorization (T8) then implies that condition T10 must hold for Wick powers as well. Thus, condition T10 is necessary and sufficient to guarantee that the theory defined perturbatively by the interaction Lagrangian L1 = dB yields exactly the free theory. However, it may not be obvious what, if anything, condition T10 has to do with the “Leibniz rule”, so we shall now explain the relationship of this condition to more usual formulations of the Leibniz rule. By doing so, we will also clarify our notation and further elucidate the viewpoint on time-ordered products introduced in the previous section. Consider, first, the case of time-ordered products in one factor, i.e., Wick powers, in which case condition T10 simply states that for all B, T(dB) = 0.
(47)
Therefore, for the case in which B is given by Eq. (44), and hence A is given by Eq. (45), we obtain
$$T((\nabla_a f^a)\Phi + f^a \nabla_a\Phi) = 0 \tag{48}$$
for all scalar Φ ∈ Vclass and all test vector fields $f^a$. It should be understood here that $\nabla_a\Phi$ represents the classical expression corresponding to taking the derivative of Φ. For example, if $\Phi = \varphi^\alpha$ for some natural number α, then $\nabla_a\Phi = \alpha\varphi^{\alpha-1}\nabla_a\varphi$. But $T((\nabla_a f^a)\Phi)$ is the same thing as the distributional derivative of −T(Φ) smeared with $f^a$. Hence, using our notation T(fΦ) = Φ(f) for Wick powers, we may re-write Eq. (48) as
$$\nabla_a\Phi(f^a) = (\nabla_a\Phi)(f^a). \tag{49}$$
Here, the quantity $\nabla_a\Phi$ appearing on the left side of this equation represents the distributional derivative of the algebra-valued distribution Φ, whereas the quantity $(\nabla_a\Phi)$ appearing on the right side of this equation represents the Wick polynomial associated with the classical quantity $\nabla_a\Phi$. (Note that since these logically distinct quantities look the same except for the parentheses, our notation Φ(f) for Wick powers would be unacceptable if Eq. (48) were not imposed!) Thus, in the above example where $\Phi = \varphi^\alpha$, Eq. (49) takes the form
$$\nabla_a\varphi^\alpha(f^a) = \alpha(\varphi^{\alpha-1}\nabla_a\varphi)(f^a), \tag{50}$$
or, in the more common, informal notation
$$\nabla_a[\varphi^\alpha(x)] = \alpha(\varphi^{\alpha-1}\nabla_a\varphi)(x). \tag{51}$$
Again, the left side of this equation denotes the distributional derivative of ϕα , so this equation does indeed correspond to the usual notion of the Leibniz rule.
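The identification used above, that smearing Φ with $\nabla_a f^a$ gives minus the distributional derivative of Φ smeared with $f^a$, can be illustrated with a purely classical, one-dimensional analogue. The following Python sketch is not part of the construction in the text: the bump function, the sample profile `phi`, and the trapezoidal quadrature are all illustrative assumptions. It checks numerically that $\int f'\,\varphi^\alpha\,dx + \int f\,\alpha\varphi^{\alpha-1}\varphi'\,dx \approx 0$, the classical counterpart of Eq. (48) for $\Phi = \varphi^\alpha$.

```python
import math


def bump(x):
    """Smooth test function supported in (-1, 1) (illustrative choice)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))


def bump_prime(x):
    """Exact derivative of bump(x)."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2


def phi(x):
    """Sample smooth classical field profile (hypothetical)."""
    return 2.0 + math.sin(3.0 * x)


def phi_prime(x):
    """Exact derivative of phi(x)."""
    return 3.0 * math.cos(3.0 * x)


def integrate(g, a=-1.0, b=1.0, n=20000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for k in range(1, n):
        total += g(a + k * h)
    return total * h


def leibniz_pieces(alpha):
    """Return the two terms of the classical analogue of Eq. (48).

    lhs = integral of f'(x) * phi(x)**alpha
    rhs = integral of f(x) * alpha * phi(x)**(alpha - 1) * phi'(x)
    Their sum should vanish because f has compact support.
    """
    lhs = integrate(lambda x: bump_prime(x) * phi(x) ** alpha)
    rhs = integrate(
        lambda x: bump(x) * alpha * phi(x) ** (alpha - 1) * phi_prime(x)
    )
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = leibniz_pieces(3)
    print("sum of the two terms:", lhs + rhs)
```

In the quantum setting the same identity is not automatic for Wick powers; it is precisely what condition T10 imposes.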
Analogous results hold for the relations for Wick powers arising from T10 for general forms of B. The meaning of requirement T10 for time-ordered products with more than one factor can be seen as follows. Again, for simplicity, let B be of the form Eq. (44). Condition T10 states that for all $h_i\Psi_i$ ∈ Fclass, we have
$$-T\Big((\nabla_c f^c)\Phi \prod_j h_j\Psi_j\Big) = T\Big(f^c(\nabla_c\Phi)\prod_j h_j\Psi_j\Big). \tag{52}$$
In the more common, informal notation this equation can be re-written as
$$\nabla_a^y[T(\Phi(y)\Psi_1(x_1)\cdots\Psi_n(x_n))] = T((\nabla_a\Phi)(y)\Psi_1(x_1)\cdots\Psi_n(x_n)). \tag{53}$$
Here, the left side denotes the distributional derivative of $T(\Phi(y)\Psi_1(x_1)\cdots\Psi_n(x_n))$ with respect to the variable y, whereas the factor $(\nabla_a\Phi)$ appearing on the right side denotes the classical field expression obtained by taking the derivative of Φ. In other words, for time-ordered products with more than one factor, the operational meaning of T10 is simply that derivatives can be “freely commuted” through T. Since the arguments of time-ordered products are classical field expressions, the Leibniz rule, of course, holds for the expressions hit by the derivative inside of T.

It is useful to further illustrate the meaning of condition T10, and the extent to which it differs from conventional viewpoints on time-ordered products, with a simple example. Let us attempt to calculate T(ϕ(x)ϕ(y)) according to our axiom scheme. By causal factorization (T8), T(ϕ(x)ϕ(y)) must satisfy
$$T(\varphi(x)\varphi(y)) = \begin{cases} \varphi(x)\varphi(y) & \text{if } x \notin J^-(y), \\ \varphi(y)\varphi(x) & \text{if } y \notin J^-(x), \end{cases} \tag{54}$$
which determines T(ϕ(x)ϕ(y)) except on the “diagonal” x = y. However, since there do not exist any local and covariant distributions (T1) with support on the diagonal that have the correct scaling behavior (T2) as well as the desired smooth and analytic dependence upon the spacetime metric (T4 and T5), it follows that T(ϕ(x)ϕ(y)) is unique. This unique extension of the distribution defined by Eq. (54) to the diagonal is
$$T(\varphi(x)\varphi(y)) = \vartheta(x^0 - y^0)\varphi(x)\varphi(y) + \vartheta(y^0 - x^0)\varphi(y)\varphi(x) \tag{55}$$
where ϑ denotes the step function. (The right side of this equation is mathematically well defined on account of the wave front set properties of ϕ(x)ϕ(y); as already noted in Sec. 1, the corresponding expression for general Wick monomials (see Eq. (7)) is not well defined.) If we apply the Klein–Gordon operator P to the variable x of this distribution, we obtain
$$(P \otimes 1)T(\varphi(x)\varphi(y)) = i\delta(x, y)\mathbb{1}. \tag{56}$$
Consider, now, the time-ordered product T((Pϕ)(x)ϕ(y)). By causal factorization (T8), this distribution satisfies
$$T((P\varphi)(x)\varphi(y)) = 0 \quad \text{if } x \neq y. \tag{57}$$
The most obvious extension of this distribution to the diagonal is, of course, to put T((Pϕ)(x)ϕ(y)) = 0 for all x, y, including the diagonal. This is the conventional assumption. However, since T((Pϕ)(x)ϕ(y)) has dimension length$^{-D}$, there does exist a distribution with support on the diagonal that satisfies the above required properties, namely $\delta(x, y)\mathbb{1}$. Consequently, within the scheme of axioms T1–T9, we have the freedom to add a “contact term” and define T((Pϕ)(x)ϕ(y)) to be an arbitrary multiple of $\delta(x, y)\mathbb{1}$. Axiom T10 together with Eq. (56) requires, in fact, that we make use of this freedom to define T((Pϕ)(x)ϕ(y)) to be given by
$$T((P\varphi)(x)\varphi(y)) = i\delta(x, y)\mathbb{1}. \tag{58}$$
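The step from Eq. (56) to Eq. (58) can be spelled out as follows, in the informal notation of Eq. (53). This is a sketch under two assumptions: that T is linear in each argument, with the c-number coefficients m² and ξR(x) absorbed into the test function, and that, by T10, the derivatives in $P = \nabla^a\nabla_a - m^2 - \xi R$ acting on the classical factor ϕ(x) may be pulled outside T.

```latex
\begin{aligned}
T\big((P\varphi)(x)\,\varphi(y)\big)
  &= T\big((\nabla^a\nabla_a\varphi)(x)\,\varphi(y)\big)
     - \big(m^2 + \xi R(x)\big)\,T\big(\varphi(x)\varphi(y)\big) \\
  &= \big(\nabla^a\nabla_a - m^2 - \xi R\big)_x\, T\big(\varphi(x)\varphi(y)\big) \\
  &= (P \otimes 1)\,T\big(\varphi(x)\varphi(y)\big)
   = i\,\delta(x, y)\,\mathbb{1}.
\end{aligned}
```

The second line uses T10 twice, once for each derivative, and the last equality is Eq. (56).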
Since the Wick power Pϕ vanishes identically, this explicitly shows that it is inconsistent with axioms T1–T10 to view a time-ordered product as a multilinear map on Wick polynomials rather than as a multilinear map on elements of Fclass. We now prove that, for arbitrary Wick powers and time-ordered products, it is consistent to impose the Leibniz rule T10 in addition to our previous axioms T1–T9. In essence, the following proposition provides a generalization to curved spacetime of the proof of the “action Ward identity” given in [10].

Proposition 3.1. There exists a prescription for defining time-ordered products satisfying our requirements T1–T10.

Proof. As in [14], we will proceed by an inductive argument on the number of factors, $N_T$, appearing in the time-ordered product. Consider, first, the case of Wick monomials, i.e., $N_T = 1$. We previously showed [13] that the following prescription of “local Hadamard normal ordering” (i.e., “covariant point-splitting regularization”) satisfies conditions T1–T9: let H(x, y) be a symmetric, locally constructed Hadamard parametrix.⁹ We define [13]
$$:\!\varphi(x_1)\cdots\varphi(x_k)\!:_H \;= \frac{\delta^k}{i^k\,\delta f(x_1)\cdots\delta f(x_k)} \exp\Big[\frac{1}{2}H(f \otimes f) + i\varphi(f)\Big]\Big|_{f=0} \tag{59}$$

⁹ See e.g. Eqs. (7) and (8) and Appendix A of [17] for the explicit form of H in D dimensions. Note that [17] uses a parametrix, $Z_n$, that is “truncated” at nth order, which will give an acceptable prescription only when the total number of derivatives, $N_\nabla$, appearing in the Wick power is sufficiently small. In order to give a prescription that is valid for arbitrary $N_\nabla$, one must define H by the procedure explained below Eq. (69) of [13].

For an arbitrary $\Phi = C(\nabla)^{r_1}\varphi\cdots(\nabla)^{r_k}\varphi$ ∈ Vclass (where C denotes a curvature term and all tensor indices have been suppressed), we define the corresponding Wick monomial by [13]
$$T(f\Phi) = \Phi(f) = \int C(y) :\!\varphi(x_1)\cdots\varphi(x_k)\!:_H F(y; x_1, \dots, x_k) = \int C(y) :\!\prod_{i=1}^k (\nabla)^{r_i}\varphi(y)\!:_H f(y)\,\epsilon(y) \tag{60}$$
where
$$F(y; x_1, \dots, x_k) = f(y)(\nabla_{x_1})^{r_1}\cdots(\nabla_{x_k})^{r_k}\delta(y, x_1, \dots, x_k)\prod_{i=1}^{k}\epsilon(x_i). \tag{61}$$
(Note that for the definition of general Wick powers with derivatives, it is essential for this prescription to be well defined that H(x, y) be symmetric in x and y, so that it does not matter which of the variables (x1 , . . . , xk ) we select to apply derivatives to.) The arguments of [13] can now be straightforwardly generalized to show that this prescription satisfies not only conditions T1–T9 but also T10. Thus, there is no difficulty in adding condition T10 to the list of properties that we require for Wick powers. However, since our previous existence proof for time-ordered products [14] does not provide a correspondingly explicit prescription for their definition, we cannot give a similar, direct proof that condition T10 can be imposed on time-ordered products. Instead, we must proceed by re-proving the existence theorem of [14], where we now explicitly allow the factors appearing in the time-ordered products to contain derivatives of ϕ and where we now add condition T10 to the list of requirements. We inductively assume that the construction of the time-ordered products satisfying T1–T10 has been performed up to
(62)
denotes the total diagonal. Furthermore, it is easily verified that $T^0$ satisfies T1–T10 when acting on test functions supported away from the total diagonal. Consequently, our task is to extend $T^0$ to a distribution in $\mathcal{D}'(M^{N_T})$, i.e., to a distribution defined everywhere, in such a way that T1–T7 and T9–T10 are preserved in the extension process. (T8 has already been satisfied by the requirement that T be an extension of $T^0$.) In fact, we need only show that an extension can be chosen so as to preserve T1–T5 and T9–T10, since the symmetry property, T6, and unitarity property, T7, can always be satisfied by a simple re-definition of T if all of the other properties have been satisfied [14]. As in [14], we can reduce this task to a much more manageable one by the use of a “local Wick expansion” for $T^0$. The appropriate form of this local Wick expansion
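in the case where the arguments of $T^0$ are general elements¹⁰ of Fclass is
$$T^0\Big(\prod_{i=1}^{N_T} f_i\Phi_i\Big) = \sum_{\alpha_1, \alpha_2, \dots} \frac{1}{\alpha_1! \cdots \alpha_{N_T}!} \int \prod_j \epsilon(y_j)\; t^0[\delta^{\alpha_1}\Phi_1 \otimes \cdots \otimes \delta^{\alpha_{N_T}}\Phi_{N_T}](y_1, \dots, y_{N_T}) \times f_1(y_1)\cdots f_{N_T}(y_{N_T}) :\!\prod_{i=1}^{N_T}\prod_j [(\nabla)^j\varphi(y_i)]^{\alpha_{ij}}\!:_H . \tag{63}$$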
Here, we are using the following notation: The $t^0$ are multilinear mappings
$$t^0 : \bigotimes^{N_T} V_{\rm class} \to \mathcal{D}'(M^{N_T} \setminus \Delta_{N_T}) \tag{64}$$
$$\otimes_i \Phi_i \mapsto t^0[\otimes_i \Phi_i](y_1, \dots, y_{N_T}) \tag{65}$$
from classical field expressions to c-number distributions in the product manifold minus its total diagonal.¹¹ Each $\alpha_i$ is a multi-index $(\alpha_{i1}, \alpha_{i2}, \dots)$, and we are using the shorthand $\alpha_i! = \prod_j \alpha_{ij}!$ for such a multi-index. If Φ ∈ Vclass and α is a multi-index, we are using the notation
$$\delta^\alpha\Phi = \Big\{\prod_i \Big(\frac{\partial}{\partial(\nabla)^i\varphi}\Big)^{\alpha_i}\Big\}\Phi. \tag{66}$$
As in [14], Eq. (63) can be proved by induction in $N_\varphi$, using the commutator property, T9. By use of the local Wick expansion, we reduce the problem of extending $T^0$ to the problem of extending the expansion coefficients $t^0$. The time-ordered product defined via Eq. (63) from the extension of $t^0$ will automatically satisfy the commutator requirement T9. Thus, we need only show that the expansion coefficients $t^0$ can be extended so as to satisfy T1–T5 and T10. It follows directly from the assumed properties T1–T10 of the time-ordered products for
(67)
ith slot

¹⁰ Since the Hadamard parametrix H(x, y) is only defined when x, y are in a convex normal neighborhood of each other, it follows that the local Wick expansion is only defined if $F = \otimes_i f_i$ is supported in a sufficiently small neighborhood of the total diagonal. Note that this does not cause any problems in the present context since we are only interested in an arbitrarily small neighborhood of the total diagonal for the extension problem.

¹¹ More precisely, the $t^0$ are distributions that are defined on a suitable neighborhood of the total diagonal, minus the total diagonal itself; see the previous footnote.
In parallel with the arguments of [14], properties T1–T5 and T10 will hold for all time-ordered products with $N_T$ factors if and only if each $t^0$ can be extended to a distribution $t$ defined on all of $M^{N_T}$ in such a way that the above properties are preserved in the extension process. The methods of [14] already provide an extension $t$ of $t^0$ that satisfies all of the required properties except that it is not guaranteed to satisfy the Leibniz relation
$$\big(1\otimes\cdots\underbrace{\nabla}_{i\text{th slot}}\otimes\cdots 1\big)\, t[\otimes_i\Phi_i] = t[\Phi_1\otimes\cdots\nabla\Phi_i\otimes\cdots\Phi_{N_T}]. \tag{68}$$
We will now complete the proof by showing how this relation can be satisfied. Let $\mathcal{V}_{\rm class}$ denote the space of classical field expressions, Eq. (13). Let $\mathcal{V}_{\rm class}(N_\varphi, N_\nabla)$ denote the subspace of $\mathcal{V}_{\rm class}$ spanned by the monomial expressions in the curvature, in $\varphi$, and in symmetrized derivatives of $\varphi$ that have a total number of precisely $N_\varphi$ powers of $\varphi$ and a total number of precisely $N_\nabla$ derivatives acting on the factors of $\varphi$. Let
$$Q_\nabla : \mathcal{V}_{\rm class}(N_\varphi, N_\nabla - 1) \to \mathcal{V}_{\rm class}(N_\varphi, N_\nabla) \tag{69}$$
denote the map whose action$^{12}$ on an element $\Phi \in \mathcal{V}_{\rm class}(N_\varphi, N_\nabla - 1)$ is defined by first applying $\nabla_a$ to $\Phi$, then dropping all terms where $\nabla_a$ acts on factors of curvature rather than factors of $\varphi$ and, finally, symmetrizing over all derivatives acting on any given factor of $\varphi$. We note, first, that for $N_\varphi > 0$, the map $Q_\nabla$ has vanishing kernel, i.e., in essence, the derivative of any nonvanishing expression with a nontrivial dependence on $\varphi$ cannot vanish. To see this explicitly, choose an arbitrary but fixed point $x \in M$, a coordinate basis $x^\mu = (x^0, x^1, \dots, x^{D-1})$, and consider the coordinate components
$$\Phi_\nu(x) = \sum_{\mu_1,\dots,\mu_n} C^{\mu_1\cdots\mu_n}{}_\nu(x) \prod_{i=1}^n \nabla_{\mu_i}\varphi(x) \tag{70}$$
of a $\Phi \in \mathcal{V}_{\rm class}(N_\varphi = n, N_\nabla)$, where $\nu$ etc. is a shorthand for a symmetrized combination $(\nu_1\cdots\nu_k)$ of components. With each such coordinate component (70), we assign a unique element $p_\Phi \in \mathbb{C}[P_1, \dots, P_n]$, the ring of polynomials in the indeterminates $P_i^\mu$, $i = 1, \dots, n$, $\mu = 0, \dots, D-1$, which are symmetric under exchange of $P_i$ and $P_j$, by the following rule: with the $i$th factor of $\varphi$ in expression (70), we associate the monomial in $P_i = (P_i^0, \dots, P_i^{D-1})$ obtained by replacing each derivative operator by the corresponding component of $P_i$.
12 Note that the definition of $Q_\nabla$ appears more natural in a framework in which one views $\mathcal{V}_{\rm class}$ not as a vector space over the reals, but instead as a module over the ring of polynomial curvature expressions, $\mathcal{R}_{\rm class}$. In this context, $Q_\nabla$ is simply defined to act as the derivative followed by symmetrization on monomials in $\mathcal{V}_{\rm class}$ without curvature coefficients, and then extended to all of $\mathcal{V}_{\rm class}$ by $\mathcal{R}_{\rm class}$-linearity.
We then multiply the
resulting monomials in $P_i$ for all $i = 1, \dots, n$, we multiply by the corresponding real constants $C^{\cdots}{}_{\cdots}(x)$ and we symmetrize in $i$. It is then clear that the components of $Q_\nabla\Phi$ correspond to the polynomials $(\sum_i P_i^\mu)\,p_\Phi$, and it is clear that $Q_\nabla\Phi$ for $\Phi \in \mathcal{V}_{\rm class}(N_\varphi = n, N_\nabla)$ will be zero if and only if all these polynomials are zero. It is easy to see (e.g., by the arguments given in [10, p. 22]) that this can only happen if in fact $p_\Phi = 0$, and hence that $\Phi = 0$, as we desired to show.
Let $\mathcal{V}^0_{\rm class}(N_\varphi, N_\nabla)$ denote the subspace of $\mathcal{V}_{\rm class}(N_\varphi, N_\nabla)$ spanned by expressions in the image of $\mathcal{V}_{\rm class}(N_\varphi, N_\nabla - 1)$ under the map $Q_\nabla$, multiplied by arbitrary curvature tensors. Let $\mathcal{V}^1_{\rm class}(N_\varphi, N_\nabla)$ be a complementary subspace, so that
$$\mathcal{V}_{\rm class}(N_\varphi, N_\nabla) = \mathcal{V}^0_{\rm class}(N_\varphi, N_\nabla) \oplus \mathcal{V}^1_{\rm class}(N_\varphi, N_\nabla). \tag{71}$$
Note that $\mathcal{V}^0_{\rm class}(N_\varphi, N_\nabla)$ is uniquely defined by our construction, but there are, of course, many possible choices$^{13}$ of complementary subspace $\mathcal{V}^1_{\rm class}(N_\varphi, N_\nabla)$. The key point is that if we fix all of the arguments of $t$ except for the $i$th and if we know the action of $t$ on all $\Phi_i \in \mathcal{V}_{\rm class}(N_\varphi, N_\nabla - 1)$, then, because $Q_\nabla$ has vanishing kernel, the Leibniz rule Eq. (68) uniquely determines the action of $t$ for $\Phi_i \in \mathcal{V}^0_{\rm class}(N_\varphi, N_\nabla)$. On the other hand, the Leibniz rule imposes no constraints whatsoever on the action of $t$ for $\Phi_i \in \mathcal{V}^1_{\rm class}(N_\varphi, N_\nabla)$. We will therefore refer to $\mathcal{V}^0_{\rm class}(N_\varphi, N_\nabla)$ as the “Leibniz dependent” subspace of $\mathcal{V}_{\rm class}(N_\varphi, N_\nabla)$, and will refer to $\mathcal{V}^1_{\rm class}(N_\varphi, N_\nabla)$ as the subspace of “Leibniz independent” expressions.
We now fix $N_T$ and fix the number, $N_\varphi^i$, of powers of $\varphi$ in each factor of the argument of $t$, and proceed by induction on $\{N_\nabla^1, \dots, N_\nabla^{N_T}\}$. The proof of [14] already directly establishes existence of the desired extension of $t$ when $N_\nabla^1 = \cdots = N_\nabla^{N_T} = 0$, since the Leibniz rule clearly imposes no additional restrictions in this case. We now inductively assume existence has been proven for all $N_\nabla^i < n_i$, where $i = 1, \dots, N_T$. The inductive proof will be completed if we can show that for any $j$, existence continues to hold whenever $N_\nabla^j = n_j$ and $N_\nabla^i < n_i$ for $i \neq j$. To prove existence for $N_\nabla^j = n_j$, we decompose $\Phi_j \in \mathcal{V}_{\rm class}(N_\varphi^j, N_\nabla^j = n_j)$ into its “Leibniz dependent” and “Leibniz independent” pieces, Eq. (71). On $\mathcal{V}^1_{\rm class}(N_\varphi^j, N_\nabla^j = n_j)$, we define the extension of $t$ as in [14], whereas on $\mathcal{V}^0_{\rm class}(N_\varphi^j, N_\nabla^j = n_j)$ we simply define the extension of $t$ so as to satisfy Eq. (68), i.e., we use the left side of that equation to define the right side.
It is clear that defining t in this way yields a local and covariant distribution that depends smoothly and analytically on the metric, that has an almost homogeneous scaling behavior, and that satisfies the desired microlocal properties, since taking covariant derivatives preserves these properties.
13 A particular choice of $\mathcal{V}^1_{\rm class}(N_\varphi, N_\nabla)$ in the context of flat spacetime theories was made in [10]. It is worth noting that when $N_\varphi = 2$, we have $\mathcal{V}_{\rm class}(2, N_\nabla) = \mathcal{V}^0_{\rm class}(2, N_\nabla)$ when $N_\nabla$ is odd, i.e., there are no “Leibniz independent” expressions when $N_\nabla$ is odd; when $N_\nabla$ is even, a convenient choice of $\mathcal{V}^1_{\rm class}(2, N_\nabla)$ are the expressions of the form of local curvature terms times $\varphi\nabla_{(a_1}\cdots\nabla_{a_{N_\nabla})}\varphi$.
Consequently, an extension of t satisfying all of the desired properties (including the Leibniz rule) exists.
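The decomposition (71) can be illustrated in the simplest nontrivial case. The following is a minimal sketch for $N_\varphi = 2$, $N_\nabla = 2$, ignoring curvature coefficients; the complement shown is only one admissible choice (it agrees with the convenient choice mentioned in footnote 13), and the polynomial check uses the correspondence between $Q_\nabla$ and multiplication by $\sum_i P_i$ described above.

```latex
% V_class(2,1) is spanned (up to curvature coefficients) by \varphi\nabla_a\varphi.
\begin{align*}
Q_\nabla(\varphi\nabla_a\varphi)
  &= \nabla_b\varphi\,\nabla_a\varphi + \varphi\,\nabla_{(b}\nabla_{a)}\varphi ,\\
\mathcal{V}^0_{\rm class}(2,2)
  &= \operatorname{span}\{\nabla_a\varphi\nabla_b\varphi + \varphi\nabla_{(a}\nabla_{b)}\varphi\},\\
\mathcal{V}^1_{\rm class}(2,2)
  &= \operatorname{span}\{\varphi\nabla_{(a}\nabla_{b)}\varphi\}
  \quad\text{(one possible choice)} .
\end{align*}
% Polynomial picture: \varphi\nabla_a\varphi corresponds to \tfrac12(P_1^a + P_2^a), and
\begin{align*}
(P_1^b + P_2^b)\,\tfrac12(P_1^a + P_2^a)
  = \tfrac12\big(P_1^a P_2^b + P_2^a P_1^b\big)
  + \tfrac12\big(P_1^a P_1^b + P_2^a P_2^b\big),
\end{align*}
% where the first term represents \nabla_a\varphi\nabla_b\varphi and the
% second represents \varphi\nabla_{(a}\nabla_{b)}\varphi.
```

On $\mathcal{V}^0_{\rm class}(2,2)$ the value of $t$ is then fixed by Eq. (68) in terms of its value on $\varphi\nabla_a\varphi$, while on the chosen complement it remains free.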
3.2. Anomalies with respect to the equations of motion
We conclude this section by elucidating the difficulties (i.e., “anomalies”) that arise when one attempts to impose further “reasonable” conditions concerning the equations of motion in addition to T1–T10 on Wick powers containing derivatives. Specifically, we shall show that in two spacetime dimensions, it is impossible to require the vanishing of $(\nabla_a\varphi)P\varphi$, where $P = \nabla^a\nabla_a - m^2 - \xi R$ is the Klein–Gordon operator. We shall also show that in four spacetime dimensions, it is impossible to require the vanishing of both $\varphi P\varphi$ and $(\nabla_a\varphi)P\varphi$. These difficulties are closely related to the well-known trace anomaly property for a conformally invariant field in these dimensions. However, in contrast to previous discussions, in our framework the relation $T^a{}_a = 0$ holds as a classical algebraic identity (not requiring the field equations) for the conformally invariant field in $D = 2$ spacetime dimensions. Consequently, in our framework, there cannot be a trace anomaly for the conformally invariant scalar field in two spacetime dimensions; rather, in two dimensions there is necessarily an anomaly in the conservation of $T_{ab}$.
Consider a prescription (such as the local Hadamard normal ordering defined above in Eq. (60)) for defining Wick polynomials in $D$ spacetime dimensions that satisfies T1–T10. Using the Leibniz rule, T10, the stress-tensor $T_{ab}$, Eq. (6), may be re-written entirely in terms of the Wick monomials
$$\Psi \equiv \varphi^2, \qquad \Psi_{ab} \equiv \varphi\nabla_a\nabla_b\varphi, \tag{72}$$
as follows:
$$T_{ab} = \frac12\nabla_a\nabla_b\Psi - \Psi_{ab} - \frac12 g_{ab}\Big[\frac12\nabla^c\nabla_c\Psi + m^2\Psi - \Psi^c{}_c\Big] + \xi\big[G_{ab}\Psi - \nabla_a\nabla_b\Psi + g_{ab}\nabla^c\nabla_c\Psi\big]. \tag{73}$$
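The rewriting of $T_{ab}$ in terms of $\Psi$ and $\Psi_{ab}$ rests on two identities that follow from the Leibniz rule. The sketch below assumes that Eq. (6), which lies outside this passage, has the standard form $T_{ab} = \nabla_a\varphi\nabla_b\varphi - \frac12 g_{ab}(\nabla^c\varphi\nabla_c\varphi + m^2\varphi^2) + \xi[G_{ab}\varphi^2 - \nabla_a\nabla_b\varphi^2 + g_{ab}\nabla^c\nabla_c\varphi^2]$.

```latex
\begin{align*}
\nabla_a\nabla_b\Psi
  &= 2\,\nabla_a\varphi\nabla_b\varphi + 2\,\varphi\nabla_a\nabla_b\varphi
  \;\Longrightarrow\;
  \nabla_a\varphi\nabla_b\varphi = \tfrac12\nabla_a\nabla_b\Psi - \Psi_{ab},\\
\nabla^c\varphi\nabla_c\varphi
  &= \tfrac12\nabla^c\nabla_c\Psi - \Psi^c{}_c
  \qquad\text{(trace of the previous line)} .
\end{align*}
```

Substituting these two expressions into the assumed form of $T_{ab}$ reproduces the kinetic, mass and $\xi$ terms of Eq. (73) one by one.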
The divergence of $T_{ab}$ is straightforwardly calculated (again using the Leibniz rule) to be given by
$$\nabla^a T_{ab} = (\nabla_b\varphi)P\varphi. \tag{74}$$
On the other hand, we also have the obvious relation
$$\Psi^a{}_a - (m^2 + \xi R)\Psi = \varphi P\varphi. \tag{75}$$
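Equation (74) can be checked term by term. The following sketch again assumes the standard form $T_{ab} = \nabla_a\varphi\nabla_b\varphi - \frac12 g_{ab}(\nabla^c\varphi\nabla_c\varphi + m^2\varphi^2) + \xi[G_{ab}\varphi^2 - \nabla_a\nabla_b\varphi^2 + g_{ab}\nabla^c\nabla_c\varphi^2]$ for Eq. (6), and uses only the Leibniz rule, the contracted Bianchi identity $\nabla^a G_{ab} = 0$, and the commutator $\nabla^a\nabla_a\nabla_b f - \nabla_b\nabla^a\nabla_a f = R_b{}^c\nabla_c f$ for a scalar $f$.

```latex
\begin{align*}
\nabla^a(\nabla_a\varphi\nabla_b\varphi) - \tfrac12\nabla_b(\nabla^c\varphi\nabla_c\varphi)
  &= (\nabla^a\nabla_a\varphi)\,\nabla_b\varphi ,\\
-\tfrac12 m^2\,\nabla_b(\varphi^2)
  &= -m^2\,\varphi\nabla_b\varphi ,\\
\xi\,\nabla^a\big[G_{ab}\varphi^2 - \nabla_a\nabla_b\varphi^2 + g_{ab}\nabla^c\nabla_c\varphi^2\big]
  &= \xi\big[G_{ab}\nabla^a\varphi^2 - R_b{}^c\nabla_c\varphi^2\big]
   = -\tfrac12\xi R\,\nabla_b\varphi^2
   = -\xi R\,\varphi\nabla_b\varphi ,\\
\nabla^a T_{ab}
  &= (\nabla_b\varphi)\,(\nabla^a\nabla_a - m^2 - \xi R)\varphi
   = (\nabla_b\varphi)P\varphi .
\end{align*}
```

No use is made of the field equation $P\varphi = 0$, which is why the identity can hold for any prescription satisfying the Leibniz rule.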
The above equations hold for any prescription satisfying T1–T10. Now let us calculate ϕP ϕ and (∇a ϕ)P ϕ by the local Hadamard normal ordering prescription.
In odd dimensions, these quantities vanish, but in even dimensions the computations reported in [17, Lemma 2.1] yield$^{14}$
$$\varphi P\varphi = Q\,\mathbf{1} \tag{76}$$
$$(\nabla_a\varphi)P\varphi = \frac{D}{2(D+2)}\,\nabla_a Q\,\mathbf{1} \tag{77}$$
where $Q$ is a nonvanishing local curvature scalar of dimension $(\text{length})^{-D}$ that can be computed explicitly from the Hadamard recursion relations. If we wish to require the vanishing of $\varphi P\varphi$ and $(\nabla_a\varphi)P\varphi$, our task is to modify, in a manner consistent with axioms T1–T10, the definitions of the Wick powers, $\Psi = \varphi^2$ and $\Psi_{ab} = \varphi\nabla_a\nabla_b\varphi$, so that the left sides of Eqs. (74) and (75) vanish. However, since $\Psi$ and $\Psi_{ab}$ are each quadratic in $\varphi$, our previous uniqueness theorem [13] establishes that the allowed freedom in the definition of these quantities consists of local curvature terms of the correct dimension times the identity, $\mathbf{1}$. More precisely, the allowed freedom to modify the definition of these quantities is$^{15}$
$$\Psi \to \Psi + C\,\mathbf{1} \tag{78}$$
$$\Psi_{ab} \to \Psi_{ab} + C_{ab}\,\mathbf{1} \tag{79}$$
where $C$ is any scalar constructed out of the metric, curvature, derivatives of the curvature, $m^2$ and $\xi$, with dimension $(\text{length})^{-(D-2)}$, and $C_{ab}$ is any tensor (symmetric in $a$ and $b$) that is constructed out of the metric, curvature, derivatives of the curvature, and $m^2$ and has dimension $(\text{length})^{-D}$. Therefore, in order to modify the local Hadamard normal ordering prescription so as to preserve T1–T10 and also make the right sides of Eqs. (74) and (75) vanish, we must solve the following equations
$$-\nabla^a C_{ab} + \frac12\nabla_b C^a{}_a + \frac14\nabla_b\nabla^a\nabla_a C + \frac12 R_b{}^c\nabla_c C + \frac12(m^2 + \xi R)\nabla_b C = -\nabla_b Q \tag{80}$$
$$C^a{}_a - m^2 C - \xi R C = -\frac{D}{2(D+2)}\,Q. \tag{81}$$
14 The difference between the behavior occurring in even and odd dimensions can be understood as arising from the following fact: in odd dimensions, one can construct a local and covariant Hadamard parametrix $H(x, y)$ that satisfies the wave equation in each variable up to arbitrarily high order in the geodesic distance between $x$ and $y$. As a result, the “local Hadamard normal ordering” prescription for Wick powers satisfies T1–T10 and also satisfies the property that any Wick power containing a factor of the wave operator must vanish. By contrast, in even dimensions, it is impossible to construct a local and covariant Hadamard parametrix $H(x, y)$ that is symmetric in $x$ and $y$ and satisfies the wave equation to arbitrarily high order in the geodesic distance between $x$ and $y$. Consequently, if one requires $H(x, y)$ to be symmetric (as is necessary for the prescription for defining general Wick monomials involving derivatives to be well defined), then it will fail to satisfy the wave equation, and “anomalies” will occur for the regularized quantities.
15 Condition T10, of course, was not imposed in [13]. However, it is worth noting that the subspace of classical field expressions spanned by $\varphi^2$ and $\varphi\nabla_a\nabla_b\varphi$ does not intersect the “Leibniz dependent” subspace $\mathcal{V}^0_{\rm class}$ (see footnote 13), so condition T10 actually imposes no extra conditions on these quantities, i.e., the full ambiguity given by Eqs. (78) and (79) is present even when we impose T10.
Let us specialize, now, to the case of two spacetime dimensions, $D = 2$. Since $C$ has dimension zero, the most general choice of $C$ is simply $C = \alpha$, where $\alpha$ is a constant that is independent of $m^2$ but may have arbitrary analytic dependence on $\xi$. However, since $C$ appears in Eq. (80) only in the form $\nabla_b C$, it cannot contribute to that equation. Similarly, since $C_{ab}$ has dimension $(\text{length})^{-2}$, it must take the form $C_{ab} = \beta_1 R_{ab} + \beta_2 R g_{ab} + \beta_3 m^2 g_{ab}$, where $\beta_1, \beta_2, \beta_3$ are constants that are independent of $m^2$ but may have arbitrary analytic dependence on $\xi$. However, in two spacetime dimensions, we have $R_{ab} = \frac12 R g_{ab}$, and since $C_{ab}$ appears in Eq. (80) only in the combination $C_{ab} - \frac12 C^c{}_c g_{ab}$, it follows immediately that $C_{ab}$ also cannot contribute to Eq. (80). Since $\nabla_a Q$ is nonvanishing, it is therefore impossible to solve Eq. (80). Consequently, in two spacetime dimensions, there does not exist a prescription for defining Wick powers that satisfies axioms T1–T10 and also satisfies $(\nabla_a\varphi)P\varphi = 0$. Since $\nabla^a T_{ab} = (\nabla_b\varphi)P\varphi$ by Eq. (74) above, this means, equivalently, that it is impossible to satisfy conservation of stress-energy in two spacetime dimensions within our axiomatic framework.
By contrast, in spacetime dimension $D > 2$, no difficulty arises in satisfying Eq. (80) alone (in addition to axioms T1–T10), since we may always solve this equation by choosing $C = 0$ and taking
$$C_{ab} = -\frac{2}{D-2}\,Q\,g_{ab}. \tag{82}$$
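The role of the dimension can be made explicit in a few lines. The sketch below uses only the statement above that $C_{ab}$ enters Eq. (80) through the combination $C_{ab} - \frac12 C^c{}_c g_{ab}$, and takes the right side of Eq. (80) to be $-\nabla_b Q$; the scalar $X$ is a placeholder introduced here for illustration.

```latex
\begin{align*}
C_{ab} = X\,g_{ab}
  \;&\Longrightarrow\;
  C_{ab} - \tfrac12 C^c{}_c\,g_{ab} = \Big(1 - \tfrac{D}{2}\Big) X\,g_{ab},\\
-\nabla^a\Big(C_{ab} - \tfrac12 C^c{}_c\,g_{ab}\Big)
  &= \tfrac{D-2}{2}\,\nabla_b X .
\end{align*}
% D = 2: every admissible C_{ab} is of the form X g_{ab} (because R_{ab} = \tfrac12 R g_{ab}),
%        so the left side vanishes identically and cannot equal -\nabla_b Q.
% D > 2: X = -\tfrac{2}{D-2}\,Q gives \tfrac{D-2}{2}\nabla_b X = -\nabla_b Q,
%        which is the solution (82) with C = 0.
```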
Thus, in all spacetime dimensions except $D = 2$, there is no obstacle to imposing $(\nabla_a\varphi)P\varphi = 0$ (or, equivalently, conservation of stress-energy) within our axiom scheme. However, difficulties do arise if, in addition, we attempt to impose $\varphi P\varphi = 0$ as well. For example, when $D = 4$, the general form of $C$ is
$$C = \alpha_0 R + \alpha_1 m^2, \tag{83}$$
where $\alpha_0, \alpha_1$ are constants. If one substitutes this general form of $C$ into Eq. (80), it turns out that it is still always possible to solve Eq. (80) for $C_{ab}$. The general solution of Eq. (80) is most conveniently expressed in terms of the quantity $\bar C_{ab} \equiv C_{ab} - \frac12 C^c{}_c g_{ab}$, and takes the explicit form
$$\bar C_{ab} = \Big[Q + \frac{\alpha_0}{4}\nabla^c\nabla_c R + \frac12\alpha_0 m^2 R + \frac18\alpha_0(2\xi + 1)R^2\Big] g_{ab} + \frac12\alpha_0 R\,G_{ab} + \bar\beta_1 I_{ab} + \bar\beta_2 J_{ab} + \bar\beta_3 m^2 G_{ab} + \bar\beta_4 m^4 g_{ab} \tag{84}$$
where $\bar\beta_1, \dots, \bar\beta_4$ are arbitrary constants, and $I_{ab}$ and $J_{ab}$ are the two independent conserved local curvature tensors of dimension $(\text{length})^{-4}$. We now substitute the general form, Eq. (83), of $C$ and this general solution, Eq. (84), for $C_{ab}$ into Eq. (81). Since both $I_{ab}$ and $J_{ab}$ have trace proportional to $\nabla^a\nabla_a R$, we obtain an equation of the form
$$Q = \gamma_1\nabla^a\nabla_a R + \gamma_2 R^2 + \gamma_3 m^2 R + \gamma_4 m^4. \tag{85}$$
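As a consistency check on the explicit solution (84), one can verify directly that it solves Eq. (80). The sketch below takes Eq. (80) in the form $-\nabla^a\bar C_{ab} + \frac14\nabla_b\nabla^a\nabla_a C + \frac12 R_b{}^c\nabla_c C + \frac12(m^2 + \xi R)\nabla_b C = -\nabla_b Q$, writes $X$ (a symbol introduced here) for the coefficient of $g_{ab}$ in Eq. (84), and uses that $G_{ab}$, $I_{ab}$, $J_{ab}$ and $g_{ab}$ are divergence-free.

```latex
\begin{align*}
C &= \alpha_0 R + \alpha_1 m^2 \;\Longrightarrow\; \nabla_b C = \alpha_0\nabla_b R,\\
-\nabla^a\bar C_{ab}
  &= -\nabla_b X - \tfrac12\alpha_0\,G_{ab}\nabla^a R
   = -\nabla_b X - \tfrac12\alpha_0 R_b{}^c\nabla_c R + \tfrac14\alpha_0 R\,\nabla_b R,\\
\text{LHS of (80)}
  &= -\nabla_b X + \tfrac14\alpha_0\nabla_b\nabla^a\nabla_a R
     + \tfrac14\alpha_0 R\,\nabla_b R + \tfrac12\alpha_0(m^2 + \xi R)\nabla_b R,\\
\nabla_b X
  &= \nabla_b Q + \tfrac14\alpha_0\nabla_b\nabla^c\nabla_c R
     + \tfrac12\alpha_0 m^2\nabla_b R + \tfrac14\alpha_0(2\xi + 1)R\,\nabla_b R,\\
\text{LHS of (80)} &= -\nabla_b Q .
\end{align*}
```

The constant $\alpha_1$ drops out because only $\nabla_b C$ enters, and the terms with $\bar\beta_1, \dots, \bar\beta_4$ do not contribute because they are conserved.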
However, by explicit calculation, $Q$ contains terms of the form $C^{abcd}C_{abcd}$ and $R^{ab}R_{ab}$, which cannot be expressed as a sum of the curvature terms on the right side. Consequently, there are no solutions to Eq. (81) when $D = 4$. Similar results presumably hold in all higher even dimensions, but a proof of this would require both a calculation of $Q$ and an analysis of the conserved local curvature terms of dimension $(\text{length})^{-D}$.
Finally, we note that our viewpoint with respect to the definition of the stress-energy tensor differs in two significant aspects from that of Moretti [17] and others. First, we take the free field quantum stress-energy tensor, $T_{ab}$, to be defined in terms of Wick monomials by Eq. (6) and we do not allow any modifications of this formula. It is natural that we take this viewpoint, because it is precisely the quantity defined by Eq. (6) that will directly enter condition T11b of the next section. Furthermore, if condition T11b is imposed (as is possible except in two spacetime dimensions, as will be proven in Sec. 6), then, as we shall show in Sec. 5, one obtains not only the conservation of the stress-energy tensor $T_{ab}$ in the free theory, but one also obtains conservation of the interacting stress-energy tensor $\Theta_{ab}$ in an arbitrary interacting theory. By contrast, Moretti [17] allows modifications to the formula for the stress-energy tensor that are proportional to $\varphi P\varphi$ and thus vanish classically. If one were only interested in considering the free field theory and were seeking a definition of its stress-energy tensor that is conserved and that corresponds to the classical expression in the classical limit, we see no argument against allowing such a re-definition.$^{16}$ However, it seems unlikely that this approach could naturally lead to a conserved stress-energy tensor in interacting theories.
Second, Moretti [17] takes the Wick monomials appearing in his (modified) formula for $T_{ab}$ to be defined by a particular, fixed prescription, namely local Hadamard normal ordering (see Eq. (60) above). By contrast, we allow an arbitrary prescription for defining Wick products satisfying T1–T10. However, it turns out that, with the exception of one case, the freedom (78) and (79) allowed by T1–T10 in the definition of the relevant Wick products is sufficient to encompass the modifications to $T_{ab}$ obtained by Moretti by adding terms proportional to $\varphi P\varphi$ but keeping the prescription for defining Wick products fixed. In other words, with one exception, we achieve the same final result for $T_{ab}$ as an element of $\mathcal{W}$ by modifying the prescription for defining Wick products rather than by modifying the formula for $T_{ab}$ in terms of Wick products. The exception is the case of two spacetime dimensions. Indeed, it can be seen directly from Eq. (73) that when $D = 2$ and when $m = 0$, the allowed freedom (78) and (79) does not permit any modification whatsoever to $T_{ab}$. Thus, while our results agree with those of Moretti when $D \neq 2$, they differ when $D = 2$.
16 Indeed, this was the philosophy taken in the prior work of one of us [20] on the stress-energy tensor, before the general theory of Wick products in curved spacetime had been developed.
4. Quadratic Interaction Lagrangians and Retarded Response
4.1. Formulation of the general condition T11
In this section, we formulate the new requirement, T11, that arises from our Principle of Perturbative Agreement when the perturbation, $L_1$, of the free Lagrangian, $L_0$, is at most quadratic in the field $\varphi$ and contains at most 2 derivatives of $\varphi$. The most general such nontrivial interaction Lagrangian for a real scalar field is
$$L_1 = \frac12\Big[j(x)\varphi + w^{ab}(x)\nabla_a\varphi\nabla_b\varphi + v(x)\varphi^2\Big] \tag{86}$$
where $j$, $w^{ab}$, and $v$ are smooth. Without loss of generality, we will assume that $j$, $w^{ab}$, and $v$ are of compact support on $M$, since the conditions that we derive are local conditions that will not depend on their support properties, and it is simpler to consider the compact support case. The term $j\varphi$ in Eq. (86) corresponds to the addition of an external current; the term $v\varphi^2$ corresponds to the presence of an external potential (or, equivalently, a spacetime variable mass); finally, the change in $L$ produced by changing the spacetime metric from $g_{ab}$ to $g'_{ab}$ corresponds to taking $w^{ab} = \epsilon' g'^{ab} - \epsilon\,g^{ab}$ and $v = m^2(\epsilon' - \epsilon) + \xi(R'\epsilon' - R\epsilon)$. Note that we have not included a term of the form $A^a\varphi\nabla_a\varphi$ in $L_1$, since such a term is Leibniz equivalent to $v\varphi^2$ with $v = -\nabla_a A^a$. However, if we were considering a complex scalar field, then we would have an additional term in $L_1$ of the form $iA^a(\bar\varphi\nabla_a\varphi - \varphi\nabla_a\bar\varphi)$.
The quantum field theory of a scalar field $\varphi$ with Lagrangian $L = L_0 + L_1$ can be constructed in the following two independent ways. First, it can be constructed perturbatively about $L_0$ by means of the Bogoliubov formula. This formula is most conveniently expressed in terms of retarded products, so we first recall the definition of retarded and advanced products, $R$ and $A$, in terms of time-ordered products. If $A, B \in \mathcal{F}_{\rm class}$, we define
$$R(e^{iA}; e^{iB}) = S(B)^{-1}S(A + B), \qquad A(e^{iA}; e^{iB}) = S(A + B)S(B)^{-1}, \tag{87}$$
where $S$ and $S^{-1}$ are given by the formal series expressions
$$S(A) = T(e^{iA}), \qquad S(A)^{-1} = \bar T(e^{iA}), \tag{88}$$
and where $\bar T$ denotes the “anti-time-ordered” product given by Eq. (38). Equation (87) is to be interpreted as an infinite sequence of well-defined equations obtained by formally expanding the exponentials on the left side, formally substituting Eq. (88) for $S$ and $S^{-1}$ on the right side, and then equating the terms with the same number of powers of $A$ and $B$. For example, if we set $A = \sum_i f_i\Phi_i$ and $B = h\Psi$ and expand to first order in $B$, we get the formula
$$R\Big(\prod_i f_i\Phi_i;\, h\Psi\Big) = -\Psi(h)\,T\Big(\prod_i f_i\Phi_i\Big) + T\Big(h\Psi\prod_i f_i\Phi_i\Big) \tag{89}$$
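The first-order formula for the retarded product follows from Eq. (87) in a few steps. The sketch below uses only $S(B) = 1 + iT(B) + O(B^2)$, the multilinearity of the time-ordered products, and the identification of a time-ordered product with one factor, $T(h\Psi)$, with the smeared Wick power $\Psi(h)$ (assumed here from the earlier axioms).

```latex
\begin{align*}
S(B)^{-1} &= 1 - i\,T(B) + O(B^2), \qquad
S(A+B) = T(e^{iA}) + i\,T(B\,e^{iA}) + O(B^2),\\
R(e^{iA}; e^{iB}) &= S(B)^{-1}S(A+B)
  = T(e^{iA}) + i\big[T(B\,e^{iA}) - T(B)\,T(e^{iA})\big] + O(B^2),\\
R(e^{iA}; B) &= T(B\,e^{iA}) - T(B)\,T(e^{iA}) .
\end{align*}
% Setting B = h\Psi, A = \sum_i f_i\Phi_i and keeping the term that is
% linear in each f_i\Phi_i gives
%   R(\prod_i f_i\Phi_i; h\Psi) = T(h\Psi \prod_i f_i\Phi_i) - \Psi(h)\,T(\prod_i f_i\Phi_i).
```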
as well as a similar formula for the advanced products. The retarded and advanced products have the important support property
$$\operatorname{supp} A/R\Big(\prod_i \Phi_i(x_i);\, \prod_j \Psi_j(y_j)\Big) \subset \{(x_1, \dots, x_n; y_1, \dots, y_m) \mid \text{some } x_i \in J^{-/+}(y_j) \text{ for at least one } j\} \tag{90}$$
which follows from the causal factorization property, T8, of the time-ordered products. Now let $L_1$ be any compactly supported interaction Lagrangian density that is polynomial in $\varphi$ and its derivatives. Let $\Phi_i \in \mathcal{V}_{\rm class}$. Then the Bogoliubov formula defines the interacting time-ordered product $T_{L_1}(\prod f_i\Phi_i)$ as an element of the noninteracting algebra $\mathcal{W}(M, g)$ by the power series expression
$$T_{L_1}\Big(\prod_{i=1}^m f_i\Phi_i\Big) = R\Big(\prod_{i=1}^m f_i\Phi_i;\, e^{iL_1}\Big) = \sum_n \frac{i^n}{n!}\, R\Big(\prod_{i=1}^m f_i\Phi_i;\, \underbrace{L_1\cdots L_1}_{n\ \text{factors}}\Big). \tag{91}$$
This definition of time-ordered products for the interacting quantum field corresponds to the boundary condition that the interacting field be equal to the corresponding non-interacting field outside the causal future of the support of the interaction Lagrangian $L_1$. The interacting field corresponding to the opposite boundary condition (with the future of the support of $L_1$ replaced by the past) is given by the analogous formula involving advanced products. As is well known, the perturbation series (91) is expected not to converge$^{17}$ for general $L_1$, so this relation is, in general, only a formal one.
The Bogoliubov formula (91) holds for an arbitrary interaction Lagrangian. However, if $L_1$ takes the simple form (86), then there is a second means of constructing “interacting” time-ordered products: We simply construct them directly for the theory given by the Lagrangian $L_0' = L_0 + L_1$. As already indicated, the Lagrangian $L_0'$ corresponds to a free scalar field in a curved spacetime with metric $g'$ in the presence of an external potential $V$ and an external source $J$. We have already constructed the quantum field theory of $\varphi$ in an arbitrary, globally hyperbolic spacetime $(M, g)$. The generalization of this construction to include an external potential, $V$, is accomplished in an entirely straightforward manner as follows: In the construction of the algebras $\mathcal{A}(M, g, V)$ and $\mathcal{W}(M, g, V)$, we simply replace the original Klein–Gordon operator $P = \nabla^a\nabla_a - \xi R - m^2$ by $\nabla^a\nabla_a - \xi R - m^2 - V$. In the definition of the classical field algebra, $\mathcal{V}_{\rm class}$, we must now also allow arbitrary factors of $V$ and its symmetrized covariant derivatives. In the axioms for time-ordered products, we simply make the obvious modifications to T1, T2, T4, and T5 to allow for the presence of $V$. The proof of existence of a prescription for defining time-ordered products satisfying T1–T10 then goes through without any substantive changes.
17 However, see the remarks at the end of Sec. 7.
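To first order in $L_1$, the Bogoliubov formula (91) can be written entirely in terms of time-ordered products. The following is a short sketch that combines Eq. (91) with the first-order retarded product of Eq. (89), with $L_1$ playing the role of $h\Psi$; it is a formal expansion, consistent with the remark that the series is not expected to converge.

```latex
\begin{align*}
T_{L_1}\Big(\prod_i f_i\Phi_i\Big)
  &= T\Big(\prod_i f_i\Phi_i\Big)
   + i\,R\Big(\prod_i f_i\Phi_i;\, L_1\Big) + O(L_1^2)\\
  &= T\Big(\prod_i f_i\Phi_i\Big)
   + i\Big[T\Big(L_1\prod_i f_i\Phi_i\Big) - T(L_1)\,T\Big(\prod_i f_i\Phi_i\Big)\Big]
   + O(L_1^2) .
\end{align*}
```

By the support property (90), the first-order correction vanishes whenever no point of the support of the $f_i$ lies in the causal future of the support of $L_1$, in line with the retarded boundary condition described above.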
Given the theory of the real scalar field $\varphi$ in the absence of an external current, $J$, the corresponding theory in the presence of an external current may be uniquely constructed as follows (where, for simplicity, we assume $V = 0$ in the following discussion): first, the CCR algebra, $\mathcal{A}(M, g, J)$, is constructed by starting with the free algebra generated by symbols $\varphi_J(f)$ and $\varphi_J(h)^*$, and factoring by the relations (i), (ii) and (iv) of Sec. 2, together with the relation
(iii’) $$\varphi_J(Pf) = \Big(\int f J\Big)\,\mathbf{1}, \tag{92}$$
where $P$ is the Klein–Gordon operator, Eq. (9). Thus, the only modified relation compared to the case of vanishing source is (iii’), which ensures that the field $\varphi_J$ satisfies the Klein–Gordon equation with source $J$ in the distributional sense. Note that the commutation relations (iv) of the field with source are identical to the commutation relations of the field without source. To construct $\mathcal{W}(M, g, J)$, we define the generators $W_n(u)$ by Eq. (8) with $\varphi_J$ replacing $\varphi$ but with $\omega_2$ taken to be a Hadamard state of the theory with vanishing source, so that $\omega_2$ still satisfies the homogeneous equation in each entry. The Wick monomials may now be defined for the theory with external source by exactly the same formula, Eq. (60), as used in the theory without source, with the only difference being that the normal ordered expressions of the field without source that appear on the right side of (60) must be replaced by the local Hadamard normal ordered expressions $:\varphi_J(x_1)\cdots\varphi_J(x_k):_H$ of the field $\varphi_J$ with source. The latter are defined exactly as for the case of a vanishing source by Eq. (59), where $H$ is the same local Hadamard parametrix for the Klein–Gordon operator $P$ as used in the theory without a source. It then follows from causal factorization, T8, and the commutator property, T9, that a local Wick expansion of the form Eq. (63) holds for time-ordered products, $T^0_J(\Phi_1(x_1)\cdots\Phi_n(x_n))$ when $x_i \neq x_j$ for all $i \neq j$, where the expansion coefficients, $t^0[\otimes\Phi_i]$, are identical to those of the theory without source. We may therefore define time-ordered products in the theory with source by choosing the extension, $t[\otimes\Phi_i]$, to be independent of the source $J$. The resulting definition of time-ordered products satisfies axioms T1–T10, with the obvious modifications to T1, T2, T4, and T5 to allow for the presence of $J$. This construction of the theory of a real scalar field in the presence of an external source, $J$, in terms of the theory defined when $J = 0$ corresponds to demanding that the renormalization prescription be independent$^{18}$ of $J$.
18 $J$-dependent renormalization prescriptions that satisfy the appropriate versions of T1–T10 could be defined by choosing a local Hadamard parametrix for the Klein–Gordon operator $P$ that depends nontrivially on $J$ in a local and covariant manner (e.g., choosing $H'(x, y) = H(x, y) + J(x)J(y)$) and/or by choosing the extension, $t$, of $t^0$ to depend nontrivially on $J$ in a local and covariant manner.
Our Principle of Perturbative Agreement demands that for $L_1$ given by Eq. (86), the perturbative theory defined by the Bogoliubov formula (91) must agree with
the exact theory for the Lagrangian $L_0' = L_0 + L_1$. However, we cannot yet easily compare these constructions, since the Bogoliubov formula expresses the interacting time-ordered products as elements of the algebra $\mathcal{W}(M, g)$, whereas the exact construction yields time-ordered products as elements of a different algebra, $\mathcal{W}(M, g', V, J)$. Therefore, in order to compare these two expressions for the time-ordered products, we must map $\mathcal{W}(M, g', V, J)$ into $\mathcal{W}(M, g)$ in such a way that, in the past (i.e., outside the future of the supports of $g' - g$, $V$, and $J$), the interacting time-ordered products in $\mathcal{W}(M, g', V, J)$ are mapped into the corresponding time-ordered products in $\mathcal{W}(M, g)$. The desired map, denoted $\tau^{\rm ret}$, was constructed in [13, Lemma 4.1] in the case where $V = J = 0$. We now briefly review its construction and generalize it to the case of nonvanishing external potential and current.
Let $(M, g)$ be a globally hyperbolic spacetime and $(M, g', V, J)$ be such that $(M, g')$ also is globally hyperbolic and outside some compact set $K \subset M$, we have $g' = g$, $V = 0$, and $J = 0$. Let $\Sigma_t$ be a foliation of $(M, g)$ by Cauchy surfaces. Choose $t_1$ such that $\Sigma_{t_1}$ does not intersect the causal future of $K$ and choose $t_0 < t_1$, so that $\Sigma_{t_0} \subset I^-(\Sigma_{t_1})$. (It follows automatically that $\Sigma_{t_0}$ and $\Sigma_{t_1}$ are also Cauchy surfaces for $(M, g')$.) Let $\psi$ be a smooth function on $M$ such that $\psi(x) = 0$ for $x$ in the future of $\Sigma_{t_1}$ and $\psi(x) = 1$ for $x$ in the past of $\Sigma_{t_0}$. The action of $\tau^{\rm ret}$ on $\varphi_{(g',V,J)}$ may now be defined as follows: let $f$ be a test function on $M$. We define
$$F = \Delta_{(g',V)} f \tag{93}$$
and we define
$$f' = P'(\psi F) \tag{94}$$
with $P' = \nabla'^a\nabla'_a - \xi R' - m^2 - V$ and with $\Delta_{(g',V)}$ being the advanced minus retarded Green’s function associated with $P'$. Then $f'$ is a test function with support lying between $\Sigma_{t_0}$ and $\Sigma_{t_1}$. Furthermore, it is easily checked that $\Delta_{(g',V)}(f - f') = 0$, which implies that $f - f' = P'h$ where $h \equiv \Delta^{\rm adv}_{(g',V)}(f - f')$ is of compact support [20], and hence is a test function. Consequently, by Eq. (92), we have
$$\varphi_{(g',V,J)}(f) = \varphi_{(g',V,J)}(f') + \Big(\int hJ\Big)\,\mathbf{1}. \tag{95}$$
Since the supports of $\Delta^{\rm adv}_{(g',V)} f'$ and $J$ do not overlap, and since $\Delta^{\rm adv}_{(g',V)}(x, y) = \Delta^{\rm ret}_{(g',V)}(y, x)$ (up to factors of $\epsilon$) we may re-write this equation as
$$\varphi_{(g',V,J)}(f) = \varphi_{(g',V,J)}(f') + \Big(\int f\,\Delta^{\rm ret}_{(g',V)} J\Big)\,\mathbf{1}. \tag{96}$$
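The two “easily checked” steps in this construction can be spelled out. The sketch below drops the subscript $(g',V)$, writes $\Delta = \Delta^{\rm adv} - \Delta^{\rm ret}$ as in the text, and assumes the convention (suggested by the support statement above) that $\Delta^{\rm adv}u$ is supported in the causal past of $u$, so that $\Delta^{\rm adv}P'u = u$ when the support of $u$ is bounded to the future and $\Delta^{\rm ret}P'u = u$ when it is bounded to the past.

```latex
\begin{align*}
f' &= P'(\psi F) = -P'\big((1-\psi)F\big)
   && (P'F = 0),\\
\Delta^{\rm adv}f' &= \psi F, \qquad
\Delta^{\rm ret}f' = -(1-\psi)F,\\
\Delta f' &= \Delta^{\rm adv}f' - \Delta^{\rm ret}f' = F = \Delta f
   \;\Longrightarrow\; \Delta(f - f') = 0,\\
\int hJ &= \int \big(\Delta^{\rm adv}f\big)J - \int \big(\Delta^{\rm adv}f'\big)J
   = \int \big(\Delta^{\rm adv}f\big)J
   = \int f\,\Delta^{\rm ret}J
   && (\text{up to factors of } \epsilon) .
\end{align*}
```

The term with $\Delta^{\rm adv}f'$ drops out because $f'$ is supported between $\Sigma_{t_0}$ and $\Sigma_{t_1}$, to the past of the support of $J$.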
All of the above equations are relations in the algebra $\mathcal{W}(M, g', V, J)$. We want to define the map $\tau^{\rm ret}: \mathcal{W}(M, g', V, J) \to \mathcal{W}(M, g)$ to be such that it maps
$\varphi_{(g',V,J)}(f)$ to $\varphi_g(f)$ when the support of $f$ lies outside the future of $K$ and to be such that $\tau^{\rm ret}(\mathbf{1}) = \mathbf{1}$. Therefore, for arbitrary $f$, we define
$$\tau^{\rm ret}[\varphi_{(g',V,J)}(f)] = \varphi_g(f') + \Big(\int f\,\Delta^{\rm ret}_{(g',V)} J\Big)\,\mathbf{1}. \tag{97}$$
The above formula for the action of $\tau^{\rm ret}$ on $\varphi_{(g',V,J)}$ can be rewritten in a more useful form as follows. Let $S: \mathcal{D}(M) \to \mathcal{D}'(M)$ be the map defined by its distributional kernel,
$$S(f_1, f_2) = \int_{\Sigma_-} (F_1\nabla_a F_2 - F_2\nabla_a F_1)\, d\sigma^a, \tag{98}$$
where $F_1 = \Delta_g f_1$, $F_2 = \Delta_{(g',V)} f_2$, and $\Sigma_-$ is any Cauchy surface that does not intersect the future of $K$. Define $A^{\rm ret}: \mathcal{D}(M) \to \mathcal{D}'(M)$ by
$$A^{\rm ret} f = -P(\psi S f). \tag{99}$$
Then we have
$$\tau^{\rm ret}[\varphi_{(g',V,J)}(f)] = \varphi_g(A^{\rm ret} f) + \Big(\int f\,\Delta^{\rm ret}_{(g',V)} J\Big)\,\mathbf{1}. \tag{100}$$
The map $\tau^{\rm ret}$ can be uniquely extended to all of $\mathcal{A}(M, g', V, J)$ as a *-isomorphism and, thereby, to $W_n(u)$ when $u$ is smooth. The further extension of $\tau^{\rm ret}$ to $W_n(u)$ for $u$ a distribution in the space Eq. (12) then can be accomplished as in [13, Lemma 4.1], with the trivial modifications resulting from the presence of $V$ and the straightforward modifications resulting from the presence of $J$.
Our Principle of Perturbative Agreement now leads to the following requirement:
T11 Quadratic interaction Lagrangians. Let $(M, g)$ and $(M, g')$ be globally hyperbolic spacetimes such that $g' = g$ outside of a compact set $K$. Let $V$ and $J$ have support in $K$. Then for all $\Phi_i \in \mathcal{V}_{\rm class}$ we have
$$\tau^{\rm ret}\Big[T_{(g',V,J)}\Big(\prod_{i=1}^m f_i\Phi_i\Big)\Big] = \sum_n \frac{i^n}{n!}\, R_g\Big(\prod_{i=1}^m f_i\Phi_i;\, \underbrace{L_1\cdots L_1}_{n\ \text{factors}}\Big), \tag{101}$$
where $L_1$ is the interaction Lagrangian of the form Eq. (86) given by $L_1 = L_0' - L_0$.
In order to express this condition in a more explicit and useful form, we consider separately the subcases of (a) external source variation, (b) metric variation, and (c) external potential variation. These will lead to sub-requirements T11a, T11b, and T11c.
4.2. External source variation: Axiom T11a
We now apply condition T11 to the case of an interaction Lagrangian of the form $L_1 = J\varphi$, where $J$ is a compactly supported smooth function (“external current”). In this case, condition T11 reduces to
$$\tau^{\rm ret}\Big[T_J\Big(\prod_{i=1}^m f_i\Phi_i\Big)\Big] = \sum_n \frac{i^n}{n!}\, R_{J=0}\Big(\prod_{i=1}^m f_i\Phi_i;\, \underbrace{J\varphi\cdots J\varphi}_{n\ \text{factors}}\Big). \tag{102}$$
However, if only an external current is present, Eq. (97) becomes simply
$$\tau^{\rm ret}[\varphi_J(f)] = \varphi_{J=0}(f) - \Big(\int f\,\Delta^{\rm ret} J\Big)\,\mathbf{1}. \tag{103}$$
More generally, for $A_i \in \mathcal{F}_{\rm class}$, we have
$$\tau^{\rm ret}\Big[T_J\Big(\prod_i A_i(\varphi)\Big)\Big] = T_{J=0}\Big(\prod_i A_i(\varphi - \Delta^{\rm ret}J)\Big), \tag{104}$$
which can be proved by induction on the number of factors of the time-ordered product, making use of the fact that the c-number coefficients in the Wick expansion of $T_J$ are independent of $J$. Expanding Eq. (102) to first order in $J$ and using Eq. (104), we obtain
T11a Free field factor. We have
$$R\Big(\prod_j f_j\Phi_j;\, J\varphi\Big) = i\sum_j T\Big(f_1\Phi_1\cdots(\Delta^{\rm ret}J)\frac{\delta(f_j\Phi_j)}{\delta\varphi}\cdots f_n\Phi_n\Big). \tag{105}$$
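The passage from Eqs. (102) and (104) to condition T11a amounts to comparing the terms of first order in $J$. In the sketch below, $T$ and $R$ stand for $T_{J=0}$ and $R_{J=0}$, and the zeroth-order term of the retarded series is taken to be the time-ordered product itself, as follows from Eq. (87) at $B = 0$.

```latex
\begin{align*}
\tau^{\rm ret}\Big[T_J\Big(\prod_j f_j\Phi_j\Big)\Big]
  &= T\Big(\prod_j f_j\Phi_j\Big)
   + i\,R\Big(\prod_j f_j\Phi_j;\, J\varphi\Big) + O(J^2)
  && \text{(from (102))},\\
\tau^{\rm ret}\Big[T_J\Big(\prod_j f_j\Phi_j\Big)\Big]
  &= T\Big(\prod_j f_j\Phi_j\Big)
   - \sum_j T\Big(f_1\Phi_1\cdots(\Delta^{\rm ret}J)
       \frac{\delta(f_j\Phi_j)}{\delta\varphi}\cdots f_n\Phi_n\Big) + O(J^2)
  && \text{(from (104))} .
\end{align*}
% Equating the O(J) terms: i R = -\sum_j T(\dots), i.e. R = i \sum_j T(\dots),
% which is the relation stated in condition T11a.
```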
Remarks.
(1) Condition T11a corresponds to condition N4 of [7] in the context of QED in flat spacetime.
(2) We can use Eq. (89) to re-write Eq. (105) purely in terms of time-ordered products as
$$T\Big(J\varphi\prod_j f_j\Phi_j\Big) = \varphi(J)\,T\Big(\prod_j f_j\Phi_j\Big) + i\sum_j T\Big(f_1\Phi_1\cdots(\Delta^{\rm ret}J)\frac{\delta(f_j\Phi_j)}{\delta\varphi}\cdots f_n\Phi_n\Big). \tag{106}$$
Hence, condition T11a implies that a time-ordered product with $n + 1$ factors such that at least one of the factors is $\varphi$ can be expressed in terms of time-ordered products with fewer factors. Using this fact, one may show that condition T11a implies all of the relations obtained by expanding Eq. (102) to any order in $J$. Thus, condition T11a is equivalent to Eq. (102) and contains the full content of condition T11 in the case of an external current.
(3) The requirement T11a could also have been formulated in terms of advanced products by replacing “R” by “A” on the left side of Eq. (105) and by replacing the retarded propagator $\Delta^{\rm ret}$ by the advanced propagator $\Delta^{\rm adv}$ on the right side. It is
not difficult to show (using the commutator property T9) that this would yield an equivalent requirement. (4) If we choose $J = (\nabla^a\nabla_a - m^2 - \xi R)h$ and use the Leibniz rule T10 as well as $(\nabla^a\nabla_a - m^2 - \xi R)\Delta^{\rm ret} = \delta$, Eq. (106) yields
\[ T\Big( h(\nabla^a\nabla_a - m^2 - \xi R)\varphi \prod_i f_i\Phi_i \Big) = i \sum_i T\Big( f_1\Phi_1 \cdots h\, \frac{\delta(f_i\Phi_i)}{\delta\varphi} \cdots f_n\Phi_n \Big). \tag{107} \]
In Theorem 5.3 in the next section, we will see that Eq. (107) is necessary and sufficient for the interacting field to satisfy the interacting equations of motion. (5) In the simple case of two free-field factors, Eq. (106) reduces (in unsmeared form) to simply
\[ T(\varphi(x)\varphi(y)) = \varphi(x)\varphi(y) + i\Delta^{\rm ret}(x, y)\, \mathbb{1}. \tag{108} \]
This agrees with Eq. (7), which was deduced from axioms T1–T9. Similarly, condition T11a directly yields
\[ T(P\varphi(x)\varphi(y)) = i\delta(x, y)\, \mathbb{1}, \tag{109} \]
which agrees with Eq. (58), which was deduced from the Leibniz rule. These agreements are comforting, but they illustrate that many nontrivial consistency checks will arise when we attempt to impose condition T11a along with T1–T10. A proof that T1–T10, T11a, and T11b (see below) can all be consistently imposed will be given in Sec. 6.
4.3. Metric variation: Axiom T11b
We now apply condition T11 to the case where the interaction Lagrangian corresponds to a variation in the spacetime metric, i.e., $L_1 = L_0(g') - L_0(g)$, where g and g′ are both globally hyperbolic and differ only in a compact subset K. In this case, condition T11 becomes
\[ \tau^{\rm ret}\Big[ T_{g'}\Big( \prod_{i=1}^m f_i\Phi_i \Big) \Big] = \sum_n \frac{i^n}{n!}\, R_g\Big( \prod_{i=1}^m f_i\Phi_i ;\ \underbrace{L_1 \cdots L_1}_{n\ \text{factors}} \Big). \tag{110} \]
As in the case of an external current considered in the previous subsection, it is useful to pass to an infinitesimal version of this equation. To accomplish this, we introduce a smooth 1-parameter family of metrics $g^{(s)}$ differing from $g = g^{(0)}$ only within K. To first order in s, the interaction Lagrangian density is then given by $L_1 = (s/2)\, h_{ab} T^{ab}$, where $h_{ab} = \frac{\partial}{\partial s} g^{(s)}_{ab}$, and where Tab is the stress-energy tensor (6). For all fi Φi ∈ Fclass we define
\[ \delta_g^{\rm ret}\Big[ T_g\Big( \prod_i f_i\Phi_i \Big) \Big] = \frac{\partial}{\partial s}\, \tau^{\rm ret}_{g^{(s)}}\Big[ T_{g^{(s)}}\Big( \prod_i f_i\Phi_i \Big) \Big] \Big|_{s=0}. \tag{111} \]
In Appendix A, we show that the right side of this equation exists as a well-defined element of W(M, g). By differentiating Eq. (110) with respect to s and setting
s = 0, we obtain the following infinitesimal version of condition T11 in the case of metric variations:
T11b Stress-energy factor: Let $T_{g^{(s)}}$ be the 1-parameter family of time-ordered products associated with a smooth 1-parameter family of globally hyperbolic metrics $g^{(s)}$ on M that vary only within some compact subset K, and such that $g \equiv g^{(0)}$. Then we require that for all fi Φi ∈ Fclass,
\[ \delta_g^{\rm ret}\Big[ T_g\Big( \prod_i f_i\Phi_i \Big) \Big] = \frac{i}{2}\, R_g\Big( \prod_i f_i\Phi_i ;\ h_{ab} T^{ab} \Big) + \sum_i T_g\Big( f_1\Phi_1 \cdots h_{ab}\, \frac{\delta(f_i\Phi_i)}{\delta g_{ab}} \cdots f_n\Phi_n \Big), \tag{112} \]
where hab is the compactly supported tensor field given by
\[ h_{ab} = \frac{\partial}{\partial s}\, g^{(s)}_{ab} \Big|_{s=0}, \tag{113} \]
and the functional derivative, δA/δgab, of a classical functional A ∈ Fclass with respect to the metric, g, is defined in Appendix B and is explicitly given by the formula
\[ \frac{\delta A}{\delta g_{ab}} = \sum_r (-1)^r\, \mathring{\nabla}_{(c_1} \cdots \mathring{\nabla}_{c_r)}\, \frac{\partial A}{\partial\big( \mathring{\nabla}_{(c_1} \cdots \mathring{\nabla}_{c_r)}\, g_{ab} \big)}. \tag{114} \]
In this formula, $\mathring{\nabla}_a$ is an arbitrary fixed, background derivative operator, and it is understood that we have re-written the dependence of A on $\nabla_a$ and the curvature in terms of $\mathring{\nabla}_a$ and $\mathring{\nabla}_a$-derivatives of g. Note that when none of the functionals fi Φi explicitly depend upon the metric (including dependence on $\nabla_a$ or curvature terms), then the term in the second line of Eq. (112) is absent.
Remarks. (1) We are not aware of condition T11b having been proposed previously. (2) Condition T11b represents only the “first order” part of the identity (110), so one might wonder if one would get any new requirements by expanding Eq. (110) to higher orders. However, it can be checked by an explicit calculation that this is not the case, i.e., that all of the higher order relations implicit in Eq. (110) already follow from the first order condition stated as T11b. This is not surprising since condition T11b is required to hold for metric variations about all (globally hyperbolic) spacetimes. (3) Using Eq. (89) we can re-write condition T11b purely in terms of time-ordered products as
\[ \delta_g^{\rm ret}\Big[ T_g\Big( \prod_i f_i\Phi_i \Big) \Big] = \frac{i}{2}\, T_g\Big( h_{ab} T^{ab} \prod_i f_i\Phi_i \Big) - \frac{i}{2}\, T^{ab}(h_{ab})\, T_g\Big( \prod_i f_i\Phi_i \Big) + \sum_i T_g\Big( f_1\Phi_1 \cdots h_{ab}\, \frac{\delta(f_i\Phi_i)}{\delta g_{ab}} \cdots f_n\Phi_n \Big). \tag{115} \]
(4) In Euclidean field theory for the action (2) on a complete Riemannian manifold there will, in general, be a unique Green’s function for the (now elliptic) operator P. Hence, there would be no distinction between retarded and advanced variations, nor between retarded and advanced products. There also would be a unique, preferred vacuum state, $\langle\,\cdot\,\rangle_0$. The Euclidean version of condition T11b would be:
\[ \delta_g \langle \Phi_1(f_1) \cdots \Phi_n(f_n) \rangle_0 = \frac{1}{2}\, \langle \Phi_1(f_1) \cdots \Phi_n(f_n)\, T^{ab}(h_{ab}) \rangle_0 + \sum_i \Big\langle \Phi_1(f_1) \cdots \frac{\delta(f_i\Phi_i)}{\delta g_{ab}}(h_{ab}) \cdots \Phi_n(f_n) \Big\rangle_0 . \tag{116} \]
This corresponds to the formula that one would formally obtain by assuming that the correlation functions $\langle \Phi_1(f_1) \cdots \Phi_n(f_n) \rangle_0$ can be defined by a path integral
\[ \langle \Phi_1(f_1) \cdots \Phi_n(f_n) \rangle_0 = \int [D\varphi]\, \Phi_1(f_1) \cdots \Phi_n(f_n)\, e^{-S(\varphi, g)}. \tag{117} \]
Of course, in curved spacetime, there is no direct relationship between the formulations of the Euclidean and Lorentzian versions of quantum field theory, since a (non-static) Lorentzian spacetime will not, in general, be a real section of a complex analytic manifold with complex analytic metric that also admits a real Riemannian section. Nevertheless, we may view condition T11b as a mathematically precise formulation, applicable in the Lorentzian case, of a relation that can be formally derived from the Euclidean path integral. (5) The requirement T11b could also have been formulated in terms of advanced variations and advanced products by replacing $\delta_g^{\rm ret}$ by $\delta_g^{\rm adv}$ on the left side of Eq. (112) and replacing “R” by “A” on the right side of that equation, i.e.,
\[ \delta_g^{\rm adv}\Big[ T_g\Big( \prod_i f_i\Phi_i \Big) \Big] = \frac{i}{2}\, A_g\Big( \prod_i f_i\Phi_i ;\ h_{ab} T^{ab} \Big) + \sum_i T_g\Big( f_1\Phi_1 \cdots h_{ab}\, \frac{\delta(f_i\Phi_i)}{\delta g_{ab}} \cdots f_n\Phi_n \Big). \tag{118} \]
However, this formulation of condition T11b can be seen to be equivalent to Eq. (112) as follows. If $g^{(s)}$ is the 1-parameter family of metrics appearing in Eq. (112), then
\[ \beta_s \equiv \tau^{\rm adv}_{g^{(s)}} \circ \big( \tau^{\rm ret}_{g^{(s)}} \big)^{-1}, \tag{119} \]
is an automorphism of W(M, g) for all s with the property that β0 = id. It was proven in [6], that, for all a ∈ W, we have
\[ \frac{\partial}{\partial s}\, \beta_s(a) \Big|_{s=0} = \frac{i}{2}\, [T^{ab}(h_{ab}), a], \tag{120} \]
where $T^{ab}$ is the stress-energy tensor.¹⁹ In particular, for any time-ordered product, we have²⁰
\[ \delta^{\rm adv}(T) - \delta^{\rm ret}(T) = \frac{i}{2}\, [T^{ab}(h_{ab}), T]. \tag{121} \]
Equivalence of Eqs. (112) and (118) then follows from this equation and the definitions of advanced and retarded products.
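The step from Eq. (121) to the equivalence of Eqs. (112) and (118) can be sketched as follows. Comparing Eqs. (112) and (115) gives the retarded product with one extra factor in terms of time-ordered products. The corresponding expression for the advanced product, with the opposite operator ordering, is assumed here for illustration, since the definition of advanced products is not restated in this section. The shorthands $\mathbf{T}$ and $B$ are likewise introduced only for this sketch.

```latex
% Shorthand (hypothetical, for this sketch only):
%   \mathbf{T} = T_g(\prod_i f_i\Phi_i),   B = T^{ab}(h_{ab}).
\begin{align*}
R_g\Big(\prod_i f_i\Phi_i ;\, h_{ab}T^{ab}\Big)
  &= T_g\Big(h_{ab}T^{ab}\prod_i f_i\Phi_i\Big) - B\,\mathbf{T}
  && \text{(from Eqs. (112) and (115))} \\
A_g\Big(\prod_i f_i\Phi_i ;\, h_{ab}T^{ab}\Big)
  &= T_g\Big(h_{ab}T^{ab}\prod_i f_i\Phi_i\Big) - \mathbf{T}\,B
  && \text{(assumed form)} \\
\delta_g^{\rm adv}\mathbf{T} - \delta_g^{\rm ret}\mathbf{T}
  &= \frac{i}{2}\,\big(A_g - R_g\big)
   = \frac{i}{2}\,\big(B\,\mathbf{T} - \mathbf{T}\,B\big)
   = \frac{i}{2}\,[T^{ab}(h_{ab}),\, \mathbf{T}]
  && \text{(Eq. (118) minus Eq. (112))}
\end{align*}
```

The functional-derivative terms are identical in Eqs. (112) and (118) and cancel in the difference, so under these assumptions the two formulations differ by exactly the commutator in Eq. (121).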
4.4. External potential variation
Finally, for completeness, we state the infinitesimal version of condition T11 for the case of a variation of the external potential:
T11c ϕ² factor. Let (M, g) be globally hyperbolic and let $V^{(s)}$ be a smooth one-parameter family of smooth functions which vary only in a fixed compact set K. Write $V = V^{(0)}$ and write $U = (\partial V^{(s)}/\partial s)|_{s=0}$. Then we require that for all fi Φi ∈ Fclass,
\[ \delta_V^{\rm ret}\Big[ T_{(g,V)}\Big( \prod_i f_i\Phi_i \Big) \Big] = \frac{i}{2}\, R_{(g,V)}\Big( \prod_i f_i\Phi_i ;\ U\varphi^2 \Big) + \sum_i T_{(g,V)}\Big( f_1\Phi_1 \cdots U\, \frac{\delta(f_i\Phi_i)}{\delta V} \cdots f_n\Phi_n \Big). \tag{122} \]
Remarks. (1) In writing condition T11c, we have generalized the definition of $\tau^{\rm ret}$ in the obvious way so that it now maps W(M, g′, V′, J′) to W(M, g, V). The second term on the right side of Eq. (122) is present in this formula because, as mentioned above, in the construction of the theory with an external potential, elements of Vclass are allowed to depend explicitly upon V. (2) For the most part, condition T11c imposes restrictions on the definition of time-ordered products only if one has defined the exact theory in an arbitrary external potential. However, even if one considers only the theory defined by the action (2) (which does not include an external potential), condition T11c does impose a restriction on the definition of time-ordered products in the case where U is constant on the union of the supports of the fi. For simplicity, we shall not consider this or any other consequences of condition T11c in the remainder of this paper. However, it should be straightforward to generalize the proof of Sec. 6 to show that condition T11c can be consistently imposed for quantum field theory in curved spacetime with an arbitrary external potential.
19 Equation (120) holds for any valid prescription for defining Wick powers satisfying T1–T10, since the ambiguity in Tab is proportional to $\mathbb{1}$.
20 Note that the advanced and retarded variations are only defined on local covariant field quantities such as the time-ordered products. By contrast, Eq. (120) holds for an arbitrary element a ∈ W.
(3) As mentioned at the beginning of this section, if we considered a complex scalar field, then we could also have a term in L1 of the form $iA^a(\bar{\varphi}\nabla_a\varphi - \varphi\nabla_a\bar{\varphi})$, corresponding to the presence of an external electromagnetic field. Our Principle of Perturbative Agreement would then lead to a corresponding additional condition (“T11d”) on time-ordered products. We expect that this additional condition can also be consistently imposed (in addition to the analogs of all of our other conditions) for a complex scalar field. We also expect that the imposition of this condition will imply current conservation for the free and interacting fields in analogy with Theorems 5.1 and 5.3 below.
5. Some Key Consequences of Our New Requirements
In this section, we will derive some important consequences of our new requirements for both the free field theory and the interacting field theory. The demonstration that our requirements can, in fact, be imposed (except in two spacetime dimensions) will be given in Sec. 6.
5.1. Consequences for the free field
In this subsection, we will derive some key consequences of condition T11b for the stress-energy tensor Tab associated with the free field Lagrangian L0.
Theorem 5.1. Suppose that the prescription for defining Wick products and time-ordered products satisfies conditions T1–T10 together with condition T11b. Then the stress tensor Tab defined via that prescription is automatically conserved
\[ \nabla_a T^{ab}(x) = 0. \tag{123} \]
More generally, we have the following free field “Ward identity” for Tab:
\[ 0 = T\Big( (\nabla^a \xi^b)\, T_{ab} \prod_{i=1}^n f_i\Phi_i \Big) + i \sum_i T\Big( f_1\Phi_1 \cdots \frac{\delta(f_i\Phi_i)}{\delta\varphi}\, \pounds_\xi\varphi \cdots f_n\Phi_n \Big). \tag{124} \]
Proof. We first show that the divergence of the stress tensor is a c-number, i.e., proportional to the identity operator. Let F be a density of compact support, and let $\xi^a$ be a compactly supported vector field. Then $(\nabla_a T^{ab})(\xi_b) = -(1/2)\, T_{ab}(\pounds_\xi g^{ab})$, and the commutator property, T9, yields
\[ [T^{ab}(\pounds_\xi g_{ab}), \varphi(F)] = i\, T\Big( (\Delta F)\, \frac{\delta}{\delta\varphi} \big[ (\pounds_\xi g_{ab})\, T^{ab} \big] \Big). \tag{125} \]
We want to show that the right side of this equation is, in fact, equal to 0. To see this, we write $(\pounds_\xi g_{ab})\, T^{ab} = 2(\pounds_\xi g_{ab})\, \delta L_0/\delta g_{ab}$ and use the fact, proven in
Appendix B, that functional derivatives of L0 with respect to ϕ and g commute modulo exact forms in the sense of Eq. (268). We therefore obtain
\[ \begin{aligned} (\Delta F)\, \frac{\delta}{\delta\varphi} \big[ (\pounds_\xi g_{ab})\, T^{ab} \big] &= 2(\pounds_\xi g_{ab})\, \frac{\delta}{\delta g_{ab}} \Big[ (\Delta F)\, \frac{\delta L_0}{\delta\varphi} \Big] + dB_0 \\ &= 2(\pounds_\xi g_{ab})\, \frac{\delta}{\delta g_{ab}} \big[ (\Delta F)\, P\varphi \big] + dB_0, \end{aligned} \tag{126} \]
where in this equation ∆F is viewed as being evaluated at the metric g about which the variations are being taken (i.e., the δ/δgab does not act on ∆). However, we have
\[ \begin{aligned} (\pounds_\xi g_{ab})\, \frac{\delta}{\delta g_{ab}} \big[ (\Delta F)\, P\varphi \big] &= \pounds_\xi \big[ (\Delta F)\, P\varphi \big] - \big[ \pounds_\xi(\Delta F) \big] P\varphi - (\Delta F)\, P(\pounds_\xi\varphi) \\ &= -\big[ \pounds_\xi(\Delta F) \big] P\varphi - \big[ P(\Delta F) \big] \pounds_\xi\varphi + dB_1 \\ &= -\big[ \pounds_\xi(\Delta F) \big] P\varphi + dB_1, \end{aligned} \tag{127} \]
where in the second line we used the facts that $\pounds_\xi$ applied to any D-form is exact and that P is self-adjoint, whereas in the last line we used the fact that P(∆F) = 0. Using the Leibniz rule, T10, we see that the right side of Eq. (125) is equal to $-2i\varphi[P(\pounds_\xi \Delta F)]$, which indeed vanishes since the quantum field ϕ satisfies the Klein–Gordon equation. Thus, $T^{ab}(\pounds_\xi g_{ab})$ commutes with ϕ(F) for all compactly supported F. By [13, Proposition 2.1], every element of W with this property has to be proportional to $\mathbb{1}$. Since $\nabla^a T_{ab}$ is locally and covariantly constructed out of the metric by T1, we must therefore have that
\[ T^{ab}(\pounds_\xi g_{ab}) = \Big( \int_M C_a \xi^a \Big) \cdot \mathbb{1} \tag{128} \]
for some local curvature term $C_a$. Furthermore, by our scaling axiom T2, $C_a$ must be a polynomial in the Riemann tensor and its covariant derivatives of dimension length$^{-D-1}$. We will now show that if condition T11b holds, then $C_a$, in fact, has to vanish. To prove this, we consider the retarded variation of the local covariant field $T^{ab}(\pounds_\xi g_{ab})$ with the metric variation taken to be of the “pure gauge” form $h_{ab} = \pounds_\eta g_{ab}$, where $\eta^a$ is a compactly supported vector field. Condition T11b in the simple case of only one factor $f_1\Phi_1 = (\pounds_\xi g_{ab})\, T^{ab}$ yields
\[ \delta^{\rm ret}\big( T^{ab}(\pounds_\xi g_{ab}) \big) = \frac{i}{2}\, R\big( (\pounds_\xi g_{ab})\, T^{ab} ;\ (\pounds_\eta g_{cd})\, T^{cd} \big) + T\Big( (\pounds_\eta g_{cd})\, \frac{\delta}{\delta g_{cd}} \big[ (\pounds_\xi g_{ab})\, T^{ab} \big] \Big). \tag{129} \]
By Eq. (128), the left side of this equation is just
\[ \delta^{\rm ret}\big( T^{ab}(\pounds_\xi g_{ab}) \big) = \Big( \int_M \xi^a\, \pounds_\eta(C_a) \Big) \cdot \mathbb{1} = -\Big( \int_M [\eta, \xi]^a C_a \Big) \cdot \mathbb{1}, \tag{130} \]
where a partial integration was done in the last step. Now subtract from Eq. (129) the same equation with $\eta^a$ and $\xi^a$ interchanged. We obtain
\[ \begin{aligned} -2 \Big( \int_M [\eta, \xi]^a C_a \Big) \cdot \mathbb{1} ={}& \frac{i}{2}\, R\big( (\pounds_\xi g_{ab})\, T^{ab} ;\ (\pounds_\eta g_{cd})\, T^{cd} \big) - \frac{i}{2}\, R\big( (\pounds_\eta g_{cd})\, T^{cd} ;\ (\pounds_\xi g_{ab})\, T^{ab} \big) \\ &+ 2\, T\Big( (\pounds_\eta g_{cd})\, \frac{\delta}{\delta g_{cd}} \Big[ (\pounds_\xi g_{ab})\, \frac{\delta L_0}{\delta g_{ab}} \Big] - (\pounds_\xi g_{cd})\, \frac{\delta}{\delta g_{cd}} \Big[ (\pounds_\eta g_{ab})\, \frac{\delta L_0}{\delta g_{ab}} \Big] \Big), \end{aligned} \tag{131} \]
where we again substituted $T^{ab} = 2\delta L_0/\delta g_{ab}$. The terms on the right side can be simplified as follows: For the retarded products, we use the identity R(f Φ, hΨ) − R(hΨ, f Φ) = [Ψ(h), Φ(f)], which holds for any Wick products Φ, Ψ. In the case at hand, Φ and Ψ are equal to the divergence of the stress tensor and therefore have a vanishing commutator by Eq. (128). Hence, there is no contribution from the terms in Eq. (131) involving retarded products. The last term on the right side can be simplified using the identity
\[ (D_\xi D_\eta - D_\eta D_\xi) A = D_{[\eta, \xi]} A + dC \tag{132} \]
holding for the “derivative operator”
\[ D_\xi A = (\pounds_\xi g_{ab})\, \frac{\delta A}{\delta g_{ab}} \tag{133} \]
on classical functionals A of the metric such as L0. A proof of this identity is given in Appendix B. Inserting this relation (with A = L0) and using again Eq. (128), we find that the last term in Eq. (131) is just $-\big( \int_M [\eta, \xi]^a C_a \big) \cdot \mathbb{1}$. Thus, we obtain
\[ \int_M C_a [\xi, \eta]^a = 0. \tag{134} \]
This equation holds for all smooth compactly supported vector fields $\xi^a$, $\eta^a$. Variation with respect to $\xi^a$ yields
\[ (\nabla_a \eta^b) C_b + (\nabla_b \eta^b) C_a + \eta^b \nabla_b C_a = 0 \tag{135} \]
for all $\eta^a$, at every point in M. Now focus on an arbitrary, but fixed point x ∈ M. At x, the quantities $\eta^a$ and $K_a{}^b = \nabla_a \eta^b$ can be chosen independently to be arbitrary tensors. Choosing first $K_a{}^b = 0$ and $\eta^a$ arbitrary, we conclude that $\nabla_b C_a = 0$ at x. Thus, at any x ∈ M we must have
\[ (K_a{}^b + K_c{}^c\, \delta_a{}^b)\, C_b = 0 \tag{136} \]
for all $K_a{}^b$, which is possible only when $C_b = 0$. This completes the proof of stress-tensor conservation, Eq. (123).
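The final step can be made explicit with one convenient choice of the tensor K in Eq. (136). The choice below (the identity tensor) is made here purely for illustration, and D denotes the spacetime dimension.

```latex
% Illustrative choice: K_a{}^b = \delta_a{}^b, whose trace is the spacetime dimension D.
\begin{align*}
K_a{}^b = \delta_a{}^b \quad &\Longrightarrow \quad K_c{}^c = D, \\
(K_a{}^b + K_c{}^c\,\delta_a{}^b)\,C_b = (1 + D)\,C_a = 0 \quad &\Longrightarrow \quad C_a = 0 .
\end{align*}
```

Since 1 + D is nonzero, this single choice already forces the curvature term to vanish at the point x.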
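To prove the more general Ward identity, Eq. (124), we again consider condition T11b for the case of a “pure gauge” metric variation $h_{ab} = \pounds_\xi g_{ab} = 2\nabla_{(a}\xi_{b)}$, but we now consider arbitrary factors of fi Φi. We obtain
\[ \delta^{\rm ret}\Big[ T\Big( \prod_{i=1}^n f_i\Phi_i \Big) \Big] = i\, R\Big( \prod_{i=1}^n f_i\Phi_i ;\ (\nabla^a \xi^b)\, T_{ab} \Big) + \sum_i T\Big( f_1\Phi_1 \cdots (\pounds_\xi g_{ab})\, \frac{\delta(f_i\Phi_i)}{\delta g_{ab}} \cdots f_n\Phi_n \Big). \tag{137} \]
But, for our “pure gauge” metric variation, we have²¹
\[ \begin{aligned} \delta_g^{\rm ret}\Big[ T_g\Big( \prod_i f_i\Phi_i \Big) \Big] &= \frac{\partial}{\partial s}\, T_g\big( (\chi_{s*} f_1)\Phi_1 \cdots (\chi_{s*} f_n)\Phi_n \big) \Big|_{s=0} \\ &= -\sum_i T_g\big( f_1\Phi_1 \cdots (\pounds_\xi f_i)\Phi_i \cdots f_n\Phi_n \big), \end{aligned} \tag{138} \]
since the time-ordered products are local, covariant fields. Moreover, writing the retarded product on the right side of Eq. (137) in terms of time-ordered products and using the relation $T_{ab}(\nabla^a \xi^b) = 0$ which we just proved above, we find
\[ \text{right side of Eq. (137)} = i\, T\Big( (\nabla^a \xi^b)\, T_{ab} \prod_{i=1}^n f_i\Phi_i \Big) + \sum_i T\Big( f_1\Phi_1 \cdots (\pounds_\xi g_{ab})\, \frac{\delta(f_i\Phi_i)}{\delta g_{ab}} \cdots f_n\Phi_n \Big). \tag{139} \]
We now use the fact (proven in Appendix B) that for any classical field D-form fi Φi ∈ Fclass, we have
\[ (\pounds_\xi f_i)\Phi_i + \frac{\delta(f_i\Phi_i)}{\delta g_{ab}}\, \pounds_\xi g_{ab} + \frac{\delta(f_i\Phi_i)}{\delta\varphi}\, \pounds_\xi\varphi = dH \tag{140} \]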
for some (D − 1)-form H that is locally constructed out of $g_{ab}$, ϕ, $f_i$, and $\xi^a$. Inserting this relation into Eqs. (138) and (139), and using T10, we get the desired relation Eq. (124).
Remarks. (1) There exist completely reasonable classical field theories for which Tab in the quantum field theory cannot be made divergence free within our axiom scheme. As we have seen in Sec. 3.2 above, one example of such a theory is the free scalar field in D = 2 spacetime dimensions. Hence Theorem 5.1 implies that T11b cannot be satisfied in addition to conditions T1–T10 for scalar field theory for D = 2. (2) There appear to be two independent possible obstructions to the imposition of the analog of condition T11b in a general free quantum field theory. First, there may exist algebraic identities (i.e., relations that do not involve the field equations) satisfied by the classical stress-energy tensor. Within our axiom scheme, these identities must be respected by the quantum stress-energy tensor, and, consequently, the freedom to modify the definition of Wick powers by arbitrary local curvature terms of the correct dimension does not translate into a similar ability to modify the definition of Tab so as to make it conserved, as is necessary for T11b to hold. As we have seen in Sec. 3.2 above, this occurs for the free scalar field in D = 2 spacetime dimensions. However, we will see in the course of our analysis in Sec. 6.2.6 below that, in principle, there can also exist a “cohomological obstruction” to the imposition of condition T11b. Although such a cohomological obstruction does not occur in scalar field theory, it can occur in parity violating theories, and it appears to be the cause of the failure of conservation of Tab for the theories described in [1] in D = 4k + 2 spacetime dimensions. (In these theories, one finds that $\nabla^a T_{ab} = A_b \mathbb{1}$, where $A_b$ (the “gravitational anomaly”) is a curvature polynomial that does not arise as the divergence of a symmetric curvature tensor, i.e., $A_b \neq \nabla^a A_{ab}$ for any symmetric $A_{ab}$.) Thus, the analog of condition T11b also cannot be satisfied in theories analyzed in [1], but the root cause of the failure of condition T11b for these theories appears to be different in nature from the root cause of the failure of condition T11b for a scalar field in D = 2 dimensions. (3) Note that our Ward identity, Eq. (124), is a relation between elements in the free field algebra W(M, g), rather than a relation between correlation functions, which is the more conventional way to express Ward identities.
21 Note that the Lie derivative of a tensor field $f = f_{a \cdots c}{}^{b \cdots d}$ is defined by $\pounds_\xi f = \frac{\partial}{\partial s} \chi_s^* f$, which in turn is equal to $-\frac{\partial}{\partial s} \chi_{s*} f$, since $\chi_s^* = (\chi_s^{-1})_* = \chi_{-s*}$.
A similar type of argument to that used in the first part of the above proof can be applied in the context of a conformally coupled massless field to yield the following nontrivial consistency (or “cocycle”) relation for the trace of the stress-energy tensor:
Theorem 5.2. Suppose that conditions T1–T10 and T11b are satisfied. Then, for the case of a massless, conformally coupled scalar field [i.e., m = 0, ξ = (D − 2)/4(D − 1)], we have $T^a{}_a(x) = C(x)\, \mathbb{1}$, where C(x) is a local curvature term of mass dimension D that satisfies the “cocycle condition”
\[ \int_M \big[ k(\delta_f C) - f(\delta_k C) \big] = 0, \tag{141} \]
for any smooth compactly supported functions f, k on M, where
\[ \delta_f C(g) = \frac{\partial}{\partial s}\, C(e^{sf} g) \Big|_{s=0} \tag{142} \]
denotes the infinitesimal variation of a curvature term under a change in the conformal factor. Proof. That the trace of the quantum stress tensor in the massless, conformally coupled case is proportional to the identity, T a a = C 11, follows, as in the proof of the previous theorem, from the fact that [T a a (f ), ϕ(F )] = 0, which is an immediate
consequence of the commutator property, T9, and the field equation for ϕ. By the scaling property T2, C must be a curvature polynomial of dimension length$^{-D}$. To show that T11b implies that this curvature polynomial satisfies the cocycle condition Eq. (141), we consider the retarded variation of the field $T^a{}_a(f)$ with respect to the metric perturbation $h_{ab} = k g_{ab}$, where k is some smooth, compactly supported function (i.e., we consider an infinitesimal change k in the conformal factor of the metric $g_{ab}$). Condition T11b in the simple case where only the single factor $f_1\Phi_1 = f\, T^a{}_a$ is present now yields
\[ \delta^{\rm ret}\big( T^a{}_a(f) \big) = \frac{i}{2}\, R\big( f\, T^a{}_a ;\ k\, T^b{}_b \big) + T\Big( (k g_{cd})\, \frac{\delta}{\delta g_{cd}} \big( f\, T^a{}_a \big) \Big). \tag{143} \]
For the left side we use the fact that $T^a{}_a(f) = \big( \int f C \big) \cdot \mathbb{1}$, so the retarded variation with respect to $h_{ab} = k g_{ab}$ yields $\big( \int f\, \delta_k(C) \big) \cdot \mathbb{1}$. Now antisymmetrize the above equation in k and f. We obtain
\[ \begin{aligned} \Big( \int_M \big[ k(\delta_f C) - f(\delta_k C) \big] \Big) \cdot \mathbb{1} ={}& \frac{i}{2}\, R\big( f\, T^a{}_a ;\ k\, T^b{}_b \big) - \frac{i}{2}\, R\big( k\, T^b{}_b ;\ f\, T^a{}_a \big) \\ &+ 2\, T\Big( (f g_{cd})\, \frac{\delta}{\delta g_{cd}} \Big[ (k g_{ab})\, \frac{\delta L_0}{\delta g_{ab}} \Big] - (k g_{ab})\, \frac{\delta}{\delta g_{ab}} \Big[ (f g_{cd})\, \frac{\delta L_0}{\delta g_{cd}} \Big] \Big). \end{aligned} \tag{144} \]
As in the proof of Theorem 5.1, the two retarded products on the right side combine to yield the commutator $(i/2)[T^a{}_a(f), T^b{}_b(k)]$, which in turn vanishes because the trace of the stress tensor is proportional to $\mathbb{1}$ (note that the retarded products individually might be nonvanishing). The last term on the right side also vanishes since taking the variation of L0 with respect to the conformal factors f and k clearly does not depend on the order in which they are taken. Consequently, the right side vanishes, and we obtain the desired cocycle property (141) for the trace of the stress tensor, C.
Remarks. (1) The cocycle condition on the conformal anomaly that we have derived here from axioms T1–T10 and T11b is the same condition as would be formally derived by assuming that the (expectation value of the) quantum stress-energy tensor can be calculated by taking the variation of some “effective action” with respect to the metric. Conditions of this nature are known in the literature under the name “Wess–Zumino consistency conditions” [21]. (2) The method of proof of the above theorem only relies upon properties T1–T10 and T11b, and therefore can be generalized to arbitrary field theories that satisfy suitable analogs of these conditions, and whose stress-energy tensor has a c-number trace. We shall consider such “non-perturbative” results elsewhere. (3) In odd spacetime dimensions, there simply are no scalar polynomials in the curvature of dimension length$^{-D}$, so there is no trace anomaly. By contrast, in even dimensions, there always exist curvature scalars C of dimension length$^{-D}$ satisfying
the cocycle condition. For example, any term scaling as $C \to \Omega^D C$ under a conformal transformation $g_{ab} \to \Omega^2 g_{ab}$ is a solution to the cocycle condition (141), so when D = 2k with k > 1, any monomial expression in the Weyl tensor which contains k factors of the Weyl tensor is a solution. In D = 4 spacetime dimensions, there are 3 linearly independent local, covariant scalars with dimension (length)$^{-4}$ which solve the cocycle condition,²² namely, $C_1 = C_{abcd} C^{abcd}$, $C_2 = R_{abcd} R^{abcd} - 4R_{ab} R^{ab} + R^2$ (the Euler density), and $C_3 = \nabla^a \nabla_a R$. Since there are 4 linearly independent local covariant curvature terms with that dimension (namely, $R^2$ in addition to the above 3 terms), this shows explicitly that the cocycle condition is nontrivial, and thus is potentially useful for restricting the form of the trace anomaly. Of course, the existence of solutions to the cocycle condition does not automatically imply that there is actually a trace anomaly, as the coefficients of these terms might be zero. However, for the massless, conformally coupled scalar field in D = 4 dimensions, the arguments given in Sec. 3.2 can be used to show that the coefficients of $C_1$ and $C_2$ must be nonvanishing for any prescription satisfying axioms T1–T10 and T11b. (The coefficient of $C_3$ can be set to zero using the renormalization freedom allowed by these axioms.) By the same type of arguments, it can presumably be established that C cannot be zero in any even dimension D > 2, although the calculations that must be carried out to show this rapidly become very complicated as the number of dimensions D of the spacetime increases. All solutions to the cocycle condition in D = 6 have been found in Ref. [3]. We are not aware of an efficient algorithm to determine the general solution to the cocycle condition in arbitrary dimensions, and this appears to be an interesting mathematical problem.
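The claim that conformally covariant terms solve the cocycle condition (141) can be checked in two lines. The sketch below assumes a term that scales homogeneously with some constant weight w under a rescaling of the metric; the symbol w is introduced here only for illustration and is not fixed to any particular value. Setting the conformal factor to $\Omega^2 = e^{sf}$ in the definition (142) gives:

```latex
% Assumption (for illustration): C(\Omega^2 g) = \Omega^{w} C(g) with constant weight w.
% With \Omega^2 = e^{sf}, i.e. \Omega = e^{sf/2}, Eq. (142) becomes:
\begin{align*}
\delta_f C(g) &= \frac{\partial}{\partial s}\, e^{\,s w f/2}\, C(g)\Big|_{s=0}
               = \frac{w}{2}\, f\, C(g), \\
k\,(\delta_f C) - f\,(\delta_k C) &= \frac{w}{2}\,(k f - f k)\, C = 0 .
\end{align*}
```

Under this assumption the integrand of Eq. (141) vanishes pointwise, whatever the value of w, so the integral vanishes for all test functions f and k.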
5.2. Consequences for interacting fields In this subsection, we will derive some important consequences of our requirements with regard to perturbatively defined interacting field theories. (Other consequences such as the existence of the renormalization group are derived in [15].) We will consider interacting theories described by a classical Lagrangian of the form L = L0 + L1
(145)
where L0 is given by Eq. (2) and where the interaction Lagrangian is of the form L1 =
1 κ i Φi . 2
(146)
The κi denote coupling parameters and each Φi ∈ Vclass is any polynomial in the field ϕ and its derivatives as well as the Riemann tensor and its derivatives.
22 If our axioms were weakened so as to require the locality and covariance condition T1 only for orientation preserving isometries, then the “parity violating” curvature term $C_4 = \epsilon^{abcd} R_{ab}{}^{pq} R_{pqcd}$ would be allowed and would also satisfy the cocycle condition.
In particular, we do not require that L1 be renormalizable. Associated with the Lagrangian L is the classical stress-energy tensor Θab given by Θab = 2−1
δL δL1 = T ab + 2−1 , δgab δgab
(147)
where Tab is the stress tensor (6) associated with the free Lagrangian L0. As reviewed in Sec. 4.1 above, if θ is a smooth function of compact support, interacting fields in the quantum field theory associated with the interaction Lagrangian θL1 can be defined in perturbation theory in terms of time-ordered products of the free theory by the Bogoliubov formula, Eq. (91). As shown in [5] and in [15, Sec. 3.1], one may always then take the limit as θ → 1 in a suitable way so as to (perturbatively) define the interacting theory with interaction Lagrangian L1. (No restrictions on the asymptotic properties of the spacetime (M, g) are needed in order to take this limit.) As explained in [15], the resulting interacting fields, denoted $\Phi_{L_1}(x)$, and interacting time-ordered products, denoted $T_{L_1}(\prod f_i\Phi_i)$, live (after smearing with smooth compactly supported test functions) in a suitable abstract algebra $\mathcal{B}_{L_1}(M, g)$ of formal power series of elements of W(M, g). The classical stress tensor $\Theta_{ab}$ is conserved when the classical field equations associated with the Lagrangian, L, hold for ϕ. However, it is a priori far from clear that the quantized interacting field operator $\varphi_{L_1}$ satisfies the interacting field equations, and, even if it does, it is far from clear that the interacting stress-energy operator $\Theta^{ab}_{L_1}$ is conserved. The following theorem, which constitutes one of the main results of this paper, establishes that if axioms T1–T10 hold, then condition T11a guarantees that $\varphi_{L_1}$ satisfies the interacting field equations, and condition T11b guarantees that the interacting stress-energy operator $\Theta^{ab}_{L_1}$ is conserved.
Theorem 5.3. Suppose that the prescription for defining time-ordered products in the free theory with Lagrangian L0 satisfies axioms T1–T10. Let L1 be any interaction Lagrangian of the form Eq. (146). Then the following properties hold for the interacting theory: (1) Let B be a (D − 1)-form on M depending polynomially on the classical field ϕ and its derivatives and the Riemann tensor and its derivatives. Then the map $\Phi_{L_1 + dB}(f) \to \Phi_{L_1}(f)$ defines an isomorphism
\[ \mathcal{B}_{L_1 + dB}(M, g) \cong \mathcal{B}_{L_1}(M, g), \tag{148} \]
i.e., the theory is unchanged if a total divergence is added to the Lagrangian. (2) The Leibniz rule holds for the interacting fields in the sense that ∇a [ΦL1 (x)] = (∇a Φ)L1 (x),
(149)
where the expression (∇a Φ) on the right denotes the field expression obtained by applying the Leibniz rule. More generally, the Leibniz rule also holds for the interacting time-ordered products.
(3) If, in addition to T1–T10, axiom T11a also holds, then the equations of motion are satisfied in the interacting theory, i.e.,
\[ (\nabla^a \nabla_a - m^2 - \xi R)\, \varphi_{L_1}(x) = -(\delta L_1/\delta\varphi)_{L_1}(x) \tag{150} \]
in the sense of distributions valued in $\mathcal{B}_{L_1}(M, g)$. Here, the variation on the right is the usual Euler–Lagrange type variation defined in Eq. (42) above. (4) If, in addition to T1–T10, axiom T11b also holds, then the interacting stress-energy tensor is conserved
\[ \nabla_a \Theta^{ab}_{L_1}(x) = 0. \tag{151} \]
More generally, for all Ji Ψi ∈ Fclass, and all vector fields ξ of compact support, the following interacting field Ward identity holds:
\[ 0 = T_{L_1}\Big( (\nabla^a \xi^b)\, \Theta_{ab} \prod_{i=1}^n J_i\Psi_i \Big) + i \sum_i T_{L_1}\Big( J_1\Psi_1 \cdots \frac{\delta(J_i\Psi_i)}{\delta\varphi}\, \pounds_\xi\varphi \cdots J_n\Psi_n \Big). \tag{152} \]
Proof. From the construction of the interacting fields given in [15, Sec. 3.1], it is clear that it suffices to prove the statements in the theorem for a cutoff interaction θL1, where θ is a smooth function of compact support which is equal to 1 in the spacetime region under consideration. Statements (1) and (2) of the theorem are seen to be an immediate consequence of the Leibniz rule T10 applied to the individual terms in the formula
\[ \Phi_{\theta L_1}(f) \equiv \sum_{n \geq 0} \frac{i^n}{n!}\, R\big( f\Phi ;\ (\theta L_1)^n \big), \tag{153} \]
where θ = 1 on the support of f. In order to prove statement (3), we must show that for any smooth function f of compact support, we have
\[ \varphi_{\theta L_1}\big( (\nabla^a \nabla_a - m^2 - \xi R) f \big) = -(\delta L_1/\delta\varphi)_{\theta L_1}(f), \tag{154} \]
where, again, θ = 1 on the support of f. In terms of retarded products, we need to show that
\[ 0 = R\big( f(\nabla^a \nabla_a - m^2 - \xi R)\varphi ;\ e^{i\theta L_1} \big) + R\Big( f\, \frac{\delta L_1}{\delta\varphi} ;\ e^{i\theta L_1} \Big). \tag{155} \]
However, using the definition of the retarded products in terms of time-ordered products [see Eq. (87)], and using the fact that θ ≡ 1 on the support of f, we see that Eq. (155) is equivalent to
\[ T\Big( f(\nabla^a \nabla_a - m^2 - \xi R)\varphi \prod^{n} \theta L_1 \Big) = i n\, T\Big( f\, \frac{\delta(\theta L_1)}{\delta\varphi} \prod^{n-1} \theta L_1 \Big) \tag{156} \]
for all natural numbers n. But this equation is equivalent to (107) above, which was previously shown to hold as a direct consequence of condition T11a. Thus, we have
succeeded in showing that the equations of motion (150) hold in the interacting theory. To prove Eq. (151) of statement (4), we must show that for any smooth vector field $\xi^a$ of compact support, we have
\[
(\Theta_{ab})_{\theta L_1}(\nabla^a\xi^b) = 0,
\tag{157}
\]
where θ = 1 on the support of $\xi^a$. In terms of retarded products, we need to show that
\[
0 = R\Big((\nabla^a\xi^b)T_{ab};\; e^{i\theta L_1}\Big) + 2R\Big((\nabla^a\xi^b)\frac{\delta L_1}{\delta g^{ab}};\; e^{i\theta L_1}\Big).
\tag{158}
\]
Equation (158) is seen to be equivalent to
\[
T\Big((\nabla^a\xi^b)T_{ab}\,\prod^{n}\theta L_1\Big) = in\,T\Big((\pounds_\xi g_{ab})\frac{\delta(\theta L_1)}{\delta g_{ab}}\,\prod^{n-1}\theta L_1\Big)
\tag{159}
\]
for all natural numbers n. Now, if we apply Eq. (140) to the case $f_i = \theta$, $\Phi_i = L_1$ and use the fact that θ = 1 on the support of $\xi^a$, we obtain
\[
\frac{\delta(\theta L_1)}{\delta g_{ab}}\,\pounds_\xi g_{ab} + \frac{\delta(\theta L_1)}{\delta\varphi}\,\pounds_\xi\varphi = dB.
\tag{160}
\]
Therefore, by the Leibniz rule T10, we can rewrite Eq. (159) in the equivalent form
\[
T\Big((\nabla^a\xi^b)T_{ab}\,\prod^{n}\theta L_1\Big) = -in\,T\Big(\frac{\delta(\theta L_1)}{\delta\varphi}\,\pounds_\xi\varphi\,\prod^{n-1}\theta L_1\Big).
\tag{161}
\]
But this equation holds as a consequence of the free field Ward identity Eq. (124), which was proven to hold when condition T11b is satisfied. Thus, we have shown that Eq. (151) holds, i.e., the interacting stress-energy is conserved in the interacting theory.

To prove the interacting Ward identity, Eq. (152), we will need the generalization of Eq. (124) for the case where $L_1$ may also depend upon an external source $J_{ab\cdots c}$,
\[
L_1 = L_1(g_{ab}, (\nabla)^k\varphi, J_{ab\cdots c}).
\tag{162}
\]
We assume the source to be a smooth compactly supported tensor field on M although certain distributional sources would also be admissible.23 The generalization of the conservation law Eq. (151) appropriate to this case is24
\[
(\Theta^{ab})_{L_1}(\pounds_\xi g_{ab}) + 2\Big(\frac{\delta L_1}{\delta J_{ab\cdots c}}\Big)_{L_1}(\pounds_\xi J_{ab\cdots c}) = 0
\tag{163}
\]
for any compactly supported test vector field $\xi^a$. In unsmeared form, this equation can be written as
\[
\begin{aligned}
\nabla_q\Big[(\Theta^{qr})_{L_1}(x) &+ J^{r}{}_{ab\cdots c}(x)\Big(\frac{\delta L_1}{\delta J_{qab\cdots c}}\Big)_{L_1}(x)
+ J_{a}{}^{r}{}_{b\cdots c}(x)\Big(\frac{\delta L_1}{\delta J_{aqb\cdots c}}\Big)_{L_1}(x) \\
&+ \cdots + J_{ab\cdots c}{}^{r}(x)\Big(\frac{\delta L_1}{\delta J_{ab\cdots cq}}\Big)_{L_1}(x)\Big]
= (\nabla^r J_{ab\cdots c})\Big(\frac{\delta L_1}{\delta J_{ab\cdots c}}\Big)_{L_1}(x).
\end{aligned}
\tag{164}
\]
A similar formula holds when there is any finite number of external sources.

23 It can be shown as a consequence of the microlocal spectrum condition that any distributional source with spacelike WF(J) would be admissible, i.e., lead to well defined interacting field expressions. For example, J given by the delta distribution supported on a timelike smooth submanifold S, $\delta_S(f) = \int_S f\, n\cdot\boldsymbol{\epsilon}$ (with $n^a$ the normal to S), is acceptable.
24 The factor of 2 in front of the second term arises because $\Theta^{ab}$ is twice the metric variation.

Now consider, for a given Φ, the m-parameter family of interaction Lagrangians $K_1 = L_1 + \sum_j \lambda_j J_j\Psi_j$. We differentiate the interacting field $\Phi_{K_1}$ with respect to the parameters $\lambda_i$, identifying at the same time $\Phi_{K_1}$ with an element in $B_{L_1}(M, g)$ via a suitably defined isomorphism $\tau^{\rm ret}: B_{K_1}(M, g) \to B_{L_1}(M, g)$ associated with the respective algebras of interacting fields. We thereby obtain the retarded products25 in the interacting field theory associated with $L_1$,
\[
\frac{\partial^m}{\partial\lambda_1\cdots\partial\lambda_m}\,\tau^{\rm ret}\Big[\Phi_{L_1+\sum_j\lambda_j J_j\Psi_j}(F)\Big]\Big|_{\lambda_i=0} = R_{L_1}\Big(F\Phi;\; \prod_j J_j\Psi_j\Big).
\tag{165}
\]

25 In a similar way, we could write the advanced products in the interacting field theory considering instead the corresponding *-isomorphism $\tau^{\rm adv}: B_{K_1}(M, g) \to B_{L_1}(M, g)$, but this would make no difference in the argument.

Now consider the special case in which Φ is the stress-energy tensor associated with the Lagrangian density $L_0 + K_1$, i.e., we choose
\[
\Phi^{ab} = \Theta^{ab} + 2\boldsymbol{\epsilon}^{-1}\sum_j \lambda_j\,\frac{\delta(J_j\Psi_j)}{\delta g_{ab}},
\tag{166}
\]
where $\Theta^{ab}$ is the stress-energy tensor Eq. (147) associated with $L_0 + L_1$. We also choose the “F” in Eq. (165) to be $F_{ab} = \nabla_a\xi_b$, where $\xi^a$ is a smooth, compactly supported vector field. We use formula (163) (with $L_1$ replaced by $K_1$) to calculate $\Phi^{ab}_{K_1}(F_{ab})$, and differentiate the resulting identity with respect to the parameters $\lambda_i$. This gives
\[
iR_{L_1}\Big((\nabla_a\xi_b)\Theta^{ab};\; \prod_j J_j\Psi_j\Big)
= \sum_j R_{L_1}\Big(\pounds_\xi g_{ab}\,\frac{\delta(J_j\Psi_j)}{\delta g_{ab}};\; \prod_{i\neq j} J_i\Psi_i\Big)
+ \sum_j R_{L_1}\Big(\Psi_j\,\pounds_\xi J_j;\; \prod_{i\neq j} J_i\Psi_i\Big).
\tag{167}
\]
We now again apply Eq. (140), this time with $f_i\Phi_i = J_j\Psi_j$, to obtain
\[
\frac{\delta(J_j\Psi_j)}{\delta g_{ab}}\,\pounds_\xi g_{ab} + \Psi_j\,\pounds_\xi J_j + \frac{\delta(J_j\Psi_j)}{\delta\varphi}\,\pounds_\xi\varphi = dB,
\tag{168}
\]
and we apply the Leibniz rule to the retarded products in (167) (which also holds for the interacting quantities, since these are expressible in terms of the time-ordered products in the free theory). Finally, we express the retarded products $R_{L_1}$ in the interacting theory in terms of time-ordered products $T_{L_1}$ in the interacting theory (using a formula completely analogous to Eq. (112)). When this is done, we arrive at the Ward identity, Eq. (152), for the interacting field theory associated with the interaction Lagrangian $L_1$.

Remarks. (1) We note explicitly that Theorem 5.3 does not say that an interacting field $\Phi_{L_1}$ vanishes when the classical field expression Φ is of the form $\Phi = \Psi\,\delta L/\delta\varphi$, with Ψ containing factors of ϕ. In other words, the theorem does not say that a general interacting field $\Phi_{L_1}$ vanishes if it would vanish in the classical interacting theory associated with L by the classical equations of motion. Rather, the theorem asserts only that this is true in the special cases $\Phi = \delta L/\delta\varphi$ and $\Phi_b = \nabla^a\Theta_{ab}$. Indeed, we have already seen in Sec. 3.2 above that even in the free theory, field expressions of the form $\Psi\,\delta L_0/\delta\varphi$ will, in general, be nonvanishing.

(2) Note that as in the case of the free theory, the interacting Ward identity, Eq. (152), is a relation between elements in the algebra $B_{L_1}(M, g)$ of interacting fields, rather than a relation between correlation functions associated with a state. Note also that the interacting Ward identity has the same form as the Ward identity (124) in the free quantum field theory, except that the free stress-energy tensor $T_{ab}$ is replaced by the interacting stress-energy tensor $\Theta_{ab}$. Note, however, that the Ward identity in the free theory is an operator identity between elements in the algebra W(M, g), whereas the interacting Ward identity is an identity in the interacting field algebra $B_{L_1}(M, g)$.
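(3) In our informal distribution notation (22) for the time-ordered products, the Ward identity (152) takes the form
\[
\nabla^y_a\, T_{L_1}\Big(\Theta^{ab}(y)\prod_{i=1}^{n}\Psi_i(x_i)\Big)
= i\sum_i \delta(y, x_i)\, T_{L_1}\Big(\Psi_1(x_1)\cdots(\nabla^b\varphi)\frac{\delta}{\delta\varphi}\Psi_i(x_i)\cdots\Psi_n(x_n)\Big).
\tag{169}
\]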
6. Proof that There Exists a Prescription for Time-Ordered Products Satisfying T11a and T11b in Addition to T1–T10

Our remaining task is to prove that requirements T11a and, in D > 2 dimensions, T11b can be consistently imposed in addition to requirements T1–T10. Specifically, we shall prove the following:

Theorem 6.1. In all spacetime dimensions D > 2, there exists a prescription for defining time-ordered products of the quantum scalar field with Lagrangian $L_0$, Eq. (2), that satisfies conditions T1–T10, T11a, and T11b. When D = 2, there
exists a prescription satisfying T1–T10 and T11a, but condition T11b cannot be imposed in addition to T1–T10.

We have already proven that condition T11b cannot be satisfied in addition to T1–T10 in D = 2 spacetime dimensions (see remark (1) following Theorem 5.1 above), so we need only prove the existence statements here. We have already proved in Proposition 3.1 above that T1–T10 always can be satisfied. Our strategy will therefore be to use the remaining “renormalization freedom” to additionally satisfy T11a and T11b. This remaining renormalization freedom may be precisely characterized as follows: in our previous work [13] (see also [15]) we proved a uniqueness theorem for time-ordered products satisfying T1–T9 whose factors do not contain derivatives of the fields. This result can be straightforwardly generalized to the case when derivatives are present and the prescription also satisfies T10. The generalized result is as follows: let $T$ and $T'$ be arbitrary prescriptions for defining time-ordered products satisfying T1–T10. Then they must be related in the following way:
\[
T'\Big(\prod_{i=1}^{n} A_i\Big) = T\Big(\prod_{i=1}^{n} A_i\Big)
+ \sum_{I_0\cup I_1\cup\cdots\cup I_k = \{1,\dots,n\}} T\Big(\prod_{k>0} O_{|I_k|}\Big(\bigotimes_{j\in I_k} A_j\Big)\prod_{i\in I_0} A_i\Big).
\tag{170}
\]
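Here the $O_r$ are linear maps (essentially the “counterterms”, see Eq. (175) below) $O_r: \otimes^r F_{\rm class} \to F_{\rm class}$ that can be written in the following form:
\[
\begin{aligned}
O_r(\otimes f_i\Phi_i)(x) = \sum_{\alpha_1,\alpha_2,\dots} \frac{1}{\alpha_1!\cdots\alpha_r!} \int \prod_j \boldsymbol{\epsilon}(y_j)\;
& c\big[\delta^{\alpha_1}\Phi_1\otimes\cdots\otimes\delta^{\alpha_r}\Phi_r\big](x; y_1,\dots,y_r) \\
& \times f_1(y_1)\cdots f_r(y_r)\prod_{i=1}^{r}\prod_j \big[(\nabla)^j\varphi(y_i)\big]^{\alpha_{ij}},
\end{aligned}
\tag{171}
\]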
where we are using the same notation as in the Wick expansion (63). The c are linear maps on $\otimes^r V_{\rm class}$ taking values in the distributions over $M^{r+1}$. These distributions are always writable as a sum of derivatives of the delta function $\delta(x; y_1,\dots,y_r)$, times polynomials in the Riemann tensor and its covariant derivatives and $m^2$. The engineering dimension of each such term appearing in $c[\otimes_i\Phi_i]$ (with the dimension of the delta function counted as rD) must be equal precisely to the sum of the engineering dimensions of the $\Phi_i$, defined as in the scaling requirement, T2. The c must satisfy the reality condition
\[
\overline{c\big[\otimes_{i=1}^{n}\Phi_i\big]} = (-1)^{n+1}\, c\big[\otimes_{i=1}^{n}\Phi_i\big]
\tag{172}
\]
as a consequence of the unitarity property satisfied by $T$ and $T'$, and they must satisfy the symmetry condition
\[
c\big[\otimes_{i=1}^{n}\Phi_i\big](y; x_1,\dots,x_n) = c\big[\otimes_{i=1}^{n}\Phi_{\pi i}\big](y; x_{\pi 1},\dots,x_{\pi n})
\quad \forall\ \text{permutations } \pi,
\tag{173}
\]
as a consequence of the symmetry of the time-ordered products. Finally, the imposition of the Leibniz rule, T10, on the time-ordered products $T$ and $T'$ yields the following additional constraint on the c:
\[
c\big[\Phi_1\otimes\cdots\nabla\Phi_i\otimes\cdots\Phi_n\big] = \big(1\otimes\cdots\underbrace{\nabla}_{i\text{th slot}}\otimes\cdots 1\big)\, c\big[\otimes_i\Phi_i\big].
\tag{174}
\]
Formula (170) can be restated more compactly using the generating functional S(A) for the time-ordered products defined in Eq. (88):
\[
S'(A) = S\Big(A + \frac{1}{i}\,O(e^{iA})\Big),
\tag{175}
\]
where
\[
O(e^{iA}) = \sum_{n\geq 0}\frac{i^n}{n!}\,O_n\Big(\bigotimes^{n} A\Big)
\tag{176}
\]
is a formal power series in $F_{\rm class}$. In other words, if $L_1 = A$ is the interaction Lagrangian, then $L_2 \equiv (1/i)\,O(e^{iL_1})$ corresponds precisely to the (finite) counterterms that must be added to $L_1$ in order to compensate for the change in the renormalization prescription from $T$ to $T'$. Our task is to show that T11a and T11b can be satisfied by making changes within the allowed class of changes that we have just characterized in terms of the c.

6.1. Proof that T11a can be satisfied

It is not difficult to prove that T11a can always be satisfied in any dimension D, including D = 2. In fact, T11a automatically holds for the Wick powers (i.e., time-ordered products with one factor) when the latter are defined via the local normal ordering prescription given in Eq. (60). To show that T1–T10 together with T11a can be satisfied for arbitrary time-ordered products, we proceed inductively in the number of powers of ϕ as follows. We assume that we are given a prescription which satisfies T1–T10 for arbitrary time-ordered products, and we assume, inductively, that T11a also holds for all time-ordered products $T(f_1\Phi_1\cdots f_n\Phi_n)$ that contain a total number $N_\varphi < k$ powers of ϕ. From the identity $R(J_1\varphi; J_2\varphi) = i\Delta^{\rm ret}(J_1, J_2)$, we easily see that T11a is satisfied when $N_\varphi = 1$, which occurs only when n = 1 and $\Phi_1$ is linear in ϕ. Consider now a set of fields $\Phi_1,\dots,\Phi_n$ with $N_\varphi = k$, and let $G_n(J; f_1,\dots,f_n)$ be the difference between the left and right sides of T11a (see Eq. (105)). We wish to show that it is possible to change our prescription, if necessary, so that $G_n = 0$ for the new prescription $T'$, while maintaining T1–T10 on all time-ordered products and maintaining T11a on the time-ordered products with $N_\varphi < k$. It can easily be seen, from the causal factorization property and the definition of the retarded products, that $G_n(J; f_1,\dots,f_n) = 0$ for test functions $J, f_1,\dots,f_n$ supported off the total diagonal $\Delta_{n+1}$ in the product manifold $M^{n+1}$.
Furthermore, using the inductive
assumption and T9, one can verify by an explicit calculation that the commutator $[G_n(J; f_1,\dots,f_n), \varphi(F)]$ vanishes for any compactly supported F. Thus, by [13, Proposition 2.1], $G_n$ must be proportional to the identity operator and can therefore be identified with a multilinear functional taking values in the complex numbers. By conditions T1–T5, this functional must actually be a distribution (i.e., it must be continuous in the appropriate sense) which is local and covariantly constructed from the metric, with a smooth/analytic dependence upon the metric and $m^2$, ξ, and with an almost homogeneous scaling behavior. Therefore, by the arguments in [13], $G_n$ has to be a sum of covariant derivatives of the delta-distribution on $M^{n+1}$, multiplied by polynomials in $m^2$, covariant derivatives of the Riemann tensor, and analytic functions of ξ, of the appropriate dimension. It follows from unitarity T7 that $G_n$ satisfies the reality condition $\bar{G}_n = (-1)^{n+1} G_n$.

We now set $c[\varphi\otimes(\otimes_i\Phi_i)] = -iG_n$, and define $c[(\nabla)^k\varphi\otimes(\otimes_i\Phi_i)]$ via Eq. (174). We use these c to define a new prescription $T'$ via Eqs. (170) and (171). It is clear that the new prescription satisfies $G_n = 0$ and hence satisfies T11a for $N_\varphi = k$. Thus, our inductive proof will be complete if we can show that the c satisfy all of the properties that are necessary for the new prescription to satisfy T1–T10 on all time-ordered products. However, it is clear from its definition that c satisfies all of these properties, with the possible exception of the symmetry property (173). We now complete the proof by showing that c also satisfies this symmetry property.

The symmetry property of $c[\varphi\otimes(\otimes_i\Phi_i)]$ holds trivially except in the case where we have a factor, say $\Phi_1$, of the form $\Phi_1 = \varphi$ and we consider the interchange of $\Phi_1$ and ϕ. Thus, let us consider the difference between the left and right sides of Eq. (105) with free field factor $J_2\varphi$, in the case when $f_1\Phi_1 = J_1\varphi$.
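Antisymmetrizing in $J_1$ and $J_2$, we get
\[
\begin{aligned}
G_n(J_1; J_2, f_2,\dots,f_n) &- G_n(J_2; J_1, f_2,\dots,f_n) \\
&= \Big(i\Delta^{\rm ret}(J_2, J_1) - i\Delta^{\rm ret}(J_1, J_2) - [\varphi(J_1), \varphi(J_2)]\Big)\, T\Big(\prod_{i=2}^{n} f_i\Phi_i\Big) \\
&\quad + \sum_{i,j=2}^{n}\Big[T\Big(\cdots(\Delta^{\rm ret}J_1)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots(\Delta^{\rm ret}J_2)\frac{\delta(f_j\Phi_j)}{\delta\varphi}\cdots\Big) - (J_1\leftrightarrow J_2)\Big] \\
&\quad + \text{other terms},
\end{aligned}
\tag{177}
\]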
where “other terms” stand for expressions that vanish under the inductive assumption that T11a is true for Nϕ < k. The first expression on the right side vanishes, because the commutator of ϕ with itself is given by i∆ [see Eq. (5)], and because ∆ret (J2 , J1 ) = ∆adv (J1 , J2 ). The second expression on the right side vanishes because the time-ordered products are symmetric. This shows that Gn (J1 ; J2 , f2 , . . . , fn ) is symmetric in J1 , J2 , implying that c[ϕ ⊗ ϕ ⊗ Φ2 ⊗ · · · Φn ] is symmetric in the spacetime arguments associated with the factors of ϕ, as we desired to show. This completes the proof.
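For the reader's convenience, the vanishing of the first expression on the right side of Eq. (177) can be written out in one line. The sketch below uses only facts quoted in the text: the commutator $[\varphi(J_1),\varphi(J_2)] = i\Delta(J_1,J_2)$ of Eq. (5), the relation $\Delta^{\rm ret}(J_2,J_1) = \Delta^{\rm adv}(J_1,J_2)$, and the decomposition $\Delta = \Delta^{\rm adv} - \Delta^{\rm ret}$ that is stated later in Sec. 6.2.2.

```latex
\begin{align*}
i\Delta^{\rm ret}(J_2,J_1) - i\Delta^{\rm ret}(J_1,J_2) - [\varphi(J_1),\varphi(J_2)]
  &= i\Delta^{\rm adv}(J_1,J_2) - i\Delta^{\rm ret}(J_1,J_2) - i\Delta(J_1,J_2) \\
  &= i\big(\Delta^{\rm adv} - \Delta^{\rm ret} - \Delta\big)(J_1,J_2) \;=\; 0 .
\end{align*}
```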
We have therefore obtained a construction of time-ordered products satisfying T1–T10 and T11a. We will work with such a prescription in everything that follows. Any other prescription satisfying these properties will differ from the given one by formulas (170) and (171), where the distributions c must now satisfy the additional constraint c[ϕ ⊗ (⊗i Φi )] = 0
(178)
due to the imposition of the further requirement T11a.

6.2. Proof that T11b can be satisfied when D > 2

For the remainder of this section, we restrict consideration to spacetimes of dimension D > 2, and we will prove that the remaining requirement, T11b, can be satisfied together with all other requirements T1–T10, T11a. Condition T11b is far from obvious even for the Wick products, and it is not satisfied by our local normal ordering prescription (60) (which satisfies T1–T10, T11a), as can be seen from the fact that the stress tensor $T_{ab}$ when defined via the local normal ordering prescription fails to be conserved (see Sec. 3.2), whereas any prescription satisfying T11b automatically gives rise to a conserved stress tensor by Theorem 5.1. Thus, in order to construct a prescription satisfying T11b together with all other requirements, we have to reconsider even the definition of Wick powers. For these reasons, it is not surprising that our proof of T11b is technically much more complex than the proof of T10 or T11a given in the previous sections. Nevertheless, the basic logic underlying the proof is actually rather simple and transparent. We now outline this basic logic, leaving the details to the following Secs. 6.2.1–6.2.6.

As with many other constructions in this paper, it is convenient not to attempt to construct the time-ordered products satisfying T11b in one stroke for an arbitrary number $N_\varphi$ of factors of ϕ, but to proceed inductively in the number of factors. Starting off with the trivial case, we therefore assume that a prescription satisfying T11b has been given up to less than k factors. At $N_\varphi = k$ factors we consider the algebra valued map $D_n$ which is precisely the failure of T11b to be satisfied. For a given collection of $f_i\Phi_i \in F_{\rm class}$ with a total number of k factors of ϕ, and any smooth, compactly supported variation $h_{ab}$ of the metric, this is given by
\[
D_n^{ab}(h_{ab}; f_1,\dots,f_n) \equiv \delta^{\rm ret}\Big[T\Big(\prod_{i=1}^{n} f_i\Phi_i\Big)\Big]
- \frac{i}{2}\,R\Big(\prod_{i=1}^{n} f_i\Phi_i;\; h_{ab}T^{ab}\Big)
- \sum_i T\Big(f_1\Phi_1\cdots h_{ab}\frac{\delta(f_i\Phi_i)}{\delta g_{ab}}\cdots f_n\Phi_n\Big),
\tag{179}
\]
where the retarded variation is taken with respect to the infinitesimal variation $h_{ab}$ of the metric. (Note that $D_n$ involves time-ordered products with up to $N_T = n + 1$ factors.) The basic idea of the proof is to show that the given prescription $T$ for the time-ordered products can be adjusted, if necessary, to a new prescription $T'$, related to the original one by Eq. (170), in such a way that $D_n = 0$ for this new prescription, and so that the coefficient distributions c implicit in Eq. (170) obey the constraints described above. This then would show that T1–T10, T11a, and T11b will hold for the modified prescription for time-ordered products with up to k factors of ϕ.

The obvious strategy for doing this is, of course, to absorb $D_n$ into a redefinition of the appropriate time-ordered products involving a stress-energy factor by simply subtracting it from the given prescription, and we will indeed follow this basic strategy. However, while it is straightforward to show that subtracting $D_n$ from the corresponding time-ordered products $T$ with a stress-energy factor will automatically produce a new prescription $T'$ satisfying $D_n = 0$, it is not at all obvious that $T'$ will continue to satisfy the other requirements T1–T10 and T11a. In order to demonstrate that this is indeed the case, we proceed by establishing a number of properties about $D_n$ in the following subsections. The upshot is that $D_n$ is “sufficiently harmless”, in the sense that subtracting it from the given prescription $T$ will produce a $T'$ which continues to have the desired properties T1–T10 and T11a. In more detail, we proceed as follows:

(1) In Sec. 6.2.1, we first show that $D_n$ is a functional of $h_{ab}, f_1,\dots,f_n$ that is supported on the total diagonal.
(2) In Sec. 6.2.2, we then establish that, at the induction order considered, $D_n$ is a c-number.
(3) In Sec. 6.2.3, we show that $D_n$ is local and covariant, with an appropriate scaling behavior.
(4) In Sec. 6.2.4, we show that $D_n = 0$ if one of the field factors is equal to ϕ.
(5) In Sec. 6.2.5, we establish that $D_n$ is not merely a linear functional, but in fact a distribution (i.e., continuous in the appropriate sense) with a smooth dependence upon the metric and with an appropriate scaling behavior under scaling of the metric.
(6) In Sec. 6.2.6, we show that $D_n$ has the appropriate symmetry property when one of the factors $\Phi_i$ is equal to a stress-energy tensor $T_{cd}$.

These properties imply that $D_n$ is, in fact, a delta function, multiplied by appropriate curvature polynomials (with appropriate symmetry properties). Since the freedom to redefine time-ordered products consists precisely in adding such delta function expressions, we can absorb $D_n$ into a redefinition of time-ordered products (here it is used that D > 2), while preserving T1–T10 and T11a. This is described in detail in Sec. 6.2.7.

We now elaborate these arguments. As for the induction start, when there are no factors of ϕ in the fields $f_1\Phi_1,\dots,f_n\Phi_n$ on which $D_n$ depends, we obviously
must have n = 0. In this case, $D_0^{ab}(h_{ab}) = -(i/2)R(h_{ab}T^{ab})$, since $\delta^{\rm ret}(\mathbb{1}) = 0$. But any retarded product with only one factor vanishes by definition, so there is nothing to show for $N_\varphi = 0$. Let us therefore inductively assume that $D_n = 0$ for any set of $f_1\Phi_1,\dots,f_n\Phi_n$ with a total number $N_\varphi$ of factors of ϕ less than k.

6.2.1. Proof that Dn is supported on the total diagonal

First, we will show that $D_n$ is supported on the total diagonal $\Delta_{n+1}$ in the product manifold $M^{n+1}$. For this, choose a test function $h_{ab}\otimes f_1\otimes\cdots\otimes f_n$ whose support does not intersect $\Delta_{n+1}$. Then, without loss of generality, we can assume that one of the following cases occurs:

(1) There is a Cauchy surface Σ in M such that $\operatorname{supp} h_{ab}\subset J^+(\Sigma)$ and $\operatorname{supp} f_i\subset J^-(\Sigma)$ for all i = 1, ..., n.
(2) The same as the previous one, but with “+” and “−” interchanged.
(3) There is a Cauchy surface Σ, and a proper, non-empty subset $I\subset\{1,\dots,n\}$ with the property that $\operatorname{supp} h_{ab}, \operatorname{supp} f_i\subset J^+(\Sigma)$ for all $i\in I$, and such that $\operatorname{supp} f_j\subset J^-(\Sigma)$ for all j in the complement J of I.
(4) The same as the previous one, but with “+” and “−” interchanged.

We now analyze these cases one-by-one. To simplify the notation, let us use the shorthand
\[
A_i = f_i\Phi_i \in F_{\rm class}.
\tag{180}
\]
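For orientation in the case analysis that follows, the definition (179) of $D_n$ can be transcribed into the shorthand (180). The block below is only a restatement of Eq. (179); the labels "first", "second" and "third" term are added here to match the way the text refers to the three contributions, and are not part of the original formula.

```latex
\[
D_n^{ab}(h_{ab}; f_1,\dots,f_n)
= \underbrace{\delta^{\rm ret}\Big[T\Big(\prod_{i=1}^{n} A_i\Big)\Big]}_{\text{first term}}
\;\underbrace{-\,\frac{i}{2}\,R\Big(\prod_{i=1}^{n} A_i;\; h_{ab}T^{ab}\Big)}_{\text{second term}}
\;\underbrace{-\,\sum_{i=1}^{n} T\Big(A_1\cdots h_{ab}\frac{\delta A_i}{\delta g_{ab}}\cdots A_n\Big)}_{\text{third term}} .
\]
```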
In case (1), the support of the infinitesimal variation $h_{ab}$ is outside the causal past of the support of the $A_i$, and we consequently have that $\delta^{\rm ret}[T(\prod_i A_i)] = 0$. Thus, the first term in $D_n$ vanishes. But the other terms also vanish: the second because of the support properties of the retarded products, Eq. (90), and the third because $\operatorname{supp} f_i\cap\operatorname{supp} h_{ab}$ is empty.

In case (2), it follows that $\delta^{\rm adv}[T(\prod_i A_i)] = 0$ by the same argument as above. Thus, the first term in $D_n$ is equal to
\[
\begin{aligned}
\delta^{\rm ret}\Big[T\Big(\prod_i A_i\Big)\Big]
&= \delta^{\rm ret}\Big[T\Big(\prod_i A_i\Big)\Big] - \delta^{\rm adv}\Big[T\Big(\prod_i A_i\Big)\Big] \\
&= -\frac{i}{2}\Big[T^{ab}(h_{ab}),\; T\Big(\prod_i A_i\Big)\Big] \\
&= \frac{i}{2}\,R\Big(\prod_i A_i;\; h_{ab}T^{ab}\Big) - \frac{i}{2}\,A\Big(\prod_i A_i;\; h_{ab}T^{ab}\Big) \\
&= \frac{i}{2}\,R\Big(\prod_i A_i;\; h_{ab}T^{ab}\Big),
\end{aligned}
\tag{181}
\]
where in the second line we have used Eqs. (119) and (120), in the third line we have used an identity for retarded and advanced products, and in the fourth line we have used that $\operatorname{supp} f_i\subset J^+(\operatorname{supp} h_{ab})$ and the support property of the advanced products. The calculation shows that the first term and the second term in $D_n$ cancel. But the third term vanishes, because $\operatorname{supp} f_i\cap\operatorname{supp} h_{ab}$ is empty, showing that $D_n = 0$ in case (2).
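In case (3), we use the causal factorization property T8 of the time-ordered products and the homomorphism property of $\tau^{\rm ret}$ (that is, $\tau^{\rm ret}(ab) = \tau^{\rm ret}(a)\tau^{\rm ret}(b)$) to write
\[
\begin{aligned}
\text{first term in Eq. (179)}
&= \delta^{\rm ret}\Big[T\Big(\prod_{i\in I} A_i\Big)\,T\Big(\prod_{j\in J} A_j\Big)\Big] \\
&= \delta^{\rm ret}\Big[T\Big(\prod_{i\in I} A_i\Big)\Big]\,T\Big(\prod_{j\in J} A_j\Big)
+ T\Big(\prod_{i\in I} A_i\Big)\,\delta^{\rm ret}\Big[T\Big(\prod_{j\in J} A_j\Big)\Big].
\end{aligned}
\tag{182}
\]
Since neither I nor J is empty by assumption, and since $A_1,\dots,A_n$ together have at most a total of $N_\varphi = k$ factors of ϕ, it follows that the $A_i$, $i\in I$, as well as the $A_j$, $j\in J$, each have strictly less than k factors of ϕ. Hence we can use our inductive assumption, which gives $D_{|I|} = D_{|J|} = 0$. Inserting the definition (179) for each of the two factors, it follows that
\[
\begin{aligned}
\text{first term in Eq. (179)}
&= \frac{i}{2}\,R\Big(\prod_{i\in I} A_i;\; h_{ab}T^{ab}\Big)\,T\Big(\prod_{j\in J} A_j\Big)
+ \frac{i}{2}\,T\Big(\prod_{i\in I} A_i\Big)\,R\Big(\prod_{j\in J} A_j;\; h_{ab}T^{ab}\Big) \\
&\quad + \sum_{k\in I} T\Big(h_{ab}\frac{\delta A_k}{\delta g_{ab}}\prod_{i\in I,\, i\neq k} A_i\Big)\,T\Big(\prod_{j\in J} A_j\Big) \\
&\quad + T\Big(\prod_{i\in I} A_i\Big)\sum_{l\in J} T\Big(h_{ab}\frac{\delta A_l}{\delta g_{ab}}\prod_{j\in J,\, j\neq l} A_j\Big).
\end{aligned}
\tag{183}
\]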
For the second and third terms in Eq. (179), we likewise use the causal factorization property and the definition of the retarded product. It is then seen that these terms precisely cancel the first term in Eq. (179), showing that $D_n = 0$ in case (3). Case (4) can be treated in the same way as the previous one.

6.2.2. Proof that Dn is a c-number

We next want to show that the algebra element $D_n \in W$ is in fact proportional to the identity operator. By [13, Proposition 2.1], an element $a\in W$ is proportional to the identity if and only if $[a, \varphi(F)] = 0$ for all smooth, compactly supported densities F. Thus, we will be done if we can show that
\[
[D_n^{ab}(h_{ab}; f_1,\dots,f_n), \varphi(F)] = 0.
\tag{184}
\]
Inductively, we know this is true when Nϕ < k since Dn itself vanishes then. We now prove that it is also true when Nϕ = k.
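We begin by calculating the commutator with the first term in $D_n$ [see Eq. (179)], which, using the homomorphism property of $\tau^{\rm ret}$, is equal to
\[
\Big[\delta^{\rm ret}\Big[T\Big(\prod_i f_i\Phi_i\Big)\Big],\, \varphi(F)\Big]
= \delta^{\rm ret}\Big\{\Big[T\Big(\prod_i f_i\Phi_i\Big),\, \varphi(F)\Big]\Big\}
- \Big[T\Big(\prod_i f_i\Phi_i\Big),\, \delta^{\rm ret}[\varphi(F)]\Big].
\tag{185}
\]
We now simplify the first term on the right side of this expression using the commutator property of the time-ordered products, T9, and we simplify the second term on the right side using that
\[
\delta^{\rm ret}[\varphi(F)] = -\varphi(\delta(P)\Delta^{\rm adv}F),
\tag{186}
\]
which follows from a direct calculation using the definition of $\tau^{\rm ret}$ (see [6]). Here, δ(P) is the infinitesimal variation of the densitized Klein-Gordon operator under a change in the metric,
\[
\delta(P)_g f = \frac{\partial}{\partial s}(P)_{g+sh} f\,\Big|_{s=0}.
\tag{187}
\]
(Note that δ(P) is a second-order differential operator mapping smooth scalar functions to densities.) Substituting Eq. (186) into Eq. (185) gives
\[
\begin{aligned}
\Big[\delta^{\rm ret}_g\Big[T_g\Big(\prod_i f_i\Phi_i\Big)\Big],\, \varphi_g(F)\Big]
&= i\sum_{i=1}^{n}\frac{\partial}{\partial s}\,\tau^{\rm ret}_{g(s)}\Big[T_{g(s)}\Big(f_1\Phi_1\cdots(\Delta_{g(s)}F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n\Big)\Big]\Big|_{s=0} \\
&\quad + \Big[T_g\Big(\prod_i f_i\Phi_i\Big),\, \varphi_g(\delta(P)\Delta^{\rm adv}F)\Big].
\end{aligned}
\tag{188}
\]
The first term on the right side involves only $N_\varphi = k - 1$ factors of ϕ, and therefore can be simplified using the inductive assumption that $D_n = 0$ in that case. The second term on the right side can again be simplified using the commutator property. This gives26
\[
\begin{aligned}
\Big[\delta^{\rm ret}_g\Big[T_g\Big(\prod_i f_i\Phi_i\Big)\Big],\, \varphi_g(F)\Big]
&= -\frac{1}{2}\sum_{i=1}^{n} R_g\Big(f_1\Phi_1\cdots(\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n;\; h_{ab}T^{ab}\Big) \\
&\quad + i\sum_{i=1}^{n} T_g\Big(f_1\Phi_1\cdots h_{ab}\frac{\delta}{\delta g_{ab}}\Big\{(\Delta F)\frac{\delta}{\delta\varphi}(f_i\Phi_i)\Big\}\cdots f_n\Phi_n\Big) \\
&\quad + i\sum_{i=1}^{n} T_g\Big(f_1\Phi_1\cdots\Big(\frac{\partial}{\partial s}\Delta_{g(s)}F\Big|_{s=0}\Big)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n\Big) \\
&\quad + i\sum_{i=1}^{n}\sum_{k\neq i} T_g\Big(f_1\Phi_1\cdots h_{ab}\frac{\delta(f_k\Phi_k)}{\delta g_{ab}}\cdots(\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n\Big) \\
&\quad + i\sum_{i=1}^{n} T_g\Big(f_1\Phi_1\cdots(\Delta\,\delta(P)\Delta^{\rm adv}F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n\Big).
\end{aligned}
\tag{189}
\]

26 We use the convention that whenever the expression ∆F appears in an expression to which $\delta/\delta g_{ab}$ is applied, we will view ∆F as independent of g, i.e., $\delta/\delta g_{ab}$ does not act on ∆F in such an expression.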
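We next calculate the commutator of the second term in $D_n$ [see Eq. (179)] with ϕ(F), by expanding the retarded product in terms of time-ordered products and using for each of the resulting terms the commutator property of the time-ordered products,
\[
\begin{aligned}
-\frac{i}{2}\Big[R\Big(\prod_i f_i\Phi_i;\; h_{ab}T^{ab}\Big),\, \varphi(F)\Big]
&= \frac{1}{2}\,R\Big(\prod_i f_i\Phi_i;\; (\Delta F)\frac{\delta(h_{ab}T^{ab})}{\delta\varphi}\Big) \\
&\quad + \frac{1}{2}\sum_{i=1}^{n} R\Big(f_1\Phi_1\cdots(\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n;\; h_{ab}T^{ab}\Big).
\end{aligned}
\tag{190}
\]
Using that the variational derivatives $\delta/\delta g_{ab}$ and $\delta/\delta\varphi$ commute up to an exact form (see Eq. (268)), and using Eq. (266) of Appendix B, we have
\[
\begin{aligned}
\frac{1}{2}(\Delta F)\frac{\delta(h_{ab}T^{ab})}{\delta\varphi}
&= h_{ab}\frac{\delta}{\delta g_{ab}}\Big\{(\Delta F)\frac{\delta L_0}{\delta\varphi}\Big\} + dB_1 \\
&= \frac{\partial}{\partial s}\big\{(\Delta_g F)(P)_{g+sh}\varphi\big\}\Big|_{s=0} + dB_2,
\end{aligned}
\tag{191}
\]
where $B_1$, $B_2$ are local (D − 1)-form functionals of ϕ and the metric, and where P is the Klein–Gordon operator. Since P is hermitian, the right side can be rewritten further as
\[
\begin{aligned}
\frac{1}{2}(\Delta F)\frac{\delta(h_{ab}T^{ab})}{\delta\varphi}
&= \frac{\partial}{\partial s}\big\{(P)_{g+sh}(\Delta_g F)\varphi + dC\big\}\Big|_{s=0} + dB_2 \\
&= \delta(P)(\Delta F)\varphi + dB_3,
\end{aligned}
\tag{192}
\]
remembering that δ(P) is the metric variation of the densitized Klein–Gordon operator. Thus, by the Leibniz rule, T10, we get
\[
\begin{aligned}
-\frac{i}{2}\Big[R\Big(\prod_i f_i\Phi_i;\; h_{ab}T^{ab}\Big),\, \varphi(F)\Big]
&= R\Big(\prod_i f_i\Phi_i;\; (\delta(P)\Delta F)\varphi\Big) \\
&\quad + \frac{1}{2}\sum_{i=1}^{n} R\Big(f_1\Phi_1\cdots(\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n;\; h_{ab}T^{ab}\Big).
\end{aligned}
\tag{193}
\]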
We apply T11a to the first term on the right side of this equation. This gives
\[
\begin{aligned}
-\frac{i}{2}\Big[R\Big(\prod_i f_i\Phi_i;\; h_{ab}T^{ab}\Big),\, \varphi(F)\Big]
&= i\sum_{i=1}^{n} T\Big(f_1\Phi_1\cdots(\Delta^{\rm ret}\delta(P)\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n\Big) \\
&\quad + \frac{1}{2}\sum_{i=1}^{n} R\Big(f_1\Phi_1\cdots(\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n;\; h_{ab}T^{ab}\Big).
\end{aligned}
\tag{194}
\]
We finally take the commutator of the third term in $D_n$ with ϕ(F), and use the commutator property to simplify. This gives
\[
\begin{aligned}
-\sum_i\Big[T\Big(f_1\Phi_1\cdots h_{ab}\frac{\delta(f_i\Phi_i)}{\delta g_{ab}}\cdots f_n\Phi_n\Big),\, \varphi(F)\Big]
&= -i\sum_{i=1}^{n} T\Big(f_1\Phi_1\cdots(\Delta F)\frac{\delta}{\delta\varphi}\Big\{h_{ab}\frac{\delta}{\delta g_{ab}}(f_i\Phi_i)\Big\}\cdots f_n\Phi_n\Big) \\
&\quad - i\sum_{i=1}^{n}\sum_{k\neq i} T\Big(f_1\Phi_1\cdots h_{ab}\frac{\delta(f_k\Phi_k)}{\delta g_{ab}}\cdots(\Delta F)\frac{\delta(f_i\Phi_i)}{\delta\varphi}\cdots f_n\Phi_n\Big).
\end{aligned}
\tag{195}
\]
We have now calculated the commutator of all three terms in $D_n$ with ϕ(F), given by Eqs. (189), (194) and (195) respectively. If we add these contributions up, then we see that the commutator $[D_n, \varphi(F)]$ will vanish if we can show that
\[
i\Delta^{\rm ret}\delta(P)\Delta F + i\Delta\,\delta(P)\Delta^{\rm adv}F = -i\,\frac{\partial}{\partial s}\Delta_{g(s)}F\,\Big|_{s=0}
\tag{196}
\]
for all compactly supported densities F. However, this identity follows immediately from $\Delta = \Delta^{\rm adv} - \Delta^{\rm ret}$ together with the identity
\[
\frac{\partial}{\partial s}\Delta^{\rm ret}_{g(s)}F\,\Big|_{s=0} = -\Delta^{\rm ret}\delta(P)\Delta^{\rm ret}F \quad\text{and “adv”} \leftrightarrow \text{“ret”},
\tag{197}
\]
for all compactly supported densities F, which in turn is seen to be true owing to the relation $(\partial/\partial s)(P\Delta^{\rm ret})_{g(s)} = 0$ (and the analogous relation for the advanced propagator).

6.2.3. Proof that Dn is local and covariant and scales almost homogeneously

It is “obvious” that $D_n$ is a c-number functional that is constructed entirely from the metric, because all the terms in the defining equation for $D_n$ have this property. $D_n$ depends moreover locally and covariantly on the metric in the sense that if $\chi: N \to M$ is any causality and orientation preserving isometric embedding, then
\[
D_n^{ab}[M, g](\chi_* h_{ab}, \chi_* f_1,\dots,\chi_* f_n) = D_n^{ab}[N, \chi^* g](h_{ab}, f_1,\dots,f_n)
\tag{198}
\]
for all test (tensor-)fields with compact support on N . This property follows because the second and third terms in the definition of Dn are local and covariant quantities
by T1, and because the map $\tau^{\rm ret}$ appearing in the first term also has this property by construction. Moreover, the functionals $D_n$ also have an almost homogeneous scaling behavior under rescalings of the metric in the sense that
\[
\frac{\partial^N}{\partial^N\ln\lambda}\big(\lambda^d\cdot D_n[M, \lambda^2 g]\big) = 0
\tag{199}
\]
for some natural number N, where d is the sum of the engineering dimensions of the fields appearing in $D_n$. Again, for the second and third terms in the definition of $D_n$, this property follows since we are assuming that our time-ordered products (and hence retarded products) have an almost homogeneous scaling behavior in the sense of T2. For the first term in the definition of $D_n$, this follows from the fact that if (M, g) and (M, g′) are spacetimes whose metrics differ only within some compact set K, and if $\tau^{\rm adv/ret}$ are the corresponding algebra isomorphisms from W(M, g) to W(M, g′), then
\[
\sigma'_\lambda\circ\tau^{\rm adv/ret} = \tau^{\rm adv/ret}\circ\sigma_\lambda,
\tag{200}
\]
where $\sigma_\lambda$ is the natural isomorphism from W(M, g) to W(M, λ²g) introduced above in T2, and where $\sigma'_\lambda$ is the corresponding isomorphism for g′.
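As a sanity check on the propagator algebra used in Sec. 6.2.2, Eqs. (196) and (197), one may replace the operators by finite matrices. The Python sketch below is not part of the construction and makes several assumptions of our own: `P0`, `dP`, `R_mat` and `A_mat` are random matrices standing in for the Klein–Gordon operator, its variation δ(P), and the retarded and advanced propagators, and all names are hypothetical. It checks (i) the finite-dimensional analogue of Eq. (197), namely that the derivative of an inverse is $-P^{-1}\,\delta P\,P^{-1}$, and (ii) the purely algebraic identity that turns Eq. (197) into Eq. (196) once $\Delta = \Delta^{\rm adv} - \Delta^{\rm ret}$ is inserted. In finite dimensions an invertible matrix has only one inverse, so part (ii) uses two unrelated matrices and tests the algebra alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Stand-ins (assumptions): P(s) = P0 + s*dP mimics the operator P on the metric g(s).
P0 = 4.0 * np.eye(n) + 0.3 * rng.normal(size=(n, n))  # diagonal shift keeps P0 invertible
dP = rng.normal(size=(n, n))                          # stand-in for delta(P)


def inverse(s):
    """Inverse of P(s), the matrix analogue of a propagator for g(s)."""
    return np.linalg.inv(P0 + s * dP)


# (i) Analogue of Eq. (197): d/ds P(s)^{-1} at s = 0 equals -P^{-1} dP P^{-1}.
eps = 1e-6
finite_difference = (inverse(eps) - inverse(-eps)) / (2 * eps)
closed_form = -inverse(0.0) @ dP @ inverse(0.0)
fd_error = np.max(np.abs(finite_difference - closed_form))

# (ii) Algebra behind Eq. (196): with D = A - R ("adv" minus "ret"),
#      R dP D + D dP A = A dP A - R dP R,
# and the right side is -dD/ds if both A and R obey the analogue of Eq. (197).
R_mat = rng.normal(size=(n, n))
A_mat = rng.normal(size=(n, n))
D_mat = A_mat - R_mat
lhs = R_mat @ dP @ D_mat + D_mat @ dP @ A_mat
rhs = A_mat @ dP @ A_mat - R_mat @ dP @ R_mat
algebra_error = np.max(np.abs(lhs - rhs))

print(f"finite-difference error: {fd_error:.2e}")
print(f"algebraic identity error: {algebra_error:.2e}")
```

Both errors should be at the level of rounding noise. The check says nothing about the analytic content of Eq. (197), which rests on the support properties of the retarded and advanced propagators.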
6.2.4. Proof that Dn = 0 when one of the Φi is equal to ϕ

Let us now assume that one of the fields $\Phi_i$ is equal to ϕ, say $\Phi_n = \varphi$, and as before, that the total number $N_\varphi$ of free field factors in $\Phi_1,\dots,\Phi_{n-1},\Phi_n = \varphi$ is equal to k. We will show that $D_n$ is automatically zero in this case under our inductive assumption that $D_n = 0$ when $N_\varphi < k$. We first look at the second term in $D_n$ in Eq. (179), setting $A_i = f_i\Phi_i$ for i < n and $f_n = F$ to facilitate the notation. After some algebra, repeatedly using T10, T11a and Eq. (89), we get
\[
\begin{aligned}
-\frac{i}{2}\,R\Big(F\varphi\prod^{n-1} A_i;\; h_{ab}T^{ab}\Big)
&= \frac{1}{2}\sum_j R\Big(A_1\cdots(\Delta^{\rm ret}F)\frac{\delta A_j}{\delta\varphi}\cdots A_{n-1};\; h_{ab}T^{ab}\Big) \\
&\quad - \frac{i}{2}\,\varphi(F)\,R\Big(\prod^{n-1} A_i;\; h_{ab}T^{ab}\Big) \\
&\quad - \varphi(\delta(P)\Delta^{\rm adv}F)\,T\Big(\prod^{n-1} A_i\Big) \\
&\quad + i\sum_j T\Big(A_1\cdots(\Delta^{\rm ret}\delta(P)\Delta^{\rm ret}F)\frac{\delta A_j}{\delta\varphi}\cdots A_{n-1}\Big).
\end{aligned}
\tag{201}
\]
Here δ(P) is the first-order variation of the Klein–Gordon operator with respect to our family of metrics, see Eq. (187). For the third term in $D_n$, we get,
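using T10, T11a,
\[
\begin{aligned}
-\sum_j T\Big(F\varphi\, A_1\cdots h_{ab}\frac{\delta A_j}{\delta g_{ab}}\cdots A_{n-1}\Big)
&= -i\sum_{j\neq k} T\Big(A_1\cdots h_{ab}\frac{\delta A_k}{\delta g_{ab}}\cdots(\Delta^{\rm ret}F)\frac{\delta A_j}{\delta\varphi}\cdots A_{n-1}\Big) \\
&\quad - i\sum_j T\Big(A_1\cdots(\Delta^{\rm ret}F)\frac{\delta}{\delta\varphi}\Big\{h_{ab}\frac{\delta}{\delta g_{ab}}A_j\Big\}\cdots A_{n-1}\Big) \\
&\quad - \varphi(F)\sum_j T\Big(A_1\cdots h_{ab}\frac{\delta A_j}{\delta g_{ab}}\cdots A_{n-1}\Big).
\end{aligned}
\tag{202}
\]
For the first term in $D_n$ we get, using Eq. (186) and the definition (187) of δ(P),
\[
\begin{aligned}
\delta^{\rm ret}\Big[T\Big(F\varphi\prod^{n-1} A_i\Big)\Big]
&= \varphi(\delta(P)\Delta^{\rm adv}F)\,T\Big(\prod^{n-1} A_i\Big) \\
&\quad + i\sum_j \delta^{\rm ret}\Big[T\Big(A_1\cdots(\Delta^{\rm ret}F)\frac{\delta A_j}{\delta\varphi}\cdots A_{n-1}\Big)\Big] \\
&\quad + \varphi(F)\,\delta^{\rm ret}\Big[T\Big(\prod^{n-1} A_i\Big)\Big].
\end{aligned}
\tag{203}
\]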
We can simplify the terms on the right side using the inductive assumption. Adding up the contributions Eqs. (201), (202) and (203) to $D_n$, and using Eq. (197), we find that all terms cancel. Thus we have shown that $D_n = 0$ when $N_\varphi = k$ and when one of the factors $\Phi_i$ is ϕ.

6.2.5. Proof that Dn satisfies a wave front set condition and depends smoothly and analytically on the metric

We now show that $D_n$ is a distribution on $M^{n+1}$, i.e., $D_n$ is a multilinear functional that is continuous in the appropriate sense, and that it satisfies the wave front set condition
\[
\mathrm{WF}(D_n)\big|_{\Delta_{n+1}} \perp T(\Delta_{n+1}).
\tag{204}
\]
Moreover, we will show that if g(s) is a smooth (resp. analytic) family of metrics depending smoothly (resp. analytically) upon a set of parameters s in a parameter space P, and if $D_n^{(s)}$ are the corresponding distributions (viewed now as a single distribution on $P\times M^{n+1}$), then
\[
\mathrm{WF}(D_n^{(s)})\big|_{P\times\Delta_{n+1}} \perp T(P\times\Delta_{n+1}),
\tag{205}
\]
(with the smooth wave front set WF replaced by the analytic wave front set $\mathrm{WF}_A$ in the analytic case).
Since Dn is a c-number, it is equal to the expectation value of Eq. (179) in any state ω on W(M, g). To simplify things we take ω to be a quasifree Hadamard state, and write Dn as

$$
D_n^{ab}(h_{ab}; f_1,\dots,f_n) = r_n^{ab}(h_{ab}; f_1,\dots,f_n)
 - \frac{i}{2}\,\omega\Big(R\Big(\prod_{i=1}^{n} f_i\Phi_i \,;\, h_{ab}T^{ab}\Big)\Big)
 - \sum_i \omega\Big(T\Big(f_1\Phi_1 \cdots h_{ab}\frac{\delta(f_i\Phi_i)}{\delta g_{ab}} \cdots f_n\Phi_n\Big)\Big),
\tag{206}
$$

where we have set

$$
r_n^{ab}(h_{ab}; f_1,\dots,f_n) = \omega\Big(\delta^{\rm ret}\Big[T\Big(\prod_{i=1}^{n} f_i\Phi_i\Big)\Big]\Big).
\tag{207}
$$
To prove the desired properties, Eqs. (204) and (205), of Dn, we show that each term on the right side of Eq. (206) satisfies these properties separately. This is relatively straightforward for the second and third terms. It follows from our microlocal spectrum condition, T3, that the second and third terms in Dn each satisfy

$$
\begin{aligned}
{\rm WF}(\text{2nd and 3rd terms in Eq. (206)}) \subset \Big\{ & (y,p;x_1,k_1;\dots;x_n,k_n) \;\Big|\;
\exists \text{ Feynman graph } G(q) \text{ with vertices } y, x_1,\dots,x_n \text{ and edges } e \\
& \text{such that if } y = s/t(e) \text{ then } t/s(e) \in J^+(y), \\
& k_i = \sum_{e:\,s(e)=x_i} q_e - \sum_{e:\,t(e)=x_i} q_e, \quad
p = \sum_{e:\,s(e)=y} q_e - \sum_{e:\,t(e)=y} q_e \Big\} \equiv C_R(M,g).
\end{aligned}
\tag{208}
$$

Since

$$
C_R(M,g)\big|_{\Delta_{n+1}} \perp T(\Delta_{n+1}),
\tag{209}
$$
on the total diagonal, it immediately follows that the second and third terms in Dn satisfy the analog of Eq. (204). Moreover, if we consider a smooth family of metrics g(s) and a corresponding family of quasifree Hadamard states ω(s) depending smoothly upon s in the sense of Eq. (35), then it similarly follows from T4 that the second and third terms in $D_n^{(s)}$ (with ω in those expressions replaced by ω(s)) have a smooth dependence upon s. It then follows immediately that the second and third terms in $D_n^{(s)}$ satisfy the smoothness condition (205). The corresponding statement in the analytic case similarly follows from condition T5.
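The orthogonality relation (209) can be checked directly from the vertex relations in Eq. (208). The sketch below assumes that, when all vertices coincide at a single point $x$ (notation introduced here), the covector $q_e$ of an edge is the same element of $T^*_xM$ at both of its ends, and that orthogonality to $T(\Delta_{n+1})$ means that the covectors sum to zero.

```latex
% On the total diagonal, y = x_1 = ... = x_n = x, so all covectors lie in T^*_x M.
% Summing the vertex relations of Eq. (208) over all vertices v:
\[
p + \sum_{i=1}^{n} k_i
  = \sum_{v \in \{y, x_1, \dots, x_n\}}
    \Big( \sum_{e:\, s(e)=v} q_e \;-\; \sum_{e:\, t(e)=v} q_e \Big)
  = \sum_{e} q_e - \sum_{e} q_e = 0 ,
\]
% because every edge e has exactly one source s(e) and exactly one target t(e).
```

This is the familiar "momentum conservation" at the level of the whole graph: each edge contributes once with a plus sign and once with a minus sign.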
Having dealt with the second and third terms on the right side of Eq. (206), our claims will be established by proving the following proposition:

Proposition 6.1. The first term on the right side of Eq. (206) satisfies

$$
{\rm WF}(r_n)\big|_{\Delta_{n+1}} \perp T(\Delta_{n+1}),
\tag{210}
$$

as well as

$$
{\rm WF}(r_n^{(s)})\big|_{P\times\Delta_{n+1}} \perp T(P\times\Delta_{n+1}),
\tag{211}
$$

where $r_n^{(s)}$ is defined by the same formula as rn except that ω is replaced by the smooth family ω(s) in that formula, and the metric g is replaced everywhere by g(s). The analogous statement also holds true with regard to the analytic wave front set.

Proof. We know that rn is a multilinear functional which is also a distribution in f1 ⊗ · · · ⊗ fn for any fixed h of compact support. Also, since Dn is already known to vanish for test functions h ⊗ f1 ⊗ · · · ⊗ fn whose support has no intersection with the total diagonal $\Delta_{n+1}$ in $M^{n+1}$, it follows that for such test functions rn(h, f1, . . . , fn) is equal to minus the second and third terms in Eq. (206). Therefore, since these terms are individually known to be distributions, we know that rn is in fact a distribution off the total diagonal. However, our constructions so far do not tell us that rn is also a distribution on the total diagonal, let alone whether it satisfies the wave front set conditions Eqs. (210) and (211) there. Thus, in order to prove the above proposition, we must look at the detailed structure of rn near the total diagonal. For this, we first use our local Wick expansion (63) to write the time-ordered products in the following form when f1 ⊗ · · · ⊗ fn is supported in a sufficiently small neighborhood $U_n$ of the total diagonal in $M^n$ (which we assume from now on):

$$
\begin{aligned}
T\Big(\prod_{i=1}^{n} f_i\Phi_i\Big)
={}& \sum_{\alpha_1,\alpha_2,\dots} \frac{1}{\alpha_1!\cdots\alpha_n!} \int \prod_j \epsilon(y_j)\;
 t[\delta^{\alpha_1}\Phi_1 \otimes \cdots \otimes \delta^{\alpha_n}\Phi_n](y_1,\dots,y_n) \\
& \qquad \times f_1(y_1)\cdots f_n(y_n)\, :\!\prod_{i=1}^{n}\prod_j [(\nabla)^j\varphi(y_i)]^{\alpha_{ij}}\!:_H \\
={}& \sum_r \int w(y_1,\dots,y_n; x_1,\dots,x_r) \prod_{i=1}^{n} f_i(y_i)\, :\!\prod_{j=1}^{r}\varphi(x_j)\!:_H .
\end{aligned}
\tag{212}
$$
Here, the distributions $w \in \mathcal{D}'(U_n \times M^r)$ are defined by the last equation in terms of sums of products of t[· · ·] and suitable delta functions and their derivatives. Since the t[· · ·] are in turn locally and covariantly constructed from the metric, it follows that the distributions w also have this property, and we will write $w = w_g$ when we want to emphasize this fact. From the δ-functions implicit in the definition of w, one easily finds the support property

$$
{\rm supp}\, w \subset \{(y_1,\dots,y_n; x_1,\dots,x_r) \mid \exists \text{ partition } \{1,\dots,r\} = I_1 \cup \cdots \cup I_n \text{ such that } x_i = y_l \ \ \forall\, i \in I_l\},
\tag{213}
$$
and from the wave front set property of the t, one finds furthermore the wave front set property

$$
\begin{aligned}
{\rm WF}(w) \subset \Big\{ & (y_1,k_1;\dots;y_n,k_n; x_1,p_1;\dots;x_r,p_r) \;\Big|\;
\exists \text{ partition } \{1,\dots,r\} = I_1 \cup \cdots \cup I_n \text{ such that } x_i = y_l \ \ \forall\, i \in I_l; \\
& \text{if } q_l \equiv k_l + \sum_{i\in I_l} p_i, \text{ then } (y_1,q_1;\dots;y_n,q_n) \in C_T(M,g) \Big\} =: I_T(M,g)
\end{aligned}
\tag{214}
$$

for the w. We also note that the w scale almost homogeneously under a rescaling of the metric, and vary smoothly under smooth variations of the metric in the sense that if g(s) is a family of metrics depending smoothly on s in some parameter space P, then the distributions $w^{(s)} = w_{g^{(s)}}$ (viewed as distributions on $P \times U_n \times M^r$) have wave front set

$$
{\rm WF}(w^{(s)}) \subset \{(s,\rho; y_1,k_1;\dots;y_n,k_n; x_1,p_1;\dots;x_r,p_r) \mid (y_1,k_1;\dots;y_n,k_n; x_1,p_1;\dots;x_r,p_r) \in I_T(M,g^{(s)})\}.
\tag{215}
$$
These properties follow immediately from the corresponding properties satisfied by the t (as a consequence of T3 and T4) as well as the delta functions. We now insert Eq. (212) into the definition of rn. This gives

$$
\begin{aligned}
r_n(h; f_1,\dots,f_n)
={}& \sum_r \int w^{(0)}(y_1,\dots,y_n; x_1,\dots,x_r) \prod_i f_i(y_i)\,
 \frac{\partial}{\partial s}\,\omega\Big(\tau^{\rm ret}_{g^{(s)}}\Big(:\!\prod_j \varphi(x_j)\!:_{H^{(s)}}\Big)\Big)\Big|_{s=0} \\
& + \sum_r \int \frac{\partial}{\partial s} w^{(s)}(y_1,\dots,y_n; x_1,\dots,x_r)\Big|_{s=0} \prod_i f_i(y_i)\,
 \omega\Big(:\!\prod_j \varphi(x_j)\!:_{H^{(0)}}\Big) \\
\equiv{}& I_1 + I_2,
\end{aligned}
\tag{216}
$$

where $\frac{\partial}{\partial s} g^{(s)} = h$. Furthermore, when g is replaced everywhere in the above formula by a family g(s) depending on a parameter s ∈ P, we obtain a corresponding expression for $r_n^{(s)}$. The proof of the proposition will be complete if we can show that the first term, I1, and second term, I2, on the right side separately satisfy the wave front set condition Eq. (210), and the smoothness condition Eq. (211), i.e., if we can prove the following lemma:
Lemma 6.1. I1 and I2 are distributions satisfying

$$
{\rm WF}(I_j)\big|_{\Delta_{n+1}} \perp T(\Delta_{n+1}),
\tag{217}
$$

as well as

$$
{\rm WF}(I_j^{(s)})\big|_{P\times\Delta_{n+1}} \perp T(P\times\Delta_{n+1}).
\tag{218}
$$

The remainder of this subsection consists of the proof of this lemma.
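The reduction of Proposition 6.1 to Lemma 6.1 rests on the standard fact that the wave front set of a sum is contained in the union of the wave front sets of the summands. A short sketch, using only the decomposition rn = I1 + I2 of Eq. (216):

```latex
\[
r_n = I_1 + I_2
\quad\Longrightarrow\quad
\mathrm{WF}(r_n) \subset \mathrm{WF}(I_1) \cup \mathrm{WF}(I_2).
\]
% Hence, if WF(I_1) and WF(I_2), restricted to the total diagonal, are both
% orthogonal to T(\Delta_{n+1}) as in Eq. (217), then so is WF(r_n),
% which is Eq. (210). The same reasoning with the parameter s included
% gives Eq. (211) from Eq. (218).
```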
Proof of Lemma 6.1 for I1. We begin by showing Eq. (217) for I1. For this, consider the smooth 1-parameter family of metrics g(s) with $\frac{\partial}{\partial s} g^{(s)} = h$, and let ω(s) be the unique quasifree Hadamard state on W(M, g(s)) with the property that ω(s) coincides with ω on $M \setminus J^+(K)$, where K is the compact region where h is supported. Furthermore, let H(s) be the local Hadamard parametrix associated with this 1-parameter family of metrics, and, in a sufficiently small neighborhood $U_2$ of the diagonal, define

$$
d^{(s)}(x_1,x_2) = \omega_2^{(s)}(x_1,x_2) - H^{(s)}(x_1,x_2).
\tag{219}
$$

Then one finds from the definition of $\tau^{\rm ret}$ that

$$
\omega\Big(\tau^{\rm ret}_{g^{(s)}}\Big(:\!\prod_j \varphi(x_j)\!:_{H^{(s)}}\Big)\Big) = \prod_{\text{pairs } ij} d^{(s)}(x_i,x_j),
\tag{220}
$$

and hence that

$$
I_1(h; f_1,\dots,f_n) = \sum_r \int w^{(0)}(y_1,\dots,y_n; x_1,\dots,x_r) \prod_i f_i(y_i)\,
 \frac{\partial}{\partial s} \prod_{\text{pairs } ij} d^{(s)}(x_i,x_j)\Big|_{s=0}.
\tag{221}
$$
We estimate the wave front set of I1 by analyzing the wave front set of the individual terms in Eq. (221). The wave front set of w is already known, whereas the wave front set associated with the distributions d(s) is given by the following lemma.

Lemma 6.2. d(s) is jointly smooth in s and its spacetime arguments within a sufficiently small neighborhood $U_2$ of the diagonal in M × M. Furthermore, in such a neighborhood, if

$$
(\delta d)(h, f_1, f_2) = \frac{\partial}{\partial s}\, d^{(s)}(f_1,f_2)\Big|_{s=0},
\tag{222}
$$

then

$$
\begin{aligned}
{\rm WF}(\delta d) \subset \{ & (y,p; x_1,k_1; x_2,k_2) \mid \text{either of the following holds:} \\
& ((y,p) \sim (x_1,-k_1),\ k_2 = 0) \ \text{or}\ ((y,p) \sim (x_2,-k_2),\ k_1 = 0) \ \text{or}\ (x_1 = x_2 = y \text{ and } p = -k_1 - k_2) \}.
\end{aligned}
\tag{223}
$$
Proof. The bidistribution d(s) is symmetric, and is a bisolution of the Klein–Gordon equation modulo a smooth function, because H(s) is a bisolution modulo a smooth function. In fact,

$$
(P^{(s)} \otimes 1)\, d^{(s)}(x_1,x_2) = G^{(s)}(x_1,x_2), \qquad
(1 \otimes P^{(s)})\, d^{(s)}(x_1,x_2) = G^{(s)}(x_2,x_1),
\tag{224}
$$
where G(s) is equal to the action of the Klein–Gordon operator on the first variable in H (s) (and can thereby be calculated by Hadamard’s recursion procedure, at least
in analytic spacetimes), and where P(s) is the Klein–Gordon operator associated with g(s). It follows that G(s) is jointly smooth (resp. analytic, in analytic spacetimes) in s and its spacetime arguments. Furthermore, since ω(s) is independent of s everywhere in $M \setminus J^+(K)$, and since H(s) is independent of s on any convex normal neighborhood which does not intersect K, it follows that d(s) is independent of s on any convex normal neighborhood which has no intersection with $J^+(K)$. Using these facts, we will now show that d(s)(x1, x2) is jointly smooth in s, x1, x2. For this, we consider a globally hyperbolic subset N of M with compact closure, which contains K, and which has Cauchy surfaces $S_-$, resp. $S_+$, not intersecting $J^+(K)$, resp. $J^-(K)$ (for all metrics g(s) with s sufficiently small). Without loss of generality, we may assume that K is so small that N can be chosen to be convex and normal (again for all metrics g(s) with s sufficiently small). By what we have said above, d(s) does not depend upon s in a neighborhood of $S_-$. Within N, we define the bidistribution

$$
\alpha_{ab}^{(s)}(x_1,x_2) = (\Delta^{{\rm adv}\,(s)} f_1)(x_1)\,(\Delta^{{\rm adv}\,(s)} f_2)(x_2)\;
 \overleftrightarrow{\nabla}_a \overleftrightarrow{\nabla}_b\, d^{(s)}(x_1,x_2),
\tag{225}
$$

where $\nabla_a$ acts on x1 and $\nabla_b$ acts on x2. We now take the divergence of $\alpha_{ab}^{(s)}(x_1,x_2)$ both in x1 and x2 and integrate the resulting expression over U × U, where U ⊂ N is the region enclosed by $S_-$ and $S_+$. By Stokes' theorem and the support property of $\Delta^{\rm adv}$, we have

$$
\int_{U\times U} \nabla^a\nabla^b \alpha_{ab}(x_1,x_2)\,\epsilon(x_1)\epsilon(x_2)
 = \int_{S_-\times S_-} \alpha_{ab}(x_1,x_2)\, d\sigma^a(x_1)\, d\sigma^b(x_2),
\tag{226}
$$

for any test (densities) f1, f2 supported in U. (Here, $d\sigma^a$ is the usual integration element induced by ε, and we are suppressing the dependence upon s to lighten the notation.) Now perform the differentiation on the left side, using $(\nabla^a\nabla_a - \xi R - m^2)\Delta^{\rm adv} = \delta$, using the fact that the advanced propagator on the right side can be replaced by the causal propagator, and using the symmetry properties of G implied by Eq. (224). We obtain

$$
\begin{aligned}
d(f_1,f_2) ={}& \int_{S_-\times S_-} (\Delta f_1)(x_1)\,(\Delta f_2)(x_2)\;
 \overleftrightarrow{\nabla}_a \overleftrightarrow{\nabla}_b\, d(x_1,x_2)\, d\sigma^a(x_1)\, d\sigma^b(x_2) \\
& - G(\Delta^{\rm adv} f_1, f_2) - G(\Delta^{\rm adv} f_2, f_1) - \frac{1}{2} G(P f_1, f_2) - \frac{1}{2} G(P f_2, f_1),
\end{aligned}
\tag{227}
$$

where it should be remembered that all quantities depend upon s. This equation expresses d(s)(f1, f2) in terms of the advanced and retarded propagators for the metric g(s), G(s), and initial data of d(s) on $S_-$. Now the retarded and advanced propagators have a smooth dependence upon s in the sense that

$$
{\rm WF}(\Delta^{(s)}_{\rm ret/adv}) \subset \{(s,\rho; x_1,k_1; x_2,k_2) \mid (x_1,k_1; x_2,k_2) \in C_{R/A}(M,g^{(s)})\},
\tag{228}
$$
and G(s) is explicitly seen to be jointly smooth in s and its spacetime arguments. Moreover, near S− , d(s) is a smooth function independent of s, since ω (s) is equal to
the Hadamard state ω there. It follows from these facts, together with the expression Eq. (227) for d(s) and the wave front set calculus, that d(s) is jointly smooth in s and its spacetime arguments within N.

We next analyze the s-derivative of d(s). We denote the variation of any functional, F, of the metric by

$$
\delta F_g(h; f_1,\dots,f_m) = \frac{\partial}{\partial s}\, F_{g^{(s)}}(f_1,\dots,f_m)\Big|_{s=0}, \qquad
\frac{\partial}{\partial s}\, g^{(s)}\Big|_{s=0} = h.
\tag{229}
$$

Now take the s derivative of both sides of Eq. (227) at s = 0. It follows that δd can be written as a sum of terms involving $\delta\Delta^{\rm adv}$ and $\delta\Delta^{\rm ret}$, δG and δP (the variation of the KG-operator) linearly. The wave front set of δG can be computed explicitly and is given by

$$
\begin{aligned}
{\rm WF}(\delta G) \subset \{ & (y,p; x_1,k_1; x_2,k_2) \mid \text{either of the following holds:} \\
& (y = x_1 \text{ and } p = -k_1,\ k_2 = 0) \ \text{or}\ (y = x_2 \text{ and } p = -k_2,\ k_1 = 0) \ \text{or}\ (x_1 = x_2 = y \text{ and } p = -k_1 - k_2) \}.
\end{aligned}
\tag{230}
$$
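The differentiation step that leads from Eq. (226) to Eq. (227) can be sketched pointwise as follows. The sketch assumes the convention $A\overleftrightarrow{\nabla}_a B = A\nabla_a B - (\nabla_a A)B$ (the opposite convention changes each of the two factors by a sign and so leaves the product unchanged), and uses the shorthand $u_i = \Delta^{\rm adv} f_i$ and $P = \nabla^a\nabla_a - \xi R - m^2$; the symbols $u$, $u_i$ and $F$ are notation introduced here.

```latex
% One-variable identity; the potential terms (\xi R + m^2) u F cancel:
\[
\nabla^a\big(u\,\overleftrightarrow{\nabla}_a F\big)
 = u\,\nabla^a\nabla_a F - (\nabla^a\nabla_a u)\,F
 = u\,(PF) - (Pu)\,F .
\]
% Applying it once in x_1 and once in x_2, with P u_i = f_i:
\[
\nabla^a\nabla^b \alpha_{ab}(x_1,x_2)
 = \big[\,u_1(x_1)\,(P\otimes 1) - f_1(x_1)\,\big]\,
   \big[\,u_2(x_2)\,(1\otimes P) - f_2(x_2)\,\big]\; d(x_1,x_2).
\]
```

Expanding the product, the term $f_1(x_1) f_2(x_2)\, d(x_1,x_2)$ integrates to d(f1, f2), while every term containing a factor of P acting on d can be rewritten in terms of G by means of Eq. (224).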
In order to calculate the wave front set of $\delta\Delta^{\rm ret}$ (and likewise $\delta\Delta^{\rm adv}$), we use formula (197) (and an analogous formula for the advanced propagator), as well as the wave front set of the advanced, resp. retarded, propagator, bounded by $C_{A/R}(M,g)$. The calculus for the wave front set yields

$$
{\rm WF}(\delta\Delta^{\rm adv/ret}) = \{(y,p; x_1,k_1; x_2,k_2) \mid y \in J^{-/+}(x_1),\ x_2 \in J^{-/+}(y);\ \exists\,(y,q_1),(y,q_2) \text{ such that } (y,q_i) \sim (x_i,-k_i),\ p = q_1 + q_2\}.
\tag{231}
$$
We now compute the wave front set of δd by expressing it in terms of δG and $\delta\Delta^{\rm adv/ret}$ via the s-derivative of Eq. (227), and using the wave front set calculus. This gives the bound on the wave front set of δd, thus completing the proof of Lemma 6.2.

To complete the proof of Eq. (217) for I1, we estimate its wave front set using the calculus for the wave front set together with the estimates Eq. (214) for the wave front set of w, and the estimates on the wave front set of δd provided in Lemma 6.2. This gives

$$
\begin{aligned}
{\rm WF}(I_1) \subset{}& \{(y,p; x_1,k_1;\dots;x_n,k_n) \mid \exists\,(x_1,k_1;\dots;x_n,k_n; z_1,0;\dots;z_i,q_i;\dots;z_j,q_j;\dots;z_r,0) \in {\rm WF}(w) \\
& \quad \text{such that } (y,p; z_i,q_i; z_j,q_j) \in {\rm WF}(\delta d) \text{ for some } i,j\} \\
\subset{}& \{(y,p; x_1,k_1;\dots;x_n,k_n) \mid \exists\,(x_i,q_i) \text{ such that } (x_i,q_i) \sim (y,-p) \text{ and } (x_1,k_1;\dots;x_i,k_i+q_i;\dots;x_n,k_n) \in C_T(M,g), \\
& \quad \text{or } x_i = x_j = y \text{ and there exist } q_i,q_j \in T_y^*M \text{ such that } p = -q_i - q_j \text{ and} \\
& \quad (x_1,k_1;\dots;y,k_i+q_i;\dots;y,k_j+q_j;\dots;x_n,k_n) \in C_T(M,g)\}.
\end{aligned}
\tag{232}
$$
One verifies thereby that I1 satisfies the wave front set condition Eq. (217). The smooth, resp. analytic, dependence of I1 upon the metric, Eq. (218), can be proved in the same way by considering metrics that have in addition a smooth (analytic) dependence upon a further parameter.

Proof of Lemma 6.1 for I2. We next show that I2 satisfies the wave front set condition Eq. (217). It was shown in our previous paper [14] that any distribution that is locally and covariantly constructed from the metric with a smooth dependence upon the metric and an almost homogeneous scaling behavior has a so-called "scaling expansion". This scaling expansion for w takes the form

$$
w_g(x_1,\dots,x_{n+r}) = \sum_j \big(C_g^{a_1\cdots a_j}\, \alpha_g^*\, u_{a_1\dots a_j}\big)(x_1,\dots,x_{n+r}) + \rho_g(x_1,\dots,x_{n+r}),
\tag{233}
$$

where u are tensor valued, Lorentz invariant distributions on $(\mathbb{R}^D)^{n+r-1}$ (we think of $\mathbb{R}^D$ as being identified with the tangent space on M at x1), where C are local curvature terms (evaluated at x1) that are polynomials in the Riemann tensor and its derivatives, and where $\alpha_g$ is the map

$$
\alpha : U_{n+1} \ni (x_1,x_2,\dots,x_{r+n}) \to \big(e^\mu(x_1,x_2),\dots,e^\mu(x_1,x_{r+n})\big) \in (\mathbb{R}^D)^{n+r-1},
\tag{234}
$$

where $e^\mu(x,y)$ denotes the Riemannian normal coordinates $y^\mu$ of a point y relative to a point x. The "remainder" $\rho_g$ is a local, covariant distribution that depends smoothly upon the metric and satisfies the additional properties stated in [14, Theorem 4.1]. We refer the reader to [14, Theorem 4.1] for the construction and further properties of the scaling expansion.

To proceed, we split I2 = I3 + I4 further into a contribution I3 arising from the sum in our scaling expansion and a contribution I4 arising from the remainder in that expansion. We analyze these separately and show that each of them satisfies the wave front set condition Eq. (217). We first analyze I3, given by

$$
I_3(h, f_1,\dots,f_n) = \sum_j \int \delta\big(C^{a_1\cdots a_j}\, \alpha^*\, u_{a_1\cdots a_j}\big)(z, y_1,\dots,y_n, x_1,\dots,x_r)\; h(z) \prod_i f_i(y_i)\, \omega\Big(:\!\prod_{j=1}^{r}\varphi(x_j)\!:_H\Big).
\tag{235}
$$
Since the distributions u in the scaling expansion are actually independent of g (so that δu = 0), we have, dropping the tensor indices, WF[δ(Cα∗ u)] ⊂ WF[(δC)α∗ u] ∪ WF[C(δα)∗ u].
(236)
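The inclusion (236) is the wave front set version of a Leibniz rule. A minimal sketch, assuming that u may be treated formally as a function of $\xi = \alpha_g(x) \in (\mathbb{R}^D)^{n+r-1}$ (for distributions the identity is understood in the sense of pullbacks) and reading $(\delta\alpha)^* u$ as the chain-rule term below; the coordinate $\xi$ and this reading are assumptions made here for illustration.

```latex
% Since u does not depend on the metric (\delta u = 0):
\[
\delta\big(C\,\alpha^* u\big) = (\delta C)\,\alpha^* u + C\,\delta(\alpha^* u),
\qquad
\delta(\alpha^* u) = (\delta\alpha^\mu)\;\alpha^*\!\Big(\frac{\partial u}{\partial \xi^\mu}\Big) .
\]
% The wave front set of a sum lies in the union of the wave front sets:
\[
\mathrm{WF}\big[\delta(C\alpha^* u)\big] \subset
\mathrm{WF}\big[(\delta C)\,\alpha^* u\big] \cup \mathrm{WF}\big[C\,\delta(\alpha^* u)\big].
\]
```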
Thus, in order to analyze the wave front set of δ(Cα∗ u), we only need to analyze the variations δC and δα. But C is just a polynomial in the Riemann tensor and its derivatives, from which one finds WF(δC) ⊂ {(y, p; x, k) | x = y, k = −p}.
(237)
The wave front set of δα in turn follows from the wave front set of δeµ (recall that eµ is essentially the inverse of the exponential map), which in turn can be calculated to be WF(δeµ ) ⊂ {(y, p; x1 , k1 ; x2 , k2 ) | either of the following holds: (y = x1 and p = −k1 , k2 = 0) or (y = x2 and p = −k2 , k1 = 0) or (x1 = x2 = y and p = −k1 − k2 )}.
(238)
Using the calculus for the wave front set, we find that

$$
{\rm WF}[\delta(C\alpha^* u)]\big|_{\Delta_{n+r+1}} \perp T(\Delta_{n+r+1}).
\tag{239}
$$
Since ω is a Hadamard state, the distribution $\omega(:\!\prod_j \varphi(x_j)\!:_H)$ is actually a smooth function. Therefore, using again the calculus for the wave front set, and using the fact that $C\alpha^* u$ has the same support as t [see Eq. (213)], we conclude that I3 is a distribution jointly in h, f1, . . . , fn, satisfying the wave front set condition Eq. (217). The smooth, resp. analytic, dependence of I3 upon the metric, Eq. (218), can be proved in a similar way by considering appropriate families of metrics, instead of the fixed metric, g.

We finally turn our attention to the functional I4, given by

$$
I_4(h, f_1,\dots,f_n) = \int \delta\rho(z, y_1,\dots,y_n, x_1,\dots,x_r)\, h(z) \prod_i f_i(y_i)\, \omega\Big(:\!\prod_j \varphi(x_j)\!:_H\Big).
\tag{240}
$$
We need to show that I4, in fact, defines a distribution on $U_{n+1}$, with the wave front set property (217). Since $\omega(:\!\prod_j \varphi(x_j)\!:_H)$ is a smooth function, the nontrivial contributions to the wave front set of I4 arise entirely from δρ. The wave front set of δρ is analyzed as follows. By construction, δρ is already known to be a distribution on $U_{n+r+1}$ away from $\Delta_{n+r+1}$. Let us denote this distribution $\delta\rho^0$. It follows from the properties of the scaling expansion (cf. [14, Theorem 4.1]) that $\delta\rho^0$ has arbitrarily low scaling degree at $\Delta_{n+r+1}$ (if the scaling expansion is carried out to sufficiently large order). By the arguments given in [14], this entails that δρ arises from $\delta\rho^0$ by continuing the latter in a unique way to a distribution defined on all of $U_{n+r+1}$, in the sense that

$$
\delta\rho = \lim_{\lambda\to 0+} \theta_\lambda\, \delta\rho^0.
\tag{241}
$$

Here, $\theta_\lambda(y, x_1,\dots,x_n) = \theta(\lambda^{-1} S(y; x_1,\dots,x_n))$, where S is any smooth function measuring the distance from the total diagonal, and θ is any smooth, real valued function which vanishes in a neighborhood of the origin in $\mathbb{R}$ and which is equal to 1 outside a compact set. The key point is that we now can derive the wave front set properties of δρ from the fact that it is the unique continuation of $\delta\rho^0$ together
with the known properties of $\delta\rho^0$. The relevant properties of $\delta\rho^0$ are that (see footnote 27)

$$
\Big[\text{closure of } {\rm WF}(\delta\rho^0_{g^{(s)}}) \text{ in } T^*(P \times M^{n+r+1})\Big]\Big|_{P\times\Delta_{n+r+1}} \perp T(P\times\Delta_{n+r+1}),
\tag{242}
$$

where g(s) is any family of metrics depending smoothly upon a parameter s ∈ P, and that $\delta\rho^0$ has a certain integral representation (see [14, Eqs. (55)–(57)]) which can be derived from the fact that it is the remainder in a scaling expansion. It follows from these properties (by an argument completely analogous to the one given in the proof of [14, Proposition 4.1, p. 336]) that

$$
{\rm WF}(\delta\rho)\big|_{\Delta_{n+r+1}} \perp T(\Delta_{n+r+1}).
\tag{243}
$$
This estimate can be used to establish Eq. (217) for I4 by applying the wave front set calculus to the defining relation (240) for I4. The smooth, resp. analytic, dependence of I4 upon the metric, Eq. (218), can be shown by similar methods. This shows that I2 satisfies relations Eqs. (217) and (218), and thereby concludes the proof of Lemma 6.1.

6.2.6. Proof that Dn is symmetric when Φ1 = Tab

We now examine the symmetry properties of Dn. It is a straightforward consequence of the definition of Dn together with the symmetry of the time-ordered products, T6, that Dn(h; f1, . . . , fn) is symmetric in f1, . . . , fn when the fields Φ1, . . . , Φn are also exchanged accordingly. However, the symmetry properties of Dn with regard to exchanges of h with the fi are not at all manifest from the definition of Dn, as h appears on a completely different footing than the fi. We examine here the symmetry properties of Dn under such exchanges, which of course are relevant only when one of the fields fi Φi is equal to the (densitized) stress-energy tensor, say $f_1\Phi_1 = h_1^{ab} T_{ab}$. We claim that the prescription for defining time-ordered products can be modified (within the allowed freedom) so that the corresponding new Dn is symmetric in the sense that

$$
D_n(h_2; h_1, f_2,\dots,f_n) - D_n(h_1; h_2, f_2,\dots,f_n) = 0.
\tag{244}
$$
We note that it is an immediate consequence of this equation, the definition of Dn and the symmetry of the time-ordered products, T6, that

$$
D_n(h_1; \dots, h_i, \dots, h_j, \dots) = \text{symmetric in } h_1, \dots, h_i, \dots, h_j, \dots,
\tag{245}
$$

if $f_i\Phi_i = h_i^{ab} T_{ab}, \dots, f_j\Phi_j = h_j^{ab} T_{ab}$, i.e., if any number of the fields are given by stress energy tensors.
27 Note that it is essential that we know this property for an arbitrary smooth family g(s) and not just a fixed metric.
To prove Eq. (244), let us first consider the simplest case, n = 1, for which the antisymmetric part of D1 is given by (see footnote 28)

$$
\begin{aligned}
E(h_1,h_2) &\equiv D_1(h_1,h_2) - D_1(h_2,h_1) \\
&= \delta_1^{\rm ret}\, T_{ab}(h_2^{ab}) - \delta_2^{\rm ret}\, T_{cd}(h_1^{cd}) + \frac{i}{2}\,[T_{ab}(h_1^{ab}),\, T_{cd}(h_2^{cd})].
\end{aligned}
\tag{246}
$$
We already know, inductively, that D1, and hence E, is a c-number distribution that is supported on the total diagonal in M × M. Moreover, E is also locally and covariantly constructed out of the metric and scales almost homogeneously (with degree = dimension of spacetime) under a rescaling of the metric by a constant conformal factor, because D1 has already been shown to have these properties. Finally, since D1 satisfies the wave front set properties Eqs. (204) and (205), it follows by the same arguments as in [14] that D1 (and hence E) must, in fact, be given by a delta function times suitable curvature terms of the correct dimension,

$$
E(h_1,h_2) = \sum_r \int \Big[ h_1^{ab}\, \big(\nabla_{(f_1} \cdots \nabla_{f_r)} h_2^{cd}\big)\, C_{abcd}{}^{f_1\cdots f_r} - (1 \leftrightarrow 2) \Big],
\tag{247}
$$

where $C_{abcd}{}^{f_1\cdots f_r}$ are local curvature terms of dimension D − r.

We now claim that E = 0 for any prescription such that the quantum stress tensor is conserved, $\nabla^a T_{ab} = 0$. To see this, consider the variations $h_{ab}$ and $£_\xi g_{ab}$, where $\xi^a$ is an arbitrary smooth, compactly supported vector field and $h_{ab}$ an arbitrary smooth compactly supported symmetric tensor field, i.e., choose one of the variations to be of pure gauge. Using stress-energy conservation, and the remark in footnote 28, one deduces $E(h, £_\xi g) = 0$ for any such pair of variations. Substituting this into the above expression for E, one can show this implies that E = 0 by an argument similar to that given in the proof of Theorem 5.1. But it follows from the analysis of Sec. 3.2 above that when D > 2 we can always adjust our prescription for Wick powers and time-ordered products so as to satisfy $\nabla^a T_{ab} = 0$ in addition to T1–T11a. Thus, if we take the "prime" prescription to satisfy conservation of the stress tensor, then Eq. (244) follows when n = 1.

28 In this formula, and in other similar formulas below, we are assuming for simplicity that the metric variations h1 and h2 commute, i.e., that $h_{[1,2]} = \delta_1 h_2 - \delta_2 h_1 = 0$. For non-commuting variations, there would appear the additional term $T_{ab}(h^{ab}_{[1,2]})$ in the formula (246) for E (and corresponding other terms in similar other formulas below). An example of non-commuting variations is $h_1 = h$ and $h_2 = £_\xi g$; in that case $h_{[1,2]} = -£_\xi h$.

In order to prove Eq. (244) for n > 1, we use the identity

$$
D_n(h_1; h_2, \dots, f_n) - D_n(h_2; h_1, \dots, f_n)
 = \big(\delta_1^{\rm ret}\delta_2^{\rm ret} - \delta_2^{\rm ret}\delta_1^{\rm ret}\big)\, T\Big(\prod_{i=2}^{n} f_i\Phi_i\Big)
 + \frac{2}{i}\, E(h_1,h_2)\, T\Big(\prod_{i=2}^{n} f_i\Phi_i\Big),
\tag{248}
$$

which follows from our inductive assumptions and the definition of the retarded products by a calculation similar to those given in the previous subsections. But
we have already shown that E = 0, and we know that $\delta_1^{\rm ret}\delta_2^{\rm ret} - \delta_2^{\rm ret}\delta_1^{\rm ret} = 0$ from the definition of the retarded variation. Hence, the right side of Eq. (248) in fact vanishes, thus establishing Eq. (244) for n > 1.

Remark concerning the cohomological nature of E = 0, and of T11b. There is an alternative strategy to prove the key identity E = 0, which shows the cohomological nature of that condition, and therefore (since it is a necessary condition for T11b to hold) that there can exist "cohomological obstructions" to imposing T11b. We define

$$
\hat\delta = \delta^{\rm ret} + \frac{i}{2}\,[T_{ab}(h^{ab}),\, \cdot\,] = \frac{1}{2}\big(\delta^{\rm ret} + \delta^{\rm adv}\big)
\tag{249}
$$

and we view $\hat\delta$ as a "gauge connection" on local covariant fields. Now apply $\hat\delta_3$ to the defining relation (246) for E(h1, h2), and antisymmetrize over the different metric variations 1, 2 and 3. We obtain (see footnote 29)

$$
\hat\delta_{[1} E(h_2, h_{3]}) = \hat\delta_{[1}\hat\delta_2\, T_{ab}(h^{ab}_{3]}) = 0,
\tag{250}
$$

where the second equality can be verified by a direct calculation using the Jacobi identity (or alternatively can be viewed as the "Bianchi identity" for the "connection" $\hat\delta$, because E is the "curvature" of $\hat\delta$). Since E is a c-number, the antisymmetrized $\hat\delta$-variation of the left side of the equation is actually equal to $\delta_{[1} E(h_2, h_{3]})$, where δ is the ordinary variation of a functional with respect to the metric. Hence

$$
\delta_{[1} E(h_2, h_{3]}) = 0,
\tag{251}
$$

i.e., E has vanishing "curl". In finite dimensions, every differential form with vanishing curl can be written as the curl of a form of lower degree, unless there is a topological obstruction. In the present case, the key issue is whether it is possible to write E as the curl of some F, i.e.

$$
E(h_1,h_2) = \delta_{[1} F(h_{2]})
\tag{252}
$$

for some functional

$$
F(h) = \int C^{ab} h_{ab},
\tag{253}
$$

where $C^{ab}$ is a local curvature term (of the appropriate dimension). The point is that, if E could indeed be written in this way, and if we could then redefine our prescription for the stress-energy tensor by $T'_{ab} = T_{ab} - C_{ab}\, 1\!\!1$, then the new prescription would satisfy E = 0 (as well as $\nabla^a T'_{ab} = 0$).

29 We are assuming that the variations 1, 2 and 3 commute, see footnote 28. If the variations do not commute, there would appear the additional terms $E(h_1, h_{[2,3]}) + E(h_2, h_{[3,1]}) + E(h_3, h_{[1,2]})$ on the left side.

Alternatively, if it is
not always possible to write any E satisfying Eq. (251) in the form (252) (i.e., if the space of functionals of the metric of this type has a nontrivial cohomology with respect to the differential δ), then if such an E arises in Eq. (246) in a quantum field theory, it is clear that there would be no way consistent with axioms T1–T10 to adjust the prescription for defining time-ordered products so as to make E vanish. Consequently, by the arguments given above, it would not be possible to have a conserved stress-energy tensor in such a quantum field theory, i.e., the theory would have a "gravitational anomaly". As we have seen above, this "cohomological obstruction" does not occur for the theory of a scalar field, but it could occur for quantum field theories containing fields of higher spin. In field theories (such as scalar field theory) that are invariant under parity, ε → −ε, it follows that E must transform as (see footnote 30) E → −E. We are not aware of any E with this transformation property which has vanishing curl but cannot be written as the curl of some F. If this could be proven, then this would provide a general proof that we can satisfy E = 0 in field theories preserving parity (assuming that there are no algebraic restrictions on Tab). However, nontrivial cocycles E can occur when parity invariance is dropped. An example in D = 2 spacetime dimensions is

$$
E_{D=2}(h_1,h_2) = \int \Big[ \epsilon^{ab} R\, h_{1\,ac}\, h_{2\,b}{}^{c}
 + 2\,\epsilon^{ab}\, \nabla_c\big(h_{1\,a}{}^{c} - \delta_a{}^{c}\, h_{1\,m}{}^{m}\big)\, \nabla_d\big(h_2{}^{d}{}_{b} - \delta_b{}^{d}\, h_{2\,n}{}^{n}\big) \Big].
\tag{254}
$$

We have checked explicitly that $E_{D=2}$ has vanishing curl. However, $E_{D=2}$ is not the curl of some $F_{D=2}$, as can easily be seen from the fact that, in D = 2 dimensions, the only functional $F_{D=2}$ with the appropriate dimension of length is, up to a numerical factor, $F_{D=2}(h) = \int R^{ab} h_{ab}$. But $F_{D=2}$ transforms as $F_{D=2} \to -F_{D=2}$ under parity, while $E_{D=2} \to +E_{D=2}$, so its antisymmetrized variation cannot be proportional to $E_{D=2}$.
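The role of the representation (252) in the remark above can be made explicit by tracking how E changes under a c-number redefinition of the stress-energy tensor. The following is only a sketch based on Eq. (246): it assumes commuting variations (footnote 28), assumes that $\delta^{\rm ret}$ acts on c-number multiples of the identity as the ordinary variation δ, and leaves aside the normalization convention of the antisymmetrization bracket in Eq. (252). The symbols $T'_{ab}$ and $E'$ denote the redefined stress-energy tensor and its antisymmetric part and are notation used here for illustration.

```latex
% With F(h) = \int C^{ab} h_{ab} as in Eq. (253), the redefinition reads
%   T'_{ab}(h^{ab}) = T_{ab}(h^{ab}) - F(h)\,1 .
% The c-number shift drops out of the commutator in Eq. (246), so
\begin{align*}
E'(h_1,h_2)
 &= \delta_1^{\rm ret}\, T'_{ab}(h_2^{ab}) - \delta_2^{\rm ret}\, T'_{cd}(h_1^{cd})
    + \tfrac{i}{2}\,\big[T'_{ab}(h_1^{ab}),\, T'_{cd}(h_2^{cd})\big] \\
 &= E(h_1,h_2) - \big(\delta_1 F(h_2) - \delta_2 F(h_1)\big).
\end{align*}
% Hence E' = 0 exactly when E(h_1,h_2) = \delta_1 F(h_2) - \delta_2 F(h_1),
% i.e. when E is the "curl" of a local functional F.
```

In this form the obstruction is visible directly: if no local F reproduces the antisymmetric part E, no c-number shift of the stress-energy tensor can remove it.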
The example (254) explicitly shows that nontrivial cocycles E can be present, in principle, in parity violating theories, at least in D = 2 dimensions. In fact, as we have previously noted, gravitational anomalies are known to occur [1] for certain parity violating theories in D = 4k + 2 dimensions.

6.2.7. Proof that Dn can be absorbed in a redefinition of the time-ordered products

We now complete our inductive argument by showing how to redefine our prescription for the time-ordered products so that Dn = 0 for the new prescription when Nϕ ≤ k factors of ϕ are present in f1 Φ1, . . . , fn Φn. To do this, we first collect the facts about Dn which we have established in the previous subsections, and we summarize the conclusions that can be drawn from them about the nature of the Dn.

30 The minus sign arises simply because an integration is implicit in the definition of E. The integrand of such an E would be parity invariant.
By its very definition, we know that for any choice of the fields Φi, Dn(h; f1, . . . , fn) is an (n + 1)-times multilinear functional valued in W. We showed that the values of this functional are actually proportional to the identity operator, allowing us to identify Dn with a functional taking values in the complex numbers. This functional is supported only on the total diagonal in $M^{n+1}$, i.e., it vanishes if the supports of h, f1, . . . , fn have no common points. We also established that the functional Dn depends locally and covariantly on the metric, and that it has an almost homogeneous scaling behavior under rescalings of the metric $g \to \lambda^2 g$, with λ a real constant. Furthermore, since the multilinear functional Dn has been shown to be in fact a distribution (i.e., to be continuous in the appropriate sense) satisfying the wave front set condition (204), it follows that

$$
D_n(h; f_1,\dots,f_n) = \sum_{\alpha_1,\dots,\alpha_n} \int_M h_{ab}\, C^{ab\,\alpha_1\cdots\alpha_n} \prod_{i=1}^{n} \epsilon^{-1} (\nabla)_{\alpha_i} f_i \cdot 1\!\!1,
\tag{255}
$$

where the C are smooth tensor fields depending locally and covariantly on the metric, with a suitable almost homogeneous scaling behavior. Moreover, since we know that the dependence of Dn on the metric is actually smooth, resp. analytic, in the sense of Eq. (210), it follows by the same arguments as in [13] that the C have to be polynomials in the metric, its inverse, the Riemann tensor and its derivatives. The engineering dimension of the derivatives and curvature monomials in each term in Eq. (255) must add up precisely to d − nD where d is the sum of the engineering dimensions of the Φi. (Here, it should be noted that the delta function implicit in Eq. (255) has engineering dimension −nD = n × engineering dimension of $\epsilon^{-1}$ in D spacetime dimensions.) It follows from the unitarity condition on the time-ordered products, T7, together with the fact that $\tau^{\rm ret}(a^*) = [\tau^{\rm ret}(a)]^*$ that the Dn are distributions satisfying the reality condition $\bar D_n = (-1)^{n+1} D_n$. The Dn also satisfy the symmetry condition (245) when one or more of the fields Φi is given by a stress-energy tensor. Finally, we have shown that, when one of the fields Φi is equal to ϕ, we automatically have that Dn = 0.

Our proposal for redefining the prescription for time-ordered products at the given induction order is now the following: if Φ1, . . . , Φn are fields in $V_{\rm class}$ with a total number of Nϕ = k factors of ϕ, then we define

$$
c\big[\varphi\nabla_a\nabla_b\varphi \otimes (\otimes_{i=1}^{n} \Phi_i)\big] = 2i\Big( D_{n\,ab} - \frac{1}{D-2}\, g_{ab}\, D_n{}^{c}{}_{c} \Big).
\tag{256}
$$

We also define distributions $c[(\nabla)^r(\varphi\nabla_a\nabla_b\varphi) \otimes (\otimes_i \Phi_i)]$ associated with all Leibniz dependent expressions in such a way that Eq. (174) is satisfied and we define $c[\Psi \otimes (\otimes_i \Phi_i)] = 0$ for all Ψ which are "Leibniz independent" of $\varphi\nabla_a\nabla_b\varphi$ in the sense used in Proposition 3.1. It is a direct consequence of these definitions that

$$
c\big[T_{ab} \otimes (\otimes_{i=1}^{n} \Phi_i)\big] = 2i\, D_{n\,ab}.
\tag{257}
$$
Because of the symmetry condition (245) satisfied by Dn , it follows that c[Tab ⊗ · · · Tcd ⊗ · · ·] is symmetric in the respective spacetime arguments if one
May 19, 2005 1:20 WSPC/148-RMP
302
J070-00234
S. Hollands & R. M. Wald
(or more) of the fields Φi is given by a stress tensor. Since the c satisfy the Leibniz rule in the first argument by construction, and since they satisfy the Leibniz rule in the remaining n arguments as a consequence of T10, an analogous statement also holds by definition for derivatives of the stress tensor. It follows that c satisfy the symmetry condition (173), and the Leibniz condition (174). It is now clear from the properties that we have established about the Dn that the so-defined coefficients c obey all further restrictions that are necessary in order that the new prescription T′ satisfies T1–T10 and T11a. Since Dn is of the form (255), the c are similarly local covariant delta function type distributions with coefficients that are given by local curvature terms of the appropriate dimension. The c satisfy the unitarity constraint Eq. (172) because the Dn satisfy the analogous relation, and the c satisfy the constraint (178), because we showed that Dn = 0 when one of the Φi is equal to ϕ. On account of the formula (6) for the free stress tensor Tab , the changes in the time-ordered products corresponding to the c given in Eq. (256) via Eqs. (170) and (171) take the following form for time-ordered products with one factor of Tab and n factors Φ1 , . . . , Φn with Nϕ = k factors of ϕ:

$$ T'\Bigl( h_{ab}T^{ab} \prod_{i=1}^{n} f_i\Phi_i \Bigr) = T\Bigl( h_{ab}T^{ab} \prod_{i=1}^{n} f_i\Phi_i \Bigr) + 2i\, D_n^{ab}(h_{ab}; f_1, \dots, f_n)\cdot \mathbb{1}. \tag{258} $$

It follows from this relation that the new prescription T′ is designed so that Dn = 0 for all Φ1 , . . . , Φn such that Nϕ ≤ k. Hence, T11b holds for the new prescription at the desired order in the induction process. This completes the proof that when D > 2, we can satisfy condition T11b in addition to conditions T1–T10 and T11a.

Remarks. In D = 2 spacetime dimensions, we cannot define coefficients c by Eq. (256) (because of the factor of D − 2 in the denominator), unless Dn already happens to vanish, in which case there would of course be nothing to show in the first place. However, Dn is explicitly seen to be nonzero already for n = 1 and Φ1 = ϕ2 in D = 2 spacetime dimensions in the local normal ordering prescription, and our previous arguments show that it cannot be made to vanish.

7. Outlook

In this paper, we have proposed two new conditions, T10 and T11, that we argued should be imposed on the definition of Wick polynomials and time-ordered products in the theory of a quantum scalar field in curved spacetime. These conditions supplement our previous conditions T1–T9, and place significant additional restrictions on the definition of Wick polynomials and time-ordered products that involve derivatives of the field. We also showed that conditions T1–T10 and T11a can
Conservation of the Stress Tensor in Perturbative Interacting Quantum Field Theory
always be consistently imposed, and in spacetimes of dimension D > 2, condition T11b also can be imposed. In addition, we proved that if these conditions are imposed on the definition of Wick polynomials and time-ordered products of the free field, then for an arbitrary interaction Lagrangian, L1 , the perturbatively defined stress-energy tensor of the interacting field will be conserved. We do not believe that there are any further natural conditions that should be imposed on the definition of Wick polynomials and time-ordered products for a quantum scalar field in curved spacetime. If so, axioms T1–T11 together with the existence proofs and uniqueness analyses of this paper and our previous papers essentially complete the perturbative formulation of interacting quantum field theory in curved spacetime for a scalar field with an arbitrary interaction Lagrangian. For quantum fermion fields in curved spacetime, one can define a “canonical anti-commutation algebra” in direct analogy to the canonical commutation algebra A(M, g) defined at the beginning of Sec. 2.1. The next step toward the formulation of the theory of interacting fermion fields in curved spacetime would be to define the fermionic analog of the algebra W and to formulate suitable fermionic analogs of our axioms T1–T11. We do not anticipate that any major difficulties would arise in carrying out these steps, although we have not yet attempted to do so ourselves. We also would expect it to be possible to prove existence and uniqueness results for the fermion case in close parallel to the scalar field case. Indeed, the only place in our entire analysis where it is clear that differences can arise is the analysis of obstructions to the implementation of condition T11b. As previously noted, the analysis of [1] establishes that the analog of condition T11b cannot hold for certain parity violating theories in spacetimes of dimension 4k + 2. 
To define the quantum theory of Yang–Mills fields in curved spacetime, one would presumably start, as in flat spacetime, by “gauge fixing” and introducing “ghost fields”. However, to proceed further in the spirit of our approach, one would have to formulate the theory entirely within the algebraic framework, including procedures for extracting gauge invariant information from the field algebra. Since many subtleties already arise in the usual treatments of Yang–Mills fields in flat spacetime due to local gauge invariance, we do not anticipate that it will be straightforward to extend our analysis to the Yang–Mills case. We expect that it would be even less straightforward to extend our analysis to a perturbative treatment of quantum gravity itself off of an arbitrary globally hyperbolic, classical solution to Einstein’s equation, although we also do not see any obvious reasons why this could not be done.

Returning to the case of a scalar field, there remain some significant unresolved issues even if the renormalization theory as presently formulated turns out to be essentially complete. One such issue concerns the probability interpretation of the theory. As emphasized, e.g., in [20], there is no meaningful notion of “particle” (even asymptotically) in a general curved spacetime. Thus, the only meaningful observables are the smeared local and covariant quantum fields themselves. Let Φ(f ) ∈ W be such a field observable for the free scalar field ϕ, which is
“self-adjoint” in the sense that Φ(f )∗ = Φ(f ). For any state, ω, the very definition of ω provides one with the expectation value, ⟨Φ(f )⟩ = ω(Φ(f )), of this observable in the state ω. We also can directly obtain the moments, ω([Φ(f ) − ⟨Φ(f )⟩]^n ), of the probability distribution for measurements of Φ(f ) in the state ω, since powers of Φ(f ) are also in W. However, to obtain the probability distribution itself, we need to go to a Hilbert space representation, such as the GNS representation, where ω is represented by an ordinary vector in a Hilbert space, and Φ(f ) is represented by an operator π[Φ(f )], so that probabilities can be calculated by the usual Hilbert space methods. However, a potential difficulty arises here. Although in the GNS representation π[Φ(f )] is automatically a symmetric operator defined on a dense, invariant domain D, there does not appear to be any guarantee that π[Φ(f )] will be essentially self-adjoint on D. If essential self-adjointness fails, then further input would be needed to obtain a probability distribution. Specifically, if π[Φ(f )] has more than one self-adjoint extension, then additional rules would have to be found to determine which self-adjoint extension should be used to define the probability distribution. Worse yet, if π[Φ(f )] does not admit any self-adjoint extension at all, it is hard to see how any consistent probability rules can be given. As far as we are aware, this issue is unresolved for general observables in W even for the vacuum state in Minkowski spacetime.

Another issue of interest that has not yet been investigated in depth concerns whether a useful, non-perturbative, axiomatic characterization of interacting quantum field theory in curved spacetime can be given. The usual axiomatic formulations of quantum field theory in Minkowski spacetime, such as the Wightman axioms [19], make use of properties that are very special to Minkowski spacetime.
It seems clear that a suitable replacement for the Minkowski spacetime assumption of covariance of the quantum fields under Poincaré transformations is the condition that the quantum fields be local and covariant [13, 6]. It also seems clear that microlocal spectral conditions should provide a suitable replacement for the usual spectral condition assumptions in Minkowski spacetime. However, it is far less clear what should replace the Minkowski spacetime assumption of the existence of a unique, Poincaré invariant vacuum state, since no analog of this property exists in curved spacetime. One possibility for such a replacement might be suitable assumptions concerning the existence and properties of an operator product expansion [11].

Undoubtedly, the foremost unresolved issue with regard to the perturbative formulation of quantum field theory in curved spacetime concerns the meaning and convergence properties of the Bogoliubov formula, Eq. (91), which defines the interacting field. It is, of course, very well known that “perturbation theory in quantum field theory does not converge”. However, as we pointed out in [15], the usual results and arguments against convergence concern the calculation of quantities that involve ground states or “in” and “out” states, and such states would not be expected to have the required analyticity properties. We believe that Eq. (91) stands the best chance of making well-defined mathematical sense if it is interpreted as determining the algebraic relations that hold in the interacting field algebra. The
formula does not, of course, make sense as it stands (except as a formal power series) since we have not defined a topology on W (so the notion of “convergence” has not even been defined) and, in any case, W should be “too small” to contain the elements of the interacting field algebra, since W consists only of polynomial expressions in ϕ smeared with appropriate distributions. However, one could imagine “enlarging” W by defining a suitable topological algebra $\bar{W}$ into which W is densely embedded. We see no obvious reason why such a $\bar{W}$ could not be defined so that Eq. (91) would define a convergent series in $\bar{W}$, but, of course, we also do not see an obvious way of carrying this out! These ideas appear to be worthy of further investigation.

Acknowledgments

We would like to thank M. Dütsch and K. Fredenhagen for discussions and for making available to us their manuscript on the Action Ward Identity [10] prior to publication. S. Hollands would like to thank the II. Institut für Theoretische Physik, Universität Hamburg, for their kind hospitality. This work was supported by NSF grant PHY00-90138 to the University of Chicago.

Appendix A. Infinitesimal Retarded Variations

Let $g^{(s)}$ be a smooth 1-parameter family of metrics differing from $g \equiv g^{(0)}$ only within a compact subset K. In this appendix, we show that the retarded variation with respect to the metric defined by

$$ \delta_g^{\rm ret}\Bigl[ T_g\Bigl( \prod_{i=1}^{n} f_i\Phi_i \Bigr) \Bigr] = \frac{\partial}{\partial s}\, \tau^{\rm ret}_{g^{(s)}}\Bigl[ T_{g^{(s)}}\Bigl( \prod_{i=1}^{n} f_i\Phi_i \Bigr) \Bigr] \Big|_{s=0} \tag{259} $$

appearing in our requirement T11b is well-defined and yields an element of W(M, g). Our proof can easily be generalized to also prove the corresponding statement for an infinitesimal retarded variation of the potential, T11c, but we shall not treat this case explicitly.

Let $A^{\rm ret}_{(s)}$ be the map as defined by Eq. (99) above, let $\omega_2$ be the two-point function of a Hadamard state on (M, g) and let $\omega_2^{(s)}$ be the Hadamard 2-point functions for $(M, g^{(s)})$, uniquely specified by the requirement that $\omega_2^{(s)}$ coincides with $\omega_2$ when both arguments are taken within $M \setminus J^+(K)$. Then it can be verified that $A^{\rm ret}_{(s)}$ and $\omega_2^{(s)}$ have a smooth dependence upon s in the sense that, when viewed as distributions jointly in $s, x_1, x_2$, we have

$$ {\rm WF}(A^{\rm ret}_{(s)}) \subset \{ (s, \rho; x_1, k_1; x_2, -k_2) \mid \exists\, y \in M \setminus J^+(K) \text{ and } (y, p) \in T_y^*M \setminus \{0\} \text{ such that } (x_1, k_1) \sim (y, p) \text{ with respect to } g \text{ and such that } (x_2, k_2) \sim (y, p) \text{ with respect to } g^{(s)} \}, \tag{260} $$

as well as Eq. (35). Since the definition of W does not depend upon the choice of quasifree Hadamard state used in the definition of the generators Wn , we can assume
without loss of generality that the generators $W^{(s)}$ of the algebra $W(M, g^{(s)})$ are defined using the particular 1-parameter family of states $\omega^{(s)}$ that we have just described. To compute the action of $\tau^{\rm ret}$ on the time-ordered products, we recall that the time-ordered products have the following “(global) Wick expansion”,

$$ T\Bigl( \prod_{i=1}^{n} f_i\Phi_i \Bigr) = \sum_{\alpha_1,\dots,\alpha_n} \frac{1}{\alpha_1! \cdots \alpha_n!} \int \omega\Bigl[ T\Bigl( \prod_{i=1}^{n} \delta^{\alpha_i}\Phi_i(y_i) \Bigr) \Bigr] \prod_i f_i(y_i)\; \cdot\; :\!\prod_i \prod_j [(\nabla)^j \varphi(y_i)]^{\alpha_{ij}}\!:_\omega \tag{261} $$

where we are using the same notation as in the local Wick expansion (see footnote 31) given in (212). Inserting suitable δ-distributions, we can rewrite the Wick expansion in the form

$$ T\Bigl( \prod_i f_i\Phi_i \Bigr) = \sum_n \int u_n(y_1, \dots, y_m; x_1, \dots, x_n) \prod_j f_j(y_j)\; :\!\prod_{i=1}^{n} \varphi(x_i)\!:_\omega \;=\; \sum_n W_n\bigl( u_n(\otimes_i f_i) \bigr), \tag{262} $$

where the distributions $u_n$ are defined in terms of the $\omega[T(\prod \delta^{\alpha_i}\Phi_i(y_i))]$ with $\sum_i |\alpha_i| = n$, together with suitable derivatives of delta functions. On account of the delta functions, the $u_n$ have the same support as the distributions w in Eq. (213), and they satisfy the same wave front set condition as in Eq. (214). Furthermore, if we repeat the above steps for our family of metrics $g^{(s)}$ instead of the single metric g (with $\omega_2$ replaced by $\omega_2^{(s)}$ everywhere) then we find that the corresponding distributions $u_n^{(s)}$ have a smooth dependence upon s, i.e., that they satisfy the same wave front set condition as in Eq. (215). By the wave front set calculus, we conclude that, for any n, and for any fixed choice of smooth compactly supported functions $f_i$, the quantity $u_n^{(s)}(\otimes_i f_i; x_1, \dots, x_n)$ is indeed a distribution in the variables $x_1, \dots, x_n$ belonging to the space $E_n(M, g^{(s)})$. Moreover, it follows from the smoothness property in s of the $u_n^{(s)}$ that these distributions actually have a smooth dependence upon s in the sense that, when viewed as a distribution jointly in $s, x_1, \dots, x_n$, we have

$$ {\rm WF}\bigl( u_n^{(s)}(\otimes_i f_i) \bigr) \subset \{ (s, \rho; x_1, k_1; \dots; x_n, k_n) \mid (x_1, k_1; \dots; x_n, k_n) \notin [ (V^{(s)+})^n \cup (V^{(s)-})^n ] \setminus \{0\} \}, \tag{263} $$

where $V^{(s)\pm}$ are the future/past lightcones associated with the metrics $g^{(s)}$.

Footnote 31: Note that this expansion is entirely analogous to the local Wick expansion (212). The only difference is that in the local Wick expansion, the time-ordered products are expanded in terms of the local normal ordered products (59), while we are using the normal ordered products with respect to $\omega_2$ in Eq. (262). The latter are globally defined on all of $M^n$, whereas the former are only defined in a neighborhood $U_n$ of the total diagonal (but, in contrast to the normal ordered products in Eq. (262), the former depend locally and covariantly on the metric).
Substituting Eq. (262) into the definition of $\tau^{\rm ret}_{g^{(s)}}$, we find

$$ \tau^{\rm ret}_{g^{(s)}}\Bigl[ T_{g^{(s)}}\Bigl( \prod_i f_i\Phi_i \Bigr) \Bigr] = \sum_n W_n\Bigl( (A^{\rm ret}_{(s)})^{\otimes n}\, u_n^{(s)}(\otimes_i f_i) \Bigr) = \sum_n W_n\bigl( v_n^{(s)} \bigr), \tag{264} $$

where the distributions $v_n^{(s)}$ are the elements in the space $E_n(M, g)$ defined by the last equation. It follows from the calculus for wave front sets together with the wave front set of $A^{\rm ret}_{(s)}$ [see Eq. (260)], and the wave front property of $u_n^{(s)}$ [see Eq. (263)] that $v_n^{(s)}$ are distributions depending smoothly upon s in the sense that, when viewed as distributions jointly in $s, x_1, \dots, x_n$, we have

$$ {\rm WF}\bigl( v_n^{(s)} \bigr) \subset \{ (s, \rho; x_1, k_1; \dots; x_n, k_n) \mid (x_1, k_1; \dots; x_n, k_n) \notin [ (V^{(s)+})^n \cup (V^{(s)-})^n ] \setminus \{0\} \}. \tag{265} $$

It then follows that the differentiated functionals $\frac{\partial}{\partial s} v_n^{(s)} |_{s=0}$ are in fact well-defined distributions in the class $E_n(M, g)$. Thus, expression (259) exists as the well-defined algebra element $\sum_n W_n( \frac{\partial}{\partial s} v_n^{(s)} |_{s=0} )$. This is what we wanted to show.
Appendix B. Functional Derivatives

In this appendix we define the functional derivatives, δA/δϕ and δA/δgab , of any element A of the space Fclass . We shall elucidate the calculus of the functional derivative operations and, in particular, we will derive Eqs. (132) and (140), which were used in the proof of Theorem 5.1 and in Sec. 6. Let A ∈ Fclass . Then A is a D-form that is locally constructed out of g, the curvature, finitely many symmetrized derivatives of the curvature, ϕ, finitely many symmetrized derivatives of ϕ, and test tensor fields f and their symmetrized derivatives. We will denote these dependences as simply A = A[g, ϕ, f ]. The functional derivative of A with respect to ϕ is defined by

$$ \frac{\partial}{\partial s} A[g, \varphi + s\psi, f]\Big|_{s=0} = \psi\, \frac{\delta A}{\delta\varphi} + dB_\varphi[g, \varphi, f, \psi], \tag{266} $$

where Bϕ is a (D − 1)-form that is similarly locally constructed out of g, ϕ, f , and ψ. The decomposition of the right side of Eq. (266) into the two terms written there is uniquely determined by the requirements that (1) no derivatives of ψ appear in the first term and (2) the second term is exact. The manipulations leading to this decomposition are the familiar ones that would be used to derive the Euler–Lagrange equations if A were a Lagrangian; these manipulations are usually done under an integral sign, with the “boundary term”, dBϕ , discarded. An explicit formula for δA/δϕ was given in Eq. (42) above. It is worth noting that if A is an exact form, i.e., A = dC for some C = C[g, ϕ, f ], then its functional derivative vanishes, since clearly Eq. (266) holds with δA/δϕ = 0 and Bϕ = (∂/∂s)C[g, ϕ + sψ, f ]|s=0 .
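As a purely illustrative check of Eq. (266), one may take the D-form $A = \tfrac12 f\,\nabla^a\varphi\nabla_a\varphi\,\epsilon$. The volume form $\epsilon$ and this particular choice of A are assumptions made here for the sake of the example; they are not taken from the text above. Under those assumptions, the decomposition can be sketched as:

```latex
\begin{align*}
\frac{\partial}{\partial s} A[g,\varphi+s\psi,f]\Big|_{s=0}
  &= f\,\nabla^a\varphi\,\nabla_a\psi\;\epsilon \\
  &= \psi\,\bigl[-\nabla_a(f\nabla^a\varphi)\bigr]\epsilon
     + \nabla_a\bigl(\psi f\nabla^a\varphi\bigr)\epsilon ,
\end{align*}
% Comparing with Eq. (266):
\begin{align*}
\frac{\delta A}{\delta\varphi} &= -\nabla_a(f\nabla^a\varphi)\,\epsilon , &
(B_\varphi)_{a_2\cdots a_D} &= \psi f\,\nabla^b\varphi\;\epsilon_{b a_2\cdots a_D} ,
\end{align*}
% using d(v \cdot \epsilon) = (\nabla_b v^b)\,\epsilon for a vector field v^b.
```

In this example the first term contains no derivatives of ψ and the second term is exact, in line with requirements (1) and (2).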
Similarly, the functional derivative of A with respect to gab is defined by

$$ \frac{\partial}{\partial s} A[g + sh, \varphi, f]\Big|_{s=0} = h_{ab}\, \frac{\delta A}{\delta g_{ab}} + dB_g[g, \varphi, f, h]. \tag{267} $$

We can obtain an explicit expression for δA/δgab by introducing an arbitrary fixed, background derivative operator, $\mathring{\nabla}_a$, on M , and re-writing $\nabla_a$ and the curvature in terms of $\mathring{\nabla}_a$ and derivatives of g with respect to $\mathring{\nabla}_a$. The resulting explicit formula for δA/δgab was given in Eq. (114) above. Our first result is that functional derivatives with respect to ϕ and g commute modulo exact forms in the sense that

$$ h_{ab}\, \frac{\delta}{\delta g_{ab}}\Bigl( \psi\, \frac{\delta A}{\delta\varphi} \Bigr) = \psi\, \frac{\delta}{\delta\varphi}\Bigl( h_{ab}\, \frac{\delta A}{\delta g_{ab}} \Bigr) + dB \tag{268} $$
for some (D − 1)-form, B, that is locally constructed out of g, ϕ, f , ψ and h. To prove this, we note that

$$ \begin{aligned} \frac{\partial^2}{\partial s\,\partial t} A[g + th, \varphi + s\psi, f]\Big|_{t=s=0} &= \frac{\partial}{\partial s}\Bigl( h_{ab}\, \frac{\delta A}{\delta g_{ab}} + dB_g \Bigr)\Big|_{s=0} \\ &= \frac{\partial}{\partial s}\Bigl( h_{ab}\, \frac{\delta A}{\delta g_{ab}} \Bigr)\Big|_{s=0} + dC_g \\ &= \psi\, \frac{\delta}{\delta\varphi}\Bigl( h_{ab}\, \frac{\delta A}{\delta g_{ab}} \Bigr) + dC_g + dB_\varphi \end{aligned} \tag{269} $$

where Cg = (∂/∂s)Bg [g, ϕ + sψ, f, h]|s=0 . By equality of mixed partials, we may reverse the order of differentiation with respect to s and t on the left side of Eq. (269). However, ∂ 2 A/∂t∂s is given by a similar expression with the order of the functional derivatives reversed. This establishes Eq. (268). Let us now prove the relation

$$ (\pounds_\xi g_{ab})\, \frac{\delta A}{\delta g_{ab}} + (\pounds_\xi \varphi)\, \frac{\delta A}{\delta\varphi} + (\pounds_\xi f)\, \frac{\delta A}{\delta f} = dH \tag{270} $$

where δA/δf is defined by analogy with Eqs. (266) and (267) and is given by an explicit formula analogous to Eq. (42). This equation is equivalent to Eq. (140) when A depends linearly upon f . Let Fs be the one-parameter family of diffeomorphisms of M generated by a smooth, compactly supported vector field ξ a . Since A is locally and covariantly constructed from g, ϕ, f , we have

$$ F_s^* A[g, \varphi, f] = A[F_s^* g, F_s^* \varphi, F_s^* f], \tag{271} $$
where Fs∗ denotes the pull-back of a tensor field. We differentiate this equation at s = 0, and use the fact that the Lie-derivative of any D-form A is given by £ξ A = d(ξ · A), where ξ · A is the (D − 1)-form obtained by contracting the index of the vector field into the first index of the form. We obtain

$$ d(\xi \cdot A[g, \varphi, f]) = \frac{\partial}{\partial s} A[g + s\pounds_\xi g, \varphi, f]\Big|_{s=0} + \frac{\partial}{\partial s} A[g, \varphi + s\pounds_\xi \varphi, f]\Big|_{s=0} + \frac{\partial}{\partial s} A[g, \varphi, f + s\pounds_\xi f]\Big|_{s=0}. \tag{272} $$

By Eq. (267) (with hab = £ξ gab in that equation), the first term on the right side is equal to £ξ gab · δA/δgab up to some exact form dBg . Similarly, the second term is equal to £ξ ϕ · δA/δϕ, up to some exact form dBϕ . Finally, the last term is given by £ξ f · δA/δf plus some dBf . Thus, we get Eq. (270), with H = ξ · A − Bg − Bϕ − Bf . Finally, we prove the relation

$$ (D_\eta D_\xi - D_\xi D_\eta) A = D_{[\xi,\eta]} A + dC, \tag{273} $$

for some locally constructed (D − 1) form C, where the variational operation Dξ is defined by

$$ D_\xi A = \pounds_\xi g_{ab} \cdot \frac{\delta A}{\delta g_{ab}}. \tag{274} $$

According to Eq. (267), we may write

$$ D_\xi A[g] = \frac{\partial}{\partial s} A[g + s\pounds_\xi g]\Big|_{s=0} - dB[g, \xi] \tag{275} $$

for some B, where we are now omitting reference to the dependence upon f, ϕ to lighten the notation. Now apply Dη to this equation.

$$ D_\eta D_\xi A[g] = D_\eta\, \frac{\partial}{\partial s} A[g + s\pounds_\xi g]\Big|_{s=0} - D_\eta\, dB[g, \xi]. \tag{276} $$

The second term on the right side of this equation vanishes, since it is the functional derivative of an exact form. Applying Eq. (275) to the first term on the right side of Eq. (276), we get

$$ D_\eta\, \frac{\partial}{\partial s} A[g + s\pounds_\xi g]\Big|_{s=0} = \frac{\partial^2}{\partial s\,\partial t} A[g + s\pounds_\xi (g + t\pounds_\eta g)]\Big|_{s=t=0} + dE[g, \xi, \eta] \tag{277} $$

for some E. Combining these equations and antisymmetrizing over ξ and η, we obtain

$$ (D_\eta D_\xi - D_\xi D_\eta) A[g] = \frac{\partial^2}{\partial s\,\partial t} A[g + st(\pounds_\xi \pounds_\eta - \pounds_\eta \pounds_\xi) g]\Big|_{s=t=0} + dK \tag{278} $$
for some locally constructed (D − 1)-form K. Applying Eq. (275) once more to the first term on the right side of Eq. (278) and using £ξ £η − £η £ξ = £[ξ,η] , we obtain the desired relation (273).
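The final step can be spelled out as follows. The auxiliary function G and the parameter u are notation introduced here for illustration only; they do not appear in the text above.

```latex
% Write h = \pounds_{[\xi,\eta]} g and G(u) = A[g + u h]. Then
\begin{align*}
\frac{\partial^2}{\partial s\,\partial t}\, G(st)\Big|_{s=t=0}
  &= \bigl[\, G'(st) + st\, G''(st) \,\bigr]_{s=t=0} = G'(0) \\
  &= \frac{\partial}{\partial u} A[g + u\,\pounds_{[\xi,\eta]} g]\Big|_{u=0}
   = D_{[\xi,\eta]} A[g] + dB[g, [\xi,\eta]] ,
\end{align*}
% where the last equality is Eq. (275) with \xi replaced by [\xi,\eta].
% Inserting this into Eq. (278) gives Eq. (273) with C = K + B[g,[\xi,\eta]].
```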
References

[1] L. Alvarez-Gaume and E. Witten, Gravitational anomalies, Nucl. Phys. B234 (1984) 269.
[2] F. M. Boas, Gauge theories in local causal perturbation theory, DESY-THESIS 1999-032 (1999) [arXiv:hep-th/0001014].
[3] L. Bonora, P. Pasti and M. Bregola, Weyl cocycles, Class. Quant. Grav. 3 (1986) 635–649.
[4] R. Brunetti, K. Fredenhagen and M. Köhler, The microlocal spectrum condition and Wick polynomials on curved spacetimes, Commun. Math. Phys. 180 (1996) 633–652.
[5] R. Brunetti and K. Fredenhagen, Microlocal analysis and interacting quantum field theories: Renormalization on physical backgrounds, Commun. Math. Phys. 208 (2000) 623–661.
[6] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle: A new paradigm for local quantum physics, Commun. Math. Phys. 237 (2003) 31 [arXiv:math-ph/0112041]; see also K. Fredenhagen, Locally covariant quantum field theory [arXiv:hep-th/0403007].
[7] M. Dütsch and K. Fredenhagen, A local (perturbative) construction of observables in gauge theories: The example of QED, Commun. Math. Phys. 203 (1999) 71.
[8] M. Dütsch and K. Fredenhagen, Algebraic quantum field theory, perturbation theory, and the loop expansion, Commun. Math. Phys. 219 (2002) 5 [arXiv:hep-th/0001129]; Perturbative algebraic field theory, and deformation quantization [arXiv:hep-th/0101079].
[9] M. Dütsch and K. Fredenhagen, The master Ward identity and generalized Schwinger–Dyson equation in classical field theory, Commun. Math. Phys. 243 (2003) 275 [arXiv:hep-th/0211242].
[10] M. Dütsch and K. Fredenhagen, Causal perturbation theory in terms of retarded products, and a proof of the action Ward identity [arXiv:hep-th/0403213].
[11] S. Hollands, PCT theorem for the operator product expansion in curved spacetime, Commun. Math. Phys. 244 (2004) 209–244 [arXiv:gr-qc/0212028].
[12] S. Hollands and W. Ruan, The state space of perturbative quantum field theory in curved space-times, Annales Henri Poincare 3 (2002) 635 [arXiv:gr-qc/0108032].
[13] S. Hollands and R. M. Wald, Local Wick polynomials and time ordered products of quantum fields in curved space, Commun. Math. Phys. 223 (2001) 289–326 [arXiv:gr-qc/0103074].
[14] S. Hollands and R. M. Wald, Existence of local covariant time-ordered-products of quantum fields in curved spacetime, Commun. Math. Phys. 231 (2002) 309–345 [arXiv:gr-qc/0111108].
[15] S. Hollands and R. M. Wald, On the renormalization group in curved spacetime, Commun. Math. Phys. 237 (2003) 123–160 [arXiv:gr-qc/0209029].
[16] L. Hörmander, The Analysis of Linear Partial Differential Operators I (Springer-Verlag, Berlin, 1983).
[17] V. Moretti, Comments on the stress-energy tensor operator in curved spacetime, Commun. Math. Phys. 232 (2003) 189 [arXiv:gr-qc/01090].
[18] R. Stora, Pedagogical experiments in renormalized perturbation theory, in Theory of Renormalization and Regularization, Hesselberg, Germany (2002), http://wwwthep.physik.uni-mainz.de/scheck/Hessbg02.html.
[19] R. F. Streater and A. A. Wightman, PCT, Spin and Statistics and All That (Benjamin, New York, 1964).
[20] R. M. Wald, Quantum Field Theory on Curved Spacetimes and Black Hole Thermodynamics (The University of Chicago Press, Chicago, 1990).
[21] J. Wess and B. Zumino, Consequences of anomalous Ward identities, Phys. Lett. B37 (1971) 95.
J070-00233
Reviews in Mathematical Physics Vol. 17, No. 3 (2005) 313–364 © World Scientific Publishing Company
NEW QUANTUM “az + b” GROUPS
PIOTR MIKOLAJ SOLTAN Department of Mathematical Methods in Physics, Faculty of Physics, University of Warsaw, Poland
Received 15 June 2004
Revised 21 February 2005

We construct quantum “az + b” groups for new values of the deformation parameter. Along the way we introduce new special functions and study their analytic properties as well as analyze the commutation relations determined by the choice of parameter.

Keywords: Non-compact quantum group; multiplicative unitary; commutation relations.
Contents

1. Introduction
1.1. Algebraic quantum “az + b” groups
1.2. Quantum “az + b” groups of S. L. Woronowicz
1.3. Description of the paper
2. Three Special Functions and Their Properties
2.1. The bicharacter χ
2.2. The Fresnel function α
2.3. The quantum exponential function Fq
2.4. The product formula
2.5. Asymptotic behavior of Fq
2.6. Uniqueness of Fq
2.7. Fourier transform of Fq
2.8. Other useful functions
3. Operator Equalities
3.1. The commutation relations
3.2. Products
3.3. Sums
3.4. Necessity of the spectral condition
4. Affiliation Relation
4.1. Generators of some C∗-algebras
4.2. Algebraic consequences
5. Multiplicative Unitary and Its Properties
5.1. The quantum group space
5.2. Multiplicativity
5.3. Modularity
6. The Quantum “az + b” Group for New Values of q
6.1. The C∗-algebra
6.2. Quantum group structure
1. Introduction

The aim of this paper is to carry out the construction of quantum “az + b” groups presented in [17], for new values of the deformation parameter. The construction procedure from [17] will be repeated with necessary modifications. The main problems lie in developing the machinery of special functions needed for the construction and applying this machinery to analysis of appropriate commutation relations. We also propose ways to streamline some aspects of the construction known from [17]. We will base our construction of new quantum “az + b” groups on the notion of a modular multiplicative unitary operator [8]. The formula for the multiplicative unitary will be given in terms of special functions defined in Sec. 2 and operators satisfying commutation relations described in Sec. 3. In order to prove the necessary properties of our multiplicative unitary we shall have to study both the special functions and the commutation relations in detail. As a result the two long Secs. 2 and 3 will have little to do with quantum groups. Their content will, however, be of utmost importance for the construction of the new quantum “az + b” groups.

1.1. Algebraic quantum “az + b” groups

On the level of Hopf ∗-algebra the quantum “az + b” groups we will be interested in are described as follows: let λ be a non-zero complex number. Let H be a unital ∗-algebra generated by three normal elements a, a^{-1} and b with the relations

$$ a^{-1}a = I, \qquad aa^{-1} = I, \qquad ab = \lambda ba, \qquad ab^* = b^*a. $$

Then H can be given a structure of a Hopf ∗-algebra by

$$ \delta(a) = a \otimes a, \qquad \delta(b) = a \otimes b + b \otimes I. $$

The coinverse and counit are given on generators in the following way:

$$ \kappa(a) = a^{-1}, \qquad \kappa(b) = -a^{-1}b, \qquad \varepsilon(a) = 1, \qquad \varepsilon(b) = 0. $$

An important fact about H is that the coinverse has a polar decomposition (cf. [9, Proposition 2.4]). By existence of a polar decomposition of κ we mean that there exist a ∗-antiautomorphism R of H and a one-parameter group (τt )t∈R of ∗-automorphisms of H such that for any linear functional f on H and any x ∈ H the map

$$ \mathbb{R} \ni t \mapsto f(\tau_t(x)) \in \mathbb{C} $$
has an extension to an entire function and κ = R ◦ τ_{i/2}, where τ_{i/2} is an automorphism of H obtained by analytic continuation of the group (τt )t∈R .

For λ = e^{2πi/n} we can impose a further condition that a^n and b^n be self-adjoint. This corresponds to taking a quotient of H by the ideal generated by a^n − (a^n)∗ and b^n − (b^n)∗. It turns out that this is a Hopf ideal, i.e. the quotient inherits Hopf ∗-algebra structure. Moreover the coinverse on the resulting Hopf ∗-algebra still has a polar decomposition.

1.2. Quantum “az + b” groups of S. L. Woronowicz

A great challenge taken up in [17] was to construct the quantum “az + b” group on C∗-algebra level. The construction was based on the Hopf ∗-algebraic picture with λ = e^{2πi/n}, n ≥ 3, and the relations a^n = (a^n)∗ and b^n = (b^n)∗ were translated into spectral conditions for a and b. On C∗-algebra level the generators a and b become unbounded operators. The condition that their nth powers be self-adjoint is equivalent to the condition that their spectra be contained in the closure of the set

$$ \bigcup_{k=0}^{2n-1} e^{\frac{k\pi i}{n}}\, \mathbb{R}_+ . $$

It was noticed that this set is a multiplicative (and self-dual) subgroup of C\{0} and the commutation relations between a and b were written in the Weyl form with use of a bicharacter on this group (cf. also Subsec. 3.1). This very successful construction was then performed in a different setting. In [17, Appendix B] a construction of quantum “az + b” groups was also done for 0 < λ < 1. One could no longer take the Hopf ∗-algebra with a^n and b^n self-adjoint. Therefore one starts with the Hopf ∗-algebra H (see the previous subsection) and then imposes the condition that the spectra of a and b be contained in the closure of the following multiplicative (and self-dual) subgroup of C\{0}:

$$ \bigl\{ z \in \mathbb{C} : |z| \in \lambda^{\frac{1}{2}\mathbb{Z}} \bigr\}. $$

It turns out that the construction works and we obtain quantum “az + b” groups for real deformation parameter. Note that this spectral condition does not correspond to any quotient of H. The quantum “az + b” groups we aim to construct in this paper will be of similar nature. They do not correspond to any quotient of the Hopf ∗-algebra H. Nevertheless some aspects of these groups are similar to those with deformation parameter assuming the value of an even root of unity.

1.3. Description of the paper

Let us now briefly describe the contents of the paper. In Sec. 2 we describe the values of the deformation parameter which we will use throughout the paper. Then we introduce the subsets Γq , $\bar{\Gamma}_q$ ⊂ C and define three special functions on these
sets with values in the unit circle T. We provide detailed proofs of special relations between these functions, discuss their asymptotic behavior and uniqueness properties. In Sec. 3 we define and analyze the commutation relations needed for the construction of new quantum “az + b” groups. This section lays foundations for the study of our quantum groups on C∗-algebra level. Section 4 is devoted to the algebraic consequences of commutation relations studied in Sec. 3. Since we are dealing with unbounded operators, all results are formulated with the use of the affiliation relation for C∗-algebras. Some results of [17] are generalized in such a way that they can be applied to our construction. This is accomplished mainly through unveiling the general mechanisms behind them. The concept of a C∗-algebra generated by unbounded elements as well as by a quantum family of affiliated elements plays an important role in these considerations. In Sec. 5 we define and study the basic object leading to the construction of our quantum “az + b” group, i.e. the multiplicative unitary operator. Using the machinery of operator equalities and special functions developed in Secs. 3 and 2 we show that this multiplicative unitary is modular in the sense of [8]. In the last section we identify the C∗-algebra of continuous functions vanishing at infinity on our deformation of “az + b” group. Then we describe the generators a and b and introduce the quantum group structure. It is all done in accordance with the general procedures known e.g. from [15]. Then using the latest results of Woronowicz [18] we introduce the right Haar measure on our quantum group. Thus we show that the constructed object falls into the category studied in [4, 3]. 2. Three Special Functions and Their Properties The construction of new quantum “az + b” groups on C∗-algebra level in Sec. 6 will extend the framework presented in [17]. It will be governed by a deformation parameter q. 
We shall impose certain conditions on the value of this parameter. The admissible values of the deformation parameter are
\[ q = \exp\bigl(\rho^{-1}\bigr), \]
where $\rho$ is a complex number such that $\operatorname{Re}\rho < 0$ and $\operatorname{Im}\rho = \frac{N}{2\pi}$ with $N$ an even natural number. The number $\rho$ or the pair $(N, \operatorname{Re}\rho)$ can equally be taken to be the fundamental parameters of our theory. The choice of $q$ as the main parameter is motivated by the traditional formulation of commutation relations discussed in Subsecs. 1.1, 1.2 and Sec. 3. The choice of $\rho$ determines a choice of logarithm of $q$: for any $z \in \mathbb{C}$ we set
\[ q^z = \exp\Bigl(\frac{z}{\rho}\Bigr). \]
All our constructions work equally well for negative values of N . The restriction to the case N ∈ 2N helps keep our notation simpler. The case of all admissible values of q (including the ones with N < 0) is treated in detail in [7].
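As a small numerical illustration of this parametrization, the following Python sketch builds $\rho$ from a pair $(N, \operatorname{Re}\rho)$ and evaluates $q^z = \exp(z/\rho)$. The helper names `make_rho` and `q_pow` and the sample values $N = 2$, $\operatorname{Re}\rho = -1$ are assumptions made only for this illustration.

```python
import cmath
import math


def make_rho(N, re_rho):
    """Return rho with Re(rho) = re_rho < 0 and Im(rho) = N / (2*pi), N even and positive."""
    if N <= 0 or N % 2 != 0:
        raise ValueError("N must be an even natural number")
    if re_rho >= 0:
        raise ValueError("Re(rho) must be negative")
    return complex(re_rho, N / (2 * math.pi))


def q_pow(z, rho):
    """The power q**z, computed through the chosen logarithm 1/rho of q."""
    return cmath.exp(z / rho)


rho = make_rho(2, -1.0)      # illustrative values only
q = q_pow(1, rho)

modulus_of_q = abs(q)        # Re(rho) < 0 forces |q| < 1
exponent_law_defect = abs(q_pow(2 + 3j, rho) - q_pow(2, rho) * q_pow(3j, rho))
print(modulus_of_q, exponent_law_defect)
```

The exponent law $q^{z+w} = q^z q^w$ holds by construction, which is the only property of the chosen logarithm used in what follows.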
New Quantum “az + b” Groups
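The special functions discussed in this section will be defined on subsets of $\mathbb{C}$ defined with the use of the parameter $q$. Let $\Gamma_q$ be the multiplicative subgroup of $\mathbb{C}\setminus\{0\}$ generated by $q$ and $\{q^{it} : t \in \mathbb{R}\}$ and let $\overline{\Gamma}_q$ be the closure of $\Gamma_q$ in $\mathbb{C}$.

2.1. The bicharacter χ

The group $\Gamma_q$ with the topology inherited from $\mathbb{C}\setminus\{0\}$ is isomorphic as a locally compact group to $\mathbb{Z}_N \times \mathbb{R}$. It is therefore self-dual. The isomorphism of $\Gamma_q$ to its dual group can be encoded by a non-degenerate bicharacter defined on $\Gamma_q \times \Gamma_q$. We shall choose a particular bicharacter and use it throughout the paper.

Proposition 2.1. There exists a unique continuous function $\chi\colon \Gamma_q \times \Gamma_q \to \mathbb{T}$ such that for all $\gamma, \gamma', \gamma'' \in \Gamma_q$
\[ \chi(\gamma, \gamma') = \chi(\gamma', \gamma), \qquad \chi(\gamma, \gamma')\chi(\gamma, \gamma'') = \chi(\gamma, \gamma'\gamma'') \]
and
\[ \chi(\gamma, q) = \operatorname{Phase}\gamma, \qquad \chi(\gamma, q^{it}) = |\gamma|^{it} \tag{2.1} \]
for all $\gamma \in \Gamma_q$, $t \in \mathbb{R}$. The function $\chi$ is non-degenerate, i.e. $\chi(\gamma, \gamma') = 1$ for all $\gamma \in \Gamma_q$ implies $\gamma' = 1$.

For $\gamma = q^n q^{it}$ and $\gamma' = q^{n'} q^{it'}$ we have
\[ \chi(\gamma, \gamma') = e^{i(nn' - tt')\operatorname{Im}\rho^{-1}}\, e^{i(nt' + n't)\operatorname{Re}\rho^{-1}}. \tag{2.2} \]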
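The defining properties (2.1) can be checked numerically against the explicit formula (2.2). The sketch below is only an illustration: the function names `gamma` and `chi`, the sample parameters $N = 2$, $\operatorname{Re}\rho = -1$ and the test points are assumptions. The last two quantities use the shift $(n, t) \mapsto (n + N, t - N\operatorname{Re}\rho/\operatorname{Im}\rho)$, which leaves $q^n q^{it}$ unchanged.

```python
import cmath
import math

N = 2
rho = complex(-1.0, N / (2 * math.pi))   # illustrative choice of rho
inv = 1 / rho


def gamma(n, t):
    """The element q**n * q**(it) of Gamma_q."""
    return cmath.exp((n + 1j * t) / rho)


def chi(n, t, m, s):
    """Formula (2.2) for chi(q^n q^{it}, q^m q^{is})."""
    return cmath.exp(1j * ((n * m - t * s) * inv.imag + (n * s + m * t) * inv.real))


n, t, s = 3, 0.7, -1.9
g = gamma(n, t)

err_phase = abs(chi(n, t, 1, 0.0) - g / abs(g))                        # chi(gamma, q) = Phase gamma
err_modulus = abs(chi(n, t, 0, s) - cmath.exp(1j * s * math.log(abs(g))))  # chi(gamma, q^{is}) = |gamma|^{is}
err_symmetry = abs(chi(n, t, 2, s) - chi(2, s, n, t))
err_bichar = abs(chi(n, t, 2, s) * chi(n, t, 5, 0.3) - chi(n, t, 7, s + 0.3))

shift = -N * rho.real / rho.imag                                       # second representation of the same gamma
err_same_gamma = abs(gamma(n + N, t + shift) - g)
err_shift = abs(chi(n + N, t + shift, 2, s) - chi(n, t, 2, s))
print(err_phase, err_modulus, err_symmetry, err_bichar, err_same_gamma, err_shift)
```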
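The expression of $\gamma$ as $q^n q^{it}$ is not unique. In fact we have
\[ q^n q^{it} = q^{n+N} q^{\,i\left(t - N\frac{\operatorname{Re}\rho}{\operatorname{Im}\rho}\right)} \]
for all $n \in \mathbb{Z}$ and $t \in \mathbb{R}$. Nevertheless one can easily check that the definition (2.2) does not depend on the way $\gamma$ and $\gamma'$ are expressed in the form $q^n q^{it}$ and $q^{n'} q^{it'}$ respectively.

2.2. The Fresnel function α

Proposition 2.2. There exists a continuous function $\alpha\colon \Gamma_q \to \mathbb{T}$ such that for all $\gamma, \gamma' \in \Gamma_q$
\[ \chi(\gamma, \gamma') = \frac{\alpha(\gamma\gamma')}{\alpha(\gamma)\alpha(\gamma')} \quad\text{and}\quad \alpha(\gamma) = \alpha(\gamma^{-1}). \tag{2.3} \]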
Proof. The function $\alpha$ is unique up to multiplication by a $\mathbb{Z}_2$-valued character of $\Gamma_q$. The formula
\[ \alpha(q^n q^{it}) = \operatorname{Phase}\exp\Bigl(\frac{(n+it)^2}{2\rho}\Bigr) = \exp\Bigl(i\operatorname{Im}\frac{(n+it)^2}{2\rho}\Bigr) \tag{2.4} \]
defines a function with the required properties.

We chose a particular formula for $\alpha$ only to make the exposition more transparent. The important properties are those described in the statement of Proposition 2.2. Formula (2.4) will make it easier to proceed with computations, but whenever we use it only the absolute value of $\alpha$ (which is uniquely determined) will enter our considerations. The pair of functions $(\chi, \alpha)$ is in many aspects analogous to the pair of functions
\[ \mathbb{R} \times \mathbb{R} \ni (p, x) \mapsto e^{ipx} \in \mathbb{T} \quad\text{and}\quad \mathbb{R} \ni x \mapsto e^{i\frac{x^2}{2}} \in \mathbb{T}. \]
The latter enters the formula for the Fresnel integral from which we borrow the name Fresnel function for $\alpha$.

2.3. The quantum exponential function Fq

From the fact that $N$ is even it follows that $-1 \in \Gamma_q$ and consequently $-q^{-2k}$ belongs to $\Gamma_q$ for all $k \in \mathbb{Z}_+$. For $\gamma \in \Gamma_q \setminus \{-q^{-2k} : k \in \mathbb{Z}_+\}$ we put
\[ F_q(\gamma) = \prod_{k=0}^{\infty} \frac{1 + \overline{q^{2k}\gamma}}{1 + q^{2k}\gamma}. \]
The infinite product is convergent since $|q| < 1$. We have the following simple proposition:

Proposition 2.3. The function $F_q$ extends to a continuous function $\overline{\Gamma}_q \to \mathbb{T}$. With this extension we have $F_q(0) = 1$ and
\[ (1 + \gamma)F_q(\gamma) = (1 + \bar{\gamma})F_q(q^2\gamma) \tag{2.5} \]
for all $\gamma \in \overline{\Gamma}_q$.

Remark 2.4. Dividing both sides of (2.5) by $(1 + \gamma)$ and calculating the limit $\gamma \to -1$, one finds that
\[ F_q(-1) = \frac{-\rho}{\bar{\rho}}\, F_q(-q^2). \]
This formula will prove useful.

The function $F_q$ was first introduced in [12] with a real deformation parameter $q$. Its remarkable properties have been very useful in developing examples of quantum groups (cf. e.g. [13]). The name quantum exponential function is taken from [16] and will be justified in Subsec. 3.3.

For any function $f$ defined on $\overline{\Gamma}_q$ and a fixed $\gamma \in \overline{\Gamma}_q$ we can consider a function $f[\gamma]$ of a real variable given by
\[ f[\gamma](t) = f(q^{it}\gamma). \]
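The basic properties of $F_q$ stated above are easy to probe numerically. The following sketch is an illustration under stated assumptions: it truncates the infinite product after 200 factors, uses the sample parameters $N = 2$, $\operatorname{Re}\rho = -1$, and the names `q_pow` and `F` are hypothetical helpers. It checks that $F_q$ has modulus one, that relation (2.5) holds, and that the value near $-1$ along the spiral $-q^{it}$ approaches $-\frac{\rho}{\bar\rho}F_q(-q^2)$ as in Remark 2.4.

```python
import cmath
import math

N = 2
rho = complex(-1.0, N / (2 * math.pi))   # illustrative choice of rho


def q_pow(z):
    """q**z through the logarithm 1/rho of q."""
    return cmath.exp(z / rho)


def F(gamma, terms=200):
    """Truncated product of conj(1 + q^{2k} gamma) / (1 + q^{2k} gamma), k = 0..terms-1."""
    value = 1.0 + 0.0j
    for k in range(terms):
        w = 1 + q_pow(2 * k) * gamma
        value *= w.conjugate() / w
    return value


g = q_pow(3 + 0.7j)                                   # sample point q^3 q^{0.7 i} of Gamma_q
err_unit = abs(abs(F(g)) - 1)                         # values lie on the unit circle
err_25 = abs((1 + g) * F(g) - (1 + g.conjugate()) * F(q_pow(2) * g))   # relation (2.5)

eps = 1e-6                                            # approach -1 along the spiral -q^{it}
near_minus_one = -q_pow(1j * eps)
err_remark = abs(F(near_minus_one) - (-rho / rho.conjugate()) * F(-q_pow(2)))
print(err_unit, err_25, err_remark)
```

The last defect is of order `eps` rather than rounding level, because the value at $-1$ is defined only as a limit.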
Throughout the paper much attention will be given to analytic continuation of such functions.

Theorem 2.5.
(1) For any $\gamma \in \overline{\Gamma}_q$ the function $F_q[\gamma]$ has a holomorphic continuation to the lower half plane.
(2) We have the following formula
\[ F_q[q\gamma](-i) = (1 + \gamma)F_q(\gamma). \tag{2.6} \]
(3) For any $\gamma \in \Gamma_q$, $\varepsilon > 0$ and $M > 0$ there exists an $R > 0$ such that for all $\tau \in \mathbb{R}$ with $|\tau| < M$ and all $t > R$
\[ \bigl|F_q[\gamma](-t - i\tau) - 1\bigr| < \varepsilon. \]

Proof. Ad. (1) Let $\gamma = q^n q^{ir}$. Denote
\[ A(t) = \prod_{k=0}^{\infty}\Bigl(1 + \exp\frac{2k + n - i(t + r)}{\bar{\rho}}\Bigr), \qquad B(t) = \prod_{k=0}^{\infty}\Bigl(1 + \exp\frac{2k + n + i(t + r)}{\rho}\Bigr), \]
with $t \in \mathbb{C}$. We can now write $F_q[\gamma](t)$ as a quotient
\[ F_q[\gamma](t) = \frac{A(t)}{B(t)}. \tag{2.7} \]
The functions $A$ and $B$ extend holomorphically to the whole complex plane. All zeros of both these functions are simple. Zeros of $A$ form the set
\[ (2\mathbb{Z} + 1)\pi\bar{\rho} - i(2\mathbb{Z}_+ + n) - r, \tag{2.8} \]
while zeros of $B$ form the set
\[ (2\mathbb{Z} + 1)\pi\rho + i(2\mathbb{Z}_+ + n) - r. \tag{2.9} \]
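Let $z = (2p + 1)\pi\rho + i(2l + n) - r$ be a zero of $B$ such that $\operatorname{Im} z \le 0$. Notice that
\[ \rho = \bar{\rho} + \rho - \bar{\rho} = \bar{\rho} + 2i\operatorname{Im}\rho = \bar{\rho} + i\frac{N}{\pi}. \]
Therefore
\begin{align*}
z &= (2p + 1)\pi\rho + i(2l + n) - r \\
  &= (2p + 1)\pi\bar{\rho} + i(2p + 1)N + i(2l + n) - r \\
  &= (2p + 1)\pi\bar{\rho} + i\bigl((2p + 1)N + 2l + n\bigr) - r \\
  &= (2p + 1)\pi\bar{\rho} + i(2pN + N + 2l + 2n - n) - r \\
  &= (2p + 1)\pi\bar{\rho} + i\Bigl(2\Bigl(pN + \frac{N}{2} + l + n\Bigr) - n\Bigr) - r \\
  &= (2p + 1)\pi\bar{\rho} - i\Bigl(2\Bigl(-pN - \frac{N}{2} - l - n\Bigr) + n\Bigr) - r.
\end{align*}
The number $-\bigl(pN + \frac{N}{2} + l + n\bigr)$ is a positive integer, since
\begin{align*}
0 \ge \operatorname{Im} z &= (2p + 1)\pi\operatorname{Im}\rho + 2l + n \\
  &= (2p + 1)\pi\frac{N}{2\pi} + 2l + n \\
  &= pN + \frac{N}{2} + 2l + n \\
  &= \Bigl(pN + \frac{N}{2} + l + n\Bigr) + l
\end{align*}
shows that
\[ -\Bigl(pN + \frac{N}{2} + l + n\Bigr) \ge l \in \mathbb{Z}_+. \]
This means that $z$ is a zero of the function $A$. In particular all singularities of $F_q[\gamma]$ in the lower half plane are removable. Consequently $F_q[\gamma]$ extends holomorphically to the lower half plane.

Ad. (2) To prove formula (2.6) denote
\[ \tilde{A}(t) = \prod_{k=0}^{\infty}\Bigl(1 + \exp\frac{2k + n + 1 - i(t + r)}{\bar{\rho}}\Bigr), \qquad \tilde{B}(t) = \prod_{k=0}^{\infty}\Bigl(1 + \exp\frac{2k + n + 1 + i(t + r)}{\rho}\Bigr). \]
Then
\[ F_q[q\gamma](t) = \frac{\tilde{A}(t)}{\tilde{B}(t)}. \]
Moreover
\begin{align*}
\tilde{A}(-i) &= \prod_{k=0}^{\infty}\Bigl(1 + \exp\frac{2k + n - ir}{\bar{\rho}}\Bigr) = \prod_{k=0}^{\infty}\bigl(1 + \overline{q^{2k}\gamma}\bigr), \\
\tilde{B}(-i) &= \prod_{k=0}^{\infty}\Bigl(1 + \exp\frac{2(k + 1) + n + ir}{\rho}\Bigr) = \prod_{k=1}^{\infty}\bigl(1 + q^{2k}\gamma\bigr),
\end{align*}
which shows that
\[ F_q[q\gamma](-i) = \frac{\prod_{k=0}^{\infty}\bigl(1 + \overline{q^{2k}\gamma}\bigr)}{\prod_{k=1}^{\infty}\bigl(1 + q^{2k}\gamma\bigr)} = (1 + \gamma)\prod_{k=0}^{\infty}\frac{1 + \overline{q^{2k}\gamma}}{1 + q^{2k}\gamma} = (1 + \gamma)F_q(\gamma). \]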
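The quotient (2.7) also gives a direct way to evaluate the continuation of $F_q[\gamma]$ at complex arguments. The sketch below is a numerical illustration, not part of the proof: products are truncated after 200 factors, the parameters $N = 2$, $\operatorname{Re}\rho = -1$ and the point $\gamma = q\,q^{0.4i}$ are arbitrary choices, and `F` and `F_cont` are hypothetical helper names. It compares $A/B$ with $F_q(q^{it}\gamma)$ for real $t$ and then checks formula (2.6).

```python
import cmath
import math

N = 2
rho = complex(-1.0, N / (2 * math.pi))   # illustrative choice of rho
TERMS = 200                              # truncation of the infinite products


def F(gamma):
    """Truncated product defining F_q on Gamma_q."""
    value = 1.0 + 0.0j
    for k in range(TERMS):
        w = 1 + cmath.exp(2 * k / rho) * gamma
        value *= w.conjugate() / w
    return value


def F_cont(n, r, t):
    """Quotient A(t)/B(t) of (2.7) for gamma = q^n q^{ir}; t may be complex."""
    a = 1.0 + 0.0j
    b = 1.0 + 0.0j
    for k in range(TERMS):
        a *= 1 + cmath.exp((2 * k + n - 1j * (t + r)) / rho.conjugate())
        b *= 1 + cmath.exp((2 * k + n + 1j * (t + r)) / rho)
    return a / b


n, r = 1, 0.4
gamma = cmath.exp((n + 1j * r) / rho)

# On the real axis the quotient reproduces F_q(q^{it} gamma).
err_real_axis = abs(F_cont(n, r, 0.3) - F(cmath.exp(0.3j / rho) * gamma))
# Formula (2.6): the continuation of F_q[q gamma] evaluated at -i.
err_26 = abs(F_cont(n + 1, r, -1j) - (1 + gamma) * F(gamma))
print(err_real_axis, err_26)
```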
Ad. (3) The function
\[ w \mapsto \frac{\prod_{k=0}^{\infty}\bigl(1 + \overline{q^{2k}w}\bigr)}{\prod_{k=0}^{\infty}\bigl(1 + q^{2k}w\bigr)} = \Bigl(\operatorname{Phase}\prod_{k=0}^{\infty}\bigl(1 + q^{2k}w\bigr)\Bigr)^{-2} \]
is continuous in a neighborhood of 0 and its value converges to 1 as $w$ tends to 0. Now if $w = \gamma q^{i(t + i\tau)}$ with $|\tau|$ bounded by $M$ then $w$ tends to 0 as $t$ goes to $-\infty$ and the result follows.

Remark 2.6. The function $F_q$ can be defined for any non-zero $q$ of absolute value strictly less than 1. Then for general $\gamma \in \Gamma_q$ the function $F_q[\gamma]$ has a meromorphic continuation to the lower half plane. This continuation is holomorphic if and only if the imaginary part of the inverse of the logarithm of $q$ is $\frac{N}{2\pi}$ with $N \in 2\mathbb{Z}\setminus\{0\}$ [7, Twierdzenie 5.11.1].

2.4. The product formula

We shall devote this subsection to proving the following theorem:

Theorem 2.7. For any $\gamma \in \Gamma_q$ we have
\[ F_q(\gamma)F_q(q^2\gamma^{-1}) = C_q\,\alpha(q^{-1}\gamma), \tag{2.10} \]
where
\[ C_q = e^{-\frac{i}{2}\operatorname{Im}\rho^{-1}} F_q(1)^2. \]

We shall examine the function
\[ \Gamma_q \ni \gamma \mapsto F_q(\gamma)F_q(q^2\gamma^{-1}). \tag{2.11} \]
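Before following the proof, the statement (2.10) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the product defining $F_q$ is truncated after 200 factors, $\alpha$ is taken from formula (2.4), the parameters $N = 2$, $\operatorname{Re}\rho = -1$ and the sample points $(n, t)$ are arbitrary, and all function names are hypothetical.

```python
import cmath
import math

N = 2
rho = complex(-1.0, N / (2 * math.pi))   # illustrative choice of rho


def q_pow(z):
    """q**z through the logarithm 1/rho of q."""
    return cmath.exp(z / rho)


def F(gamma, terms=200):
    """Truncated product defining F_q."""
    value = 1.0 + 0.0j
    for k in range(terms):
        w = 1 + q_pow(2 * k) * gamma
        value *= w.conjugate() / w
    return value


def alpha(n, t):
    """Formula (2.4) for alpha(q^n q^{it})."""
    return cmath.exp(1j * ((n + 1j * t) ** 2 / (2 * rho)).imag)


C_q = cmath.exp(-0.5j * (1 / rho).imag) * F(1) ** 2


def product_formula_defect(n, t):
    """|F_q(gamma) F_q(q^2 / gamma) - C_q alpha(q^{-1} gamma)| for gamma = q^n q^{it}."""
    gamma = q_pow(n + 1j * t)
    lhs = F(gamma) * F(q_pow(2) / gamma)
    rhs = C_q * alpha(n - 1, t)          # q^{-1} gamma = q^{n-1} q^{it}
    return abs(lhs - rhs)


defects = [product_formula_defect(n, t) for n, t in [(1, 0.0), (3, 0.7), (-2, 1.3), (0, -2.1)]]
print(defects)
```

The sample points avoid the excluded set $\{-q^{-2k}\}$, where individual factors of the product have poles.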
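Writing $\gamma = q^n q^{it}$ we shall treat the right-hand side of (2.11) as a meromorphic function of $t \in \mathbb{C}$. It can be rewritten as
\[ \mathbb{C} \ni t \mapsto \prod_{k=0}^{\infty}\frac{1 + \bar{q}^{2k}\bar{q}^{\,n - it}}{1 + q^{2k}q^{n + it}}\;\prod_{k=1}^{\infty}\frac{1 + \bar{q}^{2k}\bar{q}^{\,-n + it}}{1 + q^{2k}q^{-n - it}} = \frac{\overline{\varphi_n(\bar{t})}}{\varphi_n(t)}, \tag{2.12} \]
where
\[ \varphi_n(t) = \prod_{k=0}^{\infty}\bigl(1 + q^{2k}q^{n + it}\bigr)\prod_{k=1}^{\infty}\bigl(1 + q^{2k}q^{-n - it}\bigr). \]
The zeros of the entire function $\varphi_n$ are all simple and constitute the set
\[ \Lambda = 2\pi\rho\Bigl(\mathbb{Z} + \frac{1}{2}\Bigr) + i(2\mathbb{Z} + n). \]
It is easy to see that thanks to the form of the imaginary part of $\rho$ the set $\Lambda$ satisfies $\overline{\Lambda} = \Lambda$. In particular the function
\[ \mathbb{C} \ni t \mapsto \frac{\overline{\varphi_n(\bar{t})}}{\varphi_n(t)} \]
is entire.

Remark 2.8. In fact we have
\[ \bigl(\overline{\Lambda} = \Lambda\bigr) \iff \Bigl(\operatorname{Im}\rho = \frac{N}{2\pi}, \text{ with } N \in 2\mathbb{Z}\Bigr) \]
(cf. [7, Lemat A.1]).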
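By the Weierstrass theorem we have the following expression for $\varphi_n$:
\[ \varphi_n(t) = G_n(t)\prod_{\lambda \in \Lambda}\Bigl(1 - \frac{t}{\lambda}\Bigr)e^{\frac{t}{\lambda} + \frac{t^2}{2\lambda^2}}, \tag{2.13} \]
where the infinite product is absolutely convergent and $G_n$ is a nowhere vanishing entire function. On the other hand it is easy to check that
\begin{align*}
\varphi_n(t) &= \prod_{k \in \mathbb{Z}}\Bigl(1 + \exp\frac{2k + n + it}{\rho}\Bigr) \times
\begin{cases} 1 & \text{for } k \ge 0 \\ \exp\bigl(-\frac{2k + n + it}{\rho}\bigr) & \text{for } k < 0 \end{cases} \\
&= \prod_{k \in \mathbb{Z}} 2\cosh\Bigl(\frac{2k + n + it}{2\rho}\Bigr)\exp\Bigl(\frac{2k + n + it}{2\rho}\Bigr) \times
\begin{cases} 1 & \text{for } k \ge 0 \\ \exp\bigl(-\frac{2k + n + it}{\rho}\bigr) & \text{for } k < 0. \end{cases}
\end{align*}
So introducing the function
\[ s(k) = \begin{cases} 1 & \text{for } k \ge 0 \\ -1 & \text{for } k < 0 \end{cases} \]
we obtain
\begin{align*}
\varphi_n(t) &= \prod_{k \in \mathbb{Z}}\exp\Bigl(s(k)\frac{2k + n + it}{2\rho}\Bigr)\, 2\cosh\Bigl(\frac{2k + n + it}{2\rho}\Bigr) \\
&= \prod_{k \in \mathbb{Z}} 2\exp\Bigl(s(k)\frac{2k + n + it}{2\rho}\Bigr)\prod_{p \in \mathbb{Z}}\Bigl(1 - \frac{2k + n + it}{2\pi i\rho(p + \frac{1}{2})}\Bigr)\exp\Bigl(\frac{2k + n + it}{2\pi i\rho(p + \frac{1}{2})}\Bigr).
\end{align*}
This can be rewritten as
\begin{align*}
\varphi_n(t) = \prod_{k \in \mathbb{Z}} 2\exp\Bigl(s(k)\frac{2k + n + it}{2\rho}\Bigr)\prod_{\lambda \in \Lambda_k} &\Bigl(1 - \frac{t}{\lambda}\Bigr)\Bigl(1 + \frac{i(2k + n)}{\lambda - i(2k + n)}\Bigr)\exp\Bigl(-\frac{i(2k + n)}{\lambda - i(2k + n)}\Bigr) \\
&\times \exp\Bigl(\frac{t}{\lambda - i(2k + n)}\Bigr), \tag{2.14}
\end{align*}
where
\[ \Lambda_k = 2\pi\rho\Bigl(\mathbb{Z} + \frac{1}{2}\Bigr) + i(2k + n). \]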
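Of course we have $\Lambda = \bigcup_{k \in \mathbb{Z}}\Lambda_k$. Taking into account (2.13) and (2.14) we obtain
\begin{align*}
G_n(t) = {} & \prod_{k \in \mathbb{Z}} 2\exp\Bigl(s(k)\frac{2k + n + it}{2\rho}\Bigr)\prod_{\lambda \in \Lambda_k}\Bigl(1 + \frac{i(2k + n)}{\lambda - i(2k + n)}\Bigr)\exp\Bigl(-\frac{i(2k + n)}{\lambda - i(2k + n)}\Bigr) \\
& \times \prod_{k \in \mathbb{Z}}\prod_{\lambda \in \Lambda_k}\exp\Bigl(\frac{it(2k + n)}{\lambda(\lambda - i(2k + n))} - \frac{t^2}{2\lambda^2}\Bigr).
\end{align*}
For computational reasons it will be more convenient to work with
\begin{align*}
\frac{G_n(t)}{G_n(0)} &= \prod_{k \in \mathbb{Z}}\exp\Bigl(s(k)\frac{it}{2\rho}\Bigr)\prod_{\lambda \in \Lambda_k}\exp\Bigl(-\frac{t^2}{2\lambda^2}\Bigr)\prod_{\lambda \in \Lambda_k}\exp\Bigl(\frac{it(2k + n)}{\lambda(\lambda - i(2k + n))}\Bigr) \\
&= \prod_{k \in \mathbb{Z}}\exp\Bigl(s(k)\frac{it}{2\rho}\Bigr)\exp\Bigl(-\frac{t^2}{2}\sum_{\lambda \in \Lambda_k}\frac{1}{\lambda^2}\Bigr)\exp\Bigl(i(2k + n)t\sum_{\lambda \in \Lambda_k}\frac{1}{\lambda(\lambda - i(2k + n))}\Bigr),
\end{align*}
especially since from (2.13) we immediately get an expression for the constant $G_n(0)$:
\[ G_n(0) = \prod_{k=0}^{\infty}\bigl(1 + q^{2k + n}\bigr)\prod_{k=1}^{\infty}\bigl(1 + q^{2k - n}\bigr). \tag{2.15} \]
Moreover using standard methods of complex analysis one can compute the sums
\begin{align*}
\sum_{\lambda \in \Lambda_k}\frac{1}{\lambda(\lambda - i(2k + n))} &= \frac{1}{2k + n}\,\frac{1}{2\rho}\tanh\Bigl(\frac{2k + n}{2\rho}\Bigr), \\
\sum_{\lambda \in \Lambda_k}\frac{1}{\lambda^2} &= \Bigl(\frac{1}{2\rho}\Bigr)^2\frac{1}{\cosh^2\bigl(\frac{2k + n}{2\rho}\bigr)},
\end{align*}
which show that
\[ \frac{G_n(t)}{G_n(0)} = \prod_{k \in \mathbb{Z}}\exp\Bigl(\frac{it}{2\rho}\Bigl(s(k) + \tanh\frac{2k + n}{2\rho}\Bigr)\Bigr) \times \prod_{k \in \mathbb{Z}}\exp\Bigl(-\frac{t^2}{2}\Bigl(\frac{1}{2\rho}\Bigr)^2\frac{1}{\cosh^2\bigl(\frac{2k + n}{2\rho}\bigr)}\Bigr). \]

Lemma 2.9. We have
\[ \sum_{k \in \mathbb{Z}}\Bigl(s(k) + \tanh\frac{2k + n}{2\rho}\Bigr) = -(n - 1). \]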
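Proof. First of all notice that since $\operatorname{Re}\rho < 0$ the series in question is absolutely convergent. To compute its sum we shall use the following, obvious, formulas:
\[ \sum_{k \in \mathbb{Z}}\Bigl(s(k) + \tanh\frac{k}{\rho}\Bigr) = 1, \tag{2.16} \]
\[ \sum_{k \in \mathbb{Z} + \frac{1}{2}}\Bigl(s\bigl(k - \tfrac{1}{2}\bigr) + \tanh\frac{k}{\rho}\Bigr) = 0. \tag{2.17} \]
Let us deal with the cases of even and odd $n$ separately. First let $n = 2l$. Then
\[ \sum_{k \in \mathbb{Z}}\Bigl(s(k) + \tanh\frac{2k + n}{2\rho}\Bigr) = \sum_{k \in \mathbb{Z}}\Bigl(s(k - l) + \tanh\frac{k}{\rho}\Bigr). \tag{2.18} \]
Subtracting the left-hand side of (2.16) from the right-hand side of (2.18) we obtain
\begin{align*}
\sum_{k \in \mathbb{Z}}\Bigl(s(k) + \tanh\frac{2k + n}{2\rho}\Bigr) &= \sum_{k \in \mathbb{Z}}\Bigl(s(k - l) + \tanh\frac{k}{\rho}\Bigr) - \sum_{k \in \mathbb{Z}}\Bigl(s(k) + \tanh\frac{k}{\rho}\Bigr) + 1 \\
&= \sum_{k \in \mathbb{Z}}\bigl(s(k - l) - s(k)\bigr) + 1 = -2l + 1 \\
&= -(2l - 1) = -(n - 1).
\end{align*}
Assume now that $n = 2l + 1$. Then
\[ \sum_{k \in \mathbb{Z}}\Bigl(s(k) + \tanh\frac{2k + n}{2\rho}\Bigr) = \sum_{k \in \mathbb{Z} + \frac{1}{2}}\Bigl(s\bigl(k - l - \tfrac{1}{2}\bigr) + \tanh\frac{k}{\rho}\Bigr). \tag{2.19} \]
As before we subtract the left-hand side of (2.17) from the right-hand side of (2.19) and arrive at
\begin{align*}
\sum_{k \in \mathbb{Z}}\Bigl(s(k) + \tanh\frac{2k + n}{2\rho}\Bigr) &= \sum_{k \in \mathbb{Z} + \frac{1}{2}}\Bigl(s\bigl(k - l - \tfrac{1}{2}\bigr) + \tanh\frac{k}{\rho}\Bigr) - \sum_{k \in \mathbb{Z} + \frac{1}{2}}\Bigl(s\bigl(k - \tfrac{1}{2}\bigr) + \tanh\frac{k}{\rho}\Bigr) \\
&= \sum_{k \in \mathbb{Z} + \frac{1}{2}}\Bigl(s\bigl(k - l - \tfrac{1}{2}\bigr) - s\bigl(k - \tfrac{1}{2}\bigr)\Bigr) \\
&= \sum_{k \in \mathbb{Z}}\bigl(s(k - l) - s(k)\bigr) = -2l = -(n - 1),
\end{align*}
which ends the proof of Lemma 2.9.

Let us introduce the notation
\[ \Theta_n = \sum_{k \in \mathbb{Z}}\frac{1}{\cosh^2\bigl(\frac{2k + n}{2\rho}\bigr)}. \]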
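Lemma 2.9 lends itself to a direct numerical check, since the terms of the series decay exponentially in $|k|$. The sketch below is an illustration only: the parameters $N = 2$, $\operatorname{Re}\rho = -1$, the cutoff $|k| \le 60$ and the helper names `s` and `lemma_sum` are assumptions made for this example.

```python
import cmath
import math

N = 2
rho = complex(-1.0, N / (2 * math.pi))   # illustrative choice of rho


def s(k):
    """The sign-like function s(k): 1 for k >= 0 and -1 for k < 0."""
    return 1 if k >= 0 else -1


def lemma_sum(n, cutoff=60):
    """Symmetric partial sum of s(k) + tanh((2k + n) / (2 rho)) over |k| <= cutoff."""
    return sum(s(k) + cmath.tanh((2 * k + n) / (2 * rho)) for k in range(-cutoff, cutoff + 1))


# Distance between the partial sum and the value -(n - 1) claimed in Lemma 2.9.
defects = {n: abs(lemma_sum(n) + (n - 1)) for n in (-3, 0, 1, 2, 5)}
print(defects)
```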
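We can summarize the analysis we have done so far in the following way:
\[ \frac{G_n(t)}{G_n(0)} = \exp\Bigl(-(n - 1)\frac{it}{2\rho}\Bigr)\exp\Bigl(-\frac{t^2}{2}\Bigl(\frac{1}{2\rho}\Bigr)^2\Theta_n\Bigr). \tag{2.20} \]
Now let us return to the study of the function (2.11). Using (2.12) and (2.13) we obtain
\[ F_q(q^n q^{it})F_q(q^{2 - n}q^{-it}) = \frac{\overline{\varphi_n(\bar{t})}}{\varphi_n(t)} = \frac{\overline{G_n(\bar{t})}\,\prod_{\lambda \in \overline{\Lambda}}\bigl(1 - \frac{t}{\lambda}\bigr)\exp\bigl(\frac{t}{\lambda} + \frac{t^2}{2\lambda^2}\bigr)}{G_n(t)\,\prod_{\lambda \in \Lambda}\bigl(1 - \frac{t}{\lambda}\bigr)\exp\bigl(\frac{t}{\lambda} + \frac{t^2}{2\lambda^2}\bigr)} = \frac{\overline{G_n(\bar{t})}}{G_n(t)}. \]
By (2.20) this means that
\[ F_q(q^n q^{it})F_q(q^{2 - n}q^{-it}) = (\operatorname{Phase} G_n(0))^{-2}\exp\Bigl(i(n - 1)\operatorname{Re}\frac{1}{\rho}\,t + i\operatorname{Im}\frac{\Theta_n}{(2\rho)^2}\,t^2\Bigr). \tag{2.21} \]
In particular the left-hand side of (2.21) extends to an entire function of $t$. Let us denote this extension by $t \mapsto \Phi(n, t)$.

Lemma 2.10. For any $t \in \mathbb{R}$ we have
\[ \Phi(n + 1, t - i) = q^n q^{it}\,\Phi(n, t). \tag{2.22} \]

Proof. Just as in the proof of Theorem 2.5(2) one can show that for $\gamma' \in \overline{\Gamma}_q$
\[ F_q[q\gamma'](i) = (1 + \gamma')^{-1}F_q(q^2\gamma'). \tag{2.23} \]
With $\gamma = q^n q^{it}$ we have $\Phi(n, t) = F_q(\gamma)F_q(q^2\gamma^{-1})$ and
\[ \Phi(n + 1, t + s) = F_q(q\gamma q^{is})F_q(q\gamma^{-1}q^{-is}) = F_q[q\gamma](s)\,F_q[q\gamma^{-1}](-s). \]
Now using (2.6) and (2.23) we get
\[ \Phi(n + 1, t - i) = F_q[q\gamma](-i)\,F_q[q\gamma^{-1}](i) = \frac{1 + \gamma}{1 + \gamma^{-1}}\,F_q(\gamma)F_q(q^2\gamma^{-1}) = \gamma\,\Phi(n, t) \]
for $\gamma \ne -1$. For $\gamma = -1$ we use the continuity of both sides with respect to $\gamma$.

Lemma 2.10 allows us to determine the constants appearing in (2.21) with simple recurrence formulas. From now on let $t$ be a real number. By (2.21) the right-hand side of (2.22) is equal to
\begin{align*}
\mathrm{RHS} &= q^n q^{it}(\operatorname{Phase} G_n(0))^{-2}\exp\Bigl(i(n - 1)\operatorname{Re}\frac{1}{\rho}\,t + i\operatorname{Im}\frac{\Theta_n}{(2\rho)^2}\,t^2\Bigr) \\
&= (\operatorname{Phase} G_n(0))^{-2}\exp\Bigl(n\operatorname{Re}\frac{1}{\rho} + in\operatorname{Im}\frac{1}{\rho} - \operatorname{Im}\frac{1}{\rho}\,t + i\operatorname{Re}\frac{1}{\rho}\,t\Bigr) \\
&\quad \times \exp\Bigl(i(n - 1)\operatorname{Re}\frac{1}{\rho}\,t + i\operatorname{Im}\frac{\Theta_n}{(2\rho)^2}\,t^2\Bigr),
\end{align*}
while the left-hand side equals
\begin{align*}
\mathrm{LHS} &= (\operatorname{Phase} G_{n+1}(0))^{-2}\exp\Bigl(in\operatorname{Re}\frac{1}{\rho}\,(t - i) + i\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2}\,(t - i)^2\Bigr) \\
&= (\operatorname{Phase} G_{n+1}(0))^{-2}\exp\Bigl(n\operatorname{Re}\frac{1}{\rho} + in\operatorname{Re}\frac{1}{\rho}\,t - i\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2}\Bigr) \\
&\quad \times \exp\Bigl(2\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2}\,t + i\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2}\,t^2\Bigr).
\end{align*}
Comparing these expressions gives
\begin{align*}
& (\operatorname{Phase} G_n(0))^{-2}\exp\Bigl(in\operatorname{Im}\frac{1}{\rho} - \operatorname{Im}\frac{1}{\rho}\,t + i\operatorname{Im}\frac{\Theta_n}{(2\rho)^2}\,t^2\Bigr) \\
&\quad = (\operatorname{Phase} G_{n+1}(0))^{-2}\exp\Bigl(-i\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2} + 2\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2}\,t + i\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2}\,t^2\Bigr)
\end{align*}
for all $t \in \mathbb{R}$. As first and second derivatives of LHS and RHS at $t = 0$ are equal, we obtain
\[ \operatorname{Im}\frac{\Theta_n}{(2\rho)^2} = \operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2} = -\frac{1}{2}\operatorname{Im}\frac{1}{\rho}. \tag{2.24} \]
At the same time the equality of values of LHS and RHS at $t = 0$ shows that
\[ (\operatorname{Phase} G_n(0))^{-2}\exp\Bigl(in\operatorname{Im}\frac{1}{\rho}\Bigr) = (\operatorname{Phase} G_{n+1}(0))^{-2}\exp\Bigl(-i\operatorname{Im}\frac{\Theta_{n+1}}{(2\rho)^2}\Bigr). \tag{2.25} \]
Using (2.24) and (2.25) we find the recurrence relation
\[ (\operatorname{Phase} G_{n+1}(0))^{-2} = \exp\Bigl(i\Bigl(n - \frac{1}{2}\Bigr)\operatorname{Im}\frac{1}{\rho}\Bigr)(\operatorname{Phase} G_n(0))^{-2}, \]
which we turn into
\[ (\operatorname{Phase} G_n(0))^{-2} = \exp\Bigl(-\frac{i}{2}\operatorname{Im}\frac{1}{\rho}\Bigr)\exp\Bigl(i\frac{(n - 1)^2}{2}\operatorname{Im}\frac{1}{\rho}\Bigr)(\operatorname{Phase} G_0(0))^{-2}. \tag{2.26} \]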
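Identity (2.24) is obtained above indirectly, by comparing derivatives, so an independent numerical check is reassuring. The sketch below is an illustration under stated assumptions: the series for $\Theta_n$ is truncated at $|k| \le 60$, the parameters are $N = 2$, $\operatorname{Re}\rho = -1$, and `theta` is a hypothetical helper name.

```python
import cmath
import math

N = 2
rho = complex(-1.0, N / (2 * math.pi))   # illustrative choice of rho


def theta(n, cutoff=60):
    """Partial sum of 1 / cosh((2k + n) / (2 rho))**2 over |k| <= cutoff."""
    return sum(1 / cmath.cosh((2 * k + n) / (2 * rho)) ** 2 for k in range(-cutoff, cutoff + 1))


target = -0.5 * (1 / rho).imag           # right-hand side of (2.24)
defects = [abs((theta(n) / (2 * rho) ** 2).imag - target) for n in range(4)]
print(target, defects)
```

Only the imaginary part is constrained by (2.24); the real part of $\Theta_n/(2\rho)^2$ does not enter the argument.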
Now inserting (2.24) and (2.26) into (2.21) we get
\[ F_q(q^n q^{it})F_q(q^{2 - n}q^{-it}) = \exp\Bigl(-\frac{i}{2}\operatorname{Im}\frac{1}{\rho}\Bigr)(\operatorname{Phase} G_0(0))^{-2}\exp\Bigl(\frac{i}{2}\bigl((n - 1)^2 - t^2\bigr)\operatorname{Im}\frac{1}{\rho} + i(n - 1)t\operatorname{Re}\frac{1}{\rho}\Bigr), \]
which in view of (2.4) means that
\[ F_q(\gamma)F_q(q^2\gamma^{-1}) = \exp\Bigl(-\frac{i}{2}\operatorname{Im}\frac{1}{\rho}\Bigr)(\operatorname{Phase} G_0(0))^{-2}\,\alpha(q^{-1}\gamma), \]
where $\gamma = q^n q^{it}$. Finally Eq. (2.15) together with the definition of $F_q$ gives the expression for the constant $C_q$ in (2.10). This concludes the proof of Theorem 2.7.

The name product formula for Eq. (2.10) is self-explanatory. It is an interesting fact that for other values of the deformation parameter $q$ than those considered in this paper, the proof of the analogous formula simplifies considerably (cf. [12, Sec. 1] and [17, Sec. 1]).

2.5. Asymptotic behavior of Fq

Proposition 2.11. The function $F_q$ has the following asymptotic behavior: for any $\gamma \in \overline{\Gamma}_q$ and $t, \tau \in \mathbb{R}$ we have
\[ \bigl|F_q[\gamma](t - i\tau)\bigr| = \Xi_\gamma(t - i\tau)\,\bigl|q^{-1}q^{it}\gamma\bigr|^{\tau}, \]
where
\[ \lim_{t \to \infty}\Xi_\gamma(t - i\tau) = 1 \]
for any $\tau \in \mathbb{R}$.

Proof. The mapping
\[ t \mapsto F_q[\gamma](t)\,F_q[q^2\gamma^{-1}](-t) = F_q(q^{it}\gamma)F_q\bigl(q^2(q^{it}\gamma)^{-1}\bigr) = C_q\,\alpha(q^{-1}q^{it}\gamma) \]
extends to an entire function on $\mathbb{C}$. Theorem 2.5(3) says that with bounded $\tau$ we have
\[ F_q[q^2\gamma^{-1}](-t - i\tau) \xrightarrow[t \to \infty]{} 1. \]
Denoting the reciprocal of the absolute value of this function by $\Xi_\gamma$ we obtain
\[ \bigl|F_q[\gamma](t - i\tau)\bigr|\,\Xi_\gamma(t - i\tau)^{-1} = \bigl|\alpha[q^{-1}\gamma](t - i\tau)\bigr|. \]
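It remains to determine the asymptotic behavior of the analytic continuation of $\alpha$. With $\gamma = q^m q^{is}$ we have (cf. (2.4))
\begin{align*}
\bigl|\alpha[q^{-1}\gamma](t - i\tau)\bigr| &= \Bigl|e^{\frac{i}{2}\left((m - 1)^2 - (s + t - i\tau)^2\right)\operatorname{Im}\rho^{-1}}\,e^{i(m - 1)(s + t - i\tau)\operatorname{Re}\rho^{-1}}\Bigr| \\
&= e^{-(s + t)\tau\operatorname{Im}\rho^{-1}}\,e^{\tau(m - 1)\operatorname{Re}\rho^{-1}} \\
&= \bigl|q^{-1}q^m q^{is}q^{it}\bigr|^{\tau} = \bigl|q^{-1}\gamma q^{it}\bigr|^{\tau}.
\end{align*}

2.6. Uniqueness of Fq

Proposition 2.12. Let $\Phi$ be a continuous function on $\overline{\Gamma}_q$ such that
(1) $\Phi(0) = 1$;
(2) for any $\gamma \in \overline{\Gamma}_q$ the function $\Phi[\gamma]\colon \mathbb{R} \ni t \mapsto \Phi(q^{it}\gamma) \in \mathbb{C}$ has a holomorphic extension to the lower half plane;
(3) for any $\gamma \in \overline{\Gamma}_q$ we have
\[ \Phi[q\gamma](-i) = (1 + \gamma)\Phi(\gamma); \tag{2.27} \]
(4) for any $\delta > 0$ and any $\gamma \in \overline{\Gamma}_q$ there exist constants $C_1$ and $C_2$ such that for any $0 \le \sigma < 1$ and any $s \in \mathbb{R}$ we have
\[ \bigl|\Phi[\gamma](s - i\sigma)\bigr| \le C_1 + C_2\bigl|q^{-1}\gamma q^{is}\bigr|^{1 + \delta}. \]
Then $\Phi = F_q$.

Proof. Let us fix a $\gamma \in \overline{\Gamma}_q$ and define
\[ \varphi_\gamma(z) = \frac{\Phi[\gamma](z)}{F_q[\gamma](z)} \]
for $z$ in the lower half plane. This way we obtain a meromorphic function $\varphi_\gamma$ on the lower half plane. Our aim is to prove that it is a constant function equal to one. We shall show this in four steps:
(1) $\varphi_\gamma$ is holomorphic;
(2) $\varphi_\gamma$ is periodic with non-real period, in particular $\varphi_\gamma$ extends to an entire function;
(3) $\varphi_\gamma$ factorizes through the map $s \mapsto q^{is}$, more precisely there exists an entire function $\psi_\gamma$ such that $\varphi_\gamma(s - i\sigma) = \psi_\gamma\bigl(q^{i(s - i\sigma)}\bigr)$;
(4) $\psi_\gamma$ is constant equal to 1.

Ad. (1) Analysis of zeros of the function $F_q[\gamma]$ (cf. (2.7), (2.8) and (2.9)) shows that all zeros of this function have integer imaginary parts. By (2.6) we have
\[ F_q[\gamma](t - i) = (1 + q^{it}q^{-1}\gamma)\,F_q(q^{it}q^{-1}\gamma) \]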
for all $t \in \mathbb{R}$. Therefore the only possible zero on the line $\mathbb{R} - i$ exists when there is a $t_0 \in \mathbb{R}$ such that $q^{it_0}\gamma = -q$. By (2.27) we have the same facts for $\Phi$. Therefore the (only possible) zero of $F_q[\gamma]$ in the strip $\{z \in \mathbb{C} : -2 < \operatorname{Im} z \le 0\}$ cancels with one of the zeros of $\Phi[\gamma]$. It follows that $\varphi_\gamma$ extends to a holomorphic function in $\{z \in \mathbb{C} : -2 < \operatorname{Im} z < 0\}$. Notice further that by (2.6) and (2.27) we have
\[ \varphi_\gamma(t - i) = \frac{\Phi[\gamma](t - i)}{F_q[\gamma](t - i)} = \frac{(1 + q^{it}q^{-1}\gamma)\,\Phi[q^{-1}\gamma](t)}{(1 + q^{it}q^{-1}\gamma)\,F_q[q^{-1}\gamma](t)} = \varphi_{q^{-1}\gamma}(t) \tag{2.28} \]
for all $t \in \mathbb{R}$ and so this equality remains true for $t$ in the lower half plane. Therefore for any $k \in \mathbb{N}$ holomorphy of $\varphi_\gamma$ in the strip $\{z \in \mathbb{C} : -(k + 2) < \operatorname{Im} z < -k\}$ is equivalent to that of $\varphi_{q^{-k}\gamma}$ in $\{z \in \mathbb{C} : -2 < \operatorname{Im} z < 0\}$. In particular $\varphi_\gamma$ is holomorphic in the lower half plane.

Ad. (2) Using (2.28) $N$ times we get
\[ \varphi_\gamma(t - Ni) = \varphi_{q^{-N}\gamma}(t) = \varphi_{q^{it_0}\gamma}(t), \]
for some $t_0 \in \mathbb{R}$ (namely such that $q^{it_0} = q^{-N}$) and all $t \in \mathbb{R}$. In particular
\[ \varphi_\gamma(t - Ni) = \frac{\Phi[q^{it_0}\gamma](t)}{F_q[q^{it_0}\gamma](t)} = \frac{\Phi[\gamma](t + t_0)}{F_q[\gamma](t + t_0)} = \varphi_\gamma(t + t_0) \]
or in other words
\[ \varphi_\gamma\bigl(t - (t_0 + Ni)\bigr) = \varphi_\gamma(t) \]
for all $t \in \mathbb{R}$. By holomorphy of $\varphi_\gamma$ this equality holds for $t$ in the lower half plane.

Ad. (3) For any $s \in \mathbb{R}$ we have
\[ \varphi_\gamma(s) = \frac{\Phi[\gamma](s)}{F_q[\gamma](s)} = \frac{\Phi(q^{is}\gamma)}{F_q(q^{is}\gamma)} \]
and by periodicity
\[ \varphi_\gamma(s) = \varphi_\gamma\bigl(s - (t_0 + Ni)\bigr) = \frac{\Phi\bigl(q^{i(s - (t_0 + Ni))}\gamma\bigr)}{F_q\bigl(q^{i(s - (t_0 + Ni))}\gamma\bigr)}. \]
As $z$ goes along a path from $s$ to $s - (t_0 + Ni)$ the variable $q^{iz}\gamma$ goes along a closed path beginning and ending in $q^{is}\gamma$. Therefore the formula
\[ \psi_\gamma(q^{iz}) = \varphi_\gamma(z) \]
defines a holomorphic function $\psi_\gamma$ on $\mathbb{C}\setminus\{0\}$. By Theorem 2.5(3) we know that $F_q[\gamma](z)$ converges to 1 as the real part of $z$ goes to $-\infty$ and the imaginary part stays bounded. Also by assumption (4) of this proposition $\Phi[\gamma](z)$ is bounded when $z$ moves to $-\infty$. Therefore the quotient is bounded. Consequently $\psi_\gamma(z)$ is bounded as $z \to 0$. It follows that $\psi_\gamma$ is entire.

Ad. (4) Let us fix a $0 < \delta < \frac{1}{2}$. We know that there are constants $C_1$ and $C_2$ such that
\[ \bigl|\Phi[\gamma](s - i\sigma)\bigr| \le C_1 + C_2\bigl|q^{-1}\gamma q^{is}\bigr|^{1 + \delta} \]
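for all $s \in \mathbb{R}$ and $0 \le \sigma < 1$. By Proposition 2.11, for $s \to \infty$ we have
\[ \bigl|\varphi_\gamma(s - i\sigma)\bigr| \le \frac{C_1}{\Xi_\gamma(s - i\sigma)\,\bigl|q^{-1}\gamma q^{is}\bigr|^{\sigma}} + \frac{C_2}{\Xi_\gamma(s - i\sigma)}\bigl|q^{-1}\gamma q^{is}\bigr|^{1 + \delta - \sigma}. \tag{2.29} \]
Consider now the values of $\psi_\gamma$ on the curve $\Upsilon = \{q^{i(s - i\sigma)}\gamma : s \in \mathbb{R}\}$. As $s \to \infty$ the corresponding point of $\Upsilon$ goes to infinity. Now (2.29) shows that for $z \in \Upsilon$ we have $|\psi_\gamma(z)| = o(|z|^2)$. It follows that $\psi_\gamma(z) = \psi_\gamma(0) + \psi_\gamma'(0)z$. However if $\sigma > 1 - \delta$ then $1 + \delta - \sigma < 2\delta < 1$ and consequently $\psi_\gamma(z) = o(|z|)$. In particular $\psi_\gamma'(0) = 0$ and $\psi_\gamma(z) = \psi_\gamma(0)$ for all $z$. By assumption (1)
\[ \psi_\gamma(0) = \lim_{\mathbb{R} \ni t \to -\infty}\varphi_\gamma(t) = \lim_{\overline{\Gamma}_q \ni \gamma \to 0}\frac{\Phi(\gamma)}{F_q(\gamma)} = 1. \]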
In the next subsection we shall exhibit a function satisfying conditions (1)–(4) of Proposition 2.12. In particular that will imply that $F_q$ satisfies these conditions.

2.7. Fourier transform of Fq

All results of this subsection can be proved with more or less standard techniques from the theory of functions of one complex variable and the theory of distributions. Therefore we have decided to present only sketches of proofs. The details have been taken care of in [7, Uzupełnienie A.2] (cf. also [17, Appendix B]).

Like any locally compact group, $\Gamma_q$ possesses a Haar measure $d\mu$. We shall choose the following normalization:
\[ \int_{\Gamma_q} f(\gamma)\,d\mu(\gamma) = \sum_{k=0}^{N-1}\int_{-\infty}^{+\infty} f(q^k q^{it})\,dt. \]
Apart from integration we shall also use the theory of distributions on $\Gamma_q$. To that end let us define the Schwartz space $S(\Gamma_q)$ as the space of all functions $f\colon \Gamma_q \to \mathbb{C}$ such that the functions
\[ \mathbb{R} \ni t \mapsto f(q^k q^{it}) \]
belong to the space $S(\mathbb{R})$ (the usual Schwartz space on $\mathbb{R}$) for $k = 1, \ldots, N$. This definition is, of course, compatible with the isomorphism $\Gamma_q \cong \mathbb{Z}_N \times \mathbb{R}$ (i.e. $S(\Gamma_q) \cong \mathbb{C}^N \otimes S(\mathbb{R})$). We shall consider the following function
\[ H_q(\gamma) = \frac{\chi(-q^{-2}, \gamma)\,\gamma}{(1 - \bar{\gamma})\,F_q(-q^2\gamma)} \]
defined for γ ∈ Γq\{1}. This function defines a tempered distribution on Γq by integration. The pole of the function Hq has to be rounded. Let be the oriented contour in C coinciding with R, but rounding the point 0 from above. For f ∈ S(Γq ) we have N −1 +∞ it it Hq , f = Hq (q )f (q ) dt + Hq (q k q it )f (q k q it ) dt.
k=1
−∞
We let Φq be the inverse Fourier transform of the tempered distribution Hq : Φq (γ) =
χ(−q −2 γ, q it )q it (1 − q it )Fq (−q 2 q it )
dt +
N −1 +∞ k=1
−∞
χ(−q −2 γ, q k q it )q k q it (1 − q k q it )Fq (−q k+2 q it )
dt. (2.30)
Proposition 2.13. (1) The integral (2.30) is convergent as an improper Riemann integral. More precisely there exists the limit χ(−q −2 γ, q it )q it Φq (γ) = lim dt R→∞ R 1 − q it Fq (−q 2 q it ) + lim
R→∞
N −1 R
χ(−q −2 γ, q k q it )q k q it
−∞
k=1
(1 − q k q it )Fq (−q k+2 q it )
dt,
(2.31)
where R is the part of starting at −∞ and ending at R. (2) For any τ ∈ ]0, 1[ we have ¯ q (−q 2 )−1 Φq (γ) = −2π ρF +
N −1
(q Phase(−q
−2
k
γ))
R−iτ
k=0
|γ|iz |q|−2iz q iz (1 − q k q iz )Fq [−q k+2 ] (z)
dz.
(2.32)
¯ q and (3) Φq extends to a continuous function on Γ ¯ q (−q 2 )−1 . lim Φq (γ) = −2π ρF
γ→0
¯ q the function R t → Φq (q it γ) has a holomorphic continuation (4) For any γ ∈ Γ to the lower half plane. (5) For any γ ∈ Γq we have Φq [qγ] (−i)= (1 + γ)Φq (γ). ¯ q there exist constants C1 and C2 such that for (6) For any δ > 0 and any γ ∈ Γ any 0 ≤ σ < 1 and any s ∈ R we have Φq [γ] (s − iσ) ≤ C1 + C2 q −1 γq is 1+δ .
Proof. Ad. (1) Due to the exponential decay of |q it | for t → −∞ the integral over R is convergent. After elementary manipulations we can rewrite the right-hand side of (2.31) as lim
R→∞
N −1 k=0
R
−1
−1
(q Phase(−q −2 γ))k eit(log |γ|−Re ρ ) e−t Im ρ dt. 1 − e−ik Im ρ−1 e−it Re ρ−1 ek Re ρ−1 e−t Im ρ−1 Fq [−q k+2 ] (t)
Then we choose a number τ ∈ ]0, 1[ and deform the integration contour in the following way (Fig. 1). Now standard methods show that the integral along the part of the contour from R − iτ to R goes to 0 as R → ∞ and that the integral over the remaining part of the contour has a limit as R → ∞. Ad. (2) To obtain formula (2.32) we deform further the integration contour (Fig. 2). This deformation does not change the value of the summands for k ∈ {1, . . . , ¯ q (−q 2 )−1 as the residue of the integrand at N − 1}. For k = 0 we obtain −2π ρF the point 0. Again standard computations show that the integral over the line from −R to −R − iτ tends to 0 as R → ∞. In the limit we get (2.32). Ad. (3) This follows from (2.32). ¯ q and s ∈ R the value Φq [γ] (s) is a limit over R → ∞ of a Ad. (4) For γ ∈ Γ sum of terms of the form |−q −2 γ|it |q it |is q it dt |q k |is q k Phase(−q −2 γ)k . (2.33) R (1 − q k q it )Fq [−q k+2 ] (t)
Fig. 1. Deformation of the integration contour.
Fig. 2. Further deformation of the integration contour.
Now we want to put s − iσ in place of s with σ > 0. To that end we must deform R in such a way that the integral be convergent and that we avoid all zeros of the denominator. An easy analysis of the zeros of Fq −q k+2 shows that they all lie below the line Rρ + (k + 2)i. Therefore if we choose τ > σ and deform the integration contour as shown in Fig. 3. The value of (2.33) will not change. Now, as before, it is possible to show that the integral over the line from R − iτ to R goes to 0 as R → ∞. At the same time it is easy to see that the limit R → ∞ defines a holomorphic function of z = s − iσ for 0 < σ < τ . Since τ was arbitrarily large we see that Φq [γ] has a holomorphic extension to the lower half plane. In particular for z = −i and with qγ in place of γ we have
Φq [qγ] (−i) =
N −1
k=0
2 χ(−q −2 γ, q k q it ) q k q it dt (1 − q k q it )Fq (−q k+2 q it )
.
Ad. (5) Combining (2.5) and (2.6) we get
Fq [γ] (−i) = (1 − γ¯ )Fq (−q 2 γ) = (1 − γ¯ )Fq −q 2 γ (0) ¯ q and more generally for all γ ∈ Γ
Fq [γ] (t − i) = (1 − q it γ)Fq −q 2 γ (t) for all t ∈ R.
Fig. 3.
ˆ ˜ Avoiding zeros of Fq −q k+2 .
(2.34)
Using this formula we compute N −1 χ(−q −2 γ, q k q it )q k q it dt Φq (γ) = k it k+2 q it ) k=0 (1 − q q )Fq (−q k N −1 Phase −q −2 γ |−q −2 γ|it q k q it dt = (1 − q k q it )Fq [−q k+2 ] (t) k=0 k N −1 Phase −q −2 γ |−q −2 γ|it q k q it dt = Fq [−q k+1 ] (t − i) k=0 k N −1 Phase −q −2 γ |−q −2 γ|i(z+i) q k q i(z+i) dz , = Fq [−q k+1 ] (z) −i k=0
and so
−1 Φq (γ) = Phase −q −2 γ |−q −2 γ|−1 k+1 N −1 Phase −q −2 γ |−q −2 γ|iz q k−1 q iz dz × Fq [−q k+1 ] (z) k=0 −i k+1 −1 −2 −1 N Phase −q −2 γ |−q −2 γ|iz q k−1 q iz dz . = −q γ Fq [−q k+1 ] (z) −i k=0
Now we change the integration contour from − i to to obtain k+1 −1 −2 −1 N Phase −q −2 γ |−q −2 γ|it q k−1 q it dt Φq (γ) = −q γ Fq [−q k+1 ] (t) k=0 −1 −2 −1 N χ(−q −2 γ, q k q it )q k−1 q it dt = −q γ Fq [−q k+1 ] (t) k=0 N −1 χ(−q −2 γ, q k q it )q k+1 q it dt = −γ −1 Fq [−q k+1 ] (t) k=0 N −1 χ(−q −2 γ, q k+1 q it )q k+1 q it dt . = −γ −1 Fq (−q k+1 q it ) k=0
Now by (2.5) we have Φq (γ) = −γ
−1
N −1 k=0
= −γ −1
N −1 k=0
= −γ −1
N −1 k=0
χ(−q −2 γ, q k+1 q it )q k+1 q it dt 1−qk+1 qit 2 k+1 q it ) 1−qk+1 qit Fq (−q q
χ(−q −2 γ, q k+1 q it )(1 − q k+1 q it )q k+1 q it dt (1 − q k+1 q it )Fq (−q 2 q k+1 q it ) χ(−q −2 γ, q k q it )(1 − q k q it )q k q it dt (1 − q k q it )Fq (−q k+2 q it )
N −1
χ(−q −2 γ, q k q it )q k q it dt + γ −1 k q it F (−q k+2 q it ) 1 − q q k=0 2 N −1 −2 k χ(−q γ, q q it ) q k q it dt × . 1 − q k q it Fq (−q k+2 q it ) k=0
= −γ −1
In other words (cf. (2.34)) Φq [γ] (0) = γ −1 Φq [qγ] (−i)− γ −1 Φq [γ] (0), i.e. Φq [qγ] (−i) = (1 + γ)Φq [γ] (0) = (1 + γ)Φq (γ). Ad. (6) Φq [γ] (s − iσ) is a sum of terms of the form is k σ k |q| q q Phase −q −2 γ
|q it |is |q it |σ |−q −2 γ|it q it dt (1 − q k q it )Fq [−q k+2 ] (t)
,
(2.35)
where the integral is understood as the limit with R → ∞ of the integral over a contour as shown in Fig. 3 with τ = 1 + δ. We can divide the contour into three parts: first starting at −∞ and ending at c − iτ , the second from c − iτ to R − iτ , and the third part from R − iτ to R (see Fig. 3). As we have pointed out in the proof of Statement (4) the integral over the third part goes to 0 as R → ∞. Let M1 be the value of the integral over the first part of the contour. Then elementary computations show that there is a constant M2 such that the integral in (2.35) equals M1 + M2 |q −1 γq is |τ . Since τ = 1 + δ there exist constants C1 and C2 such that Φq [γ] (s − iσ) ≤ C1 + C2 q −1 γq is 1+δ for all 0 ≤ σ < 1 and all s ∈ R. As an immediate consequence of Propositions 2.13 and 2.12 we get the following: Corollary 2.14. The functions Fq and Φq are proportional: ¯ q (−q 2 )−1 Φq . Fq = −2π ρF Moreover we have Fq (−q 2 ) Fq (γ) = − 2π ρ¯
Γq
χ(−q −2 γ, γ ) dµ(γ ) (1 − γ¯ )Fq (−q 2 γ )
in the sense of distribution theory (the correction of the integration contour is understood as a part of the definition of the distribution under the sign of the integral). of F q and F Corollary 2.15. The distributional inverse Fourier transforms F q q and Fq satisfy (γ). q (γ) = α(γ)|γ|χ(−1, γ)F F q Proof. By Corollary 2.14 2 −2 q (γ) = − Fq (−q ) χ(−q , γ)γ . F 2π ρ¯ (1 − γ¯ )Fq (−q 2 γ)
Consequently 2 −1
(γ) = F q (γ −1 ) = − Fq (−q ) F q 2πρ
χ(−q −2 , γ)(1 − γ −1 ) . Fq (−q 2 γ −1 )¯ γ
Therefore (γ) F ρ¯ 1 − γ¯ q Fq (−q 2 γ)Fq (−q 2 γ −1 )¯ = Fq (−q 2 )−2 γ −1 . q (γ) ρ γ−1 F Using (2.5), (2.10), (2.3), the fact that χ is a bicharacter and (2.1) we get (γ) ρ¯ F q = −Fq (−q 2 )−2 Fq (−γ)Fq (−q 2 γ −1 )¯ γ −1 q (γ) ρ F ρ¯ = −Fq (−q 2 )−2 Cq α(−q −1 γ)¯ γ −1 ρ ρ¯ α(−q −1 γ) γ¯ −1 = −Fq (−q 2 )−2 Cq α(−q −1 )α(γ) ρ α(−q −1 )α(γ) ρ¯ = −Fq (−q 2 )−2 Cq α(−q −1 )α(γ)χ(−q −1 , γ)¯ γ −1 ρ ρ¯ 1 = −Fq (−q 2 )−2 Cq α(−q −1 )α(γ) ρ (χ(−q, γ)¯ γ ¯ ρ 1 = −Fq (−q 2 )−2 Cq α(−q −1 )α(γ) ρ χ(q, γ)¯ γ χ(−1, γ) ρ¯ 1 = −Fq (−q 2 )−2 Cq α(−q −1 )α(γ) ρ Phase(γ)¯ γ χ(−1, γ) ¯ ρ 1 . = −Fq (−q 2 )−2 Cq α(−q −1 ) ρ α(γ)|γ|χ(−1, γ) Now inserting γ = −1 in (2.10) and using (2.5) we obtain Cq α(−q −1 ) = Fq (−1)Fq (−q 2 ) = (γ). q (γ) = α(γ)|γ|χ(−1, γ)F and thus F q
−ρ Fq (−q 2 )2 , ρ¯
2.8. Other useful functions

In this subsection we shall relate the special functions introduced above to other functions which fit well the framework developed in [16]. This step is needed to be able to freely use the theory of the Zakrzewski relation presented in that reference.

Consider the function $\operatorname{Ph}\colon \Gamma_q \to \mathbb{T}$ given by
\[ \operatorname{Ph}(q^n q^{it}) = e^{\frac{2\pi i n}{N}}, \]
where $N$ is the constant entering the definition of $\rho$ (cf. the beginning of this section). It is easy to check that this formula defines a function on $\Gamma_q$ (the value does not depend on the representation of $\gamma$ in the form $q^n q^{it}$). Elementary computations give the following formula
\[ \gamma = \operatorname{Ph}(\gamma)\,|\gamma|^{1 + \frac{2\pi i\operatorname{Re}\rho}{N}} \tag{2.36} \]
for any $\gamma \in \Gamma_q$. In [16] for a parameter $-\pi < \hbar < \pi$ a subset $\Omega_\hbar^+$ was defined in the following way:
\[ \Omega_\hbar^+ = \bigl\{z \in \mathbb{C}\setminus\{0\} : \hbar\arg z \ge 0,\ |\arg z| \in [0, |\hbar|]\bigr\}. \]
In what follows we shall use the identification of the region lying between the logarithmic spirals $q^n\{q^{it} : t \in \mathbb{R}\}$ and $q^{n+1}\{q^{it} : t \in \mathbb{R}\}$,
\[ \bigl\{q^n q^{i(t - i\tau)} : t \in \mathbb{R},\ \tau \in [0, 1]\bigr\}, \]
with $\Omega^+_{\operatorname{Im}\rho^{-1}}$ given by
\[ q^n q^{i(t - i\tau)} \longleftrightarrow e^{i\tau\hbar}\,|q^n q^{it}| \]
(notice that the conditions imposed on the form of $\rho$ imply that $|\operatorname{Im}\rho^{-1}| < \pi$). Another tool used in [16] is the space
\[ H_\hbar^+ = \Bigl\{f \in C(\Omega_\hbar^+) : f \text{ is holomorphic in the interior of } \Omega_\hbar^+ \text{ and for any } \lambda > 0 \text{ the function } z \mapsto e^{-\lambda\ell(z)^2}f(z) \text{ is bounded on } \Omega_\hbar^+\Bigr\}, \]
where $\ell(z) = \log|z| + i\arg z$. For $k \in \{0, \ldots, N - 1\}$ and $r > 0$ define
\[ f_k(r) = F_q\Bigl(e^{\frac{2\pi i k}{N}}\,r^{1 + \frac{2\pi i\operatorname{Re}\rho}{N}}\Bigr). \]
Using the information about the function $F_q$ obtained in Theorem 2.5 and Proposition 2.11 one can without difficulty prove the following (cf. [7, Sec. 5.5] and [17]):

Lemma 2.16. Let $k \in \{0, \ldots, N - 1\}$. Then
(1) the function $f_k$ extends to a continuous function on $\Omega^+_{\operatorname{Im}\rho^{-1}}$ which is holomorphic in the interior of that region, moreover, denoting the extension by the same symbol, we have $f_k \in H^+_{\operatorname{Im}\rho^{-1}}$ and for any $\tau \in\, ]0, 1[$ we have $f_k^{-1} \in H^+_{\tau\operatorname{Im}\rho^{-1}}$;
(2) for any $r > 0$ we have
\[ f_k\bigl(e^{i\operatorname{Im}\rho^{-1}}r\bigr) = (1 + q^{-1}\gamma)\,F_q(q^{-1}\gamma), \]
where $\gamma = e^{\frac{2\pi i k}{N}}\,r^{1 + \frac{2\pi i\operatorname{Re}\rho}{N}}$;
(3) we have the following asymptotic behavior of $f_k$:
\[ |f_k(z)| \le \Theta_k(z)\,|q^{-1}z|^{\frac{\arg z}{\operatorname{Im}\rho^{-1}}}, \]
where $\lim_{|z| \to \infty}\Theta_k(z) = 1$.

3. Operator Equalities

3.1. The commutation relations

In this section we shall examine pairs $(A, B)$ of operators on a Hilbert space satisfying the following conditions:
(1) $A$ and $B$ are normal,
(2) $\ker A = \ker B = \{0\}$,
(3) $\operatorname{Sp} A, \operatorname{Sp} B \subset \overline{\Gamma}_q$,
(4) for all $\gamma, \gamma' \in \Gamma_q$
\[ \chi(\gamma, A)\chi(B, \gamma') = \chi(\gamma, \gamma')\chi(B, \gamma')\chi(\gamma, A). \tag{3.1} \]

Formula (3.1) is called the Weyl relation. The set of pairs $(A, B)$ of operators on a Hilbert space $H$ fulfilling conditions (1)–(4) will be denoted by $D_H$. For any infinite dimensional Hilbert space $H$ the set $D_H$ is non-empty. Moreover any pair $(A, B) \in D_H$ is unitarily equivalent to a direct sum of the so-called Schrödinger pairs. The Schrödinger pair $(A_S, B_S)$ acts irreducibly on $L^2(\Gamma_q)$ in the following way:
\[ A_S f(\gamma) = f[q\gamma](-i), \qquad B_S f(\gamma) = \gamma f(\gamma) \]
(cf. [17, 7]). The correspondence $H \mapsto D_H$ satisfies the following conditions:
• if $H$ and $K$ are Hilbert spaces, $U\colon H \to K$ a unitary map and $(A, B) \in D_H$ then $(UAU^*, UBU^*) \in D_K$,
• if $H = H_1 \oplus H_2$ and operators $A$ and $B$ on $H$ decompose as $A = A_1 \oplus A_2$ and $B = B_1 \oplus B_2$ respectively then $(A, B) \in D_H$ if and only if $(A_k, B_k) \in D_{H_k}$ for $k = 1, 2$.
Such a structure is called an operator domain, a notion closely related to compact and measurable domains and W∗-categories (cf. [2, 10, 14, 17, 7]). We shall use the symbol $D$ also to denote this operator domain.
Inserting in the Weyl relation $(q, q^{it})$, $(q^{it}, q)$, $(q, q)$, and $(q^{it}, q^{it'})$ for $(\gamma, \gamma')$ and performing analytic continuation with respect to $t$ in the first two cases we obtain (cf. (2.1)):
\[
\begin{aligned}
(\operatorname{Phase} A)\,|B| &= |q|\,|B|\,(\operatorname{Phase} A), \\
|A|\,(\operatorname{Phase} B) &= |q|\,(\operatorname{Phase} B)\,|A|, \\
(\operatorname{Phase} A)(\operatorname{Phase} B) &= e^{i\operatorname{Im}\rho^{-1}}(\operatorname{Phase} B)(\operatorname{Phase} A), \\
|A|^{it}|B|^{it'} &= e^{-itt'\operatorname{Im}\rho^{-1}}\,|B|^{it'}|A|^{it}.
\end{aligned}
\tag{3.2}
\]
The last equation of (3.2) means that $|B|$ and $|A|$ satisfy the Zakrzewski relation [16, Definition 2.1] with $\hbar = \operatorname{Im}\rho^{-1}$. We shall represent this graphically as $|B| \overset{\hbar}{\multimap} |A|$.

Let us recall the definition of a core for a family of operators introduced in [6]. Let $H$ be a Hilbert space and let $\mathcal{T}$ be a family of closable operators on $H$. A linear subset $D_0$ is a core for $\mathcal{T}$ if
(1) $D_0 \subset D(T)$ for all $T \in \mathcal{T}$,
(2) for any $x \in H$ there exists a sequence $(x_n)_{n\in\mathbb{N}}$ of elements of $D_0$ converging to $x$ such that
\[ \bigl( T \in \mathcal{T},\ x \in D(\bar T) \bigr) \Longrightarrow \bigl( T x_n \xrightarrow[n\to\infty]{} \bar T x \bigr). \]
Throughout the paper we shall adopt the following convention. For any pair of linear operators $(X, Y)$ acting on the same Hilbert space we shall denote by $X \circ Y$ their composition. If $X \circ Y$ happens to be closable then $XY$ will denote the closure of $X \circ Y$. Let us recall [6, Theorem 2.7].

Theorem 3.1. Let $H$ be a Hilbert space and let $T_1, \dots, T_p$ be normal operators on $H$. Assume that for each pair of indices $k, l \in \{1, \dots, p\}$ there exist scalars $\mu(k, l)$ and $\lambda(k, l)$ such that
\[
\begin{aligned}
(\operatorname{Phase} T_k)\,|T_l| &= \mu(k, l)\,|T_l|\,(\operatorname{Phase} T_k), \\
(\operatorname{Phase} T_k)(\operatorname{Phase} T_l) &= \lambda(k, l)\,(\operatorname{Phase} T_l)(\operatorname{Phase} T_k).
\end{aligned}
\]
Moreover assume that there exists a real number $\hbar$ with $|\hbar| < \pi$ such that for any $k, l \in \{1, \dots, p\}$ one of the following conditions holds:
(1) $|T_k|$ strongly commutes with $|T_l|$,
(2) $|T_k| \overset{\hbar}{\multimap} |T_l|$,
(3) $|T_l| \overset{\hbar}{\multimap} |T_k|$.
Let $\mathcal{T}$ be the family of all compositions of the form $T^{\sharp}_{i_1} \circ T^{\sharp}_{i_2} \circ \cdots \circ T^{\sharp}_{i_n}$ where $i_1, \dots, i_n \in \{1, \dots, p\}$ and $T^{\sharp}$ denotes either $T$ or $T^*$. Then all operators in $\mathcal{T}$ are densely defined and closable and there exists a core for $\mathcal{T}$.
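The passage from the last line of (3.2) to a scaling relation between $|B|$ and $|A|$ can be made explicit. The following is a sketch that uses only that line of (3.2), with $\hbar = \operatorname{Im}\rho^{-1}$ and real parameters $t, t'$; the final line is the scaling form in which relations of this type are usually stated.

```latex
% Sketch: unpacking |A|^{it}|B|^{it'} = e^{-itt'\hbar}|B|^{it'}|A|^{it}.
\begin{align*}
|B|^{it'}\,|A|^{it}\,|B|^{-it'}
  &= e^{itt'\hbar}\,|A|^{it}
   = \bigl(e^{\hbar t'}|A|\bigr)^{it}
  \qquad\text{for all } t \in \mathbb{R}, \\
\text{hence}\qquad
|B|^{it'}\,|A|\,|B|^{-it'} &= e^{\hbar t'}\,|A|
  \qquad\text{for all } t' \in \mathbb{R}.
\end{align*}
```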
The original formulation of this theorem in [6] places stronger conditions on the operators $T_1, \dots, T_p$. Namely the constant $\hbar$ is fixed as $\frac{2\pi}{N}$ with an even natural number $N$ greater than or equal to 6 and the numbers $\lambda(k, l)$, $\mu(k, l)$ are equal to 1 for all $k, l \in \{1, \dots, p\}$. It is, however, clear from the proof given in [6] that these restrictions are not essential. Given a pair $(A, B) \in D_H$ for some Hilbert space $H$ we can apply Theorem 3.1 with $p = 2$ and $T_1 = A$, $T_2 = B$. The family of all finite compositions of elements of the set $\{A, A^*, B, B^*\}$ will be denoted by $\mathcal{T}$ and we shall use the symbol $D_0$ for the core for $\mathcal{T}$.

3.2. Products

Theorem 3.2. Let $H$ be a Hilbert space and let $(A, B) \in D_H$. Then
(1) the operators $A \circ B$, $B \circ A$, $A \circ B^*$ and $B^* \circ A$ are closable and we have
\[ AB = q^2 BA, \qquad AB^* = B^*A; \tag{3.3} \]
(2) for any $\gamma \in \Gamma_q$
\[ \alpha(\gamma)\chi(B, \gamma)\chi(\gamma, A) = \alpha(B)^*\chi(\gamma, A)\alpha(B) = \alpha(A)\chi(B, \gamma)\alpha(A)^*; \tag{3.4} \]
(3) we have
\[ qBA = \alpha(B)^* A\,\alpha(B) = \alpha(A) B\,\alpha(A)^*. \tag{3.5} \]
Proof. Ad. (1) The closability of all finite compositions of elements of the set $\{A, A^*, B, B^*\}$ was established in Theorem 3.1. Formula (3.3) follows from the fact that $A(By) = q^2 B(Ay)$ and $A(B^*y) = B^*(Ay)$ for any $y \in D_0$.

Ad. (2) In order to make our exposition shorter we shall only prove the equality $\alpha(\gamma)\chi(B, \gamma)\chi(\gamma, A) = \alpha(A)\chi(B, \gamma)\alpha(A)^*$. The other one can be proved in an analogous fashion. From the Weyl relation and the fact that $\chi$ is a bicharacter we infer that
\[ \chi(B, \gamma)^*\chi(\gamma', A)\chi(B, \gamma) = \chi(\gamma', \gamma A). \tag{3.6} \]
Inserting $\gamma' = q$ and $\gamma' = q^{it}$ in (3.6) we obtain
\[
\begin{aligned}
\chi(B, \gamma)^*(\operatorname{Phase} A)\chi(B, \gamma) &= (\operatorname{Phase}\gamma)(\operatorname{Phase} A), \\
\chi(B, \gamma)^*|A|\chi(B, \gamma) &= |\gamma|\,|A|,
\end{aligned}
\]
where in the second equality we performed analytic continuation to $t = -i$. Multiplying the left-hand and right-hand sides of these equations results in
\[ \chi(B, \gamma)^* A\,\chi(B, \gamma) = \gamma A \tag{3.7} \]
for all $\gamma \in \Gamma_q$. Now let us apply the function $\alpha$ to both sides of (3.7) and multiply both sides of the resulting equation by $\alpha(A)^*$ from the right. We obtain
\[ \alpha(A)\chi(B, \gamma)\alpha(A)^* = \chi(B, \gamma)\alpha(\gamma A)\alpha(A)^*. \]
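Remembering that $\alpha(\gamma)\overline{\alpha(\gamma)} = 1$ for all $\gamma \in \Gamma_q$ we can rewrite this as
\[
\alpha(A)\chi(B, \gamma)\alpha(A)^*
 = \alpha(\gamma)\chi(B, \gamma)\,\alpha(\gamma A)\overline{\alpha(\gamma)}\alpha(A)^*
 = \alpha(\gamma)\chi(B, \gamma)\,\alpha(\gamma A)\,[\alpha(\gamma)\alpha(A)]^{-1}.
\]
Our assertion follows now from (2.3).

Ad. (3) As in the proof of (2) we put $\gamma = q$ and $\gamma = q^{it}$ in (3.4) and perform analytic continuation in the latter case:
\[
\begin{aligned}
e^{\frac{i}{2}\operatorname{Im}\rho^{-1}}(\operatorname{Phase} B)(\operatorname{Phase} A) &= \alpha(B)^*(\operatorname{Phase} A)\alpha(B) = \alpha(A)(\operatorname{Phase} B)\alpha(A)^*, \\
e^{\frac{i}{2}\operatorname{Im}\rho^{-1}}|B|\,|A| &= \alpha(B)^*|A|\alpha(B) = \alpha(A)|B|\alpha(A)^*.
\end{aligned}
\]
Now we can multiply the left-hand and right-hand sides of the above equations and use relations (3.2) to obtain (3.5).

Corollary 3.3. Let $H$ be a Hilbert space and let $(A, B) \in D_H$. Then
(1) for any $\gamma_1, \gamma_2 \in \Gamma_q$ we have $(\gamma_1 A, \gamma_2 B) \in D_H$;
(2) $(B, A^{-1}) \in D_H$;
(3) $(AB, B), (A, BA) \in D_H$;
(4) for any $\gamma \in \Gamma_q$ we have $\chi(q^{-1}AB, \gamma) = \alpha(\gamma)\chi(B, \gamma)\chi(\gamma, A)$.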
Proof. Ad. (1) The assertion follows by multiplying both sides of the Weyl relation (3.1) by $\chi(\gamma_2, \gamma_1)$ and using the fact that $\chi$ is a bicharacter.

Ad. (2) Applying hermitian conjugation to both sides of the Weyl relation and using the fact that
\[ \chi(\gamma, A)^* = \chi(\gamma, A^{-1}), \qquad \chi(B, \gamma')^* = \chi(B, \gamma'^{-1}) \]
we obtain
\[ \chi(B, \gamma'^{-1})\chi(\gamma, A^{-1}) = \chi(\gamma'^{-1}, \gamma)\chi(\gamma, A^{-1})\chi(B, \gamma'^{-1}) \]
which by symmetry of $\chi$ means that $(B, A^{-1}) \in D_H$.

Ad. (3) This follows by conjugating the pair $(A, B)$ with the unitary operators $\alpha(B)$ and $\alpha(A)^*$ and using Theorem 3.2(3) and Statement (1) above.

Ad. (4) By Theorem 3.2(2) we have
\[ \alpha(\gamma)\chi(B, \gamma)\chi(\gamma, A) = \alpha(A)\chi(B, \gamma)\alpha(A)^* = \chi(\alpha(A)B\alpha(A)^*, \gamma). \]
Now Statement (3) of the same theorem says that this is equal to $\chi(qBA, \gamma) = \chi(q^{-1}AB, \gamma)$.
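The conjugation step in the proof of Statement (2) can be written out as follows. This is a sketch under the assumption that $\chi$ is a symmetric bicharacter with values of modulus one (so that $\overline{\chi(\gamma,\gamma')} = \chi(\gamma'^{-1},\gamma)$); here $\gamma'$ is the second group variable of the Weyl relation and $\mu$ is an auxiliary name introduced only for this check.

```latex
% Sketch: adjoints of both sides of the Weyl relation (3.1).
\begin{align*}
\bigl(\chi(\gamma,A)\,\chi(B,\gamma')\bigr)^*
  &= \chi(B,\gamma')^*\,\chi(\gamma,A)^*
   = \chi(B,\gamma'^{-1})\,\chi(\gamma,A^{-1}), \\
\bigl(\chi(\gamma,\gamma')\,\chi(B,\gamma')\,\chi(\gamma,A)\bigr)^*
  &= \overline{\chi(\gamma,\gamma')}\;\chi(\gamma,A^{-1})\,\chi(B,\gamma'^{-1}).
\end{align*}
% Put \mu = \gamma'^{-1} and use the symmetry of \chi:
\begin{equation*}
\chi(\mu,B)\,\chi(A^{-1},\gamma)
  = \chi(\mu,\gamma)\,\chi(A^{-1},\gamma)\,\chi(\mu,B),
\end{equation*}
% which is the Weyl relation for the pair (B, A^{-1}).
```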
3.3. Sums

Let $H$ be a Hilbert space and let $(A, B) \in D_H$. Put $S = \operatorname{Ph}(A)\,|A|^{\rho}$ and $R = \operatorname{Ph}(B)\,|B|^{\rho}$. Then the pair $(R, S)$ satisfies the commutation relations and spectral conditions considered in [17]. Therefore we can use Proposition 2.3 of that reference which yields

Proposition 3.4. Let $H$ be a Hilbert space and let $(A, B) \in D_H$. Then there exists a one-parameter group $(R_t)_{t\in\mathbb{R}_+}$ of unitary operators acting on $H$ such that
\[
\begin{aligned}
\operatorname{Ph}(B)\,R_t &= R_t\operatorname{Ph}(B), & \operatorname{Ph}(A)\,R_t &= R_t\operatorname{Ph}(A), \\
R_t\,|B| &= |B|^{t}R_t, & |A|\,R_t &= R_t\,|A|^{t}
\end{aligned}
\]
for all $t \in \mathbb{R}_+$.
In the next theorem we shall consider the operator $A + B \circ A$, where $(A, B) \in D_H$ for some Hilbert space $H$. It turns out to be closable and its closure coincides with the closure of $A + BA$ (which, of course, is closable as well). The closability is more or less straightforward:
\[ (A + B \circ A)^* \supset A^* + A^* \circ B^*. \]
The core $D_0$ described after Theorem 3.1 is contained in the domain of the latter operator which is thus densely defined. The same reasoning shows that $A + BA$ is closable. Let $x \in D(A) \cap D(BA)$ and let $(x_n)_{n\in\mathbb{N}}$ be a sequence of elements of $D_0$ such that
\[ x_n \xrightarrow[n\to\infty]{} x, \qquad Ax_n \xrightarrow[n\to\infty]{} Ax, \qquad BAx_n \xrightarrow[n\to\infty]{} BAx. \]
It follows that $\bigl((A + B \circ A)x_n\bigr)_{n\in\mathbb{N}}$ is convergent and consequently $x$ belongs to the domain of the closure of $A + B \circ A$. Moreover
\[ \overline{(A + B \circ A)}\,x = \lim_{n\to\infty}(A + B \circ A)x_n = \lim_{n\to\infty}(Ax_n + BAx_n) = Ax + BAx = (A + BA)x. \]
As $D(A) \cap D(BA)$ is a core for $\overline{A + BA}$ we obtain $\overline{A + BA} \subset \overline{A + B \circ A}$. The converse inclusion is trivial and our assertion follows. From now on the closure of a sum of operators will be denoted by the symbol "$\dot{+}$". For example
\[ \overline{A + B \circ A} = \overline{A + BA} = A \,\dot{+}\, BA. \]
The proof of the theorem below is almost identical to proofs of analogous theorems in [12] and [17]. Since the details are somewhat different and our notation is not fully compatible with the one used in these references, we decided to include this proof for the reader's convenience.

Theorem 3.5. Let $H$ be a Hilbert space and let $(A, B) \in D_H$. Then
(1) the operator $A + BA$ is densely defined and closable and its closure
\[ A \,\dot{+}\, BA = F_q(B)^* A F_q(B), \tag{3.8} \]
in particular $A \,\dot{+}\, BA$ is normal and $\operatorname{Sp}(A \,\dot{+}\, BA) \subset \bar\Gamma_q$;
(2) the operator $A + B$ is densely defined and closable and its closure
\[ A \,\dot{+}\, B = F_q(BA^{-1})^* A F_q(BA^{-1}) \tag{3.9} \]
and
\[ A \,\dot{+}\, B = F_q(B^{-1}A)\, B\, F_q(B^{-1}A)^*; \tag{3.10} \]
it follows that $A \,\dot{+}\, B$ is normal and $\operatorname{Sp}(A \,\dot{+}\, B) \subset \bar\Gamma_q$;
(3) the function $F_q$ has the exponential property:
\[ F_q(A \,\dot{+}\, B) = F_q(B)F_q(A). \tag{3.11} \]
Proof. Ad. (1) We already know that $A + BA$ is closable. First we shall show that
\[ A \,\dot{+}\, BA \subset F_q(B)^* A F_q(B). \]
It is easy to see that $\operatorname{Ph}(B)^N = I$, so that the spectrum of $\operatorname{Ph}(B)$ is contained in the set of $N$th roots of unity. Let
\[ H = \bigoplus_{k=0}^{N-1} H_k \tag{3.12} \]
be the decomposition into eigenspaces of $\operatorname{Ph}(B)$. It is also easy to check (cf. remarks preceding Proposition 3.4) that $|A|$ commutes with $\operatorname{Ph}(B)$ and thus preserves the decomposition (3.12). We can therefore consider each summand of this decomposition separately. We know that $|B| \overset{\hbar}{\multimap} |A|$ with $\hbar = \operatorname{Im}\rho^{-1}$. By [16, Theorem 3.1(2)] on each subspace $H_k$ we have
\[ f_k\bigl(e^{i\operatorname{Im}\rho^{-1}}|B|\bigr)|A| \subset |A|\,f_k(|B|), \]
which by Lemma 2.16(2) means that
\[
\Bigl(1 + q^{-1}e^{\frac{2\pi i k}{N}}|B|^{1+\frac{2\pi i \operatorname{Re}\rho}{N}}\Bigr)
F_q\Bigl(q^{-1}e^{\frac{2\pi i k}{N}}|B|^{1+\frac{2\pi i \operatorname{Re}\rho}{N}}\Bigr)|A|
\subset |A|\,F_q\Bigl(e^{\frac{2\pi i k}{N}}|B|^{1+\frac{2\pi i \operatorname{Re}\rho}{N}}\Bigr).
\]
Taking a direct sum over $k$ and keeping in mind (2.36) we get
\[ (1 + q^{-1}B)F_q(q^{-1}B)|A| \subset |A|F_q(B) \]
which can be rewritten as
\[ |A| + q^{-1}B \circ |A| \subset F_q(q^{-1}B)^*|A|F_q(B). \tag{3.13} \]
Finally multiplying both sides of (3.13) from the left by $\operatorname{Phase} A$, using (3.2) and taking closure of both sides we obtain
\[ A \,\dot{+}\, BA \subset F_q(B)^* A F_q(B). \]
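The last step, multiplication of (3.13) by $\operatorname{Phase} A$, can be unpacked as follows. This is a sketch that uses only the first and third relations of (3.2), together with the assumption $q = |q|\,e^{i\operatorname{Im}\rho^{-1}}$ (the polar form of $q$ that is also implicit in the derivation of (3.5)); the identities are meant on the natural common domains.

```latex
% Sketch: how Phase A moves past B, using (3.2).
\begin{align*}
(\operatorname{Phase}A)\,B
 &= (\operatorname{Phase}A)(\operatorname{Phase}B)\,|B| \\
 &= e^{i\operatorname{Im}\rho^{-1}}(\operatorname{Phase}B)(\operatorname{Phase}A)\,|B| \\
 &= e^{i\operatorname{Im}\rho^{-1}}|q|\,(\operatorname{Phase}B)\,|B|\,(\operatorname{Phase}A)
  = q\,B\,(\operatorname{Phase}A).
\end{align*}
% Consequences used when passing from (3.13) to the final inclusion:
\begin{align*}
(\operatorname{Phase}A)\bigl(|A| + q^{-1}B\circ|A|\bigr) &= A + B\circ A, \\
(\operatorname{Phase}A)\,F_q(q^{-1}B)^* &= F_q(B)^*\,(\operatorname{Phase}A).
\end{align*}
```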
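In order to see the converse inclusion we shall prove that $D(A + B \circ A)$ is a core for $F_q(B)^* A F_q(B)$. Notice that apart from the relation $|B| \overset{\operatorname{Im}\rho^{-1}}{\multimap} |A|$ (cf. (3.2) and following remarks), for any $\tau \in\, ]0, 1[$ we have $|B| \overset{\tau\operatorname{Im}\rho^{-1}}{\multimap} |A|^{\tau}$. Therefore by Lemma 2.16(1) and [17, Theorem 3.1(3)] we have
\[ f_k\bigl(e^{i\tau\operatorname{Im}\rho^{-1}}|B|\bigr)|A|^{\tau} = |A|^{\tau} f_k(|B|) \tag{3.14} \]
on $H_k$. Let $x \in D(F_q(B)^* A F_q(B))$ and let $x_k$ be the projection of $x$ onto $H_k$. Since for any $\tau \in\, ]0, 1[$ we have $F_q(B)x \in D(|A|^{\tau})$ and $\operatorname{Ph}(B)$ commutes with $|A|$, the vectors $F_q(B)x_k \in D(|A|^{\tau})$. Moreover on $H_k$ we have $F_q(B) = f_k(|B|)$. Therefore
\[ f_k(|B|)x_k \in D(|A|^{\tau}). \tag{3.15} \]
Comparing (3.14) and (3.15) yields $x_k \in D\bigl(f_k(e^{i\tau\operatorname{Im}\rho^{-1}}|B|)|A|^{\tau}\bigr)$, which by Proposition 3.4 gives
\[ R_\tau x_k \in D\Bigl(f_k\bigl(e^{i\tau\operatorname{Im}\rho^{-1}}|B|^{\tau^{-1}}\bigr)|A|\Bigr). \tag{3.16} \]
By Lemma 2.16(1) the function $\mathbb{R}_+ \ni r \mapsto f_k\bigl(e^{i\tau\operatorname{Im}\rho^{-1}} r^{\tau^{-1}}\bigr)$ is separated from 0, and by Lemma 2.16(3), behaves asymptotically like the function $r \mapsto r$. Therefore (3.16) implies that $|A|R_\tau x_k \in D(B)$ and consequently $AR_\tau x_k \in D(B)$ ($\operatorname{Phase} A$ preserves the domain of $B$). Taking the direct sum over $k$ we see that for $\tau \in\, ]0, 1[$
\[ R_\tau x \in D(A), \qquad AR_\tau x \in D(B). \]
It follows that
\[ R_\tau x \in D(A + B \circ A). \tag{3.17} \]
Moreover
\[
\begin{aligned}
F_q(B)^* A F_q(B) R_\tau x
 &= F_q\bigl(\operatorname{Ph}(B)|B|^{1+\frac{2\pi i \operatorname{Re}\rho}{N}}\bigr)^* A\, F_q\bigl(\operatorname{Ph}(B)|B|^{1+\frac{2\pi i \operatorname{Re}\rho}{N}}\bigr) R_\tau x \\
 &= R_\tau F_q\bigl(\operatorname{Ph}(B)|B|^{\tau^{-1}(1+\frac{2\pi i \operatorname{Re}\rho}{N})}\bigr)^* \operatorname{Ph}(A)|A|^{\tau(1+\frac{2\pi i \operatorname{Re}\rho}{N})} \\
 &\qquad \times F_q\bigl(\operatorname{Ph}(B)|B|^{\tau^{-1}(1+\frac{2\pi i \operatorname{Re}\rho}{N})}\bigr)x.
\end{aligned}
\]
As the group $(R_t)_{t\in\mathbb{R}_+}$ is strongly continuous we conclude that
\[ R_\tau x \xrightarrow[\tau \nearrow 1]{} x, \qquad F_q(B)^* A F_q(B) R_\tau x \xrightarrow[\tau \nearrow 1]{} F_q(B)^* A F_q(B)x. \]
Now (3.17) implies that $D(A + B \circ A)$ is a core for $F_q(B)^* A F_q(B)$.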
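The claim in this proof that $r \mapsto f_k\bigl(e^{i\tau\operatorname{Im}\rho^{-1}} r^{\tau^{-1}}\bigr)$ grows like $r$ can be read off from Lemma 2.16(3). The computation below is a sketch; it assumes that the argument is measured so that $\arg\bigl(e^{i\tau\hbar}s\bigr) = \tau\hbar$ for $s > 0$, where $\hbar = \operatorname{Im}\rho^{-1}$.

```latex
% Sketch: the growth exponent along the ray z = e^{i\tau\hbar} r^{1/\tau}.
\begin{align*}
z = e^{i\tau\hbar}\, r^{\tau^{-1}}
  \quad&\Longrightarrow\quad
  \frac{\arg z}{\hbar} = \tau, \qquad |z| = r^{\tau^{-1}}, \\
|q^{-1}z|^{\frac{\arg z}{\hbar}}
  &= |q|^{-\tau}\,\bigl(r^{\tau^{-1}}\bigr)^{\tau}
   = |q|^{-\tau}\, r .
\end{align*}
% So the bound of Lemma 2.16(3) is linear in r along this ray.
```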
Remark 3.6. Before continuing the proof, let us notice a fact which we established in the proof of Statement (1): Let $H$ be a Hilbert space and let $(A_0, B_0) \in D_H$. Then there is a one-parameter group $(R_t)_{t\in\mathbb{R}_+}$ of unitary operators acting on $H$ such that for all $t \in \mathbb{R}_+$
\[ R_t|B_0|R_t^* = |B_0|^{t}, \qquad R_t^*|A_0|R_t = |A_0|^{t} \tag{3.18} \]
and for any $x \in D(A_0 \,\dot{+}\, B_0A_0)$ and any $t > 0$ the vector $R_t x$ belongs to $D(A_0) \cap D(B_0 \circ A_0)$ and the net $(R_t x)_{t\in]0,1[}$ converges to $x$ in the graph topology of $A_0 \,\dot{+}\, B_0A_0$ as $t \nearrow 1$.

Ad. (2) By repeated use of Corollary 3.3 we conclude that $(A, BA^{-1}) \in D_H$. Formula (3.8) applied to this pair yields (3.9). To prove (3.10) let us observe that by Theorem 2.7 we have
\[ F_q(\gamma) = C_q\,\alpha(q^{-1}\gamma)\,\overline{F_q(q^2\gamma^{-1})} \]
for all $\gamma \in \Gamma_q$. Therefore
\[ F_q(BA^{-1}) = C_q\,\alpha(q^{-1}BA^{-1})\,F_q(q^2AB^{-1})^*. \tag{3.19} \]
Moreover the pair $(A, q^{-1}BA^{-1})$ belongs to $D_H$, so by Theorem 3.2(3)
\[ \alpha(q^{-1}BA^{-1})^* A\,\alpha(q^{-1}BA^{-1}) = B. \tag{3.20} \]
Inserting (3.19) into (3.9) and using (3.20) we obtain (3.10).

Ad. (3) Let $T = B^{-1}A$. By Corollary 3.3 the pairs $(T, B), (T, A) \in D_H$. Then applying the transformation $m \mapsto F_q(B^{-1}A)\,m\,F_q(B^{-1}A)^*$ to the pair $(T, B)$ we get $(T, A \,\dot{+}\, B) \in D_H$ by (3.10). Inserting this last pair in place of $(A, B)$ in (3.8) we obtain
\[ F_q(A \,\dot{+}\, B)^*\, T\, F_q(A \,\dot{+}\, B) = T \,\dot{+}\, (A \,\dot{+}\, B)T, \tag{3.21} \]
while, since $(T, B), (T, A) \in D_H$ and $A$ strongly commutes with $BT$, we have
\[
\begin{aligned}
F_q(A)^*F_q(B)^*\, T\, F_q(B)F_q(A)
 &= F_q(A)^*\bigl(T \,\dot{+}\, BT\bigr)F_q(A) \\
 &= F_q(A)^*\, T\, F_q(A) \,\dot{+}\, BT = (T \,\dot{+}\, AT) \,\dot{+}\, BT.
\end{aligned}
\tag{3.22}
\]
Notice that $T$ strongly commutes with $BA^{-1} = q^2T^{-1}$ and that $(AT, BA^{-1}) \in D_H$ (as $(T, A) \in D_H$ implies $(T, AT) \in D_H$ and consequently $(AT, BA^{-1}) = (AT, q^2T^{-1}) \in D_H$). Therefore by (3.9) and (3.8)
\[
\begin{aligned}
(A \,\dot{+}\, B)T &= F_q(BA^{-1})^* A F_q(BA^{-1})\,T \\
 &= F_q(BA^{-1})^* AT F_q(BA^{-1}) \\
 &= \overline{AT + BA^{-1} \circ AT} = AT \,\dot{+}\, BT.
\end{aligned}
\tag{3.23}
\]
In view of Eq. (3.23) the right-hand side of (3.21) is equal to $T \,\dot{+}\, (AT \,\dot{+}\, BT)$. Moreover the operators on the right-hand sides of (3.21) and (3.22) coincide on the subspace $D(T) \cap D(AT) \cap D(BT)$.
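The step from (3.9) to (3.10) in the proof of Statement (2) is stated briefly, so here is one way to spell it out. This is a formal sketch with domain questions ignored; it assumes $|C_q| = 1$ (which is forced if $F_q$ and $\alpha$ take values of modulus one), and the abbreviations $X$ and $Y$ are introduced only for this computation.

```latex
% Sketch: from (3.9) to (3.10) via (3.19) and (3.20).
% X := q^{-1} B A^{-1},  Y := q^{2} A B^{-1}  (both are functions of B A^{-1}).
\begin{align*}
F_q(BA^{-1})^*\, A\, F_q(BA^{-1})
  &= \overline{C_q}\,F_q(Y)\,\alpha(X)^*\; A\; C_q\,\alpha(X)\,F_q(Y)^* \\
  &= |C_q|^2\, F_q(Y)\,\bigl(\alpha(X)^* A\,\alpha(X)\bigr)\,F_q(Y)^* \\
  &= F_q(Y)\, B\, F_q(Y)^* .
\end{align*}
% Identification of Y, using AB = q^2 BA from (3.3):
\begin{equation*}
B^{-1}AB = q^{2}A
  \quad\Longrightarrow\quad
  B^{-1}A = q^{2}AB^{-1} = Y .
\end{equation*}
```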
We shall need the following:

Lemma 3.7. $D(T) \cap D(AT) \cap D(BT)$ is a core for $T \,\dot{+}\, (A \,\dot{+}\, B)T$.

From this lemma we immediately see that $T \,\dot{+}\, (A \,\dot{+}\, B)T \subset (T \,\dot{+}\, AT) \,\dot{+}\, BT$ and since both these operators are normal, we have $T \,\dot{+}\, (A \,\dot{+}\, B)T = (T \,\dot{+}\, AT) \,\dot{+}\, BT$. It follows now from formulas (3.21) and (3.22) that the operator $F_q(B)F_q(A)F_q(A \,\dot{+}\, B)^*$ commutes with $T$ and thus with $|T|^{it}$ for all $t \in \mathbb{R}$:
\[ F_q(B)F_q(A)F_q(A \,\dot{+}\, B)^* = |T|^{it}F_q(B)F_q(A)F_q(A \,\dot{+}\, B)^*|T|^{-it}. \]
On the other hand, since $(T, B)$, $(T, A)$ and $(T, A \,\dot{+}\, B)$ belong to $D_H$, we have (cf. (3.2))
\[
\begin{aligned}
|T|^{it}B|T|^{-it} &= q^{it}B, \\
|T|^{it}A|T|^{-it} &= q^{it}A, \\
|T|^{it}(A \,\dot{+}\, B)|T|^{-it} &= q^{it}(A \,\dot{+}\, B).
\end{aligned}
\]
Consequently
\[ F_q(B)F_q(A)F_q(A \,\dot{+}\, B)^* = F_q(q^{it}B)F_q(q^{it}A)F_q\bigl(q^{it}(A \,\dot{+}\, B)\bigr)^* \tag{3.24} \]
for all $t \in \mathbb{R}$. The right-hand side converges strongly to $I$ as $t \to -\infty$ while the left-hand side is independent of $t$. Therefore $F_q(B)F_q(A)F_q(A \,\dot{+}\, B)^* = I$ and (3.11) follows.

The formula (3.11) justifies the name quantum exponential function used in Subsec. 2.3.

Proof of Lemma 3.7. We shall first prove that $D(T^2) \cap D\bigl((A \,\dot{+}\, B)T\bigr)$ is a core for $T \,\dot{+}\, (A \,\dot{+}\, B)T$.
Since $(T, A \,\dot{+}\, B) \in D_H$ we have $|T| \overset{\hbar}{\multimap} |A \,\dot{+}\, B|$ with $\hbar = -\operatorname{Im}\rho^{-1}$. For $\tau > 0$ define $f_\tau : \Omega^+_{-\operatorname{Im}\rho^{-1}} \to \mathbb{C}$ by
\[
f_\tau(z) =
\begin{cases}
e^{-\tau(\log z)^2} & \text{for } z \neq 0, \\
0 & \text{for } z = 0.
\end{cases}
\]
Then $f_\tau$ is a bounded and continuous function on $\Omega^+_{-\operatorname{Im}\rho^{-1}}$ which is holomorphic in the interior of this set. Moreover $f_\tau$ converges almost uniformly on $\Omega^+_{-\operatorname{Im}\rho^{-1}}$ to the constant function equal to 1, as $\tau \searrow 0$. By [16, Theorem 3.1(2)] we have
\[ f_\tau\bigl(e^{-i\operatorname{Im}\rho^{-1}}|T|\bigr)\,|A \,\dot{+}\, B| \subset |A \,\dot{+}\, B|\,f_\tau(|T|), \]
and it follows that
\[ f_\tau\bigl(e^{-i\operatorname{Im}\rho^{-1}}|T|\bigr)\,|A \,\dot{+}\, B|\,T \subset |A \,\dot{+}\, B|\,T f_\tau(|T|). \]
Multiplying both sides of this equation from the left by $\operatorname{Phase}(A \,\dot{+}\, B)$ and using relations (3.2) (with $(T, A \,\dot{+}\, B)$ in place of $(A, B)$) we obtain
\[ f_\tau(q^{-1}|T|)\,(A \,\dot{+}\, B)T \subset (A \,\dot{+}\, B)T f_\tau(|T|). \]
In particular for any $x \in D\bigl((A \,\dot{+}\, B)T\bigr)$ we have $f_\tau(|T|)x \in D\bigl((A \,\dot{+}\, B)T\bigr)$. Also, as the function $z \mapsto |z|^2 f_\tau(|z|)$ is bounded, the vector $f_\tau(|T|)x \in D(T^2)$. By the almost uniform convergence of $(f_\tau)_{\tau>0}$ the net $f_\tau(|T|)x \xrightarrow[\tau \searrow 0]{} x$ in the graph topology of $T \,\dot{+}\, (A \,\dot{+}\, B)T$. Indeed:
\[
\begin{aligned}
\bigl(T \,\dot{+}\, (A \,\dot{+}\, B)T\bigr)f_\tau(|T|)x
 &= T f_\tau(|T|)x + (A \,\dot{+}\, B)T f_\tau(|T|)x \\
 &= f_\tau(|T|)Tx + f_\tau(q^{-1}|T|)(A \,\dot{+}\, B)Tx
\end{aligned}
\]
and the right-hand side converges to $Tx + (A \,\dot{+}\, B)Tx = \bigl(T \,\dot{+}\, (A \,\dot{+}\, B)T\bigr)x$. We have therefore proved that $D(T^2) \cap D\bigl((A \,\dot{+}\, B)T\bigr)$ is a core for $T \,\dot{+}\, (A \,\dot{+}\, B)T$.

Now we shall use the fact stated in Remark 3.6 with $(A_0, B_0) = (AT, BA^{-1})$. Let $x \in D(AT \,\dot{+}\, BT) = D\bigl(\overline{AT + BA^{-1} \circ AT}\bigr)$. We have $R_t x \in D(AT) \cap D(BT)$ and $R_t x$ converges to $x$ in the graph topology of $AT \,\dot{+}\, BT$, as $t \nearrow 1$.
Since $BA^{-1} = q^2T^{-1}$, by (3.18) we see that
\[ R_t^*|T|R_t = |T|^{t^{-1}} \tag{3.25} \]
for all $t \in \mathbb{R}_+$. Assume now that $\tfrac{1}{2} \le t < 1$ and that $x \in D(T^2)$. We have $x \in D(|T|^{t^{-1}})$, because $t^{-1} \le 2$. Therefore by (3.25) the vectors $R_t x \in D(T)$. Now rewriting (3.25) as
\[ |T|R_t x = R_t|T|^{t^{-1}}x, \]
shows that $R_t x \xrightarrow[t \nearrow 1]{} x$ in the graph topology of $T$.
This way we showed that if $x \in D(T^2) \cap D\bigl((A \,\dot{+}\, B)T\bigr)$ then $R_t x \in D(T) \cap D(AT) \cap D(BT)$ provided that $\tfrac{1}{2} \le t < 1$, and moreover $R_t x \xrightarrow[t \nearrow 1]{} x$ in the graph topologies of $T$ and $AT \,\dot{+}\, BT$. It follows that $R_t x$ converges to $x$ in the graph topology of $T \,\dot{+}\, (AT \,\dot{+}\, BT) = T \,\dot{+}\, (A \,\dot{+}\, B)T$.
Now as we have established that $D(T^2) \cap D\bigl((A \,\dot{+}\, B)T\bigr)$ is a core for $T \,\dot{+}\, (A \,\dot{+}\, B)T$, it follows that so is $D(T) \cap D(AT) \cap D(BT)$.

Corollary 3.8. Let $H$ be a Hilbert space and let $(A, B) \in D_H$. Then
\[ F_q(BA) = F_q(B)^*F_q(A)F_q(B)F_q(A)^*. \tag{3.26} \]
Proof. Applying $F_q$ to both sides of (3.8) we obtain
\[ F_q(A \,\dot{+}\, BA) = F_q(B)^*F_q(A)F_q(B). \]
On the other hand since $(A, BA) \in D_H$, by (3.11) we have
\[ F_q(A \,\dot{+}\, BA) = F_q(BA)F_q(A). \]
Comparing these formulas yields (3.26).
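For the reader who wants the comparison in the proof of Corollary 3.8 written out, here is a short sketch. It uses that functional calculus commutes with unitary conjugation and that $F_q(A)$ and $F_q(B)$ are unitary; writing $\dot{+}$ for the closure of the sum and $U$ for $F_q(B)$ is notation chosen for this check.

```latex
% Sketch: the two expressions for F_q(A \dot{+} BA) and their comparison.
\begin{align*}
F_q(A \,\dot{+}\, BA)
  &= F_q\bigl(U^* A\, U\bigr) = U^* F_q(A)\, U
   = F_q(B)^* F_q(A) F_q(B)
  && \text{by (3.8), with } U = F_q(B), \\
F_q(A \,\dot{+}\, BA)
  &= F_q(BA)\,F_q(A)
  && \text{by (3.11) for the pair } (A, BA).
\end{align*}
% Equate the right-hand sides and multiply from the right by F_q(A)^*:
\begin{equation*}
F_q(BA) = F_q(B)^* F_q(A) F_q(B) F_q(A)^* .
\end{equation*}
```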
3.4. Necessity of the spectral condition

In this subsection we would like to present another aspect of analysis of the commutation relations we have considered so far. This aspect is not relevant to the construction of new quantum "az + b" groups, so we have decided to present it without proof. We refer to [7, Sec. 6.5] for details (cf. also [12, Sec. 2]).
One can consider pairs $(A, B)$ of normal operators on a Hilbert space $H$ satisfying more general commutation relations than those implied by the definition of the operator domain $D$ (we consider spectral conditions as part of the commutation relations). Namely one can ask only that $(A, B)$ satisfy the relations (3.2). Then it is easy to show that the spectra of $A$ and $B$ are contained in closures of unions of orbits of the group $\Gamma_q$ in $\mathbb{C}$. If we assume that $(A, B)$ is irreducible (the only projections commuting with $\operatorname{Phase}(A)$, $\operatorname{Phase}(B)$, $|A|^{it}$ and $|B|^{it}$ for all $t \in \mathbb{R}$ are 0 and $I$) then the spectra are precisely equal to closures of single orbits. Then multiplying both operators by a non-zero scalar we can suppose that $\operatorname{Sp} A = \bar\Gamma_q$. The spectrum of $B$ will coincide with $\lambda\bar\Gamma_q$ for some non-zero $\lambda \in \mathbb{C}$.
It is clear that in the above situation $(A, B) \in D_H$ if and only if $\lambda \in \Gamma_q$ or equivalently $\operatorname{Sp} B = \bar\Gamma_q$. It turns out that this condition is equivalent to the conclusion of Theorem 3.5. More precisely we have:

Theorem 3.9 ([7, Twierdzenie 6.28]). Let $H$ be a Hilbert space and let $(A, B)$ be an irreducible pair of normal operators acting on $H$ satisfying relations (3.2). Assume that $\operatorname{Sp} A = \bar\Gamma_q$. Then the following conditions are equivalent:
(1) the operator $A + B$ has a normal extension,
(2) the operator $A \,\dot{+}\, B$ is normal,
(3) $(A, B) \in D_H$.

With some more effort one can get rid of the assumption of irreducibility. This theorem says that the spectral conditions imposed on pairs $(A, B) \in D_H$ are necessary for the sum $A \,\dot{+}\, B$ to have the same analytic properties as $A$ and $B$.

4. Affiliation Relation

In this section we shall deal with the affiliation relation for C∗-algebras investigated in [11] and its relationship with the special functions investigated in Sec. 2. We shall use the notion of a C∗-algebra generated by a finite family of affiliated elements as well as a C∗-algebra generated by a quantum family of affiliated elements. If $B$ is a C∗-algebra and $T_1, \dots, T_N$ are elements affiliated with $B$ then we say that $B$ is generated by $T_1, \dots, T_N$ if for any Hilbert space $H$ and any representation $\pi$ of $B$ and any non-degenerate C∗-subalgebra $A$ of $B(H)$ the condition that $\pi(T_1), \dots, \pi(T_N)\ \eta\ A$ implies that $\pi \in \operatorname{Mor}(B, A)$. More generally if $C$ is another C∗-algebra and $T\ \eta\ C \otimes B$ is an element such that for any Hilbert space $H$ and any representation $\pi$ of $B$ and any non-degenerate C∗-subalgebra $A$ of $B(H)$ the condition that $(\mathrm{id} \otimes \pi)T\ \eta\ C \otimes A$ implies that $\pi \in \operatorname{Mor}(B, A)$ then we say that $T$
is a quantum family of affiliated elements generating $B$. The algebra $C$ plays the role of the algebra of functions on the space parameterizing the family $T$. We refer to [11] and [14] for a detailed exposition of these topics.
A very simple but useful lemma presented below uses the interplay between the different concepts of a C∗-algebra generated by affiliated elements.

Lemma 4.1. Let $H$ be a Hilbert space and let $A$ be a non-degenerate C∗-subalgebra of $B(H)$. Let $C$ and $B$ be C∗-algebras and let $F \in M(C \otimes B)$ be a quantum family of affiliated elements generating $B$. Let $\pi \in \operatorname{Rep}(B, H)$ and let $R_1, \dots, R_N$ be elements affiliated with $B$ such that $R_1, \dots, R_N$ generate $B$. Define $T_k = \pi(R_k)$ for $k = 1, \dots, N$. Then
\[ (\mathrm{id}_C \otimes \pi)F \in M(C \otimes A) \iff \bigl( T_k\ \eta\ A \text{ for } k = 1, \dots, N \bigr). \]
Proof. "⇒": Since $F$ generates $B$, the condition that $(\mathrm{id}_C \otimes \pi)F \in M(C \otimes A)$ implies that $\pi \in \operatorname{Mor}(B, A)$. Therefore for $k \in \{1, \dots, N\}$ we have $T_k = \pi(R_k)\ \eta\ A$.
"⇐": As the operators $T_k = \pi(R_k)$ are affiliated with $A$ and $\{R_1, \dots, R_N\}$ generate $B$, the representation $\pi$ is a morphism from $B$ to $A$. Consequently $(\mathrm{id}_C \otimes \pi) \in \operatorname{Mor}(C \otimes B, C \otimes A)$ and $(\mathrm{id}_C \otimes \pi)F \in M(C \otimes A)$.

4.1. Generators of some C∗-algebras

Proposition 4.2. Let $F$ be the element of $M\bigl(C_\infty(\bar\Gamma_q) \otimes C_\infty(\bar\Gamma_q)\bigr) = C_{\mathrm{bounded}}(\bar\Gamma_q \times \bar\Gamma_q)$ given by
\[ F(\gamma, \gamma') = F_q(\gamma\gamma'). \]
Then $F$ is a quantum family of elements generating $C_\infty(\bar\Gamma_q)$.

Proof. Since $F \in M\bigl(C_\infty(\bar\Gamma_q) \otimes C_\infty(\bar\Gamma_q)\bigr)$, the map
\[ \bar\Gamma_q \ni \gamma \mapsto F(\gamma, \cdot) \in M\bigl(C_\infty(\bar\Gamma_q)\bigr) \]
is strictly continuous (cf. [14, Sec. 2]). In particular for any $\varphi \in L^1(\Gamma_q)$ we can consider the integral
\[ \int_{\Gamma_q} F(\gamma, \cdot)\varphi(\gamma)\,dh(\gamma) \in M\bigl(C_\infty(\bar\Gamma_q)\bigr). \]
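Using the asymptotic behavior $F_q(\gamma) \approx \alpha(q^{-1}\gamma)$ it is easy to show that the function
\[ F_\varphi : \bar\Gamma_q \ni \gamma' \mapsto \int_{\Gamma_q} F(\gamma, \gamma')\varphi(\gamma)\,dh(\gamma) \tag{4.1} \]
belongs to $C_\infty(\bar\Gamma_q)$. We shall show that the family of functions $\{F(\gamma, \cdot)\}_{\gamma\in\bar\Gamma_q}$ separates points of $\bar\Gamma_q$. Indeed, suppose that for some $\gamma_1, \gamma_2 \in \bar\Gamma_q$ we have $F(\gamma, \gamma_1) = F(\gamma, \gamma_2)$ for all $\gamma \in \bar\Gamma_q$. This means that
\[ F_q(\gamma\gamma_1) = F_q(\gamma\gamma_2) \tag{4.2} \]
for all $\gamma \in \bar\Gamma_q$. In particular for $\gamma = qq^{it}$ ($t \in \mathbb{R}$) we obtain
\[ F_q[q\gamma_1](t) = F_q(qq^{it}\gamma_1) = F_q(qq^{it}\gamma_2) = F_q[q\gamma_2](t) \]
for all $t \in \mathbb{R}$. Performing holomorphic continuation to $t = -i$ and using (2.6) we get
\[ (1 + \gamma_1)F_q(\gamma_1) = (1 + \gamma_2)F_q(\gamma_2), \]
which by (4.2) means that $\gamma_1 = \gamma_2$.
It follows that the family of functions $\{F_\varphi : \varphi \in L^1(\Gamma_q)\}$ also separates points of $\bar\Gamma_q$ (e.g. by considering integrable functions approximating measures concentrated on single points of $\bar\Gamma_q$). By the Stone–Weierstrass theorem applied to the one-point compactification of $\bar\Gamma_q$ the ∗-algebra generated by functions of the form (4.1) is dense in $C_\infty(\bar\Gamma_q)$.
Now let $H$ be a Hilbert space and let $\pi \in \operatorname{Rep}\bigl(C_\infty(\bar\Gamma_q), H\bigr)$. Assume that for some non-degenerate C∗-subalgebra $B \subset B(H)$ we have
\[ (\mathrm{id} \otimes \pi)F \in M\bigl(C_\infty(\bar\Gamma_q) \otimes B\bigr), \]
i.e. $(\mathrm{id} \otimes \pi)F$ is a strictly continuous function on $\bar\Gamma_q$ with values in $M(B)$. This function acts in the following way:
\[ \bar\Gamma_q \ni \gamma \mapsto \pi\bigl(F(\gamma, \cdot)\bigr) \in M(B). \]
For $\varphi \in L^1(\Gamma_q)$ let us denote the functional
\[ C_\infty(\bar\Gamma_q) \ni f \mapsto \int_{\Gamma_q} f(\gamma)\varphi(\gamma)\,dh(\gamma) \]
by $\omega_\varphi$. Then
\[ \pi(F_\varphi) = \pi\bigl((\omega_\varphi \otimes \mathrm{id})F\bigr) = (\omega_\varphi \otimes \mathrm{id})\bigl((\mathrm{id} \otimes \pi)F\bigr) \in M(B). \]
Since the ∗-algebra generated by elements of $\{F_\varphi : \varphi \in L^1(\Gamma_q)\}$ is dense in $C_\infty(\bar\Gamma_q)$, we see that
\[ \pi\bigl(C_\infty(\bar\Gamma_q)\bigr) \subset M(B). \]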
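The point-separation step earlier in this proof uses (4.2) twice, which is easy to miss. The following sketch spells it out under the assumption that $F_q$ takes values of modulus one on $\bar\Gamma_q$ (so that it never vanishes there) and that $1 \in \Gamma_q$, which holds because $\Gamma_q$ is a group.

```latex
% Sketch: why (1+\gamma_1)F_q(\gamma_1) = (1+\gamma_2)F_q(\gamma_2) forces \gamma_1 = \gamma_2.
\begin{align*}
\text{(4.2) with } \gamma = 1:&\qquad F_q(\gamma_1) = F_q(\gamma_2) \neq 0, \\
\text{continuation to } t = -i:&\qquad (1+\gamma_1)\,F_q(\gamma_1) = (1+\gamma_2)\,F_q(\gamma_2), \\
\text{division by } F_q(\gamma_1):&\qquad 1 + \gamma_1 = 1 + \gamma_2
  \;\Longrightarrow\; \gamma_1 = \gamma_2 .
\end{align*}
```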
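It remains to show that the set $\pi\bigl(C_\infty(\bar\Gamma_q)\bigr)B$ is linearly dense in $B$. We shall use the fact that $F$ is a unitary element of $M\bigl(C_\infty(\bar\Gamma_q) \otimes C_\infty(\bar\Gamma_q)\bigr)$. It implies that $(\mathrm{id} \otimes \pi)F$ is a unitary element of $M\bigl(C_\infty(\bar\Gamma_q) \otimes B\bigr)$. In particular the set
\[ \bigl\{ (\mathrm{id} \otimes \pi)F\,(f \otimes x) : f \in C_\infty(\bar\Gamma_q),\ x \in B \bigr\} \tag{4.3} \]
is linearly dense in $C_\infty(\bar\Gamma_q) \otimes B$. Notice further that for any functional $\omega \in C_\infty(\bar\Gamma_q)^*$ we have
\[ (\omega \otimes \mathrm{id})\bigl((\mathrm{id} \otimes \pi)F\,(f \otimes x)\bigr) = \pi\bigl((\omega \otimes \mathrm{id})\bigl(F(f \otimes I)\bigr)\bigr)x. \]
In particular for $\varphi \in L^1(\Gamma_q)$
\[ (\omega_\varphi \otimes \mathrm{id})\bigl((\mathrm{id} \otimes \pi)F\,(f \otimes x)\bigr) = \pi\bigl((\omega_{f\varphi} \otimes \mathrm{id})F\bigr)x \in \pi\bigl(C_\infty(\bar\Gamma_q)\bigr)B. \]
For a given $y \in B$ consider $g \in C_\infty(\bar\Gamma_q)$ and $\varphi \in L^1(\Gamma_q)$ such that $\omega_\varphi(g) = 1$. Choose a sequence $(q_n)_{n\in\mathbb{N}}$ of linear combinations of elements of (4.3) converging to $g \otimes y$. Then the sequence of elements
\[ (\omega_\varphi \otimes \mathrm{id})(q_n) \in \pi\bigl(C_\infty(\bar\Gamma_q)\bigr)B \]
converges to $y$. This shows that $\pi \in \operatorname{Mor}\bigl(C_\infty(\bar\Gamma_q), B\bigr)$.

Using similar methods or appealing to the theory of multiplicative unitary operators (e.g. [15, Theorem 1.6(6)]) one can also prove:

Proposition 4.3. Let $F$ be the element of $M\bigl(C_\infty(\Gamma_q) \otimes C_\infty(\Gamma_q)\bigr) = C_{\mathrm{bounded}}(\Gamma_q \times \Gamma_q)$ given by
\[ F(\gamma, \gamma') = \chi(\gamma, \gamma'). \]
Then $F$ is a quantum family of elements generating $C_\infty(\Gamma_q)$.

We can apply Lemma 4.1 in the two situations described below.
(a) Let $H$ be a Hilbert space and let $T$ be a normal operator acting on $H$ such that $\operatorname{Sp} T \subset \bar\Gamma_q$. Set $B = C = C_\infty(\bar\Gamma_q)$ and let $F \in M(C \otimes B) = C_{\mathrm{bounded}}(\bar\Gamma_q \times \bar\Gamma_q)$ be given by
\[ F(\gamma, \gamma') = F_q(\gamma\gamma'). \]
Then $F$ generates $B$. The algebra $B$ is also generated by a single affiliated element $R_1$ given by $R_1(\gamma) = \gamma$. Further define the representation $\pi$ by $\pi(f) = f(T)$ for $f \in B$.
(b) Let $H$ be a Hilbert space and let $T$ be a normal operator acting on $H$ such that $\operatorname{Sp} T \subset \bar\Gamma_q$ and $\ker T = \{0\}$. Set $B = C = C_\infty(\Gamma_q)$ and let $F \in M(C \otimes B) = C_{\mathrm{bounded}}(\Gamma_q \times \Gamma_q)$ be given by
\[ F(\gamma, \gamma') = \chi(\gamma, \gamma'). \]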
Then $F$ generates $B$. The algebra $B$ is also generated by two affiliated elements $R_1$ and $R_2$ given by
\[ R_1(\gamma) = \gamma, \qquad R_2(\gamma) = \gamma^{-1}. \]
Further define a representation $\pi$ of $B$ by $\pi(f) = f(T)$ for $f \in B$.

Recall that for $C = C_\infty(\Lambda)$ and any C∗-algebra $B$ the multiplier algebra $M(C \otimes B)$ is canonically isomorphic to the algebra of strictly continuous $M(B)$-valued functions on $\Lambda$. Thus the two cases (a) and (b) yield the following theorems:

Theorem 4.4. Let $H$ be a Hilbert space, $T$ a normal operator acting on $H$ such that $\operatorname{Sp} T \subset \bar\Gamma_q$ and let $A \subset B(H)$ be a non-degenerate C∗-subalgebra. Then the following conditions are equivalent:
(1) for any $\gamma \in \bar\Gamma_q$ the unitary operator $F_q(\gamma T)$ belongs to $M(A)$ and the map
\[ \bar\Gamma_q \ni \gamma \mapsto F_q(\gamma T) \in M(A) \]
is strictly continuous;
(2) the operator $T$ is affiliated with $A$.

Theorem 4.5. Let $H$ be a Hilbert space, $T$ a normal operator acting on $H$ such that $\operatorname{Sp} T \subset \bar\Gamma_q$ and $\ker T = \{0\}$. Let $A \subset B(H)$ be a non-degenerate C∗-subalgebra. Then the following conditions are equivalent:
(1) for any $\gamma \in \Gamma_q$ the unitary operator $\chi(\gamma, T)$ belongs to $M(A)$ and the map
\[ \Gamma_q \ni \gamma \mapsto \chi(\gamma, T) \in M(A) \]
is strictly continuous;
(2) operators $T$ and $T^{-1}$ are affiliated with $A$.

4.2. Algebraic consequences

Theorem 4.6. Let $H$ be a Hilbert space, $A \subset B(H)$ a non-degenerate C∗-subalgebra and let $(A, B) \in D_H$ be such that $A, B\ \eta\ A$. Then
(1) the operator $A \,\dot{+}\, B$ is affiliated with $A$,
(2) the operator $BA$ is affiliated with $A$.

Proof. Ad. (1) By Corollary 3.3(1) for any $\gamma \in \Gamma_q$ we have $(\gamma A, \gamma B) \in D_H$ and thus by (3.11)
\[ F_q\bigl(\gamma(A \,\dot{+}\, B)\bigr) = F_q(\gamma B)F_q(\gamma A). \]
Now the result follows by Theorem 4.4.
Ad. (2) We apply the same method as in the proof of (1) and use (3.26) and Theorem 4.5.

We end this section with the following useful proposition.
Proposition 4.7. Let $X$ be a closed subset of $\mathbb{C} \setminus \{0\}$. Let $H$ be a Hilbert space and let $T_1, T_2$ be strongly commuting normal operators acting on $H$ such that $\operatorname{Sp} T_1, \operatorname{Sp} T_2 \subset \bar{X}$ and $\ker T_1 = \ker T_2 = \{0\}$. Let $A$ be a non-degenerate C∗-subalgebra of $B(H)$ and assume that $T_1$, $T_2$, $T_1^{-1}$ and $T_2^{-1}$ are affiliated with $A$. Then for any $f \in C_{\mathrm{bounded}}(X \times X)$ we have $f(T_1, T_2) \in M(A)$.

The proof of this proposition is identical to that of [17, Proposition 5.3] (cf. also [7]). We have formulated it in a way which gets rid of unnecessary assumptions about the shape of $X$. We shall use this proposition with $X = \Gamma_q$.

5. Multiplicative Unitary and Its Properties

Most results of this section are direct analogs of those presented in [17, Secs. 3–7]. Also most proofs are identical. Therefore we shall omit some of the proofs. It needs to be stressed that this analogy stems from the fact that the formulas arising in the study of commutation relations described in Subsec. 3.1 are in most cases identical to those found in [17]. However the meaning of these formulas is different, as the commutation relations discussed in both papers are different. We shall include the proofs of the results relying on the theorems proved in this paper.

5.1. The quantum group space

As in [17] we shall begin the construction of our quantum "az + b" group with the definition of the operator domain $G$ playing the role of the quantum space of our quantum group.
Let $H$ be a Hilbert space. By $G_H$ we shall denote the set of pairs $(a, b)$ of closed operators on $H$ satisfying
(1) $a$ and $b$ are normal,
(2) $\operatorname{Sp} a, \operatorname{Sp} b \subset \bar\Gamma_q$,
(3) $\ker a = \{0\}$,
(4) $\chi(\gamma, a)\,b\,\chi(\gamma, a)^* = \gamma b$ for all $\gamma \in \Gamma_q$.
It follows that $a$ preserves the decomposition $H = \ker b \oplus (\ker b)^{\perp}$ and $(a, b) \in G_H$ if and only if $a_0 = a|_{\ker b}$ satisfies $\ker a_0 = \{0\}$, $\operatorname{Sp} a_0 \subset \bar\Gamma_q$ and the pair $(A, B) = \bigl(a|_{(\ker b)^{\perp}}, b|_{(\ker b)^{\perp}}\bigr)$ belongs to $D_{(\ker b)^{\perp}}$.
The operator domain $G$ will serve as the quantum space of our quantum group in the sense that to each representation of the algebra of continuous functions vanishing at infinity on the quantum group on a Hilbert space $H$ there will correspond a unique pair $(a, b) \in G_H$. As described in [2, 10] there is a bijective correspondence between operator domains and C∗-algebras. This is why we call $G$ a quantum space. The C∗-algebra of functions on this quantum space is the universal C∗-algebra encoding the commutation relations defining $G$. One easily finds that this must be
the C∗-algebra crossed product $C_\infty(\bar\Gamma_q) \rtimes \Gamma_q$ where $\Gamma_q$ acts on $\bar\Gamma_q$ by multiplication (cf. Subsec. 6.1). The reader will notice that the C∗-algebra corresponding to the operator domain $D$ defined in Subsec. 3.1 is the algebra of compact operators.
It should be pointed out that the definition of $G$ is not an essential ingredient of the construction of the new quantum "az + b" groups. In practice we can always choose a faithful representation with $(a, b) \in D_H$ (cf. remarks after Proposition 6.1).

Lemma 5.1. Let $H$ and $K$ be Hilbert spaces and let $(a, b) \in G_H$ and $(\hat a, \hat b) \in G_K$. Then there is the following relation between operators on $K \otimes H$:
\[ \chi(\hat a \otimes I, I \otimes a)(\hat b \otimes I) = (\hat b \otimes a)\,\chi(\hat a \otimes I, I \otimes a). \]
The assertion of Lemma 5.1 follows for example from [14, Formula (2.6)] (cf. also [17, 19]).

5.2. Multiplicativity

Proposition 5.2. Let $H$ be a Hilbert space and let $(a, b) \in G_H$ be such that $\ker b = \{0\}$. Then the unitary operator
\[ W = F_q(b^{-1}a \otimes b)\,\chi(b^{-1} \otimes I, I \otimes a) \tag{5.1} \]
on $H \otimes H$ satisfies
\[
\begin{aligned}
W(a \otimes I)W^* &= a \otimes a, \\
W(b \otimes I)W^* &= a \otimes b \,\dot{+}\, b \otimes I.
\end{aligned}
\tag{5.2}
\]
Proof. It is easy to see that $(b^{-1}, a) \in G_H$. By Lemma 5.1
\[ \chi(b^{-1} \otimes I, I \otimes a)(a \otimes I)\chi(b^{-1} \otimes I, I \otimes a)^* = a \otimes a. \]
Also since $a \otimes a$ strongly commutes with $b^{-1}a \otimes b$, we have
\[
\begin{aligned}
W(a \otimes I)W^*
 &= F_q(b^{-1}a \otimes b)\,\chi(b^{-1} \otimes I, I \otimes a)(a \otimes I)\,\chi(b^{-1} \otimes I, I \otimes a)^*\,F_q(b^{-1}a \otimes b)^* \\
 &= F_q(b^{-1}a \otimes b)(a \otimes a)F_q(b^{-1}a \otimes b)^* = a \otimes a.
\end{aligned}
\]
For the second formula of (5.2) notice that the pair $(A, B) = (a \otimes b, b \otimes I) \in D_{H \otimes H}$. Also $b \otimes I$ commutes with $\chi(b^{-1} \otimes I, I \otimes a)$. Therefore by (3.10)
\[
\begin{aligned}
W(b \otimes I)W^*
 &= F_q(b^{-1}a \otimes b)\,\chi(b^{-1} \otimes I, I \otimes a)(b \otimes I)\,\chi(b^{-1} \otimes I, I \otimes a)^*\,F_q(b^{-1}a \otimes b)^* \\
 &= F_q(b^{-1}a \otimes b)(b \otimes I)F_q(b^{-1}a \otimes b)^* \\
 &= F_q(B^{-1}A)\,B\,F_q(B^{-1}A)^* = A \,\dot{+}\, B \\
 &= a \otimes b \,\dot{+}\, b \otimes I.
\end{aligned}
\]
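The identification used in the last chain of equalities of this proof is a one-line tensor computation. The sketch below is formal (domains are ignored) and writes $\dot{+}$ for the closure of the sum; $A$ and $B$ are the tensor operators named in the proof.

```latex
% Sketch: with A = a \otimes b and B = b \otimes I (as in the proof of Proposition 5.2),
\begin{align*}
B^{-1}A &= (b^{-1}\otimes I)(a\otimes b) = b^{-1}a\otimes b, \\
F_q(b^{-1}a\otimes b)\,(b\otimes I)\,F_q(b^{-1}a\otimes b)^*
  &= F_q(B^{-1}A)\,B\,F_q(B^{-1}A)^* \\
  &= A \,\dot{+}\, B && \text{by (3.10)} \\
  &= a\otimes b \,\dot{+}\, b\otimes I .
\end{align*}
```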
Proposition 5.3. Let H and K be Hilbert spaces and let (a, b) ∈ GH and (ˆ a, ˆb) ∈ GK . Assume that ker b = {0}. Then the operator W defined by (5.1) and a ⊗ I, I ⊗ a) V = Fq (ˆb ⊗ b)χ(ˆ
(5.3)
W23 V12 = V12 V13 W23
(5.4)
satisfy
on K ⊗ H ⊗ H. Proof. Using (5.2) we obtain
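\[ \begin{aligned} W_{23}V_{12}W_{23}^* &= (I \otimes W)\bigl(F_q(\hat b \otimes b) \otimes I\bigr)(I \otimes W)^* \times (I \otimes W)\bigl(\chi(\hat a \otimes I, I \otimes a) \otimes I\bigr)(I \otimes W)^* \\ &= F_q\bigl(\hat b \otimes (a \otimes b \dotplus b \otimes I)\bigr)\chi(\hat a \otimes I \otimes I, I \otimes a \otimes a). \end{aligned} \]
Also
\[ \chi(\hat a \otimes I \otimes I, I \otimes a \otimes a) = \chi(\hat a \otimes I \otimes I, I \otimes a \otimes I)\chi(\hat a \otimes I \otimes I, I \otimes I \otimes a), \]
since $\chi$ is a bicharacter on $\Gamma_q$. We shall consider two cases: $\hat b = 0$ and $\ker \hat b = \{0\}$. Assume first that $\hat b = 0$. Then $V = \chi(\hat a \otimes I, I \otimes a)$ and
\[ W_{23}V_{12}W_{23}^* = \chi(\hat a \otimes I \otimes I, I \otimes a \otimes a) = \chi(\hat a \otimes I \otimes I, I \otimes a \otimes I)\chi(\hat a \otimes I \otimes I, I \otimes I \otimes a) = V_{12}V_{13}. \]
In the other case $(\hat a, \hat b) \in D_K$ and therefore
\[ (A, B) = (\hat b \otimes a \otimes b, \hat b \otimes b \otimes I) \in D_{K \otimes H \otimes H}. \tag{5.5} \]
Lemma 5.1 says that $(\hat b \otimes a)\chi(\hat a \otimes I, I \otimes a) = \chi(\hat a \otimes I, I \otimes a)(\hat b \otimes I)$, and consequently
\[ (\hat b \otimes a \otimes b)\chi(\hat a \otimes I \otimes I, I \otimes a \otimes I) = \chi(\hat a \otimes I \otimes I, I \otimes a \otimes I)(\hat b \otimes I \otimes b). \tag{5.6} \]
Now using (3.11) with the pair (5.5) we get:
\[ F_q\bigl(\hat b \otimes (a \otimes b \dotplus b \otimes I)\bigr) = F_q(\hat b \otimes a \otimes b \dotplus \hat b \otimes b \otimes I) = F_q(\hat b \otimes b \otimes I)F_q(\hat b \otimes a \otimes b). \]
Therefore
\[ \begin{aligned} W_{23}V_{12}W_{23}^* &= F_q\bigl(\hat b \otimes (a \otimes b \dotplus b \otimes I)\bigr)\chi(\hat a \otimes I \otimes I, I \otimes a \otimes a) \\ &= F_q\bigl(\hat b \otimes (a \otimes b \dotplus b \otimes I)\bigr)\chi(\hat a \otimes I \otimes I, I \otimes a \otimes I)\chi(\hat a \otimes I \otimes I, I \otimes I \otimes a) \\ &= F_q(\hat b \otimes b \otimes I)F_q(\hat b \otimes a \otimes b)\chi(\hat a \otimes I \otimes I, I \otimes a \otimes I)\chi(\hat a \otimes I \otimes I, I \otimes I \otimes a) \end{aligned} \]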
\[ = F_q(\hat b \otimes b \otimes I)\chi(\hat a \otimes I \otimes I, I \otimes a \otimes I)F_q(\hat b \otimes I \otimes b)\chi(\hat a \otimes I \otimes I, I \otimes I \otimes a) = V_{12}V_{13}, \]
where in the second last equality we used (5.6). Now for general $\hat b$ we split the Hilbert space $K \otimes H \otimes H$ into a direct sum of $(\ker \hat b) \otimes H \otimes H$ and $(\ker \hat b)^\perp \otimes H \otimes H$ and derive (5.4) separately for each summand.

Setting $K = H$ and $(\hat a, \hat b) = (b^{-1}, b^{-1}a)$ in Proposition 5.3 we obtain $V = W$ and thus we prove Statement (1) of the following corollary:

Corollary 5.4. Let $H$ be a Hilbert space and let $(a, b) \in G_H$ be such that $\ker b = \{0\}$. Then
(1) the operator $W$ defined by (5.1) is a multiplicative unitary,
(2) for any Hilbert space $K$ and any $(\hat a, \hat b) \in G_K$ the operator $V$ defined by (5.3) is a unitary adapted to $W$.
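For orientation, the two relations behind Corollary 5.4 can be written side by side. The block below is only a restatement, under the usual convention (assumed here, since it is not spelled out in this section) that a unitary $W$ on $H \otimes H$ is called multiplicative when it satisfies the pentagon equation, and that $V$ is adapted to $W$ when (5.4) holds.

```latex
% Pentagon equation on H \otimes H \otimes H (multiplicativity of W):
\[ W_{23} W_{12} = W_{12} W_{13} W_{23}. \]
% Relation (5.4) on K \otimes H \otimes H (V adapted to W):
\[ W_{23} V_{12} = V_{12} V_{13} W_{23}. \]
% Substituting K = H and (\hat a, \hat b) = (b^{-1}, b^{-1} a) into (5.3):
\[ V = F_q(\hat b \otimes b)\,\chi(\hat a \otimes I, I \otimes a)
     = F_q(b^{-1} a \otimes b)\,\chi(b^{-1} \otimes I, I \otimes a) = W, \]
% so (5.4) with V = W is exactly the pentagon equation for W.
```

With this substitution Statement (1) is a special case of Statement (2), which is how the text derives it from Proposition 5.3.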
5.3. Modularity

In this subsection we shall show that the multiplicative unitary operator given by (5.1) is modular (cf. [8, Definition 2.1]). In what follows we shall need the partial transposition formula (cf. [17, Formula (3.15)] or [7, Lemat 7.7]). If $H$ is a Hilbert space then the complex conjugate space $\overline{H}$ is defined as the set $\{\bar x : x \in H\}$ with operations of addition and multiplication by scalars given by $\bar x + \bar y = \overline{x + y}$ and $\lambda \bar x = \overline{\bar\lambda x}$ for $x, y \in H$ and $\lambda \in \mathbb{C}$. The Hilbert space structure on $\overline{H}$ is given by $(\bar x | \bar y) = (y | x)$. Then for any closed operator $T$ on $H$ we can define the transpose of $T$ as the operator $T^\top$ on $\overline{H}$ such that $D(T^\top) = \{\bar x : x \in D(T^*)\}$ and $T^\top \bar x = \overline{T^* x}$ for any $\bar x \in D(T^\top)$. In what follows we shall use the following fact: let $H$ and $K$ be Hilbert spaces and let $a$ and $\hat a$ be normal operators acting on $H$ and $K$ respectively. Then for any bounded Borel function $f$ on $\operatorname{Sp} \hat a \times \operatorname{Sp} a$ and all $z, x \in K$, $u, y \in H$ we have
\[ (\bar z \otimes u \,|\, f(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes y) = (x \otimes u \,|\, f(\hat a \otimes I, I \otimes a) \,|\, z \otimes y). \tag{5.7} \]
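Proposition 5.5. Let $H$ and $K$ be Hilbert spaces and let $(a, b) \in G_H$, $(\hat a, \hat b) \in G_K$. Assume also that $\ker b = \{0\}$ and $\ker \hat b = \{0\}$. Further denote $Q = |a|$ and let $x, z \in K$, $u \in D(Q)$ and $y \in D(Q^{-1})$. Define functions $\varphi, \psi \colon \Gamma_q \to \mathbb{C}$ as
\[ \begin{aligned} \varphi(\gamma) &= (x \otimes u \,|\, \chi(\hat b \otimes b, \gamma)\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes y), \\ \psi(\gamma) &= (\bar z \otimes Qu \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes Q^{-1}y). \end{aligned} \tag{5.8} \]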
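Then
(1) $\psi(\gamma) = \alpha(\gamma)|\gamma|\chi(-1, \gamma)\varphi(\gamma)$,
(2) if $x \in D(\hat b^{\pm 1})$, $u \in D(b^{\pm 1} \circ Q^{\pm 2})$ and $y \in D(Q^{\pm 2})$ for all possible combinations of signs then $\varphi$ and $\psi$ belong to the space $S(\Gamma_q)$.

Proof. Ad. (1) Since $(a, b) \in D_H$ and $Q = |a|$, we have
\[ (I \otimes Q^{-it})(-\hat b^\top \otimes qa^{-1}b)(I \otimes Q^{it}) = q^{-it}(-\hat b^\top \otimes qa^{-1}b) \]
(cf. (3.2)). Applying to both sides of this equation the function $\gamma' \mapsto \chi(\gamma', \gamma)$ we obtain
\[ \begin{aligned} (I \otimes Q^{-it})\chi(-\hat b^\top \otimes qa^{-1}b, \gamma)(I \otimes Q^{it}) &= \chi\bigl(q^{-it}(-\hat b^\top \otimes qa^{-1}b), \gamma\bigr) \\ &= \chi(q^{-it}, \gamma)\chi(-\hat b^\top \otimes qa^{-1}b, \gamma) = |\gamma|^{-it}\chi(-\hat b^\top \otimes qa^{-1}b, \gamma). \end{aligned} \tag{5.9} \]
Now $Q$ commutes with $a$ and consequently $(I \otimes Q^{it})$ commutes with $\chi(\hat a^\top \otimes I, I \otimes a)$. Therefore
\[ \begin{aligned} &(\bar z \otimes Q^{it}u \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes Q^{it}y) \\ &\quad = (\bar z \otimes u \,|\, (I \otimes Q^{-it})\chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a)(I \otimes Q^{it}) \,|\, \bar x \otimes y) \\ &\quad = (\bar z \otimes u \,|\, (I \otimes Q^{-it})\chi(-\hat b^\top \otimes qa^{-1}b, \gamma)(I \otimes Q^{it})\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes y) \\ &\quad = |\gamma|^{-it}(\bar z \otimes u \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes y). \end{aligned} \]
Performing holomorphic continuation to $t = i$ we get
\[ (\bar z \otimes Qu \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes Q^{-1}y) = |\gamma|\,(\bar z \otimes u \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes y). \]
Notice that
\[ \chi(-\hat b^\top \otimes qa^{-1}b, \gamma) = \chi(-1, \gamma)\chi(\hat b^\top \otimes qa^{-1}b, \gamma) = \chi(-1, \gamma)\chi(\hat b^\top \otimes I, \gamma)\chi(I \otimes qa^{-1}b, \gamma) = \chi(-1, \gamma)\chi(\hat b^\top, \gamma) \otimes \chi(qa^{-1}b, \gamma). \]
Therefore we can continue our computation in the following way:
\[ \begin{aligned} &(\bar z \otimes Qu \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes Q^{-1}y) \\ &\quad = |\gamma|\chi(-1, \gamma)\,(\bar z \otimes u \,|\, \bigl(\chi(\hat b^\top, \gamma) \otimes \chi(qa^{-1}b, \gamma)\bigr)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes y) \\ &\quad = |\gamma|\chi(-1, \gamma)\,(\overline{\chi(\hat b, \gamma)z} \otimes \chi(qa^{-1}b, \gamma)^*u \,|\, \chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes y) \\ &\quad = |\gamma|\chi(-1, \gamma)\,(x \otimes \chi(qa^{-1}b, \gamma)^*u \,|\, \chi(\hat a \otimes I, I \otimes a) \,|\, \chi(\hat b, \gamma)z \otimes y), \end{aligned} \]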
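where in the last equality we used (5.7). Then we have
\[ \begin{aligned} &(\bar z \otimes Qu \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes Q^{-1}y) \\ &\quad = |\gamma|\chi(-1, \gamma)\,(x \otimes u \,|\, \bigl(I \otimes \chi(qa^{-1}b, \gamma)\bigr)\chi(\hat a \otimes I, I \otimes a)\bigl(\chi(\hat b, \gamma) \otimes I\bigr) \,|\, z \otimes y) \\ &\quad = |\gamma|\chi(-1, \gamma)\,(x \otimes u \,|\, \bigl(I \otimes \chi(qa^{-1}b, \gamma)\bigr)\chi(\hat a \otimes I, I \otimes a)\chi(\hat b \otimes I, \gamma) \,|\, z \otimes y). \end{aligned} \]
By Lemma 5.1 this last expression equals
\[ |\gamma|\chi(-1, \gamma)\,(x \otimes u \,|\, \bigl(I \otimes \chi(qa^{-1}b, \gamma)\bigr)\chi(\hat b \otimes a, \gamma)\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes y). \]
Let us put $A = \hat b \otimes a$, $B = I \otimes qa^{-1}b$. It is easy to see that $(A, B) \in D_{K \otimes H}$. By Corollary 3.3(4)
\[ \chi(I \otimes qa^{-1}b, \gamma)\chi(\hat b \otimes a, \gamma) = \chi(B, \gamma)\chi(\gamma, A) = \alpha(\gamma)\chi(qBA, \gamma) = \alpha(\gamma)\chi(\hat b \otimes b, \gamma). \]
With this information we obtain
\[ (\bar z \otimes Qu \,|\, \chi(-\hat b^\top \otimes qa^{-1}b, \gamma)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes Q^{-1}y) = \alpha(\gamma)|\gamma|\chi(-1, \gamma)\,(x \otimes u \,|\, \chi(\hat b \otimes b, \gamma)\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes y), \]
which proves Statement (1).

Ad. (2) In a similar way to the derivation of (5.9) we find that
\[ (I \otimes Q^{-it})\chi(\hat b \otimes b, \gamma)(I \otimes Q^{it}) = |\gamma|^{-it}\chi(\hat b \otimes b, \gamma) \]
for all $\gamma \in \Gamma_q$ and all $t \in \mathbb{R}$. Therefore
\[ (x \otimes Q^{it}u \,|\, \chi(\hat b \otimes b, \gamma)\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes Q^{it}y) = |\gamma|^{-it}(x \otimes u \,|\, \chi(\hat b \otimes b, \gamma)\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes y) = |\gamma|^{-it}\varphi(\gamma). \]
Performing holomorphic continuation to the point $t = \pm 2i$ we obtain
\[ (x \otimes Q^{\pm 2}u \,|\, \chi(\hat b \otimes b, \gamma)\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes Q^{\mp 2}y) = |\gamma|^{\pm 2}\varphi(\gamma). \]
With $\gamma = q^k q^{it}$ this means that
\[ e^{\mp 2t \operatorname{Im} \rho^{-1}}\varphi(q^k q^{it}) = (x \otimes Q^{\pm 2}u \,|\, \operatorname{Phase}(\hat b \otimes b)^k\,|\hat b \otimes b|^{it}\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes Q^{\mp 2}y). \]
It follows from the assumptions about $x$, $y$ and $u$ that $x \otimes Q^{\pm 2}u \in D(\hat b \otimes b)$. Consequently the functions
\[ t \mapsto e^{\mp 2t \operatorname{Im} \rho^{-1}}\varphi(q^k q^{it}) \]
have a holomorphic continuation to $\{z \in \mathbb{C} : -1 < \operatorname{Im} z < 1\}$ and this continuation is bounded in the strip. It is an easy exercise (cf. [7, Lemat 7.9]) that if a function $u$ on $\mathbb{R}$ has the property that the functions $t \mapsto e^{\pm t}u(t)$ extend to bounded holomorphic functions on the strip $\{z \in \mathbb{C} : -1 < \operatorname{Im} z < 1\}$ then $u \in S(\mathbb{R})$. In particular the functions $t \mapsto \varphi(q^k q^{it})$ belong to $S(\mathbb{R})$, i.e. $\varphi \in S(\Gamma_q)$.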
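By Statement (1) we know that the functions $t \mapsto \psi(q^k q^{it})$ are smooth and that
\[ t \mapsto e^{\pm t \operatorname{Im} \rho^{-1}}\psi(q^k q^{it}) \]
have extensions to bounded holomorphic functions in the strip $\{z \in \mathbb{C} : -1 < \operatorname{Im} z < 1\}$. Therefore $\psi \in S(\Gamma_q)$.

In the same way as [17, Proposition 3.5] we get the following:

Proposition 5.6. Let $H$, $K$, $(a, b)$, $(\hat a, \hat b)$, $Q$ and $x, z, u, y$ be as in Proposition 5.5. Let $f$ and $g$ be bounded Borel functions on $\Gamma_q$. Denote by $\hat f$ and $\hat g$ the inverse Fourier transforms of $f$ and $g$ (we treat $f$ and $g$ as tempered distributions on $\Gamma_q$):
\[ f(\gamma) = \int_{\Gamma_q} \hat f(\gamma')\chi(\gamma, \gamma')\,d\mu(\gamma'), \qquad g(\gamma) = \int_{\Gamma_q} \hat g(\gamma')\chi(\gamma, \gamma')\,d\mu(\gamma'). \tag{5.10} \]
Suppose that for almost all $\gamma \in \Gamma_q$ we have
\[ \hat f(\gamma) = \alpha(\gamma)|\gamma|\chi(-1, \gamma)\hat g(\gamma). \tag{5.11} \]
Then
\[ (x \otimes u \,|\, f(\hat b \otimes b)\chi(\hat a \otimes I, I \otimes a) \,|\, z \otimes y) = (\bar z \otimes Qu \,|\, g(-\hat b^\top \otimes qa^{-1}b)\chi(\hat a^\top \otimes I, I \otimes a) \,|\, \bar x \otimes Q^{-1}y). \tag{5.12} \]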
Corollary 5.7. Let $H$ and $K$ be Hilbert spaces and let $(a, b) \in G_H$, $(\hat a, \hat b) \in G_K$. Assume that $\ker b = \{0\}$ and $\ker \hat b = \{0\}$. Let $V$ be the operator introduced by (5.3) and define
\[ \widetilde V = F_q(-\hat b^\top \otimes qa^{-1}b)^*\chi(\hat a^\top \otimes I, I \otimes a). \]
Then
(1) for all $x, z \in K$, $y \in D(Q^{-1})$ and $u \in D(Q)$ we have
\[ (x \otimes u \,|\, V \,|\, z \otimes y) = (\bar z \otimes Qu \,|\, \widetilde V \,|\, \bar x \otimes Q^{-1}y); \tag{5.13} \]
(2) the operator $W$ introduced by (5.1) is a modular multiplicative unitary.

Proof. Ad. (1) We shall use Proposition 5.6 with $f = F_q$ and $g = \overline{F_q}$. We can use it because of Corollary 2.15. The result is exactly (5.13).

Ad. (2) Put $K = H$ and $(\hat a, \hat b) = (b^{-1}, b^{-1}a)$. Then by Statement (1) we have
\[ (x \otimes u \,|\, W \,|\, z \otimes y) = (\bar z \otimes Qu \,|\, \widetilde W \,|\, \bar x \otimes Q^{-1}y) \]
for all $x, z \in K$, $y \in D(Q^{-1})$ and $u \in D(Q)$. Let $\hat Q = |b|$. Then it is easy to verify that $\hat Q \otimes Q$ strongly commutes with $b^{-1}a \otimes b$, $b^{-1} \otimes I$ and $I \otimes a$. Therefore
\[ W(\hat Q \otimes Q)W^* = \hat Q \otimes Q. \]
This way we have verified that $W$ satisfies all conditions listed in [8, Definition 2.1].
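The conditions checked in this proof can be collected in one place. The summary below is our reading of how [8, Definition 2.1] is used here and is not a quotation of that definition: we assume that it asks for positive self-adjoint operators $Q$, $\hat Q$ with trivial kernels and a unitary $\widetilde W$ on $\overline{H} \otimes H$ satisfying the relations listed.

```latex
% Relations verified for W of (5.1), with Q = |a| and \hat Q = |b|
% (ker a = ker b = {0}, so Q and \hat Q have trivial kernels):
\begin{align*}
& W_{23} W_{12} = W_{12} W_{13} W_{23}
    && \text{(Corollary 5.4(1))}, \\
& W (\hat Q \otimes Q) W^* = \hat Q \otimes Q
    && \text{(commutation of } \hat Q \otimes Q \text{ with both factors of } W\text{)}, \\
& (x \otimes u \,|\, W \,|\, z \otimes y)
    = (\bar z \otimes Q u \,|\, \widetilde W \,|\, \bar x \otimes Q^{-1} y)
    && \text{(formula (5.13) with } V = W\text{)},
\end{align*}
% for all x, z in H, u in D(Q), y in D(Q^{-1}).
```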
6. The Quantum “az + b” Group for New Values of q

6.1. The C∗-algebra

Let us describe the C∗-algebra which will turn out to be the algebra of continuous functions vanishing at infinity on the quantum “az + b” group. For $\gamma \in \Gamma_q$ and $f \in C_\infty(\overline{\Gamma}_q)$ let
\[ (\beta_\gamma f)(\gamma') = f(\gamma'\gamma) \]
for all $\gamma' \in \overline{\Gamma}_q$. Then $\Gamma_q \ni \gamma \mapsto \beta_\gamma \in \operatorname{Aut}(C_\infty(\overline{\Gamma}_q))$ is a strongly continuous action and $(C_\infty(\overline{\Gamma}_q), \Gamma_q, \beta)$ is a C∗-dynamical system. Let $B$ be the corresponding C∗-crossed product. As the canonical embedding $C_\infty(\overline{\Gamma}_q) \hookrightarrow M(B)$ is a morphism from $C_\infty(\overline{\Gamma}_q)$ to $B$, any element affiliated with $C_\infty(\overline{\Gamma}_q)$ can be treated as an element affiliated with $B$. Let $b$ be the element affiliated with $B$ arising from the continuous function $\overline{\Gamma}_q \ni \gamma \mapsto \gamma \in \mathbb{C}$. Let $(U_\gamma)_{\gamma \in \Gamma_q}$ be the strictly continuous family of unitary elements of $M(B)$ implementing the action $\beta$: $U_\gamma f U_\gamma^* = \beta_\gamma f$ for $f \in C_\infty(\overline{\Gamma}_q) \subset M(B)$.

Let us represent $B$ faithfully on a Hilbert space $H$. Then $(U_\gamma)_{\gamma \in \Gamma_q}$ is a group of unitary operators acting on $H$. By the SNAG theorem there is a normal operator $a$ acting on $H$ such that $\operatorname{Sp} a \subset \overline{\Gamma}_q$, $\ker a = \{0\}$ and $U_\gamma = \chi(\gamma, a)$ for all $\gamma \in \Gamma_q$ (remark that this is where we are using the fact that $\widehat{\Gamma}_q = \Gamma_q$). Now since all operators $U_\gamma$ are in $M(B)$ and the map $\Gamma_q \ni \gamma \mapsto U_\gamma \in M(B)$ is strictly continuous, by Theorem 4.5 the operators $a$ and $a^{-1}$ are affiliated with $B$. It is easy to check that for any $\pi \in \operatorname{Rep}(B, H)$ we have $(\pi(a), \pi(b)) \in G_H$. It is also known that
\[ \bigl\{ g(a)f(b) : g \in C_\infty(\Gamma_q),\ f \in C_\infty(\overline{\Gamma}_q) \bigr\} \tag{6.1} \]
is a linearly dense subset of $B$. Using the same technique as in the proof of [17, Propositions 4.1 and 4.2] we get the following:

Proposition 6.1. Let $B$, $a$ and $b$ be the C∗-algebra and two affiliated elements described in this subsection. Then
(1) the C∗-algebra $B$ is generated by the three affiliated elements $a$, $a^{-1}$ and $b$;
(2) for any Hilbert space $H$ and any $(a_0, b_0) \in G_H$ there exists a unique $\pi \in \operatorname{Rep}(B, H)$ such that $a_0 = \pi(a)$ and $b_0 = \pi(b)$. If $A$ is a non-degenerate C∗-subalgebra of $B(H)$ and $a_0, a_0^{-1}, b_0 \mathrel{\eta} A$ then $\pi \in \operatorname{Mor}(B, A)$.

Assume now that $B$ is faithfully represented in a Hilbert space $H$. From the commutation relations between $a$ and $b$ and the fact that (6.1) is linearly dense in $B$ we see that $\ker b$ is an invariant subspace of $H$ for the action of $B$. We can therefore restrict our representation to $(\ker b)^\perp$ or, equivalently, assume that $\ker b = \{0\}$. Denote $\hat a = b^{-1}$ and $\hat b = b^{-1}a$; then
\[ W = F_q(\hat b \otimes b)\chi(\hat a \otimes I, I \otimes a) \]
coincides with (5.1). Therefore it is a modular multiplicative unitary. The operators $Q$, $\hat Q$ and $\widetilde W$ related to $W$ via [17, Definition 2.1] are given by
\[ Q = |a|, \qquad \hat Q = |b|, \qquad \widetilde W = F_q(-\hat b^\top \otimes qa^{-1}b)^*\chi(\hat a^\top \otimes I, I \otimes a). \]
Let
\[ A = \overline{\bigl\{(\omega \otimes \mathrm{id})W : \omega \in B(H)_*\bigr\}}^{\,\text{norm closure}}. \]
By the theory developed in [15, 8] $A$ is a nondegenerate C∗-subalgebra of $B(H)$. The proof of the following proposition is virtually identical to that presented in [17, Sec. 6]. We give it here because it relies on the results about special functions and commutation relations obtained in Secs. 2 and 3.

Proposition 6.2. The C∗-algebras $A$ and $B$ are equal as subsets of $B(H)$.

Proof. The operators $\hat a \otimes I$, $I \otimes a$, $I \otimes a^{-1}$ and $\hat b \otimes b$ are affiliated with $K(H) \otimes B$. Therefore by Proposition 4.7 and Theorem 4.4 we have $\chi(\hat a \otimes I, I \otimes a), F_q(\hat b \otimes b) \in M(K(H) \otimes B)$ and it follows that $W \in M(K(H) \otimes B)$. By the definition of $A$ we conclude that $A \subset M(B)$. In particular $AB \subset B$. Using the same technique as in the proof of Proposition 4.2 (cf. also [15, Sec. 4] and [17, Sec. 6]) one can show that $AB$ is dense in $B$.

Now let us show that the elements $a$, $a^{-1}$ and $b$ are affiliated with $A$. For any $\gamma \in \overline{\Gamma}_q$ let
\[ V(\gamma) = F_q(\gamma\hat b \otimes b)\chi(\hat a \otimes I, I \otimes a). \]
It is easy to verify with Theorem 4.4 that $(V(\gamma))_{\gamma \in \overline{\Gamma}_q}$ is a strictly continuous family of unitary elements of $M(K(H) \otimes K(H))$. Clearly if $V(\gamma)_{12} = V(\gamma) \otimes I$ then $(V(\gamma)_{12})_{\gamma \in \overline{\Gamma}_q}$ is a strictly continuous family of unitary elements of $M(K(H) \otimes K(H) \otimes A)$. By Proposition 5.3
\[ V(\gamma)_{13} = V(\gamma)_{12}^*\,W_{23}\,V(\gamma)_{12}\,W_{23}^*. \]
Since $W \in M(K(H) \otimes A)$ the family $(V(\gamma)_{13})_{\gamma \in \overline{\Gamma}_q}$ is a strictly continuous family of elements of $M(K(H) \otimes K(H) \otimes A)$. It follows that $(V(\gamma))_{\gamma \in \overline{\Gamma}_q}$ is a strictly continuous family of unitaries in $M(K(H) \otimes A)$. This implies that
\[ F_q(\gamma\hat b \otimes b) = V(\gamma)V(0)^* \in M(K(H) \otimes A) \]
is also a strictly continuous function of $\gamma$. Using again Theorem 4.4 we see that $\hat b \otimes b$ is affiliated with $K(H) \otimes A$. It is known ([19, Proposition A.1]) that if a tensor product of normal operators is affiliated with a tensor product of C∗-algebras then the factor operators are affiliated with the factor C∗-algebras. Thus $\hat b \otimes b \mathrel{\eta} (K(H) \otimes A)$ implies that $b \mathrel{\eta} A$. Similarly using the strictly continuous family
\[ \Gamma_q \ni \gamma \mapsto V(\gamma) = \chi(\gamma, a) \in M(K(\mathbb{C}) \otimes K(H)) \]
we arrive at the conclusion that $(\chi(\gamma, a))_{\gamma \in \Gamma_q}$ is a strictly continuous family of unitary elements of $M(A)$. Consequently $a$ and $a^{-1}$ are affiliated with $A$. Since, at the same time, $a$, $a^{-1}$ and $b$ generate $B$, we see that the identity mapping on $B$ is a morphism from $B$ to $A$. In other words $B \subset M(A)$ and $BA$ is a dense subset of $A$. This concludes the proof of the equality $A = B$.

6.2. Quantum group structure

Having constructed a multiplicative unitary operator and established its modularity we can proceed with the construction of new quantum “az + b” groups. The algebra $A$ carries a comultiplication $\delta \in \operatorname{Mor}(A, A \otimes A)$. This is a coassociative morphism given by
\[ \delta(c) = W(c \otimes I)W^*. \]
The next ingredient we are going to examine is the scaling group. It is the one-parameter group $(\tau_t)_{t \in \mathbb{R}}$ of automorphisms of $A$ given by
\[ \tau_t(c) = Q^{2it}cQ^{-2it}. \]
It is easy to check that
\[ \tau_t(a) = a, \qquad \tau_t(b) = q^{2it}b \]
for all $t \in \mathbb{R}$. Then the well-known formula $\widetilde W^* = W^{\top \otimes R}$ helps in determining the unitary antipode $R$. It is the ∗-antiautomorphism $c \mapsto c^R$ of $A$ given on generators as
\[ a^R = a^{-1}, \qquad b^R = -qa^{-1}b. \]
The formula for the polar decomposition of the antipode gives now
\[ \kappa(a) = a^{-1}, \qquad \kappa(b) = -a^{-1}b. \]
All these formulas agree with those derived in the Hopf ∗-algebra framework in Subsec. 1.1 with $\lambda = q^2$. This is why we call our quantum group a quantum “az + b” group for the deformation parameter $q$.

The multiplicative unitary also provides information about the reduced dual $(\hat A, \hat\delta)$ of our quantum group. In the case of our quantum “az + b” groups the situation is described by the following proposition (recall that $\hat a = b^{-1}$ and $\hat b = b^{-1}a$).

Proposition 6.3. The operators $\hat a$ and $\hat b$ are affiliated with $\hat A$. There exists a C∗-isomorphism $\Psi \colon A \to \hat A$ such that $\Psi(a) = \hat a$, $\Psi(b) = \hat b$ and
\[ \hat\delta \circ \Psi = \sigma(\Psi \otimes \Psi) \circ \delta, \]
where $\sigma$ is the flip on $\hat A \otimes \hat A$.
The proof of this proposition is identical to that of [17, Theorem 7.1]. We shall only point out that we can avoid choosing a special representation of $B$ (as in [17]) by noticing that
\[ \bigl\{\chi(\gamma, \hat a)\chi(\hat b, \gamma') : \gamma, \gamma' \in \Gamma_q\bigr\}'' = \bigl\{\chi(b, \gamma)\chi(\gamma', a) : \gamma, \gamma' \in \Gamma_q\bigr\}'', \]
so that the multiplicities of the pairs $(a, b)$ and $(\hat a, \hat b)$ are the same regardless of the chosen representation (cf. [7, Twierdzenie 7.28]).

According to the results of [18], for a quantum group arising from a modular multiplicative unitary $W$ with associated operators $\widetilde W$, $Q$ and $\hat Q$, the weight
\[ h(c) = \operatorname{Tr}(\hat Q c \hat Q) \]
is right invariant and if it is locally finite (densely defined) then it is the right Haar measure. It turns out that in our case this weight is locally finite. More precisely one finds that for $c = g(a)f(b)$ we have
\[ h(c^*c) = \int_{\Gamma_q} |g(\gamma)|^2\,d\mu(\gamma) \int_{\overline{\Gamma}_q} |f(\gamma)|^2|\gamma|^2\,d\mu(\gamma). \]
In particular $(A, \delta)$ together with $(\kappa, (\tau_t)_{t \in \mathbb{R}}, R, h)$ is a weighted Hopf C∗-algebra as defined in [4, Definition 1.5]. It is well known that weighted Hopf C∗-algebras are the same objects as reduced C∗-algebraic quantum groups defined in [3, Definition 4.1]. The left Haar measure is explicitly given as $h_L = h \circ R$. In other words $(A, \delta, h_L, h)$ is a reduced C∗-algebraic quantum group.

By the universal property of the C∗-algebra $A$ described in Proposition 6.1 there exists a unique ∗-character $e$ of $A$ such that $e(a) = 1$ and $e(b) = 0$. Clearly this is the counit of $(A, \delta)$. This means that quantum “az + b” groups are co-amenable [1]. In view of Proposition 6.3 we conclude that quantum “az + b” groups are amenable. In particular there is a simple general formula describing all unitary representations of quantum “az + b” groups (cf. [5]).
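The formulas of this subsection can be cross-checked on generators. The computation below is a purely formal sketch at the Hopf ∗-algebra level: domains of unbounded operators and the closure implicit in $\dotplus$ are ignored, $m$ denotes multiplication, and the convention $\kappa = R \circ \tau_{i/2}$ for the polar decomposition is an assumption made here, since the text does not state it explicitly.

```latex
% Data taken from the text:
%   \delta(a) = a \otimes a,  \delta(b) = a \otimes b + b \otimes I   (from (5.2)),
%   e(a) = 1,  e(b) = 0,  \kappa(a) = a^{-1},  \kappa(b) = -a^{-1} b,
%   \tau_t(b) = q^{2it} b,  b^R = -q a^{-1} b.
\begin{align*}
(e \otimes \mathrm{id})\delta(a) &= e(a)\,a = a, &
(e \otimes \mathrm{id})\delta(b) &= e(a)\,b + e(b)\,I = b, \\
(\mathrm{id} \otimes e)\delta(a) &= a\,e(a) = a, &
(\mathrm{id} \otimes e)\delta(b) &= a\,e(b) + b\,e(I) = b, \\
m(\kappa \otimes \mathrm{id})\delta(a) &= a^{-1} a = I = e(a) I, &
m(\kappa \otimes \mathrm{id})\delta(b) &= a^{-1} b - a^{-1} b = 0 = e(b) I, \\
m(\mathrm{id} \otimes \kappa)\delta(a) &= a\,a^{-1} = I, &
m(\mathrm{id} \otimes \kappa)\delta(b) &= -a\,a^{-1} b + b = 0.
\end{align*}
% Consistency of \kappa with R and \tau, assuming \kappa = R \circ \tau_{i/2}
% and formally continuing t \mapsto q^{2it} to t = i/2:
\[
\tau_{i/2}(b) = q^{-1} b, \qquad
R(q^{-1} b) = q^{-1} b^R = q^{-1}(-q\,a^{-1} b) = -a^{-1} b = \kappa(b).
\]
```

The first two rows are the counit identities for $e$, the next two are the antipode identities for $\kappa$, and the last line shows that the stated values of $b^R$, $\tau_t(b)$ and $\kappa(b)$ fit together under the assumed convention.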
Acknowledgments

The author wishes to express his gratitude to Professor S. L. Woronowicz who inspired and helped in developing this work. He also wants to thank Professors M. Bożejko and W. Pusz whose many comments and remarks have been invaluable. This paper was prepared during the author’s stay at the Mathematisches Institut of the Westfälische Wilhelms-Universität in Münster. He would like to thank Professor J. Cuntz for warm hospitality and a perfect atmosphere for scientific activity. The author is grateful to one of the referees for helpful suggestions. Research partially supported by Komitet Badań Naukowych grant no. 2PO3A04022, the Foundation for Polish Science and Deutsche Forschungsgemeinschaft.
References

[1] E. Bedos and L. Tuset, Amenability and co-amenability for locally compact quantum groups, Int. J. Math. 14(8) (2003) 865–884.
[2] P. Kruszyński and S. L. Woronowicz, A non-commutative Gelfand–Naimark theorem, J. Op. Theory 8 (1982) 361–389.
[3] J. Kustermans and S. Vaes, Locally compact quantum groups, Ann. Sci. Ec. Norm. Sup. 33(4) (2000) 837–934.
[4] T. Masuda, Y. Nakagami and S. L. Woronowicz, A C∗-algebraic framework for quantum groups, Int. J. Math. 14(9) (2003) 903–1001.
[5] W. Pusz and P. M. Soltan, Functional form of unitary representations of the quantum “az + b” group, Rep. Math. Phys. 52(2) (2003) 309–319.
[6] W. Pusz and S. L. Woronowicz, A quantum GL(2, C) group at roots of unity, Rep. Math. Phys. 47(3) (2001) 431–462.
[7] P. M. Soltan, Nowe deformacje grupy afinicznych przekształceń płaszczyzny, PhD thesis, Warsaw University, 2003.
[8] P. M. Soltan and S. L. Woronowicz, A remark on manageable multiplicative unitaries, Lett. Math. Phys. 57 (2001) 239–252.
[9] A. Van Daele, The Haar measure on some locally compact quantum groups, preprint OA/0109004.
[10] S. L. Woronowicz, Duality in the C∗-algebra theory, in Proc. Int. Congr. Math. Warsaw 1983 (PWN Polish Scientific Publishers, Warsaw), pp. 1347–1356.
[11] S. L. Woronowicz, Unbounded elements affiliated with C∗-algebras and non-compact quantum groups, Commun. Math. Phys. 136 (1991) 399–432.
[12] S. L. Woronowicz, Operator equalities related to the quantum E(2) group, Commun. Math. Phys. 144 (1992) 417–428.
[13] S. L. Woronowicz, Quantum E(2) group and its Pontryagin dual, Lett. Math. Phys. 23 (1991) 251–263.
[14] S. L. Woronowicz, C∗-algebras generated by unbounded elements, Rev. Math. Phys. 7(3) (1995) 481–521.
[15] S. L. Woronowicz, From multiplicative unitaries to quantum groups, Int. J. Math. 7(1) (1996) 127–149.
[16] S. L. Woronowicz, Quantum exponential function, Rev. Math. Phys. 12(6) (2000) 873–920.
[17] S. L. Woronowicz, Quantum “az + b” group on complex plane, Int. J. Math. 12(4) (2001) 461–503.
[18] S. L. Woronowicz, Haar weight on some quantum groups, University of Warsaw (2003) preprint.
[19] S. L. Woronowicz and S. Zakrzewski, Quantum “ax + b” group, Rev. Math. Phys. 14(7,8) (2002) 797–828.
June 23, 2005 10:9 WSPC/148-RMP
J070-00235
Reviews in Mathematical Physics Vol. 17, No. 4 (2005) 365–389 c World Scientific Publishing Company
EQUILIBRIUM STATES AND THEIR ENTROPY DENSITIES IN GAUGE-INVARIANT C ∗ -SYSTEMS
NOBUYUKI AKIHO and FUMIO HIAI
Graduate School of Information Sciences, Tohoku University, Aoba-ku, Sendai 980-8579, Japan

DÉNES PETZ
Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences, H-1053 Budapest, Reáltanoda u. 13-15, Hungary

Received 30 September 2004
Revised 16 March 2005

A gauge-invariant C∗-system is obtained as the fixed point subalgebra of the infinite tensor product of full matrix algebras under the tensor product unitary action of a compact group. In this paper, thermodynamics is studied in such systems and the chemical potential theory developed by Araki, Haag, Kastler and Takesaki is used. As a generalization of quantum spin systems, the equivalence of the KMS condition, the Gibbs condition and the variational principle is shown for translation-invariant states. The entropy density of extremal equilibrium states is also investigated in relation to macroscopic uniformity.

Keywords: C∗-dynamical systems; gauge-invariant C∗-systems; equilibrium states; KMS condition; Gibbs condition; variational principle; chemical potentials; entropy densities; macroscopic uniformity.
0. Introduction

The rigorous treatment of the statistical mechanics of quantum lattice (or spin) systems has been one of the major successes of the C∗-algebraic approach to quantum physics. The main results are due to many people but a detailed overview is presented in the monograph [7]. ([22, Chap. 15] is a concise summary, see also [25].) The usual quantum spin system is described on the infinite tensor product C∗-algebra of full matrix algebras. Given an interaction Φ, the local Hamiltonian induces the local dynamics and the local equilibrium state. The global dynamics and the global equilibrium states are obtained by a limiting procedure. The equivalence of the KMS condition, the Gibbs condition and the variational principle for translation-invariant states is the core of the theory; these equivalences were established around 1970 [1, 19, 24]. The above-mentioned concepts are used to describe equilibrium states. Recently Araki and Moriya extended the ideas to fermionic lattice systems [5].
An attempt to extend quantum statistical mechanics from the setting of spin systems to some approximately finite C∗-algebras was made by Kishimoto [17, 18]. Motivated by the chemical potential theory due to Araki et al. [4], in our previous paper [14] we studied the equivalence of the KMS condition, the Gibbs condition and the variational principle on approximately finite C∗-algebras as a natural extension of the thermodynamics of one-dimensional quantum lattice systems. It turned out that Eq. (2.8) in the proof of [14, Theorem 2.2] does not hold, and the equivalence formulated in that theorem is recovered here under stronger conditions. (The error in the proof was pointed out to the authors by E. Størmer and S. Neshveyev some years ago.) A gauge-invariant C∗-system is obtained as the fixed point subalgebra of the infinite tensor product of full matrix algebras under the tensor product unitary action of a compact group. This situation is a typical example of the chemical potential theory. The primary aim of the present paper is to recover the main results in [14] in the restrictive setup of such gauge-invariant C∗-systems. The second aim is to discuss entropy densities and macroscopic uniformity for extremal equilibrium states in such C∗-systems and to extend the arguments in [13].
1. Equilibrium States with Chemical Potentials

We begin by fixing basic notations and terminologies. Let $M_d(\mathbb{C})$ be the algebra of $d \times d$ complex matrices. Let $F$ denote a one-dimensional spin (or UHF) C∗-algebra $\bigotimes_{k \in \mathbb{Z}} F_k$ with $F_k := M_d(\mathbb{C})$, and $\theta$ the right shift on $F$. Let $G$ be a separable compact group and $\sigma$ a continuous unitary representation of $G$ on $\mathbb{C}^d$ so that a product action $\gamma$ of $G$ on $F$ is defined by $\gamma_g := \bigotimes_{\mathbb{Z}} \operatorname{Ad} \sigma_g$, $g \in G$. Let $A := F^\gamma$, the fixed point subalgebra of $F$ for the action $\gamma$ of $G$. For a finite subset $\Lambda \subset \mathbb{Z}$ let $F_\Lambda := \bigotimes_{k \in \Lambda} F_k$ and $A_\Lambda := A \cap F_\Lambda = F_\Lambda^\gamma$, the fixed point subalgebra for $\gamma|_{F_\Lambda}$. Then $A$ is an AF C∗-algebra generated by $\{A_\Lambda\}_{\Lambda \subset \mathbb{Z}}$ [23, Proposition 2.1]. The algebra $A$ is called the observable algebra while $F$ is called the field algebra. Let $S(A)$ denote the state space of $A$ and $S_\theta(A)$ the set of all $\theta$-invariant states of $A$.

An interaction $\Phi$ is a mapping from the finite subsets of $\mathbb{Z}$ into $A$ such that $\Phi(\emptyset) = 0$ and $\Phi(X) = \Phi(X)^* \in A_X$ for each finite $X \subset \mathbb{Z}$. Given an interaction $\Phi$ and a finite subset $\Lambda \subset \mathbb{Z}$, define the local Hamiltonian $H_\Lambda$ by
\[ H_\Lambda := \sum_{X \subset \Lambda} \Phi(X), \]
and the surface energy $W_\Lambda$ by
\[ W_\Lambda := \sum \{\Phi(X) : X \cap \Lambda \ne \emptyset,\ X \cap \Lambda^c \ne \emptyset\} \]
whenever the sum converges in norm. Throughout the paper we assume that an interaction $\Phi$ is $\theta$-invariant and has relatively short range; namely, $\theta(\Phi(X)) = \Phi(X + 1)$, where
$X + 1 := \{k + 1 : k \in X\}$, for every finite $X \subset \mathbb{Z}$ and
\[ |||\Phi||| := \sum_{X \ni 0} \frac{\|\Phi(X)\|}{|X|} < \infty, \]
where $|X|$ means the cardinality of $X$. Let $B(A)$ denote the set of all such interactions, which is a real Banach space with the usual linear operations and the norm $|||\Phi|||$. Moreover, let $B_0(A)$ denote the set of all $\Phi \in B(A)$ such that
\[ \sum_{X \ni 0} \|\Phi(X)\| < \infty \quad \text{and} \quad \sup_{n \ge 1} \|W_{[1,n]}\| < \infty. \]
Then $B_0(A)$ is a real Banach space with the norm
\[ \|\Phi\|_0 := \sum_{X \ni 0} \|\Phi(X)\| + \sup_{n \ge 1} \|W_{[1,n]}\| \quad (\ge |||\Phi|||). \]
We define the real Banach space $B_0(F)$ in a similar manner. When $\Phi \in B_0(A)$ we have a strongly continuous one-parameter automorphism group $\alpha^\Phi$ of $F$ such that
\[ \lim_{l,m \to \infty} \bigl\|\alpha^\Phi_t(a) - e^{itH_{[-l,m]}}\,a\,e^{-itH_{[-l,m]}}\bigr\| = 0 \]
for all $a \in F$ uniformly for $t$ in finite intervals (see [15, Theorem 8] and also [7, 6.2.6]). It is straightforward to see that $\alpha^\Phi_t\theta = \theta\alpha^\Phi_t$ and $\alpha^\Phi_t\gamma_g = \gamma_g\alpha^\Phi_t$ for all $t \in \mathbb{R}$ and $g \in G$ so that $\alpha^\Phi_t(A) = A$, $t \in \mathbb{R}$. The sextuple $(F, A, G, \alpha^\Phi, \gamma, \theta)$ is a so-called field system in the chemical potential theory ([4], [7, Sec. 5.4.3]).

The most general notion of equilibrium states is described by the KMS condition in a general one-parameter C∗-dynamical system (see [7, Sec. 5.3.1] for example). In this paper we consider only $(\alpha^\Phi, \beta)$-KMS states with $\beta = 1$; so we refer to those states as just $\alpha^\Phi$-KMS states. The next proposition says that the $\alpha^\Phi$-KMS states are automatically $\theta$-invariant. This was stated in [14, Proposition 4.2] but the proof given there was incorrect.

Proposition 1.1. Let $\Phi \in B_0(A)$, and let $K(A, \Phi)$ denote the set of all $\alpha^\Phi$-KMS states of $A$. Then $K(A, \Phi) \subset S_\theta(A)$, and $\omega \in K(A, \Phi)$ is extremal in $K(A, \Phi)$ if and only if $\omega$ is extremal in $S_\theta(A)$.

Proof. The proof below is essentially the same as in [10, Sec. III]. Recall that the generator of $\alpha^\Phi$ is the closure of the derivation $\delta_0$ with domain $D(\delta_0) = \bigcup_\Lambda A_\Lambda$ (over the finite intervals $\Lambda \subset \mathbb{Z}$) given by
\[ \delta_0(a) := i \sum_{X \cap \Lambda \ne \emptyset} [\Phi(X), a], \qquad a \in A_\Lambda. \]
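For each $n \in \mathbb{N}$ let $u_n \in F_{[-n,n]}$ be a unitary implementing the cyclic permutation of $F_{[-n,n]} = \bigotimes_{-n}^{n} M_d(\mathbb{C})$, i.e.,
\[ \operatorname{Ad} u_n(a_{-n} \otimes a_{-n+1} \otimes \cdots \otimes a_{n-1} \otimes a_n) = a_n \otimes a_{-n} \otimes a_{-n+1} \otimes \cdots \otimes a_{n-1} \]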
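for $a_k \in M_d(\mathbb{C})$. Since $[u_n, \bigotimes_{-n}^{n} \sigma_g] = 0$, we get $\gamma_g(u_n) = u_n$ for all $g \in G$ so that $u_n \in A$. Moreover, since $\operatorname{Ad} u_n(a) = \theta(a)$ whenever $a \in A_{[-n,n-1]}$, it is immediate to see that $\theta(a) = \lim_{n \to \infty} \operatorname{Ad} u_n(a)$ for all $a \in A$. Hence, to obtain $K(A, \Phi) \subset S_\theta(A)$ by applying [10, Corollary II.3] (or [7, 5.3.33A]), it suffices to show that $\sup_{n \ge 1} \|\delta_0(u_n)\| < \infty$. This indeed follows because
\[ \begin{aligned} \|\delta_0(u_n)\| &= \Bigl\|\sum_{X \cap [-n,n] \ne \emptyset} [\Phi(X), u_n]\Bigr\| = \Bigl\|\sum_{X \cap [-n,n] \ne \emptyset} (\Phi(X) - u_n\Phi(X)u_n^*)\Bigr\| \\ &\le \Bigl\|\sum_{X \subset [-n,n-1]} (\Phi(X) - \theta(\Phi(X)))\Bigr\| + \Bigl\|\sum_{\substack{X \cap [-n,n] \ne \emptyset \\ X \not\subset [-n,n-1]}} (\Phi(X) - u_n\Phi(X)u_n^*)\Bigr\| \\ &\le \Bigl\|\sum_{X \subset [-n,n-1]} (\Phi(X) - \Phi(X+1))\Bigr\| + 2\Bigl\|\sum_{\substack{X \cap [-n,n] \ne \emptyset \\ X \not\subset [-n,n-1]}} \Phi(X)\Bigr\| \\ &\le \sum_{X \ni -n} \|\Phi(X)\| + \sum_{X \ni n} \|\Phi(X)\| + 2\sum_{X \ni n} \|\Phi(X)\| + 2\Bigl\|\sum_{\substack{X \cap [-n,n] \ne \emptyset \\ X \not\subset [-n,n]}} \Phi(X)\Bigr\| \\ &= 4\sum_{X \ni 0} \|\Phi(X)\| + 2\|W_{[-n,n]}\| \\ &\le 4\|\Phi\|_0 < \infty. \end{aligned} \]
For each $\omega \in S_\theta(A)$ let $(\pi_\omega, H_\omega, \Omega_\omega)$ be the GNS cyclic representation of $A$ associated with $\omega$ and $U_\theta$ be a unitary implementing $\theta$ so that $U_\theta\Omega_\omega = \Omega_\omega$ and $\pi_\omega(\theta(a)) = U_\theta\pi_\omega(a)U_\theta^*$ for $a \in A$. Since $(A, \theta)$ is asymptotically abelian in the norm sense, i.e., $\lim_{|n| \to \infty} \|[a, \theta^n(b)]\| = 0$ for all $a, b \in A$, it is well known [7, 4.3.14] that
\[ \pi_\omega(A)' \cap \{U_\theta\}' \subset \pi_\omega(A)' \cap \pi_\omega(A)''. \tag{1.1} \]
According to [27, Lemma 4.7], the second assertion is a consequence of this together with the first assertion (see also [7, 4.3.17 and 5.3.30 (3)] for extremal points of $S_\theta(A)$ and of $K(A, \Phi)$).

Remark 1.2. Since $(A, \theta)$ is asymptotically abelian as mentioned in the above proof, $S_\theta(A)$ becomes a simplex. It is also well known that $K(A, \Phi)$ is a simplex. These were shown in [27, Sec. 4], where the lattice (or simplex) structure of state spaces was discussed in a rather general setting. (See also [7, 4.3.11 and 5.3.30 (2)].) Moreover, it is seen from (1.1) (see [27, Lemma 4.7]) that $K(A, \Phi)$ is a face of $S_\theta(A)$.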
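The proof of Proposition 1.1 only needs the existence of the unitaries $u_n$. One concrete choice, written out here as an illustration rather than quoted from the paper, is the cyclic shift of tensor factors on $(\mathbb{C}^d)^{\otimes(2n+1)}$; the vectors $x_k, y_k \in \mathbb{C}^d$ below are auxiliary symbols introduced for this check.

```latex
% Assumed concrete choice of u_n on (C^d)^{\otimes(2n+1)}, factors indexed by -n, ..., n.
\begin{align*}
u_n(x_{-n} \otimes x_{-n+1} \otimes \cdots \otimes x_n)
  &= x_n \otimes x_{-n} \otimes \cdots \otimes x_{n-1}, \\
u_n^*(y_{-n} \otimes y_{-n+1} \otimes \cdots \otimes y_n)
  &= y_{-n+1} \otimes \cdots \otimes y_n \otimes y_{-n}, \\
u_n (a_{-n} \otimes \cdots \otimes a_n) u_n^* (y_{-n} \otimes \cdots \otimes y_n)
  &= u_n (a_{-n} y_{-n+1} \otimes \cdots \otimes a_{n-1} y_n \otimes a_n y_{-n}) \\
  &= a_n y_{-n} \otimes a_{-n} y_{-n+1} \otimes \cdots \otimes a_{n-1} y_n, \\
u_n \Bigl(\bigotimes_{-n}^{n} \sigma_g\Bigr)(x_{-n} \otimes \cdots \otimes x_n)
  &= \sigma_g x_n \otimes \sigma_g x_{-n} \otimes \cdots \otimes \sigma_g x_{n-1}
   = \Bigl(\bigotimes_{-n}^{n} \sigma_g\Bigr) u_n (x_{-n} \otimes \cdots \otimes x_n).
\end{align*}
```

The third and fourth lines reproduce $\operatorname{Ad} u_n(a_{-n} \otimes \cdots \otimes a_n) = a_n \otimes a_{-n} \otimes \cdots \otimes a_{n-1}$, and the last line gives $[u_n, \bigotimes_{-n}^{n} \sigma_g] = 0$, hence $\gamma_g(u_n) = u_n$.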
It is known [14, Lemma 4.1] that any tracial state $\phi$ of $A$ is $\theta$-invariant and $\phi$ is extremal if and only if it is multiplicative in the sense that $\phi(ab) = \phi(a)\phi(b)$ for all $a \in A_{[i,j]}$ and $b \in A_{[j+1,k]}$, $i \le j < k$. The $\theta$-invariance of any tracial state of $A$ is a particular case of Proposition 1.1 where $\Phi$ is identically zero. We denote by $ET_f(A)$ the set of all faithful and extremal tracial states of $A$. On the other hand, we denote by $\Xi(G, \sigma)$ the set of all continuous one-parameter subgroups $t \mapsto \xi_t$ of $G$. Two elements $\xi, \xi'$ in $\Xi(G, \sigma)$ are identified if there exists $g \in G$ such that $\operatorname{Ad} \sigma_{g^{-1}\xi_t g} = \operatorname{Ad} \sigma_{\xi'_t}$, $t \in \mathbb{R}$. In fact, this defines an equivalence relation and we redefine $\Xi(G, \sigma)$ as the set of equivalence classes. Then, [14, Proposition 4.3] says

Proposition 1.3. There is a bijective correspondence $\phi \leftrightarrow \xi$ between $ET_f(A)$ and $\Xi(G, \sigma)$ under the condition that $\phi$ extends to a $\gamma_\xi$-KMS state of $F$.

Let $\tau_0$ be the normalized trace on $M_d(\mathbb{C})$. Let $\phi$ and $\xi$ be as in the above proposition. Then there exists a unique self-adjoint $h \in F_{\{0\}} = M_d(\mathbb{C})$ such that $\tau_0(e^{-h}) = 1$ and $\operatorname{Ad} \sigma_{\xi_t} = \operatorname{Ad} e^{ith}$ for all $t \in \mathbb{R}$. We call this $h$ the generator of $\xi$. Note that $\tau_0(e^{-h}\,\cdot)$ is a unique KMS state of $M_d(\mathbb{C})$ with respect to $\operatorname{Ad} e^{ith}$ and thus $\hat\phi := \bigotimes_{\mathbb{Z}} \tau_0(e^{-h}\,\cdot)$ is a unique KMS state of $F$ with respect to $\gamma_{\xi_t} = \bigotimes_{\mathbb{Z}} \operatorname{Ad} e^{ith}$; so $\phi = \hat\phi|_A$.

Let $\Phi \in B_0(A)$ and $\xi \in \Xi(G, \sigma)$, and let $\omega$ be an $\alpha^\Phi$-KMS state of $A$. We say that $\xi$ is the chemical potential of $\omega$ if there exists an extension $\hat\omega$ of $\omega$ to $F$ which is a KMS state with respect to $\alpha^\Phi_t\gamma_{\xi_t}$. Let $h$ be the generator of $\xi$, and define a $\theta$-invariant interaction $\Phi^h$ in $F$ by
\[ \Phi^h(X) := \begin{cases} \Phi(\{j\}) + \theta^j(h) & \text{if } X = \{j\},\ j \in \mathbb{Z}, \\ \Phi(X) & \text{otherwise.} \end{cases} \tag{1.2} \]
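To illustrate the normalization $\tau_0(e^{-h}) = 1$ in the definition of the generator, consider the following toy case. It is an assumption made here for illustration and does not appear in the paper: $d = 2$, $G = \mathbb{T}$ acting by $\sigma_z = \operatorname{diag}(z, \bar z)$, and the one-parameter subgroup $\xi_t = e^{i\mu t}$ with $\mu \in \mathbb{R}$; the symbols $h_0$ and $c$ are auxiliary.

```latex
% Toy example (assumed): d = 2, G = T, \sigma_z = diag(z, \bar z), \xi_t = e^{i \mu t}.
\begin{align*}
\sigma_{\xi_t} &= \operatorname{diag}(e^{i\mu t}, e^{-i\mu t}) = e^{it h_0},
   \qquad h_0 = \operatorname{diag}(\mu, -\mu), \\
h &= h_0 + c\,1
   \qquad (\operatorname{Ad} e^{ith} = \operatorname{Ad} e^{ith_0} \text{ for every real } c), \\
\tau_0(e^{-h}) &= \tfrac{1}{2}\,e^{-c}\,(e^{-\mu} + e^{\mu}) = e^{-c}\cosh\mu = 1
   \iff c = \log\cosh\mu, \\
e^{-h} &= \frac{1}{\cosh\mu}\,\operatorname{diag}(e^{-\mu}, e^{\mu}).
\end{align*}
```

In this example the additive scalar $c$ is the only freedom left in $h$ by the condition $\operatorname{Ad} \sigma_{\xi_t} = \operatorname{Ad} e^{ith}$, and the normalization removes it. The state $\tau_0(e^{-h}\,\cdot)$ then assigns the probabilities $e^{\mp\mu}/(2\cosh\mu)$ to the two eigenvectors of $\sigma$.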
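Since $\Phi^h \in B_0(F)$, it generates a one-parameter automorphism group $\alpha^{\Phi^h}$ on $F$. Then, we have $\alpha^{\Phi^h}_t = \alpha^\Phi_t\gamma_{\xi_t}$, $t \in \mathbb{R}$, and $\alpha^{\Phi^h}|_A = \alpha^\Phi|_A$ [14, Lemma 4.4]. Due to the uniqueness of an $\alpha^{\Phi^h}$-KMS state of $F$ [2, 16], we notice that there is a unique $\alpha^\Phi$-KMS state with chemical potential $\xi$, which is automatically $\theta$-invariant and faithful. On the other hand, a consequence of the celebrated chemical potential theory in [4, Sec. II] together with Proposition 1.1 is the following: if $\omega$ is a faithful and extremal $\alpha^\Phi$-KMS state of $A$, then $\omega$ enjoys the chemical potential. A complete conclusion in this direction will be given in Theorem 1.6 below, and Proposition 1.3 is its special case.

To introduce the Gibbs condition, one needs the notion of perturbations of states of $A$. Let $\omega, \psi \in S(A)$. For each finite interval $\Lambda \subset \mathbb{Z}$, the relative entropy of $\psi_\Lambda := \psi|_{A_\Lambda}$ with respect to $\omega_\Lambda := \omega|_{A_\Lambda}$ is given by
\[ S(\psi_\Lambda, \omega_\Lambda) := \operatorname{Tr}_\Lambda\!\left( \frac{d\psi_\Lambda}{d\operatorname{Tr}_\Lambda} \left( \log\frac{d\psi_\Lambda}{d\operatorname{Tr}_\Lambda} - \log\frac{d\omega_\Lambda}{d\operatorname{Tr}_\Lambda} \right) \right). \]
Here, $\operatorname{Tr}_\Lambda$ denotes the canonical trace on $A_\Lambda$ such that $\operatorname{Tr}_\Lambda(e) = 1$ for any minimal projection $e$ in $A_\Lambda$. Then the relative entropy $S(\psi, \omega)$ is defined by
\[ S(\psi, \omega) := \sup_{\Lambda \subset \mathbb{Z}} S(\psi_\Lambda, \omega_\Lambda) = \lim_{n \to \infty} S(\psi_{[-n,n]}, \omega_{[-n,n]}). \]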
(See [22] for details on the relative entropy for states of a C∗-algebra.) For each $\omega \in S(A)$ and $Q = Q^* \in A$, since $\psi \mapsto S(\psi, \omega) + \psi(Q)$ is weakly* lower semicontinuous and strictly convex on $S(A)$, the perturbed state $[\omega^Q]$ by $Q$ is defined as a unique minimizer of this functional [8, 22]. Recall [3, 8] that
\[ |S(\psi, \omega) - S(\psi, [\omega^Q])| \le 2\|Q\| \tag{1.3} \]
for every $\psi, \omega \in S(A)$ and $Q = Q^* \in A$.

Let $\Phi$ be an interaction in $A$ and $\phi$ a tracial state of $A$. For each finite $\Lambda \subset \mathbb{Z}$, the local Gibbs state $\phi^G_\Lambda$ of $A_\Lambda$ with respect to $\Phi$ and $\phi$ is defined by
\[ \phi^G_\Lambda(a) := \frac{\phi(e^{-H_\Lambda}a)}{\phi(e^{-H_\Lambda})}, \qquad a \in A_\Lambda. \]
Let $\omega \in S(A)$ and $(\pi_\omega, H_\omega, \Omega_\omega)$ be the cyclic representation of $A$ associated with $\omega$. We say that $\omega$ satisfies the strong Gibbs condition if $\Omega_\omega$ is separating for $\pi_\omega(A)''$ and if, for each finite $\Lambda \subset \mathbb{Z}$, there exists a conditional expectation from $\pi_\omega(A)''$ onto $\pi_\omega(A_\Lambda) \vee \pi_\omega(A_{\Lambda^c})$ with respect to $[\omega^{-W_\Lambda}]^\sim$ and
\[ [\omega^{-W_\Lambda}](ab) = \phi^G_\Lambda(a)\,[\omega^{-W_\Lambda}](b), \qquad a \in A_\Lambda, \quad b \in A_{\Lambda^c}. \tag{1.4} \]
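Here, $[\omega^{-W_\Lambda}]^{\sim}$ is the normal extension of the perturbed state $[\omega^{-W_\Lambda}]$ to $\pi_\omega(A)''$ (see [14, p. 826]). Furthermore, we say that $\omega$ satisfies the weak Gibbs condition with respect to $\Phi$ and $\phi$ if $[\omega^{-W_\Lambda}]|_{A_\Lambda} = \phi^G_\Lambda$ for any finite $\Lambda \subset \mathbb{Z}$.

Now, let $\Phi \in B(A)$, $\phi \in ET^f(A)$ and $\omega \in S_\theta(A)$. From now on, for simplicity we write $A_n := A_{[1,n]}$, $H_n := H_{[1,n]}$, $\phi_n := \phi|_{A_n}$, $\omega_n := \omega|_{A_n}$, etc. for each $n \in \mathbb{N}$. The mean relative entropy of $\omega$ with respect to $\phi$ is defined by
\[
S_M(\omega, \phi) := \lim_{n\to\infty} \frac{1}{n} S(\omega_n, \phi_n) = \sup_{n \ge 1} \frac{1}{n} S(\omega_n, \phi_n).
\]
(See [14, Lemma 3.1] for justification of the definition.) Define the mean energy $A_\Phi$ of $\Phi$ by
\[
A_\Phi := \sum_{X \ni 0} \frac{\Phi(X)}{|X|} \quad (\in A).
\]
Furthermore, it is known [14, Theorem 3.5] that $\lim_{n\to\infty} \frac{1}{n} \log \phi(e^{-H_n})$ exists and
\[
\lim_{n\to\infty} \frac{1}{n} \log \phi(e^{-H_n}) = \sup\{-S_M(\omega, \phi) - \omega(A_\Phi) : \omega \in S_\theta(A)\}.
\]
The pressure of $\Phi$ with respect to $\phi$ is thus defined by
\[
p(\Phi, \phi) := \lim_{n\to\infty} \frac{1}{n} \log \phi(e^{-H_n}).
\]
We have the variational expressions of $p(\Phi, \phi)$ and $S_M(\omega, \phi)$ as follows.

Proposition 1.4. Let $\phi \in ET^f(A)$. If $\Phi \in B(A)$, then
\[
p(\Phi, \phi) = \sup\{-S_M(\omega, \phi) - \omega(A_\Phi) : \omega \in S_\theta(A)\}. \tag{1.5}
\]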
Equilibrium States and Entropy Densities
If $\omega \in S_\theta(A)$, then
\[
-S_M(\omega, \phi) = \inf\{p(\Phi, \phi) + \omega(A_\Phi) : \Phi \in B(A)\}. \tag{1.6}
\]

Proof. The expression (1.5) was given in [14, Theorem 3.5] as mentioned above. We can further transform (1.5) into (1.6) by a simple duality argument. In fact, for each $\omega \in S_\theta(A)$ define $f_\omega \in B(A)^*$, the dual Banach space of $B(A)$, by $f_\omega(\Phi) := -\omega(A_\Phi)$, and set $\Gamma := \{f_\omega : \omega \in S_\theta(A)\}$. Then, it is immediately seen that $\omega \in S_\theta(A) \mapsto f_\omega \in \Gamma$ is an affine homeomorphism in the weak* topologies so that $\Gamma$ is a weakly* compact convex subset of $B(A)^*$. Define $F : B(A)^* \to [0, +\infty]$ by
\[
F(f_\omega) := S_M(\omega, \phi) \ \text{ for } \omega \in S_\theta(A), \qquad F(g) := +\infty \ \text{ if } g \in B(A)^* \setminus \Gamma.
\]
Then $F$ is a weakly* lower semicontinuous and convex function on $B(A)^*$ (see [14, Proposition 3.2]). Since (1.5) means that
\[
p(\Phi, \phi) = \sup\{g(\Phi) - F(g) : g \in B(A)^*\}, \qquad \Phi \in B(A),
\]
it follows by duality (see [9, Proposition I.4.1] for example) that
\[
F(g) = \sup\{g(\Phi) - p(\Phi, \phi) : \Phi \in B(A)\}, \qquad g \in B(A)^*.
\]
Hence, for every $\omega \in S_\theta(A)$,
\[
S_M(\omega, \phi) = \sup\{f_\omega(\Phi) - p(\Phi, \phi) : \Phi \in B(A)\} = -\inf\{p(\Phi, \phi) + \omega(A_\Phi) : \Phi \in B(A)\},
\]
giving (1.6).

We say that $\omega$ satisfies the variational principle with respect to $\Phi$ and $\phi$ if
\[
p(\Phi, \phi) = -S_M(\omega, \phi) - \omega(A_\Phi). \tag{1.7}
\]
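The relation between the variational expressions (1.5), (1.6) and the variational principle (1.7) can be spelled out in one line. The display below is only a restatement of formulas already given and introduces no new assumption:

```latex
% From (1.5): p(\Phi,\phi) is a supremum over all \omega in S_\theta(A), hence
\[
  p(\Phi,\phi) + S_M(\omega,\phi) + \omega(A_\Phi) \;\ge\; 0
  \qquad \text{for every } \omega \in S_\theta(A) \text{ and } \Phi \in B(A).
\]
% (1.7) is exactly the case of equality: \omega attains the supremum in (1.5),
% and, read through (1.6), \Phi attains the infimum for this \omega.
```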
With the above definitions in mind we have the next theorem, recovering main results of [14, Corollary 3.11 and Theorem 4.5] in the special setup of gauge-invariant $C^*$-systems.

Theorem 1.5. Let $\Phi \in B_0(A)$, $\phi \in ET^f(A)$ and $\xi \in \Xi(G, \sigma)$ with $\phi \leftrightarrow \xi$ in the sense of Proposition 1.3. Then the following conditions for $\omega \in S(A)$ are equivalent:
(i) $\omega$ is an $\alpha^{\Phi}$-KMS state with chemical potential $\xi$;
(ii) $\omega$ satisfies the strong Gibbs condition with respect to $\Phi$ and $\phi$;
(iii) $\omega \in S_\theta(A)$ and $\omega$ satisfies the weak Gibbs condition with respect to $\Phi$ and $\phi$;
(iv) $\omega \in S_\theta(A)$ and $\omega$ satisfies the variational principle with respect to $\Phi$ and $\phi$.
Furthermore, there exists a unique $\omega \in S(A)$ satisfying one (hence all) of the above conditions.

Proof. (i) $\Rightarrow$ (ii). Let $\omega$ be an $\alpha^{\Phi}$-KMS state with chemical potential $\xi$ and $(\pi_\omega, H_\omega, \Omega_\omega)$ be the associated cyclic representation of $A$. It is well known that $\Omega_\omega$ is separating for $\pi_\omega(A)''$ (see [7, 5.3.9] for example). According to the proof of [14, Theorem 2.2, (i) $\Rightarrow$ (ii)], we see that for any finite $\Lambda \subset \mathbb{Z}$ there exists a conditional expectation from $\pi_\omega(A)''$ onto $\pi_\omega(A_\Lambda) \vee \pi_\omega(A_{\Lambda^c})$ with respect to $[\omega^{-W_\Lambda}]^{\sim}$. (Note that this part of the proof of [14, Theorem 2.2, (i) $\Rightarrow$ (ii)] is valid.) Moreover, the proof of [14, Theorem 4.5] shows that (1.4) holds for any finite $\Lambda \subset \mathbb{Z}$. Hence we obtain (ii).

(ii) $\Rightarrow$ (iii). The proof of [14, Theorem 2.2, (ii) $\Rightarrow$ (i)] guarantees that (ii) implies $\omega \in K(A, \Phi)$. Hence Proposition 1.1 gives the $\theta$-invariance of $\omega$.

(iii) $\Rightarrow$ (iv) is contained in [14, Proposition 3.9], proven in a more general setting.

(iv) $\Rightarrow$ (i). To prove this as well as the last assertion, it suffices to show that a state $\omega \in S(A)$ satisfying (iv) is unique. First, note that the variational principle (1.7) means that $\Psi \mapsto -\omega(A_\Psi)$ is a tangent functional to the graph of $p(\cdot, \phi)$ on $B_0(A)$ at $\Phi$. Let $h \in M_d(\mathbb{C})$ be the generator of $\xi$ and $\Phi^h$ the $\theta$-invariant interaction in $F$ defined by (1.2). Since $\Phi^h \in B_0(F)$, there is a unique $\alpha^{\Phi^h}$-KMS state $\hat\omega$ of $F$. Equivalently, there is a unique $\theta$-invariant state $\hat\omega$ of $F$ satisfying the variational principle with respect to $\Phi^h$, i.e.,
\[
P_F(\Phi^h) = s_F(\hat\omega) - \hat\omega(A_{\Phi^h}).
\]
Recall here that the pressure $P_F(\Psi)$ of $\Psi \in B_0(F)$ and the mean entropy $s_F(\psi)$ of $\psi \in S_\theta(F)$ are
\[
P_F(\Psi) := \lim_{n\to\infty} \frac{1}{n} \log \mathrm{Tr}_{F_n}(e^{-H_n(\Psi)}), \qquad s_F(\psi) := \lim_{n\to\infty} \frac{1}{n} S(\psi_n),
\]
where $\mathrm{Tr}_{F_n}$ is the usual trace on $F_n$ and $H_n(\Psi)$ is the local Hamiltonian of $\Psi$ inside the interval $[1, n]$. The uniqueness property above means (see [9, Proposition I.5.3] for example) that the pressure function $P_F(\cdot)$ on $B_0(F)$ is differentiable at $\Phi^h$. We have (see [14, (4.11)])
\[
p(\Phi, \phi) = P_F(\Phi^h) - \log d, \qquad \Phi \in B_0(A). \tag{1.8}
\]
By this and (1.2) we obtain
\[
p(\Phi + \Psi, \phi) = P_F(\Phi^h + \Psi) - \log d, \qquad \Psi \in B_0(A),
\]
which implies that $\Psi \in B_0(A) \mapsto p(\Psi, \phi)$ is differentiable at $\Phi$. Hence the required implication follows.

The next theorem is the correct formulation of what we wanted to show in [14], though in the restricted setup of gauge-invariant $C^*$-systems.

Theorem 1.6. If $\Phi \in B_0(A)$ and $\omega \in S(A)$, then the following conditions are equivalent:
(i) $\omega$ is a faithful and extremal $\alpha^{\Phi}$-KMS state;
(ii) $\omega$ is $\alpha^{\Phi}$-KMS with some chemical potential $\xi \in \Xi(G, \sigma)$;
(iii) $\omega$ satisfies the strong Gibbs condition with respect to $\Phi$ and some $\phi \in ET^f(A)$;
(iv) $\omega \in S_\theta(A)$ and $\omega$ satisfies the weak Gibbs condition with respect to $\Phi$ and some $\phi \in ET^f(A)$;
(v) $\omega \in S_\theta(A)$ and $\omega$ satisfies the variational principle with respect to $\Phi$ and some $\phi \in ET^f(A)$.

Proof. In view of Theorem 1.5 we only need to prove the equivalence between (i) and (ii). (i) $\Rightarrow$ (ii) is a consequence of the chemical potential theory in [4, Sec. II] and Proposition 1.1 as mentioned above (after Proposition 1.3). Conversely, suppose (ii) and let $\hat\omega$ be a (unique) KMS state of $F$ with respect to $\alpha^{\Phi}_t \gamma_{\xi_t} = \alpha^{\Phi^h}_t$ so that $\omega = \hat\omega|_A$. Since $\hat\omega$ is obviously faithful, so is $\omega$. Moreover, the extremality of $\omega$ in $S_\theta(A)$ follows from that of $\hat\omega$ in $S_\theta(F)$. This may be well known but we sketch the proof for convenience.

Let $(\hat\pi, \hat H, \hat\Omega, \hat U_\theta)$ be the cyclic representation of $F$ associated with $\hat\omega$, where $\hat U_\theta$ is a unitary implementing $\theta$ so that $\hat U_\theta \hat\Omega = \hat\Omega$ and $\hat\pi(\theta(a)) = \hat U_\theta \hat\pi(a) \hat U_\theta^*$ for $a \in F$. Then the cyclic representation of $A$ associated with $\omega$ is given by $H_\omega := \overline{\hat\pi(A)\hat\Omega}$ and $\pi_\omega(a) := \hat\pi(a)|_{H_\omega}$ for $a \in A$ with $\Omega_\omega := \hat\Omega$. Let $P : \hat H \to H_\omega$ be the orthogonal projection. Since $\hat U_\theta P = P \hat U_\theta$, $U^\omega_\theta := \hat U_\theta|_{H_\omega}$ is a unitary implementing $\theta|_A$. Let $\hat\sigma$ denote the modular automorphism group of $\hat\pi(F)''$ associated with $\hat\Omega$. Since $\hat\sigma_t(\hat\pi(a)) = \hat\pi(\alpha^{\Phi}_t(a)) \in \hat\pi(A)$ for all $a \in A$, there exists the conditional expectation $E : \hat\pi(F)'' \to \hat\pi(A)''$ with respect to the state $\langle \cdot\, \hat\Omega, \hat\Omega \rangle$ ([26]). Notice that $E$ is $\theta$-covariant, i.e., $E(\hat U_\theta x \hat U_\theta^*) = \hat U_\theta E(x) \hat U_\theta^*$ for all $x \in \hat\pi(F)''$.

Now, assume that $\omega_1 \in S_\theta(A)$ and $\omega_1 \le \lambda\omega$ for some $\lambda > 0$; hence there exists $T_1 \in \pi_\omega(A)'$ with $0 \le T_1 \le \lambda$ such that $\omega_1(a) = \langle T_1 \pi_\omega(a)\Omega_\omega, \Omega_\omega \rangle$ for $a \in A$, and $T_1 U^\omega_\theta = U^\omega_\theta T_1$. Define $T := T_1 P + (1 - P)$ on $\hat H$. Then it is easy to check that $0 \le T \le \lambda$, $T \in \hat\pi(A)'$ and $T \hat U_\theta = \hat U_\theta T$. Define
\[
\hat\omega_1(a) := \langle T E(\hat\pi(a))\hat\Omega, \hat\Omega \rangle, \qquad a \in F,
\]
which is a state of $F$ with $\hat\omega_1|_A = \omega_1$ and $\hat\omega_1 \le \lambda\hat\omega$. For any $a \in F$ we get
\[
\hat\omega_1(\theta(a)) = \langle T E(\hat U_\theta \hat\pi(a) \hat U_\theta^*)\hat\Omega, \hat\Omega \rangle = \langle T E(\hat\pi(a))\hat\Omega, \hat\Omega \rangle = \hat\omega_1(a)
\]
so that the extremality of $\hat\omega$ implies $\hat\omega_1 = \hat\omega$ and so $\omega_1 = \omega$. Hence $\omega$ is extremal in $S_\theta(A)$ (hence in $K(A, \Phi)$), and (ii) $\Rightarrow$ (i) is shown.

2. More about Variational Principle

In this section we consider the variational principle for $\omega \in S_\theta(A)$ in terms of the mean entropy and the pressure which are defined by use of canonical traces on local algebras (not with respect to a tracial state in $ET^f(A)$). Let $\nu$ be the restriction of $\bigotimes_{\mathbb{Z}} \tau_0$ to $A$, which is an element of $ET^f(A)$ corresponding to the trivial chemical potential $\xi = 1$. For each $n \in \mathbb{N}$ the $n$-fold tensor product $\bigotimes_1^n \sigma$ of the unitary representation $\sigma$ is decomposed as
\[
\bigotimes_1^n \sigma = m_1\sigma_1 \oplus m_2\sigma_2 \oplus \cdots \oplus m_{K_n}\sigma_{K_n},
\]
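where $\sigma_i \in \hat G$, $1 \le i \le K_n$, are contained in $\bigotimes_1^n \sigma$ with multiplicities $m_i$. For $1 \le i \le K_n$ let $d_i$ be the dimension of $\sigma_i$. Then, we have $\sum_{i=1}^{K_n} m_i d_i = d^n$ and
\begin{align}
A_n &= \bigoplus_{i=1}^{K_n} (M_{m_i}(\mathbb{C}) \otimes 1_{d_i}) \cong \bigoplus_{i=1}^{K_n} M_{m_i}(\mathbb{C}), \tag{2.1} \\
F_n \cap A_n' &= \bigoplus_{i=1}^{K_n} (1_{m_i} \otimes M_{d_i}(\mathbb{C})) \cong \bigoplus_{i=1}^{K_n} M_{d_i}(\mathbb{C}). \tag{2.2}
\end{align}
The canonical traces $\mathrm{Tr}_{A_n}$ on $A_n$ and $\mathrm{Tr}_{A_n'}$ on $F_n \cap A_n'$ are written as
\begin{align*}
\mathrm{Tr}_{A_n}\Bigl( \sum_i a_i \otimes 1_{d_i} \Bigr) &= \sum_i \mathrm{Tr}_{m_i}(a_i), \qquad a_i \in M_{m_i}(\mathbb{C}),\ 1 \le i \le K_n, \\
\mathrm{Tr}_{A_n'}\Bigl( \sum_i 1_{m_i} \otimes b_i \Bigr) &= \sum_i \mathrm{Tr}_{d_i}(b_i), \qquad b_i \in M_{d_i}(\mathbb{C}),\ 1 \le i \le K_n,
\end{align*}
where $\mathrm{Tr}_m$ denotes the usual trace on $M_m(\mathbb{C})$.

Lemma 2.1.
(1) If $\omega \in S_\theta(A)$, then $\lim_{n\to\infty} \frac{1}{n} S(\omega_n)$ exists and
\[
\lim_{n\to\infty} \frac{1}{n} S(\omega_n) = -S_M(\omega, \nu) + \log d,
\]
where $S(\omega_n)$ is the von Neumann entropy of $\omega_n$ with respect to $\mathrm{Tr}_{A_n}$, i.e.,
\[
S(\omega_n) := -\mathrm{Tr}_{A_n}\!\left( \frac{d\omega_n}{d\mathrm{Tr}_{A_n}} \log \frac{d\omega_n}{d\mathrm{Tr}_{A_n}} \right) = -\omega_n\!\left( \log \frac{d\omega_n}{d\mathrm{Tr}_{A_n}} \right).
\]
(2) If $\Phi \in B(A)$, then $\lim_{n\to\infty} \frac{1}{n} \log \mathrm{Tr}_{A_n}(e^{-H_n})$ exists and
\[
\lim_{n\to\infty} \frac{1}{n} \log \mathrm{Tr}_{A_n}(e^{-H_n}) = p(\Phi, \nu) + \log d.
\]

Proof. (1) Notice that
\[
S(\omega_n) = -S(\omega_n, \nu_n) - \omega_n\!\left( \log \frac{d\nu_n}{d\mathrm{Tr}_{A_n}} \right).
\]
Representing $A_n = \bigoplus_{i=1}^{K_n} (M_{m_i}(\mathbb{C}) \otimes 1_{d_i})$ as in (2.1), we have
\[
\frac{d\nu_n}{d\mathrm{Tr}_{A_n}} = d^{-n} \sum_{i=1}^{K_n} d_i\, 1_{m_i} \otimes 1_{d_i},
\]
because
\[
d^n \nu_n\Bigl( \sum_i a_i \otimes 1_{d_i} \Bigr) = \sum_i \mathrm{Tr}_{m_i}(a_i)\, \mathrm{Tr}_{d_i}(1_{d_i}) = \sum_i d_i\, \mathrm{Tr}_{m_i}(a_i) \tag{2.3}
\]
for $a_i \in M_{m_i}(\mathbb{C})$, $1 \le i \le K_n$. Therefore,
\[
d^{-n}\, 1_{A_n} \le \frac{d\nu_n}{d\mathrm{Tr}_{A_n}} \le d^{-n} \Bigl( \max_{1 \le i \le K_n} d_i \Bigr) 1_{A_n}. \tag{2.4}
\]
This implies that
\[
0 \le \omega_n\!\left( \log \frac{d\nu_n}{d\mathrm{Tr}_{A_n}} \right) + n \log d \le \log \max_{1 \le i \le K_n} d_i.
\]
As is well known (see a brief explanation in [14, p. 844] for example), the representation ring of any compact group has polynomial growth; so we have
\[
\lim_{n\to\infty} \frac{1}{n} \log \max_{1 \le i \le K_n} d_i = 0. \tag{2.5}
\]
This implies the desired conclusion.

(2) By (2.4) we get
\[
\mathrm{Tr}_{A_n}(e^{-H_n}) \le d^n \nu_n(e^{-H_n}) \le \Bigl( \max_{1 \le i \le K_n} d_i \Bigr) \mathrm{Tr}_{A_n}(e^{-H_n}),
\]
implying the result.

In view of the above lemma we define the mean entropy of $\omega \in S_\theta(A)$ by
\[
s_A(\omega) := \lim_{n\to\infty} \frac{1}{n} S(\omega_n) \quad (= -S_M(\omega, \nu) + \log d),
\]
and the pressure of $\Phi \in B(A)$ by
\[
P_A(\Phi) := \lim_{n\to\infty} \frac{1}{n} \log \mathrm{Tr}_{A_n}(e^{-H_n}) \quad (= p(\Phi, \nu) + \log d).
\]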
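To make the decomposition (2.1)-(2.5) concrete, the following worked case may help. It is an illustration only: the choice $G = SU(2)$ with $\sigma$ its defining two-dimensional representation (so $d = 2$) and $n = 2$ is an assumption made here and is not taken from the text.

```latex
% Hypothetical example: G = SU(2), \sigma = defining representation, d = 2, n = 2.
% Clebsch-Gordan: \sigma \otimes \sigma = (singlet) \oplus (triplet), so K_2 = 2.
\[
  m_1 = m_2 = 1, \qquad d_1 = 1, \quad d_2 = 3, \qquad
  \sum_{i=1}^{2} m_i d_i = 1 + 3 = 4 = d^{\,2},
\]
\[
  A_2 \cong \mathbb{C} \oplus \mathbb{C}, \qquad
  F_2 \cap A_2' \cong M_1(\mathbb{C}) \oplus M_3(\mathbb{C}),
\]
% Density of \nu_2 with respect to the canonical trace, following the formula above (2.3):
\[
  \frac{d\nu_2}{d\mathrm{Tr}_{A_2}}
  = 2^{-2}\bigl( 1 \cdot 1_{1}\otimes 1_{1} \;+\; 3 \cdot 1_{1}\otimes 1_{3} \bigr),
  \qquad \nu_2(1) = \tfrac14 + \tfrac34 = 1,
\]
% which respects the bounds (2.4): 1/4 <= density <= 3/4.
% For general n in this example the largest irreducible component has dimension n + 1,
% so (1/n) \log \max_i d_i = (1/n) \log (n+1) \to 0, in line with (2.5).
```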
The variational expression (1.5) in case of $\phi = \nu$ is rewritten as
\[
P_A(\Phi) = \sup\{s_A(\omega) - \omega(A_\Phi) : \omega \in S_\theta(A)\}.
\]

Proposition 2.2. Let $\Phi \in B_0(A)$ and $\xi \in \Xi(G, \sigma)$ with the generator $h$. Assume that $\xi$ is central, i.e., $\xi_t$ belongs to the center of $G$ for any $t$ (this is the case if $G$ is abelian). Then $\Phi^h$ defined by (1.2) is an interaction in $A$, and $\omega \in S_\theta(A)$ is $\alpha^{\Phi}$-KMS with chemical potential $\xi$ if and only if it satisfies the variational principle
\[
P_A(\Phi^h) = s_A(\omega) - \omega(A_{\Phi^h}). \tag{2.6}
\]
In particular, $\omega$ is $\alpha^{\Phi}$-KMS with trivial chemical potential if and only if it satisfies $P_A(\Phi) = s_A(\omega) - \omega(A_\Phi)$.

Proof. The assumption of $\xi$ being central implies that $\mathrm{Ad}\,\sigma_g(\sigma_{\xi_t}) = \sigma_{\xi_t}$ for all $g \in G$ and $t \in \mathbb{R}$. Hence, it is immediately seen that
\[
\bigotimes_\Lambda e^{-h} = \exp\Bigl( -\sum_{j \in \Lambda} \theta^j(h) \Bigr)
\]
is in $A_\Lambda$ for any finite $\Lambda \subset \mathbb{Z}$ and so the interaction $\Phi^h$ is in $A$. Let $\phi$ be an element of $ET^f(A)$ corresponding to $\xi$ as in Proposition 1.3. We may show that (2.6) is equivalent to the variational principle (1.7) with respect to $\phi$. Since $A_{\Phi^h} = A_\Phi + h$, it suffices to prove the following two expressions:
\[
p(\Phi, \phi) = P_A(\Phi^h) - \log d \tag{2.7}
\]
and for every $\omega \in S_\theta(A)$
\[
-S_M(\omega, \phi) = s_A(\omega) - \omega(h) - \log d. \tag{2.8}
\]
Let $H_n(\Phi^h)$ be the local Hamiltonian of $\Phi^h$ inside the interval $[1, n]$. Since
\[
\phi_n(e^{-H_n}) = \nu_n\Bigl( \Bigl( \bigotimes_{j=1}^n e^{-h} \Bigr) e^{-H_n} \Bigr) = \nu_n(e^{-H_n(\Phi^h)}) = \mathrm{Tr}_{A_n}\!\left( \frac{d\nu_n}{d\mathrm{Tr}_{A_n}}\, e^{-H_n(\Phi^h)} \right),
\]
we obtain (2.7) thanks to (2.4) and (2.5). On the other hand, since
\begin{align}
-S(\omega_n, \phi_n) &= S(\omega_n) + \omega_n\!\left( \log \frac{d\phi_n}{d\mathrm{Tr}_{A_n}} \right) \notag \\
&= S(\omega_n) + \omega_n\!\left( \log \Bigl( \frac{d\nu_n}{d\mathrm{Tr}_{A_n}} \bigotimes_{j=1}^n e^{-h} \Bigr) \right) \notag \\
&= S(\omega_n) - n\omega(h) + \omega_n\!\left( \log \frac{d\nu_n}{d\mathrm{Tr}_{A_n}} \right), \tag{2.9}
\end{align}
the expression (2.8) follows.

3. Entropy Densities

From now on let $F$, $G$, $\sigma$, $\gamma$, $A$, $\theta$, etc. be as in the previous sections. Let $\Phi \in B_0(A)$ be given and $\alpha^{\Phi}$ be the associated one-parameter automorphism group. Furthermore, let $\phi \in ET^f(A)$ and the corresponding $\xi \in \Xi(G, \sigma)$ with generator $h$ be given as in Proposition 1.3; hence $\phi$ extends to the $\gamma_\xi$-KMS state $\hat\phi$ of $F$. For each $n \in \mathbb{N}$ we then have the local Gibbs state of $A_n$ with respect to $\Phi$ and $\phi$ given by
\[
\phi^G_n(a) := \frac{\phi(e^{-H_n} a)}{\phi(e^{-H_n})}, \qquad a \in A_n,
\]
and the local Gibbs state of $F_n$ with respect to $\Phi^h$ given by
\[
\hat\phi^G_n(a) := \frac{\mathrm{Tr}_{F_n}(e^{-H_n(\Phi^h)} a)}{\mathrm{Tr}_{F_n}(e^{-H_n(\Phi^h)})}, \qquad a \in F_n.
\]
The notation $\hat\phi^G_n$ is justified as follows: since $\bigotimes_1^n e^{-h}$ and $e^{-H_n}$ commute (see the proof of [14, Proposition 4.3]), $\hat\phi^G_n$ is written as
\[
\hat\phi^G_n(a) = \frac{\mathrm{Tr}_{F_n}\bigl( (\bigotimes_1^n e^{-h})\, e^{-H_n} a \bigr)}{\mathrm{Tr}_{F_n}\bigl( (\bigotimes_1^n e^{-h})\, e^{-H_n} \bigr)} = \frac{\hat\phi(e^{-H_n} a)}{\hat\phi(e^{-H_n})}, \qquad a \in F_n. \tag{3.1}
\]
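With these notations we have

Theorem 3.1. Let $\omega$ be an $\alpha^{\Phi}$-KMS state of $A$ with chemical potential $\xi$ and $\hat\omega$ be the $\alpha^{\Phi}\gamma_\xi$-KMS state of $F$ extending $\omega$. Then
\begin{align*}
S_M(\omega, \phi) &= \lim_{n\to\infty} \frac{1}{n} S(\phi^G_n, \phi_n) = \lim_{n\to\infty} \frac{1}{n} S(\hat\phi^G_n, \hat\phi_n) \\
&= S_M(\hat\omega, \hat\phi) = -s_F(\hat\omega) + \hat\omega(h) + \log d
\end{align*}
and
\[
s_F(\hat\omega) = \lim_{n\to\infty} \frac{1}{n} S(\hat\phi^G_n) = \lim_{n\to\infty} \frac{1}{n} S(\phi^G_n),
\]
where $s_F(\hat\omega) := \lim_{n\to\infty} \frac{1}{n} S(\hat\omega_n)$, the mean entropy of $\hat\omega$. In particular, if $\xi$ is central, then $s_A(\omega) = s_F(\hat\omega)$.

Proof. The following proof of $S_M(\omega, \phi) = \lim_{n\to\infty} \frac{1}{n} S(\phi^G_n, \phi_n)$ is a slight modification of [20, Theorem 2.1]. The proof of Theorem 1.5 says that $\Psi \in B_0(A) \mapsto p(\Psi, \phi)$ is differentiable at $\Phi$ with the tangent functional $\Psi \in B_0(A) \mapsto -\omega(A_\Psi)$. Hence we have
\[
\frac{d}{d\beta} p(\beta\Phi, \phi) \Big|_{\beta=1} = -\omega(A_\Phi). \tag{3.2}
\]
Furthermore, we obtain
\[
\frac{d}{d\beta} \frac{1}{n} \log \phi(e^{-H_n(\beta\Phi)}) \Big|_{\beta=1} = \frac{1}{n} \frac{\phi(e^{-H_n}(-H_n))}{\phi(e^{-H_n})} = -\frac{1}{n} \phi^G_n(H_n), \tag{3.3}
\]
and as in [20]
\[
\lim_{n\to\infty} \frac{d}{d\beta} \frac{1}{n} \log \phi(e^{-H_n(\beta\Phi)}) \Big|_{\beta=1} = \frac{d}{d\beta} p(\beta\Phi, \phi) \Big|_{\beta=1}. \tag{3.4}
\]
Combining (3.2)-(3.4) yields $\lim_{n\to\infty} \frac{1}{n} \phi^G_n(H_n) = \omega(A_\Phi)$. Therefore, Theorem 1.5 implies
\begin{align*}
S_M(\omega, \phi) &= -p(\Phi, \phi) - \omega(A_\Phi) \\
&= \lim_{n\to\infty} \frac{1}{n} \bigl( -\log \phi(e^{-H_n}) - \phi^G_n(H_n) \bigr) \\
&= \lim_{n\to\infty} \frac{1}{n} \phi^G_n\!\left( \log \frac{d\phi^G_n}{d\phi_n} \right) \\
&= \lim_{n\to\infty} \frac{1}{n} S(\phi^G_n, \phi_n).
\end{align*}
On the other hand, $\hat\omega$ satisfies the variational principle with respect to $\Phi^h$, i.e.,
\[
P_F(\Phi^h) = s_F(\hat\omega) - \hat\omega(A_{\Phi^h}).
\]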
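Since $A_{\Phi^h} = A_\Phi + h$, this and (1.8) imply
\begin{align}
S_M(\omega, \phi) &= -p(\Phi, \phi) - \omega(A_\Phi) \notag \\
&= -s_F(\hat\omega) + \hat\omega(A_\Phi + h) + \log d - \omega(A_\Phi) \notag \\
&= -s_F(\hat\omega) + \hat\omega(h) + \log d. \tag{3.5}
\end{align}
Since $d\hat\phi_n / d\mathrm{Tr}_{F_n} = d^{-n} \bigotimes_{j=1}^n e^{-h}$, we have
\begin{align*}
S(\hat\omega_n, \hat\phi_n) &= -S(\hat\omega_n) - \hat\omega_n\!\left( \log \frac{d\hat\phi_n}{d\mathrm{Tr}_{F_n}} \right) \\
&= -S(\hat\omega_n) + \sum_{j=1}^n \hat\omega(\theta^j(h)) + n \log d \\
&= -S(\hat\omega_n) + n\hat\omega(h) + n \log d
\end{align*}
so that
\[
S_M(\hat\omega, \hat\phi) = -s_F(\hat\omega) + \hat\omega(h) + \log d.
\]
Furthermore,
\begin{align*}
S(\hat\phi^G_n, \hat\phi_n) &= -S(\hat\phi^G_n) + \sum_{j=1}^n \hat\phi^G_n(\theta^j(h)) + n \log d \\
&= -S(\hat\phi^G_n) + \sum_{j=1}^n \hat\phi^G_{[1-j,\, n-j]}(h) + n \log d.
\end{align*}
By [20] we have $s_F(\hat\omega) = \lim_{n\to\infty} \frac{1}{n} S(\hat\phi^G_n)$. The uniqueness of the $\alpha^{\Phi}\gamma_\xi$ ($= \alpha^{\Phi^h}$)-KMS state implies that $\hat\phi^G_{[-\ell, m]} \to \hat\omega$ weakly* as $\ell, m \to \infty$. For each $\varepsilon > 0$ one can choose $n_0 \in \mathbb{N}$ such that $|\hat\phi^G_{[-\ell, m]}(h) - \hat\omega(h)| \le \varepsilon$ for all $\ell, m \ge n_0$. If $n > 2n_0$ and $n_0 < j \le n - n_0$, then $j - 1 \ge n_0$ and $n - j \ge n_0$ so that $|\hat\phi^G_{[1-j,\, n-j]}(h) - \hat\omega(h)| \le \varepsilon$. The remaining $2n_0$ indices $j$ each contribute at most $2\|h\|$, and hence we have
\[
\Bigl| \frac{1}{n} \sum_{j=1}^n \hat\phi^G_{[1-j,\, n-j]}(h) - \hat\omega(h) \Bigr| \le \frac{4\|h\| n_0}{n} + \varepsilon.
\]
This shows that
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^n \hat\phi^G_{[1-j,\, n-j]}(h) = \hat\omega(h).
\]
Therefore,
\[
\lim_{n\to\infty} \frac{1}{n} S(\hat\phi^G_n, \hat\phi_n) = -s_F(\hat\omega) + \hat\omega(h) + \log d,
\]
and the proof of the first part is completed. The last assertion follows from (2.6) and (3.5). It remains to prove
\[
\lim_{n\to\infty} \frac{1}{n} S(\phi^G_n) = \lim_{n\to\infty} \frac{1}{n} S(\hat\phi^G_n). \tag{3.6}
\]
This will be proven after the following lemma.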
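Lemma 3.2. Under (2.1) and (2.2) let
\[
D^0 = \sum_{i=1}^{K_n} D^0_i \otimes 1_{d_i} \in A_n, \qquad D' = \sum_{i=1}^{K_n} 1_{m_i} \otimes D'_i \in F_n \cap A_n'
\]
with positive semidefinite matrices $D^0_i \in M_{m_i}(\mathbb{C})$ and $D'_i \in M_{d_i}(\mathbb{C})$ such that $\mathrm{Tr}_{F_n}(D^0 D') = 1$. Then $D := D^0 D'$ is a density matrix with respect to $\mathrm{Tr}_{F_n}$. If $D|_{A_n}$ is the density matrix of $\mathrm{Tr}_{F_n}(D\,\cdot)|_{A_n}$ with respect to $\mathrm{Tr}_{A_n}$, then
\[
|S(D|_{A_n}) - S(D)| \le \log \max_{1 \le i \le K_n} d_i,
\]
where $S(D)$ is the von Neumann entropy of $D$ with respect to $\mathrm{Tr}_{F_n}$ and $S(D|_{A_n})$ is that of $D|_{A_n}$ with respect to $\mathrm{Tr}_{A_n}$ (see (2.3)).

Proof. The first assertion is obvious. Let $E_{A_n}$ denote the conditional expectation from $F_n$ onto $A_n$ with respect to $\mathrm{Tr}_{F_n}$. Notice that
\[
S(E_{A_n}(D)) - S(D) = \mathrm{Tr}_{F_n}\bigl( D \log D - E_{A_n}(D) \log E_{A_n}(D) \bigr) = S(D, E_{A_n}(D)),
\]
the relative entropy of the densities $D$ and $E_{A_n}(D)$ in $F_n$. Set $H^0_i := D^0_i / \mathrm{Tr}_{m_i}(D^0_i)$, $H'_i := D'_i / \mathrm{Tr}_{d_i}(D'_i)$ and $D_i := H^0_i \otimes H'_i$. The joint convexity of relative entropy implies
\[
S(D, E_{A_n}(D)) \le \sum_{i=1}^{K_n} \mathrm{Tr}_{m_i}(D^0_i)\, \mathrm{Tr}_{d_i}(D'_i)\, S(D_i, E_{A_n}(D_i)).
\]
Since $E_{A_n}(D_i) = d_i^{-1} H^0_i \otimes 1_{d_i}$, we get
\begin{align*}
S(D_i, E_{A_n}(D_i)) &= \mathrm{Tr}_{F_n}\Bigl( D_i \bigl( \log H^0_i \otimes 1_{d_i} + 1_{m_i} \otimes \log H'_i - \log H^0_i \otimes 1_{d_i} + (\log d_i)\, 1_{m_i} \otimes 1_{d_i} \bigr) \Bigr) \\
&= \mathrm{Tr}_{d_i}(H'_i \log H'_i) + \log d_i \le \log d_i.
\end{align*}
Therefore,
\[
0 \le S(E_{A_n}(D)) - S(D) \le \log \max_{1 \le i \le K_n} d_i. \tag{3.7}
\]
Next, since for $a = \sum_i a_i \otimes 1_{d_i} \in A_n$
\begin{align*}
\mathrm{Tr}_{F_n}(a D') &= \mathrm{Tr}_{F_n}\Bigl( \sum_{i=1}^{K_n} a_i \otimes D'_i \Bigr) = \sum_{i=1}^{K_n} \mathrm{Tr}_{m_i}(a_i)\, \mathrm{Tr}_{d_i}(D'_i) \\
&= \mathrm{Tr}_{F_n}\Bigl( a \sum_{i=1}^{K_n} \frac{\mathrm{Tr}_{d_i}(D'_i)}{d_i}\, 1_{m_i} \otimes 1_{d_i} \Bigr),
\end{align*}
we get
\[
E_{A_n}(D') = \sum_{i=1}^{K_n} \frac{\mathrm{Tr}_{d_i}(D'_i)}{d_i}\, 1_{m_i} \otimes 1_{d_i}
\]
so that
\[
E_{A_n}(D) = D^0 E_{A_n}(D') = \sum_{i=1}^{K_n} \frac{\mathrm{Tr}_{d_i}(D'_i)}{d_i}\, D^0_i \otimes 1_{d_i}.
\]
Hence we have
\begin{align*}
S(E_{A_n}(D)) &= -\mathrm{Tr}_{F_n}\Bigl( \sum_{i=1}^{K_n} \frac{\mathrm{Tr}_{d_i}(D'_i)}{d_i}\, D^0_i \otimes 1_{d_i} \Bigl( \log D^0_i \otimes 1_{d_i} + \bigl( \log \mathrm{Tr}_{d_i}(D'_i) - \log d_i \bigr) 1_{m_i} \otimes 1_{d_i} \Bigr) \Bigr) \\
&= -\sum_{i=1}^{K_n} \mathrm{Tr}_{d_i}(D'_i)\, \mathrm{Tr}_{m_i}(D^0_i \log D^0_i) - \sum_{i=1}^{K_n} \mathrm{Tr}_{m_i}(D^0_i)\, \mathrm{Tr}_{d_i}(D'_i) \bigl( \log \mathrm{Tr}_{d_i}(D'_i) - \log d_i \bigr).
\end{align*}
On the other hand, since $D|_{A_n}$ is $\bigoplus_{i=1}^{K_n} \mathrm{Tr}_{d_i}(D'_i) D^0_i$ as an element of $\bigoplus_{i=1}^{K_n} M_{m_i}(\mathbb{C})$, we have
\begin{align*}
S(D|_{A_n}) &= -\sum_{i=1}^{K_n} \mathrm{Tr}_{m_i}\Bigl( \mathrm{Tr}_{d_i}(D'_i) D^0_i \bigl( \log D^0_i + \log \mathrm{Tr}_{d_i}(D'_i) \bigr) \Bigr) \\
&= -\sum_{i=1}^{K_n} \mathrm{Tr}_{d_i}(D'_i)\, \mathrm{Tr}_{m_i}(D^0_i \log D^0_i) - \sum_{i=1}^{K_n} \mathrm{Tr}_{m_i}(D^0_i)\, \mathrm{Tr}_{d_i}(D'_i) \log \mathrm{Tr}_{d_i}(D'_i).
\end{align*}
Therefore,
\[
S(E_{A_n}(D)) - S(D|_{A_n}) = \sum_{i=1}^{K_n} \mathrm{Tr}_{m_i}(D^0_i)\, \mathrm{Tr}_{d_i}(D'_i) \log d_i
\]
so that
\[
0 \le S(E_{A_n}(D)) - S(D|_{A_n}) \le \log \max_{1 \le i \le K_n} d_i. \tag{3.8}
\]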
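Combining (3.7) and (3.8) gives the conclusion.

Proof of (3.6). Let $\hat D^G_n$ be the density of the local Gibbs state $\hat\phi^G_n$ with respect to $\mathrm{Tr}_{F_n}$, which is written as
\[
\hat D^G_n = \frac{(\bigotimes_1^n e^{-h})\, e^{-H_n}}{\mathrm{Tr}_{F_n}\bigl( (\bigotimes_1^n e^{-h})\, e^{-H_n} \bigr)}. \tag{3.9}
\]
This is obviously of the form of $D$ in Lemma 3.2, i.e., the product of an element of $A_n$ and an element of $F_n \cap A_n'$. Furthermore, since $\mathrm{Tr}_{F_n}(\hat D^G_n\,\cdot)|_{A_n} = \hat\phi^G_n|_{A_n} = \phi^G_n$ thanks to (3.1), it follows that the density of $\phi^G_n$ with respect to $\mathrm{Tr}_{A_n}$ is $\hat D^G_n|_{A_n}$ (in the notation of Lemma 3.2). Hence, Lemma 3.2 implies
\[
|S(\phi^G_n) - S(\hat\phi^G_n)| \le \log \max_{1 \le i \le K_n} d_i
\]
so that we obtain (3.6) thanks to (2.5).

4. Macroscopic Uniformity

Let $\phi \in ET^f(A)$ and $0 < \varepsilon < 1$. For each $n \in \mathbb{N}$ and for each state $\psi$ of $A_n$ we define the two quantities
\begin{align*}
\beta_\varepsilon(\psi) &:= \min\{\mathrm{Tr}_{A_n}(q) : q \in A_n \text{ is a projection with } \psi(q) \ge 1 - \varepsilon\}, \\
\beta_\varepsilon(\psi, \phi_n) &:= \min\{\phi_n(q) : q \in A_n \text{ is a projection with } \psi(q) \ge 1 - \varepsilon\}.
\end{align*}
For each state $\psi'$ of $F_n$ the quantities $\beta_\varepsilon(\psi')$ and $\beta_\varepsilon(\psi', \hat\phi_n)$ are defined in a similar way with $F_n$ instead of $A_n$. The aim of this section is to prove

Theorem 4.1. Let $\Phi$, $\phi$, $\xi$ and $h$ be as in Theorem 1.5, and let $\omega$ be an $\alpha^{\Phi}$-KMS state of $A$ with chemical potential $\xi$. Then, for every $0 < \varepsilon < 1$,
\begin{align}
-S_M(\omega, \phi) &= \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\omega_n, \phi_n) \tag{4.1} \\
&= \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\phi^G_n, \phi_n) = \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\phi^G_n, \hat\phi_n). \tag{4.2}
\end{align}
Moreover, if $\xi$ is central, then for every $0 < \varepsilon < 1$,
\begin{align}
s_A(\omega) &= \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\omega_n) = \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\phi^G_n) \tag{4.3} \\
&= \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\omega_n) = \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\phi^G_n). \tag{4.4}
\end{align}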
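A quick consistency check on the definition of $\beta_\varepsilon$ may be useful before the proof. It is an added illustration and uses only the definition above; whether the degenerate case of two equal states is literally covered by the hypotheses of Theorem 4.1 is not claimed here.

```latex
% Degenerate case: the tested state coincides with the reference state \phi_n.
% Every admissible projection q satisfies \phi_n(q) >= 1 - \varepsilon, and q = 1 is admissible, so
\[
  1 - \varepsilon \;\le\; \beta_\varepsilon(\phi_n, \phi_n) \;\le\; 1,
  \qquad\text{hence}\qquad
  \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\phi_n, \phi_n) = 0 .
\]
% This matches the shape of (4.1), since the relative entropy of a state with itself vanishes:
% S(\phi_n, \phi_n) = 0 for every n.
```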
To prove the theorem, we modify the proofs of [13, Theorems 3.1 and 3.3]. Let $\omega$ be as in the theorem and $(\pi_\omega, H_\omega, \Omega_\omega)$ be the cyclic representation of $A$ associated with $\omega$. For each $n \in \mathbb{N}$ set
\[
D_n := \frac{d\omega_n}{d\phi_n} \qquad \text{and} \qquad D^G_n := \frac{d\phi^G_n}{d\phi_n} = \frac{e^{-H_n}}{\phi(e^{-H_n})}.
\]

Lemma 4.2. For every $n \in \mathbb{N}$,
\[
\|\log D^G_n - \log D_n\| \le 2\|W_n\|.
\]

Proof. For every state $\psi$ of $A_n$ let $\tilde\psi$ be the state of $\pi_\omega(A_n)$ such that $\psi = \tilde\psi \circ \pi_\omega|_{A_n}$; in particular, let $\tilde\phi^G_n$ be that for $\phi^G_n$. Moreover, let $\tilde\omega$ be the normal extension of $\omega$ to $\pi_\omega(A)''$; so $\tilde\omega_n = \tilde\omega|_{\pi_\omega(A_n)}$. Note (see [14, p. 826]) that the normal extension $[\omega^{-W_n}]^{\sim}$ of $[\omega^{-W_n}]$ coincides with the perturbed state $[\tilde\omega^{-\pi_\omega(W_n)}]$. There exists the conditional expectation $E_n$ from $\pi_\omega(A)''$ onto $\pi_\omega(A_n)$ with respect to $[\omega^{-W_n}]^{\sim}$ because $\pi_\omega(A_n)$ is globally invariant under the modular automorphism associated with this state. (See the proof of [14, Theorem 2.2, (i) $\Rightarrow$ (ii)]; this part of the proof of [14, Theorem 2.2] is valid.) Then, we successively estimate
\begin{align}
S(\psi, \omega_n) = S(\tilde\psi, \tilde\omega_n) &\le S(\tilde\psi \circ E_n, \tilde\omega) \notag \\
&\le S(\tilde\psi \circ E_n, [\omega^{-W_n}]^{\sim}) + 2\|W_n\| \notag \\
&= S(\tilde\psi \circ E_n, \tilde\phi^G_n \circ E_n) + 2\|W_n\| \notag \\
&= S(\psi, \phi^G_n) + 2\|W_n\|. \tag{4.5}
\end{align}
Here, the first inequality is the monotonicity of relative entropy [22, 5.12(iii)] under the restriction of the states of $\pi_\omega(A)''$ to its subalgebra $\pi_\omega(A_n)$, and the second is due to (1.3). The second equality follows because Theorem 1.5 ((ii) or (iii)) gives $[\omega^{-W_n}]^{\sim} = \tilde\phi^G_n \circ E_n$. The last equality is seen by applying the monotonicity of relative entropy in two ways (or by [22, 5.15]). We now obtain
\[
\psi(\log D^G_n - \log D_n) = S(\psi, \omega_n) - S(\psi, \phi^G_n) \le 2\|W_n\|
\]
for all states $\psi$ of $A_n$, which implies the conclusion.

Lemma 4.3. For the densities $D_n$ and $D^G_n$,
\[
\lim_{n\to\infty} \frac{1}{n} \pi_\omega(-\log D_n) = \lim_{n\to\infty} \frac{1}{n} \pi_\omega(-\log D^G_n) = -S_M(\omega, \phi)\,1 \quad \text{strongly}.
\]

Proof. Since $\omega$ is extremal in $S_\theta(A)$, the mean ergodic theorem says that
\[
\lim_{n\to\infty} \pi_\omega\Bigl( \frac{1}{n} \sum_{j=1}^n \theta^j(A_\Phi) \Bigr) = \omega(A_\Phi)\,1 \quad \text{strongly}.
\]
Since it follows as in [13] that
\[
\lim_{n\to\infty} \frac{1}{n} \Bigl\| \sum_{j=1}^n \theta^j(A_\Phi) - H_n \Bigr\| = 0,
\]
we have
\[
\lim_{n\to\infty} \frac{1}{n} \pi_\omega(H_n) = \omega(A_\Phi)\,1 \quad \text{strongly}. \tag{4.6}
\]
Therefore, we obtain the strong convergence
\begin{align}
\frac{1}{n} \pi_\omega(-\log D^G_n) &= \frac{1}{n} \pi_\omega(H_n) + \Bigl( \frac{1}{n} \log \phi(e^{-H_n}) \Bigr) 1 \notag \\
&\to \bigl( \omega(A_\Phi) + p(\Phi, \phi) \bigr) 1 = -S_M(\omega, \phi)\,1 \tag{4.7}
\end{align}
due to the variational principle of $\omega$ in Theorem 1.5.

Next, let $a_n := -\frac{1}{n} \log D_n$ and $b_n := -\frac{1}{n} \log D^G_n + \frac{2}{n}\|W_n\|$; so $\pi_\omega(b_n) \to -S_M(\omega, \phi)\,1$ strongly by what is already shown. We get $a_n \le b_n$ by Lemma 4.2,
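and moreover
\begin{align*}
a_n &= -\frac{1}{n} \log \frac{d\omega_n}{d\mathrm{Tr}_{A_n}} + \frac{1}{n} \log \frac{d\phi_n}{d\mathrm{Tr}_{A_n}} \ge \frac{1}{n} \log \frac{d\phi_n}{d\mathrm{Tr}_{A_n}} \\
&= -\frac{1}{n} \sum_{j=1}^n \theta^j(h) + \frac{1}{n} \log \frac{d\nu_n}{d\mathrm{Tr}_{A_n}} \ge -\|h\| - \log d
\end{align*}
(see (2.9) and (2.4)). Hence $\{b_n - a_n\}$ is uniformly bounded. Since
\[
\|\pi_\omega(b_n - a_n)\Omega_\omega\|^2 \le \Bigl( \sup_m \|b_m - a_m\| \Bigr) \omega(b_n - a_n) \to \Bigl( \sup_m \|b_m - a_m\| \Bigr) \bigl( -S_M(\omega, \phi) + S_M(\omega, \phi) \bigr) = 0,
\]
we have $\pi_\omega(b_n - a_n) \to 0$ strongly because $\Omega_\omega$ is separating for $\pi_\omega(A)''$. Hence $\pi_\omega(a_n) \to -S_M(\omega, \phi)\,1$ strongly.

Lemma 4.4. Let $n(1) < n(2) < \cdots$ be positive integers, and let $a_k \in A_{n(k)}$ be a positive contraction for each $k \in \mathbb{N}$.
(i) If $\inf_k \omega(a_k) > 0$, then
\[
\lim_{k\to\infty} \frac{1}{n(k)} \log \phi^G_{n(k)}(a_k) = 0.
\]
(ii) If $\inf_k \phi^G_{n(k)}(a_k) > 0$, then $\inf_k \omega(a_k) > 0$.
(iii) If $\lim_{k\to\infty} \omega(a_k) = 1$, then $\lim_{k\to\infty} \phi^G_{n(k)}(a_k) = 1$.
The above assertions (i)-(iii) hold also for $F_{n(k)}$, $\hat\omega$ and $\hat\phi^G_{n(k)}$ instead of $A_{n(k)}$, $\omega$ and $\phi^G_{n(k)}$, respectively.

Proof. The last assertion is contained in [13, Lemma 3.2]. Let
\[
F(s_1, s_2) := s_1 \log \frac{s_1}{s_2} + (1 - s_1) \log \frac{1 - s_1}{1 - s_2}, \qquad 0 \le s_1, s_2 \le 1.
\]
If the conclusion of (i) does not hold, then one may assume by taking a subsequence that $\phi^G_{n(k)}(a_k) \le e^{-n(k)\eta}$, $k \in \mathbb{N}$, for some $\eta > 0$. Using the monotonicity of relative entropy [22, 5.12(iii)] applied to the map $\alpha : \mathbb{C}^2 \to A_{n(k)}$, $\alpha(t_1, t_2) := t_1 a_k + t_2(1 - a_k)$, we have
\begin{align*}
S(\omega_{n(k)}, \phi^G_{n(k)}) &\ge S(\omega_{n(k)} \circ \alpha, \phi^G_{n(k)} \circ \alpha) = F(\omega_{n(k)}(a_k), \phi^G_{n(k)}(a_k)) \\
&\ge -\log 2 - \omega(a_k) \log \phi^G_{n(k)}(a_k) - (1 - \omega(a_k)) \log(1 - \phi^G_{n(k)}(a_k)) \\
&\ge -\log 2 + n(k)\eta\,\omega(a_k)
\end{align*}
and hence
\[
\liminf_{k\to\infty} \frac{1}{n(k)} S(\omega_{n(k)}, \phi^G_{n(k)}) \ge \eta \inf_k \omega(a_k) > 0.
\]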
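This contradicts the equality
\[
\lim_{n\to\infty} \frac{1}{n} S(\omega_n, \phi^G_n) = S_M(\omega, \phi) + \omega(A_\Phi) + p(\Phi, \phi) = 0,
\]
which is seen from $S(\omega_n, \phi^G_n) = S(\omega_n, \phi_n) + \omega(H_n) + \log \phi(e^{-H_n})$ and (4.6). Hence (i) follows. Furthermore, thanks to the monotonicity of relative entropy as above and (4.5), we have
\begin{align*}
F(\phi^G_{n(k)}(a_k), \omega(a_k)) &\le S(\phi^G_{n(k)}, \omega_{n(k)}) \\
&\le S(\phi^G_{n(k)}, \phi^G_{n(k)}) + 2\|W_{n(k)}\| = 2\|W_{n(k)}\|.
\end{align*}
This shows the boundedness of $F(\phi^G_{n(k)}(a_k), \omega(a_k))$, from which (ii) and (iii) are easily verified.

Proof of (4.1). For each $\delta > 0$ and $n \in \mathbb{N}$, let $p_n$ be the spectral projection of $-\frac{1}{n} \log D_n$ corresponding to the interval $(-S_M(\omega, \phi) - \delta, -S_M(\omega, \phi) + \delta)$. Then we have
\[
\exp\bigl( n(-S_M(\omega, \phi) - \delta) \bigr) D_n p_n \le p_n \le \exp\bigl( n(-S_M(\omega, \phi) + \delta) \bigr) D_n p_n, \tag{4.8}
\]
and Lemma 4.3 implies that $\pi_\omega(p_n) \to 1$ strongly as $n \to \infty$. Choose a sequence $n(1) < n(2) < \cdots$ such that
\[
\lim_{k\to\infty} \frac{1}{n(k)} \log \beta_\varepsilon(\omega_{n(k)}, \phi_{n(k)}) = \liminf_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\omega_n, \phi_n). \tag{4.9}
\]
For each $k$ choose a projection $q_k \in A_{n(k)}$ such that $\omega(q_k) \ge 1 - \varepsilon$ and
\[
\log \phi_{n(k)}(q_k) \le \log \beta_\varepsilon(\omega_{n(k)}, \phi_{n(k)}) + 1. \tag{4.10}
\]
We may assume that $\pi_\omega(q_k)$ converges to some $y \in \pi_\omega(A)''$ weakly. Since $\pi_\omega(p_{n(k)} q_k) \to y$ weakly, we get
\[
\lim_{k\to\infty} \omega(p_{n(k)} q_k) = \langle y\Omega_\omega, \Omega_\omega \rangle = \lim_{k\to\infty} \omega(q_k) \ge 1 - \varepsilon
\]
and by (4.8)
\[
\phi(q_k) \ge \phi(p_{n(k)} q_k) \ge \exp\bigl( n(k)(-S_M(\omega, \phi) - \delta) \bigr)\, \omega(p_{n(k)} q_k).
\]
These give
\[
\liminf_{k\to\infty} \frac{1}{n(k)} \log \phi(q_k) \ge -S_M(\omega, \phi) - \delta. \tag{4.11}
\]
Combining (4.9)-(4.11) yields
\[
\liminf_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\omega_n, \phi_n) \ge -S_M(\omega, \phi) - \delta.
\]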
On the other hand, we obtain
\[
\limsup_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\omega_n, \phi_n) \le -S_M(\omega, \phi) + \delta,
\]
because by (4.8)
\[
\frac{1}{n} \log \beta_\varepsilon(\omega_n, \phi_n) \le \frac{1}{n} \log \phi(p_n) \le -S_M(\omega, \phi) + \delta + \frac{1}{n} \log \omega(p_n) \le -S_M(\omega, \phi) + \delta
\]
if $n$ is so large that $\omega(p_n) \ge 1 - \varepsilon$. Thus, the proof of (4.1) is completed.

Proof of (4.2). This can be proven by use of (i)-(iii) of Lemma 4.4 similarly to the proof of [13, Theorem 3.3]. Since the proof of the second equality is a bit more involved than the first, we only prove the second.

Let $(\hat\pi, \hat H, \hat\Omega)$ be the cyclic representation of $F$ associated with $\hat\omega$. For each $\delta > 0$ and $n \in \mathbb{N}$, let $p_n$ be the spectral projection of $-\frac{1}{n} \log D^G_n$ to $(-S_M(\omega, \phi) - \delta, -S_M(\omega, \phi) + \delta)$. Since $\frac{1}{n} \hat\pi(H_n) \to \hat\omega(A_\Phi)\,1 = \omega(A_\Phi)\,1$ and hence $\frac{1}{n} \hat\pi(-\log D^G_n) \to -S_M(\omega, \phi)\,1$ strongly as in (4.6) and (4.7), it follows that $\hat\pi(p_n) \to 1$ strongly as $n \to \infty$. Furthermore, we have
\[
\exp\bigl( n(-S_M(\omega, \phi) - \delta) \bigr) \frac{e^{-H_n} p_n}{\phi(e^{-H_n})} \le p_n \le \exp\bigl( n(-S_M(\omega, \phi) + \delta) \bigr) \frac{e^{-H_n} p_n}{\phi(e^{-H_n})}. \tag{4.12}
\]
Choose $n(1) < n(2) < \cdots$ such that
\[
\lim_{k\to\infty} \frac{1}{n(k)} \log \beta_\varepsilon(\hat\phi^G_{n(k)}, \hat\phi_{n(k)}) = \liminf_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\phi^G_n, \hat\phi_n). \tag{4.13}
\]
For each $k$ there is a projection $q_k \in F_{n(k)}$ such that $\hat\phi^G_{n(k)}(q_k) \ge 1 - \varepsilon$ and
\[
\log \hat\phi_{n(k)}(q_k) \le \log \beta_\varepsilon(\hat\phi^G_{n(k)}, \hat\phi_{n(k)}) + 1. \tag{4.14}
\]
Here, we may assume that $\hat\pi(q_k)$ converges to some $y \in \hat\pi(F)''$ weakly. Then we obtain
\[
\lim_{k\to\infty} \hat\omega(p_{n(k)} q_k p_{n(k)}) = \langle y\hat\Omega, \hat\Omega \rangle = \lim_{k\to\infty} \hat\omega(q_k) > 0
\]
by Lemma 4.4(ii) (for $\hat\omega$ and $\hat\phi^G_{n(k)}$ with $a_k = q_k$), and hence
\[
\lim_{k\to\infty} \frac{1}{n(k)} \log \hat\phi^G_{n(k)}(p_{n(k)} q_k p_{n(k)}) = 0 \tag{4.15}
\]
by Lemma 4.4(i) (for $\hat\omega$ and $\hat\phi^G_{n(k)}$ with $a_k = p_{n(k)} q_k p_{n(k)}$). Furthermore, since $p_n$ commutes with $e^{-H_n}$ and $\bigotimes_1^n e^{-h}$, we obtain
\begin{align*}
\hat\phi_{n(k)}(q_k) &= d^{-n(k)}\, \mathrm{Tr}_{F_{n(k)}}\Bigl( \Bigl( \bigotimes_1^{n(k)} e^{-h} \Bigr) q_k \Bigr) \\
&\ge d^{-n(k)}\, \mathrm{Tr}_{F_{n(k)}}\Bigl( \Bigl( \bigotimes_1^{n(k)} e^{-h} \Bigr) p_{n(k)} q_k \Bigr) \\
&\ge \exp\bigl( n(k)(-S_M(\omega, \phi) - \delta) \bigr) \frac{d^{-n(k)}\, \mathrm{Tr}_{F_{n(k)}}\bigl( (\bigotimes_1^{n(k)} e^{-h})\, e^{-H_{n(k)}} p_{n(k)} q_k \bigr)}{\phi(e^{-H_{n(k)}})} \\
&= \exp\bigl( n(k)(-S_M(\omega, \phi) - \delta) \bigr) \frac{\hat\phi(e^{-H_{n(k)}} p_{n(k)} q_k p_{n(k)})}{\phi(e^{-H_{n(k)}})} \\
&= \exp\bigl( n(k)(-S_M(\omega, \phi) - \delta) \bigr)\, \hat\phi^G_{n(k)}(p_{n(k)} q_k p_{n(k)})
\end{align*}
using (4.12) and (3.1). This together with (4.13)-(4.15) yields
\[
\liminf_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\phi^G_n, \hat\phi_n) \ge -S_M(\omega, \phi) - \delta.
\]
On the other hand, since $\hat\phi^G_n(p_n) \to 1$ by Lemma 4.4(iii) (for $\hat\omega$ and $\hat\phi^G_n$), we have $\hat\phi^G_n(p_n) \ge 1 - \varepsilon$ for large $n$, and for such $n$
\[
\frac{1}{n} \log \beta_\varepsilon(\hat\phi^G_n, \hat\phi_n) \le \frac{1}{n} \log \hat\phi_n(p_n) \le -S_M(\omega, \phi) + \delta
\]
thanks to (4.12). Therefore,
\[
\limsup_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\phi^G_n, \hat\phi_n) \le -S_M(\omega, \phi) + \delta,
\]
completing the proof of (4.2).

Proof of (4.3) and (4.4). Assume that $\xi$ is central. Since $s_A(\omega) = s_F(\hat\omega)$ by Theorem 3.1, the assertion (4.4) is contained in [13, Theorem 3.3]. To prove (4.3), we first assume that $\xi$ is trivial. Then, by Lemma 2.1(1) and (4.1) (in the case of $\phi = \nu$) we have
\begin{align*}
s_A(\omega) &= -S_M(\omega, \nu) + \log d \\
&= \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\omega_n, \nu_n) + \log d \\
&= \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\omega_n).
\end{align*}
The latter equality in the above is readily verified from (2.4) and (2.5). The other equality in (4.3) when $\phi = \nu$ is similarly shown from the first equality in (4.2). When $\xi$ is not trivial, we consider $\Phi^h$ belonging to $B_0(A)$ instead of $\Phi$. Note that $\omega$ is an $\alpha^{\Phi^h}$-KMS state with trivial chemical potential and $\phi^G_n$ is the local Gibbs state with respect to $\Phi^h$ and $\nu$. Hence, the above special case gives the conclusion.
5. Remarks and Problems

Some problems as well as related known results are in order.

5.1. It is known [11, 23] that the weak*-closure of $ET^f(A)$ coincides with the set $ET(A)$ of all extremal tracial states of $A$ as long as $G$ is a compact connected Lie group. For $\Phi \in B_0(A)$ let $EK(A, \Phi)$ denote the set of all extremal $\alpha^{\Phi}$-KMS states of $A$ (see Proposition 1.1) and $EK^f(A, \Phi)$ the set of all faithful $\omega \in EK(A, \Phi)$. Theorems 1.5 and 1.6 say that there is a bijective correspondence $\phi \leftrightarrow \omega$ between $ET^f(A)$ and $EK^f(A, \Phi)$. We further know (see [14, Theorem 4.6]) that the correspondence $\phi \mapsto \omega$ is a weak*-homeomorphism from $ET^f(A)$ onto $EK^f(A, \Phi)$. Upon these considerations we are interested in the following problems:
(1) Does the weak*-closure of $EK^f(A, \Phi)$ coincide with $EK(A, \Phi)$ (as long as $G$ is a compact connected Lie group)?
(2) Does the above $\phi \mapsto \omega$ extend to a weak*-homeomorphism from $ET(A)$ onto $EK(A, \Phi)$?

5.2. In the situation of Theorem 3.1 it seems that the equality $s_A(\omega) = s_F(\hat\omega)$ holds without the assumption of $\xi$ being central. This is equivalent to the equality $s_A(\omega) = \lim_{n\to\infty} \frac{1}{n} S(\phi^G_n)$, which is the only missing point in Theorem 3.1.

5.3. The equality $-S_M(\omega, \phi) = \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\omega_n, \hat\phi_n)$ is missing in Theorem 4.1, which is equivalent to
\[
-S_M(\hat\omega, \hat\phi) = \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\omega_n, \hat\phi_n) \tag{5.1}
\]
due to Theorem 3.1. Note that $\hat\phi$ is a product state of $F$ and $\hat\omega$ is completely ergodic, i.e., extremal for all $\theta^n$, $n \ge 1$. Thus, the equality (5.1) is an old open problem from the viewpoint of quantum hypothesis testing in [12], where the weaker result was proven:
\begin{align*}
-S_M(\hat\omega, \hat\phi) &\ge \limsup_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\omega_n, \hat\phi_n), \\
-\frac{1}{1 - \varepsilon} S_M(\hat\omega, \hat\phi) &\le \liminf_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\hat\omega_n, \hat\phi_n).
\end{align*}
In this connection, it is worthwhile to note that Ogawa and Nagaoka established in [21] the equality
\[
-S(\varphi, \psi) = \lim_{n\to\infty} \frac{1}{n} \log \beta_\varepsilon(\varphi_n, \psi_n)
\]
when $\varphi, \psi$ are states of $M_d(\mathbb{C})$ and $\varphi_n, \psi_n$ are the $n$-fold tensor products of $\varphi, \psi$. The problem of macroscopic uniformity for states of spin $C^*$-algebras was completely solved in a recent paper by Bjelaković et al. as follows: if $\varphi$ is an extremal translation-invariant state of the $\nu$-dimensional spin algebra $\bigotimes_{\mathbb{Z}^\nu} M_d(\mathbb{C})$, then
\[
s(\varphi) = \lim_{\Lambda \to \mathbb{Z}^\nu} \frac{1}{|\Lambda|} \log \beta_\varepsilon(\varphi_\Lambda)
\]
for any $0 < \varepsilon < 1$. See [6] for details.

5.4. Although many arguments in this paper as well as in [14] work also in gauge-invariant $C^*$-systems over the multi-dimensional lattice $\mathbb{Z}^\nu$, some difficulties arise when we try to extend the whole argument to the multi-dimensional case. For instance, it does not seem that Proposition 1.1 holds in multi-dimensional gauge-invariant $C^*$-systems. The proposition is crucial when we use the chemical potential theory as in the proof of Theorem 1.6. Moreover, the assumption of uniformly bounded surface energies is sometimes useful in our discussions. In the multi-dimensional case, the assumption is obviously too strong and, if it is not assumed, the non-uniqueness of KMS states (or the phase transition) can occur. Indeed, the uniqueness of the $\alpha^{\Phi^h}$-KMS state of $F$ is essential in the proof of Theorem 1.5. Consequently, some new ideas are needed to extend the theory to the multi-dimensional setting.

Acknowledgments

The authors are grateful to Professors E. Størmer and S. Neshveyev who pointed out a mistake in our previous paper [14] in 2000, and also thank the referees for their useful suggestions. The second author was supported in part by Japan-Hungary Joint Research Project (JSPS) and by the program "R&D support scheme for funding selected IT proposals" of the Ministry of Public Management, Home Affairs, Posts and Telecommunications. The third author was supported in part by MTA-JSPS project (Quantum Probability and Information Theory) and by OTKA T032662.

References

[1] H. Araki, On the equivalence of the KMS condition and the variational principle for quantum lattice systems, Commun. Math. Phys. 38 (1974) 1–10.
[2] H. Araki, On uniqueness of KMS states of one-dimensional quantum lattice systems, Commun. Math. Phys. 44 (1975) 1–7.
[3] H. Araki, Relative entropy for states of von Neumann algebras II, Publ. Res. Inst. Math. Sci. 13 (1977) 173–192.
[4] H. Araki, R. Haag, D. Kastler and M. Takesaki, Extension of KMS states and chemical potential, Commun. Math. Phys. 53 (1977) 97–134.
[5] H. Araki and H. Moriya, Equilibrium statistical mechanics of Fermion lattice systems, Rev. Math. Phys. 15 (2003) 93–198.
[6] I. Bjelaković, T. Krüger, R. Siegmund-Schultze and A. Szkola, The Shannon–McMillan theorem for ergodic quantum lattice systems, Invent. Math. 155 (2004) 203–222.
[7] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 1, 2, 2nd edn. (Springer-Verlag, 2002).
[8] M. J. Donald, Relative hamiltonians which are not bounded from above, J. Funct. Anal. 91 (1990) 143–173.
[9] I. Ekeland and R. Temam, Convex Analysis and Variational Problems, Studies in Mathematics and its Applications, Vol. 1 (North-Holland, Amsterdam-Oxford, 1976).
[10] M. Fannes, P. Vanheuverzwijn and A. Verbeure, Quantum energy-entropy inequalities: A new method for proving the absence of symmetry breaking, J. Math. Phys. 25 (1984) 76–78.
[11] D. Handelman, Extending traces on fixed point C*-algebras under Xerox product type actions of compact Lie groups, J. Funct. Anal. 72 (1987) 44–57.
[12] F. Hiai and D. Petz, The proper formula for relative entropy and its asymptotics in quantum probability, Commun. Math. Phys. 143 (1991) 99–114.
[13] F. Hiai and D. Petz, Entropy densities for Gibbs states of quantum spin systems, Rev. Math. Phys. 5 (1993) 693–712.
[14] F. Hiai and D. Petz, Quantum mechanics in AF C*-systems, Rev. Math. Phys. 8 (1996) 819–859.
[15] A. Kishimoto, Dissipations and derivations, Commun. Math. Phys. 47 (1976) 25–32.
[16] A. Kishimoto, On uniqueness of KMS states of one-dimensional quantum lattice systems, Commun. Math. Phys. 47 (1976) 167–170.
[17] A. Kishimoto, Equilibrium states of a semi-quantum lattice system, Rep. Math. Phys. 12 (1977) 341–374.
[18] A. Kishimoto, Variational principle for quasi-local algebras over the lattice, Ann. Inst. H. Poincaré Phys. Théor. 30 (1979) 51–59.
[19] O. E. Lanford III and D. W. Robinson, Statistical mechanics of quantum spin systems. III, Commun. Math. Phys. 9 (1968) 327–338.
[20] H. Moriya and A. van Enter, On thermodynamic limits of entropy densities, Lett. Math. Phys. 45 (1998) 323–330.
[21] T. Ogawa and H. Nagaoka, Strong converse and Stein’s lemma in quantum hypothesis testing, IEEE Trans. Inform. Theory 46 (2000) 2428–2433.
[22] M. Ohya and D. Petz, Quantum Entropy and Its Use (Springer-Verlag, 1993); 2nd edn. (2004).
[23] G. Price, Extremal traces on some group-invariant C*-algebras, J. Funct. Anal. 49 (1982) 145–151.
[24] D. W. Robinson, Statistical mechanics of quantum spin system. II, Commun. Math. Phys. 7 (1968) 337–348.
[25] G. L. Sewell, Quantum Theory of Collective Phenomena (Clarendon Press, New York, 1986).
[26] M. Takesaki, Conditional expectations in von Neumann algebras, J. Funct. Anal. 9 (1972) 306–321.
[27] M. Takesaki and M. Winnink, Local normality in quantum statistical mechanics, Commun. Math. Phys. 30 (1973) 129–152.
June 23, 2005 10:9 WSPC/148-RMP
J070-00237
Reviews in Mathematical Physics Vol. 17, No. 4 (2005) 391–490 © World Scientific Publishing Company
QUANTIZATION METHODS: A GUIDE FOR PHYSICISTS AND ANALYSTS
S. TWAREQUE ALI
Department of Mathematics and Statistics, Concordia University, Montréal, Québec, Canada H4B 1R6
[email protected]

MIROSLAV ENGLIŠ
MÚ AV ČR, Žitná 25, 11567 Praha 1, Czech Republic
[email protected]

Received 31 May 2004
Revised 24 March 2005
This survey is an overview of some of the better known quantization techniques (for systems with finite numbers of degrees-of-freedom) including in particular canonical quantization and the related Dirac scheme, introduced in the early days of quantum mechanics, Segal and Borel quantizations, geometric quantization, various ramifications of deformation quantization, Berezin and Berezin–Toeplitz quantizations, prime quantization and coherent state quantization. We have attempted to give an account sufficiently in depth to convey the general picture, as well as to indicate the mutual relationships between various methods, their relative successes and shortcomings, mentioning also open problems in the area. Finally, even for approaches for which lack of space or expertise prevented us from treating them to the extent they would deserve, we have tried to provide ample references to the existing literature on the subject. In all cases, we have made an effort to keep the discussion accessible both to physicists and to mathematicians, including non-specialists in the field. Keywords: Canonical quantization; Borel quantization; geometric quantization; deformation quantization; Berezin–Toeplitz quantization; Berezin quantization; coherent state quantization.
Contents
1. Introduction
1.1. The problem
1.2. Stumbling blocks
1.3. Getting out of the quagmire
2. Canonical Quantization and Its Generalizations
2.1. The early notion of quantization
2.2. Segal and Borel quantization
2.3. Segal quantization
2.4. Borel quantization
3. Geometric Quantization
3.1. Prequantization
3.2. Real polarizations and half-densities
3.3. Complex polarizations
3.4. Half-forms and the metalinear correction
3.5. Blattner–Kostant–Sternberg pairing
3.6. Further developments
3.7. Spin^c-quantization
3.8. Some Shortcomings
4. Deformation Quantization
5. Berezin and Berezin–Toeplitz Quantization on Kähler Manifolds
6. Prime Quantization
7. Coherent State Quantization
7.1. The projective Hilbert space
7.2. Summary of coherent state quantization
8. Some Other Quantization Methods
Acknowledgments
References
1. Introduction Quantization is generally understood as the transition from classical to quantum mechanics. Starting with a classical system, one often wishes to formulate a quantum theory, which in an appropriate limit, would reduce back to the classical system of departure. In a more general setting, quantization is also understood as a correspondence between a classical and a quantum theory. In this context, one also talks about dequantization, which is a procedure by which one starts with a quantum theory and arrives back at its classical counterpart. It is well known however, that not every quantum system has a meaningful classical counterpart and moreover, different quantum systems may reduce to the same classical theory. Over the years, the processes of quantization and dequantization have evolved into mathematical theories in their own right, impinging on areas of group representation theory and symplectic geometry. Indeed, the programme of geometric quantization is in many ways an offshoot of group representation theory on coadjoint orbits, while other techniques borrow heavily from the theory of representations of diffeomorphism groups. In this paper we attempt to present an overview of some of the better known quantization techniques found in the current literature and useful both to physicists and mathematicians. The treatment will be more descriptive than rigorous, for we aim to reach both physicists and mathematicians, including non-specialists in the field. It is our hope that an overview such as this will put into perspective the relative strengths as well as shortcomings of the various techniques that have been developed and, besides delineating their usefulness in understanding the nature of the quantum regime, will also demonstrate the mathematical richness of the attendant structures. As will become clear, no one method solves the problem of quantization completely. 
On the other hand, a comparative study such as ours puts into focus the deeper mathematical and structural relationships between classical
and quantum mechanics, even though doubts may sometimes be cast, with some legitimacy, on whether any of the methods outlined here could be successfully employed in truly complex, practical physical situations. It should also be noted that our focus here is on non-relativistic finite dimensional quantum systems. We do not consider infinite dimensional systems or their representation theory. Consequently we do not enter here into a discussion of problems associated with field quantization or the mathematical theory of field representations, although some of the methods discussed here could possibly be amenable to extensions in this direction too.

1.1. The problem
The original concept of quantization (nowadays usually referred to as canonical quantization), going back to Weyl, von Neumann, and Dirac [78, 199, 280], consists in assigning (or rather, trying to assign) to the observables of classical mechanics, which are real-valued functions $f(p, q)$ of $(p, q) = (p_1, \dots, p_n, q_1, \dots, q_n) \in \mathbb{R}^n \times \mathbb{R}^n$ (the phase space), self-adjoint operators $Q_f$ on the Hilbert space $L^2(\mathbb{R}^n)$ in such a way that
(q1) the correspondence $f \mapsto Q_f$ is linear;
(q2) $Q_1 = I$, where 1 is the constant function, equal to one everywhere, and I the identity operator;
(q3) for any function $\phi : \mathbb{R} \to \mathbb{R}$ for which $Q_{\phi\circ f}$ and $\phi(Q_f)$ are well-defined, $Q_{\phi\circ f} = \phi(Q_f)$; and
(q4) the operators $Q_{p_j}$ and $Q_{q_j}$ corresponding to the coordinate functions $p_j, q_j$ (j = 1, ..., n) are given by
$$ Q_{q_j}\psi = q_j\psi, \qquad Q_{p_j}\psi = -\frac{ih}{2\pi}\frac{\partial\psi}{\partial q_j} \qquad \text{for } \psi \in L^2(\mathbb{R}^n, dq). \tag{1.1} $$
The condition (q3) is usually known as the von Neumann rule. The domain of definition of the mapping $Q : f \mapsto Q_f$ is called the space of quantizable observables, and one would of course like to make it as large as possible; ideally, it should include at least the infinitely differentiable functions $C^\infty(\mathbb{R}^n)$, or some other convenient function space. The parameter h, on which the quantization map Q also depends, is usually a small positive number, identified with the Planck constant.^a (One also often uses the shorthand notation $\hbar$ for the ratio $h/2\pi$.) An important theorem of Stone and von Neumann [199] states that up to unitary equivalence, the operators (1.1) are the unique operators acting on a Hilbert space H which satisfy (a) the irreducibility condition,

there are no closed subspaces $H_0 \subset H$, other than {0} and H itself, that are stable under the action of all the operators $Q_{p_j}$ and $Q_{q_j}$, j = 1, ..., n,   (1.2)

and (b) the commutation relations
$$ [Q_{p_j}, Q_{p_k}] = [Q_{q_j}, Q_{q_k}] = 0, \qquad [Q_{q_k}, Q_{p_j}] = \frac{ih}{2\pi}\,\delta_{jk} I. \tag{1.3} $$

^a While physically h is a fixed number (a physical constant), for mathematical purposes, when going to the classical limit, it is allowed to run over a set of values approaching zero.
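As a quick consistency check, the following sketch (for n = 1, acting on a smooth test function ψ and ignoring all domain questions) verifies that the operators (1.1) do satisfy the last relation in (1.3):

```latex
\begin{align*}
[Q_q, Q_p]\psi
  &= q\left(-\frac{ih}{2\pi}\frac{\partial\psi}{\partial q}\right)
     + \frac{ih}{2\pi}\frac{\partial}{\partial q}\bigl(q\psi\bigr) \\
  &= -\frac{ih}{2\pi}\,q\,\frac{\partial\psi}{\partial q}
     + \frac{ih}{2\pi}\left(\psi + q\,\frac{\partial\psi}{\partial q}\right)
   = \frac{ih}{2\pi}\,\psi .
\end{align*}
```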
The physical interpretation is as follows.^b The classical system, of n linear degrees-of-freedom, moves on the phase space $\mathbb{R}^n \times \mathbb{R}^n$, with $q_j, p_j$ being the canonical position and momentum observables, respectively. Any classical state is given as a probability distribution (measure) on phase space. The pure states of the quantum system correspond to one-dimensional subspaces $\mathbb{C}u$ ($\|u\| = 1$) of $L^2(\mathbb{R}^n)$, and the result of measuring an observable f in the state u leads to the probability distribution $\langle \Pi(Q_f)u, u\rangle$, where $\Pi(Q_f)$ is the spectral measure of $Q_f$. In particular, if $Q_f$ has pure point spectrum consisting of eigenvalues $\lambda_j$ with unit eigenvectors $u_j$, the possible outcomes of measuring f will be $\lambda_j$ with probability $|\langle u, u_j\rangle|^2$; if $u = u_j$ for some j, the measurement will be deterministic and will always return $\lambda_j$. Noncommutativity of operators corresponds to the impossibility of measuring simultaneously the corresponding observables. In particular, the canonical commutation relations (1.3) above lie at the root of the celebrated Heisenberg uncertainty principle.

Evidently, for f = f(q) a polynomial in the position variables $q_1, \dots, q_n$, the linearity (q1) and the von Neumann rule (q3) dictate that $Q_{f(q)} = f(Q_q)$ in the sense of spectral theory (functional calculus for commuting self-adjoint operators); similarly for polynomials f(p) in p. The canonical commutation relations then imply that for any functions f, g that are at most linear in either p or q,
$$ [Q_f, Q_g] = \frac{ih}{2\pi}\, Q_{\{f,g\}}, \tag{1.4} $$
where
$$ \{f, g\} = \sum_{j=1}^{n} \left( \frac{\partial f}{\partial q_j}\frac{\partial g}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial g}{\partial q_j} \right) \tag{1.5} $$
is the Poisson bracket of f and g. It turns out that another desideratum on the quantization operator Q, motivated by physical considerations [78, pp. 87–92], is that
(q5) the correspondence (1.4), between the classical Poisson bracket and the quantum commutator bracket, holds for all quantizable observables f and g.
Thus we are led to the following problem: find a vector space Obs (as large as possible) of real-valued functions f(p, q) on $\mathbb{R}^{2n}$, containing the coordinate functions $p_j$ and $q_j$ (j = 1, ..., n), and a mapping $Q : f \mapsto Q_f$ from Obs into self-adjoint operators on $L^2(\mathbb{R}^n)$ such that (q1)–(q5) are satisfied.

^b It is precisely because of this interpretation that one actually has to insist on the operators $Q_f$ being self-adjoint (not just symmetric or “formally self-adjoint”). See Gieres [109] for a thorough discussion of this issue.
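To make the sign convention in (1.5) concrete, here is a minimal sketch for n = 1 that evaluates Poisson brackets of polynomials in (p, q). The dictionary representation and the helper names (`d_p`, `d_q`, `times`, `minus`, `poisson`, `mono`) are illustrative assumptions, not anything taken from the paper; the brackets evaluated are the ones the Groenewold–van Hove argument relies on.

```python
# A polynomial in one canonical pair (p, q) is stored as a dict
# {(i, j): coeff}, meaning the sum of coeff * p**i * q**j  (n = 1).

def d_p(f):
    """Partial derivative with respect to p."""
    return {(i - 1, j): i * c for (i, j), c in f.items() if i > 0}

def d_q(f):
    """Partial derivative with respect to q."""
    return {(i, j - 1): j * c for (i, j), c in f.items() if j > 0}

def times(f, g):
    """Product of two (commuting) polynomials."""
    out = {}
    for (i, j), a in f.items():
        for (k, l), b in g.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0) + a * b
    return {m: c for m, c in out.items() if c != 0}

def minus(f, g):
    """Difference f - g."""
    out = dict(f)
    for m, c in g.items():
        out[m] = out.get(m, 0) - c
    return {m: c for m, c in out.items() if c != 0}

def poisson(f, g):
    """{f, g} = f_q g_p - f_p g_q, i.e. (1.5) with n = 1."""
    return minus(times(d_q(f), d_p(g)), times(d_p(f), d_q(g)))

def mono(i, j, c=1):
    """The monomial c * p**i * q**j."""
    return {(i, j): c}

p, q = mono(1, 0), mono(0, 1)

canonical = poisson(q, p)                    # {q, p}
pq_with_p = poisson(mono(1, 1), p)           # {pq, p}
pq_with_q = poisson(mono(1, 1), q)           # {pq, q}
p2_q3 = poisson(mono(2, 0), mono(0, 3))      # {p^2, q^3}
p3_q3 = poisson(mono(3, 0), mono(0, 3))      # {p^3, q^3}
mixed = poisson(mono(2, 1), mono(1, 2))      # {p^2 q, p q^2}
```

With this convention {q, p} = 1, which matches (1.3) under the correspondence (1.4).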
(Note that the axiom (q2) is, in fact, a consequence of either (q3) (taking φ = 1) or (q5) (taking $f = p_1$, $g = q_1$); we have stated it separately for reasons of exposition.) It should also be emphasized here that although we have formulated the correspondence between the Poisson and commutator brackets as our last condition, from a physical point of view this is, in fact, the most important requirement.

1.2. Stumbling blocks
Unfortunately, it turns out that the axioms (q1)–(q5) are not quite consistent. First of all, using (q1)–(q4) it is possible to express $Q_f$ for $f(p, q) = p_1^2 q_1^2 = (p_1 q_1)^2$ in two ways with two different results (see [100, p. 17]; or Arens and Babbitt [19]). Namely, let us temporarily write just p, q instead of $p_1, q_1$ and P, Q instead of $Q_{p_1}$ and $Q_{q_1}$, respectively. Then by the von Neumann rule (q3) for the squaring function $\phi(t) = t^2$ and (q1),
$$ pq = \frac{(p+q)^2 - p^2 - q^2}{2} \;\Rightarrow\; Q_{pq} = \frac{(P+Q)^2 - P^2 - Q^2}{2} = \frac{PQ + QP}{2}; $$
and similarly
$$ p^2 q^2 = \frac{(p^2+q^2)^2 - p^4 - q^4}{2} \;\Rightarrow\; Q_{p^2 q^2} = \frac{P^2 Q^2 + Q^2 P^2}{2}. $$
However, a small computation using only the canonical commutation relations (1.3) (which are a consequence of either (q4) or (q5)) shows that
$$ \left(\frac{PQ + QP}{2}\right)^2 \neq \frac{P^2 Q^2 + Q^2 P^2}{2}. $$
Thus neither (q4) nor (q5) can be satisfied if (q1) and (q3) are satisfied and $p_1^2, q_1^2, p_1^4, q_1^4, p_1 q_1$ and $p_1^2 q_1^2 \in$ Obs.

Secondly, it is a result of Groenewold [127], later elaborated further by van Hove [142], that (q5) fails whenever (q1) and (q4) are satisfied and Obs contains all polynomials in p, q of degree not exceeding four. To see this, assume, for simplicity, that n = 1 (the argument for general n is the same), and let us keep the notations p, q, P, Q of the preceding paragraph and for the sake of brevity also set $c = -\frac{ih}{2\pi}$. Note first of all that for any self-adjoint operator X,
$$ [X, P] = [X, Q] = 0 \;\Rightarrow\; X = dI \quad \text{for some } d \in \mathbb{C}. \tag{1.6} $$
(Indeed, any spectral projection E of X must then commute with P, Q, hence the range of E is a subspace invariant under both P and Q; by irreducibility, this forces E = 0 or I.) Set now $X = Q_{pq}$; then, since
$$ \{pq, p\} = p, \qquad \{pq, q\} = -q, $$
we must have by (q5)
$$ [X, P] = -cP, \qquad [X, Q] = cQ. $$
As also
$$ \left[\frac{PQ + QP}{2}, P\right] = -cP, \qquad \left[\frac{PQ + QP}{2}, Q\right] = cQ, $$
it follows from (1.6) that
$$ Q_{pq} \equiv X = \frac{PQ + QP}{2} + dI \quad \text{for some } d \in \mathbb{C}. $$
Next set $X = Q_{q^m}$ (m = 1, 2, ...); then from
$$ \{q^m, q\} = 0, \qquad \{q^m, p\} = m q^{m-1} $$
we similarly obtain
$$ X = Q^m + d_m I \quad \text{for some } d_m \in \mathbb{C}. $$
Furthermore, since $\{pq, q^m\} = -m q^m$, it follows that
$$ cmX = \left[\frac{PQ + QP}{2} + dI,\; Q^m + d_m I\right] = \left[\frac{PQ + QP}{2},\; Q^m\right] = cmQ^m. $$
Thus (using also a similar argument for $X = Q_{p^m}$)
$$ Q_{q^m} = Q^m, \qquad Q_{p^m} = P^m, \qquad \forall m = 1, 2, \dots. $$
Now from $\{p^2, q^3\} = -6q^2 p$ we obtain that
$$ 6c\,Q_{q^2 p} = [P^2, Q^3] = 3cPQ^2 + 3cQ^2 P, $$
so
$$ Q_{q^2 p} = \frac{PQ^2 + Q^2 P}{2} $$
and similarly for $Q_{p^2 q}$. Thus finally, we have on the one hand
$$ \{p^3, q^3\} = -9p^2 q^2 \;\Rightarrow\; Q_{p^2 q^2} = \frac{1}{9c}[P^3, Q^3] = Q^2 P^2 + 2cQP + \frac{2}{3}c^2, $$
while on the other hand
$$ \{p^2 q, pq^2\} = -3p^2 q^2 \;\Rightarrow\; Q_{p^2 q^2} = \frac{1}{3c}\left[\frac{P^2 Q + QP^2}{2},\; \frac{PQ^2 + Q^2 P}{2}\right] = Q^2 P^2 + 2cQP + \frac{1}{3}c^2, $$
yielding a contradiction.

Thirdly, it can be shown that one arrives (by arguments of a similar nature as above) at a contradiction even if one insists on the axioms (q3), (q4) and (q5), but discards (q1) (linearity); see [90]. (Note that by (q3) with φ(t) = ct, we still have at least homogeneity, i.e. $Q_{cf} = cQ_f$ for any constant c.)
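The operator identities behind the first two stumbling blocks can be checked mechanically. The sketch below is an illustration only: it models the algebra generated by P and Q with [P, Q] = c by normal-ordered monomials, and every helper name (`mul`, `comm`, `sym`, `div_c`, and so on) is a hypothetical choice made here, not notation from the paper. It uses the standard reordering formula $P^b Q^d = \sum_j \binom{b}{j}\binom{d}{j}\, j!\, c^j Q^{d-j} P^{b-j}$.

```python
from fractions import Fraction
from math import comb, factorial

# Elements of the algebra generated by P, Q with [P, Q] = c are stored in
# normal-ordered form as dicts {(a, b, k): coeff}, meaning the sum of
# coeff * c**k * Q**a * P**b (all Q's to the left of all P's).

def _clean(x):
    return {key: v for key, v in x.items() if v != 0}

def add(x, y, sign=1):
    """x + sign * y."""
    out = dict(x)
    for key, v in y.items():
        out[key] = out.get(key, 0) + sign * v
    return _clean(out)

def scale(x, s):
    return _clean({key: s * v for key, v in x.items()})

def mul(x, y):
    """Product, re-ordered via P**b Q**d = sum_j C(b,j) C(d,j) j! c**j Q**(d-j) P**(b-j)."""
    out = {}
    for (a, b, k), u in x.items():
        for (d, e, l), v in y.items():
            for j in range(min(b, d) + 1):
                key = (a + d - j, b + e - j, k + l + j)
                w = comb(b, j) * comb(d, j) * factorial(j)
                out[key] = out.get(key, 0) + u * v * w
    return _clean(out)

def comm(x, y):
    """Commutator [x, y]."""
    return add(mul(x, y), mul(y, x), sign=-1)

def div_c(x):
    """Divide by c; valid only if every term carries at least one factor c."""
    assert all(k >= 1 for (_, _, k) in x)
    return {(a, b, k - 1): v for (a, b, k), v in x.items()}

def power(x, n):
    out = {(0, 0, 0): Fraction(1)}
    for _ in range(n):
        out = mul(out, x)
    return out

def sym(x, y):
    """Symmetrized product (xy + yx)/2."""
    return scale(add(mul(x, y), mul(y, x)), Fraction(1, 2))

Q = {(1, 0, 0): Fraction(1)}
P = {(0, 1, 0): Fraction(1)}

# First stumbling block: ((PQ+QP)/2)^2 versus (P^2 Q^2 + Q^2 P^2)/2.
ordering_gap = add(power(sym(P, Q), 2), sym(power(P, 2), power(Q, 2)), sign=-1)

# Groenewold's two expressions for Q_{p^2 q^2}.
via_cubes = scale(div_c(comm(power(P, 3), power(Q, 3))), Fraction(1, 9))
via_mixed = scale(
    div_c(comm(sym(power(P, 2), Q), sym(P, power(Q, 2)))), Fraction(1, 3)
)
groenewold_gap = add(via_cubes, via_mixed, sign=-1)
```

In this representation the key `(0, 0, 2)` stands for a multiple of $c^2 I$, so a non-empty gap dictionary means the two candidate operators differ by a non-zero multiple of the identity.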
In conclusion, we see that not only are the axioms (q1)–(q5) taken together inconsistent, but so are even any three of the axioms (q1), (q3), (q4) and (q5).

Remark 1. The idea of discarding the linearity axiom (q1) may seem a little wild at first sight, but there seems to be no physical motivation for assuming linearity, though it is definitely convenient from the computational point of view (cf. Tuynman [261, Sec. 5.1]). In fact, nonlinear assignments $f \mapsto Q_f$ do actually occur already in some existing approaches to geometric quantization, namely when one defines the quantum observables $Q_f$ using the Blattner–Kostant–Sternberg kernels; cf. (3.66) in Sec. 3.8 below. The asymptotic morphisms in the E-theory of Connes and Higson take a similar approach as well [176, 69, 67].

Remark 2. The inconsistencies among the axioms above actually go even further. Namely, an analysis of the argument in [90] shows that, in fact, it only requires (q3) and (q5) alone to produce a contradiction. The combination (q1)+(q3) is satisfied e.g. by the map assigning to f the operator of multiplication by f; however, this is uninteresting from the point of view of physics (noncommutativity is lost). Similarly, (q1)+(q4) can be satisfied but the outcome is of no physical relevance. The combination (q1)+(q5) is satisfied by the prequantization of van Hove (to be discussed in detail in Sec. 3.1 below). In conclusion, it thus transpires that with the exception of (q1)+(q5), and possibly also of (q4)+(q3) and (q4)+(q5), even any two of the axioms (q1), (q3), (q4) and (q5) are either inconsistent or lead to something trivial.

Remark 3. From a purely mathematical viewpoint, it can, in fact, be shown that already (q3) and the canonical commutation relations (1.3) by themselves lead to a contradiction if one allows the space Obs to contain sufficiently “wild” functions (i.e. not $C^\infty$; for instance, the Peano curve function f mapping $\mathbb{R}$ continuously onto $\mathbb{R}^{2n}$). See again [90].

1.3. Getting out of the quagmire
There are two traditional approaches on how to handle this disappointing situation. The first is to keep the four axioms (q1), (q2), (q4) and (q5) (possibly giving up only the von Neumann rule (q3)) but restrict the space Obs of quantizable observables. For instance, we have seen above that it may not contain simultaneously $p_j^2$, $q_j^2$ and $p_j^2 q_j^2$, for any j; however, taking Obs to be the set of all functions at most linear in p, i.e.
$$ f(p, q) = f_0(q) + \sum_j f_j(q)\,p_j, \qquad f_0, f_j \in C^\infty(\mathbb{R}^n), $$
and setting
$$ Q_f = f_0(\hat q) + \frac{1}{2}\sum_j \bigl( f_j(\hat q)\,Q_{p_j} + Q_{p_j}\,f_j(\hat q) \bigr), $$
where we have written $\hat q$ for the vector operator $Q_q$, it is not difficult to see that all of (q1), (q2), (q4) and (q5) are satisfied. Similarly one can use functions at most linear in q, or, more generally, in ap + bq for some fixed constants a and b.

The second approach is to keep (q1), (q2) and (q4), but require (q5) to hold only asymptotically as the Planck constant h tends to zero. The simplest way to achieve this is as follows. By the remarks above, we know that the operator $Q_f$ corresponding to $f(p, q) = e^{i\eta\cdot q}$ ($\eta \in \mathbb{R}^n$) is $Q_f = e^{i\eta\cdot\hat q}$, and similarly for p. Now an “arbitrary” function f(p, q) can be expanded into exponentials via the Fourier transform,
$$ f(p, q) = \iint \hat f(\xi, \eta)\, e^{2\pi i(\xi\cdot p + \eta\cdot q)}\, d\xi\, d\eta. $$
Let us now postulate that
$$ Q_f = \iint \hat f(\xi, \eta)\, e^{2\pi i(\xi\cdot\hat p + \eta\cdot\hat q)}\, d\xi\, d\eta =: W_f, $$
where again, $\hat p = Q_p$. After a simple manipulation, the operator $W_f$ can be rewritten as the oscillatory integral
$$ W_f g(x) = h^{-n} \iint f\!\left(p, \frac{x+y}{2}\right) e^{2\pi i (x-y)\cdot p/h}\, g(y)\, dy\, dp. \tag{1.7} $$
This is the celebrated Weyl calculus of pseudodifferential operators (see Hörmander [140], Shubin [240], Taylor [254], for instance). The last formula allows us to define $W_f$ as an operator from the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ into the space $\mathcal{S}'(\mathbb{R}^n)$ of tempered distributions; conversely, it follows from the Schwartz kernel theorem that any continuous operator from $\mathcal{S}$ into $\mathcal{S}'$ is of the form $W_f$ for some $f \in \mathcal{S}'(\mathbb{R}^{2n})$. In particular, if $f, g \in \mathcal{S}'(\mathbb{R}^{2n})$ are such that $W_f$ and $W_g$ map $\mathcal{S}(\mathbb{R}^n)$ into itself (this is the case, for instance, if $f, g \in \mathcal{S}(\mathbb{R}^{2n})$), then so does their composition $W_f W_g$. Thus, $W_f W_g = W_{f \sharp g}$ for some $f \sharp g \in \mathcal{S}'(\mathbb{R}^{2n})$ and we call $f \sharp g$ the twisted (or Moyal) product of f and g. Now it turns out that under appropriate hypotheses on f and g (for instance, if $f, g \in \mathcal{S}(\mathbb{R}^{2n})$, but much weaker assumptions will do), one has the asymptotic expansion
$$ f \sharp g = \sum_{j=0}^{\infty} h^j \rho_j(f, g) \quad \text{as } h \to 0, $$
where
$$ \rho_0(f, g) = fg, \qquad \rho_1(f, g) = \frac{i}{4\pi}\{f, g\}. \tag{1.8} $$
Hence, in particular,
$$ f \sharp g - g \sharp f = \frac{ih}{2\pi}\{f, g\} + O(h^2) \quad \text{as } h \to 0. \tag{1.9} $$
This is the asymptotic version of (q5). (Incidentally, for φ a polynomial, one also gets an asymptotic version of the von Neumann rule (q3).) The validity of (q1), (q2), and (q4) follows immediately from the construction. See [100, Chap. 2] for the details.
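A minimal worked instance of (1.8) and (1.9), for n = 1 with f = q and g = p, may help fix the constants. Here $\sharp$ is used as the symbol for the twisted product, and the sketch assumes the standard fact that for symbols linear in (p, q) the expansion terminates after the first-order term, since $\rho_j$ with $j \ge 2$ involves derivatives of order j of each factor:

```latex
\begin{align*}
q \sharp p &= qp + h\,\frac{i}{4\pi}\{q, p\} = qp + \frac{ih}{4\pi}, \\
p \sharp q &= pq + h\,\frac{i}{4\pi}\{p, q\} = pq - \frac{ih}{4\pi}, \\
q \sharp p - p \sharp q &= \frac{ih}{2\pi} = \frac{ih}{2\pi}\{q, p\}.
\end{align*}
```

The last line reproduces the canonical commutation relation $[Q_q, Q_p] = \frac{ih}{2\pi} I$ of (1.3) exactly, with no $O(h^2)$ remainder in this special case.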
Remark 4. An elegant general calculus for non-commuting tuples of operators (of which (1.1) are an example), building essentially on (q1), (q2) and a version of (q3),
was developed by Nelson [194]. Generalizations of the Weyl calculus were studied by Anderson [15].

The basic problem of quantization is to extend these two approaches from $\mathbb{R}^{2n}$ to any symplectic manifold. The first of the above approaches leads to geometric quantization, and the second to deformation quantization. We shall discuss the former in Sec. 3 and the latter in Secs. 4 and 5, and then mention some other approaches in Secs. 6–8. Prior to that, we review in Sec. 2 two other approaches, the Segal quantization and the Borel quantization, which are straightforward generalizations of the canonical scheme. They take a slightly different route by working only with the configuration space Q (the phase space Γ is basically forgotten completely, and its symplectic structure ω is used solely for the purpose of defining the Poisson bracket), and quantizing only functions on Q and vector fields on it instead of functions on Γ. This is the Segal quantization; the Borel quantization enhances it further by allowing for internal degrees-of-freedom (such as spin) with the aid of tools from representation theory, namely systems of imprimitivity and projection-valued measures.

As mentioned earlier and as will emerge from the discussion, no one method completely solves the problem of quantization, nor does it adequately answer all the questions raised. Consequently, we refrain from promoting one over the other, inviting the reader to formulate their own preference.

2. Canonical Quantization and Its Generalizations
We discuss in some detail in this section the original idea of quantization, introduced in the early days of quantum mechanics (rather simple minded and ad hoc, but extremely effective), and some later refinements of it. Some useful references are [80, 86, 115, 118, 127, 134, 142, 193, 238] and [241].

2.1. The early notion of quantization
The originators of quantum theory used the following simple technique for quantizing a classical system: as before, let $q_i, p_i$, i = 1, 2, ..., n, be the canonical position and momentum coordinates, respectively, of a free classical system with n degrees-of-freedom. Then their quantized counterparts, $\hat q_i, \hat p_i$, are to be realized as operators on the Hilbert space $H = L^2(\mathbb{R}^n, dx)$, by the prescription (see (1.1), with $\hbar = h/2\pi$):
$$ (\hat q_i \psi)(x) = x_i \psi(x), \qquad (\hat p_i \psi)(x) = -i\hbar \frac{\partial}{\partial x_i}\psi(x), \tag{2.1} $$
on an appropriately chosen dense set of vectors ψ in H. This simple procedure is known as canonical quantization. Then, as mentioned earlier, the Stone–von Neumann uniqueness theorem [199] states that, up to unitary equivalence, this is the only representation which realizes the canonical commutation relations (CCR):
$$ [\hat q_i, \hat p_j] = i\hbar I \delta_{ij}, \qquad i, j = 1, 2, \dots, n, \tag{2.2} $$
irreducibly on a separable Hilbert space. Let us examine this question of irreducibility a little more closely.
The operators $\hat q_i$, $\hat p_j$ and I are the generators of a representation of the Weyl–Heisenberg group on $L^2(\mathbb{R}^n, dx)$. This group (for a system with n degrees-of-freedom), which we denote by $G_{WH}(n)$, is topologically isomorphic to $\mathbb{R}^{2n+1}$ and consists of elements (θ, η), with $\theta \in \mathbb{R}$ and $\eta \in \mathbb{R}^{2n}$, obeying the product rule
$$ (\theta, \eta)(\theta', \eta') = (\theta + \theta' + \xi(\eta, \eta'),\; \eta + \eta'), \tag{2.3} $$
where the multiplier ξ is given by
$$ \xi(\eta, \eta') = \frac{1}{2}\eta^\dagger \omega \eta' = \frac{1}{2}(p\cdot q' - q\cdot p'), \qquad \omega = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}, \tag{2.4} $$
$I_n$ being the n × n identity matrix. This group is unimodular and nilpotent, with Haar measure dθ dη, dη being the Lebesgue measure of $\mathbb{R}^{2n}$. Each unitary irreducible representation (UIR) of $G_{WH}(n)$ is characterized by a non-zero real number, which we write as $1/\hbar$, and eventually identify $h = 2\pi\hbar$ with Planck’s constant (of course, for a specific value of it). Each UIR is carried by the Hilbert space $H = L^2(\mathbb{R}^n, dx)$ via the following unitary operators:
$$ (U(\theta, \eta)\psi)(x) = \left(\exp\left[\frac{i}{\hbar}\bigl(\theta + \eta^\dagger \omega \hat\eta\bigr)\right]\psi\right)(x) = \exp\left[\frac{i}{\hbar}\left(\theta + p\cdot x - \frac{1}{2}\,p\cdot q\right)\right]\psi(x - q), \qquad \psi \in H. \tag{2.5} $$
This shows that the 2n quantized (unbounded) operators, $\hat\eta_i = \hat q_i$, i = 1, 2, ..., n and $\hat\eta_i = \hat p_{i-n}$, i = n + 1, n + 2, ..., 2n, which are the components of $\hat\eta$, along with the identity operator I on H, are the infinitesimal generators spanning the representation of the Lie algebra $\mathfrak{g}_{WH}(n)$ of the Weyl–Heisenberg group $G_{WH}(n)$. Since the representation (2.5) is irreducible, so also is the representation (2.1) of the Lie algebra. This is the precise mathematical sense in which we say that the algebra of Poisson brackets $\{q_i, p_j\} = \delta_{ij}$ is irreducibly realized by the representation (2.2) of the CCR.

One could justifiably ask at this point, how many additional elements of the classical algebra, i.e., functions of $q_i, p_j$, could be similarly quantized and added to the set $\mathfrak{g}_{WH}(n)$ and the resulting enlarged algebra still be represented irreducibly on the same Hilbert space H. In other words, does there exist a larger algebra, containing $\mathfrak{g}_{WH}(n)$, which is also irreducibly represented on $H = L^2(\mathbb{R}^n, dx)$ and whose elements are the quantized versions of classical observables? To analyze this point further, let us look at functions u on $\mathbb{R}^{2n}$ which are real-valued homogeneous polynomials in the variables $q_i$ and $p_j$ of degree two. Any such polynomial can be written as:
$$ u(\eta) = \frac{1}{2}\sum_{i,j=1}^{2n} \eta_i U_{ij} \eta_j = \frac{1}{2}\eta^T U \eta, \tag{2.6} $$
where the $U_{ij}$ are the elements of a 2n × 2n real, symmetric matrix U. Set
$$ U = JX(u), \qquad J = \omega^{-1}, \tag{2.7} $$
with $X(u) = -JU$, a 2n × 2n real matrix satisfying
$$ X(u) = JX(u)^T J. \tag{2.8} $$
It follows, therefore, that every such homogeneous real-valued polynomial u is characterized by a 2n × 2n real matrix X(u) satisfying (2.8), and conversely, every such matrix represents a homogeneous real-valued polynomial of degree two via
$$ u(\eta) = \frac{1}{2}\eta^T JX(u)\eta. \tag{2.9} $$
Computing the Poisson bracket of two such polynomials u and v, we easily see that
$$ \{u, v\} = \frac{1}{2}\eta^T J[X(u), X(v)]\eta, \qquad \text{where } [X(u), X(v)] = X(u)X(v) - X(v)X(u). \tag{2.10} $$
In other words, the set of homogeneous, real-valued, quadratic polynomials constitutes a closed algebra under the Poisson bracket operation, which we denote by $P_2$, and the corresponding set of matrices X(u) is closed under the bracket relation,
$$ [X(u), X(v)] = X(\{u, v\}), \tag{2.11} $$
constituting thereby a matrix realization of the same algebra, $P_2$. In fact, it is not hard to see that this is a maximal subalgebra of the Poisson algebra $(C^\infty(\mathbb{R}^{2n}), \{\cdot,\cdot\})$ of all smooth functions on $\mathbb{R}^{2n}$ with respect to the Poisson bracket (i.e., any other subalgebra which contains $P_2$ must necessarily be the entire Poisson algebra). Moreover, we also see that
$$ \{\eta_i, u\} = (X(u)\eta)_i, \qquad i = 1, 2, \dots, 2n, \tag{2.12} $$
or compactly,
$$ \{\eta, u\} = X(u)\eta, \tag{2.13} $$
which can be thought of as giving the action of the Poisson algebra of quadratic polynomials on $\mathbb{R}^{2n}$.

Consider now the symplectic group $Sp(2n, \mathbb{R})$, of 2n × 2n real matrices S, satisfying $SJS^T = J$ and det S = 1. Let $S = e^{\varepsilon X}$ be an element of this group, close to the identity, where ε > 0 and X is a 2n × 2n real matrix. The fact that S can be written this way is guaranteed by the exponential mapping theorem for Lie groups. The defining condition $SJS^T = J$, for an element of $Sp(2n, \mathbb{R})$, then implies
$$ (I_{2n} + \varepsilon X)J(I_{2n} + \varepsilon X)^T + O(\varepsilon^2) = J. $$
Simplifying and dividing by ε,
$$ XJ + JX^T + O(\varepsilon) = 0. $$
Hence, letting ε → 0, we find that
$$ XJ + JX^T = 0 \;\Rightarrow\; X = JX^T J. \tag{2.14} $$
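The linear-algebra facts used in (2.7), (2.8) and (2.14) are easy to test numerically. The following sketch uses exact rational arithmetic for n = 2; the particular symmetric matrix `U`, the value of `eps`, and all helper names are illustrative assumptions chosen here. It checks that $X = -JU$ satisfies (2.8), that $JX$ recovers the symmetric matrix U, that $XJ + JX^T$ vanishes, and that the first-order truncation $I + \varepsilon X$ fails to be symplectic only by the second-order term $\varepsilon^2 XJX^T$.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def lincomb(A, B, s=1):
    """Entrywise A + s * B."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, s):
    return [[s * a for a in row] for row in A]

n = 2
N = 2 * n
I2n = [[int(i == j) for j in range(N)] for i in range(N)]
ZERO = [[0] * N for _ in range(N)]

# J = omega^{-1} with omega = [[0, -I_n], [I_n, 0]], i.e. J = [[0, I_n], [-I_n, 0]].
J = [[0] * N for _ in range(N)]
for i in range(n):
    J[i][n + i] = 1
    J[n + i][i] = -1

# An arbitrary real symmetric matrix U (illustrative values only).
U = [[2, 1, 0, 3],
     [1, -1, 4, 0],
     [0, 4, 5, 2],
     [3, 0, 2, 1]]

X = scale(matmul(J, U), -1)                                  # X(u) = -J U

cond_2_8 = matmul(matmul(J, transpose(X)), J)                # J X^T J
generator = lincomb(matmul(X, J), matmul(J, transpose(X)))   # X J + J X^T

eps = Fraction(1, 1000)
S1 = lincomb(I2n, scale(X, eps))                             # I + eps X
residual = lincomb(matmul(matmul(S1, J), transpose(S1)), J, s=-1)
second_order = scale(matmul(matmul(X, J), transpose(X)), eps ** 2)
```

None of these checks depends on the sign convention chosen for the Poisson bracket; they only use $J^2 = -I$, $J^T = -J$ and the symmetry of U.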
Thus, JX is a symmetric matrix and X a matrix of the type (2.8) with an associated second degree, homogeneous, real-valued polynomial:
$$ X = X(u), \qquad u(\eta) = \frac{1}{2}\eta^T JX\eta. \tag{2.15} $$
On the other hand, the matrices X in $S = e^{\varepsilon X}$ constitute the Lie algebra $\mathfrak{sp}(2n, \mathbb{R})$ of the Lie group $Sp(2n, \mathbb{R})$, and thus we have established an algebraic isomorphism $P_2 \simeq \mathfrak{sp}(2n, \mathbb{R})$. Moreover, the relations (2.10) and (2.13) together then constitute the Lie algebra of the metaplectic group,^c which is the semi-direct product $Mp(2n, \mathbb{R}) = G_{WH}(n) \rtimes Sp(2n, \mathbb{R})$. The Lie algebra, $\mathfrak{mp}(2n, \mathbb{R})$, of this group consists, therefore, of all real-valued, first-order and second-order homogeneous polynomials in the variables $q_i, p_i$, i = 1, 2, ..., n.

^c Due to some existing terminological confusion in the literature, this is a different metaplectic group from the one we will encounter in Sec. 3.5 below.

The group $Mp(2n, \mathbb{R})$ has elements (θ, η, S) and the multiplication rule is:
$$ (\theta, \eta, S)(\theta', \eta', S') = (\theta + \theta' + \xi(\eta, S\eta'),\; \eta + S\eta',\; SS'), \tag{2.16} $$
with the same multiplier ξ as in (2.4). The metaplectic group has a UIR on the same space H, extending the representation U of $G_{WH}(n)$ given in (2.5). We denote this representation again by U and see that since $(\theta, \eta, S) = (\theta, \eta, I_{2n})(0, 0, S)$,
$$ U(\theta, \eta, S) = U(\theta, \eta)U(S), \tag{2.17} $$
where for $S = e^{\varepsilon X(u)}$, the unitary operator U(S) can be shown [241] to be
$$ U(S) = \exp\left[-\frac{i\varepsilon}{\hbar}\hat X(u)\right], \qquad \hat X(u) = -\frac{1}{2}\hat\eta^T JX(u)\hat\eta. \tag{2.18} $$
Furthermore, using the unitarity of U(S), it is easily shown that
$$ [\hat X(u), \hat X(v)] = i\hbar\,\hat X(\{u, v\}), \tag{2.19} $$
that is, the quantization of η now extends to second degree, homogeneous polynomials in the manner $u \mapsto \hat X(u) := \hat u$. The self-adjoint operators $\hat\eta$ and $\hat X(u)$ of the representation of the Lie algebra $\mathfrak{mp}(2n, \mathbb{R})$, on the Hilbert space H, satisfy the full set of commutation relations,
$$ [\hat\eta, \hat X(u)] = i\hbar\,X(u)\hat\eta, \qquad [\hat X(u), \hat X(v)] = i\hbar\,\hat X(\{u, v\}). \tag{2.20} $$
In the light of the Groenewold–van Hove results, mentioned earlier, this is the best one can do. In other words, it is not possible to find an algebra larger than $\mathfrak{mp}(2n, \mathbb{R})$, which quantizes a larger classical algebra and which could also be irreducibly represented on $L^2(\mathbb{R}^n, dx)$. On the other hand, van Hove also showed that if one relaxes the irreducibility condition, then on $L^2(\mathbb{R}^{2n}, d\eta)$, it is possible to
represent the full Poisson algebra of $\mathbb{R}^{2n}$. This is the so-called prequantization result, to which we shall return later.

Given the present scheme of canonical quantization, a number of questions naturally arise.
• Let Q be the position space manifold of the classical system and q any point in it. Geometrically, the phase space of the system is the cotangent bundle $\Gamma = T^*Q$. If Q is linear, i.e., $Q \simeq \mathbb{R}^n$, then the replacement $q_i \to x_i$, $p_j \to -i\hbar\,\partial/\partial x_j$ works fine. But what if Q is not a linear space?
• How do we quantize observables which involve higher powers of $q_i, p_j$, such as for example $f(q, p) = (q_i)^n (p_j)^m$, when m + n ≥ 3?
• How should we quantize more general phase spaces, which are symplectic manifolds but not necessarily cotangent bundles?
In the rest of this section we review two procedures which have been proposed to extend canonical quantization to provide, among others, the answer to the first of these questions.

2.2. Segal and Borel quantization
A method for quantizing on an arbitrary configuration space manifold Q was proposed by Segal [238], as a generalization of canonical quantization and very much within the same spirit. A group theoretical method was suggested by Mackey [180], within the context of the theory of induced representations of finite dimensional groups. A much more general method, combining the Segal and Mackey approaches, was later developed by Doebner, Tolar, Pasemann, Mueller, Angermann and Nattermann [80, 81, 193]. It cannot be applied to an arbitrary symplectic manifold, but only to cotangent bundles; the reason is that it distinguishes between the position variables q ∈ Q (the configuration space) and the momentum variables X ∈ TQ in an essential way.
Functions $f(q)$ of the spatial variables are quantized by the multiplication operators $(\hat f \phi)(q) = f(q)\phi(q)$ on $L^2(Q, \mu)$ with some measure $\mu$, while vector fields $X$ are quantized by
\[ \hat X \phi = -\frac{ih}{2\pi}\,(X\phi + \operatorname{div}_\mu X \cdot \phi) \]
(the additional term $\operatorname{div}_\mu X$ ensures that $\hat X$ be a formally self-adjoint operator on $L^2(Q, \mu)$). One then has the commutation relations
\[ [\hat X, \hat Y] = -\frac{ih}{2\pi}\,\widehat{[X, Y]}, \qquad [\hat X, \hat f] = -\frac{ih}{2\pi}\,\widehat{Xf}, \qquad [\hat f, \hat g] = 0, \]
which clearly generalize (1.3).

A method using infinite dimensional diffeomorphism groups, obtained from local current algebras on the physical space, was suggested by Goldin et al. [112, 115, 118]. The relation to diffeomorphism groups of the configuration space was also noticed by Segal, who in fact in the same paper [238] lifted the theory to the cotangent
bundle $T^*Q$ and thereby anticipated the theory of geometric quantization. Segal also pointed out that the number of inequivalent such quantizations was related to the first cohomology group of $Q$.

2.3. Segal quantization

Let us elaborate a bit on the technique suggested by Segal. The configuration space $Q$ of the system is, in general, an $n$-dimensional $C^\infty$-manifold. Since, in the case $Q = \mathbb{R}^n$, canonical quantization represents the classical position observables $q_i$ as the operators $\hat q_i$ of multiplication by the corresponding position variable on the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^n, dx)$, Segal generalized this idea and defined an entire class of observables of position using the smooth functions $f : Q \to \mathbb{R}$. Similarly, since canonical quantization on $Q = \mathbb{R}^n$ replaces the classical observables of momentum, $p_i$, by derivatives with respect to these variables, in Segal's scheme an entire family of quantized momentum observables is obtained by using the vector fields $X$ of the manifold $Q$.

With this idea in mind, starting with a general configuration space manifold, one first has to choose a Hilbert space. If the manifold is orientable, its volume form determines a measure, $\nu$, which is locally equivalent to the Lebesgue measure:
\[ d\nu(x) = \rho(x)\, dx_1\, dx_2 \cdots dx_n, \qquad x \in Q, \tag{2.21} \]
where $\rho$ is a positive, non-vanishing function. The quantum mechanical Hilbert space is then taken to be $\mathcal{H} = L^2(Q, d\nu)$. In local coordinates we shall write the vector fields of $Q$ as
\[ X = \sum_{i=1}^{n} a_i(x)\, \frac{\partial}{\partial x_i}, \]
for $C^\infty$-functions $a_i : Q \to \mathbb{R}$. The generalized quantum observables of position are then defined by the mappings $f \mapsto \hat q(f)$, such that on some suitable dense set of vectors $\psi \in \mathcal{H}$,
\[ (\hat q(f)\psi)(x) = f(x)\psi(x). \tag{2.22} \]
Ignoring technicalities involving domains of these operators, they are easily seen to be self-adjoint ($f$ is real). In order to obtain a set of quantized momentum observables, we first notice that quite generally the natural action of the vector field $X$, $\phi \mapsto X(\phi)$, on a suitably chosen set of smooth functions $\phi \in \mathcal{H}$, defines an operator on the Hilbert space. This operator may not be bounded and may not be self-adjoint. However, denoting by $X^*$ the adjoint of the operator $X$, the combination
\[ \hat p(X) = \frac{\hbar}{2i}\,[X - X^*], \tag{2.23} \]
does define a self-adjoint operator (if again we ignore domain related technicalities), and we take this to be the generalized momentum operator corresponding to the
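vector field $X$. An easy computation then leads to the explicit expression,
\[ \hat p(X) = -i\hbar\,(X + K_X), \tag{2.24} \]
where $K_X$ is the operator of multiplication by the function
\[ k_X(x) = \frac{1}{2}\operatorname{div}_\nu(X)(x) = \frac{1}{2}\left[ X(\log\rho)(x) + \sum_{i=1}^{n} \frac{\partial a_i(x)}{\partial x_i} \right]. \tag{2.25} \]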
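The "easy computation" behind (2.24) can be sketched as follows. The sketch assumes that $\phi$, $\psi$ are smooth with compact support (so that boundary terms vanish) and that the inner product of $L^2(Q, d\nu)$ is antilinear in its first argument; both conventions are chosen here only for illustration.

```latex
\begin{align*}
\langle \phi, X\psi \rangle
  &= \int_Q \overline{\phi}\, \sum_{i=1}^n a_i \frac{\partial \psi}{\partial x_i}\, \rho \, dx
   = -\int_Q \sum_{i=1}^n \frac{\partial (\rho\, a_i \overline{\phi})}{\partial x_i}\, \psi \, dx \\
  &= -\int_Q \overline{\bigl( X\phi + \operatorname{div}_\nu(X)\, \phi \bigr)}\, \psi \, d\nu ,
  \qquad
  \operatorname{div}_\nu(X) = X(\log\rho) + \sum_{i=1}^n \frac{\partial a_i}{\partial x_i}, \\
\text{so that}\quad X^* &= -X - \operatorname{div}_\nu(X), \qquad
\frac{\hbar}{2i}\,[X - X^*] = \frac{\hbar}{2i}\bigl( 2X + \operatorname{div}_\nu(X) \bigr)
  = -i\hbar\,\bigl( X + \tfrac12 \operatorname{div}_\nu(X) \bigr).
\end{align*}
```

The last expression is (2.24) with $k_X = \tfrac12 \operatorname{div}_\nu(X)$, in agreement with (2.25).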
In terms of the Lie bracket $[X, Y] = X \circ Y - Y \circ X$ of the vector fields, one then obtains for the quantized operators the following commutation relations, which clearly generalize the canonical commutation relations:
\[
\begin{aligned}
[\hat p(X), \hat p(Y)] &= -i\hbar\,\hat p([X, Y]), \\
[\hat q(f), \hat p(X)] &= i\hbar\,\hat q(X(f)), \\
[\hat q(f), \hat q(g)] &= 0.
\end{aligned}
\tag{2.26}
\]
It ought to be pointed out here that the above commutation relations constitute an infinite dimensional Lie algebra, $\mathfrak{X}_c(Q) \oplus C^\infty(Q)_{\mathbb{R}}$. This is the Lie algebra of the (infinite-dimensional) group $\mathfrak{X}_c(Q) \rtimes \mathrm{Diff}(Q)$, the semi-direct product of the (additive) linear group of all complete vector fields of $Q$ with the group (under composition) of diffeomorphisms of $Q$ (generated by the elements of $\mathfrak{X}_c(Q)$). The product of two elements $(f_1, \phi_1)$ and $(f_2, \phi_2)$ of this group is defined as:
\[ (f_1, \phi_1)(f_2, \phi_2) = (f_1 + \phi_1(f_2),\ \phi_1 \circ \phi_2). \]
The Lie algebra generated by the first set of commutation relations (for the momentum operators) in (2.26) is called a current algebra. When modeled on the physical space, rather than the configuration space, the relations (2.26) are precisely the non-relativistic current algebra introduced by Dashen and Sharp [73]. The corresponding semi-direct product group was obtained in this context by Goldin [112].

Next note that if $\theta$ is a fixed one-form of $Q$, then replacing $\hat p(X)$ by
\[ \hat p(X)' = \hat p(X) + X \lrcorner\, \theta, \tag{2.27} \]
in (2.24) does not change the commutation relations in (2.26). Indeed, by choosing such one-forms appropriately, one can generate inequivalent families of representations of the Lie algebra $\mathfrak{X}_c(Q) \oplus C^\infty(Q)_{\mathbb{R}}$. In particular, if $\theta$ is logarithmically exact, i.e., if $\theta = dF/F$ for some smooth function $F$, then the representations generated by the two sets of operators, $\{\hat p(X), \hat q(f)\}$ and $\{\hat p(X)', \hat q(f)\}$, are unitarily equivalent. In other words, there exists a unitary operator $V$ on $\mathcal{H}$ which commutes with all the $\hat q(f)$, $f \in C^\infty(Q)_{\mathbb{R}}$, and such that
\[ V \hat p(X) V^* = \hat p(X)', \qquad X \in \mathfrak{X}_c(Q). \]
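For a logarithmically exact $\theta = dF/F$, one candidate for the unitary $V$ is multiplication by a phase. The check below is only a sketch: it assumes $F > 0$ (so that $\log F$ is single-valued, an assumption not stated in the text) and uses $\hat p(X) = -i\hbar(X + K_X)$ from (2.24).

```latex
\begin{align*}
(V\psi)(x) &:= \exp\!\Bigl[ -\tfrac{i}{\hbar} \log F(x) \Bigr]\, \psi(x), \\
V \hat p(X) V^* \psi
  &= e^{-\frac{i}{\hbar}\log F}\, (-i\hbar)\,(X + K_X)\bigl( e^{\frac{i}{\hbar}\log F}\, \psi \bigr) \\
  &= -i\hbar\,(X + K_X)\psi + X(\log F)\, \psi
   = \hat p(X)\psi + \frac{X(F)}{F}\, \psi .
\end{align*}
```

Since $X(F)/F$ is the contraction of $\theta = dF/F$ with $X$, this reproduces (2.27); and because $V$ is itself a multiplication operator, it commutes with every $\hat q(f)$.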
Some simple examples

The obvious example illustrating the above technique is provided by taking $Q = \mathbb{R}^3$, $\mathcal{H} = L^2(\mathbb{R}^3, dx)$. Consider the functions and vector
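fields,
\[ f_i(x) = x_i, \qquad X_i = \frac{\partial}{\partial x_i}, \qquad J_i = \varepsilon_{ijk}\, x_j \frac{\partial}{\partial x_k}, \qquad i, j, k = 1, 2, 3, \tag{2.28} \]
where $\varepsilon_{ijk}$ is the well-known completely antisymmetric tensor (in the indices $i, j, k$), summation being implied over repeated indices. Quantizing these according to the above procedure we get the usual position, momentum and angular momentum operators,
\[ \hat q_i := \hat q(f_i) = x_i, \qquad \hat p_i = \hat p(X_i) = -i\hbar \frac{\partial}{\partial x_i}, \qquad \hat J_i = \hat p(J_i) = -i\hbar\, \varepsilon_{ijk}\, x_j \frac{\partial}{\partial x_k}. \tag{2.29} \]
Computing the commutation relations between these operators, following (2.26), we get the well-known results,
\[
\begin{aligned}
[\hat q_i, \hat q_j] &= [\hat p_i, \hat p_j] = 0, & [\hat q_i, \hat p_j] &= i\hbar\,\delta_{ij} I, \\
[\hat q_i, \hat J_j] &= i\hbar\,\varepsilon_{ijk}\,\hat q_k, & [\hat p_i, \hat J_j] &= i\hbar\,\varepsilon_{ijk}\,\hat p_k, \\
[\hat J_i, \hat J_j] &= i\hbar\,\varepsilon_{ijk}\,\hat J_k. &&
\end{aligned}
\tag{2.30}
\]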
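As a sample check of (2.30), one of the brackets can be worked out directly from (2.29), acting on a smooth test function $\psi$ (the symbol $\psi$ and the relabelled summation indices $k, l$ are introduced here only for illustration):

```latex
\begin{align*}
[\hat q_i, \hat J_j]\,\psi
  &= -i\hbar\,\varepsilon_{jkl}\Bigl( x_i x_k \frac{\partial\psi}{\partial x_l}
      - x_k \frac{\partial (x_i\psi)}{\partial x_l} \Bigr)
   = i\hbar\,\varepsilon_{jkl}\,\delta_{il}\, x_k\, \psi \\
  &= i\hbar\,\varepsilon_{jki}\, x_k\, \psi
   = i\hbar\,\varepsilon_{ijk}\, \hat q_k\, \psi ,
\end{align*}
```

where the last step uses the invariance of $\varepsilon$ under cyclic permutations of its indices.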
Note that these are just the commutation relations between the infinitesimal generators of the orthochronous Galilei group $G_{\mathrm{orth}}$ in a space of three dimensions and hence they define its Lie algebra, which now emerges as a subalgebra of the Lie algebra $\mathfrak{X}_c(Q) \oplus C^\infty(Q)_{\mathbb{R}}$.

Now let $A(x) = (A_1(x), A_2(x), A_3(x))$ be a magnetic vector potential, $B = \nabla \times A$ the corresponding magnetic field. Consider the one-form
\[ \theta = -\frac{e}{c} \sum_{i=1}^{3} A_i\, dx_i \]
($e$ = charge of the electron and $c$ = velocity of light). The set of quantized operators $\hat q(f)$ and
\[ \hat p(X)' = -i\hbar X + \frac{1}{2} \sum_{i=1}^{3} \left[ (\hat p_i a_i) - \frac{2e}{c} A_i a_i \right], \qquad \text{where } X(x) = \sum_{i=1}^{3} a_i(x) \frac{\partial}{\partial x_i}, \tag{2.31} \]
realize a quantization of a nonrelativistic charged particle in a magnetic field. (For a "current algebraic" description, see Menikoff and Sharp [183].) In particular, if $d\theta = 0$ (i.e., $\nabla \times A = B = 0$), then $\theta$ is closed, hence exact, and there is no magnetic field. Hence, from a physical point of view, the quantizations corresponding to different such $\theta$ must all be unitarily equivalent and indeed, as noted above, this is also true mathematically. This point is illustrated by taking the vector potential $A(x) = \mu(x_2, x_1, 0)$, where $\mu$ is a constant. Then $\nabla \times A = 0$ and the one-form $\theta = -\frac{e\mu}{c}[x_2\, dx_1 + x_1\, dx_2]$ is logarithmically exact:
\[ \theta = \frac{dF}{F}, \qquad \text{with } F = \exp\left[ -\frac{e\mu}{c}\, x_1 x_2 \right]. \]
On the other hand, consider the case where $A(x) = \frac{B}{2}(-x_2, x_1, 0)$, $B > 0$. This is the case of a constant magnetic field $\mathbf{B} = (0, 0, B)$ of strength $B$ along the third axis. The corresponding one-form $\theta = \frac{eB}{2c}[x_2\, dx_1 - x_1\, dx_2]$ is not closed and for each different value of $B$ we get an inequivalent quantization.

As the next example, let $Q = \mathbb{R}^3 \setminus \{\mathbb{R}\}$, the three-dimensional Euclidean space with the third axis removed. We take the measure $d\nu(x) = dx$ and the Hilbert space $\mathcal{H} = L^2(Q, d\nu)$. Consider the vector potential,
\[ A(x) = -\frac{\mu}{r^2}(-x_2, x_1, 0), \qquad \mu > 0, \qquad r^2 = (x_1)^2 + (x_2)^2. \]
Then $B = \nabla \times A = 0$ and the one-form
\[ \theta(x) = \frac{\mu e}{c r^2}\,[x_2\, dx_1 - x_1\, dx_2] \tag{2.32} \]
is closed. However, $\theta$ is not exact, since we may write $\theta = dF$, with
\[ F = -\frac{\mu e}{c} \tan^{-1}\frac{x_2}{x_1}, \tag{2.33} \]
which is a multivalued function on $Q$. Since $B = 0$, physically the classical systems with $A = 0$ and $A$ given as above should be equivalent. However, the quantizations for the two cases (which can be easily computed using (2.27)) are inequivalent. This is an example of the Aharonov–Bohm effect (see [1]).

Finally, for the same configuration space $\mathbb{R}^3 \setminus \{\mathbb{R}\}$, consider the case in which the magnetic field itself is given by
\[ B(x) = \frac{2I}{c r^2}(-x_2, x_1, 0), \qquad r^2 = (x_1)^2 + (x_2)^2. \]
This is the magnetic field generated by an infinite current bearing wire (of current strength $I$) placed along the $x_3$-axis. The vector potential, given locally by
\[ A(x) = \frac{2I}{c}(0, 0, \phi), \qquad -\frac{\pi}{2} < \phi = \tan^{-1}\frac{x_2}{x_1} < \frac{\pi}{2}, \]
does not give rise to a global form and for each value of $I$ one gets a different quantization.

As mentioned earlier, Segal actually suggested going over to the group of diffeomorphisms $\mathrm{Diff}(Q)$ and its unitary representations, to attend to domain questions associated to $\hat q(f)$, $\hat p(X)$, and then suggested a classification scheme for possible unitarily inequivalent quantizations in these terms. Note also that the Segal quantization method is based on configuration space, rather than on phase space. As such, the primary preoccupation here is to generalize the method of canonical quantization. On the other hand, as we said before, Segal also extended the theory to phase space and in that sense, Segal's method leads to similar results as other methods that we shall study, on the representations of the Poisson algebra on Hilbert space.

At this point we should also mention that Goldin, Sharp and their collaborators proposed to describe quantum theory by means of unitary representations of
groups of diffeomorphisms of the physical space [112, 114, 120]. Deriving the current algebra from second quantized canonical fields, their programme has succeeded in predicting unusual possibilities, including the statistics of anyons in two space dimensions [115, 116, 119, 177]. Diffeomorphisms of the physical space act naturally on the configuration space $Q$ and thus form a subgroup. In fact, the unitary representations of this group are sufficient to characterize the quantum theory, so that the results of Goldin et al. carry over to the quantization framework described in the next section. In particular, the unitarily inequivalent representations describing particle statistics were first obtained by Goldin, Menikoff and Sharp [115–117]. For an extended review of these ideas, see [113].

2.4. Borel quantization

We pass on to the related, and certainly more assiduously studied, method of Borel quantization. This method focuses on both the geometric and measure theoretic properties of the configuration space manifold $Q$ as well as attempting to incorporate internal symmetries by lifting $Q$ to a complex Hermitian vector bundle with connection and curvature, compatible with the Hermitian structure.

Consider a one-parameter family of diffeomorphisms $s \mapsto \phi_s$ of $\mathbb{R}^n$, which are sufficiently well behaved in the parameter $s \in \mathbb{R}$, in an appropriate sense. Then,
\[ \frac{d}{ds}\, f \circ \phi_s \Big|_{s=0} = X(f), \tag{2.34} \]
where $f$ is an arbitrary smooth function, defines a vector field $X$. Its quantized form $\hat p(X)$, according to Segal's procedure, will be a general momentum observable acting on $\psi \in L^2(\mathbb{R}^n, dx)$ in the manner
\[ (\hat p(X)\psi)(x) = -i\hbar\,(X\psi)(x) - \frac{i\hbar}{2} \sum_{i=1}^{n} \frac{\partial a_i}{\partial x_i}(x)\,\psi(x), \qquad \text{where } X(x) = \sum_{i=1}^{n} a_i(x) \frac{\partial}{\partial x_i}, \tag{2.35} \]
and together, the set of all such momentum observables then forms an algebra under the bracket operation (see (2.26)):
\[ [\hat p(X), \hat p(Y)] = -i\hbar\,\hat p([X, Y]). \tag{2.36} \]
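We write $\phi_s = \phi^X_s$, to indicate the generator, and define the transformed sets
\[ \phi^X_s(\Delta) = \{\phi^X_s(x) \mid x \in \Delta\}, \tag{2.37} \]
for each Borel set $\Delta$ in $\mathbb{R}^n$. Next, denote the $\sigma$-algebra of the Borel sets of $Q = \mathbb{R}^n$ by $\mathcal{B}(\mathbb{R}^n)$. Corresponding to each $\Delta \in \mathcal{B}(\mathbb{R}^n)$, define an operator $P(\Delta)$ on $\mathcal{H}$:
\[ (P(\Delta)\psi)(x) = \chi_\Delta(x)\psi(x), \qquad \chi_\Delta(x) = \begin{cases} 1, & \text{if } x \in \Delta, \\ 0, & \text{otherwise.} \end{cases} \tag{2.38} \]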
This is a projection operator, $P(\Delta) = P(\Delta)^* = P(\Delta)^2$, and has the following measure theoretic properties:
\[ P(\emptyset) = 0, \qquad P(\mathbb{R}^n) = I, \qquad P\Bigl( \bigcup_{i \in J} \Delta_i \Bigr) = \sum_{i \in J} P(\Delta_i) \quad \text{if } \Delta_i \cap \Delta_j = \emptyset,\ i \neq j, \tag{2.39} \]
where $J$ is a discrete index set and the convergence of the sum is meant in the weak sense. Such a set of projection operators $P(\Delta)$, $\Delta \in \mathcal{B}(\mathbb{R}^n)$, is called a (normalized) projection valued measure (or PV-measure for short) on $\mathbb{R}^n$. Note that, for any $\psi \in \mathcal{H}$,
\[ \mu_\psi(\Delta) = \langle \psi | P(\Delta)\psi \rangle = \int_\Delta |\psi(x)|^2\, dx, \qquad \Delta \in \mathcal{B}(\mathbb{R}^n), \tag{2.40} \]
defines a real measure, absolutely continuous with respect to the Lebesgue measure. It is then easily checked that for each $s \in \mathbb{R}$,
\[ V(\phi^X_s) = \exp[-is\,\hat p(X)], \tag{2.41} \]
defines a unitary operator on $\mathcal{H}$, such that $\{V, P\}$ is a system of imprimitivity [180, 271] in the sense:
\[ V(\phi^X_s)\, P(\Delta)\, V(\phi^X_{-s}) = P(\phi^X_s(\Delta)). \tag{2.42} \]
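The simplest instance of (2.42) is $n = 1$ with $X = d/dx$, whose flow is translation. The sketch below works in units with $\hbar = 1$ (an assumption made here so that $\exp[-is\,\hat p(X)]$ is exactly translation by $s$) and treats the exponential formally on smooth $\psi$:

```latex
\begin{align*}
\phi^X_s(x) &= x + s, \qquad \hat p(X) = -i\,\frac{d}{dx}, \qquad
(V(\phi^X_s)\psi)(x) = \bigl(e^{-s\, d/dx}\psi\bigr)(x) = \psi(x - s), \\
\bigl(V(\phi^X_s)\, P(\Delta)\, V(\phi^X_{-s})\psi\bigr)(x)
  &= \chi_\Delta(x - s)\,\bigl(V(\phi^X_{-s})\psi\bigr)(x - s)
   = \chi_\Delta(x - s)\,\psi(x) \\
  &= \chi_{\Delta + s}(x)\,\psi(x)
   = \bigl(P(\phi^X_s(\Delta))\psi\bigr)(x).
\end{align*}
```

Here $\Delta + s$ denotes the translated set $\{x + s \mid x \in \Delta\} = \phi^X_s(\Delta)$.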
Now considering all such one-parameter diffeomorphism groups and their associated systems of imprimitivity, we find that the collective system is certainly irreducibly realized on $\mathcal{H} = L^2(\mathbb{R}^n, dx)$.

Suppose now that the system which we wish to quantize has some internal degrees of freedom, such as the spin of a particle. Thus there is some group $G$ of internal symmetries, and for any unitary irreducible representation (UIR) of $G$ on some (auxiliary) Hilbert space $\mathcal{K}$, we want to work on the Hilbert space $\mathcal{H} = \mathcal{K} \otimes L^2(\mathbb{R}^n, dx)$ instead of just $L^2(\mathbb{R}^n, dx)$; and we would like (2.42) to be irreducibly realized on this $\mathcal{H}$. For instance, for the free particle in $\mathbb{R}^3$, to accommodate its spin we need to replace$^{d}$ $L^2(\mathbb{R}^3, dx)$ by $\mathcal{H} = \mathbb{C}^{2j+1} \otimes L^2(\mathbb{R}^3, dx)$, with $\mathbb{C}^{2j+1}$ carrying the $j$th spinor representation of SU(2), $j = 0, \frac{1}{2}, 1, \frac{3}{2}, \ldots$. The aim of Borel quantization is to construct such irreducible systems on arbitrary configuration space manifolds $Q$. It is clear that the problem is related to that of finding irreducible representations of the diffeomorphism group, $\mathrm{Diff}(Q)$, which admit systems of imprimitivity based on the Borel sets of $Q$.

Let $Q$ be a configuration space manifold, of dimension $n$, $\mu$ a smooth measure on $Q$ (i.e., locally equivalent to the Lebesgue measure on $\mathbb{R}^n$) and let $\tilde{\mathcal{H}} = \mathbb{C}^k \otimes L^2(Q, d\mu)$, where $k \geq 1$ is an integer.

[Footnote d] From a purely mathematical point of view, this amounts to replacing the original configuration space $\mathbb{R}^3$ by its Cartesian product with a discrete set consisting of $2j + 1$ points.
Let $\tilde P(E)$ be the projection valued measure on $\tilde{\mathcal{H}}$:
\[ (\tilde P(E)\tilde\psi)(x) = \chi_E(x)\tilde\psi(x), \qquad E \in \mathcal{B}(Q),\ \tilde\psi \in \tilde{\mathcal{H}}, \tag{2.43} \]
$\chi_E$ being the characteristic function of the set $E$ and $\mathcal{B}(Q)$ denoting the set of all Borel sets of $Q$. Now let $\mathcal{H}$ be another Hilbert space and $P$ a PV-measure on it (also defined over $\mathcal{B}(Q)$).

Definition 2.1. The pair $\{\mathcal{H}, P\}$ is called a $k$-homogeneous localized quantum system if and only if it is unitarily equivalent to $\{\tilde{\mathcal{H}}, \tilde P\}$, i.e., if and only if there exists a unitary map $W : \mathcal{H} \to \tilde{\mathcal{H}}$ such that
\[ W P(E) W^{-1} = \tilde P(E), \qquad E \in \mathcal{B}(Q). \tag{2.44} \]
Let $f \in C^\infty(Q)_{\mathbb{R}}$ (= space of infinitely differentiable, real-valued functions on $Q$).

Definition 2.2. Let $\{\mathcal{H}, P\}$ be a $k$-homogeneous localized quantum system. The self-adjoint operator,
\[ \hat q(f) = \int_Q f(x)\, dP_x, \tag{2.45} \]
defined on the domain,
\[ D(\hat q(f)) = \Bigl\{ \psi \in \mathcal{H} \;\Big|\; \int_Q |f(x)|^2\, d\langle \psi | P_x \psi \rangle < \infty \Bigr\}, \]
is called a generalized position operator.

Note that under the isometry (2.44), $\hat q(f)$ becomes the operator of multiplication by $f$ on $\tilde{\mathcal{H}}$. The following properties of these operators are easily verified:
(1) $\hat q(f)$ is a bounded operator if and only if $f$ is a bounded function.
(2) $\hat q(f) = 0$ if and only if $f = 0$.
(3) $\hat q(\alpha f) = \alpha\,\hat q(f)$, for $\alpha \in \mathbb{R}$.
(4) $\hat q(f + g) \supseteq \hat q(f) + \hat q(g)$ and $D(\hat q(f) + \hat q(g)) = D(\hat q(f)) \cap D(\hat q(g))$.
(5) $\hat q(f \cdot g) \supseteq \hat q(f)\,\hat q(g)$ and $D(\hat q(f)\,\hat q(g)) = D(\hat q(f \cdot g)) \cap D(\hat q(f))$.
We had mentioned earlier the notion of a shift on the manifold $Q$. This is a one-parameter group of diffeomorphisms: $\phi_s : Q \to Q$, $\phi_{s_2} \circ \phi_{s_1} = \phi_{s_1 + s_2}$, $s, s_1, s_2 \in \mathbb{R}$, $\phi_0$ being the identity map. Each such shift defines a complete vector field $X$ via
\[ X(f) := \frac{d}{ds}\, f \circ \phi_s \Big|_{s=0}, \tag{2.46} \]
$f$ being an arbitrary smooth function on the manifold, and conversely, every such vector field $X$ gives rise to a shift $\phi^X_s$, called the flow of the vector field:
\[ \pi(\phi^X_{-s}) = e^{sX(x)}, \tag{2.47} \]
where $\pi(\phi^X_{-s})$ is a linear operator on the space of smooth functions $f$ on the manifold:
\[ (\pi(\phi^X_{-s}) f)(x) = f(\phi^X_s(x)), \qquad x \in Q. \tag{2.48} \]
There is a natural action of the shifts on Borel sets $E \subset Q$,
\[ E \mapsto \phi^X_s(E) = \{\phi^X_s(x) \mid x \in E\}. \tag{2.49} \]
Since $\phi^X_s$ is smooth, the resulting set $\phi^X_s(E)$ is also a Borel set. We want to represent the shifts $\phi^X_s$ on $\tilde{\mathcal{H}}$ as one-parameter unitary groups on Hilbert spaces $\mathcal{H}$. Let $\mathcal{U}(\mathcal{H})$ denote the set of all unitary operators on $\mathcal{H}$ and, as before, $\mathfrak{X}_c(Q)$ the set of all complete vector fields on the manifold $Q$.
Definition 2.3. Let $\{\mathcal{H}, P\}$ be a quantum system localized on $Q$. A map
\[ V : \phi^X_s \mapsto V(\phi^X_s) \in \mathcal{U}(\mathcal{H}), \tag{2.50} \]
is called a shift of the localized quantum system if, for all $X \in \mathfrak{X}_c(Q)$, the map $s \mapsto V(\phi^X_s)$ gives a strongly continuous representation of the additive group of $\mathbb{R}$ and $\{V(\phi^X_s), P\}$ is a system of imprimitivity with respect to the group of real numbers $\mathbb{R}$ and the Borel $\mathbb{R}$-space $Q$ with group action $\phi^X_s$, i.e.,
\[ V(\phi^X_s)\, P(E)\, V(\phi^X_{-s}) = P(\phi^X_s(E)). \tag{2.51} \]
The triple $\{\mathcal{H}, P, V\}$ is called a localized quantum system with shifts. Two localized quantum systems with shifts, $\{\mathcal{H}_j, P_j, V_j\}$, $j = 1, 2$, are said to be unitarily equivalent if there exists a unitary map $W : \mathcal{H}_1 \to \mathcal{H}_2$, such that $W P_1(E) W^{-1} = P_2(E)$, $E \in \mathcal{B}(Q)$, and $W V_1(\phi^X_s) W^{-1} = V_2(\phi^X_s)$, $X \in \mathfrak{X}_c(Q)$, $s \in \mathbb{R}$. The map $\hat p : \mathfrak{X}_c(Q) \to \mathcal{S}(\mathcal{H})$ (the set of all self-adjoint operators on $\mathcal{H}$), where $\hat p(X)$ is defined via Stone's theorem as the infinitesimal generator of
\[ V(\phi^X_s) = \exp\Bigl( \frac{i}{\hbar}\, s\, \hat p(X) \Bigr), \tag{2.52} \]
is called the kinematical momentum of $\{\mathcal{H}, P, V\}$. The imprimitivity relation (2.51) has the following important consequences.

Lemma 2.4. Let $\{\mathcal{H}, P, V\}$ be a $k$-homogeneous localized quantum system with shifts. Then
\[ V(\phi^X_s)\, \hat q(f)\, V(\phi^X_{-s}) = \hat q(f \circ \phi^X_s). \tag{2.53} \]
A $k$-homogeneous quantum system with shifts $\{\mathcal{H}, P, V\}$ is unitarily equivalent to $\{\tilde{\mathcal{H}}, \tilde P, \tilde V\}$, with $\tilde{\mathcal{H}}$ and $\tilde P$ as in (2.43).

The representation $\tilde V$ acquires a very specific form. To understand it we need the concept of a cocycle. Let $G$ be a locally compact group, $H$ a standard Borel group,$^{e}$ $X$ a Borel $G$-space with group action $x \mapsto gx$ and $[\nu]$ a $G$-invariant measure class on $X$. (This means that if $\nu$ is any measure in the class, then so also is $\nu_g$, where $\nu_g(E) = \nu(gE)$, for all $E \in \mathcal{B}(Q)$.) A Borel measurable map $\xi : G \times X \to H$ is called a cocycle of $G$, relative to the measure class $[\nu]$ on $X$, with values in $H$, if
\[ \xi(e, x) = 1, \qquad \xi(g_1 g_2, x) = \xi(g_1, g_2 x)\, \xi(g_2, x), \tag{2.54} \]
for $[\nu]$-almost all $x \in X$ and almost all (with respect to the Haar measure) $g_1, g_2 \in G$ ($e$ is the identity element of $G$). Two cocycles $\xi_1$ and $\xi_2$ are said to be cohomologous or equivalent if there exists a Borel function $\zeta : X \to H$, such that
\[ \xi_2(g, x) = \zeta(gx)\, \xi_1(g, x)\, \zeta(x)^{-1} \]
for almost all $g \in G$ and $x \in X$. The equivalence classes $[\xi]$ are called cohomology classes of cocycles. The following classification theorem for localized quantum systems then holds.

[Footnote e] I.e., $H$ is a Borel space which can be embedded into a (standard) metric space with the left and right group actions being Borel maps.

Theorem 2.5. Any localized $k$-homogeneous quantum system $\{\mathcal{H}, P, V\}$ on $Q$, with shifts, is unitarily equivalent to a canonical representation $\{\tilde{\mathcal{H}}, \tilde P, \tilde V\}$, with $\tilde{\mathcal{H}} = \mathbb{C}^k \otimes L^2(Q, d\mu)$, for some smooth measure $\mu$ on $Q$,
\[ (\tilde P(E)\tilde\psi)(x) = \chi_E(x)\tilde\psi(x), \]
for all $\tilde\psi \in \tilde{\mathcal{H}}$ and all $E \in \mathcal{B}(Q)$, and
\[ (\tilde V(\phi^X_s)\tilde\psi)(x) = \xi^X(s, \phi^X_{-s}(x))\, \sqrt{\lambda(\phi^X_s, \phi^X_{-s}(x))}\; \tilde\psi(\phi^X_{-s}(x)), \tag{2.55} \]
for all $\tilde\psi \in \tilde{\mathcal{H}}$ and all $X \in \mathfrak{X}_c(Q)$, where $\xi^X$ is a cocycle of the Abelian group $\mathbb{R}$ (relative to the class of smooth measures on $Q$), having values in $\mathcal{U}(k)$ (the group of $k \times k$ unitary matrices) and $\lambda$ is the unique smooth Radon–Nikodym derivative,
\[ \lambda(\phi^X_s, x) = \frac{d\mu_{\phi^X_s}}{d\mu}(x). \]
Moreover, equivalence classes of $k$-homogeneous localized quantum systems are in one-to-one correspondence with equivalence classes of cocycle sets $[\{\xi^X\}_{X \in \mathfrak{X}_c(Q)}]$, where $\{\xi_1^X\}_{X \in \mathfrak{X}_c(Q)} \sim \{\xi_2^X\}_{X \in \mathfrak{X}_c(Q)}$ if there exists a Borel function $\zeta : Q \to \mathcal{U}(k)$, such that, for all $X \in \mathfrak{X}_c(Q)$, $s \in \mathbb{R}$ and $x \in Q$,
\[ \xi_2^X(s, x) = \zeta(\phi^X_s(x))\, \xi_1^X(s, x)\, \zeta(x)^{-1}. \]
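To illustrate the equivalence relation in Theorem 2.5: for $G = \mathbb{R}$ acting through the flow $\phi^X_s$, the cocycle identity (2.54) reads $\xi^X(s_1 + s_2, x) = \xi^X(s_1, \phi^X_{s_2}(x))\,\xi^X(s_2, x)$. Any Borel function $\zeta : Q \to \mathcal{U}(k)$ then yields a cocycle cohomologous to the trivial one $\xi_1^X \equiv 1$, as the following check (for a hypothetical $\zeta$) suggests:

```latex
\begin{align*}
\xi^X(s, x) &:= \zeta(\phi^X_s(x))\,\zeta(x)^{-1}, \\
\xi^X(s_1, \phi^X_{s_2}(x))\,\xi^X(s_2, x)
  &= \zeta(\phi^X_{s_1+s_2}(x))\,\zeta(\phi^X_{s_2}(x))^{-1}\,
     \zeta(\phi^X_{s_2}(x))\,\zeta(x)^{-1} \\
  &= \zeta(\phi^X_{s_1+s_2}(x))\,\zeta(x)^{-1}
   = \xi^X(s_1 + s_2, x), \qquad \xi^X(0, x) = 1 .
\end{align*}
```

The computation uses only the group property $\phi^X_{s_1} \circ \phi^X_{s_2} = \phi^X_{s_1 + s_2}$ of the flow.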
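Differentiating (2.55) with respect to $s$, using (2.52), and then setting $s = 0$, we obtain,
\[ \hat p(X)\tilde\psi = -i\hbar\,\mathcal{L}_X \tilde\psi - \frac{i\hbar}{2} \operatorname{div}_\nu(X)\,\tilde\psi + \alpha(X)\tilde\psi, \tag{2.56} \]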
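where $\mathcal{L}_X \tilde\psi$ is the Lie derivative of $\tilde\psi$ along $X$ and,
\[
\begin{aligned}
\frac{1}{2} \operatorname{div}_\nu(X)(x) &= \frac{d}{ds} \sqrt{\lambda(\phi^X_s, \phi^X_{-s}(x))}\, \Big|_{s=0}, \\
\alpha(X)(x) &= -i\, \frac{d}{ds}\, \xi^X(s, \phi^X_{-s}(x)) \Big|_{s=0}.
\end{aligned}
\tag{2.57}
\]
The first two terms in (2.56) are linear in $X$. It is now possible to show that the following commutation relations hold:
\[
\begin{aligned}
[\hat q(f), \hat q(g)] &= 0, \\
[\hat p(X), \hat q(f)] &= -i\hbar\,\hat q(\mathcal{L}_X f), \\
[\hat p(X), \hat p(Y)] &= -i\hbar\,\hat p([X, Y]) - i\hbar\,\Omega(X, Y),
\end{aligned}
\tag{2.58}
\]
for all $f, g \in C^\infty(Q)_{\mathbb{R}}$, $X, Y \in \mathfrak{X}_c(Q)$, and where,
\[ \Omega(X, Y) = -i[\alpha(X), \alpha(Y)] + \mathcal{L}_X \alpha(Y) - \mathcal{L}_Y \alpha(X) - \alpha([X, Y]). \tag{2.59} \]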
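In the scalar case $k = 1$ the commutator term in (2.59) vanishes. If, in addition, $\alpha$ is taken to be a smooth one-form with $\alpha(X)$ its contraction with $X$ (an assumption made here for illustration), the remaining terms are the usual invariant formula for the exterior derivative:

```latex
\[
\Omega(X, Y) = \mathcal{L}_X\,\alpha(Y) - \mathcal{L}_Y\,\alpha(X) - \alpha([X, Y])
            = (d\alpha)(X, Y), \qquad\text{hence}\qquad d\Omega = 0 .
\]
```

In this special case $\Omega$ is therefore automatically closed, which is the Abelian form of the Bianchi identity.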
The two-form $\Omega$ and the one-form $\alpha$ on $Q$ are related in the same way as the curvature two-form $\frac{1}{\hbar}\Omega$ of a $\mathbb{C}^1$-bundle and its connection one-form $\frac{1}{\hbar}\alpha(X)$. Indeed, one can show that if $D$ is the covariant derivative defined by the connection, then $D\Omega = 0$, which is the Bianchi identity.

Definition 2.6. Let $\{\mathcal{H}, P, V\}$ be a $k$-homogeneous localized quantum system with shifts on $Q$ and $\Omega$ a differential two-form on $Q$ with values in the set of all $k \times k$ Hermitian matrices. The kinematical momentum $\hat p$ is called $\Omega$-compatible if in a canonical representation $\{\tilde{\mathcal{H}}, \tilde P, \tilde V\}$, the associated kinematical momenta $\tilde p$ satisfy
\[ [\tilde p(X), \tilde p(Y)]\tilde\psi = -i\hbar\,\bigl( \tilde p([X, Y])\tilde\psi + \Omega(X, Y)\tilde\psi \bigr). \tag{2.60} \]
In this case, the quadruple $\{\mathcal{H}, \hat q, \hat p, \Omega\}$ is called an $\Omega$-compatible $k$-Borel kinematics.

In order to arrive at a classification theory of localized quantum systems, we first impose some additional smoothness conditions. An $\Omega$-compatible $k$-quantum Borel kinematics $\{\mathcal{H}, \hat q, \hat p, \Omega\}$ is said to be differentiable if it is equivalent to $\{\tilde{\mathcal{H}}, \tilde q, \tilde p, \tilde\Omega\}$, where
(1) $\tilde{\mathcal{H}} = L^2(E, \langle \cdot | \cdot \rangle, d\nu)$ for a $\mathbb{C}^k$-bundle $E$ over $Q$, with Hermitian metric $\langle \cdot | \cdot \rangle$ and a smooth measure $\nu$ on $Q$.
(2) $\tilde\Omega$ is a two-form with (self-adjoint) values in the endomorphism bundle $L_E = E \otimes E^*$.
(3) $(\tilde q(f)\sigma)(x) = f(x)\sigma(x)$, for all $f \in C^\infty(Q)_{\mathbb{R}}$ and smooth sections $\sigma \in \Gamma_0$ (= smooth sections of compact support).
(4) $\tilde p(X)\Gamma_0 \subset \Gamma_0$, for all $X \in \mathfrak{X}_c(Q)$.

We then have the following canonical representation of a differentiable quantum Borel kinematics:

Theorem 2.7. Let $\{\mathcal{H}, \hat q, \hat p, \Omega\}$ be a localized differentiable quantum Borel kinematics on $Q$ in canonical representation. Then there is a Hermitian connection $\nabla$ with
curvature $\frac{1}{\hbar}\Omega$ on $E$ and a covariantly constant self-adjoint section $\Phi$ of $L_E = E \otimes E^*$, the bundle of endomorphisms of $E$, such that for all $X \in \mathfrak{X}_c(Q)$ and all $\sigma \in \Gamma_0$,
\[ \hat p(X)\sigma = -i\hbar\,\nabla_X \sigma + \Bigl( -\frac{i\hbar}{2}\, I + \Phi \Bigr) \operatorname{div}_\nu(X)\,\sigma. \tag{2.61} \]
For an elementary quantum Borel kinematics, i.e., when the $\mathbb{C}^k$-bundle is a line bundle, one can give a complete classification of the possible equivalence classes of quantum Borel kinematics. Indeed, for Hermitian line bundles, one has the classification theorem:

Theorem 2.8. Let $Q$ be a connected differentiable manifold and $B$ a closed two-form on $Q$ (i.e., $dB = 0$). Then there exists a Hermitian complex line bundle $(E, \langle \cdot | \cdot \rangle, \nabla)$, with compatible connection and curvature $\frac{1}{\hbar}B$, if and only if $B$ satisfies the integrality condition
\[ \frac{1}{2\pi\hbar} \int_\Sigma B \in \mathbb{Z}, \tag{2.62} \]
for all closed two-surfaces $\Sigma$ in $Q$. Furthermore, the various equivalence classes of $(E, \langle \cdot | \cdot \rangle, \nabla)$ (for fixed curvature $\frac{1}{\hbar}B$) are parameterized by $H^1(Q, \mathcal{U}(1)) \cong \pi_1(Q)^*$, where $\pi_1(Q)^*$ denotes the group of characters of the first fundamental group of $Q$.

The classification of the associated elementary quantum Borel kinematics is then spelled out in the following theorem.

Theorem 2.9. The equivalence classes of elementary localized differentiable quantum Borel kinematics are in one-to-one correspondence with $I^2(Q) \times \pi_1(Q)^* \times \mathbb{R}$, where $I^2(Q)$ denotes the set of all closed real two-forms on $Q$, satisfying the integrality condition (2.62).

For $\mathbb{C}^k$-bundles only a weaker result, for $\Omega = 0$, is known:

Theorem 2.10. The equivalence classes of $(\Omega = 0)$-compatible differentiable and localized $k$-quantum Borel kinematics are in one-to-one correspondence with the equivalence classes $\{(D, A)\}$ of pairs of unitary representations $D \in \mathrm{Hom}(\pi_1(Q), \mathcal{U}(k))$ and self-adjoint complex $k \times k$ matrices $A \in \mathcal{S}(\mathbb{C}^k) \cap D'$, where $D'$ is the commutant of the representation $D$, i.e., $D' = \{M \in \mathcal{L}(\mathbb{C}^k) \mid [M, D(g)] = 0,\ \forall g \in \pi_1(Q)\}$. Here two pairs $(D_1, A_1)$ and $(D_2, A_2)$ are equivalent if there is a unitary matrix $U$ such that $D_2 = U D_1 U^{-1}$ and $A_2 = U A_1 U^{-1}$.
Instead of enlarging the space of quantizable observables to include the Hamiltonian, the Borel quantization method then proceeds in a different way to treat the time evolution of the quantized system, leading ultimately to a nonlinear Schrödinger equation; see Ali [3], Doebner and Nattermann [81], Angermann, Doebner and Tolar [17], Angermann [16], Tolar [256], Pasemann [210] and Mueller [189] for the details. For a comparison with geometric quantization (to be discussed in the next section) see Zhao [285].
3. Geometric Quantization

We pass on to a treatment of geometric quantization, which in addition to being a physical theory has also emerged as a branch of mathematics. The starting point here is a real symplectic manifold $\Gamma$ (the phase space) of dimension $2n$, with symplectic form $\omega$. For a function $f$ on $\Gamma$, the corresponding Hamiltonian vector field $X_f$ is given by $\omega(\cdot, X_f) = df$. The Poisson bracket of two functions is defined by
\[ \{f, g\} = -\omega(X_f, X_g). \tag{3.1} \]
Starting with such a manifold as the arena of classical mechanics, the goal of geometric quantization is to assign to each such manifold $(\Gamma, \omega)$ a separable Hilbert space $\mathcal{H}$ and a mapping $Q : f \mapsto Q_f$ from a subspace Obs (as large as possible) of real-valued functions on $\Gamma$, which is a Lie algebra under the Poisson bracket, into self-adjoint linear operators on $\mathcal{H}$ in such a way that

(Q1) $Q_1 = I$, where $1$ is the function constant one and $I$ the identity operator on $\mathcal{H}$;
(Q2) the mapping $f \mapsto Q_f$ is linear;
(Q3) $[Q_f, Q_g] = \frac{ih}{2\pi}\, Q_{\{f,g\}}$, $\forall f, g \in \mathrm{Obs}$;
(Q4) the procedure is functorial in the sense that for two symplectic manifolds $(\Gamma^{(1)}, \omega^{(1)})$, $(\Gamma^{(2)}, \omega^{(2)})$ and a diffeomorphism $\phi$ of $\Gamma^{(1)}$ onto $\Gamma^{(2)}$ which sends $\omega^{(1)}$ into $\omega^{(2)}$, the composition with $\phi$ should map $\mathrm{Obs}^{(2)}$ into $\mathrm{Obs}^{(1)}$ and there should be a unitary operator $U_\phi$ from $\mathcal{H}^{(1)}$ onto $\mathcal{H}^{(2)}$ such that
\[ Q^{(1)}_{f \circ \phi} = U_\phi^*\, Q^{(2)}_f\, U_\phi, \qquad \forall f \in \mathrm{Obs}^{(2)}; \tag{3.2} \]
(Q5) for $(\Gamma, \omega) = \mathbb{R}^{2n}$ with the standard symplectic form, we should recover the operators $Q_{q_j}$, $Q_{p_j}$ in (1.1), up to unitary equivalence.

Remark 5. The requirements (Q4) and (Q5) are, in some way, a substitute for the irreducibility condition (1.2), which may be difficult to interpret on a general symplectic manifold (i.e. in the absence of a global separation of coordinates into the $q$ and $p$ variables). Another, frequently used, possibility is to require that for some "distinguished" set of observables $f$ the corresponding quantum operators $Q_f$ should act irreducibly on $\mathcal{H}$; however, there seems to be no general recipe for how one should choose such "distinguished" sets. The requirement that there be no nontrivial subspace in $\mathcal{H}$ invariant for all $Q_f$, $f \in \mathrm{Obs}$, is not the correct substitute; see Tuynman [262] for a thorough discussion of this point. Also we gave up the von Neumann rule (q3), but it turns out that this is usually recovered to some extent, cf. [122].

Remark 6. Observe that if there is a group $G$ of symplectomorphisms acting on $(\Gamma, \omega)$, then the covariance axiom (Q4) implies (taking $\Gamma_1 = \Gamma_2 = \Gamma$) that the quantization map $f \mapsto Q_f$ is (essentially) $G$-invariant.
The solution to the above problem was first given by Kostant [167] and Souriau [248]. It is accomplished in two steps: prequantization and polarization. Prequantization starts with introducing a complex Hermitian line bundle $L$ over $\Gamma$ with a connection $\nabla$ whose curvature form satisfies $\operatorname{curv} \nabla = 2\pi\omega/h$. (For $(L, \nabla)$ to exist it is necessary that the cohomology class of $\omega/h$ in $H^2(\Gamma, \mathbb{R})$ be integral; this is known as the prequantization condition.) One then defines for each $f \in C^\infty(\Gamma)$ the differential operator
\[ Q_f = -\frac{ih}{2\pi}\, \nabla_{X_f} + f \tag{3.3} \]
where the last $f$ stands for the operator of multiplication by $f$. Plainly these operators satisfy (Q1), (Q2) and (Q4), and a short computation reveals that they also satisfy (Q3). Unfortunately, (Q5) is manifestly violated for the operators (3.3); in fact, for $\Gamma = \mathbb{R}^{2n}$ these operators act not on $L^2(\mathbb{R}^n)$ but on $L^2(\mathbb{R}^{2n})$, so we need somehow to throw away half of the variables. More precisely, one checks that for $\Gamma = \mathbb{R}^{2n}$ the operators (3.3) are given by
\[ Q_f = -\frac{ih}{2\pi} \sum_j \Bigl( \frac{\partial f}{\partial p_j} \frac{\partial}{\partial q_j} - \frac{\partial f}{\partial q_j} \frac{\partial}{\partial p_j} \Bigr) + f - \sum_j p_j \frac{\partial f}{\partial p_j}, \]
so restricting $Q_f$ to the space of functions depending only on $q$ and square-integrable over $q \in \mathbb{R}^n$ one recovers the desired operators (1.1).

For a general symplectic manifold $(\Gamma, \omega)$, making sense of "functions depending on and square-integrable over only half of the variables" is achieved by polarization. The latter amounts, roughly speaking, to choosing a subbundle $\mathcal{P}$ of complex dimension $n$ in the complexified tangent bundle $T\Gamma^{\mathbb{C}}$ in a certain way and then restricting to functions on $\Gamma$ which are constant along the directions in $\mathcal{P}$.$^{f}$ This settles the "dependence on half of the variables". As for the "square-integrability", the simplest solution is the use of half-densities, which however does not give the correct quantization for the harmonic oscillator; one therefore has to apply the metaplectic correction, which amounts to using not half-densities but half-forms and gives the right answer for the harmonic oscillator (but not in some other cases, cf. [261]). Finally, for functions $f$ which leave $\mathcal{P}$ invariant, i.e. $[X_f, \mathcal{P}] \subset \mathcal{P}$, the corresponding operator given (essentially) by (3.3) maps a function constant along $\mathcal{P}$ into another such function, and thus one arrives at the desired quantum operators.

Since geometric quantization is still probably the most widely discussed quantization method, we will now examine all the above ingredients in some more detail prior to embarking on the discussion of other approaches.

[Footnote f] If $\Gamma$ is a cotangent bundle, i.e. $\Gamma = T^*Q$ for some configuration space $Q$, one can polarize simply by restricting to functions depending on $q$ only; however, for general symplectic manifolds the global separation into position and momentum coordinates is usually impossible. A well-known example of a physical system whose phase space is not a cotangent bundle is the phase space of classical spin (discussed extensively in Souriau [248]), which can be identified with the Riemann sphere $S^2$.
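Returning to the explicit formula for $Q_f$ on $\Gamma = \mathbb{R}^{2n}$ given above, the recovery of the canonical operators can be checked by evaluating it on the coordinate functions; $\psi$ below denotes a hypothetical test function depending on $q$ only:

```latex
\begin{align*}
Q_{q_j} &= \frac{ih}{2\pi}\,\frac{\partial}{\partial p_j} + q_j, &
Q_{p_j} &= -\frac{ih}{2\pi}\,\frac{\partial}{\partial q_j} + p_j - p_j
         = -\frac{ih}{2\pi}\,\frac{\partial}{\partial q_j}, \\
Q_{q_j}\,\psi(q) &= q_j\,\psi(q), &
Q_{p_j}\,\psi(q) &= -\frac{ih}{2\pi}\,\frac{\partial \psi}{\partial q_j}(q).
\end{align*}
```

On functions of $q$ alone the $\partial/\partial p_j$ term drops out, so $Q_{q_j}$ acts by multiplication and $Q_{p_j}$ by differentiation, as in canonical quantization.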
3.1. Prequantization

The aim of prequantization is to construct a mapping $f \mapsto Q_f$ satisfying all the required axioms except (Q5). For simplicity, let us start with the case when Γ is a cotangent bundle: $\Gamma = T^*Q$. One can then define globally a real one-form θ (the symplectic potential) satisfying

$$d\theta = \omega. \tag{3.4}$$

Actually, if m ∈ Γ and $\xi \in T_m\Gamma$, then one sets $\theta(\xi) := m(\pi_*\xi)$, where π : Γ → Q denotes the cotangent bundle projection and $\pi_* : T\Gamma \to TQ$ is the derivative map of π. In terms of local coordinates $q_j$ on Q and $(p_j, q_j)$ on Γ, one has

$$\theta = \sum_{j=1}^{n} p_j\,dq_j, \qquad \omega = \sum_{j=1}^{n} dp_j \wedge dq_j. \tag{3.5}$$
The Hamiltonian field $X_f$ of a function f on Γ is in these coordinates given by

$$X_f = \sum_{j=1}^{n}\left(\frac{\partial f}{\partial p_j}\frac{\partial}{\partial q_j} - \frac{\partial f}{\partial q_j}\frac{\partial}{\partial p_j}\right), \tag{3.6}$$

and the Poisson bracket $\{f, g\} = -\omega(X_f, X_g) = X_f g$ of two functions f, g is again expressed by (1.5).

A simple computation shows that $[X_f, X_g] = -X_{\{f,g\}}$, thus $Q_f = -\frac{ih}{2\pi}X_f$ satisfies the conditions (Q2), (Q3) and (Q4). Unfortunately, (Q1) fails, since $X_1 = 0$. Let us try correcting this by taking

$$Q_f = -\frac{ih}{2\pi}X_f + f$$

(where the latter f is to be taken as the operator of multiplication by the function f). Then $Q_1 = I$, as desired, but

$$[Q_f, Q_g] = \frac{ih}{2\pi}\left(Q_{\{f,g\}} + \{f, g\}\right)$$

so now (Q3) is violated. Observe, however, that

$$X_f(\theta(X_g)) - X_g(\theta(X_f)) = -\theta(X_{\{f,g\}}) + \{f, g\}$$

by a straightforward computation using (3.6) and (3.5). Thus taking

$$Q_f = -\frac{ih}{2\pi}X_f - \theta(X_f) + f \tag{3.7}$$

it follows that all of (Q1)–(Q4) will be satisfied.

Having settled the case of the cotangent bundle, let us now turn to general symplectic manifolds (Γ, ω). By a theorem of Darboux, one can always cover Γ by local coordinate patches $(p_j, q_j)$ such that the second formula in (3.5) (and, hence, also (3.6)) holds; however, the corresponding symplectic potentials need not agree on the intersections of two coordinate patches. Let us therefore examine what is the influence of a different choice of potential on the operator (3.7). If $\omega = d\theta = d\theta'$, then $\theta' = \theta + du$ (locally) for some real function u; then $\theta'(X_f) - \theta(X_f) = X_f u = -e^{u}X_f e^{-u}$, whence

$$e^{\frac{2\pi}{ih}u}\,Q'_f\,\phi = Q_f\,e^{\frac{2\pi}{ih}u}\,\phi, \qquad \forall \phi \in C^\infty. \tag{3.8}$$
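As a quick check of (3.7) and (3.8), the simplest case $\Gamma = T^*\mathbb{R} = \mathbb{R}^2$ can be worked out by hand. The worked example below is an illustration added here, not part of the original argument; the second potential $\theta' = -q\,dp$ and the function $u = -pq$ are choices made only for this example.

```latex
% Case n = 1 with \theta = p\,dq. From (3.6): X_q = -\partial/\partial p, X_p = \partial/\partial q,
% hence \theta(X_q) = 0 and \theta(X_p) = p, and (3.7) gives
\[
Q_q = q + \frac{ih}{2\pi}\frac{\partial}{\partial p}, \qquad
Q_p = -\frac{ih}{2\pi}\frac{\partial}{\partial q}, \qquad
[Q_p, Q_q] = -\frac{ih}{2\pi}.
\]
% Second potential: \theta' = -q\,dp = \theta + du with u = -pq (so d\theta' = dp \wedge dq = \omega).
% Now \theta'(X_q) = q and \theta'(X_p) = 0, and (3.7) gives
\[
Q'_q = \frac{ih}{2\pi}\frac{\partial}{\partial p}, \qquad
Q'_p = p - \frac{ih}{2\pi}\frac{\partial}{\partial q}.
\]
% Check of (3.8) for f = q, using \partial u/\partial p = -q:
\[
Q_q\bigl(e^{\frac{2\pi}{ih}u}\phi\bigr)
 = e^{\frac{2\pi}{ih}u}\Bigl(\frac{\partial u}{\partial p} + q\Bigr)\phi
 + e^{\frac{2\pi}{ih}u}\,\frac{ih}{2\pi}\frac{\partial \phi}{\partial p}
 = e^{\frac{2\pi}{ih}u}\,Q'_q\,\phi .
\]
```

On functions of q alone, $Q_q$ and $Q_p$ reduce to multiplication by q and to $-\frac{ih}{2\pi}\,\partial/\partial q$, respectively.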
Recall now that, quite generally, a complex line bundle L over a manifold Γ is given by the following data:

(1) a covering (atlas) $\{U_\alpha\}_{\alpha\in I}$ of Γ by coordinate patches,
(2) a family of transition functions $\{g_{\alpha\beta}\}_{\alpha,\beta\in I}$, each $g_{\alpha\beta}$ being a nonvanishing $C^\infty$ function in $U_\alpha \cap U_\beta$, satisfying the cocycle condition

$$g_{\alpha\beta}\,g_{\beta\gamma} = g_{\alpha\gamma} \quad \text{in } U_\alpha \cap U_\beta \cap U_\gamma \tag{3.9}$$

(⇒ $g_{\alpha\alpha} = 1$, $g_{\beta\alpha} = 1/g_{\alpha\beta}$).

A section φ of L is a family of functions $\phi_\alpha : U_\alpha \to \mathbb{C}$ such that

$$\phi_\alpha = g_{\alpha\beta}\,\phi_\beta \quad \text{in } U_\alpha \cap U_\beta. \tag{3.10}$$

(Similarly, one defines vector bundles by demanding that $\phi_\alpha$ be mappings from $U_\alpha$ into a (fixed) vector space V, and $g_{\alpha\beta} \in GL(V)$ be linear isomorphisms of V; more generally, a (fiber) bundle with some object G as fiber is defined by taking $\phi_\alpha$ to be mappings from $U_\alpha$ into G, and $g_{\alpha\beta}$ to be isomorphisms of the object G.) For later use, we also recall that L is said to be Hermitian if, in addition, there is given a family $e_\alpha$ of positive $C^\infty$ functions on $U_\alpha$ such that

$$e_\alpha = |g_{\alpha\beta}|^{-2}\,e_\beta \quad \text{in } U_\alpha \cap U_\beta.$$

In that case, for two sections φ, ψ one can define unambiguously their “local” scalar product, a function on Γ, by

$$(\phi, \psi)_m = e_\alpha(m)\,\overline{\phi_\alpha(m)}\,\psi_\alpha(m), \quad \text{if } m \in U_\alpha.$$
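That the local scalar product does not depend on the chart can be checked numerically at a single point of an overlap $U_\alpha \cap U_\beta$. The following Python sketch is illustrative only: the function name `local_product` and all sample values are hypothetical, and placing the complex conjugate on the first factor is an assumption about the convention (the check works equally well with the conjugate on the second factor).

```python
def local_product(e, phi, psi):
    """Local scalar product e * conj(phi) * psi in one chart.

    The conjugate on the first factor is an assumed convention.
    """
    return e * phi.conjugate() * psi


# Hypothetical data at one point m of the overlap of charts alpha and beta.
g_ab = 0.5 + 2.0j                       # transition function g_{alpha beta}(m), nonzero
e_b = 3.0                               # Hermitian weight e_beta(m) > 0
phi_b, psi_b = 1.0 - 1.0j, 2.0 + 0.5j   # chart-beta components of two sections

# Pass to chart alpha: phi_alpha = g_ab * phi_beta and e_alpha = |g_ab|^-2 * e_beta.
e_a = e_b / abs(g_ab) ** 2
phi_a, psi_a = g_ab * phi_b, g_ab * psi_b

in_alpha = local_product(e_a, phi_a, psi_a)
in_beta = local_product(e_b, phi_b, psi_b)
print(abs(in_alpha - in_beta))  # agreement up to rounding error
```

The factor $|g_{\alpha\beta}|^{-2}$ in the transformation law of $e_\alpha$ is exactly what cancels the two factors of $g_{\alpha\beta}$ picked up by the sections.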
Further, a mapping $(\xi, \phi) \mapsto \nabla_\xi\phi$ from X(Γ) × Γ(L) into Γ(L), where Γ(L) denotes the space of all smooth (i.e. $C^\infty$) sections of L and X(Γ) the space of all smooth vector fields on Γ, is called a connection on L if it is linear in both ξ and φ,

$$\nabla_{f\xi}\,\phi = f\,\nabla_\xi\phi \tag{3.11}$$

and

$$\nabla_\xi(f\phi) = (\xi f)\,\phi + f\,\nabla_\xi\phi \tag{3.12}$$

for any $f \in C^\infty(\Gamma)$. The curvature of this connection is the 2-form on Γ defined by

$$\mathrm{curv}(\nabla)(\xi, \eta)\,\phi := i\left(\nabla_\xi\nabla_\eta - \nabla_\eta\nabla_\xi - \nabla_{[\xi,\eta]}\right)\phi, \qquad \forall \xi, \eta \in X(\Gamma),\ \phi \in \Gamma(L). \tag{3.13}$$

Finally, a connection on a Hermitian line bundle is said to be compatible (with the Hermitian structure) if

$$\xi(\phi, \psi) = (\nabla_{\bar\xi}\,\phi, \psi) + (\phi, \nabla_\xi\psi) \tag{3.14}$$

for φ, ψ ∈ Γ(L) and complex vector fields $\xi \in X(\Gamma)^{\mathbb{C}}$.
Returning to our symplectic manifold (Γ, ω), suppose now that we have an open cover $\{U_\alpha\}_{\alpha\in I}$ of Γ and collections $\{\theta_\alpha\}_{\alpha\in I}$ and $\{u_{\alpha\beta}\}_{\alpha,\beta\in I}$ such that $\theta_\alpha$ is a symplectic potential on $U_\alpha$ and $\theta_\alpha = \theta_\beta + du_{\alpha\beta}$ on $U_\alpha \cap U_\beta$. Comparing (3.8) and (3.10), we see that if we can take

$$g_{\alpha\beta} = \exp\left(-\frac{2\pi}{ih}\,u_{\alpha\beta}\right) \tag{3.15}$$

then the local operators $Q_f$ can be glued together into a well-defined global operator on the sections of the corresponding line bundle L. The functions defined by the last formula satisfy the consistency condition (3.9) if and only if $\exp\left(-\frac{2\pi}{ih}(u_{\alpha\beta} + u_{\beta\gamma} + u_{\gamma\alpha})\right) = 1$, that is, if and only if there exist integers $n_{\alpha\beta\gamma}$ such that

$$u_{\alpha\beta} + u_{\beta\gamma} + u_{\gamma\alpha} = n_{\alpha\beta\gamma}\,h$$

for all α, β, γ such that $U_\alpha \cap U_\beta \cap U_\gamma$ is nonempty. One can show that this condition is independent of the choice of the cover $\{U_\alpha\}$ etc. and is, in fact, a condition on ω: it means that the de Rham cohomology class defined by $h^{-1}\omega$ in $H^2(\Gamma, \mathbb{R})$ should be integral. This is known as the integrality condition (or prequantization condition), and we will assume it to be fulfilled throughout the rest of this section. The bundle L is called the prequantization bundle. Observe that since the transition functions (3.15) are unimodular$^g$ (because $u_{\alpha\beta}$ are real), we can equip the bundle L with a Hermitian structure simply by taking $e_\alpha = 1$ ∀α; that is, $(\phi, \psi)_m = \overline{\phi_\alpha(m)}\,\psi_\alpha(m)$.

We finish this subsection by exhibiting a compatible connection ∇ on L, in terms of which the operators $Q_f$ assume a particularly simple form. Namely, define, for ξ ∈ X(Γ), ψ ∈ Γ(L) and a local chart $U_\alpha$,

$$(\nabla_\xi\psi)_\alpha := \xi\psi_\alpha + \frac{2\pi}{ih}\,\theta_\alpha(\xi)\,\psi_\alpha. \tag{3.16}$$

One easily checks that this definition is consistent (i.e. that $\phi := \nabla_\xi\psi$ satisfies the relations (3.10)) and that ∇ satisfies (3.11), (3.12) and (3.14), i.e. defines a compatible connection. Now comparing (3.7) and (3.16) we see that the prequantum operators $Q_f$ can be rewritten simply as

$$Q_f = -\frac{ih}{2\pi}\,\nabla_{X_f} + f. \tag{3.17}$$

To summarize our progress, we have shown that on an arbitrary symplectic manifold (Γ, ω) such that $h^{-1}\omega$ satisfies the integrality condition, there exists a Hermitian line bundle L and operators $Q_f$ on Γ(L) (the space of smooth sections
$^g$ In general, if the transition functions $g_{\alpha\beta}$ of a (fiber) bundle all belong to a group G, G is said to be the structure group of the bundle. Thus the line bundle L above has structure group U(1), and, similarly, the frame bundles $F^kP$ to be constructed in the next subsection have structure groups $GL(k, \mathbb{R})$.
of L) such that the correspondence $f \mapsto Q_f$ satisfies the conditions (Q1)–(Q4). In more detail, there is a compatible connection ∇ on L, and the operators $Q_f$ are given by the formula (3.17).

Remark 7. It can be shown that the curvature of the connection (3.16) is given by

$$\mathrm{curv}(\nabla) = \frac{2\pi}{h}\,\omega.$$

The fact that, for a given symplectic manifold (Γ, ω), there exists a Hermitian line bundle L with a compatible connection ∇ satisfying curv(∇) = 2πω if and only if ω satisfies the integrality condition, is the content of a theorem of Weil [276] (see also [167]). Furthermore, the equivalence classes of such bundles (L, ∇, (·, ·)) are then parameterized by the elements of the first cohomology group $H^1(\Gamma, \mathbb{T})$ with coefficients in the circle group $\mathbb{T}$. This should be compared to the content of Theorem 2.9, which we stated in the context of Borel quantization.

Remark 8. In another guise, the integrality condition can be expressed by saying that the integral of ω over any closed orientable 2-dimensional surface in Γ should be an integer multiple of 2π. This is reminiscent of the Bohr–Sommerfeld quantization condition, familiar from the old quantum theory.

Remark 9. It is possible to give an alternative description of the whole construction above in the language of connection forms. Namely, let $L^\times$ denote the line bundle L with the zero section removed. The fundamental vector field on $L^\times$ corresponding to $c \in \mathbb{C}$ is defined by

$$(\eta_c f)(m, z) = \frac{d}{dt}\,f\!\left(e^{2\pi i c t}z\right)\Big|_{t=0}, \qquad \forall m \in \Gamma,\ z \in L^\times_m,$$

for any function f on $L^\times$. A connection form is a one-form α on $L^\times$ which is $\mathbb{C}^\times$-invariant and satisfies $\alpha(\eta_c) = c$ ∀$c \in \mathbb{C}$; in other words, it is locally given by $\alpha = \pi^*\Theta + i\,\frac{dz}{z}$, with Θ a one-form on Γ and z the coordinate in the fiber $L^\times_m \cong \mathbb{C}^\times$. A vector field ζ on $L^\times$ is called horizontal (with respect to α) if α(ζ) = 0. It can be shown that every vector field ξ on Γ has a unique horizontal lift $\tilde\xi$ on $L^\times$, defined by the requirements that

$$\pi_*\tilde\xi = \xi \quad \text{and} \quad \alpha(\tilde\xi) = 0 \quad (\text{i.e. } \tilde\xi \text{ is horizontal}).$$

One can then easily verify that the recipe

$$(\nabla_\xi\phi)_\beta := \tilde\xi\phi \quad \text{in a local chart } U_\beta,$$

or, equivalently, $\nabla_\xi\phi = 2\pi i\,\phi^*\alpha(\xi)\,\phi$, defines a connection on $L^\times$. Our connection (3.16) corresponds to the choice

$$\alpha_\beta = \frac{2\pi}{h}\,\theta_\beta + i\,\frac{dz}{z} \quad \text{in a local chart } U_\beta \times \mathbb{C}^\times.$$

See Sniatycki [246, Sec. 3.1] for the details.
Remark 10. Still another (equivalent) description may be based on the use of connection one-forms in a principal U(1)-bundle over Γ and the Reeb vector field therein; see [262] and the references therein.

We conclude by mentioning also an alternative characterization of the prequantum operators $Q_f$ when the Hamiltonian field $X_f$ of f is complete. In that case, the field $X_f$ generates a one-parameter group (a flow) $\rho_t = \exp(tX_f)$ of canonical transformations (symplectomorphisms) of (Γ, ω). This flow lifts uniquely to a flow, again denoted $\rho_t$, of linear connection-preserving transformations on Γ(L). The operator $Q_f$ is then given by

$$Q_f\phi = -\frac{ih}{2\pi}\,\frac{d}{dt}(\rho_t\phi)\Big|_{t=0}.$$

For the details we refer to Sniatycki [246, Sec. 3.3]. In particular, since the induced transformations $\rho_t$ on Γ(L) are unitary, it follows by the Stone theorem that $Q_f$ are (essentially) self-adjoint operators on the Hilbert space

$$H_{\mathrm{preq}} := \text{the completion of } \left\{\phi \in \Gamma(L) : \int_\Gamma (\phi, \phi)_m\,|\omega^n| < \infty\right\}$$

of all square-integrable sections of L. This is also akin to the construction of the operators $\hat p(X)$ in Borel quantization (see (2.56)).

3.2. Real polarizations and half-densities

We now discuss the second step of geometric quantization, namely, making sense of “functions depending on” and “square-integrable over” only half of the variables. The simplest way of doing this is via real polarizations and half-densities, which we now proceed to describe.

A (real) distribution$^h$ D on Γ is a map which assigns to each point m ∈ Γ a linear subspace $D_m$ of $T_m\Gamma$ such that

(i) $\dim D_m = k$ (a constant independent of m ∈ Γ);
(ii) ∀$m_0$ ∈ Γ ∃ a neighborhood U of $m_0$ and vector fields $X_1, \dots, X_k$ on U such that ∀m ∈ U, $D_m$ is spanned by $X_1|_m, \dots, X_k|_m$.

A distribution is called involutive if, for any two vector fields X, Y ∈ D (i.e. $X_m, Y_m \in D_m$ ∀m), the bracket [X, Y] belongs to D as well; and integrable if for each $m_0$ ∈ Γ there exists a submanifold N of Γ passing through $m_0$ and such that ∀m ∈ N : $D_m = T_mN$. A theorem of Frobenius asserts that for real distributions, the notions of integrability and involutiveness are equivalent. An integrable distribution is also called a foliation, and the maximal connected submanifolds N as above are called its leaves. A foliation is called reducible (or fibrating) if the set of all leaves, denoted Γ/D, can be given a structure of a manifold in such a way that the natural projection map π : Γ → Γ/D is a (smooth) submersion.

$^h$ This is not to be confused with the distributions (generalized functions) in the sense of L. Schwartz!

So far, all these definitions make sense for an arbitrary (smooth) manifold Γ. If Γ is symplectic, then we further define D to be isotropic if ω(X, Y) = 0 ∀X, Y ∈ D; and Lagrangian if it is maximal isotropic, i.e. $\dim D_m = n := \frac{1}{2}\dim\Gamma$ ∀m ∈ Γ. A Lagrangian foliation is called a real polarization on Γ. One can prove the following alternative characterization of real polarizations: a smooth distribution D on Γ is a real polarization if and only if for each $m_0$ ∈ Γ there exists a neighborhood U of $m_0$ and n independent functions $f_1, \dots, f_n$ on U (i.e. ∀m ∈ U : $df_1, \dots, df_n$ are independent in $T^*_m\Gamma$) such that:

(i) ∀m ∈ U, $D_m$ is spanned by $X_{f_1}|_m, \dots, X_{f_n}|_m$;
(ii) $\{f_i, f_j\} = 0$ on U, ∀i, j = 1, …, n.     (3.18)
(That is, D is locally spanned by commuting Hamiltonian vector fields.) Now we say that a section φ of our prequantization bundle L with connection ∇ (constructed in the preceding subsection) is covariantly constant along D if

$$\nabla_X\phi = 0, \qquad \forall X \in D.$$

In view of the compatibility relation (3.14), the “local” scalar product (φ, ψ) of two covariantly constant sections is then a function on Γ constant along D (i.e. X(φ, ψ) = 0 ∀X ∈ D), hence, defines a function on Γ/D. Let us now deal with the issue of “integrating” over Γ/D. The simplest solution would be to take the integral of $(\phi, \psi)_m$ with respect to some measure on Γ/D. That is, if µ is a (nonnegative regular Borel) measure on Γ/D, let H be the Hilbert space of all sections φ of L such that φ is covariantly constant along D and

$$\int_{\Gamma/D} (\phi, \phi)_m\,d\mu(x) < \infty$$

(where, for each x ∈ Γ/D, m is an arbitrary point in the fiber $\pi^{-1}(x)$ above x). For a real function f on Γ, the quantum operator could then be defined on H by

$$Q_f\phi = -\frac{ih}{2\pi}\,\nabla_{X_f}\phi + f\phi, \tag{3.19}$$

granted this takes φ ∈ H again into a section covariantly constant along D. In view of (3.12) and (3.13), the latter is readily seen to be the case if

$$[X_f, X] \in D, \qquad \forall X \in D. \tag{3.20}$$
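Condition (3.20) can be made concrete on Γ = R² with the vertical polarization D spanned by ∂/∂p, for which sections constant along D depend on q only. The following Python sketch is an illustration added here and not part of the original construction: it represents polynomials in (p, q) as dictionaries, builds $X_f$ from (3.6), and tests whether $[X_f, \partial/\partial p]$ has no ∂/∂q component. All function names are hypothetical.

```python
from collections import defaultdict

# A polynomial in (p, q) is a dict {(i, j): c}, meaning the sum of c * p**i * q**j.

def clean(poly):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {k: c for k, c in poly.items() if c != 0}

def add(f, g, sign=1):
    """Return f + sign * g."""
    out = defaultdict(int)
    for k, c in f.items():
        out[k] += c
    for k, c in g.items():
        out[k] += sign * c
    return clean(out)

def mul(f, g):
    out = defaultdict(int)
    for (i, j), c in f.items():
        for (k, l), d in g.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def d_p(f):
    return clean({(i - 1, j): i * c for (i, j), c in f.items() if i > 0})

def d_q(f):
    return clean({(i, j - 1): j * c for (i, j), c in f.items() if j > 0})

def hamiltonian_field(f):
    """X_f = (df/dp) d/dq - (df/dq) d/dp as a pair (q-component, p-component)."""
    return d_p(f), add({}, d_q(f), sign=-1)

def apply(X, f):
    """Apply the vector field X = a d/dq + b d/dp to the polynomial f."""
    a, b = X
    return add(mul(a, d_q(f)), mul(b, d_p(f)))

def bracket(X, Y):
    """Lie bracket [X, Y], computed componentwise as X(Y^i) - Y(X^i)."""
    return (add(apply(X, Y[0]), apply(Y, X[0]), sign=-1),
            add(apply(X, Y[1]), apply(Y, X[1]), sign=-1))

D_P = ({}, {(0, 0): 1})  # the vector field d/dp, which spans D

def is_quantizable(f):
    """Condition (3.20) for D = span{d/dp}: [X_f, d/dp] has no d/dq part."""
    return bracket(hamiltonian_field(f), D_P)[0] == {}

print(is_quantizable({(1, 1): 1}))  # f = p*q
print(is_quantizable({(2, 0): 1}))  # f = p**2
```

For this polarization the ∂/∂q component of the bracket is $-\partial^2 f/\partial p^2$, so the test passes exactly for functions that are at most linear in p, such as q² and pq, and fails for p².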
Hence, proclaiming the set of all functions satisfying (3.20) to be the space Obs of quantizable observables, we have arrived at the desired quantization recipe. Unfortunately, there seems to be no canonical choice for the measure µ on Γ/D in general. For this reason, it is better to incorporate the choice of measure directly into the bundle L: that is, to pass from the prequantum line bundle L of Sec. 3.1 to the tensor product of L with some “bundle of measures on Γ/D”. In order for this
product to make sense, we must (first of all define this “bundle of measures” over Γ/D, and second) turn the latter bundle into a bundle over Γ (instead of Γ/D). Let us now explain how all this is done.

Consider, quite generally, a manifold X of dimension n, and let $\pi : F^nX \to X$ be the bundle of n-frames$^i$ over X, i.e. the fiber $F^n_xX$ at x ∈ X consists of all ordered n-tuples of linearly independent vectors $(\xi_1, \dots, \xi_n)$ from $T_xX$. The group $GL(n, \mathbb{R})$ of real nonsingular n × n matrices acts on $F^nX$ in a natural way: if $\xi_{jk}$ are the coordinates of $\xi_j$ with respect to some local chart $U \times \mathbb{R}^n$ of $T_xX$, then $g \in GL(n, \mathbb{R})$ acts by

$$(\xi \cdot g)_{jk} = \sum_{l=1}^{n} \xi_{jl}\,g_{lk}.$$

Now recall that one possible definition of a complex n-form is that it is a mapping $\eta : F^nX \to \mathbb{C}$ assigning to a point x ∈ X and an n-frame $(\xi_1, \dots, \xi_n) \in F^n_xX$ a complex number $\eta_x(\xi_1, \dots, \xi_n)$ such that

$$\eta_x(\xi \cdot g) = \eta_x(\xi) \cdot \det g, \qquad \forall g \in GL(n, \mathbb{R}).$$

By analogy, we therefore define a density on X as a mapping ν from $F^nX$ into $\mathbb{C}$ satisfying

$$\nu_x(\xi \cdot g) = \nu_x(\xi) \cdot |\det g|, \qquad \forall g \in GL(n, \mathbb{R}),$$

and, more generally, an r-density, where r is any (fixed) real number, by

$$\nu_x(\xi \cdot g) = \nu_x(\xi) \cdot |\det g|^r, \qquad \forall g \in GL(n, \mathbb{R}). \tag{3.21}$$
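The difference between the transformation laws of an n-form and of an r-density can be seen in the smallest case n = 2. The following Python sketch is an illustration under stated assumptions: it uses the model r-density $\nu(\xi) = |\det\xi|^r$ on frames of $\mathbb{R}^2$ (a hypothetical choice made for the example) and checks (3.21) with r = 1/2 for one sample matrix g with negative determinant.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def act(frame, g):
    """Right action (xi . g)_{jk} = sum_l xi_{jl} g_{lk} on a 2-frame."""
    return [[sum(frame[j][l] * g[l][k] for l in range(2)) for k in range(2)]
            for j in range(2)]

def r_density(frame, r):
    """Model r-density on R^2: nu(xi) = |det xi|^r (an illustrative choice)."""
    return abs(det2(frame)) ** r

xi = [[1.0, 2.0], [3.0, 4.0]]   # a frame, det = -2
g = [[1.0, 2.0], [2.0, 1.0]]    # an element of GL(2, R), det = -3

# Half-density: transforms with |det g|^(1/2), blind to the sign of det g.
half_lhs = r_density(act(xi, g), 0.5)
half_rhs = r_density(xi, 0.5) * abs(det2(g)) ** 0.5

# 2-form (modelled by the determinant itself): transforms with det g, sign included.
form_lhs = det2(act(xi, g))
form_rhs = det2(xi) * det2(g)

print(abs(half_lhs - half_rhs), abs(form_lhs - form_rhs))
```

The only structural difference between the two laws is the absolute value, which is the point taken up again when half-densities are replaced by half-forms.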
Similarly, one defines, for a distribution D on a manifold, an r-D-density as a mapping from the bundle $F^nD$ (n = dim D) of n-frames of D (i.e. the fiber $F^n_mD$ consists of all ordered bases of $D_m$) into $\mathbb{C}$ which satisfies

$$\nu_x(\xi \cdot g) = \nu_x(\xi) \cdot |\det g|^r, \qquad \forall \xi \in F^nD,\ \forall g \in GL(n, \mathbb{R}). \tag{3.22}$$

Let us now apply this to the case of X = Γ/D with D a real polarization as above. Thus, a $\frac{1}{2}$-density on Γ/D is a function φ which assigns to any ordered n-tuple of independent tangent vectors $\xi_j \in T_x(\Gamma/D)$ a complex number $\phi_x(\xi_1, \dots, \xi_n)$ such that (3.21) holds with $r = \frac{1}{2}$. We now define a “lift” from $\frac{1}{2}$-densities on Γ/D to $-\frac{1}{2}$-D-densities on Γ as follows. Let m ∈ Γ and let $\xi_1, \dots, \xi_n$ be a frame of $T_{\pi(m)}(\Gamma/D)$, where π : Γ → Γ/D denotes the canonical projection. Then there exists a unique dual basis $c_1, \dots, c_n \in T^*_{\pi(m)}(\Gamma/D)$, defined by $c_j(\xi_k) = \delta_{jk}$. This basis is mapped by $\pi^*$ onto n independent vectors of $T^*_m\Gamma$, and we can therefore define tangent vectors $\tilde\xi_j \in T_m\Gamma$ by the recipe

$$\omega(\cdot, \tilde\xi_j) = \pi^*_m c_j.$$

$^i$ The bundle $F^kX$ of k-frames, where 1 ≤ k ≤ n, is defined similarly; in particular, $F^1X$ is just the tangent bundle without the zero section.
From the properties of the symplectic form ω one easily sees that $\pi_*\tilde\xi_j = 0$, that is, $\tilde\xi_1, \dots, \tilde\xi_n$ is, in fact, a basis of $D_m$, and the correspondence $(\xi) \mapsto (\tilde\xi)$ between the frames of $T_{\pi(m)}(\Gamma/D)$ and the frames of $D_m$ is bijective. For a half-density φ on Γ/D, we can therefore define a function $\tilde\phi$ on $F^nD$ by

$$\tilde\phi(\tilde\xi) := \phi(\xi).$$

An easy computation shows that

$$\tilde\phi(\tilde\xi \cdot g) = \tilde\phi\bigl(\widetilde{\xi \cdot g^{-1T}}\bigr) = \tilde\phi(\tilde\xi) \cdot |\det g^{-1T}|^{1/2},$$

where T stands for matrix transposition. Thus $\tilde\phi$ is a $-\frac{1}{2}$-D-density on Γ.

Let us denote by $B^D$ the complex fiber bundle of $-\frac{1}{2}$-D-densities on Γ. (That is: the fiber $B^D_m$ consists of all functions $\nu_m : F^n_mD \to \mathbb{C}$ satisfying (3.22), and the sections of $B^D$ are thus $-\frac{1}{2}$-D-densities on Γ.) The map $\phi \mapsto \tilde\phi$ above thus defines a lifting from $\Delta^{1/2}(\Gamma/D)$, the (similarly defined) line bundle of $\frac{1}{2}$-densities on Γ/D, into $B^D$. It turns out that the image of this lifting consists precisely of the sections of $B^D$ which are “covariantly constant” along D. Namely, for any ζ ∈ D one can define a mapping $\nabla_\zeta$ on $B^D$ as follows: if ν is a $-\frac{1}{2}$-D-density, then

$$(\nabla_\zeta\nu)_m(\eta') := \zeta(\nu(\eta))\big|_m \qquad \forall m \in \Gamma, \tag{3.23}$$

where η′ is an arbitrary frame in $D_m$ and $\eta = (\eta_1, \dots, \eta_n)$, where $\eta_j$ are n linearly independent locally Hamiltonian vector fields on Γ which span D in a neighborhood of m and such that $\eta|_m = \eta'$ (such vector fields exist because D is a polarization, cf. (3.18)). It is not difficult to verify that $\nabla_\zeta\nu$ is independent of the choice of η, and that ∇ satisfies the axioms (3.11) and (3.12), and is thus a well-defined partial connection on $B^D$. (The term “partial” refers to the fact that it is defined for ζ ∈ D only.) From (3.23) it also follows that ∇ is flat, i.e.

$$\nabla_\xi\nabla_\zeta - \nabla_\zeta\nabla_\xi = \nabla_{[\xi,\zeta]}, \qquad \forall \xi, \zeta \in D.$$

Now it can be proved that a $-\frac{1}{2}$-D-density ν on Γ is a lift of a $\frac{1}{2}$-density φ on Γ/D, i.e. $\nu = \tilde\phi$, if and only if

$$\nabla_\zeta\nu = 0, \qquad \forall \zeta \in D,$$

i.e. if and only if ν is covariantly constant along D.

Coming back to our quantization business, consider now the tensor product

$$QB := L \otimes B^D \tag{3.24}$$

(the quantum bundle) with the (partial) connection given by

$$\nabla_\zeta(s \otimes \nu) = \nabla_\zeta s \otimes \nu + s \otimes \nabla_\zeta\nu \qquad (\zeta \in D,\ s \in \Gamma(L),\ \nu \in \Gamma(B^D)). \tag{3.25}$$
Collecting all the ingredients above, it transpires that for any two sections φ = s ⊗ ν and ψ = r ⊗ µ of QB which are covariantly constant along D (i.e. $\nabla_\zeta\phi = \nabla_\zeta\psi = 0$, ∀ζ ∈ D), we can unambiguously define a density (φ, ψ) on Γ/D by the formula

$$(\phi, \psi)_{\pi(m)}(\pi_*\xi) := (s, r)_m\,\overline{\nu_m(\zeta)}\,\mu_m(\zeta)\,|\boldsymbol{\omega}(\zeta, \xi)|,$$

where (ζ, ξ) is an arbitrary basis of $T_m\Gamma$ such that (ζ) is a basis of $D_m$, and

$$\boldsymbol{\omega} = \frac{(-1)^{n(n-1)/2}}{n!}\,\omega^n \tag{3.26}$$

is the symplectic volume on Γ. Now introduce the Hilbert space

$$H = \text{the completion of } \left\{\psi \in \Gamma(QB) : \nabla_\zeta\psi = 0\ \ \forall\zeta \in D \ \text{ and } \int_{\Gamma/D} (\psi, \psi) < \infty\right\}$$

of all square-integrable sections of QB covariantly constant along D, with the obvious scalar product. Finally, for a vector field ζ on Γ, let $\rho_t = \exp(t\zeta)$ be again the associated flow of diffeomorphisms of Γ. The derived map $\rho_{t*}$ on the tangent vectors defines a flow $\tilde\rho_t$ on $F^n\Gamma$: $\tilde\rho_t(m, (\xi_j)) := (\rho_t m, (\rho_{t*}\xi_j))$. One can prove that if

$$[\zeta, D] \subset D \qquad (\text{i.e. } [\zeta, \eta] \in D\ \ \forall\eta \in D) \tag{3.27}$$

then $\tilde\rho_t$ maps the subbundle $F^nD \subset F^n\Gamma$ into itself, and we can therefore define a lift $\tilde\zeta$ of ζ to $F^nD$ by the recipe

$$\tilde\zeta(m, (\xi)) := \frac{d}{dt}\,\tilde\rho_t(m, (\xi))\Big|_{t=0}.$$

Now if ν is a $-\frac{1}{2}$-D-density then it is a function on $F^nD$, hence we can apply $\tilde\zeta$ to it, and the result $L_\zeta\nu := \tilde\zeta\nu$ will again be a $-\frac{1}{2}$-D-density. Further, $L_\zeta\nu$ is linear in ν;

$$L_\zeta(g\nu) = g\,L_\zeta\nu + (\zeta g)\,\nu; \tag{3.28}$$

if η is another vector field for which [η, D] ⊂ D, then

$$L_\zeta L_\eta - L_\eta L_\zeta = L_{[\zeta,\eta]}; \tag{3.29}$$

and if ζ is a locally Hamiltonian vector field in D, then $L_\zeta\nu = \nabla_\zeta\nu$ coincides with the partial connection $\nabla_\zeta$ constructed above.

Now we are ready to define (at last!) the quantum operators. Namely, if f : Γ → R is a smooth function whose Hamiltonian vector field $X_f$ satisfies (3.27), i.e.

$$[X_f, D] \subset D, \tag{3.30}$$

then the quantum operator $Q_f$ is defined on sections of QB as follows:

$$Q_f(s \otimes \nu) := \left(-\frac{ih}{2\pi}\,\nabla_{X_f}s + fs\right) \otimes \nu + s \otimes \left(-\frac{ih}{2\pi}\,L_{X_f}\nu\right). \tag{3.31}$$

From the properties of L and ∇ it transpires that if s ⊗ ν is covariantly constant along D then so is $Q_f(s \otimes \nu)$, and so $Q_f$ gives rise to a well-defined operator (denoted again by $Q_f$) on the Hilbert space H introduced above; it can be shown that if $X_f$ is complete then $Q_f$ is (essentially) self-adjoint.
The space of all real functions $f \in C^\infty(\Gamma)$ satisfying (3.30) is, by definition, the space Obs of quantizable observables. Unfortunately, it turns out that, no matter how elegant, the quantization procedure described in this section sometimes gives incorrect answers: namely, for the one-dimensional harmonic oscillator (corresponding to the observable $f = \frac{1}{2}(p^2 + q^2)$ on the phase space $\Gamma = \mathbb{R}^2$ with the usual symplectic form ω = dp ∧ dq), one has first of all to modify the whole procedure further by allowing “distribution valued” sections$^j$ of QB (see Sec. 3.6.1 below), and even then the energy levels come out as nh/2π, n = 1, 2, …, instead of the correct answer $(n - \frac{1}{2})h/2\pi$. It turns out that the reason for this failure is the use of half-densities above instead of the so-called half-forms; in order to describe how the situation can still be saved, we need to introduce complex tangent spaces and complex polarizations. We therefore proceed to describe this extended setup in the next subsection, and then describe the necessary modifications in Sec. 3.4.$^k$
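The discrepancy quoted above amounts to a uniform shift of the spectrum by half a quantum. The following short Python sketch, added as an illustration, tabulates both sequences in units of h/2π using exact fractions; the function names are hypothetical and the numbers are simply the two formulas from the text.

```python
from fractions import Fraction

def half_density_levels(count):
    """Levels n * h/(2*pi), n = 1, 2, ..., expressed in units of h/(2*pi)."""
    return [Fraction(n) for n in range(1, count + 1)]

def correct_levels(count):
    """Levels (n - 1/2) * h/(2*pi), n = 1, 2, ..., in units of h/(2*pi)."""
    return [Fraction(2 * n - 1, 2) for n in range(1, count + 1)]

# Level-by-level difference between the half-density answer and the correct one.
shifts = [a - b for a, b in zip(half_density_levels(4), correct_levels(4))]
print(shifts)
```

Every entry of `shifts` equals one half, i.e. the half-density construction misses the term $\frac{1}{2}\cdot h/2\pi$ in each level.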
3.3. Complex polarizations

From now on, we start using complex objects such as the complexified tangent bundle $T\Gamma^{\mathbb{C}}$, complex vector fields $\xi \in X(\Gamma)^{\mathbb{C}}$, etc., and the bar ¯ will denote complex conjugation. A complex polarization P on the manifold Γ is a complex distribution on Γ such that

(i) P is involutive (i.e. X, Y ∈ P ⇒ [X, Y] ∈ P);
(ii) P is Lagrangian (i.e. $\dim_{\mathbb{C}} P = n \equiv \frac{1}{2}\dim_{\mathbb{R}}\Gamma$ and ω(X, Y) = 0 ∀X, Y ∈ P);
(iii) $\dim_{\mathbb{C}} P_m \cap \bar P_m =: k$ is constant on Γ (i.e. independent of m);
(iv) $P + \bar P$ is involutive.

Again, one can prove an alternative characterization of complex polarizations along the lines of (3.18): namely, a complex distribution P on Γ is a complex polarization if and only if ∀$m_0$ ∈ Γ there is a neighborhood U of $m_0$ and n independent complex $C^\infty$ functions $z_1, \dots, z_n$ on U such that

(i) ∀m ∈ U, $P_m$ is spanned (over $\mathbb{C}$) by the Hamiltonian vector fields $X_{z_1}|_m, \dots, X_{z_n}|_m$;
(ii) $\{z_j, z_k\} = 0$ on U ∀j, k = 1, …, n;
(iii) $\dim_{\mathbb{C}} P_m \cap \bar P_m =: k$ is constant on Γ (i.e. independent of m and U);
(iv) the functions $z_1, \dots, z_k$ are real and ∀m ∈ U, $P_m \cap \bar P_m$ is spanned by $X_{z_1}|_m, \dots, X_{z_k}|_m$.     (3.32)

$^j$ This time the distributions are those of L. Schwartz (not subbundles of TΓ).

$^k$ Another reason for allowing complex polarizations is that there are symplectic manifolds on which no real polarizations exist, for instance, the sphere $S^2$.
To each complex polarization there are associated two real involutive (and, hence, integrable) distributions D, E on Γ by

$$D = P \cap \bar P \cap T\Gamma \qquad (\text{so } D^{\mathbb{C}} = P \cap \bar P,\ \dim_{\mathbb{R}} D = k),$$
$$E = (P + \bar P) \cap T\Gamma \qquad (\text{so } E^{\mathbb{C}} = P + \bar P,\ \dim_{\mathbb{R}} E = 2n - k).$$

One has $E = D^\perp$, $D = E^\perp$ (the orthogonal complements with respect to ω), so that, in particular, $X_f \in E$ ⇔ f is constant along D (i.e. ξf = 0, ∀ξ ∈ D), and similarly $X_f \in D$ ⇔ f is constant along E.

A complex polarization is called admissible if the space of leaves Γ/D admits a structure of a manifold such that π : Γ → Γ/D is a submersion. In that case, $\tilde E := \pi_*E$ defines a real integrable distribution of dimension 2(n − k) on Γ/D, and using the Newlander–Nirenberg theorem one can show that the mapping $J : T_xL \to T_xL$ defined on each leaf L of $\tilde E$ in Γ/D by $J(\pi_*\,\mathrm{Re}\,w) = \pi_*\,\mathrm{Im}\,w$ is an integrable complex structure on L and if $X_{z_1}, \dots, X_{z_k}$ are local Hamiltonian vector fields as in (3.32) then the functions $z_{k+1}, \dots, z_n$ form, when restricted to L, a local system of complex coordinates which makes L a complex manifold. In particular, if z is a complex function on an open set U ⊂ Γ, then $X_z \in P$ if and only if locally $z = \tilde z \circ \pi$ where $\tilde z : \pi(U) \subset \Gamma/D \to \mathbb{C}$ is holomorphic when restricted to any leaf of $\tilde E$. Throughout the rest of this section, unless explicitly stated otherwise, we will consider only admissible complex polarizations.

Let us now proceed to define the quantum Hilbert space H and the quantum operators $Q_f$ in this new setting. For real polarizations D, we did this by identifying functions on Γ/D with sections on Γ covariantly constant along D, and then solving the problem of integration by lifting the half-densities on Γ/D to $-\frac{1}{2}$-D-densities on Γ. For complex polarizations, the “quotient” Γ/P does not make sense; and if we use Γ/D instead, then, since dim Γ/D can be smaller than n in general, the passage from half-densities on Γ/D to “$-\frac{1}{2}$-D-densities” on Γ breaks down.
What we do then is to trust our good luck and just carry out the final quantization procedure as described for real polarizations, and see if it works; and it does!

Let us start by defining $F^nP^{\mathbb{C}}$ to be the bundle of all complex frames of P.$^l$ There is a natural action of $GL(n, \mathbb{C})$, written as (η) ↦ (η)·g, on the fibers of $F^nP^{\mathbb{C}}$, and we define a $-\frac{1}{2}$-P-density ν on Γ as a complex function on $F^nP^{\mathbb{C}}$ such that

$$\nu_m((\eta) \cdot g) = \nu_m((\eta)) \cdot |\det g|^{-1/2}, \qquad \forall(\eta) \in F^nP^{\mathbb{C}},\ \forall g \in GL(n, \mathbb{C}), \tag{3.33}$$

and denote the (complex line) bundle of all $-\frac{1}{2}$-P-densities on Γ by $B^P$.

$^l$ The superscript $\mathbb{C}$ is just to remind us that this is a complex object; there is no such thing as $F^nP^{\mathbb{R}}$!

Next we define $\nabla_\zeta\nu$, for ζ ∈ P, by

$$(\nabla_\zeta\nu)_m((\eta)|_m) = \frac{\zeta\bigl[\nu((\eta)) \cdot |\omega_{,k}(\eta_{k+1}, \dots, \eta_n, \bar\eta_{k+1}, \dots, \bar\eta_n)|^{1/4}\bigr]}{|\omega_{,k}(\eta_{k+1}, \dots, \eta_n, \bar\eta_{k+1}, \dots, \bar\eta_n)|^{1/4}}\Bigg|_m \tag{3.34}$$

where $(\eta_1, \dots, \eta_n)$ are any vector fields which span P in a neighborhood of m such that $\eta_1, \dots, \eta_k$ are real Hamiltonian vector fields spanning D, and $\omega_{,k}$ is the 2(n − k)-form defined by

$$\omega_{,k} = \frac{(-1)^{(n-k)(n-k-1)/2}}{(n-k)!}\,\omega^{n-k} \tag{3.35}$$

(so that, in particular, $\omega_{,0} = \boldsymbol{\omega}$ is the volume form (3.26)). It again turns out that $\nabla_\zeta\nu$ is a $-\frac{1}{2}$-P-density if ν is,$^m$ and defines thus a flat partial connection on $B^P$. The formula (3.25) then defines a partial connection on the quantum bundle $QB := L \otimes B^P$ (L being, as before, the prequantum bundle from Sec. 3.1).

$^m$ The factor $|\omega_{,k}|^{1/4}$ in (3.34) needs some explanation. The reason for it is that if we defined $\nabla_\zeta\nu$ simply by the same formula (3.23) as for the real polarizations, then $\nabla_\zeta\nu$ might fail to be a $-\frac{1}{2}$-P-density: it would have satisfied the relation (3.33) only if there were no absolute value around det g there. (That is, if $(\hat\eta) = (\eta) \cdot g$ is another frame satisfying the conditions imposed on η, then we have ζ(det g) = 0, which need not imply ζ|det g| = 0.) This difficulty does not arise for real polarizations (since then det g is locally of constant sign), nor for the half-forms discussed in the next subsection (where there is no absolute value around the determinant). On the other hand, (3.34) has the advantage that it defines $\nabla_\zeta$ consistently not only for ζ ∈ P, but even for $\zeta \in E^{\mathbb{C}} = P + \bar P$; however, we will not need this refinement in the sequel. It should be noted that the correction factor $|\omega_{,k}|^{1/4}$ is such that the combination $\nu(\eta) \cdot |\omega_{,k}(\eta_{k+1}, \dots, \bar\eta_n)|^{1/4}$ depends only on the vectors $\eta_1, \dots, \eta_k$ spanning D, and defines thus a $-\frac{1}{2}$-D-density on Γ.

Now if φ = s ⊗ ν, ψ = r ⊗ µ are two arbitrary (smooth) sections of QB, then we set

$$(\phi, \psi)_m(\pi_*(\zeta_{k+1}, \dots, \zeta_n, \xi_1, \dots, \xi_n)) := (s, r)_m\,\overline{\nu_m(\zeta_1, \dots, \zeta_n)}\,\mu_m(\zeta_1, \dots, \zeta_n) \cdot |\omega_{,k}(\zeta_{k+1}, \dots, \zeta_n, \bar\zeta_{k+1}, \dots, \bar\zeta_n)|^{1/2} \cdot |\boldsymbol{\omega}(\zeta_1, \dots, \zeta_n, \xi_1, \dots, \xi_n)| \tag{3.36}$$

where $\zeta_1, \dots, \zeta_n, \xi_1, \dots, \xi_n$ is any basis of $T_m\Gamma^{\mathbb{C}}$ such that $\zeta_1, \dots, \zeta_k$ is a basis of $D^{\mathbb{C}}_m = P_m \cap \bar P_m$ and $\zeta_1, \dots, \zeta_n$ is a basis of $P_m$, and $\omega_{,k}$ and $\boldsymbol{\omega}$ are the forms given by (3.35) and (3.26), respectively. This time not every basis of $T_{\pi(m)}(\Gamma/D)^{\mathbb{C}}$ arises as $\pi_*(\zeta_{k+1}, \dots, \zeta_n, \xi_1, \dots, \xi_n)$ with ζ, ξ as above, but it is easily seen that the values of $(\phi, \psi)_m$ on different frames are related in the correct way and thus $(\phi, \psi)_m$ extends to define consistently a unique density on $F^{2n-k}_{\pi(m)}(\Gamma/D)^{\mathbb{C}}$ (the fiber at π(m) of the bundle of all complex (2n − k)-frames on Γ/D). From the proof of the Frobenius theorem one can show that for any local Hamiltonian vector fields $X_{z_1}, \dots, X_{z_n}$ as in (3.32) there exist vector fields $Y_1, \dots, Y_k$ (possibly on a subneighborhood of U) such that $\pi_*(X_{z_{k+1}}, \dots, X_{z_n}, X_{\bar z_{k+1}}, \dots, X_{\bar z_n}, Y_1, \dots, Y_k)$ is a basis of $T_{\pi(m)}(\Gamma/D)^{\mathbb{C}}$ which depends only on π(m), and $\boldsymbol{\omega}(X_{z_1}, \dots, X_{z_n}, X_{\bar z_{k+1}}, \dots, X_{\bar z_n}, Y_1, \dots, Y_k)$ is a function constant on the leaves of D. Taking these vector fields for the $\zeta_j$ and $\xi_j$ in (3.36), it can be proved in the same way as for the real polarizations that

$$\eta(\phi, \psi)_m(\pi_*(X_z, X_{\bar z}, Y)) = (\nabla_{\bar\eta}\phi, \psi)_m(\pi_*(X_z, X_{\bar z}, Y)) + (\phi, \nabla_\eta\psi)_m(\pi_*(X_z, X_{\bar z}, Y))$$

for any $\eta \in D_m$. Thus, in particular, if φ, ψ are covariantly constant along D, then $(\phi, \psi)_m$ depends only on π(m) and defines thus a density on Γ/D. We can therefore define, as before, the Hilbert space

$$H = \text{the completion of } \left\{\psi \in \Gamma(QB) : \nabla_\zeta\psi = 0\ \ \forall\zeta \in P \ \text{ and } \int_{\Gamma/D} (\psi, \psi) < \infty\right\} \tag{3.37}$$
of square-integrable sections of QB covariantly constant along P (with the obvious inner product). Finally, if ζ is a real vector field on Γ satisfying [ζ, P] ⊂ P, with the associated flow ρt , and ν a − 12 -P-density on Γ, then we may again define Lζ ν by d ρt (η))t=0 , (η ∈ F n P C ) (3.38) (Lζ ν)m (η) = νρt m (˜ dt and show that Lζ ν is again a − 21 -P-density and that Lζ has all the properties of a “flat partial Lie derivative” ((3.28) and (3.29)) and that LXf = ∇Xf whenever f is a real function for which Xf ∈ P (hence Xf ∈ D). Now the operator ih ih Qf (s ⊗ ν) := − ∇Xf s + f s ⊗ ν + s ⊗ − LXf ν , (3.39) 2π 2π defined for any real function f such that [Xf , P] ⊂ P,
(3.40)
maps sections covariantly constant along P again into such sections, and thus defines an operator on H, which can be shown to be self-adjoint if Xf is complete. Having extended the method of Sec. 3.2 to complex polarizations, we now describe the modification needed to obtain the correct energy levels for the harmonic oscillator: the metalinear correction. 3.4. Half-forms and the metalinear correction What this correction amounts to is throwing away the absolute value in the formula (3.33); that is, to pass from half-densities to half-forms. To do that we obviously need to have the square root of the determinant in (3.33) defined in a consistent manner; this is achieved by passing from GL(n, C) to the metalinear group ML(n, C), and from the frame bundle F n P C to the bundle Fˆ n P C of metalinear P-frames. The group ML(n, C) consists, by definition, of all pairs (g, z) ∈ GL(n, C) × C× satisfying z 2 = det g
June 23, 2005 10:9 WSPC/148-RMP
430
J070-00237
S. T. Ali & M. Engliˇ c
with the group law (g1 , z1 ) · (g2 , z2 ) := (g1 g2 , z1 z2 ). We will denote by p and λ the canonical projections p : ML(n, C) → GL(n, C) : : λ : ML(n, C) → C×
(g, z) → g, (g, z) → z,
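respectively. To define the bundle $\hat F^n P^{\mathbb C}$, suppose that $\{U_\alpha\}$ is a trivializing cover of $F^n P^{\mathbb C}$ (i.e. $U_\alpha$ are local patches on $\Gamma$ such that the restrictions $F^n P^{\mathbb C}|_{U_\alpha}$ are isomorphic to Cartesian products $U_\alpha\times GL(n,\mathbb C)$) with the corresponding transition functions $g_{\alpha\beta} : U_\alpha\cap U_\beta\to GL(n,\mathbb C)$. Suppose furthermore that there exist (continuous) lifts $\tilde g_{\alpha\beta} : U_\alpha\cap U_\beta\to ML(n,\mathbb C)$ such that $p\,\tilde g_{\alpha\beta} = g_{\alpha\beta}$ and that the cocycle conditions $\tilde g_{\alpha\beta}\tilde g_{\beta\gamma} = \tilde g_{\alpha\gamma}$ are satisfied. Then the cover $\{U_\alpha, \tilde g_{\alpha\beta}\}$ defines the desired bundle $\hat F^n P^{\mathbb C}$. It turns out that such lifts $\tilde g_{\alpha\beta}$ exist (possibly after refining the cover $\{U_\alpha\}$) if and only if the cohomology class determined by the bundle $F^n P^{\mathbb C}$ in $H^2(\Gamma,\mathbb Z_2)$ vanishes; from now on, we will assume that this condition is satisfied. The mapping $\tilde p : \hat F^n P^{\mathbb C}\to F^n P^{\mathbb C}$, obtained upon applying $p$ in each fiber, then yields a 2-to-1 covering of $F^n P^{\mathbb C}$ by $\hat F^n P^{\mathbb C}$. A $-\frac12$-$P$-form on $\Gamma$ is, by definition, a function $\tilde\nu : \hat F^n P^{\mathbb C}\to\mathbb C$ satisfying
\[ \tilde\nu_m(\tilde\xi\cdot\tilde g) = \tilde\nu_m(\tilde\xi)\cdot\lambda(\tilde g)^{-1}, \qquad \forall\,\tilde\xi\in\hat F^n P^{\mathbb C},\quad \forall\,\tilde g\in ML(n,\mathbb C). \]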
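The group law of $ML(n,\mathbb C)$ and the two-to-one projection $p$ can be checked on a small numerical example. The following Python sketch takes $n = 2$ and stores an element of $ML(2,\mathbb C)$ as a pair `(g, z)`; the helper names (`lift`, `ml_mul`, `in_ml`) and the sample matrices are illustrative choices, not notation from the text.

```python
import cmath

def det2(g):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    return a * d - b * c

def matmul2(g, h):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(g[i][k] * h[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def lift(g, sign=1):
    """One of the two elements (g, z) lying above g, with z**2 == det g."""
    d = det2(g)
    if d == 0:
        raise ValueError("g must be invertible")
    return (g, sign * cmath.sqrt(d))

def ml_mul(a, b):
    """Group law (g1, z1) . (g2, z2) := (g1 g2, z1 z2)."""
    (g1, z1), (g2, z2) = a, b
    return (matmul2(g1, g2), z1 * z2)

def p(a):
    """Canonical projection (g, z) -> g."""
    return a[0]

def lam(a):
    """Canonical projection (g, z) -> z."""
    return a[1]

def in_ml(a, tol=1e-9):
    """Check the defining relation z**2 == det g up to rounding."""
    g, z = a
    return abs(z * z - det2(g)) < tol

# Arbitrary sample elements of GL(2, C).
g1 = ((1 + 2j, 0.5), (-1j, 2.0))
g2 = ((0.0, 1.0), (-1.0, 3j))
a, b = lift(g1, +1), lift(g2, -1)
c = ml_mul(a, b)

closed = in_ml(c)                       # the product again satisfies z**2 == det g
covers_product = p(c) == matmul2(g1, g2)
two_sheets = (p(lift(g1, +1)) == p(lift(g1, -1))
              and lam(lift(g1, +1)) == -lam(lift(g1, -1)))
```

The last flag records that exactly two elements, $(g, z)$ and $(g, -z)$, lie above each $g$, which is the fiberwise content of the 2-to-1 covering $\tilde p$.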
The complex line bundle of all $-\frac12$-$P$-forms will be denoted by $\tilde B^P$. Next we define the (partial) connection $\nabla$ on $\tilde B^P$. Let $\eta_1,\dots,\eta_n$ be local Hamiltonian vector fields spanning $P$ in a neighborhood of a point $m_0\in\Gamma$ (cf. (3.32)). Since $\tilde p$ is a local homeomorphism, there exists a local lifting $(\tilde\eta_1,\dots,\tilde\eta_n)\in\hat F^n P^{\mathbb C}$ (possibly defined on a smaller neighborhood of $m_0$) such that $\tilde p(\tilde\eta_j) = \eta_j$. We can also arrange that $(\tilde\eta_1,\dots,\tilde\eta_n)|_{m_0}$ coincides with any given metaframe $\tilde f_0\in\hat F^n_{m_0}P^{\mathbb C}$. For $\zeta\in P$, we then define
\[ (\nabla_\zeta\tilde\nu)_{m_0}(\tilde f_0) := \zeta\,\tilde\nu(\tilde\eta_1,\dots,\tilde\eta_n)\big|_{m_0}. \]
One checks as usual that this definition is consistent (i.e. independent of the choice of the Hamiltonian metaframe $\tilde\eta$ satisfying $\tilde\eta|_{m_0} = \tilde f_0$) and defines again a $-\frac12$-$P$-form on $\Gamma$; further, the resulting map $\nabla$ is again a flat partial connection on $\tilde B^P$. Denoting by $QB$ the tensor product (quantum bundle)
\[ QB := L\otimes\tilde B^P \]
(with $L$ the prequantization bundle from Sec. 3.1), we then have the corresponding partial connection (3.25) in $QB$.
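For arbitrary two sections $\phi = s\otimes\tilde\nu$ and $\psi = r\otimes\tilde\mu$ of $QB$, $m\in\Gamma$ and $\tilde f\in\hat F^n_m P^{\mathbb C}$ a metaframe at $m$, denote $(\zeta_1,\dots,\zeta_n) = \tilde p(\tilde f)$ and choose $\xi_1,\dots,\xi_n\in T_m\Gamma^{\mathbb C}$ such that $\zeta_1,\dots,\zeta_n,\xi_1,\dots,\xi_n$ is a basis of $T_m\Gamma^{\mathbb C}$. Assume that $\zeta_1,\dots,\zeta_k$ is a basis of $D_m^{\mathbb C}$. Then a function $(\phi,\psi)_m$ can be defined on $F^{2n-k}_{\pi(m)}(\Gamma/D)^{\mathbb C}$ by
\[ (\phi,\psi)_m\big(\pi_*(\zeta_{k+1},\dots,\zeta_n,\xi_1,\dots,\xi_n)\big) := (s,r)_m\,\tilde\nu_m(\tilde f)\,\tilde\mu_m(\tilde f)\cdot\big|\epsilon_{\omega,k}(\zeta_{k+1},\dots,\zeta_n,\bar\zeta_{k+1},\dots,\bar\zeta_n)\big|^{1/2}\cdot\big|\epsilon_\omega(\zeta_1,\dots,\zeta_n,\xi_1,\dots,\xi_n)\big|. \tag{3.41} \]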
Although $(\phi,\psi)_m$ is again defined only on a certain subset of $F^{2n-k}_{\pi(m)}(\Gamma/D)^{\mathbb C}$, one can check as before that it extends consistently to a (unique) density on $F^{2n-k}_{\pi(m)}(\Gamma/D)^{\mathbb C}$, and, further, if $\phi$ and $\psi$ are covariantly constant along $P$ then $(\phi,\psi)_m$ depends only on $\pi(m)$, and thus defines a (unique) density on $\Gamma/D$. Finally, if $\zeta$ is a real vector field on $\Gamma$ preserving $P$ (i.e. $[\zeta,P]\subset P$), then the associated flow $\rho_t$ (which satisfies $\tilde\rho_{t*}P_m\subset P_{\rho_t m}$) induces a flow $\tilde\rho_t$ on $P$-frames which, for $t$ small enough, lifts uniquely to a flow $\tilde{\tilde\rho}_t$ on the metaframes such that $\tilde p\,\tilde{\tilde\rho}_t = \tilde\rho_t\,\tilde p$. Using this action we define
\[ (L_\zeta\tilde\nu)(\tilde f) := \frac{d}{dt}\,\tilde\nu_{\rho_t m}\big(\tilde{\tilde\rho}_t\tilde f\big)\Big|_{t=0}, \qquad \tilde f\in\hat F^n_m P^{\mathbb C}. \tag{3.42} \]
As before, it is easily seen that $L_\zeta\tilde\nu$ is again a $-\frac12$-$P$-form, for any $-\frac12$-$P$-form $\tilde\nu$, that $L_\zeta$ satisfies the axioms (3.28) and (3.29) of a “flat partial Lie derivative”, and that $L_{X_f} = \nabla_{X_f}$ if $f$ is a real function with $X_f\in P$. Introducing the Hilbert space $\mathcal H$ as before,
\[ \mathcal H = \text{the completion of } \Big\{\psi\in\Gamma(QB) : \nabla_\zeta\psi = 0\ \ \forall\zeta\in P \ \text{ and } \int_{\Gamma/D}(\psi,\psi) < \infty\Big\}, \]
a straightforward modification of the corresponding arguments for $-\frac12$-$P$-densities shows that the operators defined by (3.39), i.e.
\[ Q_f(s\otimes\nu) := \Big(-\frac{ih}{2\pi}\nabla_{X_f}s + fs\Big)\otimes\nu + s\otimes\Big(-\frac{ih}{2\pi}L_{X_f}\nu\Big) \tag{3.43} \]
(but now with the Lie derivative (3.38) replaced by (3.42) etc.!), for $f : \Gamma\to\mathbb R$ such that (3.40) holds, are densely defined operators of $\mathcal H$ into itself; and if $X_f$ is complete, they are self-adjoint. We have thus arrived at the final recipe of the original geometric quantization of Kostant and Souriau: that is, starting with a phase space (a symplectic manifold $(\Gamma,\omega)$) satisfying the integrality condition:
\[ h^{-1}[\omega] \text{ is an integral class in } H^2(\Gamma,\mathbb R), \]
and with a complex polarization $P$ on $\Gamma$ satisfying the condition for the existence of the metaplectic structure:
\[ \text{the class of } F^n P^{\mathbb C} \text{ in } H^2(\Gamma,\mathbb Z_2) \text{ vanishes}, \]
we have constructed the Hilbert space $\mathcal H$ as (the completion of) the space of all sections of the quantum bundle $QB = L\otimes\tilde B^P$ which are covariantly constant along $P$ and square-integrable over $\Gamma/D$; and for a function $f$ belonging to the space
\[ \mathrm{Obs} = \{f : \Gamma\to\mathbb R;\ [X_f,P]\subset P\} \tag{3.44} \]
(the space of quantizable observables) we have defined by (3.43) the corresponding quantum operator $Q_f$ on $\mathcal H$, which is self-adjoint if the Hamiltonian field $X_f$ of $f$ is complete, and such that the correspondence $f\mapsto Q_f$ satisfies the axioms (Q1)–(Q5) we have set ourselves in the beginning.$^{n}$

3.5. Blattner–Kostant–Sternberg pairing

The space (3.44) of quantizable observables is often rather small: for instance, for $\Gamma = \mathbb R^{2n}$ (with the standard symplectic form) and the vertical polarization $\partial/\partial p_1,\dots,\partial/\partial p_n$, the space Obs essentially coincides with functions at most linear in $p$, thus excluding, for instance, the kinetic energy $\frac12 p^2$. There is a method of extending the quantization map $Q$ to a larger space of functions$^{o}$ so that $Q_f$ is still given by (3.43) if $f$ satisfies (3.40), while giving the correct answer $Q_f = -\frac{h^2}{8\pi^2}\Delta$ for the kinetic energy $f(p,q) = \frac12 p^2$. The method is based on a pairing of half-forms, due to Blattner, Kostant and Sternberg [37], which we now proceed to describe. Suppose $P$ and $P'$ are two (complex) polarizations for which there exist two real foliations $\hat D$ and $\hat E$ (of constant dimensions $k$ and $2n-k$, respectively) such that
\[ \bar P\cap P' = \hat D^{\mathbb C}, \qquad \bar P + P' = \hat E^{\mathbb C}, \qquad \Gamma/\hat D \text{ has a manifold structure and } \pi : \Gamma\to\Gamma/\hat D \text{ is a submersion.} \tag{3.45} \]
Pairs of polarizations satisfying the first and the third condition are called regular;$^{p}$ if in addition $\hat D = \{0\}$ (which implies that the second condition also holds, with $\hat E = T\Gamma$), they are called transversal. If the polarizations $P$ and $P'$ are positive, which means that
\[ i\omega(x,\bar x)\ge 0, \qquad \forall\,x\in P, \tag{3.46} \]
and similarly for $P'$, then $\bar P\cap P'$ is automatically involutive, so the first condition in (3.45) is equivalent to the (weaker) property that $\bar P\cap P'$ be of constant rank.

$^{n}$ In (Q4), one of course takes the polarizations on the two manifolds which correspond to each other under the given diffeomorphism.
$^{o}$ However, on the extended domain $Q$ does in general no longer satisfy the axiom (Q3); see the discussion in Sec. 3.8 below.
$^{p}$ This definition of regularity slightly differs from the original one in [39], where it is additionally required that the Blattner obstruction (3.61) vanish.

For $m\in\Gamma$, choose a basis $\xi_1,\dots,\xi_n,\bar\xi'_{k+1},\dots,\bar\xi'_n,t_1,\dots,t_k$ of $T_m\Gamma^{\mathbb C}$ such that $\xi_1,\dots,\xi_k$ span $\hat D_m$, $\xi_1,\dots,\xi_n$ span $P_m$ and $\xi_1,\dots,\xi_k,\xi'_{k+1},\dots,\xi'_n$ span $P'_m$. Now
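if $\phi = s\otimes\nu$ and $\psi = r\otimes\mu$ are (local) sections of $L\otimes\tilde B^P$ and $L\otimes\tilde B^{P'}$, respectively, then we can “define” a function on $F^{2n-k}(\Gamma/\hat D)^{\mathbb C}$ by
\[ \langle\phi,\psi\rangle_m\big(\pi_*(\xi_{k+1},\dots,\xi_n,\bar\xi'_{k+1},\dots,\bar\xi'_n,t_1,\dots,t_k)\big) = (s,r)_m\,\nu_m\big((\xi_1,\dots,\xi_n)^\sim\big)\cdot\mu_m\big((\xi_1,\dots,\xi_k,\xi'_{k+1},\dots,\xi'_n)^\sim\big)\cdot\sqrt{\epsilon_{\omega,k}(\xi_{k+1},\dots,\xi_n,\bar\xi'_{k+1},\dots,\bar\xi'_n)}\cdot\big|\epsilon_\omega(\xi_1,\dots,t_k)\big|. \tag{3.47} \]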
(Here $\epsilon_{\omega,k}$ is given by (3.35).) Moreover, if $\phi$ and $\psi$ are covariantly constant along $P$ and $P'$, respectively, then this expression is independent of the choice of $m$ in the fiber above $\pi(m)$, and thus defines a density, denoted $(\phi,\psi)_{\pi(m)}$, on $\Gamma/\hat D$. However, there are two problems with (3.47): first, we need to specify which metaframes $(\xi_1,\dots,\xi_n)^\sim$ above $(\xi_1,\dots,\xi_n)$ and $(\xi'_1,\dots,\xi'_n)^\sim$ above $(\xi'_1,\dots,\xi'_n)$ to choose; and, second, we must specify the choice of the branch of the square root of $\epsilon_{\omega,k}$. Both problems are solved by introducing the metaplectic frame bundle on $\Gamma$, which, basically, amounts to a recipe for choosing metalinear lifts $\tilde B^P$ of $B^P$ for all complex polarizations $P$ on $\Gamma$ simultaneously.$^{q}$

$^{q}$ More precisely, for all positive complex polarizations (see the definition below). In other words, the choice of a metaplectic frame bundle uniquely determines a metalinear frame bundle for each positive complex polarization.

Remark 11. On an abstract level, the basic idea behind the half-form pairing can be visualized as follows (Rawnsley [225]). Let $P^\perp\subset T^*\Gamma^{\mathbb C}$ denote the bundle of one-forms vanishing on $P$; in view of the Lagrangianity of $P$, the mapping $\xi\mapsto\omega(\xi,\cdot)$ is an isomorphism of $P$ onto $P^\perp$. The exterior power $\bigwedge^n P^\perp =: K^P$ is a line bundle called the canonical bundle of $P$. If the polarization $P$ is positive, then the Chern class of $K^P$ is determined by $\omega$, so that $K^P$ and $K^{P'}$ are isomorphic for any two positive polarizations $P$ and $P'$. In this case the bundle $\overline{K^P}\otimes K^{P'}$ is trivial, and a choice of trivialization will yield the pairing. In particular, if $\bar P\cap P' = \{0\}$, then exterior multiplication defines an isomorphism of $\overline{K^P}\otimes K^{P'}$ with $\bigwedge^{2n}T^*\Gamma^{\mathbb C}$, and the latter is trivialized by the volume form $\epsilon_\omega$; hence one can define $\langle\nu,\mu\rangle$ by
\[ i^n\langle\nu,\mu\rangle\,\epsilon_\omega = \mu\wedge\bar\nu, \qquad \mu\in\Gamma(K^{P'}),\quad \nu\in\Gamma(K^P). \]
If $\bar P\cap P'$ has only constant rank, then the positivity of $P$ and $P'$ implies that the first two conditions in (3.45) hold, for some real foliation $\hat D$ and some real distribution (but not necessarily a foliation) $\hat E = \hat D^\perp$ on $\Gamma$. Then $\omega$ induces a nonsingular skew form $\omega_{\hat D}$ on $\hat E/\hat D$, and $P$ and $P'$ project to Lagrangian subbundles $P/\hat D$ and $P'/\hat D$ of $(\hat E/\hat D)^{\mathbb C}$ such that $\overline{P/\hat D}\cap(P'/\hat D) = \{0\}$. Thus $K^{P/\hat D}$ and $K^{P'/\hat D}$ can be paired by exterior multiplication as above. To lift this pairing back to $K^P$ and $K^{P'}$, consider $m\in\Gamma$ and a frame $\xi_1,\dots,\xi_n$ of $P_m$ such that $\xi_1,\dots,\xi_k$ is a frame of $\hat D_m$. Then any $\nu\in K^P_m$ is of the form
\[ \nu = a\,\omega(\xi_1,\cdot)\wedge\omega(\xi_2,\cdot)\wedge\cdots\wedge\omega(\xi_n,\cdot) \]
for some $a\in\mathbb C$. The projections $\tilde\xi_j$ of $\xi_j\in\hat D_m^{\perp\mathbb C}$ onto $(\hat E/\hat D)^{\mathbb C}_m$, $j = k+1,\dots,n$, then form a frame for $(P/\hat D)_m$, and we set
\[ \tilde\nu(\tilde\xi_1,\dots,\tilde\xi_k) := a\,\omega_{\hat D}(\tilde\xi_{k+1},\cdot)\wedge\cdots\wedge\omega_{\hat D}(\tilde\xi_n,\cdot)\in K_m^{P/\hat D}. \]
Projecting $\mu\in K_m^{P'}$ in the same fashion, we then put $\langle\nu,\mu\rangle_m := \langle\tilde\nu,\tilde\mu\rangle_m$. Thus in any case we end up with a $-2$-$\hat D$-density on $\Gamma$, which defines, using the volume density $|\epsilon_\omega|$, a 2-density on $T\Gamma$ normal to $\hat D$ (i.e. vanishing if any of its arguments is in $\hat D$). Thus if $\langle\nu,\mu\rangle_m$ is covariantly constant along the leaves, we can project down to a 2-density on $T(\Gamma/\hat D)$. Now if the Chern class of $K^P$ is even (in which case $(\Gamma,\omega)$ is called metaplectic), then the symplectic frame bundle of $\Gamma$ has a double covering, by means of which one can canonically construct a square root $Q^P$ of $K^P$, for any positive polarization $P$. (Sections of $Q^P$ are called half-forms normal to $P$.) Further, these square roots still have the property that $\overline{Q^P}\otimes Q^{P'}$ is trivial. Applying the “square root” to the above construction, one thus ends up with a density on $\Gamma/\hat D$. Integrating this density gives a complex number, and we thus finally arrive at the desired pairing
\[ \Gamma(Q^P)\times\Gamma(Q^{P'})\to\mathbb C. \]
In particular, choosing $P' = P$ (i.e. pairing a polarization with itself), passing from $Q^P$ to the tensor product $L\otimes Q^P$ with the prequantum bundle, and using again Lie differentiation to define a partial connection along $\hat D$ in the densities on $T\Gamma$ normal to $\hat D$, we can also continue as before and recover in this way in an equivalent guise the Hilbert space $\mathcal H$ and the quantum operators $Q_f$ from the preceding subsection(s).

We now give some details about the construction of the metaplectic frame bundle. As this is a somewhat technical matter, we will confine ourselves to the simplest case of transversal polarizations, i.e. such that (3.45) holds with $\hat D = \{0\}$ (and, hence, $\hat E = T\Gamma$); the general case can be found in [246, Chap. 5], or [39]. We will also assume throughout that the polarizations are positive, i.e. (3.46) holds. A symplectic frame at $m\in\Gamma$ is an (ordered) basis $(u_1,\dots,u_n,v_1,\dots,v_n)\equiv(u,v)$ of $T_m\Gamma$ such that
\[ \omega(u_j,u_k) = \omega(v_j,v_k) = 0, \qquad \omega(u_j,v_k) = \delta_{jk}. \]
The collection of all such frames forms a right principal $Sp(n,\mathbb R)$ bundle $F^\omega\Gamma$, the symplectic bundle; here $Sp(n,\mathbb R)$, the symplectic group, consists of all $g\in GL(2n,\mathbb R)$ which preserve $\omega$ (i.e. $\omega(g\xi,g\eta) = \omega(\xi,\eta)$). The group $Sp(n,\mathbb R)$ can be realized as the subgroup of $2n\times 2n$ real matrices $g$ satisfying $g^tJg = J$, where $J$ is the block matrix $\begin{pmatrix} 0 & I\\ -I & 0\end{pmatrix}$. The fundamental group of $Sp(n,\mathbb R)$ is infinite cyclic, hence there exists a unique double cover $Mp(n,\mathbb R)$, called the metaplectic group. We denote by $p$ the covering homomorphism. The metaplectic frame bundle $\tilde F^\omega\Gamma$ is a right principal $Mp(n,\mathbb R)$ bundle over $\Gamma$ together with a map $\tau : \tilde F^\omega\Gamma\to F^\omega\Gamma$ such that $\tau(\tilde\xi\cdot\tilde g) = \tau(\tilde\xi)\cdot p(\tilde g)$, for all $\tilde\xi\in\tilde F^\omega\Gamma$ and $\tilde g\in Mp(n,\mathbb R)$. The existence
of $\tilde F^\omega\Gamma$ is equivalent to the characteristic class of $F^\omega\Gamma$ in $H^2(\Gamma,\mathbb Z)$ being even (cf. the construction of the metalinear frame bundle $\hat F^n P^{\mathbb C}$). A positive Lagrangian frame at $m\in\Gamma$ is a frame $(w_1,\dots,w_n)\equiv w\in T_m\Gamma^{\mathbb C}$ such that
\[ \omega(w_j,w_k) = 0, \qquad \forall\,j,k = 1,\dots,n, \tag{3.48} \]
and
\[ i\omega(w_j,\bar w_j)\ge 0, \qquad \forall\,j = 1,\dots,n. \tag{3.49} \]
The corresponding bundle of positive Lagrangian frames is denoted by $L^\omega\Gamma$. In terms of a given symplectic frame $(u,v)$, a positive Lagrangian frame can be uniquely expressed as
\[ w = (u,v)\begin{pmatrix} U\\ V\end{pmatrix} \tag{3.50} \]
where $U$, $V$ are $n\times n$ matrices satisfying
\[ \operatorname{rank}\begin{pmatrix} U\\ V\end{pmatrix} = n, \qquad U^tV = V^tU, \tag{3.51} \]
in view of (3.48), and
\[ i(V^*U - U^*V) \text{ is positive semidefinite} \tag{3.52} \]
in view of (3.49).
in view of (3.49). This sets up a bijection between the set of all positive Lagrangian frames at a point m ∈ Γ and the set Π of all matrices U, V satisfying (3.51) and (3.52). The action of Sp(n, R) on Π by left matrix multiplication defines thus an action on Lω Γ and a positive Lagrangian frame w at m can be identified with the function w : F ω Γ → Π satisfying w ((u, v) · g) = g −1 w (u, v),
∀ g ∈ Sp(n, R)
by the recipe w = (u, v)w (u, v).
(3.53)
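For $n = 1$ the objects above are scalars: $Sp(1,\mathbb R) = SL(2,\mathbb R)$, the Lagrangian condition is vacuous, and for $w = uU + vV$ with $\omega(u,v) = 1$ the positivity condition (3.49) reads $i\omega(w,\bar w) = 2\,\mathrm{Im}(\bar U V)\ge 0$. The following Python sketch, with hypothetical helper names and arbitrary sample values, checks numerically that left multiplication by a real matrix of determinant one leaves this quantity unchanged, so that the set $\Pi$ is mapped into itself.

```python
def act(g, frame):
    """Left multiplication of the column (U, V) by a real 2x2 matrix g."""
    (a, b), (c, d) = g
    U, V = frame
    return (a * U + b * V, c * U + d * V)

def positivity(frame):
    """i * omega(w, conj(w)) for w = u*U + v*V, assuming omega(u, v) = 1."""
    U, V = frame
    return 2 * (U.conjugate() * V).imag

def det(g):
    (a, b), (c, d) = g
    return a * d - b * c

# Arbitrary sample data: an element of SL(2, R) and a positive frame.
g = ((2.0, 3.0), (1.0, 2.0))        # determinant 2*2 - 3*1 = 1
frame = (1.0 + 0.5j, 0.25 + 2.0j)   # Im(conj(U) * V) > 0

before = positivity(frame)
after = positivity(act(g, frame))
```

The invariance follows from $\mathrm{Im}(\bar U'V') = (ad - bc)\,\mathrm{Im}(\bar U V)$ for $U' = aU + bV$, $V' = cU + dV$ with real $a, b, c, d$.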
From (3.51) it follows that the matrix $C$ defined by
\[ C := U - iV \]
is nonsingular, and that the matrix $W$ defined by
\[ W = (U + iV)C^{-1} \]
is symmetric ($W^t = W$). From (3.52) it then follows that $\|W\|\le 1$, i.e. $W$ belongs to the closed unit ball
\[ B := \{W\in\mathbb C^{n\times n} : W^t = W,\ \|W\|\le 1\} \]
of symmetric complex $n\times n$ matrices.
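As a numerical illustration of the passage from $(U,V)$ to $(W,C)$, the following Python sketch takes $n = 2$, $U = I$ and $V = Z$ with $Z$ symmetric and $\mathrm{Im}\,Z$ positive definite (for this choice the Lagrangian and positivity conditions hold), and checks that $W$ is symmetric with norm at most one and that $U$, $V$ are recovered from $(W,C)$. The helper functions and the sample matrix $Z$ are illustrative assumptions, not part of the text.

```python
import math

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, s=1):
    """Entrywise A + s*B."""
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def inv(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def op_norm(A):
    """Largest singular value of a 2x2 matrix, from the eigenvalues of A* A."""
    H = mul(dagger(A), A)
    tr = (H[0][0] + H[1][1]).real
    det = (H[0][0] * H[1][1] - H[0][1] * H[1][0]).real
    disc = max(tr * tr - 4 * det, 0.0)
    return math.sqrt((tr + math.sqrt(disc)) / 2)

def to_WC(U, V):
    C = add(U, scale(1j, V), s=-1)           # C := U - iV
    W = mul(add(U, scale(1j, V)), inv(C))    # W := (U + iV) C^{-1}
    return W, C

def from_WC(W, C):
    I = [[1, 0], [0, 1]]
    U = scale(0.5, mul(add(I, W), C))          # U = (I + W) C / 2
    V = scale(0.5j, mul(add(I, W, s=-1), C))   # V = i (I - W) C / 2
    return U, V

# Sample frame: U = I, V = Z with Z symmetric and Im Z positive definite.
U = [[1 + 0j, 0j], [0j, 1 + 0j]]
V = [[0.3 + 1.0j, -0.2 + 0.1j], [-0.2 + 0.1j, 0.7 + 0.5j]]

W, C = to_WC(U, V)
U2, V2 = from_WC(W, C)

symmetric_defect = abs(W[0][1] - W[1][0])
norm_W = op_norm(W)
roundtrip_error = max(
    max(abs(U2[i][j] - U[i][j]), abs(V2[i][j] - V[i][j]))
    for i in range(2) for j in range(2)
)
```

For this particular family, $W = (I + iZ)(I - iZ)^{-1}$ is a Cayley transform of $Z$, which is one way to see why the unit ball $B$ appears.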
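Since
\[ U = \frac{(I+W)C}{2}, \qquad V = \frac{i(I-W)C}{2}, \tag{3.54} \]
the mapping $\begin{pmatrix} U\\ V\end{pmatrix}\mapsto(W,C)$ sets up a bijection between $\Pi$ and $B\times GL(n,\mathbb C)$. The action of $Sp(n,\mathbb R)$ on $\Pi$ translates into
\[ g\cdot(W,C) =: \big(g^\sharp(W),\ \alpha(g,W)C\big), \]
where $g^\sharp$ is a certain (fractional linear) mapping from $B$ into itself and $\alpha$ is a certain (polynomial) mapping from $Sp(n,\mathbb R)\times B$ into $GL(n,\mathbb C)$. Since $B$ is contractible, there exists a unique lift $\tilde\alpha : Mp(n,\mathbb R)\times B\to ML(n,\mathbb C)$ of $\alpha$ such that
\[ \tilde\alpha(\tilde e,W) = \tilde I, \qquad \forall\,W\in B, \]
where $\tilde e$ and $\tilde I$ stand for the identities in $Mp(n,\mathbb R)$ and $ML(n,\mathbb C)$, respectively, and
\[ p(\tilde\alpha(\tilde g,W)) = \alpha(p(\tilde g),W), \qquad \forall\,\tilde g\in Mp(n,\mathbb R),\ \forall\,W\in B, \]
where $p$ also denotes (on the left-hand side), as before, the canonical projection of $ML(n,\mathbb C)$ onto $GL(n,\mathbb C)$. Let $\tilde\Pi = B\times ML(n,\mathbb C)$; then there is a left action of $Mp(n,\mathbb R)$ on $\tilde\Pi$ defined by
\[ \tilde g\cdot(W,\tilde C) := \big(p(\tilde g)^\sharp(W),\ \tilde\alpha(\tilde g,W)\tilde C\big), \qquad \tilde g\in Mp(n,\mathbb R), \]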
and $\tilde\Pi$ is a double cover of $\Pi$ with the covering map $\tau : \tilde\Pi\to\Pi$ given by (3.54) with $C$ replaced by $p(\tilde C)$. In analogy with (3.53), we now define a positive metalinear Lagrangian frame as a function $\tilde w^\sharp : \tilde F^\omega\Gamma\to\tilde\Pi$ such that
\[ \tilde w^\sharp\big(\widetilde{(u,v)}\cdot\tilde g\big) = \tilde g^{-1}\cdot\tilde w^\sharp\big(\widetilde{(u,v)}\big), \qquad \forall\,\widetilde{(u,v)}\in\tilde F^\omega_m\Gamma,\quad \forall\,\tilde g\in Mp(n,\mathbb R), \]
and let $\tilde L^\omega\Gamma$ be the corresponding bundle of all such frames. The covering map $\tau : \tilde\Pi\to\Pi$ gives rise to the similar map $\tilde\tau : \tilde L^\omega\Gamma\to L^\omega\Gamma$, showing that the former is a double cover of the latter. Finally, the obvious right action of $GL(n,\mathbb C)$ on $L^\omega\Gamma$ lifts uniquely to a right action of $ML(n,\mathbb C)$ on $\tilde L^\omega\Gamma$.

Let now $P$ be a positive polarization on $(\Gamma,\omega)$. Then the bundle $F^n P^{\mathbb C}$ of $P$-frames is a subbundle of $L^\omega\Gamma$ invariant under the action of $GL(n,\mathbb C)$ just mentioned. The inverse image of $F^n P^{\mathbb C}$ under $\tilde\tau$ is a subbundle $\tilde F^n P^{\mathbb C}$ of $\tilde L^\omega\Gamma$ invariant under the action of $ML(n,\mathbb C)$, and $\tilde\tau$ restricted to $\tilde F^n P^{\mathbb C}$ defines a double covering $\tilde\tau : \tilde F^n P^{\mathbb C}\to F^n P^{\mathbb C}$. It follows that $\tilde F^n P^{\mathbb C}$ is a metalinear frame bundle of $P$, which we will call the metalinear frame bundle induced by $\tilde L^\omega\Gamma$. Finally, notice that for two positive polarizations $P$ and $P'$ satisfying the transversality condition
\[ \bar P\cap P' = \{0\} \tag{3.55} \]
and frames $(\xi_1,\dots,\xi_n)\equiv\xi$ and $(\xi'_1,\dots,\xi'_n)\equiv\xi'$ of $P$ and $P'$, respectively, at some point $m\in\Gamma$ as in (3.47) (with $k = 0$), if we identify $\xi$ and $\xi'$ with the matrices $\begin{pmatrix} U\\ V\end{pmatrix}$ and $\begin{pmatrix} U'\\ V'\end{pmatrix}$ as in (3.50) with respect to some choice of a symplectic frame $(u,v)$
at $m$, then the expression $\epsilon_{\omega,k}(\dots)$ in (3.47) reduces to
\[ \det\Big[\frac{\omega(\xi_j,\bar\xi'_l)}{i}\Big]_{j,l=1}^n = \det\Big(\frac12\,C'^*(I - W'^*W)\,C\Big), \tag{3.56} \]
with $(W,C)$ and $(W',C')$ as in (3.54). The transversality hypothesis implies that the matrix on the left-hand side is invertible, hence so must be $I - W'^*W$. Since the subset $B_0$ of all matrices in $B$ for which 1 is not an eigenvalue is contractible, there exists a unique map $\tilde\gamma : B_0\to ML(n,\mathbb C)$ such that
\[ p(\tilde\gamma(S)) = I - S, \quad \forall\,S\in B_0, \qquad\text{and}\qquad \tilde\gamma(0) = \tilde I. \]
(Note that $\tilde\gamma$ is independent of the polarizations $P$ and $P'$!) Consequently, the function
\[ \lambda\Big(\frac12\,\tilde C'^*\,\tilde\gamma(I - W'^*W)\,\tilde C\Big), \]
with $\lambda$ having the same meaning as in Sec. 3.4, gives the sought definition of the square root of (3.56) which makes the right-hand side of (3.47) well-defined and independent of the choice of the metalinear frames $\tilde\xi$, $\tilde\xi'$ above $\xi$ and $\xi'$. Finally, integrating the density (3.47) over $\Gamma/\hat D$, we obtain the sesquilinear pairing
\[ \phi,\psi\mapsto\langle\phi,\psi\rangle\in\mathbb C \tag{3.57} \]
between sections $\phi$ and $\psi$ of $L\otimes\tilde B^P$ and $L\otimes\tilde B^{P'}$ covariantly constant along $P$ and $P'$, respectively. This is the Blattner–Kostant–Sternberg pairing (or just BKS pairing for short) originally introduced in [37]. Unfortunately, there seems to be no known general criterion for the existence of $\langle\phi,\psi\rangle$, i.e. for the integrability of the density (3.47). All one can say in general is that $\langle\phi,\psi\rangle$ exists if both $\phi$ and $\psi$ are compactly supported. In many concrete situations, however, (3.57) extends continuously to the whole Hilbert spaces $\mathcal H_P$ and $\mathcal H_{P'}$ defined by (3.37) for the polarizations $P$ and $P'$, respectively, and, further, the operator $H_{PP'} : \mathcal H_{P'}\to\mathcal H_P$ defined by
\[ (\psi, H_{PP'}\phi)_{\mathcal H_P} = \langle\psi,\phi\rangle, \qquad \forall\,\phi\in\mathcal H_{P'},\quad \psi\in\mathcal H_P, \]
turns out to be, in fact, unitary. For instance, for $P = P'$, $H_{PP'}$ is just the identity operator (so that the BKS pairing coincides with the inner product in $\mathcal H_P$), and for $\Gamma = \mathbb R^{2n}$ and $P$ and $P'$ the polarizations spanned by the $\partial/\partial p_j$ and the $\partial/\partial q_j$, respectively, $H_{PP'}$ is the Fourier transform. It may happen, though, that $H_{PP'}$ is bounded and boundedly invertible but not unitary [226]; no example is currently known where $H_{PP'}$ would be unbounded.

Turning finally to our original objective, the extension of the quantization map $f\mapsto Q_f$, let now $f$ be a real function on $\Gamma$ such that $X_f$ does not necessarily preserve the polarization $P$. The flow $\rho_t = \exp(tX_f)$ generated by $X_f$ then takes $P$ into a polarization $\tilde\rho_t P =: P_t$, which may be different from $P$. The flow $\rho_t$ further induces
the corresponding flows on the spaces $\Gamma(L)$ of sections of the prequantum bundle $L$, as well as from sections of the metalinear bundle $\tilde B^P$ into the sections of $\tilde B^{P_t}$; hence, it gives rise to a (unitary) mapping, denoted $\rho_t$, from the quantum Hilbert space $\mathcal H_{P_t} =: \mathcal H_t$ into $\mathcal H$. Assume now that for all sufficiently small positive $t$, the polarizations $P_t$ and $P$ are such that the BKS pairing between them is defined on (or extends by continuity to) all of $\mathcal H_t\times\mathcal H$ and the corresponding operator $H_{P_tP} =: H_t$ is unitary. Then the promised quantum operator given by the BKS pairing is
\[ Q_f\phi = -\frac{ih}{2\pi}\,\frac{d}{dt}\,H_t\circ\rho_t\,\phi\,\Big|_{t=0}. \tag{3.58} \]
In view of the remarks in the penultimate paragraph, in practice it may be difficult to verify the (existence and) unitarity of $H_t$, but one may still use (3.58) to compute $Q_f$ on a dense subdomain and investigate the existence of a self-adjoint extension afterwards. Observe also that for $f\in\mathrm{Obs}$, i.e. for functions preserving the polarization ($[X_f,P]\subset P$), one has $P_t = P$ and $H_t = I$ $\forall t > 0$, and, hence, it can easily be seen that (3.58) reduces just to our original prescription (3.43). In particular, if $f$ is constant along $P$ (i.e. $X_f\in P$), then $Q_f$ is just the operator of multiplication by $f$.

If the polarization $P = D$ is real and its leaves are simply connected, it is possible to give an explicit local expression for the operator (3.58). Namely, let $V$ be a contractible coordinate patch on $\Gamma/D$ such that on $\pi^{-1}(V)$ (where, as before, $\pi : \Gamma\to\Gamma/D$ is the canonical submersion) there exist real functions $q_1,\dots,q_n$, whose Hamiltonian vector fields span $P|_{\pi^{-1}(V)}$, and functions $p_1,\dots,p_n$ such that $\omega|_{\pi^{-1}(V)} = \sum_{j=1}^n dp_j\wedge dq_j$. Using a suitable reference section on $\pi^{-1}(V)$ covariantly constant along $P$, the subspace in $\mathcal H_P$ of sections supported in $\pi^{-1}(V)$ can be identified with $L^2(V, dq_1\cdots dq_n)$. If $\psi$ is such a section, then under this identification, the operator (3.58) is given by
\[ Q_f\psi = \frac{ih}{2\pi}\,\frac{d\psi_t}{dt}\Big|_{t=0}, \tag{3.59} \]
where
\[ \psi_t(q_1,\dots,q_n) = \Big(\frac{2\pi}{ih}\Big)^{n/2}\int\exp\Big\{-\frac{2\pi}{ih}\int_0^t(\theta(X_f) - f)\circ\rho_{-s}\,ds\Big\}\times\det\big[\omega(X_{q_j},\rho_tX_{q_k})\big]_{j,k=1}^n\ \psi(q_1\circ\rho_{-t},\dots,q_n\circ\rho_{-t})\,dp_1\cdots dp_n, \tag{3.60} \]
where $\theta = \sum_{j=1}^n p_j\,dq_j$ and $\rho_t$ is, as usual, the flow generated by $X_f$. See [246, Sec. 6.3].

The conditions (3.45), under which the BKS pairing was constructed here, can be somewhat weakened, see Blattner [39].$^{r}$ In particular, for positive polarizations the pairing can still be defined even if the middle condition in (3.45) is omitted. In that case, a new complication can arise: it may happen that for two sections $\phi$ and $\psi$ which are covariantly constant along $P$ and $P'$, respectively, their “local scalar product” $\langle\phi,\psi\rangle_m$ is not covariantly constant along $\hat D$ (i.e. does not depend only on $\pi(m)$). More precisely: $\langle\phi,\psi\rangle_m$ is covariantly constant whenever $\phi$ and $\psi$ are if and only if the one-form $\chi_{PP'}$ (the Blattner obstruction) defined on $\bar P\cap P'$ by
\[ \chi_{PP'} := \sum_{j=1}^{n-k}\omega([v_j,w_j],\cdot) \tag{3.61} \]
vanishes. Here $k = \dim\bar P\cap P'$ and $v_1,\dots,v_{n-k},w_1,\dots,w_{n-k}$ are (arbitrary) vector fields in $\bar P + P'$ such that
\[ \omega(v_i,v_j) = \omega(w_i,w_j) = 0, \qquad \omega(v_i,w_j) = \delta_{ij}. \]

$^{r}$ Originally, the pairing was defined in Blattner’s paper [37] for a pair of transversal real polarizations; the transversality hypothesis was then replaced by regularity in [38], and finally regular pairs of positive complex polarizations were admitted in [39].
The simplest example when $\chi_{PP'}\ne 0$ is $\Gamma = \mathbb R^4$ (with the usual symplectic form) and $P$ and $P'$ spanned by $\partial/\partial p_1$, $\partial/\partial p_2$ and $p_1\,\partial/\partial p_1 + p_2\,\partial/\partial p_2$, $p_2\,\partial/\partial q_1 - p_1\,\partial/\partial q_2$, respectively. We remark that so far there are no known ways of defining the BKS pairing if the dimension of $\bar P\cap P'$ varies, or if the intersection is not of the form $\hat D^{\mathbb C}$ for a real distribution $\hat D$ which is fibrating. Robinson [233] showed how to define the “local” product $\langle\phi,\psi\rangle_m$ for a completely arbitrary pair of polarizations $P$ and $P'$; however, his pairing takes values not in a bundle of densities but in a certain line bundle over $\Gamma$ (coming from higher cohomology groups) which is not even trivial in general, so it is not possible to integrate the local products into a global ($\mathbb C$-valued) pairing. (For a regular pair of positive polarizations, Robinson’s bundle is canonically isomorphic to the bundle of densities on $\Gamma$.) A general study of the integral kernels mediating BKS-type pairings was undertaken by Gawędzki [103, 104]; he also obtained a kernel representation for the quantum operators $Q_f$. His kernels seem actually very much akin to the reproducing kernels for vector bundles investigated by Peetre [215] and others, cf. the discussion in Sec. 5 below.

A completely different method of extending the correspondence $f\mapsto Q_f$ was proposed by Kostant in [168]. For a set $\mathcal X$ of vector fields on $\Gamma$ and a polarization $P$, denote by $(\mathrm{ad}\,P)\mathcal X$ the set $\{[X,Y];\ X\in\mathcal X,\ Y\in P\}$, and let
\[ C_P^k := \{f\in C^\infty(\Gamma) : (\mathrm{ad}\,P)^k\{X_f\}\subset P\}, \qquad k = 0,1,2,\dots. \]
Then, in view of the involutivity of $P$, $C_P^k\subset C_P^{k+1}$, and, in fact, $C_P^0$ is the space of functions constant along $P$, and $C_P^1 = \mathrm{Obs}$; one can think of $C_P^k$ as the space of functions which are “polynomial of degree at most $k$ in the directions transversal to $P$”. Kostant’s method extends the domain of the mapping $f\mapsto Q_f$ to the union $C_P^* := \bigcup_{k\ge 0}C_P^k$; though phrased in completely geometric terms, in the end it essentially boils down just to choosing a particular ordering of the operators $P_j$ and $Q_j$ (cf. Sec. 6 below). Namely, let $P'$ be an auxiliary polarization on $\Gamma$ such that locally near any $m\in\Gamma$, there exist functions $q_1,\dots,q_n$ and $p_1,\dots,p_n$ such
that Xqj span P, Xpj span P , and {qj , pk } = δjk . (Such polarizations are said to be Heisenberg related.) Now if f is locally of the form pm φ(q) (any function from CP∗ is a sum of such functions), then |m|−1 |k| |k| ih m ∂ φ ∂ |m−k| Qf = . −1 k ∂q k ∂q m−k 2π 2 0≤|k|≤|m|
Here $m = (m_1,\dots,m_n)$ is a multiindex, $|m| = m_1 + \cdots + m_n$, and similarly for $k$. Again, however, the axiom (Q3) is no longer satisfied by these operators on the extended domain, and, further, the operator $Q_f$ depends also on the auxiliary polarization $P'$: if $f\in C_P^k$, then $Q_f$ is a differential operator of order $k$, and choosing a different auxiliary polarization (Heisenberg related to $P$) results in an error term which is a differential operator of order $k - 2$. We will say nothing more about this method here.

3.6. Further developments

In spite of the sophistication of geometric quantization, there are still quite a few things that can go wrong: the integrality condition may be violated, polarizations or the metaplectic structure need not exist, the Hilbert space $\mathcal H$ may turn out to be trivial, there may be too few quantizable functions, etc. We will survey here various enhancements of the original approach that have been invented in order to resolve some of these difficulties, and then discuss the remaining ones in Sec. 3.8.

3.6.1. Bohr–Sommerfeld conditions and distributional sections

An example when the Hilbert space $\mathcal H$ turns out to be trivial (that is, when there are no square-integrable covariantly constant sections of $QB$ except the constant zero) is that of $\Gamma = \mathbb C\setminus\{0\}$ ($\simeq\mathbb R^2$ with the origin deleted), with the standard symplectic form, and the circular (real) polarization $D$ spanned by $\partial/\partial\theta$, where $(r,\theta)$ are the polar coordinates in $\mathbb C\simeq\mathbb R^2$. The leaf space $\Gamma/D$ can be identified with $\mathbb R_+$; upon employing a suitable reference section, sections of the quantum bundle (3.24) can be identified with functions on $\mathbb C\setminus\{0\}$, and covariantly constant ones with those satisfying $f(e^{i\theta}z) = e^{2\pi ir\theta/h}f(z)$. (See [257, pp. 79–83].) However, as the coordinate $\theta$ is cyclic, this forces the support of $f$ to be contained in the union of the circles
\[ r = \frac{kh}{2\pi}, \qquad k = 1,2,\dots. \tag{3.62} \]
As the latter is a set of zero measure, we get $\mathcal H = \{0\}$.
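The single-valuedness argument behind (3.62) can be checked numerically: setting $\theta = 2\pi$ in $f(e^{i\theta}z) = e^{2\pi ir\theta/h}f(z)$ gives $f(z) = e^{4\pi^2 ir/h}f(z)$, so $f$ can be nonzero at radius $r$ only if this phase equals 1. The Python sketch below evaluates the phase on and off the circles; the function names and the value `h = 1.0` are illustrative assumptions.

```python
import cmath
import math

def holonomy(r, h):
    """Phase acquired by a covariantly constant section after one full turn
    (theta = 2*pi) around the circle of radius r."""
    return cmath.exp(2j * math.pi * r * (2 * math.pi) / h)

def bohr_sommerfeld_radii(h, k_max):
    """Radii r = k*h/(2*pi), k = 1..k_max, of the circles in (3.62)."""
    return [k * h / (2 * math.pi) for k in range(1, k_max + 1)]

h = 1.0  # illustrative value of Planck's constant in arbitrary units
radii = bohr_sommerfeld_radii(h, 5)

# Deviation of the phase from 1 on the first five circles (rounding error only).
on_circles = [abs(holonomy(r, h) - 1) for r in radii]

# Halfway between the first and second circle the phase equals -1,
# so any covariantly constant section must vanish there.
midway = abs(holonomy(1.5 * h / (2 * math.pi), h) - 1)
```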
A similar situation can arise whenever the leaves of D are not simply connected. In general, for any leaf Λ of D, the partial connection ∇ on the quantum line bundle QB induces a flat connection in the restriction QB|Λ of QB to Λ. For any closed loop γ in Λ, a point m on γ and φ ∈ QBm \{0}, the parallel transport with respect to the latter connection of φ along γ transforms φ into cφ, for some c ∈ C× ;
the set of all $c$ that arise in this way forms a group, the holonomy group $G_\Lambda$ of $\Lambda$. Let $\sigma$ be the set of all leaves $\Lambda\in\Gamma/D$ whose holonomy groups are trivial, i.e. $G_\Lambda = \{1\}$. The preimage $S = \pi^{-1}(\sigma)\subset\Gamma$ is called the Bohr–Sommerfeld variety, and it can be shown that any section of $QB$ covariantly constant along $D$ has support contained in $S$. In the example above, $S$ is the union of the circles (3.62). For real polarizations $P$ such that all Hamiltonian vector fields contained in $P$ are complete (the completeness condition), the problem can be solved by introducing distribution-valued sections of $QB$. See [246, Sec. 4.5], and [281, pp. 162–164]. In the example above, this corresponds to taking $\mathcal H$ to be the set of all functions $\phi$ which are equal to $\phi_k$ on the circles (3.62) and vanish everywhere else, i.e.
\[ \phi(re^{i\theta}) = \begin{cases} \phi_k\,e^{ki\theta} & \text{if } r = k\dfrac{h}{2\pi},\ k = 1,2,\dots,\\[1ex] 0 & \text{otherwise,}\end{cases} \tag{3.63} \]
with the inner product
\[ (\phi,\psi) = \sum_{k=1}^\infty\bar\phi_k\psi_k. \]
For real functions $f$ satisfying
\[ [X_f,P]\subset P \tag{3.64} \]
(i.e. preserving the polarization), the quantum operators $Q_f$ can then be defined, essentially, in the same way as before, and extending the BKS pairing to distribution-valued sections (see [246, Sec. 5.1]), one can also extend the domain of the correspondence $f\mapsto Q_f$ to some functions $f$ for which (3.64) fails. For complex polarizations, there exist some partial results (e.g. Mykytiuk [190]), but the problem is so far unsolved in general.

Remark 12. It turns out that in the situation from the penultimate paragraph, the subspaces $\mathcal H_\alpha\subset\mathcal H$ consisting of sections supported on a given connected component $S_\alpha$ of the Bohr–Sommerfeld variety $S$ are invariant under all operators $Q_f$ (whether $f$ satisfies (3.64) or $Q_f$ is obtained by the BKS pairing); that is, $\mathcal H$ is reducible under the corresponding set of quantum operators. One speaks of the so-called superselection rules ([246, Sec. 6.4]).

3.6.2. Cohomological correction

Another way of attacking the problem of non-existence of square-integrable covariantly constant sections is the use of higher cohomology groups. Let $k\ge 0$ be an integer and let $QB$ be the quantum bundle $L\otimes B^P$ or $L\otimes\tilde B^P$ constructed in Secs. 3.3 (or 3.2) and 3.4, respectively. A $k$-$P$-form with values in $QB$ is a $k$-linear and alternating map $\alpha$ which assigns a smooth section $\alpha(X_1,\dots,X_k)$ of $QB$ to any $k$-tuple of vector fields $X_1,\dots,X_k\in\bar P$. We denote the space of all such forms by $\Lambda^k(\Gamma,P)$; one has $\Lambda^0(\Gamma,P) = \Gamma(QB)$, and, more generally, any $\alpha\in\Lambda^k(\Gamma,P)$ can be locally written as a product $\alpha = \beta\tau$ where $\tau$ is a section of
$QB$ covariantly constant along $\bar P$ and $\beta$ is an ordinary complex $k$-form on $\Gamma$, with two such products $\beta\tau$ and $\beta'\tau$ representing the same $k$-$P$-form whenever $\beta - \beta'$ vanishes when restricted to $\bar P$. The operator $\bar\partial_P : \Lambda^k(\Gamma,P)\to\Lambda^{k+1}(\Gamma,P)$ is defined by
\[ (\bar\partial_P\alpha)(X_1,\dots,X_{k+1}) = \sum_\sigma\Big[\nabla_{X_{\sigma(1)}}\big(\alpha(X_{\sigma(2)},\dots,X_{\sigma(k+1)})\big) - \frac k2\,\alpha\big([X_{\sigma(1)},X_{\sigma(2)}],X_{\sigma(3)},\dots,X_{\sigma(k+1)}\big)\Big] \]
where the summation extends over all cyclic permutations $\sigma$ of the index set $1,2,\dots,k+1$. It can be checked that $\bar\partial_P^2 = 0$; hence, we can define the cohomology groups $H^k(\Gamma,P)$ as the quotients $\mathrm{Ker}(\bar\partial_P|_{\Lambda^k})/\mathrm{Ran}(\bar\partial_P|_{\Lambda^{k-1}})$ of the $\bar\partial_P$-closed $k$-$P$-forms by the $\bar\partial_P$-exact ones. Finally, for each real function $f$ satisfying (3.64) (i.e. preserving the polarization), one can extend the operator $Q_f$ given by (3.43) (or (3.31) or (3.39)) to $\Lambda^k(\Gamma,P)$ by setting
\[ (Q_f\alpha)(X_1,\dots,X_k) := Q_f\big(\alpha(X_1,\dots,X_k)\big) + \frac{ih}{2\pi}\sum_{j=1}^k\alpha(X_1,\dots,[X_f,X_j],\dots,X_k). \tag{3.65} \]
It can be checked that $Q_f$ commutes with $\bar\partial_P$, and thus induces an operator, also denoted $Q_f$, on the cohomology groups $H^k(\Gamma,P)$. Now it may happen that even though $H^0(\Gamma,P)$ contains no nonzero covariantly constant sections, one of the higher cohomology groups $H^k(\Gamma,P)$ does, and one can then use it as a substitute for $\mathcal H$ (and (3.65) as a substitute for (3.43)). For instance, in the above example of $\Gamma = \mathbb C\setminus\{0\}$ with the circular polarization, one can show that using $H^1(\Gamma,P)$ essentially gives the same quantization as the use of the distributional sections in Sec. 3.6.1 (see Simms [242]). However, in general there are still some difficulties left: for instance, we need to define an inner product on $H^k(\Gamma,P)$ in order to make it into a Hilbert space, etc. The details can be found in Woodhouse [281, Sec. 6.4], Rawnsley [223], or Puta [221] and the references given there.

3.6.3. $Mp^{\mathbb C}$-structures

One more place where the standard geometric quantization can break down is the very beginning: namely, when the integrality condition $h^{-1}[\omega]\in H^2(\Gamma,\mathbb Z)$, or the condition for the existence of the metaplectic structure $\frac12 c_1(\omega)\in H^2(\Gamma,\mathbb Z)$, are not satisfied. This is the case, for instance, for the odd-dimensional harmonic oscillator, whose phase space is the complex projective space $\mathbb{CP}^n$ with even $n$. It turns out that this can be solved by extending the whole method of geometric quantization to the case when the sum $h^{-1}[\omega] + \frac12 c_1(\omega)$, rather than the two summands separately, is integral. This was first done by Czyz [71] for compact Kähler manifolds, using an
axiomatic approach, and then by Hess [139], whose method was taken much further by Rawnsley and Robinson [227] (see Robinson [234] for a recent survey). The main idea is to replace the two ingredients just mentioned (the prequantum bundle and the metaplectic structure) by a single piece of data, called the prequantized Mp^C structure. To define it, consider, quite generally, a real vector space V of dimension 2n with a symplectic form Ω and an irreducible unitary projective representation W of V on a separable complex Hilbert space H such that

$$W(x)W(y) = e^{-\pi i\,\Omega(x,y)/h}\,W(x+y), \qquad \forall\, x, y \in V.$$

By the Stone–von Neumann theorem, W is unique up to unitary equivalence; consequently, for any g ∈ Sp(V, Ω) there exists a unitary operator U on H (unique up to multiplication by a unimodular complex number) such that $W(gx) = U W(x) U^*$ for all x ∈ V. Denote by $\mathrm{Mp}^C(V, \Omega)$ the group of all such U's as g ranges over Sp(V, Ω), and let $\sigma : \mathrm{Mp}^C(V, \Omega) \to \mathrm{Sp}(V, \Omega)$ be the mapping given by σ(U) = g. The kernel of σ is just U(1), identified with the unitary scalar operators in H. There is a unique character $\eta : \mathrm{Mp}^C(V, \Omega) \to U(1)$ such that $\eta(\lambda I) = \lambda^2$ ∀λ ∈ U(1); the kernel of η is our old friend, the metaplectic group Mp(V, Ω).

Let now Sp(Γ, ω) denote the symplectic frame bundle of the manifold Γ, which we think of as being modeled fiberwise on (V, Ω). An Mp^C-structure on Γ is a principal $\mathrm{Mp}^C(V, \Omega)$ bundle $P \xrightarrow{\pi} \Gamma$ together with a σ-equivariant bundle map P → Sp(Γ, ω). An Mp^C structure is called prequantized if, in addition, there exists an $\mathrm{Mp}^C(V, \Omega)$-invariant u(1)-valued one-form γ on P such that $d\gamma = \frac{2\pi}{ih}\pi^*\omega$ and $\gamma(\tilde z) = \frac12 \eta_* z$ for all z in the Lie algebra of $\mathrm{Mp}^C(V, \Omega)$; here $\tilde z$ is the fundamental vertical vector field corresponding to z. It turns out that Mp^C structures always exist on any symplectic manifold, and prequantized ones exist if and only if the combined integrality condition

the class $h^{-1}[\omega] + \frac12 c_1(\omega)_{\mathbb{R}} \in H^2(\Gamma, \mathbb{R})$ is integral

is fulfilled. In that case, if P is a positive polarization on Γ, one can again consider partial connections and covariantly constant sections of P, and define the corresponding Hilbert spaces and quantum operators more or less in the same way as before. Details can be found in Rawnsley and Robinson [227] and Blattner and Rawnsley [42]. It is also possible to define the BKS pairing in this situation.

3.6.4. Modular structures

From a physical point of view, it has sometimes been argued that the process of choosing a polarization should, at least in certain favorable situations, have the meaning of finding a maximal set of commuting observables for the quantized system. More precisely, if the prequantized set of observables can be described by a von Neumann algebra, then in many situations, the choice of a polarization can be related to the choice of a maximal abelian, atomic von Neumann subalgebra. Alternatively, the problem of extracting an irreducible representation of the quantum algebra of observables can in these cases be related to the well-known
problem of ordering of operators in quantum mechanics. The most favorable situation arises when the phase space is the coadjoint orbit associated (in the sense of Kirillov [159]) to a unitary irreducible representation U of a Lie group G and when this representation is square integrable (see, for example, [8] for a detailed discussion of square-integrable representations). In this case there is a modular structure (in the sense of Tomita [253]), associated to the von Neumann algebra of the prequantized observables, arising from the modular structure determined by the left and right regular representations, $U_\ell$ and $U_r$ respectively, of G. (These representations mutually commute.) The von Neumann algebras $A_\ell$ and $A_r$, generated by the restrictions of $U_\ell$ and $U_r$, respectively, to the subspace $H(U) \subset L^2(G)$ containing all subrepresentations unitarily equivalent to U, also commute, and this restriction preserves the modular structure. The choice of a polarization in this context amounts to finding [10, 11] irreducible subrepresentations of G in H(U). This can be done by identifying atomic maximal abelian subalgebras M in the center $A_\ell \cap A_r$ of the restricted von Neumann algebras. The modular structure then guarantees that the algebras M are generated by minimal projectors, which can then be used to isolate irreducible subrepresentations of G (unitarily equivalent to U). More interestingly, the minimal projectors generating M can be used to construct KMS states (see, for example, [136] for a definition and properties of such states appropriate to the present context) on $A_\ell$ (or equivalently, on $A_r$). These states, which enjoy remarkable analytic properties, are vector states on the algebras and are invariant with respect to certain canonically defined time evolutions. The existence of different classes of KMS states reflects the possibility of different decompositions into irreducibles and hence of different maximal abelian, atomic subalgebras M.
The appearance of KMS states in this context is intriguing since it is a notion borrowed from equilibrium quantum statistical mechanics, but now reappearing in a totally different guise, related to the choice of polarizations. There is also a measurement theoretic interpretation, in this context, of the individual minimal projectors generating M: they give rise to specific orderings of operators in the irreducible sectors.

3.7. Spin^C-quantization

A very spectacular recent development consists in replacing the use of polarizations, higher cohomology etc., by viewing the geometric quantization as the index, in the K-theoretic sense (i.e. as a virtual Hilbert space), of a suitable Spin^C Dirac operator.

Consider, quite generally, a Lie group G, with Lie algebra $\mathfrak{g}$, which is acting on a symplectic manifold Γ and preserves the symplectic form ω. A moment map is a G-equivariant mapping $\Phi : \Gamma \to \mathfrak{g}^*$ from Γ into the dual of $\mathfrak{g}$ such that for each $X \in \mathfrak{g}$, the function $\Phi_X := \langle \Phi, X\rangle$ satisfies

$$d\Phi_X = \omega(X,\,\cdot\,), \qquad \forall\, X \in \mathfrak{g}.$$

Here we are denoting by the same letter X the vector field $\frac{d}{dt}\exp(tX)\big|_{t=0}$ induced on Γ by X through the G-action. If the action of G admits a moment map, then it is called Hamiltonian. The moment map Φ is then unique up to translations, and $\Phi_{[X,Y]} = \{\Phi_X, \Phi_Y\}$, i.e. $X \mapsto \Phi_X$ is a Lie algebra homomorphism from $\mathfrak{g}$ into $(C^\infty(\Gamma), \{\cdot,\cdot\})$. In view of the non-degeneracy of ω, this homomorphism is injective.

Suppose now that we have a finite-dimensional subspace $\mathfrak{g}$ of $C^\infty(\Gamma)$ and a quantization $f \mapsto Q_f$ from $\mathfrak{g}$ into selfadjoint operators on some Hilbert space H satisfying our initial axioms (Q1) (identity), (Q2) (linearity) and (Q3) (Poisson brackets). Assume further that the Hamiltonian vector fields of functions in $\mathfrak{g}$ are complete, so that the corresponding Hamiltonian flows generate an action on Γ of a connected Lie group G with Lie algebra $\mathfrak{g}$. Then by (Q3), the mapping $f \mapsto \frac{2\pi}{ih} Q_f$ is a representation of the Lie algebra $\mathfrak{g}$ (with respect to the Poisson bracket) on H. Lifting this representation to G (or its cover) we therefore obtain a Hamiltonian action with moment map Φ(x)(f) := f(x), i.e. $\Phi_f = f$. We thus see that to each quantization rule $f \mapsto Q_f$ of $\mathfrak{g}$ there corresponds a Hamiltonian group action of G on H.

Conversely, suppose that to any given Hamiltonian action of a Lie group G on Γ (with moment map Φ) we can associate a representation of G on a Hilbert space H. On the level of the Lie algebras, this gives a representation $X \mapsto \pi(X)$ of the Lie algebra $\mathfrak{g}$ of G in H. Hence, the mapping $Q : \Phi_X \mapsto \pi(X)$, defined on the Poisson subalgebra $\{\Phi_X : X \in \mathfrak{g}\}$ of $C^\infty(\Gamma)$, satisfies the quantization axioms (Q1) and (Q3). We are thus left with the problem of associating a representation on a Hilbert space to each Hamiltonian group action of G. (Eventually, we also need to worry about some kind of “irreducibility” conditions like (Q4) and (Q5).)
The idea of how to attack this problem is apparently due to Bott, for the situation when both G and Γ are compact, and the symplectic form ω is, as usual, integral. Let L be the prequantum bundle from Sec. 3.1. Bott’s idea was to define H as the push-forward $H = \pi_*([L])$, where [L] denotes the class of L in the K-theory of Γ and π is the map sending Γ into a point. Thus H is an element of the K-theory of a point, that is, a virtual vector space, i.e. a formal difference $H_1 \ominus H_2$ of two Hilbert spaces $H_1, H_2$ (two such formal differences $H_1 \ominus H_2$, $H_1' \ominus H_2'$ being considered equal if and only if $H_1 \oplus H_2' \cong H_1' \oplus H_2$; the only thing which matters is therefore the dimension $\dim H_1 - \dim H_2$, which can be negative; also, there is no natural choice for the inner product). One shows that to a certain extent this construction is equivariant, thus giving a virtual representation of G on H if we start with a Hamiltonian action of G on Γ. Further, if the action of G on Γ is transitive, then it can be deduced from the Borel–Weil–Bott theorem that the representation is irreducible. Finally (and this is crucial for what follows) one can actually take for $H_1$ and $H_2$ the kernel and the cokernel of a certain Dolbeault complex on Γ. The advantage is that for this it is not necessary that Γ possess the complex structure or polarization etc.: the only thing needed is a compatible almost complex structure, which always exists, and,
further, the dimension of H is independent of it. The above procedure is known as almost complex quantization.

The whole idea can be generalized further by passing from the almost complex structures and prequantum bundles to Spin^C structures. Recall that the group Spin(n), n > 2, is the universal (double) cover of SO(n). The group $\mathrm{Spin}^C(n)$ is the quotient of Spin(n) × U(1) by the two-element subgroup generated by (ε, −1), where ε is the nontrivial element in the kernel of the covering map q : Spin(n) → SO(n). A Spin^C structure on Γ is a principal $\mathrm{Spin}^C(n)$ bundle P → Γ, together with a Spin^C-equivariant map $p : P \to GL^+(T\Gamma)$ (where TΓ is the tangent bundle of Γ), which gives rise to a Riemannian metric and orientation on Γ. Let $\Delta^\pm$ be the two real spin representations of Spin(n). These generate representations of $\mathrm{Spin}^C(n)$ on the tensor products $\Delta^\pm \otimes \mathbb{C}$. Consider the vector bundles $S^\pm = P \times_{\mathrm{Spin}^C(n)} (\Delta^\pm \otimes \mathbb{C})$. Then a connection on P determines a connection $\nabla : C^\infty(\Gamma, S^+) \to C^\infty(\Gamma, S^+ \otimes T^*\Gamma)$, while the Clifford multiplication gives rise to a bundle morphism $S^+ \otimes T^*\Gamma \to S^-$. Composing the two maps, we obtain a first-order differential operator $D : C^\infty(\Gamma, S^+) \to C^\infty(\Gamma, S^-)$ called the Spin^C-Dirac operator. It is an elliptic operator, hence if Γ is compact, its index is finite, and its K-theoretic index $H := \ker D \ominus \ker D^*$ turns out to be a finite-dimensional virtual representation of G, called the Spin^C quantization of the Spin^C manifold Γ. It can be shown that any complex line bundle L → Γ determines a Spin^C structure on TΓ in a canonical way. Taking, in particular, for L the prequantum bundle on a symplectic manifold with a compatible almost complex structure, we get a Spin^C structure on Γ for which $S^\pm$ coincide with the even- and odd-degree L-valued forms, the Dirac operator essentially becomes the Dolbeault operator $\bar\partial + \bar\partial^*$, and the Spin^C quantization reduces to the almost complex quantization mentioned earlier.
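As a hedged illustration of how the virtual dimension behaves (this example is not taken from the text; it assumes the standard identification of the index of the L-twisted Dolbeault operator with the holomorphic Euler characteristic of L), take Γ = CP^1 with its usual complex structure and let L be a holomorphic line bundle of degree k. The Riemann–Roch theorem for a curve of genus 0 then gives:

```latex
\[
\dim H \;=\; \dim\ker D \;-\; \dim\ker D^{*}
       \;=\; \dim H^{0}(\mathbb{CP}^{1},L) \;-\; \dim H^{1}(\mathbb{CP}^{1},L)
       \;=\; k + 1 .
\]
% k >= 0  : H^1 vanishes, so H is an honest space of dimension k + 1
% k <= -2 : H^0 vanishes, so the virtual dimension k + 1 is negative
```

For k ≥ 0 the cokernel vanishes and H is an honest (k + 1)-dimensional space, whereas for k ≤ −2 only the cokernel survives and the virtual dimension is negative, which is the possibility noted in the definition of a virtual vector space.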
Finally, the whole setup can be extended also to noncompact manifolds Γ [57].

The whole theory has an overwhelming mathematical beauty and combines brilliant ideas from representation theory; for instance, index theorems can be employed to get breathtaking formulas relating the dimension of H to various geometric invariants etc. On the other hand, these developments seem to veer away somewhat from our original quantization problem: the algebra of functions which get quantized is rather small (finite dimensional), and instead of honest Hilbert spaces we get only virtual representations, which are much less pleasant from the physical point of view.^s

^s It is remarkable, however, that in many situations the kernel of $D^*$ eventually trivializes in the semiclassical limit h → 0, and thus H becomes, in the semiclassical limit, an honest (not only virtual) Hilbert space; see [53] and [273].
Some good sources on Spin^C quantization (which, to a large extent, we also followed in this subsection) are the papers by Guillemin [130], Vergne [272], Sjamaar [244], and the references therein, as well as the book by Guillemin, Ginzburg and Karshon [132] and its Appendix J by Braverman.

3.8. Some shortcomings

Though the method of geometric quantization has been very successful in deepening our understanding of the nature of the classical to quantum transition and its relation to representation theory, it also has some drawbacks. One of them is the dependence on the various ingredients, i.e. the choice of the prequantum bundle, metaplectic structure (or prequantized Mp^C-structure), and polarization. The (equivalence classes of) various possible choices of the prequantum bundle are parameterized by the elements of the cohomology group $H^1(\Gamma, \mathbb{T})$, and have a very sound physical interpretation (for instance, they allow for the difference between the bosons and the fermions, see Souriau [248]). The situation with the choices for the metaplectic structure, which are parameterized by $H^1(\Gamma, \mathbb{Z}_2)$, is already less satisfactory (for instance, for the harmonic oscillator, only one of the two choices gives the correct result for the energy levels; see [257, pp. 150–153]). But things get even worse with the dependence on polarization. One would expect the Hilbert spaces associated to two different polarizations of the same symplectic manifold to be in some “intrinsic” way unitarily equivalent; more specifically, for any two polarizations $P, P'$ for which the BKS pairing exists, one would expect the corresponding operator $H_{PP'}$ to be unitary, and such that the corresponding quantum operators satisfy $Q_f H_{PP'} = H_{PP'} Q_f$ for any real observable f quantizable with respect to both $P$ and $P'$. We have already noted in Sec. 3.5 that the former need not be the case ($H_{PP'}$ can be a bounded invertible operator which is not unitary, nor even a multiple of a unitary operator), and it can be shown that even if $H_{PP'}$ is unitary, the latter claim can fail too (cf. [258]). Finally, it was shown by Gotay [121] that there are symplectic manifolds on which there do not exist any polarizations whatsoever.^t Such phase spaces are, of course, “unquantizable” from the point of view of conventional geometric quantization theory.

^t It should be noted that, unlike the cohomology groups $H^1(\Gamma, \mathbb{T})$ for the choices of the prequantum bundle and $H^1(\Gamma, \mathbb{Z}_2)$ for the choice of the metaplectic structure, there seems to be, to the authors’ knowledge, no known classifying space for the set of all polarizations on a given symplectic manifold, nor even a criterion for their existence.

Another drawback, perhaps the most conspicuous one, is that the space of quantizable observables is rather small; e.g. for $\Gamma = \mathbb{R}^{2n}$ and polarization given by the coordinates $q_1, \dots, q_n$, the space Obs consists of functions at most linear in p, thus excluding, for instance, the kinetic energy $\frac12 p^2$. The extension of the quantization map $f \mapsto Q_f$ by means of the BKS pairing,^u described in Sec. 3.5 (which gives the correct answer $Q_f = -\frac{h^2}{8\pi^2}\Delta$ for the kinetic energy $f(p, q) = \frac12 p^2$), is not entirely satisfactory, for the following reasons. First of all, as we have already noted in Sec. 3.5, it is currently not known under what conditions the pairing extends from compactly supported sections to the whole product $H_P \times H_{P_t}$ of the corresponding quantum Hilbert spaces; and even if the pairing so extends, it is not known under what conditions the derivative at t = 0 in (3.58) exists. (And neither is it even known under what conditions the polarizations $P$ and $P_t$ are such that the pairing can be defined in the first place, e.g. transversal etc.) Consequently, it is also unknown for which functions f the quantum operator $Q_f$ is defined at all. For instance, using the formulas (3.59) and (3.60), Bao and Zhu [25] showed that for $\Gamma = \mathbb{R}^2$ (with the usual symplectic form) and $f(p, q) = p^m$, $Q_f$ is undefined as soon as m ≥ 3 (the integral in (3.60) then diverges as t → 0). Second, even when $Q_f$ is defined all right, then, as we have also already noted in Sec. 3.5, owing to the highly nonexplicit nature of the formula (3.58) it is not even possible to tell beforehand whether this operator is at least formally symmetric, not to say self-adjoint. Third, even if the $Q_f$ are well defined and self-adjoint, their properties are not entirely satisfactory: for instance, in another paper Bao and Zhu [24] showed that for $\Gamma = \mathbb{R}^2$ and $f(p, q) = p^2 g(q)$, one can again compute from (3.59) and (3.60) that (upon identifying H with $L^2(\mathbb{R}, dq)$ by means of a suitable reference section)

$$Q_f\psi = \Big(\frac{ih}{2\pi}\Big)^2\Big[\,g\psi'' + g'\psi' + \Big(\frac{g''}{4} - \frac{g'^2}{16g}\Big)\psi\Big], \tag{3.66}$$

so that, in particular, the dependence $f \mapsto Q_f$ is not even linear(!). Finally, from the point of view of our axioms (Q1)–(Q5) set up in the beginning, the most serious drawback of (3.58) is that the operators $Q_f$ so defined do not, in general, satisfy the commutator condition (Q3)!

^u Sometimes this is also called the method of infinitesimal pairing.

Remark 13. For functions f such that $X_f$ leaves $P + \bar P$ invariant, it was shown by Tuynman that $Q_f$ can be identified with a certain Toeplitz-type operator; see [258].
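To make the non-linearity noted after (3.66) concrete, here is a small worked check. It assumes that the zeroth-order coefficient in (3.66) reads $c(g) = g''/4 - g'^2/(16g)$, and the test functions $g_1 = 1$, $g_2 = q^2$ are purely illustrative choices.

```latex
\begin{align*}
c(g) &:= \frac{g''}{4} - \frac{g'^{2}}{16\,g}, \\
c(1) &= 0, \qquad
c(q^{2}) = \frac{2}{4} - \frac{4q^{2}}{16\,q^{2}} = \frac14, \\
c(1+q^{2}) &= \frac12 - \frac{q^{2}}{4(1+q^{2})}
            = \frac{2+q^{2}}{4(1+q^{2})}
            \;\neq\; \frac14 = c(1) + c(q^{2}).
\end{align*}
```

Under this reading, $Q_{p^2(1+q^2)} \neq Q_{p^2} + Q_{p^2q^2}$, even though the second- and first-order parts of (3.66) are linear in g; the failure of additivity comes entirely from the term $g'^2/(16g)$.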
For some further comments on why the standard theory of geometric quantization may seem unsatisfactory, see Blattner [40, p. 42], or Ali [3]. Finally, we should mention that in the case when Γ is a coadjoint orbit of a Lie group G, which operates on Γ by ω-preserving diffeomorphisms, the geometric quantization is intimately related to the representation theory of G (the orbit method); see Kirillov [159, Chap. 14], and Vogan [275] for more information. For further details on geometric quantization, the reader is advised to consult the extensive bibliography on the subject. In our exposition in Secs. 3.2–3.6 we have closely followed the beautiful CWI syllabus of Tuynman [257], as well as the classics by Woodhouse [281] (see also the new edition [282]) and Sniatycki [246]; the books by Guillemin and Sternberg [133] and Hurt [144] are oriented slightly more towards the theory of Fourier integral operators and the representation theory, respectively. Other worthwhile sources include the papers by Sniatycki [245], Blattner and Rawnsley [41, 42], Czyz [71], Gawedzki [104], Hess [139], Rawnsley and Robinson [227], Robinson [233], Blattner [37–39] , Tuynman [261, 262, 258, 259],
Rawnsley [225], Kostant [167–169], and Souriau [248], the surveys by Blattner [40], Ali [3], Echeverria-Enriquez et al. [83], or Kirillov [160], and the recent books by Bates and Weinstein [30] and Puta [221], as well as the older one by Simms and Woodhouse [243].

4. Deformation Quantization

Deformation quantization tries to resolve the difficulties of geometric quantization by relaxing the axiom (Q3) to

$$[Q_f, Q_g] = -\frac{ih}{2\pi}\,Q_{\{f,g\}} + O(h^2). \tag{4.1}$$

Motivated by the asymptotic expansion for the Moyal product (1.8), one can try to produce this by first constructing a formal associative but noncommutative product $*_h$ (a star product), depending on h, such that, in a suitable sense,

$$f *_h g = \sum_{j=0}^{\infty} h^j C_j(f, g) \tag{4.2}$$

as h → 0, where the bilinear operators $C_j$ satisfy

$$C_0(f, g) = fg, \qquad C_1(f, g) - C_1(g, f) = -\frac{i}{2\pi}\{f, g\}, \tag{4.3}$$

$$C_j(f, 1) = C_j(1, f) = 0, \qquad \forall\, j \ge 1. \tag{4.4}$$
Here “formal” means that $f *_h g$ is not required to actually exist for any given value of h, but we only require the coefficients $C_j$ : Obs × Obs → Obs to be well-defined mappings for some function space Obs on Γ and satisfy the relations which make $*_h$ formally associative. As a second step, one looks for an analogue of the Weyl calculus, i.e. one wants the product $*_h$ to be a genuine (not only formal) bilinear mapping from Obs × Obs into Obs and seeks a linear assignment to each f ∈ Obs of an operator $Q_f$ on a (fixed) separable Hilbert space H, self-adjoint if f is real-valued, such that^v

$$Q_f Q_g = Q_{f *_h g}. \tag{4.5}$$
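Two consequences of (4.2)–(4.5) are worth spelling out. The following is a short sketch, assuming only that $f \mapsto Q_f$ is linear and that the expansions may be compared coefficient by coefficient in h; the third function u is an arbitrary element of Obs introduced here for illustration.

```latex
% (a) (4.5) together with (4.3) reproduces the relaxed commutator condition (4.1):
\begin{align*}
[Q_f, Q_g] &= Q_{f *_h g \,-\, g *_h f}
            = \sum_{j \ge 0} h^{j}\, Q_{C_j(f,g) - C_j(g,f)} \\
           &= 0 + h\, Q_{-\frac{i}{2\pi}\{f,g\}} + O(h^{2})
            = -\frac{ih}{2\pi}\, Q_{\{f,g\}} + O(h^{2}),
\end{align*}
% because C_0(f,g) = fg = C_0(g,f).
%
% (b) Associativity (f *_h g) *_h u = f *_h (g *_h u), order by order in h:
\begin{equation*}
\sum_{j+k=n} C_j\bigl(C_k(f,g),\,u\bigr) \;=\; \sum_{j+k=n} C_j\bigl(f,\,C_k(g,u)\bigr),
\qquad n = 0, 1, 2, \dots
\end{equation*}
% n = 1:  C_1(fg,u) + C_1(f,g)\,u \;=\; C_1(f,gu) + f\,C_1(g,u).
```

For n = 0 condition (b) is just the associativity of the pointwise product; for n = 1 it says that $C_1$ is a Hochschild 2-cocycle, which is one way to see why Hochschild cohomology governs the construction of formal star products.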
Further, we also want the construction to satisfy the functoriality (=covariance) condition (Q4), which means that the star product should commute with any symplectic diffeomorphism φ, (f ◦ φ) ∗h (g ◦ φ) = (f ∗h g) ◦ φ.
(4.6)
Finally, for $\Gamma = \mathbb{R}^{2n}$ the star product should reduce to, or at least be in some sense equivalent to, the Moyal product.

^v This is the condition which implies that $*_h$ must be associative (since composition of operators is).

The first step above is the subject of formal deformation quantization, which was introduced by Bayen, Flato, Fronsdal, Lichnerowicz and Sternheimer [31]. Namely, one considers the ring $A = C^\infty(\Gamma)[[h]]$ of all formal power series in h with $C^\infty(\Gamma)$ coefficients, and seeks an associative $\mathbb{C}[[h]]$-linear mapping ∗ : A × A → A such that (4.2), (4.3) and (4.4) hold. This is a purely algebraic problem which had been solved by Gerstenhaber [108], who showed that the only obstructions to constructing ∗ are certain Hochschild cohomology classes $c_n \in H^3(A, A)$ (the construction is possible if and only if all $c_n$ vanish). Later Dewilde and Lecomte [77] showed that a formal star product exists on any symplectic manifold (thus the cohomological obstructions in fact never occur). More geometric constructions were subsequently given by Fedosov [93] (see also his book [94]) and Omori, Maeda and Yoshioka [206], but the question remained open whether the star product exists also for any Poisson manifold (i.e. for Poisson brackets given locally by $\{f, g\} = \omega^{ij}(\partial_i f \cdot \partial_j g - \partial_j f \cdot \partial_i g)$ where the 2-form ω is allowed to be degenerate). This question was finally settled in the affirmative by Kontsevich [166] on the basis of his “formality conjecture”. Yet another approach to formal deformation quantization on a symplectic manifold can be found in Karasev and Maslov [157]; star products with some additional properties (admitting a formal trace) are discussed in Connes, Flato and Sternheimer [68] and Flato and Sternheimer [99], and classification results are also available [36, 76, 196].

A formal star product is called local if the coefficients $C_j$ are differential operators. If the manifold Γ has a complex structure (for instance, if Γ is Kähler), the star product is said to admit separation of variables^w if f ∗ g = fg (i.e. $C_j(f, g) = 0$ ∀ j ≥ 1) whenever f is holomorphic or g is anti-holomorphic. See Karabegov [147, 150] for a systematic treatment of these matters.

The second step,^x i.e. associating the Hilbert space operators $Q_f$ to each f, is more technical. In the first place, this requires that $f *_h g$ actually exist as a function on Γ for some (arbitrarily small) values of h.
^w Or to be of Wick type; anti-Wick type is similarly obtained upon replacing f ∗ g by g ∗ f.

^x This is what we might call analytic deformation quantization.

Even this is frequently not easy to verify for the formal star products discussed above. The usual approach is therefore, in fact, from the opposite direction: namely, one starts with some geometric construction of the operators $Q_f$, and then checks that the operation ∗ defined by (4.5) is a star product, i.e. satisfies (4.2), (4.3) and (4.4). In other words, one looks for an assignment $f \mapsto Q_f$, depending on the Planck parameter h, of operators $Q_f$ on a separable Hilbert space H to functions $f \in C^\infty(\Gamma)$, such that as h → 0, there is an asymptotic expansion

$$Q_f^{(h)} Q_g^{(h)} = \sum_{j=0}^{\infty} h^j\, Q^{(h)}_{C_j(f,g)} \tag{4.7}$$

for certain bilinear operators $C_j : C^\infty(\Gamma) \times C^\infty(\Gamma) \to C^\infty(\Gamma)$. Here (4.7) should be interpreted either in the weak sense, as

$$\Big\langle a,\ \Big(Q_f^{(h)} Q_g^{(h)} - \sum_{j=0}^{N} h^j\, Q^{(h)}_{C_j(f,g)}\Big)\, b \Big\rangle = O(h^{N+1}), \qquad \forall\, a, b \in H,\ \forall\, N = 0, 1, 2, \dots,$$

where $\langle\cdot,\cdot\rangle$ stands for the inner product in H, or in the sense of norms

$$\Big\| Q_f^{(h)} Q_g^{(h)} - \sum_{j=0}^{N} h^j\, Q^{(h)}_{C_j(f,g)} \Big\| = O(h^{N+1}), \qquad \forall\, N = 0, 1, 2, \dots,$$
where $\|\cdot\|$ is the operator norm on H. Further, $Q_f$ should satisfy the covariance condition (Q4), should (in some sense) reduce to the Weyl operators $W_f$ for $\Gamma = \mathbb{R}^{2n}$, and preferably, the $C_j$ should be local (i.e. differential) operators.

For Kähler manifolds, these two problems are solved by the Berezin and Berezin–Toeplitz quantizations, respectively, which will be described in the next section. For compact symplectic manifolds, they are addressed by the asymptotic operator representations of Fedosov [92; 94, Chap. 7], improving upon an earlier idea of Karasev and Maslov [156]. For a completely general symplectic (or even Poisson) manifold, analogous constructions seem to be so far unknown. An interesting method for constructing non-formal star products on general symplectic manifolds, using integration over certain two-dimensional surfaces (membranes) in the complexification $\Gamma^C \simeq \Gamma \times \Gamma$ of the phase space Γ, has recently been proposed by Karasev [154].

A systematic approach to such constructions^y has been pioneered by Rieffel [229–231]. He defines a strict deformation quantization as a dense ∗-subalgebra A of $C^\infty(\Gamma)$ equipped, for each sufficiently small positive h, with a norm $\|\cdot\|_h$, an involution $^{*_h}$ and an associative product $\times_h$, continuous with respect to $\|\cdot\|_h$, such that
• $h \mapsto A_h$ := the completion of $(A, {}^{*_h}, \times_h)$ with respect to $\|\cdot\|_h$, is a continuous field of C*-algebras;
• $^{*_0}$, $\times_0$ and $\|\cdot\|_0$ are the ordinary complex conjugation, pointwise product and supremum norm on $C^\infty(\Gamma)$, respectively;
• $\lim_{h\to 0} \big\| (f \times_h g - g \times_h f) + \frac{ih}{2\pi}\{f, g\} \big\|_h = 0$.

Using the Gelfand–Naimark theorem, one can then represent the C*-algebras $A_h$ as Hilbert space operators, and thus eventually arrive at the desired quantization rule $f \mapsto Q_f$. (One still needs to worry about the covariance and irreducibility conditions (Q4) and (Q5), which are not directly built into Rieffel’s definition, but let us ignore these for a moment.)
^y Sometimes referred to as C*-algebraic deformation quantization (Landsman [176]).

The difficulty is that examples are scarce: all of them make use of the Fourier transform in some way and are thus limited to a setting where the latter makes sense (for instance, one can recover the Moyal product in this way). In fact, the motivation behind the definition comes from operator algebras and Connes’ non-commutative differential geometry rather than quantization [176]. A broader concept is a strict quantization [232]: it is defined as a family of ∗-morphisms $T_h$ from a dense ∗-subalgebra A of $C^\infty(\Gamma)$ into C*-algebras $A_h$, for h in some subset of $\mathbb{R}$ accumulating at 0, such that Ran $T_h$ spans $A_h$ for
each h, $A_0 = C^\infty(\Gamma)$ and $T_0$ is the inclusion map of A into $A_0$, the functions $h \mapsto \|T_h(f)\|_h$ are continuous for each f ∈ A, and

$$\|T_h(f)T_h(g) - T_h(fg)\|_h \to 0, \qquad \Big\|[T_h(f), T_h(g)] + \frac{ih}{2\pi}\,T_h(\{f, g\})\Big\|_h \to 0 \tag{4.8}$$

as h → 0, for each f, g ∈ A. (Thus the main difference from strict deformation quantization is that the product $T_h(f)T_h(g)$ is not required to be in the range of $T_h$.) Comparing the second condition with (4.1) we see that $Q_f = T_h(f)$ gives the quantization rule we wanted. (We again temporarily ignore (Q4) and (Q5).) Though this seems not to have been treated in Rieffel’s papers, it is also obvious how to modify these definitions so as to obtain the whole expansion (4.2) instead of just (4.1).

Strict quantizations are already much easier to come by, see for instance Landsman [172] for coadjoint orbits of compact connected Lie groups. However, even the notion of strict quantization is still unnecessarily restrictive: we shall see below that one can construct interesting star-products even when (4.8) is satisfied only in a much weaker sense. (The Berezin–Toeplitz quantization is a strict quantization but not strict deformation quantization; the Berezin quantization is not even a strict quantization.)

Recently, a number of advances in this “operator-algebraic” deformation quantization have come from the theory of symplectic groupoids, see Weinstein [278], Zakrzewski [283], Landsman [173, 175], and the books of Landsman [174], and Weinstein and Cannas da Silva [61].

A discussion of deformation quantization of coadjoint orbits of a Lie group, which again exhibits an intimate relationship to group representations and the Kirillov orbit method, can be found e.g. in Cahen, Gutt and Rawnsley [59], Karabegov [149], Vogan [275], Landsman [172], Bar-Moshe and Marinov [26], Lledo [179], and Fioresi and Lledo [98].

A gauge-invariant quantization method which, in the authors’ words, “synthesizes the geometric, deformation and Berezin quantization approaches”, was proposed by Fradkin and Linetsky [102] and Fradkin [101].

We remark that, in a sense, the second step in the deformation quantization is not strictly necessary: an alternate route is to cast the von Neumann formalism, interpreting $\langle \Pi(Q_f)u, u\rangle$ (where $\Pi(Q_f)$ is the spectral measure of $Q_f$) as the probability distribution of the result of measuring f in the state u, into a form involving only products of operators, and then replace the latter by the corresponding star products. Thus, for instance, instead of looking for eigenvalues of an operator $Q_f$, i.e. solving the equation $Q_f u = \lambda u$, with $\|u\| = 1$, one looks for solutions of f ∗ π = λπ, with $\pi = \bar\pi = \pi * \pi$ (π corresponds to the projection operator $\langle\cdot, u\rangle u$); or, more generally, one defines the (star-) spectrum of f as the support of the measure µ on $\mathbb{R}$ for which

$$\mathrm{Exp}(tf) = \int_{\mathbb{R}} e^{-2\pi i\lambda t/h}\, d\mu(\lambda)$$

(in the sense of distributions) where Exp(tf) is the star exponential

$$\mathrm{Exp}(tf) := \sum_{m=0}^{\infty} \frac{1}{m!}\Big(\frac{2\pi t}{ih}\Big)^{m}\, \underbrace{f * \cdots * f}_{m\ \text{times}}.$$
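As a consistency check between the two formulations (a sketch only, assuming that the star exponential series may be multiplied term by term), suppose π satisfies f ∗ π = λπ as above. Then:

```latex
\begin{align*}
\underbrace{f * \cdots * f}_{m\ \text{times}} * \,\pi &= \lambda^{m}\,\pi, \\
\mathrm{Exp}(tf) * \pi
  &= \sum_{m=0}^{\infty} \frac{1}{m!}\Bigl(\frac{2\pi t}{ih}\Bigr)^{m} \lambda^{m}\,\pi
   = e^{2\pi t\lambda/(ih)}\,\pi
   = e^{-2\pi i \lambda t/h}\,\pi .
\end{align*}
```

So a star-eigenvalue λ contributes exactly the exponential $e^{-2\pi i\lambda t/h}$ that appears under the integral defining the star-spectrum.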
See Bayen et al. [31]. In this way, some authors even perceive deformation quantization as a device for “freeing” the quantization of the “burden” of the Hilbert space.

Some other nice articles on deformation quantization are Sternheimer [252], Arnal, Cortet, Flato and Sternheimer [21], Weinstein [279], Fedosov [95], Fernandes [96], and Blattner [40]; two recent survey papers are Gutt [135], and Dito and Sternheimer [79]. See also Neumaier [198], Bordemann and Waldmann [48], Karabegov [151, 152], Duval, Gradechi and Ovsienko [82], and the above mentioned books by Fedosov [94] and Landsman [174] and papers by Rieffel [230, 232].

5. Berezin and Berezin–Toeplitz Quantization on Kähler Manifolds

Recall that a Hilbert space H whose elements are functions on a set Γ is called a reproducing kernel Hilbert space (rkhs for short) if for each x ∈ Γ, the evaluation map $\phi \mapsto \phi(x)$ is continuous on H. By the Riesz–Fischer representation theorem, this means that there exist vectors $K_x \in H$ such that

$$\phi(x) = \langle K_x, \phi\rangle, \qquad \forall\, \phi \in H.$$

The function

$$K(x, y) = \langle K_x, K_y\rangle, \qquad x, y \in \Gamma$$

is called the reproducing kernel of H. Let us assume further that the scalar product in H is in fact the $L^2$ product with respect to some measure µ on Γ. (Thus H is a subspace of $L^2(\Gamma, \mu)$.) Then any bounded linear operator A on H can be written as an integral operator,

$$A\phi(x) = \langle K_x, A\phi\rangle = \langle A^*K_x, \phi\rangle = \int_\Gamma \phi(y)\,\overline{A^*K_x(y)}\,d\mu(y) = \int_\Gamma \phi(y)\,\langle A^*K_x, K_y\rangle\,d\mu(y) = \int_\Gamma \phi(y)\,\langle K_x, AK_y\rangle\,d\mu(y),$$

with kernel $\langle K_x, AK_y\rangle$. The function

$$A(x, y) = \frac{\langle K_x, AK_y\rangle}{\langle K_x, K_y\rangle} \tag{5.1}$$

restricted to the diagonal is called the lower (or covariant) symbol $\tilde A$ of A:

$$\tilde A(x) := A(x, x) = \frac{\langle K_x, AK_x\rangle}{\langle K_x, K_x\rangle}. \tag{5.2}$$

Clearly the correspondence $A \mapsto \tilde A$ is linear, preserves conjugation (i.e. $\widetilde{A^*} = \overline{\tilde A}$) and for the identity operator I on H one has $\tilde I = 1$.
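The following is a minimal numerical sketch of the objects just introduced; it is not taken from the source. It assumes a toy rkhs in which Γ is the set of N-th roots of unity with the uniform probability measure µ, and H is spanned by the monomials 1, z, ..., z^(d−1), which are orthonormal in L²(Γ, µ) as long as d ≤ N. All names (`K_vec`, `lower_symbol`, `integral_kernel_apply`, the sizes `N` and `d`, the sample matrix `A`) are hypothetical choices made for illustration. The inner product is taken conjugate-linear in the first slot, matching φ(x) = ⟨K_x, φ⟩.

```python
import cmath

N, d = 8, 3                      # toy sizes (assumptions): N points, dim H = d <= N
pts = [cmath.exp(2j * cmath.pi * m / N) for m in range(N)]   # the set Gamma
w = 1.0 / N                      # mu gives each point mass 1/N

def K_vec(x):
    """Coefficients of K_x in the orthonormal basis e_k(z) = z**k."""
    return [(x ** k).conjugate() for k in range(d)]

def inner(a, b):
    """<a, b> on coefficient vectors, conjugate-linear in the first slot."""
    return sum(complex(ai).conjugate() * bi for ai, bi in zip(a, b))

def apply(A, c):
    """Matrix A (list of rows) applied to the coefficient vector c."""
    return [sum(A[j][k] * c[k] for k in range(d)) for j in range(d)]

def evaluate(c, x):
    """Value at the point x of the function with coefficient vector c."""
    return sum(ck * x ** k for k, ck in enumerate(c))

def adjoint(A):
    """Conjugate transpose of A."""
    return [[complex(A[k][j]).conjugate() for k in range(d)] for j in range(d)]

def lower_symbol(A, x):
    """Lower symbol <K_x, A K_x> / <K_x, K_x>, as in (5.2)."""
    Kx = K_vec(x)
    return inner(Kx, apply(A, Kx)) / inner(Kx, Kx)

def integral_kernel_apply(A, c, x):
    """(A phi)(x) computed as sum over y of phi(y) <K_x, A K_y> mu({y})."""
    Kx = K_vec(x)
    return sum(evaluate(c, y) * inner(Kx, apply(A, K_vec(y))) * w for y in pts)

# Sample data (arbitrary illustrative choices)
A = [[1, 2j, 0], [0, 1, 1], [3, 0, -1j]]
Id = [[1 if j == k else 0 for k in range(d)] for j in range(d)]
phi = [1, -2, 0.5j]
x = pts[3]
```

With these definitions, `inner(K_vec(x), phi)` agrees with `evaluate(phi, x)` (the reproducing property), `integral_kernel_apply(A, phi, x)` agrees with `evaluate(apply(A, phi), x)` (the integral-operator formula), `lower_symbol(Id, x)` equals 1, and `lower_symbol(adjoint(A), x)` is the complex conjugate of `lower_symbol(A, x)`, all up to rounding.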
For any function $f$ such that $f\mathcal H \subset L^2(\Gamma,\mu)$ (for instance, for any $f \in L^\infty(\Gamma,\mu)$) the Toeplitz operator on $\mathcal H$ is defined by $T_f(\phi) = P(f\phi)$, where $P$ is the orthogonal projection of $L^2$ onto $\mathcal H$. In other words,
\[
T_f\phi(x) = \langle K_x, f\phi\rangle = \int_\Gamma \phi(y) f(y) K(x,y)\,d\mu(y). \tag{5.3}
\]
The function $f$ is called the upper (or contravariant$^{z}$) symbol of the Toeplitz operator $T_f$. The operator connecting the upper and the lower symbol
\[
f \mapsto \tilde T_f, \qquad \tilde T_f(x) = \int_\Gamma f(y)\,\frac{|K(x,y)|^2}{K(x,x)}\,d\mu(y) =: Bf(x), \tag{5.4}
\]
is called the Berezin transform. (It is defined only at points $x$ where $K(x,x) \neq 0$.)

In general, an operator $A$ need not be uniquely determined by its lower symbol $\tilde A$; however, this is always the case if $\Gamma$ is a complex manifold and the elements of $\mathcal H$ are holomorphic functions. (This is a consequence of the fact that $A(x,y)$ is then a meromorphic function of the variables $y$ and $\bar x$, hence also of $u = y + \bar x$ and $v = i(y - \bar x)$, and thus is uniquely determined by its restriction to the real axes $u, v \in \mathbb R^n$, i.e. to $x = y$.) In that case the correspondence $A \leftrightarrow \tilde A$ is a bijection from the space $B(\mathcal H)$ of all bounded linear operators on $\mathcal H$ onto a certain subspace $\mathcal A_{\mathcal H} \subset C^\omega(\Gamma)$ of real-analytic functions on $\Gamma$, and one can therefore transfer the operator multiplication in $B(\mathcal H)$ to a non-commutative and associative product $*_{\mathcal H}$ on $\mathcal A_{\mathcal H}$. Specifically, one has
\[
(f *_{\mathcal H} g)(y) = \int_\Gamma f(y,x)\,g(x,y)\,\frac{|K(x,y)|^2}{K(y,y)}\,d\mu(x), \qquad f, g \in \mathcal A_{\mathcal H}, \tag{5.5}
\]
where $f(x,y)$, $g(x,y)$ are functions on $\Gamma \times \Gamma$, holomorphic in $x$ and $\bar y$, such that $f(x,x) = f(x)$ and $g(x,x) = g(x)$ (cf. (5.1) and (5.2)).

$^{z}$ The adjectives upper and lower seem preferable to the more commonly used contravariant and covariant, as the latter have quite different meanings in differential geometry. The terms active and passive are also used.

In particular, these considerations can be applied when $\mathcal H$ is the Bergman space $A^2(\Gamma,\mu)$ of all holomorphic functions in the Lebesgue space $L^2(\Gamma,\mu)$ on a complex manifold $\Gamma$ equipped with a measure $\mu$ such that $A^2(\Gamma,\mu) \neq \{0\}$. Suppose now that we have in fact a family $\mu_h$ of such measures, indexed by a small real parameter $h > 0$. (It suffices that $h$, the Planck constant, ranges over some subset of $\mathbb R_+$ having 0 as an accumulation point.) Then one gets a family of Hilbert spaces $\mathcal H_h = A^2(\Gamma,\mu_h)$ and of the corresponding products $*_{\mathcal H_h} =: *_h$ on the spaces $\mathcal A_{\mathcal H_h} =: \mathcal A_h$. Berezin's idea (phrased in today's terms) was to choose the measures $\mu_h$ in such a way that these products $*_h$ yield a star-product. More specifically, let $(\mathcal A, *)$ be the direct sum of all algebras $(\mathcal A_h, *_h)$, and let $\tilde{\mathcal A}$ be a linear subset of $\mathcal A$ such that each $f = \{f_h(x)\}_h \in \tilde{\mathcal A}$ has an asymptotic expansion
\[
f_h(x) = \sum_{j=0}^{\infty} h^j f_j(x) \qquad \text{as } h \to 0 \tag{5.6}
\]
with real-analytic functions $f_j(x)$ on $\Gamma$. We will say that $\tilde{\mathcal A}$ is total if for any $N > 0$, $x \in \Gamma$ and $F \in C^\omega(\Gamma)[[h]]$ there exists $f \in \tilde{\mathcal A}$ whose asymptotic expansion (5.6) coincides with $F(x)$ modulo $O(h^N)$. Suppose that we can show that there exists a total set $\tilde{\mathcal A} \subset \mathcal A$ such that for any $f, g \in \tilde{\mathcal A}$, one has $f * g \in \tilde{\mathcal A}$ and
\[
(f * g)_h(x) = \sum_{i,j,k \ge 0} C_k(f_i, g_j)(x)\, h^{i+j+k} \qquad \text{as } h \to 0, \tag{5.7}
\]
where $C_k : C^\omega(\Gamma) \times C^\omega(\Gamma) \to C^\omega(\Gamma)$ are some bilinear differential operators such that
\[
C_0(\phi,\psi) = \phi\psi, \qquad C_1(\phi,\psi) - C_1(\psi,\phi) = -\frac{i}{2\pi}\{\phi,\psi\}. \tag{5.8}
\]
Then the recipe
\[
\Big(\sum_{i \ge 0} f_i h^i\Big) * \Big(\sum_{j \ge 0} g_j h^j\Big) := \sum_{i,j,k \ge 0} C_k(f_i, g_j)\, h^{i+j+k} \tag{5.9}
\]
gives a star-product on $C^\infty(\Gamma)[[h]]$ discussed in the preceding section. Moreover, this time it is not just a formal star product, since for functions in the total set $\tilde{\mathcal A}$ it really exists as an element of $C^\infty(\Gamma)$, and, in fact, for each $h$ we can pass from $\mathcal A_h$ back to $B(\mathcal H_h)$ and thus represent $f_h(x)$ as an operator $\operatorname{Op}^{(h)} f$ on the Hilbert space $\mathcal H_h$. If we can further find a linear and conjugation-preserving “lifting”
\[
f \mapsto Lf \tag{5.10}
\]
from $C^\infty(\Gamma)$ (or a large subspace thereof) into $\tilde{\mathcal A}$ such that $(L\phi)_0 = \phi$, then the mapping
\[
\phi \mapsto \operatorname{Op}^{(h)}(L\phi) =: Q_\phi
\]
will be the desired quantization rule, provided we can take care of the axioms (Q4) (functoriality) and (Q5) (the case of $\mathbb R^{2n} \simeq \mathbb C^n$). (It is easy to see that for real-valued $\phi$ the operators $\operatorname{Op}^{(h)}(L\phi)$ are self-adjoint.)

To see how to find measures $\mu_h$ satisfying (5.7) and (5.8), consider first the case when there is a group $G$ acting on $\Gamma$ by biholomorphic transformations preserving the symplectic form $\omega$. In accordance with our axiom (Q4), we then want the product $*$ to be $G$-invariant, i.e. to satisfy (4.6). An examination of (5.5) shows that for two Bergman spaces $\mathcal H = A^2(\Gamma,\mu)$ and $\mathcal H' = A^2(\Gamma,\mu')$, the products $*_{\mathcal H}$ and $*_{\mathcal H'}$ coincide if and only if
\[
\frac{|K(x,y)|^2}{K(y,y)}\,d\mu(x) = \frac{|K'(x,y)|^2}{K'(y,y)}\,d\mu'(x). \tag{5.11}
\]
In particular, $d\mu'/d\mu$ has to be a squared modulus of an analytic function; conversely, if $d\mu' = |F|^2\,d\mu$ with holomorphic $F$, then one can easily check that $K(x,y) = F(x)\overline{F(y)}K'(x,y)$, and hence (5.11) holds. Thus the requirement that $*_{\mathcal H}$ be $G$-invariant means that there exist analytic functions $\phi_g$, $g \in G$, such that $d\mu(g(x)) = |\phi_g(x)|^2\,d\mu(x)$. Assuming now that $\mu$ is absolutely continuous with respect to the ($G$-invariant) measure $\nu = \bigwedge^n \omega$ on $\Gamma$, $d\mu(x) = w(x)\,d\nu(x)$, the last condition means that $w(g(x)) = w(x)|\phi_g(x)|^2$. Hence the form $\partial\bar\partial \log w$ is $G$-invariant. But the simplest examples of $G$-invariant forms (and if $G$ is sufficiently “ample”, the only ones) are clearly the constant multiples of the form $\omega$. Thus if $\omega$ lies in the range of $\partial\bar\partial$, i.e. if $\omega$ is not only symplectic but Kähler, we are led to take
\[
d\mu_h(x) = e^{-\alpha\Phi(x)}\,d\nu(x) \tag{5.12}
\]
where $\alpha = \alpha(h)$ depends only on $h$ and $\Phi$ is a Kähler potential for the form $\omega$ (i.e. $\omega = \partial\bar\partial\Phi$).

In his papers [33], Berezin showed that for $\Gamma = \mathbb C^n$ with the standard Kähler form $\omega = i\sum_j dz_j \wedge d\bar z_j$, as well as for $(\Gamma,\omega)$ a bounded symmetric domain with the invariant metric, choosing $\mu_h$ as in (5.12) with $\alpha = 1/h$ indeed yields an (invariant) product $*$ satisfying (5.7) and (5.8), and hence one obtains a star product. Berezin did not consider the “lifting” (5.10) (in fact, he viewed his whole procedure as a means of freeing the quantum mechanics from the Hilbert space!), but he established an asymptotic formula for the Berezin transform $B = B_h$ in (5.4) as $h \to 0$ from which it follows that one can take as the lifting $Lf$ of $f \in C^\infty(\Gamma)$ the Toeplitz operators $T_f = T_f^{(h)}$ given by (5.3). Finally, in the case of $\Gamma = \mathbb C^n \simeq \mathbb R^{2n}$ one obtains for $T_{\operatorname{Re} z_j}$ and $T_{\operatorname{Im} z_j}$ operators which can be shown to be unitarily equivalent to the Schrödinger representation (1.1). Thus we indeed obtain the desired quantization rule.

For a long time, the applicability of Berezin's procedure remained confined essentially to the above two examples, in other words, to Hermitian symmetric spaces. The reason was that it is not so easy to prove the formulas (5.7) and (5.8) for a general Kähler manifold (with the measures given by (5.12)). Doing this is tantamount to obtaining the asymptotics (as $h \to 0$) of the Berezin transform (5.4), which in turn depend on the asymptotics of the reproducing kernels $K_h(x,y)$. For $\mathbb C^n$ and bounded symmetric domains, these kernels can be computed explicitly, and turn out to be given by
\[
K_\alpha(x,y) = c(\alpha)\,e^{\alpha\Phi(x,y)}, \tag{5.13}
\]
where $c(\alpha)$ is a polynomial in $\alpha$ and $\Phi(x,y)$ is a function analytic in $x, \bar y$ which coincides with the potential $\Phi(x)$ for $x = y$. It follows that
\[
B_\alpha f(x) = c(\alpha) \int_\Gamma f(y)\,e^{-\alpha S(x,y)}\,dy
\]
where $S(x,y) = \Phi(x,x) + \Phi(y,y) - \Phi(x,y) - \Phi(y,x)$, and one can apply the standard Laplace (= stationary phase, WJKB) method to get the asymptotics (5.4).$^{aa}$

Thus what we need is an analog of the formula (5.13) for a general Kähler manifold.$^{bb}$ The correct substitute turns out to be
\[
K_\alpha(x,y) = e^{\alpha\Phi(x,y)} \sum_{j \ge 0} b_j(x,y)\,\alpha^{n-j},
\]
where $n$ is the dimension and $b_j(x,y)$ are suitable coefficient functions. This was first established by Peetre and the second author for $(\Gamma,\omega)$ the annulus in $\mathbb C$ with the Poincaré metric and $x = y$ [84], and then extended, in turn, to all planar domains with the Poincaré metric [85], to some Reinhardt domains in $\mathbb C^2$ with a natural rotation-invariant form $\omega$ [86], and finally to all smoothly bounded strictly pseudoconvex domains in $\mathbb C^n$ with Kähler form $\omega$ whose potential $\Phi$ behaves like a power of $\operatorname{dist}(\cdot,\partial\Gamma)$ near the boundary [87, 89].

So far we have tacitly assumed that the potential $\Phi$ is a globally defined function on $\Gamma$. We hasten to remark that almost nothing changes if $\Phi$ exists only locally (which it always does, in view of the Kählerness of $\omega$); the only change is that instead of functions one has to consider sections of a certain holomorphic Hermitian line bundle, whose Hermitian metric in the fiber is locally given by $e^{-\alpha\Phi(x)}$, and for this bundle to exist certain cohomology integrality conditions (identical to the prequantization conditions in the geometric quantization) have to be satisfied. For a more detailed discussion of reproducing kernels and of the upper and lower symbols of operators in the line (or even vector) bundle setting, see Pasternak-Winiarski [211], Pasternak-Winiarski and Wojcieszynski [212], and Peetre [215].

We also remark that Berezin quantization of cotangent bundles (i.e. $\Gamma = T^*Q$ with the standard symplectic form $\omega$) was announced by Šereševskii [239], who however was able to quantize only functions polynomial in the moment variables $p$.

In the Berezin quantization, the formula (4.1) is satisfied only in the following weak sense,
\[
\Big\langle K_x^{(h)}, \Big( [Q_\phi^{(h)}, Q_\psi^{(h)}] + \frac{ih}{2\pi}\, Q^{(h)}_{\{\phi,\psi\}} \Big) K_y^{(h)} \Big\rangle = O(h^2), \qquad \forall\, x, y \in \Gamma,\ \forall\, \phi, \psi \in C^\infty(\Gamma).
\]
(We write $Q_\phi^{(h)}$ instead of $Q_\phi$ etc. in order to make clear the dependence on $h$.) A natural question is whether one can strengthen this to hold in the operator norm.

$^{aa}$ The function $S(x,y)$ appeared for the first time in the paper of Calabi [60] on imbeddings of Kähler manifolds into $\mathbb C^n$, under the name of diastatic function.

$^{bb}$ It seems that the validity of the exact formula (5.13) is probably limited to Hermitian symmetric spaces; at least, no other examples are known to this day.
More specifically, using the lifting $L : f \mapsto T_f^{(h)}$ given by the Toeplitz operators, one would like to replace (5.7) by
\[
\Big\| T_f^{(h)} T_g^{(h)} - \sum_{j=0}^{N} h^j\, T^{(h)}_{C_j(f,g)} \Big\|_{B(\mathcal H_h)} = O(h^{N+1}) \tag{5.14}
\]
for all $N > 0$, for some bilinear differential operators $C_j$ satisfying (5.8). This is called the Berezin–Toeplitz (or Wick) quantization. In the language of the preceding section, Berezin–Toeplitz quantization (unlike Berezin quantization) is an example of a strict quantization in the sense of Rieffel. (Here and throughout the rest of this section, the Toeplitz operators are still taken with respect to the measures (5.12) with $\alpha = 1/h$.)

Curiously enough, (5.14) was first established not for $\Gamma = \mathbb C^n$ with the Euclidean metric, but for the unit disc and the Poincaré metric; see Klimek and Lesniewski [162]. The same authors subsequently extended these results to any plane domain using uniformization [163], and to bounded symmetric domains with Borthwick and Upmeier [51]. (Supersymmetric generalizations also exist, see [52].) The case of $\mathbb C^n$ was treated later by Coburn [66]. For compact Kähler manifolds (with holomorphic sections of line bundles in place of holomorphic functions), a very elegant treatment was given by Bordemann, Meinrenken and Schlichenmaier [47] using the theory of generalized Toeplitz operators of Boutet de Monvel and Guillemin [55]; see also Schlichenmaier [236, 237], Karabegov and Schlichenmaier [153], Guillemin [131], Zelditch [284] and Catlin [62]. The same approach also works for smoothly bounded strictly pseudoconvex domains in $\mathbb C^n$ with Kähler forms $\omega$ whose potential behaves nicely at the boundary, see [89], as well as for $\Gamma = \mathbb C^n$ with the standard (= Euclidean) Kähler form [49]. For some generalizations to the non-Kähler case see Borthwick and Uribe [53].

We remark that the star products (5.9) determined by the $C_j$ in (5.14) and in (5.7) are not the same; they are, however, equivalent, in the following sense. If one views the Berezin transform (5.4) formally as a power series in $h$ with differential operators on $\Gamma$ as coefficients, then
\[
B_h(f *_{BT} g) = (B_h f) *_B (B_h g),
\]
where $*_B$ and $*_{BT}$ stand for the star products (B = Berezin, BT = Berezin–Toeplitz) coming from (5.7) and (5.14), respectively. In the terminology of [147], the two products are duals of each other. See the last page in [89] for the details. The Berezin–Toeplitz star product $*_{BT}$ is usually called Wick, and the Berezin star product $*_B$ anti-Wick. (For $\Gamma = \mathbb C^n \simeq \mathbb R^{2n}$, they are further related to the Moyal–Weyl product $*_{MW}$ from Sec. 1 by $B_h^{1/2}(f *_{MW} g) = B_h^{1/2} f *_B B_h^{1/2} g$, or $B_h^{1/2}(f *_{BT} g) = B_h^{1/2} f *_{MW} B_h^{1/2} g$, where $B_h^{1/2} = e^{h\Delta/2}$ is the square root of $B_h = e^{h\Delta}$.)

Berezin's ideas were initially developed further only in the context of symmetric (homogeneous) spaces, i.e. in the presence of a transitive action of a Lie group.
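For $\Gamma = \mathbb C$ with $d\mu_h(y) = (\pi h)^{-1}e^{-|y|^2/h}\,dy$ and $K_h(x,y) = e^{x\bar y/h}$, the Berezin transform (5.4) reduces to the Gaussian average $B_hf(x) = (\pi h)^{-1}\int_{\mathbb C} f(y)\,e^{-|x-y|^2/h}\,dy$. The sketch below evaluates this integral by a simple polar quadrature around $x$; the function name `berezin_transform` and the grid sizes are assumptions chosen for illustration. For $f(y) = |y|^2$ the exact value is $|x|^2 + h$, which agrees with $e^{h\Delta}f$ provided $\Delta$ is normalized as $\partial^2/\partial z\,\partial\bar z$ (an assumption about the normalization, which is not spelled out in this section).

```python
import cmath
import math


def berezin_transform(f, x, h, n_r=2000, n_theta=64):
    """Approximate (pi*h)^(-1) * integral of f(y) exp(-|x-y|^2/h) dy over C."""
    r_max = 10.0 * math.sqrt(h)          # the Gaussian is negligible beyond this radius
    dr = r_max / n_r
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr               # midpoint rule in the radial variable
        # average of f over the circle of radius r centred at x
        ring = sum(f(x + r * cmath.exp(2j * math.pi * k / n_theta))
                   for k in range(n_theta)) / n_theta
        # radial weight: (2*pi*r / (pi*h)) * exp(-r^2/h) dr
        total += ring * math.exp(-r * r / h) * (2.0 * r / h) * dr
    return total
```

Holomorphic polynomials are reproduced unchanged by this average (mean value property), whereas $|y|^2$ picks up the correction $h$.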
The coefficients $C_j(\cdot,\cdot)$ are then closely related to the invariant differential operators on $\Gamma$; see Moreno [187], Moreno and Ortega-Navarro [188], Arnal, Cahen and Gutt [20], Cahen, Gutt and Rawnsley [59] and Bordemann et al. [45] for some interesting results on star products in this context. Some connections with Rieffel's $C^*$-algebraic theory can be found in Radulescu [222], Landsman [176] and Guentner [129]. Formal Berezin and Berezin–Toeplitz star products on arbitrary Kähler manifolds were studied by Karabegov [147, 148], Karabegov and Schlichenmaier [153] and Reshetikhin and Takhtajan [228] (cf. also Cornalba and Taylor [70] for a formal expansion of the Bergman kernel); see also Hawkins [138].

Evidently, a central topic in these developments is the dependence of the reproducing kernel $K_\mu(x,y)$ of a Bergman space $A^2(\Gamma,\mu)$ on the measure $\mu$. This dependence is still far from being well understood. For instance, for $(\Gamma,\omega)$ a Hermitian symmetric space (or $\mathbb C^n$) with the invariant metric and the corresponding Kähler form $\omega$, $\Phi$ a potential for $\omega$, and $\nu = \bigwedge^n \omega$ the Liouville (invariant) measure ($n = \dim_{\mathbb C}\Gamma$), the weight function $w(x) = e^{-\alpha\Phi(x)}$ (with $\alpha \gg 0$) has the property that
\[
K_{w\,d\nu}(x,x) = \frac{\text{const.}}{w(x)}.
\]
The existence of similar weights $w$ on a general Kähler manifold is an open problem. See Odzijewicz [201, p. 584], for some remarks and physical motivation for studying equations of this type. Some results on the dependence $\mu \mapsto K_\mu$ are in Pasternak-Winiarski [213].

6. Prime Quantization

The most straightforward way of extending (1.1) to more general functions on $\mathbb R^{2n}$ is to specify a choice of ordering. For instance, for a polynomial
\[
f(p,q) = \sum_{m,k} a_{mk}\, q^m p^k \tag{6.1}
\]
one can declare that $Q_f = f(Q_p, Q_q)$ with the $Q_q$ ordered to the left of the $Q_p$:
\[
Q(f) = \sum_{m,k} a_{mk}\, Q_q^m Q_p^k. \tag{6.2}
\]
(Here $m, k$ are multi-indices and we ignore the subtleties concerning the domains of definition etc. We will also sometimes write $Q(f)$ instead of $Q_f$, for typesetting reasons.) Extending this (formally) from polynomials to entire functions, in particular to the exponentials $e^{2\pi i(p\cdot\xi + q\cdot\eta)}$, we get$^{cc}$
\[
Q(e^{2\pi i(p\cdot\xi + q\cdot\eta)}) = e^{2\pi i\eta\cdot Q(q)}\, e^{2\pi i\xi\cdot Q(p)}.
\]

$^{cc}$ Here we are using the real scalar product notation $p\cdot\xi = p_1\xi_1 + \cdots + p_n\xi_n$.
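The ordering in (6.2) matters because the two exponential factors do not commute. As a finite-dimensional stand-in (an assumption made purely for illustration, not a construction from the text), one can replace the exponentials of $Q_q$ and $Q_p$ by the $N \times N$ “clock” and “shift” matrices $V$ and $U$, which satisfy the Weyl-type relation $VU = \omega\,UV$ with $\omega = e^{2\pi i/N}$. The helper names below are hypothetical.

```python
import cmath


def shift(n):
    """Shift matrix U with U e_k = e_{(k+1) mod n}."""
    return [[1 if row == (col + 1) % n else 0 for col in range(n)]
            for row in range(n)]


def clock(n):
    """Clock matrix V with V e_k = omega^k e_k, omega = exp(2*pi*i/n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    return [[omega ** row if row == col else 0 for col in range(n)]
            for row in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ordering_defect(n):
    """Largest entry of V U - omega * U V; it vanishes up to rounding."""
    omega = cmath.exp(2j * cmath.pi / n)
    vu = matmul(clock(n), shift(n))
    uv = matmul(shift(n), clock(n))
    return max(abs(vu[i][j] - omega * uv[i][j])
               for i in range(n) for j in range(n))
```

Swapping the two factors therefore changes the product by the phase $\omega$, the discrete counterpart of the phase that distinguishes the different ordering prescriptions for the exponentials.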
Finally, decomposing an “arbitrary” function $f(p,q)$ into exponentials via the Fourier transform, as in Sec. 1, we arrive at a quantization recipe
\[
Q_f\phi(x) = \iint f(p,x)\, e^{2\pi i(x-y)\cdot p/h}\, \phi(y)\,dp\,dy. \tag{6.3}
\]
Similarly, using instead of (6.2) the opposite choice of ordering
\[
Q\Big(\sum_{m,k} a_{mk}\, q^m p^k\Big) = \sum_{m,k} a_{mk}\, Q_p^k Q_q^m \tag{6.4}
\]
we arrive at
\[
Q_f\phi(x) = \iint f(p,y)\, e^{2\pi i(x-y)\cdot p/h}\, \phi(y)\,dp\,dy. \tag{6.5}
\]
The rules (6.5) and (6.3) are the standard Kohn–Nirenberg calculi of pseudodifferential operators, see [165, 100, Sec. 23]. A more sophisticated set of ordering rules generalizing (6.3) and (6.5) can be obtained by fixing a $t \in [0,1]$ and setting
\[
Q_f\phi(x) = \iint f(p,(1-t)x + ty)\, e^{2\pi i(x-y)\cdot p/h}\, \phi(y)\,dp\,dy. \tag{6.6}
\]
The choice $t = \frac12$ gives the Weyl calculus (1.7), which can thus be thought of as corresponding to a “symmetric” ordering of $Q_q$ and $Q_p$.

The drawback of (6.2) and (6.4) is that they need not be self-adjoint operators for real-valued symbols $f$. This can be remedied by viewing $\mathbb R^{2n}$ as $\mathbb C^n$ and making the change of coordinates $z = (q + ip)/\sqrt2$, $\bar z = (q - ip)/\sqrt2$. The operators $Q_z$ and $Q_{\bar z} = Q_z^*$ are then the annihilation and creation operators
\[
Q_z = \frac{Q_q + iQ_p}{\sqrt2}, \qquad Q_z^* = \frac{Q_q - iQ_p}{\sqrt2}.
\]
One can then again assign to a polynomial $f(z,\bar z) = \sum b_{mk}\, z^m \bar z^k$ either the operator
\[
Q_f = \sum b_{mk}\, Q(z)^m Q(z)^{*k}
\]
or the operator
\[
Q_f = \sum b_{mk}\, Q(z)^{*k} Q(z)^m
\]
which is called the Wick (or normal) and the anti-Wick (anti-normal) ordering, respectively. The corresponding Wick and anti-Wick calculi are discussed in Folland's book [100, Sec. 3.8]. The anti-Wick calculus turns out not to be so interesting, but the Wick calculus has an important reformulation if we replace the underlying Hilbert space $L^2(\mathbb R^n)$, on which the operators $Q_f$ act, by the Fock (or Segal–Bargmann) space $A^2(\mathbb C^n,\mu_h)$ of all entire functions on $\mathbb C^n$ square-integrable with respect to the Gaussian measure $d\mu_h(z) := (\pi h)^{-n} e^{-|z|^2/h}\,dz$ ($dz$ being the Lebesgue measure on $\mathbb C^n$). Namely, the Bargmann transform
\[
\beta : L^2(\mathbb R^n) \ni f \mapsto \beta f(z) := (2\pi h)^{n/4} \int_{\mathbb R^n} f(x)\, e^{2\pi x\cdot z - h\pi^2 x\cdot x - z\cdot z/2h}\,dx \in A^2(\mathbb C^n,\mu_h) \tag{6.7}
\]
is a unitary isomorphism and upon passing from $L^2(\mathbb R^n)$ to $A^2(\mathbb C^n,\mu_h)$ via $\beta$, the operators $Q_f$ become the familiar Toeplitz operators (5.3):
\[
\beta Q_f \beta^{-1} = T_f, \qquad \text{with } T_f\phi(x) := \int_{\mathbb C^n} f(y)\,\phi(y)\,K_h(x,y)\,d\mu_h(y), \tag{6.8}
\]
where $K_h(x,y) = e^{x\bar y/h}$ is the reproducing kernel for the space $A^2(\mathbb C^n,\mu_h)$. In this way, we thus recover on $\mathbb C^n$ the Berezin–Toeplitz quantization discussed in the preceding section.

Another way of writing (6.8) is
\[
T_f = \int_{\mathbb C^n} f(y)\,\Delta_y\,dy, \tag{6.9}
\]
where $\Delta_y = |k_y\rangle\langle k_y| = \langle k_y, \cdot\,\rangle k_y$ is the rank-one projection operator onto the complex line spanned by the unit vector
\[
k_y := \frac{K_h(\cdot,y)}{\|K_h(\cdot,y)\|}. \tag{6.10}
\]
This suggests looking, quite generally, for quantization rules of the form (6.9), with a set of “quantizers” $\Delta_y$ ($y \in \Gamma$) which may be thought of as reflecting the choice of ordering. This is the basis of the prime quantization method introduced in [9] (see also [219]), where it is also explained how the choice of the quantizers (hence also of the ordering) is to be justified on physical grounds. The main result of [9] is that if the quantizers $\Delta_y$ are bounded positive operators, $\Delta_y \ge 0$, on some (abstract) Hilbert space $\mathcal H$, then there exists a direct integral Hilbert space $\mathcal K = \int_\Gamma^{\oplus} \mathcal K_x\,d\nu(x)$ (see [56]), where $\mathcal K_x$ is a family of separable Hilbert spaces indexed by $x \in \Gamma$ and $\nu$ is a measure on $\Gamma$, and an isometry $\iota : \mathcal H \to \mathcal K$ of $\mathcal H$ onto a subspace of $\mathcal K$ such that

(i) $\iota\mathcal H$ is a “vector-valued” reproducing kernel Hilbert space, in the sense that for each $x \in \Gamma$ there is a bounded linear operator $E_x$ from $\iota\mathcal H$ into $\mathcal K_x$ such that for any $f = \int_\Gamma^{\oplus} f_y\,d\nu(y) \in \iota\mathcal H$, one has
\[
f_x = \int_\Gamma E_x E_y^* f_y\,d\nu(y) \qquad \forall\, x \in \Gamma. \tag{6.11}
\]

(ii) $\iota\Delta_y\iota^* = E_y^* E_y$.

The operators
\[
T_f = \int_\Gamma f(y)\,\Delta_y\,d\nu(y)
\]
thus satisfy
\[
T_f = \int_\Gamma f(y)\,\iota^* E_y^* E_y \iota\,d\nu(y). \tag{6.12}
\]
If $\mathcal K_x = \mathbb C$ for every $x \in \Gamma$, one can identify $\mathcal K$ with $L^2(\Gamma,\nu)$, $\iota$ with an inclusion map of $\mathcal H$ into $\mathcal K$, and $E_x$ with the functional $\langle K_x, \cdot\,\rangle$ for some vector $K_x \in \mathcal H$; thus (6.11) becomes
\[
f(x) = \int_\Gamma f(y)\,K(x,y)\,d\nu(y), \quad \forall\, f \in \mathcal H, \qquad \text{where } K(x,y) := \langle K_x, K_y\rangle,
\]
so $\mathcal H$ is an (ordinary) reproducing kernel Hilbert subspace of $L^2(\Gamma,\nu)$ with reproducing kernel $K(x,y)$, and (6.12) reads
\[
T_f = \int_\Gamma f(y)\,|K_y\rangle\langle K_y|\,d\nu(y),
\]
i.e. $T_f$ is the Toeplitz type operator $T_f\phi = P(f\phi)$ where $P$ is the orthogonal projection of $L^2(\Gamma,\nu)$ onto $\mathcal H$. In particular, for $\Gamma = \mathbb C^n$ and $\nu$ the Gaussian measure we recover (6.8) and (6.9).

Note that the Weyl quantization operators (1.7), transferred to $A^2(\mathbb C^n,\mu_h)$ via the Bargmann transform (6.7), can also be written in the form (6.9), namely (cf. [100, p. 141])
\[
\beta W_f \beta^{-1} = \int_{\mathbb C^n} f(y)\,s_y\,dy, \tag{6.13}
\]
where $y = q - ip$ ($(p,q) \in \mathbb R^{2n}$, $y \in \mathbb C^n$) and
\[
s_y\phi(z) = \phi(2y - z)\, e^{2\bar y\cdot(z-y)/h}
\]
is the self-adjoint unitary map of $A^2(\mathbb C^n,\mu_h)$ induced by the symmetry $z \mapsto 2y - z$ of $\mathbb C^n$. In contrast to (6.9), however, this time the quantizers $s_y$ are not positive operators.

Given $\Delta_y$, one can also consider the “dequantization” operator $T \mapsto \tilde T$,
\[
\tilde T(y) := \operatorname{Trace}(T\Delta_y), \tag{6.14}
\]
which assigns functions to operators. For the Weyl calculus, it turns out that $\tilde W_f = f$, a reflection of the fact that the mapping $f \mapsto W_f$ is a unitary map from $L^2(\mathbb R^{2n})$ onto the space of Hilbert–Schmidt operators (an observation due to Pool [218]). For the Wick calculus (6.9), $\tilde T_f$ is precisely the Berezin transform of $f$, discussed above, and the function $\tilde T$ is the lower (covariant, passive) symbol of the operator $T$ (and $f$ is the upper (contravariant, active) symbol of the Toeplitz operator $T_f$).

Using the same ideas as in the previous section, one can thus try to construct, for a general set of quantizers $\Delta_y$, a Berezin–Toeplitz type star product
\[
f *_h g = \sum_{j \ge 0} C_j(f,g)\,h^j, \qquad f, g \in C^\infty(\Gamma),
\]
by establishing an asymptotic expansion for the product of two operators of the form (6.9),
\[
T_f T_g = \sum_{j \ge 0} h^j\, T_{C_j(f,g)} \qquad \text{as } h \to 0,
\]
and, similarly, a Berezin-type star product by setting
\[
\tilde T_f *_h \tilde T_g := \widetilde{T_f T_g}.
\]
In this way we see that the formula (6.9), which at first glance might seem more like a mathematical exercise in pseudodifferential operators rather than a sensible quantization rule, effectively leads to most of the developments (at least for $\mathbb R^{2n}$) we did in the previous two sections.

In the context of $\mathbb R^{2n}$, or, more generally, of a coadjoint orbit of a Lie group, the “quantizers” and “dequantizers” above seem to have been first studied systematically by Gracia-Bondia [123]; in a more general setting, by Antoine and Ali [6]. Two recent papers on this topic, with some intriguing ideas, are Karasev and Osborn [158]. For some partial results on the Berezin–Toeplitz star-products for general quantizers, see Engliš [91].

The operators (6.6) and the corresponding “twisted product” $f \mathbin{\sharp} g$ defined by $Q_{f \sharp g} = Q_f Q_g$ were investigated by Unterberger [263] (for $t = 1/2$, see Hörmander [140]); a relativistic version, with the Weyl calculus replaced by “Klein–Gordon” and “Dirac” calculi, was developed by Unterberger [266]. The formula (6.13) for the Weyl operator makes sense, in general, on any Hermitian symmetric space $\Gamma$ in the place of $\mathbb C^n$, with $s_y$ the self-adjoint unitary isomorphisms of $A^2(\Gamma)$ induced by the geodesic symmetry around $y$; in this context, the Weyl calculus on bounded symmetric domains was studied by Upmeier [270], Unterberger and Upmeier [269], and Unterberger [264, 268]. Upon rescaling and letting $h \to 0$, one obtains the so-called Fuchs calculus [265]. A general study of invariant symbolic calculi (6.9) on bounded symmetric domains has recently been undertaken by Arazy and Upmeier [18].

An important interpretation of the above-mentioned equality of the $L^2(\mathbb R^{2n})$-norm of a function $f$ and the Hilbert–Schmidt norm of the Weyl operator $W_f$ is the following. Consider once more the map $\Gamma : f \mapsto T_f$ mapping a function $f$ to the corresponding Toeplitz operator (6.8), and let $\Gamma^*$ be its adjoint with respect to the $L^2(\mathbb R^{2n})$ inner product on $f$ and the Hilbert–Schmidt product on $T_f$. One then checks easily that $\Gamma^*$ coincides with the dequantization operator (6.14). Now by the abstract Hilbert-space operator theory, $\Gamma$ admits the polar decomposition
\[
\Gamma = WR, \tag{6.15}
\]
with $R := (\Gamma^*\Gamma)^{1/2}$ and $W$ a partial isometry with initial space $\operatorname{Ran}\Gamma^*$ and final space $\operatorname{Ran}\Gamma$.
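The polar decomposition (6.15) is easy to make concrete for matrices. The sketch below is a toy illustration under the assumption that $\Gamma$ is replaced by an invertible real $2 \times 2$ matrix `G` (so that $W$ is orthogonal rather than merely a partial isometry); it uses the closed-form square root $\sqrt M = (M + \sqrt{\det M}\,I)/\sqrt{\operatorname{tr} M + 2\sqrt{\det M}}$ valid for positive definite $2 \times 2$ matrices. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def sqrt_spd_2x2(m):
    """Square root of a symmetric positive definite 2x2 matrix."""
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])   # sqrt(det M)
    t = math.sqrt(m[0][0] + m[1][1] + 2.0 * s)             # sqrt(tr M + 2 sqrt(det M))
    return [[(m[0][0] + s) / t, m[0][1] / t],
            [m[1][0] / t, (m[1][1] + s) / t]]


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def polar_2x2(g):
    """Return (W, R) with G = W R, R = (G^T G)^(1/2) and W orthogonal."""
    r = sqrt_spd_2x2(matmul(transpose(g), g))
    w = matmul(g, inverse_2x2(r))
    return w, r
```

For `G = [[2, 1], [0, 1]]` one finds $R = \frac{1}{\sqrt{10}}\begin{pmatrix} 6 & 2 \\ 2 & 4 \end{pmatrix}$, and $W = GR^{-1}$ satisfies $W^{\mathsf T}W = I$.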
A simple calculation shows, however, that $\Gamma^*\Gamma$ is precisely the Berezin transform associated to $A^2(\mathbb C^n,\mu_h)$,
\[
\Gamma^*\Gamma f(y) = \int_{\mathbb C^n} f(x)\,\frac{|K_h(y,x)|^2}{K_h(y,y)}\,d\mu_h(x) = e^{h\Delta} f(y),
\]
and using the Fourier transform to compute the square root $(\Gamma^*\Gamma)^{1/2}$ one discovers that $W$ is precisely the Weyl transform $f \mapsto W_f$. This fact, first realized by Ørsted and Zhang [208] (see also Peetre and Zhang [216] for a motivation coming from decompositions of tensor products of holomorphic discrete series representations), allows us to define an analogue of the Weyl transform by (6.15) for any reproducing kernel subspace of any $L^2$ space. For the standard scale of weighted Bergman spaces on bounded symmetric domains in $\mathbb C^n$, this generalization has been studied in Ørsted and Zhang [208] and Davidson, Ólafsson and Zhang [74]; the general case seems to be completely unexplored at present.

From the point of view of group representations, the unit vectors $k_y$ in (6.10) are the coherent states in the sense of Glauber [110], Perelomov [217] and Onofri [207]. Namely, the group $G$ of all distance-preserving biholomorphic self-maps of $\mathbb C^n$ (which coincides with the group of orientation-preserving rigid motions $x \mapsto Ax + b$, $A \in U(n)$, $b \in \mathbb C^n$) acts transitively on $\mathbb C^n$ and induces a projective unitary representation
\[
U_g : \phi(x) \mapsto \phi(gx)\, e^{-\langle b, Ax\rangle/h - |b|^2/2h} \qquad (gx = Ax + b,\ g \in G)
\]
of $G$ in $A^2(\mathbb C^n)$; and the vectors $k_y$ are unit vectors satisfying
\[
U_g k_y = \epsilon\, k_{gy} \tag{6.16}
\]
for some numbers $\epsilon = \epsilon(g,y)$ of unit modulus. Coherent states for a general group $G$ of transformations acting transitively on a manifold $\Gamma$, with respect to a projective unitary representation $U$ of $G$ in a Hilbert space $\mathcal H$, are similarly defined as a family $\{k_y\}_{y \in \Gamma}$ of unit vectors in $\mathcal H$ indexed by the points of $\Gamma$ such that (6.16) holds. Choosing a basepoint $0 \in \Gamma$ and letting $H$ be the subgroup of $G$ which leaves the subspace $\mathbb C k_0$ invariant (i.e. $g \in H$ iff $U_g k_0 = \epsilon(g) k_0$ for some $\epsilon(g) \in \mathbb C$ of modulus 1), we can identify $\Gamma$ with the homogeneous space $G/H$. Suppose that there exists a biinvariant measure $dg$ on $G$, and let $dm$ be the corresponding invariant measure on $\Gamma = G/H$. We say that the coherent states $\{k_y\}_{y \in \Gamma}$ are square-integrable if
\[
\int_\Gamma |\langle k_x, k_y\rangle|^2\,dm(y) =: d < \infty
\]
(in view of (6.16), the value of the integral does not depend on the choice of $x \in \Gamma$). If the representation $U$ is irreducible, it is then easy to see from the Schur lemma that
\[
\frac{1}{d} \int_\Gamma |k_y\rangle\langle k_y|\,dm(y) = I \qquad \text{(the identity on } \mathcal H\text{)}.
\]
It follows that the mapping $\mathcal H \ni f \mapsto f(y) := \langle k_y, f\rangle$ identifies $\mathcal H$ with a subspace of $L^2(\Gamma, dm)$ which is a reproducing kernel space with kernel $K(x,y) = d^{-1}\langle k_x, k_y\rangle$.
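The square-integrability constant $d$ and the resolution of the identity can be checked numerically in the simplest compact example. The sketch below assumes, as an illustration that does not appear in the text, the spin-1/2 coherent states of $SU(2)$ on the sphere $\Gamma = S^2$, $k_{(\theta,\varphi)} = (\cos\frac\theta2,\ e^{i\varphi}\sin\frac\theta2)$, with invariant measure $dm = \sin\theta\,d\theta\,d\varphi$; in this case $d = 2\pi$. The function names and grid sizes are hypothetical choices.

```python
import cmath
import math


def coherent_state(theta, phi):
    """Spin-1/2 coherent state attached to the point (theta, phi) of the sphere."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]


def inner(u, v):
    """Inner product, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def sphere_nodes(n_theta=400, n_phi=32):
    """Midpoint nodes and weights for the measure sin(theta) dtheta dphi."""
    dtheta, dphi = math.pi / n_theta, 2 * math.pi / n_phi
    for i in range(n_theta):
        theta = (i + 0.5) * dtheta
        for j in range(n_phi):
            yield theta, j * dphi, math.sin(theta) * dtheta * dphi


def overlap_integral(x):
    """Approximate d = integral of |<k_x, k_y>|^2 dm(y) for a fixed unit vector x."""
    return sum(abs(inner(x, coherent_state(t, p))) ** 2 * w
               for t, p, w in sphere_nodes())


def averaged_projector(d):
    """Approximate (1/d) * integral of |k_y><k_y| dm(y) as a 2x2 matrix."""
    op = [[0j, 0j], [0j, 0j]]
    for t, p, w in sphere_nodes():
        k = coherent_state(t, p)
        for a in range(2):
            for b in range(2):
                op[a][b] += k[a] * complex(k[b]).conjugate() * w / d
    return op
```

Evaluating `overlap_integral` at two different base points gives the same value up to quadrature error, and `averaged_projector(d)` is close to the $2 \times 2$ identity matrix, as the Schur lemma argument predicts.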
Thus, in some sense, the quantizers $\Delta_y$ above and their associated reproducing kernel Hilbert spaces may be regarded as generalizations of the coherent states to the situation when there is no group action present. For more information on coherent states and their applications in quantization, see for instance Klauder [161], Odzijewicz [202], Unterberger [267], Ali and Goldin [12], Antoine and Ali [6], Ali [4], Bartlett, Rowe and Repka [235], and the survey by Ali, Antoine, Gazeau and Mueller [7], as well as the recent book [8], and the references therein. An interesting characterization of the cut locus of a compact homogeneous Kähler manifold in terms of orthogonality of coherent states has recently been given by Berceanu [32]. We will have more to say about coherent states in Sec. 7 below.

Another way of arriving at the Toeplitz-type operators (6.8) is via geometric quantization. Namely, consider a phase space $(\Gamma,\omega)$ which admits a Kähler polarization $F$, i.e. one for which $F \cap \bar F = \{0\}$ (hence $F + \bar F = T^*_{\mathbb C}\Gamma$). The functions constant along $F$ can then be interpreted as holomorphic functions, the corresponding $L^2$-space becomes the Bergman space, and the quantum operators (3.3) become, as has already been mentioned above, Toeplitz operators. This link between geometric and Berezin quantization was discovered by Tuynman [258, 259], who showed that on a compact Kähler manifold (as well as in some other situations) the operators $Q_f$ of the geometric quantization coincide with the Toeplitz operators $T_{f + h\Delta f}$, where $\Delta$ is the Laplace–Beltrami operator. Later on this connection was examined in detail in a series of papers by Cahen [58] and Cahen, Gutt and Rawnsley [59] (parts I and II of [59] deal with compact manifolds, part III with the unit disc, and part IV with homogeneous spaces). See also Nishioka [200] and Odzijewicz [202].

In a sense, the choice of polarization in geometric quantization plays a role similar to that of the choice of ordering discussed in the paragraphs above, see Ali and Doebner [9]. Another point of view on the ordering problem in geometric quantization is addressed in Bao and Zhu [24].

7. Coherent State Quantization

The method of coherent state quantization is in some respects a particular case of the prime quantization from the previous section, exploiting the prequantization of the projective Hilbert space. Some representative references are by Ali [4, 6, 13], Lisiecki [178], Odzijewicz [201–205] and Rawnsley [224]. The relationship between coherent state quantization and geometric quantization is described in rigorous detail in [204]. We begin with a quick review of the symplectic geometry of the projective Hilbert space.

7.1. The projective Hilbert space

Let $\mathcal H$ be a Hilbert space of dimension $N$, which could be (countably) infinite or finite. As a set, the projective Hilbert space $\mathbb{CP}(\mathcal H)$ will be identified with the collection of all orthogonal projections onto one-dimensional subspaces of $\mathcal H$ and for each non-zero vector $\psi \in \mathcal H$ let $\Psi = \frac{1}{\|\psi\|^2}|\psi\rangle\langle\psi|$ denote the corresponding projector.
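The identification of a ray with its projector $\Psi = \|\psi\|^{-2}|\psi\rangle\langle\psi|$ can be illustrated in finite dimension. The following minimal sketch (function names are hypothetical, and vectors are plain lists of complex numbers) builds $\Psi$ as a matrix, so that one can confirm it is idempotent, has trace 1, and depends only on the ray $\mathbb C^*\psi$.

```python
def projector(psi):
    """Matrix of |psi><psi| / ||psi||^2 for a non-zero vector psi."""
    norm2 = sum(abs(c) ** 2 for c in psi)
    if norm2 == 0:
        raise ValueError("psi must be non-zero")
    return [[a * complex(b).conjugate() / norm2 for b in psi] for a in psi]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_difference(a, b):
    """Largest entrywise distance between two matrices of equal size."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))
```

Rescaling `psi` by any non-zero complex number leaves `projector(psi)` unchanged, which is the statement that points of $\mathbb{CP}(\mathcal H)$ correspond to rays rather than to vectors.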
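There is a natural Kähler structure on $\mathbb{CP}(\mathcal H)$ as we now demonstrate. An analytic atlas of $\mathbb{CP}(\mathcal H)$ is given by the coordinate charts
\[
\{(V_\Phi, h_\phi, \mathcal H_\Phi) \mid \phi \in \mathcal H \setminus \{0\}\}, \tag{7.1}
\]
where
\[
V_\Phi = \{\Psi \in \mathbb{CP}(\mathcal H) \mid \langle\phi|\psi\rangle \neq 0\} \tag{7.2}
\]
is an open, dense set in $\mathbb{CP}(\mathcal H)$;
\[
\mathcal H_\Phi = (I - \Phi)\mathcal H = \langle\phi\rangle^{\perp} \tag{7.3}
\]
is the subspace of $\mathcal H$ orthogonal to the range of $\Phi$ and $h_\phi : V_\Phi \to \mathcal H_\Phi$ is the diffeomorphism
\[
h_\phi(\Psi) = \frac{1}{\langle\hat\phi|\psi\rangle}(I - \Phi)\psi, \qquad \hat\phi = \frac{\phi}{\|\phi\|}. \tag{7.4}
\]
Since $V_\Phi$ is dense in $\mathbb{CP}(\mathcal H)$, it is often enough to consider only one coordinate chart. Thus, we set $e_0 = \hat\phi$ and choose an orthonormal basis $\{e_j\}_{j=1}^{N-1}$ of $\mathcal H_\Phi$ to obtain a basis of $\mathcal H$ which will be fixed from now on. We may then identify $\mathbb{CP}(\mathcal H)$ with $\mathbb{CP}^{N-1}$: For arbitrary $\psi \in \mathcal H$ we set
\[
z_j = \langle e_j|\psi\rangle, \qquad j = 0, 1, \ldots, N-1, \tag{7.5}
\]
and the coordinates of $\Psi \in \mathbb{CP}(\mathcal H)$ are the standard homogeneous coordinates
\[
Z_j = \langle e_j|h_\phi(\Psi)\rangle = \frac{z_j}{z_0}, \qquad j = 1, 2, \ldots, N-1, \tag{7.6}
\]
of projective geometry. The projection map $\pi : \mathcal H \setminus \{0\} \to \mathbb{CP}(\mathcal H)$ that assigns to each $\psi \in \mathcal H \setminus \{0\}$ the corresponding projector $\Psi \in \mathbb{CP}(\mathcal H)$ is holomorphic in these coordinates. For $\Psi \in \mathbb{CP}(\mathcal H)$ we have $\pi^{-1}(\Psi) = \mathbb C^*\psi$ where $\mathbb C^* = \mathbb C \setminus \{0\}$, and so $\pi : \mathcal H \setminus \{0\} \to \mathbb{CP}(\mathcal H)$ is a $GL(1,\mathbb C)$ principal bundle, sometimes called the canonical line bundle over $\mathbb{CP}(\mathcal H)$. We will denote the associated holomorphic line bundle by $L(\mathcal H)$ and write elements in it as $(\Psi,\psi)$, where $\psi \in \Psi(\mathcal H)$. (We again write $\pi$ for the canonical projection.) A local trivialization of $L(\mathcal H)$ over $V_\Phi$ is given by the (holomorphic) reference section $\hat s$ of $L(\mathcal H)$:
\[
\hat s(\Psi) = \Big(\Psi, \frac{\psi}{\langle\hat\phi|\psi\rangle}\Big), \tag{7.7}
\]
and any other section $s : V_\Phi \to L(\mathcal H)$ is given by
\[
s(\Psi) = (\Psi, \kappa(\Psi)) \tag{7.8}
\]
where $\kappa : \mathbb{CP}(\mathcal H) \to \mathcal H \setminus \{0\}$ is a holomorphic map with $\dfrac{|\kappa(\Psi)\rangle\langle\kappa(\Psi)|}{\|\kappa(\Psi)\|^2} = \Psi$. Denote by $s_0$ the zero-section of $L(\mathcal H)$. The identification map $\imath_L : L(\mathcal H) \setminus s_0 \to \mathcal H$ given as
\[
\imath_L(\Psi,\psi) = \psi, \tag{7.9}
\]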
yields a global coordinatization of L(H)\s0 . For any ψ ∈ H let ψ| be its dual element. The restriction of ψ| to the fiber π −1 (Ψ ) in L(H), for arbitrary
June 23, 2005 10:9 WSPC/148-RMP
J070-00237
Quantization Methods
467
$\Psi' \in CP(H)$, then yields a section $s^*_\psi$ of the dual bundle $L(H)^*$ of $L(H)$. Moreover, the map $\psi \mapsto s^*_\psi$ is antilinear between $H$ and $\Gamma(L(H)^*)$. We may hence realize $H$ as a space of holomorphic sections. The tangent space $T_\Psi CP(H)$ to $CP(H)$ at the point $\Psi$ has a natural identification with $H_\Psi$ (obtainable, for example, by differentiating curves in $CP(H)$ passing through $\Psi$). The complex structure of $H_\Psi$ then endows the tangent space $T_\Psi CP(H)$ with an integrable complex structure $J_\Psi$, making $CP(H)$ into a Kähler manifold. The corresponding canonical 2-form $\Omega_{FS}$, called the Fubini-Study 2-form, is given pointwise by

$$\Omega_{FS}(X_\Psi, Y_\Psi) = \frac{1}{2i}\left(\langle\xi|\zeta\rangle - \langle\zeta|\xi\rangle\right), \tag{7.10}$$

where $\xi, \zeta \in H_\Psi$ correspond to the tangent vectors $X_\Psi, Y_\Psi$ respectively. The associated Riemannian metric $g_{FS}$ is given by

$$g_{FS}(X_\Psi, Y_\Psi) = \frac{1}{2}\left(\langle\xi|\zeta\rangle + \langle\zeta|\xi\rangle\right) = \Omega_{FS}(X_\Psi, J_\Psi Y_\Psi). \tag{7.11}$$
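As a small numerical illustration of (7.10) and (7.11), the following Python sketch evaluates both expressions for two sample tangent vectors and checks the relation between them. It assumes that the inner product is antilinear in its first argument and that $J_\Psi$ acts on $H_\Psi$ as multiplication by $i$; the function names and the sample vectors are illustrative choices, not part of the original text.

```python
# Illustrative sketch only: tangent vectors are modelled as lists of complex
# coordinates, and the inner product is ASSUMED antilinear in its first slot.

def inner(xi, zeta):
    """Hermitian inner product <xi|zeta>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(xi, zeta))

def omega_fs(xi, zeta):
    """Formula (7.10): (1/2i)(<xi|zeta> - <zeta|xi>), which equals Im <xi|zeta>."""
    return ((inner(xi, zeta) - inner(zeta, xi)) / 2j).real

def g_fs(xi, zeta):
    """Formula (7.11): (1/2)(<xi|zeta> + <zeta|xi>), which equals Re <xi|zeta>."""
    return ((inner(xi, zeta) + inner(zeta, xi)) / 2).real

def apply_J(zeta):
    """Complex structure, ASSUMED here to be multiplication by i."""
    return [1j * c for c in zeta]

# Two arbitrary sample vectors (hypothetical data).
xi = [1 + 2j, 0.5 - 1j, -3j]
zeta = [2 - 1j, 1j, 0.25 + 0.5j]

# The two printed numbers should agree, as stated in (7.11).
print(g_fs(xi, zeta), omega_fs(xi, apply_J(zeta)))
```

With the opposite convention (inner product antilinear in the second argument) the same check would produce a relative sign, so the convention matters when comparing with other references.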
In the local coordinates $Z_j$, defined in (7.6), $\Omega_{FS}$ assumes the form

$$\Omega_{FS} = \frac{1}{1+\|Z\|^2}\sum_{j,k=1}^{N-1}\left[\delta_{jk} - \frac{Z_j \bar{Z}_k}{1+\|Z\|^2}\right] d\bar{Z}_j \wedge dZ_k, \qquad Z = (Z_1, Z_2, \ldots, Z_{N-1}). \tag{7.12}$$

Thus, clearly, $d\Omega_{FS} = 0$, so that $\Omega_{FS}$ is a closed 2-form, derivable from the real Kähler potential

$$\Phi(Z, \bar{Z}) = \log[1 + \|Z\|^2]. \tag{7.13}$$
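The relation between the potential (7.13) and the coefficients in (7.12) can be checked numerically. The sketch below, which is only an illustration under stated assumptions, approximates the mixed Wirtinger derivative $\partial^2\Phi/\partial\bar{Z}_j\,\partial Z_k$ by central finite differences and compares it with the bracketed coefficient of $d\bar{Z}_j \wedge dZ_k$. The helper names, the step size and the sample point are assumptions made for the example; indices are 0-based in the code.

```python
import math

def potential(Z):
    """Kaehler potential (7.13): log(1 + ||Z||^2)."""
    return math.log(1.0 + sum(abs(z) ** 2 for z in Z))

def coefficient(Z, j, k):
    """Coefficient of dZbar_j ^ dZ_k as written in (7.12) (0-based indices)."""
    s = 1.0 + sum(abs(z) ** 2 for z in Z)
    delta = 1.0 if j == k else 0.0
    return (delta - Z[j] * Z[k].conjugate() / s) / s

def shifted(Z, idx, dz):
    """Copy of Z with coordinate idx displaced by the complex increment dz."""
    W = list(Z)
    W[idx] += dz
    return W

def d_holo(F, Z, k, h):
    """Wirtinger derivative dF/dZ_k = (d/dx - i d/dy)/2 via central differences."""
    dx = (F(shifted(Z, k, h)) - F(shifted(Z, k, -h))) / (2 * h)
    dy = (F(shifted(Z, k, 1j * h)) - F(shifted(Z, k, -1j * h))) / (2 * h)
    return 0.5 * (dx - 1j * dy)

def d_antiholo(F, Z, j, h):
    """Wirtinger derivative dF/dZbar_j = (d/dx + i d/dy)/2 via central differences."""
    dx = (F(shifted(Z, j, h)) - F(shifted(Z, j, -h))) / (2 * h)
    dy = (F(shifted(Z, j, 1j * h)) - F(shifted(Z, j, -1j * h))) / (2 * h)
    return 0.5 * (dx + 1j * dy)

def hessian(Z, j, k, h=1e-3):
    """Approximate mixed derivative d^2 Phi / (dZbar_j dZ_k)."""
    first = lambda W: d_holo(potential, W, k, h)
    return d_antiholo(first, Z, j, h)

# Hypothetical sample point in a two-dimensional coordinate chart.
Z = [0.3 + 0.4j, -0.2 + 0.1j]
for j in range(2):
    for k in range(2):
        print(j, k, hessian(Z, j, k), coefficient(Z, j, k))
```

The two printed columns should agree up to the finite-difference error, which is of order $h^2$ for the step size used here.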
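(That is, $\Omega_{FS} = \sum_{j,k=1}^{N-1} \frac{\partial^2 \Phi}{\partial\bar{Z}_j\,\partial Z_k}\, d\bar{Z}_j \wedge dZ_k$.)

A Hermitian metric $H_{FS}$ and a connection $\nabla_{FS}$ on $L(H)$ can be defined using the inner product of $H$. Indeed, since $\pi^{-1}(\Psi) = \{\Psi\} \times \mathbb{C}\psi$, the Hermitian structure $H_{FS}$ is given pointwise by

$$H_{FS}((\Psi, \psi), (\Psi, \psi')) = \langle\psi|\psi'\rangle \tag{7.14}$$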
for all $(\Psi, \psi), (\Psi, \psi') \in \pi^{-1}(\Psi)$. We will use the identification map $\imath_L$ defined in (7.9) to construct a connection on $L(H)$. Define the 1-form $\alpha$ on $H$ by

$$\alpha(\psi) = \frac{\langle d\psi|\psi\rangle}{\|\psi\|^2}. \tag{7.15}$$

Then the pullback

$$\alpha_{FS} = \imath_L^*\,\alpha \tag{7.16}$$

defines a $\mathbb{C}^*$-invariant 1-form on $L(H)$ whose horizontal space at $(\Psi, \psi) \in L(H)$ is $H_\Psi$. For an arbitrary section $s : V_\Phi \to L(H)\setminus s_0$ as in (7.8), the pullback

$$-i\theta_{FS} = s^*\alpha_{FS} \tag{7.17}$$
defines a local 1-form $\theta_{FS}$ on $CP(H)$. Pointwise,

$$\theta_{FS}(\Psi) = i\,\frac{\langle d\kappa(\Psi)|\kappa(\Psi)\rangle}{\|\kappa(\Psi)\|^2} = i\,\bar\partial \log\|\kappa(\Psi)\|^2, \tag{7.18}$$

where $\bar\partial$ denotes exterior differentiation with respect to the anti-holomorphic variables. In terms of the coordinatization introduced in (7.5), with $f$ as the holomorphic function representing $\kappa$, we have

$$\theta_{FS}(Z) = i\,\frac{\sum_j Z_j\, d\bar{Z}_j}{1+\|Z\|^2} + i\,\frac{\overline{df(Z)}}{\overline{f(Z)}}. \tag{7.19}$$

Furthermore, $\theta_{FS}$ locally defines a compatible connection $\nabla_{FS}$,

$$\nabla_{FS}\, s = -i\theta_{FS} \otimes s, \tag{7.20}$$

and it is easy to verify that

$$\Omega_{FS} = \partial\theta_{FS} = \mathrm{curv}\,\nabla_{FS}, \tag{7.21}$$

where $\partial$ denotes exterior differentiation with respect to the holomorphic variables and $\mathrm{curv}\,\nabla_{FS}$ is the curvature form of the line bundle $L(H)$. Thus the Hermitian line bundle $(L(H), H_{FS}, \nabla_{FS})$ is a prequantization of $(CP(H), \Omega_{FS})$ in the sense of geometric quantization.

7.2. Summary of coherent state quantization

The prequantization of $(CP(H), \Omega_{FS})$ can be exploited to obtain a prequantization of an arbitrary symplectic manifold $(\Gamma, \Omega)$ whenever there exists a symplectomorphism Coh of $\Gamma$ into $CP(H)$. In this case, $\Omega = \mathrm{Coh}^*\Omega_{FS}$ and the line bundle $L := \mathrm{Coh}^* L(H)$, equipped with the Hermitian metric $\mathrm{Coh}^* H_{FS}$ and (compatible) connection $\nabla_K := \mathrm{Coh}^*\nabla_{FS}$, is a prequantization of $\Gamma$, i.e. in particular, $\Omega = \mathrm{curv}(\mathrm{Coh}^*\nabla_{FS})$. The expression

$$\theta_K(x) := i(\mathrm{Coh}^*\theta_{FS})(x), \tag{7.22}$$
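defines a 1-form on $L$, for which $\Omega = d\theta_K$. The Hermitian metric $H_K = \mathrm{Coh}^* H_{FS}$ and the compatible connection $\nabla_K$ are given by

$$H_K((x, \psi), (x, \psi')) = \langle\psi, \psi'\rangle, \tag{7.23}$$

$$\nabla_K\, \mathrm{coh} = -i\theta_K \otimes \mathrm{coh}, \tag{7.24}$$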
where coh denotes a smooth section of L and curv ∇K = Ω. More generally, if Coh : Γ → CP(H) is only assumed to be a smooth map, not necessarily a symplectomorphism, the above scheme gives us a prequantization of the symplectic manifold (Γ, ΩK ) where ΩK = Coh∗ ΩF S . That is, one has: Proposition 7.1. The triple (π : L → Γ, HK , ∇K ), where ∇K coh = −iθK ⊗ coh, is a Hermitian line bundle with compatible connection, and curv ∇K = ΩK .
To make the connection with coherent states, we note that the elements of $L$ are pairs $(x, \psi)$ with $\psi \in H$ and $|\psi\rangle\langle\psi| / \|\psi\|^2 = \Psi = \mathrm{Coh}(x)$. Let $U \subset \Gamma$ be an open dense set such that the restriction of $L$ to $U$ is trivial. Let $\mathrm{coh} : U \to H$ be a smooth section of $L$, that is, a smooth map satisfying

$$\mathrm{Coh}(x) = \frac{|\mathrm{coh}(x)\rangle\langle\mathrm{coh}(x)|}{\|\mathrm{coh}(x)\|^2} \tag{7.25}$$

(such maps can always be found). Let us also write $\eta_x = \mathrm{coh}(x)$, $\forall x \in U$. Assume furthermore that the condition

$$\int_\Gamma |\eta_x\rangle\langle\eta_x|\, d\nu(x) = I_H \tag{7.26}$$

is satisfied, where $I_H$ is the identity operator on $H$ and $\nu$ is the Liouville measure on $\Gamma$, arising from $\Omega$. We call the vectors $\eta_x$ the coherent states of the prequantization. In terms of the reproducing kernel $K(x, y) = \langle\eta_x|\eta_y\rangle$ and locally on $U$, $\theta_K(x) = d_1 \log K(x_1, x_2)|_{x_1 = x_2 = x}$ ($d_1$ denoting exterior differentiation with respect to $x_1$). Once we have (7.26), we can define a quantization via the recipe

$$f \mapsto Q_f = \int_\Gamma f(x)\,|\eta_x\rangle\langle\eta_x|\, d\nu(x). \tag{7.27}$$
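To make the recipe (7.27) concrete, here is a deliberately simple finite toy model in Python. It is not taken from the text: the integral over $\Gamma$ is replaced by a weighted sum over three unit vectors in $\mathbb{R}^2$ placed at angles $2\pi k/3$, which with weight $2/3$ satisfy a discrete analogue of the resolution of the identity (7.26). The names `frame` and `quantize` are hypothetical.

```python
import math

def frame(n=3):
    """n unit vectors in R^2 at angles 2*pi*k/n (a toy stand-in for the eta_x)."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

def quantize(f_values, vectors, weight):
    """Discrete analogue of (7.27): Q_f = sum_x weight * f(x) |eta_x><eta_x|.

    Returns a 2x2 real matrix as nested lists; the vectors are real, so no
    complex conjugation is needed in the outer product.
    """
    Q = [[0.0, 0.0], [0.0, 0.0]]
    for f, v in zip(f_values, vectors):
        for r in range(2):
            for c in range(2):
                Q[r][c] += weight * f * v[r] * v[c]
    return Q

vectors = frame(3)
weight = 2.0 / 3.0

# f = 1 reproduces the identity: the discrete counterpart of (7.26).
print(quantize([1.0, 1.0, 1.0], vectors, weight))

# A sample "classical observable" given by its values on the three points.
print(quantize([0.5, -1.0, 2.0], vectors, weight))
```

In this toy setting a real function is always sent to a symmetric matrix, and the constant function 1 is sent to the identity, which mirrors the role played by (7.26) in the continuous case.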
Note that this is a particular case of the “prime quantization” discussed in Sec. 6. As a consequence of Proposition 7.1 we see that $\Omega_K$ so constructed has integral cohomology. Thus the pair $(\Gamma, \Omega_K)$ satisfies the integrality condition. We have thus obtained a geometric prequantization on $(\Gamma, \Omega_K)$ from the natural geometric prequantization of $(CP(H), \Omega_{FS})$ via the family of coherent states $\{\eta_x\}$. While the new two-form $\Omega_K$ on $\Gamma$ is integral, this is not necessarily the case for the original form $\Omega$. If it is, then there exists a geometric prequantization on $(\Gamma, \Omega)$ which we may compare with the prequantization obtained using the coherent states. The original prequantization is said to be projectively induced if $\Omega = \Omega_K$; if furthermore, $\Gamma$ has a complex structure which is preserved by Coh, the symplectic manifold $(\Gamma, \Omega)$ turns out to be a Kähler manifold. For the Berezin quantization, discussed in Sec. 5, the coherent states can be shown to give rise to a projectively induced prequantization if $\Gamma$ is a Hermitian symmetric space.

It ought to be pointed out that while the map $\mathrm{Coh} : \Gamma \to CP(H)$ yields a prequantization of $(\Gamma, \Omega)$, the method outlined above does not give an explicit way to determine $H$ itself. However, starting with the Hilbert space $L^2(\Gamma, \nu)$, one can try to obtain subspaces $H_K \subset L^2(\Gamma, \nu)$, for which there are associated coherent states. Note that (7.26) then means that $H_K$ will be, in fact, a reproducing kernel space (with reproducing kernel $\langle\eta_x, \eta_y\rangle$).

Two simple examples. Consider a free particle, moving on the configuration space $\mathbb{R}^3$. Then $\Gamma = \mathbb{R}^6$ is the phase space. This is a symplectic manifold with two-form $\Omega = \sum_{i=1}^3 dp_i \wedge dq_i$. Let $H = L^2(\Gamma, dp\, dq)$ and let us look for convenient
subspaces of it which admit reproducing kernels. Let $e : \mathbb{R}^3 \to \mathbb{C}$ be a measurable function, depending only on the modulus $\|k\|$ and satisfying

$$\int_{\mathbb{R}^3} |e(k)|^2\, dk = 1.$$

For $\ell = 0, 1, 2, \ldots$, denote by $P_\ell$ the Legendre polynomial of order $\ell$,

$$P_\ell(x) = \frac{1}{2^\ell\, \ell!}\, \frac{d^\ell}{dx^\ell}\,(x^2 - 1)^\ell.$$

Define

$$K_{e,\ell}(q, p; q', p') = \frac{2\ell+1}{(2\pi)^3}\int_{\mathbb{R}^3} e^{ik\cdot(q-q')}\, P_\ell\!\left(\frac{(k-p)\cdot(k-p')}{\|k-p\|\,\|k-p'\|}\right) \overline{e(k-p)}\, e(k-p')\, dk. \tag{7.28}$$

It is then straightforward to verify [14] that $K_{e,\ell}$ is a reproducing kernel with the usual properties,

$$\begin{aligned}
K_{e,\ell}(q, p; q, p) &> 0, \qquad (q, p) \in \Gamma,\\
K_{e,\ell}(q, p; q', p') &= \overline{K_{e,\ell}(q', p'; q, p)},\\
K_{e,\ell}(q, p; q', p') &= \int_{\mathbb{R}^6} K_{e,\ell}(q, p; q'', p'')\, K_{e,\ell}(q'', p''; q', p')\, dp''\, dq'',
\end{aligned} \tag{7.29}$$

and we have the associated family of coherent states,

$$S = \{\xi_{q,p} \in H \mid \xi_{q,p}(q', p') = K_{e,\ell}(q', p'; q, p),\ (q, p), (q', p') \in \Gamma\}, \tag{7.30}$$

which span a Hilbert subspace $H_{e,\ell} \subset H$ and satisfy the resolution of the identity on it:

$$\int_{\mathbb{R}^6} |\xi_{q,p}\rangle\langle\xi_{q,p}|\, dp\, dq = I_{e,\ell}. \tag{7.31}$$

Using these coherent states we can do a prime quantization as in (7.27), i.e.,

$$f \mapsto Q_f = \int_{\mathbb{R}^6} f(q, p)\,|\xi_{q,p}\rangle\langle\xi_{q,p}|\, dp\, dq. \tag{7.32}$$

In particular, we get for the position and momentum observables the operators

$$Q_{q_j} \equiv \hat{q}_j = q_j - i\frac{\partial}{\partial p_j}, \qquad Q_{p_j} \equiv \hat{p}_j = -i\frac{\partial}{\partial q_j}, \qquad j = 1, 2, 3, \tag{7.33}$$

on $H_{e,\ell}$, so that $[\hat{q}_i, \hat{p}_j] = i\delta_{ij} I_{e,\ell}$. This illustrates how identifying appropriate reproducing kernel Hilbert spaces can lead to a physically meaningful quantization of the classical system.
Let us next try to bring out the connection between this quantization and the natural prequantization on $CP(H_{e,\ell})$. Consider the map

$$\mathrm{Coh} : \Gamma = \mathbb{R}^6 \to CP(H_{e,\ell}), \qquad \mathrm{Coh}(q, p) = \frac{|\xi_{q,p}\rangle\langle\xi_{q,p}|}{\|\xi_{q,p}\|^2}. \tag{7.34}$$

It is straightforward, though tedious, to verify that

$$\mathrm{Coh}^*\Omega_{FS} = \Omega = \sum_{i=1}^3 dp_i \wedge dq_i. \tag{7.35}$$

Hence $\Omega$ is projectively induced. The pullback $L = \mathrm{Coh}^* L(H_{e,\ell})$ of the canonical line bundle $L(H_{e,\ell})$ (over $CP(H_{e,\ell})$) under Coh gives us a line bundle over $\Gamma = \mathbb{R}^6$. Take a reference section $\hat{s}(q, p) = \xi_{q,p}$ in $L$. Square-integrable sections of this bundle form a Hilbert space $H_L$, with scalar product

$$\langle s_1|s_2\rangle = \int_{\mathbb{R}^6} \overline{\Psi_1(q, p)}\,\Psi_2(q, p)\, K_{e,\ell}(q, p; q, p)\, dq\, dp, \qquad s_i(q, p) = \hat{s}(q, p)\,\Psi_i(q, p), \quad i = 1, 2,$$

and again, $H_L$ is naturally (unitarily) isomorphic to $L^2(\Gamma, dq\, dp)$. We take the symplectic potential

$$\theta = \sum_{i=1}^3 p_i\, dq_i,$$
so that $\Omega = d\theta$, and thus we obtain a prequantization, as in Sec. 3.1, yielding the position and momentum operators

$$\hat{q}_j = -i\frac{\partial}{\partial p_j} + q_j, \qquad \hat{p}_j = -i\frac{\partial}{\partial q_j},$$

which are the same as in (7.33), but now act on the (larger) space $L^2(\Gamma, dq\, dp)$.

Our second example, following [106] and [107], is somewhat unorthodox and makes use of a construction of coherent states associated to the principal series representation of $SO_0(1, 2)$. The quantization is performed using (7.27). The coherent states in question are defined on the space $S^1 \times \mathbb{R} = \{x \equiv (\beta, J) \mid 0 \le \beta < 2\pi,\ J \in \mathbb{R}\}$, which is the phase space of a particle moving on the unit circle. The $J$ and $\beta$ are canonically conjugate variables and define the symplectic form $dJ \wedge d\beta$. Let $H$ be an abstract Hilbert space and let $\{\psi_n\}_{n=0}^\infty$ be an orthonormal basis of it. Consider next the set of functions,

$$\phi_n(x) = e^{-\epsilon n^2/2}\, e^{n(\epsilon J + i\beta)}, \qquad n = 0, 1, 2, \ldots, \tag{7.36}$$

defined on $S^1 \times \mathbb{R}$, where $\epsilon > 0$ is a parameter which can be arbitrarily small. These functions are orthonormal with respect to the measure

$$d\mu(x) = \sqrt{\frac{\epsilon}{\pi}}\,\frac{1}{2\pi}\, e^{-\epsilon J^2}\, dJ\, d\beta.$$
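The orthonormality claim can be checked by direct numerical integration. The Python sketch below assumes the convention $\phi_n(x) = e^{-\epsilon n^2/2} e^{n(\epsilon J + i\beta)}$ with measure density $\sqrt{\epsilon/\pi}\,(2\pi)^{-1} e^{-\epsilon J^2}$; the placement of $\epsilon$ in these formulas is an assumption of the sketch, chosen because it makes the functions orthonormal. The value of $\epsilon$, the grid sizes and the function names are likewise illustrative.

```python
import cmath
import math

EPS = 0.5  # sample value of the small parameter (an assumption)

def phi(n, J, beta, eps=EPS):
    """phi_n(J, beta) = exp(-eps n^2 / 2) exp(n (eps J + i beta)) (assumed form)."""
    return math.exp(-eps * n * n / 2 + n * eps * J) * cmath.exp(1j * n * beta)

def density(J, eps=EPS):
    """Density of d mu with respect to dJ d beta (assumed form)."""
    return math.sqrt(eps / math.pi) / (2 * math.pi) * math.exp(-eps * J * J)

def gram(n, m, eps=EPS, j_max=30.0, n_j=1200, n_beta=32):
    """Riemann-sum approximation of the integral of conj(phi_n) phi_m d mu."""
    dJ = 2 * j_max / n_j
    dbeta = 2 * math.pi / n_beta
    total = 0j
    for a in range(n_j + 1):
        J = -j_max + a * dJ
        w = density(J, eps) * dJ * dbeta
        for b in range(n_beta):
            beta = b * dbeta
            total += phi(n, J, beta, eps).conjugate() * phi(m, J, beta, eps) * w
    return total

# Diagonal entries should be close to 1 and off-diagonal entries close to 0.
for n in range(3):
    print([round(abs(gram(n, m)), 6) for m in range(3)])
```

The angular sum annihilates the off-diagonal entries, while the Gaussian integral in $J$ cancels the prefactor $e^{-\epsilon n^2}$ on the diagonal.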
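Define the normalization factor,

$$N(J) = \sum_{n=0}^\infty |\phi_n(x)|^2 = \sum_{n=0}^\infty e^{-\epsilon n^2}\, e^{2n\epsilon J} < \infty \tag{7.37}$$

(which is proportional to an elliptic Theta function), and use it to construct the coherent states

$$\eta_x := \eta_{J,\beta} = \frac{1}{\sqrt{N(J)}}\sum_{n=0}^\infty \overline{\phi_n(x)}\,\psi_n = \frac{1}{\sqrt{N(J)}}\sum_{n=0}^\infty e^{-\epsilon n^2/2}\, e^{n(\epsilon J - i\beta)}\,\psi_n. \tag{7.38}$$

These are easily seen to satisfy $\|\eta_{J,\beta}\| = 1$ and the resolution of the identity

$$\int_{S^1\times\mathbb{R}} |\eta_{J,\beta}\rangle\langle\eta_{J,\beta}|\, N(J)\, d\mu(x) = I_H, \tag{7.39}$$

so that the map

$$W : H \to L^2(S^1\times\mathbb{R}, N(J)\, d\mu), \qquad \text{where } (W\phi)(J, \beta) = \langle\eta_{J,\beta}|\phi\rangle,$$

is a linear isometry onto a subspace of $L^2(S^1\times\mathbb{R}, N(J)\, d\mu)$. Denoting this subspace by $H_{hol}$, we see that it consists of functions of the type

$$(W\phi)(J, \beta) = \frac{1}{\sqrt{N(J)}}\sum_{n=0}^\infty c_n z^n := \frac{F(z)}{\sqrt{N(J)}},$$

where we have introduced the complex variable $z = e^{\epsilon J + i\beta}$ and $c_n = e^{-\epsilon n^2/2}\langle\psi_n|\phi\rangle$. The function $F(z)$ is entire analytic and the choice of the subspace $H_{hol} \subset L^2(S^1\times\mathbb{R}, N(J)\, d\mu)$ (that is, of the coherent states (7.38)) is then akin to choosing a polarization. In view of (7.26) and (7.27), the quantization rule for functions $f$ on the phase space $S^1\times\mathbb{R}$ becomes

$$Q_f := \int_{S^1\times\mathbb{R}} f(J, \beta)\,|\eta_{J,\beta}\rangle\langle\eta_{J,\beta}|\, N(J)\, d\mu(x). \tag{7.40}$$

For $f(J, \beta) = J$,

$$Q_J = \int_{S^1\times\mathbb{R}} J\,|\eta_{J,\beta}\rangle\langle\eta_{J,\beta}|\, N(J)\, d\mu(x) = \sum_{n=0}^\infty n\,|\psi_n\rangle\langle\psi_n|. \tag{7.41}$$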
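The diagonal form of $Q_J$ has a simple explanation that can be checked numerically. Assuming the convention $\phi_n(x) = e^{-\epsilon n^2/2} e^{n(\epsilon J + i\beta)}$ and the measure density $\sqrt{\epsilon/\pi}\,(2\pi)^{-1} e^{-\epsilon J^2}$ (the placement of $\epsilon$ is an assumption of this sketch), integrating $|\phi_n|^2$ over $\beta$ leaves the Gaussian $\sqrt{\epsilon/\pi}\, e^{-\epsilon (J - n)^2}$ in $J$, whose mean is $n$. The Python sketch below evaluates that mean by a Riemann sum; the function name and grid parameters are hypothetical.

```python
import math

def qj_diagonal(n, eps=0.5, half_width=12.0, steps=4800):
    """Approximates <psi_n| Q_J |psi_n> as the mean of a Gaussian centred at n.

    Under the ASSUMED convention the integrand reduces, after the trivial
    beta-integration, to J * sqrt(eps/pi) * exp(-eps * (J - n)^2).
    """
    scale = 1.0 / math.sqrt(eps)          # width of the Gaussian
    lo = n - half_width * scale           # grid is symmetric about J = n
    dJ = 2 * half_width * scale / steps
    total = 0.0
    for a in range(steps + 1):
        J = lo + a * dJ
        total += J * math.sqrt(eps / math.pi) * math.exp(-eps * (J - n) ** 2) * dJ
    return total

# Each value should be close to the corresponding integer n.
print([round(qj_diagonal(n), 9) for n in range(6)])
```

The off-diagonal matrix elements vanish because the integrand then carries a factor $e^{i(m - n)\beta}$, which integrates to zero over the circle.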
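This is just the angular momentum operator, which as an operator on $H_{hol}$ is seen to assume the form $Q_J = -i\,\partial/\partial\beta$. For an arbitrary function of $\beta$, we get similarly

$$Q_{f(\beta)} = \int_{S^1\times\mathbb{R}} f(\beta)\,|\eta_{J,\beta}\rangle\langle\eta_{J,\beta}|\, N(J)\, d\mu(x) = \sum_{n,n'} e^{-\frac{\epsilon}{4}(n-n')^2}\, c_{n-n'}(f)\,|\psi_n\rangle\langle\psi_{n'}|, \tag{7.42}$$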
where $c_n(f)$ is the $n$th Fourier coefficient of $f$. In particular, we have for the “angle” operator:

$$Q_\beta = \pi I_H + \sum_{n \neq n'} i\,\frac{e^{-\frac{\epsilon}{4}(n-n')^2}}{n - n'}\,|\psi_n\rangle\langle\psi_{n'}|, \tag{7.43}$$

and for the “fundamental Fourier harmonic” operator

$$Q_{e^{i\beta}} = e^{-\frac{\epsilon}{4}}\sum_{n=0}^\infty |\psi_{n+1}\rangle\langle\psi_n|, \tag{7.44}$$

which, on $H_{hol}$, is the operator of multiplication by $e^{i\beta}$ up to the factor $e^{-\epsilon/4}$ (which can be made arbitrarily close to unity). Interestingly, the commutation relation

$$[Q_J, Q_{e^{i\beta}}] = Q_{e^{i\beta}}, \tag{7.45}$$
is “canonical” in that it is in exact correspondence with the classical Poisson bracket $\{J, e^{i\beta}\} = ie^{i\beta}$.

8. Some Other Quantization Methods

Apart from geometric and deformation quantization, other quantization methods exist; though it is beyond our expertise to discuss them all here, we at least briefly indicate some references. For quantization by Feynman path integrals, a standard reference is Feynman and Hibbs [97] or Glimm and Jaffe [111]; a recent survey is Grosche and Steiner [128]. Path integrals are discussed also in Berezin’s book [34], and a local deformation quantization formula resembling the Feynman expansion in a 2d quantum field theory lies also at the core of Kontsevich’s construction [166] of a star product on any Poisson manifold. (More precisely, Kontsevich’s formula is an expansion of a certain Feynman integral at a saddle point, see Cattaneo and Felder [63].) Connections between Feynman path integrals, coherent states, and the Berezin quantization are discussed in Kochetov and Yarunin [164], Odzijewicz [202], Horowski, Kryszen and Odzijewicz [141], Klauder [161], Berezin and Shubin [35, Chap. 5], Marinov [181], Charles [64], and Bodmann [43]. For a discussion of Feynman path integrals in the context of geometric quantization, see Gawedzki [105], Wiegmann [277], and Woodhouse [282, Chap. 9].

Another method is the asymptotic quantization of Karasev and Maslov [156]. It can be applied on any symplectic manifold, even when no polarization exists and the geometric quantization is thus inapplicable. It is based on patching together local Weyl quantizations in Darboux coordinate neighborhoods, the result being a quantization rule assigning to any $f \in C^\infty(\Gamma)$ a Fourier integral operator on a sheaf of function spaces over $\Gamma$ such that the condition (4.1) is satisfied. The main technical point is the use of the Maslov canonical operator (see e.g. Mishchenko, Sternin and Shatalov [184]).
The main disadvantage of this procedure is its asymptotic character: the operators gluing together the local patches into the sheaf are
defined only modulo O(h), and so essentially everything holds just modulo O(h) (or, in an improved version, modulo $O(h^\infty)$ or modulo the smoothing operators). The ideas of Karasev and Maslov were further developed in their book [157] (see also Karasev [155]), in Albeverio and Daletskii [2], and Maslov and Shvedov [182]. A good reference is Patissier and Dazord [75], where some obscure points from the original exposition [156] are also clarified. For comparison of this method with the geometric and deformation quantizations, see Patissier [214]. We remark that this asymptotic quantization should not be confused with the “asymptotic quantization” which is sometimes alluded to in the theory of Fourier integral operators and of generalized Toeplitz operators (in the sense of Boutet de Monvel and Guillemin), see e.g. Boutet de Monvel [54] or Bony and Lerner [44] (though the two are not totally unrelated). Two further asymptotic quantizations exist in coding theory (see e.g. Neuhoff [197], Gray and Neuhoff [124]) and in quantum gravity (Ashtekar [22]).

Stochastic quantization is based, roughly speaking, on viewing the quantum indeterminacy as a stochastic process, and applying the methods of probability theory and stochastic analysis. There are actually two methods of this kind, the geometrostochastic quantization of Prugovečki [220] and the stochastic quantization of Parisi and Wu [209]. The former arose, loosely speaking, from Mackey’s systems of imprimitivity (U, E) (Mackey [180]; see the discussion of Borel quantization in Sec. 2.4 above), with U a unitary representation of a symmetry group and E a projection-valued measure satisfying $U_g E(m) U_g^* = E(gm)$ for any Borel set m, by demanding that E be not necessarily a projection-valued but only a positive-operator-valued (POV) measure; this leads to the appearance of reproducing kernel Hilbert spaces and eventually makes contact with the prime quantization discussed in the preceding section.
See Ali and Prugovečki [14]; a comparison with Berezin quantization is available in Ktorides and Papaloucas [171]. The stochastic quantization of Parisi and Wu originates in the analysis of perturbations of the equilibrium solution of a certain parabolic stochastic differential equation (the Langevin equation), and we will not say anything more about it but refer the interested reader to Chaturvedi, Kapoor and Srinivasan [65], Damgaard and Hüffel [72], Namsrai [192], Mitter [185], or Namiki [191]. A comparison with geometric quantization appears in Hajra and Bandyopadhyay [137] and Bandyopadhyay [23]. Again, the term “stochastic quantization” is sometimes also used as a synonym for the stochastic mechanics of Nelson [195].

Finally, we mention briefly the method of quantum states of Souriau [250]. It builds on the notions of diffeological space and diffeological group, introduced in [249], which are too technical to describe here, and uses a combination of methods of harmonic and convex analysis. See the expository article [251] for a summary of later developments. Currently, the connections of this method with the other approaches to quantization seem unclear (cf. Blattner [40]).

The subject of quantization is vast and it is not the ambition, nor within the competence, of the present authors to write a comprehensive overview, so we had better stop our exposition at this point, with an apology to the reader for those topics
that were omitted, and to all authors whose work went unmentioned. We have not, for instance, at all touched the important and fairly complex problem of quantization with constraints, including BFV and BRST quantizations (see Sniatycki [247], Tuynman [260], Ibort [145], Batalin and Tyutin [29], Batalin, Fradkin and Fradkina [28], Kostant and Sternberg [170], Grigoriev and Lyakhovich [126]) and the relationship between quantization and reduction (Sjamaar [244], Tian and Zhang [255], Jorjadze [146], Bordemann, Herbig and Waldmann [46], Mladenov [186], Huebschmann [143], Vergne [274]); or quantum field theory and field quantization (Greiner and Reinhardt [125], Borcherds and Barnard [27]), etc. Some useful surveys concerning the topics we have covered, as well as some of those that we have not, are Sternheimer [252], Weinstein [279], Fernandes [96], Echeverria-Enriquez et al. [83], Sniatycki [245], Ali [3], Blattner [40], Tuynman [261], Borthwick [50], and the books of Fedosov [94], Landsman [174], Bates and Weinstein [30], Souriau [248], Perelomov [217], Bandyopadhyay [23], Greiner and Reinhardt [125] and Woodhouse [282] mentioned above.
Acknowledgments

This survey is based on an appendix to the habilitation thesis of the second author [88] and on lecture notes from a course on quantization techniques given at Cotonou, Benin, by the first author [5]. The authors would like to thank G. Tuynman for many helpful conversations on geometric quantization and record their gratitude to J.-P. Antoine, J.-P. Gazeau and G. A. Goldin for constructive feedback on the manuscript. The work of the first author (STA) was partially supported by grants from the Natural Sciences and Engineering Research Council (NSERC) of Canada and the Fonds québécois de la recherche sur la nature et les technologies (FQRNT). The second author (ME) acknowledges support from GA AV ČR grants A1019005 and A1019304.
References [1] Y. Aharonov and D. Bohm, Significance of electromagnetic potentials in the quantum theory, Phys. Rev. 115 (1959) 485–491. [2] S. Albeverio and A. Yu. Daletskii, Asymptotic quantization for solution manifolds of some infinite-dimensional Hamiltonian systems, J. Geom. Phys. 19 (1996) 31–46. [3] S. T. Ali, Survey of quantization methods, in Classical and Quantum Systems (Goslar 1991) (World Scientific River Edge, NJ 1993), pp. 29–37. [4] S. T. Ali, Coherent states and quantization, in Spline Functions and the Theory of Wavelets (Montreal, PQ, 1996) CRM Proc. Lecture Notes, Vol. 18 (AMS, Providence, 1999), pp. 233–243. [5] S. T. Ali, Quantization techniques: A quick overview, in Contemporary Problems in Mathematical Physics (World Scientific, Singapore, 2002), pp. 3–78. [6] S. T. Ali and J.-P. Antoine, Quantum frames, quantization and dequantization, Quantization and Infinite-Dimensional Systems (Plenum, New York, 1994), pp. 133–145.
[7] S. T. Ali, J.-P. Antoine, J.-P. Gazeau and U. A. Mueller, Coherent states and their generalizations: A mathematical overview, Rev. Math. Phys. 7 (1995) 1013–1104. [8] S. T. Ali, J.-P. Antoine and J.-P. Gazeau, Coherent States, Wavelets, and Their Generalizations (Springer, Berlin-Heidelberg-New York, 2000). [9] S. T. Ali and H.-D. Doebner, Ordering problem in quantum mechanics: Prime quantization and its physical interpretation, Phys. Rev. A41 (1990) 1199–1210. [10] S. T. Ali and G. G. Emch, Geometric quantization: Modular reduction theory and coherent states, J. Math. Phys. 27 (1986) 2936–2943. [11] S. T. Ali, G. G. Emch and A. M. El-Gradechi, Modular algebras in geometric quantization, J. Math. Phys. 35 (1994) 6237–6243. [12] S. T. Ali and G. A. Goldin, Quantization, coherent states and diffeomorphism groups, in Differential Geometry, Group Representations, and Quantization, Lecture Notes in Phys., Vol. 379 (Springer, Berlin, 1991), pp. 147–178. [13] S. T. Ali and U. A. Mueller, Quantization of a classical system on a coadjoint orbit of the Poincar´e group in 1 + 1 dimensions, J. Math. Phys. 35 (1994) 4405–4422. [14] S. T. Ali and E. Prugoveˇcki, Mathematical problems of stochastic quantum mechanics: Harmonic analysis on phase space and quantum geometry, Acta Appl. Math. 6 (1986) 1–18; Extended harmonic analysis of phase space representations for the Galilei group, Acta Appl. Math. 6 (1986) 19–45; Harmonic analysis and systems of covariance for phase space representations of the Poincar´e group, Acta Appl. Math. 6 (1986) 47–62. [15] R. F. V. Anderson, The Weyl functional calculus, J. Funct. Anal. 4 (1969) 240–267; On the Weyl functional calculus, J. Funct. Anal. 6 (1970) 110–115; The multiplicative Weyl functional calculus, J. Funct. Anal. 9 (1972) 423–440. ¨ [16] B. Angermann, Uber quantisierungen lokalisierbarer systeme — physikalisch interpretierbare mathematische modelle, Dissertation, Technische Universit¨ at Clausthal, 1983. [17] B. Angermann, H.-D. 
Doebner and J. Tolar, Quantum kinematics on smooth manifolds, in Nonlinear Partial Differential Operators and Quantization Procedures (Springer Berlin-Heidelberg-New York, 1983), pp. 171–208. [18] J. Arazy and H. Upmeier, Invariant symbolic calculi and eigenvalues of invariant operators on symmetric domains, Function Spaces, Interpolation Theory, and Related Topics (Lund, 2000), eds. A. Kufner, M. Cwikel, M. Engliˇs, L.-E. Persson and G. Sparr (Walter de Gruyter, Berlin, 2002), pp. 151–211. [19] R. Arens and D. Babbitt, Algebraic difficulties of preserving dynamical relations when forming quantum-mechanical operators, J. Math. Phys. 6 (1965) 1071–1075. [20] D. Arnal, M. Cahen and S. Gutt, Representation of compact Lie groups and quantization by deformation, Acad. Roy. Belg. Bull. Cl. Sci. 74(5) (1988) 123–141. [21] D. Arnal, J. C. Cortet, M. Flato and D. Sternheimer, Star-products: Quantization and representation without operators, in Field Theory, Quantization and Statistical Physics, D. Reidel (1981), pp. 85–111. [22] A. Ashtekar, Asymptotic quantization Monographs and textbooks in physical science. Lecture notes, Vol. 2 (Bibliopolis, Naples, 1987); New perspectives in canonical gravity (Bibliopolis, Naples, 1988). [23] P. Bandyopadhyay, Geometry, Topology and Quantization, Mathematics and its Applications, Vol. 386 (Kluwer, Dordrecht, 1996). [24] D. H. Bao and Z. Y. Zhu, Ordering problem in geometric quantization, J. Phys. A25 (1992) 2381–2385. [25] D. H. Bao and Z. Y. Zhu, Operator ordering problem and coordinate-free nature of geometric quantization, Comm. Theor. Phys. 25 (1996) 189–194.
[26] D. Bar-Moshe and M. S. Marinov, Realization of compact Lie algebras in K¨ ahler manifolds, J. Phys. A27 (1994) 6287–6298. [27] A. Barnard and R. E. Borcherds, Quantum field theory, preprint math-ph/0204014. [28] I. A. Batalin, E. S. Fradkin and T. E. Fradkina, Another version for operatorial quantization of dynamical systems with irreducible constraints, Nuclear Phys. B314 (1989) 158–174; Erratum ibid. 323 (1989) 734–735; Generalized canonical quantization of dynamical systems with constraints and curved phase space, Nuclear Phys. B332 (1990) 723–736. [29] I. A. Batalin and I. V. Tyutin, Quantum geometry of symbols and operators, Nuclear Phys. B345 (1990) 645–658. [30] S. Bates and A. Weinstein, Lectures on the Geometry of Quantization, Berkeley Mathematics Lecture Notes, Vol. 8 (Amer. Math. Soc., Providence, 1997). [31] F. Bayen, M. Flato, C. Fronsdal, A. Lichnerowicz and D. Sternheimer, Deformation theory and quantization, Lett. Math. Phys. 1 (1977) 521–530; Ann. Phys. 111 (1978) 61–110 (Part I), 111–151 (Part II). [32] S. Berceanu, Coherent states and geodesics: Cut locus and conjugate locus, J. Geom. Phys. 21 (1997) 149–168; A remark on Berezin’s quantization and cut locus, in Quantizations, Deformations and Coherent States, Bialowieza, 1996, Rep. Math. Phys. 40 (1997) 159–168. [33] F. A. Berezin, Quantization, Math. USSR Izvestiya 8 (1974) 1109–1163; Quantization in complex symmetric spaces, Math. USSR Izvestiya 9 (1975) 341–379; General concept of quantization, Commun. Math. Phys. 40 (1975) 153–174. [34] F. A. Berezin, The Method of Second Quantization, 2nd edn. (Nauka, Moscow, 1986) (in Russian). [35] F. A. Berezin and M. A. Shubin, The Schr¨ odinger Equation Moscow, 1983; English translation (Kluwer, Dordrecht, 1991). [36] M. Bertelson, M. Cahen and S. Gutt, Equivalence of star products, Class. Quant. Grav. 14 (1997) A93–A107. [37] R. J. 
Blattner, Quantization and representation theory, Harmonic analysis on homogeneous spaces, Williamstown, 1972, in Proc. Symp. Pure Math., Vol. XXVI (Amer. Math. Soc., Providence, 1973), pp. 147–165. [38] R. J. Blattner, Pairing of half-form spaces, G´eom´etrie symplectique et physique math´ematique, Aix-en-Provence, 1974 (CNRS, Paris, 1975), pp. 175–186. [39] R. J. Blattner, The metalinear geometry of non-real polarizations, Differential Geometrical Methods in Mathematical Physics (Bonn, 1975) Lecture Notes in Math., Vol. 570 (Springer, Berlin, 1977), pp. 11–45. [40] R. J. Blattner, Some remarks on quantization, Symplectic Geometry and Mathematical Physics (Aix-en-Provence, 1990) Progr. Math. Vol. 99 (Birkh¨ auser, Boston, 1991), pp. 37–47. [41] R. J. Blattner and J. H. Rawnsley, Quantization of the action of U (k, l) on R2(k+l) J. Funct. Anal. 50 (1983) 188–214. [42] R. J. Blattner and J. H. Rawnsley, A cohomological construction of half-forms for nonpositive polarizations, Bull. Soc. Math. Belg. S´er. B 38 (1986) 109–130. [43] B. G. Bodmann, Construction of self-adjoint Berezin-Toeplitz operators on K¨ ahler manifolds and a probabilistic representation of the associated semigroups, preprint math-ph/0207026. [44] J. -M. Bony and N. Lerner, Quantification asymptotique et microlocalisations d’ordre superieur, in S´eminaire sur les ´equations aux d´eriv´ees partielles 1986–1987, ´ ´ Exp. No. II-III (Ecole Polytech., Palaiseau, 1987); Ann. Sci. Ecole Norm. Sup. 22(4) (1989) 377–433.
[45] M. Bordemann, M. Brischle, C. Emmrich and S. Waldmann, Phase space reduction for star-products: An explicit construction for CP n , Lett. Math. Phys. 36 (1996) 357–371; Subalgebras with converging star products in deformation quantization: An algebraic construction for CP n , J. Math. Phys. 37 (1996) 6311–6323. [46] M. Bordemann, H.-C. Herbig and S. Waldmann, BRST cohomology and phase space reduction in deformation quantization, Commun. Math. Phys. 210 (2000) 107–144. [47] M. Bordemann, E. Meinrenken and M. Schlichenmaier, Toeplitz quantization of K¨ ahler manifolds and gl(N ), N → ∞ limits, Commun. Math. Phys. 165 (1994) 281–296. [48] M. Bordemann and S. Waldmann, A Fedosov star product of Wick type for K¨ ahler manifolds, Lett. Math. Phys. 41 (1997) 243–253. [49] D. Borthwick, Microlocal techniques for semiclassical problems in geometric quantization, in Perspectives on Quantization (South Hadley, MA, 1996), pp. 23–37, Contemp. Math. Vol. 214 (AMS, Providence, 1998). [50] D. Borthwick, Introduction to K¨ ahler quantization, First Summer School in Analysis and Mathematical Physics (Cuernavaca Morelos, 1998), Contemp. Math. 260 (Amer. Math. Soc. Providence 2000), pp. 91–132. [51] D. Borthwick, A. Lesniewski and H. Upmeier, Nonperturbative deformation quantization of Cartan domains, J. Funct. Anal. 113 (1993) 153–176. [52] D. Borthwick, S. Klimek, A. Lesniewski and M. Rinaldi, Matrix Cartan superdomains, super Toeplitz operators, and quantization, J. Funct. Anal. 127 (1995) 456–510. [53] D. Borthwick and A. Uribe, Almost complex structures and geometric quantization, Math. Res. Lett. 3 (1996) 845–861. [54] L. Boutet de Monvel, Toeplitz operators — an asymptotic quantization of symplectic cones, Stochastic processes and their applications in mathematics and physics (Bielefeld, 1985), Math. Appl. Vol. 61 (Kluwer, Dordrecht, 1990), pp. 95–106. [55] L. Boutet de Monvel and V. Guillemin, The Spectral Theory of Toeplitz Operators, Ann. Math. Studies, Vol. 
99 (Princeton University Press, Princeton, 1981). [56] O. Bratteli and D. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. I (Springer-Verlag, New York-Berlin-Heidelberg, 1979). [57] M. Braverman, Index theorem for equivariant Dirac operators on non-compact manifolds, math-ph/0011045. [58] M. Cahen, D´eformations et quantification, Physique quantique et g´eom´etrie (Paris, 1986) Travaux en Cours, Vol. 32 (Hermann, Paris, 1988), pp. 43–62. [59] M. Cahen, S. Gutt and J. Rawnsley, Quantization of K¨ ahler manifolds. I: Geometric interpretation of Berezin’s quantization, J. Geom. Physics 7 (1990) 45–62; II, Trans. Amer. Math. Soc. 337 (1993) 73–98; III, Lett. Math. Phys. 30 (1994) 291–305; IV, Lett. Math. Phys. 34 (1995) 159–168. [60] E. Calabi, Isometric imbedding of complex manifolds, Ann. Math. 58 (1953) 1–23. [61] A. Cannas da Silva and A. Weinstein, Geometric Models for Noncommutative Algebras, Berkeley Mathematics Lecture Notes, Vol. 10, AMS, Providence; Berkeley Center for Pure and Applied Mathematics, Berkeley, 1999. [62] D. Catlin, The Bergman kernel and a theorem of Tian, in Analysis and Geometry in Several Complex Variables (Katata, 1997), Trends in Math. (Birkh¨ auser, Boston, 1999), pp. 1–23. [63] A. S. Cattaneo and G. Felder, A path integral approach to the Kontsevich quantization formula, Commun. Math. Phys. 212 (2000) 591–611, preprint math.QA/ 9902090.
[64] L. Charles, Feynman path integral and Toeplitz quantization, Helv. Phys. Acta 72 (1999) 341–355; Berezin-Toeplitz operators, a semi-classical approach, Commun. Math. Phys. 239 (2003) 1–28; Quasimodes and Bohr-Sommerfeld conditions for the Toeplitz operators, Comm. Partial Diff. Eqs. 28 (2003) 1527–1566. [65] S. Chaturvedi, A. K. Kapoor and V. Srinivasan, Stochastic quantization scheme of Parisi and Wu, Monographs and Textbooks in Physical Science. Lecture Notes, Vol. 16 (Bibliopolis, Naples, 1990). [66] L. A. Coburn, Deformation estimates for the Berezin-Toeplitz quantization, Commun. Math. Phys. 149 (1992) 415–424; Berezin-Toeplitz quantization, in Algebraic Methods in Operator Theory (Birkh¨ auser, Boston, 1994), pp. 101–108. [67] A. Connes, Noncommutative Geometry (Academic Press, San Diego, 1994). [68] A. Connes and M. Flato, Closed star products and cyclic cohomology, Lett. Math. Phys. 24 (1992) 1–12; A. Connes, M. Flato and D. Sternheimer, Closedness of star products and cohomologies, Lie Theory and Geometry, Progr. Math., Vol. 123 (Birkh¨ auser, Boston, 1994), pp. 241–259. [69] A. Connes and N. Higson, Almost homomorphisms and KK-theory, unpublished manuscript, available at http://math.psu.edu/higson/Papers/CH1.pdf, 1989. [70] L. Cornalba and W. Taylor IV, Holomorphic curves from matrices, Nuclear Phys. B536 (1999) 513–552, preprint hep-th/9807060. [71] J. Czyz, On geometric quantization and its connections with the Maslov theory, Rep. Math. Phys. 15 (1979) 57–97; On geometric quantization of compact, complex manifolds, Rep. Math. Phys. 19 (1984) 167–178. [72] P. H. Damgaard and H. H¨ uffel, Stochastic quantization Phys. Rep. 152 (1987) 227–398. [73] R. Dashen and D. H. Sharp, Currents as coordinates for hadrons, Phys. Rev. 165 (1968) 1857–1866. [74] M. Davidson, G. Olafsson and G. Zhang, Laplace and Segal-Bargmann transforms on Hermitian symmetric spaces and orthogonal polynomials, J. Funct. Anal. 
204 (2003) 157–195; Segal-Bargmann transform on Hermitian symmetric spaces and orthogonal polynomials, preprint math.RT/0206275; M. Davidson and G. Olafsson, The generalized Segal-Bargmann transform and special functions, preprint math.RT/0307343. [75] P. Dazord and G. Patissier, La premi´ere classe de Chern comme obstruction a la quantification asymptotique, in Symplectic Geometry, Groupoids, and Inte´ grable Systems, Berkeley, CA, 1989, Math. Sci. Res. Inst. Publ., Vol. 20 (Springer, New York, 1991), pp. 73–97. [76] P. Deligne, D´eformations de l’alg`ebre des fonctions d’une vari´et´e symplectique: Comparaison entre Fedosov et Dewilde, Lecomte, Sel. Math. 1(4) (1995) 667–697. [77] M. Dewilde and P. B. A. Lecomte, Existence of star products and of formal deformations of the Poisson Lie algebra of arbitrary symplectic manifolds, Lett. Math. Phys. 7 (1983) 487–496. [78] P. A. M. Dirac, The Principles of Quantum Mechanics, 3rd edn. (Oxford, London, 1947). [79] G. Dito and D. Sternheimer, Deformation quantization: genesis, developments, and metamorphoses, in Proceedings of the Meeting between Mathematicians and Theoretical Physicists (Strasbourg, 2001), IRMA Lectures in Math. Theoret. Phys., Vol. 1 (Walter de Gruyter Berlin, 2002), pp. 9–54, preprint math.QA/0201168. ˇ [80] H. D. Doebner, P. St’ov´ ıˇcek and J. Tolar, Quantization of kinematics on configuration manifolds, Rev. Math. Phys. 13 (2001) 1–47, preprint math-ph/0104013.
June 23, 2005 10:9 WSPC/148-RMP
480
J070-00237
S. T. Ali & M. Engliˇ c
[81] H.-D. Doebner P. Nattermann, Borel quantization: Kinematics and dynamics, Acta Phys. Polon. B27 (1996) 2327–2339. [82] C. Duval, A. M. El Gradechi and V. Ovsienko, Projectively and conformally invariant star-products, preprint math.QA/0301052. [83] A. Echeverria-Enriquez, M. C. Mu˜ noz-Lecanda, N. Roman-Roy and C. VictoriaMonge, Mathematical foundations of geometric quantization, Extracta Math. 13 (1998) 135–238. [84] M. Engliˇs and J. Peetre, On the correspondence principle for the quantized annulus, Math. Scand. 78 (1996) 183–206. [85] M. Engliˇs, Asymptotics of the Berezin transform and quantization on planar domains, Duke Math. J. 79 (1995) 57–76. [86] M. Engliˇs, Berezin quantization and reproducing kernels on complex domains, Trans. Amer. Math. Soc. 348 (1996) 411–479. [87] M. Engliˇs, A Forelli-Rudin construction and asymptotics of weighted Bergman kernels, J. Funct. Anal. 177 (2000) 257–281. [88] M. Engliˇs, Bergman kernels in analysis, operator theory and mathematical physics, Habilitation (DrSc.) thesis Prague, October 2000. [89] M. Engliˇs, Weighted Bergman kernels and quantization, Commun. Math. Phys. 227 (2002) 211–241. [90] M. Engliˇs, A no-go theorem for nonlinear canonical quantization, Commun. Theor. Phys. 37 (2002) 287–288. [91] M. Engliˇs, Berezin-Toeplitz quantization and invariant symbolic calculi, Lett. Math. Phys. 65 (2003) 59–74. [92] B. V. Fedosov, Deformation quantization and asymptotic operator representation, Funct. Anal. Appl. 25 (1991) 184–194. [93] B. V. Fedosov, A simple geometric construction of deformation quantization, J. Diff. Geo. 40 (1994) 213–238. [94] B. V. Fedosov, Deformation Quantization and Index Theory, Mathematical Topics, Vol. 9 (Akademie Verlag, Berlin 1996). [95] B. V. Fedosov, Deformation quantization: Pro and contra, in Quantization, Poisson Brackets and Beyond (Manchester, 2001), ed. T. Voronov, Contemp. Math. 315. (Amer. Math. Soc. Providence, 2002). [96] R. L. 
Fernandes, Deformation quantization and Poisson geometry, Resenhas IMEUSP 4 (2000) 325–359. [97] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals (McGrawHill, New York, 1965). [98] R. Fioresi and M. A. Lledo, A review on deformation quantization of coadjoint orbits of semisimple Lie groups, preprint math.QA/0111132. [99] M. Flato and D. Sternheimer, Closedness of star products and cohomologies, in Lie Theory and Geometry, in Honor of B. Kostant, eds. J.-L. Brylinski, R. Brylinski, V. Guillemin and V. Kac (Birkhauser, Boston, 1994). [100] G. B. Folland, Harmonic Analysis in Phase Space, Annals of Mathematics Studies, Vol. 122 (Princeton University Press, Princeton, 1989). [101] E. S. Fradkin, Towards a quantum field theory in curved phase space, in Proceedings of the Second International A.D. Sakharov Conference on Physics, Moscow, 1996 (World Scientific, River Edge, 1997), pp. 746–756. [102] E. S. Fradkin and V. Ya. Linetsky, BFV approach to geometric quantization, Nuclear Phys. B431 (1994) 569–621; BFV quantization on Hermitian symmetric spaces, Nuclear Phys. B444 (1995) 577–601.
June 23, 2005 10:9 WSPC/148-RMP
J070-00237
Quantization Methods
481
[103] K. Gaw¸edzki, On extensions of Kostant quantization procedure, G´eom´etrie symplectique et physique math´ematique (Aix-en-Provence, 1974), pp. 211–216, CNRS Paris, 1975. [104] K. Gaw¸edzki, Fourier-like kernels in geometric quantization, Dissertationes Math. 128 (1976). [105] K. Gaw¸edzki, Geometric quantization and Feynman path integrals for spin, in Differential Geometric Methods in Mathematical Physics (Proc. Sympos., Univ. Bonn, Bonn, 1975), Lecture Notes in Math., Vol. 570 (Springer, Berlin, 1977), pp. 67–71. [106] J.-P. Gazeau, Coherent states and quantization of the particle motion on the line, on the circle, on 1+1-de Sitter space-time and of more general systems, in Proceedings of the Third International Workshop on Contemporary Problems in Mathematical Physics (Cotonou 2003), eds. J. Govaerts, M. N. Hounkonnou and A. Z. Msezane (World Scientific, Singapore, 2005), pp. 465–479. [107] J.-P. Gazeau and W. Piechocki, Asymptotic coherent states quantization of a particle in de Sitter space, preprint hep-th/0308019 v1 4 Aug 2003. [108] M. Gerstenhaber, On deformations of rings and algebras, Ann. Math. 79 (1964) 59–103. [109] F. Gieres, Mathematical surprises and Dirac’s formalism in quantum mechanics, Rep. Prog. Phys. 63 (2000) 1893–1946, preprint quant-ph/9907069. [110] R. J. Glauber, Coherent and incoherent states of the radiation field, Phys. Rev. 131(2) (1963) 2766–2788. [111] J. Glimm and A. Jaffe, Quantum Physics. A Functional Integral Point of View (Springer-Verlag, New York-Berlin, 1987). [112] G. A. Goldin, Non-relativistic current algebras as unitary representations of groups, J. Math. Phys. 12 (1971) 462–487. [113] G. A. Goldin, Lectures on diffeomorphism groups in quantum physics, in Proceedings of the Third International Workshop on Contemporary Problems in Mathematical Physics (Cotonou 2003), eds. J. Govaerts, M. N. Hounkonnou and A. Z. Msezane (World Scientific, Singapore, 2005), pp. 3–93. [114] G. A. Goldin, J. Grodnik, R. T. Powers and D. 
H. Sharp, Nonrelativistic current algebra in the ‘N/V’ limit, J. Math. Phys. 15 (1974) 88–100. [115] G. A. Goldin, R. Menikoff and D. H. Sharp, Particle statistics from induced representations of a local current group, J. Math. Phys. 21 (1980) 650–664. [116] G. A. Goldin, R. Menikoff and D. H. Sharp, Representations of a local current algebra in non-simply connected space and the Aharonov-Bohm effect, J. Math. Phys. 22 (1981) 1664–1668. [117] G. A. Goldin, R. Menikoff and D. H. Sharp, Diffeomorphism groups, gauge groups and quantum theory, Phys. Rev. Lett. 51 (1983) 2246–2249. [118] G. A. Goldin, R. Menikoff and D. H. Sharp, Induced representations of the group of diffeomorphisms of R3n , J. Phys. A: Math. Gen. 16 (1983) 1827–1833. [119] G. A. Goldin and D. H. Sharp, Diffeomorphism groups, anyon fields and qcommutators, Phys. Rev. Lett. 76 (1996) 1183–1187. [120] G. A. Goldin and D. H. Sharp, Lie algebras of local currents and their representations, in Group Representations in Mathematics and Physics: Battelle–Seattle 1969 Rencontres, Lecture Notes in Phys., Vol. 6 (Springer Berlin, 1970), pp. 300–310. [121] M. J. Gotay, A class of nonpolarizable symplectic manifolds, Monatsh. Math. 103 (1987) 27–30. [122] M. J. Gotay, H. B. Grundling and G. M. Tuynman, Obstruction results in quantization theory, J. Nonlinear, Sci. 6 (1996) 469–498.
June 23, 2005 10:9 WSPC/148-RMP
482
J070-00237
S. T. Ali & M. Engliˇ c
[123] J. M. Gracia-Bondia, Generalized Moyal quantization on homogeneous symplectic spaces, in Deformation Theory and Quantum Groups with Applications to Mathematical Physics, Amherst, MA, 1990, Contemp. Math., Vol. 134 (AMS, Providence, 1992), pp. 93–114. [124] R. M. Gray and D. L. Neuhoff, Quantization, IEEE Trans. Inform. Theory 44 (1998) 2325–2383. [125] W. Greiner and J. Reinhardt, Field Quantization (Springer-Verlag, Berlin, 1996). [126] M. A. Grigoriev and S. L. Lyakhovich, Fedosov quantization as a BRST theory, Commun. Math. Phys. 218 (2001) 437–457, preprint hep-th/0003114. [127] H. J. Groenewold, On the principles of elementary quantum mechanics, Physica 12 (1946) 405–460. [128] C. Grosche, An introduction into the Feynman path integral, preprint hepth/9302097; C. Grosche and F. Steiner, Handbook of Feynman Path Integrals, Springer Tracts in Modern Physics 145 (Springer-Verlag, Berlin, 1998). [129] E. Guentner, Berezin quantization and K-homology, Commun. Math. Phys. 240 (2003) 423–446; Wick quantization and asymptotic morphisms, Houston J. Math. 26 (2000) 361–375. [130] V. Guillemin, Reduced phase spaces and Riemann-Roch, in Lie Theory and Geometry in Honour of B. Kostant, Progress in Math. 123 (Birkh¨ auser, Boston, 1994) pp. 304–334. [131] V. Guillemin, Star products on pre-quantizable symplectic manifolds, Lett. Math. Phys. 35 (1995) 85–89. [132] V. Guillemin, V. Ginzburg and Y. Karshon, Moment Maps, Cobordisms, and Hamiltonian Group Actions (Amer. Math. Soc., Providence, 2002). [133] V. Guillemin and S. Sternberg, Geometric Asymptotics, Math. Surveys, Vol. 14 (AMS, Providence, 1977). [134] V. Guillemin and S. Sternberg, Symplectic Techniques in Physics (Cambridge University Press, Cambridge, 1984). [135] S. Gutt, Variations on Deformation Quantization, Conf´erence Mosh´e Flato, Dijon, 1999, Vol. I, Math. Phys. Stud. 21 (Kluwer, Dordrecht, 2000), pp. 217–254, preprint math.QA/0003107. [136] R. Haag, N. Hugenholtz and M. 
Winnink, On the equilibrium states in quantum statistical mechanics, Commun. Math. Phy. 5 (1967) 215–236. [137] K. Hajra and P. Bandyopadhyay, Equivalence of stochastic, Klauder and geometric quantization, Int. J. Mod. Phys. A7 (1992) 1267–1285. [138] E. Hawkins, The correspondence between geometric quantization and formal deformation quantization, preprint math/9811049 (1998). [139] H. Hess, On a geometric quantization scheme generalizing those of Kostant-Souriau and Czyz, in Differential Geometric Methods in Mathematical Physics, ClausthalZellerfeld, 1978, Lecture Notes in Phys., Vol. 139 (Springer, Berlin-New York, 1981), pp. 1–35. [140] L. H¨ ormander The Weyl calculus of pseudodifferential operators, Comm. Pure Appl. Math. 32 (1979), 359–443. [141] M. Horowski, A. Kryszen and A. Odzijewicz, Classical and quantum mechanics on the unit ball in Cn , Rep. Math. Phys. 24 (1986) 351–363. [142] L. van Hove, Sur certaines repr´esentations unitaires d’un groupe infini de transformations, Mem. Acad. Roy. de Belgique, Classe des Sci. 26(6) (1951). [143] J. Huebschmann, K¨ ahler quantization and reduction, preprint math/0207166. [144] N. E. Hurt, Geometric Quantization in Action (Dordrecht, Reidel, 1983).
June 23, 2005 10:9 WSPC/148-RMP
J070-00237
Quantization Methods
483
[145] L. A. Ibort, On the geometrical and cohomological foundations of BRST quantization, in Quantization of Constrained Systems and Conformal Field Theory, eds. J. F. Cari˜ nena, G.I.F.T. (Zaragoza, 1990), pp. 201–255. [146] G. P. Jorjadze, Hamiltonian reduction and quantization on symplectic manifolds, Mem. Diff. Eqs. Math. Phys. 13 (1998) 1–98. [147] A. V. Karabegov, Deformation quantization with separation of variables on a K¨ ahler manifold, Commun. Math. Phys. 180 (1996) 745–755. [148] A. V. Karabegov, On deformation quantization on a K¨ ahlerian manifold that is related to the Berezin quantization, Funktsional. Anal. i Prilozhen. 30 (1996) 87–89; Funct. Anal. Appl. 30 (1996) 142–144. [149] A. V. Karabegov, Berezin’s quantization on flag manifolds and spherical modules, Trans. Amer. Math. Soc. 350 (1998) 1467–1479. [150] A. V. Karabegov, Cohomological classification of deformation quantizations with separation of variables, Lett. Math. Phys. 43 (1998) 347–357. [151] A. V. Karabegov, On the canonical normalization of a trace density of deformation quantization, Lett. Math. Phys. 45 (1998) 217–228. [152] A. V. Karabegov, Pseudo-K¨ ahler quantization on flag manifolds, Commun. Math. Phys. 200 (1999) 355–379. [153] A. V. Karabegov and M. Schlichenmaier, Identification of Berezin-Toeplitz deformation quantization, J. Reine Angew. Math. 540 (2001) 49–76, preprint math.QA/0006063. [154] M. V. Karasev, Quantization by means of two-dimensional surfaces (membranes): Geometrical formulas for wave-functions, in Symplectic Geometry and Quantization Sanda and Yokohama, 1993, Contemp. Math. Vol. 179 (AMS, Providence, 1994) pp. 83–113, Geometric star-products, ibid., pp. 115–121; Geometric coherent states, membranes, and star products, in Quantization, Coherent States, and Complex Structures, Bialowieza, 1994 (Plenum, New York, 1995), pp. 185–199; Quantization and coherent states over Lagrangian submanifolds, Russian J. Math. Phys. 3 (1995) 393–400. [155] M. V. 
Karasev, Simple quantization formula, in Symplectic Geometry and Mathematical Physics, Aix-en-Provence, 1990, Progr. Math., Vol. 99 (Birkh¨ auser, Boston, 1991), pp. 234–244. [156] M. V. Karasev and V. P. Maslov, Asymptotic and geometric quantization, Uspekhi Mat. Nauk 39 (1984) 115–173; Russian Math. Surveys 39 (1985) 133–205; Pseudodifferential operators and the canonical operator on general symplectic manifolds, Izv. Akad. Nauk SSSR Ser. Mat. 47 (1983) 999–1029; Math. USSR-Izv. 23 (1984) 277–305; Global asymptotic operators of a regular representation, Dokl. Akad. Nauk SSSR 257 (1981) 33–37; Soviet Math. Dokl. 23 (1981) 228–232. [157] M. V. Karasev and V. P. Maslov, Nonlinear Poisson Brackets. Geometry and Quantization (Nauka, Moscow, 1991). [158] M. V. Karasev, Quantization and intrinsic dynamics, in Asymptotic Methods for Wave and Quantum Problems, Amer. Math. Soc. Transl. Ser. 2, Vol. 208 (Amer. Math. Soc., Providence, 2003), pp. 1–32, preprint math-ph/0207047; M. V. Karasev, T. A. Osborn, Quantum, Magnetic algebra and magnetic curvature, preprint quantph/0311053. [159] A. A. Kirillov, Elements of the theory of representations, 2nd edn. (Nauka, Moscow, 1978) in Russian; English translation of the 1st edition: Grundlehren Math. Wissensch., Band 220 (Springer, Berlin-New York, 1976). [160] A. A. Kirillov, Geometric quantization, in Dynamical Systems IV, Encyclopaedia Math. Sci., Vol. 4 (Springer, Berlin, 2001), pp. 139–176.
June 23, 2005 10:9 WSPC/148-RMP
484
J070-00237
S. T. Ali & M. Engliˇ c
[161] J. R. Klauder, Continuous representations and path integrals, revisited, in Path Integrals and Their Applications in Quantum, Statistical and Solid State Physics, Proc. NATO Advanced Study Inst., State Univ. Antwerp, Antwerp, 1977, NATO Adv. Study Inst. Ser., Ser. B: Physics, Vol. 43 (Plenum, New York-London, 1978), pp. 5–38. [162] S. Klimek and A. Lesniewski, Quantum Riemann surfaces, I: The unit disc, Commun. Math. Phys. 146 (1992) 103–122. [163] S. Klimek and A. Lesniewski, Quantum Riemann surfaces, II: The discrete series, Lett. Math. Phys. 24 (1992) 125–139; III: The exceptional cases, Lett. Math. Phys. 32 (1994) 45–61. [164] E. A. Kochetov and V. S. Yarunin, Coherent-state path integral for transition amplitude: A theory and applications, Phys. Scripta 51 (1995) 46–53. [165] J. J. Kohn and L. Nirenberg, An algebra of pseudo-differential operators, Comm. Pure Appl. Math. 18 (1965) 269–305. [166] M. Kontsevich, Deformation quantization of Poisson manifolds, preprint q-alg/9709040 (1997). [167] B. Kostant, Quantization and Unitary Representations, Lecture Notes in Math., Vol. 170 (Springer, Berlin, 1970). [168] B. Kostant, Symplectic Spinors, Symposia Mathematica Vol. 14 (Academic Press, London, 1974), pp. 139–152. [169] B. Kostant, On the definition of quantization, in G´eom´etrie Symplectique et Physique Math´ematique, Aix-en-Provence, 1974, Colloq. Internat. CNRS, Vol. 237 ´ (Editions Centre Nat. Recherche Sci., Paris, 1975), pp. 187–210. [170] B. Kostant and S. Sternberg, Symplectic reduction, BRS cohomology, and infinitedimensional Clifford algebras, Ann. Phys. 176 (1987) 49–113. [171] C. N. Ktorides and L. C. Papaloucas, A special construction of Berezin’s L-kernel, Found. Phys. 17 (1987) 201–207. [172] N. P. Landsman, Strict quantization of coadjoint orbits, J. Math. Phys. 39 (1998) 6372–6383. [173] N. P. Landsman, Classical and quantum representation theory, in Proc. Seminar 1989–1990 Mathem. Struct. in Field Theory, Amsterdam, CWI Syllabi, Vol. 
39, Math. Centrum (Centrum Wisk. Inform. Amsterdam, 1996), pp. 135–163. [174] N. P. Landsman, Mathematical Topics between Classical and Quantum Mechanics (Springer, New York, 1998). [175] N. P. Landsman, Lie grupoid C ∗ -algebras and Weyl quantization, Commun. Math. Phys. 206 (1999) 367–381. [176] N. P. Landsman, Quantization and the tangent grupoid, in Operator Algebras and Mathematical Physics, Constant¸a, 2001 (Theta, Bucharest, 2003), pp. 251–265, preprint math-ph/0208004. [177] J. M. Leinaas and J. Myrheim, On the theory of identical particles, Nuovo Cimento B37 (1977) 1–23. [178] W. Lisiecki, K¨ ahler coherent states orbits for representations of semisimple Lie groups, Ann. Inst. H. Poincar´ e 53 (1990) 857–890. [179] M. A. Lledo, Deformation quantization of coadjoint orbits, Int. J. Mod. Phys. B14 (2000) 2397–2400, preprint math.QA/0003142. [180] G. W. Mackey, Induced Representations of Groups and Quantum Mechanics (Benjamin, New York, 1968). [181] M. S. Marinov, Path integrals on homogeneous manifolds, J. Math. Phys. 36 (1995) 2458–2469; D. Bar-Moshe and M. S. Marinov, Berezin quantization and unitary representations of Lie groups, Transl. Amer. Math. Soc. 177 (1996) 1–21.
June 23, 2005 10:9 WSPC/148-RMP
J070-00237
Quantization Methods
485
[182] V. P. Maslov and O. Yu. Shvedov, Geometric quantization in the Fock space, in Topics in Statistical and Theoretical Physics, AMS Transl. Ser. 2, Vol. 177 (AMS, Providence, 1996), pp. 123–157. [183] R. Menikoff and D. H. Sharp, A gauge invariant formulation of quantum electrodynamics using local currents, J. Math. Phys. 18 (1977) 471–482. [184] A. S. Mishchenko, B. Yu. Sternin and V. E. Shatalov, Lagrangian Manifolds and the Method of the Canonical Operator (Nauka, Moscow, 1978). [185] P. K. Mitter, Stochastic approach to Euclidean quantum field theory, in New Perspectives in Quantum Field Theories, Jaca, 1985 (World Scientific, Singapore, 1986), pp. 189–207. [186] I. Mladenov, Quantization on curved manifolds, in Geometry, Integrability and Quantization, Varna, 2001 (Coral Press, Sofia, 2002), pp. 64–104. [187] C. Moreno, ∗-products on some K¨ ahler manifolds, Lett. Math. Phys. 11 (1986) 361–372; Invariant star products and representations of compact semisimple Lie groups, Lett. Math. Phys. 12 (1986) 217–229; Geodesic symmetries and invariant star products on K¨ ahler symmetric spaces, Lett. Math. Phys. 13 (1987) 245–257. [188] C. Moreno and P. Ortega-Navarro, Deformations of the algebra of functions on Hermitian symmetric spaces resulting from quantization, Ann. Inst. H. Poincar´ e Sect. A (N.S.) 38 (1983) 215–241. [189] U. A. Mueller, Zur Quantisierung physikalischer Systeme mit inneren Freiheitsgraden, doctoral dissertation, Technische Universit¨ at Clausthal, Clausthal, 1987. [190] I. Mykytiuk, Geometric quantization: Hilbert space structure on the space of generalized sections, Rep. Math. Phys. 43 (1999) 257–266. [191] M. Namiki, Basic Ideas of Stochastic Quantization, Progr. Theoret. Phys. Suppl. 111 (1993) 1–41; Stochastic quantization, Lecture Notes in Physics. New Series m: Monographs, Vol. 9, Springer-Verlag, Berlin, 1992. [192] K. Namsrai, Nonlocal Quantum Field Theory and Stochastic Quantum Mechanics (D. Reidel, Dordrecht-Boston, 1986). [193] P. 
Nattermann, Dynamics in Borel quantization: Nonlinear Schr¨ odinger equations vs. Master Equations, doctoral dissertation, Technische Universit¨ at Clausthal, Clausthal, 1997. [194] E. Nelson, Operants: A functional calculus for non-commuting operators, in Functional Analysis and Related Fields, ed. F. E. Browder (Springer-Verlag, New York, 1970), pp. 172–187. [195] E. Nelson, Quantum Fluctuations (Princeton University Press, Princeton, 1985), What is stochastic mechanics? Math´ematiques finitaires et analyse non standard, Publ. Math. University Paris VII, Vol. 31 (Univ. Paris VII, Paris, 1989), pp. 1–4. [196] R. Nest and B. Tsygan, Algebraic index theory, Commun. Math. Phys. 172 (1995) 223–262; Algebraic index theory for families, Adv. Math. 113 (1995) 151–205. [197] D. L. Neuhoff, The other asymptotic theory of lossy source coding, in Coding and Quantization, Piscataway, NJ, 1992 DIMACS Ser. Discrete Math. Theoret. Comput. Sci., Vol. 14 (AMS, Providence, 1993), pp. 55–65. [198] N. Neumaier, Local ν-Euler derivations and Deligne’s characteristic class of Fedosov star products and star products of special type, Commun. Math. Phys. 230 (2002) 271–288, preprint math.QA/9905176; Universality of Fedosov’s construction for star products of Wick type on pseudo-K¨ ahler manifolds, Rep. Math. Phys. 52 (2003) 43–80, preprint math.QA/0204031. [199] J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, 1955).
June 23, 2005 10:9 WSPC/148-RMP
486
J070-00237
S. T. Ali & M. Engliˇ c
[200] M. Nishioka, Geometric quantization of curved phase space, Hadronic J. 5 (1981/82) 207–213. [201] A. Odzijewicz, On reproducing kernels and quantization of states, Commun. Math. Phys. 114 (1988) 577–597. [202] A. Odzijewicz, Coherent states and geometric quantization, Commun. Math. Phys. 150 (1992) 385–413. [203] A. Odzijewicz, Covariant and contravariant Berezin symbols of bounded operators, Quantization and infinite-dimensional systems, Bialowieza, 1993, eds. J.-P. Antoine, S. T. Ali, W. Lisiecki, I. M. Mladenov and A. Odzijewicz (Plenum, New York, 1994), pp. 99–108. [204] A. Odzijewicz, Coherent state method in geometric quantization, in Twenty Years of Bialowieza: A Mathematical Anthology (Aspects of Differential Geometric Methods in Physics), eds. S. T. Ali, G. G. Emch, A. Odzijewicz, M. Schlichenmaier and L. Woronowicz (World Scientific, Singapore, 2005), pp. 47–78. ´ etochowski, Coherent states map for mic-kepler system, [205] A. Odzijewicz and M. Swi¸ J. Math. Phys. 38 (1997) 5010–5030. [206] H. Omori, Y. Maeda and A. Yoshioka, Weyl manifolds and deformation quantization, Adv. Math. 85 (1991) 224–255. [207] E. Onofri, A note on coherent state representation of Lie groups, J. Math. Phys. 16 (1975) 1087–1089. [208] B. Ørsted and G. Zhang, Weyl quantization and tensor products of Fock and Bergman spaces, Indiana Univ. Math. J. 43 (1994) 551–582. [209] G. Parisi and Y. S. Wu, Perturbation theory without gauge fixing, Sci. Sinica 24 (1981) 483–496; reprinted in Current topics in Chinese science, Section A — Physics, Vol. 1, pp. 69–82. [210] F. B. Pasemann, Eine geometrische Methode der Quantisierung: Vektorfelddarstellungen, doctoral dissertation, Technische Universit¨ at Clausthal, Clausthal, 1977. [211] Z. Pasternak-Winiarski, On reproducing kernels for holomorphic vector bundles, in Quantization and Infinite-Dimensional Systems, Bialowieza, 1993, eds. J.-P. Antoine, S. T. Ali, W. Lisiecki, I. M. Mladenov and A. Odzijewicz (Plenum, New York, 1994), pp. 
109–112. [212] Z. Pasternak-Winiarski and J. Wojcieszynski, Bergman spaces and kernels for holomorphic vector bundles, Demonstr. Math. 30 (1997) 199–214. [213] Z. Pasternak-Winiarski, On the dependence of the reproducing kernel on the weight of integration, J. Funct. Anal. 94 (1990) 110–134; On the dependence of the Bergman section on deformations of the volume form and the Hermitian structure, in Singularity Theory Seminar (Institute of Mathematics, Warsaw University of Technology, 1996), preprint. [214] G. Patissier, Quantification d’un vari´et´e symplectique, in S´eminaire de G´eom´etrie, 1985–1986; Publ. D´ep. Math. Nouvelle S´er. B, 86–4 (University Claude-Bernard, Lyon, 1986), pp. 35–54. [215] J. Peetre, The Berezin transform and Ha-plitz operators, J. Operator Theory 24 (1990) 165–186. [216] J. Peetre and G. Zhang, A weighted Plancherel formula III. The case of hyperbolic matrix ball, Collect. Math. 43 (1992) 273–301. [217] A. Perelomov, Generalized Coherent States and Their Applications (Springer-Verlag, Berlin, 1986). [218] J. C. T. Pool, Mathematical aspects of the Weyl correspondence, J. Math. Phys. 7 (1966) 66–76. [219] E. Prugoveˇcki, Consistent formulation of relativistic dynamics for massive spin-zero particles in external fields, Phys. Rev. D18 (1978) 3655–3673.
June 23, 2005 10:9 WSPC/148-RMP
J070-00237
Quantization Methods
487
[220] E. Prugoveˇcki, Stochastic Quantum Mechanics and Quantum Spacetime (D. Reidel, Dordrecht-Boston, 1984). [221] M. Puta, Hamiltonian Mechanical Systems and Geometric Quantization, Mathematics and Its Applications, Vol. 260 (Kluwer, Dordrecht, 1993). [222] F. Radulescu, The Γ-equivariant form of the Berezin quantization of the upper halfplane, Mem. Amer. Math. Soc. 133(630) (1998); Quantum dynamics and Berezin’s deformation quantization, in Operator Algebras and Quantum Field Theory, Rome, 1996 (Intern. Press, Cambridge, MA, 1998), pp. 383–389. [223] J. H. Rawnsley, On the cohomology groups of a polarization and diagonal quantization, Trans. Amer. Math. Soc. 230 (1977) 235–255. [224] J. H. Rawnsley, Coherent states and K¨ ahler manifolds, Quarterly J. Math. Oxford 28 (1977) 403–415. [225] J. H. Rawnsley, On the pairing of polarizations, Commun. Math. Phys. 58 (1978) 1–8. [226] J. H. Rawnsley, A nonunitary pairing of polarizations for the Kepler problem, Trans. Amer. Math. Soc. 250 (1979) 167–180. [227] J. H. Rawnsley and P. L. Robinson, The metaplectic representation, M pc structures and geometric quantization, Mem. Amer. Math. Soc. 81(410) (1989). [228] N. Reshetikhin and L. Takhtajan, Deformation quantization of K¨ ahler manifolds, in L.D. Faddeev’s Seminar on Mathematical Physics, Amer. Math. Soc. Transl. Ser. 2, Vol. 201 (Amer. Math. Soc., Providence, 2000), pp. 257–276, preprint math/ 9907171. [229] M. A. Rieffel, Deformation quantization and operator algebras, in Operator Theory: Operator Algebras and Applications, Durham, NH, 1988, Proc. Symp. Pure Math. Vol. 51 (1990), pp. 411–423. [230] M. A. Rieffel, Quantization and C ∗ -algebras, C ∗ -algebras: 1943–1993 (San Antonio, 1993) Contemp. Math. Vol. 167 (AMS, Providence, 1994), pp. 66–97. [231] M. A. Rieffel, Deformation quantization for actions of Rd , Mem. Amer. Math. Soc. 106(506) (1993). [232] M. A. Rieffel, Questions on quantization, in Operator Algebras and Operator Theory, Shanghai, 1997, Contemp. 
Math., Vol. 228 (AMS, Providence, 1998), pp. 315–326. [233] P. L. Robinson, A cohomological pairing of half-forms, Trans. Amer. Math. Soc. 301 (1987) 251–261. [234] P. L. Robinson, A report on geometric quantization, in Differential Geometry: Partial Differential Equations on Manifolds, Los Angeles, 1990, Proc. Symp. Pure Math. Vol. 54, Part I (Amer. Math. Soc., Providence, 1993), pp. 403–415. [235] D. J. Rowe, Quantization using vector coherent state methods, Quantization, Coherent States, and Poisson Structures, Bialowieza, 1995 (PWN, Warsaw, 1998), pp. 157–175; S. D. Bartlett, D. J. Rowe and J. Repka, Vector coherent state representations, induced representations, and geometric quantization: I. Scalar coherent state representations, J. Phys. A: Math. Gen. 35 (2002) 5599–5623, preprint quantph/0201129; II. Vector coherent state representations, J. Phys. A: Math. Gen. 35 (2002) 5625–5651, preprint quant-ph/0201130; D. J. Rowe and J. Repka, Coherent state triplets and their inner products, J. Math. Phys. 43 (2002) 5400–5438, preprint math-ph/0205034. [236] M. Schlichenmaier, Deformation quantization of compact K¨ ahler manifolds by Berezin-Toeplitz quantization, in Conference Mosh´e Flato 1999, Dijon, France 1999, Vol. 2 (Kluwer, 2000), pp. 289–306, preprint math.QA/9910137; Zwei Anwendungen algebraisch-geometrischer Methoden in der theoretischen Physik: Berezin-Toeplitz Quantisierung und globale Algebren der zweidimensionalen konformen Feldtheorie, Habilitation thesis, Mannheim, 1996.
June 23, 2005 10:9 WSPC/148-RMP
488
J070-00237
S. T. Ali & M. Engliˇ c
[237] M. Schlichenmaier, Deformation quantization of compact K¨ ahler manifolds via Berezin-Toeplitz operators, in Group Theoretical Methods in Physics, Goslar, 1996 (World Scientific, 1997), pp. 396–400; Berezin-Toeplitz quantization of compact K¨ ahler manifolds, Quantization, Coherent States and Poisson Structures, Bialowieza, 1995, PWN (1998), pp. 101–115, preprint q-alg/9601016; BerezinToeplitz quantization and Berezin symbols for arbitrary compact K¨ ahler manifolds, in Coherent States, Quantization and Gravity, Bialowieza, 1998, eds. M. Schlichenmaier, A. Strasburger, S. T. Ali and A. Odzijewicz (Warsaw University Press, 2001), 45–56, preprint math.QA/9902066. [238] I. E. Segal, Quantization of nonlinear systems, J. Math. Phys. 1 (1960) 468–488. ˇ sevskii, Quantization in cotangent bundles, Dokl. Akad. Nauk SSSR 245 [239] I. A. Sereˇ (1979) 1057–1060 (in Russian); Soviet Math. Dokl. 20 (1979) 402–405. [240] M. A. Shubin, Pseudodifferential Operators and Spectral Theory (Nauka, Moscow, 1978); English translation (Springer-Verlag, Berlin, 1987). [241] R. Simon, E. C. G. Sudarshan and N. Mukunda, Gaussian pure states in quantum mechanics and the symplectic group, Phys. Rev. A37 (1988) 3028–3038. [242] D. J. Simms, An outline of geometric quantization (d’apr`es Kostant), in Differential Geometrical Methods in Mathematical Physics, Bonn, 1975, Lecture Notes in Math., Vol. 570 (Springer, Berlin, 1977), pp. 1–10. [243] D. J. Simms and N. M. J. Woodhouse, Lectures on Geometric Quantization, Lecture Notes in Physics, Vol. 53 (Springer, New York, 1976). [244] R. Sjamaar, Symplectic reduction and Riemann-Roch formulas for multiplicities, Bull. Amer. Math. Soc. 33 (1996) 327–338. [245] J. Sniatycki, On geometric quantization of classical systems, in Mathematical Foundations of Quantum Theory, Loyola Univ., New Orleans, 1977 (Acad. Press New York 1978), pp. 287–297. [246] J. Sniatycki, Geometric Quantization and Quantum Mechanics (Springer-Verlag, Berlin, 1980). [247] J. 
Sniatycki, Constraints and Quantization, Lecture Notes in Math., Vol. 1037 (1983), pp. 293–310. [248] J.-M. Souriau, Structure des Syst´emes Dynamiques, Dunod, Paris 1969 English translation Structure of Dynamical Systems, Progr. Math., Vol. 149 (Birkh¨ auser, Boston, 1997). [249] J.-M. Souriau, Groupes diff´erentiels, Differential geometrical methods in mathematical physics (Aix-en-Provence/Salamanca, 1979), pp. 91–128, Lecture Notes in Math., Vol. 836 (Springer, Berlin-New York, 1980). [250] J.-M. Souriau, Quantification g´eom´etrique, Physique Quantique et G´eom´etrie (Paris, 1986), Travaux en Cours, Vol. 32 (Hermann, Paris, 1988), pp. 141–193. [251] J.-M. Souriau, Des particules aux ondes: Quantification g´eom´etrique, in Huyghen’s Principle 1690–1990: Theory and Applications, The Hague and Scheveningen, 1990, Stud. Math. Phys., Vol. 3 (North-Holland, Amsterdam, 1992), pp. 299–341. [252] D. Sternheimer, Deformation quantization: Twenty years after, in Particles, Fields and Gravitation, Lodz, 1998, AIP Conf. Proc., Vol. 453 (Amer. Inst. Phys., Woodbury, 1998), pp. 107–145; Star products: Their ubiquity and unicity, in Modern Group Theoretical Methods in Physics, Paris, 1995, Math. Phys. Stud., Vol. 18 (Kluwer, Dordrecht, 1995), pp. 255–265. [253] M. Takesaki, Tomita’s Theory of Modular Hilbert Algebras and its Applications, Lecture Notes in Math., Vol. 128 (Springer, Berlin, 1970). [254] M. E. Taylor, Pseudodifferential Operators (Princeton University Press, Princeton, 1981).
June 23, 2005 10:9 WSPC/148-RMP
J070-00237
Quantization Methods
489
[255] Y. Tian and W. Zhang, An analytic proof of the geometric quantization conjecture of Guillemin-Sternberg, Invent. Math. 132 (1998) 229–259. [256] J. Tolar, Borel Quantization and the origin of topological effects in quantum mechanics, in Differential Geometry, Group Representations, and Quantization, Lecture Notes in Phys., Vol. 379 (Springer, Berlin, 1991), pp. 179–190. [257] G. M. Tuynman, Geometric quantization, in Proceedings Seminar 1983–1985: Mathematical Structures in Field Theories, Vol. 1, CWI Syllabus, Vol. 8 (Math. Centrum, CWI, Amsterdam, 1985). [258] G. M. Tuynman, Generalized Bergman kernels and geometric quantization, J. Math. Phys. 28 (1987) 573–583. [259] G. M. Tuynman, Quantization: Towards a comparison between methods, J. Math. Phys. 28 (1987) 2829–2840. [260] G. M. Tuynman, What are the rules of the game called BRST?, in Proc. Mathematical Aspects of Classical Field Theory, Seattle, WA, 1991, Contemp. Math., Vol. 132 (AMS, Providence, 1992), pp. 625–633. [261] G. M. Tuynman, What is prequantization, and what is geometric quantization?, in Mathematical Structures in Field Theory, Proc. Seminar 1989–1990, CWI Syllabi, Vol. 39 (Math. Centrum, CWI, Amsterdam), pp. 1–28 (1996). [262] G. M. Tuynman, Prequantization is irreducible, Pub. IRMA, Lille, 37 (1995), no. XV, 1–13; Indag. Math. (N.S.) 9 (1998) 607–618. [263] A. Unterberger, Encore des classes de symboles, S´eminaire Goulaouic-Schwartz ´ (1977/1978), Exp. No. 6 (Ecole Polytech., Palaiseau, 1978). [264] A. Unterberger, Quantification des certains espaces Hermitiens sym´etriques, ´ S´eminaire Goulaouic-Schwartz 1979/1980, Exp. No. 16 (Ecole Polytech., Palaiseau, 1980). [265] A. Unterberger, The calculus of pseudodifferential operators of Fuchs type, Comm. Partial Diff. Equations 9 (1984) 1179–1236. [266] A. Unterberger, Pseudodifferential analysis, quantum mechanics and relativity, Comm. Partial Diff. Equations 13 (1988) 847–894; Quantification relativiste, M´em. Soc. Math. France (N.S.) 
44–45 (1991); Quantization, symmetries and relativity, in Perspectives on Quantization, South Hadley, 1996; Contemp. Math. Vol. 214 (AMS, Providence, 1998), pp. 169–187. [267] A. Unterberger, Quantization: Some problems, tools, and applications, in Operator Theory and Complex and Hypercomplex Analysis, Mexico City, 1994, Contemp. Math., Vol. 212 (AMS, Providence, 1998), pp. 285–298. [268] A. Unterberger, Composition formulas associated with symbolic calculi and applications, preprint ESI No. 822 (http://www.esi.ac.at/ESI-Preprints.html), ErwinSchr¨ odinger-Institute, Wien (1999). [269] A. Unterberger and H. Upmeier, Pseudodifferential Analysis on Symmetric Cones (CRC Press, Boca Raton, 1996). [270] H. Upmeier, Weyl quantization on symmetric spaces I. Hyperbolic matrix domains, J. Funct. Anal. 96 (1991) 297–330; Weyl quantization of complex domains, in Operator Algebras and Topology, Craiova, 1989, Pitman Research Notes in Math., Vol. 270 (Longman Harlow, 1992), pp. 160–178. [271] V. S. Varadarajan, Geometry of Quantum Theory (Springer-Verlag, New York, 1985). [272] M. Vergne, Geometric quantization and equivariant cohomology, in First European Congress of Mathematicians, Paris, 1992, Vol. 1, Progress in Math. 119 (Birkh¨ auser, Boston, 1994), pp. 249–295.
June 23, 2005 10:9 WSPC/148-RMP
490
J070-00237
S. T. Ali & M. Engliˇ c
[273] M. Vergne, Convex polytopes and quantization of symplectic manifolds, Proc. Natl. Acad. Sci. USA 93 (1996) 14238–14242. [274] M. Vergne, Quantification g´eom´etrique et r´eduction symplectique, S´eminaire Bourbaki 2000/2001, Ast´erisque 282 (2002), Exp. no. 888, 249–278. [275] D. A. Vogan, Jr., Noncommutative algebras and unitary representations, The mathematical heritage of Hermann Weyl (Durham, NC, 1987), pp. 35–60, Proc. Symp. Pure Math., Vol. 48, AMS, Providence, 1988. [276] A. Weil, Vari´et´es K¨ ahl´eriennes (Hermann, Paris, 1958). [277] P. B. Wiegmann, Multivalued functionals and geometrical approach for quantization of relativistic particles and strings, Nuclear Phys. B323 (1989) 311–329. [278] A. Weinstein, Symplectic groupoids, geometric quantization, and irrational rotation algebras, Symplectic geometry, groupoids, and integrable systems (Berkeley, 1989), pp. 281–290, Math. Sci. Res. Inst. Publ., Vol. 20 (Springer, New York, 1991). [279] A. Weinstein, Deformation quantization, S´eminaire Bourbaki, expos´e 789, Ast´erisque 227 (1994) 389–409. [280] H. Weyl, The Theory of Groups and Quantum Mechanics, (Dover, New York, 1931). [281] N. Woodhouse, Geometric Quantization (Clarendon Press, Oxford, 1980). [282] N. M. J. Woodhouse, Geometric Quantization, 2nd edn. Oxford Math. Monographs (Clarendon Press, Oxford University Press, New York, 1992). [283] S. Zakrzewski, Geometric quantization of Poisson groups — diagonal and soft deformations, in Symplectic Geometry and Quantization, Sanda and Yokohama, 1993, Contemp. Math., Vol. 179 (AMS, Providence, 1994), pp. 271–285. [284] S. Zelditch, Szeg¨ o kernels and a theorem of Tian, Int. Math. Res. Not. 6 (1998) 317–331. [285] Q. Zhao, Quantum kinematics and geometric quantization, J. Geom. Phys. 21 (1996) 34–42.
July 6, 2005 12:21 WSPC/148-RMP
J070-00238
Reviews in Mathematical Physics Vol. 17, No. 5 (2005) 491–543 © World Scientific Publishing Company
EXTENSION OF THE STRUCTURE THEOREM OF BORCHERS AND ITS APPLICATION TO HALF-SIDED MODULAR INCLUSIONS
HUZIHIRO ARAKI Research Institute for Mathematical Sciences, Kyoto University, Sakyoku, Kyoto 606-8205, Japan
[email protected]
LÁSZLÓ ZSIDÓ∗ Department of Mathematics, University of Rome “Tor Vergata”, Via della Ricerca Scientifica, 00133 Rome, Italy
[email protected]
Received 01 December 2004 Revised 10 March 2005
Dedicated to Professor D. Buchholz on his 60th birthday A result of H.-W. Wiesbrock is extended from the case of a common cyclic and separating vector for the half-sided modular inclusion N ⊂ M of von Neumann algebras to the case of a common faithful normal semi-finite weight and at the same time a gap in Wiesbrock’s proof is filled in. Keywords: Von Neumann algebra; modular theory; half-sided modular inclusion; analytic extension of one-parameter groups. Mathematics Subject Classification 2000: Primary 81T40, 46L10
∗ The second author was supported by MIUR, INDAM and EU.
1. Introduction
Bisognano and Wichmann [4] made a discovery about the connection of the modular operator and the modular conjugation for the von Neumann algebra generated by quantum fields in a wedge region of the Minkowski space-time with kinematical transformations, namely pure Lorentz transformation and the TCP1 operator. Borchers [6] formulated an important feature of this connection in the abstract setting of a pair of von Neumann algebras N ⊂ M with a common cyclic and separating vector Ω, and a one-parameter group of unitaries U (λ) having a positive generator, which induces a semi-group of endomorphisms of (M, Ω), obtaining a
commutation relation of U (λ) with the modular operator and the modular conjugation for (M, Ω), which reproduces the kinematical commutation relations in the Bisognano–Wichmann situation. A further development was achieved by Wiesbrock [37–42], who introduced the notion of the half-sided modular inclusion and obtained an underlying group structure (cf. [31]), as well as an imbedding of the canonical endomorphisms of the subfactor theory into a one-parameter semi-group of endomorphisms in this specific situation. Thus he obtained a correspondence between 2-dimensional chiral conformal field theories and a class of type III1 subfactors. Unfortunately, there is a gap in Wiesbrock’s proof of his basic theorem [37, Theorem 3, Corollaries 6 and 7, 42]. We will fill in this gap in Wiesbrock’s proof and further generalize the result to the case of a common normal semi-finite faithful weight. As a basic tool to prove general half-sided modular inclusion results, we generalize a structure theorem of Borchers [7, Theorem B] considerably, making Borchers’ proof at the same time more transparent. The extension from the state case to the weights turns out not to be straightforward. For this purpose we introduce as a basic tool the notion of a Hermitian map by using modular structure. It seems that before the summer of 1995, when we independently noticed the gap in the proof of Wiesbrock’s half-sided modular inclusion theorem, this gap was generally overlooked. Thanks to Professor Detlev Buchholz, who has been visiting the first-named author in the fall of 1995, we learned about each other’s insights and started to collaborate on this paper. The first version of the paper, containing a complete proof of the General Half-sided Modular Inclusion Theorem, Theorem 2.1, was already available at the end of 1995. It had a restricted circulation, but it was presented at several conferences. 
Other topics, like Theorem 2.2 on the structure and type of the involved von Neumann algebras and Proposition 2.4 on pathologies of the analytic extension of orbits of one-parameter automorphism groups, are of more recent date. We noticed that since 1995 a number of papers appeared, containing proposals for a complete proof of the half-sided modular inclusion theorem (see, for example, [17, Sec. 3] and [10, pp. 608 and 609]). Nonetheless, until now we have no knowledge of a completely elaborated proof, even in the state case. 2. Main Results 2.1. Notations and facts from the modular theory of von Neumann algebras (see, for example, [35, Chap. 10]) For two Hilbert spaces H and K we denote by B(K, H) the Banach space of all bounded linear maps from K to H. B(H, H) will be denoted simply by B(H). If T is a not necessarily everywhere defined linear operator from K to H, then Dom(T ) will stand for the domain of T .
We denote the weak and the strong operator topology on B(K, H) respectively by wo and so. The weak topology defined on B(K, H) by all linear functionals belonging to the norm-closure of the wo-continuous linear functionals in the dual of B(K, H) will be denoted by w. Further, the locally convex vector space topology defined on B(K, H) by the semi-norms $B(K, H) \ni T \mapsto \varphi(T^*T)^{1/2}$, where ϕ ranges over all w-continuous positive linear functionals on B(K), will be denoted by s. We notice that on any bounded subset of B(K, H) wo = w and so = s. For a weight ϕ on a von Neumann algebra M we use the standard notations:
\[ N_\varphi = \{x \in M ;\ \varphi(x^*x) < +\infty\} \quad \text{(left ideal)}, \]
\[ M_\varphi = (N_\varphi)^* N_\varphi = \text{the linear span of } \{a \in M^+ ;\ \varphi(a) < +\infty\} \quad \text{(hereditary $*$-subalgebra)}, \]
\[ A_\varphi = (N_\varphi)^* \cap N_\varphi \supset M_\varphi \quad \text{($*$-subalgebra)}. \]
We notice that for $a \in M^+$ we have a ∈ Mϕ ⇔ ϕ(a) < +∞. A von Neumann algebra M on a Hilbert space H is in standard form with respect to a normal semi-finite faithful weight ϕ on M if there is a linear map with dense range $N_\varphi \ni x \mapsto x_\varphi \in H$ such that
\[ \varphi(x^*x) = \|x_\varphi\|^2, \qquad (ax)_\varphi = a\,x_\varphi, \qquad x \in N_\varphi,\ a \in M. \]
In particular, by the faithfulness of ϕ, the map $x \mapsto x_\varphi$ is injective. We notice also that the above map $x \mapsto x_\varphi$ is unique up to natural unitary equivalence. If ϕ is bounded, then $\xi_\varphi = (1_H)_\varphi$ is a cyclic and separating vector for M and $x_\varphi = x\xi_\varphi$ for all $x \in M = N_\varphi$. In this case ϕ is the vector form $M \ni x \mapsto \omega_{\xi_\varphi}(x) = (x\xi_\varphi \mid \xi_\varphi)$. Let M be a von Neumann algebra on a Hilbert space H, in standard form with respect to a normal semi-finite faithful weight ϕ on M. Then the anti-linear operator $H \supset \{x_\varphi ;\ x \in A_\varphi\} \ni x_\varphi \mapsto (x^*)_\varphi$ has closure $S_\varphi$ and the invertible positive self-adjoint operator $\Delta_\varphi = S_\varphi^* S_\varphi$ is called the modular operator of ϕ. If $S_\varphi = J_\varphi \Delta_\varphi^{1/2}$ is the polar decomposition of $S_\varphi$, then $J_\varphi$ is an involutive anti-unitary operator (anti-linear surjective isometry with $J_\varphi^2 = 1_H$), called the modular conjugation of ϕ. The operators $\Delta_\varphi$ and $J_\varphi$ satisfy the commutation relation
\[ J_\varphi \Delta_\varphi^{z} = \Delta_\varphi^{-\bar z} J_\varphi, \qquad z \in \mathbb{C}, \tag{2.1} \]
in particular,
\[ S_\varphi = J_\varphi \Delta_\varphi^{1/2} = \Delta_\varphi^{-1/2} J_\varphi, \qquad J_\varphi \Delta_\varphi^{it} = \Delta_\varphi^{it} J_\varphi, \quad t \in \mathbb{R}. \tag{2.2} \]
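If ϕ is bounded and $\xi_\varphi = (1_H)_\varphi$ is the associated cyclic and separating vector, then
\[ S_\varphi \xi_\varphi = \xi_\varphi, \qquad \Delta_\varphi \xi_\varphi = \xi_\varphi, \qquad J_\varphi \xi_\varphi = \xi_\varphi. \]
The fundamental result of the modular theory claims that
\[ x \in N_\varphi,\ t \in \mathbb{R} \ \Longrightarrow\ \Delta_\varphi^{it} x \Delta_\varphi^{-it} \in N_\varphi, \quad (\Delta_\varphi^{it} x \Delta_\varphi^{-it})_\varphi = \Delta_\varphi^{it} x_\varphi, \tag{2.3} \]
so that
\[ M \ni x \mapsto \sigma_t^\varphi(x) = \Delta_\varphi^{it} x \Delta_\varphi^{-it} \in M, \qquad t \in \mathbb{R} \tag{2.4} \]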
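defines an so-continuous one-parameter group of automorphisms $(\sigma_t^\varphi)_{t\in\mathbb{R}}$ of M, called the modular automorphism group of ϕ, and
\[ J_\varphi M J_\varphi = M', \qquad x, y \in N_\varphi \ \Longrightarrow\ x J_\varphi y_\varphi = J_\varphi y J_\varphi x_\varphi, \tag{2.5} \]
so that $M \ni x \mapsto J_\varphi x^* J_\varphi \in M'$ is a ∗-anti-isomorphism. Moreover, the weight ϕ is invariant under the action of the modular automorphism group:
\[ \varphi\big(\sigma_t^\varphi(a)\big) = \varphi(a), \qquad a \in M^+,\ t \in \mathbb{R}. \tag{2.6} \]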
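The relations (2.1)–(2.6) can be checked by hand in a finite-dimensional model. The sketch below is only an illustration under assumptions that are not part of the text: M is taken to be the 2 × 2 matrices acting by left multiplication on H = 2 × 2 matrices with the Hilbert–Schmidt inner product, ϕ(x) = Tr(ρx) with the hypothetical choice ρ = diag(0.7, 0.3), and we use the concrete formulas $x_\varphi = x\rho^{1/2}$, $\Delta_\varphi\xi = \rho\,\xi\,\rho^{-1}$, $J_\varphi\xi = \xi^*$. All function names are ours.

```python
import cmath
import math

RHO = (0.7, 0.3)  # eigenvalues of the assumed diagonal density matrix rho

def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adj(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rho_pow(z):
    """rho**z for the positive diagonal matrix rho and a complex exponent z."""
    return [[cmath.exp(z * math.log(RHO[0])), 0],
            [0, cmath.exp(z * math.log(RHO[1]))]]

def phi(x):
    """The state phi(x) = Tr(rho x)."""
    return RHO[0] * x[0][0] + RHO[1] * x[1][1]

def vec(x):
    """The vector x_phi = x rho^{1/2} in H."""
    return mul(x, rho_pow(0.5))

def delta_pow(z, xi):
    """Delta^z applied to xi: rho^z xi rho^{-z}."""
    return mul(mul(rho_pow(z), xi), rho_pow(-z))

def J(xi):
    """Modular conjugation in this model: xi -> xi^*."""
    return adj(xi)

def sigma(t, x):
    """Modular automorphism sigma_t(x) = rho^{it} x rho^{-it}."""
    return mul(mul(rho_pow(1j * t), x), rho_pow(-1j * t))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

x = [[1 + 2j, 0.5], [-1j, 3]]   # an arbitrary element of M
t, z = 0.37, 0.3 + 0.8j
xi = vec(x)

checks = {
    "S x_phi = (x*)_phi with S = J Delta^{1/2}": close(J(delta_pow(0.5, xi)), vec(adj(x))),
    "(2.1) J Delta^z = Delta^{-conj(z)} J": close(J(delta_pow(z, xi)), delta_pow(-z.conjugate(), J(xi))),
    "(2.2) J Delta^{it} = Delta^{it} J": close(J(delta_pow(1j * t, xi)), delta_pow(1j * t, J(xi))),
    "(2.3) (sigma_t(x))_phi = Delta^{it} x_phi": close(vec(sigma(t, x)), delta_pow(1j * t, xi)),
    "(2.6) phi(sigma_t(x)) = phi(x)": abs(phi(sigma(t, x)) - phi(x)) < 1e-12,
}
for name, ok in checks.items():
    print(name, ok)
```

Since ρ is diagonal here, every complex power of ρ is computed entrywise, which keeps the sketch free of any linear-algebra library.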
The center Z(M) of M is contained in the fixed point von Neumann subalgebra $\{x \in M ;\ \sigma_t^\varphi(x) = x,\ t \in \mathbb{R}\} \subset M$, which is usually denoted by $M^\varphi$. On the other hand, $J_\varphi z J_\varphi = z^*$ for all z ∈ Z(M). We recall also (see the proof of [28, Lemma 5.2] or [45, Corollary 1.2]):
\[ x \in N_\varphi,\ y \in M^\varphi \ \Longrightarrow\ xy \in N_\varphi, \quad (xy)_\varphi = J_\varphi y^* J_\varphi x_\varphi. \tag{2.7} \]
Let $e \in M^\varphi$ be a projection and let $\varphi_e$ denote the restriction of ϕ to eMe. By [28, Proposition 4.1 and Theorem 4.6] (see also [33, Propositions 4.5 and 4.7]), $\varphi_e$ is a normal semi-finite faithful weight and its modular group is the restriction of the modular group of ϕ to eMe. Thus, if $\pi_e : eMe \to B(eH)$ is the faithful normal ∗-representation which associates to every x ∈ eMe the restriction x | eH considered as a linear operator eH → eH, then the modular group of the weight $\varphi_e \circ \pi_e^{-1}$ on $\pi_e(eMe)$, that is $(\pi_e \circ \sigma_t^{\varphi_e} \circ \pi_e^{-1})_{t\in\mathbb{R}}$, is implemented by the unitary group $(\Delta_\varphi^{it} \mid eH)_{t\in\mathbb{R}}$ on eH. Nevertheless, $\pi_e(eMe)$ is not always in standard form with respect to $\varphi_e \circ \pi_e^{-1}$ (indeed, if $M \subset B(\mathbb{C}^4)$ is a type I2 factor, in standard form with respect to its trace, and e ∈ M is a minimal projection, then $\pi_e(M)$ is one-dimensional, while its commutant $\pi_e(M)'$ is four-dimensional, so $\pi_e(M)$ and $\pi_e(M)'$ are not anti-isomorphic).
However, for any projection $e \in M^\varphi$,
\[ \pi = \pi_{eJ_\varphi eJ_\varphi} : eMe \ni x \mapsto x \mid eJ_\varphi eJ_\varphi H \in B(eJ_\varphi eJ_\varphi H) \tag{2.8} \]
is a faithful normal ∗-representation, such that the von Neumann algebra π(eMe) is in standard form with respect to $\varphi_e \circ \pi^{-1}$ (cf. [19, Lemma 2.6]). Moreover, $\Delta_\varphi$ and $J_\varphi$ commute with $eJ_\varphi eJ_\varphi$ and we have the identifications
\[ \Delta_{\varphi_e \circ \pi^{-1}} = \Delta_\varphi \mid eJ_\varphi eJ_\varphi H, \qquad J_{\varphi_e \circ \pi^{-1}} = J_\varphi \mid eJ_\varphi eJ_\varphi H. \tag{2.9} \]
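For the convenience of the reader, let us outline the proof of (2.9). For the faithfulness of π, let x ∈ eMe be such that $x\,eJ_\varphi eJ_\varphi = 0$. Then
\[ xJ_\varphi Me J_\varphi = xe\,J_\varphi M J_\varphi\,J_\varphi eJ_\varphi = J_\varphi M J_\varphi\,xe\,J_\varphi eJ_\varphi = 0, \]
so $xJ_\varphi$ vanishes on MeH, hence on the range of the central support z(e) ∈ Z(M) of e. Thus $x = xz(e) = xJ_\varphi z(e)J_\varphi = 0$. To see that π(eMe) is in standard form with respect to $\psi = \varphi_e \circ \pi^{-1}$, first we notice that, according to (2.7), $N_\psi = \pi(N_{\varphi_e}) = eN_\varphi e$. Next, the linear map
\[ N_\psi = \pi(N_{\varphi_e}) \ni \pi(x) \mapsto x_\varphi = (exe)_\varphi \overset{(2.7)}{=} eJ_\varphi eJ_\varphi x_\varphi \in eJ_\varphi eJ_\varphi H \]
has dense range. Indeed, every vector in $eJ_\varphi eJ_\varphi H$ belongs to the closure of
\[ eJ_\varphi eJ_\varphi \{x_\varphi ;\ x \in N_\varphi\} \overset{(2.7)}{=} \{(exe)_\varphi ;\ x \in N_\varphi\} = \{x_\varphi ;\ x \in N_{\varphi_e}\}. \]
Finally, for every $\pi(x) \in \pi(N_{\varphi_e}) = N_\psi$ and π(a) ∈ π(eMe) hold true:
\[ \psi\big(\pi(x)^*\pi(x)\big) = \varphi_e(x^*x) = \varphi(x^*x) = \|x_\varphi\|^2, \qquad (ax)_\varphi = a\,x_\varphi = \pi(a)\,x_\varphi. \]
The commutation of $J_\varphi$ with $eJ_\varphi eJ_\varphi$ follows immediately from the commutation of e with $J_\varphi eJ_\varphi$. Let $J_e$ denote the involutive anti-unitary operator $eJ_\varphi eJ_\varphi H \ni \xi \mapsto J_\varphi \xi \in eJ_\varphi eJ_\varphi H$. Further, using (2.2) and $e \in M^\varphi$, we obtain for every t ∈ R:
\[ eJ_\varphi eJ_\varphi \Delta_\varphi^{it} = eJ_\varphi e\Delta_\varphi^{it} J_\varphi = eJ_\varphi \Delta_\varphi^{it} eJ_\varphi = e\Delta_\varphi^{it} J_\varphi eJ_\varphi = \Delta_\varphi^{it} eJ_\varphi eJ_\varphi. \]
Thus also $\Delta_\varphi$ commutes with $eJ_\varphi eJ_\varphi$, so
\[ eJ_\varphi eJ_\varphi H \supset \mathrm{Dom}(\Delta_\varphi) \cap (eJ_\varphi eJ_\varphi H) \ni \xi \mapsto \Delta_\varphi \xi \in eJ_\varphi eJ_\varphi H \]
is an invertible positive self-adjoint operator $\Delta_e$, whose positive self-adjoint square root is
\[ eJ_\varphi eJ_\varphi H \supset \mathrm{Dom}(\Delta_\varphi^{1/2}) \cap (eJ_\varphi eJ_\varphi H) \ni \xi \mapsto \Delta_\varphi^{1/2} \xi \in eJ_\varphi eJ_\varphi H. \]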
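Since, for every $\pi(x) \in \pi(A_{\varphi_e}) = A_\psi$,
\[ S_\psi x_\varphi = (x^*)_\varphi = S_\varphi x_\varphi = J_\varphi \Delta_\varphi^{1/2} x_\varphi = J_e \Delta_\varphi^{1/2} x_\varphi = J_e \Delta_e^{1/2} x_\varphi, \]
we deduce that $S_\psi \subset J_e \Delta_e^{1/2}$. For the equality $S_\psi = J_e \Delta_e^{1/2}$, which will imply (2.9), let $\xi \in \mathrm{Dom}(\Delta_e^{1/2}) = \mathrm{Dom}(\Delta_\varphi^{1/2}) \cap (eJ_\varphi eJ_\varphi H) = \mathrm{Dom}(S_\varphi) \cap (eJ_\varphi eJ_\varphi H)$ be arbitrary. Then there is a sequence $(x_n)_{n\ge1}$ in $A_\varphi$ such that $(x_n)_\varphi \to \xi$ and $(x_n^*)_\varphi \to S_\varphi \xi$. By (2.7) the sequence $(ex_ne)_{n\ge1}$ belongs to $A_{\varphi_e}$ and we have
\[ (ex_ne)_\varphi = eJ_\varphi eJ_\varphi (x_n)_\varphi \to eJ_\varphi eJ_\varphi \xi = \xi, \qquad S_\psi (ex_ne)_\varphi = \big((ex_ne)^*\big)_\varphi = eJ_\varphi eJ_\varphi (x_n^*)_\varphi \to eJ_\varphi eJ_\varphi S_\varphi \xi. \]
Now the closedness of the graph of $S_\psi$ yields $\xi \in \mathrm{Dom}(S_\psi)$. For a projection $p \in Z(M) \subset M^\varphi$ we have $J_\varphi pJ_\varphi = p$, so (2.9) yields
\[ \Delta_{\varphi_p \circ \pi_p^{-1}} = \Delta_\varphi \mid pH, \qquad J_{\varphi_p \circ \pi_p^{-1}} = J_\varphi \mid pH. \tag{2.10} \]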
Let M ≠ {0} be a von Neumann algebra, in standard form with respect to a normal semi-finite faithful weight ϕ on M. Then the Connes spectrum $\Gamma(\sigma^\varphi)$ of the modular automorphism group $\sigma^\varphi$ of ϕ is the intersection of the Arveson spectra of all modular automorphism groups $\sigma^{\varphi_e}$, where e ranges over all non-zero projections $e \in M^\varphi$. By [13, Lemme 1.2.2 and Théorème 2.2.4] (see also [33, Theorem 3.1 and Proposition 16.3]), $\Gamma(\sigma^\varphi)$ is a closed additive subgroup of R and does not depend on the choice of ϕ, so it can be denoted (like in [27, 8.15]) by Γ(M). Furthermore, by [13, Lemme 3.2.2] (see also [33, Proposition 28.1]), λ ∈ Γ(M) if and only if $e^\lambda$ belongs to the spectrum $\sigma(\Delta_{\varphi_e})$ of $\Delta_{\varphi_e}$ for all non-zero projections $e \in M^\varphi$. According to [13, p. 28], the von Neumann algebra M ≠ {0} is said to be of type III1 if Γ(M) = R, or equivalently, if $\sigma(\Delta_{\varphi_e}) = [0, +\infty)$ for every non-zero projection $e \in M^\varphi$. By (2.9) we also have:
\[ M \text{ is of type III}_1 \iff \sigma\big(\Delta_\varphi \mid eJ_\varphi eJ_\varphi H\big) = [0, +\infty) \text{ for every projection } 0 \ne e \in M^\varphi. \tag{2.11} \]
2.2. The general half-sided modular inclusion theorem
Let M be a von Neumann algebra on a Hilbert space H, in standard form with respect to a normal semi-finite faithful weight ϕ on M. Let further N ⊂ M be a von Neumann subalgebra such that the restriction ψ of ϕ to N is semi-finite. If $\{y_\varphi ;\ y \in N_\psi\}$ is dense in H, then N is in standard form with respect to ψ such that $y_\psi = y_\varphi$ for all $y \in N_\psi$. This happens, for example, if N ⊂ M ⊂ B(H) are von Neumann algebras having a common cyclic and separating vector $\xi_o$, and ϕ is the vector form $M \ni x \mapsto (x\xi_o \mid \xi_o)$. In the above situation, owing to (2.5), we have
\[ J_\psi J_\varphi M J_\varphi J_\psi = J_\psi M' J_\psi \subset J_\psi N' J_\psi = N, \]
so the unitary $J_\psi J_\varphi$ implements a unital ∗-homomorphism
\[ M \ni x \mapsto \mathrm{Ad}(J_\psi J_\varphi)(x) = J_\psi J_\varphi x J_\varphi J_\psi \in N \subset M, \]
considered by Longo [24, 25] and called the canonical endomorphism of the inclusion N ⊂ M . The canonical endomorphism γ, in particular the tunnel M ⊃ N ⊃ γ(M ) ⊃ γ(N ) ⊃ γ 2 (M ) ⊃ γ 2 (N ) ⊃ · · · ,
(2.12)
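plays an important role in the Subfactor Theory (see [26] and [22]). Let $P_+^\uparrow(1)$ denote the two-dimensional Lie group generated by the hyperbolic rotations
\[ L_t : \mathbb{R}^2 \ni \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} \mapsto \begin{pmatrix} \cosh(2\pi t) & -\sinh(2\pi t) \\ -\sinh(2\pi t) & \cosh(2\pi t) \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} \in \mathbb{R}^2, \qquad t \in \mathbb{R} \]
and the lightlike translations
\[ T_s : \mathbb{R}^2 \ni \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} \mapsto \begin{pmatrix} \xi_1 + s \\ \xi_2 + s \end{pmatrix} \in \mathbb{R}^2, \qquad s \in \mathbb{R} \]
(cf. [2, Chap. 17, Sec. 2, A]), which is the Poincaré group on the light-ray. Furthermore, the commutation relation
\[ T_s L_t = L_t T_{e^{2\pi t}s}, \qquad s, t \in \mathbb{R} \]
implies that
\[ (T_{s_1} L_{t_1})(T_{s_2} L_{t_2}) = T_{s_1 + e^{-2\pi t_1}s_2}\, L_{t_1 + t_2}. \]
Therefore, endowing $\mathbb{R}^2$ with the Lie group structure defined by the composition law
\[ (s_1, t_1) \cdot (s_2, t_2) = (s_1 + e^{-2\pi t_1}s_2,\ t_1 + t_2), \]
the mapping $\mathbb{R}^2 \ni (s, t) \mapsto T_s L_t \in P_+^\uparrow(1)$ becomes a Lie group isomorphism. In particular, $P_+^\uparrow(1)$ is connected and simply connected. On the other hand, the map
\[ \begin{pmatrix} e^{-2\pi t} & s \\ 0 & 1 \end{pmatrix} \mapsto T_s L_t \]
is a Lie group isomorphism of the two-dimensional 2 × 2 matrix group
\[ G = \left\{ \begin{pmatrix} e^{-2\pi t} & s \\ 0 & 1 \end{pmatrix} ;\ s, t \in \mathbb{R} \right\} \]
onto $P_+^\uparrow(1)$. If we identify $P_+^\uparrow(1)$ with G along the above isomorphism, the Lie algebra $\mathfrak{p}_+^\uparrow(1)$ of $P_+^\uparrow(1)$ will be identified with the Lie algebra $\mathfrak{g}$ of G, and the exponential map $\mathfrak{p}_+^\uparrow(1) \to P_+^\uparrow(1)$ with the exponential map $\mathfrak{g} \to G$, that is with the usual exponentiation of the matrices belonging to $\mathfrak{g}$. We notice that $\mathfrak{g}$ is the set of all 2 × 2 real matrices X such that exp(tX) ∈ G, t ∈ R, and [X, Y] = XY − YX for all X, Y ∈ $\mathfrak{g}$. The elements
\[ X_1 = \begin{pmatrix} -2\pi & 0 \\ 0 & 0 \end{pmatrix}, \qquad X_2 = \begin{pmatrix} -2\pi & 2\pi \\ 0 & 0 \end{pmatrix}, \qquad X_3 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \tag{2.13} \]
of $\mathfrak{g} \equiv \mathfrak{p}_+^\uparrow(1)$ are of particular interest: we have
\[ X_3 = \frac{1}{2\pi}(X_2 - X_1), \qquad [X_2, X_1] = 4\pi^2 X_3, \tag{2.14} \]
any two of $X_1, X_2, X_3$ form a basis for $\mathfrak{g} \equiv \mathfrak{p}_+^\uparrow(1)$ and
\[ \exp(tX_j) = \begin{pmatrix} e^{-2\pi t} & 0 \\ 0 & 1 \end{pmatrix},\quad \begin{pmatrix} e^{-2\pi t} & 1 - e^{-2\pi t} \\ 0 & 1 \end{pmatrix},\quad \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \tag{2.15} \]
for j = 1, 2, 3, respectively.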
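The matrix identities around (2.13)–(2.15) are easy to confirm numerically. The following is a minimal sketch, not part of the original argument: it uses only the 2 × 2 matrices written above, a hand-rolled power series for the matrix exponential, and sample values of s and t chosen arbitrarily; the helper names (`g`, `expm`, `lin`, ...) are ours.

```python
import math

TWO_PI = 2.0 * math.pi

def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(c1, a, c2, b):
    """Linear combination c1*a + c2*b of two 2x2 matrices."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]

def expm(a, terms=80):
    """Matrix exponential by its power series (adequate for these small matrices)."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    result, term = identity, identity
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mul(term, a)]
        result = lin(1.0, result, 1.0, term)
    return result

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def g(s, t):
    """The matrix in G that is identified with T_s L_t."""
    return [[math.exp(-TWO_PI * t), s], [0.0, 1.0]]

X1 = [[-TWO_PI, 0.0], [0.0, 0.0]]
X2 = [[-TWO_PI, TWO_PI], [0.0, 0.0]]
X3 = [[0.0, 1.0], [0.0, 0.0]]

def times(t, x):
    """The matrix t*x."""
    return lin(t, x, 0.0, x)

s1, t1, s2, t2 = 0.3, 0.4, -1.2, -0.25  # arbitrary sample parameters

checks = {
    "X3 = (X2 - X1)/(2 pi)": close(X3, lin(1 / TWO_PI, X2, -1 / TWO_PI, X1)),
    "[X2, X1] = 4 pi^2 X3": close(lin(1.0, mul(X2, X1), -1.0, mul(X1, X2)),
                                  times(TWO_PI ** 2, X3)),
    "exp(t X1) = L_t": close(expm(times(t1, X1)), g(0.0, t1)),
    "exp(t X2) as in (2.15)": close(expm(times(t1, X2)),
                                    g(1.0 - math.exp(-TWO_PI * t1), t1)),
    "exp(t X3) = T_t": close(expm(times(t1, X3)), g(t1, 0.0)),
    "composition law": close(mul(g(s1, t1), g(s2, t2)),
                             g(s1 + math.exp(-TWO_PI * t1) * s2, t1 + t2)),
    "T_s L_t = L_t T_{e^{2 pi t} s}": close(
        mul(g(s1, 0.0), g(0.0, t1)),
        mul(g(0.0, t1), g(math.exp(TWO_PI * t1) * s1, 0.0))),
}
for name, ok in checks.items():
    print(name, ok)
```

In particular exp(tX2) corresponds to $T_{1-e^{-2\pi t}}L_t$, which is the group element that will later be represented by the modular unitaries of the smaller algebra.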
According to the general theory of unitary representations of Lie groups (see, for example, [2, Chap. 11, Sec. 1, B] or [30, Sec. 10.1]), if π is an so-continuous unitary representation of $G \equiv P_+^\uparrow(1)$ on a Hilbert space H and $D_G(\pi)$ denotes the Gårding subspace of H for π, then the formula
\[ d\pi(X)\xi = \frac{d}{dt}\,\pi(\exp(tX))\xi\,\Big|_{t=0}, \qquad X \in \mathfrak{p}_+^\uparrow(1),\ \xi \in D_G(\pi) \]
defines a representation of the Lie algebra $\mathfrak{p}_+^\uparrow(1)$ into the Lie algebra of all skew-symmetric linear mappings $D_G(\pi) \to D_G(\pi)$. Moreover, for every $X \in \mathfrak{p}_+^\uparrow(1)$, the linear mapping $i\,d\pi(X) : H \supset D_G(\pi) \to D_G(\pi) \subset H$ is essentially self-adjoint (see, for example, [2, Chap. 11, Sec. 2, Corollary 4] or [30, Corollary 10.2.11]). Therefore, if $X \in \mathfrak{p}_+^\uparrow(1)$ and A is a self-adjoint linear operator in H, then
\[ \pi(\exp(tX)) = \exp(itA) \text{ for all } t \in \mathbb{R} \ \Longrightarrow\ \overline{d\pi(X)} = iA \tag{2.16} \]
(the first exp is the exponential map of the Lie group $P_+^\uparrow(1)$, while the second one indicates functional calculus). Indeed, by the definition of dπ(X) we have dπ(X) ⊂ iA, so the self-adjoint operator $-i\,\overline{d\pi(X)}$ is contained in the self-adjoint operator A, which implies their equality. We notice for completeness that, according to [16, Theorem 3.3], the Gårding subspace $D_G(\pi)$ is actually equal to the set of all $C^\infty$-vectors for π. We notice also the following simple fact concerning the essential self-adjointness of sums of symmetric operators: if H is a Hilbert space, D ⊂ H is a dense linear subspace and A, B : D → D are linear operators, then
\[ A, B \text{ symmetric},\ A + B \text{ essentially self-adjoint} \ \Longrightarrow\ \bar A + \bar B \subset \overline{A + B}. \tag{2.17} \]
Consequently also $\bar A + \bar B$ is essentially self-adjoint and $\overline{\bar A + \bar B} = \overline{A + B}$. To prove (2.17), let $\eta \in \mathrm{Dom}(\bar A) \cap \mathrm{Dom}(\bar B)$ be arbitrary. Then
\[ \big(\eta \mid (A + B)\xi\big) = (\eta \mid A\xi) + (\eta \mid B\xi) = (\bar A\eta \mid \xi) + (\bar B\eta \mid \xi) = \big((\bar A + \bar B)\eta \mid \xi\big), \qquad \xi \in D, \]
so η is in the domain of $(A + B)^* = \overline{A + B}$ and $\overline{(A + B)}\,\eta = (A + B)^*\eta = (\bar A + \bar B)\eta$. (2.17) implies that, for any so-continuous unitary representation π of $P_+^\uparrow(1)$,
\[ \overline{\overline{d\pi(X)} + \overline{d\pi(Y)}} = \overline{d\pi(X + Y)}, \qquad X, Y \in \mathfrak{p}_+^\uparrow(1). \tag{2.18} \]
Theorem 2.1 (General Half-sided Modular Inclusion Theorem). Let M be a von Neumann algebra on a Hilbert space H, in standard form with respect to a normal semi-finite faithful weight ϕ on M, and N ⊂ M a von Neumann subalgebra such that the restriction ψ of ϕ to N is semi-finite and N is in standard form with respect to ψ. Let us denote for convenience
\[ \Delta_M = \Delta_\varphi, \quad J_M = J_\varphi \qquad \text{and} \qquad \Delta_N = \Delta_\psi, \quad J_N = J_\psi \]
and assume the following half-sided modular inclusion:
\[ \Delta_M^{it} N \Delta_M^{-it} \subset N, \qquad t \le 0. \tag{2.19} \]
Then
\[ \frac{1}{2\pi}\,(\log \Delta_N - \log \Delta_M), \tag{2.20} \]
defined on the intersection of the domains of $\log \Delta_N$ and $\log \Delta_M$, is an essentially self-adjoint operator with positive self-adjoint closure P and, letting
\[ U(s) = \exp(isP), \qquad s \in \mathbb{R}, \tag{2.21} \]
we have the following:
(1) $\Delta_M^{-it} U(s) \Delta_M^{it} = \Delta_N^{-it} U(s) \Delta_N^{it} = U(e^{2\pi t}s)$, s, t ∈ R;
(2) $J_M U(s) J_M = J_N U(s) J_N = U(-s)$, s ∈ R;
(3) $U(1 - e^{2\pi t}) = \Delta_N^{-it} \Delta_M^{it}$ and $\Delta_N^{it} = U(1) \Delta_M^{it} U(1)^*$, t ∈ R;
(4) $U(2) = J_N J_M$ and $J_N = U(1) J_M U(1)^*$;
(5) $N = U(1) M U(1)^*$;
(6) $U(s) M U(s)^* \subset M$, s ≥ 0;
(7) $\gamma_s = \mathrm{Ad}\,U(s)$, s ≥ 0 is an so-continuous one-parameter semigroup of ∗-endomorphisms of M such that $\gamma_2$ is equal to the canonical endomorphism $\gamma = \mathrm{Ad}(J_N J_M)$ of the inclusion N ⊂ M; thus $\gamma_s(M)$, s ≥ 0, provide a continuous interpolation of the tunnel (2.12):
\[ M \supset \gamma_1(M) = N \supset \gamma_2(M) = \gamma(M) \supset \gamma_3(M) = \gamma(N) \supset \gamma_4(M) = \gamma^2(M) \supset \gamma_5(M) = \gamma^2(N) \supset \cdots; \]
(8) For x ∈ M and s ≥ 0 we have
\[ x \in N_\varphi \iff U(s) x U(s)^* \in N_\varphi \ \Longrightarrow\ \big(U(s) x U(s)^*\big)_\varphi = U(s) x_\varphi; \]
(9) The weight ϕ on M is invariant under $\gamma_s$ for every s ≥ 0;
(10) $\{\Delta_M^{it}, \Delta_N^{is} ;\ t, s \in \mathbb{R}\}$ generates a group of unitary operators on H, which is the image of an so-continuous unitary representation π of $P_+^\uparrow(1)$ on H, uniquely determined by any two of the relations
\[ \overline{d\pi(X_1)} = i \log \Delta_M, \qquad \overline{d\pi(X_2)} = i \log \Delta_N, \qquad \overline{d\pi(X_3)} = iP, \]
where $X_1, X_2, X_3 \in \mathfrak{p}_+^\uparrow(1)$ are the Lie algebra elements defined in (2.13).
Remark. This theorem is a generalization of Wiesbrock’s statement, where ϕ is assumed to be bounded.
The strategy of our proof. The proof will be given in Sec. 7 in several steps.
(i) First we study $\Delta_N^{-it} \Delta_M^{it}$ by using the Modular Extension Theorem which will be proved in Sec. 5.
(ii) Then we show that $\Delta_N^{-it} \Delta_M^{it}$ has a strong operator limit T for t → −∞. For the existence of the wave operator T we use our generalization of the Borchers Structure Theorem which will be proved in Sec. 6.
(iii) Using the above ingredients, we define an so-continuous one-parameter family U(s), s ∈ R, of unitaries.
(iv) We prove that the defined family U(s), s ∈ R, is a one-parameter group having positive generator P and we verify that for it the statements (1)–(10) in Theorem 2.1 hold.
(v) Using that the generator P of $\mathbb{R} \ni s \mapsto U(s)$ satisfies (10), (2.18) will imply that P is the closure of the operator (2.20).
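Statement (10) says that the modular unitaries of M and N and the group U(s) together represent $P_+^\uparrow(1)$, with $\Delta_M^{it}$, $\Delta_N^{it}$ and U(s) corresponding to exp(tX1), exp(tX2) and exp(sX3). The sketch below checks the group-level counterparts of some relations of Theorem 2.1 inside the 2 × 2 matrix group G of Sec. 2.2. It is only an illustration under that dictionary: the matrices are not unitary operators, the function names are ours, and the identity $\Delta_N^{-it}\Delta_M^{it} = U(1 - e^{2\pi t})$ is used in the form in which we read relation (3).

```python
import math

TWO_PI = 2.0 * math.pi

def g(s, t):
    """The element T_s L_t of the matrix group G of Sec. 2.2."""
    return [[math.exp(-TWO_PI * t), s], [0.0, 1.0]]

def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def delta_m(t):
    """exp(t X1) = L_t, standing in for Delta_M^{it}."""
    return g(0.0, t)

def delta_n(t):
    """exp(t X2) from (2.15), standing in for Delta_N^{it}."""
    return g(1.0 - math.exp(-TWO_PI * t), t)

def u(s):
    """exp(s X3) = T_s, standing in for U(s)."""
    return g(s, 0.0)

s, t = 0.7, -0.3  # arbitrary sample parameters
scaled = u(math.exp(TWO_PI * t) * s)

checks = {
    "(1) for M": close(mul(mul(delta_m(-t), u(s)), delta_m(t)), scaled),
    "(1) for N": close(mul(mul(delta_n(-t), u(s)), delta_n(t)), scaled),
    "Delta_N^{it} = U(1) Delta_M^{it} U(1)^*": close(
        delta_n(t), mul(mul(u(1.0), delta_m(t)), u(-1.0))),
    "Delta_N^{-it} Delta_M^{it} = U(1 - e^{2 pi t})": close(
        mul(delta_n(-t), delta_m(t)), u(1.0 - math.exp(TWO_PI * t))),
}
for name, ok in checks.items():
    print(name, ok)
```

The last identity also makes step (ii) of the strategy plausible at the group level: as t → −∞ the parameter $1 - e^{2\pi t}$ tends to 1, so the product tends to the element standing in for U(1).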
We also prove the following result about the structure of half-sided modular inclusions:
Theorem 2.2. Under the assumptions and with the notation as in Theorem 2.1, the following hold:
(1) $\gamma_s(z) = z$ for s ≥ 0 and z ∈ Z(M), so $Z(\gamma_s(M)) = Z(M)$ for all s ≥ 0;
(2) There exists the greatest central projection p of M satisfying Mp = Np. For a projection e ∈ M we have
\[ e \le p \iff U(s)e = e \text{ for all } s \in \mathbb{R}, \]
while for a projection $e \in M^\varphi$ with $\gamma_s(e) = e$, s ≥ 0, we even have
\[ e \le p \iff U(s)\,eJ_M eJ_M = eJ_M eJ_M \text{ for all } s \in \mathbb{R}; \]
(3) $M^\varphi \subset \bigcap_{s \ge 0} \gamma_s(M) \ \Longrightarrow\ \bigcap_{s \ge 0} \gamma_s(M) = \{x \in M ;\ \gamma_s(x) = x,\ s \ge 0\}$;
(4) $\overline{M_\varphi \cap M^\varphi}^{\,so} = M^\varphi \ \Longrightarrow\ M^\varphi \subset \{x \in M ;\ \gamma_s(x) = x,\ s \ge 0\} \ \Longrightarrow\ M(1_H - p)$ and $N(1_H - p)$ are of type III1 whenever $p \ne 1_H$.
Remarks. (1) is proved in the case of bounded ϕ in [9, Theorem 2.4], and for the general case we shall essentially repeat the same proof. If ϕ is bounded, then the equality $\bigcap_{s \ge 0} \gamma_s(M) = \{x \in M ;\ \gamma_s(x) = x,\ s \ge 0\}$ in (3) follows from [24, Corollary 2.2], but for the proof of (3) in our setting we need a different method. Finally, for bounded ϕ, the inclusion $M^\varphi \subset \{x \in M ;\ \gamma_s(x) = x,\ s \ge 0\}$ and the type of $M(1_H - p)$ and $N(1_H - p)$ were established in [37] if M is a factor, and in [8, 9] in the case of a general M. However, there is a gap in the proof in [8, 9]: it is shown only that, for every $e \in M^\varphi$ majorized by $1_H - p$, the spectrum of $\Delta_\varphi \mid eH$ is [0, +∞), while a correct proof requires that the spectrum of the modular operator of $\varphi_e$, which is $\Delta_\varphi \mid eJ_\varphi eJ_\varphi H$ (see (2.9)), be equal to [0, +∞).
We do not know if the above inclusion still holds without assuming the strong operator density of $M_\varphi \cap M^\varphi$ in $M^\varphi$. We notice that, by (3) and (4) in the above theorem, if $M^\varphi \subset \bigcap_{s \ge 0} \gamma_s(M)$ would hold in general, then we would always have:
\[ M^\varphi \subset \{x \in M ;\ \gamma_s(x) = x,\ s \ge 0\} = \bigcap_{s \ge 0} \gamma_s(M), \]
\[ M(1_H - p),\ N(1_H - p) \text{ are of type III}_1 \text{ in the case } p \ne 1_H. \]
2.3. The analytic extension theorem
Let β ∈ R, β ≠ 0. Set $S_\beta = \{z \in \mathbb{C} ;\ 0 < \beta^{-1}\,\mathrm{Im}\,z < 1\}$. $H^\infty(S_\beta)$ will denote the Banach algebra of all bounded analytic complex functions on $S_\beta$. Most parts of the following Analytic Extension Theorem are known, but we shall give a proof for the convenience of the reader.
Theorem 2.3 (Analytic Extension Theorem). Let A and B be invertible positive self-adjoint linear operators on the Hilbert spaces H and K respectively, 0 ≠ β ∈ R, and T ∈ B(K, H). Then the next statements (1)–(5) are equivalent:
(1) $\mathbb{R} \ni s \mapsto A^{is} T B^{-is}$ has a uniformly bounded so-continuous extension
\[ \overline{S_\beta} \ni z \mapsto T(z) \in B(K, H) \tag{2.22} \]
which is analytic in $S_\beta$;
(2) $\mathbb{R} \ni s \mapsto A^{is} T B^{-is}$ has a wo-continuous extension (2.22) which is analytic in $S_\beta$;
(3) There exists a Borel set $\Xi_o \subset \mathbb{R}$ of non-zero Lebesgue measure such that, for every ξ ∈ K and η ∈ H, there exists an $f_{\xi,\eta} \in H^\infty(S_\beta)$ satisfying
\[ \lim_{0 < t/\beta \to 0} f_{\xi,\eta}(s + it) = (A^{is} T B^{-is} \xi \mid \eta) \tag{2.23} \]
for almost all $s \in \Xi_o$;
(4) $A^{-\beta} T B^{\beta}$ is defined and bounded on a core of $B^{\beta}$;
(5) $\mathrm{Dom}(A^{-\beta} T B^{\beta}) = \mathrm{Dom}(B^{\beta})$ and $A^{-\beta} T B^{\beta}$ is bounded.
Moreover, if the above conditions are satisfied, then, for every $z \in \overline{S_\beta}$,
\[ \mathrm{Dom}(A^{iz} T B^{-iz}) = \mathrm{Dom}(B^{-iz}), \tag{2.24} \]
\[ A^{iz} T B^{-iz} \subset T(z), \tag{2.25} \]
\[ A^{it} T(z) B^{-it} = T(z + t), \qquad t \in \mathbb{R}. \tag{2.26} \]
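In finite dimensions all five conditions of Theorem 2.3 hold automatically, but the formulas can still be made concrete. The sketch below is an illustration under our own assumptions: A and B are hypothetical positive diagonal 2 × 2 matrices, T is an arbitrary 2 × 2 matrix, and β = −1/2. It evaluates the extension $T(z) = A^{iz} T B^{-iz}$ and checks the boundary value at z = iβ, the relation (2.26), and the uniform bound on the strip at one sample point.

```python
import cmath
import math

A = (2.0, 0.5)    # eigenvalues of an assumed positive diagonal operator A
B = (3.0, 0.25)   # eigenvalues of an assumed positive diagonal operator B
T = [[1.0, -2.0], [0.5, 1.5]]
BETA = -0.5

def conj_pow(w1, w2, m):
    """Return A**w1 * m * B**w2 for complex exponents w1, w2 (A, B diagonal)."""
    return [[cmath.exp(w1 * math.log(A[j])) * m[j][k] * cmath.exp(w2 * math.log(B[k]))
             for k in range(2)] for j in range(2)]

def t_ext(z):
    """The analytic extension T(z) = A^{iz} T B^{-iz} of s -> A^{is} T B^{-is}."""
    return conj_pow(1j * z, -1j * z, T)

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

z = 0.3 - 0.2j          # Im z / BETA = 0.4, so z lies in the open strip S_beta
t = 1.1

# Boundary value at z = i*beta: T(i beta) = A^{-beta} T B^{beta}, cf. (4) and (5).
boundary_ok = close(t_ext(1j * BETA), conj_pow(-BETA, BETA, T))

# Relation (2.26): A^{it} T(z) B^{-it} = T(z + t).
shift_ok = close(conj_pow(1j * t, -1j * t, t_ext(z)), t_ext(z + t))

# Each matrix entry of T(z) is bounded on the strip by its larger boundary modulus.
edge = t_ext(1j * BETA)
bound_ok = all(abs(t_ext(z)[j][k]) <= max(abs(T[j][k]), abs(edge[j][k])) + 1e-12
               for j in range(2) for k in range(2))

print(boundary_ok, shift_ok, bound_ok)
```

The entrywise bound works because $|T(z)_{jk}| = |T_{jk}|\,e^{-\mathrm{Im}\,z\,(\log a_j - \log b_k)}$ is monotone in Im z; in infinite dimensions the corresponding boundedness is exactly what conditions (4) and (5) encode.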
Remark. A somewhat novel feature is the non-null Borel set $\Xi_o$ in (3). We shall apply this theorem in the case K = H, $T = 1_H$, $A = \Delta_M$, $B = \Delta_N$ and β = −1/2 in the proof of the Modular Extension Theorem (Theorem 2.12). We notice that, for A and B as in Theorem 2.3, $\mathbb{R} \ni s \mapsto \alpha_s = A^{is} \cdot B^{-is}$ is an so-continuous one-parameter group of linear isometries on B(K, H). If, for
some T ∈ B(K, H) and z ∈ C, the orbit $\mathbb{R} \ni s \mapsto A^{is} T B^{-is}$ has an so-continuous extension $\overline{S_{\mathrm{Im}\,z}} \ni \zeta \mapsto T(\zeta) \in B(K, H)$, which is analytic in $S_{\mathrm{Im}\,z}$, then we say that T belongs to the domain of $\alpha_z$ and define $\alpha_z(T) = T(z)$. According to Theorem 2.3, for T ∈ B(K, H) the conditions
• $T \in \mathrm{Dom}(\alpha_z)$,
• $A^{iz} T B^{-iz}$ is defined and bounded on a core of $B^{-iz}$,
• $\mathrm{Dom}(A^{iz} T B^{-iz}) = \mathrm{Dom}(B^{-iz})$ and $A^{iz} T B^{-iz}$ is bounded
are equivalent and, if they are satisfied, then $\alpha_z(T) = \overline{A^{iz} T B^{-iz}}$. For a more detailed study of the analytic extension operator $\alpha_z$, especially in the most relevant case z = −i, when it is called the analytic generator of the group α, we refer to [12], [43] and [44]. Let us point out that, in general, the five equivalent statements in Theorem 2.3 are not equivalent with
(4′) $A^{-\beta} T B^{\beta}$ is densely defined and bounded.
Indeed:
Proposition 2.4. There exists an invertible positive self-adjoint linear operator A on a Hilbert space H and a unitary v ∈ B(H), such that $A^{-1} v A$ is densely defined and bounded, but $\mathrm{Dom}(A^{-1} v A)$ is not a core for A.
Remark. Actually the proof of Proposition 2.4, which will be given in Sec. 3, works to prove the next more general statement: If A and B are invertible positive self-adjoint linear operators on a non-zero Hilbert space, such that
\[ A^{it} B^{is} A^{-it} = e^{-its} B^{is}, \qquad t, s \in \mathbb{R}, \]
hence
\[ A^{it} \underbrace{\exp(-B^{\pi})}_{=\,b} A^{-it} = \exp(-B^{\pi})^{e^{-\pi t}}, \qquad t \in \mathbb{R}, \]
and further,
\[ A^{it} b^{is} A^{-it} = b^{ise^{-\pi t}}, \qquad t, s \in \mathbb{R}, \]
then, for every s > 0, $\mathrm{Dom}(A^{-1} b^{is} A) = \mathrm{Dom}(A)$ and $A^{-1} b^{is} A \subset b^{-is}$, while $A^{-1} b^{-is} A$ is densely defined and $A^{-1} b^{-is} A \subset b^{is}$, but $\mathrm{Dom}(A^{-1} b^{-is} A)$ is not a core of A.
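Nevertheless, there are situations in which the above statement (4′) is equivalent with the statements (1)–(5) in Theorem 2.3. One of such situations occurs in [36, Lemma 15.15 and Theorem 15.3], namely: Let M ⊂ B(H) be a von Neumann algebra, in standard form with respect to a normal semi-finite faithful weight ϕ on M. Then, for any $a \in N_\varphi$, the following are equivalent:
(a) $\Delta_\varphi^{-1/2} a \Delta_\varphi^{1/2}$ is densely defined and $\|\Delta_\varphi^{-1/2} a \Delta_\varphi^{1/2}\| \le 1$,
(b) $\|\Delta_\varphi^{1/2} a^* \Delta_\varphi^{-1/2}\| \le 1$,
(c) $\mathrm{Dom}(\Delta_\varphi^{-1/2} a \Delta_\varphi^{1/2}) = \mathrm{Dom}(\Delta_\varphi^{1/2})$ and $\|\Delta_\varphi^{-1/2} a \Delta_\varphi^{1/2}\| \le 1$.
Indeed, if (a) holds and $\eta \in \mathrm{Dom}(\Delta_\varphi^{1/2} a^* \Delta_\varphi^{-1/2})$, then
\[ |(\Delta_\varphi^{1/2} a^* \Delta_\varphi^{-1/2} \eta \mid \xi)| = |(\eta \mid \Delta_\varphi^{-1/2} a \Delta_\varphi^{1/2} \xi)| \le \|\eta\|\,\|\xi\|, \qquad \xi \in \mathrm{Dom}(\Delta_\varphi^{-1/2} a \Delta_\varphi^{1/2}). \]
Since $\mathrm{Dom}(\Delta_\varphi^{-1/2} a \Delta_\varphi^{1/2})$ is dense in H, we get $\|\Delta_\varphi^{1/2} a^* \Delta_\varphi^{-1/2} \eta\| \le \|\eta\|$. Next, $\mathrm{Dom}(\Delta_\varphi^{1/2} a^* \Delta_\varphi^{-1/2})$ always contains the core $\{J_\varphi x_\varphi ;\ x \in A_\varphi\}$ of $\Delta_\varphi^{-1/2}$. Indeed, if $x \in A_\varphi$, then $xa \in A_\varphi$ and so $a^* \Delta_\varphi^{-1/2} J_\varphi x_\varphi = a^* S_\varphi x_\varphi = a^* (x^*)_\varphi = ((xa)^*)_\varphi$ belongs to $\mathrm{Dom}\,S_\varphi = \mathrm{Dom}\,\Delta_\varphi^{1/2}$. Consequently, according to Theorem 2.3, (b) implies that $a^* \in \mathrm{Dom}(\sigma_{-i/2}^\varphi)$ and $\|\sigma_{-i/2}^\varphi(a^*)\| \le 1$. But then $a \in \mathrm{Dom}(\sigma_{i/2}^\varphi)$ and $\sigma_{i/2}^\varphi(a) = \sigma_{-i/2}^\varphi(a^*)^*$, hence $\|\sigma_{i/2}^\varphi(a)\| = \|\sigma_{-i/2}^\varphi(a^*)\| \le 1$. Using Theorem 2.3 again, we obtain that (c) holds. Finally, the implication (c) ⇒ (a) is trivial.
2.4. Lebesgue continuity, Tomita algebras
Let M ⊂ B(H) be a von Neumann algebra, in standard form with respect to a normal semi-finite faithful weight ϕ on M. The next lemma shows that $1_H$ can be approximated by particularly regular elements of $M_\varphi$ with respect to the so-topology:
Lemma 2.5. There is an increasing net $\{a_\iota\}_\iota$ in $M_\varphi \cap M^+$ such that, for any ι, the orbit $\mathbb{R} \ni s \mapsto \sigma_s^\varphi(a_\iota) \in M$ has an entire extension $\mathbb{C} \ni z \mapsto \sigma_z^\varphi(a_\iota) \in M$ and
• $\sigma_z^\varphi(a_\iota) \in M_\varphi$, $\sigma_z^\varphi(a_\iota)^* = \sigma_{\bar z}^\varphi(a_\iota)$, $\|\sigma_z^\varphi(a_\iota)\| \le e^{(\mathrm{Im}\,z)^2}$ for all ι and z ∈ C,
• so-$\lim_\iota \sigma_z^\varphi(a_\iota) = 1_H$ for all z ∈ C.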
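Nets $\{a_\iota\}_\iota$ as in Lemma 2.5 (called regularizing nets for $\varphi$ in [45, Sec. 1]) will be used to prove the following description of $\mathfrak{N}_\varphi$:

Lemma 2.6. (1) For $x \in M$ and $c \ge 0$,
\[
x \in \mathfrak{N}_\varphi \ \text{and}\ \|x_\varphi\| \le c \iff \|x J_\varphi y_\varphi\| \le c\,\|y\| \ \text{for all } y \in \mathfrak{M}_\varphi .
\]
(2) For $x \in M$ and $\xi \in H$,
\[
x \in \mathfrak{N}_\varphi \ \text{and}\ x_\varphi = \xi \iff x J_\varphi y_\varphi = J_\varphi y J_\varphi \xi \ \text{for all } y \in \mathfrak{M}_\varphi .
\]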
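Using the above lemma, we get immediately
\[
f \in L^1(\mathbb{R}),\ x \in \mathfrak{N}_\varphi \ \Longrightarrow\
\begin{cases}
\text{wo-}\!\int_{\mathbb{R}} f(t)\,\sigma_t^\varphi(x)\,dt \in \mathfrak{N}_\varphi \ \text{and} \\[2pt]
\Big(\text{wo-}\!\int_{\mathbb{R}} f(t)\,\sigma_t^\varphi(x)\,dt\Big)_\varphi = \int_{\mathbb{R}} f(t)\,\Delta_\varphi^{it} x_\varphi\,dt = \hat f(\log \Delta_\varphi)\,x_\varphi ,
\end{cases}
\tag{2.27}
\]
where $\hat f$ is the inverse Fourier transform of $f$:
\[
\hat f(\lambda) = \int_{\mathbb{R}} f(t)\,e^{i\lambda t}\,dt, \qquad \lambda \in \mathbb{R}. \tag{2.28}
\]
Indeed, by (2.5) and (2.3) we have for every $y \in \mathfrak{N}_\varphi$,
\[
\Big(\text{wo-}\!\int_{\mathbb{R}} f(t)\,\sigma_t^\varphi(x)\,dt\Big) J_\varphi y_\varphi
= \int_{\mathbb{R}} f(t)\,\sigma_t^\varphi(x) J_\varphi y_\varphi\,dt
= \int_{\mathbb{R}} f(t)\,J_\varphi y J_\varphi\, \sigma_t^\varphi(x)_\varphi\,dt
= J_\varphi y J_\varphi \int_{\mathbb{R}} f(t)\,\Delta_\varphi^{it} x_\varphi\,dt,
\]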
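so we can apply Lemma 2.6(2) to $\text{wo-}\!\int_{\mathbb{R}} f(t)\,\sigma_t^\varphi(x)\,dt$ and $\int_{\mathbb{R}} f(t)\,\Delta_\varphi^{it} x_\varphi\,dt$.

If $\varphi$ is bounded, then the linear mapping $M \ni x \mapsto x_\varphi = x\xi_\varphi \in H$ is bounded, but its inverse is in general not bounded. For unbounded $\varphi$, even $x \mapsto x_\varphi$ is not bounded. Nevertheless, both $\mathfrak{N}_\varphi \ni x \mapsto x_\varphi \in H$ and its inverse have a dominated continuity property with respect to the wo-topology on $M$ and the weak topology on $H$, called Lebesgue continuity in [43, Sec. 2]. For the proof of Theorem 2.1 we need the following variant of [43, Sec. 4.6, Propositions 1 and 2], concerning the Lebesgue continuity of $x \mapsto x_\varphi$ and $x_\varphi \mapsto x$:

Proposition 2.7. Let $M \subset B(H)$ be a von Neumann algebra, in standard form with respect to a normal semi-finite faithful weight $\varphi$ on $M$, and $\{x_\iota\}_\iota \subset \mathfrak{N}_\varphi$ a net.
(1) If $\text{wo-}\lim_\iota x_\iota = x \in M$ and $\sup_\iota \|(x_\iota)_\varphi\| < \infty$, then $x \in \mathfrak{N}_\varphi$ and $(x_\iota)_\varphi \to x_\varphi$ in the weak topology of $H$.
(2) If $(x_\iota)_\varphi \to \xi \in H$ in the weak topology of $H$ and $\sup_\iota \|x_\iota\| < \infty$, then there exists $x \in \mathfrak{N}_\varphi$ such that $\text{wo-}\lim_\iota x_\iota = x$ and $x_\varphi = \xi$.

Let $\mathfrak{T}_\varphi$ denote the set of all $x \in \mathfrak{A}_\varphi$ such that $\mathbb{R} \ni s \mapsto \sigma_s^\varphi(x) \in M$ has an entire extension $\mathbb{C} \ni z \mapsto \sigma_z^\varphi(x) \in M$ satisfying $\sigma_z^\varphi(x) \in \mathfrak{A}_\varphi$ for all $z \in \mathbb{C}$. Since
\[
x, y \in \mathfrak{T}_\varphi \ \Longrightarrow\ xy \in \mathfrak{T}_\varphi \ \text{and}\ \sigma_z^\varphi(xy) = \sigma_z^\varphi(x)\,\sigma_z^\varphi(y), \quad z \in \mathbb{C},
\]
\[
x \in \mathfrak{T}_\varphi \ \Longrightarrow\ x^* \in \mathfrak{T}_\varphi \ \text{and}\ \sigma_z^\varphi(x^*) = \sigma_{\bar z}^\varphi(x)^*, \quad z \in \mathbb{C},
\]
$\mathfrak{T}_\varphi$ is a ∗-subalgebra of $\mathfrak{A}_\varphi$, called the (maximal) Tomita algebra of $\varphi$. In the next variant of [35, 10.21, Corollary 1], certain standard properties of the Tomita algebra $\mathfrak{T}_\varphi$ are formulated.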
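Proposition 2.8. Let $M \subset B(H)$ be a von Neumann algebra, in standard form with respect to a normal semi-finite faithful weight $\varphi$ on $M$. Then
\[
x \in \mathfrak{T}_\varphi,\ z \in \mathbb{C} \ \Longrightarrow\ x_\varphi \in \mathrm{Dom}(\Delta_\varphi^{iz}) \ \text{and}\ \sigma_z^\varphi(x)_\varphi = \Delta_\varphi^{iz} x_\varphi \tag{2.29}
\]
and, for every $y \in \mathfrak{A}_\varphi$, there exists a sequence $\{y_n\}_{n \ge 1}$ in $\mathfrak{T}_\varphi$ such that
• $y_n \xrightarrow{\ so\ } y$ and $y_n^* \xrightarrow{\ so\ } y^*$,
• $(y_n)_\varphi \to y_\varphi$ and $(y_n^*)_\varphi \to (y^*)_\varphi$ in the norm-topology of $H$,
• $\|\sigma_z^\varphi(y_n)\| \le e^{n(\operatorname{Im} z)^2}\|y\|$ for all $n \ge 1$ and $z \in \mathbb{C}$,
• $\|\Delta_\varphi^{iz}(y_n)_\varphi\| \le e^{n(\operatorname{Im} z)^2}\|y_\varphi\|$ and $\|\Delta_\varphi^{iz}(y_n^*)_\varphi\| \le e^{n(\operatorname{Im} z)^2}\|(y^*)_\varphi\|$ for all $n \ge 1$ and $z \in \mathbb{C}$.

We notice that the set $\mathfrak{S}_\varphi$ of all $x \in \mathfrak{T}_\varphi$ for which
\[
\|\sigma_z^\varphi(x)\| \le e^{c(x)|\operatorname{Im} z|}\,\|x\|, \qquad \|\Delta_\varphi^{iz} x_\varphi\| \le e^{c(x)|\operatorname{Im} z|}\,\|x_\varphi\|, \qquad z \in \mathbb{C},
\]
with $c(x) \ge 0$ a constant depending only on $x$, is a ∗-subalgebra of $\mathfrak{T}_\varphi$, and for every $y \in \mathfrak{A}_\varphi$ there exists a sequence $\{y_n\}_{n \ge 1}$ in $\mathfrak{S}_\varphi$ such that
• $y_n \xrightarrow{\ so\ } y$ and $y_n^* \xrightarrow{\ so\ } y^*$,
• $(y_n)_\varphi \to y_\varphi$ and $(y_n^*)_\varphi \to (y^*)_\varphi$ in the norm-topology of $H$
(see [35, 10.22]).

2.5. Hermitian maps

Let $H$, $K$ be Hilbert spaces and $M \subset B(H)$, $N \subset B(K)$ von Neumann algebras, in standard form with respect to the normal semi-finite faithful weights $\varphi$ on $M$ and $\psi$ on $N$. An essential role will be played by the fixed point real linear subspaces of $K$ and $H$ under $S_\psi$ and $S_\varphi$, respectively:
\[
K^{S_\psi} = \{\xi \in \mathrm{Dom}(S_\psi);\ S_\psi \xi = \xi\}, \qquad H^{S_\varphi} = \{\eta \in \mathrm{Dom}(S_\varphi);\ S_\varphi \eta = \eta\}.
\]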
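They have been used by various authors earlier (see, for example, [29, 15]). Let us formulate the basic properties, for example, of $K^{S_\psi}$:

Lemma 2.9.
(1) $K^{S_\psi} = \overline{\{x_\psi ;\ x^* = x \in \mathfrak{N}_\psi\}}$.
(2) $\xi \in K$ belongs to $K^{S_\psi}$ if and only if $(\xi \mid J_\psi x_\psi) \in \mathbb{R}$ for all $x^* = x \in \mathfrak{N}_\psi$.
(3) $\xi \in K$ belongs to $K^{S_\psi}$ if and only if $(\xi \mid J_\psi x_\psi) = (J_\psi (x^*)_\psi \mid \xi)$ for all $x \in \mathfrak{A}_\psi$.
(4) $\mathrm{Dom}\, S_\psi = K^{S_\psi} + iK^{S_\psi}$.

Definition 2.10.
(1) $T \in B(K, H)$ is said to be Hermitian with respect to the weight pair $(\psi, \varphi)$ if $T K^{S_\psi} \subset H^{S_\varphi}$.
(2) $T \in B(K, H)$ is said to implement $\psi$ in $\varphi$ if
\[
x \in \mathfrak{N}_\psi \ \Longrightarrow\ TxT^* \in \mathfrak{N}_\varphi, \quad (TxT^*)_\varphi = T x_\psi .
\]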
Statement (3) in the next lemma explains why we call the fulfilment of the implication in Definition 2.10(2) “implementation of ψ in ϕ by T ”.
Lemma 2.11.
(1) $T \in B(K, H)$ is Hermitian with respect to $(\psi, \varphi)$ whenever it implements $\psi$ in $\varphi$.
(2) If $T \in B(K, H)$ implements $\psi$ in $\varphi$, then $T N T^* \subset M$.
(3) If an isometric $T \in B(K, H)$ implements $\psi$ in $\varphi$, then $N \ni x \mapsto TxT^* \in M$ is an injective ∗-homomorphism and
\[
\psi(a) = \varphi(T a T^*), \qquad 0 \le a \in \mathfrak{M}_\psi .
\]
(4) For bounded $\psi$ and $\varphi$ and the corresponding cyclic and separating vectors $\xi_\psi = (1_K)_\psi$ and $\eta_\varphi = (1_H)_\varphi$, an injective $T \in B(K, H)$ implements $\psi$ in $\varphi$ if and only if
\[
T N T^* \subset M \quad \text{and} \quad T^* \eta_\varphi = \xi_\psi .
\]
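The following result provides important criteria for Hermiticity:

Theorem 2.12 (Modular Extension Theorem). Let $M \subset B(H)$, $N \subset B(K)$ be von Neumann algebras, in standard form with respect to the normal semi-finite faithful weights $\varphi$ on $M$ and $\psi$ on $N$. Then for $T \in B(K, H)$ the following conditions (1)–(8) are equivalent:

(1) $T$ is Hermitian with respect to $(\psi, \varphi)$;
(2) $T x_\psi \in H^{S_\varphi}$ for all $x^* = x \in \mathfrak{N}_\psi$;
(3) $(T x_\psi \mid J_\varphi y_\varphi) \in \mathbb{R}$ for all $x^* = x \in \mathfrak{N}_\psi$ and $y^* = y \in \mathfrak{N}_\varphi$;
(4) for every $x \in \mathfrak{A}_\psi$ and $y \in \mathfrak{A}_\varphi$ we have $(T x_\psi \mid J_\varphi y_\varphi) = (J_\varphi (y^*)_\varphi \mid T (x^*)_\psi)$;
(5) $T S_\psi \subset S_\varphi T$;
(6) $\Delta_\varphi^{1/2} T \Delta_\psi^{-1/2}$ is defined on $\mathrm{Dom}\, \Delta_\psi^{-1/2}$ and coincides there with $J_\varphi T J_\psi$;
(7) $J_\psi T^* J_\varphi$ is Hermitian with respect to $(\varphi, \psi)$;
(8) (Modular Extension Condition) $\mathbb{R} \ni s \mapsto \Delta_\varphi^{is} T \Delta_\psi^{-is} \in B(K, H)$ extends to a bounded so-continuous map $\overline{S_{-1/2}} \ni z \mapsto T(z) \in B(K, H)$, analytic in $S_{-1/2}$ and satisfying
\[
T\Big(-\frac{i}{2}\Big) = J_\varphi T J_\psi . \tag{2.30}
\]

Moreover, if the above equivalent conditions are satisfied, then, with the notation from the Modular Extension Condition (8), we have
\[
\|T(z)\| \le \|T\|, \qquad z \in \overline{S_{-1/2}}, \tag{2.31}
\]
\[
T(z + t) = \Delta_\varphi^{it}\, T(z)\, \Delta_\psi^{-it}, \qquad z \in \overline{S_{-1/2}},\ t \in \mathbb{R}, \tag{2.32}
\]
\[
T\Big(s - \frac{i}{2}\Big) = J_\varphi T(s) J_\psi, \qquad s \in \mathbb{R}, \tag{2.33}
\]
and $T(s)$ is Hermitian with respect to $(\psi, \varphi)$ for all $s \in \mathbb{R}$.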
2.6. Generalization of the structure theorem of Borchers

Let $M, N, \varphi, \psi$ be as in the preceding subsection, and $T \in B(K, H)$ Hermitian with respect to $(\psi, \varphi)$. Then, by Theorem 2.12, the orbit $\mathbb{R} \ni s \mapsto \Delta_\varphi^{is} T \Delta_\psi^{-is}$ of $T$ has a bounded so-continuous extension $T(\cdot)$ to $\overline{S_{-1/2}}$, analytic in $S_{-1/2}$, which satisfies the boundary conditions

$T(s)$ is Hermitian with respect to $(\psi, \varphi)$ for all $s \in \mathbb{R}$,
$J_\varphi T\big(s - \tfrac{i}{2}\big) J_\psi = T(s)$ is Hermitian with respect to $(\psi, \varphi)$ for all $s \in \mathbb{R}$.

The next extension of a structure theorem of Borchers ([7, Theorem B], see also [6, Theorem 11.9] and [37, Theorem 2]) shows, in particular, that the converse statement also holds: any bounded so-continuous map $\overline{S_{-1/2}} \to B(K, H)$ which is analytic in $S_{-1/2}$ and satisfies the above boundary conditions arises from a Hermitian $T \in B(K, H)$ as above.

Theorem 2.13 (Generalized Structure Theorem). Let $M \subset B(H)$ and $N \subset B(K)$ be von Neumann algebras, in standard form with respect to the normal semi-finite faithful weights $\varphi$ on $M$ and $\psi$ on $N$. Further let $0 \ne \beta \in \mathbb{R}$, let $\Xi_o$ and $\Xi_1$ be Lebesgue null sets in $\mathbb{R}$, and let
\[
\overline{S_\beta} \setminus \big(\Xi_o \cup (\Xi_1 + i\beta)\big) \ni z \mapsto T(z) \in B(K, H)
\]
be a bounded map which is analytic in $S_\beta$ and satisfies the boundary conditions

(i) $T(s)$ is Hermitian with respect to $(\psi, \varphi)$ for all $s \in \mathbb{R} \setminus \Xi_o$ and
\[
T(s) = \text{wo-}\!\!\lim_{0 < t/\beta \to 0} T(s + it), \qquad s \in \mathbb{R} \setminus \Xi_o, \tag{2.34}
\]
(ii) $J_\varphi T(s + i\beta) J_\psi$ is Hermitian with respect to $(\psi, \varphi)$ for all $s \in \mathbb{R} \setminus \Xi_1$ and
\[
T(s + i\beta) = \text{wo-}\!\!\lim_{1 > t/\beta \to 1} T(s + it), \qquad s \in \mathbb{R} \setminus \Xi_1. \tag{2.35}
\]

Then, for some $T \in B(K, H)$ which is Hermitian with respect to $(\psi, \varphi)$,
\[
T(s) = \Delta_\varphi^{-i\frac{s}{2\beta}}\, T\, \Delta_\psi^{i\frac{s}{2\beta}}, \qquad s \in \mathbb{R} \setminus \Xi_o. \tag{2.36}
\]
Hence the given map $z \mapsto T(z)$ extends to an so-continuous map on the whole $\overline{S_\beta}$ and, with the same notation $T(\cdot)$ for the extension, it satisfies
\[
T(z + 2\beta t) = \Delta_\varphi^{-it}\, T(z)\, \Delta_\psi^{it}, \qquad z \in \overline{S_\beta},\ t \in \mathbb{R}, \tag{2.37}
\]
\[
T(s + i\beta) = J_\varphi T(s) J_\psi, \qquad s \in \mathbb{R}. \tag{2.38}
\]
Remark. Our theorem owes much to Borchers’ work, its proof being based on the main idea of the proof of Theorem B in [7]. Nevertheless, our approach has several features of generality: (a) z → T (z) is not assumed to be so-continuous on the whole Sβ , but only the existence of radial limits are assumed almost everywhere on the boundary. In our
application to the proof of Theorem 2.1 we shall use Theorem 2.13 with Ξo = {0} and Ξ1 = ∅. (b) We are considering the case of arbitrary normal semi-finite faithful weights ϕ and ψ, without assuming their boundedness. (c) On the boundary, we assume only the Hermiticity of T (s) and Jϕ T (s+iβ)Jψ rather than the implementation of ψ in ϕ by these operators. The advantage of our assumption consists in its linearity, which allows “mollification”, while Borchers’ assumption is of quadratic nature, more difficult to handle. (d) Our proof is made more elementary, avoiding most arguments of the two-dimensional complex analysis and using, instead of the Malgrange–Zerner Theorem, only the elementary Osgood Lemma (the Hartogs Theorem for continuous functions) along with the Morera Theorem (one-dimensional edge-of-the-wedge theorem). 2.7. Complements to the implementation theorem of Borchers Based on the ideas from [1], an invariant subspace theory was developed in [44] for the “bounded analytic” elements associated to an so-continuous one-parameter group (αt )t∈R of ∗-automorphisms of a von Neumann algebra M ⊂ B(H). This theory allows, starting with an already existent one-parameter group of unitaries on H which implements α, to construct canonically a new implementing group of unitaries on H, which has a minimality property and inherits certain properties of the ∗-automorphism group α (see [1, Proposition in Sec. 3], where the idea is formulated in the realm of a particular situation, and [44, Theorem 5.3, Corollary 5.4, Lemma 5.11] for the general theory). The above method yields a proof for the one-parameter version of the celebrated implementation theorem of Borchers [5], claiming the innerness of α whenever it is implemented by a one-parameter group of unitaries U (s) s∈R having positive generator (see [1, Theorem 3.1] and [44, Corollary 5.7]). 
Moreover, as we shall see in the next theorem, the obtained canonical inner implementing group of unitaries inherits certain commutation properties of the ∗-automorphism group $\alpha$. We recall that, if $M$ is a von Neumann algebra and $(\alpha_s)_{s \in \mathbb{R}}$ is an so-continuous one-parameter group of ∗-automorphisms of $M$, then the spectral subspace of $\alpha$ corresponding to a closed set $F \subset \mathbb{R}$ is defined by
\[
M^\alpha(F) = \Big\{ x \in M ;\ \text{wo-}\!\int_{\mathbb{R}} f(s)\,\alpha_s(x)\,ds = 0 \ \text{if } f \in L^1(\mathbb{R}),\ F \cap \mathrm{supp}(\hat f) = \emptyset \Big\},
\]
where fˆ denotes the inverse Fourier transform (2.28) of f (see [1, Definition 2.1]). Theorem 2.14. Let M ⊂ B(H) be a von Neumann algebra and P a self-adjoint operator in H, such that P is bounded below and Ad exp(isP ) leaves M invariant for all s ∈ R, defining thus an so-continuous one-parameter group (αs )s∈R of ∗-automorphisms of M . Then there exists a unique injective b ∈ M , 0 ≤ b ≤ 1H ,
such that
(i) $\alpha_s(x) = b^{-is} x\, b^{is}$, $s \in \mathbb{R}$, $x \in M$,
(ii) for any injective $d \in M$, $0 \le d \le 1_H$, such that the implementation relation $\alpha_s(x) = d^{-is} x\, d^{is}$, $s \in \mathbb{R}$, $x \in M$ holds, we have
\[
\chi_{(0, e^\lambda]}(b) \le \chi_{(0, e^\lambda]}(d), \qquad \lambda \in \mathbb{R},
\]
where χ(0,eλ ] stands for the characteristic function of (0, eλ ]. Moreover, (iii) for every λ ∈ R, χ(0,eλ ] (b) is the orthogonal projection onto the closed linear span of M α [−µ, +∞) H, µ>λ
(iv) for any ∗-automorphism $\sigma$ of $M$ and $\lambda_\sigma > 0$ such that $\sigma \circ \alpha_s = \alpha_{\lambda_\sigma s} \circ \sigma$ for all $s \in \mathbb{R}$, we have $\sigma(b) = b^{\lambda_\sigma}$.

The above theorem will be used in the proof of Theorem 2.2.

2.8. Summary of the remaining part of the paper

The remainder of this paper presents proofs for the above results:
• Theorem 2.3 and Proposition 2.4 in Sec. 3,
• Lemma 2.5, Lemma 2.6, Proposition 2.7 and Proposition 2.8 in Sec. 4,
• Lemma 2.9, Lemma 2.11 and Theorem 2.12 in Sec. 5,
• Theorem 2.13 in Sec. 6,
• Theorem 2.1 in Sec. 7 and, finally,
• Theorem 2.14 and Theorem 2.2 in Sec. 8.
3. The Analytic Extension Theorem

The aim of this section is to prove Theorem 2.3 and Proposition 2.4.

Proof of Theorem 2.3. The equivalence of conditions (2), (4) and (5), as well as the three additional statements (2.24), (2.25), (2.26), were proved in [12, Theorem 6.2]. For the proof of the remaining part, we introduce the following notation: let $K_c(B)$ and $H_c(A)$ be the sets of all vectors $\xi \in K$ and $\eta \in H$, respectively, with compact spectral support for $\log B$ and $\log A$, respectively. For such $\xi$ and $\eta$, $\mathbb{C} \ni z \mapsto B^{iz}\xi$ and $\mathbb{C} \ni z \mapsto A^{iz}\eta$ are analytic functions of exponential type with respect to $\operatorname{Im} z$ and they are uniformly bounded in $\overline{S_\beta}$. Furthermore, $K_c(B) \subset K$ and $H_c(A) \subset H$ are dense linear subspaces and they are cores of $B^{iz}$ and $A^{iz}$, respectively, for every $z \in \mathbb{C}$.

Proof of (2) ⇒ (1). The uniform boundedness of $\overline{S_\beta} \ni z \mapsto T(z)$ and its so-continuity are to be proved. The latter is automatic on $S_\beta$, where $T(\cdot)$ is analytic.

Let $\xi \in K_c(B)$ and $\eta \in H_c(A)$. Then
\[
(T(z)\xi \mid \eta) = (T B^{-iz}\xi \mid A^{-i\bar z}\eta), \qquad z \in \overline{S_\beta},
\]
because the analytic function $\mathbb{C} \ni z \mapsto (T B^{-iz}\xi \mid A^{-i\bar z}\eta)$ and the continuous function $\overline{S_\beta} \ni z \mapsto (T(z)\xi \mid \eta)$, which is analytic in the interior, coincide on $\mathbb{R}$. By (2.24) and by the density of $H_c(A)$ in $H$, it follows that
\[
T(z)\xi = A^{iz} T B^{-iz}\xi, \qquad z \in \overline{S_\beta},\ \xi \in K_c(B). \tag{3.1}
\]
Since $\|A^{iz} T B^{-iz}\| = \|T\|$ for $z \in \mathbb{R}$ and $\|A^{iz} T B^{-iz}\| = \|A^{-\beta} T B^{\beta}\|$ for $z \in \mathbb{R} + i\beta$ by (5), we have by the Three Line Theorem
\[
|(T(z)\xi \mid \eta)| \le \max\big(\|T\|,\ \|A^{-\beta} T B^{\beta}\|\big)\,\|\xi\|\,\|\eta\|, \qquad z \in \overline{S_\beta},
\]
thus obtaining the uniform boundedness of $\overline{S_\beta} \ni z \mapsto T(z)$.

Due to this uniform boundedness, it suffices to prove the convergences
\[
\lim_{S_\beta \ni z \to s} \|T(z)\xi - A^{is} T B^{-is}\xi\| = 0, \qquad s \in \mathbb{R}, \tag{3.2}
\]
\[
\lim_{S_\beta \ni z \to s + i\beta} \|T(z)\xi - A^{is - \beta} T B^{-is + \beta}\xi\| = 0, \qquad s \in \mathbb{R} \tag{3.3}
\]
for $\xi \in K_c(B)$. We give a proof explicitly only for $\beta > 0$, the treatment of the case $\beta < 0$ being completely similar. Let $E$ denote the spectral projection of $\log A$ corresponding to $(-\infty, 0]$. Owing to (3.1) we can split $T(z)\xi$ as follows:
\[
T(z)\xi = A^{iz}(1_H - E)\, T B^{-iz}\xi + A^{iz + \beta} E\, (A^{-\beta} T B^{\beta})\, B^{-iz - \beta}\xi, \qquad z \in \overline{S_\beta}.
\]
We note that

$A^{iz}(1_H - E)$ is defined on $H$ and $\|A^{iz}(1_H - E)\| \le 1$ for all $z \in \mathbb{C}$, $\operatorname{Im} z \ge 0$,
$A^{iz + \beta} E$ is defined on $H$ and $\|A^{iz + \beta} E\| \le 1$ for all $z \in \mathbb{C}$, $\operatorname{Im} z \le \beta$.

Now the norm-continuity of $\overline{S_\beta} \ni z \mapsto B^{-iz}\xi$ and $\overline{S_\beta} \ni z \mapsto B^{-iz - \beta}\xi$, the so-continuity of $\overline{S_\beta} \ni z \mapsto A^{iz}(1_H - E)$ and $\overline{S_\beta} \ni z \mapsto A^{iz + \beta}E$, and the boundedness of $A^{-\beta} T B^{\beta}$ on $K_c(B)$ yield the convergences (3.2) and (3.3).

Proof of (1) ⇒ (3). Obvious, with $\Xi_o = \mathbb{R}$.

Proof of (3) ⇒ (4). We proceed in three steps.

Step 1. First we quote some results from the theory of the Hardy spaces on the disc. Let $H^\infty(D)$ be the Banach algebra of all bounded analytic complex functions on the unit disc $D = \{z \in \mathbb{C} ;\ |z| < 1\}$. Any $g \in H^\infty(D)$ has a non-tangential limit
\[
\tilde g(\zeta) = \lim_{D \ni z \to \zeta} g(z)
\]
for almost all $\zeta$ in the boundary $\partial D$ of $D$ (the unit circle) due to Fatou's Theorem (see, for example, [20, the second Corollary, p. 38] or the theorems of [21, p. 5 and p. 14]). Furthermore, the map $H^\infty(D) \ni g \mapsto \tilde g \in L^\infty(\partial D)$ obtained this way is an isometric algebra homomorphism. On the other hand, the range $\{\tilde g ;\ g \in H^\infty(D)\}$ of the above homomorphism is equal to
\[
\Big\{ \psi \in L^\infty(\partial D) ;\ \int_0^{2\pi} \psi(e^{is})\, e^{iks}\, ds = 0 \ \text{for all } k = 1, 2, \ldots \Big\}
\]
and hence it is weak∗ closed (see, for example, [14, Sec. 20.1]). We also notice that, according to a uniqueness theorem of the Riesz brothers (see, for example, [20, the second Corollary, p. 52] or the theorem of [21, p. 76]), if for some $g \in H^\infty(D)$ the boundary function $\tilde g$ vanishes almost everywhere on a Borel subset of $\partial D$ with non-zero arc length measure, then $g = 0$.

We consider the one point compactification of the right half and the left half of $\overline{S_\beta}$ and denote each added point by $+\infty$ and $-\infty$, respectively. We extend the function
\[
\overline{D} \setminus \{+1, -1\} \ni \zeta \mapsto \Phi_\beta(\zeta) = \frac{\beta}{\pi} \log\Big( i\, \frac{1 + \zeta}{1 - \zeta} \Big) \in \overline{S_\beta}
\]
to be $+\infty$ at $\zeta = +1$ and $-\infty$ at $\zeta = -1$. Then the extended function
\[
\overline{D} \ni \zeta \mapsto \Phi_\beta(\zeta) = \frac{\beta}{\pi} \log\Big( i\, \frac{1 + \zeta}{1 - \zeta} \Big) \in \overline{S_\beta} \cup \{-\infty, +\infty\}
\]
is a homeomorphism, mapping $D$ onto $S_\beta$ conformally, and the boundary $\partial D$ onto $\partial S_\beta \cup \{-\infty, +\infty\}$ absolutely bicontinuously with respect to the arc length measures: if $\Xi$ is a Borel set in $\partial D$, then $\Xi$ has arc length measure 0 if and only if $\Phi_\beta(\Xi)$ has arc length measure 0. Moreover, $\Phi_\beta$ maps paths in $D$ tending to a $\zeta \in \partial D \setminus \{1, -1\}$ from within a sector of opening $< \pi$ having vertex at $\zeta$, and symmetric about the inner normal to $\partial D$ in $\zeta$, to paths tending to $\Phi_\beta(\zeta) \in \partial S_\beta$ in a similar non-tangential way. Therefore, if $f \in H^\infty(S_\beta)$, the non-tangential limit
\[
\tilde f(\zeta) = \lim_{S_\beta \ni z \to \zeta} f(z)
\]
exists for almost all $\zeta$ in $\partial S_\beta$ by Fatou's Theorem applied to $f \circ \Phi_\beta$. Similarly we can transcribe the above quoted results concerning $H^\infty(D)$ in the setting of $H^\infty(S_\beta)$: $H^\infty(S_\beta) \ni g \mapsto \tilde g \in L^\infty(\partial S_\beta)$ is an isometric algebra homomorphism with weak∗ closed range in $L^\infty(\partial S_\beta)$, and $f \in H^\infty(S_\beta)$ is equal to zero whenever $\tilde f$ vanishes almost everywhere on a Borel subset of $\partial S_\beta$ with non-zero arc length measure.

Step 2. We consider the map
\[
F : K \times H \ni (\xi, \eta) \mapsto \tilde f_{\xi,\eta} \in \{\tilde f ;\ f \in H^\infty(S_\beta)\} \subset L^\infty(\partial S_\beta),
\]
where, as noticed in Step 1, $\{\tilde f ;\ f \in H^\infty(S_\beta)\}$ is a weak∗ closed subalgebra of $L^\infty(\partial S_\beta)$. The function $f_{\xi,\eta}$ is uniquely determined by (2.23) due to the uniqueness
result quoted in Step 1. Since the right-hand side of (2.23) is sesquilinear in $\xi$ and $\eta$, the mapping $F$ is also sesquilinear. We shall prove in this step that $F$ is bounded.

We first prove that the graph of $F$ is closed. Suppose that $\xi_n \to \xi_o$, $\eta_n \to \eta_o$ and $\tilde f_{\xi_n,\eta_n} \to \tilde f_o$ with respect to the norm of $L^\infty(\partial S_\beta)$, hence also $f_{\xi_n,\eta_n} \to f_o$ uniformly. By the continuity of the right-hand side of (2.23) in $\xi$ and $\eta$, $\tilde f_o$ has to satisfy (2.23) for $\xi = \xi_o$ and $\eta = \eta_o$ almost everywhere on the set $\Xi_o$. Therefore $\tilde f_o = \tilde f_{\xi_o,\eta_o}$, again by the uniqueness theorem of the Riesz brothers, proving that the graph of $F$ is closed.

Let us consider $F(\xi, \cdot)$ for a fixed $\xi$. Denote by $\overline{H}$ the conjugate of the Hilbert space $H$ and by $\bar\eta$ the canonical image of $\eta \in H$ in $\overline{H}$. By the above proved closedness of the graph of $F$, the graph of the linear map $\overline{H} \ni \bar\eta \mapsto F(\xi, \eta) \in L^\infty(\partial S_\beta)$ is closed and hence, by the Closed Graph Theorem,
\[
\|F(\xi, \eta)\| \le c_\xi \|\eta\|, \qquad \eta \in H
\]
for some constant $c_\xi \ge 0$ depending on $\xi$. Thus $F(\xi, \cdot) : \overline{H} \to L^\infty(\partial S_\beta)$ is bounded.

Now we prove that the graph of the linear map
\[
K \ni \xi \mapsto F(\xi, \cdot) \in B\big(\overline{H}, L^\infty(\partial S_\beta)\big)
\]
is closed. Suppose that $\xi_n \to \xi_o$ and $F(\xi_n, \cdot) \to T_o$ with respect to the norm of $B\big(\overline{H}, L^\infty(\partial S_\beta)\big)$. Then, for every $\eta \in H$, $F(\xi_n, \eta) \to T_o \bar\eta$ and by the closedness of the graph of $F$ it follows that $T_o \bar\eta = F(\xi_o, \eta)$. Thus $T_o = F(\xi_o, \cdot)$. By the Closed Graph Theorem,
\[
\|F(\xi, \cdot)\| \le c\,\|\xi\|, \qquad \xi \in K
\]
for some constant $c \ge 0$, so
\[
\|F(\xi, \eta)\| = \operatorname*{ess\,sup}_{\zeta \in \partial S_\beta} |\tilde f_{\xi,\eta}(\zeta)| = \sup_{z \in S_\beta} |f_{\xi,\eta}(z)| \le c\,\|\xi\|\,\|\eta\|, \qquad \xi \in K,\ \eta \in H.
\]

Step 3. We now take $\xi \in K_c(B)$, $\eta \in H_c(A)$. Then
\[
\mathbb{C} \ni z \mapsto g_{\xi,\eta}(z) = (T B^{-iz}\xi \mid A^{-i\bar z}\eta)
\]
is an entire function satisfying the boundary condition (2.23), so that $g_{\xi,\eta} = f_{\xi,\eta}$ by the uniqueness theorem of the Riesz brothers. Therefore
\[
|g_{\xi,\eta}(z)| \le c\,\|\xi\|\,\|\eta\|, \qquad z \in S_\beta
\]
and hence the same estimate holds for all $z \in \overline{S_\beta}$ by continuity. This implies $T B^{-iz}\xi \in \mathrm{Dom}(A^{iz})$, because $H_c(A)$ is a core of $A^{iz}$, and
\[
\|A^{iz} T B^{-iz}\xi\| \le c\,\|\xi\|, \qquad z \in \overline{S_\beta}.
\]
Since $K_c(B)$ is a core of $B^{-iz}$, setting $z = i\beta$ we obtain (4).
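The conformal map $\Phi_\beta$ used in Step 1 of the above proof can be checked numerically. The following is only an illustrative sketch, not part of the original argument: the function names `phi_beta`, `in_open_strip` and `sample_disc` are hypothetical, and the principal branch of the logarithm is assumed (which is harmless inside the disc, where the argument of the logarithm lies in the open upper half-plane).

```python
import cmath
import math
import random


def phi_beta(zeta: complex, beta: float) -> complex:
    """Phi_beta(zeta) = (beta/pi) * log(i * (1 + zeta) / (1 - zeta)).

    cmath.log is the principal branch; for |zeta| < 1 the argument of the
    logarithm has positive imaginary part, so no branch cut is crossed.
    """
    return (beta / math.pi) * cmath.log(1j * (1 + zeta) / (1 - zeta))


def in_open_strip(z: complex, beta: float) -> bool:
    """True if 0 < Im(z)/beta < 1, i.e. z lies in the open strip S_beta."""
    return 0.0 < z.imag / beta < 1.0


def sample_disc(rng: random.Random, radius: float = 0.999) -> complex:
    """Random point of the open unit disc (hypothetical helper for the check)."""
    return cmath.rect(radius * math.sqrt(rng.random()), 2 * math.pi * rng.random())


if __name__ == "__main__":
    rng = random.Random(0)
    for beta in (1.0, -0.5):
        inside = all(
            in_open_strip(phi_beta(sample_disc(rng), beta), beta)
            for _ in range(1000)
        )
        # The centre of the disc is sent to the midpoint i*beta/2 of the strip.
        print(beta, phi_beta(0j, beta), inside)
```

The sign convention $0 < \operatorname{Im} z / \beta < 1$ lets the same check cover both $\beta > 0$ and $\beta < 0$.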
Proof of Proposition 2.4. Let us denote: by $\lambda_t$ the translation operator $\xi \mapsto \xi(\cdot - t)$ on $L^2(\mathbb{R})$, $t \in \mathbb{R}$; by $\alpha_t$ the ∗-automorphism $\mathrm{Ad}(\lambda_t)$ of $B(H)$, $t \in \mathbb{R}$; and by $b$ the multiplication operator with $e^{-e^{\pi \cdot}}$ on $L^2(\mathbb{R})$. Clearly, $(\alpha_t)_{t \in \mathbb{R}}$ is an so-continuous one-parameter group of ∗-automorphisms of $B(H)$, $0 \le b \le 1_{L^2(\mathbb{R})}$ and $b$ is injective. Since
\[
\alpha_t(b) = \lambda_t b \lambda_t^* = b^{e^{-\pi t}}, \qquad t \in \mathbb{R},
\]
we have
\[
\alpha_t(b^{is}) = \lambda_t b^{is} \lambda_t^* = b^{is e^{-\pi t}}, \qquad t, s \in \mathbb{R}. \tag{3.4}
\]
By the Stone Representation Theorem, there exists an invertible positive self-adjoint linear operator $A$ on $L^2(\mathbb{R})$ such that $\lambda_t = A^{it}$, $t \in \mathbb{R}$. Then
\[
\alpha_t = \mathrm{Ad}(A^{it}), \qquad t \in \mathbb{R}. \tag{3.5}
\]
Let $s > 0$ be arbitrary. Since $0 \le b \le 1_{L^2(\mathbb{R})}$ and $\operatorname{Re}(is e^{-\pi z}) = s\, e^{-\pi \operatorname{Re} z} \sin(\pi \operatorname{Im} z)$, the complex power $b^{is e^{-\pi z}} \in B(L^2(\mathbb{R}))$ is defined and $\|b^{is e^{-\pi z}}\| \le 1$ for every $z$ in the closed strip $\overline{S_1}$. Using (3.4), it is easily seen that
\[
F_1 : \overline{S_1} \ni z \mapsto b^{is e^{-\pi z}} \in B(L^2(\mathbb{R}))
\]
is an so-continuous extension of $\mathbb{R} \ni t \mapsto \alpha_t(b^{is})$, which is analytic in $S_1$ and whose value at $i$ is $b^{-is}$. Taking into account (3.5), Theorem 2.3 yields that $\mathrm{Dom}(A^{-1} b^{is} A) = \mathrm{Dom}(A)$ and $A^{-1} b^{is} A \subset b^{-is}$, that is, $A^{-1} b^{is} A = b^{-is} | \mathrm{Dom}(A)$. Therefore
\[
\mathrm{Dom}(A^{-1} b^{-is} A) = b^{-is}\, \mathrm{Dom}(A) \ \text{is dense in } H \ \text{and}\ A^{-1} b^{-is} A \subset b^{is}. \tag{3.6}
\]
But
\[
\mathrm{Dom}(A^{-1} b^{-is} A) \ \text{is not a core of } A. \tag{3.7}
\]
Indeed, assuming that $\mathrm{Dom}(A^{-1} b^{-is} A)$ is a core of $A$, (3.6) and Theorem 2.3 imply that $\mathbb{R} \ni t \mapsto \alpha_t(b^{-is})$ has a uniformly bounded so-continuous extension $F_2 : \overline{S_1} \to B(L^2(\mathbb{R}))$, which is analytic in $S_1$ and whose value at $i$ is $b^{is}$. Then $\overline{S_{-1}} \ni z \mapsto F_2(\bar z)^* \in B(L^2(\mathbb{R}))$ is an so-continuous extension of $\mathbb{R} \ni t \mapsto \alpha_t(b^{is})$, which is analytic in $S_{-1}$ and whose value at $-i$ is $b^{-is}$. Consequently, $\mathbb{R} \ni t \mapsto \alpha_t(b^{is})$ has a uniformly bounded so-continuous extension
\[
F : \overline{S_1} \cup \overline{S_{-1}} \ni z \mapsto
\begin{cases}
F_1(z) & \text{if } \operatorname{Im} z \ge 0, \\
F_2(\bar z)^* & \text{if } \operatorname{Im} z \le 0,
\end{cases}
\]
which is analytic in the interior and which takes the same value $b^{-is}$ at $i$ and $-i$. Then, by (2.26), $F$ is periodic of period $2i$, so it extends to a uniformly bounded entire mapping, which must be constant by the Liouville Theorem. Thus the orbit $\mathbb{R} \ni t \mapsto \alpha_t(b^{is})$ is constant, that is, $b^{is}$ commutes with every $\lambda_t$. Since $b^{is}$ is the multiplication operator with $\mathbb{R} \ni r \mapsto e^{-is e^{\pi r}}$ on $L^2(\mathbb{R})$, this means that the above function is constant, which is plainly not true.

By (3.6) and (3.7) we conclude that, choosing $v = b^{-is}$ with $s > 0$, $A^{-1} v A$ is densely defined and bounded, but $\mathrm{Dom}(A^{-1} v A)$ is not a core for $A$.
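The scalar facts behind the extension $F_1$ in the above proof can be verified pointwise, since $b$ and all its complex powers are multiplication operators. The sketch below is only an illustration under that reading; the names `exponent` and `symbol_F1` are hypothetical and do not appear in the paper. It checks the formula for $\operatorname{Re}(is e^{-\pi z})$, the value of the exponent at $z = i$, the bound by 1 on the closed strip, and the agreement with the translated symbol on the real line.

```python
import cmath
import math


def exponent(z: complex, s: float) -> complex:
    """w(z) = i*s*exp(-pi*z), the exponent in F_1(z) = b^{w(z)}."""
    return 1j * s * cmath.exp(-math.pi * z)


def symbol_F1(z: complex, s: float, r: float) -> complex:
    """Value at r of the function whose multiplication operator is F_1(z).

    b is multiplication by exp(-exp(pi*r)), so b^{w} is multiplication by
    exp(-w * exp(pi*r)).
    """
    return cmath.exp(-exponent(z, s) * math.exp(math.pi * r))


if __name__ == "__main__":
    s = 1.5
    z = 0.3 + 0.4j
    # Re(i s e^{-pi z}) versus s e^{-pi Re z} sin(pi Im z)
    print(exponent(z, s).real, s * math.exp(-math.pi * 0.3) * math.sin(math.pi * 0.4))
    # At z = i the exponent is -i s, i.e. F_1(i) = b^{-is}
    print(exponent(1j, s))
    # On the closed strip 0 <= Im z <= 1 the symbol has modulus at most 1
    print(abs(symbol_F1(0.2 + 1.0j, s, 0.5)))
```

Because the real part of the exponent is non-negative exactly when $\sin(\pi \operatorname{Im} z) \ge 0$, the same computation also shows why the bound is not available below the real axis.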
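4. Lebesgue Continuity, Tomita Algebras

In this section we prove Lemmas 2.5 and 2.6, as well as Propositions 2.7 and 2.8. Throughout this section $M \subset B(H)$ will stand for a von Neumann algebra, in standard form with respect to a normal semi-finite faithful weight $\varphi$ on $M$.

Proof of Lemma 2.5. Since $\mathfrak{M}_\varphi$ is a hereditary ∗-subalgebra of $M$, there is an increasing approximate unit $\{b_\iota\}_\iota$ for $\mathfrak{M}_\varphi$ (for example, the upward directed set $\{b \in \mathfrak{M}_\varphi \cap M^+ ;\ \|b\| < 1\}$, labeled by itself). Then, by the so-density of $\mathfrak{M}_\varphi$ in $M$, we have $\text{so-}\lim_\iota b_\iota = 1_H$. Setting
\[
a_\iota = \frac{1}{\sqrt{\pi}}\ \text{wo-}\!\int_{-\infty}^{\infty} e^{-t^2} \sigma_t^\varphi(b_\iota)\, dt,
\]
$\{a_\iota\}_\iota$ is an increasing net in $M^+$ such that every orbit
\[
\mathbb{R} \ni s \mapsto \sigma_s^\varphi(a_\iota) = \frac{1}{\sqrt{\pi}}\ \text{wo-}\!\int_{-\infty}^{\infty} e^{-(t - s)^2} \sigma_t^\varphi(b_\iota)\, dt \in M
\]
has an entire extension
\[
\mathbb{C} \ni z \mapsto \sigma_z^\varphi(a_\iota) = \frac{1}{\sqrt{\pi}}\ \text{wo-}\!\int_{-\infty}^{\infty} e^{-(t - z)^2} \sigma_t^\varphi(b_\iota)\, dt \in M.
\]
Clearly, $\sigma_z^\varphi(a_\iota)^* = \sigma_{\bar z}^\varphi(a_\iota)$ for all $\iota$ and $z \in \mathbb{C}$. Since, for every $z \in \mathbb{C}$, the function
\[
\mathbb{R} \ni t \mapsto e^{-(t - z)^2} = e^{-(t - \operatorname{Re} z)^2 + (\operatorname{Im} z)^2}\, e^{2i(t - \operatorname{Re} z)\operatorname{Im} z}
\]
is of the form $f_1 - f_2 + i(f_3 - f_4)$ with $0 \le f_j \in L^1(\mathbb{R})$, $1 \le j \le 4$, using (2.6) we deduce easily that
\[
\sigma_z^\varphi(a_\iota) \in \mathfrak{M}_\varphi \ \text{and}\ \|\sigma_z^\varphi(a_\iota)\| \le e^{(\operatorname{Im} z)^2} \quad \text{for all } \iota \text{ and } z \in \mathbb{C}.
\]
On the other hand, $\text{so-}\lim_\iota b_\iota = 1_H$ yields
\[
\text{so-}\lim_\iota \sigma_z^\varphi(a_\iota) = 1_H \quad \text{for all } z \in \mathbb{C}.
\]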
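The factorisation of the Gaussian kernel used in the above proof, and the resulting constant $e^{(\operatorname{Im} z)^2}$, can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the parameter `n` is an added generalisation (with `n = 1` giving the kernel $e^{-(t-z)^2}$ above), and the integral is approximated by a plain trapezoidal rule on a truncated interval.

```python
import cmath
import math


def kernel(t: float, z: complex, n: float = 1.0) -> complex:
    """The Gaussian kernel exp(-n (t - z)^2)."""
    d = t - z
    return cmath.exp(-n * d * d)


def kernel_factored(t: float, z: complex, n: float = 1.0) -> complex:
    """Same kernel, written as modulus times phase:
    exp(-n (t - Re z)^2 + n (Im z)^2) * exp(2 i n (t - Re z) Im z)."""
    x, y = z.real, z.imag
    return math.exp(-n * (t - x) ** 2 + n * y ** 2) * cmath.exp(2j * n * (t - x) * y)


def normalized_l1_norm(z: complex, n: float = 1.0, steps: int = 20000) -> float:
    """Trapezoidal approximation of sqrt(n/pi) * integral of |kernel(t, z, n)| dt,
    which should be close to exp(n (Im z)^2)."""
    half_width = 10.0 / math.sqrt(n)  # the Gaussian tail beyond this is negligible
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        t = z.real - half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * abs(kernel(t, z, n))
    return math.sqrt(n / math.pi) * total * h


if __name__ == "__main__":
    z = 0.7 - 1.3j
    print(abs(kernel(0.2, z) - kernel_factored(0.2, z)))
    print(normalized_l1_norm(z), math.exp(z.imag ** 2))
```

The $L^1$-norm of the normalised kernel is what bounds the norm of the operator-valued integral, which is why the estimate depends on $\operatorname{Im} z$ only.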
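Proof of Lemma 2.6. First we prove that
\[
y \in \mathfrak{A}_\varphi \cap \mathrm{Dom}(\sigma^\varphi_{-i/2}),\ \sigma^\varphi_{-i/2}(y) \in \mathfrak{A}_\varphi \ \Longrightarrow\ \sigma^\varphi_{-i/2}(y)_\varphi = J_\varphi (y^*)_\varphi. \tag{4.1}
\]
For let $x \in \mathfrak{A}_\varphi$ be arbitrary. Then (2.5) yields
\[
\sigma^\varphi_{-i/2}(y)\, J_\varphi x_\varphi = J_\varphi x J_\varphi\, \sigma^\varphi_{-i/2}(y)_\varphi. \tag{4.2}
\]
On the other hand, since $J_\varphi x_\varphi = J_\varphi S_\varphi (x^*)_\varphi = \Delta_\varphi^{1/2} (x^*)_\varphi \in \mathrm{Dom}\, \Delta_\varphi^{-1/2}$, using Theorem 2.3 we obtain
\[
\sigma^\varphi_{-i/2}(y)\, J_\varphi x_\varphi = \Delta_\varphi^{1/2} y\, \Delta_\varphi^{-1/2} J_\varphi x_\varphi = \Delta_\varphi^{1/2} (y x^*)_\varphi = J_\varphi S_\varphi (y x^*)_\varphi = J_\varphi x (y^*)_\varphi. \tag{4.3}
\]
Now, (4.2) and (4.3) imply
\[
J_\varphi x J_\varphi\, \sigma^\varphi_{-i/2}(y)_\varphi = J_\varphi x (y^*)_\varphi, \qquad x \in \mathfrak{A}_\varphi
\]
and by the so-density of $\mathfrak{A}_\varphi$ in $M \ni 1_H$ we conclude that $\sigma^\varphi_{-i/2}(y)_\varphi = J_\varphi (y^*)_\varphi$.

(1) If $x \in \mathfrak{N}_\varphi$, then by (2.5)
\[
\|x J_\varphi y_\varphi\| = \|J_\varphi y J_\varphi x_\varphi\| \le \|y\|\,\|x_\varphi\|, \qquad y \in \mathfrak{N}_\varphi \supset \mathfrak{M}_\varphi.
\]
Conversely, assume that $x \in M$ and $c \ge 0$ are such that
\[
\|x J_\varphi y_\varphi\| \le c\,\|y\|, \qquad y \in \mathfrak{M}_\varphi. \tag{4.4}
\]
Let $\{a_\iota\}_\iota$ be a net as in Lemma 2.5. Then we have for every $\iota$
\[
\|\sigma^\varphi_{-i/2}(a_\iota)^* x^* x\, \sigma^\varphi_{-i/2}(a_\iota)\| \le \|x\|^2\, \|\sigma^\varphi_{-i/2}(a_\iota)\|^2 \le \|x\|^2 e^{1/2}
\]
and, according to (4.1) and (4.4),
\[
\varphi\big(\sigma^\varphi_{-i/2}(a_\iota)^* x^* x\, \sigma^\varphi_{-i/2}(a_\iota)\big) = \|x\, \sigma^\varphi_{-i/2}(a_\iota)_\varphi\|^2 = \|x J_\varphi (a_\iota)_\varphi\|^2 \le c^2 \|a_\iota\|^2 \le c^2.
\]
Since $\sigma^\varphi_{-i/2}(a_\iota)^* x^* x\, \sigma^\varphi_{-i/2}(a_\iota) = \sigma^\varphi_{i/2}(a_\iota)\, x^* x\, \sigma^\varphi_{-i/2}(a_\iota) \to x^* x$ in the so-topology and $\varphi$ is lower wo-semi-continuous on the bounded subsets of $M^+$, it follows that $\varphi(x^* x) \le c^2$, that is,
\[
x \in \mathfrak{N}_\varphi \quad \text{and} \quad \|x_\varphi\| \le c.
\]
(2) Since the implication "⇒" is an immediate consequence of (2.5), we have to prove only the converse implication. Let $x \in M$ and $\xi \in H$ be such that
\[
x J_\varphi y_\varphi = J_\varphi y J_\varphi \xi, \qquad y \in \mathfrak{M}_\varphi. \tag{4.5}
\]
Then (4.4) holds with $c = \|\xi\|$, so by the above part of the proof we have $x \in \mathfrak{N}_\varphi$. But then (2.5) and (4.5) yield $J_\varphi y J_\varphi x_\varphi = x J_\varphi y_\varphi = J_\varphi y J_\varphi \xi$, so by the so-density of $\mathfrak{M}_\varphi$ in $M \ni 1_H$, we conclude that $x_\varphi = \xi$.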
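Proof of Proposition 2.7. (1) Let $D$ be the linear span of $\{J_\varphi a^* J_\varphi b_\varphi ;\ a, b \in \mathfrak{N}_\varphi\}$, which is dense in $H$. Define the linear functional $F : D \to \mathbb{C}$ by
\[
F(\eta) = \lim_\iota\, (\eta \mid (x_\iota)_\varphi), \qquad \eta \in D,
\]
where the limit exists due to the convergence
\[
(J_\varphi a^* J_\varphi b_\varphi \mid (x_\iota)_\varphi) = (b_\varphi \mid J_\varphi a J_\varphi (x_\iota)_\varphi) \overset{(2.5)}{=} (b_\varphi \mid x_\iota J_\varphi a_\varphi) \xrightarrow{\ \iota\ } (b_\varphi \mid x J_\varphi a_\varphi). \tag{4.6}
\]
Since $F$ is bounded by
\[
\|F\| \le \sup_\iota \|(x_\iota)_\varphi\| < \infty,
\]
it extends to a continuous linear functional on $H$, and hence there exists $\xi \in H$ satisfying $F(\eta) = (\eta \mid \xi)$ for all $\eta \in D$. In particular, by (4.6),
\[
(b_\varphi \mid x J_\varphi a_\varphi) = F(J_\varphi a^* J_\varphi b_\varphi) = (J_\varphi a^* J_\varphi b_\varphi \mid \xi) = (b_\varphi \mid J_\varphi a J_\varphi \xi), \qquad a, b \in \mathfrak{N}_\varphi.
\]
This implies that $x J_\varphi a_\varphi = J_\varphi a J_\varphi \xi$ for all $a \in \mathfrak{N}_\varphi$ and by Lemma 2.6(2) we get
\[
x \in \mathfrak{N}_\varphi \quad \text{and} \quad x_\varphi = \xi.
\]
Furthermore,
\[
\lim_\iota\, (\eta \mid (x_\iota)_\varphi) = F(\eta) = (\eta \mid \xi) = (\eta \mid x_\varphi), \qquad \eta \in D,
\]
the density of $D$ in $H$ and the boundedness of the net $\{(x_\iota)_\varphi\}_\iota$ yield that $(x_\iota)_\varphi \to x_\varphi$ in the weak topology of $H$.

(2) Let $x \in M$ be any wo-limit point of the bounded net $\{x_\iota\}_\iota$. Then, for every $a \in \mathfrak{N}_\varphi$, $x J_\varphi a_\varphi$ is a weak limit point of the net $\{x_\iota J_\varphi a_\varphi\}_\iota \overset{(2.5)}{=} \{J_\varphi a J_\varphi (x_\iota)_\varphi\}_\iota$. Since $(x_\iota)_\varphi \to \xi$, we deduce
\[
x J_\varphi a_\varphi = J_\varphi a J_\varphi \xi, \qquad a \in \mathfrak{N}_\varphi
\]
and, using Lemma 2.6(2), we obtain
\[
x \in \mathfrak{N}_\varphi \quad \text{and} \quad x_\varphi = \xi.
\]
By the injectivity of the mapping $\mathfrak{N}_\varphi \ni y \mapsto y_\varphi$, the uniqueness of the wo-limit point $x$ of $\{x_\iota\}_\iota$ follows and we conclude that $\text{wo-}\lim_\iota x_\iota = x$.

Proof of Proposition 2.8. First we show that every $y \in \mathfrak{A}_\varphi$ can be approximated by a sequence $\{y_n\}_{n \ge 1}$ in $\mathfrak{T}_\varphi$ as required in the statement and such that (2.29) holds for $x = y_n$, $n \ge 1$. Set
\[
y_n = \sqrt{\frac{n}{\pi}}\ \text{wo-}\!\int_{-\infty}^{\infty} e^{-n t^2} \sigma_t^\varphi(y)\, dt, \qquad n \ge 1.
\]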
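Then every orbit
\[
\mathbb{R} \ni s \mapsto \sigma_s^\varphi(y_n) = \sqrt{\frac{n}{\pi}}\ \text{wo-}\!\int_{-\infty}^{\infty} e^{-n(t - s)^2} \sigma_t^\varphi(y)\, dt \in M
\]
has the entire extension
\[
\mathbb{C} \ni z \mapsto \sigma_z^\varphi(y_n) = \sqrt{\frac{n}{\pi}}\ \text{wo-}\!\int_{-\infty}^{\infty} e^{-n(t - z)^2} \sigma_t^\varphi(y)\, dt \in M \tag{4.7}
\]
and by (2.27) we have $\sigma_z^\varphi(y_n) \in \mathfrak{N}_\varphi$ for every $z \in \mathbb{C}$. Similarly,
\[
\mathbb{R} \ni s \mapsto \sigma_s^\varphi(y_n^*) = \sqrt{\frac{n}{\pi}}\ \text{wo-}\!\int_{-\infty}^{\infty} e^{-n(t - s)^2} \sigma_t^\varphi(y^*)\, dt \in M
\]
has an entire extension $\mathbb{C} \ni z \mapsto \sigma_z^\varphi(y_n^*)$ and $\sigma_z^\varphi(y_n^*) \in \mathfrak{N}_\varphi$ for every $z \in \mathbb{C}$. Since $\sigma_z^\varphi(y_n)^* = \sigma_{\bar z}^\varphi(y_n^*)$, we have
\[
\sigma_z^\varphi(y_n) \in (\mathfrak{N}_\varphi)^* \cap \mathfrak{N}_\varphi = \mathfrak{A}_\varphi, \qquad n \ge 1,\ z \in \mathbb{C},
\]
that is, $y_n \in \mathfrak{T}_\varphi$ for all $n \ge 1$.

By the so-continuity of $\mathbb{R} \ni t \mapsto \sigma_t^\varphi(y) \in M$ and $\mathbb{R} \ni t \mapsto \sigma_t^\varphi(y^*) \in M$ we get $y_n \xrightarrow{\ so\ } y$ and $y_n^* \xrightarrow{\ so\ } y^*$, while using
\[
\big|e^{-n(t - z)^2}\big| = \big|e^{-n(t - \operatorname{Re} z)^2}\, e^{n(\operatorname{Im} z)^2}\, e^{2ni(t - \operatorname{Re} z)\operatorname{Im} z}\big| = e^{-n(t - \operatorname{Re} z)^2}\, e^{n(\operatorname{Im} z)^2} \tag{4.8}
\]
it is easily seen that $\|\sigma_z^\varphi(y_n)\| \le e^{n(\operatorname{Im} z)^2}\|y\|$ for all $n \ge 1$ and $z \in \mathbb{C}$.

On the other hand, by (2.27) we have
\[
(y_n)_\varphi = \sqrt{\frac{n}{\pi}} \int_{-\infty}^{\infty} e^{-n t^2} \Delta_\varphi^{it} y_\varphi\, dt, \qquad
(y_n^*)_\varphi = \sqrt{\frac{n}{\pi}} \int_{-\infty}^{\infty} e^{-n t^2} \Delta_\varphi^{it} (y^*)_\varphi\, dt
\]
and by the norm-continuity of $\mathbb{R} \ni t \mapsto \Delta_\varphi^{it} y_\varphi \in H$ and $\mathbb{R} \ni t \mapsto \Delta_\varphi^{it} (y^*)_\varphi \in H$ we get the convergences $(y_n)_\varphi \to y_\varphi$ and $(y_n^*)_\varphi \to (y^*)_\varphi$ in the norm-topology. Furthermore, every orbit
\[
\mathbb{R} \ni s \mapsto \Delta_\varphi^{is} (y_n)_\varphi = \sqrt{\frac{n}{\pi}} \int_{-\infty}^{\infty} e^{-n(t - s)^2} \Delta_\varphi^{it} y_\varphi\, dt \in H
\]
has the entire extension
\[
\mathbb{C} \ni z \mapsto \sqrt{\frac{n}{\pi}} \int_{-\infty}^{\infty} e^{-n(t - z)^2} \Delta_\varphi^{it} y_\varphi\, dt \in H
\]
and thus (see [28, Lemma 3.2] and [12, Theorem 6.1]) $(y_n)_\varphi \in \bigcap_{z \in \mathbb{C}} \mathrm{Dom}\, \Delta_\varphi^{iz}$ and
\[
\Delta_\varphi^{iz} (y_n)_\varphi = \sqrt{\frac{n}{\pi}} \int_{-\infty}^{\infty} e^{-n(t - z)^2} \Delta_\varphi^{it} y_\varphi\, dt, \qquad z \in \mathbb{C}. \tag{4.9}
\]
Similarly, $(y_n^*)_\varphi \in \bigcap_{z \in \mathbb{C}} \mathrm{Dom}\, \Delta_\varphi^{iz}$ and
\[
\Delta_\varphi^{iz} (y_n^*)_\varphi = \sqrt{\frac{n}{\pi}} \int_{-\infty}^{\infty} e^{-n(t - z)^2} \Delta_\varphi^{it} (y^*)_\varphi\, dt, \qquad z \in \mathbb{C}.
\]
Moreover, using (4.8), we get for every $n \ge 1$ and $z \in \mathbb{C}$
\[
\|\Delta_\varphi^{iz} (y_n)_\varphi\| \le e^{n(\operatorname{Im} z)^2}\|y_\varphi\|, \qquad \|\Delta_\varphi^{iz} (y_n^*)_\varphi\| \le e^{n(\operatorname{Im} z)^2}\|(y^*)_\varphi\|.
\]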
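Finally, for every $n \ge 1$, (4.7), (2.27) and (4.9) yield
\[
\sigma_z^\varphi(y_n)_\varphi = \sqrt{\frac{n}{\pi}} \int_{-\infty}^{\infty} e^{-n(t - z)^2} \sigma_t^\varphi(y)_\varphi\, dt = \Delta_\varphi^{iz} (y_n)_\varphi, \qquad z \in \mathbb{C}, \tag{4.10}
\]
hence (2.29) holds for $x = y_n$.

It remains to prove that (2.29) holds in full generality. First we show that
\[
x \in \mathfrak{T}_\varphi,\ z \in \mathbb{C},\ x_\varphi \in \mathrm{Dom}(\Delta_\varphi^{iz}) \ \Longrightarrow\ \sigma_z^\varphi(x)_\varphi = \Delta_\varphi^{iz} x_\varphi. \tag{4.11}
\]
By Lemma 2.6, this is equivalent to the implication
\[
x \in \mathfrak{T}_\varphi,\ z \in \mathbb{C},\ x_\varphi \in \mathrm{Dom}(\Delta_\varphi^{iz}),\ y \in \mathfrak{A}_\varphi \ \Longrightarrow\ \sigma_z^\varphi(x)\, J_\varphi y_\varphi = J_\varphi y J_\varphi \Delta_\varphi^{iz} x_\varphi,
\]
which we are now going to prove. Choose a sequence $\{y_n\}_{n \ge 1}$ in $\mathfrak{T}_\varphi$ as in the above part of the proof. For each $n \ge 1$, $(y_n)_\varphi \in \mathrm{Dom}(\Delta_\varphi^{-i\bar z})$ implies by (2.1) that
\[
J_\varphi (y_n)_\varphi \in \mathrm{Dom}(\Delta_\varphi^{-iz}) \quad \text{and} \quad \Delta_\varphi^{-iz} J_\varphi (y_n)_\varphi = J_\varphi \Delta_\varphi^{-i\bar z} (y_n)_\varphi,
\]
so, according to Theorem 2.3,
\[
x\, \Delta_\varphi^{-iz} J_\varphi (y_n)_\varphi = x J_\varphi \Delta_\varphi^{-i\bar z} (y_n)_\varphi \in \mathrm{Dom}(\Delta_\varphi^{iz})
\]
and
\[
\sigma_z^\varphi(x)\, J_\varphi (y_n)_\varphi = \Delta_\varphi^{iz} x\, \Delta_\varphi^{-iz} J_\varphi (y_n)_\varphi = \Delta_\varphi^{iz} x J_\varphi \Delta_\varphi^{-i\bar z} (y_n)_\varphi. \tag{4.12}
\]
Since, by (4.10) and by (2.5),
\[
x J_\varphi \Delta_\varphi^{-i\bar z} (y_n)_\varphi = x J_\varphi\, \sigma_{-\bar z}^\varphi(y_n)_\varphi = J_\varphi\, \sigma_{-\bar z}^\varphi(y_n)\, J_\varphi x_\varphi,
\]
(4.12) yields
\[
J_\varphi\, \sigma_{-\bar z}^\varphi(y_n)\, J_\varphi x_\varphi \in \mathrm{Dom}(\Delta_\varphi^{iz}) \quad \text{and} \quad \sigma_z^\varphi(x)\, J_\varphi (y_n)_\varphi = \Delta_\varphi^{iz} J_\varphi\, \sigma_{-\bar z}^\varphi(y_n)\, J_\varphi x_\varphi.
\]
Using again (2.1), we obtain
\[
\sigma_{-\bar z}^\varphi(y_n)\, J_\varphi x_\varphi \in \mathrm{Dom}(\Delta_\varphi^{i\bar z}) \quad \text{and} \quad \sigma_z^\varphi(x)\, J_\varphi (y_n)_\varphi = J_\varphi \Delta_\varphi^{i\bar z}\, \sigma_{-\bar z}^\varphi(y_n)\, J_\varphi x_\varphi. \tag{4.13}
\]
Taking into account that $x_\varphi \in \mathrm{Dom}(\Delta_\varphi^{iz})$ and, by Theorem 2.3 and by (2.1),
\[
J_\varphi \Delta_\varphi^{i\bar z}\, \sigma_{-\bar z}^\varphi(y_n)\, J_\varphi \supset J_\varphi y_n \Delta_\varphi^{i\bar z} J_\varphi \supset J_\varphi y_n J_\varphi \Delta_\varphi^{iz},
\]
(4.13) implies the equality $\sigma_z^\varphi(x)\, J_\varphi (y_n)_\varphi = J_\varphi y_n J_\varphi \Delta_\varphi^{iz} x_\varphi$. Passing now to the limit for $n \to \infty$, we conclude that $\sigma_z^\varphi(x)\, J_\varphi y_\varphi = J_\varphi y J_\varphi \Delta_\varphi^{iz} x_\varphi$.

Next we show by induction on $k$ that
\[
x \in \mathfrak{T}_\varphi \ \Longrightarrow\ x_\varphi \in \mathrm{Dom}\big(\Delta_\varphi^{k/2}\big) \tag{4.14}
\]
holds for every integer $k \ge 1$. Indeed,
\[
x \in \mathfrak{T}_\varphi \subset \mathfrak{A}_\varphi \ \Longrightarrow\ x_\varphi \in \mathrm{Dom}(S_\varphi) = \mathrm{Dom}\big(\Delta_\varphi^{1/2}\big)
\]
is clear and if (4.14) holds for some $k \ge 1$ and $x \in \mathfrak{T}_\varphi$, then we have by (4.11)
\[
\Delta_\varphi^{k/2} x_\varphi = \sigma_{-\frac{k}{2}i}^\varphi(x)_\varphi \in \mathrm{Dom}(S_\varphi) = \mathrm{Dom}\big(\Delta_\varphi^{1/2}\big), \quad \text{that is,}\ x_\varphi \in \mathrm{Dom}\big(\Delta_\varphi^{(k+1)/2}\big).
\]
On the other hand, for every $x \in \mathfrak{T}_\varphi$ and $k \ge 1$ we have $\sigma_{\frac{k}{2}i}^\varphi(x) \in \mathfrak{T}_\varphi$, so (4.14) yields $\sigma_{\frac{k}{2}i}^\varphi(x)_\varphi \in \mathrm{Dom}\big(\Delta_\varphi^{k/2}\big)$ and, using (4.11), we deduce
\[
x_\varphi = \sigma_{-\frac{k}{2}i}^\varphi\big(\sigma_{\frac{k}{2}i}^\varphi(x)\big)_\varphi = \Delta_\varphi^{k/2}\, \sigma_{\frac{k}{2}i}^\varphi(x)_\varphi \in \mathrm{Dom}\big(\Delta_\varphi^{-k/2}\big).
\]
Therefore (4.14) holds also for every integer $k \le -1$, that is, $x_\varphi \in \bigcap_{z \in \mathbb{C}} \mathrm{Dom}\, \Delta_\varphi^{z}$ for every $x \in \mathfrak{T}_\varphi$. This last inclusion together with (4.11) implies (2.29).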
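As a complementary remark on the mollification used in the above proof: with the convention (2.28), the inverse Fourier transform of $f_n(t) = \sqrt{n/\pi}\, e^{-n t^2}$ is $\hat f_n(\lambda) = e^{-\lambda^2/(4n)}$, so by (2.27) $(y_n)_\varphi = \hat f_n(\log \Delta_\varphi) y_\varphi$ with $0 < \hat f_n \le 1$ and $\hat f_n \to 1$ pointwise. This spectral reading is an observation added here, not the route taken in the text. The sketch below checks the closed form numerically; the function name is hypothetical and the integral is approximated by a trapezoidal rule on a truncated interval.

```python
import cmath
import math


def fourier_of_mollifier(lam: float, n: float, steps: int = 20000) -> complex:
    """Trapezoidal approximation of
    integral over R of sqrt(n/pi) * exp(-n t^2) * exp(i * lam * t) dt,
    i.e. the transform (2.28) applied to f_n."""
    half_width = 12.0 / math.sqrt(n)  # the Gaussian tail beyond this is negligible
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        t = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-n * t * t) * cmath.exp(1j * lam * t)
    return math.sqrt(n / math.pi) * total * h


if __name__ == "__main__":
    for n in (1.0, 4.0, 16.0):
        lam = 3.0
        print(n, fourier_of_mollifier(lam, n), math.exp(-lam * lam / (4.0 * n)))
```

For a fixed $\lambda$ the printed values increase towards 1 as $n$ grows, in line with the convergence $(y_n)_\varphi \to y_\varphi$.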
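5. Hermitian Maps

In this section, we analyse the notion introduced in Definition 2.10 by proving Lemma 2.9, Lemma 2.11 and Theorem 2.12.

Proof of Lemma 2.9. Let $\xi \in \mathrm{Dom}\, S_\psi$ be arbitrary. Then there is a sequence $(x_n)_{n \ge 1}$ in $\mathfrak{A}_\psi$ such that
\[
(x_n)_\psi \to \xi, \qquad (x_n^*)_\psi \to S_\psi \xi.
\]
Then, denoting
\[
\xi_+ = \frac{1}{2}(\xi + S_\psi \xi), \qquad \xi_- = \frac{1}{2i}(\xi - S_\psi \xi), \qquad a_n = \frac{1}{2}(x_n + x_n^*),
\]
we have
\[
\xi = \xi_+ + i\xi_-, \qquad S_\psi \xi_\pm = \xi_\pm, \ \text{i.e.}\ \xi_\pm \in K^{S_\psi}, \tag{5.1}
\]
\[
a_n^* = a_n \in \mathfrak{N}_\psi, \qquad (a_n)_\psi \to \xi_+. \tag{5.2}
\]
Since $\xi_+ = \xi$ if $\xi \in K^{S_\psi}$, (5.2) implies that $K^{S_\psi} \subset \overline{\{x_\psi ;\ x^* = x \in \mathfrak{N}_\psi\}}$. The converse inclusion being trivial, the equality $K^{S_\psi} = \overline{\{x_\psi ;\ x^* = x \in \mathfrak{N}_\psi\}}$ follows. On the other hand, (5.1) implies that $\mathrm{Dom}\, S_\psi = K^{S_\psi} + iK^{S_\psi}$. This proves (1) and (4) in Lemma 2.9.

For (2) and (3) we first notice that, for every $\xi \in K^{S_\psi}$ and $x \in \mathfrak{A}_\psi$,
\[
(\xi \mid J_\psi x_\psi) = (\xi \mid J_\psi S_\psi (x^*)_\psi) = (\xi \mid \Delta_\psi^{1/2} (x^*)_\psi) = (\Delta_\psi^{1/2} \xi \mid (x^*)_\psi) = (J_\psi (x^*)_\psi \mid \xi).
\]
In particular, $(\xi \mid J_\psi x_\psi) \in \mathbb{R}$ whenever $x = x^*$.

Conversely, let us assume that $\xi \in K$ is such that $(\xi \mid J_\psi x_\psi) \in \mathbb{R}$ if $x^* = x \in \mathfrak{N}_\psi$. For every $x \in \mathfrak{A}_\psi$,
\[
a = \frac{1}{2}(x + x^*), \qquad b = \frac{1}{2i}(x - x^*) \in \mathfrak{A}_\psi
\]
are self-adjoint and $x = a + ib$, hence, by our assumption on $\xi$,
\[
(\xi \mid J_\psi x_\psi) = (\xi \mid J_\psi a_\psi) + i(\xi \mid J_\psi b_\psi) = (J_\psi a_\psi \mid \xi) + i(J_\psi b_\psi \mid \xi) = (J_\psi (a - ib)_\psi \mid \xi) = (J_\psi (x^*)_\psi \mid \xi).
\]
It follows that $(x_\psi \mid J_\psi \xi) = (\Delta_\psi^{1/2} x_\psi \mid \xi)$ for all $x \in \mathfrak{A}_\psi$, hence, $\{x_\psi ;\ x \in \mathfrak{A}_\psi\}$ being a core of $\Delta_\psi^{1/2}$, $\xi$ belongs to the domain of $(\Delta_\psi^{1/2})^* = \Delta_\psi^{1/2}$ and $\Delta_\psi^{1/2} \xi = J_\psi \xi$. In other words, $\xi \in \mathrm{Dom}\, S_\psi$ and $S_\psi \xi = J_\psi \Delta_\psi^{1/2} \xi = \xi$, i.e. $\xi \in K^{S_\psi}$.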
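The decomposition $\xi = \xi_+ + i\xi_-$ from (5.1) relies only on $S_\psi$ being an antilinear involution. The sketch below illustrates this on a hypothetical two-dimensional toy model, not taken from the paper: $S(\xi_1, \xi_2) = (c\,\bar\xi_2,\ \bar\xi_1 / \bar c)$ for a fixed non-zero complex number $c$, which is antilinear, satisfies $S^2 = 1$ and is not isometric unless $|c| = 1$. All names in the code are assumptions made for the illustration.

```python
from typing import Tuple

Vec = Tuple[complex, complex]


def S(xi: Vec, c: complex = 2.0 + 1.0j) -> Vec:
    """Toy antilinear involution on C^2 (hypothetical model, see lead-in)."""
    x1, x2 = xi
    return (c * x2.conjugate(), x1.conjugate() / c.conjugate())


def decompose(xi: Vec, c: complex = 2.0 + 1.0j) -> Tuple[Vec, Vec]:
    """Return (xi_plus, xi_minus) with xi_plus = (xi + S xi)/2 and
    xi_minus = (xi - S xi)/(2i), as in the proof of Lemma 2.9."""
    s = S(xi, c)
    plus = ((xi[0] + s[0]) / 2, (xi[1] + s[1]) / 2)
    minus = ((xi[0] - s[0]) / 2j, (xi[1] - s[1]) / 2j)
    return plus, minus


def close(u: Vec, v: Vec, tol: float = 1e-12) -> bool:
    """Componentwise comparison up to a small tolerance."""
    return abs(u[0] - v[0]) < tol and abs(u[1] - v[1]) < tol


if __name__ == "__main__":
    xi = (1.0 - 2.0j, 0.5 + 3.0j)
    plus, minus = decompose(xi)
    print(close(S(S(xi)), xi))   # S is an involution
    print(close(S(plus), plus))  # xi_plus is fixed by S
    print(close(S(minus), minus))  # xi_minus is fixed by S
    recombined = (plus[0] + 1j * minus[0], plus[1] + 1j * minus[1])
    print(close(recombined, xi))  # xi = xi_plus + i * xi_minus
```

The factor $1/(2i)$ in $\xi_-$ is what makes $\xi_-$ a fixed point: antilinearity of $S$ turns it into $-1/(2i)$, which compensates the sign change of $\xi - S\xi$.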
Proof of Lemma 2.11. If $T \in B(K, H)$ implements $\psi$ in $\varphi$ and $x^* = x \in \mathfrak{N}_\psi$, then
\[
T x_\psi = (TxT^*)_\varphi \quad \text{with} \quad (TxT^*)^* = TxT^* \in \mathfrak{N}_\varphi
\]
and hence Lemma 2.9(1) implies $T K^{S_\psi} \subset H^{S_\varphi}$, proving (1). In this case, the inclusion $T \mathfrak{N}_\psi T^* \subset \mathfrak{N}_\varphi$ and the wo-density of $\mathfrak{N}_\psi$ in $N$ imply $T N T^* \subset M$, proving (2). If $T$ is isometric in addition, then
\[
x = T^*(TxT^*)T, \qquad x \in N
\]
shows the injectivity of the map $N \ni x \mapsto TxT^* \in M$, which is clearly also a ∗-homomorphism. Furthermore, $0 \le a \in \mathfrak{M}_\psi$ implies $a^{1/2} \in \mathfrak{N}_\psi$ and
\[
\psi(a) = \|(a^{1/2})_\psi\|^2 = \|T (a^{1/2})_\psi\|^2 = \|(T a^{1/2} T^*)_\varphi\|^2 = \varphi(T a T^*).
\]
Therefore (3) holds.

Let us finally assume that $\psi$ and $\varphi$ are bounded and $\xi_\psi = (1_K)_\psi$, $\eta_\varphi = (1_H)_\varphi$. If $T \in B(K, H)$ is injective and implements $\psi$ in $\varphi$, then $T N T^* \subset M$ by the above proved (2) and
\[
T T^* \eta_\varphi = (T 1_K T^*)_\varphi = T (1_K)_\psi = T \xi_\psi \ \Longrightarrow\ T^* \eta_\varphi = \xi_\psi
\]
by the injectivity of $T$. Conversely, if $T \in B(K, H)$ is injective and satisfies $T N T^* \subset M$ and $T^* \eta_\varphi = \xi_\psi$, then for $x \in N$
\[
(TxT^*)_\varphi = TxT^* \eta_\varphi = T x \xi_\psi = T x_\psi.
\]
Hence we have (4).

Proof of Theorem 2.12. (1), (2) and (3) in Lemma 2.9 imply the equivalences (1) ⇔ (2), (2) ⇔ (3) and (2) ⇔ (4), respectively.

Let us assume that (1) holds. By (4) in Lemma 2.9, every $\xi \in \mathrm{Dom}\, S_\psi$ is of the form $\xi = \xi_1 + i\xi_2$ with $\xi_1, \xi_2 \in K^{S_\psi}$. Hence we get
\[
T(\xi) = T(\xi_1) + iT(\xi_2) \in H^{S_\varphi} + iH^{S_\varphi} \subset \mathrm{Dom}\, S_\varphi,
\]
\[
S_\varphi T(\xi) = T(\xi_1) - iT(\xi_2) = T(\xi_1 - i\xi_2) = T S_\psi(\xi),
\]
proving (5). Conversely, if (5) holds, then we have for every $\xi \in K^{S_\psi} \subset \mathrm{Dom}\, S_\psi$
\[
T(\xi) \in \mathrm{Dom}\, S_\varphi \quad \text{and} \quad S_\varphi T(\xi) = T S_\psi(\xi) = T(\xi),
\]
so $T(\xi) \in H^{S_\varphi}$. Therefore (1) ⇔ (5).

Since $J_\psi$ is involutive and, by (2.2), $S_\psi = \Delta_\psi^{-1/2} J_\psi$ and $S_\varphi = \Delta_\varphi^{-1/2} J_\varphi$, (5) is equivalent to
\[
T \Delta_\psi^{-1/2} \subset \Delta_\varphi^{-1/2} J_\varphi T J_\psi.
\]
This inclusion, in turn, is equivalent to the validity of
\[
\Delta_\varphi^{1/2} T \Delta_\psi^{-1/2} \xi = J_\varphi T J_\psi \xi, \qquad \xi \in \mathrm{Dom}\, \Delta_\psi^{-1/2}
\]
and thus (5) ⇔ (6).
We have already seen that (1) ⇔ (3). Applying this equivalence to $J_\psi T^*J_\varphi$, it follows that $J_\psi T^*J_\varphi$ is Hermitian with respect to $(\varphi,\psi)$ if and only if
\[ (J_\psi T^*J_\varphi y_\varphi \mid J_\psi x_\psi) = (x_\psi \mid T^*J_\varphi y_\varphi) = (Tx_\psi \mid J_\varphi y_\varphi) \]
is real for all $x^* = x \in \mathfrak{N}_\psi$ and $y^* = y \in \mathfrak{N}_\varphi$. But this means exactly (3), so (3) ⇔ (7).

By the equivalence of statements (1) and (5) in Theorem 2.3 with $A = \Delta_\varphi$, $B = \Delta_\psi$, $\beta = -\tfrac12$, and taking into account that they imply (2.24) and (2.25), we obtain the equivalence (6) ⇔ (8).

Now let us assume that the equivalent conditions (1)–(8) are satisfied. Then, by (2.26) in Theorem 2.3, we get (2.32). Further, using (2.2), we obtain (2.33) immediately from (2.32) and (2.30). Since the map $T(\cdot)$ is bounded and
\[ \|T(s)\| = \|\Delta_\varphi^{is}T\Delta_\psi^{-is}\| = \|T\|, \qquad \big\|T\big(s - \tfrac{i}{2}\big)\big\| \overset{(2.33)}{=} \|J_\varphi T(s)J_\psi\| = \|T\|, \qquad s \in \mathbb{R}, \]
we get also (2.31) by the Three Line Theorem. Finally, since $K^{S_\psi}$ and $H^{S_\varphi}$ are invariant under $\Delta_\psi^{-is}$ and $\Delta_\varphi^{is}$, respectively, for every $s \in \mathbb{R}$, due to (2.3) and Lemma 2.9(1), we obtain the Hermiticity of $T(s) = \Delta_\varphi^{is}T\Delta_\psi^{-is}$ from (1).

6. Generalization of the Structure Theorem of Borchers

We prove Theorem 2.13 in two steps: first we prove it for the case where $\Xi_o$ and $\Xi_1$ are empty, and then we reduce the proof of the general case to the above special case.

Step 1. Proof in the case of $\Xi_o = \Xi_1 = \emptyset$ and wo-continuous $T(\cdot)$. By our assumptions in this step, $\overline{S_\beta} \ni z \mapsto T(z) \in B(K,H)$ is a bounded wo-continuous map which is analytic in $S_\beta$ and satisfies the boundary conditions
(i) $T(s)$ is Hermitian with respect to $(\psi,\varphi)$ for all $s \in \mathbb{R}$,
(ii) $J_\varphi T(s + i\beta)J_\psi$ is Hermitian with respect to $(\psi,\varphi)$ for all $s \in \mathbb{R}$.

Let $x \in \mathfrak{T}_\psi$ and $y \in \mathfrak{T}_\varphi$ be arbitrary (for the Tomita algebras $\mathfrak{T}_\psi$ and $\mathfrak{T}_\varphi$ see the comments before Proposition 2.8) such that
\[ \|\Delta_\psi^{iz}x_\psi\| \le e^{c(\Im z)^2}\|x_\psi\|, \qquad \|\Delta_\varphi^{iz}y_\varphi\| \le e^{c(\Im z)^2}\|y_\varphi\|, \qquad z \in \mathbb{C}, \tag{6.1} \]
for some constant $c \ge 0$. Consider the functions
\[ f_1 : \mathbb{C} \times \overline{S_\beta} \ni (z_1,z_2) \mapsto \big(T(z_2)\Delta_\psi^{-iz_1}x_\psi \,\big|\, J_\varphi\Delta_\varphi^{-iz_1}y_\varphi\big), \]
\[ f_2 : \mathbb{C} \times \overline{S_{-\beta}} \ni (z_1,z_2) \mapsto \big(\Delta_\varphi^{-iz_1+\frac12}y_\varphi \,\big|\, T(\overline{z_2})J_\psi\Delta_\psi^{-iz_1+\frac12}x_\psi\big). \]
They are continuous and, according to (6.1), bounded on any set of the form
\[ \{z_1 \in \mathbb{C};\ |\Im z_1| \le \delta\} \times \overline{S_\beta} \quad\text{and}\quad \{z_1 \in \mathbb{C};\ |\Im z_1| \le \delta\} \times \overline{S_{-\beta}}, \text{ respectively}, \qquad \delta > 0. \]
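Furthermore, the partial functions
\[ \mathbb{C} \ni z_1 \mapsto f_1(z_1,z_2),\ z_2 \in \overline{S_\beta}, \qquad S_\beta \ni z_2 \mapsto f_1(z_1,z_2),\ z_1 \in \mathbb{C}, \]
\[ \mathbb{C} \ni z_1 \mapsto f_2(z_1,z_2),\ z_2 \in \overline{S_{-\beta}}, \qquad S_{-\beta} \ni z_2 \mapsto f_2(z_1,z_2),\ z_1 \in \mathbb{C} \]
are analytic.

Now, by (2.29) and Theorem 2.12(4), (i) implies, for every $z_1 \in \mathbb{C}$ and $s \in \mathbb{R}$,
\begin{align*}
f_1(z_1,s) &= \big(T(s)\,\sigma^\psi_{-z_1}(x)_\psi \,\big|\, J_\varphi\,\sigma^\varphi_{-z_1}(y)_\varphi\big) \\
&= \big(J_\varphi\,(\sigma^\varphi_{-z_1}(y)^*)_\varphi \,\big|\, T(s)\,(\sigma^\psi_{-z_1}(x)^*)_\psi\big) \\
&= \big(J_\varphi S_\varphi\,\sigma^\varphi_{-z_1}(y)_\varphi \,\big|\, T(s)S_\psi\,\sigma^\psi_{-z_1}(x)_\psi\big) \\
&= \big(\Delta_\varphi^{-iz_1+\frac12}y_\varphi \,\big|\, T(s)J_\psi\Delta_\psi^{-iz_1+\frac12}x_\psi\big) = f_2(z_1,s).
\end{align*}
Therefore
\[ f : \mathbb{C} \times \{z_2 \in \mathbb{C};\ |\Im z_2| \le |\beta|\} \ni (z_1,z_2) \mapsto \begin{cases} f_1(z_1,z_2) & \text{if } z_2 \in \overline{S_\beta}, \\ f_2(z_1,z_2) & \text{if } z_2 \in \overline{S_{-\beta}} \end{cases} \]
is a well defined continuous function, bounded on every set of the form
\[ \{z_1 \in \mathbb{C};\ |\Im z_1| \le \delta\} \times \{z_2 \in \mathbb{C};\ |\Im z_2| \le |\beta|\}, \qquad \delta > 0. \]
For each fixed $z_1 \in \mathbb{C}$, the function $S_\beta \cup S_{-\beta} \ni z_2 \mapsto f(z_1,z_2)$ is analytic. Hence, by the Morera Theorem (the one-dimensional edge-of-the-wedge theorem, see for example [3, 2.1.9.(2)] or [11, II.2.7]), it can be analytically extended across $\mathbb{R}$, that is, the partial functions
\[ \{z_2 \in \mathbb{C};\ |\Im z_2| < |\beta|\} \ni z_2 \mapsto f(z_1,z_2), \qquad z_1 \in \mathbb{C}, \]
are analytic. Thus we can apply to $f$ the Osgood Lemma (the Hartogs Theorem for continuous functions, see for example [18, Theorem I.A.2]) and deduce that it is analytic, as a function of two complex variables, on $\mathbb{C} \times \{z_2 \in \mathbb{C};\ |\Im z_2| < |\beta|\}$.

For every $z_1 \in \mathbb{C}$ and $s \in \mathbb{R}$, (ii) implies by (2.29) and Theorem 2.12(4),
\begin{align*}
f\big(z_1 + \tfrac{i}{2},\, s + i\beta\big) &= f_1\big(z_1 + \tfrac{i}{2},\, s + i\beta\big) \\
&= \big(T(s+i\beta)J_\psi S_\psi\Delta_\psi^{-iz_1}x_\psi \,\big|\, S_\varphi\Delta_\varphi^{-iz_1}y_\varphi\big) \\
&= \big(T(s+i\beta)J_\psi\,(\sigma^\psi_{-z_1}(x)^*)_\psi \,\big|\, (\sigma^\varphi_{-z_1}(y)^*)_\varphi\big) \\
&= \big(J_\varphi\,(\sigma^\varphi_{-z_1}(y)^*)_\varphi \,\big|\, J_\varphi T(s+i\beta)J_\psi\,(\sigma^\psi_{-z_1}(x)^*)_\psi\big) \\
&= \big(J_\varphi T(s+i\beta)J_\psi\,\sigma^\psi_{-z_1}(x)_\psi \,\big|\, J_\varphi\,\sigma^\varphi_{-z_1}(y)_\varphi\big) \\
&= \big(\sigma^\varphi_{-z_1}(y)_\varphi \,\big|\, T(s+i\beta)J_\psi\,\sigma^\psi_{-z_1}(x)_\psi\big) \\
&= \big(\Delta_\varphi^{-i(z_1-\frac{i}{2})+\frac12}y_\varphi \,\big|\, T(\overline{s-i\beta})J_\psi\Delta_\psi^{-i(z_1-\frac{i}{2})+\frac12}x_\psi\big) \\
&= f_2\big(z_1 - \tfrac{i}{2},\, s - i\beta\big) = f\big(z_1 - \tfrac{i}{2},\, s - i\beta\big).
\end{align*}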
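The two boundary computations for $f_1$ and $f_2$ rest on a small amount of exponent bookkeeping, which the following sketch makes explicit. It assumes that all complex powers of the modular operators are applied to vectors on which they are defined (as for vectors coming from the Tomita algebras), and that the argument of $T$ in $f_2$ is the complex conjugate of $z_2$, which is how the garbled source formula is read here.

```latex
\begin{align*}
-i\left(z_1 + \tfrac{i}{2}\right) &= -iz_1 + \tfrac12
  &&\Longrightarrow\quad \Delta^{-i(z_1 + i/2)} = \Delta^{1/2}\,\Delta^{-iz_1},\\
-i\left(z_1 - \tfrac{i}{2}\right) + \tfrac12 &= -iz_1
  &&\Longrightarrow\quad \Delta^{-i(z_1 - i/2) + 1/2} = \Delta^{-iz_1},\\
\overline{s - i\beta} &= s + i\beta . &&
\end{align*}
```

The first line is what turns $f_1(z_1 + \tfrac{i}{2}, \cdot)$ into an expression involving $\Delta^{1/2} = JS$, the second and third are what identify the final expression with $f_2(z_1 - \tfrac{i}{2}, s - i\beta)$.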
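Therefore, for each $s \in \mathbb{R}$, the bounded, continuous function
\[ g_s : \{\zeta \in \mathbb{C};\ |\Im\zeta| \le \tfrac12\} \ni \zeta \mapsto f(\zeta,\, s + 2\beta\zeta), \]
which is analytic in the interior, satisfies
\[ g_s\big(t + \tfrac{i}{2}\big) = g_s\big(t - \tfrac{i}{2}\big), \qquad t \in \mathbb{R}. \]
By the Morera Theorem, $g_s$ extends to a periodic entire function with period $i$, still denoted by $g_s$, which is bounded. By the Liouville Theorem it follows that $g_s$ is constant, hence we get successively
\[ f_1(0,s) = f(0,s) = g_s(0) = g_s\Big(\frac{-s}{2\beta}\Big) = f\Big(\frac{-s}{2\beta},\,0\Big) = f_1\Big(\frac{-s}{2\beta},\,0\Big), \]
\[ \big(T(s)x_\psi \,\big|\, J_\varphi y_\varphi\big) = \big(T(0)\Delta_\psi^{i\frac{s}{2\beta}}x_\psi \,\big|\, J_\varphi\Delta_\varphi^{i\frac{s}{2\beta}}y_\varphi\big) \overset{(2.2)}{=} \big(\Delta_\varphi^{-i\frac{s}{2\beta}}T(0)\Delta_\psi^{i\frac{s}{2\beta}}x_\psi \,\big|\, J_\varphi y_\varphi\big). \]
By the density property of $\mathfrak{T}_\varphi$ stated in Proposition 2.8, the above equalities imply that
\[ T(s) = \Delta_\varphi^{-i\frac{s}{2\beta}}\,T(0)\,\Delta_\psi^{i\frac{s}{2\beta}}, \qquad s \in \mathbb{R}, \]
hence (2.36) holds with $T = T(0)$. From (2.32) and (2.33) in Theorem 2.12, we obtain (2.37) and (2.38).

Step 2. Proof in the general case. Let us consider, for any integer $n \ge 1$, the entire function
\[ \mathbb{C} \ni z \mapsto f_n(z) = \sqrt{\frac{n}{\pi}}\;e^{-nz^2} \]
and the mollification of $T(\cdot)$
\[ \overline{S_\beta} \ni z \mapsto T_n(z) = \text{wo-}\!\int_{-\infty}^{\infty} f_n(t)\,T(t+z)\,dt \in B(K,H). \tag{6.2} \]
We notice that the mapping $\mathbb{R} \ni t \mapsto T(t+z) \in B(K,H)$ is norm-continuous for $z \in S_\beta$ and, due to the continuity conditions (2.34) and (2.35), wo-measurable with respect to the Lebesgue measure for $z \in \partial S_\beta$. Since $T(\cdot)$ is bounded and $f_n(t)\,dt$ is a probability measure, the integrals in (6.2) exist and
\[ \sup\{\|T_n(z)\|;\ z \in \overline{S_\beta},\ n \ge 1\} \le \sup\{\|T(z)\|;\ z \in \overline{S_\beta}\} < \infty. \tag{6.3} \]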
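The bound (6.3) is a direct consequence of the Gaussian normalization. The sketch below assumes the reading $f_n(z) = \sqrt{n/\pi}\,e^{-nz^2}$ of the garbled source formula, which is the one consistent with the statement that $f_n(t)\,dt$ is a probability measure.

```latex
\begin{align*}
\int_{-\infty}^{\infty} f_n(t)\,dt
  &= \sqrt{\tfrac{n}{\pi}} \int_{-\infty}^{\infty} e^{-nt^2}\,dt
   = \sqrt{\tfrac{n}{\pi}}\cdot\sqrt{\tfrac{\pi}{n}} = 1,\\
\|T_n(z)\|
  &\le \int_{-\infty}^{\infty} f_n(t)\,\|T(t+z)\|\,dt
   \le \sup_{w}\,\|T(w)\| .
\end{align*}
```

Here the supremum runs over the closed strip on which $T(\cdot)$ is defined; the estimate holds for every $n \ge 1$ and every $z$ in that strip because real translations $t + z$ do not leave it.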
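Further, (2.34) and (2.35) yield by the Lebesgue Dominated Convergence Theorem
\[ T_n(s) = \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 0}} T_n(s+it), \qquad s \in \mathbb{R}, \tag{6.4} \]
\[ T_n(s+i\beta) = \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 1}} T_n(s+it), \qquad s \in \mathbb{R}. \tag{6.5} \]
We compare the operator valued function $T_n(\cdot)$ with
\[ \mathbb{C} \ni z \mapsto T_{\zeta,n}(z) = \text{wo-}\!\int_{\mathbb{R}+\zeta} f_n(w-z)\,T(w)\,dw = \text{wo-}\!\int_{-\infty}^{\infty} f_n(t+\zeta-z)\,T(t+\zeta)\,dt \in B(K,H), \tag{6.6} \]
where $\zeta \in S_\beta$. Due to
\[ f_n(t+\zeta-z) = \sqrt{\frac{n}{\pi}}\;e^{-nt^2}\,e^{-2nt(\zeta-z)-n(\zeta-z)^2}, \]
the integral in (6.6) is convergent and defines an entire mapping $T_{\zeta,n}(\cdot)$. For $\zeta_1, \zeta_2 \in S_\beta$ and $z \in \mathbb{C}$, we have by the Cauchy Integral Theorem
\[ T_{\zeta_1,n}(z) = \text{wo-}\!\int_{\mathbb{R}+\zeta_1} f_n(w-z)\,T(w)\,dw = \text{wo-}\!\int_{\mathbb{R}+\zeta_2} f_n(w-z)\,T(w)\,dw = T_{\zeta_2,n}(z), \]
so $T_{\zeta,n}(\cdot)$ does not depend on $\zeta \in S_\beta$. Therefore, for any $\zeta \in S_\beta$,
\[ T_{\zeta,n}(z) = T_{z,n}(z) = T_n(z), \qquad z \in S_\beta. \tag{6.7} \]
Let $\zeta \in S_\beta$ be arbitrary. Since $T_{\zeta,n}(\cdot)$ is an entire mapping, by (6.4), (6.5) and (6.7) we get for every $s \in \mathbb{R}$
\[ T_n(s) = \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 0}} T_n(s+it) = \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 0}} T_{\zeta,n}(s+it) = T_{\zeta,n}(s), \]
\[ T_n(s+i\beta) = \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 1}} T_n(s+it) = \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 1}} T_{\zeta,n}(s+it) = T_{\zeta,n}(s+i\beta). \]
Consequently, the mapping $\overline{S_\beta} \ni z \mapsto T_n(z)$ defined in (6.2) is a restriction of the entire mapping $T_{\zeta,n}(\cdot)$. In particular, it is so-continuous (the role of $T_{\zeta,n}(\cdot)$ is just to prove this statement) and its restriction to $S_\beta$ is analytic. We recall that its boundedness was already noticed in (6.3).

Since $\mathbb{R} \ni t \mapsto f_n(t)$ is a real function, (1) ⇔ (3) in Theorem 2.12 implies that the Hermiticity of $T(s)$ and $J_\varphi T(s+i\beta)J_\psi$ for $s \in \mathbb{R}\setminus\Xi_o$, respectively $s \in \mathbb{R}\setminus\Xi_1$, is inherited by $T_n(s)$ and $J_\varphi T_n(s+i\beta)J_\psi$ for all $s \in \mathbb{R}$. Thus $T_n(\cdot)$ fulfils all the assumptions made in Step 1. Consequently there exists $T_n \in B(K,H)$ satisfying
\[ T_n(s) = \Delta_\varphi^{-i\frac{s}{2\beta}}\,T_n\,\Delta_\psi^{i\frac{s}{2\beta}}, \qquad s \in \mathbb{R}. \]
It follows that
\[ T_n(s+t) = \Delta_\varphi^{-i\frac{s}{2\beta}}\,T_n(t)\,\Delta_\psi^{i\frac{s}{2\beta}}, \qquad t, s \in \mathbb{R}, \]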
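which yields by analytic extension
\[ T_n(z+s) = \Delta_\varphi^{-i\frac{s}{2\beta}}\,T_n(z)\,\Delta_\psi^{i\frac{s}{2\beta}}, \qquad z \in \overline{S_\beta},\ s \in \mathbb{R}. \tag{6.8} \]
On the other hand, from (6.2) we obtain
\[ \text{norm-}\!\lim_{n\to\infty} T_n(z) = T(z), \qquad z \in S_\beta, \]
due to the boundedness and norm-continuity of $T(\cdot)$ in $S_\beta$. Thus we get by (6.8)
\[ T(z+s) = \Delta_\varphi^{-i\frac{s}{2\beta}}\,T(z)\,\Delta_\psi^{i\frac{s}{2\beta}}, \qquad z \in S_\beta,\ s \in \mathbb{R}. \]
Now choose some $s_o \in \mathbb{R}\setminus\Xi_o$ and denote
\[ T = \Delta_\varphi^{i\frac{s_o}{2\beta}}\,T(s_o)\,\Delta_\psi^{-i\frac{s_o}{2\beta}}. \]
Then $T$ is Hermitian with respect to $(\psi,\varphi)$ and we have for every $s \in \mathbb{R}\setminus\Xi_o$
\begin{align*}
T(s) &= \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 0}} T(s+it)
 = \text{wo-}\!\!\lim_{\substack{0 < t/\beta < 1 \\ t/\beta \to 0}} \Delta_\varphi^{-i\frac{s-s_o}{2\beta}}\,T(s_o+it)\,\Delta_\psi^{i\frac{s-s_o}{2\beta}} \\
&= \Delta_\varphi^{-i\frac{s}{2\beta}}\,\Delta_\varphi^{i\frac{s_o}{2\beta}}\,T(s_o)\,\Delta_\psi^{-i\frac{s_o}{2\beta}}\,\Delta_\psi^{i\frac{s}{2\beta}}
 = \Delta_\varphi^{-i\frac{s}{2\beta}}\,T\,\Delta_\psi^{i\frac{s}{2\beta}},
\end{align*}
that is, (2.36) holds. Using (1) ⇒ (8) in Theorem 2.12, (2.32) and (2.33), as well as the uniqueness theorem of the Riesz brothers (see Sec. 3, Theorem 2.3, Step 1 of the proof of (3) ⇒ (4)), we obtain that $T(\cdot)$ extends to an so-continuous map on $\overline{S_\beta}$, for which (2.37) and (2.38) hold.

7. Proof of Theorem 2.1

We recall the setting:
• $N \subset M \subset B(H)$ are von Neumann algebras,
• $\varphi$ is a normal semi-finite faithful weight on $M$ such that its restriction $\psi$ to $N$ is semi-finite,
• we assume that $M$ is in standard form with respect to $\varphi$ and $\{y_\varphi ;\ y \in \mathfrak{N}_\psi\}$ is dense in $H$, hence $N$ is in standard form with respect to $\psi$ and $y_\psi = y_\varphi$ for all $y \in \mathfrak{N}_\psi$.

We shall use the notations $\Delta_M = \Delta_\varphi$, $J_M = J_\varphi$ and $\Delta_N = \Delta_\psi$, $J_N = J_\psi$. The proof of Theorem 2.1 will be performed in nine steps.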
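Step 1. Application of the Modular Extension Theorem. Since $S_\psi \subset S_\varphi$, hence $H^{S_\psi} \subset H^{S_\varphi}$, the identity map $I$ on $H$ is Hermitian with respect to $(\psi,\varphi)$. By (1) ⇔ (8) in Theorem 2.12, (2.32) and (2.33), we obtain an so-continuous map $\overline{S_{-1/2}} \ni z \mapsto I(z) \in B(H)$, which is analytic in $S_{-1/2}$ and satisfies the conditions
\[ I(s) = \Delta_M^{is}\Delta_N^{-is} \quad\text{and}\quad I\big(s - \tfrac{i}{2}\big) = J_M I(s)J_N, \qquad s \in \mathbb{R}, \]
\[ \|I(z)\| \le 1, \qquad z \in \overline{S_{-1/2}}. \]
Therefore the mapping $\overline{S_{-1/2}} \ni \zeta \mapsto W(\zeta) = I(-\overline{\zeta})^* \in B(H)$ is so-continuous, analytic in $S_{-1/2}$, and such that
\[ W(s) = \Delta_N^{-is}\Delta_M^{is} \quad\text{and}\quad W\big(s - \tfrac{i}{2}\big) = J_N W(s)J_M, \qquad s \in \mathbb{R}, \tag{7.1} \]
\[ \|W(\zeta)\| \le 1, \qquad \zeta \in \overline{S_{-1/2}}. \tag{7.2} \]

Step 2. Hermiticity on the boundary. First we show that
\[ \big(W\big(s - \tfrac{i}{2}\big)x_\psi \,\big|\, J_N y_\psi\big) \in \mathbb{R}, \qquad x^* = x \in \mathfrak{N}_\psi,\ y^* = y \in \mathfrak{N}_\psi, \]
which is equivalent, according to (1) ⇔ (3) in Theorem 2.12, to
\[ W\big(s - \tfrac{i}{2}\big) \text{ is Hermitian with respect to } (\psi,\psi), \qquad s \in \mathbb{R}. \tag{7.3} \]
Indeed,
\begin{align*}
\big(W\big(s - \tfrac{i}{2}\big)x_\psi \,\big|\, J_N y_\psi\big)
&\overset{(7.1)}{=} \big(J_N\Delta_N^{-is}\Delta_M^{is}J_M x_\psi \,\big|\, J_N y_\psi\big)
 = \big(\Delta_N^{is}y_\psi \,\big|\, \Delta_M^{is}J_M x_\psi\big) \\
&\overset{(2.2)}{=} \big(\Delta_N^{is}y_\psi \,\big|\, J_M\Delta_M^{is}x_\varphi\big)
 \overset{(2.3)}{=} \big(\sigma_s^\psi(y)_\psi \,\big|\, J_M\,\sigma_s^\varphi(x)_\varphi\big) \\
&= \big(\sigma_s^\psi(y)_\varphi \,\big|\, J_M\,\sigma_s^\varphi(x)_\varphi\big)
\end{align*}
is real because of Lemma 2.9(2).

Next we show, using the negative half-sided modular inclusion assumption (2.19), that
\[ W(s) \text{ is Hermitian with respect to } (\psi,\psi), \qquad s \le 0, \tag{7.4} \]
\[ J_N W(s)J_N \text{ is Hermitian with respect to } (\psi,\psi), \qquad s \ge 0. \tag{7.5} \]
For (7.4), let $s \le 0$ and $x^* = x \in \mathfrak{N}_\psi$ be arbitrary. By (2.19), we have $\sigma_s^\varphi(x) \in N$, so $\sigma_s^\varphi(x) \in \mathfrak{N}_\psi$ and $\sigma_s^\varphi(x)_\varphi = \sigma_s^\varphi(x)_\psi$. Thus
\[ W(s)x_\psi \overset{(7.1)}{=} \Delta_N^{-is}\Delta_M^{is}x_\varphi \overset{(2.3)}{=} \Delta_N^{-is}\,\sigma_s^\varphi(x)_\varphi = \Delta_N^{-is}\,\sigma_s^\varphi(x)_\psi \overset{(2.3)}{=} \big(\sigma_{-s}^\psi(\sigma_s^\varphi(x))\big)_\psi \]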
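and the Hermiticity of $W(s)$ with respect to $(\psi,\psi)$ follows by using (2) ⇒ (1) in Theorem 2.12.

Now, for (7.5), let $s \ge 0$ and $x^* = x \in \mathfrak{N}_\psi$, $y^* = y \in \mathfrak{N}_\psi$ be arbitrary. Then, by the already proved (7.4), $W(-s)$ is Hermitian with respect to $(\psi,\psi)$. So
\[ \big(J_N W(s)J_N x_\psi \,\big|\, J_N y_\psi\big) = \big(y_\psi \,\big|\, W(s)J_N x_\psi\big) = \big(W(-s)y_\psi \,\big|\, J_N x_\psi\big) \]
is real by (1) ⇒ (3) in Theorem 2.12. Hence, owing to the converse implication (3) ⇒ (1) in Theorem 2.12, $J_N W(s)J_N$ is Hermitian with respect to $(\psi,\psi)$.

To summarize:
• $W(\zeta)$ is Hermitian with respect to $(\psi,\psi)$ for $\zeta \in (-\infty,0] \cup (\mathbb{R} - \tfrac{i}{2})$ and
• $J_N W(\zeta)J_N$ is Hermitian with respect to $(\psi,\psi)$ for $\zeta \in [0,\infty)$.

Step 3. Change of variable. We shall use the analytic logarithm branches
\[ \log_+ : \big\{re^{i\theta};\ r > 0,\ -\tfrac{\pi}{2} < \theta < \tfrac{3\pi}{2}\big\} \ni re^{i\theta} \mapsto \log r + i\theta, \]
\[ \log_- : \big\{re^{i\theta};\ r > 0,\ -\tfrac{3\pi}{2} < \theta < \tfrac{\pi}{2}\big\} \ni re^{i\theta} \mapsto \log r + i\theta. \]
For any $\beta \in \mathbb{R}$, $\beta \ne 0$, we consider (like in Sec. 3, Theorem 2.3, in the proof of (3) ⇒ (4)) the one point compactification of the right half and the left half of $\overline{S_\beta}$ and denote each added point by $+\infty$ and $-\infty$, respectively.

In order to apply Theorem 2.13 to $W$, we have to map $S_{-1/2}$ conformally onto some $S_\beta$, $\beta \ne 0$, such that $(-\infty,0) \cup \{-\infty\} \cup (\mathbb{R} - \tfrac{i}{2})$ corresponds to $\mathbb{R}$, and $(0,\infty)$ to $\mathbb{R} + i\beta$. This is done, for $\beta = \pi$, by
\[ \overline{S_{-1/2}}\setminus\{0\} \ni \zeta \mapsto \Psi(\zeta) = \log_+\big(1 - e^{2\pi\zeta}\big) \in \overline{S_\pi}\setminus\{0\}, \]
which extends to a homeomorphism $\Psi : \overline{S_{-1/2}} \cup \{-\infty,+\infty\} \to \overline{S_\pi} \cup \{-\infty,+\infty\}$ satisfying $\Psi(0) = -\infty$, $\Psi(-\infty) = 0$ and $\Psi(+\infty) = +\infty$. The inverse homeomorphism $\Psi^{-1} : \overline{S_\pi} \cup \{-\infty,+\infty\} \to \overline{S_{-1/2}} \cup \{-\infty,+\infty\}$ is given by
\[ \overline{S_\pi}\setminus\{0\} \ni z \mapsto \Psi^{-1}(z) = \frac{1}{2\pi}\log_-\big(1 - e^{z}\big) \in \overline{S_{-1/2}}\setminus\{0\}, \]
\[ \Psi^{-1}(0) = -\infty, \qquad \Psi^{-1}(-\infty) = 0, \qquad \Psi^{-1}(+\infty) = +\infty, \]
so it maps $S_\pi$ conformally onto $S_{-1/2}$ and
• $(-\infty,0)$ onto $(-\infty,0)$, with $0- \mapsto -\infty$ and $-\infty \mapsto 0-$,
• $(0,+\infty)$ onto $\mathbb{R} - \tfrac{i}{2}$, with $0+ \mapsto -\infty - \tfrac{i}{2}$ and $+\infty \mapsto +\infty - \tfrac{i}{2}$,
• $\mathbb{R} + i\pi$ onto $(0,+\infty)$, with $-\infty + i\pi \mapsto 0+$, $i\pi \mapsto \frac{\log 2}{2\pi}$ and $+\infty + i\pi \mapsto +\infty$.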
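The boundary correspondences of the change of variable can be checked by hand. The sketch below evaluates $\Psi(\zeta) = \log_+(1 - e^{2\pi\zeta})$ on the three boundary pieces, assuming that $\log_+$ is the branch with argument in $(-\tfrac{\pi}{2}, \tfrac{3\pi}{2})$ and $\log_-$ the branch with argument in $(-\tfrac{3\pi}{2}, \tfrac{\pi}{2})$ (the branch limits are garbled in the source, and these are the choices under which $\Psi$ takes values in the strip $0 \le \Im z \le \pi$).

```latex
\begin{align*}
\zeta = t < 0:&\quad 1 - e^{2\pi t} \in (0,1)
   &&\Longrightarrow\quad \Psi(t) = \log\big(1 - e^{2\pi t}\big) \in (-\infty, 0),\\
\zeta = t - \tfrac{i}{2}:&\quad e^{2\pi\zeta} = -e^{2\pi t}
   &&\Longrightarrow\quad \Psi\big(t - \tfrac{i}{2}\big) = \log\big(1 + e^{2\pi t}\big) \in (0, +\infty),\\
\zeta = t > 0:&\quad 1 - e^{2\pi t} = \big(e^{2\pi t} - 1\big)\,e^{i\pi}
   &&\Longrightarrow\quad \Psi(t) = \log\big(e^{2\pi t} - 1\big) + i\pi \in \mathbb{R} + i\pi,\\
z = i\pi:&\quad 1 - e^{i\pi} = 2
   &&\Longrightarrow\quad \Psi^{-1}(i\pi) = \frac{\log 2}{2\pi}.
\end{align*}
```

So the negative half-line and the line $\mathbb{R} - \tfrac{i}{2}$, where $W$ is Hermitian, are sent to the real axis, while the positive half-line, where $J_N W J_N$ is Hermitian, is sent to $\mathbb{R} + i\pi$, which is the configuration required by the structure theorem.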
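Thus we can consider the so-continuous mapping
\[ \overline{S_\pi}\setminus\{0\} \ni z \mapsto V(z) = W\big(\Psi^{-1}(z)\big) \in B(H), \tag{7.6} \]
which is analytic in $S_\pi$ and, according to (7.2), (7.3), (7.4) and (7.5), satisfies
\[ \|V(z)\| \le 1, \qquad z \in \overline{S_\pi}\setminus\{0\}, \tag{7.7} \]
\[ V(s) \text{ is Hermitian with respect to } (\psi,\psi), \qquad s \in \mathbb{R}\setminus\{0\}, \tag{7.8} \]
\[ J_N V(s+i\pi)J_N \text{ is Hermitian with respect to } (\psi,\psi), \qquad s \in \mathbb{R}. \tag{7.9} \]

Step 4. Application of the Generalized Structure Theorem. Since the so-continuous mapping considered in (7.6) is analytic in $S_\pi$ and (7.7), (7.8), (7.9) hold, it satisfies the assumptions for $T(\cdot)$ in Theorem 2.13 with $M$ replaced by $N$, $\varphi$ replaced by $\psi$, $\beta = \pi$, $\Xi_o = \{0\}$ and $\Xi_1 = \emptyset$. By Theorem 2.13, it follows that, for some $V \in B(H)$ which is Hermitian with respect to $(\psi,\psi)$,
\[ V(s) = \Delta_N^{-i\frac{s}{2\pi}}\,V\,\Delta_N^{i\frac{s}{2\pi}}, \qquad s \in \mathbb{R}\setminus\{0\}, \tag{7.10} \]
and the mapping (7.6) has an so-continuous extension $\overline{S_\pi} \ni z \mapsto V(z) \in B(H)$ with $V(0) = V$, satisfying
\[ V(z + 2\pi t) = \Delta_N^{-it}\,V(z)\,\Delta_N^{it}, \qquad z \in \overline{S_\pi},\ t \in \mathbb{R}, \tag{7.11} \]
\[ V(s + i\pi) = J_N V(s)J_N, \qquad s \in \mathbb{R}. \tag{7.12} \]
In particular,
\[ V(0) = \text{so-}\!\!\lim_{0 \ne s \to 0} V(s) = \text{so-}\!\!\lim_{t\to-\infty} W(t) = \text{so-}\!\!\lim_{t\to-\infty} \Delta_N^{-it}\Delta_M^{it} \]
is a wave operator.

We notice an additional continuity property of $V(\cdot)$: since $\Psi^{-1}(-\infty) = 0$, the limit
\[ \text{so-}\!\!\lim_{\overline{S_\pi} \ni z \to -\infty} V(z) = \text{so-}\!\!\lim_{\overline{S_{-1/2}} \ni \zeta \to 0} W(\zeta) = W(0) = 1_H \]
exists. Thus the mapping (7.6) has actually an so-continuous extension
\[ \overline{S_\pi} \cup \{-\infty\} \ni z \mapsto V(z) \in B(H) \tag{7.13} \]
with $V(0) = \text{so-}\lim_{t\to-\infty}\Delta_N^{-it}\Delta_M^{it}$ and $V(-\infty) = 1_H$.

Step 5. Further change of variable. We recall that
\[ \overline{S_\pi} \ni z \mapsto \Theta(z) = e^{z} \in \{\zeta \in \mathbb{C};\ \Im\zeta \ge 0\}\setminus\{0\} \]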
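is a homeomorphism mapping $S_\pi$ conformally onto $\{\zeta \in \mathbb{C};\ \Im\zeta > 0\}$, which extends to a homeomorphism $\Theta : \overline{S_\pi} \cup \{-\infty,+\infty\} \to \{\zeta \in \mathbb{C};\ \Im\zeta \ge 0\} \cup \{\infty\}$ satisfying $\Theta(-\infty) = 0$ and $\Theta(+\infty) = \infty$. The inverse homeomorphism $\Theta^{-1} : \{\zeta \in \mathbb{C};\ \Im\zeta \ge 0\} \cup \{\infty\} \to \overline{S_\pi} \cup \{-\infty,+\infty\}$, which maps $\{\zeta \in \mathbb{C};\ \Im\zeta > 0\}$ conformally onto $S_\pi$, is given by
\[ \{\zeta \in \mathbb{C};\ \Im\zeta \ge 0\}\setminus\{0\} \ni \zeta \mapsto \Theta^{-1}(\zeta) = \log_+\zeta \in \overline{S_\pi}, \qquad \Theta^{-1}(0) = -\infty, \qquad \Theta^{-1}(\infty) = +\infty. \]
Since the mapping (7.13) is so-continuous, the mapping
\[ \{\zeta \in \mathbb{C};\ \Im\zeta \ge 0\} \ni \zeta \mapsto U(\zeta) = V\big(\Theta^{-1}(\zeta)\big) \in B(H) \]
is also so-continuous. Moreover, since $S_\pi \ni z \mapsto V(z)$ is analytic, the restriction of the above mapping to $\{\zeta \in \mathbb{C};\ \Im\zeta > 0\}$ is analytic. For $\zeta = 0$ and $\zeta = 1$, we have
\[ U(0) = V(-\infty) = 1_H, \qquad U(1) = V(0) = \text{so-}\!\!\lim_{t\to-\infty}\Delta_N^{-it}\Delta_M^{it}. \]
In particular, $U(0)$ and $U(1)$ are unitaries and $\Delta_N^{is}U(1) = U(1)\Delta_M^{is}$ for all $s \in \mathbb{R}$. On the other hand, for $\zeta \in \mathbb{C}$, $\Im\zeta \ge 0$, $\zeta \ne 0, 1$,
\[ U(\zeta) = V\big(\Theta^{-1}(\zeta)\big) = V(\log_+\zeta) = W\big(\Psi^{-1}(\log_+\zeta)\big) = W\Big(\frac{1}{2\pi}\log_-(1-\zeta)\Big) \]
holds. In particular, according to (7.1), the operators
\[ \{U(s);\ s \in \mathbb{R},\ s \ne 0,1\} = \{V(z);\ z \in \partial S_\pi,\ z \ne 0\} = \{W(\zeta);\ \zeta \in \partial S_{-1/2},\ \zeta \ne 0\} \]
are also unitaries. We summarize:
• $U(0) = 1_H$ and $U(s)$ is unitary for every $s \in \mathbb{R}$,
• $U(1) = \text{so-}\lim_{t\to-\infty}\Delta_N^{-it}\Delta_M^{it}$ and $\Delta_N^{is}U(1) = U(1)\Delta_M^{is}$, $s \in \mathbb{R}$, (7.14)
• $U(\zeta) = V(\log_+\zeta)$, $0 \ne \zeta \in \mathbb{C}$, $\Im\zeta \ge 0$, (7.15)
• $U(\zeta) = W\big(\tfrac{1}{2\pi}\log_-(1-\zeta)\big)$, $1 \ne \zeta \in \mathbb{C}$, $\Im\zeta \ge 0$. (7.16)

Using (7.15), we obtain from (7.7), (7.11) and (7.12)
\[ \|U(\zeta)\| \le 1, \qquad \zeta \in \mathbb{C},\ \Im\zeta \ge 0, \tag{7.17} \]
\[ U\big(e^{2\pi t}\zeta\big) = \Delta_N^{-it}\,U(\zeta)\,\Delta_N^{it}, \qquad t \in \mathbb{R},\ \zeta \in \mathbb{C},\ \Im\zeta \ge 0, \tag{7.18} \]
\[ U(-s) = J_N U(s)J_N, \qquad s \in \mathbb{R}. \tag{7.19} \]
Indeed, (7.7) implies that $\|U(\zeta)\| \le 1$ for $0 \ne \zeta \in \mathbb{C}$, $\Im\zeta \ge 0$, while the norm of $U(0) = 1_H$ is $\le 1$. Similarly, the equality in (7.18) is an immediate consequence of (7.11) for $0 \ne \zeta \in \mathbb{C}$, $\Im\zeta \ge 0$, while it is trivial for $\zeta = 0$. Finally, for any $s > 0$, (7.12) implies
\[ U(-s) = V(\log s + i\pi) = J_N V(\log s)J_N = J_N U(s)J_N, \]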
hence also $J_N U(-s)J_N = J_N^2\,U(s)\,J_N^2 = U(s)$. Therefore the equality in (7.19) holds for every $s \in \mathbb{R}$ (it is trivial for $s = 0$). Furthermore, by (7.16) and (7.1),
\[ U(2) = W\Big(\frac{1}{2\pi}\log_-(-1)\Big) = W\big(-\tfrac{i}{2}\big) = J_N W(0)J_M = J_N J_M. \tag{7.20} \]
On the other hand, (7.16) is equivalent to the equality $W(z) = U\big(1 - e^{2\pi z}\big)$, $z \in \overline{S_{-1/2}}$, which yields
\[ \Delta_N^{-it}\Delta_M^{it} = W(t) = U\big(1 - e^{2\pi t}\big), \qquad t \in \mathbb{R}. \tag{7.21} \]
Step 6. Group property of $U(\cdot)$. We now prove the group property
\[ U(s_1)U(s_2) = U(s_1 + s_2), \qquad s_1, s_2 \in \mathbb{R}. \tag{7.22} \]
Let $s > 0$ and $t \in \mathbb{R}$ be arbitrary. By (7.15), (7.10), (7.14) and (7.21), we obtain
\begin{align*}
U(s) = V(\log s) &= \Delta_N^{-i\frac{\log s}{2\pi}}\,V(0)\,\Delta_N^{i\frac{\log s}{2\pi}} = \Delta_N^{-i\frac{\log s}{2\pi}}\,U(1)\,\Delta_N^{i\frac{\log s}{2\pi}} \\
&= U(1)\,\Delta_M^{-i\frac{\log s}{2\pi}}\,\Delta_N^{i\frac{\log s}{2\pi}} = U(1)\,W\Big(\frac{\log s}{2\pi}\Big)^{*} = U(1)\,U(1-s)^*,
\end{align*}
hence $U(s)U(1-s) = U(1)$. By sandwiching this equation by $\mathrm{Ad}\,\Delta_N^{-it}$ and taking into account (7.18), we get
\[ U\big(e^{2\pi t}s\big)\,U\big(e^{2\pi t}(1-s)\big) = U\big(e^{2\pi t}\big). \tag{7.23} \]
Next let $r_1, r_2 \in \mathbb{R}$ be such that $r_1 > 0$ and $r_1 + r_2 > 0$. Then $s = \frac{r_1}{r_1+r_2} > 0$ and, with $t = \frac{1}{2\pi}\log(r_1+r_2) \in \mathbb{R}$, we have $e^{2\pi t}s = r_1$, $e^{2\pi t}(1-s) = r_2$. Thus (7.23) yields
\[ U(r_1)U(r_2) = U(r_1 + r_2). \tag{7.24} \]
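The substitution that turns (7.23) into (7.24) is elementary arithmetic; it is written out below for convenience, with $s$ and $t$ chosen as in the text.

```latex
\begin{align*}
s = \frac{r_1}{r_1 + r_2}, \quad t = \frac{1}{2\pi}\log(r_1 + r_2)
  \quad&\Longrightarrow\quad e^{2\pi t} = r_1 + r_2,\\
e^{2\pi t}\,s &= (r_1 + r_2)\cdot\frac{r_1}{r_1 + r_2} = r_1,\\
e^{2\pi t}\,(1 - s) &= (r_1 + r_2) - r_1 = r_2 .
\end{align*}
```

The hypotheses $r_1 > 0$ and $r_1 + r_2 > 0$ are exactly what is needed for $s > 0$ and for $t$ to be a real number, which are the conditions under which (7.23) was derived.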
Finally, let $s_1, s_2 \in \mathbb{R}$ be arbitrary and choose $s \in \mathbb{R}$ such that $s > 0$, $s + s_1 > 0$ and $s + s_1 + s_2 > 0$. Then, using (7.24) with $(r_1,r_2)$ equal to $(s,s_1)$, $(s+s_1,s_2)$ and $(s,s_1+s_2)$, respectively, we obtain
\[ U(s)U(s_1)U(s_2) = U(s+s_1)U(s_2) = U(s+s_1+s_2) = U(s)U(s_1+s_2). \]
Since $U(s)$ is unitary, the above equality implies that $U(s_1)U(s_2) = U(s_1+s_2)$. Therefore (7.22) is proved. In particular,
\[ U(s)U(-s) = U(0) = 1_H, \quad\text{that is}\quad U(s)^* = U(-s), \qquad s \in \mathbb{R}. \tag{7.25} \]
Thus $\mathbb{R} \ni s \mapsto U(s) \in B(H)$ is an so-continuous one-parameter group of unitaries, which allows an so-continuous extension $\{\zeta \in \mathbb{C};\ \Im\zeta \ge 0\} \ni \zeta \mapsto U(\zeta)$, analytic in $\{\zeta \in \mathbb{C};\ \Im\zeta > 0\}$ and satisfying (7.17). Consequently
\[ U(s) = \exp(isP), \qquad s \in \mathbb{R}, \tag{7.26} \]
for some positive self-adjoint operator $P$ in $H$.
Step 7. Further properties of $U(\cdot)$. Here we show that the above constructed group $\mathbb{R} \ni s \mapsto U(s) \in B(H)$ satisfies properties (1)–(7) in Theorem 2.1. By (7.18), (7.21), (7.22) and (7.25), we obtain for all $s, t \in \mathbb{R}$
\begin{align*}
\Delta_M^{-it}\,U(s)\,\Delta_M^{it} &= \Delta_M^{-it}\Delta_N^{it}\,\big(\Delta_N^{-it}\,U(s)\,\Delta_N^{it}\big)\,\Delta_N^{-it}\Delta_M^{it} \\
&= U\big(1 - e^{2\pi t}\big)^{*}\,U\big(e^{2\pi t}s\big)\,U\big(1 - e^{2\pi t}\big) = U\big(e^{2\pi t}s\big). \tag{7.27}
\end{align*}
This equality and (7.18) show that property (1) in Theorem 2.1 is satisfied. Similarly, (7.19), (7.20), (7.22) and (7.25) yield for every $s \in \mathbb{R}$
\[ J_M U(s)J_M = J_M J_N\,\big(J_N U(s)J_N\big)\,J_N J_M = U(2)^*\,U(-s)\,U(2) = U(-s). \tag{7.28} \]
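The last equalities in (7.27) and (7.28) both use the same cancellation, which follows from the group law (7.22) together with (7.25). A short sketch, with $a$ and $b$ standing for arbitrary real parameters (they are placeholders introduced here, not notation from the text):

```latex
\begin{align*}
U(a)^*\,U(b)\,U(a) &= U(-a)\,U(b)\,U(a) = U(-a + b + a) = U(b),
  \qquad a, b \in \mathbb{R},\\
\text{(7.27):}\quad & a = 1 - e^{2\pi t},\quad b = e^{2\pi t}s,\\
\text{(7.28):}\quad & a = 2,\quad b = -s .
\end{align*}
```

In other words, once $U(\cdot)$ is known to be a one-parameter group it is commutative, so conjugating $U(b)$ by any $U(a)$ leaves it unchanged.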
Now property (2) in Theorem 2.1 is (7.19) together with (7.28). The validity of property (3) in Theorem 2.1 follows from (7.21) and (7.14). The first equality in property (4) in Theorem 2.1 is (7.20), while the second one follows from (7.25), (7.28), (7.22) and (7.20):
\[ U(1)J_M U(1)^* = U(1)J_M U(-1) = U(1)^2 J_M = U(2)J_M = J_N. \tag{7.29} \]
Next we prove property (5). Since
\[ \Delta_M^{-it}\Delta_N^{it}\,N\,\Delta_N^{-it}\Delta_M^{it} = \Delta_M^{-it}\,N\,\Delta_M^{it} \subset \Delta_M^{-it}\,M\,\Delta_M^{it} = M, \qquad t \in \mathbb{R}, \]
(7.14) implies
\[ U(1)^*\,N\,U(1) \subset M. \tag{7.30} \]
On the other hand, by sandwiching this relation by $\mathrm{Ad}\,J_M$ and using (7.29), we obtain
\[ M' = J_M M J_M \supset J_M U(1)^*\,N\,U(1)J_M = U(1)^*J_N\,N\,J_N U(1) = U(1)^*\,N'\,U(1). \]
Passing to the commutants, this inclusion relation yields $M \subset U(1)^*\,N\,U(1)$, which together with (7.30) gives $U(1)^*\,N\,U(1) = M$, that is
\[ N = U(1)\,M\,U(1)^*. \tag{7.31} \]
For property (6) in Theorem 2.1 we notice that (7.1) and the negative half-sided modular inclusion assumption (2.19) imply that $W(t)\,N\,W(t)^* \subset N$ for all $t \le 0$, which is by (7.21) equivalent to
\[ U(s)\,N\,U(s)^* \subset N, \qquad 0 \le s \le 1. \]
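Using (7.31), (7.22) and (7.25), we get for every $0 \le s \le 1$
\[ U(s)\,M\,U(s)^* = U(s)U(1)^*\,N\,U(1)U(s)^* = U(1)^*U(s)\,N\,U(s)^*U(1) \subset U(1)^*\,N\,U(1) = M. \]
Using induction on $n$, it follows that
\[ U(s)\,M\,U(s)^* \subset M, \qquad 0 \le s \le n, \]
holds for every integer $n \ge 1$, that is, $U(s)\,M\,U(s)^* \subset M$ for all $s \ge 0$. (7) is an immediate consequence of (6), (5) and (4).

Step 8. Invariance properties of $U(\cdot)$ with respect to $\varphi$. We show in the following that $\mathbb{R} \ni s \mapsto U(s) \in B(H)$ satisfies property (8) in Theorem 2.1, hence also property (9), which is an immediate consequence of (8). For any $y \in \mathfrak{N}_\psi$ and $t \in \mathbb{R}$, (2.3) yields
\[ \Delta_N^{it}\,y\,\Delta_N^{-it} \in \mathfrak{N}_\psi \subset \mathfrak{N}_\varphi \quad\text{and}\quad \big(\Delta_N^{it}\,y\,\Delta_N^{-it}\big)_\varphi = \big(\Delta_N^{it}\,y\,\Delta_N^{-it}\big)_\psi = \Delta_N^{it}\,y_\psi. \]
Setting $s = 1 - e^{2\pi t}$ and using (7.21), we obtain
\[ U(s)^*\,y\,U(s) = \Delta_M^{-it}\Delta_N^{it}\,y\,\Delta_N^{-it}\Delta_M^{it} \in \Delta_M^{-it}\,\mathfrak{N}_\psi\,\Delta_M^{it} \subset \Delta_M^{-it}\,\mathfrak{N}_\varphi\,\Delta_M^{it} \subset \mathfrak{N}_\varphi, \]
\[ \big(U(s)^*\,y\,U(s)\big)_\varphi = \Delta_M^{-it}\big(\Delta_N^{it}\,y\,\Delta_N^{-it}\big)_\varphi = \Delta_M^{-it}\big(\Delta_N^{it}\,y\,\Delta_N^{-it}\big)_\psi = \Delta_M^{-it}\Delta_N^{it}\,y_\psi = U(s)^*\,y_\psi. \]
Therefore we have for all $s \in \mathbb{R}$, $s < 1$,
\[ U(s)^*\,y\,U(s) \in \mathfrak{N}_\varphi \quad\text{and}\quad \big(U(s)^*\,y\,U(s)\big)_\varphi = U(s)^*\,y_\psi. \tag{7.32} \]
Moreover, according to the Lebesgue continuity result Proposition 2.7, (7.32) holds also for $s = 1$.

According to (7.31), $\gamma_1 : M \ni x \mapsto U(1)\,x\,U(1)^* \in N$ is a ∗-isomorphism with inverse $\gamma_1^{-1} : N \ni y \mapsto U(1)^*\,y\,U(1) \in M$. The modular automorphism groups of the normal semi-finite faithful weights $\psi$ and $\varphi\circ\gamma_1^{-1}$ are equal. Indeed, by (7.14) we have for every $y \in N$ and $t \in \mathbb{R}$,
\[ \sigma_t^{\varphi\circ\gamma_1^{-1}}(y) = \gamma_1\big(\sigma_t^\varphi(\gamma_1^{-1}(y))\big) = U(1)\Delta_M^{it}U(1)^*\,y\,U(1)\Delta_M^{-it}U(1)^* = \Delta_N^{it}\,y\,\Delta_N^{-it} = \sigma_t^\psi(y). \]
On the other hand, for every $y \in \mathfrak{N}_\psi$, using (7.32) with $s = 1$, we obtain
\[ \psi(y^*y) = \|y_\psi\|^2 = \|U(1)^*\,y_\psi\|^2 = \|\gamma_1^{-1}(y)_\varphi\|^2 = \varphi\circ\gamma_1^{-1}(y^*y), \]
so $\varphi\circ\gamma_1^{-1}$ and $\psi$ coincide on $\mathfrak{M}_\psi$. Thus [28, Proposition 5.9] yields $\psi = \varphi\circ\gamma_1^{-1}$, that is
\[ \varphi\circ\gamma_1 = \psi\circ\gamma_1 = \varphi. \tag{7.33} \]
In particular, for every $x \in M$ and $n \ge 0$,
\[ x \in \mathfrak{N}_\varphi \iff \gamma_n(x) \in \mathfrak{N}_\varphi. \tag{7.34} \]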
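Let $x \in \mathfrak{N}_\varphi$ be arbitrary. By (7.34) and (7.31) we have $\gamma_1(x) \in \mathfrak{N}_\psi$, so (7.32) holds with $y = \gamma_1(x)$ and any $0 \le s \le 1$. Using the group property of $U(\cdot)$, we deduce that, for every $0 \le s \le 1$,
\[ \gamma_s(x) = U(s)\,x\,U(s)^* = U(1-s)^*\,\gamma_1(x)\,U(1-s) \in \mathfrak{N}_\varphi \quad\text{and}\quad \gamma_s(x)_\varphi = U(1-s)^*\,\gamma_1(x)_\psi. \tag{7.35} \]
In particular, for $s = 0$ we get $x_\varphi = U(1)^*\,\gamma_1(x)_\psi$, that is $\gamma_1(x)_\psi = U(1)\,x_\varphi$. Thus (7.35) yields
\[ \gamma_s(x) \in \mathfrak{N}_\varphi \quad\text{and}\quad \gamma_s(x)_\varphi = U(s)\,x_\varphi, \qquad 0 \le s \le 1. \tag{7.36} \]
Iterating (7.36), we obtain
\[ x \in \mathfrak{N}_\varphi,\ s \ge 0 \ \Longrightarrow\ \gamma_s(x) \in \mathfrak{N}_\varphi \ \text{ and } \ \gamma_s(x)_\varphi = U(s)\,x_\varphi. \tag{7.37} \]
On the other hand, for $x \in M$ and $s \ge 0$, denoting by $n$ the integer part of $s$, that is, the integer $n \ge 0$ with $n \le s < n+1$, we have
\[ \gamma_s(x) \in \mathfrak{N}_\varphi \ \overset{(7.37)}{\Longrightarrow}\ \gamma_{n+1}(x) = \gamma_{n+1-s}\big(\gamma_s(x)\big) \in \mathfrak{N}_\varphi \ \overset{(7.34)}{\Longrightarrow}\ x \in \mathfrak{N}_\varphi. \]
Consequently property (8) in Theorem 2.1 is satisfied.

Step 9. Description of the generator $P$. First we verify that statement (10) in Theorem 2.1 holds with $P$ defined by (7.26). We recall from Subsection (b) of Sec. 2 that, if we endow $\mathbb{R}^2$ with the Lie group structure defined by the composition law
\[ (s_1,t_1)\cdot(s_2,t_2) = \big(s_1 + e^{-2\pi t_1}s_2,\ t_1 + t_2\big), \]
then $\mathbb{R}^2 \ni (s,t) \mapsto T_sL_t \in P_+^\uparrow(1)$ is a Lie group isomorphism. Hence, by (7.27),
\[ \pi : P_+^\uparrow(1) \ni T_sL_t \mapsto U(s)\Delta_M^{it} \]
is an so-continuous unitary representation on $H$ and, according to (7.21), the group $\pi\big(P_+^\uparrow(1)\big)$ contains $\{\Delta_M^{it},\ \Delta_N^{is};\ t, s \in \mathbb{R}\}$ and is generated by this set.

Let us consider the elements $X_1, X_2, X_3$ of the Lie algebra $\mathfrak{p}_+^\uparrow(1) \equiv \mathfrak{g}$ defined in (2.13). By (2.15) and by the definition of $\pi$, taking into account how $P_+^\uparrow(1)$ was identified in Subsection (b) of Sec. 2 with $G$, we obtain for every $t \in \mathbb{R}$:
\begin{align*}
\pi\big(\exp(tX_1)\big) &= \pi(L_t) = \Delta_M^{it} = \exp(it\log\Delta_M), \\
\pi\big(\exp(tX_2)\big) &= \pi\big(T_{1-e^{-2\pi t}}L_t\big) = U\big(1 - e^{-2\pi t}\big)\Delta_M^{it} \overset{(7.21)}{=} \Delta_N^{it} = \exp(it\log\Delta_N), \\
\pi\big(\exp(tX_3)\big) &= \pi(T_t) = U(t) \overset{(7.26)}{=} \exp(itP).
\end{align*}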
Therefore, according to (2.16),
\[ d\pi(X_1) = i\log\Delta_M, \qquad d\pi(X_2) = i\log\Delta_N, \qquad d\pi(X_3) = iP. \tag{7.38} \]
Since any two of $X_1, X_2, X_3$ form a basis for $\mathfrak{p}_+^\uparrow(1)$ and $P_+^\uparrow(1)$ is connected and simply connected, the representation $\pi$ is uniquely determined by any two of the relations (7.38) (see, for example, [2, Chap. 11, Sec. 5] or [30, Proposition 10.5.2]).

Now, by (2.14), (2.18) and (7.38), we conclude that
\[ iP = d\pi(X_3) = \frac{1}{2\pi}\,d\pi(X_2 - X_1) = \frac{1}{2\pi}\,\overline{d\pi(X_2) - d\pi(X_1)} = \frac{i}{2\pi}\,\overline{\log\Delta_N - \log\Delta_M}, \]
hence $P$ is the closure of $\frac{1}{2\pi}\big(\log\Delta_N - \log\Delta_M\big)$.
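The formula for $P$ can be made plausible by a purely formal computation, which is only a heuristic sketch: it ignores all domain questions for the unbounded generators, and these are precisely what the Lie-algebra argument in the text takes care of. It uses the commutation relation $\Delta_M^{it}\,U(s)\,\Delta_M^{-it} = U(e^{-2\pi t}s)$, which is (7.27) with $t$ replaced by $-t$.

```latex
\begin{align*}
\pi(T_{s_1}L_{t_1})\,\pi(T_{s_2}L_{t_2})
 &= U(s_1)\,\Delta_M^{it_1}\,U(s_2)\,\Delta_M^{it_2}
  = U(s_1)\,U\big(e^{-2\pi t_1}s_2\big)\,\Delta_M^{i(t_1+t_2)} \\
 &= U\big(s_1 + e^{-2\pi t_1}s_2\big)\,\Delta_M^{i(t_1+t_2)},\\[4pt]
\frac{d}{dt}\Big|_{t=0}\, U\big(1-e^{-2\pi t}\big)\,\Delta_M^{it}
 &= 2\pi\, iP + i\log\Delta_M ,
\qquad
\frac{d}{dt}\Big|_{t=0}\, \Delta_N^{it} = i\log\Delta_N .
\end{align*}
```

The first two lines confirm that $\pi$ respects the composition law $(s_1,t_1)\cdot(s_2,t_2) = (s_1 + e^{-2\pi t_1}s_2,\, t_1+t_2)$. Equating the two derivatives in the last line, which is legitimate only formally, gives $2\pi P = \log\Delta_N - \log\Delta_M$, in agreement with the rigorous statement that $P$ is the closure of $\frac{1}{2\pi}(\log\Delta_N - \log\Delta_M)$.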
8. Complements to the Implementation Theorem of Borchers and the Proof of Theorem 2.2

First we prove Theorem 2.14, which will then be used to prove Theorem 2.2.

Proof of Theorem 2.14. Step 1. The existence and the uniqueness of $b$ (it is essentially the proof of [1, Theorem 3.1] and [44, Corollary 5.7]). By the lower boundedness of $P$ we have $d_o = \exp(-P) \in B(H)$. Moreover, $d_o$ is clearly positive and injective. Denoting $\beta_s = \alpha_{-s}$, $(\beta_s)_{s\in\mathbb{R}}$ is an so-continuous group of ∗-automorphisms of $M$ such that
\[ \beta_s(x) = d_o^{is}\,x\,d_o^{-is}, \qquad s \in \mathbb{R},\ x \in M. \]
We recall that, according to [44, Theorem 1.4], we have for every $\lambda \in \mathbb{R}$:
\begin{align*}
M^\alpha\big([-\lambda,+\infty)\big) = M^\beta\big((-\infty,\lambda]\big)
&= \Big\{x \in \bigcap_{\substack{z\in\mathbb{C} \\ \Im z \ge 0}} \mathrm{Dom}(\alpha_z);\ \|\alpha_z(x)\| \le e^{\lambda\Im z}\|x\| \text{ for all } z \in \mathbb{C},\ \Im z \ge 0\Big\} \\
&= \Big\{x \in \bigcap_{\substack{z\in\mathbb{C} \\ \Im z \le 0}} \mathrm{Dom}(\beta_z);\ \|\beta_z(x)\| \le e^{-\lambda\Im z}\|x\| \text{ for all } z \in \mathbb{C},\ \Im z \le 0\Big\}. \tag{8.1}
\end{align*}
Denoting
\[ H_\lambda = \text{the closed linear span of } M^\alpha\big([-\lambda,+\infty)\big)H = \text{the closed linear span of } M^\beta\big((-\infty,\lambda]\big)H, \]
we have clearly $H_\lambda \supset M^\beta\big((-\infty,0]\big)H \supset 1_H H = H$, hence $H_\lambda = H$, for all $\lambda \ge 0$. In particular, $H$ is an invariant subspace of support $1_H$ relative to $\beta$, as defined in [44, Sec. 5]. Moreover, since the spectral subspace of $d_o$ corresponding to $(0, \|d_o\|]$ is $H$, the second statement in [44, Theorem 5.3] implies that $H$ is simply invariant,
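that is $\bigcap_{\lambda\in\mathbb{R}} H_\lambda = \{0\}$. Furthermore, $M'H_\lambda \subset H_\lambda$ implies that the orthogonal projection $p_\lambda$ onto $H_\lambda$ belongs to $M$. Using now the first statement in [44, Theorem 5.3], it follows that there exists an injective $b \in B(H)$, $0 \le b \le 1_H$, such that
\[ \beta_s(x) = b^{is}\,x\,b^{-is}, \qquad s \in \mathbb{R},\ x \in M, \tag{8.2} \]
and, for every $\lambda \in \mathbb{R}$, the spectral projection $\chi_{(0,e^\lambda]}(b)$ is the orthogonal projection onto $H_{\lambda+0} = \bigcap_{\mu>\lambda} H_\mu$, hence it is equal to $p_{\lambda+0} = \text{so-}\lim_{\lambda<\mu\to\lambda} p_\mu \in M$. In particular, $b \in M$.

Property (i) in the statement of Theorem 2.14 holds by (8.2). In order to verify property (ii), let $d$ be an arbitrary injective operator in $M$ such that $0 \le d \le 1_H$ and $\alpha_s(x) = d^{-is}\,x\,d^{is}$, $s \in \mathbb{R}$, $x \in M$, that is $\beta_s(x) = d^{is}\,x\,d^{-is}$, $s \in \mathbb{R}$, $x \in M$. Since the spectral subspace of the unitary group $(d^{is})_{s\in\mathbb{R}}$ corresponding to $(-\infty,\lambda]$ is $\chi_{(0,e^\lambda]}(d)H$, [44, Corollary 2.6] yields
\[ M^\beta\big((-\infty,\lambda]\big)H = M^\beta\big((-\infty,\lambda]\big)\underbrace{\chi_{(0,e^0]}(d)}_{=\,1_H}H \subset \chi_{(0,e^\lambda]}(d)H, \qquad \lambda \in \mathbb{R}. \]
Consequently, $\chi_{(0,e^\lambda]}(b) = p_{\lambda+0} \le \chi_{(0,e^\lambda]}(d)$ for all $\lambda \in \mathbb{R}$. The uniqueness of $b$ is an immediate consequence of (ii).

Step 2. Proof of (iii) and (iv). For (iii), it is clear from the construction of $b$ in Step 1. In order to verify (iv), let $\sigma$ be a ∗-automorphism of $M$ such that, for some $\lambda_\sigma > 0$,
\[ \sigma\circ\alpha_s = \alpha_{\lambda_\sigma s}\circ\sigma, \qquad s \in \mathbb{R}. \]
Then it holds clearly
\[ \sigma\circ\alpha_z = \alpha_{\lambda_\sigma z}\circ\sigma, \qquad z \in \mathbb{C}. \tag{8.3} \]
There exists a faithful unital normal ∗-representation $\pi : M \to B(K)$, which is covariant with respect to $\sigma$, that is $\pi\big(\sigma(x)\big) = U\pi(x)U^*$, $x \in M$, where $U$ is an appropriate unitary on $K$: for example, we can choose $K = l^2(\mathbb{Z};H)$, the space of all square-summable two-sided sequences in $H$,
\[ \pi(x)(\xi_k)_{k\in\mathbb{Z}} = \big(\sigma^k(x)\xi_k\big)_{k\in\mathbb{Z}} \ \text{ for } x \in M,\ (\xi_k)_{k\in\mathbb{Z}} \in l^2(\mathbb{Z};H), \qquad U(\xi_k)_{k\in\mathbb{Z}} = (\xi_{k+1})_{k\in\mathbb{Z}} \ \text{ for } (\xi_k)_{k\in\mathbb{Z}} \in l^2(\mathbb{Z};H). \]
Then $\big(\pi\circ\alpha_s\circ\pi^{-1}\big)_{s\in\mathbb{R}}$ is an so-continuous one-parameter group of ∗-automorphisms of the von Neumann algebra $\pi(M) \subset B(K)$, $\pi(b)$ is an injective element of $\pi(M)$ with $0 \le \pi(b) \le 1_K$ and
\[ \big(\pi\circ\alpha_s\circ\pi^{-1}\big)\big(\pi(x)\big) = \pi(b)^{-is}\,\pi(x)\,\pi(b)^{is}, \qquad s \in \mathbb{R},\ x \in M. \]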
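Moreover, by the definition of $b$, for any injective $\pi(d) \in \pi(M)$, $0 \le \pi(d) \le 1_K$, such that
\[ \big(\pi\circ\alpha_s\circ\pi^{-1}\big)\big(\pi(x)\big) = \pi(d)^{-is}\,\pi(x)\,\pi(d)^{is}, \qquad s \in \mathbb{R},\ x \in M, \]
we have
\[ \chi_{(0,e^\lambda]}\big(\pi(b)\big) \le \chi_{(0,e^\lambda]}\big(\pi(d)\big), \qquad \lambda \in \mathbb{R}. \]
Applying the above proved (iii) to $\pi(M)$, $\big(\pi\circ\alpha_s\circ\pi^{-1}\big)_{s\in\mathbb{R}}$, $\pi(b)$ instead of $M$, $\alpha$, $b$, we obtain that, for every $\lambda \in \mathbb{R}$, $\pi\big(\chi_{(0,e^\lambda]}(b)\big) = \chi_{(0,e^\lambda]}\big(\pi(b)\big)$ is the orthogonal projection onto
\[ \bigcap_{\mu>\lambda} \text{the closed linear span of } \pi(M)^{\pi\circ\alpha\circ\pi^{-1}}\big([-\mu,+\infty)\big)K = \bigcap_{\mu>\lambda} \text{the closed linear span of } \pi\big(M^\alpha([-\mu,+\infty))\big)K. \]
For every $\lambda \in \mathbb{R}$ and $x \in M$, by (8.1) and by (8.3), the following four conditions are equivalent:
• $x \in M^\alpha\big([-\lambda,+\infty)\big)$,
• $x \in \bigcap_{z\in\mathbb{C},\,\Im z \ge 0} \mathrm{Dom}(\alpha_z)$ and $\|\sigma(\alpha_z(x))\| = \|\alpha_z(x)\| \le e^{\lambda\Im z}\|x\|$ for all $z \in \mathbb{C}$, $\Im z \ge 0$,
• $\sigma(x) \in \bigcap_{z\in\mathbb{C},\,\Im z \ge 0} \mathrm{Dom}(\alpha_z)$ and $\|\alpha_{\lambda_\sigma z}(\sigma(x))\| \le e^{\lambda\Im z}\|x\|$ for all $z \in \mathbb{C}$, $\Im z \ge 0$,
• $\sigma(x) \in M^\alpha\big([-\lambda_\sigma^{-1}\lambda,+\infty)\big)$.
Therefore
\[ \sigma\big(M^\alpha([-\lambda,+\infty))\big) = M^\alpha\big([-\lambda_\sigma^{-1}\lambda,+\infty)\big), \qquad \lambda \in \mathbb{R}. \tag{8.4} \]
Let next $\lambda \in \mathbb{R}$ be arbitrary. By the covariance property of $\pi$ and by (8.4), we have for every $\mu > \lambda$:
\[ U\pi\big(M^\alpha([-\mu,+\infty))\big)K = U\pi\big(M^\alpha([-\mu,+\infty))\big)U^*K = \pi\big(\sigma(M^\alpha([-\mu,+\infty)))\big)K = \pi\big(M^\alpha([-\lambda_\sigma^{-1}\mu,+\infty))\big)K. \]
Consequently $U\pi\big(\chi_{(0,e^\lambda]}(b)\big)K = \pi\big(\chi_{(0,e^{\lambda_\sigma^{-1}\lambda}]}(b)\big)K$, and so
\[ \pi\big(\sigma(\chi_{(0,e^\lambda]}(b))\big) = U\pi\big(\chi_{(0,e^\lambda]}(b)\big)U^* = \pi\big(\chi_{(0,e^{\lambda_\sigma^{-1}\lambda}]}(b)\big), \]
\[ \chi_{(0,e^\lambda]}\big(\sigma(b)\big) = \sigma\big(\chi_{(0,e^\lambda]}(b)\big) = \chi_{(0,e^{\lambda_\sigma^{-1}\lambda}]}(b) = \chi_{(0,e^\lambda]}\big(b^{\lambda_\sigma}\big). \tag{8.5} \]
Now, by (8.5) we conclude that $\sigma(b) = b^{\lambda_\sigma}$.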
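The last equality in (8.5) is an instance of the functional calculus for the positive injective operator $b$. A sketch of the underlying scalar statement, with $\beta_0$ denoting a point of the spectrum of $b$ (a placeholder symbol introduced here):

```latex
\begin{align*}
0 < \beta_0 \le e^{\lambda/\lambda_\sigma}
  \quad&\Longleftrightarrow\quad 0 < \beta_0^{\lambda_\sigma} \le e^{\lambda}
  \qquad (\lambda_\sigma > 0),\\
\text{hence}\qquad
\chi_{(0,\,e^{\lambda/\lambda_\sigma}]}(b) &= \chi_{(0,\,e^{\lambda}]}\big(b^{\lambda_\sigma}\big),
  \qquad \lambda \in \mathbb{R}.
\end{align*}
```

Since the spectral projections $\chi_{(0,e^\lambda]}(\cdot)$, $\lambda \in \mathbb{R}$, determine a positive injective operator uniquely, the equality of these projections for $\sigma(b)$ and $b^{\lambda_\sigma}$ for all $\lambda$ is what gives $\sigma(b) = b^{\lambda_\sigma}$.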
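Proof of Theorem 2.2. Proof of (1). Let us consider $\gamma_s = \mathrm{Ad}\,U(s)$ for all $s \in \mathbb{R}$ (not only for $s \ge 0$) and let $z$ be an arbitrary self-adjoint element of the center $Z(M)$ of $M$. By (6) in Theorem 2.1 we have $\gamma_s(z) \in M$ for all $s \ge 0$, hence $z$ and $\gamma_s(z)$ commute for any $s \ge 0$. But then the elements of the set $\{\gamma_s(z);\ s \in \mathbb{R}\}$ are mutually commuting, so the von Neumann algebra $C$ generated by this set is commutative. Since $\gamma_s = \mathrm{Ad}\,U(s)\,|\,M$ leaves $C$ invariant for every $s \in \mathbb{R}$ and $U(s) = \exp(isP)$, $s \in \mathbb{R}$, for some positive self-adjoint operator $P$ in $H$, according to the implementation theorem of Borchers [5] (see also Theorem 2.14) there exists an element $b \in C$, $0 \le b \le 1_H$, such that
\[ \gamma_s(x) = b^{-is}\,x\,b^{is} = x, \qquad x \in C,\ s \in \mathbb{R}. \]
Consequently, $\gamma_s(z) = z$ for all $s \in \mathbb{R}$.

Proof of (2). By (5) in Theorem 2.1 and by the above proved (1), $Z(N) = Z\big(\gamma_1(M)\big) = Z(M)$. Now it is easy to see that the projection family $\{q \in Z(M);\ Mq = Nq\}$ is upward directed and its lowest upper bound is the greatest projection $p \in Z(M)$ satisfying $Mp = Np$.

The implication $e \le p \Rightarrow U(s)e = e$ holds for any projection $e \in M$, because
\[ Mp = Np \ \Longrightarrow\ \varphi_p = \psi_p \ \overset{(2.10)}{\Longrightarrow}\ \Delta_M^{it}p = \Delta_N^{it}p,\ t \in \mathbb{R} \ \Longrightarrow\ U(s)p = p,\ s \in \mathbb{R}. \]
Now let $e \in M$ be an arbitrary projection such that
\[ U(s)e = e, \qquad s \in \mathbb{R}. \tag{8.6} \]
For every $a \in \mathfrak{A}_\varphi$ and $x, y \in \mathfrak{N}_\varphi$, using (8) in Theorem 2.1, we deduce that
\begin{align*}
\big(U(s)\,a\,e\,J_M x_\varphi \,\big|\, J_M y_\varphi\big)
&\overset{(8.6)}{=} \big(\gamma_s(a)\,e\,J_M x_\varphi \,\big|\, J_M y_\varphi\big)
 = \big(e\,J_M x_\varphi \,\big|\, \gamma_s(a^*)\,J_M y_\varphi\big) \\
&\overset{(2.5)}{=} \big(e\,J_M x_\varphi \,\big|\, J_M\,y\,J_M\,\gamma_s(a^*)_\varphi\big)
 = \big(J_M\,y^*J_M\,e\,J_M x_\varphi \,\big|\, U(s)(a^*)_\varphi\big) \\
&= \big(e\,J_M\,y^*x_\varphi \,\big|\, U(s)(a^*)_\varphi\big)
 \overset{(8.6)}{=} \big(e\,J_M\,y^*x_\varphi \,\big|\, (a^*)_\varphi\big)
\end{align*}
does not depend on $s \ge 0$, so $\big((1_H - U(s))\,a\,e\,J_M x_\varphi \,\big|\, J_M y_\varphi\big) = 0$ for all $s \ge 0$. By $\overline{\{J_M x_\varphi;\ x \in \mathfrak{N}_\varphi\}} = H$ and $\overline{\mathfrak{A}_\varphi}^{\,so} = M$, we get
\[ \big(1_H - U(s)\big)\,M\,e\,H = \{0\}, \qquad s \in \mathbb{R}. \]
Since the orthogonal projection onto the closed linear span of $MeH$ is the central support $z(e) \in Z(M)$ of $e$, we obtain that $\big(1_H - U(s)\big)z(e) = 0$ for all $s \ge 0$, hence
\[ U(s)z(e) = z(e), \qquad s \in \mathbb{R}. \]
Consequently, $Nz(e) = U(1)\,M\,U(1)^*z(e) = Mz(e)$ and so $e \le z(e) \le p$.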
Finally, let $e \in M^\varphi$ be a projection such that
\[ U(s)e = eU(s), \qquad s \in \mathbb{R}, \tag{8.7} \]
\[ U(s)\,e\,J_M e J_M = e\,J_M e J_M, \qquad s \in \mathbb{R}. \tag{8.8} \]
If $\pi = \pi_{eJ_\varphi eJ_\varphi} : eMe \to B(eJ_\varphi eJ_\varphi H)$ is the ∗-representation defined in (2.8), then, for every $s \ge 0$, (8.7) and (8.8) yield that $\gamma_s(eMe) \subset eMe$ and
\[ \pi\big(\gamma_s(a)\big) = U(s)\,a\,U(-s)\,|\,eJ_M eJ_M H = a\,|\,eJ_M eJ_M H = \pi(a), \qquad a \in eMe. \]
Since $\pi$ is faithful, we obtain that $U(s)\,a\,U(-s) = \gamma_s(a) = a$ for all $s \ge 0$ and all $a \in eMe$. In other words, every $U(s)$ commutes with every operator in $eMe$. Consequently, for every $s \in \mathbb{R}$, the unitary $U(s)\,|\,eH : eH \to eH$ belongs to the commutant of the reduced von Neumann algebra $\{x\,|\,eH : eH \to eH;\ x \in eMe\}$, hence to the induced von Neumann algebra $\{x'\,|\,eH : eH \to eH;\ x' \in M'\}$. Since the kernel of the induction ∗-homomorphism $M' \ni x' \mapsto x'\,|\,eH$ is $M'\big(1_H - z(e)\big)$, where $z(e) \in Z(M)$ stands for the central support of $e$, there exists a one-parameter group $(u'_s)_{s\in\mathbb{R}}$ of unitaries in $M'z(e)$ such that
\[ U(s)\,|\,eH = u'_s\,|\,eH, \quad\text{that is}\quad U(s)e = u'_s e, \qquad s \in \mathbb{R}. \]
Setting $u_s = J_M u'_{-s}J_M$, $(u_s)_{s\in\mathbb{R}}$ is a one-parameter group of unitaries in $Mz(e)$ such that
\[ U(s)\,J_M eJ_M = J_M U(-s)\,e\,J_M = J_M u'_{-s}\,e\,J_M = u_s\,J_M eJ_M, \qquad s \in \mathbb{R}. \]
Therefore we have, for every $s \ge 0$ and $a \in Mz(e)$,
\[ \gamma_s(a)\,J_M eJ_M = J_M eJ_M\,U(s)\,a\,U(-s)\,J_M eJ_M = J_M eJ_M\,u_s\,a\,u_{-s}\,J_M eJ_M = (u_s\,a\,u_{-s})\,J_M eJ_M. \]
Since the kernel of the induction ∗-homomorphism $M \ni x \mapsto x\,|\,J_M eJ_M H$ is equal to $M\big(1_H - z(J_M eJ_M)\big) = M\big(1_H - z(e)\big)$, we obtain that
\[ \gamma_s(a) = u_s\,a\,u_{-s}, \qquad s \ge 0,\ a \in Mz(e). \]
In particular, $Nz(e) = \gamma_1\big(Mz(e)\big) = u_1\,Mz(e)\,u_{-1} = Mz(e)$, and so $e \le z(e) \le p$.

Proof of (3). Let us assume that
\[ M_{-\infty} = \bigcap_{s\ge 0}\gamma_s(M) \ \text{ contains } \ M^\varphi. \]
$\mathrm{Ad}\,U(s)$ leaves $M_{-\infty}$ invariant for every $s \in \mathbb{R}$, defining thus an so-continuous one-parameter group $(\alpha_s)_{s\in\mathbb{R}}$ of ∗-automorphisms of $M_{-\infty}$. Using Theorem 2.14,
we get an injective $b \in M_{-\infty}$, $0 \le b \le 1_H$, such that
$$\alpha_s(x) = b^{-is}\,x\,b^{is}, \qquad s \in \mathbb{R},\ x \in M_{-\infty} \tag{8.9}$$
and
$$\sigma \text{ ∗-automorphism of } M_{-\infty} \text{ and } \lambda_\sigma > 0,\ \ \sigma \circ \alpha_s = \alpha_{\lambda_\sigma s} \circ \sigma \text{ for all } s \in \mathbb{R} \;\Rightarrow\; \sigma(b) = b^{\lambda_\sigma}. \tag{8.10}$$
Since $b \in M_{-\infty}$, (8.9) yields that $U(s)bU(s)^* = \alpha_s(b) = b$, that is
$$b \text{ commutes with } U(s), \qquad s \in \mathbb{R}. \tag{8.11}$$
In particular, $b \in U(1)MU(1)^* = N$. Furthermore, by (1) in Theorem 2.1 we have
$$\sigma_t^\varphi \circ \gamma_s = \gamma_{e^{-2\pi t}s} \circ \sigma_t^\varphi, \qquad s, t \in \mathbb{R}, \tag{8.12}$$
so $M_{-\infty}$ is left invariant by all $\sigma_t^\varphi$. Therefore, applying (8.10) with $\sigma = \sigma_t^\varphi$, we get
$$\sigma_t^\varphi(b) = b^{e^{-2\pi t}}, \qquad t \in \mathbb{R}. \tag{8.13}$$
Let $\chi_{\{\lambda\}}$ be the characteristic function of $\{\lambda\} \subset \mathbb{R}$. For every $t \in \mathbb{R}$ we have
$$\sigma_t^\varphi(b) = b^{e^{-2\pi t}} \;\Rightarrow\; \sigma_t^\varphi\big(\chi_{\{1\}}(b)\big) = \chi_{\{1\}}(b),$$
so $\chi_{\{1\}}(b) \in M^\varphi$. On the other hand, (8.11) implies that $\chi_{\{1\}}(b)$ commutes with all $U(s)$. Therefore $\chi_{\{1\}}(b) \in N$ commutes with all $\Delta_N^{it} = U(1)\Delta_M^{it}U(1)^*$, and so it belongs to $N^\psi$. Let us denote
$e_o = 1_H - \chi_{\{1\}}(b) \in M^\varphi \cap N^\psi$,
$\varphi_o$ = the restriction of $\varphi$ to $e_oMe_o$,
$\psi_o$ = the restriction of $\psi$ (hence also of $\varphi$) to $e_oNe_o$,
$b_o = e_o b e_o = b e_o = e_o b \in e_oMe_o$.
By [28, Proposition 4.1 and Theorem 4.6] (see also [33, Propositions 4.5 and 4.7]), $\varphi_o$ is a normal semi-finite faithful weight and its modular group is the restriction of the modular group of $\varphi$ to $e_oMe_o$. Similarly, $\psi_o$ is a normal semi-finite faithful weight and its modular group is the restriction of the modular group of $\psi$ to $e_oNe_o$. In particular, by (8.13), we have
$$\sigma_t^{\varphi_o}(b_o) = b_o^{\,e^{-2\pi t}}, \qquad t \in \mathbb{R}. \tag{8.14}$$
Since $0 \le b_o \le e_o$ and the supports of both $b_o$ and $e_o - b_o$ are equal to the unit $e_o$ of $e_oMe_o$, $-\log b_o$ is a positive self-adjoint linear operator, of support $e_o$ and affiliated with $e_oMe_o$. Consequently, defining
$$u_s = (-\log b_o)^{is}, \qquad s \in \mathbb{R},$$
$(u_s)_{s \in \mathbb{R}}$ is a strongly continuous one-parameter group of unitaries in $e_oMe_o$ and (8.14) yields
$$\sigma_t^{\varphi_o}(u_s) = e^{-2\pi t s i}\,u_s, \qquad s, t \in \mathbb{R}.$$
Now the characterization theorem of Landstad [23, Theorem 2] (see also [34, Theorems I.3.3 and I.3.4], or [33, Theorem 19.9]) implies that the von Neumann algebra $e_oMe_o$ is generated by $(e_oMe_o)^{\varphi_o} = e_oM^\varphi e_o$ and by $u_{\mathbb{R}}$, that is
$$e_oM^\varphi e_o \text{ and } b_o \text{ generate the von Neumann algebra } e_oMe_o. \tag{8.15}$$
Since $M^\varphi \subset M_{-\infty}$ and $b \in M_{-\infty}$, we get that $e_oMe_o \subset M_{-\infty} \subset \gamma_1(M) = N$, that is $e_oMe_o = e_oNe_o$. Consequently $\varphi_o = \psi_o$, and so the modular groups $\sigma^\varphi$ and $\sigma^\psi$ have the same restriction $\sigma^{\varphi_o} = \sigma^{\psi_o}$ on $e_oMe_o = e_oNe_o$. Using (3) in Theorem 2.1, we obtain for every $x \in e_oM^\varphi e_o \subset M^\varphi \cap N^\psi$ and $t \in \mathbb{R}$:
$$U(1 - e^{2\pi t})\,x\,U(1 - e^{2\pi t})^* = \Delta_N^{-it}\Delta_M^{it}\,x\,\Delta_M^{-it}\Delta_N^{it} = \Delta_N^{-it}\,\sigma_t^\varphi(x)\,\Delta_N^{it} = \sigma_{-t}^\psi(x) = x.$$
Therefore $e_oM^\varphi e_o \subset \{x \in M;\ U(s)x = xU(s),\ s \in \mathbb{R}\}$, which yields together with (8.11) and (8.15):
$$e_oMe_o \subset \{x \in M;\ U(s)x = xU(s),\ s \in \mathbb{R}\}.$$
In other words, every $\alpha_s$ acts identically on $e_oMe_o \subset M_{-\infty}$. By (8.9) we conclude that $b_o$ belongs to the center of $e_oMe_o$. Since $b_o \in Z(e_oMe_o)$ is invariant under the modular automorphism group of $\varphi_o$, which coincides with the restriction of the modular automorphism group of $\varphi$ to $e_oMe_o$ as discussed above, we have $b_o \in M^\varphi$. Taking into account (8.14), we obtain
$$b^{e^{-2\pi t}} e_o = b_o^{\,e^{-2\pi t}} = \sigma_t^\varphi(b_o) = b_o = b e_o, \qquad t \in \mathbb{R},$$
which is possible only if $e_o = 0$. Consequently $\chi_{\{1\}}(b) = 1_H$, that is $b = 1_H$. But then every $\alpha_s = \mathrm{Ad}\,U(s)\,|\,M_{-\infty}$ acts identically on $M_{-\infty}$, hence
$$M_{-\infty} \subset \{x \in M;\ U(s)x = xU(s),\ s \in \mathbb{R}\} = \{x \in M;\ \gamma_s(x) = x,\ s \ge 0\}.$$

Proof of (4). Let us first assume that $\overline{M_\varphi \cap M^\varphi}^{\,so} = M^\varphi$. Taking into account (8) in Theorem 2.1, the inclusion $M^\varphi \subset \{x \in M;\ \gamma_s(x) = x,\ s \ge 0\}$ will follow once we show that
$$x \in M_\varphi \cap M^\varphi,\ s > 0 \;\Rightarrow\; U(s)x_\varphi = x_\varphi.$$
Let $s > 0$ and $x \in M_\varphi \cap M^\varphi$ be arbitrary. Using (8) in Theorem 2.1, (2.3) and (8.12), we get successively
$$\Delta_M^{it}U(s)x_\varphi = \sigma_t^\varphi(\gamma_s(x))_\varphi = \gamma_{e^{-2\pi t}s}(x)_\varphi = U(e^{-2\pi t}s)\,x_\varphi \xrightarrow{\;t \to +\infty\;} x_\varphi,$$
$$\|U(s)x_\varphi - x_\varphi\| = \|\Delta_M^{it}(U(s)x_\varphi - x_\varphi)\| = \|\Delta_M^{it}U(s)x_\varphi - \sigma_t^\varphi(x)_\varphi\| \xrightarrow{\;t \to +\infty\;} 0.$$
For the second implication we prove that if
$$M^\varphi \subset \{x \in M;\ \gamma_s(x) = x,\ s \ge 0\} \tag{8.16}$$
and $p \ne 1_H$, then $M(1_H - p)$ is of type III$_1$. Then also $N(1_H - p) = \gamma_1(M(1_H - p))$ will be of type III$_1$. Taking into account (2.11), we have to prove that
$$e \in M^\varphi \text{ projection},\ 0 \ne e \le 1_H - p \;\Rightarrow\; \sigma\big(\Delta_M\,|\,eJ_MeJ_MH\big) = [0, +\infty).$$
For this purpose, let the projection $e \in M^\varphi$, $0 \ne e \le 1_H - p$, be arbitrary. By the assumption (8.16) we have $\gamma_s(e) = e$ for all $s \ge 0$, hence (8.7) holds. Since $0 \ne e \le 1_H - p$, the proved (2) entails that (8.8) does not hold, that is $U(s)eJ_MeJ_M \ne eJ_MeJ_M$ for some $s \in \mathbb{R}$. Nevertheless, by (8.7) and by (2) in Theorem 2.1, all $U(s)$ commute with $eJ_MeJ_M$. Since $e \in M^\varphi$, also every $\Delta_M^{it}$ commutes with $eJ_MeJ_M$. According to (2.21), $U(i) = \exp(-P) \in B(H)$ is injective and $0 \le U(i) \le 1_H$. By the commutation relation (1) in Theorem 2.1, we have
$$\Delta_M^{it}\,U(i)\,\Delta_M^{-it} = U(i)^{e^{-2\pi t}}, \qquad t \in \mathbb{R}. \tag{8.17}$$
Consequently, the spectral projection $f_o = \chi_{\{1\}}(U(i))$ commutes with every $\Delta_M^{it}$. On the other hand, $f_o$ clearly commutes with every $U(s)$. Finally, the commutation of $eJ_MeJ_M$ with all $U(s)$ implies that the projections $eJ_MeJ_M$ and $f_o$ commute. We have already seen that $U(s)eJ_MeJ_M \ne eJ_MeJ_M$ for some $s \in \mathbb{R}$. On the other hand, since $U(s) = U(i)^{-is}$, we have $U(s)f_o = f_o$ for every $s \in \mathbb{R}$. Therefore $eJ_MeJ_M \not\le f_o$, and so the projection
$$f_1 = eJ_MeJ_M - f_o\,eJ_MeJ_M \le eJ_MeJ_M$$
is not zero. Since all $\Delta_M^{it}$ and all $U(s)$ commute with $eJ_MeJ_M$ and with $f_o$, they commute also with $f_1$. Therefore, we can define an so-continuous one-parameter group $(v_t)_{t \in \mathbb{R}}$ of unitaries on $f_1H$ by setting
$$v_t = \Delta_M^{it}\,|\,f_1H = (\Delta_M\,|\,f_1H)^{it}, \qquad t \in \mathbb{R}, \tag{8.18}$$
as well as the operator $b_1 = \big(U(i)\,|\,f_1H : f_1H \to f_1H\big) \in B(f_1H)$, for which $0 \le b_1 \le f_1$ and $b_1$, $f_1 - b_1$ are injective. From (8.17) we get successively
$$v_t\,b_1\,v_t^* = b_1^{\,e^{-2\pi t}}, \qquad t \in \mathbb{R},$$
$$v_t\,(-\log b_1)^{is}\,v_t^* = e^{-2\pi t s i}\,(-\log b_1)^{is}, \qquad t, s \in \mathbb{R}.$$
Now the Stone–von Neumann Uniqueness Theorem for canonical commutation relations (see, for example, [2, Chap. 20, Sec. 2] or [32]) entails that there exists a Hilbert space $K \ne \{0\}$ and a unitary operator
$$W : K \,\bar{\otimes}\, L^2(\mathbb{R}) \to f_1H$$
such that
$$v_t = W \circ (1_K \,\bar{\otimes}\, m_{-2\pi t}) \circ W^*, \qquad (-\log b_1)^{is} = W \circ (1_K \,\bar{\otimes}\, \lambda_s) \circ W^*, \qquad t, s \in \mathbb{R},$$
where $m_t$ is the multiplication operator with $e^{it\,\cdot}$ on $L^2(\mathbb{R})$, $t \in \mathbb{R}$, and $\lambda_s$ is the translation operator $\xi \mapsto \xi(\cdot - s)$ on $L^2(\mathbb{R})$, $s \in \mathbb{R}$.
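The commutation relation between $m_t$ and $\lambda_s$ that underlies this representation can be checked pointwise on a sample function. The following is only an illustrative sketch in Python with NumPy: the helper names `m` and `lam`, the Gaussian test function and the sample grid are assumptions made for the illustration and do not appear in the proof.

```python
import numpy as np

def m(t):
    """Multiplication operator m_t: xi -> exp(i t x) * xi(x)."""
    return lambda xi: (lambda x: np.exp(1j * t * x) * xi(x))

def lam(s):
    """Translation operator lambda_s: xi -> xi(. - s)."""
    return lambda xi: (lambda x: xi(x - s))

# Hypothetical test data: a Gaussian sampled at a few points.
xi = lambda x: np.exp(-x ** 2)
x = np.linspace(-3.0, 3.0, 13)
t, s = 0.7, 1.3

# Left side: m_{-2 pi t} lambda_s (m_{-2 pi t})^*, using (m_tau)^* = m_{-tau}.
lhs = m(-2 * np.pi * t)(lam(s)(m(2 * np.pi * t)(xi)))(x)
# Right side: exp(-2 pi t s i) * lambda_s, as in the relation for v_t and (-log b_1)^{is}.
rhs = np.exp(-2j * np.pi * t * s) * lam(s)(xi)(x)

print(np.allclose(lhs, rhs))
```

The check only evaluates both sides at finitely many points; it illustrates the phase factor $e^{-2\pi t s i}$ and is not a substitute for the operator-theoretic statement.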
Using (8.18), we deduce that $\Delta_M\,|\,f_1H = W \circ (1_K \,\bar{\otimes}\, m_{2\pi i}) \circ W^*$, where $m_{2\pi i}$ is the unbounded positive self-adjoint multiplication operator with $e^{-2\pi\,\cdot}$ in $L^2(\mathbb{R})$. Consequently, the spectrum of $\Delta_M\,|\,f_1H$ is equal to the spectrum of $1_K \,\bar{\otimes}\, m_{2\pi i}$, that is to $[0, +\infty)$. Since $f_1 \le eJ_MeJ_M$, we conclude that also the spectrum of $\Delta_M\,|\,eJ_MeJ_MH$ is equal to $[0, +\infty)$.

References

[1] W. Arveson, On groups of automorphisms of operator algebras, J. Funct. Analysis 15 (1974) 217–243.
[2] A. O. Barut and R. Raczka, Theory of Group Representations and Applications (Polish Scientific Publishers, Warszawa, 1977).
[3] C. A. Berenstein and R. Gay, Complex Variables, An Introduction (Springer-Verlag, 1991).
[4] J. Bisognano and E. Wichmann, On the duality condition for a Hermitian scalar field, J. Math. Phys. 16 (1975) 985–1007.
[5] H. J. Borchers, Energy and momentum as observables in quantum field theory, Commun. Math. Phys. 2 (1966) 49–54.
[6] H. J. Borchers, The CPT-theorem in two-dimensional theories of local observables, Commun. Math. Phys. 143 (1992) 315–332.
[7] H. J. Borchers, On the use of modular groups in quantum field theory, Ann. Inst. Henri Poincaré, Physique Théorique 63 (1995) 331–382.
[8] H. J. Borchers, Tensor product decompositions in quantum field theory, Oberwolfach Lectures (March 1997).
[9] H. J. Borchers, Half-sided translations and the type of von Neumann algebras, Lett. Math. Phys. 44 (1998) 283–290.
[10] H. J. Borchers and J. Yngvason, Modular groups of quantum fields in thermal states, J. Math. Phys. 40 (1999) 601–624.
[11] H. Cartan, Théorie élémentaire des fonctions analytiques d'une ou plusieurs variables complexes (Hermann, Paris, 1961).
[12] I. Ciorănescu and L. Zsidó, Analytic generators for one-parameter groups, Tôhoku Math. J. 28 (1976) 327–362.
[13] A. Connes, Une classification des facteurs de type III, Ann. Sci. École Norm. Sup. 6 (1973) 133–252.
[14] J. B. Conway, Functions of One Complex Variable II (Springer-Verlag, 1995).
[15] D. R. Davidson, Endomorphism semigroups and lightlike translations, Lett. Math. Phys. 38 (1996) 77–90.
[16] J. Dixmier and P. Malliavin, Factorizations de fonctions et de vecteurs indéfiniment différentiables, Bull. Sci. Math. 102(2) (1978) 307–330.
[17] M. Florig, On Borchers' theorem, Lett. Math. Phys. 46 (1998) 289–293.
[18] R. C. Gunning and H. Rossi, Analytic Functions of Several Complex Variables (Prentice-Hall, Englewood Cliffs, N. J., 1965).
[19] U. Haagerup, The standard form of von Neumann algebras, Math. Scand. 37 (1975) 271–283.
[20] K. Hoffman, Banach Spaces of Analytic Functions (Prentice-Hall, Englewood Cliffs, N. J., 1962).
[21] P. Koosis, Introduction to Hp Spaces (Cambridge University Press, 1980).
[22] H. Kosaki, Type III Factors and Index Theory, Lecture Note Series, 43 (Seoul National University, Global Analysis Research Centre, Seoul, 1998).
[23] M. Landstad, Duality theory for covariant systems, Trans. Amer. Math. Soc. 248 (1979) 223–276.
[24] R. Longo, Solution of the factorial Stone-Weierstrass conjecture, Inv. Math. 76 (1986) 145–155.
[25] R. Longo, Simple injective subfactors, Adv. Math. 63 (1987) 152–171.
[26] R. Longo, Index of subfactors and statistics of quantum fields I, II, Commun. Math. Phys. 126 (1989) 217–247; 130 (1990) 285–309.
[27] G. K. Pedersen, C*-Algebras and their Automorphism Groups (Academic Press, 1979).
[28] G. K. Pedersen and M. Takesaki, The Radon-Nikodym theorem for von Neumann algebras, Acta Math. 130 (1973) 53–88.
[29] M. A. Rieffel and A. Van Daele, The commutation theorem for tensor products of von Neumann algebras, Bull. London Math. Soc. 7 (1975) 257–260.
[30] K. Schmüdgen, Unbounded Operator Algebras and Representation Theory, Operator Theory, Advances and Applications, Vol. 37 (Birkhäuser Verlag, 1990).
[31] B. Schroer, Recent developments of algebraic methods in quantum field theories, Int. J. Modern Phys. B 6 (1992) 2041–2059.
[32] J. Slawny, On factor representations and the C*-algebra of canonical commutation relations, Commun. Math. Phys. 24 (1972) 151–170.
[33] Ş. Strătilă, Modular Theory in Operator Algebras (Editura Academiei-Abacus Press, 1981).
[34] Ş. Strătilă, D. V. Voiculescu and L. Zsidó, On crossed products, I, II, Revue Roum. Math. Pures Appl. 21 (1976) 1411–1449; 22 (1977) 83–117.
[35] Ş. Strătilă and L. Zsidó, Lectures on von Neumann Algebras (Editura Academiei-Abacus Press, 1979).
[36] M. Takesaki, Tomita's Theory of Modular Hilbert Algebras and Its Applications, Lecture Notes in Math. 128 (Springer-Verlag, 1970).
[37] H.-W. Wiesbrock, Half-sided modular inclusions of von Neumann algebras, Commun. Math. Phys. 157 (1993) 83–92.
[38] H.-W. Wiesbrock, Symmetries and half-sided modular inclusions of von Neumann algebras, Lett. Math. Phys. 28 (1993) 107–114.
[39] H.-W. Wiesbrock, Conformal quantum field theory and half-sided modular inclusions of von Neumann algebras, Commun. Math. Phys. 158 (1993) 537–543.
[40] H.-W. Wiesbrock, A note on strongly additive conformal field theory and half-sided modular conformal standard inclusions, Lett. Math. Phys. 31 (1994) 303–307.
[41] H.-W. Wiesbrock, Superselection structure and localized Connes' cocycles, Rev. Math. Phys. 7 (1995) 133–160.
[42] H.-W. Wiesbrock, Erratum, "Half-sided modular inclusions of von Neumann algebras" [Commun. Math. Phys. 157 (1993) 83–92], Commun. Math. Phys. 184 (1997) 683–685.
[43] L. Zsidó, Analytic generator and the foundation of the Tomita-Takesaki theory of Hilbert algebras, in Proc. International School Math. Phys., Univ. Camerino (1974) 182–267.
[44] L. Zsidó, Spectral and ergodic properties of the analytic generator, J. Approximation Theory 20 (1977) 77–138.
[45] L. Zsidó, On the equality of two weights, Revue Roum. Math. Pures Appl. 23 (1978) 631–646.
July 6, 2005 12:21 WSPC/148-RMP
J070-00236
Reviews in Mathematical Physics Vol. 17, No. 5 (2005) 545–576 © World Scientific Publishing Company

DISTILLABILITY AND POSITIVITY OF PARTIAL TRANSPOSES IN GENERAL QUANTUM FIELD SYSTEMS

RAINER VERCH
Max-Planck-Institut for Mathematics in the Sciences, Inselstr. 22, D-04103 Leipzig, Germany
[email protected]

REINHARD F. WERNER
Institut f. Mathematische Physik, TU Braunschweig, Mendelssohnstr. 3, D-38106 Braunschweig, Germany
[email protected]

Received 02 April 2004
Revised 04 April 2005

Dedicated to Detlev Buchholz on the occasion of his 60th birthday

Criteria for distillability, and the property of having a positive partial transpose, are introduced for states of general bipartite quantum systems. The framework is sufficiently general to include systems with an infinite number of degrees-of-freedom, including quantum fields. We show that a large number of states in relativistic quantum field theory, including the vacuum state and thermal equilibrium states, are distillable over subsystems separated by arbitrary spacelike distances. These results apply to any quantum field model. It will also be shown that these results can be generalized to quantum fields in curved spacetime, leading to the conclusion that there is a large number of quantum field states which are distillable over subsystems separated by an event horizon.

Keywords: Entanglement; distillability; quantum field theory; Reeh–Schlieder property.
1. Introduction

In the present work we investigate entanglement criteria for quantum systems with infinitely many degrees-of-freedom, paying particular attention to relativistic quantum field theory. The specification and characterization of entanglement in quantum systems is a primary issue in quantum information theory (see [34] for a recent review of quantum information theory). Entanglement frequently appears as a resource for typical quantum information tasks, in particular for teleportation [2], key distribution [18], and quantum computation [48]. Ideally these processes use bipartite entanglement in the form of maximally entangled states, such as the singlet state of two spin-1/2 particles. But less entangled sources can sometimes be converted to
such maximally entangled ones by a “distillation process” using only local quantum operations and classical communication [46, 3]. States for which this is possible are called “distillable”, and this property is the strongest entanglement property for generic states (as opposed to special parameterized families). Indeed, it is stronger than merely being entangled, where a state is called entangled if it cannot be written as a mixture of uncorrelated product states. The existence of non-distillable entangled states (also called “bound entangled states”) was first shown in [28]. For a given state it is often not easy to decide to which class it belongs. A very efficient criterion is obtained from studying the partial transpose of the density operator, and asking whether it is a positive operator. In this case the state is called a ppt state, and an npt state otherwise. Originally, the npt property was established by Peres [44] as a sufficient condition for entanglement, and was subsequently shown to be also necessary for low dimensional systems [29, 66] and some highly symmetric systems [59]. It turns out that ppt states cannot be distilled, so the existence of bound entangled states shows that the ppt condition is a much tighter fit for non-distillability than for mere separability. In fact, it is one of the major open problems [47] to decide whether there is equivalence, i.e., whether all npt states are distillable. There have been indications that this conjecture might fail for bipartite quantum systems having finitely and sufficiently many discrete degrees-of-freedom [14, 16]. On the other hand, for bipartite quantum systems having finitely many continuous degrees-of-freedom (such as harmonic oscillators) it was found that Gaussian states which are npt are also distillable (about this and related results, cf. [23] and references cited there).
While this brief recapitulation of results documents that the distinction between entangled, npt and distillable states is a subtle business already in the case of quantum systems with finitely many degrees-of-freedom, we would now like to point out that the study of entanglement is also a longstanding issue in general quantum field theory. Already before the advent of quantum information theory, the extent to which Bell-inequalities are violated has been investigated in several articles by Landau [39, 40] and by Summers and Werner [54–56]. In fact, the studies [55, 54] motivated the modern concept of separable states (then called “classically correlated” [65]) and raised the question of the connection between separability and Bell’s inequalities. More recently, there has been renewed interest [50, 26, 42, 1, 17, 45] in the connection between “locality” as used in quantum information theory on the one hand, and in quantum field theory on the other. However, for some of the relevant questions, like distillability, the usual framework of quantum information theory, mainly focusing on systems with finite dimensional Hilbert spaces, is just not rich enough. This lack, which is also serious for the connections between entanglement theory and statistical mechanics of infinite systems, is addressed in the first part of our paper. In particular, we extend the notions of separability and distillability for the general bipartite situations found in systems which have infinitely many degrees-of-freedom, and which cannot be expressed in terms of the tensor product of
Hilbert spaces. These generalizations are fairly straightforward. Less obvious is our generalization of the notion of states with positive partial transpose, since the operation “partial transposition” itself becomes meaningless. Of course, we also establish the usual implications between these generalized concepts. It turns out that 1-distillability of a state follows from the Reeh–Schlieder property, which has been thoroughly investigated for quantum field theoretical systems. After establishing this connection, we can therefore bring to bear known results from quantum field theory to draw some new conclusions about the non-classical nature of vacuum fluctuations. In particular, the vacuum is 1-distillable, even when Alice and Bob operate in arbitrarily small spacetime regions, and arbitrarily far apart in a Minkowski spacetime. Such a form of distillability can then also be deduced to hold for a very large (in a suitable sense, dense) class of quantum field states, including thermal equilibrium states. We comment on related results in [26, 38] and [50] in the remarks following Theorem 7.2. Furthermore, we generalize the distillability result to free quantum fields on curved spacetimes. We also point out that this entails distillability of a large class of quantum field states over subsystems which may be separated by an event horizon in spacetime, inhibiting two-way classical communication between the system parts, and we will discuss what this means for the distillability concept.

2. General Bipartite Quantum Systems

The bipartite quantum systems arising in quantum field theory are systems of infinitely many degrees-of-freedom. In contrast, the typical descriptions of concepts and results of quantum information theory are for quantum systems described in finite dimensional Hilbert spaces.
In this section we describe the basic mathematical structures needed to describe systems of infinitely many degrees-of-freedom and, in particular, bipartite systems in that context. For the transition to infinitely many degrees-of-freedom it does not suffice to consider Hilbert spaces of infinite dimension: this level of complexity is already needed for a single harmonic oscillator. The key idea allowing the transition to infinitely many oscillators is to look at the observable algebra of the system, which is then no longer the algebra of all bounded operators on a Hilbert space, but a more general operator algebra. This operator algebraic approach to large quantum systems has proved useful in both quantum field theory and quantum statistical mechanics [4, 5, 19, 24, 52]. For many questions we discuss, it suffices to take the observable algebra $\mathcal{R}$ as a general C*-algebra: this is defined as an algebra with an adjoint operation $X \mapsto X^*$ on the algebra elements $X$ and also with a norm with respect to which it is complete and which satisfies $\|X^*X\| = \|X\|^2$. In practically all applications, $\mathcal{R}$ is given in a Hilbert space representation, so that it is usually no restriction of generality to think of $\mathcal{R}$ as a norm-closed and adjoint-closed subalgebra of the algebra $B(\mathcal{H})$ of all bounded linear operators on a Hilbert space $\mathcal{H}$. We should emphasize, though, that $\mathcal{R}$ is usually really a proper subalgebra of $B(\mathcal{H})$, and also
its weak closure (in the sense of convergence of expectation values) $\bar{\mathcal{R}}$ will typically be a proper subalgebra of $B(\mathcal{H})$. This is of particular importance in the present context when we consider C*-algebras of local observables in relativistic quantum field theories: these are proper subalgebras of some $B(\mathcal{H})$ which do not contain any finite-dimensional projection (in technical terms, the von Neumann algebras arising as their weak closures are purely infinite, cf. [4, Sec. 2.7]). Therefore, the properties of these algebras are fundamentally different from those of the full $B(\mathcal{H})$; in particular, arguments previously developed in quantum information theory for finite dimensional systems modelled on $B(\mathbb{C}^m) \otimes B(\mathbb{C}^n)$ are typically based on the use of finite-dimensional projections and thus they can usually not simply be generalized to the quantum field theoretical case. We only consider algebras with unit element $\mathbb{1}$. For some questions we will consider a special type of such algebras, called von Neumann algebras, about which we collect some basic facts later. In any case, the “observables” are specified as selfadjoint elements of the algebra, or, more generally as measures (POVMs) with values in the positive elements of $\mathcal{R}$. Discussions of entanglement always refer to distinguished subsystems of a given quantum system. Subsystems are specified as subalgebras of the total observable algebra. For a bipartite system we must specify two subsystems with the crucial property that every observable of one subsystem can be measured jointly with every observable of the other, which is equivalent to saying that the observable algebras commute elementwise. Hence we arrive at

Definition 2.1. A (generalized) bipartite system, usually denoted by $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$, is a pair of C*-subalgebras $\mathcal{A}$, $\mathcal{B}$ of a larger C*-algebra $\mathcal{R}$, called the ambient algebra of the system, such that the identity is contained in both algebras, and all elements $A \in \mathcal{A}$ and $B \in \mathcal{B}$ commute.
Thinking of typical situations in quantum information theory, $\mathcal{A}$ corresponds to the observables controlled by “Alice” and $\mathcal{B}$ to the observables controlled by “Bob”. The ambient algebra $\mathcal{R}$ will not play an important role for the concepts we define. For most purposes it is equivalent to choose $\mathcal{R}$ either “minimal”, i.e., as the smallest C*-subalgebra containing both $\mathcal{A}$ and $\mathcal{B}$, or else “maximal” as $B(\mathcal{H})$, the algebra of all bounded operators on the Hilbert space $\mathcal{H}$ on which all the operators under consideration are taken to operate. The standard quantum mechanical example of a bipartite situation is given by the tensor product $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ of two Hilbert-spaces $\mathcal{H}_A$ and $\mathcal{H}_B$, with the observable algebras of Alice and Bob defined as $\mathcal{R} = B(\mathcal{H})$, $\mathcal{A} = B(\mathcal{H}_A) \otimes \mathbb{1}$, $\mathcal{B} = \mathbb{1} \otimes B(\mathcal{H}_B)$. Note that in this example both algebras $\mathcal{A}$, $\mathcal{B}$ are of the form $B(\tilde{\mathcal{H}})$ (for suitable Hilbert-space $\tilde{\mathcal{H}}$), and as mentioned above, this will not be the case any more when $\mathcal{A}$ and $\mathcal{B}$ correspond to algebras of local observables in quantum field theory. Furthermore, if we do not want to impose unnecessary algebraic restrictions on the subsystems, we must envisage more general compositions than of tensor product
form, too. Such systems arise naturally in quantum field theory, for tangent spacetime regions [54], but also if we want to describe a state of an infinite collection of singlet pairs, and other “infinitely entangled” situations [35]. A state on a C*-algebra $\mathcal{R}$ is a linear functional $\omega : \mathcal{R} \to \mathbb{C}$, which can be interpreted as an expectation value functional, i.e., which is positive ($\omega(A) \ge 0$ for $A \ge 0$), and normalized ($\omega(\mathbb{1}) = 1$). When $\mathcal{R} \subset B(\mathcal{H})$, i.e., when we consider a particular representation of all algebras involved as algebras of operators, we can consider the special class of states of the form
$$\omega(A) = \mathrm{Tr}(\rho_\omega A) \quad \text{for all } A \in \mathcal{R}, \tag{2.1}$$
for some positive trace class operator $\rho_\omega$, called the density operator of $\omega$. Such states are called normal (with respect to the representation). As usual, for $A = A^*$ representing an observable, the value $\omega(A)$ is the expectation value of the observable $A$ in the state $\omega$. A bipartite state is simply a state on the ambient algebra of a bipartite system. Since every state on the minimal ambient algebra can be extended to a state on the maximal algebra, this notion does not intrinsically depend on the choice of ambient algebra. A bipartite state $\omega$ is a product state if $\omega(AB) = \omega(A)\omega(B)$ for all $A \in \mathcal{A}$ and $B \in \mathcal{B}$. Similarly, $\omega$ is called separable, if it is the weak limit$^a$ of states $\omega_\alpha$, each of which is a convex combination of product states.

3. Positivity of Partial Transpose (ppt)

Consider again the standard situation in quantum information theory, where all Hilbert spaces are finite dimensional, and a bipartite system with Hilbert space $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$. Then we can define the partial transpose of a state $\omega$, or equivalently, its density matrix $\rho_\omega$ with $\omega(\,\cdot\,) = \mathrm{Tr}(\rho_\omega\,\cdot\,)$, by introducing orthonormal bases $|e_k^{(A)}\rangle$ in $\mathcal{H}_A$ and $|e_\ell^{(B)}\rangle$ in $\mathcal{H}_B$ for each of the Hilbert spaces, and swapping the matrix indices belonging to one of the factors, say the first, so that
$$\big\langle e_k^{(A)} \otimes e_\ell^{(B)} \big|\, \rho_\omega^{T_1} \,\big| e_m^{(A)} \otimes e_n^{(B)} \big\rangle = \big\langle e_m^{(A)} \otimes e_\ell^{(B)} \big|\, \rho_\omega \,\big| e_k^{(A)} \otimes e_n^{(B)} \big\rangle. \tag{3.2}$$
Then it is easy to see that in general $\rho_\omega \ge 0$ does not imply $\rho_\omega^{T_1} \ge 0$, i.e., the partial transpose operation is not completely positive. On the other hand, if $\omega$ is separable, then $\rho_\omega^{T_1} \ge 0$. More generally, we say that $\omega$ is a ppt-state when this is the case. As we just noted, the ppt property is necessary for separability, and also sufficient in low dimensions ($2 \otimes 2$ and $2 \otimes 3$), which is known as the Peres–Horodecki criterion for separability [44].
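As an illustration of Eq. (3.2) in the finite-dimensional case, the index swap can be carried out numerically. The sketch below uses Python with NumPy; the function names `partial_transpose_first` and `is_ppt` and the two example states are assumptions made for this illustration, not part of the paper.

```python
import numpy as np

def partial_transpose_first(rho, dim_a, dim_b):
    """Swap the two indices of the first tensor factor, as in Eq. (3.2)."""
    # r[k, l, m, n] = <e_k (x) e_l | rho | e_m (x) e_n>
    r = rho.reshape(dim_a, dim_b, dim_a, dim_b)
    # result[k, l, m, n] = r[m, l, k, n]
    return r.transpose(2, 1, 0, 3).reshape(dim_a * dim_b, dim_a * dim_b)

def is_ppt(rho, dim_a, dim_b, tol=1e-12):
    """True if the partial transpose has no eigenvalue below -tol."""
    eigs = np.linalg.eigvalsh(partial_transpose_first(rho, dim_a, dim_b))
    return bool(eigs.min() >= -tol)

# Singlet state of two spin-1/2 particles: (|01> - |10>)/sqrt(2).
psi = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
rho_singlet = np.outer(psi, psi.conj())

# Product state |0> (x) |1>.
phi = np.kron([1, 0], [0, 1]).astype(complex)
rho_product = np.outer(phi, phi.conj())

print(is_ppt(rho_singlet, 2, 2))   # False: the singlet is npt
print(is_ppt(rho_product, 2, 2))   # True: product states are ppt
```

For the singlet the partial transpose has eigenvalues 1/2, 1/2, 1/2 and -1/2, so the check reports npt, in line with the Peres–Horodecki criterion for $2 \otimes 2$ systems.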
It is important to note that while the definition of the partial transpose depends on the choice of bases, the ppt-condition does not: different partial transposes are linked by a unitary transformation and so have the same spectrum. In the more involved context of general bipartite systems, we will follow a similar approach by defining a ppt property without even introducing an object which one might call the “partial transpose” of the given state, and which would in any case be highly dependent on further special choices.

$^a$ This means that $\lim_\alpha \omega_\alpha(X) = \omega(X)$, for all $X \in \mathcal{R}$.

Definition 3.1. We say that a state $\omega$ on a bipartite system $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$ has the ppt property if for any choice of finitely many $A_1, \ldots, A_k \in \mathcal{A}$, and $B_1, \ldots, B_k \in \mathcal{B}$, one has
$$\sum_{\alpha,\beta} \omega\big(A_\beta A_\alpha^* B_\alpha^* B_\beta\big) \ge 0.$$
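For two qubits the quadratic form of Definition 3.1 can be evaluated directly. The following sketch (Python with NumPy) uses one particular, hypothetical choice of operators, $A_\alpha = |\alpha\rangle\langle 0| \otimes \mathbb{1}$ and $B_\alpha = \mathbb{1} \otimes |0\rangle\langle\alpha|$, for which the sum reduces to the expectation value of the flip (swap) operator; the helper names are assumptions for the illustration.

```python
import numpy as np

def ket(i, d=2):
    """Basis vector |i> in C^d."""
    v = np.zeros(d, dtype=complex)
    v[i] = 1.0
    return v

def ppt_form(rho, A_list, B_list):
    """sum_{alpha,beta} omega(A_beta A_alpha^* B_alpha^* B_beta), omega(X) = Tr(rho X)."""
    total = 0.0
    for Aa, Ba in zip(A_list, B_list):        # index alpha
        for Ab, Bb in zip(A_list, B_list):    # index beta
            total += np.trace(rho @ Ab @ Aa.conj().T @ Ba.conj().T @ Bb)
    return total

I2 = np.eye(2)
A_list = [np.kron(np.outer(ket(a), ket(0).conj()), I2) for a in range(2)]  # |a><0| (x) 1
B_list = [np.kron(I2, np.outer(ket(0), ket(a).conj())) for a in range(2)]  # 1 (x) |0><a|

psi = (np.kron(ket(0), ket(1)) - np.kron(ket(1), ket(0))) / np.sqrt(2)     # singlet
rho_singlet = np.outer(psi, psi.conj())
phi = np.kron(ket(0), ket(0))                                              # product |00>
rho_product = np.outer(phi, phi.conj())

print(ppt_form(rho_singlet, A_list, B_list).real)   # -1: the singlet violates the ppt condition
print(ppt_form(rho_product, A_list, B_list).real)   #  1: consistent with ppt
```

With this choice $\sum_{\alpha,\beta} A_\beta A_\alpha^* B_\alpha^* B_\beta$ is the flip operator, whose expectation in the antisymmetric singlet state is $-1$, so a single family of operators already witnesses that the singlet is not ppt in the sense of Definition 3.1.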
Clearly, this definition is independent of the choice of ambient algebra $\mathcal{R}$, since only expectations of the form $\omega(AB)$ enter. It is also symmetrical with respect to the exchange of $\mathcal{A}$ and $\mathcal{B}$ (just exchange $A_\alpha$ and $B_\beta^*$, with concomitant changes). Our first task is to show that this notion of ppt coincides with that given by Peres [44] in the case of finite-dimensional Hilbert spaces. We show this by looking more generally at situations in which there is a candidate for the role of the “partial transpose of $\omega$”.

Proposition 3.2. Let $(\mathcal{A}, \mathcal{B}) \subset B(\mathcal{H})$ be a bipartite system, and let $\theta$ be an antiunitary operator on $\mathcal{H}$ such that the algebra $\tilde{\mathcal{B}} \equiv \theta^*\mathcal{B}\theta$ commutes elementwise with $\mathcal{A}$.
(1) Suppose that $\tilde\omega$ is a state on $B(\mathcal{H})$ such that
$$\tilde\omega(A\tilde B) = \omega(A\,\theta \tilde B^* \theta^*) \tag{3.3}$$
for $A \in \mathcal{A}$ and $\tilde B \in \tilde{\mathcal{B}}$. Then $\omega$ is ppt.
(2) In particular, if $\mathcal{A}$, $\mathcal{B}$ are finite dimensional matrix algebras, Definition 3.1 is equivalent to the positivity of the partial transpose in the sense of Eq. (3.2).

Note that the star on the right-hand side of Eq. (3.3) is necessary so the whole equation becomes linear in $\tilde B$. When $\theta$ is complex conjugation in some basis, $X \mapsto \theta^* X^* \theta$ is exactly the matrix transpose in that basis. This proves the second part of the proposition: if $\mathcal{A}$, $\mathcal{B}$ are matrix algebras, we can identify $\tilde{\mathcal{B}}$ with the algebra of all transposed matrices $\theta^*\mathcal{B}\theta$, and with this identification Eq. (3.3) defines a linear functional on $\mathcal{A} \otimes \mathcal{B}$, which is just the partial transpose of $\omega$. The only issue for the ppt property in both formulations is indeed whether this functional is positive, i.e., a state.

Proof. Let $A_1, \ldots, A_k \in \mathcal{A}$, and $B_1, \ldots, B_k \in \mathcal{B}$ be as in Definition 3.1, and introduce $\tilde B_\alpha = \theta^* B_\alpha^* \theta$, so that also $B_\alpha = \theta \tilde B_\alpha^* \theta^*$. Then
$$\begin{aligned}
\sum_{\alpha,\beta} \omega\big(A_\beta A_\alpha^* B_\alpha^* B_\beta\big) &= \sum_{\alpha,\beta} \omega\big(A_\beta A_\alpha^*\, \theta \tilde B_\alpha \theta^*\, \theta \tilde B_\beta^* \theta^*\big) \\
&= \sum_{\alpha,\beta} \omega\big(A_\beta A_\alpha^*\, \theta \tilde B_\alpha \tilde B_\beta^* \theta^*\big) \\
&= \sum_{\alpha,\beta} \omega\big(A_\beta A_\alpha^*\, \theta (\tilde B_\beta \tilde B_\alpha^*)^* \theta^*\big) \\
&= \sum_{\alpha,\beta} \tilde\omega\big(A_\beta A_\alpha^* \tilde B_\beta \tilde B_\alpha^*\big) \\
&= \tilde\omega(XX^*),
\end{aligned} \tag{3.4}$$
with $X = \sum_\alpha A_\alpha \tilde B_\alpha$. Clearly, when $\tilde\omega$ is a state, this is positive.
Another consistency check is the following.

Lemma 3.3. Also for general bipartite systems, separable states are ppt.

Proof. Obviously, the ppt property is preserved under weak limits and convex combinations. By definition, each separable state arises by such operations from product states. Hence it is enough to show that each product state on $\mathcal{R}$ is ppt. If $\omega(AB) = \omega(A)\omega(B)$ is a product state, and $A_1, \ldots, A_k \in \mathcal{A}$ and $B_1, \ldots, B_k \in \mathcal{B}$, we introduce the $(k \times k)$-matrices $M_{\beta\alpha} = \omega(A_\beta A_\alpha^*)$ and $N_{\alpha\beta} = \omega(B_\alpha^* B_\beta)$. What we have to show according to Definition 3.1 is that $\operatorname{tr}(MN) \geq 0$. But this is clear from the observation that $M$ and $N$ are obviously positive semi-definite.

Therefore the set of states which are not ppt across $\mathcal{A}$ and $\mathcal{B}$ (the "npt-states") forms a subset of the class of entangled states. As is well-known already for low dimensional examples (larger than $(3 \otimes 3)$-dimensional systems) the converse of this lemma fails.

We add another result, an apparent strengthening of the ppt condition, which will turn out to be useful in proving below that a ppt state fulfils the Bell inequalities. Again, the assumptions on $\mathcal{A}$ and $\mathcal{B}$ are of the generic type as stated at the beginning of the section.

Lemma 3.4. Let $\omega$ be a ppt state on $\mathcal{R}$ for the bipartite system $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$. Then for any choice of finitely many $A_1, \ldots, A_k \in \mathcal{A}$ and $B_1, \ldots, B_k \in \mathcal{B}$, it holds that
\[
|\omega(T)|^2 \leq \sum_{\alpha,\beta} \omega\bigl(A_\beta A_\alpha^* B_\alpha^* B_\beta\bigr),
\qquad \text{where } T = \sum_\alpha A_\alpha B_\alpha .
\]
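Proof. We add new elements $A_0 = \mathbb{1}$ and $B_0 = \lambda \mathbb{1}$ for $\lambda \in \mathbb{C}$ to the families $A_1, \ldots, A_k$, $B_1, \ldots, B_k$. The condition of ppt then applies also with the new families $A_0, A_1, \ldots, A_k \in \mathcal{A}$, $B_0, B_1, \ldots, B_k \in \mathcal{B}$, entailing that
\[
0 \leq \sum_{\alpha,\beta=0}^{k} \omega\bigl(A_\beta A_\alpha^* B_\alpha^* B_\beta\bigr)
= \sum_{\alpha,\beta=1}^{k} \omega\bigl(A_\beta A_\alpha^* B_\alpha^* B_\beta\bigr)
+ \omega(\lambda T^*) + \omega(\bar\lambda T) + \omega(|\lambda|^2 \mathbb{1}).
\]
Now insert $\lambda = -\omega(T)$ and use that, since $\omega$ is a state, it holds that $\omega(T^*) = \overline{\omega(T)}$. The last three terms then sum to $-|\omega(T)|^2$, which immediately yields the inequality claimed in Lemma 3.4.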
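The inequality of Lemma 3.4 can be checked numerically in the two-qubit matrix case. The sketch below assumes $A_\alpha = x_\alpha \otimes \mathbb{1}$ and $B_\alpha = \mathbb{1} \otimes y_\alpha$ with $2 \times 2$ matrices $x_\alpha, y_\alpha$, and a state given by a density matrix; all function names are hypothetical and chosen for illustration only. It also shows that the inequality can fail for the (non-ppt) singlet state.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_matrix(n):
    """Random complex n x n matrix (no symmetry assumed)."""
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def random_projector(n):
    """Projector onto a random unit vector in C^n."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    v /= np.linalg.norm(v)
    return np.outer(v, v.conj())

def random_separable_state(n, terms=5):
    """Convex combination of random product states, hence separable and ppt."""
    weights = rng.random(terms)
    weights /= weights.sum()
    return sum(w * np.kron(random_projector(n), random_projector(n)) for w in weights)

def lemma_sides(rho, xs, ys):
    """Return (|omega(T)|^2, sum_{a,b} omega(A_b A_a^* B_a^* B_b))
    for A_a = x_a (x) 1 and B_a = 1 (x) y_a."""
    expect = lambda op: np.trace(rho @ op)
    omega_t = sum(expect(np.kron(x, y)) for x, y in zip(xs, ys))
    total = sum(
        expect(np.kron(xs[b] @ xs[a].conj().T, ys[a].conj().T @ ys[b]))
        for a in range(len(xs)) for b in range(len(xs))
    )
    return abs(omega_t) ** 2, float(np.real(total))

# Separable (ppt) state with random operators: the inequality must hold.
rho_sep = random_separable_state(2)
xs = [random_matrix(2) for _ in range(3)]
ys = [random_matrix(2) for _ in range(3)]
lhs, rhs = lemma_sides(rho_sep, xs, ys)
print(lhs <= rhs + 1e-9)

# Singlet state with x_a = |a><0| and y_a = |0><a|: the sum becomes the
# expectation of the flip operator, which is negative for the singlet.
e = np.eye(2)
xs_flip = [np.outer(e[a], e[0]) for a in range(2)]
ys_flip = [np.outer(e[0], e[a]) for a in range(2)]
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)
singlet = np.outer(psi, psi)
lhs_s, rhs_s = lemma_sides(singlet, xs_flip, ys_flip)
print(lhs_s, rhs_s)
```

For the singlet, the chosen operators turn the sum of Definition 3.1 into the expectation value of the flip operator, which is negative, so the bound of the lemma is violated.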
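In a similar spirit, we can apply the standard trick of polarization, i.e., of replacing the arguments in a positive definite quadratic form by linear combinations to get a condition on a bilinear form. The polarized version of the ppt-property is the following, and makes yet another connection to the ordinary matrix version of the ppt-property:

Lemma 3.5. Let $\omega$ be a state on a bipartite system $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$. Then for any choice of elements $A_1, \ldots, A_n \in \mathcal{A}$ and $B_1, \ldots, B_m \in \mathcal{B}$, introduce the $(nm) \times (nm)$-matrix $X$ by
\[
\langle i\alpha | X | j\beta \rangle = \omega\bigl(A_i B_\alpha B_\beta^* A_j^*\bigr). \tag{3.5}
\]
All such matrices are positive semi-definite for any state $\omega$. Moreover, they all have a positive partial transpose if and only if $\omega$ is ppt.

Proof. The positivity for arbitrary states says that, for all complex $(n \times m)$-matrices $\Phi$, we have
\[
\sum_{i\alpha j\beta} \overline{\Phi_{i\alpha}} \, \langle i\alpha | X | j\beta \rangle \, \Phi_{j\beta} = \omega(X^* X) \geq 0, \tag{3.6}
\]
where, on the right-hand side, $X = \sum_{i\alpha} \Phi_{i\alpha} B_\alpha^* A_i^*$. For the ppt-property, decompose an arbitrary $\Phi$ as
\[
\Phi_{i\alpha} = \sum_\mu u_{i\mu} v_{\alpha\mu},
\]
for suitable coefficient matrices $u$, $v$. For example, we can get $u$ and $v$ from the singular value decomposition of $\Phi$. Inserting this into the condition for the positivity of $X^{T_2}$, we find
\[
\begin{aligned}
\sum_{i\alpha j\beta} \overline{\Phi_{i\alpha}} \, \langle i\alpha | X^{T_2} | j\beta \rangle \, \Phi_{j\beta}
&= \sum_{i\alpha j\beta} \overline{\Phi_{i\alpha}} \, \langle i\beta | X | j\alpha \rangle \, \Phi_{j\beta} \\
&= \sum_{i\alpha j\beta\mu\nu} \overline{u_{i\mu}} \, \overline{v_{\alpha\mu}} \, u_{j\nu} v_{\beta\nu} \, \omega\bigl(A_i B_\beta B_\alpha^* A_j^*\bigr) \\
&= \sum_{\mu\nu} \omega\bigl(\tilde A_\mu \tilde A_\nu^* \tilde B_\nu^* \tilde B_\mu\bigr),
\end{aligned}
\]
with $\tilde A_\mu = \sum_i \overline{u_{i\mu}} A_i$ and $\tilde B_\mu = \sum_\alpha \overline{v_{\alpha\mu}} B_\alpha^*$.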
The ppt-property demands that all these expressions are positive, and conversely, positivity of all these expressions entails that ω is ppt. This lemma greatly helps to sort the big mess of indices which would otherwise clutter the proof of the following result. It contains as a special case the observation that the tensor product of ppt states is ppt, provided we consistently maintain the Alice/Bob distinction, which will be important for establishing the preservation of the ppt-property under general distillation protocols. In the standard case this is
an easy property of the partial transposition operation. Since this is not available in general, we have to give a separate proof based on our definition.

Lemma 3.6. Let $(\mathcal{A}_k, \mathcal{B}_k) \subset \mathcal{R}_k$ be a finite collection of bipartite systems, all contained in a common ambient algebra $\mathcal{R}$ such that all algebras $\mathcal{R}_k$ commute. Let $\mathcal{A}$ (resp. $\mathcal{B}$) denote the C*-algebra generated by all the $\mathcal{A}_k$ (resp. $\mathcal{B}_k$). Let $\omega$ be a state on $\mathcal{R}$, which is ppt for each subsystem, and which factorizes over the different $\mathcal{R}_k$. Then $\omega$ is ppt for $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$.

Proof. We show the ppt property in polarized form. Since $\mathcal{A}$ is generated by the commuting algebras $\mathcal{A}_k$, we can approximate each element by linear combinations of products $A = \prod_k A^{(k)}$. Since the polarized ppt-condition is continuous and linear in $A_i$, it suffices to prove it for choices $A_i = \prod_k A_i^{(k)}$, and similarly for $B_\alpha$. For such choices the factorization of $\omega$ implies that the $X$-matrix from the lemma is the tensor product of the matrices $X_k$ obtained for the subsystems. The partial transposition of the whole matrix is done factor by factor, and since all the $X_k^{T_2}$ are positive, so is their tensor product $X^{T_2}$.

We close this section by pointing out a mathematically more elegant way of expressing the ppt property. It employs the concept of the opposite algebra $\mathcal{A}^{\mathrm{op}}$ of a given $*$-algebra $\mathcal{A}$. The opposite algebra is the $*$-algebra formed by $\mathcal{A}$ with its original vector addition, scalar multiplication, and adjoint (and operator norm), but endowed with a new algebra product:
\[
A \bullet B = BA, \qquad A, B \in \mathcal{A},
\]
where on the right-hand side we read the original algebra product of $\mathcal{A}$. There is a linear, $*$-preserving, one-to-one, onto map $\theta : \mathcal{A}^{\mathrm{op}} \to \mathcal{A}$ given by $\theta(A) = A$, which is an anti-homomorphism (i.e., $\theta(A \bullet B) = \theta(B)\theta(A)$ for all $A, B \in \mathcal{A}^{\mathrm{op}}$). With its help one can define a linear, $*$-preserving map $\theta \odot \mathrm{id} : \mathcal{A}^{\mathrm{op}} \odot \mathcal{B} \to \mathcal{R}$ by
\[
(\theta \odot \mathrm{id})(A \odot B) = \theta(A) B,
\]
where we have distinguished the "algebraic tensor product" $\odot$, i.e., the tensor product as defined in linear algebra, from the ordinary tensor product "$\otimes$" of C*-algebras, which also contains norm limits of elements in $\mathcal{A} \odot \mathcal{B}$. By definition, $(\theta \odot \mathrm{id})$ has dense range, but is usually unbounded, and does not preserve positivity. Given any state $\omega$ on $\mathcal{R}$, it induces a linear functional $\omega_{\theta \odot \mathrm{id}} = \omega \circ (\theta \odot \mathrm{id})$ on $\mathcal{A}^{\mathrm{op}} \odot \mathcal{B}$. Then it is not difficult to check that the functional $\omega_{\theta \odot \mathrm{id}}$ is positive (i.e., $\omega_{\theta \odot \mathrm{id}}(C^* C) \geq 0$ for all $C \in \mathcal{A}^{\mathrm{op}} \odot \mathcal{B}$) if and only if $\omega$ is a ppt state. It would be interesting to study "mild failures" of the ppt condition, i.e., cases in which $\omega_{\theta \odot \mathrm{id}}$, although not positive, is a bounded linear functional, or maybe even a normal linear functional on $\mathcal{A}^{\mathrm{op}} \otimes \mathcal{B}$.
4. Relation to the Bell-CHSH Inequalities

Now we study the connection of the ppt-property to Bell inequalities in the CHSH form [11]. Again, we have to recall some terminology. A state $\omega$ on a bipartite system $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$ is said to satisfy the Bell-CHSH inequalities if
\[
|\omega(A(B' + B) + A'(B' - B))| \leq 2 \tag{4.1}
\]
holds for all hermitean $A, A' \in \mathcal{A}$ and $B, B' \in \mathcal{B}$ whose operator norm is bounded by 1. A quantitative measure of the failure of a state to satisfy the Bell-CHSH inequalities is given by the quantity
\[
\beta(\omega) = \sup_{A, A', B, B'} \omega(A(B' + B) + A'(B' - B)),
\]
where the supremum is taken over all admissible $A, A', B, B'$ as in (4.1). By Cirel'son's inequality [10],
\[
\beta(\omega) \leq 2\sqrt{2}. \tag{4.2}
\]
If equality holds here, we say that the bipartite state $\omega$ violates the Bell-CHSH inequalities maximally. The proof of the following result is adapted from the finite dimensional case [63].

Theorem 4.1. If a bipartite state is ppt, then it satisfies the Bell-CHSH inequalities.

Proof. The expression inside the modulus on the left-hand side of (4.1) is linear in each of the arguments $A, A' \in \mathcal{A}$ and $B, B' \in \mathcal{B}$. Hence we can search for the maximum of this expression taking each of these four variables as an extreme point of the admissible convex domain. The extreme points of the set of hermitean $X$ with $\|X\| \leq 1$ are those with $X^2 = \mathbb{1}$. Hence it is sufficient to show that the bound (4.1) holds for all hermitean arguments fulfilling $A^2 = A'^2 = B^2 = B'^2 = \mathbb{1}$. For such operators $A, A'$ and $B, B'$ we set, following [39], $C = A(B' + B) + A'(B' - B)$ and obtain
\[
|\omega(C)|^2 \leq \omega(C^2) = 4 + \omega([A, A'][B, B']), \tag{4.3}
\]
where $[X, Y] = XY - YX$ denotes the commutator. On the other hand, if we set $A_1 = A$, $A_2 = A'$, $B_1 = B' + B$, $B_2 = B' - B$, we get according to Lemma 3.4, since $\omega$ is ppt,
\[
|\omega(C)|^2 \leq \sum_{\alpha,\beta=1}^{2} \omega\bigl(A_\beta A_\alpha^* B_\alpha^* B_\beta\bigr) = 4 - \omega([A, A'][B, B']). \tag{4.4}
\]
Adding (4.3) and (4.4) yields $|\omega(C)|^2 \leq 4$, which is equivalent to (4.1).
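As an illustration of Theorem 4.1 in the simplest matrix setting, the following sketch evaluates the CHSH expression for two-qubit Werner states $\rho_p = p\,|\psi^-\rangle\langle\psi^-| + (1-p)\,\mathbb{1}/4$. The Werner family, the particular measurement settings (Pauli combinations that are optimal for the singlet) and all function names are assumptions made for this example and are not taken from the paper. The settings are fixed, so the value computed is a lower bound on $\beta(\omega)$ rather than the supremum itself.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

psi_minus = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
singlet = np.outer(psi_minus, psi_minus.conj())

def werner(p):
    """Mixture of the singlet (weight p) and the maximally mixed state."""
    return p * singlet + (1 - p) * np.eye(4, dtype=complex) / 4

def min_pt_eigenvalue(rho):
    """Smallest eigenvalue of the partial transpose on the second qubit."""
    pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
    return float(np.linalg.eigvalsh(pt).min())

# Hermitean settings with square equal to the identity, as in the proof.
A, A_prime = sz, sx
B = (sx - sz) / np.sqrt(2)
B_prime = -(sx + sz) / np.sqrt(2)

# C = A(B' + B) + A'(B' - B), realised on the tensor product.
chsh_operator = np.kron(A, B_prime + B) + np.kron(A_prime, B_prime - B)

def chsh_value(rho):
    """Expectation value of the CHSH operator in the state rho."""
    return float(np.real(np.trace(rho @ chsh_operator)))

for p in (0.3, 1.0):
    rho = werner(p)
    print(p, min_pt_eigenvalue(rho), chsh_value(rho))
```

In this sketch the state with $p = 0.3$ has a positive partial transpose and a CHSH value below 2, whereas the singlet ($p = 1$) has a negative partial transpose and reaches the Cirel'son bound $2\sqrt{2}$ of (4.2).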
5. Distillability for General Quantum Systems If entanglement is considered as a resource provided by some source of bipartite systems, it is natural to ask whether the particular states provided by the source can be used to achieve some tasks of quantum information processing, such as teleportation. Usually the pair systems provided by the source are not directly usable, so some form of preprocessing may be required. This upgrading of entanglement resources is known as distillation. The general picture here is that the source can be used several times, say N times. The allowed processing steps are local quantum operations, augmented by classical communication between the two labs holding the subsystems (“LOCC operations” [3, 34], see also [2]), usually personified by the two physicists operating the labs, called Alice and Bob. That is, the decision which operation is applied by Bob can be based on measuring results previously obtained by Alice and conversely. The aim is to obtain, after several rounds of operations, some bipartite quantum systems in a state which is nearly maximally entangled. The number of these systems may be much lower than N , whence the name “distillation”. The idea of distillation can be generalized to combinations of resources. For example, a bound entangled (i.e., not by itself distillable) state can sometimes be utilized to improve entanglement in another state [27]. The optimal rate of output particles per input particle is an important quantitative measure of entanglement in the state produced by the source. Distillation rates are very hard to compute because they involve an optimization over all distillation procedures, a set which is difficult to parameterize. A simpler question is to decide whether the rate is zero or positive. In the latter case the state is called distillable. 
In this paper we will look at two types of results on distillability, ensuring either success or failure: we will show that many states in quantum field theory are distillable, by using an especially simple kind of distillation protocol. States for which this works are also called 1-distillable (see below). On the other hand we will show that distillable states cannot be ppt. Note that this is a statement about all possible LOCC protocols, so we will need to define this class of operations more precisely in our general context. The desired implication will become stronger if we allow more operations as LOCC, so we should make only minimal technical assumptions about this class of operations. To begin with, LOCC operations are operations between different bipartite systems. So let $(\mathcal{A}_1, \mathcal{B}_1) \subset \mathcal{R}_1$ and $(\mathcal{A}_2, \mathcal{B}_2) \subset \mathcal{R}_2$ be bipartite systems. An operation localized on the Alice side will be a completely positive map $T : \mathcal{A}_1 \to \mathcal{A}_2$ with $T(\mathbb{1}) = \mathbb{1}$. Note that since we defined the operation in terms of observables, we are working in the Heisenberg picture, hence 1 labels the output system and 2 labels the input system. An operation also producing classical results is called an instrument in the terminology of Davies [12]. When there are only finitely many possible classical results, this is given by a collection $T_x$ of completely positive maps, labelled by the classical result $x$, such that $\sum_x T_x(\mathbb{1}) = \mathbb{1}$. Similarly, an operation depending on a
classical input $x$ is given by a collection of completely positive maps $S_x$ such that $S_x(\mathbb{1}) = \mathbb{1}$. Hence, whether the classical parameter $x$ is an input or an output is reflected only in the normalization conditions. A LOCC operation with information flow only from Alice to Bob is then given by a completely positive map $M : \mathcal{R}_1 \to \mathcal{R}_2$ such that
\[
M(AB) = \sum_x T_x(A) S_x(B), \tag{5.1}
\]
where the sum is finite, and for each $x$, $T_x : \mathcal{A}_1 \to \mathcal{A}_2$ and $S_x : \mathcal{B}_1 \to \mathcal{B}_2$ are completely positive with the normalization conditions specified above. This will be the first round of a LOCC protocol. In the next round, the flow of information is usually reversed, and all operations are allowed to depend on the classical parameter $x$ measured in the first round. Iterating this will lead to an expression similar to (5.1), with $x$ replaced by the accumulated classical information obtained in all rounds together. The normalization conditions will depend in a rather complicated way on the information parameters of each round. However, as is easily seen by induction, the overall normalization condition
\[
\sum_x T_x(\mathbb{1}) S_x(\mathbb{1}) = \mathbb{1} \tag{5.2}
\]
will also hold for the compound operation. Fortunately, we only need this simple condition. An operator $M$ of the form (5.1), with completely positive $T_x$, $S_x$, but with only the overall normalization condition (5.2), is called a separable superoperator, in analogy to the definition of separable states. More generally, we use this term also for limits of such operators $M_\alpha$, such that probabilities converge for all input states, and all output observables. By such limits we automatically also cover the case of continuous classical information parameters $x$, in which the sums are replaced by appropriate integrals. Then we can state the following implication:

Proposition 5.1. Let $M$ be a separable superoperator between bipartite systems $(\mathcal{A}_i, \mathcal{B}_i) \subset \mathcal{R}_i$ $(i = 1, 2)$, and let $\omega_2$ be ppt. Then the output state $\omega_1(X) = \omega_2(M(X))$ is also ppt. In particular, ppt states are not distillable with LOCC operations.

Proof. The ppt-preserving property is preserved under limits as described above, and also under sums, so it suffices to consider superoperators $M$ in which the sum (5.1) has only a single term, i.e., $\omega_1(AB) = \omega_2(T(A)S(B))$. Let $A_1, \ldots, A_k \in \mathcal{A}_1$ and $B_1, \ldots, B_k \in \mathcal{B}_1$. We have to show that $\sum_{\alpha,\beta} \omega_2\bigl(T(A_\beta A_\alpha^*) S(B_\alpha^* B_\beta)\bigr) \geq 0$. Now because $T$ is completely positive, the matrix $T(A_\beta A_\alpha^*)$ is positive in the algebra of $\mathcal{A}_2$-valued $(k \times k)$-matrices, and hence we can find elements $t_{n\alpha} \in \mathcal{A}_2$ (the matrix elements of the square root) such that
\[
T(A_\beta A_\alpha^*) = \sum_n (t_{n\beta})^* t_{n\alpha}. \tag{5.3}
\]
Of course, there is an analogous decomposition
\[
S(B_\beta^* B_\alpha) = \sum_m (s_{m\beta})^* s_{m\alpha}. \tag{5.4}
\]
Hence, observing the changed order of the indices $\alpha, \beta$ in the $S$-term:
\[
\sum_{\alpha,\beta} \omega_2\bigl(T(A_\beta A_\alpha^*) S(B_\alpha^* B_\beta)\bigr)
= \sum_{n,m} \sum_{\alpha,\beta} \omega_2\bigl((t_{n\beta})^* t_{n\alpha} (s_{m\alpha})^* s_{m\beta}\bigr), \tag{5.5}
\]
which is positive, because the input state $\omega_2$ was assumed to be ppt. For distillability we have to consider tensor powers of the given state and try to obtain a good approximation of a singlet state of two qubits by some LOCC operation. However, since the final state is clearly not ppt, and the input tensor power is ppt by Lemma 3.6, the statement just proved shows that this is impossible.

For positive distillability results it is helpful to reduce the vast complexity of all LOCC operations, applied to arbitrary tensor powers, and to look for specific simple protocols for the case at hand. Since we are not concerned with rates, but only with the yes/no question of distillability, some major simplifications are possible. The first simplification is to restrict the kind of classical communication. Suppose that the local operations are such that every time they also produce a classical signal "operation successful" or "operation failed". Then we can agree to use only those pairs in which the operation was successful on both sides. In all other cases we just try again. Note that this requires two-way classical communication, since Alice and Bob both have to give their ok for including a particular pair in the ensemble. However, in the simplest case no further communication between Alice and Bob is used. To state this slightly more formally, let $T$ denote the distillation operation in such a step, written in the Heisenberg picture. This is a selective operation in the sense that $T(\mathbb{1}) \leq \mathbb{1}$, and $\omega(T(\mathbb{1}))$ is the probability for successfully obtaining a pair. Then by the law of large numbers we can build from this a sequence of non-selective distillation operations on many such pairs, which produce systems in the state
\[
\omega[T](A) = \frac{\omega(T(A))}{\omega(T(\mathbb{1}))} \tag{5.6}
\]
with rate close to the probability $\omega(T(\mathbb{1}))$. If we are only interested in the yes/no question of distillability and not in the rate, then obviously selective operations are just as good as non-selective ones.

Moreover, it is sufficient for distillability that $\omega[T]$ be distillable for some such $T$. It is also convenient to restrict the type of output systems: it suffices to produce a pair of qubits (2-level systems) in a distillable state, because from a sufficient number of such pairs any entangled state can be generated by LOCC operations. Any target state which has non-positive partial transpose will do, because for qubits ppt and non-distillability are equivalent. Finally, we look at situations where the criterion can be applied without going to higher tensor powers.
In the simplest case only one pair prepared in the original state $\omega$ is needed to obtain a distillable qubit pair with positive probability.

Definition 5.2. A state $\omega$ on a bipartite system $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$ is called 1-distillable, if there are completely positive maps $T : \mathcal{B}(\mathbb{C}^2) \to \mathcal{A}$ and $S : \mathcal{B}(\mathbb{C}^2) \to \mathcal{B}$ such that the functional $\omega_2(X \otimes Y) = \omega(T(X)S(Y))$, $X \otimes Y \in \mathcal{B}(\mathbb{C}^2 \otimes \mathbb{C}^2)$, on the two-qubit system is not ppt.

Then according to the discussion just given, 1-distillable states are distillable. If the maps $T$, $S$ are normalized such that $\|T(\mathbb{1})\| = \|S(\mathbb{1})\| = 1$, and $\omega_2$ is close to a multiple of a singlet state, a rough estimate of the distillation rate achievable from $\omega$ is the normalization constant $\omega_2(\mathbb{1})$. In the field theoretical applications below this rate will be very small. Note that specifying a completely positive map $T : \mathcal{B}(\mathbb{C}^2) \to \mathcal{A}$ is equivalent to specifying the four elements $T_{k\ell} = T(|k\rangle\langle \ell|) \in \mathcal{A}$ or, in other words, an $\mathcal{A}$-valued $(2 \times 2)$-matrix, called the Choi matrix of $T$. It turns out that $T$ is completely positive if and only if the Choi matrix is positive in the algebra of such matrices (isomorphic to $\mathcal{A} \otimes \mathcal{B}(\mathbb{C}^2)$). This allows a partial converse of the implication "distillable ⇒ npt":

Lemma 5.3. Let $\omega$ be a state on a bipartite system $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$, and suppose that the ppt condition in Definition 3.1 fails already for $k = 2$. Then $\omega$ is 1-distillable in the sense of Definition 5.2.

Proof. Let $A_1, A_2, B_1, B_2$ be as in Definition 3.1. Then we can take the matrix $A_\alpha A_\beta^*$ as the Choi matrix of $T$, i.e., with a similar definition for $S$:
\[
T(M) = \sum_{\alpha,\beta} A_\beta \langle \beta | M | \alpha \rangle A_\alpha^*,
\qquad
S(N) = \sum_{\alpha',\beta'} B_{\alpha'}^* \langle \alpha' | N | \beta' \rangle B_{\beta'} .
\]
Inserting this into Definition 5.2, we find
\[
\omega_2(Z) = \sum_{\alpha,\beta,\alpha',\beta'} \langle \beta\alpha' | Z | \alpha\beta' \rangle \, \omega\bigl(A_\beta A_\alpha^* B_{\alpha'}^* B_{\beta'}\bigr),
\qquad Z \in \mathcal{B}(\mathbb{C}^2 \otimes \mathbb{C}^2). \tag{5.7}
\]
In particular, when $Z$ is equal to the transposition operator $Z|\alpha\beta'\rangle = |\beta'\alpha\rangle$, this expectation is equal to the sum in Definition 3.1, hence negative by assumption. On the other hand, $Z$ has a positive partial transpose (proportional to the projection onto a maximally entangled vector), hence the partial transpose of $\omega_2$ cannot be positive, i.e., $\omega_2$ is not ppt.

6. The Reeh–Schlieder Property

In this section we will establish a criterion for 1-distillability which will be useful in quantum field theoretical applications. We prove it in an abstract form, which for the time being makes no use of spacetime structure. We will assume that all observable algebras are given as operator algebras, i.e., we look at bipartite systems
of the kind $(\mathcal{A}, \mathcal{B}) \subset \mathcal{B}(\mathcal{H})$. This is no restriction of generality, since every C*-algebra (here the ambient algebra of the bipartite system) may be isomorphically realized as an algebra of operators. The non-trivial information contained in any such representation is about a special class of states, namely the normal ones (see (2.1)). Any state of a C*-algebra becomes normal in a suitable representation, so the choice of representation is mainly the choice of a class of states of interest. In particular, we have the vector states on $\mathcal{B}(\mathcal{H})$, which are states of the form
\[
\omega_\psi(R) = \langle \psi | R | \psi \rangle, \qquad R \in \mathcal{B}(\mathcal{H}), \tag{6.1}
\]
with $\psi \in \mathcal{H}$ a unit vector. Again, this is not a loss of generality, since every bipartite system can be written in this way, by forming the GNS-representation [4] of the ambient algebra.^b However, in this language the key condition of this section is more easily stated. It has two formulations: one emphasizing the operational content from the physical point of view, and one which is somewhat simpler mathematically. We state their equivalence in the following lemma (whose proof is entirely trivial).

Lemma 6.1. Let $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ be a C*-algebra, and $\psi \in \mathcal{H}$ a unit vector. Then the following are equivalent:

(1) $\psi$ has the Reeh–Schlieder property with respect to $\mathcal{A}$, i.e., for each unit vector $\chi \in \mathcal{H}$ and each $\varepsilon > 0$, there is some $A \in \mathcal{A}$, so that $|\omega_\chi(R) - \omega_\psi(A^* R A)/\omega_\psi(A^* A)| < \varepsilon \|R\|$ holds for all $R \in \mathcal{B}(\mathcal{H})$.

(2) $\psi$ is cyclic for $\mathcal{A}$, i.e., the set $\mathcal{A}\psi = \{A\psi : A \in \mathcal{A}\}$ is dense in $\mathcal{H}$.

We also remark that a vector $\psi$ in $\mathcal{H}$ is called separating for $\mathcal{A}$ if for each $A \in \mathcal{A}$, the relation $A\psi = 0$ implies that $A = 0$. It is a standard result in the theory of operator algebras that $\psi$ is cyclic for a von Neumann algebra $\mathcal{A}$ if and only if $\psi$ is separating for its commutant $\mathcal{A}'$ (see, e.g., [4]). Note that $\mathcal{A}$, a subset of $\mathcal{B}(\mathcal{H})$, is a von Neumann algebra if it coincides with its bicommutant $\mathcal{A}''$, where for $\mathcal{B} \subset \mathcal{B}(\mathcal{H})$, its commutant is the von Neumann algebra $\mathcal{B}' = \{R \in \mathcal{B}(\mathcal{H}) : RB = BR \ \forall B \in \mathcal{B}\}$.

The physical meaning of the Reeh–Schlieder property is that any vector state on $\mathcal{B}(\mathcal{H})$ can be obtained from $\omega_\psi$ by selecting according to the results of a measurement on the subsystem $\mathcal{A}$. Let us denote by $A_1$ a multiple of the $A$ from the lemma, normalized so that $\|A_1\| \leq 1$, and set $A_0 = (\mathbb{1} - A_1^* A_1)^{1/2}$. Then the operation elements $T_i(R) = A_i^* R A_i$ $(i = 0, 1)$ together define an instrument. The operation without selecting according to results is $T(R) = T_0(R) + T_1(R)$.
This instrument is localized in $\mathcal{A}$ in the sense that $T_i(\mathcal{A}) \subset \mathcal{A}$, and that for any $B$ commuting with $\mathcal{A}$, in particular for all observables of the second subsystem of a bipartite system, we get $T(B) = B$. That is, no effect of the operation is felt for observables outside the subsystem $\mathcal{A}$. Of course, $T_i(B) \neq B$, but this only expresses the state change by selection in the presence of correlations. The state appearing in the Reeh–Schlieder property is just a selected state, obtained by running the instrument on systems prepared according to $\omega_\psi$, and keeping only the systems with a 1-response. By taking convex combinations of operations, one can easily see that also every convex combination of vector states, and hence any normal state can be approximately obtained from $\omega_\psi$.

^b For a state $\omega$ on a C*-algebra $\mathcal{R}$, there is always a triple $(\pi_\omega, \mathcal{H}_\omega, \Omega_\omega)$ where: (1) $\pi_\omega$ is a $*$-preserving representation of $\mathcal{R}$ by bounded linear operators on the Hilbert space $\mathcal{H}_\omega$. (2) $\Omega_\omega$ is a unit vector in $\mathcal{H}_\omega$ so that $\pi_\omega(\mathcal{R})\Omega_\omega$ is dense in $\mathcal{H}_\omega$. (3) $\omega(R) = \langle \Omega_\omega | \pi_\omega(R) | \Omega_\omega \rangle$ for all $R \in \mathcal{R}$. $(\pi_\omega, \mathcal{H}_\omega, \Omega_\omega)$ is called the GNS-representation of $\omega$; see, e.g., [4] for its construction.

Our next result connects these properties with distillability.

Theorem 6.2. Let $(\mathcal{A}, \mathcal{B}) \subset \mathcal{B}(\mathcal{H})$ be a bipartite system, with both $\mathcal{A}$, $\mathcal{B}$ non-abelian. Suppose $\psi \in \mathcal{H}$ is a unit vector which has the Reeh–Schlieder property with respect to $\mathcal{A}$. Then $\omega_\psi$ is 1-distillable.

The proof of this statement takes up ideas of Landau, and utilizes [56, Lemma 5.5]. To keep our paper self-contained, we nevertheless give a full proof here.

Proof. Step 1: We first treat the special case in which $\mathcal{A}$ and $\mathcal{B}$ are von Neumann algebras, i.e., algebras that are also closed in the weak operator topology. Then a theorem due to Takesaki [57] asserts that there are non-vanishing *-homomorphisms $\tau : \mathcal{B}(\mathbb{C}^2) \to \mathcal{A}$ and $\sigma : \mathcal{B}(\mathbb{C}^2) \to \mathcal{B}$, which may, however, fail to preserve the identity. Consider the map $\pi : \mathcal{B}(\mathbb{C}^2) \otimes \mathcal{B}(\mathbb{C}^2) \to \mathcal{B}(\mathcal{H})$, given by
\[
\pi(X \otimes Y) = \tau(X)\sigma(Y). \tag{6.2}
\]
One easily checks that, because the ranges of $\tau$ and $\sigma$ commute, $\pi$ is a *-homomorphism. But as a C*-algebra $\mathcal{B}(\mathbb{C}^2) \otimes \mathcal{B}(\mathbb{C}^2) \cong \mathcal{B}(\mathbb{C}^4)$ is a full matrix algebra. Since this has no ideals, $\pi$ is either an isomorphism or zero.

Step 2: We have to show that $\tau$ can be chosen so that $\pi$ is non-zero. In many situations of interest this would follow automatically because both $\tau(\mathbb{1})$ and $\sigma(\mathbb{1})$ are non-zero: often $\mathcal{A}$ and $\mathcal{B}$ also have the so-called Schlieder property [51] (an independence property [56]), which means that $A \in \mathcal{A}$, $B \in \mathcal{B}$, $A, B \neq 0$ imply $AB \neq 0$. (There seems to be an oversight in [56, Lemma 5.5] concerning this assumption.) However, we do not assume this property, and instead rely once again on the Reeh–Schlieder property of $\mathcal{A}$. Let us take $\sigma$ as guaranteed by Takesaki's Theorem, and set $p = \sigma(\mathbb{1})$. Then if $p\mathcal{A}p$ is non-abelian, we can apply Takesaki's result to this algebra, and find a homomorphism $\tau$ with $\pi(\mathbb{1}) = \tau(\mathbb{1})p = \tau(\mathbb{1}) \neq 0$. So we only need to exclude the possibility that $p\mathcal{A}p$ is abelian. In other words, we have to exclude the possibility that in some Hilbert space $\mathcal{H}_{(p)} \equiv p\mathcal{H}$ there is an abelian von Neumann algebra $\mathcal{A}_{(p)} \equiv p\mathcal{A}p$ with a cyclic vector $\psi_{(p)} = p\psi$, so that $\mathcal{A}_{(p)}$ commutes with a copy $\mathcal{B}_{(p)} \equiv \sigma(\mathcal{B}(\mathbb{C}^2))$ of the $(2 \times 2)$-matrices. The latter property entails that $(\mathcal{A}_{(p)})'$ is non-abelian.

We will exclude this possibility by adopting it as a hypothesis and showing that this leads to a contradiction. Let $q$ denote the projection onto the subspace of $\mathcal{H}_{(p)}$ generated by $(\mathcal{A}_{(p)})'\psi_{(p)}$. This projection is contained in $(\mathcal{A}_{(p)})'' = \mathcal{A}_{(p)}$. Let $\mathcal{H}_{(qp)} = q\mathcal{H}_{(p)}$, then $\psi_{(qp)} = q\psi_{(p)} = \psi_{(p)} \in \mathcal{H}_{(qp)}$ is both a cyclic and separating vector for the von Neumann algebra $(\mathcal{A}_{(p)})'_{(q)} = q(\mathcal{A}_{(p)})'q$, and since $\psi_{(p)}$ is separating for $(\mathcal{A}_{(p)})'$ (owing to the assumed cyclicity of $\psi_{(p)}$ for $\mathcal{A}_{(p)}$), $(\mathcal{A}_{(p)})'_{(q)}$ is non-abelian since so is $(\mathcal{A}_{(p)})'$ by hypothesis. On the other hand, abelianness of $\mathcal{A}_{(p)}$ entails that $\mathcal{A}_{(qp)} = (\mathcal{A}_{(p)})_{(q)} = q(\mathcal{A}_{(p)})q$ is a von Neumann algebra in $\mathcal{B}(\mathcal{H}_{(qp)})$ for which $(\mathcal{A}_{(p)})'_{(q)} = (\mathcal{A}_{(qp)})'$, where the second commutant is taken in $\mathcal{B}(\mathcal{H}_{(qp)})$. Clearly, $\mathcal{A}_{(qp)}$ is again abelian. However, since $\psi_{(qp)}$ is cyclic and separating for $(\mathcal{A}_{(p)})'_{(q)} = (\mathcal{A}_{(qp)})'$, it follows by the Tomita–Takesaki theorem [4] that $\mathcal{A}_{(qp)}$ is anti-linearly isomorphic to $(\mathcal{A}_{(qp)})'$, which is a contradiction in view of the abelianness of $\mathcal{A}_{(qp)}$ and non-abelianness of $(\mathcal{A}_{(qp)})'$. To summarize, we have shown that with a suitable $\tau$, the representation $\pi$ in (6.2) is an isomorphism.

Step 3: Now consider the singlet vector $\Omega = (|{+}{-}\rangle - |{-}{+}\rangle)/\sqrt{2} \in \mathbb{C}^2 \otimes \mathbb{C}^2$. Since $\pi$ has trivial kernel, the projection $Q = \pi(|\Omega\rangle\langle\Omega|)$ is non-zero, and hence there is a unit vector $\chi$ in the range of this projection. Obviously,
\[
\omega_\chi(\pi(Z)) = \omega_\chi(Q\pi(Z)Q) = \omega_\chi\bigl(\pi(|\Omega\rangle\langle\Omega| Z |\Omega\rangle\langle\Omega|)\bigr) = \langle\Omega|Z|\Omega\rangle \, \omega_\chi(Q) = \langle\Omega|Z|\Omega\rangle
\]
holds for all $Z \in \mathcal{B}(\mathbb{C}^2 \otimes \mathbb{C}^2)$. Now we introduce the distillation maps $T$, $S$ of Definition 5.2. On Bob's side $S(Y) = \sigma(Y)$ is good enough. For Alice we take $T(X) = A^* \tau(X) A$, where $A \in \mathcal{A}$ is the operator from the Reeh–Schlieder property for some small $\varepsilon > 0$. The functional distilled from this is
\[
\omega_2(X \otimes Y) = \omega_\psi(T(X)S(Y)) = \omega_\psi(A^* \tau(X) A \sigma(Y)) = \omega_\psi(A^* \tau(X)\sigma(Y) A) = \omega_\psi(A^* \pi(X \otimes Y) A).
\]
Now the Reeh–Schlieder property, applied to the operator $R = \pi(Z) \in \mathcal{B}(\mathcal{H})$, asserts that $\omega_2(Z)/\omega_2(\mathbb{1})$ is close to $\omega_\chi(\pi(Z)) = \langle\Omega|Z|\Omega\rangle$, $Z \in \mathcal{B}(\mathbb{C}^2 \otimes \mathbb{C}^2)$. Hence, up to normalization, $\omega_2$ is close to a singlet state, and therefore is not ppt. This proves the theorem in the case that $\mathcal{A}$ and $\mathcal{B}$ are von Neumann algebras.
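Step 4: When the C*-algebras $\mathcal{A}$, $\mathcal{B}$ satisfy the assumptions of the theorem, so do their weak closures, the von Neumann algebras $\mathcal{A}''$, $\mathcal{B}''$: since $\mathcal{A} \subset \mathcal{A}''$ these algebras are both non-abelian, and by taking commutants of the inclusion $\mathcal{B} \subset \mathcal{A}'$, we get the commutation property $\mathcal{A}'' \subset \mathcal{B}' = (\mathcal{B}'')'$ of the von Neumann algebras. Of course, if $\mathcal{A}\psi$ is dense in $\mathcal{H}$, so is the larger set $\mathcal{A}''\psi$. Now let $T : \mathcal{B}(\mathbb{C}^2) \to \mathcal{A}''$ and $S : \mathcal{B}(\mathbb{C}^2) \to \mathcal{B}''$ be the distillation maps, whose existence we have just proved. We have to find maps $\tilde T : \mathcal{B}(\mathbb{C}^2) \to \mathcal{A}$ and $\tilde S : \mathcal{B}(\mathbb{C}^2) \to \mathcal{B}$ with smaller ranges, which do nearly as well. This is the content of the following:

Lemma 6.3. Let $\mathcal{A} \subset \mathcal{B}(\mathcal{H})$ be a C*-algebra, and let $k \in \mathbb{N}$. Consider a completely positive map $T : \mathcal{B}(\mathbb{C}^k) \to \mathcal{A}''$. Then for any finite collection of vectors $\phi_1, \ldots, \phi_n$, and $\varepsilon > 0$ we can find a completely positive map $\tilde T : \mathcal{B}(\mathbb{C}^k) \to \mathcal{A}$ such that, for all $X \in \mathcal{B}(\mathbb{C}^k)$, and all $j$, we have $\|(T(X) - \tilde T(X))\phi_j\| \leq \varepsilon \|X\|$.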
Obviously, with such approximations (for just the single vector $\phi_1 = \psi$), we get a distilled state $\omega_2$ arbitrarily close to what we could get from the distillation in the von Neumann algebra setting. This concludes the proof of the theorem, apart from the proof of the lemma.

Proof of the Lemma. Note that the version of the lemma with $k = 1$ just states that the positive cone of $\mathcal{A}$ is strongly dense in the positive cone of $\mathcal{A}''$, which is a direct consequence of Kaplansky's Density Theorem [57, Theorem 4.8]. We will reduce the general case to this by parameterizing all completely positive maps $T_i : \mathcal{B}(\mathbb{C}^k) \to \mathcal{A}''$ by their Choi matrices
\[
t_i = \sum_{\alpha,\beta=1}^{k} T_i(|\alpha\rangle\langle\beta|) \otimes |\alpha\rangle\langle\beta| \in \mathcal{A}'' \otimes \mathcal{B}(\mathbb{C}^k), \tag{6.3}
\]
where "subscript $i$" equals "tilde" or "no tilde". Note that $\mathcal{A}'' \otimes \mathcal{B}(\mathbb{C}^k)$ is the von Neumann algebra closure of $\mathcal{A} \otimes \mathcal{B}(\mathbb{C}^k)$, so via Kaplansky's Density Theorem we obtain, for the given positive element $t \in \mathcal{A}'' \otimes \mathcal{B}(\mathbb{C}^k)$, and any finite collection of vectors in $\mathcal{H} \otimes \mathbb{C}^k$, a positive approximant $\tilde t \in \mathcal{A} \otimes \mathcal{B}(\mathbb{C}^k)$. As the collection of vectors we take the given $\phi_j$, tensored with the basis vectors of $\mathbb{C}^k$, which implies the desired approximation for all $X$ which are matrix units $|\alpha\rangle\langle\beta|$. However, because $k$ is finite, and all norms are equivalent on a finite dimensional vector space, we can achieve a bound as required in the lemma.

We remarked in the beginning of this section that assuming the given bipartite state to be a vector state in some representation is not a restriction of generality. Therefore there should be a version of the theorem which does not require a representation. Indeed, we can go to the GNS-representation of the algebra generated by $\mathcal{A}$ and $\mathcal{B}$ in the given state, and just restate the conditions of the theorem as statements about expectations in the given state. This leads to the following:

Corollary 6.4. Let $\omega$ be a state on a bipartite system $(\mathcal{A}, \mathcal{B}) \subset \mathcal{R}$, and suppose that

(1) For some $A \in \mathcal{A}$ and $B_1, \ldots, B_4 \in \mathcal{B}$, $\omega(A B_1 [B_2, B_3] B_4) \neq 0$, and a similar condition holds with $\mathcal{A}$ and $\mathcal{B}$ interchanged.

(2) For all $B \in \mathcal{B}$ and $\varepsilon > 0$ there is an $A \in \mathcal{A}$ such that $\omega((A - B)^*(A - B)) \leq \varepsilon$.

Then $\omega$ is 1-distillable.

This opens an interesting connection with the theory of maximally entangled states on bipartite systems. These are generalizations of the EPR state, and have the property that for every (projection valued) measurement on Alice's side there is a "double" on Bob's side such that if the two are measured together the results agree with probability one [35]. The equation which has to be satisfied by Alice's
observable A, and the double B looks very much like condition (2) in the corollary with ε = 0, except that in addition one requires ω((A − B)(A − B)∗ ) = 0. Before going to the context of quantum field theory, let us summarize the implications we have established for a state ω on a general bipartite system (A, B) ⊂ R:
[Figure 1: diagram relating the properties "Reeh–Schlieder property", "1-distillable", "distillable", "violates Bell inequalities", "not ppt" and "entangled".]

Fig. 1. Implications valid for any bipartite state.
7. Distillability in Quantum Field Theory

The generic occurrence of distillable states in quantum field theory can by Theorem 6.2 be deduced from the fact that the Reeh–Schlieder property, and non-abelianness, are generic features of von Neumann algebras $\mathcal{A}$ and $\mathcal{B}$ describing observables localized in spacelike disjoint regions $O_A$ and $O_B$ in relativistic quantum field theory. To see this more precisely, we have to provide a brief description of the basic elements of quantum field theory in the operator algebraic framework. The reader is referred to the book by Haag [24] for more details and discussion.

The starting point in the operator algebraic approach to quantum field theory is that each system is described in terms of a so-called "net of local observable algebras" $\{\mathcal{A}(O)\}_{O \subset \mathbb{R}^4}$. This is a family of C*-algebras indexed by the open, bounded regions $O$ in $\mathbb{R}^4$, the latter being identified with Minkowski spacetime. In other words, to each open bounded region $O$ in Minkowski spacetime one assigns a C*-algebra $\mathcal{A}(O)$, and it is required that the following assumptions hold:

(I) isotony: $O_1 \subset O_2 \Rightarrow \mathcal{A}(O_1) \subset \mathcal{A}(O_2)$;

(II) locality: if the region $O$ is spacelike to the region $O'$, then $AA' = A'A$ for all $A \in \mathcal{A}(O)$ and all $A' \in \mathcal{A}(O')$.

The isotony assumption implies that there is a smallest C*-algebra containing all the $\mathcal{A}(O)$; this will be denoted by $\mathcal{A}(\mathbb{R}^4)$. It is also assumed that there exists a unit element $\mathbb{1}$ in $\mathcal{A}(\mathbb{R}^4)$ which is contained in all the local algebras $\mathcal{A}(O)$. As suggested by the assumptions (I) and (II), the hermitean elements in $\mathcal{A}(O)$ should be viewed as the observables of the quantum system which can be measured at times and locations within the spacetime region $O$. The locality (or microcausality) assumption then says that there are no uncertainty relations between measurements carried out at spacetime events that are spacelike with respect to each other, or that the corresponding observables are "jointly measurable".
In this way, the relativistic requirement of finite propagation speed of all effects is built into the description of
a system. (See also [9] for a very recent discussion of locality aspects in quantum field theory.) Nevertheless, there is usually in quantum field theory an abundance of states which are “non-local” in the sense that there are correlations between measurements carried out in spacelike separated regions on these states which are of quantum nature, i.e., there is entanglement over spacelike separations for such states. Given a state $\omega$ on $\mathcal{A}(\mathbb{R}^4)$, one can associate with it a net of “local von Neumann algebras” $\{\mathcal{R}_\omega(O)\}_{O \subset \mathbb{R}^4}$ in the GNS-representation by setting $\mathcal{R}_\omega(O) = \pi_\omega(\mathcal{A}(O))''$, where $(\pi_\omega, \mathcal{H}_\omega, \Omega_\omega)$ is the GNS-representation of $\omega$ (cf. footnote in Sec. 6). On the right-hand side we read the von Neumann algebra generated by the set of operators $\pi_\omega(\mathcal{A}(O)) \subset B(\mathcal{H}_\omega)$. At this point we ought to address a point which often causes confusion. Although in the GNS-representation the state $\omega$ is given by a vector state, it need not hold that $\omega$ is a pure state for the simple reason that $\mathcal{R}_\omega(\mathbb{R}^4)$ need not coincide with $B(\mathcal{H}_\omega)$, and in that case $\omega$ corresponds to the vector state $\langle \Omega_\omega | \cdot | \Omega_\omega \rangle$ restricted to $\mathcal{R}_\omega(\mathbb{R}^4)$. However, restrictions of vector states onto proper subalgebras of $B(\mathcal{H}_\omega)$ are in general mixed states. It is very convenient to distinguish certain states by properties of their GNS-representations. We call a state covariant if there exists a (strongly continuous) unitary group $\{U_\omega(a)\}_{a \in \mathbb{R}^4}$ which in the GNS-representation acts like the translation group: $U_\omega(a)\mathcal{R}_\omega(O)U_\omega(a)^{-1} = \mathcal{R}_\omega(O + a)$, for all $a \in \mathbb{R}^4$ and all bounded open regions $O$. Among the class of covariant states there are two particularly important subclasses:
Vacuum states: $\omega$ is called a vacuum state if $U_\omega(a)\Omega_\omega = \Omega_\omega$ (the state is translation-invariant) and the joint spectrum of the selfadjoint generators $P_\mu$, $\mu = 0, 1, 2, 3$, of $U_\omega(a) = e^{i \sum_\mu a^\mu P_\mu}$ is contained in the closed forward lightcone $\overline{V}_+ = \{x = (x^\mu) \in \mathbb{R}^4 : x^0 \ge 0,\ (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 \ge 0\}$.
In other words, the energy is positive in any inertial Lorentz frame.
Thermal equilibrium states: $\omega$ is called a thermal equilibrium state at inverse temperature $\beta > 0$ (corresponding to the temperature $T = 1/k\beta$ where $k$ denotes Boltzmann’s constant) if there exists a time-like unit vector $e \in \mathbb{R}^4$, playing the role of a distinguished time axis, so that $U_\omega(t \cdot e)\Omega_\omega = \Omega_\omega$ and
$$\langle \Omega_\omega | A e^{-\beta H_\beta} B | \Omega_\omega \rangle = \langle \Omega_\omega | BA | \Omega_\omega \rangle \qquad (7.4)$$
holds for (a suitable dense subset of) $A, B \in \mathcal{R}_\omega(\mathbb{R}^4)$, where the selfadjoint operator $H_\beta$ is the generator of the time-translations in the time-direction determined by $e$, i.e. $U_\omega(t \cdot e) = e^{itH_\beta}$, $t \in \mathbb{R}$.
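For orientation, condition (7.4) can be compared with the familiar Gibbs state of a finite quantum system. The following is a sketch under assumptions that are not part of the text above: a finite-dimensional Hilbert space with Hamiltonian $H$, the partition function $Z = \mathrm{Tr}\, e^{-\beta H}$, the state $\omega_\beta(A) = Z^{-1}\mathrm{Tr}(e^{-\beta H}A)$, and the formal identification of $e^{-\beta H_\beta} B\, \Omega_\omega$ with the vector obtained by applying $e^{-\beta H} B e^{\beta H}$ to $\Omega_\omega$ (which uses $H_\beta \Omega_\omega = 0$). With these conventions, (7.4) reduces to cyclicity of the trace:

```latex
\begin{align*}
\omega_\beta\!\left(A\, e^{-\beta H} B\, e^{\beta H}\right)
  &= Z^{-1}\,\mathrm{Tr}\!\left(e^{-\beta H} A\, e^{-\beta H} B\, e^{\beta H}\right) \\
  &= Z^{-1}\,\mathrm{Tr}\!\left(A\, e^{-\beta H} B\right)
     && \text{(cyclicity; } e^{\beta H} e^{-\beta H} = \mathbf{1}\text{)} \\
  &= Z^{-1}\,\mathrm{Tr}\!\left(e^{-\beta H} B A\right)
     && \text{(cyclicity)} \\
  &= \omega_\beta(BA).
\end{align*}
```

In infinite systems neither $Z$ nor the trace is available, which is one reason why the condition is formulated in the GNS-representation instead.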
We should note that (7.4) is a slightly sloppy way of expressing the condition of thermal equilibrium at inverse temperature $\beta$ which in a mathematically more precise form would be given in terms of the so-called “KMS boundary condition” that refers to analyticity conditions of the functions $t \mapsto \langle \Omega_\omega | A U_\omega(t \cdot e) B | \Omega_\omega \rangle$ (see any of the references [5, 19, 24, 25, 52] for a precise statement of the KMS boundary condition). That way of characterizing thermal equilibrium states has the advantage of circumventing the difficulty that $e^{-\beta H_\beta}$ will usually be unbounded since the “thermal Hamiltonian” $H_\beta$ in the GNS-representation of a thermal state has a symmetric spectrum (much in contrast to the Hamiltonians in a vacuum-state representation). We will not enter into further details here and refer the reader to [5, 19, 52] for discussion of these matters. There is, however, a point which is worth focusing attention on. The condition of thermal equilibrium makes reference to a single direction of time, and it is known that if a state is a thermal equilibrium state with respect to a certain time axis $e$, then in general it will not be a thermal equilibrium state (at any inverse temperature) with respect to another time-direction $e'$ [41, 43]. Nevertheless, it has been shown by J. Bros and D. Buchholz that in a relativistic quantum field theory, the correlation functions of a thermal equilibrium state $\omega$ (with respect to an arbitrarily given time-direction) possess, under very general conditions, a certain analyticity property which is Lorentz-covariant, and stronger than the thermal equilibrium condition with respect to the given time-direction itself [6]. This analyticity condition is called “relativistic KMS-condition”.
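The remark on the symmetric spectrum of the thermal Hamiltonian can be made concrete in a finite-dimensional toy model. The sketch below assumes the standard construction in which the GNS Hamiltonian of a Gibbs state is the operator $H \otimes 1 - 1 \otimes H$, whose eigenvalues are the differences $E_i - E_j$ of the energy levels; this construction, the energy values and the function names are assumptions made for illustration and are not taken from the text.

```python
import math

def liouvillean_spectrum(energies):
    """Eigenvalues E_i - E_j of H (x) 1 - 1 (x) H for a diagonal H."""
    return sorted(ei - ej for ei in energies for ej in energies)

def is_symmetric(spectrum, tol=1e-12):
    """True if the sorted list of eigenvalues is invariant under x -> -x."""
    mirrored = sorted(-x for x in spectrum)
    return all(abs(a - b) < tol for a, b in zip(spectrum, mirrored))

energies = [0.0, 1.0, 2.5]   # hypothetical energy levels of H, all >= 0
beta = 2.0                   # hypothetical inverse temperature

spec_L = liouvillean_spectrum(energies)

# For H >= 0 the Boltzmann factors exp(-beta * E) never exceed 1 ...
boltzmann_H = [math.exp(-beta * e) for e in energies]
# ... whereas the negative part of the symmetric spectrum yields factors above 1.
boltzmann_L = [math.exp(-beta * x) for x in spec_L]
```

With a positive Hamiltonian, the operator $e^{-\beta H}$ is bounded by 1. With a symmetric spectrum, $e^{-\beta H_\beta}$ grows with the largest energy difference, and so becomes unbounded once the spectrum is unbounded, as is the case in infinite systems.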
Let us state the relativistic KMS condition of [6] in precise terms (mainly for the sake of completeness; we will not make use of it in the following): A state $\omega$ on $\mathcal{A}(\mathbb{R}^4)$ is said to fulfill the relativistic KMS condition at inverse temperature $\beta > 0$ if $\omega$ is covariant and if there exists a timelike vector $e$ in $V_+$ (the open interior of $\overline{V}_+$) having unit Minkowskian length, so that for each pair of operators $A, B \in \pi_\omega(\mathcal{A}(\mathbb{R}^4))$ there is a function $F = F_{AB}$ which is analytic in the domain $T_{\beta e} = \{z \in \mathbb{C}^4 : \mathrm{Im}\, z \in V_+ \cap (\beta e - V_+)\}$, and continuous at the boundary sets determined by $\mathrm{Im}\, z = 0$, $\mathrm{Im}\, z = \beta e$ with the boundary values $F(x) = \langle \Omega_\omega | A U_\omega(x) B | \Omega_\omega \rangle$, $F(x + i\beta e) = \langle \Omega_\omega | B U_\omega(-x) A | \Omega_\omega \rangle$ for $x \in \mathbb{R}^4$. We will give an indication of the nature of those general conditions leading to the relativistic KMS-condition since that gives us the opportunity of also introducing the lacking bits of terminology for eventually formulating our result. Let us start with a vacuum state $\omega = \omega_{vac}$, and denote the corresponding GNS-representation by $(\pi_{vac}, \mathcal{H}_{vac}, \Omega_{vac})$ and the local von Neumann algebras in the vacuum representation by $\mathcal{R}_{vac}(O)$. When one deals with quantum fields $\phi$ of the Wightman type, then $\mathcal{R}_{vac}(O)$ is generated by quantum field operators $\phi(f)$ smeared with test-functions $f$ having support in $O$. More precisely, $\mathcal{R}_{vac}(O) = \{e^{i\phi(f)} : \mathrm{supp}\, f \subset O\}''$. This is the typical way in which local algebras of observables arise in quantum field theory. We note that in this case, the net $\{\mathcal{R}_{vac}(O)\}_{O \subset \mathbb{R}^4}$ of von Neumann algebras fulfils the condition of additivity which requires that $\mathcal{R}_{vac}(O)$ is contained in $\{\mathcal{R}_{vac}(O_n), n \in \mathbb{N}\}''$ whenever the sequence of regions
$\{O_n\}_{n \in \mathbb{N}}$ covers $O$, i.e. $O \subset \bigcup_n O_n$. The additivity requirement can therefore be taken for granted in quantum field theory. Now it is clear that the vacuum state $\omega_{vac}$ (like any state) determines a further class of states $\omega'$ on $\mathcal{A}(\mathbb{R}^4)$, namely those states which arise via density matrices in its GNS-representation:
$$\omega'(A) = \mathrm{Tr}(\rho\, \pi_{vac}(A)) \qquad \forall\, A \in \mathcal{A}(\mathbb{R}^4)$$
for some density matrix $\rho$ on $\mathcal{H}_{vac}$. These states are called normal states (in the vacuum representation, in this case), and they correspond in an obvious manner to normal states on $\mathcal{R}_{vac}(\mathbb{R}^4)$. Such normal states in the vacuum representation may be regarded as states with a finite number of particles. For quantum systems with a finite number of degrees-of-freedom one would write a thermal equilibrium state $\omega_\beta$ as a Gibbs state $\omega_\beta(A) = \mathrm{Tr}(e^{-\beta H_{vac}} \pi_{vac}(A))$, but for a system situated in the unboundedly extended Minkowski spacetime, $e^{-\beta H_{vac}}$ will not be a density matrix since the spectrum of the vacuum Hamiltonian $H_{vac}$ will usually be continuous. So a thermal equilibrium state is not a normal state in the vacuum representation. What one can however do is to approximate $\omega_\beta$ by a sequence of “local Gibbs states”
$$\omega_\beta^{(N)}(A) = \mathrm{Tr}\big(e^{-\beta H_{vac}^{(N)}} \pi_{vac}(A)\big), \qquad A \in \mathcal{A}(O_N),$$
which are restricted to bounded spacetime regions $O_N$ with suitable local Hamiltonians $H_{vac}^{(N)}$. Now one lets $O_N \nearrow \mathbb{R}^4$ as $N \to \infty$, and under fairly general assumptions on the behavior of the theory in the vacuum representation that are expected to hold for all physically relevant quantum fields, it can be shown that in the limit one gets a thermal equilibrium state $\omega_\beta$ (this is a long known result due to Haag, Hugenholtz and Winnink [25]) and that, moreover, remnants of the spectrum condition in the vacuum representation survive the limit to the effect that the limiting state $\omega_\beta$ satisfies the relativistic KMS-condition [6]. The relativistic KMS condition has proved useful in establishing the Reeh–Schlieder theorem for thermal equilibrium states. We shall, for the sake of completeness, quote the relevant results in the form of a theorem.
Theorem 7.1. [49, 36, 15] Let $\omega$ be either a vacuum state on $\mathcal{A}(\mathbb{R}^4)$, or a thermal equilibrium state on $\mathcal{A}(\mathbb{R}^4)$ satisfying the relativistic KMS-condition. Assume also that the net $\{\mathcal{R}_\omega(O)\}_{O \subset \mathbb{R}^4}$ fulfils additivity and that $\mathcal{H}_\omega$ is separable.
Then it holds that: (a) The set $\mathcal{R}_\omega(O)\Omega_\omega$ is dense in $\mathcal{H}_\omega$, i.e. the Reeh–Schlieder property holds for $\omega = \langle \Omega_\omega | \cdot | \Omega_\omega \rangle$ with respect to $\mathcal{R}_\omega(O)$, whenever $O$ is an open region.^c
^c Here and in the following, we always assume that the open set $O$ is non-void.
(b) Moreover, there is a dense set of vectors $\chi \in \mathcal{H}_\omega$ so that, for each such $\chi$, $\mathcal{R}_\omega(O)\chi$ is dense in $\mathcal{H}_\omega$ for all open regions $O$.
The proof of (a) in the vacuum case has been given in [49]. For the case of thermal equilibrium states, a proof of this property was only recently established by Jäkel in [36]. Statement (b) is implied by (a), as has been shown in [15]. We should also like to point out that the Schlieder property mentioned in the proof of Theorem 6.2 holds for the state $\omega$, cf. [51, 37]. These quoted results in combination with Theorem 6.2 now yield:
Theorem 7.2. Let $\mathcal{A} = \mathcal{R}_\omega(O_A)$ and $\mathcal{B} = \mathcal{R}_\omega(O_B)$ be a pair of local von Neumann algebras of a quantum field theory^d in the representation of a state $\omega$ which is either a vacuum state, or a thermal equilibrium state satisfying the relativistic KMS-condition (with $\mathcal{H}_\omega$ separable). If the open regions $O_A$ and $O_B$ are spacelike separated by a non-zero spacelike distance, then the state $\omega = \langle \Omega_\omega | \cdot | \Omega_\omega \rangle$ is 1-distillable on the bipartite system $(\mathcal{A}, \mathcal{B})$. Moreover, there is a dense set $X \subset \mathcal{H}_\omega$ so that the vector states $\langle \chi | \cdot | \chi \rangle$ are 1-distillable on $(\mathcal{A}, \mathcal{B})$ for all $\chi \in X$, $\|\chi\| = 1$. Also, $X$ may be chosen independently of $O_A$ and $O_B$. Consequently, the set of vector states on $\mathcal{R} = (\mathcal{A} \cup \mathcal{B})''$ which are 1-distillable on $(\mathcal{A}, \mathcal{B})$ is strongly dense in the set of all vector states.
Remarks. (i) Actually, the statement of Theorem 7.2 shows distillability not only for a dense set of vector states on $\mathcal{R}$ but even for a dense set of normal states (i.e., density matrix states) on $\mathcal{R}$. To see this note that, owing to the assumption that the spacetime regions $O_A$ and $O_B$ are spacelike separated by a finite distance, there is for $\mathcal{R}$ a separating vector in $\mathcal{H}_\omega$, since $\Omega_\omega$ has just this property: There is an open region $O$ lying spacelike to $O_A$ and $O_B$. By the Reeh–Schlieder property, $\mathcal{R}_\omega(O)\Omega_\omega$ is dense in $\mathcal{H}_\omega$, and hence, $\Omega_\omega$ is a separating vector for $\mathcal{R} \subset \mathcal{R}_\omega(O)'$.
This implies by [31, Theorem 7.3.8] that, whenever $\tilde{\omega}$ is a density matrix state on $\mathcal{R}$, there is a unit vector $\chi \in \mathcal{H}_\omega$ so that $\tilde{\omega} = \omega_\chi|_{\mathcal{R}}$. In other words, under the given assumptions every normal state on $\mathcal{R}$ coincides with the restriction of a suitable vector state. (ii) It should also be noted that, under very general conditions, vacuum representations and also thermal equilibrium representations of quantum field theories fulfil the so-called “split property” (an independence property, cf. [24, 56, 64]), which implies (under the conditions of Theorem 7.2) that there exists an abundance of normal states which are separable and even ppt on $(\mathcal{A}, \mathcal{B})$ for bounded, spacelike separated regions $O_A$ and $O_B$. (iii) The second part of the statement, asserting that in the GNS-representation of $\omega$ there is a dense set of normal states which are distillable over causally separated
^d The quantum field theory is supposed to be non-trivial in the sense that its local observable algebras are non-abelian, and this is also to hold for the local von Neumann algebras in the representations considered. This is the generic case in quantum field theory and holds for all investigated quantum field models.
regions, is closely related to a result by Clifton and Halvorson [26] who show (for a vacuum state ω; see [38] for a generalization of the argument to states satisfying the relativistic KMS condition) that there is a dense set of normal states in the GNS-representation of ω which are Bell-correlated over spacelike separated regions. However, they cannot deduce that Bell-correlations over spacelike separated regions are present for the state ω itself (or for a specific class of states, like those having the Reeh–Schlieder property, which can often be constructed out of other states). It is here where our result provides some additional information. (iv) In an interesting recent paper, Reznik, Retzker and Silman [50] propose a different method towards qualifying the degree of entanglement of a (free) quantum field vacuum state over spacelike separated regions. Their idea is to couple each local algebra A = R(OA ) and B = R(OB ) to an “external” algebra B(C2 ). They introduce a time-dependent coupling between the quantum field degrees-of-freedom in OA and OB and the corresponding “external” algebras, which are hence supposed to represent detection devices for quantum field excitations. It is then shown in [50] that this dynamical coupling, turned on for a finite amount of time during which the quantum field degrees-of-freedom remain causally separated, yields an entangled partial state for the pair of detector systems from an initially uncorrelated state coupled to the quantum field vacuum. Further local filtering operations are then used to distil that partial detector state to an approximate singlet state. It should, however, be remarked that the authors of [50] do not demonstrate the existence of Bell-correlations in the vacuum state over arbitrarily spacelike separated and arbitrarily small spacetime regions in the sense of [39, 40, 54, 26], i.e. in the sense of proving a violation of the CHSH inequalities by the quantum field observables themselves. 
Nevertheless, the approach of [50], while apparently less general than the one presented here, has some interesting aspects since potentially it may allow a more quantitative description of distillability in quantum field systems.
8. Distillability Beyond Spacetime Horizons It is worth pointing out that in Theorem 7.2 the spacelike separated regions OA and OB are the localization regions of the operations that Alice and Bob can apply to a given, shared state. The spacetime pattern of any form of classical communication between Alice and Bob that might be necessary to “post-select” a sub-ensemble of higher entanglement (i.e. to normalize the state ω [T ] ) from a given shared ensemble (on which local operations have been applied) is not represented in the criterion of distillability. Put differently, the distillability criterion merely tests if there are sufficiently “non-classical” long-range correlations in the shared state ω which can be provoked by local operations. It does not require that the post-selection is actually carried out via classical communication realizable between Alice and Bob in spacetime. Such a stronger demand would have to make reference to the causal structure of the spacetime into which Alice and Bob are placed.
We will illustrate this in the present section, and we begin by noting that Theorem 7.2 can actually be generalized to curved spacetime. Thus, we assume that $M$ is a four-dimensional smooth spacetime manifold, endowed with a Lorentzian metric $g$. To avoid any causal pathologies, we will henceforth assume that $(M, g)$ is globally hyperbolic (cf. [60]). In this case, it is possible to construct nets of local observable ($C^*$-) algebras $\{\mathcal{A}(O)\}_{O \subset M}$ for quantized free fields, like the scalar Klein–Gordon, Dirac and free electromagnetic fields [13, 61]. Let us focus, for simplicity, on the free quantized Klein–Gordon field on $(M, g)$, and denote by $\{\mathcal{A}(O)\}_{O \subset M}$ the corresponding net of local observable algebras, fulfilling the conditions of isotony and locality, which can be naturally formulated also in curved spacetimes. Let us briefly indicate how the local $C^*$-algebras $\mathcal{A}(O)$ are constructed in the case of the free scalar Klein–Gordon field; for full details, see [13, 33, 61]. The Klein–Gordon operator on $(M, g)$ is $(\nabla^\mu \nabla_\mu + m^2)$ where $\nabla$ denotes the covariant derivative of the spacetime metric $g$ and $m \ge 0$ is some constant. Owing to global hyperbolicity of the underlying spacetime $(M, g)$, the Klein–Gordon operator possesses uniquely determined advanced and retarded fundamental solutions (Green’s functions), $G_+$ and $G_-$, which can be viewed as distributions on $C_0^\infty(M \times M, \mathbb{R})$. Their difference $G = G_+ - G_-$ is called the causal propagator. One can construct a $*$-algebra $\mathcal{A}(M)$ generated by symbols $W(f)$, $f \in C_0^\infty(M, \mathbb{R})$, fulfilling the relations $W(f_1)W(f_2) = e^{-iG(f_1, f_2)/2}\, W(f_1 + f_2)$, $W(f)^* = W(-f)$ and $W(f + (\nabla^\mu \nabla_\mu + m^2)h) = W(f)$. This algebra possesses a unit element and admits a unique $C^*$-norm. We identify $\mathcal{A}(M)$ with the $C^*$-algebra generated by all the $W(f)$. Then $\mathcal{A}(O)$ is defined as the $C^*$-subalgebra generated by all $W(f)$ where $f \in C_0^\infty(O, \mathbb{R})$.
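As a consistency check, the Weyl relations quoted above already encode the locality condition. The following short derivation is a sketch that relies on two standard properties of the causal propagator which are assumed here rather than stated in the text: antisymmetry, $G(f_1, f_2) = -G(f_2, f_1)$, and the support property that $G(f_1, f_2) = 0$ whenever the supports of $f_1$ and $f_2$ are causally separated.

```latex
\begin{align*}
W(f_1)W(f_2) &= e^{-iG(f_1,f_2)/2}\, W(f_1+f_2), \\
W(f_2)W(f_1) &= e^{-iG(f_2,f_1)/2}\, W(f_1+f_2)
              = e^{+iG(f_1,f_2)/2}\, W(f_1+f_2), \\
\Longrightarrow\quad
W(f_1)W(f_2) &= e^{-iG(f_1,f_2)}\, W(f_2)W(f_1).
\end{align*}
% If supp f_1 and supp f_2 are causally separated, then G(f_1,f_2) = 0
% and the two Weyl generators commute.
```

Under these assumptions the generators of the algebras assigned to two causally separated regions commute, and hence so do the algebras themselves, which is the locality condition in the curved spacetime setting.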
Now, unless $(M, g)$ possesses time-symmetries, there are no obvious criteria to single out vacuum states or thermal equilibrium states on $\mathcal{A}(M)$. Nevertheless, there is a class of preferred states on $\mathcal{A}(M)$ which serve, for most purposes, as replacements for vacua or thermal equilibrium states. The states in this class are called quasifree Hadamard states. Given such a state, $\omega$, one has $\pi_\omega(W(f)) = e^{i\Phi_\omega(f)}$ in the GNS-representation of $\omega$ with selfadjoint quantum field operators $\Phi_\omega(f)$ in $\mathcal{H}_\omega$ depending linearly on $f$ and fulfilling $\Phi_\omega((\nabla^\mu \nabla_\mu + m^2)f) = 0$ and the canonical commutation relations in the form $[\Phi_\omega(f_1), \Phi_\omega(f_2)] = iG(f_1, f_2)\mathbf{1}$. The Hadamard condition is a condition on the two-point distribution $\langle \Omega_\omega | \Phi_\omega(x)\Phi_\omega(y) | \Omega_\omega \rangle$ of $\omega$ (symbolically written as integral kernel with $x, y \in M$) and demands, essentially, that this has a leading singularity of the type “1/(squared geodesic distance between $x$ and $y$)”. Quasifree Hadamard states are a very well investigated class of states of the free scalar field in curved spacetime. The reasons why they are considered as replacements for vacuum states or thermal equilibrium states are discussed, e.g., in the refs. [22, 33, 61, 21]. The Hadamard condition on the two-point distribution of a (quasifree) state $\omega$ can equivalently be expressed by requiring that the $C^\infty$-wavefront set of the
Hilbert-space valued distribution $C_0^\infty(M) \ni f \mapsto \Phi_\omega(f)\Omega_\omega$ is confined to the set of future-pointing causal covectors on $M$ (cf. [53] and also references cited there). If $\omega$ satisfies this latter condition, one says that it fulfils the microlocal spectrum condition (µSC). If the latter condition holds even with the analytic wavefront set in place of the $C^\infty$-wavefront set, then one says that $\omega$ fulfils the analytic microlocal spectrum condition (aµSC) [53]. (For aµSC, it is also required that the spacetime $(M, g)$ be real analytic.) While the definitions of $C^\infty$-wavefront set and analytic wavefront set are a bit involved so that we do not present them here and refer to [53] and references given there for full details, we put on record that for any quasifree state $\omega$ on the observable algebra $\mathcal{A}(M)$ of the scalar Klein–Gordon field one has
$\omega$ fulfils aµSC ⇒ $\omega$ fulfils µSC ⇔ $\omega$ Hadamard.
Moreover, on a stationary, real analytic, globally hyperbolic spacetime $(M, g)$, the quasifree ground states or quasifree thermal equilibrium states on $\mathcal{A}(M)$, which are known to exist under a wide range of conditions, fulfil aµSC [53]. It is also known that there exist very many quasifree Hadamard states on $\mathcal{A}(M)$ for any globally hyperbolic spacetime $(M, g)$. Several properties of the local von Neumann algebras $\mathcal{R}_\omega(O)$ are known for quasifree Hadamard states $\omega$, and we collect those of interest for the present discussion in the following proposition.
Proposition 8.1. Let $(M, g)$ be a globally hyperbolic spacetime, and let $\omega$ be a quasifree Hadamard state on $\mathcal{A}(M)$, the algebra of observables of the Klein–Gordon field on $(M, g)$. Write $\mathcal{R}_\omega(O) = \pi_\omega(\mathcal{A}(O))''$, $O \subset M$, for the local von Neumann algebras in the GNS-representation of $\omega$. Then the following statements hold. (a) $\mathcal{R}_\omega(O)$ is non-abelian whenever $O$ is open. (b) There is a dense set of vectors $\chi \in \mathcal{H}_\omega$ so that, for each such $\chi$, $\mathcal{R}_\omega(O)\chi$ is dense in $\mathcal{H}_\omega$, for all open $O \subset M$.
(c) If (M, g) is real analytic and if ω satisfies the aµSC, then the Reeh–Schlieder property holds for ω = Ωω | · |Ωω with respect to Rω (O), whenever O ⊂ M is open. Proof. Statement (a) is clear from the fact that the canonical commutation relations hold for the field operators Φω (f ). Statement (c) is a direct consequence of [53, Theorem 5.4]. For statement (b), one can argue as follows. For a globally hyperbolic (M, g), there is a countable neighborhood base {On }n∈N for the topology of M where each On has a special shape (called “regular diamond” in [58]; we assume here also that each On has a non-void causal complement), which allows the conclusion that each Rω (On ) is a type III1 factor (cf. [58, Theorem 3.6]). Since Hω is separable (cf. again [58, Theorem 3.6]), one can make use of [15, Corollary 2 and Proposition 3] which leads to the conclusion that there is a dense set X ⊂ Hω so that each χ ∈ X is cyclic for all Rω (On ), n ∈ N. Since {On }n∈N is a neighborhood
base for the topology of $M$, each open set $O \subset M$ has $O_n \subset O$ for some $n$, and hence each $\chi \in X$ is cyclic for $\mathcal{R}_\omega(O)$ whenever $O$ is an open subset of $M$.
As in the previous section, we can conclude distillability from the just asserted Reeh–Schlieder properties.
Theorem 8.2. Let $(M, g)$ be a globally hyperbolic spacetime, and let $\omega$ be a quasifree state on the observable algebra $\mathcal{A}(M)$ of the quantized scalar Klein–Gordon field on $(M, g)$. Let $O_A$ and $O_B$ be two open subsets of $M$ whose closures are causally separated (i.e., they cannot be connected by any causal curve), and let $\mathcal{A} = \mathcal{R}_\omega(O_A)$, $\mathcal{B} = \mathcal{R}_\omega(O_B)$. The following statements hold: (a) If $(M, g)$ is real analytic and $\omega$ satisfies the aµSC, then the state $\omega = \langle \Omega_\omega | \cdot | \Omega_\omega \rangle$ is 1-distillable on $(\mathcal{A}, \mathcal{B})$. (b) There is a dense set $X \subset \mathcal{H}_\omega$ so that the vector states $\langle \chi | \cdot | \chi \rangle$ are 1-distillable on $(\mathcal{A}, \mathcal{B})$ for all $\chi \in X$, $\|\chi\| = 1$. Also, $X$ may be chosen independently of $O_A$ and $O_B$. Consequently, the set of normal states on $\mathcal{R} = (\mathcal{A} \cup \mathcal{B})''$ which are 1-distillable on $(\mathcal{A}, \mathcal{B})$ is strongly dense in the set of normal states on $\mathcal{R}$.
The proof of this theorem is a straightforward combination of the statements of Proposition 8.1 with Theorem 6.2. For part (b), we have already made use of the observation of Remark (i) following Theorem 7.2. Again, as noted in Remark (iii) following Theorem 7.2, part (b) of the last theorem is related to a similar statement by Clifton and Halvorson [26] which refers to the existence of a dense set of normal states which are Bell-correlated over causally separated spacetime regions. Also here, our comments of Remark (iii) apply. In Theorem 8.2 the localization regions $O_A$ and $O_B$ of the system parts controlled by Alice and Bob, respectively, could also be separated by spacetime horizons. Let us give a concrete example and take $(M, g)$ to be Schwarzschild–Kruskal spacetime, i.e. the maximal analytic extension of Schwarzschild spacetime.
This is a globally hyperbolic spacetime which is real analytic, and it has two subregions, denoted by I and II, that model the interior and exterior spacetime parts of an eternal black hole, respectively (see [60, Sec. 6.4]). These two regions are separated by the black hole horizon, so that no classical signal can be sent from the interior region I to an observer situated in the exterior region II. The situation is depicted in Fig. 2. For the quantized scalar Klein–Gordon field on the Schwarzschild–Kruskal spacetime, there is a preferred quasifree state, the so-called Hartle–Hawking state, which is in a sense the best candidate for the physical “vacuum” state on this spacetime (cf. [32, 61]). It is generally believed that this state fulfils the aµSC on all of M . (The arguments of [53] can be used to show that aµSC is fulfilled in region II and its “opposite” region, which makes it plausible that this holds actually on all of M , although there is as yet no complete proof.) Anticipating that this
Fig. 2. This figure shows the interior region I and exterior region II of the conformal diagram of Schwarzschild–Kruskal spacetime, which is a model of a static black hole spacetime (at large times after collapse of a star to a black hole). The event horizon, represented by the double lines, separates region I from region II such that no signal can be sent from I to II across the horizon. A quantum field state which satisfies the Reeh–Schlieder property (as e.g. implied by the analytic microlocal spectrum condition) is distillable over the shaded spacetime regions OA (wherein “Alice” conducts her experiments on the state) and OB (wherein “Bob” conducts his experiments on the state). The dashed line represents the black hole singularity.
is the case, we can choose the localization region OA inside the interior region I and OB in the exterior region II (cf. Fig. 2). Then, by our last theorem, we find that the Hartle–Hawking state ω of the quantized Klein–Gordon field is distillable on the bipartite system (A, B) with A = Rω (OA ) and B = Rω (OB ). Furthermore, there is a dense set of normal states in the GNS-representation of the Hartle–Hawking state with respect to which this distillability holds. (At any rate, since the existence of quasifree Hadamard states for the Klein–Gordon field on the Schwarzschild–Kruskal spacetime is guaranteed, part (b) of Theorem 8.2 always ensures the existence of an abundance of states which are distillable on (A, B).) A similar example for regions OA and OB separated by a spacetime horizon (an event horizon) can be given for de Sitter spacetime; the de Sitter “vacuum state” for the quantized Klein–Gordon field actually has all the required properties for the distillability statement of Theorem 8.2, cf. [7]. This shows that distillability of quantum field states beyond spacetime horizons (event horizons) can be expected quite generally. A similar situation occurs also in the standard Friedmann–Robertson–Walker cosmological models with an initial spacetime singularity. In this scenario, spacetime regions sufficiently far apart from each other are causally separated for a finite amount of time by their cosmological horizons [60]. However, also in this situation, a quantum field state fulfilling the aµSC on any Friedmann–Robertson–Walker spacetime would be distillable on a bipartite system (A, B) of the form A = Rω (OA )
and B = Rω (OB ) for spacetime regions OA and OB separated by a cosmological horizon. Again, there is at any rate a large class of states where such a distillability is found. In passing we should like to note that quantum field correlations, whose appearance is precisely expressed by the Reeh–Schlieder property, have already been considered in connection with the question if (potentially, very strong) quantum field fluctuations in the early universe could account for the structure of its later development [62].
9. Discussion: Classical Communication in Spacetime?
Distillation was introduced as the process of taking imperfectly entangled systems, and turning them into a useful entanglement resource. Any such process requires classical communication, even though for realizing 1-distillability only a single step of post-selection is required. It is suggestive to describe the classical communication steps also as causal communication processes in spacetime. This immediately raises a problem: if the laboratories of Alice and Bob are separated by an event horizon, they will never be able to exchange the required signals, so in this case the above results of the previous section might appear to be totally useless. Several comments to this idea are in order.
(1) Event horizons are global features of a spacetime. Hence if we are interested in what can be gained from the local state between Alice and Bob, the future development of the universe remains yet unknown. Since the gravitational background is taken as “external” at this level of the theory, the adopted framework, using only spacetime structure up until the time the quantum laboratories close, never allows a decision on whether or not postselection will be causally possible.
(2) The attempt to include the distillation process in the spacetime description meets the following characteristic difficulty: it becomes very hard to distinguish between classical and quantum communication.
Obviously, a quantum operation disturbs the quantum field in its future light cone, but it is very hard to assert that this disturbance leaves alone the spacetime region where the negotiations for postselection take place. In other words: we cannot distinguish LOCC operations from exchanging quantum particles, and this would completely trivialize the distinction between distillable and separable states. (3) This difficulty is akin to the problem of realizing statistical experiments in spacetime. On the one hand, the statistical interpretation of quantum mechanics (and hence of quantum field theory) is based on independent repetitions of “the same” experiment. But in a dynamic space time it is clear that strictly speaking no repetition is possible, and the above disturbance argument casts additional doubt on the possibility of independent repetitions. Carrying this argument still further, into the domain of quantum cosmology, it has been debated [20] whether quantum theory may ever apply to the universe as a whole. Whether this can be resolved by showing that for typical (small) experimental setups statistical behavior can be shown to hold with probability 1 in
any ensemble of universes admitted by the theory is a question far beyond the present paper.
To summarize: we have adopted here the most “local” approach to distillability, where it is strictly taken as a property of a state ω of a general bipartite quantum system (A, B) ⊂ R, independent of the “surroundings” of that quantum system and the global structure of the spacetime into which it is placed. Still, it would be quite interesting to see if distillability criteria taking into account the realizability of distillation protocols in spacetime can be developed in a satisfactory manner (e.g. reconcilable with ideas like general covariance [8], and with the difficulties related to independence of measurements alluded to above). We should finally note that the difference between these two points of view is insignificant for present day laboratory physics where it can always be safely assumed that spacetime is Minkowskian.
References
[1] D. Beckman, D. Gottesman, M. A. Nielsen and J. Preskill, Phys. Rev. A64 (2001) 052309.
[2] C. H. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters, Phys. Rev. Lett. 70 (1993) 1895; C. H. Bennett, G. Brassard, S. Popescu, B. Schumacher and J. A. Smolin, ibid. 76 (1996) 722; erratum ibid. 78 (1997) 2031; N. Gisin, Phys. Lett. A210 (1996) 151.
[3] C. H. Bennett, D. P. DiVincenzo, J. A. Smolin and W. K. Wootters, Phys. Rev. A54 (1996) 3824.
[4] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 1, 2nd edn. (Springer, Berlin-Heidelberg, New York, 1987).
[5] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 2, 2nd edn. (Springer, Berlin-Heidelberg, New York, 1997).
[6] J. Bros and D. Buchholz, Nucl. Phys. B429 (1994) 291.
[7] J. Bros and U. Moschella, Rev. Math. Phys. 8 (1996) 327.
[8] R. Brunetti, K. Fredenhagen and R. Verch, Commun. Math. Phys. 237 (2003) 31.
[9] D. Buchholz and S. J. Summers, Phys. Lett. A337 (2005) 17.
[10] B. S. Cirel’son, Lett. Math. Phys. 4 (1980) 93.
[11] J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Phys. Rev. Lett. 23 (1969) 880.
[12] E. B. Davies, Quantum Theory of Open Systems (Academic Press, New York, 1976).
[13] J. Dimock, Commun. Math. Phys. 77 (1980) 219; Trans. Amer. Math. Soc. 269 (1982) 133; Rev. Math. Phys. 4 (1992) 223.
[14] D. P. DiVincenzo, P. W. Shor, J. A. Smolin, B. M. Terhal and A. V. Thapliyal, Phys. Rev. A61 (2000) 062312.
[15] J. Dixmier and O. Maréchal, Commun. Math. Phys. 22 (1971) 44.
[16] W. Dür, J. I. Cirac, M. Lewenstein and D. Bruß, Phys. Rev. A61 (2000) 062313.
[17] T. Eggeling, D. Schlingemann and R. F. Werner, Europhys. Lett. 57 (2001) 782.
[18] A. K. Ekert, Phys. Rev. Lett. 67 (1991) 661.
[19] G. G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory (Wiley, New York, 1972).
[20] H. Fink and H. Leschke, Found. Phys. Lett. 13 (2000) 345.
[21] C. J. Fewster and R. Verch, Commun. Math. Phys. 240 (2003) 329.
[22] S. A. Fulling, Aspects of Quantum Field Theory in Curved Spacetime (Cambridge University Press, Cambridge, 1989).
[23] G. Giedke, B. Kraus, L.-M. Duan, P. Zoller, J. I. Cirac and M. Lewenstein, Fortschr. Phys. 49 (2001) 973.
[24] R. Haag, Local Quantum Physics (Springer-Verlag, Berlin, 1992).
[25] R. Haag, N. M. Hugenholtz and M. Winnink, Commun. Math. Phys. 5 (1967) 215.
[26] H. Halvorson and R. Clifton, J. Math. Phys. 41 (2000) 1711.
[27] P. Horodecki, M. Horodecki and R. Horodecki, Phys. Rev. Lett. 82 (1999) 1056.
[28] M. Horodecki, P. Horodecki and R. Horodecki, Phys. Rev. Lett. 80 (1998) 5239.
[29] M. Horodecki, P. Horodecki and R. Horodecki, Phys. Lett. A223 (1996) 1.
[30] P. Horodecki, Phys. Lett. A232 (1997) 333.
[31] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras, Vol. II, 2nd edn. (AMS, Providence, 1997).
[32] B. S. Kay, Commun. Math. Phys. 100 (1985) 57.
[33] B. S. Kay and R. M. Wald, Phys. Rep. 207 (1991) 49.
[34] M. Keyl, Phys. Rep. 369 (2002) 431.
[35] M. Keyl, D. Schlingemann and R. F. Werner, Quant. Inf. Comp. 3 (2003) 281.
[36] C. D. Jäkel, J. Math. Phys. 41 (2000) 1745.
[37] C. D. Jäkel, J. Math. Phys. 40 (1999) 6234.
[38] C. D. Jäkel, Found. Phys. Lett. 14 (2001) 1.
[39] L. J. Landau, Phys. Lett. A120 (1987) 54.
[40] L. J. Landau, Phys. Lett. A123 (1987) 115.
[41] H. Narnhofer, Acta Phys. Austriaca 47 (1977) 1.
[42] H. Narnhofer, Rep. Math. Phys. 50 (2002) 111.
[43] I. Ojima, Lett. Math. Phys. 11 (1986) 73.
[44] A. Peres, Phys. Rev. Lett. 77 (1996) 1413.
[45] A. Peres and D. R. Terno, Rev. Mod. Phys. 76 (2004) 93.
[46] S. Popescu, Phys. Rev. Lett. 74 (1995) 2619.
[47] Problem No. 2 on website of open problems in Quantum Information: http://www.imaph.tu-bs.de/qi/problems/2.html.
[48] R. Raussendorf and H. Briegel, Phys. Rev. Lett. 86 (2001) 5188.
[49] H. Reeh and S. Schlieder, Nuovo Cimento 22 (1961) 1051.
[50] B. Reznik, A. Retzker and J. Silman, Phys. Rev. A71 (2005) 042104.
[51] S. Schlieder, Commun. Math. Phys. 13 (1969) 216.
[52] G. L. Sewell, Quantum Theory of Collective Phenomena (Clarendon Press, Oxford, 1989).
[53] A. Strohmaier, R. Verch and M. Wollenberg, J. Math. Phys. 43 (2002) 5514.
[54] S. J. Summers and R. F. Werner, Lett. Math. Phys. 33 (1995) 321; Ann. Inst. H. Poincaré 49 (1988) 215; Commun. Math. Phys. 110 (1987) 247; Phys. Lett. A110 (1985) 257.
[55] S. J. Summers and R. F. Werner, J. Math. Phys. 28 (1987) 2448.
[56] S. J. Summers, Rev. Math. Phys. 2 (1990) 201.
[57] M. Takesaki, Theory of Operator Algebras, I (Springer, Berlin-Heidelberg, New York, 1979).
[58] R. Verch, Rev. Math. Phys. 9 (1997) 635.
[59] K. G. H. Vollbrecht and R. F. Werner, Phys. Rev. A64 (2001) 062307.
[60] R. M. Wald, General Relativity (University of Chicago Press, 1984).
[61] R. M. Wald, Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics (University of Chicago Press, 1994).
[62] R. M. Wald, Gen. Relativity Gravitation 24 (1992) 1111.
[63] R. F. Werner and M. M. Wolf, Phys. Rev. A61 (1999) 062102.
[64] R. F. Werner, Lett. Math. Phys. 13 (1987) 325.
[65] R. F. Werner, Phys. Rev. A40 (1989) 4277.
[66] S. L. Woronowicz, Rep. Math. Phys. 10 (1976) 165.
July 6, 2005 12:21 WSPC/148-RMP
J070-00240
Reviews in Mathematical Physics Vol. 17, No. 5 (2005) 577–612 © World Scientific Publishing Company
QUANTUM ENERGY INEQUALITIES IN TWO-DIMENSIONAL CONFORMAL FIELD THEORY
CHRISTOPHER J. FEWSTER
Department of Mathematics, University of York, Heslington, York YO10 5DD, United Kingdom
[email protected]
STEFAN HOLLANDS
Department of Physics, UCSB, Broida Hall, Santa Barbara, CA 93106, USA and Institut für Theoretische Physik, Universität Göttingen, Friedrich-Hund-Platz 1, D-37077 Göttingen, Germany
[email protected]
Received 23 December 2004
Revised 29 May 2005
Quantum energy inequalities (QEIs) are state-independent lower bounds on weighted averages of the stress-energy tensor, and have been established for several free quantum field models. We present rigorous QEI bounds for a class of interacting quantum fields, namely the unitary, positive energy conformal field theories (with stress-energy tensor) on two-dimensional Minkowski space. The QEI bound depends on the weight used to average the stress-energy tensor and the central charge(s) of the theory, but not on the quantum state. We give bounds for various situations: averaging along timelike, null and spacelike curves, as well as over a space-time volume. In addition, we consider boundary conformal field theories and more general “moving mirror” models. Our results hold for all theories obeying a minimal set of axioms which, as we show, are satisfied by all models built from unitary highest-weight representations of the Virasoro algebra. In particular, this includes all (unitary, positive energy) minimal models and rational conformal field theories. Our discussion of this issue collects together (and, in places, corrects) various results from the literature which do not appear to have been assembled in this form elsewhere.
Keywords: Quantum field theory; energy inequalities; conformal field theory.
1. Introduction
In classical theories of matter, the stress-energy tensor $T_{\mu\nu}$ is usually taken to satisfy “energy conditions”, encoding various physical assumptions. For example, the dominant energy condition (DEC) requires that $T^{\mu}{}_{\nu} v^{\nu}$ be a future-pointing causal (timelike or null) vector whenever $v^{\nu}$ is, reflecting the idea that energy-momentum
should be propagated at or below the speed of light, while the weak energy condition (WEC) requires simply that the energy density seen by any observer is non-negative. It is well-known that such conditions usually fail in quantum theoretical models of matter to the extent that, at a given space-time point, the expectation value of the energy density can be made arbitrarily negative by a suitable choice of state. If such negative energy densities could in fact be sustained over a sufficiently large region of space and time, then all sorts of unexpected physical phenomena ranging from exotic space-times to violations of the second law of thermodynamics could occur [35, 1, 16]. However, it has been shown that the duration and magnitude of negative energy density that can occur is constrained, at least in models of free fields, by so-called “quantum inequalities”. (We will use the more specific term “quantum energy inequalities” (QEIs).) Results are known for the free scalar [19, 39, 12, 6, 15], Dirac [52, 13, 9], Maxwell and Proca [19, 38, 10], and Rarita–Schwinger fields [54] in various levels of generality, including some quite general and rigorous results. These inequalities state that the weighted average of the expected energy density along a worldline is bounded from below by a negative constant depending only on the weighting function used in the averaging process, but not on the quantum state. Moreover, the bounds become more stringent if one increases the time interval over which the averaging is performed. These quantum energy inequalities arguably exclude, or at least severely constrain, the above-mentioned exotic physical phenomena (see, for example, [18, 40, 45]). Unfortunately, quantum inequalities of the above character are at present only known for free field theories, leaving open the possibility that physically interesting, interacting field theories might display a completely different behavior in this regard. 
Thus, one should also investigate quantum inequalities for interacting quantum field theories. In the present paper, we take a first step in this direction, by deriving a sharp quantum energy inequality of the above character for arbitrary unitary, two-dimensional quantum field theories with conformal invariance and positive Hamiltonian.^a [Footnote a: We are also assuming, of course, that the theory has a stress tensor. Not all theories with conformal invariance necessarily admit a stress-energy tensor [2, 28].] Our derivation is based on the realization that Flanagan’s bound [14] for a massless scalar field in two dimensions is in fact an argument in conformal field theory. Indeed, a close inspection shows that the essential part of his argument only relies upon the transformation law of the stress-energy operator under diffeomorphisms, common to all two-dimensional unitary conformal field theories with positive Hamiltonian and a stress-energy tensor. As a result, our general bound differs from that for a massless scalar field in two dimensions only by a multiplicative factor of the central charge, $c$, of the conformal field theory under consideration (and the possibility that the left- and right-moving portions of the stress-energy tensor might have different central charges). We do not assume at any point that the theory is derived from a Lagrangian, nor do we invoke (but certainly
do not exclude) at any point the existence of any fields other than the stress-energy tensor. The general arguments establishing the bound are sketched in Sec. 2, following Flanagan’s argument fairly closely. Some non-trivial issues of mainly technical nature have to be dealt with in order to make the argument rigorous for the class of weighting functions that we want to consider, and to show that the bound is sharp. These issues mainly arise from the fact that the stress-energy tensor in two-dimensional conformal field theory has the familiar transformation property for those diffeomorphisms of the left-moving (resp., right-moving) light-ray that can be lifted to diffeomorphisms of the unit circle $S^1$ under the stereographic map. However, in order to prove our quantum inequality bound for weighting functions of compact support (and to show that it is sharp), one formally wants to consider diffeomorphisms outside this class. These difficulties were overcome in [14] by an appeal to general covariance. In the setting explored in this paper, a different argument is needed and this is elaborated in Sec. 4. To make this argument, we need to have sufficient control over the unitary representations of the diffeomorphism group of $S^1$ which enter the transformation law of the stress-tensor in the given CFT model. We therefore begin in Sec. 3 by specifying, in an axiomatic fashion, the class of models to which our derivation applies. Our axioms are fairly minimal and particular models will generally have extra structure. The main content of the axioms is that the theory should be covariant with respect to $\widetilde{\mathrm{Diff}}_+(S^1)$, the universal covering group of the group of orientation preserving diffeomorphisms of the circle, and invariant under $\widetilde{\text{Möb}}$, the subgroup covering the Möbius transformations of the circle. Each independent component of the stress-energy tensor should correspond to an independent unitary multiplier representation of $\widetilde{\mathrm{Diff}}_+(S^1)$ and the stress-energy tensor itself should be formed from the infinitesimal generators of these representations. As we will see (in Sec. 5.3), these axioms will be loose enough to embrace a wide range of theories; in particular, they encompass all unitary rational CFTs. Nonetheless, they are sufficient conditions for the theory to obey QEIs. We have also collected a number of facts about $\widetilde{\mathrm{Diff}}_+(S^1)$ and its representations in Sec. 3; although much of this material is regarded as well-known, comprehensive references seem not to exist. Thus, our presentation may be of independent interest.
In Sec. 5, we verify that our axioms are satisfied by models constructed from unitary, highest-weight representations of the Virasoro algebra. Here, we draw on the results of Goodman and Wallach [25], and Toledano Laredo [49] which make precise the sense in which such representations may be “exponentiated” to unitary multiplier representations of $\widetilde{\mathrm{Diff}}_+(S^1)$. As particular models may be built as direct sums of tensor products of Virasoro representations, it is also necessary to maintain explicit control of the multiplier appearing in our representations and we show that this may be defined in terms of the Bott cocycle. We have not found a full proof of this elsewhere in the literature.
We illustrate our main result by giving several applications in Sec. 4.2. In particular, we derive QEIs that are valid along worldlines, or for averaging over space-time
volumes. A peculiarity of two-dimensional conformal field theory is that QEIs also exist for averages along spacelike or null lines, in contrast to the situation in four-dimensional theories [17, 11]. We also show that similar results hold for conformal field theories in the presence of moving boundaries (often called “moving mirrors”). Finally, we discuss the failure of QEIs for sharply cut-off averaging functions.
In conclusion, we mention that it is not clear that quantum energy inequalities involving averaging along worldlines will hold in generic non-conformally invariant theories in two dimensions, or in interacting quantum field theories in dimensions $d > 2$. Olum and Graham [37] have investigated a model with two nonlinearly coupled fields, one of which is in a domain wall configuration, and argued that a static negative energy density can be created in this fashion, which can be made large by tuning the parameters of the model. For these reasons, we suggest that spacetime-averaged QEIs might be a more profitable direction for future research (as mentioned, such QEIs hold in our present context). Indeed, if one were required to scale the spatial support of the averaging with the temporal support, then averages of long duration would necessarily sense the large positive energy concentrated in the domain wall, preventing the overall average from becoming too negative. This may suggest an appropriate formulation for QEIs in more general circumstances.

2. Stress-Energy Densities of Scale-Invariant Theories in Two Dimensions
Let us begin by considering a general scale-invariant theory in two-dimensional Minkowski space. The Lüscher–Mack theorem [32, 33, 21] asserts^b that if such a theory possesses a symmetric and conserved stress-energy tensor field $T^{\mu\nu}$ obeying
\[ \int T^{\mu 0}(x^0, x^1)\, dx^1 = P^{\mu}, \tag{2.1} \]
where $P^{\mu}$ are the energy-momentum operators generating space-time translations, then $T^{\mu\nu}$ is traceless and the independent components $T^{00}$ and $T^{01}$ may be expressed in terms of left- and right-moving chiral components $T_L$ and $T_R$ which each depend on only one lightlike variable:
\[ T^{00}(x^0, x^1) = T_R(x^0 - x^1) + T_L(x^0 + x^1), \qquad T^{01}(x^0, x^1) = T_R(x^0 - x^1) - T_L(x^0 + x^1). \tag{2.2} \]
[Footnote b: The theorem assumes that the theory obeys Wightman’s axioms [48].]
These fields have scaling dimension two, i.e.
\[ U(\lambda) T_L(v) U(\lambda)^{-1} = \lambda^2 T_L(\lambda v) \tag{2.3} \]
(and an analogous relation for $T_R$) where $U(\lambda)$ is the unitary implementing the scaling $x^{\mu} \to \lambda x^{\mu}$. Moreover, $T_L$ and $T_R$ commute with each other and satisfy
relations of the form:
\[ [T_L(v_1), T_L(v_2)] = i\left( -T_L'(v_1)\,\delta(v_1 - v_2) + 2 T_L(v_1)\,\delta'(v_1 - v_2) - \frac{c_L}{24\pi}\,\delta'''(v_1 - v_2)\,\mathbb{1} \right) \tag{2.4} \]
(and similarly for $T_R$) where the constants $c_L$, $c_R$ are the central charges of the theory and are equal under the additional assumption of parity invariance. These commutation relations are closely related to those of the Virasoro algebra, a central extension of the (complexified) Lie algebra of $\mathrm{Diff}_+(S^1)$, the group of orientation preserving diffeomorphisms of the circle.
One of the key properties of a QFT is the spectrum condition, which, in the present context, requires that $P^0 \pm P^1$ be positive operators. It is easy to see that
\[ P_R := \frac{1}{2}\left(P^0 + P^1\right) = \int T_R(u)\, du, \qquad P_L := \frac{1}{2}\left(P^0 - P^1\right) = \int T_L(v)\, dv \tag{2.5} \]
generate translations along null light-rays, so that $P_R$ generates translations along a left-moving null ray and vice versa. Positivity of these operators does not, however, entail that the stress-energy densities themselves are everywhere non-negative. On the contrary, for any $v$, there is a sequence of unit vectors $\psi_n$ (in the “Wightman domain” of the theory) with
\[ \langle T_L(v) \rangle_{\psi_n} \to -\infty \quad \text{as } n \to \infty \tag{2.6} \]
(of course there is a similar statement for $T_R$).^c [Footnote c: Here, as elsewhere, $\langle A \rangle_{\psi} := \langle \psi \mid A\psi \rangle / \langle \psi \mid \psi \rangle$ denotes the expectation value.] It is clearly enough to show this for $v = 0$. Let $\Omega$ be the vacuum state and write $T_L(f) = \int T_L(v) f(v)\, dv$, where $f$ is a non-negative test function. Now $\langle T_L(v) \rangle_{\Omega} = 0$ by translation- and scale-invariance of the vacuum, while $T_L(f)\Omega \neq 0$ by the Reeh–Schlieder theorem of Wightman theory (excluding the trivial possibility that $T_L(f) = 0$ for all $f$). Defining
\[ \varphi_{\lambda} = \Omega - \lambda T_L(f)\Omega \qquad (\lambda \in \mathbb{R}), \]
it is now evident that
\[ \langle \varphi_{\lambda} \mid T_L(f)\varphi_{\lambda} \rangle = -2\lambda \|T_L(f)\Omega\|^2 + \lambda^2 \langle \Omega \mid T_L(f)^3 \Omega \rangle \]
is negative for all sufficiently small positive $\lambda$. Hence $\langle T_L(v) \rangle_{\varphi_{\lambda}}$ must assume negative values for some point $v$, and we deduce the existence of a unit vector $\psi$ with $\langle T_L(0) \rangle_{\psi} < 0$. Defining $\psi_n = U(n)^{-1}\psi$ and using Eq. (2.3), we obtain Eq. (2.6). Thus, the stress-energy density at individual space-time points is unbounded from below, as is the case in many other quantum field theories.^d [Footnote d: Arguments similar to those given here apply to any theory (in dimension $d \geq 2$) with a scaling limit of positive scaling dimension; see [7].]
In the following sections, we will formulate precise conditions under which averaged stress-energy densities such as $T_L(f)$ (for non-negative $f$) obey state-independent lower bounds: Quantum Energy Inequalities. Our discussion is based on an argument given by Flanagan [14] for the particular case of the massless free scalar field (corresponding
to the case $c_L = c_R = 1$). We now sketch the heart of the argument, proceeding rather formally and leaving details aside. This is based on the transformation property of a chiral stress-energy density $T$ of a conformal field theory (representing $T_L$ or $T_R$) under reparametrizations $v \to V(v)$:
\[ T(v) \to V'(v)^2\, T(V(v)) - \frac{c}{24\pi}\{V, v\}\,\mathbb{1}, \tag{2.7} \]
where
\[ \{V, v\} = \frac{V'''(v)}{V'(v)} - \frac{3}{2}\left(\frac{V''(v)}{V'(v)}\right)^2 = -2\sqrt{V'(v)}\,\frac{d^2}{dv^2}\frac{1}{\sqrt{V'(v)}} \tag{2.8} \]
is the Schwarz derivative of $V$. That is, to any non-zero vector $\psi$ in Hilbert space, there is a vector $\psi_V$ (of the same norm) such that
\[ \langle T(v) \rangle_{\psi} = V'(v)^2 \langle T(V(v)) \rangle_{\psi_V} - \frac{c}{24\pi}\{V, v\}. \tag{2.9} \]
(The infinitesimal form of this transformation law is simply Eq. (2.4).) Now, suppose we are given a non-negative test function $H$ and choose a reparametrization such that $V'(v) = H(v)^{-1}$. Then $\{V, v\} = -2H(v)^{-1/2}\frac{d^2}{dv^2}H(v)^{1/2}$ and,
\[ \int H(v) \langle T(v) \rangle_{\psi}\, dv = \int V'(v) \langle T(V(v)) \rangle_{\psi_V}\, dv + \frac{c}{12\pi}\int \sqrt{H(v)}\,\frac{d^2}{dv^2}\sqrt{H(v)}\, dv = \int \langle T(V) \rangle_{\psi_V}\, dV - \frac{c}{12\pi}\int \left(\frac{d}{dv}\sqrt{H(v)}\right)^2 dv, \tag{2.10} \]
assuming that the integration by parts in the last term may be accomplished without producing any boundary terms. Since the first term on the right-hand side is $\langle P \rangle_{\psi_V}$, which is non-negative, we conclude that
\[ \int H(v) \langle T(v) \rangle_{\psi}\, dv \geq -\frac{c}{12\pi}\int \left(\frac{d}{dv}\sqrt{H(v)}\right)^2 dv \tag{2.11} \]
for arbitrary $\psi$. Moreover, since $\langle P \rangle_{\Omega} = 0$, one expects the bound to be attained for $\psi$ such that $\psi_V = \Omega$.
Although the above conveys the essential ideas underlying the QEI derivation (and differs from the scalar case only inasmuch as the central charge is not restricted to $c = 1$), one must exercise greater care to produce a satisfactory argument. There are various reasons for this. Firstly, the reparametrization rule (2.7) is expected to hold only for those reparametrizations of $\mathbb{R}$ which correspond to a diffeomorphism of the compactified light-ray, and this will not generally be the case for the coordinate $V$ invoked above. (Indeed the reparametrization is not even defined for $H$ vanishing outside a compact interval.) Secondly, it is clearly necessary to delineate the class of $\psi$ for which the bound holds; for example, the left-hand side does not even exist for every $\psi$! Finally, one needs to ensure that the various formal manipulations are valid. This technical point conceals some subtle nuances relating to $\sqrt{H(v)}$ (for example, although $\sqrt{H}$ could be replaced by a [not necessarily non-negative]
function which squares to $H$, it is not the case that every smooth non-negative function is the square of a smooth function [23]). Flanagan addressed the first two points for the scalar field by an elegant appeal to general covariance in order to compare the theory on the full line with a theory restricted to the interior of the support of $H$. We have chosen not to make a parallel assumption for general conformal field theories and instead present an alternative resolution of the problem. The upshot is that the QEI (2.11) holds (for $\psi$ in a specified domain) for any non-negative $H$ belonging to the Schwartz class^e $\mathscr{S}(\mathbb{R})$ and with the integrand on the right-hand side regarded as vanishing at any point where $H$ vanishes. [Footnote e: That is, the class of functions which, together with their derivatives, vanish more rapidly than any inverse power at infinity.] The formal statement and rigorous proof are given in Theorem 4.1.

3. Axiomatic Framework
In this section, we delineate in a mathematically precise manner the class of models to which our rigorous QEI derivation in Sec. 4 applies. We will state the required properties of these models in an axiomatic fashion and demonstrate later in Sec. 5 (by drawing together various results in the literature) that there actually exists a wide class of models with those properties.
As we remarked in the previous section, independent components of the stress-energy tensor are associated with independent representations of $\mathrm{Diff}_+(S^1)$, the group of orientation-preserving diffeomorphisms of the circle. For the validity of our arguments establishing the QEIs, it is important to have sufficient control over these representations, especially their continuity properties, as well as the spectral properties of certain generators. The essence of our axioms therefore consists in specifying the nature of the representations of $\mathrm{Diff}_+(S^1)$ that are allowed to occur in the given conformal field theory. In order to state these properties in a precise and efficient way, we will set the stage in the following subsections by recalling the salient facts about the group $\mathrm{Diff}_+(S^1)$ and its unitary representations, especially the so-called “unitary multiplier representations”. With those facts at hand, we will then state our axioms for the conformal field theories considered in this paper in Subsec. 3.3. Some of our later arguments in Sec. 5, establishing the existence of conformal field theories obeying our axioms, will also require us to know certain properties of the phases that occur in the unitary multiplier representations. Our presentation will therefore include a discussion and analysis of those, even though this would not, strictly speaking, be necessary in order to present our axioms.

3.1. Preliminaries concerning $\mathrm{Diff}_+(S^1)$
3.1.1. Group structure
Beginning with the circle itself, $S^1$ will be regarded as the unit circle $\{z \in \mathbb{C} : |z| = 1\}$ in the complex plane. Under the Cayley transform $C : z \mapsto$
$i(1 - z)/(1 + z)$, the circle (less $-1$) is mapped onto $\mathbb{R}$; we will refer to this as the “light-ray picture” in what follows. The real line will also enter as the universal covering group of $S^1$, via the map $\theta \mapsto \tan\tfrac{1}{2}\theta$. We will call this copy of $\mathbb{R}$ the “unrolled circle” to distinguish it from the light-ray picture. A function $f$ on $S^1$ will be said to be differentiable if $\mathbb{R} \ni \theta \mapsto f(e^{i\theta})$ is, and the derivative $f'$ will be given by
\[ i e^{i\theta} f'(e^{i\theta}) = \frac{d}{d\theta} f(e^{i\theta}). \tag{3.1} \]
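As a quick numerical illustration of the conventions just introduced, the following minimal sketch checks that the Cayley transform sends $e^{i\theta}$ to the real number $\tan\tfrac{1}{2}\theta$, and that Eq. (3.1) reproduces the usual complex derivative for a monomial. The helper names `cayley` and `circle_derivative`, the step size, and the test function $z^3$ are illustrative assumptions, not part of the paper.

```python
import cmath
import math

def cayley(z):
    # Cayley transform C(z) = i(1 - z)/(1 + z), defined for z != -1.
    return 1j * (1 - z) / (1 + z)

def circle_derivative(f, theta, h=1e-6):
    # Recover f'(e^{i theta}) from Eq. (3.1):
    #   i e^{i theta} f'(e^{i theta}) = d/dtheta f(e^{i theta}),
    # using a central finite difference in theta (assumed step h).
    z = cmath.exp(1j * theta)
    f_plus = f(cmath.exp(1j * (theta + h)))
    f_minus = f(cmath.exp(1j * (theta - h)))
    return (f_plus - f_minus) / (2 * h) / (1j * z)

theta = 1.0
z = cmath.exp(1j * theta)

# Light-ray coordinate of the point e^{i theta}: real and equal to tan(theta/2).
print(cayley(z), math.tan(theta / 2))

# For f(z) = z^3 the derivative defined through Eq. (3.1) should be close to 3 z^2.
print(circle_derivative(lambda w: w ** 3, theta), 3 * z ** 2)
```

The two printed pairs should agree up to rounding and finite-difference error; the second check only confirms that Eq. (3.1) is the restriction of the ordinary complex derivative when $f$ extends holomorphically.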
We may now define $\mathrm{Diff}_+(S^1)$ to be the group (under composition) of all diffeomorphisms $\sigma$ of the circle to itself which are orientation preserving, in the sense that $\sigma(z)$ winds once positively around the origin as $z$ does. We will also be concerned with its universal covering group $\widetilde{\mathrm{Diff}}_+(S^1)$, which may be identified with the group of diffeomorphisms $\rho$ of $\mathbb{R}$ obeying
\[ \rho(\theta + 2\pi) = \rho(\theta) + 2\pi, \tag{3.2} \]
each such map determining a $\mathring{\rho} \in \mathrm{Diff}_+(S^1)$ by
\[ \mathring{\rho}(e^{i\theta}) = e^{i\rho(\theta)}. \tag{3.3} \]
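To make Eqs. (3.2) and (3.3) concrete, here is a minimal sketch using one hypothetical element of the covering group, $\rho(\theta) = \theta + a\sin\theta$ with $|a| < 1$ (this particular $\rho$, the amplitude `A` and the function names are assumptions made purely for illustration). It checks the periodicity condition and that the induced circle map does not depend on which representative angle is used.

```python
import cmath
import math

A = 0.5  # hypothetical amplitude; |A| < 1 keeps rho'(theta) = 1 + A cos(theta) > 0

def rho(theta):
    # A diffeomorphism of R obeying Eq. (3.2): rho(theta + 2 pi) = rho(theta) + 2 pi.
    return theta + A * math.sin(theta)

def rho_prime(theta):
    # Derivative of rho; strictly positive, so rho is orientation preserving.
    return 1.0 + A * math.cos(theta)

def rho_circle(z):
    # Induced circle map of Eq. (3.3), evaluated via the representative
    # angle theta = arg(z) in (-pi, pi].
    return cmath.exp(1j * rho(cmath.phase(z)))

theta = 0.9
# Eq. (3.2): shifting the argument by 2 pi shifts the value by 2 pi.
print(rho(theta + 2 * math.pi) - rho(theta), 2 * math.pi)
# Eq. (3.3) is therefore well defined: both representatives give the same point.
print(cmath.exp(1j * rho(theta)), cmath.exp(1j * rho(theta + 2 * math.pi)))
```

The same check applied to $\theta \mapsto \rho(\theta) + 2\pi k$ gives the same circle map for every integer $k$, which is the sense in which the real-line picture covers $\mathrm{Diff}_+(S^1)$.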
As examples, let us note three particularly important one-parameter subgroups of $\mathrm{Diff}_+(S^1)$, which will appear in our discussion: namely $R_\phi$ ($\phi \in \mathbb{R}$) corresponding to rotations on the circle, and $T_s$ ($s \in \mathbb{R}$) and $D_\lambda$ ($\lambda > 0$) corresponding respectively to translations and dilations on the light-ray. On the unrolled circle, the rotations are defined by $R_\phi(\theta) = \theta + \phi$ [so that $\mathring{R}_\phi(z) = ze^{i\phi}$], while the translations and dilations are defined by
\[ T_s(\theta) = 2\tan^{-1}\left(s + \tan\frac{\theta}{2}\right) \quad \text{for } \theta \in (-\pi, \pi) \tag{3.4} \]
and
\[ D_\lambda(\theta) = 2\tan^{-1}\left(\lambda\tan\frac{\theta}{2}\right) \quad \text{for } \theta \in (-\pi, \pi) \tag{3.5} \]
and are extended to other values of $\theta$ by Eq. (3.2) and continuity. In each case, the principal branch of arctangent should be understood.
The rotations and translations may be combined to obtain a further one-parameter subgroup of interest, namely the special conformal transformations $S_s = R_\pi T_s R_\pi^{-1}$ ($s \in \mathbb{R}$). We also observe that the elements $R_{2\pi k}$ ($k \in \mathbb{Z}$) constitute the center of $\widetilde{\mathrm{Diff}}_+(S^1)$ as a consequence of Eq. (3.2).
Taken together, the rotations, translations and dilations generate the universal cover $\widetilde{\text{Möb}}$ of Möb, the group of Möbius transformations of $S^1$. This group will be the unbroken symmetry of conformal field theory; as we will see, these theories are
only covariant (rather than invariant) with respect to the diffeomorphisms. Möbius transformations of the circle take the form:
\[ z \mapsto \frac{\alpha z + \beta}{\bar{\beta} z + \bar{\alpha}}, \tag{3.6} \]
where $\alpha, \beta \in \mathbb{C}$ with $|\alpha|^2 - |\beta|^2 = 1$. Noting the invariance of Eq. (3.6) under simultaneous negation of $\alpha$ and $\beta$, we see that $\text{Möb} \cong \mathrm{PSU}(1, 1) = \mathrm{SU}(1, 1)/\{\mathbb{1}, -\mathbb{1}\}$. In the light-ray picture, elements of Möb act according to
\[ u \mapsto \frac{au + b}{cu + d}, \tag{3.7} \]
for real coefficients $a, b, c, d$ with $ad - bc = 1$, and this provides a group isomorphism $\text{Möb} \cong \mathrm{PSL}(2, \mathbb{R})$.

3.1.2. Lie group structure
Let $C^\infty(\mathbb{R}; \mathbb{R})$ be the space of smooth, real-valued functions on $\mathbb{R}$ equipped with the topology of uniform convergence of functions and their derivatives of all orders,^f which makes it into a Fréchet space. [Footnote f: That is, $f_k \to f$ iff $\sup_{x \in \mathbb{R}} |f_k^{(r)}(x) - f^{(r)}(x)| \to 0$ for all $r \geq 0$, where $f^{(r)}$ is the $r$th derivative of $f$.] We use $C^\infty_{2\pi}(\mathbb{R}; \mathbb{R})$ to denote the Fréchet subspace of $C^\infty(\mathbb{R}; \mathbb{R})$ consisting of $(2\pi)$-periodic functions. Now, $\rho \in \widetilde{\mathrm{Diff}}_+(S^1)$ if and only if $\tilde{\rho}(\theta) = \rho(\theta) - \theta$ is an element of $C^\infty_{2\pi}(\mathbb{R}; \mathbb{R})$ obeying $\tilde{\rho}'(\theta) > -1$ for all $\theta$. Thus, $\widetilde{\mathrm{Diff}}_+(S^1)$ is an open subset of an affine translation of $C^\infty_{2\pi}(\mathbb{R}; \mathbb{R})$ in $C^\infty(\mathbb{R}; \mathbb{R})$ and may therefore be endowed with the structure of a Fréchet manifold modeled on $C^\infty_{2\pi}(\mathbb{R}; \mathbb{R})$, with $\rho \mapsto \tilde{\rho}$ acting as a global coordinate chart. Moreover, the group operations of composition and inversion are smooth, so $\widetilde{\mathrm{Diff}}_+(S^1)$ is in fact a Fréchet Lie group. The same structure can be induced on $\mathrm{Diff}_+(S^1)$ by the quotient map. (Cf., for example, [34, Sec. 6] and [26, Example 4.2.6].)
The Lie algebra of these groups, $C^\infty_{2\pi}(\mathbb{R}; \mathbb{R})$, may be conveniently regarded as the space of real vector fields on the circle, $\mathrm{Vect}_{\mathbb{R}}(S^1)$. Indeed, given any smooth one-parameter curve $t \mapsto \rho_t \in \widetilde{\mathrm{Diff}}_+(S^1)$, we obtain a vector field $X$ on $S^1$ by
\[ (Xg)(z) = \left.\frac{d}{dt}\, g(\mathring{\rho}_t(z))\right|_{t=0} \qquad (g \in C^\infty(S^1)), \tag{3.8} \]
which corresponds to the tangent vector to $\rho_t$ at $t = 0$. This vector field is said to be real because it may be expressed in the form:
\[ (Xg)(e^{i\theta}) = f(e^{i\theta})\,\frac{d}{d\theta}\, g(e^{i\theta}) \qquad (g \in C^\infty(S^1)) \tag{3.9} \]
for some real-valued $f \in C^\infty(S^1)$. For our purposes, however, it will be more convenient to identify $\mathrm{Vect}(S^1)$ and $C^\infty(S^1)$ so that $f \in C^\infty(S^1)$ corresponds to
the vector field $\mathsf{f} \in \mathrm{Vect}(S^1)$ with action
\[ (\mathsf{f}g)(z) = f(z)\,g'(z). \tag{3.10} \]
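The identification (3.10) can be probed numerically. Taking $g(z) = z$ in Eq. (3.8), the function $f$ attached to a one-parameter subgroup is simply the $s$-derivative of the induced circle map at $s = 0$. The sketch below (function names and step size are illustrative assumptions) does this for the rotations and for the translations of Eq. (3.4); direct differentiation gives $f(z) = iz$ and $f(z) = \tfrac{i}{2}(1 + z)^2$ respectively, and the finite differences should reproduce these values.

```python
import cmath
import math

def T(s, theta):
    # Translation subgroup of Eq. (3.4), valid for theta in (-pi, pi).
    return 2 * math.atan(s + math.tan(theta / 2))

def R(phi, theta):
    # Rotation subgroup on the unrolled circle: R_phi(theta) = theta + phi.
    return theta + phi

def tangent(flow, theta, h=1e-6):
    # Central finite difference in s of the circle map exp(i * flow(s, theta)) at s = 0.
    # By Eqs. (3.8) and (3.10) with g(z) = z, this approximates f(z) at z = e^{i theta}.
    plus = cmath.exp(1j * flow(h, theta))
    minus = cmath.exp(1j * flow(-h, theta))
    return (plus - minus) / (2 * h)

theta = 0.7
z = cmath.exp(1j * theta)
print(tangent(R, theta), 1j * z)                   # rotations: f(z) = i z
print(tangent(T, theta), 0.5j * (1 + z) ** 2)      # translations: f(z) = (i/2)(1 + z)^2
```

Each printed pair should agree to within finite-difference error.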
With this identification, $\mathsf{f}$ is real if and only if $f$ is invariant under the antilinear conjugation $(\Gamma f)(z) = -z^2\,\overline{f(z)}$. We will denote the space of $f \in C^\infty(S^1)$ obeying $\Gamma f = f$ by $C^\infty_\Gamma(S^1)$. As examples, it is straightforward to check that the tangent vector to the curve $\phi \mapsto R_\phi$ at $\phi = 0$ corresponds to the function $z \mapsto iz$, while those of $s \mapsto T_s$ and $s \mapsto S_s$ at $s = 0$ correspond to $z \mapsto \tfrac{i}{2}(1 + z)^2$ and $z \mapsto -\tfrac{i}{2}(1 - z)^2$ respectively. All three functions are invariant under $\Gamma$, as $\bar{z} = z^{-1}$ on the circle.

3.1.3. The Bott cocycle
As already remarked, the Virasoro algebras underlying CFT are central extensions of the complexified Lie algebra of $\mathrm{Diff}_+(S^1)$. At the level of groups, these extensions are described by the Bott cocycle $B : \mathrm{Diff}_+(S^1) \times \mathrm{Diff}_+(S^1) \to \mathbb{R}$ given by^g
\[ B(\sigma_1, \sigma_2) = -\frac{1}{48\pi}\,\mathrm{Re}\int_{S^1} \log\big((\sigma_1 \circ \sigma_2)'(z)\big)\,\frac{d}{dz}\log\big(\sigma_2'(z)\big)\, dz, \tag{3.11} \]
which lifts to a cocycle $\tilde{B}(\rho_1, \rho_2) = B(\mathring{\rho}_1, \mathring{\rho}_2)$ on $\widetilde{\mathrm{Diff}}_+(S^1)$. [Footnote g: This differs slightly from the form usually given, to which it is co-homologous, but which corresponds to the Gel’fand–Fuks (rather than Virasoro) cocycle at the level of Lie algebras. The form given here is drawn from [47] with some typographical errors corrected.] Note that the logarithms do not introduce any ambiguity into this formula, because $\sigma'(z)$ has winding number zero about the origin for $\sigma \in \mathrm{Diff}_+(S^1)$.
Let us now collect some properties of $B$ and $\tilde{B}$. Firstly, it is immediate from the definition that
\[ B(\mathrm{id}, \sigma) = B(\sigma, \mathrm{id}) = 0, \qquad B(\sigma, \sigma^{-1}) = 0 \qquad (\sigma \in \mathrm{Diff}_+(S^1)), \tag{3.12} \]
and that the cocycle property
\[ B(\sigma_1, \sigma_2) + B(\sigma_1\sigma_2, \sigma_3) = B(\sigma_2, \sigma_3) + B(\sigma_1, \sigma_2\sigma_3) \tag{3.13} \]
holds for all $\sigma_1, \sigma_2, \sigma_3 \in \mathrm{Diff}_+(S^1)$ (analogous results also hold for $\tilde{B}$). Secondly, $B$ vanishes on $\text{Möb} \times \text{Möb}$ by the Cauchy integral formula because the integrand is holomorphic in the unit disk in this case [47]. Similarly, $\tilde{B}$ vanishes on $\widetilde{\text{Möb}} \times \widetilde{\text{Möb}}$. Thirdly, the following first derivatives are easily computed:
\[ D_1\tilde{B}|_{(\mathrm{id}, \rho)}(\mathsf{f}) = -\frac{1}{48\pi}\,\mathrm{Re}\int_{S^1} f'(\mathring{\rho}(z))\,\frac{\mathring{\rho}''(z)}{\mathring{\rho}'(z)}\, dz \tag{3.14} \]
and
\[ D_2\tilde{B}|_{(\rho, \mathrm{id})}(\mathsf{f}) = -\frac{1}{48\pi}\,\mathrm{Re}\int_{S^1} \left[\frac{\mathring{\rho}'''(z)}{\mathring{\rho}'(z)} - \left(\frac{\mathring{\rho}''(z)}{\mathring{\rho}'(z)}\right)^2\right] f(z)\, dz, \tag{3.15} \]
from which the second derivative
\[ D_1 D_2\tilde{B}|_{(\mathrm{id}, \mathrm{id})}(\mathsf{f}, \mathsf{g}) = -\frac{1}{48\pi}\,\mathrm{Re}\int_{S^1} f'(z)\,g''(z)\, dz = \frac{1}{2}\,\omega(f, g) \tag{3.16} \]
follows easily, where
\[ \omega(f, g) = \frac{1}{48\pi}\int_{S^1} \big(f(z)\,g'''(z) - f'''(z)\,g(z)\big)\, dz \tag{3.17} \]
is the Virasoro cocycle, i.e. the Lie algebra cocycle corresponding to $\tilde{B}$. Note that the integral in Eq. (3.17) is automatically real for $f, g \in C^\infty_\Gamma(S^1)$.

3.2. Unitary multiplier representations of $\widetilde{\mathrm{Diff}}_+(S^1)$
Let $\mathcal{H}$ be a Hilbert space, and suppose that each $\rho \in \widetilde{\mathrm{Diff}}_+(S^1)$ is assigned a unitary operator, $U(\rho)$, on $\mathcal{H}$ so that
\[ U(\rho)U(\rho') = e^{ic\tilde{B}(\rho, \rho')}\,U(\rho\rho') \tag{3.18} \]
holds for all $\rho, \rho' \in \widetilde{\mathrm{Diff}}_+(S^1)$, where $\tilde{B}$ is the Bott cocycle introduced above. Then the map $\rho \mapsto U(\rho)$ will be called a unitary multiplier representation of $\widetilde{\mathrm{Diff}}_+(S^1)$ with cocycle $\tilde{B}$ and central charge $c$. Representations of this type will form the main component of our axioms for CFT and we now collect some of their properties.
We begin by noting that $U$ restricts to $\widetilde{\text{Möb}}$ as a bona fide unitary representation because $\tilde{B}$ vanishes on $\widetilde{\text{Möb}} \times \widetilde{\text{Möb}}$. It therefore obeys $U(\mathrm{id}) = \mathbb{1}$, and because we also have $\tilde{B}(\rho, \rho^{-1}) = 0$ for all $\rho \in \widetilde{\mathrm{Diff}}_+(S^1)$, we easily obtain $U(\rho^{-1}) = U(\rho)^{-1}$ from Eq. (3.18).
Now assume, in addition, that the map $\rho \mapsto U(\rho)\psi$ is continuous for each fixed $\psi \in \mathcal{H}$, i.e. the representation is strongly continuous. This assumption permits us to obtain the infinitesimal generators of the representation, which are interpreted as smeared stress-energy densities. In more detail, for each $f \in C^\infty_\Gamma(S^1)$, let $\mathsf{f} \in \mathrm{Vect}_{\mathbb{R}}(S^1)$ be the corresponding real vector field and define a self-adjoint operator, $\Theta(f)$, by
\[ \Theta(f)\psi = \frac{1}{i}\left.\frac{d}{ds}\,U(\exp(s\mathsf{f}))\psi\right|_{s=0} \tag{3.19} \]
on the dense domain^h of $\psi$ for which the derivative exists. [Footnote h: The additive group of real numbers does not admit non-trivial smooth cocycles (see, for example, [50, Theorem 10.38]). Thus, because $s \mapsto U(\exp(s\mathsf{f}))$ is a strongly continuous unitary multiplier representation of $(\mathbb{R}, +)$ with a smooth multiplier, we may write $U(\exp(s\mathsf{f})) = e^{i\alpha(s)}V(s)$ where $V(s)$ is a strongly continuous 1-parameter group of unitaries and $\alpha$ is smooth and real-valued. Stone’s theorem and the Leibniz rule then guarantee that Eq. (3.19) does indeed define a self-adjoint operator with domain equal to the set of $\psi$ for which the derivative exists.] We then define $\Theta(f)$ for arbitrary $f \in C^\infty(S^1)$ by $\Theta(f) = \Theta\big(\tfrac{1}{2}(f + \Gamma f)\big) + i\,\Theta\big(\tfrac{1}{2i}(f - \Gamma f)\big)$ on the appropriate intersection of domains, so that
\[ \Theta(f)^* = \Theta(\Gamma f) \tag{3.20} \]
July 6, 2005 12:21 WSPC/148-RMP
588
J070-00240
C. J. Fewster & S. Hollands
holds on D(Θ(f )). A dense domain D ⊂ H will be called a domain of C 1 -regularity for ρ → U (ρ) if (i) it is invariant under each U (ρ) and contained in the domain D(Θ(f )) for all f , and (ii) the map f → Θ(f )ψ defines a vector-valued distribution on C ∞ (S 1 ) for each ψ ∈ D. We assume henceforth that such a domain is available, and also adopt the informal notation f (z)Θ(z) dz (3.21) Θ(f ) = S1
as a convenient book-keeping device, although $\Theta(z)$ should not be interpreted as an operator in its own right. To illustrate the use of this notation, let $H$ be the generator of the 1-parameter subgroup $R_\phi$ of Möb. Then
\[ H\psi = \frac{1}{i}\,\frac{d}{d\phi}\,U(R_\phi)\psi\,\Big|_{\phi=0} = \frac{1}{i}\,\frac{d}{d\phi}\,U(\exp(\phi\mathbf{f}))\psi\,\Big|_{\phi=0} \tag{3.22} \]
for any $\psi \in D$, where $\mathbf{f}$ is the tangent vector to $\phi \mapsto R_\phi$ at $\phi = 0$. As shown above, this corresponds to the function $f(z) = iz$, so we write
\[ H = \int_{S^1} iz\,\Theta(z)\,dz. \tag{3.23} \]
Similarly, the generators $P$ and $K$ of the 1-parameter subgroups $s \mapsto T_s$ and $s \mapsto S_s$ may be written as
\[ P = \frac{i}{2}\int_{S^1}(1+z)^2\,\Theta(z)\,dz, \tag{3.24} \]
\[ K = -\frac{i}{2}\int_{S^1}(1-z)^2\,\Theta(z)\,dz, \tag{3.25} \]
so that
\[ H = \frac{1}{2}(P+K) \tag{3.26} \]
on $D$, using linearity of $f \mapsto \Theta(f)\psi$.

One of the key properties we will require is the transformation law of the smeared stress-energy densities under diffeomorphisms, provided by the following result.
Proposition 3.1. Assume that $\mathcal{H}$ carries a strongly continuous unitary multiplier representation of $\mathrm{Diff}_+(S^1)$ obeying Eq. (3.18) for which $D \subset \mathcal{H}$ is a domain of $C^1$-regularity. Then $D$ is a core for each $\Theta(f)$ with $f = \Gamma f$. Moreover, the $\Theta(f)$ transform according to
\[ U(\rho)\,\Theta(f)\,U(\rho)^{-1} = \Theta(f_\rho) - \frac{c}{24\pi}\int_{S^1}\{\mathring{\rho},z\}\,f(z)\,dz\;\mathbb{1} \tag{3.27} \]
on vectors in $D$, for arbitrary $f \in C^\infty(S^1)$, where $f_\rho(z) = \mathring{\rho}'(\mathring{\rho}^{-1}(z))\,f(\mathring{\rho}^{-1}(z))$ corresponds to the vector field $\mathbf{f}_\rho = \mathrm{Ad}(\rho)(\mathbf{f})$. Furthermore, the commutation relations
\[ i[\Theta(g),\Theta(f)] = \Theta(g'f - f'g) + c\,\omega(g,f)\,\mathbb{1} \tag{3.28} \]
hold for arbitrary $f, g \in C^\infty(S^1)$, on vectors $\psi \in D \cap D(\Theta(f)\Theta(g)) \cap D(\Theta(g)\Theta(f))$.
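As a sanity check on the central term in Eq. (3.28), the cocycle $\omega$ of Eq. (3.17) can be evaluated exactly on monomials $f(z) = z^{m+1}$, $g(z) = z^{n+1}$, using $\oint z^k\,dz = 2\pi i\,\delta_{k,-1}$. The following is a minimal sketch under that assumption; the helper names are ours, and the comparison with the mode form $n(n^2-1)/12$ of the Virasoro central term is the standard identification rather than something stated in the text.

```python
from fractions import Fraction

def contour_coeff(power):
    """Integral of z**power over the unit circle, in units of 2*pi*i."""
    return Fraction(1) if power == -1 else Fraction(0)

def third_derivative_coeff(k):
    """Coefficient a in d^3/dz^3 z**k = a * z**(k - 3)."""
    return Fraction(k * (k - 1) * (k - 2))

def omega_over_i(m, n):
    """omega(z**(m+1), z**(n+1)) of Eq. (3.17), divided by i."""
    # integral of f g''' and of f''' g, each in units of 2*pi*i
    fg3 = third_derivative_coeff(n + 1) * contour_coeff((m + 1) + (n + 1) - 3)
    f3g = third_derivative_coeff(m + 1) * contour_coeff((m + 1) + (n + 1) - 3)
    # the prefactor 1/(48*pi) times 2*pi*i leaves i/24
    return (fg3 - f3g) / 24

# The cocycle is supported on m + n = 0, where it equals i * n * (n**2 - 1) / 12.
for n in range(-4, 5):
    assert omega_over_i(-n, n) == Fraction(n * (n * n - 1), 12)
```

The cubic dependence on $n$, vanishing for $n = -1, 0, 1$, reflects the fact that the cocycle vanishes on the Möbius generators.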
Remark. Equation (3.27) may also be written in the "unsmeared form"
\[ U(\rho)\,\Theta(z)\,U(\rho)^{-1} = \mathring{\rho}'(z)^2\,\Theta(\mathring{\rho}(z)) - \frac{c}{24\pi}\{\mathring{\rho},z\}\,\mathbb{1}. \tag{3.29} \]
Proof. That $D$ is a core follows from [43, Theorem VIII.11] and footnote h. To obtain the stated transformation property, choose $f \in C^\infty_\Gamma(S^1)$ and let $\mathbf{f}$ be the corresponding vector field. Then for any $\psi \in D$ and $\rho \in \mathrm{Diff}_+(S^1)$,
\begin{align*}
U(\rho)\,\Theta(f)\,U(\rho)^{-1}\psi
 &= \frac{1}{i}\,\frac{d}{ds}\,U(\rho)\,U(\exp(s\mathbf{f}))\,U(\rho^{-1})\psi\,\Big|_{s=0} \\
 &= \frac{1}{i}\,\frac{d}{ds}\,e^{ic\varphi(s)}\,U(\rho\exp(s\mathbf{f})\rho^{-1})\psi\,\Big|_{s=0} \\
 &= \frac{1}{i}\,\frac{d}{ds}\,e^{ic\varphi(s)}\,U(\exp(s\mathbf{f}_\rho))\psi\,\Big|_{s=0} \\
 &= \Theta(f_\rho)\psi + c\,\varphi'(0)\psi, \tag{3.30}
\end{align*}
where $\varphi(s) = \tilde{B}(\rho,\exp(s\mathbf{f})) + \tilde{B}(\rho\exp(s\mathbf{f}),\rho^{-1})$, so that $\varphi(0) = 0$. Using the fact that $\rho\exp(s\mathbf{f}) = \exp(s\mathbf{f}_\rho)\rho$, the cocycle relation Eq. (3.13), and the elementary properties Eq. (3.12), $\varphi$ may be rewritten in the form
\[ \varphi(s) = \tilde{B}(\rho,\exp(s\mathbf{f})) - \tilde{B}(\exp(s\mathbf{f}_\rho),\rho). \tag{3.31} \]
It is now a straightforward exercise, using the first derivatives of $\tilde{B}$ given in the previous subsection and the definition (2.8) of the Schwarz derivative, to show that
\[ \varphi'(0) = -\frac{1}{24\pi}\int_{S^1}\{\mathring{\rho},z\}\,f(z)\,dz \tag{3.32} \]
(the integral is real because $\rho \in \mathrm{Diff}_+(S^1)$ and $f \in C^\infty_\Gamma(S^1)$). Substituting this in Eq. (3.30), we have obtained Eq. (3.27) (applied to $\psi$); the extension to $f \in C^\infty(S^1)$ is immediate by linearity.

To obtain the Virasoro relations, we now put $\rho_s = \exp(s\mathbf{g})$, where the vector field $\mathbf{g}$ corresponds to some $g \in C^\infty_\Gamma(S^1)$, and choose arbitrary $\psi, \varphi \in D$. We now write
\[ \langle -i\Theta(g)\varphi \mid \Theta(f)\psi\rangle = \frac{d}{ds}\,\langle U(\rho_s^{-1})\varphi \mid \Theta(f)\psi\rangle\,\Big|_{s=0} \tag{3.33} \]
and use Eq. (3.27) (applied to $U(\rho_s)\psi \in D$) and the Leibniz rule, together with
\[ \frac{d}{ds}\,\Theta(f_{\rho_s})\varphi = \Theta\Bigl(\frac{d}{ds}f_{\rho_s}\Bigr)\varphi \tag{3.34} \]
to rewrite the right-hand side. The upshot is that Eq. (3.28) holds in a quadratic form sense on $D$, and hence as an identity on vectors $\psi \in D \cap D(\Theta(f)\Theta(g)) \cap D(\Theta(g)\Theta(f))$. The extension to general $f, g \in C^\infty(S^1)$ is by linearity, as before.
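The anomalous term in Eq. (3.27) is governed by the Schwarz derivative, and the fact that it vanishes on Möbius transformations is what makes Möb act without a central term. The sketch below checks this exactly with rational arithmetic. It assumes the standard convention $\{f,x\} = f'''/f' - \tfrac{3}{2}(f''/f')^2$ for definition (2.8), which is not reproduced in this section; the function names are ours.

```python
from fractions import Fraction

def schwarzian(d1, d2, d3):
    """Schwarz derivative f'''/f' - (3/2) (f''/f')**2 from the first three derivatives."""
    return d3 / d1 - Fraction(3, 2) * (d2 / d1) ** 2

def mobius_derivatives(a, b, c, d, x):
    """Exact first three derivatives of x -> (a x + b)/(c x + d) at the point x."""
    a, b, c, d, x = (Fraction(t) for t in (a, b, c, d, x))
    det = a * d - b * c
    w = c * x + d
    return det / w**2, -2 * c * det / w**3, 6 * c**2 * det / w**4

# Moebius map: the Schwarz derivative vanishes identically.
assert schwarzian(*mobius_derivatives(2, 1, 3, 5, Fraction(7, 3))) == 0

# A non-Moebius map for contrast: f(x) = x**3 has {f, x} = -4 / x**2.
x = Fraction(2)
assert schwarzian(3 * x**2, 6 * x, Fraction(6)) == -4 / x**2
```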
The following are simple applications of the above result:
\[ U(D_\lambda)\,P\,U(D_\lambda)^{-1} = \lambda P; \qquad U(D_\lambda)\,K\,U(D_\lambda)^{-1} = \lambda^{-1}K; \qquad K = U(R_\pi)\,P\,U(R_\pi)^{-1} \tag{3.35} \]
(note that the Schwarz derivative of a Möbius transformation vanishes). In particular, we observe that $K$ and $P$ must have the same spectrum, which (as it is non-empty, closed and dilation-invariant) must be one of the four possibilities $\{0\}$, $[0,\infty)$, $(-\infty,0]$ or $\mathbb{R}$. Restricting attention to the first two cases, in which $P \ge 0$, we find that $H \ge 0$ by Eq. (3.26), because $H$ is thereby positive on $D$, on which it is essentially self-adjoint. Conversely, if $H \ge 0$, we use the identity$^{i}$
\[ U(D_\lambda)\,H\,U(D_\lambda)^{-1} = \frac{1}{2}\bigl(\lambda P + \lambda^{-1}K\bigr) \tag{3.36} \]
on $D$ to deduce that $P \ge 0$ because
\[ \langle\psi \mid P\psi\rangle = \lim_{\lambda\to\infty} 2\lambda^{-1}\,\langle U(D_\lambda)^{-1}\psi \mid H\,U(D_\lambda)^{-1}\psi\rangle \ge 0 \tag{3.37} \]
for all $\psi \in D$, which is again a core for $P$. Clearly, $P = 0$ if and only if $H = 0$, so $\mathrm{spec}(P) = [0,\infty)$ if and only if $H$ is a non-zero positive operator.

Footnote i. The fact that $P \ge 0$ iff $H \ge 0$ is well-known, but is usually obtained from a detailed knowledge of the unitary representations of Möb. Combine, for example, the proof of [41, Proposition 9.2.6] with the representation theory given in [31, 29, 42]. The approach given here is adapted from [27, Proposition 1] (note that the conventions differ slightly).

3.3. Axioms

We now come to the statement of the axioms we shall adopt for conformal field theory. These are to be regarded as minimal requirements: specific models will have more structure and possibly an enlarged symmetry group. Nonetheless, the following axioms are already sufficient to establish the QEIs and are satisfied in models built from Virasoro representations (see Sec. 3.2). Note that, as they include the assumptions of Sec. 3.2, all the conclusions of that subsection apply to such theories, particularly Proposition 3.1. For simplicity, we state our axioms for a conformal field theory with a single component of stress-energy; at the end of this section we describe the (straightforward) extension to two independent components.

A. Hilbert space, diffeomorphism group and energy positivity

(A.1) The Hilbert space $\mathcal{H}$ of the theory carries a strongly continuous unitary multiplier representation $\rho \mapsto U(\rho)$ of $\mathrm{Diff}_+(S^1)$ obeying Eq. (3.18), with central charge $c > 0$.

(A.2) Up to phase, there is a unique unit vector $\Omega \in \mathcal{H}$ which is invariant under the restriction of $U$ to Möb, and which will be called the vacuum vector.
(A.3) The generator $P$ of the one-parameter translation subgroup $s \mapsto U(T_s)$ is assumed to be a positive self-adjoint operator. (An equivalent requirement is that the generator $H$ of the rotation subgroup $\phi \mapsto U(R_\phi)$ be positive, by the remarks above.)

B. Stress-energy density

The (smeared) stress-energy density $\Theta(f)$ is defined as the generator of $U(\rho)$, as described in the previous subsection (see Eq. (3.19)). We assume that $\mathcal{H}$ contains a dense subspace $D \subset \mathcal{H}$ such that:

(B.1) $D$ is invariant under each $U(\rho)$, contains $\Omega$ and is contained in $D(\Theta(f))$ for all $f \in C^\infty(S^1)$.

(B.2) For each $\psi \in D$, the map $f \mapsto \Theta(f)\psi$ is a vector-valued distribution on $C^\infty(S^1)$ (equipped with its usual topology of uniform convergence of functions and all their derivatives). Thus, $D$ is a domain of $C^1$-regularity in the sense introduced above.

(B.3) For each $\psi \in D$, $\langle\Theta(z)\rangle_\psi$ is smooth on $S^1$.

Given a theory of the above type living on a circle, we may define a stress-energy density $T(v)$ living on a light-ray by the "unsmeared" formula
\[ T(v) = \Bigl(\frac{dz}{dv}\Bigr)^2\,\Theta(z(v)) = -\frac{4}{(1-iv)^4}\,\Theta(z(v)), \tag{3.38} \]
where
\[ z(v) = C^{-1}(v) = \frac{1+iv}{1-iv} \tag{3.39} \]
maps $\mathbb{R}$ to $S^1$ (less $-1$, which represents the "point at infinity"). The class of allowed smearing functions in this picture consists of all $F \in C^\infty(\mathbb{R})$ for which $z \mapsto \tfrac{i}{2}(1+z)^2F(C(z))$ is smooth on $S^1$ [with an appropriate limiting definition at $z = -1$]. As before, we use an integral notation to denote such smearings; thus, for example, the relationship Eq. (3.24) now reads
\[ P = \int T(v)\,dv. \tag{3.40} \]
We may also deduce from axiom B.3 and Eq. (3.38) that $\langle T(v)\rangle_\psi$ decays as $O(v^{-4})$ as $|v| \to \infty$ for $\psi \in D$.

Finally, suppose $\rho \in \mathrm{Diff}_+(S^1)$ fixes the point at infinity, i.e. $\mathring{\rho}(-1) = -1$, and define a reparametrization $v \mapsto V(v)$ of $\mathbb{R}$ implicitly by $z(V(v)) = \mathring{\rho}(z(v))$. Then the transformation law Eq. (3.29) becomes
\[ U(\rho)\,T(v)\,U(\rho)^{-1} = V'(v)^2\,T(V(v)) - \frac{c}{24\pi}\{V,v\}\,\mathbb{1}. \tag{3.41} \]
Here, we have used the chain rule for Schwarz derivatives
\[ \{z,x\} = \Bigl(\frac{dy}{dx}\Bigr)^2\{z,y\} + \{y,x\}, \tag{3.42} \]
where $z = z(y)$, $y = y(x)$, and the fact that the Schwarz derivative of a Möbius transformation vanishes identically, so $\{z(v),v\} = 0$.

The above structure is already enough to encompass an interesting class of theories in Minkowski space: namely, the boundary conformal field theories (see, for example, [55], or [30] for a recent treatment in terms of algebraic quantum field theory). In these theories, there is a single underlying representation $U$ of $\mathrm{Diff}_+(S^1)$ with corresponding stress-energy density $T$, and the theory lives on the right-hand half $x^1 > 0$ of Minkowski space with stress-energy tensor given by Eq. (2.2) where $T_L = T_R = T$. In particular, $T^{01}$ vanishes on the timelike line $x^1 = 0$, reflecting the boundary condition that no energy should flow out of the half-space $x^1 > 0$.

A more general class of theories corresponds to the "moving mirror" models studied in [20] (for the particular case of the massless scalar field). Instead of an inertial boundary $x^1 = 0$, we consider a moving boundary with trajectory $v = p(u)$, where $u = x^0 - x^1$ and $v = x^0 + x^1$ are null coordinates on Minkowski space. The theory is defined on the portion of Minkowski space to the right of this curve, i.e. $v > p(u)$. Restricting, for simplicity, to the case in which $u \mapsto p(u)$ lifts to an element $\rho \in \mathrm{Diff}_+(S^1)$, the stress-energy tensor is again defined by Eq. (2.2), where we now put
\[ T_L(v) = T(v), \qquad T_R(u) = U(\rho)\,T(u)\,U(\rho)^{-1}. \tag{3.43} \]
(Boundary CFT corresponds, of course, to the case $p(u) = u$ and hence $U(\rho) = \mathbb{1}$.) It follows from Eq. (3.41) and the vanishing of $\langle T(v)\rangle_\Omega$ that the energy density in the vacuum state $\Omega$ is then
\[ \langle T_{00}(x^0,x^1)\rangle_\Omega = -\frac{c}{24\pi}\{p,u\} = \frac{c}{12\pi}\,\sqrt{p'(u)}\,\frac{d^2}{du^2}\,\frac{1}{\sqrt{p'(u)}}, \tag{3.44} \]
which reduces to the result of [20] in the case $c = 1$. In fact, the moving mirror spacetime is conformally related to the boundary space-time considered above (under the transformation $(u,v) \mapsto (p(u),v)$) and this dictates the form of Eq. (3.43), together with the boundary condition that $\Omega$ should be the "in" vacuum at past null infinity. It is intended to discuss this more fully elsewhere.

By the Lüscher–Mack theorem (see Sec. 2), conformal field theories on the whole of Minkowski space must have two independent components of stress-energy. We now briefly explain the required modifications to our axioms to permit the description of this situation. There are now two commuting projective unitary representations $U_L$ and $U_R$ of $\mathrm{Diff}_+(S^1)$, each restricting to Möb as a unitary representation. We assume the existence of a unique vacuum vector $\Omega$ invariant under both copies of Möb and assume that the two translation generators $P_L$ and $P_R$ are positive. The domain $D$ is assumed to be invariant under both $U_L$ and $U_R$, and each representation is generated (in the sense of Eq. (3.19)) by corresponding stress-energy densities $\Theta_L$ and $\Theta_R$, each of which obeys the regularity assumptions of axiom B. Each stress-energy density transforms according to Eq. (3.29) (with central charge $c_L$ or $c_R$ as appropriate) under the corresponding representation of $\mathrm{Diff}_+(S^1)$, but is invariant under the adjoint action of the other copy. We also define light-ray fields $T_L$ and $T_R$ in the same way as above, and then define the stress-energy tensor by Eq. (2.2). In particular, one may construct such a theory as a tensor product of two conformal field theories with a single component of stress-energy, but this is by no means the only possibility. Clearly, we could envisage theories with any number of independent components of stress-energy in a similar fashion, but the interpretation as a theory in Minkowski space is no longer clear.

4. Quantum Energy Inequalities in CFT

4.1. Main result

We are now in a position to state our main result. The notation is as in the previous section.

Theorem 4.1. Consider a conformal field theory with a single component $T$ of stress-energy. For any non-negative $G \in \mathscr{S}(\mathbb{R})$, the quantum energy inequality
\[ \int G(v)\,\langle T(v)\rangle_\psi\,dv \ge -\frac{c}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{G(v)}\Bigr)^2 dv \tag{4.1} \]
holds for all $\psi \in D$, where the derivative $\frac{d}{dv}\sqrt{G}$ is defined to be zero for points at which $G$ vanishes:
\[ \frac{d}{dv}\sqrt{G(v)} = \begin{cases} G'(v)/\bigl(2\sqrt{G(v)}\bigr) & G(v) \ne 0, \\ 0 & G(v) = 0. \end{cases} \tag{4.2} \]
Moreover, this bound is sharp: the right-hand side is the infimum of the left-hand side as $\psi$ varies in $D$.

In a conformal field theory with two independent components of stress-energy, both $T_L$ and $T_R$ obey bounds of the above type (with weight functions $G_L, G_R \in \mathscr{S}(\mathbb{R})$) which are simultaneously sharp in the sense that there is a sequence of non-zero vectors $\psi_n \in D$ with
\begin{align*}
\int G_L(v)\,\langle T_L(v)\rangle_{\psi_n}\,dv &\to -\frac{c_L}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{G_L(v)}\Bigr)^2 dv, \\
\int G_R(u)\,\langle T_R(u)\rangle_{\psi_n}\,du &\to -\frac{c_R}{12\pi}\int\Bigl(\frac{d}{du}\sqrt{G_R(u)}\Bigr)^2 du \tag{4.3}
\end{align*}
as $n \to \infty$.

Remarks.

(a) It is proved in Corollary A.2 in the Appendix that the square root $\sqrt{G}$ of a non-negative Schwartz function is in fact a distribution in the Sobolev space $W^1(\mathbb{R})$ (i.e. has square-integrable first derivative) and that the above rule (4.2) for defining its derivative coincides with the usual notion of the distributional (or "weak") derivative of such a distribution. In particular, this formally establishes that the integral representing our QEI bound on the right side of Eq. (4.1) is actually finite even for smearing functions $G$ that are not strictly positive.

(b) As $D$ is a core for any smeared energy density, the QEIs can be stated as operator inequalities, e.g.,
\[ \int G(v)\,T(v)\,dv \ge -\frac{c}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{G(v)}\Bigr)^2 dv\;\mathbb{1} \tag{4.4} \]
by standard quadratic form arguments (see, for example, [44, Theorem X.23]). The fact that QEIs for $T_L$ and $T_R$ are simultaneously sharp is simply the statement that the pair formed by the two bounds in Eq. (4.3) belongs to the joint spectrum of the two operators concerned.

(c) The above results can of course be transformed to give QEIs on the field $\Theta$ on the circle; one can also follow the general strategy given below to derive QEIs based on positivity of $H$ (rather than $P$), which would be more natural in that setting. In addition, the results can be extended to any number of independent stress-energy operators. We will not pursue these directions here.

Proof. The proof is broken down into various stages. We start with the case in which the non-negative function $G$ is smooth and compactly supported, and then extend to the Schwartz class. As mentioned above, the obstruction to a straightforward use of the argument summarized in Sec. 2 is that the equation $V'(v) = 1/G(v)$ does not define a diffeomorphism which can be lifted to the circle. To circumvent this problem, we modify $G$ to a function $H_{\epsilon,n}$ depending upon regulators $\epsilon$ and $n$. The function $H_{\epsilon,n}$ is constructed in such a way that the formal argument given in Sec. 2 holds rigorously, and so that the desired bound is obtained as the regulators are removed. The two regulators have the following effect.
First, we add the constant $\epsilon$ to $G(v)$, thus obtaining a reparametrization of the whole line by $V'(v) = 1/(G(v)+\epsilon)$. Although this reparametrization fixes the point at infinity, it does not lift to a diffeomorphism of the circle as it has a discontinuous second derivative at $z = -1$ (unless $G$ is identically zero). The remedy is to subtract from $G(v)+\epsilon$ a small compactly supported correction, which is translated to the right (and slightly rescaled) as $n$ increases. As noted following Eq. (3.40), $\langle T(v)\rangle_\psi = O(v^{-4})$ as $v \to \infty$ for $\psi \in D$, and we can exploit this decay to control the limit $n \to \infty$. Other approaches to this issue are probably possible.$^{j}$

The construction and properties of $H_{\epsilon,n}$ are summarised by the following lemma, whose proof is deferred to the end of this section.

Footnote j. As we were completing this paper, Carpi and Weiner released a preprint [3] in which they point out that certain nonsmooth smearings of the stress-energy density also yield self-adjoint operators. It is likely that one could use this to find a unitary implementation of the reparametrization of the line defined by $V'(v) = 1/(G(v)+\epsilon)$, removing the need for the second stage of regulation.
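Lemma 4.2. Given a non-negative $G \in C_0^\infty(\mathbb{R})$, let
\[ \lambda_\epsilon = \frac{1}{|\mathrm{supp}\,G|}\int\frac{G(v)}{G(v)+\epsilon}\,dv, \tag{4.5} \]
where $|\mathrm{supp}\,G|$ denotes the Lebesgue measure of the support of $G$. Then $\lambda_\epsilon$ increases as $\epsilon \to 0^+$, with $\lim_{\epsilon\to 0^+}\lambda_\epsilon = 1$. Let $\eta \in C_0^\infty(\mathbb{R})$ obey $0 \le \eta(v) \le 1/2$ for all $v$ and
\[ \int\frac{\eta(v)}{1-\eta(v)}\,dv = |\mathrm{supp}\,G|, \tag{4.6} \]
and set
\[ \eta_{n,\epsilon}(v) = \eta\Bigl(\frac{v-n}{\lambda_\epsilon}\Bigr). \tag{4.7} \]
Then, there exists an $n_0$ such that, for all $n \ge n_0$ and $\epsilon \in (0,1)$,

1. the support of $\eta_{n,\epsilon}$ lies to the right of $\mathrm{supp}\,G$,

2. there is a diffeomorphism $\rho_{n,\epsilon} \in \mathrm{Diff}_+(S^1)$ corresponding to a reparametrization $v \mapsto V_{n,\epsilon}(v)$ of the light-ray with
\[ V_{n,\epsilon}'(v) = \frac{1}{H_{n,\epsilon}(v)}, \tag{4.8} \]
where $H_{n,\epsilon}(v) = G(v) + \epsilon\,(1-\eta_{n,\epsilon}(v))$.

Now let $\psi \in D$ be arbitrary, so $\langle T(v)\rangle_\psi = O(v^{-4})$ as $v \to \infty$ for the reasons mentioned above. Then the formal calculation of Sec. 2 holds rigorously if $H$ is replaced by the function $H_{n,\epsilon}$ given in item (2) of the above lemma, and if $\psi_V$ is replaced by $U(\rho_{n,\epsilon})\psi$. This yields
\[ \int H_{n,\epsilon}(v)\,\langle T(v)\rangle_\psi\,dv \ge -\frac{c}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{H_{n,\epsilon}(v)}\Bigr)^2 dv, \tag{4.9} \]
the required integration by parts being valid because $H_{n,\epsilon}$ is constant outside a compact interval. For $n \ge n_0$, the supports of $G$ and $\eta_{n,\epsilon}$ are disjoint by item (1) of the lemma, so the integral on the right-hand side falls into two pieces:
\begin{align*}
4\int\Bigl(\frac{d}{dv}\sqrt{H_{n,\epsilon}(v)}\Bigr)^2 dv
 &= \int\frac{G'(v)^2}{G(v)+\epsilon}\,dv + \epsilon\int\frac{\eta_{n,\epsilon}'(v)^2}{1-\eta_{n,\epsilon}(v)}\,dv \\
 &= \int\frac{G'(v)^2}{G(v)+\epsilon}\,dv + \frac{\epsilon}{\lambda_\epsilon}\int\frac{\eta'(v)^2}{1-\eta(v)}\,dv. \tag{4.10}
\end{align*}
On the other hand, we have
\[ \int H_{n,\epsilon}(v)\,\langle T(v)\rangle_\psi\,dv = \int G(v)\,\langle T(v)\rangle_\psi\,dv + \epsilon\,\langle P\rangle_\psi - \epsilon\int\eta_{n,\epsilon}(v)\,\langle T(v)\rangle_\psi\,dv. \tag{4.11} \]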
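As $n \to \infty$, $\eta_{n,\epsilon}$ is translated off to infinity, so the last term drops out in the limit owing to the decay of $\langle T(v)\rangle_\psi$ as $v \to \infty$. We therefore have
\[ \int G(v)\,\langle T(v)\rangle_\psi\,dv \ge -\frac{c}{48\pi}\int\frac{G'(v)^2}{G(v)+\epsilon}\,dv - \frac{c\,\epsilon}{48\pi\lambda_\epsilon}\int\frac{\eta'(v)^2}{1-\eta(v)}\,dv - \epsilon\,\langle P\rangle_\psi, \tag{4.12} \]
and the limit $\epsilon \to 0^+$ yields the QEI (4.1), owing to Corollary A.2 in the Appendix and the fact that $\psi$ was an arbitrary element of $D$.

We now turn to the case in which $G$ is a non-negative function of Schwartz class. According to Corollary A.2, $\sqrt{G}$ belongs to the Sobolev space $W^1(\mathbb{R})$. It follows that we may find non-negative $h_k \in C_0^\infty(\mathbb{R})$ with $h_k \to \sqrt{G}$ and $h_k' \to \frac{d}{dv}\sqrt{G}$ in $L^2(\mathbb{R})$ as $k \to \infty$ (the derivative $\frac{d}{dv}\sqrt{G}$ being understood in the sense of distributions). Thus for each $\psi \in D$ and $k$, we have
\[ \int\langle T(v)\rangle_\psi\,h_k(v)^2\,dv \ge -\frac{c}{12\pi}\int h_k'(v)^2\,dv. \tag{4.13} \]
In the limit $k \to \infty$, the right-hand side clearly converges to $-\frac{c}{12\pi}\int\bigl(\frac{d}{dv}\sqrt{G}\bigr)^2 dv$, while the left-hand side converges to $\int\langle T(v)\rangle_\psi\,G(v)\,dv$ because $\langle T(v)\rangle_\psi$ is bounded in $v$. The QEI (4.1) therefore holds for all non-negative $G \in \mathscr{S}(\mathbb{R})$.

To show that the bound is sharp, we employ another lemma.

Lemma 4.3. If $F \in \mathscr{S}(\mathbb{R})$ and $G \in C_0^\infty(\mathbb{R})$ are non-negative, then
\[ \inf_{\psi\in D}\int F(v)\,\langle T(v)\rangle_\psi\,dv \le -\frac{c}{12\pi}\int\frac{d}{dv}\Bigl(\frac{F(v)}{\sqrt{G(v)+\epsilon}}\Bigr)\,\frac{d}{dv}\sqrt{G(v)+\epsilon}\;dv. \tag{4.14} \]

Proof. Using the notation of Lemma 4.2, let $n > n_0$ and $\epsilon > 0$, and define $\psi_{n,\epsilon} = U(\rho_{n,\epsilon})^{-1}\Omega$ in terms of $G$. Since $\langle T(V_{n,\epsilon}(v))\rangle_\Omega$ vanishes identically, the transformation law in Eq. (3.41) gives
\begin{align*}
\langle T(v)\rangle_{\psi_{n,\epsilon}} &= -\frac{c}{24\pi}\{V_{n,\epsilon},v\} = \frac{c}{12\pi}\,\frac{1}{\sqrt{H_{n,\epsilon}(v)}}\,\frac{d^2}{dv^2}\sqrt{H_{n,\epsilon}(v)} \\
 &= \frac{c}{12\pi}\Bigl[\frac{1}{\sqrt{G(v)+\epsilon}}\,\frac{d^2}{dv^2}\sqrt{G(v)+\epsilon} + \frac{1}{\sqrt{1-\eta_{n,\epsilon}(v)}}\,\frac{d^2}{dv^2}\sqrt{1-\eta_{n,\epsilon}(v)}\Bigr] \tag{4.15}
\end{align*}
because $G$ and $\eta_{n,\epsilon}$ have disjoint supports. Note that the effect of increasing $n$ is merely to translate the final term to the right. This term therefore vanishes in the limit $n \to \infty$ when we integrate against $F$, because it is pushed off into the tail of $F$. Thus, we have
\[ \lim_{n\to\infty}\int F(v)\,\langle T(v)\rangle_{\psi_{n,\epsilon}}\,dv = \frac{c}{12\pi}\int\frac{F(v)}{\sqrt{G(v)+\epsilon}}\,\frac{d^2}{dv^2}\sqrt{G(v)+\epsilon}\;dv \tag{4.16} \]
and Eq. (4.14) is obtained after integration by parts.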
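The second equality in Eq. (4.15) rests on the identity $-\tfrac{1}{2}\{V,v\} = H^{-1/2}\,(\sqrt{H})''$ for any reparametrization with $V' = 1/H$. A minimal numerical sketch of this identity follows; the weight $H(v) = 2 + \sin v$ is a hypothetical stand-in chosen only because it is smooth and positive (it is not one of the functions $H_{n,\epsilon}$ of the lemma), and the standard convention for the Schwarz derivative is assumed.

```python
import math

def H(v):
    """Hypothetical smooth positive weight, used for illustration only."""
    return 2.0 + math.sin(v)

def schwarzian_of_V(v):
    """{V, v} for V' = 1/H, from analytic derivatives of H(v) = 2 + sin(v)."""
    h, h1, h2 = H(v), math.cos(v), -math.sin(v)
    V1 = 1.0 / h                              # V'
    V2 = -h1 / h**2                           # V''
    V3 = -h2 / h**2 + 2.0 * h1**2 / h**3      # V'''
    return V3 / V1 - 1.5 * (V2 / V1) ** 2

def sqrtH_form(v, step=1e-3):
    """(1/sqrt(H)) d^2/dv^2 sqrt(H), second derivative by central differences."""
    s = lambda x: math.sqrt(H(x))
    second = (s(v + step) - 2.0 * s(v) + s(v - step)) / step**2
    return second / s(v)

for v in (-2.0, -0.3, 0.0, 1.1, 4.5):
    assert abs(-0.5 * schwarzian_of_V(v) - sqrtH_form(v)) < 1e-5
```

Multiplying either side by $c/(12\pi)$ gives the energy density of the state $U(\rho)^{-1}\Omega$ in the form used in the proof.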
Now suppose that $G$ is a non-negative Schwartz-class function and set $G_n(v) = \chi(v/n)\,G(v)$, where $\chi \in C_0^\infty(\mathbb{R})$, $0 \le \chi(x) \le 1$ and $\chi(x) = 1$ for $|x| \le 1$. One may verify that
\[ \lim_{m\to\infty}\frac{d}{dv}\,\frac{G(v)}{\sqrt{G_m(v)+\epsilon}} = \frac{d}{dv}\sqrt{G(v)+\epsilon} = \lim_{m\to\infty}\frac{d}{dv}\sqrt{G_m(v)+\epsilon} \tag{4.17} \]
in $L^2(\mathbb{R})$. Applying Lemma 4.3 with $F$ and $G$ replaced by $G$ and $G_m$ respectively, these limits and the continuity of the right-hand side of Eq. (4.14) in both factors (it is effectively an $L^2$-inner product) yield
\[ \inf_{\psi\in D}\int G(v)\,\langle T(v)\rangle_\psi\,dv \le -\frac{c}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{G(v)+\epsilon}\Bigr)^2 dv. \tag{4.18} \]
On taking $\epsilon \to 0^+$, we conclude that the bound Eq. (4.1) is sharp.

Turning to conformal field theories with two independent components of stress-energy, it is immediate from the above that both $T_L$ and $T_R$ satisfy QEIs of the form required. That the bounds are simultaneously sharp follows from the fact that each stress-energy density transforms under its corresponding copy of $\mathrm{Diff}_+(S^1)$, but is invariant under the other copy. Thus, the construction used to establish sharpness of the QEI (4.1) may be adapted in a straightforward fashion to prove Eq. (4.3). This concludes the proof of our main Theorem 4.1.

It remains to establish the lemma used above.

Proof of Lemma 4.2. It is clear (e.g., by monotone convergence) that $\lambda_\epsilon$ increases to unity as $\epsilon \to 0^+$. Thus, the support of $\eta_{n,\epsilon}$ will lie to the right of $\mathrm{supp}\,G$ for all $n$ greater than some $n_0$ and all $\epsilon \in (0,1)$. We define
\[ V_{n,\epsilon}(v) = \int_0^v\frac{dv'}{H_{n,\epsilon}(v')}, \tag{4.19} \]
which evidently satisfies Eq. (4.8) and, as it is smooth and strictly increasing with $\lim_{v\to\pm\infty}V_{n,\epsilon}(v) = \pm\infty$, gives a diffeomorphism of $\mathbb{R}$. We wish to see that this diffeomorphism can be extended to the circle. Suppose the support of $G$ is contained within $[-R,R]$ for some $R > 0$ and that $n > n_0$. Then, for $v < -R$ we have
\[ V_{n,\epsilon}(v) = \frac{v}{\epsilon} + \alpha, \tag{4.20} \]
where
\[ \alpha = \frac{R}{\epsilon} + \int_0^{-R}\frac{dv}{G(v)+\epsilon}. \tag{4.21} \]
Now, choose $S$ to the right of $\mathrm{supp}\,\eta_{n,\epsilon}$, so $\mathrm{supp}\,\eta_{n,\epsilon} \subset (R,S)$. Then, for $v > S$ we have
\[ V_{n,\epsilon}(v) = \frac{v}{\epsilon} - \frac{S}{\epsilon} + \int_0^S\frac{dv'}{G(v')+\epsilon\,(1-\eta_{n,\epsilon}(v'))} = \frac{v}{\epsilon} + \alpha, \tag{4.22} \]
which follows after a small amount of calculation using the definitions of $\eta$ and $\lambda_\epsilon$.
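The "small amount of calculation" behind Eq. (4.22) can be spelled out as follows. This is our own filling-in of the step, using only Eqs. (4.5)–(4.7), the form $H_{n,\epsilon} = G + \epsilon(1-\eta_{n,\epsilon})$, and the disjointness of the supports of $G$ and $\eta_{n,\epsilon}$. The two constants in Eqs. (4.20) and (4.22) agree precisely when $\int_{-R}^{S}\bigl(1/H_{n,\epsilon} - 1/\epsilon\bigr)\,dv = 0$, and the integrand vanishes wherever $H_{n,\epsilon} = \epsilon$.

```latex
\begin{align*}
\int_{\operatorname{supp} G}\Bigl(\frac{1}{G(v)+\epsilon}-\frac{1}{\epsilon}\Bigr)\,dv
  &= -\frac{1}{\epsilon}\int\frac{G(v)}{G(v)+\epsilon}\,dv
   = -\frac{\lambda_\epsilon\,|\operatorname{supp} G|}{\epsilon}, \\
\int_{\operatorname{supp}\eta_{n,\epsilon}}
   \Bigl(\frac{1}{\epsilon\,(1-\eta_{n,\epsilon}(v))}-\frac{1}{\epsilon}\Bigr)\,dv
  &= \frac{1}{\epsilon}\int\frac{\eta_{n,\epsilon}(v)}{1-\eta_{n,\epsilon}(v)}\,dv
   = \frac{\lambda_\epsilon}{\epsilon}\int\frac{\eta(w)}{1-\eta(w)}\,dw
   = \frac{\lambda_\epsilon\,|\operatorname{supp} G|}{\epsilon},
\end{align*}
% second line: substitute v = n + \lambda_\epsilon w, then use Eq. (4.6)
```

The two contributions cancel, which is exactly why $\eta_{n,\epsilon}$ is rescaled by $\lambda_\epsilon$ in Eq. (4.7) and normalized as in Eq. (4.6).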
Thus $v \mapsto V_{n,\epsilon}(v)$ differs from the Möbius transformation $v \mapsto v/\epsilon + \alpha$ only on a compact set and may therefore be lifted to $\rho_{n,\epsilon} \in \mathrm{Diff}_+(S^1)$ defined by $\rho_{n,\epsilon}(\theta) = 2\tan^{-1}V_{n,\epsilon}(\tan\tfrac{1}{2}\theta)$ for $\theta \in (-\pi,\pi)$ and extended to other values by continuity and Eq. (3.2).

4.2. Applications

We now use Theorem 4.1 to give various useful QEI bounds for conformal field theories (on two-dimensional Minkowski space).

4.2.1. Worldline bounds

Consider a smooth curve $\lambda \mapsto \gamma^\mu(\lambda)$ in Minkowski space, and set $u = \gamma^0 - \gamma^1$, $v = \gamma^0 + \gamma^1$. It is straightforward to show that
\[ \rho_\gamma(\lambda) := T_{\mu\nu}(\gamma(\lambda))\,\dot{\gamma}^\mu(\lambda)\,\dot{\gamma}^\nu(\lambda) = T_R(u(\lambda))\,\dot{u}(\lambda)^2 + T_L(v(\lambda))\,\dot{v}(\lambda)^2. \tag{4.23} \]
To avoid technicalities, let us assume that our curve $\gamma$ is either timelike or spacelike, with no endpoints. The curve can then be parametrized by proper time (resp. proper distance) $\lambda$ ranging from $-\infty$ to $+\infty$, and we assume this has been done. We assume furthermore that both $\dot{u}(\lambda)$ and $\dot{v}(\lambda)$ are bounded away from zero on the parameter range of the curve (i.e. greater than or equal to some fixed $\varepsilon > 0$), meaning that the curve does not become null asymptotically. We also restrict consideration to curves that do not "wiggle" too rapidly by assuming moreover that all derivatives of $\dot{u}(\lambda)$ and $\dot{v}(\lambda)$ vanish faster than polynomially. Our assumptions imply that the functions $u(\lambda)$ and $v(\lambda)$ can be inverted with smooth inverses $\lambda(u)$, resp. $\lambda(v)$, the derivatives of which are Schwartz functions. Let $G$ be a smooth, non-negative Schwartz function. Our assumptions then ensure that the smearing functions $G_R(u) = G(\lambda(u))$ and $G_L(v) = G(\lambda(v))$ and consequently $G_R(u)\,|d\lambda(u)/du|^{-1}$ and $G_L(v)\,|d\lambda(v)/dv|^{-1}$ are in the Schwartz class. Thus, using the simultaneously sharp QEIs for both left- and right-moving stress-energy densities, we obtain the worldline QEI
\[ \inf_{\psi\in D}\int\langle\rho_\gamma(\lambda)\rangle_\psi\,G(\lambda)\,d\lambda = -\frac{c_R}{12\pi}\int\Bigl(\frac{d}{du}\sqrt{\frac{G_R(u)}{|d\lambda(u)/du|}}\Bigr)^2 du - \frac{c_L}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{\frac{G_L(v)}{|d\lambda(v)/dv|}}\Bigr)^2 dv, \tag{4.24} \]
where the integrands on the right-hand side are set to zero for points such that $G_L$, resp. $G_R$, vanish. This bound can be generalized to smooth parametrized curves $\gamma^\mu$ satisfying less stringent conditions, but we will not go into this here. We only remark that we may also obtain a bound for the affinely parametrized left-moving null ray $u = \lambda$, $v = \text{const.}$ for any non-negative $G(\lambda)$ in the Schwartz class. In that case, $\rho_\gamma = T_R$ and the worldline bound is given by the QEI bound for the right-moving stress tensor (with $G_R = G$) given in our theorem. A similar statement holds of course also for the right-moving light ray. In general, therefore, averages of the null-contracted stress-energy density $T_{\mu\nu}k^\mu k^\nu$ are bounded below along an affinely parametrized null line with tangent $k^\mu$. As noted in [15], no other component of the stress tensor can be bounded below along such a curve because all other components involve $T_R$ or $T_L$ evaluated at a single point and are therefore not averaged.

For the case of a static worldline parametrized by proper time, $\gamma^0 = x^0$, $\gamma^1 = x^1 = \text{constant}$, we find:
\[ \inf_{\psi\in D}\int\langle T_{00}(x^0,x^1)\rangle_\psi\,G(x^0)\,dx^0 = -\frac{c_L+c_R}{12\pi}\int\Bigl(\partial_0\sqrt{G(x^0)}\Bigr)^2 dx^0, \tag{4.25} \]
which reduces to Flanagan's bound [14] for the massless scalar field ($c_L = c_R = 1$) and Vollick's bound [52] for the massless (complex) Dirac field, which also has $c_L = c_R = 1$. [The Majorana field has $c_L = c_R = 1/2$ and a correspondingly tighter bound.]

It is worth noting a feature of conformal quantum field theories in two dimensions: namely that one can obtain a (non-trivial) worldline quantum energy inequality even along spacelike or null curves. This can be traced back to the fact that one is free to interchange the role of space and time in two-dimensional conformal field theories (by "turning Minkowski space on its side") as far as the stress-tensor is concerned. Neither is possible in any other dimension [17, 11] (even for free scalar fields), nor for non-conformally invariant field theories in two dimensions. In those cases, we expect however that there are still bounds that hold for space-time averages of the stress tensor, to which we now turn.
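To get a feel for the size of the static worldline bound in Eq. (4.25), one can evaluate its right-hand side for a concrete weight. The sketch below uses the unnormalized Gaussian $G(t) = e^{-t^2/\tau^2}$, which is our choice of example and not one made in the text; for it, $\int(\partial_0\sqrt{G})^2\,dt = \sqrt{\pi}/(2\tau)$, so the bound is $-(c_L+c_R)/(24\sqrt{\pi}\,\tau)$. The function names and the quadrature parameters are illustrative assumptions.

```python
import math

def qei_bound_static(c_left, c_right, tau, half_width=12.0, samples=20001):
    """Right-hand side of Eq. (4.25) for G(t) = exp(-t**2 / tau**2), trapezoidal rule."""
    start = -half_width * tau
    dt = 2.0 * half_width * tau / (samples - 1)
    total = 0.0
    for k in range(samples):
        t = start + k * dt
        # d/dt sqrt(G(t)) for the Gaussian weight
        root_g_prime = -(t / tau**2) * math.exp(-t**2 / (2.0 * tau**2))
        weight = 0.5 if k in (0, samples - 1) else 1.0
        total += weight * root_g_prime**2 * dt
    return -(c_left + c_right) / (12.0 * math.pi) * total

def qei_bound_closed_form(c_left, c_right, tau):
    """Closed form of the same quantity for the Gaussian weight."""
    return -(c_left + c_right) / (24.0 * math.sqrt(math.pi) * tau)

# Massless scalar field (c_L = c_R = 1) sampled over a unit timescale.
scalar = qei_bound_static(1.0, 1.0, 1.0)
assert abs(scalar - qei_bound_closed_form(1.0, 1.0, 1.0)) < 1e-9
```

The bound scales as $1/\tau$, so longer sampling times permit less negative averaged energy density, and halving the central charges (as for the Majorana field) halves the magnitude of the bound.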
4.2.2. Worldvolume bounds

Let $f^{\mu\nu}$ be a smooth tensor field whose components (with respect to global inertial coordinates) are Schwartz class. Then,
\[ \int T_{\mu\nu}f^{\mu\nu}(x^0,x^1)\,dx^0\,dx^1 = \int T_R(u)\,F_R(u)\,du + \int T_L(v)\,F_L(v)\,dv, \tag{4.26} \]
where the null averages, $F_L$ and $F_R$, are given by
\[ F_R(u) = \int f^{uu}(u,v)\,dv, \qquad F_L(v) = \int f^{vv}(u,v)\,du \tag{4.27} \]
with $f^{uu}$ and $f^{vv}$ being appropriate components in $(u,v)$-coordinates, related to the components in $(x^0,x^1)$-coordinates by
\[ f^{uu} = f^{00} + f^{11} - f^{01} - f^{10}, \qquad f^{vv} = f^{00} + f^{11} + f^{01} + f^{10}. \tag{4.28} \]
If $f^{\mu\nu}$ has non-negative null averages,$^{k}$ then we have the worldvolume QEI
\[ \inf_{\psi\in D}\int\langle T_{\mu\nu}f^{\mu\nu}(x^0,x^1)\rangle_\psi\,dx^0\,dx^1 = -\frac{c_L}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{F_L(v)}\Bigr)^2 dv - \frac{c_R}{12\pi}\int\Bigl(\frac{d}{du}\sqrt{F_R(u)}\Bigr)^2 du, \tag{4.29} \]
where the integrands on the right-hand side are as usual defined to be zero for points $v$ (resp. $u$) where $F_L(v)$ (resp. $F_R(u)$) vanishes. In particular, if $s^\mu$ and $t^\nu$ are Schwartz-class timelike vector fields, $f^{\mu\nu} = s^\mu t^\nu$ obeys the above condition and so we obtain a quantum dominated energy inequality (QDEI).

Footnote k. This follows of course in particular if $f^{\mu\nu}$ satisfies the conditions $f^{uu}, f^{vv} \ge 0$ pointwise.

4.2.3. Moving mirrors and boundary CFT

As a variation on the foregoing results, let us consider a moving mirror model, with central charge $c$, living in the portion $v > p(u)$ of Minkowski space, where $u \mapsto p(u)$ lifts to some $\rho \in \mathrm{Diff}_+(S^1)$. As described in Sec. 3.3, the left- and right-moving components of the stress-energy density are given in terms of a single field $T$ by the relations $T_L(v) = T(v)$, $T_R(u) = U(\rho)\,T(u)\,U(\rho)^{-1}$. If $f^{\mu\nu}$ is a smooth tensor field compactly supported in $v > p(u)$, then Eq. (4.26) and the transformation law (3.41) entail
\[ \int T_{\mu\nu}f^{\mu\nu}(x^0,x^1)\,dx^0\,dx^1 = \int T(v)\,G(v)\,dv - \frac{c}{24\pi}\int\{p,u\}\,F_R(u)\,du, \tag{4.30} \]
where
\[ G(v) = F_L(v) + p'(p^{-1}(v))\,F_R(p^{-1}(v)), \tag{4.31} \]
and an obvious change of variables has also been employed. Thus, we have the modified worldvolume QEI
\[ \inf_{\psi\in D}\int\langle T_{\mu\nu}f^{\mu\nu}(x^0,x^1)\rangle_\psi\,dx^0\,dx^1 = -\frac{c}{12\pi}\int\Bigl(\frac{d}{dv}\sqrt{G(v)}\Bigr)^2 dv - \frac{c}{24\pi}\int\{p,u\}\,F_R(u)\,du, \tag{4.32} \]
in which the last term relates to the stress-energy density created by the motion of the mirror. If the support of $f^{\mu\nu}$ is such that the supports of $F_L$ and $F_R\circ p^{-1}$ (i.e. the two "null projections" of $f^{\mu\nu}$ onto the mirror trajectory) are disjoint, the first term in the above bound splits into terms involving $F_L$ and $F_R$ separately. The term in $F_R$ may be recombined with the final term in Eq. (4.32), leading to the same overall result as in Eq. (4.29). This is to be expected on grounds of locality, as measurements in (a diamond neighborhood of) the support of $f^{\mu\nu}$ should be unaware of the presence of the boundary. (See also [30] for a detailed discussion of boundary CFT in which these ideas appear.)
4.2.4. Unweighted averages

Finally, we discuss unweighted averages of the stress-energy tensor along portions of a worldline $\gamma$. First, let us note that, if $\gamma$ is an infinite straight line (with $\dot{u}$ and $\dot{v}$ constant) then
\[ \int\langle\rho_\gamma(\lambda)\rangle_\psi\,d\lambda \ge 0 \tag{4.33} \]
for all $\psi \in D$, because the left-hand side is simply a weighted sum of $\langle P_L\rangle_\psi$ and $\langle P_R\rangle_\psi$ with positive coefficients. Accordingly, conformal field theories in Minkowski space obey the averaged weak energy condition and the averaged null energy condition.

However, unweighted averaging along a bounded, or even semi-infinite, portion of such a worldline leads to very different results. For simplicity, we consider a theory with only one independent component of stress-energy, and averaging over $(-\infty,0)$, but it is easy to extend these arguments. We begin by constructing a particular family of states as follows. Let $f \in C_0^\infty((-1,1))$ obey $f(v) > -1$ and $\int f(v)\,dv = 0$, and suppose $f$ is not identically zero on $(-1,0)$. Then, the map $v \mapsto V(v)$ defined by
\[ V(v) = v + \int_{-1}^{v}f(v')\,dv' \tag{4.34} \]
is a diffeomorphism of the line which lifts to some element $\rho \in \mathrm{Diff}_+(S^1)$ (as it agrees with the identity outside a compact interval). If $f$ obeys, additionally,
\[ -1 \le \frac{d^2}{dv^2}\,\frac{1}{\sqrt{1+f(v)}} \le 0 \tag{4.35} \]
for $v \in (-1,0)$, then $\{V,v\} \ge 0$ on this interval, and no conflict need arise with our previous assumptions because the left-hand side inequality ensures that $\int_{-1}^{0}f(v)\,dv < 1$. Moreover $\{V,v\}$ must be strictly positive on some open subset of $(-1,0)$, since $f$ is not identically zero there. Owing to the identity
\[ \int\frac{\{V,v\}}{\sqrt{V'(v)}}\,dv = -2\int\frac{d^2}{dv^2}\,\frac{1}{\sqrt{V'(v)}}\,dv = 0, \tag{4.36} \]
it follows that $\{V,v\}$ is strictly negative on some open subset of $(0,1)$ (note that $\{V,v\}$ is supported in $(-1,1)$).

With the above assumptions in force, we may use the resulting diffeomorphism to create a vector state $\psi = U(\rho)^{-1}\Omega$ by acting on the vacuum. The corresponding energy density,
\[ \langle T(v)\rangle_\psi = -\frac{c}{24\pi}\{V,v\}, \tag{4.37} \]
is smooth and compactly supported in $(-1,1)$, non-positive for $v \le 0$, and strictly negative (resp. positive) on some open subset of $(-1,0)$ (resp. $(0,1)$). In particular,
\[ \int_{-\infty}^{0}\langle T(v)\rangle_\psi\,dv = -\frac{c}{24\pi}\int_{-\infty}^{0}\{V,v\}\,dv < 0. \tag{4.38} \]
July 6, 2005 12:21 WSPC/148-RMP
602
J070-00240
C. J. Fewster & S. Hollands
We now consider the family of states obtained by scaling $\psi$, namely $\psi_\lambda = U(D_\lambda)^{-1}\psi$, for which
$$\int_{-\infty}^{0} \langle T(v) \rangle_{\psi_\lambda}\,dv = -\frac{c\lambda}{24\pi} \int_{-\infty}^{0} \{V, v\}\,dv \to -\infty \tag{4.39}$$
as $\lambda \to \infty$. The reason for this is that the negative energy density becomes more and more sharply peaked near zero under the dilations, with magnitude growing like $\lambda^2$ and support shrinking as $\lambda^{-1}$. Thus, we have shown explicitly that sharp averages of the stress-energy density are not subjected to QEI restrictions. A related result holds for general quantum fields with mass-gap in two dimensions, as shown by Verch [51, Proposition 3.1].

However, there is no contradiction between this observation and the QEIs proved above. An average taken against a weight function $G \in C_0^\infty(-\infty, 0)$ in the vector states $\psi_\lambda$ would in fact tend to zero as $\lambda \to \infty$ because the negative peak eventually leaves the support of $G$. If one used a weight function which did not vanish at the origin, its support would spill over into the right-hand half line and sense the energy density there. However, the family of states $\psi_\lambda$ also has an increasingly sharply peaked positive energy density within the interval $(0, \lambda^{-1})$, which must at least compensate for the negative contribution (because $\int \langle T(v) \rangle_{\psi_\lambda}\,dv$ is non-negative). It is the competition between these two differently weighted contributions which permits the QEI to hold.

To emphasize the point, let us consider averages over half the light-ray, but with a smoothed-off end. Let $G$ be a non-negative, smooth and compactly supported function, which equals unity in a neighborhood of the origin. Define a sequence of smooth functions $G_n$ by
$$G_n(v) = \vartheta(-v)\,G(v/n) + \vartheta(v)\,G(v), \tag{4.40}$$
where $\vartheta$ is the Heaviside function (and we take $\vartheta(0) = 1/2$). As $n \to \infty$, these functions approach $H(v) = \vartheta(-v) + \vartheta(v)G(v)$, which is supported on a half-line and has a smoothed off end. Now for any non-zero $\psi \in \mathcal{D}$, we have
$$\int \langle T(v) \rangle_\psi\, G_n(v)\,dv \ge -\frac{c}{12\pi} \int \left( \frac{d}{dv} \sqrt{G_n(v)} \right)^2 dv \tag{4.41}$$
$$= -\frac{c}{12\pi} \int \left( \frac{\vartheta(-v)}{n} + \vartheta(v) \right) \left( \frac{d}{dv} \sqrt{G(v)} \right)^2 dv \tag{4.42}$$
for each $n$. Taking $n \to \infty$ and using the fact that $\langle T(v) \rangle_\psi$ decays as $O(v^{-4})$ (by the remark following Eq. (3.40)), we obtain
$$\int \langle T(v) \rangle_\psi\, H(v)\,dv \ge -\frac{c}{12\pi} \int_0^\infty \left( \frac{d}{dv} \sqrt{G(v)} \right)^2 dv \tag{4.43}$$
for arbitrary $\psi \in \mathcal{D}$. As expected, the bound depends only on the way the averaging is rounded off.
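The step from Eq. (4.41) to Eq. (4.42) is a change of variables on the negative half-line: stretching G by a factor n divides the integral of the squared derivative of its square root by n. The following minimal sketch illustrates this numerically. The particular weight G (a smooth plateau built from a standard smooth step), the grid size and the finite-difference step are assumptions made for illustration only.

```python
import math


def smooth_step(x):
    """Smooth step: 0 for x <= 0, 1 for x >= 1, increasing in between."""
    def h(t):
        return math.exp(-1.0 / t) if t > 0.0 else 0.0
    return h(x) / (h(x) + h(1.0 - x))


def G(v):
    """Sample weight (assumption): equals 1 on [-1, 1] and vanishes for |v| >= 2."""
    return smooth_step(2.0 - abs(v))


def G_n(v, n):
    """The smoothed half-line weight of Eq. (4.40)."""
    return G(v / n) if v < 0.0 else G(v)


def sqrt_energy(weight, a, b, steps=20000, delta=1e-5):
    """Midpoint rule for the integral over (a, b) of (d/dv sqrt(weight(v)))^2."""
    dv = (b - a) / steps
    total = 0.0
    for k in range(steps):
        v = a + (k + 0.5) * dv
        slope = (math.sqrt(weight(v + delta)) - math.sqrt(weight(v - delta))) / (2.0 * delta)
        total += slope * slope * dv
    return total


left_1 = sqrt_energy(G, -2.0, 0.0)   # contribution of v < 0 for n = 1
right = sqrt_energy(G, 0.0, 2.0)     # contribution of v > 0, independent of n
for n in (1, 2, 4, 8):
    left_n = sqrt_energy(lambda v: G_n(v, n), -2.0 * n, 0.0)
    print(n, left_n, left_1 / n, left_n + right)
```

In each printed row the second and third columns should agree, and the last column (the full integral appearing in Eq. (4.41)) should approach the n-independent right-hand contribution as n grows, in line with Eq. (4.43).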
Quantum Energy Inequalities in Two-Dimensional Conformal Field Theory
5. Highest-Weight Virasoro Representations

In this section, we describe how CFT models satisfying our axioms may be constructed by taking direct sums of unitary highest-weight representations of the Virasoro algebra. In particular, this demonstrates that our QEI applies to so-called minimal models and to rational conformal field theories. As part of our discussion we will need to consider the unitary multiplier representations of $\widetilde{\mathrm{Diff}}_+(S^1)$ carried by any such Virasoro representation; in particular, we need to show that the representation can be normalized so that the multiplier is of the Bott form assumed in Axiom A.1. We have not found this elsewhere in the literature.
5.1. Highest-weight representations of the Virasoro algebra

We recall that the Virasoro algebra is generated by elements $L_n$ ($n \in \mathbb{Z}$) and a central element $\kappa$, obeying the relations
$$[L_m, L_n] = (m - n) L_{m+n} + \frac{1}{12} m (m^2 - 1)\, \delta_{m+n,0}\, \kappa \qquad (m, n \in \mathbb{Z}) \tag{5.1}$$
and $[\kappa, L_m] = 0$ for all $m \in \mathbb{Z}$. A unitary highest-weight representation amounts to the specification of a pair $(c, h)$ of real constants, a Hilbert space $\mathcal{H}_{(c,h)}$, a dense domain $\mathcal{D}_0 \subset \mathcal{H}_{(c,h)}$, a vector $|h\rangle \in \mathcal{D}_0$, and operators $L_n$ ($n \in \mathbb{Z}$) defined on $\mathcal{D}_0$ such that:
(1) $L_0 |h\rangle = h |h\rangle$ and $L_n |h\rangle = 0$ for $n > 0$,
(2) $\mathcal{D}_0$ coincides with the set of vectors obtained from $|h\rangle$ by acting with polynomials in the $L_n$ with $n < 0$ (including the trivial polynomial $\mathbb{1}$),
(3) $L_n^* = L_{-n}$ on $\mathcal{D}_0$ and Eq. (5.1) holds as an identity on $\mathcal{D}_0$ with $\kappa = c\mathbb{1}$.

Such representations are irreducible; moreover, the "highest weight" $(c, h)$ is restricted to particular values first classified in [22, 24]. (See, for example, [46, Theorems 6.17(3) and 6.13].) However, we will not need the precise details of this classification beyond the fact that both $c$ and $h$ are non-negative, which follows immediately from the observation that $0 \le \| L_{-n} |h\rangle \|^2 = 2nh + n(n^2 - 1)c/12$ for all $n \ge 1$ as a consequence of Eq. (5.1).

In the course of our analysis, we will need more detailed information on the domain of definition of the $L_n$ and various other operators. Our first observation is that, by virtue of the Virasoro relations, $\mathcal{D}_0$ contains an orthonormal basis of $L_0$-eigenvectors. Indeed, this follows by the Gram–Schmidt process applied to vectors of the form $L_{-n_1} L_{-n_2} \cdots L_{-n_k} |h\rangle$ (for $n_1, \ldots, n_k > 0$), which are $L_0$-eigenvectors with eigenvalue $h + n_1 + n_2 + \cdots + n_k$. Thus $L_0$ is essentially self-adjoint on $\mathcal{D}_0$ and we will use $L_0$ from now on to denote the unique self-adjoint extension of this operator, writing $D(L_0)$ for its domain. The above remarks also show that $L_0$ is a positive operator, with spectrum contained in $h + \mathbb{N}_0$ and finite-dimensional eigenspaces.

Secondly, estimates obtained by Goodman and Wallach [25]$^{l}$ entail that
$$\| L_n \psi \| \le C (1 + |n|)^{3/2} \| L_0 \psi \| \tag{5.2}$$
for all $\psi \in \mathcal{D}_0$ and $n \in \mathbb{Z}$, where the constant $C$ is determined by the central charge and is independent of both $\psi$ and $n$. Accordingly, the $L_n$ may be extended uniquely to $D(L_0)$, and we now use $L_n$ to denote these extensions. The relation $L_n = L_{-n}^*$ continues to hold, and the Virasoro relations hold as identities on $D(L_0^2)$. A further consequence is that the formula
$$\Theta(z) = -\frac{1}{2\pi} \sum_{n \in \mathbb{Z}} z^{-n-2} L_n, \tag{5.3}$$
defines $\Theta(\cdot)\psi$ as a vector-valued distribution on $C^\infty(S^1)$ for each $\psi \in D(L_0)$. Furthermore,
$$\Theta(f)^* = \Theta(\Gamma f) \tag{5.4}$$
on $D(L_0)$ for $f \in C^\infty(S^1)$. In particular, if $\Gamma f = f$ (i.e. $f \in C_\Gamma^\infty(S^1)$), then $\Theta(f)$ is symmetric on $D(L_0)$ and an application of Nelson's commutator theorem [44, Theorem X.37]$^{m}$ shows that $\Theta(f)$ is essentially self-adjoint on any core of $L_0$. Henceforth, we will use $\Theta(f)$ to denote the unique self-adjoint extension. It is easy to verify that the $\Theta(f)$ defined in this way obey the commutation relations Eq. (3.28) on $D(L_0^2)$.

Finally, let us define the space $\mathcal{H}^\infty$ to be the intersection $\mathcal{H}^\infty = \bigcap_{n \in \mathbb{N}_0} D(L_0^n)$, equipped with the Fréchet topology induced by the seminorms $\psi \mapsto \| L_0^n \psi \|$ ($n \in \mathbb{N}_0$). As $\mathcal{D}_0 \subset D(L_0^n)$ for each $n$, it follows that $\mathcal{H}^\infty$ is dense in $\mathcal{H}$ and is a core for $L_0$.

5.2. Integration to a unitary representation of $\widetilde{\mathrm{Diff}}_+(S^1)$
We now need to demonstrate that $\Theta$ generates a unitary multiplier representation of $G = \widetilde{\mathrm{Diff}}_+(S^1)$ as in Axiom A.1 and Eq. (3.19). The relevant results are all present in the literature, but do not appear to have been assembled in this form before. Explicit control of the multiplier is necessary when we come to assemble Virasoro representations to form more general CFT models below: the direct sum of two projective representations is not generally a projective representation!

Let $\mathrm{U}_{(c,h)}$ be the group of unitary operators on $\mathcal{H}_{(c,h)}$ and let $\mathrm{PU}_{(c,h)}$ be the projective unitary group (i.e. unitaries modulo phases) $\mathrm{PU}_{(c,h)} = \mathrm{U}_{(c,h)}/\mathbb{T}$. In the following, we distinguish unitary multiplier representations (which take values in $\mathrm{U}_{(c,h)}$) from projective unitary representations (which take values in $\mathrm{PU}_{(c,h)}$). As shown by Goodman and Wallach [25]$^{n}$ and Toledano Laredo [49], $\mathcal{H}_{(c,h)}$ carries a projective unitary representation $U$ of $G$, so the remaining problem is to assign phases in such a way that Axiom A.1 and Eqs. (3.18) and (3.19) are satisfied.

$^{l}$ See [2] for related bounds.
$^{m}$ In the notation of [44], set $A = \Theta(f)$, $N = L_0 + \mathbb{1}$ and $D = \mathcal{D}_0$, for example.
$^{n}$ In fact [25] addresses $\mathrm{Diff}_+(S^1)$ rather than its universal cover.

It is helpful (and standard) to rephrase this problem in a geometric fashion. Let $\hat{G}$ be the subgroup of $G \times \mathrm{U}_{(c,h)}$ defined by
$$\hat{G} = \{ (g, V) \in G \times \mathrm{U}_{(c,h)} : U(g) = p(V) \}, \tag{5.5}$$
where $p : \mathrm{U}_{(c,h)} \to \mathrm{PU}_{(c,h)}$ is the quotient map. As shown in [49, Proposition 5.3.1], $\hat{G}$ is a central extension of $G$ by $\mathbb{T}$ which may be given the structure of a Lie group. In particular, it is a smooth principal $\mathbb{T}$-bundle over $G$ (with projection $\pi(g, V) = g$). The problem of assigning local (resp. global) phases to $U$ is then equivalent to selecting a local (resp. global) section of $\hat{G}$.

The local problem was addressed by Toledano Laredo in the course of proving the just-mentioned result. He showed that phases can be assigned to $U$ in a neighborhood $N$ of id to provide a local unitary multiplier representation $U_{\mathrm{loc}}$ of $G$ so that (i) the map $(g, \psi) \mapsto U_{\mathrm{loc}}(g)\psi$ is smooth from $N \times \mathcal{H}^\infty$ to $\mathcal{H}^\infty$ and (ii) for each $f \in C_\Gamma^\infty(S^1)$ and each $\psi \in \mathcal{H}^\infty$,
$$\left. \frac{d}{ds} U_{\mathrm{loc}}(e_f(s))\psi \right|_{s=0} = i\Theta(f)\psi \tag{5.6}$$
where $s \mapsto e_f(s)$ is a smooth curve in $G$ with $e_f(0) = \mathrm{id}$ and $\dot{e}_f(0) = \mathbf{f}$, the corresponding vector field to $f$. [These curves, and $U_{\mathrm{loc}}$, are determined by a choice of coordinates near id.] By (i), we may replace $e_f(s)$ by $\exp s\mathbf{f}$ in Eq. (5.6), so $U_{\mathrm{loc}}$ obeys Eq. (3.19) and provides a local solution to our problem. A further consequence of (i) is that $U_{\mathrm{loc}}$ is strongly continuous on $\mathcal{H}$, because $\mathcal{H}^\infty$ is dense in $\mathcal{H}$ and the $U_{\mathrm{loc}}(g)$ have unit operator norms. Toledano Laredo also uses $U_{\mathrm{loc}}$ to show that the Lie algebra cocycle of $\hat{G}$ is cohomologous to $c\omega$, where $\omega$ is the Virasoro cocycle of Eq. (3.17). The global assignment of phases is achieved by the following result.

Proposition 5.1. There is a global smooth section $g \mapsto (g, U_{(c,h)}(g))$ of $\hat{G}$ such that $g \mapsto U_{(c,h)}(g)$ is a strongly continuous unitary multiplier representation of $G$ leaving $\mathcal{H}^\infty$ invariant and obeying
$$U_{(c,h)}(g)\, U_{(c,h)}(g') = e^{ic\tilde{B}(g,g')}\, U_{(c,h)}(gg') \qquad (g, g' \in G). \tag{5.7}$$
Moreover, if $\mathbf{f} \in \mathrm{Vect}_{\mathbb{R}}(S^1)$ is the vector field corresponding to $f \in C_\Gamma^\infty(S^1)$, then $D(\Theta(f))$ consists precisely of those $\psi \in \mathcal{H}$ for which $s \mapsto U_{(c,h)}(\exp s\mathbf{f})\psi$ is differentiable, and we have
$$\left. \frac{d}{ds} U_{(c,h)}(\exp s\mathbf{f})\psi \right|_{s=0} = i\Theta(f)\psi \tag{5.8}$$
for such $\psi$.

Remark. As discussed in Sec. 3.1.2, $G$ is diffeomorphic to a convex subset of the Fréchet space $C_{2\pi}^\infty(\mathbb{R}; \mathbb{R})$. Accordingly, Poincaré's lemma (see [36, Lemma 3.3])
entails that $G$ has trivial cohomology groups $H^k(G; \mathbb{R})$. In consequence, $H^2(G; \mathbb{Z})$, which classifies the smooth principal $\mathbb{T}$-bundles over $G$ (see [41, Sec. 4.5]) is also trivial, so $\hat{G}$ is isomorphic to $G \times \mathbb{T}$ as a smooth manifold and therefore admits smooth global sections.

Proof. By [36, Proposition 4.2], $\hat{G}$ may be described in terms of a group 2-cocycle mapping $G \times G$ to $\mathbb{T}$ which is smooth near (id, id). Because $G$ is simply connected, the equivalence class of group cocycles describing $\hat{G}$ is fixed by the infinitesimal class of $c\omega$ (see, for example, the long exact sequence in [36, Theorem 7.12]) and therefore includes the Bott cocycle $\Omega_c(g, g') = e^{ic\tilde{B}(g,g')}$ for central charge $c$. Let $g \mapsto (g, V(g))$ be any smooth global section of $\hat{G}$ and define the corresponding (everywhere smooth) cocycle $m : G \times G \to \mathbb{T}$ by $V(g)V(g') = m(g, g')V(gg')$. Since $m$ and $\Omega_c$ are cohomologous, there exists $\mu : G \to \mathbb{T}$, smooth near id, such that
$$m(g, g') = \Omega_c(g, g')\, \frac{\mu(gg')}{\mu(g)\mu(g')}. \tag{5.9}$$
As both $m$ and $\Omega_c$ are smooth, it follows that $\mu$ is everywhere smooth; the required global section is given by $U_{(c,h)}(g) = \mu(g)V(g)$.

Near the identity, we must have $U_{(c,h)}(g) = e^{i\nu(g)} U_{\mathrm{loc}}(g)$ for some smooth $\nu : N \to \mathbb{R}$. It follows that $U_{(c,h)}$ is strongly continuous on $\mathcal{H}$ and has well-defined generators $\Xi(f)$ given on $\mathcal{H}^\infty$ by
$$i\Xi(f)\psi = \left. \frac{d}{ds} U_{(c,h)}(\exp s\mathbf{f})\psi \right|_{s=0}, \tag{5.10}$$
and obeying $\Xi(f) = \Theta(f) + \alpha(f)\mathbb{1}$ (on $\mathcal{H}^\infty$) where $\alpha(f) = \nu'_{\mathrm{id}}(\mathbf{f})$ is continuous and linear in $f \in C_\Gamma^\infty(S^1)$ because $\nu$ is smooth. By Proposition 3.1, applied to $U_{(c,h)}$ and $\mathcal{H}^\infty$, the generators $\Xi$ obey the same algebraic relations on $\mathcal{H}^\infty$ as the $\Theta$s on $\mathcal{H}^\infty$. In particular, they obey Eq. (3.28), from which it follows that $\alpha(f'g - fg') = 0$ for all $f, g \in C_\Gamma^\infty(S^1)$. It is now straightforward to show that $\alpha$ vanishes on a basis for $C_\Gamma^\infty(S^1)$ and hence identically. Accordingly, Eq. (5.8) holds for $\psi \in \mathcal{H}^\infty$ and, in particular, on $\mathcal{D}_0$. Now, the argument of footnote h guarantees that the left-hand side of Eq. (5.8) defines a self-adjoint operator whose domain consists precisely of those $\psi$ for which the derivative exists. As this operator agrees with $\Theta(f)$ on a core, it must in fact be $\Theta(f)$.
We have thus established that the stress-energy density in a unitary highest-weight Virasoro representation is the infinitesimal generator of a unitary multiplier representation of $\widetilde{\mathrm{Diff}}_+(S^1)$ with the Bott cocycle. Thus $\mathcal{H}_{(c,h)}$ and $U_{(c,h)}$ satisfy Axiom A.1 of Sec. 3.3. Moreover the algebraic relations Eqs. (3.27) and (3.28) hold when applied to vectors in $\mathcal{H}^\infty$. Let us observe that it is not the case that
$$U_{(c,h)}(\exp s\mathbf{f}) = e^{is\Theta(f)} \qquad \text{(FALSE)} \tag{5.11}$$
for all $s \in \mathbb{R}$ and $f \in C_\Gamma^\infty(S^1)$ because the Bott cocycle does not vanish along all one-parameter subgroups (although it is of course a coboundary). In passing, we mention that Goodman and Wallach [25] appear to claim that their unitary multiplier representation of $\mathrm{Diff}_+(S^1)$ can be normalized in such a way that Eq. (5.11) holds. However, this cannot be true, as it is not possible for exponentiations of $sl(2, \mathbb{R})$ representations with non-integer highest weight.

Turning to Axiom A.2, we note that representations with $h \neq 0$ do not contain a vacuum vector invariant under $U_{(c,h)}|_{\widetilde{\mathrm{Möb}}}$, because this representation of $\widetilde{\mathrm{Möb}}$ is generated by $L_0$ and linear combinations of $L_{\pm 1}$, and we know that $\mathrm{spec}(L_0) \subset h + \mathbb{N}_0$. If $h = 0$, the highest-weight vector $|0\rangle$ is indeed the unique invariant vector, as required by Axiom A.2.$^{o}$ We will return to this when constructing more general CFT models.

Continuing with general highest-weight Virasoro representations, Axiom A.3 clearly holds, because the generator of rotations $H = L_0$ is positive. To check the remaining axioms, we construct a new $U_{(c,h)}$-invariant domain,
$$\mathcal{D}_{(c,h)} = \mathrm{span} \bigcup_{g \in G} U_{(c,h)}(g)\mathcal{D}_0, \tag{5.12}$$
(i.e. finite linear combinations of vectors of form $U_{(c,h)}(g)\psi$ for $g \in G$, $\psi \in \mathcal{D}_0$). It is clear that $\mathcal{D}_{(c,h)}$ lies within $\mathcal{H}^\infty$, as $\mathcal{D}_0 \subset \mathcal{H}^\infty$ and $\mathcal{H}^\infty$ is $U_{(c,h)}$-invariant. Thus $\mathcal{D}_{(c,h)} \subset D(L_0) \subset D(\Theta(f))$ for each $f \in C^\infty(S^1)$, verifying Axiom B.1 (apart from the statement concerning the vacuum, which holds if and only if $h = 0$ for the vector $|0\rangle$). Moreover, the comment after Eq. (5.3) shows that $\Theta(\cdot)\psi$ is a vector valued distribution on $C^\infty(S^1)$ for each $\psi \in \mathcal{D}_{(c,h)}$. Accordingly, $\mathcal{H}_{(c,h)}$, $U_{(c,h)}$ and $\mathcal{D}_{(c,h)}$ satisfy Axiom B.2. We also wish to see that expectation values $\langle \Theta(z) \rangle_\psi$ for $\psi \in \mathcal{D}_{(c,h)}$ are smooth. This can be verified directly for $\psi \in \mathcal{D}_0$, in which case the expectation values are polynomial in $z$ and $z^{-1}$; the extension to $\mathcal{D}_{(c,h)}$ then follows from the transformation law Eq. (3.27) (which holds on $\mathcal{H}^\infty$ and hence on $\mathcal{D}_{(c,h)}$). Thus, Axiom B.3 holds.

To summarize, we have established that $\mathcal{H}_{(c,h)}$, $\mathcal{D}_{(c,h)}$ and $U_{(c,h)}$ obey all the axioms for a CFT on $S^1$ except those relating to the vacuum state; all the axioms are obeyed if $h = 0$.

5.3. CFT models obeying the axioms

It is now easy to construct a large class of theories obeying our axioms, simply by taking the direct sums of Virasoro representations. Starting with CFTs with a single component of stress-energy, we may take, for example,
$$\mathcal{H} = \bigoplus_{k=0}^{K} \mathcal{H}_{(c,h_k)}, \qquad U = \bigoplus_{k=0}^{K} U_{(c,h_k)}, \tag{5.13}$$
where $0 \le K \le \infty$ and $0 = h_0 < h_1 \le h_2 \le h_3 \cdots$ with each $(c, h_k)$ an allowed highest-weight for a unitary representation of the Virasoro algebra. Here we take $\mathcal{D}$ to be the space of vectors in $\mathcal{H}$ with only finitely many non-zero components, each belonging to the appropriate $\mathcal{D}_{(c,h_k)}$ and set $\Omega = (|0\rangle, 0, 0, \ldots)$. Since we argued in the last subsection that the multipliers in each $U_{(c,h_k)}$ are all equal, their direct sum is also a unitary multiplier representation with the same multiplier. In addition, by insisting on a unique summand with $h = 0$, we have guaranteed the existence of a unique vacuum vector.

$^{o}$ For arbitrary highest-weight $h$, the highest-weight vector $|h\rangle$ obeys $L_0 |h\rangle = h |h\rangle$, $L_1 |h\rangle = 0$, $\| L_{-1} |h\rangle \|^2 = 2h$. The assertion follows on taking $h = 0$.

In a similar fashion, CFTs with two independent components of stress-energy may be constructed as direct sums of tensor products of the form
$$\mathcal{H} = \bigoplus_{k=0}^{K} \mathcal{H}_{(c_L,h_{L,k})} \otimes \mathcal{H}_{(c_R,h_{R,k})}, \tag{5.14}$$
in which $0 \le K \le \infty$ as before, and we require that $(h_{L,k}, h_{R,k}) = (0, 0)$ if and only if $k = 0$. The vacuum is $\Omega = (|0\rangle \otimes |0\rangle, 0, 0, \ldots)$ (and is again unique) and the space $\mathcal{D}$ is constructed as before. Thus our axioms embrace, and are more general than, minimal models (for which $K$ is finite) and rational conformal field theories (for which $K$ may be infinite but the theory is minimal for an extended algebra, for example minimal superconformal models [22] or WZW models [53]).

Acknowledgments

The work of CJF was assisted by EPSRC Grant GR/R25019/01 to the University of York. SH was supported by NSF Grant PH00-90138 to the University of Chicago, by NSF Grant PHY0354978 and by funds from the University of California. Part of this research was carried out during the 2002 program on Quantum Field Theory in Curved Space-time at the Erwin Schrödinger Institute, Vienna, and we wish to thank the Institute for its hospitality. We have greatly benefited from conversations with participants of the program, in particular É. É. Flanagan, K. Fredenhagen and K.-H. Rehren. CJF would also like to thank G. W. Delius and I. McIntosh for many illuminating discussions on conformal field theory and infinite-dimensional Lie groups.

Appendix. Square Roots of Schwartz Class Functions

In the body of the paper, we use various properties of square roots of functions in the Schwartz class. The following results are quite probably known, but are included for completeness. Related results, also based on the use of Taylor's theorem, may be found in [23, Lemma 1] and [4, p. 86].

Lemma A.1. Let $G \in \mathcal{S}(\mathbb{R})$ be non-negative. Then, there exists $M > 0$ such that
$$G'(v)^2 \le \frac{4M\,G(v)}{1 + v^2} \tag{A.1}$$
for all $v$. In particular, $|d/dv \sqrt{G(v)}|^2 \le M/(1 + v^2)$ where $G(v) \neq 0$.
Proof. Noting that the result holds trivially if $G \equiv 0$, we now restrict to non-trivial $G$. For $k = 0, 2$, let $M_k = \sup_{v \in \mathbb{R}} |(1 + |v|^k) G''(v)|$, observing that each $M_k > 0$. If $\epsilon > 0$, Taylor's theorem entails that
$$0 \le G(v - \epsilon G'(v)) = G(v) - \epsilon G'(v)^2 + \frac{1}{2} \epsilon^2 G'(v)^2 G''(\eta) \tag{A.2}$$
for some $\eta$ lying between $v$ and $v - \epsilon G'(v)$. We apply this in two ways. Firstly, for any $v$, we use $G''(\eta) < M_0$ and put $\epsilon = M_0^{-1}$ to find
$$0 \le G(v) - \epsilon G'(v)^2 + \frac{1}{2} \epsilon^2 G'(v)^2 M_0 = G(v) - \frac{G'(v)^2}{2M_0}, \tag{A.3}$$
so $G'(v)^2 \le 2M_0 G(v)$ for all $v$. Secondly, we observe that
$$\left| \frac{G'(v)}{v} \right| \le \frac{M_2}{2(1 + (v/2)^2)} \tag{A.4}$$
holds for all sufficiently large $|v|$, so setting $\epsilon = (1 + (v/2)^2)/M_2$, the $\eta$ in Eq. (A.2) obeys $|\eta| \ge |v|/2$ and we find
$$0 \le G(v) - \epsilon G'(v)^2 + \frac{\epsilon^2 G'(v)^2 M_2}{2(1 + (v/2)^2)} = G(v) - \frac{1 + (v/2)^2}{2M_2}\, G'(v)^2 \tag{A.5}$$
for all $|v|$ greater than some $v_0 > 0$. Thus, Eq. (A.1) holds with $M = \max\{ \frac{1}{2} M_0 (1 + v_0^2), 4M_2 \}$.

Corollary A.2. Given $0 \le G \in \mathcal{S}(\mathbb{R})$, define
$$\varphi(v) = \begin{cases} G'(v)/(2\sqrt{G(v)}) & G(v) \neq 0 \\ 0 & G(v) = 0. \end{cases} \tag{A.6}$$
Then $\varphi \in L^2(\mathbb{R})$ and $\varphi = d/dv \sqrt{G}$, where $d/dv$ denotes the derivative in the sense of distributions. Thus $\sqrt{G}$ belongs to the Sobolev space $W^1(\mathbb{R})$. Furthermore,
$$\int \varphi(v)^2\,dv = \lim_{\epsilon \to 0^+} \int \frac{G'(v)^2}{4(G(v) + \epsilon)}\,dv. \tag{A.7}$$

Proof. For $\epsilon > 0$, define $G_\epsilon(v) = (\sqrt{G(v) + \epsilon} - \sqrt{\epsilon})^2$. Then $\sqrt{G_\epsilon} \to \sqrt{G}$ in $L^2(\mathbb{R})$ as $\epsilon \to 0^+$. Moreover,
$$\left| \frac{d}{dv} \sqrt{G_\epsilon(v)} - \varphi(v) \right| = \left| \frac{G'(v)}{2\sqrt{G(v) + \epsilon}} - \varphi(v) \right| \le 2 \left( \frac{M}{1 + v^2} \right)^{1/2}, \tag{A.8}$$
where $M$ is the constant furnished by Lemma A.1. (In the case $G(v) \neq 0$, this follows from the triangle inequality; the case $G(v) = 0$ is trivial as we must also have $G'(v) = 0$ by Eq. (A.1), so the left-hand side vanishes.) Since $d/dv \sqrt{G_\epsilon(v)} \to \varphi(v)$ pointwise as $\epsilon \to 0^+$, we deduce that the convergence occurs in $L^2(\mathbb{R})$ by the dominated convergence theorem. Thus $\varphi = d/dv \sqrt{G} \in L^2(\mathbb{R})$. The expression for $\|\varphi\|^2$ is also proved by dominated convergence.
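The limit in Eq. (A.7) can be illustrated numerically. The following is a minimal sketch under stated assumptions: the sample function G(v) = v^2 exp(-v^2) (non-negative, Schwartz class, with a zero at the origin where the square root fails to be differentiable), the integration window and the grid size are choices made here for illustration and do not come from the paper. For this G, the function defined in Eq. (A.6) is sign(v) (1 - v^2) exp(-v^2/2), whose squared norm is (3/4) sqrt(pi).

```python
import math


def G(v):
    """Sample non-negative Schwartz function (assumption) with a zero at v = 0."""
    return v * v * math.exp(-v * v)


def G_prime(v):
    """Derivative of the sample G."""
    return (2.0 * v - 2.0 * v**3) * math.exp(-v * v)


def regularized_integral(eps, half_width=10.0, steps=200000):
    """Midpoint rule for the integral of G'(v)^2 / (4 (G(v) + eps)), as in Eq. (A.7)."""
    dv = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        v = -half_width + (k + 0.5) * dv
        total += G_prime(v) ** 2 / (4.0 * (G(v) + eps)) * dv
    return total


# Closed form for this G: integral of (1 - v^2)^2 exp(-v^2) over the real line.
exact = 0.75 * math.sqrt(math.pi)
values = [regularized_integral(eps) for eps in (1e-2, 1e-4, 1e-6)]
print(values, exact)
```

The regularized integrals increase towards the closed-form value as the regulator shrinks, consistent with the monotone (dominated) convergence used in the proof.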
References

[1] M. Alcubierre, The warp drive: Hyper-fast travel within general relativity, Class. Quant. Grav. 11 (1994) L73–L77.
[2] D. Buchholz and H. Schulz-Mirbach, Haag duality in conformal quantum field theory, Rev. Math. Phys. 2 (1990) 105–125.
[3] S. Carpi and M. Weiner, On the uniqueness of diffeomorphism symmetry in conformal field theory (2004) math.oa/0407190.
[4] M. Dimassi and J. Sjöstrand, Spectral Asymptotics in the Semi-Classical Limit, LMS Lecture Note Series 268 (Cambridge University Press, Cambridge, 1999).
[5] S. P. Eveson, C. J. Fewster and R. Verch, Quantum inequalities in quantum mechanics, Ann. Henri Poincaré 6 (2005) 1–30.
[6] C. J. Fewster, A general worldline quantum inequality, Class. Quant. Grav. 17 (2000) 1897–1911.
[7] C. J. Fewster, Energy Inequalities in Quantum Field Theory, Expanded version of a contribution to appear in the Proceedings of the XIV International Conference on Mathematical Physics, Lisbon (2003).
[8] C. J. Fewster and S. P. Eveson, Bounds on negative energy densities in flat spacetime, Phys. Rev. D58 (1998) 084010.
[9] C. J. Fewster and B. Mistry, Quantum weak energy inequalities for the Dirac field in flat spacetime, Phys. Rev. D68 (2003) 105010.
[10] C. J. Fewster and M. J. Pfenning, A quantum weak energy inequality for spin-one fields in curved spacetime, J. Math. Phys. 44 (2003) 4480–4513.
[11] C. J. Fewster and T. A. Roman, Null energy conditions in quantum field theory, Phys. Rev. D67 (2003) 044003.
[12] C. J. Fewster and E. Teo, Bounds on negative energy densities in static space-times, Phys. Rev. D59 (1999) 104016.
[13] C. J. Fewster and R. Verch, A quantum weak energy inequality for Dirac fields in curved spacetime, Commun. Math. Phys. 225 (2002) 331–359.
[14] É. É. Flanagan, Quantum inequalities in two-dimensional Minkowski spacetime, Phys. Rev. D56 (1997) 4922–4926.
[15] É. É. Flanagan, Quantum inequalities in two dimensional curved spacetimes, Phys. Rev. D66 (2002) 104007.
[16] L. H. Ford, Quantum coherence effects and the second law of thermodynamics, Proc. R. Soc. Lond. A364 (1978) 227–236.
[17] L. H. Ford, A. Helfer and T. A. Roman, Spatially averaged quantum inequalities do not exist in four-dimensional spacetime, Phys. Rev. D66 (2002) 124012.
[18] L. H. Ford and T. A. Roman, Quantum field theory constrains traversable wormhole geometries, Phys. Rev. D53 (1996) 5496–5507.
[19] L. H. Ford and T. A. Roman, Restrictions on negative energy density in flat spacetime, Phys. Rev. D55 (1997) 2082–2089.
[20] S. A. Fulling and P. C. W. Davies, Radiation from a moving mirror in two dimensional space-time: Conformal anomaly, Proc. R. Soc. Lond. A348 (1976) 393–414.
[21] P. Furlan, G. M. Sotkov and I. T. Todorov, Two-dimensional conformal field theory, Riv. Nuovo Cimento 12(6) (1989) 1–202.
[22] D. Friedan, Z. Qiu and S. Shenker, Conformal invariance, unitarity and critical exponents in two dimensions, Phys. Rev. Lett. 52 (1984) 1575–1578.
[23] G. Glaeser, Racine carrée d'une fonction différentiable, Ann. Inst. Fourier (Grenoble) 13 (1963) 203–210.
[24] P. Goddard, A. Kent and D. Olive, Unitary representations of the Virasoro and super-Virasoro algebras, Commun. Math. Phys. 103 (1986) 105–119.
[25] R. Goodman and N. R. Wallach, Projective unitary positive-energy representations of Diff($S^1$), J. Funct. Anal. 63 (1985) 299–321.
[26] R. S. Hamilton, The inverse function theorem of Nash and Moser, Bull. Amer. Math. Soc. 7 (1982) p. 65.
[27] S. Köster, Conformal transformations as observables, Lett. Math. Phys. 61 (2002) 187–198.
[28] S. Köster, Absence of stress energy tensor in CFT2 models (2003) math-ph/0303053.
[29] J. Kupsch, W. Rühl and B. C. Yunn, Conformal invariance of quantum fields in two-dimensional space-time, Ann. Phys. 89 (1975) 115–148.
[30] R. Longo and K.-H. Rehren, Local fields in boundary conformal QFT, Rev. Math. Phys. 16 (2004) 909–960.
[31] M. Lüscher, Operator product expansions on the vacuum in conformal quantum field theory in two spacetime dimensions, Commun. Math. Phys. 50 (1976) 23–52.
[32] M. Lüscher and G. Mack, The energy momentum tensor of a critical quantum field theory in 1+1 dimensions, unpublished manuscript (1976).
[33] G. Mack, Introduction to conformal invariant quantum field theory in two and more dimensions, in Nonperturbative Quantum Field Theory: Proc. NATO Advanced Summer Institute, eds. G. 't Hooft, A. Jaffe, G. Mack, P. K. Mitter and R. Stora (Plenum Press, New York, 1988).
[34] J. Milnor, Remarks on infinite-dimensional Lie groups, in Relativity, Groups and Topology II, eds. B. S. DeWitt and R. Stora (North-Holland, Amsterdam, 1984).
[35] M. S. Morris and K. S. Thorne, Wormholes in spacetime and their use for interstellar travel: A tool for teaching general relativity, Am. J. Phys. 56 (1988) 395–412.
[36] K.-H. Neeb, Central extensions of infinite-dimensional Lie groups, Ann. Inst. Fourier (Grenoble) 52 (2002) 1365–1442.
[37] K. D. Olum and N. Graham, Static negative energies near a domain wall, Phys. Lett. B554 (2003) 175–179.
[38] M. J. Pfenning, Quantum inequalities for the electromagnetic field, Phys. Rev. D65 (2002) 024009.
[39] M. J. Pfenning and L. H. Ford, Scalar field quantum inequalities in static spacetimes, Phys. Rev. D57 (1998) 3489–3502.
[40] M. J. Pfenning and L. H. Ford, The unphysical nature of "warp drive", Class. Quantum Grav. 14 (1997) 1743–1751.
[41] A. Pressley and G. Segal, Loop Groups (Oxford University Press, Oxford, 1999).
[42] L. Pukánszky, The Plancherel formula for the universal covering group of SL(R, 2), Math. Ann. 156 (1964) 96–143.
[43] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. 1: Functional Analysis (Academic Press, New York, 1972).
[44] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. 2: Fourier Analysis, Self-Adjointness (Academic Press, New York, 1975).
[45] T. A. Roman, Some thoughts on energy conditions and wormholes, to appear in the Proceedings of the Tenth Marcel Grossmann Meeting on General Relativity and Gravitation.
[46] M. Schottenloher, A mathematical introduction to conformal field theory, Lecture Notes in Physics m43 (Springer-Verlag, Berlin, 1997).
[47] G. Segal, Unitary representations of some infinite dimensional groups, Commun. Math. Phys. 80 (1981) 301–342.
[48] R. F. Streater and A. S. Wightman, PCT, Spin and Statistics, and All That (Princeton University Press, Princeton, 1978).
[49] V. Toledano Laredo, Integrating unitary representations of infinite-dimensional Lie groups, J. Funct. Anal. 161 (1999) 478–508.
[50] V. S. Varadarajan, Geometry of Quantum Theory, Vol II: Quantum Theory of Covariant Systems (Van Nostrand, New York, 1970).
[51] R. Verch, The averaged null energy condition for general quantum field theories in two dimensions, J. Math. Phys. 41 (2000) 206–217.
[52] D. N. Vollick, Quantum inequalities in curved two-dimensional spacetimes, Phys. Rev. D61 (2000) 084022.
[53] E. Witten, Nonabelian bosonization in two dimensions, Commun. Math. Phys. 92 (1984) 455–472.
[54] H. Yu and P. Wu, Quantum inequalities for the free Rarita-Schwinger fields in flat spacetime, Phys. Rev. D69 (2004) 064008.
[55] J.-B. Zuber, CFT, BCFT, ADE and all that (2000) hep-th/0006151.
July 26, 2005 15:21 WSPC/148-RMP
J070-00239
Reviews in Mathematical Physics Vol. 17, No. 6 (2005) 613–667 © World Scientific Publishing Company
ELLIPTIC THERMAL CORRELATION FUNCTIONS AND MODULAR FORMS IN A GLOBALLY CONFORMAL INVARIANT QFT
NIKOLAY M. NIKOLOV∗ and IVAN T. TODOROV†
Institute for Nuclear Research and Nuclear Energy, Tsarigradsko Chaussee 72, BG-1784 Sofia, Bulgaria
and
Institut für Theoretische Physik, Universität Göttingen, Friedrich-Hund-Platz 1, D-37077 Göttingen, Germany
∗[email protected]; [email protected]
†[email protected]; [email protected]
Received 19 August 2004
Revised 17 March 2005
Global conformal invariance (GCI) of quantum field theory (QFT) in two and higher space-time dimensions implies the Huygens' principle, and hence, rationality of correlation functions of observable fields [29]. The conformal Hamiltonian H has discrete spectrum assumed here to be finitely degenerate. We then prove that thermal expectation values of field products on compactified Minkowski space can be represented as finite linear combinations of basic (doubly periodic) elliptic functions in the conformal time variables (of periods 1 and $\tau$) whose coefficients are, in general, formal power series in $q^{1/2} = e^{i\pi\tau}$ involving spherical functions of the "space-like" fields' arguments. As a corollary, if the resulting expansions converge to meromorphic functions, then the finite temperature correlation functions are elliptic. Thermal 2-point functions of free fields are computed and shown to display these features. We also study modular transformation properties of Gibbs energy mean values with respect to the (complex) inverse temperature $\tau$ ($\mathrm{Im}\,\tau = \frac{\beta}{2\pi} > 0$). The results are used to obtain the thermodynamic limit of thermal energy densities and correlation functions.

Keywords: 4-dimensional conformal field theory; thermal correlation functions; elliptic functions; modular forms.

Mathematics Subject Classification 2000: 81T40, 81R10, 81T10
Contents

1. Introduction
  1.1. Conformal invariance in QFT
  1.2. Why thermal correlation functions should be elliptic in the conformal time differences?
  1.3. Basic (anti)periodic functions. Content of the paper
2. Globally Conformal Invariant QFT on Compactified Minkowski Space
  2.1. Affine coordinate systems on compactified Minkowski space
  2.2. Wightman axioms for conformal field theory in the analytic picture
3. GCI Correlation Functions as Meromorphic Functions
  3.1. Rationality of the vacuum correlation functions
  3.2. Ellipticity of the finite temperature correlation functions
4. Free Field Models
  4.1. General properties of thermal correlation functions of free fields
  4.2. Free scalar fields
    4.2.1. Canonical free massless field in even space time dimension D
    4.2.2. Subcanonical field of dimension d = 1 for D = 6
  4.3. The Weyl field
  4.4. The Maxwell free field
5. The Thermodynamic Limit
  5.1. Compactified Minkowski space as a "finite box" approximation
  5.2. Infinite volume limit of the thermal correlation functions
6. Concluding Remarks
Appendix A. Basic Elliptic Functions
Appendix B. Proof of Proposition 3.4
1. Introduction

The modular group $SL(2, \mathbb{Z})$ ($=: \Gamma(1)$) arises as the symmetry group of an oriented 2-dimensional lattice. Usually, including our case, this is the period lattice of an elliptic function. The factor group $PSL(2, \mathbb{Z}) = SL(2, \mathbb{Z})/\mathbb{Z}_2$ of $SL(2, \mathbb{Z})$ with respect to its 2-element center $\mathbb{Z}_2 \equiv \mathbb{Z}/2\mathbb{Z}$ acts faithfully by fractional linear transformations on the upper half-plane:
$$\mathfrak{H} := \{ \tau \in \mathbb{C} : \mathrm{Im}\,\tau > 0 \}, \qquad g(\tau) = \frac{a\tau + b}{c\tau + d} \quad \text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}). \tag{1.1}$$
The modular inversion, the involutive S-transformation of $\mathfrak{H}$,
$$S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} : \tau \mapsto -\frac{1}{\tau} \quad (\mathrm{Im}\,\tau > 0), \tag{1.2}$$
which relates high and low temperature behavior, is the oldest and best studied example of a duality transformation [21] (for a recent reference in the context of elliptic functions that provides a historical review going back to 19th century work, see [10]). It naturally appears in the study of finite temperature correlation functions in a conformally invariant field theory (CFT). The case of 2-dimensional (2D) CFT has been thoroughly studied from the point of view of vertex algebras in [38]. The present paper builds on the observation that this analysis can be extended in a straightforward manner to the recently developed GCI QFT (see [29], [27] and [28]).

1.1. Conformal invariance in QFT

The conformal group C acts on Minkowski space M by local diffeomorphisms which preserve the conformal class of the metric form, i.e. multiply it by a positive factor.
July 26, 2005 15:21 WSPC/148-RMP
J070-00239
Elliptic Thermal Functions and Modular Forms in Conformal Field Theory
615
Unlike the Poincaré group, C acts, in general, by non-linear transformations on M which may have singularities on a cone (or on a hyperplane). Furthermore, C has an infinite-sheeted universal cover. In view of these peculiarities, there exist different notions of conformal invariance in QFT. In order to make clear the concept of GCI QFT, we shall briefly discuss these notions in the framework of axiomatic QFT [34].

The weakest condition of conformal invariance in QFT is the infinitesimal conformal invariance of the Wightman functions. This yields a system of first-order differential equations for each Wightman function (displayed in Sec. 2.2). According to the Bargmann–Hall–Wightman theorem (see [18, Sec. IV, 4–5]), the Wightman functions are boundary values of analytic functions, holomorphic in a complex domain, the so-called symmetrized extended tube, which contains all non-coinciding Euclidean arguments. As a result, the same (complexified) system of differential equations is satisfied by these analytic functions and hence by their (real analytic) Euclidean restrictions that define the so-called non-coinciding Schwinger (or Euclidean Green) functions which are, therefore, invariant under (Euclidean) infinitesimal conformal transformations. We thus see that the conditions of infinitesimal conformal invariance for each element in the hierarchy of functions (Wightman distributions, their analytic continuations and the Schwinger functions) are equivalent.

The global, i.e. group, version of conformal invariance is more subtle. We recall that there exist conformal compactifications of both Minkowski and Euclidean space, such that the local actions of the corresponding conformal groups can be extended to everywhere defined ones. The conformally compactified Euclidean space is just the (simply connected) sphere $S^D$. It follows that the infinitesimal Euclidean conformal invariance is equivalent to invariance under the Euclidean conformal group.
Compactified Minkowski space $\overline{M}$, on the other hand, is isomorphic to $S^1 \times S^{D-1}/\mathbb{Z}_2$, so that it has an infinite-sheeted universal cover, $\widetilde{M}$. One can only integrate, in general, the conditions of infinitesimal invariance to invariance on this infinite-sheeted cover, the finite conformal transformations on $\widetilde{M}$ becoming multi-valued if projected on $\overline{M}$. Assuming Euclidean conformal invariance, Lüscher and Mack [23] have thus established invariance of the QFT continued to $\widetilde{M}$ under the infinite-sheeted covering of the Minkowski space conformal group and called it “global conformal invariance” on the universal covering space. The above analysis shows that it is, in fact, equivalent to infinitesimal conformal invariance in Minkowski space. By contrast, the GCI condition introduced in [29],^a invariance under finite conformal transformations g in Minkowski space M (whenever both x and gx are in M), is stronger since it allows one to continue the Wightman functions to invariant distributions on $\overline{M}$. Combined with locality, GCI on $\overline{M}$ implies the Huygens principle (the vanishing of field (anti)commutators for non-isotropic separations), since it allows one to transform space-like into time-like intervals. Under Wightman axioms, the Huygens principle is equivalent to strong locality (i.e. the algebraic condition (2.22)). (Note that only in even dimensional space-time do the canonical free massless fields and the stress-energy tensor satisfy the Huygens principle.) Strong locality and energy positivity imply the rationality of Wightman functions (cf. Theorem 3.3 below), thus excluding non-integer anomalous field dimensions.

^a A special case of such a condition, in the context of (generalized) free fields, has been displayed earlier in [17]; see also [13], where a condition of this type is discussed in the framework of 2D CFT, and [16] for a retrospective view on the subject.

In the case of 2D CFT, the GCI incorporates the notion of chiral algebra which has served as a starting point for developing the important mathematical concept of a vertex algebra (see [2, 3, 19 and 11] and further references in the latter two books). Moreover, it also includes non-chiral 2D fields with rational correlation functions. (The simplest example is given by the energy density of conformal weight $(\tfrac{1}{2}, \tfrac{1}{2})$ in the vacuum sector of the critical Ising model.) The primary fields, which may well have anomalous dimensions and non-trivial braiding properties (as the magnetization field in the Ising model), appear in this framework as intertwiners between the vacuum and other positive energy representations of the GCI algebra. It is then expected that only the 4-dimensional counterpart of such intertwining primary fields may display anomalous time-like braiding relations of the type discussed in [31].

1.2. Why thermal correlation functions should be elliptic in the conformal time differences?

The main idea is simple to explain. A GCI QFT lives on compactified Minkowski space $\overline{M}$ of dimension D which has a natural complex vector parametrization:

\[ \overline{M} = S^1 \times S^{D-1}/\mathbb{Z}_2 = \left\{ z_\alpha = e^{2\pi i \zeta} u_\alpha : \zeta \in \mathbb{R},\ u^2 := \mathbf{u}^2 + u_D^2 = 1,\ u \in \mathbb{R}^D \right\}, \tag{1.3} \]
ζ being the conformal time variable. The coordinates z in Eq. (1.3) are obtained by a complex conformal transformation introduced in Sec. 2.1 (Eq. (2.3)) of the Cartesian coordinates of Minkowski space generalizing the Cayley transform (inverse stereographic projection) of the chiral (i.e. 1-dimensional) case. They were first introduced for D = 4 in [35] using the Cayley (u(2) → U(2)) compactification map, and were generalized to arbitrary D in [30]; the reader will find a geometric introduction to this and more general systems of charts in [26]. The use of the Euclidean metric in (1.3) does not mean, of course, that we are working within the Euclidean picture of QFT at this point. We recall that the Euclidean rotation group SO(D) is a subgroup of the Minkowski space conformal group SO(D, 2) and that (1.3) (involving the Euclidean unit sphere $S^{D-1}$) does indeed represent compactified Minkowski space. (This is made clear in Sec. 2.1 by exhibiting its relation to the Dirac projective quadric.)

Transforming the fields in these coordinates, we obtain an equivalent representation of the GCI local fields on $\overline{M}$ called the (analytic or) z-picture. Since the transformation is conformal, the vacuum correlation (Wightman) functions do not change their form. For example, the z-picture scalar field φ(z) of (integer) dimension d has rational correlation functions like

\[ \langle 0|\phi(z_1)\phi(z_2)|0\rangle = \left(z_{12}^2\right)^{-d}, \qquad z_{12} = z_1 - z_2, \tag{1.4} \]

invariant under the D-dimensional inhomogeneous complex rotation group. Let us note that we will treat the fields as formal power series in z and $1/z^2$, which is shown in [26] to be completely equivalent to the Wightman approach (with GCI) which treats local fields as operator-valued distributions.

The conformal Hamiltonian H, with respect to which we will consider the thermal correlation functions, gives rise to a multiplication of z by a phase factor (and hence, to a translation of ζ in Eq. (1.3)). This suggests introducing a real compact picture field φ(ζ, u) related to φ(z) by

\[ \phi(\zeta, u) = e^{2\pi i d \zeta}\, \phi(e^{2\pi i \zeta} u). \tag{1.5} \]

Then H acts on it by (infinitesimal) translation in ζ:

\[ e^{2\pi i t H} \phi(\zeta, u)\, e^{-2\pi i t H} = \phi(\zeta + t, u). \tag{1.6} \]

Since H has an integer or half-integer spectrum in the vacuum sector state space, it follows that:

\[ \phi(\zeta + 1, u) = (-1)^{2d} \phi(\zeta, u), \tag{1.7} \]
i.e., the conformal time evolution is periodic or anti-periodic with period 1 in the compact picture, so that the vacuum and the thermal correlation functions will also be (anti)periodic. The second period τ comes from statistical quantum physics: there it is pure imaginary and proportional to the inverse absolute temperature. More precisely, for any (real) Bose field φ with an invariant dense domain $\mathcal{D}$ (common for all fields and actually coinciding with the finite energy space spanned by eigenvectors of H; see Proposition 3.2), we are going to construct the partition function

\[ Z(\tau) = \operatorname{tr}_{\mathcal{D}}\left(q^H\right), \qquad q = e^{2\pi i \tau}, \qquad \operatorname{Im}\tau > 0 \ (|q| < 1), \tag{1.8} \]

and the Gibbs correlation functions

\[ \langle \phi(\zeta_1, u_1) \cdots \phi(\zeta_n, u_n) \rangle_q := \frac{1}{Z(\tau)} \operatorname{tr}_{\mathcal{D}}\left\{ \phi(\zeta_1, u_1) \cdots \phi(\zeta_n, u_n)\, q^H \right\} \tag{1.9} \]

as meromorphic functions in τ, $\zeta_k$ and $u_k$ (k = 1, ..., n) in a suitable domain of $\mathbb{C}^{nD+1}$. Of course, the existence of Gibbs equilibrium states with the above properties requires additional assumptions extending the notion of classical phase space volume. Such extra assumptions are needed in any axiomatic treatment of thermodynamic properties of local QFT.^b

^b For a general discussion of this point within Haag’s operator algebra approach, see Sec. V.5 of [14], where the Buchholz nuclearity condition, [6] and [7], is advocated and reviewed.

Our study of GCI QFT is facilitated by the fact that the conformal Hamiltonian H has a (bounded below) discrete spectrum contained in $\{\tfrac{n}{2} : n = 0, 1, \ldots\}$. The partition function (1.8) exists for any inverse temperature β = 2π Im τ (> 0) iff the growth of the dimension $d(\tfrac{n}{2})$ of the nth eigenspace of H is slower than any exponential $e^{\varepsilon n}$ (ε > 0). Moreover, in the GCI QFT, it is sufficient to assume that H has just a finitely degenerate spectrum to ensure the existence of the thermal correlation functions (1.9) (and the partition function (1.8)) as formal power series in $q^{1/2} = e^{i\pi\tau}$ with coefficients which are symmetric rational functions in $(e^{\pi i \zeta_1}, u_1), \ldots, (e^{\pi i \zeta_n}, u_n)$, as it is shown in Sec. 2. This allows us then to extend the heuristic argument, given in [38], which makes it plausible that the Kubo–Martin–Schwinger (KMS) property [15]^c

\[ \langle \phi(\zeta_1, u_1) \cdots \phi(\zeta_n, u_n) \rangle_q = \langle \phi(\zeta_2, u_2) \cdots \phi(\zeta_n, u_n)\, \phi(\zeta_1 + \tau, u_1) \rangle_q \tag{1.10} \]
implies that the functions (1.9) are doubly periodic meromorphic functions with periods 1 and τ in $\zeta_{jk} = \zeta_j - \zeta_k$; in other words, they are elliptic functions. In Sec. 3, we give a rigorous interpretation of this argument, thus proving that the finite temperature correlation functions (1.9) have the form of finite linear combinations of basic (series of) elliptic functions in the conformal time variables whose coefficients are, in general, formal power series in $q^{1/2} = e^{i\pi\tau}$ involving spherical functions of the angular fields’ arguments $u_k$ (Theorem 3.5).^d Let us stress that our main result, Theorem 3.5, takes into account the most general purely algebraic properties of the theory only. As noted above, additional hypotheses of topological character are necessary in order to guarantee the existence of the thermal expectation values as meromorphic functions. In this case, our analysis tells us that these meromorphic functions are automatically elliptic (Corollary 3.6). We shall demonstrate that this is indeed the case for conformally invariant free fields by computing explicitly their Gibbs 2-point functions.

^c For a later discussion combining nuclearity, KMS and Lorentz invariance, see [5].

^d We note that our results are valid in any space-time dimension D; in particular, for the D = 1 case, corresponding to the chiral projection of the 2-dimensional CFT, it implies that under the assumptions of convergence of all the traces of products of fields’ modes (including the partition function (1.8)), the finite temperature correlation functions are convergent to elliptic functions (since then there are no additional angular variables).

1.3. Basic (anti)periodic functions. Content of the paper

An elliptic function is characterized by its poles and their residues (in the fundamental domain). The poles of the thermal correlation functions should be the same as the poles of the operator product expansions (OPE): they only appear at mutually isotropic field arguments. In the compact picture, the light cone equation factorizes:

\[ 0 = z_{12}^2 = \left( e^{2\pi i \zeta_1} u_1 - e^{2\pi i \zeta_2} u_2 \right)^2 \equiv -4\, e^{2\pi i (\zeta_1 + \zeta_2)} \sin \pi\zeta_+ \, \sin \pi\zeta_- ; \tag{1.11} \]

here we have introduced the variables

\[ \zeta_\pm = \zeta_{12} \pm \alpha, \qquad \text{for } u_1 \cdot u_2 = \cos 2\pi\alpha. \tag{1.12} \]
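The factorization (1.11)–(1.12) follows from $u_1^2 = u_2^2 = 1$ and $\cos A - \cos B = -2 \sin\frac{A+B}{2} \sin\frac{A-B}{2}$, and it is easy to confirm numerically. The following is only an illustrative sketch: the function names (`z12_squared`, `factorized`, `max_factorization_error`) and the random sampling are our own assumptions and are not part of the paper.

```python
import cmath
import math
import random


def unit_vector(dim, rng):
    """Random real unit vector u in R^dim (u^2 = 1)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def z12_squared(zeta1, u1, zeta2, u2):
    """Left-hand side of (1.11): bilinear square of e^{2 pi i zeta1} u1 - e^{2 pi i zeta2} u2."""
    p1 = cmath.exp(2j * math.pi * zeta1)
    p2 = cmath.exp(2j * math.pi * zeta2)
    return sum((p1 * a - p2 * b) ** 2 for a, b in zip(u1, u2))


def factorized(zeta1, u1, zeta2, u2):
    """Right-hand side of (1.11), with alpha defined by u1.u2 = cos(2 pi alpha) as in (1.12)."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u1, u2))))
    alpha = math.acos(dot) / (2.0 * math.pi)
    zeta12 = zeta1 - zeta2
    zeta_plus, zeta_minus = zeta12 + alpha, zeta12 - alpha
    phase = cmath.exp(2j * math.pi * (zeta1 + zeta2))
    return -4.0 * phase * math.sin(math.pi * zeta_plus) * math.sin(math.pi * zeta_minus)


def max_factorization_error(dim=4, trials=200, seed=0):
    """Largest |LHS - RHS| of (1.11) over random real conformal times and unit vectors."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        u1, u2 = unit_vector(dim, rng), unit_vector(dim, rng)
        zeta1, zeta2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        diff = z12_squared(zeta1, u1, zeta2, u2) - factorized(zeta1, u1, zeta2, u2)
        worst = max(worst, abs(diff))
    return worst
```

In particular, the left-hand side vanishes exactly when $\zeta_{12} = \pm\alpha$ modulo integers, which is where the poles of the thermal functions are expected.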
Therefore, the basic elliptic functions occurring in the theory depend on the variables of type (1.12) and have poles on the lattice spanned by the periods 1 and τ. Taking into account the fermionic case, the above statements are modified, the periodicity in both periods 1 and τ being replaced by antiperiodicity. We are thus led to the set $p_k^{\kappa,\lambda}(\zeta, \tau)$: k = 1, 2, ..., κ, λ = 0, 1 of basic (elliptic) functions, uniquely characterized by the conditions:

(i) $p_k^{\kappa,\lambda}(\zeta, \tau)$ are meromorphic functions in $(\zeta, \tau) \in \mathbb{C} \times \mathfrak{H}$ with exactly one pole at ζ = 0 of order k and residue 1 in the domain $\{\alpha\tau + \beta : \alpha, \beta \in [0, 1)\} \subset \mathbb{C}$ for all $\tau \in \mathfrak{H}$ and k = 1, 2, ...,
(ii) $p_{k+1}^{\kappa,\lambda}(\zeta, \tau) = -\frac{1}{k} \frac{\partial}{\partial \zeta}\, p_k^{\kappa,\lambda}(\zeta, \tau)$ for k = 1, 2, ...,
(iii) $p_k^{\kappa,\lambda}(\zeta + 1, \tau) = (-1)^{\lambda}\, p_k^{\kappa,\lambda}(\zeta, \tau)$ for k = 1, 2, ...,
(iv) $p_k^{\kappa,\lambda}(\zeta + \tau, \tau) = (-1)^{\kappa}\, p_k^{\kappa,\lambda}(\zeta, \tau)$ for k + κ + λ > 1,
(v) $p_k^{\kappa,\lambda}(-\zeta, \tau) = (-1)^{k}\, p_k^{\kappa,\lambda}(\zeta, \tau)$ for k = 1, 2, ....

Note that for k = 1 and κ = λ = 0 at most one of conditions (iii) and (iv) can be satisfied and we have chosen the first one. This is a natural choice since the periodicity with period 1 in the conformal time is coupled to the periodicity in the angle α. It leads to a difference between our $p_1^{00} =: p_1$ and $p_2^{00} =: p_2$ functions and the Weierstrass Z- and ℘-functions (Eqs. (A.6) and (A.5)), respectively, by linear functions in ζ (see Eqs. (A.11) and (A.12); the Weierstrass functions have the advantage that they have simple modular transformation laws). In Appendix A (see Proposition A.2), we allow for a more general U(1) character replacing $(-1)^{\kappa}$ in condition (iv): $p_k^{\kappa,\lambda}(\zeta + \tau, \tau, \mu) = (-1)^{\kappa} e^{-2\pi i \mu}\, p_k^{\kappa,\lambda}(\zeta, \tau, \mu)$, where the parameter µ can be interpreted as chemical potential in physical applications and $p_k^{\kappa,\lambda}(\zeta, \tau, 0) = p_k^{\kappa,\lambda}(\zeta, \tau)$.

The n-point correlation functions (1.9) are elliptic in $\zeta_{jk} = \zeta_j - \zeta_k$ with poles at $\pm\alpha_{jk} + m + n\tau$ (n, m ∈ Z), where $\cos 2\pi\alpha_{jk} = u_j \cdot u_k$. One cannot expect, however, that they are homogeneous under modular transformations:

\[ g(\zeta_{jk}, \tau) = \left( \frac{\zeta_{jk}}{c\tau + d},\ \frac{a\tau + b}{c\tau + d} \right) \quad \text{for} \quad g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z}) \tag{1.13} \]

since $\alpha_{jk}$, playing the role of spherical distances for D > 2, are not invariant under the rescaling $\alpha_{jk} \to \frac{\alpha_{jk}}{c\tau + d}$. One can hope to recover modular covariance for their (always well defined in Wightman theories, see [4]) 1-dimensional restrictions corresponding to

\[ u_1 = u_2 = \cdots = u_n, \qquad \alpha_{jk} = 0. \tag{1.14} \]
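The joint action (1.13) on a conformal time difference and the modular parameter can be explored with a few lines of code. The sketch below is illustrative only (the names `act`, `compose`, `S` and `T` are our own choices, not the paper's). It checks that (1.13) composes like matrix multiplication, that it keeps Im τ > 0, and that applying S twice sends (ζ, τ) to (−ζ, τ), so that the element −1 of SL(2, Z) acts non-trivially on ζ even though it acts trivially on τ.

```python
def act(g, zeta, tau):
    """Action (1.13) of g = ((a, b), (c, d)) in SL(2, Z) on the pair (zeta, tau)."""
    (a, b), (c, d) = g
    if a * d - b * c != 1:
        raise ValueError("g must have determinant 1")
    denominator = c * tau + d
    return zeta / denominator, (a * tau + b) / denominator


def compose(g, h):
    """Matrix product g h of two 2x2 integer matrices."""
    (a, b), (c, d) = g
    (p, q), (r, s) = h
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))


S = ((0, -1), (1, 0))   # modular inversion (1.2): tau -> -1/tau
T = ((1, 1), (0, 1))    # translation: tau -> tau + 1


def close(p, q, tol=1e-12):
    """Componentwise comparison of two (zeta, tau) pairs."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol
```

For example, `act(S, *act(S, 0.3, 0.2 + 1.1j))` agrees with `(-0.3, 0.2 + 1.1j)` up to rounding, and `act(compose(S, T), zeta, tau)` agrees with `act(S, *act(T, zeta, tau))`.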
It turns out that the restricted 2-point function of a d = 1 free massless scalar field for D = 4 indeed transforms homogeneously (of degree 2) under the modular transformations (1.13). The corresponding energy mean value in an equilibrium state,

\[ \langle H \rangle_q = \frac{1}{Z(\tau)} \operatorname{tr}_{\mathcal{D}}\left( H q^H \right) \tag{1.15} \]

is a modular form of weight 4 (and level Γ(1) (≡ SL(2, Z))), after shifting the vacuum energy (Sec. 4.2.1).

The paper is organized as follows. In Sec. 2.1, we give a concise review of the basic properties of the conformal group and its Lie algebra, and introduce the basic complex parametrization of Minkowski space which we use throughout this paper. It allows us to formulate in Sec. 2.2 an algebraic counterpart of the Wightman axioms in what we call the analytic (z) picture. We sum up the implications of these axioms in Sec. 3.1 where we also give an introduction to the purely algebraic approach to GCI QFT in terms of formal power series. In Sec. 3.2, we obtain the general form of the thermal correlation functions. In Sec. 4, we calculate the finite temperature correlation functions in the (generalized) free field GCI models starting with their relation to the Wightman functions found in Sec. 4.1. The cases of physical free fields in D = 4 dimensions (the massless scalar, Weyl and electromagnetic fields) are considered in Secs. 4.2, 4.3 and 4.4, respectively. We have also studied examples of subcanonical free fields (for D = 4 and D = 6). The “thermodynamic limit” in which the compactification radius R goes to infinity (so that $\overline{M}$ is restricted to M and time is no longer cyclic) is considered in Sec. 5 where it is shown that the thermal correlation functions have Minkowski space limits. The results are summed up and discussed in Sec. 6. The reader will find our conventions about elliptic functions and modular forms, used in the text, listed in Appendix A.

2. Globally Conformal Invariant QFT on Compactified Minkowski Space

In the GCI QFT, the natural choice of the conformal group C is the connected spinor group Spin(D, 2) (=: C). Then the complexified conformal group will be $C_{\mathbb{C}} \cong \mathrm{Spin}_{\mathbb{C}}(D + 2)$. The conformal Lie algebra will be denoted by c (= spin(D, 2) ≅ so(D, 2)) and its complexification by $c_{\mathbb{C}}$.
We begin this section by recalling some basic facts about the conformal group and its action on compactified Minkowski space.

2.1. Affine coordinate systems on compactified Minkowski space

Let M be the D-dimensional Minkowski space, with coordinates $x = (x^0, \mathbf{x} = (x^1, \ldots, x^{D-1})) \in \mathbb{R}^D$ and Poincaré invariant interval $x_{12}^2 = \mathbf{x}_{12}^2 - (x_{12}^0)^2$, $x_{12} = x_1 - x_2$, $\mathbf{x}_{12}^2 = \sum_{j=1}^{D-1} (x_{12}^j)^2$. The group of conformal transformations of M is defined as the group of (local) diffeomorphisms of M (x → y) preserving the conformal class of the infinitesimal metric $dx^2$ ($\equiv dx_\mu dx^\mu$), i.e. mapping $dx^2$ on $dy^2 = \omega^{-2}(x)\, dx^2$ ($\omega(x) \neq 0$). It is finite, $\frac{(D+1)(D+2)}{2}$-dimensional, for D ≥ 3, due to the Liouville theorem, and is generated by:

• the Poincaré translations $e^{ia \cdot P}(x)$ ($\equiv e^{i a^\mu P_\mu}(x)$) $= x + a$ (for x, a ∈ M),
• the Lorentz transformations $e^{t\Omega_{\mu\nu}}$, 0 ≤ µ < ν ≤ D − 1 ($\Omega_{\nu\mu} = -\Omega_{\mu\nu}$),
• the dilations x → ρx, ρ > 0,
• and the special conformal transformations $e^{ia \cdot K}(x) = \dfrac{x + x^2 a}{1 + 2 a \cdot x + a^2 x^2}$ (which have obvious singularities).
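That the special conformal transformations in the list above preserve the conformal class of the metric can be illustrated numerically through the finite identity $y_{12}^2 = x_{12}^2 / (\sigma(x_1)\sigma(x_2))$ with $\sigma(x) = 1 + 2 a \cdot x + a^2 x^2$, where y is the image of x. This is a standard fact about such maps and is not quoted from the text. The sketch below uses our own helper names (`mdot`, `special_conformal`, `interval`) and samples small parameters a so that σ stays away from its zeros.

```python
import random


def mdot(x, y):
    """Minkowski product with x = (x0, x1, ..., x_{D-1}): spatial part minus time part."""
    return sum(a * b for a, b in zip(x[1:], y[1:])) - x[0] * y[0]


def special_conformal(x, a):
    """Return (y, sigma) with y = (x + x^2 a) / sigma and sigma = 1 + 2 a.x + a^2 x^2."""
    x2, a2, ax = mdot(x, x), mdot(a, a), mdot(a, x)
    sigma = 1.0 + 2.0 * ax + a2 * x2
    return [(xi + x2 * ai) / sigma for xi, ai in zip(x, a)], sigma


def interval(x, y):
    """Poincare invariant interval (x - y)^2."""
    d = [p - q for p, q in zip(x, y)]
    return mdot(d, d)


def max_conformal_defect(dim=4, trials=200, seed=1):
    """Largest |y12^2 - x12^2 / (sigma1 sigma2)| over random points and small parameters a."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        x1 = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        x2 = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        y1, s1 = special_conformal(x1, a)
        y2, s2 = special_conformal(x2, a)
        worst = max(worst, abs(interval(y1, y2) - interval(x1, x2) / (s1 * s2)))
    return worst
```

The "obvious singularities" mentioned above are the zeros of σ(x), which this sampling deliberately avoids.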
The corresponding Lie algebra is isomorphic to the Lie algebra of the pseudo-orthogonal group SO(D, 2). Recalling this isomorphism, we introduce the basis of infinitesimal (pseudo) rotations $\Omega_{ab}$ ($= -\Omega_{ba}$), where the indices a and b take values −1, 0, ..., D, the underlying orthonormal basis $e_a \in \mathbb{R}^{D,2}$ (a = −1, 0, 1, ..., D) satisfying $e_\alpha^2 = 1 = -e_{-1}^2 = -e_0^2$, α = 1, ..., D (cf. [29, Appendix A]). The generators $\Omega_{ab}$ are characterized by the following non-trivial commutation relations:

\[ [\Omega_{a\alpha}, \Omega_{b\alpha}] = \Omega_{ab} \ (= [\Omega_{\alpha a}, \Omega_{\alpha b}]) \quad \text{for } \alpha = 1, \ldots, D, \qquad [\Omega_{\kappa a}, \Omega_{\kappa b}] = -\Omega_{ab} \quad \text{for } \kappa = -1, 0; \tag{2.1} \]

$iP_\mu$, $iK_\mu$ and the dilations are expressed in terms of them as:

\[ iP_\mu = -\Omega_{-1\mu} - \Omega_{\mu D}, \qquad iK_\mu = -\Omega_{-1\mu} + \Omega_{\mu D}, \qquad \rho^{-\Omega_{-1D}}(x) = \rho x \ (\rho > 0); \tag{2.2} \]
the Lorentz generators $\Omega_{\mu\nu}$ correspond to 0 ≤ µ, ν ≤ D − 1. In fact, the group SO(D, 2) itself has an action on M by (rational) conformal transformations. It is straightforward to derive this action using the Klein–Dirac construction of compactified Minkowski space $\overline{M}$ (1.3), realized as the projective quadric of $\mathbb{R}^{D,2}$ ([9]; see also [29, Appendix A] for a survey adapted to our present purposes and notation). The Minkowski space M is mapped into a dense subset (identified with M) of $\overline{M}$, thus providing an affine chart in $\overline{M}$. Other affine charts in $\overline{M}$ can be obtained by applying conformal transformations. In particular, the following chart in the complex compactified Minkowski space $\overline{M}_{\mathbb{C}}$ plays a crucial role in the GCI QFT. Let $M_{\mathbb{C}} := M + iM$ be the complexified Minkowski space, with coordinates $Z = (Z^0, \mathbf{Z}) = x + iy \in \mathbb{C}^D$, $Z_{12}^2 \equiv (Z_1 - Z_2)^2 = \mathbf{Z}_{12}^2 - (Z_{12}^0)^2$ being the Poincaré invariant interval, and let $E_{\mathbb{C}}$ be the complex Euclidean D-dimensional space with coordinates $z = (\mathbf{z}, z_D) \in \mathbb{C}^D$ and Euclidean invariant interval $z_{12}^2 = \mathbf{z}_{12}^2 + (z_{12})_D^2$, $z_{12} = z_1 - z_2$. The rational complex coordinate transformation (see [35], [30] and [26]):

\[ g_c : M_{\mathbb{C}} \ (\ni Z) \to E_{\mathbb{C}} \ (\ni z), \qquad \mathbf{z} = \frac{\mathbf{Z}}{\omega(Z)}, \qquad z_D = \frac{1 - Z^2}{2\,\omega(Z)}, \qquad \omega(Z) = \frac{1 + Z^2}{2} - i Z^0 \tag{2.3} \]

is a complex conformal map (with singularities) such that

\[ z_{12}^2 = \frac{Z_{12}^2}{\omega(Z_1)\,\omega(Z_2)}, \qquad dz^2 \ (= d\mathbf{z}^2 + dz_D^2) = \frac{dZ^2}{\omega(Z)^2}. \tag{2.4} \]
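The first identity in (2.4) can be checked directly from the definition (2.3). The short sketch below does so for random complex points, and also confirms that a real Minkowski point x is sent to a point of the form $e^{2\pi i \zeta} u$ with u a real unit vector, as in (1.3). The helper names (`mink_sq`, `omega`, `g_c`, `eucl_sq`) are ours, and the sampling ranges are an assumption chosen to keep ω(Z) away from zero.

```python
import random


def mink_sq(Z):
    """Z^2 = (spatial part)^2 - (Z^0)^2 for Z = [Z0, Z1, ..., Z_{D-1}] (bilinear, no conjugation)."""
    return sum(c * c for c in Z[1:]) - Z[0] * Z[0]


def omega(Z):
    """omega(Z) = (1 + Z^2)/2 - i Z^0, as in (2.3)."""
    return (1 + mink_sq(Z)) / 2 - 1j * Z[0]


def g_c(Z):
    """The map (2.3): returns z = [z^1, ..., z^{D-1}, z^D]."""
    w = omega(Z)
    return [c / w for c in Z[1:]] + [(1 - mink_sq(Z)) / (2 * w)]


def eucl_sq(z):
    """Complex Euclidean square z^2 = sum of squares of the components."""
    return sum(c * c for c in z)


def max_interval_defect(dim=4, trials=200, seed=3):
    """Largest |z12^2 - Z12^2 / (omega(Z1) omega(Z2))| over random complex points."""
    rng = random.Random(seed)
    rc = lambda: complex(rng.uniform(-0.2, 0.2), rng.uniform(-0.2, 0.2))
    worst = 0.0
    for _ in range(trials):
        Z1 = [rc() for _ in range(dim)]
        Z2 = [rc() for _ in range(dim)]
        z12 = [p - q for p, q in zip(g_c(Z1), g_c(Z2))]
        Z12 = [p - q for p, q in zip(Z1, Z2)]
        rhs = mink_sq(Z12) / (omega(Z1) * omega(Z2))
        worst = max(worst, abs(eucl_sq(z12) - rhs))
    return worst


def real_point_defect(x):
    """For real x: how far g_c(x) is from the form phase * (real unit vector)."""
    w = omega(x)
    phase = abs(w) / w                      # unit complex number e^{2 pi i zeta}
    u = [c / phase for c in g_c(x)]
    imaginary_part = max(abs(c.imag) for c in u)
    norm_defect = abs(sum(c.real ** 2 for c in u) - 1.0)
    return max(imaginary_part, norm_defect)
```

Since $\omega(x) = 0$ would require $x^0 = 0$ and $\mathbf{x}^2 = -1$, the map is regular on all of real Minkowski space, consistent with the statement that follows.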
The transformation (2.3) is regular on the real Minkowski space M and on the forward tube domain $\mathfrak{T}_+ = \{Z = x + iy : y^0 > |\mathbf{y}|\}$, and maps them on precompact subsets of $E_{\mathbb{C}}$. The closure $\overline{g_c(M)}$ of the image of the real Minkowski space M has the form (1.3) (thus being identified with $\overline{M}$) and the image $T_+$ of $\mathfrak{T}_+$ under $g_c$ is

\[ T_+ := \left\{ z \in \mathbb{C}^D : |z^2| < 1,\ z \cdot \bar{z} = |z^1|^2 + \cdots + |z^D|^2 < \tfrac{1}{2}\left(1 + |z^2|^2\right) \right\}. \tag{2.5} \]

The conjugation $* : \overline{M}_{\mathbb{C}} \to \overline{M}_{\mathbb{C}}$ which leaves invariant the real space $\overline{M}$ is represented in the z-coordinates as:

\[ z \mapsto z^* := \frac{\bar{z}}{\bar{z}^2} \equiv j_W(R_D(\bar{z})), \tag{2.6} \]

where

\[ R_\alpha(z) := (z^1, \ldots, -z^\alpha, \ldots, z^D), \qquad j_W(z) := \frac{R_D(z)}{z^2} \tag{2.7} \]

and $j_W$ is a z-picture analogue of the Weyl reflection. Let us introduce the complex Lie algebra generators $T_\alpha$ and $C_\alpha$ for α = 1, ..., D of the z-translations $e^{w \cdot T}(z) = z + w$ and the z-special conformal transformations $e^{w \cdot C}(z) = \dfrac{z + z^2 w}{1 + 2 w \cdot z + w^2 z^2}$ ($w, z \in \mathbb{C}^D$) which are conjugated by $g_c$ to the analogous generators $iP_\mu$ and $iK_\mu$. This new basis of generators is expressed in terms of $\Omega_{ab}$ as:

\[ T_\alpha = i\Omega_{0\alpha} - \Omega_{-1\alpha}, \qquad C_\alpha = -i\Omega_{0\alpha} - \Omega_{-1\alpha} \quad \text{for } \alpha = 1, \ldots, D. \tag{2.8} \]
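A few elementary properties of the maps in (2.6) and (2.7) can be confirmed numerically: the two expressions for $z^*$ in (2.6) agree, points of the form (1.3) are fixed by the conjugation, $j_W$ is an involution, and the z-special conformal transformation equals inversion, then translation by w, then inversion (a standard identity, not stated explicitly in the text). The sketch below is a hedged illustration; all function names are our own and the sample points are chosen to keep the denominators away from zero.

```python
import cmath
import math
import random


def bsq(z):
    """Bilinear square z^2 = sum of squares of the components (no complex conjugation)."""
    return sum(c * c for c in z)


def star(z):
    """Conjugation (2.6): z -> conj(z) / conj(z)^2."""
    zb = [complex(c).conjugate() for c in z]
    s = bsq(zb)
    return [c / s for c in zb]


def reflect_last(z):
    """R_D of (2.7): flip the sign of the last component."""
    return list(z[:-1]) + [-z[-1]]


def j_W(z):
    """j_W(z) = R_D(z) / z^2, as in (2.7)."""
    s = bsq(z)
    return [c / s for c in reflect_last(z)]


def inversion(z):
    """z -> z / z^2."""
    s = bsq(z)
    return [c / s for c in z]


def special_conformal(z, w):
    """e^{w.C}(z) = (z + z^2 w) / (1 + 2 w.z + w^2 z^2)."""
    z2, w2 = bsq(z), bsq(w)
    wz = sum(a * b for a, b in zip(w, z))
    den = 1 + 2 * wz + w2 * z2
    return [(zi + z2 * wi) / den for zi, wi in zip(z, w)]


def dist(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


def run_checks(dim=4, seed=2):
    """Return the largest defect found among the four identities described above."""
    rng = random.Random(seed)
    # Real parts in [1.5, 2.5] keep z^2 well away from zero; w is kept small.
    z = [complex(2.0 + rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5)) for _ in range(dim)]
    w = [complex(rng.uniform(-0.01, 0.01), rng.uniform(-0.01, 0.01)) for _ in range(dim)]
    zbar = [c.conjugate() for c in z]
    defects = [dist(star(z), j_W(reflect_last(zbar)))]
    # A point of the form (1.3): unit phase times a real unit vector.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = math.sqrt(sum(c * c for c in v))
    p = [cmath.exp(2j * math.pi * 0.37) * c / n for c in v]
    defects.append(dist(star(p), p))
    defects.append(dist(j_W(j_W(z)), z))
    shifted = [a + b for a, b in zip(inversion(z), w)]
    defects.append(dist(special_conformal(z, w), inversion(shifted)))
    return max(defects)
```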
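The set of infinitesimal z-rotations is again a subset $\{\Omega_{\alpha\beta}\}$ of $\{\Omega_{ab}\}$ corresponding to 1 ≤ α < β ≤ D, and the conformal generator which gives rise to the dilation (or, in fact, phase) transformation of the z-coordinates (to be interpreted as a conformal time translation) is the conformal Hamiltonian

\[ H = i\Omega_{-10}, \qquad e^{itH}(z) = e^{it} z, \tag{2.9} \]

($[T_\alpha, C_\beta] = 2(\delta_{\alpha\beta} H - \Omega_{\alpha\beta})$, $[H, C_\alpha] = -C_\alpha$, $[H, T_\alpha] = T_\alpha$). The relations (2.8) and (2.9) can be easily obtained in the projective realization of $\overline{M}$ where the transformation $g_c$ is represented by a rotation of an angle $\frac{\pi}{2}$ in the plane $(ie_0, e_D)$ ($\in \mathbb{C}^{D+2}$; see [26, Appendix A] for more detail). Note that there is an involutive antilinear automorphism $\star : c_{\mathbb{C}} \to c_{\mathbb{C}}$ leaving invariant the real algebra c, i.e.

\[ (\Omega^\star)^\star = \Omega, \qquad (\lambda\Omega + \lambda'\Omega')^\star = \bar{\lambda}\,\Omega^\star + \bar{\lambda}'\,\Omega'^\star, \qquad [\Omega, \Omega']^\star = [\Omega^\star, \Omega'^\star], \]
\[ \Omega_{ab}^\star = \Omega_{ab} \ \Rightarrow\ P_\mu^\star = -P_\mu, \qquad K_\mu^\star = -K_\mu, \qquad H^\star = -H, \qquad T_\alpha^\star = C_\alpha \tag{2.10} \]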
for $\lambda, \lambda' \in \mathbb{C}$, $\Omega, \Omega' \in c_{\mathbb{C}}$, µ = 0, ..., D − 1 and α = 1, ..., D. In fact, the real generators $T_\alpha$, $C_\alpha$, $\Omega_{\alpha\beta}$, H (α, β = 1, ..., D) span a Euclidean real form (≅ spin(D + 1, 1)) of the complex conformal algebra.

From a group theoretic point of view, the compactified Minkowski space $\overline{M}$ is a homogeneous space of the conformal group C characterized by the stabilizers of the points. For the tip $p_\infty$ of the light cone at infinity $K_\infty := \overline{M} \setminus M$ (recall that the isotropy relation extends to a conformally invariant relation on $\overline{M}$), the stabilizer is exactly the Weyl group: the Poincaré group with dilations. In more detail, the Lie algebra of the stabilizer of $p_\infty$ is spanned by the generators $\{iP_\mu, \Omega_{\mu\nu}, \Omega_{-1D}\}$, while the Lie algebra of the stabilizer of the origin $p_0$ (corresponding to x = 0 ∈ M) is spanned by $\{iK_\mu, \Omega_{\mu\nu}, \Omega_{-1D}\}$. Thus every chart in $\overline{M}$ as well as in $\overline{M}_{\mathbb{C}}$ can be uniquely characterized, as a vector space, by a pair (p, q) of mutually non-isotropic points: the origin p of the chart and the tip q of its light cone complement. For the Minkowski space chart, the stabilizer of the pair $(p, q) \equiv (p_0, p_\infty)$ is the Cartesian product of the (one-parameter) dilation and the Lorentz subgroups with Lie algebra spanned by $\{\Omega_{-1D}, \Omega_{\mu\nu}\}$. The z-chart introduced above is characterized by the pair of mutually conjugate points $(p, q) = (ie_0, -ie_0)$ ($e_0 := (1, \mathbf{0}) \in M$, so that $p \in \mathfrak{T}_+ \subset M_{\mathbb{C}}$) and the stabilizer $K_{\mathbb{C}}$ of this pair is a $\star$-invariant subgroup of $C_{\mathbb{C}}$. The real part of $K_{\mathbb{C}}$ coincides with the maximal compact subgroup K which is generated by two mutually commuting subgroups: the U(1)-group $\{e^{2\pi i t H}\}$ and the Spin(D) group acting on z via real (Euclidean) rotations ($K \cong U(1) \times \mathrm{Spin}(D)/\mathbb{Z}_2$). Since the points p and q are mutually conjugate, K is also the real part of the stabilizer of the origin in the z-chart. In fact, $T_+$ (2.5) is isomorphic to the homogeneous space C/K of C (cf. [36]). Note that the complex Lie algebra of the stabilizer of z = 0 is spanned by the generators $\{C_\alpha, H, \Omega_{\alpha\beta}\}$.

Remark 2.1. In the familiar realization of $\overline{M}$ as the Dirac projective quadric [9],

\[ \overline{M} = Q/\mathbb{R}^*, \qquad Q = \left\{ \xi \in \mathbb{R}^{D,2} \setminus \{0\} : \xi^2 := \boldsymbol{\xi}^2 + \xi_D^2 - \xi_0^2 - \xi_{-1}^2 \ (= \xi^a \eta_{ab} \xi^b) = 0 \right\}, \]

the Minkowski space coordinates x and the complex coordinates z of (1.3) are expressed by

\[ x^\mu = \frac{\xi^\mu}{\xi^D + \xi^{-1}}, \qquad z^\alpha = \frac{\xi^\alpha}{\xi^{-1} - i\xi^0}. \]
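The two coordinate formulas of Remark 2.1 can be tied back to the map (2.3) by a small numerical experiment. In the sketch below we pick, for a real Minkowski point x, the representative ξ with $\xi^\mu = x^\mu$, $\xi^D = (1 - x^2)/2$ and $\xi^{-1} = (1 + x^2)/2$. This normalization ($\xi^D + \xi^{-1} = 1$) is our own assumption for illustration, as are the function names. The code checks that ξ lies on the quadric, that the x-formula returns x, that the z-formula reproduces (2.3) at Z = x, and that both formulas are unchanged under ξ → λξ.

```python
def mink_sq(x):
    """x^2 = (spatial part)^2 - (x^0)^2 for x = [x0, x1, ..., x_{D-1}]."""
    return sum(c * c for c in x[1:]) - x[0] * x[0]


def quadric_point(x, scale=1.0):
    """Hypothetical representative xi of x, stored as (xi^mu list, xi^D, xi^{-1}), times scale."""
    x2 = mink_sq(x)
    return ([scale * c for c in x], scale * (1 - x2) / 2, scale * (1 + x2) / 2)


def quadric_form(xi):
    """xi^2 = (spatial)^2 + (xi^D)^2 - (xi^0)^2 - (xi^{-1})^2."""
    mu, xi_D, xi_m1 = xi
    return sum(c * c for c in mu[1:]) + xi_D ** 2 - mu[0] ** 2 - xi_m1 ** 2


def x_from_xi(xi):
    """x^mu = xi^mu / (xi^D + xi^{-1})."""
    mu, xi_D, xi_m1 = xi
    return [c / (xi_D + xi_m1) for c in mu]


def z_from_xi(xi):
    """z^alpha = xi^alpha / (xi^{-1} - i xi^0), alpha = 1, ..., D."""
    mu, xi_D, xi_m1 = xi
    den = xi_m1 - 1j * mu[0]
    return [c / den for c in mu[1:]] + [xi_D / den]


def z_from_x(x):
    """The map (2.3) evaluated at the real point Z = x."""
    w = (1 + mink_sq(x)) / 2 - 1j * x[0]
    return [c / w for c in x[1:]] + [(1 - mink_sq(x)) / (2 * w)]


def dist(a, b):
    return max(abs(p - q) for p, q in zip(a, b))
```

For instance, with `x = [0.7, -1.2, 0.4, 2.0]` the values `z_from_xi(quadric_point(x, scale=-3.7))` and `z_from_x(x)` agree up to rounding, reflecting the projective nature of the quadric.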
Remark 2.2. Only Lorentz types of signatures (D − 1, 1) or (1, D − 1) possess the remarkable property that there exist affine charts covering the corresponding conformally compactified real space ([26, Proposition A.1]). Moreover, every such chart is characterized by the condition that the tip q of the light cone complement belongs to the union $T_+ \cup T_-$ ($T_- := (T_+)^*$ is the image in the z-coordinates of the backward tube $\mathfrak{T}_-$). If $q \in T_\pm$, then $T_\mp$ is also covered by the chart.
2.2. Wightman axioms for conformal field theory in the analytic picture

We proceed with a brief survey of the axiomatic QFT with GCI. First, one assumes the existence of a vector bundle over the complex compactified Minkowski space $\overline{M}_{\mathbb{C}}$ called the field bundle. It is endowed with an action of the conformal group $C_{\mathbb{C}}$ via (bundle) automorphisms. Thus for every point $p \in \overline{M}_{\mathbb{C}}$, its stabilizer $C_p$ will act by a representation $\pi_p$ on the (finite dimensional) fibre $F_p$ over p. Then if p is the origin of some affine chart in $\overline{M}_{\mathbb{C}}$, e.g., the z-chart, we can trivialize the bundle over the chart using the coordinate translations $t_w(z)$ ($\equiv e^{w \cdot T}(z)$) $= z + w$ so that the action of $C_{\mathbb{C}}$ in this trivialization will take the form

\[ (z = \{z^\alpha\},\ \phi = \{\phi_A\}) \ \overset{g}{\longmapsto} \ \left( g(z),\ \pi_z(g)\phi = \{\pi_z(g)_A^{\ B} \phi_B\} \right) \in \mathbb{C}^D \times F, \tag{2.11} \]

where $\{\phi_A\}$ are some (spin-tensor) coordinates in the fibre $F := F_0$ over the origin z = 0 and the matrix valued function $\pi_z(g) = \{\pi_z(g)_A^{\ B}\}$ ($g \in C_{\mathbb{C}}$), regular in the domain of g and called cocycle, is characterized by the properties:

\[ \pi_z(g_1 g_2) = \pi_{g_2(z)}(g_1)\, \pi_z(g_2), \qquad \pi_z(t_w) = I_F \ \Leftrightarrow\ \pi_z(g) = \pi_0\!\left( t_{g(z)}^{-1}\, g\, t_z \right). \tag{2.12} \]

The fibre F is the space of (classical) field values and the coordinates $\phi_A$ correspond to the collection of local fields in the theory. An example of a field bundle is the electromagnetic field defined as the bundle of 2-forms $F_{\alpha\beta}\, dz^\alpha \wedge dz^\beta = F^M_{\mu\nu}\, dx^\mu \wedge dx^\nu$ over $\overline{M}$.

The axiomatic assumptions of the GCI QFT are the Wightman axioms [34] and the condition of GCI for the correlation functions [29]. As proven in [29, Theorem 3.1], GCI is equivalent to the rationality of the (analytically continued) Wightman functions. Thus, the vacuum n-point correlation functions in the theory can be considered as meromorphic sections of the nth tensor power (over $\overline{M}_{\mathbb{C}}^{\times n}$) of the field bundle and hence, for every affine chart in $\overline{M}_{\mathbb{C}}$, we obtain a system of rational correlation functions over the chart. This provides the general scheme for the passage from the GCI QFT over Minkowski space to the theory over a complex affine chart which contains the forward “tube” $T_+$ (2.5); see [26, Sec. 9].

The (analytic) z-picture of a GCI QFT is equivalent to the theory of vertex algebras ([2], [19], [3] and [11]) in higher dimensions (see [26]). We proceed to formulate the analogue of Wightman axioms [34] in this picture. The quantum fields $\phi_A(z)$ (A = 1, ..., I for I = dim F) will be treated as formal power series in z and $1/z^2$. This is possible because of the analytic properties of the fields in a GCI Wightman QFT ([26, Theorem 9.1]).
Using harmonic polynomials, one can uniquely separate the integer powers of the interval $z^2$ due to the following (known) fact: for every polynomial p(z), there exist unique polynomials h(z) and q(z) such that h(z) is harmonic and $p(z) = h(z) + z^2 q(z)$. Thus, if we fix a basis $\{h_\sigma^{(m)}(z)\}$ of homogeneous harmonic polynomials of degree m, for every m = 0, 1, ..., we can write our fields $\phi_A(z)$ in a unique way as formal series in the monomials $(z^2)^n h_\sigma^{(m)}(z)$ for n ∈ Z, m = 0, 1, ..., where the index σ takes values in a finite set $I_m$. In such a way, we end up with the following axioms:

Fields (F). The fields are represented by (non-zero) formal series

\[ \phi_A(z) = \sum_{n \in \mathbb{Z}} \sum_{m=0}^{\infty} \sum_{\sigma} \phi_{A\{n,m,\sigma\}}\, (z^2)^n\, h_\sigma^{(m)}(z), \tag{2.13} \]

with coefficients $\phi_{A\{n,m,\sigma\}}$ which are operators acting on a common invariant dense domain $\mathcal{D}$ of the Hilbert space $\mathcal{H}$ of physical states. We require that for every vector state $\Psi \in \mathcal{D}$, there exists a constant $N_\Psi \in \mathbb{N}$ such that $\phi_{A\{n,m,\sigma\}} \Psi = 0$ for all $n \leq -N_\Psi$, m = 0, 1, ..., and all possible values of σ, or equivalently, $(z^2)^{N_\Psi} \phi_A(z) \Psi$ is a formal power series with no negative powers. (This requirement is related to the energy positivity, stated below in the axiom (SC).)

As the properties of the field $\phi_A(z)$ do not depend on the choice of $\{h_\sigma^{(m)}(z)\}$, we may also write it in a basis independent form:

\[ \phi_A(z) = \sum_{n \in \mathbb{Z}} \sum_{m=0}^{\infty} \phi_{A\{n,m\}}(z)\, (z^2)^n, \tag{2.14} \]
where $\phi_{A\{n,m\}}(z)$ ($= \sum_\sigma \phi_{A\{n,m,\sigma\}} h_\sigma^{(m)}(z)$) are operator valued homogeneous harmonic polynomials. We shall use this more concise presentation in studying examples of free fields (Sec. 4). Using an (arbitrary) basis $\{h_\sigma^{(m)}(z)\}$, on the other hand, makes more transparent the algebraic manipulations of formal power series in this and the following sections. The next axiom introduces the conformal symmetry of the theory.

Covariance (C). There exists a unitary representation U(g) of the real conformal group C on the Hilbert space $\mathcal{H}$ such that the hermitian generators of the conformal Lie algebra c leave invariant the fields’ domain $\mathcal{D}$. We also require the existence of a rational matrix-valued function $\{\pi_z(g)_A^{\ B}\}_{A,B=1,\ldots,I}$ depending on $z \in \mathbb{C}^D$ and g ∈ C, regular for z in the domain of g on $\mathbb{C}^D$, and such that it satisfies the properties (2.12). Then the fields $\phi_A(z)$ are assumed to satisfy infinitesimal conformal covariance, formally written as:

\[ \frac{d}{dt} \left[ U(e^{t\Omega})\, \phi_A(z)\, U(e^{t\Omega})^{-1} \right]_{t=0} = \frac{d}{dt} \left[ \left( \pi_z(e^{t\Omega})^{-1} \right)_A^{\ B} \phi_B(e^{t\Omega}(z)) \right]_{t=0} \tag{2.15} \]

for Ω ∈ spin(D, 2). It is simpler to write down the field covariance law if we further assume that our fields are transforming under an elementary induced representation of the conformal group C. This means that the cocycle $\pi_z(g)$ is trivial at z = 0 for $g = e^{a \cdot C}$ and it is thus determined by a representation of the maximal compact subgroup K of C:

\[ \pi_z(e^{itH})_A^{\ B} = e^{it d_A}\, \delta_A^B, \qquad \pi_z(e^{t\Omega_{\alpha\beta}})_A^{\ B} = \left( e^{t \pi_0(\Omega_{\alpha\beta})} \right)_A^{\ B}, \tag{2.16} \]

where $d_A$ are positive numbers called dimensions of the corresponding fields $\phi_A$. Under this additional assumption we can present Eq. (2.15) in a more concrete form:

\[ [T_\alpha, \phi_A(z)] = \partial_{z^\alpha} \phi_A(z), \tag{2.17} \]
\[ [H, \phi_A(z)] = z \cdot \partial_z \phi_A(z) + d_A \phi_A(z), \tag{2.18} \]
\[ [\Omega_{\alpha\beta}, \phi_A(z)] = \left( z^\alpha \partial_{z^\beta} - z^\beta \partial_{z^\alpha} \right) \phi_A(z) + \pi_0(\Omega_{\alpha\beta})_A^{\ B} \phi_B(z), \tag{2.19} \]
\[ [C_\alpha, \phi_A(z)] = \left( z^2 \partial_{z^\alpha} - 2 z^\alpha\, z \cdot \partial_z \right) \phi_A(z) - 2 z^\alpha d_A \phi_A(z) + 2 z^\beta \pi_0(\Omega_{\beta\alpha})_A^{\ B} \phi_B(z), \tag{2.20} \]

where $\partial_{z^\alpha}$ stands for the partial (formal) derivative $\frac{\partial}{\partial z^\alpha}$.
We further assume that the hermitian conjugate $\phi_A(z)^*$ of each $\phi_A(z)$ belongs to the linear span of the set $\{\phi_A\}$.

Field conjugation law (*). For every $\Psi_1, \Psi_2 \in \mathcal{D}$ and for any field $\phi_A$ there exists a conjugate field $\phi^*_A$ such that
$$\langle \Psi_1 | \phi_A(z)\Psi_2\rangle = \bigl\langle \pi_{R_D(\bar z)}(j_W^{-1})^B_A\,\phi^*_B(z^*)\Psi_1 \,\big|\, \Psi_2\bigr\rangle, \tag{2.21}$$
where $j_W$ is defined by (2.6) and (2.7). The exact meaning of Eq. (2.21) is provided by the fact that both sides are finite series, i.e. polynomials in $z$ and $\frac{1}{z^2}$ (see [26, Remark 8.1]). The correspondence $\phi_A \mapsto \phi^*_A$ gives rise to an antilinear involution in the standard fibre $F$ of the field bundle, invariant under the action of $\mathcal{C}$.

The next axiom states energy positivity and determines the physical vacuum.

Spectral condition (SC). The conformal time generator $H$ is represented on $\mathcal{H}$ by a positive operator. There is only one norm 1 conformally invariant vector $|0\rangle \in \mathcal{H}$ (up to phase factor) and it is contained in the fields' domain $\mathcal{D}$.

We shall now formulate a strong form of the locality axiom, also called Huygens' principle, stating that the fields are independent for non-isotropic separations.

Strong Locality or Huygens' principle (SL). Every field $\phi_A(z)$ is assumed to have a fixed statistical parity $p_A = 0, 1$ and there exist positive integers $M_{AB}$ such that
$$(z_{12}^2)^{M_{AB}}\bigl(\phi_A(z_1)\phi_B(z_2) - (-1)^{p_Ap_B}\phi_B(z_2)\phi_A(z_1)\bigr) = 0, \tag{2.22}$$
where $z_{12} := z_1 - z_2$.

Remark 2.3. When we deal with formal power series, it is more convenient to use weaker (infinitesimal) conformal invariance but a stronger locality axiom. Indeed, for rational functions, GCI follows from infinitesimal conformal invariance. Thus within the Wightman framework, the two pairs of axioms: (1) ordinary locality and GCI, (2) strong locality and infinitesimal conformal invariance, are completely equivalent.

Completeness. The set of vectors $|0\rangle$, $\phi_{A_1\{n_1,m_1,\sigma_1\}}\cdots\phi_{A_k\{n_k,m_k,\sigma_k\}}|0\rangle$, for all $k \in \mathbb{N}$ and all possible values of the indices of the $\phi$'s, spans the fields' domain $\mathcal{D}$.

This completes our analogue of Wightman axioms in the $z$-picture. Theorems 9.2 and 9.3 of [26] allow one to state the following general result:

Theorem 2.1. There is a one-to-one correspondence between the finite systems of Wightman fields with GCI correlation functions [29] and the systems of formal series satisfying the above axioms.

Using the fact that the cocycle $\pi_z(g)$ is meromorphic (even rational) in $g$ and $z$, we can continue it to $\mathcal{C}_{\mathbb{C}}$ and write down the explicit connection between
July 26, 2005 15:21 WSPC/148-RMP
J070-00239
Elliptic Thermal Functions and Modular Forms in Conformal Field Theory
627
the Wightman fields $\phi^M_A(x)$ on the Minkowski space and the analytic picture fields $\phi_A(z)$:
$$\phi^M_A(x) = \bigl(\pi_z(g_c)^{-1}\bigr)^B_A\,\phi_B(g_c(x)) \quad (z = g_c(x)), \tag{2.23}$$
where $g_c$ is the transformation (2.3) viewed as an element of $\mathcal{C}_{\mathbb{C}}$. Equation (2.23) is the precise expression of the fact that the fields $\phi^M_A(x)$ and $\phi_A(z)$ are different coordinate expressions of the same (generalized, operator-valued) section of the field bundle over $\overline{M}$. For example, for the electromagnetic field Eq. (2.23) is equivalent to
$$F^M_{\mu\nu}(x)\,dx^\mu\wedge dx^\nu = F_{\alpha\beta}(z)\,dz^\alpha\wedge dz^\beta. \tag{2.24}$$
The rigorous meaning of Eqs. (2.23) and (2.24) includes, on one hand, the extension of the operator valued functions $\phi^M_A(x)$ to a larger class of test functions which correspond to coordinate expressions of arbitrary smooth sections over $\overline{M}$. This can be done using the GCI condition (see [29, Proposition 2.1]). On the other hand, using the positivity of the scalar product of the Hilbert state space, one can easily extend our formal field series (2.15) to generalized operator-valued functions over $\overline{M}$ (in the parametrization (1.3)).

We now proceed to introduce the real compact picture representation which is more convenient in studying finite temperature correlation functions. For a local field $\phi(z)$ of dimension $d$, we set $\phi(\zeta, u)$ to be a formal Fourier series in $e^{2\pi i\zeta} \in \mathbb{S}^1$ and $u \in \mathbb{S}^{D-1}$ defined as:
$$\phi(\zeta,u) = e^{2\pi i d\zeta}\,\phi(e^{2\pi i\zeta}u) = \sum_{\nu\in d+\mathbb{Z}}\ \sum_{m=0}^{\infty}\phi_{-\nu m}(u)\,e^{2\pi i\nu\zeta}, \qquad \phi_{-\nu m}(u) = \sum_{\sigma}\phi_{-\nu,m,\sigma}\,h^{(m)}_{\sigma}(u). \tag{2.25}$$
(The space of harmonic polynomials, $\mathrm{Span}_{\mathbb{C}}\{h^{(m)}_{\sigma}(u)\}_{m,\sigma}$, is identified with the algebra of complex polynomials restricted to the sphere $\mathbb{S}^{D-1}$.) Then the connection with the previous analytic picture modes is
$$\phi_{\{n,m,\sigma\}} = \phi_{\nu,m,\sigma} \quad\text{for}\quad \nu = -d - 2n - m \quad \Bigl(n = -\frac{\nu+m+d}{2}\Bigr). \tag{2.26}$$
Note that the index $n$ in the analytic picture modes $\phi_{\{n,m,\sigma\}}$ is always an integer, while the index $\nu$ of the compact picture modes $\phi_{\nu,m,\sigma}$ is integer or half-integer depending on $d$ (which is reflected in the first sum in (2.25)). In accord with the commutation relation (2.18), we obtain
$$[H, \phi_{\nu,m}(u)] = -\nu\,\phi_{\nu,m}(u), \qquad e^{2\pi i tH}\,\phi_{\nu,m}(u)\,e^{-2\pi i tH} = e^{-2\pi i t\nu}\,\phi_{\nu,m}(u), \qquad [H, \phi_{\{n,m\}}(z)] = (d + 2n + m)\,\phi_{\{n,m\}}(z). \tag{2.27}$$
It follows that 2πiH acts as a translation generator in ζ in accord with Eq. (1.6).
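The index bookkeeping of Eqs. (2.26) and (2.27) can be sketched in a few lines of Python. This is only an illustration: the function names `compact_index`, `analytic_index` and `conformal_energy` are hypothetical and do not appear in the text, and exact rational arithmetic is used so that half-integer dimensions $d$ are handled without rounding.

```python
from fractions import Fraction


def compact_index(d, n, m):
    """Compact-picture index nu of the analytic-picture mode {n, m}: nu = -d - 2n - m (Eq. (2.26))."""
    return -Fraction(d) - 2 * n - m


def analytic_index(d, nu, m):
    """Inverse map n = -(nu + m + d)/2 from Eq. (2.26)."""
    return -(Fraction(nu) + m + Fraction(d)) / 2


def conformal_energy(d, n, m):
    """Eigenvalue of [H, .] on the mode {n, m}: d + 2n + m, which equals -nu (Eq. (2.27))."""
    return -compact_index(d, n, m)
```

For a field of dimension $d = 3/2$ the values of $\nu$ are half-integers, while $n$ recovered by `analytic_index` stays an integer, in line with the remark following Eq. (2.26).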
As a realization of the above axioms, we will consider the case of a neutral scalar field $\phi(z) \equiv \phi^{(d)}(z)$ of dimension $d$. Its 2-point function is proportional to the unique scalar conformal invariant function of dimension $d$,
$$\langle 0|\,\phi(z_1)\phi(z_2)\,|0\rangle = \frac{1}{(z_{12}^2)^d} \quad (z_{12} = z_1 - z_2), \tag{2.28}$$
viewed as a Taylor series in the second argument, $z_2$, with coefficients that are rational functions in $z_1$ (see Theorem 3.3 below). The field cocycle is $\pi_z(g) = \omega(g,z)^{-d}$, where $\omega(g,z)$ is a quadratic polynomial in $z$ ($\omega(g,z) = \det\bigl(\frac{\partial g(z)^\alpha}{\partial z^\beta}\bigr)^{-\frac{1}{D}}$, see [29, Eq. (A.5)]). (Note that the Minkowski space transform of the correlation function (2.28) is proportional to $\frac{1}{x_{12}^2 + i0\,x_{12}^0}$, cf. Eq. (5.17).) The hermiticity of the field is expressed by
$$\phi(z)^* = \frac{1}{(\bar z^{\,2})^d}\,\phi(z^*) \quad \Bigl(z^* = \frac{\bar z}{\bar z^{\,2}},\ \ \phi^*(z) \equiv \phi(z)\Bigr) \tag{2.29}$$
since $\omega(j_W, z) = z^2$. This conjugation law simplifies in the compact picture, since we are using real coordinates $(\zeta, u)$; the hermiticity condition for the field modes reads:
$$\phi^*_{\nu,m,\sigma} = \phi_{-\nu,m,\sigma}. \tag{2.30}$$
3. GCI Correlation Functions as Meromorphic Functions

3.1. Rationality of the vacuum correlation functions

Theorem 2.1 implies the rationality of the Wightman functions as well as the analyticity properties of the fields. Since these facts are of major importance, we shall prove them independently.

We begin with stating some basic properties of the formal series which arise in the analytic picture of GCI QFT. We introduce, following [26], the space $V[[z, 1/z^2]]$ of formal series
$$v(z) = \sum_{n\in\mathbb{Z}}\ \sum_{m=0}^{\infty}\ \sum_{\sigma} v_{\{n,m,\sigma\}}\,(z^2)^n\,h^{(m)}_{\sigma}(z) \tag{3.1}$$
with coefficients $v_{\{n,m,\sigma\}}$ belonging to a complex vector space $V$. The space of finite series of $V[[z, 1/z^2]]$ will be denoted by $V[z, 1/z^2]$. Obviously, $\mathbb{C}[z, 1/z^2]$ is a complex algebra and $V[z, 1/z^2]$ is a module over this algebra. The product of a series $f(z) \in \mathbb{C}[[z, 1/z^2]]$ and a series $v(z) \in V[[z, 1/z^2]]$ is not defined in general, but it is not difficult to define the product $f(z)v(z)$ if $f(z) \in \mathbb{C}[z, 1/z^2]$ (thus turning $V[[z, 1/z^2]]$ into a $\mathbb{C}[z, 1/z^2]$-module). We emphasize that all the products between formal series throughout this paper will be treated in a purely algebraic sense, i.e. every coefficient of the product series should be obtained by a finite number of operations of summation and multiplication on the coefficients of the initial series. On the other hand, the space of Taylor series $V[[z]]$ in $z$, with coefficients in $V$, is naturally identified with the subspace of $V[[z, 1/z^2]]$ of formal series (3.1) whose sum in $n$ runs from 0 to $\infty$. Evidently, $V[[z]]$ is a module over the algebra $\mathbb{C}[[z]]$ (i.e. the
product $f(z)v(z)$ is well defined for $f(z) \in \mathbb{C}[[z]]$ and $v(z) \in V[[z]]$). Similarly, $V[z]$ (the space of polynomials in $z$ with coefficients in $V$) is a subspace of $V[z, 1/z^2]$ and it is a module over the polynomial algebra $\mathbb{C}[z]$. There is a larger space of formal series of $\mathbb{C}[[z, 1/z^2]]$ than $\mathbb{C}[[z]]$ which still possesses a complex algebra structure: this is the space $\mathbb{C}[[z]]_{z^2}$ of those series $f(z) \in \mathbb{C}[[z, 1/z^2]]$ whose sum in $n$ in (3.1) is bounded below. The more general spaces $V[[z]]_{z^2}$ are defined in a similar way and we have in fact a shorter equivalent definition
$$v(z) \in V[[z]]_{z^2} \iff (z^2)^N v(z) \in V[[z]] \ \text{ for } N \gg 0. \tag{3.2}$$
Remark 3.1. The notation $\mathbb{C}[[z]]_{z^2}$ comes from commutative algebra: for a commutative ring $R$ and $f \in R$ the localized ring $R_f$ is defined as the ring of ratios $\frac{a}{f^n}$ for $a \in R$ and $n = 0, 1, \dots$ (more precisely, this is the quotient ring $R[t]/(ft - 1)$ of the ring $R[t]$ of polynomials in the one-dimensional variable $t$ over the ideal generated by $ft - 1$; see [1]). In a similar way, if $V$ is a module over the ring $R$, then the localized space $V_f$ is defined as a module over the localized ring $R_f$. Note that $\mathbb{C}[z, 1/z^2] \equiv \mathbb{C}[z]_{z^2}$.

Proposition 3.1. Let $V$ be a complex vector space.
(a) The space $\mathbb{C}[[z]]_{z^2}$ is a complex algebra containing $\mathbb{C}[[z]]$ as a subalgebra and $V[[z]]_{z^2}$ is a module over this algebra that extends the module structure of $V[[z]]$ over the algebra $\mathbb{C}[[z]]$.
(b) There are no zero divisors in the $\mathbb{C}[[z]]_{z^2}$-module $V[[z]]_{z^2}$, i.e. if $f(z) \in \mathbb{C}[[z]]_{z^2}$ and $v(z) \in V[[z]]_{z^2}$ are such that $f(z)v(z) = 0$, then either $f = 0$ or $v = 0$.
(c) If $w$ is another $D$-dimensional formal variable and $V[[z]]_{z^2}[[w]]_{w^2} := (V[[z]]_{z^2})[[w]]_{w^2}$, then the polynomial $(z - w)^2$ is invertible in $V[[z]]_{z^2}[[w]]_{w^2}$ and its inverse, denoted by $\iota_{z,w}\frac{1}{(z-w)^2}$, is the Taylor series of $\frac{1}{(z-w)^2}$ in $w$ with coefficients belonging to $\mathbb{C}[z, 1/z^2]$ (i.e. $\iota_{z,w}\frac{1}{(z-w)^2} := \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\,\bigl(\partial_{z^{\alpha_1}}\cdots\partial_{z^{\alpha_n}}\frac{1}{z^2}\bigr)\,w^{\alpha_1}\cdots w^{\alpha_n}$).

The proof of Proposition 3.1 is quite simple. We just remark that the product of $f(z) \in \mathbb{C}[[z]]_{z^2}$ and $v(z) \in V[[z]]_{z^2}$ can be defined as $(z^2)^{-N_1-N_2}\bigl((z^2)^{N_1}f(z)\bigr)\bigl((z^2)^{N_2}v(z)\bigr)$ for $N_1, N_2 \gg 0$, according to Eq. (3.2), and does not depend on $N_1$ and $N_2$. Then condition (b) follows from the absence of zero divisors in the $\mathbb{C}[[z]]$-module $V[[z]]$.

Having several $D$-dimensional variables $z_1, \dots, z_n$, one can inductively define
$$V\Bigl[\Bigl[z_1, \frac{1}{z_1^2}; \dots; z_n, \frac{1}{z_n^2}\Bigr]\Bigr] := \Bigl(V\Bigl[\Bigl[z_1, \frac{1}{z_1^2}; \dots; z_{n-1}, \frac{1}{z_{n-1}^2}\Bigr]\Bigr]\Bigr)\Bigl[\Bigl[z_n, \frac{1}{z_n^2}\Bigr]\Bigr]. \tag{3.3}$$
A different order of $z_1, \dots, z_n$ in the right-hand side of (3.3) will correspond to another way of summation in the formal series.
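The invertibility statement of Proposition 3.1(c) can be checked mechanically in a simplified setting. The sketch below assumes a one-variable analogue, in which $z$ and $w$ are single formal variables rather than $D$-dimensional ones, so that $\iota_{z,w}\frac{1}{(z-w)^2} = \sum_{n\ge 0}(n+1)\,z^{-n-2}w^n$. The helper names (`iota_zw`, `multiply`, `check_inverse`) are hypothetical. The code verifies that multiplying the truncated expansion by $(z-w)^2$ gives 1 up to terms of order at least `order` in $w$.

```python
def iota_zw(order):
    """Truncated expansion of 1/(z-w)^2 in powers of w (one-variable analogue).

    Returns {(power of z, power of w): coefficient} for sum_{n < order} (n+1) z^(-n-2) w^n.
    """
    return {(-n - 2, n): n + 1 for n in range(order)}


def multiply(a, b):
    """Product of two Laurent polynomials in (z, w) stored as coefficient dictionaries."""
    out = {}
    for (i1, j1), c1 in a.items():
        for (i2, j2), c2 in b.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    # Drop the terms that cancelled exactly.
    return {key: c for key, c in out.items() if c != 0}


# (z - w)^2 = z^2 - 2 z w + w^2
Z_MINUS_W_SQUARED = {(2, 0): 1, (1, 1): -2, (0, 2): 1}


def check_inverse(order):
    """Terms of (z-w)^2 * iota_zw(order) whose w-degree is below the truncation order."""
    product = multiply(Z_MINUS_W_SQUARED, iota_zw(order))
    return {key: c for key, c in product.items() if key[1] < order}
```

Every coefficient of the product is obtained from finitely many sums and products of the input coefficients, which is the purely algebraic sense of multiplication used throughout this section.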
Nevertheless, the order of $z_k$ in the successive localizations $V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}$ is important. For example, $\iota_{z,w}\frac{1}{(z-w)^2}$ and $\iota_{w,z}\frac{1}{(z-w)^2}$ are different formal series of $\mathbb{C}[[z, 1/z^2; w, 1/w^2]]$. From Proposition 3.1(c) it follows that
$$(z-w)^2\Bigl(\iota_{z,w}\frac{1}{(z-w)^2} - \iota_{w,z}\frac{1}{(z-w)^2}\Bigr) = 0, \tag{3.4}$$
so we see that in the $\mathbb{C}[z, 1/z^2; w, 1/w^2]$-module $\mathbb{C}[[z, 1/z^2; w, 1/w^2]]$ there are zero divisors (the same is true in any $\mathbb{C}[z, 1/z^2]$-module $V[[z, 1/z^2]]$).

The spaces of successive localizations play important roles in the analytic picture GCI QFT since, according to Axiom (F), we have
$$\phi_{A_1}(z_1)\cdots\phi_{A_n}(z_n)\Psi \in \mathcal{D}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2} \tag{3.5}$$
for any state vector Ψ belonging to the fields’ domain D. They are convenient, on one hand, since these spaces have no zero divisors and, on the other, since products 2 are invertible in the algebra C[[z1 ]]z12 · · · [[zn ]]zn2 . of the type 1≤k 0 it will turn out that ∂zα F (z, w) = (z 2 )−N −1 h1 (z, w) + (z 2 )−N g1 (z, w) where h1 (z, w) and g1 (z, w) are series with similar properties. Then the equation ∂zα F (z, w) = ∂wα F (z, w) will imply an equality of type h2 (z, w) = z 2 g2 (z, w) for the series h2 (z, w) and g2 (z, w) with no negative powers, h2 being non-zero and harmonic with respect to z. But this would contradict the uniqueness of the harmonic decomposition. The series φA (z)|0 is non-zero: otherwise Axioms (F ) and (SL) imply that 2 N ) φA (z1 )φA2 (z2 ) · · · φAn (zn )|0 = 0, for large N ∈ N, and we can can( k
(b) The positivity of dA is a consequence, on the one hand, of Eq. (2.27) implying HφA{0,0} |0 = dA φA{0,0} |0 (φA{0,0} = φA{0,0} (z) is the mode multiplying the (0)
unique, up to proportionality, harmonic polynomial h1 (z) = 1 of degree 0) and on the other, of the positivity of H (Axiom (SC)). Note also that, according to condition (a), the vector φA{0,0} |0 ≡ φA (z)|0|z=0 is non-zero and non-collinear with |0 since otherwise, it will follow that φA{n,m,σ} |0 = 0 for n > 0 (and hence, φA (z)|0 = 0), or φA (z) ∼ I, respectively (because φA (z)|0 = ez·T φA{0,0} |0). The second statement follows from [26, Proposition 7.1] and the assumed rationality of the field cocycle (see the covariance axiom). (c) The set of vectors in the axiom of completeness is actually a set of eigenvectors of H with integer or half-integer eigenvalues. From the commutation relation (2.19), it also follows that every vector of this system is contained in a finite dimensional subrepresentation of the Lie algebra of the maximal compact subgroup. Remark 3.2. The vector ΦA = φA (z)|0 |z=0 uniquely characterizes the field φA (z) and we have φA (z)|0 = ez·T ΦA . Moreover, for every v ∈ D, there exists a unique translation covariant local field Y (v, z) such that Y (v, z)|0 = ez·T v (see [N03, Sec. 3]). This is a compact formulation of the state-field correspondence in the vertex algebra approach. Theorem 3.3. Every scalar product Ψ1 φA1 (z1 ) · · · φAn (zn )Ψ2 (for arbitrary Ψ1 , Ψ2 ∈ D), regarded as a power series, is absolutely convergent in the domain Dn : zk = e2πiζk uk , ζk ∈ C and uk ∈ SD−1 ⊂ RD U< n := (z1 , . . . , zn ) ∈ C (uk2 = 1) for k = 1, . . . , n; Im ζkl < 0 for 1 ≤ k < l ≤ n (3.6) n 2 −N 2 −N and its limit is a rational function of the form 1≤k
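$$= \langle 0|\,\bigl(\pi_{z_1}(g)^{-1}\bigr)^{B_1}_{A_1}\cdots\bigl(\pi_{z_n}(g)^{-1}\bigr)^{B_n}_{A_n}\,\phi_{B_1}(g(z_1))\cdots\phi_{B_n}(g(z_n))\,|0\rangle, \tag{3.7}$$
and $\mathbb{Z}_2$-symmetric in the sense
$$\langle 0|\,\phi_{A_{\sigma(1)}}(z_{\sigma(1)})\cdots\phi_{A_{\sigma(n)}}(z_{\sigma(n)})\,|0\rangle = (-1)^{\varepsilon(\sigma)}\,\langle 0|\,\phi_{A_1}(z_1)\cdots\phi_{A_n}(z_n)\,|0\rangle, \tag{3.8}$$
for every permutation $\sigma$; $(-1)^{\varepsilon(\sigma)}$ is the corresponding statistical factor: $\varepsilon(\sigma) = \sum p_{A_{\sigma(i)}}p_{A_{\sigma(j)}} \pmod 2$, the sum being over all pairs of indices $i < j$ such that $\sigma(i) > \sigma(j)$.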
Proof. Set
$$\rho_n := \Bigl(\prod_{k=1}^{n} z_k^2\Bigr)\Bigl(\prod_{1\le k<l\le n} z_{kl}^2\Bigr). \tag{3.9}$$
From Axioms (F) and (SL), it follows that for sufficiently large $N \in \mathbb{N}$, the formal series $\rho_n^N\,\phi_{A_1}(z_1)\cdots\phi_{A_n}(z_n)\Psi$ is a $\mathbb{Z}_2$-symmetric Taylor series in the $z$'s with coefficients in $\mathcal{D}$. Hence, $P(z_1, \dots, z_n) := \rho_n^N\,\langle\Psi_1|\,\phi_{A_1}(z_1)\cdots\phi_{A_n}(z_n)\Psi_2\rangle$ is a complex Taylor series, and if $H\Psi_1 = \varepsilon_1\Psi_1$ and $H\Psi_2 = \varepsilon_2\Psi_2$, it satisfies, in addition, the Euler equation
$$\sum_{k=1}^{n} z_k\cdot\partial_{z_k}\,P(z_1, \dots, z_n) = \Bigl(N n(n+1) + \varepsilon_1 - \varepsilon_2 - \sum_{k=1}^{n} d_k\Bigr)\,P(z_1, \dots, z_n) \tag{3.10}$$
as a consequence of the commutation relations (2.18). Therefore, $P(z_1, \dots, z_n)$ is a polynomial and it is clear that its coefficients are linear combinations of scalar products of the type $\langle\Psi_1|\,\phi_{A_1\{k_1,m_1,\sigma_1\}}\cdots\phi_{A_n\{k_n,m_n,\sigma_n\}}\Psi_2\rangle$. Thus, we find that the series $\rho_n^N\,\langle\Psi_1|\,\phi_{A_1}(z_1)\cdots\phi_{A_n}(z_n)\Psi_2\rangle$ is a polynomial for all $\Psi_1, \Psi_2 \in \mathcal{D}$ for $N \in \mathbb{N}$ sufficiently large.

We now divide by $\rho_n^N$ in the space $\mathbb{C}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}$, which contains $\langle\Psi_1|\,\phi_{A_1}(z_1)\cdots\phi_{A_n}(z_n)\Psi_2\rangle$ because of the Property (3.5). (The inverse series of $\rho_n^N$ is obtained in $\mathbb{C}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}$ by expanding every factor $(z_{kl}^2)^{-N} = ((z_k - z_l)^2)^{-N}$, for $1 \le k < l \le n$, in Taylor series in $z_l$ around $z_l = 0$; see Proposition 3.1(c).) Thus the domain of absolute convergence of the formal series $\langle\Psi_1|\,\phi_{A_1}(z_1)\cdots\phi_{A_n}(z_n)\Psi_2\rangle$ coincides with the domain of absolute convergence of the above expanded "propagators" $(z_{kl}^2)^{-N}$, which contains $U^{<}_n$.

The covariance law (3.7) and the $\mathbb{Z}_2$-symmetry (3.8) follow from the uniqueness of the analytic continuation.

Remark 3.3. The domain $T_+$ can be characterized as the connected component of $z = 0$ of the subset $\bigl\{z \in \mathbb{C}^D : \frac{(z - z^*)^2}{(z^*)^2} \equiv 1 - 2z\cdot\bar z + |z^2|^2 \ne 0\bigr\}$. Note that due to the axiom (*) the product of $\phi_A(z)$ with its conjugate $\phi_A(z)^*$ will be singular for $(z - z^*)^2 = 0$.

3.2. Ellipticity of the finite temperature correlation functions

To study the thermal correlation functions, it is convenient to use the compact picture representation of the GCI fields. Therefore, we begin with stating the basic properties of the compact picture formal series which are analogous to those of the formal power series of the previous subsection.

Denote by $V[[e^{\pm\pi i\zeta}, u]]$ the space of infinite formal Fourier series:
$$v(\zeta, u) = \sum_{\nu\in\frac{1}{2}\mathbb{Z}}\ \sum_{m=0}^{\infty}\ \sum_{\sigma} v_{\nu,m,\sigma}\,e^{-2\pi i\nu\zeta}\,h^{(m)}_{\sigma}(u) \tag{3.11}$$
with coefficients in V . It is a module over the algebra C[e±πiζ , u] of Fourier polynomials (the space of finite complex series of type (3.11)). When ν in Eq. (3.11) runs over Z, the resulting space of series will be denoted by V [[e±2πiζ , u]]. In the product f (ζ, u)v(ζ, u) of f (ζ, u) ∈ C[e±πiζ , u] and v(ζ, u) ∈ V [[e±πiζ , u]], (m) the harmonic polynomials hσ (u) are treated as spanning the algebra C[[u]] of polynomial functions over the unit sphere SD−1 ( u). The space of n-point formal series, denoted by V [[e±πiζ1 , u1 ; . . . ; e±πiζn , un ]], is a module over the algebra C[e±πiζ1 , u1 ; . . . ; e±πiζn , un ]. Following the line of arguments of the previous subsection, we need a compact picture analogue of the localized space V [[z]]z2 . Observing that the basic terms in (3.11) can be represented as: 2 n (m) 2πiζ u and ν = − e−2πiνζ h(m) σ (u) = (z ) hσ (z) for z = e
n+m , 2
(3.12)
we are led to introduce the space $V[[e^{\pm\pi i\zeta}, u]]_+$ defined as containing those series (3.11) for which there exists $N \in \mathbb{N}$ such that $v_{\nu,m,\sigma} = 0$ if $\nu + m > N$ (thus excluding arbitrarily large powers of $z^2$ in the $z$-picture). It then follows from (3.12) that $v(\zeta, u) \in V[[e^{\pm\pi i\zeta}, u]]_+$ iff $v(\zeta, u) = e^{-\pi i N\zeta}\,v'(e^{2\pi i\zeta}u)$ for some $N \in \mathbb{N}$ and $v'(z) \in V[[z]]$. We conclude, using Proposition 3.1, that $\mathbb{C}[[e^{\pm\pi i\zeta}, u]]_+$ is a complex algebra and $V[[e^{\pm\pi i\zeta}, u]]_+$ is a module over this algebra with no zero divisors.

Recall that $4\sin\pi\zeta_+\sin\pi\zeta_-$, defined by Eqs. (1.11) and (1.12), the compact picture analogue of the interval $z_{12}^2$, is a Fourier polynomial belonging to $\mathbb{C}[e^{\pm 2\pi i\zeta_1}, u_1; e^{\pm 2\pi i\zeta_2}, u_2]$ (i.e. the space of series containing just integer powers of $e^{\pm 2\pi i\zeta_k}$). We shall now introduce an elliptic version of the compact picture interval:
$$\Theta(\zeta_1, u_1; \zeta_2, u_2; \tau) := e^{-\frac{\pi i\tau}{2}}\,\vartheta_{11}(\zeta_+, \tau)\,\vartheta_{11}(\zeta_-, \tau) = 4\sin\pi\zeta_+\sin\pi\zeta_- - 4(\sin 3\pi\zeta_+\sin\pi\zeta_- + \sin\pi\zeta_+\sin 3\pi\zeta_-)\,e^{2\pi i\tau} + \dots, \tag{3.13}$$
where $\vartheta_{11}(\zeta, \tau)$ is the Jacobi $\vartheta$-function (A.24). Having $n$ compact picture points $(\zeta_1, u_1), \dots, (\zeta_n, u_n)$ ($\in \mathbb{R}\times\mathbb{S}^{D-1}$), we introduce the shorthand notation:
$$\Theta_{jk} := \Theta(\zeta_j, u_j; \zeta_k, u_k; \tau), \qquad \Omega(\zeta_1, u_1; \dots; \zeta_n, u_n) := \prod_{1\le j<k\le n}\Theta_{jk}. \tag{3.14}$$
Each term of the series (3.13) and (3.14) is a function of the coordinate differences $\zeta_{jk} = \zeta_j - \zeta_k$; hence, we can write $\Theta(\zeta_{12}; u_1, u_2)$ and $\Omega(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n)$ treating them, however, as series in different spaces.

Proposition 3.4.
(a) $\Theta_{12}$ is a formal series belonging to $\mathbb{C}[e^{\pm 2\pi i\zeta_1}, u_1; e^{\pm 2\pi i\zeta_2}, u_2][[q]]$ ($q = e^{2\pi i\tau}$), i.e. $\Theta_{12}$ is a Taylor series in $q$ with coefficients which are Fourier polynomials belonging to $\mathbb{C}[e^{\pm 2\pi i\zeta_1}, u_1; e^{\pm 2\pi i\zeta_2}, u_2]$. Moreover, $\Theta_{12}$ is symmetric, $\Theta_{12} = \Theta_{21}$, and it is divisible by $\sin\pi\zeta_+\sin\pi\zeta_-$:
$$\Theta_{12} = 4\sin\pi\zeta_+\sin\pi\zeta_-\,(1 + q\,\Theta'_{12}) = 4\sin\pi\zeta_+\sin\pi\zeta_-\,\{1 - 2(1 + 2\cos 2\pi\zeta_{12}\cos 2\pi\alpha)\,q + \cdots\}, \tag{3.15}$$
where $\Theta'_{12}$ is again a series belonging to $\mathbb{C}[e^{\pm 2\pi i\zeta_1}, u_1; e^{\pm 2\pi i\zeta_2}, u_2][[q]]$.
(b) The formal series $\Theta_{12}$ has an inverse
$$\Theta_{12}^{-1} \in \mathbb{C}[[e^{\pm 2\pi i\zeta_1}, u_1]]_+[[e^{\pm 2\pi i\zeta_2}, u_2]]_+[[q]] \quad (\Theta_{12}\Theta_{12}^{-1} = 1). \tag{3.16}$$
The series $\Theta_{12}^{-1}$ is absolutely convergent in the domain $0 < -\mathrm{Im}\,\zeta_{12} < \mathrm{Im}\,\tau$.
(c) $\Omega(\zeta_1, u_1; \dots; \zeta_k + \tau, u_k; \dots; \zeta_n, u_n)$ is a symmetric formal series belonging to $\mathbb{C}[e^{\pm 2\pi i\zeta_1}, u_1; \dots; e^{\pm 2\pi i\zeta_n}, u_n][[q]]$ (i.e. $\Omega(\zeta_1, u_1; \dots; \zeta_n, u_n) = \Omega(\zeta_{\sigma_1}, u_{\sigma_1}; \dots; \zeta_{\sigma_n}, u_{\sigma_n})$). As a series in the conformal time differences $\zeta_{j\,j+1}$, $\Omega(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n)$ satisfies the property
$$\Omega(\zeta_{12} + \lambda_1\tau, \dots, \zeta_{n-1\,n} + \lambda_{n-1}\tau; u_1, \dots, u_n) = \exp\Bigl\{-2\pi i\sum_{j,k=1}^{n-1}\lambda_j A^{(n)}_{jk}\lambda_k\,\tau - 4\pi i\sum_{j,k=1}^{n-1}\lambda_j A^{(n)}_{jk}\,\zeta_{k\,k+1}\Bigr\}\,\Omega(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n) \tag{3.17}$$
for all $(\lambda_1, \dots, \lambda_{n-1}) \in \mathbb{Z}^{n-1}$, where $A^{(n)}_{jk} = A^{(n)}_{kj}$ are fixed integer constants (depending just on their indices) and the symmetric matrix $\{A^{(n)}_{jk}\}_{j,k=1}^{n-1}$ is positive definite.
(d) Let $R$ be a commutative complex algebra. The space of Taylor series $F(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau)$ in $q^{\frac{1}{2}}$ which have polynomial coefficients belonging to $R[e^{\pm\pi i\zeta_{12}}, \dots, e^{\pm\pi i\zeta_{n-1\,n}}]$ and obey for all $\lambda_1, \dots, \lambda_{n-1} \in \mathbb{Z}$ the properties:
$$F(\zeta_{12} + \lambda_1, \dots, \zeta_{n-1\,n} + \lambda_{n-1}; \tau) = (-1)^{\sum_{k=1}^{n-1}\lambda_k\varepsilon^{(1)}_k}\,F(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau), \tag{3.18}$$
$$F(\zeta_{12} + \lambda_1\tau, \dots, \zeta_{n-1\,n} + \lambda_{n-1}\tau; \tau) = (-1)^{\sum_{k=1}^{n-1}\lambda_k\varepsilon^{(\tau)}_k}\exp\Bigl\{-2\pi i N\sum_{j,k=1}^{n-1}\lambda_j A^{(n)}_{jk}\lambda_k\,\tau - 4\pi i N\sum_{j,k=1}^{n-1}\lambda_j A^{(n)}_{jk}\,\zeta_{k\,k+1}\Bigr\}\,F(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau), \tag{3.19}$$
where $\varepsilon^{(1)}_k, \varepsilon^{(\tau)}_k = 0, 1$ ($k = 1, \dots, n-1$), $N \in \mathbb{N}$ and $\{A^{(n)}_{jk}\}_{j,k=1}^{n-1}$ is an integral positive matrix, is a finitely generated module over $R[[q^{\frac{1}{2}}]]$. In other words, there exists a finite number of fixed (complex) series $F^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau)$, $c = 1, \dots, C_N$, obeying the properties (3.18) and (3.19) and such that $F(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau) = \sum_{c=1}^{C_N} G_c(\tau)\,F^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau)$ for $G_c(\tau) \in R[[q^{\frac{1}{2}}]]$. Moreover, the basic series $F^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau)$ can be chosen absolutely convergent to analytic functions for $\mathrm{Im}\,\tau > 0$ and $\zeta_{k\,k+1} \in \mathbb{C}$ ($k = 1, \dots, n-1$). If we have a multicomponent series $F_{A_1\cdots A_n}(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau)$ of the above type which, in addition, is $\mathbb{Z}_2$-symmetric (in the sense of Eq. (3.8)) then we can expand it in a finite $R[[q^{\frac{1}{2}}]]$-linear combination of $\mathbb{Z}_2$-symmetric basic series $F^{(N)}_{A_1\cdots A_n;c}(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau)$.
The proof of Proposition 3.4 is straightforward. We present it in Appendix B.

Let us note that taking the ratios
$$E^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n; \tau) := \frac{F^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau)}{\Omega(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n)^N} \tag{3.20}$$
for $c = 1, \dots, C_N$, using the basic systems in Proposition 3.4(d), we obtain systems of formal series (for $N \in \mathbb{N}$) belonging to $\mathbb{C}[[e^{\pm 2\pi i\zeta_1}, u_1]]_+\cdots[[e^{\pm 2\pi i\zeta_n}, u_n]]_+[[q]]$ which are absolutely convergent in the domain $0 < -\mathrm{Im}\,\zeta_{jk} < \mathrm{Im}\,\tau$ for $1 \le j < k \le n$ to elliptic functions in every $\zeta_k$ (Proposition 3.4(b) and (c)):
$$E^{(N)}_c(\zeta_{12} + \lambda_1, \dots, \zeta_{n-1\,n} + \lambda_{n-1}; \tau) = (-1)^{\sum_{k=1}^{n-1}\lambda_k\varepsilon^{(1)}_k}\,E^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau),$$
$$E^{(N)}_c(\zeta_{12} + \lambda_1\tau, \dots, \zeta_{n-1\,n} + \lambda_{n-1}\tau; \tau) = (-1)^{\sum_{k=1}^{n-1}\lambda_k\varepsilon^{(\tau)}_k}\,E^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; \tau). \tag{3.21}$$
The finite temperature correlation functions can be written as linear combinations of such ratios with coefficients that are $\tau$-dependent spherical functions (or, at least, formal Fourier series) in $u_k$.

We are now ready to find out the general structure of the Gibbs (thermal) correlation functions. When one considers the thermodynamic properties of quantum fields, additional assumptions are always needed (see footnote a). In our framework, we impose the minimal assumption that the conformal Hamiltonian $H$ has finite dimensional eigenspaces. This makes it possible to introduce the partition function $Z(\tau)$, and the thermal mean values $\langle\phi_{A_1;\nu_1,m_1,\sigma_1}\cdots\phi_{A_k;\nu_k,m_k,\sigma_k}\rangle_q$ ($q = e^{2\pi i\tau}$) of products of compact picture modes $\phi_{A;\nu,m,\sigma}$ of fields $\phi_A(\zeta, u)$ as formal power series in $q^{\frac{1}{2}}$,
$$Z(\tau) = \mathrm{tr}_{\mathcal{D}}(q^H) = \sum_{j=0}^{\infty}\sum_{\sigma}\langle\Psi_{j\sigma}|\Psi_{j\sigma}\rangle\,q^{\frac{j}{2}}, \tag{3.22}$$
$$\langle\phi_{A_1;\nu_1,m_1,\sigma_1}\cdots\phi_{A_k;\nu_k,m_k,\sigma_k}\rangle_q := \frac{1}{Z(\tau)}\,\mathrm{tr}_{\mathcal{D}}\{\phi_{A_1;\nu_1,m_1,\sigma_1}\cdots\phi_{A_k;\nu_k,m_k,\sigma_k}\,q^H\} = \frac{1}{Z(\tau)}\sum_{j=0}^{\infty}\sum_{\sigma}\langle\Psi_{j\sigma}|\,\phi_{A_1;\nu_1,m_1,\sigma_1}\cdots\phi_{A_k;\nu_k,m_k,\sigma_k}\Psi_{j\sigma}\rangle\,q^{\frac{j}{2}}, \tag{3.23}$$
where $\{\Psi_{j\sigma}\}_{j\sigma}$ is an orthonormal basis in the Hilbert state space consisting of eigenvectors of $H$, $H\Psi_{j\sigma} = \frac{j}{2}\Psi_{j\sigma}$. (Note that $\mathcal{D} = \mathrm{Span}_{\mathbb{C}}\{\Psi_{j\sigma}\}_{j\sigma}$ because of Proposition 3.2(c); note also that the series of $Z(\tau)$ has a leading term 1 so that it is invertible in $\mathbb{C}[[q^{\frac{1}{2}}]]$ (see Fact B.1).) The cyclic property of the traces over each (finite dimensional) eigenspace of $H$ will imply the KMS property:
$$\langle\phi_{A_1;\nu_1,m_1,\sigma_1}\cdots\phi_{A_n;\nu_n,m_n,\sigma_n}\rangle_q = \langle\phi_{A_2;\nu_2,m_2,\sigma_2}\cdots\phi_{A_n;\nu_n,m_n,\sigma_n}\,q^H\phi_{A_1;\nu_1,m_1,\sigma_1}q^{-H}\rangle_q = q^{-\nu_1}\,\langle\phi_{A_2;\nu_2,m_2,\sigma_2}\cdots\phi_{A_n;\nu_n,m_n,\sigma_n}\,\phi_{A_1;\nu_1,m_1,\sigma_1}\rangle_q, \tag{3.24}$$
(according to Eq. (2.27)) as an equality in $\mathbb{C}[[q^{\frac{1}{2}}]]$. Summing over all triples $\nu_l, m_l, \sigma_l$ in the corresponding expansions of $\phi_{A_l}(\zeta_l, u_l)$ by (2.25) we obtain the KMS equation
$$\langle\phi_{A_1}(\zeta_1, u_1)\cdots\phi_{A_n}(\zeta_n, u_n)\rangle_q = \langle\phi_{A_2}(\zeta_2, u_2)\cdots\phi_{A_n}(\zeta_n, u_n)\,\phi_{A_1}(\zeta_1 + \tau, u_1)\rangle_q \tag{3.25}$$
as an equality of formal series belonging to $\mathbb{C}[[q^{\frac{1}{2}}]]\,[[e^{\pm 2\pi i\zeta_1}, u_1; \dots; e^{\pm 2\pi i\zeta_n}, u_n]]$. On the other hand, one can perform the sum in the trace $\mathrm{tr}_{\mathcal{D}}\{\phi_{A_1}(\zeta_1, u_1)\cdots\phi_{A_n}(\zeta_n, u_n)\,q^H\}$ first over the fields' modes and then over the energy levels (taking the sum in the powers of $q^{\frac{1}{2}}$),
$$\mathrm{tr}_{\mathcal{D}}\{\phi_{A_1}(\zeta_1, u_1)\cdots\phi_{A_n}(\zeta_n, u_n)\,q^H\} = \sum_{j=0}^{\infty}\sum_{\sigma}\langle\Psi_{j\sigma}|\,\phi_{A_1}(\zeta_1, u_1)\cdots\phi_{A_n}(\zeta_n, u_n)\Psi_{j\sigma}\rangle\,q^{\frac{j}{2}}; \tag{3.26}$$
this gives a meaning to Eq. (3.25) as an equality in the space $\mathbb{C}[[e^{\pm 2\pi i\zeta_1}, u_1]]_+\cdots[[e^{\pm 2\pi i\zeta_n}, u_n]]_+[[q^{\frac{1}{2}}]]$ (according to Eq. (3.2)). Now, if we multiply both sides of (3.25) by $\Omega(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n)^N$ for some $N \gg 0$, setting
$$F_{A_1\cdots A_n}(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n; \tau) := \Omega^N\,\langle\phi_{A_1}(\zeta_1, u_1)\cdots\phi_{A_n}(\zeta_n, u_n)\rangle_q, \tag{3.27}$$
we find that $F_{A_1\cdots A_n}$ are symmetric formal Fourier series belonging to $\mathbb{C}[e^{\pm\pi i\zeta_1}, u_1; \dots; e^{\pm\pi i\zeta_n}, u_n][[q^{\frac{1}{2}}]]$ (i.e. with symmetric polynomial coefficients multiplying each power of $q^{\frac{1}{2}}$) and they also obey the Properties (3.18) and (3.19) with $\varepsilon^{(1)}_k = \varepsilon^{(\tau)}_k = \sum_{\ell=k+1}^{n} p_\ell$ for $k = 1, \dots, n-1$ ($p_\ell$ being the fermion parities). The first statement is verified by using the rationality (Theorem 3.3) and Eq. (3.15), while the second uses Proposition 3.4(c) and the KMS equation (3.25) combined with the $\mathbb{Z}_2$-symmetry (3.8). Thus, we can apply Proposition 3.4(d) (with $R$ the algebra $\mathbb{C}[u_1, \dots, u_n]$ of harmonic polynomials in $u_1, \dots, u_n$ restricted to $\mathbb{S}^{D-1}$), obtaining the expansion
$$\langle\phi_{A_1}(\zeta_1, u_1)\cdots\phi_{A_n}(\zeta_n, u_n)\rangle_q = \sum_{c=1}^{C_N} G_{c;A_1\cdots A_n}(u_1, \dots, u_n; \tau)\,E^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n; \tau) \tag{3.28}$$
in the basic elliptic functions of the system (3.20). The coefficients $G_{c;A_1\cdots A_n}(u_1, \dots, u_n; \tau)$ are Taylor series in $q^{\frac{1}{2}}$ with symmetric polynomial coefficients in $u_1, \dots, u_n \in \mathbb{S}^{D-1}$ (i.e. $G_{c;A_1\cdots A_n} \in \mathbb{C}[u_1, \dots, u_n][[q^{\frac{1}{2}}]]$). We thus end up with the following result.

Theorem 3.5. Under the assumptions of Sec. 2 and the additional condition that the conformal Hamiltonian $H$ has finite dimensional eigenspaces, every $n$-point thermal correlation function $\langle\phi_{A_1}(\zeta_1, u_1)\cdots\phi_{A_n}(\zeta_n, u_n)\rangle_q$ admits a formal series representation of type (3.28) where $G_{c;A_1\cdots A_n}(u_1, \dots, u_n; \tau)$ are Taylor series in $q^{\frac{1}{2}}$ with symmetric (harmonic) polynomial coefficients in $u_1, \dots, u_n \in \mathbb{S}^{D-1}$ and $E^{(N)}_c(\zeta_{12}, \dots, \zeta_{n-1\,n}; u_1, \dots, u_n; \tau)$ ($c = 1, \dots, C_N$) are some fixed series which belong to $\mathbb{C}[[e^{\pm 2\pi i\zeta_1}, u_1]]_+\cdots[[e^{\pm 2\pi i\zeta_n}, u_n]]_+[[q^{\frac{1}{2}}]]$ and are absolutely convergent in the domain
$$\bigl\{(\zeta_1, u_1; \dots; \zeta_n, u_n; \tau) \in (\mathbb{C}\times\mathbb{S}^{D-1})^n\times\mathfrak{H} : 0 < -\mathrm{Im}\,\zeta_{jk} < \mathrm{Im}\,\tau \ (1 \le j < k \le n)\bigr\} \tag{3.29}$$
to $\mathbb{Z}_2$-symmetric (in the sense of Eq. (3.8)) meromorphic functions over $(\mathbb{C}\times\mathbb{S}^{D-1})^n\times\mathfrak{H}$. Moreover, the resulting functions are doubly periodic (resp., antiperiodic) in $\zeta_m$ with periods 1 and $\tau$ if $\phi_{A_m}$ is a bosonic (resp., fermionic) field, for $m = 1, \dots, n$.

The problem of summability of the angular coefficients $G_{c;A_1\cdots A_n}(u_1, \dots, u_n; \tau)$ is still open. Let us note first that if we exchange the order of summation in $G_{c;A_1\cdots A_n}$, first summing in the powers of $q^{\frac{1}{2}}$ and then over the harmonic polynomials in $u_k$, we will obtain $G_{c;A_1\cdots A_n}$ as elements of the space $\mathbb{C}[[q^{\frac{1}{2}}]][[u_1, \dots, u_n]]$, i.e. the space of infinite harmonic power series in $u_1, \dots, u_n$ with coefficients in $\mathbb{C}[[q^{\frac{1}{2}}]]$. These coefficients can be expressed by a finite set of series in $q^{\frac{1}{2}}$ of types (3.22) and (3.23), using the operations of summation and multiplication, since all above considerations have been made in a purely algebraic setting. Thus, if we assume that the partition function and all thermal mean values of products of fields' modes are absolutely convergent series for $|q^{\frac{1}{2}}| < 1$, we obtain $G_{c;A_1\cdots A_n}$ as infinite formal Fourier series in $(u_1, \dots, u_n) \in \mathbb{S}^{(D-1)n}$ whose convergence should be further assumed in order to end up with elliptic finite temperature correlation functions.
Let us conclude this discussion with the remark that in a chiral conformal QFT (which is, essentially, a 1-dimensional theory), there are no angular variables $u$ so that Theorem 3.5 actually states the existence of the finite temperature correlation functions as elliptic functions under the assumptions of convergence of the partition function and the thermal mean values of the product of the fields' modes.

Corollary 3.6. Under the assumptions of Theorem 3.5, let the finite temperature correlation functions absolutely converge in the domain (3.29) to meromorphic functions. Then these functions are elliptic of the type of $E^{(N)}_c$ appearing in the representation (3.27).

In the following section we shall calculate the finite temperature correlation functions in free fields' models and will see that they satisfy the above assumptions and are indeed elliptic functions.

4. Free Field Models

4.1. General properties of thermal correlation functions of free fields

A generalized free field is defined as a Fock space representation of the Heisenberg–Dirac algebra with generators $\phi_{A\{n,m,\sigma\}}$ as in Eq. (2.13) (for $A = 1, \dots, I$). It is
completely determined by its 2-point function
$$\langle 0|\,\phi_A(z)\,\phi^*_B(w)\,|0\rangle = \iota_{z,w}W_{AB}(z, w), \qquad W_{AB}(z, w) = \frac{Q_{AB}(z - w)}{[(z - w)^2]^{\mu_{AB}}}, \tag{4.1}$$
where $Q_{AB}(z)$ are polynomials and we recall that $\iota_{z,w}$ stands for the Taylor expansion of $W_{AB}(z, w)$ in $w$ whose coefficients are rational functions in $z$. Note that the $\iota_{z,w}$ operation is the $z$-picture counterpart of the "$i0(x^0 - y^0)$" prescription in Minkowski space which turns, for example, the rational function $\frac{1}{(x-y)^2}$ into the distribution $\frac{1}{(x-y)^2 + i0(x^0 - y^0)}$. Then the generating function of the modes' (anti)commutation relations is
$$\phi_A(z)\,\phi^*_B(w) - (-1)^{p_Ap_B}\,\phi^*_B(w)\,\phi_A(z) = \iota_{z,w}W_{AB}(z, w) - \iota_{w,z}W_{AB}(z, w) = \iota_{z,w}W_{AB}(z, w) - (-1)^{p_Ap_B}\,\iota_{w,z}W_{BA}(w, z). \tag{4.2}$$
The annihilation operators are the modes $\phi_{A\{n,m,\sigma\}}$ with $n < 0$. The Fock space is generated by the one particle state space $\mathcal{D}_1$ spanned by the vectors $\phi_{A\{n,m,\sigma\}}|0\rangle$ for $n \ge 0$ and its hermitian scalar product is determined by the contributions of the Laurent modes $\langle 0|\,\phi_{A\{n,m,\sigma\}}\,\phi^*_{B\{n,m,\sigma\}}\,|0\rangle$ to the 2-point function (4.1). We will not assume, in general, that the inner product in $\mathcal{D}_1$ is positive definite. The rational two-point function $W_{AB}(z, w)$ is, by assumption, conformally invariant with respect to a cocycle $\pi_z(g)^B_A$.

The partition function $\mathrm{tr}_{\mathcal{D}}(q^H)$ and the other traces below are understood as traces taken over some (pseudo)orthonormal basis of $\mathcal{D}$ consisting of eigenvectors of $H$ (as in Sec. 3). It is a Taylor series in $q^{\frac{1}{2}}$ which is always convergent for $q^{\frac{1}{2}} = e^{i\pi\tau}$ with $\mathrm{Im}\,\tau > 0$ ($|q^{\frac{1}{2}}| < 1$) since the degree of degeneracy of the conformal energy level $n$ in the 1-particle state space has an upper bound of the form $C_1\binom{n + C_2}{D - 1}$ with some positive constants $C_{1,2}$. More specifically, due to the spin-statistics theorem (which follows from the rationality of (4.1)), the integer conformal energy levels $n$ in $\mathcal{D}_1$ should belong to the bosonic 1-particle subspace while the half-integer ones, $n - \frac{1}{2}$, belong to the fermionic subspace. Then, the partition function is determined by the dimensions of these energy spaces. Let us denote these dimensions by $d_B(n)$ and $d_f(n)$ (for the bosonic and fermionic 1-particle spaces of energies $n$ and $n - \frac{1}{2}$, respectively); then we will have
$$Z(\tau) := \mathrm{tr}_{\mathcal{D}}(q^H) = \prod_{n=1}^{\infty}\frac{\bigl(1 + q^{n-\frac{1}{2}}\bigr)^{d_f(n)}}{(1 - q^n)^{d_B(n)}}. \tag{4.3}$$
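The product formula (4.3) can be expanded as a formal power series in $x = q^{\frac{1}{2}}$ using integer arithmetic only. The following sketch is an illustration under stated assumptions: the function name `partition_series` is hypothetical, and the degeneracies $d_B(n)$ and $d_f(n)$ are supplied by the caller as Python functions, since the text fixes them only model by model.

```python
def partition_series(d_bose, d_fermi, order):
    """Coefficients c[j] of Z = sum_j c[j] * q^(j/2) for 0 <= j <= order, from Eq. (4.3).

    d_bose(n) and d_fermi(n) are the 1-particle degeneracies at energies n and n - 1/2.
    """
    c = [0] * (order + 1)
    c[0] = 1
    n = 1
    while 2 * n - 1 <= order:
        step_f = 2 * n - 1  # q^(n - 1/2) = x^(2n - 1) with x = q^(1/2)
        for _ in range(d_fermi(n)):
            # Multiply the series by (1 + x^step_f), in place, from the top down.
            for i in range(order, step_f - 1, -1):
                c[i] += c[i - step_f]
        step_b = 2 * n  # q^n = x^(2n)
        if step_b <= order:
            for _ in range(d_bose(n)):
                # Divide the series by (1 - x^step_b), in place, from the bottom up.
                for i in range(step_b, order + 1):
                    c[i] += c[i - step_b]
        n += 1
    return c
```

With one bosonic state per level the coefficients of even powers of $x$ are the partition numbers, and with one fermionic state per level they count partitions into distinct odd parts; both serve as sanity checks of the expansion. The choice `d_bose = lambda n: n * n` used as a further check is only an illustrative degeneracy.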
It is also easy to see that the temperature mean value (3.23) of the products of (compact picture) modes $\phi_{A;\nu,m}(u)$ (see Eq. (2.25)) is absolutely convergent. Moreover, it is expressed by Wick theorem in terms of "1-" and "2-point" Gibbs expectation values $\langle\phi_{A;\nu,m}(u)\rangle_q$ and $\langle\phi_{A;\nu_1,m_1}(u_1)\,\phi^*_{B;\nu_2,m_2}(u_2)\rangle_q$, where $\phi^*_{B;\nu,m}(u)$ $\bigl(= \sum_{\sigma}\phi^*_{B;\nu,m,\sigma}\,h^{(m)}_{\sigma}(u)\bigr)$ are the modes of the conjugate field $\phi^*_B(\zeta, u)$ (see Eq. (2.27)). Combining the KMS property (3.24):
$$\langle\phi_{A;\nu_1,m_1}(u_1)\,\phi^*_{B;\nu_2,m_2}(u_2)\rangle_q = q^{-\nu_1}\,\langle\phi^*_{B;\nu_2,m_2}(u_2)\,\phi_{A;\nu_1,m_1}(u_1)\rangle_q, \tag{4.4}$$
with the canonical (anti)commutation relations (4.2) of the modes we obtain
$$\langle\phi_{A;\nu_1,m_1}(u_1)\,\phi^*_{B;\nu_2,m_2}(u_2)\rangle_q = \frac{1}{1 - (-1)^{p_Ap_B}q^{\nu_1}}\,\langle 0|\,\bigl[\phi_{A;\nu_1,m_1}(u_1), \phi^*_{B;\nu_2,m_2}(u_2)\bigr]_{-(-1)^{p_Ap_B}}\,|0\rangle. \tag{4.5}$$
Theorem 4.1. The series $\langle\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\rangle_q$ is absolutely convergent for $0 < -\mathrm{Im}\,\zeta_{12} < \mathrm{Im}\,\tau$ to an elliptic function in $\zeta_{12}$. It can be written as a series
$$\langle\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\rangle_q = \sum_{k=-\infty}^{\infty}(-1)^{kp_Ap_B}\,W_{AB}(\zeta_{12} + k\tau; u_1, u_2), \tag{4.6}$$
absolutely convergent in the same domain; here $W_{AB}(\zeta_{12}; u_1, u_2)$ is the meromorphic vacuum correlation function
$$W_{AB}(\zeta_{12}; u_1, u_2) := \langle 0|\,\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\,|0\rangle. \tag{4.7}$$
The functions (4.6) are manifestly doubly periodic elliptic functions in $\zeta_{12}$.
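Proof. First observe that $\phi^*_{B;\nu,m}(u)|0\rangle = 0$ if $\nu \geq 0$ (in accord with Proposition 3.2) and therefore $\langle 0|\phi_{B;\nu,m}(u) = 0$ if $\nu \leq 0$. Thus, at most one term contributes to the (anti)commutator in the right-hand side of (4.5), and in fact
\[
\langle 0|\bigl[\phi_{A;\nu_1,m_1}(u_1), \phi^*_{B;\nu_2,m_2}(u_2)\bigr]_{-(-1)^{p_A p_B}}|0\rangle
= \theta_{\nu_1}\,\langle 0|\phi_{A;\nu_1,m_1}(u_1)\,\phi^*_{B;\nu_2,m_2}(u_2)|0\rangle
- (-1)^{p_A p_B}\,\theta_{-\nu_1}\,\langle 0|\phi^*_{B;\nu_2,m_2}(u_2)\,\phi_{A;\nu_1,m_1}(u_1)|0\rangle ,
\]
where $\theta_s$ is the characteristic function of the positive numbers ($\theta_s := 1$ for $s > 0$ and $\theta_s = 0$ otherwise). Expanding for $|q^{1/2}| < 1$ ($q^{1/2} = e^{i\pi\tau}$) the prefactor in the right-hand side of (4.5), we find:
\[
\begin{aligned}
\langle\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\rangle_q
&= \sum_{\nu_1,m_1}\sum_{\nu_2,m_2} \langle\phi_{A;\nu_1,m_1}(u_1)\,\phi^*_{B;\nu_2,m_2}(u_2)\rangle_q\, e^{-2\pi i(\nu_1\zeta_1 + \nu_2\zeta_2)} \\
&= \sum_{\nu,m_1,m_2}\sum_{k=0}^{\infty} \Bigl( \theta_\nu\,(-1)^{k p_A p_B}\, q^{k\nu}\,\langle 0|\phi_{A;\nu,m_1}(u_1)\,\phi^*_{B;-\nu,m_2}(u_2)|0\rangle \\
&\qquad\quad + \theta_{-\nu}\,(-1)^{k p_A p_B}\, q^{-(k+1)\nu}\,\langle 0|\phi^*_{B;-\nu,m_2}(u_2)\,\phi_{A;\nu,m_1}(u_1)|0\rangle \Bigr)\, e^{-2\pi i\nu\zeta_{12}} .
\end{aligned} \tag{4.8}
\]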
If we first perform the sum over the indices ν, m1 , m2 in the right-hand side of (4.8) we obtain (due to Theorem 3.3) the series expansion in Eq. (4.6): indeed, the first
N. M. Nikolov & I. T. Todorov
term in the sum gives $\sum_{k=0}^{\infty} (-1)^{k p_A p_B}\, W_{AB}(\zeta_{12} + k\tau; u_1, u_2)$ while the second gives $\sum_{k=1}^{\infty} (-1)^{(-k+1) p_A p_B}\, W^*_{BA}(-\zeta_{12} - k\tau; u_2, u_1)$, where
\[
W^*_{BA}(\zeta_{12}; u_1, u_2) := \langle 0|\phi^*_B(\zeta_1, u_1)\,\phi_A(\zeta_2, u_2)|0\rangle ,
\]
so we should further apply the symmetry property
\[
W_{AB}(\zeta_{12}; u_1, u_2) = (-1)^{p_A p_B}\, W^*_{BA}(-\zeta_{12}; u_2, u_1). \tag{4.9}
\]
The series (4.6) is absolutely convergent since its terms behave as
\[
\begin{aligned}
W_{AB}(\zeta_{12} + k\tau; u_1, u_2) &\sim q^{k d_B}\, e^{2\pi i(d_A\zeta_1 + d_B\zeta_2)}\, W_{AB}(e^{2\pi i\zeta_1}u_1, 0) && \text{for } k \to \infty, \\
W_{AB}(\zeta_{12} + k\tau; u_1, u_2) &\sim q^{-k d_A}\, e^{2\pi i(d_A\zeta_1 + d_B\zeta_2)}\, W_{AB}(0, e^{2\pi i\zeta_2}u_2) && \text{for } k \to -\infty.
\end{aligned}
\]
The series of the finite temperature correlation function $\langle\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\rangle_q$ given by the first equality in (4.8) is also absolutely convergent for $0 < -\mathrm{Im}\,\zeta_{12} < \mathrm{Im}\,\tau$ since the series of $W_{AB}$ absolutely converges in this domain.

Remark 4.1. Let $N$ be a hermitian operator, commuting with the conformal Hamiltonian $H$ and such that
\[
[N, \phi^*_A(z)] = n_A\,\phi^*_A(z), \qquad [N, \phi_A(z)] = -n_A\,\phi_A(z). \tag{4.10}
\]
Then we can derive in the same way as above the following expression for the grand canonical correlation functions
\[
\langle\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\rangle_{q,\mu} := \frac{\mathrm{tr}_{\mathcal{D}}\bigl(\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\, q^H e^{2\pi i\mu N}\bigr)}{\mathrm{tr}_{\mathcal{D}}\bigl(q^H e^{2\pi i\mu N}\bigr)} = \sum_{k=-\infty}^{\infty} e^{\pi i k(2\mu + p_A p_B)}\, W_{AB}(\zeta_{12} + k\tau; u_1, u_2), \tag{4.11}
\]
for real $\mu$. (In the physical literature, the grand canonical partition function is written as $\mathrm{tr}(e^{-\beta(H-\mu N)})$ where $\beta$ is the inverse temperature and $\mu$ is the chemical potential.)

Remark 4.2. In the assumptions of Corollary 3.6 one can state that for the thermal 2-point function $\langle\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)\rangle_q$ of an arbitrary field $\phi = \{\phi_A\}$, the right-hand side of Eq. (4.6) describes the most singular part in $\zeta_{12}$ since it comes from the most singular part of the operator product expansion of $\phi_A(\zeta_1, u_1)\,\phi^*_B(\zeta_2, u_2)$ (see [29, Proposition 4.3]).

4.2. Free scalar fields

The generalized free neutral scalar field $\phi(z) \equiv \phi^{(d)}(z)$ of dimension $d$ is determined by the unique conformally invariant scalar 2-point function (2.28).
Many of the modes in the field expansion (2.13) are zero, so it is convenient to reduce the system of basic functions and to organize the field modes in a slightly different way. Let us denote by $\phi_{-d-n}(z)$ the homogeneous operator-valued polynomial of degree $n \geq 0$ contributing to the Taylor part of the expansion (2.13) of $\phi(z)$. The polynomial $\phi_{-d-n}(\bar z)^*$ obtained by conjugating the coefficients of $\phi_{-d-n}(z)$ is denoted by
\[
\phi_{n+d}(z) = \phi_{-d-n}(\bar z)^* \tag{4.12}
\]
($n \geq 0$). Due to Proposition 3.2, the creation modes of the field are exactly $\{\phi_{-n-d} : n \geq 0\}$, so that the remaining non-zero field modes $\{\phi_{n+d} : n \geq 0\}$ annihilate the vacuum $|0\rangle$. Thus the field $\phi(z)$ is expanded in the above modes as follows:
\[
\phi(z) = \sum_{n=0}^{\infty} \phi_{-n-d}(z) + \sum_{n=0}^{\infty} (z^2)^{-n-d}\,\phi_{n+d}(z). \tag{4.13}
\]
The commutation relations with the conformal Hamiltonian take the form
\[
[H, \phi_n(z)] = -n\,\phi_n(z) \qquad (n \in \mathbb{Z}). \tag{4.14}
\]
The vacuum matrix elements of products of field modes are derived from the 2-point function:
\[
\langle 0|\phi(\bar z)^*\,\phi(w)|0\rangle = \frac{1}{(1 - 2z\cdot w + z^2 w^2)^d} = \sum_{n=0}^{\infty} \tilde{C}^d_n(z, w), \tag{4.15}
\]
where $\tilde{C}^d_n(z, w)$ are polynomials separately homogeneous in $z$ and $w$ of equal degrees $n$ with generating function
\[
\frac{1}{(1 - 2z\cdot w\,\lambda + z^2 w^2\lambda^2)^d} = \sum_{n=0}^{\infty} \tilde{C}^d_n(z, w)\,\lambda^n . \tag{4.16}
\]
Then
\[
\langle 0|\phi_{m+d}(z)\,\phi_{-n-d}(w)|0\rangle = \delta_{m,n}\,\tilde{C}^d_n(z, w) \tag{4.17}
\]
for $m, n \geq 0$. Note that the polynomials $\tilde{C}^d_n(z, w)$ are related to the Gegenbauer polynomials $C^d_n(t)$ (with generating function $(1 - 2t\lambda + \lambda^2)^{-d} = \sum_{n=0}^{\infty} C^d_n(t)\,\lambda^n$) by
\[
\tilde{C}^d_n(z, w) = (z^2 w^2)^{\frac{n}{2}}\, C^d_n\!\left(\frac{z\cdot w}{(z^2 w^2)^{1/2}}\right).
\]
In the real compact picture, we set
\[
\phi(\zeta, u) = e^{2\pi i d\zeta}\,\phi(e^{2\pi i\zeta}u) = \sum_{\substack{n\in\mathbb{Z} \\ |n|\geq d}} e^{-2\pi i n\zeta}\,\phi_{-n}(u) \tag{4.18}
\]
as a formal Fourier series in $\zeta$. Taking into account the relation (which follows directly from the definition (4.18))
\[
\langle 0|\phi(\zeta_1, u_1)\,\phi(\zeta_2, u_2)|0\rangle = e^{2\pi i d(\zeta_1 + \zeta_2)}\,\langle 0|\phi(e^{2\pi i\zeta_1}u_1)\,\phi(e^{2\pi i\zeta_2}u_2)|0\rangle \tag{4.19}
\]
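we find
\[
\langle 0|\phi(\zeta_1, u_1)\,\phi(\zeta_2, u_2)|0\rangle = \frac{(-1)^d}{4^d\,\sin^d\pi\zeta_+\,\sin^d\pi\zeta_-}, \tag{4.20}
\]
where $\zeta_\pm = \zeta_{12} \pm \alpha$, $\cos 2\pi\alpha = u_1\cdot u_2$. Then Eq. (4.5) takes the form
\[
\langle\phi(\zeta_1, u_1)\,\phi(\zeta_2, u_2)\rangle_q = \sum_{k=-\infty}^{\infty} \frac{(-1)^d}{4^d\,\sin^d\pi(\zeta_+ + k\tau)\,\sin^d\pi(\zeta_- + k\tau)} . \tag{4.21}
\]
For $d = 1$, we obtain
\[
\langle\phi(\zeta_1, u_1)\,\phi(\zeta_2, u_2)\rangle_q = \frac{1}{4\pi\sin 2\pi\alpha}\,\bigl(p_1(\zeta_+, \tau) - p_1(\zeta_-, \tau)\bigr), \tag{4.22}
\]
where $p_k(\zeta, \tau)$ are written down in Appendix A (see (A.28)). Equation (4.22) follows from the identity
\[
\frac{-1}{\sin\pi\zeta_+\,\sin\pi\zeta_-} = \frac{1}{\sin 2\pi\alpha}\,(\cot\pi\zeta_+ - \cot\pi\zeta_-) \tag{4.23}
\]
and (4.21). Note that the differences in (4.23) and (4.22) allow us to apply Eq. (A.28) and ensure the ellipticity (double periodicity) in $\zeta_{12}$ of the thermal correlation function.

Remark 4.3. The Gibbs 2-point function of the modes $\phi_n(u)$ in the latter example ($d = 1$),
\[
\langle\phi_{-m}(u_1)\,\phi_n(u_2)\rangle_q = \delta_{mn}\,\frac{q^n}{1 - q^n}\,\frac{\sin 2\pi n\alpha}{\sin 2\pi\alpha} \tag{4.24}
\]
(for $u_1\cdot u_2 = \cos 2\pi\alpha$), which can be derived directly from the canonical commutation relations and the KMS condition, yields the $q$-expansion of (4.22)
\[
\langle\phi(\zeta_1, u_1)\,\phi(\zeta_2, u_2)\rangle_q = \frac{-1}{4\sin\pi\zeta_+\,\sin\pi\zeta_-} + 2\sum_{n=1}^{\infty} \frac{q^n}{1 - q^n}\,\frac{\sin 2\pi n\alpha}{\sin 2\pi\alpha}\,\cos 2\pi n\zeta_{12} . \tag{4.25}
\]
Comparing with (4.22), we deduce a similar expansion for $p_1$:
\[
p_1(\zeta, \tau) = \pi\cot\pi\zeta + 4\pi\sum_{n=1}^{\infty} \frac{q^n}{1 - q^n}\,\sin 2\pi n\zeta . \tag{4.26}
\]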
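The agreement between the image sum (4.21) and the $q$-expansion (4.25) can be checked numerically for $d = 1$. The following is only an illustrative sketch: the function names, the truncation orders and the sample point are our own choices, and we assume the convention $q = e^{2\pi i\tau}$ (consistent with $q^{1/2} = e^{i\pi\tau}$ above).

```python
import cmath
import math


def vacuum_2pt(zeta12, alpha):
    """Vacuum 2-point function (4.20) for d = 1."""
    zp, zm = zeta12 + alpha, zeta12 - alpha
    return -1.0 / (4 * cmath.sin(math.pi * zp) * cmath.sin(math.pi * zm))


def thermal_image_sum(zeta12, alpha, tau, kmax=40):
    """Truncated image sum (4.21) for d = 1: shifts zeta12 -> zeta12 + k*tau."""
    return sum(vacuum_2pt(zeta12 + k * tau, alpha) for k in range(-kmax, kmax + 1))


def thermal_q_series(zeta12, alpha, tau, nmax=80):
    """Truncated q-expansion (4.25), assuming q = exp(2*pi*i*tau)."""
    q = cmath.exp(2j * math.pi * tau)
    series = sum(
        q**n / (1 - q**n)
        * math.sin(2 * math.pi * n * alpha) / math.sin(2 * math.pi * alpha)
        * cmath.cos(2 * math.pi * n * zeta12)
        for n in range(1, nmax + 1)
    )
    return vacuum_2pt(zeta12, alpha) + 2 * series


def identity_4_23_gap(zeta12, alpha):
    """Absolute difference between the two sides of the identity (4.23)."""
    zp, zm = math.pi * (zeta12 + alpha), math.pi * (zeta12 - alpha)
    lhs = -1.0 / (cmath.sin(zp) * cmath.sin(zm))
    cot_diff = cmath.cos(zp) / cmath.sin(zp) - cmath.cos(zm) / cmath.sin(zm)
    return abs(lhs - cot_diff / math.sin(2 * math.pi * alpha))


# Hypothetical sample point inside the domain 0 < -Im(zeta12) < Im(tau).
TAU = 0.8j
ZETA12 = 0.13 - 0.05j
ALPHA = 0.21
```

Both truncated sums converge geometrically at this sample point, so modest cutoffs suffice; closer to the boundary of the strip $0 < -\mathrm{Im}\,\zeta_{12} < \mathrm{Im}\,\tau$ the cutoffs would have to be increased.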
In the more general context of Remark 4.1, for a complex scalar field of dimension 1 ($\langle 0|\phi(z_1)\,\phi^*(z_2)|0\rangle = (z_{12}^2)^{-1}$), taking $N$ to be the charge operator (with $n = 1$ in Eq. (4.10)), we find
\[
\langle\phi(\zeta_1, u_1)\,\phi^*(\zeta_2, u_2)\rangle_{q,\mu} = \frac{1}{4\pi\sin 2\pi\alpha}\,\bigl(p_1(\zeta_+, \tau, \mu) - p_1(\zeta_-, \tau, \mu)\bigr) \tag{4.27}
\]
for the more general functions $p_1(\zeta, \tau, \mu)$ of Appendix A.

In order to find the mean energy (or the partition function), we have to specify the space-time dimension $D$ together with the field dimension $d$. We will consider the following two basic examples.
4.2.1. Canonical free massless field in even space time dimension D

The canonical free field is determined by the Laplace equation:
\[
\partial_z^2\,\phi^{(d)}(z) = 0 \;\Leftrightarrow\; \partial_{z_1}^2\,\frac{1}{(z_{12}^2)^d} = 0 \;\Leftrightarrow\; d = d_0 := \frac{D-2}{2} . \tag{4.28}
\]
The existence of the canonical free field as a GCI field requires $D$ to be even and greater than 2. Then the polynomials $\tilde{C}^{d_0}_n(z, w)$ are harmonic in both $z$ and $w$, and they determine a positive definite scalar product by Eq. (4.17). Thus, the canonical free fields satisfy the Hilbert space Wightman positivity. The operator-valued polynomials $\phi_{-n-d}(z)$ are harmonic, i.e.
\[
\phi_{-n-d}(z) = \sum_\sigma \phi_{\{0,n,\sigma\}}\, h^{(n)}_\sigma(z) \tag{4.29}
\]
in the notations of Sec. 2.2, so that the only non-zero modes of $\phi(z)$ are $\phi_{\{0,n,\sigma\}}$ and $\phi_{\{-n-d_0,n,\sigma\}}$ for $n = 0, 1, \ldots$. It then follows that the 1-particle eigenspace of conformal energy $n$ ($\geq d_0$) is isomorphic to the space of the harmonic polynomials on $\mathbb{C}^D$ of degree $n - d_0$. Its dimension $d^{(D)}(n)$ ($= d(n) \equiv d_B(n)$) is thus
\[
d^{(D)}(n) = \frac{2n}{(2d_0)!}\prod_{k=1-d_0}^{d_0-1} (n - k) \quad \text{for } D > 4 \qquad \bigl(d^{(4)}(n) = n^2\bigr), \tag{4.30}
\]
which is an even polynomial in $n$, for even $D$, of degree $2d_0$, say,
\[
d^{(D)}(n) = \sum_{k=0}^{d_0} c^{(D)}_k\, n^{2k} \qquad \Bigl(= \frac{2n^2}{(2d_0)!}\prod_{k=1}^{d_0-1} (n^2 - k^2) \ \text{for } D > 4\Bigr). \tag{4.31}
\]
Note that $d^{(D)}(n) = 0$ for $n = 1, \ldots, d_0 - 1$, so that the thermal energy mean value is
\[
\langle H\rangle_q \equiv \frac{\mathrm{tr}_{\mathcal{D}}(H q^H)}{\mathrm{tr}_{\mathcal{D}}(q^H)} = \frac{1}{Z(\tau)}\, q\frac{\partial}{\partial q} Z(\tau) = \sum_{n=1}^{\infty} \frac{n\, d(n)\, q^n}{1 - q^n} = \sum_{k=1}^{d_0+1} c^{(D)}_{k-1}\,\frac{B_{2k}}{4k} + \sum_{k=1}^{d_0+1} c^{(D)}_{k-1}\, G_{2k}(\tau), \tag{4.32}
\]
where $G_{2k}(\tau)$ are the level 1 modular forms (see (A.10) and (A.16)) and $B_{2k}$ are the Bernoulli numbers (see Appendix A). This agrees with Eq. (4.3) since here $d_f(n) = 0$ and $d_B(n) \equiv d(n)$. Note that $d^{(2)}(n) = c^{(2)}_0 = 2$, while for $D \geq 4$, $c^{(D)}_0 = 0$. In particular, for $D = 4$ we find
\[
\langle H + E_0\rangle^{(4)}_q = G_4(\tau), \qquad E_0 = \frac{1}{240} . \tag{4.33}
\]
If we interpret $E_0$ as a vacuum energy, i.e. renormalize the conformal Hamiltonian as $\tilde{H} = H + E_0$, then its temperature mean value would be a modular form of weight 4.
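As a cross-check of the dimension formula (4.30) and of the vacuum energy read off from (4.32), one can compare the product formula with a direct count of harmonic polynomials. The sketch below rests on two assumptions that are ours rather than the text's: the standard count $\binom{m+D-1}{D-1} - \binom{m+D-3}{D-1}$ for harmonic homogeneous polynomials of degree $m$ in $D$ variables, and hard-coded Bernoulli numbers $B_2 = 1/6$, $B_4 = -1/30$, $B_6 = 1/42$. All function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial, prod

# Bernoulli numbers B_2, B_4, B_6 (hard-coded, standard values).
BERNOULLI = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}


def dim_product_formula(D, n):
    """d^(D)(n) from the product formula (4.30), for even D >= 4."""
    d0 = (D - 2) // 2
    numerator = 2 * n * prod(n - k for k in range(1 - d0, d0))
    return Fraction(numerator, factorial(2 * d0))


def dim_harmonic(D, n):
    """Harmonic homogeneous polynomials of degree n - d0 in D variables."""
    d0 = (D - 2) // 2
    m = n - d0
    if m < 0:
        return 0
    return comb(m + D - 1, D - 1) - comb(m + D - 3, D - 1)


def vacuum_energy(coeffs):
    """E_0 = -sum_k c_{k-1} B_{2k} / (4k), with coeffs[k-1] = c_{k-1} as in (4.32)."""
    return -sum(
        Fraction(c) * BERNOULLI[2 * k] / (4 * k)
        for k, c in enumerate(coeffs, start=1)
    )


# D = 4: d(n) = n^2, i.e. c_0 = 0 and c_1 = 1.
E0_D4 = vacuum_energy([0, 1])
```

With these inputs `E0_D4` evaluates to the value quoted in (4.33), and the two dimension counts agree for the even dimensions and energy levels tried in the accompanying test.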
Remark 4.4. Extrapolation to the case $D = 2$ of the above result contains two chiral components, each of them giving the energy distribution for a $U(1)$ current
\[
\langle H + E_0\rangle^{(2)}_q = G_2(\tau), \qquad E_0 = -\frac{1}{24}, \tag{4.34}
\]
which is not modular invariant.
4.2.2. Subcanonical field of dimension d = 1 for D = 6

The scalar field of dimension $d = 1$ in $D = 6$ space-time dimensions is not harmonic but satisfies the fourth-order equation $(\partial_z^2)^2\phi(z) = 0$. The harmonic polynomials on $\mathbb{C}^6$ are now $\tilde{C}^2_n(z, w)$. The identity
\[
C^1_n(t) = \frac{1}{n+1}\,\bigl(C^2_n(t) - C^2_{n-2}(t)\bigr) \tag{4.35}
\]
implies the following harmonic decomposition of the homogeneous polynomials $\tilde{C}^1_n(z, w)$:
\[
\tilde{C}^1_n(z, w) = \frac{1}{n+1}\,\tilde{C}^2_n(z, w) - \frac{1}{n+1}\, z^2 w^2\,\tilde{C}^2_{n-2}(z, w). \tag{4.36}
\]
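The Gegenbauer identity (4.35) is easy to test in exact arithmetic. The sketch below assumes the standard three-term recurrence $k\,C^\lambda_k(t) = 2t(k+\lambda-1)\,C^\lambda_{k-1}(t) - (k+2\lambda-2)\,C^\lambda_{k-2}(t)$ with $C^\lambda_0 = 1$, $C^\lambda_1 = 2\lambda t$; the function name and the convention $C^\lambda_n = 0$ for $n < 0$ are illustrative choices.

```python
from fractions import Fraction


def gegenbauer(lam, n, t):
    """Gegenbauer polynomial C_n^lam(t) via the standard three-term recurrence.

    Returns 0 for n < 0, which is the convention needed in (4.35) for n = 0, 1.
    """
    t = Fraction(t)
    if n < 0:
        return Fraction(0)
    c_prev, c_cur = Fraction(1), 2 * lam * t
    if n == 0:
        return c_prev
    for k in range(2, n + 1):
        c_prev, c_cur = c_cur, (2 * t * (k + lam - 1) * c_cur - (k + 2 * lam - 2) * c_prev) / k
    return c_cur


def identity_4_35_holds(n, t):
    """Check (n + 1) * C_n^1(t) == C_n^2(t) - C_{n-2}^2(t) exactly."""
    return (n + 1) * gegenbauer(1, n, t) == gegenbauer(2, n, t) - gegenbauer(2, n - 2, t)
```

Because the arithmetic uses `Fraction`, the comparison is exact rather than up to rounding.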
Thus we can decompose
\[
\phi_{-n-1}(z) = \phi^1_{-n-1}(z) + z^2\,\phi^2_{-n-1}(z), \tag{4.37}
\]
where $\phi^1_{-n-1}(z)$ and $\phi^2_{-n-1}(z)$ are now harmonic homogeneous operator-valued polynomials of degrees $n$ and $n - 2$, respectively (as $\phi^1_0 := 0$ and $\phi^2_0 = \phi^2_1 := 0$). Then, reading off the two terms of (4.36),
\[
\langle 0|\phi^{1*}_{-n-1}(z)\,\phi^1_{-n-1}(w)|0\rangle = \frac{1}{n+1}\,\tilde{C}^2_n(z, w), \qquad
\langle 0|\phi^{2*}_{-n-3}(z)\,\phi^2_{-n-3}(w)|0\rangle = \frac{-1}{n+3}\,\tilde{C}^2_n(z, w). \tag{4.38}
\]
Therefore, the 1-particle state space of conformal energy $n$ decomposes into a pseudo-orthogonal direct sum of two subspaces isomorphic to the spaces of harmonic homogeneous polynomials of degrees $n - 1$ and $n - 3$, respectively: the first has a positive definite metric while the second has a negative definite one. In particular, the dimension of the full eigenspace of conformal energy $n$ is
\[
d_B(n) = d^{(6)}(n + 1) + d^{(6)}(n - 1) = \frac{n^2(n^2 + 5)}{6}, \tag{4.39}
\]
so that the thermal energy mean value and the vacuum energy are
\[
\langle H + E_0\rangle_q = \frac{1}{6}\,G_6(\tau) + \frac{5}{6}\,G_4(\tau), \qquad
E_0 = -\frac{1}{6}\,\frac{B_6}{12} - \frac{5}{6}\,\frac{B_4}{8} = \frac{19}{6048} . \tag{4.40}
\]
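The arithmetic behind (4.39) and (4.40) can be reproduced exactly. In this sketch, `d6` evaluates the product formula (4.30) at $D = 6$, i.e. $d^{(6)}(n) = n^2(n^2-1)/12$, and the Bernoulli numbers $B_4 = -1/30$, $B_6 = 1/42$ are hard-coded; the names are illustrative.

```python
from fractions import Fraction

B4 = Fraction(-1, 30)  # Bernoulli number B_4
B6 = Fraction(1, 42)   # Bernoulli number B_6


def d6(n):
    """d^(6)(n) = n^2 (n^2 - 1) / 12, the D = 6 case of (4.30)."""
    return Fraction(n * n * (n * n - 1), 12)


def d_subcanonical(n):
    """Dimension (4.39) of the energy-n eigenspace of the d = 1, D = 6 field."""
    return d6(n + 1) + d6(n - 1)


# Vacuum energy as written in (4.40).
E0_SUBCANONICAL = -Fraction(1, 6) * B6 / 12 - Fraction(5, 6) * B4 / 8
```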
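4.3. The Weyl field

Let us introduce the $(2\times 2)$-matrix representation of the quaternionic algebra:
\[
Q_k = -i\sigma_k = -Q_k^+ \quad (k = 1, 2, 3), \qquad Q_4 = I, \qquad
\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{4.41}
\]
\[
Q_\alpha^+ Q_\beta + Q_\beta^+ Q_\alpha = 2\delta_{\alpha\beta} = Q_\alpha Q_\beta^+ + Q_\beta Q_\alpha^+ \qquad \text{for } \alpha, \beta = 1, \ldots, 4 \tag{4.42}
\]
($\sigma_k$ being Pauli matrices). In this section, we will denote the Hermitian matrix conjugation by a superscript “+”. The matrices
\[
i\sigma_{\alpha\beta} = \frac{1}{2}\bigl(Q_\alpha^+ Q_\beta - Q_\beta^+ Q_\alpha\bigr), \qquad
i\tilde{\sigma}_{\alpha\beta} = \frac{1}{2}\bigl(Q_\alpha Q_\beta^+ - Q_\beta Q_\alpha^+\bigr) \tag{4.43}
\]
are the selfdual and anti-selfdual anti-hermitian spin(4) Lie algebra generators. We will denote also
\[
\slashed{z} = \sum_{\alpha=1}^{4} z^\alpha Q_\alpha, \qquad
\slashed{z}^+ = \sum_{\alpha=1}^{4} z^\alpha Q_\alpha^+, \qquad
\slashed{\partial}_z = \sum_{\alpha=1}^{4} Q_\alpha\,\partial_{z^\alpha}, \qquad
\slashed{\partial}_z^+ = \sum_{\alpha=1}^{4} Q_\alpha^+\,\partial_{z^\alpha}, \tag{4.44}
\]
etc. Note that in the definition of $\slashed{z}^+$, we do not conjugate the coordinates $z^\alpha$. Then Eqs. (4.42) are equivalent to
\[
\slashed{z}_1^+\slashed{z}_2 + \slashed{z}_2^+\slashed{z}_1 = \slashed{z}_1\slashed{z}_2^+ + \slashed{z}_2\slashed{z}_1^+ = 2\,z_1\cdot z_2 \qquad \bigl(\slashed{z}^+\slashed{z} = \slashed{z}\slashed{z}^+ = z^2\bigr). \tag{4.45}
\]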
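The relations (4.42) and (4.45) can be verified by explicit $2\times 2$ matrix multiplication. The following sketch uses plain nested lists for matrices; the helper names and the sample complex vectors are illustrative. As in the text, the coordinates $z^\alpha$ are not conjugated when forming $\slashed{z}^+$.

```python
I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


# Q_k = -i sigma_k (k = 1, 2, 3), Q_4 = I, as in (4.41).
Q = [scale(-1j, s) for s in SIGMA] + [I2]
Q_DAG = [dagger(q) for q in Q]


def slash(z, basis):
    """Contract a complex 4-vector with Q (basis=Q) or Q^+ (basis=Q_DAG), no conjugation of z."""
    out = [[0, 0], [0, 0]]
    for z_a, q_a in zip(z, basis):
        out = add(out, scale(z_a, q_a))
    return out


def dot(z, w):
    """Bilinear (not sesquilinear) scalar product z . w."""
    return sum(a * b for a, b in zip(z, w))
```

For example, with `z = (0.3 + 0.1j, -1.2, 0.5j, 2.0)` the product `matmul(slash(z, Q_DAG), slash(z, Q))` is `dot(z, z)` times the identity, which is the parenthetical statement in (4.45).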
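The generalized free Weyl fields of dimension $d = \frac{1}{2}, \frac{3}{2}, \ldots$ are two mutually conjugate complex 2-component fields,
\[
\chi^+(z) = \bigl(\chi_1^*(z), \chi_2^*(z)\bigr) \quad \text{and} \quad \chi(z) = \begin{pmatrix} \chi_1(z) \\ \chi_2(z) \end{pmatrix}, \tag{4.46}
\]
transforming under the elementary induced representations of spin(4) corresponding to the selfdual and anti-selfdual representations (4.43), respectively. In particular, the action of the Weyl reflection $j_W$ (2.7) is
\[
\chi(z) \to \frac{\slashed{z}}{(z^2)^{d+\frac{1}{2}}}\,\chi(z) \;\bigl(\equiv \pi_z(j_W)\chi(z)\bigr), \qquad
\chi^+(z) \to \chi^+(z)\,\frac{\slashed{z}}{(z^2)^{d+\frac{1}{2}}} \;\bigl(\equiv \pi_z^+(j_W)\chi^+(z)\bigr). \tag{4.47}
\]
The conformal invariant 2-point functions, characterizing the fields, have the following matrix representation
\[
\langle 0|\chi(z_1)\,\chi^+(z_2)|0\rangle = \frac{\slashed{z}_{12}^+}{(z_{12}^2)^{d+\frac{1}{2}}}, \tag{4.48}
\]
\[
\langle 0|\chi_\alpha(z_1)\,\chi_\beta(z_2)|0\rangle = \langle 0|\chi_\alpha^+(z_1)\,\chi_\beta^+(z_2)|0\rangle = 0. \tag{4.49}
\]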
In particular, the invariance under the complex Weyl reflection jW is ensured by the equality z/1 + z/2 z/+ z/+ z/12 2 = 12 − 22 . 2 z1 z2 z1 z2 The conjugation law (2.21) reads +
χ (¯ z) =
z/
+
1
(z 2 )d+ 2
in other words, for any Φ, Ψ ∈ D: χ (¯ z )Φ | Ψ =
z/
+
z χ 2 z
1
(z 2 )d+ 2
(4.50) (4.51)
z Φ|χ 2 Ψ . z
(4.52)
Here one can explicitly verify the hermiticity of the 2-point scalar product z )Ω | χ+ (w)Ω = (χ+ (w)Ω | χ+ (¯ z )Ω)+ , χ+ (¯
(4.53)
where Ω = |0 is the vacuum: + ¯/ z w w ¯ z/ + + z )|0 . χ (w)|0 = χ (¯ 1 0|χ 1 0|χ z2 w ¯2 (z 2 )d+ 2 (w¯2 )d+ 2
(4.54)
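The conjugation law for the compact picture generalized free Weyl field
\[
\chi^+(\zeta, u) = e^{2\pi i d\zeta}\,\chi^+(e^{2\pi i\zeta}u), \qquad \chi(\zeta, u) = e^{2\pi i d\zeta}\,\chi(e^{2\pi i\zeta}u) \tag{4.55}
\]
becomes
\[
\chi^+(\zeta, u)^+ = \slashed{u}\,\chi(\zeta, u). \tag{4.56}
\]
The vacuum correlation function is diagonal in “the moving frame” representation defined as follows. For given non-collinear unit real vectors $u_1, u_2 \in S^{D-1}$ ($\subset \mathbb{R}^D$) such that $u_1\cdot u_2 = \cos 2\pi\alpha$, let $v$ and $\bar{v}$ be the unique complex vectors (in $\mathbb{C}^D$) for which
\[
u_1 = e^{\pi i\alpha}\,v + e^{-\pi i\alpha}\,\bar{v}, \qquad u_2 = e^{-\pi i\alpha}\,v + e^{\pi i\alpha}\,\bar{v}. \tag{4.57}
\]
It then follows that $v$ and $\bar{v}$ are mutually conjugate isotropic vectors with scalar product $2v\cdot\bar{v} = 1$. In this basis, we have
\[
\langle 0|\chi(\zeta_1, u_1)\,\chi^+(\zeta_2, u_2)|0\rangle = \frac{1}{2i\,(-4\sin\pi\zeta_+\sin\pi\zeta_-)^{d-\frac{1}{2}}}\left(\frac{\slashed{v}^+}{\sin\pi\zeta_-} + \frac{\bar{\slashed{v}}^+}{\sin\pi\zeta_+}\right), \tag{4.58}
\]
where $\zeta_\pm = \zeta_{12} \pm \alpha$ (as in previous sections). In the frame in which $u_{1,2} = (0, 0, \pm\sin\pi\alpha, \cos\pi\alpha)$, the matrix $\slashed{v}$ and its conjugate assume a simple form:
\[
\slashed{v}^+ = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \bar{\slashed{v}}, \qquad
\bar{\slashed{v}}^+ = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \slashed{v}. \tag{4.59}
\]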
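Thus, in the $d = \frac{1}{2}$ case of a subcanonical Weyl field, the contributions of $\zeta_+$ and $\zeta_-$ are separated. The dimension $d = \frac{3}{2}$ corresponds to the canonical free Weyl field, which will be denoted by $\psi := \chi$ ($\psi^+ := \chi^+$). We find in this case
\[
\begin{aligned}
\langle 0|\psi(\zeta_1, u_1)\,\psi^+(\zeta_2, u_2)|0\rangle = \frac{i}{8\sin 2\pi\alpha}\Biggl\{
&\slashed{v}^+\left(\frac{\cos\pi\zeta_-}{\sin^2\pi\zeta_-} - \frac{\cot 2\pi\alpha}{\sin\pi\zeta_-} + \frac{1}{\sin 2\pi\alpha\,\sin\pi\zeta_+}\right) \\
- &\bar{\slashed{v}}^+\left(\frac{\cos\pi\zeta_+}{\sin^2\pi\zeta_+} + \frac{\cot 2\pi\alpha}{\sin\pi\zeta_+} - \frac{1}{\sin 2\pi\alpha\,\sin\pi\zeta_-}\right)\Biggr\}.
\end{aligned} \tag{4.60}
\]
From the vacuum correlation functions (4.58), (4.60) and from Eq. (4.6) we deduce
\[
\langle\chi(\zeta_1, u_1)\,\chi^+(\zeta_2, u_2)\rangle_q = \frac{1}{2\pi i}\left\{p_1^{11}(\zeta_-, \tau)\,\slashed{v}^+ + p_1^{11}(\zeta_+, \tau)\,\bar{\slashed{v}}^+\right\}, \tag{4.61}
\]
\[
\begin{aligned}
\langle\psi(\zeta_1, u_1)\,\psi^+(\zeta_2, u_2)\rangle_q = \frac{i}{8\pi\sin 2\pi\alpha}\Biggl\{
&\slashed{v}^+\left(p_2^{11}(\zeta_-, \tau) - \cot 2\pi\alpha\; p_1^{11}(\zeta_-, \tau) + \frac{p_1^{11}(\zeta_+, \tau)}{\sin 2\pi\alpha}\right) \\
- &\bar{\slashed{v}}^+\left(p_2^{11}(\zeta_+, \tau) + \cot 2\pi\alpha\; p_1^{11}(\zeta_+, \tau) - \frac{p_1^{11}(\zeta_-, \tau)}{\sin 2\pi\alpha}\right)\Biggr\}.
\end{aligned} \tag{4.62}
\]
Here $p_k^{\kappa,\lambda}(\zeta, \tau) = \sum_m\sum_n (-1)^{m\kappa + n\lambda}\,(\zeta + m\tau + n)^{-k}$ (see Appendix A). Under the assumptions of Remark 4.1, for $N$ identified with the charge operator (so that $[N, \chi^+(z)] = \chi^+(z)$, $[N, \chi(z)] = -\chi(z)$), we find for $d = \frac{1}{2}$
\[
\langle\chi(\zeta_1, u_1)\,\chi^+(\zeta_2, u_2)\rangle_{q,\mu} = \frac{1}{2\pi i}\left\{p_1^{11}(\zeta_-, \tau, \mu)\,\slashed{v}^+ + p_1^{11}(\zeta_+, \tau, \mu)\,\bar{\slashed{v}}^+\right\} \tag{4.63}
\]
using the more general functions $p_1^{\kappa,\lambda}(\zeta, \tau, \mu)$ of Appendix A (Eq. (4.62) is generalized in a similar way). Note that the 1-particle scalar product is
\[
\langle 0|\chi^+(\bar z)^+\,\chi^+(w)|0\rangle = \frac{1 - \slashed{z}\,\slashed{w}^+}{(1 - 2z\cdot w + z^2 w^2)^{d+\frac{1}{2}}} . \tag{4.64}
\]
This implies, similarly to the scalar field case, that we can organize the field mode expansion (in the compact picture) as
\[
\chi(\zeta, u) = \sum_{\substack{n\in\mathbb{Z} \\ |n-\frac{1}{2}|\geq d}} \chi_{n-\frac{1}{2}}(u)\, e^{i\pi(1-2n)\zeta}, \qquad
\chi^+(\zeta, u) = \sum_{\substack{n\in\mathbb{Z} \\ |n-\frac{1}{2}|\geq d}} \chi^+_{n-\frac{1}{2}}(u)\, e^{i\pi(1-2n)\zeta}, \tag{4.65}
\]
where $\chi^{(+)}_{n+d}(u)$ and $\chi^{(+)}_{-n-d}(u)$ for $n = 0, 1, \ldots$ are homogeneous polynomials in $u \in S^3$ of degree $n$ with 2-component operator coefficients. For $n \geq 0$, $\chi^{(+)}_{n+d}(u)$ correspond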
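to annihilation operators while $\chi^{(+)}_{-n-d}(u)$ correspond to the creation modes, and we also have
\[
\bigl(\chi^+_{\frac{1}{2}-k}(u)\bigr)^+ = \slashed{u}\,\chi_{k-\frac{1}{2}}(u), \qquad
\langle 0|\bigl(\chi^+_{-m-d}(u_1)\bigr)^+\,\chi^+_{-n-d}(u_2)|0\rangle = \delta_{nm}\left(C_n^{d+\frac{1}{2}}(u_1\cdot u_2) - C_{n-1}^{d+\frac{1}{2}}(u_1\cdot u_2)\,\slashed{u}_1\slashed{u}_2^+\right) \tag{4.66}
\]
for $k \in \mathbb{Z}$, $n, m = 0, 1, \ldots$ and $u_1, u_2 \in S^3$.

For the thermal energy mean values, we will consider the two cases of $d = \frac{3}{2}$ and $d = \frac{1}{2}$ separately. The $z$-picture canonical spinor field satisfies the Weyl–Dirac equation:
\[
\slashed{\partial}_z\,\psi(z) = 0, \qquad \psi^+(z)\,\slashed{\partial}_z = 0. \tag{4.67}
\]
These equations are also valid for the compact picture modes $\psi_{-n-\frac{3}{2}}(u)$ extended to $u \in \mathbb{R}^4$. The positive charge 1-particle state-space of conformal energy $\langle 0|H|0\rangle + n + \frac{3}{2}$ ($n = 0, 1, \ldots$), spanned by $\psi^+_{-n-\frac{3}{2}}(u)|0\rangle$, carries the irreducible representation $\bigl(\frac{n}{2}, \frac{n+1}{2}\bigr)$ of Spin(4) and therefore has dimension $(n+2)(n+1)$. The dimension of the full 1-particle space, including charge $-1$ states, is twice as big. It also has a positive definite scalar product in view of [24]. Thus, applying the general formula (4.3) and Eq. (A.10), we find (cf. [10]):
\[
\begin{aligned}
\langle H + E_0\rangle_q &= E_0 + \frac{1}{Z(\tau)}\, q\frac{\partial}{\partial q} Z(\tau) = E_0 + \sum_{n=0}^{\infty} \frac{2\bigl(n + \frac{3}{2}\bigr)(n+1)(n+2)\, q^{n+\frac{3}{2}}}{1 + q^{n+\frac{3}{2}}} \\
&= \frac{1}{4}\left\{G_4\!\left(\frac{\tau+1}{2}\right) - 8\,G_4(\tau)\right\} - \frac{1}{4}\left\{G_2\!\left(\frac{\tau+1}{2}\right) - 2\,G_2(\tau)\right\},
\end{aligned} \tag{4.68}
\]
\[
E_0 = -\frac{1}{4}\,\frac{B_4}{8}\,(1 - 2^3) + \frac{1}{4}\,\frac{B_2}{4}\,(1 - 2) = -\frac{17}{960} . \tag{4.69}
\]
Here we have used the equalities:
\[
\sum_{n=1}^{\infty} \frac{(2n+1)^{2k-1}\, q^{2n+1}}{1 + q^{2n+1}} = G_{2k}\!\left(\tau + \frac{1}{2}\right) - 2^{2k-1}\,G_{2k}(2\tau) + \frac{B_{2k}}{4k}\,(1 - 2^{2k-1}), \tag{4.70}
\]
\[
2\left(n + \frac{1}{2}\right) n(n+1) = \frac{1}{4}\bigl((2n+1)^3 - (2n+1)\bigr). \tag{4.71}
\]
The subcanonical Weyl field and its conjugate satisfy third-order equations which assume the following form on the modes
\[
\partial_u^2\,\slashed{\partial}_u\,\chi_{n+\frac{1}{2}}(u) = 0, \qquad \chi^+_{n+\frac{1}{2}}(u)\,\slashed{\partial}_u\,\partial_u^2 = 0 \qquad (u \in \mathbb{R}^3). \tag{4.72}
\]
The resulting Spin(4)-representation in the positive charge 1-particle space of conformal energy $n + \frac{1}{2}$ ($n = 0, 1, \ldots$) is then isomorphic to a (pseudo-orthogonal) direct sum of three irreducible representations (for $n \geq 3$),
\[
\left(\frac{n}{2}, \frac{n-1}{2}\right) \oplus \left(\frac{n-2}{2}, \frac{n-1}{2}\right) \oplus \left(\frac{n}{2}, \frac{n+1}{2}\right), \tag{4.73}
\]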
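each of them should possess a definite restriction of the scalar product. In particular, the full dimension $d_f(n)$ of the space (4.73) is
\[
\frac{1}{2}\,d_f(n) = 3n^2 + 3n + 2, \qquad \left(n + \frac{1}{2}\right) d_f(n) = \frac{3}{4}\,(2n+1)^3 + \frac{5}{4}\,(2n+1), \tag{4.74}
\]
for all $n = 0, 1, \ldots$, so that the thermal energy mean value and vacuum energy are given by
\[
\langle H + E_0\rangle_q = \frac{3}{4}\left\{G_4\!\left(\frac{\tau+1}{2}\right) - 8\,G_4(\tau)\right\} + \frac{5}{4}\left\{G_2\!\left(\frac{\tau+1}{2}\right) - 2\,G_2(\tau)\right\}, \tag{4.75}
\]
\[
E_0 = -\frac{3}{4}\,\frac{B_4}{8}\,(1 - 2^3) - \frac{5}{4}\,\frac{B_2}{4}\,(1 - 2) = \frac{29}{960} . \tag{4.76}
\]
Note that although $G_2$ is not a modular form, the differences entering the right-hand sides of (4.65) and (4.75) are multiples of $F(\tau)$ (A.15) and are thus modular forms of weight two and level $\Gamma_\theta$.

4.4. The Maxwell free field

The electromagnetic (or Maxwell) free field is a 6-component field $F_{\alpha\beta}(z) = -F_{\beta\alpha}(z)$ ($1 \leq \alpha < \beta \leq 4$). It is convenient to write it as a 2-form:
\[
F(z) = \frac{1}{2}\,F_{\alpha\beta}(z)\, dz^\alpha \wedge dz^\beta, \tag{4.77}
\]
which makes clear its transformation properties and conjugation law:
\[
U(g)\bigl(F_{\alpha\beta}(z)\, dz^\alpha \wedge dz^\beta\bigr)U(g^{-1}) = F_{\alpha\beta}(g(z))\, dg(z)^\alpha \wedge dg(z)^\beta, \tag{4.78}
\]
\[
\bigl(F_{\alpha\beta}(z)\, dz^\alpha \wedge dz^\beta\bigr)^* = F_{\alpha\beta}(z^*)\, d\bar{z}^\alpha \wedge d\bar{z}^\beta, \qquad z^* = \frac{\bar z}{\bar z^2}, \qquad (dz^\alpha)^* = d(z^*)^\alpha, \tag{4.79}
\]
where $g = e^{t\Omega}$ for a real conformal generator $\Omega$ as in Eq. (2.15). The 2-point function is
\[
\langle 0|F_{\alpha_1\beta_1}(z_1)\,F_{\alpha_2\beta_2}(z_2)|0\rangle := \frac{r_{\alpha_1\alpha_2}(z_{12})\,r_{\beta_1\beta_2}(z_{12}) - r_{\alpha_1\beta_2}(z_{12})\,r_{\beta_1\alpha_2}(z_{12})}{(z_{12}^2)^2}, \qquad r_{\alpha\beta}(z) := \delta_{\alpha\beta} - 2\,\frac{z_\alpha z_\beta}{z^2} . \tag{4.80}
\]
It is verified to satisfy the Maxwell equations
\[
dF(z) = 0, \qquad d\,{*}(F)(z) = 0, \tag{4.81}
\]
$*$ being the Hodge star: ${*}(F)_{\alpha\beta}(z) := \varepsilon_{\alpha\beta\rho\sigma}\,F^{\rho\sigma}(z)$.

To compute the (compact picture) finite temperature correlation functions $\langle F_{\alpha_1\beta_1}(\zeta_1, u_1)\,F_{\alpha_2\beta_2}(\zeta_2, u_2)\rangle_q$, we again use the diagonal frame in which $2v = (0, 0, -i, 1)$, $u_{1,2} = (0, 0, \pm\sin\pi\alpha, \cos\pi\alpha)$; then there exist linear combinations of the field components
\[
\sqrt{2}\,F_1^\pm = F_{23} \pm F_{14}, \qquad \sqrt{2}\,F_2^\pm = F_{31} \pm F_{24}, \qquad
\sqrt{2}\,F_3^\pm = F_{12} \pm F_{34}, \qquad \sqrt{2}\,F_\pm^\varepsilon = F_1^\varepsilon \pm iF_2^\varepsilon \tag{4.82}
\]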
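($\varepsilon = \pm$) such that
\[
\begin{aligned}
\langle 0|F_+^+(\zeta_1, u_1)\,F_-^-(\zeta_2, u_2)|0\rangle &=: W_0(\zeta_{12}, \alpha) \\
&= \frac{1}{4\sin^3 2\pi\alpha}\,(\cot\pi\zeta_- - \cot\pi\zeta_+) - \frac{1}{4\sin 2\pi\alpha}\left(\frac{\cos\pi\zeta_+}{\sin^3\pi\zeta_+} - \frac{\cot 2\pi\alpha}{\sin^2\pi\zeta_+}\right); \\
\langle 0|F_-^+(\zeta_1, u_1)\,F_+^-(\zeta_2, u_2)|0\rangle &= W_0(\zeta_{12}, -\alpha); \\
\langle 0|F_3^+(\zeta_1, u_1)\,F_3^-(\zeta_2, u_2)|0\rangle &= \frac{1}{4\sin^2 2\pi\alpha}\left(\frac{1}{\sin^2\pi\zeta_+} + \frac{1}{\sin^2\pi\zeta_-} + 2\cot 2\pi\alpha\,(\cot\pi\zeta_+ - \cot\pi\zeta_-)\right)
\end{aligned} \tag{4.83}
\]
($\zeta_\pm = \zeta_{12} \pm \alpha$). The corresponding finite temperature correlation functions read:
\[
\begin{aligned}
\langle F_+^+(\zeta_1, u_1)\,F_-^-(\zeta_2, u_2)\rangle_q &=: W_q(\zeta_{12}, \alpha) \\
&= \frac{1}{4\sin^3 2\pi\alpha}\,\bigl(p_1(\zeta_-, \tau) - p_1(\zeta_+, \tau)\bigr) - \frac{1}{4\sin 2\pi\alpha}\left(\frac{1}{2\pi}\,p_3(\zeta_+, \tau) - \cot 2\pi\alpha\; p_2(\zeta_+, \tau)\right); \\
\langle F_-^+(\zeta_1, u_1)\,F_+^-(\zeta_2, u_2)\rangle_q &= W_q(\zeta_{12}, -\alpha); \\
\langle F_3^+(\zeta_1, u_1)\,F_3^-(\zeta_2, u_2)\rangle_q &= \frac{1}{4\sin^2 2\pi\alpha}\,\Bigl(p_2(\zeta_+, \tau) + p_2(\zeta_-, \tau) + 2\cot 2\pi\alpha\,\bigl(p_1(\zeta_+, \tau) - p_1(\zeta_-, \tau)\bigr)\Bigr).
\end{aligned} \tag{4.84}
\]
In order to find the thermal energy mean value for the Maxwell field, we have to compute the dimension $d(n)$ ($\equiv d_B(n)$) of the 1-particle state space of conformal energy $n$, spanned by $F_{\alpha\beta;-n}(z)|0\rangle$, where the mode $F_{\alpha\beta;-n}(z)$ is a homogeneous (harmonic) polynomial of degree $n - 2$ satisfying the Maxwell equations. To this end, we display the SO(4) representation content of the modes satisfying the Maxwell equations. Decomposing the anti-symmetric tensor $F_{\alpha\beta}$ into selfdual and anti-selfdual parts, $(1, 0) \oplus (0, 1)$, we see that the full space of homogeneous skew-symmetric-tensor valued polynomials in $z$ of degree $n - 2$ generically splits into a direct sum of three conjugate pairs of $SU(2)\times SU(2)$ representations; for instance,
\[
(1, 0) \otimes \left(\frac{n-2}{2}, \frac{n-2}{2}\right) = \left(\frac{n}{2}, \frac{n-2}{2}\right) \oplus \left(\frac{n-2}{2}, \frac{n-2}{2}\right) \oplus \left(\frac{n-4}{2}, \frac{n-2}{2}\right) \qquad (\text{for } n > 3).
\]
The Maxwell equations imply that only two of the resulting six representations, those with maximal weights, appear in the energy $n$ 1-particle space: $\bigl(\frac{n}{2}, \frac{n-2}{2}\bigr) \oplus \bigl(\frac{n-2}{2}, \frac{n}{2}\bigr)$. Thus,
\[
d(n) = 2(n^2 - 1), \tag{4.85}
\]
and using (4.32), we then find
\[
\langle H + E_0\rangle_q = 2\,G_4(\tau) - 2\,G_2(\tau), \qquad E_0 = -2\,\frac{B_4}{8} + 2\,\frac{B_2}{4} = \frac{11}{120} . \tag{4.86}
\]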
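The counting that leads to (4.85) and (4.86) can be replayed in a few lines. This sketch assumes the usual dimension $(2j_1+1)(2j_2+1)$ of the $SU(2)\times SU(2)$ irreducible representation $(j_1, j_2)$ and hard-coded Bernoulli numbers $B_2 = 1/6$, $B_4 = -1/30$; spins are passed as doubled integers to stay in integer arithmetic, and the names are illustrative.

```python
from fractions import Fraction

B2 = Fraction(1, 6)    # Bernoulli number B_2
B4 = Fraction(-1, 30)  # Bernoulli number B_4


def irrep_dim(two_j1, two_j2):
    """Dimension of the SU(2) x SU(2) irrep (j1, j2), given 2*j1 and 2*j2."""
    return (two_j1 + 1) * (two_j2 + 1)


def maxwell_dim(n):
    """Dimension of (n/2, (n-2)/2) + ((n-2)/2, n/2), the energy-n Maxwell modes."""
    return irrep_dim(n, n - 2) + irrep_dim(n - 2, n)


# Vacuum energy as written in (4.86).
E0_MAXWELL = -2 * B4 / 8 + 2 * B2 / 4
```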
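Remark 4.5. Let us consider a generalized free vector field $l_\alpha(z)$ independent of $F_{\alpha\beta}(z)$ (i.e. commuting with it) and having 2-point function
\[
\langle 0|l_\alpha(z_1)\,l_\beta(z_2)|0\rangle = C\,\frac{r_{\alpha\beta}(z_{12})}{z_{12}^2}, \qquad \partial_{z^\alpha} l_\beta(z) = \partial_{z^\beta} l_\alpha(z) \tag{4.87}
\]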
(the last equality means that $l_\alpha(z)$ is a “longitudinal” field, but we note that there is no GCI scalar field $s(z)$ such that $l_\alpha(z) = \partial_{z^\alpha} s(z)$). The field $l_\alpha(z)$ satisfies the third-order equation
\[
\partial_z^2\,\partial_z\cdot l(z) = 0, \tag{4.88}
\]
and it then follows that the conformal energy $n$ state space has dimension
\[
d_l(n) = \binom{n+3}{3} - \binom{n-1}{3} = (n+1)^2 + (n-1)^2 = 2(n^2 + 1). \tag{4.89}
\]
Thus the thermal energy mean value in the state space of $F_{\alpha\beta}(z)$ and $l_\alpha(z)$ will be
\[
\langle H_F + H_l + E_0\rangle_q = 4\,G_4(\tau), \qquad E_0 = \frac{1}{60}, \tag{4.90}
\]
($H_F$ and $H_l$ being the conformal Hamiltonians of the corresponding subsystems). We can interpret the state space of $l_\alpha(z)$ as the space of pure gauge transformations, and then the full state space of $F_{\alpha\beta}(z)$ and $l_\alpha(z)$ has the meaning of the space of all gauge potentials.

5. The Thermodynamic Limit

5.1. Compactified Minkowski space as a “finite box” approximation

We shall now substitute $z$ in Eqs. (1.3) and (2.3) by $z/R$, thus treating $S^{D-1}$ and $S^1$ in the definition of $\overline{M}$ as a sphere and a circle of radius $R$ ($> 0$). Performing further the Minkowski space dilation $(2R)^{\Omega_{-1D}}: x^\mu \to \frac{x^\mu}{2R}$, $\mu = 0, \ldots, D-1$ (see Eq. (2.2)), on the (real) variable $(Z =)\,x$ in (2.3), we find $z(x; R) = R\,z\bigl(\frac{x}{2R}\bigr)$, or
\[
\mathbf{z}(x; R) = \frac{\mathbf{x}}{2\omega\bigl(\frac{x}{2R}\bigr)}, \qquad
z_D(x; R) - R = \frac{i x^0 - \frac{x^2}{2R}}{2\omega\bigl(\frac{x}{2R}\bigr)}, \qquad
2\omega\!\left(\frac{x}{2R}\right) = 1 + \frac{x^2}{4R^2} - i\,\frac{x^0}{R} . \tag{5.1}
\]
The stability subgroup of $z(x; R) = 0$ ($\in T_+$) in $\mathcal{C}$ is conjugate to the maximal compact subgroup $K \subset \mathcal{C}$:
\[
K(2R) = (2R)^{\Omega_{-1D}}\,K\,(2R)^{-\Omega_{-1D}}, \qquad K \equiv K(1) \cong U(1)\times \mathrm{Spin}(D)/\mathbb{Z}_2 . \tag{5.2}
\]
In particular, the hermitian $U(1)$-generator $H(2R)$, which acts in the $z$-coordinates (5.1) as the Euler vector field $z\cdot\frac{\partial}{\partial z}$, is conjugate to $H \equiv H(1)$,
\[
H(2R) = (2R)^{\Omega_{-1D}}\,H\,(2R)^{-\Omega_{-1D}}, \qquad H \equiv H(1). \tag{5.3}
\]
For large $R$ and finite $x$ the variables $(\mathbf{z}, z_D - R)$ approach the (Wick-rotated) Minkowski space coordinates $(\mathbf{x}, i x^0)$. In particular, for $x^0 = 0$ ($= \zeta$), the real $(D-1)$-sphere $z^2 = R^2$ can be viewed as an $SO(D)$-invariant “box” approaching for $R \to \infty$ the flat space $\mathbb{R}^{D-1}$. Thus the conformal compactification of Minkowski space also plays the role of a convenient tool for studying the thermodynamic
limit of thermal expectation values. This interpretation is justified in view of the following:

Proposition 5.1. The asymptotic behavior of $z(x; R) - R\,e_D$ ($e_D = (\mathbf{0}, 1)$) for large $R$ is:
\[
\mathbf{z}(x; R) = \mathbf{x} + O\!\left(\frac{\|x\|^2}{R}\right), \qquad z_D(x; R) - R = i x^0 + O\!\left(\frac{\|x\|^2}{R}\right), \tag{5.4}
\]
\[
H_R := \frac{H(2R)}{R} = P_0 + \frac{1}{4R^2}\,K_0 = P_0 + O\!\left(\frac{1}{R}\right) \in i\mathfrak{c}, \tag{5.5}
\]
where $\|x\| := \sqrt{(x^0)^2 + |\mathbf{x}|^2}$ for $x = (x^0, \mathbf{x}) \in M$ and $iP_0$ is the real conformal algebra generator of the Minkowski time ($x^0$) translation (see Sec. 2.1). The operator $H_R$ is the physical conformal Hamiltonian (of dimension inverse length).

Proof. Equation (5.4) is obtained by a straightforward computation. It follows from Eqs. (2.2) and (2.9) that
\[
H = \frac{1}{2}\,(P_0 + K_0). \tag{5.6}
\]
To derive Eq. (5.5), one should then use (5.6) and the equations $\lambda^{\Omega_{-1D}} P_0\,\lambda^{-\Omega_{-1D}} = \lambda P_0$, $\lambda^{\Omega_{-1D}} K_0\,\lambda^{-\Omega_{-1D}} = \lambda^{-1} K_0$; hence, $H(2R) = (2R)^{\Omega_{-1D}} H (2R)^{-\Omega_{-1D}} = R P_0 + \frac{1}{4R} K_0$.

Remark 5.1. The observation that the universal cover of $\overline{M}$, the Einstein universe $\widetilde{M} = \mathbb{R}\times S^{D-1}$ (for $D = 4$), which admits a globally causal structure, is locally indistinguishable from $M$ for large $R$ was emphasized over 30 years ago by Irving Segal (see [32] for a concise exposé and further references). For a fixed choice, $\Omega_{-1D}$, of the dilation generator in (5.2), he identifies the Minkowski energy $P_0$ with the scale covariant component of $H_R$. With this choice $\overline{M}$ is osculating $M$ (and hence $\widetilde{M}$) at the north pole $(\mathbf{z}, z_D) = (\mathbf{0}, R)$ (respectively, $\zeta = 0$, $u = e_D$), identified with the origin $x = 0$ in $M$. (The vector fields associated with $H_R$ and $P_0$ coincide at this point.)

Using the Lie algebra limit $\lim_{R\to\infty} H_R = P_0$ implied by (5.5), one can approximate the Minkowski energy operator $P_0$ for large $R$ by the physical conformal Hamiltonian $H_R$. As we shall see below, the fact that in all considered free field models in dimension $D = 4$ the conformal mean energy is a linear combination of modular forms $G_{2k}(\tau)$ with highest weight $2k = 4$ has a remarkable corollary: the density $\varepsilon$ of the physical mean energy has a limit reproducing the Stefan–Boltzmann law
\[
\varepsilon(\beta) := \lim_{R\to\infty} \frac{\langle H_R\rangle_{q_\beta}}{V_R} = \frac{C}{\beta^4} \qquad \text{for } q_\beta := e^{-\beta}, \tag{5.7}
\]
where $C$ is some constant, $\beta = \frac{1}{kT}$ is the inverse of the temperature $T$ multiplied by the Boltzmann constant $k$, and $V_R := 2\pi^2 R^3$ is the volume of the 3-sphere of radius $R$
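at a fixed time (say, $x^0 = 0 = \zeta$). We will calculate this limit for two cases: the model of a free scalar field in $D = 4$ (see Sec. 4.2.1), which we will further denote by $\varphi$, and the Maxwell free field model introduced in Sec. 4.4.

Proposition 5.2. For the free scalar field $\varphi$ in dimension $D = 4$, we have the following behavior of the mean energy density for $R/\beta \gg 1$:
\[
\varepsilon_R^{(\varphi)}(\beta) := \frac{1}{V_R}\,\frac{\mathrm{tr}_{\mathcal{D}}\bigl(H_R\, e^{-\beta H_R}\bigr)}{\mathrm{tr}_{\mathcal{D}}\bigl(e^{-\beta H_R}\bigr)} = \left(\frac{\pi^2}{30} - \frac{1}{480\pi^2}\,\frac{\beta^4}{R^4} + O\bigl(e^{-4\pi^2\frac{R}{\beta}}\bigr)\right)\frac{1}{\beta^4} . \tag{5.8}
\]
The corresponding result for the Maxwell free field $F_{\mu\nu}$ is
\[
\varepsilon_R^{(F)}(\beta) = \left(\frac{\pi^2}{15} - \frac{1}{6}\,\frac{\beta^2}{R^2} + \frac{1}{2\pi^2}\,\frac{\beta^3}{R^3} - \frac{11}{240\pi^2}\,\frac{\beta^4}{R^4} + O\bigl(e^{-4\pi^2\frac{R}{\beta}}\bigr)\right)\frac{1}{\beta^4} . \tag{5.9}
\]
Proof. The hermitian operators $H$ and $H(2R)$ are unitarily equivalent due to Eq. (5.3). This leads to the fact that $\mathrm{tr}_{\mathcal{D}}\, q^{H(2R)}$ and $\mathrm{tr}_{\mathcal{D}}\, H(2R)\, q^{H(2R)}$ do not depend on $R$. Then Eqs. (4.30) and (4.86) imply that in the two models under consideration, we have
\[
\varepsilon_R^{(\varphi)}(\beta) = \frac{G_4\bigl(\frac{i\beta}{2\pi R}\bigr) - \frac{1}{240}}{R\,V_R}, \qquad
\varepsilon_R^{(F)}(\beta) = \frac{2\,G_4\bigl(\frac{i\beta}{2\pi R}\bigr) - 2\,G_2\bigl(\frac{i\beta}{2\pi R}\bigr) - \frac{11}{120}}{R\,V_R} . \tag{5.10}
\]
Using further the relations
\[
G_2(\tau) = \frac{1}{\tau^2}\,G_2\!\left(-\frac{1}{\tau}\right) - \frac{i}{4\pi\tau}, \qquad
G_4(\tau) = \frac{1}{\tau^4}\,G_4\!\left(-\frac{1}{\tau}\right) \tag{5.11}
\]
(which are special cases of (A.13) and (A.14)), we find
\[
\varepsilon_R^{(\varphi)}(\beta) = \frac{1}{\beta^4}\left\{8\pi^2\,G_4\!\left(\frac{2\pi i R}{\beta}\right) - \frac{\beta^4}{480\pi^2 R^4}\right\}, \tag{5.12}
\]
\[
\varepsilon_R^{(F)}(\beta) = \frac{1}{\beta^4}\left\{16\pi^2\,G_4\!\left(\frac{2\pi i R}{\beta}\right) + \frac{4\beta^2}{R^2}\,G_2\!\left(\frac{2\pi i R}{\beta}\right) + \frac{\beta^3}{2\pi^2 R^3} - \frac{11\beta^4}{240\pi^2 R^4}\right\}. \tag{5.13}
\]
(The coefficient of $\beta^3/R^3$ is the one obtained by inserting (5.11) into (5.10) with $\tau = \frac{i\beta}{2\pi R}$ and $R\,V_R = 2\pi^2 R^4$.)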
Finally, to obtain Eqs. (5.8) and (5.9), one should apply the expansion (A.16), implying that
\[
G_2\!\left(\frac{2\pi i R}{\beta}\right) = -\frac{1}{24} + O\bigl(e^{-4\pi^2\frac{R}{\beta}}\bigr), \qquad
G_4\!\left(\frac{2\pi i R}{\beta}\right) = \frac{1}{240} + O\bigl(e^{-4\pi^2\frac{R}{\beta}}\bigr).
\]
Remark 5.2. In order to make a comparison with the familiar expression for the black body radiation, it is instructive to restore the dimensional constants $h$ and $c$, setting $H_R = \frac{hc}{R}\,H(2R)$ (instead of (5.4)). The counterpart of (5.10) and (A.16) then reads
\[
\langle H_R\rangle_q = \frac{hc}{R}\left\{G_4\!\left(\frac{i h c\beta}{2\pi R}\right) - E_0\right\} = \frac{hc}{R}\sum_{n=1}^{\infty} \frac{n^3\, e^{-n\frac{hc\beta}{R}}}{1 - e^{-n\frac{hc\beta}{R}}} . \tag{5.14}
\]
Each term in the infinite sum in the right-hand side is a constant multiple of Planck's black body radiation formula for the frequency
\[
\nu = n\,\frac{c}{R} . \tag{5.15}
\]
Thus, for finite $R$, there is a minimal frequency, $\frac{c}{R}$. Using the expansion in (5.14), one can also find an alternative integral derivation of the limit mean energy density $\varepsilon_R^{(\varphi)}(\beta)$ (5.10):
\[
\varepsilon_R^{(\varphi)}(\beta) = \frac{1}{2\pi^2 h^3 c^3 \beta^4}\sum_{n=1}^{\infty} \frac{hc\beta}{R}\,\frac{\bigl(n\frac{hc\beta}{R}\bigr)^3 e^{-n\frac{hc\beta}{R}}}{1 - e^{-n\frac{hc\beta}{R}}} \;\xrightarrow[R\to\infty]{}\; \frac{\pi^2}{30\,h^3 c^3 \beta^4}, \tag{5.16}
\]
since the sum in the right-hand side goes to the integral $\int_0^\infty \frac{t^3 e^{-t}}{1 - e^{-t}}\,dt = \frac{\pi^4}{15}$.
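The Riemann-sum argument in (5.16) and the finite-$R$ expansion (5.8) can both be checked numerically. The sketch below works in units $h = c = 1$ (our choice) and evaluates $S(x) = \sum_{n\geq 1} n^3/(e^{nx} - 1)$ with $x = \beta/R$; the function names and the truncation rule are illustrative.

```python
import math


def planck_sum(x, nmax=None):
    """S(x) = sum_{n >= 1} n^3 / (exp(n x) - 1), truncated where terms are negligible."""
    if nmax is None:
        nmax = int(60 / x) + 1
    return sum(n**3 / math.expm1(n * x) for n in range(1, nmax + 1))


def energy_density(beta, R):
    """Scalar-field energy density <H_R>/V_R with V_R = 2 pi^2 R^3 and h = c = 1."""
    return planck_sum(beta / R) / (2 * math.pi**2 * R**4)


def energy_density_asymptotic(beta, R):
    """Right-hand side of (5.8) without the exponentially small remainder."""
    return (math.pi**2 / 30 - beta**4 / (480 * math.pi**2 * R**4)) / beta**4


def riemann_sum(x):
    """x^4 * S(x), the Riemann sum in (5.16) for step x; tends to pi^4/15 as x -> 0."""
    return x**4 * planck_sum(x)
```

For instance, `energy_density(1.0, 5.0)` and `energy_density_asymptotic(1.0, 5.0)` should agree far beyond the size of the $\beta^4/R^4$ correction, reflecting the exponentially small remainder in (5.8).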
Remark 5.3. We observe that the constant $C$ in (5.7) in both considered models is equal to $\frac{c_1}{30}\pi^2$, where $c_1$ is the coefficient to the $G_4$-modular form in $\langle H\rangle_q$ (see Eq. (4.29)). If we use in the definition (5.5) of $H_R$ the Hamiltonian $H(2R) + E_0'$ instead of $H(2R)$, $\tilde{H}_R := \frac{H(2R) + E_0'}{R}$, then this will only reflect on the (non-leading) terms $c_4\frac{\beta^4}{R^4}$ in (5.8) and (5.9), replacing them by $\frac{E_0' - E_0}{2\pi^2}\,\frac{\beta^4}{R^4}$, where $E_0$ is the “vacuum energy” for the corresponding models (i.e. $E_0$ is $\frac{1}{240}$ and $\frac{11}{120}$ for the fields $\varphi$ and $F_{\mu\nu}$, respectively).
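5.2. Infinite volume limit of the thermal correlation functions

We shall study the $R \to \infty$ limit on the example of a free scalar field, $\varphi$, in four dimensions. Denote by $\varphi^M(x)$ the (canonically normalized) $D = 4$ free massless scalar field with 2-point function
\[
\langle 0|\varphi^M(x_1)\,\varphi^M(x_2)|0\rangle = (2\pi)^{-2}\,\bigl(x_{12}^2 + i0\,x_{12}^0\bigr)^{-1} \tag{5.17}
\]
($x_{12} = x_1 - x_2$, $x_{12}^2 = \mathbf{x}_{12}^2 - (x_{12}^0)^2$). We define, in accord with Proposition 5.1, a finite volume approximation of its thermal correlation function by
\[
\langle\varphi^M(x_1)\,\varphi^M(x_2)\rangle_{\beta,R} := \frac{\mathrm{tr}_{\mathcal{D}}\bigl(\varphi^M(x_1)\,\varphi^M(x_2)\, e^{-\beta H_R}\bigr)}{\mathrm{tr}_{\mathcal{D}}\bigl(e^{-\beta H_R}\bigr)} \tag{5.18}
\]
and will be interested in the thermodynamic limit,
\[
\langle\varphi^M(x_1)\,\varphi^M(x_2)\rangle_{\beta,\infty} := \lim_{R\to\infty} \langle\varphi^M(x_1)\,\varphi^M(x_2)\rangle_{\beta,R} . \tag{5.19}
\]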
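Proposition 5.3. The limit (5.19) (viewed as a meromorphic function) is given by
\[
\langle\varphi^M(x_1)\,\varphi^M(x_2)\rangle_{\beta,\infty} = \frac{\sinh 2\pi\frac{|\mathbf{x}_{12}|}{\beta}}{4\pi\beta\,|\mathbf{x}_{12}|}\left(\cosh 2\pi\frac{|\mathbf{x}_{12}|}{\beta} - \cosh 2\pi\frac{x_{12}^0}{\beta}\right)^{-1}, \tag{5.20}
\]
$|\mathbf{x}_{12}| := \sqrt{\mathbf{x}_{12}^2} \equiv \sqrt{(x_{12}^1)^2 + (x_{12}^2)^2 + (x_{12}^3)^2}$. (The prefactor $\frac{1}{4\pi\beta|\mathbf{x}_{12}|}$ is fixed by requiring that (5.20) reduce to the vacuum function (5.17) for $\beta \to \infty$.)

We shall prove this statement by relating $\varphi^M(x)$ to the compact picture field $\varphi(\zeta, u)$ ($\equiv \phi^{(1)}(\zeta, u)$) whose thermal 2-point function was computed in Sec. 4.2. First, we use Eq. (2.23) to express $\varphi^M(x)$ in terms of the $z$-picture field (corresponding to the $R$-dependent chart (5.1)):
\[
2\pi\,\varphi^M(x) = \frac{1}{2\omega\bigl(\frac{x}{2R}\bigr)}\,\varphi_R\bigl(z(x; R)\bigr) \tag{5.21}
\]
(since $dz^2 = \omega\bigl(\frac{x}{2R}\bigr)^{-2}\,\frac{dx^2}{4}$, cp. (2.4)). The factor $2\pi$ in front of $\varphi^M$ accounts for the different normalization conventions for the $x$- and $z$-picture fields (we have $\langle 0|\varphi(z_1)\,\varphi(z_2)|0\rangle = (z_{12}^2)^{-1}$ instead of (5.17)).

As a second step we express $\varphi_R(z)$, and thus $\varphi^M(x)$, in terms of the compact picture field $\varphi_R(\zeta, u)$:
\[
\varphi_R(\zeta, u) := R\,e^{2\pi i\zeta}\,\varphi(R\,e^{2\pi i\zeta}u), \qquad
2\pi\,\varphi^M(x) = \frac{1}{2R\,\bigl|\omega\bigl(\frac{x}{2R}\bigr)\bigr|}\,\varphi_R\!\left(\zeta\!\left(\frac{x}{2R}\right), u\!\left(\frac{x}{2R}\right)\right). \tag{5.22}
\]
Here $\zeta$ and $u$ are determined as functions of $\frac{x}{2R}$ from $e^{2\pi i\zeta}u = \frac{z(x; R)}{R} = z\bigl(\frac{x}{2R}\bigr)$ ($z(x)$ is the transformation (2.3)); in deriving the second equation in (5.22), we have used the relation $e^{4\pi i\zeta} = \frac{z(x; R)^2}{R^2} = \bar{\omega}\bigl(\frac{x}{2R}\bigr)\,\omega\bigl(\frac{x}{2R}\bigr)^{-1}$.

Next we observe that the fields $\varphi_R(\zeta, u)$ are mutually conjugate (for different $R$) just as $H(2R)$ in Eq. (5.3). (To see this, one can use intermediate “dimensionless” coordinates $\tilde{z}(x; R) = \frac{z(x; R)}{R} = z\bigl(\frac{x}{2R}\bigr)$, which differ from (2.3) just by the dilation $(2R)^{\Omega_{-1D}}$.) It follows that the vacuum and thermal 2-point functions with respect to the Hamiltonian $H(2R)$ do not depend on $R$ and coincide with (4.20) (for $d = 1$) and (4.22). Thus,
\[
4\pi^2\,\langle\varphi^M(x_1)\,\varphi^M(x_2)\rangle_{\beta,R} = \frac{p_1(\zeta_{12} + \alpha, \tau_R) - p_1(\zeta_{12} - \alpha, \tau_R)}{16\pi R^2\,|\omega_1\omega_2|\,\sin 2\pi\alpha} \tag{5.23}
\]
for $\omega_k = \omega\bigl(\frac{x_k}{2R}\bigr)$, $\zeta_{12} = \zeta\bigl(\frac{x_1}{2R}\bigr) - \zeta\bigl(\frac{x_2}{2R}\bigr)$, $\cos 2\pi\alpha = u\bigl(\frac{x_1}{2R}\bigr)\cdot u\bigl(\frac{x_2}{2R}\bigr)$, $\tau_R = \frac{i\beta}{2\pi R}$. In order to perform the $R \to \infty$ limit, we derive the large $R$ behavior of $|\omega_k|$, $\zeta_{12}$ and $\alpha$:
\[
2\pi\zeta_{12} = \frac{x_{12}^0}{R}\left(1 + O\!\left(\frac{\|x_1\|^2 + \|x_2\|^2}{R^2}\right)\right), \qquad
2\pi\alpha = \frac{|\mathbf{x}_{12}|}{R}\left(1 + O\!\left(\frac{\|x_1\|^2 + \|x_2\|^2}{R^2}\right)\right), \qquad
4|\omega_k|^2 = 1 + O\!\left(\frac{\|x_k\|^2}{R^2}\right), \tag{5.24}
\]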
July 26, 2005 15:21 WSPC/148-RMP
656
J070-00239
N. M. Nikolov & I. T. Todorov
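($\|x\| := \sqrt{(x^0)^2 + |\mathbf{x}|^2}$) following from

$$\cos 2\pi\zeta_k = \frac{1 + \left(\frac{x_k}{2R}\right)^2}{2|\omega_k|},\quad \sin 2\pi\zeta_k = \frac{x^0_k}{2R|\omega_k|},\quad \mathbf{u} = \frac{\mathbf{x}_k}{2R|\omega_k|},\quad u_4 = \frac{1 - \left(\frac{x_k}{2R}\right)^2}{2|\omega_k|},$$

$$4\sin^2\pi\alpha = (u_1 - u_2)^2 = \frac{|\mathbf{x}_{12}|^2}{R^2}\left(1 + O\!\left(\frac{\|x_1\|^2+\|x_2\|^2}{R^2}\right)\right).$$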
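To evaluate the small $\tau_R$ (large R) limit of the difference of $p_1$-functions in (5.23), we use (A.11), (A.17) and (A.14) to deduce

$$p_1(\zeta,\tau) = \frac{1}{\tau}\,p_1\!\left(\frac{\zeta}{\tau}, \frac{-1}{\tau}\right) - 2\pi i\,\frac{\zeta}{\tau}. \tag{5.25}$$

Equation (5.25) implies, on the other hand, that

$$p_1\!\left(\frac{\zeta_{12}\pm\alpha}{\tau_R}, \frac{-1}{\tau_R}\right) \underset{R\to\infty}{\approx} p_1\!\left(\frac{x^0_{12}\pm|\mathbf{x}_{12}|}{i\beta}, \frac{i2\pi R}{\beta}\right) \xrightarrow[R\to\infty]{} \pi i\coth\!\left(\pi\,\frac{x^0_{12}\pm|\mathbf{x}_{12}|}{\beta}\right). \tag{5.26}$$

Inserting (5.24)–(5.26) into (5.23), we complete the proof of (5.20) and hence of Proposition 5.3.

Remark 5.4. The physical thermal correlation functions should be, in fact, defined as distributions, which amounts to giving integration rules around the poles. To do this, one should view (5.20) as a boundary value of an analytic function in $x_{12}$ for $x^0_{12} \to x^0_{12} - i\varepsilon$, $\varepsilon > 0$, $\varepsilon\to 0$ (cf. (5.17)). It is not difficult to demonstrate that the limits $\varepsilon\to +0$ and $R\to\infty$ in (5.19) commute. Using (5.25), we can also compute the $\frac{1}{R\beta}$ correction to (5.20):

$$\langle\varphi^M(x_1)\varphi^M(x_2)\rangle_{\beta,R} \underset{R\gg\beta}{\approx} \langle\varphi^M(x_1)\varphi^M(x_2)\rangle_{\beta,\infty} - \frac{1}{4\pi^2\beta R}. \tag{5.27}$$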
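The inhomogeneous modular law (5.25) and the limit (5.26) can be checked numerically. The following is a minimal sketch, assuming that $p_1$ is evaluated through the regularized cotangent series of Eq. (A.25); the helper names `cot`, `p1`, `modular_defect` and `coth_limit_defect`, the truncation `n_max` and the sample points are illustrative choices, not part of the paper.

```python
import cmath
import math


def cot(z):
    """Complex cotangent, written with decaying exponentials to avoid overflow."""
    if z.imag >= 0:
        w = cmath.exp(2j * z)           # |w| <= 1 in the upper half-plane
        return 1j * (w + 1) / (w - 1)
    w = cmath.exp(-2j * z)              # |w| < 1 in the lower half-plane
    return 1j * (1 + w) / (1 - w)


def p1(zeta, tau, n_max=60):
    """Truncation of the series (A.25): sum over n of pi*cot(pi(zeta+n*tau)) + i*pi*sgn(n)."""
    total = 0j
    for n in range(-n_max, n_max + 1):
        sgn = (n > 0) - (n < 0)
        total += math.pi * cot(math.pi * (zeta + n * tau)) + 1j * math.pi * sgn
    return total


def modular_defect(zeta, tau):
    """|lhs - rhs| of Eq. (5.25) at the given point."""
    lhs = p1(zeta, tau)
    rhs = p1(zeta / tau, -1 / tau) / tau - 2j * math.pi * zeta / tau
    return abs(lhs - rhs)


def coth_limit_defect(y, beta, radius):
    """Distance between p1(y/(i*beta), 2*pi*i*R/beta) and i*pi*coth(pi*y/beta), cf. (5.26)."""
    z = y / (1j * beta)
    tau = 2j * math.pi * radius / beta
    return abs(p1(z, tau) - 1j * math.pi / math.tanh(math.pi * y / beta))


if __name__ == "__main__":
    print(modular_defect(0.17 + 0.05j, 0.3 + 0.8j))
    print(coth_limit_defect(0.7, 2.0, 50.0))
```

Both printed defects should be at the level of floating-point rounding, since the summand of (A.25) decays exponentially in $|n|$ and the truncation error is negligible for the sample values used here.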
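To obtain the Fourier expansion of the result, we combine Eqs. (5.23) and (5.24) with the q-series (4.26) and set (as in Remark 5.2)

$$\frac{n}{R} = p,\qquad \frac{1}{R}\sum_{n=1}^{\infty} f\!\left(\frac{n}{R}; x, \beta\right) \xrightarrow[R\to\infty]{} \int_0^{\infty} f(p; x, \beta)\,dp. \tag{5.28}$$

The result is

$$(2\pi)^2\langle\varphi^M(x_1)\varphi^M(x_2)\rangle_{\beta,\infty} = \frac{1}{x_{12}^2 + i0\,x^0_{12}} + \frac{2}{|\mathbf{x}_{12}|}\int_0^{\infty}\frac{e^{-\beta p}}{1 - e^{-\beta p}}\cos(p\,x^0_{12})\sin(p|\mathbf{x}_{12}|)\,dp. \tag{5.29}$$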
To conclude: the conformal compactification $\overline{M}$ of Minkowski space $M$ plays a dual role. On one hand, it can serve as a symmetric finite box approximation to $M$ in the study of finite temperature equilibrium states. In fact, any finite inverse temperature $\beta$ actually fixes a Lorentz frame (cf. [8]) so that the symmetry of a Gibbs state
is described by the 7-parameter "Aristotelian group" of (3-dimensional) Euclidean motions and time translations. In the passage from $M$ to $\overline{M}$, the Euclidean group is deformed to the (stable) compact group of 4-dimensional rotations while the group of time translations is compactified to U(1). Working throughout with the maximal (7-parameter) symmetry, this allows us to write down simple explicit formulae for both finite R and the "thermodynamic limit".

On the other hand, taking $\overline{M}$ (and its universal cover, $\widetilde{M} = \mathbb{R}\times S^3$) not as an auxiliary finite volume approximation but as a model of a static space-time, we can view R as a (large but) finite quantity and use the above discussion as a basis for studying finite R corrections to the Minkowski space formulae. It is a challenge from this second point of view to study the conformal symmetry breaking by considering massive fields in $\widetilde{M}$.

6. Concluding Remarks

Periodicity of (observable) GCI fields in the conformal time variable $\zeta$ suggests that their Gibbs (finite temperature) correlation functions should be (doubly periodic) elliptic functions in the conformal time differences with second period proportional to the (complexified) inverse absolute temperature. We give arguments (Theorem 3.5, Corollary 3.6) that this is indeed the case in a GCI Wightman theory. Explicit constructions are presented of elliptic 2-point functions of free fields in an even number of space-time dimensions. If a field $\psi(\zeta,u)$ of dimension d and its conjugate satisfy the strong locality property (2.22), i.e. if

$$(\cos 2\pi\zeta_{12} - u_1\cdot u_2)^N\left[\psi(\zeta_1,u_1)\psi^*(\zeta_2,u_2) - (-1)^{2d}\psi^*(\zeta_2,u_2)\psi(\zeta_1,u_1)\right] = 0 \tag{6.1}$$

for $N \ge N_\psi$, then the Gibbs 2-point function $\langle\psi(\zeta_1,u_1)\psi^*(\zeta_2,u_2)\rangle_q$ has exactly two poles in a fundamental domain, centered around the origin of the $\zeta_{12}$ plane, of leading order $N_\psi$, at the points

$$\zeta_{12} = \pm\alpha \quad\text{for } u_1\cdot u_2 = \cos 2\pi\alpha,\ 0 \le \alpha < \tfrac{1}{2}. \tag{6.2}$$

For a rank $\ell$ symmetric tensor field $\psi$ of dimension d, the integer $N_\psi$ coincides with $d + \ell$; for an irreducible spin-tensor field in D = 4, of $S(U(2)\times U(2))$-weight $(d; j_1, j_2)$, we have $N_\psi = d + j_1 + j_2$.

The conformal energy mean value in an equilibrium Gibbs state (with suitably shifted vacuum energy) appears as a superposition of modular forms of different weights. Postulating this property for the photon energy (associated with the Maxwell stress tensor F) requires including (non-physical) gauge degrees of freedom (otherwise, the non-modular term $2G_2(\tau)$ contributes to (4.86)). The result is then a modular form of weight 4 (Sec. 4.4, Eq. (4.90)). The same is true for the free massless scalar field for D = 4, while the energy mean of a $d = \frac{3}{2}$ Weyl field is a superposition of modular forms (4.68) of weight 4 and 2 (and level $\Gamma_\theta$, see Appendix A, Eq. (A.15) and the text following it). The question
arises whether by relaxing the condition of Wightman positivity, one cannot find an (indefinite metric) interacting Weyl field model whose energy mean value is a (homogeneous) modular form of weight four (as suggested by the study of chiral conformal models in 1 + 1 space-time dimension). More generally, the role of modular invariance in higher dimensional conformal field models still awaits its full understanding.

Acknowledgments

The authors' interest in the modular properties of energy distributions in higher dimensional conformal field theory was stimulated by an early suggestion of Maxim Kontsevich. Discussions with Petko Nikolov are also gratefully acknowledged. We thank Seif Randjbar–Daemi and the Abdus Salam International Centre for Theoretical Physics in Trieste for the invitation and support during the later stage of this work. Discussions with Detlev Buchholz in Göttingen led to including the present addition of Sec. 5. We acknowledge partial support of the Alexander von Humboldt Foundation and the hospitality of the Institut für Theoretische Physik der Universität Göttingen during the final stage of this paper. Our work is supported in part by the Research Training Network within the Framework Programme 5 of the European Commission under contract HPRN-CT-2002-00325 and by the Bulgarian National Council for Scientific Research under contract Ph-1406.

Appendix A. Basic Elliptic Functions

In this Appendix, we define the basic elliptic functions used in the paper and list their properties and relations with the conventional functions. Recall that an elliptic function $f(\zeta)$ is a meromorphic function on $\mathbb{C}$ ($\ni \zeta$) which is doubly periodic. Its periods can be chosen (after rescaling by a non-zero complex constant) to be 1 and $\tau$ with $\tau \in \mathfrak{H}$ ($:= \{\tau\in\mathbb{C} : \operatorname{Im}\tau > 0\}$). Thus, $f(\zeta) = f(\zeta + m + n\tau)$ for $m, n \in \mathbb{Z}$ and hence, f is completely determined by its values in the fundamental domain $D := \{\zeta\in\mathbb{C} : \zeta = \lambda + \mu\tau,\ 0 \le \lambda, \mu < 1\}$.
By Liouville's theorem, $f(\zeta)$ should have at least one pole in D if it is non-constant: otherwise it would be a bounded non-constant entire function in $\zeta$, which is not possible. Integrating $f(\zeta)$ and $\frac{f'(\zeta)}{f(\zeta)}$ over the boundary $\partial D$ (or, over the shifted $\partial D + c$, if necessary), we conclude in addition (by the Cauchy theorem on one hand, and the double periodicity, on the other) that: (i) the sum of the residues of the simple poles of f lying in D is zero and, (ii) the sum of multiplicities of all zeros minus the sum of multiplicities of all poles of f in D is also zero. In particular, f cannot have just one simple pole in D. Therefore, if the singular part of f in D has the form:

$$\sum_{k=1}^{K}\sum_{s=1}^{S}\frac{N_{k,s}}{(\zeta - \zeta_s)^k} \tag{A.1}$$
for some $K, S \in \mathbb{N}$, $N, N_{k,s} \in \mathbb{C}$, $\zeta_s \in D$ ($k = 1,\dots,K$, $s = 1,\dots,S$), then f can be represented in a finite sum:

$$f(\zeta) = N + \sum_{k=1}^{K}\sum_{s=1}^{S} N_{k,s}\,p_k(\zeta - \zeta_s, \tau) \tag{A.2}$$

where $p_k(\zeta,\tau)$ are, roughly speaking, equal to:

$$p_k(\zeta,\tau) := \sum_{m,n\in\mathbb{Z}}\frac{1}{(\zeta + m + n\tau)^k}. \tag{A.3}$$

The series (A.3) are absolutely convergent for $k \ge 3$ and

$$p_{k+1}(\zeta,\tau) = -\frac{1}{k}(\partial_\zeta p_k)(\zeta,\tau). \tag{A.4}$$

For k = 1, 2 one should specialize the order of summation or, alternatively, add regularizing terms. In such a way, we arrive at the standard Weierstrass functions [12]:

$$Z(\zeta,\tau) = \frac{1}{\zeta} + \sum_{(m,n)\in\mathbb{Z}^2\setminus\{(0,0)\}}\left[\frac{1}{\zeta + m\tau + n} - \frac{1}{m\tau + n} + \frac{\zeta}{(m\tau+n)^2}\right], \tag{A.5}$$

$$\wp(\zeta,\tau) = \frac{1}{\zeta^2} + \sum_{(m,n)\in\mathbb{Z}^2\setminus\{(0,0)\}}\left[\frac{1}{(\zeta + m\tau + n)^2} - \frac{1}{(m\tau+n)^2}\right]. \tag{A.6}$$

Thus $Z(\zeta,\tau)$ and $\wp(\zeta,\tau)$ are odd and even meromorphic functions in $\zeta$, respectively, and

$$(\partial_\zeta Z)(\zeta,\tau) = -\wp(\zeta,\tau),\qquad (\partial_\zeta\wp)(\zeta,\tau) = -2p_3(\zeta,\tau). \tag{A.7}$$

Since $p_3(\zeta,\tau)$ is elliptic, it then follows that $\wp(\zeta,\tau)$ is also elliptic. The function $Z(\zeta,\tau)$ cannot be elliptic (by the property (i) above) and, in fact,

$$Z(\zeta+1,\tau) = Z(\zeta,\tau) - 8\pi^2 G_2(\tau), \tag{A.8}$$

$$Z(\zeta+\tau,\tau) = Z(\zeta,\tau) - 8\pi^2 G_2(\tau)\tau - 2\pi i, \tag{A.9}$$

where

$$G_{2k}(\tau) = \frac{(2k-1)!}{2(2\pi i)^{2k}}\left\{\sum_{n\in\mathbb{Z}\setminus\{0\}}\frac{1}{n^{2k}} + \sum_{m\in\mathbb{Z}\setminus\{0\}}\sum_{n\in\mathbb{Z}}\frac{1}{(m\tau+n)^{2k}}\right\} \tag{A.10}$$

($k = 1, 2, \dots$) are the G-modular functions that also play a central role in this work. Hence, $Z(\zeta,\tau)$ and $\wp(\zeta,\tau)$ are possible candidates for $p_1$ and $p_2$, and they are indeed used as basic functions in [38]. As we have explained in the introduction, we prefer to work with (anti)periodic functions in $\zeta$ with period 1 and, on the other hand, to preserve the relation (A.4) for all $k\in\mathbb{N}$ so that this naturally fixes

$$p_1(\zeta,\tau) := Z(\zeta,\tau) + 8\pi^2 G_2(\tau)\zeta, \tag{A.11}$$

$$p_2(\zeta,\tau) := \wp(\zeta,\tau) - 8\pi^2 G_2(\tau). \tag{A.12}$$
For k > 1, the above introduced $G_{2k}(\tau)$ are modular forms of weight 2k (and level 1):

$$\frac{1}{(c\tau+d)^{2k}}\,G_{2k}\!\left(\frac{a\tau+b}{c\tau+d}\right) = G_{2k}(\tau)\quad\text{for } \begin{pmatrix} a & b\\ c & d\end{pmatrix}\in SL(2,\mathbb{Z}), \tag{A.13}$$

while for k = 1 we have instead

$$\frac{1}{(c\tau+d)^{2}}\,G_{2}\!\left(\frac{a\tau+b}{c\tau+d}\right) = G_2(\tau) + \frac{ic}{4\pi(c\tau+d)}. \tag{A.14}$$

In applications to CFT, there appear more general modular forms like

$$F(\tau) := 2G_2(\tau) - G_2\!\left(\frac{\tau+1}{2}\right) \tag{A.15}$$

which is invariant under the index 2 subgroup $\Gamma_\theta$ of $SL(2,\mathbb{Z})$ generated by S and $T^2$ where S is given by (1.2) and $T = \begin{pmatrix} 1 & 1\\ 0 & 1\end{pmatrix}$. We note that the normalization factor in the definition of the modular forms $G_{2k}(\tau)$ (A.10) is chosen so that the coefficient to q in their Fourier expansion is 1. Then, one finds that all Fourier coefficients (except the constant term) are positive integers:

$$G_{2k}(\tau) = -\frac{B_{2k}}{4k} + \sum_{n=1}^{\infty}\frac{n^{2k-1}q^n}{1-q^n} = \frac{1}{2}\zeta(1-2k) + \sum_{n=1}^{\infty}\sigma_{2k-1}(n)\,q^n \tag{A.16}$$

where $\sigma_l(n) = \sum_{r|n} r^l$ (sum over all positive divisors r of n), $B_l$ are the Bernoulli numbers, and $\zeta(s)$ is the Riemann $\zeta$-function. Let us mention also the modular transformation properties of the Weierstrass functions (A.6) and (A.5):

$$\frac{1}{c\tau+d}\,Z\!\left(\frac{\zeta}{c\tau+d}, \frac{a\tau+b}{c\tau+d}\right) = Z(\zeta,\tau), \tag{A.17}$$

$$\frac{1}{(c\tau+d)^2}\,\wp\!\left(\frac{\zeta}{c\tau+d}, \frac{a\tau+b}{c\tau+d}\right) = \wp(\zeta,\tau). \tag{A.18}$$
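The q-expansion (A.16) and the transformation laws (A.13) and (A.14) lend themselves to a quick numerical sanity check for the generator $S:\tau\mapsto -1/\tau$ (that is, $a=0$, $b=-1$, $c=1$, $d=0$). The sketch below is illustrative only: the function names, the truncation orders and the sample value of $\tau$ are assumptions made here, and $q = e^{2\pi i\tau}$ is taken as the expansion variable.

```python
import cmath
import math
from fractions import Fraction

# Bernoulli numbers B_2, B_4, B_6 needed for the constant term -B_{2k}/(4k).
BERNOULLI = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}


def g_lambert(two_k, tau, n_max=200):
    """G_{2k}(tau) from the Lambert series form of Eq. (A.16)."""
    q = cmath.exp(2j * math.pi * tau)
    total = -float(BERNOULLI[two_k]) / (2 * two_k)   # -B_{2k}/(4k)
    qn = 1 + 0j
    for n in range(1, n_max + 1):
        qn *= q
        total += n ** (two_k - 1) * qn / (1 - qn)
    return total


def sigma(power, n):
    """Divisor sum sigma_power(n)."""
    return sum(r ** power for r in range(1, n + 1) if n % r == 0)


def g_sigma(two_k, tau, n_max=60):
    """G_{2k}(tau) from the divisor-sum form of Eq. (A.16)."""
    q = cmath.exp(2j * math.pi * tau)
    total = -float(BERNOULLI[two_k]) / (2 * two_k)
    qn = 1 + 0j
    for n in range(1, n_max + 1):
        qn *= q
        total += sigma(two_k - 1, n) * qn
    return total


def weight_defect(two_k, tau):
    """|G_{2k}(-1/tau) - tau^{2k} G_{2k}(tau)|, which vanishes for k > 1 by (A.13)."""
    return abs(g_lambert(two_k, -1 / tau) - tau ** two_k * g_lambert(two_k, tau))


def g2_anomaly_defect(tau):
    """Defect of (A.14) for S: G_2(-1/tau) = tau^2 G_2(tau) + i*tau/(4*pi)."""
    rhs = tau ** 2 * g_lambert(2, tau) + 1j * tau / (4 * math.pi)
    return abs(g_lambert(2, -1 / tau) - rhs)


if __name__ == "__main__":
    tau = 0.3 + 1.1j
    print(weight_defect(4, tau), weight_defect(6, tau), g2_anomaly_defect(tau))
```

For $k = 1$ the plain weight-2 law fails by the inhomogeneous term of (A.14), which is what `g2_anomaly_defect` accounts for.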
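Thus, our $p_1(\zeta,\tau)$ (A.11) and $p_2(\zeta,\tau)$ (A.12) will obey inhomogeneous modular transformation laws (as in the example of Eq. (5.25)). (This is the price for preserving the periodicity property for $\zeta\to\zeta+1$.) We will use also the Jacobi $\vartheta$-functions, see [25] and [37]:

$$\vartheta(\zeta,\tau) := \sum_{n=-\infty}^{\infty} e^{\pi i(n^2\tau + 2n\zeta)} \equiv \vartheta_{00}(\zeta,\tau), \tag{A.19}$$

$$\vartheta_{\lambda\kappa}(\zeta,\tau) := e^{\pi i\tau\frac{\lambda^2}{4} + \pi i\lambda\left(\zeta + \frac{\kappa}{2}\right)}\,\vartheta\!\left(\zeta + \frac{\lambda\tau+\kappa}{2}, \tau\right) \tag{A.20}$$

for $\kappa, \lambda = 0, 1$, which have the following properties (for $\kappa,\lambda = 0,1$):

$$\vartheta_{\lambda\kappa}(\zeta + m\tau + n, \tau) = (-1)^{m\kappa + n\lambda}\,e^{-\pi i(m^2\tau + 2m\zeta)}\,\vartheta_{\lambda\kappa}(\zeta,\tau), \tag{A.21}$$

$$\vartheta_{\lambda\kappa}(-\zeta,\tau) = (-1)^{\lambda\kappa}\,\vartheta_{\lambda\kappa}(\zeta,\tau), \tag{A.22}$$

$$\vartheta_{\lambda\kappa}(\zeta,\tau) = 0 \iff \zeta \in \left(\mathbb{Z} + \frac{1-\lambda}{2}\right)\tau + \mathbb{Z} + \frac{1-\kappa}{2}. \tag{A.23}$$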
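As an illustration, the properties (A.21)–(A.23) can be tested numerically by truncating the series (A.19). In the sketch below the names `theta`, `theta_lk` and `max_defect`, the truncation order and the sample points are hypothetical choices made for this example only.

```python
import cmath
import math


def theta(zeta, tau, n_max=40):
    """Truncated series (A.19) for the basic Jacobi theta function."""
    return sum(cmath.exp(1j * math.pi * (n * n * tau + 2 * n * zeta))
               for n in range(-n_max, n_max + 1))


def theta_lk(lam, kap, zeta, tau):
    """Theta function with characteristics (lambda, kappa), Eq. (A.20)."""
    prefactor = cmath.exp(1j * math.pi * tau * lam * lam / 4
                          + 1j * math.pi * lam * (zeta + kap / 2))
    return prefactor * theta(zeta + (lam * tau + kap) / 2, tau)


def max_defect(zeta, tau):
    """Largest violation of (A.21)-(A.23) over all four characteristics."""
    worst = 0.0
    for lam in (0, 1):
        for kap in (0, 1):
            base = theta_lk(lam, kap, zeta, tau)
            # Quasi-periodicity (A.21), measured relative to the size of the rhs.
            for m in (-1, 0, 2):
                for n in (-1, 3):
                    lhs = theta_lk(lam, kap, zeta + m * tau + n, tau)
                    sign = (-1) ** ((m * kap + n * lam) % 2)
                    rhs = sign * cmath.exp(-1j * math.pi * (m * m * tau + 2 * m * zeta)) * base
                    worst = max(worst, abs(lhs - rhs) / max(1.0, abs(rhs)))
            # Parity (A.22).
            parity = (-1) ** (lam * kap)
            worst = max(worst, abs(theta_lk(lam, kap, -zeta, tau) - parity * base))
            # Zero (A.23) at the representative point ((1-lam)/2) tau + (1-kap)/2.
            zero = (1 - lam) / 2 * tau + (1 - kap) / 2
            worst = max(worst, abs(theta_lk(lam, kap, zero, tau)))
    return worst


if __name__ == "__main__":
    print(max_defect(0.13 + 0.07j, 0.2 + 0.9j))
```

The single number printed is the worst defect found; it should be of the order of machine precision if the three properties hold as stated.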
(Equations (A.21) and (A.22) are first proven for the series (A.19) and then for the other functions (A.20); Eq. (A.23) follows from Eqs. (A.22) and (A.20) since the first one means that $\vartheta_{11}(\zeta,\tau)$ is odd in $\zeta$.) We are using in Sec. 3.2 the fact that the odd $\vartheta$-function, $\vartheta_{11}$, can be written in the form

$$\vartheta_{11}(\zeta,\tau) = -2\sum_{n=0}^{\infty}(-1)^n e^{i\pi\tau\left(n+\frac{1}{2}\right)^2}\sin(2n+1)\pi\zeta. \tag{A.24}$$

Returning to our set $\{p_k(\zeta,\tau)\}$ of basic elliptic functions, we can rewrite (the quasielliptic) $p_1(\zeta,\tau)$ as

$$p_1(\zeta,\tau) = \lim_{N\to\infty}\sum_{n=-N}^{N}\pi\cot[\pi(\zeta+n\tau)] = \sum_{n=-\infty}^{\infty}\{\pi\cot[\pi(\zeta+n\tau)] + i\pi\,\mathrm{sgn}(n)\} \tag{A.25}$$

$$= \lim_{M\to\infty}\lim_{N\to\infty}\sum_{m=-M}^{M}\sum_{n=-N}^{N}\frac{1}{\zeta + m + n\tau}, \tag{A.26}$$

where $\mathrm{sgn}(n) := \frac{n}{|n|}$ for $n \ne 0$ and $\mathrm{sgn}(0) := 0$. Indeed, first note that the second sum in Eq. (A.25) is absolutely convergent since the absolute value of the summand behaves as $e^{-\pi|n|\operatorname{Im}\tau}$. Then Eq. (A.26) follows from Euler's identity:

$$\frac{\pi\cos^{1-\lambda}\pi\zeta}{\sin\pi\zeta} = \lim_{N\to\infty}\sum_{n=-N}^{N}\frac{(-1)^{n\lambda}}{\zeta+n}\quad(\lambda = 0, 1). \tag{A.27}$$

Finally, to obtain the first Eq. (A.25), we take the difference between both sides and observe that it is an elliptic function in $\zeta$, in accord with Eqs. (A.9) and (A.11). On the other hand, this difference is regular in $\zeta$ in the fundamental domain D, because of Eqs. (A.5) and (A.26), so that it is a constant which is actually zero since it is obviously an odd function in $\zeta$.

Equation (A.25) is closely related to the general form of the elliptic correlation functions arising in the free field GCI models according to Theorem 4.1 (see Eq. (4.6)). In view of the more general situation of the "grand canonical" correlation functions in Remark 4.1 (Eq. (4.11)), we are led to introduce for $\kappa,\lambda = 0,1$, $\tau\in\mathfrak{H}$, $\zeta\in\mathbb{C}\setminus(\mathbb{Z}\tau+\mathbb{Z})$ and $\mu\in\mathbb{R}$,

$$p_1^{\kappa,\lambda}(\zeta,\tau,\mu) = \sum_{n=-\infty}^{\infty}\left\{\frac{\pi\cos^{1-\lambda}[\pi(\zeta+n\tau)]}{\sin[\pi(\zeta+n\tau)]} + i\pi(1-\lambda)\,\mathrm{sgn}(n)\right\}e^{\pi i n(2\mu+\kappa)}. \tag{A.28}$$

For $|n| \gg 0$, the absolute value of the summand in the above series behaves as $e^{-\pi|n|\operatorname{Im}\tau}$ and therefore, the series is convergent for every $\zeta\in\mathbb{C}\setminus(\mathbb{Z}\tau+\mathbb{Z})$, $\mu\in\mathbb{R}$ and $\tau\in\mathfrak{H}$. It then follows that

$$p_1^{\kappa,\lambda}(\zeta+\tau,\tau,\mu) = (-1)^{\kappa}e^{-2\pi i\mu}\,p_1^{\kappa,\lambda}(\zeta,\tau,\mu) - \pi i(1-\lambda)\left(1 + e^{-\pi i(2\mu+\kappa)}\right),\qquad p_1^{\kappa,\lambda}(\zeta+1,\tau,\mu) = (-1)^{\lambda}\,p_1^{\kappa,\lambda}(\zeta,\tau,\mu). \tag{A.29}$$
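In the case of $\kappa = \lambda = 0$, we will simplify the notation setting

$$p_1(\zeta,\tau,\mu) := p_1^{00}(\zeta,\tau,\mu). \tag{A.30}$$

Proposition A.1. The functions $p_1^{\kappa,\lambda}(\zeta,\tau,\mu)$ (A.28) ($\kappa,\lambda = 0,1$) have an analytic extension to meromorphic functions in $(\zeta,\tau,\mu)\in\mathbb{C}\times\mathfrak{H}\times\mathbb{C}$ given, for $\mu + \frac{\kappa}{2}\in\mathbb{R}\setminus\mathbb{Z}$, by

$$p_1^{\kappa,\lambda}(\zeta,\tau,\mu) = \frac{(\partial_\zeta\vartheta_{11})(0,\tau)}{\vartheta_{1-\lambda\,1-\kappa}(\mu,\tau)}\,\frac{\vartheta_{1-\lambda\,1-\kappa}(\zeta+\mu,\tau)}{\vartheta_{11}(\zeta,\tau)} - (1-\lambda)\,\pi\cot\!\left[\pi\!\left(\mu + \frac{\kappa}{2}\right)\right]. \tag{A.31}$$

They are regular for all $\mu\in\mathbb{R}$ and

$$p_1(\zeta,\tau,0) = \frac{(\partial_\zeta\vartheta_{11})(\zeta,\tau)}{\vartheta_{11}(\zeta,\tau)} \equiv p_1(\zeta,\tau), \tag{A.32}$$

$p_1(\zeta,\tau)$ being defined by Eq. (A.11).

Proof. Let $\mu\in\mathbb{R}$ and take the difference $\Delta(\zeta,\tau,\mu)$ between the left- and right-hand sides of Eq. (A.31). From the properties (A.21)–(A.23) and (A.29), we find that

$$\Delta(\zeta + m\tau + n,\tau,\mu) = (-1)^{m\kappa+n\lambda}e^{-2\pi i m\mu}\,\Delta(\zeta,\tau,\mu). \tag{A.33}$$

(Note that the second ratio in Eq. (A.31) is chosen to obey the quasi-periodicity property (A.33) and its pole coefficient at $\zeta = 0$ is canceled by the first ratio.) On the other hand, $\Delta(\zeta,\tau,\mu)$ is analytic in $\zeta$, for fixed $\tau$ and $\mu$, outside the lattice $\mathbb{Z}\tau+\mathbb{Z}\subset\mathbb{C}$ and since it is also regular at the origin $\zeta = 0$, Eq. (A.33) then implies that $\Delta(\zeta,\tau,\mu)$ is an entire bounded function in $\zeta$. By Liouville's theorem, we conclude that $\Delta(\zeta,\tau,\mu)$ does not depend on $\zeta$ and it is actually zero, again by Eq. (A.33). Equation (A.32) follows in the same way from Eqs. (A.8), (A.9) and (A.29). (The constant here is fixed by the behavior for $\zeta\to 0$.)

For $k = 1, 2, \dots$, we set

$$p_{k+1}^{\kappa,\lambda}(\zeta,\tau,\mu) = -\frac{1}{k}\,\partial_\zeta p_k^{\kappa,\lambda}(\zeta,\tau,\mu),\qquad p_k(\zeta,\tau,\mu) := p_k^{00}(\zeta,\tau,\mu), \tag{A.34}$$

$$p_{k+1}^{\kappa,\lambda}(\zeta,\tau) = p_{k+1}^{\kappa,\lambda}(\zeta,\tau,0)\ \left(= -\frac{1}{k}\,\partial_\zeta p_k^{\kappa,\lambda}(\zeta,\tau)\right),\qquad p_k(\zeta,\tau) := p_k^{00}(\zeta,\tau). \tag{A.35}$$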
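Equation (A.32) can be illustrated numerically by comparing the logarithmic derivative of $\vartheta_{11}$ with the cotangent series (A.25). The following sketch assumes truncated series for both sides; the names `theta11`, `dtheta11`, `p1_series` and `a32_defect`, and the sample values, are introduced here for illustration and do not appear in the paper.

```python
import cmath
import math


def theta11(zeta, tau, n_max=40):
    """theta_11 from (A.19)-(A.20): sum of exp(i*pi*[(n+1/2)^2 tau + (2n+1)(zeta+1/2)])."""
    return sum(cmath.exp(1j * math.pi * ((n + 0.5) ** 2 * tau + (2 * n + 1) * (zeta + 0.5)))
               for n in range(-n_max, n_max + 1))


def dtheta11(zeta, tau, n_max=40):
    """Term-by-term zeta-derivative of the series used in theta11."""
    return sum(1j * math.pi * (2 * n + 1)
               * cmath.exp(1j * math.pi * ((n + 0.5) ** 2 * tau + (2 * n + 1) * (zeta + 0.5)))
               for n in range(-n_max, n_max + 1))


def cot(z):
    """Complex cotangent evaluated with decaying exponentials only."""
    if z.imag >= 0:
        w = cmath.exp(2j * z)
        return 1j * (w + 1) / (w - 1)
    w = cmath.exp(-2j * z)
    return 1j * (1 + w) / (1 - w)


def p1_series(zeta, tau, n_max=60):
    """Truncated regularized cotangent series, second form of Eq. (A.25)."""
    total = 0j
    for n in range(-n_max, n_max + 1):
        sgn = (n > 0) - (n < 0)
        total += math.pi * cot(math.pi * (zeta + n * tau)) + 1j * math.pi * sgn
    return total


def a32_defect(zeta, tau):
    """|theta_11'/theta_11 - p1| at the given point, cf. Eq. (A.32)."""
    return abs(dtheta11(zeta, tau) / theta11(zeta, tau) - p1_series(zeta, tau))


if __name__ == "__main__":
    zeta, tau = 0.31 + 0.12j, 0.2 + 0.9j
    print(a32_defect(zeta, tau))
    # Quasi-periodicity implied by (A.29) at kappa = lambda = mu = 0.
    print(abs(p1_series(zeta + tau, tau) - p1_series(zeta, tau) + 2j * math.pi))
```

The second printed quantity checks that the shift $\zeta\to\zeta+\tau$ changes $p_1$ by $-2\pi i$, as (A.29) gives for $\kappa=\lambda=0$, $\mu=0$.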
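Proposition A.2. Every function $p_k^{\kappa,\lambda}(\zeta,\tau,\mu)$, for $k = 2, 3, \dots$, is uniquely characterized by the conditions:

(a) $p_k^{\kappa,\lambda}(\zeta,\tau,\mu)$ is a meromorphic function in $(\zeta,\tau,\mu)\in\mathbb{C}\times\mathfrak{H}\times\mathbb{C}$ and for real $\mu$, and for all $\tau\in\mathfrak{H}$, $k = 1, 2, \dots$, $\kappa,\lambda = 0,1$, it has exactly one pole in $\zeta$ at 0 of order k and residue 1 in the domain $\{\alpha\tau+\beta : \alpha,\beta\in[0,1)\}\subset\mathbb{C}$;

(b) $p_k^{\kappa,\lambda}(\zeta+1,\tau,\mu) = (-1)^{\lambda}\,p_k^{\kappa,\lambda}(\zeta,\tau,\mu)$;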
(c) $p_k^{\kappa,\lambda}(\zeta+\tau,\tau,\mu) = e^{-\pi i(2\mu+\kappa)}\,p_k^{\kappa,\lambda}(\zeta,\tau,\mu)$.

It also obeys the property

(d) $p_k^{\kappa,\lambda}(-\zeta,\tau,\mu) = (-1)^k\,p_k^{\kappa,\lambda}(\zeta,\tau,\mu)$.

The function $p_1^{\kappa,\lambda}(\zeta,\tau,\mu)$ can be fixed by the condition (d) and the relation (A.34) connecting it with the function $p_2^{\kappa,\lambda}(\zeta,\tau,\mu)$. For $\mu\in\mathbb{R}$, we have the series representation

$$p_k^{\kappa,\lambda}(\zeta,\tau,\mu) = \sum_{m,n\in\mathbb{Z}}\frac{e^{\pi i m(2\mu+\kappa)}\,e^{\pi i n\lambda}}{(\zeta + m\tau + n)^k}, \tag{A.36}$$

which is absolutely convergent for $k\ge 3$ and $\zeta\in\mathbb{C}\setminus(\mathbb{Z}\tau+\mathbb{Z})$, and for k = 1, 2, the sum should be taken first in n for $|n|\le N$ as $N\to\infty$ and then in m for $|m|\le M\to\infty$.

Proof. Clearly the functions defined by Eqs. (A.31) and (A.34) satisfy the conditions (a)–(d) except the case of k = 1 in (c). By the argument used in the proof of Proposition A.1, it follows that (a)–(c) uniquely determine the functions $p_k^{\kappa,\lambda}(\zeta,\tau,\mu)$ for $k\ge 2$. The relation (A.34) fixes the function $p_1^{\kappa,\lambda}(\zeta,\tau,\mu)$ up to an additive constant which is determined by the condition (d). The derivation of Eq. (A.36) is based on (A.28) and (A.27).

Eq. (A.36) implies that:

$$p_k^{01}(\zeta,\tau) = 2^{1-k}\,p_k\!\left(\frac{\zeta}{2},\frac{\tau}{2}\right) - p_k(\zeta,\tau),\qquad p_k^{10}(\zeta,\tau) = p_k(\zeta,2\tau) - p_k(\zeta,\tau),\qquad p_k^{11}(\zeta,\tau) = 2^{1-k}\,p_k\!\left(\frac{\zeta}{2},\frac{\tau+1}{2}\right) - p_k(\zeta,\tau). \tag{A.37}$$
Appendix B. Proof of Proposition 3.4

We begin by recalling a basic fact of the theory of formal power series.

Fact B.1. Let R be a commutative ring with unit and $a(q) = 1 + \sum_{n=1}^{\infty}a_nq^n\in R[[q]]$ be an infinite formal power series in a single variable q. Then $a(q)$ is invertible in $R[[q]]$, i.e. there exists a unique $b(q) = \sum_{n=0}^{\infty}b_nq^n\in R[[q]]$ such that $a(q)b(q) = 1$. Moreover, $b_0 = 1$ and if $a(q)$ is a complex series that is absolutely convergent and non-zero for $|q| < \lambda$, then $b(q)$ is absolutely convergent for $|q| < \lambda^{-1}$.

Proof. Noting that

$$\left(1 + \sum_{n=1}^{\infty}a_nq^n\right)\left(\sum_{n=0}^{\infty}b_nq^n\right) = b_0 + \sum_{n=1}^{\infty}\left(b_n + \sum_{k=0}^{n-1}a_{n-k}b_k\right)q^n,$$

one can inductively determine $b_n$ starting with $b_0 = 1$. If $a(q)$ is absolutely convergent and non-zero for $|q| < \lambda$, then $b(q)$ will be the Taylor series of an analytic function for $|q| < \lambda^{-1}$ so that it will be absolutely convergent there.
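The inductive determination of the coefficients $b_n$ in the proof of Fact B.1 amounts to the recursion $b_0 = 1$, $b_n = -\sum_{k=0}^{n-1}a_{n-k}b_k$. A minimal sketch of this recursion over the rationals is given below; the function names and the use of `Fraction` coefficients are choices made for this illustration, and only finitely many coefficients of $a(q)$ are supplied (the remaining ones are taken to be zero).

```python
from fractions import Fraction


def invert_series(a, n_terms):
    """First n_terms coefficients of b(q) = 1/a(q) for a(q) = 1 + a[1] q + a[2] q^2 + ...

    Coefficients of a beyond len(a) - 1 are treated as zero.
    """
    if a[0] != 1:
        raise ValueError("the constant term of a(q) must be 1")
    b = [Fraction(1)]
    for n in range(1, n_terms):
        # b_n = - sum_{k=0}^{n-1} a_{n-k} b_k, from the coefficient of q^n in a(q) b(q) = 1.
        acc = Fraction(0)
        for k in range(n):
            if n - k < len(a):
                acc += Fraction(a[n - k]) * b[k]
        b.append(-acc)
    return b


def multiply_truncated(a, b, n_terms):
    """Coefficients of a(q) b(q) up to order n_terms - 1."""
    out = [Fraction(0)] * n_terms
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n_terms:
                out[i + j] += Fraction(ai) * bj
    return out


if __name__ == "__main__":
    # 1/(1 - q) = 1 + q + q^2 + ... and 1/(1 - q)^2 = 1 + 2q + 3q^2 + ...
    print(invert_series([1, -1], 5))
    print(invert_series([1, -2, 1], 5))
```

Multiplying the result back with `multiply_truncated` returns $1 + 0\cdot q + \dots$ up to the truncation order, which is the defining property $a(q)b(q) = 1$.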
Continuing with the proof of the statement (a) of Proposition 3.4, we note first that $\Theta_{12}$ is obtained from (3.13) (see also (1.11) and (1.12)) as a formal power series in q with coefficients that are polynomials, say $\Theta_{12}^{(n)}$, in $e^{\pm\pi i\zeta_{12}}$ and $e^{\pm\pi i\alpha}$. Thus the coefficient in $\Theta_{12}^{(n)}$ to $e^{m\pi i\zeta_{12}}$ (for $|m|\le n$) will be an even trigonometric polynomial in $\alpha$ with period 1 (since $\Theta_{12}$, as an analytic function, is even and periodic with period 1 in $\zeta_{12}$ as well as in $\alpha$, according to Eqs. (A.21) and (A.22)) and hence, $\Theta_{12}^{(n)}$ is a polynomial in $\cos 2\pi\alpha = u_1\cdot u_2\in\mathbb{C}[u_1,u_2]$. Then considering $\Theta_{12}^{(n)}$ as a polynomial in $\cos 2\pi\alpha$, we find in the same way that its coefficients are polynomials in $\cos 2\pi\zeta_{12}$. Summarizing, we have $\Theta_{12}\in\mathbb{C}[e^{\pm 2\pi i\zeta_1},u_1; e^{\pm 2\pi i\zeta_2},u_2]$. To prove Eq. (3.15), we observe that $\Theta_{12}^{(n)}$ is a polynomial in $\cos 2\pi\zeta_{12}$ (with polynomial coefficients in $\cos 2\pi\alpha$) which is zero for $\cos 2\pi\zeta_{12} = \cos 2\pi\alpha$ (since $\Theta_{12} = 0$ for $\zeta_{12} = \pm\alpha$). It then follows that

$$\frac{\Theta_{12}^{(n)}}{4\sin\pi\zeta_+\sin\pi\zeta_-} \equiv \frac{\Theta_{12}^{(n)}}{2(\cos 2\pi\alpha - \cos 2\pi\zeta_{12})}$$

is again a polynomial in $\cos 2\pi\zeta_{12}$ and $\cos 2\pi\alpha$. This and the second equality in (3.13) prove Eq. (3.15). Since $\Theta_{12}$ is even in both $\zeta_{12}$ and $\alpha$, we have the symmetry $\Theta_{12} = \Theta_{21}$.

Now the proof of the first part of Proposition 3.4(b) follows from Fact B.1, Eq. (3.15) and the existence in $\mathbb{C}[[e^{\pm 2\pi i\zeta_1},u_1]]_+[[e^{\pm 2\pi i\zeta_2},u_2]]_+$ of the inverse:

$$\frac{1}{\sin\pi\zeta_+\sin\pi\zeta_-} = \frac{4e^{-2\pi i\zeta_{12}}}{1 - 2\cos(2\pi\alpha)e^{-2\pi i\zeta_{12}} + e^{-4\pi i\zeta_{12}}} \tag{B.1}$$

$$= 4\sum_{n=0}^{\infty}C_n^1(\cos 2\pi\alpha)\,e^{-(n+1)\pi i\zeta_{12}}, \tag{B.2}$$

where $C_n^k(t)$ are the Gegenbauer polynomials already used in Sec. 4.2.

Continuing with the proof of Proposition 3.4(c), we note first that the symmetry of $\Omega$ follows from that of $\Theta_{jk}$. To obtain Eq. (3.17), one first derives for $m\in\mathbb{Z}$:

$$\Theta(\zeta_{12} + m\tau; u_1,u_2) = e^{-2\pi i(m^2\tau + 2m\zeta_{12})}\,\Theta(\zeta_{12}; u_1,u_2) \tag{B.3}$$

using Eqs. (3.13) and (A.12). Then we have for $\lambda_1,\dots,\lambda_{n-1}\in\mathbb{Z}$:

$$\Omega(\zeta_{1\,2}+\lambda_1\tau,\dots,\zeta_{n-1\,n}+\lambda_{n-1}\tau; u_1,\dots,u_n;\tau) = \prod_{1\le l<m\le n}\Theta\!\left(\sum_{j=l}^{m-1}(\zeta_{j\,j+1}+\lambda_j\tau); u_l,u_m\right)$$

$$= \exp\left\{-2\pi i\sum_{1\le l\le m\le n-1}\left[\left(\sum_{j=l}^{m}\lambda_j\right)^{2}\tau + 2\left(\sum_{j=l}^{m}\lambda_j\right)\left(\sum_{j=l}^{m}\zeta_{j\,j+1}\right)\right]\right\}\times\Omega(\zeta_{1\,2},\dots,\zeta_{n-1\,n}; u_1,\dots,u_n;\tau) \tag{B.4}$$

so that expanding the sums in the latter exponent we arrive at Eq. (3.17) with a positive definite integral matrix $\left(A^{(n)}_{jk}\right)_{j,k=1}^{n-1}$.
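The expansion used in (B.1)–(B.2) rests on the generating function of the Gegenbauer polynomials $C_n^1$, namely $\frac{1}{1 - 2tx + x^2} = \sum_{n\ge 0}C_n^1(t)x^n$ for $|x| < 1$; these polynomials coincide with the Chebyshev polynomials of the second kind and obey $C_{n+1}^1 = 2tC_n^1 - C_{n-1}^1$. The sketch below checks the generating function numerically with $t = \cos 2\pi\alpha$ and $x = e^{-2\pi i\zeta_{12}}$, where $\operatorname{Im}\zeta_{12} < 0$ is assumed so that $|x| < 1$. The function names and sample values are illustrative; the overall prefactor and exponent normalization of (B.1)–(B.2) are not tested here.

```python
import cmath
import math


def gegenbauer_c1(n_max, t):
    """Values C_0^1(t), ..., C_{n_max}^1(t) via the three-term recurrence."""
    values = [1.0, 2.0 * t]
    for _ in range(2, n_max + 1):
        values.append(2.0 * t * values[-1] - values[-2])
    return values[: n_max + 1]


def generating_series(t, x, n_max=80):
    """Partial sum of sum_n C_n^1(t) x^n."""
    total = 0j
    power = 1 + 0j
    for c in gegenbauer_c1(n_max, t):
        total += c * power
        power *= x
    return total


def generating_defect(alpha, zeta12):
    """|series - closed form| for t = cos(2*pi*alpha), x = exp(-2*pi*i*zeta12)."""
    t = math.cos(2 * math.pi * alpha)
    x = cmath.exp(-2j * math.pi * zeta12)
    closed_form = 1 / (1 - 2 * t * x + x * x)
    return abs(generating_series(t, x) - closed_form)


if __name__ == "__main__":
    print(generating_defect(0.15, 0.1 - 0.2j))
```

With $\operatorname{Im}\zeta_{12} = -0.2$ one has $|x|\approx 0.28$, so 80 terms are far more than needed for double precision.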
To prove Proposition 3.4(d), let us write (following [25]):

$$F(\zeta_{1\,2},\dots,\zeta_{n-1\,n};\tau) = \sum_{(m_1,\dots,m_{n-1})\in\mathbb{Z}^{n-1}}F_{m_1\cdots m_{n-1}}(q)\,e^{i\pi\sum_{k=1}^{n-1}m_k\zeta_{k\,k+1}} \tag{B.5}$$

where $F_{m_1\cdots m_{n-1}}(q)$ are infinite formal power series in $q^{\frac{1}{2}}$ (with coefficients in the algebra R). Then the properties (3.17) combined with the expansion (B.5) imply that $F_{m_1\cdots m_{n-1}}(q)$ are non-zero if $m_k + \varepsilon_k^{(1)} = 0 \bmod 2$. Therefore, we can rewrite the expansion (B.5) in the form:

$$F(\zeta_{1\,2},\dots,\zeta_{n-1\,n};\tau) = \sum_{\nu_1\in\mathbb{Z}+\frac{\varepsilon_1^{(1)}}{2}}\cdots\sum_{\nu_{n-1}\in\mathbb{Z}+\frac{\varepsilon_{n-1}^{(1)}}{2}}F_{\nu_1\cdots\nu_{n-1}}(q)\,e^{2\pi i\sum_{k=1}^{n-1}\nu_k\zeta_{k\,k+1}} \tag{B.6}$$

($F_{\nu_1\cdots\nu_{n-1}}(q)\in R[[q^{\frac{1}{2}}]]$). Now combining the expansion (B.5) with the property (3.18), we obtain:

$$F_{\nu+2NA^{(n)}(\lambda)}(q) = (-1)^{\lambda\cdot\varepsilon^{(\tau)}}\,q^{N\lambda\cdot A^{(n)}(\lambda)-\nu\cdot\lambda}\,F_\nu(q), \tag{B.7}$$

where $\nu := (\nu_1,\dots,\nu_{n-1})$, $\lambda := (\lambda_1,\dots,\lambda_{n-1})$, $\varepsilon^{(k)} := (\varepsilon_1^{(k)},\dots,\varepsilon_{n-1}^{(k)})$, $A^{(n)}(\lambda) := \left(\sum_{k=1}^{n-1}A^{(n)}_{jk}\lambda_k\right)_{j=1}^{n-1}$ and $\nu\cdot\lambda := \sum_{k=1}^{n-1}\nu_k\lambda_k$. Thus, we can find all the series $F_\nu(q)$ if we know them for all $\nu - \frac{\varepsilon^{(1)}}{2}$ belonging to a finite subset $M\subset\mathbb{Z}^{n-1}$ given by the intersection of the lattice $\mathbb{Z}^{n-1}$ with a fundamental domain of its sublattice $2NA^{(n)}(\mathbb{Z}^{n-1})$ (here we use the fact that $A^{(n)}$ is a non-degenerate integral matrix). In fact, if we split the sum in (B.6) into two sums, the first, over the fundamental domain $L + \frac{\varepsilon^{(1)}}{2}$ and the second, over its translates $2NA^{(n)}(\lambda)$, we find (using (B.7)):

$$F(\zeta_{1\,2},\dots,\zeta_{n-1\,n};\tau) = \sum_{\nu\in L+\frac{\varepsilon^{(1)}}{2}}F_\nu(q)\,F_\nu^{(N)}(\zeta_{1\,2},\dots,\zeta_{n-1\,n};\tau), \tag{B.8}$$

where the series

$$F_\nu^{(N)}(\zeta_{1\,2},\dots,\zeta_{n-1\,n};\tau) := \sum_{\lambda\in\mathbb{Z}^{n-1}}(-1)^{\lambda\cdot\varepsilon^{(\tau)}}\,q^{N\lambda\cdot A^{(n)}(\lambda)-\nu\cdot\lambda}\,e^{2\pi i(\nu+2NA^{(n)}(\lambda))\cdot\zeta}, \tag{B.9}$$

($\zeta := (\zeta_{1\,2},\dots,\zeta_{n-1\,n})$) are absolutely convergent to analytic functions in $\zeta$ and $\tau$ with $\operatorname{Im}\tau > 0$ according to the general theory of the theta-series. If F is symmetric as a series in $\zeta_1,\dots,\zeta_n$, then the above basic $F^{(N)}$ series can be further symmetrized in $\zeta_k$ ($k = 1,\dots,n$). This completes the proof of Proposition 3.4.
References

[1] M. F. Atiyah and I. G. Macdonald, Introduction to Commutative Algebra (Addison-Wesley, Reading, MA, 1969).
[2] R. E. Borcherds, Vertex algebras, Kac–Moody algebras and the Monster, Proc. Natl. Acad. Sci. USA 83 (1986) 3068–3071.
[3] R. E. Borcherds, Vertex algebras, Topological Field Theory, Primitive Forms and Related Topics (Kyoto, 1996) pp. 35–77, Progr. Math. (Birkhäuser Boston, Boston, MA, 1998) 160.
[4] H. J. Borchers, Field operators as C∞ functions in spacelike dimensions, Nuovo Cim. 33 (1964) 1600.
[5] J. Bros and D. Buchholz, Towards a relativistic KMS condition, Nuclear Phys. B423 (1994) 291–318.
[6] D. Buchholz and E. Wichmann, Causal independence and the energy-level density of states in local quantum theory, Commun. Math. Phys. 106 (1986) 321–344.
[7] D. Buchholz, C. d'Antoni and R. Longo, Nuclear maps and modular structure II, Commun. Math. Phys. 129 (1990) 115–138.
[8] D. Buchholz, On hot bangs and the arrow of time in relativistic quantum field theory, Commun. Math. Phys. 237 (2003) 271–288.
[9] P. A. M. Dirac, Wave equations in conformal space, Ann. Math. 37 (1936) 429–442.
[10] J. S. Dowker and K. Kirsten, Elliptic functions and temperature inversion on spheres, Nuclear Phys. B638 (2002) 405–432.
[11] E. Frenkel and D. Ben-Zvi, Vertex Algebras and Algebraic Curves (AMS, 2001).
[12] M. Waldschmidt et al. (eds.), From Number Theory to Physics (Springer, 1992, 1995). See, in particular, Chap. 2: J.-B. Bost, Introduction to compact Riemann surfaces, Jacobians and Abelian varieties, pp. 64–211; Chap. 3: H. Cohen, Elliptic curves, pp. 212–237; Chap. 4: D. Zagier, Introduction to modular forms, pp. 238–291.
[13] F. Gursey and S. Orfanidis, Conformal invariance and field theory in two dimensions, Phys. Rev. D7 (1973) 2414–2437.
[14] R. Haag, Local Quantum Physics: Fields, Particles, Algebras, 2nd rev. edn. (Springer-Verlag, 1996).
[15] R. Haag, N. Hugenholtz and M. Winnink, On the equilibrium states in quantum statistical mechanics, Commun. Math. Phys. 5 (1967) 215–236.
[16] M. Hortacsu, Explicit examples of conformal invariance, Int. J. Theor. Phys. 42 (2003) 49.
[17] M. Hortacsu, R. Seiler and B. Schroer, Conformal symmetry and reverberations, Phys. Rev. D5 (1972) 2519–2534.
[18] R. Jost, The General Theory of Quantized Fields (Amer. Math. Soc. Publ., Providence R.I., 1965).
[19] V. G. Kac, Vertex Algebras for Beginners, Vol. 10, 2nd edn. (AMS, Providence, R.I., 1998).
[20] Y. Kawahigashi and R. Longo, Classification of two-dimensional conformal nets with c < 1 and 2-cohomology vanishing for tensor categories, Commun. Math. Phys. 244 (2004) 63–97.
[21] H. A. Kramers and G. H. Wannier, Statistics of two-dimensional ferromagnet I, II, Phys. Rev. 60 (1941) 252–262, 263–276.
[22] S. Lang, Elliptic Functions, 2nd edn., Graduate Texts in Mathematics, Vol. 112 (Springer, New York, 1987).
[23] M. Lüscher and G. Mack, Global conformal invariance in quantum field theory, Commun. Math. Phys. 41 (1975) 203–234.
[24] G. Mack, All unitary representations of the conformal group SU(2, 2) with positive energy, Commun. Math. Phys. 55 (1977) 1–28.
[25] D. Mumford, Tata Lectures on Theta I, II (Birkhäuser, Boston, 1983, 1984).
[26] N. M. Nikolov, Vertex algebras in higher dimensions and globally conformal invariant quantum field theory, Commun. Math. Phys. 253 (2005) 283–322.
[27] N. M. Nikolov, Ya. S. Stanev and I. T. Todorov, Four dimensional CFT models with rational correlation functions, J. Phys. A. Math. Gen. 35 (2002) 2985–3007.
[28] N. M. Nikolov, Ya. S. Stanev and I. T. Todorov, Globally conformal invariant gauge field theory with rational correlation functions, Nucl. Phys. B670 (2003) 373–400.
[29] N. M. Nikolov and I. T. Todorov, Rationality of conformally invariant local correlation functions on compactified Minkowski space, Commun. Math. Phys. 218 (2001) 417–436.
[30] N. M. Nikolov and I. T. Todorov, Conformal quantum field theory in two and four dimensions, Proc. of the Summer School in Modern Mathematical Physics, eds. B. Dragovich and B. Sazdović (Belgrade, 2002), pp. 1–49.
[31] B. Schroer, Braided structure in 4-dimensional quantum field theory, Phys. Lett. B506 (2001) 337–343.
[32] I. E. Segal, Causally oriented manifolds and groups, Bull. Amer. Math. Soc. 77 (1971) 958–959.
[33] I. E. Segal, Covariant chronogeometry and extreme distances. III Macro-micro relations, Int. J. Theor. Phys. 21 (1982) 851–869.
[34] R. F. Streater and A. S. Wightman, PCT, Spin and Statistics and All That (Princeton Univ. Press, Princeton, N.J., 2000).
[35] I. T. Todorov, Infinite dimensional Lie algebras in conformal QFT models, Conformal Groups and Related Symmetries. Physical Results and Mathematical Background, Lecture Notes in Physics, eds. A. O. Barut and H.-D. Doebner, Vol. 261 (Springer, Berlin, 1986), pp. 387–443.
[36] A. Uhlmann, Remarks on the future tube, Acta Phys. Pol. 24 (1963) 293; The closure of Minkowski space, ibid. 295–296.
[37] M. Yoshida, Hypergeometric Functions, My Love (Vieweg, Braunschweig/Wiesbaden, 1997) (see, in particular, Chap. II. Elliptic curves, pp. 29–59).
[38] Y. Zhu, Modular invariance of characters of vertex operator algebras, J. Amer. Math. Soc. 9(1) (1996) 237–302.
July 26, 2005 15:22 WSPC/148-RMP
J070-00242
Reviews in Mathematical Physics Vol. 17, No. 6 (2005) 669–743 c World Scientific Publishing Company
THE LINEAR BOLTZMANN EQUATION AS THE LOW DENSITY LIMIT OF A RANDOM SCHRÖDINGER EQUATION

DAVID ENG∗ Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA [email protected]
LÁSZLÓ ERDŐS† Institute of Mathematics, University of Munich, Theresienstr. 39, D-80333 Munich, Germany [email protected]
Received 12 December 2004 We study the long time evolution of a quantum particle interacting with a random potential in the Boltzmann–Grad low density limit. We prove that the phase space density of the quantum evolution defined through the Husimi function converges weakly to a linear Boltzmann equation. The Boltzmann collision kernel is given by the full quantum scattering cross-section of the obstacle potential. Keywords: Quantum Boltzmann equation; Anderson model; Boltzmann–Grad limit; Lorentz gas. Mathematics Subject Classification 2000: 35Q40, 81Q05, 81Q15, 81Q30
∗ Partially supported by NSF grant DMS-0307295.
† Partially supported by NSF grant DMS-0200235 and EU-IHP Network "Analysis and Quantum" HPRN-CT-2002-0027. On leave from School of Mathematics, GeorgiaTech, USA.

1. The Model and the Result

The Schrödinger equation with a random potential describes the propagation of quantum particles in an environment with random impurities. In the first approximation, one neglects the interaction between the particles and the problem reduces to a one-body Schrödinger equation. With high concentration of impurities, the particle is localized, in particular, no conduction occurs [1–3, 8, 11, 12]. In the low concentration regime, conduction is expected to occur but there is no rigorous mathematical proof of the existence of the extended states except for the Bethe lattice [16, 17]. In this paper, we study the long time evolution in the low concentration regime in a specific scaling limit, called the low density or Boltzmann–Grad
D. Eng & L. Erdős
limit. Our model is the quantum analogue of the low density Lorentz gas. As the time increases, the concentration will be scaled down in such a way that the total interaction between the particle and the obstacles remains bounded for a typical configuration. Therefore, our result is far from the extended states regime which requires us to understand the behavior of the Schrödinger evolution for arbitrary long time, independently of the fixed (low) concentration of impurities.

We start by defining our model and stating the main result. Let $\Lambda_L\subset\mathbb{R}^d$ be a cube of width L and let $V_0(x)$ be a smooth radial function with a sufficiently strong decay to be specified later. Denote by $\omega = (x_\alpha)$, $\alpha = 1,\dots,N$, the configuration of uniformly distributed obstacles and let

$$\varrho := \frac{N}{L^d} \tag{1.1}$$

be the density of the obstacles. We are interested in the evolution of a quantum particle in the random environment generated by these obstacles. The Schrödinger equation governing the quantum particle is given by

$$i\partial_t\psi_t = H_{N,L}\psi_t,\qquad \psi_{t=0} = \psi_0, \tag{1.2}$$

where the Hamiltonian is given by

$$H_{N,L} = H := -\frac{1}{2}\Delta + V_\omega,\qquad V_\omega = \sum_{\alpha=1}^{N}V_\alpha,\qquad V_\alpha(x) := V_0(x - x_\alpha), \tag{1.3}$$

with periodic boundary conditions on $\Lambda_L$. We have used lower case letters $(x, t)$ to denote the space and time variables in the microscopic (atomic) scale. We shall always take first the simultaneous $L\to\infty$, $N\to\infty$ limits, with a fixed density $\varrho = N/L^d$ before any other limit. The finite box $\Lambda_L$ is just a technical device to avoid infinite summation in the potential term. Our method works for any dimension $d\ge 3$, but we restrict ourselves here to the case d = 3.

As a first step toward a study of conduction, one considers certain scaling limits. Let $\varepsilon$ be the scale separation parameter between microscopic and macroscopic variables. In reality, $\varepsilon = 1\,\text{Å}/1\,\text{cm} = 10^{-8}$; here we always take the idealized $\varepsilon\to 0$ limit. Define the macroscopic coordinates $(X, T)$ by $(X, T) := (x\varepsilon, t\varepsilon)$. Note that the velocity is not rescaled, $\frac{X}{T} = \frac{x}{t}$. In this paper, we will treat the following scaling limit problem:

Low Density Limit (LDL) Let $\varrho = \varepsilon\varrho_0$ for some fixed positive density $\varrho_0$,

$$i\partial_t\psi^\varepsilon_{\omega,t} = \left[-\frac{1}{2}\Delta + \sum_{\alpha=1}^{N}V_\alpha(x)\right]\psi^\varepsilon_{\omega,t}. \tag{1.4}$$
The Linear Boltzmann Equation as the LDL of a Random Schrödinger Equation
Another interesting scaling limit which has been studied in the literature is:

Weak Coupling Limit (WCL). Fix the density $\varrho = \varrho_0$ and scale the strength of the external potential by $\sqrt{\varepsilon}$,
\[ i\partial_t \psi^{\varepsilon}_{\omega,t} = \Big[ -\frac{1}{2}\Delta + \sqrt{\varepsilon} \sum_{\alpha=1}^{N} V_\alpha(x) \Big] \psi^{\varepsilon}_{\omega,t}. \tag{1.5} \]
In a related model, the random obstacle potential $\sum_\alpha V_\alpha$ is replaced with a Gaussian field $V_\omega(x)$ with a decaying, $\varepsilon$-independent covariance. It turns out that in both the LDL and WCL models, these are the weakest interaction strengths that result in a nontrivial (non-free) macroscopic evolution in the scaling limit $\varepsilon \to 0$.

The Wigner transform of a wave function $\psi$ is defined by
\[ W_\psi(x, v) := \int_{\mathbb{R}^3} \overline{\psi\Big(x + \frac{z}{2}\Big)}\, \psi\Big(x - \frac{z}{2}\Big) e^{ivz}\, dz. \tag{1.6} \]
The Wigner transform typically has no definite sign but the associated Husimi function is non-negative at appropriate scales. The Husimi function at scale $(\ell_1, \ell_2)$ is defined by
\[ H^{\ell_1,\ell_2}_\psi := W_\psi *_x G_{\ell_1/\sqrt{2}} *_v G_{\ell_2/\sqrt{2}}, \]
where $G_\delta$ is the standard Gaussian with variance $\delta^2$, i.e.
\[ G_\delta(z) := (2\pi\delta^2)^{-3/2} e^{-z^2/(2\delta^2)}. \tag{1.7} \]
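As a numerical aside, Gaussians compose under convolution by adding variances, $G_a * G_b = G_{\sqrt{a^2+b^2}}$, which is why Husimi functions at different scales differ only by a further Gaussian smoothing. The following minimal sketch checks this in one dimension; it assumes the one-dimensional normalization $(2\pi\delta^2)^{-1/2}$ instead of the three-dimensional one in (1.7), and the names `gaussian` and `convolve_at` are illustrative only.

```python
import math

def gaussian(z, delta):
    """One-dimensional analogue of G_delta in (1.7): centered, variance delta**2."""
    return math.exp(-z * z / (2 * delta * delta)) / math.sqrt(2 * math.pi * delta * delta)

def convolve_at(x, a, b, half_width=12.0, steps=4000):
    """Midpoint-rule approximation of (G_a * G_b)(x) on [-half_width, half_width]."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        y = -half_width + (i + 0.5) * h
        total += gaussian(x - y, a) * gaussian(y, b)
    return total * h

a, b = 0.8, 0.6
combined = math.hypot(a, b)  # sqrt(a^2 + b^2)
max_error = max(
    abs(convolve_at(x, a, b) - gaussian(x, combined))
    for x in (0.0, 0.5, 1.0, 2.0)
)
```

In three dimensions the same identity holds coordinate by coordinate, since $G_\delta$ factorizes.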
The Husimi function at scale $\ell_1 = \ell$, $\ell_2 = \ell^{-1}$ is the coherent state at scale $\ell$, defined by
\[ H^{\ell,\ell^{-1}}_\psi(x, v) = C_{\psi,\ell}(x, v) := \langle \psi, \pi^{\ell}_{x,v}\psi \rangle, \]
where $\pi^{\ell}_{x,v}$ is the projection onto the $L^2$-normalized state $G_\ell(x - z)e^{izv}$. Clearly $C_{\psi,\ell}$ is positive and
\[ \int C_{\psi,\ell}(x, v)\, dx\, dv = \|\psi\|_2^2 = 1. \tag{1.8} \]
Thus $C_{\psi,\ell}$ can be considered as a probability density on the phase space at atomic scale. The accuracy for the space variable in the coherent state $C_{\psi,\ell}$ is of order $\ell$, and the accuracy is of order $\ell^{-1}$ for the velocity variable. This is optimal by the uncertainty principle. Unfortunately, we cannot keep this accuracy along our proof, and we need a small extra smoothing. The basic object we shall study is the Husimi function on scale $\ell_1 = \varepsilon^{-1+\mu}$, $\ell_2 = \varepsilon^{\mu}$ with some $0 < \mu < 1/2$, which can also be written as
\[ H^{\varepsilon^{-1+\mu},\varepsilon^{\mu}}_\psi = H^{\varepsilon^{-\mu},\varepsilon^{\mu}}_\psi *_x G_{\ell(\varepsilon)/\sqrt{2}} = C_{\psi,\varepsilon^{-\mu}} *_x G_{\ell(\varepsilon)/\sqrt{2}}, \tag{1.9} \]
D. Eng & L. Erdős
where $\ell(\varepsilon) = \varepsilon^{\mu}\sqrt{\varepsilon^{-2} - \varepsilon^{-4\mu}}$. We can rescale it to the macroscopic scale by defining
\[ H^{(\varepsilon,\mu)}_\psi(X, V) := \varepsilon^{-3} H^{\varepsilon^{-1+\mu},\varepsilon^{\mu}}_\psi(X/\varepsilon, V) \ge 0. \tag{1.10} \]
From (1.8), (1.9) and (1.10), it follows that $H^{(\varepsilon,\mu)}_\psi(X, V)$ defines a probability density on the macroscopic phase space $\mathbb{R}^6$. Notice that the velocity in $H^{(\varepsilon,\mu)}_\psi$ is not rescaled. The accuracy for both the (macroscopic) space and velocity variables in $H^{(\varepsilon,\mu)}_\psi$ is now of order $\varepsilon^{\mu}$. We shall use this non-negative phase space density function to represent the true quantum mechanical function $\psi$.

Our goal is to prove that the macroscopic phase space density of $\psi^{\varepsilon}_t$ converges to a solution of the linear Boltzmann equation as in the classical case, except that the classical differential scattering cross-section is replaced by the quantum differential scattering cross-section.

We now recall the linear Boltzmann equation for a time-dependent phase space density $F_T(X, V)$ with collision kernel $\Sigma(U, V)$:
\[ \partial_T F_T(X, V) + V \cdot \nabla_X F_T(X, V) = \int \big[ \Sigma(U, V) F_T(X, U) - \Sigma(V, U) F_T(X, V) \big]\, dU = \int \Sigma(U, V) F_T(X, U)\, dU - \Sigma F_T(X, V), \tag{1.11} \]
where $\Sigma := \int \Sigma(V, U)\, dU$ is the total cross-section. In our setting, $\Sigma(V, U)$ will be defined later on in the Main Theorem.

For any function $f$ on $\mathbb{R}^3$, we introduce the norms
\[ \|f\|_{M,N} := \|\langle x \rangle^{M} \langle \nabla \rangle^{N} f\|_2, \qquad N, M \in \mathbb{N}, \]
where $\langle x \rangle := (1 + x^2)^{1/2}$. Suppose $V_0$ is a smooth, decaying and radially symmetric potential such that
\[ \lambda_0 := \|V_0\|_{50,50} \tag{1.12} \]
is sufficiently small. In particular, the one body Hamiltonian $H_1 := -\frac{1}{2}\Delta + V_0$ has no bound states and asymptotic completeness holds, i.e. both the incoming and outgoing Hilbert spaces are the full space $L^2(\mathbb{R}^3)$. Recall the wave operators $\Omega_{\mp} = \lim_{s\to\infty} e^{\pm isH_0} e^{\mp isH_1}$, where $H_0 := -\frac{1}{2}\Delta$. The kernel of the scattering operator $S = \Omega_-^{*}\Omega_+$ in the Fourier space exists and can be written as
\[ S(u, v) = \delta(u - v) - 2\pi i\,\delta(u^2 - v^2)\, T_{\mathrm{scat}}(u, v). \]
The differential scattering cross-section can be defined as
\[ \sigma(u, v) := 4\pi\,\delta(u^2 - v^2)\, |T_{\mathrm{scat}}(u, v)|^2, \tag{1.13} \]
which involves $T_{\mathrm{scat}}$ only on-shell, i.e. for $u^2 = v^2$.

We shall choose initial data of the form
\[ \psi^{\varepsilon}_0(x) = \varepsilon^{3/2} h(\varepsilon x) e^{iu_0 \cdot x}, \]
where $u_0 \in \mathbb{R}^3$, $\|h\|_{30,30} < \infty$ and $h$ is $L^2$-normalized. This implies $\|\hat\psi^{\varepsilon}_0\|_{30,0} < \infty$.
We will usually drop the “hat” on the initial wave function as we will be working in momentum space. It should be noted that the specific form of our initial wave function is used only in the last step, in the identification of the limit. Our result certainly holds for a general class of initial conditions which satisfy $\|\hat\psi^{\varepsilon}_0\|_{30,0} < \infty$ and that have a limiting macroscopic phase space density.

It is straightforward to check that the rescaled Husimi functions (1.10) of the initial data converge weakly as probability measures on $\mathbb{R}^6$ as $\varepsilon \to 0$, i.e.
\[ H^{(\varepsilon,\mu)}_{\psi^{\varepsilon}_0}(X, V)\, dX\, dV \to |h(X)|^2 \delta(V - u_0)\, dX\, dV. \]
We define $F_0(X, V) := |h(X)|^2 \delta(V - u_0)$ to be the initial data of the limiting Boltzmann equation. We can now state our theorem concerning the low density limit.

Main Theorem. Suppose $d = 3$ and let $\mu > 0$ be sufficiently small. Suppose the random environment $\omega$ is uniformly distributed with density $\varrho = \varrho_0\varepsilon$ with some fixed $\varrho_0 > 0$. Let $V_0$ be a radially symmetric potential such that $\lambda_0 := \|V_0\|_{50,50}$ is sufficiently small. Let $\psi^{\varepsilon}_{\omega,t}$ be the solution to the Schrödinger equation (1.4) with $L^2$-normalized initial data $\psi^{\varepsilon}_0$ of the following form
\[ \psi^{\varepsilon}_0(x) = \varepsilon^{3/2} h(\varepsilon x) e^{iu_0 \cdot x}, \]
where $\|h\|_{30,30} < \infty$, $\|h\|_2 = 1$. Then for any $T > 0$, and any bounded, continuous test function $J$,
\[ \lim_{\varepsilon \to 0} \lim_{L \to \infty} \int dX\, dV\, J(X, V) \Big[ \mathbf{E} H^{(\varepsilon,\mu)}_{\psi^{\varepsilon}_{\omega,T/\varepsilon}}(X, V) - F_T(X, V) \Big] = 0, \]
where $F_T(X, V)$ satisfies the linear Boltzmann equation (1.11) with initial data given by $F_0(X, V) = |h(X)|^2 \delta(V - u_0)$ and with effective collision kernel $\Sigma(U, V) = \varrho_0\,\sigma(U, V)$. Here, $\sigma(U, V)$ is the differential scattering cross-section given in (1.13).

Our result holds for a larger class of distributions of obstacles, but for simplicity, we assume the uniform distribution in this paper.

The analogous result in the WCL model was proven by H. Spohn [20] in the case where the obstacles are distributed according to a Gaussian law and macroscopic time is small, $T \le T_0$. His result was extended to higher order correlation functions by Ho–Landau–Wilkins [14] under the same assumptions. The WCL model with a Gaussian field was treated globally in time by Erdős and Yau in [9]. Later, the method was extended to general distributions by Chen in [5]. Chen also showed [6] that the convergence of the expected Wigner transform to the Boltzmann equation holds in $L^r$ for $r \ge 1$.

The present proof is similar in spirit to the WCL proof in [9]. The main difference between the two models and proofs lies in the Boltzmann collision kernel $\Sigma$. In the LDL model, $\Sigma$ involves summing up the complete Born series of each individual
obstacle scattering in contrast to the WCL model, where only the first Born approximation is needed. Unlike in the WCL model, in the low density environment, once the quantum particle is in the neighborhood of an obstacle, it can collide with it many times with a non-vanishing amplitude. Moreover, if two obstacles are near to each other, then complicated double recollision patterns arise with comparable amplitudes. On a technical level, this difference forced us to completely reorganize the diagrammatic expansion of [9]. Most importantly, the recollision diagrams have bigger amplitude in the LDL model and their estimate required several new ideas.

The classical analogue of the LDL model is the classical Lorentz gas. It was proved by G. Gallavotti [13], H. Spohn [21], and Boldrighini, Bunimovich and Sinai [4] that the evolution of the phase space density of a classical Lorentz gas converges to a linear Boltzmann equation. However, the classical WCL behavior is governed by a Brownian motion instead of the Boltzmann equation (see Kesten and Papanicolaou [15], and Dürr, Goldstein and Lebowitz [7]).

In principle, one is interested in the behavior of $T \mapsto H^{(\varepsilon,\mu)}_{\psi^{\varepsilon}_{\omega,T/\varepsilon}}$ as a process for typical $\omega$. This means one has to consider the joint distributions of $H^{(\varepsilon,\mu)}_{\psi^{\varepsilon}_{\omega,T_1/\varepsilon}}, \ldots, H^{(\varepsilon,\mu)}_{\psi^{\varepsilon}_{\omega,T_n/\varepsilon}}$. We believe that there is no intrinsic difficulty in extending our method to this setting, but the proof will certainly be much more involved.

2. Preliminaries

2.1. Notation

For convenience, we fix a convention to avoid problems with factors of $2\pi$ arising from the Fourier transform. We define $dx$ to be the Lebesgue measure on $\mathbb{R}^3$ divided by $(2\pi)^{3/2}$, i.e.
\[ \int dx = \frac{1}{(2\pi)^{3/2}} \int_{\mathbb{R}^3} d^{*}x, \]
where we reserve the notation $d^{*}x$ for the genuine three-dimensional Lebesgue measure. This convention will apply to any space or momentum variable in 3D but not to one-dimensional integration (like time variables and their variables), where integration will be the standard, unscaled, Lebesgue measure on the line. With this convention, the three-dimensional Fourier transform (which will be usually denoted by a hat) is
\[ \hat f(p) = \mathcal{F} f(p) := \int f(x) e^{-ipx}\, dx \]
and its inverse
\[ f(x) = \mathcal{F}^{-1} \hat f(x) = \int \hat f(p) e^{ipx}\, dp. \]
Wave functions will always be represented in momentum space $\psi(p)$, hence we can omit the hat from their notation.

The other convention is related to the fact that we will be considering the problem on the torus $\Lambda := L\mathbb{T}^3$ where $\mathbb{T}^3$ is the unit torus. Correspondingly, all
the momenta in this paper will be on the discrete lattice $(\mathbb{Z}/L)^3$. The momentum variables will be denoted by letters $p$, $q$, $r$, $u$, $v$ or $w$. The delta function is defined as
\[ \delta(p) := |\Lambda| \ \text{ for } p = 0, \qquad \delta(p) := 0 \ \text{ for } p \in (\mathbb{Z}/L)^3 \setminus \{0\}. \tag{2.1} \]
Nevertheless, we will use the continuous formalism, under the identification
\[ \int dp := \frac{1}{(2\pi)^{3/2}|\Lambda|} \sum_{p \in (\mathbb{Z}/L)^3}. \tag{2.2} \]
Again, this will only apply to momentum variables. The delta functions with time variables will remain the usual continuous delta functions. The convention (2.2) should not cause any confusion, as $L \to \infty$ can be taken at any stage of the proof independently of all other limits.

Gothic script will be used to denote a set of variables, in particular, momenta. Define
\[ \mathfrak{p}_{m_1,m_2} := \{p_j\}_{j=m_1}^{m_2} = (p_{m_1}, p_{m_1+1}, \ldots, p_{m_2}), \qquad \mathfrak{p}_m := \mathfrak{p}_{1,m}. \tag{2.3} \]
In some instances, we will need to single out the first momentum and write $(p_0, \mathfrak{p}_m)$ instead of $\mathfrak{p}_{0,m}$. A similar convention applies to other momentum variables. Moreover, for any $l_{0,b} := (\ell_0, \ldots, \ell_b)$, where $\ell_j$ are non-negative integers for $j = 0, \ldots, b$, we define
\[ \mathfrak{p}^{l_{0,b}}_{0,b} := (\underbrace{p_0, \ldots, p_0}_{\ell_0+1}, \ldots, \underbrace{p_b, \ldots, p_b}_{\ell_b+1}). \tag{2.4} \]
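The repetition pattern in (2.4) can be made concrete with a short sketch. The function name `expand_momenta` and the use of strings as stand-ins for momenta are assumptions made for illustration only.

```python
def expand_momenta(momenta, multiplicities):
    """Repeat momenta[j] exactly multiplicities[j] + 1 times, as in (2.4)."""
    if len(momenta) != len(multiplicities):
        raise ValueError("need one multiplicity per momentum")
    if any(l < 0 for l in multiplicities):
        raise ValueError("multiplicities must be non-negative")
    expanded = []
    for p, l in zip(momenta, multiplicities):
        expanded.extend([p] * (l + 1))
    return expanded

# b = 2 with (l_0, l_1, l_2) = (2, 0, 1): p0 appears 3 times, p1 once, p2 twice.
example = expand_momenta(["p0", "p1", "p2"], [2, 0, 1])
```

The expanded tuple always has $\sum_j \ell_j + b + 1$ entries.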
If $\operatorname{Log} x$ is the standard natural logarithm function (on the positive line), define for $x > 0$,
\[ \log x := \begin{cases} 1 & \text{for } x \le e, \\ \operatorname{Log} x & \text{for } x > e. \end{cases} \tag{2.5} \]
If $x \ge 1$, we define $x^{O(1)}$ to be $x^{k}$ for some positive constant $k$ which is independent of any parameter (such as $\varepsilon$).

Finally, if $A$, $B$ are fully ordered sets, unordered set operations will be denoted by their usual notation (e.g., $\cup$, $\cap$, $\in$, etc.). Define $A \oplus B$ to be the concatenation of $A$ and $B$, i.e. the ordered set where the ordering of $A$ supersedes that of $B$. We will at times write $AB := A \oplus B$. $A \prec B$ will denote ordered inclusion, i.e. $A \subset B$ and the ordering coincides.
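For readers who prefer an operational description, the conventions above can be sketched as follows. This is only an illustration under the assumption that ordered sets are represented as tuples of distinct elements; the names `log_cutoff`, `concat` and `ordered_subset` are hypothetical.

```python
import math

def log_cutoff(x):
    """The modified logarithm (2.5): 1 for 0 < x <= e, Log x for x > e."""
    if x <= 0:
        raise ValueError("defined for x > 0 only")
    return 1.0 if x <= math.e else math.log(x)

def concat(a, b):
    """A (+) B: ordered concatenation, with the elements of A placed first."""
    return tuple(a) + tuple(b)

def ordered_subset(a, b):
    """Ordered inclusion: every element of A lies in B, in the same relative order."""
    remaining = iter(b)
    # 'x in remaining' advances the iterator, so matches must occur in order.
    return all(x in remaining for x in a)
```

With this representation, `ordered_subset((1, 3), (1, 2, 3))` holds while `ordered_subset((3, 1), (1, 2, 3))` does not, even though both are subsets in the unordered sense.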
2.2. The Duhamel formula

For any fixed $n_0 \ge 1$, the Duhamel formula states
\[ e^{-itH} = \sum_{m=0}^{n_0-1} (-i)^m \int_0^{t*} [ds_j]_0^m\, e^{-is_0H_0} V_\omega e^{-is_1H_0} V_\omega \cdots V_\omega e^{-is_mH_0} + (-i)^{n_0} \int_0^{t*} [ds_j]_0^{n_0}\, e^{-is_0H} V_\omega e^{-is_1H_0} V_\omega \cdots V_\omega e^{-is_{n_0}H_0}, \tag{2.6} \]
where $H$ is the (full) Hamiltonian given in (1.3) and
\[ \int_0^{t*} [ds_j]_m^n := \int_0^t \cdots \int_0^t \Big( \prod_{j=m}^{n} ds_j \Big)\, \delta\Big( t - \sum_{j=m}^{n} s_j \Big), \tag{2.7} \]
where $m \le n$, and the star refers to the constraint $t = \sum s_j$. $V_\omega$ is the potential given in (1.3) and $H_0 = -\frac{1}{2}\Delta$.

Expanding the potential $V_\omega = \sum_{\alpha=1}^{N} V_\alpha$ in the Duhamel formula, we generate many terms. We can label these terms by a sequence of obstacles, say, $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$. The terms without the full propagator $e^{-is_0H}$ (in the first line of (2.6)) will be called fully expanded while the others will be called truncated.

We write the Duhamel formula in momentum space. The kernel of the typical fully expanded term is of the form
\[ \int_0^{t*} [ds_j]_0^n\, e^{-is_0p_0^2/2}\, \hat V_{\alpha_1}(p_0 - p_1)\, e^{-is_1p_1^2/2}\, \hat V_{\alpha_2}(p_1 - p_2) \cdots e^{-is_np_n^2/2} \tag{2.8} \]
with the intermediate momenta $p_1, p_2, \ldots, p_{n-1}$ integrated out. The truncated terms are of the same form with $e^{-is_0p_0^2/2}$ replaced with $e^{-is_0H}$,
\[ \int_0^{t*} [ds_j]_0^n\, e^{-is_0H}\, \hat V_{\alpha_1}(\cdot - p_1)\, e^{-is_1p_1^2/2}\, \hat V_{\alpha_2}(p_1 - p_2) \cdots e^{-is_np_n^2/2}. \tag{2.9} \]
The obstacles in $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ are allowed to repeat. We can relabel them by a sequence of centers
\[ A := (\alpha_1, \alpha_2, \ldots, \alpha_m), \qquad x_{\alpha_j} \in \omega \ \text{for all } j, \]
and a sequence of non-negative numbers $(k_1, k_2, \ldots, k_m)$, where $k_j + 1$ denotes the number of times $\alpha_j$ repeats itself consecutively (we say that $k_j$ is the number of internal recollisions). The sequence $A$ has the property that $\alpha_j \ne \alpha_{j+1}$. In other words, the original collision sequence is given by
\[ (\underbrace{\alpha_1, \ldots, \alpha_1}_{k_1+1}, \underbrace{\alpha_2, \ldots, \alpha_2}_{k_2+1}, \ldots, \underbrace{\alpha_m, \ldots, \alpha_m}_{k_m+1}). \tag{2.10} \]
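The relabeling of a raw collision sequence into centers and internal recollision counts, as in (2.10), is a run-length encoding. The following minimal sketch illustrates it; the function names and the use of integers as obstacle labels are assumptions made for illustration.

```python
from itertools import groupby

def compress_collisions(sequence):
    """Split a collision sequence into centers A and internal recollision counts k."""
    centers, internal = [], []
    for alpha, run in groupby(sequence):
        centers.append(alpha)
        internal.append(len(list(run)) - 1)  # k_j = (run length) - 1
    return centers, internal

def expand_collisions(centers, internal):
    """Inverse map: rebuild the original sequence as in (2.10)."""
    sequence = []
    for alpha, k in zip(centers, internal):
        sequence.extend([alpha] * (k + 1))
    return sequence

raw = [3, 3, 3, 7, 3, 5, 5]
centers, internal = compress_collisions(raw)
```

Here the obstacle 3 appears as two different centers: consecutive repetitions are absorbed into $k_j$, while the later return to obstacle 3 is a genuine (non-internal) recollision.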
We shall divide the set of momenta into internal ones and external ones. The internal momenta are running between the same obstacles; the external ones are the rest. The internal momenta will be integrated out first (resummation of loop diagrams).
Hence, repeated consecutive collisions with the same obstacle, internal collisions, will be considered as a single (physical) collision and will be referred to as a collision with a center. When we speak of the number of collisions, we will actually be referring to the number of collisions with centers. For example, in the sequence (2.10), there are $\sum_{j=1}^{m}(k_j + 1)$ collisions, $m$ centers and there are $k_j$ internal momenta running among the $k_j + 1$ collisions with the same center $\alpha_j$.

Collision histories will be recorded with the ordered set $A$. Typically, we will use the variable $m$ to denote the cardinality of $A$, $m = |A|$. Next, let $J$ be a set of lexicographically ordered double indices for the internal momenta
\[ J = J_{m,k} := (11, 12, \ldots, 1k_1, 21, 22, \ldots, 2k_2, \ldots, mk_m), \tag{2.11} \]
where $k = (k_1, k_2, \ldots, k_m)$ and $k_j$ denotes the number of the internal momenta for the obstacle $\alpha_j$. We shall denote the internal momenta by $\mathfrak{q}_J := (q_j)_{j \in J_{m,k}}$.

Since $\hat V_\alpha := \mathcal{F} V_\alpha = e^{-ipx_\alpha} \hat V_0(p)$, we are able to separate the random part of the potential, in the form of a random phase, from the deterministic part. Consequently, denote the random phase corresponding to collision history $A$ by
\[ \chi(A; \mathfrak{p}_{0,m}) := \prod_{j=1}^{m} e^{-ix_{\alpha_j}(p_{j-1} - p_j)}. \tag{2.12} \]
Note that it is independent of the internal momenta. Then, the deterministic part of the potentials is given by
\[ L(\mathfrak{p}_{0,m}, \mathfrak{q}_{J_{m,k}}) := \prod_{j=1}^{m} \hat V_0(p_{j-1} - q_{j1}) \hat V_0(q_{j1} - q_{j2}) \cdots \hat V_0(q_{jk_j} - p_j). \tag{2.13} \]
In the case where $k_j = 0$, we only have the term $\hat V_0(p_{j-1} - p_j)$. Notice that this expression is independent of the location of the obstacles; that information is contained in the random phase $\chi$.

Given a set of momenta, $\mathfrak{r}_{0,m}$, define the free evolution kernel as
\[ K(t; r_0, \mathfrak{r}_m) := (-i)^m \int_0^{t*} [ds_j]_0^m \prod_{j=0}^{m} e^{-is_jr_j^2/2}. \tag{2.14} \]
Notice that this expression is independent of the order of the momenta. Considering (2.8) and using the previously established notation for internal and external momenta, the free evolution kernel associated with the collision sequence $A$ and numbers of internal momenta $k$ is
\[ K(t; \mathfrak{p}_{0,m}, \mathfrak{q}_{J_{m,k}}) := K(t; p_0, q_{11}, \ldots, q_{1k_1}, p_1, q_{21}, \ldots, p_{m-1}, q_{m1}, \ldots, q_{mk_m}, p_m). \]
Define the fully summed (for internal collisions) free evolution kernel as
\[ \mathcal{K}(t; \mathfrak{p}_{0,m}) := \sum_{k_1,\ldots,k_m=0}^{\infty} \int d\mathfrak{q}_{J_{m,k}}\, K(t; p_0, \mathfrak{p}_m, \mathfrak{q}_{J_{m,k}})\, L(p_0, \mathfrak{p}_m, \mathfrak{q}_{J_{m,k}}). \tag{2.15} \]
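The order-independence of the free kernel (2.14) can be seen concretely from the Hermite–Genocchi formula: for pairwise distinct energies $a_j = r_j^2/2$, the simplex integral in (2.14) equals the $m$-th divided difference of $a \mapsto e^{-ita}$ at $a_0, \ldots, a_m$, which is symmetric in its arguments. The sketch below is an illustration of this observation, not part of the proof; it takes the energies $a_j$ directly as input, assumes they are distinct, and uses hypothetical function names.

```python
import cmath
from itertools import permutations

def free_kernel(t, energies):
    """K(t; r_0..r_m) for distinct a_j = r_j^2/2, as a divided difference of exp(-i t a)."""
    a = list(energies)
    table = [cmath.exp(-1j * t * x) for x in a]
    for order in range(1, len(a)):
        table = [(table[i + 1] - table[i]) / (a[i + order] - a[i])
                 for i in range(len(a) - order)]
    return table[0]

def free_kernel_quadrature(t, a0, a1, steps=20000):
    """Midpoint rule for the m = 1 case of (2.14): -i * int_0^t e^{-i s a0} e^{-i (t-s) a1} ds."""
    h = t / steps
    total = 0j
    for i in range(steps):
        s = (i + 0.5) * h
        total += cmath.exp(-1j * (s * a0 + (t - s) * a1))
    return -1j * total * h

t = 2.0
exact = free_kernel(t, [0.5, 1.7])
approx = free_kernel_quadrature(t, 0.5, 1.7)
values = [free_kernel(t, perm) for perm in permutations([0.3, 1.1, 2.0, 3.4])]
spread = max(abs(v - values[0]) for v in values)
```

The quadrature check is restricted to $m = 1$, where the constrained time integral reduces to a single integral; `spread` measures how much the kernel changes under reordering of four energies.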
With these notations, we can express the fully expanded wave function with a collision sequence $A$ (and resummation of loop diagrams) and its associated propagator by
\[ \psi_A(t, p_0) := U^{\circ}_A(t)\psi_0(p_0) := \int d\mathfrak{p}_m\, \mathcal{K}(t; p_0, \mathfrak{p}_m)\, \chi(A; p_0, \mathfrak{p}_m)\, \psi_0(p_m), \tag{2.16} \]
where $\psi_0$ is the initial wave function in momentum space. It is important to note that the first momentum $p_0$ is not summed for internal momenta. The circle in the notation $U^{\circ}_A(t)$ refers to the fact that it is a fully expanded propagator.

Define the fully expanded wave function with $m$ collisions (this is always counted according to the collision centers) without recollision, and its associated propagator by
\[ \psi^{\text{no rec}}_m := U^{\circ}_m(t)\psi_0(p_0) := \sum^{\text{no rec}}_{A:|A|=m} \psi_A. \tag{2.17} \]
The “no rec” in $\sum^{\text{no rec}}_{A:|A|=m}$ reminds us that we sum over sets $A$ without repetition (recollision), i.e. $\alpha_i \ne \alpha_j$, $i \ne j$.

2.3. Error terms and time division

Let $m_0 = m_0(\varepsilon)$ be an $\varepsilon$-dependent parameter to be chosen later. The Duhamel formula consists of a sum of terms of the forms (2.8) and (2.9). It allows the flexibility to expand certain truncated terms (2.9) further and stop the expansion for other terms. In the truncated terms, we will continue the expansion only for terms whose number of centers is less than $m_0$ and that are non-repeating. In other words, we stop the Duhamel expansion whenever the number of external collisions reaches $m_0$ or if there is a genuine, non-internal recollision. The result is the decomposition
\[ e^{-itH}\psi_0 = \sum_{m=0}^{m_0-1} \psi^{\text{no rec}}_m(t) + \Psi^{\text{error}}_{m_0}(t) =: \psi^{\text{no rec}}_{<m_0}(t) + \Psi^{\text{error}}_{m_0}(t). \tag{2.18} \]
The truncated terms will be estimated by using the unitarity of the full Hamiltonian evolution
\[ \Big\| \int_0^t e^{-isH} \psi_s\, ds \Big\|^2 \le t \int_0^t \|\psi_s\|^2\, ds, \]
where the additional $t$ factor is the price for using this crude bound. We are able to reduce this price by dividing the total time interval $[0, t]$ into $n$ pieces. We will eventually choose $n = n(\varepsilon)$ in a precise way. We refer to this method as the time division argument. As a first step, observe that for $n \ge 1$,
\[ e^{-itH}\psi_0 = \prod_{k=1}^{n} e^{-i\frac{t}{n}H}\, \psi_0 = e^{-i\frac{(n-k)t}{n}H}\, e^{-i\frac{kt}{n}H}\, \psi_0 \tag{2.19} \]
for any $0 \le k \le n$. Before each new time evolution of length $t/n$, we successively separate the main term and the error term. For a fixed $n \ge 1$, we define the main term up to time $kt/n$ as the sum of the non-recollision terms with fewer than $m_0$ collisions:
\[ \varphi^{\text{main}}_k(t) := \sum_{m=0}^{m_0-1} \psi^{\text{no rec}}_m\Big(\frac{kt}{n}\Big) = \sum_{m=0}^{m_0-1} U^{\circ}_m\Big(\frac{kt}{n}\Big)\psi_0. \tag{2.20} \]
We follow the time evolution of the main term only, i.e. we define
\[ \varphi_k(t) := e^{-i\frac{t}{n}H} \varphi^{\text{main}}_{k-1}(t), \qquad \varphi^{\text{error}}_k(t) := \varphi_k(t) - \varphi^{\text{main}}_k(t). \tag{2.21} \]
By (2.19),
\[ e^{-itH}\psi_0 = \sum_{k=1}^{n} e^{-i\frac{(n-k)t}{n}H} \varphi^{\text{error}}_k(t) + \varphi^{\text{main}}_n(t). \tag{2.22} \]
Since $\varphi^{\text{main}}_n(t) = \psi^{\text{no rec}}_{<m_0}(t)$, comparison with (2.18) allows us to write
\[ \Psi^{\text{error}}_{m_0} = \sum_{k=1}^{n} e^{-i\frac{(n-k)t}{n}H} \varphi^{\text{error}}_k. \tag{2.23} \]
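The identity (2.22) is a telescoping sum and holds for any choice of the main terms, provided $\varphi^{\text{main}}_0 = \psi_0$. The following sketch checks this numerically in a toy setting; it assumes a finite-dimensional Hamiltonian that is diagonal in the chosen basis, and the main terms are arbitrary stand-in vectors, not the actual non-recollision sums of (2.20).

```python
import cmath

def evolve(energies, state, time):
    """Apply exp(-i * time * H) for H diagonal with the given eigenvalues."""
    return [cmath.exp(-1j * time * e) * c for e, c in zip(energies, state)]

def time_division_sum(energies, t, main_terms):
    """Right-hand side of (2.22); main_terms[k] stands in for phi_k^main, main_terms[0] = psi_0."""
    n = len(main_terms) - 1
    total = list(main_terms[n])
    for k in range(1, n + 1):
        phi_k = evolve(energies, main_terms[k - 1], t / n)         # (2.21), first part
        error_k = [x - y for x, y in zip(phi_k, main_terms[k])]    # (2.21), second part
        propagated = evolve(energies, error_k, (n - k) * t / n)
        total = [x + y for x, y in zip(total, propagated)]
    return total

energies = [0.0, 0.7, 1.9]
psi0 = [0.6, 0.8j, 0.0]
# Arbitrary stand-ins for the main terms at steps k = 1, 2, 3.
main_terms = [psi0, [0.5, 0.1, 0.2j], [0.3j, 0.3, 0.1], [0.2, 0.0, 0.4]]
lhs = evolve(energies, psi0, 1.5)
rhs = time_division_sum(energies, 1.5, main_terms)
gap = max(abs(x - y) for x, y in zip(lhs, rhs))
```

The value of `gap` is at the level of rounding error, which reflects that the decomposition itself carries no approximation; all the analytic work lies in estimating the error terms.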
Our estimates of the error term will start from this expression. The idea is that, writing our wave functions as $e^{-it_1H} e^{-it_2H}\psi_0$, we can expand each full evolution propagator independently using the Duhamel formula and we can gain control of how closely in time the collisions occur. Classical intuition tells us that it is improbable for a path to have many collisions in a very short time. The time division technique employed here will exploit this.

With this agenda in mind, we will now define the time-divided propagators. Let $A$ be an ordered set of size $m$ and $t > 0$. Suppose $m_1, m_2 \ge 0$ and $t_1, t_2 \ge 0$ satisfy $m = m_1 + m_2$, $t = t_1 + t_2$. Define the fully expanded, time-divided propagator associated to $A$ as
\[ U^{\circ}_{m_1,m_2;A}(t_1, t_2)\psi_0(p_0) := \int d\mathfrak{p}_m\, \mathcal{K}_{m_1}(t_1; \mathfrak{p}_{0,m_1})\, \mathcal{K}_{m_2}(t_2; \mathfrak{p}_{m_1,m})\, \chi(A; \mathfrak{p}_{0,m})\, \psi_0(p_m). \tag{2.24} \]
We will also write
\[ \mathcal{K}_{m_1,m_2}(t_1, t_2; \mathfrak{p}_{0,m}) := \mathcal{K}_{m_1}(t_1; \mathfrak{p}_{0,m_1})\, \mathcal{K}_{m_2}(t_2; \mathfrak{p}_{m_1,m}). \tag{2.25} \]
With summation over $A$ with non-repeating indices and recalling that $m = m_1 + m_2$, we define
\[ U^{\circ}_{m_1,m_2}(t_1, t_2)\psi_0(p_0) := \sum^{\text{no rec}}_{A:|A|=m} U^{\circ}_{m_1,m_2;A}(t_1, t_2)\psi_0(p_0). \tag{2.26} \]
If $A_1$ is the ordered set of the first $m_1$ elements of $A$ and $A_2$ the last $m_2$ elements of $A$, notice that
\[ U^{\circ}_{m_1,m_2;A}(t_1, t_2)\psi_0(p_0) = U^{\circ}_{A_1}(t_1)\, U^{\circ}_{A_2}(t_2)\psi_0(p_0). \]
Next, we define wave functions starting from a potential (according to the field theory jargon, we call them “amputated”). They will be denoted by tildes. The amputated wave function with collision sequence $A$ and its associated propagator is
\[ \tilde\psi_A(t, p_0) := \tilde U_A(t)\psi_0(p_0) := \int dp\, \hat V_{\alpha_1}(p_0 - p)\, \psi_{A\setminus\{\alpha_1\}}(t, p) \tag{2.27} \]
if $m = |A| \ge 1$. Note that we do not sum the last potential for internal collisions. The time-divided amputated propagator associated with $A$ can be defined for $m_1 \ge 1$:
\[ \tilde U_{m_1,m_2;A}(t_1, t_2)\psi_0(p_0) := \int dp\, \hat V_{\alpha_1}(p_0 - p)\, U^{\circ}_{m_1-1,m_2;A\setminus\{\alpha_1\}}(t_1, t_2)\psi_0(p). \]
This allows us to define the time-divided full propagator, which will be denoted by $U$ without circle or tilde:
\[ U_{m_1,m_2;A}(t_1, t_2)\psi_0(p_0) := \int_0^{t_1} ds\, e^{-i(t_1-s)H}\, \tilde U_{m_1,m_2;A}(s, t_2)\psi_0(p_0). \]
Similarly to (2.26), we define the propagators $U_{m_1,m_2}$ and $\tilde U_{m_1,m_2}$ as
\[ \overset{(\sim)}{U}_{m_1,m_2}(t_1, t_2)\psi_0(p_0) := \sum^{\text{no rec}}_{A:|A|=m} \overset{(\sim)}{U}_{m_1,m_2;A}(t_1, t_2)\psi_0(p_0). \tag{2.28} \]

2.4. Properties of the kernel

By (2.15), any analysis of $\mathcal{K}$ will involve the free evolution kernel $K$, which was defined in (2.14). We now give several ways to carry out this analysis. Define for $t > 0$:
\[ \eta(t) := \begin{cases} 1 & \text{for } t \le 1, \\ t^{-1} & \text{for } t > 1. \end{cases} \tag{2.29} \]
We will typically write $\eta_j := \eta(t_j)$, $\eta'_j := \eta(t'_j)$ and $\eta := \eta(t)$.

Lemma 2.1 ($\alpha$-Representation). We have the following identity for $\eta > 0$,
\[ K(t; \mathfrak{r}_{0,m}) = \frac{i}{2\pi} e^{\eta t} \int_{\mathbb{R}} d\alpha\, e^{-i\alpha t} \prod_{j=0}^{m} \frac{1}{\alpha - r_j^2/2 + i\eta}. \tag{2.30} \]
Consequently, for $\eta(t)$ defined in (2.29),
\[ |K(t; \mathfrak{r}_{0,m})| \le C \int_{\mathbb{R}} d\alpha \prod_{j=0}^{m} \frac{1}{|\alpha - r_j^2/2 + i\eta(t)|}. \]
The proof is given in [9]. The second statement is a consequence of $e^{\eta(t)t} \le C$.

The variable $\alpha$ (and $\tilde\alpha$, $\beta$, $\tilde\beta$) will typically be used for one-dimensional integration on $\mathbb{R}$. In the future, we will not explicitly denote the integration domain for these variables with the convention that it is always over the real line.
With this in mind, we can write
\[ \mathcal{K}(t; \mathfrak{p}_{0,m}) = \frac{i}{2\pi} e^{\eta t} \int d\alpha\, e^{-i\alpha t}\, \frac{1}{\alpha - p_0^2/2 + i\eta} \prod_{j=1}^{m} \frac{B(\alpha, p_{j-1}, p_j)}{\alpha - p_j^2/2 + i\eta}, \tag{2.31} \]
\[ B_\eta(\alpha, p_{j-1}, p_j) := \sum_{k_j=0}^{\infty} \int [dq_{jk}]_{k=1}^{k_j}\, \hat V_0(p_{j-1} - q_{j1}) \times \frac{\hat V_0(q_{j1} - q_{j2})}{\alpha - q_{j1}^2/2 + i\eta} \cdots \frac{\hat V_0(q_{jk_j} - p_j)}{\alpha - q_{jk_j}^2/2 + i\eta}, \tag{2.32} \]
where the $k_j = 0$ term in the sum is $\hat V_0(p_{j-1} - p_j)$. The dependence of $B_\eta$ on the regularization parameter will often be suppressed in the notation, unless it becomes crucial, and we use $B(\alpha, p_{j-1}, p_j) := B_\eta(\alpha, p_{j-1}, p_j)$.

The formula (2.32) has the interpretation that summing over internal collisions, in effect, changes our potential from $\hat V_0(p_{j-1} - p_j)$ to $B_\eta(\alpha, p_{j-1}, p_j)$. Moreover, the smoothness and decay properties of $\hat V_0$ will be passed onto $B$. This will be made precise in Lemma 5.2, which implies, in particular, that
\[ \sup_{\eta,p,r} \big| \langle p - r \rangle^{30} \langle \nabla_p \rangle^{2} \langle \nabla_r \rangle^{2} B_\eta(\alpha, p, r) \big| \le M\lambda_0, \]
where $\lambda_0$ is defined in (1.12) and $M$ is independent of $\alpha$ and $\eta \le 1$. We also remark that with $\alpha = p_j^2/2$, we have
\[ \lim_{\eta \to 0+0} B_\eta\Big( \frac{p_j^2}{2}, p_{j-1}, p_j \Big) = T_{\mathrm{scat}}(p_{j-1}, p_j). \tag{2.33} \]
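The existence of this limit follows from Lemma 5.2. The identification with the scattering $T$-matrix follows from the standard Born series expansion (see [19, Theorem XI.43]).

As Lemma 2.1 will be a fundamental tool in our estimates, we will collect some facts which will assist in the estimate of the terms on the right-hand side of (2.30). They follow from simple calculus and we leave their proofs to the reader.

Proposition 2.2. Recall (2.5) and $\eta := \eta(t)$ in (2.29). Then the following estimates hold:
\[ \sup_{p} \int_{\mathbb{R}} \frac{d\alpha}{\langle\alpha\rangle\, |\alpha - p^2/2 + i\eta|} \le C \log t, \tag{2.34} \]
\[ \sup_{\alpha} \int \frac{\langle\alpha\rangle\, dp}{\langle p\rangle^4\, |\alpha - p^2/2 + i\eta|} \le C \log t, \tag{2.35} \]
\[ \sup_{p,\alpha} \frac{\langle\alpha\rangle}{\langle p\rangle^4\, |\alpha - p^2/2 + i\eta|} \le Ct, \tag{2.36} \]
\[ \sup_{r,\alpha} \int \frac{dp}{\langle p - r\rangle^4\, |\alpha - p^2/2 + i\eta|} \le C \log t, \tag{2.37} \]
\[ \sup_{p,\alpha} \frac{1}{|\alpha - p^2/2 + i\eta|} \le Ct, \tag{2.38} \]
where $C$ is independent of $t$.

The next result will be the key estimate to control the so-called crossing terms.

Proposition 2.3. Under the assumptions of Proposition 2.2, we have
\[ \sup_{\alpha,\alpha'} \int \frac{\langle\alpha\rangle \langle\alpha'\rangle\, dp}{\langle p\rangle^4 \langle p + q\rangle^4\, |\alpha - p^2/2 + i\eta|\, |\alpha' - (p + q)^2/2 + i\eta|} \le \frac{C(\log t)^2}{|q| + \eta}, \]
where $C$ is independent of $t$.

Proof. We change to spherical coordinates and measure the angular component of $p$ against the fixed vector $q$. If $|q| > 0$ and $r := |p|$, we have
\[ \int \frac{\langle\alpha\rangle \langle\alpha'\rangle\, dp}{\langle p\rangle^4 \langle p + q\rangle^4\, |\alpha - p^2/2 + i\eta|\, |\alpha' - (p + q)^2/2 + i\eta|} \]
\[ = \int_0^{\infty} \frac{\langle\alpha\rangle\, r^2\, dr}{\langle r\rangle^4\, |\alpha - r^2/2 + i\eta|} \int_{-1}^{1} \frac{\langle\alpha'\rangle\, dz}{\langle (r^2 + q^2 + 2r|q|z)^{1/2} \rangle^4\, |\alpha' - (r^2 + q^2 + 2r|q|z)/2 + i\eta|} \]
\[ \le \frac{C}{|q|} \int_0^{\infty} \frac{\langle\alpha\rangle\, r\, dr}{\langle r\rangle^4\, |\alpha - r^2/2 + i\eta|} \int_0^{\infty} \frac{\langle\alpha'\rangle\, dz}{\langle z\rangle^2\, |\alpha' - z + i\eta|} \]
\[ \le \frac{C \log t}{|q|} \int_0^{\infty} \frac{\langle\alpha\rangle\, r\, dr}{\langle r\rangle^4\, |\alpha - r^2/2 + i\eta|} \le \frac{C(\log t)^2}{|q|}. \]
Combining this with the trivial estimate,
\[ \int \frac{\langle\alpha\rangle \langle\alpha'\rangle\, dp}{\langle p\rangle^4 \langle p + q\rangle^4\, |\alpha - p^2/2 + i\eta|\, |\alpha' - (p + q)^2/2 + i\eta|} \le \frac{1}{\eta} \int \frac{\langle\alpha\rangle\, dp}{\langle p\rangle^4\, |\alpha - p^2/2 + i\eta|}, \]
which holds for all $q$, we prove the proposition using (2.35).

The next result shows that the free kernel enjoys a “semi-group” property. It will be crucial in giving us flexibility to estimate the kernel in different ways.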
Proposition 2.4. Let $m \ge 1$ and $I_1, I_2 \subset \{0, \ldots, m\}$ such that $I_1 \cap I_2 = \emptyset$ and $I_1 \cup I_2 = \{0, \ldots, m\}$. That is, $I_1$ and $I_2$ partition $\{0, \ldots, m\}$. Recalling the notation (2.7), one has the identity
\[ K(t; \mathfrak{r}_{0,m}) = -i \int_0^{t*} ds_1\, ds_2\, K(s_1; \mathfrak{r}_{I_1})\, K(s_2; \mathfrak{r}_{I_2}), \tag{2.39} \]
where $\mathfrak{r}_{I_k} := (r_j)_{j \in I_k}$. If one of the sets, say $I_2$, is empty, we will define $K(s_2; \mathfrak{r}_{I_2}) := i\delta(s_2)$. In this case, the decomposition is trivial.

Proposition 2.4 follows directly from the definition (2.14). An immediate consequence is
\[ \mathcal{K}(t; \mathfrak{p}_{0,m}) = \int_0^{t*} ds\, d\tau\, K(s; p_0, \mathfrak{p}_m)\, F(\tau; p_0, \mathfrak{p}_m), \]
\[ F(\tau; \mathfrak{p}_{0,m}) := -i \sum_{k_1,\ldots,k_m=0}^{\infty} \int d\mathfrak{q}_{J_{m,k}}\, K(\tau, \mathfrak{q}_{J_{m,k}})\, L(p_0, \mathfrak{p}_m, \mathfrak{q}_{J_{m,k}}), \tag{2.40} \]
with $L(p_0, \mathfrak{p}_m, \mathfrak{q}_{J_{m,k}})$ defined in (2.13) and where the term corresponding to $k_1 = \cdots = k_m = 0$ is $\delta(\tau) L(\mathfrak{p}_{0,m})$. This decomposition isolates the external momenta in the complete free kernel from the effective potential $F(\tau)$ that is obtained after integrating out the internal momenta. This term will be estimated in Lemma 3.4.

Using (2.39), we can combine the decompositions given in (2.40) and (2.32).

Proposition 2.5. Let $0 \le \mu_1 < \mu_2 \le m$. We have
\[ \mathcal{K}(t; \mathfrak{p}_{0,m}) = \int_0^{t*} dt_1\, dt_2\, \frac{e^{\eta(t_1)t_1}}{2\pi}\, \mathcal{K}(t_2; \mathfrak{p}_{\mu_1+1,\mu_2}) \times \int d\alpha\, e^{-i\alpha t_1} \prod_{j=0}^{\mu_1} \frac{B(\alpha, p_j, p_{j+1})}{\alpha - p_j^2/2 + i\eta_1} \prod_{j=\mu_2+1}^{m} \frac{B(\alpha, p_{j-1}, p_j)}{\alpha - p_j^2/2 + i\eta_1}. \]

3. Error Estimate

The goal of this section is to prove:

Lemma 3.1. Let $m_0 = m_0(\varepsilon)$ be chosen by (3.42). For $\Psi^{\text{error}}_{m_0}(t)$ defined in (2.18), we have
\[ \lim_{\varepsilon \to 0} \lim_{L \to \infty} \mathbf{E} \big\| \Psi^{\text{error}}_{m_0}(T\varepsilon^{-1}) \big\|^2 = 0. \]

Since our main term consists only of terms with collision histories which contain no recollisions, terms resulting from the Duhamel expansion which have collision histories with recollisions are included in the error term. It is in the estimate of the error term that we will need to analyze the size of recollision terms. Recall that we already sum our wave functions in the main term for immediate recollisions (internal collisions), thereby eliminating them from subsequent analysis.
Given $m_1, m_2 \ge 0$ and $A$ of size $m := m_1 + m_2$ with no repeating indices, again denote by $A_1$ the ordered set containing the first $m_1$ elements of $A$ and $A_2$ containing the remaining $m_2$ elements of $A$. Write $\alpha_k$ for the $k$th element of $A$. For $2 \le \kappa \le m$, we define the amputated propagator with collision history of $A_2$ in $(0, t_2]$ and $A_1$ from $(t_2, t_1 + t_2]$ with recollision $\alpha_\kappa$ to be
\[ \tilde U^{\text{rec},\kappa}_{m_1,m_2;A}(t_1, t_2)\psi_0(p_0) := \int dr_0\, \hat V_{\alpha_\kappa}(p_0 - r_0)\, U^{\circ}_{m_1,m_2;A}(t_1, t_2)\psi_0(r_0), \]
recalling the definition of the time-divided propagator $U^{\circ}_{m_1,m_2;A}$ from (2.24). The superscript $\kappa$ encodes the location of the recollision. The corresponding full propagator is then
\[ U^{\text{rec},\kappa}_{m_1,m_2;A}(t_1, t_2) := \int_0^{t_1} ds\, e^{-i(t_1-s)H}\, \tilde U^{\text{rec},\kappa}_{m_1,m_2;A}(s, t_2). \tag{3.1} \]
Summing over $A$ and $2 \le \kappa \le m$ removes their respective indices in the above propagators:
\[ U^{\text{rec}}_{m_1,m_2}(t_1, t_2) := \sum_{\kappa=2}^{m} \sum^{\text{no rec}}_{A:|A|=m} U^{\text{rec},\kappa}_{m_1,m_2;A}(t_1, t_2). \tag{3.2} \]
Using this definition for the recollision term, together with the definition of the fully expanded non-recollision term $U^{\circ}_{m_1,m_2}$ from (2.26) and the truncated term with a full propagator $U_{m_0,m}$ from (2.28), we have the following decomposition of the $k$th error term:

Lemma 3.2. Given (2.21), and $1 \le k \le n$, define $t_1 := t/n$ and $t_2 := (k-1)t/n$. We have
\[ \varphi^{\text{error}}_k(t) = \sum_{m_1=1}^{m_0-1} \sum_{m_2=m_0-m_1}^{m_0-1} U^{\circ}_{m_1,m_2}(t_1, t_2)\psi_0 + \sum_{m=0}^{m_0-1} U_{m_0,m}(t_1, t_2)\psi_0 + \sum_{m_1=0}^{m_0-1} \sum_{m_2=(2-m_1)_+}^{m_0-1} U^{\text{rec}}_{m_1,m_2}(t_1, t_2)\psi_0 \]
\[ =: \varphi^{\text{error},1}_k(t) + \varphi^{\text{error},2}_k(t) + \varphi^{\text{error},3}_k(t), \tag{3.3} \]
where $(a)_+ := \max(a, 0)$.

Notice that our definitions imply that $U^{\circ}_m(0) = 0$ for $m > 0$ and $U^{\circ}_0(0) = \mathrm{Id}$. Consequently, any time-divided propagator of the form $U_{m_1,m_2}(t_1, 0)$ will be zero unless $m_2 = 0$. In this case, we have $U_{m_1,0}(t_1, 0) = U_{m_1}(t_1)$.

Proof. The proof is just a careful Duhamel expansion. Recall from (2.20) and (2.21) that
\[ \varphi_k(t) = e^{-it_1H} \varphi^{\text{main}}_{k-1}(t) = \sum_{m=0}^{m_0-1} e^{-it_1H}\, U^{\circ}_m(t_2)\psi_0. \]
We now use the Duhamel formula to expand the full propagator. We will stop the expansion when the new potential term represents a recollision or after $m_0$ new external collisions. As before, internal collisions do not count when we speak of total collisions and we compensate by summing over them at each step of the expansion. Performing this, we have
\[ \sum_{m=0}^{m_0-1} e^{-it_1H}\, U^{\circ}_m(t_2) = \sum_{m_1,m_2=0}^{m_0-1} U^{\circ}_{m_1,m_2}(t_1, t_2) + \sum_{m_2=0}^{m_0-1} U_{m_0,m_2}(t_1, t_2) + \sum_{m_1=0}^{m_0-1} \sum_{m_2=(2-m_1)_+}^{m_0-1} U^{\text{rec}}_{m_1,m_2}(t_1, t_2). \]
Finally, one can verify
\[ \sum_{\substack{m_1,m_2 \ge 0 \\ m_1+m_2=m}} U^{\circ}_{m_1,m_2}(t_1, t_2) = U^{\circ}_m(t_1 + t_2), \]
which collects the main terms and completes the proof of the lemma.

Thus, at each step $k$, where $1 \le k \le n$, we use the Duhamel formula to expand the additional factor $e^{-it_1H}$. We keep the wave functions which have total collisions $m$ where $m \le m_0 - 1$ and there are no recollisions. Any other cases are collected in the error terms.

We will now systematically estimate each of the three terms in (3.3). For the orientation of the reader, we note that the first error term $\varphi^{\text{error},1}_k(t)$ is a fully expanded term with at least $m_0$ total number of collisions. The second term $\varphi^{\text{error},2}_k(t)$ contains a full propagator after $m_0$ collisions in the short time interval $[t_2, t_2 + t_1]$. Finally, the last error term $\varphi^{\text{error},3}_k(t)$ contains the recollisions.

3.1. Preliminary estimates

Recalling (2.24), we see that all of the randomness present in the wave function $U^{\circ}_{m_1,m_2;A}(t_1, t_2)\psi_0$ is contained in the random phase $\chi(A; \mathfrak{p}_{0,m})$. We start with discussing the expectation value of these random phases. Notice the randomness is unaffected by the time division: the time division is fully recorded in the kernel $\mathcal{K}_{m_1,m_2}(t_1, t_2)$.

3.1.1. Expectation of the random phases

The net effect of expectation of our random phases will be to induce various linear relations (so-called pairing relations) among our external momenta. For the precise formulation, we introduce the notation
\[ \varrho^{(n)}_\Lambda := \frac{N(N-1)\cdots(N-n+1)}{|\Lambda|^n} = \varrho^n \prod_{j=0}^{n-1} \Big( 1 - \frac{j}{N} \Big) \tag{3.4} \]
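for the density of $n$-particle clusters, where we recall the single-obstacle density $\varrho = N/|\Lambda|$ and its scaling $\varrho = \varrho_0\varepsilon$. Note that for a fixed $n$ and $\varepsilon$,
\[ \lim_{L \to \infty} \varrho^{(n)}_\Lambda = \lim_{L \to \infty} \varrho^n \prod_{j=0}^{n-1} \Big( 1 - \frac{j}{N} \Big) = \varrho^n. \tag{3.5} \]
We also denote by $S(b)$ the permutation group on $b$ elements.

Lemma 3.3 (Simple Set Expectation). Recall the notation introduced in (2.4). Suppose $G \in L^2(\mathbb{R}^{3(m+1)}; \mathbb{C})$ and the random phase $\chi$ is given in (2.12). Then
\[ \mathbf{E} \Big\| \sum^{\text{no rec}}_{A:|A|=m} \int d\mathfrak{p}_m\, \chi(A; p_0, \mathfrak{p}_m)\, G(p_0, \mathfrak{p}_m) \Big\|^2_{L^2(dp_0)} = \sum_{b=0}^{m} \sum_{\sigma \in S(b)} \sum_{l,l'} \varrho^{(2m-b)}_\Lambda \times \int dp_0\, d\mathfrak{p}_b\, d\mathfrak{p}'_b\, G(p_0^{\ell_0}, \mathfrak{p}_b^{l_b})\, \overline{G(p_0^{\ell'_0}, \mathfrak{p}'^{\,l'_b}_b)}\, \Delta_\sigma(p_0, \mathfrak{p}_b, \mathfrak{p}'_b), \tag{3.6} \]
where $l_{0,b} := (\ell_0, \ldots, \ell_b)$, $\sum_{l,l'}$ is the sum over such vectors with components in the non-negative integers such that $\sum_{j=0}^{b} \ell_j = \sum_{j=0}^{b} \ell'_j = m - b$, and
\[ \Delta_\sigma(p_0, \mathfrak{p}_b, \mathfrak{p}'_b) := \prod_{j=1}^{b} \delta\big[ (p_{j-1} - p_j) - (p'_{\sigma(j)-1} - p'_{\sigma(j)}) \big]. \tag{3.7} \]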
In the future, we will refer to $\Delta_\sigma$ as the pairing function and to its constituent delta functions as the pairing relations.

Proof. In what follows, all summations on ordered sets (such as $A$ or $A'$) will be understood to be summed over sets with non-repeating indices. That is, we will drop the “no rec” from our summations. We begin by expanding the squared sum
\[ \Big| \sum_{A:|A|=m} \int d\mathfrak{p}_m\, \chi(A; p_0, \mathfrak{p}_m)\, G(p_0, \mathfrak{p}_m) \Big|^2 = \sum_{A:|A|=m} \sum_{A':|A'|=m} \int d\mathfrak{p}_m\, d\mathfrak{p}'_m\, \chi(A; p_0, \mathfrak{p}_m)\, \overline{\chi(A'; p_0, \mathfrak{p}'_m)}\, G(p_0, \mathfrak{p}_m)\, \overline{G(p_0, \mathfrak{p}'_m)}. \]
The key to this Lemma is writing the sum over possible $A$ and $A'$, ordered sets of size $m$ with no repetition, as a sum over their possible intersections and then over their disjoint complements. Explicitly,
\[ \sum_{A:|A|=m} \sum_{A':|A'|=m} = \sum_{b=0}^{m} \sum_{B:|B|=b} \sum_{\sigma \in S(b)} \sum_{(A,A')}, \]
where the last sum is over $A$, $A'$ of size $m$ such that $A \cap A' = B$, $B \prec A$ and $\sigma(B) \prec A'$.
J070-00242
The Linear Boltzmann Equation as the LDL of a Random Schr¨ odinger Equation pj-1
pj
... α
687
... α
b (j -1)
α
b(j)
j -1
b ( j +1)
j
B A\B Fig. 1.
Basic Feynman diagram.
We introduce the vector $\ell_{0,b} := (\ell_0,\ell_1,\dots,\ell_b)$ as follows: let $\ell_0 = j$ if $\alpha_{j+1}\in B$ and $\alpha_k\notin B$ for $k\le j$. Similarly, denote by $\ell_k$ the number of elements $\alpha_j$ of $A$ that are not in $B$ and lie between the $k$th and $(k+1)$th members of $B$. By this definition, if $\alpha_k$ is the $j$th member of $B$ and $\alpha_{k+1}$ is the $(j+1)$th member of $B$, then $\ell_j = 0$. In other words, the vector $\ell_{0,b}$ counts the number of $\alpha_j$'s in between members of $B$. Thus $\ell_{0,b}$ describes precisely how $B$ is embedded in $A$. Consequently, we have the relation

$$\sum_{j=0}^{b}\ell_j = m-b. \tag{3.8}$$
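The counting behind (3.8) can be illustrated with a small sketch. The function names and the obstacle labels below are hypothetical and chosen only for illustration; the code assumes that $A$ is an ordered tuple without repetition and that $B$ is a subset of its entries.

```python
from itertools import combinations
from math import comb


def embedding_vector(A, B):
    """Return (l_0, ..., l_b) for an ordered set A and a subset B of its entries.

    l_0 counts the elements of A before the first member of B; l_k counts the
    elements of A that are not in B and follow the k-th member of B (up to the
    next member of B, if there is one).
    """
    members = set(B)
    gaps = [0]
    for alpha in A:
        if alpha in members:
            gaps.append(0)   # a member of B closes the current gap
        else:
            gaps[-1] += 1    # an element of A \ B extends the current gap
    return tuple(gaps)


def all_embedding_vectors(m, b):
    """Collect the vectors for every way of placing b members of B among m slots."""
    A = tuple(range(m))
    return {embedding_vector(A, B) for B in combinations(A, b)}


# Hypothetical obstacle labels; only membership and order matter.
A = (7, 3, 9, 4, 1, 8)
B = (3, 4, 1)
ell = embedding_vector(A, B)
print(ell, sum(ell) == len(A) - len(B))
print(len(all_embedding_vectors(6, 3)) == comb(6, 3))
```

In this example the entries 4 and 1 are adjacent in $A$, so the corresponding gap is zero, and the components add up to $m-b = 3$. The enumeration also suggests that, for fixed $m$ and $b$, there are $\binom{m}{b}$ admissible vectors.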
See Fig. 1 for the corresponding Feynman diagram. The bullets refer to centers, the lines between them are free propagators carrying a momentum. The filled bullets are single centers that do not appear anywhere else in the expansion, therefore the incoming and outgoing momenta are the same. The elements of the set $B$ (unfilled bullets) involve momentum transfer. Define $\ell'_{0,b}$ in a similar way for $A'$.

We next take the expectation of the $L^2$ norm. Using the independence of the variables $x_\alpha$, the expectation

$$E_{x_{\alpha_j}}e^{ix_{\alpha_j}p} := \frac{1}{|\Lambda|}\int_\Lambda dx_{\alpha_j}\,e^{ix_{\alpha_j}p} = \frac{\delta(p)}{|\Lambda|},$$

and (2.12), we have

$$E\,\chi(A;p_{0,m})\overline{\chi(A';p'_{0,m})} = |\Lambda|^{-(2m-b)}\prod_{j=1}^{b}\delta\big[(p_{b(j)-1}-p_{b(j)})-(p'_{b'\circ\sigma(j)-1}-p'_{b'\circ\sigma(j)})\big]\times\prod_{j=0}^{b}\Bigg(\prod_{k=b(j)+1}^{b(j+1)-1}\delta(p_{k-1}-p_k)\prod_{k=b'(j)+1}^{b'(j+1)-1}\delta(p'_{k-1}-p'_k)\Bigg), \tag{3.9}$$

where we defined

$$b(j) := \sum_{\iota=0}^{j-1}\ell_\iota+j,\qquad b'(j) := \sum_{\iota=0}^{j-1}\ell'_\iota+j,$$

and we set our convention as

$$\sum_{m}^{m-1} = 0\quad\text{and}\quad\prod_{m}^{m-1} = 1. \tag{3.10}$$
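As a quick sanity check of the definition of $b(j)$, the following sketch computes the positions $b(0),\dots,b(b+1)$ from a given vector $(\ell_0,\dots,\ell_b)$. The helper name `b_positions` and the sample vector are assumptions made for illustration only.

```python
def b_positions(ell):
    """Return [b(0), ..., b(b+1)] with b(j) = sum_{i<j} ell_i + j."""
    return [sum(ell[:j]) + j for j in range(len(ell) + 1)]


# Hypothetical example with m = 6 centers, b = 3 of which belong to B.
ell = (1, 1, 0, 1)
pos = b_positions(ell)
print(pos)

# The number of indices strictly between b(j) and b(j+1) is ell_j.
gaps = [pos[j + 1] - pos[j] - 1 for j in range(len(ell))]
print(gaps == list(ell))
```

Here $b(1), b(2), b(3)$ are the (1-based) positions of the members of $B$ inside $A$, and $b(b+1) = m+1$, so the inner products in (3.9) run over exactly $\ell_j$ indices for each $j$.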
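We now integrate over the variables not involved in the pairing relations; specifically we integrate over $\mathbf{p}_m\setminus\{p_{b(j)}\}_{j=0}^{b}$ and their prime counterparts. Of the variables left, we re-label $p_{b(j)}\to p_j$ and $p'_{b'(j)}\to p'_j$, for $j = 0,1,\dots,b$. Consequently

$$E\Big\|\sum_{A:|A|=m}\int d\mathbf{p}_m\,\chi(A;p_0,\mathbf{p}_m)G(p_0,\mathbf{p}_m)\Big\|^2 = \sum_{b=0}^{m}\;\sum_{B:|B|=b}\;\sum_{\sigma\in S(b)}\;\sum_{(A,A')}|\Lambda|^{-(2m-b)}\int dp_0\,d\mathbf{p}_b\,d\mathbf{p}'_b\;G(p_0^{\ell_0},\mathbf{p}_b^{\ell_b})\,\overline{G(p_0^{\ell'_0},\mathbf{p}_b'^{\,\ell'_b})}\,\Delta_\sigma(p_0,\mathbf{p}_b,\mathbf{p}'_b).$$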
Lemma 3.3 then follows since the total number of obstacles is $N$, hence the number of ways of choosing $B$, $A$ and $A'$ such that $B\prec A$, $\sigma(B)\prec A'$ and $A\cap A' = B$, for fixed $\sigma$, $\ell_{0,b}$ and $\ell'_{0,b}$, is $\frac{N!}{(N-(2m-b))!}$.

In typical applications of this lemma, with $m_1+m_2 = m$, we will set $G(p_0,\mathbf{p}_m) = K_{m_1,m_2}(t_1,t_2;p_{0,m})\psi_0(p_m)$, where we recall the definition (2.25). For a fixed $\ell_{0,b}$, the integrand in (3.6) implies that we will have to make estimates on $K_{m_1,m_2}(t_1,t_2;p_{0,b}^{\ell_{0,b}})$. To do this, we will introduce more notation. Let $\beta = \beta(m_1,\ell)$ be such that $0\le\beta\le m_1$ and satisfy

$$b(\beta)\le m_1\le b(\beta+1)-1, \tag{3.11}$$

and we define

$$\ell_{\beta1} := m_1-b(\beta),\qquad\ell_{\beta2} := b(\beta+1)-1-m_1. \tag{3.12}$$
In particular $\ell_\beta = \ell_{\beta1}+\ell_{\beta2}$. In other words, $\beta$ is the number of $B$-elements before the time division, and $\ell_{\beta1}$, $\ell_{\beta2}$ describe how the time division line divides the $(A\setminus B)$-elements between the $\beta$th and $(\beta+1)$th $B$-elements (see Fig. 2; the dashed vertical line indicates the time division). We define the primed versions analogously. Recalling (2.4) and (2.25), we have the expression

$$K_{m_1,m_2}(t_1,t_2;p_{0,b}^{\ell_{0,b}}) = K(t_1;\underbrace{p_0,\dots,p_0}_{\ell_0+1},\dots,\underbrace{p_\beta,\dots,p_\beta}_{\ell_{\beta1}+1})\,K(t_2;\underbrace{p_\beta,\dots,p_\beta}_{\ell_{\beta2}+1},\dots,\underbrace{p_b,\dots,p_b}_{\ell_b+1}).$$

In accordance with (2.40), one can check that

$$K_{m_1,m_2}(t_1,t_2;p_{0,b}^{\ell_{0,b}}) = \int_0^{t_1*}ds_1\,d\tau_1\int_0^{t_2*}ds_2\,d\tau_2\;\mathcal{K}_{m_1,m_2}(s_1,s_2;p_{0,b}^{\ell_{0,b}})\,F_{m_1,m_2}(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})$$

for

$$\mathcal{K}_{m_1,m_2}(s_1,s_2;p_{0,b}^{\ell_{0,b}}) := \mathcal{K}(s_1;p_{0,\beta}^{\ell_{0,\beta1}})\,\mathcal{K}(s_2;p_{\beta,b}^{\ell_{\beta2,b}}),\qquad F_{m_1,m_2}(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}}) := F(\tau_1;p_{0,\beta}^{\ell_{0,\beta1}})\,F(\tau_2;p_{\beta,b}^{\ell_{\beta2,b}}). \tag{3.13}$$

Fig. 2. Time division.
In the future, we will omit the subscripts on $K_{m_1,m_2}$, $\mathcal{K}_{m_1,m_2}$ and $F_{m_1,m_2}$ when they are obvious from the context. In what follows, we will adopt the following convention. We will use upper case index variables when summing over index sets of the form $(0,\dots,\beta-1,\beta1,\beta2,\beta+1,\dots,b)$. Moreover, define the upper case momenta in the following way. If $p_{0,b}$ is a set of momenta, the corresponding upper case momenta are defined by

$$P_J := p_J\quad\text{for }J\ne\beta1,\beta2,\qquad P_{\beta1} = P_{\beta2} := p_\beta. \tag{3.14}$$

Using this convention, we can write

$$\mathcal{K}_{m_1,m_2}(s_1,s_2;p_{0,b}^{\ell_{0,b}}) = (-i)^b\int_0^{s_1*}[d\sigma_J]_0^{\beta1}\int_0^{s_2*}[d\sigma_J]_{\beta2}^{b}\prod_{J=0}^{b}e^{-i\sigma_JP_J^2/2}\frac{(i\sigma_J)^{\ell_J}}{\ell_J!},$$

where the notation implies that the product is

$$\prod_{J=0}^{b}e^{-i\sigma_JP_J^2/2}\frac{(i\sigma_J)^{\ell_J}}{\ell_J!} = e^{-i(\sigma_{\beta1}+\sigma_{\beta2})p_\beta^2/2}\,\frac{(i\sigma_{\beta1})^{\ell_{\beta1}}(i\sigma_{\beta2})^{\ell_{\beta2}}}{\ell_{\beta1}!\,\ell_{\beta2}!}\prod_{j=0;\,j\ne\beta}^{b}e^{-i\sigma_jp_j^2/2}\frac{(i\sigma_j)^{\ell_j}}{\ell_j!}.$$

3.1.2. Estimates on the effective potential

The next result estimates the size of the effective potential $F(\tau)$ obtained after integrating out the internal momenta (see (2.40)).
Lemma 3.4. Let $0\le b\le m$ and $I\subset\{0,\dots,b\}$ with $|I| = n\le b+1$ and $\xi := (\xi_1,\dots,\xi_n)\in\{0,1,2\}^n$ be a multi-index. Let $\ell_{0,b}$ be as in the statement of Lemma 3.3. If $G$ is twice differentiable, then there is a universal constant $M$ such that

$$\Big|D^{\xi}_{p_I}\big[F(\tau;p_{0,b}^{\ell_{0,b}})G(p_{0,b})\big]\Big|\le\frac{(M\lambda_0)^m}{\langle\tau\rangle^{3/2}}\sup_{\xi'\in\{0,1,2\}^n}\big|D^{\xi'}_{p_I}G(p_{0,b})\big|\prod_{j=1}^{b}\frac{1}{\langle p_{j-1}-p_j\rangle^{26}}.$$

This lemma is a consequence of the dispersive estimates on the free propagator. In particular,

$$\Big|\int dp\,e^{-isp^2/2}f(p)\Big|\le\big\|e^{is\Delta/2}\check f\big\|_{L^\infty}\le Cs^{-3/2}\|\check f\|_{L^1}\le Cs^{-3/2}\|f\|_{H^2}.$$

We can combine this with the trivial bound $|\int dp\,e^{-isp^2/2}f(p)|\le\|f\|_{L^1}$ to get

$$\Big|\int dp\,e^{-isp^2/2}f(p)\Big|\le C\langle s\rangle^{-3/2}\big(\|f\|_{L^1}+\|f\|_{H^2}\big).$$

We will frequently need to apply this estimate iteratively. To do this precisely, we make some definitions. Suppose $I\subset\{0,\dots,b\}$ is of length $n$ and write $I = (i_1,\dots,i_n)$. Denote by $\xi$ a multi-index of length $n$ where $\xi_j\in\{0,2\}$. Define the following operations on functions:

$$N^{\xi}_{dp_I} := N^{\xi_1}_{dp_{i_1}}\circ\cdots\circ N^{\xi_n}_{dp_{i_n}},\qquad N^{0}_{dp_j} := \int dp_j\,|\cdot|,\qquad N^{2}_{dp_j} := \Big(\int dp_j\,|\cdot|^2\Big)^{1/2},\qquad D^{\xi}_{p_I} := \prod_{j=1}^{n}\nabla^{\xi_j}_{p_{i_j}}. \tag{3.15}$$

Now, let $f\in\mathcal{S}(\mathbb{R}^{3(b+1)};\mathbb{C})$ and define

$$|||f|||_{dp_I} := \sum_{\xi\in\{0,2\}^n}N^{\xi}_{dp_I}D^{\xi}_{p_I}f. \tag{3.16}$$

With this language,

$$\Big|\int dp\,e^{-isp^2/2}f(p)\Big|\le C\langle s\rangle^{-3/2}|||f|||_{dp}. \tag{3.17}$$
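We now move on to prove Lemma 3.4.

Proof. Write $k := k_m$ and $J := J_{m,k_m}$ (recall the definition from (2.11)). From (2.40), we have

$$D^{\xi}_{p_I}\big[F(\tau;p_{0,b}^{\ell_{0,b}})G(p_{0,b})\big] = -i\sum_{k_1,\dots,k_m=0}^{\infty}\int dq_J\,K(\tau;q_J)\,D^{\xi}_{p_I}\big[L(p_{0,b}^{\ell_{0,b}},q_J)G(p_{0,b})\big].$$

For a fixed $k$, we have from (2.14),

$$\int dq_J\,K(\tau;q_J)\,D^{\xi}_{p_I}\big[L(p_{0,b}^{\ell_{0,b}},q_J)G(p_{0,b})\big] = (-i)^{|k|}\int_0^{\tau*}[d\sigma_{jk}]_{jk\in J}\times\int dq_J\prod_{jk\in J}e^{-i\sigma_{jk}q_{jk}^2/2}\,D^{\xi}_{p_I}\big[L(p_{0,b}^{\ell_{0,b}},q_J)G(p_{0,b})\big],$$

where $|k| := \sum_{j=1}^{m}k_j$. Applying (3.17) iteratively, we have

$$\Big|\int dq_J\,K(\tau;q_J)\,D^{\xi}_{p_I}\big[L(p_{0,b}^{\ell_{0,b}},q_J)G(p_{0,b})\big]\Big|\le C^{|k|}\,|||D^{\xi}_{p_I}(L(p_{0,b}^{\ell_{0,b}},q_J)G(p_{0,b}))|||_{dq_J}\int_0^{\tau*}[d\sigma_{jk}]_J\prod_{jk\in J}\frac{1}{\langle\sigma_{jk}\rangle^{3/2}}\le\frac{C^{|k|}}{\langle\tau\rangle^{3/2}}\,|||D^{\xi}_{p_I}(L(p_{0,b}^{\ell_{0,b}},q_J)G(p_{0,b}))|||_{dq_J},$$

where the estimate is due to the multiple time integration defined in (2.7). Using the Leibniz rule and the triangle inequality,

$$|||D^{\xi}_{p_I}(L(p_{0,b}^{\ell_{0,b}},q_J)G(p_{0,b}))|||_{dq_J}\le\sum_{\xi'\le\xi}\binom{\xi}{\xi'}\,|||D^{\xi-\xi'}_{p_I}L(p_{0,b}^{\ell_{0,b}},q_J)\,D^{\xi'}_{p_I}G(p_{0,b})|||_{dq_J},$$

where the notation $\xi'\le\xi$ indicates componentwise ordering and $\binom{\xi}{\xi'} := \binom{\xi_1}{\xi'_1}\cdots\binom{\xi_n}{\xi'_n}$. The form of $L(p_{0,b}^{\ell_{0,b}},q_J)$ in (2.13) allows us to write

$$|||D^{\xi-\xi'}_{p_I}L(p_{0,b}^{\ell_{0,b}},q_J)|||_{dq_J}\le(M\lambda_0)^{m+|k|}\int dq_J\prod_{j=1}^{m}\Bigg[\frac{1}{\langle r_{j-1}-q_{j1}\rangle^{30}}\frac{1}{\langle q_{j1}-q_{j2}\rangle^{30}}\cdots\frac{1}{\langle q_{jk_j}-r_j\rangle^{30}}\Bigg]_{r_{0,m}=p_{0,b}^{\ell_{0,b}}}$$

$$\le(M\lambda_0)^{m+|k|}\prod_{j=1}^{b}\frac{1}{\langle p_{j-1}-p_j\rangle^{26}}\int dq_J\prod_{j=1}^{m}\Bigg[\frac{1}{\langle r_{j-1}-q_{j1}\rangle^{4}}\cdots\frac{1}{\langle q_{jk_j}-r_j\rangle^{4}}\Bigg]_{r_{0,m}=p_{0,b}^{\ell_{0,b}}}.$$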
Summing over $k_j$ and using that $\lambda_0\ll1$, we obtain the lemma.

Define

$$J_1 := (11,\dots,1k_1,\dots,m_11,\dots,m_1k_{m_1}),\qquad J_2 := (m_1{+}1\,1,\dots,m_1{+}1\,k_{m_1+1},\dots,m1,\dots,mk_m), \tag{3.18}$$

where the first double index in $J_2$ has $m_1+1$ as the first element and 1 as the second, etc. This implies $J_1\oplus J_2 = J_{m,k_m}$. Expanding (3.13) and using (2.40) yield

$$F_{m_1,m_2}(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}}) = -\sum_{k_1,\dots,k_m=0}^{\infty}\int dq_{J_{m,k_m}}\,L(p_{0,b}^{\ell_{0,b}},q_{J_{m,k_m}})\,K(\tau_1;q_{J_1})\,K(\tau_2;q_{J_2}).$$

Again, the degenerate term of $k_1 = \cdots = k_m = 0$ of the last sum is defined as $\delta(\tau_1)\delta(\tau_2)L(p_{0,b}^{\ell_{0,b}})$. A simple corollary to Lemma 3.4 is the estimate:

$$\Big|D^{\xi}_{p_I}\big[F_{m_1,m_2}(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})G(p_{0,b})\big]\Big|\le\frac{(M\lambda_0)^{m_1+m_2}}{\langle\tau_1\rangle^{3/2}\langle\tau_2\rangle^{3/2}}\sup_{\xi'\in\{0,1,2\}^n}\big|D^{\xi'}_{p_I}G(p_{0,b})\big|\prod_{j=1}^{b}\frac{1}{\langle p_{j-1}-p_j\rangle^{26}}. \tag{3.19}$$
3.2. Estimate of $\varphi_k^{\mathrm{error},1}(t)$

We now estimate the first error term $\varphi_k^{\mathrm{error},1}(t)$ in Lemma 3.2. We will omit the $L\to\infty$ limit from the rest of this section with the understanding that this limit is taken in every estimate before any other limits.

Lemma 3.5. Recall that $t = T\varepsilon^{-1}$ and $\varrho = \varrho_0\varepsilon$. Let $m_0 = m_0(\varepsilon)\gg1$, $n = n(\varepsilon)\gg1$ (we will make precise choices later in (3.42) and (3.43)) and suppose $1\le m_1,m_2\le m_0-1$ such that $m = m_1+m_2\ge m_0$. Then for $t_1+t_2 = t$ with $t_2\ge t_1$, we have the bound

$$E\big\|U^{\circ}_{m_1,m_2}(t_1,t_2)\psi_0\big\|^2\le C(M\lambda_0)^mT^m\Bigg[\frac{(\varrho t_1)^{m_1}(\varrho t_2)^{m_2}}{m_1!\,m_2!}+m!\,\varrho\,(\log t)^{m+O(1)}(\varrho t_1)^{m_1-1}(\varrho t_2)^{m_2}\Bigg]. \tag{3.20}$$

Consequently, for $k\ge1$,

$$E\big\|\varphi_k^{\mathrm{error},1}(t)\big\|^2\le C^{m_0}T^{2m_0}\Bigg[\frac{1}{n}\frac{1}{m_0!}+\varrho\,(2m_0)!\,(\log t)^{2m_0+O(1)}\Bigg]. \tag{3.21}$$
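Proof. By definition (2.24), we have

$$U^{\circ}_{m_1,m_2;A}(t_1,t_2)\psi_0(p_0) = \int d\mathbf{p}_m\,K(t_1;p_{0,m_1})K(t_2;p_{m_1,m})\chi(A;p_{0,m})\psi_0(p_m).$$

We apply Lemma 3.3 to get

$$E\big\|U^{\circ}_{m_1,m_2}(t_1,t_2)\psi_0\big\|^2\le\sum_{b=0}^{m}\sum_{\sigma\in S(b)}\sum_{\ell,\ell'}\varrho^{2m-b}\Big|\int dp_0\,d\mathbf{p}_b\,d\mathbf{p}'_b\,\Delta_\sigma(p_0,\mathbf{p}_b,\mathbf{p}'_b)\psi_0(p_b)\overline{\psi_0(p'_b)}\times K_{m_1,m_2}(t_1,t_2;p_0^{\ell_0},\mathbf{p}_b^{\ell_b})\overline{K_{m_1,m_2}(t_1,t_2;p_0^{\ell'_0},\mathbf{p}_b'^{\,\ell'_b})}\Big| =: (\text{Direct})+(\text{Crossing}), \tag{3.22}$$

where

$$(\text{Direct}) := \sum_{b=0}^{m}\sum_{\ell,\ell'}\varrho^{2m-b}\Big|\int dp_0\,d\mathbf{p}_b\,|\psi_0(p_b)|^2\,K(t_1,t_2;p_0^{\ell_0},\mathbf{p}_b^{\ell_b})\overline{K(t_1,t_2;p_0^{\ell'_0},\mathbf{p}_b^{\ell'_b})}\Big|,$$

$$(\text{Crossing}) := \sum_{b=2}^{m}\sum_{\sigma\in S(b)\setminus\mathrm{Id}}\sum_{\ell,\ell'}\varrho^{2m-b}\Big|\int dp_0\,d\mathbf{p}_b\,d\mathbf{p}'_b\,\Delta_\sigma(p_0,\mathbf{p}_b,\mathbf{p}'_b)\psi_0(p_b)\overline{\psi_0(p'_b)}\times K(t_1,t_2;p_0^{\ell_0},\mathbf{p}_b^{\ell_b})\overline{K(t_1,t_2;p_0^{\ell'_0},\mathbf{p}_b'^{\,\ell'_b})}\Big|. \tag{3.23}$$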
The decomposition depends on whether $\sigma$ is trivial (identity) or not. When $\sigma$ is trivial, then the pairing functions (3.7) reduce to the relations $\prod_{j=1}^{b}\delta(p_j-p'_j)$. This decomposition will correspond to the two terms on the right-hand side of estimate (3.20).

We will treat the Direct term first. Applying the Schwarz inequality and using that $\sum_{\ell}1 = \sum_{\ell'}1 = \binom{m}{b}\le2^m$,

$$(\text{Direct})\le2^m\sum_{b=0}^{m}\varrho^{2m-b}\sum_{\ell}\int dp_{0,b}\,|\psi_0(p_b)|^2\,\big|K(t_1,t_2;p_{0,b}^{\ell_{0,b}})\big|^2.$$

Using (2.40), we can write

$$(\text{Direct})\le2^m\sum_{b=0}^{m}\varrho^{2m-b}\sum_{\ell}\int_0^{t_1*}ds_1\,d\tau_1\int_0^{t_1*}ds'_1\,d\tau'_1\int_0^{t_2*}ds_2\,d\tau_2\int_0^{t_2*}ds'_2\,d\tau'_2\times\int dp_{0,b}\,|\psi_0(p_b)|^2\,\mathcal{K}(s_1,s_2;p_{0,b}^{\ell_{0,b}})\overline{\mathcal{K}(s'_1,s'_2;p_{0,b}^{\ell_{0,b}})}\times F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}.$$

We now estimate the free kernel. With the index convention introduced in (3.14), we use (2.14) and (2.39) to write

$$\int dp_{0,b}\,|\psi_0(p_b)|^2\,\mathcal{K}(s_1,s_2;p_{0,b}^{\ell_{0,b}})\overline{\mathcal{K}(s'_1,s'_2;p_{0,b}^{\ell_{0,b}})}\,F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}$$

$$= \int_0^{s_1*}[d\sigma_J]_0^{\beta1}\int_0^{s'_1*}[d\sigma'_J]_0^{\beta1}\int_0^{s_2*}[d\sigma_J]_{\beta2}^{b}\int_0^{s'_2*}[d\sigma'_J]_{\beta2}^{b}\times\int dp_{0,b}\,F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}\,|\psi_0(p_b)|^2\times\prod_{J=0}^{b}e^{-i(\sigma_J-\sigma'_J)P_J^2/2}\frac{(\sigma_J\sigma'_J)^{\ell_J}}{(\ell_J!)^2}. \tag{3.24}$$

For notational convenience, assume $\beta\ne b$. The case $b = \beta$ is estimated in the same way. By the decay of the initial wave function in momentum space and the triangle
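inequality, we have

$$\Big|\int dp_{0,b}\,|\psi_0(p_b)|^2\,F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}\prod_{J=0}^{b}e^{-i(\sigma_J-\sigma'_J)P_J^2/2}\Big|$$

$$\le\|\psi_0\|^2_{30,0}\,\sup_{p_b}\Big|\int dp_{0,b-1}\,F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\langle p_b\rangle^{-60}\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}\times e^{-i[(\sigma_{\beta1}-\sigma'_{\beta1})+(\sigma_{\beta2}-\sigma'_{\beta2})]p_\beta^2/2}\prod_{j=0;\,j\ne\beta}^{b-1}e^{-i(\sigma_j-\sigma'_j)p_j^2/2}\Big|$$

$$\le\|\psi_0\|^2_{30,0}\,C^b\sup_{p_b}|||F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}\langle p_b\rangle^{-60}|||_{dp_{0,b-1}}\times\big\langle(\sigma_{\beta1}-\sigma'_{\beta1})+(\sigma_{\beta2}-\sigma'_{\beta2})\big\rangle^{-3/2}\prod_{j=0;\,j\ne\beta}^{b-1}\langle\sigma_j-\sigma'_j\rangle^{-3/2},$$

where the last estimate used (3.17) iteratively. Applying this to (3.24), using $\sigma'_J\le s'_1$ and $\sigma'_J\le s'_2$ for $J\le\beta1$ and $J>\beta2$, respectively, and performing the integration over $[d\sigma'_J]_0^{\beta1}$ and $[d\sigma'_J]_{\beta2}^{b}$, we have

$$(3.24)\le C^b\sup_{p_b}|||F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}\langle p_b\rangle^{-60}|||_{dp_{0,b-1}}\times\frac{(s'_1)^{\ell_0+\cdots+\ell_{\beta1}}(s'_2)^{\ell_{\beta2}+\cdots+\ell_b}}{\ell_0!\cdots\ell_{\beta1}!\,\ell_{\beta2}!\cdots\ell_b!}\int_0^{s_1*}[d\sigma_J]_0^{\beta1}\int_0^{s_2*}[d\sigma_J]_{\beta2}^{b}\prod_{J=0}^{b}\sigma_J^{\ell_J},$$

where we have also used the trivial estimate $1/(\ell_j)!\le1$. Using the identity

$$\int_0^s(s-\sigma_j)^m\sigma_j^{\ell_j}\,d\sigma_j = \frac{\ell_j!\,m!}{(m+\ell_j+1)!}\,s^{m+\ell_j+1},$$

we have

$$(3.24)\le\frac{C^b\,s_1^{m_1}s_2^{m_2}\,t^{m-b}}{m_1!\,m_2!}\sup_{p_b}|||F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}\langle p_b\rangle^{-60}|||_{dp_{0,b-1}}.$$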
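The integral identity used in the last step is the standard Beta integral. The following sketch checks it in exact rational arithmetic by expanding $(s-\sigma)^m$ with the binomial theorem; the function names are hypothetical and the check is only an illustration, not part of the proof.

```python
from fractions import Fraction
from math import comb, factorial


def beta_integral(s, m, l):
    """Exact value of int_0^s (s - x)^m x^l dx via the binomial expansion."""
    s = Fraction(s)
    total = Fraction(0)
    for k in range(m + 1):
        # (s - x)^m = sum_k C(m, k) s^(m-k) (-x)^k, and
        # int_0^s x^(k+l) dx = s^(k+l+1) / (k+l+1)
        total += comb(m, k) * (-1) ** k * s ** (m - k) * s ** (k + l + 1) / (k + l + 1)
    return total


def closed_form(s, m, l):
    """Right-hand side l! m! / (m + l + 1)! * s^(m+l+1)."""
    return Fraction(factorial(l) * factorial(m), factorial(m + l + 1)) * Fraction(s) ** (m + l + 1)


s = Fraction(3, 2)
print(all(beta_integral(s, m, l) == closed_form(s, m, l)
          for m in range(6) for l in range(6)))
```

Applying the identity once per time variable is what converts the product of powers $\sigma_J^{\ell_J}$ into the factorials $m_1!$ and $m_2!$ in the denominator.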
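Using the definition of $|||\cdot|||$ and (3.19), we conclude

$$\sup_{p_b}|||F(\tau_1,\tau_2;p_{0,b}^{\ell_{0,b}})\overline{F(\tau'_1,\tau'_2;p_{0,b}^{\ell_{0,b}})}\langle p_b\rangle^{-60}|||_{dp_{0,b-1}}\le\frac{C(M\lambda_0)^m}{\langle\tau_1\rangle^{3/2}\langle\tau_2\rangle^{3/2}\langle\tau'_1\rangle^{3/2}\langle\tau'_2\rangle^{3/2}},$$

which implies

$$(\text{Direct})\le(M\lambda_0)^m\sum_{b=0}^{m}\varrho^{2m-b}\sum_{\ell}\int_0^{t_1*}ds_1\,d\tau_1\int_0^{t_1*}ds'_1\,d\tau'_1\int_0^{t_2*}ds_2\,d\tau_2\int_0^{t_2*}ds'_2\,d\tau'_2\times\frac{t^{m-b}\,s_1^{m_1}s_2^{m_2}}{m_1!\,m_2!\,\langle\tau_1\rangle^{3/2}\langle\tau'_1\rangle^{3/2}\langle\tau_2\rangle^{3/2}\langle\tau'_2\rangle^{3/2}}\le C(M\lambda_0)^mT^m\,\frac{(\varrho t_1)^{m_1}(\varrho t_2)^{m_2}}{m_1!\,m_2!}. \tag{3.25}$$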
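The last inequality uses $\sum_{b=0}^{m}\sum_{\ell}1 = 2^m$. This proves the estimate on the Direct term.

It remains to estimate the Crossing term in (3.23). We proceed in the spirit of the "indirect" term estimates in [9] which are based on the α-representation of the free kernel (Lemma 2.1). In particular, from (2.32), we have the representation

$$K_{m_1,m_2}(t_1,t_2;p_{0,m}) = -\frac{e^{\eta_1t_1+\eta_2t_2}}{4\pi^2}\int\!\!\int d\alpha_1\,d\alpha_2\,\frac{e^{-i(\alpha_1t_1+\alpha_2t_2)}}{(\alpha_1-p_0^2/2+i\eta_1)(\alpha_2-p_{m_1}^2/2+i\eta_2)}\times\prod_{j=1}^{m_1}\frac{B(\alpha_1,p_{j-1},p_j)}{\alpha_1-p_j^2/2+i\eta_1}\prod_{j=m_1+1}^{m}\frac{B(\alpha_2,p_{j-1},p_j)}{\alpha_2-p_j^2/2+i\eta_2},$$

where $\eta_j := \eta(t_j)$. To shorten our expressions, define for $k\in\{1,2\}$,

$$[\alpha_k,p] := \alpha_k-p^2/2+i\eta_k, \tag{3.26}$$

and its absolute value is denoted by $|\alpha_k,p| := |[\alpha_k,p]|$. Analogous definitions are introduced for the primed versions with the same $\eta_k$ regularizations:

$$[\alpha'_k,p'] := \alpha'_k-p'^2/2+i\eta_k,\qquad|\alpha'_k,p'| := |[\alpha'_k,p']|. \tag{3.27}$$

Note that the regularization $\eta_k$ is not explicitly accounted for in the notation. However, the short notations $[\alpha,p]$, $|\alpha,p|$ will always be used in a context when $\alpha$ equals one of the variables $\alpha_1,\alpha_2,\alpha'_1,\alpha'_2$ and the index of $\alpha$ indicates the index of the regularizing $\eta$. Consequently, it remains to bound

$$\sum_{b=2}^{m}\sum_{\sigma\ne\mathrm{Id}}\sum_{\ell,\ell'}\varrho^{2m-b}\Bigg|\int dp_{0,b}\,d\mathbf{p}'_b\,\psi_0(p_b)\overline{\psi_0(p'_b)}\,\Delta_\sigma(p_0,\mathbf{p}_b,\mathbf{p}'_b)\times\int d\alpha\,d\alpha'\,e^{-i(\alpha-\alpha')\cdot(t_1,t_2)}\,B(\alpha_1,p_{0,\beta})B(\alpha_2,p_{\beta,b})\overline{B(\alpha'_1,p_0,\mathbf{p}'_{\beta'})B(\alpha'_2,p'_{\beta',b})}$$

$$\times\prod_{J=0}^{\beta1}\frac{B(\alpha_1,P_J,P_J)^{\ell_J}}{[\alpha_1,P_J]^{\ell_J+1}}\prod_{J=\beta2}^{b}\frac{B(\alpha_2,P_J,P_J)^{\ell_J}}{[\alpha_2,P_J]^{\ell_J+1}}\times\prod_{J=0}^{\beta'1}\overline{\frac{B(\alpha'_1,P'_J,P'_J)^{\ell'_J}}{[\alpha'_1,P'_J]^{\ell'_J+1}}}\prod_{J=\beta'2}^{b}\overline{\frac{B(\alpha'_2,P'_J,P'_J)^{\ell'_J}}{[\alpha'_2,P'_J]^{\ell'_J+1}}}\Bigg|, \tag{3.28}$$

where $\alpha := (\alpha_1,\alpha_2)$, $\alpha' := (\alpha'_1,\alpha'_2)$ and

$$B(\alpha,p_{n,m}) := \prod_{j=n+1}^{m}B(\alpha,p_{j-1},p_j)\quad\text{for }n\le m. \tag{3.29}$$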
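We will now proceed as in [9, Lemma 3.5] by exploiting the pairing relations and estimating each almost singular integral in a particular way. The technical Lemma 5.2 (to be proven later in Sec. 5) and the triangle inequality imply

$$\sup_{\alpha_1,\alpha_2}\big|B(\alpha_1,p_{0,\beta})B(\alpha_2,p_{\beta,b})\big|\le\prod_{j=1}^{b}\frac{M\lambda_0}{\langle p_{j-1}-p_j\rangle^{30}}\le\langle p_b\rangle^{30}\prod_{i=1}^{6}\frac{1}{\langle p_{k_i}\rangle^{4}}\prod_{j=1}^{b}\frac{M\lambda_0}{\langle p_{j-1}-p_j\rangle^{4}}, \tag{3.30}$$

where $k_i$ are between 0 and $b$ and can be chosen at will. The same statements hold for the primed momenta.

In general, the pairing structure can be quite complicated. However, we know from [9, Lemmas 2.4 and 2.8] that we can express the primed momenta as linear combinations of the non-primed ones, in particular,

$$\Delta_\sigma(p_0,\mathbf{p}_b,\mathbf{p}'_b) = \prod_{j=1}^{b}\delta\big(p'_j-l_j(p_{0,b})\big),$$

for some linear functions $l_j$. Moreover, we always have the condition $p'_b = p_b$. The assumption that $\sigma\ne\mathrm{Id}$ implies that there is a $0<\kappa<b$ such that $l_\kappa(p_{0,b})$ is nontrivial. That is, there are distinct indices $\kappa_1,\dots,\kappa_\iota$ such that $p'_\kappa = \pm p_{\kappa_1}\pm\cdots\pm p_{\kappa_\iota}$ where the right-hand side contains at least three terms. Hence, we can always choose $\kappa_1,\kappa_2$ such that $\kappa_1,\kappa_2\ne b$, and $\kappa_1\ne\beta$. Suppose first that $\kappa\ne\beta'$. Let

$$\alpha(j) := \begin{cases}\alpha_1 & \text{for }0\le j<\beta,\\ \alpha_2 & \text{for }\beta\le j\le b,\end{cases}$$

and define $\alpha'(j)$ analogously, with $\beta'$ in place of $\beta$. Define $\alpha(\kappa_1,\kappa_2)^c$ so that $\{\alpha(\kappa_1),\alpha(\kappa_2),\alpha(\kappa_1,\kappa_2)^c\} = \{\alpha_1,\alpha_2\}$. In the case where $\{\alpha(\kappa_1),\alpha(\kappa_2)\} = \{\alpha_1,\alpha_2\}$, choose $\alpha(\kappa_1,\kappa_2)^c = \alpha_1$. Similarly define $\alpha'(\kappa)^c$ so that $\{\alpha'(\kappa),\alpha'(\kappa)^c\} = \{\alpha'_1,\alpha'_2\}$. From (3.28), we need to bound

$$\int dp_{0,b}\,d\mathbf{p}'_{b-1}\,\langle p_b\rangle^{60}|\psi_0(p_b)|^2\prod_{j=1}^{b-1}\delta\big(p'_j = l_j(p_{0,b})\big)\int d\alpha\,d\alpha'\times\prod_{J=0}^{\beta1}\frac{1}{|\alpha_1,P_J|^{\ell_J+1}}\prod_{J=\beta2}^{b}\frac{1}{|\alpha_2,P_J|^{\ell_J+1}}\prod_{J=0}^{\beta'1}\frac{1}{|\alpha'_1,P'_J|^{\ell'_J+1}}\prod_{J=\beta'2}^{b}\frac{1}{|\alpha'_2,P'_J|^{\ell'_J+1}}$$

$$\times\prod_{i=1}^{6}\frac{1}{\langle p_{k_i}\rangle^{4}\langle p'_{k'_i}\rangle^{4}}\prod_{j=1}^{b}\frac{M\lambda_0}{\langle p_{j-1}-p_j\rangle^{4}\langle p'_{j-1}-p'_j\rangle^{4}}. \tag{3.31}$$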
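First choose $k'_1 = \kappa$, $k_1 = \kappa_1$ and $k_2 = \kappa_2$. We begin by using (3.8) and making the bound

$$\sup_{\substack{\alpha'_1,\alpha'_2\\ p_0,\mathbf{p}_b,\mathbf{p}'_{b-1}}}\frac{\langle\alpha'(\kappa)^c\rangle}{\langle p'_{k'_2}\rangle^{4}}\,\frac{1}{|\alpha'_1,p'_{\beta'}|^{\ell'_{\beta'1}}\,|\alpha'_2,p'_{\beta'}|^{\ell'_{\beta'2}}\,|\alpha'_2,p_b|^{\ell'_b}\,|\alpha'(\kappa),p'_\kappa|^{\ell'_\kappa}}\times\frac{1}{|\alpha'_2,p'_{\beta'}|}\prod_{\substack{0\le j\le b-1\\ j\ne\kappa,\beta'}}\frac{1}{|\alpha'(j),p'_j|^{\ell'_j+1}}\le Ct_1^{m_1-1}t_2^{m_2}.$$

(In case of $\kappa = \beta'$, the first term on the second line is omitted.) Indeed, this estimate follows as we can pick $k'_2$ such that $|\alpha'(\kappa)^c,p'_{k'_2}|^{-1}$ is a factor in the above product. We then apply (2.36) to obtain

$$\sup_{\alpha'(\kappa)^c,\,p'_{k'_2}}\frac{\langle\alpha'(\kappa)^c\rangle}{\langle p'_{k'_2}\rangle^{4}\,|\alpha'(\kappa)^c,p'_{k'_2}|}\le\begin{cases}Ct_1 & \text{if }\kappa\ge\beta',\\ Ct_2 & \text{if }\kappa<\beta',\end{cases}$$

while using (2.38) on the remaining factors and applying $t_1\le t_2$. After estimating the initial wave function, $\|\psi_0\|^2_{30,0}\le C$, we obtain

$$(3.31)\le Ct_1^{m_1-1}t_2^{m_2}\sup_{p_b}\int dp_{0,b-1}\,d\mathbf{p}'_{b-1}\prod_{j=1;\,j\ne\kappa}^{b-1}\delta\big(p'_j = l_j(p_{0,b})\big)\times\int\frac{d\alpha_1\,d\alpha_2\,d\alpha'_1\,d\alpha'_2}{\langle\alpha_1\rangle|\alpha_1,p_\beta|\,\langle\alpha_2\rangle|\alpha_2,p_b|\,\langle\alpha'_1\rangle|\alpha'_1,p'_{\beta'}|\,\langle\alpha'_2\rangle|\alpha'_2,p_b|}$$

$$\times\frac{\delta\big(p'_\kappa = \pm p_{\kappa_1}\pm p_{\kappa_2}\pm\cdots\big)}{|\alpha'(\kappa),p'_\kappa|\,|\alpha(\kappa_1),p_{\kappa_1}|\,|\alpha(\kappa_2),p_{\kappa_2}|}\,\frac{\langle\alpha(\kappa_1)\rangle\langle\alpha(\kappa_2)\rangle\langle\alpha'(\kappa)\rangle\langle\alpha(\kappa_1,\kappa_2)^c\rangle}{\langle p'_\kappa\rangle^{4}\langle p_{\kappa_1}\rangle^{4}\langle p_{\kappa_2}\rangle^{4}\langle p_{k_3}\rangle^{4}}$$

$$\times\underbrace{\frac{1}{|\alpha_1,p_\beta|^{\ell_{\beta1}}\,|\alpha_2,p_\beta|^{\ell_{\beta2}}}\prod_{j=0;\,j\ne\beta}^{b}\frac{1}{|\alpha(j),p_j|^{\ell_j}}}_{=:(\mathrm{i})}\times\underbrace{\prod_{\substack{j=0\\ j\ne\kappa_1,\kappa_2}}^{b-1}\frac{1}{|\alpha(j),p_j|}\prod_{i=4}^{6}\frac{1}{\langle p_{k_i}\rangle^{4}}\prod_{j=1}^{b}\frac{M\lambda_0}{\langle p_{j-1}-p_j\rangle^{4}}}_{=:(\mathrm{ii})}.$$

By our assumptions that $m_1,m_2\ge1$ and $m_1+m_2\ge m_0$, we can choose $k_3$ so that the factor $|\alpha(\kappa_1,\kappa_2)^c,p_{k_3}|^{-1}$ appears in either (i), (ii) or both. We now use (2.38) to estimate the factors in (i). If $|\alpha(\kappa_1,\kappa_2)^c,p_{k_3}|^{-1}$ appears in (i) (for some choice of $k_3$), we estimate this term by (2.36),

$$\sup_{\alpha(\kappa_1,\kappa_2)^c,\,p_{k_3}}\frac{\langle\alpha(\kappa_1,\kappa_2)^c\rangle}{\langle p_{k_3}\rangle^{4}\,|\alpha(\kappa_1,\kappa_2)^c,p_{k_3}|}\le\begin{cases}Ct_1 & \text{if }\kappa_1\ge\beta,\\ Ct_2 & \text{if }\kappa_1<\beta.\end{cases}$$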
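Either way, we apply (3.8) to produce the bound $(\mathrm{i})\le t^{m-b}$. Next, we integrate $p'_j$ for $1\le j\le b-1$ except for $\kappa$, thus removing their corresponding delta functions. We then bound

$$(3.31)\le Ct_1^{m_1-1}t_2^{m_2}t^{m-b}\sup_{p_b}\int\frac{d\alpha'_2\,d\alpha_2\,d\alpha_1\,d\alpha'_1}{\langle\alpha_2\rangle|\alpha_2,p_b|\,\langle\alpha'_2\rangle|\alpha'_2,p_b|\,\langle\alpha_1\rangle\langle\alpha'_1\rangle}\Bigg[\int dp_{0,b-1}\,\frac{\delta\big(p'_\kappa = p_{\kappa_1}\pm p_{\kappa_2}\pm\cdots\big)}{|\alpha'(\kappa),p'_\kappa|\,|\alpha(\kappa_1),p_{\kappa_1}|\,|\alpha(\kappa_2),p_{\kappa_2}|}\times\frac{\langle\alpha(\kappa_1)\rangle\langle\alpha'(\kappa)\rangle\langle\alpha(\kappa_2)\rangle}{\langle p'_\kappa\rangle^{4}\langle p_{\kappa_1}\rangle^{4}\langle p_{\kappa_2}\rangle^{4}}\times\frac{1}{|\alpha'_1,l_{\beta'}(p_{0,b})|\,|\alpha_1,p_\beta|}\times(\mathrm{ii}')\Bigg], \tag{3.32}$$

where (ii′) is (ii) multiplied by $\langle\alpha(\kappa_1,\kappa_2)^c\rangle/\langle p_{k_3}\rangle^{4}$ if that factor was not used in the estimate of (i). In this case, we choose $k_3$ so that $|\alpha(\kappa_1,\kappa_2)^c,p_{k_3}|$ appears in (ii). We next integrate $p'_\kappa$ which identifies $p'_\kappa = p_{\kappa_1}\pm p_{\kappa_2}\pm\cdots$. If $g = g(|p_{\kappa_1}|)$ is a non-negative function of $|p_{\kappa_1}|$, we have the estimate with $r = |p|$:

$$\int dp_{\kappa_1}\,\frac{\langle\alpha'(\kappa)\rangle\,g(|p_{\kappa_1}|)}{\langle p_{\kappa_1}\rangle^{4}\,|\alpha'(\kappa),p_{\kappa_1}\pm p_{\kappa_2}\pm\cdots|\,|\alpha(\kappa_1),p_{\kappa_1}|\,\langle p_{\kappa_1}\pm p_{\kappa_2}\pm\cdots\rangle^{4}}\le\int_0^\infty\frac{g(r)\,r\,dr}{\langle r\rangle^{4}\,|\alpha(\kappa_1),r|}\,\frac{1}{|p_{\kappa_2}\pm\cdots|}, \tag{3.33}$$

where we have abused the notation and written $|\alpha,|p|| = |\alpha,p|$. Indeed this follows from parametrizing the angular component of $p_{\kappa_1}$ relative to that of $\pm p_{\kappa_2}\pm\cdots$ and performing the angular integration exactly as in the proof of Proposition 2.3. To apply (3.33), we choose $k_4 = \kappa_1-1$ in (3.30), and we can make the last line in (3.32) independent of the angular variable of $p_{\kappa_1}$ by estimating $\langle p_{\kappa_1-1}-p_{\kappa_1}\rangle^{-1}\le1$. The decay in the variable $p_{k_4} = p_{\kappa_1-1}$ is lost, but it is restored by the additional factor $\langle p_{k_4}\rangle^{-4}$. Our choice of $k_4$ will assure that we have enough decay factors to perform the necessary integrations. We obtain

$$(3.31)\le Ct_1^{m_1-1}t_2^{m_2}t^{m-b}\log t\times\sup_{p_b}\int\frac{d\alpha'_2\,d\alpha_1\,d\alpha'_1\,d\alpha_2}{\langle\alpha_2\rangle|\alpha_2,p_b|\,\langle\alpha'_2\rangle|\alpha'_2,p_b|\,\langle\alpha_1\rangle\langle\alpha'_1\rangle}\Bigg[\int dp_{0,\kappa_1-1}\,dp_{\kappa_1+1,b-1}\,\frac{\langle\alpha(\kappa_2)\rangle}{|\alpha(\kappa_2),p_{\kappa_2}|}\int\frac{d|p_{\kappa_1}|\,|p_{\kappa_1}|\,\langle\alpha(\kappa_1)\rangle}{\langle|p_{\kappa_1}|\rangle^{4}\,|\alpha(\kappa_1),|p_{\kappa_1}||}\,\frac{1}{|p_{\kappa_2}\pm\cdots|}\times\frac{1}{|\alpha'_1,l_{\beta'}|\,|\alpha_1,p_\beta|}\times(\mathrm{ii}'')\Bigg],$$

where (ii″) is the same as (ii′) with $\langle p_{\kappa_1-1}-p_{\kappa_1}\rangle^{-4}$ majorized by 1 and $k_4 = \kappa_1-1$. We then apply (2.34) twice to make the bound

$$\sup_{p_{0,b}}\int\frac{d\alpha'_1\,d\alpha'_2}{\langle\alpha'_1\rangle|\alpha'_1,l_{\beta'}|\,\langle\alpha'_2\rangle|\alpha'_2,p_b|}\le C(\log t)^2.$$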
We can now integrate $|p_{\kappa_1}|$ and then choose coordinates for $p_{\kappa_2}$ so that its angular component is parametrized relative to that of $p_{\kappa_3}\pm\cdots\pm p_{\kappa_\iota}$. Choosing $k_5 = \kappa_2-1$ and using $\langle p_{\kappa_2-1}-p_{\kappa_2}\rangle^{-4}\le1$ makes the remaining terms independent of this angle. We then integrate the angle, as done before, allowing us to integrate the remaining $p_j$ except for $p_\beta$. The integration is handled using (2.37) in all instances except possibly one: in the case where (ii″) contains $\langle\alpha(\kappa_1,\kappa_2)^c\rangle/\langle p_{k_3}\rangle^{4}$, we use (2.35) to handle this term. Since $|\alpha_1,p_\beta| = |\alpha(\kappa_2),|p_{\kappa_2}||$, we can use (2.34) to bound

$$\int\frac{d\alpha_1}{\langle\alpha_1\rangle\,|\alpha_1,p_\beta|}\le C\log t,$$

while integrating $|\alpha_2,p_\beta|^{-1}$ and completing the integration of $|p_{\kappa_2}|$ produces log factors. The order in which this is done will depend on whether or not $\beta = \kappa_2$. Finally we use (2.34) to estimate

$$\sup_{p_b}\int\frac{d\alpha_2}{\langle\alpha_2\rangle\,|\alpha_2,p_b|}\le C\log t.$$

Collecting these estimates completes the proof in the case where $\kappa\ne\beta'$. The case $\kappa = \beta'$ is easier to handle and can be done as above. Consequently,

$$(3.31)\le C(M\lambda_0)^{2b}\,t^{m-b}\,t_1^{m_1-1}t_2^{m_2}(\log t)^{m+O(1)},$$

which applied in (3.28), proves the first statement of Lemma 3.5. The second statement can be easily deduced from the first.

3.3. Estimate of $\varphi_k^{\mathrm{error},2}$

We next prove the amputated version of the preceding lemma which will be used, by setting $m_1 = m_0$, to estimate $\varphi_k^{\mathrm{error},2}$ in (3.3).

Lemma 3.6. Suppose $m_1>2$, $0\le m_2<m_0$ and define $m = m_1+m_2$. Let $t_1\le t_2$ and $t_1+t_2 = t = T\varepsilon^{-1}$ for $k\ge1$. We then have the bound

$$\sup_{0\le s\le t_1}E\big\|\tilde U_{m_1,m_2}(s,t_2)\psi_0\big\|^2\le C(M\lambda_0)^mT^m\Bigg[\varrho\,\frac{(\varrho t_1)^{m_1-1}(\varrho t_2)^{m_2}}{(m_1-1)!\,m_2!}+m!\,\varrho^2(\varrho t_1)^{m_1-2}(\varrho t_2)^{m_2}(\log t)^{m+O(1)}\Bigg].$$

It then follows that

$$E\big\|\varphi_k^{\mathrm{error},2}(t)\big\|^2\le C^{m_0}T^{2m_0}\Bigg[\frac{t}{n^{m_0}\,m_0!}+\frac{(2m_0)!\,(\log t)^{2m_0+O(1)}}{n^{m_0}}\Bigg].$$

Proof. The proof of the first statement is almost identical to the proof of Lemma 3.5, only that we replace $K_{m_1,m_2}(t_1,t_2;p_{0,m})$ with $\hat V_0(p_0-p_1)K_{m_1-1,m_2}(t_1,t_2;p_{1,m})$. The missing $p_0$ in the latter free kernel effectively eliminates a power of $t_1$ from the estimate of Lemma 3.5 and also reduces the effect of $m_1$ by one in the estimate.
As a technical note, the crossing estimates which are done with the aid of the α-representation (Lemma 2.1) require the kernel to have at least two momentum variables. Usually this amounts to requiring $m_j>0$ for $j = 1,2$. In the previous lemma, this was guaranteed by assumption. However, in this case, it is possible that $m_2 = 0$. Accordingly, we do not expand the kernel $K_{m_2}(t_2)$ with Lemma 2.1 but use the trivial estimate $|K_0(t_2;r_b)|\le1$, thus reducing our estimates to those without time division. Otherwise, the proof of the first statement follows in exactly the same way as the previous lemma.

To prove the second statement, we recall from the definition (2.28) that

$$U_{m_1,m_2}(t_1,t_2)\psi_0 = \int_0^{t_1}ds\,e^{-i(t_1-s)H}\,\tilde U_{m_1,m_2}(s,t_2)\psi_0.$$

The unitarity of $e^{-i(t_1-s)H}$ then implies

$$E\big\|U_{m_1,m_2}(t_1,t_2)\psi_0\big\|^2\le t_1^2\sup_{0\le s\le t_1}E\big\|\tilde U_{m_1,m_2}(s,t_2)\psi_0\big\|^2. \tag{3.34}$$
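The mechanism behind (3.34) can be illustrated numerically in a toy setting. The sketch below is an assumption-laden illustration, not the setting of the paper: it takes a hypothetical diagonal Hamiltonian on $\mathbb{C}^3$, a hypothetical deterministic vector-valued function `f(s)` in place of the amputated propagator, and approximates the Duhamel integral by a midpoint Riemann sum. It checks only the deterministic inequality $\|\int_0^{t_1}ds\,e^{-i(t_1-s)H}f(s)\|^2\le t_1^2\sup_s\|f(s)\|^2$.

```python
import cmath
import math


def duhamel_check(energies, f, t1, steps=400):
    """Return (lhs, rhs) for the unitarity bound with a diagonal Hamiltonian.

    lhs approximates || int_0^{t1} ds e^{-i(t1-s)H} f(s) ||^2 by a midpoint sum,
    rhs is t1^2 times the largest value of ||f(s)||^2 on the same grid.
    """
    ds = t1 / steps
    acc = [0j] * len(energies)
    sup_norm_sq = 0.0
    for n in range(steps):
        s = (n + 0.5) * ds
        v = f(s)
        sup_norm_sq = max(sup_norm_sq, sum(abs(z) ** 2 for z in v))
        for i, e in enumerate(energies):
            # e^{-i(t1-s)H} acts as a pure phase on each eigen-component
            acc[i] += ds * cmath.exp(-1j * (t1 - s) * e) * v[i]
    lhs = sum(abs(z) ** 2 for z in acc)
    rhs = t1 ** 2 * sup_norm_sq
    return lhs, rhs


def f(s):
    # Hypothetical smooth test function with values in C^3.
    return [math.cos(s), 1j * math.sin(2.0 * s), 0.5 * cmath.exp(1j * s)]


energies = [0.3, 1.7, 4.1]   # hypothetical eigenvalues of H
lhs, rhs = duhamel_check(energies, f, t1=2.0)
print(lhs <= rhs)
```

Because the propagator only contributes phases, the triangle inequality bounds the norm of the sum by $t_1$ times the largest norm of the integrand, which is why the crude estimate costs a full factor $t_1^2$.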
The first part of the lemma with $t_1 = t/n$, $t_2 = (k-1)t/n$ and $m_1 = m_0$ yields

$$E\big\|\varphi_k^{\mathrm{error},2}(t)\big\|^2\le m_0^2\,t_1^2\sum_{m_2=0}^{m_0-1}E\big\|\tilde U_{m_0,m_2}(t_1,t_2)\psi_0\big\|^2\le C^{m_0}T^{2m_0}\Bigg[\frac{t}{n^{m_0}\,m_0!}+\frac{(2m_0)!\,(\log t)^{2m_0+5}}{n^{m_0}}\Bigg].$$

3.4. Estimate of $\varphi_k^{\mathrm{error},3}$

We now move on to estimate the third error term in (3.3). As a rule of thumb, a genuine recollision will allow us to argue as in the estimates of the crossing term in Lemma 3.5 to eliminate a power of $t_1$. However, we will obtain a factor of $t_1^2$ when we apply crude estimates such as (3.34). Since the amputation effectively eliminates one power of $t_1$ (as in Lemma 3.6), this term will be $O(n^{-2})$ when $m_1$ is small. After summing on $k$ in (2.23), our error term will be $O(1)$ at best, which is not sufficient. Consequently, we are forced to continue the Duhamel expansion. The idea is that we will keep expanding until we either obtain another genuine recollision or we get a new collision center. The latter will produce another factor of $n^{-1}$ so that after summation on $k$ in (2.23) our term will be $O(n^{-1})$ and by choosing $n$ to be sufficiently large, this term will vanish in the limit. The case of a second recollision should be smaller by a power of time, which guarantees that this term vanishes in the limit. Intuitively, in order to have recollisions, obstacles need to be within a close vicinity of one another. Hence terms with these collision histories should be small since the probability of such configurations is higher order. If the obstacles were not within a close proximity with one another, then the wave function would need to travel very far to recollide and again, classically, we should be able to argue that the respective term is higher order. However, there is a technical difficulty which presents itself here. Viewing things classically, it is possible that two obstacles are $O(1)$ distance apart. When this
happens, our wave can collide with these obstacles one after another in succession (two or more times) and give the appearance of undergoing only one recollision. Though the probability that the obstacles have this configuration is of order $\varrho = O(\varepsilon)$, this factor is not sufficient to compensate the loss in our unitary estimate. Consequently, we need to effectively sum up the two-obstacle Born series to account for this.

Not all pairs of recollisions need to be treated in this manner. If the original collision sequence is given by $(\alpha_1,\dots,\alpha_m)$ and we obtain a new collision center which is a recollision at $\alpha_{\kappa_1}$ followed by another new collision center which is a recollision at $\alpha_{\kappa_2}$, we will immediately be able to argue that the terms corresponding to the case where $\kappa_2>\kappa_1$ are small on the basis that this collision pattern is higher order. Indeed, in order to have a genuine recollision and not an internal collision, we need $\kappa_1\ge2$. This implies that there is at least one more obstacle in the vicinity of $\alpha_{\kappa_1}$ and $\alpha_{\kappa_2}$. The probability of this configuration occurring is higher order. Hence we will only sum the two-obstacle Born series in the case where $\kappa_2<\kappa_1$ (they can never be equal since we already summed over internal collisions).

Before we precisely describe the final stopping rule for our Duhamel expansion, we need to define propagators associated to more complicated collision patterns. Given $A$ of size $m$, let $m_1$ and $m_2$ be non-negative integers with $m_1+m_2 = m$. For $n_1\ge2$, define

$$A(n_1) := \underbrace{(\dots,\alpha_{\kappa_1},\alpha_{\kappa_2},\alpha_{\kappa_1})}_{n_1}.$$

This will be the sequence of centers associated to the pair collision mentioned above. The propagator associated with the pair recollision is defined as

$$U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0(p_0) := \int du_{0,n_1-2}\,dr_{0,m}\,K_{n_1+m_1,m_2}(t_1,t_2;p_0,u_{0,n_1-2},r_{0,m})\times\chi(A(n_1)\oplus A;p_0,u_{0,n_1-2},r_{0,m})\,\psi_0(r_m). \tag{3.35}$$

The number $n_1$ in brackets indicates the number of pair recollisions. For the free propagator kernel, we applied the definition (2.25) in the following form

$$K_{n_1+m_1,m_2}(t_1,t_2;p_0,u_{0,n_1-2},r_{0,m}) = K_{n_1+m_1}(t_1;p_0,u_{0,n_1-2},r_{0,m_1})\,K_{m_2}(t_2;r_{m_1,m})$$

(see Fig. 3 for the order of momentum variables).

Fig. 3. Resummation of two obstacles.
Summing over $1\le\kappa_2<\kappa_1\le m$ and $A$ gives

$$U^{\circ}_{[n_1],m_1,m_2}(t_1,t_2) := \sum_{A:|A|=m}\;\sum_{\substack{\kappa_1,\kappa_2\\ 1\le\kappa_2<\kappa_1\le m}}U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2).$$

The propagators for $n_1 = 1$ are defined as

$$U^{\circ;\kappa}_{[1],m_1,m_2;A}(t_1,t_2)\psi_0 := \int dr_{0,m}\,K_{m_1+1,m_2}(t_1,t_2;p_0,r_{0,m})\,\chi(\alpha_\kappa\oplus A;p_0,r_{0,m})\,\psi_0(r_m),$$

$$U^{\circ}_{[1],m_1,m_2}(t_1,t_2) := \sum_{\kappa=2}^{m}\;\sum_{A:|A|=m}U^{\circ;\kappa}_{[1],m_1,m_2;A}(t_1,t_2).$$

Note that these are the fully expanded versions of the truncated one recollision terms (3.1) and (3.2). We will also need to define the amputated version of the two recollision propagator

$$\tilde U^{\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0 := \int du_{0,n_1-2}\,dr_{0,m}\,\hat V_0(p_0-u_0)\,K_{n_1+m_1-1,m_2}(t_1,t_2;u_{0,n_1-2},r_{0,m})\times\chi(A(n_1)\oplus A;p_0,u_{0,n_1-2},r_{0,m})\,\psi_0(r_m) \tag{3.36}$$

and

$$\tilde U^{\kappa_1,\kappa_2}_{[n_1],m_1,m_2}(t_1,t_2) = \sum_{A:|A|=m}\tilde U^{\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2).$$

The next propagators are associated with the pair recollision pattern followed by a new collision with $\alpha_0$. For $n_1\ge2$, we define

$$\tilde U^{\kappa_1,\kappa_2}_{[n_1],m_1,m_2;\alpha_0,A}(t_1,t_2)\psi_0(p_0) := \int dp_1\,du_{0,n_1-2}\,dr_{0,m}\,\hat V_0(p_0-p_1)\,K_{n_1+m_1,m_2}(t_1,t_2;p_1,u_{n_1-2},r_{0,m})\times\chi(\alpha_0\oplus A(n_1)\oplus A;p_{0,1},r_{0,m})\,\psi_0(r_m), \tag{3.37}$$

and the summed up version

$$\tilde U_{1,[n_1],m_1,m_2}(t_1,t_2) := \sum_{\hat A:|\hat A|=m+1}\;\sum_{\substack{\kappa_1,\kappa_2\\ 1\le\kappa_2<\kappa_1\le m}}\tilde U^{\kappa_1,\kappa_2}_{[n_1],m_1,m_2;\alpha_0,A}(t_1,t_2),$$

where the sum is over sets $\hat A := \alpha_0\oplus A$ with non-repeating indices. Note that the order of the subscripts $1,[n_1],m_1,m_2$ indicates the chronological order of the collision types, and the bracket indicates the number of pair recollisions. For the special case $n_1 = 1$, the propagators are defined as

$$\tilde U^{\kappa}_{[1],m_1,m_2;\alpha_0,A}(t_1,t_2)\psi_0 := \int dp_{0,1}\,dr_{0,m}\,K_{m_1+1,m_2}(t_1,t_2;p_1,r_{0,m})\times\chi(\alpha_0\oplus\alpha_\kappa\oplus A;p_{0,1},r_{0,m})\,\psi_0(r_m)$$
and

$$\tilde U_{1,[1],m_1,m_2}(t_1,t_2) := \sum_{\kappa=2}^{m}\;\sum_{\hat A:|\hat A|=m+1}\tilde U^{\kappa}_{[1],m_1,m_2;\alpha_0,A}(t_1,t_2).$$

Finally we have the propagators corresponding to the pair recollisions followed by a genuine recollision with $\alpha_{\kappa_3}$,

$$\tilde U^{\kappa_1,\kappa_2,\kappa_3}_{*,[n_1],m_1,m_2;A}(t_1,t_2)\psi_0(p_0) := \int dp_1\,du_{0,n_1-2}\,dr_{0,m}\,\hat V_0(p_0-p_1)\,K_{n_1+m_1,m_2}(t_1,t_2;p_1,u_{n_1-2},r_{0,m})\times\chi(\alpha_{\kappa_3}\oplus A(n_1)\oplus A;p_{0,1},u_{0,n_1-2},r_{0,m})\,\psi_0(r_m), \tag{3.38}$$

and

$$\tilde U_{*,[n_1],m_1,m_2}(t_1,t_2) := \sum_{\kappa_1=2}^{m}\;\sum_{\kappa_2=1}^{\kappa_1-1}\;\sum_{\substack{\kappa_3=1\\ \kappa_3\ne\kappa_1,\kappa_2}}^{m}\;\sum_{A:|A|=m}\tilde U^{\kappa_1,\kappa_2,\kappa_3}_{*,[n_1],m_1,m_2;A}(t_1,t_2).$$

The condition on $\kappa_3$ assures us that the new recollision is unrelated to the pair recollision. The star indicates the new recollision that is independent of the pair recollisions.

If $\tilde U(t_1,t_2)$ is any one of the amputated propagators defined above, its corresponding full propagator is defined as

$$U(t_1,t_2) := \int_0^{t_1}ds\,e^{-i(t_1-s)H}\,\tilde U(s,t_2). \tag{3.39}$$

In our notation, summation over appropriate ranges of a particular index removes that index. For example, when the pair recollision indices $\kappa_1,\kappa_2$ do not appear explicitly, then the summation over $1\le\kappa_2<\kappa_1\le m$ has been performed. If we sum over a different set of $\kappa_1,\kappa_2$ (as we will below), the summation will appear explicitly.

We now give a precise stopping rule for the expansion of the recollision term $U^{\mathrm{rec}}_{m_1,m_2}(t_1,t_2)$ defined in (3.2). Dropping the explicit dependence on $(t_1,t_2)$ in our propagators, we expand beyond the first recollision center and we obtain

$$U^{\mathrm{rec}}_{m_1,m_2} = U^{\circ}_{[1],m_1,m_2}+\sum_{2\le\kappa_1,\kappa_2\le m}U^{\kappa_1,\kappa_2}_{[2],m_1,m_2}+U_{1,[1],m_1,m_2}.$$

The first term corresponds to the fully expanded term after the first recollision. The second term is the pair recollision. The third term is a single recollision ($n_1 = 1$) followed by a fresh collision. The second term will be split according to $\kappa_1<\kappa_2$ or $\kappa_2<\kappa_1$. In the easier case, when $\kappa_1<\kappa_2$, one can use the unitarity on the full evolution already after the second recollision ($n_1 = 2$). When $\kappa_2<\kappa_1$ we have to continue the expansion of this term. We stop when we obtain a brand new collision center or if we have a recollision at a center $\alpha_{\kappa_3}\ne\alpha_{\kappa_1},\alpha_{\kappa_2}$. Internal recollisions are not counted (they are
summed as before) and we only expand according to centers. Formally, this gives the identity

$$U^{\mathrm{rec}}_{m_1,m_2} = \sum_{n_1=1}^{\infty}U^{\circ}_{[n_1],m_1,m_2}+\sum_{2\le\kappa_1<\kappa_2\le m}U^{\kappa_1,\kappa_2}_{[2],m_1,m_2}+\sum_{n_1=1}^{\infty}U_{1,[n_1],m_1,m_2}+\sum_{n_1=2}^{\infty}U_{*,[n_1],m_1,m_2}. \tag{3.40}$$

From the estimates below, it will follow that these series converge to $U^{\mathrm{rec}}_{m_1,m_2}$.

Lemma 3.7. If $1\le k\le n$, we have

$$E\big\|\varphi_k^{\mathrm{error},3}(t)\big\|^2\le CT^{4m_0}(\log t)^{O(1)}\Bigg[\frac{1}{n^3}+\varrho\,(2m_0)!\,(\log t)^{2m_0}\Bigg].$$

Proof. Applying the Schwarz inequality to (3.40), we have the bound

$$E\big\|U^{\mathrm{rec}}_{m_1,m_2}(t_1,t_2)\psi_0\big\|^2\le\sum_{n_1=1}^{\infty}n_1^2\,E\big\|U^{\circ}_{[n_1],m_1,m_2}(t_1,t_2)\psi_0\big\|^2+\sum_{2\le\kappa_1<\kappa_2\le m}m^2\,E\big\|U^{\kappa_1,\kappa_2}_{[2],m_1,m_2}(t_1,t_2)\psi_0\big\|^2+\sum_{n_1=1}^{\infty}n_1^2\,E\big\|U_{1,[n_1],m_1,m_2}(t_1,t_2)\psi_0\big\|^2+\sum_{n_1=2}^{\infty}n_1^2\,E\big\|U_{*,[n_1],m_1,m_2}(t_1,t_2)\psi_0\big\|^2.$$

These four terms are estimated in the following technical lemmas, whose proofs are given in the next section. Here we present only a short explanation after each lemma.

Lemma 3.8. Suppose $m_1+m_2\ge2$ and $n_1\ge1$. Then

$$E\big\|U^{\circ}_{[n_1],m_1,m_2}(t_1,t_2)\psi_0\big\|^2\le(M\lambda_0)^{n_1+m}\,\varrho\,T^{2m-2}(\log t)^{O(1)}\Bigg[\frac{T}{n}+\varrho\,m!\,(\log t)^m\Bigg].$$

This term is small by a factor $\varrho$ that comes from the recollision. The first term in the square bracket corresponds to the direct term. Since there is a new collision in the short time interval $[t_2,t_2+t_1)$, this will provide an extra factor $t_1$ and hence the factor $1/n$. All the other crossing terms carry an extra $\varrho$.

Lemma 3.9. Suppose $m_1+m_2\ge2$ and $n_1\ge1$. We have the bound

$$\sup_{0\le s\le t_1}E\big\|\tilde U_{1,[n_1],m_1,m_2}(s,t_2)\psi_0\big\|^2\le(M\lambda_0)^{n_1+m}\,\varrho^2\,T^{2m-2}(\log t)^{O(1)}\Bigg[\frac{T}{n}+\varrho\,m!\,(\log t)^m\Bigg]. \tag{3.41}$$

This estimate is similar to the one in Lemma 3.8; the additional factor $\varrho$ comes from the amputation.
July 26, 2005 15:22 WSPC/148-RMP
J070-00242
The Linear Boltzmann Equation as the LDL of a Random Schrödinger Equation
705
Lemma 3.10. Let $m_1 + m_2 \ge 2$. Then
$$\sup_{0\le s\le t_1} \sum_{2\le\kappa_1<\kappa_2\le m} \mathbf{E}\big\|\tilde U^{\kappa_1,\kappa_2}_{[2],m_1,m_2}(s,t_2)\psi_0\big\|^2 \le (M\lambda_0)^m\, m!\,\varrho^3\,T^{2m-3}(\log t)^{m+O(1)}.$$
In this estimate we gain $\varrho^2$ from the two recollisions and an additional $\varrho$ from the amputation. We will not have to distinguish direct and crossing terms.

Lemma 3.11. Given $n_1 \ge 2$ and $m_1 + m_2 \ge 2$, we have the bound
$$\sup_{0\le s\le t_1} \mathbf{E}\big\|\tilde U_{*,[n_1],m_1,m_2}(s,t_2)\psi_0\big\|^2 \le m!\,(M\lambda_0)^{n_1+m}\,\varrho^3\,T^{2m-3}(\log t)^{m+O(1)}$$
for the propagator defined in (3.38).

The amputated propagator we estimate here corresponds to the case of having two genuine recollisions. Each recollision will yield a factor of $\varrho$ by utilizing a nontrivial pairing relation. Recalling the discussion at the beginning of Sec. 3.4, we will treat the pair recollision as one genuine recollision. Hence we gain a factor of $\varrho^2$ from the recollisions and an extra $\varrho$ from the amputation.

Applying these lemmas, recalling the relation between $U$ and $\tilde U$ from (3.39), using the unitarity estimate, performing the sums over $n_1$ and using $M\lambda_0 < 1$, we get
$$\mathbf{E}\big\|U^{\mathrm{rec}}_{m_1,m_2}(t_1,t_2)\psi_0\big\|^2 \le (M\lambda_0)^m\,T^{2m-2}(\log t)^{O(1)}\Big[\frac{T^3}{n^3} + m!\,\varrho\,T^2(\log t)^m\Big].$$
Consequently,
$$\mathbf{E}\big\|\varphi^{\mathrm{error},3}_k(t)\big\|^2 = \mathbf{E}\Big\|\sum_{m_1=0}^{m_0-1}\ \sum_{m_2=(2-m_1)_+}^{m_0-1} U^{\mathrm{rec}}_{m_1,m_2}(t_1,t_2)\psi_0\Big\|^2 \le C\,T^{4m_0}(\log t)^{O(1)}\Big[\frac{1}{n^3} + (2m_0)!\,\varrho\,(\log t)^{2m_0}\Big],$$
which proves Lemma 3.7.

Proof of Lemma 3.1. Recall that $\varrho = \varrho_0\varepsilon$ and $t = T\varepsilon^{-1}$. We have from (2.23),
$$\Psi^{\mathrm{error}}_{m_0}(t) = \sum_{k=1}^{n} e^{-i\frac{(n-k)t}{n}H}\,\varphi^{\mathrm{error}}_k(t).$$
Using the unitarity of the evolution generated by $H$ and the Schwarz inequality, we have
$$\mathbf{E}\big\|\Psi^{\mathrm{error}}_{m_0}(t)\big\|^2 \le C\sum_{k=1}^{n} k^{3/2}\Big(\mathbf{E}\big\|\varphi^{\mathrm{error},1}_k(t)\big\|^2 + \mathbf{E}\big\|\varphi^{\mathrm{error},2}_k(t)\big\|^2 + \mathbf{E}\big\|\varphi^{\mathrm{error},3}_k(t)\big\|^2\Big).$$
Lemmas 3.5, 3.6 and 3.7 imply 2 E Ψerror m0 (t)
" n3/2 + (2m0 )!n5/2 ε(log t)2m0 ≤ C T (log t) n−1/2 + m0 ! # 1 (2m0 )!(log t)2m0 + m −5/2 + . n 0 m0 !ε nm0 −5/2 m0
4m0
O(1)
It is easy to see that setting our parameters to
$$m_0 := \frac{|\log\varepsilon|}{10\log|\log\varepsilon|}, \tag{3.42}$$
$$n := \varepsilon^{-\frac{1}{100}}, \tag{3.43}$$
guarantees that $\mathbf{E}\|\Psi^{\mathrm{error}}_{m_0}(T/\varepsilon)\|^2 \to 0$ as $\varepsilon \to 0$.

4. Proof of the Technical Error Estimates

In this section, we prove the four technical lemmas that were needed to complete the argument in the preceding section. We will discuss Lemma 3.8 in detail, then we explain the necessary modifications to prove the other three lemmas. Since several arguments are very similar, we will not repeat them in each case.

4.1. Proof of Lemma 3.8 for n1 ≥ 2

We start by computing the expectation value as in Lemma 3.3.
$$\begin{aligned}
\mathbf{E}\Big\|\sum_{1\le\kappa_2<\kappa_1\le m}\ \sum^{\text{no rec}}_{A:|A|=m} U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0\Big\|^2
={}& \sum_{1\le\kappa_2<\kappa_1\le m}\ \sum_{1\le\kappa'_2<\kappa'_1\le m}\ \sum_{\substack{B:|B|=b\\ 0\le b\le m}}\ \sum_{\sigma\in S(b)}\ \sum_{(A,A')} \\
&\times \mathbf{E}\int dp_0\ \overline{U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0(p_0)}\ U^{\circ;\kappa'_1,\kappa'_2}_{[n_1],m_1,m_2;A'}(t_1,t_2)\psi_0(p_0),
\end{aligned}$$
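where the sum on $(A, A')$ is short for summing over ordered sets $A$, $A'$ of size $m$ with non-repeating elements, such that $A \cap A' = B$, $B \prec A$ and $\sigma(B) \prec A'$. Using independence of the obstacles and the Schwarz inequality, we have
$$\begin{aligned}
\mathbf{E}\big\|U^{\circ}_{[n_1],m_1,m_2}(t_1,t_2)\psi_0\big\|^2
\le{}& \sum_{1\le\kappa_2<\kappa_1\le m}\ \sum_{1\le\kappa'_2<\kappa'_1\le m}\ \sum_{\substack{B:|B|=b\\ 0\le b\le m}}\ \sum_{\sigma\in S(b)}\ \sum_{(A,A')} \\
&\times \mathbf{E}_B\Big[\big\|\mathbf{E}_{A\setminus B}U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0\big\|^2 + \big\|\mathbf{E}_{A'\setminus B}U^{\circ;\kappa'_1,\kappa'_2}_{[n_1],m_1,m_2;A'}(t_1,t_2)\psi_0\big\|^2\Big] \\
\le{}& \sum_{\substack{B:|B|=b\\ 0\le b\le m}}\ \sum_{\substack{A:|A|=m\\ B\prec A}}\ \sum_{1\le\kappa_2<\kappa_1\le m} C(N,m,b)\,\mathbf{E}_B\big\|\mathbf{E}_{A\setminus B}U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0\big\|^2 \\
:={}& (I) + (II)
\end{aligned}$$
for
$$\begin{aligned}
(I) &:= \sum_{\substack{B:|B|=b\\ 0\le b\le m}}\ \sum_{\substack{A:|A|=m\\ B\prec A}}\ \sum_{\substack{1\le\kappa_2<\kappa_1\le m\\ \{\alpha_{\kappa_1},\alpha_{\kappa_2}\}\subseteq B}} C(N,m,b)\,\mathbf{E}_B\big\|\mathbf{E}_{A\setminus B}U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0\big\|^2, \\
(II) &:= \sum_{\substack{B:|B|=b\\ 0\le b\le m}}\ \sum_{\substack{A:|A|=m\\ B\prec A}}\ \sum_{\substack{1\le\kappa_2<\kappa_1\le m\\ \{\alpha_{\kappa_1},\alpha_{\kappa_2}\}\not\subseteq B}} C(N,m,b)\,\mathbf{E}_B\big\|\mathbf{E}_{A\setminus B}U^{\circ;\kappa_1,\kappa_2}_{[n_1],m_1,m_2;A}(t_1,t_2)\psi_0\big\|^2,
\end{aligned} \tag{4.1}$$
and
$$C(N,m,b) := \binom{N-m}{m-b}\,\frac{m!\,(m-1)(m-2)}{2}.$$
We will first treat term (I). Recalling (3.10), define $(\gamma_1, \gamma_2)$ such that $\kappa_1 = b(\gamma_1)$ and $\kappa_2 = b(\gamma_2)$. This means that $\alpha_{\kappa_1}$ falls in between the $\gamma_1$th and $(\gamma_1 + 1)$th element of $B$ in the sequence of centers, and a similar statement holds for $\alpha_{\kappa_2}$. Taking the expectation as in the proof of Lemma 3.3, we obtain the bound
$$(I) \le C^m \sum_{b=2}^{m}\ \sum_{\ell_{0,b}}\ \sum_{1\le\gamma_2<\gamma_1\le b} b!\;W_{n_1}(t_1,t_2;\gamma_1,\gamma_2,b,\ell_{0,b}), \tag{4.2}$$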
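where
$$\begin{aligned}
W_{n_1}(t_1,t_2;\gamma_1,\gamma_2,b,\ell_{0,b}) :={}& \varrho^{2m-b}\int dp_0\,du_{0,n_1-2}\,du'_{0,n_1-2}\,dr_{0,b}\,dr'_{0,b}\ \Delta\big(u_{0,n_1-2},u'_{0,n_1-2},r_{0,b},r'_{0,b}\big) \\
&\times \overline{\psi_0(r_b)}\,\psi_0(r'_b)\ \overline{K\big(t_1,t_2;p_0,u_{0,n_1-2},r^{\ell_{0,b}}_{0,b}\big)}\ K\big(t_1,t_2;p_0,u'_{0,n_1-2},r'^{\,\ell_{0,b}}_{0,b}\big),
\end{aligned} \tag{4.3}$$
where the vector $\ell_{0,b}$ is defined in the proof of Lemma 3.3 and
$$\begin{aligned}
\Delta\big(u_{0,n_1-2},u'_{0,n_1-2},r_{0,b},r'_{0,b}\big)
={}& \delta\Big(-\sum_{j=0}^{n_1-2}(-1)^{n_1-j}(u_j-u'_j) - (r_0-r'_0) + (r_{\gamma_2}-r'_{\gamma_2})\Big) \\
&\times \prod_{j=1}^{\gamma_2-1}\delta\big(r'_j = r_j - (r_0-r'_0)\big)\prod_{j=\gamma_2+1}^{\gamma_1-1}\delta\big(r'_j = r_j - (r_{\gamma_2}-r'_{\gamma_2})\big)\prod_{j=\gamma_1}^{b}\delta\big(r'_j = r_j\big) \\
:={}& \Delta_1\big(u_{0,n_1-2},u'_{0,n_1-2},r_0,r'_0,r_{\gamma_2},r'_{\gamma_2}\big)\times\Delta_{2,b}\big(r_{0,b},r'_{0,b}\big).
\end{aligned} \tag{4.4}$$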
Fig. 4. Feynman diagram example for term (I).
The decomposition above separates the first pairing relation, $\Delta_1$, which is the only relation containing the variables $u_{0,n_1-2}$ and $u'_{0,n_1-2}$, from the remaining $b-1$ relations, $\Delta_{2,b}$. Note that $W_{n_1}$ also depends on $m_1$, $m_2$, but these parameters are determined by the variables $b$ and $\ell_{0,b}$, so they will be omitted from the notation. Figure 4 shows the Feynman diagram when $(\gamma_1,\gamma_2,n_1) = (3,1,4)$. The dashed lines in the picture indicate identical centers. The time division line is not shown; it can cut the sequence of filled obstacles ($r$-momenta lines) anywhere, as in Fig. 2. We will now bound $W_{n_1}(t_1,t_2;\gamma_1,\gamma_2,b,\ell_{0,b})$ by considering several cases in the following subsections.

4.1.1. Term (I), case (γ1, γ2) = (2, 1), b ≤ 4, 2 ≤ n1 ≤ 8

Recall the notation introduced in (3.14). We apply (2.32) to expand our time-divided kernels.$^{a}$ Using the notation defined in (3.26), we write $W_{n_1}(t_1,t_2;2,1,b,\ell_{0,b})$
1 1
= 2m−b dα dα dp0 dr0,b dr 0,b ∆2,b (r0,b , r 0,b )ψ0 (rb )ψ0 rb
[α1 , p0 ] α , p0 1
J b β1 B(α2 , RJ , RJ )J B α 2 , R , R J B(α1 , RJ , RJ )J B α 1 , RJ , RJ J J × J +1 J +1 [α1 , RJ ]J +1 [α2 , RJ ]J +1 α1 , RJ α2 , RJ J=0 J=β2 × B(α1 , r0,β )B α1 , r0,β B(α2 , rβ,b )B α2 , rβ,b × du0,n1 −2 du 0,n1 −2 ∆1 u0,n1 −2 , u 0,n1 −2 , r0 , r0 , rγ2 , rγ 2 1 −2 n × B(α1 ; p0 , u0,n1 −2 , r0 )B α1 ; p0 , u0,n1 −2 , r0
j=0
1 1 [α1 , uj ] α , u j 1
.
(4.5)
a In the special case of m = 0, we use the trivial estimate |K(t ; r )| ≤ 1 and our subsequent 2 2 b estimates will be similar but easier as we do not have divided time.
Notice that by (3.11) and (3.10), $\beta$ is determined by $m_1$, $m_2$ and $\ell_{0,b}$. We also defined $\alpha := (\alpha_1,\alpha_2)$ with an analogous definition for $\alpha'$. Taking absolute values into all of the integrals, we use $\|\psi_0\|^2_{30,0} \le C$ and Lemma 5.2 to get the bound $W_{n_1}(t_1,t_2;2,1,b,\ell_{0,b})$
≤ C(M λ0 )2m−2b 2m−b sup × ×
rb
dα dα
β1 J=0
×
×
dp0 du0,n1 −2 du 0,n1 −2 dr0,b−1 dr 0,b ∆2 r0,b , r 0,b n −2
1 1 1 ∆ u , u , r , r , r , r 1 0,n −2 0 1 1 0,n1 −2 0 1 |α1 , p0 | |α1 , p0 | |α1 , uj | |α 1 , u j | j=0
b 1 1 1 1 |α1 , RJ |J +1 |α 1 , RJ |J +1 |α2 , RJ |J +1 |α 2 , RJ |J +1 J=β2
n1−2 b 5 1 M λ0 M λ0 1 M λ0 p0 − u0 4 u0 4 j=1 uj−1 − uj 4 j=1 rj−1 − rj 4 i=1 rki 4
M λ0
! 4
p0 − u 0
n1 −2 b 5 1 M λ0 M λ0 1 ! ! ' ( , 4 4 4 u0 j=1 u − u j=1 r − r i=1 4 j−1 j j−1 j rk
(4.6)
i
where the last two lines were obtained by estimating the potential terms using Lemma 5.2 and (3.30). As in the crossing estimate of Lemma 3.5, the indices ki , ki , for 1 ≤ i ≤ 5 will be chosen in the following estimates. Let k1 = k1 = 0, k2 = k2 = γ2 and k3 = k3 = β, the other k-values are arbitrary. We begin by using (2.38) and (3.8) to estimate β1 J=0
b 1 1 ≤ t2m−2b , |α1 , RJ |J |α 1 , RJ |J |α2 , RJ |J |α 2 , RJ |J
(4.7)
J=β2
where we also include the factor
α2 α2 4 rβ 4 rβ
and apply (2.36) twice if β2 > 0.
Let σ ∈ {0, 1} with σ = β. Note that for an appropriately chosen v, we have ∆1 u0,n1 −2 , u 0,n1 −2 , r0 , r0 , rγ2 , rγ 2 = δ u0 − u 0 ± rσ + v . We use the bound sup
du0 du 0 drσ α1 α 1 α (σ)δ u0 − u 0 ± rσ + v u0 4 u 0 4 rσ 4 |α 1 , u 0 | |α (σ), rσ | |α1 , u0 | v,α1 ,α1 drσ α (σ) 1 2 ≤ C(log t) sup 4 rσ |α (σ), rσ | |rσ ± v| + η v,α1 ,α1 ≤ C(log t)3 ,
(4.8)
where $\alpha'(\sigma) = \alpha'_1$ for $\beta \ne 0$ and $\alpha'(\sigma) = \alpha'_2$ for $\beta = 0$. In the second line, we use Proposition 2.3 to perform the $du_0$ and $du'_0$ integrals. This allows us to estimate the remaining integrals of $u_{n_1-2}$, $u'_{n_1-2}$ using (2.37). Using (2.38), we make the estimate β−1 1 1 |α1 , p0 | j=2 |α 1 , rj |
b−1 j=β+1;=1
1 ≤ |α 2 , rj |
$ t1 tb−3 t1 t
b−2
1 < β < b, otherwise,
(4.9)
and then integrate over the variables $r'_{2,b}$, which gets rid of the (trivial) pairings in $\Delta_2$. We now apply (2.37) to handle the integration in $p_0$ and $r_j$, for all $j \ne \beta, b$. The remaining estimates depend on $\beta$. If $\beta \le 1$, we apply (2.34) to get
$$\sup_{r_\beta,r'_\beta}\int \frac{d\alpha_1\,d\alpha'_1}{\langle\alpha_1\rangle\langle\alpha'_1\rangle}\,\frac{1}{|\alpha_1,r_\beta|\,|\alpha'_1,r'_\beta|} \le C(\log t)^2,$$
allowing us to use (2.35) on the remaining factors of $r_\beta$ and $r'_\beta$. This leaves us to apply (2.34) twice more to handle the factors in $r_b$ and the integration in $\alpha_2$ and $\alpha'_2$. When $1 < \beta < b$, we apply (2.37) to integrate the remaining factors in $r'_{0,1}$. The four $r_\beta$ factors are handled by applying (2.36) on $|\alpha'_2,r'_\beta|^{-1}$, (2.34) on $|\alpha_1,r_\beta|^{-1}$ and $|\alpha'_1,r'_\beta|^{-1}$, followed by applying (2.35) on $|\alpha_2,r_\beta|^{-1}$. The remaining factors are handled as before. Finally, if $\beta = b$, we integrate the remaining factors in $r'_{0,1}$, then treat the last four factors in $r_b$ by a combination of (2.34), (2.35) and (2.36) as we did before. In all cases for $\beta$, we get the estimate
$$W_{n_1}(t_1,t_2;2,1,b,\ell_{0,b}) \le (M\lambda_0)^{n_1+m}\,\varrho\,(\varrho t_1)(\varrho t)^{2m-b-2}(\log t)^{b+2n_1+O(1)}. \tag{4.10}$$
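One way to see why an exponent of $\log t$ that grows like $2n_1$ becomes a problem for large $n_1$ is to isolate the $n_1$-dependent part of (4.10), namely $(M\lambda_0)^{n_1}(\log t)^{2n_1}$, and look at it as a geometric sequence in $n_1$. The sketch below does only that; the numerical values of $M\lambda_0$, $T$ and $\varepsilon$ are hypothetical, all other factors of the bound are ignored, and the function names are assumptions made for illustration.

```python
import math

def log_power_term(n1, m_lambda0, log_t):
    """n1-dependent part of a bound of the form (M*lambda0)^n1 * (log t)^(2*n1)."""
    return (m_lambda0 * log_t**2) ** n1

def partial_sum(n_max, m_lambda0, log_t):
    """Partial sum of the n1-dependent factor over 2 <= n1 <= n_max."""
    return sum(log_power_term(n1, m_lambda0, log_t) for n1 in range(2, n_max + 1))

# Hypothetical values: M*lambda0 = 0.1 and t = T/eps with T = 1, eps = 1e-6.
m_lambda0 = 0.1
log_t = math.log(1.0 / 1e-6)
ratio = m_lambda0 * log_t**2  # common ratio of the geometric sequence in n1

# A ratio above 1 means the terms grow with n1, so the smallness of M*lambda0
# alone cannot make the sum over n1 converge once log t is large.
print(ratio > 1.0)
print(partial_sum(8, m_lambda0, log_t) < partial_sum(16, m_lambda0, log_t))
```

For bounded $n_1$ the extra logarithms are absorbed into $(\log t)^{O(1)}$; for unbounded $n_1$ a bound whose power of $\log t$ does not depend on $n_1$ is needed.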
This estimate is sufficient when $2 \le n_1 \le 8$, since then $n_1 = O(1)$ in our power of $\log t$. However, for $n_1 > 8$ we will need to introduce the two-obstacle Born series term to ensure that our power of $\log t$ does not grow too much. This is treated in the next case.

4.1.2. Term (I), case (γ1, γ2) = (2, 1), b ≤ 4, n1 > 8

We begin by expressing our first pairing relation in (4.4) as
$$\Delta_1\big(u_{0,n_1-2},u'_{0,n_1-2},\ldots\big) = \int d\nu\ e^{i\nu\left[-\sum_{j=0}^{n_1-2}(-1)^{n_1-j}(u_j-u'_j) - (r_0-r'_0) + (r_1-r'_1)\right]}.$$
Defining
$$\begin{aligned}
B_{n_1,\nu}(\alpha_1,p_0,r_0) :={}& \int du_{0,n_1-2}\ B(\alpha_1,p_0,u_0)\,\frac{B(\alpha_1,u_0,u_1)\,e^{i\nu u_0}}{\alpha_1-u_0^2/2+i\eta_1}\,\frac{B(\alpha_1,u_1,u_2)\,e^{-i\nu u_1}}{\alpha_1-u_1^2/2+i\eta_1} \\
&\times\cdots\times\frac{B(\alpha_1,u_{n_1-2},r_0)\exp\big((-1)^{n_1-2}\,i\nu u_{n_1-2}\big)}{\alpha_1-u_{n_1-2}^2/2+i\eta_1},
\end{aligned} \tag{4.11}$$
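the bracketed integral in (4.5) can be expressed as
$$\begin{aligned}
&\int du_{0,n_1-2}\,du'_{0,n_1-2}\ \Delta_1\big(u_{0,n_1-2},\ldots\big)\prod_{j=0}^{n_1-2}\frac{1}{[\alpha_1,u_j]\,[\alpha'_1,u'_j]}\ B\big(\alpha_1;p_0,u_{0,n_1-2},r_0\big)\,B\big(\alpha'_1;p_0,u'_{0,n_1-2},r'_0\big) \\
&\qquad = \int d\nu\ e^{-i\nu[(r_0-r'_0)-(r_1-r'_1)]}\,B_{n_1,\nu}(\alpha_1,p_0,r_0)\,B_{n_1,\nu}(\alpha'_1,p_0,r'_0).
\end{aligned}$$
By Lemma 5.3, we have the bound
$$\sup_{r_1,r'_1}\Big|\int d\nu\ e^{-i\nu[(r_0-r'_0)-(r_1-r'_1)]}\,B_{n_1,\nu}(\alpha_1,p_0,r_0)\,B_{n_1,\nu}(\alpha'_1,p_0,r'_0)\Big| \le \frac{(M\lambda_0)^{n_1}}{\langle p_0-r_0\rangle^{30}\,\langle p_0-r'_0\rangle^{30}},$$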
which, in light of (3.30), allows us to follow the technique presented in the case $2 \le n_1 \le 8$ to complete the estimate. Here, we no longer have the variables $u_{0,n_1-2}$, $u'_{0,n_1-2}$, nor the pairing relation $\Delta_1$ which relates them. In place of (4.8), we simply use (2.36) to estimate the factors containing $r_\sigma$. We mention that we will need to apply (2.35) to bound $|\alpha_1,p_0|^{-1}$ and (2.36) to bound $|\alpha'_1,p_0|^{-1}$. The rest of the details differ trivially from the previous case and are left as an exercise. The result is, for $n_1 \ge 2$ and $b \le 4$, the bound
$$W_{n_1}(t_1,t_2;2,1,b,\ell_{0,b}) \le (M\lambda_0)^{n_1+m}\,\varrho\,(\varrho t_1)(\varrho t_2)^{2m-b-2}(\log t)^{b+O(1)} \le (M\lambda_0)^{n_1+m}\,\frac{\varrho\,(\varrho t)^{2m-b-1}}{n}(\log t)^{O(1)}. \tag{4.12}$$
The first inequality actually holds for all $b$. However, immediate application in (4.3), after summation with the $b!$ prefactor, creates a factor of $m!$ in our estimate, which is too large. This will be avoided by observing that for large $b$, most of our pairings are direct pairings $r'_j = r_j$, which can be estimated with time division, as in Lemma 3.5. These estimates should produce one $(m!)^{-1}$, which would be more than enough in our case. However, we only capture $(b!)^{-1}$, which is adequate for our estimates. We will treat this in the next case.

4.1.3. Term (I), case (γ1, γ2) = (2, 1), b > 4

Returning to (4.3), we consider first the case of $0 \le \beta \le 2$. Proposition 2.5 and (2.40) imply l
β2,b ) K(t2 ; rβ,b t2 ∗ 1 l3,b−1 l3,b−1 η21 t21 = F t23 , r3,b−1 e [dt2j ]31 K t22 ; r3,b−1 dα2 e−iα2 t21 2π 0
×
2 B(α2 , RJ , RJ )J B(α2 , rb , rb )b B(α2 , rβ,3 )B(α2 , rb−1 , rb ). [α2 , RJ ]J +1 [α2 , rb ]b +1
J=β2
We recall the convention mentioned after (2.29), i.e. η21 = η(t21 ). Subsequently, l0,β1 applying (2.32) to K t1 ; p0 , u0,n1 −2 , r0,β , we get Wn1 (t1 , t2 ; 2, 1, b, l0,b ) t2 ∗ [dt2j ]31 ≤ 2m−b 0
t2 ∗
0
dt 2j
3 1
l3,b−1 l3,b−1 K t22 ; r3,b−1 dr3,b−1 K t22 ; r3,b−1
l3,b−1 l3,b−1 F t23 , r3,b−1 × F t23 , r3,b−1 An1 t1 , t21 , t 21 ; r3 , rb−1 ,
(4.13)
where An1 (t1 , t21 , t 21 ; r3 , rb−1 ) 2 := dp0 dr0,2 dr0,1 drb |ψ0 (rb )| dα dα e−i[α·(t1 ,t21 )−α ·(t1 ,t21 )] eη·(t1 ,t21 )+η ·(t1 ,t21 ) b
B(α2 , rb , rb )b B(α 2 , rb , rb ) 1 1 b +1 [α1 , p0 ] [α 1 , p0 ] [α2 , rb ]b +1 [α 2 , rb ] J 2 J β1 B(α2 , RJ , RJ )J B α 2 , RJ , RJ B(α1 , RJ , RJ )J B α 1 , RJ , RJ × J +1 J +1 [α1 , RJ ]J +1 [α2 , RJ ]J +1 [α 1 , RJ ] [α 2 , RJ ] J=0 J=β2 B α2 , rβ,1 , r2,3 × B(α1 , r0,β )B(α2 , rβ,3 )B α 1 , r0,β × B(α2 , rb−1 , rb )B α 2 , rb−1 , rb × du0,n1 −2 du 0,n1 −2 ∆1 u0,n1 −2 , u 0,n1 −2 , r0,1 , r 0,1 ×
× B(α1 ; p0 , u0,n1 −2 , r0 )B
α 1 ; p0 , u 0,n1 −2 , r0
1−2 n
j=0
1 1 [α1 , uj ] [α 1 , u j ]
). In our and for α := (α1 , α2 ), α := α 1 , α 2 , η := (η1 , η21 ) and η := (η1 , η21 notation, [α1 , p] gets regularized with η1 as before, whereas [α2 , p] gets regularized with η21 . See (3.26) for a similar convention. Using definition (2.14) and making dispersive estimates as in the direct term proof of Lemma 3.5, we have Wn1 (t1 , t2 ; 2, 1, b, l0,b ) t2 ∗ [dt2j ]31 = 2m−b ×
b−1
0
t2 ∗ 0
[dt 2j ]31
0
t22 ∗
[dsj ]b−1 3
0
t22 ∗
dr3,b−1 [ds j ]b−1 3
j 2
sj s j e−i(sj −sj )rj /2 l3,b−1 l3,b−1 F t23 , r3,b−1 F t23 , r3,b−1 An1 t1 , t21 , t21 ; r3 , rb−1
2 ( !) j j=3
j sj sj ≤ ! 3/2 0 0 0 0 j=3 sj − sj
l3,b−1 l3,b−1 An1 t1 , t21 , t 21 , r3 , rb−1
dr ×
F t23 , r3,b−1 F t23 , r3,b−1 .
t2 ∗
[dt2j ]31
t2 ∗
[dt 2j ]31
t22 ∗
[dsj ]b−1 3
t22 ∗
[ds j ]b−1 3
b−1
3,b−1
We now trivially bound (j )−1 ≤ 1 to getb : Wn1 (t1 , t2 ; 2, 1, b, l0,b ) t2 ∗ [dt2j ]31 ≤ 2m−b
t2 ∗
[dt 2j ]31
2 l3,b−1 +b−4
t2
!3/2 (b − 4)! t22 − t 22
l3,b−1 l3,b−1 An1 t1 , t21 , t 21 , r3 , rb−1
dr ×
F t23 , r3,b−1 F t23 , r3,b−1 0
where l3,b−1 :=
b−1
0
j=3 j .
3,b−1
,
Repeated use of Lemma 3.4 and (3.15) imply that
F t23 , rl3,b−1 F t , rl3,b−1 An1 t1 , t21 , t 21 , r3 , rb−1
23 3,b−1 3,b−1 dr3,b−1
rb−1 60 ξ (M λ0 ) l3,b−1 +b−4 ξ3 b−1
≤ sup ∇ ∇ A , t , t , r , r t !3/2
r3 48 rb−1 r3 n1 1 21 21 3 b−1 0 ≤ ξ3 ,ξb−1 ≤ 2 t23 3/2 t 23 r3 ,rb−1
×
48 b−1 r 1 ξ 3 Ndr3,b−1 3,b−1 rb−1 60 j=4 rj−1 − rj 52 b−3
ξ3,b−1 ∈{0,2}
(M λ0 ) l3,b−1 +b−4 ≤ !3/2 t23 3/2 t 23
sup 0 ≤ ξ3 ,ξb−1 ≤ 2 r3 ,rb−1
rb−1 60 ξ ξ3 b−1
r3 48 ∇rb−1 ∇r3 An1 t1 , t21 , t21 , r3 , rb−1 .
Given the form of An1 , the derivatives on r3 and rb−1 pass onto only the potentials, which, by Lemma 5.2 are sufficiently smooth. Hence, after taking derivatives and then absolute values into the integrals, we can perform estimates as in Secs. 4.1.1 and 4.1.2 to show that
rb−1 60 ξ ξ3 b−1
t ∇ ∇ A , t , t , r , r
r3 48 rb−1 r3 n1 1 21 21 3 b−1 ξ3 ,ξb−1 ∈{0,1,2} sup
2 l0,1 +2b +1
≤ (M λ0 )n1 +3+ l0,1 +b t1 t2 b More
(log t)O(1) .
careful analysis should allow us to make use of the (j !)−1 factors by arguing as in the proof of the Direct estimate of Lemma 3.5. This would yield a factor of m!−1 instead of b!−1 but since we do not need the former, we opt for the cruder estimate.
Combining these estimates yields Wn1 (t1 , t2 ; 2, 1, b, l0,b) t2 ∗ 2m−b 3 ≤ [dt2j ]1 0
≤
t2 ∗
0
m+n1
(M λ0 )
[dt 2j ]31
(M λ0 )m+n1 t1 t2m−b−3 2 (b − 4)!t22 − t 22 3/2 t23 3/2 t 23 3/2
(t1 )(t2 )2m−b−2 (log t)O(1) . (b − 4)!
The case of 2 < β < b − 1 is handled in a similar way. We start by applying Proposition 2.5 and (2.40) on both component kernels in the time-divided kernel. We then bound Wn1 (t1 , t2 ; 2, 1, b, l0,b) t1 ∗ ≤ 2m−b [dt1j ]31 0
t1 ∗
0
[dt 1j ]31
t2 ∗ 0
[dt2j ]31
t2 ∗
0
[dt 2j ]31
dr2,β−1 drβ+1,b−1
l2,β−1 l2,β−1 lβ+1,b−1 lβ+1,b−1 K t22 ; rβ+1,b−1 × K t12 ; r2,β−1 K t12 ; r2,β−1 K t22 ; rβ+1,b−1 l2,β−1 l2,β−1 lβ+1,b−1 lβ+1,b−1 × F t13 ; r2,β−1 F t23 ; rβ+1,b−1 F t13 ; r2,β−1 F t23 ; rβ+1,b−1 × An1 t11 , t21 , t 11 , t 21 ; r2 , rβ−1 , rβ+1 , rb−1 ,
(4.14)
where An1 t11 , t21 , t 11 , t 21 ; r2 , rβ−1 , rβ+1 , rb−1 := dp0 dr0,1 dr 0,1 drβ drb |ψ0 (rb )|2 dα dα e−i[α·(t11 ,t21 )−α ·(t11 ,t21 )]
× eη·(t11 ,t21 )+η ·(t11 ,t21 )
β1
1 B(α1 , rβ , rβ )β1 B(α 1 , rβ , rβ ) 1 β1 +1 [α1 , p0 ] [α1 , p0 ] [α1 , rβ ]β1 +1 [α 1 , rβ ]
j 1 B(α2 , RJ , RJ )J B(α , RJ , RJ )J B(α1 , rj , rj )j B(α 1 , rj , rj ) 2 j +1 j +1 J +1 J +1 [α , r ] [α , R ] 1 j 2 J [α1 , rj ] [α2 , RJ ] j=0 J=β2,b × B(α1 , r0,2 )B(α1 , rβ−1 , rβ )B α 1 , r 0,1 , r2 B α 1 , rβ−1 , rβ × B(α2 , rβ , rβ+1 )B(α2 , rb−1 , rb )B α 2 , rβ , rβ+1 B α 2 , rb−1 , rb × du0,n1 −2 du 0,n1 −2 ∆1 u0,n1 −2 , u 0,n1 −2 , r0,1 , r 0,1
×
× B(α1 ; p0 , u0,n1 −2 , r0 )B
α 1 ; p0 , u 0,n1 −2 , r0
1 −2 n
j=0
1 1 [α1 , uj ] [α 1 , u j ]
with η := (η11 , η21 ) and α := (α1 , α2 ). We now proceed as in the small β case by estimating (4.14) using dispersive estimates. We get Wn1 (t1 , t2 ; 2, 1, b, l0,b ) t1 ∗ [dt1j ]31 ≤ 2m−b 0
t1 ∗ 0
[dt 1j ]31
t2 ∗
0
[dt2j ]31
0
t2 ∗
[dt 2j ]31
C b tb−5+2( l2,β−1 + lβ+1,b−1 ) !3/2 !3/2 t22 − t 22 (b − 5)! t12 − t 12
l2,β−1 l2,β−1 lβ+1,b−1 lβ+1,b−1 ×
F t13 ; r2,β−1 F t13 ; r2,β−1 F t23 ; rβ+1,b−1 F t23 ; rβ+1,b−1
× An1 t11 , t21 , t 11 , t 21 ; r2 , rβ−1 , rβ+1 , rb−1
dr . ,dr ×
2,β−1
β+1,b−1
Repeated use of Lemma 3.4 gives
F t13 ; rl2,β−1 F t ; rl2,β−1 F t23 ; rlβ+1,b−1 F t ; rlβ+1,b−1 13 2,β−1 23 β+1,b−1 2,β−1 β+1,b−1
× An1 (r2 , rβ−1 , rβ+1 , rb−1 )
dr
(M λ0 )b−5+ l2,β−1 + lβ+1,b−1 !3/2 !3/2 t13 3/2 t 13 t23 3/2 t 23
rβ−1 40 rb−1 60 ξ ξβ−1 ξβ+1 ξ
2 b−1
× sup
∇ ∇ ∇ ∇ A (r , r , r , r ) r r β−1 β+1 r2 rb−1 n1 2 β−1 β+1 b−1 , 36 48 r2 rβ+1 2,β−1 ,drβ+1,b−1
≤
where the supremum is over 0 ≤ ξ2 , ξβ−1 , ξβ+1 , ξb−1 ≤ 2 and r2 , rβ−1 , rβ+1 , rb−1 . Again, we can check that the derivatives on An1 only affect potential terms and we can estimate the last factor following the techniques shown in Secs. 4.1.1 and 4.1.2 to yield Wn1 (t1 , t2 ; 2, 1, b, l0,b ) ≤
(M λ0 )n1 +m (t1 )(t2 )2m−b−2 (log t)O(1) . (b − 5)!
Finally, when b − 1 ≤ β ≤ b, we apply (2.32) to K(t2 ), while using Proposition 2.5 to write l0,β K t1 ; p0 , u0,n1 −2 , r0,β t1 ∗ 1 l2,β−1 l2,β−1 = dα1 e−iα1 t11 eη11 t11 [dt1j ]31 K t12 ; r2,β−1 )F (t13 ; r2,β−1 2π 0 n1 −2 B(α1 , RJ , RJ )J 1 B(α1 , p0 , u0,n1 −2 , r0 ) × B(α1 , r0,2 , rβ−1,β ). [α1 , p0 ] [α1 , uj ] [α1 , RJ ]J +1 j=0 J=0,1,β1
The rest of the estimate is handled as in the previous cases for β.
We conclude that in all cases in this subsection
$$W_{n_1}(t_1,t_2;2,1,b,\ell_{0,b}) \le \frac{(M\lambda_0)^{n_1+m}\,\varrho\,(\varrho t_1)(\varrho t_2)^{2m-b-2}(\log t)^{O(1)}}{(b-4)!}. \tag{4.15}$$
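The role of the factorial in the denominator of (4.15) can be illustrated by pure counting. In (4.2) each $W_{n_1}$ is multiplied by $b!$; combined with the gain $1/(b-4)!$ this leaves $b!/(b-4)! = b(b-1)(b-2)(b-3)$, which is polynomial in $b$, whereas the bare $b!$ sums to something of order $m!$. The following minimal sketch compares the two sums over $5 \le b \le m$; it ignores every other factor in the estimates, and the function names are hypothetical.

```python
from math import factorial

def prefactor_with_gain(b):
    """b! prefactor divided by the (b-4)! gain; equals b(b-1)(b-2)(b-3)."""
    return factorial(b) // factorial(b - 4)

def summed_prefactors(m):
    """Return (sum of b!, sum of b!/(b-4)!) over 5 <= b <= m."""
    plain = sum(factorial(b) for b in range(5, m + 1))
    gained = sum(prefactor_with_gain(b) for b in range(5, m + 1))
    return plain, gained

for m in (6, 10, 20):
    plain, gained = summed_prefactors(m)
    # With the gain, the sum is at most m * m^4 = m^5; without it, it grows like m!.
    print(m, plain, gained, gained <= m**5)
```

A polynomial factor in $m$ can be absorbed into the exponentially small $(M\lambda_0)^{m}$, which is why capturing $(b!)^{-1}$ up to finitely many factors is adequate here.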
4.1.4. Term (I), case (γ1, γ2) ≠ (2, 1), 2 ≤ n1 ≤ 8

Returning to the pairing functions (4.4), observe that our assumption on $(\gamma_1,\gamma_2)$ implies that there is at least one nontrivial relation in $\Delta_{2,b}(r_{0,b},r'_{0,b})$, a trivial relation being of the form $\delta(r'_j = r_j)$. This will allow us to avoid the use of the costly $L^\infty$ estimate (2.38) and obtain a bound which is smaller by a factor of $\varrho$ compared to the $(\gamma_1,\gamma_2) = (2,1)$ case. The main mechanism is utilizing estimates like (4.8). We apply (2.32) to obtain (4.5) and (4.6). Suppose first that $\gamma_2 < \beta < \gamma_1$, which implies that $\Delta_{2,b}(r_{0,b},r'_{0,b})$ contains the nontrivial relation $\delta(r'_\beta = r_\beta - r_{\gamma_2} + r'_{\gamma_2})$. Once again we choose $k_1 = k'_1 = 0$, $k_2 = k'_2 = \gamma_2$ and $k_3 = k'_3 = \beta$. We begin as in the case of $(\gamma_1,\gamma_2) = (2,1)$, $b \le 4$ by performing estimates (4.7) and (4.8), the latter performed with $\sigma := 0$ and $\alpha'(\sigma) := \alpha'_1$. We then bound the integration in $u_{n_1-2}$ and $u'_{n_1-2}$ as before and apply (2.38): 1 |α 1 , p0 |
β−1 j=1;=γ2
b−1 1 1 ≤ t1 tb−3 . |α1 , rj | |α 2 , rj |
(4.16)
j=β+1
This allows us to integrate over rj for j = 0, γ2 , β which removes the corresponding delta functions. We then estimate the integrals of rj , j = γ2 , β, b by applying (2.37) and use the bound δ rβ = rβ − rγ2 + rγ2 α 2 drγ2 drγ 2 drβ dα 1 sup α 1 rγ2 4 rγ 2 4 rβ 4 |α1 , rγ2 | |α 1 , rγ 2 | |α 1 , rβ | |α 2 , rβ | α2 ,rβ drβ 1 α 2 dα 1 2 sup ≤ C(log t) rβ 4 |rβ − rβ | |α 2 , rβ | α 1 |α 1 , rβ | rβ ≤ C(log t)4 .
(4.17)
After applying (2.34) to integrate α1 , we use either (2.35) or (2.37) on |α2 , rβ |−1 , depending on whether the factor α2 /rβ 4 was used in (4.7). The remaining terms are handled by (2.34). Now suppose that β = γ2 . By assumption, either γ2 ≥ 2 or γ1 − γ2 ≥ 2, which implies that ∆2,b contains either δ(r1 = r1 − r0 + r0 ) or δ(rγ 2 +1 = rγ2 +1 − rγ2 + rγ 2 ), respectively. The first case is handled by applying (4.8) with σ = 0 while applying the same type of estimate in integrating r0 , r1 and r1 . The rest of the estimates are trivial. When γ1 − γ2 ≥ 2, we will need to apply (4.8) with σ = 0 and then apply an estimate of the form (4.17) to handle rγ 2 , rγ2 +1 , rγ 2 +1 and the integral in α 1 . The rest of the estimates are trivial.
The case where $0 \le \beta < \gamma_2$ follows analogously, as the reader can check. Finally, the case where $\beta \ge \gamma_1$ requires two estimates of the form (4.8): one to handle $\Delta_1$ and the other to handle the nontrivial relation in $\Delta_2$. We omit the details, leaving them as an exercise. The result is
$$W_{n_1}(t_1,t_2;\gamma_1,\gamma_2,b,\ell_{0,b}) \le m!\,(M\lambda_0)^{n_1+m}\,\varrho^2\,T^{2m-b-2}(\log t)^{m+O(1)}. \tag{4.18}$$

4.1.5. Term (I), case (γ1, γ2) ≠ (2, 1), n1 > 8

We form the Born series term as in the corresponding case in Sec. 4.1.2. This eliminates the pairing relation $\Delta_1(u_{0,n_1-2},u'_{0,n_1-2},r_0,r'_0,r_{\gamma_2},r'_{\gamma_2})$ and makes $r_0$, $r'_0$, $r_{\gamma_2}$ and $r'_{\gamma_2}$ free variables, allowing their factors to be estimated by (2.35) or to participate in estimates of the form (4.17). Either way, we avoid the costly estimate (2.36). Again, the condition $(\gamma_1,\gamma_2) \ne (2,1)$ implies that there is at least one nontrivial relation in $\Delta_2$. This is exploited as in Sec. 4.1.4 using estimates such as (4.17). The details are omitted and the result is
$$W_{n_1}(t_1,t_2;\gamma_1,\gamma_2,b,\ell_{0,b}) \le m!\,(M\lambda_0)^{n_1+m}\,\varrho^2\,T^{2m-b-2}(\log t)^{m+O(1)}. \tag{4.19}$$

4.1.6. Summary of the estimates of the term (I)

Summarizing (4.10), (4.12), (4.15), (4.18) and (4.19), we have shown that for all cases of $(\gamma_1,\gamma_2)$, $b$ and $\ell_{0,b}$,
$$W_{n_1}(t_1,t_2;\gamma_1,\gamma_2,b,\ell_{0,b}) \le (M\lambda_0)^{n_1+m}\Big[\frac{\varrho\,T^{2m-b-1}}{n}\Big(\chi_{b\le 4} + \frac{\chi_{b>4}}{(b-5)!}\Big)(\log t)^{O(1)} + m!\,\varrho^2\,T^{2m-b-2}(\log t)^{m+O(1)}\Big],$$
which, after summation in (4.2), yields Lemma 3.8 for the term (I) in case $n_1 \ge 2$. Now we turn to the term (II).

4.1.7. Term (II), n1 ≥ 2

To get explicit expressions, we will first treat the case where $\alpha_{\kappa_1} \in B$ but $\alpha_{\kappa_2} \notin B$. Suppose $\gamma_1$ is defined so that $b(\gamma_1) = \kappa_1$ and $\gamma_2$ is defined so that $b(\gamma_2) < \kappa_2 < b(\gamma_2+1)$. As before, taking the expectation in (4.1) involves the random phase. Explicitly, EB EA\B χ(A(n1 ) ⊕ A; p0 , u0,n1 −2 , r0,m )EA\B χ(A(n1 ) ⊕ A; p0 , u 0,n1 −2 , r 0,m ) [n1 /2] (un1 −2k−1 − un1 −2k ) + (rκ2 −1 − rκ2 ) = |Λ|−(2m−b) δ
k=1
[n1 /2]
×δ
k=1
(u n1 −2k−1 − u n1 −2k ) + rκ 2 −1 − rκ 2
×δ
(−1)n1 −j (uj − u j ) − (r0 − r0 ) + rb(γ1 ) − rb(γ 1)
n 1 −2 j=0
b
×
j=1;=γ1 b
×
δ (rb(j)−1 − rb(j) ) − rb(j)−1 − rb(j)
j=0;=γ2
δ(rj−1 − rj )δ rj−1 − rj
k=b(j)+1
κ 2 −1
×
b(j+1)−1
b(γ2 +1)−1
δ(rj − rj−1 )
j=b(γ2 )+1
δ(rj − rj−1 ) ,
(4.20)
j=κ2 +1
where [ · ] is the least integer &and u−1 = u−1 := p0 . % function b We now integrate rm \ [rb(j) ]0 , rκ2 and their prime counterparts. Of the vari → rj , 1 ≤ j ≤ b, and rκ2 → rˆγ2 , rκ 2 → rˆγ 2 . ables left, relabel rb(j) → rj and rb(j) One can check that we can rewrite our pairing relations as: [n1 /2] (un1 −2k−1 − un1 −2k ) + (rγ2 − rˆγ2 ) δ k=1
×δ u n1 −2k−1 − u n1 −2k + rγ2 − r0 + r0 − rˆγ 2 [n1 /2]
k=1
×
γ2 b 1 −1 γ δ rj = rj − r0 − r0 δ rj = rj − (ˆ rγ2 − rˆγ 2 δ rj = rj . j=1
j=γ2 +1
j=γ1
We now proceed as for the term (I). For $2 \le n_1 \le 8$, we use (2.32) and exploit the pairing relations. Unlike the simplified relations (4.4), we have two separate relations: one involving $u_{0,n_1-2}$ and the other one involving $u'_{0,n_1-2}$. Each of these will be involved in an estimate of the form (4.8) or (4.17). The net effect is that we gain a factor of $\varrho^2$. A similar argument is valid in the case when $\alpha_{\kappa_1} \notin B$ and $\alpha_{\kappa_2} \in B$. When $\alpha_{\kappa_1}, \alpha_{\kappa_2} \notin B$, we can simplify the pairing relations so that we have one separate pairing relation for each group $u_{0,n_1-2}$ and $u'_{0,n_1-2}$, and no other relations involving these variables, making it similar to the other cases in (II). We then exploit these relations by performing estimates as in (4.8) or (4.17) twice, while avoiding the costly $L^\infty$ estimate (2.38). This gains a factor of $\varrho^2$. The details of these calculations are left to the reader, but the conclusion is that for $2 \le n_1 \le 8$ we have
$$(II) \le m!\,(M\lambda_0)^{n_1+m}\,T^{2m-2}\,\varrho^2\,(\log t)^{m+O(1)}.$$
When $n_1 > 8$, we form the two-obstacle Born series term as in (I). Since we have one separate relation for $u_{0,n_1-2}$ and one for $u'_{0,n_1-2}$, the formula (4.11) will introduce
$$B_{\nu,n_1}(\alpha_1,p_0,r_0)\,B_{\nu',n_1}(\alpha'_1,p_0,r'_0),$$
where we also have an additional integration over $\nu'$. However, by Lemma 5.3, we have sufficient decay in $\nu'$ to handle the integration. We can deduce the same bound as in the case of small $n_1$ and leave this to the reader to check.

4.2. Proof of Lemma 3.8 for n1 = 1

The case $n_1 = 1$ requires a separate treatment, but the methods are analogous to the ones in the previous section for $n_1 \ge 2$. We start with the following estimate
$$\mathbf{E}\Big\|\sum_{\kappa=2}^{m}\ \sum_{A:|A|=m} U^{\circ;\kappa}_{[1],m_1,m_2,A}(t_1,t_2)\psi_0\Big\|^2 \le (I) + (II)$$
for (I) :=
B : |B| = b 0≤b≤m
(II) :=
B : |B| = b 0≤b≤m
A : |A| = m ακ ∈B B≺A
/ A : |A| = m ακ ∈B B≺A
) where C(N, m, b) =
N −m m−b
*
◦;κ C(N, m, b)EB EA\B U[1],m (t1 , t2 )ψ0 2 1 ,m2 ,A
◦;κ C(N, m, b)EB EA\B U[1],m (t1 , t2 )ψ0 2 , 1 ,m2 ,A
m!(m − 1).
We first treat term (I). Define γ so that b(γ) = κ. Computing the expectations, we have (I) ≤
b m b=2
for
l
b! W1 (t1 , t2 ; γ, b, l0,b ),
κ=2
W1 (t1 , t2 ; γ, b, l0,b ) := 2m−b
dp0 dr0,b dr 0,b ψ(rb )ψ0 (rb )∆(r0,b , r 0,b )
l0,b
l0,b × K(t1 , t2 ; p0 , r0,b )K(t1 , t2 ; p0 , r 0,b ) .
The pairing relations are given by
$$\Delta\big(r_{0,b},r'_{0,b}\big) = \prod_{j=1}^{\gamma-1}\delta\big[r'_j = r_j - (r_0-r'_0)\big]\ \prod_{j=\gamma}^{b}\delta\big(r'_j = r_j\big). \tag{4.21}$$
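The structure of (4.21) explains the case distinction that follows: the first product contributes $\gamma - 1$ nontrivial relations and the second contributes $b - \gamma + 1$ trivial ones. The following minimal sketch simply enumerates them for a given $\gamma$ and $b$; the function name and the string encoding of the relations are hypothetical and serve only as bookkeeping.

```python
def pairing_relations(gamma, b):
    """List the relations of (4.21) as strings, split into nontrivial and trivial ones."""
    # Indices 1 <= j <= gamma - 1 carry the shift by (r_0 - r'_0).
    nontrivial = [f"r'_{j} = r_{j} - (r_0 - r'_0)" for j in range(1, gamma)]
    # Indices gamma <= j <= b are direct pairings.
    trivial = [f"r'_{j} = r_{j}" for j in range(gamma, b + 1)]
    return nontrivial, trivial

for gamma in (1, 2, 3):
    nontrivial, trivial = pairing_relations(gamma, b=4)
    print(gamma, len(nontrivial), len(trivial))
```

For $\gamma = 1$ there is no nontrivial relation, for $\gamma = 2$ there is exactly one, and for $\gamma > 2$ there are at least two, which matches the three cases treated separately in the text.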
We will treat the $\gamma = 2$, $\gamma > 2$ and $\gamma = 1$ cases separately.

4.2.1. Term (I), case γ = 2, n1 = 1

For $\gamma = 2$, we have the nontrivial relation $r'_1 = r_1 - (r_0 - r'_0)$. When $b$ is small, this allows us to perform estimates such as (4.8) or (4.17). This gains a power of $\varrho$. When $b > 4$, we will follow the beginning of Sec. 4.1.3 by applying Proposition 2.5
and (2.32) to split our kernels. We then perform time dependent estimates which produce a factor of (b!)−1 as before. The details can be gathered from previous estimates. The result is T 2m−1 . (I: γ = 2) ≤ C(M λ0 )m+1 T m n 4.2.2. Term (I), case γ > 2, n1 = 1 Returning to (4.21), the condition γ > 2 gives us at least two nontrivial relations. When β ≥ γ, we exploit the relations r1 = r1 − r0 + r0 and r2 = r2 − r0 + r0 . Using bounds such as (4.8) and (4.17), we avoid the use of L∞ estimates (which produce powers of time) for r1 and r2 as well as r0 , which proves the lemma in this case. For β = 0, we need the following inequality α2 α 2 dα1 dα 1 dr0 dr0 8 8 α1 α1 r0 r0 |α1 , r0 | |α2 , r0 | |α 1 , r0 | |α 2 , r0 | dr1 dr1 δ(r1 = r1 − r0 + r0 ) dr2 dr2 δ(r2 = r2 − r0 + r0 ) × r1 4 r1 4 |α2 , r1 | |α 2 , r1 | r2 4 r2 4 |α2 , r2 | |α 2 , r2 | dr0 dr0 α2 α 2 1 ≤ C(log t)6 r0 4 r0 4 r0 − r0 4 |α2 , r0 | |α 2 , r0 | |r0 − r0 |2 + η 2 ≤ C(log t)9 ,
(4.22)
where the first inequality uses (2.34) twice as well as an estimate similar to those in the proof of Proposition 2.3. The remaining cases of $1 \le \beta < \gamma_1$ will follow in the same way (after a change of variables). The rest of the estimate follows from previous ones. It follows that
$$(I:\ \gamma > 2) \le m!\,(M\lambda_0)^{m+1}\,\varrho^2\,T^{2m-2}(\log t)^{m+O(1)}.$$

4.2.3. Term (I), case γ = 1, n1 = 1

The case of $\gamma = 1$ requires a separate argument. The pairing relations force $r'_j = r_j$ for $1 \le j \le b$. Note that the constraint $\alpha_{b(\gamma)} = \alpha_1$ forces $\ell_0 > 0$ in this case. Separating the internal and external kernels as in the direct estimate in Lemma 3.5, we need to consider t2 ∗ t1 ∗ t2 ∗ t1 ∗ [dt1j ]21 [dt2j ]21 [dt 1j ]21 [dt 2j ]21 dp0 dr0 dr0 drb |ψ0 (rb )|2 0
0
0
0
× Km1 ,m2 (t11 , t21 ; p0 , r00 , rlbb )Km1 ,m2 (t 11 , t 21 ; p0 , r0 0 , rlbb ) × F (t12 , t22 ; p0 , r00 , rlbb )F (t 12 , t 22 ; p0 , r0 0 , rlbb ), where l0,b ) F (t12 , t22 ; p0 , r0,b
:=
∞ k1 ,...,km =0
dqJm,k K(t12 , qJ1 )K(t22 , qJ2 )L(p0 , r00 , rlbb , qJm,k ),
and J1 , J2 are defined in (3.18). We now show that
dp0 dr0 dr0 drb |ψ0 (rb )|2 K(t11 , t21 ; p0 , r0 , rlb )K(t , t ; p0 , r 0 , rlb ) 0 0 11 21 b b
× F (t12 , t22 ; p0 , r00 , rlbb )F (t 12 , t 22 ; p0 , r0 0 , rlbb )
≤ C m tm−b
1 −1 2 tm tm 1 2 ψ0 230,0 (m1 − 1)! m2 !
× sup |||F (t12 , t22 ; p0 , r00 , rlbb )F (t 12 , t 22 ; p0 , r0 0 , rlbb )rb −60 |||dp0 dr0 dr0,b−1 . rb
(4.23)
We assume first that $\beta \notin \{0, b\}$. This implies that $m_1 \ge 2$ in (4.23) and consequently yields a factor of $1/n$. The proof of (4.23) begins with an identity similar to (3.24). However, we replace $\langle\sigma_0 - \sigma'_0\rangle^{-3/2}$ in the latter with $\langle\sigma_0\rangle^{-3/2}\langle\sigma'_0\rangle^{-3/2}$. Integration of the $d\sigma_J$ variables yields
0
t11 ∗
[dσJ ]β1 −1
× ×
t11 ∗
0
[dσJ ]β1 −1
t21 ∗
0
≤ C b+1
[dσJ ]bβ2
0
1 (σ0 σ0 )0 3/2 σ 3/2 σ 3/2 ( !)2 σ−1 − σ−1 0 0 0 (σβ1 −
t21 ∗
[dσJ ]bβ2
b−1 j=1;j=β
(σj σj )j σj − σj 3/2 (j !)2
)β1 (σβ2 σβ2 )β2 (σβ1 σβ1 σβ1 ) + (σβ2 − σβ2 )3/2 (β1 !)2 (β2 !)2
tm−b−1/2 (0 ! · · · β1 !β2 ! · · · b !)2
0
t11 ∗
[dσJ ]β1 −1
(σb σb )b (b !)2
t21 ∗
[dσJ ]bβ2
0
σ00 −1 1 σ · · · σbb . σ0 1/2 1
Finally, we estimate the last integral by integrating by parts as in the direct estimate of Lemma 3.5 and use t11 m1 −1/2 (t11 − σ0 )1 +···+β1 +β σ00 −1 m1 t11 , dσ0 ≤ C (1 + · · · + β1 + β)! σ0 1/2 0 ! m1 ! 0 where 0 + · · · + β1 = m1 − β. Putting this together, we obtain (4.23). We can now argue as in Lemma 3.4 to show that, sup |||F (t12 , t22 ; p0 , r00 , rlbb )F (t 12 , t 22 ; p0 , r0 0 , rlbb )rb −60 |||dp0 dr0 dr0,b−1 rb
≤
(M λ0 )m . t12 3/2 t22 3/2 t 12 3/2 t 22 3/2
The cases of β ∈ {0, b} are handled similarly and are left to the reader. Consequently, we get the estimate (I : γ = 1) ≤ C(M λ0 )m+1 T m
T 2m−1 . n
4.2.4. Term (II), case n1 = 1

Moving on to (II), if $\gamma$ is chosen so that $b(\gamma) < \kappa < b(\gamma+1)$, where $b(0) := 0$, then the case $\gamma = 0$ is analogous to the $\gamma = 1$ case for the term (I), and the case $\gamma > 0$ is analogous to the $\gamma > 1$ case for the term (I). The former is handled with the time division, and the latter by using (2.32) and making use of two nontrivial pairing relations, which yields a factor of $\varrho^2$. One can check that " (II) ≤ (M λ0 )
m+1
T
m
# T m−1 2 m−2 m+O(1) + m! T (log t) . n
Putting all of the estimates in this section together, we complete the proof of Lemma 3.8.
4.3. Proof of Lemma 3.9 Here we have a new collision in the time interval [t2 , t1 + t2 ) which will provide an extra factor of t1 and hence a factor of 1/n. The amputation of propagator essentially yields an extra factor of compared to Lemma 3.8. To compute the expectation, we will use a similar argument as in Lemma 3.3. Starting with the case of n1 ≥ 2, we have κ m 1 −1 E
no rec
κ1 =2 κ2 =1 α0 ,A:|A|=m
=
2 κ ,κ 2 U˜[n11 ],m (s, t )ψ 2 0 1 ,m2 ;α0 ,A
σ∈S(b) (α0 ⊕A,α0 ⊕A ) 1 ≤ κ2 < κ1 ≤ m B : |B| = b 1 ≤ κ2 < κ1 ≤ m 0 ≤ b ≤ m+1
κ ,κ2 κ1 ,κ2 × E(U˜[n (s, t2 )ψ0 , U˜[n11 ],m (s, t2 )ψ0 ), 1 ],m1 ,m2 ;α0 A 1 ,m2 ;α ,A 0
where the sum on (α0 ⊕ A, α 0 ⊕ A) is short for summing over ordered sets A, A of size m and α0 , α 0 such that α0 ⊕ A and α 0 ⊕ A have no repeating elements, (α0 ⊕A)∩(α 0 ⊕A ) = B, B≺(α0 ⊕A) and σ(B)≺(α 0 ⊕A ). We now use independence of our obstacles, the Schwarz inequality and symmetry to get κ1 ,κ2 2 EU˜1,[n (s, t2 )ψ0 1 ],m1 ,m2 ≤
CN,m,b EB
B : |B| = b α0 , A : |A| = m 1≤κ2 <κ1 <m 0 ≤ b ≤ m+1 B ≺ (α0 ⊕A)
2 ,κ2 × E(α0 ⊕A)\B U˜ακ01,[n (s, t2 )ψ0 1 ],m1 ,m2 ,A := (I) + (II),
July 26, 2005 15:22 WSPC/148-RMP
J070-00242
The Linear Boltzmann Equation as the LDL of a Random Schr¨ odinger Equation
723
for
no rec
B : |B| = b 0 ≤ b ≤ m+1
α0 , A : |A| = m B ≺ (α0 ⊕A)
1 ≤ κ2 < κ1 ≤ m {ακ1 ,ακ2 } ⊆ B
(I) :=
CN,m,b EB
2 ,κ2 × E(α0 ⊕A)\B U˜ακ01,[n (s, t2 )ψ0 , 1 ],m1 ,m2 ,A
no rec
B : |B| = b 0 ≤ b ≤ m+1
α0 , A : |A| = m B ≺ (α0 ⊕A)
1 ≤ κ2 < κ1 ≤ m {ακ1 ,ακ2 } B
(II) :=
CN,m,b EB
2 ,κ2 × E(α0 ⊕A)\B U˜ακ01,[n (s, t2 )ψ0 , 1 ],m1 ,m2 ,A ) and CN,m,b :=
N −m−1 m+1−b
*
(m+1)!(m−1)(m−2) . 2
We first treat the term (I). Define (γ1 , γ2 ) such that κ1 = b(γ1 ) and κ2 = b(γ2 ). By considering separately the cases α0 ∈ B and α0 ∉ B, we can compute the expectations and calculate the combinatorics as in Lemma 3.3, to get the bound (I) ≤
m+1
b!2m−b
b=2 l0,b 1≤γ2 <γ1 ≤b
×
dp0 dp1 du0,n1 −2 du 0,n1 −2 dr0,b dr 0,b ψ0 (rb )ψ0 (rb ) × ∆(u0,n1 −2 , u 0,n1 −2 , r0,b , r 0,b )
l0,b
l0,b 2 ˆ × |V0 (p0 − p1 )| K(s, t2 ; p1 , u0,n1 −2 , r0,b )K(s, t2 ; p1 , u0,n1 −2 , r 0,b )
2 + dp0 du0,n1 −2 du 0,n1 −2 dr0,b dr 0,b ψ0 (rb )ψ0 (rb )∆(u0,n1 −2 , u 0,n1 −2 , r0,b , r 0,b )
+ l0,b
l0,b 2 ˆ × |V0 (0)| K(s, t2 ; p0 , u0,n1 −2 , r0,b )K(s, t2 ; p0 , u0,n1 −2 , r 0,b ) . The first term arises from cases in which α0 ∈ B and the second when α0 ∈ / B. We will only bound the first term, the second will be smaller by a factor of when treated in the same way. Our pairing relations are ∆(u0,n1 −2 , u 0,n1 −2 , r0,b , r 0,b ) # " n 1 −2 n1 −j (−1) (uj − uj ) − (r0 − r0 ) + (rγ2 − rγ2 ) =δ − j=0
×
γ 2 −1 j=1
δ[rj = rj − (r0 − r0 )]
γ 1 −1 j=γ2 +1
δ[rj = rj − (rγ2 − rγ 2 )]
b j=γ1
δ(rj = rj ).
This is essentially identical to Eq. (4.4). The extra integration in p0 is handled through the decay of Vˆ0 (p0 − p1 ) after appealing to Lemma 5.2. Consequently, by following the proof of Lemma 3.8, it is easy to verify the bound " (I) ≤ (M λ0 )n1 +m 2 T m−2 (log t)O(1)
# T + m!(log t)m . n
Aside from the extra factor of from the amputation, the case of (II) is analogous to the one in Lemma 3.8, as is the case of n1 = 1. This completes the proof of Lemma 3.9.
4.4. Proof of Lemma 3.10 The amputation gains a factor of . The pairing relations to follow will show that we will be able to use estimates such as (4.8) to gain another factor of . The last factor of is obtained through either time division estimates like those in Sec. 4.2.3 or by utilizing another nontrivial pairing relation. 2 κ1 ,κ2 as in Lemma 3.8 to get ψ Again, we calculate EB EA\B U˜[2],m 0 ,m ,A 1 2 E
2≤κ1 <κ2 ≤m
2 κ1 ,κ2 ˜ U[2],m1 ,m2 (s, t2 )ψ0 ≤ (I) + (II),
where (I) :=
(II) :=
B : |B| = b 0≤b≤m
A : |A| = m B ≺A
1 ≤ κ1 < κ2 ≤ m {ακ1 ,ακ2 } ⊆ B
B : |B| = b 0≤b≤m
A : |A| = m B ≺A
) and C(N, m, b) :=
N −m m−b
κ1 ,κ2 C(N, m, b)EB EA\B U˜[2],m (s, t2 )ψ0 2 , 1 ,m2 ;A
κ1 ,κ2 C(N, m, b)EB EA\B U˜[2],m (s, t2 )ψ0 2 , 1 ,m2 ;A
1 ≤ κ1 < κ2 ≤ m {ακ1 ,ακ2 } B
*
m!(m−1)(m−2) . 2
Starting with the term (I), define (γ1 , γ2 ) so that (b(γ1 ), b(γ2 )) = (κ1 , κ2 ). We have (I) ≤
m
b=2 l0,b 1 ≤ γ1 < γ2 ≤ b b(γ1 ) ≥ 2
b! 2m−b
dp0 du0 du 0 dr0,b dr 0,b ψ0 (rb )ψ0 (rb )
l0,b × ∆(u0 , u 0 , r0,b , r 0,b )Vˆ0 (p0 − u0 )Vˆ0 (p0 − u 0 )K(s, t2 ; u0 , r0,b )
l0,b
) , × K(s, t2 ; u 0 , r 0,b
where ∆(u0 , u 0 , r0,b , r 0,b ) are the pairing relations.
4.4.1. Term (I), case γ1 = 1
The condition b(γ1) ≥ 2 forces $\ell_0 > 0$, and
\[
\Delta(u_0, u_0', r_{0,b}, r_{0,b}') = \delta[(u_0 - u_0') - (r_1 - r_1')]
\prod_{j=2}^{\gamma_2-1} \delta(r_j = r_j' - r_1 + r_1')
\prod_{j=\gamma_2}^{b} \delta(r_j - r_j').
\]
Since none of these relations actually depends on r0 or r0 , we use Proposition 2.5 to split our kernels to isolate the momenta r0 and r0 so that we can estimate them separately with time division. Recalling (3.11), we have, for β > 0, l
0,b K(s, t2 ; u0 , r0,b ) t1 ∗ η·(t12 ,t2 ) B(α1 , u0 , r0 ) 0 e 2 [dt1j ]1 K(t11 , r0 ) = dα e−iα·(t12 ,t2 ) 2 4π [α1 , u0 ] 0
β1 b B(α1 , RJ , RJ )J B(α2 , RJ , RJ )J B(α1 , r0,β )B(α2 , rβ,b ), × [α1 , RJ ]J +1 [α2 , RJ ]J +1 J=1
J=β1
where α = (α1 , α2 ) and η = (η12 , η2 ). The propagator [α1 , p] is regularized with η12 = η(t12 ), whereas [α2 , p] is regularized with η2 = η(t2 ). We have an anall
0,b ogous expansion for K(s, t2 ; u 0 , r 0,b ) with primed variables, α = (α 1 , α 2 ) and , η2 ). η = (η12 The factors in the α and α integration are estimated using the same techniques as in Lemma 3.8. The first relation in ∆(u0 , u 0 , . . .) and the du0 , du 0 integration will allow us to avoid the L∞ -estimate in one of the resolvents with r1 by applying Proposition 2.3 as in (4.8). Consequently, we get a contribution of tb−2+2 l1,b and we gain effectively a factor . Recalling (3.17), the bound
dr0 K(t11 ; r0 )f (r0 ) ≤ (M λ0 )20 +1 t11 0 −3/2 |||f |||dr0 0
handles the remaining terms. Since $\ell_0 \ge 1$, after the $dt_{11}$ integration we gain a factor $\varrho^{1/2}$ compared to the trivial estimate, and a similar gain comes from $\bar K$. Finally, an additional factor $\varrho$ comes from the amputation, and this gives the result of the lemma. The β = 0 case is handled the same way. This time, we will need to apply Proposition 2.5 twice since r0 will appear in both K(t1) and K(t2). The details are left to the reader.

4.4.2. Term (I), case γ1 > 1
Our pairing relations are
\[
\Delta(u_0, u_0', r_{0,b}, r_{0,b}') = \delta[(u_0 - u_0') - (r_{\gamma_1} - r_{\gamma_1}')]
\prod_{j=1}^{\gamma_1-1} \delta(r_j = r_j' - r_0 + r_0')
\prod_{j=\gamma_1+1}^{\gamma_2-1} \delta(r_j = r_j' - r_{\gamma_1} + r_{\gamma_1}')
\prod_{j=\gamma_2}^{b} \delta(r_j = r_j'),
\]
where the first product is non-empty. After applying (2.32), we can perform estimates as in Lemma 3.8 to exploit two nontrivial relations. The calculations are similar and we conclude the lemma for the term (I). 4.4.3. Term (II) For the term (II), we apply (2.32) and perform the usual estimates which exploit nontrivial pairing relations as in the estimates for (II) in Lemma 3.8. The case of κ1 < κ2 ≤ b(1) requires time division arguments. The key estimates in these cases are dp0 du0 du 0 dr0 dr0,b f (p0 , u0 , u 0 , r0 , r0,b ) l
l
1,b 1,b × K(s1 , s2 ; u0 , r001 , u002 , r1,b )K(s 1 , s 2 ; u 0 , r 001 , u 002 , r1,b )
≤ Ct2m−b−3 |||f |||dp0 du0 du0 dr0 dr0,b , when b(1) = κ1 and dp0 du0 du 0 dr0 dr0,b f (p0 , u0 , u 0 , r0 , r0,b ) l
l
1,b 1,b × K(s1 , s2 ; u0 , r001 , u002 , p003 , r1,b )K(s 1 , s 2 ; u 0 , r001 , u002 , p003 , r1,b )
≤ Ct2m−b−4 |||f |||dp0 du0 du0 dr0 dr0,b , when b(1) > κ1 . In the first estimate 01 + 02 + 1 = 0 and 01 > 0, and in the second one 01 + 02 + 03 + 2 = 0 with 01 > 0. Note that the summation over κ1 and κ2 in the definition of (II) in effect also sums over possible 01 , 02 and 03 . The reader can verify that applying these estimates and following the estimates of the direct terms in Lemma 3.5 one obtains Lemma 3.10. 4.5. Proof of Lemma 3.11 Starting from (3.38), we make the usual decomposition 2 m no rec κ1 ,κ2 ,κ3 ˜ U∗,[n1 ],m1 ,m2 ;A (s, t2 )ψ0 ≤ (I) + (II), κ3 = 1 1≤κ2 <κ1 =m A:|A|=m κ3 = κ1 ,κ2
where (I) :=
C(N, m, b)
B : |B| = b A : |A| = m κ1 ,κ2 ,κ3 {ακ1 ,ακ2 ,ακ3 } ⊆ B 0≤b≤m B≺A κ1 ,κ2 ,κ3 × EB EA\B U˜∗,[n (s, t2 )ψ0 2 , 1 ],m1 ,m2 ;A
(II) :=
C(N, m, b)
B : |B| = b A : |A| = m κ1 ,κ2 ,κ3 0≤b≤m B ≺A {ακ1 ,ακ2 ,ακ3 } B κ1 ,κ2 ,κ3 × EB EA\B U˜∗,[n (s, t2 )ψ0 2 , 1 ],m1 ,m2 ;A
) C(N, m, b) :=
N −m m−b
*
m!(m−1)(m−2)2 2
and the sums on κ1 , κ2 , κ3 are restricted to
1 ≤ κ2 < κ1 ≤ m and κ3 = κ1 , κ2 . As before, the first term is the leading order term. Defining (γ1 , γ2 , γ3 ) such that (b(γ1 ), b(γ2 ), b(γ3 )) = (κ1 , κ2 , κ3 ), we have (I) ≤
m
b=2 l0,b 1≤γ2 <γ1 ≤b γ3 =γ1 ,γ2
b! 2m−b
dp0,1 dp 1 du0,n1 −2 du 0,n1 −2 dr0,b dr 0,b
× ψ0 (rb )ψ0 (rb )∆(p1 , p 1 , u0,n1 −2 , u 0,n1 −2 , r0,b , r 0,b )
l0,b l0,b
× K(t1 , t2 ; p1 , u0,n1 −2 , r0,b )K(t1 , t2 ; p 1 , u 0,n1 −2 , r0,b ) .
The pairing functions can be simplified to always yield two nontrivial relations: one of which involves the variables (u0,n1 −2 , u 0,n1 −2 , r0,b , r 0,b ) and the other one involves (p1 , p 1 , r0,b , r 0,b ). As in the estimates of Sec. 4.1.4 in Lemma 3.8, we exploit these two nontrivial relations and obtain two factors of (one from each nontrivial relation). This, combined with an extra resulting from the amputation, will yield the correct estimate. In particular, we apply estimates such as (4.8), (4.17) or (4.22) on three of the variables (r0 , rg 1 , rg 2 , rβ ), where g1 and g2 are the first and second smallest element of (γ1 , γ2 , γ3 ). The rest of the factors are handled as in previous proofs. When n1 > 8, we take the additional step of rewriting the pairing relation involving u0,n1 −2 and u 0,n1 −2 in position space, applying (4.11) and forming the Born series term. The second nontrivial relation (which involves p1 and p 1 ) is used in bounds of the form (4.8), (4.17) or (4.22). The result is that we avoid L∞ -estimates on three of the propagators containing the variables (r0 , rg 1 , rg 2 , rβ ). The terms in (II) are handled in the same way as in Lemma 3.8. We isolate two nontrivial relations, one involving u0,n1 −2 and the other u 0,n1 −2 . When n1 ≤ 8, we use these relations in estimates such as (4.8), (4.17) or (4.22) and when n1 > 8, we form the Born series terms. See the discussion in Lemma 3.8. The details are left to the reader. This completes the proof of Lemma 3.11.
5. Estimates on the Propagator
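Recall that $\|f\|_{n,n'} := \|\langle x\rangle^{n} \langle\nabla_x\rangle^{n'} f(x)\|_2$. In what follows, we will treat functions f which are possibly dependent on the parameter α ∈ R. In this case, we abuse the notation to write $\|f_\alpha\|_{n,n'} := \sup_\alpha \|\langle x\rangle^{n} \langle\nabla_x\rangle^{n'} f_\alpha(x)\|_2$.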
For 0 < η < 1 and α ∈ R, define the following operator
\[
B_\eta(\alpha, u) f(p) = B(\alpha, u) f(p) := \int_{\mathbb{R}^3} dq\, \frac{\hat V_0(p - q) f(q)}{\alpha - (q + u)^2/2 + i\eta}. \tag{5.1}
\]
We will usually suppress the dependence in η unless it becomes critical. Lemma 5.1. Let N > 0 and n, n ≤ N . If Vˆ0 N +2,N +2 ≤ λ0 1, then there exists a constant C depending only on N (and implicitly on the dimension d = 3) such that
sup |pn ∇p n Bη (α, u)f (p)| ≤ Cλ0 f n,2, fα n,2 n n sup |p ∇p ∂α Bη (α, u)fα (p)| ≤ Cλ0 + ∂α fα n,2 . |α + iη|1/2 u,α η,u,α
We also have the same bounds for $\|B_\eta(\alpha,u)f\|_{n,n'}$ and $\|\partial_\alpha B_\eta(\alpha,u)f\|_{n,n'}$, respectively.

Proof. A direct calculation of the Fourier transform of the Yukawa potential yields the identities
\[
\mathcal{F}^{-1}\big[(\alpha - (\cdot + u)^2/2 + i\eta)^{-1}\big](x) = C\, \frac{e^{i|x|\sqrt{\alpha + i\eta} - ixu}}{|x|} =: G_u(x) =: G(x),
\]
\[
B(\alpha, u) f(p) = \int dx\, e^{ixp}\, V_0(x) \int dy\, G(x - y) \check f(y).
\]
Here we consider the branch of the square root with positive imaginary part and we omitted η from the notations. Hence, we can make use of the bound n n n n ˇ . |p ∇p B(α, u)f (p)| ≤ ∇x x V0 (x) dy G(x − y)f (y) L1 (dx)
Using the Leibniz rule, we have (∇x )n xn V0 (x) dy G(x − y)fˇ(y)
L1 (dx)
j n j ˇ ≤ (∇x ) (x V0 (x)) dy G(x − y)(∇y ) f (y) j +j=n
≤
L1 (dx)
Vˆ0 n,n +2 x−2 y−2 G(x − y) L2 (dx dy) y2 ∇y j fˇ L2 (dy) .
j +j=n
The integral involving G(x−y) can be bounded uniformly since we know that η ≤ 1. This proves the first inequality. The second inequality follows in the same way as the first and uses the bound |∂α Gα,u (x)| ≤ C|α + iη|−1/2 . The last two bounds follow in the same manner.
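The uniform bound on the weighted L2 norm of G used in this proof can be checked by hand. The following is only a sketch, assuming the weights are the usual $\langle x\rangle = (1+x^2)^{1/2}$ and using nothing more than $|G(x)| \le |C|/|x|$, which holds because the chosen branch has $\mathrm{Im}\sqrt{\alpha+i\eta} \ge 0$ and $u$ is real.

```latex
\begin{aligned}
\big\| \langle x\rangle^{-2}\langle y\rangle^{-2} G(x-y) \big\|_{L^2(dx\,dy)}^2
 &\le |C|^2 \int_{\mathbb{R}^3} dy\, \langle y\rangle^{-4}
      \int_{\mathbb{R}^3} dx\, \frac{\langle x\rangle^{-4}}{|x-y|^{2}}, \\
\int_{\mathbb{R}^3} dx\, \frac{\langle x\rangle^{-4}}{|x-y|^{2}}
 &\le \int_{|x-y|\le 1} \frac{dx}{|x-y|^{2}}
    + \int_{\mathbb{R}^3} \langle x\rangle^{-4}\,dx
  = 4\pi + \pi^2, \\
\int_{\mathbb{R}^3} \langle y\rangle^{-4}\,dy
 &= 4\pi \int_0^\infty \frac{r^2\,dr}{(1+r^2)^2} = \pi^2 .
\end{aligned}
```

Under these assumptions the norm is at most $|C|\,\big(\pi^2(4\pi+\pi^2)\big)^{1/2}$, with no dependence on α, u or η beyond the constant C.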
Lemma 5.2. Define Bη (α, p, r) as in (2.32), let N > 0 and n, n ≤ N . If Vˆ0 N +2,N +2 ≤ λ0 1, then there exists a constant M depending only on N such that
|p1 − p2 n ∇p1 n ∇p2 n Bη (α, p1 , p2 )| ≤ M λ0 , M λ0 |p1 − p2 n ∇p1 n ∇p2 n ∂α Bη (α, p1 , p2 )| ≤ . |α + iη|1/2 Proof. We omit η from the notation. For a fixed k, consider Vˆ0 (qk − p2 ) Vˆ0 (q1 − q2 ) B(k; α, p1 , p2 ) := dqk Vˆ0 (p1 − q1 ) ··· , 2 α − q1 /2 + iη α − qk2 /2 + iη where the k = 0 term is defined to be Vˆ0 (p1 − p2 ). This implies
∇p1 n ∇p2 n B(k; α, p1 , p2 ) Vˆ0 (q1 − q2 ) (∇n Vˆ0 )(qk − p2 ) = dqk (∇n Vˆ0 )(p1 − q1 ) · · · α − q12 /2 + iη α − qk2 /2 + iη = dqk (∇n Vˆ0 )(p1 − p2 − q1 ) ×
(∇n Vˆ0 )(qk ) Vˆ0 (q1 − q2 ) · · · . α − (q1 + p2 )2 /2 + iη α − (qk + p2 )2 /2 + iη
By writing p = p1 − p2 and u = p2 , it suffices to bound pn ∇p n B(α, u)k (∇n Vˆ0 )(p) = pn ∇p n B(α, u) ◦ · · · ◦ B(α, u)(∇n Vˆ0 )(p).
k
We can now apply the previous lemma to get |pn ∇p n B(α, u)k (∇n Vˆ0 )(p)| ≤ Cλ0 B(α, u)k−1 (∇n Vˆ0 ) n,2 ,
and inductively, using the bounds on $\|B(\alpha,u)f\|_{n,n'}$, we obtain $|\langle p\rangle^{n}\langle\nabla_p\rangle^{n'} B(\alpha,u)^k(\nabla^{n''}\hat V_0)(p)| \le (C\lambda_0)^{k+1}$. By definition, $B(\alpha,p_1,p_2) = \sum_{k=0}^{\infty} B(k;\alpha,p_1,p_2)$. Summing the last bound yields the first estimate. The second estimate follows from the Leibniz rule and is proven in an analogous way.
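The summation step at the end of this proof is a geometric series. A short worked version, assuming the smallness condition on λ0 is taken in the concrete form $C\lambda_0 \le 1/2$ (this explicit threshold is an assumption made here for illustration, not a statement from the lemma):

```latex
\sum_{k=0}^{\infty} (C\lambda_0)^{k+1}
  = \frac{C\lambda_0}{1 - C\lambda_0}
  \le 2C\lambda_0
  \qquad \text{whenever } C\lambda_0 \le \tfrac12 ,
```

so in this sketch the constant in the first estimate can be taken as M = 2C, which depends only on N because C does.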
5.1. Summing for pair recollisions
This section provides the estimates on the two-obstacle Born series term mentioned in Sec. 3.4. For n1 ≥ 2, recall (4.11),
\[
B_{n_1,\nu}(\alpha, p_0, r_0) := \int du_{0,n_1-2}\, B(\alpha, p_0, u_0)\,
\frac{B(\alpha, u_0, u_1)\, e^{i\nu u_0}}{\alpha - u_0^2/2 + i\eta}\,
\frac{B(\alpha, u_1, u_2)\, e^{-i\nu u_1}}{\alpha - u_1^2/2 + i\eta}
\times \cdots \times
\frac{B(\alpha, u_{n_1-2}, r_0)\, \exp\big((-1)^{n_1-2} i\nu u_{n_1-2}\big)}{\alpha - u_{n_1-2}^2/2 + i\eta}, \tag{5.2}
\]
where 0 < η ≤ 1 and B is defined in (2.32). The dependence on η is omitted from the notation as the estimates below are uniform for 0 < η ≤ 1.

Lemma 5.3. Let n1 ≥ 2 and N > 0. Then there exists a constant M depending only on N such that for n ≤ N, we have
\[
|\langle p_1 - p_2\rangle^{n} B_{n_1,\nu}(\alpha, p_1, p_2)| \le (M\lambda_0)^{n_1} \langle\nu\rangle^{-(n_1-1)/2}, \tag{5.3}
\]
where $\|\hat V_0\|_{N+3,N+3} \le \lambda_0$.

Proof. Define the following operator (compare to (5.1))
\[
B_\nu(\alpha, u) f(p) := \int dq\, \frac{\hat V_0(p - q)\, e^{i\nu(q+u)}}{\alpha - (q + u)^2/2 + i\eta}\, f(q). \tag{5.4}
\]
We claim that for n, n′ ≤ N,
\[
|\langle p\rangle^{n} \langle\nabla_p\rangle^{n'} B_\nu(\alpha, u) f(p)| \le C\lambda_0 \langle\nu\rangle^{-1/2} \|f\|_{n,3}. \tag{5.5}
\]
To show this, we proceed as in Lemma 5.1, except that we estimate
|pn ∇p n Bν (α, u)f (p)| ≤ (∇x )j xn V0 (x) dy Gν,u (x − y)(∇y )j fˇ(y) j+j =n
≤
j+j =n
where Gν,u (x) =
L1 (dx)
Vˆ0 n,n +3 x−3 y−3 Gν,u (x − y)L2 (dxdy) y3 (∇y )j fˇ(y) 2 , √ 1 i|x+ν| α+iη−iux . |x+ν| e
It is easy to see that
x−3 y−3 Gν,u (x − y) L2 (dx dy) ≤ Cν−1/2 , which justifies (5.5). For the proof of Lemma 5.3, we write p = p1 − p2 , u = p2 and estimate
n k0
p (B (α, u) ◦ Bν (α, u) ◦ B k1 (α, u) ◦ · · · ◦ B(−1)n1 −2 ν (α, u) 0 0
kn1−1 ˆ
(α, u) ◦ V0 )(p) ◦ B0 by applying (5.5) repeatedly as in Lemma 5.2. We then sum over k0 , . . . , kn1 −1 to complete the proof of Lemma 5.3.

6. Wigner Transform of Main Term

6.1. Renormalization
Recall (2.32) and define
\[
\phi_A(t, p_0) := \int dp_m\, \psi_0(p_m)\, \chi(A; p_{0,m})\, K(t; p_{0,m}) \prod_{j=1}^{m} B(p_m^2/2, p_{j-1}, p_j). \tag{6.1}
\]
We note that B = Bη and all quantities derived from it depend on η := η(t) (see (2.29)) throughout the whole section but this fact will be omitted from the notation. At the end, we will use that the necessary estimates on B are uniform in η. As usual, we define $\phi_m^{\text{no rec}} := \sum^{\text{no rec}}_{A:|A|=m} \phi_A$ and recall the definition of $\psi_m = \psi_m^{\text{no rec}}$ from (2.17). We will suppress the “no rec” notation in the following section. We will first estimate the error of replacing ψm with φm.

Lemma 6.1. We have
\[
\mathbf{E}\|\psi_m(t) - \phi_m(t)\|^2 \le C (M\lambda_0)^m\, m!\, \varrho^{1/2} (\varrho t)^{m-1/2} (\log t)^{m+O(1)}.
\]
This implies that for fixed m and our scaling $\varrho = \varrho_0\varepsilon$, $t = T\varepsilon^{-1}$, we obtain
\[
\lim_{\varepsilon\to 0} \mathbf{E}\|\psi_m(t) - \phi_m(t)\|^2 = 0.
\]
Proof. We begin by appealing to Lemma 2.1 to write m B(pj−1 , pj ) ieηt dα e−iαt φA (t; p0 ) = dpm χ(A; p0,m )ψ0 (pm ) 2 2π α − p0 /2 + iη j=1 α − p2j /2 + iη no rec and applying (2.32) to ψA . To compute E A:|A|=m (ψA − φA ) 2 , we appeal to Lemma 3.3 with k−1 m B(α, pj−1 , pj ) ieηt e−iαt ψ0 (pm ) dα G(p0,m ) = 2π α − p20 /2 + iη α − p2j /2 + iη j=1 k=1 m B(α, pk−1 , pk ) − B(p2m /2, pk−1 , pk ) B(p2m /2, pj−1 , pj ) . × α − p2k /2 + iη α − p2j /2 + iη k+1
We now proceed by estimating
m
lb lb 2m−b
dp0,b dpb G(p0 , pb )G(p0 , pb )∆σ (p0 , pb , pb ) b=0 σ∈S(b) ,
using Proposition 2.2 as in our previous crossing estimates. This time, we do not exploit any structure of the pairings and crudely estimate the integrands with variables p b−1 in L∞ . However, we eliminate a factor of t1/2 as a result of the bound 1/2 |B(α , p j−1 , p j ) − B(p 2 m /2, pj−1 , pj )| ≤ C(M λ0 )t
|α − p 2 m /2| pj−1 − p j 30
which follows trivially from Lemma 5.2. Indeed, the numerator will cancel its corresponding singular factor $|\alpha - p_m^2/2 + i\eta|^{-1}$ and consequently we eliminate a total factor of $t^{1/2}$. For $v \in \mathbb{R}^3$, define
\[
T_v(p, q) := B\big((p_m + v)^2/2,\; p - v,\; q - v\big), \qquad T_v(p) := T_v(p, p), \tag{6.2}
\]
where we have suppressed the dependence on pm in our notation. When v = 0, we will also conveniently drop the subscript v altogether. Moreover, these quantities also depend on η. We now define the renormalized operator kernels
\[
B^{\mathrm{ren}}(p_{j-1}, p_j) := T(p_{j-1}, p_j) - \frac{1}{|\Lambda|}\, T(p_j)\, \delta(p_{j-1} - p_j), \tag{6.3}
\]
\[
K^{\mathrm{ren}}(t; p_{0,m}) := (-i)^m \int_0^{t*} [ds_j]_0^m \prod_{j=0}^{m} e^{-i s_j (p_j^2/2 + \varrho T(p_j))}. \tag{6.4}
\]
We recall that the momenta are on a discrete lattice (see (2.1) and (2.2)) before we take L → ∞. The benefit of renormalization is that
\[
\mathbf{E}_\alpha\, B^{\mathrm{ren}}(p_{j-1}, p_j)\, e^{i x_\alpha (p_{j-1} - p_j)} = 0. \tag{6.5}
\]
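The cancellation in (6.5) can be illustrated with a one-dimensional toy model. The sketch below assumes a periodic box of length `box_length`, momenta on the dual lattice 2πn/L, and a uniformly distributed obstacle position; `toy_T` is a hypothetical stand-in for T(p, q), and none of these names come from the paper. It relies on the lattice convention that δ(p − q)/|Λ| acts as a Kronecker delta.

```python
import cmath
import math


def box_average_phase(n_p, n_q, box_length=7.0, samples=64):
    """Average of exp(i*x*(p - q)) over positions x in a periodic box.

    Momenta lie on the dual lattice p = 2*pi*n/box_length (1D toy model).
    """
    p = 2.0 * math.pi * n_p / box_length
    q = 2.0 * math.pi * n_q / box_length
    # Equally spaced samples reproduce the exact box average as long as
    # |n_p - n_q| < samples, because the terms are roots of unity.
    total = sum(
        cmath.exp(1j * (k * box_length / samples) * (p - q))
        for k in range(samples)
    )
    return total / samples


def toy_T(n_p, n_q):
    """Hypothetical stand-in for T(p, q); any function would do here."""
    return 1.0 / (1.0 + (n_p - n_q) ** 2) + 0.1 * n_q


def b_ren(n_p, n_q):
    """Toy analogue of (6.3): subtract the diagonal part of T."""
    diagonal = toy_T(n_q, n_q) if n_p == n_q else 0.0
    return toy_T(n_p, n_q) - diagonal


def expected_renormalized(n_p, n_q):
    """Toy analogue of the left-hand side of (6.5)."""
    return b_ren(n_p, n_q) * box_average_phase(n_p, n_q)


if __name__ == "__main__":
    worst = max(
        abs(expected_renormalized(a, b))
        for a in range(-3, 4)
        for b in range(-3, 4)
    )
    print("largest |E B_ren e^{ix(p-q)}| on the toy lattice:", worst)
```

In this toy setting the average of the phase is nonzero only on the diagonal p = q, which is exactly where the renormalized kernel vanishes, so the product is zero up to rounding. Without the subtraction the diagonal entries would survive the expectation.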
m With the notation B ren (p0,m ) := j=1 B ren (pj−1 , pj ), and with a similar definition for T (p0,m ), we define the renormalized wave function with less than m external collisions to be φren <m (t; p0 ) :=
m−1
no rec
dpm K ren (t; p0,m )B ren (p0,m )χ(A; p0,m )ψ0 (pm ). (6.6)
m=0 A:|A|=m
Lemma 6.2. For = 0 ε, t = T ε−1 , we get 2 lim lim lim sup E φren <m (t) − φ<m (t) = 0.
m→∞ ε→0 L→∞
Proof. Using the definition of B ren (pj−1 , pj ), one can verify that
m−1 rec no
dpn K(t; p0,n )χ(A; p0,n )ψ0 (pn )T (p0,n )
n=0 A:|A|=n
=
rec m−1 no b=0 A:|A|=b
×
b
"
l0,b
l0,b <m−b
#
(T (pj ))j
1 , |Λ|
+O
j=0
where once again l0,b := l0,b K(t; p0,b )
l
0,b )χ(A; p0,b )B ren (p0,b )ψ0 (pb ) dpb K(t; p0,b
b
j=0 j .
(6.7)
Also, the identity
b
= (−i)
0
t∗
[dsj ]b0
b (−isj )j −isj p2j /2 e j j=0
implies the relation
∞
l0,b ) K(t; p0,b
b
(T (pj ))
j
= K ren (t; p0,b ).
(6.8)
j=0
0 ,...,b =0
With (6.8) and (6.7) in hand, it suffices bound m−1 E
no rec
b=0 A:|A|=b
l0,b ) × K(t; p0,b
dpb ψ0 (pb )B ren (p0,b )χ(A; p0,b )
l0,b
l0,b ≥m−b
b j=0
2 (T (pj )) . j
(6.9)
We expand the L2 -norm m−1
no rec
b,b =0
dp0,b dp b
ψ0 (pb )ψ0 (p b )
l0,b ,l0,b A,A |A| = b;|A | = b
b
(T (pj ))
j=0
l
j
b
(T (p j ))j
j=0
0,b × E[χ(A; p0 , pb )χ(A ; p0 , p b )]K(t; p0,b )K(t; p00 , p b lb )B ren (p0,b )B ren (p0 , p b )
m−1
b dp0,b dp b |ψ0 (pb )|2 B ren (p0,b )B ren (p0 , p b ) ≤
b=0 σ∈S(b) l0,b ,l0,b
l0,b × K(t; p0,b )K(t; p00 , p b lb )∆σ (p0,b , p b )
b
j j T (pj ) (T (pj )) ,
(6.10)
j=0
where after taking expectations, the property (6.5) of B ren forces the existence of a permutation σ such that A = σ(A) and ∆σ (p0,b , p b ) contains the pairing relations as usual. We then distinguish between direct terms (σ = Id) and crossing terms (σ = Id). For the direct terms, we will use the Schwarz inequality
(Direct) ≤
m−1
∞
b=0 m∗ =m
×
b j=0
l0,b l0,b = m∗ −b
∗ m b dp0,b |ψ0 (pb )|2 |B ren (p0,b )|2 b
l
0,b |T (pj )|2j |K(t; p0,b )|2 .
B ren can be estimated from (6.2), (6.3) and Lemma 5.2, and we have (Direct) ≤
m−1
∞
b=0 m∗ =m
×
l0,b l0,b = m∗ −b
m∗ b
2m
∗
−b 2m∗ λ0
sup pb
l
0,b )|2 dp0,b |K(t; p0,b
b 1 1 . 60 pb j=1 pj−1 − pj 60
We now use dispersive estimates as in the estimate of the direct terms of Lemma 3.5 to get (Direct) ≤
m−1
∞
b=0 m∗ =m
∗ ∗ C m λ2m 0
∗ ∗ ∞ T 2m −b (CT )m ≤ (m∗ − b)!b! m∗ =m m∗ !
which vanishes as we take m → ∞. To handle the crossing terms, σ = Id on the right-hand side of (6.10), we begin to proceed as in the direct estimate. An application of the Schwarz inequality and symmetry gives (Crossing) ≤
m−1
∞
b=2 σ=Id
m∗ =m
l0,b l0,b = m∗ −b
× ∆σ (p0,b , p b )|B ren (p0,b )|2
b
m∗ b
b
dp0,b dp b |ψ0 (pb )|2
l
0,b |T (pj )|2j |K(t; p0,b )|2 .
j=0
The conjugate momenta are then integrated out using the pairing functions. Following the steps of the direct estimate, we get (Crossing) ≤
m−1 b=2
m!
∗ ∞ (CT )m , m∗ ! m∗ =m
where the m! is due to estimating the number of permutations σ. This crude bound implies that we can appeal to the dominated convergence theorem in order to pass our limit in ε through the infinite sum. To get the limit, we return to (6.10) and expand K using Lemma 2.1. As in Lemma 3.5, the nontriviality of σ will imply that we can gain a factor compared to the crude bound. Finally, the dominated convergence theorem yields lim (Crossing) ≤
ε→0
for every fixed m.
m−1 b=0
m!
∞ m∗ =m
∗
lim C m T 2m
ε→0
∗
−1
=0
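6.2. Computation of Wigner transform
The rescaled Husimi function associated with $\psi^{\varepsilon}_{\omega,T\varepsilon^{-1}}$ (see (1.10)) can be written as
\[
H^{(\varepsilon,\mu)}_{\psi}(X_0, V_0) = \big(W^{\varepsilon}_{\psi} *_{X_0} G^{\varepsilon^{\mu}} *_{V_0} G^{\varepsilon^{\mu}}\big)(X_0, V_0)
= \int dx\, dw_0\, W^{\varepsilon}_{\psi}(x, w_0)\, G^{\varepsilon^{\mu}}(x - X_0)\, G^{\varepsilon^{\mu}}(w_0 - V_0),
\]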
where G is the Gaussian function with scaling given in (1.7) and Wψε (x, w0 ) := ε−3 Wψ (x/ε, w0 ) is the rescaled Wigner transform. Recall our decomposition (2.18). It has the disadvantage that our threshold m0 is dependent on ε. To cure this, fix M ∗ > 0 and write e
itH
ψ0 =
∗ M −1
m=0
no rec ψm (t) +
m 0 −1
no rec ψm (t) + Ψerror m0 (t).
m=M ∗
According to Lemma 3.1, the last term vanishes in the ε → 0 limit when we set m0 = m0 (ε) as in (3.42). We have the bound " m # T no rec + m!T m−1 (log T ε−1 )m+5 (T ε−1 ) 2 ≤ (M λ0 )m T m E ψm m! which is essentially the same estimate as Lemma 3.5 except that we do not have time division and hence, it is easier to prove. This implies m −1 2 0 no rec lim lim E ψ (t) (6.11) lim = 0. m M ∗ →∞ ε→0 L→∞ ∗ m=M
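We ultimately need to prove that for any given bounded and continuous function J on $\mathbb{R}^6$ and any fixed 0 < T < ∞, we have
\[
\lim_{\varepsilon\to 0} \lim_{L\to\infty} \int dX_0\, dV_0\, J(X_0, V_0) \Big[ \mathbf{E} H^{(\varepsilon,\mu)}_{\psi(T\varepsilon^{-1})}(X_0, V_0) - F_T(X_0, V_0) \Big] = 0,
\]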
where FT is the solution of the Boltzmann equation (1.11). Recall that the Husimi function defines a probability density on R6 . Since the Boltzmann equation preserves positivity and the L1 (R6 )-norm, and F0 1 = 1, the solution FT (X0 , V0 ) is also a probability density. Therefore we have to prove weak convergence of probability measures. It is well known that it is sufficient to test such convergence for smooth, compactly supported test functions. For the rest of the section, we thus fix a function J ∈ S(R6 ). Using (1.10) and an argument nearly identical to the one justifying (2.10) in [9], we have
dX0 dV0 J(X0 , V0 )[EH (ε,µ)−1 (X0 , V0 ) − EH (ε,µ) −1 (X0 , V0 )] ψ(T ε ) ψ1 (T ε )
,
E ψ1 2 E ψ2 2 , (6.12) ≤ C sup sup Jˆε (ξ, V0 ) dξ ε<1
V0
for any decomposition ψ =: ψ1 + ψ2 and with $\hat J_\varepsilon(\xi, V_0) := \varepsilon^{-3} \hat J(\xi\varepsilon^{-1}, V_0)$. We recall the definition (6.6) and we set $\phi_t = \phi(t) := \phi^{\mathrm{ren}}_{<M^*}(t)$ for brevity. We apply the estimate (6.12) with ψ = ψt and ψ1 = φ(t). Then Lemmas 3.1, 6.1 and 6.2, and Eqs. (2.18) and (6.11) imply that it suffices to show that
\[
\lim_{M^*\to\infty} \lim_{\varepsilon\to 0} \lim_{L\to\infty} \mathbf{E} H^{(\varepsilon,\mu)}_{\phi(T\varepsilon^{-1})}(X_0, V_0) = F_T(X_0, V_0) \tag{6.13}
\]
in S . An application of the Fourier inversion theorem gives us the following identity (ε,µ) -φ (εp, w0 ) 1 Gε−µ (p)Gεµ (w0 − V0 ), Hφ (X0 , V0 ) = dp dw0 eipX0 W ε3µ where
ξ ξ Wφ (ξ, w0 ) = φt w0 − φt w0 + 2 2
is the Fourier transform of the Wigner function in the first variable. Using (6.6), we have ∗ M −1 εp Wφ (εp, w0 ) = dpm dp n χ A; w0 + , pm 2 m,n=0 A:|A|=m A :|A |=n εp × χ A ; w0 − , p n 2 εp εp × K ren t; w0 + , pm K ren t; w0 − , p n 2 2 εp εp ε ren ε ren × ψ0 (pm )ψ0 (p n )B w0 + , pm B w0 − , pn . 2 2 We next take the expectation of this expression, and we use that the renormalization forces n = m and A = σ(A) for σ ∈ S(n) (see (6.5)). Rename variables wj := pj − εp 2 and wj := p j + εp 2 and define the following Bpren (w0,m ) :=
m
B ren (wj−1 + p, wj + p),
j=1
B
ren
(wj−1
(wm + p)2 , wj−1 + p, wj + p + p, wj + p) := B 2 (wm + p)2 1 B , wj + p, wj + p δ(wj−1 − wj ), − |Λ| 2
K ren(t; w0,m ± p) := K ren(t; w0 ± p, . . . , wm ± p).
Using Lemma 3.3, the limit (3.5) and a change of variables p → 2p, we obtain that (ε,µ) dX0 dV0 J(X0 , V0 )EHφ (X0 , V0 ) lim L→∞
=
∗ M −1
2 3 m
dX0 dV0 J(X0 , V0 )
m=0 σ∈S(m)
×
dp dw0 e2ipX0
×
dwm dw m
µ 1 ε−µ G (2p)Gε (w0 − V0 ) ε3µ
m
δ[(wj−1 − wj ) − (wσ(j)−1 − wσ(j) )]
j=1 ren ren (w ˆ × K ren(t; w0,m + εp)K ren(t; w0,m − εp)Bεp (w0,m )B−εp 0,m )Wψ0ε (2εp, wm ).
(6.14) As before, we can show that cross terms, which arise from terms in which σ = Id, are smaller by a factor of t−1 = O(ε) and vanish in the ε → 0 limit (recall we are taking ε → 0 for a fixed M ∗ ). The proof of this statement is nearly identical to estimate of the crossing terms in Lemma 3.5 except without the time-division and using K ren in place of K. One can prove a representation for the renormalized kernel (which differs by a small perturbation in the dispersion relation from the free kernel) analogous to Lemma 2.1 and subsequently, estimates mirroring those in Proposition 2.2 and in Lemma 3.5. The key observation is that in the necessary estimates, the renormalization can be removed from the propagators using the bound
C 1
α − (p2 /2 + T (pj )) + iη ≤ |α − p2 /2 + iη| j j that follows from
1 1
−
α − (p2 /2 + T (pj )) + iη α − p2 /2 + iη j j ≤
|T (pj )| |α − (p2j /2 + T (pj )) + iη| |α − p2j /2 + iη|
≤
C , |α − p2j /2 + iη|
using η −1 ≤ C. Consequently, we are left with only the direct terms (σ = Id) after the ε → 0 limit. Our next step is to replace m 2 ren ren (wm ± εp) , wj−1 ± εp, wj ± εp B B±εp (w0,m ) = 2 j=1
with m
(wm ± εp)2 T±εp (w0,m ) = , wj−1 ± εp, wj ± εp B 2 j=1
in (6.14). That is, we remove the renormalization on the potential part. By definition, we have 2 ren (wm ± εp) , wj−1 ± εp, wj ± εp B 2 (wm ± εp)2 =B , wj−1 ± εp, wj ± εp 2 (wm ± εp)2 1 δ(wj−1 − wj )B , wj ± εp, wj ± εp . − |Λ| 2 From this, we can show that for a fixed M ∗ , that the error associated with replacing ren (w0,m ) with Tεp (w0,m ) will be of order |Λ|−1 and hence it vanishes in the Bεp L → ∞ limit. It is important to note here that our momenta are on the discrete lattice and hence we have the identity (δ(p)/|Λ|)2 = δ(p)/|Λ|. In particular, higher powers of the delta functions that may arise in the product are harmless. The free-evolution portion in (6.14) can be written, using our scaling t = T ε−1 and = 0 ε, as K ren(t; w0,m + εp)K ren (t; w0,m − εp) t∗ t∗ m 2 2 m [dsj ]m [ds ] e−isj [(wj +εp) /2+Tεp (wj )] eisj [(wj −εp) /2+T−εp (wj )] = 0 j 0 0
2m = m ε
0
0
T∗
j=0
[daj ]m 0
m
e−iaj [2wj p+0 T (wj ;εp)]
j=0
m j=0
aj /ε
−aj /ε
dbj e−ibj Ω(wj ;εp) δ(Σbj ),
where T (wj ; εp) := Tεp (wj ) − T−εp (wj ), Ω(wj ; εp) := wj2 + Tεp (wj ) + T−εp (wj ) , and we introduced aj = 2ε (sj + s j ), bj = sj − s k . We also define m aj /ε −ibj Ω(wj ;εp) dbj e δ(Σbj ) Mε (a0,m ; w0,m , εp) := j=0
−aj /ε
= Rm
where
[dbj ]m−1 χa0,m ε−1 (b) 0
χa0,m ε−1 (b) := χ −am ε−1 ≤
m−1
e−ibj [Ω(wj ;εp)−Ω(wm ;εp)] , (6.15)
j=0
m−1 j=0
bj ≤ am ε−1
m−1 j=0
χ(−aj ε−1 ≤ bj ≤ aj ε−1 ).
Neglecting terms which vanish in the L → ∞ and ε → 0 limits, we have
(ε,µ)
dX0 dV0 J(X0 , V0 )EHφM (X0 , V0 ) =2
3
∗ −1 M
m
(20 )
dX0 dV0 J(X0 , V0 )
dp dw0,m e2ipX0
m=0
1 ε−µ G (2p) ε3µ
µ -ψε (2εp, wm )Tεp (w0,m )T−εp (w0,m ) × Gε (w0 − V0 )W 0 T∗ m × [daj ]m e−iaj [2wj p+0 T (wj ;εp)] Mε (a0,m ; w0,m , εp). 0
0
(6.16)
j=0
We would now like to argue that $p \sim \varepsilon^{-\mu}$ and therefore we can replace instances of εp with 0 since we are taking ε → 0. We also would like to remove the restriction $\chi_{a_{0,m}\varepsilon^{-1}}$ from (6.15) to extend the $db_j$ integration and pick up the on-shell delta function. To justify this rigorously, we need the following proposition.

Proposition 6.3. Let $a_j \le T$ for j = 0, . . . , m, let $M_\varepsilon$ be defined in (6.15) and $f \in \mathcal{S}(dw_{0,m})$. We have
\[
\sup_p \Big| \int dw_{0,m}\, f(w_{0,m})\, M_\varepsilon(a_{0,m}; w_{0,m}, \varepsilon p) \Big| \le C(T) \int dw_m\, |||f|||_{dw_{0,m-1}},
\]
where C(T) is independent of ε. Moreover, for each fixed set of values $a_j > 0$, we have
\[
\int dw_{0,m}\, f(w_{0,m})\, M_\varepsilon(a_{0,m}; w_{0,m}, \varepsilon p) \to \int dw_{0,m}\, f(w_{0,m}) \prod_{j=0}^{m-1} 2\pi\delta(w_j^2 - w_m^2),
\]
as ε → 0. The same limit holds if $f = f_\varepsilon$ depends on ε, but $\int dw_m\, |||f_\varepsilon|||_{dw_{0,m-1}}$ is uniformly bounded.
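The mechanism behind the proposition can be seen in a single variable: the integral of exp(−ibΩ) over |b| ≤ a/ε equals 2 sin(aΩ/ε)/Ω, which acts like 2πδ(Ω) on smooth test functions as ε → 0. The sketch below is a one-dimensional caricature only. It ignores the constraint on the sum of the b_j and the T-dependent terms in Ω, and the Gaussian test function and the function names are illustrative choices, not objects from the paper. For this test function the smeared value is known in closed form, 2π·erf(L/√2) with L = a/ε.

```python
import math


def truncated_kernel(omega, half_width):
    """Integral of exp(-i*b*omega) over -half_width <= b <= half_width."""
    if omega == 0.0:
        return 2.0 * half_width
    return 2.0 * math.sin(half_width * omega) / omega


def smeared_kernel(half_width, cutoff=12.0, cells=200_000):
    """Midpoint-rule integral of f(omega) * truncated_kernel(omega).

    The test function is the Gaussian f(omega) = exp(-omega**2 / 2),
    integrated over [-cutoff, cutoff], where its tails are negligible.
    """
    step = 2.0 * cutoff / cells
    total = 0.0
    for k in range(cells):
        omega = -cutoff + (k + 0.5) * step
        weight = math.exp(-0.5 * omega * omega)
        total += weight * truncated_kernel(omega, half_width)
    return total * step


if __name__ == "__main__":
    for half_width in (1.0, 5.0, 50.0):
        numeric = smeared_kernel(half_width)
        exact = 2.0 * math.pi * math.erf(half_width / math.sqrt(2.0))
        print(half_width, numeric, exact)
    print("limit 2*pi*f(0):", 2.0 * math.pi)
```

As the half-width a/ε grows, the smeared value approaches 2π times the test function at Ω = 0, which is the one-variable version of the on-shell factor 2πδ(w_j² − w_m²).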
This result says that Mε creates the “on-shell” condition in the limit. Proof. For ε > 0, we can use Fubini to write dw0,m f (w0,m )Mε (a0,m ; w0,m , εp) m−1 2 m−1 −1 = [dbj ]0 χa0,m ε (b) dw0,m e−ibj wj Rm
× f (w0,m )
j=0 m−1 j=0
2
e−ibj [−wm +(Tεp (wj )+T−εp (wj )−Tεp (wm )−T−εp (wm )] .
We now apply the dispersive estimate iteratively, (3.17) to get the bound
dw0,m f (w0,m )Mε (a0,m ; w0,m , εp)
m−1 m−1 −3/2 [dbj ]0 χa0,m ε−1 (b) bj dwm ≤C Rm
j=1
m−1
2 −ibj [−wm +(Tεp (wj )+T−εp (wj )−Tεp (wm )−T−εp (wm )]
×
f (w0,m ) e
j=0
. dw0,m−1
A simple computation bounds the factor in the triple norm by m−1 |||f |||dw0,m−1 j=0 bj 2 which is independent of p. The condition of |bj | ≤ aj ε−1 and = 0 ε allows us to complete the proof of the first statement of Proposition 6.3. For the second statement, define m−1 2 2 ˜ ε (a0,m ; w0,m ) := −1 M [dbj ]m−1 χ (b) e−ibj (wj −wm ) , a0,m ε 0 Rm
j=0
M (a0,m ; w0,m ) :=
Rm
[dbj ]m−1 0
m−1
2
2
e−ibj (wj −wm ) =
j=0
m−1
2 2πδ(wj2 − wm ).
j=0
Then
dw0,m f (w0,m ) Mε (a0,m ; w0,m , εp) − M ˜ ε (a0,m ; w0,m )
m−1
−1 (b) ≤
[dbj ]m−1 χ bj −3/2 dwm a ε j=0 0,m j=0
m−1
−ibj [T−εp (wj )+Tεp (wj )−T−εp (wm )−Tεp (wm )]
×
f (w0,m ) 1 − e
dw0,m−1 j=0 ≤ C(T )ε1/2 |log ε| dwm |||f |||dw0,m−1 , since the factor in the triple norm is bounded by j |bj | for |bj | ≤ aj ε−1 . Finally,
˜ ε (a0,m ; w0,m ) − M (a0,m ; w0,m )
dw0,m f (w0,m ) M
≤C
dwm |||f |||dw0,m−1
[dbj ]m−1 0
m−1 1 − χa0,m ε−1 (b) bj −3/2 . j=0
When aj > 0, j = 0, . . . , m, the dbj integral goes to zero as ε → 0 and we prove the proposition. We now show that we can make the the following replacement Tεp (w0,m )T−εp (w0,m )e−i0 Σaj T (wj ;εp) → |T (w0,m )|2 e−i0 Σaj T (wj ;0) ,
in (6.16). By Proposition 6.3, it suffices to estimate
−µ µ 1 -ψε (2εp, wm ) dwm
dp dX0 dV0 J(X0 , V0 )e2ipX0 3µ Gε (2p)Gε (w0 − V0 )W 0 ε
T∗
× 0
2 −i0 Σaj T (wj ;0) [daj ]m 0 |T (w0,m )| e
− Tεp (w0,m )T−εp (w0,m )e
−i0 Σaj T (wj ;εp)
.
(6.17)
dw0,m−1
Using Lemma 5.2 and the smoothness of J in the V0 variable, the expression in the square brackets can be shown to be order η −1/2 ε|p| ∼ ε1/2 |p|. Since |p| ε−µ apart from exponentially small terms, we see that (6.17) vanishes if µ < 12 . Summarizing these replacements, we are left with, to leading order, (ε,µ) dX0 dV0 J(X0 , V0 )EHφ (X0 , V0 ) =2
3
∗ −1 M
(20 )m
dp dw0,m e2ipX0
dX0 dV0 J(X0 , V0 )
m=0 µ 1 ε−µ -ψε (2εp, wm )|T (w0,m )|2 e−i0 ΣT (wj ;0) G (2p)Gε (w0 − V0 )W 0 ε3µ T∗ m−1 −iΣ2aj wj p 2 × [daj ]m 2πδ(wj2 − wm ). 0 e
×
0
j=0
We now use Fourier inversion to write −µ 1 -ψε (2εp, wm )e−iΣ2aj wj p 23 dp e2ipX0 3µ Gε (2p)W 0 ε ε µ = dx Gε X0 − x − Σm 0 aj wj Wψ0ε (x, wm ). Using our explicit form for the initial wave function, ψ0ε (x) := ε3/2 h(εx)eiu0 x , it can be verified that (ε,µ) lim dX0 dV0 J(X0 , V0 )EHφ (X0 , V0 ) ε→0
=
∗ M −1
m=0
m 0
dX0 dV0 J(X0 , V0 ) 0
T∗
[daj ]m 0
2T 0 Im T (w0 ) × F0 (X0 − Σm 0 aj wj , wm )e
m
dw0,m δ(V0 − w0 ) 2 4π|T (wj−1 , wj )|2 δ(wj2 − wm ),
j=1
where F0 (X, V ) = |h(X)|2 δ(V − u0 ) is the initial condition to the Boltzmann equation. Taking M ∗ → ∞, we proved the limit (6.13), where FT is the solution to the Boltzmann equation (written in the iterated time integration form) with collision kernel 4π0 |T (wj−1 , wj )|2 .
Finally, we have to identify the collision kernel. We recall that by definition
\[
T(w_{j-1}, w_j) = B_\eta\Bigl(\frac{w_m^2}{2},\, w_{j-1},\, w_j\Bigr).
\]
We also recall that the dependence of $B$ on $\eta$ is controlled by Lemma 5.1, and that $\lim_{\eta\to 0+0} B_\eta$ is identified with the scattering T-matrix $T_{\mathrm{scat}}$ in (2.33). We therefore have
\[
\prod_{j=1}^{m} |T(w_{j-1}, w_j)|^2\, \delta(w_j^2 - w_m^2) = \prod_{j=1}^{m} |T_{\mathrm{scat}}(w_{j-1}, w_j)|^2\, \delta(w_{j-1}^2 - w_j^2).
\]
Defining $\sigma(U, V) := 4\pi |T_{\mathrm{scat}}(U, V)|^2 \delta(U^2 - V^2)$, and applying the optical theorem to get
\[
\mathrm{Im}\, T_{\mathrm{scat}}(V, V) = -\frac{\sigma}{2} = -\frac{1}{2}\int dU\, \sigma(U, V),
\]
we conclude that
\[
E H^{(\varepsilon,\mu)}_{\psi^{\varepsilon}_{T/\varepsilon}}(X_0, V_0) \to F_T(X_0, V_0) \tag{6.18}
\]
as ε → 0 in S′(R6), where FT (X0 , V0 ) solves the Boltzmann equation with collision kernel Σ(U, V ) := 0 σ(U, V ). This completes the proof of the Main Theorem.

Acknowledgments

This work began as a joint project with H.-T. Yau and many of the ideas here have been developed in collaboration with him (see the conference proceeding announcement [10]). We would like to thank him for his support and advice, without which this work would not have been possible.

References

[1] M. Aizenman, Localization at weak disorder: Some elementary bounds, Rev. Math. Phys. 6 (1994) 1163–1182.
[2] M. Aizenman and S. Molchanov, Localization at large disorder and at extreme energies: An elementary derivation, Commun. Math. Phys. 157 (1993) 245–278.
[3] P. Anderson, Absence of diffusion in certain random lattices, Phys. Rev. 109 (1958) 1492–1505.
[4] C. Boldrighini, L. Bunimovich and Y. Sinai, On the Boltzmann equation for the Lorentz gas, J. Stat. Phys. 32 (1983) 477–501.
[5] T. Chen, Localization lengths and Boltzmann limit for the Anderson model at small disorders in dimension 3, preprint, xxx.lanl.gov/math-ph/0305051.
[6] T. Chen, L^r convergence of a random Schrödinger to a linear Boltzmann evolution, preprint, xxx.lanl.gov/math-ph/0407037.
[7] D. Dürr, S. Goldstein and J. Lebowitz, Asymptotic motion of a classical particle in a random potential in two dimensions: Landau model, Commun. Math. Phys. 113 (1987) 209–230.
[8] H. von Dreifus and A. Klein, Localization for random Schrödinger operators with correlated potentials, Commun. Math. Phys. 140 (1991) 133–147.
[9] L. Erdős and H.-T. Yau, Linear Boltzmann equation as the weak coupling limit of the random Schrödinger equation, Comm. Pure Appl. Math. LIII (2000) 667–735.
[10] L. Erdős and H.-T. Yau, Linear Boltzmann equation as the scaling limit of quantum Lorentz gas, Advances in Differential Equations and Mathematical Physics, Contemporary Mathematics 217 (1998) 137–155.
[11] J. Fröhlich and T. Spencer, Absence of diffusion in the Anderson tight binding model for large disorder or low energy, Commun. Math. Phys. 88 (1983) 151–184.
[12] J. Fröhlich, F. Martinelli, S. Scoppola and T. Spencer, Constructive proof of localization in the Anderson tight binding model, Commun. Math. Phys. 101 (1985) 21–46.
[13] G. Gallavotti, Rigorous theory of the Boltzmann equation in the Lorentz gas, Nota interna n. 358, Univ. di Roma (1970).
[14] T. G. Ho, L. J. Landau and A. J. Wilkins, On the weak coupling limit for a Fermi gas in a random potential, Rev. Math. Phys. 5 (1992) 209–298.
[15] H. Kesten and G. Papanicolaou, A limit theorem for stochastic acceleration, Commun. Math. Phys. 78 (1980) 19–63.
[16] A. Klein, Absolutely continuous spectrum in the Anderson model on the Bethe lattice, Math. Res. Lett. 1 (1994) 399–407.
[17] A. Klein, Spreading of wave packets in the Anderson model on the Bethe lattice, Commun. Math. Phys. 177 (1996) 755–773.
[18] L. J. Landau, Observation of quantum particles on a large space-time scale, J. Stat. Phys. 77 (1994) 259–309.
[19] M. Reed and B. Simon, Methods of modern mathematical physics, Scattering Theory, Vol. 3 (Academic Press, 1980).
[20] H. Spohn, Derivation of the transport equation for electrons moving through random impurities, J. Stat. Phys. 17 (1977) 385–412.
[21] H. Spohn, The Lorentz process converges to a random flight process, Commun. Math. Phys. 60 (1978) 277–290.
August 12, 2005 15:54 WSPC/148-RMP
J070-00241
Reviews in Mathematical Physics Vol. 17, No. 7 (2005) 745–768 c World Scientific Publishing Company
REMARKS ON SUFFICIENT CONDITIONS FOR CONSERVATIVITY OF MINIMAL QUANTUM DYNAMICAL SEMIGROUPS
CHANGSOO BAHN∗ and CHUL KI KO† Natural Science Research Institute, Yonsei University, Seoul 120-749, Korea ∗ [email protected] † [email protected] YONG MOON PARK Department of Mathematics, Yonsei University, Seoul 120-749, Korea [email protected] Received 20 January 2005 Revised 11 May 2005 We have obtained sufficient conditions for conservativity of minimal quantum dynamical semigroup by modifying and extending the method used in [1]. Our criterion for conservativity can be considered as a complement to Chebotarev and Fagnola’s conditions [1]. In order to show that our conditions are useful, we apply our results to concrete examples (models of heavy ion collision and noncommutative elliptic operators). Keywords: Quantum dynamical semigroups; criterion for conservativity; Lindblad operators; noncommutative elliptic operators.
1. Introduction

In this paper, we look for possible extensions of Chebotarev and Fagnola's sufficient conditions [1] for conservativity of the minimal quantum dynamical semigroup. By modifying and extending the method employed in [1], we have obtained sufficient conditions for conservativity which extend the previous ones in some directions. In order to show that our conditions are useful, we apply our results to concrete examples (models of heavy ion collision and noncommutative elliptic operators).

The concept of quantum dynamical semigroup (q.d.s.) has become a fundamental notion in the study of irreversible evolutions in quantum mechanics [2, 3], open systems [4] and quantum probability theory [5–7]. The theory of q.d.s. has been intensively studied in recent years, with special emphasis on the minimal q.d.s. as well as on sufficient conditions ensuring its conservativity (markovianity) [1, 8–14]. It is worth mentioning that attention has also been paid to the existence of
stationary states for a given conservative q.d.s. and faithfulness of the stationary states [15, 16].

A q.d.s. $T = (T_t)_{t\ge 0}$ in $B(h)$, the Banach space of bounded operators in a Hilbert space $h$, is a (ultraweakly continuous) semigroup of completely positive linear maps on $B(h)$. A q.d.s. $T$ is conservative if $T_t(I) = I$, where $I$ is the identity operator on $h$. In rather general cases, the infinitesimal generator $\mathcal{L}$ can be written (formally) as
\[
\mathcal{L}(X) = i[H, X] - \frac{1}{2} XM + \sum_{l=1}^{\infty} L_l^* X L_l - \frac{1}{2} MX, \qquad X \in B(h), \tag{1.1}
\]
where $M = \sum_{l=1}^{\infty} L_l^* L_l$, $L_l$ is densely defined and $H$ is a symmetric operator on $h$ [17, 7]. However, for an unbounded generator $\mathcal{L}$ in (1.1) with (unbounded) coefficients $H$ and $L_l$, the solution $T$ of the quantum master Markov equation
\[
\frac{d}{dt} T_t(X) = \mathcal{L}(T_t(X)), \qquad T_0(X) = X, \tag{1.2}
\]
may not be unique and conservative [8, 18]. Under suitable conditions, the above equation (1.2) has a minimal solution known as the minimal q.d.s. (see Sec. 2). Moreover, if the minimal q.d.s. is conservative, it is the unique solution of the above equation. Also, the study of conservativity conditions is important in quantum probability because they play a key role in the proof of uniqueness and unitarity of solutions of a Hudson–Parthasarathy quantum stochastic differential equation [19–21].

Chebotarev gave necessary and sufficient conditions for conservativity [8]. Some of the conditions, however, are impossible to check in practice in many interesting examples. Simplified forms of sufficient conditions were developed in [1, 9, 10]. In particular, the form of sufficient conditions in [1] can be written as follows: there exists a positive self-adjoint operator $C$ bounded from below by $M$ satisfying a form inequality
\[
\mathcal{L}(C) \le bC, \tag{1.3}
\]
where $b$ is a constant.

The main aim of this work is to improve the inequality (1.3). Our form of sufficient conditions for conservativity is as follows: there exists a positive self-adjoint operator $C$ bounded from below by $\delta M$ for some $\delta > 0$ such that for all $\varepsilon \in (0, 1)$, the two inequalities
\[
\mathcal{L}(C) \le \varepsilon C^2 + bC + a\varepsilon^{-p} I, \tag{1.4}
\]
\[
i[H, C] + C^2 - \frac{1}{2}(MC + CM) \le \varepsilon C^2 + bC + a\varepsilon^{-p} I, \tag{1.5}
\]
hold for some constants $p \in (0, 1)$, $b \ge 0$ and $a \ge 0$. For details, see Theorem 3.1. If the positive self-adjoint operator $C$ satisfies (1.5), the inequality (1.4) clearly improves (1.3). Let us mention that if we choose $M$ for $C$,
\[
i[H, M] \le \varepsilon M^2 + bM + a\varepsilon^{-p} I
\]
is equivalent to (1.5).

To show that our conditions (1.4) and (1.5) are useful in practice, we give some relative bounds (Lemma 4.1) and apply our results to concrete q.d.s. associated with a quantum system with dissipative heavy ion collisions (Example 4.3) and with noncommutative elliptic operators (Example 4.6). The conservativity of the former has already been considered in [1]. However, applying our criterion, we are able to control local singularities of (derivatives of) coefficients of the infinitesimal generator (see Remark 4.5).

This paper is organized as follows. In Sec. 2, we give a brief review of the theory of minimal q.d.s. and criteria for conservativity. In Sec. 3, we first list our sufficient conditions for conservativity and then give the proof of our result. In Sec. 4, we give some relative bounds in order to apply the results of Sec. 3 to concrete q.d.s.

2. The Minimal Quantum Dynamical Semigroup

Let $h$ be a complex separable Hilbert space with the scalar product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$. Let $B(h)$ denote the Banach space of bounded linear operators on $h$. The uniform norm in $B(h)$ is denoted by $\|\cdot\|_\infty$ and the identity in $h$ is denoted by $I$. We denote by $D(G)$ the domain of an operator $G$ in $h$.

Definition 2.1. A quantum dynamical semigroup (q.d.s.) on $B(h)$ is a family $T = (T_t)_{t\ge 0}$ of operators on $B(h)$ with the following properties:

(i) $T_0(X) = X$, for all $X \in B(h)$,
(ii) $T_{t+s}(X) = T_t(T_s(X))$, for all $s, t \ge 0$ and all $X \in B(h)$,
(iii) $T_t(I) \le I$, for all $t \ge 0$,
(iv) (complete positivity) for all $t \ge 0$, all integers $n$ and all finite sequences $(X_j)_{j=1}^n$, $(Y_l)_{l=1}^n$ of elements of $B(h)$, we have
\[
\sum_{j,l=1}^{n} Y_l^* T_t(X_l^* X_j) Y_j \ge 0,
\]
(v) (normality) for every sequence $(X_n)_{n\ge 1}$ of $B(h)$ converging weakly to an element $X$ of $B(h)$, the sequence $(T_t(X_n))_{n\ge 1}$ converges weakly to $T_t(X)$ for all $t \ge 0$,
(vi) (ultraweak continuity) for every trace class operator $\rho$ on $h$ and all $X \in B(h)$, we have
\[
\lim_{t\to 0^+} \mathrm{Tr}(\rho\, T_t(X)) = \mathrm{Tr}(\rho X).
\]

We recall that, as a consequence of properties (iii) and (iv), for each $t \ge 0$ the map $T_t$ is a contraction, i.e.
\[
\|T_t(X)\|_\infty \le \|X\|_\infty, \qquad X \in B(h). \tag{2.1}
\]
Also recall that, as a consequence of properties (iv) and (vi), for all $X \in B(h)$, the map $t \mapsto T_t(X)$ is strongly continuous.
Definition 2.2. A q.d.s. $T = (T_t)_{t\ge 0}$ on $B(h)$ is called conservative if $T_t(I) = I$ for all $t \ge 0$.

As mentioned in the Introduction, the natural generator of a q.d.s. would be the Lindblad type generator [17, 7]. Letting
\[
G = -iH - \frac{1}{2} M, \qquad \text{where } M = \sum_{l=1}^{\infty} L_l^* L_l, \tag{2.2}
\]
the infinitesimal generator in (1.1) can be formally written as
\[
\mathcal{L}(X) = XG + G^* X + \sum_{l=1}^{\infty} L_l^* X L_l.
\]
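As a quick consistency check (a purely formal computation that ignores all domain questions and is not part of the original argument), one may substitute $G = -iH - \frac12 M$, with $H$ and $M$ treated as symmetric, into the last expression. This recovers (1.1), and evaluating at $X = I$ indicates why the identity (2.3) of assumption A is the natural conservativity condition at the level of the generator:

```latex
\begin{align*}
XG + G^{*}X
  &= X\Bigl(-iH - \tfrac12 M\Bigr) + \Bigl(iH - \tfrac12 M\Bigr)X
   = i[H,X] - \tfrac12\,(XM + MX), \\
\mathcal{L}(I)
  &= G + G^{*} + \sum_{l=1}^{\infty} L_l^{*}L_l
   = -M + M = 0 .
\end{align*}
```

Taking matrix elements $\langle v, \mathcal{L}(I)u\rangle = 0$ for $v, u \in D(G)$ gives the weak form used in (2.3).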
A very large class of q.d.s. was constructed by Davies [22] under the following assumption.

A. The operator $G$ is the infinitesimal generator of a strongly continuous contraction semigroup $P = (P(t))_{t\ge 0}$ in $h$. The domain of the operators $(L_l)_{l=1}^{\infty}$ contains the domain $D(G)$ of the operator $G$. For all $v, u \in D(G)$, we have
\[
\langle v, Gu\rangle + \langle Gv, u\rangle + \sum_{l=1}^{\infty} \langle L_l v, L_l u\rangle = 0. \tag{2.3}
\]

As a result of [10, Proposition 2.5], we can assume only that the domain of the operators $L_l$ contains a subspace $D$ which is a core for $G$ and that (2.3) holds for all $v, u \in D$.

For all $X \in B(h)$, consider the sesquilinear form $\mathcal{L}(X)$ on $h$ with domain $D(G) \times D(G)$ given by
\[
\langle v, \mathcal{L}(X)u\rangle = \langle v, XGu\rangle + \langle Gv, Xu\rangle + \sum_{l=1}^{\infty} \langle L_l v, X L_l u\rangle. \tag{2.4}
\]
Under the assumption A, one can construct a q.d.s. $T = (T_t)_{t\ge 0}$ satisfying the equation
\[
\langle v, T_t(X)u\rangle = \langle v, Xu\rangle + \int_0^t \langle v, \mathcal{L}(T_s(X))u\rangle\, ds \tag{2.5}
\]
for all $v, u \in D(G)$ and all $X \in B(h)$. For a strongly continuous family $(T_t(X))_{t\ge 0}$ of elements of $B(h)$ satisfying (2.1), the following are equivalent:

(i) equation (2.5) holds for all $v, u \in D(G)$,
(ii) for all $v, u \in D(G)$, we have
\[
\langle v, T_t(X)u\rangle = \langle P(t)v, XP(t)u\rangle + \sum_{l=1}^{\infty} \int_0^t \langle L_l P(t-s)v, T_s(X) L_l P(t-s)u\rangle\, ds. \tag{2.6}
\]
We refer to the proof of [1, Proposition 2.3]. A solution of the equation (2.6) is obtained by the iterations
\[
\begin{aligned}
\langle u, T_t^{(0)}(X)u\rangle &:= \langle P(t)u, XP(t)u\rangle, \\
\langle u, T_t^{(n+1)}(X)u\rangle &:= \langle P(t)u, XP(t)u\rangle + \sum_{l=1}^{\infty} \int_0^t \langle L_l P(t-s)u, T_s^{(n)}(X) L_l P(t-s)u\rangle\, ds,
\end{aligned} \tag{2.7}
\]
for all $u \in D(G)$. In fact, for all positive elements $X \in B(h)$ and all $t \ge 0$, the sequence of operators $(T_t^{(n)}(X))_{n\ge 0}$ is non-decreasing. Therefore it is strongly convergent and its limits for $X \in B(h)$ and $t \ge 0$ define the minimal solution $T^{(\min)}$ of (2.6) in the sense that, given another solution $(T_t)_{t\ge 0}$ of (2.5), one can easily check that
\[
T_t^{(\min)}(X) \le T_t(X) \le \|X\|_\infty I
\]
for any positive element $X$ and all $t \ge 0$. For details, we refer to [8, 11].

We recall here a necessary and sufficient condition for conservativity of the minimal q.d.s. obtained by Chebotarev. Let us consider the linear monotone maps $P_\lambda : B(h) \to B(h)$ and $Q_\lambda : B(h) \to B(h)$ defined by
\[
\langle v, P_\lambda(X)u\rangle = \int_0^{\infty} e^{-\lambda s} \langle P(s)v, XP(s)u\rangle\, ds, \tag{2.8}
\]
\[
\langle v, Q_\lambda(X)u\rangle = \sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda s} \langle L_l P(s)v, X L_l P(s)u\rangle\, ds, \tag{2.9}
\]
for all $\lambda > 0$, $X \in B(h)$ and $v, u \in D(G)$. It is easy to check that both $P_\lambda$ and $Q_\lambda$ are completely positive, and also that both $\lambda P_\lambda$ and $Q_\lambda$ are normal contractions in $B(h)$ (see [10, Sec. 2]).

The resolvent of the minimal q.d.s. $(R_\lambda^{(\min)})_{\lambda>0}$ defined by
\[
\langle v, R_\lambda^{(\min)}(X)u\rangle = \int_0^{\infty} e^{-\lambda s} \langle v, T_s^{(\min)}(X)u\rangle\, ds
\]
(with $X \in B(h)$ and $v, u \in h$) can be represented as
\[
R_\lambda^{(\min)}(X) = \sum_{k=0}^{\infty} Q_\lambda^k(P_\lambda(X)), \tag{2.10}
\]
the series being convergent for the strong operator topology (see [1, Theorem 3.1]).

Proposition 2.3. Suppose that the condition A holds and fix $\lambda > 0$. Then the sequence of positive operators $(Q_\lambda^k(I))_{k\ge 0}$ is non-increasing. Moreover, the following conditions are equivalent:

(i) the minimal q.d.s. $T^{(\min)}$ is conservative,
(ii) $\text{s-}\lim_{k\to\infty} Q_\lambda^k(I) = 0$.
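The link between conservativity and the powers $Q_\lambda^k(I)$ can be made plausible by the following formal computation. It is only a sketch, assuming A, the identity (2.3) and the linearity of $Q_\lambda$; it is not a replacement for the proofs in [1, 10].

```latex
% Step 1: by (2.3), (d/ds)||P(s)u||^2 = -sum_l ||L_l P(s)u||^2 for u in D(G).
\begin{align*}
\langle u, Q_\lambda(I)u\rangle
  &= \int_0^\infty e^{-\lambda s}\sum_{l=1}^\infty \|L_l P(s)u\|^2\,ds
   = -\int_0^\infty e^{-\lambda s}\,\frac{d}{ds}\|P(s)u\|^2\,ds \\
  &= \|u\|^2 - \lambda\,\langle u, P_\lambda(I)u\rangle ,
\end{align*}
% so that \lambda P_\lambda(I) = I - Q_\lambda(I).
% Step 2: insert this into the partial sums of (2.10) with X = I; the sum telescopes.
\begin{align*}
\lambda\sum_{k=0}^{n} Q_\lambda^{k}\bigl(P_\lambda(I)\bigr)
  = \sum_{k=0}^{n}\bigl(Q_\lambda^{k}(I) - Q_\lambda^{k+1}(I)\bigr)
  = I - Q_\lambda^{\,n+1}(I).
\end{align*}
```

Letting $n \to \infty$ gives $\lambda R_\lambda^{(\min)}(I) = I - \text{s-}\lim_{n} Q_\lambda^{n}(I)$. Since $T_t^{(\min)}(I) \le I$, the identity $\lambda R_\lambda^{(\min)}(I) = I$ holds exactly when $T_t^{(\min)}(I) = I$ for all $t \ge 0$, which is the equivalence of (i) and (ii).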
The above proposition has been proved in [1, 10]. Due to Proposition 2.3, the minimal q.d.s. is conservative whenever, for a fixed $\lambda > 0$, the series
\[
\sum_{k=0}^{\infty} \langle u, Q_\lambda^k(I)u\rangle \tag{2.11}
\]
is convergent for all $u$ in a dense subspace of $h$. In fact, in this case, condition (ii) of Proposition 2.3 holds because the sequence of positive operators $(Q_\lambda^k(I))_{k\ge 0}$ is non-increasing. Employing the above facts, Chebotarev and Fagnola obtained a criterion to verify the conservativity of the minimal q.d.s. (see [1, Sec. 4]). Here we give their result ([1, Theorem 4.4]):

Theorem 2.4. Under the assumption A, suppose that there exists a positive self-adjoint operator $C$ in $h$ with the following properties:

(a) the domain $D(G)$ of $G$ is contained in the domain of the positive square root $C^{1/2}$ and $D(G)$ is a core for $C^{1/2}$,
(b) the linear manifolds $L_l(D(G^2))$, $l \ge 1$, are contained in the domain of $C^{1/2}$,
(c) there exists a self-adjoint operator $\Phi$, with $D(G) \subset D(\Phi^{1/2})$ and $D(C) \subset D(\Phi)$, such that, for all $u \in D(G)$, we have
\[
-2\,\mathrm{Re}\langle u, Gu\rangle = \sum_{l=1}^{\infty} \|L_l u\|^2 = \|\Phi^{1/2} u\|^2,
\]
(d) for all $u \in D(C)$, we have $\|\Phi^{1/2} u\| \le \|C^{1/2} u\|$,
(e) there exists a positive constant $b$, depending only on $G$, $C$ and $L_l$, such that for all $u \in D(G^2)$,
\[
2\,\mathrm{Re}\langle C^{1/2} u, C^{1/2} Gu\rangle + \sum_{l=1}^{\infty} \|C^{1/2} L_l u\|^2 \le b\|C^{1/2} u\|^2. \tag{2.12}
\]

Then the minimal q.d.s. is conservative.

We will call the conditions in Theorem 2.4 the C-F sufficient condition.

3. Sufficient Condition for Conservativity

In this section, we extend, to some degree, the C-F sufficient condition for conservativity of the minimal q.d.s. First we introduce our assumption.

C. There exists a positive self-adjoint operator $C$ such that

(a) the domain of its positive square root $C^{1/2}$ contains the domain $D(G)$ of $G$ and $D(G)$ is a core for $C^{1/2}$. Also, the domain of $C$ contains the domain of $G^2$,
(b) the linear manifolds $L_l(D(G^2))$, $l \ge 1$, are contained in the domain of $C^{1/2}$,
(c) there exist $p \in (0, 1)$, $b \ge 0$ and $a \ge 0$ such that for any $\varepsilon \in (0, 1)$, the two inequalities
\[
2\,\mathrm{Re}\langle Cu, Gu\rangle \le -(1-\varepsilon)\|Cu\|^2 + b\|C^{1/2}u\|^2 + a\varepsilon^{-p}\|u\|^2 \tag{3.1}
\]
and
\[
2\,\mathrm{Re}\langle Cu, Gu\rangle + \sum_{l=1}^{\infty} \|C^{1/2} L_l u\|^2 \le \varepsilon\|Cu\|^2 + b\|C^{1/2}u\|^2 + a\varepsilon^{-p}\|u\|^2 \tag{3.2}
\]
hold for all $u \in D(G^2)$.

The following is our main result.

Theorem 3.1. Suppose that assumptions A and C hold for some positive self-adjoint operator $C$ and there exists a positive self-adjoint operator $\Phi$ in $h$ such that:

(a) the domain of the positive square root $\Phi^{1/2}$ contains the domain of $G$ and, for every $u \in D(G)$, we have
\[
-2\,\mathrm{Re}\langle u, Gu\rangle = \sum_{l=1}^{\infty} \langle L_l u, L_l u\rangle = \langle \Phi^{1/2} u, \Phi^{1/2} u\rangle,
\]
(b) the domain of $C$ is contained in the domain of $\Phi$ and, for some $\delta > 0$, we have
\[
\delta\langle \Phi^{1/2} u, \Phi^{1/2} u\rangle \le \langle C^{1/2} u, C^{1/2} u\rangle, \qquad \forall u \in D(C).
\]

Then the minimal q.d.s. is conservative.

Before proceeding to the proof of the above theorem, it may be worthwhile to make some remarks on the assumption C.

Remark 3.2. (a) If we choose the operator $C$ satisfying (3.1), the inequality (3.2) evidently improves (2.12) in the C-F sufficient condition.

(b) As mentioned in the Introduction, the inequality (3.1) can be written formally as
\[
i[H, C] + C^2 - \frac{1}{2}(MC + CM) \le \varepsilon C^2 + bC + a\varepsilon^{-p} I.
\]
Indeed, formally $2\,\mathrm{Re}\langle Cu, Gu\rangle = \langle u, (i[H, C] - \frac{1}{2}(MC + CM))u\rangle$. If we choose $C = M$ $(= \sum_{l=1}^{\infty} L_l^* L_l)$, then (3.1) is equivalent to the following condition
\[
i\langle u, [H, M]u\rangle \le \varepsilon\|Mu\|^2 + b\|M^{1/2}u\|^2 + a\varepsilon^{-p}\|u\|^2.
\]
Thus, in many cases, the condition (3.1) is easier to check than (3.2).

(c) As Kato's relative bounds [23] control local singularities of potentials in the Schrödinger operator, we believe that the bounds in (3.1) and (3.2) will be able to control local singularities of (derivatives of) the coefficients of generators of q.d.s.

In the rest of this section, we give the proof of Theorem 3.1. The following lemma is an extension of the condition that the series (2.11) converges. It follows
from the fact that $(Q_\lambda^k(I))_{k\ge 0}$ is a positive and non-increasing sequence. We omit the proof, since it is obvious.

Lemma 3.3. Suppose that for fixed $\lambda > 0$, the series
\[
\sum_{k=0}^{\infty} \frac{1}{k+1} \langle u, Q_\lambda^k(I)u\rangle \tag{3.3}
\]
is convergent for all $u$ in a dense subspace of $h$. Then we have $\text{s-}\lim_{k\to\infty} Q_\lambda^k(I) = 0$.

By Proposition 2.3 and Lemma 3.3, the minimal q.d.s. is conservative whenever, for a fixed $\lambda > 0$, the series
\[
\sum_{k=0}^{\infty} \frac{1}{k+1} \langle u, Q_\lambda^k(I)u\rangle
\]
converges for all $u$ in a dense subspace of $h$. By the Monotone Convergence Theorem, we have
\[
\sum_{k=0}^{\infty} \frac{1}{k+1} \langle u, Q_\lambda^k(I)u\rangle = \int_0^1 \sum_{k=0}^{\infty} x^k \langle u, Q_\lambda^k(I)u\rangle\, dx. \tag{3.4}
\]
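Two elementary facts lie behind Lemma 3.3 and the identity (3.4). They are sketched here for convenience, using the shorthand $a_k := \langle u, Q_\lambda^k(I)u\rangle$, which is introduced only for this remark and does not appear in the original text.

```latex
% Shorthand (assumption of this remark): a_k := <u, Q_lambda^k(I) u>,
% so that a_0 >= a_1 >= ... >= 0 by Proposition 2.3.
\begin{align*}
\sum_{k=0}^{\infty}\frac{a_k}{k+1}
  &= \sum_{k=0}^{\infty} a_k \int_0^1 x^k\,dx
   = \int_0^1 \sum_{k=0}^{\infty} x^k a_k\,dx
  && \text{(monotone convergence; all terms are $\ge 0$)}, \\
\sum_{k=0}^{\infty}\frac{a_k}{k+1}
  &\ge \Bigl(\lim_{j\to\infty} a_j\Bigr)\sum_{k=0}^{\infty}\frac{1}{k+1}
  && \text{(since $a_k \ge \lim_{j} a_j$ for every $k$)} .
\end{align*}
```

Because the harmonic series diverges, convergence of (3.3) forces $\lim_k a_k = 0$. Since $0 \le Q_\lambda^k(I) \le I$, one has $\|Q_\lambda^k(I)u\|^2 \le \langle u, Q_\lambda^k(I)u\rangle$, so $Q_\lambda^k(I)u \to 0$ on the dense subspace, and the uniform bound extends this to all of $h$.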
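Fix $x \in (0, 1)$. For all $u \in D(G)$ and $X \in B(h)$, let $T_{t,x}^{(\min)}(X)$ be the solution obtained by the iterations
\[
\begin{aligned}
\langle u, T_{t,x}^{(0)}(X)u\rangle &= \langle P(t)u, XP(t)u\rangle, \\
\langle u, T_{t,x}^{(n+1)}(X)u\rangle &= \langle P(t)u, XP(t)u\rangle + x\sum_{l=1}^{\infty} \int_0^t \langle L_l P(t-s)u, T_{s,x}^{(n)}(X) L_l P(t-s)u\rangle\, ds.
\end{aligned} \tag{3.5}
\]
For all $u \in h$ and $X \in B(h)$, and for $\lambda > 0$, let
\[
\begin{aligned}
\langle u, R_{\lambda,x}^{(n)}(X)u\rangle &= \int_0^{\infty} e^{-\lambda t} \langle u, T_{t,x}^{(n)}(X)u\rangle\, dt, \\
\langle u, R_{\lambda,x}^{(\min)}(X)u\rangle &= \int_0^{\infty} e^{-\lambda t} \langle u, T_{t,x}^{(\min)}(X)u\rangle\, dt.
\end{aligned} \tag{3.6}
\]
Clearly (2.1) guarantees that $R_{\lambda,x}^{(n)}(X)$ and $R_{\lambda,x}^{(\min)}(X)$ are well defined. We can also obtain the relation corresponding to (2.10).

Proposition 3.4. For any $x \in (0, 1)$, $\lambda > 0$ and $X \in B(h)$, we have
\[
R_{\lambda,x}^{(\min)}(X) = \sum_{k=0}^{\infty} x^k Q_\lambda^k(P_\lambda(X)), \tag{3.7}
\]
the series being convergent for the strong operator topology.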
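Proof. For any positive element $X$ of $B(h)$, the sequence $(R_{\lambda,x}^{(n)}(X))_{n\ge 0}$ is non-decreasing. Therefore by (3.6), for all $u \in h$, we have
\[
\langle u, R_{\lambda,x}^{(\min)}(X)u\rangle = \int_0^{\infty} e^{-\lambda t} \langle u, T_{t,x}^{(\min)}(X)u\rangle\, dt = \sup_{n\ge 0} \langle u, R_{\lambda,x}^{(n)}(X)u\rangle.
\]
The second equation of (3.5) yields
\[
\langle u, R_{\lambda,x}^{(n+1)}(X)u\rangle = \int_0^{\infty} e^{-\lambda t} \langle P(t)u, XP(t)u\rangle\, dt + x\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \int_0^t \langle L_l P(t-s)u, T_{s,x}^{(n)}(X) L_l P(t-s)u\rangle\, ds\, dt \tag{3.8}
\]
for all $u \in D(G)$. By the change of variables in the above double integral and (2.8), we have
\[
\langle u, R_{\lambda,x}^{(n+1)}(X)u\rangle = \langle u, P_\lambda(X)u\rangle + x\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda r} \int_0^{\infty} e^{-\lambda s} \langle L_l P(r)u, T_{s,x}^{(n)}(X) L_l P(r)u\rangle\, ds\, dr. \tag{3.9}
\]
Thus we obtain the recursion relation
\[
R_{\lambda,x}^{(n+1)}(X) = P_\lambda(X) + xQ_\lambda(R_{\lambda,x}^{(n)}(X)).
\]
Iterating $n$ times, we have
\[
R_{\lambda,x}^{(n+1)}(X) = \sum_{k=0}^{n+1} x^k Q_\lambda^k(P_\lambda(X)) \tag{3.10}
\]
and (3.7) follows from letting $n$ tend to $\infty$. Since any bounded operator can be written as a linear combination of four positive self-adjoint operators, (3.7) also holds for an arbitrary element of $B(h)$.

Lemma 3.5. Condition C implies that, for each $u \in D(G^2)$, the function $t \mapsto \|C^{1/2} P(t)u\|^2$ is differentiable and
\[
\frac{d}{dt} \|C^{1/2} P(t)u\|^2 = 2\,\mathrm{Re}\langle C^{1/2} P(t)u, C^{1/2} GP(t)u\rangle.
\]

Proof. For each $u \in D(G)$ and each $\lambda > 0$, let $v = \lambda(\lambda - G)^{-1}u := \lambda R(\lambda, G)u$. Obviously $v \in D(G^2)$. The inequality (3.1) yields
\[
\begin{aligned}
\|C^{1/2}u\|^2 &= \frac{1}{\lambda^2} \langle C^{1/2}(\lambda - G)v, C^{1/2}(\lambda - G)v\rangle \\
&= \|C^{1/2}v\|^2 - 2\lambda^{-1}\,\mathrm{Re}\langle Cv, Gv\rangle + \lambda^{-2}\|C^{1/2}Gv\|^2 \\
&\ge (1 - \lambda^{-1}b)\|C^{1/2}v\|^2 - a\lambda^{-1}\varepsilon^{-p}\|v\|^2.
\end{aligned} \tag{3.11}
\]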
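For convenience, the two estimates hidden in the last step of (3.11) can be spelled out as follows. This is our reading of the argument, with $\varepsilon \in (0,1)$ fixed, and not an additional claim of the paper.

```latex
% v \in D(G^2) \subset D(C), so <C^{1/2}v, C^{1/2}Gv> = <Cv, Gv>.
\begin{align*}
\|C^{1/2}u\|^2
 &= \|C^{1/2}v\|^2 - 2\lambda^{-1}\,\mathrm{Re}\langle Cv, Gv\rangle
    + \lambda^{-2}\|C^{1/2}Gv\|^2 \\
 &\ge \|C^{1/2}v\|^2
    + \lambda^{-1}\Bigl((1-\varepsilon)\|Cv\|^2 - b\|C^{1/2}v\|^2
    - a\varepsilon^{-p}\|v\|^2\Bigr) \\
 &\ge (1-\lambda^{-1}b)\,\|C^{1/2}v\|^2 - a\lambda^{-1}\varepsilon^{-p}\|v\|^2 .
\end{align*}
```

The first inequality applies (3.1) to $v$ and drops the term $\lambda^{-2}\|C^{1/2}Gv\|^2 \ge 0$; the second drops $\lambda^{-1}(1-\varepsilon)\|Cv\|^2 \ge 0$.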
Note that $\|u\|^2 \ge \|\lambda R(\lambda, G)u\|^2$. Let $\beta = \max\{b, a\varepsilon^{-p}\}$. It follows from (3.11) that the inequality
\[
\begin{aligned}
\|C^{1/2}u\|^2 + \|u\|^2 &\ge (1 - \lambda^{-1}b)\|C^{1/2}\lambda R(\lambda, G)u\|^2 + (1 - \lambda^{-1}a\varepsilon^{-p})\|\lambda R(\lambda, G)u\|^2 \\
&\ge (1 - \lambda^{-1}\beta)\bigl(\|C^{1/2}\lambda R(\lambda, G)u\|^2 + \|\lambda R(\lambda, G)u\|^2\bigr)
\end{aligned} \tag{3.12}
\]
holds. The above inequality also holds for $u \in D(C^{1/2})$ since $D(G)$ is a core for $C^{1/2}$. Note that $D(C^{1/2})$ is a Hilbert space endowed with the graph norm. Let $\tilde{G} : D(C^{1/2}) \to D(C^{1/2})$ be given by $D(\tilde{G}) = \{u \in D(G) : Gu \in D(C^{1/2})\}$ and $\tilde{G}u = Gu$, for all $u \in D(\tilde{G})$. It is easily checked that $\tilde{G}$ is closed. Since $D(G^2)$ is a core for $G$ and $D(G)$ is a core for $C^{1/2}$, $D(G^2)$ is a core for $C^{1/2}$ (see [24, Lemma 2.5]). Thus $\tilde{G}$ is densely defined in the Hilbert space $D(C^{1/2})$. Let us check that $R(\lambda, \tilde{G})u = R(\lambda, G)u$ for all $u \in D(C^{1/2})$. If $(\lambda - G)u \in D(C^{1/2})$ for $u \in D(G)$, then $Gu \in D(C^{1/2})$ and we have $u \in D(\tilde{G})$. Since $\lambda - G$ is a bijection from $D(G)$ to $h$, the range of $\lambda - \tilde{G}$ is $D(C^{1/2})$. Thus $\lambda - \tilde{G}$ is invertible on $D(C^{1/2})$ and $R(\lambda, \tilde{G})$ is the restriction of $R(\lambda, G)$ to $D(C^{1/2})$. Therefore the inequality (3.12) implies that $\tilde{G}$ is the infinitesimal generator of a strongly continuous semigroup on the Hilbert space $D(C^{1/2})$ endowed with the graph norm. See [25, Sec. 1, Corollary 3.8]. This semigroup is obtained by restricting the operators $P(t)$ to $D(C^{1/2})$. Since $D(G^2) \subset D(\tilde{G})$, the claimed differentiation formula follows.

Under assumption C, we can obtain a useful estimate of $R_{\lambda,x}^{(\min)}(C_\epsilon)$, where $(C_\epsilon)_{\epsilon>0}$ is the family of bounded regularizations $C_\epsilon = C(I + \epsilon C)^{-1}$.

Proposition 3.6. Suppose that the conditions A and C hold. Then, for any $x \in (0, 1)$, $\lambda > \max(b, 1)$ and any $u \in D(G^2)$, the bound
\[
(\lambda - b) \sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(\min)}(C_\epsilon)u\rangle \le \|C^{1/2}u\|^2 + 2a(1-x)^{-p}\|u\|^2 \tag{3.13}
\]
holds.

Proof. Let $(R_{\lambda,x}^{(n)})_{n\ge 0}$ be the sequence of monotone linear maps on $B(h)$ defined in (3.6). Clearly it suffices to show that for all $n \ge 0$, $\lambda > \max(b, 1)$, $x \in (0, 1)$ and $u \in D(G^2)$, the operator $R_{\lambda,x}^{(n)}(C_\epsilon)$ satisfies
\[
(\lambda - b) \sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(n)}(C_\epsilon)u\rangle \le \|C^{1/2}u\|^2 + 2a(1-x)^{-p}\|u\|^2. \tag{3.14}
\]
For $n = 0$, integrating by parts, we have
\[
\begin{aligned}
\lambda\langle u, R_{\lambda,x}^{(0)}(C_\epsilon)u\rangle &= \lambda\langle u, P_\lambda(C_\epsilon)u\rangle = \lambda\int_0^{\infty} e^{-\lambda t} \|C_\epsilon^{1/2} P(t)u\|^2\, dt \\
&\le \lambda\int_0^{\infty} e^{-\lambda t} \|C^{1/2} P(t)u\|^2\, dt \\
&= \|C^{1/2}u\|^2 + 2\,\mathrm{Re}\int_0^{\infty} e^{-\lambda t} \langle CP(t)u, GP(t)u\rangle\, dt.
\end{aligned} \tag{3.15}
\]
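Choose $\varepsilon = 1 - x$, $x \in (0, 1)$, in (3.1). Applying the inequality (3.1) to (3.15), we get
\[
\begin{aligned}
\lambda\langle u, R_{\lambda,x}^{(0)}(C_\epsilon)u\rangle &\le \|C^{1/2}u\|^2 - x\int_0^{\infty} e^{-\lambda t} \|CP(t)u\|^2\, dt + b\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt + a(1-x)^{-p}\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt \\
&= \|C^{1/2}u\|^2 - x\int_0^{\infty} e^{-\lambda t} \|CP(t)u\|^2\, dt + b\sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(0)}(C_\epsilon)u\rangle + a(1-x)^{-p}\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt.
\end{aligned} \tag{3.16}
\]
Notice that
\[
\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt \le \frac{1}{\lambda}\|u\|^2. \tag{3.17}
\]
Then for $\lambda > 1/2$, (3.14) holds for $n = 0$. By induction, we assume that (3.14) holds for an integer $n$. It follows from (3.9) and (3.14) that
\[
\begin{aligned}
\langle u, R_{\lambda,x}^{(n+1)}(C_\epsilon)u\rangle &= \langle u, P_\lambda(C_\epsilon)u\rangle + x\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \langle L_l P(t)u, R_{\lambda,x}^{(n)}(C_\epsilon) L_l P(t)u\rangle\, dt \\
&\le \langle u, P_\lambda(C_\epsilon)u\rangle + x\frac{1}{\lambda - b}\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \|C^{1/2} L_l P(t)u\|^2\, dt \\
&\quad + x\frac{1}{\lambda - b}\, 2a(1-x)^{-p}\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \|L_l P(t)u\|^2\, dt.
\end{aligned} \tag{3.18}
\]
The integrability of $\sum_{l=1}^{\infty} e^{-\lambda t}\|C^{1/2} L_l P(t)u\|^2$ on $[0, \infty)$ will follow from (3.2), (3.1) and the integrability of $e^{-\lambda t}\|CP(t)u\|^2$. See (3.22) and (3.23) below. One can check that the integrability of $e^{-\lambda t}\|CP(t)u\|^2$ also follows from (3.1) with $\varepsilon = 1/2$. By (2.3), we have
\[
\begin{aligned}
\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \|L_l P(t)u\|^2\, dt &= \int_0^{\infty} e^{-\lambda t} \Bigl(-\frac{d}{dt}\|P(t)u\|^2\Bigr) dt \\
&= \|u\|^2 - \lambda\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt.
\end{aligned} \tag{3.19}
\]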
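By (3.15), we also have
\[
\begin{aligned}
\langle u, P_\lambda(C_\epsilon)u\rangle &\le \frac{\lambda}{\lambda - b}\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt - \frac{b}{\lambda - b}\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt \\
&= \frac{1}{\lambda - b}\Bigl(\|C^{1/2}u\|^2 + 2\,\mathrm{Re}\int_0^{\infty} e^{-\lambda t} \langle CP(t)u, GP(t)u\rangle\, dt\Bigr) - \frac{b}{\lambda - b}\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt.
\end{aligned} \tag{3.20}
\]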
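We combine (3.18), (3.19) and (3.20) to conclude that
\[
\begin{aligned}
(\lambda - b)\sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(n+1)}(C_\epsilon)u\rangle &\le \|C^{1/2}u\|^2 + 2\,\mathrm{Re}\int_0^{\infty} e^{-\lambda t} \langle CP(t)u, GP(t)u\rangle\, dt \\
&\quad - b\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt + x\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \|C^{1/2} L_l P(t)u\|^2\, dt \\
&\quad + 2a(1-x)^{-p}\Bigl(\|u\|^2 - \lambda\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt\Bigr).
\end{aligned} \tag{3.21}
\]
Here we have removed a factor $x \le 1$ from the last term. Next, we use (3.2) with $\varepsilon = (1-x)/2$ to obtain
\[
\begin{aligned}
&2x\,\mathrm{Re}\int_0^{\infty} e^{-\lambda t} \langle CP(t)u, GP(t)u\rangle\, dt + x\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \|C^{1/2} L_l P(t)u\|^2\, dt \\
&\quad \le \frac{x(1-x)}{2}\int_0^{\infty} e^{-\lambda t} \|CP(t)u\|^2\, dt + xb\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt + 2ax(1-x)^{-p}\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt.
\end{aligned} \tag{3.22}
\]
On the other hand, it follows from (3.1) with $\varepsilon = 1/2$ that
\[
\begin{aligned}
&2(1-x)\,\mathrm{Re}\int_0^{\infty} e^{-\lambda t} \langle CP(t)u, GP(t)u\rangle\, dt \\
&\quad \le -\frac{(1-x)}{2}\int_0^{\infty} e^{-\lambda t} \|CP(t)u\|^2\, dt + (1-x)b\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt + 2a(1-x)\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt.
\end{aligned} \tag{3.23}
\]
Summing (3.22) and (3.23) yields
\[
\begin{aligned}
&2\,\mathrm{Re}\int_0^{\infty} e^{-\lambda t} \langle CP(t)u, GP(t)u\rangle\, dt + x\sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \|C^{1/2} L_l P(t)u\|^2\, dt \\
&\quad \le b\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt + 2a\bigl(x(1-x)^{-p} + (1-x)\bigr)\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt \\
&\quad \le b\int_0^{\infty} e^{-\lambda t} \|C^{1/2}P(t)u\|^2\, dt + 2a(1-x)^{-p}\int_0^{\infty} e^{-\lambda t} \|P(t)u\|^2\, dt.
\end{aligned} \tag{3.24}
\]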
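For $\lambda > \max(b, 1)$, substituting (3.24) into (3.21), we obtain that
\[
(\lambda - b)\sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(n+1)}(C_\epsilon)u\rangle \le \|C^{1/2}u\|^2 + 2a(1-x)^{-p}\|u\|^2.
\]
This completes the proof of the proposition.

Proof of Theorem 3.1. Let $\lambda > \max(b, 1)$. Recall that for $\epsilon > 0$, $C_\epsilon = C(I + \epsilon C)^{-1}$, and let $\Phi_\epsilon = \Phi(I + \epsilon\Phi)^{-1}$ be defined analogously. For $u \in D(G)$, we have
\[
\sup_{\epsilon>0} \langle u, P_\lambda(\Phi_\epsilon)u\rangle = \int_0^{\infty} e^{-\lambda t} \|\Phi^{1/2}P(t)u\|^2\, dt = \sum_{l=1}^{\infty} \int_0^{\infty} e^{-\lambda t} \|L_l P(t)u\|^2\, dt = \langle u, Q_\lambda(I)u\rangle.
\]
This implies that the non-decreasing family of operators $(P_\lambda(\Phi_\epsilon))_{\epsilon>0}$ is uniformly bounded and, since $D(G)$ is dense in $h$, it follows that it converges strongly to $Q_\lambda(I)$ as $\epsilon$ goes to 0. By the normality of the maps $Q_\lambda^k$ and the equation (3.7), for any $x \in (0, 1)$, we have
\[
\sum_{k=0}^{\infty} x^k \langle u, Q_\lambda^{k+1}(I)u\rangle = \sup_{\epsilon>0} \sum_{k=0}^{\infty} x^k \langle u, Q_\lambda^k(P_\lambda(\Phi_\epsilon))u\rangle = \sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(\min)}(\Phi_\epsilon)u\rangle.
\]
Let $\tilde{\Phi} = \delta\Phi$. For $\epsilon > 0$, it follows from [3, Proposition 2.2.13] that the bounded positive operators $\tilde{\Phi}_\epsilon$ and $C_\epsilon$ satisfy the inequality $\tilde{\Phi}_\epsilon \le C_\epsilon$. Applying Proposition 3.6, we obtain the estimate
\[
\begin{aligned}
\int_0^1 \sum_{k=0}^{\infty} x^{k+1} \langle u, Q_\lambda^{k+1}(I)u\rangle\, dx &= \delta^{-1}\int_0^1 x\sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(\min)}(\tilde{\Phi}_\epsilon)u\rangle\, dx \\
&\le \delta^{-1}\int_0^1 x\sup_{\epsilon>0} \langle u, R_{\lambda,x}^{(\min)}(C_\epsilon)u\rangle\, dx \\
&\le \delta^{-1}\int_0^1 x(\lambda - b)^{-1}\bigl(\|C^{1/2}u\|^2 + 2a(1-x)^{-p}\|u\|^2\bigr) dx \\
&< \infty.
\end{aligned}
\]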
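The finiteness of the last integral is the point where the restriction $p < 1$ in assumption C enters. A short check, added here as an illustration and using only the elementary Beta integral:

```latex
% Substituting y = 1 - x gives \int_0^1 (1-y) y^{-p} dy = 1/(1-p) - 1/(2-p).
\begin{align*}
\int_0^1 x\,dx = \frac12, \qquad
\int_0^1 x\,(1-x)^{-p}\,dx
  = B(2,\,1-p)
  = \frac{\Gamma(2)\,\Gamma(1-p)}{\Gamma(3-p)}
  = \frac{1}{(1-p)(2-p)} .
\end{align*}
```

The second integral is finite exactly when $p < 1$, so the right-hand side of the estimate equals $\delta^{-1}(\lambda - b)^{-1}\bigl(\tfrac12\|C^{1/2}u\|^2 + \tfrac{2a}{(1-p)(2-p)}\|u\|^2\bigr)$.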
By (3.4) and Lemma 3.3, we have $\text{s-}\lim_{n\to\infty} Q_\lambda^n(I) = 0$, which implies that the minimal q.d.s. is conservative.

4. Applications

In this section, we obtain some relative bounds in order to apply the sufficient conservativity condition of Theorem 3.1 to concrete examples.
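Let $h = L^2(\mathbb{R}^n, dx)$ and let $W : \mathbb{R}^n \to \mathbb{R}$ be a real-valued function. We are looking for a condition under which there exist constants $a > 0$ and $p < 1$ such that
\[
\|W\varphi\|^2 \le \varepsilon\|{-\Delta}\varphi\|^2 + a\varepsilon^{-p}\|\varphi\|^2, \qquad \varphi \in C_0^2(\mathbb{R}^n),
\]
holds for any $\varepsilon > 0$, where $\Delta$ is the Laplacian and $C_0^2(\mathbb{R}^n)$ is the set of twice continuously differentiable functions with compact support on $\mathbb{R}^n$. We prove first the following:

Lemma 4.1. For a given $n \in \mathbb{N}$, let $\alpha$ be a non-negative real number satisfying $n/(1+\alpha) < 2$. If $W \in L^{2+2\alpha}(\mathbb{R}^n)$, there exist $a > 0$ and $p < 1$ such that the bound
\[
\|W\varphi\|^2 \le \varepsilon\|{-\Delta}\varphi\|^2 + a\varepsilon^{-p}\|\varphi\|^2
\]
holds for any $\varepsilon > 0$ and $\varphi \in D(-\Delta)$.

Proof. Since $C_0^{\infty}(\mathbb{R}^n)$, the space of infinitely differentiable functions with compact support, is a core for $-\Delta$, it is sufficient to show the bound for any $\varphi \in C_0^{\infty}(\mathbb{R}^n)$. We use the method employed in the proof of [26, Theorem IX.28]. Assume $W^{1+\alpha} \in L^2(\mathbb{R}^n)$. Denote by $\hat{f}$ the Fourier transform of $f \in h$. For $\varphi \in C_0^{\infty}(\mathbb{R}^n)$, we have
\[
\|W^{1+\alpha}\varphi\|_2^2 \le \|W^{1+\alpha}\|_2^2\, \|\varphi\|_\infty^2, \tag{4.1}
\]
\[
\|\varphi\|_\infty \le (2\pi)^{-n/2}\|\hat{\varphi}\|_1
\]
and
\[
\|\hat{\varphi}\|_1^2 \le C\|(\lambda^4 + 1)^{(1+\alpha)/2}\hat{\varphi}\|_2^2, \tag{4.2}
\]
where $C = \|(\lambda^4 + 1)^{-(1+\alpha)/2}\|_2^2 < \infty$ since $\alpha > \frac{n}{2} - 1$. For any $r > 0$, let $\hat{\varphi}_r(\lambda) = r^n\hat{\varphi}(r\lambda)$. Then
\[
\begin{aligned}
\|\hat{\varphi}_r\|_1 &= \|\hat{\varphi}\|_1, \\
\|(\lambda^4 + 1)^{(1+\alpha)/2}\hat{\varphi}_r\|_2^2 &= \int_{\mathbb{R}^n} (\lambda^4 + 1)^{1+\alpha} r^{2n}|\hat{\varphi}(r\lambda)|^2\, d^n\lambda = r^n\|(r^{-4}\lambda^4 + 1)^{(1+\alpha)/2}\hat{\varphi}\|_2^2.
\end{aligned}
\]
Thus using (4.2) for $\hat{\varphi}_r$, and these equalities, we obtain
\[
\|\hat{\varphi}\|_1^2 \le Cr^n\|(r^{-4}\lambda^4 + 1)^{(1+\alpha)/2}\hat{\varphi}\|_2^2. \tag{4.3}
\]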
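The two scaling identities used to pass from (4.2) to (4.3) follow from the substitution $\mu = r\lambda$. The short verification below is added for convenience; the variable $\mu$ is our own notation, and $\lambda^4$ is read as $|\lambda|^4$.

```latex
% Substitution mu = r*lambda on R^n, so d^n mu = r^n d^n lambda.
\begin{align*}
\|\hat\varphi_r\|_1
  &= \int_{\mathbb{R}^n} r^n\,|\hat\varphi(r\lambda)|\,d^n\lambda
   = \int_{\mathbb{R}^n} |\hat\varphi(\mu)|\,d^n\mu
   = \|\hat\varphi\|_1 , \\
\bigl\|(|\lambda|^4+1)^{(1+\alpha)/2}\hat\varphi_r\bigr\|_2^2
  &= \int_{\mathbb{R}^n} (|\lambda|^4+1)^{1+\alpha}\, r^{2n}\,|\hat\varphi(r\lambda)|^2\,d^n\lambda \\
  &= r^{n}\int_{\mathbb{R}^n} (r^{-4}|\mu|^4+1)^{1+\alpha}\,|\hat\varphi(\mu)|^2\,d^n\mu .
\end{align*}
```

The constant $C$ in (4.2) is finite because $\int (|\lambda|^4+1)^{-(1+\alpha)} d^n\lambda < \infty$ whenever $4(1+\alpha) > n$, which is implied by the hypothesis $n/(1+\alpha) < 2$.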
Substituting (4.3) into (4.1), by Plancherel's Theorem, there is a constant $C_1 > 0$ such that
\[
\|W^{1+\alpha}\varphi\|_2^2 \le C_1 r^n\|(r^{-4}\Delta^2 + 1)^{(1+\alpha)/2}\varphi\|_2^2,
\]
which implies
\[
W^{2+2\alpha} \le C_1 r^n(r^{-4}\Delta^2 + 1)^{1+\alpha}.
\]
Suppose that $A$ and $B$ are self-adjoint operators such that
\[
0 \le B \le A. \tag{4.4}
\]
Then the above implies that $0 \le B^t \le A^t$ for any $t \in [0, 1]$ (see [26, Chap. VIII, Problem 51] and also the Heinz–Kato theorem in [27, Sec. 2.3.3]). Thus we have
\[
W^2 \le C_2 r^{n/(1+\alpha)}(r^{-4}\Delta^2 + 1),
\]
which yields
\[
\|W\varphi\|^2 \le C_2 r^{-(4-n/(1+\alpha))}\|{-\Delta}\varphi\|^2 + C_2 r^{n/(1+\alpha)}\|\varphi\|^2.
\]
Choose $\varepsilon = C_2 r^{-(4-n/(1+\alpha))}$. Then we obtain
\[
\|W\varphi\|^2 \le \varepsilon\|{-\Delta}\varphi\|^2 + a\varepsilon^{-p}\|\varphi\|^2,
\]
where $p = n(4(1+\alpha) - n)^{-1}$. Since $n/(1+\alpha) < 2$, $p < 1$. If we choose $r$ large enough, the bound follows.

Remark 4.2. (a) In Lemma 4.1, one can choose $\alpha = 0$ for $n = 1$. Notice that $\alpha > 0$ for $n = 2$ and $\alpha > 1/2$ for $n = 3$, etc.

(b) Let the dimension $n = 1, 2, 3$. If $W \in L^4(\mathbb{R}^n, dx)$, then $W^2 \in L^2(\mathbb{R}^n, dx)$ and so $W^2$ is relatively bounded by $-\Delta$ (see [26, Theorem X.15]). Thus $W^2$ is relatively form bounded by $-\Delta$, i.e.
\[
\|W\varphi\|^2 \le b\langle\varphi, (-\Delta + 1)\varphi\rangle, \qquad \varphi \in C_0^{\infty}(\mathbb{R}^n),
\]
where $b$ is a constant. See also [26, Theorem X.18(b)].

In the rest of this section, we apply Theorem 3.1 and Lemma 4.1 to models of heavy ion collision proposed by Alicki [28] and noncommutative elliptic operators introduced in [29].

Example 4.3 (Q.d.s. in a model for heavy ion collision). Let $h = L^2(\mathbb{R}^3)$. We denote by $\partial_k = \frac{\partial}{\partial x_k}$ $(k = 1, 2, 3)$ the differential operators with respect to the $k$th coordinate and $\partial_{lk} = \frac{\partial^2}{\partial x_k \partial x_l}$ $(l, k = 1, 2, 3)$. For any measurable function $T$, we denote the (distributional) derivative $\frac{\partial T}{\partial x_l}$ by $(T)_l$, $l = 1, 2, 3$. Consider the operators $L_l$, for $l = 1, 2, 3$,
\[
L_l u = w(x_l + \alpha\partial_l)u, \qquad D(L_l) = \{u \in L^2(\mathbb{R}^3) : \text{the distribution } L_l u \in L^2(\mathbb{R}^3)\}, \tag{4.5}
\]
where $w, \alpha \in \mathbb{R}$ are non-zero real constants and $L_l = 0$ for $l \ge 4$. Let $V$ be a real measurable function. Consider the operators $H$ and $G$ given by
\[
Hu = \Bigl(-\frac{1}{2}\Delta + V\Bigr)u, \qquad Gu = -iHu - \frac{1}{2}\sum_{l=1}^{\infty} L_l^* L_l u \tag{4.6}
\]
for $u \in C_0^\infty(\mathbb{R}^3)$. Let us assume that the following properties hold:
(1) $w^2\alpha^2 \ge 2$,
(2) $|V(x)| \le \tfrac14 w^2(x^2 + b_1)$ for some constant $b_1 > 0$, where $x^2 = x_1^2 + x_2^2 + x_3^2$,
(3) There exist real measurable functions $U_1$ and $U_2$ and positive constants $b_2$, $b_3$ such that $U_1 \in L^\beta(\mathbb{R}^3)$ for some $\beta > 3$, $U_2(x) \le b_2(|x| + b_3)$ and the bounds
$$|(V)_l| \le U_1 + U_2 \tag{4.7}$$
hold for $l = 1, 2, 3$. For instance, the function $V(x) = \tfrac14 w^2|x|^\nu$, $0 < \nu \le 2$, satisfies conditions (2) and (3) above. Let us mention that in the example proposed by Alicki [28], the constant w in (4.5) is a function W(x) proportional to γ(x), where γ(x) represents a friction force. The conservativity of this q.d.s. has already been investigated in [1] under appropriate (boundedness) assumptions on V, W and their derivatives. In this paper, in the case that W(x) is constant, we control local singularities of the derivatives of the potential function V. The method used here can be extended to the case that $W(x) \ge c > 0$. However, we are unable to control the case in which $W(x) \to 0$ as $|x| \to \infty$.
We apply Theorem 3.1 and Lemma 4.1 to show that the minimal q.d.s. constructed from the operators $L_l$ and G, given in (4.5) and (4.6) respectively, is conservative. We will check that the main inequalities (3.1) and (3.2) hold for $u \in C_0^\infty(\mathbb{R}^3)$. The most difficult problem is to extend the inequalities to every $u \in D(G^2)$. In order to overcome this problem, we need technical estimates.
Lemma 4.4. For all $u \in C_0^\infty(\mathbb{R}^3)$, the bounds
$$\langle u, (\alpha^4\Delta^2 + x^4)u\rangle \le \langle u, (-\alpha^2\Delta + x^2 + 3|\alpha|)^2 u\rangle \tag{4.8}$$
and
$$\Big\|\tfrac12 w^2(-\alpha^2\Delta + x^2 - 3\alpha)u\Big\|^2 \le b_4\|Gu\|^2 + b_5\|u\|^2 \tag{4.9}$$
for some $b_4 > 1$ and $b_5 > 0$ hold.
Proof. A direct computation shows that
$$(-\alpha^2\Delta + x^2)^2 = \alpha^4\Delta^2 + x^4 - \alpha^2(\Delta x^2 + x^2\Delta) = \alpha^4\Delta^2 + x^4 - \alpha^2\Big(2\sum_{k=1}^{3}\partial_k x^2\partial_k + 6\Big) \ge \alpha^4\Delta^2 + x^4 - 6\alpha^2$$
as a bilinear form on the domain $C_0^\infty(\mathbb{R}^3)$. This proves the bound (4.8).
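The second equality in the computation above rests on the one-coordinate identity $\partial^2 x^2 + x^2\partial^2 = 2\partial x^2\partial + 2$, which summed over the three coordinates produces the constant 6. As a minimal sketch (not part of the paper), the identity can be checked exactly on polynomial test functions with rational arithmetic; the helper names `deriv`, `times_x2`, `lhs` and `rhs` are illustrative assumptions, and the restriction to one variable and to polynomials is a simplification.

```python
from fractions import Fraction

# A polynomial p(x) = c0 + c1*x + c2*x^2 + ... is stored as [c0, c1, c2, ...].

def deriv(p):
    """Coefficients of the derivative p'."""
    return [k * p[k] for k in range(1, len(p))]

def times_x2(p):
    """Coefficients of x^2 * p."""
    return [Fraction(0), Fraction(0)] + list(p)

def add(p, q):
    """Coefficient-wise sum, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    return [c * a for a in p]

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def lhs(p):
    """(d^2 x^2 + x^2 d^2) applied to p."""
    return add(deriv(deriv(times_x2(p))), times_x2(deriv(deriv(p))))

def rhs(p):
    """(2 d x^2 d + 2) applied to p."""
    return add(scale(2, deriv(times_x2(deriv(p)))), scale(2, p))

# Hypothetical test polynomial 1 - 2x + 5x^3 + 7x^4.
test_poly = [Fraction(c) for c in (1, -2, 0, 5, 7)]
identity_holds = trim(lhs(test_poly)) == trim(rhs(test_poly))
```

Since both sides are linear in p, agreement on monomials of every degree would give the identity on all polynomials; the sketch only samples one polynomial and is meant as a sanity check, not a proof.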
Next, we prove the bound (4.9). Put
$$G_0 = -\tfrac12\sum_{l=1}^{3} L_l^* L_l = -\tfrac12 w^2(-\alpha^2\Delta + x^2 - 3\alpha). \tag{4.10}$$
We have that as bilinear forms on $C_0^\infty(\mathbb{R}^3)$,
$$G^*G = (iH + G_0)(-iH + G_0) = H^2 + G_0^2 + i[H, G_0] \ge G_0^2 + i[H, G_0] \tag{4.11}$$
and
$$i[H, G_0] = \frac{iw^2}{4}[\Delta, x^2] + \frac{iw^2\alpha^2}{2}[V, \Delta] = \frac{iw^2}{2}\sum_{l=1}^{3}(\partial_l x_l + x_l\partial_l) - \frac{iw^2\alpha^2}{2}\sum_{l=1}^{3}\big(\partial_l (V)_l + (V)_l\partial_l\big) \ge -\frac{w^2}{2}(-\Delta + x^2) - \frac{w^2\alpha^2}{2}\Big(-\Delta + \sum_{l=1}^{3}|(V)_l|^2\Big).$$
It follows from (4.7) that
$$i[H, G_0] \ge -\frac{w^2}{2}(1 + \alpha^2)(-\Delta + x^2) - 3w^2\alpha^2(U_1^2 + U_2^2).$$
The bound (4.8) implies that $(-\Delta + x^2)^{1/2}$ is infinitesimally small with respect to $G_0$. By the condition (3) and Lemma 4.1 (Remark 4.2(a)), $U_1$ and $U_2$ are also infinitesimally small with respect to $G_0$. Thus, there exist constants $0 < a < 1$ and $b > 0$ such that
$$i[H, G_0] \ge -aG_0^2 - b$$
as a bilinear form on $C_0^\infty(\mathbb{R}^3)$. The bound (4.9) follows from (4.11) and the above bound.
Recall that
$$G = -i\Big(-\tfrac12\Delta + V\Big) + G_0,$$
where $G_0$ is given as (4.10). Notice that $G_0$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^3)$. Since $w^2\alpha^2 \ge 2$ by condition (1), the bound (4.8) implies that $-\tfrac12\Delta$ is $G_0$-bounded with relative bound smaller than or equal to 1/2. The condition (2) and the bound (4.8) imply that V is $G_0$-bounded with relative bound smaller than 1/2. Thus $-iH$ is a relatively bounded perturbation of $G_0$ with relative bound smaller than 1. Thus assumption A holds.
We show that the minimal q.d.s. is conservative by applying Theorem 3.1. Let us choose the operator C,
$$C = w^2(-\alpha^2\Delta + x^2 + 3|\alpha|) = \sum_{l=1}^{3} L_l^* L_l + b_6 = -2G_0 + b_6, \tag{4.12}$$
$$D(C) = \{u \in L^2(\mathbb{R}^3) \mid \text{the distribution } Cu \in L^2(\mathbb{R}^3)\},$$
where $b_6 = 3w^2(|\alpha| - \alpha)$. Using the relation (4.9) and the fact that $-iH$ is a relatively bounded perturbation of $G_0$, we obtain that G and C are relatively bounded with respect to each other and so $D(G) = D(C)$. We will check that the operator C satisfies the assumption C. Hypotheses (a) and (b) are trivially fulfilled. Now we will check hypothesis (c). First, we have that as bilinear forms on $C_0^\infty(\mathbb{R}^3)$,
$$\Big[C, -\tfrac12\Delta + V\Big] = -\alpha^2 w^2[\Delta, V] - \tfrac12 w^2[x^2, \Delta] = -\alpha^2 w^2\sum_{l=1}^{3}\big(\partial_l (V)_l + (V)_l\partial_l\big) + w^2\sum_{l=1}^{3}(\partial_l x_l + x_l\partial_l) \le \alpha^2 w^2\Big(-\Delta + \sum_{l=1}^{3}(V)_l^2\Big) + w^2(-\Delta + x^2) \tag{4.13}$$
and
$$[C, L_l] = w^3\big(-\alpha^2[\Delta, x_l] + \alpha[x^2, \partial_l]\big) = -2w^3\alpha(\alpha\partial_l + x_l) = -2w^2\alpha L_l,$$
and so,
$$\sum_{l=1}^{3} L_l^*[C, L_l] = -2w^2\alpha\sum_{l=1}^{3} L_l^* L_l = -2w^2\alpha C + b_6.$$
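The commutator relation $[C, L_l] = -2w^2\alpha L_l$ used above can be sanity-checked in exact arithmetic. The following is a minimal sketch, not part of the paper, under two simplifying assumptions: it works in one variable, with $L = w(x + \alpha\partial)$ and $C = w^2(-\alpha^2\partial^2 + x^2 + 3|\alpha|)$, and it tests the relation on polynomial test functions only. The names `apply_L`, `apply_C` and the values chosen for `W` and `ALPHA` are hypothetical.

```python
from fractions import Fraction

# A polynomial p(x) = c0 + c1*x + ... is stored as the coefficient list [c0, c1, ...].

def deriv(p):
    """Coefficients of p'."""
    return [k * p[k] for k in range(1, len(p))]

def times_x(p):
    """Coefficients of x * p."""
    return [Fraction(0)] + list(p)

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    return [c * a for a in p]

def trim(p):
    """Drop trailing zeros so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

W = Fraction(3, 2)     # hypothetical value of the constant w
ALPHA = Fraction(-2)   # hypothetical value of alpha (negative, to exercise |alpha|)

def apply_L(p):
    """L p = w (x p + alpha p'), the one-variable analogue of (4.5)."""
    return scale(W, add(times_x(p), scale(ALPHA, deriv(p))))

def apply_C(p):
    """C p = w^2 (-alpha^2 p'' + x^2 p + 3|alpha| p), the analogue of (4.12)."""
    kinetic = scale(-(ALPHA ** 2), deriv(deriv(p)))
    potential = times_x(times_x(p))
    shift = scale(3 * abs(ALPHA), p)
    return scale(W ** 2, add(add(kinetic, potential), shift))

def commutator_C_L(p):
    """[C, L] p = C(L p) - L(C p)."""
    return add(apply_C(apply_L(p)), scale(-1, apply_L(apply_C(p))))

# Hypothetical test polynomial 1 - 2x + 5x^3 + 7x^4.
test_poly = [Fraction(c) for c in (1, -2, 0, 5, 7)]
left = trim(commutator_C_L(test_poly))
right = trim(scale(-2 * W ** 2 * ALPHA, apply_L(test_poly)))
commutator_matches = left == right
```

The additive constant $3|\alpha|$ in C commutes with everything, so the check is insensitive to it; it is kept only to mirror (4.12).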
By direct computation, we have
$$CG + G^*C + C^2 = -i\Big[C, -\tfrac12\Delta + V\Big] + b_6 C$$
and
$$CG + G^*C + \sum_{l=1}^{3} L_l^* C L_l = -i\Big[C, -\tfrac12\Delta + V\Big] + \frac12\sum_{l=1}^{3}\big(L_l^*[C, L_l] + (L_l^*[C, L_l])^*\big) = -i\Big[C, -\tfrac12\Delta + V\Big] - 2w^2\alpha C + b_6 \tag{4.14}$$
as bilinear forms on $C_0^\infty(\mathbb{R}^3)$. Substituting (4.13) and (4.14) into the above equations, and using the fact that $-\Delta$, $-\Delta + x^2$ are relatively form bounded with respect to C, we have that for $u \in C_0^\infty(\mathbb{R}^3)$,
$$2\,\mathrm{Re}\langle Cu, Gu\rangle + \|Cu\|^2 \le b_7\langle u, Cu\rangle + \alpha^2 w^2\sum_{l=1}^{3}\|(V)_l u\|^2 \tag{4.15}$$
and
$$2\,\mathrm{Re}\langle Cu, Gu\rangle + \sum_{l=1}^{3}\langle L_l u, C L_l u\rangle \le b_8\langle u, Cu\rangle + \alpha^2 w^2\sum_{l=1}^{3}\|(V)_l u\|^2, \tag{4.16}$$
where $b_7, b_8 > 0$. Note that $|(V)_l| \le U_1 + U_2$ for $l = 1, 2, 3$ with $U_1 \in L^\beta(\mathbb{R}^3)$, where $\beta > 3$, and $U_2(x) \le b_2(|x| + b_3)$. Applying Lemma 4.1 to (4.15) and (4.16), and Lemma 4.4, we obtain (3.1) and (3.2) for $u \in C_0^\infty(\mathbb{R}^3)$.
We want to extend the inequalities (3.1) and (3.2) to the domain D(G). For $u \in D(G)$, there exists a sequence $\{u_n\}$ of elements of $C_0^\infty(\mathbb{R}^3)$ such that
$$\lim_{n\to\infty} u_n = u, \qquad \lim_{n\to\infty} Cu_n = Cu, \qquad \lim_{n\to\infty} Gu_n = Gu,$$
by the relation (4.9). Then the relation (3.1) holds for $u \in D(G)$. Also, the relation (3.2) implies that $\{C^{1/2}L_l u_n\}_{n\ge 1}$ is a Cauchy sequence. Therefore, it is convergent and it is easy to deduce that (3.2) holds for $u \in D(G)$. Recall that $\Phi = \sum_{l=1}^{3} L_l^* L_l$ and $C = \sum_{l=1}^{3} L_l^* L_l + b_6$. Hence the conditions of Theorem 3.1 also hold and the minimal q.d.s. is conservative.
Remark 4.5. Let us recall the condition on the derivatives of V, $|(V)_l| \le U_1 + U_2$ for $l = 1, 2, 3$. One can use the previous criterion in [1] to show the conservativity for $U_1 \in L^4(\mathbb{R}^3)$ (see Remark 4.2(b)). Applying our result, we extend the range of $(V)_l$, i.e. $U_1 \in L^\beta(\mathbb{R}^3)$ where $\beta > 3$.
Example 4.6 (Noncommutative elliptic operators). Let $h = L^2(\mathbb{R})$. Denote by ∂ the differential operator $\partial = d/dx$. For any measurable function T on $\mathbb{R}$, we denote by $T'$ and $T''$ the first and second order (distributional) derivatives, respectively. For a given function $W : \mathbb{R} \to \mathbb{R}$ satisfying appropriate conditions, we consider the following Lindblad type generator on B(h):
$$\mathcal{L}(X) = i[H, X] - \tfrac12 L^*LX + L^*XL - \tfrac12 XL^*L, \qquad X \in D(\mathcal{L}), \tag{4.17}$$
where
$$L = -(\partial + W), \qquad H = \frac{i}{4}(LL - L^*L^*) = \frac{i}{2}(W\partial + \partial W). \tag{4.18}$$
Thus the operators $G_0$ and G are given by
$$G_0 = -\tfrac12 L^*L = -\tfrac12(-\Delta + W^2 - W'), \qquad G = G_0 - iH. \tag{4.19}$$
It is worth mentioning that, if $X \in C_0^\infty(\mathbb{R}) \subset B(h)$, the generator in (4.17) can be rewritten as
$$\mathcal{L}(X) = \tfrac12 X'' - 2WX' = \Big(\tfrac12\Delta - 2W\partial\Big)X.$$
Thus the generator $\mathcal{L}$ given in (4.17) can be considered as a noncommutative generalization of elliptic operators. For the general form of noncommutative elliptic operators, we refer to [29]. From now on, we assume that the function $W : \mathbb{R} \to \mathbb{R}$ satisfies the following conditions: W can be written as $W = W_1 + W_2$, where W, $W_1$ and $W_2$ satisfy the following properties:
(1) $W \in C^1(\mathbb{R})$, and there exists a constant $c \ge 0$ such that
$$-c \le W', \tag{4.20}$$
(2) For any $\varepsilon > 0$, there exist constants $b_1(\varepsilon)$ and $b_2(\varepsilon)$ such that the following bounds hold:
$$|W'| \le \varepsilon|W| + b_1(\varepsilon), \qquad |W_1''| \le \varepsilon|W| + b_2(\varepsilon). \tag{4.21}$$
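(3) $W_2'' \in L^2(\mathbb{R})$.
For instance, the function $W(x) = ax^{2n+1} + b|x|^\alpha$ satisfies the above properties, where $a > 0$, $b \in \mathbb{R}$, $n \in \mathbb{N}$ and $\alpha \in (3/2, 2)$.
Lemma 4.7. There exists a constant $b_1 > 0$ such that for $u \in C_0^\infty(\mathbb{R})$, the bound
$$\|\Delta u\|^2 + \tfrac12\|W^2 u\|^2 + 2\langle\partial u, W^2\partial u\rangle \le \|(-\Delta + W^2)u\|^2 + b_1\|(-\Delta + W^2)^{1/2}u\|^2$$
holds.
Proof. Notice that for any $u \in C_0^\infty(\mathbb{R})$,
$$\|(-\Delta + W^2)u\|^2 = \|\Delta u\|^2 + \|W^2 u\|^2 + 2\,\mathrm{Re}\langle -\Delta u, W^2 u\rangle = \|\Delta u\|^2 + \|W^2 u\|^2 + 2\langle\partial u, W^2\partial u\rangle + 2\,\mathrm{Re}\langle\partial u, (W^2)'u\rangle. \tag{4.22}$$
By the Schwarz inequality and the bound in (4.21), we have
$$2|\langle\partial u, (W^2)'u\rangle| \le \|\partial u\|^2 + 4\|WW'u\|^2 \le \|\partial u\|^2 + 8\varepsilon\|W^2 u\|^2 + 8b(\varepsilon)\|Wu\|^2 \le (1 + 8b(\varepsilon))\langle u, (-\Delta + W^2)u\rangle + 8\varepsilon\|W^2 u\|^2.$$
Choosing $\varepsilon = 1/16$, the lemma follows from (4.22) and the above bound.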
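Lemma 4.8. There exist constants $b_2 > 0$ and $b_3$ such that the bound
$$-i[H, G_0] \le G_0^2 - b_2 G_0 + b_3\mathbf{1}$$
holds as bilinear forms on $C_0^\infty(\mathbb{R})$.
Proof. Notice that as bilinear forms on $C_0^\infty(\mathbb{R})$,
$$[W\partial + \partial W, -\Delta] = \Delta W' + 2\partial W'\partial + W'\Delta = 4\partial W'\partial + [\partial, W''],$$
$$[W\partial + \partial W, W^2 - W'] = 4W^2W' - 2WW'',$$
which imply
$$-i[H, G_0] = -\tfrac14[W\partial + \partial W, -\Delta + W^2 - W'] = -\partial W'\partial - \tfrac14[\partial, W''] - W^2W' + \tfrac12 WW''. \tag{4.23}$$
We estimate the terms on the right-hand side of (4.23). Using (4.21) with $\varepsilon = 1/8$ and Lemma 4.7, we obtain that for $u \in C_0^\infty(\mathbb{R})$,
$$\langle\partial u, W'\partial u\rangle \le \tfrac18\langle\partial u, |W|\partial u\rangle + c_1\|\partial u\|^2 \le \tfrac1{16}\langle\partial u, W^2\partial u\rangle + c_2\|\partial u\|^2 \le \tfrac1{32}\|(-\Delta + W^2)u\|^2 + c_3\|(-\Delta + W^2)^{1/2}u\|^2. \tag{4.24}$$
Next, we use the property (3), Lemma 4.1 (Remark 4.2(a)) and Lemma 4.7 to obtain that for $\varepsilon > 0$,
$$\tfrac14|\langle u, [\partial, W'']u\rangle| \le \tfrac12|\langle\partial u, W''u\rangle| \le \tfrac14(\|\partial u\|^2 + \|W''u\|^2) \le \|(-\Delta + W^2)^{1/2}u\|^2 + \varepsilon\|\Delta u\|^2 + b\varepsilon^{-p}\|u\|^2 \le \varepsilon\|(-\Delta + W^2)u\|^2 + b_4\|(-\Delta + W^2)^{1/2}u\|^2 + b\varepsilon^{-p}\|u\|^2. \tag{4.25}$$
It follows from (4.20) that
$$-\langle u, W^2W'u\rangle \le c\|Wu\|^2 \le c\|(-\Delta + W^2)^{1/2}u\|^2. \tag{4.26}$$
We use the property (3), Lemma 4.1 and Lemma 4.7 to conclude that
$$\tfrac12|\langle u, WW''u\rangle| \le \|Wu\|^2 + \|W''u\|^2 \le \varepsilon\|(-\Delta + W^2)u\|^2 + c_4\|(-\Delta + W^2)^{1/2}u\|^2 + c_5\varepsilon^{-p}\|u\|^2 \tag{4.27}$$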
for some constants c4 and c5 . Choose ε = 1/32 for (4.25) and (4.27). Combining (4.23)–(4.27), we conclude that there exist constants c6 and c7 such that the bound −i[H, G0 ] ≤
1 (−∆ + W 2 )2 + c6 (−∆ + W 2 ) + c7 1 8
(4.28)
holds as bilinear forms on C0∞ (R). Recall that G0 = − 21 (−∆ + W 2 − W ). Using (4.21) and Lemma 4.7, one can show that for any δ > 0, there exists a constant c(δ) such that 1 (−∆ + W 2 )u2 ≤ (1 + δ)G0 u2 + c(δ)u2 . 4 Choosing δ = 1, the lemma follows from (4.28) and (4.29).
(4.29)
We are ready to show that the minimal q.d.s. constructed from L and G given in (4.18) and (4.19) is conservative. Clearly H is a symmetric operator on $C_0^\infty(\mathbb{R})$. Since $W'$ is infinitesimally small with respect to $-\Delta + W^2$ by (4.21) and Lemma 4.7, and since $-\Delta + W^2$ is essentially self-adjoint on $C_0^\infty(\mathbb{R})$ [26, Theorem X.28], $G_0$ is a non-positive, essentially self-adjoint operator on $C_0^\infty(\mathbb{R})$. Using [30, Proposition 3.1] and Lemma 4.8, it can be shown that $G = G_0 - iH$ generates a strongly continuous contraction semigroup on h, and so assumption A holds. For details, we refer to the proof of [30, Theorem 3.1]. In order to show that the inequalities (3.1) and (3.2) hold, we choose
$$C = -G_0 = \tfrac12(-\Delta + W^2 - W'). \tag{4.30}$$
Notice that $CG + G^*C = -i[H, G_0] - 2C^2$. Thus the inequality (3.1) for $u \in C_0^\infty(\mathbb{R})$ follows from Lemma 4.8 and the above relation. Next, we prove the inequality (3.2) for $u \in C_0^\infty(\mathbb{R})$. Notice that
$$\tfrac12\big(L^*[C, L] + [L^*, C]L\big) = L^*[L^*, L]L = -2(W - \partial)W'(W + \partial) = 2\partial W'\partial - 2W^2W' + 2(W')^2 + 2WW''. \tag{4.31}$$
Thus it follows from (4.23) and (4.31) that
$$i[H, C] + \tfrac12\big(L^*[C, L] + [L^*, C]L\big) = \partial W'\partial - \tfrac14[\partial, W''] - 3W^2W' + \tfrac52 WW'' + 2(W')^2. \tag{4.32}$$
By (4.20),
$$-\langle\partial u, W'\partial u\rangle \le c\|\partial u\|^2 \le c\|(-\Delta + W^2)^{1/2}u\|^2 \tag{4.33}$$
and by (4.21), with $\varepsilon = 1$,
$$2(W')^2 \le 4(W^2 - W') + \tilde{c} \le 4C + \tilde{c}\,\mathbf{1} \tag{4.34}$$
for some constant $\tilde{c}$. We apply the bounds (4.25)–(4.27) and (4.33)–(4.34) to (4.32), and use (4.29) to conclude that for any $\varepsilon > 0$ there exist constants $b_1 > 0$ and $b_2 > 0$ independent of ε such that the bound
$$\mathcal{L}(C) \le \varepsilon C^2 + b_1 C + b_2\varepsilon^{-p}\mathbf{1}$$
holds as bilinear forms on $C_0^\infty(\mathbb{R})$. Extension of the inequalities (3.1) and (3.2) to $u \in D(G)$ can be done by the method used in Example 4.3.

Acknowledgments
The authors would like to thank their anonymous referees for suggestions made for the improvement of the paper. This work was supported by the Korea Research Foundation Grant (KRF-2003-005-C00010).

References
[1] A. M. Chebotarev and F. Fagnola, Sufficient conditions for conservativity of minimal quantum dynamical semigroups, J. Funct. Anal. 153 (1998) 382–404.
[2] R. Alicki and K. Lendi, Quantum Dynamical Semigroups and Applications, Lecture Notes in Physics, Vol. 286 (Springer, 1987).
[3] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, 2nd edn. (Springer-Verlag, New York-Heidelberg-Berlin, Vol. I, 1987, Vol. II, 1997).
[4] E. B. Davies, Quantum Theory of Open Systems (Academic Press, London-New York-San Francisco, 1976).
[5] L. Accardi, A. Frigerio and J. T. Lewis, Quantum stochastic processes, Publ. Res. Inst. Math. Sci. 18 (1982) 97–133.
[6] P. A. Meyer, Quantum Probability for Probabilists, Lecture Notes in Mathematics (Springer-Verlag, Berlin, Heidelberg, New York, 1993).
[7] K. R. Parthasarathy, An Introduction to Quantum Stochastic Calculus, Monographs in Mathematics (Birkhäuser, Basel, 1992).
[8] A. M. Chebotarev, Sufficient conditions for conservativism of dynamical semigroups, Theor. Math. Phys. 80(2) (1989).
[9] A. M. Chebotarev, Sufficient conditions for conservativity of a minimal dynamical semigroup, Math. Notes 52 (1993) 1067–1077.
[10] A. M. Chebotarev and F. Fagnola, Sufficient conditions for conservativity of quantum dynamical semigroups, J. Funct. Anal. 118 (1993) 131–153.
[11] F. Fagnola, Chebotarev's sufficient conditions for conservativity of quantum dynamical semigroups, Quantum Probab. Related Topics VIII, ed. L. Accardi (1993) 123–142.
[12] A. M. Chebotarev and S. Y. Shustikov, Conditions sufficient for the conservativity of a minimal quantum dynamical semigroup, Math. Notes 71 (2002) 692–710.
[13] A. Arnold and S. Sparber, Quantum dynamical semigroups for diffusion models with Hartree interaction, Commun. Math. Phys. 251 (2004) 179–207.
[14] J. C. Garcia and R. Quezada, Hille–Yosida estimate and nonconservativity criteria for quantum dynamical semigroups, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 7(3) (2004) 383–394.
[15] F. Fagnola and R. Rebolledo, On the existence of stationary states for quantum dynamical semigroup, J. Math. Phys. 42 (2001) 1296–1308.
[16] F. Fagnola and R. Rebolledo, Subharmonic projections for a quantum Markov semigroup, J. Math. Phys. 43(2) (2002) 1074–1082.
[17] G. Lindblad, On the generators of quantum dynamical semigroups, Comm. Math. Phys. 48 (1976) 119–130.
[18] B. V. R. Bhat and K. B. Sinha, Examples of unbounded generators leading to nonconservative minimal semigroups, Quantum Probab. Related Topics IX (1994) 89–103.
[19] F. Fagnola, Characterization of isometric and unitary weakly differentiable cocycles in Fock space, Quantum Probab. Related Topics VIII, ed. L. Accardi (1993) 143–164.
[20] F. Fagnola, Diffusion processes in Fock space, Quantum Probab. Related Topics IX (1994) 189–214.
[21] F. Fagnola and S. J. Wills, Solving quantum stochastic differential equations with unbounded coefficients, J. Funct. Anal. 198 (2003) 279–310.
[22] E. B. Davies, Quantum dynamical semigroups and the neutron diffusion equation, Rep. Math. Phys. 11 (1977) 169–188.
[23] T. Kato, Perturbation Theory for Linear Operators (Springer, Berlin, 1966).
[24] J. Dereziński and V. Jakšić, Spectral theory of Pauli–Fierz operators, J. Funct. Anal. 180 (2001) 243–327.
[25] A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations (Springer-Verlag, New York, Berlin, Heidelberg, Tokyo, 1983).
[26] M. Reed and B. Simon, Methods of Modern Mathematical Physics I, II (Academic Press, 1980).
[27] H. Tanabe, Equations of Evolution (Pitman Press, London, 1979).
[28] R. Alicki, Scattering theory for quantum dynamical semigroups, in Quantum Probability and Applications to the Quantum Theory of Irreversible Processes, Lecture Notes in Mathematics, Vol. 1055 (1984) 20–31.
[29] C. Bahn and Y. M. Park, Feynman–Kac representation and Markov property of semigroups generated by noncommutative elliptic operators, Infin. Dim. Anal. Q. Prob. Rel. Topics 6 (2003) 103–121.
[30] C. Bahn, C. K. Ko and Y. M. Park, Quantum dynamical semigroups generated by noncommutative unbounded elliptic operators, arXiv: math-ph/0505026.
August 12, 2005 15:54 WSPC/148-RMP
J070-00244
Reviews in Mathematical Physics Vol. 17, No. 7 (2005) 769–792
© World Scientific Publishing Company
WETTING PHENOMENA AND CONSTANT MEAN CURVATURE SURFACES WITH BOUNDARY
RAFAEL LÓPEZ
Departamento de Geometría y Topología, Universidad de Granada, 18071 Granada, Spain
[email protected]
Received 20 October 2004
Revised 08 June 2005
In a microscopic scale or microgravity environment, interfaces in wetting phenomena are usually modeled by surfaces with constant mean curvature (CMC surfaces). Usually, the condition regarding the constancy of the contact angle along the line of separation between different phases is assumed. Although the classical capillary boundary condition is the angle made at the contact line, configurations also occur in which a Dirichlet condition is appropriate. In this article, we discuss those with vanishing boundary conditions, such as those that occur on a thin flat portion of a plate of general shape covered with water. In this paper, we review recent works on the existence of CMC surfaces with non-empty boundary, with a special focus on the Dirichlet problem for the constant mean curvature equation.
Keywords: Mean curvature; Dirichlet problem; maximum principle.
Mathematics Subject Classification 2000: 53A10, 35J65, 35Q35, 76B45
1. Modeling the Morphology of the Interface
Phase changes in states of equilibrium are influenced by the presence of surfaces and interfaces. An interface is the zone existing between two immiscible substances. The understanding of the shape of interfaces is a major objective in numerous scientific activities, such as engineering, chemical industry, nanotechnology and medicine. A typical example of interface occurs when we deposit an amount of liquid on a planar surface. We assume that no chemical reaction occurs between the two materials and that these are homogeneous. In a state of mechanical equilibrium, the droplet attains a fixed position. Wetting is the study of how that droplet spreads out on the substrate. Surface tension between two phases explains the local forces which tend to minimize its interfacial area, being responsible both for the shape and the deformation that the droplet takes. These forces of cohesion have a limited reach and their effects are detectable within a small molecular radius; they will be greater in solids, lower in liquids and almost negligible in gas. Besides liquid droplets on a solid substrate, interfaces appear in a variety of settings: capillarity
when a tube is dipped in a reservoir of liquid; a liquid droplet floating on another immiscible liquid; the shape formed by the liquid when it adheres to a wall; a droplet of liquid trapped between two closely-spaced horizontal plates. In wetting and adhesion phenomena, experimentalists are interested in the physical/chemical properties of materials and the morphology of the interfaces that explains, to a certain degree, these characteristics. Mean curvature of these interfaces plays a special role in the shapes and morphologies.
The classical mathematical model used for equilibrium configurations is essentially based on the classical Young–Laplace–Gauss equation. We consider a solid substrate (S) whose properties do not change in a gas environment (G). When one deposits a certain amount V of a liquid (L), a droplet, on the substrate (S), the energies of this system consist of three types: a free surface energy of the (LG) interface S, an adhesion energy of the liquid (L) over the substrate and a volume constraint. Denote by $S_{IJ}$ the area of the interface between phases I and J and by $\gamma_{IJ}$ the corresponding interfacial energy density. The equilibrium state of the liquid droplet corresponds with those configurations that are stationary values of the total interfacial free energy as given by
$$E = \gamma_{LG}S_{LG} + \gamma_{LS}(S_{LS} - S_{GS}) + (P_G - P_L)V.$$
The pressure term $P_G - P_L$ is included to fulfil the constraint on the volume. In order to minimize the energy E, one considers an interface S of arbitrary shape and performs small displacements of this interface that ensure that the (GLS) phase stays within the substrate surface. Interfacial shapes of minimal free energy are found from the requirement that all variations of the free energy, which are of first order in the displacements, must vanish. The theoretical model that describes the interface S is then expressed by the equation
$$P_L - P_G = \gamma_{LG}\Big(\frac{1}{R_1} + \frac{1}{R_2}\Big).$$
Here $R_1$ and $R_2$ denote the principal curvature radii that define the normal curvature of the interface S in two orthogonal planes containing the unit normal vector at each point of the interface. The mean curvature at a point of a smooth surface is defined by $H = \tfrac12\big(\tfrac{1}{R_1} + \tfrac{1}{R_2}\big)$. Thus one obtains the Young equation which measures the excess pressure across S [72, 35]:
$$P_L - P_G = 2H\gamma_{LG}. \tag{1.1}$$
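To give a feeling for the magnitudes in (1.1), here is a minimal numerical sketch. The function name `laplace_pressure`, the droplet radius and the value used for the water-air interfacial tension (roughly 0.072 N/m at room temperature) are assumptions made for illustration and do not come from the text.

```python
def laplace_pressure(gamma_lg, r1, r2):
    """Excess pressure P_L - P_G = gamma_LG * (1/R1 + 1/R2) = 2 * H * gamma_LG."""
    mean_curvature = 0.5 * (1.0 / r1 + 1.0 / r2)
    return 2.0 * mean_curvature * gamma_lg

# Assumed interfacial tension of a water-air interface, in N/m.
GAMMA_WATER_AIR = 0.072

# Hypothetical radius of 1 mm, in metres.
radius = 1.0e-3

# Spherical interface: both principal radii equal the radius, so H = 1/R.
dp_sphere = laplace_pressure(GAMMA_WATER_AIR, radius, radius)

# Cylindrical interface: one principal radius is infinite, so H = 1/(2R).
dp_cylinder = laplace_pressure(GAMMA_WATER_AIR, radius, float("inf"))
```

With these assumed values the pressure jump is about 144 Pa across the spherical interface and half of that across the cylindrical one, reflecting the factor 2 between the two mean curvatures.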
Complete derivations can be found in [1, 19, 52, 39, 15]. If one assumes the existence of gravity (for example, for sufficiently large droplets), we must add in the expression of E the potential energy ρ L h, where ρ is the difference between the gravities of the (L) and (G) phases, and h is the height above a reference level. We do not consider this term since we will focus on wetting structures when gravity is absent or for which the effects of gravity are small and can be ignored. In the case that both γLG and PG − PL are constant, the interface S is a constant mean curvature surface. Constant mean curvature surfaces are obtained from a variational argument as follows. First, the contact line C of the physical
system is the region of contact between the gas, liquid and solid phases. If D is an open subset of the $(u_1, u_2)$-plane, we parametrize S by a one-to-one smooth function $x : D \to \mathbb{R}^3$, where $x_1, x_2$ are two independent tangent vectors. The notation $x_i$ means the differentiation with respect to $u_i$. Moreover, we assume that $x|_{\partial D}$ is a parametrization of the curve C. We orient the surface by the unit normal vector $N = (x_1 \times x_2)/|x_1 \times x_2|$. Take a variation $x(t) : D \to \mathbb{R}^3$, $t \in \mathbb{R}$, of $S := x(0)(D)$ by surfaces $S_t = x(t)(D)$ with the same boundary C and enclosing the same volume as S. Since the energy is proportional to the interfacial area, let us measure the area A(t) of $S_t$. In equilibrium, we seek critical points of A(t) under the constraint that the volume is constant (in fact, surface tension creates a tendency to minimize area). But a straightforward computation leads to
$$A'(0) = -2\int_S H\,(\partial_t x(t))_{t=0}\cdot N\,dS,$$
where H is the mean curvature of S. Thus, if we impose $A'(0) = 0$ for any such variations, we conclude that H is a constant function, that is, CMC surfaces are critical points of the area functional under variations enclosing a fixed volume and spanning a fixed boundary.
On the other hand, and from the theory of differential geometry, the intrinsic geometry of the S interface is expressed in terms of two quadratic differential forms. The first fundamental form is the metric tensor $G = [g_{ij}] = [x_i \cdot x_j]$. The second fundamental form is the curvature tensor defined by $[h_{ij}] = [N \cdot x_{ij}] = [-N_i \cdot x_j]$ and the mean curvature H is then given by
$$H = \tfrac12\,\mathrm{trace}\big(G^{-1}[h_{ij}]\big). \tag{1.2}$$
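Formula (1.2) is easy to evaluate numerically for an explicit parametrization. The sketch below is an illustration only: the function `mean_curvature`, the finite-difference step and the two sample parametrizations are assumptions, chosen so that $N = (x_1 \times x_2)/|x_1 \times x_2|$ points inward. It approximates $x_i$ and $x_{ij}$ by central differences and then forms $\tfrac12\,\mathrm{trace}(G^{-1}[h_{ij}])$ for a sphere and a cylinder of radius R.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mean_curvature(x, u1, u2, step=1e-3):
    """H = 1/2 trace(G^{-1} [h_ij]) at (u1, u2), using central finite differences."""
    def at(i, j):
        return x(u1 + i * step, u2 + j * step)

    def stencil(terms, denom):
        # terms: (coefficient, offset in u1, offset in u2)
        return tuple(sum(c * at(i, j)[k] for c, i, j in terms) / denom
                     for k in range(3))

    x1 = stencil([(1, 1, 0), (-1, -1, 0)], 2 * step)
    x2 = stencil([(1, 0, 1), (-1, 0, -1)], 2 * step)
    x11 = stencil([(1, 1, 0), (-2, 0, 0), (1, -1, 0)], step ** 2)
    x22 = stencil([(1, 0, 1), (-2, 0, 0), (1, 0, -1)], step ** 2)
    x12 = stencil([(1, 1, 1), (-1, 1, -1), (-1, -1, 1), (1, -1, -1)],
                  4 * step ** 2)

    n = cross(x1, x2)
    length = math.sqrt(dot(n, n))
    normal = tuple(c / length for c in n)

    g11, g12, g22 = dot(x1, x1), dot(x1, x2), dot(x2, x2)
    h11, h12, h22 = dot(normal, x11), dot(normal, x12), dot(normal, x22)
    # Trace of G^{-1} h for symmetric 2x2 matrices, divided by 2.
    return (h11 * g22 - 2 * h12 * g12 + h22 * g11) / (2 * (g11 * g22 - g12 ** 2))

R = 2.0  # hypothetical radius

def sphere(u1, u2):
    # u1 = latitude, u2 = longitude; with this order x1 x x2 points inward.
    return (R * math.cos(u1) * math.cos(u2),
            R * math.cos(u1) * math.sin(u2),
            R * math.sin(u1))

def cylinder(u1, u2):
    # u1 = height, u2 = angle; with this order x1 x x2 points inward.
    return (R * math.cos(u2), R * math.sin(u2), u1)

H_sphere = mean_curvature(sphere, 0.3, 0.7)      # close to 1/R
H_cylinder = mean_curvature(cylinder, 0.3, 0.7)  # close to 1/(2R)
```

Swapping the roles of $u_1$ and $u_2$ reverses N and therefore the sign of H, which is why the orientation has to be fixed before speaking of the value of the mean curvature.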
In order for the surface $x(u_1, u_2)$ to have constant mean curvature H, the value H in (1.2) must be the same at all points on S. Examples of CMC surfaces are spheres and cylinders. If the radius is R, the mean curvature with respect to the inward orientation is $H = \tfrac{1}{R}$ and $H = \tfrac{1}{2R}$, respectively. If the interface is planar, $R_1, R_2 \to \infty$ and so the difference of pressure vanishes, that is, the pressures on both sides of the interface agree. Surfaces with zero mean curvature at each point are called minimal surfaces. As guides to the theory of minimal surfaces, see [53, 54].
2. Physical Configurations with Prescribed Boundary
The structure of the contact line C is determined by the intermolecular forces between the substrate and the molecules within the fluid phase. Due to the homogeneity of our models, the thermodynamic equilibrium between the three interfaces implies that the contact angle θ at which the liquid-gas interface S meets the substrate (S) is constant along the curve C. This angle θ is determined by the Gauss equation
$$\gamma_{GS} - \gamma_{LS} = \gamma_{LG}\cos\theta. \tag{2.1}$$
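Equation (2.1) can be solved for θ once the three interfacial tensions are known. A minimal sketch follows, with the caveat that the function name `contact_angle` and all numerical values are hypothetical; the check on $|\cos\theta| \le 1$ reflects the fact that (2.1) has no solution otherwise.

```python
import math

def contact_angle(gamma_gs, gamma_ls, gamma_lg):
    """Solve gamma_GS - gamma_LS = gamma_LG * cos(theta) for theta, in radians."""
    cos_theta = (gamma_gs - gamma_ls) / gamma_lg
    if abs(cos_theta) > 1.0:
        # Equation (2.1) admits no angle for such tensions.
        raise ValueError("no equilibrium contact angle: |cos(theta)| > 1")
    return math.acos(cos_theta)

# Hypothetical interfacial tensions in N/m.
theta_neutral = contact_angle(0.050, 0.050, 0.072)  # equal solid tensions
theta_wetting = contact_angle(0.060, 0.024, 0.072)  # cos(theta) = 1/2
```

Equal solid-gas and solid-liquid tensions give a right angle, while the second set of assumed values gives an angle of 60 degrees, that is, a droplet that spreads more on the substrate.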
Fig. 1. (a) A droplet in a state of equilibrium and with homogeneous phases. The angle θ is constant according to Eq. (2.1); (b) Hysteresis in a droplet in equilibrium on a heterogeneous substrate. The contact angle changes along the liquid-air-solid phase.
See [18] and Fig. 1(a). Thus the contact angle depends on the surface tensions between the different phases and so, it is an intrinsic property of the system. The combination of Eqs. (1.1) and (2.1) is of great importance because it allows the computations of the interface tensions of solid-liquid and solid-gas by an indirect method. There exist a number of measurement techniques to determine the surface tension [1]. Following Eqs. (1.1) and (2.1), the experimentalist tries to obtain an approximate knowledge of the morphology of the interface thanks to the use of photographic/optical techniques, where accuracy depends on the precision of the instruments used. Usually one assumes a certain symmetry of the interface, such as, for example, the interface being rotationally symmetric, since this simplifies the computations. Once the shape of the meniscus and the angle θ have been obtained, numerical integration allows the determination of the value of H. There is an extensive literature on capillarity. We refer readers to [14, 34, 20] and references therein. For the rest of this article, we shall assume that the effect of gravity is negligible, for example, in a microgravity environment or microscopic scale. Therefore, our interfaces are modeled by CMC surfaces. However, it is difficult to achieve ideal conditions in wetting experiments due to a variety of factors, for example, contamination, impurity, viscosity and volatility in the liquid; roughness, chemical heterogeneity, dirtiness and change of the hydrophilic degree in the substrate. As a consequence, the constancy of the contact angle θ is dropped and Eq. (2.1) is not fulfilled. This phenomenon is known as hysteresis and the value of the contact angle θ varies in a certain range, see Fig. 1(b). In the bibliography, one can find a great number of formulations to justify or to complete Eq. (2.1), see [27, 57, 64, 20]. Precisely, the difference between θ with the expected angle according to Eq. 
(2.1) is used as a test to measure the degree of cleanliness and roughness of the substrate. We focus on those configurations with a Dirichlet condition and without any assumption on the angle of contact. For example, consider a hydrophobic flat substrate Π with a hydrophilic domain Ω. An amount of liquid placed on this substrate will tend to wet only the domain Ω by forming a droplet that covers the whole domain Ω. In addition, if we assume non-ideality in both the liquid and
the substrate, the hysteresis phenomenon appears along the contact line. In such situations, the interface S is a CMC surface with prescribed boundary $C = \partial\Omega$. The interest lies in how the geometry of the boundary line C will be manifested in the shape of the drop. As an immediate question: does the droplet inherit the symmetries of C? In contrast to the physical experiments, the geometric structure of the space of CMC surfaces with boundary is not well known. The simplest case of boundary, that is, when C is a circle, shows the contrast between experience and theory. A physicist assumes that a droplet deposited on a round disc substrate adopts a spherical shape, whereas a mathematician allows other possible configurations, for example, a doughnut-shaped droplet. Indeed, the following is still unsolved:
Conjecture. Let S be a CMC surface with a circular boundary. If one of the following conditions holds, then S is a spherical cap:
(1) S is a topological disc.
(2) S has no self-intersections (embedded).
Our experience says that when we dip a round wire into a soap solution and then blow through it, the real morphologies are spherical bubbles. However, in 1991, Kapouleas proved the existence of CMC surfaces spanning a circle with holes and self-intersections [28]. Although we have considered the case where the boundary is planar, one can pose other configurations, some of which have recently been dealt with by experimentalists: fluids deposited in ring-shaped domains [36], liquids whose boundaries are constrained in two or three spheres [37, 63, 69], and probably the most studied case, liquid bridges between parallel planes and wedges. See, for example, [33, 50, 55, 58].
Now, a brief comment on the structure of the rest of the text. The material of this review is not organized as a sequence of "Theorem" and "Proof", but we present it as a continuum of the results that we want to point out. On the other hand, this article is far from being a survey on CMC surfaces in $\mathbb{R}^3$; for example, we do not treat the techniques on construction of complete or closed CMC surfaces, or the capillarity theory. Our aim is to show that this field is sufficiently active to be attractive to the reader.
3. The Dirichlet Problem of the CMC Equation
If the domain Ω is wetted by a not too large amount of liquid, the contact angle θ does not exceed π/2 and it is natural to think that the projection of S is one-to-one on Ω. Therefore, a first attempt is to describe these droplets as graphs on Ω. Let $\Omega \subset \mathbb{R}^2$ be a smooth domain. Given a real number H and a continuous function φ on ∂Ω, the corresponding Dirichlet problem for the CMC equation consists of
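finding a function $f \in C^2(\Omega) \cap C(\bar{\Omega})$ that satisfies
$$\mathrm{div}(Tf) = -2H \quad \text{in } \Omega, \qquad Tf = \frac{Df}{\sqrt{1 + |Df|^2}}, \tag{3.1}$$
$$f = \phi \quad \text{along } \partial\Omega. \tag{3.2}$$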
Geometrically, H is the mean curvature of the surface $z = f(x, y)$, whose boundary is $C = \mathrm{graph}\,\phi$. The orientation N assumed on the graph points downward, that is, $N = (Df, -1)/\sqrt{1 + |Df|^2}$. If $\phi = 0$, then the boundary of the surface is the curve ∂Ω. Equation (3.1) is an elliptic second order partial differential equation in two variables. Although it is not a linear equation, the difference of two solutions of the same equation satisfies a linear elliptic equation, and we can apply the Hopf maximum principle ([26], [21, Theorem 9.2]). In geometric terms, we have:
Theorem 3.1 (Tangency Principle). Let $S_1$ and $S_2$ be two surfaces in $\mathbb{R}^3$ with the same constant mean curvature. Suppose that they are tangent at a common interior point p and the Gauss maps $N_1$ and $N_2$ agree at p. If $S_1$ lies locally above $S_2$ at p with respect to $N_1(p) = N_2(p)$, then $S_1$ and $S_2$ coincide in a neighborhood of p. The same holds if p is a common boundary point with the extra hypothesis that $\partial S_1$ and $\partial S_2$ are tangent at p.
In contrast, Eq. (3.1) cannot be solved analytically in most cases, even if S is a surface of revolution (in such a case, solutions involve elliptic integrals, see Eq. (3.6)). Thus it is necessary to apply numerical methods to determine the shape of the interface. Recently, several techniques have been developed, for example, the application MESH at the GANG (Center for Geometry, Analysis, Numerics and Graphics) at the University of Massachusetts [25], and the software Surface Evolver designed by Brakke [6].
The general technique employed in the solvability of the Dirichlet problem is the method of continuity. We briefly explain this technique in some detail ([10, 21]). Consider H a fixed real number. For each $t \in [0, 1]$, we pose the family of Dirichlet problems
$$(P_t):\quad \mathrm{div}(Tu_t) = -2tH \ \text{ in } \Omega, \qquad u_t = \phi \ \text{ along } \partial\Omega.$$
Define the set $J = \{t \in [0, 1];\ \text{there exists a solution } u_t \text{ of } (P_t)\}$. In this setting, a solution of (3.1) and (3.2) exists provided one shows that $1 \in J$. For this purpose, we shall prove that J is a non-empty, open and closed subset of [0, 1], and hence, $J = [0, 1]$. Let us prove first that $0 \in J$, that is, there exists a minimal surface with the same boundary value φ on ∂Ω. In general, this solution is obtained from the very theory of minimal surfaces and, at first, this difficulty is not easily overcome (but if $\phi = 0$, then $u = 0$ is an immediate solution). In the second step,
we prove that J is open in [0, 1]. Let τ ∈ J; we will see that the Dirichlet problem (Pt) can be solved for each t in a certain interval around τ. Denote by Σt the graph corresponding to ut and define a map $h : C_0^{2,\alpha}(\Sigma_\tau) \to C_0^{\alpha}(\Sigma_\tau)$ taking each v to the mean curvature function of the normal graph on Στ corresponding to the function v:
$h(v) = \text{mean curvature of } (p \mapsto x(p) + v(p)N(p)),$
where x : Στ → R³ is the inclusion map. The linearization of h is the Jacobi operator of Στ, namely, $L(v) = \Delta v + |\sigma|^2 v$, where ∆ is the Laplace–Beltrami operator on Στ and σ is its second fundamental form. Here L is a self-adjoint linear elliptic operator with trivial kernel, since
$L\langle N, a\rangle = 0 \quad \text{and} \quad \langle N, a\rangle < 0,$ (3.3)
where N is the orientation on Σt and a = (0, 0, 1). Hence, by the implicit function theorem for Banach spaces, h is locally invertible. This shows that there exists a solution ut of (Pt) for values of t around τ.
Finally, it remains to be proved that J is closed in [0, 1]. The Schauder theory reduces the question to establishing a priori C⁰ and C¹ estimates of each solution ut of (Pt) independent of t, that is, it suffices to prove that there exists a constant M independent of t such that
$\sup_{\Omega} |u_t|,\ \sup_{\Omega} |Du_t| \le M.$
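The continuity argument can be mimicked numerically. The following is only a minimal sketch under simplifying assumptions that are not part of the text: the domain is replaced by an interval (−a, a) (a strip, with u depending on one variable), the boundary data is φ = 0, the equation is discretized by finite differences, and Newton's method stands in for the implicit function theorem step. The function names are hypothetical.

```python
import numpy as np


def residual_and_jacobian(u_inner, t, H, h):
    """Discrete residual of div(Tu) + 2tH = 0 in one variable, and its Jacobian."""
    u = np.concatenate(([0.0], u_inner, [0.0]))   # boundary values u = 0
    p = np.diff(u) / h                            # slopes at cell midpoints
    q = p / np.sqrt(1.0 + p**2)                   # flux Tu = u'/sqrt(1 + u'^2)
    w = (1.0 + p**2) ** (-1.5)                    # derivative dq/dp
    F = np.diff(q) / h + 2.0 * t * H
    m = u_inner.size
    J = np.zeros((m, m))
    i = np.arange(m)
    J[i, i] = -(w[:-1] + w[1:]) / h**2
    J[i[:-1], i[:-1] + 1] = w[1:-1] / h**2
    J[i[1:], i[1:] - 1] = w[1:-1] / h**2
    return F, J


def continuity_method(H, a, n=200, steps=10):
    """Follow the solutions u_t of (P_t) from t = 0 (u = 0) up to t = 1."""
    h = 2.0 * a / n
    y = np.linspace(-a, a, n + 1)
    u_inner = np.zeros(n - 1)                     # u = 0 solves (P_0) when phi = 0
    for t in np.linspace(0.0, 1.0, steps + 1)[1:]:
        for _ in range(50):                       # Newton iteration from the previous u_t
            F, J = residual_and_jacobian(u_inner, t, H, h)
            if np.max(np.abs(F)) < 1e-10:
                break
            u_inner = u_inner - np.linalg.solve(J, F)
    return y, np.concatenate(([0.0], u_inner, [0.0]))
```

In this one-dimensional setting the exact solution at t = 1 is a circular arc of radius 1/(2H), which requires 2Ha < 1; the computed profile can be compared with it.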
The value of |ut|, that is, the height of Σt, is controlled by a universal constant. Let S be the graph of a function u defined in a domain Ω, and assume that its mean curvature H is constant. Denote by x : S → R³ the inclusion map of S. From (3.3),
$\Delta\bigl(H\langle x, a\rangle + \langle N, a\rangle\bigr) = -2(H^2 - K)\langle N, a\rangle,$
where K is the Gaussian curvature of S. Since S is a graph, ⟨N, a⟩ < 0, and since H² − K ≥ 0 on any surface, the function H⟨x, a⟩ + ⟨N, a⟩ is subharmonic. The maximum principle implies that its maximum is attained at a boundary point, and thus
$|u| = |\langle x, a\rangle| \le \sup_{\partial\Omega} |\varphi| + \frac{1}{|H|}.$ (3.4)
In particular: If S is a graph with constant mean curvature H and planar boundary, the height of S from the boundary plane is at most 1/|H|. This result first appeared in [66] and has recently been generalized to unbounded domains [62]. Let us return to our setting. As a consequence of the maximum (or tangency) principle, $u_0 < u_t < u_1 \le \sup_{\partial\Omega} |\varphi| + 1/|H|$.
Now we seek a priori estimates for |Dut|. By the expression of N in terms of Dut, we know
$\langle N, a\rangle = -\frac{1}{\sqrt{1 + |Du_t|^2}}.$ (3.5)
But Eq. (3.3) tells us that ∆⟨N, a⟩ ≥ 0, and so the maximum of ⟨N, a⟩ is attained at a point of ∂Ω. Combining this with (3.5), we conclude
$\sup_{\Omega} |Du_t| = \sup_{\partial\Omega} |Du_t|.$
At this point, for each particular domain Ω, we need suitable surfaces as barriers to compare the slope of the graph of ut along its boundary: if |Du| → ∞, then N becomes orthogonal to a. Usually, the barriers are pieces of rotational CMC surfaces. Rotational surfaces in Euclidean space with constant mean curvature are known as Delaunay surfaces. Consider a surface S = {(r(s) cos θ, r(s) sin θ, s); s ∈ I, θ ∈ R} obtained by rotating about the z-axis the generating curve (r(s), 0, s), s ∈ I, with r(s) > 0 on I. Then the mean curvature H satisfies
$1 + r'^2 - r r'' = 2Hr\,(1 + r'^2)^{3/2}.$
Because
$\frac{d}{ds}\left(Hr^2 - \frac{r}{\sqrt{1 + r'^2}}\right) = 0,$
a first integral yields
$Hr^2 - \frac{r}{\sqrt{1 + r'^2}} = c$ (3.6)
for a constant c. In general, the function r cannot be written in closed form: its integration involves elliptic integrals. Delaunay discovered that the profile curve is the trace of a focus of a conic that rolls on the axis of revolution (the z-axis) [11]. Besides the catenoid (H = 0), Delaunay surfaces are unduloids and nodoids, together with the limit cases, spheres and cylinders. See Fig. 2. Existence for the Dirichlet problem can be assured for a few cases of domains Ω. Recall that not every value of H is allowed, because a simple integration
Fig. 2. Profile curves of Delaunay surfaces.
of (3.1) together with the divergence theorem implies that
$2|H| < \frac{\operatorname{length}(\partial\Omega)}{\operatorname{area}(\Omega)}.$ (3.7)
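The first integral (3.6) for Delaunay profiles can be checked numerically. The sketch below integrates the profile equation, written as r'' = (1 + r'² − 2Hr(1 + r'²)^{3/2})/r, with a classical Runge–Kutta scheme and monitors the quantity Hr² − r/√(1 + r'²). The function names, step sizes and initial data are illustrative assumptions, not taken from the text.

```python
import math


def delaunay_rhs(r, rp, H):
    """Right-hand side of the first-order system (r, r')' for the profile ODE."""
    g = 1.0 + rp * rp
    return rp, (g - 2.0 * H * r * g**1.5) / r


def first_integral(r, rp, H):
    """Conserved quantity H r^2 - r / sqrt(1 + r'^2) of Eq. (3.6)."""
    return H * r * r - r / math.sqrt(1.0 + rp * rp)


def integrate_profile(H, r0, rp0, s_end, n=3000):
    """Integrate the profile r(s) on [0, s_end] with the classical RK4 scheme."""
    h = s_end / n
    r, rp = r0, rp0
    path = [(0.0, r, rp)]
    for k in range(n):
        k1 = delaunay_rhs(r, rp, H)
        k2 = delaunay_rhs(r + 0.5 * h * k1[0], rp + 0.5 * h * k1[1], H)
        k3 = delaunay_rhs(r + 0.5 * h * k2[0], rp + 0.5 * h * k2[1], H)
        k4 = delaunay_rhs(r + h * k3[0], rp + h * k3[1], H)
        r += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        rp += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        path.append(((k + 1) * h, r, rp))
    return path
```

For H = 1 and a neck radius r(0) = 0.3 with r'(0) = 0, the constant is c = −0.21, and (3.6) forces the profile to oscillate between the radii 0.3 and 0.7, the roots of Hr² − r = c. This is the behavior of an unduloid.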
When Ω is a round disc, the only CMC graphs are (small) spherical caps. When Ω is a bounded convex domain, the classical existence result for Eq. (3.1) with an arbitrary boundary condition is due to Serrin [65]: If the curvature κ of ∂Ω with respect to the inner orientation satisfies 0 < 2|H| < κ, then for an arbitrary smooth function φ on ∂Ω, there exists a unique solution of (3.1) with f = φ along ∂Ω. However, for droplets resting on planar substrates, that is, for φ = 0 on ∂Ω, one expects the range of possible H to be bigger, as happens when Ω is a round disc. Some recent results have been obtained:
Theorem 3.2. Let Ω be a bounded convex domain. If one of the following assumptions holds, then there is a solution of Eq. (3.1) for f = 0 along ∂Ω:
(1) 0 < |H| < κ, where κ is the curvature of ∂Ω [43].
(2) length(∂Ω) < √3 π/|H| [49].
(3) area(Ω) < π/(2H²) [46].
(4) Ω is included in a strip of width 1/|H| [44].
In this theorem, pieces of spheres and cylinders are used as barriers since they fit well with the convexity of the domain Ω. We briefly present the proof of item (1) as a demonstration of our machinery (this result was proved under the stronger condition κ > 2|H|/√3 in [56]). Since the boundary condition is φ = 0 on C := ∂Ω, the function u = 0 is the solution of (P0). Next, we know that J is an open subset of [0, 1] and that we have C⁰ bounds for ut, t ∈ [0, 1]. We only need to control the slope of ut along C. Without loss of generality, we may assume that H is positive and ut > 0 on Ω. Let ρ, η > 0 be real numbers such that H < 1/ρ < 1/η < κ. Take concentric circles Γρ, Γη of radii ρ and η, respectively, in the plane Π = {x3 = 0} that contains C. Move both circles until C lies within the disc bounded by Γη in Π: this is possible due to the choice of η and the convexity of C. Take a hemisphere Sρ of radius ρ supported on Π whose boundary is Γρ. Lower Sρ until its intersection with Π is Γη and denote by Sη the piece of Sρ over Π (a spherical cap). The surface Sη is a graph on Π whose mean curvature is 1/ρ, with 1/ρ > H. See Fig. 3. The tangency principle implies that Σt lies in the bounded domain determined by Sη ∪ Π. Because 1/η < κ, and by rolling Γη along C, it is possible to move Sη by horizontal translations in such a way that ∂Sη touches each point of C. The tangency principle prohibits any contact between Sη and Σt before Sη touches Σt at a boundary point. Consequently, the slope of Σt along C is less than that of Sη along its boundary: this slope depends only on ρ and η, but not on t, which provides an a priori estimate of |Dut| on ∂Ω.
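The gradient bound supplied by the spherical barrier can be made explicit. A cap cut from a sphere of radius ρ by a plane along a circle of radius η < ρ is the graph of z = √(ρ² − |x|²) − √(ρ² − η²), whose slope along the boundary circle is η/√(ρ² − η²). The following sketch evaluates this bound; the helper `choose_radii`, which picks ρ and η by splitting the interval (H, κ) into thirds, is a hypothetical choice made only for illustration.

```python
import math


def cap_boundary_slope(rho, eta):
    """Slope |Du| along the boundary of the cap of a sphere of radius rho over a disc of radius eta."""
    if not 0.0 < eta < rho:
        raise ValueError("need 0 < eta < rho")
    return eta / math.sqrt(rho * rho - eta * eta)


def choose_radii(H, kappa):
    """Hypothetical choice of radii with H < 1/rho < 1/eta < kappa."""
    if not 0.0 < H < kappa:
        raise ValueError("need 0 < H < kappa")
    third = (kappa - H) / 3.0
    return 1.0 / (H + third), 1.0 / (H + 2.0 * third)
```

For instance, `choose_radii(1.0, 4.0)` gives ρ = 1/2 and η = 1/3, and the resulting bound `cap_boundary_slope(rho, eta)` is independent of t, as required for the closedness of J.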
Fig. 3. Proof of Theorem 3.2.
The results of Theorem 3.2 in terms of the size of the domain Ω are not optimal, as the case of a round disc shows. Other existence results for graphs on convex domains, though somewhat too complicated to state here, are obtained in [12, 60]. However, little is known when the boundary is not convex, and even less when C is not a planar curve. See also [65, 16, 47] for existence results for radial CMC graphs whose boundary is included in a given sphere.
For unbounded domains, the first examples are half-cylinders, where the domain Ω is an infinite strip. Finn suggested in [13] that the half-cylinder of radius 1/(2|H|) is the only graph with constant mean curvature H in a strip of width 1/|H|. Collin [9] showed that other solutions exist: given a strip B, take a convex function φ and place two copies of the graph of φ, one over each straight line of ∂B. He showed the existence of a solution f of (3.1) on B such that f = φ along ∂B. For a general unbounded convex domain Ω, the author has proved:
Theorem 3.3 ([44]). Let Ω be an unbounded convex domain. A necessary and sufficient condition to solve the Dirichlet problem (3.1) for f = 0 along the boundary is that Ω lies in a band of width 1/|H|.
For unbounded domains, we use the Perron method to obtain the desired solution: see [9, 10, 21] for an example in the same context. We describe how this technique works by proving Theorem 3.3. Without loss of generality, let us assume H > 0. Denote $L_H(u) = \operatorname{div}(Tu) + 2H$. Let v be a continuous function in Ω and D ⊂ Ω a closed disc. We denote by $\bar v$ the unique solution of the Dirichlet problem $L_H(\bar v) = 0$ in D satisfying the condition $\bar v = v$ on ∂D. The existence of the function $\bar v$ is assured by the Serrin result since the radius of D is less than 1/(2H). We define the function MD(v) in Ω as
$M_D(v)(p) = \begin{cases} \bar v(p), & p \in D, \\ v(p), & p \in \Omega \setminus D. \end{cases}$
The function v is a subsolution in Ω if v ≤ MD(v) for every disc D ⊂ Ω. If, in addition, v ≤ 0 along ∂Ω, we say that v is a subfunction relative to 0. The class F of all subfunctions relative to 0 is non-empty and is closed in the sense that MD(v) ∈ F provided that v ∈ F.
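The half-cylinder of radius 1/(2H) over a strip of width 1/H, which serves as the upper barrier in this argument, can be checked against the equation L_H(u) = 0. The sketch below does this by nested central differences, using the fact that the half-cylinder depends on the single variable y, so that div(Tu) reduces to an ordinary derivative. The function names and step sizes are illustrative assumptions.

```python
import math


def half_cylinder(y, H):
    """Height of the half-cylinder of radius 1/(2H) over the strip |y| < 1/(2H)."""
    return math.sqrt(1.0 / (4.0 * H * H) - y * y)


def flux(y, H, eps=1e-5):
    """Tu = u'/sqrt(1 + u'^2), with u' approximated by a central difference."""
    du = (half_cylinder(y + eps, H) - half_cylinder(y - eps, H)) / (2.0 * eps)
    return du / math.sqrt(1.0 + du * du)


def mean_curvature_operator(y, H, eps=1e-4):
    """Approximation of L_H(u) = div(Tu) + 2H for the half-cylinder at the point y."""
    return (flux(y + eps, H) - flux(y - eps, H)) / (2.0 * eps) + 2.0 * H
```

Away from the edges of the strip, where the half-cylinder becomes vertical, the returned values vanish up to discretization error, in agreement with the exact computation Tu = −2Hy.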
If the band B where Ω lies is {(x, y) : −1/(2H) < y < 1/(2H)}, consider the half-cylinder $z(x, y) = \sqrt{1/(4H^2) - y^2}$ and Z = z|Ω. Let us define F* = {v ∈ F; 0 ≤ v ≤ Z}, with similar properties to F, and
$u = \sup\{v;\ v \in F^*\} = \sup\{M_D(v);\ v \in F^*,\ D \subset \Omega\}.$
The Perron method shows that the function u is a solution of LH(u) = 0 in Ω. Finally, it remains to be proved that u is continuous on $\bar\Omega$ and that it takes the value 0 along the boundary. The fact that u is continuous in Ω is a consequence of the Harnack principle. The boundary behavior is obtained by using pieces of half-cylinders as barrier surfaces at the boundary points. Given p ∈ ∂Ω, it is possible to find a quarter of a cylinder Q with the same mean curvature as the graph S of u such that S lies between Q and B. The maximum principle together with the convexity of the domain shows that u takes the value 0 at each boundary point of Ω, proving the theorem. In this result, as in [9], the barriers are pieces of cylinders.
Other existence results can be given for strip domains or non-convex domains. We shall need the following definition.
Definition 3.4. A domain Ω ⊂ R² is said to satisfy an exterior ρ-circle condition if it is possible to roll a circle Cρ of radius ρ with Cρ ⊂ R² \ Ω so that Cρ touches each point of ∂Ω. In the same way, if f : R → R is a smooth function, we say that f satisfies an exterior ρ-condition if one of the domains of R² \ graph(f) satisfies an exterior ρ-circle condition. See Fig. 4.
Now we can state the following:
Theorem 3.5 ([45, 59]). Let H > 0 and let B be a strip of width 1/H.
(1) Let Ω ⊂ B be a domain that satisfies an exterior ρ-circle condition. There exists a number hρ > 0, depending on H, Ω and ρ, such that if H < hρ, Eq. (3.1) has a solution u, with u = 0 along ∂Ω.
(2) Let Ω ⊂ B be an infinite strip and f a function satisfying an exterior ρ-condition. There exists a number hρ > 0, depending on f, H, Ω and ρ, such that if H < hρ, Eq. (3.1) has a solution u, with u = f along ∂Ω.
Fig. 4. A domain Ω satisfying an exterior ρ-circle condition.
Fig. 5. Theorem 3.5, cases (1) and (2).
Examples of domains appear in Fig. 5. The proof of this result is similar to that of Theorem 3.3, except that the barrier is a suitable piece of a nodoid adapted to our setting. Let us finally pose some open problems related to the subject of this section.
Q1 In view of Theorem 3.2, some of the results are not optimal. For example, does the Dirichlet problem have a solution when length(∂Ω) < 2π/|H| or when area(Ω) < π/H²?
Q2 It has been proved in (3.4) that the height of a graph with constant mean curvature H spanning a planar boundary is at most 1/|H|. Hemispheres show that this bound is optimal, but we do not know whether the bound 1/|H| is achieved for other boundary configurations, even when ∂Ω is a convex curve.
4. The Effect of the Boundary in the Morphology of the Droplet
In this section, we shall describe the effect of the geometry of the boundary on the shape of the whole surface. Let S be a CMC surface with non-empty boundary ∂S, and let Y be a variational field in Euclidean three-space R³. Then the first variation formula of the area |A| of the surface S along Y gives
$\delta_Y |A| = -2H \int_S \langle N, Y\rangle\, dS - \int_{\partial S} \langle Y, \nu\rangle\, ds,$
where ν and ds represent, respectively, the inner unit conormal vector along ∂S and the arc-length element of ∂S. Let us fix a vector a ∈ R³ and let Y be the vector field of translations in the direction of a. Since Y generates isometries of R³, the first variation of the area is 0. Then
$2H \int_S \langle N, a\rangle\, dS + \int_{\partial S} \langle \nu, a\rangle\, ds = 0.$ (4.1)
The first integral changes into an integral on the boundary as follows. The divergence of the field Zp = (p ∧ a) ∧ N, p ∈ S, is −2⟨N, a⟩. The divergence theorem,
together with (4.1), yields
$\int_{\partial S} \langle \nu, a\rangle\, ds + H \int_{\partial S} \langle \alpha \times \alpha', a\rangle\, ds = 0,$ (4.2)
where α is a parametrization of ∂S such that α′ ∧ ν = N. This equality is known as the “balancing formula” or “flux formula” (see [32] for the embedded case). It is a conservation law in the sense of Noether that reflects the fact that the area (the potential) is invariant under the group of translations of Euclidean space. On the other hand, the formula can be viewed as the physical equilibrium between the surface tension forces of S that act along its boundary and the pressure forces that act on the domain bounded by ∂S. More generally, if we cut S into a collection of open pieces, then the surface tension along the cuts and the pressure through the caps must balance.
If the boundary ∂S lies in the plane Π = {x ∈ R³; ⟨x, a⟩ = 0}, with |a| = 1, then ⟨α × α′, a⟩ is simply the support function of the boundary. Therefore
$2H\bar{A} = \int_{\partial S} \langle \nu, a\rangle\, ds,$ (4.3)
where $\bar{A}$ is the area of the domain bounded by ∂S. This formula holds for an arbitrary immersed surface. See [48]. Thus, given a closed curve C ⊂ R³, the value H of the possible mean curvature of surfaces whose boundary is C cannot be prescribed freely, but is constrained by the geometry of the boundary curve C:
$|H| < \frac{\operatorname{length}(C)}{2\,\operatorname{area}(\Omega)}.$ (4.4)
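As a sanity check, the flux formula (4.3) can be evaluated on a spherical cap, where everything is explicit. For the cap of a sphere of radius ρ over a disc of radius r ≤ ρ, with the orientation for which H = 1/ρ, the inner conormal along the boundary circle satisfies ⟨ν, a⟩ = r/ρ. The sketch below, with hypothetical function names, compares both sides of (4.3) and evaluates the right-hand side of (4.4).

```python
import math


def boundary_flux(rho, r):
    """Integral of <nu, a> over the boundary circle of radius r of a cap of a sphere of radius rho."""
    if not 0.0 < r <= rho:
        raise ValueError("need 0 < r <= rho")
    nu_dot_a = r / rho                  # vertical component of the inner conormal
    return 2.0 * math.pi * r * nu_dot_a


def curvature_bound(length, area):
    """Right-hand side of (4.4): length(C) / (2 area(Omega))."""
    return length / (2.0 * area)
```

With ρ = 2 and r = 1, `boundary_flux(2.0, 1.0)` equals 2HĀ = 2 · (1/2) · π. For a circle of radius R, `curvature_bound(2 * math.pi * R, math.pi * R**2)` returns 1/R, the value attained by hemispheres.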
This generalizes (3.7). In particular, if C is a circle of radius R > 0, a necessary condition for the existence of a surface spanning C with constant mean curvature H is that |H| ≤ 1/R [22]. For the rest of this section, we point out some recent aspects of the theory of CMC surfaces with boundary.
4.1. Small CMC surfaces
Classically, there are two ways to obtain CMC surfaces spanning a prescribed closed curve C: the Plateau and isoperimetric problems. Let us recall both problems here. Given a fixed Jordan curve C ⊂ R³, the Plateau problem asks for an immersed CMC surface spanning C. In order to avoid complicated topology of the surfaces, one restricts attention to immersions $X : \bar{D} \to \mathbb{R}^3$ from the closed unit disc $\bar{D} \subset \mathbb{R}^2$ such that ∂X(D) = C. For this, one minimizes the functional Area − 2H · Volume in a suitable class of surfaces with the same boundary C. Also, we must assure that there exists such a surface with finite area. The techniques employed come from functional analysis. This viewpoint was initiated by Douglas but it was from
the 1950s to the present that it became an active field for a number of authors such as Brézis, Coron, Heinz, Hildebrandt, Steffen, Struwe and Wente, among others. A solution is obtained provided that the original data is small with respect to the size of the boundary C. We state the classical existence theorem due to Hildebrandt: Let C be a Jordan curve included in a ball of radius R > 0. If |H| ≤ 1/R, there exists a topological disc spanning C with mean curvature H [23]. We should mention that the Hildebrandt result is the best possible one, since we cannot drop the assumption on H: condition (4.4) shows that for H with |H| > 1/R, there does not exist a solution when C is a planar circle of radius R. There are many studies on the existence of solutions to the Plateau problem for CMC surfaces. The book by Struwe [68] and the survey paper [67] can serve as a guide for readers.
Similarly, one can minimize the area functional among all disc-type surfaces spanning C and enclosing a fixed volume V. Of course, whenever a minimum is attained, the solution is a CMC surface. In this sense, we point out the following result of Wente [70]: Given a Jordan curve C and a number V > 0, there exists a topological disc with constant mean curvature spanning C and enclosing a volume V. Although the above two results provide many CMC surfaces with boundary, no study has been done on the shape and morphology of these surfaces. Only in the case where C is a circle is it known that the solution given by Hildebrandt is a spherical cap [5].
On the other hand, the isoperimetric problem can be stated as follows. Let C be a closed curve such that C is the boundary of a surface G. Given a positive number V, an isoperimetric region with respect to (G, V) is a region M ⊂ R³ of volume V such that ∂M = G ∪ S and S has the least area among all possible M. Existence of these isoperimetric regions is guaranteed in the context of geometric measure theory [3]. By regularity, their boundaries are smooth except along C, and so they give us embedded CMC surfaces, possibly of high genus. However, in both problems, the geometry of the shapes of such surfaces is unknown.
Now assume that C is a planar curve. From experiments and for small volumes, one expects that the solutions obtained by Wente as well as the isoperimetric regions correspond to graphs on the domain determined by C. The mathematical proof of this fact is in [49] when the boundary C is convex. The basic fact is the following estimate of the height of a CMC surface:
Theorem 4.1 ([49]). Let S be a surface with constant mean curvature H. If ∂S is a closed curve included in a plane Π, then the height h of the surface with respect
to Π satisfies
$h \le \frac{A|H|}{2\pi},$
where A is the area of the part of S that lies over Π. Equality holds if and only if S is a spherical cap.
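The equality case of Theorem 4.1 is easy to confirm: a spherical cap of height h cut from a sphere of radius ρ has area 2πρh and mean curvature of absolute value 1/ρ, so that A|H|/(2π) = h. A minimal sketch, with hypothetical function names:

```python
import math


def cap_area(rho, h):
    """Area of a spherical cap of height h on a sphere of radius rho (0 < h <= 2 rho)."""
    if not 0.0 < h <= 2.0 * rho:
        raise ValueError("need 0 < h <= 2 * rho")
    return 2.0 * math.pi * rho * h


def height_bound(area, H):
    """Right-hand side A|H| / (2 pi) of the height estimate of Theorem 4.1."""
    return area * abs(H) / (2.0 * math.pi)
```

For any admissible pair (ρ, h), `height_bound(cap_area(rho, h), 1 / rho)` returns h, for small and large caps alike.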
A consequence of this estimate is the description of Hildebrandt and Wente solutions for small volumes.
Theorem 4.2 ([43]). Let C be a convex curve.
(1) There exists a value H0(C) > 0 depending only on C such that Hildebrandt solutions are graphs if the mean curvature H satisfies |H| < H0(C).
(2) There exists a value V0(C) > 0 depending only on C such that Wente solutions are graphs if the volume V satisfies V < V0(C).
In the same way, Theorem 4.1 allows us to describe the shapes of some isoperimetric regions.
Theorem 4.3 ([49]). Given a bounded convex planar domain Ω, there exists a positive number VΩ such that isoperimetric regions with respect to the pair (Ω, V) for V ≤ VΩ are bounded by Ω ∪ S, where S is a constant mean curvature graph over Ω.
One question is what happens if the curve is not convex. We expect the same result to hold for arbitrary planar closed curves. This belief comes from the solvability of the CMC equation. If C is a planar closed curve, we know that for small values of H, there exist graphs with mean curvature H spanning C. An analysis of the mean curvature equation shows that in this range of H, the volume and the height are increasing functions of H [51]. Since the curve C is prescribed, we expect small volumes to correspond to droplets of high wettability. Thus, the droplet would actually be a graph.
4.2. Existence of critical volumes
When we increase the volume of the liquid deposited on the substrate, the configurations cease to be graphs and, if the fluid remains pinned to the boundary C, the morphology can change completely. These morphological wetting transitions have been observed at the Max Planck Institute of Colloids and Interfaces at Golm in the study of chemically structured surfaces containing hydrophilic and hydrophobic surface domains. These patterns are of interest in microfluidic devices, for example, in the spreading of alkanes, photolithography, copolymer films and vapor deposition. See [17] and references therein.
It has been observed experimentally that the original cylinders on striped domains change abruptly beyond a certain volume of liquid deposited on the substrate. See Fig. 6. The morphology of the liquid is a function of the quantity of liquid injected along a given pattern. An addition of
Fig. 6. Morphological wetting transitions of water channels on a hydrophilic MgF2 striped surface (scale bar: 120 µm). The effect of the volume on the shape of the liquid appears at microscopic scale. When the amount of water is small, the liquid adopts the expected shapes, namely graphs on a band (left), i.e. pieces of cylinders. However, beyond a critical volume, the liquid develops bulges (right), with the possibility of joining neighboring channels. Courtesy of R. Lipowsky.
volume produces bulges [17, 38]. Similar results have been noted for annular-type patterns [36]. In the same sense, computational experiments carried out at the GANG show similar situations. For a convex curve C, it seems that there exists a critical volume VC with the following property: a CMC surface bounded by C and enclosing a volume V > VC has parts on both sides of Π [24]. Thus the existence of two wetting regimes depending on the volume of fluid deposited on the substrate seems clear. If the volume is small, we know that the surface is a graph [49, 43]. But if we add liquid, the morphologies taken on by the droplet are unexpected.
The problem is also interesting for two convex curves C1, C2 in parallel planes. In this situation, if the volume V is small, isoperimetric regions are disconnected droplets deposited on each plane, and such surfaces do not form a bridge between the planes. What happens when V → ∞? We do not know the topology of the isoperimetric regions when the volume increases. Do the two droplets grow and touch for a first time, i.e. is there an isoperimetric region bounded by a topological cylinder joining C1 to C2? We do not even know if there is a constant mean curvature topological cylinder with boundary C1 ∪ C2. The answer is known to be affirmative only if C1, C2 are circles. In this case, if the droplet touches the planes in coaxial discs, then the surface is a rotational surface [42].
4.3. Embedded CMC surfaces
An embedded surface is a surface without self-intersections. For example, graphs of smooth functions are embedded. Although droplets obtained in experiments are embedded, mathematically a CMC surface can have self-intersections (an immersed surface). In this section, we assume that all the surfaces are compact. Shapes of embedded CMC surfaces are the best known due to the tangency principle.
Alexandrov used it in an original way to prove that the only compact embedded CMC surfaces in R³ are round spheres [2]. The proof compares the surface with itself (Alexandrov reflection method). The original question posed in Sec. 2 of whether a CMC surface inherits the symmetries of its boundary can be answered in some cases if we assume that the surface is embedded. The best result in this sense is the following:
Theorem 4.4 ([29]). Let C be a closed curve contained in a plane Π. Assume that C is symmetric with respect to a straight line L ⊂ Π and that each piece of C that L divides is a graph on L. If S is an embedded CMC surface with boundary C and S lies on one side of Π, then S is symmetric with respect to the plane P orthogonal to Π with P ∩ Π = L. In particular, if C is a round circle, then S is a spherical cap.
The proof consists of showing that S is invariant under the reflection with respect to P. Moreover, the proof shows that the parts of S on each side of P are graphs over a domain of P. We work as follows. After an ambient isometry, we can suppose that Π = {z = 0}, P = {y = 0} and S ⊂ {z ≥ 0}. Denote by Ω the bounded region in Π bounded by C = ∂S. Since S ∪ Ω is a closed surface, R³ \ (S ∪ Ω) = A ∪ B, where A and B are the unbounded and bounded components, respectively. We orient S by the Gauss map N that points towards A. Consider the family of all translated copies of P given by P(t) = {y = t} (hence P(0) = P). For t sufficiently negative, P(t) ⊂ A. Letting t increase towards 0, consider the first plane P(t1) that reaches S, that is, P(t1) ∩ S ≠ ∅ but P(t) ∩ S = ∅ if t < t1. Now, when we increase t from t1, we denote by S(t)− and S(t)+ the (closed) parts of S on the left and right of P(t), respectively. Let S(t)* be the reflection of S(t)− through P(t). Initially, when we take t sufficiently close to t1 with t > t1, S(t)* is contained in B and S(t)* and S(t)− are graphs on P(t). As t continues to increase, by the compactness of S, there exists a critical time τ > t1 such that S(τ)* has a first contact with S(τ)+, that is, S(τ)* is contained in the closure of B, there exists p ∈ S(τ)* ∩ S(τ)+, and S(t)* ceases to be contained in B immediately afterwards.
Different possibilities may occur. First, if τ = 0, then S(0)* lies in the closure of B. In this case, we repeat the argument starting from t = +∞. If we again arrive, by carrying out reflections, at the position t = 0, then P is a plane of symmetry of S. Otherwise, the reasoning is similar to the one below. In the remaining case, τ < 0 and there exists p ∈ S(τ)* such that the tangent planes of S(τ)* and S(τ)+ agree at p. See Fig. 7. The point p may be an interior or boundary point of S(τ)*, but p ∈ C is excluded, due to the symmetry of C and the fact that S lies over Π. Now we regard S(τ)* and S(τ)+ as graphs with respect to TpS. In addition, the Gauss maps of both surfaces at p agree since S is embedded. Applying the tangency principle (in its interior or boundary version), we conclude that S(τ)* = S(τ)+ and P(τ) is a plane of symmetry of S, in contradiction with the symmetry of C with respect to L.
Theorem 4.4 generalizes in two directions. Let us first consider a CMC embedded surface bounded by two coaxial circles. We know that if the surface is included in the
Fig. 7. The Alexandrov reflection method.
slab determined by the two boundary planes, the surface is rotationally symmetric [42]. What happens if the surface exceeds this slab? Pieces of nodoids show the existence of rotationally symmetric CMC surfaces spanning two coaxial circles that are not included in the slab determined by the boundary.
Q3 Let S be a CMC embedded surface bounded by two coaxial circles. Is S a surface of revolution?
On the other hand, Theorem 4.4 generalizes to more general assumptions on the mean curvature. If we think of a droplet as an embedded surface resting on a planar substrate under the effect of gravity and making a constant contact angle with the substrate, a theorem of Wente shows that the droplet is rotationally symmetric with respect to a straight line orthogonal to Π [71]. Furthermore, each intersection of the droplet with a horizontal plane is a round circle. This result is key for experimentalists, since it is customary to consider rotationally symmetric droplets. However, even for minimal surfaces, there are immersed counter-examples. Recall also that Kapouleas found non-embedded CMC surfaces spanning a circle.
We now point out two more recent results that show how the geometry of the boundary affects the whole surface. We sketch their proofs without going into details.
Theorem 4.5 ([29]). Let C be a Jordan curve included in a plane Π and denote by Ω the domain bounded by C. Let S be an embedded CMC surface spanning C. If S does not intersect $\Pi \setminus \bar\Omega$, then S lies in one of the half-spaces determined by Π. In particular, if C is a circle, S is a spherical cap.
Assume, by contradiction, that S has points on both sides of Π. We take a large lower hemisphere Q such that the disc bounded by ∂Q in Π contains C and Q ∩ S = ∅. Then Q ∪ S together with the annulus determined by ∂Q and C is a closed surface in R³ that encloses a domain W. See Fig. 8. We orient S by the Gauss map N that points towards W. We use the tangency principle to compare S with the horizontal plane at the lowest point p of S to infer that H < 0. On the other hand, at the highest point q of S, a new comparison with the tangent plane gives H > 0, which is a contradiction.
Fig. 8. Proof of Theorem 4.5.
Theorem 4.6 ([8]). Let C be a convex curve included in a plane Π. Let S be an embedded CMC surface spanning C. If S is transverse to Π along C, then S lies on one side of Π. In particular, if C is a circle, S is a spherical cap.
Here we show two cases that illustrate the proof of this result. In Fig. 9(a), the surface meets the plane Π in a nullhomotopic closed curve G in Π \ Ω. We use the Alexandrov reflection method with vertical planes arriving from infinity. Since C is convex, we would have an interior contact point, proving that S has a symmetry with respect to a vertical plane that does not intersect C: this case is impossible. The second case that we analyze appears in Fig. 9(b). Again, this surface is impossible. In this situation, we use the balancing formula as follows. The surface S together with Ω encloses a domain W and we orient S by N pointing towards W. Here the mean curvature H is positive. Set a = (0, 0, 1). In the expression (4.2), ⟨ν, a⟩ is positive, since the surface is transverse to Π along ∂S. Since α′ = ν ∧ N, the term ⟨α ∧ α′, a⟩ is also positive, so the left-hand side of (4.2) is positive, which is again a contradiction.
Theorem 4.6 can be complemented as follows. If the droplet is a graph around the boundary, the surface is actually a graph:
Theorem 4.7 ([41]). Let C be a Jordan curve included in a plane Π and let Ω be the domain that C bounds in Π. Let S be an embedded CMC surface spanning C. If S is a graph over Ω around the boundary C, then the surface is a graph.
Fig. 9. Proof of Theorem 4.6, cases (a) and (b).
In this result, we again use the Alexandrov reflection method but with horizontal 3 planes: let us consider Q = S ∪ (C × [−∞, 0]) and B the domain of R \ Q that contains Ω. Let us orient S by the Gauss map that point towards B. Carrying out the Alexandrov reflection method with horizontal planes coming from infinity, there are two possibilities: there exists a contact position between an interior point p of S with a point q of C; or we can carry on doing reflections with planes until arriving at Π. In the former case, there is a contradiction because S is a graph on Ω around C but the segment joining p with q must be in B. As a consequence, the second possibility occurs, proving that S is a graph. From this result, consider C a closed curve included in a plane Π and let S be an embedded CMC surface spanning C such that S lies over Π. We know that for small volumes, S is a graph, but we do not know if the same occurs when the height of S is not very great. The height of an embedded surface is controlled as follows. Consider an embedded compact surface S with constant mean curvature H. Assume that ∂S is contained in a plane Π and S lies in one of the two half-spaces determined by Π. A similar reasoning with the Alexandrov reflection method as in Theorem 4.4 but using horizontal planes coming from infinity, we can easily show that there is no contact point in the reflection process at least until the height h/2, being h the height of S. Since until this moment, the part of surface behind the plane at height h/2 is a graph, we conclude the following. The maximum height of an embedded compact surface with mean curvature H, planar boundary and above the boundary plane is 2/|H|. This estimate is optimal for big spherical caps. Q4 Big spherical caps are not graphs. Thus, we ask if an embedded surface with constant mean curvature H, planar boundary and with height 1/|H| not far from the boundary plane is actually a graph? 
Furthermore, we expect S to be a hemisphere if the height attains the value 1/|H|. Finally, we ask about the topology of a CMC droplet resting on a plane. One open question is whether an embedded CMC surface with convex boundary that lies over the plane containing the boundary has the topology of a disc [8]. We know that this is true in some cases: see Theorems 4.3 and 4.7. While the case of small volume was treated in [49], Ros and Rosenberg have proved an analogous result when the volume is sufficiently large.
Theorem 4.8 ([61]). Let C be a convex curve included in a plane Π. There exist numbers V0 and A0, depending only on C, such that any embedded CMC surface with boundary C and over Π is a topological disc provided that either its volume is bigger than V0 or its area is bigger than A0.
The authors have also proved that for large volumes (or areas), the surface looks like a sphere. In particular, the surface is the union of two graphs: first, a graph over a domain D that contains Ω, the domain that C bounds on Π, together with a graph
on the annulus D \ Ω. We expect this to be true under the sole assumption that the surface S is locally a graph over Π \ Ω around C. Physically, this means that S dewets along C.
Q5 Is there a domain D with Ω̄ ⊂ D such that S = S1 ∪ S2, where S1 and S2 are graphs over D and D \ Ω̄, respectively?
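The height bounds discussed in this subsection can be illustrated on spherical caps. The following minimal sketch uses only elementary geometry (a sphere of radius 1/|H| cut by a plane along a circle of radius R); the function name cap_heights is ours and does not come from the cited references.

```python
import math

def cap_heights(R, H):
    """Heights of the small and big spherical caps of radius 1/|H|
    bounded by a circle of radius R (requires 0 < |H| <= 1/R)."""
    r = 1.0 / abs(H)                  # radius of the sphere
    if R > r:
        raise ValueError("no spherical cap: need |H| <= 1/R")
    d = math.sqrt(r * r - R * R)      # distance from the sphere center to the plane
    return r - d, r + d               # (small cap height, big cap height)

for H in (0.25, 0.5, 1.0):
    small, big = cap_heights(1.0, H)
    print(f"H = {H}: small cap {small:.4f}, big cap {big:.4f}, bound 2/|H| = {2 / abs(H):.4f}")
```

In this sketch the big cap always stays below 2/|H|, and both caps have height exactly 1/|H| when |H| = 1/R, that is, when they are hemispheres.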
4.4. CMC surfaces with circular boundary
Returning to the case of circular boundary, the first example is a spherical cap. If we cut a sphere by a plane, we obtain two surfaces with the same circle as boundary: two hemispheres if the plane passes through the center of the sphere, or two different spherical caps otherwise. However, our mathematical CMC droplets can have configurations that are physically unrealizable. Even in the simple case where the surface has the topology of a disc or has no self-intersections, non-spherical droplet shapes could appear. So, are the droplets on circular domains necessarily round? If the radius of the circle is R, a necessary condition on the value of H is that |H| ≤ 1/R ([22] and formula (4.2)). In this case, the only known examples are the planar disc (H = 0), the two corresponding spherical caps of radius 1/|H| if H ≠ 0, and the Kapouleas examples [28] cited in Sec. 2. However, no numerical (computer-generated) images of these surfaces are available. We report here recent results that characterize spherical caps in the family of CMC surfaces with circular boundary.
Theorem 4.9. Let C be a circle of radius R and let S be a compact surface spanning C with constant mean curvature H. Then S is a spherical cap if one of the following conditions holds:
(1) The surface is embedded and lies over the plane containing C [2].
(2) The surface is embedded and does not intersect the exterior domain of the circle in the boundary plane [29].
(3) The mean curvature satisfies |H| = 1/R [7].
(4) The surface is included in a closed ball of radius 1/|H| [5].
(5) The surface is embedded and transverse to the boundary plane along the boundary [8].
(6) The surface is a minimizer surface [30].
(7) The surface is a topological disc with an area less than the area of the small spherical cap with mean curvature H [48].
(8) The volume is less than the volume of a hemisphere [49].
(9) The surface is embedded and included in a slab of width 1/|H| [40].
(10) The surface is a stable topological disc [4].
(11) The surface is a topological disc that makes a constant angle of contact along the boundary [43].
(12) The surface is stable and has a free boundary along the boundary plane [31].
5. Conclusions and Outlook
Constant mean curvature surfaces with prescribed boundary are models for interfaces in a microgravity environment or at a microscopic scale. Experiments under non-ideal conditions show that the condition concerning the contact angle is not satisfied (hysteresis). When we deposit a sufficient amount of liquid on a heterogeneous substrate, interfaces are constrained to the borders that separate hydrophilic and hydrophobic domains. If the volume of liquid, or the contact angle, is small (wetting), mathematical results show that CMC graphs simulate the adopted shapes reasonably well. For convex domains at least, the results reported here reveal considerable mathematical activity on the existence of solutions of the Dirichlet problem. This suggests that greater emphasis could be placed on experiments with liquid shapes given by a nonparametric surface z = f(x, y). It would be of interest to study the relationship between the size of the substrate surface and the formation of liquid droplets, their heights and volumes. Advances in computer technology open the possibility of modeling solutions of the Dirichlet problem. On the other hand, in the light of the results in the literature, the relation between the shape of the interface and the amount of liquid that it contains is not completely understood since, for large volumes, the interface ceases to be a graph and the morphologies observed are unexpected. In this situation, the shapes obtained experimentally show how far the models of such systems are from being understood.
Finally, one comment. The points of view of a physicist and a mathematician can converge, as for example when experiments in wetting and capillarity help to reach the results that a mathematician is attempting to prove and, vice versa, when mathematics provides a model for a physical phenomenon that allows a relative description of these morphologies. However, as has been shown throughout these pages, experimental evidence is one thing and theoretical proof is another. The conjecture on spherical caps (see Sec. 2) is an example of this complexity. The richness of the space of CMC surfaces with boundary and the variety of shapes of such surfaces go well beyond the limits of experience: surfaces that are probably physically unrealizable but mathematically possible.
Acknowledgments
The author would like to thank the referees for reading the paper carefully and for providing suggestions and comments to improve it. This work has been partially supported by a MEC-FEDER grant no. MTM2004-00109.
References
[1] A. W. Adamson, Physical Chemistry of Surfaces (John Wiley and Sons, New York, 1990).
[2] A. D. Alexandrov, V. Vestnik Leningrad Univ. A.M.S. Ser. 2, 21 (1958) 412.
[3] F. J. Almgren, Mem. Amer. Math. Soc. 4 (1976).
[4] L. Alías, R. López and B. Palmer, Proc. A.M.S. 127 (1999) 1195.
[5] J. L. Barbosa, Matem. Contemp. 1 (1991) 3.
[6] K. Brakke, Exp. Math. 1 (1992) 141.
[7] F. Brito and R. Earp, An. Acad. Bras. Ci. 3 (1991) 5.
[8] F. Brito, R. Earp, W. Meeks III and H. Rosenberg, Indiana Univ. Math. J. 40 (1991) 333.
[9] P. Collin, C.R. Acad. Sci. Paris Sér. I 311 (1990) 539.
[10] R. Courant and D. Hilbert, Methods of Mathematical Physics (Interscience, New York, 1962).
[11] C. Delaunay, J. Math. Pure et App. 16 (1841) 309.
[12] N. Espirito-Santo and J. Ripoll, J. Geom. Anal. 11 (2001) 603.
[13] R. Finn, J. d’Anal. Math. 14 (1965) 139.
[14] R. Finn, Equilibrium Capillary Surfaces (Springer-Verlag, New York, 1986).
[15] R. Finn, J. Math. Fluid Mech. 3 (2001) 139.
[16] P. Fusieger and J. Ripoll, Ann. Global Anal. Geom. 23 (2003) 373.
[17] H. Gau, S. Herminghaus, P. Lenz and R. Lipowsky, Science 283 (1999) 46.
[18] K. F. Gauss, Comment. Soc. Regiae. Scient. Gottingensis Rec. 7 (1830), reprinted in Werke, Vol. 5 (Gottingen, 1876), 29.
[19] J. Gaydos, Colloids and Surfaces A: Physicochemical and Engineering Aspects 114 (1996) 1.
[20] P. G. de Gennes, F. Brochard-Wyart and D. Quéré, Capillarity and Wetting Phenomena: Drops, Bubbles, Pearls, Waves (Springer-Verlag, New York, 2004).
[21] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order (Springer-Verlag, New York, 1983).
[22] H. Heinz, Arch. Rational Mech. Anal. 35 (1969) 249.
[23] S. Hildebrandt, Commun. Pure Appl. Math. 23 (1970) 97.
[24] D. Hoffman and H. Rosenberg, Surfaces Minimales et Solutions de Problèmes Variationnels (Soc. Math. France, Paris, 1993).
[25] J. T. Hoffman, MESH: A program for generating parametric surfaces using an adaptive mesh, Document Library, Series 2, No. 35, Univ. Massachusetts (1995).
[26] E. Hopf, Preuss. Akad. Wiss. 19 (1927) 147.
[27] R. E. Johnson and R. H. Dettre, Surface Coll. Science 1 (1969) 85.
[28] N. Kapouleas, J. Diff. Geom. 33 (1991) 683.
[29] M. Koiso, Math. Z. 191 (1986) 567.
[30] M. Koiso, Manuscripta Math. 87 (1995) 311.
[31] M. Koiso, Bull. Kyoto Univ. Ed. Ser. B94 (1999) 1.
[32] R. Kusner, Global geometry of extremal surfaces in three-space, Ph.D. thesis, Univ. California, Berkeley, 1998.
[33] D. Langbein, Microgravity Sci. Technol. 5 (1992) 2.
[34] D. Langbein, Capillary Surfaces (Springer-Verlag, Berlin, 2002).
[35] P. S. Laplace, Traité de la Mécanique Céleste; Suppléments en livre X (Gauthier-Villars, Paris, 1806).
[36] P. Lenz, W. Fenzl and R. Lipowsky, Europhys. Lett. 53 (2001) 618.
[37] G. Lian, C. Thornton and M. Adams, J. Coll. Interface Sci. 161 (1993) 138.
[38] R. Lipowsky, Current Op. Colloid Interface Sci. 6 (2001) 40.
[39] S. Ljunggren, J. C. Eriksson and P. A. Kralchevsky, J. Coll. Interfaces Sci. 191 (1997) 424.
[40] R. López, Geom. Dedicata 66 (1997) 255.
[41] R. López, J. Geom. 60 (1997) 80.
[42] R. López, Ann. Global Anal. Geom. 15 (1997) 201.
[43] R. López, Tsukuba J. Math. 23 (1999) 27.
[44] R. López, J. Diff. Eq. 171 (2001) 54.
[45] R. López, Pacific J. Math. 206 (2002) 359.
[46] R. López, Glasgow Math. J. 44 (2002) 455.
[47] R. López, Manuscripta Math. 110 (2003) 45.
[48] R. López and S. Montiel, Proc. A. M. S. 123 (1995) 1555.
[49] R. López and S. Montiel, Duke Math. J. 85 (1996) 583.
[50] U. M. Marconi and F. Van Swol, Phys. Rev. A39 (1989) 4109.
[51] J. McCuan, Calc. Var. 9 (1999) 297.
[52] A. W. Neumann and J. K. Spelt, Applied Surface Thermodynamics (Marcel Dekker, New York, 1996).
[53] J. C. C. Nitsche, Lectures on Minimal Surfaces (Cambridge University Press, Cambridge, 1989).
[54] R. Osserman, A Survey of Minimal Surfaces (Dover Publ. Inc., New York, 1986).
[55] A. O. Parry, C. Rascón and A. J. Wood, Phys. Rev. Lett. 85 (2000) 345.
[56] L. E. Payne and G. A. Philippin, Nonlinear Anal. Theory Meth. App. 3 (1979) 193.
[57] D. Quéré, Physica A313 (2002) 32.
[58] C. Rascón and A. O. Parry, Nature 407 (2000) 986.
[59] J. Ripoll, Pacific J. Math. 198 (2001) 175.
[60] J. Ripoll, J. Differential Equations 181 (2002) 230.
[61] A. Ros and H. Rosenberg, J. Diff. Geom. 44 (1996) 807.
[62] A. Ros and H. Rosenberg, Properly embedded surfaces with constant mean curvature, preprint (2004).
[63] P. R. Rynhart, R. McLachlan, J. R. Jones and R. McKibbin, Res. Lett. Inf. Math. Sci. 5 (2003) 19.
[64] L. L. Schramm, D. B. Fisher, S. Schürch and A. A. Cameron, Colloids Surf. A94 (1995) 145.
[65] J. Serrin, Philos. Trans. Roy. Soc. London Ser. A264 (1969) 413.
[66] J. Serrin, Math. Z. 11 (1969) 77.
[67] K. Steffen, Lecture Not. Math. 1713 (1999) 211.
[68] M. Struwe, Plateau’s Problem and the Calculus of Variations (Princeton Univ. Press, Princeton, 1988).
[69] M. E. Urso, C. J. Lawrence and M. J. Adams, J. Colloid Int. Sci. 220 (1999) 42.
[70] H. C. Wente, J. Math. Anal. Appl. 26 (1969) 318.
[71] H. C. Wente, Pacific J. Math. 88 (1980) 387.
[72] T. Young, Philos. Trans. Royal Soc. (London) 1 (1805) 65.
August 12, 2005 15:54 WSPC/148-RMP
J070-00245
Reviews in Mathematical Physics Vol. 17, No. 7 (2005) 793–857 © World Scientific Publishing Company
FREE ENERGY IN THE GENERALIZED SHERRINGTON–KIRKPATRICK MEAN FIELD MODEL
DMITRY PANCHENKO Department of Mathematics, Massachusetts Institute of Technology, 77, Massachusetts Ave, Cambridge, MA 02139, USA [email protected] Received 29 March 2005
In [11], Talagrand gave a rigorous proof of the Parisi formula in the classical Sherrington–Kirkpatrick (SK) model. In this paper, we build upon the methodology developed in [11] and extend Talagrand’s result to the class of SK type models in which the spins have arbitrary prior distribution on a bounded subset of the real line.
Keywords: Spin glasses.
Mathematics Subject Classification: 60K35, 82B44
1. Introduction and Main Results
In [11], Talagrand invented a rigorous proof of the Parisi formula for the free energy in the Sherrington–Kirkpatrick model [7]. The methodology developed by Talagrand was based upon a deep extension of Guerra’s interpolation method in [2] to coupled systems of spins, which provided the necessary control of the remainder terms in Guerra’s interpolation. The same methodology was successfully used in [13] to compute the free energy in the spherical model, and in the present paper we will utilize it in the setting of a generalized Sherrington–Kirkpatrick model in which the prior distribution of the spins is given by an arbitrary probability measure with bounded support on the real line.
Let us start by introducing all necessary notations and definitions. Consider a bounded set $\Sigma \subseteq \mathbb{R}$ and a probability measure $\nu$ on the Borel σ-algebra on $\Sigma$. Given $N \ge 1$, consider a product space $(\Sigma^N, \nu^N)$ which will be called the space of configurations. A configuration $\sigma \in \Sigma^N$ is a vector $(\sigma_1, \ldots, \sigma_N)$ of spins $\sigma_i$ that take values in $\Sigma$. For simplicity of notation, we will omit the index $N$ in $\nu^N$ since it will always be clear from the context whether we consider the measure $\nu$ on $\Sigma$ or the product measure on $\Sigma^N$. For each $N$, we consider a Hamiltonian $H_N(\sigma)$ on $\Sigma^N$ that is a Gaussian process indexed by $\sigma \in \Sigma^N$. We will assume that $H_N(\sigma)$ is jointly
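measurable in $(\sigma, g)$, where $g$ is the generic point of the underlying probability space on which the Gaussian process $H_N(\sigma)$ is defined. We assume that for a certain sequence $c(N) \to 0$ and a certain function $\xi : \mathbb{R} \to \mathbb{R}$, we have
\[ \forall\, \sigma^1, \sigma^2 \in \Sigma^N, \quad \Bigl| \frac{1}{N} \mathbb{E} H_N(\sigma^1) H_N(\sigma^2) - \xi(R_{1,2}) \Bigr| \le c(N), \tag{1.1} \]
where
\[ R_{1,2} = \frac{1}{N} \sum_{i \le N} \sigma_i^1 \sigma_i^2 \tag{1.2} \]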
is called the overlap of the configurations $\sigma^1, \sigma^2$. We will assume that $\xi(x)$ is three times continuously differentiable and satisfies the following conditions:
\[ \xi(0) = 0, \quad \xi(x) = \xi(-x), \quad \xi''(x) > 0 \ \text{ if } x > 0. \tag{1.3} \]
We will denote the self-overlap of $\sigma$ by
\[ R_{1,1} = \frac{1}{N} \sum_{i \le N} (\sigma_i)^2. \]
One defines the Gibbs measure $G_N$ on $\Sigma^N$ by
\[ dG_N(\sigma) = \frac{1}{Z_N} \exp H_N(\sigma) \, d\nu(\sigma), \tag{1.4} \]
where the normalizing factor
\[ Z_N = \int_{\Sigma^N} \exp H_N(\sigma) \, d\nu(\sigma) \]
is called the partition function. This definition of the Gibbs measure also includes the case of models with external field because given a measurable function $h(\sigma)$ on $\Sigma$ and the Gibbs measure defined by
\[ dG_N(\sigma) = \frac{1}{Z_N} \exp\Bigl( H_N(\sigma) + \sum_{i \le N} h(\sigma_i) \Bigr) d\nu(\sigma), \]
we can simply make the change of measure $d\nu'(\sigma) \sim \exp h(\sigma) \, d\nu(\sigma)$ to represent this Gibbs measure as (1.4). The only assumption that we need to make on $h(\sigma)$ is that $\int_\Sigma \exp h(\sigma) \, d\nu(\sigma) < \infty$, which holds, for example, when $h(\sigma)$ is uniformly bounded.
This general model includes the original Sherrington–Kirkpatrick (SK) model in [7] and the Ghatak–Sherrington (GS) model in [1] that will be considered in more detail in Sec. 2.1. In both cases, the Hamiltonian $H_N(\sigma)$ is given by
\[ H_N(\sigma) = \frac{\beta}{\sqrt{N}} \sum_{i<j} g_{ij} \sigma_i \sigma_j \tag{1.5} \]
for some $\beta > 0$ and i.i.d. Gaussian r.v. $g_{ij}$; the set $\Sigma$ is equal to $\{-1, +1\}$ in the SK model and $\{0, \pm 1, \ldots, \pm S\}$ for some integer $S$ in the GS model, and in both cases the measure $\nu$ is uniform on $\Sigma$. In the case of the SK model, the function $h(\sigma)$ is given by $h(\sigma) = h\sigma$ with the external field parameter $h \in \mathbb{R}$; and in the case of the GS model, it is given by $h(\sigma) = h\sigma^2$ with the crystal field parameter $h \in \mathbb{R}$. Let us define
\[ F_N = \frac{1}{N} \mathbb{E} \log Z_N = \frac{1}{N} \mathbb{E} \log \int_{\Sigma^N} \exp H_N(\sigma) \, d\nu(\sigma), \tag{1.6} \]
which (usually, with the factor $-\beta^{-1}$ which we omit for simplicity of notation) is called the free energy of the system $(\Sigma^N, G_N)$. The main goal of this paper is to find the limit $\lim_{N \to \infty} F_N$. It will soon become clear that the main difference of the above model from the classical SK model lies in the fact that in the classical model the length of any configuration $\sigma \in \{-1, +1\}^N$ is constant, $|\sigma| = \sqrt{N}$, which is not always true here. In general, if $\Sigma$ is not of the type $\{-a, +a\}$ for some $a \in \mathbb{R}$, then the length of the configuration, or the self-overlap $R_{1,1} = |\sigma|^2/N$, will become variable. As a result, in order to make the methodology of Guerra and Talagrand work, we will first have to compute the local free energy of the set of configurations with constrained self-overlap.
Let us now describe the analogue of the Parisi formula that gives the limit of (1.6). Let $[d, D]$ be the smallest interval such that
\[ \nu(\{\sigma : \sigma^2 \in [d, D]\}) = 1. \tag{1.7} \]
In other words, $d \le \sigma^2 \le D$ with probability one and $\sigma^2$ can take values arbitrarily close to $d$ and $D$ with positive probability. From now on, we will simply say that $d \le \sigma^2 \le D$ for all $\sigma \in \Sigma$. Let us consider $u \in [d, D]$ and a sequence $(\varepsilon_N)$ such that $\varepsilon_N > 0$ and $\lim_{N \to \infty} \varepsilon_N = 0$, and consider a sequence of sets
\[ U_N = \{\sigma \in \Sigma^N : R_{1,1} \in [u - \varepsilon_N, u + \varepsilon_N]\}. \tag{1.8} \]
We define
\[ F_N(u, \varepsilon_N) = \frac{1}{N} \mathbb{E} \log Z_N(u, \varepsilon_N), \tag{1.9} \]
where
\[ Z_N(u, \varepsilon_N) = \int_{U_N} \exp H_N(\sigma) \, d\nu(\sigma). \]
$F_N(u, \varepsilon_N)$ is the free energy of the subset of configurations in (1.8). We will first compute $\lim_{N \to \infty} F_N(u, \varepsilon_N)$ for some sequence $(\varepsilon_N)$ for each $u \in [d, D]$.
Consider an integer $k \ge 1$, numbers
\[ 0 = m_0 \le m_1 \le \cdots \le m_{k-1} \le m_k = 1 \tag{1.10} \]
and, given $u \in [d, D]$,
\[ 0 = q_0 \le q_1 \le \cdots \le q_k \le q_{k+1} = u. \tag{1.11} \]
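We will write $\mathbf{m} = (m_0, \ldots, m_k)$ and $\mathbf{q} = (q_0, \ldots, q_{k+1})$. Consider independent centered Gaussian r.v. $z_p$ for $0 \le p \le k$ with
\[ \mathbb{E} z_p^2 = \xi'(q_{p+1}) - \xi'(q_p). \tag{1.12} \]
Given $\lambda \in \mathbb{R}$, we define the r.v.
\[ X_{k+1} = \log \int_\Sigma \exp\Bigl( \sigma \sum_{0 \le p \le k} z_p + \lambda \sigma^2 \Bigr) d\nu(\sigma) \tag{1.13} \]
and, recursively for $l \ge 0$, define
\[ X_l = \frac{1}{m_l} \log \mathbb{E}_l \exp m_l X_{l+1}, \tag{1.14} \]
where $\mathbb{E}_l$ denotes the expectation in the r.v. $(z_p)_{p \ge l}$. When $m_l = 0$, this means $X_l = \mathbb{E}_l X_{l+1}$. Clearly, $X_0 = X_0(\mathbf{m}, \mathbf{q}, \lambda)$ is a non-random function of the parameters $\mathbf{m}$, $\mathbf{q}$ and $\lambda$. Whenever it does not create ambiguity, we will keep this dependence implicit. Let us note that $X_0$ also depends on $u$ through $\mathbf{q}$ since in (1.11) we have $q_{k+1} = u$. Let
\[ \mathcal{P}_k(\mathbf{m}, \mathbf{q}, \lambda, u) = -\lambda u + X_0(\mathbf{m}, \mathbf{q}, \lambda) - \frac{1}{2} \sum_{1 \le l \le k} m_l \bigl( \theta(q_{l+1}) - \theta(q_l) \bigr), \tag{1.15} \]
where $\theta(q) = q \xi'(q) - \xi(q)$, and define
\[ \mathcal{P}(\xi, u) = \inf \mathcal{P}_k(\mathbf{m}, \mathbf{q}, \lambda, u), \tag{1.16} \]
where the infimum is taken over all $\lambda$, $k$, $\mathbf{m}$ and $\mathbf{q}$. Finally, we define
\[ \mathcal{P}(\xi) = \sup_{d \le u \le D} \mathcal{P}(\xi, u). \tag{1.17} \]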
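As a sanity check on these definitions (our own illustration, not part of the original argument), consider the classical SK case $\Sigma = \{-1, +1\}$ with $\nu$ uniform, so that $\sigma^2 = 1$ and $d = D = u = 1$. Since adding a constant to $X_{k+1}$ adds the same constant at every step of the recursion (1.14), the parameter $\lambda$ drops out:

```latex
\begin{align*}
X_{k+1} &= \log \int_\Sigma \exp\Bigl( \sigma \sum_{0 \le p \le k} z_p + \lambda \sigma^2 \Bigr) d\nu(\sigma)
         = \lambda + \log \operatorname{ch}\Bigl( \sum_{0 \le p \le k} z_p \Bigr), \\
X_0 &= \lambda + X_0\big|_{\lambda = 0}, \\
\mathcal{P}_k(\mathbf{m}, \mathbf{q}, \lambda, 1)
        &= X_0\big|_{\lambda = 0} - \frac{1}{2} \sum_{1 \le l \le k} m_l \bigl( \theta(q_{l+1}) - \theta(q_l) \bigr).
\end{align*}
```

Under these assumptions the infimum over $\lambda$ in (1.16) is vacuous, and what remains has the form of the classical Parisi functional, with $\nu$ normalized as a probability measure.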
We will first prove the following.
Theorem 1.1. Given $u \in [d, D]$ and a sequence $(\varepsilon_N)_{N \ge 1}$ that goes to zero slowly enough,
\[ \lim_{N \to \infty} F_N(u, \varepsilon_N) = \mathcal{P}(\xi, u). \tag{1.18} \]
Gaussian concentration of measure will imply that the limit of the global free energy can be computed by maximizing the local free energy.
Theorem 1.2. We have
\[ \lim_{N \to \infty} F_N = \mathcal{P}(\xi). \tag{1.19} \]
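At finite N, the idea of maximizing the local free energy has an elementary counterpart, which the following minimal sketch illustrates by brute force. It assumes the Hamiltonian (1.5) with Σ = {−1, 0, +1}, ν uniform, a single sample of the couplings (no expectation is taken) and a small N; the function name and parameters are ours. Since Z_N is the sum of the partition functions of the N + 1 self-overlap shells, log Z_N lies between the largest shell value and that value plus log(N + 1).

```python
import itertools
import numpy as np

def shell_log_partition(N=6, beta=1.0, seed=0):
    """Brute-force log Z_N and the log partition functions of the
    self-overlap shells {R_11 = j/N} for Sigma = {-1, 0, +1}, nu uniform."""
    rng = np.random.default_rng(seed)
    g = np.triu(rng.standard_normal((N, N)), k=1)    # couplings g_ij for i < j
    sigma = np.array(list(itertools.product((-1, 0, 1), repeat=N)), dtype=float)
    # H_N(sigma) = beta / sqrt(N) * sum_{i<j} g_ij sigma_i sigma_j, as in (1.5)
    H = beta / np.sqrt(N) * np.einsum("ci,ij,cj->c", sigma, g, sigma)
    log_w = H - N * np.log(3.0)                      # each configuration has nu-mass 3^(-N)
    shell = (sigma ** 2).sum(axis=1).astype(int)     # N * R_11, an integer in 0..N
    log_Z = np.logaddexp.reduce(log_w)
    log_Z_shell = np.array([np.logaddexp.reduce(log_w[shell == j]) for j in range(N + 1)])
    return log_Z, log_Z_shell

log_Z, log_Z_shell = shell_log_partition()
print("log Z_N            :", log_Z)
print("max over shells    :", log_Z_shell.max())
print("upper bound, + log(N+1):", log_Z_shell.max() + np.log(len(log_Z_shell)))
```

After dividing by N, the gap between the two bounds is log(N + 1)/N, which vanishes as N grows; the content of Theorem 1.2 is that the same picture survives the expectation and the limit.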
Organization of the paper. In Sec. 2, we describe the replica symmetric region of the model and discuss the example of the Ghatak–Sherrington model. In Sec. 3, we introduce the construction (which we call the Parisi functional) that is used often
throughout the paper and study some of its properties. In Sec. 4, we prove the analogue of Guerra’s interpolation and explain why it seems to be necessary to impose the constraint on the self-overlap in order to utilize the methodology of Talagrand in [11]. Compared to the classical SK model where this problem does not occur, for the general model considered in this paper a brand new argument is required to remove the constraint on the self-overlap at the end of Guerra’s interpolation. This constitutes a certain nontrivial large deviation problem that is solved in Sec. 5. In Sec. 6, we show how Theorem 1.1 can be reduced to certain a priori estimates on the error terms in Guerra’s interpolation. For the most part, the proof of these a priori estimates goes along the lines of the methodology developed by Talagrand in [11] but, nonetheless, considerable effort is required to verify that the arguments and numerous computations in [11] extend to this more general model. We carry out these computations in Appendix A. In Sec. 7, we show how the global Parisi formula of Theorem 1.2 follows from the local Parisi formula of Theorem 1.1 and a certain concentration of measure result. Finally, certain values of the parameter u in Theorem 1.1 require small modifications of some arguments but, fortunately, these cases can be reduced to the classical model considered in [11]; this work is postponed until Appendix B.
2. Replica Symmetric Region
In this section, we will describe a relatively simple necessary and sufficient condition in terms of the parameters of the model which guarantees that the infimum on the right-hand side of (1.16) is achieved when $k = 1$. If this happens, then $\mathcal{P}(\xi, u)$ will be called a local replica symmetric solution. In Appendix B, we will explain that the cases when $u = d$ or $u = D$ in Theorem 1.1 can be reduced to the classical SK model for which the domain of validity of the replica symmetric solution is described in [8] and, hence, in the rest of the paper we will assume that
\[ d < u < D. \tag{2.1} \]
If the infimum in (1.16) is achieved when $k = 1$, then
\[ \mathbf{m} = (0, 1), \quad \mathbf{q} = (0, q, u) \ \text{ for some } q \in [0, u] \tag{2.2} \]
and $\lambda$ and $q$ are the only variables in $\mathcal{P}_1(\mathbf{m}, \mathbf{q}, \lambda, u)$ which, hence, can be written as
\[ \mathcal{P}_1(q, \lambda) = -\lambda u - \frac{1}{2} \bigl( \theta(u) - \theta(q) \bigr) + \mathbb{E} \log \int_\Sigma \exp H(\sigma) \, d\nu(\sigma), \]
where
\[ H(\sigma) = \sigma z_0 + \lambda \sigma^2 + \frac{1}{2} \sigma^2 \bigl( \xi'(u) - \xi'(q) \bigr). \tag{2.3} \]
Let us define the (local) replica symmetric solution by
\[ \mathrm{RS}(u) = \inf_{\lambda, q} \mathcal{P}_1(q, \lambda) \tag{2.4} \]
and describe the criterion which guarantees that $\mathcal{P}(\xi, u) = \mathrm{RS}(u)$. We will prove that if (2.1) holds, then the infimum on the right-hand side of (2.4) is achieved on some $\lambda$ and $q$ which, therefore, must satisfy the critical point conditions
\[ \frac{\partial \mathcal{P}_1}{\partial \lambda} = \frac{\partial \mathcal{P}_1}{\partial q} = 0. \tag{2.5} \]
From now on let $(q, \lambda)$ be such a pair, i.e. $\mathrm{RS}(u) = \mathcal{P}_1(q, \lambda)$. Suppose that $\mathcal{P}(\xi, u) = \mathrm{RS}(u)$. Then taking $k = 2$ in (1.16) should not decrease the infimum on the right-hand side. Let us take
\[ k = 2, \quad \mathbf{m} = (0, m, 1) \quad \text{and} \quad \mathbf{q} = (0, q, a, u) \ \text{ for } a \in [q, u]. \tag{2.6} \]
With this choice of parameters, $\mathcal{P}_k(\mathbf{m}, \mathbf{q}, \lambda, u)$ becomes
\[ \Phi(m, a) = -\lambda u - \frac{1}{2} m \bigl( \theta(a) - \theta(q) \bigr) - \frac{1}{2} \bigl( \theta(u) - \theta(a) \bigr) + \frac{1}{m} \mathbb{E} \log \mathbb{E}_1 X^m, \tag{2.7} \]
where
\[ X = \int_\Sigma \exp H'(\sigma) \, d\nu(\sigma) \quad \text{and} \quad H'(\sigma) = \sigma (z_0 + z_1) + \lambda \sigma^2 + \frac{1}{2} \sigma^2 \bigl( \xi'(u) - \xi'(a) \bigr) \tag{2.8} \]
and where $\mathbb{E} z_0^2 = \xi'(q)$ and $\mathbb{E} z_1^2 = \xi'(a) - \xi'(q)$. It should be obvious that $\Phi(1, a) = \mathcal{P}_1(q, \lambda)$ for any $q \le a \le u$. The derivative of $\Phi(m, a)$ with respect to $m$ at $m = 1$ cannot be positive because, otherwise, by decreasing $m$ slightly we could decrease $\Phi(m, a)$, which would imply $\mathcal{P}(\xi, u) \le \Phi(m, a) < \mathcal{P}_1(q, \lambda) = \mathrm{RS}(u)$. A simple computation gives
\[ f(a) = \frac{\partial \Phi}{\partial m}(m, a) \Big|_{m=1} = -\frac{1}{2} \bigl( \theta(a) - \theta(q) \bigr) + \mathbb{E} \frac{X}{\mathbb{E}_1 X} \log \frac{X}{\mathbb{E}_1 X} \tag{2.9} \]
and, hence, the following condition is necessary if $\mathcal{P}(\xi, u) = \mathrm{RS}(u)$:
\[ f(a) \le 0 \ \text{ for all } q \le a \le u. \tag{2.10} \]
This derivative $f(a)$ represents what is usually called the replica symmetry breaking fluctuations. Let us note that since $\Phi(m, q) = \mathcal{P}_1(q, \lambda)$ does not depend on $m$, we have $f(q) = 0$. Also, it is easy to check that (2.5) implies that $f'(q) = 0$. Therefore, if (2.10) holds then we must have
\[ f''(q) \le 0, \tag{2.11} \]
which in the SK model is called the Almeida–Thouless condition. It is believed (and numerical computations show) that in the classical SK model (2.11) implies (2.10).
However, we will give an example below where this is not the case and, therefore, condition (2.10) cannot be weakened to (2.11) in general. We will prove that (2.10) is (necessary and) sufficient for $\mathcal{P}(\xi, u) = \mathrm{RS}(u)$.
Theorem 2.1. If a pair $(\lambda, q)$ satisfies (2.5) and (2.10), then $\mathcal{P}(\xi, u) = \mathcal{P}_1(q, \lambda)$ and such a pair $(\lambda, q)$ is unique.
The proof of this theorem goes in parallel with the proof of Theorem 1.1, as will be explained in Sec. 6. However, its proof would be immediate if we knew that the functional $\mathcal{P}_k$ defined in (1.15) was convex in $\mathbf{m}$. The conjecture that $\mathcal{P}_k$ is indeed convex in $\mathbf{m}$ was made in [14, 6], where a partial result was proved. We do not give the details here but, briefly, the convexity of $\mathcal{P}_k$ would imply the uniqueness of the minimum in the optimization problem (1.16), and since (2.10) means that the replica symmetric choice of parameters is a local minimum in (1.16), it would be a global minimum.
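2.1. Ghatak–Sherrington model
Let us consider the Ghatak–Sherrington model introduced in [1] with $\Sigma = \{-1, 0, +1\}$, the Hamiltonian $H_N(\sigma)$ defined in (1.5), the counting measure $\nu$ on $\Sigma$ and the external field $h(\sigma) = h\sigma^2$ for some $h \in \mathbb{R}$. This choice of parameters gives
\[ \mathcal{P}_1(q, \lambda) = -\lambda u - \frac{\beta^2}{4} (u^2 - q^2) + \mathbb{E} \log \Bigl( 1 + 2 \operatorname{ch}(z \beta \sqrt{q}) \exp\Bigl( \lambda + h + \frac{1}{2} \beta^2 (u - q) \Bigr) \Bigr), \]
where $z$ is a standard normal r.v. and we keep the dependence of $\mathcal{P}_1$ on $u$ implicit. Because of the symmetry of the model, rather than the replica symmetric solution, one is usually interested in the case when $\mathcal{P}(\xi, u) = \mathcal{P}_1(q, \lambda)$ for $q = 0$ and some $\lambda$. It is easy to check that
\[ \frac{\partial \mathcal{P}_1}{\partial q} = \frac{\beta^2 q}{2} - \frac{\beta^2}{2} \mathbb{E} \Bigl( \frac{2 \operatorname{sh}(z \beta \sqrt{q}) \exp(\lambda + h + \beta^2 (u - q)/2)}{1 + 2 \operatorname{ch}(z \beta \sqrt{q}) \exp(\lambda + h + \beta^2 (u - q)/2)} \Bigr)^2 \]
and, therefore, $q = 0$ always satisfies the critical point condition (2.5). If $\mathcal{P}(\xi, u) = \mathcal{P}_1(0, \lambda)$ for some $\lambda$, then we will call $\mathcal{P}(\xi, u)$ a paramagnetic solution and denote it by $\mathrm{PM}(u)$. We have
\[ \begin{aligned} \mathrm{PM}(u) &= \inf_\lambda \mathcal{P}_1(0, \lambda) = \inf_\lambda \Bigl( -\lambda u - \frac{1}{4} \beta^2 u^2 + \log\bigl( 1 + 2 e^{\lambda + h + \frac{\beta^2}{2} u} \bigr) \Bigr) \\ &= hu + \frac{1}{4} \beta^2 u^2 + \inf_{\lambda'} \bigl( -\lambda' u + \log(1 + 2 e^{\lambda'}) \bigr) \\ &= hu + \frac{1}{4} \beta^2 u^2 + u \log \frac{2}{u} + (1 - u) \log \frac{1}{1 - u}, \end{aligned} \]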
where we made the change of variable $\lambda' = \lambda + h + \beta^2 u/2$. The infimum is achieved on
\[ \lambda = -h - \frac{1}{2} \beta^2 u + \log \frac{u}{2(1 - u)}. \tag{2.12} \]
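The closed form for PM(u) and the minimizer (2.12) can be checked numerically. The following is a minimal sketch; the function names and the test values of u, β and h are ours, chosen only for illustration. It evaluates P_1(0, λ) directly, compares its value at the λ of (2.12) with the closed form, and scans a grid of λ around that point.

```python
import numpy as np

def pm_closed_form(u, beta, h):
    """PM(u) from the last line of the display above (0 < u < 1)."""
    return (h * u + 0.25 * beta**2 * u**2
            + u * np.log(2.0 / u) + (1.0 - u) * np.log(1.0 / (1.0 - u)))

def p1_at_q0(lam, u, beta, h):
    """P_1(0, lambda) for the GS model with Sigma = {-1, 0, +1}."""
    return (-lam * u - 0.25 * beta**2 * u**2
            + np.log1p(2.0 * np.exp(lam + h + 0.5 * beta**2 * u)))

def lambda_star(u, beta, h):
    """The minimizer given in (2.12)."""
    return -h - 0.5 * beta**2 * u + np.log(u / (2.0 * (1.0 - u)))

u, beta, h = 0.3, 1.2, -0.4                      # illustrative values only
lam = lambda_star(u, beta, h)
grid = lam + np.linspace(-5.0, 5.0, 20001)       # scan lambda around the minimizer
print("closed form       :", pm_closed_form(u, beta, h))
print("P_1(0, lambda*)   :", p1_at_q0(lam, u, beta, h))
print("grid minimum      :", p1_at_q0(grid, u, beta, h).min())
```

Since λ ↦ P_1(0, λ) is convex, the grid minimum should agree with the value at the λ of (2.12). By the envelope theorem, the derivative of PM(u) in u equals minus that λ, which can be confirmed with a finite difference.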
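Theorem 2.1 can be applied to this model to describe when the local free energy $\mathcal{P}(\xi, u)$ is given by the paramagnetic solution $\mathrm{PM}(u)$. When $q = 0$, the definition (2.8) implies that
\[ X = 1 + 2 e^{\lambda + h + \frac{\beta^2}{2}(u - a)} \operatorname{ch}(z_1 \beta \sqrt{a}) = 1 + \frac{u}{1 - u} e^{-\frac{\beta^2}{2} a} \operatorname{ch}(z_1 \beta \sqrt{a}). \]
Using the fact that
\[ \mathbb{E} e^{-\frac{\beta^2}{2} a} \operatorname{ch}(z \beta \sqrt{a}) = 1, \tag{2.13} \]
we get $\mathbb{E}_1 X = 1 + u/(1 - u) = 1/(1 - u)$ and, for $0 \le a \le u$,
\[ f(a) = -\frac{1}{4} \beta^2 a^2 + \mathbb{E} \Bigl( 1 - u + u e^{-\frac{\beta^2}{2} a} \operatorname{ch}(z \beta \sqrt{a}) \Bigr) \log \Bigl( 1 - u + u e^{-\frac{\beta^2}{2} a} \operatorname{ch}(z \beta \sqrt{a}) \Bigr). \tag{2.14} \]
Since $\lambda$ in (2.12) and $q = 0$ satisfy (2.5), Theorem 2.1 implies that the subset of configurations with constrained self-overlap $R_{1,1} \approx u$ will be in the paramagnetic phase $\mathcal{P}(\xi, u) = \mathrm{PM}(u)$ if and only if
\[ f(a) \le 0 \ \text{ for } 0 \le a \le u. \tag{2.15} \]
It is easy to check that $f''(0) = \beta^2 (-1 + \beta^2 u^2)/2$ and, therefore, (2.11) implies
\[ -1 + \beta^2 u^2 \le 0 \quad \text{or} \quad \beta u \le 1. \tag{2.16} \]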
It is tempting to conjecture that (2.16) implies (2.15) but, unfortunately, even though it is expected to be true in the classical SK model, it is not always true here. For example, one can check that for u = 0.05 and β = 17.5, (2.16) holds but (2.15) fails (see Fig. 1).
[Figure: plot of f(a) against a on [0, 0.05]; the vertical axis ticks run from −0.003 to 0.001.]
Fig. 1. The function f(a) for u = 0.05, β = 17.5 and a ≤ u.
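The example of Fig. 1 can be reproduced with a short numerical evaluation of (2.14). The sketch below is our own and makes two assumptions that are not in the text: the Gaussian expectation is approximated by a Riemann sum on a uniform grid truncated at |z| ≤ 12, and the grid size is chosen by hand.

```python
import numpy as np

def f(a, u, beta, zmax=12.0, n=48001):
    """Numerically evaluate f(a) from (2.14) on a uniform grid in z."""
    z = np.linspace(-zmax, zmax, n)
    dz = z[1] - z[0]
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)     # standard normal density
    b = beta * np.sqrt(a)
    # log of exp(-beta^2 a / 2) * ch(z beta sqrt(a)), computed without overflow
    log_v = np.logaddexp(b * z, -b * z) - np.log(2.0) - 0.5 * beta**2 * a
    y = 1.0 - u + u * np.exp(log_v)
    return -0.25 * beta**2 * a**2 + float(np.sum(phi * y * np.log(y)) * dz)

u, beta = 0.05, 17.5
print("beta * u =", beta * u)                  # (2.16) requires beta * u <= 1
for a in np.linspace(0.0, u, 11):
    print(f"a = {a:.3f}   f(a) = {f(a, u, beta):+.5f}")
```

With these parameters βu = 0.875, so (2.16) holds, while the printed table lets one look for values of a ≤ u at which f(a) is positive, which is how (2.15) can fail.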
Next, we would like to describe the region of parameters $\beta$ and $h$ such that the system as a whole is in the paramagnetic phase in the sense that
\[ \mathcal{P}(\xi) = \sup_{u \in [0,1]} \mathcal{P}(\xi, u) = \mathcal{P}(\xi, u') = \mathrm{PM}(u'), \tag{2.17} \]
i.e. the local free energy $\mathcal{P}(\xi, u)$ is maximized at some point $u'$ where $\mathcal{P}(\xi, u') = \mathrm{PM}(u')$ and thus the global free energy is given by the paramagnetic solution $\mathcal{P}(\xi) = \mathrm{PM}(u')$. In Fig. 2, we show a phase diagram in the coordinates $(h\beta^{-1}, \beta^{-1})$ to compare it with [1], where the phase diagram was given in these coordinates. According to [1], regions 1 and 3 constitute the paramagnetic phase where (2.17) holds, and region 2 is the spin glass phase where (2.17) fails, i.e. $\mathcal{P}(\xi, u') < \mathrm{PM}(u')$ for $u'$ such that $\sup_u \mathcal{P}(\xi, u) = \mathcal{P}(\xi, u')$. We will explain how these regions were defined in [1] and argue that in regions 1 and 3, (2.17) holds. We consider regions 1 and 3 separately because region 1 can be treated rigorously. The way Fig. 2 was obtained in [1] is apparently as follows. The authors used the replica symmetric approximation, which means that instead of looking at the free energy $\mathcal{P}(\xi)$, which by Theorem 1.2 is given by $\sup_u \mathcal{P}(\xi, u)$, they considered a replica symmetric approximation of $\mathcal{P}(\xi)$ given by
\[ \sup_u \mathrm{RS}(u) = \sup_u \inf_{q, \lambda} \mathcal{P}_1(q, \lambda). \tag{2.18} \]
Fig. 2. A phase diagram in $(h\beta^{-1}, \beta^{-1})$ coordinates. Regions 1 and 3 are a paramagnetic phase and, according to [1], region 2 is a spin glass phase.
The value provided by this approximation, in general, is not equal to the actual free energy $\mathcal{P}(\xi)$ and is only an upper bound. However, this optimization problem is much easier than the case of the general Parisi formula in Theorem 1.2 since (2.18) depends only on three parameters $(u, q, \lambda)$. The saddle point conditions for the solution $(u, q, \lambda)$ of (2.18) are given by
\[ \frac{\partial \mathcal{P}_1}{\partial \lambda} = 0, \quad \frac{\partial \mathcal{P}_1}{\partial q} = 0 \quad \text{and} \quad \lambda = 0, \tag{2.19} \]
since it is easy to check that maximizing over $u$ and using $\partial \mathcal{P}_1 / \partial \lambda = 0$ gives $\lambda = 0$. Hence, (2.19) reduces to solving a system of two equations. It was predicted in [1] that the paramagnetic phase coincides with the set of parameters $(\beta, h)$ for which the infimum in (2.18) is attained at the saddle point $(u, q, \lambda)$ such that $q = 0$. This set is given by the union of regions 1 and 3. On the complement, region 2, the replica symmetric approximation of the free energy $\mathcal{P}(\xi)$ is given by $\mathcal{P}_1(q, \lambda)$ with $q \ne 0$ and the authors in [1] concluded that it is, therefore, a spin glass phase in the sense that (2.17) fails. However, this conclusion in general requires further justification because (2.18) is only an approximation of the general Parisi formula.
Region 1. We will define region 1 below after we explain a simple but important property of this model. The fact that region 1 is a paramagnetic phase will follow from this property. First, let us observe that for any fixed $a$, the function $f_u(a) = f(a)$ is increasing in $u$, where for a moment we made the dependence on $u$ explicit. We have
\[ \frac{\partial f_u(a)}{\partial u} = \mathbb{E} \Bigl( -1 + e^{-\frac{\beta^2}{2} a} \operatorname{ch}(z \beta \sqrt{a}) \Bigr) \log \Bigl( 1 - u + u e^{-\frac{\beta^2}{2} a} \operatorname{ch}(z \beta \sqrt{a}) \Bigr). \]
It is easy to check that for any $u \in [0, 1]$, the function $x \mapsto (-1 + x) \log(1 - u + ux)$ is convex for $x \ge 0$ and, therefore, (2.13) and Jensen’s inequality imply that $\partial f_u(a)/\partial u \ge 0$. This means that if we take $u_1 \le u_2$, then
\[ f_{u_2}(a) \le 0 \ \text{ for } a \le u_2 \ \Longrightarrow \ f_{u_1}(a) \le 0 \ \text{ for } a \le u_1. \tag{2.20} \]
If for some $u_2 \in [0, 1]$ we have $\mathcal{P}(\xi, u_2) = \mathrm{PM}(u_2)$, then (2.15) holds for $u = u_2$. By (2.20), (2.15) holds for any $u = u_1 \le u_2$ and, therefore, $\mathcal{P}(\xi, u_1) = \mathrm{PM}(u_1)$. This proves the following important property: for any $\beta$ and $h$, there exists $u_0 = u_0(\beta) \in [0, 1]$ such that
\[ \mathcal{P}(\xi, u) = \mathrm{PM}(u) \ \text{ if and only if } \ u \le u_0. \tag{2.21} \]
Suppose that the maximum of $\mathrm{PM}(u)$ is achieved on some $u_1 \le u_0$. Then the system as a whole will be in the paramagnetic phase because
\[ \sup_{u \in [0,1]} \mathcal{P}(\xi, u) \ge \sup_{u \le u_0} \mathcal{P}(\xi, u) = \sup_{u \le u_0} \mathrm{PM}(u) = \sup_{u \in [0,1]} \mathrm{PM}(u) \ge \sup_{u \in [0,1]} \mathcal{P}(\xi, u) \tag{2.22} \]
and, therefore, P(ξ) = sup P(ξ, u) = sup PM(u) = PM(u1 ).
(2.23)
u≤u0
u∈[0,1]
Region 1 is precisely where the maximum of PM(u) is achieved on some u1 ≤ u0 and thus, it is a subset of the paramagnetic phase. Region 3. As we mentioned above, in region 3, the optimization problem (2.18) is solved at the saddle point (u , q , λ ) such that q = 0. This means that for this u , sup RS(u) = RS(u ) = inf P1 (q, λ) = inf P1 (0, λ) = PM(u ). q,λ
u
λ
(2.24)
By itself, this fact does not allow us to conclude that we are in the paramagnetic phase. Numerically, however, it appears that in this particular model, whenever this happens, we also have u′ ≤ u0, where u0 was defined in (2.21); we do not see how to prove this using calculus. Should this numerical observation reflect the true situation, as seems likely, then (2.24) would imply that
\[ \mathrm{PM}(u') = P(\xi,u') \le \sup_{u} P(\xi,u) \le \sup_{u} \mathrm{RS}(u) = \mathrm{PM}(u') \]
and we would again be in the paramagnetic phase.

Region 2. The fact that the infimum in (2.18) is attained at the saddle point (u′, q′, λ′) with q′ ≠ 0 implies that u′ > u0 and
\[ \sup_{u\le u_0} \mathrm{PM}(u) < \mathrm{RS}(u'). \tag{2.25} \]
However, since RS(u) is only an upper bound on P(ξ, u), in general, (2.25) does not exclude the possibility that
\[ P(\xi,u) < \sup_{u\le u_0} \mathrm{PM}(u) \quad \text{for all } u > u_0, \]
which would imply (2.17). At this moment, we do not see how to prove that region 2 is a spin glass phase except by checking directly that (2.17) fails.

3. Parisi Functional

We will often consider iterative constructions similar to (1.13) and (1.14), so it will be convenient to define an operator that implements this recursion. In each case, we will only need to specify the parameters of the operator. We will call this operator the Parisi functional. Given k ≥ 1, consider a vector
\[ m = (m_0, \dots, m_k) \tag{3.1} \]
such that all coordinates $m_i \ge 0$. Let $z = (z_l)_{0\le l\le k}$ be a collection of independent random vectors. Suppose that we are given a random variable F that is a function of z, F = F(z). Then we let $F_{k+1} = F$ and for 0 ≤ l ≤ k, define iteratively
\[ F_l = \frac{1}{m_l}\log E_l \exp m_l F_{l+1}, \tag{3.2} \]
where El denotes the expectation in (zi )i≥l . When ml = 0, this means Fl = El Fl+1 . Definition 3.1. We define the Parisi functional by P(m)F = F0 .
(3.3)
With these notations, the definition of X0 given by (1.10)–(1.14) can be written as X0 = P(m)Xk+1 .
(3.4)
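The recursion (3.2) can be illustrated numerically. The following is a minimal sketch, not the author's construction: it assumes that every z_l takes finitely many values with a common law (`values`, `probs`), so that each conditional expectation E_l is a finite sum. The names `parisi_levels` and `parisi_functional` are hypothetical.

```python
import math
from itertools import product

def parisi_levels(m, F, values, probs):
    """Return {l: F_l} for l = 0..k+1, following recursion (3.2).

    F_l is stored as a dict keyed by the prefix (z_0, ..., z_{l-1}).
    Assumption: all z_l share the same finite law (values, probs).
    """
    k = len(m) - 1
    # F_{k+1} = F, evaluated on every configuration (z_0, ..., z_k).
    levels = {k + 1: {z: F(z) for z in product(values, repeat=k + 1)}}
    for l in range(k, -1, -1):
        nxt, cur = levels[l + 1], {}
        for prefix in product(values, repeat=l):
            if m[l] == 0:
                # Convention of the text: m_l = 0 means F_l = E_l F_{l+1}.
                cur[prefix] = sum(p * nxt[prefix + (v,)]
                                  for v, p in zip(values, probs))
            else:
                s = sum(p * math.exp(m[l] * nxt[prefix + (v,)])
                        for v, p in zip(values, probs))
                cur[prefix] = math.log(s) / m[l]
        levels[l] = cur
    return levels

def parisi_functional(m, F, values, probs):
    """P(m)F = F_0 as in (3.3)."""
    return parisi_levels(m, F, values, probs)[0][()]

# Toy usage: symmetric +-1 variables, m = (0, 1), F(z) = z_0 + z_1.
example = parisi_functional((0.0, 1.0), lambda z: z[0] + z[1],
                            (-1.0, 1.0), (0.5, 0.5))
```

In this toy case the inner step gives F_1 = z_0 + log ch(1) and the outer step averages z_0, so `example` equals log ch(1).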
Let us describe several immediate properties of the Parisi functional that will often be used throughout the paper. It is obvious by induction in (3.2) that for any constant c ∈ R,
\[ P(m)(c + F) = c + P(m)F. \tag{3.5} \]
Similarly, if $z^j = (z_p^j)_{0\le p\le k}$ are independent for j = 1, 2 and $F^j = F^j(z^j)$, then
\[ P(m)(F^1 + F^2) = P(m)F^1 + P(m)F^2. \tag{3.6} \]
If $F \le F'$, then
\[ P(m)F \le P(m)F'. \tag{3.7} \]
The next property plays an important role in Talagrand’s interpolation for two copies of the system. Suppose that we have two random variables
\[ F^j = F\bigl((z_p^j)_{p\le k}\bigr) \quad \text{for } j = 1, 2 \tag{3.8} \]
such that
\[ z_l^1 = z_l^2 \ \text{ for } l < r \quad \text{and } z_l^1 \text{ and } z_l^2 \text{ are independent copies for } l \ge r. \tag{3.9} \]
Let us define $n = (n_p)_{p\le k}$ such that
\[ n_l = \frac{m_l}{2} \ \text{ for } l < r \quad \text{and} \quad n_l = m_l \ \text{ for } l \ge r. \tag{3.10} \]

Lemma 3.2. Given (3.8), (3.9) and (3.10), let $F = F^1 + F^2$ and define $F_l^j$ by (3.2) and $F_l$ by (3.2) with m replaced by n. Then,
\[ F_l = F_l^1 + F_l^2 \ \text{ for } l > r \quad \text{and} \quad F_l = 2F_l^1 = 2F_l^2 \ \text{ for } l \le r. \tag{3.11} \]
In particular,
\[ P(n)(F^1 + F^2) = 2P(m)F^1. \tag{3.12} \]
Proof. The proof follows by induction in (3.2). For l > r, using independence of $z_l^1$ and $z_l^2$, we get
\[ F_l = \frac{1}{n_l}\log E_l \exp n_l F_{l+1} = \frac{1}{m_l}\log E_l \exp m_l\bigl(F_{l+1}^1 + F_{l+1}^2\bigr) = F_l^1 + F_l^2. \]
For l = r, all independent copies already have been averaged and since $z_l^1 = z_l^2$ for l < r, $F_r^1 = F_r^2$ and $F_r = 2F_r^1$. Finally, by induction for l < r, we get
\[ F_l = \frac{1}{n_l}\log E_l \exp n_l F_{l+1} = \frac{2}{m_l}\log E_l \exp\Bigl(\frac{m_l}{2}\,2F_{l+1}^1\Bigr) = 2F_l^1, \]
which for l = 0 proves (3.12).

The next property concerns the computation of the derivatives of P(m)F. With the notations of (3.1) and (3.2), let us define
\[ W_l = \exp m_l(F_{l+1} - F_l). \tag{3.13} \]
Note that by definition of Fl in (3.2), we have El Wl = 1. Also, since Fl , Fl+1 and Wl do not depend on zi for i ≥ l + 1, we can write El Wl Wl+1 = El El+1 Wl Wl+1 = El Wl El+1 Wl+1 = 1. Repeating the same argument, El Wl · · · Wk = 1.
(3.14)
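This fact will be used often below.

Lemma 3.3. For a generic variable x,
\[ \frac{\partial P(m)F}{\partial x} = E_0 W_0 \cdots W_k \frac{\partial F}{\partial x}. \tag{3.15} \]

Proof. By (3.2), $\exp m_l F_l = E_l \exp m_l F_{l+1}$ and therefore,
\[ m_l \exp(m_l F_l)\,\frac{\partial F_l}{\partial x} = m_l E_l \exp(m_l F_{l+1})\,\frac{\partial F_{l+1}}{\partial x}. \]
Since $F_l$ does not depend on $z_i$ for i ≥ l,
\[ \frac{\partial F_l}{\partial x} = E_l W_l \frac{\partial F_{l+1}}{\partial x}, \]
where $W_l$ was defined in (3.13). Applying the same equation to $F_{l+1}$ gives
\[ \frac{\partial F_l}{\partial x} = E_l W_l E_{l+1} W_{l+1} \frac{\partial F_{l+2}}{\partial x} = E_l W_l W_{l+1} \frac{\partial F_{l+2}}{\partial x}, \]
since $F_l$, $F_{l+1}$ and therefore, $W_l$ do not depend on $(z_i)$ for i ≥ l + 1. Repeating the same argument inductively, we get
\[ \frac{\partial F_l}{\partial x} = E_l W_l \cdots W_k \frac{\partial F}{\partial x}, \tag{3.16} \]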
which for l = 0 implies (3.15). We will often assume that the coordinates of m are arranged in a nondecreasing order, m0 ≤ · · · ≤ mk .
(3.17)
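Both the normalization (3.14) and the derivative formula of Lemma 3.3 can be checked numerically in a toy setting. The sketch below is illustrative only: it assumes each z_l is a symmetric ±1 variable (`VALUES`, `PROBS`), a hypothetical test function `F_x` depending on a parameter x, and hypothetical helper names `levels` and `tilted_mean`. It compares a central finite difference of P(m)F with the weighted average E_0 W_0 ··· W_k ∂F/∂x.

```python
import math
from itertools import product

VALUES, PROBS = (-1.0, 1.0), (0.5, 0.5)  # toy law of each z_l (an assumption)

def levels(m, F):
    """Tables F_0..F_{k+1} of recursion (3.2); F_l is keyed by (z_0..z_{l-1})."""
    k = len(m) - 1
    out = {k + 1: {z: F(z) for z in product(VALUES, repeat=k + 1)}}
    for l in range(k, -1, -1):
        out[l] = {}
        for pre in product(VALUES, repeat=l):
            nxt = [out[l + 1][pre + (v,)] for v in VALUES]
            if m[l] == 0:
                out[l][pre] = sum(p * f for p, f in zip(PROBS, nxt))
            else:
                s = sum(p * math.exp(m[l] * f) for p, f in zip(PROBS, nxt))
                out[l][pre] = math.log(s) / m[l]
    return out

def tilted_mean(m, F, G):
    """E_0[W_0 ... W_k G(z)] with W_l = exp(m_l (F_{l+1} - F_l)) as in (3.13)."""
    k = len(m) - 1
    lv = levels(m, F)
    total = 0.0
    for z in product(VALUES, repeat=k + 1):
        prob = math.prod(PROBS[VALUES.index(v)] for v in z)
        w = math.prod(math.exp(m[l] * (lv[l + 1][z[:l + 1]] - lv[l][z[:l]]))
                      for l in range(k + 1))
        total += prob * w * G(z)
    return total

def F_x(x):
    # Hypothetical test function; its x-derivative is z_0 * z_1.
    return lambda z: x * z[0] * z[1] + 0.3 * z[1] * z[2] + 0.2 * z[2]

m = (0.3, 0.6, 1.0)
x, h = 0.7, 1e-5
mass = tilted_mean(m, F_x(x), lambda z: 1.0)             # (3.14) with l = 0
formula = tilted_mean(m, F_x(x), lambda z: z[0] * z[1])  # right side of (3.15)
finite_diff = (levels(m, F_x(x + h))[0][()]
               - levels(m, F_x(x - h))[0][()]) / (2 * h)
```

Here `mass` should equal 1 up to rounding, and `formula` should agree with `finite_diff` up to the finite-difference error.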
The following lemma provides a useful control of the expressions of the type (3.15) that appear as the derivatives of the Parisi functional. Lemma 3.4. Suppose (3.17) holds. Let f = f (z) and F = F (z), and let Wl be defined by (3.13). Suppose that f ≤ F and let m = mr be the first non-zero element in (3.17). Then E0 log Er Wr · · · Wk exp mk (f − F ) ≤ m(P(m)f − P(m)F ).
(3.18)
Proof. Let us define U = exp mk (f − F ). Using (3.13), we can write Wk U = exp mk (f − Fk ) and since Fk does not depend on zk , this implies that Ek Wk U = exp(−mk Fk )Ek exp mk f = exp mk (fk − Fk ). We will proceed by induction to show that for r ≤ l ≤ k, El Wl · · · Wk U ≤ exp ml (fl − Fl ).
(3.19)
As in (3.7), f ≤ F implies that $f_l \le F_l$ and since $m_{l-1} \le m_l$, (3.19) implies that
\[ E_l W_l \cdots W_k U \le \exp m_{l-1}(f_l - F_l). \]
Multiplying both sides by $W_{l-1}$ gives
\[ E_l W_{l-1} W_l \cdots W_k U \le \exp m_{l-1}(f_l - F_{l-1}), \]
since $W_{l-1}$ does not depend on $z_i$ for i ≥ l. Taking the expectation $E_{l-1}$ and using that $F_{l-1}$ does not depend on $z_i$ for i ≥ l − 1, we can write
\[ E_{l-1} W_{l-1} W_l \cdots W_k U \le \exp(-m_{l-1}F_{l-1})\,E_{l-1}\exp m_{l-1} f_l = \exp m_{l-1}(f_{l-1} - F_{l-1}). \]
This finishes the proof of the induction step. For l = r, (3.19) implies that log Er Wr · · · Wk U ≤ m(fr − Fr ).
(3.20)
Since for l < r, ml = 0, (3.2) implies that P(m)f = f0 = E0 fr
and P(m)F = F0 = E0 Fr
and therefore, taking the expectation of both sides of (3.20) proves (3.18).
4. Guerra’s Interpolation

The first step of the proof of Theorem 1.1 is the analogue of Guerra’s interpolation method in [2]. For 1 ≤ i ≤ N, we consider independent copies $(z_{i,p})_{0\le p\le k}$ of the sequence $(z_p)_{0\le p\le k}$ defined in (1.12) that are also independent of the randomness of the Hamiltonian $H_N(\sigma)$. Consider $m = (m_p)_{0\le p\le k}$ as in (1.10). We denote by $E_l$ the expectation in the r.v. $(z_{i,p})_{i\le N,\,p\ge l}$. Consider the Hamiltonian
\[ H_t(\sigma) = \sqrt{t}\,H_N(\sigma) + \sqrt{1-t}\sum_{i\le N}\sigma_i\sum_{0\le p\le k} z_{i,p}. \tag{4.1} \]
Given $U_N$ defined in (1.8), let
\[ F = \log\int_{U_N}\exp H_t(\sigma)\,d\nu(\sigma) \tag{4.2} \]
and let
\[ \varphi_N(t) = \frac{1}{N}\,E P(m)F, \tag{4.3} \]
where E denotes the expectation in all random variables including the randomness of the Hamiltonian $H_N(\sigma)$. For a function $h : \Sigma_N \to R$, let $\langle h\rangle_t$ denote its average with respect to the Gibbs measure with Hamiltonian $H_t(\sigma)$ in (4.1) on the set $U_N$, i.e.
\[ \langle h\rangle_t \exp F = \int_{U_N} h(\sigma)\exp H_t(\sigma)\,d\nu(\sigma). \tag{4.4} \]
(3.14) implies that the functional
\[ h \to E_l\bigl(W_l\cdots W_k\langle h\rangle_t\bigr) \tag{4.5} \]
is a probability $\gamma_l$ on $U_N$. We denote by $\gamma_l^{\otimes 2}$ its product on $U_N\times U_N$, and for a function $h : U_N\times U_N \to R$, we set
\[ \mu_l(h) = E\,W_1\cdots W_{l-1}\,\gamma_l^{\otimes 2}(h). \tag{4.6} \]
The following Gaussian integration by parts will be commonly used below. If g is a Gaussian random variable, then for a function F : R → R of moderate growth we have [8, A.40],
\[ E\,gF(g) = E g^2\,E F'(g). \tag{4.7} \]
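As a quick sanity check of (4.7), the sketch below compares both sides for a centered Gaussian g with standard deviation s and the test function F = tanh, whose derivative is 1/ch². The choice of F, the value of s and the trapezoid quadrature in the hypothetical helper `gauss_expect` are assumptions made purely for illustration.

```python
import math

def gauss_expect(f, s, n=20001, width=12.0):
    """E f(g) for g ~ N(0, s^2), by the trapezoid rule on [-width*s, width*s]."""
    a = -width * s
    h = 2 * width * s / (n - 1)
    total = 0.0
    for i in range(n):
        x = a + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(x) * math.exp(-x * x / (2 * s * s))
    return total * h / (s * math.sqrt(2 * math.pi))

s = 1.3  # hypothetical standard deviation
# Left side of (4.7): E g F(g) with F = tanh.
lhs = gauss_expect(lambda x: x * math.tanh(x), s)
# Right side of (4.7): E g^2 * E F'(g), with F'(x) = 1 / ch(x)^2.
rhs = s * s * gauss_expect(lambda x: 1.0 / math.cosh(x) ** 2, s)
```

The two numbers `lhs` and `rhs` should agree up to quadrature error.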
This can be generalized as follows. If $g = (g_1, \dots, g_n)$ is a jointly Gaussian family of random variables, then for a function $F : R^n \to R$ of moderate growth we have (see, for example, [8, A.41]),
\[ E\,g_i F(g) = \sum_{j\le n} E(g_i g_j)\,E\frac{\partial F}{\partial g_j}(g). \tag{4.8} \]
We will need a similar statement for functionals of not necessarily finite Gaussian families, for example, for a random process $H_N(\sigma)$ indexed by σ in a possibly infinite set $\Sigma_N$. The following is a simple consequence of (4.7).

Lemma 4.1. Let $g = (g(\rho))_{\rho\in U}$ be a Gaussian process indexed by $U \subseteq R^n$ and let F(g) be a differentiable functional on $R^U$. Given σ ∈ U, we have
\[ E\,g(\sigma)F(g) = E\,\frac{\delta F}{\delta g}\bigl[E g(\sigma)g(\rho)\bigr], \tag{4.9} \]
that is, the expectation of the variational derivative of F in the direction $h(\rho) = E g(\sigma)g(\rho)$.

Proof. Consider a process $g' = (g'(\rho))_{\rho\in U}$ defined by
\[ g'(\rho) = g(\rho) - g(\sigma)\,\frac{E g(\rho)g(\sigma)}{E g(\sigma)^2}, \]
which is, obviously, independent of the r.v. g(σ). If we fix g′ and denote by E′ the expectation with respect to g(σ), then using (4.7) with g = g(σ) gives
\[ E' g(\sigma)F(g) = E' g(\sigma)F\Bigl(g'(\rho) + g(\sigma)\,\frac{E g(\rho)g(\sigma)}{E g(\sigma)^2}\Bigr) = E'\,\frac{\delta F}{\delta g}\bigl[E g(\sigma)g(\rho)\bigr]. \]
Taking the expectation in g′ proves (4.9).

We are ready to prove the main result of this section. The proof will clarify why we first compute the free energy of the set of configurations with constrained self-overlap $R_{1,1} \in [u - \varepsilon_N, u + \varepsilon_N]$.

Theorem 4.2 (Guerra’s Interpolation). For t ∈ [0, 1], we have
\[ \varphi_N'(t) = -\frac12\sum_{1\le l\le k} m_l\bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) - \frac12\sum_{1\le l\le k}(m_l - m_{l-1})\,\mu_l\bigl(\xi(R_{1,2}) - R_{1,2}\xi'(q_l) + \theta(q_l)\bigr) + R, \tag{4.10} \]
where $|R| \le c(N) + L\varepsilon_N$.
Proof. The proof of this theorem repeats the proof of the main result in [2] (see also [11, Theorem 2.1]) with some necessary modifications. We will give the detailed proof in order to demonstrate how Lemma 4.1 replaces (4.8) and to show that the constraint $q_{k+1} = u$ in (1.11), in some sense, matches the constraint on the self-overlap $R_{1,1} \in [u - \varepsilon_N, u + \varepsilon_N]$ in the definition of local free energy (1.8) and (1.9). In particular, we will see in (4.21) below that such choice of parameters allows us to get rid of a certain term in the derivative $\varphi_N'(t)$ that, otherwise, would be problematic to control. First of all, (3.15) implies that
\[ \varphi_N'(t) = \frac1N\,E W_1\cdots W_k\,\frac{\partial F}{\partial t}. \]
Using the fact that $m_k = 1$, one can write
\[ W_1\cdots W_k = \exp\sum_{1\le l\le k} m_l(F_{l+1} - F_l) = \exp\Bigl(F + \sum_{1\le l\le k}(m_{l-1} - m_l)F_l\Bigr) = T\exp F, \tag{4.11} \]
where $T = T_1\cdots T_k$ and $T_l = \exp(m_{l-1} - m_l)F_l$. Using (4.2), we can write
\[ \varphi_N'(t) = \frac1N\,E\,T\exp F\,\frac{\partial F}{\partial t} = \mathrm{I} - \mathrm{II}, \]
where
\[ \mathrm{I} = \frac{1}{2N\sqrt{t}}\,E\,T\int_{U_N} H_N(\sigma)\exp H_t(\sigma)\,d\nu(\sigma) \tag{4.12} \]
and
\[ \mathrm{II} = \frac{1}{2N\sqrt{1-t}}\,E\,T\int_{U_N}\sum_{i\le N}\sigma_i\sum_{p\le k} z_{i,p}\exp H_t(\sigma)\,d\nu(\sigma). \tag{4.13} \]
To compute I, we will use (4.9) for the family $g = (H_N(\rho))_{\rho\in U_N}$ and we will think of each factor $T_l$ in T as the functional of g. Let us denote
\[ \zeta(\sigma,\rho) = \frac1N\,E H_N(\sigma)H_N(\rho). \tag{4.14} \]
Then (4.9) and (4.14) imply
\[ \mathrm{I} = \frac{1}{2\sqrt t}\,E\,T\int_{U_N}\zeta(\sigma,\sigma)\,\frac{\partial\exp H_t(\sigma)}{\partial H_N(\sigma)}\,d\nu(\sigma) + \frac{1}{2\sqrt t}\,E\,T\sum_{l\le k}\int_{U_N}\exp H_t(\sigma)\,\frac{1}{T_l}\frac{\delta T_l}{\delta g}[\zeta(\sigma,\rho)]\,d\nu(\sigma). \tag{4.15} \]
First of all,
\[ \frac{\partial\exp H_t(\sigma)}{\partial H_N(\sigma)} = \sqrt t\,\exp H_t(\sigma), \]
and therefore, the first line in (4.15) can be written as
\[ \frac12\,E\,T\int_{U_N}\exp H_t(\sigma)\,\zeta(\sigma,\sigma)\,d\nu(\sigma) = \frac12\,E W_1\cdots W_k\langle\zeta(\sigma,\sigma)\rangle_t. \tag{4.16} \]
Using the definition of $T_l$, we can write
\[ \frac{1}{T_l}\frac{\delta T_l}{\delta g}[\zeta(\sigma,\rho)] = (m_{l-1} - m_l)\frac{\delta F_l}{\delta g}[\zeta(\sigma,\rho)] = (m_{l-1} - m_l)\,E_l W_l\cdots W_k\frac{\delta F}{\delta g}[\zeta(\sigma,\rho)], \]
where we used (3.16). We have
\[ \frac{\delta F}{\delta g}[\zeta(\sigma,\rho)] = \sqrt t\,\exp(-F)\int_{U_N}\zeta(\sigma,\rho)\exp H_t(\rho)\,d\nu(\rho) = \sqrt t\,\langle\zeta(\sigma,\rho)\rangle_t, \]
where $\langle\cdot\rangle_t$ denotes the Gibbs average with respect to ρ for a fixed σ. Therefore, for a fixed σ, we get
\[ \frac{1}{T_l}\frac{\delta T_l}{\delta g}[\zeta(\sigma,\rho)] = \sqrt t\,(m_{l-1} - m_l)\,E_l W_l\cdots W_k\langle\zeta(\sigma,\rho)\rangle_t = \sqrt t\,(m_{l-1} - m_l)\,\gamma_l(\zeta(\sigma,\rho)), \]
where $\gamma_l$ was defined in (4.5). Hence, the second line in (4.15) is equal to
\[ \frac12\sum_{l\le k}(m_{l-1} - m_l)\,E W_1\cdots W_k\exp(-F)\int_{U_N}\gamma_l(\zeta(\sigma,\rho))\exp H_t(\sigma)\,d\nu(\sigma) = \frac12\sum_{l\le k}(m_{l-1} - m_l)\,E W_1\cdots W_{l-1}\gamma_l^{\otimes 2}(\zeta(\sigma,\rho)) = \frac12\sum_{l\le k}(m_{l-1} - m_l)\,\mu_l(\zeta(\sigma,\rho)), \tag{4.17} \]
where $\mu_l$ was defined in (4.6). Combining (4.16) and (4.17), we get
\[ \mathrm{I} = \frac12\,E W_1\cdots W_k\langle\zeta(\sigma,\sigma)\rangle_t + \frac12\sum_{l\le k}(m_{l-1} - m_l)\,\mu_l(\zeta(\sigma,\rho)). \tag{4.18} \]
By (1.1), for any $\sigma^1, \sigma^2 \in \Sigma_N$, we have $|\zeta(\sigma^1,\sigma^2) - \xi(R_{1,2})| \le c(N)$ and therefore, (4.18) implies that
\[ \mathrm{I} = \frac12\,E W_1\cdots W_k\langle\xi(R_{1,1})\rangle_t + \frac12\sum_{l\le k}(m_{l-1} - m_l)\,\mu_l(\xi(R_{1,2})) + R, \tag{4.19} \]
where |R| ≤ c(N). The computation of II is very similar; one only needs to note that $F_l$ does not depend on $z_{i,p}$ for l ≤ p. We have
\[ \mathrm{II} = \frac12\,\xi'(q_{k+1})\,E W_1\cdots W_k\langle R_{1,1}\rangle_t + \frac12\sum_{l\le k}(m_{l-1} - m_l)\,\xi'(q_l)\,\mu_l(R_{1,2}). \tag{4.20} \]
Combining (4.19) and (4.20) and rearranging terms, it is easy to see that
\[ \varphi_N'(t) = \frac12\,E W_1\cdots W_k\bigl\langle\xi(R_{1,1}) - R_{1,1}\xi'(q_{k+1}) + \theta(q_{k+1})\bigr\rangle_t - \frac12\sum_{1\le l\le k} m_l\bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) - \frac12\sum_{l\le k}(m_l - m_{l-1})\,\mu_l\bigl(\xi(R_{1,2}) - R_{1,2}\xi'(q_l) + \theta(q_l)\bigr) + R. \tag{4.21} \]
It now suffices to notice that $\langle\cdot\rangle_t$ is restricted to the set $U_N$ and $q_{k+1} = u$ by (1.11), which implies that
\[ 0 \le \xi(R_{1,1}) - R_{1,1}\xi'(u) + \theta(u) \le L|R_{1,1} - u| \le L\varepsilon_N. \]
Since $E W_1\cdots W_k = 1$, this finishes the proof of Theorem 4.2.

It is to control the first term on the right-hand side of (4.21) that we impose the constraint on the self-overlap. In the classical Sherrington–Kirkpatrick model, this problem did not occur because $R_{1,1}$ was always 1.

5. Removing the Constraint on the Self-overlap

The main goal of this section is to compute $\lim_{N\to\infty}\varphi_N(0)$. The convexity of ξ implies that ξ(a) − aξ′(b) + θ(b) ≥ 0 for any a, b ∈ R and therefore, (4.10) implies
\[ \varphi_N'(t) \le -\frac12\sum_{1\le l\le k} m_l\bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) + R \]
and hence,
\[ F_N(u,\varepsilon_N) = \varphi_N(1) \le \varphi_N(0) - \frac12\sum_{1\le l\le k} m_l\bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) + R. \tag{5.1} \]
In the SK model, $\varphi_N(0)$ was very easy to compute and in fact, it was independent of N because at the end of Guerra’s interpolation, the spins became decoupled. The situation is different here because of the constraint (1.8). Let us recall that for t = 0,
\[ \varphi_N(0) = \frac1N\,P(m)F, \quad\text{where}\quad F = \log\int_{U_N}\exp\sum_{i\le N}\sigma_i\sum_{0\le p\le k} z_{i,p}\,d\nu(\sigma). \]
Removing the constraint $U_N$ constitutes a large deviation problem that will be addressed in Theorem 5.3 below. First of all, let us give an easy upper bound on $\varphi_N(0)$. Since by (1.7) the self-overlap $R_{1,1} \in [d, D]$, for $\sigma \in U_N$ we have $R_{1,1} \in [d, D] \cap [u - \varepsilon_N, u + \varepsilon_N]$ and therefore,
\[ -\lambda R_{1,1} \le -\lambda u_N(\lambda), \tag{5.2} \]
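where
\[ u_N(\lambda) = \max(d, u - \varepsilon_N) \ \text{ for } \lambda \ge 0 \quad\text{and}\quad u_N(\lambda) = \min(D, u + \varepsilon_N) \ \text{ for } \lambda < 0. \tag{5.3} \]
Using this, we can bound F as follows,
\[ \begin{aligned} F &= \log\int_{U_N}\exp\sum_{i\le N}\sigma_i\sum_{0\le p\le k} z_{i,p}\,d\nu(\sigma) \\ &\le -N\lambda u_N(\lambda) + \log\int_{U_N}\exp\Bigl(\sum_{i\le N}\sigma_i\sum_{0\le p\le k} z_{i,p} + \lambda\sum_{i\le N}\sigma_i^2\Bigr)d\nu(\sigma) \\ &\le -N\lambda u_N(\lambda) + \log\int_{\Sigma_N}\exp\Bigl(\sum_{i\le N}\sigma_i\sum_{0\le p\le k} z_{i,p} + \lambda\sum_{i\le N}\sigma_i^2\Bigr)d\nu(\sigma) \\ &= -N\lambda u_N(\lambda) + \sum_{i\le N} X_{k+1,i}, \end{aligned} \]
where
\[ X_{k+1,i} = \log\int_{\Sigma}\exp\Bigl(\sigma\sum_{0\le p\le k} z_{i,p} + \lambda\sigma^2\Bigr)d\nu(\sigma) \tag{5.4} \]
are independent copies of $X_{k+1}$ defined in (1.13). Using (3.5) and (3.7),
\[ \varphi_N(0) = \frac1N\,P(m)F \le -\lambda u_N(\lambda) + P(m)X_{k+1} = -\lambda u_N(\lambda) + X_0. \]
The arbitrary choice of λ here implies that
\[ \varphi_N(0) \le \inf_{\lambda}\bigl(-\lambda u_N(\lambda) + X_0\bigr). \tag{5.5} \]
Combining (5.1) and (5.5), we get
\[ F_N(u,\varepsilon_N) \le \inf\Bigl(-\lambda u_N(\lambda) + X_0(m,q,\lambda) - \frac12\sum_{1\le l\le k} m_l\bigl(\theta(q_{l+1}) - \theta(q_l)\bigr)\Bigr) + R, \tag{5.6} \]
where the infimum is over all choices of parameters k, m, q and λ. The bound (5.6) is the analogue of Guerra’s replica symmetry breaking bound in [2]. If instead of $u_N(\lambda)$ we had u in (5.6), then the infimum would be equal to P(ξ, u) which would prove the upper bound in Theorem 1.1. We will now show that for
\[ d < u < D \tag{5.7} \]
this infimum does not change much by replacing $u_N(\lambda)$ with u. We will need the following.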

Lemma 5.1. There exists a function a(λ) such that for any k, m, q,
\[ a(\lambda)\lambda \le X_0(m,q,\lambda) \tag{5.8} \]
and such that
\[ \lim_{\lambda\to+\infty} a(\lambda) = D \quad\text{and}\quad \lim_{\lambda\to-\infty} a(\lambda) = d. \tag{5.9} \]

Proof. Indeed, if in the recursive construction (1.14) one takes all $m_l = 0$, then Hölder’s inequality yields that for any sequence m,
\[ E\log\int_{\Sigma}\exp\Bigl(\sigma\sum_{0\le p\le k} z_p + \lambda\sigma^2\Bigr)d\nu(\sigma) \le X_0(m,q,\lambda). \tag{5.10} \]
Since the function
\[ x \to \log\int_{\Sigma}\exp(\sigma x + \lambda\sigma^2)\,d\nu(\sigma) \]
is convex by Hölder’s inequality, (5.10) and Jensen’s inequality imply that
\[ \log\int_{\Sigma}\exp\lambda\sigma^2\,d\nu(\sigma) \le X_0(m,q,\lambda). \tag{5.11} \]
It is clear that (1.7) implies that the left-hand side of (5.11) is asymptotically equivalent to Dλ for λ → +∞ and to dλ for λ → −∞ and this proves Lemma 5.1.

Next we will show that the estimate (5.8) and Eq. (5.6) imply the upper bound in Theorem 1.1.

Lemma 5.2. For d < u < D, we have
\[ \limsup_{N\to\infty} F_N(u,\varepsilon_N) \le P(\xi,u). \tag{5.12} \]

Proof. Lemma 5.1 implies that
\[ -\lambda u_N(\lambda) + X_0(m,q,\lambda) \ge \lambda\bigl(a(\lambda) - u_N(\lambda)\bigr). \tag{5.13} \]
For d < u < D, the definition (5.3) implies that for any λ, $\lim_{N\to\infty} u_N(\lambda) = u$. Combining this with (5.9) yields that for N large enough, the right-hand side of (5.13) goes to infinity as λ → ±∞ and thus, the infimum in (5.6) is achieved for |λ| ≤ Λ where Λ is a large enough constant independent of k, m, q. This means
that in (5.6), restricting minimization over λ to the set {|λ| ≤ Λ} does not change the infimum and, therefore,
\[ F_N(u,\varepsilon_N) \le \inf\Bigl(-\lambda u + X_0(m,q,\lambda) - \frac12\sum_{1\le l\le k} m_l\bigl(\theta(q_{l+1}) - \theta(q_l)\bigr)\Bigr) + \Lambda|u_N(\lambda) - u| + R = P(\xi,u) + \Lambda|u_N(\lambda) - u| + R \]
and this finishes the proof.

In the rest of the section, we will show that the bound in (5.5) is exact in the limit.

Theorem 5.3. For any d < u < D, if the sequence $\varepsilon_N$ goes to zero slowly enough then
\[ \lim_{N\to\infty}\varphi_N(0) = \varphi_0(m,q) := \inf_{\lambda}\bigl(-\lambda u + X_0(m,q,\lambda)\bigr). \tag{5.14} \]

Proof. Given a measurable set A ⊆ [d, D] and λ ∈ R, we define
\[ F(A,\lambda) = \log\int_{\{R_{1,1}\in A\}}\exp\Bigl(\sum_{i=1}^{N}\sigma_i\sum_{0\le p\le k} z_{i,p} + \lambda\sum_{i=1}^{N}\sigma_i^2\Bigr)d\nu(\sigma) \tag{5.15} \]
and
\[ \Phi(A,\lambda) = \frac1N\,P(m)F(A,\lambda). \tag{5.16} \]
When A = [d, D], the set $\{R_{1,1}\in A\} = \Sigma_N$ and therefore,
\[ F([d,D],\lambda) = \sum_{i\le N} X_{k+1,i}, \]
where $X_{k+1,i}$ was defined in (5.4). Using (3.6),
\[ \Phi([d,D],\lambda) = P(m)X_{k+1} = X_0(\lambda). \tag{5.17} \]
Here we made the dependence of $X_0$ on λ explicit while keeping the dependence on other parameters implicit. For simplicity of notations, we will write
\[ F(\lambda) := F([d,D],\lambda) \quad\text{and}\quad \Phi(\lambda) := \Phi([d,D],\lambda) = X_0(\lambda). \]
In these notations, Theorem 5.3 states that there exists a sequence $\varepsilon_N \to 0$ such that
\[ \lim_{N\to\infty}\Phi(U_N,0) = \inf_{\lambda}\bigl(-\lambda u + \Phi(\lambda)\bigr) = \inf_{\lambda}\bigl(-\lambda u + X_0(\lambda)\bigr). \tag{5.18} \]
Let us define λ(u) to be the point where the infimum in (5.18) is achieved, i.e.
\[ X_0(\lambda(u)) - \lambda(u)u = \inf_{\lambda}\bigl(X_0(\lambda) - \lambda u\bigr). \tag{5.19} \]
The infimum is, indeed, achieved because of the following argument. The function $-\lambda u + X_0(\lambda)$ is convex in λ by Hölder’s inequality. Lemma 5.1 implies that $X_0(\lambda) \ge a(\lambda)\lambda$ for some function a(λ) such that $\lim_{\lambda\to+\infty} a(\lambda) = D$ and $\lim_{\lambda\to-\infty} a(\lambda) = d$. Hence, for d < u < D, the convex function $-\lambda u + X_0(\lambda) \to +\infty$ as λ → ±∞ and therefore, it has a unique minimum. The critical point condition for λ(u) is
\[ \frac{\partial X_0}{\partial\lambda}(\lambda(u)) = u. \tag{5.20} \]
Consider a fixed small enough ε > 0 such that d < u − ε and u + ε < D and let $U_\varepsilon = [u - \varepsilon, u + \varepsilon]$. Let us analyze $\Phi(U_\varepsilon, \lambda(u))$. Consider a set V equal to either [d, u − ε] or [u + ε, D] and note that $[d, D] = U_\varepsilon \cup [d, u - \varepsilon] \cup [u + \varepsilon, D]$. We will start by proving an upper bound on Φ(V, λ(u)). We will only consider the case of V = [u + ε, D] since the case V = [d, u − ε] can be treated similarly. Since
\[ -\gamma R_{1,1} \le -\gamma(u + \varepsilon) \quad\text{for } R_{1,1}\in V \text{ and } \gamma \ge 0, \]
we get that for γ ≥ 0,
\[ F(V,\lambda(u)) \le -N\gamma(u + \varepsilon) + F(\lambda(u) + \gamma). \]
Using (3.5) and (3.7), we get
\[ \Phi(V,\lambda(u)) \le U(\gamma) := -\gamma(u + \varepsilon) + \Phi(\lambda(u) + \gamma) = -\gamma(u + \varepsilon) + X_0(\lambda(u) + \gamma). \tag{5.21} \]
Setting γ = 0 gives $U(0) = \Phi(\lambda(u)) = X_0(\lambda(u))$. Next, the right derivative of U(γ) at zero is
\[ \frac{\partial U}{\partial\gamma}\Big|_{\gamma=0+} = -(u + \varepsilon) + \frac{\partial X_0}{\partial\lambda}(\lambda(u)) = -(u + \varepsilon) + u = -\varepsilon \]
using (5.20). Finally, it follows from a tedious but straightforward computation which we will omit here that
\[ \Bigl|\frac{\partial^2 U}{\partial\gamma^2}\Bigr| \le L \]
for some constant L that depends only on the parameters of the model ξ and ν. Therefore, minimizing over γ ≥ 0 in the right-hand side of (5.21) gives
\[ \Phi(V,\lambda(u)) \le \Phi(\lambda(u)) - \frac{\varepsilon^2}{L}. \tag{5.22} \]
The same bound holds for V = [d, u − ε]. For l ≥ 1, let $W_l = \exp m_l(F_{l+1}(\lambda(u)) - F_l(\lambda(u)))$ be defined by (3.13) with F = F(λ(u)). Given a set A ⊆ [d, D], let $\langle I(R_{1,1}\in A)\rangle$ denote the Gibbs average defined by
\[ \langle I(R_{1,1}\in A)\rangle = \exp\bigl(F(A,\lambda(u)) - F(\lambda(u))\bigr). \]
The following proposition is the crucial step in the proof of Theorem 5.3. This type of computation was invented by Talagrand [11] in order to control the remainder terms in Guerra’s interpolation and we will use this argument with the same purpose later in the paper as well.

Proposition 5.4. Assume that for A ⊆ [d, D] and for some ε′ > 0, we have
\[ \Phi(A,\lambda(u)) \le \Phi(\lambda(u)) - \varepsilon'. \tag{5.23} \]
Then,
\[ E W_1\cdots W_k\langle I(R_{1,1}\in A)\rangle \le L\exp\Bigl(-\frac{N}{L}\Bigr), \tag{5.24} \]
where L does not depend on N.

Proof. The proof is based on the property of the Parisi functional described in Lemma 3.4. For simplicity of notations, let us assume that $m_1 > 0$. The case when several elements of the sequence m are zeroes can be handled in exactly the same way. Let
\[ f = F(A,\lambda(u)) \quad\text{and}\quad F = F(\lambda(u)) \]
so that the condition f ≤ F of Lemma 3.4 is satisfied. Lemma 3.4 and (5.23) imply that
\[ E\log E_1 W_1\cdots W_k\exp(f - F) \le m_1\bigl(P(m)f - P(m)F\bigr) = N m_1\bigl(\Phi(A,\lambda(u)) - \Phi(\lambda(u))\bigr) \le -N m_1\varepsilon'. \]
Let us consider a function
\[ \phi(Z) = \log E_1 W_1\cdots W_k\exp(f - F), \tag{5.25} \]
where $Z = (z_{i,0})_{i\le N}$. We will show that φ(Z) is a Lipschitz function of Z:
\[ |\phi(Z) - \phi(Z')| \le L\sqrt{N}\,|Z - Z'|. \tag{5.26} \]
First of all,
\[ \Bigl|\sum_{i\le N}\sigma_i\bigl(z_{i,0} - z_{i,0}'\bigr)\Bigr| \le |Z - Z'|\Bigl(\sum_{i\le N}\sigma_i^2\Bigr)^{1/2} \le \sqrt{ND}\,|Z - Z'|, \]
using (1.7). Definition (5.15) implies that for any set A and any λ,
\[ |F(A,\lambda)(Z) - F(A,\lambda)(Z')| \le \sqrt{ND}\,|Z - Z'|, \]
where we made the dependence of F(A, λ) on Z explicit. In particular, this holds for f and F. It is also clear from the properties (3.5) and (3.7) that iteration (3.2) in the definition of the Parisi functional preserves the Lipschitz condition and therefore,
\[ |F_l(Z) - F_l(Z')| \le \sqrt{ND}\,|Z - Z'| \quad\text{for } l \le k. \]
Using (4.11), we can rewrite φ(Z) as
\[ \phi(Z) = \log E_1\exp\Bigl(f(Z) + \sum_{1\le l\le k}(m_{l-1} - m_l)F_l(Z)\Bigr) \]
and (5.26) is now obvious. Gaussian concentration of measure (see, for example, [8, Theorem 2.2.4]) implies that for any t ≥ 0,
\[ P\bigl(|\phi(Z) - E\phi| \ge Nt\bigr) \le 2\exp(-Nt^2/L). \tag{5.27} \]
In particular, with probability at least 1 − 2 exp(−N/L),
\[ \phi \le E\phi + \frac12 N m_1\varepsilon' \le -\frac12 N m_1\varepsilon' \]
and therefore, with probability at least 1 − 2 exp(−N/L),
\[ E_1 W_1\cdots W_k\exp(f - F) \le \exp(-N/L). \tag{5.28} \]
Since f ≤ F, $E_1 W_1\cdots W_k\exp(f - F) \le E_1 W_1\cdots W_k \le 1$ using (3.14), which together with (5.28) implies that
\[ E W_1\cdots W_k\exp(f - F) \le L\exp(-N/L). \]

Corollary 5.5. For any ε > 0, we have
\[ \Phi(U_\varepsilon,\lambda(u)) \ge \Phi(\lambda(u)) - \delta_N = X_0(\lambda(u)) - \delta_N, \tag{5.29} \]
where $\lim_{N\to\infty}\delta_N = 0$.

Proof. (5.22) and (5.24) imply that for V equal to either [d, u − ε] or [u + ε, D], we have
\[ E W_1\cdots W_k\langle I(R_{1,1}\in V)\rangle \le L\exp\Bigl(-\frac{N}{L}\Bigr) \]
and therefore,
\[ E W_1\cdots W_k\langle I(R_{1,1}\notin U_\varepsilon)\rangle \le L\exp\Bigl(-\frac{N}{L}\Bigr). \tag{5.30} \]
Suppose that (5.29) is not true which means that for some positive ε′ > 0, we have
\[ \Phi(U_\varepsilon,\lambda(u)) \le \Phi(\lambda(u)) - \varepsilon' \]
for some arbitrarily large N. Then again, (5.22) and (5.24) would imply that
\[ E W_1\cdots W_k\langle I(R_{1,1}\in U_\varepsilon)\rangle \le L\exp\Bigl(-\frac{N}{L}\Bigr). \]
Combining with (5.30), we would get
\[ 1 = E W_1\cdots W_k \le L\exp\Bigl(-\frac{N}{L}\Bigr), \]
and we arrive at a contradiction.

In order to bound $\Phi(U_\varepsilon,\lambda(u))$ in terms of $\Phi(U_\varepsilon,0)$, we can write
\[ F(U_\varepsilon,\lambda(u)) = \log\int_{\{R_{1,1}\in U_\varepsilon\}}\exp\Bigl(\sum_{i=1}^{N}\sigma_i\sum_{0\le p\le k} z_{i,p} + N\lambda(u)R_{1,1}\Bigr)d\nu(\sigma) \le N\lambda(u)u + N|\lambda(u)|\varepsilon + F(U_\varepsilon,0). \]
Using (3.5) and (3.7),
\[ \Phi(U_\varepsilon,\lambda(u)) \le \Phi(U_\varepsilon,0) + \lambda(u)u + |\lambda(u)|\varepsilon \]
and therefore,
\[ \Phi(U_\varepsilon,0) \ge -\lambda(u)u + X_0(\lambda(u)) - |\lambda(u)|\varepsilon - \delta_N. \tag{5.31} \]
This implies that for any ε > 0,
\[ \liminf_{N\to\infty}\Phi(U_\varepsilon,0) \ge -\lambda(u)u + X_0(\lambda(u)) - L\varepsilon. \]
Clearly, this means that one can choose a sequence $\varepsilon_N \to 0$ such that for $U_N = U_{\varepsilon_N}$,
\[ \liminf_{N\to\infty}\Phi(U_N,0) \ge -\lambda(u)u + X_0(\lambda(u)). \]
Since a similar upper bound is obvious this finishes the proof of Theorem 5.3.
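The variational problem (5.19) and its critical point condition (5.20) can be illustrated on a toy example. The sketch below is an assumption-laden simplification: it drops the Gaussian fields entirely and takes X_0(λ) = log ∫ exp(λσ²) dν(σ) for a hypothetical discrete measure ν (the arrays `SQ` of values of σ² and `NU` of weights), so that d = min σ² and D = max σ². The minimizer λ(u) is located by bisection on the monotone derivative.

```python
import math

SQ = (0.25, 1.0, 4.0)  # hypothetical values of sigma^2 under nu: d = 0.25, D = 4
NU = (0.5, 0.3, 0.2)   # hypothetical weights of nu

def X0(lam):
    """Toy X_0(lambda) = log sum_j nu_j exp(lambda * sigma_j^2), computed stably."""
    mx = max(lam * s for s in SQ)
    return mx + math.log(sum(p * math.exp(lam * s - mx) for p, s in zip(NU, SQ)))

def dX0(lam):
    """Derivative of X0: the Gibbs average of sigma^2, increasing in lambda."""
    mx = max(lam * s for s in SQ)
    w = [p * math.exp(lam * s - mx) for p, s in zip(NU, SQ)]
    return sum(wi * s for wi, s in zip(w, SQ)) / sum(w)

def lambda_of_u(u, lo=-50.0, hi=50.0, iters=200):
    """Solve dX0(lambda) = u by bisection; requires d < u < D."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dX0(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = 1.5
lam = lambda_of_u(u)
value = X0(lam) - lam * u  # toy analogue of the infimum in (5.19)
```

For d < u < D the derivative ranges over (d, D), which is why the bisection bracket always contains a root; this mirrors the role of Lemma 5.1 in the argument above.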
6. Reduction of the Main Results to A Priori Estimates

Now that we understand what happens at the end of Guerra’s interpolation, we will turn to analyzing the remainder terms in the second line of (4.10) and, in particular, the functional $\mu_r$ defined in (4.6). First of all, for a function h on $U_N\times U_N$, the definition of $\mu_r(h)$ can be written equivalently as follows. Let $z_p^j$ for j = 1, 2 be two copies of the random vector $(z_p)$ defined in (1.12) such that
\[ z_p^1 = z_p^2 \ \text{ for } p < r \quad\text{and } z_p^1,\ z_p^2 \text{ are independent for } p \ge r. \tag{6.1} \]
Let $z^j = (z_{i,p}^j)_{i\le N,\,p\le k}$ where $(z_{i,p}^j)_{p\le k}$ are independent copies of the vector $(z_p^j)_{p\le k}$ for i ≤ N. Let $F^j$ be defined by (4.2) in terms of $z^j$ and let $W_l^j$ be defined by (3.13) in terms of $F^j$. Let us consider the Hamiltonian
\[ H_t(\sigma^1,\sigma^2) = \sum_{j=1,2}\Bigl(\sqrt t\,H_N(\sigma^j) + \sqrt{1-t}\sum_{i\le N}\sigma_i^j\sum_{0\le p\le k} z_{i,p}^j\Bigr) \tag{6.2} \]
and define the Gibbs average $\langle h\rangle$ of h by
\[ \langle h\rangle\exp(F^1 + F^2) = \int_{U_N\times U_N} h(\sigma^1,\sigma^2)\exp H_t(\sigma^1,\sigma^2)\,d\nu(\sigma^1)\,d\nu(\sigma^2). \]
Then, the definition (4.6) is equivalent to
\[ \mu_r(h) = E\,W_1^1\cdots W_{r-1}^1\,W_r^1 W_r^2\cdots W_k^1 W_k^2\,\langle h\rangle. \tag{6.3} \]
We simply decoupled the measure $\gamma_r^{\otimes 2}$ by using independent copies of $z_{i,p}$ for p ≥ r. Using Lemma 3.2, we can rewrite this in a more compact way. Let $F = F^1 + F^2$ and define n as in (3.10), i.e.
\[ n_p = \frac{m_p}{2} \ \text{ for } p < r \quad\text{and}\quad n_p = m_p \ \text{ for } p \ge r. \tag{6.4} \]
Lemma 3.2 then implies that for l ≥ r,
\[ W_l^1 W_l^2 = \exp m_l\bigl(F_{l+1}^1 - F_l^1\bigr)\exp m_l\bigl(F_{l+1}^2 - F_l^2\bigr) = \exp n_l(F_{l+1} - F_l) \]
and for l < r,
\[ W_l^1 = \exp m_l\bigl(F_{l+1}^1 - F_l^1\bigr) = \exp n_l(F_{l+1} - F_l). \]
Therefore, if we define $W_l = \exp n_l(F_{l+1} - F_l)$, (6.3) becomes
\[ \mu_r(h) = E W_1\cdots W_k\langle h\rangle. \tag{6.5} \]
In particular, if h = I(A) is an indicator of a measurable subset $A \subseteq U_N\times U_N$ and
\[ F(A) = \log\int_A\exp H_t(\sigma^1,\sigma^2)\,d\nu(\sigma^1)\,d\nu(\sigma^2) \tag{6.6} \]
then, since
\[ F = F^1 + F^2 = \log\int_{U_N\times U_N}\exp H_t(\sigma^1,\sigma^2)\,d\nu(\sigma^1)\,d\nu(\sigma^2), \tag{6.7} \]
we get $\mu_r(I(A)) = E W_1\cdots W_k\exp(f - F)$ with f = F(A). Lemma 3.4 provided the methodology to control this expression.

Lemma 6.1. Let F(A) and F be defined by (6.6) and (6.7). If for some ε > 0, we have
\[ \frac1N\,E P(n)F(A) \le 2\varphi_N(t) - \varepsilon, \tag{6.8} \]
then for some constant K independent of N and the set A,
\[ \mu_r(I(A)) \le K\exp(-N/K). \tag{6.9} \]

Proof. Using (3.12) and (4.3), we can write
\[ \frac1N\,E P(n)F = \frac2N\,E P(m)F^1 = 2\varphi_N(t), \]
so that (6.8) can be written as
\[ E P(n)F(A) - E P(n)F \le -N\varepsilon, \tag{6.10} \]
which is the type of condition used in Lemma 3.4. The main idea was already explained in detail in the proof of Proposition 5.4. However, the function φ that was defined in (5.25) was a function of the finite Gaussian vector Z in $R^N$ with independent coordinates which allowed us to use the classical Gaussian concentration of measure inequality in (5.27). Now, however, both F(A) and F depend on the entire Gaussian process $H_N(\sigma)$ indexed by $\sigma \in \Sigma_N$ and the only information that we specified about this process was the covariance operator in (1.1). Still, using the specific definition of F(A), F and (1.1), one can prove the same concentration inequality as (5.27) but it would require a tedious computation repeating the proof of (5.27) in [8]. We will actually carry out this computation in a relatively easier situation, below Lemma 7.1, so the idea will be clear and we will omit this computation here. The proof becomes more transparent when the Hamiltonian is expressed explicitly in terms of an i.i.d. Gaussian sequence. For example, one often considers a Hamiltonian of the type
\[ H_N(\sigma) = \sum_{p\ge1}\frac{a_p}{N^{(p-1)/2}}\sum_{i_1,\dots,i_p} g_{i_1,\dots,i_p}\,\sigma_{i_1}\cdots\sigma_{i_p}, \tag{6.11} \]
where $g = (g_{i_1,\dots,i_p})$ is a sequence of standard Gaussian random variables independent for all p ≥ 1 and all $(i_1, \dots, i_p)$. In this case,
\[ \frac1N\,E H_N(\sigma^1)H_N(\sigma^2) = \xi(R_{1,2}) \quad\text{where}\quad \xi(x) = \sum_{p\ge1} a_p^2 x^p. \]
The sequence $(a_p)_{p\ge1}$ should be such that $\xi(R_{1,2})$ is well defined for all $\sigma^1, \sigma^2$ and comparing with (1.7), this means that ξ(D) < ∞. It is easy to check that for two sequences g and g′, we have
\[ |H_N(\sigma)(g) - H_N(\sigma)(g')| \le \sqrt{N\xi(D)}\,|g - g'| \]
and since inequality (5.27) is dimension independent, it applies to a sequence g and the rest of the proof repeats the proof of Proposition 5.4.

In Lemma 5.2, we explained why P(ξ, u) is an upper bound on local free energy and the main reason was that the remainder terms in Guerra’s interpolation were non-negative. In order to show that this bound is exact in the limit, we must show that these remainder terms are small along the interpolation for some choices of the parameters k, m and q, and that $\varphi_N(t)$ can be approximated by
\[ \psi(t) = \varphi_0(m,q) - \frac t2\sum_{1\le l\le k} m_l\bigl(\theta(q_{l+1}) - \theta(q_l)\bigr), \tag{6.12} \]
where $\varphi_0(m,q)$ was defined in (5.14). It is also clear that these parameters should approximate the infimum in the definition (1.16) of P(ξ, u).

Definition 6.2. We will call a vector (k, m, q, λ) an ε-minimizer if
\[ P_k(m,q,\lambda,u) \le P(\xi,u) + \varepsilon \tag{6.13} \]
and (m, q, λ) is the minimizer of (1.15).

For any v ∈ [−D, D], let us define a set
\[ A(v) = \Bigl\{|R_{1,2} - v| \le \frac1N\Bigr\}\cap(U_N\times U_N). \tag{6.14} \]
The following a priori estimate will allow us to control the remainder terms in Guerra’s interpolation.

Theorem 6.3. For any t0 < 1, there exists ε > 0 that depends on t0, ξ, ν, u only such that if (6.13) holds, then for t ≤ t0 and for large enough N,
\[ \frac1N\,E P(n)F(A(v)) \le 2\psi(t) - \frac{(v - q_r)^2}{K} + R, \tag{6.15} \]
where K is a constant independent of N, t and v and $|R| \le a_N$ for a sequence $(a_N)$ independent of t and v and $\lim_{N\to\infty} a_N = 0$.
In the case of the replica symmetric region of Theorem 2.1, the condition on ε-minimizer is replaced by the condition of stability to replica symmetry breaking fluctuations defined in (2.10).

Theorem 6.4. Suppose that all functions are defined in terms of parameters in (2.2) and that (2.5) and (2.10) hold. Then for any t0 < 1 and for any t ≤ t0, for large enough N,
\[ \frac1N\,E P(n)F(A(v)) \le 2\psi(t) - \frac{(v - q)^2}{K} + R, \tag{6.16} \]
where K is a constant independent of N, t and v and $|R| \le a_N$ for a sequence $(a_N)$ independent of t and v and $\lim_{N\to\infty} a_N = 0$.

The proof of these a priori estimates will be postponed until Appendix A. First, let us show how they imply Theorems 1.1 and 2.1.

Proof of Theorem 1.1. Given t0 < 1, let us take ε > 0 as in Theorem 6.3 and let (k, m, q, λ) be an ε-minimizer defined by (6.13). Clearly, in this case,
\[ \psi(1) = P_k(m,q,\lambda,u) \quad\text{and}\quad |\psi(1) - P(\xi,u)| \le \varepsilon. \tag{6.17} \]
Let us take K as in (6.15) and for ε1 > 0, define a set
\[ V = \bigl\{v\in[-D,D] : (v - q_r)^2 \ge 2K(\psi(t) - \varphi_N(t)) + 2K\varepsilon_1\bigr\}. \]
For any v ∈ V, (6.15) implies that
\[ \frac1N\,E P(n)F(A(v)) \le 2\psi(t) - \frac{(v - q_r)^2}{K} + R \le 2\varphi_N(t) - \varepsilon_1 \]
for large enough N. Everywhere below, let L denote a constant that might depend on ε1 and K denote a constant independent of ε1. Applying Lemma 6.1, we get
\[ \mu_r(I(A(v))) \le L\exp(-N/L) \tag{6.18} \]
and the constant L here does not depend on v. Let us consider a set
\[ A = \bigl\{(R_{1,2} - q_r)^2 \ge 2K(\psi(t) - \varphi_N(t)) + 2K\varepsilon_1\bigr\}\cap(U_N\times U_N). \]
We can choose the points $v_1, \dots, v_M \in V$ with M ≤ KN such that $A \subseteq \bigcup_{i\le M} A(v_i)$ and (6.18) implies that
\[ \mu_r(I(A)) \le LN\exp(-N/L) \le L\exp(-N/L). \tag{6.19} \]
Using the definition of ψ in (6.12) and (4.10),
\[ (\psi(t) - \varphi_N(t))' = \frac12\sum_{1\le l\le k}(m_l - m_{l-1})\,\mu_l\bigl(\xi(R_{1,2}) - R_{1,2}\xi'(q_l) + \theta(q_l)\bigr) + R. \tag{6.20} \]
Since the second derivative of ξ is bounded on [−D, D],
\[ \xi(R_{1,2}) - R_{1,2}\xi'(q_l) + \theta(q_l) \le K(R_{1,2} - q_l)^2, \]
and because on the complement $A^c$ of A, we have
\[ (R_{1,2} - q_r)^2 \le 2K(\psi(t) - \varphi_N(t)) + 2K\varepsilon_1, \]
(6.19), for r = l, implies that
\[ \mu_l\bigl(\xi(R_{1,2}) - R_{1,2}\xi'(q_l) + \theta(q_l)\bigr) \le K\mu_l\bigl((R_{1,2} - q_l)^2\bigr) \le K\bigl((\psi(t) - \varphi_N(t)) + \varepsilon_1 + \mu_l(I(A))\bigr) \le K(\psi(t) - \varphi_N(t)) + K\varepsilon_1 + L\exp(-N/L). \]
(6.20) now implies that
\[ (\psi(t) - \varphi_N(t))' \le K(\psi(t) - \varphi_N(t)) + K\varepsilon_1 + L\exp(-N/L). \]
Since by Theorem 5.3, $\lim_{N\to\infty}\varphi_N(0) = \psi(0)$, solving this differential inequality and then letting N → ∞ and ε1 → 0 implies that
\[ \lim_{N\to\infty}\varphi_N(t) = \psi(t) \quad\text{for } t \le t_0. \tag{6.21} \]
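The step "solving this differential inequality" can be spelled out as follows. This is only a sketch of the standard Gronwall-type argument, under the assumptions that K > 0, that m is nondecreasing, and that the remainder R in (6.20) tends to zero uniformly in t; the shorthand $\Delta_N$ and $c_N$ is introduced here for convenience and is not notation from the paper.

```latex
% Shorthand (not from the paper):
%   \Delta_N(t) = \psi(t) - \varphi_N(t), \qquad c_N = K\varepsilon_1 + L e^{-N/L}.
\begin{align*}
\Delta_N'(t) \le K\Delta_N(t) + c_N
  &\;\Longrightarrow\; \bigl(e^{-Kt}\Delta_N(t)\bigr)' \le c_N e^{-Kt} \\
  &\;\Longrightarrow\; \Delta_N(t) \le e^{Kt}\Delta_N(0) + \frac{c_N}{K}\bigl(e^{Kt} - 1\bigr).
\end{align*}
% Lower bound: each term \mu_l(\cdot) in (6.20) is non-negative by convexity of \xi,
% so \Delta_N'(t) \ge -|R| and hence \Delta_N(t) \ge \Delta_N(0) - |R|\,t.
% Letting N \to \infty (so \Delta_N(0) \to 0 by Theorem 5.3) and then
% \varepsilon_1 \to 0 gives \Delta_N(t) \to 0 for t \le t_0.
```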
The derivatives ψ′(t) and $\varphi_N'(t)$ are both bounded, which is apparent from (6.12) and (4.10), and we get
\[ \limsup_{N\to\infty}|\varphi_N(1) - \psi(1)| \le K(1 - t_0). \]
Using (6.17),
\[ \limsup_{N\to\infty}|\varphi_N(1) - P(\xi,u)| \le K(1 - t_0) + \varepsilon \]
and letting ε → 0 and t0 → 1 finishes the proof of Theorem 1.1.

Proof of Theorem 2.1. The proof of the first part follows from Theorem 6.4 in exactly the same way as Theorem 1.1 follows from Theorem 6.3. The uniqueness of (q, λ) follows from the following simple argument. (6.21) implies that for t ≤ t0,
\[ \lim_{N\to\infty}\mu_1\bigl((R_{1,2} - q)^2\bigr) = 0 \]
and since the definition of $\mu_1$ does not depend on λ, this q must be unique. Since $P_1(q,\lambda)$ is convex in λ, this implies the uniqueness of λ.

7. Computing Global Free Energy

In this section, we will prove Theorem 1.2 which will follow from Theorem 1.1 and Gaussian concentration of measure. Let us start by proving the following concentration inequality.

Lemma 7.1. For any measurable subset $\Omega \subseteq \Sigma_N$, let us consider a r.v.
\[ X = \log\int_{\Omega}\exp H_N(\sigma)\,d\nu(\sigma). \tag{7.1} \]
Then, for any t ≥ 0,
\[ P\bigl(|X - EX| \ge 2\sqrt{LNt}\bigr) \le 2\exp(-t), \tag{7.2} \]
where L = max{ξ(x) : x ∈ [d, D]} + c(N).

Proof. The proof is a simple modification of [8, Theorem 2.2.4]. Unfortunately, Lemma 7.1 does not fall into the framework of [8, Theorem 2.2.4] directly, but the same argument still works if we utilize the particular definition of X and the covariance structure (1.1) of the Hamiltonian $H_N(\sigma)$. Let $H_N^1$ and $H_N^2$ be two independent copies of the Hamiltonian $H_N$. For t ∈ [0, 1], we define
\[ F_j = \log\int_{\Omega}\exp\Bigl(\sqrt t\,H_N^j(\sigma) + \sqrt{1-t}\,H_N(\sigma)\Bigr)d\nu(\sigma) \tag{7.3} \]
for j = 1, 2 and let F = F1 + F2 . For s ≥ 0, let ϕ(t) = E exp s(F2 − F1 ). If we define Ht (σ 1 , σ 2 ) =
√ 1 1 √ 2 t HN (σ ) + HN (σ 1 ) + 1 − t HN (σ 1 ) + HN (σ 2 ) ,
then straightforward computation as in Theorem 4.2 gives 2 ϕ (t) = −N s E exp s(F1 − F2 ) exp(−F ) ζ(σ 1 , σ 2 ) exp Ht (σ 1 , σ 2 ) dν(σ 1 ) dν(σ 2 ), Ω2
(7.4) where ζ was defined in (4.14). Since |ζ(σ 1 , σ 2 )| ≤ L, we get exp Ht dν dν = LN s2 ϕ(t). ϕ (t) ≤ LN s2 E exp s(F1 − F2 ) exp(−F )
(7.5)
Ω2
By construction ϕ(0) = 1, so that (7.5) implies that ϕ(1) ≤ exp N Ls2 . On the other hand, by construction, ϕ(1) = E exp s(X − X ) where X is defined in (7.1) and X is an independent copy of X and thus, E exp s(X − X ) ≤ exp N Ls2 . By Jensen’s inequality, this implies that E exp s(X − EX) ≤ exp N Ls2 and using Markov’s inequality, we get that for t > 0, t2 2 P(X − EX ≥ t) ≤ inf exp(N Ls − st) = exp − . s≥0 4N L
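For completeness, the infimum over $s$ in the last display can be worked out explicitly. This is a routine computation written out under the assumption $N, L > 0$; the symbols $g$, $s^*$ and $\tau$ are introduced here only for illustration.

```latex
% g(s) = NLs^2 - st is a convex quadratic in s (for t > 0 and N, L > 0).
\begin{align*}
g'(s) = 2NLs - t = 0 \;&\Longleftrightarrow\; s^* = \frac{t}{2NL} \ge 0, \\
g(s^*) &= NL\,\frac{t^2}{4N^2L^2} - \frac{t^2}{2NL} = -\frac{t^2}{4NL}.
\end{align*}
% Substituting t = 2\sqrt{LN\tau}, with \tau \ge 0 a relabelled threshold, gives
\[
\exp\Bigl(-\frac{t^2}{4NL}\Bigr) = \exp\Bigl(-\frac{4LN\tau}{4NL}\Bigr) = \exp(-\tau),
\]
% which is the form of the tail bound stated in (7.2).
```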
Obviously, a similar inequality can be written for $\mathbb{E}X - X$ and therefore,
\[ \mathbb{P}(|X - \mathbb{E}X| \ge t) \le 2\exp\Bigl(-\frac{t^2}{4NL}\Bigr). \]
This is equivalent to (7.2).

The proof of Theorem 1.2 will be based on the concentration inequality of Lemma 7.1 and the following result.

Theorem 7.2. For $\varepsilon_N = N^{-1}$, there exists a set $\{u_1, \ldots, u_M\} \subseteq [d, D]$ of cardinality $M \le LN$ such that
\[ [d, D] \subseteq \bigcup_{i\le M}[u_i - \varepsilon_N, u_i + \varepsilon_N] \tag{7.6} \]
and such that for all $i \le M$,
\[ F_N(u_i, \varepsilon_N) \le \sup_{d\le u\le D}\mathcal{P}(\xi, u) + R, \tag{7.7} \]
where $|R| \le c_N + L\varepsilon_N$.

We will first show that Theorem 1.2 follows from Theorem 1.1 and Theorem 7.2.

Proof of Theorem 1.2. It follows from the definitions that for any $d \le u \le D$ and any sequence $(\varepsilon_N)$, $F_N(u, \varepsilon_N) \le F_N$ and, therefore, Theorem 1.1 implies
\[ \liminf_{N\to\infty} F_N \ge \sup_{d\le u\le D}\mathcal{P}(\xi, u). \]
To prove Theorem 1.2, we need to show that
\[ \limsup_{N\to\infty} F_N \le \sup_{d\le u\le D}\mathcal{P}(\xi, u). \]
Suppose not. Then for some $\varepsilon > 0$,
\[ \limsup_{N\to\infty} F_N > \sup_{d\le u\le D}\mathcal{P}(\xi, u) + \varepsilon. \]
To simplify the notations, instead of considering a subsequence of $N$, we simply assume that for $N$ large enough, we have
\[ F_N \ge \sup_{d\le u\le D}\mathcal{P}(\xi, u) + \varepsilon. \tag{7.8} \]
On the other hand, let $\varepsilon_N = N^{-1}$ and consider a set of points $\{u_1, \ldots, u_M\}$ as in Theorem 7.2, so that for $N$ large enough, for all $i \le M$,
\[ F_N(u_i, \varepsilon_N) \le \sup_{d\le u\le D}\mathcal{P}(\xi, u) + \frac{\varepsilon}{2}. \tag{7.9} \]
Lemma 7.1 implies that for any $t > 0$, with probability at least $1 - LN\exp(-Nt^2/4L)$, we have
\[ F_N \le \frac{1}{N}\log Z_N + t \quad\text{and}\quad \frac{1}{N}\log Z_N(u_i, \varepsilon_N) \le F_N(u_i, \varepsilon_N) + t \tag{7.10} \]
for all $i \le M$. Therefore, (7.8)–(7.10) imply that for $N$ large enough, with probability at least $1 - LN\exp(-Nt^2/4L)$, we have
\[ \frac{1}{N}\log Z_N(u_i, \varepsilon_N) \le F_N(u_i, \varepsilon_N) + t \le F_N + t - \frac{\varepsilon}{2} \le \frac{1}{N}\log Z_N + 2t - \frac{\varepsilon}{2}. \]
Taking $t = \varepsilon/K$, we get that for $N$ large enough, with probability at least $1 - LN\exp(-N\varepsilon^2/K)$, we have
\[ \forall\, i \le M, \quad \log Z_N(u_i, \varepsilon_N) \le \log Z_N - \frac{N\varepsilon}{K}. \]
This yields that for $i \le M$, the Gibbs measure satisfies
\[ G_N(\{R_{1,1} \in [u_i - \varepsilon_N, u_i + \varepsilon_N]\}) \le \exp\Bigl(-\frac{N\varepsilon}{K}\Bigr) \]
and therefore, by (7.6),
\[ 1 = G_N(\Sigma^N) = G_N(\{R_{1,1} \in [d, D]\}) \le LN\exp\Bigl(-\frac{N\varepsilon}{K}\Bigr), \]
which is impossible for large $N$.

The statement of Theorem 7.2 is intuitively obvious considering (5.6). We basically need to show that the term $u_N(\lambda)$ in (5.6) can be substituted by $u$ in a controlled manner. This will be based on three technical lemmas. To formulate the first lemma, it will be convenient to think of the pair $(m, q)$ in terms of the function $m = m(q)$ defined in (A.32). The following lemma is the analogue of a well known continuity property of $X_0(m, q, \lambda)$ with respect to the functional order parameter $m(q)$ in the SK model (the statement can be found in [2] and the proof is given in [14]) and, since its proof is exactly the same, we will not reproduce it here.

Lemma 7.3. For any $\lambda$ and for any functions $m(q)$ and $m'(q)$ defined by (A.32) and corresponding to pairs $(m, q)$ and $(m', q')$, we have
\[ |X_0(m, q, \lambda) - X_0(m', q', \lambda)| \le L\int_0^D |m(q) - m'(q)|\,dq, \tag{7.11} \]
where the constant $L$ depends on $\xi$ and $D$ only.

Next, we will describe several properties of the function $\mathcal{P}_k(m, q, \lambda, u)$ defined in (1.15). For any $k$, any vectors $m$, $q$ and any $d \le u \le D$, let $\lambda(u)$ be defined by (5.19).
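Lemma 7.3 measures how far apart two parameter pairs are through the $L^1$ distance between their functional order parameters. A minimal Python sketch of that distance is given below. It is only an illustration: the helper names `step_value` and `l1_distance` and the numerical values are hypothetical, and it assumes $q_0 = 0$ and $m(q) = 1$ to the right of $q_{k+1}$ (a convention suggested by the way (7.27) is obtained in the proof of Theorem 7.2, but an assumption here).

```python
from fractions import Fraction


def step_value(m, q, x):
    """Value at x of the step function m(q) from (A.32).

    m = (m_0, ..., m_k), q = (q_0, ..., q_{k+1}); m(x) = m_l on [q_l, q_{l+1}).
    Assumptions of this sketch: q_0 = 0 and m(x) = 1 to the right of q_{k+1}.
    """
    if len(q) != len(m) + 1:
        raise ValueError("q must have exactly one more entry than m")
    for l in range(len(m)):
        if q[l] <= x < q[l + 1]:
            return m[l]
    return 1


def l1_distance(m1, q1, m2, q2, D):
    """Integral over [0, D] of |m1(q) - m2(q)| dq for two step functions."""
    # Both functions are constant between consecutive breakpoints,
    # so it is enough to evaluate them at the midpoint of each piece.
    pts = sorted({p for p in list(q1) + list(q2) + [0, D] if 0 <= p <= D})
    total = 0
    for a, b in zip(pts, pts[1:]):
        mid = (a + b) / 2
        total += abs(step_value(m1, q1, mid) - step_value(m2, q2, mid)) * (b - a)
    return total


# Hypothetical example: (m', q') ends at u + eps, (m, q) ends at u,
# and the two order parameters agree to the left of u.
eps = Fraction(1, 100)
u = Fraction(1)
D = 2
m_prime = [Fraction(0), Fraction(1, 2)]
q_prime = [Fraction(0), Fraction(2, 5), u + eps]
m = list(m_prime)
q = [Fraction(0), Fraction(2, 5), u]

dist = l1_distance(m, q, m_prime, q_prime, D)
print(dist)
```

In this example the two functions differ only on the window between $u$ and $u + \varepsilon$, so the distance is at most $\varepsilon$, which is the kind of input Lemma 7.3 turns into a bound on $X_0$.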
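Lemma 7.4. For any $\delta > 0$, there exists a constant $\Lambda(\delta)$ such that for any vectors $m$, $q$ and any $u \in [d + \delta, D - \delta]$, we have
\[ |\lambda(u)| \le \Lambda(\delta). \tag{7.12} \]

Proof. For fixed $m$ and $q$, minimizing $\mathcal{P}_k(m, q, \lambda, u)$ over $\lambda$ is equivalent to minimizing $-\lambda u + X_0(m, q, \lambda)$ over $\lambda$. By Lemma 5.1, $-\lambda u + X_0(m, q, \lambda) \ge \lambda(a(\lambda) - u)$, where $a(\lambda)$ satisfies (5.9). Therefore, for any $\delta > 0$, the right-hand side goes to infinity uniformly over $d + \delta \le u \le D - \delta$ and, thus, the left-hand side goes to infinity uniformly over $m$, $q$ and $d + \delta \le u \le D - \delta$. This obviously implies that there exists $\Lambda(\delta)$ such that (7.12) holds.

Lemma 7.5. There exists $\delta > 0$ such that for all $m$ and $q$, we have
\[ \lambda(u) < 0 \ \text{ for } u \in [d, d + \delta), \qquad \lambda(u) > 0 \ \text{ for } u \in (D - \delta, D]. \tag{7.13} \]

Proof. Using (3.15) and (5.20),
\[ \frac{\partial X_0}{\partial\lambda} = \mathbb{E}W_1\cdots W_k\langle\sigma^2\rangle = u, \tag{7.14} \]
where for a function $h(\sigma)$, $\langle h\rangle$ is defined by
\[ \langle h\rangle\exp X_{k+1} = \int_\Sigma h(\sigma)\exp\Bigl(\sigma\sum_{p\le k}z_p + \lambda\sigma^2\Bigr)\,d\nu(\sigma). \tag{7.15} \]
Since $X_0$ is convex in $\lambda$, $\partial X_0/\partial\lambda$ is increasing in $\lambda$ and therefore, (7.14) implies that $\lambda = \lambda(u)$ is nondecreasing in $u$. Therefore, in order to prove (7.13), it is enough to show that for some $\delta > 0$, if $\lambda(u) = 0$ then $u \in [d + \delta, D - \delta]$. We will only prove that $\lambda(u) = 0$ implies that $u \ge d + \delta$, as the case $u \le D - \delta$ is quite similar. We set $\lambda = 0$ and denote $Z = \sum_{p\le k}z_p$. Given $\gamma > 0$, we can write
\[ \mathbb{E}W_1\cdots W_k\langle\sigma^2\rangle \ge \mathbb{E}W_1\cdots W_k\langle\sigma^2\rangle I(|Z| \le \gamma). \tag{7.16} \]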
First of all, let us show that if $|Z| \le \gamma$, then
\[ \langle\sigma^2\rangle \ge d + \frac{1}{L}\exp(-L\gamma), \tag{7.17} \]
where a constant $L$ does not depend on $\gamma$ (but it depends on other parameters of the model such as the measure $\nu$). To show this, let us first note that if $|Z| \le \gamma$, we have
\[ \exp X_{k+1} = \int_\Sigma \exp(\sigma Z)\,d\nu(\sigma) \le L\exp(L\gamma), \tag{7.18} \]
using (1.7). Next,
\[ \langle\sigma^2\rangle\exp X_{k+1} = \int_\Sigma \sigma^2\exp(\sigma Z)\,d\nu(\sigma) = d\exp X_{k+1} + \int_\Sigma(\sigma^2 - d)\exp(\sigma Z)\,d\nu(\sigma) \ge d\exp X_{k+1} + \frac{1}{L}\exp(-L\gamma), \tag{7.19} \]
where the last inequality is obtained by restricting the integral to the set $\{\sigma^2 \in [(D + d)/2, D]\}$, which has positive measure $\nu$ by (1.7). Combining (7.18) and (7.19) yields (7.17). Plugging (7.17) into (7.16) implies
\[ \mathbb{E}W_1\cdots W_k\langle\sigma^2\rangle \ge \Bigl(d + \frac{1}{L}\exp(-L\gamma)\Bigr)\mathbb{E}W_1\cdots W_k I(|Z| \le \gamma) = \Bigl(d + \frac{1}{L}\exp(-L\gamma)\Bigr)\bigl(1 - \mathbb{E}W_1\cdots W_k I(|Z| > \gamma)\bigr). \tag{7.20} \]
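Using Hölder's inequality, we can bound
\[ \mathbb{E}W_1\cdots W_k I(|Z| > \gamma) \le \mathbb{P}(|Z| > \gamma)^{1/2}\bigl(\mathbb{E}(W_1\cdots W_k)^2\bigr)^{1/2} \le L\exp\Bigl(-\frac{\gamma^2}{L}\Bigr)\bigl(\mathbb{E}(W_1\cdots W_k)^2\bigr)^{1/2}, \tag{7.21} \]
since $Z$ is a Gaussian r.v. with variance $\xi'(u)$ uniformly bounded for all $d \le u \le D$. As in (4.11), we can write
\[ W_1\cdots W_k = \exp X_{k+1}\prod_{l\le k}(\exp(-X_l))^{m_l - m_{l-1}} \]
and by Hölder's inequality,
\[ \mathbb{E}(W_1\cdots W_k)^2 \le \bigl(\mathbb{E}\exp(4X_{k+1})\bigr)^{1/2}\prod_{l\le k}\bigl(\mathbb{E}\exp(-4X_l)\bigr)^{(m_l - m_{l-1})/2}. \tag{7.22} \]
The first factor on the right-hand side can be estimated as follows,
\[ \mathbb{E}\exp(4X_{k+1}) = \mathbb{E}\Bigl(\int_\Sigma\exp(\sigma Z)\,d\nu(\sigma)\Bigr)^4 \le \mathbb{E}\int_\Sigma\exp(4\sigma Z)\,d\nu(\sigma) = \int_\Sigma\exp(8\sigma^2\xi'(u))\,d\nu(\sigma) \le L. \]
Next, since
\[ X_l = \frac{1}{m_l}\log\mathbb{E}_l\exp m_l X_{l+1} \ge \mathbb{E}_l X_{l+1}, \]
repeating this over $l$ yields that $X_l \ge \mathbb{E}_l X_{k+1}$. Therefore,
\[ \mathbb{E}\exp(-4X_l) \le \mathbb{E}\exp(-4\mathbb{E}_l X_{k+1}) \le \mathbb{E}\exp(-4X_{k+1}) = \mathbb{E}\Bigl(\int_\Sigma\exp(\sigma Z)\,d\nu(\sigma)\Bigr)^{-4} \le \mathbb{E}\int_\Sigma\exp(-4\sigma Z)\,d\nu(\sigma) = \int_\Sigma\exp(8\sigma^2\xi'(u))\,d\nu(\sigma) \le L. \]
Plugging all these estimates into (7.22) gives $\mathbb{E}(W_1\cdots W_k)^2 \le L$ and (7.21) implies
\[ \mathbb{E}W_1\cdots W_k I(|Z| > \gamma) \le L\exp\Bigl(-\frac{\gamma^2}{L}\Bigr). \]
Finally, (7.20) implies that
\[ u = \mathbb{E}W_1\cdots W_k\langle\sigma^2\rangle \ge \Bigl(d + \frac{1}{L}\exp(-L\gamma)\Bigr)\Bigl(1 - L\exp\Bigl(-\frac{\gamma^2}{L}\Bigr)\Bigr) \ge d + \frac{1}{L}\exp(-L\gamma) - L\exp\Bigl(-\frac{\gamma^2}{L}\Bigr). \]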
Taking $\gamma$ large enough gives $u \ge d + \delta$ for some $\delta > 0$. As we have already mentioned above, one can similarly show that $\lambda = 0$ implies that $u \le D - \delta$ for some $\delta > 0$, and this finishes the proof of the lemma.

We are now ready to prove Theorem 7.2.

Proof of Theorem 7.2. Let us start with Eq. (5.6), which states that for any $\lambda$, $m$ and $q$ such that $q_{k+1} = u$,
\[ F_N(u, \varepsilon_N) \le -\lambda u_N(\lambda) + X_0(m, q, \lambda) - \frac{1}{2}\sum_{1\le l\le k}m_l(\theta(q_{l+1}) - \theta(q_l)) + R, \tag{7.23} \]
where $u_N(\lambda)$ is defined in (5.3) and $|R| \le c_N + L\varepsilon_N$. Let us note that the definition (5.3) implies that
\[ |u_N(\lambda) - u| \le \varepsilon_N. \tag{7.24} \]
Let us take $\delta > 0$ as in Lemma 7.5. Lemma 7.4 implies that for all $d + \delta \le u \le D - \delta$, we have $|\lambda(u)| \le \Lambda(\delta)$. Moreover, (7.23) and (7.24) imply that for any $d + \delta \le u \le D - \delta$ and $|\lambda| \le \Lambda(\delta)$, we have
\[ F_N(u, \varepsilon_N) \le -\lambda u + X_0(m, q, \lambda) - \frac{1}{2}\sum_{1\le l\le k}m_l(\theta(q_{l+1}) - \theta(q_l)) + \Lambda(\delta)\varepsilon_N + R = \mathcal{P}_k(m, q, \lambda, u) + R, \tag{7.25} \]
where again $|R| \le c_N + L\varepsilon_N$. The fact that for $d + \delta \le u \le D - \delta$ we have $|\lambda(u)| \le \Lambda(\delta)$ means in this case that, for a fixed $(m, q)$, minimizing $\mathcal{P}_k(m, q, \lambda, u)$ over $\lambda$ is equivalent to minimizing it over $|\lambda| \le \Lambda(\delta)$ and therefore, minimizing $\mathcal{P}_k(m, q, \lambda, u)$ over $(m, q, \lambda)$ is also equivalent to minimizing it over $m$, $q$ and $|\lambda| \le \Lambda(\delta)$. Therefore, (7.25) implies that for $d + \delta \le u \le D - \delta$,
\[ F_N(u, \varepsilon_N) \le \mathcal{P}(\xi, u) + R \le \sup_{d\le u\le D}\mathcal{P}(\xi, u) + R, \tag{7.26} \]
where $|R| \le c_N + L\varepsilon_N$.

Next, let $u$ be such that $u + \varepsilon_N < d + \delta$. Let us consider arbitrary $k' \ge 1$ and arbitrary $m'$ and $q'$ such that $q'_{k'+1} = u_N(\lambda)$. By (7.24), there exist $m$ and $q$, maybe with a different parameter $k$, such that $q_{k+1} = u$ and
\[ \int|m(q) - m'(q)|\,dq \le |u - u_N(\lambda)| \le \varepsilon_N, \tag{7.27} \]
where the functions $m(q)$ and $m'(q)$ are defined in (A.32) in terms of $(m, q)$ and $(m', q')$ correspondingly. This can be achieved by simply assigning $m(q) = 1$ for $q$ between $u$ and $u_N(\lambda)$ and, otherwise, letting $m(q) = m'(q)$. Then Lemma 7.3 implies that $|X_0(m, q, \lambda) - X_0(m', q', \lambda)| \le L\varepsilon_N$. Also, obviously, condition (7.27) implies that
\[ \Bigl|\frac{1}{2}\sum_{1\le l\le k}m_l(\theta(q_{l+1}) - \theta(q_l)) - \frac{1}{2}\sum_{1\le l\le k'}m'_l(\theta(q'_{l+1}) - \theta(q'_l))\Bigr| \le L\varepsilon_N. \]
Therefore, (7.23) implies that for any $k'$, $m'$ and $q'$ such that $q'_{k'+1} = u_N(\lambda)$,
\[ F_N(u, \varepsilon_N) \le -\lambda u_N(\lambda) + X_0(m', q', \lambda) - \frac{1}{2}\sum_{1\le l\le k'}m'_l(\theta(q'_{l+1}) - \theta(q'_l)) + R = \mathcal{P}_{k'}(m', q', \lambda, u_N(\lambda)) + R, \tag{7.28} \]
where again $|R| \le c_N + L\varepsilon_N$. Since now $u + \varepsilon_N < d + \delta < D$, for $\lambda < 0$ the definition (5.3) implies that $u_N(\lambda) = u + \varepsilon_N$ and therefore, for $\lambda < 0$, we have
\[ F_N(u, \varepsilon_N) \le \mathcal{P}_{k'}(m', q', \lambda, u + \varepsilon_N) + R. \tag{7.29} \]
Since $u + \varepsilon_N < d + \delta$, Lemma 7.5 implies that for any $m'$ and $q'$, $\lambda(u) < 0$, which means that for fixed $(m', q')$ minimizing $\mathcal{P}_{k'}(m', q', \lambda, u + \varepsilon_N)$ over $\lambda$ is equivalent to minimizing it over $\lambda < 0$ and therefore, minimizing $\mathcal{P}_{k'}(m', q', \lambda, u + \varepsilon_N)$ over $(m', q', \lambda)$ is also equivalent to minimizing it over $m'$, $q'$ and $\lambda < 0$. Hence, (7.29) yields that if $u + \varepsilon_N < d + \delta$, then
\[ F_N(u, \varepsilon_N) \le \mathcal{P}(\xi, u + \varepsilon_N) + R \le \sup_{d\le u\le D}\mathcal{P}(\xi, u) + R. \tag{7.30} \]
Similarly, one can show that this holds for $u$ such that $u - \varepsilon_N > D - \delta$ and, combining this with (7.26), we showed that for all $u$ in the set
\[ \{u + \varepsilon_N < d + \delta\} \cup \{d + \delta \le u \le D - \delta\} \cup \{u - \varepsilon_N > D - \delta\} \tag{7.31} \]
we have
\[ F_N(u, \varepsilon_N) \le \sup_{d\le u\le D}\mathcal{P}(\xi, u) + R. \]
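The set (7.31) leaves out only two short windows of length $\varepsilon_N$ next to $d + \delta$ and $D - \delta$, so an $\varepsilon_N$-net of $[d, D]$ with $O(N)$ points can still be chosen inside it. The Python sketch below illustrates one way to do this; it is not taken from the paper. The function names and the numerical values are hypothetical, and exact rational arithmetic is used so that boundary cases are not blurred by rounding.

```python
from fractions import Fraction


def in_set_731(u, d, D, delta, eps):
    """Membership of u in the set (7.31), restricted to [d, D]."""
    if not (d <= u <= D):
        return False
    return (u + eps < d + delta) or (d + delta <= u <= D - delta) or (u - eps > D - delta)


def build_net(d, D, delta, eps):
    """Grid of mesh eps on [d, D] plus the points D, d + delta and D - delta,
    keeping only the points that belong to (7.31)."""
    steps = int((D - d) / eps)
    grid = [d + j * eps for j in range(steps + 1)] + [D, d + delta, D - delta]
    return sorted({u for u in grid if in_set_731(u, d, D, delta, eps)})


def covers(net, d, D, eps):
    """True if the intervals [u - eps, u + eps], u in net, cover [d, D]."""
    if not net or net[0] - eps > d or net[-1] + eps < D:
        return False
    # Consecutive net points may be at most 2 * eps apart.
    return all(b - a <= 2 * eps for a, b in zip(net, net[1:]))


# Hypothetical parameters: d = 1, D = 3, delta = 1/4, eps_N = 1/N with N = 50.
N = 50
d, D, delta = Fraction(1), Fraction(3), Fraction(1, 4)
eps = Fraction(1, N)

net = build_net(d, D, delta, eps)
print(len(net), covers(net, d, D, eps))
```

Adding $d + \delta$ and $D - \delta$ to the grid is what bridges the two excluded windows; the resulting net has at most $(D - d)N + 3$ points, in line with the cardinality bound $LN$.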
It is obvious that for $\varepsilon_N = N^{-1}$, the set (7.31) contains an $\varepsilon_N$-net of the interval $[d, D]$ of cardinality at most $LN$. This finishes the proof of Theorem 7.2.

Acknowledgment

I would like to thank David Sherrington for suggesting the topic of this research.

Appendix A. Proof of the A Priori Estimates

We will prove the a priori estimates of Sec. 6 in several steps. In Appendix A.1, we obtain the analogue of Talagrand's interpolation for two copies of the system, which is the main technical tool of the proof. In Appendix A.2, we summarize several properties of the parameters in the definition of the Parisi formula in (1.15) and (1.16) and, in Appendices A.3 and A.4, we prove the a priori estimates of Sec. 6 by considering two separate cases of "far" points and "close" points.

A.1. Talagrand's interpolation for two copies

The key to proving the a priori estimate of Sec. 6 is Talagrand's interpolation for two copies of the system. The proof of this result is similar to the proof of Guerra's interpolation in Theorem 4.2. Once Talagrand's interpolation is obtained, the arguments in the rest of the paper will be adapted from [11] with some necessary modifications. For $v \in [-D, D]$, let $v = \eta|v|$ for $\eta = \pm 1$. Consider $\kappa \ge 1$ and consider a sequence
\[ n:\ 0 = n_0 \le n_1 \le \cdots \le n_\kappa = 1 \tag{A.1} \]
such that $n_\tau = |v|$ for some $\tau \le \kappa$. Consider a sequence
\[ \rho:\ 0 = \rho_0 \le \cdots \le \rho_{\kappa+1} = u. \tag{A.2} \]
Consider a sequence of pairs of random variables $(Y_p^1, Y_p^2)$, independent for $0 \le p \le \kappa$, such that
\[ \mathbb{E}(Y_p^1)^2 = \mathbb{E}(Y_p^2)^2 = t(\xi'(\rho_{p+1}) - \xi'(\rho_p)) \tag{A.3} \]
and such that
\[ Y_p^1 = \eta Y_p^2 \ \text{ for } p < \tau \quad\text{and}\quad Y_p^1, Y_p^2 \ \text{are independent for } p \ge \tau. \tag{A.4} \]
Let $(Z_p^1, Z_p^2)$ be an arbitrary sequence of independent vectors for $0 \le p \le \kappa$. Let $(Z_{i,p}^1, Z_{i,p}^2)$ and $(Y_{i,p}^1, Y_{i,p}^2)$ be independent copies for $i \le N$ of the sequences $(Z_p^1, Z_p^2)$ and $(Y_p^1, Y_p^2)$, and we assume that they are independent of each other and of the randomness in the Hamiltonian $H_N(\sigma)$. For $s \in [0, 1]$, let us define an interpolating Hamiltonian
\[ H_s(\sigma^1, \sigma^2) = \sqrt{st}\,H_N(\sigma^1) + \sqrt{st}\,H_N(\sigma^2) + \sum_{j\le 2}\sum_{i\le N}\sigma_i^j\sum_{p\le\kappa}\bigl(Z_{i,p}^j + \sqrt{1-s}\,Y_{i,p}^j\bigr) \tag{A.5} \]
and consider
\[ F = \log\int_{A(v)}\exp H_s(\sigma^1, \sigma^2)\,d\nu(\sigma^1)\,d\nu(\sigma^2), \tag{A.6} \]
where $A(v)$ was defined in (6.14). Let
\[ \chi(s) = \frac{1}{N}\mathbb{E}\mathcal{P}(n)F. \tag{A.7} \]

Theorem A.1. For $s \in [0, 1]$, we have
\[ \chi'(s) \le -2t\sum_{l<\tau}n_l(\theta(\rho_{l+1}) - \theta(\rho_l)) - t\sum_{l\ge\tau}n_l(\theta(\rho_{l+1}) - \theta(\rho_l)) + R, \tag{A.8} \]
where $|R| \le K(\varepsilon_N + c(N))$.

Proof. The argument is similar to Guerra's interpolation in Theorem 4.2. (3.15) implies that
\[ \chi'(s) = \frac{1}{N}\mathbb{E}W_1\cdots W_\kappa\frac{\partial F}{\partial s}, \]
where $W_l = \exp n_l(F_{l+1} - F_l)$ and where $F_l$ are defined as in (3.2). Using the fact that $n_\kappa = 1$, one can write
\[ W_1\cdots W_\kappa = \exp\sum_{1\le l\le\kappa}n_l(F_{l+1} - F_l) = \exp\Bigl(F + \sum_{1\le l\le\kappa}(n_{l-1} - n_l)F_l\Bigr) = T\exp F, \tag{A.9} \]
where $T = T_1\cdots T_\kappa$ and $T_l = \exp(n_{l-1} - n_l)F_l$. Using the definition of $F$ in (A.6),
\[ \chi'(s) = \frac{1}{N}\mathbb{E}T\exp F\,\frac{\partial F}{\partial s} = \mathrm{I} - \mathrm{II}, \]
where
\[ \mathrm{I} = \frac{\sqrt{t}}{2N\sqrt{s}}\,\mathbb{E}T\int_{A(v)}(H_N(\sigma^1) + H_N(\sigma^2))\exp H_s(\sigma^1, \sigma^2)\,d\nu(\sigma^1)\,d\nu(\sigma^2), \tag{A.10} \]
and
\[ \mathrm{II} = \frac{1}{2N\sqrt{1-s}}\,\mathbb{E}T\int_{A(v)}\sum_{j\le 2}\sum_{i\le N}\sigma_i^j\sum_{p\le\kappa}Y_{i,p}^j\exp H_s(\sigma^1, \sigma^2)\,d\nu(\sigma^1)\,d\nu(\sigma^2). \tag{A.11} \]
To simplify the notations, let us denote σ = (σ 1 , σ 2 ) and ρ = (ρ1 , ρ2 ), let dν(σ) = dν(σ 1 ) dν(σ 2 ) and define 1 E(HN (σ 1 ) + HN (σ 2 ))(HN (ρ1 ) + HN (ρ2 )) N = ζ(σ 1 , ρ1 ) + ζ(σ 1 , ρ2 ) + ζ(σ 2 , ρ1 ) + ζ(σ 2 , ρ2 )
z(σ, ρ) =
(A.12)
where ζ was defined in (4.14). To compute I, we will use (4.9) for the family g(ρ) = HN (ρ1 ) + HN (ρ2 ) for ρ = (ρ1 , ρ2 ) ∈ A(v) and we will think of each factor Tl in T as the functional of g(ρ). Then (4.9) and (A.12) imply, √ t ∂ exp Hs (σ) z(σ, σ) dν(σ) I= √ ET ∂g(σ) 2 s A(v) √ t 1 δTl + √ [z(σ, ρ)] dν(σ). (A.13) ET exp Hs (σ) 2 s T l δg A(v) l≤κ
First of all, ∂ exp Hs (σ) √ = st exp Hs (σ), ∂g(σ) and therefore, the first line in (A.13) can be written as
t t ET exp Hs (σ)z(σ, σ) dν(σ) = EW1 · · · Wκ z(σ, σ) , 2 A(v) 2 where
z(σ, σ) = exp(−F )
(A.14)
z(σ, σ) exp Hs (σ) dν(σ). A(v)
Using the definition of Tl , we can write δFl δF 1 δTl [z(σ, ρ)] = (nl−1 − nl ) [z(σ, ρ)] = (nl−1 − nl )El Wl · · · Wκ [z(σ, ρ)], Tl δg δg δg where we used (3.16). We have √ δF [z(σ, ρ)] = st exp(−F ) δg
z(σ, ρ) exp Hs (ρ) dν(ρ) =:
√
st z(σ, ρ) ,
A(v)
where · denotes the Gibbs average with respect to ρ for a fixed σ. Therefore, for a fixed σ, we get √ √
1 δTl [z(σ, ρ)] = st(nl−1 − nl )El Wl · · · Wκ z(σ, ρ) =: st(nl−1 − nl )γl (z(σ, ρ)), Tl δg
where γl here is defined by analogy with (4.5). Combining this with the fact that T exp Hs (σ) dν(σ) = W1 · · · Wκ exp(−F ) exp Hs (σ) dν(σ), we can write the second line in (A.13) as t (nl−1 − nl )EW1 · · · Wκ exp(−F ) γl (z(σ, ρ)) dν(σ) 2 A(v) l≤κ
=
1 t (nl−1 − nl )EW1 · · · Wl−1 γl⊗2 (z(σ, ρ)) = (nl−1 − nl )µl (z(σ, ρ)), 2 2 l≤κ
l≤κ
(A.15) where µl here is defined analogously to (4.6). Combining (A.14) and (A.15), we get
t t I = EW1 · · · Wκ z(σ, σ) + (nl−1 − nl )µl (z(σ, ρ)). (A.16) 2 2 l≤κ
Let us denote by Rj,j the overlap of σ j and σ j and, by Rj,j the overlap of σ j and ρj . From (1.1) and (A.12), t t ξ(Rj,j ) + (nl−1 − nl )µl ξ(Rj,j ) + R, I = EW1 · · · Wκ 2 2 j,j ≤2
j,j ≤2
l≤k
(A.17) where |R| ≤ 4c(N ). Since the average · is over the set A(v), we have |Rj,j − u| ≤ εN
and |R1,2 − v| ≤
Therefore, t I = t(ξ(u) + ξ(v)) + (nl−1 − nl )µl 2
l≤k
1 . N
(A.18)
ξ(Rj,j )
+ R,
(A.19)
j,j ≤
where |R| ≤ K(εN + c(N )). The computation of II is very similar. For p ≤ κ, let us define j j σi Yi,p gp (σ) = so that II =
j≤2 i≤N
p≤κ
II(p) where
1 √ II(p) = 2N 1 − s
A(v)
ET gp (σ) exp Hs (σ) dν(σ).
Let us define zp (σ, ρ) =
1 Egp (σ)gp (ρ) = Rj,j EYpj Ypj . N j,j ≤2
(A.20)
Then one can repeat the computations leading to (A.16) with one important difference. One needs to note that Fl does not depend on the r.v. Yi,p for l ≤ p and, as a result, the summation in the second term will be over p < l ≤ κ, i.e. II(p) =
1 1 EW1 · · · Wκ zp (σ, σ) + (nl−1 − nl )µl (zp (σ, ρ)) 2 2
(A.21)
p
and 1 II = EW1 · · · Wκ 2
zp (σ, σ)
p≤κ
1 + (nl−1 − nl )µl 2
l≤κ
zp (σ, ρ) .
(A.22)
p
Using (A.18) and (A.3), it is easy to see that the first term on the right-hand side is tuξ (u) + t|v|ξ (|v|) + R. Using (A.3) and (A.4),
E(Ypj )2 = tξ (ql ) and
p
EYp1 Yp2 = tηξ (ql∧τ ) = tξ (ηql∧τ ),
p
where in the last equality, we used the fact that ξ is an even function. If we define qlj,j = ql then
EYpj Ypj = tξ (qlj,j ) and (A.22) can be written as
1 II = tuξ (u) + t|v|ξ (|v|) + (nl−1 − nl )µl 2
(A.23)
p
and ql1,2 = ηql∧τ ,
l≤κ
R
j,j
j,j ≤2
ξ
(qlj,j )
+ R. (A.24)
Since θ(x) = xξ (x) − ξ(x) is also even, (A.19) and (A.24) imply that χ (s) = I − II = −tθ(u) − tθ(v) +
t (nl − nl−1 ) θ(qlj,j ) + R 2 l≤κ
j,j ≤2
1 j,j j,j j,j j,j − (nl − nl−1 )µl (ξ(R ) − R ξ (ql ) + θ(ql )) 2 l≤κ j,j ≤2 t ≤ −tθ(u) − tθ(v) + (nl − nl−1 ) θ(qlj,j ) + R 2 j,j ≤2 l≤κ t j,j j,j θ(qκ+1 )− nl (θ(ql+1 ) − θ(qlj,j )) + R = −tθ(u) − tθ(v) + 2 j,j ≤2 j,j ≤2 l≤κ t j,j =− nl (θ(ql+1 ) − θ(qlj,j )) + R, 2
j,j ≤2 l≤κ
j,j 1,2 since qκ+1 = u and qκ+1 = |v|. To finish the proof, it remains to use (A.23). We have j,j θ ql+1 − θ qlj,j = 2(θ(ql+1 ) − θ(ql )) for l < τ j,j ≤2
and
j,j (θ(ql+1 ) − θ(qlj,j )) = θ(ql+1 ) − θ(ql )
for l ≥ τ,
j,j ≤2
1,2 since for l ≥ τ, θ(ql+1 ) − θ(ql1,2 ) = θ(qτ ) − θ(qτ ) = 0.
We can bound χ(0) as follows. Using (A.18), for any λ, γ ∈ R, exp F |s=0 ≤ −2N λu − N γv + 2N |λ|εN + |γ| + log ×
(ΣN )2
j
σi
j≤2
j j (Zi,p + Yi,p )+
p≤κ
i≤N
λ(σij )2 + γσi1 σi2 dν(σ 1 ) dν(σ 2 )
j≤2
= −2N λu − N γv + 2N |λ|εN + |γ| +
Fi (λ, γ),
i≤N
where Fi (λ, γ) are independent copies of F (λ, γ)
exp
= log Σ×Σ
j≤2
σj
(Zpj
+
p≤κ
Ypj )
+
2
λ(σj ) + γσ1 σ2 dν(σ1 ) dν(σ2 ).
j≤2
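(A.25)

(3.5), (3.6) and (3.7) now imply that
\[ \chi(0) \le -2\lambda u - \gamma v + 2|\lambda|\varepsilon_N + |\gamma|N^{-1} + \mathcal{P}(n)F(\lambda, \gamma) \tag{A.26} \]
and we obtained the following.

Corollary A.2. If $\chi(s)$ and $F(\lambda, \gamma)$ are defined by (A.7) and (A.25), then for all $\lambda, \gamma \in \mathbb{R}$,
\[ \chi(1) \le -2\lambda u - \gamma v + \mathcal{P}(n)F(\lambda, \gamma) - 2t\sum_{l<\tau}n_l(\theta(\rho_{l+1}) - \theta(\rho_l)) - t\sum_{l\ge\tau}n_l(\theta(\rho_{l+1}) - \theta(\rho_l)) + R, \tag{A.27} \]
where $|R| \le K(\varepsilon_N + |\lambda|\varepsilon_N + |\gamma|N^{-1} + c(N))$.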
Proof. The proof follows immediately by combining (A.8) and (A.26).

Remark A.3. The remainder $R$ in Theorem 6.3 will be a result of application of (A.27). We will use it for $\lambda = \lambda(u)$ defined in (5.20) and, as in the proof of Lemma 5.2, $|\lambda(u)| \le \Lambda$ for some $\Lambda$ that depends only on $\xi$, $\nu$ and $u$. Below we will use (A.27) for $|\gamma| \le L$ for some constant $L$ independent of $N$, $t$ and $v$. As a result, the remainder in Theorem 6.3 satisfies $|R| \le a_N$ for $a_N = K(\varepsilon_N + c(N) + N^{-1})$.

A.2. Properties of ε-minimizer

In this section, we will describe several properties of the sequence $(k, m, q, \lambda)$ in (6.13) that follow from its definition. [11, Sec. 4] describes these properties in a very general setting, with no reference to the classical SK model, and all the computations there apply to our case. The only difference is that in [11], certain generic computations were applied to the function $\log\operatorname{ch}x$ and here we will apply them to the function
\[ x \mapsto \log\int_\Sigma\exp(\sigma x + \lambda\sigma^2)\,d\nu(\sigma). \tag{A.28} \]
We will not reproduce some of the generic computations in [11] that directly apply to our case. Perturbing the sequences m and q. Let qr−1 ≤ a ≤ qr ,
b = ξ (a) and mr−1 ≤ m ≤ mr
and define new sequences m , q by inserting m and a into m and q. Let us consider a )− sequence (zp )p≤k+1 of independent random variables such that E(zp )2 = ξ (qp+1 ξ (qp ) or, expressing this explicitly in terms of b, E(zp )2 = ξ (qp+1 ) − ξ (qp ) )2 E(zr−1 E(zp+1 )2
= b − ξ (qr−1 ),
= ξ (qp+1 ) − ξ (qp )
Let F = log Σ
expσ
for p < r − 1,
E(zr )2
= ξ (qr ) − b,
for r ≤ p ≤ k.
zp + λσ 2 dν(σ)
p≤k+1
and consider functions T (m, b) = P(m )F and Φ(m, a) = −λu + T (m, ξ (a)) −
(A.29)
1 ml (θ(ql+1 ) − θ(ql )) 2 l≤k
1 − (m − mr−1 )(θ(qr ) − θ(a)). 2
(A.30)
Fig. 3. Perturbing the ε-minimizer.
Comparing with the definition (1.15), it is clear that T (m, ξ (a)) = X0 (m , q , λ)
and Φ(m, a) = Pk+1 (m , q , λ, u)
(A.31)
and thus, T and Φ describe the behavior of X0 , Pk when we perturb the set of parameters by adding an extra point. It will be very convenient to note that the functionals X0 and Pk depend on the sequences m, q only through the function m(q) defined by m(q) = ml
for ql ≤ q ≤ ql+1 ,
(A.32)
which is called the functional order parameter. Therefore, inserting parameters m and a can be visualized as the perturbation of m(q), as shown in Fig. 3. From this point of view, it becomes obvious that for all m and a as above, T (mr−1 , b) = T (m, ξ (qr )) = X0 (m, q, λ),
(A.33)
Φ(mr−1 , a) = Φ(m, qr ) = Pk (m, q, λ, u)
(A.34)
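and thus, it is very important to study the behavior of the derivatives of $T$, $\Phi$ in $m$ and $a$ at $m = m_{r-1}$ and $a = q_r$.

Properties of the derivatives. Let us define
\[ U(b) = 2\,\frac{\partial T}{\partial m}(m, b)\Big|_{m=m_{r-1}} \tag{A.35} \]
and
\[ f(a) = \frac{\partial\Phi}{\partial m}(m, a)\Big|_{m=m_{r-1}} = \frac{1}{2}U(\xi'(a)) - \frac{1}{2}(\theta(q_r) - \theta(a)). \tag{A.36} \]
The first fundamental formula is
\[ \frac{\partial T}{\partial b}(m, b)\Big|_{b=\xi'(q_r)} = -\frac{1}{2}(m - m_{r-1})A, \tag{A.37} \]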
where $A$ is independent of $m$. This can be obtained by a straightforward computation and one can write down an explicit formula for $A$ (see [11]), but we will omit it here. By definition of the ε-minimizer (6.13), $(m, q, \lambda)$ is the minimizer of $\mathcal{P}_k(m, q, \lambda, u)$ and therefore,
\[ \frac{\partial\mathcal{P}_k}{\partial q_r} = 0 \;\Rightarrow\; \frac{\partial X_0}{\partial q_r} + \frac{1}{2}(m_r - m_{r-1})q_r\xi''(q_r) = 0, \tag{A.38} \]
where we used (1.15) and the fact that $\theta'(q_r) = q_r\xi''(q_r)$. When $m = m_r$, it is apparent from Fig. 3 that the intervals $[a, q_r]$ and $[q_r, q_{r+1}]$ are glued together in the sense that all functionals defined above become independent of $q_r$ and, in particular, using (A.31),
\[ T(m_r, \xi'(a)) = X_0(m', q', \lambda)|_{m=m_r} = X_0(m, q, \lambda)|_{q_r=a}. \tag{A.39} \]
Taking the derivative of both sides with respect to $a$ at $a = q_r$ gives
\[ \frac{\partial X_0}{\partial q_r}(m, q, \lambda) = \frac{\partial T}{\partial a}(m_r, \xi'(a))\Big|_{a=q_r} = \xi''(q_r)\frac{\partial T}{\partial b}(m_r, b)\Big|_{b=\xi'(q_r)} = -\frac{1}{2}(m_r - m_{r-1})\xi''(q_r)A, \]
where the last equality follows from (A.37) for $m = m_r$. Comparing this with (A.38) implies the first consequence of (6.13),
\[ A = q_r. \tag{A.40} \]
Next, (A.35) and (A.37) imply that
\[ \frac{1}{2}U'(\xi'(q_r)) = \frac{\partial}{\partial b}\Bigl[\frac{\partial T}{\partial m}(m, b)\Big|_{m=m_{r-1}}\Bigr]_{b=\xi'(q_r)} = \frac{\partial}{\partial m}\Bigl[\frac{\partial T}{\partial b}(m, b)\Big|_{b=\xi'(q_r)}\Bigr]_{m=m_{r-1}} = -\frac{1}{2}A, \]
where, given any doubt, equality in the middle can be checked by computing both sides. Therefore,
\[ -U'(\xi'(q_r)) = A = q_r. \tag{A.41} \]
Another crucial property of $U(b)$ that can be verified by straightforward computation is
\[ U''(b) \le 0, \tag{A.42} \]
i.e. $U(b)$ is concave in $b$. Next, let us describe several properties of $f(a)$ in (A.36). We have
\[ f(q_r) = f'(q_r) = 0, \qquad f(q_{r-1}) \ge 0. \tag{A.43} \]
The first one follows from
\[ f(q_r) = \frac{1}{2}U(\xi'(q_r)) = \frac{\partial T}{\partial m}(m, \xi'(q_r))\Big|_{m=m_{r-1}} = 0, \]
since (A.33) yields that $T(m, \xi'(q_r))$ does not depend on $m$. The second one follows from (A.36) and (A.41) since
\[ f'(q_r) = \frac{1}{2}\xi''(q_r)(U'(\xi'(q_r)) + q_r) = 0. \]
To show that $f(q_{r-1}) \ge 0$, let us note that $\Phi(m, q_{r-1}) = \mathcal{P}_k(m, q, \lambda, u)|_{m_{r-1}=m}$, since setting $a = q_{r-1}$ simply replaces $m_{r-1}$ with $m$ in the definition of $\mathcal{P}_k$, which is also apparent from Fig. 3. If $m_{r-1} > 0$, then as in (A.38),
\[ \frac{\partial\mathcal{P}_k}{\partial m_{r-1}} = 0 \;\Rightarrow\; f(q_{r-1}) = \frac{\partial\Phi}{\partial m}(m, q_{r-1})\Big|_{m=m_{r-1}} = 0. \]
If $m_{r-1} = 0$, then since $(m, q, \lambda)$ is the minimizer of $\mathcal{P}_k$, slightly increasing $m_{r-1}$ should not decrease $\mathcal{P}_k$ and therefore, the right derivative $\partial\mathcal{P}_k/\partial m_{r-1} \ge 0$ and $f(q_{r-1}) \ge 0$.

ε-dependent properties of the derivatives. So far we have only utilized the fact that $(m, q, \lambda)$ is the minimizer of $\mathcal{P}_k$ and we have not used the condition in (6.13) that $\mathcal{P}_k(m, q, \lambda, u) \le \mathcal{P}(\xi, u) + \varepsilon$. In particular, this implies that for any $m$ and $a$,
\[ \Phi(m_{r-1}, a) = \mathcal{P}_k(m, q, \lambda, u) \le \mathcal{P}(\xi, u) + \varepsilon \le \Phi(m, a) + \varepsilon, \tag{A.44} \]
which means that we cannot decrease $\mathcal{P}_k$ much by varying the parameters $m$, $a$. This can be combined with the following fact that plays a central role:

All derivatives of $T$, $\Phi$, $U$, $f$ with respect to $a$, $b$, $m$ are bounded by constants that depend only on $\xi$, $\nu$, $u$ but not on $(k, m, q)$. (A.45)

Let $L$ denote such constants that depend only on $\xi$, $\nu$, $u$. The proof of (A.45) in [11] relied on the fact that $\mathbb{E}\operatorname{ch}(c + c'z)^L \le L$ for a standard Gaussian $z$ and $|c|, |c'| \le L$.
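In our case, this condition will be replaced by an obvious condition,
\[ \mathbb{E}\Bigl(\int_\Sigma\exp(\sigma cz + \lambda\sigma^2)\,d\nu(\sigma)\Bigr)^L \le L \quad\text{for } |c|, |\lambda| \le L. \]
The following lemma holds.

Lemma A.4. The function $f(a)$ in (A.36) satisfies
\[ f(a) \ge -L\sqrt{\varepsilon} \tag{A.46} \]
and
\[ f''(q_r) = \frac{1}{2}\xi''(q_r)\bigl(\xi''(q_r)U''(\xi'(q_r)) + 1\bigr) \ge -L\varepsilon^{1/6}. \tag{A.47} \]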
Proof. (A.46) holds if $f(a) \ge 0$, so we can assume that $f(a) < 0$. Using (A.36) and (A.45), we can write
\[ \Phi(m, a) \le \Phi(m_{r-1}, a) + (m - m_{r-1})f(a) + L(m - m_{r-1})^2. \tag{A.48} \]
By (A.34) and the fact that $(m, q, \lambda)$ is a minimizer of $\mathcal{P}_k$,
\[ \Phi(m_{r-1}, a) = \mathcal{P}_k(m, q, \lambda, u) \le \mathcal{P}_k(m, q, \lambda, u)|_{q_r=a} = \Phi(m_r, a), \]
and the last equality can be seen in (A.39). Therefore, (A.48) with $m = m_r$ implies that $(m_r - m_{r-1})f(a) + L(m_r - m_{r-1})^2 \ge 0$ and therefore,
\[ m_r \ge m_{r-1} - \frac{f(a)}{L} \ge m_{r-1}, \tag{A.49} \]
where the second inequality follows from our assumption that $f(a) < 0$. (A.44) and (A.48) imply that for any $m$ and $a$,
\[ -\varepsilon \le (m - m_{r-1})f(a) + L(m - m_{r-1})^2. \]
Taking $m := m_{r-1} - f(a)/2L$, which belongs to $[m_{r-1}, m_r]$ by (A.49), implies $-\varepsilon \le -f(a)^2/4L$ and this proves (A.46).

It remains to prove (A.47). From (A.36) and (A.41), we see that
\[ f''(q_r) = \frac{1}{2}\xi'''(q_r)\bigl(U'(\xi'(q_r)) + q_r\bigr) + \frac{1}{2}\xi''(q_r)\bigl(U''(\xi'(q_r))\xi''(q_r) + 1\bigr) = \frac{1}{2}\xi''(q_r)\bigl(U''(\xi'(q_r))\xi''(q_r) + 1\bigr). \]
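For the record, the choice of $m$ that gave (A.46) is simply the minimizer of a quadratic. The computation is written out below under the standing assumption $f(a) < 0$ and $L > 0$; the shorthand $x$, $h$ and $x^*$ is introduced here only for illustration.

```latex
% Shorthand (for illustration only): x = m - m_{r-1}, with f(a) < 0 and L > 0.
\begin{align*}
h(x) &= x f(a) + L x^2, \qquad
h'(x) = f(a) + 2Lx = 0 \;\Longleftrightarrow\; x^* = -\frac{f(a)}{2L} > 0, \\
h(x^*) &= -\frac{f(a)^2}{2L} + \frac{f(a)^2}{4L} = -\frac{f(a)^2}{4L}.
\end{align*}
% Hence -\varepsilon \le h(x^*) gives
\[
f(a)^2 \le 4L\varepsilon
\quad\Longrightarrow\quad
f(a) \ge -2\sqrt{L\varepsilon},
\]
% which is (A.46) once the numerical factor is absorbed into the generic constant L.
```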
(A.45) and (A.46) imply that for any $a$,
\[ -L\sqrt{\varepsilon} \le f(a) \le \frac{1}{2}f''(q_r)(a - q_r)^2 + L|a - q_r|^3. \tag{A.50} \]
Using (A.43), we have
\[ 0 \le f(q_{r-1}) \le \frac{1}{2}f''(q_r)(q_{r-1} - q_r)^2 + L|q_{r-1} - q_r|^3 \]
and hence,
\[ q_r \ge q_{r-1} - \frac{f''(q_r)}{2L}. \]
If $f''(q_r) \ge 0$, then (A.47) holds; otherwise
\[ a = q_{r-1} - \frac{f''(q_r)}{2L} \in [q_{r-1}, q_r] \]
and using (A.50) for this choice of $a$ again implies (A.47).
Dual construction and the replica symmetric case. The construction above will be used in the proof of Theorem 6.3 to provide control of the points on the left-hand side of qr , i.e. qr−1 ≤ v ≤ qr . In order to provide control of the points on the right-hand side qr ≤ v ≤ qr+1 , one can consider a dual construction by perturbing parameter mr on the interval [qr , a]. This construction is very similar so we will not detail it and we will only consider the points on the left-hand side in Theorem 6.3. In the replica symmetric case, the function Φ(m, a) defined in (2.7) is the analogue of the function in (A.30) and, the properties (2.10) and (2.11) replace the properties (A.46) and (A.47) and, in fact, are stronger because ε is replaced by 0. (The change of sign in the inequalities is simply because we consider a dual construction.) Therefore, the proof of the replica symmetric a priori estimate in Theorem 6.4 is exactly the same as the proof of Theorem 6.3 if we use (2.10) and (2.11) instead of (A.46) and (A.47), and we will not detail it. A.3. Control of the far points In this section, we will prove the easiest case of Theorem 6.3, when the point v is far from qr in the following sense. Given ε > 0, let (k, m, q, λ) be an ε-minimizer defined by (6.13). Without loss of generality, we will assume that all coordinates of the vector m are different and that all coordinates of the vector q are also different. Otherwise, we can decrease the value of k by gluing equal coordinates without changing the value of the functional Pk (m, q, λ). In this section, we will consider the case when v ∈ [−D, D] is such that v < qr−1 or v > qr+1
if r ≥ 2,
v < −q1 or v > q2
if r = 1,
and we will prove the following.
(A.51)
Proposition A.5. In the notations of Theorem 6.3, if (A.51) holds then 1 EP(n)F (A(v)) ≤ 2ψ(t) − K + R, N
(A.52)
where K > 0 is a constant independent of N, t and v. This implies Theorem 6.3 in the range of parameters (A.51) because K ≥ (v − qr )2 /(K −1 (qr − qr−1 )2 ∧ (qr+1 − qr )2 ). The proof of Proposition A.5 is based on the following three step construction. Step 1. Let us recall the definition of n in (6.4) and (zp1 , zp2 )p≤k in (6.1). Step 2. (Inserting |v|.). Let a be such that qa ≤ |v| ≤ qa+1 .
(A.53)
Consider a vector ) = (q0 , . . . , qa−1 , |v|, qa , . . . , qk+1 ) q = (q0 , . . . , qk+2
which is defined by inserting |v| in the vector q and define a vector m ma−1 0 n = ,..., , ma−1 , ma , . . . , mk . 2 2 Consider a sequence (yp )p≤k+1 of independent Gaussian random variables such that ) − ξ (qp ) Eyp2 = ξ (qp+1
and let (yp1 , yp2 )p≤k+1 consist of two copies of (yp )p≤k+1 such that yp1 = ηyp2 for p < a
and yp1 , yp2 are independent for p ≥ a,
(A.54)
where v = η|v|. Step 3. (Gluing two sequences together). Let κ = 2k + 1. Let us consider a vector n = (n0 , . . . , nκ+1 )
such that n0 ≤ · · · ≤ nκ+1 ,
and such that n consists of the elements of vectors n and n . More precisely, there exists a partition I, J of the set {0, . . . , κ+1} with card I = k +1 and card J = k +2 such that the elements np are the elements of n for p ∈ I and the elements of n for p ∈ J. For 0 ≤ p ≤ κ, define √ (A.55) (Zp1 , Zp2 ) = 0 for p ∈ J and (Zp1 , Zp2 ) = 1 − t(zl1 , zl2 ) for p ∈ I and where l is such that np = nl . Similarly, define √ (Yp1 , Yp2 ) = 0 for p ∈ I and (Yp1 , Yp2 ) = t(yl1 , yl2 ) for p ∈ J
(A.56)
and where l is such that np = nl . For 0 ≤ p ≤ κ, let us define (gp1 , gp2 ) = (Zp1 , Zp2 ) + (Yp1 , Yp2 )
(A.57)
or, in other words, (gp1 , gp2 ) = (Zp1 , Zp2 ) for p ∈ I
and (gp1 , gp2 ) = (Yp1 , Yp2 ) for p ∈ J.
In order to match this definition of (Ypj ) with (A.2) and (A.3), let us define a sequence ρ0 ≤ · · · ≤ ρκ+1 such that ρp = ql for l such that np = nl . Define τ by nτ = na = ma−1 . We will now apply Corollary A.2 to these choices of n, (Zpj ) and (Ypj ). First of all, χ(1) =
1 EP(n)F (A(v)) N
(A.58)
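since for $s = 1$, the random variables $(Y_p^j)$ will disappear in the definition of $H_s(\sigma^1, \sigma^2)$ in (A.5), the Hamiltonian $H_s(\sigma^1, \sigma^2)$ will coincide with the Hamiltonian $H_t(\sigma^1, \sigma^2)$ in (6.2) and, as a result, the definition of $F$ in (A.6) will coincide with $F(A(v))$ in Theorem 6.3. Next, it is clear from the construction that
\begin{align*}
&-2t \sum_{l < \tau} n_l \bigl(\theta(\rho_{l+1}) - \theta(\rho_l)\bigr) - t \sum_{l \ge \tau} n_l \bigl(\theta(\rho_{l+1}) - \theta(\rho_l)\bigr) \\
&\quad = -2t \sum_{l < a} n''_l \bigl(\theta(q''_{l+1}) - \theta(q''_l)\bigr) - t \sum_{l \ge a} n''_l \bigl(\theta(q''_{l+1}) - \theta(q''_l)\bigr) \\
&\quad = -t \sum_{l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr).
\end{align*}
Corollary A.2 now implies that
\[ \frac{1}{N}\, \mathbb{E} P(n) F(A(v)) \le -2\lambda u - \gamma v + P(n) F(\lambda, \gamma) - t \sum_{l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) + R. \]
We will use this bound for $\lambda$ as in the $\varepsilon$-minimizer $(k, m, q, \lambda)$ and $\gamma = 0$, i.e.
\[ \frac{1}{N}\, \mathbb{E} P(n) F(A(v)) \le -2\lambda u + P(n) F(\lambda, 0) - t \sum_{l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) + R. \]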
The argument in Lemma 5.2 shows that |λ| ≤ Λ for a constant Λ that depends only on ξ, ν and u and hence, |R| ≤ K(εN + c(N )). Recalling the definition of ψ(t) in
August 12, 2005 15:54 WSPC/148-RMP
J070-00245
Free Energy in the Generalized Sherrington–Kirkpatrick Mean Field Model
845
(6.12), in order to prove Proposition A.5, it remains to show that P(n)F (λ, 0) ≤ 2X0 (m, q, λ) − K = 2P(m)Xk+1 − K, where we used (3.4) and where K > 0 is a constant independent of t and v. In fact, it is enough to show that P(n)F (λ, 0) < 2X0 (m, q, λ) = 2P(m)Xk+1 ,
(A.59)
for all parameters t ∈ [0, 1 − t0 ] and v as in (A.51) because the functionals on both sides are continuous in these parameters and, even though the set defined in (A.51) is not a compact, the case of the end points will be proved in the following sections and (A.59) holds on the closure of (A.51). Therefore, by continuity and compactness, strict inequality for each (t, v) will imply strict inequality uniformly over the entire set of parameters. The proof of (A.59) repeats the proof in [11, Proposition 5.7] with only one modification that instead of log chx, we consider (A.28) and note that this function is also strictly convex in x because we eliminated the case when ν is concentrated on one point in Appendix B. Instead of reproducing the proof in its entirety, we will explain a very clear idea behind it by looking at a few cases. Let us first show that a non-strict version of (A.59), i.e. P(n)F (λ, 0) ≤ 2X0 (m, q, λ)
(A.60)
always holds, even without the assumption (A.51). (A.25) gives that $F(\lambda, 0) = F^1 + F^2$, where
\[ F^j = \log \int_\Sigma \exp\Bigl(\sigma \sum_{p \le \kappa} g_p^j + \lambda \sigma^2\Bigr)\, d\nu(\sigma). \]
It is clear from the construction that for $p \in I$,
\[ n_p = n'_l = \frac{m_l}{2} \ \text{ for } l < r \iff g_p^1 = g_p^2 \tag{A.61} \]
and for $p \in J$,
\[ n_p = n''_l = \frac{m_l}{2} \ \text{ for } l < a \iff g_p^1 = \eta g_p^2. \tag{A.62} \]
In other words, $n_p$ is of the type $m_l/2$ whenever the corresponding random pair is fully correlated. Let us define a vector $m' = (m'_0, \ldots, m'_\kappa)$ by
\[ m'_p = 2 n_p \ \text{ for } p \text{ as in (A.61) or (A.62)} \quad \text{and} \quad m'_p = n_p \ \text{ otherwise.} \tag{A.63} \]
A fact that plays a very important role below is that the coordinates of $m'$ are not necessarily arranged in an increasing order. Let us first prove the following.

Lemma A.6. We have
\[ P(n) F(\lambda, 0) = P(n)(F^1 + F^2) \le P(m') F^1 + P(m') F^2. \tag{A.64} \]
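Proof. This follows by induction in (3.2). For $p$ such that $m'_p = n_p$ and $g_p^1, g_p^2$ are independent, we have
\begin{align*}
(F^1 + F^2)_p &= \frac{1}{n_p} \log \mathbb{E}_p \exp n_p (F^1 + F^2)_{p+1} \le \frac{1}{n_p} \log \mathbb{E}_p \exp n_p \bigl(F^1_{p+1} + F^2_{p+1}\bigr) \\
&= \frac{1}{m'_p} \log \mathbb{E}_p \exp m'_p F^1_{p+1} + \frac{1}{m'_p} \log \mathbb{E}_p \exp m'_p F^2_{p+1} = F^1_p + F^2_p. \tag{A.65}
\end{align*}
For $p$ such that $m'_p = 2 n_p$ and $g_p^1 = \pm g_p^2$, we have
\begin{align*}
(F^1 + F^2)_p &= \frac{1}{n_p} \log \mathbb{E}_p \exp n_p (F^1 + F^2)_{p+1} \le \frac{2}{m'_p} \log \mathbb{E}_p \exp \frac{m'_p}{2} \bigl(F^1_{p+1} + F^2_{p+1}\bigr) \\
&\le \frac{1}{m'_p} \log \mathbb{E}_p \exp m'_p F^1_{p+1} + \frac{1}{m'_p} \log \mathbb{E}_p \exp m'_p F^2_{p+1} = F^1_p + F^2_p, \tag{A.66}
\end{align*}
where in the second line, we used Hölder's inequality. For $p = 0$, this gives (A.64).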
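The Hölder step in (A.66) can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the five-point distribution, the function `f` and the helper `log_mgf` are hypothetical and do not come from the paper; they mimic the fully correlated case $g^1 = -g^2$ with a strictly convex function.

```python
import math

def log_mgf(values, probs, m):
    """Return (1/m) * log E exp(m X) for a finite random variable X."""
    return math.log(sum(p * math.exp(m * x) for x, p in zip(values, probs))) / m

# Hypothetical finite model (an assumption, not the paper's Gaussian setting):
# g takes five values, F1 = f(g) and F2 = f(-g) with f strictly convex.
g_vals = [-2.0, -1.0, 0.0, 1.0, 2.0]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]

def f(x):
    return math.log(math.cosh(x + 0.5))

F1 = [f(g) for g in g_vals]
F2 = [f(-g) for g in g_vals]

m = 0.6        # plays the role of m'_p = 2 n_p
n = m / 2.0    # plays the role of n_p

# Left side of the Hoelder step: (1/n_p) log E exp n_p (F1 + F2).
lhs = log_mgf([a + b for a, b in zip(F1, F2)], probs, n)
# Right side: (1/m'_p) log E exp m'_p F1 + (1/m'_p) log E exp m'_p F2.
rhs = log_mgf(F1, probs, m) + log_mgf(F2, probs, m)

# Degenerate case F2 = F1, where Hoelder's inequality becomes an equality.
lhs_equal = log_mgf([2.0 * a for a in F1], probs, n)
rhs_equal = 2.0 * log_mgf(F1, probs, m)
```

In this toy case the inequality is strict because `F1 - F2` is not constant, which is the same mechanism the text later uses to obtain strict inequality in (A.66).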
Lemma A.7. Let $m''$ be a nondecreasing permutation of the vector $m'$. Then,
\[ P(m') F^j \le P(m'') F^j = P(m) X_{k+1} = X_0(m, q, \lambda). \tag{A.67} \]
The first inequality means that the Parisi functional will decrease if $m'$ is not arranged in an increasing order. Lemma A.7 follows from [11, Lemma 5.12], which states the following. Given a function $Q$ and numbers $a \ge 0$ and $m > 0$, let
\[ T_{m,a}(Q)(x) = \frac{1}{m} \log \mathbb{E} \exp m Q(x + g\sqrt{a}), \]
where $g$ is standard Gaussian.

Lemma A.8 ([11]). If $a, a' \ge 0$ and $m \ge m'$, then for each $x$ we have
\[ T_{m,a} \circ T_{m',a'}(Q)(x) \le T_{m',a'} \circ T_{m,a}(Q)(x). \tag{A.68} \]
If $a, a' > 0$ and $m > m'$, then we can have equality only if $Q$ is constant.
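The ordering effect in Lemma A.8 can be illustrated numerically. The following is a minimal sketch, assuming a simple midpoint-rule discretization of the standard Gaussian and a hypothetical test function `Q`; the names `gauss_nodes` and `T` are illustrative and not taken from [11]. Since the discretized Gaussian is itself a probability measure, Minkowski's integral inequality suggests the same ordering should hold for it.

```python
import math

def gauss_nodes(n=201, span=8.0):
    """Midpoint-rule nodes and normalized weights approximating N(0, 1)."""
    h = 2.0 * span / n
    xs = [-span + (i + 0.5) * h for i in range(n)]
    ws = [math.exp(-0.5 * x * x) for x in xs]
    total = sum(ws)
    return xs, [w / total for w in ws]

XS, WS = gauss_nodes()

def T(m, a, Q):
    """Return the function x -> (1/m) log E exp(m Q(x + g sqrt(a)))."""
    s = math.sqrt(a)

    def TQ(x):
        acc = sum(w * math.exp(m * Q(x + s * g)) for g, w in zip(XS, WS))
        return math.log(acc) / m

    return TQ

def Q(x):
    # Hypothetical convex, nonconstant test function.
    return math.log(math.cosh(x))

m, m_prime, a, a_prime = 0.8, 0.3, 1.0, 0.5   # m >= m'

# Left side of (A.68): T_{m,a} applied after T_{m',a'}.
lhs = T(m, a, T(m_prime, a_prime, Q))(0.0)
# Right side of (A.68): T_{m',a'} applied after T_{m,a}.
rhs = T(m_prime, a_prime, T(m, a, Q))(0.0)
```

Here `lhs` corresponds to the order in which the larger parameter is applied last, which is the arrangement that Lemma A.7 says decreases the Parisi functional.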
Proof of Lemma A.7. The first inequality is obvious by Lemma A.8. Equality in (A.67) follows by construction. The elements of $m''$ are precisely the elements of $m$. The random variables $g_p^j$ for $p$ such that $m''_p = m_l$ are exactly
\[ \sqrt{1-t}\, z_l^j \ \text{ and } \ \sqrt{t}\, y_l^j \quad \text{for } l \ne a-1 \]
and
\[ \sqrt{1-t}\, z_l^j \ \text{ and } \ \sqrt{t}\, y_l^j, \ \sqrt{t}\, y_{l+1}^j \quad \text{for } l = a-1. \]
Obviously,
\[ T_{m,a} \circ T_{m,a'} = T_{m,a+a'}, \tag{A.69} \]
which means that we can combine the random variables corresponding to the same value $m_l$, and since it is easy to check that in both cases the sum of these random variables is equal in distribution to $z_l$ defined in (1.12), (A.67) follows.

Combining Lemmas A.6 and A.7, we proved (A.60). From the proof, it is clear that there are only two places, (A.66) and (A.68), where the inequality could become strict. It turns out that condition (A.51) ensures that gluing two sequences together occurs in such a way that in at least one of these two steps the inequality will become strict. We will not present the detailed proof here and refer the reader to [11, Proposition 5.7]. We will explain the main idea by looking at several typical cases. Let us consider the case $r \ge 2$ in (A.51). The case $r = 1$ is quite similar, with the exception that the interval $-q_1 \le v \le q_0 = 0$ was excluded in (A.51) because it requires a different approach, and it will be postponed until the following sections. For $r \ge 2$ and $v$ as in (A.51), we have (a) $|v| \notin [q_{r-1}, q_{r+1}]$ or (b) $|v| \in [q_{r-1}, q_{r+1}]$ and $v = -|v|$, i.e. $\eta = -1$.

Case (a). Let us assume for simplicity that $q_{r-2} \le |v| < q_{r-1}$ since other cases are similar. This corresponds to the case $a = r - 2$ in (A.53). Then we will split case (a) into two subcases:
\[ q_{r-2} \le |v| < q_{r-1} \quad \text{and} \quad m_{r-2} < m_{r-1}/2, \tag{A.70} \]
\[ q_{r-2} \le |v| < q_{r-1} \quad \text{and} \quad m_{r-1}/2 \le m_{r-2}. \tag{A.71} \]
Let us now see what happens when we combine the sequences at step 3 above. First of all, at step 1, the sequences $n'$ and $(z_p^1, z_p^2)$ will have subsequences
\[ \begin{array}{cccccc} \ldots & m_{r-2}/2 & m_{r-1}/2 & m_r & m_{r+1} & \ldots \\ \ldots & (z_{r-2}, z_{r-2}) & (z_{r-1}, z_{r-1}) & (z_r^1, z_r^2) & (z_{r+1}^1, z_{r+1}^2) & \ldots \end{array} \tag{A.72} \]
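At step 2, the sequences $n''$ and $(y_p^1, y_p^2)$ will have subsequences
\[ \begin{array}{cccccc} \ldots & m_{r-2}/2 & m_{r-2} & m_{r-1} & m_r & \ldots \\ \ldots & (y_{r-2}, \eta y_{r-2}) & (y_{r-1}^1, y_{r-1}^2) & (y_r^1, y_r^2) & (y_{r+1}^1, y_{r+1}^2) & \ldots \end{array} \tag{A.73} \]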
In both cases, we write $(z^1, z^2)$ or $(y^1, y^2)$ whenever two coordinates are independent. When we glue these sequences together at step 3, the sequences $n$ and $(g_p^1, g_p^2)$ will contain subsequences
\[ \begin{array}{ccccccc} \ldots & m_{r-2}/2 & m_{r-2}/2 & m_{r-2} & m_{r-1}/2 & m_{r-1} & \ldots \\ \ldots & (Z_{r-2}, Z_{r-2}) & (Y_{r-2}, \eta Y_{r-2}) & (Y_{r-1}^1, Y_{r-1}^2) & (Z_{r-1}, Z_{r-1}) & (Y_r^1, Y_r^2) & \ldots \end{array} \tag{A.74} \]
in the case (A.70) and
\[ \begin{array}{ccccccc} \ldots & m_{r-2}/2 & m_{r-2}/2 & m_{r-1}/2 & m_{r-2} & m_{r-1} & \ldots \\ \ldots & (Z_{r-2}, Z_{r-2}) & (Y_{r-2}, \eta Y_{r-2}) & (Z_{r-1}, Z_{r-1}) & (Y_{r-1}^1, Y_{r-1}^2) & (Y_r^1, Y_r^2) & \ldots \end{array} \tag{A.75} \]
in the case (A.71). Suppose that (A.74) is the case. Then the strict inequality will appear when we apply Eq. (A.66) at the step when $n_p$ is equal to $m_{r-1}/2$. Indeed, at this step
\begin{align*}
F^1_{p+1} &= F^1_{p+1}(\cdots + Z_{r-2} + Y_{r-2} + Y^1_{r-1} + Z_{r-1}), \\
F^2_{p+1} &= F^2_{p+1}(\cdots + Z_{r-2} + Y_{r-2} + Y^2_{r-1} + Z_{r-1})
\end{align*}
and $Y^1_{r-1}, Y^2_{r-1}$ are independent and nondegenerate since $\mathbb{E}(Y^j_{r-1})^2 = t(\xi'(q_{r-1}) - \xi'(|v|)) > 0$ by (A.70). Also, both functions $x \to F^j_{p+1}(x)$ are strictly convex because (A.28) is strictly convex and iteration in the Parisi functional (3.2) will preserve strict convexity. Therefore, $F^1_{p+1}$ and $F^2_{p+1}$ are not collinear as functions of $Z_{r-1}$ with probability one over $(Y^1_{r-1}, Y^2_{r-1})$ and, therefore, Hölder's inequality in (A.66) will be strict with probability one. Now, suppose that (A.75) holds. Then after using Lemma A.6, $P(m')F^1$ will be defined in terms of the sequences that contain subsequences
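\[ \begin{array}{ccccccc} \ldots & m_{r-2} & m_{r-2} & m_{r-1} & m_{r-2} & m_{r-1} & \ldots \\ \ldots & Z_{r-2} & Y_{r-2} & Z_{r-1} & Y^1_{r-1} & Y^1_r & \ldots \end{array} \tag{A.76} \]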
In this case, $m'$ is not arranged in an increasing order, since $m_{r-1} > m_{r-2}$, and $Z_{r-1}, Y^1_{r-1}$ are nondegenerate. Therefore, when we rearrange these sequences in an increasing order by applying Lemma A.8, we will get strict inequality in (A.68).

Case (b). In this case, the scenario of (A.75) cannot occur and the fact that $\eta = -1$ plays an important role. Suppose for definiteness that $q_{r-1} \le |v| \le q_r$. (A.72) does not change but instead of (A.73), we will now have:
\[ \begin{array}{cccccc} \ldots & m_{r-2}/2 & m_{r-1}/2 & m_{r-1} & m_r & \ldots \\ \ldots & (y_{r-2}, -y_{r-2}) & (y_{r-1}, -y_{r-1}) & (y_r^1, y_r^2) & (y_{r+1}^1, y_{r+1}^2) & \ldots \end{array} \tag{A.77} \]
When we glue this sequence with (A.72), we will get
\[ \begin{array}{ccccccc} \ldots & m_{r-2}/2 & m_{r-2}/2 & m_{r-1}/2 & m_{r-1}/2 & m_{r-1} & \ldots \\ \ldots & (Z_{r-2}, Z_{r-2}) & (Y_{r-2}, -Y_{r-2}) & (Z_{r-1}, Z_{r-1}) & (Y_{r-1}, -Y_{r-1}) & (Y_r^1, Y_r^2) & \ldots \end{array} \tag{A.78} \]
The strict inequality will appear when we apply Eq. (A.66) at the step when $n_p$ is equal to $m_{r-1}/2$ and
\begin{align*}
F^1_{p+1} &= F^1_{p+1}(\cdots + Z_{r-2} + Y_{r-2} + Z_{r-1} + Y_{r-1}), \\
F^2_{p+1} &= F^2_{p+1}(\cdots + Z_{r-2} - Y_{r-2} + Z_{r-1} - Y_{r-1}).
\end{align*}
Random variables $Y_{r-2}, Z_{r-2}$ are independent and nondegenerate, and we can argue as in the case (A.74) above. All other cases in the proof of [11, Proposition 5.6] are very similar and (A.52) holds.

A.4. Control of the close points

In Appendix A.3, we obtained the control of the points $v$ far from $q_r$ and in this section, we will consider the remaining cases when $q_{r-1} \le v \le q_{r+1}$ or $-q_1 \le v < 0$ when $r = 1$. All arguments repeat the arguments in [11, Sec. 5], so we will only consider the case when $q_{r-1} \le v \le q_r$. As in the previous section, let $L_1, L_2, \ldots$ denote constants that depend only on $\nu$, $\xi$ and $u$. Consider a function
\[ \Gamma(c) = \inf\{ |\xi(y) - \xi(x) + (x - y)\xi'(y)| : 0 \le x, y \le D, \ |x - y| \ge c \}. \tag{A.79} \]
Since $\xi''(x) > 0$, we have $\Gamma(c) > 0$ for $c > 0$. In the notations of Theorem 6.3, the following holds.

Proposition A.9. Suppose that $q_{r-1} \le v \le q_r$. If $L_1 \varepsilon^{1/6} \le 1 - t_0$, then
\[ L_1 (q_r - v) \le 1 - t_0 \ \Longrightarrow \ \frac{1}{N}\, \mathbb{E} P(n) F(A(v)) \le 2\psi(t) - \frac{(1 - t_0)^2}{L_1} (v - q_r)^2 + R, \tag{A.80} \]
and if $L_2 \varepsilon^{1/2} \le (1 - t_0)\Gamma((1 - t_0)/L_1)$, then
\[ L_1 (q_r - v) \ge 1 - t_0 \ \Longrightarrow \ \frac{1}{N}\, \mathbb{E} P(n) F(A(v)) < 2\psi(t) + R. \tag{A.81} \]
Together with a similar result for $q_r \le v \le q_{r+1}$ and the results of Appendix A.3, this proves Theorem 6.3. We will again use Talagrand's interpolation for two copies. Given
\[ \frac{m_{r-1}}{2} \le m \le m_r, \]
let us define sequences $n$ and $\rho$ in (A.1) and (A.2) by
\[ 0 = n_0 = \frac{m_0}{2}, \ n_1 = \frac{m_1}{2}, \ \ldots, \ n_{r-1} = \frac{m_{r-1}}{2}, \ n_r = m, \ n_{r+1} = m_r, \ \ldots, \ n_{k+1} = m_k \]
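and
\[ \rho_0 = q_0, \ \ldots, \ \rho_{r-1} = q_{r-1}, \ \rho_r = v, \ \rho_{r+1} = q_r, \ \ldots, \ \rho_{k+2} = q_{k+1}. \]
Since $\rho_r = v$, we have $\tau = r$. Consider a sequence $(Y_p^1, Y_p^2)$ as in (A.3), i.e. $\mathbb{E}(Y_p^j)^2 = t(\xi'(\rho_{p+1}) - \xi'(\rho_p))$,
\[ Y_p^1 = Y_p^2 \ \text{ for } p < r \quad \text{and} \quad Y_p^1, Y_p^2 \ \text{ are independent for } p \ge r. \]
Let $(Z_p^1, Z_p^2)$ be such that
\[ \mathbb{E}(Z_p^j)^2 = (1 - t)(\xi'(\rho_{p+1}) - \xi'(\rho_p)) \ \text{ for } p < r - 1 \text{ and } p > r, \qquad \mathbb{E}(Z_{r-1}^j)^2 = (1 - t)(\xi'(q_r) - \xi'(q_{r-1})) \]
and $Z_r^j = 0$. Let
\[ Z_p^1 = Z_p^2 \ \text{ for } p < r \quad \text{and} \quad Z_p^1, Z_p^2 \ \text{ are independent for } p > r. \]
If we denote
\[ g_p^j = Y_p^j + Z_p^j \quad \text{for } 0 \le p \le k + 1, \ j = 1, 2, \tag{A.82} \]
then it follows from the construction that
\begin{align*}
\mathbb{E}(g_p^j)^2 &= \xi'(\rho_{p+1}) - \xi'(\rho_p) \quad \text{for } p < r - 1, \ p > r, \\
\mathbb{E}(g_{r-1}^j)^2 &= (\xi'(q_r) - \xi'(q_{r-1})) - t(\xi'(q_r) - \xi'(v)), \\
\mathbb{E}(g_r^j)^2 &= t(\xi'(q_r) - \xi'(v)) \tag{A.83}
\end{align*}
and
\[ g_p^1 = g_p^2 \ \text{ for } p < r \quad \text{and} \quad g_p^1, g_p^2 \ \text{ are independent for } p > r. \]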
If we define a point $a \in [v, q_r]$ by
\[ \xi'(a) = t\xi'(v) + (1 - t)\xi'(q_r), \tag{A.84} \]
then (A.83) can be rewritten as
\[ \mathbb{E}(g_{r-1}^j)^2 = \xi'(a) - \xi'(q_{r-1}), \qquad \mathbb{E}(g_r^j)^2 = \xi'(q_r) - \xi'(a). \]
If we define a new sequence $\rho'$ by
\[ \rho'_0 = q_0, \ \ldots, \ \rho'_{r-1} = q_{r-1}, \ \rho'_r = a, \ \rho'_{r+1} = q_r, \ \ldots, \ \rho'_{k+2} = q_{k+1}, \]
obtained by inserting $a$ into the sequence $q$, we finally get
\[ \mathbb{E}(g_p^j)^2 = \xi'(\rho'_{p+1}) - \xi'(\rho'_p) \quad \text{for all } p \le k + 1 \tag{A.85} \]
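The variance bookkeeping behind (A.83) through (A.85) can be verified on a concrete example. This is a minimal sketch under stated assumptions: the derivative `xi_prime` corresponds to a hypothetical choice $\xi(x) = x^2 + x^4/4$, the numerical values of $t$, $q_{r-1}$, $v$, $q_r$ are arbitrary, and $Y$ and $Z$ are assumed independent so that variances add in (A.82).

```python
def xi_prime(x):
    # Hypothetical choice xi(x) = x**2 + x**4 / 4, so xi'(x) = 2x + x**3.
    return 2.0 * x + x ** 3

t, q_prev, v, q_r = 0.3, 0.2, 0.5, 0.9   # q_{r-1} <= v <= q_r

# Variances from the construction (rho_{r-1} = q_{r-1}, rho_r = v, rho_{r+1} = q_r).
var_Y_prev = t * (xi_prime(v) - xi_prime(q_prev))           # E (Y_{r-1})^2
var_Z_prev = (1 - t) * (xi_prime(q_r) - xi_prime(q_prev))   # E (Z_{r-1})^2
var_Y_r = t * (xi_prime(q_r) - xi_prime(v))                 # E (Y_r)^2, with Z_r = 0

var_g_prev = var_Y_prev + var_Z_prev   # E (g_{r-1})^2, assuming independence
var_g_r = var_Y_r                      # E (g_r)^2

# The point a of (A.84): xi'(a) = t xi'(v) + (1 - t) xi'(q_r).
xi_a = t * xi_prime(v) + (1 - t) * xi_prime(q_r)

def solve_a(target, lo, hi, tol=1e-12):
    """Bisection for xi'(a) = target; xi' is increasing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if xi_prime(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = solve_a(xi_a, v, q_r)
```

With these numbers, the two variances match the increments of $\xi'$ along the sequence obtained by inserting $a$ between $q_{r-1}$ and $q_r$.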
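and
\[ g_p^1 = g_p^2 \ \text{ for } p < r \quad \text{and} \quad g_p^1, g_p^2 \ \text{ are independent for } p > r. \tag{A.86} \]
Plugging the definition (A.82) into (A.25), we get
\[ F(\lambda, \gamma) = \log \int_{\Sigma \times \Sigma} \exp\Bigl( \sum_{j \le 2} \sigma_j \sum_{p \le k+1} g_p^j + \lambda \sum_{j \le 2} (\sigma_j)^2 + \gamma \sigma_1 \sigma_2 \Bigr)\, d\nu(\sigma_1)\, d\nu(\sigma_2). \tag{A.87} \]
Let us define
\[ V(\gamma, m, v) = P(n) F(\lambda, \gamma), \tag{A.88} \]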
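where we made the dependence of the right-hand side on the parameters $(\gamma, m, v)$ explicit. In order to apply Corollary A.2, let us first note that from the construction of sequences $n$ and $\rho$, we have
\begin{align*}
&-2t \sum_{l < \tau} n_l \bigl(\theta(\rho_{l+1}) - \theta(\rho_l)\bigr) - t \sum_{l \ge \tau} n_l \bigl(\theta(\rho_{l+1}) - \theta(\rho_l)\bigr) \\
&\quad = -t \sum_{l \le r-2} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) - t m_{r-1} \bigl(\theta(v) - \theta(q_{r-1})\bigr) \\
&\qquad - t m \bigl(\theta(q_r) - \theta(v)\bigr) - t \sum_{r \le l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) \\
&\quad = -t \sum_{l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) - t (m - m_{r-1}) \bigl(\theta(q_r) - \theta(v)\bigr).
\end{align*}
Corollary A.2 and (A.58) now imply that
\begin{align*}
\frac{1}{N}\, \mathbb{E} P(n) F(A(v)) \le{} & -2\lambda u - \gamma v + V(\gamma, m, v) - t \sum_{l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr) \\
& - t (m - m_{r-1}) \bigl(\theta(q_r) - \theta(v)\bigr) + R. \tag{A.89}
\end{align*}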
One can easily check using the argument of Lemma 3.2 that $V(0, m, v) = 2T(m, \xi'(a))$, where $T$ was defined in (A.29). (A.33) implies that $V(0, m_{r-1}, v) = 2X_0(m, q, \lambda)$ and therefore, (A.89) with $\gamma = 0$, $m = m_{r-1}$ implies that
\[ \frac{1}{N}\, \mathbb{E} P(n) F(A(v)) \le 2\psi(t) + R. \tag{A.90} \]
In order to prove Proposition A.9, we will perturb the parameters $\gamma$ and $m$ around the values $0$, $m_{r-1}$ and use the properties of the $\varepsilon$-minimizer from the previous section. The fundamental connection of the bound (A.89) to the properties of the $\varepsilon$-minimizer lies in the following fact:
\[ \frac{\partial V}{\partial \gamma}(\gamma, m_{r-1}, v)\Big|_{\gamma = 0} = -U'(\xi'(a)). \tag{A.91} \]
The proof follows from a straightforward computation and is given in [11, Lemma 5.8]. Also, similar to (A.45), we have
\[ \Bigl| \frac{\partial^2 V}{\partial \gamma^2} \Bigr| \le L. \tag{A.92} \]
We are now ready to prove Proposition A.9.

Proof of Proposition A.9. Let us consider a function $\alpha(\gamma) = V(\gamma, m_{r-1}, v) - \gamma v$, which is the part of the bound (A.89) for $m = m_{r-1}$ that depends on $\gamma$. By (A.91), we have
\[ h(v) := \alpha'(0) = -U'(\xi'(a)) - v = -U'\bigl(t\xi'(v) + (1 - t)\xi'(q_r)\bigr) - v, \]
since $a$ was defined in (A.84). By (A.41), we have $h(q_r) = 0$ and
\[ h'(q_r) = -t\xi''(q_r) U''(\xi'(q_r)) - 1. \tag{A.93} \]
Using (A.47),
\[ h'(q_r) = -\xi''(q_r) U''(\xi'(q_r)) - 1 + (1 - t)\xi''(q_r) U''(\xi'(q_r)) \le \frac{L \varepsilon^{1/6}}{\xi''(q_r)} + (1 - t)\xi''(q_r) U''(\xi'(q_r)). \tag{A.94} \]
We will now show that
\[ L\varepsilon^{1/6} \le 1 - t_0 \ \Longrightarrow \ h'(q_r) \le -\frac{1 - t_0}{4}. \tag{A.95} \]
If $-\xi''(q_r) U''(\xi'(q_r)) \le 1/2$, then (A.93) gives
\[ h'(q_r) \le \frac{t}{2} - 1 \le -\frac{1}{2}. \]
If $-\xi''(q_r) U''(\xi'(q_r)) \ge 1/2$, then
\[ \frac{1}{\xi''(q_r)} \le -2U''(\xi'(q_r)) \le L \]
by (A.45), and (A.94) gives
\[ h'(q_r) \le L\varepsilon^{1/6} - \frac{1 - t}{2} \le -\frac{1 - t_0}{4}, \]
where the last inequality holds if $4L\varepsilon^{1/6} \le 1 - t_0$. This proves (A.95). Since (A.45) implies that $|h''(v)| \le L$ and since $h(q_r) = 0$, we can write
\[ h(v) \ge (v - q_r) h'(q_r) - L(v - q_r)^2 \ge \frac{1}{8}(q_r - v)(1 - t_0) \]
if $q_r - v \le (1 - t_0)/8L$ and if (A.95) holds. (A.92) implies that $|\alpha''(\gamma)| \le L$ and we finally get
\begin{align*}
\inf_\gamma \alpha(\gamma) &\le \inf_\gamma \bigl( \alpha(0) + \alpha'(0)\gamma + L\gamma^2 \bigr) \le \alpha(0) - \frac{\alpha'(0)^2}{L} = \alpha(0) - \frac{h(v)^2}{L} \\
&\le 2X_0(m, q, \lambda) - \frac{1}{L}(1 - t_0)^2 (v - q_r)^2. \tag{A.96}
\end{align*}
Applying this to the bound (A.89) proves (A.80). Note that the infimum was achieved on $\gamma = -\alpha'(0)/2L$ and that (A.45) implies that $|\gamma| \le L$. As we explained in the remark following Corollary A.2, the bound (A.27) is used only for $|\gamma| \le L$.

Next, we will prove (A.81). If $-h(v) = U'(\xi'(a)) + v \ne 0$, then we can simply use the first inequality in (A.96). Let us assume now that $U'(\xi'(a)) = -v$. Let us set $\gamma = 0$ in the bound (A.89) and consider the derivative of this bound in $m$ at $m = m_{r-1}$, i.e.
\begin{align*}
D(t) &= \frac{\partial V}{\partial m}(0, m, v)\Big|_{m = m_{r-1}} - t\bigl(\theta(q_r) - \theta(v)\bigr) \\
&= U(\xi'(a)) - t\bigl(\theta(q_r) - \theta(v)\bigr) && \text{by (A.90) and (A.35)} \\
&= U\bigl(t\xi'(v) + (1 - t)\xi'(q_r)\bigr) - t\bigl(\theta(q_r) - \theta(v)\bigr) && \text{by (A.84).}
\end{align*}
Since we assumed that $U'(\xi'(a)) = -v$,
\begin{align*}
D'(t) &= (\xi'(v) - \xi'(q_r)) U'(\xi'(a)) - (\theta(q_r) - \theta(v)) = -(\xi'(v) - \xi'(q_r)) v - (\theta(q_r) - \theta(v)) \\
&= \xi(q_r) - \xi(v) + (v - q_r)\xi'(q_r) \le -\Gamma(q_r - v),
\end{align*}
where the last inequality follows from the definition (A.79). By (A.42), $D(t)$ is concave in $t$ and therefore,
\[ D(1) \le D(t) + (1 - t) D'(t) \le D(t) - (1 - t)\Gamma(q_r - v). \]
By (A.46), $D(1) = U(\xi'(v)) - (\theta(q_r) - \theta(v)) = 2f(v) \ge -L\varepsilon^{1/2}$. We get
\[ D(t) \ge (1 - t)\Gamma(q_r - v) - L\varepsilon^{1/2} > 0, \]
if $(1 - t_0)\Gamma(q_r - v) > L\varepsilon^{1/2}$, which is true under the conditions in (A.81), and one can finish the proof as in (A.96).
Appendix B. Cases Reducible to the Classical SK Model

We will now show that only the case of $d < u < D$ in Theorem 1.1 is different from the classical SK model. First of all, $d = D$ means that $\Sigma = \{-\sqrt{d}, +\sqrt{d}\}$, which is precisely the case of the SK model. If the measure $\nu$ has non-zero mass at both points $\pm\sqrt{d}$, then $\nu(\sigma)$ is proportional to $\exp h\sigma$ for some external field parameter $h \in \mathbb{R}$. Otherwise, if $\nu$ is concentrated at one point, the statement of Theorem 1.1 becomes trivial. It remains to consider the cases of $d < D$ and $u = d$ or $u = D$. We will only consider the case $u = d$, since the case $u = D$ is similar. Let us consider a set
\[ U_N(\varepsilon) = \{\sigma : R_{1,1} \in [d, d + \varepsilon]\} \]
and a function
\[ F_N(\varepsilon) = \frac{1}{N}\, \mathbb{E} \log \int_{U_N(\varepsilon)} \exp H_N(\sigma)\, d\nu(\sigma). \]
Since by (1.1),
\[ \mathbb{E} \int_{\Sigma^N} \exp H_N(\sigma)\, d\nu(\sigma) < \infty, \]
the function $\exp H_N(\sigma)$ is $\nu$-integrable with respect to $\sigma$ almost surely and therefore, by the monotone convergence theorem,
\[ \lim_{\varepsilon \to 0} \int_{U_N(\varepsilon)} \exp H_N(\sigma)\, d\nu(\sigma) = \int_{\{R_{1,1} = d\}} \exp H_N(\sigma)\, d\nu(\sigma) \quad \text{a.s.} \]
Using the monotone convergence theorem once again implies
\[ \lim_{\varepsilon \to 0} F_N(\varepsilon) = P_N := \frac{1}{N}\, \mathbb{E} \log \int_{\{R_{1,1} = d\}} \exp H_N(\sigma)\, d\nu(\sigma). \tag{B.1} \]
If we choose a sequence $(\varepsilon_N)$ so that $|F_N(\varepsilon_N) - P_N| \le N^{-1}$, in order to prove Theorem 1.1 for $u = d$, it is enough to show that
\[ \lim_{N \to \infty} P_N = P(\xi, d). \tag{B.2} \]
We will prove this by considering two separate cases.

Case 1. $\nu(\{\sigma^2 = d\}) = 0$. This means that the measure $\nu$ has no atoms at the points $\pm\sqrt{d}$ and therefore, $\nu(\{R_{1,1} = d\}) = 0$ and $P_N = -\infty$. To prove (B.2), we
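need to show that $P(\xi, d) = -\infty$. Let us, for example, take $k = 1$ and $q_1 = 0$. For this choice of $m, q$, we have
\[ X_0(m, q, \lambda) = \log \mathbb{E} \int_\Sigma \exp\bigl(\sigma z \sqrt{\xi'(u)} + \lambda\sigma^2\bigr)\, d\nu(\sigma) = \log \int_\Sigma \exp\Bigl(\frac{1}{2}\sigma^2 \xi'(u) + \lambda\sigma^2\Bigr)\, d\nu(\sigma). \]
Therefore,
\[ -\lambda d + X_0(m, q, \lambda) = \log \int_\Sigma \exp\Bigl(\frac{1}{2}\sigma^2 \xi'(u) + \lambda(\sigma^2 - d)\Bigr)\, d\nu(\sigma) \]
and by the monotone convergence theorem,
\[ \lim_{\lambda \to -\infty} \bigl(-\lambda d + X_0(m, q, \lambda)\bigr) = \log \int_{\{\sigma^2 = d\}} \exp\Bigl(\frac{1}{2}\sigma^2 \xi'(u)\Bigr)\, d\nu(\sigma) = -\infty, \]
since we assumed that $\nu(\{\sigma^2 = d\}) = 0$. Clearly, (1.16) yields that $P(\xi, d) = -\infty$.

Case 2. $\nu(\{\sigma^2 = d\}) > 0$. This means that the measure $\nu$ has at least one atom at the points $\pm\sqrt{d}$. Consider a probability measure $\bar\nu$ defined by
\[ \bar\nu(\{-\sqrt{d}\}) = \frac{\nu(\{-\sqrt{d}\})}{\nu(\{\sigma^2 = d\})}, \qquad \bar\nu(\{\sqrt{d}\}) = \frac{\nu(\{\sqrt{d}\})}{\nu(\{\sigma^2 = d\})}. \tag{B.3} \]
Condition (1.7) implies that $R_{1,1} = d$ if and only if $\sigma_i^2 = d$ for all $i \le N$. In other words,
\[ \{R_{1,1} = d\} = \Sigma_d^N, \quad \text{where } \Sigma_d = \{-\sqrt{d}, \sqrt{d}\}. \]
With these notations, $P_N$ in (B.1) can be written as
\[ P_N = \log \nu(\{\sigma^2 = d\}) + \bar{P}_N, \tag{B.4} \]
where
\[ \bar{P}_N = \frac{1}{N}\, \mathbb{E} \log \int_{\Sigma_d^N} \exp H_N(\sigma)\, d\bar\nu(\sigma). \]
But $\bar{P}_N$ falls exactly into the case $d = D$ considered above because $\bar\nu(\{\sigma^2 = d\}) = 1$, and therefore, its limit can be written as follows. If we consider
\[ \bar{Y}_{k+1} = \log \int_{\Sigma_d} \exp\Bigl(\sigma \sum_{0 \le p \le k} z_p\Bigr)\, d\bar\nu(\sigma), \]
define $\bar{Y}_l$ recursively as in (1.14) and let
\[ \bar{P}_k(m, q) = \bar{Y}_0 - \frac{1}{2} \sum_{1 \le l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr), \]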
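then
\[ \lim_{N \to \infty} \bar{P}_N = \inf \bar{P}_k(m, q), \]
where the infimum is over all choices of $k$, $m$ and $q$, which by (B.4) implies
\[ \lim_{N \to \infty} P_N = \log \nu(\{\sigma^2 = d\}) + \inf \bar{P}_k(m, q). \]
Equivalently, this can be written as follows. If we consider
\[ Y_{k+1} = \log \nu(\{\sigma^2 = d\}) + \log \int_{\Sigma_d} \exp\Bigl(\sigma \sum_{0 \le p \le k} z_p\Bigr)\, d\bar\nu(\sigma), \tag{B.5} \]
define $Y_l$ recursively as in (1.14) and let
\[ P_k(m, q) = Y_0 - \frac{1}{2} \sum_{1 \le l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr), \tag{B.6} \]
then
\[ \lim_{N \to \infty} P_N = \inf P_k(m, q). \tag{B.7} \]
By definition (B.3) of the measure $\bar\nu$, $Y_{k+1}$ in (B.5) can also be written as
\[ Y_{k+1} = \log \int_{\{\sigma^2 = d\}} \exp\Bigl(\sigma \sum_{0 \le p \le k} z_p\Bigr)\, d\nu(\sigma). \tag{B.8} \]
In order to prove (B.2), we will show that $P(\xi, d)$ is equal to the right-hand side of (B.7). The definition of $P(\xi, d)$ given by (1.13)–(1.16) can be written equivalently as follows. If we consider
\[ X_{k+1} = \log \int_\Sigma \exp\Bigl(\sigma \sum_{0 \le p \le k} z_p + \lambda(\sigma^2 - d)\Bigr)\, d\nu(\sigma), \tag{B.9} \]
define $X_l$ recursively as in (1.14), and define
\[ P_k(m, q, \lambda, d) = X_0 - \frac{1}{2} \sum_{1 \le l \le k} m_l \bigl(\theta(q_{l+1}) - \theta(q_l)\bigr), \tag{B.10} \]
then
\[ P(\xi, d) = \inf P_k(m, q, \lambda, d), \tag{B.11} \]
where the infimum is taken over all $\lambda$, $k$, $m$ and $q$. Since $d \le \sigma^2$ for $\sigma \in \Sigma$, $X_{k+1}$ in (B.9) is increasing in $\lambda$, which implies that $X_0$ is also increasing in $\lambda$. Therefore, for any fixed $m$ and $q$, to minimize the right-hand side of (B.11) over $\lambda$, one should let $\lambda \to -\infty$. By the monotone convergence theorem, almost surely,
\[ \lim_{\lambda \to -\infty} X_{k+1} = \log \int_{\{\sigma^2 = d\}} \exp\Bigl(\sigma \sum_{0 \le p \le k} z_p\Bigr)\, d\nu(\sigma) = Y_{k+1}. \]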
Using the monotone convergence theorem repeatedly in the recursive construction, (1.14) gives $\lim_{\lambda \to -\infty} X_0 = Y_0$ and comparing (B.6) and (B.10), we get
\[ \inf_\lambda P_k(m, q, \lambda, d) = P_k(m, q). \]
Combining this with (B.7) and (B.11) gives (B.2).

References

[1] S. K. Ghatak and D. Sherrington, Crystal field effects in a general S Ising spin glass, J. Phys. C: Solid State Phys. 10 (1977) 3149.
[2] F. Guerra, Broken replica symmetry bounds in the mean field spin glass model, Comm. Math. Phys. 233(1) (2003) 1–12.
[3] F. Guerra and F. L. Toninelli, The thermodynamic limit in mean field spin glass models, Comm. Math. Phys. 230 (2002) 71–79.
[4] P. J. Mottishaw and D. Sherrington, Stability of a crystal-field split spin glass, J. Phys. C: Solid State Phys. 18 (1985) 5201–5213.
[5] M. Ledoux and M. Talagrand, Probability in Banach Spaces, Isoperimetry and Processes (Springer-Verlag, New York, 1991).
[6] D. Panchenko, A question about the Parisi functional, Electron. Comm. Probab. 10 (2005) 155–166.
[7] D. Sherrington and S. Kirkpatrick, Solvable model of a spin glass, Phys. Rev. Lett. 35 (1972) 1792–1796.
[8] M. Talagrand, Spin Glasses: A Challenge for Mathematicians (Springer-Verlag, New York, 2003).
[9] M. Talagrand, On Guerra's broken replica-symmetry bound, C. R. Math. Acad. Sci. Paris 337(7) (2003) 477–480.
[10] M. Talagrand, The generalized Parisi formula, C. R. Math. Acad. Sci. Paris 337(2) (2003) 111–114.
[11] M. Talagrand, Parisi formula, to appear in Ann. Math. (2003).
[12] M. Talagrand, On the meaning of Parisi's functional order parameter, C. R. Math. Acad. Sci. Paris 337(9) (2003) 625–628.
[13] M. Talagrand, Free energy of the spherical mean field model, to appear in Probab. Theory Related Fields (2004).
[14] M. Talagrand, Parisi measures, to appear in J. Funct. Analysis (2004).
August 29, 2005 18:7 WSPC/148-RMP
J070-00243
Reviews in Mathematical Physics Vol. 17, No. 8 (2005) 859–880 c World Scientific Publishing Company
THE CASIMIR EFFECT BETWEEN NON-PARALLEL PLATES BY GEOMETRIC OPTICS
BRENDAN GUILFOYLE Department of Mathematics and Computing, Institute of Technology, Tralee, Clash, Tralee, Co. Kerry, Ireland [email protected] WILHELM KLINGENBERG Department of Mathematical Sciences, University of Durham, Durham DH1 3LE, United Kingdom [email protected] SIDDHARTHA SEN School of Mathematics, University of Dublin, Dublin 2, Ireland and Department of Theoretical Physics, IACS, Kolkata, India [email protected] Received 24 February 2005 Revised 31 May 2005
The first two authors have developed a technique which uses the complex geometry of the space of oriented affine lines in R3 to describe the reflection of rays off a surface. This can be viewed as a parametric approach to geometric optics which has many possible applications. Recently, Jaffe and Scardicchio have developed a geometric optics approximation to the Casimir effect and the main purpose of this paper is to show that the quantities involved can be easily computed by this complex formalism. To illustrate this, we determine explicitly and in closed form the geometric optics approximation of the Casimir force between two non-parallel plates. By making one of the plates finite, we regularize the divergence that is caused by the intersection of the planes. In the parallel plate limit, we prove that our expression reduces to Casimir’s original result. Keywords: Casimir force; geometric optics. Mathematics Subject Classification 1991: Primary: 78A05; Secondary: 53Z05
1. Introduction

In a recent series of papers [3–5], the first two authors have presented a study of the local and global geometry of the space of oriented affine lines in $\mathbb{R}^3$. This has proved useful in geometric optics, for example, in the modeling of reflection.
The purpose of this paper is to review this technique and to apply it to some recent developments in mathematical physics. In particular, an elegant approximation of the Casimir force based on geometric optics has been suggested by Jaffe and Scardicchio and they have shown that between a sphere and a plane, this approximation appears to work well [7, 8]. We present a complex geometry scheme [5] which, in principle, can be used to evaluate the Casimir effect for a range of boundaries in the approximation scheme of Jaffe et al. The basic ingredient of our method is a description of reflection in R3 via the space of oriented affine lines, which we identify with the tangent bundle to the 2-sphere T S2 . In this representation, reflection consists of the combined actions of PSL(2, C) and fibre mappings on T S2 . We illustrate the method by evaluating the case of non-parallel plates. For plane boundaries, our approach gives an explicit iteration of the classical method of images using properties of the above-mentioned action on T S2 . The convergence of the Casimir energy is then studied through the analytic properties of the maps involved. Consider a plate lying between radial distances R = R0 and R = R1 from the origin, forming an angle γ with the horizontal plane containing the origin as in Fig. 1.
Fig. 1.
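Define $m_0$ and $m_1$ by
\[ \frac{\pi}{2m_0 + 2} \le \gamma < \frac{\pi}{2m_0}, \]
and either
\[ \cos(m_1\gamma) \le \frac{R_0}{R_1} \le \frac{\cos[(m_1 - 1)\gamma]}{\cos\gamma} \]
for the smallest even $m_1$, or
\[ \frac{\cos(m_1\gamma)}{\cos\gamma} \le \frac{R_0}{R_1} \le \cos[(m_1 - 1)\gamma] \]
for the smallest odd $m_1$. Let $m_2 = \min\{m_0, m_1\}$. Our main result is: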
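Main Theorem 1.1. The geometric optics approximation to the Casimir energy between a finite plate, of dimensions stated above, and an infinite plate is
\[ E = 2E^0_{2m_2} + 2\sum_{m=1}^{m_2} E^1_{2m}, \]
where
\[ E^1_{2m} = -\frac{\hbar c W \cos^2(m\gamma) \sin\gamma}{64\pi^2 \sin^4(m\gamma)} \left( \frac{1}{R_0^2 \cos\gamma} - \frac{1}{R_1^2 \cos[(m - 1)\gamma] \cos(m\gamma)} \right), \]
and
\[ E^0_{2m_2} = -\frac{\hbar c W \cos(m_2\gamma)}{64\pi^2 \sin^4(m_2\gamma) R_0^2 R_1^2} \times \begin{cases} 0 & \text{for } m_0 \le m_1, \\[4pt] \dfrac{[R_0 - R_1\cos(m_2\gamma)]^2}{\sin(m_2\gamma)} & \text{for } m_0 > m_1 \text{ and } m_1 \text{ even}, \\[8pt] \dfrac{[R_0\cos\gamma - R_1\cos(m_2\gamma)]^2}{\sin[(m_2 - 1)\gamma] \cos\gamma} & \text{for } m_0 > m_1 \text{ and } m_1 \text{ odd}. \end{cases} \]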
In particular, the Casimir force between two plates is always attractive. Our expression agrees with the results stated in [8]. As the plates tend to parallel while maintaining a non-zero separation, the finite sum becomes infinite and the integrand goes to zero in the above expressions for the energy. Thus the convergence of the energy in this limit is nontrivial. We prove that the geometric optics approximation does indeed converge, and it does so to Casimir's original expression for parallel plates:

Main Theorem 1.2. In the parallel plate limit $\gamma \to 0$, keeping a minimum separation of $L$, the above energy reduces to Casimir's original expression:
\[ E = -\frac{\hbar c \pi^2 A}{1440 L^3}, \]
where A is the area of the finite plate. This paper is organized as follows. Section 2 outlines our approach to the geometry of reflection in terms of the complex geometry on T S2 — further details can be found in [5]. In the following section, we review the optical approximation to the Casimir effect, as developed by Jaffe and Scardicchio, and use the classical method of images to determine the existence and lengths of closed paths in an infinite wedge. In Sec. 4, we find the sequence of intersection points of a closed n-bounce path in a wedge. Here we must treat even and odd bounce closed paths separately. Section 5 combines these results to determine the Dirichlet Casimir energy between a plane and a finite plate. The result is as stated in Main Theorem 1.1. In Sec. 6, we obtain Casimir’s original result in the parallel plate limit (Main Theorem 1.2). 2. The Complex Geometry of Reflection We now describe our approach to geometric optics with the aid of the space of oriented affine lines in R3 . Further details of this material, which we only summarize, can be found in [4, 5]. The space of oriented affine lines in R3 can be identified with the tangent bundle to the 2-sphere T S2 [6], as we now explain. Start with 3-dimensional Euclidean space R3 and fix standard coordinates (x1 , x2 , x3 ).
Let $\mathbb{L}$ be the set of oriented lines, or rays, in Euclidean space $\mathbb{R}^3$. A point in $\mathbb{L}$, that is, an oriented line in $\mathbb{R}^3$, is uniquely determined by its unit direction vector $\vec{U}$ and the vector $\vec{V}$ joining the origin to the point on the line that lies closest to the origin. That is,
\[ \mathbb{L} = \{ (\vec{U}, \vec{V}) \in \mathbb{R}^3 \times \mathbb{R}^3 \mid |\vec{U}| = 1, \ \vec{U} \cdot \vec{V} = 0 \}. \]
By parallel translation, move $\vec{U}$ to the origin and $\vec{V}$ to the head of $\vec{U}$. Thus, one obtains a vector that is tangent to the unit 2-dimensional sphere in $\mathbb{R}^3$. The mapping is one-to-one and so it identifies the space of oriented lines $\mathbb{L}$ with the tangent bundle of the 2-sphere, $TS^2$, as in Fig. 2.
Fig. 2. The space of oriented lines.
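The identification of an oriented line with a pair $(\vec{U}, \vec{V})$ can be sketched in a few lines. The function name `oriented_line` and the sample point and direction below are hypothetical; the sketch only assumes that a line is given by one of its points and a nonzero direction vector.

```python
import math

def oriented_line(p, d):
    """Return (U, V): the unit direction of the line through p along d,
    and the point of the line closest to the origin."""
    norm = math.sqrt(sum(c * c for c in d))
    U = tuple(c / norm for c in d)
    s = sum(pc * uc for pc, uc in zip(p, U))       # component of p along U
    V = tuple(pc - s * uc for pc, uc in zip(p, U))  # remove that component
    return U, V

p0 = (1.0, -2.0, 0.5)      # hypothetical point on the line
d0 = (2.0, 3.0, 6.0)       # hypothetical direction, |d0| = 7
U, V = oriented_line(p0, d0)

# A second point on the same line should give the same pair (U, V).
p1 = tuple(pc + 2.5 * dc for pc, dc in zip(p0, d0))
U1, V1 = oriented_line(p1, d0)

dot_UV = sum(u * v for u, v in zip(U, V))
```

The pair satisfies $|\vec{U}| = 1$ and $\vec{U} \cdot \vec{V} = 0$, and it does not depend on which point of the line is used, which is what makes it a coordinate on the space of oriented lines.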
From our point of view, geometric optics is the study of 2-parameter families of oriented lines, or line congruences. A line congruence, which we consider as a surface in the non-compact 4-manifold $\mathbb{L}$, consists of the rays of the optical system. The space $\mathbb{L}$ has a wealth of natural geometric structure. In particular, it has a canonical complex structure so that it can be considered as a complex 2-manifold. For the purposes of this paper, the main implication of this is the existence of a pair of local complex coordinates on $TS^2$. First, choose coordinates $\xi$ on $S^2$ minus the south pole by stereographic projection from the south pole of the unit sphere about the origin onto the plane through the equator. This can be extended to coordinates on $TS^2$ minus the tangent plane over the south pole by identifying $(\xi, \eta) \in \mathbb{C}^2$ with
\[ \eta \frac{\partial}{\partial \xi} + \bar\eta \frac{\partial}{\partial \bar\xi} \in T_\xi S^2. \]
Thus $\xi$ gives the direction of the oriented line in $\mathbb{R}^3$ and $\eta$ gives the perpendicular distance vector from the origin to the line. Take Euclidean coordinates $(x^1, x^2, x^3)$ on $\mathbb{R}^3 = \mathbb{C} \oplus \mathbb{R}$, and set $z = x^1 + i x^2$, $t = x^3$. The point $(z, t)$ lies on the line $(\xi, \eta)$ iff the following incidence relation holds [3]:
\[ \eta = \frac{1}{2}\bigl( z - 2t\xi - \bar{z}\xi^2 \bigr). \tag{2.1} \]
We now consider reflection of a ray off a surface in $\mathbb{R}^3$.
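The incidence relation (2.1) implies that $\eta$ takes the same value at every point of a given line. A minimal numerical sketch of this fact follows; the helper names `xi_of` and `eta_of` and the sample line are hypothetical, and $\xi$ is computed by stereographic projection from the south pole as described above.

```python
def xi_of(U):
    """Stereographic coordinate of a unit vector, projecting from the south pole."""
    x, y, z = U
    return complex(x, y) / (1.0 + z)

def eta_of(point, xi):
    """Right-hand side of (2.1) for a point (x1, x2, x3), written as (z, t)."""
    z = complex(point[0], point[1])
    t = point[2]
    return 0.5 * (z - 2.0 * t * xi - z.conjugate() * xi * xi)

U = (2.0 / 7.0, 3.0 / 7.0, 6.0 / 7.0)   # hypothetical unit direction
p0 = (1.0, -2.0, 0.5)                   # hypothetical point on the line
xi = xi_of(U)

# Evaluate (2.1) at several points p0 + s * U of the same line.
etas = [eta_of(tuple(p + s * u for p, u in zip(p0, U)), xi)
        for s in (-3.0, 0.0, 1.7)]
```

All entries of `etas` agree up to rounding, so the pair `(xi, etas[0])` labels the line rather than any particular point on it.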
Consider an incoming ray (ξk , ηk ) ∈ T S2 reflecting off an oriented surface at a point (αk , bk ) ∈ C ⊕ R = R3 . Suppose that the oriented normal to the surface at the point of reflection is (νk , χk ) ∈ T S2 and that the reflected ray is (ξk+1 , ηk+1 ) ∈ T S2 . We denote the (oriented) distance of (αk , bk ) from the closest point to the origin on the incoming ray, reflected ray and normal by rk , rk+1 and sk , respectively (see Fig. 3).
Fig. 3. Reflection in a surface.
The following proposition describes reflection as the combined actions of PSL(2, $\mathbb{C}$), complex conjugation and fiber mappings on $TS^2$.

Proposition 2.1. [5] The reflected ray is given by
\[ \xi_{k+1} = \frac{2\nu_k \bar\xi_k + 1 - \nu_k \bar\nu_k}{(1 - \nu_k \bar\nu_k)\bar\xi_k - 2\bar\nu_k}, \tag{2.2} \]
\[ \eta_{k+1} = \frac{-(1 + \nu_k \bar\nu_k)^2 \bar\eta_k + 2(\bar\nu_k - \bar\xi_k)(1 + \nu_k \bar\xi_k)(1 + \nu_k \bar\nu_k) s_k}{((1 - \nu_k \bar\nu_k)\bar\xi_k - 2\bar\nu_k)^2}, \tag{2.3} \]
the distance of $(\alpha_k, b_k)$ from the closest point to the origin on the reflected ray is
\[ r_{k+1} = r_k + \frac{2(|\nu_k - \xi_k|^2 - |1 + \nu_k \bar\xi_k|^2)}{(1 + \nu_k \bar\nu_k)(1 + \xi_k \bar\xi_k)}\, s_k, \tag{2.4} \]
and the intersection equation is
\[ \eta_k = \frac{(1 + \bar\nu_k \xi_k)^2 \chi_k - (\nu_k - \xi_k)^2 \bar\chi_k + (\nu_k - \xi_k)(1 + \bar\nu_k \xi_k)(1 + \nu_k \bar\nu_k) s_k}{(1 + \nu_k \bar\nu_k)^2}. \tag{2.5} \]
There are many equivalent ways of rewriting and using these equations. In [5], these equations are used to explicitly determine the scattering of plane and spherical
waves off planes, spheres and tori. To do this, we first solve the intersection equation (2.5) to find the normal $\nu_k$ to the surface at the point of reflection and then use (2.2) and (2.3) to determine the outgoing ray. The Van Vleck determinant along a ray in a line congruence is a two-point function defined by
\[ \Delta_n(x, x') = \lim_{\delta \to 0} \delta^2 e^{-2\int_\delta^l H\, dr}, \]
where H is the mean curvature of the wavefront and r is an affine parameter along the straight line joining x and x . To get an explicit general form for the Van Vleck determinant in our formalism, we proceed as follows. As detailed in [4], the mean curvature of a wavefront orthogonal to the parametrized line congruence ¯ η(ζ, ζ)), ¯ for ζ ∈ C is ζ → (ξ(ζ, ζ), H= where ∂ =
∂ ∂ζ
∂ + η ∂¯ ξ¯ − ∂ − η ∂ ξ¯ , ∂+η ∂+η − ∂−η ∂−η
and ∂ + η ≡ ∂η −
¯ 2η ξ∂ξ + r∂ξ, 1 + ξ ξ¯
¯ − ∂ − η ≡ ∂η
¯ 2η ξ¯ ∂ξ ¯ + r∂ξ. 1 + ξ ξ¯
Suppose we start with a spherical wave emanating from the point $(\alpha_0, b_0)$ ∈ C ⊕ R = R3 and this is reflected n times before passing through a point $(\alpha_{n+1}, b_{n+1})$. Let $(\xi_k, \eta_k)$ be the line congruence after the kth reflection.

Proposition 2.2. The geometric form of the Van Vleck determinant between the start and end points is
\[ \Delta_n(0, n+1) = \frac{1}{l_0^2}\prod_{k=1}^{n}\frac{[\Psi_k]_1}{[\Psi_k]_2}, \]
where $l_0$ is the distance from the initial point to the first reflection, $\Psi_k = \partial^+\eta\,\overline{\partial^+\eta} - \partial^-\eta\,\overline{\partial^-\eta}$, $[\Psi_k]_1$ is the evaluation of $\Psi_k$ at the start of the kth reflection and $[\Psi_k]_2$ is the evaluation of $\Psi_k$ at the end of the kth reflection.

Proof. This follows immediately from the fact that the mean curvature is the derivative with respect to r of the natural log of $\Psi_k$. The term $\delta^{-2}$ is cancelled by the singularity in the spherical wavefront at the initial point.

Corollary 2.3. The Van Vleck determinant of two points between plane boundaries is the reciprocal of the path length squared.

Proof. A spherical wavefront reflected in a plane remains spherical with the same principal curvatures at the point of reflection and so $[\Psi_k]_2 = [\Psi_{k+1}]_1$. Moreover,
\[ [\sqrt{\Psi_k}]_2 = [\sqrt{\Psi_k}]_1 + l_k, \]
where $l_k$ is the length of the kth reflected path. Thus
\[ \Delta_n(0, n+1) = \frac{1}{[\Psi_n]_2} = \Big(\sum_{k=0}^{n} l_k\Big)^{-2}, \]
as claimed. For the background of more standard aspects of geometric optics, see [9, Chap. 5 and Appendix B].

3. The Geometric Optics Approximation to the Casimir Effect in a Wedge

Since its discovery over 50 years ago [2], the Casimir effect has been the subject of much research. Recently, the effect has been measured in different situations [1] and, as a result, there is renewed interest in calculating the Casimir effect between differently shaped boundaries [10]. Jaffe and Scardicchio [7] have given the following explicit formula for the Casimir energy in terms of geometric quantities:
\[ E = -\frac{\hbar c}{2\pi^2}\sum_n (-1)^n \int_{D_n} \frac{\sqrt{\Delta_n(x,x)}}{l_n^3(x)}\,d^3x. \]
Here the sum is over straight paths with n bounces between the boundaries which begin and end at the point x (although not necessarily periodic), $l_n$ is the length of such a path and $\Delta_n(x, x)$ is the expansion factor, or Van Vleck determinant, as described above. The origin of this approximation lies in the path integral representation of the Helmholtz Green's function with Dirichlet boundary conditions:
\[ (\Delta + k^2)\Phi = 0 \quad\text{in } D, \qquad \Phi = 0 \quad\text{on } \partial D, \]
where k is the wavenumber. If µ is the mass of the field, the well-known expression for the Casimir energy in terms of the density of states is
\[ E = \frac{\hbar c}{\pi}\int_0^\infty dk\, k\sqrt{k^2+\mu^2}\int [G(x,x,k) - G_0(x,x,k)]\,d^3x, \]
where $G(x, x', k)$ and $G_0(x, x', k)$ are the Green's functions for the Helmholtz equation with and without boundary conditions. By taking a stationary phase approximation, Jaffe and Scardicchio [8] obtain the geometric optics approximation to the Green's function with Dirichlet boundary conditions:
\[ G_{\rm opt}(x, x', k) = \sum_n G_n(x, x', k), \]
where the nth-reflection Green's function turns out to be
\[ G_n = (-1)^n\,\frac{\Delta_n^{1/2}}{\pi}\,e^{ikl_n}. \]
Carrying out the k integration in the energy then yields the result stated above in the limit as µ → 0. The optical approximation is valid when the scales of diffraction are large compared to the scales of the strength of the Casimir force. This will typically be given by the ratio of the separation of the conducting boundaries to their curvature. The geometric optics approximation would need to be modified in the presence of caustics (where the Van Vleck determinant blows up) and to accommodate non-zero temperatures.

The computation of the Casimir effect can, in this approximation, be reduced to determining the length $l_n$, Van Vleck determinant $\Delta_n$ and region of existence $D_n$ for closed n-bounce paths between some prescribed boundary components. When the boundaries are planes, the Van Vleck determinant is just the reciprocal of the squared length of the path, so we need only compute $l_n$ and $D_n$.

We now consider the Casimir effect in a wedge formed by two non-parallel planes. Since any closed path in the wedge will be contained in a plane perpendicular to the line of intersection of the planes, the problem is reduced to a 2-dimensional one. Consider, then, a wedge in the plane with opening angle γ, where 0 < γ < π/2. The following two propositions determine the existence and length of closed paths as a function of γ and the number of reflections, by applying the classical method of images.

Proposition 3.1. There exists a closed 2m-bounce path in a wedge with angle γ iff γ < π/(2m). There exist at most two closed paths (traversed in opposite directions) and the length of such a closed path is 2R|sin(mγ)|.

Proof. We reflect the wedge 2m times through one of its sides. For a point (R, ψ) in the wedge, we also reflect it 2m times. The image of its path reflected in the wedge is a straight line joining the initial point to the final image point.
The angle between the ray from the origin to the beginning and endpoints is 2mγ and so the closed path exists iff 2mγ < π, which proves the first part of the proposition. Applying the cosine rule, we find that the distance between these two points, which is equal to the length of the closed path, is 2R|sin(mγ)|, as claimed.
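The method of images used in this proof can be sketched numerically. The composition of 2m alternating reflections in the two sides of the wedge is a rotation about the vertex by 2mγ, so the unfolded closed path is the chord joining a point to its rotated image. The helper below is a minimal illustration under that assumption; the function name and the use of a standard polar angle are choices made here, not notation from the paper.

```python
import math


def closed_even_path_length(R, psi, gamma, m):
    """Length of the closed 2m-bounce path through a point at distance R
    from the vertex, obtained by unfolding the wedge (method of images).

    Returns None when 2*m*gamma >= pi, i.e. when no closed path exists.
    psi is the polar angle of the point; the length does not depend on it.
    """
    if 2 * m * gamma >= math.pi:
        return None
    x0, y0 = R * math.cos(psi), R * math.sin(psi)
    # 2m alternating reflections compose to a rotation by 2*m*gamma
    angle = psi + 2 * m * gamma
    x1, y1 = R * math.cos(angle), R * math.sin(angle)
    return math.hypot(x1 - x0, y1 - y0)


print(closed_even_path_length(2.0, 0.1, 0.3, 2))
print(2 * 2.0 * abs(math.sin(2 * 0.3)))  # closed form 2R|sin(m*gamma)|
```

The two printed values should agree to rounding error, matching the chord-length statement of Proposition 3.1.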
Fig. 4. The method of images for even and odd bounces.
The existence and length of a closed odd-bounce path is somewhat different. Let (R, ψ) be polar coordinates in the plane, where R is the distance from the vertex, the bottom plane is aligned with the horizontal, and ψ measures the angle from the vertical.

Proposition 3.2. There exists a closed (2m − 1)-bounce path starting at the point (R sin ψ, R cos ψ) in a wedge of opening angle γ iff γ < π/(2m − 2) and either of the following hold: ψ < π − mγ or ψ > (m − 1)γ. The former applies to paths that first strike the non-horizontal plane, while the latter applies to paths that first strike the horizontal plane. There exist exactly zero, one or two closed (2m − 1)-bounce paths from a given point, according to the inequalities above. The total length of a closed (2m − 1)-reflection path in a wedge with angle γ starting at the point (R sin ψ, R cos ψ) is either 2R|cos(ψ + mγ)| or 2R|cos[ψ − (m − 1)γ]| (respectively).

Proof. Applying the classical method of images again, we reflect the wedge (2m − 1) times through one of its sides. For a point (R, ψ) in the wedge, we also reflect it (2m − 1) times, in the non-horizontal plane first. The image of its path reflected in the wedge is a straight line joining the initial point to the final image point. The angle between the beginning and endpoints is 2ψ − π + 2mγ and so the closed path exists iff 2ψ − π + 2mγ < π, which proves the first part of the proposition. Applying the cosine rule, we find that the distance between the two points, which is equal to the length of the closed path, is 2R|cos(ψ + mγ)|, as claimed. The argument is similar for the path that first strikes the horizontal plane.

Corollary 3.3. A closed odd-bounce path retraces itself.

Proof. It is clear from symmetry that a reflected odd-bounce path crosses the middle plane image at right angles, and so is reflected back along itself.

By way of example, Fig. 5 shows the regions where a closed 3-bounce exists for various opening angles γ.
Points in the lightly shaded areas lie on one closed 3-bounce, while those in darker areas lie on two.
Fig. 5. Regions where a closed 3-bounce exists.
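Introducing the preceding geometric expression for the path lengths into Jaffe and Scardicchio's optical approximation for the Casimir energy, as described in the last section, we find that for a wedge with opening angle γ satisfying $\frac{\pi}{2m+2} \le \gamma < \frac{\pi}{2m}$:
\[ E = -\frac{\hbar c}{16\pi^2}\sum_{n=1}^{m}\int_{\frac{\pi}{2}-\gamma\le\psi\le\frac{\pi}{2}} \frac{1}{R^4\sin^4(n\gamma)}\,d^3x + \frac{\hbar c}{32\pi^2}\sum_{n=1}^{m}\int_{\psi_1<\psi<\frac{\pi}{2}} \frac{1}{R^4\cos^4[\psi+(n+1)\gamma]}\,d^3x + \frac{\hbar c}{32\pi^2}\sum_{n=1}^{m}\int_{\frac{\pi}{2}-\gamma<\psi<\psi_2} \frac{1}{R^4\cos^4(\psi-n\gamma)}\,d^3x, \]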
where $\psi_1 = \max\{n\gamma, \frac{\pi}{2}-\gamma\}$ and $\psi_2 = \min\{\pi-(n+1)\gamma, \frac{\pi}{2}\}$. Here, we have used the fact that even bounces count twice as they can be traversed in either direction, while odd bounces give two contributions: one from reflecting off the horizontal plane first and one from reflecting off the top plane first. We denote these three contributions by $E_{2m}$, $E^H_{2m+1}$ and $E^T_{2m+1}$, respectively. The 1-bounce paths have been excluded from the sum as the energy density diverges for such paths, since their lengths go to zero as one approaches the boundary.

The difficulty with the above integrals is that they are divergent at the limit R = 0. This is caused by the intersection of the two planes where the closed path lengths go to zero. Indeed, we cannot expect the geometric optics approximation to be accurate near corners. In order to remove this difficulty, we put a finite separation between the two plates. This will alter the regions of integration $D_n$, and this we investigate after introducing some complex geometry in the next section.

4. Reflections in an Infinite Wedge

Consider a ray originating from $(\alpha_0, b_0)$ and reflecting off a series of planes. Suppose that the points of reflection on the planes are given by $(\alpha_k, b_k)$, the reflected directions are $\xi_{k+1}$ and $l_{k\,k+1}$ are the (oriented) distances between these points, for k = 1, 2, . . . . Let $\nu_k$ be the normal direction of the kth plane.

Proposition 4.1. The points of reflection are given by
\[ \alpha_k = \alpha_{k-1} + \frac{2\xi_k}{1+\xi_k\bar\xi_k}\,l_{k-1\,k}, \qquad b_k = b_{k-1} + \frac{1-\xi_k\bar\xi_k}{1+\xi_k\bar\xi_k}\,l_{k-1\,k}, \tag{4.1} \]
where
\[ l_{k-1\,k} = -\frac{(1+\xi_k\bar\xi_k)\big(\alpha_{k-1}\bar\nu_k + \bar\alpha_{k-1}\nu_k + (1-\nu_k\bar\nu_k)b_{k-1} - (1+\nu_k\bar\nu_k)s_k\big)}{|1+\nu_k\bar\xi_k|^2 - |\nu_k-\xi_k|^2}. \tag{4.2} \]
Proof. The first two equations hold by the definition of $l_{k-1\,k}$ as being the distance between, and $\xi_k$ the direction of the oriented line joining, $(\alpha_k, b_k)$ and $(\alpha_{k-1}, b_{k-1})$. The normal line contains the point $(\alpha_k, b_k)$ and the incoming ray contains the point $(\alpha_{k-1}, b_{k-1})$. The corresponding incidence relations (2.1) are
\[ \chi_k = \tfrac{1}{2}(\alpha_k - 2b_k\nu_k - \bar\alpha_k\nu_k^2), \qquad \eta_k = \tfrac{1}{2}(\alpha_{k-1} - 2b_{k-1}\xi_k - \bar\alpha_{k-1}\xi_k^2). \]
Introducing these relations together with the first two of the proposition into Eq. (2.5) gives the third.
From here on we consider only the case of two plane boundaries. Consider now reflection in two planes, one of which is horizontal, both containing the origin. Then $\nu_1 = 0$ and we let $\nu_2$ be the normal direction to the non-horizontal plane pointing inward in the first quadrant.

Proposition 4.2. The lengths of the reflected paths satisfy
\[ l_{2k-1\,2k} = -\frac{(1+\xi_{2k}\bar\xi_{2k})(\alpha_{2k-1}\bar\nu_2 + \bar\alpha_{2k-1}\nu_2)}{(1-\xi_{2k}\bar\xi_{2k})(1-\nu_2\bar\nu_2) + 2\xi_{2k}\bar\nu_2 + 2\bar\xi_{2k}\nu_2}, \tag{4.3} \]
\[ l_{2k\,2k+1} = \frac{(1+\xi_{2k+1}\bar\xi_{2k+1})(1-\xi_{2k}\bar\xi_{2k})}{(1-\xi_{2k+1}\bar\xi_{2k+1})(1+\xi_{2k}\bar\xi_{2k})}\,l_{2k-1\,2k}. \tag{4.4} \]
Proof. The first equation follows from Eq. (4.2) and the fact that $s_k = 0$ and $b_{2k-1} = 0$. The second equation follows from the same equation using $\nu_{2k+1} = 0$ and the second equation of (4.1) in Eq. (4.2).

In the case of two planes, there is a translational symmetry which we now exploit. We assume that the line of intersection of the planes lies along the $x_1$-axis and the acute angle lies in the first quadrant. Set $\nu_2 = \tan(\beta/2)$ so that γ = π − β is the opening angle of the wedge. With these simplifications, the problem reduces to reflection in two lines in the $x_2x_3$-plane, and $\alpha_k$, $\xi_k$ and $\nu_k$ are all real. Let $\alpha_k = a_k \in \mathbb{R}$ and introduce polar coordinates (R, ψ) in the $x_2x_3$-plane. The reflected rays after k bounces are given by:

Proposition 4.3. Consider a ray with direction $\xi_1$ emanating from the point $(a_0, b_0)$ and striking the horizontal plane. The sequence of reflected rays is given by
\[ \xi_{2k} = \frac{-\sin[(k-1)\beta]\,\xi_1 + \cos[(k-1)\beta]}{\cos[(k-1)\beta]\,\xi_1 + \sin[(k-1)\beta]}, \qquad \xi_{2k+1} = \frac{\cos[k\beta]\,\xi_1 + \sin[k\beta]}{-\sin[k\beta]\,\xi_1 + \cos[k\beta]}, \tag{4.5} \]
\[ \eta_{2k} = \frac{-(a_0 - 2b_0\xi_1 - a_0\xi_1^2)}{2(\cos[(k-1)\beta]\,\xi_1 + \sin[(k-1)\beta])^2}, \qquad \eta_{2k+1} = \frac{a_0 - 2b_0\xi_1 - a_0\xi_1^2}{2(-\sin[k\beta]\,\xi_1 + \cos[k\beta])^2}. \tag{4.6} \]

Proof. These follow from (2.2) and (2.3) by induction on k.
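The induction behind Proposition 4.3 can be checked numerically by iterating the real form of Eq. (2.2), alternating between the horizontal plane (ν = 0) and the tilted plane (ν = tan(β/2)), and comparing against the closed form. The sketch below assumes the odd-index formula reads ξ_{2k+1} = (cos[kβ]ξ₁ + sin[kβ])/(−sin[kβ]ξ₁ + cos[kβ]), which is what direct iteration of (2.2) gives; the function names are illustrative only.

```python
import math


def reflect_direction(xi, nu):
    """Eq. (2.2) for real xi and nu."""
    return (2 * nu * xi + 1 - nu * nu) / ((1 - nu * nu) * xi - 2 * nu)


def closed_form(xi1, beta, j):
    """Direction xi_j (j >= 1) from the closed form of Proposition 4.3."""
    if j % 2 == 0:
        a = (j // 2 - 1) * beta
        return (-math.sin(a) * xi1 + math.cos(a)) / (math.cos(a) * xi1 + math.sin(a))
    a = ((j - 1) // 2) * beta
    return (math.cos(a) * xi1 + math.sin(a)) / (-math.sin(a) * xi1 + math.cos(a))


def iterate_directions(xi1, beta, n):
    """Directions xi_1..xi_n, first reflection off the horizontal plane."""
    nu2 = math.tan(beta / 2)
    xis = [xi1]
    for j in range(1, n):
        nu = 0.0 if j % 2 == 1 else nu2   # odd reflections: horizontal plane
        xis.append(reflect_direction(xis[-1], nu))
    return xis


xi1, beta = 0.3, 2.5
for j, xi in enumerate(iterate_directions(xi1, beta, 6), start=1):
    print(j, xi, closed_form(xi1, beta, j))
```

Each printed pair should agree to rounding error. In terms of ξ₁ = tan(φ/2), the odd directions are tan(φ/2 + kβ) and the even ones cot(φ/2 + (k − 1)β), which is consistent with Lemma 4.4.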
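For future calculations we note that:

Lemma 4.4. If $\xi_1 = \tan(\phi/2)$, the sequence of reflected directions satisfies
\[ \frac{1-\xi_{2k}\bar\xi_{2k}}{1+\xi_{2k}\bar\xi_{2k}} = -\cos[\phi+2(k-1)\beta], \qquad \frac{2\xi_{2k}}{1+\xi_{2k}\bar\xi_{2k}} = \sin[\phi+2(k-1)\beta], \tag{4.7} \]
\[ \frac{1-\xi_{2k+1}\bar\xi_{2k+1}}{1+\xi_{2k+1}\bar\xi_{2k+1}} = \cos[\phi+2k\beta], \qquad \frac{2\xi_{2k+1}}{1+\xi_{2k+1}\bar\xi_{2k+1}} = \sin[\phi+2k\beta]. \tag{4.8} \]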
Proof. These follow, with the aid of trigonometric identities, from Eqs. (4.5).

4.1. Even reflections

We now study paths that return to the original point after 2m reflections. The direction of the initial ray is given by:

Proposition 4.5. A point $(a_0, b_0)$ lies on a closed path with 2m reflections iff the initial direction of the ray is
\[ \xi_1 = \frac{\sin[\psi-m\beta] \pm 1}{\cos[\psi-m\beta]}, \tag{4.9} \]
where $a_0 = R\sin\psi$ and $b_0 = R\cos\psi$. For $\xi_1 = \tan(\phi/2)$, this is equivalent to
\[ \cos\phi = \mp\sin[\psi-m\beta], \qquad \sin\phi = \pm\cos[\psi-m\beta]. \tag{4.10} \]
Proof. The initial point $(a_0, b_0)$ is on a closed 2m-bounce if it is contained on the final outgoing ray:
\[ \eta_{2m+1} = \tfrac{1}{2}(a_0 - 2b_0\xi_{2m+1} - a_0\xi_{2m+1}^2). \]
Substituting the first of Eqs. (4.5) and (4.6) in this gives the quadratic equation
\[ \cos[\psi-m\beta]\,\xi_1^2 - 2\sin[\psi-m\beta]\,\xi_1 - \cos[\psi-m\beta] = 0. \]
The solution to this is (4.9), or equivalently, (4.10).

For future reference we note that:

Lemma 4.6. For a closed 2m-reflection path with first reflection off the horizontal plane,
\[ \frac{1-\xi_{2k}\bar\xi_{2k}}{1+\xi_{2k}\bar\xi_{2k}} = \pm\sin[\psi-(m-2k+2)\beta], \qquad \frac{2\xi_{2k}}{1+\xi_{2k}\bar\xi_{2k}} = \pm\cos[\psi-(m-2k+2)\beta], \tag{4.11} \]
\[ \frac{1-\xi_{2k+1}\bar\xi_{2k+1}}{1+\xi_{2k+1}\bar\xi_{2k+1}} = \mp\sin[\psi-(m-2k)\beta], \qquad \frac{2\xi_{2k+1}}{1+\xi_{2k+1}\bar\xi_{2k+1}} = \pm\cos[\psi-(m-2k)\beta]. \tag{4.12} \]
Proof. These follow from substituting (4.10) in (4.7) and (4.8).

For a closed 2m-bounce path, the sequence of points of intersection with the boundaries and the lengths of the paths are given by:

Proposition 4.7. For a closed 2m-reflection path with first reflection off the horizontal plane, the sequence of points of reflection and path lengths are
\[ a_{2k} = \frac{R\cos(m\beta)\cos\beta}{\sin[\psi-(m-2k+1)\beta]}, \qquad a_{2k-1} = \frac{R\cos(m\beta)}{\sin[\psi-(m-2k+2)\beta]}, \tag{4.13} \]
\[ b_{2k} = -\frac{R\cos(m\beta)\sin\beta}{\sin[\psi-(m-2k+1)\beta]}, \qquad b_{2k-1} = 0, \tag{4.14} \]
\[ l_{k\,k+1} = \mp\frac{R\cos(m\beta)\sin\beta}{\sin[\psi-(m-k)\beta]\,\sin[\psi-(m-k+1)\beta]}, \tag{4.15} \]
\[ l_{0\,1} = \pm\frac{R\cos\psi}{\sin[\psi-m\beta]}, \qquad l_{2m\,0} = \mp\frac{R\cos[\psi-\beta]}{\sin[\psi+(m-1)\beta]}, \tag{4.16} \]
where the signs are chosen to make the lengths positive.

Proof. These follow from Eqs. (4.1) to (4.12) by induction on k.

4.2. Odd reflections

The direction of an initial ray that returns to the original point after (2m − 1) reflections is given by:

Proposition 4.8. A point $(a_0 = R\sin\psi,\ b_0 = R\cos\psi)$ is on a closed (2m − 1)-bounce which strikes the horizontal plane first if
\[ \xi_1 = \frac{\cos[(m-1)\beta] \pm 1}{\sin[(m-1)\beta]}, \tag{4.17} \]
or, equivalently,
\[ \sin\phi = \pm\sin[(m-1)\beta], \qquad \cos\phi = \mp\cos[(m-1)\beta]. \tag{4.18} \]
Note that the two solutions are antipodal.

Proof. The initial point $(a_0, b_0)$ is on a closed (2m − 1)-bounce if it is contained on the final outgoing ray (cf. Eq. (2.1)):
\[ \eta_{2m} = \tfrac{1}{2}(a_0 - 2b_0\xi_{2m} - a_0\xi_{2m}^2). \]
Substituting the first of Eqs. (4.5) and (4.6) in this gives the quadratic equation
\[ \big[\sin\psi(-1+\cos[2(m-1)\beta]) + \cos\psi\sin[2(m-1)\beta]\big]\bar\xi_1^2 + 2\big[\cos\psi(-1-\cos[2(m-1)\beta]) + \sin\psi\sin[2(m-1)\beta]\big]\bar\xi_1 - \sin\psi(-1+\cos[2(m-1)\beta]) - \cos\psi\sin[2(m-1)\beta] = 0. \]
The solution to this is (4.17), or, equivalently, (4.18).

For future use we note the following:

Lemma 4.9. For a closed (2m − 1)-reflection path with first reflection off the horizontal plane,
\[ \frac{1-\xi_{2k}\bar\xi_{2k}}{1+\xi_{2k}\bar\xi_{2k}} = \pm\cos[(m-2k+1)\beta], \qquad \frac{2\xi_{2k}}{1+\xi_{2k}\bar\xi_{2k}} = \pm\sin[(m-2k+1)\beta], \tag{4.19} \]
\[ \frac{1-\xi_{2k+1}\bar\xi_{2k+1}}{1+\xi_{2k+1}\bar\xi_{2k+1}} = \mp\cos[(m-2k-1)\beta], \qquad \frac{2\xi_{2k+1}}{1+\xi_{2k+1}\bar\xi_{2k+1}} = \pm\sin[(m-2k-1)\beta]. \tag{4.20} \]
Proof. These follow from substituting (4.18) in (4.7) and (4.8).

For a closed (2m − 1)-bounce path, the sequence of points of intersection with the boundaries and the lengths of the paths are given by:

Proposition 4.10. For a closed (2m − 1)-bounce with first reflection off the horizontal plane, the sequence of points of reflection and path lengths are
\[ a_{2k} = \frac{R\sin[\psi+(m-1)\beta]\cos\beta}{\cos[(m-2k)\beta]}, \qquad a_{2k-1} = \frac{R\sin[\psi+(m-1)\beta]}{\cos[(m-2k+1)\beta]}, \tag{4.21} \]
\[ b_{2k} = -\frac{R\sin[\psi+(m-1)\beta]\sin\beta}{\cos[(m-2k)\beta]}, \qquad b_{2k-1} = 0, \tag{4.22} \]
\[ l_{k\,k+1} = \mp\frac{R\sin[\psi+(m-1)\beta]\sin\beta}{\cos[(m-k)\beta]\,\cos[(m-k-1)\beta]}, \qquad l_{0\,1} = \pm\frac{R\cos\psi}{\cos[(m-1)\beta]}, \tag{4.23} \]
where the signs are chosen to make the lengths positive. Proof. These follow from Eqs. (4.1), (4.3), (4.18) to (4.20) by induction on k. As proved earlier, a closed odd bounce retraces itself. This can also be seen from the fact that for a closed (2m − 1)-bounce, with m odd, a2k−1 = a2(m−k)+1 , while for m even a2k = a2(m−k) .
Proposition 4.11. For a closed (2m − 1)-bounce with first reflection off the non-horizontal plane, the sequence of points of reflection is
\[ a_{2k} = \frac{R\sin[\psi-m\beta]}{\cos[(m-2k)\beta]}, \qquad a_{2k-1} = \frac{R\sin[\psi-m\beta]\cos\beta}{\cos[(m-2k+1)\beta]}, \tag{4.24} \]
\[ b_{2k-1} = -\frac{R\sin[\psi-m\beta]\sin\beta}{\cos[(m-2k+1)\beta]}, \qquad b_{2k} = 0. \tag{4.25} \]
Proof. These follow from (4.21) and (4.22) by reflecting through the bisector of the wedge, (R, ψ) → (R, β − ψ), so that
\[ (a_{2k}, b_{2k}) \to \Big(-\frac{a_{2k}}{\cos\beta},\, 0\Big), \qquad (a_{2k-1}, 0) \to (-a_{2k-1}\cos\beta,\ a_{2k-1}\sin\beta). \]
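Note. Instead of using the method of images as in Sec. 3, the lengths of the closed 2m- and (2m − 1)-bounce paths can be found from Propositions 4.7 and 4.10 and the following two trigonometric identities:
\[ \frac{\cos(\psi-\beta)}{\sin[\psi+(m-1)\beta]} - \frac{\cos\psi}{\sin(\psi-m\beta)} + \sum_{k=1}^{2m-1}\frac{\cos(m\beta)\sin\beta}{\sin[\psi-(m-k)\beta]\,\sin[\psi-(m-k+1)\beta]} = 2\sin(m\beta), \]
\[ \frac{\cos\psi}{\cos[(m-1)\beta]} - \sum_{k=1}^{m-1}\frac{\sin[\psi+(m-1)\beta]\sin\beta}{\cos[(m-k)\beta]\,\cos[(m-k-1)\beta]} = \cos[\psi+(m-1)\beta]. \]
In the odd case the sum covers the path only up to its turning point; since a closed odd-bounce path retraces itself, the total length is twice this value.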
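The two trigonometric identities in this Note can be spot-checked numerically. The sketch below evaluates the left-hand sides in the form in which both sums telescope (via cot and tan differences): the even-bounce expression cos(ψ − β)/sin[ψ + (m − 1)β] − cos ψ/sin(ψ − mβ) + Σ should equal 2 sin(mβ), and the odd-bounce expression cos ψ/cos[(m − 1)β] − Σ should equal cos[ψ + (m − 1)β], which is half of the retraced path length per unit R. The function names and sample values are arbitrary choices made for illustration.

```python
import math


def even_identity_lhs(psi, beta, m):
    """Left-hand side of the even (2m-bounce) identity."""
    total = (math.cos(psi - beta) / math.sin(psi + (m - 1) * beta)
             - math.cos(psi) / math.sin(psi - m * beta))
    for k in range(1, 2 * m):            # k = 1, ..., 2m - 1
        total += (math.cos(m * beta) * math.sin(beta)
                  / (math.sin(psi - (m - k) * beta)
                     * math.sin(psi - (m - k + 1) * beta)))
    return total


def odd_identity_lhs(psi, beta, m):
    """Left-hand side of the odd ((2m-1)-bounce) identity, half path."""
    total = math.cos(psi) / math.cos((m - 1) * beta)
    for k in range(1, m):                # k = 1, ..., m - 1
        total -= (math.sin(psi + (m - 1) * beta) * math.sin(beta)
                  / (math.cos((m - k) * beta) * math.cos((m - k - 1) * beta)))
    return total


psi, beta, m = 0.4, 3.0, 3
print(even_identity_lhs(psi, beta, m), 2 * math.sin(m * beta))
print(odd_identity_lhs(psi, beta, m), math.cos(psi + (m - 1) * beta))
```

Each printed pair should agree to rounding error for generic values of ψ and β away from the zeros of the denominators.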
5. The Casimir Energy With Finite Top Plate

Consider the case where the top plate is finite, lying between a radial distance $R = R_0$ and $R = R_1$ from the origin, where $R_1 > R_0$. Let the width of the top plate be W and the bottom horizontal plate through the origin be infinite in all directions. The effect of limiting the upper plate is that some of the closed paths must now be excluded from the energy as they run off the edge of the plate. This restricts the domain of integration $D_n$.

Consider a closed 2m- or (2m + 1)-bounce path that first strikes the horizontal plate. The closest and furthest points from the vertex have $x_2$-coordinate:
\[ \begin{array}{lcc} & \text{Closest} & \text{Furthest} \\ m \text{ odd} & a_{m+1} & a_{2m} \\ m \text{ even} & a_m & a_{2m} \end{array} \]
For a (2m + 1)-bounce path that strikes the top plate first, the closest and furthest points from the vertex have $x_2$-coordinate:
\[ \begin{array}{lcc} & \text{Closest} & \text{Furthest} \\ m \text{ odd} & a_m & a_1 \\ m \text{ even} & a_{m+1} & a_1 \end{array} \]
We will use these to restrict the domains of integration, taking the odd and even bounce cases separately.

5.1. The even bounce contribution

We start by considering the case of a 2m-bounce. The restrictions we must introduce for the finite plate, obtained from (4.13), are
\[ \begin{array}{lll} m \text{ odd:} & a_{m+1} \ge -R_0\cos\beta, & a_{2m} \le -R_1\cos\beta, \\ m \text{ even:} & a_m \ge -R_0\cos\beta, & a_{2m} \le -R_1\cos\beta. \end{array} \]
Suppose that m = 2n. Then the restrictions are
\[ a_m = a_{2n} = \frac{R\cos(2n\beta)\cos\beta}{\sin(\psi-\beta)} \ge -R_0\cos\beta, \tag{5.1} \]
\[ a_{2m} = a_{4n} = \frac{R\cos(2n\beta)\cos\beta}{\sin[\psi+(2n-1)\beta]} \le -R_1\cos\beta, \tag{5.2} \]
which we put together to find
\[ -\frac{R_0\sin(\psi-\beta)}{\cos(2n\beta)} \le R \le -\frac{R_1\sin[\psi+(2n-1)\beta]}{\cos(2n\beta)}. \]
We now consider the restriction this inequality places on ψ, namely:
\[ -R_0\sin(\psi-\beta) \le -R_1\sin[\psi+(2n-1)\beta]. \]
Since $\beta - \frac{\pi}{2} \le \psi \le \frac{\pi}{2}$ and $\beta > \frac{4n-1}{4n}\pi$, the function
\[ f(\psi) = -R_0\sin(\psi-\beta) + R_1\sin[\psi+(2n-1)\beta] \]
is a decreasing function of ψ. Thus the inequality holds iff $f(\frac{\pi}{2}) \le 0$, or
\[ -R_0\cos\beta \le -R_1\cos[(2n-1)\beta]. \]
We consider two cases: either $f(\beta-\frac{\pi}{2}) \le 0$, i.e. $R_0 \le R_1\cos(2n\beta)$, or there exists $\psi_0 \in [\beta-\frac{\pi}{2}, \frac{\pi}{2}]$ such that $f(\psi_0) = 0$:
\[ -R_0\sin(\psi_0-\beta) = -R_1\sin[\psi_0+(2n-1)\beta]. \]
This can also be written
\[ \tan\psi_0 = \frac{R_0\sin\beta + R_1\sin[(2n-1)\beta]}{R_0\cos\beta - R_1\cos[(2n-1)\beta]}. \tag{5.3} \]
In the former case, the region of integration is $[\beta-\frac{\pi}{2}, \frac{\pi}{2}]$ while in the latter case it is $[\psi_0, \frac{\pi}{2}]$.
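As a small numerical illustration of Eq. (5.3), the sketch below computes an angle ψ₀ with the stated tangent and confirms that it is a zero of f. Using `atan2` to pick the branch is an assumption made here for convenience; in general one still has to check that the returned angle (or the one shifted by π) lies in [β − π/2, π/2].

```python
import math


def f(psi, beta, n, R0, R1):
    """f(psi) = -R0 sin(psi - beta) + R1 sin(psi + (2n - 1) beta)."""
    return -R0 * math.sin(psi - beta) + R1 * math.sin(psi + (2 * n - 1) * beta)


def psi0(beta, n, R0, R1):
    """An angle satisfying Eq. (5.3); it is a zero of f modulo pi."""
    num = R0 * math.sin(beta) + R1 * math.sin((2 * n - 1) * beta)
    den = R0 * math.cos(beta) - R1 * math.cos((2 * n - 1) * beta)
    return math.atan2(num, den)


# 4-bounce (n = 1) with R0 = 1, R1 = 2 and wedge angle gamma = 30 degrees
beta = math.pi - math.radians(30)
p0 = psi0(beta, 1, 1.0, 2.0)
print(math.degrees(p0), f(p0, beta, 1, 1.0, 2.0))
```

For these values R₀ = R₁ cos(2β) holds exactly, so ψ₀ coincides with the lower endpoint β − π/2, which is the borderline between the two cases described above.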
Figure 6 shows the regions of integration for a closed 4-bounce for various wedge angles when $R_0 = 1$ and $R_1 = 2$. Here we can see the transition from the latter to the former integration regions occurring at wedge angle γ = 30°.

Fig. 6. Integration regions for closed 4-bounce (panels for wedge angles 34°, 30°, 23° and 14°).
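In fact, the regions of integration lie between two circles which pass through the origin and whose second point of intersection lies in the first quadrant at an angle $\psi_0$. This follows from studying Eqs. (5.1) and (5.2). The region of integration is also seen to lie within the wedge.

If we let $\psi_1 = \max\{\psi_0, \beta-\frac{\pi}{2}\}$, then the energy associated with a closed 4n-bounce can be computed:
\[ E_{4n} = -\frac{\hbar cW}{32\pi^2}\int_{\psi_1}^{\frac{\pi}{2}}\int_{-\frac{R_0\sin(\psi-\beta)}{\cos(2n\beta)}}^{-\frac{R_1\sin(\psi+(2n-1)\beta)}{\cos(2n\beta)}} \frac{1}{R^3\sin^4(2n\beta)}\,dR\,d\psi \]
\[ = \frac{\hbar cW}{64\pi^2\sin^4(2n\beta)}\int_{\psi_1}^{\frac{\pi}{2}}\left[\frac{\cos^2(2n\beta)}{R_1^2\sin^2[\psi+(2n-1)\beta]} - \frac{\cos^2(2n\beta)}{R_0^2\sin^2(\psi-\beta)}\right]d\psi \]
\[ = -\frac{\hbar cW\cos^2(2n\beta)}{64\pi^2\sin^4(2n\beta)}\left[\frac{\cot[\psi+(2n-1)\beta]}{R_1^2} - \frac{\cot(\psi-\beta)}{R_0^2}\right]_{\psi_1}^{\frac{\pi}{2}} \]
\[ = -\frac{\hbar cW\cos^2(2n\beta)\cos\psi_1}{64\pi^2\sin^4(2n\beta)}\left[\frac{1}{R_0^2\cos\beta\,\sin(\psi_1-\beta)} - \frac{1}{R_1^2\cos[(2n-1)\beta]\,\sin[\psi_1+(2n-1)\beta]}\right]. \]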
Now suppose that m = 2n − 1. The restrictions, together with (4.13), lead to
\[ a_{m+1} = a_{2n} = \frac{R\cos[(2n-1)\beta]\cos\beta}{\sin\psi} \ge -R_0\cos\beta, \qquad a_{2m} = a_{4n-2} = \frac{R\cos[(2n-1)\beta]\cos\beta}{\sin[\psi+(2n-2)\beta]} \le -R_1\cos\beta, \]
which we put together to find
\[ -\frac{R_0\sin\psi}{\cos[(2n-1)\beta]} \le R \le -\frac{R_1\sin[\psi+(2n-2)\beta]}{\cos[(2n-1)\beta]}. \]
Thus this places the following inequality on ψ: $R_0\sin\psi \le R_1\sin[\psi+(2n-2)\beta]$, with equality at the angle $\psi_0$ given by
\[ \tan\psi_0 = \frac{R_1\sin[(2n-2)\beta]}{R_0 - R_1\cos[(2n-2)\beta]}. \tag{5.4} \]
A similar argument to the m = 2n case shows that the inequality holds iff $R_0 \le R_1\cos[(2n-2)\beta]$. We consider two cases: either $-R_0\cos\beta \le -R_1\cos[(2n-1)\beta]$, or there exists $\psi_0 \in [\beta-\frac{\pi}{2}, \frac{\pi}{2}]$ such that $R_0\sin\psi_0 = R_1\sin[\psi_0+(2n-2)\beta]$. In the former case, the region of integration is $[\beta-\frac{\pi}{2}, \frac{\pi}{2}]$ while in the latter case it is $[\psi_0, \frac{\pi}{2}]$. As before, these lie between two circles which intersect at the origin and at a point in the first quadrant. Letting $\psi_1 = \max\{\psi_0, \beta-\frac{\pi}{2}\}$, the resulting energy integrates up to
\[ E_{4n-2} = -\frac{\hbar cW\cos^2[(2n-1)\beta]\cos\psi_1}{64\pi^2\sin^4[(2n-1)\beta]}\left[\frac{1}{R_0^2\sin\psi_1} - \frac{1}{R_1^2\cos[(2n-2)\beta]\,\sin[\psi_1+(2n-2)\beta]}\right]. \]
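We combine these results for the even bounce case:

Proposition 5.1. Given β, $R_0$ and $R_1$, define $m_0$ and $m_1$ by
\[ \frac{2m_0-1}{2m_0}\pi < \beta \le \frac{2m_0+1}{2m_0+2}\pi, \]
and either
\[ \cos(m_1\beta) \le \frac{R_0}{R_1} \le \frac{\cos[(m_1-1)\beta]}{\cos\beta} \quad\text{for } m_1 \text{ even}, \]
or
\[ \frac{\cos(m_1\beta)}{\cos\beta} \le \frac{R_0}{R_1} \le \cos[(m_1-1)\beta] \quad\text{for } m_1 \text{ odd}. \]
Let $m_2 = \min\{m_0, m_1\}$. Then the contribution to the Casimir energy obtained by taking only even indices in the energy expression is
\[ E_{\rm even} = 2E^0_{2m_2} + 2\sum_{m=1}^{m_2} E^1_{2m}, \]
where
\[ E^1_{2m} = \frac{\hbar cW\cos^2(m\beta)\sin\beta}{64\pi^2\sin^4(m\beta)}\left[\frac{1}{R_0^2\cos\beta} - \frac{1}{R_1^2\cos[(m-1)\beta]\cos(m\beta)}\right], \]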
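and
\[ E^0_{2m_2} = \frac{\hbar cW\cos(m_2\beta)}{64\pi^2\sin^4(m_2\beta)R_0^2R_1^2} \times \begin{cases} 0 & \text{for } m_0 \le m_1, \\[4pt] \dfrac{[R_0 - R_1\cos(m_2\beta)]^2}{\sin(m_2\beta)} & \text{for } m_0 > m_1 \text{ and } m_1 \text{ even}, \\[8pt] \dfrac{[R_0\cos\beta - R_1\cos(m_2\beta)]^2}{\sin[(m_2-1)\beta]\cos\beta} & \text{for } m_0 > m_1 \text{ and } m_1 \text{ odd}. \end{cases} \]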
5.2. The odd bounce contribution

The situation for odd bounce paths is quite different from that for even bounce paths. Due to the finite size of the top plate, there are again restrictions on the regions of integration. However, rather than eliminating the contribution from higher bounces, as in the even case, these contributions give lower bounce paths originating from regions outside of the wedge. In Fig. 7, the region $D_5$ for the closed 5-bounce paths is shown for γ = 22.5°, $R_0 = 1$ and $R_1 = 2$. This includes a region outside of the wedge (darker shading), which is the reflection in the dotted line of the 7-bounce region that would exist if the top plate were larger. Since reflection preserves lengths and areas, the total contribution of the odd bounces can be calculated by integrating with only the lower bound on R. That is, the energy is the same as that for a semi-infinite top plate. Since this term does not involve $R_1$, such contributions to the force have no physical significance and are excluded [7]. Thus we only have the even bounce contributions, and Main Theorem 1.1 follows from Proposition 5.1 with γ = π − β.
Fig. 7. Integration regions for closed 5-bounce.
6. The Parallel Plate Limit

From Casimir's original work [2], the Dirichlet energy between two parallel plates of area A and separation L was computed to be
\[ E = -\frac{\hbar c\pi^2 A}{1440\,L^3}. \]
In this section, we retrieve this result as the limit of our expressions for the energy between non-parallel plates. Let us consider the limit as γ → 0, i.e. as the plates become parallel. Since β = π − γ, this is the same as β → π. As it stands, the Casimir energy between a finite plate and an infinite plane, as given above, diverges as β → π. This is because the separation between the boundaries goes to zero in this limit. Before taking the limit, we fix the non-zero minimum separation L between the plates by letting $R_0\sin\beta = L$ and $b = R_1 - R_0$, where b is the length of the top plate. While for each β < π there is only a finite number of contributions to the Casimir energy, as β → π we pick up an infinite number of terms in the sum. Thus the parallel plate limit needs careful consideration.
Main Theorem 1.2. In the parallel plate limit, we retrieve Casimir's original result
\[ \lim_{\beta\to\pi} E = -\frac{\hbar c\pi^2 bW}{1440\,L^3}. \]

Proof. We will prove this by using the following lemma.

Lemma 6.1. In the notation of Main Theorem 1.1, we have
\[ |E^1_{2m}| \le \frac{\hbar cWb}{32\pi^2L^3}\,\frac{1}{m^4}. \]

Proof. We first claim that for $\frac{2m-1}{2m}\pi \le \beta < \pi$,
\[ -\frac{\sin^4\beta\,\cos^2(m\beta)}{\sin^4(m\beta)\cos\beta} \le \frac{1}{m^4}. \tag{6.1} \]
To show this, define
\[ f(\beta) = (-1)^{m+1}\frac{\sin^2\beta\,\cos(m\beta)}{\sin^2(m\beta)\cos\beta}. \]
Firstly, it is not hard to see that f ≥ 0 and
\[ 0 \le -\frac{\sin^4\beta\,\cos^2(m\beta)}{\sin^4(m\beta)\cos\beta} \le f^2(\beta). \]
By L'Hôpital's rule $f(\pi) = \frac{1}{m^2}$, so we will have established the claim if we can show that f is non-decreasing. To this end, we compute that
\[ f'(\beta) = \frac{(-1)^{m+1}\sin\beta}{2\sin^3(m\beta)\cos^2\beta}\Big[\sin(2m\beta)(1+\cos^2\beta) - m\sin(2\beta)(1+\cos^2(m\beta))\Big]. \]
The factor outside the large brackets is easily seen to be positive, while $1+\cos^2\beta \ge 1+\cos^2(m\beta)$ for the values of β = π − γ mentioned before Main Theorem 1.1. Thus, we will have shown that f is non-decreasing, and hence inequality (6.1), if we can show that sin(2mβ) − m sin(2β) ≥ 0. This we do inductively. It is clearly true for m = 1, and suppose that sin(2kβ) − k sin(2β) ≥ 0. Then
\[ \sin[2(k+1)\beta] - (k+1)\sin(2\beta) = \sin(2k\beta)\cos(2\beta) + \cos(2k\beta)\sin(2\beta) - (k+1)\sin(2\beta) \ge \sin(2k\beta) - k\sin(2\beta) + (\cos(2k\beta)-1)\sin(2\beta). \]
The first two summands are positive, by the inductive hypothesis, while the final term is also positive, as it is a product of negative factors. This establishes inequality (6.1).

To complete the lemma, we note that
\[ |E^1_{2m}| = \left|\frac{\hbar cW\sin^3\beta\,\cos(m\beta)\big[(L+b\sin\beta)^2\cos[(m-1)\beta]\cos(m\beta) - L^2\cos\beta\big]}{64\pi^2\sin^4(m\beta)\,L^2\cos\beta\,(L+b\sin\beta)^2\cos[(m-1)\beta]}\right| \]
\[ \le \left|\frac{\hbar cW\sin^3\beta\,\cos(m\beta)\big[(L+b\sin\beta)^2 - L^2\big]\cos[(m-1)\beta]\cos(m\beta)}{64\pi^2\sin^4(m\beta)\,L^4\cos\beta\,\cos[(m-1)\beta]}\right| + \left|\frac{\hbar cW\sin^3\beta\,\cos(m\beta)\,L^2\big[\cos[(m-1)\beta]\cos(m\beta) - \cos\beta\big]}{64\pi^2\sin^4(m\beta)\,L^4\cos\beta\,\cos[(m-1)\beta]}\right|. \]
Now, $\cos[(m-1)\beta]\cos(m\beta) - \cos\beta = \frac{1}{2}[\cos[(2m-1)\beta] - \cos\beta] \ge 0$, so we have
\[ |E^1_{2m}| \le \left|\frac{\hbar cW\sin^3\beta\,\cos^2(m\beta)(2Lb\sin\beta + b^2\sin^2\beta)}{64\pi^2\sin^4(m\beta)\,L^4\cos\beta}\right| \le \left|\frac{\hbar cWb\,\sin^4\beta\,\cos^2(m\beta)}{32\pi^2L^3\sin^4(m\beta)\cos\beta}\right| \le \frac{\hbar cWb}{32\pi^2L^3m^4}, \]
by Eq. (6.1), as claimed.

As β → π, for fixed $m_0$ and $m_1$, we have that
\[ E^0_{2m_2} \to \begin{cases} 0 & \text{for } m_0 \le m_1, \\[4pt] \dfrac{\hbar cWb}{64\pi^2L^3m_2^3} & \text{for } m_0 > m_1 \text{ and } m_1 \text{ even}, \\[8pt] \dfrac{\hbar cWb(5m_2^2+3)}{64\pi^2L^3m_2^4(m_2-1)} & \text{for } m_0 > m_1 \text{ and } m_1 \text{ odd}, \end{cases} \]
and since $m_0, m_1 \to \infty$ as β → π and $m_2 = \min\{m_0, m_1\}$, we have that $E^0_{2m_2} \to 0$. Thus
\[ \lim_{\beta\to\pi} E = 2\lim_{\beta\to\pi}\sum_{m=1}^{m_2} E^1_{2m} = 2\sum_{m=1}^{\infty}\lim_{\beta\to\pi} E^1_{2m} = -\frac{\hbar cbW}{16\pi^2L^3}\sum_{m=1}^{\infty}\frac{1}{m^4} = -\frac{\hbar c\pi^2 bW}{1440\,L^3}, \]
where we have been able to interchange the limit with the sum by dominated convergence and the previous lemma.
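The limit used in this proof can be illustrated numerically. The sketch below works in units with ħc = 1, evaluates the expression for E¹_{2m} from Proposition 5.1 with R₀ = L/sin β and R₁ = R₀ + b, and compares it for β close to π with the termwise limit −bW/(32π²L³m⁴). It then sums the limiting terms to recover the coefficient π²/1440. The function names and the truncation of the series are assumptions made for this illustration.

```python
import math


def E1(m, beta, L, b, W=1.0):
    """E^1_{2m} of Proposition 5.1 in units hbar*c = 1,
    with R0 = L / sin(beta) and R1 = R0 + b."""
    R0 = L / math.sin(beta)
    R1 = R0 + b
    pref = (W * math.cos(m * beta) ** 2 * math.sin(beta)
            / (64 * math.pi ** 2 * math.sin(m * beta) ** 4))
    bracket = (1 / (R0 ** 2 * math.cos(beta))
               - 1 / (R1 ** 2 * math.cos((m - 1) * beta) * math.cos(m * beta)))
    return pref * bracket


def E1_limit(m, L, b, W=1.0):
    """Termwise limit of E^1_{2m} as beta -> pi."""
    return -b * W / (32 * math.pi ** 2 * L ** 3 * m ** 4)


def parallel_plate_energy(L, b, W=1.0, terms=2000):
    """Truncated series 2 * sum_m lim E^1_{2m}."""
    return 2 * sum(E1_limit(m, L, b, W) for m in range(1, terms + 1))


beta = math.pi - 1e-4
for m in (1, 2, 3):
    print(m, E1(m, beta, 1.0, 1.0), E1_limit(m, 1.0, 1.0))
print(parallel_plate_energy(1.0, 1.0), -math.pi ** 2 / 1440)
```

The final line compares the truncated series with −π²bW/(1440 L³) for L = b = W = 1; the agreement relies on the standard value Σ 1/m⁴ = π⁴/90.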
Acknowledgments

The first two authors would like to thank Karl Luttinger for many inspiring and helpful discussions on the mathematical background of this paper. The proof of Main Theorem 1.1 was significantly shortened by suggestions of Patrick Dorey and Michael Farber. The first author would like to express his appreciation for the hospitality of Grey College, Durham University, during the development of this work. This work was made possible by the International Collaboration Programme, Enterprise Ireland.

References

[1] M. Bordag, U. Mohideen and V. M. Mostepanenko, New developments in the Casimir effect, Phys. Rep. 353 (2001) 1.
[2] H. Casimir, On the attraction between two perfectly conducting plates, Proc. K. Ned. Akad. Wet. 52 (1948) 793.
[3] B. Guilfoyle and W. Klingenberg, On the space of oriented affine lines in R3, Archiv. der Math. 82 (2004) 81–84.
[4] B. Guilfoyle and W. Klingenberg, Generalized surfaces in R3, Math. Proc. R. Ir. Acad. 104A (2004) 199–209.
[5] B. Guilfoyle and W. Klingenberg, Reflection of a wave off a surface, to appear in J. Geom. [math.DG/0406212].
[6] N. J. Hitchin, Monopoles and geodesics, Comm. Math. Phys. 83 (1982) 579–602.
[7] R. L. Jaffe and A. Scardicchio, The Casimir effect and geometric optics, Phys. Rev. Lett. 92 (2004) 070402.
[8] R. L. Jaffe and A. Scardicchio, Casimir effects: An optical approach I. Foundations and examples, Nuclear Phys. B704 (2005) 552–582.
[9] M. Kline and I. Kay, Electromagnetic Theory and Geometric Optics (Wiley, New York, 1965).
[10] A. Lambrecht, The Casimir effect: A force from nothing, Phys. World (Sept 2002).
August 29, 2005 18:8 WSPC/148-RMP
J070-00246
Reviews in Mathematical Physics Vol. 17, No. 8 (2005) 881–976 © World Scientific Publishing Company
COMBINATORIAL HOPF ALGEBRAS IN QUANTUM FIELD THEORY I
HÉCTOR FIGUEROA
Departamento de Matemáticas, Universidad de Costa Rica, San Pedro 2060, Costa Rica
[email protected]

JOSÉ M. GRACIA-BONDÍA
Departamento de Física Teórica I, Universidad Complutense, Madrid 28040, Spain
[email protected]

Received 27 August 2004
Revised 06 June 2005
This paper stands at the interface between combinatorial Hopf algebra theory and renormalization theory. Its plan is as follows: Sec. 1.1 is the introduction, and contains an elementary invitation to the subject as well. The rest of Sec. 1 is devoted to the basics of Hopf algebra theory and examples in ascending level of complexity. Section 2 turns around the all-important Faà di Bruno Hopf algebra. Section 2.1 contains a first, direct approach to it. Section 2.2 gives applications of the Faà di Bruno algebra to quantum field theory and Lagrange reversion. Section 2.3 rederives the related Connes–Moscovici algebras. In Sec. 3, we turn to the Connes–Kreimer Hopf algebras of Feynman graphs and, more generally, to incidence bialgebras. In Sec. 3.1, we describe the first. Then in Sec. 3.2, we give a simple derivation of (the properly combinatorial part of) Zimmermann's cancellation-free method, in its original diagrammatic form. In Sec. 3.3, general incidence algebras are introduced, and the Faà di Bruno bialgebras are described as incidence bialgebras. In Sec. 3.4, deeper lore on Rota's incidence algebras allows us to reinterpret Connes–Kreimer algebras in terms of distributive lattices. Next, the general algebraic-combinatorial proof of the cancellation-free formula for antipodes is ascertained. The structure results for commutative Hopf algebras are found in Sec. 4. An outlook section very briefly reviews the coalgebraic aspects of quantization and the Rota–Baxter map in renormalization.

Keywords: Hopf algebras; combinatorics; renormalization; noncommutative geometry.

2001 PACS: 11.10.Gh, 02.20.Uw, 02.40.Gh
Contents
1. Basic Combinatorial Hopf Algebra Theory
1.1. Why Hopf algebras?
1.2. Précis of bialgebra theory
August 29, 2005 18:8 WSPC/148-RMP
882
J070-00246
H. Figueroa & J. M. Gracia-Bond´ıa
1.3. Primitive and indecomposable elements
1.4. Dualities
1.5. Antipodes
1.6. Symmetric algebras
2. The Faà di Bruno Bialgebras
2.1. Partitions, Bell polynomials and the Faà di Bruno algebras
2.2. Working with the Faà di Bruno Hopf algebra
2.3. From Faà di Bruno to Connes–Moscovici
3. Hopf Algebras of Graphs and Distributive Lattices
3.1. Hopf algebras of Feynman graphs
3.2. Breaking the chains: The formula of Zimmermann
3.3. Incidence Hopf algebras
3.3.1. The Faà di Bruno algebra as an incidence bialgebra
3.4. Distributive lattices and the general Zimmermann formula
4. The General Structure Theorems
4.1. Structure of commutative Hopf algebras I
4.2. Structure of commutative Hopf algebras II
4.3. Coda: On twisting and other matters
1. Basic Combinatorial Hopf Algebra Theory

1.1. Why Hopf algebras?

Quantum field theory (QFT) aims to describe the fundamental phenomena of physics at the shortest scales, that is, at the higher energies. In spite of many practical successes, QFT is mathematically a problematic construction. Many of its difficulties are related to the need for renormalization. This complicated process is at present required to make sense of quantities very naturally defined, that we are however unable to calculate without incurring infinities. The complications are both analytical and combinatorial in nature. Since the work by Joni and Rota [1] on incidence coalgebras, the framework of Hopf algebras (a dual concept to groups in the spirit of noncommutative geometry) has been recognized as a very sophisticated one at our disposal for formalizing the art of combinatorics. Now, recent developments (from 1998 onwards) have placed Hopf algebras at the heart of a noncommutative geometry approach to physics. Rather unexpectedly, but quite naturally, kindred Hopf algebras appeared in two previously unrelated contexts: perturbative renormalization in quantum field theories [2–4] and index formulae in noncommutative geometry [5]. Even more recently, we have become aware of the neglected coalgebraic side of the quantization procedure [6, 7]. Thus, even leaving aside the role of “quantum symmetry groups” in conformal field theory, Hopf algebra is invading QFT from both ends, that is, at the foundational and the computational levels. The whole development of QFT from principles to applications might conceivably be subtended by Hopf algebra. The approach from quantum theoretical first principles is still in its infancy. This is one reason why we focused here on the understanding of the contributions by
Kreimer, Connes and Moscovici from the viewpoint of algebraic combinatorics, particularly in respect of incidence bialgebra theory. In other words (in contrast with [8] for instance), we examine Rota, Connes and Kreimer’s lines of thought in parallel. Time permitting, we will return in another article to the perspectives broached in [6, 7], and try to point out ways from the outposts into still unconquered territory. In [9], Broadhurst and Kreimer declared: “[In renormalization theory] combinations of diagrams . . . can provide cancellations of poles, and may hence eliminate pole terms”. The practical interest of this is manifest. In fact, the ultimate goal of tackling the Schwinger–Dyson equations in QFT is linked to general characterization problems for commutative Hopf algebras [10]. This is one of the reasons why we have provided a leisurely introduction to the subject, splashing pertinent examples to chase dreariness away, and, we hope, leading the readers almost imperceptibly from the outset towards the deeper structure questions undertaken in Secs. 4.1 and 4.2. The study of the more classical Faà di Bruno Hopf algebras, effected mostly in Sec. 2, serves as a guiding thread of this survey, as do their applications, in particular to the Lagrange reversion formula, a subject we find fascinating. The Faà di Bruno algebras, denoted $F(n)$, are of the same general type as the Kreimer–Connes–Moscovici Hopf algebras; they are in fact Hopf subalgebras of the Connes–Moscovici Hopf algebras $H_{CM}(n)$. As hinted above, the latter appeared in connection with the index formula for transversally elliptic operators on a foliation. These canonical Hopf algebras depend only on the codimension n of the foliation, and their action on the transverse frame bundle simplifies decisively the associated computation of the index, which takes place on the cyclic cohomology of $H_{CM}(n)$.
One of our main results is a theorem describing $H_{CM}(n)$ as a kind of bicrossed-product Hopf algebra of $F(n)$ by the (action of/coaction on) the Lie algebra of the affine transformation group. This is implicit in [5], but there the construction, in the words of G. Skandalis [11], is performed “by hand”, using the action on the frame bundle. As $H_{CM}(n)$ reappears in other contexts, such as modular forms [12], a more abstract characterization was the order of the day. Another focus of interest is the comprehensive investigation of the range of validity of Zimmermann’s combinatorial formula of QFT [13], in the algebraic-combinatorial context. It so happens that the “natural” formulae for computing the antipodes of the algebras we are dealing with are in some sense inefficient. This inefficiency lies in the huge sets of cancellations that arise in the final result. A case in point is the alternating sum over chains characteristic of incidence Hopf algebras. The Faà di Bruno bialgebras are incidence bialgebras, corresponding to the family of partially ordered sets (posets) that are partition lattices of finite sets. In relation with them, at the end of his authoritative review [14] on antipodes of incidence bialgebras, Schmitt in 1987 wrote: “[The Lagrange reversion formula] can be viewed as a description of what remains of the alternating sum [over chains] after
all the cancellations have been taken into account. We believe that understanding exactly how these cancellations take place will not only provide a direct combinatorial proof of the Lagrange inversion formula, but may well yield analogous formulas for the antipodes of Hopf algebras arising from classes of geometric lattices other than partition lattices”. Unbeknown to him, Zimmermann’s formula had been performing this trick in renormalization theory for almost twenty years by then. And already in [15], the Dyson–Salam method for renormalization was thoroughly re-examined and corrected, and its equivalence (in the context of the so-called BPHZ renormalization scheme) to Bogoliubov and Zimmermann’s methods was studied. Inspired by the book [15] and the work by Kreimer and Connes, the present authors investigated the Hopf algebra theory underpinnings of these equivalences in [16, 17] a couple of years ago. This is the punchline: on the one hand, the Connes–Kreimer algebras of rooted trees and Feynman diagrams can be subsumed under the theory of incidence algebras of distributive lattices (see [18] for the latter); on the other, Zimmermann’s formula can be incorporated into the mainstream of combinatorial Hopf algebra theory as the cancellation-free formula for the antipode of any such incidence algebra. We show all this in Secs. 3.2 and 3.4. Haiman and Schmitt [19] eventually found the equivalent of Zimmermann’s formula for Faà di Bruno algebras. This is also subsumed here. Thus the trade between QFT and combinatorial Hopf algebra theory is not one way. We do not want to assume that the readers are familiar with all the notions of Hopf algebra theory; some may find a direct passage to the starchy algebraist’s diet too abrupt. This is why we start, in the footsteps of [20], with a motivational discussion, which experts should probably skip.
In the spirit of noncommutative geometry, suppose we choose to study a set S via a commutative algebra F(S) of complex functions on it (we work with complex numbers for definiteness only: nearly everything we have to say applies when ℂ is replaced by any unital commutative ℚ-algebra; also, in some parts of this paper, it will be clear from the context when we work with real numbers instead of ℂ). The tensor product of algebras F(S) ⊗ F(S) has the algebra structure given by
\[ (f \otimes g)(f' \otimes g') = ff' \otimes gg'. \]
Also, there is a one-to-one map σ : F(S) ⊗ F(S) → F(S × S) given by
\[ \sigma(f \otimes g)(s, s') := f(s)\,g(s'). \]
The image of σ consists of just those functions h of two variables for which the vector space spanned by the partial maps $h_{s'}(s) := h(s, s')$ is of finite dimension. Suppose now the set S ≡ G is a group. Then there is much more to F(G) than its algebra structure. The multiplication in G induces an algebra map ρ : F(G) → F(G × G) given by
\[ \rho[f](x, y) = f(xy) =: (y \triangleright f)(x) =: (f \triangleleft x)(y), \]
where the functions $y \triangleright f$, $f \triangleleft x$ are respectively the right translate of f by y and the left translate of f by x. Then ρ[f] ∈ im σ iff f is such that its translates span a
finite-dimensional space. An example is given by the space of polynomials over the additive group ℂ acting on itself by translation. A function with this property is called a representative function; representative functions clearly form a subalgebra R(G) of F(G). In summary, there is an algebra map $\Delta = \sigma^{-1} \circ \rho$ from R(G) to R(G) ⊗ R(G) that we express by
\[ f(xy) = \sum_j f_{j(1)}(x) \otimes f_{j(2)}(y) =: \Delta f(x, y). \]
This gives a linearized form of the group operation. Now, the associative identity f((xy)z) = f(x(yz)) imposes
\[ (\mathrm{id} \otimes \Delta) \circ \Delta = (\Delta \otimes \mathrm{id}) \circ \Delta, \]
where we denote by id the identity map in the algebra of representative functions. Moreover, $f \mapsto f(1_G)$ defines a map η : R(G) → ℂ (called the augmentation map) that, in view of $f(x 1_G) = f(1_G x) = f(x)$, verifies
\[ (\mathrm{id} \otimes \eta) \circ \Delta = (\eta \otimes \mathrm{id}) \circ \Delta = \mathrm{id}. \]
The basic concept of coalgebra in Hopf algebra theory abstracts the properties of the triple (R(G), ∆, η). Let us go a bit further and consider, for a complex vector space V, representative maps G → V whose translates span a finite-dimensional space of maps, say $R_V(G)$. Let $(f_1, \ldots, f_n)$ be a basis for the space of translates of $f \in R_V(G)$, and express
\[ (y \triangleright f)(x) = \sum_{i=1}^n c_i(y)\, f_i(x). \]
In particular,
\[ (y \triangleright f_j)(x) = \sum_{i=1}^n c_{ij}(y)\, f_i(x) \]
defines the $c_{ij}$; also $f = \sum_{i=1}^n \eta(f_i)\, c_i$. Now, the $c_i$ are representative, since
\[ \sum_{i=1}^n c_i(zy)\, f_i(x) = \bigl(z \triangleright (y \triangleright f)\bigr)(x) = \sum_{i=1}^n c_i(y)\, (z \triangleright f_i)(x) = \sum_{i,j=1}^n c_j(y)\, c_{ij}(z)\, f_i(x), \]
implying $y \triangleright c_i = \sum_{j=1}^n c_j(y)\, c_{ij}$. In consequence, the map
\[ v \otimes f \mapsto f(\cdot)\, v \]
from V ⊗ R(G) to $R_V(G)$ is bijective, and we identify these two spaces. All this points to the importance of R(G). But in what wilderness are its elements found? A moment’s reflection shows that linear forms on any locally finite G-module provide representative functions. Suppose V is such a module, and let T denote the representation of G on it. A map $\gamma_T : V \to V \otimes R(G)$ is given by
\[ \gamma_T(v)(x) = T(x)\,v. \]
The fact T(x)T(y) = T(xy) means that $\gamma_T$ “interacts” with ∆:
\[ (\gamma_T \otimes \mathrm{id}) \circ \gamma_T = (\mathrm{id}_V \otimes \Delta) \circ \gamma_T; \]
and the fact $T(1_G) = \mathrm{id}_V$ forces $(\mathrm{id}_V \otimes \eta) \circ \gamma_T = \mathrm{id}_V$. One says that $(V, \gamma_T)$ is a comodule for (R(G), ∆, η). A Martian versed in the “dual” way of thinking could find this method a more congenial one to analyze group representations. Let us add that the coalgebraic aspect is decisive in the applications of Hopf algebra theory to renormalization; those interested mainly in this aspect might consult now our invitation at the beginning of Sec. 3.1. Also, after extracting so much mileage from the coalgebraic side of the group properties, one could ask, what is the original algebra structure of R(G) good for? This question we shall answer in due course.

1.2. Précis of bialgebra theory

We assume familiarity with (associative, unital) algebras; but it is convenient here to rephrase the requirements for an algebra A in terms of its two defining maps, to wit m : A ⊗ A → A and u : ℂ → A, respectively given by m(a, b) := ab; u(1) = $1_A$. They must satisfy:
• Associativity: m(m ⊗ id) = m(id ⊗ m) : A ⊗ A ⊗ A → A;
• Unity: m(u ⊗ id) = m(id ⊗ u) = id : ℂ ⊗ A = A ⊗ ℂ = A → A.
In the following, we omit the sign ◦ for composition of linear maps. These two properties correspond, respectively, to the commutativity of the diagrams
\[
\begin{array}{ccc}
A \otimes A \otimes A & \xrightarrow{\;m \otimes \mathrm{id}_A\;} & A \otimes A \\
{\scriptstyle \mathrm{id}_A \otimes m}\downarrow & & \downarrow{\scriptstyle m} \\
A \otimes A & \xrightarrow{\;\;m\;\;} & A,
\end{array}
\]
and
\[
\begin{array}{ccc}
\mathbb{C} \otimes A & \xrightarrow{\;u \otimes \mathrm{id}_A\;} & A \otimes A \\
\uparrow & & \downarrow{\scriptstyle m} \\
A & \xrightarrow{\;\mathrm{id}_A\;} & A,
\end{array}
\qquad
\begin{array}{ccc}
A \otimes \mathbb{C} & \xrightarrow{\;\mathrm{id}_A \otimes u\;} & A \otimes A \\
\uparrow & & \downarrow{\scriptstyle m} \\
A & \xrightarrow{\;\mathrm{id}_A\;} & A.
\end{array}
\]
The unnamed arrows denote the natural identifications C ⊗ A = A = A ⊗ C respectively given by λ ⊗ a → λa and a ⊗ λ → λa. Definition 1.1. A coalgebra C is a vector space with the structure obtained by reversing the arrows in the diagrams above characterizing an algebra. A coalgebra
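is therefore also described by two linear maps: the coproduct ∆ : C → C ⊗ C (also called diagonalization, or “sharing”), and the counit (or augmentation) η : C → ℂ, with requirements:
• Coassociativity: (∆ ⊗ id)∆ = (id ⊗ ∆)∆ : C → C ⊗ C ⊗ C;
• Counity: (η ⊗ id)∆ = (id ⊗ η)∆ = id : C → C.
These conditions, therefore, simply describe the commutativity of the diagrams “dual” to the above, namely
\[
\begin{array}{ccc}
C \otimes C \otimes C & \xleftarrow{\;\Delta \otimes \mathrm{id}_C\;} & C \otimes C \\
{\scriptstyle \mathrm{id}_C \otimes \Delta}\uparrow & & \uparrow{\scriptstyle \Delta} \\
C \otimes C & \xleftarrow{\;\;\Delta\;\;} & C
\end{array}
\]
and
\[
\begin{array}{ccc}
\mathbb{C} \otimes C & \xleftarrow{\;\eta \otimes \mathrm{id}_C\;} & C \otimes C \\
\uparrow & & \uparrow{\scriptstyle \Delta} \\
C & \xleftarrow{\;\mathrm{id}_C\;} & C,
\end{array}
\qquad
\begin{array}{ccc}
C \otimes \mathbb{C} & \xleftarrow{\;\mathrm{id}_C \otimes \eta\;} & C \otimes C \\
\uparrow & & \uparrow{\scriptstyle \Delta} \\
C & \xleftarrow{\;\mathrm{id}_C\;} & C.
\end{array}
\]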
In general, the prefix “co” makes reference to the process of reversing arrows in diagrams. With $\Delta a := \sum_j a_{j(1)} \otimes a_{j(2)}$, the second requirement is explicitly given by
\[ \sum_j \eta(a_{j(1)})\, a_{j(2)} = \sum_j a_{j(1)}\, \eta(a_{j(2)}) = a. \tag{1.1} \]
Since it has a left inverse, ∆ is always injective. If $\Delta a_{j(1)} = \sum_k a_{jk(1)(1)} \otimes a_{jk(1)(2)}$ and $\Delta a_{j(2)} = \sum_l a_{jl(2)(1)} \otimes a_{jl(2)(2)}$, the first condition is
\[ \sum_{jk} a_{jk(1)(1)} \otimes a_{jk(1)(2)} \otimes a_{j(2)} = \sum_{jl} a_{j(1)} \otimes a_{jl(2)(1)} \otimes a_{jl(2)(2)}. \]
Thus coassociativity corresponds to the idea that, in decomposing the “object” a in sets of three pieces, the order in which the breakups take place does not matter. In what follows, we alleviate the notation by writing simply $\Delta a = \sum a_{(1)} \otimes a_{(2)}$. Sometimes even the $\sum$ sign will be omitted. Since $\sum a_{(1)(1)} \otimes a_{(1)(2)} \otimes a_{(2)} = \sum a_{(1)} \otimes a_{(2)(1)} \otimes a_{(2)(2)}$, we can write
\[ \Delta^2 a = \sum a_{(1)} \otimes a_{(2)} \otimes a_{(3)}, \qquad \Delta^3 a = \sum a_{(1)} \otimes a_{(2)} \otimes a_{(3)} \otimes a_{(4)}, \]
and so on, for the n-fold coproducts. Notice that $(\eta \otimes \mathrm{id} \otimes \cdots \otimes \mathrm{id})\Delta^n = \Delta^{n-1}$ (n id factors being understood), and similarly when η is in any other position. We analogously consider the n-fold products $m^n : A \otimes \cdots \otimes A \to A$, with n + 1 factors A being understood.
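The two coalgebra axioms can be checked mechanically on a small example. The following is a minimal Python sketch, not taken from the paper: it assumes, for illustration only, the polynomial coalgebra with basis $x^0, x^1, x^2, \ldots$, coproduct $\Delta(x^n) = \sum_k \binom{n}{k}\, x^k \otimes x^{n-k}$ and counit $\eta(x^n) = \delta_{n0}$ (the representative functions on the additive group, since $(x+y)^n$ expands binomially). The dictionary encoding of tensors and all function names are hypothetical choices of ours.

```python
from math import comb
from collections import defaultdict

def coproduct(n):
    """Delta(x^n) as {(k, n-k): coeff}, i.e. sum_k C(n,k) x^k (x) x^(n-k)."""
    return {(k, n - k): comb(n, k) for k in range(n + 1)}

def counit(n):
    """eta(x^n): evaluation of x^n at the group identity 0."""
    return 1 if n == 0 else 0

def delta_left(n):
    """(Delta (x) id) Delta (x^n) as {(i, j, k): coeff}."""
    out = defaultdict(int)
    for (a, b), c in coproduct(n).items():
        for (i, j), d in coproduct(a).items():
            out[(i, j, b)] += c * d
    return dict(out)

def delta_right(n):
    """(id (x) Delta) Delta (x^n) as {(i, j, k): coeff}."""
    out = defaultdict(int)
    for (a, b), c in coproduct(n).items():
        for (j, k), d in coproduct(b).items():
            out[(a, j, k)] += c * d
    return dict(out)

def counit_left(n):
    """(eta (x) id) Delta (x^n) as {degree: coeff}."""
    out = defaultdict(int)
    for (a, b), c in coproduct(n).items():
        if counit(a):
            out[b] += c * counit(a)
    return dict(out)

def counit_right(n):
    """(id (x) eta) Delta (x^n) as {degree: coeff}."""
    out = defaultdict(int)
    for (a, b), c in coproduct(n).items():
        if counit(b):
            out[a] += c * counit(b)
    return dict(out)

for n in range(6):
    assert delta_left(n) == delta_right(n)   # coassociativity
    assert counit_left(n) == {n: 1}          # counit identity (1.1)
    assert counit_right(n) == {n: 1}
```

Both iterated coproducts produce the multinomial coefficient $n!/(i!\,j!\,k!)$ on $x^i \otimes x^j \otimes x^k$, which is why the order of the two breakups does not matter in this example.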
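Like algebras, coalgebras have a tensor product. The coalgebra C ⊗ D is the vector space C ⊗ D endowed with the maps
\[ \Delta_\otimes(c \otimes d) = \sum c_{(1)} \otimes d_{(1)} \otimes c_{(2)} \otimes d_{(2)}; \qquad \eta_\otimes(c \otimes d) = \eta(c)\,\eta(d). \tag{1.2} \]
That is, $\Delta_\otimes = (\mathrm{id} \otimes \tau \otimes \mathrm{id})(\Delta_C \otimes \Delta_D)$, in parallel with $m_\otimes = (m_A \otimes m_B)(\mathrm{id} \otimes \tau \otimes \mathrm{id})$ for algebras. A counital coalgebra (or comultiplicative) map $\ell : C \to D$ between two coalgebras is a linear map that preserves the coalgebra structure,
\[ \Delta_D\, \ell = (\ell \otimes \ell)\Delta_C : C \to D \otimes D; \qquad \eta_D\, \ell = \eta_C : C \to \mathbb{C}. \]
Once more these properties correspond to the commutativity of the diagrams
\[
\begin{array}{ccc}
C \otimes C & \xleftarrow{\;\Delta_C\;} & C \\
{\scriptstyle \ell \otimes \ell}\downarrow & & \downarrow{\scriptstyle \ell} \\
D \otimes D & \xleftarrow{\;\Delta_D\;} & D,
\end{array}
\qquad
\begin{array}{ccc}
C & \xrightarrow{\;\;\ell\;\;} & D \\
& {\scriptstyle \eta_C}\searrow \quad \swarrow{\scriptstyle \eta_D} & \\
& \mathbb{C}, &
\end{array}
\]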
obtained by reversing arrows in the diagrams that express the homomorphism properties of linear maps between unital algebras. (Non)commutativity of the algebra and coalgebra operations is formulated with the help of the “flip map” τ(a ⊗ b) := b ⊗ a. The algebra A is commutative if mτ = m : A ⊗ A → A; likewise, the coalgebra C is called cocommutative if τ∆ = ∆ : C → C ⊗ C. These properties correspond, respectively, to the commutativity of the diagrams
\[
\begin{array}{ccc}
A \otimes A & \xrightarrow{\;\;\tau\;\;} & A \otimes A \\
& {\scriptstyle m}\searrow \quad \swarrow{\scriptstyle m} & \\
& A, &
\end{array}
\qquad
\begin{array}{ccc}
C \otimes C & \xleftarrow{\;\;\tau\;\;} & C \otimes C \\
& {\scriptstyle \Delta}\nwarrow \quad \nearrow{\scriptstyle \Delta} & \\
& C. &
\end{array}
\]
For commutative algebras, the map m is a homomorphism, and similarly ∆ is a coalgebra map for cocommutative coalgebras. The same space C with the coalgebra structure given by τ∆ is called the coopposite coalgebra $C^{\mathrm{cop}}$. A subcoalgebra of C is a subspace Z such that ∆Z ⊆ Z ⊗ Z. A coalgebra without nontrivial subcoalgebras is simple. The direct sum of all simple subcoalgebras of C is called its coradical R(C). A very important concept is that of coideal. A subspace J is a coideal of C if
\[ \Delta J \subseteq J \otimes C + C \otimes J \quad \text{and} \quad \eta(J) = 0. \]
The kernel of any coalgebra map $\ell : C \to D$ is a coideal. In effect,
\[ \Delta_D\, \ell(\ker \ell) = 0 = (\ell \otimes \ell)\Delta_C \ker \ell, \]
forces $\Delta_C \ker \ell \subseteq \ker \ell \otimes C + C \otimes \ker \ell$, and moreover $\eta_C(\ker \ell) = \eta_D\, \ell(\ker \ell) = 0$. If J is a coideal, then C/J has a (unique) coalgebra structure such that the canonical projection q : C → C/J is a coalgebra map; see the example at the end of this section. A coalgebra filtered as a vector space is called a filtered coalgebra when the filtering is compatible with the coalgebra structure; that is, there exists a nested sequence of subspaces $C_n$ such that $C_0 \subset C_1 \subset \cdots$ and $\bigcup_{n \ge 0} C_n = C$, and moreover
\[ \Delta C_n \subseteq \sum_{k=0}^n C_{n-k} \otimes C_k. \]
Given an algebra A and a coalgebra C over ℂ, one can define the convolution of two elements f, g of the vector space of ℂ-linear maps Hom(C, A), as the map f ∗ g ∈ Hom(C, A) given by the composition
\[ C \xrightarrow{\;\Delta\;} C \otimes C \xrightarrow{\;f \otimes g\;} A \otimes A \xrightarrow{\;m\;} A. \]
In other words,
\[ f * g(a) = \sum f(a_{(1)})\, g(a_{(2)}). \]
The triple $(\mathrm{Hom}(C, A), *, u_A \eta_C)$ is then a unital algebra. Indeed, convolution is associative because of associativity of m and coassociativity of ∆:
\[ (f * g) * h = m((f * g) \otimes h)\Delta = m(m \otimes \mathrm{id})(f \otimes g \otimes h)(\Delta \otimes \mathrm{id})\Delta = m(\mathrm{id} \otimes m)(f \otimes g \otimes h)(\mathrm{id} \otimes \Delta)\Delta = m(f \otimes (g * h))\Delta = f * (g * h). \]
The unit is in effect $u_A \eta_C$:
\[ f * u_A \eta_C = m(f \otimes u_A \eta_C)\Delta = m(\mathrm{id}_A \otimes u_A)(f \otimes \mathrm{id}_{\mathbb{C}})(\mathrm{id}_C \otimes \eta_C)\Delta = \mathrm{id}_A\, f\, \mathrm{id}_C = f, \]
\[ u_A \eta_C * f = m(u_A \eta_C \otimes f)\Delta = m(u_A \otimes \mathrm{id}_A)(\mathrm{id}_{\mathbb{C}} \otimes f)(\eta_C \otimes \mathrm{id}_C)\Delta = \mathrm{id}_A\, f\, \mathrm{id}_C = f. \]
Algebra morphisms l : A → B and coalgebra morphisms $\ell : D \to C$ respect convolution, in the following respective ways: if f, g ∈ Hom(C, A) for some coalgebra C and some algebra A, then
\[ l(f * g) = l\, m_A (f \otimes g)\Delta = m_B (l \otimes l)(f \otimes g)\Delta = m_B (lf \otimes lg)\Delta = lf * lg, \tag{1.3} \]
\[ (f * g)\ell = m(f \otimes g)\Delta_C\, \ell = m(f \otimes g)(\ell \otimes \ell)\Delta_D = m_A (f\ell \otimes g\ell)\Delta_D = f\ell * g\ell. \tag{1.4} \]

Definition 1.2. To obtain a bialgebra, say H, one considers on the vector space H both an algebra and a coalgebra structure, and further stipulates the compatibility condition that the algebra structure maps m and u are counital coalgebra morphisms, when H ⊗ H is seen as the tensor product coalgebra.
This leads (omitting the subscript from the unit in the algebra) to:
\[ \Delta a\, \Delta b = \Delta(ab), \qquad \Delta 1 = 1 \otimes 1, \qquad \eta(a)\eta(b) = \eta(ab), \qquad \eta(1) = 1; \tag{1.5} \]
in particular, η is a one-dimensional representation of the algebra H. For instance,
\[ \Delta(ab) = \Delta m(a \otimes b) = (m \otimes m)\Delta_\otimes(a \otimes b) = (m \otimes m)[a_{(1)} \otimes b_{(1)} \otimes a_{(2)} \otimes b_{(2)}] = a_{(1)} b_{(1)} \otimes a_{(2)} b_{(2)} = (a_{(1)} \otimes a_{(2)})(b_{(1)} \otimes b_{(2)}) = \Delta a\, \Delta b. \]
Of course, it would be totally equivalent, and (to our earthlings’ mind) it looks simpler, to postulate instead the conditions (1.5): they state that the coalgebra structure maps ∆ and η are unital algebra morphisms. But it is imperative that we familiarize ourselves with the coalgebra operations. Note that the last equation in (1.5) is redundant. The map uη : H → H is an idempotent, as $u\eta u\eta(a) = \eta(a)\, u\eta(1_H) = \eta(a) 1_H = u\eta(a)$. Therefore $H = \operatorname{im} u\eta \oplus \ker u\eta = \operatorname{im} u \oplus \ker \eta = \mathbb{C}1_H \oplus \ker \eta$. A halfway house between algebras or coalgebras and bialgebras is provided by the notions of augmented algebra, which is a quadruple (A, m, u, η), or augmented coalgebra, which is a quadruple (C, ∆, u, η), with the known properties in both cases. A bialgebra morphism is a linear map between two bialgebras, which is both a unital algebra homomorphism and a counital coalgebra map. A subbialgebra of H is a vector subspace E that is both a subalgebra and a subcoalgebra; in other words, E, together with the restrictions of the product, coproduct and so on, is also a bialgebra and the inclusion E → H is a bialgebra morphism. A biideal J of H is a linear subspace that is both an ideal of the algebra H and a coideal of the coalgebra H. The quotient H/J inherits a bialgebra structure. Associated to any bialgebra H, there are the three bialgebras $H^{\mathrm{opp}}$, $H^{\mathrm{cop}}$ and $H^{\mathrm{copp}}$ obtained by taking the opposite either of the algebra structure or the coalgebra structure or both. Linear maps of a bialgebra H into an algebra A can in particular be convolved; but if they are multiplicative, that is, algebra homomorphisms, their convolution in general will be multiplicative only if A is commutative.
In such a case if $f, g \in \mathrm{Hom}_{\mathrm{alg}}(H, A)$, then
\[ f * g(ab) = f(a_{(1)}) f(b_{(1)}) g(a_{(2)}) g(b_{(2)}) = f(a_{(1)}) g(a_{(2)}) f(b_{(1)}) g(b_{(2)}) = f * g(a)\; f * g(b). \]
Similarly, the convolution of coalgebra maps of a coalgebra C into a bialgebra H is comultiplicative when H is cocommutative.

Definition 1.3. A bialgebra ℕ-filtered as a vector space is called a filtered bialgebra when the filtering is compatible with both the algebra and the coalgebra structures; that is, there exists a nested sequence of subspaces $H_0 \subset H_1 \subset \cdots$ such that $\bigcup_{n \ge 0} H_n = H$, and moreover
\[ \Delta H_n \subseteq \sum_{k=0}^n H_{n-k} \otimes H_k, \qquad H_n H_m \subseteq H_{n+m}. \]
Connected bialgebras are those filtered bialgebras for which the first piece consists just of scalars: $H_0 = u(\mathbb{C})$; in that case, $R(H) = H_0$ and the augmentations η, u are unique. Any coalgebra C is filtered via a filtering, dubbed the coradical filtering, whose starting piece is the coradical. When H is a bialgebra, its coradical filtering is not necessarily compatible with the algebra structure. Nevertheless, it is compatible when R := R(H) is a subbialgebra of H, in particular when $H_0 = u(\mathbb{C})$. In short, the coradical filtering goes as follows: consider the sums of tensor products
\[ H_R^1 := R, \qquad H_R^2 := R \otimes H + H \otimes R, \qquad H_R^3 := R \otimes H \otimes H + H \otimes R \otimes H + H \otimes H \otimes R \]
and so on; then the subcoalgebras $H_n$ are defined by
\[ H_n = [\Delta^n]^{-1}(H_R^{n+1}). \]
We refer the reader to [21, 22] for more details on the coradical filtering. Naturally, a bialgebra may have several different filterings. A bialgebra $H = \bigoplus_{n=0}^\infty H^{(n)}$ graded as a vector space is called a graded bialgebra when the grading # is compatible with both the algebra and the coalgebra structures:
\[ H^{(n)} H^{(m)} \subseteq H^{(n+m)} \quad \text{and} \quad \Delta H^{(n)} \subseteq \bigoplus_{k=0}^n H^{(n-k)} \otimes H^{(k)}. \]
That is, grading H ⊗ H in the obvious way, m and ∆ are homogeneous maps of degree zero. A graded bialgebra is filtered in the obvious way. Most often we work with graded bialgebras of finite type, for which the $H^{(n)}$ are finite-dimensional; and, at any rate, all graded bialgebras are direct limits of subbialgebras of finite type [23, Proposition 4.13]. Two main “classical” examples of bialgebras, respectively commutative and cocommutative, are the space of representative functions on a compact group and the enveloping algebra of a Lie algebra.

Example 1.4. Let G be a compact topological group (most often, a Lie group). The Peter–Weyl theorem shows that any unitary irreducible representation π of G is finite-dimensional, any matrix element $f(x) := \langle u, \pi(x) v \rangle$ is a representative function on G and the vector space R(G) generated by these matrix elements is a dense ∗-subalgebra of C(G). Elements of this space can be characterized as those continuous functions f : G → ℂ whose translates $t \triangleright f : x \mapsto f(xt)$, for all t ∈ G, generate a finite-dimensional subspace of C(G). The theorem says, in other words, that nothing of consequence is lost in compact group theory by studying just R(G). As we already know, the commutative algebra R(G) is a coalgebra, with operations
\[ \Delta f(x, y) := f(xy); \qquad \eta(f) := f(1_G). \tag{1.6} \]
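For matrix elements of a representation, the coproduct (1.6) takes the explicit form $\Delta \pi_{ij} = \sum_k \pi_{ik} \otimes \pi_{kj}$, because π(xy) = π(x)π(y). The following minimal Python sketch checks this on a toy case. It assumes, purely for illustration, the finite (hence compact) group $S_3$ acting by 3 × 3 permutation matrices; this choice of group and representation, and all names in the code, are ours and do not come from the paper.

```python
from itertools import permutations

# The symmetric group S3: a permutation p is a tuple with p[i] the image of i.
G = list(permutations(range(3)))
identity = (0, 1, 2)

def mul(p, q):
    """Group product (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def rep(p):
    """Permutation matrix pi(p): entry [i][j] is 1 if p(j) == i, else 0."""
    return [[1 if p[j] == i else 0 for j in range(3)] for i in range(3)]

def matrix_element(i, j):
    """Representative function f_ij(x) = <e_i, pi(x) e_j>."""
    return lambda x: rep(x)[i][j]

def coproduct_value(i, j, x, y):
    """(Delta f_ij)(x, y) computed as sum_k f_ik(x) f_kj(y)."""
    return sum(matrix_element(i, k)(x) * matrix_element(k, j)(y)
               for k in range(3))

for i in range(3):
    for j in range(3):
        f = matrix_element(i, j)
        # Counit in (1.6): eta(f_ij) = f_ij(1_G) = delta_ij.
        assert f(identity) == (1 if i == j else 0)
        for x in G:
            for y in G:
                # Coproduct in (1.6): f_ij(xy) = sum_k f_ik(x) f_kj(y).
                assert f(mul(x, y)) == coproduct_value(i, j, x, y)
```

The nine functions $f_{ij}$ span a space stable under translations, which is the finite-dimensionality condition characterizing representative functions.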
Example 1.5. The universal enveloping algebra U(g) of a Lie algebra g is the quotient of the tensor algebra T(g) (with $T^0(g) \simeq \mathbb{C}$) by the two-sided ideal I
generated by the elements XY − YX − [X, Y], for all X, Y ∈ g. The word “universal” is appropriate because any Lie algebra homomorphism ψ : g → A, where A is a unital associative algebra, extends uniquely to a unital algebra homomorphism Uψ : U(g) → A. A coproduct and counit are defined first on elements of g by
\[ \Delta X := X \otimes 1 + 1 \otimes X \tag{1.7} \]
and η(X) := 0. These linear maps on g extend to homomorphisms of T(g); for instance,
\[ \Delta(XY) = \Delta X\, \Delta Y = XY \otimes 1 + X \otimes Y + Y \otimes X + 1 \otimes XY. \]
It follows that the tensor algebra on any vector space is a (graded, connected) bialgebra. Now
\[ \Delta(XY - YX - [X, Y]) = (XY - YX - [X, Y]) \otimes 1 + 1 \otimes (XY - YX - [X, Y]). \]
Thus I is also a coideal (clearly η(I) = 0, too) and if q : T(g) → U(g) is the quotient map, then I ⊆ ker(q ⊗ q)∆. Thus (q ⊗ q)∆ induces a coproduct on the quotient U(g), which becomes an irreducible bialgebra. From (1.7) and the definition of I, it is seen that U(g) is cocommutative. Note also that it is a graded coalgebra. The coradical filtering of U(g) is just the obvious filtering by degree [21]. When g is the Lie algebra of G, both previous constructions are mutually dual in a sense that will be studied soon.

1.3. Primitive and indecomposable elements

Definition 1.6. An element a in a bialgebra H is said to be (1-)primitive when
\[ \Delta a = a \otimes 1 + 1 \otimes a. \]
Primitive elements of H form a vector subspace P(H), which is seen at once to be a Lie subalgebra of H with the ordinary bracket [a, b] := ab − ba. For instance, elements of g inside U(g) are primitive by definition; and there are no others. Denote by $H_+ := \ker \eta$ the augmentation ideal of H. If for some a ∈ H, we can find $a_1, a_2 \in H_+$ with $\Delta a = a_1 \otimes 1 + 1 \otimes a_2$, then by the counit property $a = (\mathrm{id} \otimes \eta)\Delta a = a_1$, and similarly $a = a_2$; so a is primitive. In other words,
\[ P(H) = \Delta^{-1}(H_+ \otimes 1 + 1 \otimes H_+). \tag{1.8} \]
By the counit property as well, a ∈ ker η is automatic for primitive elements. If $\ell : H \to K$ is a bialgebra morphism, and a ∈ P(H), then
\[ \Delta_K\, \ell(a) = (\ell \otimes \ell)\Delta_H a = \ell(a) \otimes 1 + 1 \otimes \ell(a). \]
Thus the restriction of $\ell$ to P(H) defines a Lie algebra map $P(\ell) : P(H) \to P(K)$. If $\ell$ is injective, obviously so is $P(\ell)$.
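The claim that primitive elements are closed under the commutator can be tested directly in a tensor algebra with the coproduct extended multiplicatively from (1.7). Below is a minimal Python sketch under assumptions of our own: elements of T(V) are encoded as dictionaries from words (tuples of letters) to integer coefficients, and elements of T(V) ⊗ T(V) as dictionaries from pairs of words; the helper names are hypothetical. The check is done in T(V) itself, before passing to any quotient such as U(g).

```python
from itertools import combinations
from collections import defaultdict

def add(u, v, scale=1):
    """Return u + scale * v for dict-encoded elements, dropping zero terms."""
    out = defaultdict(int, u)
    for key, c in v.items():
        out[key] += scale * c
    return {key: c for key, c in out.items() if c != 0}

def mul(u, v):
    """Concatenation product in the tensor algebra; words are tuples."""
    out = defaultdict(int)
    for w1, c1 in u.items():
        for w2, c2 in v.items():
            out[w1 + w2] += c1 * c2
    return {w: c for w, c in out.items() if c != 0}

def coproduct(u):
    """Multiplicative extension of Delta(X) = X (x) 1 + 1 (x) X: each letter
    of a word goes either to the left or to the right tensor factor."""
    out = defaultdict(int)
    for w, c in u.items():
        n = len(w)
        for r in range(n + 1):
            for left in combinations(range(n), r):
                lw = tuple(w[i] for i in left)
                rw = tuple(w[i] for i in range(n) if i not in left)
                out[(lw, rw)] += c
    return {key: c for key, c in out.items() if c != 0}

def is_primitive(u):
    """Test Delta(u) == u (x) 1 + 1 (x) u, with 1 the empty word ()."""
    expected = defaultdict(int)
    for w, c in u.items():
        expected[(w, ())] += c
        expected[((), w)] += c
    expected = {key: c for key, c in expected.items() if c != 0}
    return coproduct(u) == expected

X, Y = {("X",): 1}, {("Y",): 1}
XY = mul(X, Y)
commutator = add(mul(X, Y), mul(Y, X), scale=-1)   # XY - YX

# Delta(XY) = XY (x) 1 + X (x) Y + Y (x) X + 1 (x) XY, as in Example 1.5.
assert coproduct(XY) == {
    (("X", "Y"), ()): 1,
    (("X",), ("Y",)): 1,
    (("Y",), ("X",)): 1,
    ((), ("X", "Y")): 1,
}
assert is_primitive(X) and is_primitive(commutator)
assert not is_primitive(XY)
```

The mixed terms X ⊗ Y and Y ⊗ X cancel in the commutator, which is the computation behind the statement that P(H) is a Lie subalgebra.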
Proposition 1.7. Let H be a connected bialgebra. Write $H_{+n} := H_+ \cap H_n$ for all n. If $a \in H_{+n}$, then
\[ \Delta a = a \otimes 1 + 1 \otimes a + y, \quad \text{where } y \in H_{+\,n-1} \otimes H_{+\,n-1}. \tag{1.9} \]
Moreover, $H_1 \subseteq \mathbb{C}1 \oplus P(H)$.

Proof. Write ∆a = a ⊗ 1 + 1 ⊗ a + y. If $a \in H_+$, then (id ⊗ η)(y) = (id ⊗ η)∆a − a = 0 and similarly (η ⊗ id)(y) = 0 by the counity properties (1.1). Therefore, $y \in H_+ \otimes H_+$ and thus
\[ y \in \sum_{i=0}^n H_{+i} \otimes H_{+\,n-i}. \]
As $H_{+0} = (0)$, the first conclusion follows. Thus, when H is a connected bialgebra, $H = H_0 \oplus \ker \eta$. If $c \in H_1$, we can write c = µ1 + a, with µ ∈ ℂ and $a \in H_{+1}$. Now, ∆a = a ⊗ 1 + 1 ⊗ a + λ(1 ⊗ 1) for some λ ∈ ℂ; and a + λ1 is primitive, thus c = (µ − λ)1 + (a + λ1) lies in ℂ1 ⊕ P(H).
In a connected bialgebra, the primitive elements together with the scalars constitute the “second step” of the coradical filtering. If H is graded, some $H^{(k)}$ might be zero; but if k is the smallest non-zero integer such that $H^{(k)} \neq 0$, compatibility of the coproduct with the grading ensures $H^{(k)} \subseteq P(H)$. In view of (1.9), it will be very convenient to consider the reduced coproduct ∆′ defined on $H_+$ by
\[ \Delta' a := \Delta a - a \otimes 1 - 1 \otimes a =: \sum a'_{(1)} \otimes a'_{(2)}. \tag{1.10} \]
In other words, ∆′ is the restriction of ∆ as a map ker η → ker η ⊗ ker η; and a is primitive if and only if it lies in the kernel of ∆′. Coassociativity of ∆′ (and cocommutativity, when it holds) is easily obtained from the coassociativity of ∆; and we write
\[ \Delta'^{\,n} a = \sum a'_{(1)} \otimes \cdots \otimes a'_{(n+1)}. \]
The reduced coproduct is not an algebra homomorphism:
\[ \Delta'(ab) = \Delta' a\, \Delta' b + (a \otimes 1 + 1 \otimes a)\Delta' b + \Delta' a\,(b \otimes 1 + 1 \otimes b) + a \otimes b + b \otimes a. \tag{1.11} \]
Let $p : H \to H_+$ be the projection defined by p(a) := (id − uη)a = a − η(a) 1; then the previous result (1.9) is reformulated as (p ⊗ p)∆ = ∆′p; more generally, it follows that
\[ U_n := (p \otimes p \otimes \cdots \otimes p)\Delta^n = \Delta'^{\,n} p, \tag{1.12} \]
with n + 1 factors in p ⊗ p ⊗ · · · ⊗ p. This humble equality plays a decisive role in this work.
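Formula (1.11) and the identity (p ⊗ p)∆ = ∆′p are easy to confirm on a concrete connected bialgebra. The minimal Python sketch below assumes, as an illustration of our own choosing, the polynomial bialgebra ℂ[x] with $\Delta(x^n) = \sum_k \binom{n}{k}\, x^k \otimes x^{n-k}$; an element of H ⊗ H is encoded as a dictionary {(i, j): coefficient} standing for $x^i \otimes x^j$, and all function names are hypothetical.

```python
from math import comb
from collections import defaultdict

def clean(t):
    """Drop zero coefficients so that equal tensors compare equal."""
    return {k: c for k, c in t.items() if c != 0}

def tadd(*terms):
    """Sum of elements of H (x) H."""
    out = defaultdict(int)
    for t in terms:
        for k, c in t.items():
            out[k] += c
    return clean(out)

def tmul(s, t):
    """Product in H (x) H: (x^i (x) x^j)(x^k (x) x^l) = x^(i+k) (x) x^(j+l)."""
    out = defaultdict(int)
    for (i, j), c in s.items():
        for (k, l), d in t.items():
            out[(i + k, j + l)] += c * d
    return clean(out)

def coproduct(n):
    """Delta(x^n) = sum_k C(n,k) x^k (x) x^(n-k)."""
    return {(k, n - k): comb(n, k) for k in range(n + 1)}

def reduced(n):
    """Delta'(x^n) = Delta(x^n) - x^n (x) 1 - 1 (x) x^n, for n >= 1."""
    return {(k, n - k): comb(n, k) for k in range(1, n)}

def prim_part(n):
    """x^n (x) 1 + 1 (x) x^n."""
    return tadd({(n, 0): 1}, {(0, n): 1})

def rhs_1_11(m, n):
    """Right-hand side of (1.11) for a = x^m, b = x^n."""
    return tadd(
        tmul(reduced(m), reduced(n)),
        tmul(prim_part(m), reduced(n)),
        tmul(reduced(m), prim_part(n)),
        {(m, n): 1},
        {(n, m): 1},
    )

def p_tensor_p_delta(n):
    """(p (x) p) Delta (x^n): p kills x^0 and fixes x^k for k >= 1."""
    return clean({(i, j): c for (i, j), c in coproduct(n).items()
                  if i >= 1 and j >= 1})

assert reduced(1) == {}                      # x is primitive
for m in range(1, 6):
    assert p_tensor_p_delta(m) == reduced(m)  # (p (x) p) Delta = Delta' p
    for n in range(1, 6):
        assert reduced(m + n) == rhs_1_11(m, n)   # formula (1.11)
```

In this commutative example the two cross terms a ⊗ b and b ⊗ a coincide when m = n, which the dictionary sum handles automatically.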
Theorem 1.8. Consider now the tensor product H ⊗ K of two connected graded bialgebras H and K, which is also a connected graded bialgebra, with the coproduct (1.2). Then
\[ P(H \otimes K) = P(H) \otimes 1 + 1 \otimes P(K) \cong P(H) \oplus P(K) \tag{1.13} \]
in H ⊗ K.

Proof. The last identification comes from the obvious P(H) ⊗ 1 ∩ 1 ⊗ P(K) = (0) in H ⊗ K. Let a ∈ P(H) and b ∈ P(K), then
\[ \Delta_\otimes(a \otimes 1 + 1 \otimes b) = (\mathrm{id} \otimes \tau \otimes \mathrm{id})\bigl[(a \otimes 1 + 1 \otimes a) \otimes (1 \otimes 1) + (1 \otimes 1) \otimes (b \otimes 1 + 1 \otimes b)\bigr] = (a \otimes 1 + 1 \otimes b) \otimes (1 \otimes 1) + (1 \otimes 1) \otimes (a \otimes 1 + 1 \otimes b), \]
so a ⊗ 1 + 1 ⊗ b ∈ P(H ⊗ K). On the other hand, letting x ∈ P(H ⊗ K),
\[ \Delta_\otimes x = x \otimes 1 \otimes 1 + 1 \otimes 1 \otimes x. \tag{1.14} \]
Since H ⊗ K is a graded bialgebra, we can write
\[ x = a^0 \otimes 1 + 1 \otimes b^0 + \sum_{p, q \ge 1} a^p \otimes b^q, \]
where $a^0 \in H_+$, $b^0 \in K_+$, $a^p \in H^{(p)}$ and $b^q \in K^{(q)}$. By (1.10), $\Delta_H a^i = a^i \otimes 1 + 1 \otimes a^i + \sum a'^{\,i}_{(1)} \otimes a'^{\,i}_{(2)}$ for i ≥ 0 and $\Delta_K b^j = b^j \otimes 1 + 1 \otimes b^j + \sum b'^{\,j}_{(1)} \otimes b'^{\,j}_{(2)}$ for j ≥ 0. Thus, after careful book-keeping, it follows that
\[ \Delta_\otimes x = x \otimes 1 \otimes 1 + 1 \otimes 1 \otimes x + \sum a'^{\,0}_{(1)} \otimes 1 \otimes a'^{\,0}_{(2)} \otimes 1 + \sum 1 \otimes b'^{\,0}_{(1)} \otimes 1 \otimes b'^{\,0}_{(2)} + \sum_{p, q \ge 1} \bigl(1 \otimes b^q \otimes a^p \otimes 1 + a^p \otimes 1 \otimes 1 \otimes b^q\bigr) + R, \]
where R is a sum of terms of the form c ⊗ d ⊗ e ⊗ f where at least three of the following conditions hold: $c \in H_+$, $d \in K_+$, $e \in H_+$ or $f \in K_+$. A comparison with (1.14) then gives R = 0 and
\[ \sum a'^{\,0}_{(1)} \otimes 1 \otimes a'^{\,0}_{(2)} \otimes 1 = \sum 1 \otimes b'^{\,0}_{(1)} \otimes 1 \otimes b'^{\,0}_{(2)} = \sum_{p, q \ge 1} a^p \otimes 1 \otimes 1 \otimes b^q = 0. \]
The vanishing of the third sum gives that $\sum a^p \otimes b^q = 0$, so $x = a^0 \otimes 1 + 1 \otimes b^0$, whereas the vanishing of the first and second sums gives $\sum a'^{\,0}_{(1)} \otimes a'^{\,0}_{(2)} = 0$ and $\sum b'^{\,0}_{(1)} \otimes b'^{\,0}_{(2)} = 0$, that is, $a^0 \in P(H)$ and $b^0 \in P(K)$.

Definition 1.9. In a connected Hopf algebra, $H_+$ is the unique maximal ideal and the $H_+^m$ for m ≥ 1 form a descending sequence of ideals. The graded algebra $Q(H) := \mathbb{C}1 \oplus H_+/H_+^2$ is called the set of indecomposables of H.

Algebraically, the H-module $Q(H) := H_+/H_+^2$ is the tensor product of $H_+$ and ℂ by means of η : H → ℂ [24, Sec. 2.4]. We spell this out. Given M and N, respectively a right H-module by an action $\phi_M$ and a left H-module by an action $\phi_N$, the vector space whose elements are finite sums $\sum_j m_j \otimes n_j$ with $m_j \in M$
and $n_j \in N$, subject to the relations
\[ m\,\phi_M(a) \otimes n = m \otimes \phi_N(a)\,n \quad \text{for each } a \in H, \]
is denoted $M \otimes_H N$. Note now that ℂ is a (left or right) H-module by $a \triangleright \lambda = \lambda \triangleleft a := \eta(a)\lambda$. Also $H_+$ is a (right or left) H-module. Thus, the tensor product $H_+ \otimes_H \mathbb{C}$ of $H_+$ and ℂ over H by means of η is the graded vector space whose elements are finite sums $\sum_j s_j \otimes \beta_j$ with $s_j \in H_+$ and $\beta_j \in \mathbb{C}$, subject to the relations
\[ sa \otimes \beta = s \otimes \eta(a)\beta \quad \text{for each } a \in H; \]
if $s \in H_+^2$, then s = 0 in $H_+ \otimes_H \mathbb{C}$. Similarly, one defines $\mathbb{C} \otimes_H H_+ \simeq H_+ \otimes_H \mathbb{C}$. Notice that $\mathbb{C} \otimes_H N \simeq N/H_+ N$. The quotient algebra morphism $H \to \mathbb{C}1 \oplus H_+/H_+^2$ restricts to a graded linear map $q_H : P(H) \to H_+/H_+^2$. Clearly, this map will be one-to-one iff $P(H) \cap H_+^2 = 0$, and onto iff $P(H) + H_+^2 = H_+$.

Proposition 1.10. If the relation
\[ P(H) \cap H_+^2 = (0), \]
implying that primitive elements are all indecomposable, holds, then H is commutative.

Proof. The commutator [a, b] for a, b primitive belongs both to P(H) and $H_+^2$, therefore it must vanish. Proceeding by induction on the degree, one sees that the bracket vanishes for general elements of H: indeed, let [a, b] = 0 for all $a \in H_p$, $b \in H_q$ and consider [a, b] for, say, $a \in H_p$, $b \in H_{q+1}$. A straightforward computation, using (1.11), shows that ∆′[a, b] = 0. So [a, b] is primitive and hence, zero.
If $\ell : H \to K$ is an onto bialgebra map, then the induced map $Q(\ell) : Q(H) \to Q(K)$ is onto. (The terminology “indecomposable elements” used for instance in this section is somewhat sloppy, as in fact the indecomposables are defined only modulo $H_+^2$. However, to avoid circumlocutions, we shall still use it often, trusting the reader not to be confused.)

1.4. Dualities

Consider the space $C^*$ of all linear functionals on a coalgebra C. One identifies $C^* \otimes C^*$ with a subspace of $(C \otimes C)^*$ by defining
\[ f \otimes g(a \otimes b) = f(a)\,g(b), \tag{1.15} \]
where a, b ∈ C; f, g ∈ $C^*$. Then $C^*$ becomes an algebra with product the restriction of $\Delta^t$ to $C^* \otimes C^*$, with t denoting transposed maps. We have already seen this, as this product is just the convolution product: $fg(a) = f(a_{(1)})\, g(a_{(2)})$, and $u_{C^*} 1 = \eta$ means the unit is $\eta^t$.
August 29, 2005 18:8 WSPC/148-RMP
896
J070-00246
H. Figueroa & J. M. Gracia-Bond´ıa
It is a bit harder to obtain a coalgebra by dualization of an algebra $A$, and thus a bialgebra by a dualization of another. The reason is that $m^t$ takes $A^*$ to $(A \otimes A)^*$ and there is no natural mapping from $(A \otimes A)^*$ to $A^* \otimes A^*$; if $A$ is not finite-dimensional, the inverse of the identification in (1.15) does not exist, as the first of these spaces is larger than the second. In view of these difficulties, the pragmatic approach is to focus on the (strict) pairing of two bialgebras $H$ and $K$, where each may be regarded as included in the dual of the other. That is to say, we write down a bilinear form $\langle a, f \rangle := f(a)$ for $a \in H$ and $f \in K$ with implicit inclusions $K \hookrightarrow H^*$, $H \hookrightarrow K^*$. The transposing of operations between the two bialgebras boils down to the following four relations, for $a, b \in H$ and $f, g \in K$:
\[
\langle ab, f \rangle = \langle a \otimes b, \Delta_K f \rangle, \qquad
\langle a, fg \rangle = \langle \Delta_H a, f \otimes g \rangle, \qquad
\eta_H(a) = \langle a, 1 \rangle \ \text{ and } \ \eta_K(f) = \langle 1, f \rangle. \tag{1.16}
\]
The nondegeneracy conditions that allow us to assume that $H \hookrightarrow K^*$ and $K \hookrightarrow H^*$ are: (i) $\langle a, f \rangle = 0$ for all $f \in K$ implies $a = 0$, and (ii) $\langle a, f \rangle = 0$ for all $a \in H$ implies $f = 0$. It is plain that $H$ is a commutative bialgebra iff $K$ is cocommutative.

The two examples at the end of Sec. 1.2 are tied up by duality as follows. Let $G$ be a compact connected Lie group whose Lie algebra is $\mathfrak{g}$. The function algebra $R(G)$ is a commutative bialgebra, whereas $U(\mathfrak{g})$ is a cocommutative bialgebra. Moreover, representative functions are smooth [25]. On identifying $\mathfrak{g}$ with the space of left-invariant vector fields on the group manifold $G$, we can realize $U(\mathfrak{g})$ as the algebra of left-invariant differential operators on $G$. If $X \in \mathfrak{g}$ and $f \in R(G)$, we define
\[
\langle X, f \rangle := Xf(1) = \frac{d}{dt}\Big|_{t=0} f(\exp tX)
\]
and more generally, $\langle X_1 \cdots X_n, f \rangle := X_1(\cdots(X_n f)\cdots)(1)$; we also set $\langle 1, f \rangle := f(1)$. This yields a duality between $R(G)$ and $U(\mathfrak{g})$. Indeed, the Leibniz rule for vector fields, namely $X(fh) = (Xf)h + f(Xh)$, gives
\[
\langle X, fh \rangle = Xf(1)h(1) + f(1)Xh(1) = (X \otimes 1 + 1 \otimes X)(f \otimes h)(1 \otimes 1) = \Delta X(f \otimes h)(1 \otimes 1) = \langle \Delta X, f \otimes h \rangle \tag{1.17}
\]
while
\[
\begin{aligned}
\langle X \otimes Y, \Delta f \rangle
&= \frac{d}{dt}\Big|_{t=0} \frac{d}{ds}\Big|_{s=0} (\Delta f)(\exp tX \otimes \exp sY)
 = \frac{d}{dt}\Big|_{t=0} \frac{d}{ds}\Big|_{s=0} f(\exp tX \exp sY) \\
&= \frac{d}{dt}\Big|_{t=0} (Yf)(\exp tX) = X(Yf)(1) = \langle XY, f \rangle.
\end{aligned}
\]
The necessary properties can be easily checked. Relation (1.17) shows that $\Delta X = X \otimes 1 + 1 \otimes X$ encodes the Leibniz rule for vector fields.
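As a small numerical illustration of the pairing relation $\langle X, fh \rangle = \langle \Delta X, f \otimes h \rangle$ and its higher-order analogues, one may take the circle group, with representative functions $e_n(\theta) = e^{in\theta}$ stored as dictionaries of coefficients. The sketch below is ours and not part of the text: to stay within integer arithmetic it uses the rescaled generator $D = -i\,d/d\theta$ (so $D e_n = n e_n$), and all function names are hypothetical.

```python
from math import comb

def multiply(f, h):
    """Pointwise product of trigonometric polynomials stored as {n: coefficient}."""
    out = {}
    for n, a in f.items():
        for m, b in h.items():
            out[n + m] = out.get(n + m, 0) + a * b
    return {n: c for n, c in out.items() if c != 0}

def pair(k, f):
    """<D^k, f> = (D^k f)(1), using D e_n = n e_n and e_n(1) = 1."""
    return sum(c * n**k for n, c in f.items())

def pair_coproduct(k, f, h):
    """<Delta(D^k), f (x) h>, where Delta(D^k) = sum_j C(k, j) D^j (x) D^(k-j)
    because D is primitive and Delta is an algebra homomorphism."""
    return sum(comb(k, j) * pair(j, f) * pair(k - j, h) for j in range(k + 1))

f = {0: 2, 1: -1, 3: 4}   # 2 - e_1 + 4 e_3 (arbitrary test data)
h = {-2: 1, 1: 5}         # e_{-2} + 5 e_1
for k in range(6):
    # product on the function side is dual to the coproduct on the operator side
    assert pair(k, multiply(f, h)) == pair_coproduct(k, f, h)
```

For $k = 1$ the assertion is exactly the Leibniz rule; for higher $k$ it is the binomial expansion of $(D \otimes 1 + 1 \otimes D)^k$.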
A more normative approach to duality is to consider instead the subspace $A^\circ$ of $A^*$ made of functionals whose kernels contain an ideal of finite codimension in $A$. Alternatively, $A^\circ$ can be defined as the set of all functionals $f \in A^*$ for which there are functionals $g_1, \dots, g_r; h_1, \dots, h_r$ in $A^*$ such that $f(ab) = \sum_{j=1}^r g_j(a)h_j(b)$; that is to say, $A^\circ$ is the set of functions on the monoid of $A$ that are both linear and representative. It can be checked that $m^t$ maps $A^\circ$ to $A^\circ \otimes A^\circ$, and so $(A^\circ, m^t|_{A^\circ}, u^t)$ defines a coalgebra structure on $A^\circ$, with $\eta_{A^\circ}(f) = f(1_A)$. Given a bialgebra $(H, m, u, \Delta, \eta)$, one then sees that $(H^\circ, \Delta^t, \eta^t, m^t, u^t)$ is again a bialgebra, called the finite dual or Sweedler dual of $H$; the contravariant functor $H \mapsto H^\circ$ defines a duality of the category of bialgebras into itself. In the previous case of a dual pair $(H, K)$, we actually have $K \hookrightarrow H^\circ$ and $H \hookrightarrow K^\circ$.

If $G$ is a group, $\mathbb{C}G$ denotes the group algebra of $G$, that is, the complex vector space freely generated by $G$ as a basis, with product defined by extending linearly the group multiplication of $G$, so $1_G$ is the unit in $\mathbb{C}G$. Endowed with the coalgebra structure given by (the linear extensions of) $x \mapsto x \otimes x$ and $\eta(x) := 1$, it is a cocommutative bialgebra. In view of the discussion in Sec. 1.1, $R(G)$ is the Sweedler dual of $\mathbb{C}G$.

In a general bialgebra $H$, a non-zero element $g$ is called group-like if $\Delta g = g \otimes g$; for it $\eta(g) = 1$. The product of group-like elements is group-like. The characters of a bialgebra $H$ (further discussed in the following section) are by definition the multiplicative elements of $H^*$. They belong to $H^\circ$, as for them, $m^t f = f \otimes f$. Then, the set $G(H^\circ)$ of group-like elements of $H^\circ$ coincides with the set of characters of $H$. If $H = R(G)$, the map $x \mapsto x^0$ given by $x^0(f) = f(x)$ gives a map $G \to G(R^\circ(G))$; it will become clear soon that it is a homomorphism of groups.
Among the interesting elements of the dual of a bialgebra, there are also the derivations or infinitesimal characters: these are linear functionals δ satisfying δ(ab) = δ(a)η(b) + η(a)δ(b)
for all a, b ∈ H.
This entails $\delta(1) = 0$. The previous relation can also be written as $m^t(\delta) = \delta \otimes \eta + \eta \otimes \delta$, which shows that the infinitesimal characters belong to $H^\circ$ as well, and are primitive there. Thus the Lie algebra of primitive elements of $H^\circ$ coincides with the Lie algebra $\mathrm{Der}_\eta H$ of infinitesimal characters.

When $A$ is a graded algebra of finite type, one can consider the space
\[
A' := \bigoplus_{n \ge 0} \big(A^{(n)}\big)^*,
\]
where $\langle (A^{(n)})^*, A^{(m)} \rangle = 0$ for $n \ne m$, and there is certainly no obstacle to define the graded coproduct on homogeneous components of $A'$ as the transpose of $m$:
\[
\bigoplus_{k=0}^{n} A^{(k)} \otimes A^{(n-k)} \to A^{(n)}.
\]
If the algebra above is a bialgebra $H$, one obtains in this way a subbialgebra $H'$ of $H^\circ$, called the graded dual of $H$. Certainly $(H, H')$ form a nondegenerate dual pair. Note that $H'' = H$. If $I$ is a linear subspace of $H$ graded of finite type, we denote by $I^\perp$ its orthogonal in $H'$. Naturally $I^{\perp\perp} = I$.

Proposition 1.11. For a graded connected bialgebra of finite type $H$,
\[
P(H')^\perp = \mathbb{C}1 \oplus H_+^2. \tag{1.18}
\]

Proof. Let $p \in P(H')$. Using (1.16), we obtain $\langle p, 1 \rangle = \eta_{H'}(p) = 0$. Also, for $a_1, a_2 \in H_+$,
\[
\langle p, a_1 a_2 \rangle = \langle p \otimes 1 + 1 \otimes p, a_1 \otimes a_2 \rangle = \eta_H(a_2)\langle p, a_1 \rangle + \eta_H(a_1)\langle p, a_2 \rangle = 0.
\]
Thus $\mathbb{C}1 \oplus H_+^2 \subseteq P(H')^\perp$. Use of (1.16) again easily gives $(\mathbb{C}1 \oplus H_+^2)^\perp \subseteq P(H')$. Then $P(H')^\perp \subseteq \mathbb{C}1 \oplus H_+^2$ and therefore, $P(H')^\perp = \mathbb{C}1 \oplus H_+^2$.
In general, $H' \subsetneq H^\circ$. As an example, let the polynomial algebra $H = \mathbb{C}[Y]$ be endowed with the coproduct associated to translation,
\[
\Delta Y^n = \sum_{k=0}^{n} \binom{n}{k} Y^k \otimes Y^{n-k}, \tag{1.19}
\]
which is a homomorphism in view of the Vandermonde identity; this is the so-called binomial bialgebra. Consider the elements $f^{(n)}$ of $H^*$ defined by
\[
\langle f^{(n)}, Y^m \rangle = \delta_{nm}.
\]
Obviously any $\phi \in H^*$ can be written as
\[
\phi = \sum_{n \ge 0} c_n f^{(n)},
\]
where the complex numbers $c_n$ are given by $c_n = \langle \phi, Y^n \rangle$. Now, write $f := f^{(1)}$. Notice that $f$ is primitive since
\[
\langle m^t f, Y^n \otimes Y^m \rangle = \langle f, Y^n Y^m \rangle = \langle f, Y^{n+m} \rangle = \delta_{1n}\delta_{0m} + \delta_{0n}\delta_{1m} = \langle f \otimes \eta + \eta \otimes f, Y^n \otimes Y^m \rangle.
\]
On the other hand,
\[
\langle f^2, Y^n \rangle = \langle f \otimes f, \Delta Y^n \rangle = \sum_{k=0}^{n} \binom{n}{k} f(Y^{n-k}) f(Y^k).
\]
Since $f(Y^k) = 0$ unless $k = 1$, then $f^2 = 2!\,f^{(2)}$. A simple induction entails $f^n = n!\,f^{(n)}$. Thus, $\phi$ can be written as $\phi = \sum_{n \ge 0} d_n f^n$ with $d_n = c_n/n!$; so $H^* \simeq \mathbb{C}[[f]]$, the algebra of formal (exponential, if you wish) power series in $f$.
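The identity $f^n = n!\,f^{(n)}$ can be checked mechanically. The following sketch is ours (the truncation degree `N` and all names are illustrative assumptions): it stores a functional on $\mathbb{C}[Y]$ as the list of its values on $1, Y, \dots, Y^N$ and implements the product of functionals dual to the binomial coproduct (1.19).

```python
from math import comb, factorial

N = 8  # truncation degree, an arbitrary choice for the check

def conv(phi, psi):
    """Product in H^*: (phi psi)(Y^n) = sum_k C(n, k) phi(Y^k) psi(Y^(n-k))."""
    return [sum(comb(n, k) * phi[k] * psi[n - k] for k in range(n + 1))
            for n in range(N + 1)]

def dual_basis(n):
    """The functional f^(n), with <f^(n), Y^m> = delta_{nm}."""
    return [1 if m == n else 0 for m in range(N + 1)]

eta = dual_basis(0)  # the counit, which is the unit of H^*
f = dual_basis(1)

def power(phi, n):
    """n-fold product of phi with itself (phi^0 = eta)."""
    out = eta
    for _ in range(n):
        out = conv(out, phi)
    return out

for n in range(N + 1):
    assert power(f, n) == [factorial(n) * v for v in dual_basis(n)]
```

The same `conv` also reproduces the divided powers rule, for instance `conv(dual_basis(2), dual_basis(3))` equals $\binom{5}{2} f^{(5)}$.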
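It is also rather clear what $\mathbb{C}[Y]'$ is: in terms of the $f^{(n)}$ it is the divided powers bialgebra, namely the bialgebra with basis $f^{(n)}$ for $n \ge 0$, where the product and coproduct are, respectively, given by
\[
f^{(n)} f^{(m)} = \binom{n+m}{n} f^{(n+m)} \quad \text{and} \quad \Delta f^{(n)} = \sum_{k=0}^{n} f^{(n-k)} \otimes f^{(k)}.
\]
We can conclude that $\mathbb{C}[Y]' = \mathbb{C}[f]$.

Consider now $\phi$ in the Sweedler dual $H^\circ$. By definition, there exists some (principal) ideal $I = (p(Y))$, with $p$ a (monic) polynomial, such that $\phi(I) = 0$. Therefore, we shall first describe all the $\phi$ that vanish on a given ideal $I$. We start with the case $p(Y) = (Y - \lambda)^r$ for some $\lambda \in \mathbb{C}$ and $r \in \mathbb{N}$. Let $\phi_\lambda := \sum_{n \ge 0} \lambda^n f^{(n)} = \exp(\lambda f)$. The set $\{(Y - \lambda)^m : m \ge 0\}$ is also a basis of $H$. As before, one can consider the elements $g_\lambda^{(m)}$ of $H^*$ defined by $\langle g_\lambda^{(m)}, (Y - \lambda)^l \rangle = \delta_{lm}$. We are going to prove that $g_\lambda^{(m)} = f^{(m)}\phi_\lambda$. Indeed
\[
\begin{aligned}
\langle f^{(m)}\phi_\lambda, (Y - \lambda)^l \rangle
&= \Big\langle f^{(m)}\phi_\lambda, \sum_{k=0}^{l} \binom{l}{k} (-\lambda)^{l-k} Y^k \Big\rangle
 = \sum_{k=0}^{l} \binom{l}{k} (-\lambda)^{l-k} \langle f^{(m)} \otimes \phi_\lambda, \Delta Y^k \rangle \\
&= \sum_{k=0}^{l} \binom{l}{k} (-\lambda)^{l-k} \Big\langle f^{(m)} \otimes \phi_\lambda, \sum_{j=0}^{k} \binom{k}{j} Y^j \otimes Y^{k-j} \Big\rangle \\
&= \sum_{k=0}^{l} \sum_{j=0}^{k} \binom{l}{k} \binom{k}{j} (-\lambda)^{l-k} f^{(m)}(Y^j)\,\phi_\lambda(Y^{k-j}).
\end{aligned}
\]
Since $f^{(m)}(Y^j)$ vanishes if $m > j$, it is clear that $\langle f^{(m)}\phi_\lambda, (Y - \lambda)^l \rangle = 0$ if $m > l$. If $m = l$, only one term survives and $\langle f^{(m)}\phi_\lambda, (Y - \lambda)^m \rangle = 1$. On the other hand, if $m < l$, then
\[
\langle f^{(m)}\phi_\lambda, (Y - \lambda)^l \rangle
= \sum_{k=m}^{l} \binom{l}{k} \binom{k}{m} (-1)^{l-k} \lambda^{l-m}
= \frac{\lambda^{l-m}}{m!} \sum_{k=m}^{l} \binom{l}{k} (-1)^{l-k} k(k-1)\cdots(k-m+1).
\]
Successive derivatives of the binomial identity give
\[
l(l-1)\cdots(l-m+1)(x-1)^{l-m} = \sum_{k=m}^{l} \binom{l}{k} (-1)^{l-k} k(k-1)\cdots(k-m+1)\,x^{k-m},
\]
therefore
\[
0 = \sum_{k=m}^{l} \binom{l}{k} (-1)^{l-k} k(k-1)\cdots(k-m+1),
\]
and we conclude that $\langle f^{(m)}\phi_\lambda, (Y - \lambda)^l \rangle = \delta_{lm}$. Any $\phi \in H^\circ$ can be written as
\[
\phi = \sum_{m \ge 0} e_m f^{(m)}\phi_\lambda.
\]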
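The orthogonality relation $\langle f^{(m)}\phi_\lambda, (Y - \lambda)^l \rangle = \delta_{lm}$ lends itself to a direct numerical check. In the sketch below (ours; the function names are hypothetical) a functional is a Python function returning its value on $Y^n$, the product of functionals is the one dual to the binomial coproduct, and $(Y - \lambda)^l$ is expanded by the binomial theorem.

```python
from math import comb

def conv(phi, psi):
    """Product of functionals: (phi psi)(Y^n) = sum_k C(n, k) phi(Y^k) psi(Y^(n-k))."""
    return lambda n: sum(comb(n, k) * phi(k) * psi(n - k) for k in range(n + 1))

def f_div(m):
    """Divided power f^(m): value 1 on Y^m and 0 on every other monomial."""
    return lambda n: 1 if n == m else 0

def phi_exp(lam):
    """phi_lambda = sum_n lam^n f^(n), i.e. evaluation of polynomials at lam."""
    return lambda n: lam**n

def pair_shifted(phi, lam, l):
    """<phi, (Y - lam)^l>, expanding (Y - lam)^l in the monomial basis."""
    return sum(comb(l, k) * (-lam)**(l - k) * phi(k) for k in range(l + 1))

for lam in (2, -3, 0):
    for m in range(6):
        g = conv(f_div(m), phi_exp(lam))
        for l in range(6):
            assert pair_shifted(g, lam, l) == (1 if m == l else 0)
```

Integer values of $\lambda$ are used only to keep the arithmetic exact; the identity itself holds for any complex $\lambda$.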
It follows that those $\phi$ satisfying $\langle \phi, (Y - \lambda)^r \rangle = 0$ are of the form
\[
\phi = \sum_{m=0}^{r-1} e_m f^{(m)}\phi_\lambda; \tag{1.20}
\]
we can think of them as linear recursive sequences [26]. In general, $p(Y) = \prod_{i=1}^{s} (Y - \lambda_i)^{r_i}$, and the $\phi$ satisfying $\langle \phi, (p(Y)) \rangle = 0$ will be linear combinations of terms as in (1.20). Thus
\[
H^\circ = \Big\{ \sum e_{ij} f^{(i)}\phi_{\lambda_j} : e_{ij}, \lambda_j \in \mathbb{C} \Big\}.
\]
Furthermore, since $\phi_\lambda(Y^n) = \lambda^n = \phi_\lambda(Y)^n$, it ensues that $\phi_\lambda \in \mathrm{Hom}_{\mathrm{alg}}(H, \mathbb{C}) = G(H^\circ)$. Conversely, if $\phi$ is group-like and $\phi(Y) = \lambda$, then
\[
\langle \phi, Y^2 \rangle = \langle m^t\phi, Y \otimes Y \rangle = \langle \phi \otimes \phi, Y \otimes Y \rangle = \phi(Y)^2 = \lambda^2,
\]
and so on. It follows that $\phi = \phi_\lambda$; in other words the group-like elements of $H^\circ$ are precisely the exponentials $\phi_\lambda = \exp(\lambda f)$, and since $\phi_\lambda\phi_\mu = \phi_{\lambda+\mu}$, we conclude that $G(\mathbb{C}[Y]^\circ) \cong (\mathbb{C}, +)$. In summary, $H^\circ$ can be rewritten as $\mathbb{C}[Y]^\circ = H' \otimes G(\mathbb{C}[Y]^\circ)$. An interpretation of $\mathbb{C}[Y]^\circ$ as the space of proper rational functions is given in [27].

In general, when $H$ is commutative, $H' = U(P(H^\circ)) = U(\mathrm{Der}_\eta H)$. This is an instance of the Milnor–Moore theorem [23], on which we shall dwell a bit in Sec. 4.1. There are no group-like elements in $H'$, apart from $\eta$. The characters of a graded connected commutative bialgebra $H$ can be recovered as the set of group-like elements in the completion $H^\circ$ of the algebra $H' = U(\mathrm{Der}_\eta H)$. The sets $\sum_{k \ge m} (\mathrm{Der}_\eta H)^k$, for $m = 1, 2, \dots$ form a basis of neighborhoods of 0 for a vector space topology on $U(\mathrm{Der}_\eta H)$; the grading properties mean that the Hopf algebra operations are continuous for this $H'_+$-adic topology. An element of the completion of $U(\mathrm{Der}_\eta H)$ is a series $\sum_{k \ge 0} z_k$ with $z_k \in (\mathrm{Der}_\eta H)^k$ for each $k$. As $\delta^{m+1}(a) = 0$ if $a \in (\ker \eta)^m$, the element $\exp(\delta) \in H^\circ$ makes sense for each $\delta \in \mathrm{Der}_\eta H$ and $\exp \delta \in G(H^\circ)$ since
\[
\Delta \exp \delta = \exp \Delta\delta = \exp(\eta \otimes \delta + \delta \otimes \eta) = \exp \delta \otimes \exp \delta,
\]
by continuity of $\Delta$. This exponential map is a bijection between $\mathrm{Der}_\eta H$ and $G(H^\circ)$ [28, 29], with inverse
\[
\log \mu = -\sum_{n=1}^{\infty} \frac{(\eta - \mu)^n}{n}.
\]
That is, the group G(H ◦ ) is a pro-unipotent Lie group, and one regards the commutative Hopf algebra H as an algebra of affine coordinates on that group. As a general paradigm, the dual of R(G), where G is a Lie group with Lie algebra g, is of the form U(g) ⊗ CG as a coalgebra; as an algebra, it is the smash product of U(g) and CG. Smash products will be briefly discussed in Sec. 2.3. Here they are just a fancy name for the adjoint action of G on g. Finally we can come back to our introductory remarks in the second half of Sec. 1.1. The whole idea of harmonic analysis is to “linearize” the action of a group — or a Lie algebra — on a space X by switching the attention, as it is done in noncommutative geometry, from X itself to spaces of functions on X, and the corresponding operator algebras. Now, linear representations of groups and of Lie algebras can be tensor multiplied, whereas general representations of algebras cannot. Thus from the standpoint of representation theory, the main role of the coproduct is to ensure that the action on H-modules propagates to their tensor products. To wit, if a bialgebra H acts on V and W , then it will also act on V ⊗W by h⊗ · (v ⊗ w) = h(1) · v ⊗ h(2) · w for all h ∈ H, v ∈ V and w ∈ W . In other words, if φV : H ⊗ V → V , φW : H ⊗ W → W denote the actions, then φV ⊗W := (φV ⊗ φW )(id ⊗ τ ⊗ id)(∆ ⊗ id ⊗ id).
(1.21)
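Indeed,
\[
h_\otimes \cdot (k_\otimes \cdot (v \otimes w)) = \sum (h_{(1)}k_{(1)}) \cdot v \otimes (h_{(2)}k_{(2)}) \cdot w = (hk)_\otimes \cdot (v \otimes w),
\]
and moreover $1_\otimes \cdot (v \otimes w) := 1 \cdot v \otimes 1 \cdot w = v \otimes w$ as it should. In this view, the product structure on the module of all representations of a group comes from the comultiplication: $g \mapsto g \otimes g$ for $g \in G$; in the case of representations of Lie algebras, where a $\mathfrak{g}$-module is the same as a module for the bialgebra $U(\mathfrak{g})$, there is analogously a product. Note that $V \otimes W \not\simeq W \otimes V$ in an arbitrary $H$-module category.

Definition 1.12. Dually, we envisage corepresentations of coalgebras $C$. A right corepresentation or coaction of $C$ on a vector space $V$ is a linear map $\gamma : V \to V \otimes C$ such that $(\mathrm{id} \otimes \Delta)\gamma = (\gamma \otimes \mathrm{id})\gamma$ and $\mathrm{id} = (\mathrm{id} \otimes \eta)\gamma$. These conditions are expressed by the commutativity of the diagrams
\[
\begin{array}{ccc}
V \otimes C \otimes C & \xleftarrow{\ \mathrm{id} \otimes \Delta\ } & V \otimes C \\
{\scriptstyle \gamma \otimes \mathrm{id}}\,\big\uparrow & & \big\uparrow\,{\scriptstyle \gamma} \\
V \otimes C & \xleftarrow{\ \ \gamma\ \ } & V
\end{array}
\qquad\qquad
\begin{array}{ccc}
V & \xleftarrow{\ \mathrm{id} \otimes \eta\ } & V \otimes C \\
{\scriptstyle \mathrm{id}}\,\big\uparrow & & \big\|\, \\
V & \xrightarrow{\ \ \gamma\ \ } & V \otimes C,
\end{array}
\]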
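obtained by the process of reversing arrows from the axioms of a left representation of an algebra $\phi : A \otimes V \to V$:
\[
\begin{array}{ccc}
A \otimes A \otimes V & \xrightarrow{\ m \otimes \mathrm{id}\ } & A \otimes V \\
{\scriptstyle \mathrm{id} \otimes \phi}\,\big\downarrow & & \big\downarrow\,{\scriptstyle \phi} \\
A \otimes V & \xrightarrow{\ \ \phi\ \ } & V
\end{array}
\qquad\qquad
\begin{array}{ccc}
V & \xrightarrow{\ u \otimes \mathrm{id}\ } & A \otimes V \\
{\scriptstyle \mathrm{id}}\,\big\downarrow & & \big\|\, \\
V & \xleftarrow{\ \ \phi\ \ } & A \otimes V.
\end{array}
\]
We use the convenient notation
\[
\gamma(v) = \sum v^{(1)} \otimes v^{(2)},
\]
with $v^{(1)} \in V$ and $v^{(2)} \in C$. So, for instance, the first defining relation becomes
\[
\sum v^{(1)(1)} \otimes v^{(1)(2)} \otimes v^{(2)} = \sum v^{(1)} \otimes v^{(2)}_{(1)} \otimes v^{(2)}_{(2)}.
\]
Representations of algebras come from corepresentations of their predual coalgebras: if $\gamma$ is a corepresentation as above, then $h \cdot v := (\mathrm{id} \otimes h)\gamma(v)$ or $h \cdot v := \sum v^{(1)} h(v^{(2)})$ defines a representation of $C^*$. Indeed
\[
\begin{aligned}
h_1 \cdot (h_2 \cdot v) &= h_1 \cdot \sum v^{(1)} h_2(v^{(2)}) = \sum v^{(1)(1)} h_1(v^{(1)(2)}) h_2(v^{(2)}) \\
&= \sum v^{(1)} h_1\big(v^{(2)}_{(1)}\big) h_2\big(v^{(2)}_{(2)}\big) = \sum v^{(1)}\, h_1 h_2(v^{(2)}) = (h_1 h_2) \cdot v.
\end{aligned}
\]
Now we use the product structure: if a bialgebra $H$ coacts on $V$ and $W$, it coacts on the tensor product $V \otimes W$ by
\[
\gamma_\otimes(v \otimes w) = \sum v^{(1)} \otimes w^{(1)} \otimes v^{(2)} w^{(2)}, \qquad v \in V,\ w \in W;
\]
that is $\gamma_{V \otimes W} := (\mathrm{id} \otimes \mathrm{id} \otimes m)(\mathrm{id} \otimes \tau \otimes \mathrm{id})(\gamma_V \otimes \gamma_W)$, in complete parallel to (1.21). The required corepresentation properties are easily checked as well. For instance,
\[
\begin{aligned}
\sum (v \otimes w)^{(1)(1)} \otimes (v \otimes w)^{(1)(2)} \otimes (v \otimes w)^{(2)}
&= \sum v^{(1)(1)} \otimes w^{(1)(1)} \otimes v^{(1)(2)} w^{(1)(2)} \otimes v^{(2)} w^{(2)} \\
&= \sum \big(v^{(1)(1)} \otimes 1 \otimes v^{(1)(2)} \otimes v^{(2)}\big) \cdot \big(1 \otimes w^{(1)(1)} \otimes w^{(1)(2)} \otimes w^{(2)}\big) \\
&= \sum \big(v^{(1)} \otimes 1 \otimes v^{(2)}_{(1)} \otimes v^{(2)}_{(2)}\big) \cdot \big(1 \otimes w^{(1)} \otimes w^{(2)}_{(1)} \otimes w^{(2)}_{(2)}\big) \\
&= \sum v^{(1)} \otimes w^{(1)} \otimes v^{(2)}_{(1)} w^{(2)}_{(1)} \otimes v^{(2)}_{(2)} w^{(2)}_{(2)} \\
&= \sum (v \otimes w)^{(1)} \otimes (v \otimes w)^{(2)}_{(1)} \otimes (v \otimes w)^{(2)}_{(2)}.
\end{aligned}
\]
These simple observations prove decisive to our reconstruction of the Connes–Moscovici Hopf algebra in Sec. 2.3.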
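1.5. Antipodes

Definition 1.13. A skewgroup or Hopf algebra is a bialgebra $H$ together with a (necessarily unique) convolution inverse $S$ for the identity map id. Thus,
\[
\mathrm{id} * S = m(\mathrm{id} \otimes S)\Delta = u\eta, \qquad S * \mathrm{id} = m(S \otimes \mathrm{id})\Delta = u\eta,
\]
which boils down to the commutativity of the diagram
\[
\begin{array}{ccccc}
H \otimes H & \xleftarrow{\ \Delta\ } & H & \xrightarrow{\ \Delta\ } & H \otimes H \\
{\scriptstyle \mathrm{id} \otimes S}\,\big\downarrow & & \big\downarrow\,{\scriptstyle u \circ \eta} & & \big\downarrow\,{\scriptstyle S \otimes \mathrm{id}} \\
H \otimes H & \xrightarrow{\ m\ } & H & \xleftarrow{\ m\ } & H \otimes H.
\end{array}
\]
In terms of elements, this means
\[
\sum a_{(1)} S a_{(2)} = \eta(a) \quad \text{and} \quad \sum S a_{(1)}\, a_{(2)} = \eta(a).
\]
The map $S$ is usually called the antipode or coinverse of $H$. The notion of Hopf algebra occurred first in the work by Hopf in algebraic topology [30].

Uniqueness of the antipode can be seen as follows. Let $S, S'$ be two antipodes on a bialgebra. Then
\[
S'a = \sum S'a_{(1)}\,\eta(a_{(2)}) = \sum S'a_{(1)}\, a_{(2)(1)}\, S a_{(2)(2)} = \sum S'a_{(1)(1)}\, a_{(1)(2)}\, S a_{(2)} = \sum \eta(a_{(1)})\, S a_{(2)} = Sa.
\]
We have used counity in the first equality, and successively the antipode property for $S$, coassociativity, the antipode property for $S'$ and counity again.

A bialgebra morphism $\ell$ between two Hopf algebras $H$, $K$ is automatically a Hopf algebra morphism, i.e. it exchanges the antipodes: $\ell S_H = S_K \ell$. For that, it is enough to prove that these maps are one-sided convolution inverses for $\ell$ in $\mathrm{Hom}(H, K)$. Indeed, since the identity in $\mathrm{Hom}(H, K)$ is $u_K\eta_H$, it is enough to notice that
\[
\ell S_H * \ell = \ell(S_H * \mathrm{id}_H) = \ell u_H \eta_H = u_K \eta_H = u_K \eta_K \ell = (\mathrm{id}_K * S_K)\ell = \ell * S_K \ell; \tag{1.22}
\]
associativity of convolution then yields
\[
S_K \ell = u_K\eta_H * S_K \ell = \ell S_H * \ell * S_K \ell = \ell S_H * u_K\eta_H = \ell S_H.
\]
The antipode is an antimultiplicative and anticomultiplicative map of $H$. This means
\[
Sm = m\tau(S \otimes S), \quad S1 = 1 \qquad \text{and} \qquad \tau\Delta S = (S \otimes S)\Delta, \quad \eta S = \eta.
\]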
The first relation, evaluated on a ⊗ b, becomes the familiar antihomomorphism property S(ab) = SbSa. For the proof of it we refer to [24, Lemma 1.26]; the second relation is a similar exercise. A group-like element g of a Hopf algebra H is always invertible with g −1 = Sg. Indeed, 1 = uη(g) = m(id ⊗ S)∆g = gSg = m(S ⊗ id)∆g = (Sg)g.
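To see these properties at work in the simplest noncommutative case, the sketch below takes the group algebra of the permutation group $S_3$ with $\Delta x = x \otimes x$ on group elements and assumes the standard coinverse $Sx = x^{-1}$. The choice of group, the dictionary encoding of elements and all function names are ours, for illustration only.

```python
def compose(p, q):
    """Product of permutations given as tuples: (p q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

IDENTITY = (0, 1, 2)

def clean(a):
    return {g: c for g, c in a.items() if c != 0}

def multiply(a, b):
    """Product in the group algebra; elements are stored as {group element: coefficient}."""
    out = {}
    for g, c in a.items():
        for h, d in b.items():
            k = compose(g, h)
            out[k] = out.get(k, 0) + c * d
    return clean(out)

def antipode(a):
    """Linear extension of S x = x^{-1} (assumed form of the coinverse)."""
    return clean({inverse(g): c for g, c in a.items()})

def counit(a):
    return sum(a.values())

def id_conv_antipode(a):
    """m(id (x) S) Delta a, using Delta g = g (x) g on group elements."""
    out = {}
    for g, c in a.items():
        for k, v in multiply({g: 1}, antipode({g: 1})).items():
            out[k] = out.get(k, 0) + c * v
    return clean(out)

a = {(1, 0, 2): 2, (1, 2, 0): -3}
b = {(0, 2, 1): 1, (2, 0, 1): 5}
assert id_conv_antipode(a) == {IDENTITY: counit(a)}                      # id * S = u eta
assert antipode(multiply(a, b)) == multiply(antipode(b), antipode(a))   # S(ab) = Sb Sa
assert multiply({(1, 2, 0): 1}, antipode({(1, 2, 0): 1})) == {IDENTITY: 1}  # g Sg = 1
```

The second assertion is the antihomomorphism property; it would fail with the factors in the other order, since $S_3$ is not abelian.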
Often the antipode $S$ is involutive (thus invertible); that is, $S^2 = \mathrm{id}_H$.

Proposition 1.14. $S$ is involutive if and only if
\[
\sum S a_{(2)}\, a_{(1)} = \sum a_{(2)}\, S a_{(1)} = \eta(a). \tag{1.23}
\]

Proof. The relation $\sum S a_{(2)}\, a_{(1)} = \eta(a)$ implies
\[
S * S^2 a = \sum S a_{(1)}\, S^2 a_{(2)} = S\Big(\sum S a_{(2)}\, a_{(1)}\Big) = S\eta(a) = \eta(a).
\]
Hence $S * S^2 = S^2 * S = u\eta$, which entails $S^2 = \mathrm{id}$. Reciprocally, if $S^2 = \mathrm{id}$, then
\[
\sum S a_{(2)}\, a_{(1)} = \sum S a_{(2)}\, S^2 a_{(1)} = S\Big(\sum S a_{(1)}\, a_{(2)}\Big) = S\eta(a) = \eta(a),
\]
and analogously $\sum a_{(2)}\, S a_{(1)} = \eta(a)$.

In other words, the coinverse $S$ is involutive when it is still the inverse of the identity for the new operation obtained from $*$ by twisting it with the flip map. Property (1.23) clearly holds true for Hopf algebras that are commutative or cocommutative. The antipode for a commutative and cocommutative Hopf algebra is an involutive bialgebra morphism.

A Hopf subalgebra of $H$ is a vector subspace $E$ that is a Hopf algebra with the restrictions of the antipode, product, coproduct and so on, the inclusion $E \hookrightarrow H$ being a bialgebra morphism. A Hopf ideal is a biideal $J$ such that $SJ \subseteq J$; the quotient $H/J$ gives a Hopf algebra.

A glance at the defining conditions for the antipode shows that, if $H$ is a Hopf algebra, then $H^{\mathrm{copp}}$ is also a Hopf algebra with the same antipode. However, the bialgebras $H^{\mathrm{opp}}$, $H^{\mathrm{cop}}$ are Hopf algebras if and only if $S$ is invertible, and then the antipode is precisely $S^{-1}$ [31, Sec. 1.2.4]. We prove this for $H^{\mathrm{cop}}$. Assume $S^{-1}$ exists. It will be an algebra antihomomorphism. Hence
\[
\sum S^{-1} a_{(2)}\, a_{(1)} = S^{-1}\Big(\sum S a_{(1)}\, a_{(2)}\Big) = S^{-1}\eta(a)1 = \eta(a)1;
\]
similarly $m(\mathrm{id} \otimes S^{-1})\Delta^{\mathrm{cop}} = u\eta$. Reciprocally, if $H^{\mathrm{cop}}$ has an antipode $S'$, then
\[
SS' * S a = \sum SS'a_{(1)}\, S a_{(2)} = S\Big(\sum a_{(2)}\, S'a_{(1)}\Big) = S\,\eta(a)1 = u\eta(a).
\]
Therefore $SS' = \mathrm{id}$, so $S'$ is the inverse of $S$ under composition.

The duality nonsense of Sec. 1.4 is immediately lifted to the Hopf algebra category. The dual of $(H, m, u, \Delta, \eta, S)$ becomes $(H^\circ, \Delta^t, \eta^t, m^t, u^t, S^t)$; to Eqs. (1.16) we add the condition
\[
\langle S_H a, f \rangle = \langle a, S_K f \rangle, \tag{1.24}
\]
which is actually redundant. As for examples, the bialgebra $\mathbb{C}G$ is a Hopf algebra with coinverse $Sx = x^{-1}$; the bialgebra $R(G)$ is a Hopf algebra with coinverse $Sf(x) = f(x^{-1})$.
The antipode is a powerful inversion machine. If H is a Hopf algebra, both algebra homomorphisms on an algebra A and coalgebra morphisms on a coalgebra C
can be inverted in the convolution algebra. In fact, by going back to reinspect (1.22), we see already that in the case of algebra morphisms f S is a left inverse for f ; also, using (1.3), f ∗ f S = f (id ∗ S) = f uH ηH = uA ηH . In the case of coalgebra maps in (1.22), we see that Sf is a right inverse for f ; and similarly it is a left inverse. Recall that Homalg (H, A) denotes the convolution monoid of multiplicative morphisms on an algebra A with neutral element uA ηH . The “catch” is that f S does not belong to Homalg (H, A) in general; as remarked before, it will if A is commutative (a moment’s reflection reassures us that although S is antimultiplicative, f S indeed is multiplicative). In that case, Homalg (H, A) becomes a group (an abelian one if H is cocommutative). In particular, that is the case of the set Homalg (H, C) of multiplicative functions or characters, and of Homalg (H, H), when H is commutative. (One also gets a group when H is cocommutative, and considers the coalgebra morphisms to H from a coalgebra C.) In the first of the examples given in Sec. 1.2, the group of real characters of R(G) reconstructs G in its full topological glory: this is the Tannaka–Kre˘ın duality — see [24, Chap. 1] and [28]. Characters of connected graded Hopf algebras have special properties, exhaustively studied in [32]. A pillar of wisdom in Hopf algebra theory: connected bialgebras are always Hopf. There are at least two “grand strategies” to produce the antipode S : H → H in a connected bialgebra. One is to exploit its very definition as the convolution inverse of the identity in H, via a geometric series: S := id∗−1 = SG := (uη − (uη − id))∗−1 := uη + (uη − id) + (uη − id)∗2 + · · · Proposition 1.15. The geometric series expansion of Sa with a ∈ Hn has at most n + 1 terms. Proof. If a ∈ H0 , the claim holds since (uη − id)1 = 0. It also holds in H+ 1 because these elements are primitive. 
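Assume that the statement holds for elements in $H_+^{(n-1)}$ and let $a \in H_+^{(n)}$; then
\[
\begin{aligned}
(u\eta - \mathrm{id})^{*(n+1)}(a) &= (u\eta - \mathrm{id}) * (u\eta - \mathrm{id})^{*n}(a) = m[(u\eta - \mathrm{id}) \otimes (u\eta - \mathrm{id})^{*n}]\Delta a \\
&= m[(u\eta - \mathrm{id}) \otimes (u\eta - \mathrm{id})^{*n}](a \otimes 1 + 1 \otimes a + \Delta' a).
\end{aligned}
\]
The first two terms vanish because $(u\eta - \mathrm{id})1 = 0$. By the induction hypothesis, each of the summands of the third term is also zero.

In view of (1.12), on $H_+$ we can write for $k \ge 1$,
\[
(u\eta - \mathrm{id})^{*(k+1)} = (-1)^{k+1} m^k \Delta'^{\,k}. \tag{1.25}
\]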
There is then a fully explicit expression for the antipode for elements without degree 0 components (recall $S1 = 1$), in terms of the product and the reduced coproduct:
\[
S_G = -\mathrm{id} + \sum_{k=1}^{\infty} (-1)^{k+1} m^k \Delta'^{\,k}. \tag{1.26}
\]
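The geometric series can be run on a small test case. The sketch below is ours: it uses the binomial bialgebra $\mathbb{C}[Y]$ of Sec. 1.4 truncated at an arbitrary degree `N`, stores a polynomial as a dictionary from degrees to coefficients and a linear map as the list of images of $1, Y, \dots, Y^N$, and sums the convolution powers of $u\eta - \mathrm{id}$. For comparison it also solves the defining relation $m(S \otimes \mathrm{id})\Delta a = 0$ on $H_+$ degree by degree, which is one possible way to organize the same computation; all names are hypothetical.

```python
from math import comb

N = 6  # truncation degree, an arbitrary choice

def pmul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {d: c for d, c in out.items() if c != 0}

def padd(p, q, sign=1):
    out = dict(p)
    for d, c in q.items():
        out[d] = out.get(d, 0) + sign * c
    return {d: c for d, c in out.items() if c != 0}

def scale(p, s):
    return {d: s * c for d, c in p.items()}

def convolve(F, G):
    """(F * G)(Y^n) = sum_k C(n, k) F(Y^k) G(Y^(n-k)), from the binomial coproduct."""
    out = []
    for n in range(N + 1):
        acc = {}
        for k in range(n + 1):
            acc = padd(acc, scale(pmul(F[k], G[n - k]), comb(n, k)))
        out.append(acc)
    return out

unit_counit = [{0: 1}] + [{} for _ in range(N)]   # u eta
identity = [{n: 1} for n in range(N + 1)]
E = [padd(unit_counit[n], identity[n], sign=-1) for n in range(N + 1)]  # u eta - id

def antipode_geometric():
    """S = u eta + (u eta - id) + (u eta - id)^{*2} + ...; the series stops in each degree."""
    S, power = list(unit_counit), list(unit_counit)
    for _ in range(N):
        power = convolve(power, E)
        S = [padd(S[n], power[n]) for n in range(N + 1)]
    return S

def antipode_recursive():
    """Solve S(Y^n) = -Y^n - sum_{0<k<n} C(n, k) S(Y^k) Y^(n-k) degree by degree."""
    S = [{0: 1}]
    for n in range(1, N + 1):
        acc = {n: -1}
        for k in range(1, n):
            acc = padd(acc, scale(pmul(S[k], {n - k: 1}), comb(n, k)), sign=-1)
        S.append(acc)
    return S

S = antipode_geometric()
assert S == antipode_recursive()
assert all(S[n] == {n: (-1) ** n} for n in range(N + 1))  # S(Y^n) = (-Y)^n
```

Both routes give $S(Y^n) = (-1)^n Y^n$, as expected for a commutative Hopf algebra generated by the primitive element $Y$.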
The expansions (1.25) and (1.26) are remarked in [16]. The second canonical way to show that a connected bialgebra is a Hopf algebra amounts to taking advantage of the equation $m(S \otimes \mathrm{id})\Delta a = 0$ for $a \in H_+$. For $a \in H_+^{(n)}$ and $n \ge 1$, one ushers in the recursive formula:
\[
S_B(a) := -a - \sum S_B a'_{(1)}\, a'_{(2)}, \tag{1.27}
\]
using the notation in (1.10).

Proposition 1.16. If $H$ is a connected bialgebra, then $S_G a = S_B a$.

Proof. The statement holds, by a direct check, if $a \in H_+^{(1)}$. Assume that $S_G b = S_B b$ whenever $b \in H_+^{(n)}$ and let $a \in H_+^{(n+1)}$. Then
\[
\begin{aligned}
S_G a &= (u\eta - \mathrm{id})a + \sum_{i=1}^{n} (u\eta - \mathrm{id})^{*i} * (u\eta - \mathrm{id})a \\
&= -a + m \sum_{i=1}^{n} \big[(u\eta - \mathrm{id})^{*i} \otimes (u\eta - \mathrm{id})\big]\Delta a \\
&= -a + m \sum_{i=1}^{n} \big[(u\eta - \mathrm{id})^{*i} \otimes (u\eta - \mathrm{id})\big](a \otimes 1 + 1 \otimes a + \Delta' a) \\
&= -a + \sum_{i=1}^{n} \sum (u\eta - \mathrm{id})^{*i} a'_{(1)}\, (u\eta - \mathrm{id}) a'_{(2)} = -a - \sum_{i=1}^{n} \sum (u\eta - \mathrm{id})^{*i} a'_{(1)}\, a'_{(2)} \\
&= -a - \sum S_B a'_{(1)}\, a'_{(2)} = S_B a,
\end{aligned}
\]
where the penultimate equality uses the induction hypothesis.

Taking into account the alternative expression
\[
S_G a = (u\eta - \mathrm{id})a + \sum_{i=1}^{n} (u\eta - \mathrm{id}) * (u\eta - \mathrm{id})^{*i} a,
\]
it follows that the twin formula
\[
S'_B a := -a - \sum a'_{(1)}\, S'_B a'_{(2)}
\]
provides as well a formula for the antipode. The subindices $B$ in $S_B$, $S'_B$ remind us that this second strategy corresponds precisely to Bogoliubov’s formula for renormalization in quantum field theory.

The geometric series leading to the antipode can be generalized as follows. Consider, for $H$ a connected bialgebra and $A$ an arbitrary algebra, the set of
elements $f \in \mathrm{Hom}(H, A)$ fulfilling $f(1) = 1$. They form a convolution monoid, with neutral element $u_A\eta_H$, as $f * g(1) = 1$ if both $f$ and $g$ are of this type. Moreover we can repeat the inversion procedure:
\[
f^{*-1} := (u\eta - (u\eta - f))^{*-1} := u\eta + (u\eta - f) + (u\eta - f)^{*2} + \cdots
\]
Then $f^{*-1}(1) = 1$ and, for any $a \in H_+$, the terms
\[
(u\eta - f)^{*(k+1)} a = (-1)^{k+1} m^k (f \otimes \cdots \otimes f)\Delta'^{\,k} a
\]
vanish for $a \in H^{(n)}$ when $n \le k$. Therefore the series stops. The convolution monoid then becomes a group; of which, as we already know, the set $\mathrm{Hom}_{\mathrm{alg}}(H, A)$ of multiplicative morphisms is a subgroup when $A$ is commutative.

The foregoing indicates that, associated to any connected bialgebra, there is a natural filtering (we call it depth) where the order $\delta(a)$ of a generator $a \in H_+$ is $k > 0$ when $k$ is the smallest integer such that $a \in \ker(\Delta'^{\,k})$; we then say $a$ is $k$-primitive. Whenever $a \in H^{(n)}$, it holds $\delta(a) \le n$. On account of (1.12), $\ker(U_k) := \ker(\Delta'^{\,k}) = (\Delta^k)^{-1}(H_R^{k+1})$. In other words, for those bialgebras, depth is the coradical filtering. Antipodes are automatically filtered.

On $H_+$, one has $(u\eta - \mathrm{id})^{*(k+1)} a = 0$, in view of (1.25), if $\Delta'^{\,k} a = 0$; but of course the converse is not true.

Definition 1.17. For $H$ commutative, we say an indecomposable $a$ is quasiprimitive if
\[
Sa = -a; \qquad \text{this implies} \qquad (u\eta - \mathrm{id})^{*2} a = 0. \tag{1.28}
\]
Obviously, primitive elements are quasiprimitive. In Sec. 4.2, we give examples of elements that are quasiprimitive, but not primitive, and discuss their relevance. We will also see that a basis for $H$ can be found such that, for any element $b$ of it, $Sb = -b$ or $b$.

Following [33], we conclude this part with a relatively short proof that $\delta$ is indeed a filtering, in the framework of connected graded bialgebras of finite type, comprising our main examples. Let us denote in the remainder of this section $H_k = \mathbb{C}1 \oplus \ker \Delta'^{\,k} = \{ a \in H : \delta(a) \le k \}$ for $k \ge 0$. This is certainly a linear filtering.

Proposition 1.18. Let $H'_+ := \bigoplus_{l \ge 1} H'^{(l)} \subset H^*$. Then $H_k^\perp = H_+'^{\,k+1}$; thus $H_k = (H_+'^{\,k+1})^\perp$.

Proof. (Derivations are typical elements of $1^\perp$.) The assertion is true and obvious for $k = 0$. Consider $(\mathrm{id} - u\eta)^t$, the projection on $H'_+$ with kernel $1_{H'}$. Let $\kappa_1, \dots, \kappa_{k+1} \in H'_+$ and $a \in H_k$. We have
\[
\begin{aligned}
\langle \kappa_1 \cdots \kappa_{k+1}, a \rangle &= \langle \kappa_1 \otimes \cdots \otimes \kappa_{k+1}, \Delta^k a \rangle
 = \big\langle \big((\mathrm{id} - u\eta)^t\big)^{\otimes(k+1)} (\kappa_1 \otimes \cdots \otimes \kappa_{k+1}), \Delta^k a \big\rangle \\
&= \langle \kappa_1 \otimes \cdots \otimes \kappa_{k+1}, U_k a \rangle = 0.
\end{aligned}
\]
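Therefore, $H_+'^{\,k+1} \subseteq H_k^\perp$. Now, let $a \in (H_+'^{\,k+1})^\perp$. Then again
\[
\langle \kappa_1 \otimes \cdots \otimes \kappa_{k+1}, U_k a \rangle = \langle \kappa_1 \cdots \kappa_{k+1}, a \rangle = 0.
\]
Therefore $U_k a \in H_+^{\otimes(k+1)} \cap \big(H_+'^{\,\otimes(k+1)}\big)^\perp = (0)$. Consequently $(H_+'^{\,k+1})^\perp \subseteq H_k$, which implies $H_k^\perp \subseteq H_+'^{\,k+1}$.

Given any augmented graded algebra $A$, connected in the sense that $A^{(0)} \simeq \mathbb{C}$, with $A_+ = \bigoplus_{i \ge 1} A^{(i)}$, it is not too hard to see that
\[
\sum_{i+j>k} A_+^i \otimes A_+^j = \bigcap_{l+m=k} \big(A_+^{l+1} \otimes A + A \otimes A_+^{m+1}\big).
\]
As a corollary:

Proposition 1.19. The filtering by the $H_k$ is a coalgebra filtering.

Proof. We have
\[
\begin{aligned}
\sum_{l+m=k} H_l \otimes H_m &= \sum_{l+m=k} \big(H_+'^{\,l+1}\big)^\perp \otimes \big(H_+'^{\,m+1}\big)^\perp
 = \sum_{l+m=k} \big(H_+'^{\,l+1} \otimes H' + H' \otimes H_+'^{\,m+1}\big)^\perp \\
&= \Big(\bigcap_{l+m=k} \big(H_+'^{\,l+1} \otimes H' + H' \otimes H_+'^{\,m+1}\big)\Big)^\perp
 = \Big(\sum_{i+j>k} H_+'^{\,i} \otimes H_+'^{\,j}\Big)^\perp.
\end{aligned}
\]
Now, let $a \in H_k$ and $\kappa_1 \in H_+'^{\,i}$, $\kappa_2 \in H_+'^{\,j}$, $i + j > k$. Then $\kappa_1\kappa_2 \in H_+'^{\,k+1} = H_k^\perp$. Thus
\[
0 = \langle a, \kappa_1\kappa_2 \rangle = \langle \Delta a, \kappa_1 \otimes \kappa_2 \rangle.
\]
This means $\Delta a \in \sum_{l+m=k} H_l \otimes H_m$, and so $H_k$ is a coalgebra filtering as claimed.

Proposition 1.20. The filtering by the $H_k$ is an algebra filtering.

Proof. If $a \in H_k$, then $\Delta'^{\,k} a = 0$, and the counit property entails $\Delta'^{\,k-1} a \in P(H)^{\otimes k}$. Let now $a \in H_l$, $b \in H_m$ with
\[
\Delta'^{\,l-1} a = \sum a_{(1)} \otimes a_{(2)} \otimes \cdots \otimes a_{(l)}, \qquad \Delta'^{\,m-1} b = \sum b_{(1)} \otimes \cdots \otimes b_{(m)}.
\]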
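By (1.12) once again
\[
\begin{aligned}
\Delta'^{\,l+m-1}(ab) &= U_{l+m-1}(ab) = (\mathrm{id} - u\eta)^{\otimes l+m}\big(\Delta^{l+m-1} a\, \Delta^{l+m-1} b\big) \\
&= \sum_{\sigma \in S_{l+m,l}} a_{(\sigma(1))} \otimes \cdots \otimes a_{(\sigma(l))} \otimes b_{(\sigma(l+1))} \otimes \cdots \otimes b_{(\sigma(l+m))} \\
&=: \sum_{\sigma \in S_{l+m,l}} \sigma \cdot \big(a_{(1)} \otimes \cdots \otimes a_{(l)} \otimes b_{(1)} \otimes \cdots \otimes b_{(m)}\big),
\end{aligned} \tag{1.29}
\]
where $S_{n,p}$ denotes the set of $(p, n-p)$-shuffles; a $(p, q)$-shuffle is an element of the group of permutations $S_{p+q}$ of $\{1, 2, \dots, p+q\}$ in which $\sigma(1) < \sigma(2) < \cdots < \sigma(p)$ and $\sigma(p+1) < \cdots < \sigma(p+q)$. Equation (1.29) implies that $\Delta'^{\,l+m-1}(ab) \in P(H)^{\otimes l+m}$, and hence $\Delta'^{\,l+m}(ab) = 0$.

We retain as well the following piece of information:
\[
\Delta'^{\,n-1}(p_1 \cdots p_n) = \sum_{\sigma \in S_n} p_{\sigma(1)} \otimes \cdots \otimes p_{\sigma(n)}
\]
for primitive elements $p_1, \dots, p_n$. The proof is by induction. Obviously we have in particular $\Delta'(p_1 p_2) = p_1 \otimes p_2 + p_2 \otimes p_1$. Assuming
\[
\Delta'^{\,n-2}(p_1 \cdots p_{n-1}) = \sum_{\sigma \in S_{n-1}} p_{\sigma(1)} \otimes \cdots \otimes p_{\sigma(n-1)},
\]
Eq. (1.29) tells us that
\[
\Delta'^{\,n-1}(p_1 \cdots p_n) = \sum_{\tau \in S_{n,1}} \sum_{\sigma \in S_{n-1}} \tau \cdot \big(p_{\sigma(1)} \otimes \cdots \otimes p_{\sigma(n-1)} \otimes p_n\big) = \sum_{\sigma \in S_n} p_{\sigma(1)} \otimes \cdots \otimes p_{\sigma(n)}.
\]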
1.6. Symmetric algebras

In our second example in Sec. 1.2, we know $SX = -X$ for $X \in \mathfrak{g}$, since $X$ is primitive, and $S(XY) = YX$; but the concrete expression in terms of a basis given a priori can be quite involved. Consider, however, the universal enveloping algebra corresponding to the trivial Lie algebra structure on $V$. This is clearly a commutative and cocommutative Hopf algebra, nothing else than the familiar symmetric, free commutative algebra or, in physics, boson algebra $B(V)$ over $V$ of quantum field theory. Given $V$, a complex vector space, $B(V)$ is defined as $\bigoplus_{n=0}^{\infty} V^{\vee n}$, where $V^{\vee n}$ is the complex vector space algebraically generated by the symmetric products
\[
v_1 \vee v_2 \vee \cdots \vee v_n := \frac{1}{n!} \sum_{\sigma \in S_n} v_{\sigma(1)} \otimes v_{\sigma(2)} \otimes \cdots \otimes v_{\sigma(n)},
\]
with $V^{\vee 0} = \mathbb{C}$ by convention. On $B(V)$, a coproduct and counit are defined respectively by $\Delta v := v \otimes 1 + 1 \otimes v$ and $\eta(v) = 0$, for $v \in V$, and then extended by the homomorphism property. In general,
\[
\Delta(a_1 \vee a_2) = \sum a_{1(1)} \vee a_{2(1)} \otimes a_{1(2)} \vee a_{2(2)}
\]
for $a_1, a_2 \in B(V)$. Another formula is
\[
\Delta(v_1 \vee v_2 \vee \cdots \vee v_n) = \sum_{I} U(I) \otimes U(I^c) \tag{1.30}
\]
with sum over all the subsets $I \subseteq \{v_1, v_2, \dots, v_n\}$, $I^c = \{v_1, v_2, \dots, v_n\} \setminus I$, and $U(I)$ denotes the $\vee$ product of the elements in $I$. Thus, if $u = v_1 \vee v_2 \vee \cdots \vee v_n$, then
\[
\Delta' u = \sum_{p=1}^{n-1} \sum_{\sigma \in S_{n,p}} v_{\sigma(1)} \vee \cdots \vee v_{\sigma(p)} \otimes v_{\sigma(p+1)} \vee \cdots \vee v_{\sigma(n)}.
\]
Here we are practically repeating the calculations at the end of the previous section. In the particularly simple case when $V$ is one-dimensional, the Hopf algebra $B(V) \simeq U(\mathbb{C})$ is just the binomial bialgebra. Finally $Sa = -a$ for elements of $B(V)$ of odd degree and $Sa = a$ for even elements. The reader can amuse himself checking how (1.26) works here.

It should be clear that an element of $B(V)$ is primitive iff it belongs to $V$. For a direct proof, let $\{e_i\}$ be a basis for $V$; any $a \in B(V)$ can be represented as
\[
a = \alpha 1 + \sum_{k \ge 1} \sum_{i_1 \le \cdots \le i_k} \alpha_{i_1, \dots, i_k}\, e_{i_1} \vee \cdots \vee e_{i_k}
\]
for some complex numbers $\alpha_{i_1, \dots, i_k}$. Now, if $a$ is primitive, then $\alpha = \eta(a) = 0$ and
\[
\Delta a = a \otimes 1 + 1 \otimes a = \sum_{k} \sum \alpha_{i_1, \dots, i_k} \big(e_{i_1} \vee \cdots \vee e_{i_k} \otimes 1 + 1 \otimes e_{i_1} \vee \cdots \vee e_{i_k}\big),
\]
but also
\[
\begin{aligned}
\Delta a &= \sum_{k} \sum \alpha_{i_1, \dots, i_k} (e_{i_1} \otimes 1 + 1 \otimes e_{i_1}) \cdots (e_{i_k} \otimes 1 + 1 \otimes e_{i_k}) \\
&= \sum_{k} \sum \alpha_{i_1, \dots, i_k} \Big( e_{i_1} \vee \cdots \vee e_{i_k} \otimes 1 + 1 \otimes e_{i_1} \vee \cdots \vee e_{i_k} + \sum_{J} e_J \otimes e_{J^c} \Big),
\end{aligned}
\]
where the last sum runs over all nonempty subsets $J = \{i_1, \dots, i_l\}$ of $[k] := \{1, \dots, k\}$ with at most $k - 1$ elements, $J^c$ denotes the complement of $J$ in $[k]$ and $e_J := e_{i_1} \vee \cdots \vee e_{i_l}$. A comparison of the two expressions gives
\[
\sum_{k \ge 2} \sum_{i_1 \le \cdots \le i_k} \alpha_{i_1, \dots, i_k} \sum_{\emptyset \ne J \subsetneq [k]} e_J \otimes e_{J^c} = 0.
\]
This forces $\alpha_{i_1, \dots, i_k} = 0$ for $k \ge 2$, so $a \in V$.

Definition 1.21. A Hopf algebra $H$ is said to be primitively generated when the smallest subalgebra of $H$ containing all its primitive elements is $H$ itself. Cocommutativity of $H$ is clearly a necessary condition for that. It is plain that $B(V)$ is primitively generated.
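Formula (1.30) and the shuffle description of the reduced coproduct on $B(V)$ are easy to enumerate. In the sketch below (ours; the function names are hypothetical) a monomial $v_1 \vee \cdots \vee v_n$ is represented by the tuple of its factor labels, and a term $U(I) \otimes U(I^c)$ by a pair of tuples, with $I$ ranging over subsets of positions.

```python
from itertools import combinations
from math import comb

def coproduct_terms(factors):
    """All pairs (U(I), U(I^c)) in Delta(v_1 v ... v v_n), I a subset of positions."""
    n = len(factors)
    terms = []
    for p in range(n + 1):
        for chosen in combinations(range(n), p):
            left = tuple(factors[i] for i in chosen)
            right = tuple(factors[i] for i in range(n) if i not in chosen)
            terms.append((left, right))
    return terms

def reduced_terms(factors):
    """Terms of the reduced coproduct: drop u (x) 1 and 1 (x) u."""
    return [(l, r) for l, r in coproduct_terms(factors) if l and r]

u = ("v1", "v2", "v3", "v4")
assert len(coproduct_terms(u)) == 2 ** 4
assert len(reduced_terms(u)) == 2 ** 4 - 2
for p in range(1, 4):
    # terms with p factors on the left correspond to the (p, n - p)-shuffles
    assert sum(1 for l, _ in reduced_terms(u) if len(l) == p) == comb(4, p)
assert reduced_terms(("v",)) == []  # a single vector is primitive
```

The count $\binom{n}{p}$ in the loop is the number of $(p, n-p)$-shuffles, matching the inner sum of the formula for $\Delta' u$.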
Symmetric algebras have the following universal property: any morphism of (graded, if you wish) vector spaces ψ : V → H, where V (0) = (0) and H is a unital graded connected commutative algebra, extends uniquely to a unital graded algebra homomorphism Bψ : B(V ) → H; one says B(V ) is free over V . Note the isomorphism B(V ⊕ V˜ ) B(V ) ⊗ B(V˜ ), implemented by V ⊕ V˜ (v, v˜) → v ⊗ 1 + 1 ⊗ v˜, extended by linear combinations of products. In this perspective, the comultiplication ∆ on B(V ) is the extension Bd induced by the diagonal map d : V → V ⊕ V . Proposition 1.22. Let H be a graded connected commutative Hopf algebra and denote by ψH the inclusion P (H) → H. The universal property gives a graded algebra map BψH : B(P (H)) → H. This is a graded Hopf algebra morphism. Proof. The coproduct ∆ : H → H ⊗ H gives a linear map P ∆ : P (H) → P (H ⊗ H) such that ∆ψH = ψH⊗H P ∆. The universal property of symmetric algebras then gives us maps BψH , BψH⊗H and BP ∆ such that ∆BψH = BψH⊗H BP ∆. By (1.13), B P (H ⊗ H) ∼ = B P (H) ⊕ P (H) ∼ B P (H) ⊗ B P (H) , BψH⊗H BψH ⊗ BψH . = Therefore BP ∆ is identified to the coproduct of B(P (H)) and ∆BψH = (BψH ⊗ BψH )∆B(P (H)) . In a similar fashion it is seen that BψH respects counity and coinverse. If V ⊆ P (H), then the subalgebra C[V ] generated by V is a primitively generated Hopf subalgebra of H. The inclusion ι : V → H induces a morphism of graded algebras, indeed of Hopf algebras, Bι : B(V ) → H whose range is C[V ]. In particular, C[P (H)] is the largest primitively generated subalgebra of H and BψH : B(P (H)) → H is a morphism of Hopf algebras onto C[P (H)]. Therefore, H is primitively generated only if the underlying algebra is generated by P (H). All these statements follow from the previous proof and the simple observation that v ∈ V is primitive in both C[V ] and B(V ). Proposition 1.23. The morphism BψH : B(P (H)) → H is injective. In particular, if H is primitively generated, then H = B(P (H)). Proof. 
The vector space P(H) can be regarded as the direct limit of all its finite-dimensional subspaces V, hence B(P(H)) is the direct limit of all B(V) (tensor products commute with direct limits) and the map Bψ_H is injective if and only if its restriction to each of the algebras B(V) is injective. Thus it is enough to prove the proposition for V a finite-dimensional subspace of P(H). We do that by induction on m = dim V. For m = 0, there is nothing to prove. Assume, then, that the claim holds for all subspaces W of P(H) with dim W ≤ m−1
and let V be a subspace of P(H) with dim V = m. Let W be an (m−1)-dimensional subspace of V, then Bψ : B(W) → C[W] is an isomorphism of Hopf algebras. Take Y ∈ V \ C[W] homogeneous of minimal degree, so V = W ⊕ CY. Then B(V) ≅ B(W) ⊗ B(CY) ≅ C[W] ⊗ B(CY). Now, C[V] ≅ C[W ⊕ CY] ≅ C[W] ⊗ C[Y]. The remarks prior to the statement imply that B(P(C[Y])) ≅ C[P(C[Y])], and since Y ∈ V ⊂ P(H), clearly P(C[Y]) = CY. So B(CY) ≅ C[Y]. It follows that B(V) ≅ C[W] ⊗ B(CY) ≅ C[W] ⊗ C[Y] ≅ C[V], which completes the induction.

A similar argument allows us to take up some unfinished business: the converse of Proposition 1.10.

Theorem 1.24. The relation
\[ P(H) \cap H_+^2 = (0) \tag{1.31} \]
or equivalently q_H : P(H) → Q(H) is one-to-one, holds if H is commutative.

Proof. Suppose H commutative has a unique generator. Then a moment's reflection shows that H has to be the binomial algebra, and then P(H) ≅ Q(H). Suppose now that the proposition is proved for algebras with less than or equal to n generators. Let the elements a_1, …, a_{n+1} be such that their images by the canonical projection H → Q(H) form a basis of Q(H). Leave out the element of highest degree among them, and consider B, the Hopf subalgebra of H generated by the other n elements. We form C ⊗_B H = H/B_+H. This is seen to be a Hopf algebra with one generator. Moreover, H ≅ B ⊗ C ⊗_B H. Then (1.13) implies P(H) = P(B) ⊕ P(C ⊗_B H). By the induction hypothesis, q_H = q_B ⊕ q_{C ⊗_B H} is injective. The proposition is then proved for finitely generated Hopf algebras. Hopf algebras of finite type are clearly direct limits of finitely generated Hopf algebras; and direct limits preserve the functors P, Q and injective maps; so (1.31) holds true for Hopf algebras of finite type as well. Finally, by the result of [23] already invoked in Sec. 1.3, the proposition holds for all commutative connected graded Hopf algebras without exception.

Yet another commutative and cocommutative Hopf algebra is the polynomial algebra C[Y_1, Y_2, Y_3, …] with coproduct given by
\[ \Delta Y_n = \sum_{j=0}^{n} Y_j \otimes Y_{n-j} \]
to be studied later, using symmetric algebra theory. We denote this algebra by H and call it the ladder Hopf algebra. It is related to nested Feynman graphs and to the Hopf algebra of rooted trees, considered in Sec. 3.3.

To finish, we come back to the universal property of B(V) as an algebra. For that, the coalgebra structure previously considered does not come into play, and one
could ask whether other coproducts are available, that can also give a Hopf algebra structure on B(V). The answer is yes, but the one exhibited here is the only one that makes B(V) a graded Hopf algebra. More to the point: dually, given graded cocommutative coalgebras and linear maps into vector spaces ϕ : H → V, there is a universal "cofree cocommutative" coalgebra, say Q_cocom(V), "cogenerated" by V, together with a unique coalgebra map Q_cocom ϕ : H → Q_cocom(V) restricting to ϕ by the projection Q_cocom(V) → V. Now, Q_cocom(V) is nothing but the vector space underlying B(V) with the coalgebra structure already given here; and there is a unique algebra structure on Q_cocom(V) making it a graded Hopf algebra.

In many contexts it is important to waive the (co)commutativity requisite on B and on Q_cocom; this respectively leads to the tensor or free graded algebra (already touched upon in Example 1.5) and to the cotensor or cofree graded coalgebra, for which the most natural Hopf algebra structure is Ree [34] and Chen's [35] shuffle algebra. In some detail: already in Sec. 1.2 we have touched upon the tensor algebra T(V), which is made into a (graded, connected, cocommutative) Hopf algebra by the algebra morphisms
\[ \Delta 1 = 1 \otimes 1, \qquad \Delta X := X \otimes 1 + 1 \otimes X, \qquad \eta(X) := 0, \qquad \eta(1) = 1, \]
for X ∈ V. It is useful to think of the basis of V as an alphabet, whose elements are letters, and to write the tensor product simply as a concatenation. Then we have, as before,
\[ \Delta(X_1 X_2 \cdots X_n) = \sum_{p=0}^{n} \sum_{\sigma \in S_{n,p}} X_{\sigma(1)} \cdots X_{\sigma(p)} \otimes X_{\sigma(p+1)} \cdots X_{\sigma(n)}, \tag{1.32} \]
with S_{n,p} denoting the (p, n − p)-shuffles. It should be clear that primitive elements of T(V) constitute a free Lie algebra [36]. For instance, if there are two letters {X, Y} in V, a basis is made of X, Y, [X, Y], [X, [X, Y]], [[X, Y], Y], [X, [X, [X, Y]]], [X, [[X, Y], Y]] and so on.

The graded dual of T(V) (which at least in the finite type case is isomorphic to T(V) as a graded vector space) is the shuffle Hopf algebra Sh(V), a most interesting object that appears in many contexts. We find its (necessarily associative and commutative) product, denoted by $\sqcup\!\sqcup$, by dualizing (1.32): for the words u = d_1 d_2 ⋯ d_p and v = d_{p+1} d_{p+2} ⋯ d_n with each d_i in V, their product is
\[ u \sqcup\!\sqcup v = \sum_{\sigma \in S_{n,p}} d_{\sigma(1)} \cdots d_{\sigma(n)}. \]
For instance,
\[ 1 \sqcup\!\sqcup d = d \sqcup\!\sqcup 1 = d, \qquad d_1 \sqcup\!\sqcup d_2 = d_2 \sqcup\!\sqcup d_1 = d_1 d_2 + d_2 d_1, \]
\[ d_1 d_2 \sqcup\!\sqcup d_3 = d_3 d_1 d_2 + d_1 d_3 d_2 + d_1 d_2 d_3. \]
There is a recursive definition given by
\[ d_1 d_2 \cdots d_p \sqcup\!\sqcup d_{p+1} d_{p+2} \cdots d_n = d_1 (d_2 \cdots d_p \sqcup\!\sqcup d_{p+1} d_{p+2} \cdots d_n) + d_{p+1} (d_1 d_2 \cdots d_p \sqcup\!\sqcup d_{p+2} \cdots d_n). \]
We have indeed
\[ \langle u \sqcup\!\sqcup v,\, X_1 X_2 \cdots X_n \rangle = \langle u \otimes v,\, \Delta(X_1 X_2 \cdots X_n) \rangle. \]
One can easily show, using the universal property, that Sh(V) is the unique commutative graded Hopf algebra structure on the cofree graded coalgebra Q(V) [37]. The coproduct on the latter is given by
\[ \langle \Delta u,\, X_1 X_2 \cdots X_l \otimes Y_1 Y_2 \cdots Y_k \rangle = \langle u,\, X_1 X_2 \cdots X_l Y_1 Y_2 \cdots Y_k \rangle, \]
that is, deconcatenation:
\[ \Delta u = \sum_{u = vw} v \otimes w. \]
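The recursive definition of the shuffle product and the deconcatenation coproduct lend themselves to a quick machine check. The following is only an illustrative sketch, not part of the original argument: words are modelled as Python strings over an arbitrary alphabet, linear combinations as `Counter` objects, and the names `shuffle` and `deconcatenate` are our own.

```python
from collections import Counter

def shuffle(u, v):
    """Shuffle product of two words, as a Counter {word: multiplicity}.

    Implements the recursion
    d_1 w (shuffle) d' w' = d_1 (w shuffle d' w') + d' (d_1 w shuffle w').
    """
    if not u:
        return Counter({v: 1})
    if not v:
        return Counter({u: 1})
    out = Counter()
    for w, c in shuffle(u[1:], v).items():   # first letter taken from u
        out[u[0] + w] += c
    for w, c in shuffle(u, v[1:]).items():   # first letter taken from v
        out[v[0] + w] += c
    return out

def deconcatenate(u):
    """All splittings u = vw, i.e. the terms v (x) w of the coproduct."""
    return [(u[:i], u[i:]) for i in range(len(u) + 1)]

print(shuffle("ab", "c"))     # the three words abc, acb, cab
print(deconcatenate("ab"))
```

With the letters a, b, c standing for d_1, d_2, d_3, the first call reproduces the example d_1 d_2 ⊔⊔ d_3 given in the text; the number of terms of a shuffle of words of lengths p and n − p, counted with multiplicity, is the binomial coefficient of n over p.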
2. The Faà di Bruno Bialgebras

2.1. Partitions, Bell polynomials and the Faà di Bruno algebras

In a "moral" sense, epitomized by the first example of Sec. 1.2, the discussion around Eq. (1.24) and the consideration of G(H°) in Sec. 1.4, commutative skewgroups are equivalent to groups. Now, we would like to deal with relatively complicated groups, like diffeomorphism groups. Variants of Hopf algebra theory generalizing categories of (noncompact in general) topological groups do exist [38]. It is still unclear how to handle diffeomorphism groups globally, though: the interplay between topology and algebra becomes too delicate [39]. We settle for a "perturbative version". Locally, one can think of orientation preserving diffeomorphisms of R leaving a fixed point as given by formal power series like
\[ f(t) = \sum_{n=0}^{\infty} f_n \frac{t^n}{n!} \tag{2.1} \]
with f_0 = 0, f_1 > 0. (Orientation preserving diffeomorphisms of the circle are just periodic ones of R, and locally there is no difference.) Among the functions on the group G of diffeomorphisms, the coordinate functions on jets
\[ a_n(f) := f_n = f^{(n)}(0), \qquad n \ge 1, \]
single out themselves. The product of two diffeomorphisms is expressed by series composition; to which just like in Example 1.4, we expect to correspond a coproduct for the a_n elements. As the a_n are representative, it is unlikely that this reasoning will lead us astray.

Let us then work out f ∘ g, for f as above and g of the same form with coordinates g_n = a_n(g). This old problem is solved with the help of Bell polynomials. The (partial, exponential) Bell polynomials B_{n,k}(x_1, …, x_{n+1−k}) for n ≥ 1, 1 ≤ k ≤ n are defined by the series expansion
\[ \exp\Bigl( u \sum_{m \ge 1} x_m \frac{t^m}{m!} \Bigr) = 1 + \sum_{n \ge 1} \frac{t^n}{n!} \Bigl[ \sum_{k=1}^{n} u^k B_{n,k}(x_1, \dots, x_{n+1-k}) \Bigr]. \tag{2.2} \]
The first Bell polynomials are readily found: B_{n,1} = x_n, B_{n,n} = x_1^n, B_{2,1} = x_2, B_{3,2} = 3x_1 x_2, B_{4,2} = 3x_2^2 + 4x_1 x_3, B_{4,3} = 6x_1^2 x_2, B_{5,2} = 10x_2 x_3 + 5x_1 x_4, B_{5,3} = 10x_1^2 x_3 + 15x_1 x_2^2, B_{5,4} = 10x_1^3 x_2, … Each Bell polynomial B_{n,k} is homogeneous of degree k. We claim the following: if h(t) = f ∘ g(t), then
\[ h_n = \sum_{k=1}^{n} f_k B_{n,k}(g_1, \dots, g_{n+1-k}). \tag{2.3} \]
One can actually allow here for f_0 ≠ 0. Then the same result holds, together with h_0 = f_0. The proof is quite easy. It is clear that the h_n are linear in the f_n:
\[ h_n = \sum_{k=1}^{n} f_k A_{n,k}(g). \]
In order to determine the A_{n,k}, we choose the series f(t) = e^{ut}. This entails f_k = u^k and
\[ h = f \circ g = e^{ug} = \exp\Bigl( u \sum_{m \ge 1} g_m \frac{t^m}{m!} \Bigr) = 1 + \sum_{n \ge 1} \frac{t^n}{n!} \Bigl[ \sum_{k=1}^{n} u^k B_{n,k}(g_1, \dots, g_{n+1-k}) \Bigr], \]
from which at once there follows
\[ A_{n,k}(g) = B_{n,k}(g_1, \dots, g_{n+1-k}). \tag{2.4} \]
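The composition formula (2.3) is easy to test numerically. The sketch below is an illustration of ours, not taken from the source: it evaluates the partial Bell polynomials through the standard recurrence B_{n,k} = Σ_i C(n−1, i−1) x_i B_{n−i,k−1} (a well-known identity, assumed here rather than derived in the text) and compares (2.3) with a brute-force composition of truncated series. The function names and the sample coefficients are arbitrary choices.

```python
from fractions import Fraction
from math import comb, factorial

def bell_partial(n, k, x):
    """Numerical value of B_{n,k}(x_1, x_2, ...), with x[i-1] holding x_i."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # Recurrence obtained by singling out the block that contains the element n.
    return sum(comb(n - 1, i - 1) * x[i - 1] * bell_partial(n - i, k - 1, x)
               for i in range(1, n - k + 2))

def compose_bell(f, g, N):
    """Exponential coefficients h_0..h_N of f o g from formula (2.3); needs g[0] == 0."""
    x = g[1:]
    return [f[0]] + [sum(f[k] * bell_partial(n, k, x) for k in range(1, n + 1))
                     for n in range(1, N + 1)]

def compose_direct(f, g, N):
    """Same coefficients by multiplying out sum_k f_k g(t)^k / k!, truncated at t^N."""
    go = [Fraction(g[n], factorial(n)) for n in range(N + 1)]  # ordinary coefficients of g
    power = [Fraction(1)] + [Fraction(0)] * N                  # g(t)^0
    h = [Fraction(0)] * (N + 1)
    for k in range(N + 1):
        for n in range(N + 1):
            h[n] += Fraction(f[k], factorial(k)) * power[n]
        power = [sum(power[i] * go[n - i] for i in range(n + 1)) for n in range(N + 1)]
    return [h[n] * factorial(n) for n in range(N + 1)]

f = [0, 2, 3, 5, 7, 11]   # f_0, ..., f_5 (arbitrary integers)
g = [0, 1, 4, 1, 5, 9]    # g_0 = 0, g_1, ..., g_5
print(compose_bell(f, g, 5) == compose_direct(f, g, 5))
```

Both routes agree coefficient by coefficient; exact rational arithmetic is used so that the comparison is not blurred by rounding.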
So h_1 = f_1 g_1, h_2 = f_1 g_2 + f_2 g_1^2, h_3 = f_1 g_3 + 3f_2 g_1 g_2 + f_3 g_1^3 and so on. Francesco Faà di Bruno (beatified in 1988) gave a formula equivalent to (2.4) about a hundred and fifty years ago [40]. There are older instances of it: see our comment at the end of Sec. 2.3. Lest the reader think we are dealing with purely formal results, we remark that, if g is real analytic on an open interval I_1 of the real line and takes values on another open interval I_2, on which f is analytic as well, then f ∘ g given by (2.3) is analytic on I_1 too [41].

To obtain explicit formulae for B_{n,k}, one can proceed directly from the definition. We shall only need the multinomial identity
\[ (\beta_1 + \beta_2 + \cdots + \beta_r)^k = \sum_{c_1 + c_2 + \cdots + c_r = k} \frac{k!}{c_1!\, c_2! \cdots c_r!}\, \beta_1^{c_1} \beta_2^{c_2} \cdots \beta_r^{c_r}, \]
that generalizes directly the binomial identity. To see that, note that if c_1 + c_2 + ⋯ + c_r = k, then the multinomial coefficient $\binom{k}{c_1, c_2, \dots, c_r}$ of β_1^{c_1} β_2^{c_2} ⋯ β_r^{c_r} is the number of ordered r-tuples of mutually disjoint subsets (S_1, S_2, …, S_r) with |S_i| = c_i whose union is {1, 2, …, k}. Then, since S_1 can be filled in $\binom{k}{c_1}$ different ways, and once S_1 is filled, S_2 can be filled in $\binom{k - c_1}{c_2}$ ways, and so on:
\[ \binom{k}{c_1, \dots, c_r} = \binom{k}{c_1} \binom{k - c_1}{c_2} \binom{k - c_1 - c_2}{c_3} \cdots \binom{c_r}{c_r} = \frac{k!}{c_1!\, c_2! \cdots c_r!}. \]
Now, we can expand
\[ \sum_{k \ge 0} \frac{u^k}{k!} \Bigl( \sum_{m \ge 1} x_m \frac{t^m}{m!} \Bigr)^k = \sum_{k \ge 0} \frac{u^k}{k!} \sum_{c_1 + c_2 + \cdots = k} \frac{k!}{c_1!\, c_2! \cdots} (x_1 t)^{c_1} (x_2 t^2/2!)^{c_2} \cdots = \sum_{c_1, c_2, \ldots \ge 0} \frac{u^{c_1 + c_2 + c_3 + \cdots}\; t^{c_1 + 2c_2 + 3c_3 + \cdots}}{c_1!\, c_2! \cdots (1!)^{c_1} (2!)^{c_2} \cdots}\, x_1^{c_1} x_2^{c_2} \cdots. \]
Taking the coefficients of u^k t^n/n! in view of (2.2), it follows that
\[ B_{n,k}(x_1, \dots, x_{n+1-k}) = \sum \frac{n!}{c_1!\, c_2!\, c_3! \cdots (1!)^{c_1} (2!)^{c_2} (3!)^{c_3} \cdots}\, x_1^{c_1} x_2^{c_2} x_3^{c_3} \cdots, \tag{2.5} \]
where the sum is over the sets of nonnegative integers c_1, c_2, …, c_n such that c_1 + c_2 + c_3 + ⋯ + c_n = k and c_1 + 2c_2 + 3c_3 + ⋯ + nc_n = n. It is convenient to introduce the notations
\[ \binom{n}{\lambda; k} := \frac{n!}{\lambda_1!\, \lambda_2! \cdots \lambda_n!\, (1!)^{\lambda_1} (2!)^{\lambda_2} \cdots (n!)^{\lambda_n}}, \]
where λ is the sequence (1, 1, …; 2, 2, …; …), better written as (1^{λ_1}, 2^{λ_2}, 3^{λ_3}, …), of λ_1 1's, λ_2 2's and so on; and x^λ := x_1^{λ_1} x_2^{λ_2} x_3^{λ_3} ⋯; obviously some of the λ_i may vanish, and certainly λ_n is at most 1.

The coefficients $\binom{n}{\lambda; k}$ also have a combinatorial meaning. We have already employed the concept of partition of a set: if S is a finite set, with |S| = n, a partition {A_1, …, A_k} is a collection of k ≤ n nonempty, pairwise disjoint subsets of S, called blocks, whose union is S. It is often convenient to think of S as of [n] := {1, 2, …, n}. Suppose that in a partition of [n] into k blocks there are λ_1 singletons, λ_2 two-element subsets and so on, thereby precisely λ_1 + λ_2 + λ_3 + ⋯ = k and λ_1 + 2λ_2 + 3λ_3 + ⋯ = n; sometimes k is called the length of the partition and n its weight. We just saw that the number of ordered (λ_1, …, λ_r)-tuples of subsets partitioning [n] is
\[ \frac{n!}{(1!)^{\lambda_1} (2!)^{\lambda_2} \cdots (r!)^{\lambda_r}}. \]
Making the necessary permutations, we conclude that [n] possesses $\binom{n}{\lambda; k}$ partitions of class λ. Also
\[ B_{n,k}(1, \dots, 1) = \sum_{\lambda} \binom{n}{\lambda; k} = |\Pi_{n,k}|, \]
with Π_{n,k} standing for the set of all partitions of [n] into k subsets. The |Π_{n,k}| are the so-called Stirling numbers of the second kind.

Later, it will be convenient to consider partitions of an integer n, a concept that should not be confused with partitions of the set [n]. A partition of n is a sequence of positive integers (f_1, f_2, …, f_k) such that f_1 ≥ f_2 ≥ f_3 ≥ ⋯ and Σ_{i=1}^{k} f_i = n. The number of partitions of n is denoted p(n). Now, consider a partition π of [n] of type λ(π) = (1^{λ_1}, 2^{λ_2}, 3^{λ_3}, …), and let m be the largest number
for which λ_m does not vanish. We put f_1 = f_2 = ⋯ = f_{λ_m} = m, then we take for f_{λ_m + 1} the largest number such that λ_{f_{λ_m + 1}} among the remaining λ's does not vanish, and so on. The procedure can be inverted, and it is clear that partitions of n can be indexed by the sequence (f_1, f_2, …, f_k) of their definition, or by λ. The number of partitions π of [n] for which λ(π) represents a partition of n is precisely $\binom{n}{\lambda; k}$. To take a simple example, let n = 4. There are the following partitions of 4: (4) ≡ (4^1), corresponding to one partition of [4]; (3, 1) ≡ (1^1, 3^1), corresponding to four partitions of [4]; (2, 2) ≡ (2^2), corresponding to three partitions of [4]; (2, 1, 1) ≡ (1^2, 2^1), corresponding to six partitions of [4]; (1, 1, 1, 1) ≡ (1^4), corresponding to one partition of [4]. In all, p(4) = 5, whereas the number B_4 of partitions of [4] is 15. We have p(5) = 7, whereas the number of partitions of [5] is B_5 = 52.

The results (2.3) and (2.5) are so important that to record a slightly differently worded argument to recover them will do no harm: let f, g, h be power series as above. Notice that
\[ h(t) = \sum_{k=0}^{\infty} \frac{f_k}{k!} \Bigl( \sum_{l=1}^{\infty} \frac{g_l}{l!}\, t^l \Bigr)^k. \]
To compute the nth coefficient h_n of h(t), we only need to consider the partial sum up to k = n, since the other products contain powers of t higher than n, on account of g_0 = 0. Then for n ≥ 1, from Cauchy's product formula
\[ h_n = \sum_{k=1}^{n} \frac{f_k}{k!} \sum_{l_1 + \cdots + l_k = n,\ 1 \le l_i} \frac{n!}{l_1! \cdots l_k!}\, g_{l_1} \cdots g_{l_k}. \]
Now, each sum l_1 + ⋯ + l_k = n can be written in the form of α_1 + 2α_2 + ⋯ + nα_n = n for a unique vector (α_1, …, α_n), satisfying α_1 + ⋯ + α_n = k; and since there are k!/α_1! ⋯ α_n! ways to order the g_l of each term, it again follows that
\[ h_n = \sum_{k=1}^{n} \frac{f_k}{k!} \sum_{\alpha} \frac{n!\, k!}{\alpha_1! \cdots \alpha_n!}\, \frac{g_1^{\alpha_1} \cdots g_n^{\alpha_n}}{(1!)^{\alpha_1} (2!)^{\alpha_2} \cdots (n!)^{\alpha_n}} = \sum_{k=1}^{n} f_k B_{n,k}(g_1, \dots, g_{n+1-k}), \]
where the second sum runs over the vectors fulfilling the conditions just mentioned.

The (complete, exponential) Bell polynomials Y_n are defined by Y_0 = 1 and
\[ Y_n(x_1, \dots, x_n) = \sum_{k=1}^{n} B_{n,k}(x_1, \dots, x_{n+1-k}), \]
that is, taking u = 1 in (2.2),
\[ \exp\Bigl( \sum_{m \ge 1} x_m \frac{t^m}{m!} \Bigr) = \sum_{n \ge 0} Y_n(x_1, \dots, x_n) \frac{t^n}{n!}, \]
and the Bell numbers by B_n := Y_n(1, …, 1). It is clear that the Bell numbers coincide with cardinality of the set Π_n of all partitions of {1, 2, …, n}; a fact already
registered in our notation. Some amusing properties can be now derived. In formula (2.2), take u = x_m = 1. We get
\[ \sum_{n=0}^{\infty} B_n \frac{t^n}{n!} = \exp(e^t - 1) \qquad \text{or} \qquad \log \sum_{n=0}^{\infty} B_n \frac{t^n}{n!} = e^t - 1. \]
Differentiating n + 1 times both sides, there ensues the recurrence relation:
\[ B_{n+1} = \sum_{k=0}^{n} \binom{n}{k} B_k. \]
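As a sanity check on the recurrence, the short sketch below (ours, with hypothetical function names) computes the Bell numbers from it and compares them with a direct count of set partitions, obtained by assigning the elements of [n] one at a time either to an existing block or to a new one.

```python
from math import comb

def bell_numbers(N):
    """Bell numbers B_0, ..., B_N from B_{n+1} = sum_k C(n, k) B_k."""
    B = [1]
    for n in range(N):
        B.append(sum(comb(n, k) * B[k] for k in range(n + 1)))
    return B

def count_set_partitions(n):
    """Number of partitions of [n], counted by brute-force enumeration."""
    def place(i, blocks):
        # Element i may join any of the `blocks` existing blocks or open a new one.
        if i == n:
            return 1
        return sum(place(i + 1, max(blocks, b + 1)) for b in range(blocks + 1))
    return place(0, 0)

B = bell_numbers(6)
print(B)
print([count_set_partitions(n) for n in range(7)])
```

The two lists coincide, and in particular they reproduce the values B_4 = 15 and B_5 = 52 quoted above.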
The same relation is of course established by combinatorial arguments. Consider the partitions of [n] as a starting point to determine the number of partitions of [n + 1]. The number n + 1 must lie in a block of size k + 1 with 0 ≤ k ≤ n, and there are $\binom{n}{k}$ choices for such a block. Once the block is chosen, the remaining n − k numbers can be partitioned in B_{n−k} ways. Summing over k, one sees
\[ B_{n+1} = \sum_{k=0}^{n} \binom{n}{k} B_{n-k}, \]
which is the same formula.

Thus the analytical smoke has cleared and now we put the paradigm of (1.6) in Example 1.4 to work. Taking our cue from this, we have the right to expect that the formula
\[ \Delta a_n(g, f) := a_n(f \circ g) = a_{n(1)}(f)\, a_{n(2)}(g) \]
gives rise to a coproduct for the polynomial algebra generated by the coordinates a_n for n going from 1 to ∞. In other words,
\[ \Delta a_n = \sum_{k=1}^{n} \sum_{\lambda} \binom{n}{\lambda; k} a^{\lambda} \otimes a_k = \sum_{k=1}^{n} B_{n,k}(a_1, \dots, a_{n+1-k}) \otimes a_k \tag{2.6} \]
must yield a bialgebra, which is commutative but clearly not cocommutative. The quirk in defining ∆a_n(g, f) by a_n(f ∘ g) rather than by a_n(g ∘ f) owes to the wish of having the linear part of the coproduct standing on the right of the ⊗ sign, and not on the left. The first few values for the coproduct will be
\[ \begin{aligned} \Delta a_1 &= a_1 \otimes a_1, \\ \Delta a_2 &= a_2 \otimes a_1 + a_1^2 \otimes a_2, \\ \Delta a_3 &= a_3 \otimes a_1 + a_1^3 \otimes a_3 + 3a_2 a_1 \otimes a_2, \\ \Delta a_4 &= a_4 \otimes a_1 + a_1^4 \otimes a_4 + 6a_2 a_1^2 \otimes a_3 + (3a_2^2 + 4a_3 a_1) \otimes a_2, \\ \Delta a_5 &= a_5 \otimes a_1 + a_1^5 \otimes a_5 + 10a_2 a_1^3 \otimes a_4 + (10a_3 a_1^2 + 15a_2^2 a_1) \otimes a_3 + (5a_4 a_1 + 10a_2 a_3) \otimes a_2. \end{aligned} \tag{2.7} \]
The Hopf algebras of rooted trees and Feynman graphs introduced in QFT by Kreimer and Connes [3, 42], as well as the Connes–Moscovici Hopf algebra [5], are of
the same general type, with a linear part of the coproduct standing on the right of the ⊗ sign and a polynomial one on the left. The kinship is also manifest in that, as conclusively shown in [4] (see also [43]), one can use Feynman diagrams to obtain formulae of the type of (2.6). In what follows, we shall clarify the relations, and show how all those bialgebras fit in the framework and machinery of incidence bialgebras. But before doing that, we plan to explore at leisure the obtained bialgebra and some of its applications.

We do not have a connected Hopf algebra here. Indeed, since a_1 is group-like, it ought to be invertible, with inverse Sa_1. Besides, if f^{(−1)} denotes the reciprocal series of f, then according to the following paradigm, S should be given by (1.24): Sa_1 = a_1(f^{(−1)}) = a_1^{−1}(f). To obtain a connected Hopf algebra it is necessary to set a_1 = 1. In other words, to consider only formal power series (2.1) of the form f(t) = t + Σ_{n≥2} f_n t^n/n!. The resulting graded connected bialgebra is hereinafter denoted F and called the Faà di Bruno algebra (terminology due to Joni and Rota [1]); the degree is given by #(a_n) = n − 1, with the degree of a product given by definition as the sum of the degrees of the factors. If G_1 is the subgroup of diffeomorphisms of R such that f(0) = 0 and df(0) = id, we could denote F by R^{cop}(G_1). The coproduct formula is accordingly simplified as follows:
\[ \Delta a_n = \sum_{k=1}^{n} \sum_{\lambda} \binom{n}{\lambda; k} a_2^{\lambda_2} a_3^{\lambda_3} \cdots \otimes a_k = \sum_{k=1}^{n} B_{n,k}(1, a_2, \dots, a_{n+1-k}) \otimes a_k. \]
Now we go for the antipode in F. Formula (1.26) applies and in this context reduces to
\[ Sa_n = -a_n + \sum_{j=2}^{n-1} (-1)^j \sum_{1 < k_{j-1} < \cdots < k_1 < n} B_{n,k_1} B_{k_1,k_2} \cdots B_{k_{j-2},k_{j-1}}\, a_{k_{j-1}}. \tag{2.8} \]
In (2.8), the arguments of the Bell polynomials have been suppressed for concision. In particular, Sa_2 = −a_2, Sa_3 = −a_3 + 3a_2^2, Sa_4 = −a_4 − 15a_2^3 + 10a_2 a_3. Let us give the details in computing Sa_5:
\[ \begin{aligned} Sa_5 &= -a_5 + (B_{5,2} a_2 + B_{5,3} a_3 + B_{5,4} a_4) - (B_{5,4} B_{4,3} a_3 + B_{5,4} B_{4,2} a_2 + B_{5,3} B_{3,2} a_2) + B_{5,4} B_{4,3} B_{3,2} a_2 \\ &= -a_5 + 15a_2 a_4 + 10a_3^2 + 25a_2^2 a_3 - 130a_2^2 a_3 - 75a_2^4 + 180a_2^4 \\ &= -a_5 + 15a_2 a_4 + 10a_3^2 - 105a_2^2 a_3 + 105a_2^4. \end{aligned} \tag{2.9} \]
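Since the antipode is meant to encode the reciprocal series, Sa_n(f) = a_n(f^{(−1)}), the closed forms above can be cross-checked by inverting a series numerically. The sketch below is our own illustration under that reading: the inverse g of f is built order by order from (2.3) by demanding that f ∘ g = t, and its coefficients are compared with the polynomials for Sa_4 and Sa_5. The Bell recurrence and all function names are assumptions of the sketch, not notation from the source.

```python
from math import comb

def bell_partial(n, k, x):
    """Numerical value of B_{n,k}(x_1, x_2, ...), with x[i-1] holding x_i."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return sum(comb(n - 1, i - 1) * x[i - 1] * bell_partial(n - i, k - 1, x)
               for i in range(1, n - k + 2))

def inverse_series(a, N):
    """Coefficients g_0..g_N of the reciprocal of f(t) = t + sum_{n>=2} a_n t^n/n!.

    a[n] holds a_n (a[0] is unused, a[1] must be 1). The condition h_n = 0 for
    n >= 2 in (2.3), together with B_{n,1} = g_n, determines g_n recursively.
    """
    g = [0, 1]
    for n in range(2, N + 1):
        x = g[1:]  # g_1, ..., g_{n-1}: all that B_{n,k} needs for k >= 2
        g.append(-sum(a[k] * bell_partial(n, k, x) for k in range(2, n + 1)))
    return g

a = [0, 1, 2, 3, 5, 7]            # arbitrary sample values of a_2, ..., a_5
g = inverse_series(a, 5)
a2, a3, a4, a5 = a[2:]
print(g[4] == -a4 - 15 * a2**3 + 10 * a2 * a3)
print(g[5] == -a5 + 15 * a2 * a4 + 10 * a3**2 - 105 * a2**2 * a3 + 105 * a2**4)
```

Both comparisons hold for the sample values, in agreement with (2.9).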
The computation using instead S_B runs as follows:
\[ \begin{aligned} Sa_5 &= -a_5 - 10a_2\, S_B a_4 - (10a_3 + 15a_2^2)\, S_B a_3 - (5a_4 + 10a_2 a_3)\, S_B a_2 \\ &= -a_5 + 10a_2 \bigl[ a_4 + 6a_2\, S_B a_3 + (3a_2^2 + 4a_3)\, S_B a_2 \bigr] + (10a_3 + 15a_2^2)(a_3 + 3a_2\, S_B a_2) + 5a_2 a_4 + 10a_2^2 a_3 \\ &= -a_5 + 10a_2 a_4 - 60a_2^2 (a_3 + 3a_2\, S_B a_2) - 30a_2^4 - 40a_2^2 a_3 + 10a_3^2 - 30a_2^2 a_3 + 15a_2^2 a_3 - 45a_2^4 + 5a_2 a_4 + 10a_2^2 a_3 \\ &= -a_5 + 10a_2 a_4 - 60a_2^2 a_3 + 180a_2^4 - 30a_2^4 - 40a_2^2 a_3 + 10a_3^2 - 30a_2^2 a_3 + 15a_2^2 a_3 - 45a_2^4 + 5a_2 a_4 + 10a_2^2 a_3 \\ &= -a_5 + 15a_2 a_4 + 10a_3^2 - 105a_2^2 a_3 + 105a_2^4. \end{aligned} \]
Note that in both procedures, there are the same cancellations although the expansions do not coincide term-by-term. However, since this Faà di Bruno algebra (vaguely) looks of the same general type as a Hopf algebra of Feynman graphs, this is a case where we would expect a formula à la Zimmermann, that is, without cancellations, to compute the antipode. Such a formula indeed exists, as mentioned in the introduction [19]. It leads to
\[ Sa_n = \sum_{k=1}^{n-1} (-1)^k B_{n-1+k,k}(0, a_2, a_3, \dots). \]
The elegance of this equation is immediately appealing. Using a standard identity of the Bell polynomials, it can be further simplified:
\[ Sa_n = \sum_{k=1}^{n-1} (-1)^k (n - 1 + k) \cdots n\; B_{n-1,k}\Bigl( \frac{a_2}{2}, \frac{a_3}{3}, \dots \Bigr). \tag{2.10} \]
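The two cancellation-free expressions can be compared with each other and with the closed forms for Sa_4 and Sa_5 on sample data. The following sketch is ours; the function names are hypothetical, the Bell polynomials are evaluated through the standard recurrence, and exact fractions are used because of the arguments a_j/j in (2.10).

```python
from fractions import Fraction
from math import comb, factorial

def bell_partial(n, k, x):
    """Numerical value of B_{n,k}(x_1, x_2, ...), with x[i-1] holding x_i."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return sum(comb(n - 1, i - 1) * x[i - 1] * bell_partial(n - i, k - 1, x)
               for i in range(1, n - k + 2))

def antipode_zimmermann(n, a):
    """S a_n = sum_k (-1)^k B_{n-1+k,k}(0, a_2, a_3, ...); a[i] holds a_i."""
    x = [0] + list(a[2:n + 1])
    return sum((-1) ** k * bell_partial(n - 1 + k, k, x) for k in range(1, n))

def antipode_2_10(n, a):
    """S a_n from (2.10), with arguments x_i = a_{i+1}/(i+1)."""
    x = [Fraction(a[i + 1], i + 1) for i in range(1, n)]
    return sum((-1) ** k
               * Fraction(factorial(n - 1 + k), factorial(n - 1))  # (n-1+k)...n
               * bell_partial(n - 1, k, x)
               for k in range(1, n))

a = [0, 1, 2, 3, 5, 7]   # arbitrary sample values of a_2, ..., a_5
for n in range(2, 6):
    print(n, antipode_zimmermann(n, a), antipode_2_10(n, a))
```

For each n the two columns agree, and for n = 5 they reproduce the value of the right-hand side of (2.9) at the chosen a_2, …, a_5.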
For instance, (2.9) is recovered at once with no cancellations; the coincidence of (2.8) and the last formula actually gives nonstandard identities for Bell polynomials.

We cannot omit a description, however brief, of the graded dual F′. Consider the primitive elements a′_n ∈ F′ given by ⟨a′_n, a_m⟩ = δ_{nm}, then also
\[ \langle a'_n, a_p a_q \rangle = \langle \Delta a'_n, a_p \otimes a_q \rangle = \langle a'_n \otimes 1 + 1 \otimes a'_n, a_p \otimes a_q \rangle = 0, \]
if p ≥ 1 and q ≥ 1; that is, the a′_n kill nontrivial products of the a_q generators. On the other hand,
\[ \begin{aligned} \langle a'_n a'_m, a_q \rangle &= \langle a'_n \otimes a'_m, \Delta a_q \rangle = \sum_{k=1}^{q} \langle a'_n \otimes a'_m, B_{q,k}(1, a_2, \dots, a_{q+1-k}) \otimes a_k \rangle \\ &= \sum_{k=1}^{q} \langle a'_n, B_{q,k}(1, a_2, \dots, a_{q+1-k}) \rangle \langle a'_m, a_k \rangle = \langle a'_n, B_{q,m}(1, a_2, \dots, a_{q+1-m}) \rangle. \end{aligned} \]
The polynomial B_{q,m} is homogeneous of degree m, and the only monomial in it giving a nonvanishing contribution is the x_1^{m−1} x_n term. Its coefficient is $\binom{q}{\lambda; m}$
with λ the sequence (1^{m−1}, 0, …, n^1, 0, …), satisfying q = λ_1 + 2λ_2 + ⋯ + qλ_q = m − 1 + n. Thus,
\[ \langle a'_n a'_m, a_q \rangle = \begin{cases} \binom{m+n-1}{n} & \text{if } q = m + n - 1, \\ 0 & \text{otherwise.} \end{cases} \]
On the other hand, ∆(a_q a_r) = a_q a_r ⊗ 1 + 1 ⊗ a_q a_r + a_q ⊗ a_r + a_r ⊗ a_q + R, where R is either vanishing or a sum of terms of the form b ⊗ c with b or c a monomial in a_2, a_3, … of degree greater than 1. Therefore,
\[ \langle a'_n a'_m, a_q a_r \rangle = \langle a'_n \otimes a'_m, \Delta(a_q a_r) \rangle = \begin{cases} 1 & \text{if } n = q \ne m = r \text{ or } n = r \ne m = q, \\ 2 & \text{if } m = n = q = r, \\ 0 & \text{otherwise.} \end{cases} \]
Furthermore, it is clear that all the terms of the coproduct of three or more a's are the tensor product of two monomials where at least one of them is of order greater than 1. So ⟨a′_n a′_m, a_{q_1} a_{q_2} a_{q_3} ⋯⟩ = 0. All together gives
\[ a'_n a'_m = \binom{m-1+n}{n} a'_{n+m-1} + (1 + \delta_{nm}) (a_n a_m)'. \]
In particular,
\[ [a'_n, a'_m] := a'_n a'_m - a'_m a'_n = (m - n) \frac{(n + m - 1)!}{n!\, m!}\, a'_{n+m-1}. \]
Therefore, taking b′_n := (n + 1)! a′_{n+1}, we get the simpler looking
\[ [b'_n, b'_m] = (m - n)\, b'_{n+m}. \tag{2.11} \]
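The passage from the product rule to the commutator, and then to (2.11), rests on a small binomial identity that can be confirmed mechanically. In the sketch below (an illustration of ours, with a hypothetical function name) the symmetric term proportional to 1 + δ_{nm} is omitted because it drops out of the commutator.

```python
from fractions import Fraction
from math import comb, factorial

def bracket_coefficient(n, m):
    """Coefficient of the dual generator of index n+m-1 in the commutator,
    obtained as the difference of the two binomial coefficients in the product rule."""
    return comb(m - 1 + n, n) - comb(n - 1 + m, m)

for n in range(2, 6):
    for m in range(2, 6):
        closed = Fraction((m - n) * factorial(n + m - 1), factorial(n) * factorial(m))
        assert bracket_coefficient(n, m) == closed

# Rescaling by (n+1)! as in the definition of the b generators leaves just (m - n).
rescaled = {(n, m): Fraction(factorial(n + 1) * factorial(m + 1)
                             * bracket_coefficient(n + 1, m + 1),
                             factorial(n + m + 1))
            for n in range(1, 5) for m in range(1, 5)}
print(all(v == m - n for (n, m), v in rescaled.items()))
```

The final line prints `True`, which is the content of (2.11) for the indices tested.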
The Milnor–Moore theorem implies that F′ is isomorphic to the universal enveloping algebra of the Lie algebra defined by the last equation. The algebra F′ can be realized by vector fields [5]. As we saw in Chap. 4, F° is bigger and contains group-like elements f, g, … with g ∗ f = f ∘ g. We shall come back to the structure of the Faà di Bruno algebra in Sec. 3.3.1. Also, primitivity in the Faà di Bruno algebra is thoroughly examined in Sec. 4.2.

2.2. Working with the Faà di Bruno Hopf algebra

The Faà di Bruno formulae (2.2), (2.3) and (2.5) and the algebra F are ubiquitous in quantum field theory. We give a couple of such examples and then we turn to a famous example in a combinatorial-algebraic context.

Example 2.1. Consider charged fermions in an external field. The complex space H of classical solutions of the corresponding Dirac equation is graded by E_+ − E_−,
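where E_+, E_− project on the particle and antiparticle subspaces; we write H_± := E_± H. Operators on H = H_+ ⊕ H_− can be presented in block form,
\[ A = \begin{pmatrix} A_{++} & A_{+-} \\ A_{-+} & A_{--} \end{pmatrix}. \]
In particular, a unitary operator
\[ S = \begin{pmatrix} S_{++} & S_{+-} \\ S_{-+} & S_{--} \end{pmatrix} \]
corresponds to a classical scattering matrix if and only if
\[ S_{++} S_{++}^{\dagger} + S_{+-} S_{+-}^{\dagger} = E_+, \qquad S_{--} S_{--}^{\dagger} + S_{-+} S_{-+}^{\dagger} = E_-, \]
\[ S_{++} S_{-+}^{\dagger} + S_{+-} S_{--}^{\dagger} = S_{--} S_{+-}^{\dagger} + S_{-+} S_{++}^{\dagger} = 0. \]
By the Shale–Stinespring theorem [24, Chap. 6], this operator is implemented in Fock space iff S_{+−}, S_{−+} are Hilbert–Schmidt. Assuming this is the case, the quantum scattering matrix S is obtained through the spin representation [44] and it can be proved that (the square of the absolute value of) the vacuum persistence amplitude is of the form
\[ |\langle 0_{\rm in}, 0_{\rm out} \rangle|^2 = |\langle 0_{\rm in}, S\, 0_{\rm in} \rangle|^2 = \det(1 - S_{+-} S_{+-}^{\dagger}). \]
The determinant exists because of the Shale–Stinespring condition. With A of trace class and of norm less than 1, one can use the development
\[ \det(1 - A) = \exp(\operatorname{Tr} \log(1 - A)) =: \exp\Bigl( -\sum_{k=1}^{\infty} \frac{\sigma_k}{k} \Bigr) \]
with σ_k := Tr A^k. We may want to reorganize this series, for instance to expand the exact result for ⟨0_in, S 0_in⟩ in terms of coupling constants. Say
\[ 1 + \sum_{n=1}^{\infty} b_n := \exp\Bigl( -\sum_{k=1}^{\infty} \frac{\sigma_k}{k} \Bigr). \]
Our formula (2.2) gives
\[ b_n = \frac{1}{n!} \sum_{k=1}^{n} (-1)^k B_{n,k}(\sigma_1, \sigma_2, 2\sigma_3, \dots, (n - k)!\, \sigma_{n+1-k}), \]
so, for example,
\[ b_1 = -\sigma_1, \qquad b_2 = \frac{1}{2} (\sigma_1^2 - \sigma_2), \qquad b_3 = -\frac{1}{6} (\sigma_1^3 - 3\sigma_1 \sigma_2 + 2\sigma_3), \ \dots \]
One finds [45] that
\[ b_n = \frac{(-1)^n}{n!} \det \begin{pmatrix} \sigma_1 & n-1 & 0 & \cdots & 0 \\ \sigma_2 & \sigma_1 & n-2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{n-1} & \sigma_{n-2} & \sigma_{n-3} & \cdots & 1 \\ \sigma_n & \sigma_{n-1} & \sigma_{n-2} & \cdots & \sigma_1 \end{pmatrix}. \tag{2.12} \]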
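A finite-dimensional toy case makes these formulae concrete. In the sketch below, which is ours and not part of the source, A is taken to be a diagonal 2×2 matrix with eigenvalues 1/2 and 1/3 (an arbitrary choice), so that σ_k is a sum of kth powers and det(1 − A) = 1/3. The Bell-polynomial expression for b_n is compared with the determinant (2.12), and the b_n are summed. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def bell_partial(n, k, x):
    """Numerical value of B_{n,k}(x_1, x_2, ...), with x[i-1] holding x_i."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return sum(comb(n - 1, i - 1) * x[i - 1] * bell_partial(n - i, k - 1, x)
               for i in range(1, n - k + 2))

def det(M):
    """Determinant by Laplace expansion along the first row (fine for small sizes)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def b_bell(n, sigma):
    """b_n from the Bell-polynomial formula; sigma[m] holds sigma_m."""
    x = [factorial(m - 1) * sigma[m] for m in range(1, n + 1)]
    return sum((-1) ** k * bell_partial(n, k, x) for k in range(1, n + 1)) / factorial(n)

def b_det(n, sigma):
    """b_n from the determinant (2.12)."""
    M = [[sigma[i - j + 1] if j <= i else (n - 1 - i if j == i + 1 else 0)
          for j in range(n)] for i in range(n)]
    return (-1) ** n * det(M) / factorial(n)

eigenvalues = [Fraction(1, 2), Fraction(1, 3)]
sigma = [None] + [sum(e ** k for e in eigenvalues) for k in range(1, 5)]
b = [b_bell(n, sigma) for n in range(1, 5)]
print(b, [b_det(n, sigma) for n in range(1, 5)])
print(1 + sum(b))   # to be compared with det(1 - A) = (1 - 1/2)(1 - 1/3)
```

In this example the series terminates after b_2, as it must for a 2×2 matrix, and the sum equals 1/3.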
Example 2.2. The first generating functional in QFT is usually defined by the expression
\[ Z[j] = 1 + \sum_{n=1}^{\infty} \frac{i^n}{n!} \int \!\cdots\! \int d^4x_1 \cdots d^4x_n\, G(x_1, \dots, x_n)\, j(x_1) \cdots j(x_n). \]
This is often taken to be a formal expression, but actually it makes sense perturbatively if by the Green function G(x_1, …, x_n) we understand a renormalized chronological product à la Epstein–Glaser. The second generating functional is defined: W[j] := −i log Z[j]. We assert that
\[ W[j] = \sum_{m=1}^{\infty} \frac{i^{m-1}}{m!} \int \!\cdots\! \int d^4x_1 \cdots d^4x_m\, G_c(x_1, \dots, x_m)\, j(x_1) \cdots j(x_m); \]
where G_c denotes the connected Green functions [46, Chap. 5]:
\[ G(x_1, \dots, x_n) := \sum_{k=1}^{n} \sum_{\lambda \in \Pi_{n,k}} G_c^{\lambda_1}(x_1)\, G_c^{\lambda_2}(x_1, x_2) \cdots \]
with (1^{λ_1}, 2^{λ_2}, …) as before a partition of [n] in k blocks. To prove our assertion directly, one can use the Faà di Bruno formula
\[ \exp\Bigl( \sum_{m \ge 1} \frac{x_m}{m!} \Bigr) = 1 + \sum_{n \ge 1} \frac{1}{n!} \sum_{k=1}^{n} \sum_{\lambda_1, \lambda_2, \ldots} \frac{n!}{\lambda_1!\, \lambda_2!\, \lambda_3! \cdots (1!)^{\lambda_1} (2!)^{\lambda_2} (3!)^{\lambda_3} \cdots}\, x_1^{\lambda_1} x_2^{\lambda_2} x_3^{\lambda_3} \cdots. \]
Therefore
\[ \exp(i W[j]) = \sum_{c_1=0}^{\infty} \frac{1}{c_1!} \Bigl[ i \int d^4x_1\, G_c(x_1) j(x_1) \Bigr]^{c_1} \times \sum_{c_2=0}^{\infty} \frac{1}{c_2!} \Bigl[ \frac{i^2}{2!} \int\!\!\int d^4x_1\, d^4x_2\, G_c(x_1, x_2) j(x_1) j(x_2) \Bigr]^{c_2} \times \cdots, \]
which is precisely Z[j]. The proof is rather spectacular; but only a simple counting principle, well known in combinatorics [47, Sec. 5.1], is involved here. The same principle plays a role in the cluster expansion of the S-matrix [48, Chap. 4]: the sum of operators associated with Wick diagrams is equal to the normally ordered exponential of the sum of operators associated with the connected Wick diagrams. The most important application in QFT concerns the renormalization group; but this would take us too far afield. See our comments in Sec. 3.3.1.
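The counting principle behind this example can be watched at work in a "zero-dimensional" caricature, in which every Green function is a plain number depending only on the number of its arguments, and the factors of i and the integrations against the source are dropped. Under these simplifying assumptions (ours, for illustration only) the sum over partitions of [n] collapses to a sum of partial Bell polynomials in the connected functions.

```python
from math import comb

def bell_partial(n, k, x):
    """Numerical value of B_{n,k}(x_1, x_2, ...), with x[i-1] holding x_i."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    return sum(comb(n - 1, i - 1) * x[i - 1] * bell_partial(n - i, k - 1, x)
               for i in range(1, n - k + 2))

def full_from_connected(Gc, N):
    """G_0..G_N from connected functions: G_n = sum_k B_{n,k}(Gc_1, Gc_2, ...).

    Gc[m] holds the connected m-point function (Gc[0] is unused).
    """
    return [1] + [sum(bell_partial(n, k, Gc[1:]) for k in range(1, n + 1))
                  for n in range(1, N + 1)]

# Only a two-point connected function: the full functions count pairings (Wick).
print(full_from_connected([0, 0, 1, 0, 0, 0, 0], 6))
# Every connected function equal to 1: the full functions count all partitions.
print(full_from_connected([0, 1, 1, 1, 1, 1, 1], 6))
```

The first line gives 1, 0, 1, 0, 3, 0, 15, the numbers of complete pairings; the second gives the Bell numbers 1, 1, 2, 5, 15, 52, 203.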
Example 2.3. In (2.12), we might need to solve for the σ's instead of the b's. Notice that to study diffeomorphisms as formal power series of the type f(t) = t + Σ_{n≥2} a_n(f) t^n/n! it is not indispensable to use the coordinate functions a_n. Instead, one can use, for example, the new set of coordinates d_n := a_{n+1}/(n + 1)! for n ≥ 1. Consider then
\[ 1 + \sum_{n=1}^{\infty} d_n := \exp\Bigl( \sum_{k=1}^{\infty} \omega_k \Bigr) \qquad \text{or} \qquad \sum_{k=1}^{\infty} \omega_k = \log\Bigl( 1 + \sum_{n=1}^{\infty} d_n \Bigr) \]
to be solved for the ω_n. Since log(1 + t) = Σ_{k≥1} (−1)^{k−1} (k − 1)! t^k/k!, from the very Faà di Bruno formula (2.3), it follows that
\[ \omega_n = \frac{1}{n!} \sum_{k=1}^{n} (-1)^{k-1} (k - 1)!\, B_{n,k}(d_1, 2d_2, \dots, (n + 1 - k)!\, d_{n+1-k}) =: S_n(d_1, \dots, d_n). \]
The polynomials S_n are the so-called Newton polynomials; they are well known from the theory of symmetric functions. The first four are:
\[ S_1(d_1) = d_1, \qquad S_2(d_1, d_2) = d_2 - \frac{d_1^2}{2}, \qquad S_3(d_1, d_2, d_3) = d_3 - d_1 d_2 + \frac{d_1^3}{3}, \]
\[ S_4(d_1, d_2, d_3, d_4) = d_4 - \frac{d_2^2}{2} - d_1 d_3 + d_1^2 d_2 - \frac{d_1^4}{4}. \]
In view of (2.12), the inverse map expressing the d_n in terms of the ω_n is given by
\[ d_n = \frac{1}{n!} \sum_{k=1}^{n} B_{n,k}(\omega_1, 2\omega_2, \dots, (n + 1 - k)!\, \omega_{n+1-k}) =: AS_n(\omega_1, \dots, \omega_n) = \sum_{c_1 + 2c_2 + \cdots + nc_n = n} \frac{\omega_1^{c_1} \cdots \omega_n^{c_n}}{c_1! \cdots c_n!} \]
on the account of (2.5). Now, we can construct an isomorphism between H and B(V), where V is the vector space with a denumerable basis Y_1, Y_2, …, by the algebraic correspondence l_i → S_i(Y_1, …, Y_i), with inverse Y_i → AS_i(l_1, …, l_i). An easy computation [49] allows one to verify that
\[ \Delta_{B(V)} [AS_i(Y_1, \dots, Y_i)] = \sum_{j=0}^{i} AS_j \otimes AS_{i-j}(Y_1, \dots, Y_i). \]
This is enough to conclude as well that the elements S_i(l_1, …, l_i)
are all primitive and constitute a basis for the space of primitive elements of H, which is thus primitively generated. We shall come back to the isomorphism between H and B(V) in Sec. 4.1, where it plays an important role.

Example 2.4. Among the techniques for solving differential equations of the type
\[ \frac{dY}{dt} = A(t) Y(t), \qquad Y(t) = 1, \]
with the variable Y being an operator valued function, there is the Magnus technique [50], where the solution is sought in the form exp(Σ_{k=1}^{∞} Ω_k(t)) and an explicit form for the Ω_k is given, and the (Feynman–Dyson) time-ordered exponential, which is in fact an iterative solution of the form
\[ 1 + \sum_{l=1}^{\infty} P_l, \]
where the P_l are time-ordered products. Relations between both types of solutions have been rather painfully obtained in applied science papers [51, 52]. Here they are transparent; of course, it does not reduce to Ω_k = S_k(P_1, …, P_k) because now the P_l need not commute. It is however possible to introduce symmetrized Newton polynomials like:
\[ \tilde S_3(d_1, d_2, d_3) = d_3 - \frac{1}{2} (d_1 d_2 + d_2 d_1) + \frac{d_1^3}{3}, \]
\[ \tilde S_4(d_1, d_2, d_3, d_4) = d_4 - \frac{1}{2} d_2^2 - \frac{1}{2} (d_1 d_3 + d_3 d_1) + \frac{d_1^2 d_2 + d_1 d_2 d_1 + d_2 d_1^2}{3} - \frac{d_1^4}{4}, \]
and so on; and then certainly Ω_k = S̃_k(P_1, …, P_k). More details on this will be given in [53].

Example 2.5. Formulae for reversion of formal power series, going back to Lagrange [54], have enchanted generations of mathematicians and physicists, and there is an immense literature on them. It is very easy to prove that an element f(x) = ax + bx^2 + cx^3 + ⋯ of xC[[x]] like the ones we have used to study diffeomorphisms has a reciprocal or compositional inverse f^{(−1)}(x) such that f(f^{(−1)}(x)) = x and f^{(−1)}(f(x)) = x if and only if a ≠ 0, in which case the reciprocal is unique; and any left or right compositional inverse must coincide with it. For computing it, many methods are available. One finds in the handbook of mathematical functions [55] the recipe: given
\[ y = ax + bx^2 + cx^3 + dx^4 + ex^5 + \cdots \]
then
\[ x = Ay + By^2 + Cy^3 + Dy^4 + Ey^5 + \cdots, \]
where
\[ \begin{aligned} aA &= 1, \\ a^3 B &= -b, \\ a^5 C &= 2b^2 - ac, \\ a^7 D &= 5abc - a^2 d - 5b^3, \\ a^9 E &= 6a^2 bd + 3a^2 c^2 + 14b^4 - a^3 e - 21ab^2 c, \\ &\ \,\vdots \end{aligned} \tag{2.13} \]
Let us translate the search for f^{(−1)} into algebraic-combinatorial terms; everything that follows should be pretty obvious. As F ≅ R^{cop}(G), at least "morally" speaking, one expects to get back to the group G (or rather G^{opp}) by means of the Tannaka–Kreĭn paradigm. Since R is commutative, the set Hom_alg(F, R) of all algebra morphisms is a group under convolution. Now, the action of f ∈ Hom_alg(F, R) is determined by its values on the a_n. The map
\[ f \mapsto f(t) = \sum_{n=1}^{\infty} f_n \frac{t^n}{n!}, \]
where f_n := ⟨f, a_n⟩, establishes a bijection from Hom_alg(F, R) onto the set of formal (exponential) power series over the reals such that f_0 = 0 and f_1 = 1. We know that such series form a group under the operation of functional composition. This correspondence is an anti-isomorphism of groups. There is really nothing to prove, but we may go through the motions again. Indeed, let f, g ∈ Hom_alg(F, R), then
\[ f * g(a_n) = m(f \otimes g) \Delta a_n = \sum_{k=1}^{n} g_k B_{n,k}(1, f_2, \dots, f_{n+1-k}), \]
where we took in consideration that f_1 = ⟨f, 1⟩ = 1. This is the same as the nth coefficient of h(t) = g(f(t)). In other words, G = Hom_alg(F^{cop}, R). By our discussion in Sec. 1.5, the antipode of F^{cop} is S^{−1} = S. Therefore, making allowance for our prior choice of a = 1 (an almost trivial matter on which we shall reflect later) and the use of ordinary instead of exponential series, whenever f and g are formal exponential power series with f_0 = 0 and f_1 = 1 verifying f ∘ g(x) = x or g ∘ f(x) = x, the formulae (2.13) correspond to a_n(g) = Sa_n(f). Indeed Sa_2 = −a_2 gives B = −b; Sa_3 = −a_3 + 3a_2^2 gives 3!C = −3!c + 3(2!b)^2, that is C = 2b^2 − c; Sa_4 = −a_4 − 15a_2^3 + 10a_2 a_3 gives 4!D = −4!d + 10(2!b)(3!c) − 15(2!b)^3, that is D = 5bc − d − 5b^3; and so on.

Lagrange reversion is usually proved by Cauchy's theorem in the context of analytic functions, where it is known as the Lagrange–Bürmann formula; or by matrix algebraic methods. Its derivation by Hopf algebraic methods in [19] is particularly
August 29, 2005 18:8 WSPC/148-RMP
J070-00246
Combinatorial Hopf Algebras in Quantum Field Theory I
927
elegant, but it is worthwhile to note that the equivalent of formula (2.10) that they use was known prior to that derivation.

2.3. From Faà di Bruno to Connes–Moscovici

The reader who is familiar with the Connes–Moscovici algebras of "noncommutative geometry" will have noticed the similitude between the maximal commutative Hopf subalgebra of $H_{CM}(1)$, the simplest of those, and the Faà di Bruno algebra $\mathcal{F}$. In fact, they are one and the same. To see this, remember that to study diffeomorphisms as formal power series, it is not necessary to use the coordinates $a_n$. Without losing information, $\mathcal{F}$ can be described in terms of the new set of coordinates
$$\delta_n(f) := [\log f'(t)]^{(n)}(0), \qquad n \ge 1.$$
Consider then
$$h(t) := \sum_{n\ge 1} \delta_n(f)\,\frac{t^n}{n!} = \log f'(t) = \log\Bigl(1 + \sum_{n\ge 1} a_{n+1}(f)\,\frac{t^n}{n!}\Bigr).$$
From the formula (2.3), it follows that
$$\delta_n = \sum_{k=1}^{n} (-1)^{k-1}(k-1)!\,B_{n,k}(a_2, \dots, a_{n+2-k}) =: L_n(a_2, \dots, a_{n+1}). \tag{2.14}$$
The polynomials $L_n$ (closely related to the Newton polynomials) are called logarithmic polynomials in combinatorics: they give the successive derivatives of (exponential) series of the form $\log f(t)$. On the other hand, from the series expression of $\exp(h(t)) = f'(t)$, we see that
$$a_{n+1} = \sum_{k=1}^{n} B_{n,k}(\delta_1, \dots, \delta_{n+1-k}) =: Y_n(\delta_1, \dots, \delta_n) \tag{2.15}$$
and we have inverted (2.14). Recall that
$$Y_n(\delta_1, \dots, \delta_n) = \sum_{\lambda\in\Pi_n} \delta_1^{\lambda_1} \cdots \delta_n^{\lambda_n},$$
where as usual $\lambda_j$ is the number of blocks of size $j$ in a partition $\lambda$ of $[n]$. Now, from (2.14) we get $\delta_1 = a_2$, $\delta_2 = a_3 - a_2^2$, $\delta_3 = a_4 - 3a_2 a_3 + 2a_2^3$ and $\delta_4 = a_5 - 3a_3^2 - 4a_2 a_4 + 12a_2^2 a_3 - 6a_2^4$, and since the coproduct is an algebra morphism, by use of (2.7) we obtain the coproduct in the Connes–Moscovici coordinates. For instance, for the first few generators,
$$\Delta\delta_1 = \delta_1 \otimes 1 + 1 \otimes \delta_1,$$
$$\Delta\delta_2 = \delta_2 \otimes 1 + 1 \otimes \delta_2 + \delta_1 \otimes \delta_1,$$
$$\Delta\delta_3 = \delta_3 \otimes 1 + 1 \otimes \delta_3 + 3\delta_1 \otimes \delta_2 + (\delta_1^2 + \delta_2) \otimes \delta_1,$$
$$\Delta\delta_4 = \delta_4 \otimes 1 + 1 \otimes \delta_4 + 6\delta_1 \otimes \delta_3 + (7\delta_1^2 + 4\delta_2) \otimes \delta_2 + (3\delta_1\delta_2 + \delta_1^3 + \delta_3) \otimes \delta_1.$$
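The passage from the coordinates $a_n$ to the logarithmic coordinates $\delta_n$ can be checked mechanically. The following is a minimal sketch, assuming only that $f'(t) = 1 + \sum_{n\ge1} a_{n+1} t^n/n!$; the helper names `log_series`, `deltas` and `closed_forms` are illustrative and not part of the original text. It computes $\delta_n$ from the Taylor series of $\log f'(t)$ and compares with the closed forms for $\delta_1, \dots, \delta_4$ quoted above.

```python
from fractions import Fraction
from math import factorial


def log_series(c, order):
    """Ordinary Taylor coefficients h[n] of log(1 + sum_{n>=1} c[n] t^n).

    Uses h'(t) (1 + u(t)) = u'(t), i.e.
    n h_n = n c_n - sum_{k=1}^{n-1} k h_k c_{n-k}.
    """
    h = [Fraction(0)] * (order + 1)
    for n in range(1, order + 1):
        s = n * c[n] - sum(k * h[k] * c[n - k] for k in range(1, n))
        h[n] = s / n
    return h


def deltas(a, order):
    """delta_n = n! [t^n] log f'(t); `a` maps n -> a_n, with a_1 = 1."""
    c = [Fraction(0)] + [Fraction(a[n + 1], factorial(n))
                         for n in range(1, order + 1)]
    h = log_series(c, order)
    return {n: h[n] * factorial(n) for n in range(1, order + 1)}


def closed_forms(a):
    """The expressions for delta_1..delta_4 quoted in the text."""
    a2, a3, a4, a5 = a[2], a[3], a[4], a[5]
    return {
        1: a2,
        2: a3 - a2**2,
        3: a4 - 3*a2*a3 + 2*a2**3,
        4: a5 - 3*a3**2 - 4*a2*a4 + 12*a2**2*a3 - 6*a2**4,
    }


sample = {1: 1, 2: 2, 3: 3, 4: 5, 5: 7}   # arbitrary test values for a_n
assert deltas(sample, 4) == closed_forms(sample)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.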
August 29, 2005 18:8 WSPC/148-RMP
928
J070-00246
H. Figueroa & J. M. Gracia-Bond´ıa
In conclusion, the commutative subalgebra of the Connes–Moscovici Hopf algebra is but an avatar of the Faà di Bruno algebra: we of course do not distinguish algebras that differ only by the presentation of the basis. The task is now to assemble the whole of $H_{CM}(1)$ from $\mathcal{F}$. For that, it is required to look closer at the structure of $G$ and at just a bit more of Hopf algebra theory. A matched pair of groups is by definition a group $G$ with two subgroups $G_1$, $G_2$ such that $G = G_2 G_1$, $G_1 \cap G_2 = 1$. The affine subgroup $G_2$ of transformations in $\mathrm{Diff}^+(\mathbb{R})$ given by $t \mapsto \beta + \alpha t$, with $\beta \in \mathbb{R}$, $\alpha \in \mathbb{R}^+$, and the group $G_1$ of transformations tangent to the identity, considered in Sec. 2.2, give such a bijection onto $G$. Since matched pairs are conspicuous by their absence in most textbooks, we discuss the situation in some detail. The definition implies that there is a natural action of $G$ (and in particular of $G_2$) on the homogeneous space $G_1 \simeq G_2\backslash G$, and vice versa of $G$ (thus of $G_1$) on $G_2 \simeq G/G_1$. Both actions are given by composition. In detail: given $\psi \in G$, we decompose it as $\psi = kf$ with $k \in G_2$, $f \in G_1$ and
$$\beta = \psi(0), \qquad \alpha = \psi'(0), \qquad f(t) = \frac{\psi(t) - \psi(0)}{\psi'(0)}.$$
The right action of $\psi \in G$ on $f \in G_1$ is given by
$$[f \triangleleft \psi](t) = \frac{f(\psi(t)) - f(\psi(0))}{\psi'(0)\, f'(\psi(0))};$$
note $[f \triangleleft \psi](0) = 0$, $[f \triangleleft \psi]'(0) = 1$; and in particular
$$[f \triangleleft (\beta, \alpha)](t) = \frac{f(\alpha t + \beta) - f(\beta)}{\alpha f'(\beta)}. \tag{2.16}$$
The left action of $G$ on $G_2$ is seen from consideration of $\psi(\beta + \alpha t)$, and we obtain in particular for $\psi = f \in G_1$:
$$f \triangleright (\beta, \alpha) = (f(\beta), \alpha f'(\beta)). \tag{2.17}$$
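As a numerical illustration of (2.16) and (2.17), the sketch below checks, for one sample diffeomorphism, that $f \circ k = (f \triangleright k) \circ (f \triangleleft k)$ with $k = (\beta, \alpha)$, and that $f \triangleleft k$ is again tangent to the identity. The choice $f(t) = e^t - 1$ and the function names are assumptions made only for this example.

```python
import math


def f(t):
    """Sample element of G_1: f(0) = 0 and f'(0) = 1."""
    return math.exp(t) - 1.0


def df(t):
    """Derivative of the sample f."""
    return math.exp(t)


def right_action(beta, alpha):
    """The function f <| (beta, alpha) of Eq. (2.16)."""
    return lambda t: (f(alpha * t + beta) - f(beta)) / (alpha * df(beta))


def left_action(beta, alpha):
    """The affine element f |> (beta, alpha) of Eq. (2.17), as a pair."""
    return (f(beta), alpha * df(beta))


def affine(pair):
    """Turn a pair (beta, alpha) into the map t -> beta + alpha t."""
    b, a = pair
    return lambda t: b + a * t


beta, alpha = 0.5, 2.0
g1_part = right_action(beta, alpha)
g2_part = affine(left_action(beta, alpha))

for t in (-0.3, 0.0, 0.2, 0.7):
    lhs = f(affine((beta, alpha))(t))      # f composed with k
    rhs = g2_part(g1_part(t))              # (f |> k) composed with (f <| k)
    assert abs(lhs - rhs) < 1e-12

# f <| k is tangent to the identity: value 0 and slope 1 at t = 0.
h = 1e-6
assert abs(g1_part(0.0)) < 1e-12
assert abs((g1_part(h) - g1_part(-h)) / (2 * h) - 1.0) < 1e-6
```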
Note that $(k_1 f_1)(k_2 f_2) = k_1 (f_1 \triangleright k_2)(f_1 \triangleleft k_2) f_2$ and $(kf)^{-1} = (f^{-1} \triangleright k^{-1})(f^{-1} \triangleleft k^{-1})$, as general equalities of the theory of matched pairs. We ought to translate the previous considerations into Hopf-algebraic terms. In what follows, we consider bialgebras acting on algebras from the left. On $\mathbb{C}$, they are assumed to act by $h \cdot 1 = \eta(h)\,1$.

Definition 2.6. A (left, Hopf) $H$-module algebra $A$ is an algebra which is a left module for the algebra $H$ such that the defining maps $u : \mathbb{C} \to A$ and $m : A \otimes A \to A$ intertwine the action of $H$:
$$u(h \cdot 1) = h \cdot u(1), \qquad m(h_\otimes \cdot (a \otimes b)) = h \cdot m(a \otimes b),$$
that is,
$$h \cdot 1_A = \eta(h)\, 1_A \quad\text{and}\quad h \cdot (ab) = (h_{(1)} \cdot a)(h_{(2)} \cdot b) \tag{2.18}$$
whenever $a, b \in A$ and $h \in H$.
The first condition is actually redundant [57]. It is easy to see that the definition corresponds to the usual notions of groups acting by automorphisms, or Lie algebras
acting by derivations. If $h$ is a primitive element of $H$, then (2.18) entails that $h \cdot 1_A = 0$ and that $h \cdot (ab) = (h \cdot a)b + a(h \cdot b)$: as already observed, primitive elements act by derivations. Therefore, (2.18) may be regarded as a generalized Leibniz rule. (Analogously one defines $H$-module coalgebras, $H$-module bialgebras and so on; we have no use for them in this article.) When there is an algebra covariant under a group or Lie algebra, one constructs in a standard way the semidirect product algebra. In the same spirit [59]:

Definition 2.7. Let $H$ be a Hopf algebra and $A$ a (left) Hopf $H$-module algebra. The (untwisted) smash product or crossed product algebra, denoted $A \# H$ or $A \rtimes H$, is the vector space $A \otimes H$ endowed with unit $1 \otimes 1$ and the product
$$(a \otimes h)(b \otimes k) := a(h_{(1)} \cdot b) \otimes h_{(2)} k.$$
We verify associativity of the construction:
$$(a \otimes h)((b \otimes k)(c \otimes l)) = (a \otimes h)(b(k_{(1)} \cdot c) \otimes k_{(2)} l) = a(h_{(1)} \cdot (b(k_{(1)} \cdot c))) \otimes h_{(2)} k_{(2)} l$$
$$= a(h_{(1)} \cdot b)(h_{(2)} k_{(1)} \cdot c) \otimes h_{(3)} k_{(2)} l = (a(h_{(1)} \cdot b) \otimes h_{(2)} k)(c \otimes l) = ((a \otimes h)(b \otimes k))(c \otimes l).$$
To alleviate the notation, we can identify $a \equiv a \otimes 1$ and $h \equiv 1 \otimes h$. Then $ah = (a \otimes 1)(1 \otimes h)$, whereas
$$ha = (1 \otimes h)(a \otimes 1) = h_{(1)} \cdot a \otimes h_{(2)} = (h_{(1)} \cdot a) h_{(2)}.$$
Both $A$ and $H$ are subalgebras of $A \rtimes H$. A very simple example is given by $H = U(\mathfrak{g})$, where $\mathfrak{g}$ is the one-dimensional Lie algebra acting as $\frac{d}{dx}$ on $A = \mathbb{C}[x]$. Then $A \# H$ is the Weyl algebra. In our case, $H$ will be the enveloping algebra $U(\mathfrak{g}_2)$ of the affine group Lie algebra $\mathfrak{g}_2$; and the left module algebra is none other than $\mathcal{F}$. We exhibit then the action of $U(\mathfrak{g}_2)$ on $\mathcal{F}$, and check that the Faà di Bruno algebra is a Hopf $U(\mathfrak{g}_2)$-module algebra. Consider the action on $G_1$ given by (2.16). We then proceed step by step. With $\beta = 0$, one gathers
$$a_n([f \triangleleft (0, \alpha)]) := [f \triangleleft (0, \alpha)]^{(n)}(0) = \alpha^{n-1} a_n(f)$$
for $n \ge 2$. At the infinitesimal level, with $(0, \alpha) =: \exp sY$, we conclude the existence of an action:
$$Y \cdot a_n = (n - 1) a_n.$$
(2.19)
Similarly,
$$Y \cdot \delta_n = n\delta_n. \tag{2.20}$$
Note that Y · is the grading operator for F . At this stage, one could construct already the crossed product of F by the grading.
We proceed to the other generator. With $\alpha = 1$, Eq. (2.16) now gives
$$[f \triangleleft (\beta, 1)]^{(n)}(0) = \left(\frac{f(t + \beta) - f(\beta)}{f'(\beta)}\right)^{(n)}(0).$$
At the infinitesimal level, with $(\beta, 1) =: \exp \beta X$, we conclude:
$$X \cdot a_n = a_{n+1} - a_2 a_n. \tag{2.21}$$
At this point, if not before, one sees the wisdom of Connes and Moscovici's logarithmic coordinates, as then (2.16) mutates into $\log[f \triangleleft (\beta, \alpha)]'(t) = \log f'(\beta + \alpha t)$, up to a constant; from which one infers the simpler action
$$X \cdot \delta_n = \delta_{n+1}. \tag{2.22}$$
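The compatibility of (2.21) with (2.22) can be tested by machine for small $n$. The sketch below is an illustration under stated assumptions: it represents polynomials in $\delta_1, \dots, \delta_N$ as dictionaries from exponent tuples to integer coefficients, builds $a_{n+1} = Y_n(\delta_1, \dots, \delta_n)$ of (2.15) from the standard binomial recurrence for complete Bell polynomials, and lets $X$ act as the derivation with $X \cdot \delta_j = \delta_{j+1}$. All function names are hypothetical.

```python
from collections import defaultdict
from math import comb

N = 7  # highest index of delta that may appear


def mono(j):
    """The generator delta_j as a polynomial."""
    e = [0] * N
    e[j - 1] = 1
    return {tuple(e): 1}


def add(p, q, sign=1):
    """Return p + sign * q, dropping zero coefficients."""
    r = defaultdict(int, p)
    for m, c in q.items():
        r[m] += sign * c
    return {m: c for m, c in r.items() if c != 0}


def mul(p, q):
    """Product of two commutative polynomials."""
    r = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            r[tuple(x + y for x, y in zip(m1, m2))] += c1 * c2
    return {m: c for m, c in r.items() if c != 0}


def scale(p, k):
    """Multiply a polynomial by the integer k."""
    return {m: k * c for m, c in p.items() if k * c != 0}


def X(p):
    """Derivation determined by X . delta_j = delta_{j+1}, Eq. (2.22)."""
    r = defaultdict(int)
    for m, c in p.items():
        for j in range(N - 1):
            if m[j]:
                e = list(m)
                e[j] -= 1
                e[j + 1] += 1
                r[tuple(e)] += c * m[j]
    return {m: c for m, c in r.items() if c != 0}


def bell(nmax):
    """Y_0..Y_nmax via Y_{n+1} = sum_k C(n, k) Y_{n-k} delta_{k+1}."""
    Y = [{tuple([0] * N): 1}]
    for n in range(nmax):
        acc = {}
        for k in range(n + 1):
            acc = add(acc, scale(mul(Y[n - k], mono(k + 1)), comb(n, k)))
        Y.append(acc)
    return Y


Y = bell(6)
a = {n + 1: Y[n] for n in range(7)}   # a_{n+1} = Y_n(delta); a_1 = 1

# Eq. (2.21): X . a_n = a_{n+1} - a_2 a_n, checked for n = 1..5.
assert all(X(a[n]) == add(a[n + 1], mul(a[2], a[n]), -1) for n in range(1, 6))
```

The recurrence used in `bell` does not involve $X$, so the final assertion is a genuine consistency check rather than a restatement of the definition.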
Both $X$ and $Y$ act by derivation; thus $\mathcal{F}$ is a $U(\mathfrak{g}_2)$-module algebra. Here the relations $Za = Z \cdot a + aZ$ hold, for $Z = X$ or $Y$. Note that $\mathcal{F}$ is not a $U(\mathfrak{g}_2)$-module bialgebra. By inspection of [5] and summarizing so far, we conclude that the algebra $H_{CM}(1)$ is the smash product $\mathcal{F} \rtimes U(\mathfrak{g}_2)$ of the enveloping algebra $U(\mathfrak{g}_2)$ of the affine Lie algebra by the Faà di Bruno Hopf algebra; it is generated just by $Y$, $X$, $a_2 = \delta_1$. We can as well regard the algebra $H_{CM}(1)$ as the enveloping algebra of the extension $\mathcal{L}$ of $\mathfrak{g}_2$ by the abelian Lie algebra $L$ spanned by the $a_n$, which is a derivation $\mathfrak{g}_2$-module. Recall that an extension of this type is an exact sequence
$$0 \to L \xrightarrow{\;i\;} \mathcal{L} \xrightarrow{\;h\;} \mathfrak{g}_2 \to 0,$$
where $[x, i(a)] := i(h(x) \cdot a)$, for $x \in \mathcal{L}$ and $a \in L$. This lifts to an exact sequence of enveloping algebras, in our case
$$0 \to \mathcal{F} \to H_{CM}(1) \to U(\mathfrak{g}_2) \to 0.$$
It is permissible to write $[X, \delta_n]$ for $X \cdot \delta_n$ and so on; but we do not use that notation. The burning question now is, what is the "good" coproduct on the smash product algebra $\mathcal{F} \# U(\mathfrak{g}_2)$? But before tackling that, we pause to rederive (2.19) from Connes and Moscovici's (2.20), and (2.21) from (2.22), by a combinatorial argument. Since $X$, $Y$ are derivations,
$$Y \cdot \delta_1^{c_1} \cdots \delta_n^{c_n} = (1c_1 + 2c_2 + \cdots + nc_n)\,\delta_1^{c_1} \cdots \delta_n^{c_n}, \tag{2.23}$$
$$X \cdot \delta_1^{c_1} \cdots \delta_n^{c_n} = \sum_{j=1}^{n} c_j\, \delta_1^{c_1} \cdots \delta_j^{c_j - 1} \delta_{j+1}^{c_{j+1}+1} \cdots \delta_n^{c_n}. \tag{2.24}$$
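Thus, using (2.15) and (2.23),
$$Y \cdot a_{n+1} = Y \cdot Y_n(\delta_1, \dots, \delta_n) = \sum_{c\in\Pi_n} Y \cdot \delta_1^{c_1} \cdots \delta_n^{c_n} = \sum_{c\in\Pi_n} (1c_1 + 2c_2 + \cdots + nc_n)\,\delta_1^{c_1} \cdots \delta_n^{c_n} = nY_n(\delta_1, \dots, \delta_n) = na_{n+1}.$$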
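The computation above rests on the fact that every partition of $[n]$ of class $1^{c_1} 2^{c_2} \cdots n^{c_n}$ has weight $1c_1 + 2c_2 + \cdots + nc_n = n$. A small enumeration makes this concrete; the following sketch, with illustrative function names, lists the set partitions of $[n]$, groups them by class, and confirms that all monomials of $Y_n$ have the same $Y$-eigenvalue $n$.

```python
from collections import Counter


def set_partitions(n):
    """Yield all partitions of {1, ..., n} as lists of blocks."""
    if n == 0:
        yield []
        return
    for p in set_partitions(n - 1):
        # insert n into one of the existing blocks ...
        for i in range(len(p)):
            yield p[:i] + [p[i] + [n]] + p[i + 1:]
        # ... or add n as a new singleton block
        yield p + [[n]]


def partition_class(p, n):
    """Return (c_1, ..., c_n), with c_j the number of blocks of size j."""
    sizes = Counter(len(block) for block in p)
    return tuple(sizes.get(j, 0) for j in range(1, n + 1))


def classes_and_weights(n):
    """Multiplicity of each class in Y_n, and the set of weights found."""
    classes = Counter(partition_class(p, n) for p in set_partitions(n))
    weights = {sum(j * c for j, c in enumerate(cls, start=1))
               for cls in classes}
    return classes, weights


classes, weights = classes_and_weights(4)
assert weights == {4}                 # Y acts on Y_4 = a_5 as multiplication by 4
assert sum(classes.values()) == 15    # number of partitions of a 4-element set
```

The keys of `classes` are the exponent vectors of the monomials of $Y_n$ and the values are their coefficients.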
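We have verified (2.19). Similarly, (2.24) entails
$$X \cdot a_{n+1} = \sum_{c\in\Pi_n} X \cdot (\delta_1^{c_1} \cdots \delta_n^{c_n}) = \sum_{c\in\Pi_n} \sum_{j=1}^{n} c_j\, \delta_1^{c_1} \cdots \delta_j^{c_j - 1} \delta_{j+1}^{c_{j+1}+1} \cdots \delta_n^{c_n}.$$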
Notice that $1c_1 + 2c_2 + \cdots + j(c_j - 1) + (j + 1)(c_{j+1} + 1) + (j + 2)c_{j+2} + \cdots + nc_n = n + 1$; therefore we must think of partitions of $[n + 1]$. Now, by deleting the number $n + 1$, such partitions give a partition of $[n]$. Furthermore, if $c$ is a partition of $[n]$, then the partitions of $[n + 1]$ that give rise to $c$ by dropping $n + 1$ are obtained by either adding the singleton $\{n + 1\}$ or by inserting $n + 1$ in one of the blocks of $c$. Conversely, all partitions of $[n + 1]$ are obtained from one of $[n]$ by either of these two procedures. Moreover, if $c$ is of class $1^{c_1} 2^{c_2} \cdots n^{c_n}$, and $n + 1$ is included in a block of size $j$ of $c$, we obtain a partition of $[n + 1]$ of class $1^{c_1} 2^{c_2} \cdots j^{c_j - 1} (j + 1)^{c_{j+1}+1} \cdots n^{c_n}$, and there are as many of these partitions as blocks of size $j$; namely $c_j$. Thus
$$a_{n+2} = Y_{n+1}(\delta_1, \dots, \delta_{n+1}) = \sum_{e\in\Pi_{n+1}} \delta_1^{e_1} \cdots \delta_{n+1}^{e_{n+1}}$$
$$= \sum_{c\in\Pi_n} \sum_{j=1}^{n} c_j\, \delta_1^{c_1} \cdots \delta_j^{c_j - 1} \delta_{j+1}^{c_{j+1}+1} \cdots \delta_n^{c_n} + \delta_1 \sum_{c\in\Pi_n} \delta_1^{c_1} \cdots \delta_n^{c_n}$$
$$= X \cdot a_{n+1} + \delta_1 a_{n+1},$$
which is precisely (2.21). We return to the search for a compatible coalgebra structure on $\mathcal{F} \rtimes U(\mathfrak{g}_2)$. Given a Hopf algebra $H$, one considers on $\mathbb{C}$ the coaction given by $\gamma(1) = u(1)$.

Definition 2.8. A (right) Hopf $H$-comodule coalgebra $C$ is a coalgebra which is a right comodule for the coalgebra $H$ such that the counit map and the coproduct on $C$ intertwine the coaction of $H$:
$$(\eta \otimes \mathrm{id})\gamma = \gamma\eta, \qquad (\Delta \otimes \mathrm{id})\gamma = \gamma_\otimes \Delta.$$
In this context, the smash coproduct or crossed coproduct coalgebra $H \ltimes C$ is defined as the vector space $H \otimes C$ endowed with the counit $\eta_H \otimes \eta_C$ and the coproduct
$$\Delta(a \otimes c) = a_{(1)} \otimes c_{(1)}^{(1)} \otimes a_{(2)} c_{(1)}^{(2)} \otimes c_{(2)}.$$
Now comes "the revenge of the Faà di Bruno coalgebra", since for the identification of the Connes–Moscovici Hopf algebra we need $H = \mathcal{F}$ to coact on the coalgebra $C = U(\mathfrak{g}_2)$. One ought to be careful here, as we are about to use the coalgebra structure of $\mathcal{F}$ for the first time, and actually $\mathcal{F} \not\simeq \mathcal{R}(G_1)$, but $\mathcal{F} \simeq \mathcal{R}^{cop}(G_1)$! To see why $\mathcal{F}$ naturally coacts on $U(\mathfrak{g}_2)$, note that, to implement the action (2.17) on $G_2$ of the diffeomorphisms tangent to the identity and reasoning as above, we would have an action of $\mathcal{F}^{opp}$ on a suitable algebra $C(G_2)$ of functions on $G_2$. This we can choose to regard as a coaction of $\mathcal{F}^{cop}$ on $C(G_2)$
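[29, Sec. 1.2] or better on its dual coalgebra $U(\mathfrak{g}_2)$. At the infinitesimal level, if $(\beta, \alpha) = \exp(\beta X) \exp(\log\alpha\, Y)$, we see that $f \triangleright Y = Y$ for all $f$, but
$$f \triangleright X = \frac{d}{d\beta}\bigg|_{\alpha=1,\,\beta=0} [f \triangleright (\beta, \alpha)] = X + f''(0)\,Y = X + a_2(f)\,Y.$$
The astuteness of Connes and Moscovici's definition of $H_{CM}(1)$ is precisely that this information on the structure of the diffeomorphism group is made patent. The last expression corresponds to the coaction $\gamma : U(\mathfrak{g}_2) \to U(\mathfrak{g}_2) \otimes \mathcal{F}^{cop}$ given by
$$\gamma(Y) = Y \otimes 1, \qquad \gamma(X) = X \otimes 1 + Y \otimes a_2.$$
The smash coproduct structure $\mathcal{F}^{cop} \ltimes U(\mathfrak{g}_2)$ is given by
$$\Delta^{cop} X := \Delta(1 \otimes X) = 1 \otimes 1 \otimes 1 \otimes X + 1 \otimes X \otimes 1 \otimes 1 + 1 \otimes Y \otimes a_2 \otimes 1 = 1 \otimes X + X \otimes 1 + Y \otimes a_2; \tag{2.25}$$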
the other pertinent coproducts stay unchanged. It is time to return to the use of $\mathcal{F}$, instead of $\mathcal{R}(G_1)$, and to account for that it is enough to turn (2.25) around. Therefore we finally have
$$\Delta X = X \otimes 1 + 1 \otimes X + a_2 \otimes Y.$$
In conclusion, we have proved the following result.

Theorem 2.9. The Connes–Moscovici bialgebra is the crossed product algebra and coalgebra of the enveloping algebra of the Lie algebra of the affine group and the Beatus Faà di Bruno bialgebra.

This conclusion is restated in [58]. The construction is similar to Majid's bicrossedproduct bialgebra, although with important differences of detail. In [59, 60], the canonical link to the matched pair of groups situation is also highlighted; and one might borrow the notation $\mathcal{F} \bowtie U(\mathfrak{g}_2)$ for $H_{CM}(1)$. Furthermore, $\mathcal{F} \bowtie U(\mathfrak{g}_2)$ is automatically a Hopf algebra. In fact
$$0 = m(\mathrm{id} \otimes S)\Delta X = m(X \otimes 1 + 1 \otimes SX + a_2 \otimes SY) = X + SX - a_2 Y$$
entails that the only nontrivial formula for the good antipode is given by $SX = -X + a_2 Y$. The same result is obtained from the formula $S(aZ) = SZ^{(1)}\, S(aZ^{(2)})$ in [59, Chap. 6], with $a = 1$, $Z = X$. Note that $S$ is of infinite order, as
$$S^{2n} X = X + na_2 \quad\text{and}\quad S^{2n+1} X = -X + a_2 Y - na_2.$$
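For the reader's convenience, the first step of this computation may be spelled out. The following worked derivation is a sketch that uses only facts stated above: $S$ is an antihomomorphism, $SY = -Y$ because $Y$ is primitive, $Sa_2 = -a_2$, the relation $Za = Z \cdot a + aZ$, and $Y \cdot a_2 = a_2$ from (2.19).

```latex
\begin{align*}
S^2 X &= S(-X + a_2 Y) = -SX + S(Y)\,S(a_2) \\
      &= X - a_2 Y + (-Y)(-a_2) = X - a_2 Y + Y a_2 \\
      &= X - a_2 Y + (Y \cdot a_2 + a_2 Y) = X + a_2 .
\end{align*}
```

Since $S^2 a_2 = a_2$, induction gives $S^{2n} X = X + na_2$, and applying $S$ once more yields $S^{2n+1} X = SX + n\,Sa_2 = -X + a_2 Y - na_2$.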
The natural pairing between $H_{CM}(1)$ and $C(G_2) \rtimes G_1$ given by
$$\langle aZ_k, gf \rangle = g(k)\,a(f),$$
where $g$ is a function on $G_2$ and $Z_k$ is the element of $\mathfrak{g}_2$ that gives $k \in G_2$ by exponentiation, is also transparent. Similarly, $H_{CM}(n) \simeq \mathcal{F}(n) \bowtie U(\mathbb{R}^n \rtimes gl(n; \mathbb{R}))$, with $\mathcal{F}(n)$ the $n$-colored Faà di Bruno Hopf algebra (on which something will be said later).
Given the role accorded in this work to the Faà di Bruno algebras, a reference to the somewhat checkered past of the Beatus Faà di Bruno formula might not be out of place. We recommend the excellent article by Johnson [61], where, besides a nice derivation of the formula very much in the spirit of our Sec. 2.1, it is recalled that there is no proof for (2.4) in [40], and that formulae equivalent to it were present in the mathematical literature from the beginning of the nineteenth century. Also, further historical investigation [62] deserves a better look.

3. Hopf Algebras of Graphs and Distributive Lattices

3.1. Hopf algebras of Feynman graphs

We start this section by an invitation, too. Some readers will want to see a worked out example from perturbative quantum field theory, where the combinatorics of renormalization leads to a finite "renormalized" graph by means of local counterterms, from a given Feynman graph. The analytical complications militate against developing an example that is really challenging from the combinatorial viewpoint; but this cannot be helped. Needless to say, the experts can skip the following discussion. To make matters as simple as possible, our example will be taken from the (Ginzburg–Landau) $\varphi^4_4$ scalar model in Euclidean field theory. We work in the dimensional regularization scheme in momentum space [63, 64] and display the counterterms ab initio. The free energy functional is given by
$$E_{free}[\varphi] = \int d^D x \left[ \frac{1}{2}\varphi\,\triangle\,\varphi + \frac{1}{2} m^2 \varphi^2 \right].$$
Here $\triangle$ denotes the $D$-dimensional Laplacian. The interaction part is extended by counterterms, of the form
$$E_{int}[\varphi] = \int d^D x \left[ \frac{\mu^\varepsilon \tilde g}{4!}\varphi^4 + c_g \frac{\mu^\varepsilon \tilde g}{4!}\varphi^4 + \frac{c_\varphi}{2}\varphi\,\triangle\,\varphi + \frac{c_{m^2}}{2} m^2 \varphi^2 \right]. \tag{3.1}$$
Denote $\varepsilon := 4 - D$. The definition of the original vertex includes the mass parameter $\mu$, introduced to make $\tilde g$ dimensionless: $g = \tilde g\mu^\varepsilon =$ [four-point vertex diagram]. The counterterms produce additional vertices in the diagrammatic expansion. In particular: $c_g \tilde g\mu^\varepsilon =$ [four-point vertex diagram with a dot].
Dimensional analysis indicates that all the counterterms are dimensionless; therefore they can only depend on $\varepsilon$, $\tilde g$ or combinations like $m^2/\mu^2$, $k^2/\mu^2$. It turns out that the last one appears just in intermediate stages as $\log(k^2/\mu^2)$ and that these non-local terms cancel in the final expressions; this is the key to the whole affair.
Thus $c_{m^2}$, $c_\varphi$, $c_g$ depend only on $\varepsilon$, $\tilde g$, $m^2/\mu^2$ (and the dependence on $m^2/\mu^2$ can be made to disappear). The quantities $\varphi$, $m$, $\tilde g$ in the previous displays are the renormalized field, mass and coupling constant. The original form of the theory is recovered by multiplicative renormalization:
$$Z_\varphi = 1 + c_\varphi, \qquad Z_{m^2} = 1 + c_{m^2}, \qquad Z_g = 1 + c_g.$$
The total energy functional becomes
$$E[\varphi] = \int d^D x \left[ \frac{1}{2} Z_\varphi\, \varphi\,\triangle\,\varphi + \frac{1}{2} Z_{m^2} m^2 \varphi^2 + \frac{\mu^\varepsilon \tilde g}{4!} Z_g \varphi^4 \right]$$
with the $\varepsilon$-dependent coefficients. Introduction of the bare field, the bare mass and coupling:
$$\varphi_B = \sqrt{Z_\varphi}\,\varphi, \qquad m_B^2 = \frac{Z_{m^2}}{Z_\varphi}\, m^2, \qquad \tilde g_B = \frac{Z_g}{Z_\varphi^2}\,\mu^\varepsilon \tilde g,$$
brings the energy functional to the standard form
$$E[\varphi_B] = \int d^D x \left[ \frac{1}{2}\varphi_B\,\triangle\,\varphi_B + \frac{1}{2} m_B^2 \varphi_B^2 + \frac{\tilde g_B}{4!}\varphi_B^4 \right].$$
The bare quantities here are functions of the renormalized quantities $\tilde g$, $m$, the mass scale $\mu$ and $\varepsilon$. The Feynman rules for the model are next recalled. Instead of using directly the modified Feynman rules including the counterterms, we will "discover" the latter in the process of renormalization. We plan to concentrate on the proper (1PI) vertex function $\Gamma^{(4)}(k_1, \dots, k_4)$, which is defined only for $k_1 + \cdots + k_4 = 0$ and is represented by amputated diagrams. This means that we only need:
• A propagator factor $\frac{1}{p_j^2 + m^2}$ for each internal line, with $j \in \{1, \dots, I = 2p - 2\}$, where $p$ is the approximation order (number of vertices). Each internal momentum $p_j$ is expressed in terms of the loop momenta and the four external momenta.
• An integration $(2\pi)^{-D}\int d^D l_m$ over each (independent) loop momentum, with the index $m \in \{1, \dots, L = I - p + 1 = p - 1\}$.
• The factors $\tilde g\mu^\varepsilon$, one for each vertex.
• The weight factor of the graph.
We also briefly recall the counting of ultraviolet divergences. According to the rules, a Feynman integral $I_\Gamma$ with $p$ vertices and four external lines contains $L = p - 1$ loop integrations and thus, in the numerator of the integrand, a power $D(p - 1)$ of the momentum appears. Each of the internal lines contributes a propagator. Thus there are altogether
$$\omega(\Gamma) := D(p - 1) - 2(2p - 2) = \varepsilon(1 - p)$$
powers of momentum in such a Feynman integral. This power $\omega(\Gamma)$ is the superficial degree of divergence of $\Gamma$. As $\varepsilon \downarrow 0$, all four-point proper graphs are superficially logarithmically divergent. A graph is said to have subdivergences if it contains a superficially divergent subdiagram $\gamma$, that is to say, with $\omega(\gamma) \ge 0$ in that limit. The only possibly divergent subintegrations are those of the two- and four-point subdiagrams. Up to one loop, we have
$$\Gamma^{(4)}(k_1, \dots, k_4) = [\text{vertex}] + \frac{3}{2}\,[\text{fish}] + \text{counterterm} + O(g^3).$$
Actually "$\frac{3}{2}\,[\text{fish}]$" stands for three integrals of the same form, in terms of the "Mandelstam" variables $s = (k_1 + k_2)^2$, $t = (k_1 + k_3)^2$ and $u = (k_1 + k_4)^2$. In the next to leading order:
$$\Gamma^{(4)}(k_1, \dots, k_4) = [\text{vertex}] + \frac{3}{2}\,[\text{fish}] + 3\,[\text{wine-cup}] + \frac{3}{4}\,[\text{bikini}] + \text{counterterms} + O(g^4).$$
A similar comment applies in relation to the $s$, $t$, $u$ variables. (We ignore a fish-cum-tadpole graph that would easily be taken into account anyway. Only at three-loop order would we have to handle in parallel the renormalization of two-point diagrams in earnest.)
We compute the fish graph:
$$I_{fish}(k) = \tilde g^2 \mu^{2\varepsilon} \int \frac{d^D p}{(2\pi)^D}\, \frac{1}{p^2 + m^2}\, \frac{1}{(p + k)^2 + m^2},$$
where $k = k_1 + k_2$, say. Using Feynman's formula
$$\frac{1}{A^a B^b} = \frac{\Gamma(a + b)}{\Gamma(a)\Gamma(b)} \int_0^1 dx\, \frac{x^{a-1}(1 - x)^{b-1}}{[Ax + B(1 - x)]^{a+b}},$$
we obtain
$$I_{fish}(k) = g^2 \int_0^1 dx \int \frac{d^D p}{(2\pi)^D}\, \frac{1}{\{(p^2 + m^2)(1 - x) + [(p + k)^2 + m^2]x\}^2}$$
$$= g^2 \int_0^1 dx \int \frac{d^D p}{(2\pi)^D}\, \frac{1}{(p^2 + 2pkx + k^2 x + m^2)^2}$$
$$= \frac{g^2\, \Gamma(2 - D/2)}{(4\pi)^{D/2}} \int_0^1 dx\, \frac{1}{(sx(1 - x) + m^2)^{2 - D/2}}$$
$$= \frac{\tilde g^2 \mu^\varepsilon\, \Gamma(\varepsilon/2)}{(4\pi)^2} \int_0^1 dx \left[ \frac{4\pi\mu^2}{sx(1 - x) + m^2} \right]^{\varepsilon/2}.$$
Expanding now partially in powers of $\varepsilon$, this yields
$$I_{fish}(s) = \tilde g\mu^\varepsilon\, \frac{\tilde g}{(4\pi)^2} \left[ \frac{2}{\varepsilon} + \psi(1) + \int_0^1 dx\, \log\frac{4\pi\mu^2}{sx(1 - x) + m^2} + O(\varepsilon) \right]. \tag{3.2}$$
Here $\psi(z) := \Gamma'(z)/\Gamma(z)$ is the digamma function. Some comments are already in order. The divergence happens for $D$ an even integer greater than or equal to 4
(so it is clearly a logarithmic one). The remaining integral is finite as long as $m^2 \ne 0$. In Eq. (3.2), we have separated $\tilde g\mu^\varepsilon$, that will become the coupling constant, and is not expanded in powers of $\varepsilon$. Only the expression multiplying it contributes to the renormalization constant $Z_g$ with a pole term independent of the free mass scale $\mu$. That parameter appears only in the finite part and the arbitrariness of its choice exhibits a degree of ambiguity in the regularization procedure. Now comes the renormalization prescription. Suppose we wanted the value of $\Gamma^{(4)}(k_i)$ at $k_i = 0$ to be (finite and) equal, at least at the present order, to the renormalized coupling constant:
$$\Gamma^{(4)}(0) = g. \tag{3.3}$$
This would lead us to take
$$Z_g = 1 + \frac{3}{2}\,\tilde g\, I_{fish}(0),$$
that is, a counterterm given by
$$c_g(\varepsilon, \tilde g, m^2/\mu^2) = -\frac{\tilde g}{(4\pi)^2} \left[ \frac{6}{\varepsilon} + 3\psi(1) - 3\log(4\pi m^2/\mu^2) \right].$$
However, other choices like $c_g = -6\tilde g/(4\pi)^2\varepsilon$, not "soaking up" the finite terms, would be admissible. We do not bother to write the (finite, if a bit involved) result for $\Gamma^{(4)}(s, t, u)$ in any of those cases. Let us now tackle the two-loop diagrams of the four-point function. The bikini graph is uncomplicated:
$$I_{bikini}(k) = -g^3 \int \frac{d^D p}{(2\pi)^D}\, \frac{1}{((p - k)^2 + m^2)(p^2 + m^2)} \times \int \frac{d^D q}{(2\pi)^D}\, \frac{1}{((q - k)^2 + m^2)(q^2 + m^2)},$$
a product of two independent integrals. Here $k$ again denotes any one of $k_1 + k_2$, $k_1 + k_3$ or $k_1 + k_4$. We obtain
$$I_{bikini} = -\tilde g\mu^\varepsilon\, \frac{\tilde g^2}{(4\pi)^4} \left[ \frac{2}{\varepsilon} + \psi(1) + \int_0^1 dx\, \log\frac{4\pi\mu^2}{k^2 x(1 - x) + m^2} + O(\varepsilon) \right]^2$$
$$= -\tilde g\mu^\varepsilon\, \frac{\tilde g^2}{(4\pi)^4} \left[ \frac{4}{\varepsilon^2} + \frac{4}{\varepsilon}\psi(1) + \frac{4}{\varepsilon} \int_0^1 dx\, \log\frac{4\pi\mu^2}{k^2 x(1 - x) + m^2} + O(\varepsilon^0) \right].$$
For the first time, we face a combinatorial problem. It would not do to square and subtract (one third of) the previously obtained vertex counterterm, as this procedure gives rise to a non-local divergence of the form $(\log k^2)/\varepsilon$. The solution is easy enough: we take into account the counterterms corresponding to the subdivergences by substituting $[I_{fish}(k) - I_{fish}(0)]^2$ for $I_{bikini}(k)$. This happens to accord with the renormalization prescription and both the double and the single pole then cancel out. In other words, renormalization "factorizes" and the three counterterms that come from taking the functional (3.1) seriously do appear.
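The cancellation of the pole in the difference $I_{fish}(k) - I_{fish}(0)$ can be observed numerically. The following is a rough sketch, not a renormalization code: it evaluates only the bracket $\Gamma(\varepsilon/2)\int_0^1 dx\,[4\pi\mu^2/(sx(1-x)+m^2)]^{\varepsilon/2}$ of the fish integral by a midpoint rule, with the assumed sample values $m^2 = \mu^2 = 1$, and compares the subtracted quantity with its expected limit $-\int_0^1 dx\,\log(1 + sx(1-x)/m^2)$ as $\varepsilon \to 0$. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329   # psi(1) = -EULER_GAMMA


def midpoint(fn, n=2000):
    """Midpoint-rule approximation of the integral of fn over [0, 1]."""
    h = 1.0 / n
    return h * sum(fn((i + 0.5) * h) for i in range(n))


def fish_bracket(s, eps, m2=1.0, mu2=1.0):
    """Gamma(eps/2) * int_0^1 dx [4 pi mu^2 / (s x (1-x) + m^2)]^(eps/2)."""
    A = 4.0 * math.pi * mu2
    integral = midpoint(lambda x: (A / (s * x * (1 - x) + m2)) ** (eps / 2))
    return math.gamma(eps / 2) * integral


def subtracted_limit(s, m2=1.0):
    """-int_0^1 dx log(1 + s x (1-x) / m^2), the eps -> 0 limit of the difference."""
    return -midpoint(lambda x: math.log(1.0 + s * x * (1 - x) / m2))


eps, s = 1e-4, 3.0

# Each term separately carries the pole 2/eps ...
assert fish_bracket(s, eps) > 1e4
# ... with Gamma(eps/2) = 2/eps + psi(1) + O(eps) ...
assert abs(math.gamma(eps / 2) - 2 / eps + EULER_GAMMA) < 1e-3
# ... but the subtracted expression stays finite and momentum dependent.
diff = fish_bracket(s, eps) - fish_bracket(0.0, eps)
assert abs(diff - subtracted_limit(s)) < 1e-3
```

The dependence on $\mu$ drops out of the subtracted limit, in line with the remark that $\mu$ enters only the finite part.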
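Things are slightly more complicated for the wine-cup graph [wine-cup diagram], both analytically and combinatorially. We find the integral
$$-g^3 \int \frac{d^D p}{(2\pi)^D}\, \frac{d^D q}{(2\pi)^D}\, \frac{1}{((k_1 + k_2 - p)^2 + m^2)(p^2 + m^2)} \times \frac{1}{(q^2 + m^2)((p - q + k_3)^2 + m^2)},$$
or
$$I_{wine\text{-}cup}(k) = -g^3 \int \frac{d^D p}{(2\pi)^D}\, \frac{1}{((k_1 + k_2 - p)^2 + m^2)(p^2 + m^2)}\, I_{fish}(p + k_3).$$
(We have taken $q$ for the loop momentum of the integral on the loop with two sides and $p$ for the loop momentum of the integral on the loop with three sides, and have used $k_1 + k_2 = -k_3 - k_4$.) Therefore,
$$I_{wine\text{-}cup}(k) = \frac{\tilde g^3 \mu^{2\varepsilon}\, \Gamma(\varepsilon/2)}{(4\pi)^2} \int_0^1 dx \int \frac{d^D p}{(2\pi)^D}\, \frac{\left[ \dfrac{4\pi\mu^2}{(k_3 + p)^2 x(1 - x) + m^2} \right]^{\varepsilon/2}}{((k_1 + k_2 - p)^2 + m^2)(p^2 + m^2)}.$$
Combining the denominators in the usual way, after integrating with respect to $p$, we obtain
$$I_{wine\text{-}cup}(k) = \frac{\tilde g^3 \mu^\varepsilon (4\pi\mu^2)^\varepsilon\, \Gamma(\varepsilon)}{(4\pi)^4} \int_0^1 dx\, [x(1 - x)]^{-\varepsilon/2} \int_0^1 dy\, y(1 - y)^{\varepsilon/2 - 1} \int_0^1 dz$$
$$\times \left[ yz(1 - yz)(k_1 + k_2)^2 + y(1 - y)k_3^2 - 2yz(1 - y)k_3(k_1 + k_2) + m^2\left( y + \frac{1 - y}{x(1 - x)} \right) \right]^{-\varepsilon}.$$
There is a pole term coming from the endpoint singularity in the integral at $y = 1$. Around this point, the square bracket can be expanded:
$$[\cdots] = 1 - \varepsilon \log\left( z(1 - z) + \frac{m^2}{s} \right) + O(\varepsilon^2).$$
We are left with
$$\frac{\tilde g^3 \mu^\varepsilon (4\pi\mu^2)^\varepsilon}{(4\pi)^4}\, \frac{1}{\varepsilon}(1 + \varepsilon\psi(1)) \int_0^1 dx\, [x(1 - x)]^{-\varepsilon/2} \int_0^1 dy\, y(1 - y)^{\varepsilon/2 - 1}$$
$$\times \int_0^1 dz \left[ 1 - \varepsilon \log\left( z(1 - z) + \frac{m^2}{s} \right) + O(\varepsilon^2) \right]. \tag{3.4}$$
Now,
$$\int_0^1 dx\, [x(1 - x)]^{-\varepsilon/2} = \frac{\Gamma^2(1 - \varepsilon/2)}{\Gamma(2 - \varepsilon)}.$$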
Similarly for the integral with respect to $y$. We reckon that the integral in (3.4) is
$$\frac{\Gamma^2(1 - \varepsilon/2)\,\Gamma(\varepsilon)}{\Gamma(2 - \varepsilon)\,\Gamma(2 + \varepsilon/2)} \left[ 1 - \varepsilon \int_0^1 dx\, \log\left( x(1 - x) + \frac{m^2}{s} \right) \right].$$
After some work,
$$\frac{\Gamma^2(1 - \varepsilon/2)\,\Gamma(\varepsilon)}{\Gamma(2 - \varepsilon)\,\Gamma(2 + \varepsilon/2)} = \frac{2}{\varepsilon}\, \frac{1}{(1 - \varepsilon)(1 + \varepsilon/2)}\, [1 + O(\varepsilon)],$$
and in conclusion
$$I_{wine\text{-}cup}(s) = \tilde g\mu^\varepsilon\, \frac{\tilde g^2}{(4\pi)^4}\, \frac{2}{\varepsilon^2} \left[ 1 + \frac{\varepsilon(1 + 2\psi(1))}{2} - \varepsilon \int_0^1 dx\, \log\frac{sx(1 - x) + m^2}{4\pi\mu^2} \right] + O(\varepsilon^0).$$
There is no way to contemplate the renormalization of this without getting rid of the dreaded non-local divergence. This can be done precisely by taking account of the subdivergence in the wine-cup diagram, that is, by subtracting the term (one half of) $I_{fish}(s) I_{fish}(0)$. The non-local divergence in this expression is seen to exactly cancel the non-local divergence in the previous display! The result of such a subtraction is still divergent, but it contains only terms independent of momentum, which can now be reabsorbed in the subtraction of the overall divergence. Although the combinatorial resources required so far are trivial, it is pretty clear that at a higher approximation order, every detail of renormalization computations becomes nightmarish. It makes sense then, also from the viewpoint of general renormalizability theorems, to clarify the combinatorial aspect as much as possible independently of the analytical ones. This was accomplished to a large extent by the precursors, culminating in the Zimmermann forest formula [13]. A step further was taken when Kreimer recognized the Hopf algebra structure lurking behind it. This is what eventually allows us to extract the general combinatorial meaning of the forest formula in this paper. Thus motivated, we turn to the Connes–Kreimer paradigm. Bialgebras of Feynman graphs, encoding the combinatorics of renormalization, were introduced in [3]. The precise definition we use in this paper was first given in [65]. It is appropriate for massless scalar models in configuration space; in the examples we shall always envisage the $\varphi^4_4$ model. Nevertheless, similar constructions hold in any given quantum field theory, such as the massive $\varphi^3_6$ model considered in [3]. Just like in the original paper by Connes and Kreimer and its follow-ups, we avoid cumbersome formalism by relying on the pictorial intuition provided by the diagrams themselves.
The reader is by now supposed to be thoroughly familiar with the concept of (superficial) degree of divergence. We recall that a graph or diagram Γ of the theory is specified by a set V (Γ) of vertices and a set L(Γ) of lines (propagators) among them; external lines are attached to only one vertex each, internal lines to two. Diagrams with no external
lines ("vacuum diagrams") will not be taken into account; in $\varphi^4_4$ theory, only graphs with an even number of external lines are to be found. We are also excluding diagrams with a single vertex and thus, it is desirable to banish tadpole diagrams, in which a line connects a vertex to itself, as well. Given a graph $\Gamma$, a subdiagram $\gamma$ of $\Gamma$ is specified by a subset of at least two elements of $V(\Gamma)$ and a subset of the lines that join these vertices in $\Gamma$. By exception, the empty subset $\emptyset$ will be admitted as a subdiagram of $\Gamma$, as well as $\Gamma$ itself. Clearly, the external lines for a subdiagram $\gamma$ include not only a subset of original incident lines but also some internal lines of $\Gamma$ not included in $\gamma$. The connected pieces of $\Gamma$ are the maximal connected subdiagrams. A diagram is called proper (or 1PI) when the number of its connected pieces would not increase on the removal of a single internal line; otherwise it is improper. An improper graph is the union of proper components plus subdiagrams containing a single line. A diagram is called 1VI when the number of its connected pieces can increase upon the removal of a single vertex. A subgraph of a proper graph is a subdiagram that contains all the elements of $L(\Gamma)$ joining its vertices in the whole graph; as such, it is determined solely by the vertices. A subgraph of an improper graph $\Gamma$, distinct from $\Gamma$ itself, is a proper subdiagram each of whose components is a subgraph with respect to the proper components of $\Gamma$. We write $\gamma \subseteq \Gamma$ if and only if $\gamma$ is a subgraph of $\Gamma$ as defined (not just a subdiagram), although practically everything we have to say would work the same with a less restrictive definition. (For renormalization in configuration space, it is more convenient to deal with subgraphs than with more general subdiagrams.
Moreover, Zimmermann showed long ago that only subtractions corresponding to subgraphs that are renormalization parts need to be used [66] in renormalization, and this dispenses us from dealing with a more general definition.) When a subdiagram contains several connected pieces, each one of them being a subgraph, we still call it a subgraph. For example, Fig. 1 illustrates the case of a subgraph of the ϕ44 model, made out of two connected pieces, which in spite of containing all the vertices, does not coincide with the whole graph. Two subgraphs γ1 , γ2 of Γ are said to be nonoverlapping when γ1 ∩ γ2 = ∅ (that is, γ1 and γ2 have no common vertices) or γ1 ⊆ γ2 or γ2 ⊆ γ1 ; otherwise they are overlapping. Given γ ⊆ Γ, the quotient graph or cograph Γ/γ (reduced graph in Zimmermann’s [13] parlance) is defined by shrinking each connected component of γ in Γ to a point, that is to say, each piece of γ (bereft of its external lines) is
Fig. 1. The "roll" in $\varphi^4_4$ theory.
considered as a vertex of $\Gamma$ and all the lines in $\Gamma$ not belonging to $\gamma$ belong to $\Gamma/\gamma$. This is modified in the obvious way when $\gamma$ represents a propagator correction (i.e. has two external lines). A nonempty $\Gamma/\gamma$ will be proper iff $\Gamma$ is proper. The graphs $\Gamma$ and $\Gamma/\gamma$ have the same external structure, that is, structure of external lines. It is useful to think of the external structure as a kind of color: although subgraphs of a given graph $\Gamma$ may have colors different from $\Gamma$, cographs cannot. Now, a bialgebra $H$ is defined as the polynomial algebra generated by the empty set $\emptyset$ and the connected Feynman graphs that are (superficially) divergent and/or have (superficially) divergent subgraphs (renormalization parts in Zimmermann's parlance), with the set union as the product operation. Hence $\emptyset$ is the unit element $1 \in H$. The counit is given by $\eta(\Gamma) := 0$ on any generator, except $\eta(\emptyset) = 1$. The really telling operation is the coproduct $\Delta : H \to H \otimes H$; as it is to be a homomorphism of the algebra structure, we need only define it on connected diagrams. By definition, the (reduced) coproduct of $\Gamma$ is given by
$$\Delta' \Gamma := \sum_{\emptyset \subsetneq \gamma \subsetneq \Gamma} \gamma \otimes \Gamma/\gamma. \tag{3.5}$$
The sum is over all divergent, proper, not necessarily connected subgraphs of Γ (not including ∅ and Γ) such that each piece is divergent and such that Γ/γ is not a tadpole part. We put Γ/Γ = 1. When appropriate, the sum also runs over different types of local counterterms associated to γ [3]; this is not needed in our model example because it corresponds to a massless theory, where propagators carry only one type of counterterms. It is then suggestive that the concept of primitive element for ∆ coincides with that of primitive diagram in QFT. We show in Fig. 2 how appearance of tadpole parts in Γ/γ can happen. The cograph corresponding to the “bikini” subgraph in the upper part of the graph in Fig. 2 is a tadpole correction. It gives rise to a one-vertex reducible subgraph that can be suppressed [67]. For the proof of the bialgebra properties of H, we refer to [24]. For graphical examples of coproducts, see [65]. From now on, we restrict consideration to the nontrivial subbialgebra of proper and superficially divergent graphs — the situation considered in [3] — that we still refer to as H. Actually H is a graded bialgebra. Obvious grading operators are available: if #(Γ) denotes the number of vertices in Γ (i.e. the coupling order), then we define the degree of a generator (connected element) Γ as ν(Γ) := #(Γ) − 1; the degree of a product is the sum of the degrees of the factors. This grading is compatible with the coproduct and clearly scalars are the only degree 0 elements. Other gradings are by the number I(Γ) of internal lines
[Fig. 2. Cograph which is a tadpole part.]
in Γ and by loop number l(Γ) := I(Γ) − ν(Γ). For the $\varphi^4_4$ model, l(Γ) = ν(Γ) + 1 for two-point graphs and l(Γ) = ν(Γ) for four-point graphs. By our remarks in Sec. 1.5, H is a Hopf algebra and we have at least two formulae for the antipode: $S_G$ and $S_B$. Nevertheless, in this context it is preferable to write the antipode in a more pictorial language, taking advantage of geometrical intuition.

Definition 3.1. A chain C of a proper, connected graph Γ is a sequence $\gamma_1 \subsetneq \gamma_2 \subsetneq \cdots \subsetneq \gamma_k$ of proper, divergent, not necessarily connected subgraphs of Γ, not including ∅ and Γ. We denote by Ch(Γ) the set of chains of Γ. The length of a chain C is the number of “links” l(C) = k + 1 and we write $\Omega(C) := \gamma_1 (\gamma_2/\gamma_1) \cdots (\gamma_k/\gamma_{k-1})(\Gamma/\gamma_k)$. We allow also the empty set to be a chain. With this notation, we can define the antipode as follows:
$$S_{DS}(\Gamma) := \sum_{C \in \mathrm{Ch}(\Gamma)} (-1)^{l(C)}\, \Omega(C). \qquad (3.6)$$
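This definition corresponds, on the one hand, to the correct version of the Dyson–Salam formula for renormalization. On the other hand, formula (3.6) looks similar to the explicit expression for the antipode given by Schmitt for his incidence Hopf algebras [14]; we shall come back to that. We show that $S_{DS}$ is nothing but a reformulation of $S_G$, in other words:

Theorem 3.2. The expansion of $S_{DS}$ coincides identically with the expansion of $S_G$.

Proof. First, given a proper, connected graph Γ, we rewrite
$$S_{DS}(\Gamma) := \sum_{k} (-1)^{k+1} \sum_{C \in \mathrm{Ch}_k(\Gamma)} \Omega(C),$$
where $\mathrm{Ch}_k(\Gamma)$ denotes the set of chains of length k + 1. Thus, it is enough to prove that
$$(-1)^{k+1} \sum_{C \in \mathrm{Ch}_k(\Gamma)} \Omega(C) = (u\eta - \mathrm{id})^{*(k+1)}(\Gamma).$$
For this purpose, first notice that
$$\sum_{\emptyset \subsetneq \gamma_1 \subsetneq \gamma_2 \subsetneq \cdots \subsetneq \gamma_k \subsetneq \Gamma} \gamma_1 \otimes \gamma_2/\gamma_1 \otimes \cdots \otimes \gamma_k/\gamma_{k-1} \otimes \Gamma/\gamma_k = \Delta_k \cdots \Delta_2\, \Delta'\Gamma. \qquad (3.7)$$
Indeed, by definition of the coproduct the statement is true for k = 1. Moreover, if (3.7) holds for k − 1, then
$$\begin{aligned}
\sum_{\emptyset \subsetneq \gamma_1 \subsetneq \gamma_2 \subsetneq \cdots \subsetneq \gamma_k \subsetneq \Gamma} \gamma_1 \otimes \gamma_2/\gamma_1 \otimes \cdots \otimes \gamma_k/\gamma_{k-1} \otimes \Gamma/\gamma_k
&= \sum_{\emptyset \subsetneq \gamma_2 \subsetneq \gamma_3 \subsetneq \cdots \subsetneq \gamma_k \subsetneq \Gamma} \Delta_k\bigl(\gamma_2 \otimes \gamma_3/\gamma_2 \otimes \cdots \otimes \gamma_k/\gamma_{k-1} \otimes \Gamma/\gamma_k\bigr) \\
&= \Delta_k \sum_{\emptyset \subsetneq \gamma_1 \subsetneq \gamma_2 \subsetneq \cdots \subsetneq \gamma_{k-1} \subsetneq \Gamma} \gamma_1 \otimes \gamma_2/\gamma_1 \otimes \cdots \otimes \gamma_{k-1}/\gamma_{k-2} \otimes \Gamma/\gamma_{k-1} \\
&= \Delta_k\, \Delta_{k-1} \cdots \Delta_2\, \Delta'\Gamma.
\end{aligned}$$
Thus, by (1.25)
$$\begin{aligned}
(-1)^{k+1} \sum_{C \in \mathrm{Ch}_k(\Gamma)} \Omega(C)
&= (-1)^{k+1} \sum_{\emptyset \subsetneq \gamma_1 \subsetneq \gamma_2 \subsetneq \cdots \subsetneq \gamma_k \subsetneq \Gamma} \gamma_1 (\gamma_2/\gamma_1) \cdots (\gamma_k/\gamma_{k-1})(\Gamma/\gamma_k) \\
&= (-1)^{k+1}\, m\, m_2 \cdots m_k\, \Delta_k \cdots \Delta_2\, \Delta'(\Gamma) = (u\eta - \mathrm{id})^{*(k+1)}(\Gamma).
\end{aligned}$$
The proof shows how the chains are generated from the coproduct. For illustration, now that we are at it, we record (1.27) in the language of the algebra of graphs:
$$S_B\Gamma := -\Gamma - \sum_{\emptyset \subsetneq \gamma \subsetneq \Gamma} (S_B\gamma)\, \Gamma/\gamma.$$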
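As a minimal worked instance of (3.6) and of the recursion for $S_B$, assume (purely for illustration) that Γ has exactly one subgraph γ entering the sums, and that γ itself is primitive, so that $S_B\gamma = -\gamma$. Reading Ω of the empty chain as Γ itself, the only chains are the empty chain (l = 1) and the one-element chain γ (l = 2), and both formulae can be compared directly:

```latex
\begin{align*}
S_{DS}(\Gamma) &= (-1)^{1}\,\Gamma + (-1)^{2}\,\gamma\,(\Gamma/\gamma)
                = -\Gamma + \gamma\,(\Gamma/\gamma), \\
S_{B}\,\Gamma  &= -\Gamma - (S_{B}\gamma)\,\Gamma/\gamma
                = -\Gamma + \gamma\,(\Gamma/\gamma).
\end{align*}
```

Under these assumptions the two expansions agree term by term, in line with Theorem 3.2.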
Let us illustrate as well the construction of the graded dual Hopf algebra of H, for Hopf algebras of Feynman graphs. Each connected element Γ gives a derivation or element $Z_\Gamma : H \to \mathbb{C}$ of the Lie algebra of infinitesimal characters on H, defined by
$$\langle Z_\Gamma, \Gamma_1 \cdots \Gamma_k \rangle = 0 \quad \text{unless } k = 1 \text{ and } \Gamma_1 = \Gamma, \qquad \langle Z_\Gamma, \Gamma \rangle = 1.$$
Also, $\langle Z_\Gamma, 1 \rangle = 0$ since $Z_\Gamma \in \mathrm{Der}_\eta H$. Clearly any derivation δ vanishes on the ideal generated by products of two or more connected elements. Therefore, derivations are determined by their values on the subspace spanned by single graphs and reduced to linear forms on this subspace.

3.2. Breaking the chains: The formula of Zimmermann

Definition 3.3. A (normal) forest F of a proper, connected graph Γ is a set of proper, divergent and connected subgraphs, not including ∅ and Γ, such that any pair of elements are nonoverlapping. $\mathcal{F}(\Gamma)$ denotes the set of forests of Γ. The density of a forest F is the number d(F) = |F| + 1, where |F| is the number of elements of F. We allow also the empty set to be a forest. Given γ ∈ F ∪ {Γ}, we say that γ′ is a predecessor of γ in F if γ′ ⊊ γ and there is no element γ″ in F such that γ′ ⊊ γ″ ⊊ γ; that is to say, γ covers γ′. Let
$$\Theta(F) := \prod_{\gamma \in F \cup \{\Gamma\}} \gamma/\tilde\gamma,$$
where $\tilde\gamma$ denotes the disjoint union of all predecessors of γ in F. When γ is minimal, $\tilde\gamma = \emptyset$ and $\gamma/\tilde\gamma = \gamma$. Notice that if a forest F is a chain, then Θ(F) = Ω(F) and conversely if a chain C is a forest, Ω(C) = Θ(C). Obviously not every forest is a chain; but also not
every chain is a forest, because product subgraphs can occur in chains and cannot in forests. Zimmermann’s version for the antipode is given by
$$S_Z(\Gamma) := \sum_{F \in \mathcal{F}(\Gamma)} (-1)^{d(F)}\, \Theta(F). \qquad (3.8)$$
We assert that $S_Z$ provides yet another formula for the antipode of H. A standard proof of this, except for minor combinatorial details, is just the mechanical one of checking that a given formula yields a convolution inverse for id [17]. It is not entirely uninformative; but gives little insight into why there should be no cancellations in the actual computation; and cancellations in (3.8) there are not, since in order to have Θ(F) = Θ(F′), it is required that d(F) = d(F′)! Instead of repeating such a proof, and with a view to generalizations, here we give a longer, but elementary and much more instructive treatment that (we hope) clarifies how chains of different length give cancelling contributions. Of course the assertion means that, in general, there are fewer forests than chains. Experience with Feynman graphs indicates that overlapping tends to reduce the cancellation phenomenon. There are “very overlapping” diagrams like the one in the $\varphi^4_4$ model pictured in Fig. 3, for which the sets of chains and forests coincide. Despite examples like this one, it is apparent that Zimmermann’s formula is, from the combinatorial viewpoint, altogether subtler than those of Bogoliubov or Dyson and Salam.

We plunge now into showing how all the cancellations implicit in the Dyson–Salam approach for graphs are taken into account and suppressed. To each chain C of Γ, we associate the forest $F_C$ consisting of all the connected components of the different elements of C.

Lemma 3.4. Let C be a chain, then $\Omega(C) = \Theta(F_C)$.

Proof. Clearly the statement is true for chains with only one element. Assume the result holds for chains in $\bigcup_{i=2}^{k-1} \mathrm{Ch}_i(\Gamma)$ where Γ is an arbitrary graph in H, and let $C := \gamma_1 \subset \gamma_2 \subset \cdots \subset \gamma_k$ be a chain in $\mathrm{Ch}_k(\Gamma)$. Then $D := \gamma_1 \subset \gamma_2 \subset \cdots \subset \gamma_{k-1}$ belongs to $\mathrm{Ch}_{k-1}(\gamma_k)$ and if $\lambda_1, \ldots, \lambda_s$ are the connected components of $\gamma_k$, then $F_C = F_D \cup \{\lambda_1, \ldots, \lambda_s\}$. Now, clearly $\lambda_1, \ldots, \lambda_s$ are the maximal elements of $F_C$, therefore, in $F_C$, $\Gamma/\gamma_k = \Gamma/\lambda_1 \cdots \lambda_s = \Gamma/\tilde\Gamma$. Thus by the induction hypothesis
$$\Omega(C) = \Omega(D)\, \Gamma/\gamma_k = \Theta(F_D)\, \Gamma/\gamma_k = \prod_{\gamma \in F_C \cup \{\Gamma\}} \gamma/\tilde\gamma = \Theta(F_C).$$
[Fig. 3. Diagram Γ without extra cancellations in $S_Z(\Gamma)$ with respect to $S_{DS}(\Gamma)$.]
We can as well associate chains to forests in such a way that, if C has been associated to F, then $F_C = F$. If a forest F is a chain, then F itself is the only chain so associated to F and clearly d(F) = l(F). Each forest F has associated at least one chain. Indeed, let $L_1 := \lambda^1_1, \ldots, \lambda^1_{n_1}$ be the list of maximal elements of F, let $L_2 := \lambda^2_1, \ldots, \lambda^2_{n_2}$ be the set of predecessors in F of elements in $L_1$, let $L_3$ be the collection of predecessors in F of elements in $L_2$, and so on. If the listing of the elements of F is completed in the kth iteration, we let $C_F$ be the chain
$$C_F := \lambda^k_1 \cdots \lambda^k_{n_k} \subset \cdots \subset \lambda^2_1 \cdots \lambda^2_{n_2} \subset \lambda^1_1 \cdots \lambda^1_{n_1},$$
which by definition is associated to F. We denote by Ch(F) the set of chains (of different length in general) associated to a given forest F of Γ. So far, we find
$$S_{DS}(\Gamma) = \sum_{C \in \mathrm{Ch}(\Gamma)} (-1)^{l(C)}\, \Omega(C) = \sum_{C \in \mathrm{Ch}(\Gamma)} (-1)^{l(C)}\, \Theta(F_C) = \sum_{F \in \mathcal{F}(\Gamma)} \sum_{C \in \mathrm{Ch}(F)} (-1)^{l(C)}\, \Theta(F).$$
Theorem 3.5. For each proper, connected graph Γ, $S_{DS}(\Gamma) = S_Z(\Gamma)$.

Proof. To conclude that $S_{DS}(\Gamma) = S_Z(\Gamma)$ it remains to prove that for each forest F, $\sum_{C \in \mathrm{Ch}(F)} (-1)^{l(C)} = (-1)^{d(F)}$. For this the idea is to enumerate the chains associated to a forest with k + 1 elements from a list of the chains associated to the forest obtained by deleting one element. Thus we proceed by induction on the density of the forests. By the previous remarks the statement holds if F is a chain, in particular if F has only one element. Suppose that the claim holds whenever F has at most k elements and let G be a forest with k + 1 elements, say G = F ∪ {γ}. Let $C_1, \ldots, C_r$ be the collection of all chains in Ch(F). If $C_i$ is the chain $\gamma^i_1 \subset \gamma^i_2 \subset \cdots \subset \gamma^i_{k_i}$, let $\gamma^i_{t_i}$ be the first (from the left) term of the chain $C_i$ such that a connected component of it covers γ. If no term satisfies that condition, set $t_i = k_i + 1$. On the other hand, let $\gamma^i_{s_i}$ be the last (from the right) link of the chain with (one or several) connected components contained in γ, not appearing in a previous link of the chain. If no $\gamma^i_j$ has connected components contained in γ, take $s_i = 0$. We shall denote by $\hat\gamma^i_j$ the element of H obtained from $\gamma^i_j$ by replacing the product of all its connected components contained in γ by γ itself. Then we can construct the following $2(t_i - s_i) - 1$ chains in Ch(G):
$$\begin{aligned}
C_i^{0} &:= \gamma^i_1 \subset \cdots \subset \gamma^i_{s_i} \subset \hat\gamma^i_{s_i} \subset \hat\gamma^i_{s_i+1} \subset \cdots \subset \hat\gamma^i_{t_i-1} \subset \gamma^i_{t_i} \subset \cdots \subset \gamma^i_{k_i}, \\
C_i^{1} &:= \gamma^i_1 \subset \cdots \subset \gamma^i_{s_i} \subset \hat\gamma^i_{s_i+1} \subset \hat\gamma^i_{s_i+2} \subset \cdots \subset \hat\gamma^i_{t_i-1} \subset \gamma^i_{t_i} \subset \cdots \subset \gamma^i_{k_i}, \\
C_i^{2} &:= \gamma^i_1 \subset \cdots \subset \gamma^i_{s_i+1} \subset \hat\gamma^i_{s_i+1} \subset \hat\gamma^i_{s_i+2} \subset \cdots \subset \hat\gamma^i_{t_i-1} \subset \gamma^i_{t_i} \subset \cdots \subset \gamma^i_{k_i}, \\
&\;\;\vdots \\
C_i^{2j-1} &:= \gamma^i_1 \subset \cdots \subset \gamma^i_{s_i+j-1} \subset \hat\gamma^i_{s_i+j} \subset \cdots \subset \hat\gamma^i_{t_i-1} \subset \gamma^i_{t_i} \subset \cdots \subset \gamma^i_{k_i}, \\
C_i^{2j} &:= \gamma^i_1 \subset \cdots \subset \gamma^i_{s_i+j} \subset \hat\gamma^i_{s_i+j} \subset \cdots \subset \hat\gamma^i_{t_i-1} \subset \gamma^i_{t_i} \subset \cdots \subset \gamma^i_{k_i}, \\
&\;\;\vdots \\
C_i^{2(t_i-s_i)-3} &:= \gamma^i_1 \subset \cdots \subset \gamma^i_{t_i-2} \subset \hat\gamma^i_{t_i-1} \subset \gamma^i_{t_i} \subset \cdots \subset \gamma^i_{k_i}, \\
C_i^{2(t_i-s_i)-2} &:= \gamma^i_1 \subset \cdots \subset \gamma^i_{t_i-1} \subset \hat\gamma^i_{t_i-1} \subset \gamma^i_{t_i} \subset \cdots \subset \gamma^i_{k_i}.
\end{aligned}$$
Notice that $l(C_i^{2u}) = l(C_i) + 1$ for $u = 0, 1, \ldots, (t_i - s_i) - 1$, whereas $l(C_i^{2v-1}) = l(C_i)$ for $v = 1, 2, \ldots, (t_i - s_i) - 1$. Since every chain in Ch(G) is equal to $C_i^j$ for some pair (i, j), it follows, by the induction hypothesis, that
$$\sum_{C \in \mathrm{Ch}(G)} (-1)^{l(C)} = \sum_{i=1}^{r} \sum_{j=0}^{2(t_i-s_i)-2} (-1)^{l(C_i^j)} = \sum_{i=1}^{r} (-1)^{l(C_i^0)} = -\sum_{i=1}^{r} (-1)^{l(C_i)} = -(-1)^{d(F)} = (-1)^{d(G)}.$$
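The sign identity at the heart of this proof, $\sum_{C \in \mathrm{Ch}(F)} (-1)^{l(C)} = (-1)^{d(F)}$, can be checked by brute force on small examples. The sketch below is only a toy model and not the authors' construction: a connected subgraph is represented by a frozenset of line labels, a forest is a family of such sets that are pairwise nested or disjoint, and we assume that every element of the forest carries at least one line of its own, so that distinct products have distinct supports. All function names are hypothetical.

```python
from itertools import combinations

def products(forest):
    """Non-empty families of pairwise disjoint forest elements
    (toy stand-ins for the not necessarily connected subgraphs)."""
    elems = list(forest)
    for r in range(1, len(elems) + 1):
        for combo in combinations(elems, r):
            if all(a.isdisjoint(b) for a, b in combinations(combo, 2)):
                yield frozenset(combo)

def support(product):
    """Set of lines covered by a product of disjoint components."""
    return frozenset().union(*product)

def chains_of(forest):
    """All chains g1 < g2 < ... < gk whose connected components,
    taken together, are exactly the elements of the forest."""
    forest = frozenset(forest)
    prods = list(products(forest))
    found = []

    def extend(chain, used):
        if used == forest:
            found.append(tuple(chain))
        for p in prods:
            # the next link must strictly contain the previous one
            if not chain or support(chain[-1]) < support(p):
                extend(chain + [p], used | p)

    extend([], frozenset())
    return found

def signed_chain_sum(forest):
    """Sum of (-1)^{l(C)}, with l(C) = k + 1, over the chains of the forest."""
    return sum((-1) ** (len(c) + 1) for c in chains_of(forest))

# Toy forests: a, b, e are disjoint; c contains a and b.
a, b, c = frozenset({1}), frozenset({2}), frozenset({1, 2, 3})
e = frozenset({4})
EXAMPLES = ([a], [a, b], [a, c], [a, b, c], [a, b, e], [a, b, c, e])
for forest in EXAMPLES:
    density = len(forest) + 1                 # d(F) = |F| + 1
    print(len(forest), signed_chain_sum(forest), (-1) ** density)
```

For the forest {a, b} of two disjoint elements, for instance, the associated chains are ab, a ⊂ ab and b ⊂ ab, which contribute +1 − 1 − 1 = −1 = (−1)^{d(F)}.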
3.3. Incidence Hopf algebras

The kinship between the Connes–Moscovici algebra, the Kreimer Hopf algebra [2] of renormalization, Connes–Kreimer algebras of rooted trees and Feynman graphs is by now well known; it was the discovery of this kinship that gave the current impulse to the subject. Yet the “classical” Faà di Bruno algebra appeared to be of the same kind and it is known to fit in the framework of incidence bialgebras of lattices and posets, that went back to the pioneering work by Rota, together with Joni, Doubilet and Stanley [1, 68]. Rota introduced bialgebras in combinatorics, with coproducts becoming natural tools to systematize decompositions of posets and other combinatorial objects. This has been developed by Schmitt [14, 69]. The lecture notes by Dür [70] also touched upon the subject. We want to explore the relations between Connes–Kreimer–Moscovici algebras and the lore of incidence Hopf algebras, eventually culminating in the importation of Zimmermann’s formula into the latter realm. Before that, we will obtain the Faà di Bruno algebra F (yes, the symbol F is overworked in this paper) anew as the incidence Hopf algebra on the family of partitions of finite sets.

A family 𝒫 of finite intervals (that is, finite partially ordered sets, or posets for short, of the form {x, z} := { y : x ≤ y ≤ z }) is called (interval) closed if it contains all the subintervals of its elements. That is, if x ≤ y ∈ P ∈ 𝒫, then {x, y} ∈ 𝒫. We will write x < y whenever x ≤ y and x ≠ y. Analogously to the previous section, we will say that y covers x whenever x < y and there is no z such that x < z < y. If P = {x, y} ∈ 𝒫, denote $0_P := x$ and $1_P := y$. An order-compatible equivalence relation on a closed family 𝒫 is an equivalence relation ∼ such that whenever the intervals P, Q are equivalent in 𝒫, then there exists a bijection φ : P → Q such that $\{0_P, x\} \sim \{0_Q, \varphi(x)\}$ and $\{x, 1_P\} \sim \{\varphi(x), 1_Q\}$ for all x ∈ P. Denote by [P] the equivalence class of P in the quotient family 𝒫∼. Such equivalence classes are called types.
Definition 3.6. Let C(𝒫∼) be the vector space generated by the types. It is a coalgebra under the maps ∆ : C(𝒫∼) → C(𝒫∼) ⊗ C(𝒫∼) and η : C(𝒫∼) → C defined by
$$\Delta[P] = \sum_{x \in P} [\{0_P, x\}] \otimes [\{x, 1_P\}], \qquad (3.9)$$
and η[P] = 1 if |P| = 1, η[P] = 0 otherwise.
Since ∼ is order-compatible, ∆ is well defined. The verifications are elementary. We call C(𝒫∼) the incidence coalgebra of 𝒫 modulo ∼. It is also easily seen that all incidence coalgebras are quotients of the “full” incidence coalgebra, associated to the trivial (isomorphism) equivalence relation, by the coideals spanned by elements P − Q, when P ∼ Q.

The family 𝒫 is hereditary if it is closed as well under Cartesian products: P × Q ∈ 𝒫 for every pair P, Q in 𝒫; the partial order $(x_1, y_1) \le (x_2, y_2)$ iff $x_1 \le x_2$ in P and $y_1 \le y_2$ in Q is understood. A chain in 𝒫 is an interval C such that x, y ∈ C implies either x ≤ y or y ≤ x. A chain C is said to have length l(C) = n if |C| = n + 1. Clearly a chain of length n can be written as $x_0 < x_1 < \cdots < x_n$. The set of chains that are subintervals of an interval P, such that $x_0 = 0_P$ and $x_n = 1_P$, is denoted by Ch(P). We write Ω(C) for the type of the Cartesian product $\{x_0, x_1\} \times \{x_1, x_2\} \times \cdots \times \{x_{n-1}, x_n\}$. A chain is saturated if no further “links” can be found between its elements, that is, for 1 ≤ i ≤ n, $x_{i-1} \le y \le x_i$ implies $y = x_{i-1}$ or $y = x_i$.

Incidence coalgebras are filtered by length in a natural way. An interval P is said to have length n if the longest chain in P has length l(P) = n. If P ∼ Q, then l(P) = l(Q), so length is well defined on the set of types; also clearly Ch(P) ∼ Ch(Q) if P ∼ Q. Let $C_n$ be the vector subspace of C(𝒫∼) generated by types of length n or less; then $C_0 \subseteq C_1 \subseteq \cdots$ is a filtering of C(𝒫∼). Indeed, for any interval P and x in P, the union of a chain of $\{0_P, x\}$ and a chain of $\{x, 1_P\}$ gives a chain of P, hence $l([\{0_P, x\}]) + l([\{x, 1_P\}]) \le l(P)$, and so $\Delta C_n \subseteq \sum_{k=0}^{n} C_k \otimes C_{n-k}$. An interval is graded when the length of all saturated chains between $0_P$ and $1_P$ is the same. Whenever all elements of the family 𝒫 are graded, C(𝒫∼) is a graded coalgebra.
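To make (3.9) and the length filtration concrete, the following sketch works in a small “full” incidence coalgebra, where each interval is treated as its own type. The choice of poset (the divisors of 12 ordered by divisibility) and all function names are assumptions made for illustration only. The code checks coassociativity on the top interval and the inequality $l([\{0_P, x\}]) + l([\{x, 1_P\}]) \le l(P)$.

```python
from collections import Counter

N = 12
POINTS = [d for d in range(1, N + 1) if N % d == 0]   # divisors of 12

def leq(x, y):
    """x <= y in the divisibility order."""
    return y % x == 0

def coproduct(interval):
    """Formula (3.9) for the interval {x, z}; an interval is the pair (x, z)."""
    x, z = interval
    return Counter(((x, y), (y, z)) for y in POINTS if leq(x, y) and leq(y, z))

def counit(interval):
    """1 on one-point intervals, 0 otherwise."""
    return 1 if interval[0] == interval[1] else 0

def length(interval):
    """Length of the longest chain from x to z."""
    x, z = interval
    steps = [1 + length((y, z))
             for y in POINTS if y != x and leq(x, y) and leq(y, z)]
    return max(steps, default=0)

def left_iterate(interval):
    """(Delta tensor id) applied after Delta, as a Counter of triples."""
    out = Counter()
    for (a, b), m in coproduct(interval).items():
        for (a1, a2), n in coproduct(a).items():
            out[(a1, a2, b)] += m * n
    return out

def right_iterate(interval):
    """(id tensor Delta) applied after Delta, as a Counter of triples."""
    out = Counter()
    for (a, b), m in coproduct(interval).items():
        for (b1, b2), n in coproduct(b).items():
            out[(a, b1, b2)] += m * n
    return out

TOP = (1, N)
print(left_iterate(TOP) == right_iterate(TOP))
print(length(TOP), [length(a) + length(b) for a, b in coproduct(TOP)])
```

Since the divisor lattice is graded, the inequality is saturated here: every term of the coproduct of the top interval has total length equal to the length of that interval.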
Suppose furthermore that there is an order-compatible equivalence relation ∼ on a hereditary family 𝒫 satisfying: (i) if P ∼ Q, then P × R ∼ Q × R and R × P ∼ R × Q for all R ∈ 𝒫. (ii) if Q has just one element, then P × Q ∼ P ∼ Q × P. In particular, the second of these conditions implies that all one-point intervals are of the same type (hereinafter denoted by 1). If they both hold, then C(𝒫∼) turns out to be a connected bialgebra, with the product induced by the Cartesian product
of intervals. In this context, types that are not (nontrivial) Cartesian products are called indecomposables. The set of indecomposable types is denoted $\mathcal{P}^{\circ}_{\sim}$.

Theorem 3.7. Let us rebaptize C(𝒫∼) as H(𝒫∼). In view of our remarks in Sec. 1.5 it is in fact a Hopf algebra, called an incidence Hopf algebra and a formula for the antipode is
$$S_{DS}[P] = \sum_{C \in \mathrm{Ch}([P])} (-1)^{l(C)}\, \Omega(C). \qquad (3.10)$$

Proof. We follow our argument of [17] for the Hopf algebras of Feynman graphs, which translates almost verbatim into general incidence algebras theory, whereby it is directly checked that $S_{DS}$ is an inverse of id under convolution. By definition
$$\begin{aligned}
(S_{DS} * \mathrm{id})[P] &= \sum_{x \in P} S_{DS}([\{0_P, x\}])\, [\{x, 1_P\}] = S_{DS}([P]) + \sum_{0_P \le x < 1_P} S_{DS}([\{0_P, x\}])\, [\{x, 1_P\}] \\
&= S_{DS}([P]) + \sum_{0_P \le x < 1_P}\; \sum_{D \in \mathrm{Ch}(\{0_P, x\})} (-1)^{l(D)}\, \Omega(D)\, [\{x, 1_P\}].
\end{aligned}$$
Now, if $D \in \mathrm{Ch}(\{0_P, x\})$, say $D = x_0 < x_1 < \cdots < x_n = x$, then $C = x_0 < x_1 < \cdots < x_n < x_{n+1} = 1_P \in \mathrm{Ch}(P)$. Moreover,
$$\Omega(C) = \Omega(D)\, [\{x, 1_P\}] \quad \text{and} \quad l(C) = l(D) + 1. \qquad (3.11)$$
Reciprocally, given a chain $C = x_0 < x_1 < \cdots < x_{n+1} \in \mathrm{Ch}(P)$, then $D = x_0 < x_1 < \cdots < x_n =: x \in \mathrm{Ch}(\{0_P, x\})$ and (3.11) holds. Therefore,
$$S_{DS} * \mathrm{id}([P]) = S_{DS}([P]) - \sum_{C \in \mathrm{Ch}(P)} (-1)^{l(C)}\, \Omega(C) = 0 = u\eta([P]),$$
in other words, $S_{DS}$ is a left inverse for id and therefore, it is an antipode.

It is still possible to have a Hopf algebra, albeit not a connected one, by weakening condition (ii) above: one requires that a neutral element 1 exist in 𝒫 such that P ∼ P × 1 ∼ 1 × P for all P ∈ 𝒫 and that all P ∈ 𝒫 with |P| = 1 be invertible. By now the reader should not be surprised to find that the expansions for $S_{DS}$ and $S_G$ are identical: one formally can reproduce exactly the argument of Theorem 3.2 to obtain
$$m^{k}\, {\Delta'}^{k}([P]) = \sum_{0_P = x_0 < x_1 < \cdots < x_{k+1} = 1_P}\; \prod_{i=1}^{k+1} [\{x_{i-1}, x_i\}] = \sum_{C \in \mathrm{Ch}_k(P)} \Omega(C).$$
In summary, the (correct version of) the Dyson–Salam scheme for renormalization is analogous to the explicit expression for the antipode (3.10), typical of incidence Hopf algebras.
The dual convolution algebra H*(𝒫∼) of H(𝒫∼) is called the incidence algebra of 𝒫 modulo ∼ (unfortunately the standard terminology is a bit confusing). It can be identified to the set of maps from 𝒫∼ to C; and the group of multiplicative functions $\mathrm{Hom}_{\mathrm{alg}}(H(\mathcal{P}_\sim), \mathbb{C})$ can be identified with the set of functions whose domain is the set of indecomposable types. Explicitly,
$$\langle f * g, [P] \rangle = \sum_{x \in P} \langle f, [\{0_P, x\}] \rangle\, \langle g, [\{x, 1_P\}] \rangle.$$
In this context, the classical theory of the Möbius function is reformulated as follows. Incidence Hopf algebras come to the world with a distinguished character: the zeta function ζ of 𝒫, given by ⟨ζ, [P]⟩ = 1 for all types; it is patently multiplicative. Its convolution inverse µ is the Möbius character or Möbius function of posets [71]. Therefore, µ = ζS and different formulae for the antipode S provide different ways to compute µ. For instance, (3.10) gives $\mu([P]) = -1 + c_2 - c_3 + \cdots$, where $c_n$ is the number of chains of length n. Rota defined in [71] the Euler characteristic of P as E([P]) := 1 + µ([P]) and related it with the classical Euler characteristic in a suitable homology theory associated to P.

As for the examples, if one considers the family 𝒫 of finite Boolean algebras and let ∼ be isomorphism, it is seen that 𝒫 is hereditary; and if we denote by Y the isomorphism class of the poset of subsets of a one-element set, the associated incidence Hopf algebra is the binomial Hopf algebra with coproduct (1.19). Also, if we take for 𝒫 the family of linearly ordered sets and ∼ the isomorphism relation again, and we let $Y_n$ denote the type of the linearly ordered set of length n, we recover the ladder Hopf algebra H , introduced after Theorem 1.24, as an incidence Hopf algebra.

The similarity of the previous setup to the theory of algebras of Feynman graphs is striking. A dictionary for translating the family of Feynman graphs, reduced by isomorphism classes, into the language of incidence algebras would then seem to be quickly established, purporting to identify ≤ with ⊆ and to show (3.5) as an avatar of (3.9). In order to underline the parallelism, one could try to depart somewhat from the usual notation: indeed, writing ∅, Γ for any $0_P$, $1_P$ respectively, x for $\{0_P, x\}$ and Γ/x for $\{x, 1_P\}$ would make the notations completely parallel.
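Returning for a moment to the Möbius character: the chain expansion µ([P]) = −1 + c₂ − c₃ + ··· can be compared numerically with the defining recursion of µ as convolution inverse of ζ. The sketch below does this for finite Boolean lattices, where the known value is (−1)ⁿ for rank n. Points are represented as frozensets ordered by inclusion; that representation and the function names are illustrative assumptions.

```python
from itertools import combinations

def boolean_lattice(n):
    """Subsets of {0, ..., n-1}, ordered by inclusion."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]

def chain_counts(points, bottom, top):
    """counts[k] = number of chains bottom = x0 < x1 < ... < xk = top."""
    counts = {}

    def walk(x, k):
        if x == top:
            counts[k] = counts.get(k, 0) + 1
            return
        for y in points:
            if x < y <= top:          # strict step upwards, staying below top
                walk(y, k + 1)

    walk(bottom, 0)
    return counts

def mobius_from_chains(points, bottom, top):
    """Alternating sum over chains, as obtained from the antipode (3.10)."""
    return sum((-1) ** k * c for k, c in chain_counts(points, bottom, top).items())

def mobius_recursive(points, bottom, top):
    """Defining recursion: the sum of mu(bottom, y) over bottom <= y <= top vanishes."""
    if bottom == top:
        return 1
    return -sum(mobius_recursive(points, bottom, y)
                for y in points if bottom <= y < top)

for n in range(5):
    pts = boolean_lattice(n)
    lo, hi = frozenset(), frozenset(range(n))
    print(n, mobius_from_chains(pts, lo, hi), mobius_recursive(pts, lo, hi))
```

For rank 3 the chain counts are c₁ = 1, c₂ = 6, c₃ = 6, so the alternating sum is −1 + 6 − 6 = −1.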
It is also clear that connected in the sense of graph types should translate into indecomposable in the incidence algebra framework. However, the direct attempt to identify Connes–Kreimer algebras and the Rota incidence algebras does not work: H represents a particular class of incidence algebras and it is important to recognize the traits that characterize it inside that framework: one can be badly misled by formal analogies. We show how this can happen by means of an example (see Example 3.9 further down) inside the commutative Hopf algebra of rooted trees. An (undecorated) rooted tree is a connected poset in which each element (vertex) is covered by at most another element: this selects a unique distinguished uncovered
element, called the root. An admissible cut c of a rooted tree T is a subset of its lines such that the path from the root to any other vertex includes at most one line of c. Deleting the cut branches produces several subtrees; the component containing the original root (the trunk) is denoted $R_c(T)$. The remaining branches also form rooted trees, where in each case the new root is the vertex immediately below the deleted line; $P_c(T)$ denotes the juxtaposition of these pruned branches.

Definition 3.8. The (Connes–Kreimer) Hopf algebra of rooted trees $H_R$ is the commutative algebra generated by symbols T, one for each isomorphism class of rooted trees, plus a unit 1 corresponding to the empty tree; the product of trees is written as the juxtaposition of their symbols. A product of trees is called a (rooted) wood (we can hardly call it a forest). The counit $\eta : H_R \to R$ is the linear map defined by $\eta(1) := 1_R$ and $\eta(T_1 T_2 \cdots T_n) = 0$ if $T_1, \ldots, T_n$ are trees. In [42], the coproduct on $H_R$ is defined as a map $\Delta : H_R \to H_R \otimes H_R$ on the generators (extending it as an algebra homomorphism) as follows:
$$\Delta 1 := 1 \otimes 1, \qquad \Delta T := T \otimes 1 + 1 \otimes T + \sum_{c \in C(T)} P_c(T) \otimes R_c(T). \qquad (3.12)$$
Here C(T) is the list of admissible cuts. It should be clear that the ladder Hopf algebra H above is a (cocommutative) Hopf subalgebra of $H_R$: the type of the linearly ordered set of length n is identified to the “stick” with n vertices.

Let the natural growth $N : H_R \to H_R$ be the unique derivation defined, for each tree T with vertices $v_1, \ldots, v_n$, by $N(T) := T_1 + T_2 + \cdots + T_n$, where each $T_j$ is obtained from T by sprouting one new leaf from the vertex $v_j$. In particular, let $t_1$, $t_2$ respectively denote the trees with one and two vertices; $t_{31}$, $t_{32}$ the two trees with three vertices, where the root has respectively fertility 1 and 2; for trees with four vertices, we shall denote by $t_{41}$ the rooted tree where all vertices have fertility 1 (a stick), by $t_{42}$ and $t_{43}$ the four-vertex trees with 2 or 3 outgoing lines from the root (respectively a hook and a claw), and by $t_{44}$ the tree whose root has fertility 1 and whose only vertex with length 1 has fertility 2 (a biped); this being the notation of [24, Chap. 14]. We obtain
$$N(t_1) = t_2, \qquad N^2(t_1) = N(t_2) = t_{31} + t_{32}, \qquad N^3(t_1) = N(t_{31} + t_{32}) = t_{41} + 3t_{42} + t_{43} + t_{44}.$$
From (3.12),
$$\begin{aligned}
\Delta N(t_1) &= N(t_1) \otimes 1 + 1 \otimes N(t_1) + t_1 \otimes t_1, \\
\Delta N^2(t_1) &= N^2(t_1) \otimes 1 + 1 \otimes N^2(t_1) + (t_1^2 + t_2) \otimes t_1 + 3t_1 \otimes t_2, \\
\Delta N^3(t_1) &= N^3(t_1) \otimes 1 + 1 \otimes N^3(t_1) + N^2(t_1) \otimes t_1 + 4t_2 \otimes t_2 + 6t_1 \otimes N^2(t_1) \\
&\quad + 7t_1^2 \otimes t_2 + 3t_1 t_2 \otimes t_1 + t_1^3 \otimes t_1;
\end{aligned}$$
this the reader will find familiar, as these are completely analogous to the expressions for the coproduct of the $\delta_2, \delta_3, \delta_4$ in the Faà di Bruno algebra. This is no accident. The number of times that the tree T with n vertices appears in $N^{n-1}(t_1)$ is known, and the reader can consult for instance [42] or [24, Sec. 14.1] for a proof that indeed with the identifications $t_1 \equiv \delta_1$, $N^{n-1}(t_1) \equiv \delta_n$, the Faà di Bruno algebra F is a Hopf subalgebra of $H_R$. By the way, the identification of the natural growth subalgebra in $H_R$ to F is not unexpected, in that there is a relation between differentials and rooted trees, that goes back to Cayley [72], giving precisely that correspondence; see [73] and also [24, Sec. 14.2] and the more recent [74].

Hopf algebras of rooted trees have extremely important normative properties, into which unfortunately we cannot go here. We refer the reader to [37, 75]. For the relation between the algebra of rooted trees and the shuffle algebra, consult [76].

Example 3.9. Consider now, for instance, the tree $t_{32}$. With this definition, its coproduct is
$$\Delta t_{32} = t_{32} \otimes 1 + 1 \otimes t_{32} + 2t_1 \otimes t_2 + t_1^2 \otimes t_1.$$
This is shown in Fig. 4. Now, we can make every tree into an interval $\{0_T, 1_T\}$ by collating all its leaves to a notional element $0_T$. But even so, if we think of a tree as just such a poset and identify $\{0_T, v\}$ with the tree generated by a cut just above v, and omit the reasonable doubt about the meaning of $\{v, 1_T\}$, this definition would not agree with (3.9): the last term certainly does not appear. The point here is that the interval $\{0_T, 1_T\}$ does not contain all of its convex subposets. What “convex” means is recalled in Sec. 3.4. Once this is clarified, the proof of Zimmermann’s formula in the previous section will be seen to adapt to a large class of incidence Hopf algebras. To do this in some generality, and to dispel the misunderstandings we alluded to above, a new bevy of concepts in incidence Hopf algebra theory is required. Before turning to that, let us obtain F anew, as advertised, in terms of the theory of this section.

[Fig. 4. Subinterval rule ≠ cut rule. The diagram depicts the coproduct of $t_{32}$ displayed in Example 3.9.]
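The cut coproduct (3.12), the natural growth operator and the coefficients quoted above can be checked mechanically. In the sketch below a rooted tree is encoded as the sorted tuple of its subtrees, so that the one-vertex tree is the empty tuple; a wood is a sorted tuple of trees and the unit 1 is the empty wood. This encoding and all function names are assumptions of the example, not notation from [42] or [24].

```python
from collections import Counter

def canon(items):
    """Canonical form: a sorted tuple (used both for trees and for woods)."""
    return tuple(sorted(items))

def grow(tree):
    """Trees obtained from `tree` by sprouting one new leaf at each vertex in turn."""
    out = [canon(tree + ((),))]                       # new leaf at the root
    for i, child in enumerate(tree):
        for grown in grow(child):                     # new leaf somewhere below
            out.append(canon(tree[:i] + (grown,) + tree[i + 1:]))
    return out

def natural_growth(combo):
    """Linear extension of N to a Counter {tree: coefficient}."""
    out = Counter()
    for tree, coeff in combo.items():
        for grown in grow(tree):
            out[grown] += coeff
    return out

def cuts(tree):
    """Pairs (pruned wood, trunk) for all admissible cuts, the empty cut included."""
    partial = [((), ())]              # (pruned trees, children kept on the trunk)
    for child in tree:
        step = []
        for pruned, kept in partial:
            step.append((pruned + (child,), kept))    # cut the line above this child
            for p, r in cuts(child):                  # or keep it and cut further down
                step.append((pruned + p, kept + (r,)))
        partial = step
    return [(canon(p), canon(k)) for p, k in partial]

def coproduct(combo):
    """Formula (3.12), extended linearly; keys are (left wood, right wood)."""
    out = Counter()
    for tree, coeff in combo.items():
        out[((tree,), ())] += coeff                   # T (x) 1
        for pruned, trunk in cuts(tree):              # the empty cut gives 1 (x) T
            out[(pruned, (trunk,))] += coeff
    return out

t1 = ()
t2 = canon([t1])
t31, t32 = canon([t2]), canon([t1, t1])
N3 = natural_growth(natural_growth(natural_growth(Counter({t1: 1}))))
D32 = coproduct(Counter({t32: 1}))
D3 = coproduct(N3)
print(D32[((t1,), (t2,))], D32[((t1, t1), (t1,))])
print(D3[((t1, t1), (t2,))], D3[((t2,), (t2,))], D3[((t1, t2), (t1,))])
```

Here `D32` reproduces the four terms of Example 3.9, and `D3` collects the coefficients 7, 4, 6, 3 and 1 of the last display for $\Delta N^3(t_1)$.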
3.3.1. The Faà di Bruno algebra as an incidence bialgebra

The Faà di Bruno algebra, it could be suspected by now, is an incidence bialgebra, corresponding to the case where the intervals belong to the family of posets that are partitions of finite sets; this was realized in [68]. We start by ordering the partitions of a finite set S. One says that $\{A_1, \ldots, A_n\} = \pi \le \tau = \{B_1, \ldots, B_m\}$, or that π refines τ, if each $A_i$ is contained in some $B_j$. The set of partitions Π(S) of S is an interval, where the biggest element is the partition with just one block and the smallest the partition whose blocks are all singletons.
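The refinement order can be made concrete in a few lines. The sketch below enumerates the partitions of a small set, tests refinement, and confirms that the all-singletons partition and the one-block partition are the smallest and biggest elements; the representation (a partition as a frozenset of frozenset blocks) and the function names are choices made for this illustration.

```python
def set_partitions(items):
    """Yield every partition of `items` as a frozenset of frozenset blocks."""
    items = list(items)
    if not items:
        yield frozenset()
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield part | {frozenset([first])}              # `first` in a block of its own
        for block in part:                             # or joined to an existing block
            yield (part - {block}) | {block | {first}}

def refines(pi, tau):
    """pi <= tau: every block of pi is contained in some block of tau."""
    return all(any(a <= b for b in tau) for a in pi)

S = [1, 2, 3]
PARTS = list(set_partitions(S))
BOTTOM = frozenset(frozenset([x]) for x in S)          # all singletons
TOP = frozenset([frozenset(S)])                        # a single block

print(len(PARTS))
print(all(refines(BOTTOM, p) and refines(p, TOP) for p in PARTS))

# Two partitions with two blocks each are incomparable under refinement.
P12 = frozenset([frozenset([1, 2]), frozenset([3])])
P13 = frozenset([frozenset([1, 3]), frozenset([2])])
print(refines(P12, P13), refines(P13, P12))
```

The incomparable pair at the end shows that Π(S) is a genuine partial order rather than a chain as soon as |S| ≥ 3.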
Proposition 3.10. The subinterval {π, τ} of Π(S) is isomorphic to the poset $\Pi_1^{\lambda_1} \times \cdots \times \Pi_k^{\lambda_k} \cdots$, where $\lambda_j$ is the number of blocks of τ that are the union of exactly j blocks of π (so $\sum_j \lambda_j = |\tau|$ and $\sum_j j\lambda_j = |\pi|$).

Proof. Let $B^1_1, \ldots, B^1_{\lambda_1}$ be the blocks of π that are also blocks of τ, and for each integer k, let $B^{k_1}_1, \ldots, B^{k_k}_1, \ldots, B^{k_1}_{\lambda_k}, \ldots, B^{k_k}_{\lambda_k}$ be the collections of k blocks of π that produce a block of τ. Clearly
$$\Pi(\{B^1_1\}) \times \cdots \times \Pi(\{B^{2_1}_1, B^{2_2}_1\}) \times \cdots \times \Pi(\{B^{k_1}_{\lambda_k}, \ldots, B^{k_k}_{\lambda_k}\}) \times \cdots \;\cong\; \Pi_1^{\lambda_1} \times \cdots \times \Pi_k^{\lambda_k} \cdots. \qquad (3.13)$$
If σ ∈ {π, τ}, then σ is obtained by dividing some of the blocks of τ into pieces that are unions of blocks of π, which amounts to taking a partition of some of the products on the right-hand side of (3.13).

It is natural therefore to assign to each interval {π, τ} the sequence $\lambda = (1^{\lambda_1}, \ldots, k^{\lambda_k}, \ldots)$, of which only finitely many components are non-zero, and to declare two intervals in the family 𝒫 = { Π(S) : S is a finite set } to be equivalent when the corresponding vectors are equal; there are of course an infinite number of intervals in each equivalence class. It is immediate that 𝒫, with this equivalence relation ∼, is interval closed and hereditary. Equivalently, one can think of the poset of finite partitions of a countable set (such that only one block is infinite) ordered by refinement. Furthermore, from the correspondences
$$[\{\pi, \tau\}] \;\leftrightarrow\; \lambda \;\leftrightarrow\; x_1^{\lambda_1} x_2^{\lambda_2} \cdots x_k^{\lambda_k} \cdots,$$
it is also natural to declare 𝒫∼ isomorphic, as an algebra, to the algebra of polynomials of infinitely many variables $\mathbb{C}[x_1, \ldots, x_k, \ldots]$. To describe the associated coproduct explicitly, it is enough to find what its action is on the indecomposable types $[\Pi_n]$. If $[\{\pi, \tau\}] = [\Pi_n]$ (which corresponds to the vector (0, . . . , 0, 1, 0, . . .) with the 1 in the nth place, or to the polynomial $x_n$) then π has n blocks and τ just one. Moreover, if σ ∈ {π, τ}, then [{σ, τ}] corresponds to $x_k$ for some k ≤ n, whereas [{π, σ}] goes with a vector α satisfying $\alpha_1 + \cdots + \alpha_n = k$ and $\alpha_1 + 2\alpha_2 + \cdots + n\alpha_n = n$. We conclude from (3.9) that
$$\Delta x_n = \sum_{k=1}^{n} \sum_{\alpha} \binom{n}{\alpha; k}\, x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n} \otimes x_k,$$
and we have recovered (2.6), with the identification $a_n \equiv [\Pi_n]$. Now, there is the proviso $[\Pi_1] = 1$, implicit in the main construction of Sec. 2.1. In consequence, F = H({ Π(S) : S is a finite set }∼).

Several final remarks on Faà di Bruno algebras are in order. A variant of the construction of F allows one to obtain an enlarged Faà di Bruno algebra, that we shall denote by $F_{\mathrm{enl}}$. For that, we adjoin to $\mathbb{C}[x_2, \ldots, x_k, \ldots]$ the
group-like elements $x_1$ and $x_1^{-1}$. Obviously, the coproduct formula (2.6) holds and the antipode generalizes as follows:
$$S_{\mathrm{enl}}\, x_n = x_1^{-n} \sum_{k=1}^{n-1} (-1)^k\, \frac{(n-1+k)!}{(n-1)!}\, B_{n-1,k}\!\left(\frac{x_2 x_1^{-1}}{2}, \frac{x_3 x_1^{-1}}{3}, \ldots\right),$$
yielding (together with the discussion in that section) formula (2.13) of Sec. 2.2. This can be shown to arise from a variation of the chain formula to take into account the one-point segments.

The following is taken from [19]. A (multi)colored set is a finite set X with a map θ : X → {1, . . . , N}. The color of x ∈ X is the value of θ(x). If $X_r := \{ x \in X : \theta(x) = r \}$, the size |X| of X is the row vector $(|X_1|, \ldots, |X_N|)$. A colored partition of a colored set X is a partition of X whose set of blocks is also colored, with the condition θ({x}) = θ(x) for singletons. In what follows, take N = 2 for simplicity of notation. Let $|X| = (n_1, n_2)$. Colored partitions form a poset $\Pi_{n_1,n_2}$, with π ≤ ρ if π ≤ ρ as partitions and, for each block B of π which is also a block of ρ, $\theta_\pi(B) = \theta_\rho(B)$. We banish from the family of posets one of the maximal elements $1_X$, so there are the two families $\Pi^r_{n_1,n_2}$ with r = 1 or 2 according to which maximal element is kept. On the hereditary classes of segments of such colored partition posets under the relation of color-isomorphism, it is possible to define a Hopf algebra structure called the colored Faà di Bruno Hopf algebra F(N). There is a corresponding anti-isomorphism between the groups (by composition) of complex formal series in two variables of the form
$$f^r(x_1, x_2) = x_r + \sum_{n_1 + n_2 > 1} f^r_{n_1 n_2}\, \frac{x_1^{n_1} x_2^{n_2}}{n_1!\, n_2!} \qquad (3.14)$$
and the group of multiplicative functions f on colored Faà di Bruno Hopf algebras, given by $f^r_{n_1 n_2} := \langle f, [\Pi^r_{n_1 n_2}] \rangle$. The antipode on F(N) provides Lagrange reversion in several variables. We refrain from going into that, but refer the reader to [77] for a classical kind of proof and to [19, 43] in the spirit of this review.

For the colored Faà di Bruno algebras, $G(F(N)^\circ)$ reproduces the opposite $G^{\mathrm{opp}}_N$ of the formal diffeomorphism group in N variables; for undecorated trees, the dual group was identified by Brouder as the Butcher group of Runge–Kutta methods [73]; the series of trees are composed by appropriate grafting. The dual group $G(H^\circ)$ of H for a quantum field theory has been termed diffeographism group by Connes and Kreimer. The name is entirely appropriate, as one should regard also the apparatus of Feynman graphs as a (very sophisticated) approximation machinery for the computation of (quantum corrections to) the coupling constants of a physical theory. In view of the “main theorem of renormalization” [78], these are typically given by series like (3.14). In the outstanding paper, a Hopf algebra homomorphism F(N) → H (with the coupling constants as colors) and its dual group morphism $G(H^\circ) \to G^{\mathrm{opp}}_N$ are exhibited. This second morphism is first constructed at the level of the infinitesimal characters and then lifted to the group; the transpose map from F(N) to H coincides with the one obtained when calculating the coupling
constants in terms of Feynman graphs. For simple theories like massless ϕ44 , only are involved. The previous considerathe simple Fa`a di Bruno algebra and Gopp 1 tions authorize us to regard the diffeographism group as the “general renormalization group”. Unfortunately, this is the place where we stop; but we can refer on this point, besides [4], to [79–81], in a different vein to [82];a and to the excellent review of [22]. Schmitt [69] defines a uniform family as a hereditary family P of graded posets together with a relation ∼, giving rise to an incidence Hopf algebra, such that, moreover: • The monoid P is commutative. • If [P ] is indecomposable, then {y, 1P } for y < 1P is indecomposable. ◦ of degree n. • For all n ≥ 1, there is exactly one type in P∼ Obviously the families of posets giving rise, respectively, to the ladder Hopf algebra H and to the Fa` a di Bruno algebra F , are uniform. This allows nice determinant formulae for the antipode [61, 69], that need not concern us now. We may define a quasi-uniform family by dropping the last condition; the colored posets giving rise to the F (N ) do correspond to quasi-uniform families. The algebras HR and H (even more so modulo the considerations in the next section) do correspond to quasi-uniform families as well. 3.4. Distributive lattices and the general Zimmermann formula This section is partly motivated by the closing remarks of [69]. Our goal is to employ the distributive lattice of order ideals associated with a general partially ordered set in order to interpret the combinatorics of cuts in the algebra of rooted trees, and further, to resolve the combinatorics of overlapping divergences. We need some definitions. We work with finite posets. An antichain in a poset P is a subset of P such that any two of its elements are incomparable in P . An order ideal is a subset I that includes all the descendants of its elements, that is, x ∈ I whenever x is smaller than some y ∈ I. 
We include the empty set among the ideals of P. The principal ideal generated by y ∈ P is the ideal $\Lambda_y := \{ x \in P : x \le y \}$. Let A be an antichain; the order ideal generated by A is the subset of P of those elements smaller than some y ∈ A. A subposet K is convex when it contains the intervals associated to any pair of its elements; in particular an interval is convex. Differences of order ideals are also convex.

Definition 3.11. A lattice is a particular class of poset, in which every pair of elements s, t has a greatest lower bound s ∧ t ("meet") and a least upper bound s ∨ t ("join"). That is, there exist, respectively, an element s ∧ t satisfying s ∧ t ≤ s, s ∧ t ≤ t such that any other element with the same property is smaller, and an element s ∨ t satisfying s ∨ t ≥ s, s ∨ t ≥ t such that any other element with the same property is greater. A lattice L is distributive if for any s, t, u in L

$$ s \wedge (t \vee u) = (s \wedge t) \vee (s \wedge u) $$

or equivalently

$$ s \vee (t \wedge u) = (s \vee t) \wedge (s \vee u). $$

Footnote a. This paper came out at an early stage of the subject, with the excellent idea of developing its Lie algebraic aspect in the context of triangular matrix representations; but its treatment of factorization for the Bogoliubov recursive procedure of renormalization was inconsistent.
Distributive lattices are always intervals. Sublattices of distributive lattices are distributive. The collection of all subsets of any set is a lattice, with meet s ∧ t := s ∩ t and join s ∨ t := s ∪ t. In this example, these are the usual distributive laws between unions and intersections. The sets of (finite) partitions of finite or countable sets are lattices, but already $\Pi_3$ is not distributive.

The main example of distributive lattices is the poset $J_P$ of ideals of any finite poset P (ordered by inclusion). In fact, it is the only example. An important theorem by Birkhoff [18, Sec. 3.4] states that for every finite distributive lattice L, there exists a unique (up to isomorphism) poset P such that $L \cong J_P$. For instance, this correspondence sends an antichain A with n elements into the set of its subsets, ordered by inclusion, that is, the Boolean lattice of rank n. In general, the set P can be taken as the subset of join irreducible elements of L, that is, those elements s that cannot be written as s = t ∨ u with t < s and u < s. An order ideal of P is join irreducible in $J_P$ iff it is a principal order ideal. Thus there is a one-to-one correspondence $\Lambda_y \leftrightarrow y$ between the join irreducibles of $J_P$ and the elements of P.

A poset P is connected if it cannot be written as the disjoint union of two nontrivial posets. In other words, given any two elements x, y ∈ P, one can find a sequence of elements $x_0 = x, x_1, \ldots, x_{n-1}, x_n = y$ such that any two successive elements of the sequence are related by ≤ or ≥. Now, to the family of all finite posets, or to any subfamily $\mathcal{P}$ of it closed under the formation of disjoint unions and containing all the convex subsets of its elements, we can associate the new family of posets $J_{\mathcal{P}} := \{ J_P : P \in \mathcal{P} \}$. An interval $\{I, I'\}$ in $J_P$ is isomorphic to $J_{I' \setminus I}$, where $I' \setminus I$ is regarded as a subposet of P.
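The correspondence between a finite poset and its lattice of ideals can be explored by brute force on small examples. The following is only a minimal sketch: the function names and the three-element example poset (a root a above two leaves b and c) are our own illustrative choices, and the order relation is assumed to be supplied already transitively closed. Since the ideals turn out to be closed under union and intersection, distributivity is inherited from the lattice of subsets.

```python
from itertools import combinations

def order_ideals(elements, lt):
    """Return all order ideals (down-sets) of a finite poset.

    `lt` is the set of strict relations (x, y), meaning x < y, and is
    assumed to be transitively closed.
    """
    ideals = []
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            s = frozenset(subset)
            # s is an ideal when y in s and x < y force x in s
            if all(x in s for (x, y) in lt if y in s):
                ideals.append(s)
    return ideals

def join_irreducibles(ideals):
    """Nonempty ideals that are not the union of two strictly smaller ideals."""
    result = []
    for s in ideals:
        smaller = [t for t in ideals if t < s]
        if s and not any(t | u == s for t in smaller for u in smaller):
            result.append(s)
    return result

def principal_ideal(y, elements, lt):
    """Lambda_y = {x in P : x <= y}."""
    return frozenset(x for x in elements if x == y or (x, y) in lt)

# Hypothetical example: a root a lying above two leaves b and c.
elements = ["a", "b", "c"]
lt = {("b", "a"), ("c", "a")}

J = order_ideals(elements, lt)
# J_P is closed under meet (intersection) and join (union)
closed = all((s & t) in J and (s | t) in J for s in J for t in J)
# join irreducibles of J_P versus principal ideals of P
irreducible = set(join_irreducibles(J))
principal = {principal_ideal(y, elements, lt) for y in elements}
```

In this example J has five elements (the empty set, each leaf, both leaves, and the whole poset), and the join irreducibles coincide with the three principal ideals, as the text asserts in general.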
It is easy to see that distributive lattices are always graded, with the degree or "rank" of $J_P$ being precisely the cardinality |P| of P. Consider the isomorphism equivalence relation both in $\mathcal{P}$ and $J_{\mathcal{P}}$. Since $J_{P \cup Q} \cong J_P \times J_Q$, the set of indecomposable types $J_{\mathcal{P}}^{\circ}$ of $J_{\mathcal{P}}$ is precisely the set of isomorphism classes of connected posets of $\mathcal{P}$. Furthermore, by (3.9), the coproduct in the incidence Hopf algebra $H(J_{\mathcal{P}\sim})$ is given by

$$ \Delta[J_P] = \sum_{I \in J_P} [\{0_{J_P}, I\}] \otimes [\{I, 1_{J_P}\}] = \sum_{I \in J_P} [\{\emptyset, I\}] \otimes [\{I, P\}] = \sum_{I \in J_P} [J_I] \otimes [J_{P \setminus I}]. $$

(Note that since both I and P \ I are convex, their types belong to $\mathcal{P}_{\sim}$.) Motivated by all this, we introduce a new commutative Hopf algebra structure in $\mathcal{P}_{\sim}$, defined by the product

$$ P\,Q = Q\,P := P \cup Q \tag{3.15} $$

and the coproduct

$$ \Delta[P] := \sum_{I \in J_P} [I] \otimes [P \setminus I], \tag{3.16} $$
with the obvious unit and counit. By construction, this Hopf algebra is isomorphic to $H(J_{\mathcal{P}\sim})$. With some abuse of notation, in the remainder of this section, we call $H(\mathcal{P}_{\sim})$ the Hopf algebra given by (3.15) and (3.16). In this setting, a chain C of P is defined by a chain of $J_P$, that is, a sequence of order ideals $\emptyset = I_0 \subsetneq I_1 \subsetneq \cdots \subsetneq I_n = P$, and

$$ \Omega(C) = [\{I_0, I_1\}\{I_1, I_2\} \cdots \{I_{n-1}, I_n\}] = [J_{I_1 \setminus I_0}\, J_{I_2 \setminus I_1} \cdots J_{I_n \setminus I_{n-1}}]. $$

When C is regarded as a subposet of P, we write $\Omega(C) = [I_1 \setminus I_0][I_2 \setminus I_1] \cdots [I_n \setminus I_{n-1}]$. By Proposition 3.7, the antipode in $H(\mathcal{P}_{\sim})$ is given by

$$ S_{DS}[P] = \sum_{C \in \mathrm{Ch}([P])} (-1)^{l(C)}\, \Omega(C), $$

where Ch([P]) denotes the set of chains of [P].

In complete analogy to the Hopf algebra of Feynman graphs, we consider the following.

Definition 3.12. A forest F of P is a collection of connected subposets of P such that if $I_1$ and $I_2$ are in F, then either $I_1 \cap I_2 = \emptyset$ or $I_1 \subset I_2$ or $I_2 \subset I_1$. If I ∈ F, then $I'$ is a predecessor of I in F if there is no $I'' \in F$ such that $I' \subsetneq I'' \subsetneq I$, and we denote by $\tilde{I}$ the disjoint union of all predecessors of I in F. As in Sec. 3.2, we define

$$ \Theta(F) = \prod_{I \in F \cup \{P\}} [I \setminus \tilde{I}]. $$
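The coproduct (3.16) is a sum over order ideals, so its terms can be listed mechanically for a small poset. The sketch below is illustrative only: the helper names are ours, the poset is a hypothetical stick with three vertices c < b < a, and terms are returned as pairs of vertex sets rather than as isomorphism types.

```python
from itertools import combinations

def order_ideals(elements, lt):
    """All down-sets of the poset; lt holds the strict relations (x, y), x < y,
    and is assumed transitively closed."""
    ideals = []
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            s = frozenset(subset)
            if all(x in s for (x, y) in lt if y in s):
                ideals.append(s)
    return ideals

def coproduct_terms(elements, lt):
    """Pairs (I, P \\ I), one for each order ideal I, mirroring (3.16)."""
    full = frozenset(elements)
    return [(sorted(I), sorted(full - I)) for I in order_ideals(elements, lt)]

# Hypothetical example: the stick c < b < a.
stick = ["a", "b", "c"]
stick_lt = {("c", "b"), ("b", "a"), ("c", "a")}
terms = coproduct_terms(stick, stick_lt)
```

A stick with n vertices has exactly n + 1 ideals, so its coproduct has n + 1 terms, each pairing a lower stick with the complementary upper stick.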
We come to the main result: a general formula of the Zimmermann type for the antipode of the incidence Hopf algebras $H(\mathcal{P}_{\sim})$.

Theorem 3.13. Let $\mathcal{P}$ be a family of posets that contains all the convex subsets of its elements and is closed under the formation of disjoint unions, and consider on $\mathcal{P}$ the isomorphism equivalence relation. For any $[P] \in H(\mathcal{P}_{\sim})$, let $\mathcal{F}([P])$ be the set of all forests of [P]. Then

$$ S([P]) = S_Z([P]) := \sum_{F \in \mathcal{F}([P])} (-1)^{d(F)}\, \Theta(F). $$
Proof. Our proof of Theorem 3.5, and its preliminaries, just go through with no more than notational change and there is no point in repeating them. We turn to the examples. In order to obtain (3.12) from the theory of incidence Hopf algebras, one can proceed as follows: consider the family W of posets
constituted by all rooted woods; consider $J_W := \{ J_P : P \in W \}$ modulo isomorphism. Then $J_W$ is hereditary and we proceed with the construction of the incidence algebra associated to it. The indecomposable types are precisely the trees and $H(W_{\sim}) \simeq H(J_{W\sim})$ now denotes the polynomial algebra on them with the coproduct given above. This is seen to reproduce the good coproduct (3.12). For instance, in Example 3.9, as well as the trivial ideals for t32, there are the ideals given by each leaf, and both leaves together. In conclusion, $H(W_{\sim}) \simeq H_R$. Clearly $H_R$ is a quasi-uniform incidence algebra. Note that if a tree t is a stick with n vertices, then $J_t$ is a stick with n + 1 vertices.

In this way, we are able to complement the graphic description of rooted trees by the Hasse diagrams of their distributive lattices of ideals (the Hasse diagram of a finite poset P is drawn by representing the elements of P by vertices and the cover relations by edges [83]). We illustrate with the accompanying figures the correspondence of (incidence Hopf algebras of) distributive lattices to (the Hopf algebra of) rooted trees, up to four vertices. With a bit of practice, one is able to read quickly the lattice paths. For instance, in Fig. 6, one sees that there are 12 chains and 8 forests associated to the tree t42. There is no particular advantage in computing the antipode in this way. However, on other matters, the distributive lattice viewpoint is very helpful.

For bialgebras of Feynman graphs, things are just a tad more difficult. We also illustrate with a few figures the correspondence of incidence Hopf algebras to some Feynman graphs for the four-point function in $\varphi^4_4$ theory, up to three loops. The reader will notice that Boolean lattices (and thus, antichains) must be ascribed to articulate (one-vertex reducible) diagrams in $\varphi^4_4$ theory. This is in consonance with the fact, noted in Sec. 3.2, that renormalization factorizes for such graphs. The case of the cat's-eye diagram (Fig. 10) illustrates the possibility, for overlapping divergences, of choosing a subposet of $J_P$, for P the divergence poset. Note the grading by length of Feynman graphs (that for connected graphs coincides as a filtering with the filtering by depth).

Fig. 5. The distributive lattice for the tree t31.
Fig. 6. The distributive lattice for the tree t42.
Fig. 7. The distributive lattice for the tree t43.
Fig. 8. The distributive lattice for the tree t44.
Fig. 9. The distributive lattice for the feeding-shark diagram.

In conclusion for this section, we have understood the question on the relation between the Hopf algebras of renormalization of Connes and Kreimer and the incidence Hopf algebras of Rota, Schmitt and others, in terms of the incidence Hopf algebra on the family of (sub)poset ideals $J_{\mathcal{P}}$, where $\mathcal{P}$ denotes the family of posets made of Feynman graphs and subgraphs pertaining to a given model theory.

If in the antipode formulae for connected elements T of $H_R$, one formally substitutes 1 for all trees, one obtains zero whenever $T \neq t_1$. Concretely, the zeta function on the trees is the character taking the value 1 on every tree (i.e. the "geometric series" element of $G(H_T^{\circ})$, see [74]), and the Möbius function $\zeta^{*-1}$ sends 1 to 1, $t_1$
to −1 and any other tree to zero. This reflects the well-known fact that the Möbius function of any distributive lattice J is zero, except when J is the Boolean algebra of rank n, for which $\zeta^{*-1}(J) = (-1)^n$; whereby on $H(\mathcal{P}_{\sim})$ the Möbius function vanishes except for antichains. For the latter, $\zeta^{*-1}(A) = (-1)^{|A|}$ holds.

Fig. 10. The distributive lattice for the cat's-eye diagram.

The formula of Zimmermann is also valid under more general circumstances, for instance for the Faà di Bruno algebra, although the partition lattices are not distributive. What is needed is a good description of the intervals in the algebra in terms of indecomposable types. This occurs in the Faà di Bruno algebra: see Proposition 3.10. The expression of $S_Z$ then leads to (2.10); this was found in [19], as repeatedly indicated. After the first version of our paper appeared, in an interesting development, it has been found to work for the free Faà di Bruno algebra [84] as well. In our present framework, it is enough to remark that, as discussed in Sec. 2.3, the Faà di Bruno algebra F is a Hopf subalgebra of $H_R$, to which the distributive lattice paradigm applies.
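The statement on the Möbius function of $J_P$ is easy to test on small posets. The sketch below is an illustration under our own naming: it computes the Möbius function from the empty ideal to the full poset by the standard recursion over the interval, and the three example posets are hypothetical choices (an antichain, a two-vertex stick, and a root above two leaves).

```python
from itertools import combinations

def order_ideals(elements, lt):
    """All down-sets of the poset; lt holds the strict relations (x, y), x < y,
    and is assumed transitively closed."""
    ideals = []
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            s = frozenset(subset)
            if all(x in s for (x, y) in lt if y in s):
                ideals.append(s)
    return ideals

def mobius_bottom_to_top(elements, lt):
    """Moebius function of J_P evaluated on the whole interval {emptyset, P}."""
    ideals = sorted(order_ideals(elements, lt), key=len)
    mu = {}
    for s in ideals:
        # mu(0, 0) = 1 and mu(0, s) = -sum of mu(0, t) over ideals t strictly below s
        mu[s] = 1 if not s else -sum(mu[t] for t in ideals if t < s)
    return mu[frozenset(elements)]

antichain3 = mobius_bottom_to_top(["a", "b", "c"], set())
stick2 = mobius_bottom_to_top(["a", "b"], {("b", "a")})
cherry = mobius_bottom_to_top(["a", "b", "c"], {("b", "a"), ("c", "a")})
```

The antichain with three elements gives $(-1)^3$, while the two posets containing a comparable pair give zero, in agreement with the vanishing of the Möbius function away from antichains.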
4. The General Structure Theorems

4.1. Structure of commutative Hopf algebras I

The title of the section might have been Why the boson Hopf algebra is important. Indeed B(V), one of the simplest commutative and cocommutative connected bialgebras, beyond its role in QFT will be seen here to play a normative role in the theory. We have invoked a couple of times the Milnor–Moore theorem. This is the well-known structure theorem for connected graded cocommutative Hopf algebras, stating that such a Hopf algebra H is necessarily isomorphic, as a Hopf algebra, to U(P(H)). Remember P(U(P(H))) = P(H). For algebras of finite type, from our remarks on the filtering by depth at the end of Sec. 1.6, there is but a short step to a proof of the Milnor–Moore theorem, and the reader is invited to provide such a proof. We refrain from giving any, as that is found in several places, including the book [24] by ourselves; we refer as well to [21] and the original paper [23].

Theorem 4.1. Let H be a graded connected commutative skewgroup of finite type and let V be a graded vector space such that $V \oplus H_+^2 = H_+$. Then there exists a graded algebra isomorphism between B(V) and H.
Proof. One has $H \simeq H'' \simeq U(P(H'))'$ by the Milnor–Moore theorem. If $\{e_i\}$ is a graded basis of $P(H')$, then consider the vector space $V_0$ linearly generated by the dual basis (defined by $f_j(e_i) = \delta_{ij}$). Note that the space algebraically generated by $V_0$ is all of $U(P(H'))'$ (this is essentially the Poincaré–Birkhoff–Witt theorem [85]). Thus, the map $B\iota_{V_0} : B(V_0) \to U(P(H'))'$ lifting $V_0 \ni f_j \mapsto f_j \in U(P(H'))'$ is an algebra isomorphism. One has $V_0 \oplus H_+^2 = H_+$. This is so in view of

$$ H = U(P(H'))' = V_0 \oplus P(H')^{\perp} = \mathbb{C}1 \oplus V_0 \oplus H_+^2, $$

where (1.18) has been used in the last equality. Now, there is nothing very particular about $V_0$. Let V be another supplementary graded vector subspace of $H_+^2$ in $H_+$ and denote as usual by $\mathbb{C}[V]$ the subalgebra of H generated by V. To prove that $\mathbb{C}[V] = H$, we (routinely) show by induction that $\bigoplus_{k=0}^{n} H^{(k)} \subset \mathbb{C}[V]$. Clearly, this is true for n = 0. Assume the claim is true for all k < n and let a be an element of degree n, that can be taken to be homogeneous. One has $a = v + b_1 b_2$ with $v \in V$ and $b_i \in \bigoplus_{k=0}^{n-1} H^{(k)}$. Therefore, $a \in \mathbb{C}[V]$, i.e. V indeed generates H. Moreover $V_0$ and V are isomorphic graded vector spaces. Therefore, B(V), $B(V_0)$ and H are isomorphic graded algebras.

As a consequence, H cannot have divisors of zero. Also, the filtering by depth in connected graded commutative Hopf algebras (discussed extensively at the end of Sec. 1.6) has an associated algebra grading.

It is useful to reflect more on Theorem 4.1. Borrowing the language of topology, we may say that $s : H_+/H_+^2 \to H_+$ is a section of $\pi : H_+ \to H_+/H_+^2$ if s is a right inverse for π. We have arrived at the conclusion that any section of π for a commutative connected graded bialgebra H induces an algebra isomorphism between the free commutative algebra over $H_+/H_+^2$ and H. This is Leray's theorem [23, Theorem 7.5], proved here at least for H of finite type. (By the way, the theorem by Milnor and Moore, together with the Poincaré–Birkhoff–Witt theorem, incorporates the statement that a cocommutative connected graded bialgebra H is isomorphic as a coalgebra with the cofree cocommutative coalgebra $Q_{\mathrm{cocom}}(P(H))$ over the space of its primitive elements, with the isomorphism induced by any left inverse for the inclusion $P(H) \to H$.) Note as well that $H_+/H_+^2 \simeq P(H')^*$ and, as $P(H')$ is a Lie algebra, $H_+/H_+^2$ inherits a natural Lie coalgebra structure, on which unfortunately we cannot dwell.
Theorem 4.1 does not give all we would want, as for instance we are still mostly in the dark about the primitive elements of the Hopf algebra, which seem to play such an important role in the Connes–Kreimer approach to renormalization theory. The Hopf algebra of rooted trees contains many primitive elements, but not being cocommutative, cannot be primitively generated. Also Hopf algebras of Feynman graphs are not primitively generated; and the Faà di Bruno algebra, as we shall soon see, is very far from being primitively generated. At least we know for sure, in view of (1.31), that primitive elements are indecomposable.

The landscape thus emerging can be summarized as follows (look back at Sec. 1.6, and if you can endure the category theory jargon, have a look at [86] too). Let $H(n) := \mathbb{C}\bigl[\bigoplus_{l=1}^{n} H^{(l)}\bigr]$. Each H(n) is a graded subalgebra and $H = \bigcup_{n=1}^{\infty} H(n)$. It is clear that $H(n)^{(n+1)} = H^{(n+1)} \cap H_+^2$. Therefore, $P(H)^{(n+1)} \cap H(n)^{(n+1)} = (0)$ and one can decompose $H^{(n+1)}$ as

$$ H^{(n+1)} = H(n)^{(n+1)} \oplus P(H)^{(n+1)} \oplus W_H^{(n+1)} $$

for some suitable (nonuniquely defined) supplementary vector space $W_H^{(n+1)}$. Furthermore, since H(n) is generated by $\bigoplus_{l=1}^{n} H^{(l)}$ and the compatibility of the coproduct with the grading entails $\Delta H^{(n)} \subset H(n) \otimes H(n)$ for each n, we see that H(n) is an ascending sequence of Hopf algebras. Thus, in any graded connected commutative bialgebra H, there is a graded vector subspace $W_H$, with $W_H^{(0)} = W_H^{(1)} = 0$, such that the unique morphism of graded Hopf algebras $\nu : B(P(H)) \otimes B(W_H) \to H$, extending the morphism

$$ (P(H) \otimes 1) \oplus (1 \otimes W_H) \to P(H) \oplus W_H \subseteq H, \qquad p \otimes 1 + 1 \otimes w \mapsto p + w, \tag{4.1} $$

is a graded algebra isomorphism. The quotient morphism $H \to H/H_+^2$ maps the graded subspace $P(H) + W_H$ isomorphically onto $H/H_+^2$. Moreover, the following statements are equivalent: (1) $q_H : P(H) \to H_+/H_+^2$ is an isomorphism; (2) $W_H = 0$; (3) H is primitively generated; (4) ν is an isomorphism of Hopf algebras. Also, B(P(H)) is precisely the largest cocommutative Hopf subalgebra (subcoalgebra, subbialgebra) of H.

To investigate the primitive elements (and primitivity degree) in important commutative algebras, in the few years after the introduction of the Connes–Kreimer paradigm, a few strategies, due to Broadhurst and Kreimer, Foissy, and Chryssomalakos and coworkers, were made available. The reader wishing to become familiar with the structure of graded connected commutative Hopf algebras ought to consult the original papers [9, 33, 49, 87]. In the last of these, so-called normal coordinate elements are introduced. We indicate the origin of the "normal coordinate" terminology: the ψ-coordinates on $G(H^{\circ})$ defined by $\psi_\delta(\exp \lambda\delta) = \lambda$ are elements of H that can be interpreted geometrically as "normal coordinates". We commend [87] for other geometrical insights, such as the identification of primitive elements in H with closed left invariant 1-forms on $G(H^{\circ})$. However, the computational algorithm in [87] for "normal coordinate elements" (consult for instance their equation (25)) is painfully indirect.

Our own strategy, inspired by work of Reutenauer [36], Patras [88, 89] and Loday [90, 91], is to settle for the knowledge of what in Sec. 1.5 we call the quasiprimitive elements. They constitute a privileged canonical supplementary space of $H_+^2$ in $H_+$. Beautifully, and somewhat unexpectedly, it coincides with the space of normal coordinate elements. This treatment grants us the best grip on the reconstruction of H; it is deeply related to cyclic homology, but we will not go into that here.
As a rule, examples are more instructive than proofs. Therefore, we illustrate our purpose by giving first the "easy" example of the ladder algebra H. The finer structure theorem is proved in the next section. The example of the Faà di Bruno algebra follows afterwards.

Example 4.2. As a corollary of Leray's theorem, if $P(H) \to H_+/H_+^2$ is an isomorphism, then $B\iota_{P(H)}$ is an isomorphism of Hopf algebras, and H is primitively generated. In particular, there will be a polynomial expression for the basis elements of H in terms of primitive elements. Now, in any Hopf algebra associated to a field theory, there exist "ladder" subalgebras of diagrams with only completely nested subgraphs and any of these is isomorphic to the ladder algebra H. The structure of H is completely understood in terms of the isomorphism, already described in Example 2.3, between this bialgebra with its filtering by depth (which in this case gives rise to a Hopf algebra grading) and a symmetric algebra. As an incidence algebra, H comes from a uniform family. Therefore, with the standard grading #, there are exactly p(n) linearly independent elements of degree n. Let us examine the first few stages. For n = 1, there is a basis with only the stick with one vertex $l_1$. For n = 2, there is a basis with $l_2$, the stick with two vertices, and $l_1^2$. Further bases are $l_3, l_1 l_2, l_1^3$ for n = 3; $l_4, l_1 l_3, l_2^2, l_1^2 l_2, l_1^4$ for n = 4; $l_5, l_1 l_4, l_2 l_3, l_1^2 l_3, l_2^2 l_1, l_1^3 l_2, l_1^5$ for n = 5. On the other hand, there is an infinite number of primitives, given as we know, by the Newton polynomials in the sticks, one at each #-order. It must be so because there is one indecomposable at each #-order. The first few are:

$$ \begin{aligned}
p_1 &= l_1, \qquad p_2 = l_2 - \tfrac12 l_1^2, \qquad p_3 = l_3 - l_1 l_2 + \tfrac13 l_1^3, \\
p_4 &= l_4 - l_1 l_3 - \tfrac12 l_2^2 + l_1^2 l_2 - \tfrac14 l_1^4, \\
p_5 &= l_5 - l_1 l_4 - l_2 l_3 + l_1^2 l_3 + l_1 l_2^2 - l_1^3 l_2 + \tfrac15 l_1^5.
\end{aligned} $$

Following [9], we now introduce in the Hopf algebra H the dimension $h_{n,k}$ of the space of homogeneous elements a such that #(a) = n and δ(a) = k. We know that if a is not a scalar, 1 ≤ k ≤ n holds. We claimed already $h_{n,1} = 1$ for all n. And certainly $h_{n,n} \ge 1$, as for instance $l_n$ has depth n; in fact, $h_{n,n} = 1$. To explore farther, in analogy to the definition of the antipode as a geometric series, we introduce

$$ \log^* \mathrm{id} := \sum_{k \ge 1} \frac{(-1)^{k-1}}{k} (\mathrm{id} - u\eta)^{*k} = (\mathrm{id} - u\eta) - \frac{(\mathrm{id} - u\eta)^{*2}}{2} + \frac{(\mathrm{id} - u\eta)^{*3}}{3} - \cdots, $$

which, by the argument used in Proposition 1.15, is a finite sum in $H_+$ and its convolution powers. We easily find

$$ \begin{aligned}
\log^* \mathrm{id}\; l_2 &= p_2, & \log^* \mathrm{id}\; l_1^2 &= 0, \\
\log^* \mathrm{id}\; l_3 &= p_3, & \log^* \mathrm{id}\; l_1 l_2 &= \log^* \mathrm{id}\; l_1^3 = 0, \\
\log^* \mathrm{id}\; l_4 &= p_4, & \log^* \mathrm{id}\; l_1 l_3 &= \log^* \mathrm{id}\; l_2^2 = \log^* \mathrm{id}\; l_1^2 l_2 = \log^* \mathrm{id}\; l_1^4 = 0,
\end{aligned} \tag{4.2} $$

and so on, as $\log^* \mathrm{id}$ sends the indecomposables into the primitives, and kills products (see Proposition 4.4 below for details). Now

$$ \frac{\log^{*2} \mathrm{id}}{2!}\, p_2 = 0, \qquad \frac{\log^{*2} \mathrm{id}}{2!}\, l_1^2 = l_1^2. $$

This together with the first line in (4.2) means $h_{2,1} = h_{2,2} = 1$; it could not be otherwise. Next,

$$ \frac{\log^{*2} \mathrm{id}}{2!}\, p_3 = 0, \qquad \frac{\log^{*2} \mathrm{id}}{2!}\, l_1 l_2 = l_1 l_2 - \tfrac12 l_1^3 = l_1 p_2, \qquad \frac{\log^{*2} \mathrm{id}}{2!}\, l_1^3 = 0. $$

At order 3, finally:

$$ \frac{\log^{*3} \mathrm{id}}{3!}\, p_3 = 0, \qquad \frac{\log^{*3} \mathrm{id}}{3!}\, l_1 p_2 = 0, \qquad \frac{\log^{*3} \mathrm{id}}{3!}\, l_1^3 = l_1^3. $$

Thus the "natural basis" for $H^{(3)}$ is

$$ \bigl( l_3 - l_1 l_2 + \tfrac13 l_1^3,\; l_1 l_2 - \tfrac12 l_1^3,\; l_1^3 \bigr) = (p_3,\; p_1 p_2,\; p_1^3), $$

with respective depths (primitivity degrees) 1, 2, 3. Thus, $h_{3,1} = h_{3,2} = h_{3,3} = 1$. Similarly, then

$$ \begin{aligned}
\frac{\log^{*2} \mathrm{id}}{2!}\, l_1 l_3 &= l_1 l_3 - l_1^2 l_2 + \tfrac13 l_1^4 = l_1 p_3, &
\frac{\log^{*2} \mathrm{id}}{2!}\, l_2^2 &= l_2^2 - l_1^2 l_2 + \tfrac14 l_1^4 = p_2^2, \\
\frac{\log^{*2} \mathrm{id}}{2!}\, l_1^2 l_2 &= 0, &
\frac{\log^{*2} \mathrm{id}}{2!}\, l_1^4 &= 0.
\end{aligned} $$

Therefore, there is a space of subprimitive elements of dimension 2 in $H^{(4)}$, that is, $h_{4,2} = 2$. Moreover, $\frac{\log^{*3} \mathrm{id}}{3!}$ kills $p_4$, $l_1 p_3$, $p_2^2$ and $l_1^4$. Also,

$$ \frac{\log^{*3} \mathrm{id}}{3!}\, l_1^2 l_2 = l_1^2 l_2 - \tfrac12 l_1^4 = l_1^2 p_2; \qquad \text{finally} \qquad \frac{\log^{*4} \mathrm{id}}{4!}\, l_1^4 = l_1^4. $$

The natural basis for $H^{(4)}$ is:

$$ \bigl( l_4 - l_1 l_3 - \tfrac12 l_2^2 + l_1^2 l_2 - \tfrac14 l_1^4,\; l_1 l_3 - l_1^2 l_2 + \tfrac13 l_1^4,\; l_2^2 - l_1^2 l_2 + \tfrac14 l_1^4,\; l_1^2 l_2 - \tfrac12 l_1^4,\; l_1^4 \bigr) = (p_4,\; p_1 p_3,\; p_2^2,\; p_1^2 p_2,\; p_1^4). $$

Moreover $S p_i = -p_i$ for all i. The reader is invited to examine the next homogeneous subspace. One sees that $h_{n,k}$ for H is the number of partitions of n into k integers. This was proved in [9] using a powerful technique involving Hilbert series. In summary, the algebra of ladder graphs or sticks H offers a case in point for the simplest version of the result in (4.1), for which $W_H = 0$.
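The first identities of Example 4.2 can be checked mechanically. The sketch below is not taken from the paper: it assumes that the ladder coproduct has the form $\Delta l_n = \sum_{k=0}^{n} l_k \otimes l_{n-k}$ with $l_0 = 1$, represents a monomial $l_{n_1} l_{n_2} \cdots$ by the tuple of its indices, and uses function names of our own. Because the algebra is commutative, the k-fold convolution power of id − uη applied to a monomial is obtained by distributing every stick over k tensor slots, discarding the distributions that leave a slot empty, and multiplying the slots together.

```python
from fractions import Fraction
from itertools import product

def weak_compositions(n, k):
    """All k-tuples of non-negative integers summing to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in weak_compositions(n - first, k - 1):
            yield (first,) + rest

def reduced_power(mono, k):
    """(id - u eta)^{*k} on the monomial l_{n1} l_{n2} ... (tuple of indices),
    assuming Delta l_n = sum_k l_k (x) l_{n-k}. Returns {monomial: int}."""
    out = {}
    splits = [list(weak_compositions(n, k)) for n in mono]
    for choice in product(*splits):
        # slot i receives the product of l_{c[i]} over all generators c
        if any(all(c[i] == 0 for c in choice) for i in range(k)):
            continue  # a slot equal to the unit is killed by id - u eta
        m = tuple(sorted((a for c in choice for a in c if a > 0), reverse=True))
        out[m] = out.get(m, 0) + 1
    return out

def log_id(mono):
    """pi_1 = log^* id applied to a monomial, as {monomial: Fraction}."""
    res = {}
    for k in range(1, sum(mono) + 1):
        for m, c in reduced_power(mono, k).items():
            res[m] = res.get(m, Fraction(0)) + Fraction((-1) ** (k - 1) * c, k)
    return {m: c for m, c in res.items() if c != 0}

p2 = log_id((2,))      # expected to reproduce l2 - (1/2) l1^2
p3 = log_id((3,))      # expected to reproduce l3 - l1 l2 + (1/3) l1^3
killed = log_id((2, 1))  # a product, expected to vanish
```

Here the key (2, 1) stands for $l_1 l_2$ and (1, 1, 1) for $l_1^3$; an empty dictionary means zero. Under the stated assumption on the coproduct, the sticks are sent to the Newton polynomials and products are sent to zero, as in (4.2).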
4.2. Structure of commutative Hopf algebras II

Consider, for H a graded connected commutative Hopf algebra, a suitable completion of the tensor product $H \otimes H'$, say $H \mathbin{\bar\otimes} H'$. This is a unital algebra, with product $m \otimes \Delta^t$ and unit 1 ⊗ 1. Now by Leray's Theorem 4.1, our H is a boson algebra over a supplement V of $H_+^2$ in $H_+$. Let A index a basis for V, let $\tilde{A}$ index the linear basis of H consisting of monomials $X_u$ in elements of A and let $Z_u$ denote an element of the dual basis in $H'$. Then the product on $H \mathbin{\bar\otimes} H'$ is given by the double series product:

$$ \Bigl( \sum_{u,v \in \tilde{A}} \alpha_{uv}\, X_u \otimes Z_v \Bigr) \Bigl( \sum_{w,t \in \tilde{A}} \beta_{wt}\, X_w \otimes Z_t \Bigr) := \sum_{u,v,w,t \in \tilde{A}} \alpha_{uv} \beta_{wt}\, X_u X_w \otimes Z_v Z_t. $$

This is done just like in [36] for the shuffle-deconcatenation Hopf algebras of words. The linear embedding $\mathrm{End}\, H \to H \mathbin{\bar\otimes} H'$ given by

$$ f \mapsto \sum_{u \in \tilde{A}} f(X_u) \otimes Z_u, $$

is really a convolution algebra embedding

$$ (\mathrm{End}\, H, *) \to (H \mathbin{\bar\otimes} H',\; m \otimes \Delta^t). $$

Indeed,

$$ \begin{aligned}
\Bigl( \sum_{u \in \tilde{A}} f(X_u) \otimes Z_u \Bigr) \Bigl( \sum_{v \in \tilde{A}} g(X_v) \otimes Z_v \Bigr)
&= \sum_{u,v \in \tilde{A}} f(X_u) g(X_v) \otimes Z_u Z_v \\
&= \sum_{t \in \tilde{A}} \sum_{u,v \in \tilde{A}} f(X_u) g(X_v) \langle Z_u Z_v, X_t \rangle \otimes Z_t \\
&= \sum_{t \in \tilde{A}} \sum_{u,v \in \tilde{A}} f(X_u) g(X_v) \langle Z_u \otimes Z_v, \Delta X_t \rangle \otimes Z_t \\
&= \sum_{t \in \tilde{A}} f * g(X_t) \otimes Z_t.
\end{aligned} \tag{4.3} $$

Notice that the identities uη for convolution and id for composition in End H correspond respectively to

$$ u\eta \mapsto 1 \otimes 1, \qquad \mathrm{id} \mapsto \sum_{u \in \tilde{A}} X_u \otimes Z_u, $$

in the double series formalism.
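On the other hand, let $\tilde{A}_+ := \tilde{A} \setminus 1$. Using the same idea as in (4.3), we get

$$ \begin{aligned}
\log \Bigl( \sum_{u \in \tilde{A}} X_u \otimes Z_u \Bigr)
&:= \sum_{k \ge 1} \frac{(-1)^{k-1}}{k} \Bigl( \sum_{u \in \tilde{A}_+} X_u \otimes Z_u \Bigr)^{k} \\
&= \sum_{k \ge 1} \frac{(-1)^{k-1}}{k} \sum_{u_1, \ldots, u_k \in \tilde{A}_+} X_{u_1} \cdots X_{u_k} \otimes Z_{u_1} \cdots Z_{u_k} \\
&= \sum_{k \ge 1} \frac{(-1)^{k-1}}{k} \sum_{w \in \tilde{A}} \; \sum_{u_1, \ldots, u_k \in \tilde{A}_+} \langle Z_{u_1} \cdots Z_{u_k}, X_w \rangle\, X_{u_1} \cdots X_{u_k} \otimes Z_w.
\end{aligned} $$

Therefore

$$ \log \Bigl( \sum_{u \in \tilde{A}} X_u \otimes Z_u \Bigr) = \sum_{w \in \tilde{A}} \pi_1(X_w) \otimes Z_w, \tag{4.4} $$

where

$$ \pi_1(X_w) := \sum_{k \ge 1} \frac{(-1)^{k-1}}{k} \sum_{u_1, \ldots, u_k \in \tilde{A}_+} \langle Z_{u_1} \cdots Z_{u_k}, X_w \rangle\, X_{u_1} \cdots X_{u_k}. $$

Now, by definition $\pi_1(X_w) = \log^* \mathrm{id}\, X_w$. We moreover consider the endomorphisms

$$ \pi_n := \frac{\pi_1^{*n}}{n!}, $$

so that, by (4.3):

$$ \sum_{w \in A^*} \pi_n(X_w) \otimes Z_w = \frac{1}{n!} \Bigl( \sum_{v \in A^*} \pi_1(X_v) \otimes Z_v \Bigr)^{n}. $$

We may put $\pi_0 := u\eta$. Thus, if a ∈ H is of order n, $\pi_m(a) = 0$ for m > n. Furthermore, for n > 0,

$$ \mathrm{id}^{*l} a = \exp^*\bigl(\log^*(\mathrm{id}^{*l})\bigr) a = \sum_{m=1}^{n} \frac{\bigl(\log^*(\mathrm{id}^{*l})\bigr)^{*m}}{m!}\, a = \sum_{m=1}^{n} l^m\, \frac{(\log^* \mathrm{id})^{*m}}{m!}\, a = \sum_{m=1}^{n} l^m\, \pi_m(a). \tag{4.5} $$

In particular, $\mathrm{id} = \sum_{m \ge 0} \pi_m$.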
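As a quick sanity check of (4.5), one may take the stick $l_2$ of the ladder algebra of Example 4.2. The short computation below is only a sketch: it assumes that the ladder coproduct is $\Delta l_2 = l_2 \otimes 1 + l_1 \otimes l_1 + 1 \otimes l_2$, and it writes N for the convolution exponent (called l in (4.5)) to avoid a clash with the sticks $l_n$. In the N-fold iterated coproduct, $l_2$ either sits in one of the N slots or splits as $l_1 \otimes l_1$ over two of them.

```latex
\begin{align*}
\mathrm{id}^{*N}\, l_2
  &= N\, l_2 + \binom{N}{2}\, l_1^2 \\
  &= N \Bigl( l_2 - \tfrac12\, l_1^2 \Bigr) + N^2 \Bigl( \tfrac12\, l_1^2 \Bigr) \\
  &= N\, \pi_1(l_2) + N^2\, \pi_2(l_2),
\end{align*}
% with \pi_1(l_2) = p_2 and \pi_2(l_2) = \tfrac12 (\pi_1 * \pi_1)(l_2) = \tfrac12\, l_1^2 = \tfrac12\, p_1^2 .
```

Setting N = 1 recovers $l_2 = \pi_1(l_2) + \pi_2(l_2)$, the decomposition of the identity into the projectors on this element.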
Proposition 4.3. For any integers n and k,

$$ \mathrm{id}^{*n}\, \mathrm{id}^{*k} = \mathrm{id}^{*nk} = \mathrm{id}^{*k}\, \mathrm{id}^{*n}. \tag{4.6} $$

Proof. The assertion is certainly true for k = 1 and all integers n, and if it is true for some k and all integers n, then taking into account that id is an algebra homomorphism, (1.3) and the induction hypothesis give

$$ \mathrm{id}^{*n}\, \mathrm{id}^{*(k+1)} = \mathrm{id}^{*n} (\mathrm{id}^{*k} * \mathrm{id}) = \mathrm{id}^{*nk} * \mathrm{id}^{*n} = \mathrm{id}^{*n(k+1)}. $$

Substituting the final expression of (4.5) in (4.6), with very little work one obtains [90, Theorem 2.9.c]

$$ \pi_m \pi_k = \delta_{mk}\, \pi_k. \tag{4.7} $$

In other words, the $\pi_k$ form a family of orthogonal projectors and therefore, the space H has the direct sum decomposition

$$ H = \bigoplus_{n \ge 0} H^{(n)} := \bigoplus_{n \ge 0} \pi_n(H). \tag{4.8} $$

Moreover, from (4.5), $\mathrm{id}^{*l} H^{(n)} = l^n H^{(n)}$, so the $H^{(n)}$ are the common eigenspaces of the operators $\mathrm{id}^{*l}$ with eigenvalues $l^n$. Thus, the decomposition (4.8) turns H into a graded algebra. Indeed, if $a \in H^{(r)}$ and $b \in H^{(s)}$, then

$$ \mathrm{id}^{*l}(ab) = \mathrm{id}^{*l} a\; \mathrm{id}^{*l} b = l^{r+s} (ab), $$

and therefore m sends $H^{(r)} \otimes H^{(s)}$ into $H^{(r+s)}$.

If H were cocommutative, nearly all the previous arguments in this section would go through. For (4.6), one uses (1.4) instead of (1.3) and naturally instead of an algebra grading, one gets a coalgebra grading. In particular, appealing to the identity (1.8) and since obviously P(H) is contained in $\pi_1(H)$, we conclude that $\pi_1(H) = P(H)$ in this case.

Assume again that H is commutative, so $H'$ is cocommutative. Our contention now is that the logarithm kills products.

Proposition 4.4. On a commutative, connected Hopf algebra H, $\pi_1(H_+^2) = 0$ holds.

Proof. Recall first that $H_+^2$ is orthogonal to $P(H')$, by (1.18). Now, to avoid confusion, denote in this proof by $\pi_1'$ the first projector of $H'$. Since $(\mathrm{id} - u\eta)_{H'} = (\mathrm{id} - u\eta)_H^t$, clearly $\pi_1'$ is the transpose of $\pi_1$ and then for an arbitrary u,

$$ \langle \pi_1(X_i X_j), Z_u \rangle = \langle X_i X_j, \pi_1'(Z_u) \rangle = 0, $$
since $\pi_1'(Z_u)$ is primitive. Therefore, $\pi_1(X_i X_j)$ must vanish as claimed. (For the trained field theorist, this is plausible as he knows that logarithms identify "connected" elements.)

The grading associated to $\log^* \mathrm{id}$ does not, in general, coincide with the previous grading. At any rate, because of homogeneity of the convolution powers of id, we get an algebra bigrading of the Hopf algebra, becoming a bialgebra bigrading when H is cocommutative as well.

Now turn to the so-called normal coordinate elements (canonical coordinates of the first kind would be a more appropriate name) introduced in [87] by the definition

$$ \sum_{u \in \tilde{A}} X_u \otimes Z_u =: \exp \Bigl( \sum_{j \in A} \psi_j \otimes Z_j \Bigr). $$

Note that the sum on the right-hand side is only on A (that is, only on "letters", not on all "words"). From this, the authors immediately conclude that any algebra basis element has a canonical decomposition as follows:

$$ \begin{aligned}
X_j &= \psi_j + \sum_{k \ge 2} \frac{\langle Z_{i_1} \cdots Z_{i_k}, X_j \rangle}{k!}\, \psi_{i_1} \cdots \psi_{i_k} \\
&= \psi_j + \sum_{k \ge 2} \frac{\langle Z_{i_1} \otimes \cdots \otimes Z_{i_k}, \Delta^k X_j \rangle}{k!}\, \psi_{i_1} \cdots \psi_{i_k}.
\end{aligned} $$

The second form makes clear that only a finite number of terms intervene, corresponding to sequences $J = (i_1, \ldots, i_k)$ compatible with $X_j$ in the sense that $\langle Z_{i_1} \otimes \cdots \otimes Z_{i_k}, \Delta^k X_j \rangle \neq 0$. Even so, to extract $\psi_j$ from this change of infinite basis is a painful business. However, looking back at (4.4) and exponentiating both sides, we see that

$$ \sum_{u \in \tilde{A}} X_u \otimes Z_u = \exp \Bigl( \sum_{w \in \tilde{A}} \pi_1(X_w) \otimes Z_w \Bigr) = \exp \Bigl( \sum_{j \in A} \pi_1(X_j) \otimes Z_j \Bigr); $$

since $\pi_1$ kills products, the sum on the right-hand side extends only over A. Therefore, $\psi_j = \pi_1(X_j)$. All the properties claimed for $H_R$ in [87] are seen as easy corollaries of the properties of the canonical projector $\pi_1$. For instance, the diagonality of the antipode in the $\psi_j$ basis, or quasiprimitivity. Indeed, since $Sg = g^{-1}$ if g is group-like, then

$$ \langle g^{-1}, \psi_j \rangle = \langle Sg, \psi_j \rangle = \langle g, S\psi_j \rangle. $$

But, if

$$ g = \exp \Bigl( \sum_{k \in A} \alpha_k Z_k \Bigr), $$

then $\langle g, \psi_j \rangle = \alpha_j$ and $\langle g^{-1}, \psi_j \rangle = -\alpha_j$ for all j ∈ A. The conclusion that $S\psi_j = -\psi_j$ follows.

For applications in QFT, further investigation of (depth and) quasiprimitivity in commutative algebras of the kind studied in this review seems paramount. Before reexamining our old friend F, we make a skimpy remark on the algebra $H_R$ of rooted trees, much studied by several authors [9, 33, 49, 87]. Even taken as a proxy for the Hopf algebras of QFT, its complexity is staggering. The number of rooted trees with n vertices is given by a famous sequence $r = (r_1 = 1, r_2 = 1, 2, 4, 9, 20, 48, 115, \ldots)$. Then, with the standard grading by number of vertices, clearly $\dim H^{(n)} = r_{n+1}$. There are many primitives, beyond those contained in its ladder Hopf subalgebra H: for instance, $h_{15,1} = 30135$. We wish to remark that the Faà di Bruno formulae are instrumental in finding $h_{n,k}$ in the context of $H_R$ [49] again. The space $W_{H_R}$ starts at level 3, since Fig. 11 exhibits an indecomposable nonprimitive (which has depth 2).
Fig. 11. Nonprimitive combination of nonproduct woods.
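The sequence r quoted above for the number of rooted trees can be regenerated from a classical recurrence, $n\, r_{n+1} = \sum_{k=1}^{n} \bigl( \sum_{d \mid k} d\, r_d \bigr)\, r_{n-k+1}$. The recurrence is not stated in the text and is brought in here only as an illustration; the function name is ours.

```python
def rooted_tree_counts(n_max):
    """Return [r_1, ..., r_{n_max}], the numbers of unlabeled rooted trees
    with 1, ..., n_max vertices, via the classical divisor-sum recurrence."""
    if n_max < 1:
        return []
    r = [0] * (n_max + 1)
    r[1] = 1
    for n in range(1, n_max):
        total = 0
        for k in range(1, n + 1):
            # inner sum over the divisors d of k
            divisor_sum = sum(d * r[d] for d in range(1, k + 1) if k % d == 0)
            total += divisor_sum * r[n - k + 1]
        r[n + 1] = total // n  # the sum is always divisible by n
    return r[1:]

counts = rooted_tree_counts(8)
# dimensions of the first homogeneous subspaces, dim H^(n) = r_{n+1}
dims = counts[: len(counts)]
```

The first eight values agree with the sequence quoted in the text, and since a wood with n vertices becomes a rooted tree with n + 1 vertices when a common root is added, the same list read from its first entry gives $\dim H^{(0)}, \dim H^{(1)}, \ldots$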
Example 4.5. Matters for the Beatus Fa`a di Bruno algebra are rather different from the ladder algebra. Of course dim F (n) = p(n) still holds. Consider F (2) , that is, the linear span of a22 = δ12 and a3 = δ2 + δ12 — remember that with our notation 2 . Now, although a3 in the indecomposable class is #(an ) = n − 1. Plainly, a22 ∈ F+ 2 not primitive, it belongs to F+ ⊕ P (F )(2) , since 3 1 log∗ id a3 = a3 − a22 = δ2 − δ12 =: p2 2 2 (the Schwarzian derivative) is primitive; the telltale sign was τ ∆a3 = ∆a3 . Notice that no more primitives in F , outside the space spanned by p1 := δ1 and p2 , can possibly happen, Eq. (1.18) tells us 2 ⊥ ) . P (F ) = (C1 ⊕ F+
But then from (2.11), a dual basis of F is made of products except for two elements. Therefore, dim P (F ) = 2. Also, the projector log∗ id will produce primitives only on indecomposable classes fulfilling ∆a = τ ∆a; this cannot be contrived here. We obtain 9 1 p˜3 := log∗ id a4 = a4 − 5a2 a3 + a32 = δ3 − 2δ1 δ2 + δ12 2 2 and log∗ id kills products as usual. We know that p˜3 is not primitive, rather it has depth 2, as it can easily be checked. It still is quasiprimitive, in that m∆ p˜3 = 0.
August 29, 2005 18:8 WSPC/148-RMP
968
J070-00246
H. Figueroa & J. M. Gracia-Bond´ıa
The algebra gradings respectively induced by the convolution powers of the logarithm and by the depth filtering do not coincide in this case. We have as well
$$\frac{\log^{*2}\mathrm{id}}{2!}\,a_2a_3 = a_2a_3 - \tfrac{3}{2}a_2^3 = p_1p_2, \qquad \frac{\log^{*2}\mathrm{id}}{2!}\,a_4 = 5\bigl(a_2a_3 - \tfrac{3}{2}a_2^3\bigr), \qquad \frac{\log^{*2}\mathrm{id}}{2!}\,a_2^3 = 0.$$
Finally
$$\frac{\log^{*3}\mathrm{id}}{3!}\,a_2^3 = a_2^3 = p_1^3.$$
A suitable basis for $F^{(3)}$ is then
$$\bigl(a_4 - 5a_2a_3 + \tfrac{9}{2}a_2^3,\; a_2a_3 - \tfrac{3}{2}a_2^3,\; a_2^3\bigr) = \bigl(\delta_3 - 2\delta_1\delta_2 + \tfrac{1}{2}\delta_1^3,\; \delta_1\delta_2 - \tfrac{1}{2}\delta_1^3,\; \delta_1^3\bigr) = (\tilde p_3,\; p_1p_2,\; p_1^3),$$
with respective depths (primitivity degrees) 2, 2, 3. In $F^{(4)}$, we will have the linearly independent elements
$$p_2^2, \qquad p_1\tilde p_3, \qquad p_1^2p_2, \qquad p_1^4,$$
of respective depths 2, 3, 3, 4. We seek a suitable representative for the missing indecomposable element. Proceeding systematically with the projectors $\log^{*k}\mathrm{id}/k!$, we obtain
$$\tilde p_4 := \log^{*}\mathrm{id}\, a_5 = a_5 - \tfrac{15}{2}a_2a_4 - 5a_3^2 + \tfrac{185}{6}a_2^2a_3 - 20a_2^4;$$
next,
$$\frac{\log^{*2}\mathrm{id}}{2!}\,a_3^2 = p_2^2, \qquad \frac{\log^{*2}\mathrm{id}}{2!}\,a_2a_4 = p_1\tilde p_3, \qquad \frac{\log^{*3}\mathrm{id}}{3!}\,a_2^2a_3 = p_1^2p_2, \qquad \frac{\log^{*4}\mathrm{id}}{4!}\,a_2^4 = p_1^4.$$
A suitable basis for $F^{(4)}$ is then
$$(\tilde p_4,\; p_2^2,\; p_1\tilde p_3,\; p_1^2p_2,\; p_1^4).$$
Note that $S(p_1\tilde p_3) = p_1\tilde p_3$, $S(p_2^2) = p_2^2$, $S(p_1^2p_2) = -p_1^2p_2$, $S(\tilde p_4) = -\tilde p_4$; and so in general,
$$S\,\frac{\log^{*n}\mathrm{id}}{n!}\,a = (-)^n\,\frac{\log^{*n}\mathrm{id}}{n!}\,a$$
(we owe this remark to F. Patras). Going back to $\tilde p_4$, one can check quasiprimitivity, and the fact that it has depth 3. The pattern repeats itself: at every order # in the original grading one nonproduct generator of depth (# − 1) is found (Broadhurst and Kreimer give a recipe for it in [9]).

Concerning $W_F$: we know $a_3$ belongs to $F^{(1)(2)} \oplus P(F)^{(2)}$, since $a_3 - \tfrac{3}{2}a_2^2$ is primitive. Therefore, in this case $W_F^{(2)}$ is trivial. Now consider $F^{(3)}$, the linear span of $a_2^3$, $a_2a_3$ and $a_4$. The element $a_4$ cannot be primitively generated and so there is a one-dimensional $W_F^{(3)}$. The same is true for $W_F^{(n)}$, for every $n \geq 3$.

To summarize the general situation: $\operatorname{im}\pi_1$ is the “good” supplementary space for $H_+^2$ we were looking for. The elements $a$ and $\log^{*}\mathrm{id}\,a$ belong to the same class modulo $H_+^2$. The logarithms of the nonproduct elements, all quasiprimitives, provide an eminently suitable algebra basis for $H$. We can think of $H$, as a graded algebra, as the polynomial algebra on the $\log^{*}\mathrm{id}\,a$. Moreover, the projectors behave well in regard to the action of the antipode.
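The low-order identities of Example 4.5 can be spot-checked mechanically. The following Python sketch evaluates both sides of each identity at rational points, which is enough to settle polynomial identities of this degree. It takes $a_2 = \delta_1$ and $a_3 = \delta_2 + \delta_1^2$ from the text; the relation $a_4 = \delta_3 + 3\delta_1\delta_2 + \delta_1^3$ and the expansion $a_4 = \tilde p_3 + 5p_1p_2 + 3p_1^3$ are assumptions of the sketch (the first from reading the $\delta_n$ as Taylor coefficients of $\log\psi'$, the second from adding up the projections), and all function names are ours.

```python
from fractions import Fraction as Fr
from itertools import product


def a_generators(d1, d2, d3):
    """Express a_2, a_3, a_4 through delta_1, delta_2, delta_3.

    a_2 and a_3 are as in the text; the formula for a_4 is an assumption
    of this sketch.
    """
    a2 = d1
    a3 = d2 + d1**2
    a4 = d3 + 3 * d1 * d2 + d1**3
    return a2, a3, a4


def identities_hold(d1, d2, d3):
    """Check the degree-2 and degree-3 identities at one rational point."""
    a2, a3, a4 = a_generators(d1, d2, d3)
    p1 = d1
    p2 = a3 - Fr(3, 2) * a2**2
    p3_tilde = a4 - 5 * a2 * a3 + Fr(9, 2) * a2**3
    return (
        p2 == d2 - Fr(1, 2) * d1**2
        and p3_tilde == d3 - 2 * d1 * d2 + Fr(1, 2) * d1**3
        and a2 * a3 - Fr(3, 2) * a2**3 == p1 * p2
        # assumed expansion of a_4 in the basis (p3_tilde, p1*p2, p1**3)
        and a4 == p3_tilde + 5 * p1 * p2 + 3 * p1**3
    )


def check_on_grid(points=range(-2, 3)):
    """Every identity has degree at most 3 in each variable, so agreement
    on a 5 x 5 x 5 grid implies agreement as polynomials."""
    return all(
        identities_hold(Fr(x), Fr(y), Fr(z))
        for x, y, z in product(points, repeat=3)
    )


if __name__ == "__main__":
    print(check_on_grid())
```

Exact rational arithmetic is used so that the fractions 3/2, 9/2 and 1/2 are compared without rounding.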
4.3. Coda: On twisting and other matters

We provisionally conclude these notes with pointers to two subjects of current interest: Hopf algebras of quantum fields and investigation of “twisted antipodes”. Both are indebted to Rota’s work.

Consider now the theory of the free neutral scalar (for simplicity) field $\phi(x)$. Then the space of quantum observables can be identified with the Fock boson algebra on $V \equiv$ the space of complex solutions of the Klein–Gordon (KG) equation
$$(\Box + m^2)v(x) := \frac{\partial^2 v(x)}{\partial t^2} - \Delta v(x) + m^2 v(x) = 0$$
on Minkowski space $M_4$. The algebra product is just the normal product of fields. Let $D$ be the (Jordan–Pauli) distribution solving the Cauchy problem for the KG equation. The Hilbert space of states is in turn a Fock space, built on the space of real solutions $V_R$ made complex with the help of a complex structure, so that if
$$v_i(x) = \int D(x, y)\,h_i(y)\,d^4y \tag{4.9}$$
for $i = 1, 2$ are real solutions of the KG equation, then
$$\langle v_1 \mid v_2\rangle = \int h_1(x)\,D_+(x, y)\,h_2(y)\,d^4x\,d^4y,$$
where $D_+$ is the standard Wightman function. The projection $h \mapsto v$ to field equation solutions in (4.9) corresponds to dividing out the unrestricted fields by the ideal generated by that equation, and will be implicitly used all the time. The quantum field is a $V$-valued distribution; it is defined by its action [92] by creating and annihilating a particle in the distributional state $D(\cdot - x)$:
$$\phi(x) = \frac{1}{\sqrt{2}}\bigl[a(D(\cdot - x)) + a^{\dagger}(D(\cdot - x))\bigr].$$
This ensures that $(\Box + m^2)\phi(h) := \phi((\Box + m^2)h) = 0$ for any complex smearing function $h$. Note that
$$\langle 0, \phi(h_1)\phi(h_2)\,0\rangle = \langle v_1, v_2\rangle. \tag{4.10}$$
On this concrete boson algebra, we put the compatible cocommutative coalgebra structure already described; the counit $\eta$ is at once identified with the vacuum expectation value $\eta(a) = \langle 0, a\,0\rangle$.

It is instructive to consider the Wick monomials in the field operator $\phi(x)$. With regard to notation, to conform to usual practice, we write $:\!\phi(x_1)\cdots\phi(x_n)\!:$ instead of $\phi(x_1)\vee\cdots\vee\phi(x_n)$. The powers $\phi(x)\vee\phi(x)$ will be denoted simply $\phi^2(x)$ in place of the standard $:\!\phi^2(x)\!:$ (no other powers are defined). For the purpose, one posits
$$\phi^n(x) := \bigl\langle :\!\phi(x)\phi(x_2)\cdots\phi(x_n)\!:,\ \delta(x - x_2)\cdots\delta(x - x_n)\bigr\rangle_{x_2\cdots x_n},$$
or, if one wishes,
$$\phi^n(h) := \bigl\langle :\!\phi(x)\phi(x_2)\cdots\phi(x_n)\!:,\ h(x)\,\delta(x - x_2)\cdots\delta(x - x_n)\bigr\rangle_{x, x_2\cdots x_n}. \tag{4.11}$$
This is known to be well defined. The diagonalization map for Wick monomials can hardly be simpler: from (1.30) and (4.11), it is a couple of easy steps to obtain:
$$\Delta\phi^n(x) = \sum_{i=0}^{n}\binom{n}{i}\,\phi^i(x)\otimes\phi^{n-i}(x),$$
i.e. the comultiplication is the binomial one (as befits honest-to-God monomials). One can alternatively use divided powers $\phi^{(n)} := \phi^n/n!$. More generally then:
$$\Delta^l(\phi^{(n)}(x)) = \sum \phi^{(m_1)}(x)\otimes\cdots\otimes\phi^{(m_{l+1})}(x),$$
with sum over all combinations of $l + 1$ non-negative integers $m_i$ such that $\sum_{i=1}^{l+1} m_i = n$. It is also very easy to check, from (4.11), that
$$\langle\phi^n(x) \mid \phi^m(y)\rangle = \frac{\delta_{nm}}{n!}\,D_+^n(x, y).$$
(Unlike other propagators, $D_\pm$ have powers of all orders.)

From this starting point, reference [7] proceeds by “twisting” the Hopf algebra structure of $B(V)$ by suitable bilinear forms $z$ on $V$. The twisted product is given by
$$a \bullet b = \sum z(a_{(1)}, b_{(1)})\; a_{(2)}\vee b_{(2)}.$$
Setting $z(1, 1) = 1$ is understood. By cocommutativity, we equivalently have
$$a \bullet b = \sum a_{(1)}\vee b_{(1)}\; z(a_{(2)}, b_{(2)}) = \sum a_{(1)}\vee b_{(2)}\; z(a_{(2)}, b_{(1)}) = \sum a_{(2)}\vee b_{(1)}\; z(a_{(1)}, b_{(2)}).$$
Note that there is no restriction on the degrees of $a, b \in B(V)$. The authors consider two different twisted products, respectively corresponding to the operator and the time-ordered products of elements of $B(V)$. The associated bilinear forms are $z(v_1, v_2) = \langle v_1 \mid v_2\rangle$ and the symmetric pairing $(\cdot \mid \cdot)$, given by
$$z(v_1, v_2) = (v_1 \mid v_2) = (v_2 \mid v_1) = \langle 0 \mid T[\phi(h_1)\phi(h_2)]\,0\rangle = \int h_1(x)\,D_F(x, y)\,h_2(y);$$
we recall that the time-ordered product of free fields is defined by
$$T[\phi(x_1)\phi(x_2)] = T[\phi(x_2)\phi(x_1)] = \Theta(t_1 - t_2)\,\phi(x_1)\phi(x_2) + \Theta(t_2 - t_1)\,\phi(x_2)\phi(x_1),$$
where $\Theta$ is the Heaviside function (so, time increases from right to left) and that
$$\langle 0 \mid T[\phi(x_1)\phi(x_2)]\,0\rangle = D_F(x_1 - x_2).$$
The resulting algebra in the first case can be called the Weyl algebra, since the canonical commutation relations
$$\langle 0 \mid [\phi(h_1), \phi(h_2)]\,0\rangle = s(v_1, v_2)$$
are satisfied; where the bilinear form $s$ is given by the integral on the space $V_R$ of solutions
$$s(v_1, v_2) := \int_{\Sigma}(v_1\,\partial_\mu v_2 - v_2\,\partial_\mu v_1)\,d\sigma^{\mu} = \int h_1(x)\,D(x, y)\,h_2(y)\,d^4x\,d^4y,$$
that does not depend on $\Sigma$ itself, and defines a symplectic form, which is complexified in the standard way. In the second case, we obtain a commutative algebra. By use on this algebra of the above indicated comultiplication, recently Mestre and Oeckl have been able to show the relations between the different n-point functions of quantum field theory in a very economical manner [93].

An interesting property, generalizing (4.10), is $\eta(a \bullet b) = z(a, b)$. In effect, by the defining property of $\eta$,
$$\eta(a \bullet b) = \sum \eta(a_{(1)}\vee b_{(1)})\,z(a_{(2)}, b_{(2)}) = \sum \eta(a_{(1)})\,\eta(b_{(1)})\,z(a_{(2)}, b_{(2)}) = z\Bigl(\sum\eta(a_{(1)})\,a_{(2)},\ \sum\eta(b_{(1)})\,b_{(2)}\Bigr) = z(a, b).$$
The bilinear forms used are examples of Laplace pairings (a concept originally introduced by Rota [94]); in turn these are cocycles in a Hopf algebra cohomology. More general cocycles would seem to relate to interacting quantum fields, and to the passage to noncommutative field theory [95]. It is worth noting that the (much more difficult) Hopf algebra cohomology of the noncocommutative Hopf algebras $H_R$ and $H$ underlies Kreimer’s program in the direction of using Hopf algebraic techniques to simplify the calculation of Feynman diagrams: see, for instance, [96].

In summary, the fundamentals of quantum field theory have been coalgebrized. The price is worth paying for the complete automation not only of the twisted product formulae, but of many indispensable calculations in field theory that are not found, or only haphazardly, in textbooks, and otherwise require a substantial amount of combinatorics (in this respect, the splendid little book by Caianello [97] still stands out). The Hopf algebra approach frees us from the perennial, tiresome recourse to decompositions in ∨-products of homogeneous elements of order 1 in every argument. The benefits of this abstract framework were harvested in [6, 7], as their results translate into strong versions of the Wick theorems of quantum field theory.
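The iterated coproduct of divided powers quoted earlier in this subsection lends itself to a short enumeration. The Python sketch below lists the index tuples $(m_1, \ldots, m_{l+1})$ with $\sum m_i = n$, each of which carries coefficient 1 in divided powers, and converts back to ordinary powers, where the coefficient becomes a multinomial; for $l = 1$ this is the binomial coproduct. The function names are ours and purely illustrative.

```python
from math import comb, factorial


def weak_compositions(n, parts):
    """Yield all tuples of `parts` non-negative integers summing to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in weak_compositions(n - first, parts - 1):
            yield (first,) + rest


def iterated_coproduct_terms(n, l):
    """Index tuples of the terms of Delta^l applied to the divided power
    phi^(n); each term appears with coefficient 1."""
    return list(weak_compositions(n, l + 1))


def ordinary_power_coefficient(ms):
    """Coefficient of phi^{m_1} (x) ... (x) phi^{m_{l+1}} in Delta^l(phi^n),
    n = sum(ms), after undoing the division by factorials."""
    denominator = 1
    for m in ms:
        denominator *= factorial(m)
    return factorial(sum(ms)) // denominator


if __name__ == "__main__":
    for ms in iterated_coproduct_terms(4, 1):
        print(ms, ordinary_power_coefficient(ms))
```

Working with divided powers keeps every coefficient equal to 1, which is why the enumeration of index tuples is all that is needed.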
As we said in the introduction, however, the approach from quantum theoretical first principles is still evolving (see [98] in this connection), and the passage from the Hopf algebras that Brouder, Oeckl and others have associated to quantum fields to the renormalization Hopf algebra of Connes and Kreimer is perhaps in the cards.

The reader should be aware that, with respect to the Hopf algebra approach to renormalization, we have done less than scratch the surface. The heart of this approach is a multiplicative map $f$ (the “Feynman rule”) of $H$ into an algebra $V$ of Feynman amplitudes: for instance, in dimensional regularization the character takes values in a ring of Laurent series in the regularization parameter $\varepsilon$. In physics, the Feynman rules are essentially fixed by the interpretation of the theory, and thus one tends to identify $\Gamma$ with $f(\Gamma)$. Perhaps the main path-breaking insight of [2] is the introduction of the “twisted antipode”. Let us then usher in the other personages of this drama. There is a linear map $T : V \to V$, which effects the
subtraction of ultraviolet divergences in each renormalization scheme. The twisted (or “renormalized”) antipode $S_{T,f}$ is a map $H \to V$ defined by $S_{T,f}(\emptyset) = 1$; $S_{T,f} = T\circ f\circ S$ for primitive diagrams, and then recursively:
$$S_{T,f}\,\Gamma = -[T\circ f]\,\Gamma - T\Bigl[\sum_{\emptyset\subsetneq\gamma\subsetneq\Gamma} S_{T,f}(\gamma)\,f(\Gamma/\gamma)\Bigr].$$
In other words, $S_{T,f}$ is the map that produces the counterterms in perturbative field theory. The Hopf algebra approach works most effectively because in many cases $S_{T,f}$ is multiplicative. For that, it is not necessary for $T$ to be an endomorphism of the algebra of amplitudes $V$, but the following weaker condition [99] is sufficient:
$$T(hg) = T(T(h)g) + T(hT(g)) - T(h)T(g).$$
This endows $V$ with the structure of a Rota–Baxter algebra of weight 1; it is fulfilled in the BPHZ formalism for massive particles, and the dimensional regularization scheme with minimal subtraction, for which the Connes–Kreimer paradigm is most cleanly formulated. By the way, here is where [82] fails. Finally, the renormalized amplitude $R_{T,f}$ is given by
$$R_{T,f} := S_{T,f} * f.$$
This map is also a homomorphism; compatibility with the coproduct operation is given by its very definition as a convolution. The (nonrecursive) forest formula for $R_{T,f}$ is precisely (the complete) Zimmermann’s formula [13] of quantum field theory.

A Rota–Baxter algebra of weight $\theta$ is given by the condition
$$\theta\,T(hg) = T(T(h)g) + T(hT(g)) - T(h)T(g).$$
Shuffle algebras and Rota–Baxter algebras of weight 0 are essentially the same thing [35, 76]. The theory of Rota–Baxter algebras is examined in depth in [100, 101].

Acknowledgments

This paper collects and expands for the most part a series of lectures delivered by the second-named author in the framework of the joint mathematical physics seminar of the Universités d’Artois and Lille 1, as a guest of the first-named institution, from late January till mid-February 2003. We thank Amine El-Gradechi and the Universités d’Artois and Lille 1 for the excellent occasion to lecture on a subject close to our hearts. JMG-B is very grateful to his “students” on that occasion for much friendliness, and for the hospitality of the Theory Division of CERN, where a draft was written prior to the lectures.
We also thank Michel Petitot for teaching us the double series; Li Guo, Michiel Hazewinkel, Frédéric Patras and Leonhard Schuster for helpful remarks; and Christian Brouder, Kurusch Ebrahimi-Fard, Amine El-Gradechi, Dirk Kreimer
and Joseph C. Várilly for careful reading of prior versions of the paper and for providing insights and welcome advice. We are also indebted to Loïc Foissy and Dominique Manchon for forwarding us very valuable material. Joseph Várilly generously helped us with the figures. We are indebted to the referees, whose comments greatly helped to improve the presentation. HF acknowledges support from the Vicerrectoría de Investigación of the Universidad de Costa Rica.
References

[1] S. A. Joni and G.-C. Rota, Contemp. Math. 6 (1982) 1.
[2] D. Kreimer, Adv. Theor. Math. Phys. 2 (1998) 303.
[3] A. Connes and D. Kreimer, Commun. Math. Phys. 210 (2000) 249.
[4] A. Connes and D. Kreimer, Commun. Math. Phys. 216 (2001) 215.
[5] A. Connes and H. Moscovici, Commun. Math. Phys. 198 (1998) 198.
[6] Ch. Brouder, Quantum groups and interacting quantum fields, in Group 24: Physical and Mathematical Aspects of Symmetries, eds. J.-P. Gazeau, R. Kerner, J.-P. Antoine, S. Métens and J.-Y. Thibon (Institute of Physics Publ., Bristol, 2003), pp. 89–97.
[7] Ch. Brouder, B. Fauser, A. Frabetti and R. Oeckl, J. Phys. A: Math. Gen. 37 (2004) 5895.
[8] D. Kreimer, Phys. Reports 363 (2002) 387.
[9] D. J. Broadhurst and D. Kreimer, Commun. Math. Phys. 215 (2000) 217.
[10] D. Kreimer, private communication.
[11] J. Cuntz, G. Skandalis and B. Tsygan, Cyclic Homology in Noncommutative Geometry (Springer, Berlin, 2004).
[12] A. Connes and H. Moscovici, Mosc. Math. J. 4 (2004) 67.
[13] W. Zimmermann, Commun. Math. Phys. 15 (1969) 208.
[14] W. R. Schmitt, J. Combin. Theory A46 (1987) 264.
[15] E. B. Manoukian, Renormalization (Academic Press, London, 1983).
[16] H. Figueroa and J. M. Gracia-Bondía, Modern Phys. Lett. A16 (2001) 1427.
[17] H. Figueroa and J. M. Gracia-Bondía, Internat. J. Mod. Phys. A19 (2004) 2739.
[18] R. P. Stanley, Enumerative Combinatorics, Vol. 1 (Cambridge University Press, Cambridge, 1997).
[19] M. Haiman and W. R. Schmitt, J. Combin. Theory A50 (1989) 172.
[20] G. Hochschild, Introduction to Affine Algebraic Groups (Holden-Day, San Francisco, 1971).
[21] S. Montgomery, Hopf Algebras and their Actions on Rings (American Mathematical Society, Providence, RI, 1993).
[22] D. Manchon, Hopf Algebras from basics to applications to renormalization, math.QA/0408405, to appear in the Comptes Rendus des Rencontres Mathématiques de Glanon.
[23] J. W. Milnor and J. C. Moore, Ann. Math. 81 (1965) 211.
[24] J. M. Gracia-Bondía, J. C. Várilly and H. Figueroa, Elements of Noncommutative Geometry (Birkhäuser, Boston, 2001).
[25] T. Bröcker and T. tom Dieck, Representations of Compact Lie Groups (Springer, Berlin, 1985).
[26] B. Peterson and E. J. Taft, Aequationes Mathematicae 20 (1980) 1.
[27] L. Verde-Star, Internat. J. Theor. Phys. 40 (2001) 41.
[28] G. Hochschild, La Structure des Groupes de Lie (Dunod, Paris, 1968).
[29] J. C. Várilly, Hopf algebras in noncommutative geometry, in Geometrical and Topological Methods in Quantum Field Theory, eds. A. Cardona, H. Ocampo and S. Paycha (World Scientific, Singapore, 2003), pp. 1–85.
[30] H. Hopf, Ann. Math. 42 (1941) 22.
[31] A. Klimyk and K. Schmüdgen, Quantum Groups and their Representations (Springer, Berlin, 1997).
[32] M. Aguiar, N. Bergeron and F. Sottile, Combinatorial Hopf algebras and generalized Dehn–Somerville relations, math.CO/0310016.
[33] L. Foissy, Les algèbres de Hopf des arbres enracinés décorés, Ph.D. dissertation, Reims (2002).
[34] R. Ree, Ann. Math. 68 (1958) 210.
[35] K.-T. Chen, Bull. Amer. Math. Soc. 73 (1967) 975.
[36] Ch. Reutenauer, Free Lie Algebras (Clarendon Press, Oxford, 1993).
[37] M. Aguiar and F. Sottile, Cocommutative Hopf algebras of permutations and trees, math.QA/0403101.
[38] J. Kustermans and S. Vaes, Ann. Sci. Éc. Norm. Sup. 33 (2000) 837.
[39] J. Leslie, Trans. Amer. Math. Soc. 333 (1992) 423.
[40] F. Faà di Bruno, Annali di Scienze Matematiche e Fisiche di Tortoloni 6 (1855) 479; Quart. J. Pure Appl. Math. 1 (1857) 359.
[41] S. G. Krantz and H. R. Parks, A Primer of Real Analytic Functions (Birkhäuser, Boston, 1992).
[42] A. Connes and D. Kreimer, Commun. Math. Phys. 199 (1998) 203.
[43] A. Abdesselam, Feynman diagrams in algebraic combinatorics, math.co/0212121.
[44] J. M. Gracia-Bondía and J. C. Várilly, J. Math. Phys. 35 (1994) 3340.
[45] A. Salam and P. T. Matthews, Phys. Rev. 90 (1953) 690.
[46] C. Itzykson and J.-B. Zuber, Quantum Field Theory (McGraw-Hill, New York, 1980).
[47] R. P. Stanley, Enumerative Combinatorics, Vol. 2 (Cambridge University Press, Cambridge, 1999).
[48] R. Ticciati, Quantum Field Theory for Mathematicians (Cambridge University Press, Cambridge, 1999).
[49] L. Foissy, J. Algebra 255 (2002) 85.
[50] W. Magnus, Commun. Pure Appl. Math. 7 (1954) 649.
[51] D. P. Burum, Phys. Rev. B24 (1981) 3684.
[52] W. R. Salzman, J. Chem. Phys. 82 (1985) 822.
[53] J. F. Cariñena, K. Ebrahimi-Fard and J. M. Gracia-Bondía, Hopf algebras in dynamical systems theory, forthcoming.
[54] J.-L. de Lagrange, Mém. Acad. Royale des Sciences et Belles-Lettres de Berlin 24 (1770) 251.
[55] M. Abramowitz and I. A. Stegun, Handbook of Mathematical Functions (Dover, New York, 1965).
[56] P. Henrici, J. Math. Anal. Appl. 8 (1964) 218.
[57] M. Cohen, Contemp. Math. 134 (1992) 1.
[58] M. Khalkhali and B. Rangipour, Introduction to Hopf-Cyclic cohomology, math.qa/0503244.
[59] S. Majid, Foundations of Quantum Group Theory (Cambridge University Press, Cambridge, 1995).
[60] S. Majid, A Quantum Groups Primer (Cambridge University Press, Cambridge, 2002).
[61] W. P. Johnson, Amer. Math. Monthly 109 (2002) 217.
[62] A. D. D. Craik, Amer. Math. Monthly 112 (2005) 119.
[63] G. ’t Hooft and M. Veltman, Nucl. Phys. B44 (1972) 189.
[64] C. G. Bollini and J. J. Giambiagi, Nuovo Cimento B12 (1972) 20.
[65] J. M. Gracia-Bondía and S. Lazzarini, Connes–Kreimer–Epstein–Glaser renormalization, hep-th/0006106.
[66] W. Zimmermann, Remark on equivalent formulations for Bogoliubov’s method of renormalization, in Renormalization Theory, eds. G. Velo and A. S. Wightman (D. Reidel, Dordrecht, 1976), pp. 161–170.
[67] M. C. Bergere and Y.-M. Lam, J. Math. Phys. 17 (1976) 1546.
[68] P. Doubilet, J. Alg. 28 (1974) 127.
[69] W. R. Schmitt, J. Pure Appl. Algebra 96 (1994) 299.
[70] A. Dür, Möbius Functions, Incidence Algebras and Power Series Representations, Lecture Notes in Mathematics 1202 (Springer, Berlin, 1986).
[71] G.-C. Rota, Z. Wahrscheinlichskeitstheorie 2 (1964) 340.
[72] A. Cayley, Phil. Magazine 13 (1857) 17; also in Collected Mathematical Papers of Arthur Cayley, Vol. 3 (Cambridge University Press, Cambridge, 1889), pp. 242–246.
[73] Ch. Brouder, Eur. Phys. J. C12 (2000) 521.
[74] F. Girelli, T. Krajewski and P. Martinetti, J. Math. Phys. 45 (2004) 4679.
[75] V. Turaev, J. Geom. Phys. 53 (2005) 461.
[76] A. Murua, The Hopf algebra of rooted trees, free Lie algebras and Lie series, preprint, San Sebastián, 2003.
[77] P. Henrici, Jahresber. Deutsch. Math.-Verein. 86 (1984) 115.
[78] G. Popineau and R. Stora, A pedagogical remark on the main theorem of perturbative renormalization theory, unpublished preprint, CERN & LAPP–TH (1982).
[79] A. Connes, Symétries galoisiennes et renormalisation, Séminaire Poincaré 2 (2002), pp. 75–91.
[80] F. Girelli, T. Krajewski and P. Martinetti, Modern Phys. Lett. A 16 (2001) 299.
[81] A. Connes and M. Marcolli, From physics to number theory via noncommutative geometry, Vol. II, Renormalization, the Riemann–Hilbert correspondence and motivic Galois theory, hep-th/0411114.
[82] M. Berg and P. Cartier, Representations of the renormalization group as matrix Lie algebra, hep-th/0105315, Vol. 2, to appear Chap. 17 in the book by C. DeWitt and P. Cartier, Functional Integration: Action and Symmetries (Cambridge University Press, Cambridge, 2005).
[83] K. Rybnikov, Análisis Combinatorio (Mir, Moscow, 1988).
[84] M. Anshelevich, E. G. Effros and M. Popa, Zimmermann type cancellation in the free Faà di Bruno algebra, math.co/0504436.
[85] J. Dixmier, Algèbres Enveloppantes (Gauthier-Villars, Paris, 1974).
[86] K. H. Hofmann and S. A. Morris, The Structure of Compact Groups (de Gruyter, Berlin, 1998).
[87] C. Chryssomalakos, H. Quevedo, M. Rosenbaum and J. D. Vergara, Commun. Math. Phys. 225 (2002) 465.
[88] F. Patras, Ann. Inst. Fourier (Grenoble) 43 (1993) 1067.
[89] F. Patras, J. Algebra 170 (1994) 547.
[90] J. L. Loday, Expos. Math. 12 (1994) 165.
[91] J. L. Loday, Cyclic Homology (Springer, Berlin, 1998).
[92] D. Kastler, Introduction à l’Électrodynamique Quantique (Dunod, Paris, 1960).
[93] A. Mestre and R. Oeckl, Combinatorics of the n-point functions via Hopf algebra in quantum field theory, math-ph/0505066.
[94] F. D. Grosshans, G.-C. Rota and J. A. Stein, Invariant Theory and Superalgebras (American Mathematical Society, Providence, RI, 1987).
[95] R. Oeckl, Nuclear. Phys. B581 (2000) 559.
[96] D. Kreimer, Ann. Phys. 303 (2003) 179; ibid. 305 (2003) 79.
[97] E. R. Caianello, Combinatorics and Renormalization in Quantum Field Theory (Benjamin, Reading, Massachusetts, 1973).
[98] Ch. Brouder and W. Schmitt, Quantum groups and quantum field theory: III. Renormalization, hep-th/0210097.
[99] G.-C. Rota, Bull. Amer. Math. Soc. 75 (1969) 325; ibid. 75 (1969) 330.
[100] K. Ebrahimi-Fard, L. Guo and D. Kreimer, Ann. Inst. H. Poincaré 6 (2005) 369.
[101] K. Ebrahimi-Fard, L. Guo and D. Kreimer, J. Phys. A37 (2004) 11037.
October 20, 2005 8:48 WSPC/148-RMP
J070-00249
Reviews in Mathematical Physics Vol. 17, No. 9 (2005) 977–1020 © World Scientific Publishing Company
QUANTUM MACROSTATISTICAL THEORY OF NONEQUILIBRIUM STEADY STATES
GEOFFREY L. SEWELL
Department of Physics, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
[email protected]

Received 17 January 2005
Revised 4 July 2005

We provide a general macrostatistical formulation of nonequilibrium steady states of reservoir driven quantum systems. This formulation is centered on the large scale properties of the locally conserved hydrodynamical observables, and our basic physical assumptions comprise (a) a chaoticity hypothesis for the nonconserved currents carried by these observables, (b) an extension of Onsager’s regression hypothesis to fluctuations about nonequilibrium states, and (c) a certain mesoscopic local equilibrium hypothesis. On this basis, we obtain a picture wherein the fluctuations of the hydrodynamical variables about a nonequilibrium steady state execute a Gaussian Markov process of a generalized Onsager–Machlup type, which is completely determined by the position dependent transport coefficients and the equilibrium entropy function of the system. This picture reveals that the transport coefficients satisfy a generalized form of the Onsager reciprocity relations in the nonequilibrium situation and that the spatial correlations of the hydrodynamical observables are generically of long range. This last result constitutes a model-independent quantum mechanical generalization of that obtained for special classical stochastic systems and marks a striking difference between the steady nonequilibrium and equilibrium states, since it is only at critical points that the latter carry long range correlations.

Keywords: Quantum macrostatistics; nonequilibrium steady states; chaotic current fluctuations; long range hydrodynamical correlations.

Mathematics Subject Classification 2000: 82C10, 82B35, 81R15
1. Introduction

The statistical thermodynamics of nonequilibrium steady states or, more generally, dynamically stable ones, of reservoir driven macroscopic systems^a is a key area of the natural sciences, with ramifications for condensed matter physics [1–4], chemistry [5] and biology [6]. At the phenomenological and heuristic levels, there is an abundant literature on this subject. At the level of mathematical physics, however, the subject is still at an exploratory stage.

Footnote a. A very simple example of such a state is the stationary one of a solid rod, whose ends are coupled to thermostats of different temperatures.

In the classical regime, two types of rigorous approaches have been made to it. The first is centered on the hypothesis that the macroscopic properties of complex systems are yielded by the model of classical Anosov dynamical systems [7, 8]. This hypothesis is designed to capture the chaoticity that underlies macroscopic irreversibility, and it has been shown to lead to nonequilibrium generalizations both of the Onsager reciprocity relations [8] and of the fluctuation-dissipation theorem [7]. A second approach is centered on microscopic treatments of stochastic (non-Hamiltonian) dynamical models [9–11], which are also designed to capture the chaoticity underlying macroscopic irreversibility. The treatment of these models has led to some interesting developments, and [11] has provided a dynamically-based picture of the hydrodynamical fluctuations about their nonequilibrium steady states. Moreover, in the case of a certain particular model, namely the symmetric exclusion process, it has been shown that the nonequilibrium steady state has long range density correlations [9–11] and that the probability distribution of its large scale density field is determined by an explicitly specified and highly nontrivial nonequilibrium generalization of its free energy [10, 11].

In the quantum regime, a natural dynamically-based definition of nonequilibrium steady states of reservoir driven systems has been formulated [12, 13] at the microscopic level. In the present article we set out a different approach to the subject, which is quantum macrostatistical in that it is centered on the hydrodynamical observables of reservoir driven quantum systems.
This approach, which was briefly sketched in [14], parallels the one we have previously made to the nonequilibrium thermodynamics of conservative quantum systems [15, 16], where it yielded an extension of the Onsager reciprocity relations to a nonlinear regime. In general, the quantum macrostatistics is designed, like Onsager’s [17] irreversible thermodynamics and Landau’s fluctuating hydrodynamics [18], to form a bridge between the microscopic and macroscopic pictures of matter, rather than a deduction of the latter from the former. Indeed, accepting Boltzmann’s hypothesis of molecular chaos [19], we take the view that such a derivation is not even feasible for realistically interacting systems, since this chaos renders the microscopic equations of motion intractable over periods substantially longer than the intervals between successive collisions.^b

Footnote b. This view is supported by the fact that the rigorous derivations of Boltzmann equations from the Hamiltonian dynamics of both classical [20] and quantum [21] systems are applicable only over microscopic times of the order of the interval between successive collisions of a particle. For longer times, the chaos bars the way to further analysis of the microscopic equations of motion.

Thus, the microscopic equations of motion must necessarily be supplemented by further assumptions in order to interconnect the quantum and phenomenological properties of matter. In fact, the key physical assumptions of our macrostatistical project concern only very general, model-independent properties of many-particle systems. Specifically, they
comprise
(A) an extension of Onsager’s regression hypothesis [17], to the effect that the hydrodynamical fluctuations about nonequilibrium steady states are governed by the same dynamical laws as the “small” perturbations of the hydrodynamical variables about their steady values;
(B) a certain mesoscopic local equilibrium hypothesis; and
(C) a chaoticity hypothesis for the nonconserved currents carried by the locally conserved hydrodynamical observables.
These assumptions may be regarded as the “axioms” of our theory. The physical considerations that underlie them will be discussed, along with their formulation, in Secs. 4.1, 4.2 and 5.4. In fact, hypothesis (C), like Boltzmann’s Stosszahlansatz and its subsequent developments [7–11], exploits the consequences of the very chaos that obstructs the analytical dynamics of realistically interacting many-particle systems.

The principal results that we obtain by supplementing the Schrödinger dynamics of many-particle systems by the “axioms” (A)–(C), together with certain technical assumptions, are the following ones (I)–(III), which we claim to be new, at least on the level of a rigorous, general, model-independent quantum theory of nonequilibrium steady states.
(I) The spatial correlations of the hydrodynamical observables are generically of long range. This comprises a quantum mechanical generalization of that obtained from both rigorous microscopic treatments of certain classical stochastic models [9–11] and from heuristic treatments [23, 24] of Landau’s fluctuating hydrodynamics. Most importantly, it marks a qualitative difference between equilibrium and nonequilibrium steady states, since the hydrodynamical correlations in the former states are generically of short range, except at critical points.
(II) The transport coefficients satisfy a generalized, position-dependent version of the Onsager reciprocity relations. Thus, this result extends Onsager’s irreversible thermodynamics from the neighborhood of equilibrium to that of nonequilibrium steady states.
(III) The hydrodynamical fluctuations execute a classical Gaussian Markov process, of a generalized Onsager–Machlup (OM) type [22]. Thus, this result extends the OM theory from the regime of fluctuations about thermal equilibrium to that of fluctuations about nonequilibrium steady states. A similar result was obtained for certain classical stochastic models in [11].

Let us now briefly describe the macrostatistical strategy we employ to obtain these results. We take our model to be an $N$-particle quantum system, $\Sigma$, that is confined to a bounded open connected region, $\Omega_N$, of a $d$-dimensional Euclidean space, $X$, and coupled at its boundary, $\partial\Omega_N$, to an array, $R$, of quantum mechanical reservoirs. $\Sigma$ is thus an open system, while the composite $(\Sigma + R)$ is a conservative one. Since we shall have occasion to pass to thermodynamic and
hydrodynamic limits where $N$ tends to infinity, we take $N$ to be a variable parameter of the system. We assume that its particle number density $\nu := N/\mathrm{Vol}(\Omega_N)$ is $N$-independent and that $\Omega_N$ is the dilation by a factor $L_N$ of a fixed, $N$-independent region $\Omega$ of unit volume. Thus,
$$\Omega_N = L_N\Omega := \{L_N x \mid x \in \Omega\} \quad\text{and}\quad L_N = (N/\nu)^{1/d}. \tag{1.1}$$
For the hydrodynamic description of $\Sigma$, we take $L_N$ to be the unit of length. Thus, $\Omega$ is the region occupied by the system in the hydrodynamical picture. We assume that, in that picture, $\Sigma$ evolves according to a phenomenological law governing the evolution of a set of locally conserved classical fields $q_t(x) = (q_{1,t}(x), \ldots, q_{m,t}(x))$, which correspond to the densities at position $x$ and time $t$ of the extensive thermodynamic variables^c of the system. We denote the associated currents of $q_t(x)$ by $j_t(x) = (j_{1,t}(x), \ldots, j_{m,t}(x))$. Thus, $q_t$ satisfies the local conservation law
$$\frac{\partial q_t}{\partial t} + \nabla\cdot j_t(x) = 0. \tag{1.2}$$
We assume that its phenomenological dynamics is governed by a constitutive equation of the form
$$j_t(x) = \mathcal{J}(q_t; x), \tag{1.3}$$
where $\mathcal{J}$ is a functional of the field $q_t$ and the position $x$. Thus, by Eqs. (1.2) and (1.3), $q_t$ evolves according to an autonomous law
$$\frac{\partial q_t(x)}{\partial t} = \mathcal{F}(q_t; x) := -\nabla\cdot\mathcal{J}(q_t; x), \tag{1.4}$$
subject to boundary conditions determined by the reservoirs. We assume that this phenomenological law is invariant under scale transformations $x \to \lambda x$, $t \to \lambda^k t$ for some constant $k$. A simple example for which this assumption is valid, with $k = 2$, is that of nonlinear diffusions, where $\mathcal{J}$ takes the form
$$\mathcal{J}(q_t; x) = -\tilde K(q_t(x))\,\nabla q_t(x), \tag{1.5}$$
$\tilde K$ being an $m$-by-$m$ matrix $[\tilde K_{kl}]$, which acts by standard matrix multiplication on $\nabla q_t$. In this case, the phenomenological equation (1.4) takes the form
$$\frac{\partial q_t}{\partial t} = \nabla\cdot\bigl(\tilde K(q_t)\,\nabla q_t\bigr). \tag{1.6}$$
We shall base some of our explicit calculations on this case and, in particular, we shall henceforth assume that the scaling exponent $k$ is equal to 2. A simple consequence of this assumption is that, since $L_N$ is the unit of length for the hydrodynamical picture, $L_N^2$ is the unit of time for this picture.

Footnote c. We provide a characterization of these variables in Sec. 2.2 along lines previously formulated in [15].

We assume that, in general, the dynamics described by Eq. (1.4) is dissipative, in that the $m$-component field $q_t(x)$ relaxes eventually to a unique time-independent
form q(x), which thus corresponds to a steady hydrodynamical state. By Eq. (1.3), the corresponding steady m-component current, j(x), is then J (q; x). By Eq. (1.4), the linearized equation of motion for “small” perturbations, δqt (x), of q(x) is simply ∂ ∂ δqt (x) = Lδqt (x) := F (q + λδqt ; x)|λ=0 , (1.7) ∂t ∂λ while, by Eq. (1.3), the corresponding increment in the m-component current j(x) is ∂ J (q + λδqt ; x)|λ=0 . ∂λ We note that, by Eqs. (1.4), (1.7) and (1.8), δjt (x) = Kδqt (x) :=
L = −∇ · K.
(1.8)
(1.9)
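The dissipativity assumption can be illustrated by a toy numerical experiment. The following Python sketch is not part of the theory: it assumes a single-component field ($m = 1$) in one space dimension, a hypothetical diffusivity $\tilde{K}(q) = 1 + q$, Dirichlet boundary values standing in for the reservoirs, and an explicit conservative finite-difference scheme chosen only for simplicity. It integrates Eq. (1.6) until the field relaxes and returns the steady profile together with the face currents given by Eq. (1.5).

```python
import math


def relax_to_steady_state(k_tilde, q_left, q_right, k_max, n=21, t_end=2.0):
    """Integrate dq/dt = d/dx( k_tilde(q) dq/dx ) on [0, 1] with fixed boundary values.

    k_max is an upper bound on k_tilde over the range of q; it is used only
    to choose a time step for which the explicit scheme is stable.
    """
    dx = 1.0 / (n - 1)
    dt = 0.2 * dx * dx / k_max
    # Start from the linear interpolation between the two boundary values.
    q = [q_left + (q_right - q_left) * i * dx for i in range(n)]

    def face_currents(field):
        # Current j = -k_tilde(q) dq/dx evaluated on each cell face, cf. Eq. (1.5).
        return [
            -k_tilde(0.5 * (field[i] + field[i + 1])) * (field[i + 1] - field[i]) / dx
            for i in range(n - 1)
        ]

    for _ in range(int(t_end / dt)):
        j = face_currents(q)
        # Discrete form of the conservation law (1.2) at the interior nodes.
        for i in range(1, n - 1):
            q[i] -= dt * (j[i] - j[i - 1]) / dx
    return q, face_currents(q)


if __name__ == "__main__":
    q, j = relax_to_steady_state(lambda u: 1.0 + u, 0.0, 1.0, k_max=2.0)
    # For this choice the steady profile solves q + q**2 / 2 = 1.5 * x.
    print(q[10], math.sqrt(2.5) - 1.0)
    print(max(j) - min(j))
```

In the relaxed state the current is spatially uniform, which is the one-dimensional form of the stationarity condition $\nabla\cdot\mathcal{J}(q; x) = 0$.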
Further, in the case of nonlinear diffusions, it follows from the identification of the r.h.s.'s of Eqs. (1.4) and (1.6) that Eq. (1.7) yields the following formal equation for $\mathcal{L}$:
$$[\mathcal{L}\chi](x) = \nabla\cdot\bigl(\tilde{K}(q(x))\nabla\chi(x) + [\tilde{K}'(q(x))\chi(x)]\nabla q(x)\bigr), \tag{1.10}$$
where $\chi$ is a single column matrix function of position and $\tilde{K}'(q)$ is the derivative of $\tilde{K}(q)$, i.e. its gradient with respect to $q$: thus $[\tilde{K}'(q)\chi(q)]_{kl} = \sum_{r=1}^{m}[\partial\tilde{K}_{kl}(q)/\partial q_r]\chi_r(q)$.

In order to relate the phenomenological dynamics given by Eqs. (1.4) and (1.7) to the underlying microscopic quantum mechanics of $\Sigma$, we assume that $q_t(x)$ is the expectation value of a set of locally conserved quantum fields $\hat{q}_t(x) = (\hat{q}_{1,t}(x), \dots, \hat{q}_{m,t}(x))$, as rescaled for the hydrodynamical picture and in a limit in which $N$, and hence $L_N$, becomes infinite. Correspondingly, we formulate the fluctuations $\xi_t(x)$ of this $m$-component quantum field $\hat{q}_t(x)$ about its mean on the same macroscopic scale and with a standard normalization, subject to the above-described assumptions (A)–(C). On this basis, we establish that $\xi_t$ executes a Gaussian Markov process represented by a generalized Langevin equation of the form
$$\frac{\partial}{\partial t}\xi_t(x) = \mathcal{L}\xi_t(x) + b_t(x), \tag{1.11}$$
where $b_t(x)$ is a white noise whose autocorrelation function is of zero range with respect to position as well as time. Thus, $\xi_t$ executes a generalized Onsager–Machlup process. We employ this result to infer that the spatial correlations of the fluctuation field $\xi$ in nonequilibrium steady states are generically of long range. In this way, we derive the above results (I)–(III) from our basic macrostatistical assumptions.

We present our treatment as follows. In Sec. 2, we formulate the quantum statistical thermodynamical model of the composite system $(\Sigma + \mathcal{R})$ at both microscopic and macroscopic levels. This formulation provides general specifications of the nonequilibrium steady states of the model and also of the locally conserved quantum fields $\hat{q}_t$ and associated currents $\hat{j}_t$ pertinent to its hydrodynamic description.
Here, in accordance with the general requirements of quantum field theory [25], we assume that these are distribution-valued operators. In Sec. 3, we relate the classical hydrodynamical variables, $q_t$ and $j_t$, and their fluctuations, $\xi_t$ and $\eta_t$, about a nonequilibrium steady state to these quantum fields and currents; and we obtain sufficient conditions for the fluctuations $\xi_t$ to execute a classical stochastic process. In Sec. 4, we formulate our regression and local equilibrium hypotheses for this process and note that these, together with the assumption of microscopic reversibility for the composite $(\Sigma + \mathcal{R})$, yield a canonical extension of Onsager's reciprocity relations to the nonlinear hydrodynamical regime. In Sec. 5, we extend our local equilibrium hypothesis to the fluctuating currents, $\eta_t$, and formulate our chaoticity hypothesis for these currents. We then establish that the assumptions of the regression hypothesis, local equilibrium and chaoticity imply that the field $\xi_t$ executes a generalized Onsager–Machlup process represented formally by Eq. (1.11). In Sec. 6, we obtain an explicit formula for the two-point function for this process in terms of the equilibrium entropy density function and the transport coefficients of the system, and we infer therefrom that the static correlations of the hydrodynamical fluctuation field $\xi$ are generically of non-zero range on the macroscopic scale and hence of long (infinite!) range on the microscopic one. We conclude in Sec. 7 with some general observations about the results of this article and about their possible generalizations to less restrictive conditions than those assumed here. We leave the proofs of some technical Propositions to four Appendices.

2. The Quantum Model

We take our model to be the open quantum system, $\Sigma$, briefly described in Sec. 1.
Thus, $\Sigma$ is a system of $N$ particles, which occupies a bounded open connected region, $\Omega_N$, of a $d$-dimensional Euclidean space $X$ and is coupled at its surface, $\partial\Omega_N$, to an array, $\mathcal{R}$, of reservoirs. Here $\Omega_N$ is the dilation by a factor $L_N$ of a region, $\Omega$, of unit volume and $L_N$ is given by Eq. (1.1), which represents the $N$-independence of the particle density of $\Sigma$. We assume that the composite quantum system $\Sigma^{(c)} := (\Sigma + \mathcal{R})$ is conservative and that all its interactions are invariant under spatial translations and rotations.

2.1. The microscopic picture

We formulate this picture in standard operator algebraic terms, denoting the $C^\star$-algebras of bounded observables of $\Sigma$ and $\Sigma^{(c)}$ by $\mathcal{A}$ and $\mathcal{B}$, respectively. We assume that $\mathcal{A}$ is a subalgebra of $\mathcal{B}$ and that it is isomorphic to the $W^\star$-algebra of bounded operators in a separable Hilbert space $\mathcal{H}$, which comprises the square integrable functions $f(x_1, \dots, x_N; s_1, \dots, s_N)$ (appropriately symmetrized or antisymmetrized) of the positions $\{x_j\}$ and the spins $\{s_j\,(= \pm 1)\}$ of its particles. The unbounded observables of $\Sigma$ are represented by the unbounded self-adjoint operators affiliated to $\mathcal{A}$, i.e. by those whose spectral projectors belong to this algebra. The states of this system are represented by the density matrices in $\mathcal{H}$, and the
expectation value of an observable, $A$, of $\Sigma$ for the state $\rho$ is $\mathrm{Tr}(\rho A)$. In general, we denote this expectation value by $\rho(A) \equiv \langle\rho; A\rangle$, and we employ the corresponding notation for $\Sigma^{(c)}$.

The Wigner time reversal operator, which serves to reverse the velocities and spins of the particles of $\Sigma$, is defined to be the antilinear transformation $T$ of $\mathcal{H}$ given by the formula
$$(Tf)(x_1, \dots, x_N; s_1, \dots, s_N) = \bar{f}(x_1, \dots, x_N; -s_1, \dots, -s_N) \quad \forall f \in \mathcal{H}, \tag{2.1}$$
where the bar denotes complex conjugation. Thus, $T$ implements an antiautomorphism $\tau_{\mathcal{A}}$ of $\mathcal{A}$, defined by the formula
$$\tau_{\mathcal{A}}A = TA^\star T \quad \forall A \in \mathcal{A}. \tag{2.2}$$
We assume that the dynamics of the composite system $\Sigma^{(c)}$ is given by a one-parameter group, $\{\alpha_t \mid t \in \mathbb{R}\} := \alpha(\mathbb{R})$, of automorphisms of $\mathcal{B}$. Further, we assume that this dynamics is reversible, i.e. that $\mathcal{B}$ is equipped with an antiautomorphism $\tau$, which reduces to $\tau_{\mathcal{A}}$ on $\mathcal{A}$ and implements time reversals according to the prescription
$$\tau\alpha_t\tau = \alpha_{-t}. \tag{2.3}$$
The evolution of the observables of $\Sigma$ is given by the isomorphisms of $\mathcal{A}$ into $\mathcal{B}$ obtained by the restriction of $\alpha(\mathbb{R})$ to the former algebra.

2.2. Thermodynamic variables and potentials

In order to formulate the thermodynamic observables and potentials of $\Sigma$ we pass, for the moment, to the situation where it is decoupled from the reservoirs $\mathcal{R}$ and thus becomes a conservative system, whose dynamics is given by a one-parameter group, $\{\alpha_t^{(0)} \mid t \in \mathbb{R}\}$, of automorphisms of $\mathcal{A}$. In this situation, its canonical equilibrium state, $\rho$, at inverse temperature $\beta$ is characterized by the Kubo–Martin–Schwinger (KMS) condition [26]
$$\langle\rho; (\alpha_t^{(0)}A_1)A_2\rangle = \langle\rho; A_2\,\alpha_{t+i\beta}^{(0)}A_1\rangle \quad \forall A_1, A_2 \in \mathcal{A};\ t \in \mathbb{R}. \tag{2.4}$$
Most importantly, this condition survives the thermodynamic limit where $N$ tends to infinity and the particle density $\nu$ remains finite [26]. Moreover, in this limit,^d the system may support different states that satisfy the condition. The set of these states is convex, and its extremal elements may naturally be interpreted as the pure equilibrium phases for the inverse temperature $\beta$ [15, 29].

We assume that $\Sigma$ has a linearly independent set of extensive conserved observables $\hat{Q} = (\hat{Q}_1, \dots, \hat{Q}_m)$, which intercommute^e up to surface effects and satisfy the following condition of thermodynamical completeness [15]: in the limit $N \to \infty$, the pure phases are labeled by, i.e. are in one-to-one correspondence with, the expectation values $q_1, \dots, q_m$ of the global densities of $\hat{Q}_1, \dots, \hat{Q}_m$, respectively. The resultant set of classical, intensive thermodynamical variables of $\Sigma$ is then $q = (q_1, \dots, q_m)$. In general, we take $\hat{Q}_1$ to be the Hamiltonian of the system: correspondingly, $q_1$ is its energy density.

^d The model of the infinite system is formulated, in a standard way, in terms of its $C^\star$-algebra of quasi-local bounded observables [15, 26–28]. Its states are then positive normalized linear functionals on that algebra.

^e The assumption of intercommutativity is not universally fulfilled. It is violated, for example, in the case where $\hat{Q}_k$ and $\hat{Q}_l$, say, are different components of the magnetic moment of $\Sigma$. In such cases, some aspects of our treatment would have to be refined.

The equilibrium entropy density, in the limit $N \to \infty$, is a function, $s$, of $q$, which may be formulated by standard methods of quantum statistical mechanics [15, 27]. The classical equilibrium thermodynamics of the system is then governed by the form of $s(q)$. The demand of thermodynamical stability ensures that this function is concave. We define the thermodynamic conjugate of $q_k$ to be $\theta_k = \partial s(q)/\partial q_k$. Thus, denoting the element $(\theta_1, \dots, \theta_m)$ of $\mathbb{R}^m$ by $\theta$,
$$\theta = s'(q), \tag{2.5}$$
the derivative of $s(q)$, i.e. its gradient in $q$-space. Correspondingly, the second derivative, $s''(q)$, of this function is the Hessian $[\partial^2 s(q)/\partial q_k\partial q_l]$. We assume throughout this treatment that the system is in a single phase region, i.e. one where $s$ is infinitely differentiable, where the function $q \to \theta(q)$ is invertible and where, for each value of $q$, the matrix $s''(q)$ is invertible. We define
$$J(q) := -s''(q)^{-1}, \tag{2.6}$$
which, in view of the concavity of $s$, is a positive matrix.

2.3. The reservoir system $\mathcal{R}$

We assume that $\mathcal{R}$ comprises a set, $\{\mathcal{R}_J\}$, of spatially disjoint reservoirs, such that each $\mathcal{R}_J$ is placed in contact with a subregion $\partial\Omega_{N,J}$ of $\partial\Omega_N$ and $\bigcup_J \partial\Omega_{N,J} = \partial\Omega_N$. Further, we assume that each $\mathcal{R}_J$ has a thermodynamically complete set of global extensive conserved observables $(\hat{Q}_{J,1}, \dots, \hat{Q}_{J,m})$ that are the natural counterparts of $\hat{Q}_1, \dots, \hat{Q}_m$, respectively, in that, when $\Sigma$ and $\mathcal{R}_J$ are placed in contact, the observables $(\hat{Q}_k + \hat{Q}_{J,k})$ of $\Sigma^{(c)}$ are still conserved. Correspondingly, the thermodynamic control variables of $\mathcal{R}_J$ conjugate to $\hat{Q}_J$ are the same as those of $\Sigma$, namely $\theta$. We denote by $\omega_J(\theta_J)$ the equilibrium state of $\mathcal{R}_J$ for which its $\theta$-value is $\theta_J$.

2.4. Nonequilibrium steady states of $\Sigma^{(c)}$

Returning now to the situation where $\Sigma$ is an open system, we assume that this is prepared according to the following prescription. $\Sigma$ and the reservoirs $\{\mathcal{R}_J\}$ are independently prepared in the remote past in states $\rho_0$ and $\{\omega_J(\theta_J)\}$, respectively, where $\rho_0$ is normal and the value of $\theta_J$ generally varies from reservoir to reservoir: thus, in general, the reservoirs $\{\mathcal{R}_J\}$ are not in equilibrium with one another. Following this preparation, the systems $\Sigma$ and $\mathcal{R}$ are then coupled together and the resultant conservative composite evolves freely according to the dynamics governed by the automorphisms $\alpha(\mathbb{R})$. We assume that, as established under suitable
asymptotically abelian conditions [12, 13], this dynamics acts so as to drive the system^f $\Sigma^{(c)}$ into a terminal $\rho_0$-independent state $\phi$ $(= w\text{-}\lim_{t\to\infty}\alpha_t[\rho_0\otimes_J\omega_J(\theta_J)])$, whose restriction to $\mathcal{A}$ is normal. This state is uniquely determined by the states $\{\omega_J(\theta_J)\}$. Accordingly, we take $\phi$ to be the nonequilibrium steady state of $\Sigma^{(c)}$ stemming from the specified preparation, and we denote its GNS triple by $(\mathcal{H}_\phi, \pi, \Phi)$. We note that, in view of the stationarity of $\phi$, the automorphisms $\alpha(\mathbb{R})$ are implemented by a unitary representation $U$ of $\mathbb{R}$ in $\mathcal{H}_\phi$ according to the prescription [31]
$$\pi(\alpha_t B) = U_t\pi(B)U_t^{-1} \quad \forall B \in \mathcal{B},\ t \in \mathbb{R}, \tag{2.7}$$
where $U$ is defined by the formula
$$U_t\pi(B)\Phi = \pi(\alpha_t B)\Phi \quad \forall B \in \mathcal{B},\ t \in \mathbb{R}. \tag{2.8}$$
Since Eq. (2.8) is applicable to the subalgebra $\mathcal{A}$ of $\mathcal{B}$, the dynamics of the open system $\Sigma$, in the normal folium of $\phi$, is given by the isomorphisms implemented by $U$ of $\pi(\mathcal{A})$ into $\pi(\mathcal{B})$. Moreover, this prescription extends to the unbounded observables of $\Sigma$ for the following reasons. Since the restriction of $\phi$ to $\mathcal{A}$ is normal, so too, by Eq. (2.7), are the representations $\pi$ and $\pi\circ\alpha_t$. It follows [32] that these representations have canonical extensions to the unbounded observables, $S$, of $\Sigma$ according to the prescription that, if $\{E_\lambda\}$ is the family of spectral projectors of $S$, then those of $\pi(S)$ and $\pi(\alpha_t S)$ are $\{\pi(E_\lambda)\}$ and $\{\pi(\alpha_t E_\lambda) = U_t\pi(E_\lambda)U_t^{-1}\}$, respectively. Hence, the extension of the formula (2.7) to the unbounded observables takes the form
$$\pi(\alpha_t S) = U_t\pi(S)U_{-t} \tag{2.9}$$
for all unbounded observables $S$ of $\Sigma$.

2.5. The fields $\hat{q}$ and the currents $\hat{j}$

We assume that, in the GNS representation $\pi$ for the nonequilibrium steady state $\phi$, the $m$-component extensive thermodynamical observable $\hat{Q}$ has a position-dependent, locally conserved density $\hat{q}(x) = (\hat{q}_1(x), \dots, \hat{q}_m(x))$, with associated current density $\hat{j}(x) = (\hat{j}_1(x), \dots, \hat{j}_m(x))$. Thus, the $\hat{q}_k$'s and $\hat{j}_k$'s are quantum fields and, in accordance with the general requirements of quantum field theory [25], we assume that they are distributions,^g in the sense of Schwartz [33].

We formulate these distributions in terms of the Schwartz spaces, $\mathcal{D}(\Omega_N)$ and $\mathcal{D}_V(\Omega_N)$, of real, infinitely differentiable scalar and $\mathbb{R}^d$-vector valued functions, respectively, on $X$ with support in $\Omega_N$. We define $\mathcal{D}^m(\Omega_N)$ and $\mathcal{D}_V^m(\Omega_N)$, respectively, to be the real vector spaces given by their $m$th topological powers, equipped with the operations of binary addition and multiplication by real numbers given by the formula
$$\lambda(f_1, \dots, f_m) + \lambda'(f_1', \dots, f_m') = (\lambda f_1 + \lambda' f_1', \dots, \lambda f_m + \lambda' f_m') \quad \forall \lambda, \lambda' \in \mathbb{R},\ f_k, f_k' \in \mathcal{D}(\Omega)\ \text{or}\ \mathcal{D}_V(\Omega),\ k = 1, \dots, m.$$

^f The same result has also been obtained constructively [30] for certain models, which however are too rudimentary for our present purposes. In particular, the version of $\Sigma$ there is just a multi-level atom.

^g In concrete cases, it is a simple matter to verify that the explicit formulae for these fields and currents are indeed distributions. For example, the number density operator at position $x$ is simply $\sum_{r=1}^{N}\delta(x - x_r)$, where $x_r$ is the position of the $r$th particle.
We denote by $\mathcal{D}'^m(\Omega_N)$ and $\mathcal{D}_V'^m(\Omega_N)$ the topological dual vector spaces of $\mathcal{D}^m(\Omega_N)$ and $\mathcal{D}_V^m(\Omega_N)$, respectively. Evidently, these are spaces of distributions (cf. [33]). We assume that the $m$-component fields $\hat{q}(x)$ and $\hat{j}(x)$ are operator valued elements of $\mathcal{D}'^m(\Omega_N)$ and $\mathcal{D}_V'^m(\Omega_N)$, respectively. For simplicity, we also assume that the components, $\hat{q}_k$, of $\hat{q}$ are invariant under time-reversals,^h i.e. that they commute with the Wigner time reversal operator $T$.

The algebraic properties of the field $\hat{q}(x)$ are governed by the forms of the commutators $[\hat{q}_k(x), \hat{q}_l(y)]$. We assume that these take the following form, which is readily verified by the use of standard formulae in the case where $\hat{q}_1$ is the energy density of the system and the remaining $\hat{q}_k$'s are the particle number densities for the different species of its constituent particles:
$$[\hat{q}_k(x), \hat{q}_l(y)] = i\sum_{r=1}^{m} c_{klr}\,\hat{j}_r(x)\cdot\nabla\delta(x - y), \tag{2.10}$$
where the $c$'s are $N$-independent constants. This formula evidently accords with our assumption that the $\hat{Q}_k$'s intercommute, up to surface effects: indeed, it implies that their commutators are just the integrals of currents over $\partial\Omega_N$.

We denote by $\hat{q}(f)$ and $\hat{j}(g)$ the "smeared fields" obtained by integrating the distributions $\hat{q}$ and $\hat{j}$ against test functions $f = (f_1, \dots, f_m)$ and $g = (g_1, \dots, g_m)$, which belong to the spaces $\mathcal{D}^m(\Omega_N)$ and $\mathcal{D}_V^m(\Omega_N)$, respectively. Thus,
$$\hat{q}(f) = \sum_{k=1}^{m}\int_{\Omega_N} dx\,\hat{q}_k(x)f_k(x) \tag{2.11}$$
and
$$\hat{j}(g) = \sum_{k=1}^{m}\int_{\Omega_N} dx\,\hat{j}_k(x)\cdot g_k(x). \tag{2.12}$$
In general, these smeared fields are unbounded observables, affiliated to the algebra $\mathcal{A}$. Therefore, by Eq. (2.7), their evolutes at time $t$, which we denote by $\hat{q}_t(f)$ and $\hat{j}_t(g)$, are their transforms implemented by the unitary operator $U_t$. Thus, they are the smeared fields corresponding to distribution valued operators $\hat{q}_t(x) = U_t\hat{q}(x)U_t^{-1}$ and $\hat{j}_t(x) = U_t\hat{j}(x)U_t^{-1}$, respectively; and the analogous statement may evidently be made for their components $\hat{q}_{k,t}(x)$ and $\hat{j}_{k,t}(x)$. For notational convenience, we shall sometimes denote $\hat{q}_t(x)$, $\hat{q}_t(f)$, $\hat{j}_t(x)$ and $\hat{j}_t(g)$ by $\hat{q}(x, t)$, $\hat{q}(f, t)$, $\hat{j}(x, t)$ and $\hat{j}(g, t)$, respectively.

We assume that the cyclic vector $\Phi$ for the state $\phi$ lies in the domain of all monomials in the smeared fields $\hat{q}_t(f)$ and $\hat{j}_{t'}(g)$ and that the resultant vector values of these monomials are continuous in the $f$'s, $g$'s, $t$'s and $t'$'s. Since $\hat{j}$ is the current associated with $\hat{q}$, the local conservation laws for the latter field may be expressed in the form
$$\hat{q}_t(f) - \hat{q}_s(f) = \int_s^t du\,\hat{j}_u(\nabla f) \quad \forall t, s \in \mathbb{R},\ f \in \mathcal{D}^m(\Omega_N). \tag{2.13}$$

^h Standard examples of time-reversal invariant $\hat{q}_k$'s are the local number and energy densities of many-particle systems.
2.6. The hydrodynamical scaling

We assume that the hydrodynamical observables of the open system $\Sigma$ comprise just the $m$-component field $\hat{q}$, as viewed on the scale where the unit of length is $L_N$. Thus, on this scale, the system is confined to the fixed region $\Omega$. Further, in accordance with our assumption, following Eq. (1.6), that the macroscopic dynamics is invariant under space-time scale transformations $x \to \lambda x$, $t \to \lambda^2 t$, we assume that $L_N^2$ is the unit of time corresponding to the length unit $L_N$. Hence, in the normal folium of the nonequilibrium steady state $\phi$, the $m$-component hydrodynamic field is represented by the distribution valued operator
$$\check{q}_t(x) := \hat{q}(L_N x, L_N^2 t). \tag{2.14}$$
It follows from this equation and Eq. (2.11) that the smeared hydrodynamic field obtained by integrating $\check{q}_t(x)$ against a $\mathcal{D}^m(\Omega)$-class test function $f$ is
$$\check{q}_t(f) = \hat{q}(f^{(N)}, L_N^2 t) \quad \forall f \in \mathcal{D}^m(\Omega),\ t \in \mathbb{R}, \tag{2.15}$$
where $f^{(N)}$ $(\in \mathcal{D}^m(\Omega_N))$ is related to $f$ according to the formula
$$f^{(N)}(x) = L_N^{-d}f(L_N^{-1}x) \quad \forall x \in \Omega_N. \tag{2.16}$$
Since the scale transformation $(x, t) \to (L_N x, L_N^2 t)$ sends $\hat{q}$ to $\check{q}$, it follows that the local conservation law (2.13), or formally $\partial\hat{q}_t(x)/\partial t = -\nabla\cdot\hat{j}_t(x)$, will be preserved if it sends $\hat{j}_t(x)$ to $\check{j}_t(x)$, where
$$\check{j}_t(x) := L_N\hat{j}(L_N x, L_N^2 t). \tag{2.17}$$
It follows from this formula and Eq. (2.12) that the smeared field obtained by integrating $\check{j}_t(x)$ against a $\mathcal{D}_V^m(\Omega)$-class test function $g$ is
$$\check{j}_t(g) = \hat{j}(g^{(N)}, L_N^2 t), \tag{2.18}$$
where
$$g^{(N)}(x) = L_N^{1-d}g(L_N^{-1}x). \tag{2.19}$$
In view of Eqs. (2.15) and (2.18), it is a simple matter to confirm that the local conservation law (2.13) retains its form in the macroscopic description, i.e. that
$$\check{q}_t(f) - \check{q}_s(f) = \int_s^t du\,\check{j}_u(\nabla f) \quad \forall t, s \in \mathbb{R},\ f \in \mathcal{D}^m(\Omega). \tag{2.20}$$
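The step from Eq. (2.13) to Eq. (2.20) rests on two elementary properties of the rescalings (2.16) and (2.19): the map $f \to f^{(N)}$ preserves integrals, and $L_N^2\nabla f^{(N)}$ coincides with the rescaled gradient $(\nabla f)^{(N)}$ in the sense of Eq. (2.19), the factor $L_N^2$ being absorbed by the change of time unit. The following Python sketch checks both properties numerically under simplifying assumptions that are ours alone: one space dimension ($d = 1$), a single component, and a hypothetical polynomial test function. The helper names are likewise hypothetical.

```python
def rescale_scalar(f, length, dim):
    """Return f^(N)(x) = L**(-d) * f(x / L), cf. Eq. (2.16)."""
    return lambda x: length ** (-dim) * f(x / length)


def rescale_vector(g, length, dim):
    """Return g^(N)(x) = L**(1 - d) * g(x / L), cf. Eq. (2.19)."""
    return lambda x: length ** (1 - dim) * g(x / length)


def central_difference(h, x, eps):
    """Second-order finite-difference approximation to h'(x)."""
    return (h(x + eps) - h(x - eps)) / (2.0 * eps)


def midpoint_integral(h, a, b, n=2000):
    """Midpoint-rule approximation to the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (k + 0.5) * step) for k in range(n))


# Hypothetical test function on Omega = [0, 1] and its exact derivative.
def f(x):
    return x * x * (1.0 - x) * (1.0 - x)


def grad_f(x):
    return 2.0 * x * (1.0 - x) * (1.0 - 2.0 * x)


if __name__ == "__main__":
    L, d = 50.0, 1
    f_n = rescale_scalar(f, L, d)
    grad_f_n = rescale_vector(grad_f, L, d)
    x = 0.3 * L
    # Both printed pairs should agree to within the discretization error.
    print(midpoint_integral(f_n, 0.0, L), midpoint_integral(f, 0.0, 1.0))
    print(L * L * central_difference(f_n, x, 1e-3 * L), grad_f_n(x))
```

The second comparison is the one-dimensional instance of the identity $L_N^2\nabla f^{(N)} = (\nabla f)^{(N)}$ that underlies the choice of the prefactor $L_N$ in Eq. (2.17).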
3. Connections Between the Quantum Picture, the Phenomenological Dynamics and the Hydrodynamical Fluctuations

We now seek an inter-relationship between the quantum and hydrodynamical properties of the macroscopic field $\check{q}_t(x)$ and its current $\check{j}_t(x)$ in the limit where $N$ tends to infinity. In order to formulate this limit, we shall henceforth indicate the $N$-dependence of the quantum model by attaching the superscript $(N)$ to the symbols $\Sigma$, $\phi$, $\Phi$, $U$, $\hat{q}$, $\hat{j}$, $\check{q}$ and $\check{j}$. The symbol $\Sigma$, without that superscript, will be reserved for the limiting case where $N$ becomes infinite. The symbol $\Omega$, on the other hand, will continue to represent the fixed region occupied by $\Sigma^{(N)}$, in the hydrodynamical scaling, for all $N$.

Our basic assumptions concerning the relationship between the quantum and hydrodynamic pictures of the model are that, in the limit $N \to \infty$,

(a) the stationary hydrodynamic fields $q(x)$ and $j(x)$ are the expectation values of the quantum fields $\check{q}_t^{(N)}(x)$ and $\check{j}_t^{(N)}(x)$, respectively, for the steady state $\phi^{(N)}$; and

(b) the regressions of the fluctuations of these fields are governed, in a sense that will be made precise in Sec. 4, by the same dynamical laws (1.7) and (1.8) as the weak perturbations $\delta q_t(x)$ and $\delta j_t(x)$ of $q(x)$ and $j(x)$, respectively.

The regression hypothesis (b) is a natural generalization of that proposed by Onsager [17] for fluctuations about equilibrium states. We remark here that, since $\mathcal{D}'$ spaces are complete, these assumptions imply that the classical fields $q(x)$, $j(x)$, $\delta q_t(x)$ and $\delta j_t(x)$, introduced in Sec. 1, are distributions.

3.1. Quantum statistical formulae for the hydrodynamical variables

It follows immediately from our specifications that the above assumption (a) signifies that
$$q(x) = \lim_{N\to\infty}\bigl(\Phi^{(N)}, \check{q}_t^{(N)}(x)\Phi^{(N)}\bigr) \tag{3.1}$$
and
$$j(x) = \lim_{N\to\infty}\bigl(\Phi^{(N)}, \check{j}_t^{(N)}(x)\Phi^{(N)}\bigr), \tag{3.2}$$
the $t$-independence of the r.h.s. of these formulae being guaranteed by the stationarity of $\phi^{(N)}$.

In order to bring the hydrodynamical description of the model into line with thermodynamics, we introduce the field $\theta(x) = (\theta_1(x), \dots, \theta_m(x))$, conjugate to $q(x)$ as defined by the space-dependent version of Eq. (2.5), namely
$$\theta(x) = s'(q(x)). \tag{3.3}$$
Since we are assuming that the system is perpetually in a single phase region, and thus that the function $s'$ is invertible, it follows from this formula that the fields $q(x)$ and $\theta(x)$ are in one-to-one correspondence.
Turning now to the hydrodynamical equation (1.4), we see immediately that the stationary field $q(x)$ is determined by the requirement that $\mathcal{F}(q; x) = 0$, together with the conditions imposed by the $\Sigma^{(N)}$–$\mathcal{R}$ coupling at the boundary $\partial\Omega$ of $\Omega$. In order to specify these conditions, we denote by $\partial\Omega_J$ the section of $\partial\Omega$ where $\Sigma^{(N)}$ is in contact with $\mathcal{R}_J$. We then assume the following boundary condition.

(R) On the section $\partial\Omega_J$ of the boundary of $\Sigma$, the classical field $\theta(x)$ of this system takes the value $\theta_J$ of the control variables of the equilibrium state in which $\mathcal{R}_J$ is initially prepared. Thus, the array of reservoirs fixes the form of $\theta(x)$ and therefore of $q(x)$ on $\partial\Omega$.

This assumption signifies that, on the hydrodynamic time scale and in the limit $N \to \infty$, the local thermodynamical variables $\theta(x)$ of $\Sigma$ spontaneously take up the same values as those of the reservoir with which this system is in contact at its boundary. The assumption is fulfilled by the models of [9–11].

Note on the phenomenological dynamics: $\nabla\theta$ as the driving force. In the general situation where the field $q_t$ is time-dependent, we define its thermodynamical conjugate to be the field $\theta_t$ given by the space-time dependent version of Eq. (2.5), namely
$$\theta_t(x) = s'(q_t(x)). \tag{3.4}$$
Thus, in view of our assumption that the system is perpetually in a single phase region, the function $s'$ is invertible and the phenomenological law (1.4) may be expressed in the form
$$\frac{\partial}{\partial t}q_t(x) = \nabla\cdot\mathcal{G}(\theta_t; x), \tag{3.5}$$
where the functional $\mathcal{G}$ is determined by $\mathcal{J}$ according to the formula
$$\mathcal{G}(s'(q_t); x) = -\mathcal{J}(q_t; x). \tag{3.6}$$
In particular, in the case of nonlinear diffusion, it follows from Eqs. (1.4), (1.5), (2.5) and (2.6) that this phenomenological law reduces to the form
$$\frac{\partial}{\partial t}q_t(x) + \nabla\cdot\bigl(K(\theta_t(x))\nabla\theta_t(x)\bigr) = 0, \tag{3.7}$$
where, in correspondence with the general relationship (2.5) between $q$ and $\theta$,
$$K(\theta) = \tilde{K}(q)J(q) \equiv \tilde{K}\bigl([s']^{-1}(\theta)\bigr)J\bigl([s']^{-1}(\theta)\bigr). \tag{3.8}$$
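The passage from Eq. (1.5) to Eq. (3.7) uses only the chain rule $\nabla\theta = s''(q)\nabla q = -J(q)^{-1}\nabla q$, so that $-\tilde{K}(q)\nabla q = K(\theta)\nabla\theta$. The following Python sketch checks this identity in the simplest setting; it assumes a single-component field in one dimension, a hypothetical concave entropy density $s(q) = -q\ln q + q$ and a hypothetical diffusivity $\tilde{K}(q) = 1 + q$, neither of which is taken from the model.

```python
import math


def s_prime(q):
    """First derivative of the hypothetical entropy density s(q) = -q ln q + q."""
    return -math.log(q)


def s_second(q):
    """Second derivative of the same entropy density; negative, so s is concave."""
    return -1.0 / q


def theta_of_q(q):
    """Thermodynamic conjugate theta = s'(q), cf. Eq. (2.5)."""
    return s_prime(q)


def q_of_theta(theta):
    """Inverse of the map q -> s'(q) for this particular choice of s."""
    return math.exp(-theta)


def j_matrix(q):
    """J(q) = -s''(q)**(-1), cf. Eq. (2.6); here a positive number."""
    return -1.0 / s_second(q)


def k_tilde(q):
    """Hypothetical diffusivity entering Eq. (1.5)."""
    return 1.0 + q


def k_of_theta(theta):
    """K(theta) = k_tilde(q) * J(q) with q = [s']^{-1}(theta), cf. Eq. (3.8)."""
    q = q_of_theta(theta)
    return k_tilde(q) * j_matrix(q)


def current_from_q(q, dq_dx):
    """Current in the form of Eq. (1.5)."""
    return -k_tilde(q) * dq_dx


def current_from_theta(theta, dtheta_dx):
    """Current in the form implied by Eq. (3.7)."""
    return k_of_theta(theta) * dtheta_dx


if __name__ == "__main__":
    q, dq_dx = 1.4, 1.0
    dtheta_dx = s_second(q) * dq_dx  # chain rule applied to theta = s'(q)
    print(current_from_q(q, dq_dx), current_from_theta(theta_of_q(q), dtheta_dx))
```

The two printed currents agree, which is the scalar version of the statement that Eqs. (1.6) and (3.7) describe the same dynamics.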
One sees immediately from Eq. (3.7) that the gradient of the thermodynamical field $\theta_t$ acts as the hydrodynamical driving force.

3.2. Linearized perturbations of the hydrodynamics

In view of our above remarks, $\delta q_t$ is a distribution that satisfies Eq. (1.7) and vanishes on $\partial\Omega$. We assume that the linear operator $\mathcal{L}$ appearing in that equation is
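the generator of a one-parameter semigroup, $\{T_t \mid t \in \mathbb{R}_+\} := T(\mathbb{R}_+)$, of transformations of $\mathcal{D}'^m(\Omega)$. The solution of Eq. (1.7) is then
$$\delta q_t = T_{t-s}\delta q_s \quad \forall t \geq s \geq 0. \tag{3.9}$$
Correspondingly, by Eq. (1.8),
$$\delta j_t = \mathcal{K}\delta q_t = \mathcal{K}T_{t-s}\delta q_s \quad \forall t \geq s. \tag{3.10}$$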
Further, by Eq. (3.9) and the dissipativity condition stated in the paragraph before Eq. (1.7),
$$\mathcal{D}'\text{-}\lim_{t\to\infty}T_t\psi = 0 \quad \forall\psi \in \mathcal{D}'^m(\Omega) \tag{3.11}$$
or equivalently,
$$\mathcal{D}\text{-}\lim_{t\to\infty}T_t^\star f = 0 \quad \forall f \in \mathcal{D}^m(\Omega), \tag{3.12}$$
where $\{T_t^\star \mid t \in \mathbb{R}_+\}$ is the one-parameter semigroup of transformations of $\mathcal{D}^m(\Omega)$ dual to $T(\mathbb{R}_+)$. We denote its generator by $\mathcal{L}^\star$, which is just the dual of $\mathcal{L}$.

3.3. The hydrodynamical fluctuation fields

We define the quantum fields, $\xi_t^{(N)}(x) = (\xi_{1,t}^{(N)}(x), \dots, \xi_{m,t}^{(N)}(x))$ and $\eta_t^{(N)}(x) = (\eta_{1,t}^{(N)}(x), \dots, \eta_{m,t}^{(N)}(x))$, representing the fluctuations of the hydrodynamically scaled field $\check{q}_t^{(N)}(x)$ and the associated current $\check{j}_t^{(N)}(x)$, by the formulae
$$\xi_t^{(N)}(x) = L_N^{d/2}\Bigl[\check{q}_t^{(N)}(x) - \bigl(\Phi^{(N)}, \check{q}_t^{(N)}(x)\Phi^{(N)}\bigr)\Bigr] \tag{3.13}$$
and
$$\eta_t^{(N)}(x) = L_N^{d/2}\Bigl[\check{j}_t^{(N)}(x) - \bigl(\Phi^{(N)}, \check{j}_t^{(N)}(x)\Phi^{(N)}\bigr)\Bigr], \tag{3.14}$$
the normalization factor $L_N^{d/2}$ being natural for this scaling. The corresponding smeared fields $\xi_t^{(N)}(f)$ and $\eta_t^{(N)}(g)$ are then the observables obtained by integrating these fields against test functions $f$ $(\in \mathcal{D}^m(\Omega))$ and $g$ $(\in \mathcal{D}_V^m(\Omega))$, respectively. Thus, it follows from Eqs. (2.20), (3.13) and (3.14) that $\xi_t^{(N)}$ satisfies the local conservation law
$$\xi_t^{(N)}(f) - \xi_s^{(N)}(f) = \int_s^t du\,\eta_u^{(N)}(\nabla f) \quad \forall t, s \in \mathbb{R},\ f \in \mathcal{D}^m(\Omega). \tag{3.15}$$

The dynamical properties of the fluctuation field $\xi_t^{(N)}$ are encoded in the correlation functions
$$W^{(N)}(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) = \bigl(\Phi^{(N)}, \xi_{t_1}^{(N)}(f^{(1)})\cdots\xi_{t_r}^{(N)}(f^{(r)})\Phi^{(N)}\bigr). \tag{3.16}$$
This formula, together with Eqs. (2.15) and (3.13), serves to express $W^{(N)}$ in terms of the smeared fields $\hat{q}_t^{(N)}(f)$ of Sec. 2. Thus, in view of our stipulation there that the actions on $\Phi^{(N)}$ of the monomials in these fields are continuous in the $f$'s and $t$'s, it follows that $W^{(N)}$ is continuous in all its arguments. Further, it follows
from the stationarity of the state $\phi^{(N)}$ and the self-adjointness of the observables $\xi_t^{(N)}(f)$ that
$$W^{(N)}(f^{(1)}, \dots, f^{(r)}; t_1 + a, \dots, t_r + a) = W^{(N)}(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) \quad \forall a \in \mathbb{R} \tag{3.17}$$
and
$$\overline{W}^{(N)}(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) = W^{(N)}(f^{(r)}, \dots, f^{(1)}; t_r, \dots, t_1); \tag{3.18}$$
while the positivity of $\phi^{(N)}$ implies that $(A\Phi^{(N)}, A\Phi^{(N)}) \geq 0$ for any polynomial $A$ in the smeared fields $\xi_t^{(N)}(f)$. Thus, choosing
$$A = \sum_{k=1}^{p} c_k\,\xi_{t_{k,1}}^{(N)}(f^{(k,1)})\cdots\xi_{t_{k,r_k}}^{(N)}(f^{(k,r_k)}),$$
where the $c$'s are complex constants and $p$ is finite,
$$\sum_{k,l=1}^{p}\bar{c}_k c_l\,W^{(N)}(f^{(k,r_k)}, \dots, f^{(k,1)}, f^{(l,1)}, \dots, f^{(l,r_l)}; t_{k,r_k}, \dots, t_{k,1}, t_{l,1}, \dots, t_{l,r_l}) \geq 0. \tag{3.19}$$

3.4. Hydrodynamic limit of the fluctuation process

We now assume that $W^{(N)}$ converges to a functional $W$ in the hydrodynamic limit where $N \to \infty$, i.e. that
$$\lim_{N\to\infty}W^{(N)}(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) = W(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) \quad \forall f^{(1)}, \dots, f^{(r)} \in \mathcal{D}^m(\Omega),\ t_1, \dots, t_r \in \mathbb{R},\ r \in \mathbb{N}. \tag{3.20}$$
Hence, in view of the continuity properties of $W^{(N)}$ and the completeness of $\mathcal{D}'$ spaces, $W$ is continuous in the $f$'s and measurable in the $t$'s. It is therefore a zero order distribution with respect to the latter variables [33]. Further, it follows immediately from Eq. (3.20) that $W$ inherits the stationarity, Hermiticity and positivity properties of $W^{(N)}$, as given by Eqs. (3.17)–(3.19). Thus,
$$W(f^{(1)}, \dots, f^{(r)}; t_1 + a, \dots, t_r + a) = W(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) \quad \forall a \in \mathbb{R}, \tag{3.21}$$
$$\overline{W}(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) = W(f^{(r)}, \dots, f^{(1)}; t_r, \dots, t_1); \tag{3.22}$$
and
$$\sum_{k,l=1}^{p}\bar{c}_k c_l\,W(f^{(k,r_k)}, \dots, f^{(k,1)}, f^{(l,1)}, \dots, f^{(l,r_l)}; t_{k,r_k}, \dots, t_{k,1}, t_{l,1}, \dots, t_{l,r_l}) \geq 0. \tag{3.23}$$
It follows from these properties that, by Wightman's reconstruction theorem [25], $W$ corresponds precisely to the quadruple $(\mathcal{H}, V, \xi, \Psi)$, where
(a) $\mathcal{H}$ is a Hilbert space;

(b) $V$ is a unitary representation of $\mathbb{R}$ in $\mathcal{H}$ such that $V_t$, the image of $t$ $(\in \mathbb{R})$ under $V$, is strongly measurable;

(c) $\xi_t(x)$ is a Hermitian operator valued distribution, of class $\mathcal{D}'^m(\Omega)$, in $\mathcal{H}$, whose time translations are implemented by $V$, i.e.
$$\xi_{t+s}(x) = V_t\xi_s(x)V_t^{-1}; \tag{3.24}$$
and

(d) $\Psi$ is a vector in $\mathcal{H}$ that is invariant under $V_t$ and cyclic with respect to the polynomials in the smeared fields $\xi_t(f)$ obtained by integrating $\xi_t(x)$ against $\mathcal{D}^m(\Omega)$-class test functions $f$.

The functional $W$ is then related to the smeared field $\xi_t(f)$ and the cyclic vector $\Psi$ by the formula
$$W(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r) = \bigl(\Psi, \xi_{t_1}(f^{(1)})\cdots\xi_{t_r}(f^{(r)})\Psi\bigr). \tag{3.25}$$

3.5. Conditions for $W$ to represent a classical stochastic process

The question of whether $W$ represents a classical stochastic process reduces to those of whether (a) it defines a quantum stochastic process in the sense of [34] and (b) this process has the abelian properties of a classical one. Now, the condition (a) is fulfilled if the smeared Hermitian fields $\xi_t(f)$ are self-adjoint since, in this case, the unitary operators $\{\exp i\xi_t(f) \mid f \in \mathcal{D}^m(\Omega)\}$ generate a $W^\star$-algebra $\mathcal{N}_t$ and the correlation functions $\{(\Psi, F_{t_1}\cdots F_{t_r}\Psi) \mid F_{t_s} \in \mathcal{N}_{t_s};\ s = 1, \dots, r\}$ define a quantum stochastic process, as formulated in [34]. Further, the classicality condition^i (b) is simply that of the intercommutativity of the operators $\xi_t(f)$. The following proposition provides a sufficient condition for the functional $W$ to represent a quantum stochastic process.

Proposition 3.1. The functional $W$ uniquely defines a quantum stochastic process $\xi$, indexed by $\mathcal{D}^m(\Omega)\times\mathbb{R}$, if there is a bounded, positive functional $(f, t) \to F_t(f)$ on that product space such that
$$|W(f^{(1)}, \dots, f^{(r)}; t_1, \dots, t_r)| \leq r^2 F_{t_1}(f^{(1)})\cdots F_{t_r}(f^{(r)}) \quad \forall f^{(1)}, \dots, f^{(r)} \in \mathcal{D}^m(\Omega);\ t_1, \dots, t_r \in \mathbb{R}. \tag{3.26}$$
Comment. We shall subsequently establish in Proposition 6.1 that, under the assumptions of our scheme, the process $\xi$ is Gaussian. Since that result implies that the truncated $r$-point functions induced by $W$ all vanish and thus that Eq. (3.26) is satisfied, it signifies a consistency of our assumptions.

^i Here we consider classical processes as special (abelian) cases of the quantum ones.

Proof of Proposition 3.1. As noted above, $W$ defines a stochastic process if the Hermitian operators $\xi_t(f)$ are self-adjoint; and by Nelson's theorem [35], a sufficient condition for this is that each of these fields has a dense domain of analytic vectors. To prove that this is the case, subject to the assumption of Eq. (3.26), we note that it follows from that inequality and Eq. (3.25) that, for arbitrary $f, f^{(1)}, \dots, f^{(r)}$ in $\mathcal{D}^m(\Omega)$ and $t, t_1, \dots, t_r$ in $\mathbb{R}$,
$$\bigl\|\xi_t(f)^p\,\xi_{t_1}(f^{(1)})\cdots\xi_{t_r}(f^{(r)})\Psi\bigr\| \leq (p + r)^2 F_t(f)^p F_{t_1}(f^{(1)})\cdots F_{t_r}(f^{(r)})$$
and therefore, that the $\mathcal{H}$-valued function $z\,(\in \mathbb{C}) \to \sum_{p=0}^{\infty} z^p\,\xi_t(f)^p\,\xi_{t_1}(f^{(1)})\cdots\xi_{t_r}(f^{(r)})\Psi/p!$ has an infinite radius of convergence. Hence, in view of the cyclicity of $\Psi$ with respect to the polynomials in the smeared fields $\{\xi_t(f)\}$, these fields are self-adjoint and therefore $W$ corresponds to a stochastic process.

We shall assume henceforth that $W$ does indeed define a stochastic process. In order to formulate a condition for its classicality, we introduce the following definition.

Definition 3.2. (i) We define $\mathcal{P}$ (resp. $\mathcal{P}^{(N)}$) to be the set of polynomials in the smeared fields $\{\xi_t(f)\ (\text{resp. } \xi_t^{(N)}(f)) \mid f \in \mathcal{D}^m(\Omega),\ t \in \mathbb{R}\}$ and we define the bijection $P \to P^{(N)}$ of $\mathcal{P}$ onto $\mathcal{P}^{(N)}$ by the prescription that $P^{(N)}$ is the element of $\mathcal{P}^{(N)}$ obtained by replacing $\xi$ by $\xi^{(N)}$ in the formula for $P$.

(ii) For $P \in \mathcal{P}$ and $N \in \mathbb{N}$, we define the vector $\Psi_P^{(N)}$ $(\in \mathcal{H}_{\phi^{(N)}})$ by the formula
$$\Psi_P^{(N)} = P^{(N)}\Phi^{(N)}. \tag{3.27}$$
We now note that, by Eq. (3.25), the classicality condition that the operators $\xi_t(f)$ intercommute is equivalent to the invariance of $W(f^{(1)}, \ldots, f^{(n)}; t_1, \ldots, t_n)$ under the permutations $(f^{(r)}, t_r) \leftrightarrow (f^{(r+1)}, t_{r+1})$; and by Definition 3.2 and Eqs. (3.12), (3.16), (3.20), this latter condition may be expressed in the form
\[
\lim_{N\to\infty} \bigl( \Psi_P^{(N)}, [\xi_t^{(N)}(f), \xi_{t'}^{(N)}(f')] \Psi_P^{(N)} \bigr) = 0 \quad \forall P \in \mathcal{P},\ f, f' \in D_m(\Omega),\ t, t' \in \mathbb{R}.
\]
Moreover, we can set $t' = 0$ here without loss of generality, since $\Phi^{(N)}$ is invariant under $U_t^{(N)}$ and therefore, by Eq. (2.14), Definition 3.2 and the definition of $\xi_t^{(N)}$, the manifold $\mathcal{P}^{(N)} \Phi^{(N)}$ is stable under this unitary transformation. Consequently, we have the following proposition, whose significance we shall discuss below.

Proposition 3.3. Under the above assumptions, the process ξ is classical if and only if $\xi_t^{(N)}(f)$ satisfies the condition that
\[
\lim_{N\to\infty} \bigl( \Psi_P^{(N)}, [\xi_t^{(N)}(f), \xi^{(N)}(f')] \Psi_P^{(N)} \bigr) = 0 \quad \forall P \in \mathcal{P},\ f, f' \in D_m(\Omega),\ t \in \mathbb{R}. \tag{3.28}
\]

Comment. In order to relate condition (3.28) to the microscopic picture, we infer from Eqs. (2.10), (2.14)–(2.19) and (3.13) that this condition signifies the following.
(i) In the case where $t \neq 0$,
\[
\lim_{N\to\infty} L_N^{d} \sum_{k,l=1}^{m} \int_{\Omega^2} dx\, dy\, \bigl( \Psi_P^{(N)}, [\hat{q}_k(L_N x, L_N^2 t), \hat{q}_l(L_N y)] \Psi_P^{(N)} \bigr) f_k(x) f'_l(y) = 0 \quad \forall f, f' \in D_m(\Omega),\ P \in \mathcal{P}, \tag{3.29}
\]
which is evidently a space-time asymptotic abelian condition on the field $\hat{q}$.

(ii) In the case where $t = 0$,
\[
\lim_{N\to\infty} L_N^{-2} \bigl( \Psi_P^{(N)}, \check{j}^{(N)}(g_{f,f'}) \Psi_P^{(N)} \bigr) = 0 \quad \forall f, f' \in D_m(\Omega),\ P \in \mathcal{P}, \tag{3.30}
\]
where $g_{f,f'}$ is the element of $D_{Vm}(\Omega)$ whose rth component is
\[
g_{f,f';r} = \sum_{k,l} c_{rkl}\, f_k \nabla f'_l. \tag{3.31}
\]
Thus, Eq. (3.30) signifies the avoidance of the catastrophe whereby, for fixed $P \in \mathcal{P}$, the expectation value of the smeared hydrodynamically scaled current $\check{j}^{(N)}(g_{f,f'})$ in the vector state $\Psi_P^{(N)}$ would grow as rapidly as $L_N^2$ with increasing N.

4. The Stochastic Process ξ: Regression and Local Equilibrium Hypotheses and the Generalized Onsager Relations

We now assume that the conditions of Propositions 3.1 and 3.3 are fulfilled and hence that ξ is a classical stochastic process, indexed by $\mathbb{R} \times D_m(\Omega)$. In a standard way, we denote the expectation functional of the random variables for this process by E. Thus, by Eq. (3.25),
\[
E\bigl( \xi_{t_1}(f^{(1)}) \cdots \xi_{t_r}(f^{(r)}) \bigr) = \bigl( \Psi, \xi_{t_1}(f^{(1)}) \cdots \xi_{t_r}(f^{(r)}) \Psi \bigr) \quad \forall t_1, \ldots, t_r \in \mathbb{R},\ f^{(1)}, \ldots, f^{(r)} \in D_m(\Omega). \tag{4.1}
\]
We note that, by Eqs. (3.20), (3.25) and (4.1), the process $\xi^{(N)}$ converges to ξ, i.e. its correlation functions converge to the corresponding ones for ξ, as $N \to \infty$. Further, in view of the observation following Eq. (3.20), the correlation function $E\bigl( \xi_{t_1}(f^{(1)}) \cdots \xi_{t_r}(f^{(r)}) \bigr)$ is continuous with respect to the f’s and measurable with respect to the t’s.

Conditional expectations. For any random variable F of the ξ-process and for $t \in \mathbb{R}$, we denote the conditional expectations of F with respect to the σ-algebras generated by $\{\xi_t(f) \mid f \in D_m(\Omega)\}$ and $\{\xi_{t'}(f) \mid t' \le t,\ f \in D_m(\Omega)\}$ by $E(F \mid \xi_t)$ and $E(F \mid \xi_{\le t})$, respectively.

4.1. The regression hypothesis

This hypothesis is just the canonical generalization of that assumed by Onsager [17] for fluctuations about equilibrium states. Its essential import is that the evolution of a small hydrodynamical deviation from a steady state does not depend on whether
the deviation has arisen from a spontaneous fluctuation or from a weak perturbation of the system.$^{j}$ Thus, in mathematical terms, the regression hypothesis asserts that, for fixed s and $t \ge s$, the evolution of $E(\xi_t \mid \xi_s)$ is governed by the same law as that of the linearized perturbation $\delta q_t$ of the deterministic trajectory $q_t$, i.e. by Eq. (3.9), that
\[
E\bigl( \xi_t(f) \mid \xi_s \bigr) = [T_{t-s}\, \xi_s](f) \equiv \xi_s(T^{\star}_{t-s} f) \quad \forall t \ge s. \tag{4.2}
\]
Hence, since Nelson’s forward time derivative [36] of $\xi_t(f)$ is defined to be
\[
D\xi_t(f) := \lim_{u \to +0} u^{-1} E\bigl( \xi_{t+u}(f) - \xi_t(f) \mid \xi_t \bigr) \tag{4.3}
\]
and, since L is the generator of $T(\mathbb{R}_+)$, it follows that
\[
D\xi_t(f) = L\xi_t(f). \tag{4.4}
\]
Further, defining the static two-point function $W_S : D_m(\Omega) \times D_m(\Omega) \to \mathbb{R}$ by the formula
\[
W_S(f, f') = E\bigl( \xi(f)\, \xi(f') \bigr) \quad \forall f, f' \in D_m(\Omega), \tag{4.5}
\]
it follows from Eq. (4.2) and the stationarity of the ξ-process that
\[
E\bigl( \xi_t(f)\, \xi_{t'}(f') \bigr) = W_S(T^{\star}_{t-t'} f, f') \quad \forall f, f' \in D_m(\Omega),\ t, t'\,(\le t) \in \mathbb{R}. \tag{4.6}
\]
4.2. Local equilibrium conditions

Our next assumption asserts essentially that, in a nonequilibrium steady state, the statistical properties of the fluctuation field ξ in a “small” neighborhood, N(x), of an arbitrary point x (∈ Ω) simulate those enjoyed by these fields in the true equilibrium state corresponding to the value q(x) of the thermodynamic variable q. This is a mesoscopic local equilibrium condition, since it involves only the fluctuation field ξ and is thus weaker than that of microscopic local equilibrium [37], which would signify that the microstate of Σ in N(x) simulated the equilibrium microstate corresponding to q(x) there. Here we note that even this stronger condition has been shown to be fulfilled [38] by systems of fermions for which an Eulerian hydrodynamics has been established. Moreover, it may be expected to ensue more generally from the fact that the ratio of the hydrodynamic time scale to that of the microscopic processes (collisions, etc.) is infinite, since that implies that local values of the hydro-thermodynamic variables q change negligibly in the time taken for the latter processes to generate equilibrium in macroscopically small spatial regions.

In order to precisely specify our mesoscopic local equilibrium hypothesis, we start by formulating the relevant properties of hydrodynamical fluctuations about true equilibrium states for which the stationary classical field q(x) is assumed to be uniform.

$^{j}$ As in Onsager’s theory, the assumption of this equivalence between the consequences of fluctuations and weak perturbations is not quite innocuous, since the modifications of the variables q due to the former are $O(N^{-1/2})$, whereas those due to the latter are of order of a different small parameter that represents the strength of the perturbation.

Equilibrium fluctuations. We recall that, for a finite system, the equilibrium probability distribution function, P, for macroscopic observables A is determined by the entropy S(A) according to the Einstein formula $P(A) = \text{const.}\, \exp\bigl( S(A) \bigr)$, and this serves to relate the static correlation functions for the fluctuations of these observables to the thermodynamics of the system. The generalization of this relation to infinite systems has been derived by a quantum statistical treatment [15, Chap. 7, Appendix C] of equilibrium states and takes the form
\[
E_{eq}\bigl( \xi(f)\, \xi(f') \bigr) = \bigl( f, J(q) f' \bigr) \quad \forall f, f' \in D_m(\Omega), \tag{4.7}
\]
where $E_{eq}$ is the equilibrium expectation functional for the fluctuation process, J(q) is defined by Eq. (2.6) and (. , .) is the inner product on $D_m(\Omega)$ defined by the formula
\[
(f, f') = \sum_{k=1}^{m} \int_{\Omega} dx\, f_k(x) f'_k(x). \tag{4.8}
\]
It follows from Eqs. (4.2) and (4.7) that
\[
E_{eq}\bigl( \xi_t(f)\, \xi_s(f') \bigr) = \bigl( T^{\star}_{t-s} f, J(q) f' \bigr) \quad \forall f, f' \in D_m(\Omega),\ t \ge s. \tag{4.9}
\]
Further, recalling the assumption, introduced in Sec. 2.5, of the invariance of the quantum field $\hat{q}^{(N)}(x)$ under the time-reversal antiautomorphism τ and assuming that the equilibrium state$^{k}$ $\phi_{eq}^{(N)}$ of $(\Sigma^{(N)} + R)$ is likewise time-reversal invariant, it follows from the stationarity of this state and Eq. (3.13) that
\[
\langle \phi_{eq}^{(N)}; \xi_t^{(N)}(f)\, \xi^{(N)}(f') \rangle = \langle \phi_{eq}^{(N)}; \xi^{(N)}(f')\, \xi_{-t}^{(N)}(f) \rangle = \langle \phi_{eq}^{(N)}; \xi_t^{(N)}(f')\, \xi^{(N)}(f) \rangle.
\]
On passing to the limit of this equation as $N \to \infty$, we see that $E_{eq}\bigl( \xi_t(f)\, \xi(f') \bigr) = E_{eq}\bigl( \xi_t(f')\, \xi(f) \bigr)$; and therefore, by Eq. (4.9), that
\[
E_{eq}\bigl( \xi(T^{\star}_t f)\, \xi(f') \bigr) = E_{eq}\bigl( \xi(T^{\star}_t f')\, \xi(f) \bigr) \quad \forall t \ge 0.
\]
Consequently, since L is the generator of $T(\mathbb{R}_+)$,
\[
E_{eq}\bigl( \xi(L^{\star} f)\, \xi(f') \bigr) = E_{eq}\bigl( \xi(L^{\star} f')\, \xi(f) \bigr) \quad \forall f, f' \in D_m(\Omega). \tag{4.10}
\]

$^{k}$ The same assumption would not be valid for nonequilibrium states, since these generally carry currents of odd parity with respect to time reversals.

Local form of equilibrium correlations. We formulate the local properties of the equilibrium fluctuations in terms of test functions that are highly localized around an arbitrary point $x_0$ of Ω. Specifically, for $f \in D_m(\Omega)$, $x_0 \in \Omega$ and $\epsilon \in \mathbb{R}_+$, we define the function $f_{x_0,\epsilon}$ on the Euclidean space X by the formula
\[
f_{x_0,\epsilon}(x) = \epsilon^{-d/2} f\bigl( \epsilon^{-1}(x - x_0) \bigr) \quad \forall x_0 \in \Omega,\ f \in D_m(\Omega). \tag{4.11}
\]
Since Ω is a bounded open subregion of X, it follows that the restriction of $f_{x_0,\epsilon}$ to Ω belongs to the space $D_m(\Omega)$ for sufficiently small ε. In this case, we may take Eq. (4.11) to define a transformation $f \to f_{x_0,\epsilon}$ of $D_m(\Omega)$, with ε representing the degree of localization of the latter function about the point $x_0$. We now note that, by Eqs. (4.8) and (4.11), the r.h.s. of Eq. (4.7) is invariant under the transformation $f \to f_{x_0,\epsilon}$ and therefore it follows from that equation that the equilibrium fluctuations enjoy the local property given by the formula
\[
\lim_{\epsilon \downarrow 0} E_{eq}\bigl( \xi(f_{x_0,\epsilon})\, \xi(f'_{x_0,\epsilon}) \bigr) = \bigl( f, J(q) f' \bigr) \quad \forall x_0 \in \Omega,\ f, f' \in D_m(\Omega). \tag{4.12}
\]
Further, in the case of nonlinear diffusion, it follows from Eq. (1.10) that, for perturbations of the equilibrium state, $L = \tilde{K}(q)\Delta$, with q constant. Hence, for fluctuations about equilibrium, it follows from Eq. (4.7) that both sides of Eq. (4.10) are invariant under the transformation $f \to f_{x_0,\epsilon}$, $f' \to f'_{x_0,\epsilon}$, $E_{eq} \to \epsilon^2 E_{eq}$, and consequently
\[
\lim_{\epsilon \downarrow 0} \epsilon^2 E_{eq}\bigl( \xi(L^{\star} f_{x_0,\epsilon})\, \xi(f'_{x_0,\epsilon}) \bigr) = \lim_{\epsilon \downarrow 0} \epsilon^2 E_{eq}\bigl( \xi(L^{\star} f'_{x_0,\epsilon})\, \xi(f_{x_0,\epsilon}) \bigr) \quad \forall x_0 \in \Omega. \tag{4.13}
\]
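The invariance of the inner product (4.8) under the rescaling (4.11), on which the local property (4.12) rests, can be checked numerically. The following Python sketch does this for the simplest case d = 1, m = 1. The bump-shaped test functions, the localization point, the integration window and the grid size are illustrative assumptions and are not taken from the text.

```python
import math

def bump(x):
    """Smooth test function supported in (-1, 1) (illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def tilted_bump(x):
    """A second illustrative test function with the same support."""
    return (1.0 + x) * bump(x)

def localize(f, x0, eps, d=1):
    """Return f_{x0,eps}(x) = eps**(-d/2) * f((x - x0) / eps), cf. Eq. (4.11)."""
    return lambda x: eps ** (-d / 2.0) * f((x - x0) / eps)

def inner(f, g, a=-2.0, b=2.0, n=40_000):
    """Midpoint-rule approximation of the inner product (4.8) for m = 1, d = 1."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n))

# Inner product of the unscaled functions.
reference = inner(bump, tilted_bump)

# Inner products of the localized functions for decreasing eps.
scaled = {
    eps: inner(localize(bump, 0.3, eps), localize(tilted_bump, 0.3, eps))
    for eps in (0.5, 0.1, 0.02)
}

for eps, value in scaled.items():
    print(f"eps={eps:<5} (f_eps, g_eps)={value:.8f}  (f, g)={reference:.8f}")
```

Within quadrature error the printed values agree for every ε, which is the statement that the r.h.s. of Eq. (4.7) does not change under f → f_{x0,ε}.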
Local equilibrium conditions for nonequilibrium steady states. We now assume that, for these states, the natural counterparts of the local conditions (4.12) and (4.13) still hold, i.e. that
\[
\lim_{\epsilon \downarrow 0} E\bigl( \xi(f_{x_0,\epsilon})\, \xi(f'_{x_0,\epsilon}) \bigr) = \bigl( f, J(q(x_0)) f' \bigr) \quad \forall x_0 \in \Omega,\ f, f' \in D_m(\Omega) \tag{4.14}
\]
and
\[
\lim_{\epsilon \downarrow 0} \epsilon^2 E\bigl( \xi(L^{\star} f_{x_0,\epsilon})\, \xi(f'_{x_0,\epsilon}) \bigr) = \lim_{\epsilon \downarrow 0} \epsilon^2 E\bigl( \xi(L^{\star} f'_{x_0,\epsilon})\, \xi(f_{x_0,\epsilon}) \bigr) \quad \forall x_0 \in \Omega,\ f, f' \in D_m(\Omega). \tag{4.15}
\]
These are our local equilibrium conditions, which manifestly concern the fluctuation field ξ only.

4.3. Generalized Onsager reciprocity relations

The following proposition represents a generalization of the Onsager reciprocity relations to nonequilibrium steady states of the nonlinear diffusion process.

Proposition 4.1. Under the assumption of the regression and local equilibrium hypotheses, the transport coefficients of the nonlinear diffusion process satisfy the position-dependent Onsager relations
\[
K_{kl}\bigl( \theta(x) \bigr) = K_{lk}\bigl( \theta(x) \bigr) \quad \forall x \in \Omega,\ k, l \in [1, m]. \tag{4.16}
\]
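Proof. Since we employ the same argument as that for nonequilibrium states of conservative systems in [15, Chap. 7], we shall just sketch the proof here. We start by introducing the linear transformation $L_0$ of $D_m(\Omega)$ by the formula
\[
L_0 := \tilde{K}\bigl( q(x_0) \bigr) \Delta. \tag{4.17}
\]
It then follows, after some manipulation, from Eqs. (1.10), (3.8), (4.14) and (4.17), together with the continuity properties of the functions $\tilde{K}$, J and q, that
\[
\lim_{\epsilon \downarrow 0} \epsilon^2 E\bigl( \xi([L^{\star} - L_0^{\star}] f_{x_0,\epsilon})\, \xi(f'_{x_0,\epsilon}) \bigr) = 0 \quad \forall f, f' \in D_m(\Omega),\ x_0 \in \Omega. \tag{4.18}
\]
This implies that $L^{\star}$ may be replaced by $L_0^{\star}$ in Eq. (4.15), i.e. that
\[
\lim_{\epsilon \downarrow 0} \epsilon^2 E\bigl( \xi(L_0^{\star} f_{x_0,\epsilon})\, \xi(f'_{x_0,\epsilon}) \bigr) = \lim_{\epsilon \downarrow 0} \epsilon^2 E\bigl( \xi(L_0^{\star} f'_{x_0,\epsilon})\, \xi(f_{x_0,\epsilon}) \bigr) \quad \forall f, f' \in D_m(\Omega),\ x_0 \in \Omega. \tag{4.19}
\]
Further, since, by Eqs. (4.11) and (4.17), $\epsilon^2 L_0^{\star} f_{x_0,\epsilon} = [L_0^{\star} f]_{x_0,\epsilon}$, Eq. (4.19) reduces to the form
\[
\lim_{\epsilon \downarrow 0} E\bigl( \xi([L_0^{\star} f]_{x_0,\epsilon})\, \xi(f'_{x_0,\epsilon}) \bigr) = \lim_{\epsilon \downarrow 0} E\bigl( \xi([L_0^{\star} f']_{x_0,\epsilon})\, \xi(f_{x_0,\epsilon}) \bigr) \quad \forall f, f' \in D_m(\Omega),\ x_0 \in \Omega.
\]
It follows from this equation, together with Eqs. (3.8), (4.14) and (4.17), that
\[
\bigl( \Delta f, K(\theta(x_0)) f' \bigr) = \bigl( \Delta f', K(\theta(x_0)) f \bigr) \quad \forall f, f' \in D_m(\Omega),\ x_0 \in \Omega. \tag{4.20}
\]
Further, since, by Eq. (4.8), $(\Delta f, f') \equiv (\Delta f', f)$ $\forall f, f' \in D_m(\Omega)$, and since the actions of Δ and $K(\theta(x_0))$ on $D_m(\Omega)$ intercommute, Eq. (4.20) is equivalent to the following formula.
\[
\bigl( \Delta f, K(\theta(x_0)) f' \bigr) = \bigl( \Delta f, K^{\star}(\theta(x_0)) f' \bigr) \quad \forall f, f' \in D_m(\Omega),\ x_0 \in \Omega, \tag{4.21}
\]
where $K^{\star}$ is the adjoint of K. Hence, the matrix $K(\theta(x_0))$ is symmetric for all points $x_0$ in Ω. This is equivalent to the required result.

5. Fluctuating Currents, Chaoticity and the Onsager–Machlup Process

5.1. A preliminary observation

We now aim to extend the stochastic process ξ so as to include the currents associated with these fluctuations. To this end we recall that, under the assumptions of Propositions 3.1 and 3.3, $\xi_t^{(N)}$ converges to a classical process ξ, indexed by $D_m(\Omega) \times \mathbb{R}$,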
with $\xi_t(f)$ continuous in f and measurable in t. We shall now argue that, by contrast, $\eta^{(N)}$ cannot converge to a process η possessing the corresponding continuity and measurability properties. To show this, we suppose that the correlation functions for $\eta^{(N)}$ converge to those of a process η, indexed by $D_{Vm}(\Omega) \times \mathbb{R}$. Then, since L is the generator of $T(\mathbb{R}_+)$, it follows from Eqs. (3.15), (3.20), (4.1), (4.5) and (4.6) that
\[
\int_0^t ds_1 \int_0^t ds_2\, E\bigl( \eta_{s_1}(\nabla f)\, \eta_{s_2}(\nabla f) \bigr) = E\bigl( [\xi_t(f) - \xi(f)]^2 \bigr) = 2 E\bigl( \xi(f) [\xi(f) - \xi(T^{\star}_t f)] \bigr) = -2 \int_0^t ds\, W_S(f, T^{\star}_s L^{\star} f) \quad \forall f \in D_m(\Omega),\ t \in \mathbb{R}_+.
\]
Now the r.h.s. of this equation is O(t), whereas the l.h.s. would be $O(t^2)$ if $E\bigl( \eta_{s_1}(g)\, \eta_{s_2}(g) \bigr)$ were continuous in g and measurable with respect to $s_1$ and $s_2$. Hence, we cannot assume that $\eta^{(N)}$ converges to a process η that possesses these continuity and measurability properties.
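The O(t) behaviour invoked in this argument can be made concrete in a scalar caricature. The sketch below assumes, purely for illustration, a single stationary mode whose regression law is T_t = exp(−λt) and whose static variance is W, so that the mean-square increment is E[(ξ_t − ξ_0)²] = 2W(1 − exp(−λt)). The values of λ and W are hypothetical.

```python
import math

def mean_square_increment(t, lam=1.5, w_static=0.8):
    """E[(xi_t - xi_0)^2] = 2 W (1 - exp(-lam * t)) for a stationary scalar mode
    with regression law exp(-lam * t) and static variance W (assumed values)."""
    # -expm1(-x) = 1 - exp(-x), evaluated accurately for small x.
    return 2.0 * w_static * (-math.expm1(-lam * t))

for t in (1e-1, 1e-2, 1e-3, 1e-4):
    msi = mean_square_increment(t)
    print(f"t={t:g}  msi/t={msi / t:.6f}  msi/t^2={msi / t ** 2:.3e}")
```

As t decreases, msi/t settles at 2Wλ while msi/t² grows without bound, so the mean-square increment is of order t and not of order t². This is the scalar counterpart of the mismatch between the two sides of the equation above.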
5.2. The processes ζ and η

In view of this last observation, we proceed somewhat differently, starting with the definition
\[
\zeta_{t,s}^{(N)}(g) := \int_s^t du\, \eta_u^{(N)}(g) \quad \forall g \in D_{Vm}(\Omega),\ t, s \in \mathbb{R}. \tag{5.1}
\]
We assume that the cyclic vector $\Phi^{(N)}$ lies in the domain of all monomials in the operators $\xi_u^{(N)}(f)$ and $\zeta_{t,s}^{(N)}(g)$ as f and g run through $D_m(\Omega)$ and $D_{Vm}(\Omega)$, respectively, and t, s and u run through $\mathbb{R}$. We further assume that the correlation functions given by the expectation values of these monomials for the vector state $\Phi^{(N)}$ are continuous in their spatial test functions and time variables, that they converge pointwise to definite limits as $N \to \infty$, and that these limits satisfy the canonical counterparts to the assumptions of Propositions 3.1 and 3.3. It then follows, by analogy with the arguments of Sec. 3, that the quantum process $(\xi^{(N)}, \zeta^{(N)})$ converges to a classical one, (ξ, ζ), whose two components are indexed by $D_m(\Omega) \times \mathbb{R}$ and $D_{Vm}(\Omega) \times \mathbb{R}^2$, respectively, and are continuous with respect to their spatial test functions and measurable with respect to their time variables. In view of Eq. (5.1) and the fact that the process ζ is the limiting form of $\zeta^{(N)}$ as $N \to \infty$, we term ζ the time-integrated current. We note that since, by Eq. (5.1),
\[
\zeta_{t,s}^{(N)} \equiv \zeta_{t,u}^{(N)} + \zeta_{u,s}^{(N)} \quad \text{and} \quad \zeta_{t,t}^{(N)} \equiv 0,
\]
it follows that, correspondingly,
\[
\zeta_{t,s} \equiv \zeta_{t,u} + \zeta_{u,s} \quad \text{and} \quad \zeta_{t,t} \equiv 0. \tag{5.2}
\]
Further, by Eqs. (3.15) and (5.1),
\[
\xi_t^{(N)}(f) - \xi_s^{(N)}(f) = \zeta_{t,s}^{(N)}(\nabla f) \quad \forall f \in D_m(\Omega),\ t, s \in \mathbb{R},
\]
and hence, correspondingly,
\[
\xi_t(f) - \xi_s(f) = \zeta_{t,s}(\nabla f) \quad \forall f \in D_m(\Omega),\ t, s \in \mathbb{R}, \tag{5.3}
\]
which is just the local conservation law for ξ.

5.3. Extension of the regression hypothesis: Secular and stochastic currents

By Eq. (1.8), the increment $\delta j_t$ in the phenomenological current due to a perturbation $\delta q_t$ of the field $q_t$ is $K\delta q_t$. Correspondingly, by way of extending the regression hypothesis of Sec. 3, we designate the secular part of the time-integrated fluctuation current $\zeta_{t,s}$ to be
\[
\zeta_{t,s}^{sec} := \int_s^t du\, K\xi_u, \tag{5.4}
\]
where K, defined formally by Eq. (1.8), may now be interpreted as a mapping from $D_m(\Omega)$ into $D_{Vm}(\Omega)$. We define the time-integrated stochastic current to be the residual part of $\zeta_{t,s}$, namely $\tilde{\zeta}_{t,s} = \zeta_{t,s} - \zeta_{t,s}^{sec}$, i.e. by Eq. (5.4),
\[
\tilde{\zeta}_{t,s} = \zeta_{t,s} - \int_s^t du\, K\xi_u. \tag{5.5}
\]
In view of this formula, we may re-express the local conservation law (5.3) in the form
\[
\xi_t(f) - \xi_s(f) = \int_s^t du\, \xi_u(K^{\star} \nabla f) + \tilde{\zeta}_{t,s}(\nabla f),
\]
or equivalently, since Eqs. (1.9) and (3.15) imply that $\nabla \cdot K = -L$,
\[
\xi_t(f) - \xi_s(f) = \int_s^t du\, \xi_u(L^{\star} f) + w_{t,s}(f) \quad \forall f \in D_m(\Omega),\ t, s \in \mathbb{R}, \tag{5.6}
\]
where
\[
w_{t,s}(f) := \tilde{\zeta}_{t,s}(\nabla f) \quad \forall f \in D_m(\Omega),\ t, s \in \mathbb{R}. \tag{5.7}
\]
Further, since, by Eqs. (5.2) and (5.7),
\[
w_{t,s} \equiv w_{t,u} + w_{u,s} \quad \text{and} \quad w_{t,t} \equiv 0, \tag{5.8}
\]
Eq. (5.6) is formally a Langevin equation. However, the condition for it to qualify as a bona fide Langevin equation is that w has the temporal stochastic properties of a Wiener process. The following proposition, which we shall prove in Appendix A, establishes that its two-point function does have the requisite properties. Further
assumptions concerning the chaoticity of the time-integrated stochastic current $\tilde{\zeta}_{t,s}$, which will be introduced in Sec. 5.4, then lead to a picture in which w is indeed a fully-fledged Wiener process.

Proposition 5.1. Assuming the regression hypothesis, the local conservation law (5.3) and the definition of w,
\[
E\bigl( w_{t,s}(f)\, \xi_u(f') \bigr) = 0 \quad \forall t \ge s \ge u,\ f, f' \in D_m(\Omega) \tag{5.9}
\]
and
\[
E\bigl( w_{t,s}(f)\, w_{t',s'}(f') \bigr) = -\bigl[ W_S(L^{\star} f, f') + W_S(f, L^{\star} f') \bigr]\, |[s, t] \cap [s', t']| \quad \forall t, s\,(\le t),\ t', s'\,(\le t') \in \mathbb{R},\ f, f' \in D_m(\Omega), \tag{5.10}
\]
where the last factor represents the length of the intersection of the intervals [s, t] and [s′, t′] and $W_S$ is the two-point function defined by Eq. (4.5). Further, the process w is nontrivial, i.e. $w_{t,s}$ does not vanish.

5.4. The chaoticity and temporal continuity hypotheses

We assume that the stochastic current is chaotic in the sense that the space-time correlations of $\tilde{\zeta}_{t,s}(x)$ are of short range on the microscopic scale. This assumption is designed to represent Boltzmann’s hypothesis of molecular chaos, as transferred from the local particle velocities to the stochastic currents. Since $L_N$ tends to infinity with N, it signifies that the space-time correlations of $\tilde{\zeta}_{t,s}(x)$ are of zero range on the hydrodynamic scale. Further, in accordance with the central limit theorem for fluctuation fields with short range spatial correlations [39], we assume that the process $\tilde{\zeta}$ is Gaussian. Thus, our chaoticity hypothesis is that

(C.1) The process $\tilde{\zeta}$ is Gaussian;
(C.2) $E\bigl( \tilde{\zeta}_{t,s}(g)\, \tilde{\zeta}_{t',s'}(g') \bigr) = 0$ if $(s, t) \cap (s', t') = \emptyset$; and
(C.3) $E\bigl( \tilde{\zeta}_{t,s}(g)\, \tilde{\zeta}_{t',s'}(g') \bigr) = 0$ if $\mathrm{supp}(g) \cap \mathrm{supp}(g') = \emptyset$.

It follows immediately from (C.1) that the process $\tilde{\zeta}$ is completely determined by its two-point function $E\bigl( \tilde{\zeta}_{t,s}(g)\, \tilde{\zeta}_{t',s'}(g') \bigr)$. In view of the discussion following Eq. (5.1), this is continuous with respect to the test functions g and g′ and measurable with respect to the time variables t, s, t′ and s′. We now strengthen this conclusion by the following continuity hypothesis to the effect that it is continuous with respect to the time variables.

(C) The two-point function $E\bigl( \tilde{\zeta}_{t,s}(g)\, \tilde{\zeta}_{t',s'}(g') \bigr)$ is continuous with respect to the time variables t, s, t′, s′.

The following proposition, which we shall prove in Appendix B, stems from an application of a key theorem of Schwartz [33, Theorem 35] to the process $\tilde{\zeta}$, subject to the assumptions (C.2) and (C).

Proposition 5.2. Under the assumption of the hypotheses (C.2), (C.3) and (C), together with the condition of continuity with respect to its spatial test functions,
the two-point function for the process $\tilde{\zeta}$ takes the form
\[
E\bigl( \tilde{\zeta}_{t,s}(g)\, \tilde{\zeta}_{t',s'}(g') \bigr) = \Gamma(g, g')\, |[s, t] \cap [s', t']| \quad \forall g, g' \in D_{Vm}(\Omega),\ t, s, t', s' \in \mathbb{R}, \tag{5.11}
\]
where $\Gamma \in D'_{Vm}(\Omega) \otimes D'_{Vm}(\Omega)$ and $\mathrm{supp}(\Gamma) \subset \{(x, x') \in \Omega^2 \mid x' = x\}$.

5.5. A local equilibrium condition for the currents

In order to extend our local equilibrium condition to the stochastic currents of the nonlinear diffusion process, we start by formulating the two-point function at equilibrium for the process $\tilde{\zeta}$.

Equilibrium two point function for $\tilde{\zeta}$. Assuming again that the field q is uniform at equilibrium, we infer from Eqs. (1.6) and (1.7) that in this situation $L = \tilde{K}(q)\Delta$, with q constant. Hence, by Eqs. (3.8), (4.7), (5.7) and (5.10), together with the symmetry of J(q), which follows from Eq. (2.6),
\[
E_{eq}\bigl( \tilde{\zeta}_{t,s}(\nabla f)\, \tilde{\zeta}_{t',s'}(\nabla f') \bigr) = -\bigl[ \bigl( \Delta f, K(\theta) f' \bigr) + \bigl( K(\theta) f, \Delta f' \bigr) \bigr]\, |[s, t] \cap [s', t']|,
\]
which, by Eq. (4.16), is equivalent to the following formula for the unsmeared two-point function for $\tilde{\zeta}$.
\[
\frac{\partial^2}{\partial x_\mu \partial x'_\nu} E_{eq}\bigl( \tilde{\zeta}_{t,s;k,\mu}(x)\, \tilde{\zeta}_{t',s';l,\nu}(x') \bigr) = -2 K_{kl}(\theta)\, \Delta\delta(x - x')\, |[s, t] \cap [s', t']|, \tag{5.12}
\]
where $\tilde{\zeta}_{t,s;k,\mu}$ is the μth spatial component of the kth component of the field $\tilde{\zeta}_{t,s} = (\tilde{\zeta}_{t,s;1}, \ldots, \tilde{\zeta}_{t,s;m})$ and the summation convention is employed for the indices μ and ν. Recalling now our assumption, at the start of Sec. 2, that the interactions are translationally and rotationally invariant, we assume that the corresponding symmetries are unbroken in the pure equilibrium phase and thus that the process $\tilde{\zeta}$ is invariant under the space translations and rotations that are implemented within the confines of Ω. We remark here that the limitation in Euclidean symmetry imposed by the boundedness of Ω is not serious from the physical standpoint, since Ω is an open subset of X and so any point of it, as viewed in the microscopic picture, is infinitely far from the boundary of Σ. Assuming then that the equilibrium two-point function for $\tilde{\zeta}$ is invariant under space translations and rotations, we may express it in the form
\[
E_{eq}\bigl( \tilde{\zeta}_{t,s;k,\mu}(x)\, \tilde{\zeta}_{t',s';l,\nu}(x') \bigr) = S_{kl}(x - x')\, \delta_{\mu\nu}\, |[s, t] \cap [s', t']|, \tag{5.13}
\]
where $S_{kl} \in D'(\Omega)$. It follows from this formula that Eq. (5.12) reduces to the following differential equation for $S_{kl}$.
\[
\Delta S_{kl}(x) = 2 K_{kl}(\theta)\, \Delta\delta(x). \tag{5.14}
\]
Further, by condition (C.3) and Eq. (5.13), the distribution $S_{kl}$ has support at the origin, and therefore [33, Theorem 35] $S_{kl}(x - x')$ is a finite linear combination of $\delta(x - x')$ and its derivatives. Hence, the only admissible solution of Eq. (5.14) is
\[
S_{kl}(x) = 2 K_{kl}(\theta)\, \delta(x)
\]
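and therefore, by Eq. (5.13), the equilibrium two-point function for $\tilde{\zeta}$ is given by the formula
\[
E_{eq}\bigl( \tilde{\zeta}_{t,s;k,\mu}(x)\, \tilde{\zeta}_{t',s';l,\nu}(x') \bigr) = 2 K_{kl}(\theta)\, \delta(x - x')\, \delta_{\mu\nu}\, |[s, t] \cap [s', t']|. \tag{5.15}
\]
Equivalently, the equilibrium two-point function for the smeared field $\tilde{\zeta}_{t,s}(g)$ takes the form
\[
E_{eq}\bigl( \tilde{\zeta}_{t,s}(g)\, \tilde{\zeta}_{t',s'}(g') \bigr) = 2 \bigl( g, K(\theta) g' \bigr)_V\, |[s, t] \cap [s', t']| \quad \forall g, g' \in D_{Vm}(\Omega),\ t, s, t', s' \in \mathbb{R}, \tag{5.16}
\]
where $(.\,,.)_V$ is the inner product in $D_{Vm}(\Omega)$ defined by the formula
\[
(g, g')_V = \sum_{k=1}^{m} \int_{\Omega} dx\, g_k(x) \cdot g'_k(x) \equiv \sum_{k=1}^{m} \sum_{\mu=1}^{d} \int_{\Omega} dx\, g_{k,\mu}(x)\, g'_{k,\mu}(x) \quad \forall g, g' \in D_{Vm}(\Omega), \tag{5.17}
\]
and where $g_{k,\mu}$ is the μth spatial component of $g_k$.

Local property of the equilibrium two-point function. We formulate the local properties of the stochastic current $\tilde{\zeta}$ along the lines employed in Sec. 4.2 for the process ξ. Thus, for $(x_0, \epsilon) \in \Omega \times \mathbb{R}_+$, and ε sufficiently small, we define the transformation $g \to g_{x_0,\epsilon}$ of $D_{Vm}(\Omega)$ by the formula
\[
g_{x_0,\epsilon}(x) = \epsilon^{-d/2} g\bigl( \epsilon^{-1}(x - x_0) \bigr). \tag{5.18}
\]
We then observe that, by Eqs. (5.17) and (5.18), the transformations $t \to \epsilon^2 t$, $g \to g_{x_0,\epsilon}$, of the times and test functions, lead to the multiplication of the smeared two-point function of Eq. (5.16) by the factor $\epsilon^2$. Thus,
\[
\epsilon^{-2} E_{eq}\bigl( \tilde{\zeta}_{\epsilon^2 t, \epsilon^2 s}(g_{x_0,\epsilon})\, \tilde{\zeta}_{\epsilon^2 t', \epsilon^2 s'}(g'_{x_0,\epsilon}) \bigr) = 2 \bigl( g, K(\theta) g' \bigr)_V\, |[s, t] \cap [s', t']| \quad \forall x_0 \in \Omega,\ g, g' \in D_{Vm}(\Omega),\ t, s, t', s' \in \mathbb{R}. \tag{5.19}
\]
The local property of the two-point function for $\tilde{\zeta}$ at the point $x_0$ is then obtained by passing to the limiting form of this equation as $\epsilon \to 0$.

Local equilibrium property for the stochastic current in the nonequilibrium steady state. In view of the last observation, we assume that, in the nonequilibrium steady state, the process $\tilde{\zeta}$ enjoys the local equilibrium property obtained by passing to the limit $\epsilon \to 0$ and replacing $E_{eq}$ and θ by E and $\theta(x_0)$, respectively, in Eq. (5.19). Thus, we assume that
\[
\lim_{\epsilon \to 0} \epsilon^{-2} E\bigl( \tilde{\zeta}_{\epsilon^2 t, \epsilon^2 s}(g_{x_0,\epsilon})\, \tilde{\zeta}_{\epsilon^2 t', \epsilon^2 s'}(g'_{x_0,\epsilon}) \bigr) = 2 \bigl( g, K(\theta(x_0)) g' \bigr)_V\, |[s, t] \cap [s', t']| \quad \forall x_0 \in \Omega,\ g, g' \in D_{Vm}(\Omega),\ t, s\,(\le t),\ t', s'\,(\le t') \in \mathbb{R}. \tag{5.20}
\]
This is our local equilibrium condition for the stochastic current.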
5.6. Explicit form of the two point function for $\tilde{\zeta}$

By Proposition 5.2, this function is determined by the functional Γ, which by Eqs. (5.11) and (5.20), possesses the following local equilibrium property.
\[
\lim_{\epsilon \downarrow 0} \Gamma(g_{x_0,\epsilon}, g'_{x_0,\epsilon}) = 2 \bigl( g, K(\theta(x_0)) g' \bigr)_V \quad \forall x_0 \in \Omega,\ g, g' \in D_{Vm}(\Omega). \tag{5.21}
\]
The following proposition, which will be proved in Appendix C, provides an explicit formula for the functional Γ, which stems from a combination of the chaoticity condition (C.3) and the local equilibrium condition (5.21).

Proposition 5.3. Under the previous assumptions, together with the local equilibrium condition of (5.21), Γ is given by the formula
\[
\Gamma(g, g') = 2 (g, K_\theta g')_V \quad \forall g, g' \in D_{Vm}(\Omega), \tag{5.22}
\]
where $K_\theta$ is the matrix-valued operator $K \circ \theta$ in $D_{Vm}(\Omega)$, i.e.
\[
K_\theta(x) = K\bigl( \theta(x) \bigr). \tag{5.23}
\]
The following corollary is an immediate consequence of this proposition and Proposition 5.2.

Corollary 5.4. Under the same assumptions, the two-point function of the stationary process $\tilde{\zeta}$ is given by the formula
\[
E\bigl( \tilde{\zeta}_{t,s}(g)\, \tilde{\zeta}_{t',s'}(g') \bigr) = 2 (g, K_\theta g')_V\, |[s, t] \cap [s', t']| \quad \forall g, g' \in D_{Vm}(\Omega),\ t, s\,(\le t),\ t', s'\,(\le t') \in \mathbb{R}. \tag{5.24}
\]

5.7. The generalized Onsager–Machlup process ξ

It now follows immediately from Corollary 5.4 and Eq. (5.7) that
\[
E\bigl( w_{t,s}(f)\, w_{t',s'}(f') \bigr) = 2 (\nabla f, K_\theta \nabla f')_V\, |[s, t] \cap [s', t']| \quad \forall f, f' \in D_m(\Omega),\ t, s, t', s' \in \mathbb{R}. \tag{5.25}
\]
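The time dependence in Eq. (5.25) enters only through the length of the intersection of the two intervals. The following Python sketch spells out that factor and checks two of its consequences: vanishing covariance for disjoint intervals, as in (C.2), and compatibility with the additivity property (5.8). The function names are hypothetical, and `strength` stands in for the number 2(∇f, K_θ∇f′)_V, which is simply taken here as a given constant; intervals are assumed to be ordered, s ≤ t.

```python
def overlap_length(s, t, s2, t2):
    """Length of [s, t] ∩ [s2, t2], assuming s <= t and s2 <= t2."""
    return max(0.0, min(t, t2) - max(s, s2))

def w_covariance(strength, s, t, s2, t2):
    """Covariance E[w_{t,s}(f) w_{t2,s2}(f')] in the form of Eq. (5.25),
    with strength playing the role of 2 (grad f, K_theta grad f')_V."""
    return strength * overlap_length(s, t, s2, t2)

strength = 0.7  # hypothetical value of 2 (grad f, K_theta grad f)_V

# Disjoint time intervals are uncorrelated.
disjoint = w_covariance(strength, 0.0, 1.0, 2.0, 3.0)

# Additivity w_{t,s} = w_{t,u} + w_{u,s}: the variances add because the
# cross term belongs to intervals that meet only at the single point u.
s, u, t = 0.0, 0.4, 1.0
var_total = w_covariance(strength, s, t, s, t)
var_parts = (
    w_covariance(strength, u, t, u, t)
    + w_covariance(strength, s, u, s, u)
    + 2.0 * w_covariance(strength, u, t, s, u)
)
print(disjoint, var_total, var_parts)
```

A variance proportional to the elapsed time, with independent increments on non-overlapping intervals, is the covariance structure of a Wiener process.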
Hence, by the chaotic hypothesis (C.1) and Eqs. (5.7) and (5.25), w is a generalized Wiener process. Further, on re-expressing Eq. (5.6) in the form
\[
d\xi_t = L\xi_t\, dt + dw_{t,s}, \tag{5.26}
\]
we see that, in view of the additive property (5.8) of w, the fluctuation field ξ executes a generalized Onsager–Machlup process; while Eq. (5.25) signifies that the two-point function for w corresponds precisely to that assumed for the stochastic force in Landau’s fluctuation hydrodynamics [18].

In order to derive the properties of the process ξ from those of w, we note that, since L is the generator of $T(\mathbb{R}_+)$, the solution of the Langevin equation (5.26) is
given by the formula
\[
\xi_t = T_{t-s}\, \xi_s + \int_s^t T_{t-u}\, dw_{u,s} \quad \forall t, s\,(\le t) \in \mathbb{R}, \tag{5.27}
\]
or equivalently,
\[
\xi_t(f) = \xi_s(T^{\star}_{t-s} f) + \int_s^t dw_{u,s}(T^{\star}_{t-u} f) \quad \forall f \in D_m(\Omega),\ t, s\,(\le t) \in \mathbb{R}_+. \tag{5.28}
\]
The following proposition, which we shall prove in Appendix D, is a natural generalization of standard properties of the Brownian motion of a single particle that ensue from the Langevin equation governing its velocity (cf. [36]).

Proposition 5.5. Under the above assumptions, (i) ξ is a Gaussian Markov process, and (ii) the fields $w_{t,s}$ and $\xi_u$ are statistically independent of one another if s and t are greater than or equal to u.

Comment. It follows from this proposition and Eqs. (4.5) and (4.6) that the process ξ is completely determined by the forms of the semigroup $T(\mathbb{R}_+)$ and the distribution $W_S$.

6. Long Range Spatial Correlations of the ξ-Process

6.1. The static two-point function for ξ

By Eq. (4.5), the unsmeared form of the $D'_m(\Omega) \otimes D'_m(\Omega)$-class distribution $W_S$ is given by the formula
\[
W_S(x, x') = E\bigl( \xi(x) \otimes \xi(x') \bigr). \tag{6.1}
\]
The following Proposition provides an explicit formula for $W_S$, as well as a differential equation for this distribution in terms of the semigroup $T(\mathbb{R}_+)$ and the transport function $K_\theta$.

Proposition 6.1. Under the above assumptions,
\[
W_S(f, f') = 2 \int_0^\infty dt\, \bigl( \nabla T^{\star}_t f, K_\theta \nabla T^{\star}_t f' \bigr)_V \quad \forall f, f' \in D_m(\Omega) \tag{6.2}
\]
and, further, the generalized function $W_S(x, x')$ satisfies the equation
\[
[L \otimes I + I \otimes L']\, W_S(x, x') = 2 \nabla \cdot \bigl( K_\theta(x) \nabla \delta(x - x') \bigr), \tag{6.3}
\]
where L′ is the version of L that acts on functions of x′.

Proof. By Eq. (4.5) and the stationarity of the ξ-process,
\[
W_S(f, f') = E\bigl( \xi_t(f)\, \xi_t(f') \bigr) \quad \forall f, f' \in D_m(\Omega),\ t \in \mathbb{R}_+
\]
and therefore, by Eq. (5.27),
\[
\begin{aligned}
W_S(f, f') = {} & E\bigl( \xi(T^{\star}_t f)\, \xi(T^{\star}_t f') \bigr) + \int_0^t E\bigl( \xi(T^{\star}_t f)\, dw_{u,0}(T^{\star}_{t-u} f') \bigr) + \int_0^t E\bigl( \xi(T^{\star}_t f')\, dw_{u,0}(T^{\star}_{t-u} f) \bigr) \\
& + \int_0^t \int_0^t E\bigl( dw_{u,0}(T^{\star}_{t-u} f)\, dw_{u',0}(T^{\star}_{t-u'} f') \bigr) \quad \forall f, f' \in D_m(\Omega),\ t \in \mathbb{R}_+.
\end{aligned} \tag{6.4}
\]
Now, by the dissipativity condition (3.12), the first term on the r.h.s. of this equation vanishes in the limit $t \to \infty$, while by Eq. (5.9), the second and third terms there vanish. Hence, it follows from Eq. (6.4) that
\[
W_S(f, f') = \lim_{t \to \infty} \int_0^t \int_0^t E\bigl( dw_{u,0}(T^{\star}_{t-u} f)\, dw_{u',0}(T^{\star}_{t-u'} f') \bigr) \quad \forall f, f' \in D_m(\Omega). \tag{6.5}
\]
Further, by Eq. (5.25),
\[
E\bigl( dw_{u,0}(f)\, dw_{u',0}(f') \bigr) = 2 (\nabla f, K_\theta \nabla f')_V\, \delta(u - u')\, du\, du'
\]
and consequently Eq. (6.5) reduces to the form
\[
W_S(f, f') = \lim_{t \to \infty} 2 \int_0^t du\, \bigl( \nabla T^{\star}_{t-u} f, K_\theta \nabla T^{\star}_{t-u} f' \bigr)_V \equiv \lim_{t \to \infty} 2 \int_0^t du\, \bigl( \nabla T^{\star}_u f, K_\theta \nabla T^{\star}_u f' \bigr)_V,
\]
which is equivalent to the required formula (6.2). Further, since L is the generator of $T(\mathbb{R}_+)$, it follows from Eq. (6.2) that
\[
W_S(L^{\star} f, f') + W_S(f, L^{\star} f') = 2 \int_0^\infty dt\, \frac{d}{dt} \bigl( \nabla T^{\star}_t f, K_\theta \nabla T^{\star}_t f' \bigr)_V
\]
and consequently, by the dissipativity condition (3.12),
\[
W_S(L^{\star} f, f') + W_S(f, L^{\star} f') = -2 (\nabla f, K_\theta \nabla f')_V \quad \forall f, f' \in D_m(\Omega),
\]
which, by Eq. (6.1), is equivalent to the required formula (6.3).

6.2. Long range spatial correlations

In order to provide a precise characterization of long range correlations, we first recall that the ratio of the macroscopic length scale to the microscopic one is infinite. Consequently, correlations of finite range on the microscopic scale are of zero range on the macroscopic one. Accordingly, we term the range of correlations “short” or “long” according to whether or not it reduces to zero in the macroscopic picture. Thus, our condition for long range spatial correlations for the ξ-field is simply that the support of the distribution $W_S$ does not lie in the domain $\{(x, x') \in \Omega^2 \mid x = x'\}$. The following proposition establishes that the spatial correlations of ξ for the nonlinear diffusion process are generically of long range.
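The structure of the proof of Proposition 6.1 has a familiar finite-dimensional counterpart, which may help in reading Eqs. (6.2) to (6.5). In the sketch below the field ξ is replaced, as an assumption made only for illustration, by a two-component linear Langevin system with a stable drift matrix A (standing in for L) and a symmetric noise matrix Q (standing in for 2∇·K_θ∇ with the sign absorbed). The covariance accumulated from the noise alone, C(t) = ∫₀ᵗ e^{Au} Q e^{Aᵀu} du, is the analogue of the double integral in Eq. (6.5); its large-time limit should satisfy A W + W Aᵀ = −Q, the analogue of Eq. (6.3). The matrices, step size and time horizon are hypothetical.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose of a 2x2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def add(a, b, scale=1.0):
    """Return a + scale * b for 2x2 matrices."""
    return [[a[i][j] + scale * b[i][j] for j in range(2)] for i in range(2)]

def covariance_rate(drift, cov, noise):
    """dC/dt = A C + C A^T + Q for the noise-driven part of the covariance."""
    return add(add(matmul(drift, cov), matmul(cov, transpose(drift))), noise)

def stationary_covariance(drift, noise, dt=1e-3, t_max=40.0):
    """Integrate dC/dt from C(0) = 0 by explicit Euler steps up to t_max."""
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(int(t_max / dt)):
        cov = add(cov, covariance_rate(drift, cov, noise), scale=dt)
    return cov

drift = [[-1.0, 0.5], [0.0, -2.0]]   # stable: eigenvalues -1 and -2
noise = [[2.0, 0.4], [0.4, 1.0]]     # symmetric, positive definite

w_static = stationary_covariance(drift, noise)
residual = covariance_rate(drift, w_static, noise)  # A W + W A^T + Q
print(w_static)
print(residual)
```

The residual is zero to rounding accuracy, and the computed W is symmetric, mirroring the way Eq. (6.3) follows from Eq. (6.2) once the contribution of the initial data has decayed.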
Proposition 6.2. Let $\Phi_q$ be the m-by-m matrix-valued function on Ω defined by the formula
\[
\Phi_q(x) = \Delta K_\theta(x) + \nabla \cdot \Psi_q(x), \tag{6.6}
\]
where
\[
\Psi_{q;kl}(q; x) = \sum_{k',l'=1}^{m} \frac{\partial}{\partial q_{l'}(x)} \tilde{K}_{kk'}\bigl( q(x) \bigr) \times \Bigl[ J_{l'l}\bigl( q(x) \bigr) \nabla q_{k'}(x) - J_{k'l}\bigl( q(x) \bigr) \nabla q_{l'}(x) \Bigr]. \tag{6.7}
\]
Then under the above assumptions, a sufficient condition for the spatial correlations of ξ to be of long range is that either $\Phi_q$ does not vanish or that the matrix $\Psi_q$ is not symmetric.

Comments. (1) The Proposition establishes that the correlations are generically of long range, since the specified conditions on $\Phi_q$ and $\Psi_q$ can be satisfied only for special relationships between the functions $\tilde{K} \circ q$ and $s \circ q$; and these are generally independent of one another, since s and $\tilde{K}$ govern the equilibrium and transport properties, respectively, of Σ. By contrast, the corresponding correlations for equilibrium states are generically of short range, except at critical points. A treatment of critical equilibrium correlations of fluctuation observables is provided by [40].

(2) In the particular case of the symmetric exclusion process [9–11], n = 1, d = 1, $\tilde{K}(q) = 1$, $s(q) = -q \ln q - (1 - q) \ln(1 - q)$ and $q(x) = a + b \cdot x$, where a and b (≠ 0) are constants. Thus, in this case, it follows from Eqs. (1.6), (2.6), (6.6) and (6.7) that $\Psi_q = 0$ and $\Phi_q(x) = -2b^2 \neq 0$. Hence, long range correlations prevail in this model, in accordance with the results obtained by its explicit solution in [9–11].

Proof of Proposition 6.2. Suppose that the static spatial correlations of ξ are not of long range, i.e. that the support of the distribution $W_S$ lies in the domain $\{(x, x') \in \Omega^2 \mid x' = x\}$. Then it follows from this supposition and the local equilibrium condition (4.14), by precise analogy with the derivation of Eq. (5.24) from corresponding conditions of zero range correlations and local equilibrium for the process $\tilde{\zeta}$, that
\[
W_S(x, x') = J_q(x)\, \delta(x - x'), \tag{6.8}
\]
where
\[
J_q(x) := J\bigl( q(x) \bigr). \tag{6.9}
\]
Hence, by Eqs. (1.10), (3.8) and (6.7)–(6.9),
\[
(L \otimes I) W(x, x') = \Delta [K_\theta(x)\, \delta(x - x')] + \nabla \cdot [\Psi_q(x)\, \delta(x - x')]. \tag{6.10}
\]
Further, by Eq. (6.1),
\[
(I \otimes L') W(x, x') = [(L \otimes I) W(x', x)]^{tr},
\]
where the superscript tr denotes transpose, and therefore, by Eq. (6.10),
$$(I \otimes L)W_S(x,x') = \Delta'[K_\theta(x')\delta(x'-x)]^{tr} + \nabla'\cdot[\Psi_q(x')\delta(x'-x)]^{tr}, \tag{6.11}$$
where $\Delta'$ and $\nabla'$ are the versions of $\Delta$ and $\nabla$, respectively, that act on functions of $x'$. Consequently, since $K_\theta$ is symmetric, by Eqs. (4.16) and (5.23), it follows from Eqs. (6.6), (6.10) and (6.11) that
$$[L \otimes I + I \otimes L]W_S(x,x') = 2\nabla\cdot K_\theta(x)\nabla\delta(x-x') + \Phi_q(x)\delta(x-x') + [\Psi_q(x) - \Psi_q^{tr}(x)]\cdot\nabla\delta(x-x'). \tag{6.12}$$
On comparing this equation with Eq. (6.3), we see that
$$\Phi_q(x)\delta(x-x') + [\Psi_q(x) - \Psi_q^{tr}(x)]\cdot\nabla\delta(x-x') = 0,$$
i.e. that $\Phi_q$ vanishes and that $\Psi_q$ is symmetric. These, then, are conditions that ensue from the assumption of short range correlations of the ξ-process. We conclude, therefore, that the violation of either of these conditions signifies that the correlations are of long range.

7. Concluding Remarks

We have proposed a macrostatistical treatment of nonequilibrium steady states of quantum systems that is centered on the fluctuations of their hydrodynamical variables. The key physical assumptions on which this treatment is based are
(a) the regression hypothesis for the hydrodynamic fluctuation field ξ;
(b) the chaoticity of the associated currents, as represented by their time integrals $\zeta_{t,s}$;
(c) the local equilibrium conditions on the stochastic process comprising ξ and ζ;
(d) the space-time scale invariance of the phenomenological equation of motion (1.4), as exemplified by the case of nonlinear diffusions; and
(e) the invariance of the quantum field $\hat q$, and correspondingly of the classical field ξ, under time reversals.

On the basis of these assumptions and certain technical ones, we have obtained a picture that provides natural generalizations of the Onsager reciprocity relations and the Onsager–Machlup fluctuation process to nonequilibrium steady states, together with a demonstration that the spatial correlations of the hydrodynamical variables are generically of long range in these states. Furthermore, this picture is expressed exclusively in terms of the phenomenological functions representing the equilibrium entropy, $s(q)$, the transport coefficients, $K(\theta)$, and the hydrodynamical boundary conditions. This may easily be seen from the comment at the end of Sec. 5, together with Eqs. (1.10), (3.8) and (6.2), and the fact that the semigroup $T(\mathbb{R}_+)$ is completely determined by its generator L.

Let us now discuss the assumptions (a)–(e) a little further. In our view, for reasons expressed in Secs. 4.1, 4.2 and 5.4, the first three of these seem natural from
the physical standpoint, though they are very hard to prove in concrete cases. On the other hand, it is clear that assumptions (d) and (e) are not universally valid: for example, they both fail in the important case of Navier–Stokes hydrodynamics. Consequently, it is of interest to consider how the macrostatistical picture presented here might be extended to situations where (d) and (e) are replaced by weaker assumptions.

In fact, the weakening of (e) provides no serious problems, since the locally conserved fields of continuum mechanics are generally either even or odd with respect to time reversals [41]. Accordingly, we replace (e) by the assumption that each of the quantum fields, $\hat q_j$, has either even or odd parity with respect to time reversals, i.e. that
$$\tau\hat q_j(x) = R_j\hat q_j(x), \qquad R_j = \pm 1, \qquad j = 1,\dots,n, \tag{7.1}$$
where again τ is the time-reversal antiautomorphism. This weakened assumption then leads to the nonlinear version of Casimir's extension [41] of Onsager's theory, wherein Eq. (4.16) is modified to the formula
$$K_{kl}\big(\theta(x_0)\big) = R_k R_l K_{lk}\big(R\theta(x_0)\big), \tag{7.2}$$
where
$$R\theta := (R_1\theta_1,\dots,R_n\theta_n). \tag{7.3}$$
Similarly, the modification of assumption (e) to the form given by Eq. (7.1) presents no serious problems for the other issues treated here.

On the other hand, there does not appear to be any natural generalization of the scaling assumption (d), which lay behind the interdependence of the ratios of the macroscopic to microscopic scales for distance and time, the former ratio being $L_N$ and the latter $L_N^2$ (or more generally $L_N^k$). Moreover, one sees from Eqs. (2.15) and (3.13) that this interdependence was essential to the limit procedures of Eqs. (3.1) and (3.20). Nevertheless, it does not appear to be essential to the key physical ideas that (i) the ratios of the macroscopic to microscopic scales for both distance and time are extremely large, and (ii) the currents associated with the locally conserved quantum fields satisfy the chaoticity assumption of Sec. 5.4, whereby the space-time correlations of their fluctuations decay within microscopic distances and times. Since such chaoticity does not necessarily require any interdependence of the ratios of the macroscopic to microscopic scales for distance and time, it appears reasonable to expect that some version of the present macrostatistical model should still be applicable even in the absence of macroscopic space-time scale invariance. Thus, from the standpoint of mathematical physics, a most challenging question is whether the present scheme can be generalized to a setting which does not require the scale invariance of the macroscopic law (1.4). Presumably such a generalization would require a difficult multi-scale analysis.
Appendix A. Proof of Proposition 5.1

We shall first prove Eqs. (5.9) and (5.10) and then demonstrate the nontriviality of the process w. Since L is the generator of $T(\mathbb{R}_+)$, Eq. (5.9) follows immediately from Eqs. (4.2) and (5.6). It then follows from Eqs. (5.6) and (5.9) that the l.h.s. of Eq. (5.10) vanishes if the intervals $[s,t]$ and $[s',t']$ do not intersect. Hence, in view of Eq. (5.8), the proof of Eq. (5.10) reduces to that of the same formula with $s' = s$, $t' = t$ and $t \ge s$. Thus, it suffices for us to prove that
$$E\big(w_{t,s}(f)\,w_{t,s}(f')\big) = -\big[W_S(Lf,f') + W_S(f,Lf')\big]\,|t-s| \qquad \forall t, s\,(\le t) \in \mathbb{R},\ f,f' \in \mathcal{D}^m(\Omega). \tag{A.1}$$
We start by inferring from Eq. (5.6) that the l.h.s. of (A.1) is the sum of the following four terms:
$$E\big((\xi_t(f)-\xi_s(f))(\xi_t(f')-\xi_s(f'))\big), \tag{a}$$
$$-\int_s^t du\, E\big((\xi_t(f)-\xi_s(f))\,\xi_u(Lf')\big), \tag{b}$$
$$-\int_s^t du\, E\big(\xi_u(Lf)\,(\xi_t(f')-\xi_s(f'))\big) \tag{c}$$
and
$$\int_s^t du \int_s^t dv\, E\big(\xi_u(Lf)\,\xi_v(Lf')\big). \tag{d}$$
Since $t \ge s$ and the ξ-process is stationary, it follows from Eqs. (4.5) and (4.6) that
$$\text{Term (a)} = 2W_S(f,f') - W_S(T_{t-s}f,f') - W_S(f,T_{t-s}f'), \tag{A.2}$$
$$\text{Term (b)} = -\int_s^t du\, W_S(T_{t-u}f,Lf') + \int_s^t du\, W_S(f,T_{u-s}Lf'), \tag{A.3}$$
$$\text{Term (c)} = -\int_s^t du\, W_S(Lf,T_{t-u}f') + \int_s^t du\, W_S(T_{u-s}Lf,f') \tag{A.4}$$
and
$$\text{Term (d)} = \int_s^t du \int_s^u dv\, W_S(T_{u-v}Lf,Lf') + \int_s^t du \int_u^t dv\, W_S(Lf,T_{v-u}Lf'). \tag{A.5}$$
Since $W_S$ is linear in each of its arguments and since L is the generator of $T(\mathbb{R}_+)$, it follows that (A.3)–(A.5) may be re-expressed in the following forms.
$$\text{Term (b)} = -\int_s^t du\, W_S(T_{t-u}f,Lf') + W_S(f,T_{t-s}f') - W_S(f,f'), \tag{A.6}$$
$$\text{Term (c)} = -\int_s^t du\, W_S(Lf,T_{t-u}f') + W_S(T_{t-s}f,f') - W_S(f,f') \tag{A.7}$$
and
$$\text{Term (d)} = \int_s^t du\, \big[-W_S(f,Lf') + W_S(T_{u-s}f,Lf') + W_S(Lf,T_{t-u}f') - W_S(Lf,f')\big]. \tag{A.8}$$
It follows now from (A.2) and (A.6)–(A.8) that the sum of the terms (a), (b), (c) and (d), which comprises the l.h.s. of (A.1), is equal to the r.h.s. of that equation. This completes the proof of (A.1) and thus of Eq. (5.10).

Finally, we employ a reductio ad absurdum method to establish the nontriviality of the process w. Thus, we assume that $w_{t,s}$ vanishes. It then follows from Eq. (5.7) that
$$W_S(Lf,f') + W_S(f,Lf') = 0 \qquad \forall f,f' \in \mathcal{D}^m(\Omega)$$
and hence, that
$$W_S(LT_tf,T_tf') + W_S(T_tf,LT_tf') = 0 \qquad \forall f,f' \in \mathcal{D}^m(\Omega),\ t \in \mathbb{R}_+.$$
Since $W_S$ is linear in each of its arguments and since L is the generator of $T(\mathbb{R}_+)$, this signifies that
$$\frac{d}{dt}W_S(T_tf,T_tf') = 0$$
and therefore, since $T_0 = I$, that
$$W_S(T_tf,T_tf') = W_S(f,f') \qquad \forall f,f' \in \mathcal{D}^m(\Omega),\ t \in \mathbb{R}_+. \tag{A.9}$$
Moreover, by Eq. (4.5) and the dissipativity condition (3.12), the l.h.s. of (A.9) vanishes in the limit $t \to \infty$. Hence, (A.9) implies that the static two-point function $W_S$ vanishes. This conflicts with the fact that, by Eqs. (4.5), (4.11) and (4.14),
$$\lim_{\epsilon\downarrow 0} W_S(f_{x_0,\epsilon},f'_{x_0,\epsilon}) = \big(f, J(q(x_0))f'\big),$$
which does not vanish identically. This contradiction establishes that the assumption of the triviality of w is untenable and thus completes the proof of the proposition.

Appendix B. Proof of Proposition 5.2

We start by noting that, in view of Eq. (5.2) and condition (C.2), the proof of this proposition reduces to that of the formula (5.11) for the particular case where
$s' = s$, $t' = t$ and $s \le t$. Thus, we need only prove that
$$E\big(\zeta_{t,s}(g)\,\zeta_{t,s}(g')\big) = \Gamma(g,g')(t-s) \qquad \forall g,g' \in \mathcal{D}_V^m(\Omega),\ t, s\,(\le t) \in \mathbb{R}, \tag{B.1}$$
where Γ is an element of $\mathcal{D}_V'^{\,m} \otimes \mathcal{D}_V'^{\,m}$ with support in the domain $\{(x,x') \in \Omega^2 \mid x' = x\}$. To this end, we start by defining
$$F_{g,g'}(t,s) := E\big(\zeta_{t,s}(g)\,\zeta_{t,s}(g')\big) \tag{B.2}$$
and inferring from Eq. (5.2) and condition (C.2) that
$$F_{g,g'}(t,s) = F_{g,g'}(t,u) + F_{g,g'}(u,s) \qquad \text{for } t \ge u \ge s. \tag{B.3}$$
Further, by (B.2) and the stationarity of the process ζ,
$$F_{g,g'}(t,s) = F_{g,g'}(t+b,s+b) \qquad \forall b \in \mathbb{R},$$
which signifies that $F_{g,g'}$ may be expressed in the form
$$F_{g,g'}(t,s) = \tilde F_{g,g'}(t-s) \qquad \forall s,t \in \mathbb{R}, \tag{B.4}$$
where, by condition (C), $\tilde F_{g,g'}$ is a continuous function on $\mathbb{R}$. It follows now from (B.3) and (B.4) that
$$\tilde F_{g,g'}(t) + \tilde F_{g,g'}(t') = \tilde F_{g,g'}(t+t') \qquad \forall t,t' \in \mathbb{R}_+ \tag{B.5}$$
and hence that
$$\tilde F_{g,g'}(nt) = n\tilde F_{g,g'}(t) \qquad \forall t \in \mathbb{R}_+,\ n \in \mathbb{N}$$
or, equivalently,
$$\tilde F_{g,g'}(t) = n'\tilde F_{g,g'}(t/n') \qquad \forall t \in \mathbb{R}_+,\ n' \in \mathbb{N}\setminus\{0\}.$$
These last two equations imply that $\tilde F_{g,g'}(rt) = r\tilde F_{g,g'}(t)$ for all non-negative t and positive rational r; and further, by the continuity of $\tilde F_{g,g'}$ ensured by condition (C), this result extends to all positive r. Hence, the action of $\tilde F_{g,g'}$ on $\mathbb{R}_+$ takes the form
$$\tilde F_{g,g'}(t) = \Gamma(g,g')\,t \qquad \forall t \in \mathbb{R}_+, \tag{B.6}$$
where $\Gamma(g,g') := \tilde F_{g,g'}(1)$. By (B.2) and (B.4), (B.6) is equivalent to the required formula (B.1); and, moreover, it follows from condition (C.2) and the continuity and linearity of the l.h.s. of that equation with respect to the test functions g and g′ that Γ is indeed an element of $\mathcal{D}_V'^{\,m} \otimes \mathcal{D}_V'^{\,m}$ with support in the domain $\{(x,x') \in \Omega^2 \mid x' = x\}$.
Appendix C. Proof of Proposition 5.3

We base the proof of Proposition 5.3 on the following lemma.

Lemma C.1. Let $\Omega_1$ be any open subset of Ω whose boundary, $\partial\Omega_1$, does not intersect ∂Ω. Then, under the assumptions of Proposition 5.2, the restriction of the two-point function Γ to the spatial domain $\Omega_1^2$ is given by a finite sum of the following form.
$$\Gamma(g,g') = \sum_{n,n' \in \mathbb{N}^d}\ \sum_{k,l=1}^{m}\ \sum_{\mu,\nu=1}^{d} \int_\Omega dx\, C^{n,n'}_{k,l;\mu,\nu}(x)\,\partial_x^n g_{k,\mu}(x)\,\partial_x^{n'} g'_{l,\nu}(x) \qquad \forall g,g' \in \mathcal{D}_V^m(\Omega_1),\ t,s,t',s' \in \mathbb{R}, \tag{C.1}$$
where (i) the C's are continuous functions on Ω with support in some arbitrary neighborhood of $\Omega_1$; (ii) $g_{k,\mu}$ is the µth spatial component of the kth component of $g = (g_1,\dots,g_m)$; and (iii) for $n = (n_1,\dots,n_d) \in \mathbb{N}^d$, $\partial_x^n := \partial^{n_1+\cdots+n_d}/\partial x_1^{n_1}\cdots\partial x_d^{n_d}$.

Proof of Proposition 5.3 assuming Lemma C.1. We start by inferring from Eq. (5.18) that, for any $g,g' \in \mathcal{D}_V^m(\Omega)$, $x_0 \in \Omega$ and ε sufficiently small, one can find an open subset $\Omega_1$ of Ω such that $g_{x_0,\epsilon}$ and $g'_{x_0,\epsilon}$ lie in $\mathcal{D}_V^m(\Omega_1)$. Hence, by Eqs. (5.18) and (C.1),
$$\Gamma(g_{x_0,\epsilon},g'_{x_0,\epsilon}) = \sum_{n,n' \in \mathbb{N}^d}\ \sum_{k,l=1}^{m}\ \sum_{\mu,\nu=1}^{d} \epsilon^{-(|n+n'|)} \int_X dx\, C^{n,n'}_{k,l;\mu,\nu}(x_0+\epsilon x)\,\partial_x^n g_{k,\mu}(x)\,\partial_x^{n'} g'_{l,\nu}(x) \qquad \forall g,g' \in \mathcal{D}_V^m(\Omega), \tag{C.2}$$
where $|n+n'| := \sum_{k=1}^{d}(n_k+n'_k)$: evidently, the effective domain of integration here is $\mathrm{supp}(g) \cap \mathrm{supp}(g')$. Since the functions C are continuous, the summand on the r.h.s. of this equation will diverge, as $\epsilon \to 0$, unless either n and n′ are both zero or $C^{n,n'}_{k,l;\mu,\nu}(x_0) = 0$. Hence, the local equilibrium condition (5.21) implies that the only non-vanishing C's are those for which n and n′ are zero. Thus, (C.1) reduces to the form
$$\Gamma(g,g') = \sum_{k,l=1}^{m}\ \sum_{\mu,\nu=1}^{d} \int_\Omega dx\, C^{0,0}_{k,l;\mu,\nu}(x)\,g_{k,\mu}(x)\,g'_{l,\nu}(x) \qquad \forall g,g' \in \mathcal{D}_V^m(\Omega_1). \tag{C.3}$$
Correspondingly, (C.2) reduces to the form
$$\Gamma(g_{x_0,\epsilon},g'_{x_0,\epsilon}) = \sum_{k,l=1}^{m}\ \sum_{\mu,\nu=1}^{d} \int_X dx\, C^{0,0}_{k,l;\mu,\nu}(x_0+\epsilon x)\,g_{k,\mu}(x)\,g'_{l,\nu}(x) \qquad \forall g,g' \in \mathcal{D}_V^m(\Omega). \tag{C.4}$$
It now follows immediately from this formula and the local equilibrium condition (5.21) that
$$\sum_{k,l=1}^{m}\ \sum_{\mu,\nu=1}^{d} \int_\Omega dx\, C^{0,0}_{k,l;\mu,\nu}(x_0)\,g_{k,\mu}(x)\,g'_{l,\nu}(x) = 2\big(g, K(\theta(x_0))g'\big)_V \qquad \forall x_0 \in \Omega,\ g,g' \in \mathcal{D}_V^m(\Omega).$$
Further, in view of Eq. (5.17), this last equation signifies that
$$C^{0,0}_{k,l;\mu,\nu}(x) = 2K_{kl}\big(\theta(x)\big)\delta_{\mu\nu} \tag{C.5}$$
and consequently that (C.3) reduces to the required formula (5.22), at least for $g,g' \in \mathcal{D}_V^m(\Omega_1)$. The extension to all g, g′ in $\mathcal{D}_V^m(\Omega)$ is trivial, since for any pair of elements of the latter space, one can always choose $\Omega_1$ to be an open subset of Ω that contains their supports.

Proof of Lemma C.1. Since the test functions $g_{k,\mu}$ and $g'_{l,\nu}$ in (C.1) are arbitrary elements of $\mathcal{D}(\Omega)$, this lemma reduces to the following one.

Lemma C.2. Let T be a $\mathcal{D}'(\Omega^2)$-class distribution whose support lies in the region $\{(x,x') \in \Omega^2 \mid x' = x\}$ and let $\Omega_1$ be an open subset of Ω whose boundary, $\partial\Omega_1$, does not intersect ∂Ω. Then, the restriction of T to the domain $\{f \otimes f' \mid f,f' \in \mathcal{D}(\Omega_1)\}$ is given by a finite sum of the form
$$T(f \otimes f') = \sum_{n,n' \in \mathbb{N}^d} \int_\Omega dx\, C^{n,n'}(x)\,\partial_x^n f(x)\,\partial_x^{n'} f'(x) \qquad \forall f,f' \in \mathcal{D}(\Omega_1), \tag{C.6}$$
where the C's are continuous functions on Ω with supports in some neighborhood of $\Omega_1$.

Proof of Lemma C.2. Let σ be a $\mathcal{D}(\Omega)$-class function which takes the value unity in $\Omega_1$ and whose support lies in a compact connected subset, K, of Ω whose boundary, ∂K, does not intersect either ∂Ω or $\partial\Omega_1$. We define the distribution $\tilde T\ (\in \mathcal{D}'(\Omega^2))$ by the formula
$$\tilde T(x,x') = \sigma(x)\sigma(x')T(x,x'). \tag{C.7}$$
Thus, $\tilde T$ coincides with T in $\Omega_1^2$ and
$$\mathrm{supp}(\tilde T) \subset \{(x,x') \in K^2 \mid x' = x\}. \tag{C.8}$$
We define Φ to be the linear transformation of $X^2$ given by the formula
$$\Phi(y,z) = (y+z,\,y-z) \qquad \forall y,z \in X,$$
from which it follows that
$$\Phi^{-1}(x,x') = \Big(\tfrac{1}{2}(x+x'),\ \tfrac{1}{2}(x-x')\Big) \qquad \forall x,x' \in X. \tag{C.9}$$
We then define
$$\Theta := \Phi^{-1}(\Omega^2) = \{(y,z) \in X^2 \mid (y \pm z) \in \Omega\}, \tag{C.10}$$
and we define the bijection $F \to \hat F$ of $\mathcal{D}(\Omega^2)$ onto $\mathcal{D}(\Theta)$ by the formula $\hat F = F \circ \Phi$, i.e.
$$\hat F(y,z) = F(y+z,\,y-z) \qquad \forall (y,z) \in \Theta. \tag{C.11}$$
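The change of variables used here can be checked mechanically. The following is a minimal sketch, not part of the original proof: the function names `phi`, `phi_inverse` and `hat` are hypothetical, points of X are modelled as tuples of exact rationals, and the test function is an arbitrary illustrative choice. It confirms that the stated inverse really inverts Φ and that the diagonal $x' = x$ is carried to $z = 0$, which is the mechanism behind the support statement that follows.

```python
from fractions import Fraction


def phi(y, z):
    """Phi(y, z) = (y + z, y - z), applied componentwise to points of X = R^d."""
    x = tuple(yi + zi for yi, zi in zip(y, z))
    x_prime = tuple(yi - zi for yi, zi in zip(y, z))
    return x, x_prime


def phi_inverse(x, x_prime):
    """Phi^{-1}(x, x') = ((x + x')/2, (x - x')/2), applied componentwise."""
    half = Fraction(1, 2)
    y = tuple(half * (a + b) for a, b in zip(x, x_prime))
    z = tuple(half * (a - b) for a, b in zip(x, x_prime))
    return y, z


def hat(F):
    """Return F_hat = F o Phi, i.e. F_hat(y, z) = F(y + z, y - z)."""
    return lambda y, z: F(*phi(y, z))


# Illustrative points in X = R^2 (hypothetical values).
x = (Fraction(3), Fraction(-1))
x_prime = (Fraction(1, 2), Fraction(5))

# Round trip: Phi(Phi^{-1}(x, x')) returns the original pair.
round_trip = phi(*phi_inverse(x, x_prime))

# A point on the diagonal x' = x is mapped to z = 0.
diagonal_z = phi_inverse(x, x)[1]
```

Exact rational arithmetic is used so that the round trip holds with equality rather than up to rounding.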
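Correspondingly, we define the distribution $\hat T\ (\in \mathcal{D}'(\Theta))$ in terms of $\tilde T$ by the formula
$$\hat T(\hat F) = \tilde T(F) \qquad \forall F \in \mathcal{D}(\Omega^2). \tag{C.12}$$
It follows from (C.8), (C.11) and (C.12), that
$$\mathrm{supp}(\hat T) \subset K \times \{0\}. \tag{C.13}$$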
We want to restrict $\hat T$ to an open subset of Θ which contains the support of this distribution and takes the form $\Omega_2 \times J$, where $\Omega_2$ and J are open subsets of Ω and X respectively. Accordingly, we choose b to be a positive number that is less than dist(∂K, ∂Ω), the minimal distance between the boundaries, ∂K and ∂Ω, of K and Ω. We then define $\Omega_2 := \{y \in X \mid (y,z) \in \Theta\ \forall |z| \le b\}$ and $J := \{z \in X \mid |z| < b\}$. It follows from these definitions that $\Omega_2$ and $\Omega_2 \times J$ are open subsets of Ω and Θ, respectively, that $K \subset \Omega_2$ and that $\partial\Omega_2$, the boundary of $\Omega_2$, does not intersect either ∂K or ∂Ω. Hence, by (C.13), $\Omega_2 \times J$ is an open neighborhood of $\mathrm{supp}(\hat T)$ and the restriction of $\hat T$ to this domain, which we continue to denote by $\hat T$, carries all the information we require. It follows from its definition that this restriction lies in $\mathcal{D}'(\Omega_2 \times J)$.

Now let e be an arbitrary element of $\mathcal{D}(\Omega_2)$. Then, for $e' \in \mathcal{D}(J)$, $\hat T$ induces a continuous linear functional $\hat T_e$ on $\mathcal{D}(J)$ according to the formula
$$\hat T_e(e') = \hat T(e \otimes e') \qquad \forall e' \in \mathcal{D}(J), \tag{C.14}$$
where the mapping $e \to \hat T_e$ of $\mathcal{D}(\Omega_2)$ into $\mathcal{D}'(J)$ is continuous. Further, it follows from (C.13) and (C.14) that $\hat T_e$ has support at the origin and consequently, by Schwartz's point support theorem [33, Theorem 35], that this distribution is a finite sum of derivatives of δ(z), with coefficients given by linear continuous functionals of e, i.e.
$$\hat T_e(e') = \sum_n T_n(e)\,(\partial^n e')(0), \tag{C.15}$$
where each $T_n \in \mathcal{D}'(\Omega_2)$. Further, in view of the definition of $\hat T_e$, it follows from (C.13) and (C.14) that $T_n$ has support in the compact K and therefore, by Schwartz's compact support theorem [33, Theorem 26], it is a finite sum of derivatives of continuous functions on $\Omega_2$ with support in an arbitrary neighborhood of K. Consequently, by (C.15), the action of $\hat T$ on $\mathcal{D}(\Omega_2 \times J)$ is given by a finite sum of the form
$$\hat T(\hat F) = \sum_{n',n''} \int_{\Omega_2} dy\, \hat D^{n',n''}(y)\,\partial_y^{n'}\partial_z^{n''}\hat F(y,z)\big|_{z=0} \qquad \forall \hat F \in \mathcal{D}(\Omega_2 \times J), \tag{C.16}$$
where the $\hat D$'s are continuous functions on $\Omega_2$ with support in a neighborhood of K. Hence, as this functional is just the restriction of $\hat T$ to $\mathcal{D}(\Omega_2 \times J)$ and since $\tilde T$ coincides with
T in $\Omega_1^2$, it follows from (C.9)–(C.12) that (C.16) is equivalent to the formula
$$T(F) = \sum_{n,n' \in \mathbb{N}^d} \int_\Omega dx\, C^{n,n'}(x)\,\partial_x^n\partial_{x'}^{n'}F(x,x')\big|_{x'=x} \qquad \forall F \in \mathcal{D}(\Omega_1^2), \tag{C.17}$$
which in turn is equivalent to the required (C.6).

Appendix D. Proof of Proposition 5.5

Part (a). The characteristic functional for the process ξ is
$$C(f^{(1)},\dots,f^{(r)};t_1,\dots,t_r) = E\Big(\exp\Big(i\sum_{k=1}^{r}\xi_{t_k}(f^{(k)})\Big)\Big) \qquad \forall f^{(1)},\dots,f^{(r)} \in \mathcal{D}^m(\Omega);\ t_1,\dots,t_r \in \mathbb{R},\ r \in \mathbb{N}. \tag{D.1}$$
Equivalently, since the process is stationary,
$$C(f^{(1)},\dots,f^{(r)};t_1,\dots,t_r) = E\Big(\exp\Big(i\sum_{k=1}^{r}\xi_{t_k+t_0}(f^{(k)})\Big)\Big) \qquad \forall f^{(1)},\dots,f^{(r)} \in \mathcal{D}^m(\Omega);\ t_0,t_1,\dots,t_r \in \mathbb{R},\ r \in \mathbb{N}. \tag{D.2}$$
Here we are at liberty to choose $t_0$ to be any real number and, for any specified set of times $t_1,\dots,t_r$, we choose it so that $t_1+t_0,\dots,t_r+t_0$ are all positive. It then follows from Eq. (5.28) that
$$\xi_{t_k+t_0}(f^{(k)}) = \xi(T_{t_k+t_0}f^{(k)}) + \int_0^{t_k+t_0} dw_{u,0}\big(T_{t_k+t_0-u}f^{(k)}\big)$$
and therefore that Eq. (D.2) may be re-expressed as
$$C(f^{(1)},\dots,f^{(r)};t_1,\dots,t_r) = E\Big(\exp\Big(i\sum_{k=1}^{r}\xi(T_{t_k+t_0}f^{(k)})\Big)\exp\Big(i\sum_{k=1}^{r}\int_0^{t_k+t_0} dw_{u,0}\big(T_{t_k+t_0-u}f^{(k)}\big)\Big)\Big). \tag{D.3}$$
We now define
$$\tilde C(f^{(1)},\dots,f^{(r)};t_0,t_1,\dots,t_r) = E\Big(\exp\Big(i\sum_{k=1}^{r}\int_0^{t_k+t_0} dw_{u,0}\big(T_{t_k+t_0-u}f^{(k)}\big)\Big)\Big)$$
and, using the Schwartz inequality, we infer from the last two equations that
$$\big|C(f^{(1)},\dots,f^{(r)};t_1,\dots,t_r) - \tilde C(f^{(1)},\dots,f^{(r)};t_0,t_1,\dots,t_r)\big|^2 \le E\Big(\Big|\exp\Big(i\sum_{k=1}^{r}\xi(T_{t_k+t_0}f^{(k)})\Big) - 1\Big|^2\Big) \le E\Big(\Big(\sum_{k=1}^{r}\xi(T_{t_k+t_0}f^{(k)})\Big)^2\Big) = \sum_{k,l=1}^{r}E\big(\xi(T_{t_k+t_0}f^{(k)})\,\xi(T_{t_l+t_0}f^{(l)})\big). \tag{D.4}$$
It follows from the dissipativity condition (3.12) that the r.h.s. of this estimate vanishes in the limit $t_0 \to \infty$ and therefore that
$$C(f^{(1)},\dots,f^{(r)};t_1,\dots,t_r) = \lim_{t_0\to\infty}\tilde C(f^{(1)},\dots,f^{(r)};t_0,t_1,\dots,t_r),$$
i.e. by (D.4), that
$$C(f^{(1)},\dots,f^{(r)};t_1,\dots,t_r) = \lim_{t_0\to\infty}E\Big(\exp\Big(i\sum_{k=1}^{r}\int_0^{t_k+t_0} dw_{u,0}\big(T_{t_k+t_0-u}f^{(k)}\big)\Big)\Big).$$
Since, by Eq. (5.7) and the chaoticity condition (C.1), the process w is Gaussian, it follows immediately from this last equation that the process ξ is Gaussian. In order to show that it is also Markovian, we need just to prove that, for $t \in \mathbb{R}$ and any random variable $B_{\ge t}$ generated by $\{\xi_u(f) \mid f \in \mathcal{D}^m(\Omega),\ u \ge t\}$, the conditional expectations of $B_{\ge t}$ with respect to the random variables for time t and for times ≤ t are equal, i.e. that
$$E(B_{\ge t} \mid \xi_t) = E(B_{\ge t} \mid \xi_{\le t}). \tag{D.5}$$
Now the random variables over the times ≥, = and ≤ t are generated by linear combinations of terms
$$F_{\ge t} = \exp\Big(i\sum_{k=1}^{p}\xi_{u_k}(f^{(k)})\Big), \tag{D.6}$$
$$F_t = \exp\big(i\xi_t(f)\big) \tag{D.7}$$
and
$$F_{\le t} = \exp\Big(i\sum_{l=1}^{r}\xi_{s_l}(f'^{(l)})\Big), \tag{D.8}$$
respectively, where $u_k \ge t \ge s_l$ and $f^{(k)}$, f and $f'^{(l)}$ are elements of $\mathcal{D}^m(\Omega)$. It follows from (D.1) and (D.6)–(D.8), together with the Gaussian property of ξ, that
$$E(F_{\ge t}F_t) = C(f^{(1)},\dots,f^{(p)};u_1,\dots,u_p)\,C(f;t)\exp\Big(-\sum_{k=1}^{p}E\big(\xi_{u_k}(f^{(k)})\,\xi_t(f)\big)\Big) \tag{D.9}$$
and that
$$E(F_{\ge t}F_{\le t}) = C(f^{(1)},\dots,f^{(p)};u_1,\dots,u_p)\,C(f'^{(1)},\dots,f'^{(r)};s_1,\dots,s_r)\exp\Big(-\sum_{k=1}^{p}\sum_{l=1}^{r}E\big(\xi_{u_k}(f^{(k)})\,\xi_{s_l}(f'^{(l)})\big)\Big). \tag{D.10}$$
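The factorizations (D.9) and (D.10) are instances of a standard identity for zero-mean jointly Gaussian variables X and Y, namely $E\,e^{i(X+Y)} = E\,e^{iX}\,E\,e^{iY}\,e^{-E(XY)}$. The sketch below checks this identity numerically in a finite-dimensional stand-in for the field: the covariance matrix, the coefficient vectors and all function names are illustrative assumptions and do not come from the paper.

```python
import math


def quad(u, cov, v):
    """Bilinear form u^T cov v for a covariance matrix given as nested lists."""
    return sum(u[i] * cov[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def char_fn(v, cov):
    """E exp(i v.X) for a zero-mean Gaussian vector X with covariance cov."""
    return math.exp(-0.5 * quad(v, cov, v))


def joint_char_fn(a, b, cov):
    """Left-hand side: E exp(i (a.X + b.X)), computed directly."""
    return char_fn([ai + bi for ai, bi in zip(a, b)], cov)


def factorised(a, b, cov):
    """Right-hand side: product of marginals times exp(-E[(a.X)(b.X)])."""
    return char_fn(a, cov) * char_fn(b, cov) * math.exp(-quad(a, cov, b))


# Hypothetical symmetric positive-definite covariance and coefficient vectors;
# a plays the role of the "future" variables and b that of the "past" ones.
cov = [[2.0, 0.5, 0.1],
       [0.5, 1.0, 0.3],
       [0.1, 0.3, 1.5]]
a = [1.0, -0.5, 0.0]
b = [0.0, 0.0, 0.7]

lhs = joint_char_fn(a, b, cov)
rhs = factorised(a, b, cov)
```

The two values agree because, for a symmetric covariance, the quadratic form of a + b splits into the two marginal terms plus twice the cross term.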
Further, since $u_k \ge t \ge s_l$, it follows from Eqs. (4.5) and (4.6) that the summands appearing in the exponents in (D.9) and (D.10) are equal to $E\big(\xi(T_{u_k-t}f^{(k)})\,\xi(f)\big)$
and $E\big(\xi(T_{u_k-s_l}f^{(k)})\,\xi(f'^{(l)})\big)$, respectively, and therefore those equations may be re-expressed as
$$E(F_{\ge t}F_t) = C(f^{(1)},\dots,f^{(p)};u_1,\dots,u_p)\,C(f;t)\exp\Big(-\sum_{k=1}^{p}E\big(\xi(T_{u_k-t}f^{(k)})\,\xi(f)\big)\Big) \tag{D.11}$$
and
$$E(F_{\ge t}F_{\le t}) = C(f^{(1)},\dots,f^{(p)};u_1,\dots,u_p)\,C(f'^{(1)},\dots,f'^{(r)};s_1,\dots,s_r)\exp\Big(-\sum_{k=1}^{p}\sum_{l=1}^{r}E\big(\xi(T_{u_k-s_l}f^{(k)})\,\xi(f'^{(l)})\big)\Big). \tag{D.12}$$
Further, since $E(F_{\ge t} \mid \xi_t)$ is the unique random variable of the ξ-process at time t for which $E\big(E(F_{\ge t} \mid \xi_t)F_t\big) = E(F_{\ge t}F_t)$ for all $F_{\ge t}$ and $F_t$ of the forms given by (D.6) and (D.7), respectively, it follows from (D.9), together with the stationarity and the Gaussian property of the process, that
$$E(F_{\ge t} \mid \xi_t) = \frac{C(f^{(1)},\dots,f^{(p)};u_1,\dots,u_p)}{C(T_{u_1-t}f^{(1)},\dots,T_{u_p-t}f^{(p)};0,\dots,0)}\exp\Big(i\sum_{k=1}^{p}\xi_t\big(T_{u_k-t}f^{(k)}\big)\Big). \tag{D.13}$$
Hence, by (D.8),
$$E\big(E(F_{\ge t} \mid \xi_t)F_{\le t}\big) = \frac{C(f^{(1)},\dots,f^{(p)};u_1,\dots,u_p)}{C(T_{u_1-t}f^{(1)},\dots,T_{u_p-t}f^{(p)};0,\dots,0)} \times C\Big(f'^{(1)},\dots,f'^{(r)},\sum_{k=1}^{p}T_{u_k-t}f^{(k)};s_1,\dots,s_r,t\Big). \tag{D.14}$$
Further, in view of the Gaussian property of the process, the last factor in this formula is equal to
$$C(f'^{(1)},\dots,f'^{(r)};s_1,\dots,s_r)\,C\Big(\sum_{k=1}^{p}T_{u_k-t}f^{(k)};t\Big)\exp\Big(-\sum_{k=1}^{p}\sum_{l=1}^{r}E\big(\xi_t(T_{u_k-t}f^{(k)})\,\xi_{s_l}(f'^{(l)})\big)\Big).$$
Therefore, since by Eq. (4.6) and the semigroup property of $T(\mathbb{R}_+)$, the summand in the exponent in this expression is equal to $E\big(\xi(T_{u_k-s_l}f^{(k)})\,\xi(f'^{(l)})\big)$, it follows from (D.10) and (D.14) that
$$E\big(E(F_{\ge t} \mid \xi_t)F_{\le t}\big) = E(F_{\ge t}F_{\le t}).$$
Hence, $E(F_{\ge t} \mid \xi_t) = E(F_{\ge t} \mid \xi_{\le t})$, which signifies that the process is temporally Markovian.

Part (b). Since Eq. (5.8) implies that $w_{t,s} = -w_{s,t}$ and since $w_{t,s}$ and $\xi_u$ are Gaussian random fields whose means are zero, it follows from Eq. (5.9) that the latter two fields are statistically independent of one another if s and t are both greater than or equal to u.

References

[1] R. Graham and H. Haken, Z. Phys. 237 (1970) 31.
[2] K. Hepp and E. H. Lieb, Helv. Phys. Acta 46 (1973) 573.
[3] G. Alli and G. L. Sewell, J. Math. Phys. 36 (1995) 5598.
[4] F. Bagarello and G. L. Sewell, J. Math. Phys. 39 (1998) 2730.
[5] P. Glansdorff and I. Prigogine, Thermodynamic Theory of Structure, Stability and Fluctuations (Wiley-Interscience, London, 1971).
[6] H. Fröhlich, Int. J. Quantum Chem. 2 (1968) 641.
[7] G. Gallavotti, J. Stat. Phys. 84 (1996) 899.
[8] D. Ruelle, J. Stat. Phys. 85 (1996) 1.
[9] H. Spohn, J. Phys. A16 (1983) 4275.
[10] B. Derrida, J. L. Lebowitz and E. R. Speer, J. Stat. Phys. 107 (2002) 599.
[11] L. Bertini, A. de Sole, D. Gabrielli, G. Jona-Lasinio and C. Landim, J. Stat. Phys. 107 (2002) 635.
[12] D. Ruelle, J. Stat. Phys. 98 (2000) 57.
[13] S. Tasaki and T. Matsui, Fluctuation theorem, nonequilibrium steady states and the McLennan–Zubarev ensembles of L¹-asymptotical abelian C*-dynamical systems, in Fundamental Aspects of Quantum Physics, eds. L. Accardi and S. Tasaki (World Scientific, Singapore, 2003), pp. 100–119.
[14] G. L. Sewell, Lett. Math. Phys. 68 (2004) 53.
[15] G. L. Sewell, Quantum Mechanics and its Emergent Macrophysics (Princeton University Press, Princeton, Oxford, 2002).
[16] G. L. Sewell, Quantum Macrostatistics and Irreversible Thermodynamics, Lecture Notes in Mathematics, Vol. 1442, eds. L. Accardi and W. von Waldenfels (Springer, Berlin, 1990), pp. 368–83.
[17] L. Onsager, Phys. Rev. 37 (1931) 405; 38 (1931) 2265.
[18] L. D. Landau and E. M. Lifschitz, Fluid Mechanics (Pergamon, Oxford, 1984).
[19] L. Boltzmann, Lectures on Gas Theory (University of California Press, Berkeley, CA, 1964).
[20] O. E. Lanford, Time evolution of large classical systems, in 1974 Battelle Rencontre, ed. J. Moser, LNP, Vol. 38 (Springer, Berlin, 1975).
[21] T. G. Ho, L. J. Landau and A. J. Wilkins, On the weak coupling limit for a Fermi gas in a random potential, Rev. Math. Phys. 5 (1993) 209.
[22] L. Onsager and S. Machlup, Phys. Rev. 91 (1953) 1505.
[23] G. Grinstein, D.-H. Lee and S. Sachdev, Phys. Rev. Lett. 64 (1990) 1927.
[24] J. R. Dorfman, T. R. Kirkpatrick and J. V. Sengers, Annu. Rev. Phys. Chem. 45 (1994) 213.
[25] R. F. Streater and A. S. Wightman, PCT, Spin and Statistics, and All That (W. A. Benjamin, New York, 1964).
[26] R. Haag, N. M. Hugenholtz and M. Winnink, Commun. Math. Phys. 5 (1967) 215.
[27] D. Ruelle, Statistical Mechanics (W. A. Benjamin, New York, 1969).
[28] G. G. Emch, Algebraic Methods in Statistical Mechanics and Quantum Field Theory (Wiley, New York, 1971).
[29] G. G. Emch, H. J. F. Knops and E. J. Verboven, J. Math. Phys. 11 (1970) 1655.
[30] V. Jakšić and C. A. Pillet, Commun. Math. Phys. 226 (2002) 131.
[31] I. E. Segal, Ann. Math. 48 (1947) 930.
[32] G. L. Sewell, J. Math. Phys. 11 (1970) 1868.
[33] L. Schwartz, Théorie des Distributions (Hermann, Paris, 1998).
[34] L. Accardi, A. Frigerio and J. T. Lewis, Publ. Res. Inst. Math. Sci. 18 (1982) 97.
[35] E. Nelson, Ann. Math. 70 (1959) 572.
[36] E. Nelson, Dynamical Theories of Brownian Motion (Princeton University Press, Princeton, 1972).
[37] R. Sen and G. L. Sewell, J. Math. Phys. 43 (2002) 1323.
[38] B. Nachtergaele and H. T. Yau, Commun. Math. Phys. 243 (2003) 485.
[39] D. Goderis, P. Vets and A. Verbeure, Probab. Theory Related Fields 82 (1989) 527.
[40] M. Broidioi, B. Momont and A. Verbeure, J. Math. Phys. 36 (1995) 6746.
[41] H. G. B. Casimir, Rev. Modern Phys. 17 (1945) 343–350.
October 20, 2005 8:48 WSPC/148-RMP
J070-00248
Reviews in Mathematical Physics Vol. 17, No. 9 (2005) 1021–1070 c World Scientific Publishing Company
HOMOTOPY OF POSETS, NET-COHOMOLOGY AND SUPERSELECTION SECTORS IN GLOBALLY HYPERBOLIC SPACE-TIMES
GIUSEPPE RUZZI
Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133, Roma, Italy
[email protected]
Received 7 February 2005
Revised 11 July 2005

We study sharply localized sectors, known as sectors of DHR-type, of a net of local observables, in arbitrary globally hyperbolic space-times with dimension ≥ 3. We show that these sectors define, as happens in Minkowski space, a C*-category in which the charge structure manifests itself by the existence of a tensor product, a permutation symmetry and a conjugation. The mathematical framework is that of the net-cohomology of posets according to J. E. Roberts. The net of local observables is indexed by a poset formed by a basis for the topology of the space-time ordered under inclusion. The category of sectors is equivalent to the category of 1-cocycles of the poset with values in the net. We succeed in analyzing the structure of this category because we show how topological properties of the space-time are encoded in the poset used as index set: the first homotopy group of a poset is introduced and it is shown that the fundamental group of the poset and that of the underlying space-time are isomorphic; any 1-cocycle defines a unitary representation of these fundamental groups. Another important result is the invariance of the net-cohomology under a suitable change of index set of the net.

Keywords: Cohomology of posets; homotopy of posets; curved space-times; superselection sectors.

Mathematics Subject Classification 2000: 81T05, 81T20, 05E99, 57Q99
Contents

1. Introduction
2. Homotopy and Net-Cohomology of Posets
   2.1. Preliminaries: The simplicial set and net-cohomology
   2.2. The first homotopy group of a poset
   2.3. Connection between homotopy and net-cohomology
   2.4. Change of index set
   2.5. The poset as a basis for a topological space
      2.5.1. Homotopy
      2.5.2. Net-cohomology
3. Good Index Sets for a Globally Hyperbolic Space-Time
   3.1. Preliminaries on space-time geometry
   3.2. The set of diamonds
      3.2.1. Causal punctures
   3.3. Net-cohomology
4. Superselection Sectors
   4.1. Presheaves and the strategy for studying superselection sectors
   4.2. Local theory
      4.2.1. Tensor structure
      4.2.2. Symmetry and statistics
      4.2.3. Conjugation
   4.3. Global theory
      4.3.1. Symmetry, statistics and conjugation
5. Concluding Remarks
Appendix A. Tensor C*-categories
1. Introduction

The present paper is concerned with the study of charged superselection sectors in globally hyperbolic space-times in the framework of the algebraic approach to quantum field theory [17, 18]. The basic object of this approach is the abstract net of local observables $R_K$, namely the correspondence $K \ni O \to R(O)$, which associates to any element O of a family K of relatively compact open regions of the space-time M, considered as a fixed background manifold, the C*-algebra R(O) generated by all the observables which are measurable within O. Sectors are unitary equivalence classes of irreducible representations of this net; the labels distinguishing different classes are the quantum numbers. The study of physically meaningful sectors of the net of local observables, and how to select them, is the realm of the theory of superselection sectors.

One of the main results of superselection sectors theory has been the demonstration that in Minkowski space $\mathbb{M}^4$, among the representations of the net of local observables, it is possible to select a family of sectors whose quantum numbers manifest the same properties as the charges carried by elementary particles: a composition law, the alternative of Bose and Fermi statistics and the charge conjugation symmetry. The first example of sectors manifesting these properties was provided in [10, 11], known as the DHR-analysis, where the authors investigated sharply localized sectors. Namely, a representation π of $R_K$ is a sector of DHR-type whenever its restriction to the spacelike complement $O^\perp$ of any element O of K is unitarily equivalent to the vacuum representation $\pi_o$ of the net, in symbols
$$\pi|_{R(O^\perp)} \cong \pi_o|_{R(O^\perp)}, \qquad O \in K. \tag{1.1}$$
Although no known charge present in nature is sharply localized, the importance of the DHR-analysis resides in the following reasons. Firstly, it suggests the idea
that physically charged sectors might be localized in a more generalized sense with respect to (1.1). Secondly, it introduces powerful techniques, based only on the causal structure of Minkowski space, that can be used to investigate other types of localized sectors. A relevant example is provided by the BF-sectors [3], which describe charges in purely massive theories. BF-sectors are localized in spacelike cones, which are a family of noncompact regions of $\mathbb{M}^4$.

In curved space-times, the nontrivial topology can induce superselection sectors, see [1] and references quoted therein. However, up until now, the localization properties of these sectors are not known, hence it is still not possible to investigate their charge structure. In the present paper, we deal with the study of sectors of DHR-type in arbitrary globally hyperbolic space-times. Because of the sharp localization, sectors of DHR-type should be insensitive to the nontrivial topology of the space-time, and their quantum numbers should exhibit the same features as in Minkowski space. However, the first investigations [16, 27] have provided only partial results in this direction, and, in particular, they have pointed out that for particular classes of space-times the topology might affect the properties of sectors of DHR-type. The aim of the present paper is to show how this type of sectors and the topology of space-time are related and that they manifest the properties described above also in an arbitrary globally hyperbolic space-time. We want to stress that the results of this paper are confined to space-times with dimension ≥ 3.

Before entering the theory of DHR-sectors in globally hyperbolic space-times, a key fact has still to be mentioned. The DHR-analysis can be equivalently read in terms of net-cohomology of posets, a cohomological approach initiated and developed by J. E. Roberts [25] (see also [26, 27] and references therein). Such an approach makes clear that the space-time information which is relevant for the analysis of the DHR-sectors is the topological and the causal structure of Minkowski space (the Poincaré symmetry enters the theory only in the definition of the vacuum representation). In particular, the essential point is how these two properties are encoded in the structure of the index set K as a partially ordered set (poset) with respect to the inclusion order relation ⊆. Representations satisfying (1.1) are, up to equivalence, in 1-1 correspondence with 1-cocycles z of the poset K with values in the vacuum representation of the net $A_K : O \to A(O)$. Here, A(O) is the von Neumann algebra obtained by taking the bicommutant $\pi_o(R(O))''$ of $\pi_o(R(O))$. These 1-cocycles, which are nothing but the charge transporters of DHR-analysis, define a tensor C*-category $Z_t^1(A_K)$ with a permutation symmetry and conjugation.

The first investigation of the sectors of the DHR-type in a globally hyperbolic space-time M has been done in [16]. Firstly, the authors consider the net of local observables $R_K$ indexed by the set K of regular diamonds of M: a family of relatively compact open sets codifying the topological and the causal properties of M. Secondly, they take a reference representation $\pi_o$ of $R_K$ on a Hilbert space $H_o$ such that the net $A_K : K \ni O \to A(O) \equiv \pi_o(R(O))''$ satisfies Haag duality and the Borchers property (see Sec. 4). The reference representation $\pi_o$ plays for the theory the same role that the vacuum representation plays in the case of the
October 20, 2005 8:48 WSPC/148-RMP
1024
J070-00248
G. Ruzzi
Minkowski space. Examples of physically meaningful nets of local algebras indexed by regular diamonds have been given in [32]. Finally, the DHR-sectors are singled out from the representations of the net $R_K$ by generalizing, in a suitable way, the criterion (1.1). As in the Minkowski space, the physical content of DHR-sectors is contained in the C∗-category $Z^1_t(A_K)$ of 1-cocycles of K with values in $A_K$, and when K is directed under inclusion, there exist a tensor product, a symmetry and conjugated 1-cocycles. The analogy with the theory in the Minkowski space breaks down when K is not directed. In this situation, only the introduction of a tensor product on $Z^1_t(A_K)$ and the existence of a symmetry have been achieved in [16], although the definition of the tensor product is not completely discussed (see below).
There are two well-known topological conditions on the space-time, implying that not only regular diamonds but any reasonable set of indices for a net of local algebras is not directed: this happens when the space-time either is nonsimply connected or has compact Cauchy surfaces (Corollary 2.19 and Lemma 3.2). There arises, therefore, the necessity to understand the connection between net-cohomology and topology of the underlying space-time. Progress in this direction has been achieved in [27]. The homotopy of paths and the behavior of net-cohomology under a change of the index set are issues developed in that work that will turn out to be fundamental for our aim. Moreover, it has been shown that the statistics of sectors can be classified provided that the net satisfies punctured Haag duality (see Sec. 4). However, no result concerning the conjugation has been achieved.
To see what is the main drawback caused by the nondirectedness of the poset K, we have to describe $Z^1_t(A_K)$ in more detail. The elements z of $Z^1_t(A_K)$ are 1-cocycles trivial in $B(H_o)$ or, equivalently, path-independent on K.
The latter means that the evaluation of z on a path of K depends only on the endpoints of the path. When K is directed, any 1-cocycle is trivial in $B(H_o)$, but this might not hold when K is not directed. The consequences can be easily shown: let $\hat\otimes$ be the tensor product introduced in [16]; for any $z, z_1 \in Z^1_t(A_K)$, it turns out that $z \mathbin{\hat\otimes} z_1$ is a 1-cocycle of K with values in $A_K$, but it is not clear whether it is trivial in $B(H_o)$ (we will see in Remark 4.18 that this 1-cocycle is trivial in $B(H_o)$).
Now, we know that nonsimply connectedness and the compactness of the Cauchy surfaces are topological obstructions to the directedness of the index sets. The first aim of this paper is to understand whether these conditions are also obstructions to the triviality in $B(H_o)$ of 1-cocycles. This problem is analyzed in great generality in Sec. 2. We introduce the notions of the first homotopy group and fundamental group for an abstract poset P (Definition 2.4) and prove that any 1-cocycle z of P, with values in a net of local algebras $A_P$ indexed by P, defines a unitary representation of the fundamental group of P (Theorem 2.8). If P is a basis for a topological space ordered under inclusion, and its elements are arcwise and simply connected sets, then the fundamental group of P is isomorphic to the fundamental group of the underlying topological space (Theorem 2.18). This states that the only possible topological obstruction to the triviality in $B(H_o)$ of 1-cocycles is the nonsimply connectedness (Corollary 2.21).
Homotopy of Posets, Net-Cohomology and Superselection Sectors
Before studying superselection sectors in a globally hyperbolic space-time M, we have to point out another problem arising in [16, 27]. Regular diamonds do not need to have arcwise connected causal complements. This, on the one hand creates some technical problems; on the other hand, it is not clear whether it is justified to assume Haag duality on AK : the only known result showing that a net of local observables, in the presence of a nontrivial superselection structure, inherits Haag duality from fields makes use of the arcwise connectedness of causal complements of the elements of the index set [16, Theorem 3.15]. We start to deal with this problem by showing that net-cohomology is invariant under a change of the index set (Theorem 2.23), provided the new index set is a refinement of K (see Definition 2.9 and Lemma 2.22(a)). In Sec. 3.2, we introduce the set K of diamonds of M. K is a refinement of K and any element of K has an arcwise connected causal complement. Therefore, adopting K as index set, the cited problems are overcome. In Sec. 4, we consider an irreducible net AK satisfying the Borchers property and punctured Haag duality. The key for studying superselection sectors of the net AK , namely the C∗ -category Zt1 (AK ), is provided by the following fact. We introduce the causal puncture Kx of K induced by a point x of M (3.2) and consider the categories Zt1 (AKx ) of 1-cocycles of Kx , trivial in B(Ho ), with values in the net AKx . We show that a family zx ∈ Zt1 (AKx ) for any x ∈ M admits an extension to a 1-cocycle z ∈ Zt1 (AK ) if, and only if, a suitable gluing condition is verified (Proposition 4.2). A similar result holds for arrows (Proposition 4.3), and can be easily generalized to functors. These results suggest that one could proceed as follows: firstly, prove that the categories Zt1 (AKx ) have the right structure to describe the superselection theory (local theory, Sec. 
4.2); secondly, check that the constructions we have made on $Z^1_t(A_{K_x})$ satisfy the mentioned gluing condition, and consequently can be extended to $Z^1_t(A_K)$ (global theory, Sec. 4.3). This argument works. We will prove that $Z^1_t(A_K)$ has a tensor product, a symmetry and that any object has left inverses. The full subcategory $Z^1_t(A_K)_f$ of $Z^1_t(A_K)$ whose objects have finite statistics has conjugates (Theorem 4.15). In Appendix A, we give some basic definitions and results on tensor C∗-categories.
2. Homotopy and Net-Cohomology of Posets
After some preliminaries, the main topics are discussed in full generality in the first three sections: the first homotopy group of a poset; the connection between homotopy and net-cohomology; the behavior of net-cohomology under a change of the index set. The remaining two sections are devoted to the study of the case that the poset is a basis for the topology of a topological space. We stress that the results obtained in the first three sections in terms of abstract posets can be applied, not only to sharply localized charges which are the subject of the present investigation, but also to charges like those studied in [3, 1].
2.1. Preliminaries: The simplicial set and net-cohomology
In the present section, we recall the definition of simplicial set of a poset and the notion of net-cohomology of a poset, thereby establishing our notations. References for this section are [26, 16, 27].
The simplicial set. A poset (P, ≤) is a partially ordered set. This means that ≤ is a binary relation on a nonempty set P, satisfying
\[
\begin{array}{ll}
O \in P \;\Rightarrow\; O \le O & \text{reflexive}, \\
O_1 \le O_2 \text{ and } O_2 \le O_1 \;\Rightarrow\; O_1 = O_2 & \text{antisymmetric}, \\
O_1 \le O_2 \text{ and } O_2 \le O_3 \;\Rightarrow\; O_1 \le O_3 & \text{transitive}.
\end{array}
\]
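The three axioms can be tested mechanically on a finite set. The following is a minimal Python sketch; the function name `is_poset` and the example (the subsets of {1, 2} ordered by inclusion) are illustrative assumptions and do not appear in the paper.

```python
from itertools import product

def is_poset(elements, leq):
    """Check that `leq` is reflexive, antisymmetric and transitive on a finite set."""
    reflexive = all(leq(o, o) for o in elements)
    antisymmetric = all(
        o1 == o2
        for o1, o2 in product(elements, repeat=2)
        if leq(o1, o2) and leq(o2, o1)
    )
    transitive = all(
        leq(o1, o3)
        for o1, o2, o3 in product(elements, repeat=3)
        if leq(o1, o2) and leq(o2, o3)
    )
    return reflexive and antisymmetric and transitive

# Hypothetical example: the subsets of {1, 2}.
subsets = [frozenset(), frozenset({1}), frozenset({2}), frozenset({1, 2})]
inclusion = lambda x, y: x <= y   # set inclusion: a partial order
strict = lambda x, y: x < y       # strict inclusion: fails reflexivity
```

With these definitions, `is_poset(subsets, inclusion)` holds, while `is_poset(subsets, strict)` does not, because strict inclusion is not reflexive.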
A poset is said to be directed if for any pair $O_1, O_2 \in P$, there exists $O_3 \in P$ such that $O_1, O_2 \le O_3$. For our aim, important examples of posets are provided by the standard simplices. A standard n-simplex is defined as
\[
\Delta_n \equiv \{(\lambda_0, \ldots, \lambda_n) \in \mathbb{R}^{n+1} \mid \lambda_0 + \cdots + \lambda_n = 1,\ \lambda_i \in [0, 1]\}.
\]
It is clear that $\Delta_0$ is a point, $\Delta_1$ is a closed interval, etc. The inclusion maps $d^n_i$ between standard simplices are maps $d^n_i : \Delta_{n-1} \to \Delta_n$ defined as
\[
d^n_i(\lambda_0, \ldots, \lambda_{n-1}) = (\lambda_0, \lambda_1, \ldots, \lambda_{i-1}, 0, \lambda_i, \ldots, \lambda_{n-1}),
\]
for n ≥ 1 and 0 ≤ i ≤ n. Now, note that a standard n-simplex $\Delta_n$ can be regarded as a partially ordered set with respect to the inclusion of its subsimplices. A singular n-simplex of a poset P is an order preserving map $f : \Delta_n \to P$. We denote by $\Sigma_n(P)$ the collection of singular n-simplices of P and by $\Sigma_*(P)$ the collection of all singular simplices of P. $\Sigma_*(P)$ is the simplicial set of P. The inclusion maps $d^n_i$ between standard simplices are extended to maps $\partial^n_i : \Sigma_n(P) \to \Sigma_{n-1}(P)$, called boundaries, between singular simplices by setting $\partial^n_i f \equiv f \circ d^n_i$. One can easily check, by the definition of $d^n_i$, that the following relations
\[
\partial^{n-1}_i \circ \partial^n_j = \partial^{n-1}_j \circ \partial^n_{i+1}, \qquad i \ge j,
\]
hold. From now on, we will omit the superscripts from the symbol $\partial^n_i$, and will denote: the composition $\partial_i \circ \partial_j$ by the symbol $\partial_{ij}$; 0-simplices by the letter a; 1-simplices by b and 2-simplices by c. Notice that a 0-simplex a is nothing but an element of P; a 1-simplex b is formed by two 0-simplices $\partial_0 b$, $\partial_1 b$ and an element |b| of P, called the support of b, such that $\partial_0 b, \partial_1 b \le |b|$. Given $a_0, a_1 \in \Sigma_0(P)$, a path from $a_0$ to $a_1$ is a finite ordered sequence $p = \{b_n, \ldots, b_1\}$ of 1-simplices satisfying the relations
\[
\partial_1 b_1 = a_0, \qquad \partial_0 b_i = \partial_1 b_{i+1} \ \text{ with } i \in \{1, \ldots, n-1\}, \qquad \partial_0 b_n = a_1.
\]
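The notions of 1-simplex, path and directedness are easy to experiment with on a finite poset. The sketch below is only an illustration under stated assumptions: the names `Simplex1`, `is_one_simplex`, `is_path`, `is_directed` and the five-element poset of subsets of {1, 2, 3} are hypothetical and not taken from the paper. A 1-simplex is stored as the triple (∂1 b, ∂0 b, |b|).

```python
from collections import namedtuple
from itertools import product

# A 1-simplex b is stored as (d1, d0, support): d1 = ∂1 b is where b starts,
# d0 = ∂0 b is where it ends, and both must lie below the support |b|.
Simplex1 = namedtuple("Simplex1", ["d1", "d0", "support"])

def is_one_simplex(b, leq):
    """Check the condition ∂0 b, ∂1 b ≤ |b|."""
    return leq(b.d1, b.support) and leq(b.d0, b.support)

def is_path(p, a0, a1):
    """Check that p = [b_n, ..., b_1] (the paper's ordering) is a path from a0 to a1."""
    if not p:
        return False
    steps = p[::-1]  # b_1, ..., b_n in the order they are travelled
    chained = all(x.d0 == y.d1 for x, y in zip(steps, steps[1:]))
    return steps[0].d1 == a0 and steps[-1].d0 == a1 and chained

def is_directed(elements, leq):
    """Check that every pair of elements has a common upper bound."""
    return all(
        any(leq(o1, o3) and leq(o2, o3) for o3 in elements)
        for o1, o2 in product(elements, repeat=2)
    )

# Hypothetical example: five subsets of {1, 2, 3}, ordered by inclusion.
A, B, C = frozenset({1}), frozenset({2}), frozenset({3})
AB, BC = A | B, B | C
poset = [A, B, C, AB, BC]
inclusion = lambda x, y: x <= y

b1 = Simplex1(d1=A, d0=B, support=AB)   # from A to B inside AB
b2 = Simplex1(d1=B, d0=C, support=BC)   # from B to C inside BC
p = [b2, b1]                            # a path from A to C
```

In this toy poset, `p` is a path from `A` to `C` although the poset is not directed (`AB` and `BC` have no common upper bound); adding `A | B | C` to the list makes it directed.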
The starting point of p, written ∂1 p, is the 0-simplex a0 , while the endpoint of p, written ∂0 p, is the 0-simplex a1 . We will denote by P(a0 , a1 ) the set of paths from a0 to a1 , and by P(a0 ) the set of closed paths with endpoint a0 . P is said to be pathwise connected whenever for any pair a0 , a1 of 0-simplices there exists a path p ∈ P(a0 , a1 ). The support of the path is the collection |p| ≡ {|bi | | i = 1, . . . , n},
Fig. 1. b is a 1-simplex, c is a 2-simplex and p = {b3 , b2 , b1 } is a path. The symbol δ stands for ∂.
and we will write |p| ⊆ S if S is a subset of P with $|b_i| \in S$ for any i. Furthermore, with an abuse of notation, we will write |p| ⊆ O if O ∈ P with $|b_i| \le O$ for any i.
Causal disjointness and net of local algebras. Given a poset P, a causal disjointness relation on P is a symmetric binary relation ⊥ on P satisfying the following properties:
\[
\begin{array}{ll}
\text{(i)} & O_1 \in P \;\Rightarrow\; \exists\, O_2 \in P \text{ such that } O_1 \perp O_2, \\
\text{(ii)} & O_1 \le O_2 \text{ and } O_2 \perp O_3 \;\Rightarrow\; O_1 \perp O_3.
\end{array}
\tag{2.1}
\]
Given a subset S ⊆ P, the causal complement of S is the subset $S^\perp$ of P defined as
\[
S^\perp \equiv \{O \in P \mid O \perp O_1,\ \forall O_1 \in S\}.
\]
Note that if $S_1 \subseteq S$, then $S^\perp \subseteq S_1^\perp$. Now, assume that P is a pathwise connected poset equipped with a causal disjointness relation ⊥. A net of local algebras indexed by P is a correspondence
\[
A_P : P \ni O \to A(O) \subseteq B(H_o),
\]
associating to any O a von Neumann algebra A(O) defined on a fixed Hilbert space $H_o$, and satisfying
\[
\begin{array}{ll}
O_1 \le O_2 \;\Rightarrow\; A(O_1) \subseteq A(O_2), & \text{isotony}, \\
O_1 \perp O_2 \;\Rightarrow\; A(O_1) \subseteq A(O_2)', & \text{causality},
\end{array}
\]
where the prime over the algebra stands for the commutant of the algebra. The algebra $A(O^\perp)$ associated with the causal complement $O^\perp$ of O ∈ P is the C∗-algebra generated by the algebras $A(O_1)$ for any $O_1 \in P$ with $O_1 \perp O$. The net $A_P$ is said to be irreducible whenever, given $T \in B(H_o)$ such that $T \in A(O)'$ for any O ∈ P, then $T = c \cdot \mathbb{1}$.
The category of 1-cocycles. We refer the reader to the Appendix for the definition of C∗-category. Let P be a poset with a causal disjointness relation ⊥, and let $A_P$ be an irreducible net of local algebras. A 1-cocycle z of P with values in $A_P$ is a field $z : \Sigma_1(P) \ni b \to z(b) \in B(H_o)$ of unitary operators satisfying the 1-cocycle identity:
\[
z(\partial_0 c) \cdot z(\partial_2 c) = z(\partial_1 c), \qquad c \in \Sigma_2(P),
\]
and the locality condition: $z(b) \in A(|b|)$ for any 1-simplex b. An intertwiner $t \in (z, z_1)$ between a pair of 1-cocycles $z, z_1$ is a field $t : \Sigma_0(P) \ni a \to t_a \in B(H_o)$ satisfying the relation
\[
t_{\partial_0 b} \cdot z(b) = z_1(b) \cdot t_{\partial_1 b}, \qquad b \in \Sigma_1(P),
\]
and the locality condition: $t_a \in A(a)$ for any 0-simplex a. The category of 1-cocycles $Z^1(A_P)$ is the category whose objects are 1-cocycles and whose arrows are the corresponding set of intertwiners. The composition between $s \in (z, z_1)$ and $t \in (z_1, z_2)$ is the arrow $t \cdot s \in (z, z_2)$ defined as
\[
(t \cdot s)_a \equiv t_a \cdot s_a, \qquad a \in \Sigma_0(P).
\]
Note that the arrow $1_z$ of (z, z) defined as $(1_z)_a = \mathbb{1}$, for any $a \in \Sigma_0(P)$, is the identity of (z, z). Now, the set $(z, z_1)$ has a structure of complex vector space defined as
\[
(\alpha \cdot t + \beta \cdot s)_a \equiv \alpha \cdot t_a + \beta \cdot s_a, \qquad a \in \Sigma_0(P),
\]
for any α, β ∈ ℂ and $t, s \in (z, z_1)$. With these operations and the composition “·”, the set (z, z) is an algebra with identity $1_z$. The category $Z^1(A_P)$ has an adjoint ∗, defined as the identity, $z^* = z$, on the objects, while the adjoint $t^* \in (z_1, z)$ of arrows $t \in (z, z_1)$ is defined as
\[
(t^*)_a \equiv (t_a)^*, \qquad a \in \Sigma_0(P),
\]
where $(t_a)^*$ stands for the adjoint in $B(H_o)$ of the operator $t_a$. Now, let $\|\cdot\|$ be the norm of $B(H_o)$. Given $t \in (z, z_1)$, we have that $\|t_a\| = \|t_{a_1}\|$ for any pair $a, a_1$ of 0-simplices because P is pathwise connected. Therefore, by defining
\[
\|t\| \equiv \|t_a\|, \qquad a \in \Sigma_0(P),
\]
it turns out that $(z, z_1)$ is a complex Banach space for any $z, z_1 \in Z^1(A_P)$, while (z, z) is a C∗-algebra for any $z \in Z^1(A_P)$. This entails that $Z^1(A_P)$ is a C∗-category.
Two 1-cocycles $z, z_1$ are equivalent (or cohomologous) if there exists a unitary arrow $t \in (z, z_1)$. A 1-cocycle z is trivial if it is equivalent to the identity cocycle ι defined as $\iota(b) = \mathbb{1}$ for any 1-simplex b. Note that, since $A_P$ is irreducible, ι is irreducible: $(\iota, \iota) = \mathbb{C}\mathbb{1}$.
Equivalence in $B(H_o)$ and path-independence. A weaker form of equivalence between 1-cocycles is the following: $z, z_1$ are said to be equivalent in $B(H_o)$ if there exists a field $V : \Sigma_0(P) \ni a \to V_a \in B(H_o)$ of unitary operators such that
\[
V_{\partial_0 b} \cdot z(b) = z_1(b) \cdot V_{\partial_1 b}, \qquad b \in \Sigma_1(P).
\]
Note that the field V is not an arrow of $(z, z_1)$ because it is not required that V satisfies the locality condition. A 1-cocycle is trivial in $B(H_o)$ if it is equivalent in $B(H_o)$ to the trivial 1-cocycle ι. We denote by $Z^1_t(A_P)$ the set of the 1-cocycles trivial in $B(H_o)$ and with the same symbol we denote the full C∗-subcategory of $Z^1(A_P)$ whose objects are the 1-cocycles trivial in $B(H_o)$. Triviality in $B(H_o)$ is
related to the notion of path-independence. The evaluation of a 1-cocycle z on a path $p = \{b_n, \ldots, b_1\}$ is defined as
\[
z(p) \equiv z(b_n) \cdots z(b_2) \cdot z(b_1).
\]
z is said to be path-independent on a subset S ⊆ P whenever
\[
z(p) = z(q) \quad \text{for any } p, q \in P(a_0, a_1) \text{ such that } |p|, |q| \subseteq S. \tag{2.2}
\]
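To make the evaluation z(p) concrete, here is a toy Python sketch. It rests on strong simplifying assumptions that are not in the paper: the Hilbert space is taken one-dimensional, so that unitaries are complex phases, and the locality condition z(b) ∈ A(|b|) is not modelled. The field `V` and the labels `a0`, `a1`, `a2` are hypothetical. The cocycle is chosen of the form z(b) = V*_{∂0 b} · V_{∂1 b}, which is exactly what equivalence in B(Ho) to ι requires.

```python
import cmath
from collections import namedtuple

# A 1-simplex is stored as (d1, d0, support) = (∂1 b, ∂0 b, |b|).
Simplex1 = namedtuple("Simplex1", ["d1", "d0", "support"])

# Toy assumption: unitaries on a one-dimensional Hilbert space are phases.
V = {"a0": cmath.exp(0.3j), "a1": cmath.exp(1.1j), "a2": cmath.exp(-0.7j)}

def z(b):
    """z(b) = V[∂0 b]^* · V[∂1 b], the form forced by V[∂0 b]·z(b) = ι(b)·V[∂1 b]."""
    return V[b.d0].conjugate() * V[b.d1]

def evaluate(path):
    """z(p) = z(b_n) ··· z(b_1) for p = [b_n, ..., b_1] (the paper's ordering)."""
    result = 1 + 0j
    for b in path:
        result *= z(b)
    return result

# Two hypothetical paths from a0 to a2; the supports are placeholders here.
direct = [Simplex1("a0", "a2", "s02")]
detour = [Simplex1("a1", "a2", "s12"), Simplex1("a0", "a1", "s01")]
```

Both paths give the same value, V*_{a2} · V_{a0}, up to floating-point rounding: the intermediate factor V_{a1} cancels, which is path-independence in this toy setting.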
As P is pathwise connected, a 1-cocycle is trivial in $B(H_o)$ if, and only if, it is path-independent on all P [16]. For later purposes, we recall the following result: assume that z is a 1-cocycle trivial in $B(H_o)$; if the causal complement $O^\perp$ of O is pathwise connected, then
\[
z(p) \cdot A \cdot z(p)^* = A, \qquad A \in A(O), \tag{2.3}
\]
for any path p with $\partial_1 p, \partial_0 p \perp O$ [16, Lemma 3A.5].
2.2. The first homotopy group of a poset
The logical steps necessary to define the first homotopy group of posets are the same as in the case of topological spaces. We first recall the definition of a homotopy of paths; secondly, we introduce the reverse of a path, the composition of paths and prove that they behave well under the homotopy equivalence relation; finally we define the first homotopy group of a poset.
The definition of a homotopy of paths ([27, p. 322]) needs some preliminaries. An ampliation of a 1-simplex b is a 2-simplex c such that $\partial_1 c = b$. We denote by A(b) the set of the ampliations of b. An elementary ampliation of a path $p = \{b_n, \ldots, b_1\}$ is a path q of the form
\[
q = \{b_n, \ldots, b_{j+1}, \partial_0 c, \partial_2 c, b_{j-1}, \ldots, b_2, b_1\}, \qquad c \in A(b_j). \tag{2.4}
\]
Consider now a pair $\{b_2, b_1\}$ of 1-simplices satisfying $\partial_1 b_2 = \partial_0 b_1$. A contraction of $\{b_2, b_1\}$ is a 2-simplex c satisfying $\partial_0 c = b_2$, $\partial_2 c = b_1$. We denote by $C(b_2, b_1)$ the set of the contractions of $\{b_2, b_1\}$. An elementary contraction of a path $p = \{b_n, \ldots, b_1\}$ is a path q of the form
\[
q = \{b_n, \ldots, b_{j+2}, \partial_1 c, b_{j-1}, \ldots, b_1\}, \qquad c \in C(b_{j+1}, b_j). \tag{2.5}
\]
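The two elementary moves (2.4) and (2.5) can be written as list operations. The following Python sketch is an illustration only: the names `Simplex1`, `Simplex2`, `ampliate`, `contract` and the labels in the example are hypothetical, a 2-simplex is recorded just by its three faces and its support, and the compatibility of the faces is assumed rather than checked.

```python
from collections import namedtuple

# 1-simplex: (∂1 b, ∂0 b, |b|); 2-simplex: its faces ∂0 c, ∂1 c, ∂2 c and |c|.
Simplex1 = namedtuple("Simplex1", ["d1", "d0", "support"])
Simplex2 = namedtuple("Simplex2", ["d0", "d1", "d2", "support"])

def ampliate(path, j, c):
    """Elementary ampliation (2.4): replace b_j by {∂0 c, ∂2 c}, with ∂1 c = b_j.

    `path` is the list [b_n, ..., b_1]; j is the paper's index of b_j.
    """
    k = len(path) - j  # list position of b_j
    if not 1 <= j <= len(path) or c.d1 != path[k]:
        raise ValueError("c is not an ampliation of b_j")
    return path[:k] + [c.d0, c.d2] + path[k + 1:]

def contract(path, j, c):
    """Elementary contraction (2.5): replace {b_{j+1}, b_j} by ∂1 c,
    where ∂0 c = b_{j+1} and ∂2 c = b_j."""
    k = len(path) - j  # list position of b_j; b_{j+1} sits at k - 1
    if not 1 <= j < len(path) or c.d0 != path[k - 1] or c.d2 != path[k]:
        raise ValueError("c is not a contraction of {b_{j+1}, b_j}")
    return path[:k - 1] + [c.d1] + path[k + 1:]

# Hypothetical example: a 1-simplex from "a" to "b" and a 2-simplex through "m".
b = Simplex1(d1="a", d0="b", support="S")
c = Simplex2(
    d0=Simplex1(d1="m", d0="b", support="S"),  # ∂0 c: from m to b
    d1=b,                                      # ∂1 c = b
    d2=Simplex1(d1="a", d0="m", support="S"),  # ∂2 c: from a to m
    support="S",
)
p = [b]
q = ampliate(p, 1, c)  # the path {∂0 c, ∂2 c}
```

Here `contract(q, 1, c)` returns `p` again, since any 2-simplex c belongs both to A(∂1 c) and to C(∂0 c, ∂2 c).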
An elementary deformation of a path p is a path q which is either an elementary ampliation or an elementary contraction of p. Note that a path q is an elementary ampliation of a path p if, and only if, p is an elementary contraction of q. This can be easily seen by observing that if c ∈ Σ2 (P), then c ∈ A(∂1 c) and c ∈ C(∂0 c, ∂2 c). This entails that deformation is a symmetric, reflexive binary relation on the set of paths with the same endpoints. However, if P is not directed, deformation does not need to be an equivalence relation on paths with the same endpoints, because transitivity might fail. Given a0 , a1 ∈ Σ0 (P), a homotopy of paths in P(a0 , a1 ) is a map h(i) : {1, 2, . . . , n} → P(a0 , a1 ) such that h(i) is an elementary deformation of h(i − 1)
Fig. 2. q is an elementary ampliation of the path p and q1 is an elementary contraction of p. The symbol δ stands for ∂.
for 1 < i ≤ n. We will say that two paths $p, q \in P(a_0, a_1)$ are homotopic, p ∼ q, if there exists a homotopy of paths h in $P(a_0, a_1)$ such that h(1) = q and h(n) = p. It is clear that homotopy of paths is an equivalence relation on paths with the same endpoints. We now define the composition of paths and the reverse of a path. Given $p = \{b_n, \ldots, b_1\} \in P(a_0, a_1)$ and $q = \{b'_k, \ldots, b'_1\} \in P(a_1, a_2)$, the composition of p and q is the path $q * p \in P(a_0, a_2)$ defined as
\[
q * p \equiv \{b'_k, \ldots, b'_1, b_n, \ldots, b_1\}. \tag{2.6}
\]
Note that $p_1 * (p_2 * p_3) = (p_1 * p_2) * p_3$, if the composition is defined.
Lemma 2.1. Let $p_1, q_1 \in P(a_0, a_1)$, $p_2, q_2 \in P(a_1, a_2)$. If $p_1 \sim q_1$ and $p_2 \sim q_2$, then $p_2 * p_1 \sim q_2 * q_1$.
Proof. Let $h_1 : \{1, \ldots, n\} \to P(a_0, a_1)$ and $h_2 : \{1, \ldots, k\} \to P(a_1, a_2)$ be homotopies of paths such that $h_1(1) = p_1$, $h_1(n) = q_1$ and $h_2(1) = p_2$, $h_2(k) = q_2$. Define
\[
h(i) \equiv
\begin{cases}
h_2(1) * h_1(i), & i \in \{1, \ldots, n\}, \\
h_2(i - n) * h_1(n), & i \in \{n + 1, \ldots, n + k\}.
\end{cases}
\]
Then $h : \{1, \ldots, n + k\} \to P(a_0, a_2)$ is a homotopy of paths such that $h(1) = p_2 * p_1$ and $h(n + k) = q_2 * q_1$, completing the proof.
The reverse of a 1-simplex b is the 1-simplex $\bar b$ defined as
\[
\partial_0 \bar b = \partial_1 b, \qquad \partial_1 \bar b = \partial_0 b, \qquad |\bar b| = |b|. \tag{2.7}
\]
So, the reverse of a path $p = \{b_n, \ldots, b_1\} \in P(a_0, a_1)$ is the path $\bar p \in P(a_1, a_0)$ defined as $\bar p \equiv \{\bar b_1, \ldots, \bar b_n\}$. It is clear that $\bar{\bar p} = p$. Furthermore,
Lemma 2.2. $p \sim q \Rightarrow \bar p \sim \bar q$.
Proof. The reverse of a 2-simplex c is the 2-simplex $\bar c$ defined as
\[
\partial_1 \bar c = \overline{\partial_1 c}, \qquad \partial_0 \bar c = \overline{\partial_2 c}, \qquad \partial_2 \bar c = \overline{\partial_0 c}, \qquad |\bar c| = |c|.
\]
Note that, if c ∈ A(b), then $\bar c \in A(\bar b)$; if $c \in C(b_2, b_1)$, then $\bar c \in C(\bar b_1, \bar b_2)$. So, let $h : \{1, \ldots, n\} \to P(a_0, a_1)$ be a homotopy of paths. Then, the map $\bar h : \{1, \ldots, n\} \to P(a_1, a_0)$ defined as $\bar h(i) \equiv \overline{h(i)}$ for any i is a homotopy of paths, completing the proof.
A 1-simplex b is said to be degenerate to a 0-simplex $a_0$ whenever
\[
\partial_0 b = a_0 = \partial_1 b, \qquad a_0 = |b|. \tag{2.8}
\]
We will denote by $b(a_0)$ the 1-simplex degenerate to $a_0$.
Lemma 2.3. The following assertions hold:
(i) $p * b(\partial_1 p) \sim p \sim b(\partial_0 p) * p$,
(ii) $p * \bar p \sim b(\partial_0 p)$ and $\bar p * p \sim b(\partial_1 p)$.
Proof. By Lemma 2.1, it is enough to prove the assertions in the case that p is a 1-simplex b. (i) Let $c_1$ be the 2-simplex defined as $\partial_2 c_1 = b(\partial_1 b)$, $\partial_0 c_1 = b$, $\partial_1 c_1 = b$ and whose support $|c_1|$ equals |b|. Then, $c_1 \in C(b, b(\partial_1 b))$ and $b * b(\partial_1 b) \sim b$. The other identity follows in a similar way. (ii) Let $c_2$ be the 2-simplex defined as $\partial_0 c_2 = b$, $\partial_2 c_2 = \bar b$, $\partial_1 c_2 = b(\partial_0 b)$ and whose support $|c_2|$ equals |b|. Then, $c_2 \in C(b, \bar b)$ and $b * \bar b \sim b(\partial_0 b)$. The other identity follows in a similar way.
We now are in a position to define the first homotopy group of a poset. Fix a base 0-simplex $a_0$ and consider the set of closed paths $P(a_0)$. Note that the composition and the reverse are internal operations of $P(a_0)$ and that $b(a_0) \in P(a_0)$. We define
\[
\pi_1(P, a_0) \equiv P(a_0)/\!\sim, \tag{2.9}
\]
where ∼ is the homotopy equivalence relation. Let [p] denote the homotopy class of an element p of $P(a_0)$. Equip $\pi_1(P, a_0)$ with the product
\[
[p] * [q] \equiv [p * q], \qquad [p], [q] \in \pi_1(P, a_0).
\]
∗ is associative, and it easily follows from previous lemmas that $\pi_1(P, a_0)$ with ∗ is a group: the identity 1 of the group is $[b(a_0)]$; the inverse $[p]^{-1}$ of [p] is $[\bar p]$. Now, assume that P is pathwise connected. Given a 0-simplex $a_1$, let q be a path from $a_0$ to $a_1$. Then, the map
\[
\pi_1(P, a_0) \ni [p] \to [q * p * \bar q] \in \pi_1(P, a_1)
\]
is a group isomorphism. On the grounds of these facts, we give the following:
Definition 2.4. We call $\pi_1(P, a_0)$ the first homotopy group of P with base $a_0 \in \Sigma_0(P)$. If P is pathwise connected, we denote this group by $\pi_1(P)$ and call
it the fundamental group of P. If $\pi_1(P) = 1$, we will say that P is simply connected.
We have the following result:
Proposition 2.5. If P is directed, then P is pathwise and simply connected.
Proof. Clearly P is pathwise connected. Let $p = \{b_n, \ldots, b_1\} \in P(a_0)$. As P is directed, we can find $c_i \in A(b_i)$ for i = 2, . . . , n − 1 with
\[
\partial_2 c_2 = \bar b_1, \qquad \partial_0 c_{i-1} = \overline{\partial_2 c_i} \ \text{ for } i = 3, \ldots, n-1, \qquad \partial_0 c_{n-1} = \bar b_n.
\]
One can easily deduce from these relations that $\partial_{02} c_i = a_0$ for any i = 2, . . . , n − 1. By Lemmas 2.1, 2.2 and 2.3, we have
\begin{align*}
p &\sim b_n * \partial_0 c_{n-1} * \partial_2 c_{n-1} * \partial_0 c_{n-2} * \cdots * \partial_2 c_3 * \partial_0 c_2 * \partial_2 c_2 * b_1 \\
&= b_n * \bar b_n * \partial_2 c_{n-1} * \overline{\partial_2 c_{n-1}} * \cdots * \partial_2 c_3 * \overline{\partial_2 c_3} * \bar b_1 * b_1 \\
&\sim b(a_0) * \cdots * b(a_0) \sim b(a_0),
\end{align*}
completing the proof.
2.3. Connection between homotopy and net-cohomology
Let us consider a pathwise-connected poset P, equipped with a causal disjointness relation ⊥, and let $A_P$ be an irreducible net of local algebras. In this section, we show the relation between $\pi_1(P)$ and the set $Z^1(A_P)$. To begin with, we prove the invariance of 1-cocycles for homotopic paths.
Lemma 2.6. Let $z \in Z^1(A_P)$. For any pair p, q of paths with the same endpoints, if p ∼ q, then z(p) = z(q).
Proof. It is enough to check the invariance of z for elementary deformations. For instance, let $q = \{b_n, \ldots, b_{j+1}, \partial_0 c, \partial_2 c, b_{j-1}, \ldots, b_2, b_1\}$, with $c \in A(b_j)$, be an elementary ampliation of p. By definition of $A(b_j)$ and the 1-cocycle identity, we have
\begin{align*}
z(q) &= z(b_n) \cdots z(b_{j+1}) \cdot z(\partial_0 c) \cdot z(\partial_2 c) \cdot z(b_{j-1}) \cdots z(b_1) \\
&= z(b_n) \cdots z(b_{j+1}) \cdot z(\partial_1 c) \cdot z(b_{j-1}) \cdots z(b_1) = z(p).
\end{align*}
The invariance for elementary contractions follows in a similar way.
Lemma 2.7. Let $z \in Z^1(A_P)$. Then:
(i) $z(b(a)) = \mathbb{1}$ for any 0-simplex a,
(ii) $z(\bar p) = z(p)^*$ for any path p.
Proof. (i) Let $c(a_0)$ be the 2-simplex degenerate to $a_0$, that is
\[
\partial_0 c(a_0) = \partial_2 c(a_0) = \partial_1 c(a_0) = b(a_0), \qquad |c(a_0)| = a_0,
\]
where $b(a_0)$ is the 1-simplex degenerate to $a_0$. By the 1-cocycle identity we have:
\[
z(\partial_0 c(a)) \cdot z(\partial_2 c(a)) = z(\partial_1 c(a)) \iff z(b(a)) \cdot z(b(a)) = z(b(a)) \iff z(b(a)) = \mathbb{1}.
\]
(ii) Follows from (i), Lemma 2.6 and Lemma 2.3(ii).
We now are in a position to show the connection between the fundamental group of P and $Z^1(A_P)$. Fix a base 0-simplex $a_0$. Given $z \in Z^1(A_P)$, define
\[
\pi_z([p]) \equiv z(p), \qquad [p] \in \pi_1(P, a_0). \tag{2.10}
\]
This definition is well posed as z is invariant for homotopic paths.
Theorem 2.8. The correspondence $Z^1(A_P) \ni z \to \pi_z$ maps 1-cocycles z, equivalent in $B(H_o)$, into equivalent unitary representations $\pi_z$ of $\pi_1(P)$ in $H_o$. Up to equivalence, this map is injective. If $\pi_1(P) = 1$, then $Z^1(A_P) = Z^1_t(A_P)$.
Proof. First, recall that the identity 1 of $\pi_1(P)$ is the equivalence class $[b(a_0)]$ associated with the 1-simplex degenerate to $a_0$. By Lemma 2.7, we have that $\pi_z(1) = \mathbb{1}$ and that $\pi_z([p]^{-1}) = \pi_z([p])^*$. Furthermore, it is obvious from the definition of $\pi_z$ that $\pi_z([p] * [q]) = \pi_z([p]) \cdot \pi_z([q])$, therefore $\pi_z$ is a unitary representation of $\pi_1(P)$ in $H_o$. Note that if $z_1 \in Z^1(A_P)$ and $u \in (z, z_1)$ is unitary, then $u_{a_0} \cdot \pi_z([p]) = \pi_{z_1}([p]) \cdot u_{a_0}$. Now, let π be a unitary representation of $\pi_1(P)$ on $H_o$. Fix a base 0-simplex $a_0$, and for any 0-simplex a, denote by $p_a$ a path with $\partial_1 p_a = a$ and $\partial_0 p_a = a_0$. Let
\[
z_\pi(b) \equiv \pi([p_{\partial_0 b} * b * \overline{p_{\partial_1 b}}]), \qquad b \in \Sigma_1(P).
\]
Given a 2-simplex c, we have
\begin{align*}
z_\pi(\partial_0 c) \cdot z_\pi(\partial_2 c)
&= \pi([p_{\partial_{00} c} * \partial_0 c * \overline{p_{\partial_{10} c}} * p_{\partial_{02} c} * \partial_2 c * \overline{p_{\partial_{12} c}}]) \\
&= \pi([p_{\partial_{00} c} * \partial_0 c * \overline{p_{\partial_{10} c}} * p_{\partial_{10} c} * \partial_2 c * \overline{p_{\partial_{12} c}}]) \\
&= \pi([p_{\partial_{00} c} * \partial_0 c * \partial_2 c * \overline{p_{\partial_{11} c}}]) \\
&= \pi([p_{\partial_{01} c} * \partial_1 c * \overline{p_{\partial_{11} c}}]) = z_\pi(\partial_1 c).
\end{align*}
Hence, $z_\pi$ satisfies the 1-cocycle identity but in general $z_\pi \notin Z^1(A_P)$ because $z_\pi(b)$ does not need to belong to A(|b|). However, note that if we consider $\pi_{z_1}$ for some $z_1 \in Z^1(A_P)$, then
\[
z_{\pi_{z_1}}(b) = \pi_{z_1}([p_{\partial_0 b} * b * \overline{p_{\partial_1 b}}]) = z_1(p_{\partial_0 b}) \cdot z_1(b) \cdot z_1(p_{\partial_1 b})^*.
\]
Therefore, $z_{\pi_{z_1}}$ is equivalent in $B(H_o)$ to $z_1$. This entails that if $\pi_z$ is equivalent to $\pi_{z_1}$, then z is equivalent in $B(H_o)$ to $z_1$. Finally, assume that $\pi_1(P) = 1$; then $z(p) = \mathbb{1}$ for any closed path p. This entails that z is path-independent on P, hence z is trivial in $B(H_o)$.
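As a consistency check on the correspondence z → π_z, one can verify directly that a 1-cocycle trivial in B(Ho) is sent to the trivial representation. The computation below is a short worked sketch that uses only the definition of equivalence in B(Ho) with ι; the field V is the one appearing in that definition, and $p = \{b_n, \ldots, b_1\}$ is an arbitrary path.

```latex
\begin{align*}
V_{\partial_0 b} \cdot z(b) = \iota(b) \cdot V_{\partial_1 b} = V_{\partial_1 b}
  \quad &\Longrightarrow \quad z(b) = V_{\partial_0 b}^{*} \cdot V_{\partial_1 b}, \\
z(p) = z(b_n) \cdots z(b_1)
  &= V_{\partial_0 b_n}^{*} V_{\partial_1 b_n} \cdot V_{\partial_0 b_{n-1}}^{*} V_{\partial_1 b_{n-1}}
     \cdots V_{\partial_0 b_1}^{*} V_{\partial_1 b_1} \\
  &= V_{\partial_0 p}^{*} \cdot V_{\partial_1 p},
\end{align*}
% The middle factors cancel because \partial_1 b_{i+1} = \partial_0 b_i
% and each V_a is unitary.
```

For a closed path p ∈ P(a0) this gives $z(p) = V_{a_0}^{*} V_{a_0} = \mathbb{1}$, so that $\pi_z([p]) = \mathbb{1}$ for every class [p], in agreement with Theorem 2.8.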
2.4. Change of index set
The purpose is to show the invariance of net-cohomology under a suitable change of the index set. To begin with, by a subposet of a poset P we mean a subset $\hat P$ of P equipped with the same order relation of P.
Definition 2.9. Consider a subposet $\hat P$ of P. We will say that $\hat P$ is a refinement of P, if for any O ∈ P there exists $\hat O \in \hat P$ such that $\hat O \le O$. A refinement $\hat P$ of P is said to be locally relatively connected if given O ∈ P, for any pair $\hat O_1, \hat O_2 \in \hat P$ with $\hat O_1, \hat O_2 \le O$, there is a path $\hat p$ in $\hat P$ from $\hat O_1$ to $\hat O_2$ such that $|\hat p| \le O$.
Lemma 2.10. Let $\hat P$ be a locally relatively connected refinement of P. (i) P is pathwise connected if, and only if, $\hat P$ is pathwise connected. (ii) If ⊥ is a causal disjointness relation for P, then the restriction of ⊥ to $\hat P$ is a causal disjointness relation.
Proof. (i) Assume that P is pathwise connected. It easily follows from the definition of a locally relatively connected refinement that $\hat P$ is pathwise connected. Conversely, assume that $\hat P$ is pathwise connected. Given $a_0, a_1 \in P$, let $\hat a_0, \hat a_1 \in \hat P$ be such that $\hat a_0 \le a_0$ and $\hat a_1 \le a_1$, and let $\hat p$ be a path in $\hat P$ from $\hat a_0$ to $\hat a_1$. Then, $b_1 * \hat p * b_0$ is a path from $a_0$ to $a_1$, where $b_0, b_1$ are 1-simplices of P defined as follows: $\partial_1 b_0 = a_0$, $\partial_0 b_0 = \hat a_0$, $|b_0| = a_0$ and $\partial_0 b_1 = a_1$, $\partial_1 b_1 = \hat a_1$, $|b_1| = a_1$. (ii) It is clear that ⊥, restricted to $\hat P$, is a symmetric binary relation satisfying the property (ii) of the definition (2.1). Let $\hat O \in \hat P$. Since ⊥ is a causal disjointness relation on P, we can find $O_1 \in P$ with $\hat O \perp O_1$. Since $\hat P$ is a refinement of P, there exists $\hat O_1 \in \hat P$ with $\hat O_1 \le O_1$. Hence $\hat O \perp \hat O_1$, completing the proof.
Let P be a pathwise connected poset and let ⊥ be a causal disjointness relation for P. Let $A_P$ be an irreducible net of local algebras indexed by P and defined on a Hilbert space $H_o$.
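For a finite poset, the two conditions of Definition 2.9 can be checked by brute force. The Python sketch below is an illustration under assumptions: the function names and the three-element example are hypothetical. Local relative connectedness is tested as connectivity of a graph whose nodes are the elements of the refinement lying below a given O, two nodes being linked whenever some element of the refinement below O dominates both, that is, whenever they are the faces of a 1-simplex of the refinement with support below O.

```python
def is_refinement(P_hat, P, leq):
    """Every O in P must contain (in the order sense) some element of P_hat."""
    return all(any(leq(o_hat, o) for o_hat in P_hat) for o in P)

def is_locally_relatively_connected(P_hat, P, leq):
    """For each O in P, any two elements of P_hat below O must be joined by a
    path of P_hat whose supports all lie below O."""
    for o in P:
        below = [x for x in P_hat if leq(x, o)]
        if not below:
            continue
        # Breadth-first search over the elements of P_hat below O.
        reached, frontier = {below[0]}, [below[0]]
        while frontier:
            x = frontier.pop()
            for y in below:
                linked = any(leq(x, s) and leq(y, s) for s in below)
                if y not in reached and linked:
                    reached.add(y)
                    frontier.append(y)
        if len(reached) != len(below):
            return False
    return True

# Hypothetical example: P = {A, B, AB} ordered by inclusion.
A, B = frozenset({1}), frozenset({2})
AB = A | B
P = [A, B, AB]
inclusion = lambda x, y: x <= y
```

In this example `[A, B]` is a refinement of `P` but is not locally relatively connected, since no element of `[A, B]` below `AB` contains both `A` and `B`; the poset `P` itself trivially satisfies both conditions.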
If $\hat P$ is a locally relatively connected refinement of P, then, by the previous lemma, $\hat P$ is pathwise connected and ⊥ is a causal disjointness relation on $\hat P$. Furthermore, the restriction of $A_P$ to $\hat P$ is a net of local algebras $A_{P|\hat P}$ indexed by $\hat P$. Let $Z^1_t(A_{P|\hat P})$ be the category of 1-cocycles of $\hat P$, trivial in $B(H_o)$, with values in the net $A_{P|\hat P}$. Notice that $A_{P|\hat P}$ might be not irreducible, hence it is not clear, at first sight, if the trivial 1-cocycle $\hat\iota$ of $Z^1_t(A_{P|\hat P})$ is irreducible or not. This could create some problems in the following, since the properties of tensor C∗-categories whose identity is not irreducible are quite complicated (see [21, 31, 5]). However, as a consequence of the fact that $\hat P$ is a refinement of P, this is not the case as shown by the following lemma.
Lemma 2.11. Let $A_P$ be an irreducible net of local algebras. For any locally relatively connected refinement $\hat P$ of P, the trivial 1-cocycle $\hat\iota$ of $Z^1_t(A_{P|\hat P})$ is irreducible.
Proof. Let $\hat t \in (\hat\iota, \hat\iota)$. By the definition of $\hat\iota$, we have that $\hat t_{\partial_1 \hat b} = \hat t_{\partial_0 \hat b}$ for any 1-simplex $\hat b$ of $\hat P$. Since $\hat P$ is pathwise connected, we have that $\hat t_{\hat a} = \hat t_{\hat a_1}$ for any pair $\hat a, \hat a_1$ of 0-simplices of $\hat P$. By the localization properties of $\hat t$, it turns out that if we define $T \equiv \hat t_{\hat a}$ for some 0-simplex $\hat a$ of $\hat P$, then $T \in A(\hat O)$ for any $\hat O \in \hat P$. Now, observe that given O ∈ P, by the definition of causal disjointness relation, there is $O_1 \in P$ such that $O_1 \perp O$. Since $\hat P$ is a refinement of P, there is $\hat O_1 \in \hat P$ such that $\hat O_1 \le O_1$. Hence, $\hat O_1 \perp O$. Since $T \in A(\hat O_1)$, we have that $T \in A(O)'$. But this holds for any O ∈ P, hence $T = c \cdot \mathbb{1}$ because the net $A_P$ is irreducible.
We now are ready to show the main result of this section.
Theorem 2.12. Let $\hat P$ be a locally relatively connected refinement of P. Then, the categories $Z^1_t(A_P)$ and $Z^1_t(A_{P|\hat P})$ are equivalent.
Proof. For any $z \in Z^1_t(A_P)$ and for any $t \in (z, z_1)$, define
\[
R(z) \equiv z \restriction \Sigma_1(\hat P), \qquad R(t) \equiv t \restriction \Sigma_0(\hat P).
\]
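It is clear that R is a covariant and faithful functor from $Z^1_t(A_P)$ into $Z^1_t(A_{P|\hat P})$. We now define a functor from $Z^1_t(A_{P|\hat P})$ to $Z^1_t(A_P)$. To this purpose, we choose a function $f : P \to \hat P$ satisfying the following properties: given O ∈ P,
\[
\text{if } O \in \hat P \;\Rightarrow\; f(O) = O, \qquad \text{otherwise } f(O) \le O.
\]
For any $b \in \Sigma_1(P)$, we denote by $\hat p(f(\partial_0 b), f(\partial_1 b))$ a path of $\hat P$ from $f(\partial_0 b)$ to $f(\partial_1 b)$ whose support is contained in |b|; this is possible because $\hat P$ is a locally relatively connected refinement of P. For any $\hat z \in Z^1_t(A_{P|\hat P})$, we define
\[
F(\hat z)(b) \equiv \hat z(\hat p(f(\partial_0 b), f(\partial_1 b))), \qquad b \in \Sigma_1(P).
\]
By the properties of the path $\hat p(f(\partial_0 b), f(\partial_1 b))$, it follows that $F(\hat z)(b) \in A(|b|)$. For any $c \in \Sigma_2(P)$, by using the path-independence of $\hat z$ we have
\begin{align*}
F(\hat z)(\partial_0 c) \cdot F(\hat z)(\partial_2 c)
&= \hat z(\hat p(f(\partial_{00} c), f(\partial_{10} c))) \cdot \hat z(\hat p(f(\partial_{02} c), f(\partial_{12} c))) \\
&= \hat z(\hat p(f(\partial_{01} c), f(\partial_{02} c))) \cdot \hat z(\hat p(f(\partial_{02} c), f(\partial_{11} c))) \\
&= \hat z(\hat p(f(\partial_{01} c), f(\partial_{11} c))) = F(\hat z)(\partial_1 c).
\end{align*}
Hence, $F(\hat z)$ satisfies the 1-cocycle identity, and it is trivial in $B(H_o)$ because so is $\hat z$. Therefore, $F(\hat z) \in Z^1_t(A_P)$. Now, for any $\hat t \in (\hat z, \hat z_1)$, define
\[
F(\hat t)_a \equiv \hat t_{f(a)}, \qquad a \in \Sigma_0(P).
\]
Clearly, $F(\hat t)_a \in A(a)$. Moreover, for any $b \in \Sigma_1(P)$, we have
\[
F(\hat t)_{\partial_0 b} \cdot F(\hat z)(b) = \hat t_{f(\partial_0 b)} \cdot \hat z(\hat p(f(\partial_0 b), f(\partial_1 b))) = \hat z_1(\hat p(f(\partial_0 b), f(\partial_1 b))) \cdot \hat t_{f(\partial_1 b)} = F(\hat z_1)(b) \cdot F(\hat t)_{\partial_1 b}.
\]
Therefore, F is a covariant functor from $Z^1_t(A_{P|\hat P})$ to $Z^1_t(A_P)$.
Now, we show that the pair R, F establishes an equivalence between $Z^1_t(A_P)$ and $Z^1_t(A_{P|\hat P})$. Given $\hat z \in Z^1_t(A_{P|\hat P})$, for any $\hat b \in \Sigma_1(\hat P)$, we have that
\[
(R \circ F)(\hat z)(\hat b) = F(\hat z)(\hat b) = \hat z(\hat p(f(\partial_0 \hat b), f(\partial_1 \hat b))) = \hat z(\hat b),
\]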
because $f(\partial_0 \hat b) = \partial_0 \hat b$ and $f(\partial_1 \hat b) = \partial_1 \hat b$. Clearly, $(R \circ F)(\hat t)_{\hat a} = \hat t_{\hat a}$ for any $\hat a \in \Sigma_0(\hat P)$. Therefore, $R \circ F = 1_{Z^1_t(A_{P|\hat P})}$, where $1_{Z^1_t(A_{P|\hat P})}$ is the identity functor of $Z^1_t(A_{P|\hat P})$. The proof follows once we have shown that the functor F ∘ R is naturally isomorphic to $1_{Z^1_t(A_P)}$. To this end, for any $a \in \Sigma_0(P)$, let $b(f(a), a) \in \Sigma_1(P)$ be defined as
\[
\partial_0 b(f(a), a) = f(a), \qquad \partial_1 b(f(a), a) = a, \qquad |b(f(a), a)| = a.
\]
Given $z \in Z^1_t(A_P)$, let $u(z)_a \equiv z(b(f(a), a))$ for any $a \in \Sigma_0(P)$. By definition, $u(z)_a \in A(a)$. Furthermore, by the path-independence of z,
\begin{align*}
u(z)_{\partial_0 b} \cdot z(b) &= z(b(f(\partial_0 b), \partial_0 b)) \cdot z(b) \\
&= z(\hat p(f(\partial_0 b), f(\partial_1 b))) \cdot z(b(f(\partial_1 b), \partial_1 b)) \\
&= (F \circ R)(z)(b) \cdot z(b(f(\partial_1 b), \partial_1 b)) \\
&= (F \circ R)(z)(b) \cdot u(z)_{\partial_1 b}.
\end{align*}
Hence, $u(z) \in (z, (F \circ R)(z))$ for any $z \in Z^1_t(A_P)$. Let $t \in (z_1, z)$; then
\[
u(z)_a \cdot t_a = t_{f(a)} \cdot z_1(b(f(a), a)) = (F \circ R)(t)_a \cdot u(z_1)_a
\]
for any $a \in \Sigma_0(P)$. This means that the mapping $u : Z^1_t(A_P) \ni z \to u(z) \in (z, (F \circ R)(z))$ is a natural transformation between $1_{Z^1_t(A_P)}$ and F ∘ R. Finally, note that $u(z)^* \in ((F \circ R)(z), z)$. Combining this with the fact that u(z) is unitary, we have that u is a natural isomorphism between $1_{Z^1_t(A_P)}$ and F ∘ R, completing the proof.
2.5. The poset as a basis for a topological space
Let X be a topological Hausdorff space. The topics of the previous sections are now investigated in the case that P is a basis for the topology of X ordered under inclusion ⊆. This allows us to show the connection between the notions for posets and the corresponding topological ones, and to understand how topology affects net-cohomology.
2.5.1. Homotopy
In what follows, by a curve γ of X , we mean a continuous function from the interval [0, 1] into X . We recall that the reverse of a curve γ is the curve γ̄ defined as γ̄(t) ≡ γ(1 − t) for t ∈ [0, 1]. If β is a curve such that β(1) = γ(0), the composition γ ∗ β is the curve
$$(\gamma * \beta)(t) \equiv \begin{cases} \beta(2t), & 0 \le t \le 1/2, \\ \gamma(2t - 1), & 1/2 \le t \le 1. \end{cases}$$
Finally, the constant curve e_x is the curve e_x (t) = x for any t ∈ [0, 1].
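The three operations on curves just recalled can be sketched in a few lines of Python. This is only an illustration: the names `reverse`, `compose` and `constant` are hypothetical, and curves are modelled as plain functions on [0, 1] with real values, which is an assumption made for the example and not part of the text.

```python
from typing import Callable

# A curve is modelled as a function [0, 1] -> X; here X = R for simplicity (assumption).
Curve = Callable[[float], float]


def reverse(gamma: Curve) -> Curve:
    """Reverse curve: t -> gamma(1 - t)."""
    return lambda t: gamma(1.0 - t)


def compose(gamma: Curve, beta: Curve) -> Curve:
    """Composition gamma * beta: beta is run first, then gamma.

    The caller is expected to ensure beta(1) == gamma(0).
    """
    def curve(t: float) -> float:
        return beta(2.0 * t) if t <= 0.5 else gamma(2.0 * t - 1.0)
    return curve


def constant(x: float) -> Curve:
    """Constant curve e_x."""
    return lambda t: x


beta = lambda t: t                # runs from 0 to 1
gamma = lambda t: 1.0 + 2.0 * t   # runs from 1 to 3
gamma_beta = compose(gamma, beta)  # runs from 0 to 3, passing through 1 at t = 1/2
```

Note the order convention: in γ ∗ β the right-hand factor β is traversed first, which matches the convention used for paths of 1-simplices.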
Homotopy of Posets, Net-Cohomology and Superselection Sectors
Fig. 3. The path {b2 , b1 } is an approximation of the curve γ (dashed). The symbol δ stands for ∂. [Figure labels: δ1 b2 = δ0 b1 , δ0 b2 , δ1 b1 , b1 , b2 , γ(s), γ(s1 ), γ(s2 ), γ(s3 ).]
Definition 2.13. Let γ be a curve. A path p = {bn , . . . , b1 } is said to be a poset-approximation of γ (or simply an approximation) if there is a partition 0 = s0 < s1 < · · · < sn = 1 of the interval [0, 1] such that
$$\gamma([s_{i-1}, s_i]) \subseteq |b_i|, \qquad \gamma(s_{i-1}) \in \partial_1 b_i, \qquad \gamma(s_i) \in \partial_0 b_i,$$
for i = 1, . . . , n (Fig. 3). By App(γ), we denote the set of approximations of γ. Since P is a basis for the topology of X , we have that App(γ) ≠ ∅ for any curve γ. It can be easily checked that the approximations of curves enjoy the following properties:
$$p \in \mathrm{App}(\gamma) \iff \bar p \in \mathrm{App}(\bar\gamma), \qquad p \in \mathrm{App}(\sigma),\ q \in \mathrm{App}(\beta) \ \Rightarrow\ p * q \in \mathrm{App}(\sigma * \beta), \tag{2.11}$$
where β(1) = σ(0), ∂0 q = ∂1 p.
Definition 2.14. Given p, q ∈ App(γ), we say that q is finer than p whenever p = {bn , . . . , b1 } and q = qn ∗ · · · ∗ q1 , where the qi are paths satisfying
$$|q_i| \subseteq |b_i|, \qquad \partial_0 q_i \subseteq \partial_0 b_i, \qquad \partial_1 q_i \subseteq \partial_1 b_i, \qquad i = 1, \dots, n.$$
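As a rough illustration of Definition 2.13, the following Python sketch tests whether a list of 1-simplices approximates a curve in the toy case X = R with P the set of open intervals. Everything here is an assumption made for the example: the classes `Interval` and `Simplex1`, the function `is_approximation`, and the storage of a path {bn , . . . , b1 } as the list [b1, ..., bn]. The inclusion γ([s_{i−1}, s_i]) ⊆ |b_i| is only checked on finitely many sample points, so a `True` result is a necessary check, not a proof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Open interval (lo, hi), an element of the basis P of X = R."""
    lo: float
    hi: float

    def __contains__(self, x: float) -> bool:
        return self.lo < x < self.hi


@dataclass(frozen=True)
class Simplex1:
    """A 1-simplex b with faces d0 = ∂0 b, d1 = ∂1 b and support |b|."""
    d0: Interval
    d1: Interval
    support: Interval


def is_approximation(gamma, path, partition, samples=100):
    """Check the three conditions of Definition 2.13 on sample points.

    path is [b1, ..., bn]; partition is [s0, ..., sn] with s0 = 0 and sn = 1.
    """
    if len(partition) != len(path) + 1:
        return False
    for i, b in enumerate(path, start=1):
        s_prev, s_i = partition[i - 1], partition[i]
        # Endpoint conditions: gamma(s_{i-1}) in ∂1 b_i and gamma(s_i) in ∂0 b_i.
        if gamma(s_prev) not in b.d1 or gamma(s_i) not in b.d0:
            return False
        # Sampled version of gamma([s_{i-1}, s_i]) ⊆ |b_i|.
        for k in range(samples + 1):
            s = s_prev + (s_i - s_prev) * k / samples
            if gamma(s) not in b.support:
                return False
    return True


gamma = lambda s: s
b1 = Simplex1(d0=Interval(0.4, 0.6), d1=Interval(-0.1, 0.1), support=Interval(-0.2, 0.7))
b2 = Simplex1(d0=Interval(0.9, 1.1), d1=Interval(0.4, 0.6), support=Interval(0.3, 1.2))
ok = is_approximation(gamma, [b1, b2], [0.0, 0.5, 1.0])
bad = is_approximation(gamma, [b1, b2], [0.0, 0.8, 1.0])
```

With the partition 0 < 1/2 < 1 the check succeeds, while the partition 0 < 0.8 < 1 fails because γ(0.8) does not lie in ∂0 b1.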
We will write p ≺ q to denote that q is a finer approximation than p (Fig. 4). Note that ≺ is an order relation in App(γ). Since P is a basis for the topology of X , (App(γ), ≺) is directed: that is, for any pair p, q ∈ App(γ) there exists p1 ∈ App(γ) such that p, q ≺ p1 . As already said, we can find an approximation for any curve γ. The converse, namely that for a given path p there is a curve γ such that p is an approximation of γ, holds if the elements of P are arcwise connected sets of the topological space X .
Fig. 4. The path {b2 , b1 } (dashed) is an approximation of the curve γ. The path {b′4 , b′3 , b′2 , b′1 } (bold) is an approximation of γ finer than {b2 , b1 }.
Concerning the relation between connectedness for posets and connectedness for topological spaces, in [16] it has been shown that: if the elements of P are arcwise connected sets of $\mathcal{X}$, then an open set $X \subseteq \mathcal{X}$ is arcwise connected in $\mathcal{X}$ if, and only if, the poset $P_X$ defined as
$$P_X \equiv \{O \in P \mid O \subseteq X\} \tag{2.12}$$
is pathwise connected. Note that the set $P_X$ is a sieve of P, namely a subfamily S of P such that, if O ∈ S and O1 ⊆ O, then O1 ∈ S. Now, assume that P′ is a sieve of P. Then, P′ is pathwise connected in P if, and only if, the open set $X_{P'}$ defined as
$$X_{P'} \equiv \cup\{O \subseteq \mathcal{X} \mid O \in P'\} \tag{2.13}$$
is arcwise connected in X . We now turn to analyze simple connectedness.
Lemma 2.15. Let p, q ∈ P(a0 , a1 ) be two approximations of γ. Then, p and q are homotopic paths.
Proof. It is enough to prove the statement in the case where p ≺ q. So, let p = {bn , . . . , b1 } and let q = {qn , . . . , q1 } be such that the paths qi satisfy
$$|q_i| \subseteq |b_i|, \qquad \partial_0 q_i \subseteq \partial_0 b_i, \qquad \partial_1 q_i \subseteq \partial_1 b_i, \qquad i = 1, \dots, n.$$
Note that for any i the poset formed by O ∈ P with O ⊆ |bi | is directed. As |qi | ⊆ |bi | for any i, by Proposition 2.5, we have that
$$b_1 \sim \tilde b_1 * q_1, \qquad b_i \sim \tilde b_i * q_i * \overline{\tilde b_{i-1}} \ \ \text{for } i = 2, \dots, n-1, \qquad b_n \sim q_n * \overline{\tilde b_{n-1}},$$
where $\tilde b_i$ is a 1-simplex such that
$$\partial_1 \tilde b_i = \partial_0 q_i, \qquad \partial_0 \tilde b_i = \partial_0 b_i, \qquad |\tilde b_i| = \partial_0 b_i$$
for i = 1, . . . , n − 1. Hence,
$$p = b_n * b_{n-1} * \cdots * b_2 * b_1 \sim q_n * \overline{\tilde b_{n-1}} * \tilde b_{n-1} * q_{n-1} * \cdots * \tilde b_2 * q_2 * \overline{\tilde b_1} * \tilde b_1 * q_1 \sim q_n * b(\partial_0 q_{n-1}) * q_{n-1} * \cdots * b(\partial_0 q_2) * q_2 * b(\partial_0 q_1) * q_1 \sim q,$$
completing the proof.
Lemma 2.16. Assume that the elements of P are arcwise and simply connected subsets of X . Let γ, β be two curves with the same endpoints. If there exists a path p such that p ∈ App(γ) ∩ App(β), then γ ∼ β.
Proof. Since the path p = {bn , . . . , b1 } is an approximation both of γ and β, there are two partitions 0 = s0 < s1 < · · · < sn = 1 and 0 = t0 < t1 < · · · < tn = 1, such that γ([si−1 , si ]) ⊆ |bi |,
γ(si−1 ) ∈ ∂1 bi ,
γ(si ) ∈ ∂0 bi ,
β([ti−1 , ti ]) ⊆ |bi |,
β(ti−1 ) ∈ ∂1 bi ,
β(ti ) ∈ ∂0 bi ,
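for i = 1, . . . , n. Let us define
$$\gamma_i(s) \equiv \gamma\big(s\cdot(s_i - s_{i-1}) + s_{i-1}\big), \quad s \in [0, 1], \qquad \beta_i(t) \equiv \beta\big(t\cdot(t_i - t_{i-1}) + t_{i-1}\big), \quad t \in [0, 1],$$
for i = 1, . . . , n. Note that
$$\gamma \sim \gamma_n * \cdots * \gamma_1, \qquad \beta \sim \beta_n * \cdots * \beta_1.$$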
Since γi (1), βi (1) ∈ ∂0 bi and ∂0 bi is an arcwise connected subset of X , we can find a curve σi such that
$$\sigma_i([0, 1]) \subseteq \partial_0 b_i, \qquad \sigma_i(0) = \gamma_i(1), \qquad \sigma_i(1) = \beta_i(1), \qquad i = 1, \dots, n-1.$$
Let
$$\tau_1 \equiv \sigma_1 * \gamma_1, \qquad \tau_i \equiv \sigma_i * \gamma_i * \bar\sigma_{i-1} \ \ (2 \le i \le n-1), \qquad \tau_n \equiv \gamma_n * \bar\sigma_{n-1}.$$
For i = 1, . . . , n, the curve τi is contained in |bi | and has the same endpoints as βi ; thus τi is homotopic to βi because |bi | is simply connected. Therefore,
$$\gamma \sim \gamma_n * \cdots * \gamma_2 * \gamma_1 \sim \gamma_n * \bar\sigma_{n-1} * \sigma_{n-1} * \cdots * \bar\sigma_2 * \sigma_2 * \gamma_2 * \bar\sigma_1 * \sigma_1 * \gamma_1 = \tau_n * \cdots * \tau_2 * \tau_1 \sim \beta_n * \cdots * \beta_2 * \beta_1 \sim \beta,$$
completing the proof.
Lemma 2.17. Assume that the elements of P are arcwise and simply connected subsets of X . Let p, q ∈ P(a0 , a1 ) be approximations of curves γ and β, respectively, with the same endpoints. Then p and q are homotopic if, and only if, γ and β are homotopic.
Proof. (⇒) It is enough to prove the assertion in the case where q is an elementary ampliation of p. So let p = {bn , . . . , b1 } and let q be an ampliation of p of the form {bn , . . . , bi+1 , ∂0 c, ∂2 c, bi−1 , . . . , b1 } where c ∈ A(bi ). Let s1 , t1 , s2 , t2 ∈ [0, 1] be such that γ(s1 ), β(t1 ) ∈ ∂1 bi and γ(s2 ), β(t2 ) ∈ ∂0 bi . We can decompose γ ∼ γ3 ∗ γ2 ∗ γ1 and β ∼ β3 ∗ β2 ∗ β1 , where
$$\gamma_1(s) \equiv \gamma(s\cdot s_1), \qquad \beta_1(t) \equiv \beta(t\cdot t_1),$$
$$\gamma_2(s) \equiv \gamma\big(s\cdot(s_2 - s_1) + s_1\big), \qquad \beta_2(t) \equiv \beta\big(t\cdot(t_2 - t_1) + t_1\big),$$
$$\gamma_3(s) \equiv \gamma\big(s\cdot(1 - s_2) + s_2\big), \qquad \beta_3(t) \equiv \beta\big(t\cdot(1 - t_2) + t_2\big),$$
for s, t ∈ [0, 1]. In general, γi and βi might not have the same endpoints. So let σ1 , σ2 be two curves such that σ1 (0) = γ(s1 ), σ1 (1) = β(t1 ) and σ1 ([0, 1]) ⊆ ∂1 bi , and σ2 (0) = γ(s2 ), σ2 (1) = β(t2 ) and σ2 ([0, 1]) ⊆ ∂0 bi . We now set
$$\tau_1 = \sigma_1 * \gamma_1, \qquad \tau_2 = \sigma_2 * \gamma_2 * \bar\sigma_1, \qquad \tau_3 = \gamma_3 * \bar\sigma_2.$$
Observe that γ ∼ τ3 ∗ τ2 ∗ τ1 , and that for i = 1, 2, 3, the curve τi has the same endpoints of βi . Furthermore, by construction we have {bi−1 , . . . , b1 } ∈ App(τ1 ) ∩ App(β1 ),
{bn , . . . , bi+1 } ∈ App(τ3 ) ∩ App(β3 ),
and that τ2 and σ2 are contained in the support of c. So, by Lemma 2.16, τ1 ∼ β1 and τ3 ∼ β3 . Moreover, τ2 ∼ β2 because the support of c is simply connected. Hence γ ∼ β, completing the proof.
(⇐) Let h : [0, 1] × [0, 1] → X be a homotopy such that the curves γt (s) ≡ h(t, s) satisfy γ0 (s) = γ(s), γ1 (s) = β(s) and γt (0) = x0 , γt (1) = x1 for any t ∈ [0, 1]. For any t ∈ [0, 1], let pt ∈ P(a0 , a1 ) be an approximation of γt such that p0 = p and p1 = q. Now, let us define St ≡ {l ∈ [0, 1] | pt is an approximation of γl }. St is nonempty: in fact, t ∈ St because pt is an approximation of γt . Moreover, St is open. To see this, assume pt = {bn , . . . , b1 }. By definition of approximation, there is a partition 0 = s0 < s1 < · · · < sn = 1 of [0, 1] such that
$$\gamma_t([s_i, s_{i+1}]) \subseteq |b_{i+1}|, \qquad \gamma_t(s_i) \in \partial_1 b_{i+1}, \qquad \gamma_t(s_{i+1}) \in \partial_0 b_{i+1},$$
for i = 0, . . . , n − 1. By continuity of h, we can find εi > 0 such that
$$l \in (t - \varepsilon_i, t + \varepsilon_i) \ \Rightarrow\ \gamma_l([s_i, s_{i+1}]) \subseteq |b_{i+1}|, \quad \gamma_l(s_i) \in \partial_1 b_{i+1}, \quad \gamma_l(s_{i+1}) \in \partial_0 b_{i+1},$$
for any i = 0, . . . , n − 1. So, if we define ε ≡ min{εi | i ∈ {0, . . . , n − 1}}, we obtain that pt is an approximation of γl for any l ∈ (t − ε, t + ε), hence St is open in the relative topology of [0, 1]. Now, for any t ∈ [0, 1], let It ⊆ St be an open interval containing t. Note that for any l ∈ It , pt is an approximation of γl . By compactness, we can find a finite open covering It0 , It1 , . . . , Itn of [0, 1], where 0 = t0 < t1 < · · · < tn = 1. We also have Iti ∩ Iti+1 ≠ ∅ for any i = 0, . . . , n − 1. This entails that for any i = 0, . . . , n − 1, there is li such that ti ≤ li ≤ ti+1 and that pti , pti+1 are approximations of γli . By Lemma 2.15, we have that pti and pti+1 are homotopic, completing the proof.
Theorem 2.18. Let X be a Hausdorff, arcwise connected topological space, and let P be a basis for the topology of X whose elements are arcwise and simply connected subsets of X . Then, π1 (X ) ≅ π1 (P).
Proof. Fix a base 0-simplex a0 and a base point x0 ∈ a0 . Define π1 (X , x0 ) ∋ [γ] → [p] ∈ π1 (P, a0 ), where p is an approximation of γ. By (2.11) and Lemma 2.17, this map is a group isomorphism.
Corollary 2.19. Let X and P be as in the previous theorem. If X is nonsimply connected, then P is not directed under inclusion.
Proof. If X is not simply connected, then by the previous theorem P is not simply connected. By Proposition 2.5, P is not directed.
2.5.2. Net-cohomology
Let X be an arcwise connected, Hausdorff topological space. Let O(X ) be the set of open subsets of X ordered under inclusion. Assume that O(X ) is equipped with a causal disjointness relation ⊥.
Definition 2.20. We say that P ⊆ O(X ) is a good index set associated with (X , ⊥) if P is a basis for the topology of X whose elements are nonempty, arcwise and simply connected subsets of X with a nonempty causal complement. We denote by I(X , ⊥) the collection of good index sets associated with (X , ⊥).
Some observations are in order. Firstly, note that I(X , ⊥) can be empty. However, this does not happen in the applications we have in mind. Secondly, we have used the term “good index set” because it is reasonable to assume that any index set of nets of local algebras over (X , ⊥) has to belong to I(X , ⊥). This is to avoid the “artificial” introduction of topological obstructions because, by Theorem 2.18, π1 (P) ≅ π1 (X ) for any P ∈ I(X , ⊥). Given P ∈ I(X , ⊥), let us consider an irreducible net of local algebras AP defined on a Hilbert space Ho . The first aim is to give an answer to the question, posed at the beginning of this paper, about the existence of topological obstructions to the triviality in B(Ho ) of 1-cocycles. To this end, note that if X is simply connected, then, by Theorem 2.18, π1 (P) = C · 𝟙. Hence, as a trivial consequence of Theorem 2.8, we have the following:
Corollary 2.21. If X is simply connected, any 1-cocycle is trivial in B(Ho ), namely Z 1 (AP ) = Zt1 (AP ).
On the grounds of this result, we can affirm that there might exist only one topological obstruction to the triviality in B(Ho ) of 1-cocycles: the nonsimple connectedness of X . “Might” because we are not able to provide here an example of a 1-cocycle which is not trivial in B(Ho ). The next aim is to show that net-cohomology is stable under a suitable change of the index set. Let us start by observing that the notion of a locally relatively connected refinement of a poset, Definition 2.9, induces an order relation on I(X , ⊥). Given P1 , P2 ∈ I(X , ⊥), define
P1 ⪯ P2 ⇔ P1 is a locally relatively connected refinement of P2 . (2.14)
One can easily check that ⪯ is an order relation on I(X , ⊥).
Lemma 2.22. The following assertions hold. (i) Given P ∈ I(X , ⊥), let P1 be a subfamily of P. If P1 is a basis for the topology of X , then P1 ∈ I(X , ⊥) and P1 ⪯ P. (ii) (I(X , ⊥), ⪯) is a directed poset with a maximum Pmax .
Proof. (i) Follows from Definition 2.9 and from Lemma 2.10. (ii) Define Pmax ≡ {O ⊆ X | O ∈ P for some P ∈ I(X , ⊥)}. It is clear that Pmax ∈ I(X , ⊥). By (i), we have that P ⪯ Pmax for any P ∈ I(X , ⊥). Hence, Pmax is the maximum.
As an easy consequence of Theorem 2.12, we have the following:
Theorem 2.23. Let APmax be an irreducible net, defined on a Hilbert space Ho , and indexed by Pmax . For any pair P1 , P2 ∈ I(X , ⊥), the categories Zt1 (APmax |P1 ), Zt1 (APmax ) and Zt1 (APmax |P2 ) are equivalent.
Remark 2.24. Some observations on this theorem are in order. (i) Theorem 2.23 says that, once a net of local algebras APmax is given, the category Zt1 (APmax ) is an invariant of I(X , ⊥). (ii) Once an irreducible net AP indexed by an element P ∈ I(X , ⊥) is given, a net indexed by Pmax is assigned to it. In fact, P is a basis for the topology of X , therefore by defining A(O) ≡ (∪{A(O1 ) | O1 ∈ P, O1 ⊆ O})′′ for any O ∈ Pmax , we obtain an irreducible net APmax such that APmax |P = AP . (iii) Concerning the applications to the theory of superselection sectors, we can assume, without loss of generality, that the theory is independent of the choice of the index set.
3. Good Index Sets for a Globally Hyperbolic Space-Time
In the papers [16, 27], the index set used to study superselection sectors in a globally hyperbolic space-time M is the set K⋄ of regular diamonds. On the one hand, this is a good choice because K⋄ ∈ I(M, ⊥). On the other hand, regular diamonds need not have pathwise connected causal complements, and several problems are connected to this fact (see the Introduction). A way to overcome these problems is provided by Theorem 2.23: it is enough to replace K⋄ with another good index set whose elements have pathwise connected causal complements. The net-cohomology is unaffected by this change and the mentioned problems are overcome. In this section, we show that such a good index set exists: it is the set K of diamonds of M. The net-cohomology of K will provide us with important information for the theory of superselection sectors. We want to stress that throughout both this section and Sec. 4, by a globally hyperbolic space-time we will mean a globally hyperbolic space-time with dimension ≥ 3.
3.1. Preliminaries on space-time geometry
We recall some basics on the causal structure of space-times and establish our notation. Standard references for this topic are [24, 34, 13]. A space-time M consists of a Hausdorff, paracompact, smooth, oriented manifold M, with dimension ≥ 3, endowed with a smooth metric g with signature (−, +, +, . . . , +), and with a time-orientation, that is, a smooth timelike vector field v (throughout this paper smooth means C ∞ ). A curve γ in M is a continuous, piecewise smooth, regular function γ : I → M, where I is a connected subset of R with nonempty interior. It is called timelike, lightlike, spacelike if respectively g(γ̇, γ̇) < 0, = 0, > 0 all along γ, where γ̇ = dγ/dt. Assume now that γ is causal, i.e., a nonspacelike curve; we can classify it according to the time-orientation v as future-directed (f-d) or past-directed (p-d) if respectively g(γ̇, v) < 0, > 0 all along γ. When γ is f-d and lim_{t→sup I} γ(t) exists (lim_{t→inf I} γ(t) exists), then it is said to have a future (past) endpoint. Otherwise, it is said to be future (past) endless; γ is said to be endless if neither of them exists. Analogous definitions are assumed for p-d causal curves. The chronological future I+ (S), the causal future J+ (S) and the future domain of dependence D+ (S) of a subset S ⊂ M are defined as:
I+ (S) ≡ {x ∈ M | there is a f-d timelike curve from S to x};
J+ (S) ≡ S ∪ {x ∈ M | there is a f-d causal curve from S to x};
D+ (S) ≡ {x ∈ M | any p-d endless causal curve through x meets S}.
These definitions have a dual in which “future” is replaced by “past” and the + by −. So, we define I(S) ≡ I+ (S) ∪ I− (S), J(S) ≡ J+ (S) ∪ J− (S) and D(S) ≡ D+ (S) ∪ D− (S). We recall that: 1. I+ (S) is an open set; 2. I+ (cl(S)) = I+ (S); 3. cl(J+ (S)) = cl(I+ (S)) and int(J+ (S)) = I+ (S) (see footnote a). Furthermore, by 2. and 3., we have that 4. cl(J+ (S)) = cl(J+ (cl(S))).
A subset S of M is achronal (acausal) if for any pair x1 , x2 ∈ S, we have x1 ∉ I(x2 ) (x1 ∉ J(x2 )). Two subsets S1 , S2 ⊆ M are said to be causally disjoint, written S1 ⊥ S2 , whenever
$$S_1 \perp S_2 \iff S_1 \subseteq M \setminus J(S_2). \tag{3.1}$$
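For orientation, relation (3.1) can be tested directly in the simplest example, flat Minkowski space-time, where a point x lies in J(y) exactly when x − y is not spacelike. The sketch below makes that assumption explicitly, restricts to finite sets of points, and uses hypothetical names (`interval`, `in_J`, `causally_disjoint`); it does not apply to a general globally hyperbolic space-time.

```python
def interval(x, y):
    """g(x - y, x - y) in Minkowski space-time with signature (-, +, +, +).

    Points are tuples (t, x1, x2, x3).
    """
    dt = x[0] - y[0]
    return -dt * dt + sum((a - b) ** 2 for a, b in zip(x[1:], y[1:]))


def in_J(x, y):
    """True if x lies in J(y), i.e. x - y is timelike, lightlike or zero (flat case only)."""
    return interval(x, y) <= 0


def causally_disjoint(S1, S2):
    """S1 ⊥ S2 in the sense of (3.1), for finite sets of points."""
    return all(not in_J(x, y) for x in S1 for y in S2)


origin = (0.0, 0.0, 0.0, 0.0)
spacelike_pt = (0.0, 1.0, 0.0, 0.0)
timelike_pt = (2.0, 1.0, 0.0, 0.0)
lightlike_pt = (1.0, 1.0, 0.0, 0.0)
```

In this flat setting the relation is visibly symmetric, and a lightlike separation already rules out causal disjointness because J is built from causal (not only timelike) curves.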
A (acausal) Cauchy surface is an achronal (acausal) set C verifying D(C) = M. Any Cauchy surface is a closed, arcwise connected, Lipschitz hypersurface of M. Furthermore, all Cauchy surfaces are homeomorphic. A spacelike Cauchy surface is a smooth Cauchy surface whose tangent space is everywhere spacelike. It turns out that any spacelike Cauchy surface is acausal. A space-time M satisfies the strong causality condition if the following property is verified for any point x of M: any open neighborhood U of x contains an open neighborhood V of x such that for any pair x1 , x2 ∈ V the set J+ (x1 ) ∩ J− (x2 ) is either empty or contained in V . The space-time is said to be globally hyperbolic if it satisfies the strong causality condition and if for any pair x1 , x2 ∈ M, the set J+ (x1 ) ∩ J− (x2 ) is either empty or compact. It turns out that M is globally hyperbolic if, and only if, it admits a Cauchy surface. We recall that if M is a globally hyperbolic space-time, for any relatively compact set K we have: 5. J+ (cl(K)) is closed; 6. D+ (cl(K)) is compact; by the properties 4. and 5. we have that 7. J+ (cl(K)) = cl(J+ (K)). Although a globally hyperbolic space-time M can be continuously (smoothly) foliated by (spacelike) Cauchy surfaces [9], for our purposes it is enough that for any Cauchy surface C, the space-time M admits a foliation “based” on C, that is, there exists a 3-dimensional manifold Σ and a homeomorphism F : R × Σ → M such that
Σt ≡ F (t, Σ) are topological hypersurfaces of M, Σ0 = C,
but, in general, for t ≠ 0, the surface Σt need not be a Cauchy surface [8].
a cl(S) and int(S) denote respectively the closure and the interior of the set S.
Lemma 3.1. Let M be a globally hyperbolic space-time. Then, π1 (M) is isomorphic to π1 (C) for any Cauchy surface C of M. Any curve γ : [0, 1] → M whose endpoints lie in C is homotopic to a curve lying in C\{x} for any x ∈ M with x ≠ γ(0), γ(1).
Proof. Let F be the foliation of M based on C as described above. Let (τ (x), y(x)) ≡ F −1 (x) for x ∈ M. Note that
$$h(t, x) \equiv F\big((1 - t)\cdot\tau(x),\ y(x)\big), \qquad t \in [0, 1],\ x \in M,$$
is a deformation retract. Hence, π1 (M) is isomorphic to π1 (C). Let h1 (t, s) ≡ h(t, γ(s)). Then, the curve γ(s) = h1 (0, s) is homotopic to the curve β(s) ≡ h1 (1, s) lying in C. Let x ∈ M with x ≠ β(1), β(0). It is clear that, as C is a 3-dimensional surface, β is homotopic in C to a curve σ lying in C\{x}.
Now, note that the relation ⊥, defined by (3.1), is a causal disjointness relation on the poset O(M) formed by the open sets of M ordered under inclusion.
Lemma 3.2. Let P ∈ I(M, ⊥) (see Sec. 2.5.2). If M has compact Cauchy surfaces, then P is not directed under inclusion.
Proof. Let O1 , . . . , On be a finite covering of a Cauchy surface C of M. If P were directed, then we could find an element O ∈ P with O1 ∪ · · · ∪ On ⊆ O. Then, C ⊆ O. But, by definition of causal disjointness relation, there exists O0 ∈ P with O ⊥ O0 . This leads to a contradiction because O0 ⊂ M\J(O) ⊂ M\J(C) = ∅ (see Definition 2.20).
3.2. The set of diamonds
Consider a globally hyperbolic space-time M. We have already observed that the set of regular diamonds K⋄ of M is an element of the set of indices I(M, ⊥) associated
with (M, ⊥), where ⊥ is the relation defined by (3.1). We now introduce the set of diamonds K of M. We prove that K is a locally relatively connected refinement of the set K⋄ of regular diamonds, and that diamonds have pathwise connected causal complements. The last part of this section is devoted to the study of the causal punctures of K induced by points of the space-time.
Definition 3.3. Given a spacelike Cauchy surface C, we denote by G(C) the collection of the open subsets G of C of the form φ(B), where (U, φ) is a chart of C and B is an open ball of R3 with cl(B) ⊂ φ−1 (U ). We call a diamond (see footnote b) of M a subset O of the form D(G) where G ∈ G(C) for some spacelike Cauchy surface C: G is called the base of O, while O is said to be based on C. We denote by K the collection of diamonds of M.
Proposition 3.4. K is a basis for the topology of M. Any diamond O is a relatively compact, arcwise and simply connected, open subset of M. K ∈ I(M, ⊥) and it is a locally relatively connected refinement of the set K⋄ of regular diamonds.
Proof. K is a basis for the topology of M because M is foliated by spacelike Cauchy surfaces. Observe that any G ∈ G(C) is an arcwise and simply connected, spacelike hypersurface of M. This entails that (see [24, Sec. 14, Lemma 43]) D(G) is an open subset of M. Furthermore, D(G), considered as a space-time, is globally hyperbolic and G is a spacelike Cauchy surface of D(G). Since G is simply connected, by Lemma 3.1, D(G) is simply connected. Moreover, note that G is relatively compact in C. As C is closed in M, G is relatively compact in M. By 6., D(G) is relatively compact in M. Finally, K ⊂ K⋄ (see the definition of K⋄ in [16]). As K is a basis for the topology of M, K is a locally relatively connected refinement of K⋄ and K ∈ I(M, ⊥) (see Sec. 2.5.2).
The next aim is to show that the causal complement O⊥ of a diamond, which is defined as O⊥ = {O1 ∈ K | O1 ⊥ O} (see Sec. 2.1), is pathwise connected in K.
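To make Definition 3.3 concrete, consider the special case of flat Minkowski space-time with the Cauchy surface C = {t = 0} and base G an open ball of radius r centred at c. Under these assumptions (which are not part of the text) the diamond D(G) is the open double cone {(t, x) : |x − c| + |t| < r}, and two diamonds based on the same surface are causally disjoint exactly when their base balls are disjoint. The class `Diamond` and the function `diamonds_causally_disjoint` below are hypothetical names used only for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Diamond:
    """Diamond D(G) in Minkowski space-time based on the surface t = 0.

    G is the open ball of the given radius around center (a 3-tuple).
    """
    center: tuple
    radius: float

    def contains(self, p):
        """p = (t, x1, x2, x3) lies in D(G) iff |x - c| + |t| < r."""
        t, x = p[0], tuple(p[1:])
        return math.dist(x, self.center) + abs(t) < self.radius


def diamonds_causally_disjoint(a, b):
    """O_a ⊥ O_b for diamonds based on t = 0: the open base balls do not meet."""
    return math.dist(a.center, b.center) >= a.radius + b.radius


O = Diamond(center=(0.0, 0.0, 0.0), radius=1.0)
far = Diamond(center=(3.0, 0.0, 0.0), radius=1.0)
near = Diamond(center=(1.5, 0.0, 0.0), radius=1.0)
```

The same picture suggests why the causal complement of a diamond is connected in dimension ≥ 3: the complement of a closed ball in the 3-dimensional surface t = 0 is arcwise connected.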
To this end, by (2.13), it is enough to prove that MO⊥ = ∪{O1 ∈ K | O1 ⊥ O} is arcwise connected in M because O⊥ is a sieve of K.
Lemma 3.5. The following assertions hold. (i) MO⊥ = M\cl(J(O)) = M\J(cl(O)) for any O ∈ K. (ii) If O = D(G) for G ∈ G(C), then MO⊥ = D(C\cl(G)).
b The author is grateful to Gerardo Morsella for a fruitful discussion on the definition of a diamond of M.
Proof. (i) By 7. we have that M\J(cl(O)) = M\cl(J(O)), because O is relatively compact. If O1 ⊥ O, then O1 ⊂ int(M\J(O)). This entails that
MO⊥ ⊆ M\J(cl(O)). As M\J(cl(O)) is an open set and K is a basis for the topology of M, for any x ∈ M\J(cl(O)) we can find O1 ∈ K such that x ∈ O1 , O1 ⊆ M\J(cl(O)). Thus O1 ⊥ O, and x ∈ MO⊥ , completing the proof. (ii) Since cl(G) is compact in M, by (i) we have MO⊥ = M\cl(J(D(G))) = M\cl(J(G)) = M\J(cl(G)), where the identity J(D(G)) = J(G) has been used. Therefore, as D(C\cl(G)) ⊆ M\J(cl(G)), the inclusion D(C\cl(G)) ⊆ MO⊥ is verified. If x ∈ MO⊥ = M\J(cl(G)), then any p-d endless causal curve through x meets the Cauchy surface C in C \ cl(G). Therefore, x ∈ D(C\cl(G)) and MO⊥ ⊆ D(C\cl(G)), which completes the proof.
Proposition 3.6. The causal complement O⊥ of a diamond O is pathwise connected in K.
Proof. Let O ∈ K be of the form D(G), where G ∈ G(C) and G = φ(B) with (U, φ) a chart of C. By definition of G(C), there is an open ball B1 such that cl(B) ⊂ B1 , cl(B1 ) ⊂ φ−1 (U ). As φ(B1 )\cl(φ(B)) is arcwise connected in C, C\cl(G) is arcwise connected in C. Now, by the previous lemma MO⊥ = D(C\cl(G)), which is a globally hyperbolic set with an arcwise connected Cauchy surface C\cl(G). Therefore, MO⊥ is arcwise connected, hence O⊥ is pathwise connected in K.
As claimed at the beginning of Sec. 3, we have established that K is a locally relatively connected refinement of the set K⋄ of regular diamonds, and that any element of K has a pathwise connected causal complement. From now on, we will focus on K, because this will be the index set that we will use to study superselection sectors.
Lemma 3.7. Let O ∈ K and let U be an open neighborhood of cl(O). There exist O1 , O2 ∈ K such that cl(O) ⊂ O1 , cl(O1 ) ⊂ U, and cl(O2 ) ⊂ U, O2 ⊥ O1 .
Proof. Assume that O = D(G) with G = φ(B), where (φ, W ) is a chart of a spacelike Cauchy surface C, and B is a ball of R3 such that cl(B) ⊆ φ−1 (W ). As cl(O) ⊂ U , cl(B) is contained in the open set φ−1 (W ∩ U ). Therefore, there exists a ball B1 such that cl(B) ⊂ B1 and cl(B1 ) ⊂ φ−1 (W ∩ U ).
Moreover, the latter inclusion entails that there is a ball B2 such that cl(B2 ) ⊂ φ−1 (W ∩ U ) and cl(B2 ) ∩ cl(B1 ) = ∅. Therefore, the diamonds O1 ≡ D(φ(B1 )), O2 ≡ D(φ(B2 )) verify the property written in the statement. As a trivial consequence of this lemma we have that if U is an open neighborhood of cl(O), then there exist O1 , O3 ∈ K such that cl(O) ⊂ O1 , cl(O1 ) ⊂ U and cl(O3 ) ⊂ O1 , and O3 ⊥ O.
3.2.1. Causal punctures
The causal puncture of K induced by a point x ∈ M is the poset Kx defined as the collection
Kx ≡ {O ∈ K | cl(O) ⊥ x}, (3.2)
ordered under inclusion, where cl(O) ⊥ x means that cl(O) ⊆ M\J(x). The causal puncture Kx is a sieve of K, hence, some properties of Kx can be deduced by studying its topological realization Mx ≡ ∪{O ∈ K | O ∈ Kx }. Lemma 3.8. Given x ∈ M, then Mx = M\J(x) = D(C\{x}) for some spacelike Cauchy surface C that meets x. Proof. The set M\J(x) is open. If y ∈ M\J(x), it follows by the definition of K, that there is O ∈ K with y ∈ O and cl(O) ⊆ M\J(x), namely y ∈ Mx . Therefore, M\J(x) ⊆ Mx . The opposite inclusion is obvious, completing the proof of the first identity. As M can be foliated by spacelike Cauchy surfaces, there is a spacelike Cauchy surface C that meets x. Now, the proof proceed as in Lemma 3.5(ii). Considered as a space-time, Mx is globally hyperbolic [29]. An element O ∈ Kx does not need to be a diamond of the space-time Mx . However, Kx is a basis for the topology of Mx . Furthermore, as Mx is arcwise connected, Kx is pathwise connected. Now, for any O ∈ Kx , we define O⊥ |Kx ≡ {O1 ∈ Kx | O1 ⊥ O},
(3.3)
namely, the causal complement of O in Kx .
Lemma 3.9. O⊥ |Kx is pathwise connected in Kx for any O ∈ Kx .
Proof. Note that O⊥ |Kx is a sieve, hence it is enough to prove that ∪{O1 ∈ Kx | O ⊥ O1 } is arcwise connected in M. Assume that O = D(G) where G ∈ G(C) for a spacelike Cauchy surface C of M. By Lemma 3.5(ii), MO⊥ = D(C\cl(G)). As D(C\cl(G)) is a globally hyperbolic space-time, there is a spacelike Cauchy surface C1 that meets x. By [33, Lemma 6] C1 ∪ cl(G) is an acausal Cauchy surface of M that meets x. Hence, by [29, Proposition 3.1], C2 ≡ (C1 \{x}) ∪ cl(G) is an acausal Cauchy surface of Mx . In other words, O is a set of the form D(G) with G ⊂ C2 , where C2 is an acausal Cauchy surface of Mx . Now, as in Proposition 3.6, C2 \cl(G) is arcwise connected in Mx . Furthermore, ∪{O1 ∈ O⊥ |Kx } = D(C2 \cl(G)). This is an arcwise connected set in Mx , therefore O⊥ |Kx is pathwise connected in Kx .
For any O ∈ K with x ∈ O, let us define
Kx |O ≡ {O1 ∈ Kx | O1 ⊆ O}. (3.4)
Note that Kx |O is a sieve of K.
Lemma 3.10. Let O ∈ K with x ∈ O. Then, Kx |O is pathwise connected. Proof. O is a globally hyperbolic space-time, therefore there is a spacelike Cauchy surface C of O that meets x. Our aim is to show that D(C\{x}) = ∪{O1 ∈ Kx |O }. If this holds, since Kx |O is sieve and D(C\{x}) is arcwise connected, by (2.13), Kx |O is pathwise connected. We obtain the proof of this equality in two steps. Firstly, by Lemma 3.8, we have that ∪{O1 ∈ Kx |O } is contained in the open set (M\J(x))∩O. For any x1 ∈ (M\J(x))∩O, let O1 ∈ K with cl(O1 ) ⊆ (M\J(x)) ∩ O. This entails that, x1 ∈ ∪{O1 ∈ Kx |O }. Therefore, ∪{O1 ∈ Kx |O } = (M\J(x)) ∩ O.
(∗)
Secondly, note that D(C\{x}) ⊆ (M\J(x)) ∩ O, because C\{x} ⊆ (M\J(x)) ∩ O. Let x2 ∈ (M\J(x)) ∩ O. Then, any f-d endless causal curve through x2 meets C in C\{x}, therefore x2 ∈ D(C\{x}) and D(C\{x}) = (M\J(x)) ∩ O. This and (∗) entail that ∪{O1 ∈ Kx |O } = D(C\{x}), completing the proof.
As the last issue of this section, consider the set Kx × Kx and endow it with the order relation defined as (O1 , O2 ) ≤ (O3 , O4 ) ⇔ O1 ⊆ O3 and O2 ⊆ O4 . The graph
Kx⊥ ≡ {(O1 , O2 ) ∈ Kx × Kx | O1 ⊥ O2 } (3.5)
of the relation ⊥ in Kx is pathwise connected in Kx × Kx . First note that Kx × Kx is a basis for the topology of Mx × Mx , and that Kx⊥ is a sieve in Kx × Kx . Now, the set ∪{(O1 , O2 ) ∈ Kx⊥ } is equal to Mx⊥ ≡ {(x1 , x2 ) ∈ Mx × Mx | x1 ⊥ x2 }. As observed, Mx is globally hyperbolic. By [16, Lemma 2.2], Mx⊥ is an arcwise connected set of Mx × Mx . By (2.13), Kx⊥ is pathwise connected in Kx × Kx .
3.3. Net-cohomology
Before studying the net-cohomology of K, it is worth showing how the topological properties of the space-time stated in Lemma 3.1 are codified in the poset structure of K.
Lemma 3.11. The following properties hold. (i) π1 (K) ≅ π1 (M) ≅ π1 (C) for any Cauchy surface C of M. (ii) Consider a path p ∈ P(a0 ) where a0 ∈ Σ0 (K) is, as a diamond, based on a spacelike Cauchy surface C0 . Let x ∈ C0 be such that cl(a0 ) ∩ x = ∅. Then, p is homotopic to a path q = {bn , . . . , b1 } ∈ P(a0 ) such that |bi |, as a diamond, is based on C0 and |bi | ∩ x = ∅ for any i.
Proof. (i) Follows from Theorem 2.18 and from Lemma 3.1. (ii) As observed in Sec. 2.5, since the elements of K are arcwise connected sets of M, there exists a curve γ : [0, 1] → M, with γ(0) = γ(1) ∈ a0 ∩ C0 , and such that p ∈ App(γ). By Lemma 3.1, γ is homotopic to a closed curve β lying in C0 \{x}. This allows us to find a path q ∈ App(β) such that the elements q, as a diamond, are based on C0 . Lemma 2.17 completes the proof. Let AK be an irreducible net of local algebras defined on a Hilbert space Ho . Let Z 1 (AK ) be the set of 1-cocycles of K with values on AK and let us denote by Zt1 (AK ) those elements of Z 1 (AK ) which are trivial in B(Ho ). As a trivial application of Corollary 2.21, we have that if M is simply connected, then Z 1 (AK ) = Zt1 (AK ). This result answers the question posed at the beginning of this paper, saying that the compactness of the Cauchy surfaces of the space-time is not a topological obstruction to the triviality in B(Ho ) of 1-cocycles. As already observed, the only possible obstruction in this sense is the nonsimply connectedness of the space-time. The next proposition will turn out to be fundamental for the theory of superselection sectors because it provides a way to prove triviality in B(Ho ) of 1-cocycles on an arbitrary globally hyperbolic space-time. Proposition 3.12. Assume that z ∈ Z 1 (AK ) is path-independent on Kx for any point x ∈ M. Then z is path-independent on K, therefore z ∈ Zt1 (AK ). Proof. Let p ∈ P(a0 ) and let C0 be the Cauchy surface where a0 is based. Let us take x ∈ C0 such that cl(a0 ) ∩ x = ∅. By Lemma 3.11(ii), p is homotopic to a path q ∈ P(a0 ) whose elements are based on C0 \{x}. This means that q ∈ Kx . z(p) = z(q) = 11 because p and q are homotopic and because z is path-independent on Kx for any x ∈ M. 4. Superselection Sectors We begin the study of the superselection sectors of a net of local observables on an arbitrary globally hyperbolic space-time M, with dimension ≥3. 
We start by describing the setting in which we study superselection sectors. Afterwards we explain the strategy we will follow, which consists in deducing the global properties of superselection sectors from the local ones. We refer the reader to the Appendix for all the categorical notions used in this section. Let K be the set of diamonds of M. We consider an irreducible net AK : K ∋ O → A(O) ⊆ B(Ho ) of local algebras defined on a fixed infinite dimensional separable Hilbert space Ho . We assume that AK satisfies the following two properties.
• Punctured Haag duality, which means that
$$A(O_1) = \cap\{A(O)' \mid O \in K_x,\ O \perp O_1\}, \qquad O_1 \in K_x, \tag{4.1}$$
for any x ∈ M, where Kx is the causal puncture of K induced by x (3.2).
• The Borchers property, which means that given O ∈ K, there is O1 ∈ K with O1 ⊂ O such that for any orthogonal projection E ∈ A(O1 ), E ≠ 0, there exists an isometry V ∈ A(O) such that V · V ∗ = E.
Let Zt1 (AK ) be the C∗ -category of 1-cocycles of K, trivial in B(Ho ), with values in AK . Then, the superselection sectors are the equivalence classes [z] of the irreducible elements z of Zt1 (AK ). From now on, our aim will be to prove that Zt1 (AK ) is a tensor C∗ -category with a symmetry and left-inverses, and that any object with finite statistics has conjugates. Note that, by the Borchers property, Zt1 (AK ) is closed under direct sums and subobjects. We now discuss the differences between our setting and that used in [16, 27]. Firstly, we have used the set of diamonds K, instead of the set of regular diamonds K⋄ , as an index set of the net of local algebras. Secondly, we assume punctured Haag duality while in the cited papers, the authors assumed Haag duality, that is
$$A(O_1) = \cap\{A(O)' \mid O \in K,\ O \perp O_1\}, \tag{4.2}$$
for any O1 ∈ K. Punctured Haag duality was introduced in [27]. Both the existence of models satisfying punctured Haag duality and the relation of this property to other properties of AK have been shown in [29]. It turns out that punctured Haag duality entails Haag duality and that AK is locally definite, namely
$$\mathbb{C}\cdot 1 = \cap\{A(O) \mid O \in K,\ x \in O\} \tag{4.3}$$
for any x ∈ M. The reason why we assume punctured Haag duality will become clear in the next section.
Remark 4.1. It is worth observing that in [29], punctured Haag duality has been shown for the net of local algebras FK⋄ , indexed by the set K⋄ of regular diamonds, and associated with the free Klein–Gordon field in the representation induced by quasi-free Hadamard states. One might wonder if this property holds also for the net of fields FK⋄ |K obtained by restricting FK⋄ to K. The answer is yes, because the net FK⋄ is additive (see footnote c). As observed in Sec. 3, K ∈ I(M, ⊥) and K ⪯ K⋄ .
Then, it can be easily checked that punctured Haag duality for $F_{K^\diamond}$, the net indexed by the regular diamonds $K^\diamond$, entails punctured Haag duality for its restriction $F_{K^\diamond} \upharpoonright K$.

Footnote c: The net $F_{K^\diamond}$ is additive if, given $O \in K^\diamond$ and a covering $\cup_i O_i = O$, then $F(O) = (\cup_i \mathcal{A}(O_i))''$.

4.1. Presheaves and the strategy for studying superselection sectors

The way we study superselection sectors resembles a standard argument of differential geometry. To prove the existence of global objects, for instance the affine connection in a Riemannian manifold, one first shows that these objects exist locally; afterwards one checks that these local constructions can be glued together
to form an object defined over all the manifold. Here, the role of the manifold is played by the category $Z^1_t(\mathcal{A}_K)$ and the objects that we want to construct are a tensor product, a symmetry and a conjugation. To see which categories play the role of “charts” of $Z^1_t(\mathcal{A}_K)$, some preliminary notions are necessary.

The C∗-presheaf associated with $\mathcal{A}_K$ is the correspondence $K \ni O \to \mathcal{A}(O^\perp)$ which associates the C∗-algebra $\mathcal{A}(O^\perp)$ with any $O \in K$, where $\mathcal{A}(O^\perp)$ is the algebra associated with the causal complement of $O$ (see Sec. 2.1). The stalk in a point $x$ is the C∗-algebra
\[ \mathcal{A}^\perp(x) \equiv \overline{\cup \{ \mathcal{A}(O^\perp) \mid x \in O \}}^{\,\|\cdot\|}. \tag{4.4} \]
Note that $\mathcal{A}^\perp(x)$ is also equal to the C∗-algebra generated by the algebras $\mathcal{A}(O)$ for $O \in K_x$. The correspondence $\mathcal{A}_{K_x} : K_x \ni O \to \mathcal{A}(O) \subseteq \mathcal{A}^\perp(x)$ is a net of local algebras over the poset $K_x$. By local definiteness and punctured Haag duality, it can be easily verified that the net $\mathcal{A}_{K_x}$ is irreducible and satisfies Haag duality. Furthermore, $\mathcal{A}_{K_x}$ inherits the Borchers property from $\mathcal{A}_K$. Now, let $Z^1_t(\mathcal{A}_{K_x})$ be the C∗-category of the 1-cocycles of $K_x$, trivial in $B(\mathcal{H}_o)$, with values in $\mathcal{A}_{K_x}$. Observe that the category $Z^1_t(\mathcal{A}_K)$ is connected to $Z^1_t(\mathcal{A}_{K_x})$ by a covariant functor defined as
\[
\begin{aligned}
Z^1_t(\mathcal{A}_K) \ni z &\to z \upharpoonright \Sigma_1(K_x) \in Z^1_t(\mathcal{A}_{K_x}), \\
(z, z_1) \ni t &\to t \upharpoonright \Sigma_0(K_x) \in \big(z \upharpoonright \Sigma_1(K_x),\ z_1 \upharpoonright \Sigma_1(K_x)\big).
\end{aligned}
\tag{4.5}
\]
This is a faithful functor that we call the restriction functor to $K_x$. Then, the categories $Z^1_t(\mathcal{A}_{K_x})$ play the role of “charts” of $Z^1_t(\mathcal{A}_K)$. In the following, we first prove the existence of a tensor product, a symmetry, left inverses and conjugates in $Z^1_t(\mathcal{A}_{K_x})$. Afterwards we will prove that all these constructions can be glued, leading to corresponding notions on $Z^1_t(\mathcal{A}_K)$.

We now explain the reasons why we choose the categories $Z^1_t(\mathcal{A}_{K_x})$ as “charts” of $Z^1_t(\mathcal{A}_K)$. Firstly, studying $Z^1_t(\mathcal{A}_{K_x})$ is very similar to studying superselection sectors in Minkowski space [26]. As observed, the net $\mathcal{A}_{K_x}$ is irreducible and verifies the Borchers property and Haag duality. Furthermore, the point $x$ plays for $K_x$ the same role that spatial infinity plays for the set of double cones in Minkowski space. In fact, $K_x$ admits an asymptotically causally disjoint sequence of diamonds “converging” to $x$ (see Sec. 4.2.2). Secondly, by Proposition 3.12 the mentioned gluing procedure, that we now explain, works well.

A collection $\{z_x\}_{x \in M}$ of 1-cocycles $z_x \in Z^1_t(\mathcal{A}_{K_x})$ is said to be extendible to $Z^1_t(\mathcal{A}_K)$ if there exists a 1-cocycle $z$ of $Z^1_t(\mathcal{A}_K)$ such that $z \upharpoonright \Sigma_1(K_x) = z_x$ for any $x \in M$. It is clear that if there exists an extension, then it is unique.

Proposition 4.2. The collection $\{z_x\}_{x \in M}$, where $z_x \in Z^1_t(\mathcal{A}_{K_x})$, is extendible to $Z^1_t(\mathcal{A}_K)$ if, and only if, for any $b \in \Sigma_1(K)$ the relation
\[ z_{x_1}(b) = z_{x_2}(b) \tag{4.6} \]
is verified for any pair $x_1, x_2 \in M$ with $|b| \in K_{x_1} \cap K_{x_2}$.
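The uniqueness remark made just before Proposition 4.2 can be spelled out in one line. The sketch below assumes, as the proof of the proposition itself does, that every 1-simplex $b$ of $K$ has $|b| \in K_x$ for at least one $x \in M$ and is then a 1-simplex of $K_x$; the symbol $z'$ for a second extension is introduced here only for illustration.

```latex
% Assumption: for every b in \Sigma_1(K) there is x in M with |b| \in K_x,
% so that b also belongs to \Sigma_1(K_x).
% Let z and z' be two extensions of the collection \{z_x\}_{x \in M}. Then
\[
  z(b) = \big(z \upharpoonright \Sigma_1(K_x)\big)(b) = z_x(b)
       = \big(z' \upharpoonright \Sigma_1(K_x)\big)(b) = z'(b),
  \qquad b \in \Sigma_1(K),
\]
% hence z = z'.
```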
Proof. The implication (⇒) is trivial. (⇐) For any $b \in \Sigma_1(K)$, we define
\[ z(b) \equiv z_x(b) \qquad \text{for some } x \in M \text{ with } |b| \in K_x. \]
The definition does not depend on the chosen point $x$. Clearly $z(b) \in \mathcal{A}(|b|)$ because $z_x(b) \in \mathcal{A}(|b|)$. Furthermore, given $c \in \Sigma_2(K)$, let $x \in M$ with $|c| \in K_x$. Then
\[ z(\partial_0 c) \cdot z(\partial_2 c) = z_x(\partial_0 c) \cdot z_x(\partial_2 c) = z_x(\partial_1 c) = z(\partial_1 c), \]
showing that $z$ verifies the 1-cocycle identity. What remains to be shown is that $z$ is trivial in $B(\mathcal{H}_o)$. It is at this point that Proposition 3.12 intervenes in the proof. In fact, for any $x \in M$, $z$ is path-independent on $K_x$ because $z \upharpoonright \Sigma_1(K_x) = z_x$ and $z_x$ is path-independent on $K_x$. Then, the proof follows from Proposition 3.12.

An analogous notion of extendibility can be given for arrows. Consider $z, z_1 \in Z^1_t(\mathcal{A}_K)$. A collection $\{t_x\}_{x \in M}$, where $t_x \in (z \upharpoonright \Sigma_1(K_x), z_1 \upharpoonright \Sigma_1(K_x))$ in $Z^1_t(\mathcal{A}_{K_x})$, is said to be extendible to $Z^1_t(\mathcal{A}_K)$ if there exists an arrow $t \in (z, z_1)$ such that $t \upharpoonright \Sigma_0(K_x) = t_x$ for any $x \in M$. Also in this case, if the extension $t$ exists, then it is unique.

Proposition 4.3. Let $z, z_1 \in Z^1_t(\mathcal{A}_K)$. The collection $\{t_x\}_{x \in M}$, where $t_x \in (z \upharpoonright \Sigma_1(K_x), z_1 \upharpoonright \Sigma_1(K_x))$, is extendible to $Z^1_t(\mathcal{A}_K)$ if, and only if, for any $a \in \Sigma_0(K)$ the relation
\[ (t_{x_1})_a = (t_{x_2})_a \tag{4.7} \]
is verified for any pair of points $x_1, x_2$ with $a \in K_{x_1} \cap K_{x_2}$.

Proof. Also in this case, the implication (⇒) is trivial. (⇐) For any $a \in \Sigma_0(K)$, define
\[ t_a \equiv (t_x)_a \qquad \text{for some } x \in M \text{ with } a \in K_x. \]
The definition does not depend on the chosen point x. Clearly ta ∈ A(a). Given b ∈ Σ1 (K), let us take x ∈ M with |b| ∈ Kx . Then, t∂0 b · z(b) = (tx )∂0 b · z(b) = z1 (b) · (tx )∂1 b = z1 (b) · t∂1 b , completing the proof. In the following, we will refer to (4.6) and (4.7) as gluing conditions. 4.2. Local theory We begin the study of superselection structure of the category Zt1 (AKx ). Our first aim is to show that to 1-cocycles of Zt1 (AKx ) there correspond endomorphisms of the algebra A⊥ (x) which are localized and transportable, in the sense of DHR analysis. This is a key result for the local theory because it will allow us to introduce in a very easy way the tensor product on Zt1 (AKx ) and all the rest will proceed likewise to [26]. The usual procedure used to define endomorphisms associated with 1-cocycles [26, 16, 27], does not work in this case. This procedure leads to an endomorphism
of the net $\mathcal{A}_{K_x}$, but it is not clear whether this is extendible to an endomorphism of $\mathcal{A}^\perp(x)$: since $K_x$ might not be directed, $\mathcal{A}^\perp(x)$ might not be the C∗-inductive limit of $\mathcal{A}_{K_x}$. This problem can be overcome by applying, in a suitable way, a different procedure which makes use of the underlying presheaf structure [28].

Given $z \in Z^1_t(\mathcal{A}_{K_x})$, fix $a \in \Sigma_0(K_x)$. For any diamond $O \in K$ with $x \in O$, define
\[ y^z_O(a)(A) \equiv z(p) \cdot A \cdot z(p)^*, \qquad A \in \mathcal{A}(O^\perp), \tag{4.8} \]
where $p$ is a path in $K_x$ such that $\partial_1 p \subset O$ and $\partial_0 p = a$. This definition depends neither on the chosen path nor on the choice of the starting point $\partial_1 p$, as the following lemma shows.

Lemma 4.4. Let $z \in Z^1_t(\mathcal{A}_{K_x})$ and let $O \in K$ with $x \in O$. Let $p, q$ be two paths in $K_x$ with $\partial_0 p = \partial_0 q$ and $\partial_1 p, \partial_1 q \subseteq O$. Then, $z(p) \cdot A \cdot z(p)^* = z(q) \cdot A \cdot z(q)^*$ for any $A \in \mathcal{A}(O^\perp)$.

Proof. Note that
\[ z(p) \cdot A \cdot z(p)^* = z(q) \cdot z(\bar{q} * p) \cdot A \cdot z(\bar{q} * p)^* \cdot z(q)^* \]
for any $A \in \mathcal{A}(O^\perp)$. Here $\bar{q} * p$ is a path in $K_x$ whose endpoints are contained in $O$. This means that the endpoints of $\bar{q} * p$ belong to $K_x|_O$, see (3.4). As $K_x|_O$ is pathwise connected (Lemma 3.10), we can find a path $q_1$ in $K_x|_O$ with the same endpoints as $\bar{q} * p$. By path-independence, we have that $z(\bar{q} * p) = z(q_1)$. But $z(q_1) \in \mathcal{A}(O)$ because the support $|q_1|$ is contained in $O$. Therefore, $z(\bar{q} * p) \cdot A = A \cdot z(\bar{q} * p)$ for any $A \in \mathcal{A}(O^\perp)$, completing the proof.

Therefore, if we take $O_1 \in K$ with $x \in O_1 \subseteq O$, then $y^z_{O_1}(a) \upharpoonright \mathcal{A}(O^\perp) = y^z_O(a)$. This means that the collection
\[ y^z(a) \equiv \{ y^z_O(a) \mid O \in K,\ x \in O \} \tag{4.9} \]
is a morphism of the presheaf $\{O \in K,\ x \in O\} \ni O \to \mathcal{A}(O^\perp)$. It then follows that $y^z(a)$ is extendible to an endomorphism of $\mathcal{A}^\perp(x)$ (see the definition (4.4) of $\mathcal{A}^\perp(x)$).

Lemma 4.5. The following properties hold:
(i) $y^z(a) : \mathcal{A}^\perp(x) \to \mathcal{A}^\perp(x)$ is a unital endomorphism,
(ii) $y^z(a) \upharpoonright \mathcal{A}(a_1) = \mathrm{id}_{\mathcal{A}(a_1)}$ for any $a_1 \in \Sigma_0(K_x)$ with $a_1 \perp a$,
(iii) if $p$ is a path, then $z(p) \cdot y^z(\partial_1 p)(A) = y^z(\partial_0 p)(A) \cdot z(p)$ for $A \in \mathcal{A}^\perp(x)$,
(iv) if $t \in (z, z_1)$, then $t_a \cdot y^z(a)(A) = y^{z_1}(a)(A) \cdot t_a$ for $A \in \mathcal{A}^\perp(x)$,
(v) $y^z(a)(\mathcal{A}(a_1)) \subseteq \mathcal{A}(a_1)$ for any $a_1 \in K_x$ with $a \subseteq a_1$.

Proof. (i) is obvious from the definition (4.8). (ii) Let $O \in K$ with $x \in O$ and $O \perp a_1$. Given $A \in \mathcal{A}(a_1)$, it follows from the definition of $y^z(a)$ that $y^z(a)(A) = y^z_O(a)(A) = z(p) \cdot A \cdot z(p)^*$, where $p$ is a path of $K_x$ with $\partial_1 p \subset O$, $\partial_0 p = a$. Hence, $\partial_1 p, \partial_0 p \perp a_1$. As the causal complement of $a_1$ is pathwise connected in $K_x$ (Lemma 3.9), the proof follows by (2.3).
(iii) and (iv) follow by routine calculations. (v) follows by (ii) because $\mathcal{A}_{K_x}$ fulfils Haag duality.

Note that $\{ y^z(a) \mid a \in \Sigma_0(K_x) \}$ is a collection of endomorphisms of the algebra $\mathcal{A}^\perp(x)$ which are localized and transportable in the same sense as in the DHR analysis: Lemma 4.5(ii) says that $y^z(a)$ is localized in $a$; Lemma 4.5(iii) says that $y^z(a)$ is transportable to any $a_1 \in \Sigma_0(K_x)$.
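The transportability statement can be made explicit. The following short derivation is a sketch that assumes the values $z(p)$ of a 1-cocycle are unitary, as the form of definition (4.8) suggests; the symbol $\mathrm{Ad}$ is shorthand introduced here and is not notation from the text.

```latex
% Lemma 4.5(iii): for a path p in K_x and A in \mathcal{A}^\perp(x),
\[
  z(p)\, y^z(\partial_1 p)(A) = y^z(\partial_0 p)(A)\, z(p).
\]
% Multiply on the right by z(p)^* and use z(p) z(p)^* = 1 (assumed unitarity):
\[
  y^z(\partial_0 p)(A) = z(p)\, y^z(\partial_1 p)(A)\, z(p)^*,
  \qquad\text{i.e.}\qquad
  y^z(\partial_0 p) = \mathrm{Ad}_{z(p)} \circ y^z(\partial_1 p).
\]
```

Read this way, for a path $p$ from $a$ to $a_1$ the endomorphism $y^z(a_1)$ is unitarily equivalent to $y^z(a)$, with $z(p)$ as intertwiner.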
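4.2.1. Tensor structure

The tensor product on $Z^1_t(\mathcal{A}_{K_x})$ is defined by means of the localized and transportable endomorphisms of $\mathcal{A}^\perp(x)$ associated with 1-cocycles. To this end, some preliminaries are necessary. Let
\[
\begin{aligned}
z(p) \times z_1(q) &\equiv z(p) \cdot y^z(\partial_1 p)(z_1(q)), & p, q \text{ paths in } K_x, \\
t_a \times s_{a_1} &\equiv t_a \cdot y^z(a)(s_{a_1}), & a, a_1 \in \Sigma_0(K_x),
\end{aligned}
\tag{4.10}
\]
for any $z, z_1, z_2, z_3 \in Z^1_t(\mathcal{A}_{K_x})$, $t \in (z, z_2)$ and $s \in (z_1, z_3)$. The tensor product in $Z^1_t(\mathcal{A}_{K_x})$, that we will define later, is a particular case of $\times$.

Lemma 4.6. Let $z, z_1, z_2, z_3 \in Z^1_t(\mathcal{A}_{K_x})$, and let $t \in (z, z_2)$, $s \in (z_1, z_3)$. The following relations hold:
(i) $(t_{\partial_0 p} \times s_{\partial_0 q}) \cdot z(p) \times z_1(q) = z_2(p) \times z_3(q) \cdot (t_{\partial_1 p} \times s_{\partial_1 q})$,
(ii) $z(p_2 * p_1) \times z_1(q_2 * q_1) = z(p_2) \times z_1(q_2) \cdot z(p_1) \times z_1(q_1)$,
for any $p, q, p_2 * p_1, q_2 * q_1$ paths in $K_x$.

Proof. (i) By using Lemma 4.5(iii) and Lemma 4.5(iv), we have
\begin{align*}
t_{\partial_0 p} \times s_{\partial_0 q} \cdot z(p) \times z_1(q)
&= t_{\partial_0 p} \cdot y^z(\partial_0 p)(s_{\partial_0 q}) \cdot z(p) \cdot y^z(\partial_1 p)(z_1(q)) \\
&= y^{z_2}(\partial_0 p)(s_{\partial_0 q}) \cdot z_2(p) \cdot t_{\partial_1 p} \cdot y^z(\partial_1 p)(z_1(q)) \\
&= z_2(p) \cdot y^{z_2}(\partial_1 p)(s_{\partial_0 q}) \cdot y^{z_2}(\partial_1 p)(z_1(q)) \cdot t_{\partial_1 p} \\
&= z_2(p) \cdot y^{z_2}(\partial_1 p)(z_3(q)) \cdot y^{z_2}(\partial_1 p)(s_{\partial_1 q}) \cdot t_{\partial_1 p} \\
&= z_2(p) \times z_3(q) \cdot t_{\partial_1 p} \cdot y^z(\partial_1 p)(s_{\partial_1 q}) \\
&= z_2(p) \times z_3(q) \cdot t_{\partial_1 p} \times s_{\partial_1 q}.
\end{align*}
(ii) By Lemma 4.5(iii), we have
\begin{align*}
z(p_2 * p_1) \times z_1(q_2 * q_1)
&= z(p_2 * p_1) \cdot y^z(\partial_1 p_1)(z_1(q_2 * q_1)) \\
&= z(p_2) \cdot z(p_1) \cdot y^z(\partial_1 p_1)(z_1(q_2)) \cdot y^z(\partial_1 p_1)(z_1(q_1)) \\
&= z(p_2) \cdot y^z(\partial_1 p_2)(z_1(q_2)) \cdot z(p_1) \cdot y^z(\partial_1 p_1)(z_1(q_1)) \\
&= z(p_2) \times z_1(q_2) \cdot z(p_1) \times z_1(q_1),
\end{align*}
where the equality $\partial_0 p_1 = \partial_1 p_2$ has been used.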
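The passage from the second to the third line in the proof of Lemma 4.6(ii) moves $z(p_1)$ across an endomorphism. A minimal sketch of that single step, assuming only Lemma 4.5(iii) and the fact that $z_1(q_2)$ lies in $\mathcal{A}^\perp(x)$ (it is a product of elements of algebras $\mathcal{A}(O)$ with $O \in K_x$):

```latex
\begin{align*}
z(p_1)\, y^z(\partial_1 p_1)\big(z_1(q_2)\big)
  &= y^z(\partial_0 p_1)\big(z_1(q_2)\big)\, z(p_1)
     && \text{(Lemma 4.5(iii) with } A = z_1(q_2)\text{)} \\
  &= y^z(\partial_1 p_2)\big(z_1(q_2)\big)\, z(p_1)
     && (\partial_0 p_1 = \partial_1 p_2,\ \text{since } p_2 * p_1 \text{ is a path}).
\end{align*}
```

The first equality of the second line in that proof, in turn, uses only that $y^z(\partial_1 p_1)$ is multiplicative and that $z_1(q_2 * q_1) = z_1(q_2) \cdot z_1(q_1)$.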
We are now ready to introduce the tensor product. Let us define
\[
\begin{aligned}
(z \otimes z_1)(b) &\equiv z(b) \times z_1(b), & b \in \Sigma_1(K_x), \\
(t \otimes s)_a &\equiv t_a \times s_a, & a \in \Sigma_0(K_x),
\end{aligned}
\tag{4.11}
\]
for any $z, z_1, z_2, z_3 \in Z^1_t(\mathcal{A}_{K_x})$, $t \in (z, z_1)$ and $s \in (z_2, z_3)$.

Proposition 4.7. $\otimes$ is a tensor product in $Z^1_t(\mathcal{A}_{K_x})$.

Proof. First, we prove that if $z, z_1 \in Z^1_t(\mathcal{A}_{K_x})$, then $z \otimes z_1 \in Z^1_t(\mathcal{A}_{K_x})$. By Lemma 4.5(v), we have that $(z \otimes z_1)(b) \in \mathcal{A}(|b|)$. Given $c \in \Sigma_2(K_x)$, by applying Lemma 4.6(ii) with respect to the path $\{\partial_0 c, \partial_2 c\}$ we have
\[ (z \otimes z_1)(\partial_0 c) \cdot (z \otimes z_1)(\partial_2 c) = \big(z(\partial_0 c) \cdot z(\partial_2 c)\big) \times \big(z_1(\partial_0 c) \cdot z_1(\partial_2 c)\big) = (z \otimes z_1)(\partial_1 c), \]
proving that $z \otimes z_1$ satisfies the 1-cocycle identity. By Lemma 4.6(ii), it follows that
\[ (z \otimes z_1)(b_n) \cdots (z \otimes z_1)(b_1) = z(p) \cdot y^z(\partial_1 p)(z_1(p)) \]
for any path $p = \{b_n, \ldots, b_1\}$. Therefore, as $z$ and $z_1$ are path-independent in $K_x$, $z \otimes z_1$ is path-independent in $K_x$. Namely, $z \otimes z_1 \in Z^1_t(\mathcal{A}_{K_x})$. If $t \in (z, z_2)$ and $s \in (z_1, z_3)$, then by Lemma 4.6(i), it follows that $t \otimes s \in (z \otimes z_1, z_2 \otimes z_3)$. The rest of the properties that $\otimes$ has to satisfy to be a tensor product in $Z^1_t(\mathcal{A}_{K_x})$ can be easily checked.

4.2.2. Symmetry and Statistics

The following lemma is fundamental for the existence of a symmetry.

Lemma 4.8. Let $p, q$ be a pair of paths in $K_x$ with $\partial_i p \perp \partial_i q$ for $i = 0, 1$. Then, $z(p) \times z_1(q) = z_1(q) \times z(p)$.

Proof. As $K_x^\perp$ is pathwise connected (see (3.5)), there are in $K_x$ two paths $p_1 = \{b_{j_n}, \ldots, b_{j_1}\}$ and $q_1 = \{b_{k_n}, \ldots, b_{k_1}\}$ such that $|b_{j_i}| \perp |b_{k_i}|$ for $i = 1, \ldots, n$, and $\partial_1 p_1 = \partial_1 p$, $\partial_0 p_1 = \partial_0 p$ and $\partial_1 q_1 = \partial_1 q$, $\partial_0 q_1 = \partial_0 q$. By path-independence, $z(p_1) = z(p)$ and $z_1(q_1) = z_1(q)$. By Lemma 4.6(ii), we have
\begin{align*}
z(p) \times z_1(q) &= z(p_1) \times z_1(q_1) \\
&= z(b_{j_n}) \times z_1(b_{k_n}) \cdots z(b_{j_1}) \times z_1(b_{k_1}) \\
&= z(b_{j_n}) \cdot y^z(\partial_1 b_{j_n})(z_1(b_{k_n})) \cdots z(b_{j_1}) \cdot y^z(\partial_1 b_{j_1})(z_1(b_{k_1})) \\
&= z(b_{j_n}) \cdot z_1(b_{k_n}) \cdots z(b_{j_1}) \cdot z_1(b_{k_1}) \\
&= z_1(b_{k_n}) \cdot z(b_{j_n}) \cdots z_1(b_{k_1}) \cdot z(b_{j_1}) \\
&= z_1(b_{k_n}) \cdot y^{z_1}(\partial_1 b_{k_n})(z(b_{j_n})) \cdots z_1(b_{k_1}) \cdot y^{z_1}(\partial_1 b_{k_1})(z(b_{j_1})) \\
&= z_1(q_1) \times z(p_1) = z_1(q) \times z(p),
\end{align*}
where the localization property of the endomorphisms $y^z(\partial_1 b_{j_i})$, $y^{z_1}(\partial_1 b_{k_i})$ has been used (Lemma 4.5(ii)).
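The two facts used in the middle of the chain in the proof of Lemma 4.8 can be isolated. This is a sketch under two assumptions that are not restated in this section: the faces of a 1-simplex are contained in its support, and the net is causal, so that algebras of causally disjoint regions commute.

```latex
% For each i: z_1(b_{k_i}) \in \mathcal{A}(|b_{k_i}|) and
% \partial_1 b_{j_i} \subseteq |b_{j_i}| \perp |b_{k_i}|, so Lemma 4.5(ii) gives
\[
  y^z(\partial_1 b_{j_i})\big(z_1(b_{k_i})\big) = z_1(b_{k_i}),
  \qquad
  y^{z_1}(\partial_1 b_{k_i})\big(z(b_{j_i})\big) = z(b_{j_i}).
\]
% Causality (assumed): z(b_{j_i}) \in \mathcal{A}(|b_{j_i}|) commutes with
% z_1(b_{k_i}) \in \mathcal{A}(|b_{k_i}|) because |b_{j_i}| \perp |b_{k_i}|:
\[
  z(b_{j_i})\, z_1(b_{k_i}) = z_1(b_{k_i})\, z(b_{j_i}).
\]
```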
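Theorem 4.9. There exists a symmetry $\varepsilon$ in $Z^1_t(\mathcal{A}_{K_x})$ defined as
\[ \varepsilon(z, z_1)_a = z_1(q)^* \times z(p)^* \cdot z(p) \times z_1(q), \qquad a \in \Sigma_0(K_x), \tag{4.12} \]
where $p, q$ are two paths with $\partial_0 p \perp \partial_0 q$ and $\partial_1 p = \partial_1 q = a$.

Proof. First we prove that the r.h.s. of (4.12) is independent of the choice of the paths $p, q$. So, let $p_1, q_1$ be two paths in $K_x$ such that $\partial_1 p_1 = \partial_1 q_1 = a$ and $\partial_0 p_1 \perp \partial_0 q_1$. Let $q_2 \equiv q * \bar{q}_1$ and $p_2 \equiv p * \bar{p}_1$. By Lemma 4.6(ii), we have
\begin{align*}
z_1(q)^* \times z(p)^* \cdot z(p) \times z_1(q)
&= z_1(q * \bar{q}_1 * q_1)^* \times z(p * \bar{p}_1 * p_1)^* \cdot z(p * \bar{p}_1 * p_1) \times z_1(q * \bar{q}_1 * q_1) \\
&= \big(z_1(q_1)^* \cdot z_1(q_2)^* \times z(p_1)^* \cdot z(p_2)^*\big) \cdot \big(z(p_2) \cdot z(p_1) \times z_1(q_2) \cdot z_1(q_1)\big) \\
&= z_1(q_1)^* \times z(p_1)^* \cdot z_1(q_2)^* \times z(p_2)^* \cdot z(p_2) \times z_1(q_2) \cdot z(p_1) \times z_1(q_1).
\end{align*}
Note that $\partial_i p_2 \perp \partial_i q_2$ for $i = 0, 1$. By Lemma 4.8, we have that $z(p_2) \times z_1(q_2) = z_1(q_2) \times z(p_2)$. Therefore,
\[ z_1(q)^* \times z(p)^* \cdot z(p) \times z_1(q) = z_1(q_1)^* \times z(p_1)^* \cdot z(p_1) \times z_1(q_1), \]
which proves our claim.

We now prove that $\varepsilon(z, z_1) \in (z \otimes z_1, z_1 \otimes z)$. Let $b \in \Sigma_1(K_x)$ and let $p, q$ be two paths with $\partial_1 p = \partial_1 q = \partial_0 b$ and $\partial_0 p \perp \partial_0 q$. By Lemma 4.6(ii), we have
\begin{align*}
\varepsilon(z, z_1)_{\partial_0 b} \cdot (z \otimes z_1)(b)
&= z_1(q)^* \times z(p)^* \cdot z(p) \times z_1(q) \cdot (z \otimes z_1)(b) \\
&= z_1(q)^* \times z(p)^* \cdot \big(z(p) \cdot z(b) \times z_1(q) \cdot z_1(b)\big) \\
&= (z_1 \otimes z)(b) \cdot \big(z_1(q_1)^* \times z(p_1)^*\big) \cdot \big(z(p_1) \times z_1(q_1)\big) \\
&= (z_1 \otimes z)(b) \cdot \varepsilon(z, z_1)_{\partial_1 b},
\end{align*}
where $p_1 = p * b$ and $q_1 = q * b$, and it is trivial to check that $p_1$ and $q_1$ satisfy the properties written in the statement.

Given $t \in (z, z_2)$, $s \in (z_1, z_3)$, and two paths $p, q$ with $\partial_1 p = \partial_1 q = a$, $\partial_0 p \perp \partial_0 q$, by Lemma 4.6, we have
\begin{align*}
\varepsilon(z_2, z_3)_a \cdot (t \otimes s)_a
&= z_3(q)^* \times z_2(p)^* \cdot z_2(p) \times z_3(q) \cdot (t \otimes s)_a \\
&= z_3(q)^* \times z_2(p)^* \cdot (t_{\partial_0 p} \times s_{\partial_0 q}) \cdot z(p) \times z_1(q) \\
&= z_3(q)^* \times z_2(p)^* \cdot (s_{\partial_0 q} \times t_{\partial_0 p}) \cdot z(p) \times z_1(q) \\
&= (s_a \times t_a) \cdot z_1(q)^* \times z(p)^* \cdot z(p) \times z_1(q) \\
&= (s \otimes t)_a \cdot \varepsilon(z, z_1)_a,
\end{align*}
where $t_{\partial_0 p} \times s_{\partial_0 q} = t_{\partial_0 p} \cdot y^z(\partial_0 p)(s_{\partial_0 q}) = t_{\partial_0 p} \cdot s_{\partial_0 q} = s_{\partial_0 q} \cdot t_{\partial_0 p} = s_{\partial_0 q} \times t_{\partial_0 p}$, because $\partial_0 p \perp \partial_0 q$. The rest of the properties can be easily checked.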
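One of the properties left to the reader in the proof of Theorem 4.9 is that $\varepsilon(z_1, z)$ inverts $\varepsilon(z, z_1)$. The following is a sketch of that check, assuming that the reverse $\bar{p}$ of a path satisfies $z(\bar{p}) = z(p)^*$ (a property of unitary 1-cocycles taken here from Sec. 2 as an assumption), so that $z(p)^* \times z_1(q)^*$ is read as $z(\bar{p}) \times z_1(\bar{q})$.

```latex
% Choose p, q as in (4.12). Exchanging the roles of the two paths gives
\[
  \varepsilon(z_1, z)_a = z(p)^* \times z_1(q)^* \cdot z_1(q) \times z(p).
\]
% Lemma 4.6(ii), applied to the closed paths q * \bar{q} and p * \bar{p}:
\[
  z_1(q) \times z(p) \cdot z_1(q)^* \times z(p)^*
  = z_1(q * \bar{q}) \times z(p * \bar{p})
  = \mathbb{1} \times \mathbb{1} = \mathbb{1}.
\]
% Hence, using Lemma 4.6(ii) once more for \bar{p} * p and \bar{q} * q,
\begin{align*}
  \varepsilon(z_1, z)_a \cdot \varepsilon(z, z_1)_a
  &= z(p)^* \times z_1(q)^* \cdot
     \big[\, z_1(q) \times z(p) \cdot z_1(q)^* \times z(p)^* \,\big]
     \cdot z(p) \times z_1(q) \\
  &= z(\bar{p}) \times z_1(\bar{q}) \cdot z(p) \times z_1(q)
   = z(\bar{p} * p) \times z_1(\bar{q} * q) = \mathbb{1}.
\end{align*}
```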
Now, in order to classify the statistics of the irreducible elements of $Z^1_t(\mathcal{A}_{K_x})$, we have to prove the existence of left inverses (see Appendix). To this end, consider a sequence $\{O_n\}_{n \in \mathbb{N}}$ of diamonds of $K$ such that
\[ x \in O_n \ \ \forall n \in \mathbb{N}, \qquad O_{n+1} \subset O_n, \qquad \cap_{n \in \mathbb{N}} O_n = \{x\}. \]
For any $n$, let us take $a_n \in \Sigma_0(K_x)$ such that $a_n \subset O_n$. We get in this way an asymptotically causally disjoint sequence $\{a_n\}_{n \in \mathbb{N}}$: for any $a \in \Sigma_0(K_x)$, there exists $k(a) \in \mathbb{N}$ such that for any $n \geq k(a)$, we have $a_n \perp a$. This is enough to prove the existence of left inverses. Following [16, 27], given $z \in Z^1_t(\mathcal{A}_{K_x})$ and $a \in \Sigma_0(K_x)$, let $p_n$ be a path from $a$ to $a_n$. Let
\[ \phi^z_a(A) \equiv \lim_n\, z(p_n) \cdot A \cdot z(p_n)^*, \qquad A \in \mathcal{A}^\perp(x), \tag{4.13} \]
be a Banach-limit over $n$. $\phi^z_a : \mathcal{A}^\perp(x) \to B(\mathcal{H}_o)$ is a positive linear map and it can be easily checked that
\[ \phi^z_a(A \cdot y^z(a)(B)) = \phi^z_a(A) \cdot B, \qquad A, B \in \mathcal{A}^\perp(x), \tag{4.14} \]
and that for any $b \in \Sigma_1(K_x)$, we have
\[ \phi^z_{\partial_0 b}(z(b) \cdot A \cdot z(b)^*) = \phi^z_{\partial_1 b}(A), \qquad A \in \mathcal{A}^\perp(x). \tag{4.15} \]

Proposition 4.10. Given $z \in Z^1_t(\mathcal{A}_{K_x})$ and $t \in (z \otimes z_1, z \otimes z_2)$, let
\[ \phi^z_{z_1, z_2}(t)_a \equiv \phi^z_a(t_a), \qquad a \in \Sigma_0(K_x), \tag{4.16} \]
where $\phi^z_a$ is defined by (4.13). Then, the collection $\phi^z \equiv \{ \phi^z_{z_1, z_2} \mid z_1, z_2 \in Z^1_t(\mathcal{A}_{K_x}) \}$ is a left inverse of $z$.

Proof. Following [27], by using (4.14) and (4.15), one can easily show that $\phi^z_{z_1, z_2}(t)_{\partial_0 b} \cdot z_1(b) = z_2(b) \cdot \phi^z_{z_1, z_2}(t)_{\partial_1 b}$ for $t \in (z \otimes z_1, z \otimes z_2)$. Let $O \in K_x$ with $O \perp a$. For any $B \in \mathcal{A}(O)$, we have that
\[ \phi^z_{z_1, z_2}(t)_a \cdot B = \phi^z_a(t_a) \cdot B = \phi^z_a(t_a \cdot y^z(a)(B)) = \phi^z_a(t_a \cdot B) = \phi^z_a(B \cdot t_a) = B \cdot \phi^z_a(t_a) = B \cdot \phi^z_{z_1, z_2}(t)_a. \]
Hence, $\phi^z_{z_1, z_2}(t)_a \in \mathcal{A}(a)$ because $\mathcal{A}_{K_x}$ satisfies Haag duality. This entails that $\phi^z_{z_1, z_2}(t) \in (z_1, z_2)$. The other properties of left inverses can be easily checked (see [27]).

An object of $Z^1_t(\mathcal{A}_{K_x})$ is said to have finite statistics if it admits a standard left inverse, namely a left inverse $\phi^z$ such that
\[ \phi^z_{z,z}(\varepsilon(z, z)) \cdot \phi^z_{z,z}(\varepsilon(z, z)) = c \cdot \mathbb{1} \qquad \text{with } c > 0. \]
In the opposite case, $z$ is said to have infinite statistics. The type of statistics is an invariant of the equivalence class of objects. Let $Z^1_t(\mathcal{A}_{K_x})_f$ be the full subcategory of $Z^1_t(\mathcal{A}_{K_x})$ whose objects have finite statistics. $Z^1_t(\mathcal{A}_{K_x})_f$ is closed under direct sums, subobjects and tensor products. Furthermore, any object of $Z^1_t(\mathcal{A}_{K_x})_f$ is a finite direct sum of irreducible objects. From now on we focus on $Z^1_t(\mathcal{A}_{K_x})_f$, because the finiteness of the statistics is a necessary condition for the existence of conjugates (see Appendix).
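The name “left inverse” for the maps of this subsection can be read off directly from (4.14). The short derivation below is a sketch assuming that the $z(p_n)$ are unitary and using the fact that a Banach limit of a constant sequence equals that constant.

```latex
% Normalisation of (4.13): the sequence z(p_n) 1 z(p_n)^* is constantly 1, so
\[
  \phi^z_a(\mathbb{1}) = \lim_n\, z(p_n)\, z(p_n)^* = \mathbb{1}.
\]
% Setting A = 1 in (4.14):
\[
  \phi^z_a\big(y^z(a)(B)\big) = \phi^z_a(\mathbb{1}) \cdot B = B,
  \qquad B \in \mathcal{A}^\perp(x),
\]
% that is,
\[
  \phi^z_a \circ y^z(a) = \mathrm{id}_{\mathcal{A}^\perp(x)}.
\]
```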
4.2.3. Conjugation

The proof of the existence of conjugates in $Z^1_t(\mathcal{A}_{K_x})_f$ is equivalent to proving that any simple object has conjugates (see Appendix). Recall that an object $z \in Z^1_t(\mathcal{A}_{K_x})_f$ is said to be simple whenever
\[ \varepsilon(z, z) = \chi(z) \cdot 1_{z \otimes z}, \qquad \text{where } \chi(z) \in \{1, -1\}. \]
Simplicity is a property of the equivalence class, and it turns out to be equivalent to the fact that $z^{\otimes n}$ is irreducible for any $n \in \mathbb{N}$, where $z^{\otimes n}$ is the $n$-fold tensor product of $z$.

Lemma 4.11. Let $z$ be a simple object. Then, $\chi(z) \cdot z(b) = y^z(\partial_0 b)(z(b)) = y^z(\partial_1 b)(z(b))$ for any 1-simplex $b$ with $\partial_1 b \perp \partial_0 b$.

Proof. Consider the 1-simplex $b(\partial_1 b)$ degenerate to $\partial_1 b$ and recall that any 1-cocycle evaluated on a degenerate 1-simplex is equal to $\mathbb{1}$ (Lemma 2.7(i)). By the defining relation (4.12) of $\varepsilon$, we have
\[ \varepsilon(z, z)_{\partial_1 b} = z(b(\partial_1 b))^* \times z(b)^* \cdot z(b) \times z(b(\partial_1 b)) = y^z(\partial_1 b)(z(b)^*) \cdot z(b). \]
Since $\chi(z) \cdot \mathbb{1} = \varepsilon(z, z)_{\partial_1 b}$, we have $\chi(z) \cdot z(b) = y^z(\partial_1 b)(z(b))$. The other identity follows by replacing, in this reasoning, $b$ by $\bar{b}$.

Proposition 4.12. Let $z$ be a simple object. Then, $y^z(a) : \mathcal{A}^\perp(x) \to \mathcal{A}^\perp(x)$ is an automorphism, for any $a \in \Sigma_0(K_x)$.

Proof. Let $O \in K$ with $x \in O$ and $O \perp a$. As the causal complement of $a$ in $K_x$ is pathwise connected (Lemma 3.9), there is a path $q$ of the form $b * p$, where $b$ is a 1-simplex such that $\partial_0 b = a$ and $\partial_1 b \perp a$, and $p$ is a path satisfying
\[ \partial_1 p \subset O, \qquad \partial_1 p \perp x, \qquad \partial_0 p = \partial_1 b, \qquad |p| \perp a. \]
Now, observe that by Lemma 4.5(ii), we have that $y^z(a)(z(p)) = z(p)$ and that $y^z(\partial_1 p)(A) = A$ for any $A \in \mathcal{A}(O^\perp)$. By using these relations and the previous lemma, for any $A \in \mathcal{A}(O^\perp)$ we have
\begin{align*}
y^z(a)(A) &= z(q) \cdot y^z(\partial_1 p)(A) \cdot z(q)^* \\
&= z(b) \cdot z(p) \cdot A \cdot z(p)^* \cdot z(b)^* \\
&= z(b) \cdot y^z(a)(z(p)) \cdot A \cdot y^z(a)(z(p)^*) \cdot z(b)^* \\
&= \chi(z) \cdot y^z(a)(z(b)) \cdot y^z(a)(z(p)) \cdot A \cdot y^z(a)(z(p)^*) \cdot \chi(z) \cdot y^z(a)(z(b)^*) \\
&= y^z(a)(z(b) \cdot z(p)) \cdot A \cdot y^z(a)((z(b) \cdot z(p))^*).
\end{align*}
That is, $y^z(a)\big(z(q)^* \cdot A \cdot z(q)\big) = A$ for any $A \in \mathcal{A}(O^\perp)$. This means that $\mathcal{A}^\perp(x) \subseteq y^z(a)(\mathcal{A}^\perp(x))$, which entails that $y^z(a)$ is an automorphism of $\mathcal{A}^\perp(x)$.

Assume that $z$ is a simple object of $Z^1_t(\mathcal{A}_{K_x})$. Let us denote by $y^{z-1}(a)$ the inverse of $y^z(a)$. Clearly, $y^{z-1}(a)$ is an automorphism of $\mathcal{A}^\perp(x)$ localized in $a$. Let
\[ \bar{z}(b) \equiv y^{z-1}(\partial_0 b)(z(b)^*), \qquad b \in \Sigma_1(K_x). \tag{4.17} \]
We claim that $\bar{z}$ is the conjugate object of $z$. The proof is achieved in two steps.
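Lemma 4.13. Let $z$ be a simple object. Then, $\bar{z}(p) = y^{z-1}(\partial_0 p)(z(p)^*) = y^{z-1}(\partial_1 p)(z(p)^*)$ for any path $p$ in $K_x$.

Proof. Within this proof, to save space, we will omit the superscript $z$ from $y^z(a)$ and $y^{z-1}(a)$. First we prove the relations written above in the case that $p$ is a 1-simplex $b$. For any $A \in \mathcal{A}^\perp(x)$, we have
\begin{align*}
\bar{z}(b) \cdot y^{-1}(\partial_1 b)(A)
&= y^{-1}(\partial_0 b)(z(b)^*) \cdot y^{-1}(\partial_1 b)(A) \\
&= y^{-1}(\partial_0 b)\big( z(b)^* \cdot y(\partial_0 b)\big( y^{-1}(\partial_1 b)(A) \big) \big) \\
&= y^{-1}(\partial_0 b)\big( y(\partial_1 b)\big( y^{-1}(\partial_1 b)(A) \big) \cdot z(b)^* \big) \\
&= y^{-1}(\partial_0 b)(A \cdot z(b)^*) = y^{-1}(\partial_0 b)(A) \cdot \bar{z}(b).
\end{align*}
Using this relation, we obtain
\[ \bar{z}(b) \cdot y^{-1}(\partial_1 b)(z(b)) = y^{-1}(\partial_0 b)(z(b)) \cdot \bar{z}(b) = y^{-1}(\partial_0 b)(z(b)) \cdot y^{-1}(\partial_0 b)(z(b)^*) = \mathbb{1}, \]
completing the first part of the proof. We now proceed by induction: let $p = \{b_n, \ldots, b_1\}$ and assume that the statement holds for the path $q = \{b_{n-1}, \ldots, b_1\}$. Then, since $\partial_1 b_n = \partial_0 q$,
\begin{align*}
\bar{z}(p) = \bar{z}(b_n) \cdots \bar{z}(b_1)
&= \bar{z}(b_n) \cdot y^{-1}(\partial_0 q)(z(q)^*) \\
&= \bar{z}(b_n) \cdot y^{-1}(\partial_1 b_n)(z(q)^*) \\
&= y^{-1}(\partial_0 b_n)(z(q)^*) \cdot \bar{z}(b_n) \\
&= y^{-1}(\partial_0 b_n)(z(q)^*) \cdot y^{-1}(\partial_0 b_n)(z(b_n)^*) \\
&= y^{-1}(\partial_0 p)(z(q)^* \cdot z(b_n)^*) = y^{-1}(\partial_0 p)(z(p)^*).
\end{align*}
The other relation is obtained in a similar way.

Lemma 4.14. Let $z$ be a simple object of $Z^1_t(\mathcal{A}_{K_x})$. Then, $\bar{z} \in Z^1_t(\mathcal{A}_{K_x})$ and it is a conjugate object of $z$.

Proof. By Lemma 4.5(v), we have that $\bar{z}(b) \in \mathcal{A}(|b|)$ for any $b \in \Sigma_1(K_x)$. Let $c \in \Sigma_2(K_x)$, then
\begin{align*}
\bar{z}(\partial_0 c) \cdot \bar{z}(\partial_2 c)
&= y^{z-1}(\partial_{00} c)(z(\partial_0 c)^*) \cdot y^{z-1}(\partial_{02} c)(z(\partial_2 c)^*) \\
&= y^{z-1}(\partial_{00} c)\big( z(\partial_0 c)^* \cdot y^z(\partial_{00} c)\big( y^{z-1}(\partial_{02} c)(z(\partial_2 c)^*) \big) \big) \\
&= y^{z-1}(\partial_{00} c)\big( y^z(\partial_{10} c)\big( y^{z-1}(\partial_{02} c)(z(\partial_2 c)^*) \big) \cdot z(\partial_0 c)^* \big) \\
&= y^{z-1}(\partial_{00} c)\big( z(\partial_2 c)^* \cdot z(\partial_0 c)^* \big) \\
&= y^{z-1}(\partial_{00} c)(z(\partial_1 c)^*) = y^{z-1}(\partial_{01} c)(z(\partial_1 c)^*) = \bar{z}(\partial_1 c),
\end{align*}
where the relations $\partial_{00} c = \partial_{01} c$ and $\partial_{10} c = \partial_{02} c$ have been used. Finally, by Lemma 4.13, $\bar{z}$ is path-independent in $K_x$ because $z$ is path-independent in $K_x$. Therefore, $\bar{z}$ is trivial in $B(\mathcal{H}_o)$, thus it is an object of $Z^1_t(\mathcal{A}_{K_x})$. Now, we have to prove that $\bar{z}$ is the conjugate object of $z$ (see the definition in the Appendix). We need a preliminary observation. Let $y^{\bar{z}}(a)$ be the endomorphisms of $\mathcal{A}^\perp(x)$ associated with $\bar{z}$. Then,
\[ y^{\bar{z}}(a) = y^{z-1}(a), \qquad \forall a \in \Sigma_0(K_x). \tag{4.18} \]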
In fact, let $O \in K$ with $x \in O$ and $O \perp a$. Let $p$ be a path in $K_x$ with $\partial_1 p \subset O$ and $\partial_0 p = a$. For any $A \in \mathcal{A}(O^\perp)$, we have
\[ y^{\bar{z}}(a)(A) = \bar{z}(p) \cdot A \cdot \bar{z}(p)^* = \bar{z}(p) \cdot y^{z-1}(\partial_1 p)(A) \cdot \bar{z}(p)^* = y^{z-1}(a)(A), \]
which proves (4.18). Now, by (4.18) and by Lemma 4.13, we have
\begin{align*}
(z \otimes \bar{z})(b) &= z(b) \cdot y^z(\partial_1 b)\big( y^{z-1}(\partial_1 b)(z(b)^*) \big) = z(b) \cdot z(b)^* = \mathbb{1}, \\
(\bar{z} \otimes z)(b) &= \bar{z}(b) \cdot y^{\bar{z}}(\partial_1 b)(z(b)) = y^{z-1}(\partial_1 b)(z(b)^*) \cdot y^{z-1}(\partial_1 b)(z(b)) = \mathbb{1}.
\end{align*}
So, if we take $r = \bar{r} = \mathbb{1}$, then $r$ and $\bar{r}$ satisfy the conjugate equations for $z$ and $\bar{z}$, completing the proof.

According to the discussion made at the beginning of this section, we have:

Theorem 4.15. Any object of $Z^1_t(\mathcal{A}_{K_x})_f$ has conjugates.

4.3. Global theory

We now turn back to the study of $Z^1_t(\mathcal{A}_K)$. The aim of this section is to show that all the constructions we have made in the categories $Z^1_t(\mathcal{A}_{K_x})$ can be glued together and extended to corresponding constructions on $Z^1_t(\mathcal{A}_K)$. Given $z \in Z^1_t(\mathcal{A}_K)$, let us denote by $y^z_x(a)$ the morphism of the algebra $\mathcal{A}^\perp(x)$ associated with the restriction $z \upharpoonright \Sigma_1(K_x) \in Z^1_t(\mathcal{A}_{K_x})$, see (4.9). For any $a \in \Sigma_0(K)$, we define
\[ y^z(a) \equiv \{ y^z_x(a) \mid x \in M, \text{ with } a \in K_x \}. \tag{4.19} \]
We call $y^z(a)$ a morphism of stalks because it is compatible with the presheaf structure, that is, given $O \in K$, for any pair of points $x, x_1 \in O$, we have $y^z_x(a) \upharpoonright \mathcal{A}(O^\perp) = y^z_{x_1}(a) \upharpoonright \mathcal{A}(O^\perp)$. This is an easy consequence of the following:

Lemma 4.16 (Gluing Lemma). Let $z \in Z^1_t(\mathcal{A}_K)$ and $O \in K$. Then,
\[ y^z_{x_1}(a) \upharpoonright \mathcal{A}(O) = y^z_{x_2}(a) \upharpoonright \mathcal{A}(O) \tag{4.20} \]
for any pair $x_1, x_2 \in M$ with $O \in K_{x_1} \cap K_{x_2}$. Let $p$ be a path in $K$ for which there exists a pair of points $x_1, x_2 \in M$ with $|p| \subset K_{x_1} \cap K_{x_2}$. Then, $y^z_{x_1}(a)(z(p)) = y^z_{x_2}(a)(z(p))$.

Proof. By (4.8) and (4.9), for $A \in \mathcal{A}(O)$, we have that $y^z_{x_i}(a)(A) = z(p_i) \cdot A \cdot z(p_i)^*$ for $i = 1, 2$, where $p_i$ is a path in $K_{x_i}$ such that $\partial_0 p_i = a$, $\partial_1 p_i = a_i$ and $a_i$ is contained in some diamond $O_i$ such that $x_i \in O_i$ and $O_i \perp O$ for $i = 1, 2$. Note that $\bar{p}_2 * p_1$ is a path from $a_1$ to $a_2$ and that $a_1, a_2 \perp O$. By (2.3) we have
\[ y^z_{x_1}(a)(A) = z(p_1) \cdot A \cdot z(p_1)^* = z(p_2) \cdot z(\bar{p}_2 * p_1) \cdot A \cdot z(\bar{p}_2 * p_1)^* \cdot z(p_2)^* = z(p_2) \cdot A \cdot z(p_2)^* = y^z_{x_2}(a)(A) \]
for any $A \in \mathcal{A}(O)$, which proves (4.20). Now, let $p$ and $x_1, x_2$ be as in the statement. By applying (4.20), we have
\[ y^z_{x_1}(a)(z(p)) = y^z_{x_1}(a)(z(b_n)) \cdots y^z_{x_1}(a)(z(b_1)) = y^z_{x_2}(a)(z(b_n)) \cdots y^z_{x_2}(a)(z(b_1)) = y^z_{x_2}(a)(z(p)), \]
completing the proof.

We have called this lemma the gluing lemma because it will allow us to extend to $Z^1_t(\mathcal{A}_K)$ the constructions that we have made on the categories $Z^1_t(\mathcal{A}_{K_x})$. As a first application of this fact, we define the tensor product on $Z^1_t(\mathcal{A}_K)$. For any $z, z_1 \in Z^1_t(\mathcal{A}_K)$ and for any pair $t, s$ of arrows of $Z^1_t(\mathcal{A}_K)$, we define
\[
\begin{aligned}
(z \otimes z_1)(b) &\equiv (z \otimes_x z_1)(b), & b \in \Sigma_1(K), \\
(t \otimes s)_a &\equiv (t \otimes_{x_1} s)_a, & a \in \Sigma_0(K),
\end{aligned}
\tag{4.21}
\]
for some $x \in M$ with $|b| \in K_x$ and for some $x_1 \in M$ with $a \in K_{x_1}$, where $\otimes_x$ is the tensor product in $Z^1_t(\mathcal{A}_{K_x})$.

Lemma 4.17. $\otimes$ is a tensor product on $Z^1_t(\mathcal{A}_K)$.

Proof. By the gluing lemma, we have
\[ (z \otimes_x z_1)(b) = z(b) \cdot y^z_x(\partial_1 b)(z_1(b)) = z(b) \cdot y^z_{x_1}(\partial_1 b)(z_1(b)) = (z \otimes_{x_1} z_1)(b) \]
for any pair of points $x, x_1$ with $|b| \in K_x \cap K_{x_1}$. Therefore, by Proposition 4.2, we have that $z \otimes z_1 \in Z^1_t(\mathcal{A}_K)$. Now, let $t \in (z, z_2)$ and let $s \in (z_1, z_3)$. Note that the gluing lemma entails that
\[ (t \otimes_x s)_a = t_a \cdot y^z_x(a)(s_a) = t_a \cdot y^z_{x_1}(a)(s_a) = (t \otimes_{x_1} s)_a \]
for any pair of points $x, x_1$ with $a \in K_x \cap K_{x_1}$. By Proposition 4.3, we have that $t \otimes s \in (z \otimes z_1, z_2 \otimes z_3)$. The other properties of the tensor product can be easily checked.

Remark 4.18. As an easy consequence of Lemma 4.17, we have that the tensor product $\hat{\otimes}$ introduced in [16] is well defined in $Z^1_t(\mathcal{A}_K)$: namely, $z \hat{\otimes} z_1 \in Z^1_t(\mathcal{A}_K)$ (see Introduction). It is enough to observe that the restriction of $y^z_x(a)$ to the algebras $\mathcal{A}(O)$ with $O \perp x$ and $a \subseteq O$ is equal to the morphism of the net associated with $z$, introduced in that paper, and used to define the tensor product. This entails that $z \hat{\otimes} z_1 = z \otimes z_1$, where $\otimes$ is the tensor product (4.21).

We conclude this section by generalizing, to an arbitrary globally hyperbolic space-time, [27, Theorem 30.2], which holds for globally hyperbolic space-times with noncompact Cauchy surfaces.

Proposition 4.19. The restriction functor (4.5) is a full and faithful tensor functor.

Proof. It is clear that the restriction functor is a faithful tensor functor. So, we have to prove that this functor is full. To begin with, recall the construction made
in [27, Theorem 30.2]. Let $t_{x_0}$ be an element of $(z \upharpoonright \Sigma_1(K_{x_0}), z_1 \upharpoonright \Sigma_1(K_{x_0}))$ in $Z^1_t(\mathcal{A}_{K_{x_0}})$. For $a \in \Sigma_0(K)$, we define
\[ t_a \equiv z_1(p) \cdot (t_{x_0})_{a_0} \cdot z(p)^*, \tag{4.22} \]
where $a_0$ is a 0-simplex in $K_{x_0}$ and $p$ is a path from $a_0$ to $a$. The definition depends neither on the chosen path nor on the choice of $a_0$ in $K_{x_0}$. This entails that $t_a = (t_{x_0})_a$ for any $a \in \Sigma_0(K_{x_0})$. Furthermore, given $b \in \Sigma_1(K)$ and a path $p$ from $a_0$ to $\partial_0 b$, we have
\[ t_{\partial_0 b} \cdot z(b) = z_1(p) \cdot (t_{x_0})_{a_0} \cdot z(p)^* \cdot z(b) = z_1(b) \cdot z_1(p_1) \cdot (t_{x_0})_{a_0} \cdot z(p_1)^* = z_1(b) \cdot t_{\partial_1 b}, \]
where $p_1 = \bar{b} * p$. So, what remains to be proved is that $t_a \in \mathcal{A}(a)$ for any $a \in \Sigma_0(K)$. Our proof starts from this point.

First we prove that if $x_1 \perp x_0$, then $t_a \in \mathcal{A}(a)$ for $a \in \Sigma_0(K_{x_1})$. Take $a_1 \in \Sigma_0(K_{x_1})$ with $a_1 \perp a$. Since $\Sigma_0(K_{x_1})$ admits an asymptotically causally disjoint sequence (Sec. 4.2.2), and since $x_0 \perp x_1$, we can find $a_2 \in K_{x_1} \cap K_{x_0}$ with $a_2 \perp a_1$. Therefore, $t_a = z_1(p_1) \cdot (t_{x_0})_{a_2} \cdot z(p_1)^*$, where $p_1$ is a path from $a_2$ to $a$. Note that $a_2$ and $a$ belong to the causal complement $a_1^\perp|_{K_{x_1}}$ of $a_1$ in $K_{x_1}$ and that $a_1^\perp|_{K_{x_1}}$ is pathwise connected in $K_{x_1}$ (Lemma 3.9). Since $z, z_1$ are path-independent, we can assume that $p_1$ is contained in $a_1^\perp|_{K_{x_1}}$. Hence, for any $A \in \mathcal{A}(a_1)$, we have
\[ t_a \cdot A = z_1(p_1) \cdot (t_{x_0})_{a_2} \cdot z(p_1)^* \cdot A = z_1(p_1) \cdot (t_{x_0})_{a_2} \cdot A \cdot z(p_1)^* = z_1(p_1) \cdot A \cdot (t_{x_0})_{a_2} \cdot z(p_1)^* = A \cdot t_a. \]
Namely, $t_a \in \mathcal{A}(a_1)'$ for any $a_1 \in K_{x_1}$ such that $a_1 \perp a$. Since $\mathcal{A}_{K_{x_1}}$ satisfies Haag duality, we have $t_a \in \mathcal{A}(a)$. Now, if $x_n$ is a generic point of $M$, observe that we can find a finite sequence of points $x_1, \ldots, x_{n-1}$ such that $x_0 \perp x_1$, $x_1 \perp x_2$, $\ldots$, $x_{n-1} \perp x_n$. From this observation, the proof of the fullness of the restriction functor follows.

Given $z \in Z^1_t(\mathcal{A}_K)$, we know that if $z \upharpoonright \Sigma_1(K_x)$ is irreducible in $Z^1_t(\mathcal{A}_{K_x})$ for some $x \in M$, then $z$ is irreducible. The converse is an easy consequence of Proposition 4.19, namely if $z$ is irreducible, then $z \upharpoonright \Sigma_1(K_x)$ is irreducible in $Z^1_t(\mathcal{A}_{K_x})$ for any $x \in M$. Finally, note that Proposition 4.19 is a strengthening of Proposition 4.3.

4.3.1. Symmetry, statistics and conjugation

We conclude our analysis of $Z^1_t(\mathcal{A}_K)$. First we prove the existence of a symmetry in $Z^1_t(\mathcal{A}_K)$. Afterwards, we prove the existence of left inverses and define the category of objects with finite statistics. Finally, we prove that this category has conjugates. Let $\varepsilon_x$ denote the symmetry of the category $Z^1_t(\mathcal{A}_{K_x})$.

Lemma 4.20. There exists a unique symmetry $\varepsilon$ in $Z^1_t(\mathcal{A}_K)$ such that, given $z, z_1 \in Z^1_t(\mathcal{A}_K)$ and $a \in \Sigma_0(K)$, then $\varepsilon(z, z_1)_a = \varepsilon_x(z, z_1)_a$ for any $x \in M$ with $x \perp a$.
Proof. Let $z, z_1 \in Z^1_t(\mathcal{A}_K)$. For any $a \in \Sigma_0(K)$, define
\[ \varepsilon(z, z_1)_a \equiv \varepsilon_x(z, z_1)_a \tag{4.23} \]
for some $x \in M$ with $x \perp a$. The uniqueness follows once we have shown that (4.23) defines a symmetry in $Z^1_t(\mathcal{A}_K)$. To this end, we prove that (4.23) is independent of the chosen $x$. Let $x_1 \in M$ with $x_1 \perp a$. This means that $a$ is contained in the open set $M \setminus (J(x) \cup J(x_1))$. There exists a 1-simplex $b$ with $\mathrm{cl}(|b|) \subset M \setminus (J(x) \cup J(x_1))$ (this is equivalent to $|b| \in K_x \cap K_{x_1}$) and $\partial_1 b = a$ and $\partial_0 b \perp a$ (see the observation below Lemma 3.7). Note that the paths $b$ and $b(a)$ satisfy the assumptions in the definition (4.12). By the gluing lemma, we have
\begin{align*}
\varepsilon_x(z, z_1)_a &= z_1(b(a))^* \times_x z(b)^* \cdot z(b) \times_x z_1(b(a)) \\
&= z_1(b(a))^* \times_{x_1} z(b)^* \cdot z(b) \times_{x_1} z_1(b(a)) = \varepsilon_{x_1}(z, z_1)_a,
\end{align*}
which proves our claim. By Proposition 4.3, we have that $\varepsilon(z, z_1) \in (z \otimes z_1, z_1 \otimes z)$. The remaining properties can be easily checked.

We now turn to prove the existence of left inverses in $Z^1_t(\mathcal{A}_K)$. Let $\phi^z_x$ be a left inverse of the restriction $z \upharpoonright \Sigma_1(K_x)$ in $Z^1_t(\mathcal{A}_{K_x})$ for $x \in M$. For any $t \in (z \otimes z_1, z \otimes z_2)$, we define
\[ \phi^z_{z_1, z_2}(t)_a \equiv z_2(p) \cdot (\phi^z_x)_{z_1, z_2}(t)_{a_0} \cdot z_1(p)^*, \qquad a \in \Sigma_0(K), \tag{4.24} \]
where $a_0 \in \Sigma_0(K_x)$ and $p$ is a path from $a_0$ to $a$. By the same argument used in Proposition 4.19, we have that $\phi^z_{z_1, z_2}(t) \in (z_1, z_2)$ for any $t \in (z \otimes z_1, z \otimes z_2)$. Furthermore, as $\phi^z_x$ is a left inverse of $z \upharpoonright \Sigma_1(K_x)$, one can easily check that $\phi^z$ is a left inverse of $z$. Therefore, any element of $Z^1_t(\mathcal{A}_K)$ has left inverses.

Proposition 4.21. Let $z \in Z^1_t(\mathcal{A}_K)$.
(i) If $z$ has finite statistics, then $z \upharpoonright \Sigma_1(K_x)$ has finite statistics in $Z^1_t(\mathcal{A}_{K_x})$ for any $x \in M$, and if $z$ is irreducible, then $\lambda(z) = \lambda_x(z)$ for any $x \in M$, where $\lambda(z)$ and $\lambda_x(z)$ are the statistics parameters of $z$ and $z \upharpoonright \Sigma_1(K_x)$ respectively (see Appendix).
(ii) If for some $x \in M$, $z \upharpoonright \Sigma_1(K_x)$ has finite statistics in $Z^1_t(\mathcal{A}_{K_x})$, then $z$ has finite statistics, and, if $z \upharpoonright \Sigma_1(K_x)$ is irreducible, then $\lambda(z) = \lambda_x(z)$.

Proof. (i) If $z$ has finite statistics, then $z$ admits a standard left inverse $\phi^z$. Clearly $\phi^z$ is a left inverse also for the restriction $z \upharpoonright \Sigma_1(K_x)$ for any $x \in M$. Because of Lemma 4.20, $\varepsilon(z, z)_a = \varepsilon_x(z, z)_a$ for $a \in \Sigma_0(K_x)$. Hence,
\[ \phi^z_{z,z}(\varepsilon(z, z))_a = \phi^z_{z,z}(\varepsilon_x(z, z))_a, \tag{$*$} \]
which entails that $\phi^z$ is a standard left inverse of $z \upharpoonright \Sigma_1(K_x)$ for any $x \in M$. If $z$ is irreducible, then, by Proposition 4.19, $z \upharpoonright \Sigma_1(K_x)$ is irreducible in $Z^1_t(\mathcal{A}_{K_x})$ for any $x \in M$. By ($*$), we have $\lambda(z) = \lambda_x(z)$ for any $x \in M$.
(ii) Let $\phi^z_x$ be a standard left inverse of $z \upharpoonright \Sigma_1(K_x)$, and let $\phi^z$ be the left inverse of $z$ associated with $\phi^z_x$, defined by (4.24). Then,
\[ \phi^z_{z,z}(\varepsilon(z, z))_a = z(p) \cdot (\phi^z_x)_{z,z}(\varepsilon_x(z, z))_{a_0} \cdot z(p)^*, \tag{$**$} \]
which implies that $\phi^z$ is a standard left inverse of $z$. If $z \upharpoonright \Sigma_1(K_x)$ is irreducible, then $z$ is irreducible. Therefore, by ($**$), we have that $\lambda(z) = \lambda_x(z)$, completing the proof.

Let $Z^1_t(\mathcal{A}_K)_f$ be the full subcategory of $Z^1_t(\mathcal{A}_K)$ whose objects have finite statistics.

Theorem 4.22. Any object of $Z^1_t(\mathcal{A}_K)_f$ has conjugates.

Proof. As can be seen from the Appendix, it is sufficient to prove that the theorem holds in the case of simple objects. Thus, let $z$ be a simple object of $Z^1_t(\mathcal{A}_K)$. By Proposition 4.21, any restriction $z \upharpoonright \Sigma_1(K_x)$ is a simple object. Let $y^z(a) = \{ y^z_x(a) \mid x \in M,\ a \in K_x \}$ be the morphism of stalks associated with $z$. $y^z(a)$ is in fact an automorphism of stalks, because by Proposition 4.12, any $y^z_x(a)$ is an automorphism of $\mathcal{A}^\perp(x)$. Clearly also the inverse $y^{z-1}_x(a)$ of $y^z_x(a)$ is an automorphism of $\mathcal{A}^\perp(x)$. Given $a \in \Sigma_0(K)$, let $y^{z-1}(a) \equiv \{ y^{z-1}_x(a) \mid x \in M,\ a \in K_x \}$. (From this definition alone, we can assert at this point neither that the gluing lemma is applicable to $y^{z-1}(a)$ nor that $y^{z-1}(a)$ is an automorphism of stalks. Both these properties will be a consequence of the fact that $y^{z-1}(a) = y^{\bar{z}}(a)$, where $\bar{z}$ will be defined below.) Now, we prove that for $y^{z-1}(a)$ a weaker form of the gluing lemma holds, namely given $O \in K$ with $a \subseteq O$, we have
\[ y^{z-1}_{x_1}(a) \upharpoonright \mathcal{A}(O) = y^{z-1}_{x_2}(a) \upharpoonright \mathcal{A}(O) \tag{$*$} \]
for any pair of points $x_1, x_2$ with $O \in K_{x_1} \cap K_{x_2}$. In fact, by using the gluing lemma for $y^z(a)$, we have
\begin{align*}
y^{z-1}_{x_1}(a)(A) &= y^{z-1}_{x_1}(a)\big( y^z_{x_2}(a)\big( y^{z-1}_{x_2}(a)(A) \big) \big) \\
&= y^{z-1}_{x_1}(a)\big( y^z_{x_1}(a)\big( y^{z-1}_{x_2}(a)(A) \big) \big) = y^{z-1}_{x_2}(a)(A)
\end{align*}
for any $A \in \mathcal{A}(O)$, which proves ($*$). Within this proof, we have used the identities $y^z_x(a)(\mathcal{A}(O)) = \mathcal{A}(O)$ and $y^{z-1}_x(a)(\mathcal{A}(O)) = \mathcal{A}(O)$ for any $O \in K_x$ with $a \subseteq O$. Both identities derive from Lemma 4.5(v) and from the fact that $y^z_x(a)$ is an automorphism of $\mathcal{A}^\perp(x)$. Now, recall that the conjugate $\bar{z}_x$ of $z \upharpoonright \Sigma_1(K_x)$ in $Z^1_t(\mathcal{A}_{K_x})$ is defined as $\bar{z}_x(b) = y^{z-1}_x(\partial_0 b)(z(b)^*)$, see (4.17). Given $b \in \Sigma_1(K)$, by applying ($*$), we have that
\[ \bar{z}_{x_1}(b) = y^{z-1}_{x_1}(\partial_0 b)(z(b)^*) = y^{z-1}_{x_2}(\partial_0 b)(z(b)^*) = \bar{z}_{x_2}(b) \]
Homotopy of Posets, Net-Cohomology and Superselection Sectors
for any pair of points $x_1, x_2$ with $|b| \in K_{x_1} \cap K_{x_2}$. Therefore, by defining
$$\bar z(b) \equiv \bar z_x(b), \qquad b \in \Sigma_1(K),$$
for some point $x$ with $|b| \in K_x$, by Proposition 4.2, we have that $\bar z \in Z^1_t(\mathcal{A}_K)$. Furthermore, by (4.18) we have $y^{z\,-1}(a) = y^{\bar z}(a)$, where $y^{\bar z}(a)$ is the morphism of stalks associated with $\bar z$. To prove that $\bar z$ is the conjugate of $z$, it is enough to observe that for any $b \in \Sigma_1(K)$, we have
$$(\bar z \otimes z)(b) = (\bar z_x \otimes_x z_x)(b) = \mathbb{1}, \qquad (z \otimes \bar z)(b) = (z_x \otimes_x \bar z_x)(b) = \mathbb{1},$$
(see within the proof of Lemma 4.14) for some $x \in M$ with $|b| \in K_x$. By defining $r = \bar r = \mathbb{1}$, we have that $r$ and $\bar r$ satisfy the conjugate equations for $z$ and $\bar z$, completing the proof.

5. Concluding Remarks

(1) The topology of the space-time affects the net-cohomology of posets. We have shown that the poset, used as index set of a net of local algebras, is nondirected when the space-time is either nonsimply connected or has compact Cauchy surfaces. In the former case, furthermore, there might exist 1-cocycles which are nontrivial in $\mathcal{B}(\mathcal{H}_o)$. In spite of these facts, the structure of superselection sectors of DHR-type is the same as in the case of the Minkowski space (as one can expect because of the sharp localization): sectors define a C*-category in which the charge structure manifests itself by the existence of a tensor product, a symmetry and a conjugation. An aspect of the theory that is not covered by this paper, and that deserves further investigation, is the reconstruction of the net of local fields and of the gauge group from the net of local observables and the superselection sectors. The mathematical machinery developed in [12] to prove the reconstruction theorem in the Minkowski space does not apply as it stands when the index set of the net of local observables is nondirected.

(2) In Sec. 2, we presented net-cohomology in terms of abstract posets. The intention is to provide a general framework for the theory of superselection sectors. In particular, we also hope to find applications in the study of sectors which might be induced by the nontrivial topology of space-times. It has been shown in [1] that the topology of Schwarzschild space-time, a space whose second homotopy group is nontrivial, might induce superselection sectors. However, as observed earlier, it has not been possible so far to apply the ideas of DHR-analysis to these sectors, since their localization properties are not known.
However, the results obtained in this paper allow us to make some speculations in the case that the space-time is nonsimply connected: the existence of 1-cocycles nontrivial in $\mathcal{B}(\mathcal{H}_o)$ might be related to the existence of superselection sectors induced by the nontrivial topology of the space-time. In fact, these cocycles define nontrivial representations of the fundamental group of the space-time (Theorems 2.8 and 2.18). However, what is missing in this interpretation is the proof that these 1-cocycles are associated
with representations of the net of local observables. We foresee approaching this problem in the future. Finally, we believe that this framework could be suitably generalized for applications in the context of the generally locally covariant quantum field theories [4, 7].

(3) Some techniques introduced in this paper present analogies with techniques adopted to study superselection sectors of conformally covariant theories on the circle $S^1$. In these theories, the space-time is the circle $S^1$; the index set for the net of local observables is the set $\mathcal{J}$ of the open intervals of $S^1$; the causal disjointness relation is disjointness: given $I, J \in \mathcal{J}$, then $I \perp J$ if $I \cap J = \emptyset$. The analogies arise because, referring to Sec. 2, the poset formed by $\mathcal{J}$ with the inclusion order relation is nondirected, pathwise connected, and nonsimply connected. It is usual in these theories to restrict the study of superselection sectors to the space-time $S^1/\{x\}$ for $x \in S^1$, i.e., the causal puncture of $S^1$ in $x$ (see, for instance, [6, 14, 15]); the same idea has been used in [2] to study superselection sectors over compact spaces.$^d$ The punctured Haag duality is strictly related to strong additivity (see [20] and references therein). Finally, in [14], in order to prove that endomorphisms of the net are extendible to the universal C*-algebra, the authors need to check the invariance of these extensions for homotopic paths (this definition of a homotopy of paths is a particular case of that given in [27, p. 322]).

(4) The way we define the first homotopy group of a poset is very similar to some constructions in algebraic topology. We are referring to the edge-path group of a simplicial complex [30] and to the first homotopy group of a Kan complex [23]. Although similar, they are different. Indeed, the simplicial set $\Sigma_*(P)$ of a poset $P$ is not a simplicial complex. Furthermore, if $P$ is not directed, then $\Sigma_*(P)$ is not a Kan complex.

Footnote d. We stress that in the papers [2, 6, 14, 15], the authors puncture the space-time in order to obtain a directed set of indices. Our aim is different since $K_x$ is in general nondirected. Indeed, $K_x$ has an asymptotically causally disjoint sequence of diamonds "converging to $x$" (see Sec. 4.2.2), which is sufficient for the analysis of the categories $Z^1_t(\mathcal{A}_{K_x})$.

Acknowledgments

I would like to thank John E. Roberts for his constant support throughout this work, and Daniele Guido for fruitful discussions and helpful suggestions. The present paper has been improved by comments and suggestions of the anonymous referees: I am grateful to them. I wish to thank the II Institute of Theoretical Physics of the University of Hamburg for their kind hospitality, 12–19 January 2004, and for extensive discussions on the topics studied in this paper. Finally, I would like to thank my family for their support before and during this work.

Appendix A. Tensor C*-Categories

We give some basic definitions and results on tensor C*-categories. References for this appendix are [22, 21].
Let $\mathcal{C}$ be a category. We denote by $z, z_1, z_2, \ldots$ the objects of the category and the set of the arrows between $z, z_1$ by $(z, z_1)$. The composition of arrows is indicated by "$\cdot$" and the unit arrow of $z$ by $1_z$.

Tensor C*-categories. A category $\mathcal{C}$ is said to be a C*-category if the set of the arrows between two objects $(z, z_1)$ is a complex Banach space and the composition between arrows is bilinear; there should be an adjoint, that is, an involutive contravariant functor $*$ acting as the identity on the objects, and the norm should satisfy the C*-property, namely $\|r^* r\| = \|r\|^2$ for each $r \in (z, z_1)$. Notice that if $\mathcal{C}$ is a C*-category then $(z, z)$ is a C*-algebra for each $z$.

Assume that $\mathcal{C}$ is a C*-category. An arrow $v \in (z, z_1)$ is said to be an isometry if $v^* \cdot v = 1_z$; a unitary, if it is an isometry and $v \cdot v^* = 1_{z_1}$. The property of admitting a unitary arrow defines an equivalence relation on the set of the objects of the category. We denote by the symbol $[z]$ the unitary equivalence class of the object $z$. An object $z$ is said to be irreducible if $(z, z) = \mathbb{C} \cdot 1_z$. $\mathcal{C}$ is said to be closed under subobjects if for each orthogonal projection $e \in (z, z)$, $e \neq 0$, there exists an isometry $v \in (z_1, z)$ such that $v \cdot v^* = e$. $\mathcal{C}$ is said to be closed under direct sums if, given $z_i$, $i = 1, 2$, there exists an object $z$ and two isometries $w_i \in (z_i, z)$ such that $w_1 \cdot w_1^* + w_2 \cdot w_2^* = 1_z$.

A strict tensor C*-category (or tensor C*-category) is a C*-category $\mathcal{C}$ equipped with a tensor product, namely an associative bifunctor $\otimes : \mathcal{C} \times \mathcal{C} \to \mathcal{C}$ with a unit $\iota$, commuting with $*$, bilinear on the arrows and satisfying the exchange property, i.e., $(t \otimes s) \cdot (t_1 \otimes s_1) = t \cdot t_1 \otimes s \cdot s_1$ when the composition of the arrows is defined. From now on, we assume that $\mathcal{C}$ is a tensor C*-category closed under direct sums and subobjects, and that the identity object $\iota$ is irreducible.

Symmetry and left inverses. A symmetry $\varepsilon$ in the tensor C*-category $\mathcal{C}$ is a map $\mathcal{C} \ni z_1, z_2 \mapsto \varepsilon(z_1, z_2) \in (z_1 \otimes z_2, z_2 \otimes z_1)$ satisfying the relations:
(i) $\varepsilon(z_3, z_4) \cdot t \otimes s = s \otimes t \cdot \varepsilon(z_1, z_2)$,
(ii) $\varepsilon(z_1, z_2)^* = \varepsilon(z_2, z_1)$,
(iii) $\varepsilon(z_1, z_2 \otimes z) = 1_{z_2} \otimes \varepsilon(z_1, z) \cdot \varepsilon(z_1, z_2) \otimes 1_z$,
(iv) $\varepsilon(z_1, z_2) \cdot \varepsilon(z_2, z_1) = 1_{z_2 \otimes z_1}$,
where $t \in (z_2, z_4)$, $s \in (z_1, z_3)$. By (ii)–(iv), it follows that $\varepsilon(z, \iota) = \varepsilon(\iota, z) = 1_z$ for any $z$.

In this paper, by the left inverse of an object $z$ we mean a set of non-zero linear maps $\phi^z = \{\phi^z_{z_1,z_2} : (z \otimes z_1, z \otimes z_2) \to (z_1, z_2)\}$ satisfying
(i) $\phi^z_{z_3,z_4}(1_z \otimes t \cdot r \cdot 1_z \otimes s^*) = t \cdot \phi^z_{z_1,z_2}(r) \cdot s^*$,
(ii) $\phi^z_{z_1 \otimes z_3, z_2 \otimes z_3}(r \otimes 1_{z_3}) = \phi^z_{z_1,z_2}(r) \otimes 1_{z_3}$,
(iii) $\phi^z_{z_1,z_1}(s_1^* \cdot s_1) \geq 0$,
(iv) $\phi^z_{\iota,\iota}(1_z) = \mathbb{1}$,
where $t \in (z_1, z_3)$, $s \in (z_2, z_4)$, $r \in (z \otimes z_1, z \otimes z_2)$ and $s_1 \in (z \otimes z_1, z \otimes z_1)$.
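The symmetry relations can be checked by hand in a toy model. The sketch below is an illustration only, not the category studied in the paper: it assumes objects are the spaces $\mathbb{C}^n$, arrows are integer matrices stored as lists of rows, $\otimes$ is the Kronecker product, and $\varepsilon$ is the flip $e_i \otimes f_j \mapsto f_j \otimes e_i$. All function names and the sample arrows are hypothetical. Relation (i) is tested with the labels $s \in (z_1, z_3)$, $t \in (z_2, z_4)$, in the form $\varepsilon(z_3, z_4) \cdot s \otimes t = t \otimes s \cdot \varepsilon(z_1, z_2)$, which is the assignment for which every composition in the toy model is defined.

```python
# Toy model (an assumption, not the paper's category): objects are C^n,
# arrows are matrices given as lists of rows, the symmetry is the flip.

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Matrix product a . b, i.e. composition of arrows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Adjoint of a real matrix."""
    return [list(col) for col in zip(*a)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def flip(n, m):
    """Matrix of e_i (x) f_j -> f_j (x) e_i from C^n (x) C^m to C^m (x) C^n."""
    f = [[0] * (n * m) for _ in range(m * n)]
    for i in range(n):
        for j in range(m):
            f[j * n + i][i * m + j] = 1
    return f

# Hypothetical sample arrows: s : C^2 -> C^3 and t : C^3 -> C^1.
s = [[1, 2], [0, 1], [3, 0]]
t = [[2, 1, 0]]

# (ii): the adjoint of eps(z1, z2) is eps(z2, z1).
relation_ii = transpose(flip(2, 3)) == flip(3, 2)
# (iv): eps(z1, z2) . eps(z2, z1) is the identity of z2 (x) z1.
relation_iv = matmul(flip(2, 3), flip(3, 2)) == identity(6)
# (i): eps(z3, z4) . (s (x) t) = (t (x) s) . eps(z1, z2).
naturality = matmul(flip(3, 1), kron(s, t)) == matmul(kron(t, s), flip(2, 3))
```

Integer entries keep every comparison exact, so no numerical tolerance is needed.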
Statistics. From now on, we assume that $\mathcal{C}$ has a symmetry $\varepsilon$ and that any object of $\mathcal{C}$ has left inverses. An object $z$ of $\mathcal{C}$ is said to have finite statistics if it admits a standard left inverse, that is, a left inverse $\phi^z$ satisfying the relation
$$\phi^z_{z,z}(\varepsilon(z, z)) \cdot \phi^z_{z,z}(\varepsilon(z, z)) = c \cdot 1_z \quad \text{with } c > 0.$$
The full subcategory $\mathcal{C}_f$ of $\mathcal{C}$ whose objects have finite statistics is closed under direct sums, subobjects, tensor products and equivalence. Any object of $\mathcal{C}_f$ is a finite direct sum of irreducible objects. Given an irreducible object $z$ of $\mathcal{C}_f$ and a left inverse $\phi^z$ of $z$, we have $\phi^z_{z,z}(\varepsilon(z, z)) = \lambda(z) \cdot 1_z$. It turns out that $\lambda(z)$ is an invariant of the equivalence class of $z$, called the statistics parameter, and it is the product of two invariants:
$$\lambda(z) = \chi(z) \cdot d(z)^{-1}, \quad \text{where } \chi(z) \in \{1, -1\},\ d(z) \in \mathbb{N}.$$
The possible statistics of $z$ are classified by the statistical phase $\chi(z)$, distinguishing para-Bose ($1$) and para-Fermi ($-1$) statistics, and by the statistical dimension $d(z)$, giving the order of the parastatistics. Ordinary Bose and Fermi statistics correspond to $d(z) = 1$. The objects with $d(z) = 1$ are called simple objects. The following properties are equivalent ([28]):
$$z \text{ is simple} \iff \varepsilon(z, z) = \chi(z) \cdot 1_{z \otimes z} \iff z^{\otimes n} \text{ is irreducible } \forall n \in \mathbb{N}.$$

Conjugation. An object $z$ has conjugates if there exists an object $\bar z$ and a pair of arrows $r \in (\iota, \bar z \otimes z)$, $\bar r \in (\iota, z \otimes \bar z)$ satisfying the conjugate equations
$$\bar r^* \otimes 1_z \cdot 1_z \otimes r = 1_z, \qquad r^* \otimes 1_{\bar z} \cdot 1_{\bar z} \otimes \bar r = 1_{\bar z}.$$
Conjugation is a property stable under subobjects, direct sums, tensor products and, furthermore, equivalence. It turns out that
$$z \text{ has conjugates} \implies z \text{ has finite statistics}.$$
The full subcategory $\mathcal{C}_f$ of objects with finite statistics has conjugates if, and only if, each object with statistical dimension equal to one has conjugates (see [11, 19]). Firstly, we observe that if each irreducible object of $\mathcal{C}_f$ has conjugates, then any object of $\mathcal{C}_f$ has conjugates, because any object of $\mathcal{C}_f$ is a finite direct sum of irreducibles, and because conjugation is stable under direct sums. Secondly, note that if $z$ is an irreducible object with statistical dimension $d(z)$, then there exists a pair of isometries $v \in (z_0, z^{\otimes d(z)})$ and $w \in (z_1, z^{\otimes d(z)-1})$, where $z_0$ is a simple object. Assume given $\bar z_0$ and a pair of arrows $s, \bar s$ which solve the conjugate equations for $z_0$ and $\bar z_0$. Let $\phi^z$ be a standard left inverse of $z$. Setting
$$\bar z \equiv z_1 \otimes \bar z_0, \qquad \bar r \equiv d(z)^{1/2} \cdot (1_z \otimes w^* \otimes 1_{\bar z_0}) \cdot v \otimes 1_{\bar z_0} \cdot \bar s, \qquad r \equiv d(z) \cdot \phi^z_{\iota, \bar z \otimes z}(\bar r \otimes 1_z),$$
one can easily show that $r, \bar r$ solve the conjugate equations for $z$ and $\bar z$ ([11]).
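To make the statistics parameter introduced earlier in this appendix concrete, here is a minimal sketch in a toy model, assumed purely for illustration and not taken from the paper: the object is $z = \mathbb{C}^d$, the symmetry $\varepsilon(z, z)$ is the flip on $\mathbb{C}^d \otimes \mathbb{C}^d$, and the left inverse $\phi^z_{z,z}$ is taken to be the normalized partial trace over the first tensor factor (which sends the identity to the identity). Under these assumptions $\phi^z_{z,z}(\varepsilon(z, z)) = d^{-1} \cdot 1_z$, so $\chi(z) = 1$ and $d(z) = d$. The function names are hypothetical.

```python
from fractions import Fraction

def flip(d):
    """Flip on C^d (x) C^d as a d^2 x d^2 matrix: e_i (x) e_j -> e_j (x) e_i."""
    f = [[0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            f[j * d + i][i * d + j] = 1
    return f

def left_inverse(x, d):
    """Normalized partial trace of x over the first factor of C^d (x) C^d.

    This plays the role of phi^z_{z,z} in the toy model (an assumption).
    """
    return [[Fraction(sum(x[k * d + a][k * d + b] for k in range(d)), d)
             for b in range(d)] for a in range(d)]

def statistics_parameter(d):
    """Return lambda with left_inverse(flip) = lambda * identity on C^d."""
    phi = left_inverse(flip(d), d)
    lam = phi[0][0]
    # The image of the flip must be a scalar multiple of the identity.
    assert all(phi[a][b] == (lam if a == b else 0)
               for a in range(d) for b in range(d))
    return lam
```

In this model `statistics_parameter(d)` equals `Fraction(1, d)`, matching $\lambda(z) = \chi(z) \cdot d(z)^{-1}$ with $\chi(z) = 1$; exact rational arithmetic avoids any rounding in the comparison.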
References
[1] A. Ashtekar and A. Sen, On the role of space-time topology in quantum phenomena: Superselection of charge and emergence of nontrivial vacua, J. Math. Phys. 21(3) (1980) 526–533.
[2] H. Baumgärtel, On Haag dual nets over compact spaces, Lett. Math. Phys. 33 (1995) 7–22.
[3] D. Buchholz and K. Fredenhagen, Locality and the structure of particle states, Commun. Math. Phys. 84 (1982) 1–54.
[4] R. Brunetti, K. Fredenhagen and R. Verch, The generally covariant locality principle – A new paradigm for local quantum physics, Commun. Math. Phys. 237 (2003) 31–68.
[5] H. Baumgärtel and F. Lledó, Duality of compact groups and Hilbert C*-systems for C*-algebras with nontrivial center, Internat. J. Math. 15(8) (2004) 759–812.
[6] D. Buchholz, G. Mack and I. Todorov, The current algebra on the circle as a germ for local field theories, in Conformal Field Theories and Related Topics, eds. Binétruy et al., Nucl. Phys. B Proc. Suppl. 5B (1988) 20–56.
[7] R. Brunetti and G. Ruzzi, Superselection sectors in a generally covariant setting, work in progress.
[8] A. N. Bernal and M. Sánchez, On smooth Cauchy hypersurfaces and Geroch's splitting theorem, Commun. Math. Phys. 243 (2003) 461–470.
[9] A. N. Bernal and M. Sánchez, Smoothness of time functions and the metric splitting of globally hyperbolic space-times, to appear in Commun. Math. Phys., gr-qc/0401112.
[10] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics I, Commun. Math. Phys. 23 (1971) 199–230.
[11] S. Doplicher, R. Haag and J. E. Roberts, Local observables and particle statistics II, Commun. Math. Phys. 35 (1974) 49–85.
[12] S. Doplicher and J. E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection sectors in particle physics, Commun. Math. Phys. 131(1) (1990) 51–107.
[13] G. F. R. Ellis and S. W. Hawking, The Large Scale Structure of Space-Time (Cambridge University Press, 1973).
[14] K. Fredenhagen, K.-H. Rehren and B. Schroer, Superselection sectors with braid group statistics and exchange algebras. II: Geometric aspects and conformal covariance, Rev. Math. Phys. (Special Issue) (1992) 113–157.
[15] D. Guido and R. Longo, The conformal spin-statistics theorem, Commun. Math. Phys. 181 (1996) 11–35.
[16] D. Guido, R. Longo, J. E. Roberts and R. Verch, Charged sectors, spin and statistics in quantum field theory on curved space-times, Rev. Math. Phys. 13(2) (2001) 125–198.
[17] R. Haag, Local Quantum Physics, 2nd edn. (Springer Texts and Monographs in Physics, 1996).
[18] R. Haag and D. Kastler, An algebraic approach to quantum field theory, J. Math. Phys. 43 (1964) 848.
[19] W. Kunhardt, On infravacua and superselection theory with massless particles, PhD thesis, Göttingen (2001), math-ph/0109001.
[20] Y. Kawahigashi, R. Longo and M. Müger, Multi-interval subfactors and modularity of representations in conformal field theory, Commun. Math. Phys. 219 (2001) 631–669.
[21] R. Longo and J. E. Roberts, A theory of dimension, K-Theory 11(2) (1997) 103–159.
[22] S. MacLane, Categories for the Working Mathematician (Springer Verlag, New York-Heidelberg-Berlin, 1971).
[23] J. P. May, Simplicial Objects in Algebraic Topology (Chicago University Press, 1967).
[24] B. O'Neill, Semi-Riemannian Geometry (Academic Press, New York, 1983).
[25] J. E. Roberts, Local cohomology and superselection structure, Commun. Math. Phys. 51(2) (1976) 107–119.
[26] J. E. Roberts, Lectures on algebraic quantum field theory, in The Algebraic Theory of Superselection Sectors (Palermo 1989), ed. D. Kastler (World Sci. Publishing, River Edge, NJ, 1990), pp. 1–112.
[27] J. E. Roberts, More lectures in algebraic quantum field theory, in Noncommutative Geometry, C.I.M.E. Lectures, Martina Franca, Italy, 2000, eds. S. Doplicher and R. Longo (Springer, 2003).
[28] G. Ruzzi, Essential properties of the vacuum sector for a theory of superselection sectors, Rev. Math. Phys. 15(10) (2003) 1255–1283.
[29] G. Ruzzi, Punctured Haag duality in locally covariant quantum field theories, Commun. Math. Phys. 256(3) (2005) 621–634.
[30] I. M. Singer and J. A. Thorpe, Lecture Notes on Elementary Topology and Geometry (Springer-Verlag, New York, 1967).
[31] E. Vasselli, Continuous fields of C*-algebras arising from extensions of tensor C*-categories, J. Funct. Anal. 199 (2003) 123–153.
[32] R. Verch, Continuity of symplectically adjoint maps and the algebraic structure of Hadamard vacuum representations for quantum fields in curved space-time, Rev. Math. Phys. 9(5) (1997) 635–674.
[33] R. Verch, Notes on regular diamonds, preprint, http://www/lqp.uni-goettingen.de/lqp/papers/.
[34] R. M. Wald, General Relativity (University of Chicago Press, 1984).
October 20, 2005 8:48 WSPC/148-RMP
J070-00247
Reviews in Mathematical Physics Vol. 17, No. 9 (2005) 1071–1109 c World Scientific Publishing Company
GENERALIZED WEAK WEYL RELATION AND DECAY OF QUANTUM DYNAMICS
ASAO ARAI
Department of Mathematics, Hokkaido University, Sapporo 060-0810, Japan
[email protected]

Received 08 April 2005
Revised 13 August 2005

Let $H$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$, $T$ be a symmetric operator on $\mathcal{H}$ and $K(t)$ ($t \in \mathbb{R}$) be a bounded self-adjoint operator on $\mathcal{H}$. We say that $(T, H, K)$ obeys the generalized weak Weyl relation (GWWR) if $e^{-itH} D(T) \subset D(T)$ for all $t \in \mathbb{R}$ and $T e^{-itH} \psi = e^{-itH}(T + K(t))\psi$, $\forall \psi \in D(T)$ ($D(T)$ denotes the domain of $T$). In the context of quantum mechanics where $H$ is the Hamiltonian of a quantum system, we call $T$ a generalized time operator of $H$. We first investigate, in an abstract framework, mathematical structures and properties of triples $(T, H, K)$ obeying the GWWR. These include the absolute continuity of the spectrum of $H$ restricted to a closed subspace of $\mathcal{H}$, an uncertainty relation between $H$ and $T$ (a "time-energy uncertainty relation"), and the decay property of transition probabilities $|\langle \psi, e^{-itH} \phi \rangle|^2$ as $|t| \to \infty$ for all vectors $\psi$ and $\phi$ in a subspace of $\mathcal{H}$, where $\langle \cdot, \cdot \rangle$ denotes the inner product of $\mathcal{H}$. We describe methods to construct various examples of triples $(T, H, K)$ obeying the GWWR. In particular, we show that there exist generalized time operators of second quantization operators on Fock spaces (full Fock spaces, boson Fock spaces and fermion Fock spaces) which may have applications to quantum field models with interactions.

Keywords: Generalized weak Weyl relation; time operator; canonical commutation relation; Hamiltonian; quantum dynamics; survival probability; decay in time; time-energy uncertainty relation; Schrödinger operator; Dirac operator; Fock space; second quantization.

Mathematics Subject Classification 2000: 81Q10, 47N50
Contents
1. Introduction
2. Fundamental Properties of the GWWR
   2.1. Elementary facts
   2.2. Nonself-adjointness of generalized time operators
   2.3. Construction of triples obeying the GWWR in direct sums of Hilbert spaces
   2.4. Perturbations
3. Transition Probability Amplitudes and the Point Spectra of Hamiltonians
4. Generalized Weak CCR and Time-Energy Uncertainty Relations
5. The Point Spectra of Generalized Time Operators
6. Commutation Formulas and Absolute Continuity
   6.1. General cases
   6.2. A special case
7. Absence of Minimum-Uncertainty States
8. Power Decays of Transition Probability Amplitudes in Quantum Dynamics
   8.1. A simple case
   8.2. Higher order decays in smaller subspaces
   8.3. Correlation functions
   8.4. Heat semigroups
9. Abstract Version of Wigner's Time-Energy Uncertainty Relation
10. Structure Producing Successively Triples Obeying the GWWR
11. Generalized Time Operators of Partial Differential Operators
   11.1. Constructions from the Schrödinger representation of the CCR with d degrees of freedom
   11.2. Abstract Dirac operators
12. Representations of the GWWR in Fock Spaces
   12.1. Tensor representations of the GWWR
   12.2. Constructions of triples obeying the GWWR in Fock spaces
1. Introduction

In this paper we develop, in an abstract framework, an operator theory of a commutation relation which is a generalization of a variant of the canonical commutation relation (CCR) with one degree of freedom, and we lay a basis for applications of the theory to quantum mechanics and quantum field theory. To explain new features of the present work, we first recall some of the basic aspects of the representation theory of the CCR.

As is well known, a representation of the CCR with one degree of freedom is defined to be a triple $(\mathcal{H}, \mathcal{D}, (Q, P))$ consisting of a complex Hilbert space $\mathcal{H}$, a dense subspace $\mathcal{D}$ of $\mathcal{H}$ and a pair $(Q, P)$ of symmetric operators on $\mathcal{H}$ such that $\mathcal{D} \subset D(QP) \cap D(PQ)$ ($D(\cdot)$ denotes operator domain) and
$$QP - PQ = iI \qquad (1.1)$$
on $\mathcal{D}$, where $i := \sqrt{-1}$ and $I$ denotes the identity on $\mathcal{H}$.$^a$ If both $Q$ and $P$ are self-adjoint and $\mathcal{D}$ is a common core of $Q$ and $P$, then we say that the representation $(\mathcal{H}, \mathcal{D}, (Q, P))$ is self-adjoint.$^b$

Footnote a. One can generalize the concept of the representation of the CCR by taking the commutation relation (1.1) in the sense of sesquilinear form, i.e., $\mathcal{D} \subset D(Q) \cap D(P)$ and $\langle Q\psi, P\phi \rangle - \langle P\psi, Q\phi \rangle = i \langle \psi, \phi \rangle$, $\psi, \phi \in \mathcal{D}$, where $\langle \cdot, \cdot \rangle$ denotes the inner product of $\mathcal{H}$.

Footnote b. There are representations of the CCR which cannot be self-adjoint. For example, consider the Hilbert space $L^2(\mathbb{R}_+)$ with $\mathbb{R}_+ := (0, \infty)$ and define operators $q_+, p_+$ on $L^2(\mathbb{R}_+)$ as follows:
$$D(q_+) := \Big\{ f \in L^2(\mathbb{R}_+) \,\Big|\, \int_{\mathbb{R}_+} |r f(r)|^2 \, dr < \infty \Big\}, \quad (q_+ f)(r) := r f(r), \quad f \in D(q_+), \ \text{a.e. } r \in \mathbb{R}_+,$$
$$D(p_+) := C_0^\infty(\mathbb{R}_+), \quad (p_+ f)(r) := -i \frac{df(r)}{dr}, \quad f \in D(p_+), \ \text{a.e. } r \in \mathbb{R}_+.$$
A typical example of self-adjoint representations of the CCR is the Schrödinger representation $(L^2(\mathbb{R}), C_0^\infty(\mathbb{R}), (q, p))$, with $q$ being the multiplication operator by the variable $x \in \mathbb{R}$ acting in $L^2(\mathbb{R})$ and $p := -iD_x$ the generalized differential operator in the variable $x$ acting in $L^2(\mathbb{R})$.

There is a stronger form of representation of the CCR: a double $(\mathcal{H}, (Q, P))$ consisting of a complex Hilbert space $\mathcal{H}$ and a pair $(Q, P)$ of self-adjoint operators on $\mathcal{H}$ is called a Weyl representation of the CCR with one degree of freedom if
$$e^{itQ} e^{isP} = e^{-its} e^{isP} e^{itQ}, \quad \forall t, s \in \mathbb{R}.$$
This relation is called the Weyl relation (e.g., [19, pp. 274–275]). It is easy to see that the Schrödinger representation $(L^2(\mathbb{R}), (q, p))$ is a Weyl representation. Von Neumann [16] proved that every Weyl representation on a separable Hilbert space is unitarily equivalent to a direct sum of the Schrödinger representation. This theorem (the von Neumann uniqueness theorem) implies that each Weyl representation of the CCR is a self-adjoint representation of the CCR (for details, see [6, 18]). But a self-adjoint representation of the CCR is not necessarily a Weyl representation of the CCR: there are self-adjoint representations of the CCR that are not Weyl representations (e.g., [8]). Physically interesting examples of self-adjoint representations of the CCRs with two degrees of freedom which are not necessarily unitarily equivalent to the Schrödinger representation of the CCRs appear in two-dimensional gauge quantum mechanics with singular gauge potentials. These representations, which are closely related to the so-called Aharonov–Bohm effect [1], have been studied extensively by the present author in a series of papers (see [5] and references therein).

Schmüdgen [21] presented and studied a weaker version of the Weyl relation with one degree of freedom: let $T$ be a symmetric operator and $H$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$. We say that $(T, H)$ obeys the weak Weyl relation (WWR) if $e^{-itH} D(T) \subset D(T)$ for all $t \in \mathbb{R}$ and
$$T e^{-itH} \psi = e^{-itH} (T + t) \psi, \quad \forall \psi \in D(T), \ \forall t \in \mathbb{R},$$
where, for later convenience, we use the symbols $(T, H)$ instead of $(Q, P)$. We call $(\mathcal{H}, (T, H))$ a weak Weyl representation of the CCR with one degree of freedom. It is easy to see that every Weyl representation of the CCR is a weak Weyl representation of the CCR. But the converse is not true [21].$^c$ It should be remarked also that the WWR implies the CCR, but a representation of the CCR is not necessarily a weak Weyl representation of the CCR. In this sense, the WWR is between the CCR and the Weyl relation.

Footnote b (continued). Then $q_+$ is self-adjoint, $p_+$ is symmetric and $(L^2(\mathbb{R}_+), C_0^\infty(\mathbb{R}_+), (q_+, p_+))$ is a representation of the CCR with one degree of freedom. It is not difficult to prove that $p_+$ has no self-adjoint extensions (e.g., see [6, Chap. 2, Example D.1]). Therefore, $(q_+, p_+)$ cannot be extended to a self-adjoint representation of the CCR on $L^2(\mathbb{R}_+)$.

Footnote c. The pair $(p_+, -q_+)$ in the preceding footnote obeys the WWR. But it cannot be a Weyl representation, since $p_+$ is not self-adjoint (or $-q_+$ is non-positive).
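As a sanity check on the two relations above, the following sketch evaluates both sides pointwise in the Schrödinger representation. It relies on the standard explicit actions $(e^{itq} f)(x) = e^{itx} f(x)$ and $(e^{isp} f)(x) = f(x + s)$ for $p = -i\,d/dx$, and takes $(T, H) = (q, p)$ for the WWR. The function names, the Gaussian test function and the sample points are illustrative assumptions; a pointwise numerical check is of course no substitute for the operator-theoretic statements.

```python
import cmath
import math

def exp_itQ(t, f):
    """(e^{itq} f)(x) = e^{itx} f(x): multiplication by a phase."""
    return lambda x: cmath.exp(1j * t * x) * f(x)

def exp_isP(s, f):
    """(e^{isp} f)(x) = f(x + s) for p = -i d/dx: a translation."""
    return lambda x: f(x + s)

def q(f):
    """Multiplication operator (q f)(x) = x f(x)."""
    return lambda x: x * f(x)

def gaussian(x):
    """Hypothetical test function in the domain of q."""
    return math.exp(-x * x / 2)

def weyl_defect(t, s, f, x):
    """|e^{itQ} e^{isP} f - e^{-its} e^{isP} e^{itQ} f| evaluated at x."""
    lhs = exp_itQ(t, exp_isP(s, f))(x)
    rhs = cmath.exp(-1j * t * s) * exp_isP(s, exp_itQ(t, f))(x)
    return abs(lhs - rhs)

def wwr_defect(t, f, x):
    """|T e^{-itH} f - e^{-itH} (T + t) f| at x, with T = q and H = p."""
    lhs = q(exp_isP(-t, f))(x)
    rhs = exp_isP(-t, lambda y: q(f)(y) + t * f(y))(x)
    return abs(lhs - rhs)
```

Both defects vanish up to floating-point rounding: the first because $e^{itx} f(x+s) = e^{-its} e^{it(x+s)} f(x+s)$, the second because $x f(x-t) = ((x-t) + t) f(x-t)$.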
The WWR was used to study a time operator with application to survival probabilities in quantum dynamics [11, 12] (in the article [11], the WWR is called the $T$-weak Weyl relation), where $H$ is taken to be the Hamiltonian of a quantum system. It was proven in [11] that, if $(T, H)$ obeys the WWR, then $H$ has no point spectrum and its spectrum is purely absolutely continuous [11, Corollary 4.3, Theorem 4.4]. This class of $H$, however, is somewhat restrictive. From this point of view, it would be natural to investigate a general version of the WWR (if any) such that $H$ is not necessarily purely absolutely continuous. This is one of the motivations of the present work. The general version of the WWR we consider in the present paper is defined as follows:

Definition 1.1. Let $T$ be a symmetric operator on a Hilbert space $\mathcal{H}$, $H$ be a self-adjoint operator on $\mathcal{H}$ and $K(t)$ ($t \in \mathbb{R}$) be a bounded self-adjoint operator on $\mathcal{H}$ with $D(K(t)) = \mathcal{H}$, $\forall t \in \mathbb{R}$. We say that $(T, H, K)$ obeys the generalized weak Weyl relation (GWWR) in $\mathcal{H}$ if $e^{-itH} D(T) \subset D(T)$ for all $t \in \mathbb{R}$ and
$$T e^{-itH} \psi = e^{-itH} (T + K(t)) \psi, \quad \forall \psi \in D(T), \ \forall t \in \mathbb{R}. \qquad (1.2)$$
We call the operator-valued function $K$ the commutation factor in the GWWR. Also we sometimes say that $(T, H, K)$ is a representation of the GWWR.

Obviously the case $K(t) = t$ in the GWWR gives the WWR. Hence the GWWR is certainly a generalization of the WWR. One can also show that the GWWR implies a generalized version of the CCR (Proposition 4.3 in the present paper). Since the (G)WWR is a weaker version of the Weyl relation, the strong properties arising from the Weyl relation (e.g., spectral properties) may be weakened by the (G)WWR. It is very interesting to investigate this aspect. Thus triples $(T, H, K)$ obeying the GWWR become the main objects of the present paper.

As suggested above, in applications to quantum mechanics and quantum field theory, we have in mind the case where $H$ is the Hamiltonian of a quantum system. In this realization of $H$, we call $T$ a generalized time operator. We show that the GWWR implies a "time-energy uncertainty relation" between $H$ and $T$ (for physical discussions related to this aspect, see [15] and references therein). Mathematically rigorous studies of time-energy uncertainty relations, which, however, do not use time operators, are given in [17]. In the present paper, we construct generalized time operators for Hamiltonians in both relativistic and nonrelativistic quantum mechanics, including Dirac type operators, as well as in quantum field theory.

The outline of the present paper is as follows. In Sec. 2, we discuss fundamental properties of representations $(T, H, K)$ of the GWWR. In Sec. 3, we derive a decay property (in time $t \in \mathbb{R}$) of transition probability amplitudes $\langle \psi, e^{-itH} \phi \rangle$ for vectors $\psi, \phi$ in a suitable subspace, where $\langle \cdot, \cdot \rangle$ denotes the inner product of $\mathcal{H}$. In Sec. 4, we introduce a concept of the generalized weak CCR and show that the GWWR implies the generalized weak CCR. We also derive a time-energy uncertainty relation. Section 5 is concerned with properties of the point spectrum of $T$.
In Sec. 6, we develop functional calculus for the GWWR. In a special case where $K(t) = K_C(t) := tC$ with $C$ a bounded self-adjoint operator, we prove the absolute continuity of $H$ restricted to the closure of $\mathrm{Ran}(C)$, the range of $C$. In Sec. 7, we prove absence of minimum-uncertainty states for each representation $(T, H, K_C)$ with $T$ being closed and $H$ being bounded from below. This is an interesting aspect in the sense that it shows a difference from the Weyl representation (on a separable Hilbert space) or the Fock representation of the CCR, in which minimum-uncertainty states exist (see Remark 7.1 for more details). In Sec. 8, for representations $(T, H, K_C)$ of the GWWR, we derive power laws for decays (in time) of transition probability amplitudes as well as two-point correlation functions. Heat semi-groups $e^{-\beta H}$ ($\beta > 0$) generated by $H$ (under the condition that $H$ is bounded from below) are also considered. In Sec. 9, we present an abstract version of Wigner's time-energy uncertainty relation [23]. In Sec. 10, we show that there exists a structure producing successively representations of the GWWR. In Sec. 11, we construct concrete classes of representations of the GWWR, using partial differential operators acting in $L^2(\mathbb{R}^d)$ ($d \in \mathbb{N}$). We also find generalized time operators for an abstract Dirac operator. In the last section, we present a tensor representation of the GWWR and construct generalized time operators for second quantization operators on Fock spaces (full Fock spaces, boson Fock spaces, fermion Fock spaces). This lays a basis for investigations of quantum field models with interactions.

2. Fundamental Properties of the GWWR

Throughout this section, we assume that $(T, H, K)$ obeys the GWWR in a Hilbert space $\mathcal{H}$ (Definition 1.1). We denote the inner product and the norm of $\mathcal{H}$ by $\langle \cdot, \cdot \rangle$ and $\| \cdot \|$ respectively.

2.1. Elementary facts

Proposition 2.1. For all $t \in \mathbb{R}$, $e^{-itH} D(T) = D(T)$ and the operator equality
$$T e^{-itH} = e^{-itH} (T + K(t)) \qquad (2.1)$$
holds. Moreover,
$$K(0) = 0. \qquad (2.2)$$

Proof. Taking $-t$ as the $t$ in Definition 1.1, we have $e^{itH} D(T) \subset D(T)$ for all $t \in \mathbb{R}$. Hence, $D(T) \subset e^{-itH} D(T) \subset D(T)$ for all $t \in \mathbb{R}$, which implies that $e^{-itH} D(T) = D(T)$ for all $t \in \mathbb{R}$. By definition, we have $e^{-itH} (T + K(t)) \subset T e^{-itH}$ for all $t \in \mathbb{R}$. On the other hand, the preceding result implies that $D(T e^{-itH}) = D(T)$ for all $t \in \mathbb{R}$. Thus, (2.1) follows. Letting $t = 0$ in (2.1), we have $K(0) = 0$ on $D(T)$. Since $D(T)$ is dense and $K(0)$ is bounded, we obtain (2.2).

Proposition 2.2. Let $\bar T$ be the closure of $T$. Then $(\bar T, H, K)$ obeys the GWWR.
Proof. For each ψ ∈ D(T¯ ), there exists a sequence {ψn }∞ n=1 ⊂ D(T ) such that limn→∞ ψn = ψ and limn→∞ T ψn = T¯ψ. We have T e−itH ψn = e−itH (T + K(t))ψn . The right-hand side (r.h.s.) strongly converges to e−itH (T¯ + K(t))ψ as n → ∞. We have e−itH ψn → e−itH ψ as n → ∞. Hence e−itH ψ ∈ D(T¯ ) and T¯e−itH ψ = e−itH (T¯ + K(t))ψ. Thus, (T¯, H, K) obeys the GWWR. For a linear operator A, we denote by σ(A) (resp. σp (A)) the spectrum (resp. the point spectrum) of A. Corollary 2.3. For all t ∈ R, σ(T + K(t)) = σ(T ) and σp (T + K(t)) = σp (T ), where the multiplicity of each λ ∈ σp (T ) is equal to that of λ ∈ σp (T + K(t)). Proof. Operator equality (2.1) means the unitary equivalence of T + K(t) and T . Hence, by a general theorem, the desired results follow. Definition 2.4. We say that a linear operator L on H strongly commutes with H if e−itH D(L) ⊂ D(L) for all t ∈ R and e−itH L ⊂ Le−itH. Remark 2.5. In the same way as in the proof of Proposition 2.1, one can show that L strongly commutes with H if and only if operator equality e−itH L = Le−itH holds for all t ∈ R. Proposition 2.6. Let S be a symmetric operator on H strongly commuting with H such that D(S) ∩ D(T ) is dense (hence T + S is a symmetric operator with D(T + S) := D(T ) ∩ D(S)). Then, (T + S, H, K) obeys the GWWR. Proof. A simple calculation. We denote by B(H) the Banach space of all bounded linear operators on H with domains equal to H. Proposition 2.7. For all t ∈ R, eitH K(−t) + K(t)eitH = 0.
(2.3)
In particular, σ(K(t)) = σ(−K(−t)), σp(K(t)) = σp(−K(−t)), ∀ t ∈ R. (2.4)
Proof. Let t ∈ R. In general, for all W ∈ B(H) and every densely defined linear operator A on H, we have (W A)∗ = A∗ W ∗ (operator equality). Hence, by taking the adjoint of (2.1), we have eitH T ∗ ⊂ (T ∗ + K(t))eitH for all t ∈ R. Hence, for all ψ ∈ D(T ), eitH T ψ = (T + K(t))eitH ψ = eitH (T + K(−t))ψ + K(t)eitH ψ, where we have used (2.1) to obtain the second equality. This implies that eitH K(−t)ψ + K(t)eitH ψ = 0 for all t ∈ R. Since D(T ) is dense, we obtain (2.3). Operator equality (2.3) implies the unitary equivalence of K(t) and −K(−t). Hence (2.4) follows.
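As a quick sanity check of (2.3) and (2.4), one can take the simplest commutation factor $K(t) = t$ (a scalar multiple of the identity, the case of the ordinary weak Weyl relation). The following is only an illustrative verification under that assumption, not part of the original argument:

```latex
\begin{align*}
e^{itH}K(-t) + K(t)e^{itH} &= e^{itH}(-t) + t\,e^{itH} = 0,\\
\sigma(K(t)) &= \{t\} = \sigma(-K(-t)), \qquad t \in \mathbb{R}.
\end{align*}
```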
October 20, 2005 8:48 WSPC/148-RMP
J070-00247
Generalized Weak Weyl Relation and Decay of Quantum Dynamics
1077
2.2. Nonself-adjointness of generalized time operators
In this section, we prove the following theorem:
Theorem 2.8. Assume that $K : \mathbb{R} \to B(\mathcal{H})$ is strongly differentiable on $\mathbb{R}$ and let
\[ K'(t) := \text{s-}\frac{dK(t)}{dt} \tag{2.5} \]
be the strong derivative of $K$ at $t \in \mathbb{R}$. Suppose that $K'(0) \neq 0$, $H$ is semi-bounded (i.e., bounded from below or bounded from above) and
\[ K(t)T \subset TK(t) \tag{2.6} \]
for all $t \in \mathbb{R}$. Then, $T$ is not self-adjoint.
Remark 2.9. In the simple case $K(t) = t$, the fact stated in Theorem 2.8 has been pointed out in [11].
We need some preliminary results.
Proposition 2.10. Suppose that $T$ is self-adjoint. Then, for all $t \in \mathbb{R}$, $T + K(t)$ is self-adjoint and
\[ e^{-isT}e^{-itH} = e^{-itH}e^{-is(T+K(t))}, \quad \forall s, \forall t \in \mathbb{R}. \tag{2.7} \]
Proof. The self-adjointness of T + K(t) follows from a simple application of the Kato–Rellich theorem, since K(t) is bounded and self-adjoint. By (2.1), we have eitH T e−itH = T + K(t)
(2.8)
as operator equality. Hence, by the functional calculus, we have for all $s, t \in \mathbb{R}$, $e^{itH}e^{-isT}e^{-itH} = e^{-is(T+K(t))}$, which is equivalent to (2.7).
Definition 2.11. We say that two self-adjoint operators $A$ and $B$ on a Hilbert space strongly commute if their spectral measures commute: $E_A(J_1)E_B(J_2) = E_B(J_2)E_A(J_1)$ for all Borel sets $J_1, J_2$ in $\mathbb{R}$, where $E_A$ (resp. $E_B$) denotes the spectral measure of $A$ (resp. $B$). For characterizations of the strong commutativity of two self-adjoint operators, we refer the reader to [19, Theorem VIII.13].
Lemma 2.12. Let $A$ and $B$ be self-adjoint operators on a Hilbert space. Suppose that $B$ is bounded and $BA \subset AB$. Then, $A$ and $B$ strongly commute.
Proof. The assumption implies that $BD(A) \subset D(A)$ and $BA\psi = AB\psi$, $\forall \psi \in D(A)$. This implies that, for all $n \in \mathbb{N}$ and $\psi \in D(A)$, $B^n\psi \in D(A)$ and $B^nA\psi = AB^n\psi$. For $N \in \mathbb{N}$ and $t \in \mathbb{R}$, we set $e_N := \sum_{n=0}^{N} (it)^n B^n/n!$. Let $\psi \in D(A)$. Then, $e_NA\psi = Ae_N\psi$. It is easy to see that, for all $\phi \in \mathcal{H}$, $e_N\phi \to e^{itB}\phi$ as $N \to \infty$.
Hence, AeN ψ → eitB Aψ as N → ∞. Since A is closed, it follows that eitB ψ ∈ D(A) and AeitB ψ = eitB Aψ. In particular, eitB D(A) ⊂ D(A) for all t ∈ R and hence, in fact, eitB D(A) = D(A). Therefore, operator equality eitB Ae−itB = A follows. By the functional calculus, we have eitB eisA e−itB = eisA , s, t ∈ R. This implies the strong commutativity of A and B (apply [19, Theorem VIII.13]). Lemma 2.13. Let A and B be strongly commuting self-adjoint operators on a Hilbert space. Then, A + B is essentially self-adjoint and, for all t ∈ R, eit(A+B) = eitA eitB = eitB eitA .
(2.9)
Proof. The essential self-adjointness of $A + B$ follows from the two-variable functional calculus (note that $\overline{A + B} = \int_{\mathbb{R}^2} (\lambda + \mu)\, dE_{A,B}(\lambda, \mu)$, where $E_{A,B}$ is the two-dimensional spectral measure such that $E_{A,B}(J_1 \times J_2) = E_A(J_1)E_B(J_2)$ for all Borel sets $J_1, J_2$ in $\mathbb{R}$). Formula (2.9) follows from the Trotter product formula [19, Theorem VIII.31] and [19, Theorem VIII.13(c)].
Proof of Theorem 2.8. Suppose that $T$ were self-adjoint. Then, it follows from (2.6) and Lemma 2.12 that $K(t)$ and $T$ strongly commute. Hence, by (2.7) and Lemma 2.13, we have for all $s \in \mathbb{R}$, $t \in \mathbb{R}\setminus\{0\}$,
\[ \frac{e^{-itH} - 1}{t}\, e^{-isT}\psi = e^{-isT}\, \frac{e^{-itH} - 1}{t}\, \psi - e^{-itH}e^{-isT}\, \frac{e^{-isK(t)} - 1}{t}\, \psi \tag{2.10} \]
for all $\psi \in \mathcal{H}$. We can write
\[ \frac{e^{-isK(t)} - 1}{t} = -is\,\frac{K(t)}{t} + M_s(t) \]
with
\[ M_s(t) := \sum_{n=2}^{\infty} \frac{(-is)^n}{n!}\, \frac{K(t)}{t}\, K(t)^{n-1} \]
in the operator norm topology. By the strong differentiability of $K$ with $K(0) = 0$ (Proposition 2.1) and the principle of uniform boundedness, we have for each $\delta > 0$,
\[ C_\delta := \sup_{|t|<\delta} \|K(t)/t\| < \infty, \qquad D_\delta := \sup_{|t|<\delta} \|K(t)\| < \infty, \]
where, for a bounded operator $A \in B(\mathcal{H})$, $\|A\|$ denotes the operator norm of $A$. Hence, for all $\psi \in \mathcal{H}$ and $0 < |t| < \delta$,
\[ \|M_s(t)\psi\| \le C_\delta \sum_{n=2}^{\infty} \frac{|s|^n}{n!}\, D_\delta^{\,n-2}\, \|K(t)\psi\|. \]
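Using this estimate and the fact that $K(0) = 0$, we obtain $\lim_{t\to 0} M_s(t)\psi = 0$. Hence,
\[ \lim_{t\to 0} \frac{e^{-isK(t)} - 1}{t}\, \psi = -isK'(0)\psi. \]
Let $\psi \in D(H)$. Then, by (2.10),
\[ \lim_{t\to 0} \frac{e^{-itH} - 1}{t}\, e^{-isT}\psi = e^{-isT}(-iH)\psi - e^{-isT}(-is)K'(0)\psi, \]
which implies that $e^{-isT}\psi \in D(H)$ and $(-iH)e^{-isT}\psi = e^{-isT}(-iH)\psi - e^{-isT}(-is)K'(0)\psi$, i.e., $e^{isT}He^{-isT}\psi = H\psi - sK'(0)\psi$. Hence,
\[ \langle e^{-isT}\psi, He^{-isT}\psi \rangle = \langle \psi, H\psi \rangle - s\,\langle \psi, K'(0)\psi \rangle. \tag{2.11} \]
Since $K'(0)$ is a non-zero bounded self-adjoint operator by the present assumption, there exists a vector $\eta \in D(H)$ such that $\|\eta\| = 1$ and $\langle \eta, K'(0)\eta \rangle \neq 0$. Equation (2.11) implies that
\[ \sup_{s\in\mathbb{R}} \langle e^{-isT}\eta, He^{-isT}\eta \rangle = \infty, \qquad \inf_{s\in\mathbb{R}} \langle e^{-isT}\eta, He^{-isT}\eta \rangle = -\infty. \]
Hence, $H$ is not semi-bounded (note that $\|e^{-isT}\eta\| = 1$ for all $s \in \mathbb{R}$). Thus, we are led to a contradiction.
2.3. Construction of triples obeying the GWWR in direct sums of Hilbert spaces
Let $\mathcal{H}_1$ be a Hilbert space and $\mathcal{F} := \mathcal{H} \oplus \mathcal{H}_1$. Let $(T_1, H_1, K_1)$ be a triple obeying the GWWR in $\mathcal{H}_1$. We define
\[ \tilde{H} := H \oplus H_1 = \begin{pmatrix} H & 0 \\ 0 & H_1 \end{pmatrix}. \tag{2.12} \]
Proposition 2.14. Let $A$ be a bounded linear operator from $\mathcal{H}$ to $\mathcal{H}_1$ with $D(A) = \mathcal{H}$ and
\[ \tilde{T} := \begin{pmatrix} T & A^* \\ A & T_1 \end{pmatrix}, \qquad \tilde{K}(t) := \begin{pmatrix} K(t) & e^{itH}A^*e^{-itH_1} - A^* \\ e^{itH_1}Ae^{-itH} - A & K_1(t) \end{pmatrix}. \tag{2.13} \]
Then, $(\tilde{T}, \tilde{H}, \tilde{K})$ obeys the GWWR in $\mathcal{F}$.
Proof. By the functional calculus, we have $e^{-it\tilde{H}} = e^{-itH} \oplus e^{-itH_1}$ for all $t \in \mathbb{R}$. Then direct computations yield the desired result.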
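The "direct computations" in the proof of Proposition 2.14 can be sketched blockwise. The display below is a formal sketch on $D(T) \oplus D(T_1)$, assuming the block-matrix reading of $\tilde{T}$ and $\tilde{K}(t)$ in (2.13) and applying the operator equality (2.8) to each diagonal entry; domain questions are not addressed here:

```latex
\begin{align*}
e^{it\tilde{H}}\,\tilde{T}\,e^{-it\tilde{H}}
&= \begin{pmatrix}
     e^{itH}Te^{-itH} & e^{itH}A^{*}e^{-itH_1} \\
     e^{itH_1}Ae^{-itH} & e^{itH_1}T_1e^{-itH_1}
   \end{pmatrix} \\
&= \begin{pmatrix}
     T + K(t) & A^{*} + \bigl(e^{itH}A^{*}e^{-itH_1} - A^{*}\bigr) \\
     A + \bigl(e^{itH_1}Ae^{-itH} - A\bigr) & T_1 + K_1(t)
   \end{pmatrix}
 = \tilde{T} + \tilde{K}(t).
\end{align*}
```

Multiplying from the left by $e^{-it\tilde{H}}$ gives $\tilde{T}e^{-it\tilde{H}} = e^{-it\tilde{H}}(\tilde{T} + \tilde{K}(t))$, which is the form of the relation used in (2.1).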
Note that, in Proposition 2.14, $\tilde{T}$ is not diagonal if $A \neq 0$. This construction of a new triple obeying the GWWR obviously yields an algorithm to obtain a triple obeying the GWWR in the $N$-fold direct sum $\oplus_{n=1}^{N} \mathcal{H}_n$ of Hilbert spaces $\mathcal{H}_n$ ($N \ge 2$), provided that, for each $n$, a triple $(T_n, H_n, K_n)$ obeying the GWWR in $\mathcal{H}_n$ is given.
2.4. Perturbations
Let V be a symmetric operator on H and assume that H(V ) := H + V
(2.14)
is essentially self-adjoint. It is an interesting problem to investigate if there exists a symmetric operator TV and an operator-valued function KV : R → B(H) such that (TV , H(V ), KV ) obeys the GWWR. Here, we present only an abstract answer to this problem. Proposition 2.15. Assume that the following conditions (i)–(iii) hold: (i) The operators T, H and K(t) (t ∈ R) are reduced by a closed subspace M of H. We denote their reduced part by TM , HM and KM (t) respectively. (ii) The operator H(V ) is reduced by a closed subspace N of H. (iii) There exists a unitary operator U : M → N such that U HM U −1 = H(V )N . Let TV := (U TM U −1 ) ⊕ 0,
KV (t) := (U KM (t)U −1 ) ⊕ 0
(2.15)
relative to the orthogonal decomposition H = N ⊕N ⊥ . Then, (TV , H(V ), KV ) obeys the GWWR. Proof. It is obvious that TV is symmetric and KV (t) is a bounded selfadjoint operator. By direct computations, one sees that (TV , H(V ), KV ) obeys the GWWR. Remark 2.16. A method to find the unitary operator U in Proposition 2.15 is to use the method of wave operators with respect to the pair (H, H(V )). In that case, U would be one of the wave operators W± := s-limt→±∞ eitH(V ) Je−itH Pac (H) (if they exist) (Pac (H) is the orthogonal projection onto the absolutely continuous space of H and J is a linear operator), M = (ker W± )⊥ and N = Ran(W± ) (e.g., [9, §4.2], [20, p. 34, Proposition 4]). This method was taken in [11, 12] in the case where H is the one-dimensional Laplacian and V is a real-valued function on R (hence, H(V ) is a one-dimensional Schr¨ odinger operator). In the present paper, we do not develop this aspect.
3. Transition Probability Amplitudes and the Point Spectra of Hamiltonians
If a self-adjoint operator $H$ on a Hilbert space $\mathcal{H}$ represents the Hamiltonian of a quantum system, then the transition probability amplitude of an initial state $\psi \in \mathcal{H}$ with $\|\psi\| = 1$ to a state $\phi \in \mathcal{H}$ with $\|\phi\| = 1$ at time $t \in \mathbb{R}$ is given by $\langle \phi, e^{-itH}\psi \rangle$. The square $|\langle \phi, e^{-itH}\psi \rangle|^2$ of its modulus is called the transition probability from $\psi$ at time 0 to $\phi$ at time $t$. In particular, $|\langle \psi, e^{-itH}\psi \rangle|^2$ is referred to as the survival probability, at time $t$, of the state $\psi$.
Let $(T, H, K)$ be a triple obeying the GWWR in a Hilbert space $\mathcal{H}$. The following proposition is concerned with upper bounds of the modulus of a transition probability amplitude in time $t$.
Proposition 3.1. Suppose that there is a constant $\alpha > 0$ such that the strong limit
\[ L_\alpha := \text{s-}\lim_{t\to\infty} \frac{K(t)}{t^\alpha} \in B(\mathcal{H}) \tag{3.1} \]
exists. Let $S$ be a symmetric operator strongly commuting with $H$. Then, for all $\psi, \phi \in D(T) \cap D(S)$ and $t > 0$,
\[ |\langle \psi, e^{-itH}L_\alpha\phi \rangle| \le \frac{\|(T+S)\psi\|\,\|\phi\| + \|\psi\|\,\|(T+S)\phi\|}{t^\alpha} + \|\psi\|\, \Bigl\| \Bigl( L_\alpha - \frac{K(t)}{t^\alpha} \Bigr)\phi \Bigr\|. \tag{3.2} \]
Proof. Let $L = L_\alpha$ and $L(t) := K(t)/t^\alpha$. Then
\[ |\langle \psi, e^{-itH}L\phi \rangle| \le |\langle \psi, e^{-itH}(L - L(t))\phi \rangle| + |\langle \psi, e^{-itH}L(t)\phi \rangle| \le \|\psi\|\,\|(L - L(t))\phi\| + |\langle \psi, e^{-itH}L(t)\phi \rangle|. \]
By Proposition 2.6, we have $e^{-itH}L(t)\phi = t^{-\alpha}[(T+S)e^{-itH}\phi - e^{-itH}(T+S)\phi]$. Using this relation, we have
\[ |\langle \psi, e^{-itH}L(t)\phi \rangle| \le \frac{\|(T+S)\psi\|\,\|\phi\| + \|\psi\|\,\|(T+S)\phi\|}{t^\alpha}. \]
Hence (3.2) follows.
Remark 3.2. Proposition 3.1 is a generalization of [11, Theorem 4.1] where the special case $K(t) = t$ is considered.
The following corollary is a generalized version of [11, Corollary 4.3]:
Corollary 3.3. Suppose that the assumption of Proposition 3.1 holds. Then, for all $\psi, \phi \in \mathcal{H}$,
\[ \lim_{t\to\infty} \langle \psi, e^{-itH}L_\alpha\phi \rangle = 0. \tag{3.3} \]
Proof. The proof is similar to that of [11, Corollary 4.3]. By the polarization identity, we need only to prove (3.3) with φ = ψ. Since D(T ) is dense, there exists
a sequence $\psi_n \in D(T)$ such that $\lim_{n\to\infty} \psi_n = \psi$. Then, in the same way as in the proof of the preceding proposition, we have
\[ |\langle \psi, e^{-itH}L_\alpha\psi \rangle| \le \|\psi - \psi_n\|\,\|L_\alpha\psi\| + \|\psi_n\|\,\|L_\alpha\|\,\|\psi - \psi_n\| + |\langle \psi_n, e^{-itH}L_\alpha\psi_n \rangle|. \]
By (3.2) with $\psi = \phi = \psi_n$ and $S = 0$, we have $\lim_{t\to\infty} \langle \psi_n, e^{-itH}L_\alpha\psi_n \rangle = 0$. Hence,
\[ \limsup_{t\to\infty} |\langle \psi, e^{-itH}L_\alpha\psi \rangle| \le \|\psi - \psi_n\|\,\|L_\alpha\psi\| + \|\psi_n\|\,\|L_\alpha\|\,\|\psi - \psi_n\|. \]
Then, taking $n \to \infty$, we obtain (3.3) with $\phi = \psi$.
This corollary implies an interesting structure of the point spectrum of $H$:
Corollary 3.4. Suppose that the assumption of Proposition 3.1 holds. Then, for all $E \in \mathbb{R}$, $\ker(H - E) \subset \ker L_\alpha$. In particular, if $\ker L_\alpha = \{0\}$, then $\sigma_p(H) = \emptyset$.
Proof. Let $\psi_E \in \ker(H - E)$. Then, $e^{itH}\psi_E = e^{itE}\psi_E$. Taking $\psi = \psi_E$ in (3.3), we obtain $\langle \psi_E, L_\alpha\phi \rangle = 0$ for all $\phi \in \mathcal{H}$. This implies that $L_\alpha\psi_E = 0$, i.e., $\psi_E \in \ker L_\alpha$.
Remark 3.5. Corollary 3.4 is a generalization of [11, Corollary 4.3] where the case $K(t) = t$ is considered.
4. Generalized Weak CCR and Time-Energy Uncertainty Relations
Let $A$, $B$ be symmetric operators on a Hilbert space $\mathcal{H}$ and $C \in B(\mathcal{H})$ be a self-adjoint operator. We say that $(A, B, C)$ obeys the generalized weak CCR (GWCCR) if
\[ \langle A\psi, B\phi \rangle - \langle B\psi, A\phi \rangle = \langle \psi, iC\phi \rangle, \quad \forall \psi, \phi \in D(A) \cap D(B). \tag{4.1} \]
The case $C = I$ (the identity on $\mathcal{H}$) is the usual CCR with one degree of freedom in the sense of sesquilinear form. For a symmetric operator $A$ on a Hilbert space, a constant $a \in \mathbb{R}$ and a unit vector $\psi \in D(A)$, we define
\[ (\Delta A)_\psi(a) := \|(A - a)\psi\|, \tag{4.2} \]
an uncertainty of $A$ in the state vector $\psi$. The quantity $(\Delta A)_\psi(a)$ with $a = \langle \psi, A\psi \rangle$ is the usual uncertainty of $A$ in the state vector $\psi$. We set
\[ (\Delta A)_\psi := (\Delta A)_\psi(\langle \psi, A\psi \rangle). \tag{4.3} \]
We also introduce
\[ \delta_C := \inf_{\psi \in (\ker C)^\perp,\ \|\psi\| = 1} |\langle \psi, C\psi \rangle|. \tag{4.4} \]
Proposition 4.1. Suppose that $(A, B, C)$ obeys the GWCCR. Then, for all $\psi \in D(A) \cap D(B) \cap (\ker C)^\perp$ with $\|\psi\| = 1$ and all $a, b \in \mathbb{R}$,
\[ (\Delta A)_\psi(a)\,(\Delta B)_\psi(b) \ge \frac{\delta_C}{2}. \tag{4.5} \]
Proof. It is easy to see that $(A - a, B - b, C)$ also obeys the GWCCR. Hence, applying the Cauchy–Schwarz inequality on a non-negative, sesquilinear form on a space of linear operators [17, Lemma 1], we have for all $\psi \in D(A) \cap D(B)$
\[ (\Delta A)_\psi(a)\,(\Delta B)_\psi(b) \ge \frac{1}{2}\, |\langle \psi, C\psi \rangle|. \]
Thus (4.5) follows.
Proposition 4.2. Suppose that $(A, B, C)$ obeys the GWCCR with $C \ge 0$. Then, for all $\lambda \in \mathbb{R}$, $\ker(B - \lambda) \cap D(A) \subset \ker C$ and $\ker(A - \lambda) \cap D(B) \subset \ker C$.
Proof. Let $\psi \in \ker(B - \lambda) \cap D(A)$. Then, taking $\phi = \psi$ in (4.1), we have $\langle \psi, C\psi \rangle = 0$. Since $C$ is non-negative, it follows that $C\psi = 0$, i.e., $\psi \in \ker C$.
The following proposition gives a connection of the GWWR with the GWCCR:
Proposition 4.3. Let $(T, H, K)$ be a triple obeying the GWWR in $\mathcal{H}$. Assume that $K$ is strongly differentiable on $\mathbb{R}$. Then $(T, H, K'(0))$ obeys the GWCCR:
\[ \langle T\psi, H\phi \rangle - \langle H\psi, T\phi \rangle = \langle \psi, iK'(0)\phi \rangle, \quad \psi, \phi \in D(T) \cap D(H). \tag{4.6} \]
Proof. Let $\psi, \phi \in D(T) \cap D(H)$. Then we have by (1.2),
\[ \langle T\psi, e^{-itH}\phi \rangle = \langle e^{itH}\psi, T\phi \rangle + \langle e^{itH}\psi, K(t)\phi \rangle. \tag{4.7} \]
It is well known that, for all $\eta \in D(H)$, $e^{isH}\eta$ is strongly differentiable with $\text{s-}d\,e^{isH}\eta/ds = iHe^{isH}\eta = ie^{isH}H\eta$. Hence, both sides of (4.7) are differentiable in $t$. Evaluating the derivatives at $t = 0$ and using (2.2), we obtain (4.6).
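The differentiation step at the end of the proof of Proposition 4.3 can be written out as follows. This is a sketch that assumes the inner product is linear in its second argument (consistent with the form of (4.1)) and uses $K(0) = 0$ from (2.2), so that $K(t)\phi/t \to K'(0)\phi$ as $t \to 0$:

```latex
\begin{align*}
\frac{d}{dt}\,\langle T\psi, e^{-itH}\phi\rangle\Big|_{t=0}
  &= -i\,\langle T\psi, H\phi\rangle, \\
\frac{d}{dt}\,\langle e^{itH}\psi, T\phi\rangle\Big|_{t=0}
  &= \langle iH\psi, T\phi\rangle = -i\,\langle H\psi, T\phi\rangle, \\
\lim_{t\to 0}\frac{1}{t}\,\langle e^{itH}\psi, K(t)\phi\rangle
  &= \langle \psi, K'(0)\phi\rangle .
\end{align*}
```

Equating the derivatives of both sides of (4.7) gives $-i\langle T\psi, H\phi\rangle = -i\langle H\psi, T\phi\rangle + \langle \psi, K'(0)\phi\rangle$, and multiplying by $i$ yields (4.6). In the special case $K(t) = t$ one has $K'(0) = I$, so (4.6) reduces to the usual CCR in the sense of sesquilinear forms.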
Propositions 4.3 and 4.1 yield the following result:
Corollary 4.4. Suppose that the same assumption as in Proposition 4.3 holds. Then, for all $\psi \in D(T) \cap D(H) \cap (\ker K'(0))^\perp$ with $\|\psi\| = 1$ and all $t, E \in \mathbb{R}$,
\[ (\Delta T)_\psi(t)\,(\Delta H)_\psi(E) \ge \frac{\delta_{K'(0)}}{2}. \tag{4.8} \]
In applications to quantum theory, (4.8) gives a time-energy uncertainty relation if $H$ is the Hamiltonian of a quantum system.
5. The Point Spectra of Generalized Time Operators
For a linear operator $L$ on a Hilbert space $\mathcal{H}$, we introduce a subset of $\mathcal{H}$:
\[ N_0(L) := \{\psi \in D(L) \mid \langle \psi, L\psi \rangle = 0\}. \tag{5.1} \]
It is obvious that $\ker L \subset N_0(L)$.
Remark 5.1. If $L$ is a non-negative self-adjoint operator, then $N_0(L) = \ker L$.
Proposition 5.2. Assume that $(T, H, K)$ obeys the GWWR and $K$ is strongly differentiable on $\mathbb{R}$. Then, for all $E \in \mathbb{R}$,
\[ \ker(T - E) \subset N_0(K'(0)). \tag{5.2} \]
Proof. Let $\psi_0 \in \ker(T - E)$. Then $T\psi_0 = E\psi_0$. Taking the inner product of $\psi_0$ with the vector obtained by applying (2.1) to $\psi_0$, we have $\langle \psi_0, e^{-itH}K(t)\psi_0 \rangle = 0$, $t \in \mathbb{R}$. Dividing both sides by $t \neq 0$ and taking the limit $t \to 0$, we obtain $\langle \psi_0, K'(0)\psi_0 \rangle = 0$, where we have used (2.2). Hence, $\psi_0 \in N_0(K'(0))$. Thus, (5.2) follows.
Corollary 5.3. Assume that $(T, H, K)$ obeys the GWWR and $K$ is strongly differentiable on $\mathbb{R}$. Then: (i) If $N_0(K'(0)) = \{0\}$, then $\sigma_p(T) = \emptyset$. (ii) If $K'(0) \ge 0$ or $K'(0) \le 0$, then $\sigma_p(T|[D(T) \cap (\ker K'(0))^\perp]) = \emptyset$.
Remark 5.4. Corollary 5.3 is a generalization of [11, Corollary 4.2] where the case $K(t) = t$ is considered.
Remark 5.5. It may be interesting to note that the behavior of $K(t)$ at $t = 0$ and $t = \infty$ is respectively related to $\sigma_p(T)$ (Corollary 5.3) and $\sigma_p(H)$ (Corollary 3.4).
6. Commutation Formulas and Absolute Continuity
In this section, we prove commutation relations derived from the GWWR. Moreover, in the special case where the commutation factor $K(t)$ is of the form $tC$ with $C$ a bounded self-adjoint operator, we show that $H$ is reduced by Ran($C$) (Ran($C$) denotes the range of $C$) and its reduced part is absolutely continuous.
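6.1. General cases
For $p \ge 0$, we introduce a class of Borel measurable functions on $\mathbb{R}$:
\[ L^1_p(\mathbb{R}) := \Bigl\{ F : \mathbb{R} \to \mathbb{C},\ \text{Borel measurable} \ \Big|\ \int_{\mathbb{R}} |F(t)|(1 + |t|^p)\, dt < \infty \Bigr\}. \tag{6.1} \]
It is easy to see that $L^1_p(\mathbb{R})$ includes the space $\mathcal{S}(\mathbb{R})$ of rapidly decreasing $C^\infty$ functions on $\mathbb{R}$. We say that a Borel measurable function $f$ is in the set $M_p$ if it is the Fourier transform of an element $F_f \in L^1_p(\mathbb{R})$:
\[ f(\lambda) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} F_f(t)e^{-it\lambda}\, dt, \quad \lambda \in \mathbb{R}. \tag{6.2} \]
Note that, for each $f \in M_p$, $F_f$ is uniquely determined. We have
\[ \mathcal{S}(\mathbb{R}) \subset M_p. \tag{6.3} \]
Moreover, every element $f$ of $M_p$ is bounded, $[p]$ times continuously differentiable ($[p]$ denotes the largest integer not exceeding $p$) and, for $j = 1, \ldots, [p]$, $d^jf/d\lambda^j$ is bounded.
Let $H$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$ and $S : \mathbb{R} \to B(\mathcal{H})$ be Borel measurable such that, for all $\psi \in \mathcal{H}$,
\[ \|S(t)\psi\| \le c(1 + |t|^p)\|\psi\|, \quad t \in \mathbb{R}, \]
with constants $c > 0$ and $p \ge 0$ independent of $\psi$. Then, for all $\psi \in \mathcal{H}$ and $f \in M_p$, the strong integral
\[ f(H, S)\psi := \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} F_f(t)e^{-itH}S(t)\psi\, dt \tag{6.4} \]
exists and $f(H, S) \in B(\mathcal{H})$.
Theorem 6.1. Assume that $(T, H, K)$ obeys the GWWR. Suppose that $K$ is strongly continuous and, for all $\psi \in \mathcal{H}$,
\[ \|K(t)\psi\| \le c(1 + |t|^p)\|\psi\|, \tag{6.5} \]
where $c > 0$ and $p \ge 0$ are constants independent of $\psi$. Let $f \in M_p$. Then, for all $\psi \in D(\bar{T})$, we have $f(H)\psi \in D(\bar{T})$ and
\[ \bar{T}f(H)\psi = f(H)\bar{T}\psi + f(H, K)\psi, \tag{6.6} \]
where $f(H) := \int_{\mathbb{R}} f(\lambda)\, dE_H(\lambda)$.
Proof. Let $f \in M_p$ and $\psi \in D(T^*)$, $\phi \in D(\bar{T})$. Then, by the functional calculus of the self-adjoint operator $H$, we have
\[ \langle T^*\psi, f(H)\phi \rangle = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} F_f(t)\, \langle T^*\psi, e^{-itH}\phi \rangle\, dt. \]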
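Since $e^{-itH}\phi \in D(\bar{T}) = D(T^{**})$ by Proposition 2.2, it follows that $\langle T^*\psi, e^{-itH}\phi \rangle = \langle \psi, \bar{T}e^{-itH}\phi \rangle$. Using the fact that $(\bar{T}, H, K)$ obeys the GWWR (Proposition 2.2), we see that the r.h.s. is equal to $\langle \psi, e^{-itH}\bar{T}\phi \rangle + \langle \psi, e^{-itH}K(t)\phi \rangle$. Hence, we obtain
\[ \langle T^*\psi, f(H)\phi \rangle = \langle \psi, f(H)\bar{T}\phi \rangle + \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} F_f(t)\, \langle \psi, e^{-itH}K(t)\phi \rangle\, dt. \]
Therefore, $\langle T^*\psi, f(H)\phi \rangle = \langle \psi, f(H)\bar{T}\phi + f(H, K)\phi \rangle$. Since $\psi \in D(T^*)$ is arbitrary, it follows that $f(H)\phi \in D(T^{**}) = D(\bar{T})$ and (6.6) holds.
6.2. A special case
In this section, we consider a special case of a triple $(T, H, K)$ obeying the GWWR in a Hilbert space $\mathcal{H}$: We assume that $K$ is of the form
\[ K_C(t) := tC, \quad t \in \mathbb{R}, \tag{6.7} \]
with $C$ being a bounded self-adjoint operator on $\mathcal{H}$. In this case, a more detailed analysis is possible as shown below. We set
\[ C^1_b(\mathbb{R}) := \{ f \in C^1(\mathbb{R}) \mid f \text{ and } f' \text{ are bounded} \}, \tag{6.8} \]
\[ C^1_{b,+}(\mathbb{R}) := \Bigl\{ f \in C^1(\mathbb{R}) \ \Big|\ \text{for some } a \in \mathbb{R},\ \sup_{\lambda \ge a} |f(\lambda)| < \infty \text{ and } \sup_{\lambda \ge a} |f'(\lambda)| < \infty \Bigr\}. \tag{6.9} \]
Theorem 6.2. Let $C$ be a bounded self-adjoint operator on $\mathcal{H}$ and suppose that $(T, H, K_C)$ obeys the GWWR.
(i) Let $f \in C^1_b(\mathbb{R})$. Then, $f(H)D(\bar{T}) \subset D(\bar{T})$ and
\[ \bar{T}f(H)\psi - f(H)\bar{T}\psi = if'(H)C\psi \tag{6.10} \]
for all $\psi \in D(\bar{T})$.
(ii) Suppose that $H$ is bounded from below. Then, for all $f \in C^1_{b,+}(\mathbb{R})$, the same conclusion as that of part (i) holds. In particular, for all $z \in \mathbb{C}$ with $\operatorname{Re} z > 0$, $e^{-zH}D(\bar{T}) \subset D(\bar{T})$ and, for all $\psi \in D(\bar{T})$,
\[ \bar{T}e^{-zH}\psi - e^{-zH}\bar{T}\psi = -ize^{-zH}C\psi. \tag{6.11} \]
Proof. (i) We first consider the case where $f \in M_1$. Then, it is easy to see that $f(H, K_C) = if'(H)C$. By this fact and Theorem 6.1, (6.10) holds. We next consider the case where $f$ is an arbitrary element in the set $C^1_0(\mathbb{R}) := \{ f \in C^1(\mathbb{R}) \mid \operatorname{supp} f \text{ is compact} \}$, where $\operatorname{supp} f$ means the support of $f$. For a function $g$ on $\mathbb{R}$, we set $\|g\|_\infty := \sup_{\lambda \in \mathbb{R}} |g(\lambda)|$.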
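Using the Friedrichs mollifier, we can find a sequence $\{f_n\}_{n=1}^{\infty} \subset C_0^\infty(\mathbb{R})$ (the set of infinitely differentiable functions on $\mathbb{R}$ with compact support) such that $\sup_{n\ge1} \|f_n\|_\infty < \infty$, $\sup_{n\ge1} \|f_n'\|_\infty < \infty$ and, for all $\lambda \in \mathbb{R}$, $f_n(\lambda) \to f(\lambda)$, $f_n'(\lambda) \to f'(\lambda)$ as $n \to \infty$. Then, by using the functional calculus for $H$, we see that
\[ f_n(H) \to f(H), \qquad f_n'(H) \to f'(H) \]
strongly as $n \to \infty$. By the fact that $C_0^\infty(\mathbb{R}) \subset M_1$ and the preceding result, we have $\bar{T}f_n(H)\psi = f_n(H)\bar{T}\psi + if_n'(H)C\psi$ for all $\psi \in D(\bar{T})$. Hence, $\bar{T}f_n(H)\psi \to f(H)\bar{T}\psi + if'(H)C\psi$ as $n \to \infty$. Since $\bar{T}$ is closed, it follows that $f(H)\psi \in D(\bar{T})$ and (6.10) holds.
Finally we consider the case where $f \in C^1_b(\mathbb{R})$. Let $\chi \in C_0^\infty(\mathbb{R})$ such that $\chi(0) = 1$ and set $f_n(\lambda) := \chi(\lambda/n)f(\lambda)$. Then, $f_n \in C^1_0(\mathbb{R})$ and
\[ f_n'(\lambda) = \frac{1}{n}\, \chi'\Bigl(\frac{\lambda}{n}\Bigr) f(\lambda) + \chi\Bigl(\frac{\lambda}{n}\Bigr) f'(\lambda). \]
It is easy to see that, for all $\lambda \in \mathbb{R}$, $f_n(\lambda) \to f(\lambda)$, $f_n'(\lambda) \to f'(\lambda)$ as $n \to \infty$ and
\[ |f_n(\lambda)| \le \|\chi\|_\infty \|f\|_\infty, \qquad |f_n'(\lambda)| \le \|\chi'\|_\infty \|f\|_\infty + \|\chi\|_\infty \|f'\|_\infty. \]
Hence, by the functional calculus for $H$, we obtain that $f_n(H) \to f(H)$, $f_n'(H) \to f'(H)$ strongly as $n \to \infty$. Then, in the same manner as in the preceding paragraph, we obtain the desired conclusion.
(ii) Let $H \ge -M$ with a constant $M \ge 0$ and $f \in C^1_{b,+}(\mathbb{R})$. Let $\chi$ be as above and define $f_n(\lambda) := \chi(\lambda/n)f(\lambda)$. Then, $f_n \in C^1_0(\mathbb{R})$. Hence, by part (i),
\[ \bar{T}f_n(H)\psi = f_n(H)\bar{T}\psi + if_n'(H)C\psi \tag{6.12} \]
for all $\psi \in D(\bar{T})$. We have for all $\psi \in \mathcal{H}$
\[ \|f_n(H)\psi - f(H)\psi\|^2 = \int_{[-M,\infty)} F_n(\lambda)\, d\|E_H(\lambda)\psi\|^2 \tag{6.13} \]
with
\[ F_n(\lambda) := \Bigl| \chi\Bigl(\frac{\lambda}{n}\Bigr) - 1 \Bigr|^2 |f(\lambda)|^2. \]
We have
\[ |F_n(\lambda)| \le (\|\chi\|_\infty + 1)^2 \sup_{\lambda \ge -M} |f(\lambda)|^2, \quad \lambda \in [-M, \infty). \]
Hence, by the Lebesgue dominated convergence theorem, the r.h.s. of (6.13) converges to 0 as $n \to \infty$. Hence, $f_n(H) \to f(H)$ strongly as $n \to \infty$. Similarly, we can show that $f_n'(H) \to f'(H)$ strongly as $n \to \infty$. Thus, in the same way as in part (i), we obtain (6.10). If $\operatorname{Re} z > 0$ ($z \in \mathbb{C}$), then the function $f_z : \mathbb{R} \to \mathbb{C}$ defined by $f_z(\lambda) := e^{-z\lambda}$ is in $C^1_{b,+}(\mathbb{R})$. Hence, by applying the preceding result, we obtain (6.11).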
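The last step of the proof can be made explicit. The following sketch assumes $\operatorname{Re} z > 0$ and the notation $f_z(\lambda) := e^{-z\lambda}$ from the proof; it checks the membership $f_z \in C^1_{b,+}(\mathbb{R})$ and then substitutes $f_z$ into (6.10):

```latex
\begin{align*}
f_z'(\lambda) &= -z\,e^{-z\lambda}, \\
\sup_{\lambda \ge a} |f_z(\lambda)| &= e^{-(\operatorname{Re} z)\,a} < \infty, \qquad
\sup_{\lambda \ge a} |f_z'(\lambda)| = |z|\,e^{-(\operatorname{Re} z)\,a} < \infty
  \quad (a \in \mathbb{R}), \\
\bar{T}e^{-zH}\psi - e^{-zH}\bar{T}\psi
  &= i\,f_z'(H)\,C\psi = -iz\,e^{-zH}C\psi, \qquad \psi \in D(\bar{T}).
\end{align*}
```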
Remark 6.3. Theorem 6.2 is a slight generalization of [21, 3.1, Proposition 1]. Proposition 6.4. Let C be a bounded self-adjoint operator on H and suppose that (T, H, KC ) obeys the GWWR. Then, H and C strongly commute. Proof. We can apply Proposition 2.7 to obtain that e−itH CeitH = C, ∀ t ∈ R. This implies the strong commutativity of H and C. Lemma 6.5. Let A and B be strongly commuting self-adjoint operators on a Hilbert space H. Then, Ran(B) reduces A. Proof. Since we have the orthogonal decomposition H = ker B ⊕ Ran(B), it is sufficient to prove that ker B reduces A. Let P be the orthogonal projection onto ker B. Then, we have P = EB ({0}). Hence, by the strong commutativity of A and B, eitA P = P eitA for all t ∈ R. This implies that P A ⊂ AP . Thus, ker B reduces A. This lemma and Proposition 6.4 imply the following fact: Corollary 6.6. Under the same assumption as in Proposition 6.4, H is reduced by Ran(C). As in the case of [21, 3.2, Corollary 2], we have from Theorem 6.2 and Corollary 6.6 the following theorem. For a self-adjoint operator H, we set EH (λ) := EH ((−∞, λ]),
λ ∈ R.
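Theorem 6.7. Suppose that $(T, H, K_C)$ obeys the GWWR. Then, $H$ is reduced by Ran($C$) and the reduced part $H|\mathrm{Ran}(C)$ is absolutely continuous. Moreover, for all $\psi, \phi \in D(\bar{T})$, the Radon–Nikodym derivative $d\langle \psi, E_H(\lambda)C\phi \rangle/d\lambda$ is given by
\[ \frac{d\langle \psi, E_H(\lambda)C\phi \rangle}{d\lambda} = i \bigl( \langle \bar{T}\psi, E_H(\lambda)\phi \rangle - \langle E_H(\lambda)\psi, \bar{T}\phi \rangle \bigr). \tag{6.14} \]
Proof. The reducibility of $H$ by Ran($C$) has already been proved in Corollary 6.6. Let $f \in \mathcal{S}(\mathbb{R})$ and $\psi, \phi \in D(\bar{T})$. It is easy to see that $\mathcal{S}(\mathbb{R}) \subset C^1_b(\mathbb{R})$. Hence, by Theorem 6.2, we have
\[ \langle \bar{T}\psi, f(H)\phi \rangle - \langle \psi, f(H)\bar{T}\phi \rangle = i\langle \psi, f'(H)C\phi \rangle. \tag{6.15} \]
Let $\mu(\lambda) := \langle \psi, E_H(\lambda)C\phi \rangle$ and $\sigma(\lambda) := \langle \bar{T}\psi, E_H(\lambda)\phi \rangle - \langle E_H(\lambda)\psi, \bar{T}\phi \rangle$. Then, by (6.15) and the spectral theorem, we have
\[ \int_{\mathbb{R}} f(\lambda)\, d\sigma(\lambda) = i \int_{\mathbb{R}} f'(\lambda)\, d\mu(\lambda). \]
Applying to the l.h.s. the integration by parts formula on the Stieltjes integral, we have
\[ -\int_{\mathbb{R}} f'(\lambda)\sigma(\lambda)\, d\lambda = i \int_{\mathbb{R}} f'(\lambda)\, d\mu(\lambda). \tag{6.16} \]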
It is easy to see that the functional $\Phi : \mathcal{S}(\mathbb{R}) \to \mathbb{C}$ defined by
\[ \Phi(g) := i \int_{\mathbb{R}} g(\lambda)\, d\mu(\lambda) + \int_{\mathbb{R}} g(\lambda)\sigma(\lambda)\, d\lambda, \quad g \in \mathcal{S}(\mathbb{R}), \]
is a tempered distribution on $\mathbb{R}$. Equation (6.16) implies that $\Phi(f') = 0$ for all $f \in \mathcal{S}(\mathbb{R})$. Hence, $\Phi(f) = \alpha \int_{\mathbb{R}} f(\lambda)\, d\lambda$ with $\alpha$ being a constant. This implies that the measure $\mu$ is absolutely continuous and its Radon–Nikodym derivative $d\mu/d\lambda$ is given by
\[ \frac{d\mu(\lambda)}{d\lambda} = -i\alpha + i\sigma(\lambda). \tag{6.17} \]
By a limiting argument, the absolute continuity of $\mu$ can be extended to that of $\langle \psi, E_H(\cdot)C\phi \rangle$ for all $\psi, \phi \in \mathcal{H}$. Hence, in particular, $H|\mathrm{Ran}(C)$ is absolutely continuous. It follows from (6.17) that, for all $\lambda \in \mathbb{R}$,
\[ |\alpha| \le |\mu(\lambda + 1) - \mu(\lambda)| + \sup_{\lambda' \in (\lambda, \lambda+1]} |\sigma(\lambda')|. \]
Noting the fact that $\lim_{\lambda\to-\infty} \mu(\lambda) = 0$, $\lim_{\lambda\to-\infty} \sigma(\lambda) = 0$, we obtain $\alpha = 0$. Hence, (6.14) follows.
7. Absence of Minimum-Uncertainty States
Let $(A, B, C)$ be a triple obeying the GWCCR. A vector $\psi_0 \in D(A) \cap D(B) \cap (\ker C)^\perp$ with $\|\psi_0\| = 1$ which attains the equality $(\Delta A)_{\psi_0}(\Delta B)_{\psi_0} = \delta_C/2 > 0$ in the uncertainty relation (4.5) with $a = \langle \psi_0, A\psi_0 \rangle$ and $b = \langle \psi_0, B\psi_0 \rangle$ is called a minimum-uncertainty state for $(A, B, C)$.
Remark 7.1. It is well known that the Schrödinger representation $(q, p)$ of the CCR has minimum-uncertainty states. For example, the vector $f_0 \in L^2(\mathbb{R})$ given by $f_0(x) := (2\pi)^{-1/4}\sigma^{-1/2}e^{-(x-a)^2/(4\sigma^2)}$, $x \in \mathbb{R}$, with $a \in \mathbb{R}$ and $\sigma > 0$ being constants is a minimum-uncertainty state for $(q, p)$: $(\Delta q)_{f_0}(\Delta p)_{f_0} = 1/2$. It follows from this fact that every representation $(Q, P)$ of the CCR unitarily equivalent to the Schrödinger one has minimum-uncertainty states. In particular, by the von Neumann uniqueness theorem mentioned in the Introduction of the present paper, every Weyl representation on a separable Hilbert space has minimum-uncertainty states. Also, the Fock representation of the CCR with one degree of freedom has minimum-uncertainty states (see, e.g., [10, 11.5.1, 11.5.2]).
In this section, in contrast to the facts stated in Remark 7.1, we give a sufficient condition for a triple $(T, H, C)$ to have no minimum-uncertainty states.
Theorem 7.2 (Absence of Minimum-Uncertainty State). Suppose that $(T, H, K_C)$ obeys the GWWR with $T$ being closed. Assume that $H$ is bounded from below and that $C \ge 0$ with $\delta_C > 0$. Then, there exist no vectors $\psi_0 \in D(H) \cap D(T) \cap (\ker C)^\perp$ with $\|\psi_0\| = 1$ such that
\[ (\Delta T)_{\psi_0}(\Delta H)_{\psi_0} = \frac{\delta_C}{2} > 0. \tag{7.1} \]
Remark 7.3. An essential condition in this theorem is the boundedness below of $H$ (note that the operators $q$ and $p$ in the Schrödinger representation of the CCR are unbounded both above and below with $\sigma(q) = \sigma(p) = \mathbb{R}$).
Remark 7.4. Theorem 7.2 is an extension of [11, Theorem 5.1], where the case $C = I$ is considered. A new point here is that one does not need to assume the analytic continuation property of the weak Weyl relation (the GWWR with $C = I$) as is done in [11, Theorem 5.1].
To prove Theorem 7.2, we need two lemmas.
Lemma 7.5. Assume that $(A, B, C)$ obeys the GWCCR with $C \ge 0$. Suppose that there exists a unit vector $\psi \in D(A) \cap D(B) \cap (\ker C)^\perp$ such that
\[ (A + aB + b)\psi = 0, \qquad \langle A\psi, B\psi \rangle + \langle B\psi, A\psi \rangle - 2\langle \psi, A\psi \rangle \langle \psi, B\psi \rangle = 0, \]
where $a$ and $b$ are complex constants. Then, $\operatorname{Re} a = 0$ and $\operatorname{Im} a > 0$.
Proof. In the same way as in the proof of [11, Lemma 5.3], we can show that $i\langle \psi, C\psi \rangle = 2a(\Delta B)_\psi^2$. By Proposition 4.2, $\psi$ cannot be an eigenvector of $B$. Hence, $(\Delta B)_\psi \neq 0$. Thus, $\operatorname{Re} a = 0$ and $\langle \psi, C\psi \rangle = 2(\operatorname{Im} a)(\Delta B)_\psi^2$. The l.h.s. of the second equation is strictly positive, since $C \ge 0$ and $\psi \in (\ker C)^\perp$. Hence, $\operatorname{Im} a > 0$.
For a self-adjoint operator $A$ on a Hilbert space, we define
\[ E_0(A) := \inf \sigma(A), \tag{7.2} \]
called the ground state energy of $A$, provided that $A$ is bounded from below. If $E_0(A)$ is an eigenvalue of $A$, then a non-zero vector in $\ker(A - E_0(A))$ is called a ground state of $A$. By the variational principle, one has
\[ E_0(A) = \inf_{\psi \in D(A),\ \|\psi\| = 1} \langle \psi, A\psi \rangle. \tag{7.3} \]
The following lemma is well known (e.g., [4, Theorem 6.16]).
Lemma 7.6. Let $A$ be a self-adjoint operator on a Hilbert space and bounded from below. Suppose that there exists a unit vector $\psi_0 \in D(A)$ such that $\langle \psi_0, A\psi_0 \rangle = E_0(A)$. Then, $A\psi_0 = E_0(A)\psi_0$, i.e., $\psi_0$ is a ground state.
Proof of Theorem 7.2. It is sufficient to prove Theorem 7.2 in the case where $H \ge 0$. Suppose that there existed a unit vector $\psi_0 \in D(T) \cap D(H) \cap (\ker C)^\perp$ satisfying (7.1). Set $\hat{T} := T - \langle \psi_0, T\psi_0 \rangle$, $\hat{H} := H - \langle \psi_0, H\psi_0 \rangle$. Then, from the derivation of (4.5) [17, Lemma 1], we have
\[ \|\hat{T}\psi_0\|\,\|\hat{H}\psi_0\| = |\langle \hat{T}\psi_0, \hat{H}\psi_0 \rangle| = |\operatorname{Im} \langle \hat{T}\psi_0, \hat{H}\psi_0 \rangle| = \frac{1}{2} \langle \psi_0, C\psi_0 \rangle = \frac{1}{2}\delta_C. \tag{7.4} \]
The first equality is the equality in the Schwarz inequality for $|\langle \hat{T}\psi_0, \hat{H}\psi_0 \rangle|$. By this fact and the condition $\delta_C > 0$, there exists a constant $c \in \mathbb{C}$, $c \neq 0$, such that
\[ \hat{T}\psi_0 + c\hat{H}\psi_0 = 0. \tag{7.5} \]
The second equality in (7.4) implies that $\operatorname{Re} \langle \hat{T}\psi_0, \hat{H}\psi_0 \rangle = 0$. Hence
\[ \langle T\psi_0, H\psi_0 \rangle + \langle H\psi_0, T\psi_0 \rangle - 2\langle \psi_0, T\psi_0 \rangle \langle \psi_0, H\psi_0 \rangle = 0. \tag{7.6} \]
The fourth equality in (7.4) and Lemma 7.6 imply that
\[ C\psi_0 = \delta_C\psi_0. \tag{7.7} \]
By (7.5), (7.6) and Lemma 7.5, we have $\operatorname{Re} c = 0$ and $\operatorname{Im} c > 0$. Hence, let $c = ia$ with $a > 0$. Then (7.5) implies that
\[ T\psi_0 + iaH\psi_0 + b\psi_0 = 0, \tag{7.8} \]
where $b$ is a constant. Hence, $\langle \psi_0, T\psi_0 \rangle + ia\langle \psi_0, H\psi_0 \rangle + b = 0$, which implies that $\operatorname{Im} b = -a\langle \psi_0, H\psi_0 \rangle < 0$, since $H \ge 0$ and $H^{1/2}\psi_0 \neq 0$ (see (7.5)). Let $z \in \mathbb{C}$ with $\operatorname{Re} z > 0$. Then, by (6.11), (7.8) and (7.7), we have
\[ Te^{-zH}\psi_0 = e^{-zH}(-iaH\psi_0 - b\psi_0) - iz\delta_C e^{-zH}\psi_0. \]
Since $\operatorname{Re}(ib/\delta_C) > 0$, we can take $z = ib/\delta_C$ so that $Te^{-zH}\psi_0 = -iae^{-zH}H\psi_0$. Hence,
\[ \langle e^{-zH}\psi_0, Te^{-zH}\psi_0 \rangle = -ia\, \langle e^{-zH}\psi_0, He^{-zH}\psi_0 \rangle. \]
The l.h.s. is real and the r.h.s. is pure imaginary. Hence, both sides must vanish. Hence, $\langle e^{-zH}\psi_0, He^{-zH}\psi_0 \rangle = 0$, which implies that $H^{1/2}\psi_0 = 0$. But this is a contradiction.
8. Power Decays of Transition Probability Amplitudes in Quantum Dynamics
In Sec. 3, we have derived an estimate for transition probability amplitudes in time $t$. In this section, we consider a triple $(T, H, K_C)$ obeying the GWWR (discussed in Sec. 6.2) and show that, for state vectors in "smaller" subspaces, transition probability amplitudes decay in powers of $t$ as $|t| \to \infty$. We apply the results to two-point correlation functions of Heisenberg operators. We also discuss decays of heat semigroups $e^{-\beta H}$ in $\beta > 0$, the inverse of the absolute temperature.
Let $H$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$ and $C \neq 0$ be a bounded self-adjoint operator on $\mathcal{H}$. We introduce a set of generalized time operators:
\[ \mathbf{T}(H, C) := \{ T \mid (T, H, K_C) \text{ obeys the GWWR} \}. \tag{8.1} \]
By Proposition 2.6, if $T \in \mathbf{T}(H, C)$, then $T + S \in \mathbf{T}(H, C)$ for all symmetric operators $S$ on $\mathcal{H}$ strongly commuting with $H$ such that $D(T) \cap D(S)$ is dense in $\mathcal{H}$.
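8.1. A simple case
Theorem 8.1. Let $T \in \mathbf{T}(H, C)$ and $\psi, \phi \in D(T)$. Then, for all $t \in \mathbb{R}\setminus\{0\}$,
\[ |\langle \phi, e^{-itH}C\psi \rangle| \le \frac{1}{|t|} \bigl( \|T\phi\|\,\|\psi\| + \|\phi\|\,\|T\psi\| \bigr). \tag{8.2} \]
Proof. In the present case, we have $L_\alpha = C$ with $\alpha = 1$. Hence Proposition 3.1 gives the desired result.
Remark 8.2. For vectors $\phi, \psi \in \mathcal{H}$, we can define a set of operators $\mathbf{T}_{\phi,\psi}(H, C) := \{ T \in \mathbf{T}(H, C) \mid \phi, \psi \in D(T) \}$ and put
\[ c_{\phi,\psi} := \inf_{T \in \mathbf{T}_{\phi,\psi}(H,C)} \bigl( \|T\phi\|\,\|\psi\| + \|\phi\|\,\|T\psi\| \bigr), \]
then (8.2) implies that
\[ |\langle \phi, e^{-itH}C\psi \rangle| \le \frac{c_{\phi,\psi}}{|t|}. \tag{8.3} \]
Remark 8.3. Let $T \in \mathbf{T}(H, C)$. Then, for all $\psi \in D(T)$ with $\|\psi\| = 1$, $T - \langle \psi, T\psi \rangle$ is in the set $\mathbf{T}(H, C)$. Hence (8.2) implies that
\[ |\langle \psi, e^{-itH}C\psi \rangle|^2 \le \frac{4(\Delta T)_\psi^2}{t^2}. \tag{8.4} \]
Hence Theorem 8.1 gives a generalization of [11, Theorem 4.1].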
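For concreteness, the step from (8.2) to (8.4) can be spelled out. The sketch below introduces the shorthand $T_\psi := T - \langle \psi, T\psi \rangle$ (a notation used only here) and applies (8.2) with $T$ replaced by $T_\psi$ and $\phi = \psi$, $\|\psi\| = 1$:

```latex
\begin{align*}
|\langle \psi, e^{-itH}C\psi\rangle|
  &\le \frac{1}{|t|}\bigl(\|T_\psi\psi\|\,\|\psi\| + \|\psi\|\,\|T_\psi\psi\|\bigr)
   = \frac{2\,(\Delta T)_\psi}{|t|}, \\
|\langle \psi, e^{-itH}C\psi\rangle|^{2}
  &\le \frac{4\,(\Delta T)_\psi^{2}}{t^{2}} .
\end{align*}
```

Here $\|T_\psi\psi\| = (\Delta T)_\psi$ by definitions (4.2) and (4.3).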
8.2. Higher order decays in smaller subspaces
As demonstrated in a concrete example [11, Proposition 3.2], the modulus $|\langle \phi, e^{-itH}\psi \rangle|$ of a transition probability amplitude may decay faster than $|t|^{-1}$ as $|t| \to \infty$ for a class of vectors $\phi$ and $\psi$. In this subsection we investigate this aspect in an abstract framework and show that $\langle \phi, e^{-itH}\psi \rangle$ decays faster than $|t|^{-1}$ for all $\phi$ and $\psi$ in smaller subspaces.
Lemma 8.4. Let $T \in \mathbf{T}(H, C)$. Suppose that
\[ CT \subset TC. \tag{8.5} \]
Then, for all $n \in \mathbb{N}$ and $t \in \mathbb{R}$,
\[ e^{-itH}D(T^n) = D(T^n), \tag{8.6} \]
\[ T^ne^{-itH} = e^{-itH}(T + tC)^n. \tag{8.7} \]
Proof. By (2.8), we have the operator equality $e^{itH}T^ne^{-itH} = (T + tC)^n$. Condition (8.5) implies that $D((T + tC)^n) = D(T^n)$. Hence (8.6) follows and (8.7) holds.
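Theorem 8.5. Let $T \in \mathbf{T}(H, C)$. Assume (8.5). Let $n \in \mathbb{N}$ and $\psi, \phi \in D(T^n)$. We define constants $d_k^T(\phi, \psi)$, $k = 1, \ldots, n$, by the following recursion relation:
\[ d_1^T(\phi, \psi) := \|T\phi\|\,\|\psi\| + \|\phi\|\,\|T\psi\|, \tag{8.8} \]
\[ d_n^T(\phi, \psi) := \|T^n\phi\|\,\|\psi\| + \|\phi\|\,\|T^n\psi\| + \sum_{r=1}^{n-1} {}_nC_r\, d_{n-r}^T(\phi, T^r\psi), \quad n \ge 2, \tag{8.9} \]
where ${}_nC_r := n!/[(n-r)!\,r!]$. Then, for all $t \in \mathbb{R}\setminus\{0\}$,
\[ |\langle \phi, e^{-itH}C^n\psi \rangle| \le \frac{d_n^T(\phi, \psi)}{|t|^n}. \tag{8.10} \]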
Proof. We prove (8.10) by induction in $n$. The case $n = 1$ holds by Theorem 8.1. Suppose that (8.10) holds for $n = 1, \ldots, m - 1$ ($m \ge 2$). Let $\psi, \phi \in D(T^m)$. Condition (8.5) implies that, for all $k, r \in \mathbb{N}$, $C^kT^r \subset T^rC^k$. By this fact and Lemma 8.4, we have
\[ T^me^{-itH}\psi = e^{-itH}(T + tC)^m\psi = e^{-itH}T^m\psi + \sum_{r=1}^{m-1} {}_mC_r\, t^{m-r} e^{-itH}C^{m-r}T^r\psi + t^me^{-itH}C^m\psi. \]
Hence,
\[ |t|^m\, |\langle \phi, e^{-itH}C^m\psi \rangle| \le \|T^m\phi\|\,\|\psi\| + \|\phi\|\,\|T^m\psi\| + \sum_{r=1}^{m-1} {}_mC_r\, |t|^{m-r}\, |\langle \phi, e^{-itH}C^{m-r}T^r\psi \rangle|. \]
By the induction hypothesis, we have
\[ |t|^{m-r}\, |\langle \phi, e^{-itH}C^{m-r}T^r\psi \rangle| \le d_{m-r}^T(\phi, T^r\psi). \]
Hence, (8.10) with $n = m$ follows.
Theorem 8.5 can be generalized. We need a lemma.
Lemma 8.6. Let $T_1, \ldots, T_n \in \mathbf{T}(H, C)$ ($n \in \mathbb{N}$) such that $CT_j \subset T_jC$, $j = 1, \ldots, n$. Then, for all $t \in \mathbb{R}$,
\[ e^{-itH}D((T_1 + tC) \cdots (T_n + tC)) \subset D(T_1 \cdots T_n) \tag{8.11} \]
and, for all $\psi \in D((T_1 + tC) \cdots (T_n + tC))$,
\[ T_1 \cdots T_n e^{-itH}\psi = e^{-itH}(T_1 + tC) \cdots (T_n + tC)\psi. \tag{8.12} \]
Proof. We prove the assertion by induction in $n$. The case $n = 1$ obviously holds. Suppose that, for an $m \in \mathbb{N}$, (8.11) and (8.12) hold. Let $\phi \in D((T_1 + tC) \cdots (T_{m+1} + tC))$. Then, the vector $(T_{m+1} + tC)\phi$ is in $D((T_1 + tC) \cdots (T_m + tC))$. Hence, by the induction hypothesis, $e^{-itH}(T_{m+1} + tC)\phi$ is in $D(T_1 \cdots T_m)$ and
\[ T_1 \cdots T_m e^{-itH}(T_{m+1} + tC)\phi = e^{-itH}(T_1 + tC) \cdots (T_{m+1} + tC)\phi. \]
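On the other hand, $e^{-itH}\phi$ is in $D(T_{m+1})$ and $T_{m+1}e^{-itH}\phi = e^{-itH}(T_{m+1} + tC)\phi$. Hence, $T_{m+1}e^{-itH}\phi$ is in $D(T_1 \cdots T_m)$, i.e., $e^{-itH}\phi \in D(T_1 \cdots T_{m+1})$, and (8.12) with $n = m+1$ holds. Thus, the assertion holds for $n = m+1$.

Theorem 8.7. Let $T, T_1, \dots, T_n \in \mathsf{T}(H, C)$ such that $CT \subset TC$, $CT_j \subset T_jC$, $j = 1, \dots, n$. Let $\phi \in D(T_n \cdots T_1) \cap D(T^{n-1})$ and
$$\psi \in \bigcap_{r=1}^{n-1}\ \bigcap_{1 \le i_1 < \cdots < i_r \le n} D(T^{n-r} T_{i_1} \cdots T_{i_r}). \tag{8.13}$$
For $k = 1, \dots, n$, we define a constant
$$\delta_n^T(\phi, \psi; T_1, \dots, T_n) := \|T_n \cdots T_1\phi\|\,\|\psi\| + \|\phi\|\,\|T_1 \cdots T_n\psi\| + \sum_{r=1}^{n-1}\ \sum_{1 \le i_1 < \cdots < i_r \le n} d_{n-r}^T(\phi, T_{i_1} \cdots T_{i_r}\psi). \tag{8.14}$$
Then, for all $t \in \mathbb{R}\setminus\{0\}$,
$$\bigl|\langle \phi, e^{-itH} C^n \psi\rangle\bigr| \le \frac{\delta_n^T(\phi, \psi; T_1, \dots, T_n)}{|t|^n}. \tag{8.15}$$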
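Proof. By Lemma 8.6, we have
$$T_1 \cdots T_n e^{-itH}\psi = e^{-itH}T_1 \cdots T_n\psi + \sum_{r=1}^{n-1} t^{n-r} \sum_{1 \le i_1 < \cdots < i_r \le n} e^{-itH} C^{n-r} T_{i_1} \cdots T_{i_r}\psi + t^n e^{-itH} C^n \psi.$$
Hence,
$$|t|^n \bigl|\langle \phi, e^{-itH} C^n \psi\rangle\bigr| \le \|T_n \cdots T_1\phi\|\,\|\psi\| + \|\phi\|\,\|T_1 \cdots T_n\psi\| + \sum_{r=1}^{n-1} |t|^{n-r} \sum_{1 \le i_1 < \cdots < i_r \le n} \bigl|\langle \phi, e^{-itH} C^{n-r} T_{i_1} \cdots T_{i_r}\psi\rangle\bigr|.$$
Note that $\phi, T_{i_1} \cdots T_{i_r}\psi \in D(T^{n-r})$ for $r = 1, \dots, n-1$. Hence, by Theorem 8.5, we have
$$|t|^{n-r} \bigl|\langle \phi, e^{-itH} C^{n-r} T_{i_1} \cdots T_{i_r}\psi\rangle\bigr| \le d_{n-r}^T(\phi, T_{i_1} \cdots T_{i_r}\psi).$$
Thus, (8.15) follows.

Finally, we discuss the case where condition (8.5) is not necessarily satisfied. For $n \ge 2$ and $r = 1, \dots, n-1$, we introduce a set
$$J_{n,r} := \{j := (j_1, \dots, j_{r+1}) \in \{0, 1\}^{r+1} \mid j_1 + \cdots + j_{r+1} = n - r\} \tag{8.16}$$
and, for each $j \in J_{n,r}$, we define
$$K_{n,r}^{(j)} := T^{j_1} C T^{j_2} C \cdots C T^{j_r} C T^{j_{r+1}}. \tag{8.17}$$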
Let
$$D_n(T, C) := \Bigl\{\psi \in D(T^n) \cap \bigcap_{r=1}^{n-1}\ \bigcap_{j \in J_{n,r}} D\bigl(K_{n,r}^{(j)}\bigr) \Bigm| K_{n,r}^{(j)}\psi \in \mathrm{Ran}(C^r | D(T^r)),\ j \in J_{n,r},\ r = 1, \dots, n-1\Bigr\} \quad (n \ge 2). \tag{8.18}$$
We set $D_1(T, C) := D(T)$.

Remark 8.8. If (8.5) holds, then $D_n(T, C) = D(T^n)$.

Theorem 8.9. Let $T \in \mathsf{T}(H, C)$. Then, for all $\phi \in D(T^n)$ and $\psi \in D_n(T, C)$ and $t \in \mathbb{R}\setminus\{0\}$,
$$\bigl|\langle \phi, e^{-itH} C^n \psi\rangle\bigr| \le \frac{d_n(\phi, \psi)}{|t|^n}, \tag{8.19}$$
where $d_n(\phi, \psi) > 0$ is a constant independent of $t$.

Proof. We prove (8.19) by induction in $n$. The case $n = 1$ is just Theorem 8.1. Suppose that (8.19) holds with $n = 1, \dots, m-1$ ($m \ge 2$). For all $\psi \in D_m(T, C)$, we have
$$T^m e^{-itH}\psi = e^{-itH}T^m\psi + t^m e^{-itH} C^m \psi + \sum_{r=1}^{m-1} t^r e^{-itH} \sum_{j \in J_{m,r}} K_{m,r}^{(j)}\psi.$$
Hence, for all $\phi \in D(T^m)$,
$$|t|^m \bigl|\langle \phi, e^{-itH} C^m \psi\rangle\bigr| \le \|T^m\phi\|\,\|\psi\| + \|\phi\|\,\|T^m\psi\| + \sum_{r=1}^{m-1} |t|^r \sum_{j \in J_{m,r}} \bigl|\langle \phi, e^{-itH} K_{m,r}^{(j)}\psi\rangle\bigr|.$$
Since $K_{m,r}^{(j)}\psi \in \mathrm{Ran}(C^r | D(T^r))$, there is a vector $\chi_j \in D(T^r)$ such that $K_{m,r}^{(j)}\psi = C^r\chi_j$. By this fact and the induction hypothesis, we have $|t|^r \bigl|\langle \phi, e^{-itH} K_{m,r}^{(j)}\psi\rangle\bigr| \le c_{m,r}$ with a constant $c_{m,r}$ independent of $t$. Thus, (8.19) with $n = m$ follows.

8.3. Correlation functions

In this section, we show that the existence of generalized time-operators gives upper bounds for correlation functions for a class of linear operators. For a linear operator $A$ on $\mathcal{H}$ and a self-adjoint operator $H$ on $\mathcal{H}$, we define
$$A(t) := e^{itH} A e^{-itH}, \quad t \in \mathbb{R}, \tag{8.20}$$
the Heisenberg operator of $A$ with respect to $H$. Let $B$ be a linear operator on $\mathcal{H}$. Let
$$\psi \in \bigcap_{t \in \mathbb{R}} \bigl[D(Ae^{-itH}) \cap D(Be^{-itH})\bigr]$$
with $\|\psi\| = 1$. Then, we can define
$$W(t, s; \psi) := \langle A(t)\psi, B(s)\psi\rangle, \quad s, t \in \mathbb{R}. \tag{8.21}$$
We call it the two-point correlation function of $A$ and $B$ with respect to the vector $\psi$.

Theorem 8.10. Let $T \in \mathsf{T}(H, C)$. Suppose that $\psi$ is an eigenvector of $H$ such that $A\psi \in D(T)$ and $B\psi \in \mathrm{Ran}(C | D(T))$. Then, for all $t, s \in \mathbb{R}$ with $t \ne s$,
$$|W(t, s; \psi)| \le \frac{c_{A,B,T}}{|t - s|}, \tag{8.22}$$
where
$$c_{A,B,T} := \inf_{\chi \in D(T),\ B\psi = C\chi} \bigl(\|TA\psi\|\,\|\chi\| + \|A\psi\|\,\|T\chi\|\bigr).$$

Proof. Let $E$ be the eigenvalue of $H$ with eigenvector $\psi$. Then, we have
$$W(t, s) = e^{i(t-s)E} \langle A\psi, e^{-i(t-s)H} B\psi\rangle. \tag{8.23}$$
There exists a vector $\chi \in D(T)$ such that $B\psi = C\chi$. Hence, applying Theorem 8.1, we obtain
$$|W(t, s; \psi)| \le \frac{\|TA\psi\|\,\|\chi\| + \|A\psi\|\,\|T\chi\|}{|t - s|}.$$
Thus, (8.22) follows.

Theorem 8.10 can be strengthened:

Theorem 8.11. Let $T \in \mathsf{T}(H, C)$ with (8.5). Suppose that $\psi$ is an eigenvector of $H$ such that $\psi \in D(A)$ and $A\psi \in D(T^n)$ and $B\psi \in \mathrm{Ran}(C^n | D(T^n))$. Then, for all $t, s \in \mathbb{R}$ with $t \ne s$,
$$|W(t, s)| \le \frac{c_{A,B,T}^{(n)}}{|t - s|^n}, \tag{8.24}$$
where
$$c_{A,B,T}^{(n)} := \inf_{\chi \in D(T^n),\ B\psi = C^n\chi} d_n^T(A\psi, \chi).$$

Proof. This follows from (8.23) and an application of Theorem 8.5.

8.4. Heat semigroups

In this section, we assume the following:

Hypothesis (H). The self-adjoint operator $H$ is bounded from below and there exists a closed symmetric operator $T$ such that $T \in \mathsf{T}(H, C)$.

Then, (6.11) holds. We set
$$\hat{H} := H - E_0(H). \tag{8.25}$$
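We have for all $\beta \ge 0$,
$$\|e^{-\beta\hat{H}}\| = 1.$$
Hence, in the same way as in Secs. 8.1 and 8.2, we obtain the following results on the decay (in $\beta$) of the quantity $|\langle \phi, e^{-\beta\hat{H}} C\psi\rangle|$.

Theorem 8.12. Let $\phi, \psi \in D(T)$. Then, for all $\beta > 0$,
$$\bigl|\langle \phi, e^{-\beta\hat{H}} C\psi\rangle\bigr| \le \frac{1}{\beta}\bigl(\|T\phi\|\,\|\psi\| + \|\phi\|\,\|T\psi\|\bigr). \tag{8.26}$$

Theorem 8.13. Assume (8.5). Then, for all $\phi, \psi \in D(T^n)$ and $\beta > 0$,
$$\bigl|\langle \phi, e^{-\beta\hat{H}} C^n\psi\rangle\bigr| \le \frac{d_n^T(\phi, \psi)}{\beta^n}. \tag{8.27}$$

Theorem 8.14. For all $\phi \in D(T^n)$, $\psi \in D_n(T, C)$ and $\beta > 0$,
$$\bigl|\langle \phi, e^{-\beta\hat{H}} C^n\psi\rangle\bigr| \le \frac{c_n}{\beta^n}, \tag{8.28}$$
where $c_n$ is a constant independent of $\beta$.

9. Abstract Version of Wigner’s Time-Energy Uncertainty Relation

In this section, we apply Theorems 8.1 and 8.9 to establish an abstract version of Wigner’s time-energy uncertainty relation [23]. Let $H$ be a self-adjoint operator on a Hilbert space $\mathcal{H}$. In the context of quantum mechanics where $H$ is interpreted as the Hamiltonian of a quantum system, the state vector at time $t \in \mathbb{R}$ is given by
$$\psi(t) := e^{-itH}\psi \tag{9.1}$$
with $\psi \in \mathcal{H}$ being the initial state. Let $\phi_0 \in \mathcal{H}$ and suppose that
$$\int_{\mathbb{R}} t^2 |\langle \phi_0, \psi(t)\rangle|^2\,dt < \infty. \tag{9.2}$$
Then,
$$N_0 := \int_{\mathbb{R}} |\langle \phi_0, \psi(t)\rangle|^2\,dt \tag{9.3}$$
is finite and we can define
$$\langle t\rangle := \frac{\int_{\mathbb{R}} t\,|\langle \phi_0, \psi(t)\rangle|^2\,dt}{N_0}, \tag{9.4}$$
$$\Delta t := \left(\frac{\int_{\mathbb{R}} (t - \langle t\rangle)^2 |\langle \phi_0, \psi(t)\rangle|^2\,dt}{N_0}\right)^{1/2}. \tag{9.5}$$
Physically, $\langle t\rangle$ may be interpreted as the expectation value of the “arrival time”, i.e., the time $t$ when the state $\psi(t)$ “arrives” at $\phi_0$. In this interpretation, $\Delta t$ expresses the standard deviation of the arrival time of the initial state $\psi$ to the state $\phi_0$.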
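We define
$$f_H(t; \phi_0, \psi) := \begin{cases} \langle \phi_0, H\psi(t)\rangle; & \text{if } \psi \in D(H), \\ \langle H\phi_0, \psi(t)\rangle; & \text{if } \phi_0 \in D(H). \end{cases} \tag{9.6}$$
We assume that $\phi_0 \in D(H)$ or $\psi \in D(H)$ and
$$\int_{\mathbb{R}} |f_H(t; \phi_0, \psi)|^2\,dt < \infty. \tag{9.7}$$
Then, we can define
$$\langle E\rangle_H := \frac{\int_{\mathbb{R}} \langle \psi(t), \phi_0\rangle f_H(t; \phi_0, \psi)\,dt}{N_0}, \tag{9.8}$$
$$\Delta E_H := \left(\frac{\int_{\mathbb{R}} |f_H(t; \phi_0, \psi)|^2\,dt}{N_0} - \langle E\rangle_H^2\right)^{1/2}. \tag{9.9}$$

Theorem 9.1. Suppose that $\phi_0 \in D(H)$ or $\psi \in D(H)$ and that (9.2) and (9.7) hold. Then,
$$\Delta t \cdot \Delta E_H \ge \frac{1}{2}. \tag{9.10}$$

Proof. From the Schrödinger representation $(\hat{E}, \hat{t})$ of the CCR in $L^2(\mathbb{R})$, where $\hat{t}$ is the multiplication operator by the coordinate function $t \in \mathbb{R}$ and $\hat{E} := iD_t$ with $D_t$ being the generalized differential operator in the variable $t$, we have the standard uncertainty relation
$$\bigl\|(\hat{E} - \langle u, \hat{E}u\rangle)u\bigr\| \cdot \bigl\|(\hat{t} - \langle u, \hat{t}u\rangle)u\bigr\| \ge \frac{1}{2}$$
for all $u \in D(\hat{E}) \cap D(\hat{t})$ with $\|u\| = 1$. We define $f : \mathbb{R} \to \mathbb{C}$ by $f(t) := \langle \phi_0, \psi(t)\rangle$. Then, by (9.2) and (9.7), $f$ is in $D(\hat{t}) \cap D(\hat{E})$ with $\hat{E}f(t) = f_H(t; \phi_0, \psi)$. Let $\tilde{f}(t) := f(t)/\|f\|$. Then, it is easy to see that
$$\Delta E_H = \bigl\|(\hat{E} - \langle \tilde{f}, \hat{E}\tilde{f}\rangle)\tilde{f}\bigr\|, \qquad \Delta t = \bigl\|(\hat{t} - \langle \tilde{f}, \hat{t}\tilde{f}\rangle)\tilde{f}\bigr\|.$$
Hence (9.10) follows.

Theorem 9.2. Let $T \in \mathsf{T}(H, C)$. Suppose that $\psi = C^2\chi$ with $\chi \in D(T^2) \cap D(TC)$ satisfying $TC\chi \in \mathrm{Ran}(C | D(T))$ and that $\phi_0 \in D(T^2) \cap D(TH)$. Then (9.10) holds.

Proof. By the present assumption, $\phi_0$ is in $D(H)$ with $H\phi_0 \in D(T)$ and $\psi = C(C\chi)$ with $C\chi \in D(T)$. Hence, we can apply Theorem 8.1 to obtain
$$|\langle H\phi_0, \psi(t)\rangle| \le \frac{1}{|t|}\bigl(\|TH\phi_0\|\,\|C\chi\| + \|H\phi_0\|\,\|TC\chi\|\bigr).$$
Using this estimate, we see that (9.7) holds. We have $\phi_0 \in D(T^2)$ and $\chi \in D_2(T, C)$, where $D_n(T, C)$ is defined by (8.18). Hence, by Theorem 8.9, $|\langle \phi_0, \psi(t)\rangle| \le c/|t|^2$ with $c > 0$ a constant. This implies that (9.2) holds. Thus, by Theorem 9.1, we obtain the desired result.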
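The lower bound (9.10) can be attained at the level of the function $f(t) = \langle \phi_0, \psi(t)\rangle$ used in the proof of Theorem 9.1. The sketch below is a hedged illustration only: the Gaussian profile and the width $\sigma > 0$ are assumptions introduced here, not taken from the text, and no claim is made about which $H$, $\phi_0$, $\psi$ produce such an $f$.

```latex
% Assumption (illustrative only): f(t) = \exp(-t^2/(4\sigma^2)) with \sigma > 0.
\begin{align*}
|f(t)|^2 &= e^{-t^2/(2\sigma^2)}, \qquad
N_0 = \int_{\mathbb R} |f(t)|^2\,dt = \sqrt{2\pi}\,\sigma, \\
\langle t\rangle &= 0 \quad\text{(the integrand $t\,|f(t)|^2$ is odd)}, \\
(\Delta t)^2 &= \frac{1}{N_0}\int_{\mathbb R} t^2\,|f(t)|^2\,dt = \sigma^2, \\
(\hat E f)(t) &= i f'(t) = -\frac{i t}{2\sigma^2}\, f(t), \\
\langle E\rangle_H &= \frac{1}{N_0}\int_{\mathbb R} \overline{f(t)}\,(\hat E f)(t)\,dt
   = -\frac{i}{2\sigma^2}\,\langle t\rangle = 0, \\
(\Delta E_H)^2 &= \frac{1}{N_0}\int_{\mathbb R} \frac{t^2}{4\sigma^4}\,|f(t)|^2\,dt
   - \langle E\rangle_H^2 = \frac{\sigma^2}{4\sigma^4} = \frac{1}{4\sigma^2}.
\end{align*}
\[
  \Delta t \cdot \Delta E_H = \sigma \cdot \frac{1}{2\sigma} = \frac12 .
\]
```

Under this assumption the product equals $1/2$ for every $\sigma$, so the constant in (9.10) cannot be improved by an argument that only uses $f \in D(\hat{t}) \cap D(\hat{E})$.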
10. Structure Producing Successively Triples Obeying the GWWR

Let $(Q, P, K_C)$ be a triple obeying the GWWR in a Hilbert space $\mathcal{H}$, i.e.,
$$Q e^{-itP} = e^{-itP}(Q + tC), \quad \forall\, t \in \mathbb{R}. \tag{10.1}$$
(See Proposition 2.1.)

Lemma 10.1. (i) For all $\psi \in D(\bar{Q})$ and all $z \in \mathbb{C}$ with $\operatorname{Re} z > 0$, we have $e^{-zP^2}\psi \in D(\bar{Q})$ and
$$\bar{Q} e^{-zP^2}\psi = e^{-zP^2}\bar{Q}\psi - 2izP e^{-zP^2} C\psi. \tag{10.2}$$
(ii) For all $\psi \in D(\bar{Q}) \cap D(P)$ and all $t \in \mathbb{R}$, we have $e^{-itP^2}\psi \in D(\bar{Q}) \cap D(P)$ and
$$\bar{Q} e^{-itP^2}\psi = e^{-itP^2}(\bar{Q} + 2tPC)\psi. \tag{10.3}$$

Proof. (i) Let $\operatorname{Re} z > 0$. Then, the function $t \mapsto f(t) := e^{-zt^2}$, $t \in \mathbb{R}$, is in the class $C_b^1(\mathbb{R})$ and $f'(t) = -2zt\,e^{-zt^2}$. Hence, applying Theorem 6.2(i), we conclude that $e^{-zP^2} D(\bar{Q}) \subset D(\bar{Q})$ and (10.2) holds for all $\psi \in D(\bar{Q})$.

(ii) Let $\varepsilon > 0$. By part (i), we have for all $\phi \in D(Q^*) = D((\bar{Q})^*)$ and $\psi \in D(\bar{Q}) \cap D(P)$
$$\langle Q^*\phi, e^{-(\varepsilon + it)P^2}\psi\rangle = \langle \phi, e^{-(\varepsilon + it)P^2}\bar{Q}\psi\rangle - \langle \phi, 2i(\varepsilon + it)P e^{-(\varepsilon + it)P^2} C\psi\rangle.$$
Applying Proposition 6.4 with $H = P$ and $T = Q$, we see that $C\psi \in D(P)$. Hence, taking $\varepsilon \to 0$, we obtain
$$\langle Q^*\phi, e^{-itP^2}\psi\rangle = \langle \phi, e^{-itP^2}\bar{Q}\psi\rangle + \langle \phi, 2tP e^{-itP^2} C\psi\rangle.$$
This implies that $e^{-itP^2}\psi \in D(Q^{**}) = D(\bar{Q})$ and (10.3) holds.

In the rest of this section, we consider a simple case such that
$$\ker C = \{0\}. \tag{10.4}$$
Then, by Corollary 3.4 applied to $H = P$, $T = Q$ and $L_\alpha = C$, we have $\sigma_p(P) = \emptyset$. In particular, $P$ is injective. Hence, we can define
$$T(Q, P) := \frac{1}{4}\bigl[\bar{Q}P^{-1} + (\bar{Q}P^{-1})^*\bigr], \tag{10.5}$$
provided that $D(\bar{Q}P^{-1})$ is dense in $\mathcal{H}$.

Theorem 10.2. Assume (10.4) and that $D(\bar{Q}P^{-1})$ is dense in $\mathcal{H}$. Then, for all $t \in \mathbb{R}$, $e^{-itP^2} D(T(Q, P)) \subset D(T(Q, P))$ and
$$T(Q, P) e^{-itP^2}\psi = e^{-itP^2}(T(Q, P) + tC)\psi, \quad \psi \in D(T(Q, P)). \tag{10.6}$$

Proof. Let $\psi \in D(\bar{Q}P^{-1})$. Then, $P^{-1}\psi \in D(\bar{Q}) \cap D(P)$. Hence, by Lemma 10.1(ii), $e^{-itP^2}P^{-1}\psi \in D(\bar{Q}) \cap D(P)$ for all $t \in \mathbb{R}$ and
$$\bar{Q}P^{-1} e^{-itP^2}\psi = e^{-itP^2}(\bar{Q}P^{-1} + 2tC)\psi,$$
which implies that, for all $t \in \mathbb{R}$, $e^{-itP^2}(\bar{Q}P^{-1} + 2tC) \subset \bar{Q}P^{-1} e^{-itP^2}$. Hence,
$$e^{itP^2}(\bar{Q}P^{-1})^* \subset \bigl[(\bar{Q}P^{-1})^* + 2tC\bigr] e^{itP^2}$$
for all $t \in \mathbb{R}$. This implies that $e^{-itP^2}\bigl[(\bar{Q}P^{-1})^* + 2tC\bigr] \subset (\bar{Q}P^{-1})^* e^{-itP^2}$ for all $t \in \mathbb{R}$. Thus, the desired result follows.

Theorem 10.2 shows that, if $D(\bar{Q}P^{-1}) \cap D((\bar{Q}P^{-1})^*)$ is dense in addition to the assumption there, then $(T(Q, P), P^2, K_C)$ obeys the GWWR. This structure is very interesting, since, apart from domain problems, it produces a series of triples obeying the GWWR. Indeed, let
$$T_1(Q, P) := T(Q, P), \tag{10.7}$$
$$T_n(Q, P) := T\bigl(T_{n-1}(Q, P), P^{2^{n-1}}\bigr), \quad n \ge 2. \tag{10.8}$$
Then $\bigl(T_n(Q, P), P^{2^n}, K_C\bigr)$ obeys the GWWR for all $n = 1, \dots, N$ ($N \in \mathbb{N}$), provided that, for each $n = 1, \dots, N$, $T_n(Q, P)$ is symmetric.

Example 10.3. A simple example of $(Q, P)$ is given by the Schrödinger representation $(q, p)$ on $L^2(\mathbb{R})$ of the CCR with one degree of freedom. It is easy to see that $(q, p, K_I)$ obeys the GWWR. The operator $T(q, p)$ in this case is called the Aharonov–Bohm time operator [2, 11, 12]. Theorem 10.2 clarifies a general mathematical structure behind this operator. A simple application of Theorem 10.2 to this special case produces generalized time operators to $H = (-\Delta)^{2^{n-1}}$ ($n \in \mathbb{N}$).

We can extend the theory presented above to the case of finitely many degrees of freedom. Let $Q_j$ ($j = 1, \dots, n$, $n \in \mathbb{N}$) be a symmetric operator on $\mathcal{H}$ and $P_j$ be a self-adjoint operator on $\mathcal{H}$ such that $P_j$ strongly commutes with $Q_k$ and $P_k$ ($j, k = 1, \dots, n$, $j \ne k$). Suppose that each $(Q_j, P_j, K_C)$ obeys the GWWR with $C$ satisfying (10.4). Then, as already shown, each $P_j$ is injective. Suppose that $D(\bar{Q}_j P_j^{-1})$ is dense. Then we can define
$$T_j := \frac{1}{4}\bigl[\bar{Q}_j P_j^{-1} + (\bar{Q}_j P_j^{-1})^*\bigr], \quad j = 1, \dots, n. \tag{10.9}$$
By the strong commutativity of $P_j$’s ($j = 1, \dots, n$), the operator
$$H := \sum_{j=1}^{n} P_j^2 \tag{10.10}$$
is self-adjoint and non-negative.

Theorem 10.4. Let the assumption stated above on $(Q_j, P_j)$ ($j = 1, \dots, n$) be satisfied. Then, for all $t \in \mathbb{R}$ and $j = 1, \dots, n$, $e^{-itH} D(T_j) \subset D(T_j)$ and
$$T_j e^{-itH}\psi = e^{-itH}(T_j + tC)\psi, \quad \psi \in D(T_j). \tag{10.11}$$

Proof. By the strong commutativity of $Q_j$ with $P_k$ ($k \ne j$), we have $Q_j e^{-itH} = e^{-itH_j} Q_j e^{-itP_j^2}$ on $D(P_j)$, where $H_j := \sum_{k \ne j} P_k^2$. Hence, applying Theorem 10.2 with $Q = Q_j$, $P = P_j$, we obtain the desired result.
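At a purely formal level, differentiating a relation of the form $Te^{-itH} = e^{-itH}(T + tC)$ in $t$ at $t = 0$ gives the commutation relation $[T, H] = iC$. The following sketch checks this heuristic for the Schrödinger pair of Example 10.3. It is not a substitute for Theorem 10.2: all domain questions are ignored, $p^{-1}$ is treated as an ordinary inverse, and $(qp^{-1})^*$ is replaced by $p^{-1}q$, all of which are assumptions made here for illustration.

```latex
% Formal computation only: domains are ignored and p^{-1} is treated as an inverse.
% Assumptions: [q,p] = i (Schr\"odinger pair, C = I) and
%              T(q,p) = \tfrac14\,(q p^{-1} + p^{-1} q).
\begin{align*}
[q, p^2] &= [q,p]\,p + p\,[q,p] = 2ip, \\
[q p^{-1}, p^2] &= [q,p^2]\,p^{-1} = 2i, \qquad
[p^{-1} q, p^2] = p^{-1}[q,p^2] = 2i, \\
[T(q,p), p^2] &= \tfrac14\,(2i + 2i) = i .
\end{align*}
% This matches [T, H] = iC with H = p^2 and C = I,
% i.e. the infinitesimal form of (10.6).
```

The factor $1/4$ in (10.5) is thus exactly what is needed for the pair $(T(Q, P), P^2)$ to reproduce the same operator $C$ as the original pair $(Q, P)$.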
In this case too, a remark similar to the one after Theorem 10.2 is applicable.

11. Generalized Time Operators of Partial Differential Operators

In this section, we construct classes of generalized time operators of partial differential operators.

11.1. Constructions from the Schrödinger representation of the CCR with d degrees of freedom

Let $q = (q_1, \dots, q_d)$ and $p = (p_1, \dots, p_d)$ be the Schrödinger representation of the CCR with $d$ degrees of freedom. Namely, $q_j$ is the multiplication operator by $x_j$, the $j$th coordinate function ($x = (x_1, \dots, x_d) \in \mathbb{R}^d$) acting in $L^2(\mathbb{R}^d)$ and $p_j = -iD_j$ on $L^2(\mathbb{R}^d)$ ($D_j$ is the generalized partial differential operator in $x_j$). The following properties are well known:

(i) $p_j$ and $p_l$ are strongly commuting self-adjoint operators.
(ii) For all $t \in \mathbb{R}$,
$$q_j e^{-itp_l} = e^{-itp_l}(q_j + \delta_{jl}t), \quad j, l = 1, \dots, d.$$

We denote by
$$\hat{\mathbb{R}}^d := \{k = (k_1, \dots, k_d) \mid k_j \in \mathbb{R},\ j = 1, \dots, d\} \tag{11.1}$$
the $d$-dimensional momentum space and by $C^r(\hat{\mathbb{R}}^d)$ the set of $r$ times continuously differentiable functions on $\hat{\mathbb{R}}^d$.

Let $F \in C^1(\hat{\mathbb{R}}^d)$ be real-valued. Then, the operators
$$H_F := F(p) \tag{11.2}$$
and
$$F_j^{(1)} := (\partial_j F)(p) \tag{11.3}$$
are self-adjoint. Let $N_F^{(j)}$ be a closed subset of $\hat{\mathbb{R}}^d$ with Lebesgue measure zero and $G \in C^1(\hat{\mathbb{R}}^d \setminus N_F^{(j)})$ such that
$$\sup_{k \in \hat{\mathbb{R}}^d \setminus N_F^{(j)}} |G(k)\,\partial_j F(k)| < \infty. \tag{11.4}$$
Then, we can define a linear operator $T_{F,G}^{(j)}$ on $L^2(\mathbb{R}^d)$ by
$$T_{F,G}^{(j)} := G(p)q_j + q_j G(p)^*, \tag{11.5}$$
with domain
$$D\bigl(T_{F,G}^{(j)}\bigr) := \mathcal{F}^{-1} C_0^1\bigl(\hat{\mathbb{R}}^d \setminus N_F^{(j)}\bigr), \tag{11.6}$$
where $\mathcal{F} : L^2(\mathbb{R}^d) \to L^2(\hat{\mathbb{R}}^d)$ is the Fourier transform:
$$(\mathcal{F}\psi)(k) := \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{-ikx}\psi(x)\,dx, \quad \psi \in L^2(\mathbb{R}^d) \tag{11.7}$$
in the $L^2$-sense. Note that, for all $u \in C_0^1\bigl(\hat{\mathbb{R}}^d \setminus N_F^{(j)}\bigr)$,
$$\bigl(\mathcal{F} T_{F,G}^{(j)} \mathcal{F}^{-1} u\bigr)(k) = G(k)\, i\frac{\partial u(k)}{\partial k_j} + i\frac{\partial}{\partial k_j}\bigl[G(k)^* u(k)\bigr], \quad \text{a.e. } k \in \hat{\mathbb{R}}^d \setminus N_F^{(j)}. \tag{11.8}$$
It is easy to see that $T_{F,G}^{(j)}$ is symmetric [note that $D\bigl(T_{F,G}^{(j)}\bigr)$ is dense in $L^2(\mathbb{R}^d)$, since $C_0^\infty\bigl(\hat{\mathbb{R}}^d \setminus N_F^{(j)}\bigr)$ is dense in $L^2(\hat{\mathbb{R}}^d)$].

Lemma 11.1. For all $t \in \mathbb{R}$ and $\psi \in D\bigl(T_{F,G}^{(j)}\bigr)$, one has $e^{-itH_F}\psi \in D\bigl(T_{F,G}^{(j)}\bigr)$.

Proof. We have by the functional calculus $\mathcal{F} e^{-itH_F} \mathcal{F}^{-1} = e^{-itM_F}$, where $M_F$ is the multiplication operator by the function $F$. It is easy to see that, for all $u \in C_0^1\bigl(\hat{\mathbb{R}}^d \setminus N_F^{(j)}\bigr)$, $e^{-itM_F}u \in C_0^1\bigl(\hat{\mathbb{R}}^d \setminus N_F^{(j)}\bigr)$. Thus, the assertion follows.

For the functions $F$ and $G$ as above, the operator
$$C_{F,G}^{(j)} := [G(p) + G(p)^*] F_j^{(1)} \tag{11.9}$$
is bounded and self-adjoint.

Proposition 11.2. Let $F$ and $G$ be as above. Then, $\bigl(T_{F,G}^{(j)}, H_F, K_{C_{F,G}^{(j)}}\bigr)$ obeys the GWWR.

Proof. By using the Fourier transform, it is easy to see that
$$q_j e^{-itH_F}\psi = e^{-itH_F} q_j\psi + t e^{-itH_F} F_j^{(1)}\psi$$
for all $\psi \in D(q_j) \cap D\bigl(F_j^{(1)}\bigr)$, which, together with Lemma 11.1, implies that
$$G(p) q_j e^{-itH_F}\psi = e^{-itH_F} G(p) q_j\psi + t e^{-itH_F} G(p) F_j^{(1)}\psi,$$
$$q_j G(p)^* e^{-itH_F}\psi = e^{-itH_F} q_j G(p)^*\psi + t e^{-itH_F} F_j^{(1)} G(p)^*\psi$$
for all $\psi \in D\bigl(T_{F,G}^{(j)}\bigr)$. Adding these equations, we obtain
$$T_{F,G}^{(j)} e^{-itH_F}\psi = e^{-itH_F} T_{F,G}^{(j)}\psi + t e^{-itH_F} C_{F,G}^{(j)}\psi.$$
Thus, the desired result follows.

Example 11.3. (The Free Hamiltonian of a Nonrelativistic Quantum Particle). Consider the case where $F(k) = k^2/(2m)$ ($m > 0$ is a constant denoting the mass of a quantum particle) and $G(k) = v(k)/\partial_j F(k) = m v(k) k_j^{-1}$ with $v \in C^1(\hat{\mathbb{R}}^d)$ being bounded. Then,
$$H_F = H_{\mathrm{NR}} := -\frac{\Delta}{2m}, \qquad T_{F,G}^{(j)} = T_v^{(j)} := m\bigl[v(p) p_j^{-1} q_j + q_j v(p)^* p_j^{-1}\bigr], \qquad C_{F,G}^{(j)} = C_v := v(p) + v(p)^*,$$
with $N_F^{(j)} = \{k \in \hat{\mathbb{R}}^d \mid k_j = 0\}$ (hence its $d$-dimensional Lebesgue measure is zero), where
$$\Delta := \sum_{j=1}^{d} D_j^2 \tag{11.10}$$
is the $d$-dimensional generalized Laplacian on $L^2(\mathbb{R}^d)$. Then, $\bigl(T_v^{(j)}, H_{\mathrm{NR}}, K_{C_v}\bigr)$ obeys the GWWR. The operator $T_v^{(j)}$ with $d = 1$, $m = 1/2$ and $v = 1/2$ is just the Aharonov–Bohm time operator mentioned in Example 10.3.

Example 11.4. (The Free Hamiltonian of a Relativistic Quantum Particle). Consider the case where $F(k) = \sqrt{k^2 + m^2}$ with $m > 0$ (a constant). Let $G(k) = v(k)/\partial_j F(k) = v(k)\sqrt{k^2 + m^2}\,k_j^{-1}$. Then,
$$H_F = H_{\mathrm{rel}} := \sqrt{-\Delta + m^2},$$
$$T_{F,G}^{(j)} = T_{\mathrm{rel}}^{(j)} := v(p)\sqrt{-\Delta + m^2}\,p_j^{-1} q_j + q_j v(p)^* \sqrt{-\Delta + m^2}\,p_j^{-1},$$
$$C_{F,G}^{(j)} = C_v,$$
with $N_F^{(j)} = \{k \in \hat{\mathbb{R}}^d \mid k_j = 0\}$. Then, $\bigl(T_{\mathrm{rel}}^{(j)}, H_{\mathrm{rel}}, K_{C_v}\bigr)$ obeys the GWWR.

Example 11.5. (Powers of the d-dimensional Laplacian). Let $\alpha > 0$ be a constant and consider the case where
$$F(k) = F_\alpha(k) := |k|^{2\alpha}, \quad k \in \hat{\mathbb{R}}^d.$$
In this case, we have $H_{F_\alpha} = (-\Delta)^\alpha$. Let $M_j := \hat{\mathbb{R}}^d \setminus \{k \in \hat{\mathbb{R}}^d \mid k_j = 0\}$ and
$$T_j^{(\alpha)} := \frac{1}{2\alpha}\bigl[v(p)(-\Delta)^{-\alpha+1} p_j^{-1} q_j + q_j v(p)^* (-\Delta)^{-\alpha+1} p_j^{-1}\bigr]$$
with $D\bigl(T_j^{(\alpha)}\bigr) := \mathcal{F}^{-1} C_0^1(M_j)$. Then, $\bigl(T_j^{(\alpha)}, H_{F_\alpha}, K_{C_v}\bigr)$ obeys the GWWR.
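The prefactor $1/(2\alpha)$ and the power $(-\Delta)^{-\alpha+1}$ in $T_j^{(\alpha)}$ can be traced back to the recipe $G = v/\partial_j F$ used in Examples 11.3 and 11.4. The short check below assumes that the same choice of $G$ is intended in Example 11.5 (the text does not state $G$ explicitly there) and uses $p^2 = -\Delta$.

```latex
% Assumption: G = v / \partial_j F_\alpha, as in Examples 11.3 and 11.4.
\begin{align*}
F_\alpha(k) &= |k|^{2\alpha} = (k^2)^{\alpha}, \\
\partial_j F_\alpha(k) &= \alpha\,(k^2)^{\alpha-1}\cdot 2k_j
   = 2\alpha\,|k|^{2\alpha-2}\,k_j, \\
G(k) &= \frac{v(k)}{\partial_j F_\alpha(k)}
   = \frac{1}{2\alpha}\,v(k)\,|k|^{-2\alpha+2}\,k_j^{-1}, \\
G(p) &= \frac{1}{2\alpha}\,v(p)\,(-\Delta)^{-\alpha+1}\,p_j^{-1}, \\
C^{(j)}_{F_\alpha,G} &= [G(p) + G(p)^*]\,(\partial_j F_\alpha)(p)
   = v(p) + v(p)^* = C_v .
\end{align*}
% For \alpha = 1: G(k) = v(k)/(2k_j), which is Example 11.3 with m = 1/2.
```

With this $G$, the boundedness condition (11.4) reduces to the boundedness of $v$, and inserting $G(p)$ into (11.5) reproduces the displayed formula for $T_j^{(\alpha)}$.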
11.2. Abstract Dirac operators

Let $\mathcal{K}$ be a Hilbert space and $A_j$ ($j = 1, \dots, d$) and $B$ be bounded self-adjoint operators satisfying the anticommutation relations
$$\{A_j, A_l\} = 2\delta_{jl}, \quad B^2 = I, \quad \{A_j, B\} = 0, \quad j, l = 1, \dots, d,$$
where $\{X, Y\} := XY + YX$. Let $M$ be a strictly positive, continuously differentiable function on $\mathbb{R}^d$ such that $\partial_j M$ is bounded ($j = 1, \dots, d$) and the set
$$N_j := \{k \in \hat{\mathbb{R}}^d \mid k_j + M(k)(\partial_j M)(k) = 0\}$$
is closed with Lebesgue measure zero. Then, the operator
$$p_j(M) := p_j + M(p)(\partial_j M)(p) \tag{11.11}$$
is injective and $\mathcal{F}^{-1} C_0^\infty(\hat{\mathbb{R}}^d \setminus N_j) \subset D(p_j(M)^{-1})$. We define an operator of Dirac type
$$H_D := \sum_{j=1}^{d} A_j \otimes p_j + B \otimes M(p) \tag{11.12}$$
acting on $\mathcal{K} \otimes L^2(\mathbb{R}^d)$. This is an abstract Dirac operator.

Example 11.6. The usual free Dirac operator [22, p. 7] is given by taking $d = 3$, $\mathcal{K} = \mathbb{C}^4$, $A_j = \alpha_j$, $M = m$ (a positive constant), $B = \beta$ with $\alpha_j$ and $\beta$ being $(4 \times 4)$-Hermitian matrices satisfying $\{\alpha_j, \alpha_l\} = 2\delta_{jl}$, $\{\beta, \alpha_j\} = 0$, $\beta^2 = I$, $j, l = 1, 2, 3$.

By a general theorem [3, Theorem 4.3], $H_D$ is self-adjoint and
$$H_D^2 = I \otimes (-\Delta + M(p)^2).$$
In what follows, under the natural identification $\mathcal{K} \otimes L^2(\mathbb{R}^d) = L^2(\mathbb{R}^d; \mathcal{K})$ (the Hilbert space of $\mathcal{K}$-valued square integrable functions on $\mathbb{R}^d$), we write $A_j \otimes p_j$ as $A_j p_j$. The self-adjoint operator
$$L := (-\Delta + M(p)^2)^{1/2}$$
is strictly positive. Hence, it has a bounded inverse. Let
$$U := \frac{1}{\sqrt{2}}\,B(H_D + BL) L^{-1/2} (L + M(p))^{-1/2}.$$
Let $v \in C^1(\hat{\mathbb{R}}^d)$ be bounded and
$$T_j := v(p) H_D\, p_j(M)^{-1} U^* q_j U + U^* q_j U v(p)^* H_D\, p_j(M)^{-1}$$
with $D(T_j) := \mathcal{K} \mathbin{\hat{\otimes}} \mathcal{F}^{-1} C_0^\infty(\hat{\mathbb{R}}^d \setminus N_j)$, where $\hat{\otimes}$ denotes algebraic tensor product.

Theorem 11.7. The triple $(T_j, H_D, K_{C_v})$ obeys the GWWR.

Proof. The operator $U$ is unitary and $U H_D U^* = BL$. (This is an abstract structure of the usual free Dirac operator, see [22, §1.4].) We also have
$$\hat{T}_j := U T_j U^* = v(p) B L\, p_j(M)^{-1} q_j + q_j v(p)^* B L\, p_j(M)^{-1}$$
with $D(\hat{T}_j) := U D(T_j)$ (note that $U C_v = C_v U$). Using the Fourier analysis, we can show that $(\hat{T}_j, BL, K_{C_v})$ obeys the GWWR. This implies that $(T_j, H_D, K_{C_v})$ obeys the GWWR.

Remark 11.8. We can apply general results established in Secs. 2–10 to the Hamiltonians $H_F$, $H_{\mathrm{NR}}$, $H_{\mathrm{rel}}$, $H_{F_\alpha}$ and the Dirac operator $H_D$. In particular, we can derive decay estimates (in time $t$) for transition probability amplitudes of states with these Hamiltonians. However, we do not write them down here.
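12. Representations of the GWWR in Fock Spaces

In this section, we show that, given a triple obeying the GWWR in a Hilbert space $\mathcal{H}$, there exist triples obeying the GWWR in Fock spaces (full Fock spaces, boson Fock spaces, fermion Fock spaces) over $\mathcal{H}$.

12.1. Tensor representations of the GWWR

Let $n \in \mathbb{N}$ and $\mathcal{H}_j$ ($j = 1, \dots, n$) be a Hilbert space. Suppose that $(T_j, H_j, K_{C_j})$ obeys the GWWR in $\mathcal{H}_j$ such that $T_j$ is closed symmetric. We consider an operator
$$L := \overline{\sum_{j=1}^{n} \underbrace{I \otimes \cdots \otimes I \otimes H_j \otimes I \otimes \cdots \otimes I}_{n}\ \Big|\ \hat{\otimes}_{j=1}^{n} D(H_j)}, \tag{12.1}$$
where $\hat{\otimes}$ denotes algebraic tensor product. It is well known that $L$ is self-adjoint [19, §VIII.10] and
$$e^{-itL} = e^{-itH_1} \otimes \cdots \otimes e^{-itH_n}, \quad t \in \mathbb{R}. \tag{12.2}$$
Let
$$\tilde{T}_j := \underbrace{I \otimes \cdots \otimes I \otimes T_j \otimes I \otimes \cdots \otimes I}_{n}, \tag{12.3}$$
$$\tilde{C}_j := \underbrace{I \otimes \cdots \otimes I \otimes C_j \otimes I \otimes \cdots \otimes I}_{n}. \tag{12.4}$$

Proposition 12.1. For all $j = 1, \dots, n$, $(\tilde{T}_j, L, K_{\tilde{C}_j})$ obeys the GWWR.

Proof. Let $\Psi = \psi_1 \otimes \cdots \otimes \psi_n$ with $\psi_j \in D(T_j)$, $j = 1, \dots, n$. Then, for all $t \in \mathbb{R}$, $e^{-itL}\Psi = \otimes_{l=1}^{n} e^{-itH_l}\psi_l \in \hat{\otimes}_{l=1}^{n} D(T_l)$ and
$$\begin{aligned}
\tilde{T}_j e^{-itL}\Psi &= e^{-itH_1}\psi_1 \otimes \cdots \otimes e^{-itH_{j-1}}\psi_{j-1} \otimes T_j e^{-itH_j}\psi_j \otimes e^{-itH_{j+1}}\psi_{j+1} \otimes \cdots \otimes e^{-itH_n}\psi_n \\
&= e^{-itH_1}\psi_1 \otimes \cdots \otimes e^{-itH_j}(T_j + tC_j)\psi_j \otimes \cdots \otimes e^{-itH_n}\psi_n \\
&= e^{-itL}(\tilde{T}_j + t\tilde{C}_j)\Psi.
\end{aligned}$$
Since $\hat{\otimes}_{j=1}^{n} D(T_j)$ is a core of $\tilde{T}_j$, the last equality extends to all $\Psi \in D(\tilde{T}_j)$ with $e^{-itL}\Psi \in D(\tilde{T}_j)$. Hence, $(\tilde{T}_j, L, K_{\tilde{C}_j})$ obeys the GWWR.

12.2. Constructions of triples obeying the GWWR in Fock spaces

Let $\mathcal{H}$ be a Hilbert space. Then, the full Fock space over $\mathcal{H}$ is defined by
$$\mathcal{F}(\mathcal{H}) := \oplus_{n=0}^{\infty} \otimes^n \mathcal{H} \tag{12.5}$$
with $\otimes^0 \mathcal{H} := \mathbb{C}$ (see, e.g., [7, §5.2] or [19, §II.4, §VIII.10] for Fock space theory). For a densely defined closed linear operator $A$ on $\mathcal{H}$ and $n \in \{0\} \cup \mathbb{N}$, we define a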
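linear operator $A_\Sigma^{(n)}$ on $\otimes^n \mathcal{H}$ as follows:
$$A_\Sigma^{(0)} := 0, \tag{12.6}$$
$$A_\Sigma^{(n)} := \overline{\sum_{j=1}^{n} \underbrace{I \otimes \cdots \otimes I \otimes \overset{j\text{th}}{A} \otimes I \otimes \cdots \otimes I}_{n}\ \Big|\ \hat{\otimes}^n D(A)} \tag{12.7}$$
for $n \ge 1$. Then, the operator
$$d\Gamma(A) := \oplus_{n=0}^{\infty} A_\Sigma^{(n)}, \tag{12.8}$$
called the second quantization of $A$, is densely defined and closed. If $A$ is self-adjoint, then so is $d\Gamma(A)$.

Let $H$ be a self-adjoint operator on $\mathcal{H}$, $T$ be a closed symmetric operator on $\mathcal{H}$ and $C \in B(\mathcal{H})$ be self-adjoint. In what follows, we take the following hypothesis:

Hypothesis (I). The triple $(T, H, K_C)$ obeys the GWWR in $\mathcal{H}$.

We define
$$T_j^{(n)} := \begin{cases} a_{n,j}; & 0 \le n < j, \\ \underbrace{I \otimes \cdots \otimes I \otimes \overset{j\text{th}}{T} \otimes I \otimes \cdots \otimes I}_{n}; & n \ge j, \end{cases} \tag{12.9}$$
where $a_{n,j}$ is an arbitrary real constant. For $j \in \mathbb{N}$, we define a linear operator $T_j$ on $\mathcal{F}(\mathcal{H})$ by
$$T_j := \oplus_{n=0}^{\infty} T_j^{(n)}. \tag{12.10}$$
Then, $T_j$ is a closed symmetric operator. Let
$$C_j^{(n)} := \begin{cases} 0; & 0 \le n < j, \\ \underbrace{I \otimes \cdots \otimes I \otimes \overset{j\text{th}}{C} \otimes I \otimes \cdots \otimes I}_{n}; & n \ge j, \end{cases} \tag{12.11}$$
and
$$C_j := \oplus_{n=0}^{\infty} C_j^{(n)}. \tag{12.12}$$
Then, $C_j$ is a bounded self-adjoint operator on $\mathcal{F}(\mathcal{H})$.

Proposition 12.2. Under Hypothesis (I), for all $j \in \mathbb{N}$, $(T_j, d\Gamma(H), K_{C_j})$ obeys the GWWR.

Proof. It follows from Proposition 12.1 that, for all $n$, $\bigl(T_j^{(n)}, H_\Sigma^{(n)}, K_{C_j^{(n)}}\bigr)$ obeys the GWWR. It is easy to see that this implies the desired result.

We next construct a triple obeying the GWWR in $\mathcal{F}(\mathcal{H})$ which is reduced by the boson Fock space over $\mathcal{H}$
$$\mathcal{F}_{\mathrm{b}}(\mathcal{H}) := \oplus_{n=0}^{\infty} \otimes_{\mathrm{s}}^n \mathcal{H},$$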
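where $\otimes_{\mathrm{s}}^n \mathcal{H}$ denotes the $n$-fold symmetric tensor product of $\mathcal{H}$, and the fermion Fock space over $\mathcal{H}$
$$\mathcal{F}_{\mathrm{f}}(\mathcal{H}) := \oplus_{n=0}^{\infty} \otimes_{\mathrm{as}}^n \mathcal{H},$$
where $\otimes_{\mathrm{as}}^n \mathcal{H}$ denotes the $n$-fold antisymmetric tensor product of $\mathcal{H}$. The operator
$$\hat{N} := d\Gamma(I) \tag{12.13}$$
on $\mathcal{F}(\mathcal{H})$ is called the number operator. The vector
$$\Omega := \{1, 0, 0, \dots\} \in \mathcal{F}(\mathcal{H}) \tag{12.14}$$
is called the Fock vacuum. We denote by $P_0$ the orthogonal projection onto the one-dimensional subspace $\{z\Omega \mid z \in \mathbb{C}\}$. We set
$$Q_0 := I - P_0. \tag{12.15}$$
Let
$$S := Q_0 \hat{N}^{-1/2} d\Gamma(C) \hat{N}^{-1/2} Q_0. \tag{12.16}$$

Lemma 12.3. The operator $S$ is bounded and self-adjoint with $\|S\| \le \|C\|$.

Proof. It is easy to see that, for all $\Psi \in D(S)$, $(S\Psi)^{(0)} = 0$ and, for $n \ge 1$,
$$(S\Psi)^{(n)} = \frac{(d\Gamma(C)\Psi)^{(n)}}{n}.$$
Since $C_\Sigma^{(n)}$ is a sum of $n$ operators each of norm $\|C\|$, we have $\|C_\Sigma^{(n)}\| \le n\|C\|$. Hence, $\|(S\Psi)^{(n)}\| \le \|C\|\,\|\Psi^{(n)}\|$, which implies that $\|S\Psi\|^2 \le \|C\|^2 \|\Psi\|^2$. Hence, $S$ is bounded with $\|S\| \le \|C\|$ and $D(S) = \mathcal{F}(\mathcal{H})$. It is easy to see that $S$ is symmetric.

Let
$$D_T := \oplus_{n=0}^{\infty} \hat{\otimes}^n D(T) \tag{12.17}$$
and
$$\tau(T) := Q_0 \hat{N}^{-1/2} d\Gamma(T) \hat{N}^{-1/2} Q_0 \big| D_T. \tag{12.18}$$
Then, $\tau(T)$ is symmetric.

Theorem 12.4. Assume Hypothesis (I). Then, $(\tau(T), d\Gamma(H), K_S)$ obeys the GWWR.

Proof. For all $\Psi \in D_T$, $\hat{N}^{-1/2} Q_0 \Psi \in D_T$, since
$$\bigl(\hat{N}^{-1/2} Q_0 \Psi\bigr)^{(0)} = 0, \qquad \bigl(\hat{N}^{-1/2} Q_0 \Psi\bigr)^{(n)} = n^{-1/2}\Psi^{(n)}, \quad n \ge 1.$$
Using this property and the easily proven fact that $e^{-it\,d\Gamma(H)} \hat{N}^{-1/2} Q_0 = \hat{N}^{-1/2} Q_0 e^{-it\,d\Gamma(H)}$, we have by direct computations
$$d\Gamma(T) \hat{N}^{-1/2} Q_0 e^{-it\,d\Gamma(H)}\Psi = e^{-it\,d\Gamma(H)} d\Gamma(T) \hat{N}^{-1/2} Q_0 \Psi + t e^{-it\,d\Gamma(H)} d\Gamma(C) \hat{N}^{-1/2} Q_0 \Psi.$$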
Applying $Q_0 \hat{N}^{-1/2}$ to both sides, we obtain
$$\tau(T) e^{-it\,d\Gamma(H)}\Psi = e^{-it\,d\Gamma(H)} \tau(T)\Psi + t e^{-it\,d\Gamma(H)} S\Psi,$$
which implies the desired result.

Remark 12.5. Applying the results in Sec. 8, we can derive decay estimates for $|\langle \Phi, e^{-it\,d\Gamma(H)}\Psi\rangle|$ ($t \in \mathbb{R}$) for vectors $\Phi, \Psi$ in suitable subspaces. However, we do not write them down here. The same applies to other results implied by the abstract theory.

It is easy to see that $d\Gamma(H)$, $\tau(T)$ and $S$ are reduced by $\mathcal{F}_\#(\mathcal{H})$ ($\# = \mathrm{b}, \mathrm{f}$). Hence, Theorem 12.4 holds for the reduced parts of them too.

It is more interesting and important to construct generalized time operators of perturbed Hamiltonians of the form $d\Gamma(H) + V$ on the boson Fock space $\mathcal{F}_{\mathrm{b}}(\mathcal{H})$ or the fermion Fock space $\mathcal{F}_{\mathrm{f}}(\mathcal{H})$ with $V$ a symmetric operator. For this purpose, the method given in Sec. 2.4 can be applied. However, in the present paper, we leave this problem for future study.

References

[1] Y. Aharonov and D. Bohm, Significance of electromagnetic potentials in the quantum theory, Phys. Rev. 115 (1959) 485–491.
[2] Y. Aharonov and D. Bohm, Time in the quantum theory and the uncertainty relation for time and energy, Phys. Rev. 122 (1961) 1649–1658.
[3] A. Arai, Characterization of anticommutativity of self-adjoint operators in connection with Clifford algebra and applications, Integr. Equat. Oper. Theory 17 (1993) 451–463.
[4] A. Arai, Hilbert Spaces and Quantum Mechanics (Kyoritsu-shuppan, Tokyo, 1997) (in Japanese).
[5] A. Arai, Representation-theoretic aspects of two-dimensional quantum systems in singular vector potentials: Canonical commutation relations, quantum algebras and reduction to lattice quantum systems, J. Math. Phys. 39 (1998) 2476–2498.
[6] A. Arai, Mathematical Principles of Quantum Phenomena (Asakura-shoten, Tokyo, 2005) to be published (in Japanese).
[7] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics 2, 2nd ed. (Springer, Berlin, Heidelberg, 1997).
[8] B. Fuglede, On the relation PQ − QP = −iId, Math. Scand. 20 (1967) 79–88.
[9] S. T. Kuroda, Spectral Theory II (Iwanami-shoten, 1979) (in Japanese).
[10] L. Mandel and E. Wolf, Optical Coherence and Quantum Optics (Cambridge University Press, Cambridge, 1995).
[11] M. Miyamoto, A generalized Weyl relation approach to the time operator and its connection to the survival probability, J. Math. Phys. 42 (2001) 1038–1052.
[12] M. Miyamoto, Characteristic decay of the autocorrelation functions prescribed by the Aharonov–Bohm time operator, arXiv:quant-ph/0105033v2 (2001).
[13] M. Miyamoto, The various power decays of the survival probability at long times for free quantum particle, J. Phys. A35 (2002) 7159–7171.
[14] M. Miyamoto, Difference between the decay forms of the survival and non-escape probabilities, arXiv:quant-ph/0207067v2 (2002).
[15] J. G. Muga, R. Sala Mayato and I. L. Egusquiza (eds.), Time in Quantum Mechanics (Springer, 2002).
[16] J. von Neumann, Die Eindeutigkeit der Schrödingerschen Operatoren, Math. Ann. 104 (1931) 570–578.
[17] P. Pfeifer and J. Fröhlich, Generalized time-energy uncertainty relations and bounds on lifetimes of resonances, Rev. Mod. Phys. 67 (1995) 759–779.
[18] C. R. Putnam, Commutation Properties of Hilbert Space Operators (Springer, Berlin, 1967).
[19] M. Reed and B. Simon, Methods of Modern Mathematical Physics I: Functional Analysis (Academic Press, New York, 1972).
[20] M. Reed and B. Simon, Methods of Modern Mathematical Physics III: Scattering Theory (Academic Press, New York, 1979).
[21] K. Schmüdgen, On the Heisenberg commutation relation. I, J. Funct. Anal. 50 (1983) 8–49.
[22] B. Thaller, The Dirac Equation (Springer, Berlin Heidelberg, 1992).
[23] E. P. Wigner, On the time-energy uncertainty relation, in Aspects of Quantum Theory, eds. A. Salam and E. P. Wigner (Cambridge University, Cambridge, UK, 1972), pp. 237–247.
November 18, 2005 10:54 WSPC/148-RMP
J070-00250
Reviews in Mathematical Physics Vol. 17, No. 10 (2005) 1111–1142 c World Scientific Publishing Company
LOWER SPECTRAL BRANCHES OF A PARTICLE COUPLED TO A BOSE FIELD

NICOLAE ANGELESCU
National Institute of Physics and Nuclear Engineering “H. Hulubei”, P.O. Box MG-6, Bucharest, Romania
[email protected]

ROBERT A. MINLOS
Institute for Information Transmissions Problems, Bolshoj Karetny per.19, GSP-4, Moscow 101447, Russia
[email protected]

VALENTIN A. ZAGREBNOV
Université de la Méditerranée and Centre de Physique Théorique, Luminy, Case 907, Marseille 13288, Cedex 9, France
[email protected]

Received 9 April 2005
Revised 21 September 2005

The structure of the lower part (i.e. ε-away below the two-boson threshold) of the spectrum of Fröhlich’s polaron Hamiltonian in the weak coupling regime is obtained in spatial dimension $d \ge 3$. It contains a single polaron branch defined for total momentum $p \in G^{(0)}$, where $G^{(0)} \subset \mathbb{R}^d$ is a bounded domain, and, for any $p \in \mathbb{R}^d$, a manifold of polaron + one-boson states with boson momentum $q$ in a bounded domain depending on $p$. The polaron becomes unstable and dissolves into the one-boson manifold at the boundary of $G^{(0)}$. The dispersion laws and generalized eigenfunctions are calculated.

Keywords: Polaron problem; Bose field; spectral branches.

Mathematics Subject Classification 2000: 81Q10, 47A40, 47A10, 47A55
1. Introduction

We consider the quantum system consisting of a particle (electron) coupled with a Bose field (describing the optical phonons in an ionic crystal) by an interaction linear in the creation-annihilation operators, known in the physics literature as Fröhlich’s polaron model [1, 2]. In particular, we are not concerned here with the relevant infrared problem when acoustic phonons are considered [3–5].

There are many papers, both physical and mathematical, devoted to this subject. These are mainly concerned with the ground state $F_p^{(0)}$ of the Hamiltonian
$H_p$ of the system at fixed total momentum $p$ acting in the Hilbert space $\mathcal{H}(p)$ (see below). The ground state describes the “polaron”, i.e. the particle in a “cloud of virtual bosons”.

There are several regimes of interest in the polaron model [6]. In the strong coupling regime, the main object of study is the asymptotic behaviour at large coupling of the polaron mass $m$ ($F_p^{(0)} \sim p^2/2m$, $p \to \infty$), see [7] and references therein. In the perturbational regime, detailed information about the existence and structure of the ground state is available for the case when the particle Hamiltonian already has a ground state [8, 9].

We are concerned in this paper with the translation invariant case at small $\alpha$, for which the ground state has been rigorously constructed [10, 11]. It is shown that for sufficiently small particle-field coupling constant $\alpha$, the ground state $F_p^{(0)}$ exists only for momentum $p$ in a certain domain $G^{(0)} \subset \mathbb{R}^d$, where $G^{(0)}$ is bounded for space dimension $d \ge 3$ and $G^{(0)} = \mathbb{R}^d$ for $d = 1, 2$.

Our aim is to investigate further this regime by studying the next, “one-boson”, branch of the spectrum of $H_p$ for $d \ge 3$. The expected mathematical picture is the following: there exists an invariant subspace $\mathcal{H}_1(p) \subset \mathcal{H}(p)$ of the operator $H_p$, which is isomorphic in a natural way with $L^2\bigl(G_p^{(1)}, dq\bigr)$, where $G_p^{(1)} \subset \mathbb{R}^d$ is a certain bounded domain, such that $H_p$ acts in this subspace as multiplication with a function $\xi_p(q)$, which can be viewed as the energy of a boson of momentum $q$ (while the total momentum of the system is $p$). The range of this function is the segment $[\lambda_1(p), \lambda_2(p))$, where $\lambda_1(p)$ and $\lambda_2(p)$ are the thresholds of the one- and two-boson states, respectively. Moreover, in the subspace orthogonal to $\mathcal{H}_0(p) \oplus \mathcal{H}_1(p)$, where $\mathcal{H}_0(p) = cF_p^{(0)}$ is the one-dimensional subspace generated by the ground state whenever it exists, the spectrum of $H_p$ lies above $\lambda_2(p)$ (this latter property will be called “the completeness of the one-boson spectrum”).

The states in $\mathcal{H}_1(p)$ can be viewed as scattering states of a boson and a polaron. Unfortunately, we shall obtain here only part of the above picture. Namely, we are able to construct only a subspace $\mathcal{H}_1^\kappa(p) \subset \mathcal{H}_1(p)$ isomorphic to $L^2\bigl(G_p^{(1),\kappa}, dq\bigr)$, where $G_p^{(1),\kappa} = \bigl\{q \in G_p^{(1)} : \xi_p(q) < \kappa\bigr\}$. Here, $\kappa < \lambda_2(p)$ can be chosen arbitrarily close to $\lambda_2(p)$, at the expense of taking the coupling constant $\alpha$ sufficiently small. Our techniques allow the construction of the whole space $G_p^{(1)}$ and the proof of the completeness of the one-boson spectrum for $d \ge 5$, but we do not burden this paper with more technicalities of less physical interest (in view of the large space dimension).

Our analysis of the one-boson branch covers only the cases $d \ge 3$, though we expect that the same picture holds in lower dimension, with $G_p^{(1)} = \mathbb{R}^d$ for $d = 1, 2$. The calculations are based on a technique used by one of the authors in [10], and also on certain facts connected with the spectral analysis of the so-called generalized Friedrichs model [13].

After the completion of this work, we became aware of the recent paper [12] where the decomposition of the Hilbert space into subspaces corresponding to the $n$-boson branches of the spectrum is performed with a completely different approach
November 18, 2005 10:54 WSPC/148-RMP
J070-00250
Lower Spectral Branches of a Particle Coupled to a Bose Field
1113
(using time-dependent scattering methods). As compared to [12], which gives essentially an existence result, our approach, though restricted here to $n = 1$, has the advantage of allowing a quite explicit construction of the generalized eigenvectors in this sector.
We proceed now to a detailed presentation of the model and a precise statement of the main result. The state space of our model is the Hilbert space $\mathcal{H} = L^2(\mathbb{R}^d) \otimes \mathcal{F}$, where $\mathcal{F}$ is the symmetric (boson) Fock space
\[ \mathcal{F} = \mathcal{F}_{\rm sym}(L^2(\mathbb{R}^d)) = \bigoplus_{n=0}^{\infty} \mathcal{H}^{(n)}, \]
with $\mathcal{H}^{(0)} = \mathbb{C}$ and $\mathcal{H}^{(n)} = (L^2(\mathbb{R}^d))^{\otimes n}_{\rm sym}$ the symmetric tensor power ($n \ge 1$). Thus, the vectors of $\mathcal{H}$ are sequences
\[ F = \{f_0(p_0), f_1(p_0; q), \dots, f_n(p_0; q_1, \dots, q_n), \dots\}, \tag{1.1} \]
where $f_n$ are, for every $p_0 \in \mathbb{R}^d$, symmetric functions of the variables $q_1, \dots, q_n$, and the norm is given by
\[ \|F\|^2 = \int_{\mathbb{R}^d} |f_0(p_0)|^2 \, dp_0 + \sum_{n=1}^{\infty} \frac{1}{n!} \int_{\mathbb{R}^d} \int_{\mathbb{R}^{nd}} |f_n(p_0; q_1, \dots, q_n)|^2 \, dp_0 \prod_{i=1}^{n} dq_i. \tag{1.2} \]
The Hamiltonian of our system has the form
\[ H = H_0^{\rm part} + H_0^{\rm field} + \alpha H_{\rm int}, \tag{1.3} \]
where $\alpha > 0$ is a coupling constant and
\[
\begin{aligned}
(H_0^{\rm part} F)_n(p_0; q_1, \dots, q_n) &= \tfrac{1}{2} p_0^2 \, f_n(p_0; q_1, \dots, q_n), \\
(H_0^{\rm field} F)_n(p_0; q_1, \dots, q_n) &= \sum_{i=1}^{n} \varepsilon(q_i) \, f_n(p_0; q_1, \dots, q_n), \\
(H_{\rm int} F)_n(p_0; q_1, \dots, q_n) &= \sum_{i=1}^{n} c(p_0; q_i) \, f_{n-1}(p_0 + q_i; q_1, \dots, \check{q}_i, \dots, q_n) \\
&\quad + \int_{\mathbb{R}^d} c(p_0; q) \, f_{n+1}(p_0 - q; q_1, \dots, q_n, q) \, dq,
\end{aligned}
\tag{1.4}
\]
with the convention that a sum over a void set is 0, and where the notation qˇ means that the variable q is omitted. The properties of the functions ε and c will be given in detail later. Notice that, with the minimal assumptions: ε is a positive real function and the function c is bounded and with sufficiently rapid decay for q → ∞, the operator H is self-adjoint and bounded from below.
N. Angelescu, R. A. Minlos & V. A. Zagrebnov
Remark 1.1. The polaron Hamiltonian [1, 2] has the form above with c(p, q) ∼ 1/|q|. We consider a slight generalization thereof by allowing a p-dependence of c, which actually appears, e.g., in Pauli–Fierz-type models, and to which our approach applies without any further difficulties. On the other hand, we need to impose decay and regularity properties of c, which are not satisfied by 1/|q|. A first simplification in the spectral analysis of H comes from the conservation of the total momentum, i.e. from the fact that H commutes with the operator
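\[ (\hat{P} F)_n(p_0; q_1, \dots, q_n) = \Big( p_0 + \sum_{i=1}^{n} q_i \Big) f_n(p_0; q_1, \dots, q_n), \quad n \ge 0. \tag{1.5} \]
As a consequence, $\mathcal{H}$ can be written as a direct integral of Hilbert spaces $\mathcal{H}(p)$,
\[ \mathcal{H} = \int_{\mathbb{R}^d}^{\oplus} \mathcal{H}(p) \, dp, \tag{1.6} \]
which reduces both $\hat{P}$ and $H$, i.e. induces the decompositions
\[ \hat{P} = \int_{\mathbb{R}^d}^{\oplus} p I \, dp, \qquad H = \int_{\mathbb{R}^d}^{\oplus} H_p \, dp, \tag{1.7} \]
where $I$ (the unit operator) and $H_p$ are operators in $\mathcal{H}(p)$. For a vector $F$ as given in (1.1), we get the representation
\[ F = \int_{\mathbb{R}^d}^{\oplus} \hat{F}_p \, dp, \]
where $\hat{F}_p = \{\hat{f}_{p,n}\}_{n \ge 0}$ and $\hat{f}_{p,n}$ is the restriction of $f_n$ to the hyperplane $p_0 + \sum_{i=1}^{n} q_i = p$. The spaces $\mathcal{H}(p)$ will be identified with $\mathcal{F} = \mathcal{F}_{\rm sym}(L^2(\mathbb{R}^d))$ by means of the unitaries
\[ (U_p \hat{F}_p)_n(q_1, \dots, q_n) = f_n\Big( p - \sum_{i=1}^{n} q_i; q_1, \dots, q_n \Big). \tag{1.8} \]
With this identification, the action of $H_p$ in $\mathcal{F}$ is given by the formula
\[
\begin{aligned}
(H_p F)_n(q_1, \dots, q_n) &= e^0_{n,p}(q_1, \dots, q_n) \, f_n(q_1, \dots, q_n) \\
&\quad + \alpha \sum_{i=1}^{n} c\Big( p - \sum_{j=1}^{n} q_j; q_i \Big) f_{n-1}(q_1, \dots, \check{q}_i, \dots, q_n) \\
&\quad + \alpha \int_{\mathbb{R}^d} c\Big( p - q - \sum_{j=1}^{n} q_j; q \Big) f_{n+1}(q_1, \dots, q_n, q) \, dq,
\end{aligned}
\tag{1.9}
\]
where
\[ e^0_{n,p}(q_1, \dots, q_n) = \frac{1}{2} \Big( p - \sum_{i=1}^{n} q_i \Big)^2 + \sum_{i=1}^{n} \varepsilon(q_i). \tag{1.10} \]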
The functions $\varepsilon$ and $c$ are supposed to fulfill the following conditions:
1. $\varepsilon(q)$ is a convex, nondecreasing function of $|q|$ and there exists $c_o > 0$ such that
\[ \varepsilon(q_1) + \varepsilon(q_2) \ge \varepsilon(q_1 + q_2) + c_o, \quad \forall\, q_1, q_2 \in \mathbb{R}^d. \tag{1.11} \]
Also, we need stronger regularity properties: $\varepsilon \in C^{\infty}(\mathbb{R}^d)$ and it has bounded derivatives, i.e. there exists $R > 0$ such that for all multi-indices $\alpha = (\alpha_1, \dots, \alpha_d) \ne 0$,
\[ \sup_{q \in \mathbb{R}^d} |\partial_q^{\alpha} \varepsilon(q)| \le R, \tag{1.12} \]
where
\[ \partial_q^{\alpha} = \frac{\partial^{|\alpha|}}{\partial q_1^{\alpha_1} \cdots \partial q_d^{\alpha_d}}, \quad q = (q_1, \dots, q_d), \quad |\alpha| = \sum_{i=1}^{d} \alpha_i. \]
The following are physically interesting examples of such functions: (a) $\varepsilon(q) = \varepsilon(0) > 0$; (b) $\varepsilon(q) = \sqrt{q^2 + m^2} + c_o$, $m > 0$, $c_o > 0$.
2. $c(p, q)$ is sufficiently smooth and there exists a bounded, rapidly decreasing function $h : \mathbb{R}^d \to \mathbb{R}_+$ dominating $c$ and all its derivatives, i.e. for all multi-indices $\alpha, \beta$, there exists $C_{\alpha,\beta} > 0$ ($C_{0,0} = 1$) such that
\[ |\partial_p^{\alpha} \partial_q^{\beta} c(p; q)| \le C_{\alpha,\beta} \, h(q), \quad \forall\, p, q \in \mathbb{R}^d. \tag{1.13} \]
We are concerned here with the study by perturbation theory of the (lower part of the) spectrum of the Hamiltonian (1.9) for every fixed $p$:
\[ H_p = H_p^{(0)} + \alpha H_{p,{\rm int}}, \tag{1.14} \]
where $H_p^{(0)}$ denotes the first term, and $\alpha H_{p,{\rm int}}$ the other terms, of Eq. (1.9); sometimes, for notational simplicity, the index $p$ will be omitted.
The spectrum of $H_p^{(0)}$ consists of the eigenvalue $\tfrac{1}{2} p^2$, corresponding to the bare particle, and branches of continuous spectrum $e^0_{n,p}(q_1, \dots, q_n)$, corresponding to bare particle + $n$-boson states, starting at the thresholds
\[ \lambda_n^0(p) = \min_{q_1, \dots, q_n} e^0_{n,p}(q_1, \dots, q_n). \tag{1.15} \]
Note that, in view of the convexity of $p^2$ and $\varepsilon$, the minimum of $e^0_{1,p}(q_1)$ is attained at a single point $q_1^0$, which is its unique critical point and is nondegenerate. Moreover, as a consequence of the inequality (1.11),
\[ \lambda_n^0(p) \ge \lambda_{n-1}^0(p) + c_o. \tag{1.16} \]
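As a quick sanity check of condition (1.11), the following minimal sketch evaluates the gap $\varepsilon(q_1) + \varepsilon(q_2) - \varepsilon(q_1 + q_2)$ on a grid in dimension $d = 1$. It assumes that example (b) has the relativistic form $\varepsilon(q) = \sqrt{q^2 + m^2} + c_o$; the function names, the grid and the parameter values are illustrative choices, not taken from the paper.

```python
import math


def eps(q, m=1.0, c_o=0.5):
    """Dispersion of example (b) in d = 1, assumed to be sqrt(q^2 + m^2) + c_o."""
    return math.sqrt(q * q + m * m) + c_o


def min_subadditivity_gap(m=1.0, c_o=0.5, q_max=10.0, steps=81):
    """Smallest value of eps(q1) + eps(q2) - eps(q1 + q2) on a uniform grid.

    Condition (1.11) requires this gap to be at least c_o everywhere.
    """
    grid = [-q_max + 2.0 * q_max * k / (steps - 1) for k in range(steps)]
    return min(
        eps(q1, m, c_o) + eps(q2, m, c_o) - eps(q1 + q2, m, c_o)
        for q1 in grid
        for q2 in grid
    )


if __name__ == "__main__":
    # Compare the grid minimum of the gap with the constant c_o = 0.5.
    print(min_subadditivity_gap(), ">=", 0.5)
```

A grid check is of course no proof; here the inequality follows from the triangle inequality applied to the vectors $(q_1, m)$ and $(q_2, m)$ in $\mathbb{R}^{d+1}$.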
The main result of the paper is summarized in the following:
Theorem 1.2. (i) For any $d \ge 3$, there exists $\alpha_0 = \alpha_0(d)$ such that for any $\alpha < \alpha_0$, there exist functions $\lambda_1(p) < \lambda_2(p)$, with $\lambda_1(p) < \lambda_1^0(p)$, $\lambda_2(p) < \lambda_2^0(p)$, and a bounded domain $G^{(0)} \subset \mathbb{R}^d$, such that the spectrum of $H_p$ in $(-\infty, \lambda_1(p))$ consists of one nondegenerate eigenvalue $\xi_p^{(0)}$ if $p \in G^{(0)}$ and is void if $p \notin G^{(0)}$. Moreover, $\xi_p^{(0)} < p^2/2$ and $\lim_{p' \to p \in \partial G^{(0)}} \xi_{p'}^{(0)} = \lambda_1(p)$, where $\partial G^{(0)}$ denotes the boundary of the domain $G^{(0)}$. The associated eigenvector $F_p^{(0)}$ is the ground state of $H_p$.
(ii) For any $\kappa \in (\lambda_1(p), \lambda_2(p))$ and any $p \in \mathbb{R}^d$, there exists $\bar{\alpha}_0 = \bar{\alpha}_0(\kappa, p, d)$ such that for any $\alpha < \bar{\alpha}_0$, there exists a domain $G_p^{(1),\kappa} \subset \mathbb{R}^d$, a $C^{\infty}$-function $\xi_p^{\kappa} : G_p^{(1),\kappa} \to [\lambda_1(p), \kappa]$ and a subspace $\mathcal{H}_1^{\kappa}(p) \subset \mathcal{F}$ invariant for $H_p$, such that the restriction of $H_p$ to $\mathcal{H}_1^{\kappa}(p)$ is unitarily equivalent to the operator of multiplication by the function $\xi_p^{\kappa}$ in $L^2(G_p^{(1),\kappa}, dq)$. Thereby, for $\kappa_1 < \kappa_2 \in (\lambda_1(p), \lambda_2(p))$, one gets $G_p^{(1),\kappa_1} \subset G_p^{(1),\kappa_2}$ and $\xi_p^{\kappa_1} = \xi_p^{\kappa_2}|_{G_p^{(1),\kappa_1}}$.
Remark 1.3. Refining slightly the technique of this paper, one can reach $\kappa = \lambda_2(p)$ if the dimension $d$ is sufficiently large, i.e. $\bar{\alpha}_0(\cdot, p, d)$ is bounded away from zero, and the whole one-boson subspace $\mathcal{H}_1^{\kappa = \lambda_2(p)}(p)$ and the function $\xi_p^{\kappa = \lambda_2(p)}$ can be constructed.
2. Outline of the Proof
We shall present first the strategy we adopt in proving Theorem 1.2, in order not to obscure it by the cumbersome calculations to be done. Our constructions involve the resolvent of $H_p$:
\[ R(z) = (H_p - zI)^{-1}. \tag{2.1} \]
We have therefore to solve, for any L ∈ F, the equation: (Hp − zI)F = L.
(2.2)
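We split the space $\mathcal{F}$ as an orthogonal sum $\mathcal{H}^{(\le 1)} \oplus \mathcal{H}^{(\ge 2)}$, corresponding to number $n$ of bare bosons $\le 1$ and $\ge 2$, respectively, and denote by $\Pi_1$, $\Pi_2$ the corresponding orthogonal projections. Hence, $F = F_1 + F_2$, where $F_1 = \Pi_1 F = \{f_0, f_1, 0, 0, \dots\}$, $F_2 = \Pi_2 F = \{0, 0, f_2, \dots\}$, and similarly for the vector $L = L_1 + L_2$. Then, the operator $H_p$ has a matrix representation
\[ H_p = \begin{pmatrix} A_{11} & \alpha A_{12} \\ \alpha A_{21} & A_{22} \end{pmatrix}, \tag{2.3} \]
where $A_{ii} = \Pi_i H_p \Pi_i$, $\alpha A_{ij} = \Pi_i H_p \Pi_j$ ($i \ne j$), in terms of which Eq. (2.2) reads
\[ \begin{cases} (A_{11} - zI) F_1 + \alpha A_{12} F_2 = L_1, \\ \alpha A_{21} F_1 + (A_{22} - zI) F_2 = L_2. \end{cases} \tag{2.4} \]
We define
\[ \lambda_2(p) = \inf \operatorname{spec}(A_{22}). \tag{2.5} \]
By the variational principle,
\[ \lambda_2(p) = \inf_{F \in \mathcal{F};\ \|F\| = 1} (\Pi_2 F, H_p \Pi_2 F) \le \lambda_2^0(p) \tag{2.6} \]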
as $\lambda_2^0(p)$ is obtained as the infimum of the same expression taken over the subspace $\mathcal{H}^{(\le 2)}$ of vectors $F$ with at most two bare bosons. Likewise, we define
\[ \lambda_1(p) = \inf_{F \in \mathcal{H}^{(\ge 1)};\ \|F\| = 1} (F, H_p F) \le \lambda_1^0(p), \tag{2.7} \]
the infimum of the spectrum of the restriction of $H_p$ to the subspace $\mathcal{H}^{(\ge 1)}$ with at least one bare boson.
For $z$ in the resolvent set of $A_{22}$, the second equation in (2.4) can be solved for $F_2$:
\[ F_2 = (A_{22} - zI)^{-1} (L_2 - \alpha A_{21} F_1), \tag{2.8} \]
and hence, one arrives at the following equation for $F_1$:
\[ \big( A_{11} - \alpha^2 A_{12} (A_{22} - zI)^{-1} A_{21} - zI \big) F_1 = L_1 - \alpha A_{12} (A_{22} - zI)^{-1} L_2. \tag{2.9} \]
Let us now restrict to real $z = \xi$ and consider the family of self-adjoint operators acting in $\mathcal{H}^{(\le 1)}$:
\[ A_p(\xi) = A_{11} - \alpha^2 A_{12} (A_{22} - \xi I)^{-1} A_{21}, \quad \xi \in (-\infty, \lambda_2(p)). \tag{2.10} \]
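The elimination step (2.8) to (2.10) can be illustrated on a toy model in which every block is a real number, so that $H_p$ becomes a symmetric $2 \times 2$ matrix. The sketch below is only such a scalar analogue (all function names and parameter values are hypothetical): it solves the self-consistent equation $A(\xi) = \xi$ below $a_{22}$ by bisection and compares the result with the lower eigenvalue of the full matrix.

```python
import math


def lower_eigenvalue(a11, a22, a12, alpha):
    """Smallest eigenvalue of the symmetric matrix [[a11, alpha*a12], [alpha*a12, a22]]."""
    mean = 0.5 * (a11 + a22)
    half_gap = 0.5 * (a11 - a22)
    return mean - math.sqrt(half_gap * half_gap + (alpha * a12) ** 2)


def reduced_operator(xi, a11, a22, a12, alpha):
    """Scalar analogue of A(xi) = A11 - alpha^2 A12 (A22 - xi)^{-1} A21, for xi < a22."""
    return a11 - (alpha * a12) ** 2 / (a22 - xi)


def solve_self_consistent(a11, a22, a12, alpha, tol=1e-12):
    """Solve reduced_operator(xi) = xi on (-inf, a22) by bisection.

    Assumes alpha * a12 != 0, so that g(xi) = reduced_operator(xi) - xi is
    strictly decreasing from +inf to -inf on that half-line.
    """
    lo = min(a11, a22) - abs(alpha * a12) - 1.0  # g(lo) > 0
    hi = a22                                     # g -> -inf as xi -> a22 from below
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reduced_operator(mid, a11, a22, a12, alpha) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    xi = solve_self_consistent(1.0, 3.0, 1.0, 0.5)
    print(xi, lower_eigenvalue(1.0, 3.0, 1.0, 0.5))
```

In this toy setting the root of $A(\xi) = \xi$ below $a_{22}$ coincides with the lower eigenvalue of the matrix, and it lies below $a_{11}$, mirroring the inequality $\xi_p^{(0)} < p^2/2$ of Theorem 1.2.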
We shall show that, under our assumptions and for $\xi \le \kappa \in (\lambda_1(p), \lambda_2(p))$, $A_p(\xi)$ are generalized Friedrichs operators, i.e. each operator $A_p(\xi) = A$ allows in the space $\mathcal{H}^{(\le 1)} = \mathbb{C} \oplus L^2(\mathbb{R}^d, dq)$ a representation of the form
\[
\begin{aligned}
(AF)_0 &= e^{(0)} f_0 + \alpha \int \bar{v}(q) f_1(q) \, dq, \\
(AF)_1(q) &= \alpha v(q) f_0 + a(q) f_1(q) + \alpha^2 \int D(q, q') f_1(q') \, dq',
\end{aligned}
\tag{2.11}
\]
where $F = (f_0, f_1) \in \mathcal{H}^{(\le 1)}$. Here, $v(q)$, $a(q)$ and the kernel $D(q, q')$ fulfill a set of smoothness conditions (given in detail in Sec. 4.1), $a(q)$ is bounded from below and grows at most linearly at $\infty$, and $v(q)$, $D(q, q')$ decrease fast at $\infty$. Such operators allow, for small $\alpha$, a complete spectral analysis (see [13–15] and Sec. 4.1 below); namely, leaving aside the possible eigenvalue $e_p$ (ground state), they are unitarily equivalent to the operator of multiplication by $a(q)$ in $L^2(\mathbb{R}^d, dq)$. Let us denote, for given $p$ and $\xi$, by $a_p(\xi, q)$ the function $a(q)$ entering Eq. (2.11) written for $A_p(\xi)$. In essence, the key to the spectral analysis of $H_p$ lies in the following remark:
Remark 2.1. Let $F = F_1 + F_2$ ($F_1 \in \mathcal{H}^{(\le 1)}$, $F_2 \in \mathcal{H}^{(\ge 2)}$) be a (generalized) eigenvector of $H_p$ with eigenvalue $\xi$. Then, by Eq. (2.9), $F_1$ is a (generalized) eigenvector of the operator $A_p(\xi)$ with the same eigenvalue $\xi$. Conversely, suppose that $F_{\xi,1}$ is the eigenvector of $A_p(\xi)$ of eigenvalue $e_p(\xi)$ (whenever it exists). If the equation
\[ e_p(\xi) = \xi \tag{2.12} \]
has a solution $\xi_p^{(0)}$, then $F = F_{\xi_p^{(0)},1} + F_{\xi_p^{(0)},2}$, where
\[ F_{\xi_p^{(0)},2} = -\alpha (A_{22} - \xi_p^{(0)} I)^{-1} A_{21} F_{\xi_p^{(0)},1}, \]
is an eigenvector of $H_p$ with eigenvalue $\xi_p^{(0)}$. Likewise, for a given $p$, let $F_{\xi,1}^{q}$ be a generalized eigenvector of the operator $A_p(\xi)$ corresponding to the eigenvalue $a_p(\xi, q)$ and let $\xi_p(q)$ be a solution of the equation
\[ a_p(\xi, q) = \xi. \tag{2.13} \]
Then, for $F_1^q = F_{\xi(q),1}^{q}$ and
\[ F_2^q = -\alpha (A_{22} - \xi(q) I)^{-1} A_{21} F_1^q, \tag{2.14} \]
the vector $F^q = F_1^q + F_2^q$ is a generalized eigenvector of $H_p$ for the eigenvalue $\xi_p(q)$.
The domain $G^{(0)}$ is identified with the set of $p$ for which Eq. (2.12) has a solution. For any given $p$, $G_p^{(1),\kappa}$ is the set of $q$ for which Eq. (2.13) has a solution $\xi(q) \le \kappa$. The constructions of the subspace $\mathcal{H}_1^{\kappa}(p)$ and of the unitary equivalence of the operator $H_p|_{\mathcal{H}_1^{\kappa}(p)}$ to the multiplication by $\xi_p^{\kappa}(q) = \xi_p(q)$ are done in the standard way in terms of the family $\{F^q\}_{q \in G_p^{(1),\kappa}}$ of generalized eigenvectors of $H_p$.
3. Elimination of the Many-Body Components
In this section, we study perturbatively the solution (2.8) and derive its main properties of interest to us. By virtue of Eq. (1.14) and the inequality (2.6), one can factorize the unperturbed (diagonal) part $H_p^{(0)} - z$ for $z \notin [\lambda_2(p), \infty)$, and bring the second equation (2.4) to the form of the equivalent fixed-point equation:
\[ F_2 + Q(z) F_2 = \alpha \big( H_p^{(0)} - z \big)^{-1} L_2 - G F_1, \tag{3.1} \]
where
\[ Q(z) = \alpha \big( H_p^{(0)} - z \big)^{-1} \Pi_2 H_{p,{\rm int}} \Pi_2, \tag{3.2} \]
\[ G = \alpha \big( H_p^{(0)} - z \big)^{-1} \Pi_2 H_{p,{\rm int}} \Pi_1. \tag{3.3} \]
Explicitly, the vector $G F_1$ has the form $\{0, 0, g_2, 0, \dots\}$ with
\[ g_2(q_1, q_2) = \frac{\alpha \big[ c(p - q_1 - q_2; q_1) f_1(q_2) + c(p - q_1 - q_2; q_2) f_1(q_1) \big]}{e^0_{2,p}(q_1, q_2) - z}. \tag{3.4} \]
Lemma 3.1. For every $\kappa \in (\lambda_1(p), \lambda_2(p))$, there exists $\alpha_0(\kappa)$ such that, for any $\alpha < \alpha_0(\kappa)$ and any $z \in \mathbb{C}$ with ${\rm Re}\, z = \xi \le \kappa$, $\|Q(z)\| < 1/2$; therefore Eq. (3.1) has a unique solution $F_2$ for every $f_1 \in L^2(\mathbb{R}^d, dq)$ and $L_2 \in \mathcal{H}^{(\ge 2)}$.
Proof. We write $Q(z)$ as a sum of its creation and annihilation parts, $Q(z) = Q' + Q''$, with
\[
(Q' F)_n(q_1, \dots, q_n) =
\begin{cases}
0, & n = 2, \\
\alpha \big( e^0_{n,p}(q_1, \dots, q_n) - z \big)^{-1} \sum_{i=1}^{n} c\big( p - \sum_j q_j; q_i \big) f_{n-1}(\cdots \check{q}_i \cdots), & n > 2,
\end{cases}
\tag{3.5}
\]
\[
(Q'' F)_n(q_1, \dots, q_n) = \alpha \big( e^0_{n,p}(q_1, \dots, q_n) - z \big)^{-1} \int c\Big( p - \sum_j q_j - q; q \Big) f_{n+1}(q_1, \dots, q_n, q) \, dq, \quad n \ge 2.
\tag{3.6}
\]
By condition (1.11) and the inequality (2.6), one gets for $\xi \le \kappa$,
\[ \big| e^0_{n,p}(q_1, \dots, q_n) - z \big| \ge e^0_{n,p}(q_1, \dots, q_n) - \xi \ge (n - 2) c_o + \lambda_2(p) - \kappa, \tag{3.7} \]
therefore, by virtue of (1.13),
\[ \|(Q' F)_n\|_{L^2(\mathbb{R}^{dn})} \le \frac{n \alpha}{(n - 2) c_o + \lambda_2(p) - \kappa} \cdot \|h\|_{L^2(\mathbb{R}^d)} \cdot \|f_{n-1}\|_{L^2(\mathbb{R}^{d(n-1)})}, \quad n > 2, \]
while, for $n = 2$, the norm vanishes. Therefore,
\[
\begin{aligned}
\|Q' F\|^2_{\mathcal{H}^{(\ge 2)}} &= \sum_{n=2}^{\infty} \frac{1}{n!} \cdot \|(Q' F)_n\|^2_{L^2(\mathbb{R}^{dn})} \\
&\le \alpha^2 \|h\|^2_{L^2(\mathbb{R}^d)} \max_{n \ge 3} \big[ n \big( (n - 2) c_o + \lambda_2(p) - \kappa \big)^{-2} \big] \times \sum_{n=3}^{\infty} \frac{1}{(n-1)!} \cdot \|f_{n-1}\|^2_{L^2(\mathbb{R}^{d(n-1)})} \\
&\le \alpha^2 \|h\|^2_{L^2(\mathbb{R}^d)} \, 3 \big( c_o + \lambda_2(p) - \kappa \big)^{-2} \|F\|^2_{\mathcal{H}^{(\ge 2)}},
\end{aligned}
\]
implying that $\|Q'\| \le \alpha \cdot \|h\|_{L^2(\mathbb{R}^d)} \sqrt{3} / (c_o + \lambda_2(p) - \kappa)$. A similar calculation shows that $\|Q''\| \le \alpha \cdot \|h\|_{L^2(\mathbb{R}^d)} \sqrt{3} / (\lambda_2(p) - \kappa)$. This finishes the proof of the lemma.
Let us denote by $F_2^0(z, f_1) = \{f_n^0(z; \cdot);\ n \ge 2\}$ the solution of Eq. (3.1) for $L_2 = 0$. Under the conditions of Lemma 3.1 and taking into account Eq. (3.4), we get $\|F_2^0(z, f_1)\|_{\mathcal{H}^{(\ge 2)}} \le C \alpha \|f_1\|_{L^2(\mathbb{R}^d)}$. From now on, we shall denote by $S(z)$ the linear operator
\[ \mathcal{H}^{(\le 1)} \ni f_1 \xrightarrow{\ S(z)\ } F_2^0(z, f_1) \in \mathcal{H}^{(\ge 2)}. \tag{3.8} \]
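The mechanism behind Lemma 3.1 is the usual contraction argument: if $\|Q\| < 1/2$, the equation $x + Qx = g$ has a unique solution, obtained as the limit of the iteration $x_{k+1} = g - Q x_k$, and its norm is at most $2\|g\|$. A minimal finite-dimensional sketch follows; the matrix `Q`, the vector `g` and the helper names are illustrative assumptions, not objects from the paper.

```python
def mat_vec(Q, x):
    """Product of a square matrix (list of rows) with a vector."""
    return [sum(q_ij * x_j for q_ij, x_j in zip(row, x)) for row in Q]


def solve_by_iteration(Q, g, iterations=80):
    """Solve x + Q x = g via x_{k+1} = g - Q x_k.

    The iteration converges geometrically whenever the operator norm of Q is < 1.
    """
    x = list(g)
    for _ in range(iterations):
        Qx = mat_vec(Q, x)
        x = [g_i - qx_i for g_i, qx_i in zip(g, Qx)]
    return x


def residual(Q, g, x):
    """Largest component of x + Q x - g in absolute value."""
    Qx = mat_vec(Q, x)
    return max(abs(x_i + qx_i - g_i) for x_i, qx_i, g_i in zip(x, Qx, g))


if __name__ == "__main__":
    Q = [[0.1, 0.2], [0.0, 0.3]]  # maximal row sum 0.3 < 1/2
    g = [1.0, 2.0]
    x = solve_by_iteration(Q, g)
    print(x, residual(Q, g, x))
```

The same bound $\|x\| \le \|g\| / (1 - \|Q\|) \le 2 \|g\|$ is what yields estimates of the form $\|F_2^0(z, f_1)\| \le C \alpha \|f_1\|$ once the right-hand side is of order $\alpha$.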
To proceed further with the analysis, we need more information about the structure and regularity of the solution $F_2^0(z, f_1)$. To this aim, we shall solve Eq. (3.1) with $L_2 = 0$. In particular, we shall show that the components of $F_2^0(z, f_1)$ have the representation
\[ f_n^0(z; q_1, \dots, q_n) = \sum_{i=1}^{n} b_n(q_1, \dots, \check{q}_i, \dots, q_n; q_i) f_1(q_i) + \int d_n(q_1, \dots, q_n; q) f_1(q) \, dq, \tag{3.9} \]
where the functions $b_n(q_1, \dots, q_{n-1}; q_n)$ are symmetric in $q_1, \dots, q_{n-1}$ and the functions $d_n(q_1, \dots, q_n; q)$ are symmetric in $q_1, \dots, q_n$, $n \ge 2$. The functions $b_n$ and $d_n$ will be called the coefficient functions.
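A simple calculation shows that, if $F \in \mathcal{H}^{(\ge 2)}$ has the representation (3.9), then also $\hat{F} = Q(z) F$ has the same kind of representation, with coefficient functions
\[
\begin{aligned}
\hat{b}_n(q_1, \dots, q_{n-1}; q_n) = \alpha \big( e^0_{n,p}(q_1, \dots, q_n) - z \big)^{-1} \Big[ & \sum_{i=1}^{n-1} c\Big( p - \sum_{j=1}^{n} q_j; q_i \Big) b_{n-1}(q_1, \dots, \check{q}_i, \dots, q_{n-1}; q_n) \\
& + \int \bar{c}\Big( p - \sum_{j=1}^{n} q_j - q; q \Big) b_{n+1}(q_1, \dots, q_{n-1}, q; q_n) \, dq \Big],
\end{aligned}
\tag{3.10}
\]
\[
\begin{aligned}
\hat{d}_n(q_1, \dots, q_n; q) = \alpha \big( e^0_{n,p}(q_1, \dots, q_n) - z \big)^{-1} \Big[ & \sum_{i=1}^{n} c\Big( p - \sum_{j=1}^{n} q_j; q_i \Big) d_{n-1}(q_1, \dots, \check{q}_i, \dots, q_n; q) \\
& + \int \bar{c}\Big( p - \sum_{j=1}^{n} q_j - q'; q' \Big) d_{n+1}(q_1, \dots, q_n, q'; q) \, dq' \\
& + \bar{c}\Big( p - \sum_{j=1}^{n} q_j - q; q \Big) b_{n+1}(q_1, \dots, q_n; q) \Big].
\end{aligned}
\tag{3.11}
\]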
Let us now define the space $\mathcal{M}$ of all pairs $\mu = \{(b_n)_{n \ge 2}, (d_n)_{n \ge 2}\}$ of sequences of bounded continuous functions, $b_n : (\mathbb{R}^d)^{n-1} \times \mathbb{R}^d \to \mathbb{C}$, $d_n : (\mathbb{R}^d)^{n} \times \mathbb{R}^d \to \mathbb{C}$, symmetric with respect to the first group of variables. Let them fulfill the following condition: there exists a constant $M$ such that
\[
\begin{aligned}
\sup_q |b_n(q_1, \dots, q_{n-1}; q)| &\le M \prod_{i=1}^{n-1} h(q_i), \\
|d_n(q_1, \dots, q_n; q)| &\le M h(q) \prod_{i=1}^{n} h(q_i), \quad \forall\, n \ge 2,
\end{aligned}
\tag{3.12}
\]
where $h$ is the function appearing in Eq. (1.13). $\mathcal{M}$ is a Banach space with the norm
\[ \|\mu\| = \inf M, \tag{3.13} \]
where the infimum is taken over all $M$ for which the condition (3.12) holds. Clearly, Eq. (3.9) defines a continuous mapping of $\mathcal{H}^{(1)}$ into $\mathcal{H}^{(\ge 2)}$. The operator $\Gamma(z)$ acting in $\mathcal{M}$ according to $\Gamma(z) \mu = \hat{\mu}$, where $\mu = \{(b_n)_{n \ge 2}, (d_n)_{n \ge 2}\}$ and $\hat{\mu} = \{(\hat{b}_n)_{n \ge 2}, (\hat{d}_n)_{n \ge 2}\}$ are related by (3.10) and (3.11), translates into $\mathcal{M}$ the action of $Q(z)$. Then, Eq. (3.1) with $L_2 = 0$ is transformed into
\[ \mu + \Gamma(z) \mu = \mu_0, \tag{3.14} \]
where $\mu_0 = \{(b_n^0), (d_n^0)\}$ with $b_n^0 = 0$, $\forall\, n \ge 3$, $d_n^0 = 0$, $\forall\, n \ge 2$, and
\[ b_2^0(q_1; q) = -\frac{\alpha \, c(p - q_1 - q; q_1)}{e^0_{2,p}(q_1, q) - z}. \tag{3.15} \]
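Lemma 3.2. For every $\kappa \in (\lambda_1(p), \lambda_2(p))$, there exists $\tilde{\alpha}_0(\kappa)$ such that for any $\alpha < \tilde{\alpha}_0(\kappa)$ and any $z \in \mathbb{C}$ with ${\rm Re}\, z = \xi \le \kappa$, $\|\Gamma(z)\| < 1/2$, and $\mu_0 \in \mathcal{M}$, $\|\mu_0\| \le \alpha / (\lambda_2(p) - \kappa)$. Therefore, Eq. (3.14) has a unique solution $\mu(z) \in \mathcal{M}$, which is an analytic function of $z$ in the half-plane ${\rm Re}\, z \le \kappa$. Moreover, for any $r \ge 1$, there exists $\tilde{\alpha}_r(\kappa)$ such that, for $\alpha < \tilde{\alpha}_r(\kappa)$, the components of $\mu(z)$ are $C^r$-functions of their arguments and the derivatives up to order $r$ satisfy estimates like (1.13); more precisely, for any multi-indices $A_n = \{\alpha_1, \dots, \alpha_n, \beta\}$ with $|A_n| = \sum_{i=1}^{n} |\alpha_i| + |\beta| \le r$, where $\alpha_i = (\alpha_i^1, \dots, \alpha_i^d)$, $\beta = (\beta^1, \dots, \beta^d)$, there exist constants $C(A_n)$, $\tilde{C}(A_n)$ such that, for any $n \ge 2$ and for all $z$ in the half-plane, the following inequalities hold:
\[
\begin{aligned}
|\partial^{A_{n-1}} b_n(z; q_1, \dots, q_{n-1}; q)| &\le \alpha (\lambda_2(p) - \kappa)^{-1} \cdot C(A_{n-1}) \cdot \prod_{i=1}^{n-1} h(q_i), \\
|\partial^{A_n} d_n(z; q_1, \dots, q_n; q)| &\le \alpha (\lambda_2(p) - \kappa)^{-1} \cdot C(A_n) \cdot h(q) \prod_{i=1}^{n} h(q_i),
\end{aligned}
\tag{3.16}
\]
\[
\begin{aligned}
\Big| \frac{d}{dz} \partial^{A_{n-1}} b_n(z; q_1, \dots, q_{n-1}; q) \Big| &\le \alpha (\lambda_2(p) - \kappa)^{-2} \cdot \tilde{C}(A_{n-1}) \cdot \prod_{i=1}^{n-1} h(q_i), \\
\Big| \frac{d}{dz} \partial^{A_n} d_n(z; q_1, \dots, q_n; q) \Big| &\le \alpha (\lambda_2(p) - \kappa)^{-2} \cdot \tilde{C}(A_n) \cdot h(q) \prod_{i=1}^{n} h(q_i),
\end{aligned}
\tag{3.17}
\]
where $\partial^{A_n} = \big( \prod_{i=1}^{n} \partial_{q_i}^{\alpha_i} \big) \partial_q^{\beta}$. The vector $F_2^0(z; f_1)$ given by Eq. (3.9), having as coefficient functions the components $b_n$, $d_n$ of $\mu(z)$, belongs to $\mathcal{H}^{(\ge 2)}$ and it is the unique solution of Eq. (3.1) for $L_2 = 0$.
Proof. Suppose $\mu \in \mathcal{M}$, $\|\mu\| = 1$, i.e. $(b_n)_{n \ge 2}$, $(d_n)_{n \ge 2}$ satisfy the estimates (3.12) with $M = 1$. Then, $\Gamma(z) \mu = \hat{\mu}$ of components (3.10) and (3.11) satisfies the same estimates with
\[ \hat{A} = \alpha \cdot \max_{n \ge 2} \frac{n + 1 + \|h\|^2}{(n - 2) c_o + \lambda_2(p) - \kappa} = \frac{\alpha (3 + \|h\|^2)}{\lambda_2(p) - \kappa}. \]
Therefore, $\|\Gamma(z)\| < 1/2$ for $\alpha$ sufficiently small. The estimate of $\|\mu_0\|$ is immediate, therefore $\|\mu(z)\| \le 2 \alpha (\lambda_2(p) - \kappa)^{-1}$.
So, we are left with the proof of the smoothness of the coefficient functions, Eqs. (3.16) and (3.17). We shall consider only the first derivatives, i.e. $|A_n| = 1$. Consider the subspace $\mathcal{M}_1 \subset \mathcal{M}$ of all $\mu$ with differentiable components $(b_n)_{n \ge 2}$,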
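$(d_n)_{n \ge 2}$ for which there exists $M_1 > 0$ such that
\[
\begin{aligned}
\max \Big\{ \max_{1 \le i \le n-1} |\nabla_{q_i} b_n(q_1, \dots, q_{n-1}; q)|,\ |\nabla_q b_n(q_1, \dots, q_{n-1}; q)| \Big\} &\le M_1 \prod_{i=1}^{n-1} h(q_i), \\
\max \Big\{ \max_{1 \le i \le n} |\nabla_{q_i} d_n(q_1, \dots, q_n; q)|,\ |\nabla_q d_n(q_1, \dots, q_n; q)| \Big\} &\le M_1 h(q) \prod_{i=1}^{n} h(q_i),
\end{aligned}
\tag{3.18}
\]
which is a Banach space with norm $\|\mu\|_1 = \max\{\|\mu\|, \inf M_1\}$, where inf is taken over all $M_1$ fulfilling (3.18). We show that, for small $\alpha$, $\Gamma(z)$ is a contraction in $\mathcal{M}_1$ as well. Taking derivatives with respect, say, to $q_1$ in Eq. (3.10), one obtains
\[
\begin{aligned}
\nabla_{q_1} \hat{b}_n(q_1, \dots, q_{n-1}; q_n) = {} & -\alpha \big( e^0_{n,p}(q_1, \dots, q_n) - z \big)^{-2} \, \nabla_{q_1} e^0_{n,p}(q_1, \dots, q_{n-1}, q_n) \\
& \times \Big[ \sum_{i=1}^{n-1} c\Big( p - \sum_{j=1}^{n} q_j; q_i \Big) b_{n-1}(q_1, \dots, \check{q}_i, \dots; q_n) \\
& \qquad + \int \bar{c}\Big( p - \sum_{j=1}^{n} q_j - q; q \Big) b_{n+1}(q_1, \dots, q_{n-1}, q; q_n) \, dq \Big] \\
& + \alpha \big( e^0_{n,p}(q_1, \dots, q_n) - z \big)^{-1} \\
& \times \Big[ \nabla_q c\Big( p - \sum_{j=1}^{n} q_j; q_1 \Big) b_{n-1}(q_2, \dots, q_{n-1}; q_n) \\
& \qquad - \sum_{i=1}^{n-1} \nabla_p c\Big( p - \sum_{j=1}^{n} q_j; q_i \Big) b_{n-1}(q_1, \dots, \check{q}_i, \dots; q_n) \\
& \qquad - \int \nabla_p \bar{c}\Big( p - \sum_{j=1}^{n} q_j - q; q \Big) b_{n+1}(q_1, \dots, q_{n-1}, q; q_n) \, dq \\
& \qquad + \sum_{i=2}^{n-1} c\Big( p - \sum_{j=1}^{n} q_j; q_i \Big) \nabla_{q_1} b_{n-1}(q_1, \dots, \check{q}_i, \dots, q_{n-1}; q_n) \\
& \qquad + \int \bar{c}\Big( p - \sum_{j=1}^{n} q_j - q; q \Big) \nabla_{q_1} b_{n+1}(q_1, \dots, q_{n-1}, q; q_n) \, dq \Big].
\end{aligned}
\tag{3.19}
\]
Here, $\nabla_p c$ and $\nabla_q c$ denote the gradient of the function $c(p, q)$ with respect to the first, respectively the second, argument. Similar expressions are obtained for $\nabla_{q_i} \hat{d}_n$, $\nabla_{q_n} \hat{b}_n$ and $\nabla_q \hat{d}_n$. Suppose that $\|\mu\|_1 = 1$. Then, using the simple estimate
\[ \left| \frac{\nabla_{q_1} e^0_{n,p}(q_1, \dots, q_{n-1}, q_n)}{e^0_{n,p}(q_1, \dots, q_n) - z} \right| \le \bar{R}, \tag{3.20} \]
where $\bar{R}$ is a constant independent of $n$, and also the assumption (1.13), one obtains
\[ \|\hat{\mu}\|_1 \le \frac{\alpha (a + \|h\|^2)(\bar{R} + \bar{C} + b)}{\lambda_2(p) - \kappa}, \tag{3.21} \]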
where $a$ and $b$ are absolute constants and $\bar{C} = \max\{|\nabla_p c(p, q)|, |\nabla_q c(p, q)|\}$. Equation (3.21) shows that $\Gamma(z)$ leaves $\mathcal{M}_1$ invariant and that $\|\Gamma(z)\| < 1/2$ for $\alpha$ sufficiently small. Since $\mu_0 \in \mathcal{M}_1$, and
\[ \|\mu_0\|_1 < 2 \alpha (\lambda_2(p) - \kappa)^{-1} \max\{(\bar{R} + \bar{C}), 1\}, \]
we see that the solution $\mu(z)$ of Eq. (3.14) belongs to $\mathcal{M}_1$ and has norm of the order of $\alpha / (\lambda_2(p) - \kappa)$. This finishes the proof of the inequalities (3.16) in the case $r = 1$. The higher values of $r$ can be treated similarly, with stronger limitations on $\alpha$. Finally, $\mu(z)$ and its derivatives are analytic in the half-plane $\xi \le \kappa'$ for any $\kappa' \in (\kappa, \lambda_2(p))$ and satisfy there inequalities like (3.16), implying (3.17) in $\xi \le \kappa$. The lemma is proved.
Finally, going back to the system (2.4) with $L_2 = 0$, we remark that the solution $F_2^0(z; f_1)$ of the second equation enters the first equation only through its first ($n = 2$) component, $f_2^0(z; f_1)$, which, in view of Eq. (3.9), has the form
\[ f_2^0(z; f_1; q_1, q_2) = b_2(z; q_2; q_1) f_1(q_1) + b_2(z; q_1; q_2) f_1(q_2) + \int d_2(z; q_1, q_2; q) f_1(q) \, dq. \tag{3.22} \]
Inserting this representation into the first equation (2.4) and using the notations
\[ m_p(z; q) = \alpha \int c(p - q - q'; q') \, b_2(z; q'; q) \, dq', \tag{3.23} \]
\[ D_p(z; q, q') = \frac{1}{\alpha} \Big[ c(p - q - q', q') \, b_2(z; q; q') + \int c(p - q - q'', q'') \, d_2(z; q, q''; q') \, dq'' \Big], \tag{3.24} \]
one arrives at the following system of equations for the $n = 0, 1$ components:
\[
\begin{aligned}
\big( e^0_{0,p} - z \big) f_0 + \alpha \int c(p - q, q) f_1(q) \, dq &= l_0, \\
\alpha \, c(p - q, q) f_0 + [a_p(z; q) - z] f_1(q) + \alpha^2 \int D_p(z; q, q') f_1(q') \, dq' &= l_1,
\end{aligned}
\tag{3.25}
\]
where
\[ a_p(z; q) = e^0_{1,p}(q) + m_p(z; q). \tag{3.26} \]
Corollary 3.3. For $z$ real, the function $a_p(z; q)$ is real and the kernel $D_p(z; q, q')$ is self-adjoint.
Indeed, the operator $V(z)$ defined by
\[ (V(z) f_1)(q) = m_p(z; q) f_1(q) + \alpha^2 \int D_p(z; q, q') f_1(q') \, dq' \]
is equal to $-A_{12} (A_{22} - zI)^{-1} A_{21}$ appearing in Eq. (2.9), which is manifestly self-adjoint for real $z$.
Corollary 3.4. The following asymptotic formulae hold:
\[ m_p(z; q) = -\alpha^2 \int \frac{|c(p - q - q'; q')|^2}{e^0_{2,p}(q, q') - z} \, dq' + O(\alpha^3), \tag{3.27} \]
where $O(\alpha^3)$ is a $C^1$-function of norm $\|O(\alpha^3)\|_1 \le C \alpha^3$ for some constant $C$ depending on $\kappa$;
\[ D_p(z; q, q') = -\frac{c(p - q - q'; q') \, c(p - q - q'; q)}{e^0_{2,p}(q, q') - z} + O(\alpha), \tag{3.28} \]
where $O(\alpha)$ is a smooth function bounded by $C \alpha \, h(q) h(q')$ for some constant $C$ depending on $\kappa$.
By virtue of Lemma 3.2, the solution $\mu(z)$ has an expansion $\sum_{n=0}^{\infty} (-\Gamma(z))^n \mu_0$ convergent in $\mathcal{M}_1$, the $n$th term of which is of the order $\alpha^n$, wherefrom the assertion.
4. Study of the Reduced System (3.25)
4.1. The generalized Friedrichs model (a digression)
We collect here the needed information about the spectral representation of the generalized Friedrichs operator $A$ acting in $\mathcal{H}^{(\le 1)} = \mathbb{C} \oplus L^2(\mathbb{R}^d, dq)$, Eq. (2.11). We shall study $A$ as a perturbation of $A_0 = A(\alpha = 0)$, so $\alpha > 0$ is supposed sufficiently small to ensure the convergence. In order to calculate the resolvent $R_A(z)$ of $A$, one has to solve
\[
\begin{aligned}
(e^{(0)} - z) f_0 + \alpha \int \bar{v}(q) f_1(q) \, dq &= g_0, \\
\alpha v(q) f_0 + (a(q) - z) f_1(q) + \alpha^2 \int D(q, q') f_1(q') \, dq' &= g_1,
\end{aligned}
\tag{4.1}
\]
for all $(g_0, g_1) = G \in \mathcal{H}^{(\le 1)}$. To this end the following assumptions are made:
1. $a(q)$ is a real, sufficiently smooth function, and there exist constants $C_1$, $C_2$, $C_3$ such that
\[ C_1 \le a(q) \le C_2 |q|^2 + C_3, \quad |\nabla a(q)| \le C_2 (|q| + 1), \quad |\partial^{\alpha} a(q)| \le C_2, \ |\alpha| \ge 2; \tag{4.2} \]
$a(q)$ has a unique nondegenerate minimum $\bar{a}$ at $\bar{q}_0$ and no other critical points. We denote by $I = [\bar{a}, \infty) \subset \mathbb{R}$ the range of the function $a$.
2. The function $v(q)$ is continuous and $|v(q)| \le h(q)$, for some bounded, rapidly decreasing, positive $h$;
3. $a(q)$ restricted to the 0-level of $v$, $\{q : v(q) = 0\}$, is not constant;
4. The kernel $D(q, q')$ is sufficiently smooth and there exists a constant $N$ such that for any multi-indices $\alpha, \beta$ with $|\alpha|, |\beta| \le r = [d/2] + 2$,
\[ |\partial_q^{\alpha} \partial_{q'}^{\beta} D(q, q')| \le N h(q) h(q'). \tag{4.3} \]
In solving Eq. (4.1) we proceed as outlined in Sec. 2, i.e. we solve the second equation for $f_1$ in terms of $f_0$ and plug the solution into the first equation. Let $B$ be the operator defined on its maximal domain in $L^2(\mathbb{R}^d, dq)$ by the formula
\[ Bf(q) = a(q) f(q) + \alpha^2 \int D(q, q') f(q') \, dq'. \tag{4.4} \]
We need its resolvent $R_B(z) = (B - zI)^{-1}$. We denote by $\mathcal{B}_r$ the Banach space of all kernels $D(q, q')$ satisfying condition 4, i.e. which are $r$ times differentiable and satisfy (4.3) for some $N$, endowed with the norm $\|D\|_r = \inf N$, where the infimum is taken over all $N$ for which (4.3) holds.
Lemma 4.1. For $\alpha$ sufficiently small and $z \notin I$, the resolvent $R_B(z) = (B - zI)^{-1}$ has the form
\[ (R_B(z) g)(q) = (a(q) - z)^{-1} g(q) + \alpha^2 \int K(\alpha, z; q, q') \, g(q') \, (a(q') - z)^{-1} \, dq', \quad g \in L^2(\mathbb{R}^d, dq), \tag{4.5} \]
where the kernel $K(\alpha, z; \cdot, \cdot) \in \mathcal{B}_r$ and its norm is bounded for $z \in \mathbb{C} \setminus I$. Moreover, $K$ is a $\mathcal{B}_r$-valued analytic function of $z$ on $\mathbb{C} \setminus I$ and its boundary values
\[ K^{\pm}(\alpha, x; q, q') = \lim_{\varepsilon \downarrow 0} K(\alpha, x \pm i\varepsilon; q, q') \tag{4.6} \]
exist in $\mathcal{B}_r$ for all $x \in I$. Also, $K^{\pm}(\alpha, x; \cdot, \cdot)$ are $[(d - 1)/2] - 1$ times differentiable as a $\mathcal{B}_r$-valued function of $x \in I$ and their last derivative with respect to $x$ is Hölder continuous of exponent $\gamma = 1/3$ (actually of any $\gamma < 1/2$ for even $d$ and any $\gamma < 1$ for odd $d$).
Remark 4.2. We shall express the last property of $K^{\pm}$ by saying that $I \ni x \to K^{\pm}(\alpha, x; \cdot, \cdot) \in \mathcal{B}_r$ is $s + 1/3$ times differentiable, where we put $s = [(d - 1)/2] - 1$. In order to prove it, one has to find a constant $\tilde{N}$ such that for all multi-indices $\alpha, \beta$ with $|\alpha|, |\beta| \le r = [d/2] + 2$ and $k = 0, 1, \dots, s$:
\[ |\partial_x^k \partial_q^{\alpha} \partial_{q'}^{\beta} K^{\pm}(\alpha, x; q, q')| \le \tilde{N} h(q) h(q'), \tag{4.7} \]
and
\[ \max_{x, y \in I;\ |x - y| \le 1} \frac{|\partial_x^s \partial_q^{\alpha} \partial_{q'}^{\beta} K^{\pm}(\alpha, x; q, q') - \partial_x^s \partial_q^{\alpha} \partial_{q'}^{\beta} K^{\pm}(\alpha, y; q, q')|}{|x - y|^{1/3}} \le \tilde{N} h(q) h(q'). \tag{4.8} \]
Proof of Lemma 4.1. Let $B_0 = B(\alpha = 0)$, i.e. the operator of multiplication by $a(q)$, and $D$ the integral operator with kernel $D(q, q')$. Then, denoting
\[ M = \alpha^2 D (B_0 - z)^{-1}, \tag{4.9} \]
which is an integral operator with kernel $\alpha^2 D(q, q') (a(q') - z)^{-1}$, we have, formally, the expansion
\[ R_B(z) = (B_0 - z)^{-1} (I + M)^{-1} = (B_0 - z)^{-1} + \sum_{n=1}^{\infty} (-1)^n (B_0 - z)^{-1} M^n, \tag{4.10} \]
where $(B_0 - z)^{-1} M^n$, $n \ge 1$, are integral operators with kernels $(a(q) - z)^{-1} L_n(\alpha, z; q, q') (a(q') - z)^{-1}$, with
\[ L_n(\alpha, z; q, q') = \alpha^{2n} \int \cdots \int \frac{D(q, q_1) D(q_1, q_2) \cdots D(q_{n-1}, q')}{\prod_{i=1}^{n-1} (a(q_i) - z)} \, dq_1 \cdots dq_{n-1}. \tag{4.11} \]
We shall prove the convergence of the series (4.10) in $\mathcal{B}_r$ and, hence, show that $K$ satisfies all the assertions of the lemma, by checking (by induction) the following properties of the function (4.11):
(i) $L_n(\alpha, z; \cdot, \cdot) \in \mathcal{B}_r$ and
\[ \|L_n(\alpha, z; \cdot, \cdot)\|_r \le (C \alpha^2)^{n-1}, \tag{4.12} \]
where $C$ is a constant (to be specified later);
(ii) The limits
\[ \lim_{\varepsilon \downarrow 0} L_n(\alpha, x \pm i\varepsilon; q, q') = L_n^{\pm}(\alpha, x; q, q') \tag{4.13} \]
exist in $\mathcal{B}_r$ for all $x \in I$;
(iii) $L_n^{\pm}(\alpha, x; \cdot, \cdot)$ are $s + 1/3$ times differentiable, thereby they satisfy the estimates (4.7) and (4.8) with $\tilde{N} = (C \alpha^2)^{n-1}$.
Indeed, for $n = 1$, i.e. for $D(q, q')$, these assertions hold obviously. For ${\rm Im}\, z \ge 0$, we represent
\[
\begin{aligned}
L_{n+1}(\alpha, z; q, q') &= \alpha^2 \int D(q, \bar{q}) L_n(\alpha, z; \bar{q}, q') (a(\bar{q}) - z)^{-1} \, d\bar{q} \\
&= i \alpha^2 \int_0^{\infty} dt \, e^{izt} \int D(q, \bar{q}) L_n(\alpha, z; \bar{q}, q') e^{-ita(\bar{q})} \, d\bar{q},
\end{aligned}
\tag{4.14}
\]
wherefrom
\[ \partial_z^k \partial_q^{\alpha} \partial_{q'}^{\beta} L_{n+1}(\alpha, z; q, q') = i \alpha^2 \int_0^{\infty} dt \, (it)^k e^{izt} \int \partial_q^{\alpha} D(q, \bar{q}) \, \partial_{q'}^{\beta} L_n(\alpha, z; \bar{q}, q') e^{-ita(\bar{q})} \, d\bar{q}. \tag{4.15} \]
Using (4.3), the induction hypothesis and the condition 1 for $a(q)$, the internal integral can be represented by the stationary phase method as
\[ \hat{C} \, \frac{\partial_q^{\alpha} D(q, \bar{q}_0) \, \partial_{q'}^{\beta} L_n(\alpha, z; \bar{q}_0, q') \, e^{-ita(\bar{q}_0)}}{t^{d/2} + 1} + \Delta_{\alpha\beta}(t; q, q'), \tag{4.16} \]
where $\hat{C}$ is an absolute constant, and the kernel $\Delta_{\alpha\beta}$ is bounded by
\[ |\Delta_{\alpha\beta}(t; q, q')| \le \bar{N} (C \alpha^2)^{n-1} \|h\|^2_{L^2} \, \frac{h(q) h(q')}{t^{d/2+1} + 1} \tag{4.17} \]
with some constant $\bar{N}$ dependent on $d$ and on the function $a$. The integral
\[ \int_0^{\infty} dt \, \frac{(it)^k}{t^{d/2} + 1} \, e^{i(z - a(\bar{q}_0)) t} \]
is absolutely convergent for all $k \le s$ and defines a continuous function of $z$ in ${\rm Im}\, z \ge 0$, which, for $k = s$, is Hölder continuous with respect to this variable. We have that the contribution to (4.15) of the first term in (4.16) has the estimate
\[ \Big| i \alpha^2 \hat{C} \int_0^{\infty} dt \, (it)^k e^{it(z - a(\bar{q}_0))} (t^{d/2} + 1)^{-1} \, \partial_q^{\alpha} D(q, \bar{q}_0) \, \partial_{q'}^{\beta} L_n(\alpha, z; \bar{q}_0, q') \Big| \le \tilde{C} h(\bar{q}_0)^2 \tilde{N} (C \alpha^2)^{n-1} h(q) h(q'), \tag{4.18} \]
where $\tilde{C}$ is a constant. One proves in the same way the Hölder condition (4.8) for $k = s$. A similar estimate holds for the integral of the second term in (4.16):
\[ \Big| \int_0^{\infty} (it)^k e^{izt} \Delta_{\alpha\beta}(t; q, q') \, dt \Big| \le \tilde{C} \|h\|^2_{L^2} \bar{N} (C \alpha^2)^{n-1} h(q) h(q'). \]
By taking $C = \max\{N, \tilde{C}(|h(\bar{q}_0)|^2 \tilde{N} + \|h\|^2_{L^2} \bar{N})\}$, one gets the estimate (4.12), the existence of the limit (4.13) and the assertion (iii) for $n$ replaced by $n + 1$.
Once we have $R_B(z)$, it is an easy matter to write down the solution of Eq. (4.1) for $z \in \mathbb{C} \setminus I$ as
\[ f_1 = R_B(z) [g_1 - \alpha f_0 v], \tag{4.19} \]
where
\[ f_0 = \frac{1}{\Delta(z)} \big[ g_0 - \alpha (v, R_B(z) g_1) \big] \tag{4.20} \]
whenever $\Delta(z) \ne 0$. Here,
\[ \Delta(z) = e^{(0)} - z - \alpha^2 (v, R_B(z) v). \tag{4.21} \]
Clearly, $\Delta(z)$ is analytic in $\mathbb{C} \setminus I$, has limits at the cut $I$:
\[ \lim_{\varepsilon \downarrow 0} \Delta(x \pm i\varepsilon) = \Delta^{\pm}(x), \quad x \in I, \tag{4.22} \]
and the limits $\Delta^{\pm}(x)$ are $s + 1/3$ times differentiable, by the same reasoning as in Lemma 4.1. More precisely,
\[ \Big| \frac{d^k}{dx^k} (\Delta^{\pm}(x) + x) \Big| \le {\rm const}, \quad k = 0, \dots, s + \tfrac{1}{3}. \]
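The role of the function $\Delta(z)$ of Eq. (4.21) can be seen in a finite-dimensional caricature of the Friedrichs model, obtained by setting $D = 0$ and replacing the multiplication operator $a(q)$ by a diagonal matrix, so that $(v, R_B(z) v) = \sum_i v_i^2 / (a_i - z)$. The sketch below uses hypothetical names and illustrative real parameter values; it locates the zero of $\Delta$ below $\min_i a_i$ by bisection and applies the matrix to the candidate eigenvector $\psi_0 = (f_0, -\alpha f_0 R_B(e) v)$.

```python
def delta(z, e0, a, v, alpha):
    """Delta(z) = e0 - z - alpha^2 * sum_i v_i^2 / (a_i - z): Eq. (4.21) with D = 0."""
    return e0 - z - alpha ** 2 * sum(v_i * v_i / (a_i - z) for a_i, v_i in zip(a, v))


def bound_state_energy(e0, a, v, alpha, tol=1e-12):
    """Zero of Delta on (-inf, min(a)), found by bisection.

    Assumes alpha != 0 and all v_i != 0; Delta is then strictly decreasing
    from +inf to -inf on that half-line.
    """
    a_min = min(a)
    lo = min(e0, a_min) - abs(alpha) * sum(abs(v_i) for v_i in v) - 1.0  # Delta(lo) > 0
    hi = a_min
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if delta(mid, e0, a, v, alpha) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def apply_friedrichs(e0, a, v, alpha, f0, f1):
    """Action of the toy operator A on (f0, f1), following Eq. (2.11) with D = 0."""
    out0 = e0 * f0 + alpha * sum(v_i * f_i for v_i, f_i in zip(v, f1))
    out1 = [alpha * v_i * f0 + a_i * f_i for a_i, v_i, f_i in zip(a, v, f1)]
    return out0, out1


if __name__ == "__main__":
    e0, a, v, alpha = 1.0, [2.0, 3.0, 4.0], [1.0, 0.5, 0.25], 0.3
    e = bound_state_energy(e0, a, v, alpha)
    f1 = [-alpha * v_i / (a_i - e) for a_i, v_i in zip(a, v)]  # f1 = -alpha * f0 * R_B(e) v, f0 = 1
    print(e, apply_friedrichs(e0, a, v, alpha, 1.0, f1))
```

One difference from the continuum model should be kept in mind: in this discrete caricature $\Delta(x) \to -\infty$ as $x \uparrow \min_i a_i$, so a zero always exists, whereas in the paper $\Delta(\bar{a})$ is finite and an eigenvalue below $\bar{a}$ exists only if $\Delta(\bar{a}) < 0$.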
As one can read from Eqs. (4.19) and (4.20), the continuous spectrum of the operator $A$ equals the interval $I$. Besides, the real zeroes of $\Delta(z)$ below $\bar{a}$, if any, are eigenvalues of $A$. Since $-\alpha^2 (v, R_B(x) v)$ is decreasing for $x < \bar{a}$, $\Delta(x)$ is strictly decreasing from $+\infty$ to $\Delta(\bar{a})$ on this interval; therefore $A$ has one simple eigenvalue $e < \bar{a}$ with eigenvector $\psi_0 = (f_0, f_1 = -\alpha f_0 R_B(e) v) \in \mathcal{H}^{(\le 1)}$ if, and only if, $\Delta(\bar{a}) < 0$. As for small $\alpha$,
\[ \pm {\rm Im}\, \Delta^{\pm}(x) = \alpha^2 \pi \int_{a(q) = x} |v(q)|^2 \, dq + O(\alpha^4) > 0, \quad x \in I, \tag{4.23} \]
in view of condition 3, it follows that there are no eigenvalues of $A$ embedded in the continuous spectrum.
Remark 4.3. It is easy to show using the explicit formulae for $R_A(z)$ that the general criteria of the absence of the singular continuous spectrum [15] are met in our case, hence that the continuous spectrum $I$ is absolutely continuous. Therefore, we have
\[ \mathcal{H}^{(\le 1)} = \begin{cases} \mathcal{H}^{\rm ac}, & \Delta(\bar{a}) \ge 0, \\ \{c \psi_0\} \oplus \mathcal{H}^{\rm ac}, & \Delta(\bar{a}) < 0. \end{cases} \tag{4.24} \]
We come now to the scattering theory for the pair of self-adjoint operators $(A, B_0)$, where we denoted by $B_0$ the operator of multiplication by $a(q)$ acting in $\mathcal{H}^{(1)} = L^2(\mathbb{R}^d, dq)$. We denote by $E : \mathcal{H}^{(1)} \to \mathcal{H}^{(\le 1)}$ the injection $E\varphi = (0, \varphi) \in \mathcal{H}^{(\le 1)}$, $\varphi \in \mathcal{H}^{(1)}$. Known existence criteria for the wave operators (see e.g. [15]) can be applied to our case and ensure the existence of the strong limit
\[ \operatorname*{s-lim}_{t \to \infty} e^{itA} E e^{-itB_0} = \Omega^{+}, \]
which is a unitary operator $\Omega^{+} : \mathcal{H}^{(1)} \to \mathcal{H}^{\rm ac} \subset \mathcal{H}^{(\le 1)}$. The generalized eigenfunctions of the operator $B_0$ are $\delta_q(\cdot) = \delta(q - \cdot)$; therefore, using known formulae in scattering theory, one can take
\[ \psi^q = \Omega^{+} \delta_q = \lim_{\varepsilon \downarrow 0} i\varepsilon R_A(a(q) - i\varepsilon) E \delta_q \tag{4.25} \]
as generalized eigenvectors of $A$ corresponding to the eigenvalue $a(q)$. Explicitly, in view of (4.19), (4.20) and Lemma 4.1, one gets for $\psi^q = (f_0^q, f_1^q(\cdot))$ the following expressions:
\[ f_0^q = -\frac{\alpha}{\Delta^{-}(a(q))} \Big[ v(q) + \alpha^2 \int \frac{K^{-}(\alpha, a(q); q', q) \, \bar{v}(q') \, dq'}{a(q') - a(q) + i0} \Big], \tag{4.26} \]
\[ f_1^q(q') = \delta(q - q') + \frac{\alpha^2 K^{-}(\alpha, a(q); q', q)}{a(q') - a(q) + i0} - \frac{\alpha f_0^q}{a(q') - a(q) + i0} \times \Big[ v(q') + \alpha^2 \int \frac{K^{-}(\alpha, a(q); q', q'') \, \bar{v}(q'') \, dq''}{a(q'') - a(q) + i0} \Big]. \tag{4.27} \]
This somewhat formal derivation of the formulas (4.26) and (4.27) will be justified by the next lemma, which proves that $\psi^q \in \mathbb{C} \oplus \mathcal{S}'(\mathbb{R}^d)$ (where $\mathcal{S}'(\mathbb{R}^d)$ is the
space of tempered distributions) and that it verifies the intertwining property of the wave-operator $\Omega^{+}$.
Lemma 4.4. Let $d \ge 3$. Then,
(i) For every fixed $q \in \mathbb{R}^d$, $f_0^q$ is finite and it is a bounded, continuous function of $q$.
(ii) For every fixed $q \in \mathbb{R}^d$, $f_1^q(\cdot) \in \mathcal{S}'(\mathbb{R}^d)$; moreover, for every fixed $q' \in \mathbb{R}^d$, $f_1^q(q') \in \mathcal{S}'(\mathbb{R}^d)$ with respect to $q$.
(iii) For $\varphi \in \mathcal{S}(\mathbb{R}^d)$, let us consider the vector $\psi_{\varphi} = (C_{\varphi,0}, C_{\varphi,1}(\cdot)) \in \mathcal{H}^{(\le 1)}$, where
\[ C_{\varphi,0} = \int f_0^q \varphi(q) \, dq, \tag{4.28} \]
\[ C_{\varphi,1}(q') = \int f_1^q(q') \varphi(q) \, dq. \tag{4.29} \]
Then, for any $\varphi_1, \varphi_2 \in \mathcal{S}(\mathbb{R}^d)$,
\[ (\psi_{\varphi_1}, \psi_{\varphi_2})_{\mathcal{H}^{(\le 1)}} = \bar{C}_{\varphi_1,0} C_{\varphi_2,0} + \int \bar{C}_{\varphi_1,1}(q) C_{\varphi_2,1}(q) \, dq = (\varphi_1, \varphi_2)_{L^2}, \tag{4.30} \]
therefore, the mapping $\varphi \to \psi_{\varphi}$ extends to an isometry $\Omega^{+} : L^2(\mathbb{R}^d, dq) \to \mathcal{H}^{(\le 1)}$.
(iv) The range of the operator $\Omega^{+}$ is $\mathcal{H}^{\rm ac}$ and $A \Omega^{+} = \Omega^{+} B_0$.
Remark 4.5. The relation (4.30) may be written in the following formal way:
\[ (\psi^q, \psi^{q'})_{\mathcal{H}^{(\le 1)}} = \bar{f}_0^q f_0^{q'} + (f_1^q, f_1^{q'})_{\mathcal{H}^{(1)}} = \delta(q - q'), \tag{4.31} \]
meaning the orthonormality of the generalized functions $\{\psi^q, q \in \mathbb{R}^d\}$.
Remark 4.6. Usually, the generalized eigenvectors of a self-adjoint operator $A$ acting in the Hilbert space $\mathcal{H}$ are defined as derivatives $dE_{\lambda} \varphi / d\sigma_{\varphi}(\lambda)$, where $\varphi \in \mathcal{H}$ is an arbitrary vector, $\{E_{\lambda}\}$ is the family of spectral projections of $A$, and $\sigma_{\varphi}(\lambda)$ is the spectral measure corresponding to $\varphi$. Moreover, if $A$ leaves invariant a certain dense linear subspace $\mathcal{H}_{+} \subset \mathcal{H}$ and $\mathcal{H}_{+}$ has a Hilbert space structure such that the inclusion is quasi-nuclear, then the derivative $dE_{\lambda} \varphi / d\sigma_{\varphi}(\lambda) = \chi_{\lambda}$ exists as an element of the conjugate space $\mathcal{H}_{-} = \mathcal{H}_{+}^{*}$ and it is an eigenvector with eigenvalue $\lambda$ of the adjoint $(A|_{\mathcal{H}_{+}})^{*}$ of the restriction of $A$ to $\mathcal{H}_{+}$, which is an extension of $A$. The vectors $\chi_{\lambda} \in \mathcal{H}_{-}$ are usually called generalized eigenvectors of the operator $A$. It can be shown that the generalized vectors introduced above are generalized eigenvectors of $A$ in this sense. The same remark is valid for the generalized eigenvectors of the operator $H_p$ (which will be constructed further on).
Proof of Lemma 4.4. (i) This assertion follows easily from the representation
\[ \int K^{-}(\alpha, a(q); q', q) \, \bar{v}(q') \, [a(q') - a(q) + i0]^{-1} \, dq' = i \int_0^{\infty} dt \int e^{it(a(q') - a(q))} K^{-}(\alpha, a(q); q', q) \, \bar{v}(q') \, dq' \]
N. Angelescu, R. A. Minlos & V. A. Zagrebnov
by applying the stationary phase method as done already in the proof of Lemma 4.1.

(ii) In order to prove the second assertion, we have to consider $C_{\varphi,1}(q')$. To this aim, we represent the $q''$-integral in (4.27) as before, using the stationary phase method:
$$I(x; q') := i \int_0^\infty dt \int e^{it(a(q'') - x)} K^-(\alpha, x; q', q'')\,\bar v(q'')\,dq'' = i \int_0^\infty dt \left[ \frac{\hat C\, e^{it(a(\bar q_0) - x)}}{t^{d/2} + 1} K^-(\alpha, x; q', \bar q_0)\,\bar v(\bar q_0) + \Delta(x; q', t) \right], \tag{4.32}$$
where the correction term $\Delta$ satisfies the estimates
$$\left| \partial_x^k \partial_{q'}^\alpha \Delta(x; q', t) \right| \le \hat N \frac{h(q')}{t^{d/2+1} + 1}$$
for all multi-indices $\alpha$, $|\alpha| \le [d/2] + 1$, and $k = 0, \ldots, s + 1/3$, where $\hat N$ is a constant. Hence, $I(x; q')$ fulfills for $d \ge 3$ the estimates
$$\left| \partial_x^k \partial_{q'}^\alpha I(x; q') \right| \le \tilde N h(q'); \quad |\alpha| \le [d/2] + 1,\ k = 0, \ldots, s + 1/3.$$
The contribution of this term to $C_{\varphi,1}(q')$ is:
$$\int\!\!\int \frac{f_0^q \varphi(q) K^-(\alpha, a(q); q', q'')\,\bar v(q'')\,dq\,dq''}{(a(q') - a(q) + i0)(a(q'') - a(q) + i0)} = \int dq\, \frac{f_0^q \varphi(q) I(a(q); q')}{a(q') - a(q) + i0} = \int_{\mathbb{R}} dx\, \frac{m(x) I(x; q')}{a(q') - x + i0}, \tag{4.33}$$
where
$$m(x) = \int_{a(q) = x} f_0^q \varphi(q)\,dq. \tag{4.34}$$
As it follows from Part (i) of the proof, $m(x)$ is $s + 1/3$ times differentiable. The same property is shared by $I(x; q')$ as a function of $x$ for every fixed $q'$. Therefore, the integral over $x$ in (4.33) converges. The convergence of the other terms entering $C_{\varphi,1}(q')$ can be proved similarly.

(iii) Using the representation
$$\varphi(q) = \int \varphi(q_0)\,\delta(q - q_0)\,dq_0, \quad \varphi \in \mathcal{S}(\mathbb{R}^d),$$
and the formula (4.25), we find that
$$\Omega^+\varphi = \int \varphi(q_0)\,\psi^{q_0}\,dq_0 = (C_{\varphi,0}, C_{\varphi,1}(\cdot)) \in \mathcal{H}^{ac} \subseteq \mathcal{H}^{(\le 1)}. \tag{4.35}$$
In view of the unitarity of the application $\Omega^+ : L^2(\mathbb{R}^d) \to \mathcal{H}^{ac}$, one has
$$(\varphi_1, \varphi_2)_{L^2(\mathbb{R}^d)} = (\Omega^+\varphi_1, \Omega^+\varphi_2) = \bar C_{\varphi_1,0} C_{\varphi_2,0} + \int \bar C_{\varphi_1,1}(q) C_{\varphi_2,1}(q)\,dq. \tag{4.36}$$
(iv) Since $\mathcal{S}(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$, the image $\Omega^+\mathcal{S}(\mathbb{R}^d)$ is dense in $\mathcal{H}^{ac}$; therefore, in view of the unitarity of $\Omega^+$, $\Omega^+ L^2(\mathbb{R}^d) = \mathcal{H}^{ac}$. The intertwining property $A\Omega^+ = \Omega^+ B_0$ is obtained in the standard way. Thus, Lemma 4.4 is proved.

This lemma implies in particular that any vector $\psi \in \mathcal{H}^{ac}$ has a unique representation as
$$\psi = \psi_f = \int_{\mathbb{R}^d} f(q_0)\,\psi^{q_0}\,dq_0 := \lim_{\varphi_n \to f} \psi_{\varphi_n}, \quad f \in L^2(\mathbb{R}^d).$$
Here, the limit in the right-hand side is meant in $\mathcal{H}^{ac}$ and $\{\varphi_n\}$ is a sequence of elements of $\mathcal{S}(\mathbb{R}^d)$ converging to $f$ in $L^2$.

4.2. Construction of the one-boson subspace

As explained in Sec. 2, the construction of the one-boson subspace of $H_p$ relies on the spectral representation of the operators $\{A_p(\xi)\}_{\xi \le \kappa}$, see (2.10), entering the reduced system (3.25):
$$\begin{aligned} (A_p(\xi)F)_0 &= e^0_{0,p} f_0 + \alpha \int c(p - q, q) f_1(q)\,dq,\\ (A_p(\xi)F)_1(q) &= \alpha c(p - q, q) f_0 + a_p(\xi; q) f_1(q) + \alpha^2 \int D_p(\xi; q, q') f_1(q')\,dq', \end{aligned} \tag{4.37}$$
$F = (f_0, f_1(\cdot)) \in \mathcal{H}^{(\le 1)}$. Since for any $\xi \le \kappa$, the operator $A_p(\xi)$ satisfies all the assumptions of the previous section, there exists a family
$$\left\{ F^q_{\xi,1} = \left( f^q_{\xi,0}, f^q_{\xi,1}(\cdot) \right) \right\}_{q \in \mathbb{R}^d} \tag{4.38}$$
of generalized eigenvectors of $A_p(\xi)$ with eigenvalues $\{a_p(\xi; q)\}_{q \in \mathbb{R}^d}$, given by (4.26) and (4.27), where $a(q)$ is replaced by $a_p(\xi; q)$, and $\Delta^-$, $K^-$ by the functions $\Delta^-_\xi$, $K^-_\xi$ entering the expression of the resolvent of $A_p(\xi)$. Let $F^q_{\xi,2}$ be constructed in terms of $f^q_{\xi,1}(\cdot)$ according to (3.9), i.e. $F^q_{\xi,2} = S(\xi) f^q_{\xi,1}$, where the application $S(\xi)$ was introduced in Eq. (3.8) (more precisely, $S(\xi)$ is the extension of that operator to the space $B_1^{(k)}$ defined below), and where the coefficient functions are the solution of the fixed point equation (3.14). Then, the complete sequence
$$F^q_\xi = \left( F^q_{\xi,1}, F^q_{\xi,2} \right) = \left( f^q_{\xi,0}, f^q_{\xi,1}(q_1), f^q_{\xi,2}(q_1, q_2), \ldots \right) \tag{4.39}$$
satisfies the equation
$$H_p F^q_\xi = \xi F^q_\xi + (a_p(\xi; q) - \xi)\,\hat F^q_\xi, \tag{4.40}$$
where we denoted $\hat F^q_\xi = \left( F^q_{\xi,1}, 0 \right)$. Therefore, if $\xi(q)$ is a solution of the equation
$$a_p(\xi; q) - \xi = 0, \tag{4.41}$$
then $F^q_{\xi(q)}$ is a generalized eigenvector of the operator $H_p$ with eigenvalue $\xi(q) \equiv \xi_p(q)$, cf. Eq. (2.13).
We use the notion of generalized eigenvectors in the sense that the eigenfunctions $F^q_\xi$ are elements of the dual $(B^{(k)})'$ of an auxiliary Banach space $B^{(k)}$, $k = [d/2] + 2$, densely and continuously embedded in the Fock space $\mathcal{F}$.

Definition 4.7. Let us denote by $B_n^{(k)}$ the space of all symmetric functions $g$ of $n$ variables $q_1, \ldots, q_n \in \mathbb{R}^d$, $k$ times continuously differentiable with respect to each $q_i$, and for which there exists a constant $C$ such that, for all multi-indices $\alpha = (\alpha_1, \ldots, \alpha_n)$, $\alpha_i = (\alpha_i^1, \ldots, \alpha_i^d)$, $|\alpha_i| = \sum_{s=1}^d \alpha_i^s \le k$, one has
$$\left| \partial_q^\alpha g(q_1, \ldots, q_n) \right| \le C \prod_{i=1}^n h(q_i), \quad \forall\, q_1, \ldots, q_n \in \mathbb{R}^d. \tag{4.42}$$
It is a Banach space if endowed with the norm $\|g\|_n^{(k)} = \inf C$, where the infimum is taken over all $C$ for which the estimate (4.42) holds.

Clearly, the inclusion $B_n^{(k)} \subset \mathcal{H}^{(n)}$ is continuous and $B_n^{(k)}$ is dense in $\mathcal{H}^{(n)}$. Next, let
$$B^{(k)} = \mathbb{C} + B_1^{(k)} + \cdots + B_n^{(k)} + \cdots \subset \mathcal{F} \tag{4.43}$$
be the space of sequences $G = (g_0, g_1(q_1), \ldots, g_n(q_1, \ldots, q_n), \ldots)$, $g_0 \in \mathbb{C}$, $g_n \in B_n^{(k)}$, with norm
$$\|G\|_{B^{(k)}} = \left( |g_0|^2 + \sum_{n \ge 1} \frac{1}{n!} \left( \|g_n\|_n^{(k)} \right)^2 \right)^{1/2}. \tag{4.44}$$
Obviously, $B^{(k)}$ is continuously and densely embedded in the Fock space $\mathcal{F}$, as required. The dual $(B^{(k)})'$ of $B^{(k)}$ consists of sequences $F = (f_0, f_1, \ldots, f_n, \ldots)$, where $f_0 \in \mathbb{C}$, and $f_n \in (B_n^{(k)})'$ are linear continuous functionals on $B_n^{(k)}$; thereby, the value of $F$ at an element $G \in B^{(k)}$ is given by the series:
$$(F, G) = \bar f_0 g_0 + \sum_{n \ge 1} \frac{1}{n!} (f_n, g_n), \tag{4.45}$$
and the norm of $F$ is
$$\|F\|_{(B^{(k)})'} = \left( |f_0|^2 + \sum_{n \ge 1} \frac{1}{n!} \|f_n\|^2_{(B_n^{(k)})'} \right)^{1/2}. \tag{4.46}$$
Clearly, $\mathcal{F} \subset (B^{(k)})'$ and the inclusion is continuous.

Lemma 4.8. For every $q \in \mathbb{R}^d$ and $\xi \le \kappa$, $F^q_\xi \in (B^{(k)})'$ and has the representation
$$F^q_\xi = \hat\delta_q + \tilde F^q_\xi, \tag{4.47}$$
where $\hat\delta_q = (0, \delta_q, 0, \ldots)$ and
$$\left\| \tilde F^q_\xi \right\|_{(B^{(k)})'} \le M \alpha h(q) \tag{4.48}$$
for some constant $M$.
Proof. We prove this statement in three steps.

Step I. The $n = 0$ component of $\tilde F^q_\xi$, $\tilde f^q_{\xi,0}$, is given by Eq. (4.26), where $v(q) = c(p - q, q)$, $a(q) = a_p(\xi, q)$ and $K^- = K^-_\xi$. If condition 2 in Sec. 4.1 is fulfilled for every $\xi \le \kappa$ and $p$, we have $\Delta^-(a(q)) \ge \tau > 0$, therefore we obtain for the first term in (4.26)
$$\left| \frac{-\alpha v(q)}{\Delta^-(a(q))} \right| \le \frac{\alpha}{\tau} h(q). \tag{4.49}$$
The second term in (4.26) is treated using as before the stationary phase method, which gives
$$\int K^-(\alpha, a(q); q', q)\,\bar v(q')\,(a(q') - a(q) + i0)^{-1}\,dq' = i \int_0^\infty dt \left[ \hat C (t^{d/2} + 1)^{-1} e^{it(a(\bar q_0) - a(q))} K^-(\alpha, a(q); \bar q_0, q)\,\bar v(\bar q_0) + \Delta(q, t) \right], \tag{4.50}$$
where
$$|\Delta(q, t)| \le C h(q) (t^{d/2+1} + 1)^{-1} \tag{4.51}$$
for some constant $C$. Equations (4.49), (4.50) and (4.51) provide
$$\left| \tilde f^q_{\xi,0} \right| \le B \alpha h(q). \tag{4.52}$$

Step II. The $n = 1$ component of $F^q_\xi$, $f^q_{\xi,1} = \delta_q + \tilde f^q_{\xi,1}$, is given by Eq. (4.27), with the same assignments for $v$, $a$, and $K^-$. Again, reducing the estimate of every integral entering $(\tilde f^q_{\xi,1}, g_1)$ for a generic $g_1 \in B_1^{(k)}$ to the estimate of the corresponding oscillatory integral, and using thereby the estimate (4.49), we obtain
$$\left\| \tilde f^q_{\xi,1} \right\|_{(B_1^{(k)})'} \le L \alpha^2 h(q). \tag{4.53}$$

Step III. The higher components of $\tilde F^q_\xi$, $\tilde f^q_{\xi,n}$, $n \ge 2$, are estimated using their representation (3.9) in terms of $f^q_{\xi,1}$. We have
$$\left( \tilde f^q_{\xi,n}, g_n \right) = \sum_{i=1}^n \int b_n(q_1, \ldots, \check q_i, \ldots, q_n; q_i)\, f^q_{\xi,1}(q_i)\, g_n(q_1, \ldots, q_n)\,dq_1 \cdots dq_n + \int d_n(q_1, \ldots, q_n; q')\, f^q_{\xi,1}(q')\, g_n(q_1, \ldots, q_n)\,dq_1 \cdots dq_n\,dq'. \tag{4.54}$$
Using the estimates for $b_n$, $d_n$ and their derivatives (see (3.12) and Lemma 3.2), and also the bound (4.53), we have that
$$\left| \int b_n(q_1, \ldots, \check q_i, \ldots, q_n; q_i)\, \tilde f^q_{\xi,1}(q_i)\, g_n(q_1, \ldots, q_n)\,dq_i\,dq_1 \cdots d\check q_i \cdots dq_n \right| \le C_1 \alpha (\lambda_2(p) - \kappa)^{-1} \|g_n\|_{B_n^{(k)}} (1 + L\alpha^2) h(q) \left( \int h(q')\,dq' \right)^{n-1}$$
for $i = 1, \ldots, n$, and also that
$$\left| \int d_n(q_1, \ldots, q_n; q')\, \tilde f^q_{\xi,1}(q')\, g_n(q_1, \ldots, q_n)\,dq_1 \cdots dq_n\,dq' \right| \le C_2 \alpha (\lambda_2(p) - \kappa)^{-1} \|g_n\|_{B_n^{(k)}} (1 + L\alpha^2 h(q)) h(q) \left( \int h(q')\,dq' \right)^n.$$
Hence, with suitable constants $\tilde C$, $\tilde L$,
$$\left\| \tilde f^q_{\xi,n} \right\|_{(B_n^{(k)})'} \le \tilde C \frac{\alpha}{\lambda_2(p) - \kappa} (1 + \tilde L \alpha^2) \cdot h(q) \left( \int h(q')\,dq' \right)^{n-1}. \tag{4.55}$$
Putting together Eqs. (4.52), (4.53) and (4.55), we obtain (4.48). The lemma is proved.

Now we come back to the study of the generalized eigenvectors $F^q_{\xi(q)}$. Let us remark that $a_p(\xi; q)$ is, for every fixed $q$, a smooth, monotonically decreasing function of $\xi$ on $(-\infty, \kappa]$. If
$$G_p^{(1),\kappa} = \left\{ q \in \mathbb{R}^d : a_p(\kappa; q) - \kappa < 0 \right\}, \tag{4.56}$$
then Eq. (4.41) has a unique solution $\xi(q) < \kappa$ if $q \in G_p^{(1),\kappa}$, and no solution if $q \notin G_p^{(1),\kappa}$ (see Fig. 1).

Proposition 4.9. The function $\xi(q) \equiv \xi_p(q)$ can be represented in the form
$$\xi_p(q) = \varepsilon(q) + \xi^{(0)}_{p-q} \tag{4.57}$$
for $q \in G_p^{(1),\kappa}$ such that $p - q \in G^{(0)}$.
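[Figure: two panels (a) and (b), each with horizontal axis $\xi$ and vertical axis $\eta$, showing the line $\eta = \xi$, the curve $\eta = a_p(\xi, q)$ and the level $\kappa$; in panel (a) the intersection point $\xi(q)$ is marked.]

Fig. 1. Solution of Eq. (4.41). (a) The case $q$ in the set $G_p^{(1),\kappa}$; (b) The case $q$ not in the set $G_p^{(1),\kappa}$.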
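The dichotomy shown in Fig. 1 can be reproduced numerically. The sketch below is only an illustration under stated assumptions: the function `toy_a` is a hypothetical stand-in for $a_p(\xi; q)$ at fixed $q$ (it is smooth and decreasing in $\xi$ below a threshold `LAMBDA2`, but it is not the function defined in the paper), and the root of $a(\xi) - \xi = 0$ is located by plain bisection, which suffices because $a(\xi) - \xi$ is strictly decreasing.

```python
import math

LAMBDA2 = 2.0   # hypothetical two-boson threshold (illustration only)
KAPPA = 1.5     # hypothetical cut-off kappa < LAMBDA2


def toy_a(xi, e, alpha=0.3):
    """Hypothetical decreasing model of a_p(xi; q) for xi < LAMBDA2."""
    return e + alpha**2 / (xi - LAMBDA2)


def solve_xi(a, kappa, lower=-1.0e6, tol=1.0e-12, max_iter=200):
    """Return the unique root of a(xi) - xi below kappa, or None.

    Assumes a is continuous and decreasing, so g(xi) = a(xi) - xi is
    strictly decreasing: a root below kappa exists iff g(kappa) < 0.
    """
    def g(xi):
        return a(xi) - xi

    if g(kappa) >= 0.0:
        return None                      # case (b) of Fig. 1
    if g(lower) <= 0.0:
        raise ValueError("lower bound is not below the root")
    lo, hi = lower, kappa                # invariant: g(lo) > 0 > g(hi)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Case (a): a(kappa) - kappa < 0, a unique solution below kappa exists.
root_a = solve_xi(lambda xi: toy_a(xi, e=0.5), KAPPA)
# Case (b): a(kappa) - kappa >= 0, no solution below kappa.
root_b = solve_xi(lambda xi: toy_a(xi, e=3.0), KAPPA)
```

For this toy model the equation reduces to a quadratic, so the bisection result can be compared with the closed-form root; for the actual $a_p(\xi; q)$ no such closed form is claimed here.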
Proof. Indeed, let us remark that, in view of Eqs. (3.10) and (3.15), the system of equations (3.14) contains a subset, depending parametrically on $p$, $z$, $q$, involving only the coefficient functions $b_n(q_1, \ldots, q_{n-1}; q) =: \nu_{n-1}(q_1, \ldots, q_{n-1})$ at the given $q$. The remaining equations for $d_n$ involve the solution $\nu = (\nu_n)_{n \ge 1}$ of the former subset. The parameters $p$, $z$, $q$ enter the equation for $\nu$ only in the combination
$$\pi = p - q, \quad \zeta = z - \varepsilon(q), \tag{4.58}$$
i.e. the equation can be written in the form
$$\nu + \tilde\Gamma(\pi, \zeta)\nu = \nu^0(\pi, \zeta) \tag{4.59}$$
(with the obvious definition of $\tilde\Gamma$), and its unique solution $\nu(\pi, \zeta)$ is related with the solution of Eq. (3.14) by
$$\nu_{n-1}(q_1, \ldots, q_{n-1}; p - q, z - \varepsilon(q)) = b_n(p, z; q_1, \ldots, q_{n-1}; q). \tag{4.60}$$
The sequence of functions $(1, \nu_1(q_1; \pi, \zeta), \nu_2(q_1, q_2; \pi, \zeta), \ldots)$ defines a vector $F \in \mathcal{F}$ which satisfies the system of equations
$$[(H_\pi - \zeta I)F]_n = 0, \quad n \ge 1. \tag{4.61}$$
If $\zeta = \xi_\pi^{(0)}$, the system (4.61) is likewise satisfied by the eigenvector $F_\pi^{(0)}$, where we put $F_{\pi,0}^{(0)} = 1$. In view of the uniqueness of the solution of Eq. (4.61) under these conditions, $F = F_\pi^{(0)}$, and therefore the following relation holds:
$$\frac{1}{2}\pi^2 - \xi_\pi^{(0)} + \alpha \int c(\pi - q'; q')\,\nu_1\!\left(q'; \pi, \xi_\pi^{(0)}\right) dq' = 0. \tag{4.62}$$
Going back in this relation to the original parameters $p = \pi + q$, $z = \zeta + \varepsilon(q)$, $q$, and replacing, according to Eq. (4.60), $\nu_1\!\left(q'; \pi, \xi_\pi^{(0)}\right)$ by $b_2\!\left(p, \xi_\pi^{(0)} + \varepsilon(q); q'; q\right)$, one obtains
$$\frac{1}{2}(p - q)^2 - \xi_{p-q}^{(0)} + \alpha \int c(p - q - q'; q')\, b_2\!\left(p, \xi_{p-q}^{(0)} + \varepsilon(q); q'; q\right) dq' = 0. \tag{4.63}$$
Comparing this expression with Eqs. (4.41), (3.26) and (3.23), which define $\xi_p(q)$, one obtains the desired relation (4.57).

Clearly, by the convexity of $e^0_{1,p}$ and the asymptotic properties of $m_p(\xi, q)$ given in Corollary 3.4, $G_p^{(1),\kappa}$ is a bounded domain, nonvoid for $\kappa > \lambda_1(p)$, and $\min_{q \in G_p^{(1),\kappa}} \xi(q) = \lambda_1(p)$.

By the smoothness of $a_p(\xi, q)$ with respect to both arguments, the function $\xi(q)$ defined on $G_p^{(1),\kappa}$ is also smooth. Moreover, for $\alpha$ sufficiently small, this function has a unique critical point (namely, a minimum), which is nondegenerate. In particular, it follows that on every level of $\xi(q)$,
$$\chi_x = \left\{ q \in G_p^{(1),\kappa} : \xi(q) = x \right\},$$
one can define a measure $\nu_x$ (the Gelfand–Leray measure, see [16]), such that, for any integrable function $\varphi$ on $G_p^{(1),\kappa}$,
$$\int_{G_p^{(1),\kappa}} \varphi(q)\,dq = \int_{\lambda_1(p)}^{\kappa} dx \int_{\chi_x} \varphi_x\,d\nu_x, \tag{4.64}$$
where $\varphi_x = \varphi|_{\chi_x}$ is the restriction of $\varphi$ to the surface $\chi_x$. From (4.64), it follows that $L^2\!\left(G_p^{(1),\kappa}, dq\right)$ can be represented as a direct integral of Hilbert spaces:
$$L^2\!\left(G_p^{(1),\kappa}, dq\right) = \int^{\oplus}_{[\lambda_1(p), \kappa]} \mathcal{H}_x\,dx, \tag{4.65}$$
with $\mathcal{H}_x := L^2(\chi_x, \nu_x)$.

Let us now consider the family $\left\{ F^q_{\xi(q)} \right\}_{q \in G_p^{(1),\kappa}} \subset (B^{(k)})'$ of generalized eigenvectors of $H_p$. The next lemma, which may be stated formally as an approximate orthonormality of this family, is an important element of our constructions. We denote
$$F(\varphi) := \int_{G_p^{(1),\kappa}} F^q_{\xi(q)}\,\varphi(q)\,dq \in (B^{(k)})',$$
for $\varphi \in \mathcal{D}\!\left(G_p^{(1),\kappa}\right)$, the space of infinitely differentiable functions with support in $G_p^{(1),\kappa}$.
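The level-set decomposition (4.64) can be checked numerically on a simple example. The sketch below is an illustration under assumptions, not the setting of the paper: it takes $d = 2$, the hypothetical level function $\xi(q) = |q|^2$ (so the minimum value is 0 rather than $\lambda_1(p)$), a hypothetical test function `phi`, and uses the standard expression $d\nu_x = d\sigma / |\nabla\xi|$ for the Gelfand–Leray measure on a level set, which on the circle $|q| = \sqrt{x}$ reduces to $d\theta/2$.

```python
import math

import numpy as np

KAPPA = 1.5  # hypothetical upper level (illustration only)


def xi(q1, q2):
    """Hypothetical level function with a nondegenerate minimum at q = 0."""
    return q1**2 + q2**2


def phi(q1, q2):
    """Hypothetical smooth test function."""
    return (1.0 + q1) ** 2 * np.exp(-(q1**2 + q2**2))


def volume_integral(n=1200):
    """Left side of (4.64): midpoint rule on a Cartesian grid over {xi < KAPPA}."""
    r = math.sqrt(KAPPA)
    h = 2.0 * r / n
    c = -r + h * (np.arange(n) + 0.5)
    q1, q2 = np.meshgrid(c, c, indexing="ij")
    inside = xi(q1, q2) < KAPPA
    return float(np.sum(phi(q1, q2)[inside]) * h * h)


def level_set_integral(nx=1000, ntheta=256):
    """Right side of (4.64): integrate over levels x, then over each circle.

    On the circle |q| = sqrt(x): d(sigma) = r d(theta), |grad xi| = 2 r,
    hence d(nu_x) = d(theta) / 2.
    """
    hx = KAPPA / nx
    x = hx * (np.arange(nx) + 0.5)
    theta = 2.0 * np.pi * (np.arange(ntheta) + 0.5) / ntheta
    r = np.sqrt(x)[:, None]
    values = phi(r * np.cos(theta)[None, :], r * np.sin(theta)[None, :])
    inner = values.sum(axis=1) * (2.0 * np.pi / ntheta) * 0.5
    return float(inner.sum() * hx)


lhs = volume_integral()
rhs = level_set_integral()
```

The two numbers agree up to the discretization error of the Cartesian grid near the boundary circle; the factor $1/|\nabla\xi|$ is what distinguishes $\nu_x$ from the ordinary arc-length measure.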
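Lemma 4.10. (i) For any $\varphi \in \mathcal{D}\!\left(G_p^{(1),\kappa}\right)$, one has $F(\varphi) \in \mathcal{F}$.

(ii) There exist functions $S(q)$ and $M(q, q')$ defined for $q, q' \in G_p^{(1),\kappa}$ such that, for any $\varphi_1, \varphi_2 \in \mathcal{D}\!\left(G_p^{(1),\kappa}\right)$, the following representation holds:
$$(F(\varphi_1), F(\varphi_2))_{\mathcal{F}} = \int_{\lambda_1(p)}^{\kappa} dx \left[ \int_{\chi_x} (1 + S_x(q))\,|\varphi_x(q)|^2\,d\nu_x + \int_{\chi_x}\!\int_{\chi_x} \bar\varphi_x(q)\,M_x(q, q')\,\varphi_x(q')\,d\nu_x(q)\,d\nu_x(q') \right]. \tag{4.66}$$
Here, $\varphi_x$, $S_x$ and $M_x$ denote the restrictions of the functions $\varphi$, $S$ and $M$ to $\chi_x$ and $\chi_x \times \chi_x$, respectively.

(iii) The following estimates hold with suitable constants $\bar C$, $\hat C$:
$$|S(q)| \le \bar C \frac{\alpha}{\lambda_2(p) - \kappa}, \tag{4.67}$$
$$|M(q, q')| \le \hat C \alpha\, h(q) h(q'), \tag{4.68}$$
implying that $F(\varphi) \in \mathcal{F}$, for $\varphi \in L^2\!\left(G_p^{(1),\kappa}\right)$ and
$$C_1 \|\varphi\|_{L^2(G_p^{(1),\kappa})} \le \|F(\varphi)\| \le C_2 \|\varphi\|_{L^2(G_p^{(1),\kappa})}. \tag{4.69}$$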
Proof. (i) This assertion will follow from the calculations below.

(ii) In the sense of distributions, Eq. (4.66) means that
$$\left( F^q_{\xi(q)}, F^{q'}_{\xi(q')} \right)_{\mathcal{F}} = (1 + S(q))\,\delta(q - q') + M(q, q')\,\delta(\xi(q) - \xi(q')). \tag{4.70}$$
Before we proceed, the following remarks are in order:

1°. We seemingly make an abuse in calculating the scalar product $\left( F^q_\xi, F^{q'}_{\xi'} \right)_{\mathcal{F}}$ of two generalized functions. Such calculations can be justified in the following way. The Fourier transform $\tilde F^q_{\xi,n}(\zeta_1, \ldots, \zeta_n)$ of the generalized function $F^q_{\xi,n}(q_1, \ldots, q_n)$ is, as one can easily verify, a usual function of the variables $(\zeta_1, \ldots, \zeta_n)$, polynomially bounded at infinity in these variables. If, further, we view the scalar product $\left( F^q_{\xi,n}, F^{q'}_{\xi',n} \right)_{L^2(\mathbb{R}^{nd})}$ as the limit of scalar products:
$$\left( \tilde F^q_{\xi,n}, \tilde F^{q'}_{\xi',n} \right)_{L^2(\mathbb{R}^{nd})} := \lim_{\delta \downarrow 0} \int_{\mathbb{R}^{nd}} \tilde F^q_{\xi,n}(\zeta_1, \ldots, \zeta_n)\, \tilde F^{q'}_{\xi',n}(\zeta_1, \ldots, \zeta_n) \prod_{i=1}^n e^{-\delta|\zeta_i|}\,d\zeta_i,$$
then one can prove that this limit exists in the sense of convergence of generalized functions of the variables $q$, $q'$. We shall not provide the details of this justifying procedure, and write instead directly its result entering our calculations.

2°. We shall exploit the "orthogonality" of the generalized eigenfunctions $F^q_{\xi(q)}$, $q \in G_p^{(1),\kappa}$, corresponding to different eigenvalues $\xi(q) \ne \xi(q')$, by supposing that the support of the generalized function
$$Q(q, q') := \left( F^q_{\xi(q)}, F^{q'}_{\xi(q')} \right)_{\mathcal{F}}$$
is contained in the surface $\Sigma = \left\{ (q, q') \in G_p^{(1),\kappa} \times G_p^{(1),\kappa} : \xi(q) = \xi(q') \right\}$:
$$\operatorname{supp} Q \subset \Sigma, \tag{4.71}$$
and neglect in our calculation all terms which do not contribute to the factors in front of $\delta(\xi(q) - \xi(q'))$ or $\delta(q - q')$. However, the relation (4.71) likewise needs a justification. Namely, if we did not skip these "non-contributing" terms in the calculations, we would obtain a generalized function $\tilde Q(q, q')$ such that for smooth functions $\varphi_i(q)$, $i = 1, 2$, with support contained in $G_p^{(1),\kappa}$, the scalar product
$$(F(\varphi_1), F(\varphi_2))_{\mathcal{F}} = \int_{G_p^{(1),\kappa} \times G_p^{(1),\kappa}} \tilde Q(q, q')\,\bar\varphi_1(q)\,\varphi_2(q')\,dq\,dq'$$
would be finite; in particular, $F(\varphi) \in \mathcal{F}$ for smooth $\varphi$. On the other hand, if the support of $\varphi$ was contained in an $\varepsilon$-neighborhood of the level $\chi_x$, then $F(\varphi) \in E(x - \varepsilon, x + \varepsilon)\mathcal{F}$, where $\{E(\Delta)\}$ denotes the family of spectral projections of $H_p$. Hence, for $\varphi_i(q)$, $i = 1, 2$, with supports respectively contained in nonintersecting $\varepsilon$-neighborhoods of the levels $\chi_{x_i}$, where $x_1 \ne x_2$, the vectors $F(\varphi_i)$, $i = 1, 2$, would be orthogonal. This proves in fact (4.71).

Keeping these remarks in mind, we proceed with the proof of (ii).
One has
$$\left( F^q_{\xi(q)}, F^{q'}_{\xi(q')} \right)_{\mathcal{F}} = \left( \Pi_1 F^q_{\xi(q)}, \Pi_1 F^{q'}_{\xi(q')} \right)_{\mathcal{H}^{(\le 1)}} + \sum_{n=2}^{\infty} \left( F^q_{\xi(q),n}, F^{q'}_{\xi(q'),n} \right)_{\mathcal{H}^{(n)}}. \tag{4.72}$$
As $H_p$ is self-adjoint and $F^q_{\xi(q)}$ are its generalized eigenfunctions with eigenvalue $\xi(q)$, the support of this distribution is contained in $\xi(q) = \xi(q')$. As $\Pi_1 F^q_{\xi(q)} = \psi^q$ and $\Pi_1 F^{q'}_{\xi(q')} = \psi^{q'}$ are generalized eigenvectors of the operator $A(\xi)$ for $\xi = \xi(q) = \xi(q')$, we can use the relation (4.31):
$$\left( \Pi_1 F^q_{\xi(q)}, \Pi_1 F^{q'}_{\xi(q')} \right)_{\mathcal{H}^{(\le 1)}} = \delta(q - q'), \quad (\xi(q) = \xi(q')). \tag{4.73}$$
We are therefore left with calculating $\left( F^q_{\xi(q),n}, F^{q'}_{\xi(q'),n} \right)_{\mathcal{H}^{(n)}}$ for $n \ge 2$. To this aim, use is made of the representation (3.9):
$$F^q_{\xi(q),n} = \sum_{i=1}^n b_n(\xi(q); q_1, \ldots, \check q_i, \ldots, q_n; q_i)\, f^q_{\xi(q),1}(q_i) + \int d_n(\xi(q); q_1, \ldots, q_n; q')\, f^q_{\xi(q),1}(q')\,dq'. \tag{4.74}$$
The second term in (4.74) is a smooth function of $q_1, \ldots, q_n$ and does not contribute to the terms containing $\delta(\xi(q) - \xi(q'))$ or $\delta(q - q')$. Likewise, it is not hard to see that the only contributions to such terms come from
$$\int b_n(\xi(q); q_1 \cdots \check q_i \cdots q_n; q_i)\, f^q_{\xi(q),1}(q_i)\, b_n(\xi(q'); q_1 \cdots \check q_i \cdots q_n; q_i)\, f^{q'}_{\xi(q'),1}(q_i)\,dq_1 \cdots dq_n = \int g_n(\xi(q), \xi(q'); \hat q)\, f^q_{\xi(q),1}(\hat q)\, f^{q'}_{\xi(q'),1}(\hat q)\,d\hat q, \tag{4.75}$$
where
$$g_n(\xi, \xi'; \hat q) = \int b_n(\xi; q_1 \cdots q_{n-1}; \hat q)\, b_n(\xi'; q_1 \cdots q_{n-1}; \hat q)\,dq_1 \cdots dq_{n-1}. \tag{4.76}$$
In the integral over $\hat q$ in the r.h.s. of Eq. (4.75), we separate the singular parts of $f^q_{\xi(q),1}$, $f^{q'}_{\xi(q'),1}$ using the Sokhotski formula
$$\frac{1}{x + i0} = P\frac{1}{x} + i\pi\delta(x)$$
in their expression (4.27) and the fact that $a_p(\xi(q), q') = \xi(q)$ implies $\xi(q) = \xi(q')$, hence also $a_p(\xi(q), q') = \xi(q')$ (in view of the uniqueness of the solution of $a_p(\xi, q) = \xi$):
$$f^q_{\xi(q),1}(q') = \delta(q - q') + i\pi R_{\xi(q)}(q, q')\,\delta(\xi(q) - \xi(q')) + \text{regular terms}, \tag{4.77}$$
where
$$R_\xi(q, q') = \alpha^2 K^-_\xi(\xi; q, q') - \alpha f^q_{\xi,0}\, c(p - q, q) + \alpha^2 f^q_{\xi,0} \int K^-_\xi(\xi; q', q'')\, c(p - q'', q'')\,(a_p(\xi, q'') - \xi + i0)^{-1}\,dq''. \tag{4.78}$$
The regular parts do not contribute to (4.75), which becomes, after performing the integration over $\hat q$:
$$g_n(\xi(q), \xi(q); q)\,\delta(q - q') + i\pi\,\delta(\xi(q) - \xi(q')) \left[ g_n(\xi(q), \xi(q); q')\,R_{\xi(q)}(q, q') - g_n(\xi(q), \xi(q); q)\,\bar R_{\xi(q)}(q', q) \right] + \pi^2 \int_{\xi(q'') = \xi(q)} g_n(\xi(q), \xi(q); q'')\,R_{\xi(q)}(q, q'')\,\bar R_{\xi(q)}(q', q'')\,dq''. \tag{4.79}$$
Let us define the function:
$$T(\xi, q') = \sum_{n=2}^{\infty} \frac{1}{n!}\, n\, g_n(\xi, \xi; q'). \tag{4.80}$$
Then, one can see from Eqs. (4.73) and (4.79) that (4.70) is fulfilled with
$$S(q) = T(\xi(q), q), \tag{4.81}$$
$$M(q, q') = i\pi \left[ T(\xi(q), q')\,R_{\xi(q)}(q, q') - T(\xi(q), q)\,\bar R_{\xi(q)}(q', q) \right] + \pi^2 \int T(\xi(q), q'')\,R_{\xi(q)}(q, q'')\,\bar R_{\xi(q)}(q', q'')\,dq''. \tag{4.82}$$

(iii) Using the estimates in Lemma 3.2, one obtains for $T$:
$$|T(\xi, q')| \le C \frac{\alpha}{\lambda_2(p) - \kappa} \sum_{n=2}^{\infty} \frac{\|h\|_{L^2}^{2(n-1)}}{(n-1)!} = \bar C \frac{\alpha}{\lambda_2(p) - \kappa}.$$
Also, from the inequalities (4.52) and (4.8), it follows that $|R_{\xi(q)}(q, q')| \le C h(q) h(q')$, wherefrom (4.67) and (4.68) follow. Using these estimates in Eq. (4.66), one obtains that $F(\varphi) \in \mathcal{F}$ for any $\varphi \in L^2\!\left(G_p^{(1),\kappa}\right)$ and, moreover, the estimate (4.69) holds. The lemma is proved.

Let now $\mathcal{H}_1^\kappa(p) \subset \mathcal{F}$ be the subspace spanned by $F(\varphi)$, $\varphi \in L^2\!\left(G_p^{(1),\kappa}\right)$. Equation (4.69) implies that the application $\varphi \mapsto F(\varphi)$ is continuous and invertible. Thereby, $\mathcal{H}_1^\kappa(p)$ is $H_p$-invariant and
$$H_p F(\varphi) = F(\hat\xi\varphi), \tag{4.83}$$
where
$$(\hat\xi\varphi)(q) = \xi(q)\varphi(q). \tag{4.84}$$

Lemma 4.11. There exists a bounded, invertible operator $B : L^2\!\left(G_p^{(1),\kappa}\right) \to L^2\!\left(G_p^{(1),\kappa}\right)$ which commutes with $H_p$ and such that:
$$(F(B\varphi_1), F(B\varphi_2))_{\mathcal{F}} = (\varphi_1, \varphi_2)_{L^2(G_p^{(1),\kappa})}. \tag{4.85}$$
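A finite-dimensional analogue may help to see why such an operator $B$ should exist. In the sketch below (an illustration under assumptions, not the construction of the paper) the Gram form of a family of vectors is $I + V$ with $V$ Hermitian and of small norm, which mimics the structure $(1 + S)\delta + M\delta$ in (4.70) on a single level set; the names `inverse_sqrt`, `gram` and the random matrix `v` are hypothetical. Taking $B = (I + V)^{-1/2}$ turns the Gram form into the identity.

```python
import numpy as np


def inverse_sqrt(gram):
    """Return gram^(-1/2) for a Hermitian positive definite matrix."""
    eigenvalues, eigenvectors = np.linalg.eigh(gram)
    if eigenvalues.min() <= 0.0:
        raise ValueError("Gram matrix is not positive definite")
    # U diag(w^(-1/2)) U^*: scale each eigenvector column, then recombine.
    return (eigenvectors / np.sqrt(eigenvalues)) @ eigenvectors.conj().T


rng = np.random.default_rng(0)
n = 6
x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
v = 0.02 * (x + x.conj().T)          # small Hermitian perturbation (hypothetical)
gram = np.eye(n) + v                 # Gram form: (F(e_i), F(e_j)) = (I + V)_{ij}

b = inverse_sqrt(gram)
# After the change of variables phi -> B phi the Gram form is the identity.
residual = np.linalg.norm(b.conj().T @ gram @ b - np.eye(n))
commutator = np.linalg.norm(b @ gram - gram @ b)
```

Since $B$ is a function of $I + V$, it commutes with it; in the infinite-dimensional setting the analogous operator is built fibre by fibre over the levels $x$, which is why it commutes with multiplication by $x$.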
Proof. We use the representation (4.65) of $L^2\!\left(G_p^{(1),\kappa}, dq\right)$ as a direct integral of the spaces $\mathcal{H}_x$ and write $\mathcal{H}_1^\kappa(p)$ as a direct integral:
$$\mathcal{H}_1^\kappa(p) = \int^{\oplus}_{[\lambda_1(p), \kappa]} \mathcal{H}_{1,x}\,dx. \tag{4.86}$$
Here, $\mathcal{H}_{1,x}$ is the image of $\mathcal{H}_x$ by the application of $L^2\!\left(G_p^{(1),\kappa}, dq\right)$ into $\mathcal{H}_1^\kappa(p)$ and consists of functionals
$$F_x(\varphi) = \int_{\chi_x} F^q_{\xi(q)}\,\varphi(q)\,d\nu_x(q). \tag{4.87}$$
By virtue of (4.70),
$$(F_x(\varphi_1), F_x(\varphi_2))_{\mathcal{F}} = \int_{\chi_x} (1 + S(q))\,\bar\varphi_1(q)\,\varphi_2(q)\,d\nu_x(q) + \int_{\chi_x \times \chi_x} \bar\varphi_1(q)\,M(q, q')\,\varphi_2(q')\,d\nu_x(q)\,d\nu_x(q') = ((I_x + V_x)\varphi_1, \varphi_2)_{\mathcal{H}_{1,x}}, \tag{4.88}$$
where $I_x$ is the unit operator in $\mathcal{H}_{1,x}$ and $V_x$ is a bounded operator with small norm (cf. Eqs. (4.67) and (4.68)). Also, $H_p$ acts in $\mathcal{H}_{1,x}$ as $xI_x$. Let $B_x = (I_x + V_x)^{-1/2}$. Then, Eq. (4.88) reads as
$$(F_x(B_x\varphi_1), F_x(B_x\varphi_2))_{\mathcal{F}} = (\varphi_1, \varphi_2)_{\mathcal{H}_{1,x}}.$$
Finally, defining $B = \int^{\oplus}_{[\lambda_1(p), \kappa]} B_x\,dx$, one gets both that the operator $B$ commutes with $H_p$ and that Eq. (4.85) is satisfied.

5. The Ground State of $H_p$

A detailed analysis of the ground state of $H_p$ is performed in arbitrary dimension in [10]. In this section, we shall briefly show how the existence of the ground state follows from our considerations for $d \ge 3$.

As explained in Sec. 2, $H_p$ has a ground state if, and only if, there exists $\xi < \lambda_1(p)$ such that the operator $A_p(\xi)$ in $\mathcal{H}^{(\le 1)}$ has the eigenvalue $\xi$. By the analysis done in Sec. 4.1, $A_p(\xi)$ has one simple eigenvalue $e_p(\xi) < \lambda_1(p)$ if, and only if, $\Delta_p(\lambda_1(p)) < 0$ (where $\Delta_p(\xi)$ is the function defined by Eq. (4.21) for $A_p(\xi)$), in which case $e_p(\xi)$ equals the unique solution of the equation $\Delta_p(\xi) = 0$. Since $e_p^{(0)} - \lambda_1(p) \to \infty$ for $p \to \infty$, while $(v, R_B(\lambda_1(p))v)$ (with $v$ and $B$ corresponding to $A_p(\xi)$) is bounded, $\{p;\ \Delta_p(\lambda_1(p)) < 0\}$ is a bounded domain. As seen from Eq. (2.10), $A_p(\xi)$ is a decreasing family (in the usual order of self-adjoint operators), implying that $e_p(\xi)$ is a decreasing function of $\xi \in (-\infty, \lambda_1(p))$. We conclude that the equation $e_p(\xi) = \xi$ has a solution $\xi_p^{(0)}$ if, and only if, $p$ belongs to the subdomain $G^{(0)} = \{p : e_p(\lambda_1(p)) < \lambda_1(p)\}$.
For $p \in G^{(0)}$, let $F_{\xi_p^{(0)},1}$ be an eigenvector of $A_p\!\left(\xi_p^{(0)}\right)$, and let $F_{\xi_p^{(0)},2} \in \mathcal{H}^{(\ge 2)}$ be defined according to (2.14). Then, the vector $F_p^{(0)} = F_{\xi_p^{(0)},1} + F_{\xi_p^{(0)},2}$ is a ground state of the operator $H_p$. Therefore, $H_p$ has a unique ground state if $p \in G^{(0)}$, and no ground state if $p \notin G^{(0)}$. Also, there are no eigenvalues embedded in the one-particle sector. Indeed, if $e \in [\lambda_1(p), \lambda_2(p) - \kappa]$ is an eigenvalue of $H_p$ with eigenvector $F = (f_0, f_1, \ldots)$, then $F_1 = (f_0, f_1)$ would be an eigenvector of the Friedrichs operator $A_p(e)$ with eigenvalue $e \ge \lambda_1(p)$, which contradicts the fact that the latter has no eigenvalues embedded in the continuous spectrum (cf. Eq. (4.23)).
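The way an eigenvalue below the continuum is tied to the zero of a scalar function can be illustrated on a finite-dimensional caricature of a Friedrichs operator. The sketch below is an assumption-laden toy: it keeps only the rank-one coupling between the level $e_0$ and a discretized continuum (the term with $D_p$ in (4.37) is dropped), the dispersion `a`, the form factor `v`, and the numbers `e0`, `alpha` are hypothetical, and the function `secular` is the standard Schur-complement expression $e_0 - z - \alpha^2 \sum_j w_j^2 / (a_j - z)$, used here as a stand-in for the function $\Delta$ of Eq. (4.21), whose exact form lies outside this section.

```python
import numpy as np


def friedrichs_matrix(e0, a, w, alpha):
    """Discretized toy Friedrichs operator: one level coupled to a 'continuum'."""
    n = len(a)
    m = np.zeros((n + 1, n + 1))
    m[0, 0] = e0
    m[0, 1:] = alpha * w
    m[1:, 0] = alpha * w
    m[1:, 1:] = np.diag(a)
    return m


def secular(z, e0, a, w, alpha):
    """Schur-complement function whose zeros off the set {a_j} are eigenvalues."""
    return e0 - z - alpha**2 * np.sum(w**2 / (a - z))


# Hypothetical data: continuum a(q) = 1 + q^2 on [0, 3], Gaussian form factor.
n_grid = 400
h = 3.0 / n_grid
q = h * (np.arange(n_grid) + 0.5)
a = 1.0 + q**2
w = np.exp(-q**2) * np.sqrt(h)       # form factor times sqrt of quadrature weight
e0, alpha = 0.8, 0.5

eigenvalues = np.linalg.eigvalsh(friedrichs_matrix(e0, a, w, alpha))
z0 = eigenvalues[0]                  # lowest eigenvalue, below the 'continuum'
defect = secular(z0, e0, a, w, alpha)
```

In this finite model the lowest eigenvalue always lies below the discretized continuum; the threshold condition $\Delta_p(\lambda_1(p)) < 0$ of the text is a genuinely infinite-dimensional feature that the toy does not reproduce.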
6. Concluding Remarks

The main result of the paper is the construction, in the weak coupling regime, of a manifold of states indexed by a phonon momentum $q$. The ground state describing a single polaron becomes unstable at a certain momentum threshold, above which it dissolves into this manifold. It is to be expected that at still higher momenta, the latter states become themselves unstable and dissolve into two-phonon states, etc.

The representation (4.57) of the eigenvalue $\xi(q)$ strongly suggests that the generalized eigenfunctions $F^q_{\xi(q)}$ associated to it are scattering states of a free phonon and a polaron in the sense of [12]. In this case, the ground state instability at high $k$ might be interpreted as emission of a phonon.

The approach used here of eliminating the higher components of the eigenvectors can equally well be applied in the case of the decomposition $\mathcal{F} = \mathcal{H}^{\le n} \oplus \mathcal{H}^{\ge n+1}$, leading to a family of self-adjoint operators $\{A_n(\xi), \xi \le \kappa\}$ (where $\lambda_n(p) \le \kappa < \lambda_{n+1}(p)$, i.e. $\kappa$ is between the $n$-boson and the $(n+1)$-boson threshold), acting in the space $\mathcal{H}^{\le n}$. These operators have a more complicated structure than the Friedrichs operators in $\mathcal{H}^{\le 1}$, and their spectral analysis and scattering theory are not available in such detail as for the Friedrichs operators. If this theory was elaborated (e.g., using equations analogous to the Faddeev–Yakubovski equations for the resolvent of $n$-body Schrödinger operators, see [18–20]), then the approach of the present paper would provide the construction of the whole $(n-1)$-boson subspace and of a part of the $n$-boson subspace. Then, a reasoning for the $n$-boson branch similar to Proposition 4.9 implies for the energy the formula:
$$\xi_p(q_1, \ldots, q_n) = \sum_{i=1}^n \varepsilon(q_i) + \xi^{(0)}_{p - \sum_{i=1}^n q_i}.$$
We did not prove the completeness of the constructed subspaces H0 (p) (generated by the ground state) and H1κ (p), meaning that in (H0 (p) ⊕ H1κ (p))⊥ the spectrum of Hp has no point below κ. It is known to be true [12]. We hope to prove it by our method in the future.
Acknowledgments

N.A. and R.A.M. acknowledge the warm hospitality of the C.P.T. Luminy-Marseille, where the project of this work was born. R.A.M. also acknowledges financial support from the Scientific Fund of the Russian Federation (Grant No. 02-0100444) and the Presidential Fund for Support of Scientific Schools of Russia (Grant No. 934 2003.1).

References

[1] H. Fröhlich, Electrons in lattice fields, Adv. Phys. 3 (1954) 325–364.
[2] R. P. Feynman, Lectures on Statistical Mechanics (Benjamin, New York, 1971).
[3] J. Fröhlich, On the infrared problem in a model of scalar electrons and massless, scalar bosons, Ann. Inst. H. Poincaré AXIX (1973) 1–103.
[4] J. Fröhlich, Existence of dressed one-electron states in a class of persistent models, Fortschr. Phys. 22 (1974) 159–198.
[5] C. Gérard, On the scattering theory of massless Nelson models, Rev. Math. Phys. 14 (2002) 1165–1280.
[6] B. Gerlach and H. Löwen, Analytical properties of polaron systems or: Do polaronic phase transitions exist or not?, Rev. Mod. Phys. 63 (1991) 63–90.
[7] E. H. Lieb and L. E. Thomas, Exact ground state energy of the strong coupling polaron, Comm. Math. Phys. 183 (1997) 511–519.
[8] V. Bach, J. Fröhlich and I. M. Sigal, Quantum electrodynamics of confined nonrelativistic particles, Adv. Math. 137 (1998) 299–395.
[9] R. A. Minlos et al., Ground state properties of the Nelson Hamiltonian: a Gibbs measure approach, Rev. Math. Phys. 14 (2002) 173–198.
[10] R. A. Minlos, Lower branch of the spectrum of a fermion interacting with a bosonic gas (polaron), Theor. Math. Phys. 92 (1992) 255–268.
[11] H. Spohn, The polaron at large total momentum, J. Phys. A21 (1988) 1199–1212.
[12] J. Fröhlich, M. Griesemer and B. Schlein, Asymptotic completeness for Compton scattering, Comm. Math. Phys. 252 (2004) 415–476.
[13] K. O. Friedrichs, Perturbation of Spectra in Hilbert Space (American Mathematical Society, Providence, 1965).
[14] D. R. Yafaev, Scattering Theory (St.-Petersburg University, 1994).
[15] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. IV (Academic Press, New York, 1983).
[16] I. M. Gelfand and G. E. Shilov, Generalized Functions, Vol. 3 (Nauka, Moscow, 1959).
[17] Yu. M. Berezanski, Eigenfunction Expansion of Self-Adjoint Operators (Naukova Dumka, Kiev, 1965).
[18] O. A. Yakubovski, On the integral equations in the theory of N-particle scattering, Yadernaya Fizika 5 (1967) 1312–1320.
[19] L. D. Faddeev, Mathematical questions in the quantum theory of scattering for a system of three particles, Trudy Mat. Inst. Steklov. 69 (1963) 122.
[20] K. Hepp, On the quantum-mechanical N-body problem, Helv. Phys. Acta 42 (1969) 425–458.
November 18, 2005 10:54 WSPC/148-RMP
J070-00252
Reviews in Mathematical Physics Vol. 17, No. 10 (2005) 1143–1207
© World Scientific Publishing Company

ASYMPTOTIC STABILITY OF NONLINEAR SCHRÖDINGER EQUATIONS WITH POTENTIAL

ZHOU GANG∗ and I. M. SIGAL†
Department of Mathematics, University of Toronto, Toronto, Canada
∗[email protected] †[email protected]

Received 7 February 2005
Revised 28 September 2005

We prove asymptotic stability of trapped solitons in the generalized nonlinear Schrödinger equation with a potential in dimension 1 and for even potential and even initial conditions.

Keywords: Asymptotic stability; soliton; nonlinear Schrödinger equations; potential.

Mathematics Subject Classification 2000: 35Q55, 37K45
1. Introduction

In this paper, we study the generalized nonlinear Schrödinger equation with a potential
$$i\frac{\partial\psi}{\partial t} = -\psi_{xx} + V_h\psi - f(|\psi|^2)\psi \tag{1.1}$$
in dimension 1. Here $V_h : \mathbb{R} \to \mathbb{R}$ is a family of external potentials, $\psi_{xx} = \partial_x^2\psi$, and $f(s)$ is a nonlinearity to be specified later. Such equations arise in the theory of Bose–Einstein condensation,^a nonlinear optics, the theory of water waves^b and in other areas.

^a In this case, Eq. (1.1) is called the Gross–Pitaevskii equation.
^b In these two areas, V, generally time-dependent, arises if one takes into account impurities and/or variations in geometry of the medium.

To fix ideas, we assume the potentials to be of the form $V_h(x) := V(hx)$ with $V$ smooth and decaying at $\infty$. Thus for $h = 0$, Eq. (1.1) becomes the standard generalized nonlinear Schrödinger equation (gNLS)
$$i\frac{\partial\psi}{\partial t} = -\psi_{xx} + \mu\psi - f(|\psi|^2)\psi, \tag{1.2}$$
where $\mu = V(0)$. For a certain class of nonlinearities, $f(|\psi|^2)$ (see Sec. 2), there is an interval $I_0 \subset \mathbb{R}$ such that for any $\lambda \in I_0$, Eq. (1.2) has solutions of the form
$e^{i(\lambda-\mu)t}\phi_0^\lambda(x)$ where $\phi_0^\lambda \in H_2(\mathbb{R})$ and $\phi_0^\lambda > 0$. Such solutions (in general without the restriction $\phi_0^\lambda > 0$) are called solitary waves or solitons or, to emphasize the property $\phi_0^\lambda > 0$, ground states. For brevity, we will use the term soliton, applying it also to the function $\phi_0^\lambda$ without the phase factor $e^{i(\lambda-\mu)t}$.

Equation (1.2) is translationally and gauge invariant. Hence if $e^{i(\lambda-\mu)t}\phi_0^\lambda(x)$ is a solution of Eq. (1.2), then so is $e^{i(\lambda-\mu)t}e^{i\alpha}\phi_0^\lambda(x + a)$, for any $\lambda \in I_0$, $a \in \mathbb{R}$, $\alpha \in [0, 2\pi)$. This situation changes dramatically when the potential $V_h$ is turned on. In general, as was shown in [18, 30, 1], out of the three-parameter family $e^{i(\lambda-\mu)t}e^{i\alpha}\phi_0^\lambda(x + a)$, only a discrete set of two-parameter families of solutions to Eq. (1.1) bifurcate: $e^{i\lambda t}e^{i\alpha}\phi^\lambda(x)$, $\alpha \in [0, 2\pi)$ and $\lambda \in I$, for some $I \subseteq I_0$. Here $\phi^\lambda \equiv \phi_h^\lambda \in H_2(\mathbb{R})$ and $\phi^\lambda > 0$. Each such family corresponds to a different critical point of the potential $V_h(x)$. It was shown in [31] that the solutions corresponding to minima of $V_h(x)$ are orbitally (Lyapunov) stable and those corresponding to maxima are orbitally unstable. We call the solitary wave solutions described above which correspond to the minima of $V_h(x)$ trapped solitons or just solitons of Eq. (1.1), omitting the last qualifier if it is clear which equation we are dealing with.

The main result of this paper is a proof that the trapped solitons of Eq. (1.1) are asymptotically stable. The latter property means that if an initial condition of (1.1) is sufficiently close to a trapped soliton then the solution converges in some weighted $L^2$ space to, in general, another trapped soliton of the same two-parameter family. In this paper, we prove this result under the additional assumption that the potential and the initial conditions are even. This limits the number of technical difficulties we have to deal with. In the subsequent paper, we remove this restriction and allow the soliton to "move".

In fact, in this paper we prove a result more general than asymptotic stability of trapped solitons. Namely, we show that if the initial conditions are of the form $\psi_0 = e^{i\gamma_0}(\phi^{\lambda_0} + \chi_0)$, with $\chi_0$ being small in the space $(1 + x^2)^{-1}H_1$, $\gamma_0 \in \mathbb{R}$ and $\lambda_0 \in I$ ($I$ is the same as above), then the solution, $\psi(t)$, of Eq. (1.1) can be written as
$$\psi(t) = e^{i\gamma(t)}(\phi^{\lambda(t)} + \chi(t)), \tag{1.3}$$
where $\gamma(t) \in \mathbb{R}$, $\chi(t) \to 0$ in the $(1 + x^2)^2 H_1$ norm, and $\lambda(t) \to \lambda_\infty$ for some $\lambda_\infty$, as $t \to \infty$.

We observe that (1.1) is a Hamiltonian system with conserved energy (see Sec. 2) and, though orbital (Lyapunov) stability is expected, the asymptotic stability is a subtle matter. To have asymptotic stability, the system should be able to dispose of the excess of its energy, in our case, by radiating it to infinity. The infinite dimensionality of the Hamiltonian system in question plays a crucial role here. This phenomenon as well as a general class of classical and quantum relaxation problems was pointed out by Fröhlich and Spencer.

The first attack on the asymptotic stability in infinite dimensional Hamiltonian systems was made in the pioneering work of Soffer and Weinstein [43] where the
November 18, 2005 10:54 WSPC/148-RMP
J070-00252
Asymptotic Stability of Nonlinear Schr¨ odinger Equations with Potential
1145
asymptotic stability of the nonlinear ground states was proved for the nonlinear Schrödinger equation with a potential and a weak nonlinearity in dimensions greater than or equal to 3. Asymptotic stability of moving solitons in the (generalized) nonlinear Schrödinger equation without potential in dimension 1 was first proven by Buslaev and Perelman [4]. The above results were significantly extended by Soffer and Weinstein, Buslaev and Perelman, Tsai and Yau, Buslaev and Sulem, and Cuccagna (see [5, 6, 11–13, 44–49]). Related results in multi-soliton dynamics were obtained by Perelman, and Rodnianski, Schlag and Soffer (see [33, 38, 39]). Deift and Zhou (see [15]) used a different approach, the inverse scattering method, to describe asymptotic dynamics of solitons of the 1-dimensional nonlinear Schrödinger equations. Among earlier works, we should mention Shatah and Strauss, Weinstein, Grillakis, Shatah and Strauss on orbital stability (see [41, 50, 51, 22, 23]). Results of these authors were extended by Comech and Pelinovsky, Comech, Cuccagna, Pelinovsky and Vougalter, Cuccagna and Pelinovsky, and Schlag (see [8–11, 14, 40]). There is an extensive physics literature on the subject; some of the references can be found in Grimshaw and Pelinovsky [21]. Long-term dynamics of solitons in external potentials was determined by Bronski and Jerrard, Fröhlich, Tsai and Yau, Keraani, and Fröhlich, Gustafson, Jonsson and Sigal (see [2, 17, 28, 16]).

We would like to point out the differences between our results and the results of the Soffer–Weinstein and Buslaev–Perelman type mentioned above. Like our work, [43–49] study the ground state of the NLS with a potential. However, these papers deal with the near-linear regime in which the nonlinear ground state arises from the ground state for the corresponding Schrödinger operator $-\Delta + V(x)$.
The present paper covers the highly nonlinear regime in which the ground state is produced by the nonlinearity (our analysis simplifies considerably in the near-linear case). The papers [4–6], on the other hand, consider the NLS without a potential, so the corresponding solitons, which were described above, are affected only by a perturbation of the initial conditions which disperses with time, leaving them free. In our case, these solutions are, in addition, under the influence of the potential and they relax to an equilibrium state near a local minimum of the potential. We also mention that because of the slow time-decay of the linearized propagator, the lower dimensions $d = 1, 2$ are harder to handle than the higher dimensions $d > 2$.

Our approach is built on the beautiful theory of 1-dimensional Schrödinger operators developed by Buslaev and Perelman, and Buslaev and Sulem (see [4–6]). One of the key points in this approach is obtaining suitable (and somewhat surprising) estimates on the propagator for the linearization of Eq. (1.1) around the soliton family $e^{i\gamma}\phi^{\lambda}$. One of the difficulties here lies in the fact that the corresponding generator, $L(\lambda)$, is not self-adjoint. To obtain the desired estimates, one develops the spectral representation for the propagator in terms of the boundary values of the resolvent (this can also be extended to other functions of the generator (see Sec. 5.2))
Z. Gang & I. M. Sigal
and then estimates the integral kernel of the resolvent using estimates on various solutions of the corresponding spectral problem $(L(\lambda) - \sigma)\xi = 0$ (see Appendix A). These estimates are close to the corresponding estimates of [4–6]. Since these estimates are somewhat involved, we take pains to provide a detailed and readable account. Note that, independently, Schlag [40] has developed a spectral representation similar to ours (see Sec. 5.2), and Goldberg and Schlag [20] obtained (by a different technique) estimates on the propagators of the 1-dimensional, self-adjoint (scalar) Schrödinger operators similar to some of our estimates (see Sec. 5.1) but under more general assumptions on the potential than in our (non-self-adjoint vector) case.

The paper is organized as follows: in Sec. 2, we describe the Hamiltonian structure of Eq. (1.1), cite a well-posedness result, formulate our conditions on the nonlinearity and the potential, and present our main result. In Sec. 3, we describe the spectral structure of the linearized equation around the trapped soliton. In Sec. 4, we decompose the solution into a part moving in the "soliton manifold" and a symplectically orthogonal fluctuation and find the equations for the soliton parameters and for the fluctuation. In the same section, we estimate the soliton parameters and the fluctuation assuming certain estimates on the linearized propagators (i.e. the solutions of the linearized equation). The latter estimates are proven in Sec. 5, modulo estimates on the generalized eigenfunctions which are obtained in Appendix A. In Appendix B, we analyze the implicit conditions on the nonlinearity and the potential made in Sec. 3.

As customary, we often denote derivatives by subindices, as in $\phi^{\lambda}_{\lambda} = \frac{d}{d\lambda}\phi^{\lambda}$ and $\phi^{\lambda}_{x} = \frac{d}{dx}\phi^{\lambda}$ for $\phi^{\lambda} = \phi^{\lambda}(x)$. The Sobolev and $L^2$ spaces are denoted by $H^1$ and $L^2$ respectively.

2. Properties of (1.1), Assumptions and Results

In this section, we discuss some general properties of Eq. (1.1) and formulate our results.

2.1. Hamiltonian structure and global well-posedness of (1.1)

Equation (1.1) is a Hamiltonian system on the Sobolev space $H^1(\mathbb{R}, \mathbb{C})$ viewed as the real space $H^1(\mathbb{R}, \mathbb{R}) \oplus H^1(\mathbb{R}, \mathbb{R})$ with the inner product $(\psi, \phi) = \mathrm{Re}\int_{\mathbb{R}} \bar\psi\phi$ and with the symplectic form $\omega(\psi, \phi) = \mathrm{Im}\int_{\mathbb{R}} \bar\psi\phi$. The Hamiltonian functional is
\[ H(\psi) := \int \Big[ \tfrac{1}{2}\big(|\psi_x|^2 + V_h|\psi|^2\big) - F(|\psi|^2) \Big], \]
where $F(u) := \frac{1}{2}\int_0^u f(\xi)\,d\xi$. Equation (1.1) has the time-translational and gauge symmetries, which imply the following conservation laws: for any $t \ge 0$, we have
(CE) conservation of energy: $H(\psi(t)) = H(\psi(0))$;
(CP) conservation of the number of particles: $N(\psi(t)) = N(\psi(0))$, where $N(\psi) := \int |\psi|^2$.
We need the following condition on the nonlinearity $f$ for the global well-posedness of (1.1).
(fA) The nonlinearity $f$ is locally Lipschitz and $f(\xi) \le c(1 + |\xi|^q)$ for some $c > 0$ and $q < 2$.
The following theorem is proved in [32, 7].
Theorem 2.1. Assume that the nonlinearity $f$ satisfies the condition (fA), and that the potential $V$ is bounded. Then Eq. (1.1) is globally well posed in $H^1$, i.e. the Cauchy problem for Eq. (1.1) with initial datum $\psi(0) \in H^1$ has a unique solution $\psi(t)$ in the space $H^1$ and this solution depends continuously on $\psi(0)$. Moreover, $\psi(t)$ satisfies the conservation laws (CE) and (CP). If $\psi(0)$ has a finite norm $\|(1+|x|)\psi(0)\|_2$, then we have the following estimate:
\[ \|(1+|x|)\psi(t)\|_2 \le e\big(\|\psi(0)\|_{H^1}\big)\big[\|(1+|x|)\psi(0)\|_2 + t\,\|\psi(0)\|_{H^1}\big], \tag{2.1} \]
where $e : \mathbb{R}^+ \to \mathbb{R}^+$ is a smooth function.

2.2. Existence and stability of solitons

In this section, we discuss the problem of existence and stability of solitons. It is proved in [3, 4] that if the nonlinearity $f$ in Eq. (1.1) is smooth, real and satisfies the following condition:
(fB) There is an interval $I_0 \subset \mathbb{R}^+$ such that for any $\lambda \in I_0$,
\[ U(\phi, \lambda) := -\lambda\phi^2 + \int_0^{\phi^2} f(\xi)\,d\xi \]
has a positive root and the smallest positive root $\phi_0(\lambda)$ satisfies $U_\phi(\phi_0(\lambda), \lambda) > 0$,
then for any $\lambda \in I_0$, there exists a unique solution of Eq. (1.2) of the form $e^{i(\lambda-\mu)t}\phi_0^{\lambda}$ with $\phi_0^{\lambda} \in H^2$ and $\phi_0^{\lambda} > 0$. As mentioned in Sec. 1, such solutions are called solitary waves or solitons or, to emphasize that $\phi_0^{\lambda} > 0$, ground states. For brevity, we use the term soliton and we also apply it to the function $\phi_0^{\lambda}$. Note that the function $\phi_0^{\lambda}$ satisfies the equation
\[ -(\phi_0^{\lambda})_{xx} + \lambda\phi_0^{\lambda} - f\big((\phi_0^{\lambda})^2\big)\phi_0^{\lambda} = 0. \tag{2.2} \]
Remark 2.2. If at the origin $f(\xi) = c\,\xi^p + o(\xi^p)$ with $c, p > 0$, then condition (fB) is satisfied for $\lambda \in (0, \delta)$ with $\delta$ sufficiently small.
When the potential $V$ is present, some of the solitons above bifurcate into solitons for Eq. (1.1). Namely, assume $f$ satisfies the following condition,
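(fC) $f$ is smooth, $f(0) = 0$ and $|f(\xi)| \le c(1 + |\xi|^p)$ for some $p < \infty$,
and if $V$ satisfies the condition,
(VA) $V$ is smooth and 0 is a nondegenerate local minimum of $V$.
Then, similarly as in [26, 1, 18, 30], one can show that Eq. (2.2) has a family of solitons $\phi_0^{\lambda}$, $\lambda \in I_0$, and if $h$ is sufficiently small, then for any $\lambda \in I_{0V} := \{\lambda \mid \lambda > -\inf_{x\in\mathbb{R}} V(x)\} \cap \{\lambda \mid \lambda + V(0) \in I_0\}$ there exists a soliton $\phi^{\lambda}$ ($\phi^{\lambda} \in H^2$ and $\phi^{\lambda} > 0$) satisfying the equation
\[ -\frac{d^2}{dx^2}\phi^{\lambda} + (\lambda + V_h)\phi^{\lambda} - f\big((\phi^{\lambda})^2\big)\phi^{\lambda} = 0 \]
and having the form $\phi^{\lambda} \equiv \phi_h^{\lambda} = \phi_0^{\lambda+V(0)} + O(h^{3/2})$.
Under more restrictive conditions on the nonlinearity $f$, one can show as in [22, 16, 51] that the soliton $\phi^{\lambda}$ is a minimizer of the energy functional $H(\psi)$ for a fixed number of particles, $N(\psi) = \text{constant}$, if and only if
\[ \frac{d}{d\lambda}\|\phi^{\lambda}\|_2^2 > 0. \tag{2.3} \]
The latter condition is also equivalent to the orbital stability of $\phi^{\lambda}$. In what follows, we set
\[ I = \Big\{ \lambda \in I_{0V} : \frac{\partial}{\partial\lambda}\|\phi^{\lambda}\|_2^2 > 0 \Big\}. \tag{2.4} \]
Observe that there exist some constants $c, \delta > 0$ such that
\[ |\phi^{\lambda}(x)| \le c\,e^{-\delta|x|} \quad\text{and}\quad \Big|\frac{d}{d\lambda}\phi^{\lambda}\Big| \le c\,e^{-\delta|x|}, \tag{2.5} \]
and similarly for the derivatives of $\phi^{\lambda}$ and $\frac{d}{d\lambda}\phi^{\lambda}$. The first estimate can be found in [22] and the second estimate follows from the fact that the function $\frac{d}{d\lambda}\phi^{\lambda}$ satisfies the equation
\[ \Big[ -\frac{d^2}{dx^2} + V_h + \lambda - f\big((\phi^{\lambda})^2\big) - 2f'\big((\phi^{\lambda})^2\big)(\phi^{\lambda})^2 \Big]\frac{d}{d\lambda}\phi^{\lambda} = -\phi^{\lambda} \]
and standard arguments.
For our main result, we will also require the following condition on the potential $V$:
(VB) $|V(x)| \le c\,e^{-\alpha|x|}$ for some $c, \alpha > 0$.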
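2.3. Linearized operator and spectral conditions

In our analysis, we use some implicit spectral conditions on the Fréchet derivative $\partial G(\phi^{\lambda})$ of the map
\[ G(\psi) = -i\Big(-\frac{d^2}{dx^2} + \lambda + V_h\Big)\psi + i f(|\psi|^2)\psi \tag{2.6} \]
appearing on the right-hand side of Eq. (1.1). We compute
\[ \partial G(\phi^{\lambda})\chi = -i\Big(-\frac{d^2}{dx^2} + \lambda + V_h\Big)\chi + i f\big((\phi^{\lambda})^2\big)\chi + 2i f'\big((\phi^{\lambda})^2\big)(\phi^{\lambda})^2\,\mathrm{Re}\,\chi. \tag{2.7} \]
This is a real linear but not complex linear operator. To convert it to a linear operator, we pass from complex functions to real vector functions:
\[ \chi \leftrightarrow \vec{\chi} = \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}, \]
where $\chi_1 = \mathrm{Re}\,\chi$ and $\chi_2 = \mathrm{Im}\,\chi$. Then $\partial G(\phi^{\lambda})\chi \leftrightarrow L(\lambda)\vec{\chi}$, where
\[ L(\lambda) := \begin{pmatrix} 0 & L_-(\lambda) \\ -L_+(\lambda) & 0 \end{pmatrix}, \tag{2.8} \]
with
\[ L_-(\lambda) := -\frac{d^2}{dx^2} + V_h + \lambda - f\big((\phi^{\lambda})^2\big), \tag{2.9} \]
and
\[ L_+(\lambda) := -\frac{d^2}{dx^2} + V_h + \lambda - f\big((\phi^{\lambda})^2\big) - 2f'\big((\phi^{\lambda})^2\big)(\phi^{\lambda})^2. \tag{2.10} \]
Then we extend the operator $L(\lambda)$ to the complex space $H^2(\mathbb{R}, \mathbb{C}) \oplus H^2(\mathbb{R}, \mathbb{C})$. By a general result (see, e.g., [37]),
\[ \sigma_{ess}(L(\lambda)) = (-i\infty, -i\lambda] \cup [i\lambda, i\infty) \]
if the potential $V_h$ in Eq. (1.1) decays at $\infty$. Furthermore, since the operator is real and is of the form $L(\lambda) = JH(\lambda)$, where $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and $H^*(\lambda) = H(\lambda)$, the spectrum of $L(\lambda)$ is symmetric with respect to the real and imaginary axes.
We show in the next section that the operator $L(\lambda)$ has at least four usual and associated eigenvectors: the zero eigenvector $\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}$ and the associated zero eigenvector $\begin{pmatrix} \frac{d}{d\lambda}\phi^{\lambda} \\ 0 \end{pmatrix}$ related to the gauge symmetry $\psi(x, t) \to e^{i\alpha}\psi(x, t)$ of the original equation, and two eigenvectors with $O(h)$ eigenvalues originating from the zero eigenvector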
$\begin{pmatrix} \partial_x\phi_0^{\lambda} \\ 0 \end{pmatrix}$ of the $V = 0$ equation due to the translational symmetry of that equation and the associated zero eigenvector $\begin{pmatrix} 0 \\ x\phi_0^{\lambda} \end{pmatrix}$ related to the boost transformation $\psi(x, t) \to e^{ibx}\psi(x, t)$ coming from the Galilean symmetry of the $V = 0$ equation.
Besides eigenvalues, the operator $L(\lambda)$ may have resonances at the tips, $\pm i\lambda$, of its essential spectrum (those tips are called thresholds). The definition of the resonance is as follows:
Definition 2.3. A function $h \ne 0$ is called a resonance of $L(\lambda)$ at $i\lambda$ if and only if $h$ is $C^2$, is bounded and satisfies the equation $(L(\lambda) - i\lambda)h = 0$. Similarly we define a resonance at $-i\lambda$.
In what follows, we make the following spectral assumptions:
(SA) The dimension of the generalized eigenspace for isolated eigenvalues is 4,
(SB) $L(\lambda)$ has no embedded eigenvalues,
(SC) $L(\lambda)$ has no resonances at $\pm i\lambda$.
Condition (SA) is satisfied for a large class of nonlinearities (e.g., for perturbations of $f(u) = u$, see Remark 2.6 below), but it is not generic. For some open set of nonlinearities, the operator $L(\lambda)$ might have other purely imaginary, isolated eigenvalues besides those mentioned above. We expect that with some work, our technique can be extended to this case (cf. [4, 6, 43, 49]).
Conjecture 2.4. Conditions (SB) and (SC) are satisfied for generic nonlinearities $f$ and potentials $V$ provided that $V$ decays exponentially fast at $\infty$.
There are standard techniques for proving the absence of embedded eigenvalues for self-adjoint Schrödinger operators with generic potentials (see [25, Sec. VIIIA]). In [10], these results and techniques are extended to operators of the form (2.8) and we expect that Conjecture 2.4 for (SB) can be derived from these or related results.
The following results support the (SC) part of the conjecture. Introduce the family of operators $L_{general}(U) := L_0 + U$, where
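\[ L_0 := \begin{pmatrix} 0 & -\frac{d^2}{dx^2} + \beta \\ \frac{d^2}{dx^2} - \beta & 0 \end{pmatrix} \quad\text{and}\quad U := \begin{pmatrix} 0 & V_1 \\ -V_2 & 0 \end{pmatrix}, \]
parametrized by $\beta > 0$, $s \in \mathbb{C}$ and the functions $V_1(x)$ and $V_2(x)$ satisfying
\[ |V_1(x)|, |V_2(x)| \le c\,e^{-\alpha|x|} \tag{2.11} \]
for some constants $c, \alpha > 0$. Then, we have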
Proposition 2.5. (A) If (SB) and (SC) are satisfied for a given $U^0$, then (SB) and (SC) are satisfied for any $U$ such that $\|e^{\alpha|x|}(U - U^0)\|_{L^\infty}$ is sufficiently small, where $\alpha$ is the same as in Eq. (2.11).
(B) If for some $U^0$, the operator $L_{general}(U^0)$ has a resonance at $i\beta$ (or at $-i\beta$) and if
\[ \int_{-\infty}^{\infty} \big(V_1^0(x) + V_2^0(x)\big)\,dx \ne 0, \]
then there exists a small neighborhood $A \subset \mathbb{C}$ of 1 such that $L_{general}(sU^0)$ has no resonance at $i\beta$ for $s \in A \setminus \{1\}$.
Remark 2.6.
1. For $U = 0$, the operator $L_{general}(U) = L_0$ has resonances at $\pm i\beta$. Hence statement (B) shows that the operators $L_{general}(sU)$ with $s \ne 0$ and sufficiently small have no resonance at $\pm i\beta$.
2. It is proved in [29] that if $f(u) = u$ and $V(x) = 0$ in Eq. (1.1), then conditions (SA) and (SB) hold, but condition (SC) fails. Proposition 2.5(B) implies that Eq. (1.1) with $f(u) = u$ and $V(x) = sV^0(x)$, for a large class of $V^0(x)$ and for $s \ne 0$ sufficiently small, satisfies (SB) and (SC). It can be proved that for a large subclass of potentials $V^0(x)$, condition (SA) remains satisfied.
3. It can be shown that if in Eq. (1.1) $f(u) = u^2$ and $V(x) = 0$, then the operator $L(\lambda)$ satisfies the conditions (SB) and (SC). However, this equation fails condition (2.3) (it is a critical NLS) and (SA) (its generalized zero eigenvector space is of dimension 6). It is easy to stabilize this equation by changing the nonlinearity slightly, say, taking $f(u) = u^{2-\epsilon}$ or $f(u) = u^2 - \epsilon u^4$. The resulting equations satisfy (2.3), (SB) and (SC) but not (SA). Specifically, if the nonlinearity is $f(u) = u^{2-\epsilon}$ and the potential $V = 0$, then Eq. (1.1) has a standing wave solution $\psi(x, t) = e^{it}\phi(x)$ with
\[ \phi(x) = (12 - 4\epsilon)^{\frac{1}{4-2\epsilon}}\big[e^{(2-\epsilon)x} + e^{-(2-\epsilon)x}\big]^{-\frac{1}{2-\epsilon}}. \]
Then by Proposition 2.5, statement (A), and the explicit form of the soliton $\phi$, the corresponding linearized operator,
\[ \begin{pmatrix} 0 & -\frac{d^2}{dx^2} + 1 - \phi^{4-2\epsilon} \\ \frac{d^2}{dx^2} - 1 + (5 - 2\epsilon)\phi^{4-2\epsilon} & 0 \end{pmatrix}, \]
has no resonances at $\pm i$ and no embedded eigenvalues provided that $\epsilon > 0$ is sufficiently small.
In view of the above, the natural next step is to extend our result by removing condition (SA).
2.4. Main theorem

Before stating our main theorem, we formulate another condition on the nonlinearities:
(fD) $f'(0) = f''(0) = f'''(0) = 0$.
Note that if $f$ is a polynomial of minimal degree $\ge 4$ with a negative leading coefficient, then the conditions (fA)–(fD) are satisfied.
Theorem 2.7. Assume conditions (VA), (VB), (fA)–(fD) and (SA)–(SC). Assume the external potential $V$ is even, and $\lambda \in I$ with $I$ defined in Eq. (2.4). There exists a constant $\delta > 0$ such that if $\psi(0)$ is even and satisfies
\[ \inf_{\gamma\in\mathbb{R}} \big\|(1+|x|)^2\big(e^{i\gamma}\psi(0) - \phi^{\lambda}\big)\big\|_{H^1} \le \delta, \]
then there exist a constant $\lambda_\infty \in I$ and a differentiable function $\gamma(t)$ such that
\[ \big\|(1+|x|)^{-\nu}\big(\psi(t) - e^{i\gamma(t)}\phi^{\lambda_\infty}\big)\big\|_2 \to 0 \]
as $t \to \infty$, where $\nu > 3.5$. In particular, the trapped soliton is asymptotically stable.

3. Properties of Operator L(λ)

In this section, we find important standard and associated eigenvectors of the operator $L(\lambda)$. Here, we do not assume that the potential $V$ is even. The main theorem of this section is:
Theorem 3.1. If $V$ satisfies conditions (VA) and (VB) and if $\lambda \in I$, then $L(\lambda)$ has 3 linearly independent eigenvectors and one associated eigenvector with small eigenvalues: one eigenvector $\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}$ and one associated eigenvector $\begin{pmatrix} \frac{d}{d\lambda}\phi^{\lambda} \\ 0 \end{pmatrix}$ corresponding to the eigenvalue 0, both of which are even if $V$ is even; and 2 linearly independent eigenvectors with $O(h)$ non-zero imaginary eigenvalues, which are odd if $V$ is even.
Proof. The proof is based on the following facts: the operator $L(\lambda)$ has the zero eigenvector $\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}$:
\[ L(\lambda)\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix} = 0, \]
related to the gauge symmetry of the map $G(\psi)$ (see Eq. (2.6)), and the associated zero eigenvector $\begin{pmatrix} \frac{d}{d\lambda}\phi^{\lambda} \\ 0 \end{pmatrix}$:
\[ L(\lambda)\begin{pmatrix} \frac{d}{d\lambda}\phi^{\lambda} \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}. \]
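Both identities can be checked directly from the definitions (2.8)–(2.10). The following is a sketch of that computation; it uses only the soliton equation for $\phi^{\lambda}$ and its derivative in $\lambda$, with $L_+(\lambda)$ understood to contain the term $2f'((\phi^{\lambda})^2)(\phi^{\lambda})^2$.

```latex
% Zero eigenvector: the soliton equation is exactly L_-(\lambda)\phi^\lambda = 0.
\begin{align*}
L_-(\lambda)\phi^{\lambda}
  &= -\phi^{\lambda}_{xx} + (V_h + \lambda)\phi^{\lambda} - f\big((\phi^{\lambda})^2\big)\phi^{\lambda} = 0,\\
L(\lambda)\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}
  &= \begin{pmatrix} L_-(\lambda)\phi^{\lambda} \\ 0 \end{pmatrix} = 0.
\end{align*}
% Associated eigenvector: differentiate the soliton equation with respect to \lambda,
% using d/d\lambda [ f(\phi^2)\phi ] = ( f(\phi^2) + 2 f'(\phi^2)\phi^2 ) \phi_\lambda.
\begin{align*}
0 &= \frac{d}{d\lambda}\Big[ -\phi^{\lambda}_{xx} + (V_h + \lambda)\phi^{\lambda}
      - f\big((\phi^{\lambda})^2\big)\phi^{\lambda} \Big]
   = L_+(\lambda)\phi^{\lambda}_{\lambda} + \phi^{\lambda},\\
L(\lambda)\begin{pmatrix} \phi^{\lambda}_{\lambda} \\ 0 \end{pmatrix}
  &= \begin{pmatrix} 0 \\ -L_+(\lambda)\phi^{\lambda}_{\lambda} \end{pmatrix}
   = \begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}.
\end{align*}
```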
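Moreover, for $h = 0$, the operator $L^{h=0}(\lambda) := L(\lambda)|_{h=0}$ has the zero eigenvector $\begin{pmatrix} \partial_x\phi_0^{\lambda+V(0)} \\ 0 \end{pmatrix}$:
\[ L^{h=0}(\lambda)\begin{pmatrix} \partial_x\phi_0^{\lambda+V(0)} \\ 0 \end{pmatrix} = 0, \]
coming from the translational symmetry of the map $G(\psi)$, and the associated zero eigenvector $\begin{pmatrix} 0 \\ x\phi_0^{\lambda+V(0)} \end{pmatrix}$:
\[ L^{h=0}(\lambda)\begin{pmatrix} 0 \\ x\phi_0^{\lambda+V(0)} \end{pmatrix} = \begin{pmatrix} -2\partial_x\phi_0^{\lambda+V(0)} \\ 0 \end{pmatrix}, \]
coming from the boost transformation. The first two properties above yield the first part of the theorem. The last two properties and elementary perturbation theory yield the second part of the theorem.
To prove the second part of the theorem, we first observe that since the operator $L(\lambda)$ is of the form $L(\lambda) = JH(\lambda)$, where $J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is the anti-self-adjoint matrix and $H(\lambda) = \begin{pmatrix} L_+(\lambda) & 0 \\ 0 & L_-(\lambda) \end{pmatrix}$ is a real self-adjoint operator, the spectrum of $L(\lambda)$ is symmetric with respect to the real and imaginary axes. Hence the eigenvectors $\begin{pmatrix} \partial_x\phi_0^{\lambda+V(0)} \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ x\phi_0^{\lambda+V(0)} \end{pmatrix}$ for $h = 0$ give rise to either two pure imaginary or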
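two real eigenvalues. We claim that for $V''(0) > 0$ the former case takes place, and for $V''(0) < 0$ the latter one. To prove this, we use the Feshbach projection method (see [24]) with the projections $\bar{P} := I - P$ and
\[ P = \frac{2}{\frac{d}{d\lambda}\|\phi^{\lambda}\|^2}\left( \left|\begin{matrix} 0 \\ \phi^{\lambda} \end{matrix}\right\rangle\left\langle\begin{matrix} 0 \\ \frac{d}{d\lambda}\phi^{\lambda} \end{matrix}\right| + \left|\begin{matrix} \frac{d}{d\lambda}\phi^{\lambda} \\ 0 \end{matrix}\right\rangle\left\langle\begin{matrix} \phi^{\lambda} \\ 0 \end{matrix}\right| \right) + \frac{1}{i\|\phi^{\lambda}\|^2}\left( \left|\begin{matrix} \partial_x\phi^{\lambda} \\ ix\phi^{\lambda} \end{matrix}\right\rangle\left\langle\begin{matrix} -ix\phi^{\lambda} \\ \partial_x\phi^{\lambda} \end{matrix}\right| - \left|\begin{matrix} \partial_x\phi^{\lambda} \\ -ix\phi^{\lambda} \end{matrix}\right\rangle\left\langle\begin{matrix} ix\phi^{\lambda} \\ \partial_x\phi^{\lambda} \end{matrix}\right| \right). \]
Then the eigenvalue equation $L(\lambda)\psi = \mu\psi$ is equivalent to the nonlinear eigenvalue problem
\[ (PL(\lambda)P - W(\mu))\phi = \mu\phi, \tag{3.1} \]
where $\phi \in \mathrm{Ran}\,P$ and $W := PL(\lambda)\bar{P}(\bar{P}L(\lambda)\bar{P} - \mu)^{-1}\bar{P}L(\lambda)P$. Note that by Proposition C.2 and similarly to Proposition 3.3, $P - P_0 = O(h)$, where $P_0$ is the Riesz projection for $L^{h=0}(\lambda)$ associated to the eigenvalue 0:
\[ P_0 = \frac{1}{2\pi i}\oint_{|z|=\epsilon} (L^{h=0}(\lambda) - z)^{-1}\,dz \]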
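for $\epsilon$ sufficiently small. Since the operator $\bar{P}_0L^{h=0}(\lambda)\bar{P}_0$ restricted to $\mathrm{Ran}\,\bar{P}_0$ has no spectrum in an $O(1)$-neighborhood of 0, neither does the operator $\bar{P}L(\lambda)\bar{P}$ restricted to $\mathrm{Ran}\,\bar{P}$. Hence there exist constants $\delta_1, \delta_2 > 0$ such that if $h$ is sufficiently small and if $|\mu| \le \delta_1$, then for $n = 0, 1, 2$,
\[ \big\|\partial_\mu^n(\bar{P}L(\lambda)\bar{P} - \mu)^{-1}\big\|_{L^2\cap\mathrm{Range}\bar{P} \to L^2\cap\mathrm{Range}\bar{P}} \le \delta_2. \tag{3.2} \]
We claim that $W(\mu) = O(h^3)$. Indeed, similarly as in [30] we get that $L(\lambda) = L^{h=0}(\lambda) + O(h^{3/2})$ on any compact domain and $P = P^{h=0} + O(h^{3/2})$. Therefore
\[ PL(\lambda)\bar{P} = P^{h=0}L^{h=0}(\lambda)\bar{P}^{h=0} + O(h^{3/2}) = O(h^{3/2}) \]
and similarly $\bar{P}L(\lambda)P = O(h^{3/2})$. Since $|\mu| \le \delta_1$, we use estimate (3.2) to prove $\partial_\mu^n W = O(h^3)$, $n = 0, 1, 2$.
Now we analyze the term $PL(\lambda)P$. We observe that
\[ PL(\lambda)P\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix} = 0, \qquad PL(\lambda)P\begin{pmatrix} \partial_\lambda\phi^{\lambda} \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}, \]
\[ PL(\lambda)P\begin{pmatrix} 0 \\ x\phi^{\lambda} \end{pmatrix} = \begin{pmatrix} -2\partial_x\phi^{\lambda} \\ 0 \end{pmatrix}, \]
\[ PL(\lambda)P\begin{pmatrix} \partial_x\phi^{\lambda} \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ h^2V''(0)\,x\phi^{\lambda} \end{pmatrix} + O(h^3). \]
Hence the operator $PL(\lambda)P + W$ restricted to the 4-dimensional space $\mathrm{Ran}\,P$ has, in the basis $\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}, \begin{pmatrix} \partial_\lambda\phi^{\lambda} \\ 0 \end{pmatrix}, \begin{pmatrix} \partial_x\phi^{\lambda} \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ x\phi^{\lambda} \end{pmatrix}$, the $4 \times 4$ matrix
\[ \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -2 \\ 0 & 0 & h^2V''(0) & 0 \end{pmatrix} + O(h^3). \]
Since the operator $W(\mu)$ depends smoothly on $\mu$, we have by the implicit function theorem four values of $\mu$: $0 + O(h^{3/2})$, $0 + O(h^{3/2})$, $\pm\sqrt{-2h^2V''(0)} + O(h^{3/2})$, at which Eq. (3.1) has a nontrivial solution. By the above, the operator $L(\lambda)$ has four eigenvectors with the eigenvalues given by these numbers. Since we already know that $L(\lambda)$ has an eigenvalue 0 with multiplicity 2, the other two eigenvalues are $\pm\sqrt{-2h^2V''(0)} + O(h^{3/2})$. Since the spectrum of the operator $L(\lambda)$ is symmetric with respect to the real axis, the latter two eigenvalues are purely imaginary.
Corollary 3.2. There exist a real function $\xi_1$ and an imaginary function $\eta_1$ such that $\begin{pmatrix} \xi_1 \\ \pm\eta_1 \end{pmatrix}$ are the eigenvectors of $L(\lambda)$ with small, non-zero and imaginary eigenvalues $\pm i\epsilon_1$: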
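\[ L(\lambda)\begin{pmatrix} \xi_1 \\ \pm\eta_1 \end{pmatrix} = \pm i\epsilon_1\begin{pmatrix} \xi_1 \\ \pm\eta_1 \end{pmatrix}. \tag{3.3} \]
For the non-self-adjoint operator $L(\lambda)$, we define the (Riesz) projection onto the pure point spectrum subspace of $L(\lambda)$ as
\[ P_d^{\lambda} := \frac{1}{2i\pi}\oint_\Gamma (L(\lambda) - z)^{-1}\,dz, \]
where the curve $\Gamma$ is a small circle around 0 of radius $< \lambda$ which contains the eigenvalues $\pm i\epsilon_1$ (see Corollary 3.2).
Proposition 3.3. Assume condition (SA). Then in the Dirac notation
\[ P_d^{\lambda} = \frac{2}{\frac{d}{d\lambda}\|\phi^{\lambda}\|^2}\left( \left|\begin{matrix} 0 \\ \phi^{\lambda} \end{matrix}\right\rangle\left\langle\begin{matrix} 0 \\ \frac{d}{d\lambda}\phi^{\lambda} \end{matrix}\right| + \left|\begin{matrix} \frac{d}{d\lambda}\phi^{\lambda} \\ 0 \end{matrix}\right\rangle\left\langle\begin{matrix} \phi^{\lambda} \\ 0 \end{matrix}\right| \right) + \frac{1}{2\langle\xi_1, \eta_1\rangle}\left( \left|\begin{matrix} \xi_1 \\ \eta_1 \end{matrix}\right\rangle\left\langle\begin{matrix} -\eta_1 \\ \xi_1 \end{matrix}\right| - \left|\begin{matrix} \xi_1 \\ -\eta_1 \end{matrix}\right\rangle\left\langle\begin{matrix} \eta_1 \\ \xi_1 \end{matrix}\right| \right). \]
The proof of this proposition is straightforward but tedious, and is given in Appendix C.
Definition 3.4. We define the essential spectrum subspace of $L(\lambda)$ as $\mathrm{Range}(1 - P_d^{\lambda})$, where $P_d^{\lambda}$ is defined before Proposition 3.3. We set
\[ P_{ess}^{\lambda} := 1 - P_d^{\lambda}. \tag{3.4} \]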
4. Reparametrization of ψ(t)

In this section, we introduce a convenient decomposition of the solution $\psi(t)$ to Eq. (1.1) into a solitonic component and a symplectically orthogonal fluctuation.

4.1. Decomposition of ψ(t)

In this subsection, we decompose $\psi(t)$ into a solitonic component and a symplectically orthogonal fluctuation and derive equations for each component.
Theorem 4.1. Assume $V$ and $\psi(0)$ are even. There exists a constant $\delta > 0$ such that if the datum $\psi(0)$ satisfies $\inf_{\gamma\in\mathbb{R}}\|\psi(0) - e^{i\gamma}\phi^{\lambda}\|_{H^1} < \delta$, then there exist differentiable functions $\lambda, \gamma : \mathbb{R}^+ \to \mathbb{R}$ such that
\[ \psi(t) = e^{i\int_0^t\lambda(s)\,ds + i\gamma(t)}\big(\phi^{\lambda(t)} + R\big), \tag{4.1} \]
where $R$ is in the essential spectrum subspace, i.e.
\[ \mathrm{Im}\langle R, i\phi^{\lambda}\rangle = \mathrm{Im}\langle R, \phi^{\lambda}_{\lambda}\rangle = 0. \tag{4.2} \]
Proof. By the Lyapunov stability (see [22, 31]), for every $\epsilon > 0$ there exists a constant $\delta$ such that if $\inf_{\gamma\in\mathbb{R}}\|\psi(0) - e^{i\gamma}\phi^{\lambda}\|_{H^1} < \delta$, then for all $t > 0$, $\inf_{\gamma}\|\psi(t) - e^{i\gamma}\phi^{\lambda}\|_{H^1} < \epsilon$.
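The decomposition (4.1) and (4.2) follows from the Splitting Theorem in [16] and the fact that $\psi(t)$ is even while all the eigenvectors, besides the even eigenvectors $\begin{pmatrix} 0 \\ \phi^{\lambda} \end{pmatrix}$ and $\begin{pmatrix} \phi^{\lambda}_{\lambda} \\ 0 \end{pmatrix}$, are odd.
Plug Eq. (4.1) into Eq. (1.1) to obtain:
\[ -\dot\gamma(\phi^{\lambda} + R) + i\big(\dot\lambda\phi^{\lambda}_{\lambda} + R_t\big) = -R_{xx} + \lambda R + V_hR - f(|\phi^{\lambda}|^2)R - f'(|\phi^{\lambda}|^2)(\phi^{\lambda})^2(R + \bar{R}) + N(R), \tag{4.3} \]
where
\[ N(R) = -f(|\phi^{\lambda} + R|^2)(\phi^{\lambda} + R) + f(|\phi^{\lambda}|^2)(\phi^{\lambda} + R) + f'(|\phi^{\lambda}|^2)(\phi^{\lambda})^2(R + \bar{R}). \]
Passing from complex functions $R = R_1 + iR_2$ to real vector-functions $\begin{pmatrix} R_1 \\ R_2 \end{pmatrix}$, we obtain
\[ \frac{d}{dt}\begin{pmatrix} R_1 \\ R_2 \end{pmatrix} = L(\lambda)\begin{pmatrix} R_1 \\ R_2 \end{pmatrix} + \dot\gamma\begin{pmatrix} R_2 \\ -R_1 \end{pmatrix} + \begin{pmatrix} -\dot\lambda\phi^{\lambda}_{\lambda} \\ -\dot\gamma\phi^{\lambda} \end{pmatrix} + \begin{pmatrix} \mathrm{Im}\,N(R) \\ -\mathrm{Re}\,N(R) \end{pmatrix}. \tag{4.4} \]
Differentiating $\mathrm{Im}\langle R, i\phi^{\lambda}\rangle = 0$ (see decomposition (4.2)) with respect to $t$, we get
\[ \mathrm{Im}\langle R_t, i\phi^{\lambda}\rangle + \dot\lambda\,\mathrm{Im}\langle R, i\phi^{\lambda}_{\lambda}\rangle = 0. \tag{4.5} \]
Multiply Eq. (4.3) by $i\phi^{\lambda}$ and use Eq. (4.5) to obtain:
\[ \dot\lambda\langle\phi^{\lambda}_{\lambda}, \phi^{\lambda}\rangle - \dot\lambda\,\mathrm{Re}\langle R, \phi^{\lambda}_{\lambda}\rangle - \dot\gamma\,\mathrm{Im}\langle R, \phi^{\lambda}\rangle = \mathrm{Im}\langle N(R), \phi^{\lambda}\rangle. \]
By similar reasoning, the relation $\mathrm{Im}\langle R, \phi^{\lambda}_{\lambda}\rangle = 0$ implies that
\[ -\dot\gamma\langle\phi^{\lambda}, \phi^{\lambda}_{\lambda}\rangle - \dot\gamma\,\mathrm{Re}\langle R, \phi^{\lambda}_{\lambda}\rangle + \dot\lambda\,\mathrm{Im}\langle R, \phi^{\lambda}_{\lambda\lambda}\rangle = \mathrm{Re}\langle N(R), \phi^{\lambda}_{\lambda}\rangle. \]
Combining the last two equations into a matrix form, we obtain:
Lemma 4.2. The parameters $\lambda$ and $\gamma$ fixed by Eqs. (4.5) and (4.2) satisfy the equations:
\[ \begin{pmatrix} \langle\phi^{\lambda}_{\lambda}, \phi^{\lambda}\rangle - \mathrm{Re}\langle R, \phi^{\lambda}_{\lambda}\rangle & -\mathrm{Im}\langle R, \phi^{\lambda}\rangle \\ -\mathrm{Im}\langle R, \phi^{\lambda}_{\lambda\lambda}\rangle & \langle\phi^{\lambda}_{\lambda}, \phi^{\lambda}\rangle + \mathrm{Re}\langle R, \phi^{\lambda}_{\lambda}\rangle \end{pmatrix}\begin{pmatrix} \dot\lambda \\ \dot\gamma \end{pmatrix} = \begin{pmatrix} \mathrm{Im}\langle N(R), \phi^{\lambda}\rangle \\ -\mathrm{Re}\langle N(R), \phi^{\lambda}_{\lambda}\rangle \end{pmatrix}. \tag{4.6} \]
By our requirement, $\langle\phi^{\lambda}_{\lambda}, \phi^{\lambda}\rangle > \epsilon_0 > 0$ for some constant $\epsilon_0$ (orbital stability of the solitons). We show later that $\|\langle x\rangle^{-4}R\|_2$, and therefore $\langle R, \phi^{\lambda}_{\lambda}\rangle$ and $\langle R, \phi^{\lambda}_{\lambda\lambda}\rangle$, are small. Thus the matrix on the left-hand side is invertible and
\[ \left\| \begin{pmatrix} \langle\phi^{\lambda}_{\lambda}, \phi^{\lambda}\rangle - \mathrm{Re}\langle R, \phi^{\lambda}_{\lambda}\rangle & -\mathrm{Im}\langle R, \phi^{\lambda}\rangle \\ -\mathrm{Im}\langle R, \phi^{\lambda}_{\lambda\lambda}\rangle & \langle\phi^{\lambda}_{\lambda}, \phi^{\lambda}\rangle + \mathrm{Re}\langle R, \phi^{\lambda}_{\lambda}\rangle \end{pmatrix}^{-1} \right\| \le c \tag{4.7} \]
for some $c > 0$ independent of time $t$.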
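4.2. Change of parametrization

In this subsection, we study the key Eq. (4.4). The study is complicated by the fact that the linearized operator $L(\lambda(t))$ depends on time $t$. To circumvent this difficulty, we rearrange Eq. (4.4) as follows. We fix a time $T > 0$ and define the function $g^T$ by
\[ e^{i\int_0^t\lambda(s)\,ds + i\gamma(t)}R =: e^{i\lambda_1t + i\gamma_1}g^T, \tag{4.8} \]
where $\gamma_1 = \gamma(T)$ and $\lambda_1 = \lambda(T)$. Denote
\[ \Delta_1 := -\int_0^t\lambda(s)\,ds - \gamma(t) + \lambda_1t + \gamma_1. \tag{4.9} \]
From Eqs. (4.8) and (4.9), we derive the equation for $g^T$. Let $g^T = g_1^T + ig_2^T$; then Eq. (4.4) implies
\[ \frac{d}{dt}\begin{pmatrix} g_1^T \\ g_2^T \end{pmatrix} = L(\lambda_1)\begin{pmatrix} g_1^T \\ g_2^T \end{pmatrix} + \begin{pmatrix} \mathrm{Im}\,D \\ -\mathrm{Re}\,D \end{pmatrix}, \tag{4.10} \]
where $D = D_1 + D_2 + D_3$,
\[ D_1 = \dot\gamma\phi^{\lambda}e^{-i\Delta_1} - i\dot\lambda\phi^{\lambda}_{\lambda}e^{-i\Delta_1}, \]
\[ D_2 = \big[f(|\phi^{\lambda_1}|^2) + f'(|\phi^{\lambda_1}|^2)(\phi^{\lambda_1})^2 - f(|\phi^{\lambda}|^2) - f'(|\phi^{\lambda}|^2)(\phi^{\lambda})^2\big]g^T + \big[f'(|\phi^{\lambda_1}|^2)(\phi^{\lambda_1})^2 - f'(|\phi^{\lambda}|^2)(\phi^{\lambda})^2\big]\bar{g}^T + f'(|\phi^{\lambda}|^2)(\phi^{\lambda})^2\big[1 - e^{-2i\Delta_1}\big]\bar{g}^T, \]
\[ D_3 = e^{-i\Delta_1}N(R). \]
We decompose $g^T$ along the point spectrum and essential spectrum subspaces of the operator $L(\lambda_1)$. Since $g^T$ is even and $\langle\phi^{\lambda_1}, \phi^{\lambda_1}_{\lambda_1}\rangle > 0$, there are differentiable real functions $k_1^T, k_2^T : [0, T] \to \mathbb{R}$ such that
\[ g^T = ik_1^T\phi^{\lambda_1} + k_2^T\phi^{\lambda_1}_{\lambda_1} + h^T, \tag{4.11} \]
and $h^T$ is in the essential spectrum subspace of $L(\lambda_1)$ (recall $P_{ess}$ from Eq. (3.4)).
Lemma 4.3. The functions $k_1^T$, $k_2^T$ and $h^T = h_1^T + ih_2^T$ satisfy the following equations:
\[ \begin{pmatrix} -\sin(\Delta_1)\langle\phi^{\lambda_1}, \phi^{\lambda}\rangle & \cos(\Delta_1)\langle\phi^{\lambda_1}_{\lambda_1}, \phi^{\lambda}\rangle \\ \cos(\Delta_1)\langle\phi^{\lambda_1}, \phi^{\lambda}_{\lambda}\rangle & \sin(\Delta_1)\langle\phi^{\lambda_1}_{\lambda_1}, \phi^{\lambda}_{\lambda}\rangle \end{pmatrix}\begin{pmatrix} k_1^T \\ k_2^T \end{pmatrix} = -\begin{pmatrix} \mathrm{Re}\langle e^{i\Delta_1}h^T, \phi^{\lambda}\rangle \\ \mathrm{Im}\langle e^{i\Delta_1}h^T, \phi^{\lambda}_{\lambda}\rangle \end{pmatrix}, \tag{4.12} \]
\[ \frac{d}{dt}\begin{pmatrix} h_1^T \\ h_2^T \end{pmatrix} = L(\lambda_1)\begin{pmatrix} h_1^T \\ h_2^T \end{pmatrix} + P_{ess}\begin{pmatrix} \mathrm{Im}\,D \\ -\mathrm{Re}\,D \end{pmatrix}. \tag{4.13} \]
Proof. By Eq. (4.2), we have the following two equations:
\[ 0 = \mathrm{Im}\langle R, i\phi^{\lambda}\rangle = \mathrm{Re}\langle e^{i\Delta_1}g^T, \phi^{\lambda}\rangle = k_2^T\cos(\Delta_1)\langle\phi^{\lambda_1}_{\lambda_1}, \phi^{\lambda}\rangle - k_1^T\sin(\Delta_1)\langle\phi^{\lambda_1}, \phi^{\lambda}\rangle + \mathrm{Re}\langle e^{i\Delta_1}h^T, \phi^{\lambda}\rangle; \]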
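\[ 0 = \mathrm{Im}\langle R, \phi^{\lambda}_{\lambda}\rangle = k_2^T\sin(\Delta_1)\langle\phi^{\lambda_1}_{\lambda_1}, \phi^{\lambda}_{\lambda}\rangle + k_1^T\cos(\Delta_1)\langle\phi^{\lambda_1}, \phi^{\lambda}_{\lambda}\rangle + \mathrm{Im}\langle e^{i\Delta_1}h^T, \phi^{\lambda}_{\lambda}\rangle. \]
Since $\begin{pmatrix} 0 \\ \phi^{\lambda_1} \end{pmatrix}$ and $\begin{pmatrix} \phi^{\lambda_1}_{\lambda_1} \\ 0 \end{pmatrix}$ are eigenvectors of $L(\lambda_1)$, Eq. (4.10) implies Eq. (4.13).
When $|\lambda - \lambda_1|$ is small, $\langle\phi^{\lambda_1}_{\lambda_1}, \phi^{\lambda}\rangle, \langle\phi^{\lambda}_{\lambda}, \phi^{\lambda_1}\rangle > \epsilon_0 > 0$ for some constant $\epsilon_0$. Thus in this case, the matrix
\[ \begin{pmatrix} -\sin(\Delta_1)\langle\phi^{\lambda_1}, \phi^{\lambda}\rangle & \cos(\Delta_1)\langle\phi^{\lambda_1}_{\lambda_1}, \phi^{\lambda}\rangle \\ \cos(\Delta_1)\langle\phi^{\lambda}_{\lambda}, \phi^{\lambda_1}\rangle & \sin(\Delta_1)\langle\phi^{\lambda_1}_{\lambda_1}, \phi^{\lambda}_{\lambda}\rangle \end{pmatrix} \]
has an inverse uniformly bounded in $t$ and $T$.

4.3. Estimates of the parameters λ, γ and the function R

In this section, we will estimate the parameters $\lambda(t)$, $\gamma(t)$ and the function $R(t)$.
Proposition 4.4. Let $\nu > 7/2$ and $\rho_\nu := (1 + |x|)^{-\nu}$. We have for time $t \ge 0$,
\[ |\dot\lambda(t)| + |\dot\gamma(t)| \le c(1+t)^{-3}, \qquad \|\rho_\nu R\|_2 \le c(1+t)^{-3/2}, \]
where the constant $c$ is independent of $t$.
The proof of this proposition is based on estimates of the evolution operator $U(t) = e^{tL(\lambda_1)}$, which we formulate now. Note that $U(t)$ is defined in a standard way (see Lemma 5.2 for a detailed definition). Recall that the operator $P_{ess}$, defined in Eq. (3.4), is the projection onto the essential spectrum subspace $H_{pp}(L^*(\lambda))^{\perp}$. We prove in Sec. 5 that in the 1-dimensional case $U(t)$ satisfies the following estimates:
\[ \|\rho_\nu U(t)P_{ess}h\|_2 \le c(1+t)^{-3/2}\|\rho_{-2}h\|_2, \tag{4.14} \]
\[ \|\rho_\nu U(t)P_{ess}h\|_2 \le c(1+t)^{-3/2}\big(\|\rho_{-2}h\|_1 + \|h\|_2\big), \tag{4.15} \]
\[ \|U(t)P_{ess}h\|_{L^\infty} \le c\,t^{-1/2}\big(\|\rho_{-2}h\|_1 + \|h\|_2\big), \tag{4.16} \]
\[ \|U(t)P_{ess}h\|_{L^\infty} \le c(1+t)^{-1/2}\|\rho_{-2}h\|_{H^1}, \tag{4.17} \]
where $\nu > 7/2$ and recall $\rho_\nu(x) = (1 + |x|)^{-\nu}$.
Proof of Proposition 4.4. We will estimate the following quantities:
\[ m_1^T(t) = \|\rho_\nu h^T\|_2, \qquad M_1(T) = \sup_{\tau\le T}(1+\tau)^{3/2}m_1^T(\tau), \]
\[ m_2^T(t) = \|g^T\|_{L^\infty}, \qquad M_2(T) = \sup_{\tau\le T}(1+\tau)^{1/2}m_2^T(\tau), \]
where $\nu$ is a constant greater than 3.5. Note that the various constants $c$ used below do not depend on $t$ or $T$.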
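The matrix on the left-hand side of Eq. (4.12) has a uniformly bounded inverse. Hence using the definition of $M_1$, we obtain
\[ |k_1^T| + |k_2^T| \le cM_1(1+t)^{-3/2}. \tag{4.18} \]
From Eqs. (4.6)–(4.8) and (4.11), we have
\[ |\dot\lambda| + |\dot\gamma| \le c\|\rho_\nu R\|_2^2 \le c\big(|k_1^T| + |k_2^T| + \|\rho_\nu h^T\|_2\big)^2. \]
Hence by the definition of $M_1$,
\[ |\dot\lambda| + |\dot\gamma| \le cM_1^2(1+t)^{-3}. \tag{4.19} \]
By the definition of $g^T$ in (4.8), $\|\rho_\nu R\|_2 = \|\rho_\nu g^T\|_2$. By the decomposition (4.11), we have
\[ \|\rho_\nu R\|_2 \le c\big(|k_1^T| + |k_2^T| + \|\rho_\nu h^T\|_2\big) \le c(1+t)^{-3/2}M_1. \tag{4.20} \]
We need to estimate $\Delta_1$ given in Eq. (4.9). By the observations that
\[ \lambda(t) - \lambda_1 = -\int_t^T\dot\lambda(\tau)\,d\tau \quad\text{and}\quad \gamma(t) - \gamma_1 = -\int_t^T\dot\gamma(\tau)\,d\tau \]
and estimate (4.19), we have
\[ |e^{i\Delta_1} - 1| = \Big|e^{-i\int_0^t(\lambda(s)-\lambda_1)\,ds - i(\gamma(t)-\gamma_1)} - 1\Big| = \Big|e^{i\int_0^t\int_s^T\dot\lambda(\tau)\,d\tau\,ds + i\int_t^T\dot\gamma(\tau)\,d\tau} - 1\Big| \le cM_1^2. \]
Now we estimate
\[ \vec{h}^T := \begin{pmatrix} \mathrm{Re}\,h^T \\ \mathrm{Im}\,h^T \end{pmatrix}. \]
Using the Duhamel principle, we rewrite Eq. (4.13) as
\[ \vec{h}^T(t) = U(t)\vec{h}^T(0) + \int_0^t U(t-\tau)P_{ess}\begin{pmatrix} \mathrm{Im}\,D \\ -\mathrm{Re}\,D \end{pmatrix}d\tau, \]
where, recall, $U(t) = e^{tL(\lambda_1)}$. Using estimates (4.16) and (4.17), and Eq. (4.8), we obtain
\[ \|h^T\|_{L^\infty} = \|\vec{h}^T\|_{L^\infty} \le \|U(t)\vec{h}^T(0)\|_{L^\infty} + \int_0^t\sum_{n=1}^3\Big\|U(t-\tau)P_{ess}\begin{pmatrix} \mathrm{Im}\,D_n \\ -\mathrm{Re}\,D_n \end{pmatrix}\Big\|_{L^\infty}d\tau \]
\[ \le c(1+t)^{-1/2}\|\rho_{-2}R(0)\|_{H^1} + c\int_0^t\sum_{n=1}^3|t-\tau|^{-1/2}\big(\|\rho_{-2}D_n\|_1 + \|D_n\|_2\big)\,d\tau. \tag{4.21} \]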
Using estimates (4.14) and (4.15), and Eq. (4.8), we derive
\[ \|\rho_\nu h^T\|_2 = \|\rho_\nu\vec{h}^T\|_2 \le \|\rho_\nu U(t)\vec{h}^T(0)\|_2 + \int_0^t\sum_{n=1}^3\Big\|\rho_\nu U(t-\tau)P_{ess}\begin{pmatrix} \mathrm{Im}\,D_n \\ -\mathrm{Re}\,D_n \end{pmatrix}\Big\|_2d\tau \]
\[ \le c(1+t)^{-3/2}\|\rho_{-2}R(0)\|_2 + c\int_0^t\sum_{n=1}^3(1+|t-\tau|)^{-3/2}\big(\|\rho_{-2}D_n\|_1 + \|D_n\|_2\big)\,d\tau. \tag{4.22} \]
Next we estimate $\|\rho_{-2}D_n\|_1 + \|D_n\|_2$, $n = 1, 2, 3$. By estimate (4.19), we can estimate $D_1$:
\[ \|\rho_{-2}D_1\|_1 + \|D_1\|_2 \le c(|\dot\gamma| + |\dot\lambda|) \le cM_1^2(1+t)^{-3}. \]
For $D_2$,
\[ \|\rho_{-2}D_2\|_1 + \|D_2\|_2 \le cM_1^2(1+t)^{-3/2}. \]
For $D_3$, recalling the condition (fD) on the nonlinearity $f$, we isolate the terms containing at least one power of $\phi^{\lambda}$, denote their sum by $D_I$, and let $D_{II} := D_3 - D_I$. Since the leading term of $D_I$ is $c|g^T|^2\phi^{\lambda}$, we have
\[ \|D_I\|_2 + \|\rho_{-2}D_I\|_1 \le cM_2M_1(1+t)^{-2}. \tag{4.23} \]
Here we have ignored the higher-order terms, which are estimated by $P(M_2)M_1(1+t)^{-3}$, where the function $P$ is a polynomial such that $P(0) = 0$. Since $c|g^T|^{2p}g^T$ with $p \ge 4$ is the leading-order term of $D_{II}$ for $|g^T|$ small, we have
\[ \|D_{II}\|_2 + \|(1+|x|)^2D_{II}\|_1 \le c\big(\|g^T\|_{L^\infty}^{2p}\|g^T\|_2 + \|g^T\|_{\infty}^{2p-1}\|(1+|x|)g^T\|_2^2\big). \]
Using that $\|g^T\|_2 \le c$, $\|(1+|x|)g^T\|_2 \le c(1+t)$ by Eq. (2.1), and $\|g^T\|_{L^\infty}^{2p-1} \le M_2^{2p-1}(1+t)^{-p+1/2}$, and using that $p \ge 4$, we have
\[ \|D_{II}\|_2 + \|(1+|x|)^2D_{II}\|_1 \le c\big[M_2^{2p}(1+t)^{-p} + M_2^{2p-1}(1+t)^{-p+1/2}(1+t)^2\big] \le c\big[M_2^{2p}(1+t)^{-4} + M_2^{2p-1}(1+t)^{-3/2}\big]. \tag{4.24} \]
By estimates (4.23) and (4.24), and the fact that $D_3 = D_I + D_{II}$, we obtain
\[ \|D_3\|_2 + \|\rho_{-2}D_3\|_1 \le c(1+t)^{-3/2}\big(M_2M_1 + M_2^{2p} + M_2^{2p-1}\big). \]
This finishes the estimates of the $D_i$'s. Now we return to the estimation of $\vec{h}^T$.
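By estimate (4.22), we have
\[ \|\rho_\nu h^T\|_2 \le c(1+t)^{-3/2}\|\rho_{-2}R(0)\|_2 + c\big(M_2M_1 + M_1^2 + M_2^{2p} + M_2^{2p-1}\big)\int_0^t\frac{d\tau}{(1+t-\tau)^{3/2}(1+\tau)^{3/2}}. \]
Thus
\[ (1+t)^{3/2}\|\rho_\nu h^T\|_2 \le c\|\rho_{-2}R(0)\|_2 + c\big(M_2M_1 + M_1^2 + M_2^{2p} + M_2^{2p-1}\big). \]
Recall the definition of $M_1$ and set $S = \|\rho_{-2}R(0)\|_2 + \|\rho_{-2}R(0)\|_{H^1}$. Then we have
\[ M_1 \le cS + c\big(M_2M_1 + M_1^2 + M_2^{2p} + M_2^{2p-1}\big). \tag{4.25} \]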
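The step from the integral bound to the factor $(1+t)^{-3/2}$ rests on an elementary convolution estimate. A short sketch of that estimate, with constants that are not optimized, is the following.

```latex
% The integrand is symmetric under \tau -> t - \tau, so it suffices to treat [0, t/2],
% where 1 + t - \tau >= 1 + t/2.
\begin{align*}
\int_0^t \frac{d\tau}{(1+t-\tau)^{3/2}(1+\tau)^{3/2}}
  &= 2\int_0^{t/2} \frac{d\tau}{(1+t-\tau)^{3/2}(1+\tau)^{3/2}} \\
  &\le 2\,(1+t/2)^{-3/2}\int_0^{\infty}(1+\tau)^{-3/2}\,d\tau
   = 4\,(1+t/2)^{-3/2} \\
  &\le 4\cdot 2^{3/2}\,(1+t)^{-3/2},
  \qquad\text{since } 1+t/2 \ge \tfrac{1}{2}(1+t).
\end{align*}
```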
By estimate (4.21), we have
\[ \|h^T\|_{L^\infty} \le c(1+t)^{-1/2}\|\rho_{-2}R(0)\|_{H^1} + c\int_0^t\frac{d\tau}{(t-\tau)^{1/2}}\big(\|D\|_2 + \|\rho_{-2}D\|_1\big). \]
By the definition of $M_2$ and the same procedure of estimating $\|D\|_2 + \|\rho_{-2}D\|_1$ as above, we have
\[ M_2 \le cS + c\big(M_2M_1 + M_1^2 + M_2^{2p} + M_2^{2p-1}\big). \tag{4.26} \]
If $M_n(0)$ ($n = 1, 2$) are sufficiently close to zero, estimates (4.25) and (4.26) show that $M_1(T), M_2(T) \le \mu(S)S$ for any time $T$, where $\mu(S)$ is a function that is bounded for sufficiently small $S$. Thus we have shown that
\[ M_1(T) + M_2(T) \le c\big(\|\rho_{-2}R(0)\|_2 + \|\rho_{-2}R(0)\|_{H^1}\big). \]
The last estimate together with estimates (4.19) and (4.20) implies Proposition 4.4.
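The passage from (4.25) and (4.26) to a bound linear in $S$ is a standard continuity (bootstrap) argument. The following sketch spells it out under two assumptions that are not stated explicitly in the text: that $M(T) := M_1(T) + M_2(T)$ depends continuously on $T$, and that $M \le 1$ on the range considered, so that all higher powers are dominated by $M^2$. The constants $a$ and $b$ are introduced here for illustration only.

```latex
% Adding (4.25) and (4.26), and using M_1 M_2, M_1^2, M_2^{2p}, M_2^{2p-1} <= M^2 for M <= 1:
\[
M \le aS + bM^2, \qquad a := 2c, \quad b := 8c .
\]
% If 4abS < 1, the inequality x <= aS + b x^2 forces x <= x_- or x >= x_+, where
\[
x_\pm = \frac{1 \pm \sqrt{1 - 4abS}}{2b}, \qquad
x_- \le 2aS, \qquad x_+ \ge \frac{1}{2b} .
\]
% If M(0) <= x_- and T -> M(T) is continuous, M(T) cannot jump across the gap (x_-, x_+), hence
\[
M(T) \le x_- \le 2aS \quad \text{for all } T \ge 0,
\]
% which also keeps M <= 1 once S <= 1/(2a).
```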
4.4. Proof of Theorem 2.7

In this subsection, we prove our main Theorem 2.7. To this end, we use Proposition 4.4. Since $|\dot\lambda| + |\dot\gamma| \le c(1+t)^{-3}$ for some $c > 0$, there exist $\lambda_\infty$, $\gamma_\infty$ such that
\[ |\lambda(t) - \lambda_\infty| + |\gamma(t) - \gamma_\infty| \le c(1+t)^{-2}. \]
Recall that the solutions $\psi(t)$ can be written as (see Eq. (4.1))
\[ \psi(t) = e^{i\int_0^t\lambda(s)\,ds + i\gamma(t)}\big(\phi^{\lambda(t)} + R\big). \tag{4.27} \]
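The convergence of $\lambda(t)$ and $\gamma(t)$ used above follows by integrating the decay bound on the derivatives. As a brief sketch, with $\lambda_\infty := \lambda(0) + \int_0^\infty\dot\lambda(s)\,ds$ (the integral converges absolutely) and the same argument for $\gamma$:

```latex
\begin{align*}
|\lambda(t) - \lambda_\infty|
  = \Big|\int_t^\infty \dot\lambda(s)\,ds\Big|
  \le \int_t^\infty c\,(1+s)^{-3}\,ds
  = \frac{c}{2}\,(1+t)^{-2}.
\end{align*}
```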
By Proposition 4.4 and Eq. (4.27), we have for $\nu > 3.5$,
$$\|\rho_\nu(\phi_{\lambda(t)} + R - \phi_{\lambda_\infty})\|_2 \le \|\rho_\nu(\phi_{\lambda(t)} - \phi_{\lambda_\infty})\|_2 + \|\rho_\nu R\|_2 \le c\left[(1+t)^{-2} + (1+t)^{-3/2}\right] \le c(1+t)^{-3/2},$$
which implies Theorem 2.7.

5. Estimates on Propagators for Matrix Schrödinger Operators

5.1. Formulation of the main result

In this subsection, we prove estimates (4.14)–(4.17) on the propagator $U(t) = e^{tL(\lambda_1)}$, where the operator $L(\lambda)$ was defined in (2.8). Actually, we put $L(\lambda)$ in a more general setting, which is of the form
$$L_{general} = \begin{pmatrix} 0 & L_2 \\ -L_1 & 0 \end{pmatrix},$$
where the operators
$$L_1 := -\frac{d^2}{dx^2} + V_1 + \beta \quad\text{and}\quad L_2 := -\frac{d^2}{dx^2} + V_2 + \beta,$$
the constant $\beta > 0$, and the functions $V_1$ and $V_2$ are even, real and satisfy the estimates
$$|V_1(x)|, |V_2(x)| \le ce^{-\alpha|x|} \tag{5.1}$$
for some constants $c, \alpha > 0$. By standard arguments (see, e.g., [37]) we have that $\sigma_{ess}(L_{general}) = i(-\infty, -\beta] \cup i[\beta, \infty)$. The points $-i\beta$ and $i\beta$ are called thresholds. They affect the long time behavior of the semigroup $e^{tL_{general}}$ in a crucial way. Now, we define the resonances at the thresholds for the operator $L_{general}$.

Definition 5.1. A function $h \neq 0$ is called a resonance of $L_{general}$ at $i\beta$ (or $-i\beta$) if and only if $h$ is bounded and satisfies the equation $(L_{general} - i\beta I)h = 0$ (or $(L_{general} + i\beta I)h = 0$).

Lemma 5.2. The operator $L_{general}$ generates a semigroup, $e^{tL_{general}}$, $t \ge 0$.

Proof. We write the operator $L_{general}$ as $L_{general} = L_0 + U$,
where
$$L_0 := \begin{pmatrix} 0 & -\frac{d^2}{dx^2} + \beta \\ \frac{d^2}{dx^2} - \beta & 0 \end{pmatrix}, \qquad U := \begin{pmatrix} 0 & V_1 \\ -V_2 & 0 \end{pmatrix}.$$
It is easy to verify that the operator $L_0$ is a generator of a $(C_0)$ contraction semigroup (see, e.g., [19, 36]). Also the operator $U : L^2 \to L^2$ is bounded. By [19, Theorem 6.4] and [36], $L_{general} = L_0 + U$ generates a $(C_0)$ semigroup.
Let $P_{ess}$ be the projection onto the essential spectrum subspace of $L_{general}$ where, recall, $P_{ess}$ is defined in Eq. (3.4).

Theorem 5.3. Assume that the operator $L_{general}$ has no resonances at $\pm i\beta$, no eigenvalues embedded in the essential spectrum, and has no eigenvalues with nonzero real parts. Then for any constant $\mu > 3.5$, there exists a constant $c = c(\mu) > 0$ such that
$$\|\rho_\mu e^{tL_{general}}P_{ess}h\|_2 \le c(1+t)^{-3/2}\|\rho_{-2}h\|_2, \tag{5.2}$$
$$\|\rho_\mu e^{tL_{general}}P_{ess}h\|_2 \le c(1+t)^{-3/2}\left(\|\rho_{-2}h\|_1 + \|h\|_2\right), \tag{5.3}$$
$$\|e^{tL_{general}}P_{ess}h\|_{L^\infty} \le c(1+t)^{-1/2}\|\rho_{-2}h\|_{H^1}, \tag{5.4}$$
$$\|e^{tL_{general}}P_{ess}h\|_{L^\infty} \le ct^{-1/2}\left(\|h\|_2 + \|\rho_{-2}h\|_1\right), \tag{5.5}$$
where, recall that $\rho_\nu(x) = (1+|x|)^{-\nu}$.
The proof of a theorem equivalent to Theorem 5.3 is given in Sec. 5.3. It is more convenient to transform first the operator $L_{general}$ as $H := -iT^*L_{general}T$, where the $2 \times 2$ matrix
$$T := \frac{1}{\sqrt 2}\begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}. \tag{5.6}$$
Compute the matrix operator $H$ to get
$$H = H_0 + W, \tag{5.7}$$
where
$$H_0 := \begin{pmatrix} -\frac{d^2}{dx^2} + \beta & 0 \\ 0 & \frac{d^2}{dx^2} - \beta \end{pmatrix}, \qquad W := \frac{1}{2}\begin{pmatrix} V_3 & -iV_4 \\ -iV_4 & -V_3 \end{pmatrix}, \tag{5.8}$$
with the functions $V_4 := V_1 - V_2$, $V_3 := V_2 + V_1$. By the properties of the functions $V_1$, $V_2$ in (5.1), we have
$$|V_4(x)|, |V_3(x)| \le ce^{-\alpha|x|} \tag{5.9}$$
for some constants $c, \alpha > 0$. Hence, $\sigma_{ess}(H) = (-\infty, -\beta] \cup [\beta, \infty)$. The assumptions on $L_{general}$ are transferred to $H$ as follows: $H$ has no resonances at $\pm\beta$, and has only finitely many eigenvalues located in the interval $(-\beta, \beta)$.
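The conjugation $H = -iT^*L_{general}T$ can be sanity-checked numerically. The sketch below is an illustration rather than the authors' computation: it replaces the operators $L_1 = a + V_1$ and $L_2 = a + V_2$ by scalars, with $a$ standing in for $-d^2/dx^2 + \beta$. This is legitimate here because $T$ has constant entries and every entry of $T^*L_{general}T$ is a linear combination of $L_1$ and $L_2$. The helper names are hypothetical.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def conjugate_generator(l1, l2):
    """Return -i T^* L T for L = [[0, l2], [-l1, 0]] and
    T = (1/sqrt 2) [[1, i], [i, 1]], with scalars l1, l2."""
    s = 1.0 / math.sqrt(2.0)
    T = [[s, s * 1j], [s * 1j, s]]
    T_star = [[s, -s * 1j], [-s * 1j, s]]  # conjugate transpose of T
    L = [[0.0, l2], [-l1, 0.0]]
    M = matmul(T_star, matmul(L, T))
    return [[-1j * M[i][j] for j in range(2)] for i in range(2)]

if __name__ == "__main__":
    a, V1, V2 = 3.0, 0.7, -0.2
    H = conjugate_generator(a + V1, a + V2)
    print(H)
```

With these stand-ins the diagonal entries are $\pm(a + \tfrac12(V_1 + V_2))$, matching $H_0 + \tfrac12 V_3\,\mathrm{diag}(1,-1)$, and the two off-diagonal entries coincide and equal $-\tfrac{i}{2}(V_2 - V_1)$ for $L_{general}$ in the form displayed in Sec. 5.1. The sign convention chosen for $V_4$ does not affect the bound (5.9).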
Clearly the operator $H$ also generates a semigroup $e^{-itH}$. To prove the theorem above, we relate the propagator $e^{-itH}$ to the resolvent $(H - \lambda \pm i0)^{-1}$ of the generator $H$ on the essential spectrum. We introduce some notation to be used below: let $F = [f_{ij}]$ be an $n \times m$ matrix with entries $f_{ij} \in B$, where $B$ is a normed space, then
$$\|F\|_B := \sum_{i,j}\|f_{ij}\|_B.$$
The Hölder inequality for such vector-valued functions reads: if $p, q \ge 1$, $\frac1p + \frac1q = 1$, and $F_1 \in L^p$ and $F_2 \in L^q$, then
$$\|F_1F_2\|_{L^1} \le \|F_1\|_{L^p}\|F_2\|_{L^q}.$$

5.2. The spectral representation and the integral kernel of the propagator $e^{-itH}P_{ess}$

In this section, we compute the spectral representation and the integral kernel of $e^{-itH}P_{ess}$, where $P_{ess}$ is the projection onto the essential spectrum subspace of the unbounded and nonself-adjoint operator $H$. For $\theta \in \mathbb{R}$, we define the space $L^{2,\theta} := (1+|x|)^{-\theta}L^2$ with the norm
$$\|g\|_{L^{2,\theta}} = \|(1+|x|)^\theta g\|_2.$$
Recall that for $|\lambda|$ sufficiently large, $(H - \lambda \pm i0)^{-1}$ maps $L^{2,2}$ into $L^{2,-2}$. The main theorem of this subsection is:

Theorem 5.4. Let $\epsilon_0$ be a small positive number. Then, for any $t \in \mathbb{R}$ and any function $g \in L^{2,2}$,
$$e^{-itH}P_{ess}g = \lim_{K\to\infty}\lim_{\epsilon\to0^+}\frac{1}{2i\pi}\left(\int_{\beta-\epsilon_0}^{K} + \int_{-K}^{-\beta+\epsilon_0}\right)e^{-it\lambda}\left[(H-\lambda-i\epsilon)^{-1} - (H-\lambda+i\epsilon)^{-1}\right]g\,d\lambda, \tag{5.10}$$
where the limits on the right-hand side are in the topology of $L^{2,-2}$, and are uniform in $t$ in compact sets. Consequently,
$$\|e^{-iHt}P_{ess}\|_{L^{2,2}\to L^{2,-2}} \le c(t),$$
where $c(t)$ is bounded in compact sets of $\mathbb{R}$. Note that the limits in (5.10) are independent of $\epsilon_0$ because $(H \pm z + i0)^{-1} = (H \pm z - i0)^{-1}$ for any $z$ in the interval $[\beta - \epsilon_0, \beta)$.

Remark 5.5. Clearly we can extend Eq. (5.10) to any functions $f(\lambda)$ satisfying $\int|\hat f(t)|\,dt < \infty$ by
$$f(H)P_{ess} = \frac{1}{2i\pi}\left(\int_\beta^\infty + \int_{-\infty}^{-\beta}\right)f(\lambda)\left[(H-\lambda-i0)^{-1} - (H-\lambda+i0)^{-1}\right]d\lambda,$$
where the function $\hat f$ is the Fourier transform of $f$. It can also be extended to other classes of functions using that if $\lambda \in (-\infty, -\beta] \cup [\beta, \infty)$ and $g \in C_0^\infty$ then
$$\left\langle g, \frac{1}{2\pi i}\sigma_3\left[(H-\lambda-i0)^{-1} - (H-\lambda+i0)^{-1}\right]g\right\rangle \ge 0, \tag{5.11}$$
where $\sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. Since the operator $H$ plays an important role in the nonlinear theory, we discuss this extension elsewhere. It is not used in this paper.

We divide the proof of Theorem 5.4 into two steps: first, in Lemmas 5.6 and 5.7, we prove that the limits on the right-hand side exist; then we prove that the left- and right-hand sides are equal.

Lemma 5.6. If the constant $\beta' < \beta$ and $\beta - \beta'$ is sufficiently small, and if $K$ is a sufficiently large constant, then for any function $g \in L^{2,2}$ the limit
$$\lim_{\epsilon\to0^+}\int_{\beta'}^{K}e^{-it(\lambda-i\epsilon)}(H-\lambda+i\epsilon)^{-1}g\,d\lambda \tag{5.12}$$
exists in the space $L^{2,-2}$ and its norm is less than $c(t)\|g\|_{L^{2,2}}$, where $c(t)$ is a function bounded on compact sets of $\mathbb{R}$.

Proof. Suppose that $f, g \in L^{2,2}$. Then, the function $u(z) := \langle f, e^{-itz}(H-z)^{-1}g\rangle$ is analytic on the set $\Gamma := \{z \mid \operatorname{Im} z < 0,\ \operatorname{Re} z \ge \beta'\}$. Therefore by standard techniques of complex analysis we have
$$\int_{\beta'}^{K}u(z-i0)\,dz = \int_{\beta'-i\epsilon}^{K-i\epsilon}u(z)\,dz + \int_{K-i\epsilon}^{K}u(z)\,dz + \int_{\beta'}^{\beta'-i\epsilon}u(z)\,dz. \tag{5.13}$$
For the first term, we observe that $(H-z)^{-1} : L^{2,2} \to L^{2,-2}$ is uniformly bounded on the line $(\beta'-i\epsilon, K-i\epsilon)$, thus
$$\left|\int_{\beta'-i\epsilon}^{K-i\epsilon}u(z)\,dz\right| \le c(\epsilon)e^{\epsilon t}\|f\|_2\|g\|_2. \tag{5.14}$$
For the third term, since $\sigma_{ess}(H) = (-\infty, -\beta] \cup [\beta, \infty)$, we have $[\beta'-i\epsilon, \beta'] \cap \sigma_{ess}(H) = \emptyset$ for a fixed $\epsilon_0 \in (0, \beta)$. Therefore, $|u(z)| \le c\|f\|_2\|g\|_2$ for any $z \in [\beta'-i\epsilon, \beta']$. Hence
$$\left|\int_{\beta'}^{\beta'-i\epsilon}u(z)\,dz\right| \le c\|f\|_2\|g\|_2. \tag{5.15}$$
Consider $u(z)$ in the interval $[K-i\epsilon, K]$. We claim that in this interval $|u(z)| \le c\|f\|_{L^{2,2}}\|g\|_{L^{2,2}}$, which implies that
$$\left|\int_{K-i\epsilon}^{K}u(z)\,dz\right| \le c\|f\|_{L^{2,2}}\|g\|_{L^{2,2}}. \tag{5.16}$$
To prove the claim, we use the equation
$$(H-z)^{-1} = \left(1 + (H_0-z)^{-1}W\right)^{-1}(H_0-z)^{-1}. \tag{5.17}$$
We compute the integral kernel of $(H_0-z)^{-1}$ as
$$G_0(x,y,z) = \frac12\begin{pmatrix} \dfrac{e^{-i\sqrt{z-\beta}\,|x-y|}}{i\sqrt{z-\beta}} & 0 \\ 0 & -\dfrac{e^{-\sqrt{z+\beta}\,|x-y|}}{\sqrt{z+\beta}} \end{pmatrix}, \tag{5.18}$$
where $\sqrt{z+\beta}$ and $\sqrt{\beta-z}$ are defined in such a way that their real parts are nonnegative. Hence for $z \in K - i[0,\epsilon]$, the operators $(H_0-z)^{-1} : L^{2,2} \to L^{2,-2}$ are uniformly bounded in $|\operatorname{Im} z|$, and converge to zero as $K \to \infty$. This implies that the operator $1 + (H_0-z)^{-1}W : L^{2,-2} \to L^{2,-2}$ has a bounded inverse for sufficiently large $\operatorname{Re} z$. The fact that the operators $(H-z)^{-1} : L^{2,2} \to L^{2,-2}$ are uniformly bounded for $|\operatorname{Re} z|$ large and $\operatorname{Im} z \neq 0$ comes from Eq. (5.17) and the discussion above. Since the spaces $L^{2,-2}$ and $L^{2,2}$ are dual to each other, the claim follows. Equations (5.13)–(5.16) imply Eq. (5.12).

Lemma 5.7. If the constant $K_1$ is sufficiently large, then for any function $g \in L^{2,2}$,
$$\int_\beta^{K_1}e^{-it\lambda}\left[(H-\lambda+i0)^{-1} - (H-\lambda-i0)^{-1}\right]g\,d\lambda \in L^{2,-2} \tag{5.19}$$
and converges uniformly with respect to $t$ in the norm of $L^{2,-2}$ as $K_1 \to \infty$.

Proof. Instead of proving the lemma directly, we prove a sufficient statement: for a fixed function $g \in L^{2,2}$ and a large constant $K_2$, the integral
$$\int_{K_2}^{K_1}\left|\left\langle f, e^{-it\lambda}\left[(H-\lambda+i0)^{-1} - (H-\lambda-i0)^{-1}\right]g\right\rangle\right|d\lambda \le \int_{K_2}^{K_1}\left|\left\langle f, \left[(H-\lambda+i0)^{-1} - (H-\lambda-i0)^{-1}\right]g\right\rangle\right|d\lambda \tag{5.20}$$
converges uniformly as $K_1 \to \infty$ for any $f \in L^{2,2}$ with $\|f\|_{L^{2,2}} = 1$.
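Since we only study the convergence of (5.20), we always assume the constant $K_2$ is sufficiently large so that if $\lambda \ge K_2$, then $1 + (H_0-\lambda\pm i0)^{-1}W : L^{2,-2} \to L^{2,-2}$ is invertible. First, using the second resolvent equation and the formula $(H-\lambda\pm i0)^{-1} = [1 + (H_0-\lambda\pm i0)^{-1}W]^{-1}(H_0-\lambda\pm i0)^{-1}$ from Eq. (5.17), we obtain
$$(H-\lambda+i0)^{-1} - (H-\lambda-i0)^{-1} = \left(1 + (H_0-\lambda+i0)^{-1}W\right)^{-1}\left[(H_0-\lambda+i0)^{-1} - (H_0-\lambda-i0)^{-1}\right]\left(1 + W(H_0-\lambda-i0)^{-1}\right)^{-1}.$$
Next, by a standard argument we compute the integral kernel of $\frac{1}{2\pi i}[(H_0-\lambda+i0)^{-1} - (H_0-\lambda-i0)^{-1}]$ as
$$\begin{pmatrix} \dfrac{\cos k(x-y)}{k} & 0 \\ 0 & 0 \end{pmatrix} = \frac{1}{2k}\left[\begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\begin{pmatrix} e^{iky}, & 0 \end{pmatrix} + \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\begin{pmatrix} e^{-iky}, & 0 \end{pmatrix}\right],$$
where, recall, $k = \sqrt{\lambda-\beta}$. Since $f, g \in L^{2,2}$, the following functions are well defined:
$$f_\lambda^* := [1 + W^*(H_0-\lambda-i0)^{-1}]^{-1}f, \qquad g_\lambda := [1 + W(H_0-\lambda-i0)^{-1}]^{-1}g \in L^{2,2}.$$
Moreover, by Eq. (5.18), we obtain that for large $\lambda$
$$\|f - f_\lambda^*\|_{L^{2,2}} \le \frac{c}{|k|}\|f\|_{L^{2,2}}, \qquad \|g - g_\lambda\|_{L^{2,2}} \le \frac{c}{|k|}\|g\|_{L^{2,2}}. \tag{5.21}$$
Therefore,
$$\int_{K_1}^{K_2}\left|\left\langle f, \left[(H-\lambda+i0)^{-1} - (H-\lambda-i0)^{-1}\right]g\right\rangle\right|dk^2 \le \int_{K_1}^{K_2}\left(\left|\left\langle f_\lambda^*, \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right\rangle\left\langle\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}, g_\lambda\right\rangle\right| + \left|\left\langle f_\lambda^*, \begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\right\rangle\left\langle\begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}, g_\lambda\right\rangle\right|\right)dk. \tag{5.22}$$
We only consider the first term on the right-hand side of (5.22) because the estimate of the second term is the same. We claim that
$$\int_{K_1}^{K_2}\left|\left\langle f_\lambda^*, \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right\rangle\left\langle\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}, g_\lambda\right\rangle\right|dk \le c\int_{K_1}^{K_2}\left[a_{K_1}^{-2}\frac{\|g\|_{L^{2,2}}^2}{k^2} + \left(a_{K_1}^2 + b_{K_1}^2\right)\frac{\|f\|_{L^{2,2}}^2}{k^2} + b_{K_1}^{-2}\left|\left\langle g, \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right\rangle\right|^2 + \left(a_{K_1}^2 + b_{K_1}^2\right)\left|\left\langle\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}, f\right\rangle\right|^2\right]dk, \tag{5.23}$$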
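where the scalar functions
$$a_K := \left(\int_K^\infty\frac{\|g\|_{L^{2,2}}^2}{k^2}\,dk\right)^{1/10}, \qquad b_K := \left(\int_K^\infty\left|\left\langle g, \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right\rangle\right|^2dk\right)^{1/10}.$$
It is easy to see that as $K_1, K_2 \to \infty$, $a_{K_1}, b_{K_1} \to 0$. We estimate the four terms on the right-hand side of estimate (5.23) as
$$a_{K_1}^{-2}\int_{K_1}^{K_2}\frac{\|g\|_{L^{2,2}}^2}{k^2}\,dk \le a_{K_1}^8,$$
$$\left(a_{K_1}^2 + b_{K_1}^2\right)\int_{K_1}^{K_2}\frac{\|f\|_{L^{2,2}}^2}{k^2}\,dk \le c\left(a_{K_1}^2 + b_{K_1}^2\right)\|f\|_{L^{2,2}}^2,$$
$$b_{K_1}^{-2}\int_{K_1}^{K_2}\left|\left\langle g, \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right\rangle\right|^2dk \le b_{K_1}^8,$$
$$\left(a_{K_1}^2 + b_{K_1}^2\right)\int_{K_1}^{K_2}\left|\left\langle\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}, f\right\rangle\right|^2dk \le \left(a_{K_1}^2 + b_{K_1}^2\right)\|f\|_2^2.$$
Therefore,
$$\int_{K_1}^{K_2}\left|\left\langle f_\lambda^*, \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right\rangle\left\langle\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}, g_\lambda\right\rangle\right|dk \le a_{K_1}^8 + b_{K_1}^8 + c\left(a_{K_1}^2 + b_{K_1}^2\right)\|f\|_{L^{2,2}}^2 \to 0$$
for a fixed $g \in L^{2,2}$, and the decay is independent of $f$. What is left is to prove estimate (5.23). Indeed, writing $e_k := \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}$ for brevity,
$$\left|\langle f_\lambda^*, e_k\rangle\langle e_k, g_\lambda\rangle\right| = \left|\langle f_\lambda^* - f + f, e_k\rangle\langle e_k, g_\lambda - g + g\rangle\right|$$
$$\le a_{K_1}\left|\langle f_\lambda^* - f, e_k\rangle\right|\,a_{K_1}^{-1}\left|\langle e_k, g_\lambda - g\rangle\right| + b_{K_1}\left|\langle f, e_k\rangle\right|\,b_{K_1}^{-1}\left|\langle e_k, g\rangle\right| + a_{K_1}\left|\langle f_\lambda^* - f, e_k\rangle\right|\,a_{K_1}^{-1}\left|\langle e_k, g\rangle\right| + b_{K_1}\left|\langle f, e_k\rangle\right|\,b_{K_1}^{-1}\left|\langle e_k, g_\lambda - g\rangle\right|.$$
By using the Hölder inequality we obtain estimate (5.23). The proof is complete.

Lemmas 5.6 and 5.7 show the existence of the limits on the right-hand side of Eq. (5.10). Now we prove Eq. (5.10).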
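Proof of Eq. (5.10). Instead of the unbounded, nonself-adjoint operator $H$, we consider the bounded, nonself-adjoint operators $K_{\epsilon,\kappa} := (1 - i\epsilon(H + i\kappa))^{-1}$ for any $\epsilon, \kappa \in \mathbb{R}$, where $K_{\epsilon,\kappa}$ is well defined because $H$ has no complex spectrum. The operator $K_{\epsilon,\kappa}$ is not invertible on the set (its spectrum)
$$\{(1 - i\epsilon(a_n + i\kappa))^{-1} \mid a_n \text{ is an eigenvalue of } H\} \cup \gamma_2 \cup \gamma_3,$$
where $\gamma_2$ and $\gamma_3$ are the curves
$$\gamma_2 := \{(1 - i\epsilon(\mu + i\kappa))^{-1} \mid \mu \ge \beta\}, \qquad \gamma_3 := \{(1 - i\epsilon(\mu + i\kappa))^{-1} \mid \mu \le -\beta\}.$$
Also it is easy to see that $P_d^{K_{\epsilon,\kappa}} = P_d^H$, where, recall the definition of $P_d^K$ from Definition 3.4. Also by Definition 3.4,
$$P_{ess}^{K_{\epsilon,\kappa}} = P_{ess},$$
where, recall that $P_{ess}^{K_{\epsilon,\kappa}}$ is the projection onto the essential spectrum subspace of the operator $K_{\epsilon,\kappa}$. Let $f$ be an entire function on $\mathbb{C}$. Then, we can define the operator $f(K_{\epsilon,\kappa})$ by the Taylor series (see e.g. [26]); moreover one has
$$f(K_{\epsilon,\kappa})P_{ess}^{K_{\epsilon,\kappa}} = \frac{1}{2i\pi}\oint_{\gamma_1}f(\lambda)(K_{\epsilon,\kappa} - \lambda)^{-1}\,d\lambda, \tag{5.24}$$
where $\gamma_1$ is a contour around the curves $\gamma_2$, $\gamma_3$ leaving $\{(1 - i\epsilon(a_n + i\kappa))^{-1} \mid a_n \text{ is an eigenvalue of } H\}$ outside. In order to get a formula similar to Eq. (5.10), we transform the operator $(K_{\epsilon,\kappa} - \lambda)^{-1}$ on the right-hand side of Eq. (5.24) as
$$(K_{\epsilon,\kappa} - \lambda)^{-1} = -\frac{1}{\lambda}\left(H + i\kappa - \frac{1}{i\epsilon}\right)\left(H + i\kappa - \frac{1}{i\epsilon} + \frac{1}{i\epsilon\lambda}\right)^{-1}.$$
There exists an $\epsilon_0 > 0$ such that if $|\lambda| \le \epsilon_0$ and $\lambda \in \gamma_2 \cup \gamma_3$, then
$$\left(H + i\kappa - \frac{1}{i\epsilon} + \frac{1}{i\epsilon(\lambda\pm0)}\right)^{-1} : L^{2,2} \to L^{2,-2}$$
are well defined and
$$\left\|\frac{1}{\lambda}\left(H + i\kappa - \frac{1}{i\epsilon} + \frac{1}{i\epsilon(\lambda\pm0)}\right)^{-1}\right\|_{L^{2,2}\to L^{2,-2}} \le \frac{c}{|\lambda|}$$
for some constant $c$. Therefore, the integral
$$\int_{\lambda\in\gamma_2\cup\gamma_3,\ |\lambda|\le\epsilon_0}\frac{f(\lambda)}{\lambda}\left(H + i\kappa - \frac{1}{i\epsilon} + \frac{1}{i\epsilon(\lambda\pm0)}\right)^{-1}d\lambda \tag{5.25}$$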
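exists. Based on the arguments above, we deform the contour $\gamma_1$ as
$$f(K_{\epsilon,\kappa})P_{ess}^{K_{\epsilon,\kappa}} = \frac{1}{2i\pi}\oint_{\gamma_4+\gamma_5}f(\lambda)(K_{\epsilon,\kappa} - \lambda)^{-1}\,d\lambda + \frac{1}{2i\pi}\int_{\lambda\in\gamma_2\cup\gamma_3,\ |\lambda|\le\epsilon_0}f(\lambda)\left[(K_{\epsilon,\kappa} - \lambda + 0)^{-1} - (K_{\epsilon,\kappa} - \lambda - 0)^{-1}\right]d\lambda, \tag{5.26}$$
where $\gamma_4$, $\gamma_5$ are the contours around the spectral points $\gamma_2 \cap \{\lambda \mid |\lambda| > \epsilon_0\}$, $\gamma_3 \cap \{\lambda \mid |\lambda| > \epsilon_0\}$ respectively, and all the other spectral points of $K_{\epsilon,\kappa}$ are kept outside. We proved that when $|\lambda| \le \epsilon_0$ the operators $(K_{\epsilon,\kappa} - \lambda \pm 0)^{-1}\left(1 - \frac{d^2}{dx^2}\right)^{-1}$ map the space $L^{2,2}$ into $L^{2,-2}$, so the following calculation makes sense: $(K_{\epsilon,\kappa} - \lambda - 0)^{-1} - (K_{\epsilon,\kappa} - \lambda + 0)^{-1}$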
−1
−1 1 1 1 1 1 =− 2 − H + iκ − + H + iκ − + . iλ i i(λ − 0) i i(λ + 0) 1 By the change of variable z = iλ + iκ + i , we have 1 f (λ)[(K,κ − λ + 0)−1 − (K,κ − λ − 0)−1 ] dλ 2iπ λ∈γ2 ∪γ3 |λ|≤0
∞ κ2 1 = + f ((1 − i(z + iκ))−1 )((H − z − i0)−1 2iπ κ1 −∞
− (H − z + i0)−1 ) dz, (5.27) 1 i where κ1 , κ2 are the points such that iκn + iκ + = 0 (n = 1, 2). On the other hand, 1 f (λ)(K,κ − λ)−1 dλ 2iπ γ4 +γ5 −1
1 f (λ) 1 1 1 =− H + iκ − H + iκ − + dλ 2iπ γ4 +γ5 λ i iλ i −1 1 f (λ) 1 1 + dλ. (5.28) = H + iκ − 2iπ γ4 +γ5 iλ2 i iλ Let z = iκ −
1 i
1 iλ ,
the equation equals to
1 1 f (H − z)−1 dz 1 2iπ γ6 +γ7 i(z − iκ + i ) −β κ1 1 = + f ((1 − i(z + iκ))−1 )[(H − z − i0)−1 2iπ κ2 β
+
− (H − z + i0)−1 ] dz,
(5.29)
where the curves γ6 and γ7 are transformations of γ4 and γ5 (from the parameter λ to z), the constants κ1 and κ2 are the same as those in Eq. (5.27).
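The transformation of $(K_{\epsilon,\kappa} - \lambda)^{-1}$ used in this proof, namely $(K_{\epsilon,\kappa} - \lambda)^{-1} = -\frac{1}{\lambda}\left(H + i\kappa - \frac{1}{i\epsilon}\right)\left(H + i\kappa - \frac{1}{i\epsilon} + \frac{1}{i\epsilon\lambda}\right)^{-1}$, is purely algebraic in $H$. A minimal sketch of a sanity check follows, with $H$ replaced by a scalar $h$ (an assumption made only for illustration; all function names are hypothetical). It also checks that a point $\lambda = (1 - i\epsilon(\mu + i\kappa))^{-1}$ of the curves $\gamma_2$, $\gamma_3$ turns the last factor into $(h - \mu)^{-1}$.

```python
def K(h, eps, kappa):
    """Scalar stand-in for K_{eps,kappa} = (1 - i eps (H + i kappa))^{-1}."""
    return 1.0 / (1.0 - 1j * eps * (h + 1j * kappa))

def resolvent_direct(h, lam, eps, kappa):
    """(K - lam)^{-1}, computed directly."""
    return 1.0 / (K(h, eps, kappa) - lam)

def resolvent_transformed(h, lam, eps, kappa):
    """-(1/lam) * B * (B + 1/(i eps lam))^{-1} with B = h + i kappa - 1/(i eps)."""
    B = h + 1j * kappa - 1.0 / (1j * eps)
    return -(1.0 / lam) * B / (B + 1.0 / (1j * eps * lam))

def shifted_symbol(h, mu, eps, kappa):
    """B + 1/(i eps lam) evaluated at lam = K(mu); should equal h - mu."""
    lam = K(mu, eps, kappa)
    B = h + 1j * kappa - 1.0 / (1j * eps)
    return B + 1.0 / (1j * eps * lam)

if __name__ == "__main__":
    h, lam, eps, kappa = 1.3, 0.4 + 0.2j, 0.5, 0.7
    print(resolvent_direct(h, lam, eps, kappa))
    print(resolvent_transformed(h, lam, eps, kappa))
    print(shifted_symbol(h, 2.0, eps, kappa), h - 2.0)
```

The second check is the spectral-mapping step behind the change of variables: along $\gamma_2 \cup \gamma_3$ the transformed resolvent becomes a multiple of the resolvent of $H$ at the real point $\mu$.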
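By Eqs. (5.26)–(5.29), we have that for any entire function $f$,
$$f(K_{\epsilon,\kappa})P_{ess} = \frac{1}{2i\pi}\left(\int_\beta^\infty + \int_{-\infty}^{-\beta}\right)f\left((1 - i\epsilon(z + i\kappa))^{-1}\right)\left[(H-z-i0)^{-1} - (H-z+i0)^{-1}\right]dz. \tag{5.30}$$
By the results in [19, 36], we have that for some $\kappa_0 \in \mathbb{R}$,
$$e^{-itH} = s\text{-}\lim_{\epsilon\to0^+}e^{-itH(1 - i\epsilon(i\kappa_0 + H))^{-1}} = s\text{-}\lim_{\epsilon\to0^+}e^{\frac{t}{\epsilon} - \left(t\kappa_0 + \frac{t}{\epsilon}\right)(1 - i\epsilon(H + i\kappa_0))^{-1}}. \tag{5.31}$$
Since the function $f(z) := e^{\frac{t}{\epsilon} - \left(t\kappa_0 + \frac{t}{\epsilon}\right)z}$ is analytic, by Eq. (5.27) we have
$$e^{-itH(1 - i\epsilon(i\kappa_0 + H))^{-1}}P_{ess} = \frac{1}{2i\pi}\left(\int_\beta^{+\infty} + \int_{-\infty}^{-\beta}\right)e^{-it\frac{z}{1 - i\epsilon(z + i\kappa_0)}}\left((H-z-i0)^{-1} - (H-z+i0)^{-1}\right)dz.$$
Let $\epsilon \to 0^+$. The limit on the left-hand side is $e^{-itH}P_{ess}$ by Eq. (5.31), and the limit on the right-hand side is
$$\frac{1}{2i\pi}\left(\int_\beta^{+\infty} + \int_{-\infty}^{-\beta}\right)e^{-itz}\left((H-z-i0)^{-1} - (H-z+i0)^{-1}\right)dz$$
by Eq. (5.30) and some arguments similar to the proof of Lemmas 5.6 and 5.7. The proof of Eq. (5.10) is complete.

Define the space $L^\gamma := e^{-\frac{\gamma}{4}|x|}L^2$ with the norm
$$\|g\|_{L^\gamma} := \|e^{\frac{\gamma}{4}|x|}g\|_{L^2}. \tag{5.32}$$
It will be proved in the discussion after Lemma A.19 that for any $\lambda > \beta$, the operator $1 + (H_0-\lambda-i0)^{-1}W : L^{-\alpha} \to L^{-\alpha}$ has a bounded inverse, where, recall the constant $\alpha$ from Eq. (A.3). Define
$$e(\cdot, k) := [1 + (H_0-\lambda-i0)^{-1}W]^{-1}\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix} \tag{5.33}$$
and $e^{\%}(x,k) := \sigma_3e(x,k)$, where
$$\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$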
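There are two terms on the right-hand side of Eq. (5.10); we denote the first term by $e^{-itH}P_{ess}^+$.

Lemma 5.8.
$$e^{-iHt}P_{ess}^+f = \frac{1}{2\pi}\int_{k\ge0}e^{-i(k^2+\beta)t}\left[\langle e^{\%}(x,k), f\rangle e(\cdot,k) + \langle e^{\%}(-x,k), f\rangle e(-\cdot,k)\right]dk. \tag{5.34}$$

Proof. For any functions $f, g \in L^\alpha$, we define
$$f_\lambda^* := (1 + W^*(H_0-\lambda+i0)^{-1})^{-1}f, \qquad g_\lambda := (1 + W(H_0-\lambda+i0)^{-1})^{-1}g.$$
It will be proved in the discussion after Lemma A.19 that for any $\lambda > \beta$, the operators $1 + (H_0-\lambda-i0)^{-1}W$ and $1 + (H_0-\lambda-i0)^{-1}W^*$ map $L^{-\alpha}$ into $L^{-\alpha}$. Since the spaces $L^\alpha$ and $L^{-\alpha}$ are dual to each other, $1 + W(H_0-\lambda+i0)^{-1}$ and $1 + W^*(H_0-\lambda+i0)^{-1}$ map the space $L^\alpha$ to $L^\alpha$. Thus, $f_\lambda^*, g_\lambda \in L^\alpha$. Therefore, writing $e_{\pm k} := \begin{pmatrix} e^{\pm ikx} \\ 0 \end{pmatrix}$ for brevity, we compute
$$\left\langle f, -i\left((H-\lambda-i0)^{-1} - (H-\lambda+i0)^{-1}\right)g\right\rangle = \left\langle f_\lambda^*, -i\left[(H_0-\lambda-i0)^{-1} - (H_0-\lambda+i0)^{-1}\right]g_\lambda\right\rangle$$
$$= \frac{1}{2k}\langle f_\lambda^*, e_k\rangle\langle e_k, g_\lambda\rangle + \frac{1}{2k}\langle f_\lambda^*, e_{-k}\rangle\langle e_{-k}, g_\lambda\rangle$$
$$= \frac{1}{2k}\left\langle f, (1 + (H_0-\lambda-i0)^{-1}W)^{-1}e_k\right\rangle\left\langle(1 + (H_0-\lambda-i0)^{-1}W^*)^{-1}e_k, g\right\rangle + \frac{1}{2k}\left\langle f, (1 + (H_0-\lambda-i0)^{-1}W)^{-1}e_{-k}\right\rangle\left\langle(1 + (H_0-\lambda-i0)^{-1}W^*)^{-1}e_{-k}, g\right\rangle$$
$$= \frac{1}{2k}\left(\langle f, e(\cdot,k)\rangle\langle e^{\%}(\cdot,k), g\rangle + \langle f, e(-\cdot,k)\rangle\langle e^{\%}(-\cdot,k), g\rangle\right).$$
Using Eq. (5.10) again, we have Eq. (5.34).

Taking $t = 0$ in Eq. (5.10), we obtain
$$P_{ess} = P_{ess}^+ + P_{ess}^-,$$
where the operator $P_{ess}^+$ is given by
$$P_{ess}^+f = \lim_{K\to\infty}\lim_{\epsilon\to0^+}\frac{1}{2i\pi}\int_{\beta-\epsilon_0}^{K}\left[(H-\lambda-i\epsilon)^{-1} - (H-\lambda+i\epsilon)^{-1}\right]f\,d\lambda = \frac{1}{2\pi}\int_{k\ge0}\left[\langle e^{\%}(x,k), f\rangle e(\cdot,k) + \langle e^{\%}(-x,k), f\rangle e(-\cdot,k)\right]dk \tag{5.35}$$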
for any sufficiently small $\epsilon_0 > 0$, and similarly for $P_{ess}^-$. In fact, the operators $P_{ess}^\pm$ are the spectral projections corresponding to the branches $[\beta, \infty)$ and $(-\infty, -\beta]$ of the essential spectrum of the operator $H$.
5.3. Proof of Theorem 5.3

In this subsection, we prove the key Theorem 5.3. Recall the definition of the functions $e(x,k)$ in Eq. (5.33). We use the following technical results, proven in Appendix A.1.

Theorem 5.9. Assume that there are no embedded eigenvalues in the essential spectrum and there are no resonances at the tips, $\pm\beta$, of the essential spectrum. Then, Eq. (5.33) defines smooth functions $e(x,k)$ which are generalized eigenfunctions of the operator $H$: $He(\cdot,k) = \lambda e(\cdot,k)$ with $k = \sqrt{\lambda-\beta}$. If $k \ge 0$, we have the following estimates:
$$\sup_{k\ge0}\left|\frac{d^n}{dk^n}e(x,k)\right| \le c\rho_{-n}, \tag{5.36}$$
$$\sup_{k\ge0}\left|\frac{d^n}{dk^n}\left(e(x,k)/k\right)\right| \le c\rho_{-n-1}, \tag{5.37}$$
$$\left(\int_{k\ge0}\left|\frac{d^n}{dk^n}\left(e(x,k)/k\right)\right|^2dk\right)^{1/2} \le c\rho_{-n-1}, \tag{5.38}$$
$$e(\cdot, 0) = 0, \tag{5.39}$$
$$\|\langle e^{\%}(\cdot,k), h\rangle\|_{H^2} \le c\|\rho_{-2}h\|_2, \tag{5.40}$$
where $n = 0, 1, 2$, all $c$'s are constants independent of $x$, and recall $\rho_\nu(x) = (1+|x|)^{-\nu}$.

Before starting to prove Theorem 5.3, we state the standard estimate
$$\left\|e^{it\frac{d^2}{dx^2}}f\right\|_{L^\infty} \le ct^{-1/2}\|f\|_{L^1},$$
which is valid for some constant $c$ and for any $f \in L^1$. This estimate implies that if a function $g$ satisfies $\int e^{ikx}g(k)\,dk \in L^1$ and is even, then
$$\left|\int e^{-ik^2t}g(k)\,dk\right| \le ct^{-1/2}\left\|\int\cos(kx)g(k)\,dk\right\|_{L^1}. \tag{5.41}$$
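The standard $L^1 \to L^\infty$ estimate for the free propagator quoted before (5.41) follows from the explicit kernel of $e^{it\,d^2/dx^2}$ on the line. The classical formula is recalled here as a reminder; it is not part of the original text:

```latex
\[
  \bigl(e^{it\frac{d^2}{dx^2}}f\bigr)(x)
  = \frac{1}{\sqrt{4\pi i t}}\int_{\mathbb{R}} e^{\frac{i(x-y)^2}{4t}}\,f(y)\,dy,
  \qquad t \neq 0,
\]
\[
  \text{so that}\qquad
  \bigl\|e^{it\frac{d^2}{dx^2}}f\bigr\|_{L^\infty}
  \le \frac{1}{\sqrt{4\pi |t|}}\,\|f\|_{L^1}.
\]
```

The kernel has modulus $(4\pi|t|)^{-1/2}$ independently of $x$ and $y$, which gives the estimate with $c = (4\pi)^{-1/2}$.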
Proof of estimate (5.2). We will only consider the first term on the right-hand side of $e^{-iHt}P_{ess}$ in Eq. (5.10) and the first term of the right-hand side of Eq. (5.34). To simplify the notation, we denote $h^\#(k) := \langle e^{\%}(\cdot,k), h\rangle$. Also if the function $g(x,k)$ has two variables such that for each fixed $x$ (or $k$), $g(x,\cdot) \in B$ (or $g(\cdot,k) \in B$), then we define $\|g\|_{B_k} = \|g(x,\cdot)\|_B$ (or $\|g\|_{B_x} = \|g(\cdot,k)\|_B$).
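Since $e(x,0) = 0$ by estimate (5.39), we integrate by parts and use estimate (5.41) to obtain
$$\left|\frac{1}{2\pi}\int e^{-ik^2t}e(x,k)h^\#(k)\,dk\right| = \left|\frac{1}{2\pi}t^{-1}\int e^{-ik^2t}\frac{d}{dk}\left(\frac{e(x,k)h^\#(k)}{2ik}\right)dk\right| \le ct^{-3/2}\left\|\widehat{\frac{d}{dk}\left(\frac{e(x,k)h^\#(k)}{2ik}\right)}\right\|_{L^1}, \tag{5.42}$$
where $\hat g(x) := \int_0^\infty\cos(kx)g(k)\,dk$. Since $\|\hat g\|_1 \le c\|g\|_{H^1}$, we have
$$\left\|\widehat{\frac{d}{dk}\left(\frac{e(x,k)h^\#(k)}{2ik}\right)}\right\|_{L^1} \le c\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}\left(\frac{e(x,k)}{2ik}h^\#(k)\right)\right\|_{L^2(dk)} \le c\|h^\#\|_{H^2}\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}(e/k)\right\|_{L^\infty(dk)}.$$
Moreover by estimate (5.37),
$$\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}(e/k)\right\|_{L^\infty(dk)} \le c(1+|x|)^3,$$
and by estimate (5.40)
$$\|h^\#\|_{H^2} \le c\|\rho_{-2}h\|_2.$$
The last four estimates imply that
$$\left|\int\langle e^{\%}(\cdot,k), h\rangle e(x,k)e^{-ik^2t}\,dk\right| \le ct^{-3/2}(1+|x|)^3\|\rho_{-2}h\|_2.$$
Similarly, we estimate the other terms in Eqs. (5.34) and (5.10) to get
$$|e^{-itH}P_{ess}h| \le ct^{-3/2}(1+|x|)^3\|\rho_{-2}h\|_2.$$
Therefore if $\mu > 3.5$, then
$$\|\rho_\mu e^{-iHt}P_{ess}h\|_2 \le ct^{-3/2}\|\rho_{-2}h\|_2.$$
By Theorem 5.4, we have $\|\rho_2e^{-iHt}P_{ess}h\|_2 \le c\|\rho_{-2}h\|_2$ when $t \le 1$. The last two estimates imply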
$$\|\rho_\mu e^{-iHt}P_{ess}h\|_2 \le c(1+t)^{-3/2}\|\rho_{-2}h\|_2$$
for some $c$. This estimate is equivalent to estimate (5.2).

Proof of estimate (5.3). We start from the last line of Eq. (5.42),
$$\left\|\widehat{\frac{d}{dk}\left(\frac{e(x,k)h^\#(k)}{2ik}\right)}\right\|_{L^1} \le c\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}\left(\frac{e(x,k)}{2ik}h^\#(k)\right)\right\|_{L^2(dk)} \le c\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}h^\#\right\|_{L^\infty}\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}(e/k)\right\|_2.$$
By estimate (5.36), we have
$$\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}h^\#\right\|_{L^\infty} \le c\|\rho_{-2}h\|_1,$$
and by estimate (5.38), we have
$$\sum_{n=0}^{2}\left\|\frac{d^n}{dk^n}(e/k)\right\|_2 \le c(1+|x|)^3.$$
These estimates give
$$\|\rho_\nu e^{-iHt}P_{ess}h\|_2 \le ct^{-3/2}\|\rho_{-2}h\|_1.$$
To remove the singularity at $t = 0$ in the last estimate, we use the bound $\|\rho_\nu e^{-iHt}P_{ess}h\|_2 \le c\|\rho_{-2}h\|_2$, as in the proof of estimate (5.2). The proof of estimate (5.3) is complete.

Proof of estimate (5.4). By the Duhamel principle,
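$$\|e^{-itH}P_{ess}\rho_2h\|_{L^\infty} \le \|e^{-itH_0}P_{ess}\rho_2h\|_{L^\infty} + \left\|\int_0^te^{-i(t-s)H_0}We^{-isH}P_{ess}\rho_2h\,ds\right\|_{L^\infty}. \tag{5.43}$$
Now, we estimate the two terms on the right-hand side of Eq. (5.43). By the estimates
$$\|h\|_{L^\infty} \le c\|h\|_{H^1} \quad\text{and}\quad \left\|e^{it\frac{d^2}{dx^2}}\right\|_{L^1\to L^\infty} \le ct^{-1/2},$$
we have
$$\|e^{-itH_0}P_{ess}\rho_2h\|_{L^\infty} \le c(1+t)^{-1/2}\|h\|_{H^1}$$
and
$$\left\|\int_0^te^{-i(t-s)H_0}We^{-isH}P_{ess}\rho_2h\,ds\right\|_{L^\infty} \le c\int_0^t|t-s|^{-1/2}\|We^{-isH}P_{ess}\rho_2h\|_{L^1}\,ds.$$
Furthermore, by estimate (A.3) we have $|W| \le c\rho_6$. Therefore,
$$\|We^{-isH}P_{ess}\rho_2h\|_{L^1} \le c\|\rho_4e^{-isH}P_{ess}\rho_2h\|_2.$$
By estimate (5.2), we have
$$\|\rho_4e^{-isH}P_{ess}\rho_2h\|_2 \le c(1+s)^{-3/2}\|h\|_2. \tag{5.44}$$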
Thus
$$\int_0^t\left\|e^{-i(t-s)H_0}\right\|_{L^1\to L^\infty}\|We^{-isH}P_{ess}\rho_2h\|_{L^1}\,ds \le c\int_0^t\frac{1}{|t-s|^{1/2}}\frac{1}{(1+|s|)^{3/2}}\,ds\,\|h\|_2 \le c(1+t)^{-1/2}\|h\|_2. \tag{5.45}$$
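The last inequality in (5.45) rests on the bound $\int_0^t|t-s|^{-1/2}(1+s)^{-3/2}\,ds \le c(1+t)^{-1/2}$. A minimal numerical sketch of this step follows; it is an illustration, not part of the original proof, and the function names are hypothetical. The substitution $s = t - u^2$ removes the endpoint singularity and also yields the closed form $2\sqrt{t}/(1+t)$, which is bounded both by $2(1+t)^{-1/2}$ and by $2t^{-1/2}$.

```python
import math

def duhamel_integral(t, n=20000):
    """Approximate int_0^t (t - s)^(-1/2) (1 + s)^(-3/2) ds.
    With s = t - u^2 the integral becomes
    2 * int_0^sqrt(t) (1 + t - u^2)^(-3/2) du, which has a smooth integrand."""
    if t <= 0:
        return 0.0
    upper = math.sqrt(t)
    h = upper / n
    total = 0.0
    for j in range(n):
        u = (j + 0.5) * h
        total += (1.0 + t - u * u) ** -1.5
    return 2.0 * total * h

def closed_form(t):
    """Exact value of the same integral: 2 sqrt(t) / (1 + t)."""
    return 2.0 * math.sqrt(t) / (1.0 + t)

if __name__ == "__main__":
    for t in (0.01, 1.0, 10.0, 100.0):
        ratio = math.sqrt(1.0 + t) * closed_form(t)  # stays below 2
        print(t, duhamel_integral(t), closed_form(t), ratio)
```

The same integral, with the bound $2t^{-1/2}$, is what closes the proof of estimate (5.5).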
Estimates (5.43)–(5.45) imply the inequality
$$\|e^{-itH}P_{ess}\rho_2h\|_{L^\infty} \le c(1+t)^{-1/2}\|h\|_{H^1},$$
which is equivalent to estimate (5.4).

Proof of estimate (5.5). By the Duhamel principle,
$$\|e^{-itH}P_{ess}h\|_{L^\infty} \le \|e^{-itH_0}P_{ess}h\|_{L^\infty} + \left\|\int_0^te^{-i(t-s)H_0}We^{-isH}P_{ess}h\,ds\right\|_{L^\infty}.$$
For the first term on the right-hand side, we have
$$\|e^{-itH_0}P_{ess}h\|_{L^\infty} \le ct^{-1/2}\|h\|_1,$$
and for the second term
$$\left\|\int_0^te^{-i(t-s)H_0}We^{-isH}P_{ess}h\,ds\right\|_{L^\infty} \le c\int_0^t\frac{1}{|t-s|^{1/2}}\|\rho_{-2}We^{-isH}P_{ess}h\|_2\,ds.$$
By estimate (5.3),
$$\|\rho_{-2}We^{-isH}P_{ess}h\|_2 \le c(1+s)^{-3/2}\left(\|\rho_{-2}h\|_1 + \|h\|_2\right).$$
Therefore,
$$\|e^{-itH}P_{ess}h\|_{L^\infty} \le c\left[t^{-1/2} + \int_0^t\frac{1}{|t-s|^{1/2}}(1+s)^{-3/2}\,ds\right]\left(\|h\|_2 + \|\rho_{-2}h\|_1\right) \le ct^{-1/2}\left(\|h\|_2 + \|\rho_{-2}h\|_1\right),$$
which gives the last estimate (5.5).

Acknowledgments

We are grateful to S. Cuccagna, V. Vougalter and, especially, V. S. Buslaev for fruitful discussions. In particular, V. S. Buslaev taught us beautiful ODE techniques, many of which are due to him and his collaborators. Part of this work was done while the second author was visiting the Erwin Schrödinger Institute. He is grateful to Peter C. Aichelburg, Piotr Bizoń and Sergiu Klainerman for the hospitality and for organizing a stimulating program on nonlinear evolution equations. This paper is a part of the first author's Ph.D. thesis requirement. While the paper was being prepared for publication, the authors received the preprints by Comech and Pelinovsky [9], Schlag [40], and Goldberg and Schlag [20]
whose results overlap with some of the results of the present paper, namely, with the description of the linearized spectrum (Sec. 3), with the construction of functions of nonself-adjoint operators (Sec. 5.2), and with an estimate of 1-dimensional propagators (Sec. 5.1), respectively. The authors are grateful to Comech, Pelinovsky and Schlag for communicating their results prior to publication. The first author is supported by NSERC under Grant NA7901 and NSF under Grant DMS-0400526. The second author is supported by NSF under Grant DMS-0400526.

Appendix A. Proof of Theorem 5.9

In this appendix, we study the functions
$$e(x,k) := [1 + (H_0-\lambda+i0)^{-1}W]^{-1}\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}$$
with $k = \sqrt{\lambda-\beta}$ introduced in Sec. 5.3. A simple manipulation shows that $He(\cdot,k) = \lambda e(\cdot,k)$, i.e. $e(\cdot,k)$ are generalized eigenfunctions of the operator $H$ corresponding to spectral points $\lambda = k^2 + \beta$. We begin with some auxiliary results on solutions to the equation $H\xi = \lambda\xi$.

A.1. Generalized eigenfunctions of H

In this subsection, we study the solutions of the equation $H\xi = \lambda\xi$, considered as a differential equation, with $\lambda$ in an appropriate domain of the complex plane $\mathbb{C}$. From now on, we will only consider the positive branch of the essential spectrum subspace of $H$. The negative branch is treated in exactly the same way. Thus, we always assume $\operatorname{Re}\lambda \ge \beta > 0$ and $\operatorname{Im}\lambda$ is sufficiently small. If $\operatorname{Re}\lambda \ge \beta$, we define two functions $\sqrt{\lambda-\beta}$ and $\sqrt{\lambda+\beta}$ such that they are analytic and if $\lambda - \beta > 0$ (or $\lambda + \beta > 0$), then $\sqrt{\lambda-\beta} > 0$ (or $\sqrt{\lambda+\beta} > 0$). Define the domain
$$\Omega := \left\{\lambda \;\Big|\; \operatorname{Re}\lambda \ge \beta,\ |\operatorname{Im}\sqrt{\lambda-\beta}| + |\operatorname{Im}\sqrt{\lambda+\beta}| \le \frac{\alpha}{4}\right\}, \tag{A.1}$$
where, recall $\alpha$ from (A.3). We always denote $k := \sqrt{\lambda-\beta}$ and $\mu := \sqrt{\lambda+\beta}$. Hence the function $\mu = \sqrt{2\beta + k^2}$ is analytic in $k = \sqrt{\lambda-\beta}$, $\lambda \in \Omega$. Below we will use the space
$$L^{\infty,\beta} := e^{\beta|x|}L^\infty \tag{A.2}$$
with the norm $\|W\|_{L^{\infty,\beta}} = \|e^{-\beta|x|}W\|_{L^\infty}$.

We formulate the main result of this appendix. Recall the definition of the operator $H := H_0 + W$, where
$$H_0 := \begin{pmatrix} -\frac{d^2}{dx^2} + \beta & 0 \\ 0 & \frac{d^2}{dx^2} - \beta \end{pmatrix}, \qquad W := \frac12\begin{pmatrix} V_3 & -iV_4 \\ -iV_4 & -V_3 \end{pmatrix},$$
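the constant $\beta > 0$, the functions $V_3$, $V_4$ are even, smooth, real and decay exponentially fast at $\infty$,
$$|V_4(x)|, |V_3(x)| \le ce^{-\alpha|x|} \tag{A.3}$$
for some constants $c$, $0 < \alpha < \sqrt\beta$. The following is the main theorem of this section.

Theorem A.1. If $\lambda \in \Omega$, then the equation
$$(H - \lambda)\phi = 0 \tag{A.4}$$
has $C^3$ solutions $\phi_1(\cdot,\mu,W)$, $\psi_1(\cdot,k,W)$, $\psi_2(\cdot,k,W)$ and $\xi_1(\cdot,\mu,W)$ which are analytic in $k$ and satisfy the following estimates: there exist constants $R_1, c, \epsilon_0 > 0$ such that for all $x > R_1$, $\lambda \ge \beta$,
$$\left|\frac{d^n}{dk^n}\left(\phi_1(x,\mu,W)e^{\mu x} - \begin{pmatrix} 0 \\ 1 \end{pmatrix}\right)\right| \le ce^{-\epsilon_0x}, \tag{A.5}$$
$$\left|\frac{d^n}{dk^n}\left(\psi_1(x,k,W)e^{-ikx} - \begin{pmatrix} 1 \\ 0 \end{pmatrix}\right)\right| \le ce^{-\epsilon_0x}, \tag{A.6}$$
$$\left|\frac{d^n}{dk^n}\left(\psi_2(x,k,W)e^{ikx} - \begin{pmatrix} 1 \\ 0 \end{pmatrix}\right)\right| \le ce^{-\epsilon_0x}, \tag{A.7}$$
where $n = 0, 1, 2$; and
$$\lim_{x\to+\infty}\xi_1(x,\mu,W)e^{-\mu x} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{A.8}$$
For any constant $R_2$, there exists a constant $c_2 > 0$ such that if $\beta \le \lambda \le \beta + 1$ and $x \ge R_2$, then
$$\left|\frac{d^n}{dk^n}\phi_1(x,\mu,W)\right| \le c_2 \tag{A.9}$$
and
$$\left|\frac{d^n}{dk^n}\psi_1(x,k,W)\right| \le c_2(1+|x|)^n, \tag{A.10}$$
$$\left|\frac{d^n}{dk^n}\psi_2(x,k,W)\right| \le c_2(1+|x|)^n, \tag{A.11}$$
where $n = 0, 1, 2$. Moreover the following maps are continuous:
$$L^{\infty,\alpha} \ni W \to \tilde\phi_1(W) \in C^2 \quad\text{and}\quad L^{\infty,\alpha} \ni W \to \tilde\psi_1(W) \in C^2,$$
where, recall the constant $\alpha$ from Eq. (A.3),
$$\tilde\phi_1(W) := \left(\phi_1(x,\sqrt{2\beta},W)\big|_{x=0},\ \frac{d}{dx}\phi_1(x,\sqrt{2\beta},W)\big|_{x=0}\right), \qquad \tilde\psi_1(W) := \left(\psi_1(x,0,W)\big|_{x=0},\ \frac{d}{dx}\psi_1(x,0,W)\big|_{x=0}\right). \tag{A.12}$$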
A proof of this theorem follows from Propositions A.3, A.5 and A.6 below. When the potential $W$ is fixed, for brevity we write the solutions above as $\phi_1(x,\mu)$, $\psi_1(x,k)$, $\psi_2(x,k)$, $\xi_1(x,\mu)$ respectively if there is no confusion. Since we need to study the analyticity of all these functions and we derive them by convergent sequences, we frequently use the following lemma.

Lemma A.2. If $\{f_n\}_{n=0}^\infty$ is a sequence of analytic functions, and $\sum_n\|f_n\|_{L^\infty} < \infty$, then $\sum_nf_n$ is an analytic function.

We begin with the function $\phi_1$.

Proposition A.3. Recall $\mu = \sqrt{\lambda+\beta}$. There is a solution $\phi_1(\cdot,\mu)$ to Eq. (A.4) which satisfies the following integral equation:
$$\phi_1(x,\mu) = \begin{pmatrix} 0 \\ e^{-\mu x} \end{pmatrix} - \int_x^{+\infty}\begin{pmatrix} \dfrac{\sin k(x-y)}{k} & 0 \\ 0 & -\dfrac{e^{-\mu|x-y|} - e^{\mu|x-y|}}{2\mu} \end{pmatrix}W(y)\phi_1(y,\mu)\,dy. \tag{A.13}$$
Moreover $\phi_1(x,\cdot)$ is analytic in $k$, and satisfies estimates (A.5), (A.9) and (A.12).

Proof. We begin with proving the existence of solutions $\phi_1(x,\mu)$ to Eq. (A.13), which can be rewritten as
$$\psi = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + A_\lambda\psi, \tag{A.14}$$
where $\psi(x,\mu) := e^{\mu x}\phi_1(x,\mu)$, and the operator $A_\lambda : L^\infty([T,\infty)) \to L^\infty([T,\infty))$ is defined as
$$(A_\lambda f)(x) := -e^{\mu x}\int_x^\infty\begin{pmatrix} \dfrac{\sin k(x-y)}{k} & 0 \\ 0 & -\dfrac{e^{-\mu|x-y|} - e^{\mu|x-y|}}{2\mu} \end{pmatrix}e^{-\mu y}W(y)f(y)\,dy$$
with $T \in (-\infty,\infty)$ an arbitrary constant. We claim that for a sufficiently large $n$,
$$\|A_\lambda^n\| < 1.$$
Therefore, Eq. (A.14) has a unique solution in $L^\infty([T,\infty))$ by a standard contraction argument. By the observation that $\operatorname{Re}\mu > 0$ and $\left|\frac{\sin k(x-y)}{k}\right| \le c_1(1+|x|+|y|)e^{\mu|x-y|}$, we have that if $x \in [T,\infty)$, then
$$|A_\lambda\psi(x)| \le c(T)(1+|x|)\int_x^\infty(1+|y|)e^{-\alpha|y|}\,dy\,\|\psi\|_{L^\infty([T,\infty))} \tag{A.15}$$
for some $c(T)$ independent of $\lambda$, where, recall $\alpha$ from inequality (A.3). Moreover,
$$|A_\lambda^n\psi(x)| \le c^n(T)(1+|x|)\int_x^\infty(1+|x_1|)^2e^{-\alpha|x_1|}\,dx_1\int_{x_1}^\infty(1+|x_2|)^2e^{-\alpha|x_2|}\,dx_2\cdots\int_{x_n}^\infty(1+|y|)e^{-\alpha|y|}\,dy\,\|\psi\|_{L^\infty([T,\infty))}$$
$$\le \frac{(1+|x|)c^n(T)}{n!}\left(\int_x^\infty(1+|x|)^2e^{-\alpha|x|}\,dx\right)^n\|\psi\|_{L^\infty([T,\infty))}.$$
Hence if $n \in \mathbb{N}$ is sufficiently large, then
$$\|A_\lambda^n\|_{L^\infty([T,\infty))\to L^\infty([T,\infty))} < 1.$$
By the Neumann series, there exists a unique $\psi \in L^\infty([T,\infty))$. Thus there exists a function $\phi_1(x,\mu)$ defined in the interval $x \in [T,\infty)$ which is a solution to Eq. (A.13). Since $T$ is an arbitrary constant, $\phi_1(x,\mu)$ is well defined for $x \in (-\infty,\infty)$. The analyticity of $\phi_1(x,\cdot)$ comes from Lemma A.2 and the observations that $e^{\mu x}\phi_1(x,\mu) = \sum_{n=0}^{+\infty}A_\lambda^n\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, the series converges absolutely in the $L^\infty$ norm, and each function $A_\lambda^n\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ is analytic in $k$. To prove estimate (A.5) we need the following fact: if the constant $T$ is sufficiently large, then by estimate (A.15) we have that for all $x > T$,
$$\|A_\lambda\|_{L^\infty((x,\infty))\to L^\infty((x,\infty))} \le ce^{-\epsilon_0x} \tag{A.16}$$
for some constants $c, \epsilon_0$ independent of $x$ and $\lambda$. Using Eqs. (A.14) and (A.16), we prove (A.5). By a direct calculation, we have $(H - \lambda)\phi_1(\cdot,\mu) = 0$. Estimate (A.12) is the special case $\lambda = \beta$.

There is another solution $\phi_2(\cdot,\mu)$ to Eq. (A.4) defined by $\phi_2(x,\mu) := \phi_1(-x,\mu)$. To prove the existence of the solutions $\psi_1$, $\psi_2$ and $\xi_1$ in Theorem A.1, we first prove their existence on the domain $[R,+\infty)$, where $R$ is a large constant. Then we continue the solutions to the interval $(-\infty,\infty)$ by standard ODE theory. To this end, the following lemma will be used:

Lemma A.4. Let $a \le b$ be constants and $\Omega_2 \subset \mathbb{C}$ be a bounded closed set in the complex plane. Define a $4 \times 4$ matrix $T(x,k) := [T_{ij}(x,k)]$ such that each entry $T_{ij}$ is $C^2$ in the variable $x \in [a,b]$, and analytic in the variable $k \in \Omega_2$. Let $X(x,k) : [a,b] \times \Omega_2 \to \mathbb{C}^4$ be a solution to the ODE system
$$\frac{dX(\cdot,k)}{dx} = T(\cdot,k)X(\cdot,k), \tag{A.17}$$
with an initial datum $X(a,k)$. If $X(a,\cdot)$ is an analytic function of $k \in \Omega_2$, then $X$ is $C^2$ in $x \in [a,b]$ and analytic in $k \in \Omega_2$.
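The mechanism behind Lemma A.4, and behind the factorial bound on $A_\lambda^n$ in Proposition A.3, is that iterating a Volterra-type integral operator produces a series whose $n$-th term carries a factor $1/n!$, with each term analytic in the parameter $k$. The following is a minimal sketch in the simplest possible setting, assumed here purely for illustration: the scalar equation $X' = kX$, $X(0) = 1$, with complex $k$. In this case the Picard iterates are the Taylor partial sums of $e^{kx}$. The function name is hypothetical.

```python
import cmath

def picard_iterate(k, x, steps):
    """Value at x of the Picard iterate X_steps for X' = k X, X(0) = 1.
    Each iterate is a polynomial in x, stored by its coefficients, and
    each coefficient is a polynomial (hence analytic) in k."""
    coeffs = [1 + 0j]  # X_0(x) = 1
    for _ in range(steps):
        # X_{n+1}(x) = 1 + k * int_0^x X_n(y) dy, applied coefficientwise
        integrated = [1 + 0j] + [k * c / (j + 1) for j, c in enumerate(coeffs)]
        coeffs = integrated
    return sum(c * x ** j for j, c in enumerate(coeffs))

if __name__ == "__main__":
    k = 2.0 + 1.0j
    for steps in (5, 10, 20, 40):
        err = abs(picard_iterate(k, 1.0, steps) - cmath.exp(k))
        print(steps, err)
```

The error after $n$ steps is of size $|k|^{n+1}/(n+1)!$ on $[0,1]$, so the iteration converges for every $k$ in a bounded set, uniformly, and Lemma A.2 then gives analyticity of the limit.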
Proof. We rewrite Eq. (A.17) as
$$X(\cdot,k) = X(a,k) + A_kX(\cdot,k),$$
where $A_k : L^\infty([a,b]) \to L^\infty([a,b])$ is an operator defined by
$$A_k(X)(x) := \int_a^xT(y,k)X(y)\,dy.$$
By a straightforward calculation, we have
$$|A_k(X)(x)| \le c\int_a^xdx\,\|X\|_{L^\infty}, \tag{A.18}$$
where $c$ is independent of $x$ and $k$. Thus there exists an integer $m \in \mathbb{N}$ such that
$$\|A_k^m\|_{L^\infty([a,b])\to L^\infty([a,b])} < 1.$$
By the same strategy as in Proposition A.3 we obtain the existence, smoothness and analyticity of $X(x,k)$.

Proposition A.5. There exist solutions $\psi_1$ and $\psi_2$ to Eq. (A.4) which are analytic in $k$ and satisfy estimates (A.6), (A.7) and (A.10)–(A.12). If $\lambda = \beta$, then there is a solution $\eta$ such that
$$\eta(x) = \begin{pmatrix} x \\ 0 \end{pmatrix} + O(e^{\gamma x}) \tag{A.19}$$
as $x \to -\infty$, where $\gamma > 0$ is a constant.

Proof. We will only prove the existence of the solution $\psi_1(x,k)$; that of $\psi_2(x,k)$ is almost the same. First we will prove the existence of a function $\psi_1(\cdot,k)$ on the domain $x \in [R,+\infty)$, $\lambda \in \Omega$, satisfying the equation
$$\psi_1(x,k) = e^{ikx}\begin{pmatrix} 1 \\ 0 \end{pmatrix} - \int_x^{+\infty}\begin{pmatrix} -\dfrac{\sin k(x-y)}{k} & 0 \\ 0 & -\dfrac{1}{2\mu}e^{\mu(x-y)} \end{pmatrix}W(y)\psi_1(y,k)\,dy - \int_R^x\begin{pmatrix} 0 & 0 \\ 0 & -\dfrac{1}{2\mu}e^{-\mu(x-y)} \end{pmatrix}W(y)\psi_1(y,k)\,dy,$$
where $R$ is a sufficiently large constant. Define $\psi_k(x) := e^{-ikx}\psi_1(x,k)$. By the equation for $\psi_1(x,k)$, we have
$$\psi_k(x) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} - A_{k1}\psi_k - A_{k2}\psi_k,$$
where the operators $A_{k1}$ and $A_{k2} : L^\infty([R,\infty)) \to L^\infty([R,\infty))$ are defined as:
$$(A_{k1}\psi)(x) = \int_x^{+\infty}\begin{pmatrix} -\dfrac{\sin k(x-y)}{k} & 0 \\ 0 & -\dfrac{1}{2\mu}e^{\mu(x-y)} \end{pmatrix}W(y)e^{-ik(x-y)}\psi(y)\,dy$$
and
x
(Ak2 ψ)(x) =
R
0 0
0
−
1 −µ(x−y) W (y)e e 2µ
−ik(x−y)
ψ(y) dy.
By the properties of the domain Ω from (A.1) and that |W (x)| ≤ e−α|x| , we have ∞ α e− 2 |x| dx ψ L∞ ≤ c1 e−0 |x| ψ L∞ , (A.20) |Ak1 ψ(x)| ≤ c1 |Ak2 ψ(x)| ≤ c1
x x
e−c2 |x−y| e− 2 |y| dy ψ L∞ ≤ c1 e−0 |x| ψ L∞ α
(A.21)
R
for some constants c1 , c2 , 0 > 0. Thus if R is sufficient large, then
Ak1 ψ L∞ ([R,∞)) + Ak2 ψ L∞ ([R,∞)) ≤ 1. By the contraction lemma, we obtain the existence of ψk and
!
+∞ 1 1 ψk (x) = (Ak1 + Ak2 )n + . 0 0 n=1
The estimate (A.6) is from estimates (A.20) and (A.21). When x is not necessarily large, we estimate ψ1 (x, k) using Lemma A.4: ∀R4 > 0, if x ∈ [R3 , −R4 ] and if λ ∈ [β, β + 1], then we have n d dk n ψ1 (x, k) ≤ c for some constant c, and n = 0, 1, 2. Thus estimate (A.10) is proven. Estimate (A.12) is a case λ = β, which is easy, thus omitted. By the similar strategy, we prove the existence of a solution η to (H − β)η = 0 such that it satisfies the estimate (A.19) and the equation
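\[
\eta(x) = \begin{pmatrix} x \\ 0 \end{pmatrix}
- \int_x^{+\infty} \begin{pmatrix} -\frac{\sin k(x-y)}{k} & 0 \\ 0 & -\frac{1}{2\mu}e^{\mu(x-y)} \end{pmatrix} W(y)\eta(y)\,dy
- \int_R^x \begin{pmatrix} 0 & 0 \\ 0 & -\frac{1}{2\mu}e^{-\mu(x-y)} \end{pmatrix} W(y)\eta(y)\,dy.
\]
By a direct calculation, one can check that $(H - \lambda)\psi_1(\cdot,k) = 0$ and $(H - \beta)\eta = 0$.

Proposition A.6. Recall $\mu = \sqrt{\lambda + \beta}$. There exists a solution $\xi_1(x,\mu)$ to Eq. (A.4) which is analytic in $k$ and satisfies, as $x \to +\infty$,
\[
\xi_1(x,\mu) = e^{\mu x}\left[ \begin{pmatrix} 0 \\ 1 \end{pmatrix} + O(e^{-\epsilon(\lambda)x}) \right] \tag{A.22}
\]
for some $\epsilon(\lambda) > 0$.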
Asymptotic Stability of Nonlinear Schrödinger Equations with Potential
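Proof. We follow an idea from [4]. We will prove that if the constant $R$ is sufficiently large, then there exists a function $\xi_1$ such that
\[
\xi_1(x,\mu) = e^{\mu x}\begin{pmatrix} 0 \\ 1 \end{pmatrix}
+ \int_x^{\infty} \begin{pmatrix} 0 & 0 \\ 0 & -\frac{1}{2\mu}e^{\mu(x-y)} \end{pmatrix} W(y)\xi_1(y,\mu)\,dy
+ \int_R^x \begin{pmatrix} \frac{\sin k(x-y)}{k} & 0 \\ 0 & -\frac{1}{2\mu}e^{-\mu(x-y)} \end{pmatrix} W(y)\xi_1(y,\mu)\,dy.
\]
Define a new function $\psi(x,\mu) := e^{-\mu x}\xi_1(x,\mu)$; then $\psi$ satisfies the equation
\[
\psi = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + A_{\lambda 1}\psi + A_{\lambda 2}\psi,
\]
where $A_{\lambda 1}, A_{\lambda 2} : L^\infty([R,\infty)) \to L^\infty([R,\infty))$ are the operators defined as
\[
(A_{\lambda 1}\psi)(x) = \int_x^\infty \begin{pmatrix} 0 & 0 \\ 0 & -\frac{1}{2\mu} \end{pmatrix} W(y)\psi(y)\,dy
\]
and
\[
(A_{\lambda 2}\psi)(x) = \int_R^x \begin{pmatrix} \frac{\sin k(x-y)}{k} & 0 \\ 0 & -\frac{1}{2\mu}e^{-\mu(x-y)} \end{pmatrix} e^{-\mu(x-y)} W(y)\psi(y)\,dy.
\]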
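As usual, we want to find a large number $R$ such that
\[
\|A_{\lambda 1} + A_{\lambda 2}\|_{L^\infty([R,\infty))\to L^\infty([R,\infty))} < 1.
\]
The operator $A_{\lambda 1}$ satisfies the estimate
\[
|A_{\lambda 1}\psi(x)| \le c \int_x^\infty e^{-\alpha y}\,dy\,\|\psi\|_{L^\infty([R,\infty))}. \tag{A.23}
\]
Observe that if $R$ is sufficiently large, then
\[
\|A_{\lambda 1}\|_{L^\infty([R,\infty))\to L^\infty([R,\infty))} < \frac{1}{2}.
\]
For $A_{\lambda 2}$, since there exists some constant $\epsilon = \epsilon(\lambda) > 0$ such that $\mathrm{Re}(\mu \mp ik) > \epsilon > 0$, we obtain
\[
|(A_{\lambda 2}\psi)(x)| \le c(1 + |x|) \int_R^x (1 + |y|)e^{-\epsilon(\lambda)(x-y)}e^{-\alpha|y|}\,dy\,\|\psi\|_{L^\infty([R,\infty))} \tag{A.24}
\]
for some constant $c$ independent of $R$. We observe that if $R \to +\infty$, then
\[
\max_{x \in [R,\infty)} \int_R^x e^{-\epsilon(\lambda)(x-y)}e^{-\alpha|x|}\,dy \to 0, \quad\text{or}\quad \|A_{\lambda 2}\|_{L^\infty([R,\infty))\to L^\infty([R,\infty))} \to 0
\]
as $R \to +\infty$. Hence there exists a large constant $R$ such that
\[
\|A_{\lambda 1} + A_{\lambda 2}\|_{L^\infty([R,\infty))\to L^\infty([R,\infty))} < 1.
\]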
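By a standard contraction argument, we have the existence of the solution. Since each function $(A_{\lambda 1} + A_{\lambda 2})^n \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ is analytic in $k$, the series $e^{-\mu x}\xi_1(x,\cdot) = \sum_{n=0}^{\infty} (A_{\lambda 1} + A_{\lambda 2})^n \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ is analytic by Lemma A.2, and hence so is $\xi_1(x,\cdot)$. Estimate (A.22) follows from estimates (A.23) and (A.24).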
We define another solution to (A.4) by ξ2 (x, µ) := ξ1 (−x, µ).
A.2. Generalized Wronskian

A generalized Wronskian function is defined in the next lemma, whose proof is straightforward and is omitted here:

Lemma A.7. If two functions $X_1$ and $X_2$ satisfy $(H - \lambda)X_i = 0$ $(i = 1, 2)$, then
\[
W(X_1, X_2) := \partial_x X_1^T X_2 - X_1^T \partial_x X_2 = \mathrm{Const}.
\]
Define two $2 \times 2$ matrices
\[
F_1(x,k) := [\psi_1(x,k), \phi_1(x,\mu)], \qquad F_2(x,k) := [\psi_1(-x,k), \phi_2(x,\mu)]. \tag{A.25}
\]
The $2 \times 2$ matrix $D(k) := \partial_x F_1^T F_2 - F_1^T(\partial_x F_2)$ is independent of $x$ because each entry is a Wronskian function. We observe that $D(k)$ is a symmetric matrix. Let
\[
D(k) =: \begin{pmatrix} D_{11}(k) & D_{12}(k) \\ D_{12}(k) & D_{22}(k) \end{pmatrix}. \tag{A.26}
\]
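The constancy asserted in Lemma A.7 can be sanity-checked numerically in a scalar toy setting. The sketch below is an illustration under stated assumptions, not the matrix operator H of this paper: it integrates the scalar equation −u'' + V u = λu with the hypothetical potential V(x) = 1/cosh²(x) by the classical Runge-Kutta method and evaluates W(u₁, u₂) = u₁'u₂ − u₁u₂' along the grid. The names `rk4_solve` and `wronskian_values` are introduced only for this example.

```python
import math


def rk4_solve(V, lam, u0, du0, x0, x1, steps):
    """Integrate u'' = (V(x) - lam) * u with classical RK4; return a list of (x, u, u')."""
    h = (x1 - x0) / steps

    def f(x, u, p):
        # First-order system: u' = p, p' = (V - lam) u.
        return p, (V(x) - lam) * u

    x, u, p = x0, u0, du0
    out = [(x, u, p)]
    for _ in range(steps):
        k1u, k1p = f(x, u, p)
        k2u, k2p = f(x + h / 2, u + h / 2 * k1u, p + h / 2 * k1p)
        k3u, k3p = f(x + h / 2, u + h / 2 * k2u, p + h / 2 * k2p)
        k4u, k4p = f(x + h, u + h * k3u, p + h * k3p)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        p += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        x += h
        out.append((x, u, p))
    return out


def wronskian_values(V, lam, x0=-5.0, x1=5.0, steps=4000):
    """Return W(u1, u2) = u1' u2 - u1 u2' on the grid for two independent solutions."""
    a = rk4_solve(V, lam, 1.0, 0.0, x0, x1, steps)  # u1(x0) = 1, u1'(x0) = 0
    b = rk4_solve(V, lam, 0.0, 1.0, x0, x1, steps)  # u2(x0) = 0, u2'(x0) = 1
    return [pa * ub - ua * pb for (_, ua, pa), (_, ub, pb) in zip(a, b)]


def sech2(x):
    """Hypothetical smooth, even, exponentially decaying potential."""
    return 1.0 / math.cosh(x) ** 2
```

With these initial data the Wronskian equals −1 at the left endpoint, and the computed values stay at −1 up to the integration error across the whole interval.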
The entry $D_{22}(k) = W(\phi_1(\cdot,\mu), \phi_2(\cdot,\mu))$ will play an important role later. Under the assumption that there are no eigenvalues embedded in the essential spectrum one can prove, by strategies similar to those in [34, 38], that $\det D(k) \neq 0$ and that the operator $1 + (H_0 - \lambda + i0)^{-1}W$ is invertible in some sense for any $\lambda > \beta$. In this paper, we approach these problems in a different way and do not assume the absence of embedded eigenvalues. We have the following result:

Theorem A.8. $H$ has a resonance at the point $\beta$ if and only if $\det D(0) = 0$.
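Proof. First we prove the sufficient condition, i.e. assume $H$ has no resonance at the point $\beta$. Since the vectors $\phi_2(\cdot,\sqrt{2\beta})$, $\psi_2(-\cdot,0)$, $\eta$, $\xi_2(\cdot,\sqrt{2\beta})$ form a basis in the solution space of $(H - \beta)\varphi = 0$, there exist $2 \times 2$ matrices $A_2$ and $B_2$ such that
\[
[\psi_1(\cdot,0), \phi_1(\cdot,\sqrt{2\beta})] = [\psi_2(-\cdot,0), \phi_2(\cdot,\sqrt{2\beta})]A_2 + [\eta, \xi_2(\cdot,\sqrt{2\beta})]B_2.
\]
We claim $\det B_2 \neq 0$. Indeed, if $\det B_2 = 0$, then there exists an invertible matrix $B_1$ so that
\[
B_2 B_1 = \begin{pmatrix} \alpha_1 & 0 \\ \alpha_2 & 0 \end{pmatrix}.
\]
Thus
\[
[\gamma_1, \gamma_2] := [\psi_1(\cdot,0), \phi_1(\cdot,\sqrt{2\beta})]B_1 = [\psi_2(-\cdot,0), \phi_2(\cdot,\sqrt{2\beta})]A_2 B_1 + [\eta_3, 0]
\]
for some function $\eta_3$, which implies that the function $\gamma_2$ is bounded at $-\infty$. Since we already know that $\psi_1(\cdot,0)$ and $\phi_1(\cdot,\sqrt{2\beta})$ are bounded at $+\infty$, $\gamma_2$ is bounded at $\pm\infty$, i.e. it is a resonance. This contradicts the fact that there are no resonances at $\beta$. Therefore, $B_2$ is invertible. We recompute $D(0)$ to prove that it is invertible:
\[
\begin{aligned}
D(0) &= \left[\frac{d\psi_1(x,0)}{dx}, \frac{d\phi_1(x,\sqrt{2\beta})}{dx}\right]^T [\psi_2(-x,0), \phi_2(x,\sqrt{2\beta})]
 - [\psi_1(x,0), \phi_1(x,\sqrt{2\beta})]^T \left[\frac{d\psi_2(-x,0)}{dx}, \frac{d\phi_2(x,\sqrt{2\beta})}{dx}\right] \\
&= B_2^T \left\{ \left[\frac{d\eta}{dx}, \frac{d\xi_2(x,\sqrt{2\beta})}{dx}\right]^T [\psi_2(-x,0), \phi_2(x,\sqrt{2\beta})]
 - [\eta, \xi_2(x,\sqrt{2\beta})]^T \left[\frac{d\psi_2(-x,0)}{dx}, \frac{d\phi_2(x,\sqrt{2\beta})}{dx}\right] \right\} \\
&= B_2^T \begin{pmatrix} 1 & * \\ 0 & 2\sqrt{2\beta} \end{pmatrix},
\end{aligned}
\]
where $*$ is an unimportant constant. Therefore, $\det D(0) \neq 0$ because $\det B_2 \neq 0$.

Now, we prove the necessary condition: suppose that $\det D(0) \neq 0$. Since the vectors $\phi_1(\cdot,\sqrt{2\beta})$, $\psi_1(\cdot,0)$, $\eta(-\cdot)$, $\xi_2(-\cdot,\sqrt{2\beta})$ form a basis of the solution space for the equation $(H - \beta)\varphi = 0$,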
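a resonance must be a linear combination of these vectors. First, we exclude the vectors having $\eta(-\cdot)$, $\xi_2(-\cdot,\sqrt{2\beta})$ components by the fact that at $\infty$ the second one blows up exponentially fast and the first blows up at the rate of $x$, while $\psi_1$ and $\phi_1$ are bounded at $\infty$.

Now, we only consider linear combinations of $\phi_1$ and $\psi_1$. First, $\phi_1(\cdot,\sqrt{2\beta})$ cannot be a resonance, since otherwise $D_{22}(0) = D_{12}(0) = 0$, which implies $\det D(0) = 0$. We claim that for any scalar $z$, $\psi_1(\cdot,0) + z\phi_1(\cdot,\sqrt{2\beta})$ cannot be a resonance. Indeed, if it is a resonance, then we define two matrix-valued functions
\[
G_1 := F_1 \begin{pmatrix} 1 & 0 \\ z & 1 \end{pmatrix}, \qquad G_2 := F_2 \begin{pmatrix} 1 & 0 \\ z & 1 \end{pmatrix},
\]
where we recall $F_1$, $F_2$ from Eq. (A.25). Let
\[
\begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} := W(G_1, G_2) = \begin{pmatrix} 1 & z \\ 0 & 1 \end{pmatrix} D(0) \begin{pmatrix} 1 & 0 \\ z & 1 \end{pmatrix},
\]
where
\[
a_3 = W(\psi_1(\cdot,0) + z\phi_1(\cdot,\sqrt{2\beta}), \phi_2(\cdot,\sqrt{2\beta})) = 0
\]
and
\[
a_1 = W(\psi_1(\cdot,0) + z\phi_1(\cdot,\sqrt{2\beta}), \psi_1(-\cdot,0) + z\phi_2(\cdot,\sqrt{2\beta})) = 0,
\]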
thus $\det D(0) = 0$, which contradicts our assumption. By the discussion above, there is no resonance if $\det D(0) \neq 0$.

Recall the space $L^\gamma$ defined in Eq. (5.32). The following lemma explains the choice of the space $L^{-\alpha/2}$ in the next section.

Lemma A.9. If a function $\phi \in L^{-\alpha/2}$ satisfies $(H - \lambda)\phi = 0$ and if $\lambda > \beta$, then
\[
W(\phi, \phi_1(\pm\cdot,\mu)) = 0, \tag{A.27}
\]
and
\[
\phi = b_{+1}\phi_1(\cdot,\mu) + b_{+2}\psi_1(\cdot,k) + b_{+3}\psi_2(\cdot,k) = b_{-1}\phi_1(-\cdot,\mu) + b_{-2}\psi_1(-\cdot,k) + b_{-3}\psi_2(-\cdot,k) \tag{A.28}
\]
for some constants $b_{\pm 1}$, $b_{\pm 2}$, $b_{\pm 3}$.

Proof. The vectors $\{\phi_1(\cdot,\mu), \psi_1(\cdot,k), \psi_2(\cdot,k), \xi_1(\cdot,\mu)\}$ form a basis of the solution space of $(H - \lambda)\phi = 0$. Moreover $\phi_1(\cdot,\mu), \psi_1(\cdot,k), \psi_2(\cdot,k) \in L^{-\alpha/2}([0,+\infty))$, while $\xi_1(\cdot,\mu) \notin L^{-\alpha/2}([0,+\infty))$ by the fact that $\mathrm{Re}\,\mu > \alpha$, which follows from the assumption $\alpha < \beta$ made after Eq. (A.3). This implies the $+$ part of Eq. (A.28). The $+$ part of Eq. (A.27) follows from Eq. (A.28) and the following results: $W(\phi_1(\cdot,\mu), \phi_1(\cdot,\mu)) = W(\psi_1(\cdot,k), \phi_1(\cdot,\mu)) = W(\psi_2(\cdot,k), \phi_1(\cdot,\mu)) = 0$, while $W(\xi_1(\cdot,\mu), \phi_1(\cdot,\mu)) \neq 0$. The proof of the $-$ part of the lemma is similar.
A.3. Generalized eigenfunction e(x, k)

In this subsection, we prove that the function $e(x,k)$ in Eq. (5.33) is well defined. Recall the definition of the domain $\Omega$ from Eq. (A.1). For any $\lambda \in \Omega$, we define the operator $R^+(\lambda) : L^{\alpha} \to L^{-\alpha}$ by its integral kernel
\[
G_k^+(x,y) = \begin{pmatrix} \frac{e^{ik|x-y|}}{2ik} & 0 \\ 0 & -\frac{e^{-\mu|x-y|}}{2\mu} \end{pmatrix}, \tag{A.29}
\]
where, recall, $k = \sqrt{\lambda - \beta}$. The operator $R^+(\lambda)$ is a continuation of the resolvent $(H_0 - \lambda)^{-1}$, $\lambda \in \Omega \cap \mathbb{C}^+$, in the following sense. Observe that $\sigma(H_0) = \sigma_{ess}(H_0) = (-\infty, -\beta] \cup [\beta, \infty)$. For any functions $f, g \in L^{\alpha}$, the quadratic form $\langle f, R^+(\lambda)g \rangle$, $\lambda \in \Omega$, is an analytic continuation of the quadratic form $\langle f, (H_0 - \lambda)^{-1}g \rangle$ from $\Omega \cap \mathbb{C}^+$ to $\Omega$. Similarly we define $R^-(\lambda)$ using the integral kernel
\[
G_k^-(x,y) = \begin{pmatrix} -\frac{e^{-ik|x-y|}}{2ik} & 0 \\ 0 & -\frac{e^{-\mu|x-y|}}{2\mu} \end{pmatrix}.
\]
It is the analytic continuation of the resolvent $(H_0 - \lambda)^{-1}$ from $\lambda \in \Omega \cap \mathbb{C}^-$ to $\Omega$. Equation (A.29) and inequality (A.3) imply that if $\lambda \in \Omega$, then $R^{\pm}(\lambda)W : L^{-\alpha/2} \to L^{-\alpha/2}$ are compact operators (in fact, trace class operators, see [35]).

The following theorem is the main result of this subsection:

Theorem A.10. If $\lambda > \beta$ is not an eigenvalue of $H$ embedded in the essential spectrum, then the operators
\[
1 + R^+(\lambda)W : L^{-\alpha/2} \to L^{-\alpha/2} \tag{A.30}
\]
are invertible, and the functions $e(\cdot,k)$ in Eq. (5.33) are well defined and are of the form
\[
e(x,k) = -i\,\frac{2D_{22}(k)k}{\det D(k)}\,\eta(x,k), \tag{A.31}
\]
where
\[
\eta(x,k) := \psi_1(x,k) - \frac{D_{12}(k)}{D_{22}(k)}\phi_1(x,\mu). \tag{A.32}
\]
The proof of Theorem A.10 is given after Lemma A.19, and uses the results of Lemma A.15, Propositions A.17 and A.18, and Lemma A.19. The following lemma, whose proof is simple and thus omitted, is important in this subsection:
Lemma A.11. Let $C$ be the operator of complex conjugation, and
\[
\sigma_3 := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
If $\lambda > \beta$, then
\[
\sigma_3 R^{\pm}(\lambda)\sigma_3 = R^{\pm}(\lambda), \qquad \sigma_3 W \sigma_3 = W^*,
\]
and therefore
\[
C\sigma_3(1 + R^{\pm}(\lambda)W)C\sigma_3 = 1 + R^{\mp}(\lambda)W, \qquad \sigma_3(1 + R^{\pm}(\lambda)W^*)\sigma_3 = 1 + R^{\pm}(\lambda)W.
\]
Corollary A.12. If $\lambda > \beta$, then $\phi_1(x,\mu) = -\sigma_3\bar{\phi}_1(x,\mu)$.

We start by studying the analytic function $D(k)$ introduced in Eq. (A.26).

Lemma A.13. If there exists some $\lambda_1 > \beta$ such that the operator $1 + R^+(\lambda_1)W$ is not invertible, then either $\lambda_1$ is an eigenvalue of $H$, or $D_{22}(\sqrt{\lambda_1 - \beta}) \neq 0$.

Proof. Assume $\lambda_1$ is not an eigenvalue of $H$ and assume, for contradiction, that $D_{22}(\sqrt{\lambda_1 - \beta}) = 0$. By the definition of $D_{22}(k)$ and Lemma A.9, $\phi_1(x) := \phi_1(x, \sqrt{\beta + \lambda_1})$ is a bounded function, i.e.
\[
\phi_1(x) = c_1\phi_1(-x) + c_2\psi_2(-x, \sqrt{\lambda_1 - \beta}) + c_3\psi_1(-x, \sqrt{\lambda_1 - \beta})
\]
for some constants $c_i$, $i = 1, 2, 3$. On the other hand, since $1 + R^+(\lambda_1)W$ is not invertible, there exists a function $g \in L^{-\alpha/2} \setminus L^2$ such that $(1 + R^+(\lambda_1)W)g = 0$. By elementary calculations, we get that
\[
g(x) = c_4\phi_1(x) + c_5\psi_1(x, \sqrt{\lambda_1 - \beta}) = c_6\phi_1(-x) + c_7\psi_1(-x, \sqrt{\lambda_1 - \beta})
\]
for some constants $c_n$, $n = 4, 5, 6, 7$. Since $g$ and $\phi_1$ satisfy the equation $(H - \lambda_1)\psi = 0$, $W(\phi_1, g)$ is independent of $x$. Therefore,
\[
0 = W(\phi_1, g) = 2c_7c_3\sqrt{\lambda_1 - \beta}\,i,
\]
which implies that either $c_7 = 0$ or $c_3 = 0$. Similarly, by calculating $W(\phi_1, g(-\cdot))$, we have that either $c_5 = 0$ or $c_2 = 0$. Hence, by Corollary A.12, either $g \in L^2$ or $\phi_1 \in L^2$, and is therefore an eigenfunction of $H$, which contradicts the assumption made at the beginning of the proof. Thus $D_{22}(\sqrt{\lambda_1 - \beta}) \neq 0$.

Since $R^+(\lambda)W$ is analytic in a neighborhood of the semi-axis $[\beta, \infty)$ and since $\|R^+(\lambda)W\|_{L^{-\alpha/2}\to L^{\alpha/2}} \to 0$ as $\lambda \to \infty$, the operator $1 + R^+(\lambda)W$ fails to be invertible for at most finitely many points $\lambda \in [\beta, \infty)$.

If some point $\lambda_1 > \beta$ is not an eigenvalue of $H$, and $1 + R^+(\lambda_1)W$ is not invertible, then by Lemma A.13, $D_{22}(\sqrt{\lambda_1 - \beta}) \neq 0$. Since $D_{22}$ is an analytic function of $k$, there exists a small neighborhood $\Omega_1$ of $\lambda_1$ such that $\forall\lambda \in \Omega_1$, $D_{22}(k) \neq 0$, and $\forall\lambda \in \Omega_1 \setminus \{\lambda_1\}$, $1 + R^+(\lambda)W$ is invertible as an operator from $L^{-\alpha/2}$ to $L^{-\alpha/2}$. Hence we have:
Lemma A.14. The functions $e(x,k) := (1 + R^+(\lambda)W)^{-1}\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}$ are well defined for all $\lambda \in \Omega_1 \setminus \{\lambda_1\}$ and belong to $L^{-\alpha/2}$. Moreover, they satisfy the equation $(H - \lambda)e(\cdot,k) = 0$.
Using Lemma A.14, we find some properties of $e(x,k)$ when $\lambda \in \Omega \setminus \{\lambda_1\}$.

Lemma A.15. If at some point $\lambda \in \Omega$ the operator $1 + R^+(\lambda)W$ is invertible, then there are functions $s$, $a$ of $k$ such that
\[
e(x,k) = s(k)\psi_1(x,k) + a(k)\phi_1(x,\mu).
\]
In particular, $s(k) \neq 0$ if $\lambda$ is sufficiently large.

Proof. Since for $\lambda \in \Omega_1 \setminus \{\lambda_1\}$ we have $(H - \lambda)e(\cdot,k) = 0$ and $e(\cdot,k) \in L^{-\alpha/2}$, Lemma A.9 gives
\[
e(\cdot,k) = s(k)\psi_1(\cdot,k) + a(k)\phi_1(\cdot,\mu) + b(k)\psi_2(\cdot,k)
\]
for some functions $s$, $a$, $b$. Therefore, compared with the statement of the lemma, we only need to prove that $b = 0$. From the properties of $\psi_1$, $\psi_2$, $\phi_1$, we get that
\[
e(\cdot,k) = s(k)\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix} + b(k)\begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix} + O(e^{-\frac{\alpha}{2}x}) \tag{A.33}
\]
as $x \to +\infty$. By the definitions of the domains $\Omega$ and $\Omega_1$, we have that $|\mathrm{Im}\,k| < \alpha/2$. On the other hand,
\[
\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix} = e(x,k) + R^+(\lambda)We(\cdot,k) = e(x,k) + \begin{pmatrix} a_1(k) \\ 0 \end{pmatrix} e^{ikx} + O(e^{-\frac{\alpha}{2}x}) \tag{A.34}
\]
for some constant $a_1(k)$ as $x \to +\infty$. Comparing Eqs. (A.33) and (A.34), we find that $b = 0$. By the fact that
\[
\lim_{\lambda\to\infty} \|R^+(\lambda)We(\cdot,k)\|_{L^\infty} = 0
\]
and Eq. (A.34), we have that $s(k) \neq 0$ if $\lambda$ is large.

Lemma A.16. The analytic function $D_{22}(k)$ can vanish only on a discrete subset of $\Omega$.

Proof. We argue by contradiction. Suppose not; then $D_{22}(k) = 0$ globally. By Lemma A.9,
\[
W(e(\cdot,k), \phi_1(-\cdot,\mu)) = 0
\]
for any $\lambda$ provided that $e(x,k)$ is well defined. We proved in Lemma A.15 that for large $\lambda$, $e(x,k)$ is well defined and
\[
e(x,k) = s(k)\psi_1(x,k) + a(k)\phi_1(x,\mu)
\]
for some constants $s(k) \neq 0$ and $a(k)$. Since $D_{22}(k) = W(\phi_1(x,\mu), \phi_1(-x,\mu)) = 0$, we have
\[
D_{12}(k) = D_{21}(k) := W(\psi_1(x,k), \phi_1(-x,\mu)) = 0
\]
for any large $k$, where we recall the definition of $D(k)$ in Eq. (A.26). Thus, $\det D(k) = 0$ globally, which contradicts the fact that $\det D(0) \neq 0$. Therefore, $D_{22}(k) = 0$ only on a discrete subset.

Proposition A.17. Equation (A.31) holds for any $\lambda \in \Omega$ if the operator $1 + R^+(\lambda)W$ defined in Eq. (A.30) is invertible.

Proof. Define a set $\Omega_2$ as
\[
\Omega_2 := \{\lambda \mid \lambda \in \Omega,\ 1 + R^+(\lambda)W \text{ is invertible},\ D_{22}(\sqrt{\lambda - \beta}) \neq 0\}.
\]
By Lemma A.16, $\Omega \setminus \Omega_2$ is a discrete subset of $\Omega$, and $\Omega_1 \subset \Omega_2$. Therefore, $e(x,k)$ is well defined if $k^2 + \beta = \lambda \in \Omega_2$. Since $e(\cdot,k) \in L^{-\alpha/2}$, Lemmas A.9 and A.15 show that there exist functions $s$, $a$, $r$, $b$ of the variable $k$ such that
\[
e(x,k) = s(k)\psi_1(x,k) + a(k)\phi_1(x,\mu) = \psi_2(-x,k) + r(k)\psi_1(-x,k) + b(k)\phi_2(x,\mu). \tag{A.35}
\]
If $\lambda \in \Omega_2$, then the fact $D_{22}(k) \neq 0$ implies that $s(k) \neq 0$. Define $s_1(k) := \frac{1}{s(k)}$ and $k_1(k) := \frac{a(k)}{s(k)}$. Then,
\[
e(x,k) = \frac{1}{s_1(k)}[\psi_1(x,k) + k_1(k)\phi_1(x,\mu)].
\]
We claim $k_1(k) = -\frac{D_{12}(k)}{D_{22}(k)}$. Indeed, consider the matrix
\[
(s_1(k)e(x,k), \phi_1(x,\mu)) = (\psi_1(x,k), \phi_1(x,\mu))\begin{pmatrix} 1 & 0 \\ k_1(k) & 1 \end{pmatrix} = F_1(x,k)\begin{pmatrix} 1 & 0 \\ k_1(k) & 1 \end{pmatrix}.
\]
Recall $F_1$ and $F_2$ from Eq. (A.25). By the fact $W(e(\cdot,k), \phi_2(\cdot,\mu)) = 0$, we have
\[
\begin{pmatrix} D_{11}(k) & D_{12}(k) \\ D_{12}(k) & D_{22}(k) \end{pmatrix} = \partial_x F_1^T F_2 - F_1^T \partial_x F_2
= \begin{pmatrix} 1 & -k_1(k) \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} D_{11}(k) - \frac{D_{12}^2(k)}{D_{22}(k)} & 0 \\ 0 & D_{22}(k) \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ -k_1(k) & 1 \end{pmatrix}.
\]
This equality implies that $D_{12}(k) + k_1(k)D_{22}(k) = 0$, or equivalently $k_1(k) = -\frac{D_{12}(k)}{D_{22}(k)}$. By Eq. (A.35),
\[
\psi_1(x,k) - \frac{D_{12}(k)}{D_{22}(k)}\phi_1(x,\mu) = s_1(k)\psi_2(-x,k) + s_1(k)k_1(k)\psi_1(-x,k) + s_1(k)k_2(k)\phi_2(x,\mu).
\]
For $s_1(k)$, we have
\[
2ik\,s_1(k) = W\!\left(\psi_1(\cdot,k) - \frac{D_{12}(k)}{D_{22}(k)}\phi_1(\cdot,\mu),\ \psi_1(-\cdot,k)\right) = D_{11}(k) - \frac{D_{12}^2(k)}{D_{22}(k)}.
\]
Since the right-hand side is meromorphic (it is a quotient of analytic functions), $s_1(k)$ is meromorphic. Thus, we have proved Eq. (A.31) if $\lambda \in \Omega_2$. Since $D_{22}(k) = 0$ only on a discrete subset of $\Omega$ and $\eta(x,k)$ and $s_1(k)$ are meromorphic functions, Eq. (A.31) holds whenever the function $e(x,k)$ is well defined.

Since $1 + R^+(\lambda)W$ is not invertible at $\lambda_1$, we expect that $[1 + R^+(\lambda)W]^{-1}\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}$ blows up in some sense at $\lambda = \lambda_1$. We want to determine the nature of this blow-up.

Proposition A.18. If $1 + R^+(\lambda)W$ is not invertible at some point $\lambda_1 > \beta$, then we have
\[
s_1(\sqrt{\lambda_1 - \beta}) = 0, \qquad [1 + R^+(\lambda_1)W]\eta(\cdot, \sqrt{\lambda_1 - \beta}) = 0, \tag{A.36}
\]
and
\[
\eta(\cdot, \sqrt{\lambda_1 - \beta}) \in L^{-\alpha/2},
\]
where $\alpha$ is given in Eq. (A.3), $L^{\gamma} := e^{-\frac{\gamma}{4}|x|}L^2$, and we recall the definition of the function $\eta$ from Eq. (A.32) and the fact that $\lambda_1$ is not an eigenvalue of $L(\lambda)$.

Proof. (1) We first prove the fact $\eta(\cdot, \sqrt{\lambda_1 - \beta}) \in L^{-\alpha/2}$. The analytic function $W(\eta(\cdot,k), \phi_2(\cdot,\mu))$ vanishes for $\lambda \in \Omega_1 \setminus \{\lambda_1\}$, and therefore for any $\lambda \in \Omega$. By Lemma A.9, this implies that
\[
\eta(\cdot,k) = k_1\psi_1(-\cdot,k) + k_2\psi_2(-\cdot,k) + k_3\phi_2(\cdot,\mu)
\]
for some $k_n$, $n = 1, 2, 3$. Thus if $\lambda > \beta$, then $\eta$ is bounded at $-\infty$. By the definition of $\eta(\cdot,k)$, it is bounded at $+\infty$. Therefore, $\eta(\cdot,k) \in L^\infty \subset L^{-\alpha/2}$ if $\lambda > \beta$.

(2) Now we prove $(1 + R^+(\lambda_1)W)\eta(\cdot, \sqrt{\lambda_1 - \beta}) = 0$. If $1 + R^+(\lambda_1)W$ is not invertible, then there exists a function $g \in L^{-\alpha/2}$ satisfying
\[
(1 + R^+(\lambda_1)W)g = 0
\]
and
\[
g = z_1\psi_1(\cdot, \sqrt{\lambda_1 - \beta}) + z_2\phi_1
\]
for some constants $z_1$ and $z_2$, by an argument similar to the one leading to Eq. (A.34). Since $D_{22}(\sqrt{\lambda_1 - \beta}) \neq 0$, we have $z_1 \neq 0$. Thus, without loss of generality, we assume $z_1 = 1$. Then
\[
\eta(\cdot, \sqrt{\lambda_1 - \beta}) - g = -\left(z_2 + \frac{D_{12}(\sqrt{\lambda_1 - \beta})}{D_{22}(\sqrt{\lambda_1 - \beta})}\right)\phi_1.
\]
Using the Wronskian function, we obtain
\[
-D_{22}(\sqrt{\lambda_1 - \beta})\left(z_2 + \frac{D_{12}(\sqrt{\lambda_1 - \beta})}{D_{22}(\sqrt{\lambda_1 - \beta})}\right) = W(\eta(\cdot, \sqrt{\lambda_1 - \beta}) - g,\ \phi_2(\cdot, \sqrt{\beta + \lambda_1})) = 0.
\]
Since $D_{22}(\sqrt{\lambda_1 - \beta}) \neq 0$, as proven in Lemma A.13, we have that
\[
z_2 = -\frac{D_{12}(\sqrt{\lambda_1 - \beta})}{D_{22}(\sqrt{\lambda_1 - \beta})},
\]
which implies $g = \eta(\cdot, \sqrt{\lambda_1 - \beta})$.

(3) The equation $s_1(\sqrt{\lambda_1 - \beta}) = 0$ follows from the following three facts:
\[
(1 + R^+(\lambda)W)\eta(\cdot,k) = s_1(k)\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix},
\]
which follows from Eqs. (5.33) and (A.36);
\[
\lim_{k\to\sqrt{\lambda_1 - \beta}} \eta(\cdot,k) = \eta(\cdot, \sqrt{\lambda_1 - \beta}) \quad\text{in } L^{-\alpha/2},
\]
which can be proved by Lemma A.4 and the Dominated Convergence Theorem; and Eq. (A.36).

In the following lemma, we prove that $s_1(k)$ cannot be zero:

Lemma A.19. There exist functions $s$, $a$, $b$, $r$ such that
\[
e(x,k) = s(k)\psi_1(x,k) + a(k)\phi_1(x,\mu) = b(k)\phi_2(x,\mu) + \psi_2(-x,k) + r(k)\psi_1(-x,k),
\]
where $|s(k)|^2 + |r(k)|^2 = 1$ and $\bar{s}r + \bar{r}s = 0$.

Proof. By Lemmas A.9 and A.15, there exist functions $s$, $a$, $b$, $c$ and $r$ such that
\[
e(x,k) = s(k)\psi_1(x,k) + a(k)\phi_1(x,\mu) = b(k)\phi_2(x,\mu) + c(k)\psi_2(-x,k) + r(k)\psi_1(-x,k).
\]
One can show that $c = 1$ by an expansion at $-\infty$ similar to Eq. (A.34). We divide the proof into two cases: $s(k) = 0$ and $s(k) \neq 0$.
(1) If $s(k) = 0$, then $a(k) \neq 0$, which implies that
\[
\frac{e(x,k)}{a(k)} = \phi_1(x,\mu).
\]
By Corollary A.12,
\[
-\sigma_3\,\overline{\left(\frac{e(x,k)}{a(k)}\right)} = \frac{e(x,k)}{a(k)}.
\]
Thus,
\[
-\frac{1}{\bar{a}(k)} = \frac{r(k)}{a(k)},
\]
which implies $|r(k)| = 1$. This proves the lemma when $s(k) = 0$.

(2) When $s(k) \neq 0$: it is easy to see that
\[
e(-x,k) = \psi_2(x,k) + r(k)\psi_1(x,k) + b(k)\phi_1(x,\mu) = s(k)\psi_1(-x,k) + a(k)\phi_2(x,\mu)
\]
is a solution to $(H - \lambda)\varphi = 0$. There exist $b_1$, $b_2$ such that
\[
\sigma_3\bar{e}(-x,k) = \psi_1(x,k) + \bar{r}(k)\psi_2(x,k) + b_1(k)\phi_1(x,\mu) = \bar{s}(k)\psi_2(-x,k) + b_2(k)\phi_2(x,\mu)
\]
satisfies $(H_0 - \lambda + W)\sigma_3\bar{e}(\cdot,k) = 0$. Therefore,
\[
e(x,k) = \frac{1}{\bar{s}(k)}\sigma_3\bar{e}(-x,k) + \frac{r(k)}{s(k)}e(-x,k) + d(k)\phi_1(x,\mu) \tag{A.37}
\]
for some $d(k)$. We claim that if $d(k) \neq 0$, then $D_{22}(k) = 0$. Indeed, we already know that $e(\cdot,k) \in L^{-\alpha/2}$, which implies that $\phi_1(\pm\cdot,\mu) \in L^{-\alpha/2}$ if $d(k) \neq 0$. Therefore, by Lemma A.9, $D_{22}(k) = W(\phi_1(\cdot,\mu), \phi_1(-\cdot,\mu)) = 0$.

By the discussion above, if $D_{22}(k) \neq 0$, then $d(k) = 0$. Comparing the coefficients of $\psi_1(x,k)$ and $\psi_2(x,k)$ on both sides, Eq. (A.37) then implies
\[
\frac{r^2(k)}{s(k)} + \frac{1}{\bar{s}(k)} = s(k) \quad\text{and}\quad \frac{\bar{r}(k)}{\bar{s}(k)} = -\frac{r(k)}{s(k)},
\]
or
\[
|s(k)|^2 + |r(k)|^2 = 1 \quad\text{and}\quad s(k)\bar{r}(k) + \bar{s}(k)r(k) = 0.
\]
Since $s$ and $r$ are meromorphic functions of $k$, and $D_{22}(k) = 0$ only at discrete points, the formula holds for any $k$.
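The two relations of Lemma A.19 have a familiar scalar counterpart, which may help to see what they encode. The sketch below assumes the textbook scalar problem −u'' + c δ(x) u = k² u (a delta potential of strength c, not the operator H of this paper), for which the transmission and reflection amplitudes are t = 2ik/(2ik − c) and r = c/(2ik − c); here t plays the role of s(k). The function name `delta_scattering` is hypothetical.

```python
def delta_scattering(k, c):
    """Transmission and reflection amplitudes for -u'' + c*delta(x)*u = k^2 u.

    Derived from continuity of u at 0 (1 + r = t) and the jump condition
    u'(0+) - u'(0-) = c*u(0) for an incoming wave exp(ikx) from the left.
    """
    denom = 2j * k - c
    t = 2j * k / denom
    r = c / denom
    return t, r


def unitarity_defects(k, c):
    """Return (| |t|^2 + |r|^2 - 1 |, | conj(t) r + conj(r) t |) for the toy model."""
    t, r = delta_scattering(k, c)
    norm_defect = abs(abs(t) ** 2 + abs(r) ** 2 - 1.0)
    cross_defect = abs(t.conjugate() * r + r.conjugate() * t)
    return norm_defect, cross_defect
```

In this toy model both defects vanish up to rounding for every real k > 0, and t(k) tends to 0 as k tends to 0 whenever c is nonzero, which is reminiscent of the statement e(·, 0) = 0 in Proposition A.20.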
Proof of Theorem A.10. By Lemma A.19 and the proof of Proposition A.17, we have
\[
s(k) = \frac{1}{s_1(k)} = i\,\frac{2D_{22}(k)k}{\det D(k)} \tag{A.38}
\]
and
\[
|s(k)|^2 = \left|\frac{1}{s_1(k)}\right|^2 \le 1. \tag{A.39}
\]
Equation (A.39) implies that $s_1(\sqrt{\lambda_1 - \beta}) \neq 0$. Hence, the operator $1 + R^+(\lambda)W$ must be invertible at the point $\lambda_1 > \beta$; otherwise there is a contradiction with Proposition A.18. By Lemma A.19 and the proof of Proposition A.17, we obtain
\[
a(k) = -s(k)\frac{D_{12}(k)}{D_{22}(k)} = -\frac{2ikD_{12}(k)}{\det D(k)}. \tag{A.40}
\]
Moreover $s$, $a$, $b$, $r$ are meromorphic functions of $k$. Since $\det D(0) \neq 0$, they are analytic functions of $k$ in a neighborhood of 0.

Proposition A.20. If $H$ has no resonance at $\beta$, then $e(\cdot,0) = 0$.

Proof. Recall the fact that $\det D(0) \neq 0$ from Theorem A.8. By Lemma A.19 and Eqs. (A.38) and (A.40), we have
\[
e(x,k) = \frac{2ikD_{22}(k)}{\det D(k)}\psi_1(x,k) - \frac{2ikD_{12}(k)}{\det D(k)}\phi_1(x,\mu).
\]
Hence, $e(x,0) = 0$.

A.4. Estimates on e(x, k)

In this subsection, we estimate the eigenfunctions
\[
e(x,k) = [1 + R^+(\lambda)W]^{-1}\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}
\]
for all $\lambda > \beta$, which are well defined as proved in the last subsection.

Theorem A.21. If $\lambda \ge \beta$, then
\[
\left\|\frac{d^n}{dk^n}(e(x,k)/k)\right\|_{L^2(dk)} \le c(1 + |x|)^{n+1}, \qquad
\left\|\frac{d^n}{dk^n}(e(x,k)/k)\right\|_{L^\infty(dk)} \le c(1 + |x|)^{n+1},
\]
\[
\|f^{\#}\|_{H^2} \le c\|\rho_{-2}f\|_2, \qquad
\left\|\frac{d^n}{dk^n}f^{\#}\right\|_{\infty} \le c\|\rho_{-n}f\|_1,
\]
where $n = 0, 1, 2$, the constant $c$ is independent of $x$, and, recall, $\rho_\nu = (1 + |x|)^{-\nu}$.
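The estimates in Theorem A.21 will be proved in Propositions A.22 and A.24 and Corollary A.25.

Proposition A.22.
\[
\left\|\frac{d^n}{dk^n}e(x,\cdot)\right\|_{L^\infty(dk)} \le c(1 + |x|)^n, \qquad
\left\|\frac{d^n}{dk^n}(e(x,\cdot)/k)\right\|_{L^\infty(dk)} \le c(1 + |x|)^{n+1},
\]
where $c$ is a constant independent of $x$, $n$ and $\lambda$, with $n = 0, 1, 2, 3$ and $\lambda > \beta > 0$.

Proof. Since we proved $e(x,0) = 0$, the norm $\left\|\frac{d^n}{dk^n}(e(x,k)/k)\right\|_{L^\infty(dk)}$ can be estimated by $\left\|\frac{d^{n+1}}{dk^{n+1}}e(x,k)\right\|_{L^\infty(dk)}$.

We divide the domain of $\lambda$ into two regions, $\lambda > \beta + \epsilon_0$ and $\beta \le \lambda \le \beta + \epsilon_0$, where $\epsilon_0$ is a small positive number to be specified later.

(1) If $\lambda > \beta + \epsilon_0$, then
\[
e(x,k) = \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix} - R^+(\lambda)We(x,k)
= \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix} - \int_{-\infty}^{\infty} G_k^+(x,y)W(y)e(y,k)\,dy,
\]
where $G_k^+$ is the kernel from Eq. (A.29). We estimate $\|e(x,k)\|_{L^\infty(dx)}$ by $\|e(\cdot,k)\|_{L^{-\alpha/2}}$:
\[
\|e(\cdot,k)\|_{L^\infty(dx)} \le 1 + \frac{c}{|k|}\left(\int_{-\infty}^{\infty} |W(y)e^{\frac{\alpha}{2}|y|}|^2\,dy\right)^{1/2}\|e(\cdot,k)\|_{L^{-\alpha/2}(dx)},
\]
where the constant $c$ is independent of $\lambda$. $\forall\epsilon_0 > 0$, there exists a constant $c(\epsilon_0) > 0$ such that if $\lambda > \beta + \epsilon_0$, then
\[
\|(1 + R^+(\lambda)W)^{-1}\|_{L^{-\alpha/2}\to L^{-\alpha/2}} \le c(\epsilon_0).
\]
Thus, if $\lambda > \beta + \epsilon_0$, then we have $\|e^{-\frac{\alpha}{2}|\cdot|}e(\cdot,k)\|_{L^2(dx)} \le c(\epsilon_0)$. Hence,
\[
\|e(\cdot,k)\|_{L^\infty(dx)} \le c(\epsilon_0).
\]
For $\frac{d}{dk}e(x,k)$, Fubini's Theorem is needed to justify the following computation; since this justification is tedious but not hard, we omit it.
\[
\frac{d}{dk}e(x,k) = ix\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix} - \int_{-\infty}^{\infty} A(x,y,k)W(y)e(y,k)\,dy - \int_{-\infty}^{\infty} G_k^+(x,y)W(y)\frac{d}{dk}e(y,k)\,dy,
\]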
where
\[
A(x,y,k) := \begin{pmatrix} \dfrac{ik|x-y|e^{ik|x-y|} - ke^{ik|x-y|}}{2k} & 0 \\ 0 & -\dfrac{ik|x-y|e^{-\mu|x-y|} - ike^{-\mu|x-y|}}{2i\mu^3} \end{pmatrix}.
\]
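Similar reasoning proves that if $\lambda > \beta + \epsilon_0$, then
\[
\left\|\frac{d^n}{dk^n}e(x,\cdot)\right\|_{L^\infty(dk)} \le c(\epsilon_0)(1 + |x|)^n,
\]
where the constant $c$ is independent of $x$, and $n = 0, 1, 2, 3$.

(2) After estimating $e(x,k)$ in the region $\lambda \ge \beta + \epsilon_0$, we consider $\beta \le \lambda \le \beta + \epsilon_0$. We choose a small $\epsilon_0$ such that if $\lambda - \beta \le \epsilon_0$, then $\det D(k) \neq 0$. Recall that when we estimate the functions $\phi_1$, $\phi_2$, $\psi_1$, $\psi_2$, we always divide the domain $(-\infty, \infty)$ into two regions, $(-\infty, 0]$ and $[0, \infty)$. We use the same strategy to estimate $e(x,k)$, since $e(x,k)$ is a linear combination of them. We also use Lemma A.19, in which we proved that if $x \in [0, +\infty)$,
\[
e(x,k) = \frac{-2ikD_{22}(k)}{\det D(k)}\psi_1(x,k) + \frac{2ikD_{12}(k)}{\det D(k)}\phi_1(x,\mu).
\]
If $x \in [0, +\infty)$, then by Theorem A.1 we have
\[
\left\|\frac{d^n}{dk^n}e(x,k)\right\|_{L^\infty(dk)} \le c(1 + |x|)^n
\]
for $n = 0, 1, 2, 3$. Similarly, if $x \in (-\infty, 0]$, then
\[
\left\|\frac{d^n}{dk^n}e(x,k)\right\|_{L^\infty(dk)} \le c(1 + |x|)^n
\]
for $n = 0, 1, 2$.

Conclusion: there exists a constant $c > 0$ such that if $\lambda \ge \beta$, then
\[
\left\|\frac{d^n}{dk^n}e(x,k)\right\|_{L^\infty(dk)} \le c(1 + |x|)^n,
\]
where $n = 0, 1, 2, 3$.

In the following lemma, we make some approximations of $e(x,k)$.

Lemma A.23. If $x \ge 0$ and $\lambda > \beta$, then there exists a function $s_2$ such that
\[
\left|\frac{d^n}{dk^n}\left(e(x,k) - s_2(k)\begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right)\right| \le c\,\frac{1}{1 + |k|}\,e^{-\epsilon_0|x|}. \tag{A.41}
\]
If $x \le 0$ and $\lambda > \beta$, then there exists a function $\gamma_2(k)$ such that
\[
\left|\frac{d^n}{dk^n}\left(e(x,k) - \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix} - \gamma_2(k)\begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\right)\right| \le c\,\frac{1}{1 + |k|}\,e^{-\epsilon_0|x|}.
\]
Also $\left|\frac{d^n}{dk^n}s_2(k)\right|, \left|\frac{d^n}{dk^n}\gamma_2(k)\right| \le c$. The constants $c$, $\epsilon_0$ do not depend on $x$ and $k$, and $n = 0, 1, 2$.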
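Proof. We only prove estimate (A.41); the proof of the second one is similar. As in the proof of Proposition A.22, we divide the domain of $\lambda$ into two regions, $\lambda > \beta + \epsilon_0$ and $\beta \le \lambda \le \beta + \epsilon_0$.

(1) When $\lambda > \beta + \epsilon_0$, we start with the definition of $e(x,k)$:
\[
\begin{aligned}
e(x,k) - \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}
&= -\frac{1}{2k}\int_{-\infty}^{\infty} \begin{pmatrix} e^{ik|x-y|} & 0 \\ 0 & -\frac{k}{\mu}e^{-\mu|x-y|} \end{pmatrix} W(y)e(y,k)\,dy \\
&= -\frac{1}{2k}\int_{-\infty}^{\infty} \begin{pmatrix} 0 & 0 \\ 0 & -\frac{k}{\mu}e^{-\mu|x-y|} \end{pmatrix} W(y)e(y,k)\,dy \\
&\quad - \frac{1}{2k}e^{ikx}\int_{-\infty}^{\infty} \begin{pmatrix} e^{-iky} & 0 \\ 0 & 0 \end{pmatrix} W(y)e(y,k)\,dy \\
&\quad - \frac{1}{2k}\int_x^{\infty} \begin{pmatrix} 2\sin k(x-y) & 0 \\ 0 & 0 \end{pmatrix} W(y)e(y,k)\,dy.
\end{aligned}
\]
All three terms on the right-hand side are well-behaved functions, so we can use Fubini's Theorem to make the following calculations. For the first term, if $x \in [0, +\infty)$, we have that
\[
\left|\frac{d^n}{dk^n}\int_{-\infty}^{\infty} \begin{pmatrix} 0 & 0 \\ 0 & -\frac{k}{\mu}e^{-\mu|x-y|} \end{pmatrix} W(y)e(y,k)\,dy\right| \le ce^{-\frac{\alpha}{4}|x|}
\]
by Proposition A.22; for the second term,
\[
-\frac{e^{ikx}}{2k}\int_{-\infty}^{\infty} \begin{pmatrix} e^{-iky} & 0 \\ 0 & 0 \end{pmatrix} W(y)e(y,k)\,dy = e^{ikx}\begin{pmatrix} s_2(k) \\ 0 \end{pmatrix},
\]
where the scalar function $s_2$ satisfies the estimate
\[
\left|\frac{d^n}{dk^n}s_2(k)\right| \le c(\epsilon_0)
\]
for all $\lambda > \beta + \epsilon_0$; for the third term, we have
\[
\left|\frac{d^n}{dk^n}\int_x^{\infty} \begin{pmatrix} 2\sin k(x-y) & 0 \\ 0 & 0 \end{pmatrix} W(y)e(y,k)\,dy\right| \le ce^{-\frac{\alpha}{4}|x|}.
\]
All the constants $c$ used above are independent of $k$, $x$ and $n$, where $n = 0, 1, 2$.

(2) We consider the case $\beta \le \lambda \le \beta + \epsilon_0$. Since $\det D(0) \neq 0$, there exist $\epsilon_0, \delta > 0$ such that if $|\lambda - \beta| \le \epsilon_0$, then
\[
|\det D(k)| \ge \delta.
\]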
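If $\beta \le \lambda \le \beta + \epsilon_0$, then estimate (A.41) can be proved by using Lemma A.19 and the fact that if $x > 0$, then by Lemma A.4 and estimate (A.6) we have
\[
\left|\frac{d^n}{dk^n}\left(\psi_1(x,k) - \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\right)\right| \le ce^{-\epsilon_0 x}
\]
for some constant $c$ independent of $x$ and $\lambda$, $n = 0, 1, 2$. Similar estimates can be obtained for $\psi_2(\pm\cdot,k)$. Since we have proved similar estimates many times, we do not go into details here.

We prove the third estimate of Theorem A.21 by using Lemma A.23.

Proposition A.24. $\|f^{\#}\|_{H^2} \le c\|(1 + |\cdot|)^2 f\|_2$, where, recall,
\[
f^{\#}(k) = \int_{-\infty}^{\infty} \bar{e}^{\%}(x,k)\cdot f(x)\,dx.
\]
Proof. First, we decompose the function $f^{\#}$ into two parts:
\[
\left|\frac{d^2}{dk^2}\int_{-\infty}^{\infty} \bar{e}^{\%}(x,k)\cdot f(x)\,dx\right|
\le \left|\frac{d^2}{dk^2}\int_0^{\infty} \bar{e}^{\%}(x,k)\cdot f(x)\,dx\right| + \left|\frac{d^2}{dk^2}\int_{-\infty}^0 \bar{e}^{\%}(x,k)\cdot f(x)\,dx\right|.
\]
By Lemma A.23, we have
\[
\begin{aligned}
\left|\frac{d^2}{dk^2}\int_0^{\infty} \bar{e}^{\%}(x,k)\cdot f(x)\,dx\right|
&\le \left|\frac{d^2}{dk^2}\int_0^{\infty} \left[\bar{e}^{\%}(x,k) - \bar{s}_2(k)\begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\right]\cdot f(x)\,dx\right|
 + \left|\frac{d^2}{dk^2}\int_0^{\infty} \bar{s}_2(k)\begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\cdot f(x)\,dx\right| \\
&\le c\int_0^{\infty} \frac{e^{-\epsilon_0|x|}}{1 + |k|}|f(x)|\,dx + c\left|\int_0^{+\infty} \begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\cdot(1 + |x|)^2 f(x)\,dx\right| \\
&\le c\,\frac{1}{1 + |k|}\|f\|_2 + c\left|\int_0^{+\infty} \begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\cdot(1 + |x|)^2 f(x)\,dx\right|.
\end{aligned}
\]
Similarly,
\[
\left|\frac{d^2}{dk^2}\int_{-\infty}^0 \bar{e}^{\%}(x,k)\cdot f(x)\,dx\right|
\le c\,\frac{1}{1 + |k|}\|f\|_2 + c\left|\int_{-\infty}^0 \begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\cdot(1 + |x|)^2 f(x)\,dx\right| + c\left|\int_{-\infty}^0 \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\cdot(1 + |x|)^2 f(x)\,dx\right|.
\]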
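Combining the two parts, we have
\[
\left|\frac{d^2}{dk^2}\int_{-\infty}^{\infty} \bar{e}^{\%}(x,k)\cdot f(x)\,dx\right|
\le c\,\frac{1}{1 + |k|}\|f\|_2 + c\left|\int_{-\infty}^{\infty} \begin{pmatrix} e^{-ikx} \\ 0 \end{pmatrix}\cdot(1 + |x|)^2 f(x)\,dx\right| + c\left|\int_{-\infty}^0 \begin{pmatrix} e^{ikx} \\ 0 \end{pmatrix}\cdot(1 + |x|)^2 f(x)\,dx\right|.
\]
Thus, we conclude that
\[
\left\|\frac{d^2}{dk^2}f^{\#}\right\|_2 \le c\|(1 + |\cdot|)^2 f\|_2.
\]
It is easier to prove $\|f^{\#}\|_2 \le c\|f\|_2$, which together with the estimate above implies
\[
\|f^{\#}\|_{H^2} \le c\|(1 + |x|)^2 f\|_2.
\]
Corollary A.25.
\[
\left\|\frac{d^n}{dk^n}(e(x,k)/k)\right\|_{L^2(dk)} \le c(1 + |x|)^{n+1}, \qquad
\left\|\frac{d^n}{dk^n}f^{\#}\right\|_{\infty} \le c\|(1 + |x|)^n f\|_1,
\]
where $n = 0, 1, 2$.

Proof. When $|k| \le 1$, by Proposition A.22, we have
\[
\left|\frac{d^n}{dk^n}(e(x,k)/k)\right| \le c(1 + |x|)^{n+1}.
\]
When $k > 1$, by Proposition A.22, we have
\[
\left|\frac{d^n}{dk^n}(e(x,k)/k)\right| \le c\,\frac{1}{1 + |k|}(1 + |x|)^n.
\]
Therefore,
\[
\left\|\frac{d^n}{dk^n}(e(x,k)/k)\right\|_{L^2(dk)} \le c(1 + |x|)^{n+1}.
\]
Recall $f^{\#}(k) = \langle e^{\%}(\cdot,k), f\rangle$. By Proposition A.22,
\[
\left|\frac{d^n}{dk^n}f^{\#}(k)\right| \le c\int \left|\frac{d^n}{dk^n}e^{\%}(x,k)\right||f(x)|\,dx \le c\int (1 + |x|)^n|f(x)|\,dx.
\]
This completes the proof of Theorem A.21.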
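The passage from the two pointwise bounds to the $L^2(dk)$ bound in the proof of Corollary A.25 above can be spelled out as follows. This is a sketch of the elementary integration step only; it assumes that $k$ ranges over the whole real line (restricting to $k \ge 0$ only changes the numerical constants) and writes $c$ for the constant appearing in the two pointwise bounds.

```latex
\left\| \frac{d^n}{dk^n}\,\frac{e(x,k)}{k} \right\|_{L^2(dk)}^2
\le \int_{|k|\le 1} c^2 (1+|x|)^{2(n+1)}\,dk
  + \int_{|k|>1} \frac{c^2 (1+|x|)^{2n}}{(1+|k|)^2}\,dk
= 2c^2 (1+|x|)^{2(n+1)} + c^2 (1+|x|)^{2n}
\le 3c^2 (1+|x|)^{2(n+1)},
\qquad\text{since}\quad
\int_{|k|>1} \frac{dk}{(1+|k|)^2} = 2\int_1^\infty \frac{dk}{(1+k)^2} = 1 .
```

Taking square roots gives the stated bound with the constant $\sqrt{3}\,c$ in place of $c$.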
Appendix B. Proof of Statements (A) and (B) of Proposition 2.5

In this appendix, we prove Proposition 2.5 translated into the context of the operator $H$, i.e. for the family of operators $H(W) := H_0 + W$, where the operators $H_0$, $W$ are defined in Eq. (5.8).

Proof of (A). Suppose that for some $W_0$, statements (SB) and (SC) hold. We use the notation and estimates from Sec. A.1. The Wronskian depends on the potential $W$ and we display this dependence explicitly by writing $D(k,W)$ for $D(k)$. By definitions (A.25) and (A.26), $\det D(0,W)$ is a continuous functional of the functions $\frac{d^n}{dx^n}\phi_1(x,\sqrt{2\beta},W)|_{x=0}$ and $\frac{d^n}{dx^n}\psi_1(x,0,W)|_{x=0}$, where $n = 0, 1$. By estimate (A.12), the last two functions are continuous in the variable $W$. Thus, if $\det D(0,W_0) \neq 0$ for some $W_0$, then there exists a constant $\epsilon > 0$ such that if the function $W$ satisfies
\[
\|e^{\alpha|x|}(W - W_0)\|_{L^\infty} \le \epsilon,
\]
then $\det D(0,W) \neq 0$. By Theorem A.8, $H(W)$ has no resonance at the point $\beta$. This completes the proof of (A).

Proof of (B). We fix the function $W$. It is not hard to prove that the functions $\frac{d^n}{dx^n}\phi_1(x,\sqrt{2\beta},sW)$, $\frac{d^n}{dx^n}\psi_1(x,0,sW)$ $(n = 0, 1)$ are analytic in the variable $s \in \mathbb{C}$. Therefore, the function $\det D(0,sW)$ is analytic in $s$ as well. Thus, $\det D(0,sW)$ is either identically zero or vanishes at most on a discrete set of $s$. We only need to prove that the first case does not occur if $\int_{-\infty}^{\infty} V_3 \neq 0$, where, recall, $V_3 = V_1 + V_2$ from (5.6). The proof is based on the following facts for sufficiently small $s$:

(I) The Wronskian function $W(\phi_1(\cdot,\sqrt{2\beta},sW), \phi_1(-\cdot,\sqrt{2\beta},sW)) \neq 0$, which will be proved in Lemma B.2 below;

(II) As shown in Lemma B.1 below, there exists a solution $\varphi_1(x,sW)$ with the following properties:

(DI) for each $s$ there exist constants $c_1(s)$ and $c_2(s)$ such that $c_1(s) \neq 0$ if $\int_{-\infty}^{\infty} V_3 \neq 0$ and
\[
\varphi_1(x,sW) - c_1(s)\begin{pmatrix} x \\ 0 \end{pmatrix} - c_2(s)\begin{pmatrix} 1 \\ 0 \end{pmatrix}
\]
decays exponentially fast at $-\infty$;

(DII) for any $x \in \mathbb{R}$, we have $|\varphi_1(x,sW)| \le c(1 + |x|)$. The function
\[
\varphi_1(x,sW) - \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{B.1}
\]
decays exponentially fast at $+\infty$.

Given these facts, we see that
\[
W(\phi_1(\cdot,\sqrt{2\beta},sW), \phi_1(-\cdot,\sqrt{2\beta},sW)) \neq 0, \qquad W(\varphi_1(\cdot,sW), \varphi_1(-\cdot,sW)) \neq 0,
\]
\[
W(\varphi_1(\cdot,sW), \phi_1(\pm\cdot,\sqrt{2\beta},sW)) = 0.
\]
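By Eq. (B.1), $\varphi_1(x,sW) = \psi_1(x,0,sW) + c_3(s)\phi_1(x,\sqrt{2\beta},sW)$ for some constant $c_3(s)$; together with the definition of $D(0) \equiv D(0,sW)$ from Eq. (A.26), this gives $\det D(0,sW) \neq 0$. This implies that the operator $H(sW)$ can have resonances only at discrete values of $s$. Statement (B) is thus proved, assuming Lemmas B.1 and B.2 below.

Lemma B.1. There exists a solution $\varphi_1(\cdot,sW)$ of the equation $[H(sW) - \beta]\varphi_1(\cdot,sW) = 0$ with the properties (DI) and (DII).

Proof. Recall the definition $H(sW) = H_0 + sW$, where
\[
H_0 = \begin{pmatrix} -\frac{d^2}{dx^2} + \beta & 0 \\ 0 & \frac{d^2}{dx^2} - \beta \end{pmatrix}, \qquad
W = \frac{1}{2}\begin{pmatrix} V_3 & -iV_4 \\ -iV_4 & -V_3 \end{pmatrix},
\]
and the functions $V_3$, $V_4$ are smooth, even, and decay exponentially fast at $\infty$. Rewrite the equation for
\[
\varphi_1(\cdot,sW) =: \begin{pmatrix} \varphi_{11}(\cdot,sW) \\ \varphi_{12}(\cdot,sW) \end{pmatrix}
\]
as
\[
\begin{pmatrix} \varphi_{11}(\cdot,sW) \\ \varphi_{12}(\cdot,sW) \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{s}{2}
\begin{pmatrix} \int_x^{\infty}\int_y^{\infty} \left[V_3(t)\varphi_{11}(t,sW) - iV_4(t)\varphi_{12}(t,sW)\right]dt\,dy \\[4pt]
\left(-\frac{d^2}{dx^2} + 2\beta\right)^{-1}\left(iV_4\varphi_{11}(\cdot,sW) + V_3\varphi_{12}(\cdot,sW)\right) \end{pmatrix}. \tag{B.2}
\]
The proof of the existence of $\varphi_1(\cdot,sW)$ and of the bound $|\varphi_1(x,sW)| \le c(1 + |x|)$ is easy because, when $s$ is small, we can use the contraction lemma. We do not go into the details because we have solved similar problems many times.

Since the Wronskian function $W(\varphi_1(x,sW), \varphi_1(-x,sW))$ is independent of $x$ and analytic in $s$, we expand it in the variable $s$. From Eq. (B.2) we compute
\[
\varphi_1(x,sW) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{s}{2}
\begin{pmatrix} \int_x^{\infty}\int_y^{\infty} V_3(t)\,dt\,dy \\[4pt]
\left(-\frac{d^2}{dx^2} + 2\beta\right)^{-1} iV_4 \end{pmatrix} + O(s^2).
\]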
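Thus
\[
\begin{aligned}
W(\varphi_1(\cdot,sW), \varphi_1(-\cdot,sW))
&= \frac{d}{dx}\varphi_1^T(x,sW)\,\varphi_1(-x,sW) - \varphi_1^T(x,sW)\,\frac{d}{dx}\varphi_1(-x,sW) \\
&= -s\int_x^{\infty} V_3(t)\,dt - s\int_{-x}^{\infty} V_3(t)\,dt + O(s^2) \\
&= -s\int_x^{\infty} V_3(t)\,dt - s\int_{-\infty}^{x} V_3(-t)\,dt + O(s^2) \\
&= -s\int_{-\infty}^{+\infty} V_3 + o(s^2) + O(s^2).
\end{aligned}
\]
Thus $c_1(s) \neq 0$ if $s$ is sufficiently small and $\int_{-\infty}^{+\infty} V_3 \neq 0$; otherwise
\[
W(\varphi_1(\cdot,sW), \varphi_1(-\cdot,sW)) = 0.
\]
Lemma B.2. If $s$ is sufficiently small, then
\[
W(\phi_1(\cdot,\sqrt{2\beta},sW), \phi_1(-\cdot,\sqrt{2\beta},sW)) \neq 0.
\]
Proof. For $n = 0, 1$, by Proposition A.3, we have that as $s \to 0$,
\[
\frac{d^n}{dx^n}\phi_1(\cdot,\sqrt{2\beta},sW) \to \frac{d^n}{dx^n}\phi_1(\cdot,\sqrt{2\beta},0)
\]
in the $L^\infty([0,\infty))$ norm. Obviously
\[
\phi_1(x,\sqrt{2\beta},0) = \begin{pmatrix} 0 \\ e^{-\sqrt{2\beta}x} \end{pmatrix}.
\]
Using the Wronskian function again, we have, as $s \to 0$,
\[
\begin{aligned}
W(\phi_1(\cdot,\sqrt{2\beta},s), \phi_1(-\cdot,\sqrt{2\beta},s))
&= \left.\left[\frac{d}{dx}\phi_1^T(x,\sqrt{2\beta},s)\,\phi_1(-x,\sqrt{2\beta},s) - \phi_1^T(x,\sqrt{2\beta},s)\,\frac{d}{dx}\phi_1(-x,\sqrt{2\beta},s)\right]\right|_{x=0} \\
&\to -2\sqrt{2\beta}.
\end{aligned}
\]
Thus, the proof of the lemma is complete.

Appendix C. Proof of Proposition 3.3

In this appendix, we prove Proposition 3.3 in a more general setting. We base our arguments on a general form of the operator $L_{general}$ given in Sec. 5.1.

Lemma C.1. For any constant $\lambda_0 \in (-i\infty, -i\beta] \cup [i\beta, i\infty)$, the operator-valued function $(L_{general} - \lambda_0 + z)^{-1}$ is an analytic function of $z$ in a small neighborhood of 0 with 0 deleted. Furthermore,
\[
(L(\lambda) - \lambda_0 + z)^{-1} = \sum_{n=m_0}^{+\infty} z^n K_n,
\]
where $m_0 > -\infty$ is an integer and the $K_n$'s are bounded operators.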
Proof. Recall $L_{general} = L_0 + U$, where
\[
L_0 := \begin{pmatrix} 0 & -\frac{d^2}{dx^2} + \beta \\ \frac{d^2}{dx^2} - \beta & 0 \end{pmatrix}, \qquad
U := \begin{pmatrix} 0 & V_1 \\ -V_2 & 0 \end{pmatrix},
\]
$\beta$ is a positive constant, and $V_1$ and $V_2$ are smooth functions decaying exponentially fast at $\infty$. Since it is hard to obtain the Laurent series for $(L_{general} - \lambda_0 + z)^{-1}$ directly, we make a transformation:
\[
(L(\lambda) - \lambda_0 + z)^{-1} = (1 + (L_0 - \lambda_0 + z)^{-1}U)^{-1}(L_0 - \lambda_0 + z)^{-1}.
\]
We expand each factor on the right-hand side. The operator family $(L_0 - \lambda_0 + z)^{-1}$ is analytic for $z$ sufficiently small, so there exist bounded operators $K_{1,n}$ $(n = 0, 1, \ldots)$ such that
\[
(L_0 - \lambda_0 + z)^{-1} = \sum_{n=0}^{\infty} z^n K_{1,n}.
\]
For $z$ sufficiently small, $(L_0 - \lambda_0 + z)^{-1}U$ is trace class. Thus by [35, Theorem VI.14],
\[
(1 + (L_0 - \lambda_0 + z)^{-1}U)^{-1} = \sum_{n=m_0}^{\infty} z^n K_{2,n},
\]
where $K_{2,n}$ $(n = m_0, m_0 + 1, \ldots)$ are bounded operators and $m_0 > -\infty$ is an integer. Multiplying the last two series gives the result.

The following is the main theorem of this section.

Proposition C.2. Suppose $A$ is an operator having a complex number $\theta$ as an isolated eigenvalue and
\[
(A - \theta - z)^{-1} = \sum_{n=m}^{+\infty} A_n z^n,
\]
where $m > -\infty$ is an integer and the $A_n$ are bounded operators. Then we have:

(EnA) $A_{-1}$ is a projection operator: $A_{-1} = P_\theta^A$, where
\[
P_\theta^A := \frac{1}{2i\pi}\oint_{|x-\theta|=\epsilon} (A - x)^{-1}\,dx,
\]
and $\epsilon > 0$ is a sufficiently small constant. $\mathrm{Range}\,P_\theta^A$ is the space of eigenvectors and associated eigenvectors of $A$ with the eigenvalue $\theta$, i.e.
\[
\mathrm{Range}\,A_{-1} = \{x \mid (A - \theta)^k x = 0 \text{ for some positive integer } k\}.
\]
If the operator $P_\theta^A$ is finite-dimensional, then the operator $A^*$ has $n$ independent eigenvectors and associated eigenvectors with the eigenvalue $\bar{\theta}$,
(EnB) Let Range $P_\theta^A = \mathrm{Span}\{\xi_1, \ldots, \xi_n\}$ and let $\eta_1, \ldots, \eta_n$ be linearly independent eigenvectors and associated eigenvectors of $A^*$ corresponding to the eigenvalue $\bar\theta$. Then, the $n \times n$ matrix $T = [T_{ij}]$, where $T_{ij} := \langle \eta_i, \xi_j \rangle$, is invertible,

(EnC) The operator $P_\theta^A$ is of the form:
$$P_\theta^A f = (\xi_1, \ldots, \xi_n)\, T^{-1} \begin{pmatrix} \langle \eta_1, f \rangle \\ \vdots \\ \langle \eta_n, f \rangle \end{pmatrix}. \tag{C.1}$$

Proof. The proof of (EnA) is well known (see, e.g. [37, 27]). To prove (EnB), assume $\det T = 0$. Then there exist constants $b_1, \ldots, b_n$ such that $b_1 \eta_1 + \cdots + b_n \eta_n \perp \xi_1, \ldots, \xi_n$ and at least one of the constants $b_1, \ldots, b_n$ is not zero. Since $P_\theta^A f \in \mathrm{Span}\{\xi_1, \ldots, \xi_n\}$ for any vector $f$, we have
$$0 = \langle b_1 \eta_1 + b_2 \eta_2 + \cdots + b_n \eta_n, P_\theta^A f \rangle = \langle P_{\bar\theta}^{A^*}(b_1 \eta_1 + b_2 \eta_2 + \cdots + b_n \eta_n), f \rangle = \langle b_1 \eta_1 + b_2 \eta_2 + \cdots + b_n \eta_n, f \rangle.$$
Thus $b_1 \eta_1 + \cdots + b_n \eta_n = 0$. This contradicts the fact that the vectors $\eta_1, \ldots, \eta_n$ are linearly independent. Therefore, the matrix $T$ is invertible.

For (EnC), we have, for any vector $f$, $P_\theta^A f \in \mathrm{Span}\{\xi_1, \ldots, \xi_n\}$. Therefore, there is an $n \times 1$ scalar matrix $(a_1, \ldots, a_n)^T$ such that $P_\theta^A f = (\xi_1, \ldots, \xi_n)(a_1, \ldots, a_n)^T$. What is left is to compute a concrete form of the $a_i$'s. We have the following formula
$$\langle \eta_i, f \rangle = \langle P_{\bar\theta}^{A^*} \eta_i, f \rangle = \langle \eta_i, P_\theta^A f \rangle,$$
while
$$\langle \eta_i, P_\theta^A f \rangle = (\langle \eta_i, \xi_1 \rangle, \ldots, \langle \eta_i, \xi_n \rangle) \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix},$$
which implies formula (C.1).
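As a sanity check on formula (C.1), the following minimal sketch builds the projection for a small real matrix with exact rational arithmetic. The matrix `A`, the vectors `xis` and `etas`, and the helper names are hypothetical choices made for illustration only; since the data are real, $\bar\theta = \theta$ and no complex conjugation is needed. The case $n = 2$ is hard-coded to keep the inverse of $T$ explicit.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dot(u, v):
    """Real inner product <u, v> (no conjugation needed for real data)."""
    return sum(x * y for x, y in zip(u, v))

def projection_c1(xis, etas):
    """Matrix of f -> (xi_1, xi_2) T^{-1} (<eta_i, f>)_i, i.e. (C.1) with n = 2."""
    # T_ij = <eta_i, xi_j>
    t = [[dot(eta, xi) for xi in xis] for eta in etas]
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    if det == 0:
        raise ValueError("T is singular")
    t_inv = [[t[1][1] / det, -t[0][1] / det],
             [-t[1][0] / det, t[0][0] / det]]
    dim = len(xis[0])
    # P[r][c] = sum over j, i of xi_j[r] * (T^{-1})_{ji} * eta_i[c]
    return [[sum(xis[j][r] * t_inv[j][i] * etas[i][c]
                 for j in range(2) for i in range(2))
             for c in range(dim)] for r in range(dim)]

def frac_rows(rows):
    """Convert integer rows to exact Fractions."""
    return [[Fraction(v) for v in row] for row in rows]

# Hypothetical example: theta = 2 has a Jordan block of size 2, and 5 is simple.
A = frac_rows([[2, 1, 0], [0, 2, 1], [0, 0, 5]])
# Eigenvector and associated eigenvector of A for theta = 2.
xis = frac_rows([[1, 0, 0], [0, 1, 0]])
# Basis of the kernel of (A^T - 2)^2, i.e. of {eta : eta_1 + 3 eta_2 + 9 eta_3 = 0}.
etas = frac_rows([[3, -1, 0], [9, 0, -1]])

P = projection_c1(xis, etas)
```

In this example `P` is idempotent, commutes with `A`, fixes `xis`, and annihilates the eigenvector $(1, 3, 9)$ of `A` for the eigenvalue 5, as one expects of the projection in (EnA).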
Now we prove Proposition 3.3.

Proof of Proposition 3.3. Proposition 3.3 is a direct consequence of Proposition C.2, since we know the eigenvectors of the operator $L(\lambda)$ and correspondingly those of $L^*(\lambda)$.

References

[1] A. Ambrosetti, M. Badiale and S. Cingolani, Semiclassical states of nonlinear Schrödinger equations with bounded potentials, Atti Accad. Naz. Lincei Cl. Sci. Fis. Mat. Natur. Rend. Lincei (9) Mat. Appl. 7(3) (1996) 155–160.
[2] J. C. Bronski and R. L. Jerrard, Soliton dynamics in a potential, Math. Res. Lett. 7(2, 3) (2000) 329–342.
[3] H. Berestycki and P.-L. Lions, Nonlinear scalar field equations, I, Existence of a ground state, Arch. Ration. Mech. Anal. 82(4) (1983) 313–345.
[4] V. S. Buslaev and G. S. Perelman, Scattering for the nonlinear Schrödinger equation: States close to a soliton, St. Petersburg Math. J. 4(6) (1993) 1111–1142.
[5] V. S. Buslaev and G. S. Perelman, Nonlinear scattering: The states which are close to a soliton, J. Math. Sci. 77(3) (1995) 3161–3169.
[6] V. S. Buslaev and C. Sulem, On asymptotic stability of solitary waves for nonlinear Schrödinger equations, Ann. Inst. H. Poincaré Anal. Non Linéaire 20(3) (2003) 419–475.
[7] T. Cazenave, An Introduction to Nonlinear Schrödinger Equations, Textos de Métodos Matemáticos, Vol. 22 (I.M.U.F.R.J., Rio de Janeiro, 1989).
[8] A. Comech, On orbital stability of quasistationary solitary waves of minimal energy, preprint.
[9] A. Comech and D. Pelinovsky, Purely nonlinear instability of standing waves with minimal energy, Comm. Pure Appl. Math. 56(11) (2003) 1565–1607.
[10] S. Cuccagna, D. Pelinovsky and V. Vougalter, Spectra of positive and negative energies in the linearized NLS problem, Comm. Pure Appl. Math. 58(1) (2005) 1–29.
[11] S. Cuccagna, Stabilization of solutions to nonlinear Schrödinger equations, Comm. Pure Appl. Math. 54(9) (2001) 1110–1145.
[12] S. Cuccagna, On asymptotic stability of ground states of NLS, Comm. Pure Appl. Math. 54(2) (2001) 135–152.
[13] S. Cuccagna, On asymptotic stability of ground states of NLS, Rev. Math. Phys. 15(8) (2003) 877–903.
[14] S. Cuccagna and D. Pelinovsky, Bifurcations from the end points of the essential spectrum in the linearized NLS problem, J. Math. Phys. 46 (2005).
[15] P. Deift and X. Zhou, A steepest descent method for oscillatory Riemann–Hilbert problems. Asymptotics for the MKDV equation, Ann. of Math. (2) 137(2) (1993) 295–368.
[16] J. Fröhlich, S. Gustafson, B. L. G. Jonsson and I. M. Sigal, Solitary wave dynamics in an external potential, Comm. Math. Phys. (to appear).
[17] J. Fröhlich, T. P. Tsai and H. T. Yau, On the point-particle (Newtonian) limit of the nonlinear Hartree equation, Comm. Math. Phys. 225(2) (2002) 223–274.
[18] A. Floer and A. Weinstein, Nonspreading wave packets for the cubic Schrödinger equation with a bounded potential, J. Funct. Anal. 69 (1986) 397–408.
[19] J. A. Goldstein, Semigroups of Linear Operators and Applications, Oxford Mathematical Monographs (Oxford University Press, New York, 1985).
[20] M. Goldberg and W. Schlag, Dispersive estimates for Schrödinger operators in dimensions one and three, arXiv:math.AP/0306108 v1 2003.
[21] R. H. J. Grimshaw and D. Pelinovsky, Nonlocal models for envelope waves in a stratified fluid, Stud. Appl. Math. 97(4) (1996) 369–391.
[22] M. Grillakis, J. Shatah and W. Strauss, Stability theory of solitary waves in the presence of symmetry, I, J. Funct. Anal. 74(1) (1987) 160–197.
[23] M. Grillakis, J. Shatah and W. Strauss, Stability theory of solitary waves in the presence of symmetry, II, J. Funct. Anal. 94(2) (1990) 308–348.
[24] S. J. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics (Springer, New York, 2003).
[25] W. Hunziker and I. M. Sigal, The quantum N-body problem, J. Math. Phys. 41(6) (2000).
[26] B. L. G. Jonsson, M. Merkli and I. M. Sigal, Lectures in Applied Analysis, preliminary text, Toronto (2005), www.math.toronto.edu/sigal.
[27] T. Kato, Perturbation Theory for Linear Operators (Springer-Verlag, Berlin/New York, 1984).
[28] S. Keraani, Semiclassical limit of a class of Schrödinger equations with potential, Commun. Partial Differential Equations 27(3, 4) (2002) 693–704.
[29] D. J. Kaup, Closure of the squared Zakharov–Shabat eigenstates, J. Math. Anal. Appl. 54(3) (1976) 849–864.
[30] Y.-G. Oh, Existence of semiclassical bound states of nonlinear Schrödinger equations with potential of the class (Va), Commun. Partial Differential Equations 13(12) (1988) 1499–1519.
[31] Y.-G. Oh, Stability of semiclassical bound states of nonlinear Schrödinger equations with potentials, Commun. Math. Phys. 121 (1989) 11–33.
[32] Y.-G. Oh, Cauchy problem and Ehrenfest’s law of nonlinear Schrödinger equations with potential, J. Differential Equations 81 (1989) 255–274.
[33] G. Perelman, Stability of Solitary Waves for Nonlinear Schrödinger Equation, Séminaire sur les Équations aux Dérivées Partielles (1995–1996), Exp. No. XIII, Sémin. Équ. Dériv. Partielles (École Polytech., Palaiseau, 1996).
[34] J. Rauch, Local decay of scattering solutions to Schrödinger equation, Comm. Math. Phys. 61 (1978) 149–168.
[35] M. Reed and B. Simon, Methods of Modern Mathematical Physics, I, Functional Analysis (Academic Press, 1978).
[36] M. Reed and B. Simon, Methods of Modern Mathematical Physics, II, Fourier Analysis (Academic Press, 1978).
[37] M. Reed and B. Simon, Methods of Modern Mathematical Physics, IV, Analysis of Operators (Academic Press, 1978).
[38] I. Rodnianski, W. Schlag and A. Soffer, Dispersive analysis of charge transfer models, Comm. Pure Appl. Math. 58(2) (2005) 149–216.
[39] I. Rodnianski, W. Schlag and A. Soffer, Asymptotic stability of N-soliton state of NLS, arXiv:math.AP.
[40] W. Schlag, Stable manifold for orbitally unstable NLS, arXiv:math.AP.
[41] J. Shatah and W. Strauss, Instability of nonlinear bound states, Comm. Math. Phys. 100(2) (1985) 173–190.
[42] C. Sulem and P.-L. Sulem, The Nonlinear Schrödinger Equation. Self-Focusing and Wave Collapse, Applied Mathematical Sciences, Vol. 139 (Springer-Verlag, New York, 1999).
[43] A. Soffer and M. I. Weinstein, Multichannel nonlinear scattering for nonintegrable equations, in Integrable Systems and Applications (Île d’Oléron, 1988), Lecture Notes in Physics, Vol. 342 (Springer, Berlin, 1989), pp. 312–327.
[44] A. Soffer and M. I. Weinstein, Multichannel nonlinear scattering for nonintegrable equations, Comm. Math. Phys. 133 (1990) 119–146.
[45] A. Soffer and M. I. Weinstein, Multichannel nonlinear scattering for nonintegrable equations. II. The case of anisotropic potentials and data, J. Differential Equations 98(2) (1992) 376–390.
[46] A. Soffer and M. I. Weinstein, Selection of the ground state for nonlinear Schrödinger equations, to appear in Rev. Math. Phys.
[47] T.-P. Tsai and H.-T. Yau, Asymptotic dynamics of nonlinear Schrödinger equations: Resonance-dominated and dispersion-dominated solutions, Comm. Pure Appl. Math. 55 (2002) 153–216.
[48] T.-P. Tsai and H.-T. Yau, Relaxation of excited states in nonlinear Schrödinger equations, Int. Math. Res. Not. 31 (2002) 1629–1673.
[49] T.-P. Tsai and H.-T. Yau, Stable directions for excited states of nonlinear Schrödinger equations, Comm. Partial Differential Equations 27(11, 12) (2002) 2363–2402.
[50] M. I. Weinstein, Modulational stability of ground states of nonlinear Schrödinger equations, SIAM J. Math. Anal. 16(3) (1985) 472–491.
[51] M. I. Weinstein, Lyapunov stability of ground states of nonlinear dispersive evolution equations, Comm. Pure Appl. Math. 39(1) (1986) 51–67.
November 18, 2005 10:54 WSPC/148-RMP
J070-00251
Reviews in Mathematical Physics Vol. 17, No. 10 (2005) 1209–1239 c World Scientific Publishing Company
THE EXTENDED VARIATIONAL PRINCIPLE FOR MEAN-FIELD, CLASSICAL SPIN SYSTEMS
E. KRITCHEVSKI
Department of Mathematics and Statistics, McGill University, 805, Sherbrooke Street West, Montreal, Québec, H3A 2K6, Canada
[email protected]

S. STARR
Department of Mathematics, University of California, Los Angeles, Box 951555, Los Angeles, CA, 90095-1555, USA
[email protected]

Received 3 October 2005

The purpose of this article is to obtain a better understanding of the extended variational principle (EVP). The EVP is a formula for the thermodynamic pressure of a statistical mechanical system as a limit of a sequence of minimization problems. It was developed for disordered mean-field spin systems, spin systems where the underlying Hamiltonian is itself random, and whose distribution is permutation invariant. We present the EVP in the simpler setting of classical mean-field spin systems, where the Hamiltonian is non-random and symmetric. The EVP essentially solves these models. We compare the EVP with another method for mean-field spin systems: the self-consistent mean-field equations. The two approaches lead to dual convex optimization problems. This is a new connection, and it permits a generalization of the EVP.

Keywords: Classical spin systems; exchangeability; extended variational principle.

Mathematics Subject Classification 2000: 82B20, 60G09
Contents
1. Introduction
2. Mean-Field Spin Systems: Definition
3. The Extended Variational Principle
3.1. Setup
3.1.1. Equivalence of thermodynamic pressure
3.2. Results
3.2.1. Changes for the concave case
3.3. Proofs
4. Optimizers for the EVP
4.1. Extension for convex two-body interactions
5. The Gibbs–de Finetti Principle
5.1. Setup
5.2. Results
5.3. Proofs
6. Minimax Theorem
6.1. Setup
6.2. Proofs
7. Example: Quadratic Kernel
1. Introduction

The extended variational principle (EVP) was introduced in [2], by Aizenman, Sims, and one of the present authors. It was applied to a mean-field disordered spin system, known as the Sherrington–Kirkpatrick spin glass. This is an Ising spin system, whose underlying Hamiltonian is random, such that the joint distribution of the coupling constants is permutation symmetric. The purpose of the EVP there was to give a variational formulation of the pressure, different from the usual Gibbs variational principle (GVP). For spin glasses, it seems that the GVP does not yield a useful characterization of the pressure because of the complicated dependence of that formula on the random coupling constants.

The EVP was used to re-derive upper bounds on the quenched pressure originally proved by Guerra in [13]. Also, the proof in [2] helps to unify that bound with the earlier proof of existence of the quenched pressure by Guerra and Toninelli [14]. Moreover, the approach of [2] introduced the new concept of “random overlap structures”, of which Ruelle’s random probability cascade (RPC) [23] seems to give distinguished examples, having certain invariance properties. On the other hand, the sequence of variational formulas that comprises the EVP is still difficult to work with. For example, the Euler–Lagrange equations were not derived.

Shortly after the preprint for [2], Talagrand announced a proof of the most interesting problem related to the Sherrington–Kirkpatrick model, namely “Parisi’s ansatz”. (Cf. Talagrand’s paper [30] and his book [31].) This does not diminish interest in the EVP and its relation to spin glasses. There is hope that the new insight which could be gained by finding a proof of Parisi’s ansatz based on the EVP and random overlap structures would lead to more general results.

Since the Euler–Lagrange equations are so hard to determine for mean-field spin glasses, it seems like a good idea to consider mean-field classical spin systems, where the situation is easier. These are Ising-type spin systems (and generalized versions) where the Hamiltonian is non-random and permutation symmetric. It turns out that for such systems, not only can the Euler–Lagrange equations be derived, they can be essentially “solved”.

For the spin systems just described, there is another method of solution, called the “self-consistent mean-field equations”. It consists of solving an implicit,
self-consistency equation for a 1-body measure. One way to quickly derive the implicit formula is to write down the GVP. The GVP requires one to optimize a certain function over the set of all permutation-invariant $N$-body measures. But instead one optimizes just over the restricted manifold of $N$-body product measures. The Euler–Lagrange equation for the GVP on this restricted set gives the self-consistent mean-field equation. Often one cannot explicitly solve this 1-body problem, but the mere fact that it reduces an $N$-body problem to a 1-body problem justifies calling this a “solution”. The solution obtained by the EVP is similar in that it also reduces the $N$-body problem to a 1-body problem.

In the course of our research, we were led$^{a}$ to the beautiful and concise paper of Fannes, Spohn and Verbeure which treats mean-field quantum spin systems and gives a rigorous justification of the self-consistent mean-field equations. By specialization, their results also apply to classical spin systems. In the classical case,$^{b}$ their method uses the Gibbs variational formula, combined with de Finetti’s theorem. We will call this the Gibbs–de Finetti principle (GdFP), henceforth.

The de Finetti theorem says the following. Consider a countable number of spins, indexed by sites of $\mathbb{N}$, say. Then the measure on $\Omega^{\mathbb{N}}$ is called “exchangeable” if it is permutation invariant, for permutations of the arguments which fix all but a finite number of them. The limit Gibbs measures all have this property by virtue of the underlying symmetry of the Hamiltonians. The de Finetti theorem says that the most general exchangeable measure is a mixture of i.i.d., product measures on the spins indexed by $\mathbb{N}$. With further work, one can restrict attention to the extreme measures, which are i.i.d., product measures (so that the mixture is trivial). One of the goals of our paper is to compare the EVP to the GdFP.
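To make the idea of a self-consistent 1-body equation concrete, here is a minimal sketch for one illustrative case that is our own assumption, not taken from the text: Ising spins $\Omega = \{-1, +1\}$ with uniform a priori measure and 2-body interaction $\phi(x, y) = -Jxy$. Restricting the Gibbs variational problem to product measures with mean $m$ then amounts to maximizing $Jm^2$ minus the relative entropy of the 1-body measure, whose stationarity condition is $m = \tanh(2Jm)$. The function name and parameters below are hypothetical.

```python
import math

def solve_mean_field(J, m0=0.5, tol=1e-12, max_iter=10_000):
    """Iterate m -> tanh(2*J*m) until successive iterates differ by less than tol."""
    m = m0
    for _ in range(max_iter):
        m_next = math.tanh(2.0 * J * m)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m

# Weak coupling (2J < 1): the iteration contracts to the trivial solution m = 0.
m_weak = solve_mean_field(J=0.25)
# Strong coupling (2J > 1): starting from m0 > 0 it settles on a nonzero solution.
m_strong = solve_mean_field(J=1.0)
```

Fixed-point iteration is only one possible solver; it is used here because the map $m \mapsto \tanh(2Jm)$ is monotone, so the iterates started from $m_0 > 0$ converge to the largest solution below $m_0$ or the smallest above it.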
$^{a}$ We are very grateful to B. Nachtergaele for bringing this paper to our attention.
$^{b}$ In the quantum case, their method uses a generalization of de Finetti’s theorem by Størmer [28], and an alternative to the Gibbs formulation suitable for quantum spin systems by two of those authors [8, 9].

Before describing the comparison, let us mention two other useful approaches to solving mean-field spin systems, which we will not discuss in this paper. One approach is the “coherent states approach”, which is useful for quantum spin systems. This was worked out by Lieb in [18] for the large-spin limit of the Heisenberg model, and was also applied to the Dicke Maser model by Hepp and Lieb in [15]. In fact, it seems to be Hepp and Lieb’s work on the Dicke Maser model which motivated Fannes, Spohn and Verbeure. The other approach uses large deviation estimates. A good reference is Ellis and Newman’s paper on the Curie–Weiss model, [7]. An advantage of [10] is its generality.

Coming back to the comparison of the EVP and the GdFP, let us say that both give the same information, when they work. This leads one to expect that there may be a more direct link between the two approaches. Indeed there is. It is simplest to see in the 2-body case, when the interaction defines a convex bilinear form on measures. Then the two problems can be viewed as dual optimization problems in the sense of convex variational analysis. More precisely, there is a joint “Lagrangian” which is a concave-convex function of two variables. Maximizing over the concave variable gives the nonlinear function which one needs to minimize in the EVP. Minimizing over the convex variable yields the nonlinear function which one needs to maximize in the GdFP. So the fact that both methods lead to the same quantity, the thermodynamic pressure, is a consequence of the fact that the max-min of the joint Lagrangian equals the min-max.

In case the interaction is continuous and bounded, it is trivial to see that the min-max and the max-min are equivalent, even in the non-convex case, and for $n$-body interactions with $n > 2$. But for singular interactions, the equality is nontrivial. Nevertheless, it is true, and follows from a theorem called the Kneser–Fan theorem. This theorem is a generalization of the famous von Neumann minimax theorem. This allows one to generalize the extended variational principle to some models with singular interactions (e.g., Coulomb repulsions).

As the reader will see, the EVP is easy to understand and prove, because it only uses estimates based on convexity and Jensen’s inequality. In comparison, to prove the GdFP, one must use properties of relative entropy as well as the de Finetti theorem. Therefore, the latter is more complicated than the former. On the other hand, the GdFP is more robust.

In conclusion, we would like to make one extrapolation to spin glasses, which is the following: it would be useful to have an analogue of de Finetti’s theorem, suitable for spin glasses. By this, we mean an intrinsic characterization of the limiting measures in spin glasses in terms of an invariance principle with respect to some stochastic dynamics. (Note that the proof of de Finetti’s theorem [16] actually characterizes the measures on $\Omega^{\mathbb{N}}$ which are invariant under the shift on $\mathbb{N}$.) So far, there is one result in this direction.
It is the recent and interesting paper by Aizenman and Ruzmaikina [1], which characterizes the 1-level replica-symmetry-breaking RPC’s by an invariance principle called “quasi-stationarity”.

The layout of this paper is as follows. In Sec. 2, we define what we mean by a mean-field spin system. In Sec. 3, we give the main results related to the EVP. In Sec. 4, we determine the optimizers of the EVP, and use this to give a simpler formula for the pressure. We also state a generalization for singular interactions, which we prove later. In Sec. 5, we recall the main results of the GdFP, as proved in [10] (specialized to classical spin systems). In Sec. 6, we construct a joint Lagrangian for the EVP and GdFP. We also prove the generalization of the EVP from Sec. 4. In Sec. 7, we give the simplest example.
2. Mean-Field Spin Systems: Definition

In this section, we define the notation and set-up for a “mean-field spin system”. For us, a mean-field spin system is defined by a quadruple $(\Omega, \alpha, n, \phi)$ where: $\Omega$ is a compact metric space; $\alpha$ is a distinguished Borel probability measure on $\Omega$ called the a priori measure; $n$ is a positive integer determining the number of bodies in
the interaction; and $\phi : \Omega^n \to \mathbb{R} \cup \{+\infty\}$ is the $n$-body interaction. It is useful that $\Omega$ is a compact metric space, and that $\alpha$ is a Borel probability measure. (For example, this means that $\alpha$ is regular.) This is the level of generality one will find for classical spin systems in [17, 26].

We denote the set of all Borel probability measures on $\Omega$ by $\mathcal{M}_1^+(\Omega)$, so that $\alpha \in \mathcal{M}_1^+(\Omega)$. We will assume that $\phi$ is a Borel measurable function and that it is bounded from below. Furthermore, we assume that $\alpha$ and $\phi$ are compatible in the sense that $\alpha^{\otimes n}(\phi) < \infty$. (Henceforth, whenever $\mu$ is a measure on a $\sigma$-algebra and $f$ is a measurable function on the same $\sigma$-algebra, we write $\mu(f)$ for the integral of $f$ against $\mu$. We also write $f\mu$ for the (possibly signed) measure such that $(f\mu)(A) = \mu(f\chi_A)$. We use tensor notation to denote product measures.) We will assume that $\phi$ is symmetric on $\Omega^n$ with respect to the natural action of the symmetric group $S_n$, as fits with our intention of studying a mean-field system.

For each $N \geq n$, we define a Hamiltonian, $H_N : \Omega^N \to \mathbb{R} \cup \{+\infty\}$. For $x = (x_1, \ldots, x_N) \in \Omega^N$,
$$H_N(x) := N \binom{N}{n}^{-1} \sum_{1 \leq i(1) < \cdots < i(n) \leq N} \phi(x_{i(1)}, \ldots, x_{i(n)}). \tag{2.1}$$
Note that $H_N$ is symmetric with respect to the natural action of $S_N$. Equivalently, we can think of the underlying lattice as being a complete graph. For each $N \geq n$, the partition function is the number
$$Z(N) := \alpha^{\otimes N}(\exp[-H_N]),$$
and the finite approximation to the pressure is
$$p(N) := N^{-1} \log Z(N).$$
The thermodynamic pressure is defined as the limit
$$p^* := \lim_{N \to \infty} p(N),$$
if it exists. We are primarily interested in the thermodynamic pressure. Later, we will recall a well-known result (Theorem 5.5) which guarantees that the limit does always exist. We have eliminated the inverse-temperature parameter $\beta$ by absorbing it into the Hamiltonian. It will be fixed and finite for our entire discussion.

3. The Extended Variational Principle

3.1. Setup

The extended variational principle is a method for calculating the pressure of a family of Hamiltonians $(\tilde{H}_N : N)$ which are close to $(H_N : N)$. In this section, we will assume that $\phi$ is a bounded function; i.e. we assume that it is bounded above, in addition to being bounded below as in the general set-up. With this assumption,
the new Hamiltonians will be so close to the old ones that the thermodynamic pressures will be equal (as we will show).

Let us define a function, $\Phi : \mathcal{M}_1^+(\Omega) \to \mathbb{R}$, by
$$\Phi(\mu) := \mu^{\otimes n}(\phi).$$
Now, for each $N \in \mathbb{N}_+$, we may define a new Hamiltonian $\tilde{H}_N : \Omega^N \to \mathbb{R}$ as
$$\tilde{H}_N(x) := N \Phi(\mu_x), \tag{3.1}$$
where for $x = (x_1, \ldots, x_N) \in \Omega^N$,
$$\mu_x := N^{-1} \sum_{i=1}^{N} \delta_{x_i}.$$
The measure $\mu_x$ is called the empirical measure of the point $x$.

Note that, in the important case that $n = 2$, the main difference between $H_N$ and $\tilde{H}_N$ is the appearance of self-interaction terms $\phi(x_i, x_i)$, for $i = 1, \ldots, N$. One intuitively expects that these terms make a small contribution, since there are only $N$ of them, compared to the total number of terms $N(N-1)/2$. However, if $\phi$ would have an infinite repulsion, so that $\phi(x_1, x_2) = +\infty$ whenever $x_1 = x_2$, this would lead to a Hamiltonian $\tilde{H}_N \equiv +\infty$, entirely dominated by the self-interaction terms. This is why we must assume that $\phi$ is bounded above, as well as below. Of course, this is a strong requirement, excluding many physically interesting examples. (In the case $n > 2$, the extra terms in $\tilde{H}_N$ are the natural generalizations of these, where two or more of the indices coincide.)

We define
$$\tilde{Z}(N) := \alpha^{\otimes N}(\exp[-\tilde{H}_N]).$$
We define
$$\tilde{p}(N) = N^{-1} \log \tilde{Z}(N),$$
and we define $\tilde{p}^*$ as
$$\tilde{p}^* := \lim_{N \to \infty} \tilde{p}(N),$$
if it exists.

There is one more important condition which we put on $\phi$. We assume that $\phi$ satisfies the necessary conditions so that $\Phi : \mathcal{M}_1^+(\Omega) \to \mathbb{R}$ is either convex or concave. It makes sense to speak of convexity or concavity of $\Phi$ because $\mathcal{M}_1^+(\Omega)$ is a convex set.

3.1.1. Equivalence of thermodynamic pressure

We will now show the relationship between $p^*$ and $\tilde{p}^*$, under the assumption that $\phi$ is bounded. To begin, we observe that the energy densities $N^{-1} H_N(x)$ and $N^{-1} \tilde{H}_N(x)$
˜ N (x) n(n − 1) HN (x) H φ∞ . N − N ≤ N
1215
(3.2)
Indeed, this follows because ˜ N (x) H = E[φ(xI(1) , . . . , xI(n) )], N where the indices I(1), . . . , I(n) are i.i.d. random variables, which are uniform on {1, . . . , N }, and HN (x) = E[φ(xI(1) , . . . , xI(n) )|I(1), . . . , I(n) are distinct]. N Therefore, ˜ N (x) HN (x) H ≤ 2φ∞ P{I(1), . . . , I(n) are not distinct}. − N N Then (3.2) follows by bounding the probability,
P({I(1), . . . , I(n) are not distinct}) ≤
P{I(j) = I(k)}.
1≤j
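The bound (3.2) can be checked by brute force on a small example. The sketch below assumes, purely for illustration, $\Omega = \{-1, +1\}$, $n = 2$ and the bounded symmetric interaction $\phi(a, b) = ab$ (so $\|\phi\|_\infty = 1$ and the right-hand side of (3.2) is $2/N$); the function names are hypothetical.

```python
from itertools import combinations, product
from math import comb

def phi(a, b):
    """Hypothetical bounded, symmetric 2-body interaction on Omega = {-1, +1}."""
    return a * b

def h_pair(x):
    """H_N from (2.1) with n = 2: N * C(N, 2)^{-1} * sum over pairs i < j."""
    n_sites = len(x)
    total = sum(phi(x[i], x[j]) for i, j in combinations(range(n_sites), 2))
    return n_sites * total / comb(n_sites, 2)

def h_tilde(x):
    """H~_N = N * Phi(mu_x), where Phi(mu_x) = N^{-2} * sum over all ordered (i, j)."""
    n_sites = len(x)
    total = sum(phi(a, b) for a in x for b in x)
    return total / n_sites

def max_density_gap(n_sites):
    """Largest |H_N/N - H~_N/N| over all configurations in {-1, +1}^N."""
    return max(abs(h_pair(x) - h_tilde(x)) / n_sites
               for x in product((-1, 1), repeat=n_sites))
```

For this interaction the double sum in `h_tilde` includes the $N$ self-interaction terms $\phi(x_i, x_i) = 1$, which are exactly the terms that `h_pair` omits.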
Now we use an elementary inequality to bound the difference in $p(N)$ and $\tilde{p}(N)$, starting from (3.2). But since we will use the same bound repeatedly hereafter, we will state it in some generality. Suppose $X$ is a compact metric space, and $\theta \in \mathcal{M}_1^+(X)$ is a Borel probability measure on $X$. Define a function, $\Psi$, on the set of Borel measurable functions $f : X \to \mathbb{R} \cup \{\pm\infty\}$, as
$$\Psi(f) := \log \theta(e^f).$$
Then we have the following:
$$|\Psi(f) - \Psi(g)| \leq \|f - g\|_\infty. \tag{3.3}$$
Indeed, one sees that
$$\theta(e^g) \leq e^{\|g - f\|_\infty}\, \theta(e^f),$$
which proves $[\Psi(g) - \Psi(f)] \leq \|g - f\|_\infty$. The other inequality follows symmetrically. Using Eq. (3.3), we see that
$$|p(N) - \tilde{p}(N)| \leq \frac{n(n-1)}{N} \|\phi\|_\infty.$$
In particular it implies the following.

Corollary 3.1. Under the assumption that $\|\phi\|_\infty < \infty$, the thermodynamic pressures $p^*$ and $\tilde{p}^*$ either both exist, or both do not exist, together. In case they both exist, they are equal.
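The estimate behind Corollary 3.1 can be observed numerically. This sketch again assumes the illustrative Ising setting $\Omega = \{-1, +1\}$ with uniform $\alpha$, $n = 2$ and $\phi(a, b) = c\,ab$, where the coupling $c$ and all function names are hypothetical. Both Hamiltonians depend on a configuration only through its total spin $s$, so the $\alpha^{\otimes N}$-integrals reduce to binomial sums.

```python
from math import comb, exp, log

def pressures(n_sites, coupling=1.0):
    """Return (p(N), p~(N)) for phi(a, b) = coupling * a * b, alpha uniform, N >= 2."""
    z = 0.0
    z_tilde = 0.0
    for k in range(n_sites + 1):                 # k = number of +1 spins
        s = 2 * k - n_sites                      # total spin of the configuration
        weight = comb(n_sites, k) / 2 ** n_sites  # alpha^{(x)N}-mass of this class
        # H_N from (2.1): the sum over pairs i < j of x_i x_j equals (s^2 - N) / 2.
        h = coupling * (s * s - n_sites) / (n_sites - 1)
        # H~_N = N * Phi(mu_x) with Phi(mu_x) = coupling * (s / N)^2.
        h_t = coupling * s * s / n_sites
        z += weight * exp(-h)
        z_tilde += weight * exp(-h_t)
    return log(z) / n_sites, log(z_tilde) / n_sites

def pressure_gap_bound(n_sites, coupling=1.0):
    """Right-hand side n(n-1) * ||phi||_inf / N with n = 2."""
    return 2.0 * abs(coupling) / n_sites
```

Since the bound decays like $1/N$, the two finite-volume pressures computed this way approach each other as $N$ grows, in line with the corollary.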
3.2. Results

In the bulk of this section we assume that $\Phi$ is convex. In Sec. 3.2.1, we will state what changes when $\Phi$ is concave. The first main result is the following important fact.

Theorem 3.2. The sequence $(N \tilde{p}(N) : N \in \mathbb{N}_+)$ is superadditive. That is, for every pair $N_1, N_2 \in \mathbb{N}_+$,
$$(N_1 + N_2)\, \tilde{p}(N_1 + N_2) \geq N_1 \tilde{p}(N_1) + N_2 \tilde{p}(N_2). \tag{3.4}$$
Moreover the sequence $(\tilde{p}(N) : N \in \mathbb{N})$ converges in $\mathbb{R}$.

Remark 3.3. Compare to the main theorem in [14]. Also compare to [4].

It is a well-known fact that for a superadditive sequence $(X(N) : N \in \mathbb{N}_+)$, the limit of $N^{-1} X(N)$ exists, although possibly equal to $+\infty$. (See the original by Fekete [11] or Pólya and Szegö [20, problem no. 98] [for which there are English translations].) Therefore, the importance of the second part of the theorem is that the limit is not $+\infty$.

The second main result is a variational formula for $\tilde{p}^*$. To set this up, we require some definitions. We first note that, since $\Omega$ is a compact metric space, the weak topology on $\mathcal{M}_1^+(\Omega)$ is compact and metrizable. (Cf. [21, Sec. IV.4] and [6, Sec. V.5.]) Thinking of $\mathcal{M}_1^+(\Omega)$ as a compact metric space, we define $\mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$ as the set of Borel probability measures on it, which is also compact and metrizable with the weak topology. We also define $\mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$ to be the set of all (positive) Borel measures, $\rho$, such that
$$0 < \rho(\mathcal{M}_1^+(\Omega)) < \infty.$$
This is a cone whose base is the Choquet simplex $\mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$.

The main idea behind the extended variational principle is a physical notion called the cavity step. Following the prescription in [2], we will define a sequence of functions, which we call the cavity field functions. There is a different cavity field function for each $N \in \mathbb{N}_+$ corresponding to adding $N$ extra particles to a system, whose size is supposed to be much larger than $N$. If the original system is large enough, then instead of considering a configuration in $\Omega^M$ for some large $M$, we instead consider a measure in $\mathcal{M}_1^+(\Omega)$. Note that because the Hamiltonian is permutation invariant, we only ever need to consider configurations in $\Omega^M$ modulo permutations. But using the empirical measures one can embed the quotient space $\Omega^M / S_M$ into $\mathcal{M}_1^+(\Omega)$ for each $M$. Since $\mathcal{M}_1^+(\Omega)$ contains each of these finite configuration spaces, it does make sense to consider a large system size by replacing configurations in some $\Omega^M$ by measures in $\mathcal{M}_1^+(\Omega)$.

It is useful to define $\Phi^{(1)} : \mathcal{M}_1^+(\Omega) \times \mathcal{M}_1^+(\Omega) \to \mathbb{R}$ as
$$\Phi^{(1)}(\nu, \mu) = n\,[\nu^{\otimes (n-1)} \otimes \mu](\phi).$$
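This is the directional derivative of $\Phi(\nu)$ in the direction $\mu$, i.e.
$$\left. \frac{d}{dt} \Phi(\nu + t\mu) \right|_{t=0} = \Phi^{(1)}(\nu, \mu).$$
It is also useful to define, for each $\nu \in \mathcal{M}_1^+(\Omega)$, a function $\phi_\nu^{(1)} : \Omega \to \mathbb{R} \cup \{+\infty\}$ such that $\Phi^{(1)}(\nu, \mu) = n \mu(\phi_\nu^{(1)})$ for all $\mu \in \mathcal{M}_1^+(\Omega)$. In other words,
$$\phi_\nu^{(1)}(x) = \int_{\Omega^{n-1}} \phi(y, x)\, d\nu^{\otimes (n-1)}(y), \tag{3.5}$$
for all $x \in \Omega$.

For each $N \in \mathbb{N}_+$, we define two functions from $\mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$ to $\mathbb{R}$. These are
$$\tilde{G}_N^{(1)}(\rho) := N^{-1} \log \int_{\mathcal{M}_1^+(\Omega)} \left[ \alpha\!\left( e^{-n \phi_\nu^{(1)}} \right) \right]^N d\rho(\nu)$$
and
$$\tilde{G}_N^{(2)}(\rho) := N^{-1} \log \int_{\mathcal{M}_1^+(\Omega)} e^{-(n-1) N \Phi(\nu)}\, d\rho(\nu).$$
We define the cavity field function (for addition of $N$ particles to a large system) as
$$\tilde{G}_N(\rho) := \tilde{G}_N^{(1)}(\rho) - \tilde{G}_N^{(2)}(\rho). \tag{3.6}$$
This function is homogeneous of degree 0. This means that
$$\forall \rho \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega)),\ \forall t \in (0, \infty) : \quad \tilde{G}_N(\rho) = \tilde{G}_N(t\rho).$$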
This fact is obvious because scaling by $t$ simply adds the same constant to each of $\tilde{G}_N^{(1)}(\rho)$ and $\tilde{G}_N^{(2)}(\rho)$, which cancels in the difference.

For every measure $\rho \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$, there exists a $t \in (0, \infty)$ such that $t\rho$ is actually a probability measure. Therefore, we could restrict attention to $\mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$. But it is sometimes useful to be free of the constraint that all measures should be normalized. One easily sees that $\tilde{G}_N$ is bounded on $\mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$ using Eq. (3.3). Therefore, using homogeneity, it is bounded on $\mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$. Moreover, using the monotone class theorem, and the fact that $\Phi$ is Borel-measurable, one can check that $\tilde{G}_N$ is Borel-measurable (cf. [19, Sec. 1.3]). If $\phi$ is continuous, then $\Phi$ is continuous, and it is clear that then $\tilde{G}_N$ is also continuous.

The main theorem for this section is the following characterization of the pressure.

Theorem 3.4 (EVP). For each $N \in \mathbb{N}_+$,
$$\tilde{p}(N) \leq \inf_{\rho} \tilde{G}_N(\rho), \tag{3.7}$$
where the infimum is taken over $\rho \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$. Moreover,
$$p^* = \tilde{p}^* = \lim_{N \to \infty} \inf_{\rho_N} \tilde{G}_N(\rho_N), \tag{3.8}$$
where, for each $N \in \mathbb{N}$, we infimize over $\rho_N \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$ separately.
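The upper bound (3.7) can be illustrated with point masses $\rho = \delta_\nu$, for which the definitions give $\tilde{G}_N(\delta_\nu) = \log \alpha(e^{-n\phi_\nu^{(1)}}) + (n-1)\Phi(\nu)$, independent of $N$. The sketch below assumes, for illustration only, $\Omega = \{-1, +1\}$ with uniform $\alpha$, $n = 2$ and $\phi(a, b) = ab + (h/2)(a + b)$, so that $\Phi(\mu) = m^2 + hm$ is convex in $\mu$ (here $m$ is the mean of $\mu$). The field $h$, the grid, and the function names are hypothetical.

```python
from math import comb, cosh, exp, log

FIELD = 1.0  # hypothetical external field h in phi(a, b) = a*b + (h/2)*(a + b)

def p_tilde(n_sites, field=FIELD):
    """p~(N) = N^{-1} log alpha^{(x)N}(exp[-N Phi(mu_x)]) with Phi(mu) = m^2 + h*m."""
    z = 0.0
    for k in range(n_sites + 1):
        s = 2 * k - n_sites                       # total spin, so m = s / N
        weight = comb(n_sites, k) / 2 ** n_sites  # uniform a priori weight
        z += weight * exp(-(s * s / n_sites + field * s))
    return log(z) / n_sites

def cavity_point_mass(m, field=FIELD):
    """G~_N(delta_nu) for nu with mean m; the same value for every N."""
    # G^(1) = log alpha(exp[-2 phi_nu^(1)]) = -h*m + log cosh(2m + h)
    g1 = -field * m + log(cosh(2.0 * m + field))
    # G^(2) = -(n - 1) * Phi(nu) = -(m^2 + h*m)
    g2 = -(m * m + field * m)
    return g1 - g2

def evp_upper_bound(grid=2001, field=FIELD):
    """Minimum of G~ over point masses with m on a uniform grid in [-1, 1]."""
    return min(cavity_point_mass(-1.0 + 2.0 * i / (grid - 1), field)
               for i in range(grid))
```

Restricting to point masses only gives an upper bound on the infimum in (3.7), so the comparison `p_tilde(N) <= evp_upper_bound()` is implied by the theorem but is not a test of its sharpness.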
Remark 3.5. Compare to the main theorem in [2].

We will prove this theorem, as well as Theorem 3.2, in the next section. First, we state what changes if $\Phi$ is concave instead of convex.

3.2.1. Changes for the concave case

In the concave case, the sequence of finite approximations to the pressure is subadditive instead of superadditive, so that the inequality in (3.4), from Theorem 3.2, is reversed. In the extended variational principle, Theorem 3.4, the inequality of (3.7) is reversed, and the infimum is replaced by the supremum. The identity in (3.8) still holds, but with the infimum replaced by the supremum.

3.3. Proofs

All proofs are exactly symmetric between the convex and concave cases for $\Phi$. So we will only give proofs of the convex case.

Proof of Theorem 3.2. One needs to show $\tilde{Z}(M + N) \geq \tilde{Z}(M) \tilde{Z}(N)$ for every $M, N \in \mathbb{N}_+$, i.e.,
$$\alpha^{\otimes (M+N)}(\exp[-\tilde{H}_{M+N}]) \geq \alpha^{\otimes M}(\exp[-\tilde{H}_M])\, \alpha^{\otimes N}(\exp[-\tilde{H}_N]).$$
Since $\Omega^{M+N} = \Omega^M \times \Omega^N$, this inequality follows by proving
$$\tilde{H}_{M+N}((x, y)) \leq \tilde{H}_N(x) + \tilde{H}_M(y).$$
This is equivalent to
$$(M + N)\, \Phi(\mu_{(x,y)}) \leq N \Phi(\mu_x) + M \Phi(\mu_y), \tag{3.9}$$
using the definition (3.1). But $\mu_{(x,y)}$ is a convex combination
$$(M + N) \cdot \mu_{(x,y)} = N \cdot \mu_x + M \cdot \mu_y.$$
So (3.9) follows from convexity of $\Phi$ and Jensen’s inequality. This proves superadditivity. It is a well-known fact that for superadditive sequences, $(X(N) : N \in \mathbb{N})$, the limit of $N^{-1} X(N)$ exists, although possibly equal to $+\infty$. See [11], or see Lemma 3.6, below. Therefore, $(\tilde{p}(N) : N \in \mathbb{N})$ converges in $\mathbb{R} \cup \{+\infty\}$. But there are obvious upper bounds which rule out the limit $+\infty$, namely $\tilde{p}(N) \leq \|\phi\|_\infty$.

The first half of Theorem 3.4 easily follows from convexity.

Proof of Theorem 3.4, Eq. (3.7). It suffices to show that
$$\tilde{p}(N) + \tilde{G}_N^{(2)}(\rho) \leq \tilde{G}_N^{(1)}(\rho)$$
for every $\rho \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$. Direct calculation yields
$$\tilde{p}(N) + \tilde{G}_N^{(2)}(\rho) = N^{-1} \log \rho \otimes \alpha^{\otimes N}(\exp[-NU]),$$
November 18, 2005 10:54 WSPC/148-RMP
J070-00251
The Extended Variational Principle for Mean-Field, Classical Spin Systems
1219
where $U(\nu,x) = (n-1)\Phi(\nu) + \Phi(\mu_x) = \Phi^{(1)}(\nu,\nu) - \Phi(\nu) + \Phi(\mu_x)$. Similarly,
\[ \tilde{G}_N^{(1)}(\rho) = N^{-1} \log \rho \otimes \alpha^{\otimes N}(\exp[-N V]), \]
where
\[ V(\nu,x) = n\phi_\nu^{(1)}(x) = \Phi^{(1)}(\nu,\delta_x). \]
Therefore, the inequality holds by showing that
\[ U(\nu,x) - V(\nu,x) = \Phi(\mu_x) - \Phi(\nu) - \Phi^{(1)}(\nu,\mu_x-\nu) \ge 0. \tag{3.10} \]
But one easily checks that, for $0 < t < 1$,
\[ \frac{d}{dt}\,\Phi(t\cdot\mu_x + (1-t)\cdot\nu) = \Phi^{(1)}(t\cdot\mu_x + (1-t)\cdot\nu;\ \mu_x - \nu). \]
Using this, (3.10) is a standard consequence of convexity of $\Phi$.

For the proof of the second half of Theorem 3.4, we will rely on the following lemma. Although it is well known that the limit of $N^{-1}X(N)$ exists when $(X(N) : N \in \mathbb{N}_+)$ is a superadditive sequence, there is another simple fact which is not as well known, but which is essential to the extended variational principle. This was used, notably, in [2]. We repeat the proof here, for completeness.

Lemma 3.6. Let $(X(N) : N \in \mathbb{N}_+)$ be a superadditive sequence. Then,
\[ \lim_{N\to\infty} \frac{X(N)}{N} = \lim_{N\to\infty} \liminf_{M\to\infty} \frac{X(M+N) - X(M)}{N}. \]
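As a numerical sanity check of the statement of Lemma 3.6, the following sketch uses an illustrative superadditive sequence, $X(N) = N + \lfloor N/2 \rfloor$, which is an assumption made here and is not taken from the paper. The names `X`, `Y`, `Y_liminf` and `is_superadditive` are hypothetical. The lim inf over $M$ is replaced by a minimum over a finite window of $M$, which is adequate for this particular sequence only because $Y(M,N)$ is periodic in $M$ with period 2.

```python
def X(n: int) -> int:
    """Illustrative superadditive sequence (an assumption, not from the paper)."""
    return n + n // 2


def Y(M: int, N: int) -> float:
    """Difference quotient Y(M, N) = [X(M + N) - X(M)] / N."""
    return (X(M + N) - X(M)) / N


def Y_liminf(N: int, M_lo: int = 1000, M_hi: int = 1200) -> float:
    """Finite-window stand-in for liminf over M of Y(M, N)."""
    return min(Y(M, N) for M in range(M_lo, M_hi))


def is_superadditive(limit: int = 60) -> bool:
    """Brute-force check of X(M + N) >= X(M) + X(N) for small M, N."""
    return all(
        X(M + N) >= X(M) + X(N)
        for M in range(1, limit)
        for N in range(1, limit)
    )


if __name__ == "__main__":
    print("superadditive on the tested range:", is_superadditive())
    for N in (1, 5, 101, 10001):
        # Both columns should approach the same limit as N grows.
        print(N, X(N) / N, Y_liminf(N))
```

For this sequence both $X(N)/N$ and the windowed lim inf approach $3/2$, and the inequality $Y(N) \ge N^{-1}X(N)$ used in the proof holds with equality.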
Proof. For $M, N \in \mathbb{N}_+$, define
\[ Y(M,N) := \frac{X(M+N) - X(M)}{N} \]
and $Y(N) := \liminf_{M\to\infty} Y(M,N)$. By superadditivity, $Y(M,N) \ge N^{-1}X(N)$ for all $M \in \mathbb{N}_+$, therefore
\[ Y(N) \ge N^{-1}X(N). \tag{3.11} \]
Suppose that $k, M, N \in \mathbb{N}_+$ and that $r \ge M$. Then, by a telescoping sum,
\[ Y(r,kN) = \frac{1}{k}\sum_{j=0}^{k-1} Y(r+jN,N) \ge \inf_{M' \ge M} Y(M',N). \tag{3.12} \]
Given $M, N \in \mathbb{N}_+$, define functions $k : \mathbb{Z} \to \mathbb{Z}$ and $r : \mathbb{Z} \to [M, M+N-1]$, which are uniquely specified (see footnote c) by the requirement $n = k(n)N + r(n)$ (and the requirement $M \le r(n) \le M+N-1$). Then, one can easily see
\[ \liminf_{n\to\infty} \frac{X(n)}{n} = \liminf_{n\to\infty} \frac{X(n) - X(r(n))}{n - r(n)} = \liminf_{n\to\infty} Y(r(n), k(n)N). \]
Footnote c: In Matlab language, $k(n) = \mathrm{div}(n-M,N)$ and $r(n) = M + \mathrm{mod}(n-M,N)$.
Using (3.12), this implies
\[ \liminf_{n\to\infty} \frac{X(n)}{n} = \liminf_{k\to\infty}\ \min_{M \le r \le M+N-1} Y(r,kN) \ge \inf_{M' \ge M} Y(M',N). \]
Taking the monotone limit in $M$, we obtain
\[ \liminf_{n\to\infty} \frac{X(n)}{n} \ge \liminf_{M\to\infty} Y(M,N) = Y(N). \]
Therefore, by (3.11),
\[ \liminf_{n\to\infty} \frac{X(n)}{n} \ge Y(N) \ge \frac{X(N)}{N}. \]
Taking the limsup in $N$ shows that $(N^{-1}X(N) : N \in \mathbb{N}_+)$ converges. Then, by the sandwich theorem, $(Y(N) : N \in \mathbb{N}_+)$ also converges, to the same limit.

It seems that much of the physicists' so-called cavity step is encoded in Lemma 3.6. Using it, we now complete the proof of Theorem 3.4.

Proof of Theorem 3.4, Eq. (3.8). By (3.7),
\[ \tilde{p}^* = \liminf_{N\to\infty} \tilde{p}(N) \le \liminf_{N\to\infty} \inf_{\rho_N} \tilde{G}_N(\rho_N). \]
Therefore, (3.8) follows if one can prove
\[ \tilde{p}^* \ge \limsup_{N\to\infty} \inf_{\rho_N} \tilde{G}_N(\rho_N). \tag{3.13} \]
For each $N \in \mathbb{N}$, suppose there were a sequence of measures $(\rho_N^M : M \in \mathbb{N}_+)$ such that
\[ \lim_{M\to\infty} \left| \tilde{G}_N^{(1)}(\rho_N^M) - \frac{M+N}{N}\,\tilde{p}(M+N) \right| = 0 \tag{3.14} \]
and
\[ \lim_{M\to\infty} \left| \tilde{G}_N^{(2)}(\rho_N^M) - \frac{M}{N}\,\tilde{p}(M) \right| = 0. \tag{3.15} \]
Then, it would follow
\[ \lim_{M\to\infty} \left| \tilde{G}_N(\rho_N^M) - \left[ \frac{M+N}{N}\,\tilde{p}(M+N) - \frac{M}{N}\,\tilde{p}(M) \right] \right| = 0. \]
Taking $(\rho_N^M : M \in \mathbb{N}_+)$ as a variational sequence, this would imply
\[ \liminf_{M\to\infty} \frac{(M+N)\,\tilde{p}(M+N) - M\,\tilde{p}(M)}{N} \ge \inf_{\rho_N} \tilde{G}_N(\rho_N). \]
But by Lemma 3.6, applied to $X(N) = N\tilde{p}(N)$, this would give (3.13). Therefore, it only remains to prove (3.14) and (3.15). Only (3.14) will be proved: the proof of (3.15) is very similar.

The map $y \mapsto \mu_y$ is a continuous function from $\Omega^M$ to $\mathcal{M}_1^+(\Omega)$ (with respect to the weak topology on the target). Therefore, given any Borel measure $\tilde{\rho} \in \mathcal{M}_f^+(\Omega^M)$,
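there is a unique measure $\rho \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$, called the push-forward, such that for every (weakly) continuous function $F : \mathcal{M}_1^+(\Omega) \to \mathbb{R}$,
\[ \rho(F) = \int_{\Omega^M} F(\mu_y)\, d\tilde{\rho}(y). \]
Now consider the measure $\tilde{\rho}_N^M \in \mathcal{M}_f^+(\Omega^M)$, absolutely continuous with respect to $\alpha^{\otimes M}$, with density
\[ \frac{d\tilde{\rho}_N^M}{d\alpha^{\otimes M}} = \exp\left[ -\frac{M}{M+N}\,\tilde{H}_M \right]. \]
Let $\rho_N^M \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$ be the push-forward of $\tilde{\rho}_N^M$. Then, one verifies
\[ \tilde{G}_N^{(1)}(\rho_N^M) = N^{-1} \log \alpha^{\otimes M} \otimes \alpha^{\otimes N}(\exp[-U]), \tag{3.16} \]
where
\[ U(x,y) = \frac{M^2}{M+N}\,\Phi(\mu_y) + N\,\Phi^{(1)}(\mu_y,\mu_x). \]
The next step is to prove (3.14). One can write a formula analogous to Eq. (3.16) for $\tilde{p}(M+N)$, namely
\[ \frac{M+N}{N}\,\tilde{p}(M+N) = N^{-1} \log \alpha^{\otimes (M+N)}(\exp[-V]), \tag{3.17} \]
where $V(z) = (M+N)\,\Phi(\mu_z)$. Using the formula
\[ \mu_{(x,y)} = \frac{N\mu_x + M\mu_y}{M+N} = \mu_y + \frac{N}{M+N}\,[\mu_x - \mu_y], \]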
one writes N V ((x, y)) − U (x, y) = Φ µy + [µx − µy ] M +N M +N N N2 Φ(1) (µy , µx − µy ) + − Φ(µy ) − Φ(µy ). M +N (M + N )2 Now, by Taylor’s theorem, N N Φ µy + [µx − µy ] − Φ(µy ) − Φ(1) (µy , µx − µy ) M +N M +N 1 N2 d2 = (1 − θ) Φ (µ + t[µ − µ ]) dθ, y x y θ (M + N )2 0 dt2 t= MN+N and one easily calculates d2 Φ(ν + tµ) = dt2 Therefore, f − gsup ≤
n ν ⊗n−2 ⊗ µ⊗2 (φ). 2
N2 1 n 1+ φ∞ . M +N 2 2
By Eq. (3.3), and Eqs. (3.16) and (3.17), this means
\[ \left| \tilde{G}_N^{(1)}(\rho_N^M) - \frac{M+N}{N}\,\tilde{p}(M+N) \right| \le \frac{N}{M+N} \left[ 1 + \frac{1}{2}\binom{n}{2} \right] \|\phi\|_\infty. \]
This certainly does converge to zero as $M \to \infty$, proving (3.14). The argument for (3.15) is similar, and is left to the reader.

4. Optimizers for the EVP

Proposition 4.1. Suppose that $\phi$ is continuous, and $\Phi$ is convex or concave. Then, for each $N$, the optimum of $\tilde{G}_N$ is attained. Moreover, the normalized optimizers (scaled to be probability measures) form a union of faces of the Choquet simplex $\mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$.

Proof. Restrict attention to the case that $\Phi$ is convex, since the concave case is proved symmetrically. Note that $\tilde{G}_N$ is continuous. Since $\mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$ is compact, the minimum is attained. This proves the first part of the proposition. The second part of the proposition is equivalent to the following statement: let $\rho$ be any minimizer in $\mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$, then for each $\nu \in \mathrm{supp}(\rho)$, the measure $\delta_\nu$ is also a minimizer.

Let $f : \mathcal{M}_1^+ \to [0,1]$ be any Borel measurable function such that $\rho(f) > 0$. For $t > -1$, define $\rho_t \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$ by
\[ \frac{d\rho_t}{d\rho} := 1 + t f. \]
Then, for $i = 1, 2$,
\[ \tilde{G}_N^{(i)}(\rho_t) = N^{-1} \log\left( \exp\left[ N \tilde{G}_N^{(i)}(\rho) \right] + t \exp\left[ N \tilde{G}_N^{(i)}(\tilde{\rho}) \right] \right), \]
where $\tilde{\rho} \in \mathcal{M}_f^+(\mathcal{M}_1^+(\Omega))$ is the measure $\tilde{\rho} = f\rho$, using an obvious notation. The two functions, $t \mapsto \tilde{G}_N^{(i)}(\rho_t)$, for $i = 1, 2$, are obviously differentiable on $(-1,\infty)$. Therefore, by criticality,
\[ \frac{d}{dt}\left[ \tilde{G}_N^{(1)}(\rho_t) - \tilde{G}_N^{(2)}(\rho_t) \right]\Big|_{t=0} = 0. \]
But careful consideration of this equation yields
\[ \tilde{G}_N(\tilde{\rho}) = \tilde{G}_N(\rho). \]
So $\tilde{\rho}$ is another optimizer.

Now, for any $\nu \in \mathrm{supp}(\rho)$ consider the sequence of functions $f_\epsilon = \chi_{D(\nu;\epsilon)}$ (for $\epsilon > 0$) where $\chi$ is the indicator and $D(\nu;\epsilon)$ is the closed ball, with reference to any metric on $\mathcal{M}_1^+(\Omega)$ which yields the weak topology. (Such a metric is guaranteed to exist since $\Omega$ is compact and hence separable. Cf. [6, Sec. V.5].) Since $\nu \in \mathrm{supp}(\rho)$, one knows $\rho(f_\epsilon) > 0$ for all $\epsilon > 0$. The family of rescaled measures $\rho(f_\epsilon)^{-1} f_\epsilon \rho$ converges weakly to $\delta_\nu$ in the $\epsilon \downarrow 0$ limit. Using continuity of $\tilde{G}_N$, that means $\delta_\nu$ is a minimizer, as claimed.
When $\rho$ has the simple form $\delta_\nu$ for some $\nu \in \mathcal{M}_1^+(\Omega)$, all the values $\tilde{G}_N(\delta_\nu)$ (for $N \in \mathbb{N}_+$) are identical, and are given by the function $\tilde{g}(\nu)$ written below. Therefore, the limit in Theorem 3.4, Eq. (3.8) is trivial. We state this as the following:

Corollary 4.2. Define $\tilde{g} : \mathcal{M}_1^+(\Omega) \to \mathbb{R}$ by
\[ \tilde{g}(\nu) = (n-1)\,\Phi(\nu) + \log \alpha\left( \exp\left[ -n\phi_\nu^{(1)} \right] \right). \tag{4.1} \]
Suppose that $\phi$ is continuous and $\Phi$ is convex. Then
\[ p^* = \tilde{p}^* = \min_{\nu} \tilde{g}(\nu). \]
(If $\Phi$ is concave instead of convex, the minimum changes to the maximum.)

4.1. Extension for convex two-body interactions

Suppose we drop the restriction that $\phi$ is bounded. Instead, as in Sec. 2, we suppose that $\phi : \Omega^n \to \mathbb{R} \cup \{+\infty\}$ is Borel measurable and bounded below, and that $\alpha^{\otimes n}(\phi) < \infty$. In this case $\tilde{p}(N)$ may no longer exist (or rather it may equal $+\infty$, identically) for each $N \in \mathbb{N}_+$. But the cavity field function $\tilde{G}_N$ is still well defined and finite, if we put certain natural restrictions on the measures $\rho$ which we use. The same is true for its restriction to extreme points, defined by $\tilde{g}$. It is reasonable to ask if one can still determine $p^*$ (which may now be inequivalent to $\tilde{p}^*$) using $\tilde{g}$. At least in some cases the answer is "yes".

Theorem 4.3. Suppose $n = 2$ and $\Phi$ is convex. For each $C \ge 0$, define
\[ \mathcal{M}_1^+(\Omega,\alpha,C) = \left\{ \nu \in \mathcal{M}_1^+(\Omega) : \nu \ll \alpha \ \text{and} \ \left\| \frac{d\nu}{d\alpha} \right\|_\infty \le e^C \right\}. \]
Then,
\[ p^* = \lim_{C\to\infty}\ \inf_{\nu \in \mathcal{M}_1^+(\Omega,\alpha,C)} \tilde{g}(\nu). \]

Remark 4.4.
1. Restricting to $\mathcal{M}_1^+(\Omega,\alpha,C)$ is a technical necessity. If we do not put some restrictions on $\nu \in \mathcal{M}_1^+(\Omega)$, then it is possible that the two summands in (4.1) are $+\infty$ and $-\infty$. On the other hand, because $\Phi(\alpha) < \infty$, both terms are finite when $\nu \in \mathcal{M}_1^+(\Omega,\alpha,C)$ for some $C$. By taking the $C \to \infty$ limit, at the end, we relax these restrictions. This is also the condition that we need in Sec. 6, to apply the Kneser–Fan Theorem.
2. Our proof uses convexity of $\Phi$. It does not give the analogous statement for the case that $\Phi$ is concave.

The proof of Theorem 4.3 will be given at the end of Sec. 6. It can be seen as the motivation for the following two sections, though they are also interesting on their own.
5. The Gibbs–de Finetti Principle

In this section, we will give a pedagogical introduction to the paper of Fannes, Spohn and Verbeure [10]. In fact, while they considered quantum spin systems, which are more general, we specialize to the classical case. In order to be self-contained, we will review the specialization of their results.

5.1. Setup

In this section, we relax the conditions on $\phi$ relative to the previous section. We only assume the conditions from Sec. 2. Namely, we assume that $\phi : \Omega^n \to \mathbb{R} \cup \{+\infty\}$ is Borel measurable and bounded below. We suppose that $\alpha^{\otimes n}(\phi) < \infty$ and that $\phi$ is invariant under the natural action of $S_n$.

We will use two important principles, called the Gibbs variational formula and de Finetti's theorem. The Gibbs formula gives a variational formulation for the finite-volume approximations to the pressure, $(p(N) : N \ge n)$. The de Finetti theorem is a representation theorem for all infinite exchangeable probability measures. When combined, these two principles give a mathematically rigorous variational formula for the thermodynamic pressure of a mean-field classical spin system, which the physicists also use (but usually without referring to the rigorous justification).

We start by stating the Gibbs variational formula. The first step is to recall entropy. Given a measure $\rho^N \in \mathcal{M}_1^+(\Omega^N)$, its relative entropy with respect to $\alpha^{\otimes N}$ will be denoted as $S_N(\rho^N)$. (Usually the relative entropy would be denoted $S_N(\rho^N, \alpha^{\otimes N})$, but we suppress $\alpha^{\otimes N}$.) This is a quantity in $\mathbb{R} \cup \{-\infty\}$. If $\rho^N$ is absolutely continuous with respect to $\alpha^{\otimes N}$, then
\[ S_N(\rho^N) := \alpha^{\otimes N}\left( \psi\left( \frac{d\rho^N}{d\alpha^{\otimes N}} \right) \right), \]
where
\[ \psi(t) := \begin{cases} -t \log t & \text{if } t \in (0,\infty), \\ 0 & \text{if } t = 0. \end{cases} \]
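The definition above can be made concrete on a finite set, where measures are probability vectors and the Radon–Nikodym derivative is a ratio of weights. The sketch below is only an illustration under that finite-set assumption; the function names `psi` and `entropy` are hypothetical and do not come from the paper.

```python
import math


def psi(t: float) -> float:
    """psi(t) = -t log t for t > 0, and psi(0) = 0."""
    return 0.0 if t == 0.0 else -t * math.log(t)


def entropy(rho, alpha) -> float:
    """S(rho) = sum_i alpha_i * psi(rho_i / alpha_i) on a finite set.

    Returns -inf when rho puts mass where alpha has none, i.e. when
    rho is not absolutely continuous with respect to alpha.
    """
    total = 0.0
    for r, a in zip(rho, alpha):
        if a == 0.0:
            if r > 0.0:
                return -math.inf
            continue
        total += a * psi(r / a)
    return total


if __name__ == "__main__":
    alpha = [0.5, 0.5]
    print(entropy(alpha, alpha))            # reference measure itself
    print(entropy([1.0, 0.0], alpha))       # point mass
    print(entropy([0.5, 0.5], [1.0, 0.0]))  # not absolutely continuous
```

With this sign convention the entropy is the negative of the Kullback–Leibler divergence of `rho` from `alpha`, so it is non-positive and vanishes at `rho = alpha`.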
If $\rho^N$ is not absolutely continuous with respect to $\alpha^{\otimes N}$ (i.e. if the singular component has a positive mass) then $S_N(\rho^N)$ is defined to be $-\infty$. (Even if $\rho^N \ll \alpha^{\otimes N}$, the relative entropy may equal $-\infty$ depending on the Radon–Nikodym derivative.) Henceforth, we will refer to the "relative entropy with respect to $\alpha^{\otimes N}$" simply as the "entropy".

The following important properties of the entropy, except for property (1), are proved in the monographs by Israel and Simon, respectively: [17, Sec. II.2] and [26, Sec. III.4]. The best reference for property (1) is the seminal paper by Ruelle and Robinson, [24]. One can also consult the monograph by Georgii [12, Chap. 15] for related issues.

As a notational point, for $\rho^N \in \mathcal{M}_1^+(\Omega^N)$, and $A \subset [1,N]$, we denote by $\rho^N \restriction A$ the measure in $\mathcal{M}_1^+(\Omega^{|A|})$, naturally identified as the marginal of $\rho^N$ on the
σ-subalgebra of Borel measurable functions on $\Omega^N$ depending only on coordinates of $x$ for indices in $A$.

Proposition 5.1 (Properties of Relative Entropy). The functions $S_N : \mathcal{M}_1^+(\Omega^N) \to \mathbb{R} \cup \{-\infty\}$ (for $N \in \mathbb{N}$) have the following properties:

1. (Definition through continuous partitions)
\[ S_N(\rho^N) = \inf_{R \in \mathbb{N}}\ \inf_{(u_1,\ldots,u_R)}\ \sum_{r=1}^{R} \psi\left( \rho^N(u_r)/\alpha^{\otimes N}(u_r) \right) \alpha^{\otimes N}(u_r), \]
where $(u_1,\ldots,u_R)$ varies over all continuous partitions of unity on $\Omega^N$, such that $\alpha^{\otimes N}(u_r) > 0$ for each $r$.

2. (Non-positivity) $S_N(\rho^N) \le 0$ for all $\rho^N \in \mathcal{M}_1^+(\Omega^N)$ and equality holds for $\rho^N = \alpha^{\otimes N}$.

3. (Upper semicontinuity) The function $S_N : \mathcal{M}_1^+(\Omega^N) \to \mathbb{R} \cup \{-\infty\}$ is upper semicontinuous with respect to the topology of weak convergence.

4. (Strict concavity) For $\rho_1^N, \rho_2^N \in \mathcal{M}_1^+(\Omega^N)$ and $\theta \in (0,1)$,
\[ S_N(\theta \cdot \rho_1^N + (1-\theta) \cdot \rho_2^N) \ge \theta S_N(\rho_1^N) + (1-\theta) S_N(\rho_2^N). \]
The inequality is strict if $S_N(\rho_i^N) > -\infty$ for both $i = 1, 2$, unless $\rho_1^N = \rho_2^N$.

5. ("Almost convexity") For the setting as above,
\[ S_N(\theta \cdot \rho_1^N + (1-\theta) \cdot \rho_2^N) \le \theta S_N(\rho_1^N) + (1-\theta) S_N(\rho_2^N) + \psi(\theta) + \psi(1-\theta). \]

6. (Strong subadditivity) Given subsets $A, B \subset [1,N]$,
\[ S_{|A \cup B|}(\rho^N \restriction A \cup B) + S_{|A \cap B|}(\rho^N \restriction A \cap B) \le S_{|A|}(\rho^N \restriction A) + S_{|B|}(\rho^N \restriction B). \]
For this to be consistent, we need to define $S_0$. The need arises when one takes the marginal $\rho^N \restriction A \cap B$ and $A \cap B = \emptyset$. One can make sense of this by defining $\Omega^0 = \{\emptyset\}$ to be the 1-point space, defining $\alpha^{\otimes 0}$ to be the unique measure in $\mathcal{M}_1^+(\Omega^0)$, and defining $\rho^N \restriction \emptyset$ to be that same measure no matter what $\rho^N \in \mathcal{M}_1^+(\Omega^N)$ may be. Then, the appropriate definition is obviously $S_0(\rho^N \restriction \emptyset) = 0$ for all $\rho^N \in \mathcal{M}_1^+(\Omega^N)$.

The Gibbs function on $\mathcal{M}_1^+(\Omega^N)$ is defined as
\[ G_N(\rho^N) = N^{-1}\left[ S_N(\rho^N) - \rho^N(H_N) \right]. \tag{5.1} \]
(Let us reiterate that we have absorbed the inverse temperature $\beta$ into the Hamiltonian.) The Gibbs measure is a measure $\rho_*^N \in \mathcal{M}_1^+(\Omega^N)$, symmetric under the natural action of $S_N$ on $\Omega^N$, such that $\rho_*^N \ll \alpha^{\otimes N}$ and
\[ \frac{d\rho_*^N}{d\alpha^{\otimes N}} = Z(N)^{-1} \exp[-H_N]. \]
Note that, since $\alpha^{\otimes n}(\phi) < \infty$, we know that $Z(N) > 0$ by an elementary application of Jensen's inequality. An important formula for statistical mechanics is the following.
Theorem 5.2 (Gibbs Variational Formula). The Gibbs function is strictly concave and upper semicontinuous (on the set of measures where it is not equal to $-\infty$). The maximum is attained at a unique point, which is the measure $\rho_*^N$. Moreover, $G_N(\rho_*^N) = p(N)$.

Proof. Note that
\[ G_N(\rho^N) = N^{-1} S_N(\rho^N; \rho_*^N) + p(N), \]
where the first term on the right-hand side is the relative entropy with respect to $\rho_*^N$. All of the properties from Proposition 5.1 are also valid for relative entropy with respect to measures other than $\alpha^{\otimes N}$, mutatis mutandis. The theorem is just a collection of some of these. (The only thing that changes is the precise statement of strong subadditivity, which is not used in this theorem anyway.)

Having stated the Gibbs formula, let us now state de Finetti's theorem. To set this up, we will need some notation. If $\rho^N \in \mathcal{M}_1^+(\Omega^N)$ is symmetric under the natural action of $S_N$ on $\Omega^N$, then it is called "exchangeable". In this case, $\rho^N \restriction A$ clearly only depends on the cardinality, say $R = |A|$. As a notational simplification, when this is the case, we allow ourselves to write $\rho^N \restriction R$ in place of $\rho^N \restriction A$. We will write the set of all exchangeable measures in $\mathcal{M}_1^+(\Omega^N)$ as $\mathcal{M}_1^+(\Omega^N, \mathrm{Sym})$.

Definition 5.3. Given a strictly increasing sequence $(N(k) \in \mathbb{N}_+ : k \in \mathbb{N}_+)$ and a sequence of measures $\rho^{N(k)} \in \mathcal{M}_1^+(\Omega^{N(k)}, \mathrm{Sym})$, we will say that the sequence converges weakly if, for every $N \in \mathbb{N}_+$, it happens that the subsequence of marginals $(\rho^{N(k)} \restriction N : k \text{ such that } N(k) \ge N)$ converges weakly in $\mathcal{M}_1^+(\Omega^N)$.

Because of properties of the marginal, it will be clear that, if the sequence of measures $(\rho^{N(k)} : k \in \mathbb{N}_+)$ converges weakly, then the weak limits $\rho^\infty \restriction N := \lim_{k\to\infty} \rho^{N(k)} \restriction N$ are consistent with respect to taking further marginals. Therefore, the measures satisfy the hypotheses of Kolmogorov's extension theorem. (Cf. [5, Theorem 12.1.2] or [29, Exercise 3.1.18].) So there is a naturally identified measure $\rho^\infty \in \mathcal{M}_1^+(\Omega^{\mathbb{N}})$, which is defined on the smallest σ-algebra containing all cylinder sets (depending on finitely many variables). Moreover, $\rho^\infty$ is defined just so that the finite-dimensional marginals are equal to $\rho^\infty \restriction N$, justifying the notation a posteriori.

A measure $\rho^\infty \in \mathcal{M}_1^+(\Omega^{\mathbb{N}})$ is called exchangeable if all of its finite marginals $\rho^\infty \restriction N$ are exchangeable. Let $\mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$ be the set of exchangeable measures in $\mathcal{M}_1^+(\Omega^{\mathbb{N}})$. One may define a topology on the set of exchangeable measures such that a sequence of measures $\mu_k^\infty \in \mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$ converges iff $\mu_k^\infty \restriction N$ converges (as $k \to \infty$) for each $N \in \mathbb{N}_+$. This topology is metrizable and compact. Indeed it is the weak topology with respect to the compact metrizable topology on $\Omega^{\mathbb{N}}$ (cf. [21, Theorem IV.5]). The de Finetti theorem completely characterizes the measures in $\mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$.
Theorem 5.4 (de Finetti's Representation). For every measure $\rho^\infty \in \mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$, there is a unique $\rho \in \mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$, such that
\[ \rho^\infty \restriction N = \int_{\mathcal{M}_1^+(\Omega)} \mu^{\otimes N}\, d\rho(\mu) \tag{5.2} \]
for every $N \in \mathbb{N}_+$.

For a general proof of this theorem, see the paper by Hewitt and Savage [16]. For many connections to interesting results in probability theory, see the review of Aldous [3] and references therein.

5.2. Results

The first result, analogous to Theorem 3.2, is the following:

Theorem 5.5. For every $N_1, N_2 \ge n$,
\[ (N_1 + N_2)\, p(N_1 + N_2) \le N_1\, p(N_1) + N_2\, p(N_2). \tag{5.3} \]
Also, for each $N \ge n$,
\[ p(N) \ge -\alpha^{\otimes n}(\phi). \tag{5.4} \]
In particular, the sequence $(p(N) : N \ge n)$ converges in $\mathbb{R}$.

The second main result of this section is the following formula for the pressure. In Sec. 6, this will be compared to Corollary 4.2 in the convex case.

Theorem 5.6 (Gibbs–de Finetti Variational Principle). Define $g : \mathcal{M}_1^+(\Omega) \to \mathbb{R} \cup \{-\infty\}$ by
\[ g(\mu) := S_1(\mu) - \Phi(\mu). \]
Then, for every $N \ge n$,
\[ p(N) \ge \sup_{\mu} g(\mu), \tag{5.5} \]
where the supremum is taken over all $\mu \in \mathcal{M}_1^+(\Omega)$. The function $g$ is upper semicontinuous, so the maximum is attained. Moreover,
\[ p^* = \max_{\mu} g(\mu). \tag{5.6} \]
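To see what the formula $p^* = \max_\mu g(\mu)$ looks like in a concrete case, and how it compares with Corollary 4.2, the following sketch evaluates both variational expressions on a two-point spin space. Everything specific here is an assumption made for illustration and is not taken from the paper: $\Omega = \{-1,+1\}$, $\alpha$ uniform, $n = 2$, $\phi(x,y) = c\,xy$, so that $\Phi(\mu) = c\,m^2$ with $m$ the mean of $\mu$, and $\Phi^{(1)}(\nu,\delta_x) = 2c\,s\,x$ with $s$ the mean of $\nu$ (following the $n = 2$ convention $\Phi^{(1)}(\mu,\nu) = 2(\mu\otimes\nu)(\phi)$). The function names are hypothetical, and the optimization is a plain grid search.

```python
import math


def xlogx(t: float) -> float:
    """t log t, extended by 0 at t = 0."""
    return 0.0 if t == 0.0 else t * math.log(t)


def S1(m: float) -> float:
    """Entropy of mu = ((1 - m)/2, (1 + m)/2) relative to uniform alpha."""
    return -(xlogx(1.0 + m) + xlogx(1.0 - m)) / 2.0


def g(m: float, c: float) -> float:
    """g(mu) = S_1(mu) - Phi(mu), with Phi(mu) = c * m**2 (assumed model)."""
    return S1(m) - c * m * m


def g_tilde(s: float, c: float) -> float:
    """Eq. (4.1) for n = 2: Phi(nu) + log alpha(exp[-Phi^(1)(nu, delta_x)])."""
    return c * s * s + math.log(math.cosh(2.0 * c * s))


def pressures(c: float, k: int = 20000):
    """Grid approximations of max g and of the optimum of g_tilde."""
    pts = [i / k for i in range(-k, k + 1)]
    gibbs = max(g(m, c) for m in pts)
    values = [g_tilde(s, c) for s in pts]
    # Convex Phi (c >= 0): minimize g_tilde; concave Phi (c < 0): maximize.
    evp = min(values) if c >= 0 else max(values)
    return gibbs, evp


if __name__ == "__main__":
    for c in (0.5, -1.0):
        print(c, pressures(c))
```

In both the convex ($c > 0$) and the concave ($c < 0$) case, the two grid values should agree up to the grid resolution, in line with Theorem 5.6 and Corollary 4.2.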
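5.3. Proofs

Proof of Theorem 5.5. Suppose $\rho^{M+N} \in \mathcal{M}_1^+(\Omega^{M+N}, \mathrm{Sym})$. By the definition of the sequence of (permutation invariant) Hamiltonians,
\[ (M+N)^{-1} \rho^{M+N}(H_{M+N}) = M^{-1} \rho^{M+N} \restriction M\,(H_M) = N^{-1} \rho^{M+N} \restriction N\,(H_N). \]
This implies that
\[ \rho^{M+N}(H_{M+N}) = \rho^{M+N} \restriction M\,(H_M) + \rho^{M+N} \restriction N\,(H_N). \]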
By subadditivity of the entropy, which is Proposition 5.1, property (6) specialized to the case that $A = [1,M]$ and $B = [M+1, M+N]$, one knows
\[ S_{M+N}(\rho^{M+N}) \le S_M(\rho^{M+N} \restriction M) + S_N(\rho^{M+N} \restriction N). \]
Therefore,
\[ (M+N)\, G_{M+N}(\rho^{M+N}) \le M\, G_M(\rho^{M+N} \restriction M) + N\, G_N(\rho^{M+N} \restriction N). \tag{5.7} \]
Now, apply this to $\rho_*^{M+N} \in \mathcal{M}_1^+(\Omega^{M+N}, \mathrm{Sym})$ and use Theorem 5.2. The left-hand side of (5.7) becomes $(M+N)\,p(M+N)$, and the two terms on the right-hand side are bounded above by $M\,p(M)$ and $N\,p(N)$.

The bound (5.4) is a variational lower bound obtained by the trial measure $\rho^N = \alpha^{\otimes N}$ and Theorem 5.2. From it, we know that $p^* > -\infty$, which is important because subadditive sequences generally may have the limit $-\infty$, but this one does not.

Let us prove the easy part of Theorem 5.6, which only uses Theorem 5.2.

Proof of Theorem 5.6, Eq. (5.5). Suppose $\mu \in \mathcal{M}_1^+(\Omega)$. Define $\rho^N = \mu^{\otimes N}$. Observe that $S_N(\rho^N) = N S_1(\mu)$ and
\[ \frac{1}{N}\,\rho^N(H_N) = \frac{1}{n}\,\rho^N \restriction n\,(H_n) = \mu^{\otimes n}(\phi) = \Phi(\mu). \]
So $G_N(\rho^N) = g(\mu)$. Then, using Theorem 5.2, one obtains $p(N) \ge g(\mu)$ as a variational lower bound. The equation follows.

To prove the second half of Theorem 5.6, we will use the following important fact. So far, we have only used subadditivity of the pressure, which is a special case of Proposition 5.1, property (6). The next result uses strong subadditivity; in fact, it is equivalent to it.

Lemma 5.7. Suppose $N \in \mathbb{N}_+$ and $\rho^N \in \mathcal{M}_1^+(\Omega^N, \mathrm{Sym})$. Then,
\[ M^{-1} S_M(\rho^N \restriction M) \ge N^{-1} S_N(\rho^N) \tag{5.8} \]
for every $M \in [1,N]$.

Proof. It is sufficient to prove that
\[ S_N(\rho^N) - S_{N-1}(\rho^N \restriction N-1) \le S_{N-1}(\rho^N \restriction N-1) - S_{N-2}(\rho^N \restriction N-2) \tag{5.9} \]
for every $N > 1$. This is because, by iterating this inequality, one gets
\[ S_N(\rho^N) - S_{N-1}(\rho^N \restriction N-1) \le S_M(\rho^N \restriction M) - S_{M-1}(\rho^N \restriction M-1) \]
for all $M \le N-1$. Summing these inequalities over $M \in [1, N-1]$ gives a telescoping sum on the right-hand side. So,
\[ (N-1)\, S_N(\rho^N) - (N-1)\, S_{N-1}(\rho^N \restriction N-1) \le S_{N-1}(\rho^N \restriction N-1) - S_0(\rho^N \restriction 0). \]
By rearranging terms, this would prove (5.8) when $M = N-1$ (recall that $S_0(\rho^N \restriction 0) := 0$). But then by iterating that, one could reach all $M \le N-1$. It remains to prove (5.9). Use Proposition 5.1, property (6), with $A = [1, N-1]$ and $B = [2, N]$.

One of the most important consequences of de Finetti's theorem, for us, is the fact that the relative entropy becomes very simple in the $N \to \infty$ limit for exchangeable measures. In fact, it is affine. This is expressed in the following lemma, which also uses Lemma 5.7.

Lemma 5.8 (Mean Entropy). For every $\rho^\infty \in \mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$, the following limit exists:
\[ s(\rho^\infty) := \lim_{N\to\infty} N^{-1} S_N(\rho^\infty \restriction N). \]
The function $s : \mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym}) \to \mathbb{R} \cup \{-\infty\}$ is affine and upper semicontinuous. More precisely,
\[ s(\rho^\infty) = \rho(S_1), \]
where $\rho \in \mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$ is the "directing measure" corresponding to $\rho^\infty$ via de Finetti's theorem.

Before proving this lemma, we note the following stronger version of property (4) from Proposition 5.1. Suppose that $\rho^N = \sum_{r=1}^{R} \theta_r \delta_{x_r}$ for $x_1, \ldots, x_R \in \Omega^N$, where $\theta_1, \ldots, \theta_R \ge 0$ are such that $\sum_{r=1}^{R} \theta_r = 1$. Then, by iterating property (4),
\[ S_N(\rho^N) \ge \sum_{r=1}^{R} \theta_r\, S_N(\delta_{x_r}). \]
We generalize to allow continuous convex combinations (barycentric decompositions).

Lemma 5.9. Let $(W, \Sigma)$ be a measure space with probability measure $\theta$. Suppose that there is a measurable mapping $w \in W \mapsto \rho_w^N \in \mathcal{M}_1^+(\Omega^N)$. Define the barycenter $\rho^N$ such that
\[ \rho^N(f) = \int_W \rho_w^N(f)\, d\theta(w) \]
for each $f \in C(\Omega^N)$. Then,
\[ S_N(\rho^N) \ge \int_W S_N(\rho_w^N)\, d\theta(w). \tag{5.10} \]

Proof. Use Proposition 5.1, property (1). Let $(u_1, \ldots, u_R)$ be a continuous partition of unity on $\Omega^N$ such that $\alpha^{\otimes N}(u_r) > 0$ for each $r$. By
concavity of $\psi$,
\[ \sum_{r=1}^{R} \psi\left( \rho^N(u_r)/\alpha^{\otimes N}(u_r) \right) \alpha^{\otimes N}(u_r) \ge \int_W \sum_{r=1}^{R} \psi\left( \rho_w^N(u_r)/\alpha^{\otimes N}(u_r) \right) \alpha^{\otimes N}(u_r)\, d\theta(w) \ge \int_W S_N(\rho_w^N)\, d\theta(w). \]
Since this is true for every such partition, Eq. (5.10) follows.

Proof of Lemma 5.8. The existence of the limit $s(\rho^\infty)$ can be proved either by subadditivity (the specialization of Proposition 5.1, property (6)) or by monotonicity of the entropy density as in Lemma 5.7. By the latter, it is clear that $s$ is upper semicontinuous, being the infimum of upper semicontinuous functions. Also, $s$ is concave by Proposition 5.1, property (4). Moreover, one can deduce that $s$ is convex by using Proposition 5.1, property (5), and noting that for the mean entropy one divides each $S_N$ by $N$, and takes the limit as $N \to \infty$ (so that the error terms in "almost convexity" converge to 0 uniformly). Therefore, $s$ is affine. Using these properties and Lemma 5.9, one can prove that
\[ s(\rho^\infty) \ge \int_{\mathcal{M}_1^+(\Omega)} s(\delta_\mu)\, d\rho(\mu). \tag{5.11} \]
But when $\rho = \delta_\mu$, one has $\rho^\infty \restriction N = \mu^{\otimes N}$ for all $N$, and as already noted $S_N(\mu^{\otimes N}) = N S_1(\mu)$. So $s(\delta_\mu) = S_1(\mu)$.

One can also prove the reverse inequality by using the fact that $s$ is affine and upper semicontinuous. Using these properties and compactness of $\mathcal{M}_1^+(\Omega)$, one can easily see that
\[ s(\rho) \le \max\{ s(\delta_\mu) = S_1(\mu) : \mu \in \mathrm{supp}(\rho) \}. \]
Then, for any partition of $(-\infty, 1/e]$, say $e^{-1} = x_0 > x_1 > x_2 > \cdots > x_R = -\infty$, define $E_r = \{ \mu : x_r < S_1(\mu) \le x_{r-1} \}$ for $r = 1, \ldots, R$. Let us adjoin to $E_R$ all the measures such that $S_1(\mu) = -\infty$. By deleting some parts if necessary, we can assume that $\rho(E_r) > 0$ for all $r = 1, \ldots, R$. Since $s$ is affine, we know that
\[ s(\rho^\infty) = \sum_{r=1}^{R} \rho(E_r)\, s(\rho^\infty(\cdot \mid E_r)), \]
where $\rho^\infty(\cdot \mid E_r)$ is the exchangeable measure whose directing measure is $\rho(E_r)^{-1} \chi_{E_r} \rho$. But by the property above, this means
\[ s(\rho^\infty) \le \sum_{r=1}^{R} \rho(E_r) \max_{\mu \in E_r} S_1(\mu) \le \sum_{r=1}^{R} \rho(E_r)\, x_{r-1}. \]
By taking increasingly refined partitions, the right-hand side will be monotone decreasing. Therefore, using the monotone convergence theorem, one obtains the
inequality
\[ s(\rho^\infty) \le \int_{\mathcal{M}_1^+(\Omega)} S_1(\mu)\, d\rho(\mu). \]
Combining with (5.11) gives the result.

Proof of Theorem 5.6, Eq. (5.6). Let $(\rho_*^{N(k)} : k \in \mathbb{N}_+)$ be any weakly convergent subsequence of the Gibbs measures (which exists because the set of all such sequences is compact with respect to the topology of weak convergence), and let $\rho_*^\infty \in \mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$ be the limit. Fix $N \ge n$. By Lemma 5.7,
\[ p(N(k)) = G_{N(k)}\left( \rho_*^{N(k)} \right) \le G_N\left( \rho_*^{N(k)} \restriction N \right) \]
for all $k$ such that $N(k) \ge N$. Using Theorem 5.2, $G_N$ is upper semicontinuous. Therefore,
\[ \limsup_{k\to\infty} p(N(k)) \le G_N(\rho_*^\infty \restriction N). \]
On the other hand, $p(N)$ converges to $p^*$ by Theorem 5.5. So, $p^* \le G_N(\rho_*^\infty \restriction N)$. Since this inequality is true for every $N \ge n$, it is also true that
\[ p^* \le \inf_{N \in \mathbb{N}_+} G_N(\rho_*^\infty \restriction N). \tag{5.12} \]
Define another affine function $G_\infty : \mathcal{M}_1^+(\Omega^{\mathbb{N}}) \to \mathbb{R} \cup \{-\infty\}$ by
\[ G_\infty(\rho^\infty) = s(\rho^\infty) - \rho^\infty \restriction n\,(\phi). \]
Using Lemma 5.7, one can conclude that $G_N(\rho^\infty \restriction N)$ is a decreasing sequence, converging to $G_\infty(\rho^\infty)$. Hence, $G_\infty$ is upper semicontinuous. Using this and (5.12), $p^* \le G_\infty(\rho_*^\infty)$. This is true for each limit point, and there is at least one. Therefore,
\[ p^* \le \sup_{\rho^\infty \in \mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})} G_\infty(\rho^\infty). \]
Since $\mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$ is compact and convex, and $G_\infty$ is a convex (in fact affine) and upper semicontinuous function, the maximum is achieved, and it is achieved at an extreme point. By Theorem 5.4, the extreme points are of the form $\rho^\infty = \mu^{\otimes \mathbb{N}}$ for some $\mu \in \mathcal{M}_1^+(\Omega)$. In other words, the measure $\rho \in \mathcal{M}_1^+(\mathcal{M}_1^+(\Omega))$ defined via de Finetti's theorem is $\delta_\mu$ for some $\mu \in \mathcal{M}_1^+(\Omega)$ if $\rho^\infty$ is an extreme point of $\mathcal{M}_1^+(\Omega^{\mathbb{N}}, \mathrm{Sym})$. In this case, one can explicitly calculate $G_\infty(\rho^\infty)$. It is $g(\mu)$. This also proves that $g$ is upper semicontinuous because it is the restriction of $G_\infty$, and that function is upper semicontinuous.
6. Minimax Theorem

6.1. Setup

Recall that, under the hypothesis that $\|\phi\|_\infty < \infty$, $p^* = \tilde{p}^*$ by Corollary 3.1. Therefore, there is a strong connection between the extended variational principle and the Gibbs–de Finetti principle. We will make one more connection, by constructing a joint "Lagrangian". The joint Lagrangian we construct is the function
\[ L(\mu,\nu) = S_1(\mu) - \Phi(\nu) - \Phi^{(1)}(\nu, \mu - \nu). \]
For each $\nu$, define $\Phi_\nu(\mu) = \Phi^{(1)}(\nu,\mu)$. This is the analogue of $\Phi$, which would be defined if one replaced the $n$-body interaction $\phi$ by the 1-body interaction $n\phi_\nu^{(1)}$ defined in (3.5). Define $g_\nu$ as in Theorem 5.6, but relative to $\Phi_\nu$ instead of $\Phi$. Then,
\[ L(\mu,\nu) = g_\nu(\mu) - \Phi(\nu) + \Phi^{(1)}(\nu,\nu). \]
In Theorem 5.6, we found that $g$ was concave and upper semicontinuous, with no conditions on $\phi$ other than those introduced in Sec. 2. Therefore, replacing $\phi$ by $n\phi_\nu^{(1)}$, which satisfies the same conditions, we see that $g_\nu$ is also concave and upper semicontinuous. Moreover, for 1-body interactions it is trivial to calculate the thermodynamic pressure for the corresponding Hamiltonian. From this, and using Theorem 5.2, we see that
\[ \max_{\mu} L(\mu,\nu) = \tilde{g}(\nu). \]
Similarly, using convexity of $\Phi$, it is trivial to check that
\[ \inf_{\nu} L(\mu,\nu) = \min_{\nu} L(\mu,\nu) = g(\mu). \]
The minimum is attained at $\mu = \nu$. This is by inequality (3.10). In the concave case, the analogous inequality proves that
\[ \sup_{\nu} L(\mu,\nu) = \max_{\nu} L(\mu,\nu) = g(\mu). \]
The main purpose of this section is to prove Theorem 4.3. For this purpose, we will use the following generalization of von Neumann's minimax theorem. We refer to [27] for an elegant proof.

Theorem 6.1 (Kneser–Fan Minimax Theorem). Let $M$ be a compact, convex space and let $N$ be any convex space. Suppose that $L$ is a function on $M \times N$ that is concave-convex. If $L$ is upper semicontinuous on $M$ for each $\nu \in N$, then
\[ \sup_{\mu \in M}\ \inf_{\nu \in N} L(\mu,\nu) = \inf_{\nu \in N}\ \sup_{\mu \in M} L(\mu,\nu). \]

Remark 6.2. The Kneser–Fan theorem generalizes the "von Neumann minimax theorem" which is well known as one of the first mathematical results in game theory.
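The minimax identity can be checked numerically for the Lagrangian of this section in a small example. The sketch below is an illustration under assumptions that are not from the paper: $\Omega = \{-1,+1\}$, a non-uniform reference measure with $\alpha(+1) =$ `a_plus`, $n = 2$, and $\phi(x,y) = c\,xy$ with $c > 0$, so that $\Phi(\nu) = c\,s^2$ is convex and $\Phi^{(1)}(\nu,\mu) = 2c\,s\,m$, where $m$ and $s$ are the means of $\mu$ and $\nu$. The names are hypothetical, and both optimizations are brute-force grid searches, so the two sides agree only up to the grid resolution.

```python
import math


def xlog_ratio(p: float, a: float) -> float:
    """p log(p / a), extended by 0 at p = 0."""
    return 0.0 if p == 0.0 else p * math.log(p / a)


def S1(m: float, a_plus: float) -> float:
    """Entropy of mu with mean m relative to alpha = (1 - a_plus, a_plus)."""
    p_plus, p_minus = (1.0 + m) / 2.0, (1.0 - m) / 2.0
    return -(xlog_ratio(p_plus, a_plus) + xlog_ratio(p_minus, 1.0 - a_plus))


def lagrangian(m: float, s: float, c: float, a_plus: float) -> float:
    """L(mu, nu) = S_1(mu) + Phi(nu) - Phi^(1)(nu, mu) in the n = 2 case."""
    return S1(m, a_plus) + c * s * s - 2.0 * c * s * m


def minimax_values(c: float = 0.5, a_plus: float = 0.7, k: int = 200):
    """Return (sup_mu inf_nu L, inf_nu sup_mu L) on a (2k+1) x (2k+1) grid."""
    pts = [i / k for i in range(-k, k + 1)]
    table = [[lagrangian(m, s, c, a_plus) for s in pts] for m in pts]
    sup_inf = max(min(row) for row in table)
    inf_sup = min(
        max(table[i][j] for i in range(len(pts))) for j in range(len(pts))
    )
    return sup_inf, inf_sup


if __name__ == "__main__":
    print(minimax_values())
```

The inequality "sup inf ≤ inf sup" holds on any grid; the content of the Kneser–Fan theorem is that the gap closes, which the grid values reflect as the resolution `k` is increased.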
The definition of being concave-convex is that: for each $\nu \in N$, the function $L(\cdot,\nu)$ should be concave on $M$, and for each $\mu \in M$, the function $L(\mu,\cdot)$ should be convex on $N$. Note that in the $n = 2$ case, we can write
\[ L(\mu,\nu) = S_1(\mu) + \Phi(\nu) - \Phi^{(1)}(\nu,\mu), \]
which is convex in $\nu$ as long as $\Phi$ is convex, because $\Phi^{(1)}(\cdot,\mu)$ is linear. Therefore, in this case $L(\mu,\nu)$ is concave-convex. Among other things, this means that $\tilde{g}$ is convex.

One requirement for applying Theorem 6.1 is that the function $L$ is assumed to map into $\mathbb{R}$ (instead of $\mathbb{R} \cup \{\pm\infty\}$). This is the reason that we stated Theorem 4.3 in the precise way we did.

6.2. Proofs

In order to prove Theorem 4.3, we will need more information about the maximizer of $g$. Note that, since $g$ is upper semicontinuous, it does attain its maximum on the compact set $\mathcal{M}_1^+(\Omega)$. If $\Phi$ is convex, then $g$ is also strictly concave simply because $g(\mu) = S_1(\mu) - \Phi(\mu)$ and $S_1$ is strictly concave. Therefore, the maximum is unique. In order to state the following lemma, let $C_\phi$ be the finite constant $C_\phi = \inf_{x \in \Omega} \phi(x)$. Let $C_\alpha = \alpha^{\otimes n}(\phi) < \infty$. Note that $C_\alpha \ge C_\phi$.

Lemma 6.3. Let $\mu_* \in \mathcal{M}_1^+(\Omega)$ be the maximizer of $g$. Then $\mu_* \ll \alpha$ and
\[ \frac{d\mu_*}{d\alpha}(x) = \exp\left[ C_* - \Phi^{(1)}(\mu_*, \delta_x) \right] \tag{6.1} \]
for $\alpha$-a.e. $x \in \Omega$. Here, $C_*$ is a finite constant related to $\mu_*$ and $p^*$ by $C_* = \Phi(\mu_*) - p^* = S_1(\mu_*) - 2p^*$. In particular, one has the bounds $C_\alpha + C_\phi \le C_* \le 2C_\alpha$ so that
\[ \left\| \frac{d\mu_*}{d\alpha} \right\|_\infty \le \exp(2[C_\alpha - C_\phi]). \]

Proof. Note that $g$ is finite at $\alpha$, so that $g(\mu_*) > -\infty$. In particular, this means that $S_1(\mu_*) > -\infty$. So $\mu_* \ll \alpha$. Suppose, in order to reach a contradiction, that $\mathrm{supp}(\alpha) \setminus \mathrm{supp}(\mu_*) \ne \emptyset$. Then, there is a ball $B = B(x;r) \subset \Omega$, $r > 0$, such that $\mu_*(B) = 0$ and $\alpha(B) > 0$. Let $\nu := \alpha(B)^{-1} \chi_B\, \alpha$,
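where $\chi_B$ is the indicator function of $B$. Let $\mu_\epsilon := (1-\epsilon) \cdot \mu_* + \epsilon \cdot \nu$. A straightforward calculation shows that
\[ \lim_{\epsilon \downarrow 0} \epsilon^{-1} [S_1(\mu_\epsilon) - S_1(\mu_*)] = +\infty, \]
whereas
\[ \lim_{\epsilon \downarrow 0} \epsilon^{-1} [\Phi(\mu_\epsilon) - \Phi(\mu_*)] = \Phi^{(1)}(\mu_*, \nu - \mu_*) \]
is a finite number. Hence, there is an $\epsilon > 0$ small enough so that $g(\mu_\epsilon) > g(\mu_*)$, contradicting the fact that $\mu_*$ is a maximizer.

Now let $B = B(x_0;r) \subset \Omega$, for some $x_0 \in \mathrm{supp}(\mu_*)$ and $r > 0$. Let $\nu := \mu_*(B)^{-1} \chi_B\, \mu_*$. For $t \in \mathbb{R}$, let $\mu_t = (1-t) \cdot \mu_* + t \cdot \nu$. Note that for $-\mu_*(B) < t < 1$, one has that $\mu_t \in \mathcal{M}_1^+(\Omega)$. It is easy to see that the following function is continuously differentiable,
\[ \gamma(t) := g(\mu_t) = S_1(\mu_t) - \Phi(\mu_t). \]
Moreover, the derivative at 0 is
\[ \gamma'(0) = -\int_\Omega \log\left( \frac{d\mu_*}{d\alpha}(x) \right) d\nu(x) - \Phi^{(1)}(\mu_*,\nu) - S_1(\mu_*) + \Phi^{(1)}(\mu_*,\mu_*). \]
By criticality, this must equal 0. So,
\[ \int_\Omega \log\left( \frac{d\mu_*}{d\alpha}(x) \right) d\nu(x) + \Phi^{(1)}(\mu_*,\nu) = C_*, \]
where $C_* = \Phi^{(1)}(\mu_*,\mu_*) - S_1(\mu_*) = \Phi(\mu_*) - g(\mu_*)$ is independent of $x_0$ and $r$; the second equality uses $\Phi^{(1)}(\mu_*,\mu_*) = 2\Phi(\mu_*)$, valid for $n = 2$. Note that since $g(\mu_*) = p^*$, this gives the previous formulas for $C_*$. Note that
\[ \int_\Omega \log\left( \frac{d\mu_*}{d\alpha}(x) \right) d\nu(x) + \Phi^{(1)}(\mu_*,\nu) - C_* = \int_\Omega \left[ \log\left( \frac{d\mu_*}{d\alpha}(x) \right) + \Phi^{(1)}(\mu_*,\delta_x) - C_* \right] d\nu(x). \]
Since the total integral equals zero for all $\nu$, and $x_0$ and $r$ are arbitrary, one concludes that
\[ \log\left( \frac{d\mu_*}{d\alpha}(x) \right) + \Phi^{(1)}(\mu_*,\delta_x) - C_* = 0 \]
for almost every $x \in \mathrm{supp}(\mu_*)$. But $\mathrm{supp}(\mu_*) = \mathrm{supp}(\alpha)$. Exponentiating this equation yields (6.1).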
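The Euler–Lagrange equation (6.1) can be verified numerically in a two-point example. The sketch below is an illustration under assumptions that are not from the paper: $\Omega = \{-1,+1\}$, $\alpha(+1) =$ `a_plus`, $n = 2$ and $\phi(x,y) = c\,xy$ with $c \ge 0$, so that $\Phi(\mu) = c\,m^2$ and $\Phi^{(1)}(\mu,\delta_x) = 2c\,m\,x$, where $m$ is the mean of $\mu$. The maximizer of $g$ is located by bisection on $g'(m) = 0$, which is a choice made here for simplicity; the function names are hypothetical.

```python
import math


def solve_mean(c: float, a_plus: float, iterations: int = 200) -> float:
    """Bisection for the zero of g'(m) on (-1, 1); g' is decreasing for c >= 0."""
    field = 0.5 * math.log(a_plus / (1.0 - a_plus))

    def dg(m: float) -> float:
        # d/dm [S_1(mu) - c m^2] for mu with mean m, relative to alpha.
        return field - math.atanh(m) - 2.0 * c * m

    lo, hi = -0.999999, 0.999999
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if dg(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def euler_lagrange_residuals(c: float = 0.5, a_plus: float = 0.7):
    """Residuals of log(dmu*/dalpha)(x) - [C* - Phi^(1)(mu*, delta_x)] at x = +1, -1."""
    alpha = {+1: a_plus, -1: 1.0 - a_plus}
    m = solve_mean(c, a_plus)
    mu = {+1: (1.0 + m) / 2.0, -1: (1.0 - m) / 2.0}
    S1 = -sum(mu[x] * math.log(mu[x] / alpha[x]) for x in (+1, -1))
    Phi = c * m * m
    p_star = S1 - Phi        # g(mu*) = p*
    C_star = Phi - p_star    # constant of Lemma 6.3
    return [
        math.log(mu[x] / alpha[x]) - (C_star - 2.0 * c * m * x)
        for x in (+1, -1)
    ]


if __name__ == "__main__":
    print(euler_lagrange_residuals())
```

Both residuals should vanish up to rounding, which is the finite-set version of the statement that the density of $\mu_*$ has the form (6.1) with $C_* = \Phi(\mu_*) - p^*$.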
Proof of Theorem 4.3. Observe that, for any $0 \le C < \infty$, the subset $\mathcal{M}_1^+(\Omega,\alpha,C)$ is compact and convex in $\mathcal{M}_1^+(\Omega)$. Also, $L(\mu,\nu)$ is well defined and finite for all $\mu, \nu \in \mathcal{M}_1^+(\Omega,\alpha,C)$. Part of this statement is that $\Phi(\nu)$ and $\Phi^{(1)}(\nu,\mu)$ are finite. This is tantamount to the first remark following the statement of Theorem 4.3. The other fact is that $S_1(\mu)$ is finite, because $S_1(\mu) \ge \psi(e^C) > -\infty$. Therefore, the hypotheses of Theorem 6.1 are satisfied, so that
\[ \sup_{\mu \in M}\ \inf_{\nu \in N} L(\mu,\nu) = \inf_{\nu \in N}\ \sup_{\mu \in M} L(\mu,\nu) \tag{6.2} \]
when $M = N = \mathcal{M}_1^+(\Omega,\alpha,C)$. By inequality (3.10), for any $\mu \in \mathcal{M}_1^+(\Omega)$,
\[ \inf_{\nu \in \mathcal{M}_1^+(\Omega)} L(\mu,\nu) = g(\mu). \]
Moreover, the minimum is attained at $\mu = \nu$. In particular, if $\mu \in \mathcal{M}_1^+(\Omega,\alpha,C)$, then so is the minimizer $\nu$, i.e.
\[ \inf_{\nu \in N} L(\mu,\nu) = g(\mu) \]
for all $\mu \in M$. Therefore,
\[ \sup_{\mu \in M} g(\mu) = \inf_{\nu \in N}\ \sup_{\mu \in N} L(\mu,\nu) \tag{6.3} \]
by (6.2). By Theorem 5.2,
\[ \sup_{\mu \in \mathcal{M}_1^+(\Omega)} L(\mu,\nu) = \tilde{g}(\nu) \tag{6.4} \]
for any $\nu \in \mathcal{M}_1^+(\Omega)$, by viewing $\Phi^{(1)}(\nu,\mu)$ as a ($\nu$-dependent) Hamiltonian integrated against $\mu$. So, optimizing over the smaller set gives the inequality
\[ \sup_{\mu \in N} L(\mu,\nu) \le \tilde{g}(\nu). \]
Therefore,
\[ \sup_{\mu \in \mathcal{M}_1^+(\Omega,\alpha,C)} g(\mu) \le \inf_{\nu \in \mathcal{M}_1^+(\Omega,\alpha,C)} \tilde{g}(\nu) \tag{6.5} \]
by (6.3). By Lemma 6.3, the unrestricted optimizer of $g$ over $\mathcal{M}_1^+(\Omega)$ is $\mu_*$, which is in $\mathcal{M}_1^+(\Omega,\alpha,C)$ for every $C > 2(C_\alpha - C_\phi)$. Moreover, $g(\mu_*) = p^*$. So, by (6.5),
\[ p^* \le \inf_{\nu \in \mathcal{M}_1^+(\Omega,\alpha,C)} \tilde{g}(\nu) \]
for every $C > 2(C_\alpha - C_\phi)$. In particular,
\[ \lim_{C\to\infty}\ \inf_{\nu \in \mathcal{M}_1^+(\Omega,\alpha,C)} \tilde{g}(\nu) \ge p^*. \tag{6.6} \]
The proof will be completed by also establishing the opposite inequality.
As noted, $\mu_*$ is in $\mathcal{M}_1^+(\Omega, \alpha, C)$ for $C > 2(C_\alpha - C_\phi)$. Therefore,
\[ \lim_{C \to \infty}\,\inf_{\nu \in \mathcal{M}_1^+(\Omega, \alpha, C)} \tilde{g}(\nu) \le \tilde{g}(\mu_*). \tag{6.7} \]
By (4.1),
\[ \tilde{g}(\mu_*) = \Phi(\mu_*) + \log\int_\Omega \exp\left[-\Phi^{(1)}(\mu_*; \delta_x)\right] d\alpha(x). \]
But, by Eq. (6.1),
\[ \int_\Omega \exp\left[-\Phi^{(1)}(\mu_*; \delta_x)\right] d\alpha(x) = \exp(-C_*)\int_\Omega \frac{d\mu_*}{d\alpha}(x)\,d\alpha(x) = \exp(-C_*). \]
Therefore, $\tilde{g}(\mu_*) = \Phi(\mu_*) - C_*$. But also by Lemma 6.3, $\Phi(\mu_*) - C_* = p^*$. Therefore, combining with (6.7),
\[ \lim_{C \to \infty}\,\inf_{\nu \in \mathcal{M}_1^+(\Omega, \alpha, C)} \tilde{g}(\nu) \le p^*, \]
as needed.

Remark 6.4. A posteriori it is clear that there is a saddle point for the Lagrangian $L(\mu, \nu)$ at $\mu = \nu = \mu_*$. However, since $L$ may not be strictly concave-convex, this may not be the only argminmax or argmaxmin. (Cf. [22], Chap. 11, Secs. I and J, for the relevant notation from convex variational analysis.) If one could establish that $\tilde{g}$ has an optimizer which can be identified by the Euler–Lagrange equations, then it must also be an optimizer for $g$ because the Euler–Lagrange equations are the same. However, except in the case that $\phi$ is bounded and continuous, it is not clear that this is the case a priori.

7. Example: Quadratic Kernel

Let us consider $\Omega \subset \mathbb{R}^d$ compact, and $\phi(x, y) = -\|x - y\|^2$. It is well known that this defines a positive semidefinite form $\bar{\Phi}^{(1)} : \mathcal{M}_0(\Omega) \times \mathcal{M}_0(\Omega) \to \mathbb{R}$ by the map $\bar{\Phi}^{(1)}(\mu, \nu) = 2(\mu \otimes \nu)(\phi)$, where $\mathcal{M}_0(\Omega)$ is the set of all bounded-variation, signed measures with total measure equal to 0. (In fact this is the critical homogeneous potential with this property. Cf. Schoenberg [25].) Therefore, $\Phi$ is convex. We note that, for $\nu \in \mathcal{M}_1^+(\Omega)$,
\[ \frac{1}{2}\,\Phi^{(1)}(\nu, \delta_x) = -\int_\Omega \|x - y\|^2\,d\nu(y) = -\|x\|^2 + 2\int_\Omega (x, y)\,d\nu(y) - \int_\Omega \|y\|^2\,d\nu(y) = -\|x - E[X]\|^2 - \operatorname{Var}(X), \]
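where $X$ is a random variable which is $\nu$-distributed. Using this, we also have
\[ \Phi(\nu) = -\operatorname{Var}(X) - \int_\Omega \|x - E[X]\|^2\,d\nu(x) = -2\operatorname{Var}(X). \]
Therefore,
\[ \tilde{g}(\nu) = \Phi(\nu) + \log\int_\Omega \exp\left[-\Phi^{(1)}(\nu, \delta_x)\right] d\alpha(x) = -2\operatorname{Var}(X) + \log\int_\Omega \exp\left[2\operatorname{Var}(X) + 2\|x - E[X]\|^2\right] d\alpha(x) = \log\int_\Omega \exp\left[2\|x - E[X]\|^2\right] d\alpha(x). \]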
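The identities $\Phi(\nu) = -2\operatorname{Var}(X)$ and the closed form of $\tilde{g}(\nu)$ can be checked numerically. The following is only a minimal sketch, assuming (for illustration, not from the paper) that $\nu$ and $\alpha$ are discrete probability measures sharing a finite support in $\mathbb{R}^2$. The function names `pairwise_sq_dist`, `phi_quadratic`, `phi1_at_points`, `g_tilde` and `g_tilde_closed_form`, as well as the random test data, are hypothetical.

```python
import numpy as np

def pairwise_sq_dist(pts):
    """Matrix of squared Euclidean distances |x_i - x_j|^2."""
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def phi_quadratic(nu, pts):
    """Phi(nu) = (nu x nu)(phi) with phi(x, y) = -|x - y|^2."""
    return float(-nu @ pairwise_sq_dist(pts) @ nu)

def phi1_at_points(nu, pts):
    """Phi^(1)(nu, delta_x) = -2 * sum_j nu_j |x - x_j|^2 at each support point x."""
    return -2.0 * pairwise_sq_dist(pts) @ nu

def g_tilde(nu, alpha, pts):
    """g~(nu) = Phi(nu) + log of the alpha-integral of exp(-Phi^(1)(nu, delta_x))."""
    integral = alpha @ np.exp(-phi1_at_points(nu, pts))
    return phi_quadratic(nu, pts) + float(np.log(integral))

def g_tilde_closed_form(nu, alpha, pts):
    """log of the alpha-integral of exp(2 |x - E[X]|^2); uses nu only through its mean."""
    mean = nu @ pts
    return float(np.log(alpha @ np.exp(2.0 * np.sum((pts - mean) ** 2, axis=1))))

# Hypothetical test data: six random points in the square [-1, 1]^2,
# with random probability weights for nu and for the a priori measure alpha.
rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(6, 2))
nu = rng.dirichlet(np.ones(6))
alpha = rng.dirichlet(np.ones(6))

mean = nu @ pts
variance = float(nu @ np.sum((pts - mean) ** 2, axis=1))

print(np.isclose(phi_quadratic(nu, pts), -2.0 * variance))
print(np.isclose(g_tilde(nu, alpha, pts), g_tilde_closed_form(nu, alpha, pts)))
```

The first check compares the double sum defining $\Phi(\nu)$ with $-2\operatorname{Var}(X)$; the second compares the defining expression for $\tilde{g}(\nu)$ with the closed form in which the variance terms cancel.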
In particular, this only depends on $\nu$ through $E_\nu[X]$. (We will write $E_\nu[X]$ when we want to specify that $X$ is $\nu$-distributed.) Given any $x_0 \in \Omega$, we can choose $\nu = \delta_{x_0}$ so that there is at least one $\nu$ such that $E_\nu[X] = x_0$. Therefore, the extended variational principle tells us that
\[ p^* = \min_{y \in \Omega}\,\log\int_\Omega \exp\left(2\|x - y\|^2\right) d\alpha(x). \]
This is obviously a convex optimization problem, where the convex cost functional to be minimized is
\[ C(y) = \log\int_\Omega \exp\left(2\|x - y\|^2\right) d\alpha(x). \]
Moreover, since $\Omega$ is compact and since the cost functional is continuous, there does exist a unique solution. Notice that the criticality condition is the implicit characterization:
\[ y = \frac{\int_\Omega x\,e^{2\|x - y\|^2}\,d\alpha(x)}{\int_\Omega e^{2\|x - y\|^2}\,d\alpha(x)}. \]
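The criticality condition can be solved numerically in a simple case. The sketch below assumes (for illustration only) that $d = 1$, that $\Omega = [-1, 1]$, and that $\alpha$ is approximated by weights on a uniform grid; the tilt `np.exp(0.5 * xs)` is a hypothetical choice of a priori measure, and the names `cost`, `tilted_mean` and `solve_criticality` are not from the paper. Since $C'(y) = 4\,(y - m(y))$, where $m(y)$ is the right-hand side of the criticality condition, and $C$ is convex, a bisection on the sign of $y - m(y)$ is one possible way to locate the minimizer.

```python
import numpy as np

def cost(y, xs, w):
    """C(y) = log sum_i w_i exp(2 (x_i - y)^2) for a discrete a priori measure."""
    return float(np.log(np.sum(w * np.exp(2.0 * (xs - y) ** 2))))

def tilted_mean(y, xs, w):
    """Right-hand side m(y) of the criticality condition y = m(y)."""
    t = w * np.exp(2.0 * (xs - y) ** 2)
    return float(np.sum(xs * t) / np.sum(t))

def solve_criticality(xs, w, tol=1e-12):
    """Bisection on y - m(y), which has the sign of C'(y) and is increasing in y."""
    lo, hi = float(xs.min()), float(xs.max())
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - tilted_mean(mid, xs, w) < 0.0:
            lo = mid  # C is still decreasing at mid: the minimizer lies to the right
        else:
            hi = mid  # C is non-decreasing at mid: the minimizer lies to the left
    return 0.5 * (lo + hi)

# Assumed setup: grid approximation of Omega = [-1, 1] with a tilted a priori measure.
xs = np.linspace(-1.0, 1.0, 201)
w = np.exp(0.5 * xs)
w /= w.sum()

y_star = solve_criticality(xs, w)
print(y_star, cost(y_star, xs, w))
```

For an a priori measure that is symmetric about the origin, the same routine returns a value close to $y = 0$, as expected from the symmetry of the criticality condition.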
This example contains mean-field Ising and Heisenberg antiferromagnets as special cases. These are obtained by taking $\Omega = S^{d-1}$, the spheres in $\mathbb{R}^d$. The Ising case is $d = 1$, for which we have $S^0 = \{-1, +1\}$. We can include a 1-body term, representing an external magnetic field, by a special choice of the a priori measure. We can also determine the Gibbs measure. It is equal to
\[ \frac{d\rho_*}{d\alpha}(x) = Z^{-1} e^{2\|x - x_*\|^2}, \]
where $y = x_*$ solves the optimization problem above.

If we change $\phi$ to $-\phi$, we obtain the ferromagnetic version of these mean-field models. However, the analogous cost function becomes
\[ C(y) = \log\int_\Omega \exp\left(-2\|x - y\|^2\right) d\alpha(x) \]
and we have $p^* = \max_{y \in \Omega} C(y)$. Since the cost function is not concave, there can be multiple optimizers (depending on $\Omega$ and $\alpha$) which may be interpreted as the existence of a phase transition.

Acknowledgments

S.S. is most grateful to Michael Aizenman and Bob Sims, especially to Michael Aizenman who invented the EVP. We also benefited from discussions with the following people: Aernout van Enter, Bruno Nachtergaele, Vojkan Jaksic, Mark Fannes, Marco Merkli and Ugur Gül. In particular, A.C.D. van Enter and B. Nachtergaele alerted us to Lemma 5.7 as well as gave other assistance. The research of E.K. was supported in part by FQRNT.

References

[1] M. Aizenman and A. Ruzmaikina, Ann. Probab. 33(1) (2005) 82–113.
[2] M. Aizenman, R. Sims and S. Starr, Phys. Rev. B 68 (2003) 214403.
[3] D. J. Aldous, Exchangeability and related topics, École d'Été de Probabilités de Saint-Flour XII, ed. P. L. Hennequin, Lecture Notes in Mathematics, Vol. 1117 (Springer-Verlag, 1985), pp. 1–198.
[4] A. Bianchi, P. Contucci and C. Giardina, Math. Phys. Electron. J. 9 (2003) 6.
[5] R. M. Dudley, Real Analysis and Probability (Cambridge University Press, Cambridge, 2002).
[6] N. Dunford and J. T. Schwartz, Linear Operators. I. General Theory, with the assistance of W. G. Bade and R. G. Bartle (Interscience Publishers, Inc., New York, 1958).
[7] R. Ellis and C. Newman, Z. Wahrsch. Verw. Gebiete 44(2) (1978) 117–139.
[8] M. Fannes and A. Verbeure, Commun. Math. Phys. 55(2) (1977) 125–131.
[9] M. Fannes and A. Verbeure, Commun. Math. Phys. 57(2) (1977) 165–171.
[10] M. Fannes, H. Spohn and A. Verbeure, J. Math. Phys. 21 (1980) 355–358.
[11] M. Fekete, Math. Z. 17 (1923) 228–249.
[12] H.-O. Georgii, Gibbs Measures and Phase Transitions (Walter de Gruyter and Co., Berlin, 1988).
[13] F. Guerra, Commun. Math. Phys. 233(1) (2003) 1–12.
[14] F. Guerra and F. L. Toninelli, Commun. Math. Phys. 230(1) (2002) 71–79.
[15] K. Hepp and E. H. Lieb, Phys. Rev. A 8 (1973) 2517–2525.
[16] E. Hewitt and L. J. Savage, Trans. Amer. Math. Soc. 80 (1955) 470–501.
[17] R. B. Israel, Convexity in the Theory of Lattice Gases, With an Introduction by Arthur S. Wightman (Princeton University Press, Princeton, 1979).
[18] E. H. Lieb, Commun. Math. Phys. 31 (1973) 327–340.
[19] E. H. Lieb and M. Loss, Analysis, 2nd edn. (American Mathematical Society, Providence, RI, 2001).
[20] G. Pólya and G. Szegö, Aufgaben und Lehrsätze aus der Analysis. Erster Band. Reihen, Integralrechnung, Funktionentheorie (Springer-Verlag, Berlin, 1954).
[21] M. Reed and B. Simon, Methods of Modern Mathematical Physics I: Functional Analysis (Academic Press, San Diego, 1980).
[22] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis (Springer-Verlag, Berlin, 1998).
[23] D. Ruelle, Commun. Math. Phys. 108(2) (1987) 225–239.
[24] D. Ruelle and D. Robinson, Commun. Math. Phys. 5 (1967) 288–300.
[25] I. J. Schoenberg, Ann. of Math. 39(4) (1938) 811–841.
[26] B. Simon, The Statistical Mechanics of Lattice Gases, Vol. I (Princeton University Press, Princeton, 1993).
[27] M. Sion, Pacific J. Math. 8 (1958) 171–176.
[28] E. Størmer, J. Funct. Anal. 3 (1969) 48–68.
[29] D. W. Stroock, Probability Theory, an Analytic View, Revised edition (Cambridge University Press, Cambridge, 1993).
[30] M. Talagrand, The Parisi Formula, to appear in Annals of Math.
[31] M. Talagrand, Spin Glasses: A Challenge for Mathematicians. Cavity and Mean Field Models (Springer-Verlag, Berlin, 2003).
November 18, 2005 10:54 WSPC/148-RMP
J070-00254
Reviews in Mathematical Physics Vol. 17, No. 10 (2005) 1241 © World Scientific Publishing Company
ERRATUM QUANTUM MACROSTATISTICAL THEORY OF NONEQUILIBRIUM STEADY STATES
[Reviews in Mathematical Physics, Vol. 17, No. 9 (2005) 977–1020] GEOFFREY L. SEWELL Department of Physics, Queen Mary, University of London, Mile End Road, London E1 4NS, UK [email protected]
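On page 1016, Eq. (C.17) should read as:
\[ T(F) = \sum_{n, n' \in \mathbb{N}^d} \int_\Omega dx\, C_{n,n'}(x)\, \partial_x^{n}\, \partial_{x'}^{n'} F(x, x')\Big|_{x' = x} \quad \forall F \in \mathcal{D}(\Omega_1^2) \tag{C.17} \]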
November 18, 2005 10:54 WSPC/148-RMP J070-00253
REVIEWS IN MATHEMATICAL PHYSICS
Author Index Volume 17 (2005)

Akiho, N., Hiai, F. & Petz, D., Equilibrium states and their entropy densities in gauge-invariant C*-systems, 4 (2005) 365
Albeverio, S., Gottschalk, H. & Yoshida, M. W., Systems of classical particles in the grand canonical ensemble, scaling limits and quantum field theory, 2 (2005) 175
Ali, S. T. & Engliš, M., Quantization methods: A guide for physicists and analysts, 4 (2005) 391
Angelescu, N., Minlos, R. A. & Zagrebnov, V. A., Lower spectral branches of a particle coupled to a Bose field, 10 (2005) 1111
Arai, A., Generalized weak Weyl relation and decay of quantum dynamics, 9 (2005) 1071
Araki, H. & Zsidó, L., Extension of the structure theorem of Borchers and its application to half-sided modular inclusions, 5 (2005) 491
Bahn, C., Ko, C. K. & Park, Y. M., Remarks on sufficient conditions for conservativity of minimal quantum dynamical semigroups, 7 (2005) 745
Bunke, U. & Schick, T., On the topology of T-duality, 1 (2005) 77
Eng, D. & Erdős, L., The linear Boltzmann equation as the low density limit of a random Schrödinger equation, 6 (2005) 669
Engliš, M., see Ali, S. T., 4 (2005) 391
Erdős, L., see Eng, D., 6 (2005) 669
Fewster, C. J. & Hollands, S., Quantum energy inequalities in two-dimensional conformal field theory, 5 (2005) 577
Figueroa, H. & Gracia-Bondía, J. M., Combinatorial Hopf algebras in quantum field theory I, 8 (2005) 881
Gang, Z. & Sigal, I. M., Asymptotic stability of nonlinear Schrödinger equations with potential, 10 (2005) 1143
Gérard, C. & Jäkel, C. D., Thermal quantum fields without cut-offs in 1+1 space-time dimensions, 2 (2005) 113
Gottschalk, H., see Albeverio, S., 2 (2005) 175
Gracia-Bondía, J. M., see Figueroa, H., 8 (2005) 881
Guilfoyle, B., Klingenberg, W. & Sen, S., The Casimir effect between non-parallel plates by geometric optics, 8 (2005) 859
Hiai, F., see Akiho, N., 4 (2005) 365
Hollands, S. & Wald, R. M., Conservation of the stress tensor in perturbative interacting quantum field theory in curved spacetimes, 3 (2005) 227
Hollands, S., see Fewster, C. J., 5 (2005) 577
Jäkel, C. D., see Gérard, C., 2 (2005) 113
Klingenberg, W., see Guilfoyle, B., 8 (2005) 859
Ko, C. K., see Bahn, C., 7 (2005) 745
Kritchevski, E. & Starr, S., The extended variational principle for mean-field, classical spin systems, 10 (2005) 1209
López, R., Wetting phenomena and constant mean curvature surfaces with boundary, 7 (2005) 769
Minlos, R. A., see Angelescu, N., 10 (2005) 1111
Nikolov, N. M. & Todorov, I. T., Elliptic thermal correlation functions and modular forms in a globally conformal invariant QFT, 6 (2005) 613
Panchenko, D., Free energy in the generalized Sherrington–Kirkpatrick mean field model, 7 (2005) 793
Park, Y. M., see Bahn, C., 7 (2005) 745
Petz, D., see Akiho, N., 4 (2005) 365
Ruzzi, G., Homotopy of posets, net-cohomology and superselection sectors in globally hyperbolic space-times, 9 (2005) 1021
Schick, T., see Bunke, U., 1 (2005) 77
Sen, S., see Guilfoyle, B., 8 (2005) 859
Sewell, G. L., Quantum macrostatistical theory of nonequilibrium steady states, 9 (2005) 977
Sigal, I. M., see Gang, Z., 10 (2005) 1143
Soltan, P. M., New quantum “az + b” groups, 3 (2005) 313
Starr, S., see Kritchevski, E., 10 (2005) 1209
Todorov, I. T., see Nikolov, N. M., 6 (2005) 613
Verch, R. & Werner, R. F., Distillability and positivity of partial transposes in general quantum field systems, 5 (2005) 545
Wald, R. M., see Hollands, S., 3 (2005) 227
Waldmann, S., States and representations in deformation quantization, 1 (2005) 15
Werner, R. F., see Verch, R., 5 (2005) 545
Wreszinski, W. F., Passivity of ground states of quantum systems, 1 (2005) 1
Yoshida, M. W., see Albeverio, S., 2 (2005) 175
Zagrebnov, V. A., see Angelescu, N., 10 (2005) 1111
Zsidó, L., see Araki, H., 5 (2005) 491